Tharindu Hasthika
Tharindu Hasthika's Blog

Introduction to MapReduce - Filter > Map > Reduce

Python, JavaScript, and many other languages have a set of functions for working with lists as a sort of pipeline. These are called filter, map and reduce, and they originally come from functional programming languages like Lisp. Essentially, these functions allow you to write cleaner and, might I add, more concise code. For this article I'll be using JavaScript, but you can apply the concept to any language in which these functions are available.

First we'll go through these 3 functions one by one and try to visualize what happens inside of them.

Filter

This function acts as a gatekeeper: it keeps only the elements of the input list or array for which the function given to it returns true.

filter.png

From the above image it is clear that only the elements for which f(x) is true are allowed into the resulting array.

  • for loop
const arr = [ 6, 8, 12, 4, 23, 1 ];

const resultArray = [];

for (let i = 0; i < arr.length; i++) {
  if (arr[i] < 10) {
    resultArray.push(arr[i]);
  }
}

// resultArray <- [6, 8, 4, 1]
  • filter function
const arr = [ 6, 8, 12, 4, 23, 1 ];

const resultArray = arr.filter(function (num) {
  return num < 10;
});

// resultArray <- [6, 8, 4, 1]
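
To visualize what happens inside filter, here is a minimal sketch of a hand-written version. The name myFilter is made up for this illustration and is not part of JavaScript; the built-in filter also passes the index and the whole array to the callback, which this sketch leaves out.

```javascript
// myFilter is a hypothetical helper for illustration, not the built-in.
function myFilter(array, predicate) {
  const result = [];
  for (const item of array) {
    // keep the element only when the predicate returns a truthy value
    if (predicate(item)) {
      result.push(item);
    }
  }
  return result;
}

const numbers = [6, 8, 12, 4, 23, 1];

const small = myFilter(numbers, (num) => num < 10);
const smallBuiltIn = numbers.filter((num) => num < 10);

// small        <- [6, 8, 4, 1]
// smallBuiltIn <- [6, 8, 4, 1]
// numbers is left unchanged
```

Both versions return a new array and leave the input as it was.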

Map

As the name implies, it maps stuff, mainly arrays or lists. If you remember from mathematics, a function f(x) maps one set to another set; these sets are called the domain and the co-domain.

map.png

As you can see from the above image, map is sort of a sliding function that goes over an array and creates a new array according to the function that you provide.

  • for loop
const arr = [ 6, 8, 12, 4, 23, 1 ];

const resultArray = [];

for (let i = 0; i < arr.length; i++) {
  resultArray.push(2 * arr[i]);
}

// resultArray <- [12, 16, 24, 8, 46, 2]
  • map function
const arr = [ 6, 8, 12, 4, 23, 1 ];

const resultArray = arr.map(function (num) {
  return 2 * num;
});

// resultArray <- [12, 16, 24, 8, 46, 2]
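
As a small extra sketch, the callback given to map also receives the index of the current element as its second argument, and the input array is never modified. The variable names below are chosen just for this example.

```javascript
const prices = [6, 8, 12];

// the callback is called with (element, index, array); only the first two are used here
const labelled = prices.map((price, index) => `${index}: ${2 * price}`);

// labelled <- ['0: 12', '1: 16', '2: 24']
// prices   <- [6, 8, 12] (unchanged)
```

The result always has the same length as the input, because map produces exactly one output element per input element.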

Reduce

This one is actually a bit tricky to understand, but when you finally get it, it's really not that complicated. To understand reduce, you have to think of accumulation or aggregation. The image below will help with the visualization of the reduce function.

reduce.png

For this example I've chosen a simple task: calculating the total of the elements of an array. But you can use reduce for all sorts of stuff. Below is the code for this task, first using a for loop and then using the reduce function.

  • for loop
const arr = [ 6, 8, 12, 4, 23, 1 ];

let total = 0;

for (let i = 0; i < arr.length; i++) {
  total += arr[i];
}

// total <- 54
  • reduce function
const arr = [ 6, 8, 12, 4, 23, 1 ];

const total = arr.reduce(function (total, currentValue) {
  return total + currentValue;
}, 0); // <- 0 is important, it is the initial value of total

// total <- 54

As you can see, the reduce function reduces the amount of code and also makes the code cleaner.
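
Reduce is not limited to sums. As a rough sketch with names picked just for this example, the accumulator can hold the largest value seen so far, or even an object that collects several counts at once.

```javascript
const values = [6, 8, 12, 4, 23, 1];

// largest element: the accumulator holds the biggest value seen so far
const largest = values.reduce(
  (max, current) => (current > max ? current : max),
  -Infinity
);

// largest <- 23

// the accumulator can be an object: count even and odd elements
const counts = values.reduce(
  (acc, current) => {
    const key = current % 2 === 0 ? 'even' : 'odd';
    acc[key] += 1;
    return acc;
  },
  { even: 0, odd: 0 } // initial value of the accumulator
);

// counts <- { even: 4, odd: 2 }
```

In both cases the second argument to reduce is the starting value of the accumulator, just like the 0 in the total example.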

Complex Problem

Let's say that we have an array of objects sort of like this:

const people = [
  {
    name: 'John',
    age: 32,
    occupation: 'Manager'
  },
  {
    name: 'Chris',
    age: 24,
    occupation: 'Programmer'
  },
  {
    name: 'Will',
    age: 14,
    occupation: 'Student'
  },
  ...
]

// get names as an array
const names = people.map((p) => { // short form of a function
  return p.name;
});

// names <- ['John', 'Chris', 'Will', ...]

// get people with occupation = 'Programmer'
const programmers = people.filter((p) => {
  return p.occupation === 'Programmer'; // the comparison is case-sensitive
});

// programmers <- [{ name: 'Chris', age: 24, occupation: 'Programmer' }, ...]

// get names of people of age >= 18
const adultNames = people.filter((p) => {
  return p.age >= 18;
}).map((p) => {
  return p.name;
});

// adultNames <- ['John', 'Chris', ...]
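
The examples above chain filter and map. As a sketch of the full filter > map > reduce pipeline, the snippet below adds up the ages of the adults. It uses samplePeople, a made-up array holding only the three people listed earlier, so that it can run on its own.

```javascript
// samplePeople is a hypothetical stand-in containing only the three entries shown above
const samplePeople = [
  { name: 'John', age: 32, occupation: 'Manager' },
  { name: 'Chris', age: 24, occupation: 'Programmer' },
  { name: 'Will', age: 14, occupation: 'Student' }
];

const totalAdultAge = samplePeople
  .filter((p) => p.age >= 18)          // keep John and Chris
  .map((p) => p.age)                   // [32, 24]
  .reduce((sum, age) => sum + age, 0); // 32 + 24

// totalAdultAge <- 56
```

Each stage receives the array produced by the previous one, and reduce collapses the final array into a single value.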

As you can see, the map, reduce and filter functions are really powerful when they are chained to create a sort of pipeline for your data. I hope this article helped to clear up some things about these 3 functions.

Happy Coding!

 