Writing A Hadoop MapReduce Program In PHP

I came across a way to write Hadoop Map/Reduce jobs in PHP.

Here is a great example:

Map: mapper.php
Save the following code in the file /home/guest/mapper.php:
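A minimal sketch of such a mapper follows. It assumes the job is a word count over the ebooks, which the text does not state explicitly. The shebang path and the tokenizing rule (lowercase, split on non-word characters) are also assumptions.

```php
#!/usr/bin/env php
<?php
// Word-count mapper (assumed task): read text lines from STDIN and
// emit one "word<TAB>1" record per word on STDOUT.
while (($line = fgets(STDIN)) !== false) {
    // Lowercase the line and split it on runs of non-word characters.
    // Bytes outside ASCII letters, digits and "_" act as separators here.
    $words = preg_split('/\W+/', strtolower(trim($line)), -1, PREG_SPLIT_NO_EMPTY);
    foreach ($words as $word) {
        // Key and value are separated by a tab, the Hadoop Streaming default.
        echo $word, "\t", 1, PHP_EOL;
    }
}
```

The closing `?>` tag is left out on purpose so that no stray trailing whitespace ends up in the output stream.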

Reduce: reducer.php

Save the following code in the file /home/guest/reducer.php:
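A minimal sketch of a matching reducer, again assuming a word-count job and tab-separated "word<TAB>count" input lines. It relies on Hadoop Streaming handing the reducer its input sorted by key, so all lines for one word arrive consecutively.

```php
#!/usr/bin/env php
<?php
// Word-count reducer (assumed task): sum the counts of each word.
// Input lines look like "word<TAB>count" and arrive sorted by word.
$current = null; // word currently being summed
$total   = 0;    // running count for $current

while (($line = fgets(STDIN)) !== false) {
    $line = rtrim($line, "\r\n");
    $parts = explode("\t", $line, 2);
    if (count($parts) !== 2) {
        continue; // skip empty or malformed lines
    }
    list($word, $count) = $parts;

    if ($word !== $current) {
        // A new word starts: flush the total of the previous one.
        if ($current !== null) {
            echo $current, "\t", $total, PHP_EOL;
        }
        $current = $word;
        $total   = 0;
    }
    $total += (int) $count;
}

// Flush the last word, if any input was seen.
if ($current !== null) {
    echo $current, "\t", $total, PHP_EOL;
}
```

Summing consecutive keys keeps memory use constant. Collecting every word in an associative array would also work, but the array grows with the vocabulary.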

Don’t forget to set execution rights for these files:
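One way to do this, assuming the two files were saved under /home/guest as described above:

```shell
# Make both scripts executable so Hadoop Streaming can launch them directly.
chmod +x /home/guest/mapper.php /home/guest/reducer.php
```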

Running the PHP code on Hadoop:
Download example input data
Like Michael, we will use three ebooks from Project Gutenberg for this example:

http://www.gutenberg.org/files/20417/20417-8.txt

http://www.gutenberg.org/dirs/etext04/7ldvc10.txt

http://www.gutenberg.org/dirs/etext03/ulyss12.txt

Download each ebook and store it in a temporary directory of your choice, for example /tmp/gutenberg.
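A possible way to fetch the files, assuming wget is installed and the three URLs listed above still resolve:

```shell
# Create the temporary directory and download the three ebooks into it.
mkdir -p /tmp/gutenberg
wget -P /tmp/gutenberg \
    http://www.gutenberg.org/files/20417/20417-8.txt \
    http://www.gutenberg.org/dirs/etext04/7ldvc10.txt \
    http://www.gutenberg.org/dirs/etext03/ulyss12.txt
```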

Copy local example data to HDFS
Before we run the actual MapReduce job, we first have to copy the files from our local file system to Hadoop’s HDFS.
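A sketch of the copy step, assuming the hadoop command is on the PATH and the ebooks are in /tmp/gutenberg. Newer Hadoop releases prefer `hdfs dfs` over `hadoop dfs`.

```shell
# Copy the local directory into the HDFS directory "gutenberg"
# (relative to the current user's HDFS home directory).
hadoop dfs -copyFromLocal /tmp/gutenberg gutenberg

# List the HDFS directory to confirm that the files arrived.
hadoop dfs -ls gutenberg
```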

Run the MapReduce job
We’re all set and ready to run our PHP MapReduce job on the Hadoop cluster. We use Hadoop Streaming to pass data between our Map and Reduce code via STDIN and STDOUT.
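A sketch of a streaming invocation. STREAMING_JAR is a placeholder, because the name and location of the streaming jar depend on the Hadoop version and installation. The script paths are the ones used earlier in this post.

```shell
# Placeholder path (assumption): point this at the streaming jar of your install.
STREAMING_JAR="$HADOOP_HOME/contrib/streaming/hadoop-streaming.jar"

hadoop jar "$STREAMING_JAR" \
    -mapper /home/guest/mapper.php \
    -reducer /home/guest/reducer.php \
    -input gutenberg \
    -output gutenberg-output
```

The scripts are referenced by local path, so they must exist at that path on every node that runs a task. On a multi-node cluster, the streaming `-file` option can ship them with the job instead.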

The job will read all the files in the HDFS directory gutenberg, process them, and store the results in a single result file in the HDFS directory gutenberg-output.

You can track the status of the job using Hadoop’s web interface. Go to http://localhost:50030/

When the job has finished, check whether the result was successfully stored in the HDFS directory gutenberg-output:
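For example, assuming the output directory name used above:

```shell
# List the job's output directory in HDFS.
hadoop dfs -ls gutenberg-output
```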

You can then inspect the contents of the file with the dfs -cat command:
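A sketch, assuming the single result file carries the conventional name part-00000. Check the directory listing for the actual file name.

```shell
# Print the result file; every line holds a key and its value, tab-separated.
hadoop dfs -cat gutenberg-output/part-00000
```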

2 responses to “Writing A Hadoop MapReduce Program In PHP”

  1. Shantanu says:

    Very Helpful. Thank you.

  2. Hieu says:

    I followed the instructions exactly but did not succeed.
    Can you help me, please?