Posted by : Sushanth Thursday 24 December 2015



Step 1: The MapReduce program (the code required to process the data) is handed to the job client.

Step 2: The job client asks the JobTracker for a job ID. It also checks the output specification of the job (for example, the job is rejected if the output directory already exists) and computes the input splits for the job.
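How the input splits in Step 2 are computed depends on the input format, which this post does not specify. As a rough sketch, assuming file-based input handled the way Hadoop's FileInputFormat does it (split size = max(minSize, min(maxSize, blockSize)), with a 10% slop so a small tail is folded into the last split), the calculation looks something like this. The function names and the (offset, length) tuple layout are made up for illustration.

```python
SPLIT_SLOP = 1.1  # a tail of up to 10% of a split is folded into the last split


def compute_split_size(block_size, min_size=1, max_size=2**63 - 1):
    """Split size is the block size, clamped to the [min_size, max_size] range."""
    return max(min_size, min(max_size, block_size))


def compute_splits(file_length, block_size, min_size=1, max_size=2**63 - 1):
    """Return a list of (offset, length) pairs covering one file."""
    split_size = compute_split_size(block_size, min_size, max_size)
    splits = []
    offset = 0
    remaining = file_length
    # Cut full-size splits while more than ~1.1 splits' worth of data remains.
    while remaining / split_size > SPLIT_SLOP:
        splits.append((offset, split_size))
        offset += split_size
        remaining -= split_size
    # Whatever is left (possibly slightly more than one split) is the last split.
    if remaining > 0:
        splits.append((offset, remaining))
    return splits


if __name__ == "__main__":
    MB = 1024 * 1024
    for offset, length in compute_splits(300 * MB, 128 * MB):
        print(offset // MB, length // MB)
```

Each split later becomes one map task, so the split size indirectly controls how many map tasks the job gets.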

Step 3: The job client copies all the job resources to the shared file system. The job resources include the JAR file containing the MapReduce program, the configuration file containing the job's configuration parameters, and the computed input split details.

Step 4: The job client calls the submit job method, which tells the JobTracker that the job is ready for execution.

Step 5: The JobTracker initializes the job:
            Puts the job in an internal queue for the scheduler to pick up.
            Retrieves the input splits from the shared location and creates the list of tasks to run, based on the input splits.
            Kicks off the job.
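The initialization in Step 5 can be pictured with a small sketch. It assumes a plain first-in, first-out queue, one map task per input split, and a number of reduce tasks taken from the job configuration. The class name, method names, and task ID format are hypothetical and only loosely imitate Hadoop's own naming.

```python
from collections import deque


class JobTrackerSketch:
    """Toy model of job initialization; not Hadoop's actual classes."""

    def __init__(self):
        self.queue = deque()  # jobs waiting for the scheduler

    def submit(self, job_id, splits, num_reduces):
        """Queue a job together with its split list and configured reduce count."""
        self.queue.append((job_id, splits, num_reduces))

    def initialize_next(self):
        """Take the oldest queued job and build its task list."""
        job_id, splits, num_reduces = self.queue.popleft()
        # One map task per input split.
        maps = [f"{job_id}_m_{i:06d}" for i in range(len(splits))]
        # Reduce task count comes from the job configuration, not from the splits.
        reduces = [f"{job_id}_r_{i:06d}" for i in range(num_reduces)]
        return maps + reduces


if __name__ == "__main__":
    jt = JobTrackerSketch()
    jt.submit("job_0001", splits=[(0, 128), (128, 128), (256, 44)], num_reduces=1)
    print(jt.initialize_next())
```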

Step 6: The JobTracker assigns each task to a TaskTracker, preferring a data-local placement, then a rack-local one.
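The locality preference in Step 6 can be sketched as follows: when a TaskTracker has a free slot, try a task whose input split is stored on that same node (data-local), then one whose split is on the same rack (rack-local), and only then anything else. This is a simplified illustration, not the actual scheduler code. The function name, the host and rack names, and the "off-rack" label are assumptions made for the example.

```python
def pick_task(tracker_host, pending, rack_of):
    """Choose a map task for a tracker with a free slot.

    pending: list of (task_id, hosts) where hosts hold replicas of the task's split.
    rack_of: dict mapping host name -> rack name.
    Returns (task_id, locality) or (None, None) if nothing is pending.
    """
    tracker_rack = rack_of.get(tracker_host)

    # 1. Data-local: the split lives on the tracker's own node.
    for task_id, hosts in pending:
        if tracker_host in hosts:
            return task_id, "data-local"

    # 2. Rack-local: the split lives on another node in the same rack.
    for task_id, hosts in pending:
        if any(rack_of.get(h) == tracker_rack for h in hosts):
            return task_id, "rack-local"

    # 3. Fallback: any pending task, with the data read across racks.
    if pending:
        return pending[0][0], "off-rack"
    return None, None


if __name__ == "__main__":
    racks = {"node1": "rackA", "node2": "rackA", "node3": "rackB"}
    tasks = [("m_0", ["node3"]), ("m_1", ["node2"]), ("m_2", ["node1"])]
    print(pick_task("node1", tasks, racks))
```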

Step 7: The TaskTracker executes the task:
            Localizes the job JAR by copying it from the shared file system to its local file system.
            Creates a local working directory and un-jars job.jar into it.
            Spins up the map or reduce task.
            Heartbeats: reports progress back to the JobTracker.
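The heartbeat exchange in Step 7 can be simulated in a few lines. In this sketch the TaskTracker reports task progress and its free slots on every heartbeat, and the JobTracker uses the reply to hand out new tasks. Everything here is a toy model: the class names, the fixed 50% progress per tick, and the slot count are assumptions, and localization and actual task execution are left out.

```python
class JobTracker:
    """Toy job tracker: records progress and hands out pending tasks."""

    def __init__(self, tasks):
        self.pending = list(tasks)
        self.progress = {}  # task_id -> fraction complete

    def heartbeat(self, tracker_name, statuses, free_slots):
        """Accept a status report; return tasks to launch on that tracker."""
        self.progress.update(statuses)
        assigned = []
        while free_slots > 0 and self.pending:
            assigned.append(self.pending.pop(0))
            free_slots -= 1
        return assigned


class TaskTracker:
    """Toy task tracker with a fixed number of task slots."""

    def __init__(self, name, slots):
        self.name = name
        self.slots = slots
        self.running = {}  # task_id -> fraction complete

    def tick(self, job_tracker):
        """Simulate one heartbeat interval."""
        statuses = {}
        for task_id in list(self.running):
            # Pretend each running task completes half its work per interval.
            self.running[task_id] = min(1.0, self.running[task_id] + 0.5)
            statuses[task_id] = self.running[task_id]
            if self.running[task_id] >= 1.0:
                del self.running[task_id]  # slot is free again
        free = self.slots - len(self.running)
        for task_id in job_tracker.heartbeat(self.name, statuses, free):
            self.running[task_id] = 0.0


if __name__ == "__main__":
    jt = JobTracker(["m_0", "m_1", "m_2"])
    tt = TaskTracker("tracker1", slots=2)
    for _ in range(5):
        tt.tick(jt)
    print(jt.progress)
```

The point of the design is that the JobTracker never contacts a TaskTracker on its own: task assignments ride back on the reply to a heartbeat.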







