wait at the end of job
I have a Windows Hadoop cluster consisting of 1 master and 8 slave nodes. My Hadoop program is a collection of recursive jobs, and I create 14 map and 14 reduce tasks in each job. My files are up to 10 MB. My problem is that every job waits at the end: "Map 100% Reduce 100%" is shown on the command prompt, but it does not progress for 5 minutes. Then it goes on to the next job's execution. Does anyone have an idea? Thanks.
custom partitioner
My custom partitioner is:

public class PopulationPartitioner extends Partitioner<IntWritable, Chromosome> implements Configurable {

    private Configuration conf;

    @Override
    public int getPartition(IntWritable key, Chromosome value, int numOfPartitions) {
        int partition = key.get();
        if (partition < 0 || partition >= numOfPartitions) {
            partition = numOfPartitions - 1;
        }
        System.out.println("partition " + partition);
        return partition;
    }

    @Override
    public Configuration getConf() {
        return conf;
    }

    @Override
    public void setConf(Configuration arg0) {
        conf = arg0;
    }
}

And my mapred configuration file is:

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>4</value>
  </property>
</configuration>

Thanks again.

This shouldn't be the case at all. Can you share your Partitioner code and the job.xml of the job that showed this behavior? In any case: how do you set the number of reducers to 4? 2012/3/23 Harun Raşit ER harunrasi...@gmail.com: I wrote a custom partitioner. But when I run in standalone or pseudo-distributed mode, the number of partitions is always 1. I set the number of reducers to 4, but the numOfPartitions parameter of the custom partitioner is still 1, and all four of my mappers' results go to one reducer. The other reducers yield empty files. How can I set the number of partitions in standalone or pseudo-distributed mode? Thanks for your help. -- Harsh J
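One point worth noting about the configuration above: mapred.tasktracker.reduce.tasks.maximum only caps how many reduce slots run concurrently on each TaskTracker; it does not set the job's reducer count, which is what becomes the numOfPartitions argument. A minimal driver sketch with the new-API Job class (the class name, job name, and omitted mapper/reducer/path setup are placeholders, not the original poster's code):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

// Hypothetical driver sketch: "Driver" and the "population" job name are
// placeholders. The per-job reducer count -- which the framework passes to
// getPartition() as numOfPartitions -- is set with setNumReduceTasks(), or
// equivalently the mapred.reduce.tasks property, not the tasktracker maximum.
public class Driver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "population");
        job.setNumReduceTasks(4);                          // becomes numOfPartitions
        job.setPartitionerClass(PopulationPartitioner.class);
        // ... set mapper/reducer classes and input/output paths as usual ...
    }
}
```

This is essentially job configuration; in standalone (local) mode the LocalJobRunner of that Hadoop era ran only a single reducer regardless, which would also explain numOfPartitions being 1 there.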
Re: custom partitioner
Thanks for your help. I assigned key values from a static variable, and when I ran on the Eclipse platform I saw the right key values; but after debugging in distributed mode, I have seen that all my key values are 0. On 3/25/12, Harsh J ha...@cloudera.com wrote: Harun, do your map task stdout logs show varying values for partition? It seems to me like all your keys are somehow outside of [0, numOfPartitions), and hence go to the last partition, per your logic. 2012/3/25 Harun Raşit ER harunrasi...@gmail.com:

public int getPartition(IntWritable key, Chromosome value, int numOfPartitions) {
    int partition = key.get();
    if (partition < 0 || partition >= numOfPartitions) {
        partition = numOfPartitions - 1;
    }
    System.out.println("partition " + partition);
    return partition;
}

I wrote the custom partitioner above. But the problem is with the third parameter, numOfPartitions: it is always 1 in pseudo-distributed mode. I have 4 mappers and 4 reducers, but only one of the reducers receives the real values. The others yield nothing, just empty files. When I remove the if statement, Hadoop complains about the partition number ("illegal partition for"). How can I set the number of partitions in pseudo-distributed mode? Thanks. -- Harsh J
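Harsh's diagnosis can be checked without a cluster: the clamping branch sends every key outside [0, numOfPartitions) to the last partition, and with a single reducer everything collapses into partition 0. A self-contained plain-Java sketch of the same logic (no Hadoop classes; the PartitionDemo name is made up for illustration):

```java
// Plain-Java replica of the partitioner's clamping logic, for tracing by hand.
public class PartitionDemo {
    static int getPartition(int key, int numOfPartitions) {
        int partition = key;
        // Same bounds check as the mailing-list partitioner: out-of-range
        // keys are clamped to the last partition.
        if (partition < 0 || partition >= numOfPartitions) {
            partition = numOfPartitions - 1;
        }
        return partition;
    }

    public static void main(String[] args) {
        System.out.println(getPartition(2, 4));  // 2: key within [0, 4) keeps its value
        System.out.println(getPartition(7, 4));  // 3: out-of-range key clamps to last partition
        System.out.println(getPartition(0, 1));  // 0: with one reducer, every key lands in partition 0
    }
}
```

This also shows why all-zero keys are indistinguishable from a one-reducer setup in the output: both routes put every record into partition 0.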
custom partitioner
public int getPartition(IntWritable key, Chromosome value, int numOfPartitions) {
    int partition = key.get();
    if (partition < 0 || partition >= numOfPartitions) {
        partition = numOfPartitions - 1;
    }
    System.out.println("partition " + partition);
    return partition;
}

I wrote the custom partitioner above. But the problem is with the third parameter, numOfPartitions: it is always 1 in pseudo-distributed mode. I have 4 mappers and 4 reducers, but only one of the reducers receives the real values. The others yield nothing, just empty files. When I remove the if statement, Hadoop complains about the partition number ("illegal partition for"). How can I set the number of partitions in pseudo-distributed mode? Thanks.
number of partitions
I wrote a custom partitioner. But when I run in standalone or pseudo-distributed mode, the number of partitions is always 1. I set the number of reducers to 4, but the numOfPartitions parameter of the custom partitioner is still 1, and all four of my mappers' results go to one reducer. The other reducers yield empty files. How can I set the number of partitions in standalone or pseudo-distributed mode? Thanks for your help.