Hi Daniel,

I think there are better solutions, but simply chopping the input file into pieces (e.g. 10 URLs per file) should work: Hadoop creates at least one map task per input file, so many small files will spread the work across your nodes.
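Here is a minimal sketch of that pre-splitting step (the file names, the urls.txt input, and the 10-URL chunk size are illustrative assumptions, not something from your setup):

    import java.io.IOException;
    import java.nio.file.*;
    import java.util.List;

    // Split one large URL list into many small files so Hadoop
    // schedules (at least) one map task per file.
    public class SplitUrlList {
        public static void main(String[] args) throws IOException {
            int urlsPerFile = 10;
            List<String> urls = Files.readAllLines(Paths.get("urls.txt"));
            for (int i = 0; i < urls.size(); i += urlsPerFile) {
                List<String> chunk =
                    urls.subList(i, Math.min(i + urlsPerFile, urls.size()));
                Path out = Paths.get(
                    String.format("urls-%05d.txt", i / urlsPerFile));
                Files.write(out, chunk);
            }
        }
    }

Alternatively, Hadoop ships an NLineInputFormat that assigns a fixed number of input lines to each map task, which would avoid pre-splitting the file at all.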
2009/12/4 Daniel Garcia <[email protected]>

> Hello!
>
> I'm trying to rewrite an image resizing program in terms of map/reduce. The
> problem I see is that the job is not broken up into small enough tasks. If
> I only have 1 input file with 10,000 URLs (the file is much less than the
> HDFS block size), how can I ensure that the job is distributed amongst all
> the nodes? In other words, how can I ensure that the task size is small
> enough so that all nodes process a proportional share of the input?
>
> Regards,
> Daniel

Best Regards,
Tex Xu
