Hello!

I'm trying to rewrite an image-resizing program in terms of map/reduce. The
problem I see is that the job is not broken up into small enough tasks. If
I only have one input file with 10,000 URLs (the file is much smaller than
the HDFS block size), how can I ensure that the job is distributed amongst
all the nodes? In other words, how can I ensure that the task size is small
enough that every node processes a proportional share of the input?
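For concreteness, the workaround I'm considering is to pre-split the single
input file into many small files, so that each one becomes its own map task.
A rough sketch (the chunk size, directory, and file names here are arbitrary,
just for illustration):

```python
import os

# Hypothetical pre-splitting step: break one large URL list into many
# small input files so the framework can hand each file to a separate
# map task. Chunk size and naming scheme are made up for illustration.

def split_url_list(lines, urls_per_task):
    """Group the input lines into chunks of at most urls_per_task."""
    return [lines[i:i + urls_per_task]
            for i in range(0, len(lines), urls_per_task)]

def write_chunks(chunks, out_dir):
    """Write each chunk to its own file under out_dir."""
    os.makedirs(out_dir, exist_ok=True)
    for n, chunk in enumerate(chunks):
        path = os.path.join(out_dir, "urls-%04d.txt" % n)
        with open(path, "w") as f:
            f.write("\n".join(chunk) + "\n")

# e.g. 10,000 URLs at 250 per task would yield 40 input files / map tasks
```

Is something like this the usual approach, or is there a way to tell the
framework directly to use smaller splits for a file like this?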

Regards,
Daniel
