Hi,
AFAIK, it is a hint. The actual number of splits is computed from the block
size, the minimum split size and this hint. So if total_size/hint is less than
the block size but greater than the min split size, you should see exactly the
number you asked for.
This is how I understand it, please let me know if I'm goin
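
For what it's worth, here is a minimal sketch of how I understand the
computation, assuming the old-API FileInputFormat rule
splitSize = max(minSize, min(totalSize/hint, blockSize)) and a 10% slop on
the last split. The class and method names are made up for illustration, and
it only models a single splittable file, so treat it as an approximation and
not the actual Hadoop code.

```java
public class SplitSizeSketch {

    // Assumed rule: the goal size (total/hint) is capped by the block size
    // and floored by the minimum split size.
    static long computeSplitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    // Counts the splits one splittable file of totalSize bytes would produce.
    // Hypothetical helper, not a Hadoop API.
    static long countSplits(long totalSize, int hint, long minSize, long blockSize) {
        long goalSize = totalSize / (hint == 0 ? 1 : hint);
        long splitSize = computeSplitSize(goalSize, minSize, blockSize);
        long remaining = totalSize;
        long count = 0;
        // Assumed slop factor of 1.1: the last split may be up to 10% larger
        // than splitSize instead of spawning a tiny extra split.
        while ((double) remaining / splitSize > 1.1) {
            remaining -= splitSize;
            count++;
        }
        if (remaining != 0) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Goal size 10 MB is below the 64 MB block size, so the hint is honoured.
        System.out.println(countSplits(100 * mb, 10, 1, 64 * mb));
        // Goal size 500 MB exceeds the block size, so the block size wins.
        System.out.println(countSplits(1000 * mb, 2, 1, 64 * mb));
    }
}
```

With a 100 MB file, a hint of 10 and 64 MB blocks, this gives the 10 splits
asked for. With a 1000 MB file and a hint of 2 it gives 16, because the
block size caps the split size.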
Apologies for the repost. The previous message was sent from my
personal account and caused confusion for some people.
I'd like to invite the Hadoop community to check out the application
I've developed. Its name is pomsets, and it is a workflow management
system for your cloud. In short,
Yes, you are right.
2010/3/22 毛宏
> I read in "Towards Optimizing Hadoop Provisioning in the Cloud"
> that "mapred.tasktracker.map.tasks.maximum and
> mapred.tasktracker.reduce.tasks.maximum respectively set the maximum
> number of parallel mappers and reducers that can run on a Hadoop
> s
I read in "Towards Optimizing Hadoop Provisioning in the Cloud"
that "mapred.tasktracker.map.tasks.maximum and
mapred.tasktracker.reduce.tasks.maximum respectively set the maximum
number of parallel mappers and reducers that can run on a Hadoop
slave".
It means that a tasktracker in H
Hi all,
in InputFormat.getSplits(JobConf, splitNum), I think splitNum should be a
hint. The number of splits equals the number of mappers working on that
file. But I do get exactly the number of splits given by splitNum, and the
sum of the split lengths is the length of that file. I