Re: split number

2010-03-21 Thread Amogh Vasekar
Hi, AFAIK, it is a hint. The exact number of splits is computed from the block size, the minimum split size, and this hint. So if total_size/hint is smaller than the block size but greater than the minimum split size, you should see exactly the hinted number of splits. This is how I understand it, please let me know if I'm goin
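
The arithmetic described in this reply can be sketched in plain Java. This is a minimal sketch, assuming a single splittable file and the old-API `FileInputFormat` rule `splitSize = max(minSize, min(goalSize, blockSize))` with `goalSize = totalSize / hint`; the class and method names (`SplitSizeSketch`, `computeSplitSize`, `splitLengths`) are hypothetical and not part of Hadoop, and block locations and multi-file inputs are ignored.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the split arithmetic; not Hadoop code. */
final class SplitSizeSketch {
    // The last split may run up to 10% over the split size before a new one is cut.
    static final double SPLIT_SLOP = 1.1;

    /** splitSize = max(minSize, min(goalSize, blockSize)). */
    static long computeSplitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    /** Returns the length of each split for one file, given the hint. */
    static List<Long> splitLengths(long fileSize, int hint, long minSize, long blockSize) {
        long goalSize = fileSize / Math.max(hint, 1);
        // Clamp minSize to 1 byte so this sketch never divides by zero.
        long splitSize = computeSplitSize(goalSize, Math.max(minSize, 1), blockSize);
        List<Long> lengths = new ArrayList<>();
        long remaining = fileSize;
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            lengths.add(splitSize);
            remaining -= splitSize;
        }
        if (remaining > 0) {
            lengths.add(remaining); // final, possibly shorter, split
        }
        return lengths;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // 256 MB file, 64 MB blocks, 1-byte minimum split size.
        System.out.println(splitLengths(256 * mb, 8, 1, 64 * mb).size()); // goal 32 MB < block size
        System.out.println(splitLengths(256 * mb, 2, 1, 64 * mb).size()); // goal 128 MB > block size
    }
}
```

Under these assumptions a hint of 8 gives a 32 MB goal size, which is below the block size, so the hint is honored; a hint of 2 gives a 128 MB goal size, which is capped at the 64 MB block size, so more splits than hinted are produced. In both cases the split lengths add up to the file length.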

pomsets: workflow management for your cloud

2010-03-21 Thread michael j pan
Apologies for the repost. The previous message was sent from my personal account and caused confusion for some people. I'd like to invite the Hadoop community to check out the application I've developed. Its name is pomsets, and it is a workflow management system for your cloud. In short,

Re: a question about tasktracker in hadoop

2010-03-21 Thread Eason.Lee
Yes, you are right. 2010/3/22 毛宏 > I read in "Towards Optimizing Hadoop Provisioning in the Cloud" > that "mapred.tasktracker.map.tasks.maximum and > mapred.tasktracker.reduce.tasks.maximum respectively set the maximum > number of parallel mappers and reducers that can run on a Hadoop > s

a question about tasktracker in hadoop

2010-03-21 Thread 毛宏
I read in "Towards Optimizing Hadoop Provisioning in the Cloud" that "mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum respectively set the maximum number of parallel mappers and reducers that can run on a Hadoop slave". It means that a tasktracker in H
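
To make the quoted settings concrete, here is a minimal sketch of how per-tasktracker slot limits translate into cluster-wide concurrency. It assumes every slave runs one tasktracker with the same `mapred.tasktracker.map.tasks.maximum` value; the class and method names (`SlotCapacitySketch`, `clusterSlots`, `waves`) and the example numbers are hypothetical and only illustrate the arithmetic.

```java
/** Hypothetical sketch of slot arithmetic; not Hadoop code. */
final class SlotCapacitySketch {
    /** Tasks of one kind (map or reduce) that can run at the same time across the cluster. */
    static int clusterSlots(int taskTrackers, int slotsPerTracker) {
        return taskTrackers * slotsPerTracker;
    }

    /** Number of scheduling "waves" needed to run all tasks, rounded up. */
    static int waves(int tasks, int slots) {
        return (tasks + slots - 1) / slots;
    }

    public static void main(String[] args) {
        int mapSlots = clusterSlots(10, 2); // 10 slaves, 2 map slots each
        System.out.println(mapSlots);
        System.out.println(waves(100, mapSlots)); // 100 map tasks on those slots
    }
}
```

With 10 slaves and 2 map slots per tasktracker, at most 20 map tasks run in parallel, so a job with 100 map tasks needs 5 waves under these assumptions.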

split number

2010-03-21 Thread Gang Luo
Hi all, in InputFormat.getSplits(JobConf, splitNum), I think the splitNum should be a hint. The number of splits is equal to the number of mappers working on that file. But I do get the same number of splits as indicated by splitNum, and the sum of the split lengths is the length of that file. I