Re: How do map tasks get assigned efficiently?

2012-10-24 Thread Harsh J
ng such disk contention in Hadoop? Is HDFS smart enough to serialize major disk access? From: Michael Segel [mailto:michael_se...@hotmail.com] Sent: Wednesday, October 24, 2012 6:51 PM To: user@hadoop.apache.org Subject: Re: How do map tasks get

RE: How do map tasks get assigned efficiently?

2012-10-24 Thread David Parks
serialize major disk access? From: Michael Segel [mailto:michael_se...@hotmail.com] Sent: Wednesday, October 24, 2012 6:51 PM To: user@hadoop.apache.org Subject: Re: How do map tasks get assigned efficiently? So... Data locality only works when you actually have data on the cluster it

Re: How do map tasks get assigned efficiently?

2012-10-24 Thread Michael Segel
So... Data locality only works when you actually have data on the cluster itself. Otherwise, how can the data be local? Assuming 3x replication, no custom split, and a splittable input file, you will split along the block boundaries. So if your input file has 5 b
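
As a rough illustration of the point in the message above (splits follow block boundaries, and locality is only possible when a replica of the block lives on the cluster), here is a minimal Python sketch. It is not Hadoop's actual scheduler code: the function names, host names, file size, and block size are all hypothetical, and it ignores details such as minimum/maximum split size settings and rack awareness.

```python
MB = 1024 * 1024


def block_splits(file_size, block_size):
    """Return one (offset, length) pair per block-aligned split.

    Simplified assumption: one split per HDFS block, no custom split logic.
    """
    splits = []
    offset = 0
    while offset < file_size:
        length = min(block_size, file_size - offset)
        splits.append((offset, length))
        offset += length
    return splits


def pick_split(free_host, pending, replica_hosts):
    """Choose a pending split for a host with a free map slot.

    Prefers a split that has a replica on free_host (data-local);
    otherwise falls back to the first pending split (non-local).
    Returns (split_index, is_local), or (None, False) if nothing is pending.
    """
    for idx in pending:
        if free_host in replica_hosts[idx]:
            return idx, True
    if pending:
        return pending[0], False
    return None, False


# Hypothetical 200 MB file with 64 MB blocks and 3 replicas per block.
splits = block_splits(200 * MB, 64 * MB)
replica_hosts = {
    0: {"node1", "node2", "node3"},
    1: {"node2", "node3", "node4"},
    2: {"node1", "node4", "node5"},
    3: {"node3", "node5", "node6"},
}

print(len(splits))                                       # number of map tasks
print(pick_split("node4", [0, 1, 2, 3], replica_hosts))  # local match exists
print(pick_split("node9", [0, 1, 2, 3], replica_hosts))  # no replica on node9
```

With 3 replicas per block, each split has three candidate hosts where it can run data-local. Input read from outside the cluster (for example S3 or HTTP) has no replica hosts at all, so every task over it would take the non-local path in this sketch.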

How do map tasks get assigned efficiently?

2012-10-23 Thread David Parks
Even after reading O'Reilly's book on Hadoop, I don't feel like I have a clear vision of how the map tasks get assigned. They depend on splits, right? But I have 3 jobs running. And splits will come from various sources: HDFS, S3, and slow HTTP sources. So I've got some concern as to how t