Re: Problems with timeout when a Hadoop job generates a large number of key-value pairs

2012-01-23 Thread Alex Kozlov
On Jan 23, 2012, at 8:57 AM, Steve Lewis wrote: > I have been silent for a few days because on my cluster I was UNABLE to > reproduce the issue. > What I do see is that merge is taking a HUGE amount of time - > Yes, correct. Hadoop has to persist (sorted) data to disk in the current implementat

Re: Problems with timeout when a Hadoop job generates a large number of key-value pairs

2012-01-23 Thread Steve Lewis
I have been silent for a few days because on my cluster I was UNABLE to reproduce the issue. What I do see is that merge is taking a HUGE amount of time - In my hands the mapper reaches 100% and then enters a silent phase, running the combiner and other merge operations. Is it your experience that the ti

Re: Problems with timeout when a Hadoop job generates a large number of key-value pairs

2012-01-22 Thread Alex Kozlov
Hi Steve, I think I was able to reproduce your problem over the weekend (not sure though, it may be a different problem). In my case it was that the mappers were timing out during the merge phase. I also think the related tickets are MAPREDUCE-2177

Re: Problems with timeout when a Hadoop job generates a large number of key-value pairs

2012-01-20 Thread Michael Segel
That's the one... Sent from my iPhone On Jan 20, 2012, at 6:28 PM, "Paul Ho" wrote: > I think the balancing bandwidth property you are looking for is in > hdfs-site.xml: > > <property> > <name>dfs.balance.bandwidthPerSec</name> > <value>402653184</value> > </property> > > Set the value that makes most sense for your NIC

Re: Problems with timeout when a Hadoop job generates a large number of key-value pairs

2012-01-20 Thread Paul Ho
I think the balancing bandwidth property you are looking for is in hdfs-site.xml: <property> <name>dfs.balance.bandwidthPerSec</name> <value>402653184</value> </property> Set the value that makes most sense for your NIC. But I thought this is only for balancing. On Jan 20, 2012, at 3:43 PM, Michael Segel wrote: > Ste

Re: Problems with timeout when a Hadoop job generates a large number of key-value pairs

2012-01-20 Thread Michael Segel
Steve, Ok, first, your client connection to the cluster is a non-issue. If you go into /etc/Hadoop/conf... That's supposed to be a little h, but my iPhone knows what's best... Look and see what you have set for your bandwidth... I forget which parameter, but there are only a couple that deal with ban

Re: Problems with timeout when a Hadoop job generates a large number of key-value pairs

2012-01-20 Thread Steve Lewis
Interesting - I strongly suspect a disk IO or network problem since my code is very simple and very fast. If you add lines to generateSubStrings to limit String length to 100 characters (I think it is always that but this makes su public static String[] generateSubStrings(String inp, int minLeng

Re: Problems with timeout when a Hadoop job generates a large number of key-value pairs

2012-01-20 Thread Steve Lewis
Good catch on the Configured - In my tests it extends my subclass of Configured, but I took out any dependencies on my environment. Interesting - I strongly suspect a disk IO or network problem since my code is very simple and very fast. If you add lines to generateSubStrings to limit String le

Re: Problems with timeout when a Hadoop job generates a large number of key-value pairs

2012-01-20 Thread Steve Lewis
One thing I can say for sure is that generateSubStrings() is not slow - Every input line in my sample is 100 characters and the timing should be very similar from one run to the next. This sample is a simplification of a more complex real problem where we see timeouts when a map generates signifi
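
To get a feel for how many records a single 100-character line can produce, here is a rough sketch of the arithmetic. It assumes generateSubStrings emits every contiguous substring whose length falls between a minimum and a maximum; that is only an assumption, since the method body is not shown in this thread. `SubstringCount`, `count`, and the `maxLength` parameter are hypothetical names used for illustration.

```java
// Hypothetical helper: counts the contiguous substrings of a string of length n
// whose length lies in [minLength, maxLength]. This is NOT the poster's code;
// it only illustrates how quickly the output volume grows.
final class SubstringCount {
    private SubstringCount() {
    }

    static long count(int n, int minLength, int maxLength) {
        long total = 0;
        int upper = Math.min(maxLength, n);
        // A string of length n has (n - len + 1) substrings of length len.
        for (int len = Math.max(minLength, 1); len <= upper; len++) {
            total += n - len + 1;
        }
        return total;
    }
}
```

Under that assumption, `SubstringCount.count(100, 1, 100)` returns 5050, which is in line with "thousands of records for every input record".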

Re: Problems with timeout when a Hadoop job generates a large number of key-value pairs

2012-01-20 Thread Steve Lewis
On Fri, Jan 20, 2012 at 12:18 PM, Michel Segel wrote: > Steve, > If you want me to debug your code, I'll be glad to set up a billable > contract... ;-) > > What I am willing to do is to help you to debug your code.. The code seems to work well for small input files and is basically a standard sa

Re: Problems with timeout when a Hadoop job generates a large number of key-value pairs

2012-01-20 Thread Steve Lewis
; >Sent: Friday, January 20, 2012 9:16 AM > >Subject: Problems with timeout when a Hadoop job generates a large number > of key-value pairs > > > >We have been having problems with mappers timing out after 600 sec when > the > >mapper writes many more, say thousands of

Re: Problems with timeout when a Hadoop job generates a large number of key-value pairs

2012-01-20 Thread Vinod Kumar Vavilapalli
Every so often, you should do a context.progress() so that the framework knows that this map is doing useful work. That will prevent the framework from killing it after 10 mins. The framework automatically does this every time you do a context.write()/context.setStatus(), but if the map is stuck fo
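
A minimal sketch of the "every so often" pattern, written against a stand-in interface instead of the real Hadoop classes so that it compiles on its own. `ProgressSink`, `Heartbeat`, and the interval value are hypothetical names for illustration; in an actual mapper the reported call would be the context.progress() mentioned above.

```java
// Stand-in for the single call the loop needs; in a real mapper this would
// delegate to context.progress().
interface ProgressSink {
    void progress();
}

// Reports progress at most once per interval, so a long-running loop stays
// visibly alive without calling the framework on every iteration.
final class Heartbeat {
    private final ProgressSink sink;
    private final long intervalMillis;
    private long lastReportMillis;

    Heartbeat(ProgressSink sink, long intervalMillis, long nowMillis) {
        this.sink = sink;
        this.intervalMillis = intervalMillis;
        this.lastReportMillis = nowMillis;
    }

    // Call from inside the long-running loop with the current time.
    // Returns true if progress was reported on this call.
    boolean tick(long nowMillis) {
        if (nowMillis - lastReportMillis >= intervalMillis) {
            sink.progress();
            lastReportMillis = nowMillis;
            return true;
        }
        return false;
    }
}
```

Inside the loop this would be used as `heartbeat.tick(System.currentTimeMillis())`, with an interval comfortably below the 10-minute limit.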

Re: Problems with timeout when a Hadoop job generates a large number of key-value pairs

2012-01-20 Thread Alex Kozlov
Hi Steve, I ran your job on our cluster and it does not time out. I noticed that each mapper runs for a long time: one way to avoid a timeout is to update a user counter. As long as this counter is updated within 10 minutes, the task should not time out (as MR knows that something is being done).
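
The counter idea can be sketched roughly as follows. To keep the example self-contained it uses a stand-in interface rather than Hadoop's own counter classes; `UserCounter`, `CountingEmitter`, and the batch size are hypothetical names chosen for illustration. The counter is updated in batches so that it is touched regularly but not once per record.

```java
// Stand-in for a user counter; in a real job this role would be played by a
// counter obtained from the task context.
interface UserCounter {
    void increment(long amount);
}

// Tracks emitted records and pushes them to the counter in batches.
final class CountingEmitter {
    private final UserCounter counter;
    private final int batchSize;
    private int pending;

    CountingEmitter(UserCounter counter, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.counter = counter;
        this.batchSize = batchSize;
    }

    // Call once per emitted record; updates the counter every batchSize records.
    void recordEmitted() {
        pending++;
        if (pending >= batchSize) {
            flush();
        }
    }

    // Pushes any records not yet reported, e.g. at the end of map().
    void flush() {
        if (pending > 0) {
            counter.increment(pending);
            pending = 0;
        }
    }
}
```

The batch size only needs to be small enough that the counter is updated well within the 10-minute window.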

Re: Problems with timeout when a Hadoop job generates a large number of key-value pairs

2012-01-20 Thread Michel Segel
Steve, If you want me to debug your code, I'll be glad to set up a billable contract... ;-) What I am willing to do is to help you to debug your code... Did you time how long it takes in the Mapper.map() method? The reason I asked this is to first confirm that you are failing within a map() met
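
One simple way to answer the timing question is to wrap the body of map() in a small timer and log the totals at the end of the task. The sketch below is plain Java with no Hadoop dependency; `CallTimer` and its methods are hypothetical names for illustration, not part of any Hadoop API.

```java
// Accumulates how long a repeatedly called piece of code takes.
final class CallTimer {
    private long calls;
    private long totalNanos;
    private long maxNanos;

    // Records one measured call.
    void record(long elapsedNanos) {
        calls++;
        totalNanos += elapsedNanos;
        if (elapsedNanos > maxNanos) {
            maxNanos = elapsedNanos;
        }
    }

    // Runs the body and records its wall-clock duration, even if it throws.
    void time(Runnable body) {
        long start = System.nanoTime();
        try {
            body.run();
        } finally {
            record(System.nanoTime() - start);
        }
    }

    long calls() {
        return calls;
    }

    long maxNanos() {
        return maxNanos;
    }

    // Mean duration per call in milliseconds, or 0.0 if nothing was recorded.
    double meanMillis() {
        return calls == 0 ? 0.0 : (totalNanos / 1_000_000.0) / calls;
    }
}
```

If the mean and maximum per-call times are small while the task still times out, the delay is more likely outside map() itself.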

Re: Problems with timeout when a Hadoop job generates a large number of key-value pairs

2012-01-20 Thread Raj V
M >Subject: Problems with timeout when a Hadoop job generates a large number of >key-value pairs > >We have been having problems with mappers timing out after 600 sec when the >mapper writes many more, say thousands of records for every >input record - even when the code in the map

Problems with timeout when a Hadoop job generates a large number of key-value pairs

2012-01-20 Thread Steve Lewis
We have been having problems with mappers timing out after 600 sec when the mapper writes many more, say thousands of records for every input record - even when the code in the mapper is small and fast. I have no idea what could cause the system to be so slow and am reluctant to raise the 600 sec l