The raid (contrib) test TestBlockFixer is hanging. I forced 2 thread dumps; both hung in the same place. Filed https://issues.apache.org/jira/browse/MAPREDUCE-2283

This is a blocker for turning on MR precommit.
Cheers,
Nige

On Jan 25, 2011, at 11:19 PM, Nigel Daley wrote:

> Started another trial run of MR precommit testing:
> https://hudson.apache.org/hudson/view/G-L/view/Hadoop/job/PreCommit-MAPREDUCE-Build/17/
>
> Let's see if 17th time is a charm...
>
> Nige
>
> On Jan 7, 2011, at 5:14 PM, Todd Lipcon wrote:
>
>> On Fri, Jan 7, 2011 at 2:11 PM, Nigel Daley <nda...@mac.com> wrote:
>>
>>> Hrm, the MR precommit test I'm running has hung (been running for 14 hours
>>> so far). FWIW, 2 HDFS precommit tests are hung too. I suspect it could be
>>> the NFS mounts on the machines. I forced a thread dump which you can see in
>>> the console:
>>> https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/10/console
>>>
>>>
>> Strange, haven't seen a hang like that before in handleConnectionFailure. It
>> should retry for 15 minutes max in that loop.
>>
>>
>>> Any other ideas why these might be hanging?
>>>
>>>
>> There is an HDFS bug right now that can cause hangs on some tests -
>> HDFS-1529 - would appreciate if someone can take a look. But I don't think
>> this is responsible for the MR hang above.
>>
>> -Todd
>>
>>
>>> On Jan 5, 2011, at 5:42 PM, Todd Lipcon wrote:
>>>
>>>> On Wed, Jan 5, 2011 at 4:39 PM, Nigel Daley <nda...@mac.com> wrote:
>>>>
>>>>> Thanks for looking into it Todd. Let's first see if you think it can be
>>>>> fixed quickly. Let me know.
>>>>>
>>>>>
>>>> No problem, it wasn't too bad after all. Patch up on HADOOP-7087 which fixes
>>>> this test timeout for me.
>>>>
>>>> -Todd
>>>>
>>>>
>>>>> On Jan 5, 2011, at 4:33 PM, Todd Lipcon wrote:
>>>>>
>>>>>> On Wed, Jan 5, 2011 at 4:19 PM, Nigel Daley <nda...@mac.com> wrote:
>>>>>>
>>>>>>> Todd, would love to get
>>>>>>> https://issues.apache.org/jira/browse/MAPREDUCE-2121 fixed first since
>>>>>>> this is failing every night on trunk.
>>>>>>>
>>>>>>
>>>>>> What if we disable that test, move that issue to 0.22 blocker, and then
>>>>>> enable the test-patch? I'll also look into that one today, but if it's
>>>>>> something that will take a while to fix, I don't think we should hold off
>>>>>> the useful testing for all the other patches.
>>>>>>
>>>>>> -Todd
>>>>>>
>>>>>> On Jan 5, 2011, at 2:45 PM, Todd Lipcon wrote:
>>>>>>>
>>>>>>>> Hi Nigel,
>>>>>>>>
>>>>>>>> MAPREDUCE-2172 has been fixed for a while. Are there any other particular
>>>>>>>> JIRAs you think need to be fixed before the MR test-patch queue gets
>>>>>>>> enabled? I have a lot of outstanding patches and doing all the test-patch
>>>>>>>> turnaround manually on 3 different boxes is a real headache.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> -Todd
>>>>>>>>
>>>>>>>> On Tue, Dec 21, 2010 at 1:33 PM, Nigel Daley <nda...@mac.com> wrote:
>>>>>>>>
>>>>>>>>> Ok, HDFS is now enabled. You'll see a stream of updates shortly on the ~30
>>>>>>>>> Patch Available HDFS issues.
>>>>>>>>>
>>>>>>>>> Nige
>>>>>>>>>
>>>>>>>>> On Dec 20, 2010, at 12:42 PM, Jakob Homan wrote:
>>>>>>>>>
>>>>>>>>>> I committed HDFS-1511 this morning. We should be good to go. I can
>>>>>>>>>> haz snooty robot butler?
>>>>>>>>>>
>>>>>>>>>> On Fri, Dec 17, 2010 at 8:31 PM, Konstantin Boudnik <c...@apache.org> wrote:
>>>>>>>>>>> Thanks Jacob. I am wasted already but I can do it on Sun, I think,
>>>>>>>>>>> unless it is done earlier.
>>>>>>>>>>> --
>>>>>>>>>>> Take care,
>>>>>>>>>>> Konstantin (Cos) Boudnik
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Dec 17, 2010 at 19:41, Jakob Homan <jgho...@gmail.com> wrote:
>>>>>>>>>>>> Ok. I'll get a patch out for 1511 tomorrow, unless someone wants to
>>>>>>>>>>>> whip one up tonight.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Dec 17, 2010 at 7:22 PM, Nigel Daley <nda...@mac.com> wrote:
>>>>>>>>>>>>> I agree with Cos on fixing HDFS-1511 first. Once that is done I'll
>>>>>>>>>>>>> enable hdfs patch testing.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>> Nige
>>>>>>>>>>>>>
>>>>>>>>>>>>> Sent from my iPhone4
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Dec 17, 2010, at 7:01 PM, Konstantin Boudnik <c...@apache.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> One more issue needs to be addressed before test-patch is turned on
>>>>>>>>>>>>>> HDFS is
>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/HDFS-1511
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> Take care,
>>>>>>>>>>>>>> Konstantin (Cos) Boudnik
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Dec 17, 2010 at 16:17, Konstantin Boudnik <c...@apache.org> wrote:
>>>>>>>>>>>>>>> Considering that because of these 4 faulty cases every patch will be
>>>>>>>>>>>>>>> -1'ed a patch author will still have to look at it and make a comment
>>>>>>>>>>>>>>> why this particular -1 isn't valid. Lesser work, perhaps, but messier
>>>>>>>>>>>>>>> IMO. I'm not blocking it - I just feel like there's a better way.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> Take care,
>>>>>>>>>>>>>>> Konstantin (Cos) Boudnik
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Dec 17, 2010 at 15:55, Jakob Homan <jgho...@gmail.com> wrote:
>>>>>>>>>>>>>>>>> If HDFS is added to the test-patch queue right now we get
>>>>>>>>>>>>>>>>> nothing but dozens of -1'ed patches.
>>>>>>>>>>>>>>>> There aren't dozens of patches being submitted currently. The -1
>>>>>>>>>>>>>>>> isn't the important thing, it's the grunt work of actually running
>>>>>>>>>>>>>>>> (and waiting) for the tests, test-patch, etc. that Hudson does so that
>>>>>>>>>>>>>>>> the developer doesn't have to.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Dec 17, 2010 at 3:48 PM, Dhruba Borthakur <dhr...@gmail.com> wrote:
>>>>>>>>>>>>>>>>> +1, thanks for doing this.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Fri, Dec 17, 2010 at 3:19 PM, Jakob Homan <jgho...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> So, with test-patch updated to show the failing tests, saving the
>>>>>>>>>>>>>>>>>> developers the need to go and verify that the failed tests are all
>>>>>>>>>>>>>>>>>> known, how do people feel about turning on test-patch again for HDFS
>>>>>>>>>>>>>>>>>> and mapred? I think it'll help prevent any more tests from entering
>>>>>>>>>>>>>>>>>> the "yeah, we know" category.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>> jg
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Wed, Nov 17, 2010 at 5:08 PM, Jakob Homan <jho...@yahoo-inc.com> wrote:
>>>>>>>>>>>>>>>>>>> True, each patch would get a -1 and the failing tests would need to be
>>>>>>>>>>>>>>>>>>> verified as those known bad (BTW, it would be great if Hudson could list
>>>>>>>>>>>>>>>>>>> which tests failed in the message it posts to JIRA). But that's still quite
>>>>>>>>>>>>>>>>>>> a bit less error-prone work than if the developer runs the tests and
>>>>>>>>>>>>>>>>>>> test-patch themselves. Also, with 22 being cut, there are a lot of patches
>>>>>>>>>>>>>>>>>>> up in the air and several developers are juggling multiple patches. The
>>>>>>>>>>>>>>>>>>> more automation we can have, even if it's not perfect, will decrease errors
>>>>>>>>>>>>>>>>>>> we may make.
>>>>>>>>>>>>>>>>>>> -jg
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Nigel Daley wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Nov 17, 2010, at 3:11 PM, Jakob Homan wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> It's also ready to run on MapReduce and HDFS but we won't turn it on
>>>>>>>>>>>>>>>>>>>>>> until these projects build and test cleanly. Looks like both these projects
>>>>>>>>>>>>>>>>>>>>>> currently have test failures.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Assuming the projects are compiling and building, is there a reason to
>>>>>>>>>>>>>>>>>>>>> not turn it on despite the test failures? Hudson is invaluable to developers
>>>>>>>>>>>>>>>>>>>>> who then don't have to run the tests and test-patch themselves. We didn't
>>>>>>>>>>>>>>>>>>>>> turn Hudson off when it was working previously and there were known
>>>>>>>>>>>>>>>>>>>>> failures. I think one of the reasons we have more failing tests now is the
>>>>>>>>>>>>>>>>>>>>> higher cost of doing Hudson's work (not a great excuse I know). This is
>>>>>>>>>>>>>>>>>>>>> particularly true now because several of the failing tests involve tests
>>>>>>>>>>>>>>>>>>>>> timing out, making the whole testing regime even longer.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Every single patch would get a -1 and need investigation. Currently, that
>>>>>>>>>>>>>>>>>>>> would be about 83 investigations between MR and HDFS issues that are in
>>>>>>>>>>>>>>>>>>>> patch available state. Shouldn't we focus on getting these tests fixed or
>>>>>>>>>>>>>>>>>>>> removed/? Also, I need to get MAPREDUCE-2172 fixed (applies to HDFS as
>>>>>>>>>>>>>>>>>>>> well) before I turn this on.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>>> Nige
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>> Connect to me at http://www.facebook.com/dhruba
>>>>>>>>
>>>>>>>> --
>>>>>>>> Todd Lipcon
>>>>>>>> Software Engineer, Cloudera
>>>>>>
>>>>>> --
>>>>>> Todd Lipcon
>>>>>> Software Engineer, Cloudera
>>>>
>>>> --
>>>> Todd Lipcon
>>>> Software Engineer, Cloudera
>>
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera