On Wed, Jan 26, 2011 at 10:05 AM, Nigel Daley <nda...@mac.com> wrote:
> raid (contrib) test hanging: TestBlockFixer
>
> I forced 2 thread dumps. Both hung in the same place. Filed
> https://issues.apache.org/jira/browse/MAPREDUCE-2283 This is a blocker
> for turning on MR precommit.
>

Since this is contrib, I'd like to suggest just disabling this test
temporarily. We can re-enable it once it's fixed. Not having MR pre-commit
working has been pretty painful.

-Todd

> On Jan 25, 2011, at 11:19 PM, Nigel Daley wrote:
>
> > Started another trial run of MR precommit testing:
> > https://hudson.apache.org/hudson/view/G-L/view/Hadoop/job/PreCommit-MAPREDUCE-Build/17/
> >
> > Let's see if 17th time is a charm...
> >
> > Nige
> >
> > On Jan 7, 2011, at 5:14 PM, Todd Lipcon wrote:
> >
> >> On Fri, Jan 7, 2011 at 2:11 PM, Nigel Daley <nda...@mac.com> wrote:
> >>
> >>> Hrm, the MR precommit test I'm running has hung (been running for 14 hours
> >>> so far). FWIW, 2 HDFS precommit tests are hung too. I suspect it could be
> >>> the NFS mounts on the machines. I forced a thread dump which you can see in
> >>> the console:
> >>> https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/10/console
> >>>
> >> Strange, haven't seen a hang like that before in handleConnectionFailure. It
> >> should retry for 15 minutes max in that loop.
> >>
> >>> Any other ideas why these might be hanging?
> >>>
> >> There is an HDFS bug right now that can cause hangs on some tests -
> >> HDFS-1529 - would appreciate if someone can take a look. But I don't think
> >> this is responsible for the MR hang above.
> >>
> >> -Todd
> >>
> >>> On Jan 5, 2011, at 5:42 PM, Todd Lipcon wrote:
> >>>
> >>>> On Wed, Jan 5, 2011 at 4:39 PM, Nigel Daley <nda...@mac.com> wrote:
> >>>>
> >>>>> Thanks for looking into it Todd. Let's first see if you think it can be
> >>>>> fixed quickly. Let me know.
> >>>>>
> >>>> No problem, it wasn't too bad after all. Patch up on HADOOP-7087 which fixes
> >>>> this test timeout for me.
> >>>>
> >>>> -Todd
> >>>>
> >>>>> On Jan 5, 2011, at 4:33 PM, Todd Lipcon wrote:
> >>>>>
> >>>>>> On Wed, Jan 5, 2011 at 4:19 PM, Nigel Daley <nda...@mac.com> wrote:
> >>>>>>
> >>>>>>> Todd, would love to get
> >>>>>>> https://issues.apache.org/jira/browse/MAPREDUCE-2121 fixed first since
> >>>>>>> this is failing every night on trunk.
> >>>>>>>
> >>>>>>
> >>>>>> What if we disable that test, move that issue to 0.22 blocker, and then
> >>>>>> enable the test-patch? I'll also look into that one today, but if it's
> >>>>>> something that will take a while to fix, I don't think we should hold off
> >>>>>> the useful testing for all the other patches.
> >>>>>>
> >>>>>> -Todd
> >>>>>>
> >>>>>> On Jan 5, 2011, at 2:45 PM, Todd Lipcon wrote:
> >>>>>>>
> >>>>>>>> Hi Nigel,
> >>>>>>>>
> >>>>>>>> MAPREDUCE-2172 has been fixed for a while. Are there any other particular
> >>>>>>>> JIRAs you think need to be fixed before the MR test-patch queue gets
> >>>>>>>> enabled? I have a lot of outstanding patches and doing all the test-patch
> >>>>>>>> turnaround manually on 3 different boxes is a real headache.
> >>>>>>>>
> >>>>>>>> Thanks
> >>>>>>>> -Todd
> >>>>>>>>
> >>>>>>>> On Tue, Dec 21, 2010 at 1:33 PM, Nigel Daley <nda...@mac.com> wrote:
> >>>>>>>>
> >>>>>>>>> Ok, HDFS is now enabled. You'll see a stream of updates shortly on the ~30
> >>>>>>>>> Patch Available HDFS issues.
> >>>>>>>>>
> >>>>>>>>> Nige
> >>>>>>>>>
> >>>>>>>>> On Dec 20, 2010, at 12:42 PM, Jakob Homan wrote:
> >>>>>>>>>
> >>>>>>>>>> I committed HDFS-1511 this morning. We should be good to go. I can
> >>>>>>>>>> haz snooty robot butler?
> >>>>>>>>>>
> >>>>>>>>>> On Fri, Dec 17, 2010 at 8:31 PM, Konstantin Boudnik <c...@apache.org> wrote:
> >>>>>>>>>>> Thanks Jacob. I am wasted already but I can do it on Sun, I think,
> >>>>>>>>>>> unless it is done earlier.
> >>>>>>>>>>> --
> >>>>>>>>>>> Take care,
> >>>>>>>>>>> Konstantin (Cos) Boudnik
> >>>>>>>>>>>
> >>>>>>>>>>> On Fri, Dec 17, 2010 at 19:41, Jakob Homan <jgho...@gmail.com> wrote:
> >>>>>>>>>>>> Ok. I'll get a patch out for 1511 tomorrow, unless someone wants to
> >>>>>>>>>>>> whip one up tonight.
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Fri, Dec 17, 2010 at 7:22 PM, Nigel Daley <nda...@mac.com> wrote:
> >>>>>>>>>>>>> I agree with Cos on fixing HDFS-1511 first. Once that is done I'll
> >>>>>>>>>>>>> enable hdfs patch testing.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>> Nige
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Sent from my iPhone4
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Dec 17, 2010, at 7:01 PM, Konstantin Boudnik <c...@apache.org> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> One more issue needs to be addressed before test-patch is turned on HDFS is
> >>>>>>>>>>>>>> https://issues.apache.org/jira/browse/HDFS-1511
> >>>>>>>>>>>>>> --
> >>>>>>>>>>>>>> Take care,
> >>>>>>>>>>>>>> Konstantin (Cos) Boudnik
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Fri, Dec 17, 2010 at 16:17, Konstantin Boudnik <c...@apache.org> wrote:
> >>>>>>>>>>>>>>> Considering that because of these 4 faulty cases every patch will be
> >>>>>>>>>>>>>>> -1'ed a patch author will still have to look at it and make a comment
> >>>>>>>>>>>>>>> why this particular -1 isn't valid. Lesser work, perhaps, but messier
> >>>>>>>>>>>>>>> IMO. I'm not blocking it - I just feel like there's a better way.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>> Take care,
> >>>>>>>>>>>>>>> Konstantin (Cos) Boudnik
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Fri, Dec 17, 2010 at 15:55, Jakob Homan <jgho...@gmail.com> wrote:
> >>>>>>>>>>>>>>>>> If HDFS is added to the test-patch queue right now we get
> >>>>>>>>>>>>>>>>> nothing but dozens of -1'ed patches.
> >>>>>>>>>>>>>>>> There aren't dozens of patches being submitted currently. The -1
> >>>>>>>>>>>>>>>> isn't the important thing, it's the grunt work of actually running
> >>>>>>>>>>>>>>>> (and waiting) for the tests, test-patch, etc. that Hudson does so that
> >>>>>>>>>>>>>>>> the developer doesn't have to.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Fri, Dec 17, 2010 at 3:48 PM, Dhruba Borthakur <dhr...@gmail.com> wrote:
> >>>>>>>>>>>>>>>>> +1, thanks for doing this.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On Fri, Dec 17, 2010 at 3:19 PM, Jakob Homan <jgho...@gmail.com> wrote:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> So, with test-patch updated to show the failing tests, saving the
> >>>>>>>>>>>>>>>>>> developers the need to go and verify that the failed tests are all
> >>>>>>>>>>>>>>>>>> known, how do people feel about turning on test-patch again for HDFS
> >>>>>>>>>>>>>>>>>> and mapred? I think it'll help prevent any more tests from entering
> >>>>>>>>>>>>>>>>>> the "yeah, we know" category.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>>>>> jg
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On Wed, Nov 17, 2010 at 5:08 PM, Jakob Homan <jho...@yahoo-inc.com> wrote:
> >>>>>>>>>>>>>>>>>>> True, each patch would get a -1 and the failing tests would need to be
> >>>>>>>>>>>>>>>>>>> verified as those known bad (BTW, it would be great if Hudson could list
> >>>>>>>>>>>>>>>>>>> which tests failed in the message it posts to JIRA). But that's still quite
> >>>>>>>>>>>>>>>>>>> a bit less error-prone work than if the developer runs the tests and
> >>>>>>>>>>>>>>>>>>> test-patch themselves. Also, with 22 being cut, there are a lot of patches
> >>>>>>>>>>>>>>>>>>> up in the air and several developers are juggling multiple patches. The
> >>>>>>>>>>>>>>>>>>> more automation we can have, even if it's not perfect, will decrease errors
> >>>>>>>>>>>>>>>>>>> we may make.
> >>>>>>>>>>>>>>>>>>> -jg
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Nigel Daley wrote:
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> On Nov 17, 2010, at 3:11 PM, Jakob Homan wrote:
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> It's also ready to run on MapReduce and HDFS but we won't turn it on
> >>>>>>>>>>>>>>>>>>>>>> until these projects build and test cleanly. Looks like both these projects
> >>>>>>>>>>>>>>>>>>>>>> currently have test failures.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Assuming the projects are compiling and building, is there a reason to
> >>>>>>>>>>>>>>>>>>>>> not turn it on despite the test failures?
> >>>>>>>>>>>>>>>>>>>>> Hudson is invaluable to developers
> >>>>>>>>>>>>>>>>>>>>> who then don't have to run the tests and test-patch themselves. We didn't
> >>>>>>>>>>>>>>>>>>>>> turn Hudson off when it was working previously and there were known
> >>>>>>>>>>>>>>>>>>>>> failures. I think one of the reasons we have more failing tests now is the
> >>>>>>>>>>>>>>>>>>>>> higher cost of doing Hudson's work (not a great excuse I know). This is
> >>>>>>>>>>>>>>>>>>>>> particularly true now because several of the failing tests involve tests
> >>>>>>>>>>>>>>>>>>>>> timing out, making the whole testing regime even longer.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Every single patch would get a -1 and need investigation. Currently, that
> >>>>>>>>>>>>>>>>>>>> would be about 83 investigations between MR and HDFS issues that are in
> >>>>>>>>>>>>>>>>>>>> patch available state. Shouldn't we focus on getting these tests fixed or
> >>>>>>>>>>>>>>>>>>>> removed/? Also, I need to get MAPREDUCE-2172 fixed (applies to HDFS as
> >>>>>>>>>>>>>>>>>>>> well) before I turn this on.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>>>>>>>> Nige
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>>>> Connect to me at http://www.facebook.com/dhruba
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Todd Lipcon
> >>>>>>>> Software Engineer, Cloudera
> >>>>>>
> >>>>>> --
> >>>>>> Todd Lipcon
> >>>>>> Software Engineer, Cloudera
> >>>>
> >>>> --
> >>>> Todd Lipcon
> >>>> Software Engineer, Cloudera
> >>
> >> --
> >> Todd Lipcon
> >> Software Engineer, Cloudera

--
Todd Lipcon
Software Engineer, Cloudera