On Wed, Jan 26, 2011 at 10:05 AM, Nigel Daley <nda...@mac.com> wrote:

> raid (contrib) test hanging: TestBlockFixer
>
> I forced 2 thread dumps.  Both hung in the same place.  Filed
> https://issues.apache.org/jira/browse/MAPREDUCE-2283  This is a blocker
> for turning on MR precommit.
>

Since this is contrib, I'd like to suggest just disabling this test
temporarily. We can re-enable it once it's fixed.

Not having MR pre-commit working has been pretty painful.

-Todd


> On Jan 25, 2011, at 11:19 PM, Nigel Daley wrote:
>
> > Started another trial run of MR precommit testing:
> >
> > https://hudson.apache.org/hudson/view/G-L/view/Hadoop/job/PreCommit-MAPREDUCE-Build/17/
> >
> > Let's see if 17th time is a charm...
> >
> > Nige
> >
> > On Jan 7, 2011, at 5:14 PM, Todd Lipcon wrote:
> >
> >> On Fri, Jan 7, 2011 at 2:11 PM, Nigel Daley <nda...@mac.com> wrote:
> >>
> >>> Hrm, the MR precommit test I'm running has hung (been running for 14 hours
> >>> so far).  FWIW, 2 HDFS precommit tests are hung too.  I suspect it could be
> >>> the NFS mounts on the machines.  I forced a thread dump which you can see in
> >>> the console:
> >>>
> >>> https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/10/console
> >>>
> >>>
> >> Strange, haven't seen a hang like that before in handleConnectionFailure.
> >> It should retry for 15 minutes max in that loop.
> >>
> >>
> >>> Any other ideas why these might be hanging?
> >>>
> >>>
> >> There is an HDFS bug right now that can cause hangs on some tests -
> >> HDFS-1529 - would appreciate if someone can take a look. But I don't think
> >> this is responsible for the MR hang above.
> >>
> >> -Todd
> >>
> >>
> >>> On Jan 5, 2011, at 5:42 PM, Todd Lipcon wrote:
> >>>
> >>>> On Wed, Jan 5, 2011 at 4:39 PM, Nigel Daley <nda...@mac.com> wrote:
> >>>>
> >>>>> Thanks for looking into it Todd.  Let's first see if you think it can be
> >>>>> fixed quickly.  Let me know.
> >>>>>
> >>>>>
> >>>> No problem, it wasn't too bad after all. Patch up on HADOOP-7087 which
> >>>> fixes this test timeout for me.
> >>>>
> >>>> -Todd
> >>>>
> >>>>
> >>>>> On Jan 5, 2011, at 4:33 PM, Todd Lipcon wrote:
> >>>>>
> >>>>>> On Wed, Jan 5, 2011 at 4:19 PM, Nigel Daley <nda...@mac.com> wrote:
> >>>>>>
> >>>>>>> Todd, would love to get
> >>>>>>> https://issues.apache.org/jira/browse/MAPREDUCE-2121 fixed first since
> >>>>>>> this is failing every night on trunk.
> >>>>>>>
> >>>>>>
> >>>>>> What if we disable that test, move that issue to 0.22 blocker, and then
> >>>>>> enable the test-patch? I'll also look into that one today, but if it's
> >>>>>> something that will take a while to fix, I don't think we should hold off
> >>>>>> the useful testing for all the other patches.
> >>>>>>
> >>>>>> -Todd
> >>>>>>
> >>>>>> On Jan 5, 2011, at 2:45 PM, Todd Lipcon wrote:
> >>>>>>>
> >>>>>>>> Hi Nigel,
> >>>>>>>>
> >>>>>>>> MAPREDUCE-2172 has been fixed for a while. Are there any other particular
> >>>>>>>> JIRAs you think need to be fixed before the MR test-patch queue gets
> >>>>>>>> enabled? I have a lot of outstanding patches and doing all the test-patch
> >>>>>>>> turnaround manually on 3 different boxes is a real headache.
> >>>>>>>>
> >>>>>>>> Thanks
> >>>>>>>> -Todd
> >>>>>>>>
> >>>>>>>> On Tue, Dec 21, 2010 at 1:33 PM, Nigel Daley <nda...@mac.com> wrote:
> >>>>>>>>
> >>>>>>>>> Ok, HDFS is now enabled.  You'll see a stream of updates shortly on the
> >>>>>>>>> ~30 Patch Available HDFS issues.
> >>>>>>>>>
> >>>>>>>>> Nige
> >>>>>>>>>
> >>>>>>>>> On Dec 20, 2010, at 12:42 PM, Jakob Homan wrote:
> >>>>>>>>>
> >>>>>>>>>> I committed HDFS-1511 this morning.  We should be good to go.  I can
> >>>>>>>>>> haz snooty robot butler?
> >>>>>>>>>>
> >>>>>>>>>> On Fri, Dec 17, 2010 at 8:31 PM, Konstantin Boudnik <c...@apache.org> wrote:
> >>>>>>>>>>> Thanks Jakob. I am wasted already but I can do it on Sun, I think,
> >>>>>>>>>>> unless it is done earlier.
> >>>>>>>>>>> --
> >>>>>>>>>>> Take care,
> >>>>>>>>>>> Konstantin (Cos) Boudnik
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Fri, Dec 17, 2010 at 19:41, Jakob Homan <jgho...@gmail.com> wrote:
> >>>>>>>>>>>> Ok.  I'll get a patch out for 1511 tomorrow, unless someone wants to
> >>>>>>>>>>>> whip one up tonight.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Fri, Dec 17, 2010 at 7:22 PM, Nigel Daley <nda...@mac.com> wrote:
> >>>>>>>>>>>>> I agree with Cos on fixing HDFS-1511 first. Once that is done I'll
> >>>>>>>>>>>>> enable hdfs patch testing.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>> Nige
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Sent from my iPhone4
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Dec 17, 2010, at 7:01 PM, Konstantin Boudnik <c...@apache.org> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> One more issue needs to be addressed before test-patch is turned on
> >>>>>>>>>>>>>> for HDFS:
> >>>>>>>>>>>>>> https://issues.apache.org/jira/browse/HDFS-1511
> >>>>>>>>>>>>>> --
> >>>>>>>>>>>>>> Take care,
> >>>>>>>>>>>>>> Konstantin (Cos) Boudnik
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Fri, Dec 17, 2010 at 16:17, Konstantin Boudnik <c...@apache.org> wrote:
> >>>>>>>>>>>>>>> Because of these 4 faulty cases every patch will be -1'ed, so a
> >>>>>>>>>>>>>>> patch author will still have to look at it and comment on why
> >>>>>>>>>>>>>>> this particular -1 isn't valid. Less work, perhaps, but messier
> >>>>>>>>>>>>>>> IMO. I'm not blocking it - I just feel like there's a better way.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>> Take care,
> >>>>>>>>>>>>>>> Konstantin (Cos) Boudnik
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Fri, Dec 17, 2010 at 15:55, Jakob Homan <jgho...@gmail.com> wrote:
> >>>>>>>>>>>>>>>>> If HDFS is added to the test-patch queue right now we get
> >>>>>>>>>>>>>>>>> nothing but dozens of -1'ed patches.
> >>>>>>>>>>>>>>>> There aren't dozens of patches being submitted currently.  The -1
> >>>>>>>>>>>>>>>> isn't the important thing, it's the grunt work of actually running
> >>>>>>>>>>>>>>>> (and waiting) for the tests, test-patch, etc. that Hudson does so that
> >>>>>>>>>>>>>>>> the developer doesn't have to.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Fri, Dec 17, 2010 at 3:48 PM, Dhruba Borthakur <dhr...@gmail.com> wrote:
> >>>>>>>>>>>>>>>>> +1, thanks for doing this.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On Fri, Dec 17, 2010 at 3:19 PM, Jakob Homan <jgho...@gmail.com> wrote:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> So, with test-patch updated to show the failing tests, saving the
> >>>>>>>>>>>>>>>>>> developers the need to go and verify that the failed tests are all
> >>>>>>>>>>>>>>>>>> known, how do people feel about turning on test-patch again for HDFS
> >>>>>>>>>>>>>>>>>> and mapred?  I think it'll help prevent any more tests from entering
> >>>>>>>>>>>>>>>>>> the "yeah, we know" category.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>>>>> jg
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On Wed, Nov 17, 2010 at 5:08 PM, Jakob Homan <jho...@yahoo-inc.com> wrote:
> >>>>>>>>>>>>>>>>>>> True, each patch would get a -1 and the failing tests would need to be
> >>>>>>>>>>>>>>>>>>> verified as those known bad (BTW, it would be great if Hudson could list
> >>>>>>>>>>>>>>>>>>> which tests failed in the message it posts to JIRA).  But that's still
> >>>>>>>>>>>>>>>>>>> quite a bit less error-prone work than if the developer runs the tests and
> >>>>>>>>>>>>>>>>>>> test-patch themselves.  Also, with 22 being cut, there are a lot of
> >>>>>>>>>>>>>>>>>>> patches up in the air and several developers are juggling multiple
> >>>>>>>>>>>>>>>>>>> patches.  The more automation we can have, even if it's not perfect,
> >>>>>>>>>>>>>>>>>>> will decrease errors we may make.
> >>>>>>>>>>>>>>>>>>> -jg
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Nigel Daley wrote:
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> On Nov 17, 2010, at 3:11 PM, Jakob Homan wrote:
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> It's also ready to run on MapReduce and HDFS but we won't turn it on
> >>>>>>>>>>>>>>>>>>>>>> until these projects build and test cleanly.  Looks like both these
> >>>>>>>>>>>>>>>>>>>>>> projects currently have test failures.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Assuming the projects are compiling and building, is there a reason to
> >>>>>>>>>>>>>>>>>>>>> not turn it on despite the test failures? Hudson is invaluable to
> >>>>>>>>>>>>>>>>>>>>> developers who then don't have to run the tests and test-patch
> >>>>>>>>>>>>>>>>>>>>> themselves.  We didn't turn Hudson off when it was working previously
> >>>>>>>>>>>>>>>>>>>>> and there were known failures.  I think one of the reasons we have more
> >>>>>>>>>>>>>>>>>>>>> failing tests now is the higher cost of doing Hudson's work (not a great
> >>>>>>>>>>>>>>>>>>>>> excuse I know).  This is particularly true now because several of the
> >>>>>>>>>>>>>>>>>>>>> failing tests involve tests timing out, making the whole testing regime
> >>>>>>>>>>>>>>>>>>>>> even longer.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Every single patch would get a -1 and need investigation.  Currently,
> >>>>>>>>>>>>>>>>>>>> that would be about 83 investigations between MR and HDFS issues that
> >>>>>>>>>>>>>>>>>>>> are in patch available state.  Shouldn't we focus on getting these
> >>>>>>>>>>>>>>>>>>>> tests fixed or removed?  Also, I need to get MAPREDUCE-2172 fixed
> >>>>>>>>>>>>>>>>>>>> (applies to HDFS as well) before I turn this on.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>>>>>>>> Nige
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>>>> Connect to me at http://www.facebook.com/dhruba
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Todd Lipcon
> >>>>>>>> Software Engineer, Cloudera
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> Todd Lipcon
> >>>>>> Software Engineer, Cloudera
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>> --
> >>>> Todd Lipcon
> >>>> Software Engineer, Cloudera
> >>>
> >>>
> >>
> >>
> >> --
> >> Todd Lipcon
> >> Software Engineer, Cloudera
> >
>
>


-- 
Todd Lipcon
Software Engineer, Cloudera
