How about "harbinger" for a name :)

On Sunday, June 7, 2015, Sean Busbey <bus...@cloudera.com> wrote:

> Sorry for the resend. I figured this deserves a [DISCUSS] flag.
>
>
>
> On Sat, Jun 6, 2015 at 10:39 PM, Sean Busbey <bus...@cloudera.com> wrote:
>
> > Hi Folks!
> >
> > After working on test-patch with other folks for the last few months, I
> > think we've reached the point where we can make the fastest progress
> > towards the goal of a general-use pre-commit patch tester by spinning
> > things into a project focused on just that. I think we have a mature
> > enough code base and a sufficient fledgling community, so I'm going to
> > put together a TLP proposal.
> >
> > Thanks for the feedback thus far from use within Hadoop. I hope we can
> > continue to make things more useful.
> >
> > -Sean
> >
> > On Wed, Mar 11, 2015 at 5:16 PM, Sean Busbey <bus...@cloudera.com> wrote:
> >
> >> HBase's dev-support folder is where the scripts and support files live.
> >> We've only recently started adding anything to the maven builds that's
> >> specific to jenkins[1]; so far it's diagnostic stuff, but that's where
> >> I'd add in more if we ran into the same permissions problems y'all are
> >> having.
> >>
> >> There's also our precommit job itself, though it isn't large[2]. AFAIK,
> >> we don't properly back this up anywhere, we just notify each other of
> >> changes on a particular mail thread[3].
> >>
> >> [1]: https://github.com/apache/hbase/blob/master/pom.xml#L1687
> >> [2]: https://builds.apache.org/job/PreCommit-HBASE-Build/ (they're all
> >> red because I just finished fixing "mvn site" running out of permgen)
> >> [3]: http://s.apache.org/NT0
> >>
> >>
> >> On Wed, Mar 11, 2015 at 4:51 PM, Chris Nauroth <cnaur...@hortonworks.com> wrote:
> >>
> >>> Sure, thanks Sean!  Do we just look in the dev-support folder in the
> >>> HBase repo?  Is there any additional context we need to be aware of?
> >>>
> >>> Chris Nauroth
> >>> Hortonworks
> >>> http://hortonworks.com/
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On 3/11/15, 2:44 PM, "Sean Busbey" <bus...@cloudera.com> wrote:
> >>>
> >>> >+dev@hbase
> >>> >
> >>> >HBase has recently been cleaning up our precommit jenkins jobs to make
> >>> >them more robust. From what I can tell our stuff started off as an
> >>> >earlier version of what Hadoop uses for testing.
> >>> >
> >>> >Folks on either side open to an experiment of combining our precommit
> >>> >check tooling? In principle we should be looking for the same kinds of
> >>> >things.
> >>> >
> >>> >Naturally we'll still need different jenkins jobs to handle different
> >>> >resource needs and we'd need to figure out where stuff eventually
> >>> >lives, but that could come later.
> >>> >
> >>> >On Wed, Mar 11, 2015 at 4:34 PM, Chris Nauroth <cnaur...@hortonworks.com> wrote:
> >>> >
> >>> >> The only thing I'm aware of is the failOnError option:
> >>> >>
> >>> >> http://maven.apache.org/plugins/maven-clean-plugin/examples/ignoring-errors.html
> >>> >>
> >>> >> I prefer that we don't disable this, because ignoring different kinds
> >>> >> of failures could leave our build directories in an indeterminate
> >>> >> state.  For example, we could end up with an old class file on the
> >>> >> classpath for test runs that was supposedly deleted.
> >>> >>
> >>> >> I think it's worth exploring Eddy's suggestion to try simulating
> >>> >> failure by placing a file where the code expects to see a directory.
> >>> >> That might even let us enable some of these tests that are skipped on
> >>> >> Windows, because Windows allows access for the owner even after
> >>> >> permissions have been stripped.
> >>> >>
> >>> >> Chris Nauroth
> >>> >> Hortonworks
> >>> >> http://hortonworks.com/
> >>> >>
> >>> >>
> >>> >>
> >>> >>
> >>> >>
> >>> >>
> >>> >> On 3/11/15, 2:10 PM, "Colin McCabe" <cmcc...@alumni.cmu.edu> wrote:
> >>> >>
> >>> >> >Is there a maven plugin or setting we can use to simply remove
> >>> >> >directories that have no executable permissions on them?  Clearly we
> >>> >> >have the permission to do this from a technical point of view (since
> >>> >> >we created the directories as the jenkins user), it's simply that the
> >>> >> >code refuses to do it.
> >>> >> >
> >>> >> >Otherwise I guess we can just fix those tests...
> >>> >> >
> >>> >> >Colin
> >>> >> >
> >>> >> >On Tue, Mar 10, 2015 at 2:43 PM, Lei Xu <l...@cloudera.com> wrote:
> >>> >> >> Thanks a lot for looking into HDFS-7722, Chris.
> >>> >> >>
> >>> >> >> In HDFS-7722:
> >>> >> >> TestDataNodeVolumeFailureXXX tests reset data dir permissions in
> >>> >> >> TearDown().
> >>> >> >> TestDataNodeHotSwapVolumes reset permissions in a finally clause.
> >>> >> >>
> >>> >> >> Also I ran mvn test several times on my machine and all tests
> >>> >> >> passed.
> >>> >> >>
> >>> >> >> However, since in DiskChecker#checkDirAccess():
> >>> >> >>
> >>> >> >> private static void checkDirAccess(File dir) throws DiskErrorException {
> >>> >> >>   if (!dir.isDirectory()) {
> >>> >> >>     throw new DiskErrorException("Not a directory: "
> >>> >> >>                                  + dir.toString());
> >>> >> >>   }
> >>> >> >>
> >>> >> >>   checkAccessByFileMethods(dir);
> >>> >> >> }
> >>> >> >>
> >>> >> >> One potentially safer alternative is replacing the data dir with a
> >>> >> >> regular file to simulate disk failures.
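
A minimal sketch of the file-in-place-of-directory idea described above, not code from HDFS-7722. The class and helper names are hypothetical, and a plain IOException stands in for Hadoop's DiskErrorException so the example compiles on its own. It only mirrors the isDirectory() check quoted from DiskChecker#checkDirAccess() and assumes the data directory is empty when it is swapped out.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class SimulatedDiskFailure {
    /** Mirrors the isDirectory() check quoted above; IOException is a stand-in type. */
    static void checkDirAccess(File dir) throws IOException {
        if (!dir.isDirectory()) {
            throw new IOException("Not a directory: " + dir);
        }
    }

    /** Replaces an empty data directory with a regular file of the same name. */
    static void replaceDirWithFile(Path dataDir) throws IOException {
        Files.delete(dataDir);      // Files.delete only removes an empty directory
        Files.createFile(dataDir);
    }

    /** Undoes replaceDirWithFile so later cleanup sees an ordinary directory. */
    static void restoreDir(Path dataDir) throws IOException {
        Files.deleteIfExists(dataDir);
        Files.createDirectory(dataDir);
    }

    /** Returns true when the directory check rejects the path. */
    static boolean failsCheck(Path dataDir) {
        try {
            checkDirAccess(dataDir.toFile());
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("volume-test");
        Path dataDir = Files.createDirectory(root.resolve("data3"));
        try {
            replaceDirWithFile(dataDir);
            System.out.println("fails after replacement: " + failsCheck(dataDir));
        } finally {
            // No permission bits were changed, so cleanup is a plain delete.
            restoreDir(dataDir);
            Files.delete(dataDir);
            Files.delete(root);
        }
    }
}
```

Because no permission bits are touched, a leftover file from an interrupted run can still be deleted by a later clean step, which is the property the thread is after.
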
> >>> >> >>
> >>> >> >> On Tue, Mar 10, 2015 at 2:19 PM, Chris Nauroth <cnaur...@hortonworks.com> wrote:
> >>> >> >>> TestDataNodeHotSwapVolumes, TestDataNodeVolumeFailure,
> >>> >> >>> TestDataNodeVolumeFailureReporting, and
> >>> >> >>> TestDataNodeVolumeFailureToleration all remove executable permissions
> >>> >> >>> from directories like the one Colin mentioned to simulate disk
> >>> >> >>> failures at data nodes.  I reviewed the code for all of those, and
> >>> >> >>> they all appear to be doing the necessary work to restore executable
> >>> >> >>> permissions at the end of the test.  The only recent uncommitted
> >>> >> >>> patch I've seen that makes changes in these test suites is
> >>> >> >>> HDFS-7722.  That patch still looks fine though.  I don't know if
> >>> >> >>> there are other uncommitted patches that changed these test suites.
> >>> >> >>>
> >>> >> >>> I suppose it's also possible that the JUnit process unexpectedly died
> >>> >> >>> after removing executable permissions but before restoring them.
> >>> >> >>> That always would have been a weakness of these test suites,
> >>> >> >>> regardless of any recent changes.
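
A rough sketch of the remove-then-restore pattern discussed above, to make the weakness concrete. This is not the code of the Hadoop test suites; the class and method names are hypothetical. It uses java.io.File#setExecutable and a finally block, which covers exceptions thrown by the test body but not a JVM that is killed before the finally block runs.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class RestorePermissions {
    /**
     * Runs body with the owner's executable bit cleared on dir and restores
     * the bit afterwards, even when body throws.
     */
    static void withExecutableRemoved(File dir, Runnable body) {
        dir.setExecutable(false);
        try {
            body.run();
        } finally {
            // Skipped only if the process dies first, which is the gap
            // described in the message above.
            dir.setExecutable(true);
        }
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("data3").toFile();
        try {
            withExecutableRemoved(dir, () -> {
                throw new IllegalStateException("simulated test failure");
            });
        } catch (IllegalStateException expected) {
            // The body failed, but the finally block has already run.
        }
        System.out.println("executable after test: " + dir.canExecute());
        dir.delete();
    }
}
```
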
> >>> >> >>>
> >>> >> >>> Chris Nauroth
> >>> >> >>> Hortonworks
> >>> >> >>> http://hortonworks.com/
> >>> >> >>>
> >>> >> >>>
> >>> >> >>>
> >>> >> >>>
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> On 3/10/15, 1:47 PM, "Aaron T. Myers" <a...@cloudera.com> wrote:
> >>> >> >>>
> >>> >> >>>>Hey Colin,
> >>> >> >>>>
> >>> >> >>>>I asked Andrew Bayer, who works with Apache Infra, what's going on
> >>> >> >>>>with these boxes. He took a look and concluded that some perms are
> >>> >> >>>>being set in those directories by our unit tests which are
> >>> >> >>>>precluding those files from getting deleted. He's going to clean up
> >>> >> >>>>the boxes for us, but we should expect this to keep happening until
> >>> >> >>>>we can fix the test in question to properly clean up after itself.
> >>> >> >>>>
> >>> >> >>>>To help narrow down which commit it was that started this, Andrew
> >>> >> >>>>sent me this info:
> >>> >> >>>>
> >>> >> >>>>"/home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data3/
> >>> >> >>>>has 500 perms, so I'm guessing that's the problem. Been that way
> >>> >> >>>>since 9:32 UTC on March 5th."
> >>> >> >>>>
> >>> >> >>>>--
> >>> >> >>>>Aaron T. Myers
> >>> >> >>>>Software Engineer, Cloudera
> >>> >> >>>>
> >>> >> >>>>On Tue, Mar 10, 2015 at 1:24 PM, Colin P. McCabe <cmcc...@apache.org> wrote:
> >>> >> >>>>
> >>> >> >>>>> Hi all,
> >>> >> >>>>>
> >>> >> >>>>> A very quick (and not thorough) survey shows that I can't find any
> >>> >> >>>>> jenkins jobs that succeeded from the last 24 hours.  Most of them
> >>> >> >>>>> seem to be failing with some variant of this message:
> >>> >> >>>>>
> >>> >> >>>>> [ERROR] Failed to execute goal
> >>> >> >>>>> org.apache.maven.plugins:maven-clean-plugin:2.5:clean (default-clean)
> >>> >> >>>>> on project hadoop-hdfs: Failed to clean project: Failed to delete
> >>> >> >>>>> /home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data3
> >>> >> >>>>> -> [Help 1]
> >>> >> >>>>>
> >>> >> >>>>> Any ideas how this happened?  Bad disk, unit test setting wrong
> >>> >> >>>>> permissions?
> >>> >> >>>>>
> >>> >> >>>>> Colin
> >>> >> >>>>>
> >>> >> >>>
> >>> >> >>
> >>> >> >>
> >>> >> >>
> >>> >> >> --
> >>> >> >> Lei (Eddy) Xu
> >>> >> >> Software Engineer, Cloudera
> >>> >>
> >>> >>
> >>> >
> >>> >
> >>> >--
> >>> >Sean
> >>>
> >>>
> >>
> >>
> >> --
> >> Sean
> >>
> >
> >
> >
> > --
> > Sean
> >
>
>
>
> --
> Sean
>


-- 
// Jonathan Hsieh (shay)
// HBase Tech Lead, Software Engineer, Cloudera
// j...@cloudera.com // @jmhsieh
