How about "harbinger" for a name :) On Sunday, June 7, 2015, Sean Busbey <bus...@cloudera.com> wrote:
> Sorry for the resend. I figured this deserves a [DISCUSS] flag.
>
> On Sat, Jun 6, 2015 at 10:39 PM, Sean Busbey <bus...@cloudera.com> wrote:
>
> > Hi Folks!
> >
> > After working on test-patch with other folks for the last few months, I
> > think we've reached the point where we can make the fastest progress
> > towards the goal of a general-use pre-commit patch tester by spinning
> > things into a project focused on just that. I think we have a mature
> > enough code base and a sufficient fledgling community, so I'm going to
> > put together a TLP proposal.
> >
> > Thanks for the feedback thus far from use within Hadoop. I hope we can
> > continue to make things more useful.
> >
> > -Sean
> >
> > On Wed, Mar 11, 2015 at 5:16 PM, Sean Busbey <bus...@cloudera.com> wrote:
> >
> >> HBase's dev-support folder is where the scripts and support files live.
> >> We've only recently started adding anything to the maven builds that's
> >> specific to jenkins[1]; so far it's diagnostic stuff, but that's where
> >> I'd add in more if we ran into the same permissions problems y'all are
> >> having.
> >>
> >> There's also our precommit job itself, though it isn't large[2]. AFAIK,
> >> we don't properly back this up anywhere; we just notify each other of
> >> changes on a particular mail thread[3].
> >>
> >> [1]: https://github.com/apache/hbase/blob/master/pom.xml#L1687
> >> [2]: https://builds.apache.org/job/PreCommit-HBASE-Build/ (they're all
> >> red because I just finished fixing "mvn site" running out of permgen)
> >> [3]: http://s.apache.org/NT0
> >>
> >> On Wed, Mar 11, 2015 at 4:51 PM, Chris Nauroth <cnaur...@hortonworks.com> wrote:
> >>
> >>> Sure, thanks Sean! Do we just look in the dev-support folder in the
> >>> HBase repo? Is there any additional context we need to be aware of?
> >>>
> >>> Chris Nauroth
> >>> Hortonworks
> >>> http://hortonworks.com/
> >>>
> >>> On 3/11/15, 2:44 PM, "Sean Busbey" <bus...@cloudera.com> wrote:
> >>>
> >>> >+dev@hbase
> >>> >
> >>> >HBase has recently been cleaning up our precommit jenkins jobs to make
> >>> >them more robust. From what I can tell, our stuff started off as an
> >>> >earlier version of what Hadoop uses for testing.
> >>> >
> >>> >Folks on either side open to an experiment of combining our precommit
> >>> >check tooling? In principle we should be looking for the same kinds of
> >>> >things.
> >>> >
> >>> >Naturally we'll still need different jenkins jobs to handle different
> >>> >resource needs, and we'd need to figure out where stuff eventually
> >>> >lives, but that could come later.
> >>> >
> >>> >On Wed, Mar 11, 2015 at 4:34 PM, Chris Nauroth <cnaur...@hortonworks.com> wrote:
> >>> >
> >>> >> The only thing I'm aware of is the failOnError option:
> >>> >>
> >>> >> http://maven.apache.org/plugins/maven-clean-plugin/examples/ignoring-errors.html
> >>> >>
> >>> >> I prefer that we don't disable this, because ignoring different kinds
> >>> >> of failures could leave our build directories in an indeterminate
> >>> >> state. For example, we could end up with an old class file on the
> >>> >> classpath for test runs that was supposedly deleted.
> >>> >>
> >>> >> I think it's worth exploring Eddy's suggestion to try simulating
> >>> >> failure by placing a file where the code expects to see a directory.
> >>> >> That might even let us enable some of these tests that are skipped on
> >>> >> Windows, because Windows allows access for the owner even after
> >>> >> permissions have been stripped.
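For reference, the knob Chris mentions is the clean plugin's failOnError parameter, per the example page linked above. A minimal pom fragment, shown only to make the trade-off concrete (the thread's conclusion is to leave it at its default of true):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-clean-plugin</artifactId>
  <configuration>
    <!-- Defaults to true, which is what fails the HDFS builds below.
         Setting it to false would hide, not fix, the undeletable
         directory, and risks stale class files surviving "clean". -->
    <failOnError>false</failOnError>
  </configuration>
</plugin>
```

The same setting can be toggled for a single run with `-Dmaven.clean.failOnError=false` on the command line.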
> >>> >>
> >>> >> Chris Nauroth
> >>> >> Hortonworks
> >>> >> http://hortonworks.com/
> >>> >>
> >>> >> On 3/11/15, 2:10 PM, "Colin McCabe" <cmcc...@alumni.cmu.edu> wrote:
> >>> >>
> >>> >> >Is there a maven plugin or setting we can use to simply remove
> >>> >> >directories that have no executable permissions on them? Clearly we
> >>> >> >have the permission to do this from a technical point of view (since
> >>> >> >we created the directories as the jenkins user); it's simply that the
> >>> >> >code refuses to do it.
> >>> >> >
> >>> >> >Otherwise I guess we can just fix those tests...
> >>> >> >
> >>> >> >Colin
> >>> >> >
> >>> >> >On Tue, Mar 10, 2015 at 2:43 PM, Lei Xu <l...@cloudera.com> wrote:
> >>> >> >> Thanks a lot for looking into HDFS-7722, Chris.
> >>> >> >>
> >>> >> >> In HDFS-7722:
> >>> >> >> The TestDataNodeVolumeFailureXXX tests reset data dir permissions
> >>> >> >> in tearDown(), and TestDataNodeHotSwapVolumes resets permissions
> >>> >> >> in a finally clause.
> >>> >> >>
> >>> >> >> Also, I ran mvn test several times on my machine and all tests
> >>> >> >> passed.
> >>> >> >>
> >>> >> >> However, since DiskChecker#checkDirAccess() is:
> >>> >> >>
> >>> >> >>   private static void checkDirAccess(File dir) throws DiskErrorException {
> >>> >> >>     if (!dir.isDirectory()) {
> >>> >> >>       throw new DiskErrorException("Not a directory: " + dir.toString());
> >>> >> >>     }
> >>> >> >>
> >>> >> >>     checkAccessByFileMethods(dir);
> >>> >> >>   }
> >>> >> >>
> >>> >> >> one potentially safer alternative is replacing the data dir with a
> >>> >> >> regular file to simulate disk failures.
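Eddy's alternative can be sketched in isolation: because the isDirectory() guard quoted above rejects anything that is not a directory, swapping the data dir for a regular file trips the same failure path with no permission changes to restore afterwards. A minimal sketch, where `DiskFailureSim` is a hypothetical stand-in and IOException replaces Hadoop's DiskErrorException to keep it self-contained:

```java
import java.io.File;
import java.io.IOException;

// Hypothetical stand-in mirroring the first check in DiskChecker#checkDirAccess;
// not Hadoop code.
public class DiskFailureSim {

    static void checkDirAccess(File dir) throws IOException {
        if (!dir.isDirectory()) {
            // Same condition a plain file triggers in the real DiskChecker.
            throw new IOException("Not a directory: " + dir);
        }
        // (the real method continues with checkAccessByFileMethods(dir))
    }

    public static void main(String[] args) throws IOException {
        // A regular file standing where a data dir is expected.
        File dataDir = File.createTempFile("data3", null);
        try {
            checkDirAccess(dataDir);
            System.out.println("unexpected: check passed");
        } catch (IOException expected) {
            // No chmod involved, so nothing is left behind if the JVM dies here.
            System.out.println("simulated disk failure: " + expected.getMessage());
        } finally {
            dataDir.delete();
        }
    }
}
```

Unlike revoking execute permissions, a crashed test run leaves only an ordinary temp file behind, which "mvn clean" can delete.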
> >>> >> >>
> >>> >> >> On Tue, Mar 10, 2015 at 2:19 PM, Chris Nauroth <cnaur...@hortonworks.com> wrote:
> >>> >> >>> TestDataNodeHotSwapVolumes, TestDataNodeVolumeFailure,
> >>> >> >>> TestDataNodeVolumeFailureReporting, and
> >>> >> >>> TestDataNodeVolumeFailureToleration all remove executable
> >>> >> >>> permissions from directories like the one Colin mentioned to
> >>> >> >>> simulate disk failures at data nodes. I reviewed the code for all
> >>> >> >>> of those, and they all appear to be doing the necessary work to
> >>> >> >>> restore executable permissions at the end of the test. The only
> >>> >> >>> recent uncommitted patch I've seen that makes changes in these
> >>> >> >>> test suites is HDFS-7722. That patch still looks fine, though. I
> >>> >> >>> don't know if there are other uncommitted patches that changed
> >>> >> >>> these test suites.
> >>> >> >>>
> >>> >> >>> I suppose it's also possible that the JUnit process unexpectedly
> >>> >> >>> died after removing executable permissions but before restoring
> >>> >> >>> them. That always would have been a weakness of these test
> >>> >> >>> suites, regardless of any recent changes.
> >>> >> >>>
> >>> >> >>> Chris Nauroth
> >>> >> >>> Hortonworks
> >>> >> >>> http://hortonworks.com/
> >>> >> >>>
> >>> >> >>> On 3/10/15, 1:47 PM, "Aaron T. Myers" <a...@cloudera.com> wrote:
> >>> >> >>>
> >>> >> >>>> Hey Colin,
> >>> >> >>>>
> >>> >> >>>> I asked Andrew Bayer, who works with Apache Infra, what's going
> >>> >> >>>> on with these boxes. He took a look and concluded that some perms
> >>> >> >>>> are being set in those directories by our unit tests which are
> >>> >> >>>> precluding those files from getting deleted.
> >>> >> >>>> He's going to clean up the boxes for us, but we should expect
> >>> >> >>>> this to keep happening until we can fix the test in question to
> >>> >> >>>> properly clean up after itself.
> >>> >> >>>>
> >>> >> >>>> To help narrow down which commit it was that started this,
> >>> >> >>>> Andrew sent me this info:
> >>> >> >>>>
> >>> >> >>>> "/home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data3/
> >>> >> >>>> has 500 perms, so I'm guessing that's the problem. Been that way
> >>> >> >>>> since 9:32 UTC on March 5th."
> >>> >> >>>>
> >>> >> >>>> --
> >>> >> >>>> Aaron T. Myers
> >>> >> >>>> Software Engineer, Cloudera
> >>> >> >>>>
> >>> >> >>>> On Tue, Mar 10, 2015 at 1:24 PM, Colin P. McCabe <cmcc...@apache.org> wrote:
> >>> >> >>>>
> >>> >> >>>>> Hi all,
> >>> >> >>>>>
> >>> >> >>>>> A very quick (and not thorough) survey shows that I can't find
> >>> >> >>>>> any jenkins jobs that succeeded in the last 24 hours. Most of
> >>> >> >>>>> them seem to be failing with some variant of this message:
> >>> >> >>>>>
> >>> >> >>>>> [ERROR] Failed to execute goal
> >>> >> >>>>> org.apache.maven.plugins:maven-clean-plugin:2.5:clean (default-clean)
> >>> >> >>>>> on project hadoop-hdfs: Failed to clean project: Failed to delete
> >>> >> >>>>> /home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data3
> >>> >> >>>>> -> [Help 1]
> >>> >> >>>>>
> >>> >> >>>>> Any ideas how this happened? Bad disk, unit test setting wrong
> >>> >> >>>>> permissions?
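The pattern the thread converges on (restore permissions in a finally clause, as Lei describes for TestDataNodeHotSwapVolumes, so that a mid-test crash is the only way the workspace can be left undeletable) can be sketched generically. `VolumeFailurePattern` and `simulateFailedVolume` are hypothetical names for this sketch, not the actual Hadoop test code; note that dropping the execute bit may not work on Windows or when running as root:

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Generic sketch of the restore-permissions-in-finally pattern used by the
// volume-failure tests; not the actual Hadoop test code.
public class VolumeFailurePattern {

    static void simulateFailedVolume(File dataDir) throws IOException {
        // Revoke execute so a DiskChecker-style access check treats the
        // volume as failed (may return false on Windows).
        if (!dataDir.setExecutable(false)) {
            throw new IOException("could not drop execute bit on " + dataDir);
        }
        try {
            // ... exercise the failure-handling code under test here ...
        } finally {
            // Always restore, even if assertions above throw; otherwise
            // "mvn clean" on the Jenkins slave cannot delete the directory.
            dataDir.setExecutable(true);
        }
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("data3").toFile();
        simulateFailedVolume(dir);
        System.out.println("execute bit restored: " + dir.canExecute());
        dir.delete();
    }
}
```

Even this pattern has the weakness Chris points out: if the JVM dies inside the try block, the finally never runs, and the 500-perms directory survives to break the next build's clean phase.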
> >>> >> >>>>>
> >>> >> >>>>> Colin
> >>> >> >>
> >>> >> >> --
> >>> >> >> Lei (Eddy) Xu
> >>> >> >> Software Engineer, Cloudera
> >>> >
> >>> >--
> >>> >Sean
> >>
> >> --
> >> Sean
> >
> > --
> > Sean
>
> --
> Sean

--
// Jonathan Hsieh (shay)
// HBase Tech Lead, Software Engineer, Cloudera
// j...@cloudera.com
// @jmhsieh