Sorry for the resend. I figured this deserves a [DISCUSS] flag.


On Sat, Jun 6, 2015 at 10:39 PM, Sean Busbey <bus...@cloudera.com> wrote:

> Hi Folks!
>
> After working on test-patch with other folks for the last few months, I
> think we've reached the point where we can make the fastest progress
> towards the goal of a general-use pre-commit patch tester by spinning
> things out into a project focused on just that. I think we have a mature
> enough code base and a sufficient (if fledgling) community, so I'm going
> to put together a TLP proposal.
>
> Thanks for the feedback thus far from use within Hadoop. I hope we can
> continue to make things more useful.
>
> -Sean
>
> On Wed, Mar 11, 2015 at 5:16 PM, Sean Busbey <bus...@cloudera.com> wrote:
>
>> HBase's dev-support folder is where the scripts and support files live.
>> We've only recently started adding anything to the maven builds that's
>> specific to jenkins[1]; so far it's diagnostic stuff, but that's where I'd
>> add in more if we ran into the same permissions problems y'all are having.
>>
>> There's also our precommit job itself, though it isn't large[2]. AFAIK,
>> we don't properly back this up anywhere; we just notify each other of
>> changes on a particular mail thread[3].
>>
>> [1]: https://github.com/apache/hbase/blob/master/pom.xml#L1687
>> [2]: https://builds.apache.org/job/PreCommit-HBASE-Build/ (they're all
>> red because I just finished fixing "mvn site" running out of permgen)
>> [3]: http://s.apache.org/NT0
>>
>>
>> On Wed, Mar 11, 2015 at 4:51 PM, Chris Nauroth <cnaur...@hortonworks.com>
>> wrote:
>>
>>> Sure, thanks Sean!  Do we just look in the dev-support folder in the HBase
>>> repo?  Is there any additional context we need to be aware of?
>>>
>>> Chris Nauroth
>>> Hortonworks
>>> http://hortonworks.com/
>>>
>>>
>>>
>>>
>>>
>>>
>>> On 3/11/15, 2:44 PM, "Sean Busbey" <bus...@cloudera.com> wrote:
>>>
>>> >+dev@hbase
>>> >
>>> >HBase has recently been cleaning up our precommit jenkins jobs to make
>>> >them more robust. From what I can tell our stuff started off as an
>>> >earlier version of what Hadoop uses for testing.
>>> >
>>> >Are folks on either side open to an experiment of combining our precommit
>>> >check tooling? In principle we should be looking for the same kinds of
>>> >things.
>>> >
>>> >Naturally we'll still need different jenkins jobs to handle different
>>> >resource needs and we'd need to figure out where stuff eventually lives,
>>> >but that could come later.
>>> >
>>> >On Wed, Mar 11, 2015 at 4:34 PM, Chris Nauroth <cnaur...@hortonworks.com>
>>> >wrote:
>>> >
>>> >> The only thing I'm aware of is the failOnError option:
>>> >>
>>> >>
>>> >>
>>> >> http://maven.apache.org/plugins/maven-clean-plugin/examples/ignoring-errors.html
>>> >>
>>> >>
>>> >> I prefer that we don't disable this, because ignoring different kinds of
>>> >> failures could leave our build directories in an indeterminate state. For
>>> >> example, we could end up with an old class file on the classpath for test
>>> >> runs that was supposedly deleted.
>>> >>
>>> >> I think it's worth exploring Eddy's suggestion to try simulating failure
>>> >> by placing a file where the code expects to see a directory.  That might
>>> >> even let us enable some of these tests that are skipped on Windows,
>>> >> because Windows allows access for the owner even after permissions have
>>> >> been stripped.
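>>> >>
>>> >> For illustration, here's a rough sketch of what that could look like
>>> >> (helper names are made up, not taken from any existing test or patch):
>>> >>
>>> >> import java.io.File;
>>> >> import org.apache.hadoop.fs.FileUtil;
>>> >>
>>> >> // Sketch: fake a failed volume by swapping the data directory for a
>>> >> // regular file, so DiskChecker's isDirectory() check trips.  No
>>> >> // permission bits are touched, so the same code should behave the
>>> >> // same on Windows and leaves nothing that "mvn clean" cannot delete.
>>> >> static void replaceDirWithFile(File dataDir) throws Exception {
>>> >>   if (!FileUtil.fullyDelete(dataDir)) {   // remove the real directory
>>> >>     throw new IllegalStateException("could not delete " + dataDir);
>>> >>   }
>>> >>   if (!dataDir.createNewFile()) {         // put a plain file in its place
>>> >>     throw new IllegalStateException("could not create " + dataDir);
>>> >>   }
>>> >> }
>>> >>
>>> >> static void restoreDir(File dataDir) throws Exception {
>>> >>   if (!dataDir.delete() || !dataDir.mkdirs()) {  // directory back in place
>>> >>     throw new IllegalStateException("could not restore " + dataDir);
>>> >>   }
>>> >> }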
>>> >>
>>> >> Chris Nauroth
>>> >> Hortonworks
>>> >> http://hortonworks.com/
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> On 3/11/15, 2:10 PM, "Colin McCabe" <cmcc...@alumni.cmu.edu> wrote:
>>> >>
>>> >> >Is there a maven plugin or setting we can use to simply remove
>>> >> >directories that have no executable permissions on them?  Clearly we
>>> >> >have the permission to do this from a technical point of view (since
>>> >> >we created the directories as the jenkins user), it's simply that the
>>> >> >code refuses to do it.
>>> >> >
>>> >> >Otherwise I guess we can just fix those tests...
>>> >> >
>>> >> >Colin
>>> >> >
>>> >> >On Tue, Mar 10, 2015 at 2:43 PM, Lei Xu <l...@cloudera.com> wrote:
>>> >> >> Thanks a lot for looking into HDFS-7722, Chris.
>>> >> >>
>>> >> >> In HDFS-7722:
>>> >> >> The TestDataNodeVolumeFailureXXX tests reset data dir permissions in
>>> >> >> TearDown(). TestDataNodeHotSwapVolumes resets permissions in a finally
>>> >> >> clause.
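>>> >> >>
>>> >> >> Roughly this shape (a sketch with made-up names, using plain
>>> >> >> java.io.File#setExecutable rather than whatever the tests actually
>>> >> >> call):
>>> >> >>
>>> >> >> import java.io.File;
>>> >> >>
>>> >> >> // Sketch of the reset-in-TearDown/finally pattern: drop the execute
>>> >> >> // bit to fake a dead volume, run the test body, then always put the
>>> >> >> // bit back so the workspace stays deletable.  A hard JVM kill between
>>> >> >> // the two calls would still leave the directory locked down.
>>> >> >> static void withFailedVolume(File dataDir, Runnable testBody) {
>>> >> >>   if (!dataDir.setExecutable(false)) {
>>> >> >>     throw new IllegalStateException("could not drop execute bit on " + dataDir);
>>> >> >>   }
>>> >> >>   try {
>>> >> >>     testBody.run();                      // exercise volume-failure handling
>>> >> >>   } finally {
>>> >> >>     if (!dataDir.setExecutable(true)) {  // restore so cleanup can delete it
>>> >> >>       System.err.println("WARNING: could not restore perms on " + dataDir);
>>> >> >>     }
>>> >> >>   }
>>> >> >> }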
>>> >> >>
>>> >> >> Also I ran mvn test several times on my machine and all tests passed.
>>> >> >>
>>> >> >> However, since DiskChecker#checkDirAccess() looks like this:
>>> >> >>
>>> >> >> private static void checkDirAccess(File dir) throws DiskErrorException {
>>> >> >>   if (!dir.isDirectory()) {
>>> >> >>     throw new DiskErrorException("Not a directory: "
>>> >> >>                                  + dir.toString());
>>> >> >>   }
>>> >> >>
>>> >> >>   checkAccessByFileMethods(dir);
>>> >> >> }
>>> >> >>
>>> >> >> one potentially safer alternative is replacing the data dir with a
>>> >> >> regular file to simulate disk failures.
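>>> >> >>
>>> >> >> Something like this (sketch only; I'm assuming checkDir(File) is the
>>> >> >> public wrapper around the checkDirAccess() shown above):
>>> >> >>
>>> >> >> import java.io.File;
>>> >> >> import org.apache.hadoop.util.DiskChecker;
>>> >> >> import org.apache.hadoop.util.DiskChecker.DiskErrorException;
>>> >> >>
>>> >> >> // Sketch: once a regular file sits where the data dir used to be, the
>>> >> >> // disk check throws, which is the "failed volume" signal the tests
>>> >> >> // want to provoke, without touching any permission bits.
>>> >> >> static boolean looksLikeFailedVolume(File dataDirNowAFile) {
>>> >> >>   try {
>>> >> >>     DiskChecker.checkDir(dataDirNowAFile);
>>> >> >>     return false;
>>> >> >>   } catch (DiskErrorException expected) {
>>> >> >>     return true;   // volume would be reported as failed
>>> >> >>   }
>>> >> >> }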
>>> >> >>
>>> >> >>> On Tue, Mar 10, 2015 at 2:19 PM, Chris Nauroth
>>> >> >>> <cnaur...@hortonworks.com> wrote:
>>> >> >>> TestDataNodeHotSwapVolumes, TestDataNodeVolumeFailure,
>>> >> >>> TestDataNodeVolumeFailureReporting, and
>>> >> >>> TestDataNodeVolumeFailureToleration all remove executable permissions
>>> >> >>> from directories like the one Colin mentioned to simulate disk
>>> >> >>> failures at data nodes.  I reviewed the code for all of those, and
>>> >> >>> they all appear to be doing the necessary work to restore executable
>>> >> >>> permissions at the end of the test.  The only recent uncommitted patch
>>> >> >>> I've seen that makes changes in these test suites is HDFS-7722.  That
>>> >> >>> patch still looks fine though.  I don't know if there are other
>>> >> >>> uncommitted patches that changed these test suites.
>>> >> >>>
>>> >> >>> I suppose it's also possible that the JUnit process unexpectedly died
>>> >> >>> after removing executable permissions but before restoring them.  That
>>> >> >>> always would have been a weakness of these test suites, regardless of
>>> >> >>> any recent changes.
>>> >> >>>
>>> >> >>> Chris Nauroth
>>> >> >>> Hortonworks
>>> >> >>> http://hortonworks.com/
>>> >> >>>
>>> >> >>>
>>> >> >>>
>>> >> >>>
>>> >> >>>
>>> >> >>>
>>> >> >>> On 3/10/15, 1:47 PM, "Aaron T. Myers" <a...@cloudera.com> wrote:
>>> >> >>>
>>> >> >>>>Hey Colin,
>>> >> >>>>
>>> >> >>>>I asked Andrew Bayer, who works with Apache Infra, what's going on
>>> >> >>>>with these boxes. He took a look and concluded that some perms are
>>> >> >>>>being set in those directories by our unit tests which are precluding
>>> >> >>>>those files from getting deleted. He's going to clean up the boxes for
>>> >> >>>>us, but we should expect this to keep happening until we can fix the
>>> >> >>>>test in question to properly clean up after itself.
>>> >> >>>>
>>> >> >>>>To help narrow down which commit it was that started this, Andrew
>>> >> >>>>sent me this info:
>>> >> >>>>
>>> >> >>>>"/home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-
>>> >>
>>>
>>> >>>>>>Build/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data3
>>> >>>>>>/
>>> >> >>>>has
>>> >> >>>>500 perms, so I'm guessing that's the problem. Been that way since
>>> >>9:32
>>> >> >>>>UTC
>>> >> >>>>on March 5th."
>>> >> >>>>
>>> >> >>>>--
>>> >> >>>>Aaron T. Myers
>>> >> >>>>Software Engineer, Cloudera
>>> >> >>>>
>>> >> >>>>On Tue, Mar 10, 2015 at 1:24 PM, Colin P. McCabe <cmcc...@apache.org>
>>> >> >>>>wrote:
>>> >> >>>>
>>> >> >>>>> Hi all,
>>> >> >>>>>
>>> >> >>>>> A very quick (and not thorough) survey shows that I can't find any
>>> >> >>>>> jenkins jobs that succeeded from the last 24 hours.  Most of them
>>> >> >>>>> seem to be failing with some variant of this message:
>>> >> >>>>>
>>> >> >>>>> [ERROR] Failed to execute goal
>>> >> >>>>> org.apache.maven.plugins:maven-clean-plugin:2.5:clean (default-clean)
>>> >> >>>>> on project hadoop-hdfs: Failed to clean project: Failed to delete
>>> >> >>>>>
>>> >> >>>>> /home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data3
>>> >> >>>>> -> [Help 1]
>>> >> >>>>>
>>> >> >>>>> Any ideas how this happened?  Bad disk, unit test setting wrong
>>> >> >>>>> permissions?
>>> >> >>>>>
>>> >> >>>>> Colin
>>> >> >>>>>
>>> >> >>>
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >> --
>>> >> >> Lei (Eddy) Xu
>>> >> >> Software Engineer, Cloudera
>>> >>
>>> >>
>>> >
>>> >
>>> >--
>>> >Sean
>>>
>>>
>>
>>
>> --
>> Sean
>>
>
>
>
> --
> Sean
>



-- 
Sean
