On the move over to nightly test runs:

The 1.2 nightly had a successful build last night, after the branch-1 stabilization effort (HBASE-19204) and fixes for a few unit test failures. See build 150 at https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/branch-1.2/ The next build, 151, then failed because of a timed-out test. Need to dig in. Once a few more unit tests are cleaned up, branch-1.2 is probably ready for cutting a release.

1.3 has a few flakies. The last build failed because of:

  Test Result (1 failure / ±0)
  org.apache.hadoop.hbase.regionserver.TestEncryptionKeyRotation.testCFKeyRotation

Just a little effort should turn 1.3 green.

I was going to disable the 1.4 job, https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.4/, in favor of the 1.4 nightly, https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/branch-1.4/, if ok w/ you Andrew Purtell... And move over the branch-1, branch-2, and master too.

Thanks,
S

On Wed, Nov 29, 2017 at 8:06 AM, Stack <st...@duboce.net> wrote:

> Example of the new nice reporting:
> https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/branch-1.2/
> S
>
> On Wed, Nov 29, 2017 at 8:06 AM, Stack <st...@duboce.net> wrote:
>
>> Note that I have disabled the HBase-1.2-JDK7, HBase-1.2-JDK8,
>> HBase-1.3-JDK7, and HBase-1.3-JDK8 jobs. They have been broken for a good
>> while now. In their place, refer to an ongoing Sean "Nightly" project, an
>> effort he has been at for a while. It does more checking with pretty
>> reports that will help figuring general stability over time. See under
>> https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/
>> See the nightly builds for 1.2 and 1.3. They have some teething issues
>> still but are almost there. See the 1.2 build from last night. In recent
>> days, the 1.2 branch went from trash-can fire to stable. See how all tests
>> passed in the last build but then we failed generating the src bundle on
>> the end (this is what I mean by 'teething' issue). Will work on fixing this
>> last step and moving over 1.4, etc., in the next few days.
>>
>> FYI,
>> St.Ack
>>
>>
>> On Tue, Nov 7, 2017 at 7:45 AM, Stack <st...@duboce.net> wrote:
>>
>>> On Tue, Nov 7, 2017 at 6:10 AM, Sean Busbey <bus...@apache.org> wrote:
>>>
>>>> > Should I be able to see the machine dir when I look at nightlies output?
>>>> > (Was trying to see what else is running).
>>>>
>>>> Ah.
>>>> we don't have the same machine sampling on nightly as we do in
>>>> precommit. I am 80% on a patch for HBASE-19189 (run test ad-hoc
>>>> repeatedly) that includes pulling that information gathering into a
>>>> place where we could also use it in nightly.
>>>>
>>> Sweet.
>>>
>>>> Did we ever figure out how many cores we expect our tests to need? It
>>>> looks like the Hadoop nodes have 8 cores. (with 2 executors that means
>>>> 4 is our fair share)
>>>>
>>> At the end of the thread inquiry I suggested that we don't use enough
>>> cores, that we could up our fork counts and tests would complete in less
>>> time. I wanted to experiment some w/ high fork counts -- 16 or so -- to see
>>> if concurrent running brought on more failure.
>>>
>>> St.Ack
>>>
>>>> On Tue, Nov 7, 2017 at 8:05 AM, Sean Busbey <bus...@apache.org> wrote:
>>>> > surefire results get zipped up (we were filling the jenkins hosts with
>>>> > old test logs previously) and stored in a file called "test_logs.zip"
>>>> > for each jvm run. So if that happened in the jdk7 run for branch-1.2,
>>>> > it'd be in artifacts -> output-jdk7 -> test_logs.zip.
>>>> >
>>>> > I don't know if the archival process grabs things from surefire that
>>>> > aren't the surefire XML files, but we can update it to do so if it
>>>> > doesn't.
>>>> >
>>>> > On Mon, Nov 6, 2017 at 11:39 PM, Stack <st...@duboce.net> wrote:
>>>> >> I see this in the 1.2 nightly just when it gives up the ghost....
>>>> >>
>>>> >> [WARNING] Corrupted STDOUT by directly writing to native stream in
>>>> >> forked JVM 2. See FAQ web page and the dump file
>>>> >> /testptch/hbase/hbase-server/target/surefire-reports/2017-11-06T20-11-30_219-jvmRun2.dumpstream
>>>> >>
>>>> >> .. but the pointed to dumpstream doesn't seem to be around post build.
>>>> >> Am I looking in the wrong place?
>>>> >>
>>>> >> Thanks,
>>>> >>
>>>> >> S
>>>> >>
>>>> >> On Mon, Nov 6, 2017 at 8:20 PM, Stack <st...@duboce.net> wrote:
>>>> >>
>>>> >>> On Mon, Nov 6, 2017 at 8:35 AM, Sean Busbey <sean.bus...@gmail.com> wrote:
>>>> >>>
>>>> >>>> Given that all of the old post-commit tests have been posting that
>>>> >>>> they're failing to JIRAs for what looks like a month, is there any
>>>> >>>> reason not to switch to the new tests that also say they're failing?
>>>> >>>>
>>>> >>> No reason.
>>>> >>>
>>>> >>>> The reason HBASE-18467 has been sitting on hold this whole time has
>>>> >>>> been because the new nightly branch tests keep complaining about
>>>> >>>> failures.
>>>> >>>>
>>>> >>> Looking just now, it looks like killed-off test runs.
>>>> >>>
>>>> >>> +1 on move to nightlies.
>>>> >>>
>>>> >>> Can I help?
>>>> >>>
>>>> >>> Should I be able to see the machine dir when I look at nightlies output?
>>>> >>> (Was trying to see what else is running).
>>>> >>>
>>>> >>> Thanks Sean,
>>>> >>> St.Ack
>>>> >>>
>>>> >>>> On Mon, Nov 6, 2017 at 10:21 AM, Sean Busbey <sean.bus...@gmail.com> wrote:
>>>> >>>> > It looks like old tests branch-1.2 and branch-1.3 are failing with
>>>> >>>> > some maven enforcer problem that we thought we had fixed a few times
>>>> >>>> > before. It's probably fixable by changing the version of maven they
>>>> >>>> > use, but I'd much rather any test effort go into the last mile of
>>>> >>>> > getting our new nightly tests working.
>>>> >>>> >
>>>> >>>> > I'll start picking this up as soon as I close out HBASE-18784.
>>>> >>>> >
>>>> >>>> > Please consider branch-1.2 release blocked.
>>>> >>>> > :(
>>>> >>>> >
>>>> >>>> > On Mon, Nov 6, 2017 at 10:19 AM, Stack <st...@duboce.net> wrote:
>>>> >>>> >> Our builds seem pretty sick up on builds.apache.org even after the miracle
>>>> >>>> >> work by Allen W containing errant hadoop processes. Looking at 1.2 and 1.3,
>>>> >>>> >> we don't even get off the ground. Anyone been taking a look?
>>>> >>>> >>
>>>> >>>> >> When I try to run the branch-1.2 and branch-1.3 unit tests locally, about
>>>> >>>> >> ten tests or so timeout. Have others tried branch-1 test runs recently?
>>>> >>>> >>
>>>> >>>> >> Thanks,
>>>> >>>> >> S
>>>> >>>> >>
>>>> >>>> >>
>>>> >>>> >> On Mon, Aug 21, 2017 at 1:54 PM, Stack <st...@duboce.net> wrote:
>>>> >>>> >>
>>>> >>>> >>> Loads of tests timing out in test runs -- then they all pass. Anyone have
>>>> >>>> >>> an input? I'm trying to take a look as background task...
>>>> >>>> >>>
>>>> >>>> >>> S
>>>> >>>> >>>
>>>> >>>> >>> On Tue, Jul 11, 2017 at 7:05 PM, Stack <st...@duboce.net> wrote:
>>>> >>>> >>>
>>>> >>>> >>>> Thanks Appy.
>>>> >>>> >>>>
>>>> >>>> >>>> Any one looking at the 'ERROR ExecutionException Java heap space...'
>>>> >>>> >>>> errors on patch builds or failed forking? Seems common enough. Here are
>>>> >>>> >>>> complaints that remote JVM went away:
>>>> >>>> >>>>
>>>> >>>> >>>> https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/7617/artifact/patchprocess/patch-unit-hbase-server.txt
>>>> >>>> >>>> https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/7616/artifact/patchprocess/patch-unit-hbase-server.txt
>>>> >>>> >>>>
>>>> >>>> >>>> Then this succeeds....
>>>> >>>> >>>>
>>>> >>>> >>>> https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/7614/artifact/patchprocess/patch-unit-hbase-server.txt
>>>> >>>> >>>>
>>>> >>>> >>>> And we are good for a while.
>>>> >>>> >>>>
>>>> >>>> >>>> Then heap issues:
>>>> >>>> >>>>
>>>> >>>> >>>> https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/7607/artifact/patchprocess/patch-unit-hbase-server.txt
>>>> >>>> >>>>
>>>> >>>> >>>> Are the zombies back?
>>>> >>>> >>>>
>>>> >>>> >>>> St.Ack
>>>> >>>> >>>>
>>>> >>>> >>>> On Tue, Jul 11, 2017 at 12:33 AM, Apekshit Sharma <a...@cloudera.com> wrote:
>>>> >>>> >>>>
>>>> >>>> >>>>> Fixed 'trends' in flaky dashboard. Since i changed the test names in last
>>>> >>>> >>>>> fix, the dots in the name were messing up with CSS selectors. :)
>>>> >>>> >>>>>
>>>> >>>> >>>>>
>>>> >>>> >>>>> On Mon, Jul 10, 2017 at 11:34 AM, Apekshit Sharma <a...@cloudera.com> wrote:
>>>> >>>> >>>>>
>>>> >>>> >>>>> > Quick update on flaky dashboard:
>>>> >>>> >>>>> > Flaky dashboard wasn't working earlier because our trunk build was broken.
>>>> >>>> >>>>> > After trunk was fixed, the format of log lines in consoleText was not the
>>>> >>>> >>>>> > same, so findHangingTests.py was not able to parse it correctly for
>>>> >>>> >>>>> > broken/hanging/timeout tests. That's been fixed now HBASE-18341
>>>> >>>> >>>>> > <https://issues.apache.org/jira/browse/HBASE-18341>.
>>>> >>>> >>>>> > Drob brought up in other thread that 'trends' isn't working. It's probably
>>>> >>>> >>>>> > because i changed tests names (which are used as keys in python dicts) from
>>>> >>>> >>>>> > just class name to package name+classname (without common
>>>> >>>> >>>>> > org.apache.hadoop.hbase prefix). I had to do it because we have some tests
>>>> >>>> >>>>> > with same class name but in different packages.
>>>> >>>> >>>>> >
>>>> >>>> >>>>> > I'll take a look at it sometime this week (unless someone wants to take it
>>>> >>>> >>>>> > up and work on this beautiful piece of infra ;) )
>>>> >>>> >>>>> >
>>>> >>>> >>>>> >
>>>> >>>> >>>>> > On Thu, Jul 6, 2017 at 11:25 PM, Stack <st...@duboce.net> wrote:
>>>> >>>> >>>>> >
>>>> >>>> >>>>> >> On Thu, Jul 6, 2017 at 3:45 PM, Sean Busbey <bus...@apache.org> wrote:
>>>> >>>> >>>>> >>
>>>> >>>> >>>>> >> > that sounds like our project structure is broken. Please make sure there's
>>>> >>>> >>>>> >> > a jira that tracks it and I'll take a look later.
>>>> >>>> >>>>> >> >
>>>> >>>> >>>>> >>
>>>> >>>> >>>>> >> Filed HBASE-18331 for now.
>>>> >>>> >>>>> >>
>>>> >>>> >>>>> >> I can take a look too later.
>>>> >>>> >>>>> >>
>>>> >>>> >>>>> >> St.Ack
>>>> >>>> >>>>> >>
>>>> >>>> >>>>> >> > On Thu, Jul 6, 2017 at 6:15 PM, Stack <st...@duboce.net> wrote:
>>>> >>>> >>>>> >> >
>>>> >>>> >>>>> >> > > I tried publishing hbase-3.0.0-SNAPSHOT... so hbase-checkstyle was up in
>>>> >>>> >>>>> >> > > repo (presuming it relied on an aged-out snapshot). Seems to have 'fixed'
>>>> >>>> >>>>> >> > > it for now....
>>>> >>>> >>>>> >> > >
>>>> >>>> >>>>> >> > > St.Ack
>>>> >>>> >>>>> >> > >
>>>> >>>> >>>>> >> > > On Thu, Jul 6, 2017 at 12:50 PM, Stack <st...@duboce.net> wrote:
>>>> >>>> >>>>> >> > >
>>>> >>>> >>>>> >> > > > The 3.0.0-SNAPSHOT looks suspicious ... the hbase version....
>>>> >>>> >>>>> >> > > > St.Ack
>>>> >>>> >>>>> >> > > >
>>>> >>>> >>>>> >> > > > On Thu, Jul 6, 2017 at 12:49 PM, Stack <st...@duboce.net> wrote:
>>>> >>>> >>>>> >> > > >
>>>> >>>> >>>>> >> > > >> On Thu, Jul 6, 2017 at 12:48 PM, Stack <st...@duboce.net> wrote:
>>>> >>>> >>>>> >> > > >>
>>>> >>>> >>>>> >> > > >>> Checkstyle is currently broke on our builds... looking.
>>>> >>>> >>>>> >> > > >>> St.Ack
>>>> >>>> >>>>> >> > > >>>
>>>> >>>> >>>>> >> > > >> Works if I run it locally (of course)
>>>> >>>> >>>>> >> > > >> St.Ack
>>>> >>>> >>>>> >> > > >>
>>>> >>>> >>>>> >> > > >>>
>>>> >>>> >>>>> >> > > >>> [ERROR] Failed to execute goal org.apache.maven.plugins:maven-checkstyle-plugin:2.17:checkstyle (default-cli) on project hbase: Execution default-cli of goal org.apache.maven.plugins:maven-checkstyle-plugin:2.17:checkstyle failed: Plugin org.apache.maven.plugins:maven-checkstyle-plugin:2.17 or one of its dependencies could not be resolved: Could not find artifact org.apache.hbase:hbase-checkstyle:jar:3.0.0-SNAPSHOT in Nexus (http://repository.apache.org/snapshots) -> [Help 1]
>>>> >>>> >>>>> >> > > >>> [ERROR]
>>>> >>>> >>>>> >> > > >>> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
>>>> >>>> >>>>> >> > > >>> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
>>>> >>>> >>>>> >> > > >>> [ERROR]
>>>> >>>> >>>>> >> > > >>> [ERROR] For more information about the errors and possible solutions, please read the following articles:
>>>> >>>> >>>>> >> > > >>> [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/PluginResolutionException
>>>> >>>> >>>>> >> > > >>> Build step 'Invoke top-level Maven targets' marked build as failure
>>>> >>>> >>>>> >> > > >>> Performing Post build task...
>>>> >>>> >>>>> >> > > >>> Match found for :.* : True
>>>> >>>> >>>>> >> > > >>> Logical operation result is TRUE
>>>> >>>> >>>>> >> > > >>> Running script : # Run zombie detector script
>>>> >>>> >>>>> >> > > >>> ./dev-support/zombie-detector.sh --jenkins ${BUILD_ID}
>>>> >>>> >>>>> >> > > >>> [a3159d73] $ /bin/bash -xe /tmp/hudson1697041977582083402.sh
>>>> >>>> >>>>> >> > > >>> + ./dev-support/zombie-detector.sh --jenkins 3320
>>>> >>>> >>>>> >> > > >>> Thu Jul 6 01:37:09 UTC 2017 We're ok: there is no zombie test
>>>> >>>> >>>>> >> > > >>>
>>>> >>>> >>>>> >> > > >>> On Fri, Jun 30, 2017 at 2:43 PM, Sean Busbey <bus...@apache.org> wrote:
>>>> >>>> >>>>> >> > > >>>
>>>> >>>> >>>>> >> > > >>>> jacoco was added ages ago. I'd guess that something changed on the machines
>>>> >>>> >>>>> >> > > >>>> we use to cause it to stop working.
>>>> >>>> >>>>> >> > > >>>>
>>>> >>>> >>>>> >> > > >>>> On Thu, Jun 29, 2017 at 12:02 PM, Stack <st...@duboce.net> wrote:
>>>> >>>> >>>>> >> > > >>>>
>>>> >>>> >>>>> >> > > >>>> > On Wed, Jun 28, 2017 at 8:43 AM, Josh Elser <els...@apache.org> wrote:
>>>> >>>> >>>>> >> > > >>>> >
>>>> >>>> >>>>> >> > > >>>> > > On 6/27/17 7:20 PM, Stack wrote:
>>>> >>>> >>>>> >> > > >>>> > >
>>>> >>>> >>>>> >> > > >>>> > >> * test-patch's whitespace plugin can configured to ignore some files (but
>>>> >>>> >>>>> >> > > >>>> > >>> I can't think of any we'd care to so whitelist)
>>>> >>>> >>>>> >> > > >>>> > >>>
>>>> >>>> >>>>> >> > > >>>> > >>> Generated files.
>>>> >>>> >>>>> >> > > >>>> > >>
>>>> >>>> >>>>> >> > > >>>> > >
>>>> >>>> >>>>> >> > > >>>> > > Oh my goodness, yes, please. This has been such a pain in the rear for me
>>>> >>>> >>>>> >> > > >>>> > > as I've been rebasing space quota patches. Sometimes, the spaces in
>>>> >>>> >>>>> >> > > >>>> > > pb-gen'ed code are removed by folks before commit, other times they aren't.
>>>> >>>> >>>>> >> > > >>>> > >
>>>> >>>> >>>>> >> > > >>>> >
>>>> >>>> >>>>> >> > > >>>> > Agree sir. Its a distraction at least.
>>>> >>>> >>>>> >> > > >>>> >
>>>> >>>> >>>>> >> > > >>>> > I see Jacoco report here now:
>>>> >>>> >>>>> >> > > >>>> > https://builds.apache.org/job/HBase-Trunk_matrix/jdk=JDK%201.8%20(latest),label=Hadoop/3277/
>>>> >>>> >>>>> >> > > >>>> >
>>>> >>>> >>>>> >> > > >>>> > Maybe it has been there always and I just haven't noticed.
>>>> >>>> >>>>> >> > > >>>> >
>>>> >>>> >>>>> >> > > >>>> > Its all 0%. We need to turn on stuff?
>>>> >>>> >>>>> >> > > >>>> >
>>>> >>>> >>>>> >> > > >>>> > St.Ack
>>>> >>>> >>>>> >> > > >>>> >
>>>> >>>> >>>>> >
>>>> >>>> >>>>> > --
>>>> >>>> >>>>> >
>>>> >>>> >>>>> > -- Appy
>>>> >>>> >>>>> >
>>>> >>>> >>>>>
>>>> >>>> >>>>> --
>>>> >>>> >>>>>
>>>> >>>> >>>>> -- Appy
>>>> >>>> >>>>>
>>>> >>>> >
>>>> >>>> > --
>>>> >>>> > Sean
>>>> >>>>
>>>> >>>> --
>>>> >>>> Sean