Ugh. I sent a reply to Gav on builds@ about maybe getting names that don't have spaces in them:
https://lists.apache.org/thread.html/8ac03dc62f9d6862d4f3d5eb37119c9c73b4059aaa3ebba52fc63bb6@%3Cbuilds.apache.org%3E

In the meantime, is this an issue we need to file with Hadoop or something we need to fix in our own code?

On Wed, Aug 10, 2016 at 6:04 PM, Matteo Bertozzi <theo.berto...@gmail.com> wrote:
> There are a bunch of builds that have most of the test failing.
>
> Example:
> https://builds.apache.org/job/HBase-Trunk_matrix/1392/jdk=JDK%201.7%20(latest),label=yahoo-not-h2/testReport/junit/org.apache.hadoop.hbase/TestLocalHBaseCluster/testLocalHBaseCluster/
>
> from the stack trace looks like the problem is with the jdk name that has
> spaces:
> the hadoop FsVolumeImpl calls setNameFormat(... + fileName.toString() + ...)
> and this seems to not be escaped
> so we end up with JDK%25201.7%2520(latest) in the string format and we get
> a IllegalFormatPrecisionException: 7
>
> 2016-08-10 22:07:46,108 WARN [DataNode:
> [[[DISK]file:/home/jenkins/jenkins-slave/workspace/HBase-Trunk_matrix/jdk/JDK%25201.7%2520(latest)/label/yahoo-not-h2/hbase-server/target/test-data/e7099624-ecfa-4674-87de-a8733d13b582/dfscluster_10fdcfc3-cd1b-45be-9b5a-9c88f385e6f1/dfs/data/data1/,
> [DISK]file:/home/jenkins/jenkins-slave/workspace/HBase-Trunk_matrix/jdk/JDK%25201.7%2520(latest)/label/yahoo-not-h2/hbase-server/target/test-data/e7099624-ecfa-4674-87de-a8733d13b582/dfscluster_10fdcfc3-cd1b-45be-9b5a-9c88f385e6f1/dfs/data/data2/]]
> heartbeating to localhost/127.0.0.1:34629]
> datanode.BPServiceActor(831): Unexpected exception in block pool Block
> pool <registering> (Datanode Uuid unassigned) service to
> localhost/127.0.0.1:34629
> java.util.IllegalFormatPrecisionException: 7
> at java.util.Formatter$FormatSpecifier.checkText(Formatter.java:2984)
> at java.util.Formatter$FormatSpecifier.<init>(Formatter.java:2688)
> at java.util.Formatter.parse(Formatter.java:2528)
> at java.util.Formatter.format(Formatter.java:2469)
> at java.util.Formatter.format(Formatter.java:2423)
> at
java.lang.String.format(String.java:2792) > at > com.google.common.util.concurrent.ThreadFactoryBuilder.setNameFormat(ThreadFactoryBuilder.java:68) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.initializeCacheExecutor(FsVolumeImpl.java:140) > > > > Matteo > > > On Tue, Aug 9, 2016 at 9:55 AM, Stack <st...@duboce.net> wrote: > >> Good on you Sean. >> S >> >> On Mon, Aug 8, 2016 at 9:43 PM, Sean Busbey <bus...@apache.org> wrote: >> >> > I updated all of our jobs to use the updated JDK versions from infra. >> > These have spaces in the names, and those names end up in our >> > workspace path, so try to keep an eye out. >> > >> > >> > >> > On Mon, Aug 8, 2016 at 10:42 AM, Sean Busbey <bus...@cloudera.com> >> wrote: >> > > running in docker is the default now. relying on the default docker >> > > image that comes with Yetus means that our protoc checks are >> > > failing[1]. >> > > >> > > >> > > [1]: https://issues.apache.org/jira/browse/HBASE-16373 >> > > >> > > On Sat, Aug 6, 2016 at 5:03 PM, Sean Busbey <bus...@apache.org> wrote: >> > >> Hi folks! >> > >> >> > >> this morning I merged the patch that updates us to Yetus 0.3.0[1] and >> > updated the precommit job appropriately. I also changed it to use one of >> > the Java versions post the puppet changes to asf build. >> > >> >> > >> The last three builds look normal (#2975 - #2977). I'm gonna try >> > running things in docker next. I'll email again when I make it the >> default. >> > >> >> > >> [1]: https://issues.apache.org/jira/browse/HBASE-15882 >> > >> >> > >> On 2016-06-16 10:43 (-0500), Sean Busbey <bus...@apache.org> wrote: >> > >>> FYI, today our precommit jobs started failing because our chosen jdk >> > >>> (1.7.0.79) disappeared (mentioned on HBASE-16032). >> > >>> >> > >>> Initially we were doing something wrong, namely directly referencing >> > >>> the jenkins build tools area without telling jenkins to give us an >> env >> > >>> variable that stated where the jdk is located. 
However, after >> > >>> attempting to switch to the appropriate tooling variable for jdk >> > >>> 1.7.0.79, I found that it didn't point to a place that worked. >> > >>> >> > >>> I've now updated the job to rely on the latest 1.7 jdk, which is >> > >>> currently 1.7.0.80. I don't know how often "latest" updates. >> > >>> >> > >>> Personally, I think this is a sign that we need to prioritize >> > >>> HBASE-15882 so that we can switch back to using Docker. I won't have >> > >>> time this week, so if anyone else does please pick up the ticket. >> > >>> >> > >>> On Thu, Mar 17, 2016 at 5:19 PM, Stack <st...@duboce.net> wrote: >> > >>> > Thanks Sean. >> > >>> > St.Ack >> > >>> > >> > >>> > On Wed, Mar 16, 2016 at 12:04 PM, Sean Busbey <bus...@cloudera.com >> > >> > wrote: >> > >>> > >> > >>> >> FYI, I updated the precommit job today to specify that only >> compile >> > time >> > >>> >> checks should be done against jdks other than the primary jdk7 >> > instance. >> > >>> >> >> > >>> >> On Mon, Mar 7, 2016 at 8:43 PM, Sean Busbey <bus...@cloudera.com> >> > wrote: >> > >>> >> >> > >>> >> > I tested things out, and while YETUS-297[1] is present the >> > default runs >> > >>> >> > all plugins that can do multiple jdks against those available >> > (jdk7 and >> > >>> >> > jdk8 in our case). >> > >>> >> > >> > >>> >> > We can configure things to only do a single run of unit tests. >> > They'll be >> > >>> >> > against jdk7, since that is our default jdk. That fine by >> > everyone? It'll >> > >>> >> > save ~1.5 hours on any build that hits hbase-server. >> > >>> >> > >> > >>> >> > On Mon, Mar 7, 2016 at 1:22 PM, Stack <st...@duboce.net> wrote: >> > >>> >> > >> > >>> >> >> Hurray! >> > >>> >> >> >> > >>> >> >> It looks like YETUS-96 is in there and we are only running on >> > jdk build >> > >>> >> >> now, the default (but testing compile against both).... Will >> > keep an >> > >>> >> eye. 
>> > >>> >> >> >> > >>> >> >> St.Ack >> > >>> >> >> >> > >>> >> >> >> > >>> >> >> On Mon, Mar 7, 2016 at 10:27 AM, Sean Busbey < >> > bus...@cloudera.com> >> > >>> >> wrote: >> > >>> >> >> >> > >>> >> >> > FYI, I've just updated our precommit jobs to use the 0.2.0 >> > release of >> > >>> >> >> Yetus >> > >>> >> >> > that came out today. >> > >>> >> >> > >> > >>> >> >> > After keeping an eye out for strangeness today I'll turn >> > docker mode >> > >>> >> >> back >> > >>> >> >> > on by default tonight. >> > >>> >> >> > >> > >>> >> >> > On Wed, Jan 13, 2016 at 10:14 AM, Sean Busbey < >> > bus...@apache.org> >> > >>> >> >> wrote: >> > >>> >> >> > >> > >>> >> >> > > FYI, I added a new parameter to the precommit job: >> > >>> >> >> > > >> > >>> >> >> > > * USE_YETUS_PRERELEASE - causes us to use the HEAD of the >> > >>> >> apache/yetus >> > >>> >> >> > > repo rather than our chosen release >> > >>> >> >> > > >> > >>> >> >> > > It defaults to inactive, but can be used in >> > manually-triggered runs >> > >>> >> to >> > >>> >> >> > > test a solution to a problem in the yetus library. At the >> > moment, >> > >>> >> I'm >> > >>> >> >> > > using it to test a solution to default module ordering as >> > seen in >> > >>> >> >> > > HBASE-15075. >> > >>> >> >> > > >> > >>> >> >> > > On Fri, Jan 8, 2016 at 7:58 AM, Sean Busbey < >> > bus...@cloudera.com> >> > >>> >> >> wrote: >> > >>> >> >> > > > FYI, I just pushed HBASE-13525 (switch to Apache Yetus >> for >> > >>> >> precommit >> > >>> >> >> > > tests) >> > >>> >> >> > > > and updated our jenkins precommit build to use it. >> > >>> >> >> > > > >> > >>> >> >> > > > Jenkins job has some explanation: >> > >>> >> >> > > > >> > >>> >> >> > > >> > >>> >> >> > >> > >>> >> >> >> > >>> >> https://builds.apache.org/view/PreCommit%20Builds/job/ >> > PreCommit-HBASE-Build/ >> > >>> >> >> > > > >> > >>> >> >> > > > Release note from HBASE-13525 does as well. 
>> > >>> >> >> > > > >> > >>> >> >> > > > The old job will stick around here for a couple of weeks, >> > in case >> > >>> >> we >> > >>> >> >> > need >> > >>> >> >> > > > to refer back to it: >> > >>> >> >> > > > >> > >>> >> >> > > > >> > >>> >> >> > > >> > >>> >> >> > >> > >>> >> >> >> > >>> >> https://builds.apache.org/view/PreCommit%20Builds/job/ >> > PreCommit-HBASE-Build-deprecated/ >> > >>> >> >> > > > >> > >>> >> >> > > > If something looks awry, please drop a note on >> HBASE-13525 >> > while >> > >>> >> it >> > >>> >> >> > > remains >> > >>> >> >> > > > open (and make a new issue after). >> > >>> >> >> > > > >> > >>> >> >> > > > >> > >>> >> >> > > > On Wed, Dec 2, 2015 at 3:22 PM, Stack <st...@duboce.net> >> > wrote: >> > >>> >> >> > > > >> > >>> >> >> > > >> As part of my continuing advocacy of builds.apache.org >> > and that >> > >>> >> >> their >> > >>> >> >> > > >> results are now worthy of our trust and nurture, here >> are >> > some >> > >>> >> >> > > highlights >> > >>> >> >> > > >> from the last few days of builds: >> > >>> >> >> > > >> >> > >>> >> >> > > >> + hadoopqa is now finding zombies before the patch is >> > committed. >> > >>> >> >> > > >> HBASE-14888 showed "-1 core tests. The patch failed >> these >> > unit >> > >>> >> >> tests:" >> > >>> >> >> > > but >> > >>> >> >> > > >> didn't have any failed tests listed (I'm trying to see >> if >> > I can >> > >>> >> do >> > >>> >> >> > > anything >> > >>> >> >> > > >> about this...). Running our little >> > >>> >> ./dev-tools/findHangingTests.py >> > >>> >> >> > > against >> > >>> >> >> > > >> the consoleText, it showed a hanging test. Running >> > locally, I see >> > >>> >> >> same >> > >>> >> >> > > >> hang. This is before the patch landed. 
>> > >>> >> >> > > >> + Our branch runs are now near totally zombie and flakey >> > free -- >> > >>> >> >> still >> > >>> >> >> > > some >> > >>> >> >> > > >> work to do -- but a recent patch that seemed harmless >> was >> > >>> >> causing a >> > >>> >> >> > > >> reliable flake fail in the backport to branch-1* >> > confirmed by >> > >>> >> local >> > >>> >> >> > > runs. >> > >>> >> >> > > >> The flakeyness was plain to see up in builds.apache.org >> . >> > >>> >> >> > > >> + In the last few days I've committed a patch that >> > included >> > >>> >> javadoc >> > >>> >> >> > > >> warnings even though hadoopqa said the patch introduced >> > javadoc >> > >>> >> >> issues >> > >>> >> >> > > (I >> > >>> >> >> > > >> missed it). This messed up life for folks subsequently >> as >> > their >> > >>> >> >> > patches >> > >>> >> >> > > now >> > >>> >> >> > > >> reported javadoc issues.... >> > >>> >> >> > > >> >> > >>> >> >> > > >> In short, I suggest that builds.apache.org is worth >> > keeping an >> > >>> >> eye >> > >>> >> >> > on, >> > >>> >> >> > > >> make >> > >>> >> >> > > >> sure you get a clean build out of hadoopqa before >> > committing >> > >>> >> >> anything, >> > >>> >> >> > > and >> > >>> >> >> > > >> lets all work together to try and keep our builds blue: >> > it'll >> > >>> >> save >> > >>> >> >> us >> > >>> >> >> > > all >> > >>> >> >> > > >> work in the long run. >> > >>> >> >> > > >> >> > >>> >> >> > > >> St.Ack >> > >>> >> >> > > >> >> > >>> >> >> > > >> >> > >>> >> >> > > >> On Tue, Nov 4, 2014 at 9:38 AM, Stack <st...@duboce.net >> > >> > wrote: >> > >>> >> >> > > >> >> > >>> >> >> > > >> > Branch-1 and master have stabilized and now run mostly >> > blue >> > >>> >> >> (give or >> > >>> >> >> > > take >> > >>> >> >> > > >> > the odd failure) [1][2]. 
Having a mostly blue branch-1 >> > has >> > >>> >> >> helped us >> > >>> >> >> > > >> > identify at least one destabilizing commit in the last >> > few >> > >>> >> days, >> > >>> >> >> > maybe >> > >>> >> >> > > >> two; >> > >>> >> >> > > >> > this is as it should be (smile). >> > >>> >> >> > > >> > >> > >>> >> >> > > >> > Lets keep our builds blue. If you commit a patch, make >> > sure >> > >>> >> >> > subsequent >> > >>> >> >> > > >> > builds stay blue. You can subscribe to >> > bui...@hbase.apache.org >> > >>> >> >> to >> > >>> >> >> > get >> > >>> >> >> > > >> > notice of failures if not already subscribed. >> > >>> >> >> > > >> > >> > >>> >> >> > > >> > Thanks, >> > >>> >> >> > > >> > St.Ack >> > >>> >> >> > > >> > >> > >>> >> >> > > >> > 1. >> > >>> >> https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.0/ >> > >>> >> >> > > >> > 2. >> > >>> >> >> https://builds.apache.org/view/H-L/view/HBase/job/HBase-TRUNK/ >> > >>> >> >> > > >> > >> > >>> >> >> > > >> > >> > >>> >> >> > > >> > On Mon, Oct 13, 2014 at 4:41 PM, Stack < >> > st...@duboce.net> >> > >>> >> wrote: >> > >>> >> >> > > >> > >> > >>> >> >> > > >> >> A few notes on testing. >> > >>> >> >> > > >> >> >> > >>> >> >> > > >> >> Too long to read, infra is more capable now and after >> > some >> > >>> >> >> work, we >> > >>> >> >> > > are >> > >>> >> >> > > >> >> seeing branch-1 and trunk mostly running blue. Lets >> > try and >> > >>> >> >> keep it >> > >>> >> >> > > this >> > >>> >> >> > > >> >> way going forward. >> > >>> >> >> > > >> >> >> > >>> >> >> > > >> >> Apache Infra has new, more capable hardware. >> > >>> >> >> > > >> >> >> > >>> >> >> > > >> >> A recent spurt of test fixing combined with more >> > capable >> > >>> >> >> hardware >> > >>> >> >> > > seems >> > >>> >> >> > > >> >> to have gotten us to a new place; tests are mostly >> > passing now >> > >>> >> >> on >> > >>> >> >> > > >> branch-1 >> > >>> >> >> > > >> >> and master. 
Lets try and keep it this way and start >> > to trust >> > >>> >> >> our >> > >>> >> >> > > test >> > >>> >> >> > > >> runs >> > >>> >> >> > > >> >> again. Just a few flakies remain. Lets try and nail >> > them. >> > >>> >> >> > > >> >> >> > >>> >> >> > > >> >> Our tests now run in parallel with other test suites >> > where >> > >>> >> >> previous >> > >>> >> >> > > we >> > >>> >> >> > > >> >> ran alone. You can see this sometimes when our zombie >> > detector >> > >>> >> >> > > reports >> > >>> >> >> > > >> >> tests from another project altogether as lingerers >> (To >> > be >> > >>> >> >> fixed). >> > >>> >> >> > > Some >> > >>> >> >> > > >> of >> > >>> >> >> > > >> >> our tests are failing because a concurrent hbase run >> is >> > >>> >> undoing >> > >>> >> >> > > classes >> > >>> >> >> > > >> and >> > >>> >> >> > > >> >> data from under it. Also, lets fix. >> > >>> >> >> > > >> >> >> > >>> >> >> > > >> >> Our tests are brittle. It takes 75minutes for them to >> > >>> >> complete. >> > >>> >> >> > Many >> > >>> >> >> > > >> are >> > >>> >> >> > > >> >> heavy-duty integration tests starting up multiple >> > clusters and >> > >>> >> >> > > mapreduce >> > >>> >> >> > > >> >> all in the one JVM. It is a miracle they pass at all. >> > Usually >> > >>> >> >> > > >> integration >> > >>> >> >> > > >> >> tests have been cast as unit tests because there was >> > no where >> > >>> >> >> else >> > >>> >> >> > > for >> > >>> >> >> > > >> them >> > >>> >> >> > > >> >> to get an airing. We have the hbase-it suite now >> > which would >> > >>> >> >> be a >> > >>> >> >> > > more >> > >>> >> >> > > >> apt >> > >>> >> >> > > >> >> place but until these are run on a regular basis in >> > public for >> > >>> >> >> all >> > >>> >> >> > to >> > >>> >> >> > > >> see, >> > >>> >> >> > > >> >> the fat integration tests disguised as unit tests >> will >> > remain. 
>> > >>> >> >> A >> > >>> >> >> > > >> review of >> > >>> >> >> > > >> >> our current unit tests weeding the old cruft and the >> > no longer >> > >>> >> >> > > relevant >> > >>> >> >> > > >> or >> > >>> >> >> > > >> >> duplicates would be a nice undertaking if someone is >> > looking >> > >>> >> to >> > >>> >> >> > > >> contribute. >> > >>> >> >> > > >> >> >> > >>> >> >> > > >> >> Alex Newman has been working on making our tests work >> > up on >> > >>> >> >> travis >> > >>> >> >> > > and >> > >>> >> >> > > >> >> circle-ci. That'll be sweet when it goes end-to-end. >> > He also >> > >>> >> >> > added >> > >>> >> >> > > in >> > >>> >> >> > > >> >> some "type" categorizations -- client, filter, >> > mapreduce -- >> > >>> >> >> > alongside >> > >>> >> >> > > >> our >> > >>> >> >> > > >> >> old "sizing" categorizations of small/medium/large. >> > His >> > >>> >> >> thinking >> > >>> >> >> > is >> > >>> >> >> > > >> that >> > >>> >> >> > > >> >> we can run these categorizations in parallel so we >> > could run >> > >>> >> the >> > >>> >> >> > > total >> > >>> >> >> > > >> >> suite in about the time of the longest test, say >> > 20-30minutes? >> > >>> >> >> We >> > >>> >> >> > > could >> > >>> >> >> > > >> >> even change Apache to run them this way. 
>> > >>> >> >> > > >> >> >> > >>> >> >> > > >> >> FYI, >> > >>> >> >> > > >> >> St.Ack >> > >>> >> >> > > >> >> >> > >>> >> >> > > >> >> >> > >>> >> >> > > >> >> >> > >>> >> >> > > >> >> >> > >>> >> >> > > >> >> >> > >>> >> >> > > >> >> >> > >>> >> >> > > >> >> >> > >>> >> >> > > >> > >> > >>> >> >> > > >> >> > >>> >> >> > > > >> > >>> >> >> > > > >> > >>> >> >> > > > >> > >>> >> >> > > > -- >> > >>> >> >> > > > Sean >> > >>> >> >> > > >> > >>> >> >> > >> > >>> >> >> > >> > >>> >> >> > >> > >>> >> >> > -- >> > >>> >> >> > busbey >> > >>> >> >> > >> > >>> >> >> >> > >>> >> > >> > >>> >> > >> > >>> >> > >> > >>> >> > -- >> > >>> >> > busbey >> > >>> >> > >> > >>> >> >> > >>> >> >> > >>> >> >> > >>> >> -- >> > >>> >> busbey >> > >>> >> >> > >>> >> > > >> > > >> > > >> > > -- >> > > busbey >> > >> -- busbey
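
For anyone following along, here is a minimal sketch of the failure Matteo describes, reproducible without Hadoop or Guava. It assumes only what the stack trace shows: a path containing JDK%25201.7%2520(latest) ends up inside a format string that is handed to String.format. The class name, the "worker-" prefix, the shortened path, and the escapeForFormat helper are all made up for illustration; this is not the actual FsVolumeImpl code and not a proposed patch.

```java
import java.util.IllegalFormatPrecisionException;

final class FormatEscapeSketch {
    // Hypothetical, shortened stand-in for the workspace path in the failing build.
    static final String WORKSPACE_DIR = "/workspace/jdk/JDK%25201.7%2520(latest)/data1";

    /** Doubles every '%' so String.format treats the text literally. */
    static String escapeForFormat(String raw) {
        return raw.replace("%", "%%");
    }

    /** Returns the precision the formatter rejects, or -1 if formatting succeeds. */
    static int failingPrecision(String nameFormat) {
        try {
            String.format(nameFormat, 0);
            return -1;
        } catch (IllegalFormatPrecisionException e) {
            return e.getPrecision();
        }
    }

    public static void main(String[] args) {
        // "%25201.7%" parses as width 25201, precision 7, conversion '%',
        // and the '%' conversion does not accept a precision.
        String unsafe = "worker-" + WORKSPACE_DIR + "-%d";
        String safe = "worker-" + escapeForFormat(WORKSPACE_DIR) + "-%d";

        System.out.println("unescaped precision error: " + failingPrecision(unsafe));
        System.out.println("escaped result: " + String.format(safe, 0));
    }
}
```

The unescaped case reports precision 7, the same number as in the build log. Doubling the percent signs is one possible caller-side workaround; getting JDK names without spaces, so the workspace path never contains an encoded space, would sidestep the problem entirely.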