Thanks Pat / Abe.
I don't see the permission issue in the build now, but the C++ test is
still failing:
[exec] [exec] Zookeeper_simpleSystem::testWatcherAutoResetWithLocal ZooKeeper server started ZooKeeper server started ZooKeeper server started : elapsed 15069 : OK
[exec] [exec] Zookeeper_simpleSystem::testGetChildren2 : elapsed 1065 : OK
[exec] [exec] Zookeeper_simpleSystem::testLastZxid : elapsed 4532 : OK
[exec] [exec] Zookeeper_simpleSystem::testRemoveWatchers ZooKeeper server started : elapsed 4357 : OK
[exec]
[exec] BUILD FAILED
[exec] /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/build.xml:1346: The following error occurred while executing this line:
[exec] /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/build.xml:1356: exec returned: 2
[exec]
[exec] Total time: 16 minutes 23 seconds
[exec] /bin/kill -9 18017
[exec] [exec] Zookeeper_readOnly::testReadOnly : assertion : elapsed 4035
* [exec] [exec] /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/src/c/tests/TestReadOnlyClient.cc:99: Assertion: equality assertion failed [Expected: 0, Actual : -4]*
[exec] [exec] Failures !!!
[exec] [exec] Run: 74 Failure total: 1 Failures: 1 Errors: 0
[exec] [exec] FAIL: zktest-mt
[exec] [exec] ==========================================
[exec] [exec] 1 of 2 tests failed
[exec] [exec] Please report to [email protected]
[exec] [exec] ==========================================
[exec] [exec] make[1]: Leaving directory `/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/build/test/test-cppunit'
[exec] [exec] /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/src/c/tests/zkServer.sh: line 62: kill: (17966) - No such process
[exec] [exec] make[1]: *** [check-TESTS] Error 1
[exec] [exec] make: *** [check-am] Error 2
Any ideas?
Andor
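
P.S. On the /tmp/zkdata ownership problem quoted below: here is a rough sketch of what a per-run temp directory could look like in a test script. The helper names (make_test_dir, cleanup_test_dir), the zktest prefix and the ZK_TEST_DIR variable are made up for illustration; this is not what zkServer.sh does today, only one possible shape of the "new directory in /tmp" idea.

```shell
#!/bin/sh
# Hypothetical sketch: names and layout are illustrative, not from zkServer.sh.

# Create a uniquely named directory for one test run.
# mktemp -d creates it owned by the invoking user, so a later run under a
# different uid cannot collide with leftovers from this one.
make_test_dir() {
    mktemp -d "${TMPDIR:-/tmp}/zktest.XXXXXX"
}

# Remove the directory created for this run, if it still exists.
cleanup_test_dir() {
    if [ -n "$1" ] && [ -d "$1" ]; then
        rm -rf "$1"
    fi
}

ZK_TEST_DIR=$(make_test_dir) || exit 1
# Clean up when the script exits, whether the tests passed or failed.
trap 'cleanup_test_dir "$ZK_TEST_DIR"' EXIT

# Host the data files under the per-run directory instead of a fixed path.
mkdir -p "$ZK_TEST_DIR/zkdata"
echo 1 > "$ZK_TEST_DIR/zkdata/myid"
```

With something like this, a stale /tmp/zkdata owned by another uid would no longer get in the way of the rm, since each run only touches a directory it created itself.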
On Wed, Nov 22, 2017 at 6:34 PM, Patrick Hunt <[email protected]> wrote:
> The builds team fixed things on their side and our jenkins job is green
> again; however, I'm working with Abe et al. to address this on our side as
> well.
>
> Patrick
>
> On Tue, Nov 21, 2017 at 12:55 PM, Patrick Hunt <[email protected]> wrote:
>
> > FYI: someone just reported similar problems to the builds list:
> >
> > ----
> > it seems that on some nodes the user ids, that are used by the Jenkins
> > slaves, have been changed. But there are still some directories residing in
> > /tmp with ownership to the old uid. That causes a conflict with our tests,
> > because these files can neither be deleted nor moved.
> >
> > Slave where our jobs fail: H25
> > But this may not be the only one.
> >
> > Could you please check and delete (old) temp files there.
> > In our case it's /tmp/archiva, but other projects may have similar
> > problems
> >
> >
> > On Tue, Nov 21, 2017 at 12:18 PM, Abraham Fine <[email protected]> wrote:
> >
> >> I'll take a look.
> >>
> >> On Tue, Nov 21, 2017, at 11:55, Patrick Hunt wrote:
> >> > Looks like someone is creating our test files outside of jenkins. I
> >> > modified the job to output our id and look at the perms on those files:
> >> >
> >> > ----
> >> >
> >> > [ZooKeeper-trunk] $ /bin/bash /tmp/jenkins291402182647699851.sh
> >> > uid=910(jenkins) gid=910(jenkins) groups=910(jenkins),999(docker)
> >> >
> >> > drwxr-xr-x 3 10025 12036 4096 Nov 10 01:39 /tmp/zkdata
> >> > -rw-r--r-- 1 10025 12036 2 Nov 10 01:39 /tmp/zkdata/myid
> >> >
> >> > /tmp/zkdata/version-2:
> >> > total 20
> >> > drwxr-xr-x 2 10025 12036 4096 Oct 22 23:35 .
> >> > drwxr-xr-x 3 10025 12036 4096 Nov 10 01:39 ..
> >> > -rw-r--r-- 1 10025 12036 1 Oct 22 23:35 acceptedEpoch
> >> > -rw-r--r-- 1 10025 12036 1 Oct 22 23:35 currentEpoch
> >> > -rw-r--r-- 1 10025 12036 562 Oct 22 23:35 snapshot.0
> >> >
> >> > ----
> >> >
> >> >
> >> > Notice that it's not jenkins.
> >> >
> >> >
> >> > Can you (Abe?) submit a jira/patch (ASAP as it's breaking the build)
> >> > to create a new directory in /tmp and then host all the tmp files
> >> > there?
> >> >
> >> >
> >> > Thanks,
> >> >
> >> >
> >> > Patrick
> >> >
> >> >
> >> >
> >> > On Tue, Nov 21, 2017 at 10:37 AM, Patrick Hunt <[email protected]> wrote:
> >> >
> >> > > With the same issue? Does it ever pass?
> >> > >
> >> > > Patrick
> >> > >
> >> > >> On Tue, Nov 21, 2017 at 10:32 AM, Andor Molnar <[email protected]> wrote:
> >> > >
> >> > >> I checked a few of the failing builds and they fail on different hosts:
> >> > >> H4, H9, H12, ...
> >> > >>
> >> > >>
> >> > >>
> >> > >>
> >> > >>
> >> > >> On Tue, Nov 21, 2017 at 6:26 PM, Patrick Hunt <[email protected]> wrote:
> >> > >>
> >> > >> > Could it be an environment issue? I see the following just before the
> >> > >> > failure:
> >> > >> >
> >> > >> > [exec] rm: cannot remove '/tmp/zkdata/myid': Permission denied
> >> > >> >
> >> > >> > Check whether it's happening on just one host (jenkins).
> >> > >> >
> >> > >> > Patrick
> >> > >> >
> >> > >> > On Tue, Nov 21, 2017 at 6:25 AM, Andor Molnar <[email protected]> wrote:
> >> > >> >
> >> > >> > > Looks like only https://builds.apache.org/job/ZooKeeper-trunk is affected.
> >> > >> > >
> >> > >> > >
> >> > >> > > On Tue, Nov 21, 2017 at 3:22 PM, Andor Molnar <[email protected]> wrote:
> >> > >> > >
> >> > >> > > > Hi,
> >> > >> > > >
> >> > >> > > > The ZooKeeper build has been failing for a while with a weird error
> >> > >> > > > in the test-core-cppunit task. In most cases the error is the following:
> >> > >> > > >
> >> > >> > > > ...
> >> > >> > > > [exec] Zookeeper_simpleSystem::testGetChildren2 : elapsed 1052 : OK
> >> > >> > > > [exec] Zookeeper_simpleSystem::testLastZxid : elapsed 4520 : OK
> >> > >> > > > [exec] Zookeeper_simpleSystem::testRemoveWatchers ZooKeeper server started : elapsed 5390 : OK
> >> > >> > > > [exec] rm: cannot remove '/tmp/zkdata/myid': Permission denied
> >> > >> > > > [exec] Zookeeper_readOnly::testReadOnly : assertion : elapsed 4018
> >> > >> > > > [exec] /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/src/c/tests/TestReadOnlyClient.cc:99: Assertion: equality assertion failed [Expected: 0, Actual : -4]
> >> > >> > > > [exec] Failures !!!
> >> > >> > > > [exec] Run: 74 Failure total: 1 Failures: 1 Errors: 0
> >> > >> > > > [exec] FAIL: zktest-mt
> >> > >> > > > [exec] ==========================================
> >> > >> > > > [exec] 1 of 2 tests failed
> >> > >> > > > [exec] Please report to [email protected]
> >> > >> > > > [exec] ==========================================
> >> > >> > > > [exec] Makefile:1744: recipe for target 'check-TESTS' failed
> >> > >> > > > [exec] make[1]: Leaving directory '/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/build/test/test-cppunit'
> >> > >> > > > [exec] Makefile:2000: recipe for target 'check-am' failed
> >> > >> > > > [exec] /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/src/c/tests/zkServer.sh: line 62: kill: (10156) - No such process
> >> > >> > > > [exec] make[1]: *** [check-TESTS] Error 1
> >> > >> > > > [exec] make: *** [check-am] Error 2
> >> > >> > > >
> >> > >> > > > ----------------------
> >> > >> > > >
> >> > >> > > > The test at TestReadOnlyClient.cc:99 got a ConnectionLoss event.
> >> > >> > > > Does anyone have a clue what the root cause could be?
> >> > >> > > >
> >> > >> > > > Regards,
> >> > >> > > > Andor
> >> > >> > > >
> >> > >> > > >
> >> > >> > >
> >> > >> >
> >> > >>
> >> > >
> >> > >
> >>
> >
> >
>