Yes for the stack size. See also section 2.1.2.5 of the configuration
chapter: nofile & nproc are important for the tests.
Also check with the resource checker (see chapter 15.7.3.7) that the
values actually used during test execution are the ones you expect.
You want MaxFileDescriptor to be above 2K to be safe. We run the tests with
a value set to 65K. The same goes for nproc: on Linux it actually controls
the number of Java threads you can create.
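
As a quick sanity check outside the test run, something like the sketch below can print the soft limits seen by the current process. It is not the HBase resource checker, only an illustration using Python's standard `resource` module. The names `MINIMUMS`, `is_below` and `check_limits` are made up for this example, and using the same 2K floor for nproc is an assumption. Run it as the same user (and from the same kind of shell) as the Jenkins job, otherwise the values may differ.

```python
import resource

# Soft limits we want to be ABOVE these values. 2048 follows the "above 2K"
# advice for file descriptors; reusing it for nproc is an assumption.
MINIMUMS = {
    "nofile": (resource.RLIMIT_NOFILE, 2048),
    "nproc": (resource.RLIMIT_NPROC, 2048),
}


def is_below(soft, minimum):
    """Return True if a soft limit is finite and not above the minimum."""
    if soft == resource.RLIM_INFINITY:
        return False  # unlimited is always fine
    return soft <= minimum


def check_limits(minimums=MINIMUMS):
    """Return (name, soft_limit, minimum) tuples for limits that look too low."""
    problems = []
    for name, (which, minimum) in minimums.items():
        soft, _hard = resource.getrlimit(which)
        if is_below(soft, minimum):
            problems.append((name, soft, minimum))
    return problems


if __name__ == "__main__":
    for name, soft, minimum in check_limits():
        print("%s soft limit is %d, expected above %d" % (name, soft, minimum))
```

If it prints nothing, both limits are above the floor for that user. The values inside the forked test JVMs can still differ, which is why the resource checker output remains the reference.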


On Wed, May 29, 2013 at 12:02 PM, Lisen Mu <imm...@gmail.com> wrote:

> Nicolas,
>
> Thanks for the reply!
>
> I run small & medium tests only, and I have 8G of RAM on the build server.
> My passing rate is also around 50%, but I'm not running -PrunAllTests.
>
> I can find 2 differences between the official build and mine: a missing
> prebuild step 'rm -rf /tmp/hbase-jenkins/hbase', and the stack size is
> 1024 vs 8192 in ulimit. Could this be a problem? I'll try it out anyway.
> Thanks!
>
>
>
>
>
> On Wed, May 29, 2013 at 3:12 PM, Nicolas Liochon <nkey...@gmail.com>
> wrote:
>
> > Hello,
> >
> > Option 1:
> > We still have some flaky tests. You can benchmark your build against
> > https://builds.apache.org/job/HBase-TRUNK/ and
> > https://builds.apache.org/job/hbase-0.95/
> > You can also use this tool: https://github.com/jeffreyz88/jenkins-tools
> > to get an overview of the latest failures:
> >
> > On 0.95, we have some tests that failed ~15% of the time:
> >
> > org.apache.hadoop.hbase.client.testadmin.testforcesplitmultifamily
> >   80/6/13   1  1  1  1  1  1  1  1  1  1  1  1 -1  0 -1
> > org.apache.hadoop.hbase.client.testhcm.testdeleteforzkconnleak
> >   86/6/6    1 -1  0  1  1  1  1  1  1  1  1  1  1  1  1
> > org.apache.hadoop.hbase.client.testmultiparallel.testactivethreadscount
> >   86/6/6    1  1  1  1  1  1  1 -1  0  1  1  1  1  1  1
> > org.apache.hadoop.hbase.client.testmultiparallel.testflushcommitswithabort
> >   86/6/6    1  1  1  1  1  1  1 -1  0  1  1  1  1  1  1
> > org.apache.hadoop.hbase.replication.testreplicationqueuefailover.queuefailover
> >   86/6/6    1  1  1  1  1  1  1  1  1  1 -1  0  1  1  1
> > org.apache.hadoop.hbase.rest.client.testremoteadmin.testclusterstatus
> >   86/6/6    1  1  1  1  1  1  1  1  1  1  1  1 -1  0  1
> > org.apache.hadoop.hbase.security.access.testaccesscontroller.testglobalauthorizationfornewregisteredrs
> >   86/6/6    1  1  1  1  1  1  1 -1  0  1  1  1  1  1  1
> > org.apache.hadoop.hbase.util.testhbasefsck.testsplitdaughtersnotinmeta
> >   86/6/6
> >
> >
> > Option 2:
> > You've got some issues in your env. We run the tests in parallel (5 by
> > default when you run all tests, 2 when you run only the small & medium
> > ones). 5 requires around 10 GB of memory. If you have less, or if the
> > build machine is shared, you may run into strange conditions around
> > test timing requirements.
> > You also need to use the Oracle JDK.
> >
> > See as well http://hbase.apache.org/book.html / 15.7.3. Running tests,
> > for extra parameters.
> >
> >
> > These options are not exclusive. It seems that these days the trunk
> > build is ok ~80% of the time, and 0.95 ~50%. You should expect something
> > similar.
> >
> >
> > A test becomes flaky because a patch breaks it just a little. The patch
> > passes the peer review and the precommit runs, but after a while the
> > randomness shows up, and we need to fix the code again. It's a
> > never-ending story. Any help in fixing them is always greatly
> > appreciated.
> >
> > Cheers,
> >
> > Nicolas
> >
> >
> > On Wed, May 29, 2013 at 8:21 AM, Lisen Mu <imm...@gmail.com> wrote:
> >
> > > Hello,
> > >
> > > I'm setting up a Jenkins job for HBase, building branch 0.95 (from
> > > GitHub) under JDK 6.
> > >
> > > However, sometimes the build passes and sometimes it does not. Several
> > > tests are likely to fail, such as:
> > >
> > >
> > > org.apache.hadoop.hbase.ipc.TestDelayedRpc.testDelayedRpcImmediateReturnValue
> > > org.apache.hadoop.hbase.ipc.TestDelayedRpc.testTooManyDelayedRpcs
> > > org.apache.hadoop.hbase.client.TestHCM.testDeleteForZKConnLeak
> > >
> > > yet they do not fail every time.
> > >
> > > Any clue about what might be the problem? Thanks.
> > >
> > >
> > > From the last failed build:
> > >
> > > the executed mvn command line:
> > >
> > > Executing Maven:  -B -f
> > > /var/lib/jenkins/jobs/HBase-0.95-jdk-6/workspace/pom.xml clean package
> > >
> > >
> > >
> > > Possibly related log:
> > >
> > >
> > > org.apache.hadoop.hbase.ipc.TestDelayedRpc.testDelayedRpcImmediateReturnValue
> > >
> > > Error Message
> > >
> > > Index: 1, Size: 1
> > >
> > > Stacktrace
> > >
> > > java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
> > >         at java.util.ArrayList.RangeCheck(ArrayList.java:547)
> > >         at java.util.ArrayList.get(ArrayList.java:322)
> > >         at org.apache.hadoop.hbase.ipc.TestDelayedRpc.testDelayedRpc(TestDelayedRpc.java:112)
> > >         at org.apache.hadoop.hbase.ipc.TestDelayedRpc.testDelayedRpcImmediateReturnValue(TestDelayedRpc.java:71)
> > >         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > >         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > >         at java.lang.reflect.Method.invoke(Method.java:597)
> > >         at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> > >         at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> > >         at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> > >         at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> > >         at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
> > >         at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
> > >         at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
> > >         at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
> > >         at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
> > >         at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
> > >         at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
> > >         at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
> > >         at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> > >         at org.junit.runners.Suite.runChild(Suite.java:127)
> > >         at org.junit.runners.Suite.runChild(Suite.java:26)
> > >         at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
> > >         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> > >         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> > >         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> > >         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> > >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> > >         at java.lang.Thread.run(Thread.java:662)
> > >
> >
>
