A *single forked* surefire run doesn't help. Neither does an *unbounded* *timeout* limit.
Some part of the tests goes into an infinite wait state and ends with an OOM. Here is the full log for the Travis build with no max memory limit on the JVM:
https://api.travis-ci.org/jobs/28583768/log.txt?deansi=true

> Exception in thread "0d42f23c-c1a8-417a-bee5-672d4449ebac:frag:0:0 - Producer Thread" java.lang.OutOfMemoryError: Direct buffer memory
>     at java.nio.Bits.reserveMemory(Bits.java:658)
>     at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
>     at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306)
>     at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:434)
>     at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:179)
>     at io.netty.buffer.PoolArena.allocate(PoolArena.java:168)
>     at io.netty.buffer.PoolArena.allocate(PoolArena.java:98)
>     at io.netty.buffer.PooledByteBufAllocatorL.newDirectBuffer(PooledByteBufAllocatorL.java:46)
>     at io.netty.buffer.PooledByteBufAllocatorL.directBuffer(PooledByteBufAllocatorL.java:66)
>     at org.apache.drill.exec.memory.TopLevelAllocator$ChildAllocator.buffer(TopLevelAllocator.java:144)
>     at org.apache.drill.exec.memory.TopLevelAllocator$ChildAllocator.buffer(TopLevelAllocator.java:151)
>     at org.apache.drill.exec.vector.VarCharVector.allocateNew(VarCharVector.java:306)
>     at org.apache.drill.exec.vector.NullableVarCharVector.allocateNew(NullableVarCharVector.java:158)
>     at org.apache.drill.exec.vector.AllocationHelper.allocate(AllocationHelper.java:31)
>     at org.apache.drill.exec.physical.impl.ScanBatch$Mutator.allocate(ScanBatch.java:281)
>     at org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:137)
>     at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:112)
>     at org.apache.drill.exec.physical.impl.producer.ProducerConsumerBatch$Producer.run(ProducerConsumerBatch.java:122)
>     at java.lang.Thread.run(Thread.java:744)
>
>
> No output has been received in the last 10 minutes, this potentially
> indicates a
> stalled build or something wrong with the build itself.


On Fri, Jun 27, 2014 at 10:24 AM, Jacques Nadeau wrote:

> I believe you're getting killed for excessive memory consumption. Dropping
> to a single surefire should help
> On Jun 26, 2014 8:43 PM, "Yash Sharma" wrote:
>
> > The build still failed on Travis after increasing timeout to 200000ms.
> > Need to find appropriate value for it.
> > It fails with this error - which typically comes in timeout case:
> >
> >
> > Failed to execute goal
> > org.apache.maven.plugins:maven-surefire-plugin:2.17:test
> > (default-test) on project drill-java-exec: ExecutionException:
> > java.lang.RuntimeException: The forked VM terminated without properly
> > saying goodbye. VM crash or System.exit called?
> > [ERROR] Command was /bin/sh -c cd
> > /home/travis/build/yssharma/incubator-drill/exec/java-exec &&
> > /usr/lib/jvm/java-7-oracle/jre/bin/java -Xms512m -Xmx2g
> > -Ddrill.exec.http.enabled=false
> > -Ddrill.exec.sys.store.provider.local.write=false -XX:MaxPermSize=256M
> > -XX:MaxDirectMemorySize=2096M -XX:+CMSClassUnloadingEnabled -jar
> > /home/travis/build/yssharma/incubator-drill/exec/java-exec/target/surefire/surefirebooter5087846151760741500.jar
> >
> >
> > Will keep digging.
> >
> >
> > @Jacques: I will re-submit the command line configurable patch soon.
> > Will also dig into the surefire forks you mentioned.
> >
> >
> > Yash
> >
> >
> > On Fri, Jun 27, 2014 at 12:39 AM, Yash Sharma wrote:
> >
> > > It was not failing because of the git plugin - rather it's how Travis
> > > takes the clone.
> > > Travis uses git clone --depth=50 for fast building.
> > >
> > > I am able to take a neat build with help of Travis team member Hiro.
> > > The current build is still going on here on my box. Will share the
> > > status on completion.
> > >
> > > Have added JIRA for the same, will add a patch soon:
> > > https://issues.apache.org/jira/browse/DRILL-1083
> > >
> > > Peace,
> > > Yash
> > >
> > >
> > > On Wed, Jun 25, 2014 at 10:00 AM, Jacques Nadeau wrote:
> > >
> > >> Yeah, we have to disable tests on the Apache hardware as our tests are
> > >> too hungry. I'm trying to get some alternatives to work. If someone
> > >> wanted to try to figure out if we could run on Travis with fork 1,
> > >> that would be great. Right now it's failing because of the git plugin.
> > >> You can try Travis on your local fork to try to find a config that works
> > >> On Jun 24, 2014 8:04 PM, "Yash Sharma" wrote:
> > >>
> > >> > The final build #57 with skipping tests was successful.
> > >> >
> > >> > Majority of the tests #55 and #56 have failed due to TimedOut exception.
> > >> > Other exceptions being - IllegalState(Child level allocators not closed).
> > >> > One instance of InterruptedException which probably occurred because of
> > >> > the test case termination only.
> > >> >
> > >> >
> > >> > On Wed, Jun 25, 2014 at 2:59 AM, Timothy Chen wrote:
> > >> >
> > >> > > Looks like lots of tests timed out and errored?
> > >> > >
> > >> > > Tim
> > >> > >
> > >> > > On Tue, Jun 24, 2014 at 11:53 AM, Yash Sharma wrote:
> > >> > > > *fingers-crossed* :)
> > >> > > >
> > >> > > >
> > >> > > > On Wed, Jun 25, 2014 at 12:19 AM, Jacques Nadeau wrote:
> > >> > > >
> > >> > > >> I kicked off another build with clean install. Good catch.
> > >> > > >> Hopefully that will put things back on track.
> > >> > > >>
> > >> > > >>
> > >> > > >> On Tue, Jun 24, 2014 at 11:46 AM, Yash Sharma wrote:
> > >> > > >>
> > >> > > >> > Not exactly able to reproduce the same error currently but I see that
> > >> > > >> > it was related to the DRILL-1024 commit where the hive-storage code
> > >> > > >> > was moved out of java-exec. The *drillOI* definition has moved from
> > >> > > >> > config.fmpp (java-exec) to config.fmpp (hive-exec).
> > >> > > >> >
> > >> > > >> > Jenkins build was still failing in java-exec - that means that the old
> > >> > > >> > ObjectInspectorHelper class was still present and it was probably
> > >> > > >> > looking for the tdd definition in config.fmpp (java-exec).
> > >> > > >> >
> > >> > > >> > Jenkins used 'mvn install' rather than 'mvn clean install' - maybe it
> > >> > > >> > was still referring to the old ObjectInspectorHelper class.
> > >> > > >> >
> > >> > > >> > Still not sure. Will try reproducing exact error.
> > >> > > >> >
> > >> > > >> > Yash
> > >> > > >> >
> > >> > > >> >
> > >> > > >> > On Tue, Jun 24, 2014 at 10:46 PM, Jacques Nadeau wrote:
> > >> > > >> >
> > >> > > >> > > Hey guys,
> > >> > > >> > >
> > >> > > >> > > I just saw that the build on Jenkins is failing. Any committer
> > >> > > >> > > interested in trying to troubleshoot?
> > >> > > >> > >
> > >> > > >> > > https://builds.apache.org/job/drill-scm/54
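
For anyone trying to reproduce the "Direct buffer memory" failure from the trace at the top of this thread outside of Drill, here is a minimal sketch. It is not Drill code: the class name DirectMemoryProbe, the probe method, and the chunk and cap sizes are all made up for illustration. It only shows that ByteBuffer.allocateDirect is limited by the JVM's direct memory setting (the -XX:MaxDirectMemorySize flag seen in the surefire command line), separately from the heap, and that buffers which stay reachable eventually make the JVM refuse further allocations.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Drill or Netty.
class DirectMemoryProbe {

    /**
     * Allocates direct buffers of chunkBytes each and keeps them reachable
     * until either capBytes is reached or the JVM refuses the allocation
     * with an OutOfMemoryError. Returns the number of bytes allocated.
     */
    static long probe(int chunkBytes, long capBytes) {
        List<ByteBuffer> held = new ArrayList<>();
        long allocated = 0;
        try {
            while (allocated + chunkBytes <= capBytes) {
                // Same JDK entry point as in the stack trace above.
                held.add(ByteBuffer.allocateDirect(chunkBytes));
                allocated += chunkBytes;
            }
        } catch (OutOfMemoryError e) {
            // The exact message wording varies by JDK version.
            System.out.println("Allocation refused: " + e.getMessage());
        }
        // Drop the references so the buffers can be reclaimed.
        held.clear();
        return allocated;
    }

    public static void main(String[] args) {
        int chunk = 1 << 20;   // 1 MiB per buffer (arbitrary)
        long cap = 32L << 20;  // stop at 32 MiB so the probe stays small
        long got = probe(chunk, cap);
        System.out.println("Allocated " + (got >> 20) + " MiB of direct memory (cap "
                + (cap >> 20) + " MiB)");
    }
}
```

Running it as `java -XX:MaxDirectMemorySize=16m DirectMemoryProbe` should stop short of the 32 MiB cap and print the refusal message. With default settings it should reach the cap. Whether the Drill tests hit the limit because of a leak or simply because of allocation size is not something this sketch can tell.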
