I'm seeing the same failure, but manifesting as a StackOverflowError, on various operating systems and architectures (RHEL 7.1, CentOS 7.2, SUSE 12, Ubuntu 14.04 and 16.04 LTS).
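When the failure surfaces as an operating-system stack overflow, one thing worth trying before digging further is giving the Maven-forked JVMs a larger per-thread stack via MAVEN_OPTS. A minimal sketch; the specific values here are illustrative assumptions to tune, not what was actually used in this run:

```shell
# Hedged workaround sketch: raise the per-thread stack size (-Xss) and heap
# (-Xmx) for JVMs that Maven forks. All three values are assumptions.
export MAVEN_OPTS="-Xmx4g -Xss4m -XX:ReservedCodeCacheSize=512m"
echo "MAVEN_OPTS set to: $MAVEN_OPTS"
```

If the overflow goes away with a larger -Xss, that points at deep recursion (e.g. in code generation or compilation) rather than a heap problem.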
Build and test options:

    mvn -T 1C -Psparkr -Pyarn -Phadoop-2.7 -Phive -Phive-thriftserver -DskipTests clean package
    mvn -Pyarn -Phadoop-2.7 -Phive -Phive-thriftserver -Dtest.exclude.tags=org.apache.spark.tags.DockerTest -fn test -Xss2048k -Dspark.buffer.pageSize=1048576 -Xmx4g

Stacktrace (this is with IBM's latest SDK for Java 8):

scala> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): com.google.common.util.concurrent.ExecutionError: java.lang.StackOverflowError: operating system stack overflow
  at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2261)
  at com.google.common.cache.LocalCache.get(LocalCache.java:4000)
  at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4004)
  at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
  at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.compile(CodeGenerator.scala:849)
  at org.apache.spark.sql.catalyst.expressions.codegen.GenerateSafeProjection$.create(GenerateSafeProjection.scala:188)
  at org.apache.spark.sql.catalyst.expressions.codegen.GenerateSafeProjection$.create(GenerateSafeProjection.scala:36)
  at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.generate(CodeGenerator.scala:833)
  at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.generate(CodeGenerator.scala:830)
  at org.apache.spark.sql.execution.ObjectOperator$.deserializeRowToObject(objects.scala:137)
  ...
(omitted the rest for brevity)

It would also be useful to include this small but useful change, which looks to have only just missed the cut: https://github.com/apache/spark/pull/14409

From: Reynold Xin <r...@databricks.com>
To: Dongjoon Hyun <dongj...@apache.org>
Cc: "dev@spark.apache.org" <dev@spark.apache.org>
Date: 02/11/2016 18:37
Subject: Re: [VOTE] Release Apache Spark 2.0.2 (RC2)

Looks like there is an issue with Maven (likely just the test itself, though). We should look into it.

On Wed, Nov 2, 2016 at 11:32 AM, Dongjoon Hyun <dongj...@apache.org> wrote:

Hi, Sean.

The same failure blocks me, too.

- SPARK-18189: Fix serialization issue in KeyValueGroupedDataset *** FAILED ***

I used `-Pyarn -Phadoop-2.7 -Pkinesis-asl -Phive -Phive-thriftserver -Dsparkr` on CentOS 7 / OpenJDK 1.8.0_111.

Dongjoon.

On 2016-11-02 10:44 (-0700), Sean Owen <so...@cloudera.com> wrote:
> Sigs, license, etc. are OK. There are no Blockers for 2.0.2, though here are the 4 issues still open:
>
> SPARK-14387 Enable Hive-1.x ORC compatibility with spark.sql.hive.convertMetastoreOrc
> SPARK-17957 Calling outer join and na.fill(0) and then inner join will miss rows
> SPARK-17981 Incorrectly Set Nullability to False in FilterExec
> SPARK-18160 spark.files & spark.jars should not be passed to driver in yarn mode
>
> Running with Java 8, -Pyarn -Phive -Phive-thriftserver -Phadoop-2.7 on Ubuntu 16,
> I am seeing consistent failures in the test below. I think we very recently
> changed this, so it could be legitimate. But does anyone else see something
> like this? I have seen other failures in this test due to OOM, but my
> MAVEN_OPTS allows 6g of heap, which ought to be plenty.
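For anyone trying to narrow this down, the failing suite can be re-run in isolation using the scalatest filters that Spark's Maven build exposes. A sketch only: the module and suite name below are assumptions (the SPARK-18189 interpreter test appears to live in the REPL suites), not details taken from this thread:

```shell
# Hypothetical reproduction command: run a single ScalaTest suite in one
# module, skipping Java tests. Module (-pl repl) and suite name are assumptions.
MVN_CMD="mvn -pl repl -Pyarn -Phive -Phive-thriftserver -Phadoop-2.7"
MVN_CMD="$MVN_CMD -Dtest=none -DwildcardSuites=org.apache.spark.repl.ReplSuite test"
# Print the command rather than invoking Maven here:
echo "$MVN_CMD"
```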
>
> - SPARK-18189: Fix serialization issue in KeyValueGroupedDataset *** FAILED ***
>   isContain was true Interpreter output contained 'Exception':
>   Welcome to
>         ____              __
>        / __/__  ___ _____/ /__
>       _\ \/ _ \/ _ `/ __/  '_/
>      /___/ .__/\_,_/_/ /_/\_\   version 2.0.2
>         /_/
>
>   Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_102)
>   Type in expressions to have them evaluated.
>   Type :help for more information.
>
>   scala>
>   scala> keyValueGrouped: org.apache.spark.sql.KeyValueGroupedDataset[Int,(Int, Int)] = org.apache.spark.sql.KeyValueGroupedDataset@70c30f72
>
>   scala> mapGroups: org.apache.spark.sql.Dataset[(Int, Int)] = [_1: int, _2: int]
>
>   scala> broadcasted: org.apache.spark.broadcast.Broadcast[Int] = Broadcast(0)
>
>   scala>
>   scala> dataset: org.apache.spark.sql.Dataset[Int] = [value: int]
>
>   scala> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): com.google.common.util.concurrent.ExecutionError: java.lang.ClassCircularityError: io/netty/util/internal/__matchers__/org/apache/spark/network/protocol/MessageMatcher
>     at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2261)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:4000)
>     at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4004)
>     at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
>     at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.compile(CodeGenerator.scala:841)
>     at org.apache.spark.sql.catalyst.expressions.codegen.GenerateSafeProjection$.create(GenerateSafeProjection.scala:188)
>     at org.apache.spark.sql.catalyst.expressions.codegen.GenerateSafeProjection$.create(GenerateSafeProjection.scala:36)
>     at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.generate(CodeGenerator.scala:825)
>     at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.generate(CodeGenerator.scala:822)
>     at org.apache.spark.sql.execution.ObjectOperator$.deserializeRowToObject(objects.scala:137)
>     at org.apache.spark.sql.execution.AppendColumnsExec$$anonfun$9.apply(objects.scala:251)
>     at org.apache.spark.sql.execution.AppendColumnsExec$$anonfun$9.apply(objects.scala:250)
>     at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:803)
>     at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:803)
>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79)
>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
>     at org.apache.spark.scheduler.Task.run(Task.scala:86)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
>   Caused by: java.lang.ClassCircularityError: io/netty/util/internal/__matchers__/org/apache/spark/network/protocol/MessageMatcher
>     at java.lang.Class.forName0(Native Method)
>     at java.lang.Class.forName(Class.java:348)
>     at io.netty.util.internal.JavassistTypeParameterMatcherGenerator.generate(JavassistTypeParameterMatcherGenerator.java:62)
>     at io.netty.util.internal.JavassistTypeParameterMatcherGenerator.generate(JavassistTypeParameterMatcherGenerator.java:54)
>     at io.netty.util.internal.TypeParameterMatcher.get(TypeParameterMatcher.java:42)
>     at io.netty.util.internal.TypeParameterMatcher.find(TypeParameterMatcher.java:78)
>     at io.netty.handler.codec.MessageToMessageEncoder.<init>(MessageToMessageEncoder.java:60)
>     at org.apache.spark.network.protocol.MessageEncoder.<init>(MessageEncoder.java:34)
>     at org.apache.spark.network.TransportContext.<init>(TransportContext.java:78)
>     at org.apache.spark.rpc.netty.NettyRpcEnv.downloadClient(NettyRpcEnv.scala:354)
>     at org.apache.spark.rpc.netty.NettyRpcEnv.openChannel(NettyRpcEnv.scala:324)
>     at org.apache.spark.repl.ExecutorClassLoader.org$apache$spark$repl$ExecutorClassLoader$$getClassFileInputStreamFromSparkRPC(ExecutorClassLoader.scala:90)
>     at org.apache.spark.repl.ExecutorClassLoader$$anonfun$1.apply(ExecutorClassLoader.scala:57)
>     at org.apache.spark.repl.ExecutorClassLoader$$anonfun$1.apply(ExecutorClassLoader.scala:57)
>     at org.apache.spark.repl.ExecutorClassLoader.findClassLocally(ExecutorClassLoader.scala:161)
>     at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:80)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>     at java.lang.Class.forName0(Native Method)
>     at java.lang.Class.forName(Class.java:348)
>     at io.netty.util.internal.JavassistTypeParameterMatcherGenerator.generate(JavassistTypeParameterMatcherGenerator.java:62)
>     at io.netty.util.internal.JavassistTypeParameterMatcherGenerator.generate(JavassistTypeParameterMatcherGenerator.java:54)
>     at io.netty.util.internal.TypeParameterMatcher.get(TypeParameterMatcher.java:42)
>     at io.netty.util.internal.TypeParameterMatcher.find(TypeParameterMatcher.java:78)
>     at io.netty.handler.codec.MessageToMessageEncoder.<init>(MessageToMessageEncoder.java:60)
>     at org.apache.spark.network.protocol.MessageEncoder.<init>(MessageEncoder.java:34)
>     at org.apache.spark.network.TransportContext.<init>(TransportContext.java:78)
>     at org.apache.spark.rpc.netty.NettyRpcEnv.downloadClient(NettyRpcEnv.scala:354)
>     at org.apache.spark.rpc.netty.NettyRpcEnv.openChannel(NettyRpcEnv.scala:324)
>     at org.apache.spark.repl.ExecutorClassLoader.org$apache$spark$repl$ExecutorClassLoader$$getClassFileInputStreamFromSparkRPC(ExecutorClassLoader.scala:90)
>     at org.apache.spark.repl.ExecutorClassLoader$$anonfun$1.apply(ExecutorClassLoader.scala:57)
>     at org.apache.spark.repl.ExecutorClassLoader$$anonfun$1.apply(ExecutorClassLoader.scala:57)
>     at org.apache.spark.repl.ExecutorClassLoader.findClassLocally(ExecutorClassLoader.scala:161)
>     at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:80)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
>     at org.apache.spark.util.ParentClassLoader.loadClass(ParentClassLoader.scala:34)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>     at org.apache.spark.util.ParentClassLoader.loadClass(ParentClassLoader.scala:30)
>     at java.lang.Class.forName0(Native Method)
>     at java.lang.Class.forName(Class.java:348)
>     at org.codehaus.janino.ClassLoaderIClassLoader.findIClass(ClassLoaderIClassLoader.java:78)
>     at org.codehaus.janino.IClassLoader.loadIClass(IClassLoader.java:254)
>     at org.codehaus.janino.UnitCompiler.findTypeByName(UnitCompiler.java:6893)
>     at org.codehaus.janino.UnitCompiler.getReferenceType(UnitCompiler.java:5331)
>     at org.codehaus.janino.UnitCompiler.getReferenceType(UnitCompiler.java:5207)
>     at org.codehaus.janino.UnitCompiler.getType2(UnitCompiler.java:5188)
>     at org.codehaus.janino.UnitCompiler.access$12600(UnitCompiler.java:185)
>     at org.codehaus.janino.UnitCompiler$16.visitReferenceType(UnitCompiler.java:5119)
>     at org.codehaus.janino.Java$ReferenceType.accept(Java.java:2880)
>     at org.codehaus.janino.UnitCompiler.getType(UnitCompiler.java:5159)
>     at org.codehaus.janino.UnitCompiler.getType2(UnitCompiler.java:5414)
>     at org.codehaus.janino.UnitCompiler.access$12400(UnitCompiler.java:185)
>     at org.codehaus.janino.UnitCompiler$16.visitArrayType(UnitCompiler.java:5117)
>     at org.codehaus.janino.Java$ArrayType.accept(Java.java:2954)
>     at org.codehaus.janino.UnitCompiler.getType(UnitCompiler.java:5159)
>     at org.codehaus.janino.UnitCompiler.access$16700(UnitCompiler.java:185)
>     at org.codehaus.janino.UnitCompiler$31.getParameterTypes2(UnitCompiler.java:8533)
>     at org.codehaus.janino.IClass$IInvocable.getParameterTypes(IClass.java:835)
>     at org.codehaus.janino.IClass$IMethod.getDescriptor2(IClass.java:1063)
>     at org.codehaus.janino.IClass$IInvocable.getDescriptor(IClass.java:849)
>     at org.codehaus.janino.IClass.getIMethods(IClass.java:211)
>     at org.codehaus.janino.IClass.getIMethods(IClass.java:199)
>     at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:409)
>     at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:393)
>     at org.codehaus.janino.UnitCompiler.access$400(UnitCompiler.java:185)
>     at org.codehaus.janino.UnitCompiler$2.visitPackageMemberClassDeclaration(UnitCompiler.java:347)
>     at org.codehaus.janino.Java$PackageMemberClassDeclaration.accept(Java.java:1139)
>     at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:354)
>     at org.codehaus.janino.UnitCompiler.compileUnit(UnitCompiler.java:322)
>     at org.codehaus.janino.SimpleCompiler.compileToClassLoader(SimpleCompiler.java:383)
>     at org.codehaus.janino.ClassBodyEvaluator.compileToClass(ClassBodyEvaluator.java:315)
>     at org.codehaus.janino.ClassBodyEvaluator.cook(ClassBodyEvaluator.java:233)
>     at org.codehaus.janino.SimpleCompiler.cook(SimpleCompiler.java:192)
>     at org.codehaus.commons.compiler.Cookable.cook(Cookable.java:84)
>     at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.org$apache$spark$sql$catalyst$expressions$codegen$CodeGenerator$$doCompile(CodeGenerator.scala:887)
>     at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:950)
>     at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:947)
>     at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3599)
>     at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2379)
>     at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2342)
>     at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2257)
>     ... 26 more
>
> On Wed, Nov 2, 2016 at 4:52 AM Reynold Xin <r...@databricks.com> wrote:
> > Please vote on releasing the following candidate as Apache Spark version 2.0.2.
> > The vote is open until Fri, Nov 4, 2016 at 22:00 PDT and passes if a majority
> > of at least 3 +1 PMC votes are cast.
> >
> > [ ] +1 Release this package as Apache Spark 2.0.2
> > [ ] -1 Do not release this package because ...
> >
> > The tag to be voted on is v2.0.2-rc2 (a6abe1ee22141931614bf27a4f371c46d8379e33)
> >
> > This release candidate resolves 84 issues: https://s.apache.org/spark-2.0.2-jira
> >
> > The release files, including signatures, digests, etc. can be found at:
> > http://people.apache.org/~pwendell/spark-releases/spark-2.0.2-rc2-bin/
> >
> > Release artifacts are signed with the following key:
> > https://people.apache.org/keys/committer/pwendell.asc
> >
> > The staging repository for this release can be found at:
> > https://repository.apache.org/content/repositories/orgapachespark-1210/
> >
> > The documentation corresponding to this release can be found at:
> > http://people.apache.org/~pwendell/spark-releases/spark-2.0.2-rc2-docs/
> >
> > Q: How can I help test this release?
> > A: If you are a Spark user, you can help us test this release by taking an
> > existing Spark workload and running on this release candidate, then
> > reporting any regressions from 2.0.1.
> > Q: What justifies a -1 vote for this release?
> > A: This is a maintenance release in the 2.0.x series. Bugs already present
> > in 2.0.1, missing features, or bugs related to new features will not
> > necessarily block this release.
> >
> > Q: What fix version should I use for patches merging into branch-2.0 from now on?
> > A: Please mark the fix version as 2.0.3, rather than 2.0.2. If a new RC
> > (i.e. RC3) is cut, I will change the fix version of those patches to 2.0.2.

---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org

Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU