Re: ThrowableSerializationWrapper: Task exception could not be deserialized / ClassNotFoundException: org.apache.solr.common.SolrException
bq. have tried these settings with the hbase protocol jar, to no avail

In that case, HBaseZeroCopyByteString is contained in hbase-protocol.jar. In HBaseZeroCopyByteString, you can see:

package com.google.protobuf; // This is a lie.

If the protobuf jar is loaded ahead of hbase-protocol.jar, things start to get interesting ...

On Tue, Sep 29, 2015 at 6:12 PM, Dmitry Goldenberg wrote:
> Ted, I think I have tried these settings with the hbase protocol jar, to > no avail. > > I'm going to see if I can try and use these with this SolrException issue > though it now may be harder to reproduce it. Thanks for the suggestion. > > On Tue, Sep 29, 2015 at 8:03 PM, Ted Yu wrote: > >> Have you tried the following ? >> --conf spark.driver.userClassPathFirst=true --conf spark.executor. >> userClassPathFirst=true >> >> On Tue, Sep 29, 2015 at 4:38 PM, Dmitry Goldenberg < >> dgoldenberg...@gmail.com> wrote: >> >>> Release of Spark: 1.5.0. >>> >>> Command line invocation: >>> >>> ACME_INGEST_HOME=/mnt/acme/acme-ingest >>> ACME_INGEST_VERSION=0.0.1-SNAPSHOT >>> ACME_BATCH_DURATION_MILLIS=5000 >>> SPARK_MASTER_URL=spark://data1:7077 >>> JAVA_OPTIONS="-Dspark.streaming.kafka.maxRatePerPartition=1000" >>> JAVA_OPTIONS="$JAVA_OPTIONS -Dspark.executor.memory=2g" >>> >>> $SPARK_HOME/bin/spark-submit \ >>> --driver-class-path $ACME_INGEST_HOME \ >>> --driver-java-options "$JAVA_OPTIONS" \ >>> --class >>> "com.acme.consumer.kafka.spark.KafkaSparkStreamingDriver" \ >>> --master $SPARK_MASTER_URL \ >>> --conf >>> "spark.executor.extraClassPath=$ACME_INGEST_HOME/conf:$ACME_INGEST_HOME/lib/hbase-protocol-0.98.9-hadoop2.jar" >>> \ >>> >>> $ACME_INGEST_HOME/lib/acme-ingest-kafka-spark-$ACME_INGEST_VERSION.jar \ >>> -brokerlist $METADATA_BROKER_LIST \ >>> -topic acme.topic1 \ >>> -autooffsetreset largest \ >>> -batchdurationmillis $ACME_BATCH_DURATION_MILLIS \ >>> -appname Acme.App1 \ >>> -checkpointdir file://$SPARK_HOME/acme/checkpoint-acme-app1 >>> Note that SolrException is definitely in our consumer jar >>> acme-ingest-kafka-spark-$ACME_INGEST_VERSION.jar which gets deployed to >>> $ACME_INGEST_HOME. >>> >>> For the extraClassPath on the executors, we've got additionally >>> hbase-protocol-0.98.9-hadoop2.jar: we're using Apache Phoenix from the >>> Spark jobs to communicate with HBase. The only way to force Phoenix to >>> successfully communicate with HBase was to have that JAR explicitly added >>> to the executor classpath regardless of the fact that the contents of the >>> hbase-protocol hadoop jar get rolled up into the consumer jar at build time. >>> >>> I'm starting to wonder whether there's some class loading pattern here >>> where some classes may not get loaded out of the consumer jar and therefore >>> have to have their respective jars added to the executor extraClassPath? >>> >>> Or is this a serialization problem for SolrException as Divya >>> Ravichandran suggested? >>> >>> On Tue, Sep 29, 2015 at 6:16 PM, Ted Yu wrote: >>> Mind providing a bit more information: release of Spark command line for running Spark job Cheers On Tue, Sep 29, 2015 at 1:37 PM, Dmitry Goldenberg < dgoldenberg...@gmail.com> wrote: > We're seeing this occasionally. Granted, this was caused by a wrinkle > in the Solr schema but this bubbled up all the way in Spark and caused job > failures. > > I just checked and SolrException class is actually in the consumer job > jar we use. Is there any reason why Spark cannot find the SolrException > class?
> > 15/09/29 15:41:58 WARN ThrowableSerializationWrapper: Task exception > could not be deserialized > java.lang.ClassNotFoundException: org.apache.solr.common.SolrException > at java.net.URLClassLoader.findClass(URLClassLoader.java:381) > at java.lang.ClassLoader.loadClass(ClassLoader.java:424) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331) > at java.lang.ClassLoader.loadClass(ClassLoader.java:357) > at java.lang.Class.forName0(Native Method) > at java.lang.Class.forName(Class.java:348) > at > org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67) > at > java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613) > at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518) > at > java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774) > at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351) > at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371) > at > org.apache.spark.ThrowableSerializationWrapper.readObject(TaskEndReason.scala:163) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at >
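The class loader point above can be illustrated outside of Spark. The following is a minimal, self-contained Java sketch, not anything taken from this thread: the jar paths and the protobuf jar name/version are assumptions, and the only point is that HBaseZeroCopyByteString links cleanly when it and the protobuf classes come from the same loader, and fails when they do not.

import java.net.URL;
import java.net.URLClassLoader;

// Sketch only: HBaseZeroCopyByteString declares itself in package com.google.protobuf
// so it can reach protobuf's package-private classes. That only works when it and the
// protobuf classes are defined by the same class loader; if they come from different
// loaders (for example, child-first user classpath vs. the system classpath), linkage
// fails at runtime.
public class LoaderOrderSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical local paths -- adjust to wherever the jars actually live.
        URL hbaseProtocol = new URL("file:///mnt/acme/acme-ingest/lib/hbase-protocol-0.98.9-hadoop2.jar");
        URL protobuf = new URL("file:///mnt/acme/acme-ingest/lib/protobuf-java-2.5.0.jar");

        // Both jars visible to one loader: the class and its protobuf superclass share
        // a defining loader and a runtime package, so loading succeeds.
        try (URLClassLoader together = new URLClassLoader(new URL[] { hbaseProtocol, protobuf }, null)) {
            System.out.println(together.loadClass("com.google.protobuf.HBaseZeroCopyByteString"));
        }

        // Only hbase-protocol.jar visible: the protobuf superclass cannot be resolved,
        // so defining the class fails.
        try (URLClassLoader alone = new URLClassLoader(new URL[] { hbaseProtocol }, null)) {
            alone.loadClass("com.google.protobuf.HBaseZeroCopyByteString");
        } catch (LinkageError | ClassNotFoundException expected) {
            System.out.println("Failed as expected: " + expected);
        }
    }
}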
Re: ThrowableSerializationWrapper: Task exception could not be deserialized / ClassNotFoundException: org.apache.solr.common.SolrException
I believe I've had trouble with --conf spark.driver.userClassPathFirst=true --conf spark.executor.userClassPathFirst=true before, so these might not work... I was thinking of trying to add the solrj jar to spark.executor.extraClassPath...

On Wed, Sep 30, 2015 at 12:01 PM, Ted Yu wrote:
> bq. have tried these settings with the hbase protocol jar, to no avail > > In that case, HBaseZeroCopyByteString is contained in hbase-protocol.jar. > In HBaseZeroCopyByteString , you can see: > > package com.google.protobuf; // This is a lie. > > If protobuf jar is loaded ahead of hbase-protocol.jar, things start to get > interesting ... > > On Tue, Sep 29, 2015 at 6:12 PM, Dmitry Goldenberg < > dgoldenberg...@gmail.com> wrote: > >> Ted, I think I have tried these settings with the hbase protocol jar, to >> no avail. >> >> I'm going to see if I can try and use these with this SolrException issue >> though it now may be harder to reproduce it. Thanks for the suggestion. >> >> On Tue, Sep 29, 2015 at 8:03 PM, Ted Yu wrote: >> >>> Have you tried the following ? >>> --conf spark.driver.userClassPathFirst=true --conf spark.executor. >>> userClassPathFirst=true >>> >>> On Tue, Sep 29, 2015 at 4:38 PM, Dmitry Goldenberg < >>> dgoldenberg...@gmail.com> wrote: >>> Release of Spark: 1.5.0. Command line invocation: ACME_INGEST_HOME=/mnt/acme/acme-ingest ACME_INGEST_VERSION=0.0.1-SNAPSHOT ACME_BATCH_DURATION_MILLIS=5000 SPARK_MASTER_URL=spark://data1:7077 JAVA_OPTIONS="-Dspark.streaming.kafka.maxRatePerPartition=1000" JAVA_OPTIONS="$JAVA_OPTIONS -Dspark.executor.memory=2g" $SPARK_HOME/bin/spark-submit \ --driver-class-path $ACME_INGEST_HOME \ --driver-java-options "$JAVA_OPTIONS" \ --class "com.acme.consumer.kafka.spark.KafkaSparkStreamingDriver" \ --master $SPARK_MASTER_URL \ --conf "spark.executor.extraClassPath=$ACME_INGEST_HOME/conf:$ACME_INGEST_HOME/lib/hbase-protocol-0.98.9-hadoop2.jar" \ $ACME_INGEST_HOME/lib/acme-ingest-kafka-spark-$ACME_INGEST_VERSION.jar \ -brokerlist $METADATA_BROKER_LIST \ -topic acme.topic1 \ -autooffsetreset largest \ -batchdurationmillis $ACME_BATCH_DURATION_MILLIS \ -appname Acme.App1 \ -checkpointdir file://$SPARK_HOME/acme/checkpoint-acme-app1 Note that SolrException is definitely in our consumer jar acme-ingest-kafka-spark-$ACME_INGEST_VERSION.jar which gets deployed to $ACME_INGEST_HOME. For the extraClassPath on the executors, we've got additionally hbase-protocol-0.98.9-hadoop2.jar: we're using Apache Phoenix from the Spark jobs to communicate with HBase. The only way to force Phoenix to successfully communicate with HBase was to have that JAR explicitly added to the executor classpath regardless of the fact that the contents of the hbase-protocol hadoop jar get rolled up into the consumer jar at build time. I'm starting to wonder whether there's some class loading pattern here where some classes may not get loaded out of the consumer jar and therefore have to have their respective jars added to the executor extraClassPath? Or is this a serialization problem for SolrException as Divya Ravichandran suggested? On Tue, Sep 29, 2015 at 6:16 PM, Ted Yu wrote: > Mind providing a bit more information: > > release of Spark > command line for running Spark job > > Cheers > > On Tue, Sep 29, 2015 at 1:37 PM, Dmitry Goldenberg < > dgoldenberg...@gmail.com> wrote: > >> We're seeing this occasionally. Granted, this was caused by a wrinkle >> in the Solr schema but this bubbled up all the way in Spark and caused >> job >> failures.
>> >> I just checked and SolrException class is actually in the consumer >> job jar we use. Is there any reason why Spark cannot find the >> SolrException class? >> >> 15/09/29 15:41:58 WARN ThrowableSerializationWrapper: Task exception >> could not be deserialized >> java.lang.ClassNotFoundException: org.apache.solr.common.SolrException >> at java.net.URLClassLoader.findClass(URLClassLoader.java:381) >> at java.lang.ClassLoader.loadClass(ClassLoader.java:424) >> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331) >> at java.lang.ClassLoader.loadClass(ClassLoader.java:357) >> at java.lang.Class.forName0(Native Method) >> at java.lang.Class.forName(Class.java:348) >> at >> org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67) >> at >> java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613) >> at >>
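A sketch of that idea in code, not a verified fix: the solrj jar name/version and paths below are assumptions, and the same value could just as well be appended to the --conf spark.executor.extraClassPath argument already used in the spark-submit script quoted above.

import org.apache.spark.SparkConf;

// Sketch only: extend the executor extra classpath with the jar that contains
// org.apache.solr.common.SolrException, alongside the hbase-protocol jar.
public class ExecutorClassPathSketch {
    public static void main(String[] args) {
        String home = "/mnt/acme/acme-ingest"; // ACME_INGEST_HOME in the submit script
        SparkConf conf = new SparkConf()
            .set("spark.executor.extraClassPath",
                home + "/conf:"
                + home + "/lib/hbase-protocol-0.98.9-hadoop2.jar:"
                + home + "/lib/solr-solrj-4.10.3.jar"); // hypothetical solrj jar name/version
        // Possibly worth checking the driver side too: the warning in this thread comes
        // from TaskResultGetter, which deserializes the task failure in the driver process.
        System.out.println(conf.get("spark.executor.extraClassPath"));
    }
}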
Re: ThrowableSerializationWrapper: Task exception could not be deserialized / ClassNotFoundException: org.apache.solr.common.SolrException
Release of Spark: 1.5.0.

Command line invocation:

ACME_INGEST_HOME=/mnt/acme/acme-ingest
ACME_INGEST_VERSION=0.0.1-SNAPSHOT
ACME_BATCH_DURATION_MILLIS=5000
SPARK_MASTER_URL=spark://data1:7077
JAVA_OPTIONS="-Dspark.streaming.kafka.maxRatePerPartition=1000"
JAVA_OPTIONS="$JAVA_OPTIONS -Dspark.executor.memory=2g"

$SPARK_HOME/bin/spark-submit \
    --driver-class-path $ACME_INGEST_HOME \
    --driver-java-options "$JAVA_OPTIONS" \
    --class "com.acme.consumer.kafka.spark.KafkaSparkStreamingDriver" \
    --master $SPARK_MASTER_URL \
    --conf "spark.executor.extraClassPath=$ACME_INGEST_HOME/conf:$ACME_INGEST_HOME/lib/hbase-protocol-0.98.9-hadoop2.jar" \
    $ACME_INGEST_HOME/lib/acme-ingest-kafka-spark-$ACME_INGEST_VERSION.jar \
    -brokerlist $METADATA_BROKER_LIST \
    -topic acme.topic1 \
    -autooffsetreset largest \
    -batchdurationmillis $ACME_BATCH_DURATION_MILLIS \
    -appname Acme.App1 \
    -checkpointdir file://$SPARK_HOME/acme/checkpoint-acme-app1

Note that SolrException is definitely in our consumer jar acme-ingest-kafka-spark-$ACME_INGEST_VERSION.jar, which gets deployed to $ACME_INGEST_HOME.

For the extraClassPath on the executors, we've additionally got hbase-protocol-0.98.9-hadoop2.jar: we're using Apache Phoenix from the Spark jobs to communicate with HBase. The only way to force Phoenix to successfully communicate with HBase was to have that JAR explicitly added to the executor classpath, regardless of the fact that the contents of the hbase-protocol hadoop jar get rolled up into the consumer jar at build time.

I'm starting to wonder whether there's some class loading pattern here where some classes may not get loaded out of the consumer jar and therefore have to have their respective jars added to the executor extraClassPath?

Or is this a serialization problem for SolrException, as Divya Ravichandran suggested?

On Tue, Sep 29, 2015 at 6:16 PM, Ted Yu wrote:
> Mind providing a bit more information: > > release of Spark > command line for running Spark job > > Cheers > > On Tue, Sep 29, 2015 at 1:37 PM, Dmitry Goldenberg < > dgoldenberg...@gmail.com> wrote: > >> We're seeing this occasionally. Granted, this was caused by a wrinkle >> in the Solr schema but this bubbled up all the way in Spark and caused job >> failures. >> >> I just checked and SolrException class is actually in the consumer job >> jar we use. Is there any reason why Spark cannot find the SolrException >> class?
>> >> 15/09/29 15:41:58 WARN ThrowableSerializationWrapper: Task exception >> could not be deserialized >> java.lang.ClassNotFoundException: org.apache.solr.common.SolrException >> at java.net.URLClassLoader.findClass(URLClassLoader.java:381) >> at java.lang.ClassLoader.loadClass(ClassLoader.java:424) >> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331) >> at java.lang.ClassLoader.loadClass(ClassLoader.java:357) >> at java.lang.Class.forName0(Native Method) >> at java.lang.Class.forName(Class.java:348) >> at >> org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67) >> at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613) >> at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518) >> at >> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774) >> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351) >> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371) >> at >> org.apache.spark.ThrowableSerializationWrapper.readObject(TaskEndReason.scala:163) >> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >> at >> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) >> at >> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >> at java.lang.reflect.Method.invoke(Method.java:497) >> at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017) >> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1900) >> at >> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801) >> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351) >> at >> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000) >> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924) >> at >> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801) >> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351) >> at >> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000) >> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924) >> at >> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801) >> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351) >> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371) >> at >>
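One low-risk way to narrow this down is to log, from both the driver and from inside a task, which class loader and which jar SolrException is actually resolved from. A small sketch using only standard JDK calls; the class name is the one from the log above, everything else is illustrative:

public class WhereIsSolrException {
    public static void main(String[] args) {
        try {
            Class<?> c = Class.forName("org.apache.solr.common.SolrException");
            // Which loader defined it, and which jar or directory it came from.
            // getCodeSource() is null only for JDK classes; a jar-loaded class reports its path.
            System.out.println("loader  : " + c.getClassLoader());
            System.out.println("location: " + c.getProtectionDomain().getCodeSource().getLocation());
        } catch (ClassNotFoundException e) {
            System.out.println("SolrException is not visible to this class loader: " + e);
        }
    }
}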
Re: ThrowableSerializationWrapper: Task exception could not be deserialized / ClassNotFoundException: org.apache.solr.common.SolrException
Mind providing a bit more information:

release of Spark
command line for running Spark job

Cheers

On Tue, Sep 29, 2015 at 1:37 PM, Dmitry Goldenberg wrote:
> We're seeing this occasionally. Granted, this was caused by a wrinkle in > the Solr schema but this bubbled up all the way in Spark and caused job > failures. > > I just checked and SolrException class is actually in the consumer job jar > we use. Is there any reason why Spark cannot find the SolrException class? > > 15/09/29 15:41:58 WARN ThrowableSerializationWrapper: Task exception could > not be deserialized > java.lang.ClassNotFoundException: org.apache.solr.common.SolrException > at java.net.URLClassLoader.findClass(URLClassLoader.java:381) > at java.lang.ClassLoader.loadClass(ClassLoader.java:424) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331) > at java.lang.ClassLoader.loadClass(ClassLoader.java:357) > at java.lang.Class.forName0(Native Method) > at java.lang.Class.forName(Class.java:348) > at > org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67) > at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613) > at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518) > at > java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774) > at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351) > at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371) > at > org.apache.spark.ThrowableSerializationWrapper.readObject(TaskEndReason.scala:163) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017) > at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1900) > at > java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801) > at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351) > at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000) > at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924) > at > java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801) > at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351) > at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000) > at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924) > at > java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801) > at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351) > at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371) > at > org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:72) > at > org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:98) > at > org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply$mcV$sp(TaskResultGetter.scala:108) > at > org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply(TaskResultGetter.scala:105) > at > org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply(TaskResultGetter.scala:105) > at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699) > at > org.apache.spark.scheduler.TaskResultGetter$$anon$3.run(TaskResultGetter.scala:105) > at >
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) >
Re: ThrowableSerializationWrapper: Task exception could not be deserialized / ClassNotFoundException: org.apache.solr.common.SolrException
Have you tried the following?

--conf spark.driver.userClassPathFirst=true --conf spark.executor.userClassPathFirst=true

On Tue, Sep 29, 2015 at 4:38 PM, Dmitry Goldenberg wrote:
> Release of Spark: 1.5.0. > > Command line invocation: > > ACME_INGEST_HOME=/mnt/acme/acme-ingest > ACME_INGEST_VERSION=0.0.1-SNAPSHOT > ACME_BATCH_DURATION_MILLIS=5000 > SPARK_MASTER_URL=spark://data1:7077 > JAVA_OPTIONS="-Dspark.streaming.kafka.maxRatePerPartition=1000" > JAVA_OPTIONS="$JAVA_OPTIONS -Dspark.executor.memory=2g" > > $SPARK_HOME/bin/spark-submit \ > --driver-class-path $ACME_INGEST_HOME \ > --driver-java-options "$JAVA_OPTIONS" \ > --class "com.acme.consumer.kafka.spark.KafkaSparkStreamingDriver" \ > --master $SPARK_MASTER_URL \ > --conf > "spark.executor.extraClassPath=$ACME_INGEST_HOME/conf:$ACME_INGEST_HOME/lib/hbase-protocol-0.98.9-hadoop2.jar" > \ > > $ACME_INGEST_HOME/lib/acme-ingest-kafka-spark-$ACME_INGEST_VERSION.jar \ > -brokerlist $METADATA_BROKER_LIST \ > -topic acme.topic1 \ > -autooffsetreset largest \ > -batchdurationmillis $ACME_BATCH_DURATION_MILLIS \ > -appname Acme.App1 \ > -checkpointdir file://$SPARK_HOME/acme/checkpoint-acme-app1 > Note that SolrException is definitely in our consumer jar > acme-ingest-kafka-spark-$ACME_INGEST_VERSION.jar which gets deployed to > $ACME_INGEST_HOME. > > For the extraClassPath on the executors, we've got additionally > hbase-protocol-0.98.9-hadoop2.jar: we're using Apache Phoenix from the > Spark jobs to communicate with HBase. The only way to force Phoenix to > successfully communicate with HBase was to have that JAR explicitly added > to the executor classpath regardless of the fact that the contents of the > hbase-protocol hadoop jar get rolled up into the consumer jar at build time. > > I'm starting to wonder whether there's some class loading pattern here > where some classes may not get loaded out of the consumer jar and therefore > have to have their respective jars added to the executor extraClassPath? > > Or is this a serialization problem for SolrException as Divya > Ravichandran suggested? > > > > > On Tue, Sep 29, 2015 at 6:16 PM, Ted Yu wrote: > >> Mind providing a bit more information: >> >> release of Spark >> command line for running Spark job >> >> Cheers >> >> On Tue, Sep 29, 2015 at 1:37 PM, Dmitry Goldenberg < >> dgoldenberg...@gmail.com> wrote: >> >>> We're seeing this occasionally. Granted, this was caused by a wrinkle in >>> the Solr schema but this bubbled up all the way in Spark and caused job >>> failures. >>> >>> I just checked and SolrException class is actually in the consumer job >>> jar we use. Is there any reason why Spark cannot find the SolrException >>> class?
>>> >>> 15/09/29 15:41:58 WARN ThrowableSerializationWrapper: Task exception >>> could not be deserialized >>> java.lang.ClassNotFoundException: org.apache.solr.common.SolrException >>> at java.net.URLClassLoader.findClass(URLClassLoader.java:381) >>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424) >>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331) >>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357) >>> at java.lang.Class.forName0(Native Method) >>> at java.lang.Class.forName(Class.java:348) >>> at >>> org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67) >>> at >>> java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613) >>> at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518) >>> at >>> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774) >>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351) >>> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371) >>> at >>> org.apache.spark.ThrowableSerializationWrapper.readObject(TaskEndReason.scala:163) >>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >>> at >>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) >>> at >>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >>> at java.lang.reflect.Method.invoke(Method.java:497) >>> at >>> java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017) >>> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1900) >>> at >>> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801) >>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351) >>> at >>> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000) >>> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924) >>> at >>> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801) >>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351) >>> at >>> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000) >>>
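For reference, the same two settings in programmatic form, as a sketch (in Spark 1.5.x both are documented as experimental):

import org.apache.spark.SparkConf;

// Sketch: the same two flags as properties. Note the driver-side setting has to be
// known before the driver JVM starts, so in practice it is passed on spark-submit
// (--conf) or set in spark-defaults.conf rather than in application code.
public class UserClassPathFirstSketch {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
            .set("spark.driver.userClassPathFirst", "true")
            .set("spark.executor.userClassPathFirst", "true");
        System.out.println(conf.toDebugString());
    }
}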
ThrowableSerializationWrapper: Task exception could not be deserialized / ClassNotFoundException: org.apache.solr.common.SolrException
We're seeing this occasionally. Granted, this was caused by a wrinkle in the Solr schema but this bubbled up all the way in Spark and caused job failures.

I just checked and SolrException class is actually in the consumer job jar we use. Is there any reason why Spark cannot find the SolrException class?

15/09/29 15:41:58 WARN ThrowableSerializationWrapper: Task exception could not be deserialized
java.lang.ClassNotFoundException: org.apache.solr.common.SolrException
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
    at org.apache.spark.ThrowableSerializationWrapper.readObject(TaskEndReason.scala:163)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1900)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:72)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:98)
    at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply$mcV$sp(TaskResultGetter.scala:108)
    at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply(TaskResultGetter.scala:105)
    at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply(TaskResultGetter.scala:105)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
    at org.apache.spark.scheduler.TaskResultGetter$$anon$3.run(TaskResultGetter.scala:105)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
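For context on the warning itself: Java serialization fails this way when the deserializing side cannot resolve the class name through the class loader it uses, independently of whether the bytes themselves are valid. A minimal, self-contained Java sketch that reproduces the same ClassNotFoundException shape outside of Spark; the nested exception class is only a stand-in for SolrException, and the restricted loader stands in for a loader that does not have the consumer jar on its classpath.

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.net.URL;
import java.net.URLClassLoader;

// Sketch only: serialize an exception, then deserialize it through a class loader
// that cannot see the exception's class. The exception class is Serializable (every
// Throwable is), yet readObject still throws ClassNotFoundException.
public class ExceptionDeserializationSketch {

    // Stands in for org.apache.solr.common.SolrException.
    static class ConsumerJobException extends RuntimeException {
        ConsumerJobException(String msg) { super(msg); }
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new ConsumerJobException("wrinkle in the Solr schema"));
        }

        // A loader that only sees bootstrap classes.
        ClassLoader restricted = new URLClassLoader(new URL[0], null);

        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray())) {
                @Override
                protected Class<?> resolveClass(ObjectStreamClass desc)
                        throws IOException, ClassNotFoundException {
                    return Class.forName(desc.getName(), false, restricted);
                }
            }) {
            in.readObject();
        } catch (ClassNotFoundException e) {
            // Same shape as the log above: the class simply is not visible here.
            System.out.println("Task exception could not be deserialized: " + e);
        }
    }
}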
Re: ThrowableSerializationWrapper: Task exception could not be deserialized / ClassNotFoundException: org.apache.solr.common.SolrException
This could be because org.apache.solr.common.SolrException doesn't implement Serializable. This error shows up when Spark is deserializing a class which doesn't implement Serializable.

Thanks
Divya

On Sep 29, 2015 4:37 PM, "Dmitry Goldenberg" wrote:
> We're seeing this occasionally. Granted, this was caused by a wrinkle in > the Solr schema but this bubbled up all the way in Spark and caused job > failures. > > I just checked and SolrException class is actually in the consumer job jar > we use. Is there any reason why Spark cannot find the SolrException class? > > 15/09/29 15:41:58 WARN ThrowableSerializationWrapper: Task exception could > not be deserialized > java.lang.ClassNotFoundException: org.apache.solr.common.SolrException > at java.net.URLClassLoader.findClass(URLClassLoader.java:381) > at java.lang.ClassLoader.loadClass(ClassLoader.java:424) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331) > at java.lang.ClassLoader.loadClass(ClassLoader.java:357) > at java.lang.Class.forName0(Native Method) > at java.lang.Class.forName(Class.java:348) > at > org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67) > at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613) > at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518) > at > java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774) > at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351) > at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371) > at > org.apache.spark.ThrowableSerializationWrapper.readObject(TaskEndReason.scala:163) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017) > at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1900) > at > java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801) > at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351) > at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000) > at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924) > at > java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801) > at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351) > at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000) > at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924) > at > java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801) > at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351) > at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371) > at > org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:72) > at > org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:98) > at > org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply$mcV$sp(TaskResultGetter.scala:108) > at > org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply(TaskResultGetter.scala:105) > at > org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply(TaskResultGetter.scala:105) > at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699) > at >
org.apache.spark.scheduler.TaskResultGetter$$anon$3.run(TaskResultGetter.scala:105) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) >
Re: ThrowableSerializationWrapper: Task exception could not be deserialized / ClassNotFoundException: org.apache.solr.common.SolrException
I'm actually not sure how either of these would cause Spark to find SolrException. Whichever of the driver or executor classpath is searched first, shouldn't the class be found as long as it's in the consumer job jar?

On Tue, Sep 29, 2015 at 9:12 PM, Dmitry Goldenberg wrote:
> Ted, I think I have tried these settings with the hbase protocol jar, to > no avail. > > I'm going to see if I can try and use these with this SolrException issue > though it now may be harder to reproduce it. Thanks for the suggestion. > > On Tue, Sep 29, 2015 at 8:03 PM, Ted Yu wrote: > >> Have you tried the following ? >> --conf spark.driver.userClassPathFirst=true --conf spark.executor. >> userClassPathFirst=true >> >> On Tue, Sep 29, 2015 at 4:38 PM, Dmitry Goldenberg < >> dgoldenberg...@gmail.com> wrote: >> >>> Release of Spark: 1.5.0. >>> >>> Command line invocation: >>> >>> ACME_INGEST_HOME=/mnt/acme/acme-ingest >>> ACME_INGEST_VERSION=0.0.1-SNAPSHOT >>> ACME_BATCH_DURATION_MILLIS=5000 >>> SPARK_MASTER_URL=spark://data1:7077 >>> JAVA_OPTIONS="-Dspark.streaming.kafka.maxRatePerPartition=1000" >>> JAVA_OPTIONS="$JAVA_OPTIONS -Dspark.executor.memory=2g" >>> >>> $SPARK_HOME/bin/spark-submit \ >>> --driver-class-path $ACME_INGEST_HOME \ >>> --driver-java-options "$JAVA_OPTIONS" \ >>> --class >>> "com.acme.consumer.kafka.spark.KafkaSparkStreamingDriver" \ >>> --master $SPARK_MASTER_URL \ >>> --conf >>> "spark.executor.extraClassPath=$ACME_INGEST_HOME/conf:$ACME_INGEST_HOME/lib/hbase-protocol-0.98.9-hadoop2.jar" >>> \ >>> >>> $ACME_INGEST_HOME/lib/acme-ingest-kafka-spark-$ACME_INGEST_VERSION.jar \ >>> -brokerlist $METADATA_BROKER_LIST \ >>> -topic acme.topic1 \ >>> -autooffsetreset largest \ >>> -batchdurationmillis $ACME_BATCH_DURATION_MILLIS \ >>> -appname Acme.App1 \ >>> -checkpointdir file://$SPARK_HOME/acme/checkpoint-acme-app1 >>> Note that SolrException is definitely in our consumer jar >>> acme-ingest-kafka-spark-$ACME_INGEST_VERSION.jar which gets deployed to >>> $ACME_INGEST_HOME. >>> >>> For the extraClassPath on the executors, we've got additionally >>> hbase-protocol-0.98.9-hadoop2.jar: we're using Apache Phoenix from the >>> Spark jobs to communicate with HBase. The only way to force Phoenix to >>> successfully communicate with HBase was to have that JAR explicitly added >>> to the executor classpath regardless of the fact that the contents of the >>> hbase-protocol hadoop jar get rolled up into the consumer jar at build time. >>> >>> I'm starting to wonder whether there's some class loading pattern here >>> where some classes may not get loaded out of the consumer jar and therefore >>> have to have their respective jars added to the executor extraClassPath? >>> >>> Or is this a serialization problem for SolrException as Divya >>> Ravichandran suggested? >>> >>> On Tue, Sep 29, 2015 at 6:16 PM, Ted Yu wrote: >>> Mind providing a bit more information: release of Spark command line for running Spark job Cheers On Tue, Sep 29, 2015 at 1:37 PM, Dmitry Goldenberg < dgoldenberg...@gmail.com> wrote: > We're seeing this occasionally. Granted, this was caused by a wrinkle > in the Solr schema but this bubbled up all the way in Spark and caused job > failures. > > I just checked and SolrException class is actually in the consumer job > jar we use. Is there any reason why Spark cannot find the SolrException > class?
> > 15/09/29 15:41:58 WARN ThrowableSerializationWrapper: Task exception > could not be deserialized > java.lang.ClassNotFoundException: org.apache.solr.common.SolrException > at java.net.URLClassLoader.findClass(URLClassLoader.java:381) > at java.lang.ClassLoader.loadClass(ClassLoader.java:424) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331) > at java.lang.ClassLoader.loadClass(ClassLoader.java:357) > at java.lang.Class.forName0(Native Method) > at java.lang.Class.forName(Class.java:348) > at > org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67) > at > java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613) > at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518) > at > java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774) > at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351) > at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371) > at > org.apache.spark.ThrowableSerializationWrapper.readObject(TaskEndReason.scala:163) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at >
Re: ThrowableSerializationWrapper: Task exception could not be deserialized / ClassNotFoundException: org.apache.solr.common.SolrException
Ted, I think I have tried these settings with the hbase protocol jar, to no avail.

I'm going to see if I can try and use these with this SolrException issue though it now may be harder to reproduce it. Thanks for the suggestion.

On Tue, Sep 29, 2015 at 8:03 PM, Ted Yu wrote:
> Have you tried the following ? > --conf spark.driver.userClassPathFirst=true --conf spark.executor. > userClassPathFirst=true > > On Tue, Sep 29, 2015 at 4:38 PM, Dmitry Goldenberg < > dgoldenberg...@gmail.com> wrote: > >> Release of Spark: 1.5.0. >> >> Command line invocation: >> >> ACME_INGEST_HOME=/mnt/acme/acme-ingest >> ACME_INGEST_VERSION=0.0.1-SNAPSHOT >> ACME_BATCH_DURATION_MILLIS=5000 >> SPARK_MASTER_URL=spark://data1:7077 >> JAVA_OPTIONS="-Dspark.streaming.kafka.maxRatePerPartition=1000" >> JAVA_OPTIONS="$JAVA_OPTIONS -Dspark.executor.memory=2g" >> >> $SPARK_HOME/bin/spark-submit \ >> --driver-class-path $ACME_INGEST_HOME \ >> --driver-java-options "$JAVA_OPTIONS" \ >> --class "com.acme.consumer.kafka.spark.KafkaSparkStreamingDriver" >> \ >> --master $SPARK_MASTER_URL \ >> --conf >> "spark.executor.extraClassPath=$ACME_INGEST_HOME/conf:$ACME_INGEST_HOME/lib/hbase-protocol-0.98.9-hadoop2.jar" >> \ >> >> $ACME_INGEST_HOME/lib/acme-ingest-kafka-spark-$ACME_INGEST_VERSION.jar \ >> -brokerlist $METADATA_BROKER_LIST \ >> -topic acme.topic1 \ >> -autooffsetreset largest \ >> -batchdurationmillis $ACME_BATCH_DURATION_MILLIS \ >> -appname Acme.App1 \ >> -checkpointdir file://$SPARK_HOME/acme/checkpoint-acme-app1 >> Note that SolrException is definitely in our consumer jar >> acme-ingest-kafka-spark-$ACME_INGEST_VERSION.jar which gets deployed to >> $ACME_INGEST_HOME. >> >> For the extraClassPath on the executors, we've got additionally >> hbase-protocol-0.98.9-hadoop2.jar: we're using Apache Phoenix from the >> Spark jobs to communicate with HBase. The only way to force Phoenix to >> successfully communicate with HBase was to have that JAR explicitly added >> to the executor classpath regardless of the fact that the contents of the >> hbase-protocol hadoop jar get rolled up into the consumer jar at build time. >> >> I'm starting to wonder whether there's some class loading pattern here >> where some classes may not get loaded out of the consumer jar and therefore >> have to have their respective jars added to the executor extraClassPath? >> >> Or is this a serialization problem for SolrException as Divya >> Ravichandran suggested? >> >> On Tue, Sep 29, 2015 at 6:16 PM, Ted Yu wrote: >> >>> Mind providing a bit more information: >>> >>> release of Spark >>> command line for running Spark job >>> >>> Cheers >>> >>> On Tue, Sep 29, 2015 at 1:37 PM, Dmitry Goldenberg < >>> dgoldenberg...@gmail.com> wrote: >>> We're seeing this occasionally. Granted, this was caused by a wrinkle in the Solr schema but this bubbled up all the way in Spark and caused job failures. I just checked and SolrException class is actually in the consumer job jar we use. Is there any reason why Spark cannot find the SolrException class?
15/09/29 15:41:58 WARN ThrowableSerializationWrapper: Task exception could not be deserialized java.lang.ClassNotFoundException: org.apache.solr.common.SolrException at java.net.URLClassLoader.findClass(URLClassLoader.java:381) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:348) at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67) at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613) at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371) at org.apache.spark.ThrowableSerializationWrapper.readObject(TaskEndReason.scala:163) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1900) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801) at