[ 
https://issues.apache.org/jira/browse/PIG-5200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15959783#comment-15959783
 ] 

Adam Szita commented on PIG-5200:
---------------------------------

[~kellyzly]: I've been investigating this further. It looks like 
{{spark.yarn.user.classpath.first}} doesn't help us, but {color:green} 
{{spark.executor.userClassPathFirst=true}} {color} does.
With this setting we'll utilise Spark's {{ChildFirstURLClassLoader}}, which 
favours the added jars over system classpath entries, and Kryo will set this 
classloader in its classloader field to resolve outputformat-related classes.
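
To illustrate the child-first delegation described above, here is a minimal 
Java sketch. It is not Spark's {{ChildFirstURLClassLoader}} itself: the names 
ChildFirstSketchLoader and ChildFirstDemo are made up for this example, and 
the only assumption carried over is the general idea (a null parent, with the 
real parent consulted only as a fallback).

```java
import java.net.URL;
import java.net.URLClassLoader;

/** Hypothetical child-first loader; a sketch of the idea, not Spark's class. */
class ChildFirstSketchLoader extends URLClassLoader {
    // The real parent is kept aside so that it is consulted only as a fallback.
    private final ClassLoader fallback;

    ChildFirstSketchLoader(URL[] userJars, ClassLoader fallback) {
        // A null parent means only bootstrap classes are visible ahead of the user jars.
        super(userJars, null);
        this.fallback = fallback;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        try {
            // First try the bootstrap classes and the user-supplied jars.
            return super.loadClass(name, resolve);
        } catch (ClassNotFoundException e) {
            // Only then fall back to the regular (system) classpath.
            return fallback.loadClass(name);
        }
    }
}

class ChildFirstDemo {
    public static void main(String[] args) throws Exception {
        ClassLoader app = ChildFirstDemo.class.getClassLoader();
        // No user jars here, so application classes can only come from the fallback.
        try (ChildFirstSketchLoader loader = new ChildFirstSketchLoader(new URL[0], app)) {
            Class<?> c = loader.loadClass(ChildFirstDemo.class.getName());
            System.out.println("resolved via fallback: " + (c == ChildFirstDemo.class));
            // Code that walks getParent() to reach the system loader finds nothing here.
            System.out.println("parent is null: " + (loader.getParent() == null));
        }
    }
}
```

The null parent is also what matters for libraries that walk the 
{{getParent()}} chain: from a loader built like this, that walk stops 
immediately instead of reaching the system classloader.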

One thing I've observed is that this setting doesn't work with old snappy-java 
versions (I had 1.0.4.1 installed on the cluster). This is due to the loading 
mechanism of snappy's native parts, which fails to go up to the root 
classloader because a {{ChildFirstURLClassLoader}} has null set as its parent.
Nevertheless, newer versions are okay; the one we depend on at compile time 
(1.1.1.3) works fine.

We're currently running both unit and e2e tests to verify that this fix 
([^PIG-5200.0.patch]) doesn't break other things.

> Orc_1 and Orc_Pushdown_* tests fail on Spark
> --------------------------------------------
>
>                 Key: PIG-5200
>                 URL: https://issues.apache.org/jira/browse/PIG-5200
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: Adam Szita
>            Assignee: Adam Szita
>             Fix For: spark-branch
>
>         Attachments: PIG-5200.0.patch
>
>
> Orc_1 and all of the Orc_Pushdown E2E tests produce the following exception:
> {code}
> 2017-03-27 03:16:50,293 [task-result-getter-1] WARN  
> org.apache.spark.scheduler.TaskSetManager - Lost task 0.0 in stage 1.0 (TID 
> 1, example-2.com, executor 1): java.lang.RuntimeException: 
> com.esotericsoftware.kryo.KryoException: Unable to find class:
>  org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$ExpressionTree
> Serialization trace:
> expression (org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl)
>         at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.initNextRecordReader(PigRecordReader.java:263)
>         at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.<init>(PigRecordReader.java:121)
>         at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.createRecordReader(PigInputFormat.java:117)
>         at 
> org.apache.pig.backend.hadoop.executionengine.spark.running.PigInputFormatSpark.createRecordReader(PigInputFormatSpark.java:64)
>         at 
> org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:166)
>         at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:133)
>         at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:65)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>         at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>         at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>         at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>         at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>         at org.apache.spark.scheduler.Task.run(Task.scala:89)
>         at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: com.esotericsoftware.kryo.KryoException: Unable to find class: 
> org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$ExpressionTree
> Serialization trace:
> expression (org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl)
>         at 
> com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
>         at 
> com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
>         at com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:610)
>         at 
> com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:599)
>         at 
> com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
>         at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:626)
>         at 
> org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl.fromKryo(SearchArgumentImpl.java:1006)
>         at 
> org.apache.hadoop.hive.ql.io.sarg.SearchArgumentFactory.create(SearchArgumentFactory.java:44)
>         at 
> org.apache.hadoop.hive.ql.io.sarg.SearchArgumentFactory.createFromConf(SearchArgumentFactory.java:52)
>         at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.setSearchArgument(OrcInputFormat.java:312)
>         at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.createReaderFromFile(OrcInputFormat.java:229)
>         at 
> org.apache.hadoop.hive.ql.io.orc.OrcNewInputFormat$OrcRecordReader.<init>(OrcNewInputFormat.java:69)
>         at 
> org.apache.hadoop.hive.ql.io.orc.OrcNewInputFormat.createRecordReader(OrcNewInputFormat.java:51)
>         at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.initNextRecordReader(PigRecordReader.java:253)
>         ... 23 more
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$ExpressionTree
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>         at java.lang.Class.forName0(Native Method)
>         at java.lang.Class.forName(Class.java:270)
>         at 
> com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:136)
>         ... 36 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)