[ https://issues.apache.org/jira/browse/SPARK-7154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513217#comment-14513217 ]

Dmitry Goldenberg edited comment on SPARK-7154 at 4/26/15 7:37 PM:
-------------------------------------------------------------------

The attached output from building Spark 1.3.1 includes the line
[INFO] Including com.google.protobuf:protobuf-java:jar:2.5.0 in the shaded jar.
while building the assembly. Yet the protobuf classes in the assembly are different.


was (Author: dgoldenberg):
This maven output includes
[INFO] Including com.google.protobuf:protobuf-java:jar:2.5.0 in the shaded jar.
for building the assembly. Yet the classes in protobuf are different.

> Spark distro appears to be pulling in incorrect protobuf classes
> ----------------------------------------------------------------
>
>                 Key: SPARK-7154
>                 URL: https://issues.apache.org/jira/browse/SPARK-7154
>             Project: Spark
>          Issue Type: Bug
>          Components: Build
>    Affects Versions: 1.3.0
>            Reporter: Dmitry Goldenberg
>         Attachments: in-google-protobuf-2.5.0.zip, 
> in-spark-1.3.1-local-build.zip, spark-1.3.1-local-build.txt
>
>
> If you download Spark from https://spark.apache.org/downloads.html (for 
> example, I chose 
> http://www.apache.org/dyn/closer.cgi/spark/spark-1.3.1/spark-1.3.1-bin-hadoop2.4.tgz),
> you may see incompatibilities with other libraries due to incorrect 
> protobuf classes.
> I'm seeing such a case in my Spark Streaming job, which attempts to use Apache 
> Phoenix to update records in HBase. The job is built with a protobuf 2.5.0 
> dependency. However, at runtime Spark's classes take precedence in class 
> loading, and that is causing exceptions such as the following:
> java.util.concurrent.ExecutionException: java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString
>         at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>         at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>         at org.apache.hadoop.hbase.client.HTable.coprocessorService(HTable.java:1620)
>         at org.apache.hadoop.hbase.client.HTable.coprocessorService(HTable.java:1577)
>         at org.apache.phoenix.query.ConnectionQueryServicesImpl.metaDataCoprocessorExec(ConnectionQueryServicesImpl.java:1007)
>         at org.apache.phoenix.query.ConnectionQueryServicesImpl.getTable(ConnectionQueryServicesImpl.java:1257)
>         at org.apache.phoenix.schema.MetaDataClient.updateCache(MetaDataClient.java:350)
>         at org.apache.phoenix.schema.MetaDataClient.updateCache(MetaDataClient.java:311)
>         at org.apache.phoenix.schema.MetaDataClient.updateCache(MetaDataClient.java:307)
>         at org.apache.phoenix.compile.FromCompiler$BaseColumnResolver.createTableRef(FromCompiler.java:333)
>         at org.apache.phoenix.compile.FromCompiler$SingleTableColumnResolver.<init>(FromCompiler.java:237)
>         at org.apache.phoenix.compile.FromCompiler$SingleTableColumnResolver.<init>(FromCompiler.java:231)
>         at org.apache.phoenix.compile.FromCompiler.getResolverForMutation(FromCompiler.java:207)
>         at org.apache.phoenix.compile.UpsertCompiler.compile(UpsertCompiler.java:248)
>         at org.apache.phoenix.jdbc.PhoenixStatement$ExecutableUpsertStatement.compilePlan(PhoenixStatement.java:503)
>         at org.apache.phoenix.jdbc.PhoenixStatement$ExecutableUpsertStatement.compilePlan(PhoenixStatement.java:494)
>         at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:295)
>         at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:288)
>         at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
>         at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:287)
>         at org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:219)
>         at org.apache.phoenix.jdbc.PhoenixPreparedStatement.execute(PhoenixPreparedStatement.java:174)
>         at org.apache.phoenix.jdbc.PhoenixPreparedStatement.execute(PhoenixPreparedStatement.java:179)
>         at com.kona.core.upload.persistence.hdfshbase.HUploadWorkqueueHelper.updateUploadWorkqueueEntry(HUploadWorkqueueHelper.java:139)
>         at com.kona.core.upload.persistence.hdfshbase.HdfsHbaseUploadPersistenceProvider.updateUploadWorkqueueEntry(HdfsHbaseUploadPersistenceProvider.java:144)
>         at com.kona.pipeline.sparkplug.error.UploadEntryErrorHandlerImpl.onError(UploadEntryErrorHandlerImpl.java:62)
>         at com.kona.pipeline.sparkplug.pipeline.KonaPipelineImpl.processError(KonaPipelineImpl.java:305)
>         at com.kona.pipeline.sparkplug.pipeline.KonaPipelineImpl.processPipelineDocument(KonaPipelineImpl.java:208)
>         at com.kona.pipeline.sparkplug.runner.KonaPipelineRunnerImpl.notifyItemReceived(KonaPipelineRunnerImpl.java:79)
>         at com.kona.pipeline.streaming.spark.ProcessPartitionFunction.call(ProcessPartitionFunction.java:83)
>         at com.kona.pipeline.streaming.spark.ProcessPartitionFunction.call(ProcessPartitionFunction.java:25)
>         at org.apache.spark.api.java.JavaRDDLike$$anonfun$foreachPartition$1.apply(JavaRDDLike.scala:198)
>         at org.apache.spark.api.java.JavaRDDLike$$anonfun$foreachPartition$1.apply(JavaRDDLike.scala:198)
>         at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:806)
>         at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:806)
>         at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1497)
>         at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1497)
>         at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>         at org.apache.spark.scheduler.Task.run(Task.scala:64)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString
>         at org.apache.phoenix.query.ConnectionQueryServicesImpl$7.call(ConnectionQueryServicesImpl.java:1265)
>         at org.apache.phoenix.query.ConnectionQueryServicesImpl$7.call(ConnectionQueryServicesImpl.java:1258)
>         at org.apache.hadoop.hbase.client.HTable$17.call(HTable.java:1608)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> If you look at the protobuf classes inside the Spark assembly jar, they do 
> not match the classes in the stock protobuf 2.5.0 jar (compare them with the 
> cmp command):
> BoundedByteString$1.class
> BoundedByteString$BoundedByteIterator.class
> BoundedByteString.class
> ByteString$1.class
> ByteString$ByteIterator.class
> ByteString$CodedBuilder.class
> ByteString$Output.class
> ByteString.class
> CodedInputStream.class
> CodedOutputStream$OutOfSpaceException.class
> CodedOutputStream.class
> LiteralByteString$1.class
> LiteralByteString$LiteralByteIterator.class
> LiteralByteString.class
> All of these are dependency classes for HBaseZeroCopyByteString, and they 
> are incompatible, which explains the java.lang.IllegalAccessError.
> What's not yet clear to me is how they can be wrong if the Spark pom 
> specifies 2.5.0:
>     <profile>
>       <id>hadoop-2.4</id>
>       <properties>
>         <hadoop.version>2.4.0</hadoop.version>
>         <protobuf.version>2.5.0</protobuf.version>
>         <jets3t.version>0.9.3</jets3t.version>
>         <hbase.version>0.98.7-hadoop2</hbase.version>
>         <commons.math3.version>3.1.1</commons.math3.version>
>         <avro.mapred.classifier>hadoop2</avro.mapred.classifier>
>         <codehaus.jackson.version>1.9.13</codehaus.jackson.version>
>       </properties>
>     </profile>
> This looks correct and in theory should override the 
> <protobuf.version>2.4.1</protobuf.version> specified higher up in the parent 
> pom (https://github.com/apache/spark/blob/master/pom.xml).

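To check whether the hadoop-2.4 profile's property really overrides the parent pom's 2.4.1 default, Maven itself can report the effective value and the resolved dependency graph. A sketch, assuming a Spark 1.3.1 source checkout as the working directory and mvn on the PATH (help:evaluate and dependency:tree are standard Maven plugin goals):

```shell
# Guard so the snippet degrades gracefully where Maven is absent.
if command -v mvn >/dev/null 2>&1; then
  # Print the effective value of the protobuf.version property
  # with the hadoop-2.4 profile active.
  mvn -Phadoop-2.4 help:evaluate -Dexpression=protobuf.version
  # Show which artifact pulls in com.google.protobuf, and at what version.
  mvn -Phadoop-2.4 dependency:tree -Dincludes=com.google.protobuf
else
  echo "mvn not found; skipping"
fi
```

If dependency:tree reports protobuf-java 2.5.0 but the assembly still contains mismatched classes, the divergence would have to be introduced later, e.g. during shading.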


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
