RE: Change protobuf version or any other third party library version in Spark application

java8964 Tue, 15 Sep 2015 08:32:14 -0700

If you use standalone mode, just start spark-shell as follows:
spark-shell --jars your_uber_jar --conf spark.files.userClassPathFirst=true 
Yong
Date: Tue, 15 Sep 2015 09:33:40 -0500
Subject: Re: Change protobuf version or any other third party library version 
in Spark application
From: ljia...@gmail.com
To: java8...@hotmail.com
CC: ste...@hortonworks.com; user@spark.apache.org


Steve,
Thanks for the input. You are absolutely right. When I used protobuf 2.6.1, I 
also ran into method-not-defined errors. You suggest using the Maven shading 
strategy, but I have already built the uber jar to package all my custom 
classes and their dependencies, including protobuf 3. The problem is how to 
configure spark-shell to use my uber jar first. 
java8964 -- appreciate the link and I will try the configuration. Looks 
promising. However, the "user classpath first" attribute does not apply to 
spark-shell, am I correct? 

Lan
On Tue, Sep 15, 2015 at 8:24 AM, java8964 <java8...@hotmail.com> wrote:



Using a different major version of protobuf is a bad idea, as it most likely 
won't work.
But if you really want to give it a try, set "user classpath first" so that 
the protobuf 3 that comes with your jar is used.
The setting depends on your deployment mode; check this for the parameter:
https://issues.apache.org/jira/browse/SPARK-2996
Yong
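
A quick way to check whether "user classpath first" actually took effect is to ask the JVM which jar a protobuf class was loaded from. The following is a minimal sketch, not something posted in this thread: the class name ClassOrigin and its locate methods are hypothetical helpers, and com.google.protobuf.Message is only used as the default lookup target. It relies solely on standard JDK calls.

```java
import java.security.CodeSource;

public class ClassOrigin {

    /** Returns the location (jar or directory) the given class was loaded from. */
    public static String locate(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        // Classes loaded by the bootstrap loader (e.g. java.lang.String) have no code source.
        if (source == null || source.getLocation() == null) {
            return "no code source (JDK/bootstrap class)";
        }
        return source.getLocation().toString();
    }

    /** Looks the class up by name first, so a missing class is reported instead of thrown. */
    public static String locate(String className) {
        try {
            return locate(Class.forName(className));
        } catch (ClassNotFoundException e) {
            return "not on classpath";
        }
    }

    public static void main(String[] args) {
        // Hypothetical default: any protobuf class would do for this check.
        String name = args.length > 0 ? args[0] : "com.google.protobuf.Message";
        System.out.println(name + " -> " + locate(name));
    }
}
```

If this helper is packaged into the uber jar, calling ClassOrigin.locate("com.google.protobuf.Message") from spark-shell should print the uber jar's path when the user classpath wins; if it prints Spark's own jar instead, the bundled protobuf 2.5 is still being picked up.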

Subject: Re: Change protobuf version or any other third party library version 
in Spark application
From: ste...@hortonworks.com
To: ljia...@gmail.com
CC: user@spark.apache.org
Date: Tue, 15 Sep 2015 09:19:28 +0000

On 15 Sep 2015, at 05:47, Lan Jiang <ljia...@gmail.com> wrote:



Hi, there,



I am using Spark 1.4.1, which includes protobuf 2.5 by default. However, I 
would like to use protobuf 3 in my Spark application so that I can use some 
new features such as map support. Is there any way to do that? 

Right now, if I build an uber jar with dependencies including the protobuf 3 
classes and pass it to spark-shell through the --jars option, I get the error 
java.lang.NoSuchFieldError: unknownFields during execution. 

protobuf is an absolute nightmare version-wise, as protoc generates 
incompatible Java classes even across point versions. Hadoop 2.2+ is and will 
always be protobuf 2.5 only; that applies transitively to downstream projects 
(the great protobuf upgrade of 2013 was actually pushed by the HBase team, and 
required a co-ordinated change across multiple projects).

Is there any way to use a version of protobuf other than the default one 
included in the Spark distribution? I guess I can generalize the question to 
any third-party library: how do you deal with version conflicts for 
third-party libraries included in the Spark distribution? 

Maven shading is the strategy. It is generally not needed, though; the 
troublesome binaries, across the entire Apache big data stack, are:

google protobuf
google guava
kryo
jackson



you can generally bump up the other versions, at least by point releases.
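
For reference, shading here means relocating the conflicting packages inside the uber jar so that they no longer collide with the copies shipped with Spark. A minimal sketch of a maven-shade-plugin relocation for protobuf follows. The target prefix myapp.shaded is a made-up name, and the plugin version is left out on purpose, so treat this as an illustration rather than a tested build.

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <!-- version omitted in this sketch; pin one in a real build -->
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <relocation>
            <!-- rewrite protobuf classes, and references to them, in the uber jar -->
            <pattern>com.google.protobuf</pattern>
            <!-- hypothetical prefix; any package not used by Spark works -->
            <shadedPattern>myapp.shaded.com.google.protobuf</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

With a relocation like this, the application's generated protobuf 3 classes refer to the renamed package, so which protobuf Spark itself loads should no longer matter to them.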
