I used --conf spark.files.userClassPathFirst=true as a spark-shell option, and it still gave me the error java.lang.NoSuchFieldError: unknownFields when I use protobuf 3. The output says spark.files.userClassPathFirst is deprecated and suggests using spark.executor.userClassPathFirst instead. I tried that, and it did not work either.

Lan
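A minimal sketch of the invocation with both the executor- and driver-side flags; your_uber_jar is a placeholder. spark.driver.userClassPathFirst is the driver-side counterpart of the executor flag, and since spark-shell runs user code in the driver, the executor-side flag alone may not take effect there. Both flags are marked experimental, so this may or may not resolve the NoSuchFieldError:

    spark-shell --jars your_uber_jar \
      --conf spark.driver.userClassPathFirst=true \
      --conf spark.executor.userClassPathFirst=true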
> On Sep 15, 2015, at 10:31 AM, java8964 <java8...@hotmail.com> wrote:
>
> If you use Standalone mode, just start spark-shell like the following:
>
>     spark-shell --jars your_uber_jar --conf spark.files.userClassPathFirst=true
>
> Yong
>
> Date: Tue, 15 Sep 2015 09:33:40 -0500
> Subject: Re: Change protobuf version or any other third party library version in Spark application
> From: ljia...@gmail.com
> To: java8...@hotmail.com
> CC: ste...@hortonworks.com; user@spark.apache.org
>
> Steve,
>
> Thanks for the input. You are absolutely right. When I used protobuf 2.6.1, I also ran into "method not defined" errors. You suggest using the Maven shading strategy, but I have already built an uber jar that packages all my custom classes and their dependencies, including protobuf 3. The problem is how to configure spark-shell to use my uber jar first.
>
> java8964 -- I appreciate the link and I will try the configuration. It looks promising. However, the "user classpath first" attribute does not apply to spark-shell, am I correct?
>
> Lan
>
> On Tue, Sep 15, 2015 at 8:24 AM, java8964 <java8...@hotmail.com> wrote:
>
> It is a bad idea to change the major version of protobuf, as it most likely won't work.
>
> But if you really want to give it a try, set "user classpath first" so that the protobuf 3 coming with your jar will be used.
>
> The setting depends on your deployment mode; check this JIRA for the parameter:
>
> https://issues.apache.org/jira/browse/SPARK-2996
>
> Yong
>
> Subject: Re: Change protobuf version or any other third party library version in Spark application
> From: ste...@hortonworks.com
> To: ljia...@gmail.com
> CC: user@spark.apache.org
> Date: Tue, 15 Sep 2015 09:19:28 +0000
>
> > On 15 Sep 2015, at 05:47, Lan Jiang <ljia...@gmail.com> wrote:
> >
> > Hi, there,
> >
> > I am using Spark 1.4.1, which includes protobuf 2.5 by default. However, I would like to use protobuf 3 in my Spark application so that I can use new features such as map support. Is there any way to do that?
> >
> > Right now, if I build an uber jar with dependencies including the protobuf 3 classes and pass it to spark-shell through the --jars option, I get the error java.lang.NoSuchFieldError: unknownFields during execution.
>
> protobuf is an absolute nightmare version-wise, as protoc generates incompatible Java classes even across point versions. Hadoop 2.2+ is and will always be protobuf 2.5 only; that applies transitively to downstream projects (the great protobuf upgrade of 2013 was actually pushed by the HBase team, and required a co-ordinated change across multiple projects).
>
> > Is there any way to use a version of protobuf other than the default one included in the Spark distribution? I guess I can generalize and extend the question to any third-party library: how do you deal with a version conflict with a library included in the Spark distribution?
>
> maven shading is the strategy (see the sketch below).
> Generally it is less needed, though the troublesome binaries, across the entire Apache big data stack, are:
>
> google protobuf
> google guava
> kryo
> jackson
>
> You can generally bump up the other versions, at least by point releases.
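To make the shading suggestion concrete, here is a minimal sketch of a maven-shade-plugin relocation for the uber jar's pom.xml; the myapp.shaded prefix is a placeholder. Relocation rewrites the protobuf 3 classes and every reference to them inside the uber jar, so the application uses its own renamed copy while Spark and Hadoop keep their protobuf 2.5:

    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>2.4.1</version>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <relocations>
              <!-- Move protobuf 3 into a private namespace so it cannot
                   clash with the protobuf 2.5 shipped with Spark/Hadoop. -->
              <relocation>
                <pattern>com.google.protobuf</pattern>
                <shadedPattern>myapp.shaded.com.google.protobuf</shadedPattern>
              </relocation>
            </relocations>
          </configuration>
        </execution>
      </executions>
    </plugin>

With the relocation in place, the two protobuf versions no longer share class names, so the userClassPathFirst flags are not needed for this particular conflict.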