*   the jackson runtime dependencies should be updated manually (at least to 
2.9.2) in case of using Spark 2.x

Yes, that is exactly what we are looking to achieve. Any hints about how to do 
that? We're not Java experts. Do you happen to have a CI recipe or binary list 
for this particular configuration? Thank you!
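
As a starting point, here is a minimal sketch of how one might audit a Spark distribution's jars/ directory against the 2.9.2 minimum mentioned above. The function names, the regular expression, and the script name are assumptions made for illustration; nothing here comes from Beam or Spark tooling. Jackson 1.x jars (the `-asl` and `jaxrs` 1.9.13 artifacts) are skipped because they belong to the separate legacy `org.codehaus.jackson` line rather than the 2.x line.

```python
import re
import sys
from pathlib import Path

# Minimum Jackson 2.x version quoted from the Beam 2.33.0 release notes.
MIN_JACKSON = (2, 9, 2)

# Hypothetical pattern: "jackson-databind-2.6.7.3.jar" -> ("jackson-databind", "2.6.7.3")
JAR_RE = re.compile(r"^(jackson-[A-Za-z0-9_.\-]+?)-(\d+(?:\.\d+)+)\.jar$")


def parse_jar(name):
    """Return (artifact, version_tuple) for a Jackson jar name, else None."""
    match = JAR_RE.match(name)
    if match is None:
        return None
    artifact, version = match.groups()
    return artifact, tuple(int(part) for part in version.split("."))


def outdated_jackson_jars(names, minimum=MIN_JACKSON):
    """List Jackson 2.x jar names whose version is below `minimum`."""
    flagged = []
    for name in sorted(names):
        parsed = parse_jar(name)
        if parsed is None:
            continue  # not a Jackson jar
        _, version = parsed
        if version[0] < 2:
            continue  # Jackson 1.x is a separate legacy line; left alone here
        if version < minimum:
            flagged.append(name)
    return flagged


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumed usage: pass the path to the Spark jars directory.
    jars_dir = Path(sys.argv[1])
    for jar in outdated_jackson_jars(p.name for p in jars_dir.iterdir()):
        print(jar)
```

Saved under a hypothetical name such as check_jackson.py, it could be run as `python check_jackson.py spark-2.4.8-bin-hadoop2.7/jars`. It only reports candidates and does not replace anything. Choosing replacement jars still needs care, since the Jackson 2.x modules generally have to be upgraded together to matching versions.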


  *   use Spark 3.x if possible since it already provides jackson jars of 
version 2.10.0.

We tried this too but ran into other compatibility problems. It seems that the 
Beam Spark runner (in v2.37.0) only supports the Spark 2.x branch, as per the 
Beam docs: https://beam.apache.org/documentation/runners/spark/

any ideas?

On 2022/03/28 17:38:13 Alexey Romanenko wrote:
> Well, it’s caused by the recent Jackson version update in Beam [1], so the 
> jackson runtime dependencies should be updated manually (at least to 2.9.2) 
> in case of using Spark 2.x.
>
> Alternatively, use Spark 3.x if possible since it already provides jackson 
> jars of version 2.10.0.
>
> [1] https://github.com/apache/beam/commit/9694f70df1447e96684b665279679edafec13a0c
>
> —
> Alexey
>
> > On 28 Mar 2022, at 14:15, Florian Pinault <fl...@ecmwf.int> wrote:
> >
> > Greetings,
> >
> > We are setting up an Apache Beam cluster using Spark as a backend to run 
> > Python code. This is currently a toy example with 4 virtual machines 
> > running CentOS (a client, a Spark main, and two Spark workers).
> > We are running into version issues (details below) and need help deciding 
> > which versions to set up.
> > We are currently trying spark-2.4.8-bin-hadoop2.7, with the pip package 
> > beam 2.37.0 on the client, and using a job-server to create a Docker image.
> >
> > I saw here https://beam.apache.org/blog/beam-2.33.0/ 
> >  that "Spark 2.x users will need to update Spark's Jackson runtime 
> > dependencies (spark.jackson.version) to at least version 2.9.2, due to Beam 
> > updating its dependencies."
> >  But it looks like the jackson-core version in the job-server is 2.13.0 
> > whereas the jars in spark-2.4.8-bin-hadoop2.7/jars are
> > -. 1 mluser mluser 46986 May 8 2021 jackson-annotations-2.6.7.jar
> > -. 1 mluser mluser 258919 May 8 2021 jackson-core-2.6.7.jar
> > -. 1 mluser mluser 232248 May 8 2021 jackson-core-asl-1.9.13.jar
> > -. 1 mluser mluser 1166637 May 8 2021 jackson-databind-2.6.7.3.jar
> > -. 1 mluser mluser 320444 May 8 2021 jackson-dataformat-yaml-2.6.7.jar
> > -. 1 mluser mluser 18336 May 8 2021 jackson-jaxrs-1.9.13.jar
> > -. 1 mluser mluser 780664 May 8 2021 jackson-mapper-asl-1.9.13.jar
> > -. 1 mluser mluser 32612 May 8 2021 jackson-module-jaxb-annotations-2.6.7.jar
> > -. 1 mluser mluser 42858 May 8 2021 jackson-module-paranamer-2.7.9.jar
> > -. 1 mluser mluser 515645 May 8 2021 jackson-module-scala_2.11-2.6.7.1.jar
> >
> > There must be something to update, but I am not sure how to update these 
> > jar files with their dependencies, and not sure if this would get us very 
> > far.
> >
> > Would you have a list of binaries that work together or some running CI 
> > from the apache foundation similar to what we are trying to achieve?
>
>
