Elliot Metsger <emets...@gmail.com>
9:48 AM
to dev
Howdy folks,

Relative newbie to Spark, and super new to Beam. (I've asked this
question on the Beam lists, but this seems like a Spark-related issue, so
I'm trying my query here, too.) I'm attempting to get a simple Beam pipeline
(using the Go SDK) running on Spark. There seems to be an incompatibility
between Java components related to object serialization that prevents a
simple "hello world" pipeline from executing successfully. I'm really
looking for some direction on where to look, so if anyone has any pointers,
they'd be appreciated!

When I submit the job via the Go SDK, it errors out on the Spark side with:
22/08/25 12:45:59 ERROR TransportRequestHandler: Error while
invoking RpcHandler#receive() for one-way message.
java.io.InvalidClassException:
org.apache.spark.deploy.ApplicationDescription; local class incompatible:
stream classdesc serialVersionUID = 6543101073799644159, local class
serialVersionUID = 1574364215946805297
I'm using apache/beam_spark_job_server:2.41.0 and apache/spark:latest
(docker-compose[0], hello world wordcount example pipeline[1]).

It appears that the org.apache.spark.deploy.ApplicationDescription class
(or something in its object graph) doesn't explicitly declare a
serialVersionUID, so the JVM computes one from the class's structure, and
the computed value evidently differs between the Spark classes the job
server was built against and those in the apache/spark:latest image.
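For anyone less familiar with the mechanism: this sketch (class names are mine, not from Spark) shows how the JVM derives a serialVersionUID when a Serializable class doesn't declare one. The computed value depends on the class's structure, so two builds of the "same" class can disagree, which is exactly what InvalidClassException reports at deserialization time.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class SerialUidDemo {

    // No explicit serialVersionUID: the JVM computes one from the class
    // shape (fields, methods, interfaces), so it can change across builds.
    static class ImplicitUid implements Serializable {
        int value;
    }

    // Explicit serialVersionUID: stable no matter how the class is recompiled.
    static class ExplicitUid implements Serializable {
        private static final long serialVersionUID = 1L;
        int value;
    }

    public static void main(String[] args) {
        long implicit = ObjectStreamClass.lookup(ImplicitUid.class)
                .getSerialVersionUID();
        long explicit = ObjectStreamClass.lookup(ExplicitUid.class)
                .getSerialVersionUID();
        System.out.println("computed UID: " + implicit);
        System.out.println("declared UID: " + explicit); // prints 1
    }
}
```

Since ApplicationDescription relies on the computed UID, deserialization only works when both JVMs load byte-for-byte compatible versions of the class, i.e. matching Spark versions on both sides.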

This simple repo[2] should demonstrate the issue.  Any pointers would be
appreciated!

[0]: https://github.com/emetsger/beam-test/blob/develop/docker-compose.yml
[1]:
https://github.com/emetsger/beam-test/blob/develop/debugging_wordcount.go
[2]: https://github.com/emetsger/beam-test
