Elliot Metsger <[email protected]> 9:48 AM (7 hours ago) to dev

Howdy folks,
Relative newbie to Spark, and super new to Beam. (I've asked this question on the Beam lists, but this seems like a Spark-related issue, so I'm trying my query here too.)

I'm attempting to get a simple Beam pipeline (using the Go SDK) running on Spark. There appears to be an incompatibility between Java components related to object serialization which prevents a simple "hello world" pipeline from executing successfully. I'm really looking for some direction on where to look.

When I submit the job via the Go SDK, it errors out on the Spark side with:

    22/08/25 12:45:59 ERROR TransportRequestHandler: Error while invoking RpcHandler#receive() for one-way message.
    java.io.InvalidClassException: org.apache.spark.deploy.ApplicationDescription; local class incompatible: stream classdesc serialVersionUID = 6543101073799644159, local class serialVersionUID = 1574364215946805297

I'm using apache/beam_spark_job_server:2.41.0 and apache/spark:latest (docker-compose[0], hello world wordcount example pipeline[1]). It appears that the org.apache.spark.deploy.ApplicationDescription object (or something in its graph) doesn't explicitly assign a serialVersionUID, so the JVM computes one from the class structure, and the computed values on the two sides disagree.

This simple repo[2] should demonstrate the issue. Any pointers would be appreciated!

[0]: https://github.com/emetsger/beam-test/blob/develop/docker-compose.yml
[1]: https://github.com/emetsger/beam-test/blob/develop/debugging_wordcount.go
[2]: https://github.com/emetsger/beam-test
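For anyone following along, here is a minimal sketch of the serialVersionUID mechanism behind that InvalidClassException (the class names below are made up for illustration; they are not Spark classes). When a Serializable class omits an explicit serialVersionUID, the JVM derives one by hashing the class's structure, so two builds of the "same" class from different versions can produce different UIDs and refuse to deserialize each other's streams; an explicit declaration pins the value.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class SerialUidDemo {
    // No explicit serialVersionUID: the JVM derives one by hashing the class
    // structure (name, fields, method signatures). Recompiling a changed
    // version of this class yields a different UID, and deserializing an old
    // stream then fails with java.io.InvalidClassException.
    static class ImplicitUid implements Serializable {
        int value;
    }

    // Explicit serialVersionUID: the declared value is used verbatim, so the
    // stream format stays compatible across recompilations as long as the
    // declaration is kept.
    static class ExplicitUid implements Serializable {
        private static final long serialVersionUID = 42L;
        int value;
    }

    public static void main(String[] args) {
        long implicit = ObjectStreamClass.lookup(ImplicitUid.class).getSerialVersionUID();
        long explicit = ObjectStreamClass.lookup(ExplicitUid.class).getSerialVersionUID();
        System.out.println("computed serialVersionUID = " + implicit);
        System.out.println("declared serialVersionUID = " + explicit); // prints 42
    }
}
```

If I'm reading the error right, the job server's Spark classes and the cluster's Spark classes were built from different Spark versions, so their computed UIDs for ApplicationDescription differ, exactly as in the implicit case above.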
