Hi all,

As you might know, we are working on Spark 2.x support in the Spark runner.

I'm working on a PR about that:

https://github.com/apache/beam/pull/3808

Today, we have something working with both Spark 1.x and 2.x from a code standpoint, but I still have to deal with the dependencies. This is only the first step of the update, as I'm still using RDDs; the second step would be to support DataFrames, but for that I would need PCollection elements with schemas, which is another topic that Eugene, Reuven, and I are discussing.
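To illustrate the dependency pain (this is plain Spark API, not the runner code itself): the entry points changed between the two versions, so a single codebase has to abstract over both. A minimal sketch (class name and settings are just for illustration):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.SparkSession;

public class EntryPoints {
  public static void main(String[] args) {
    // Spark 1.x style: RDD-centric JavaSparkContext
    SparkConf conf = new SparkConf().setMaster("local[2]").setAppName("beam-on-spark");
    JavaSparkContext jsc = new JavaSparkContext(conf);
    jsc.stop();

    // Spark 2.x style: SparkSession, the gateway to Datasets/DataFrames
    // (what the second step of the update would rely on)
    SparkSession session = SparkSession.builder()
        .master("local[2]")
        .appName("beam-on-spark")
        .getOrCreate();
    session.stop();
  }
}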

However, as all major distributions now ship Spark 2.x, I don't think it's necessary to support Spark 1.x anymore.

If we agree, I will update and clean up the PR to focus on Spark 2.x support only.

So, that's why I'm calling for a vote:

  [ ] +1 to drop Spark 1.x support and upgrade to Spark 2.x only
  [ ] 0 (I don't care ;))
  [ ] -1, I would like to keep Spark 1.x support, and so support both Spark 1.x and 2.x (please provide a specific comment)

This vote is open for 48 hours (I have the commits ready, and I'm just waiting for the vote to close before pushing them to the PR).

Thanks!
Regards
JB
--
Jean-Baptiste Onofré
[email protected]
http://blog.nanthrax.net
Talend - http://www.talend.com
