[ ] Use Spark 1 & Spark 2 Support Branch
[X] Use Spark 2 Only Branch
On Fri, Nov 17, 2017 at 9:46 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> [ ] Use Spark 1 & Spark 2 Support Branch
> [X] Use Spark 2 Only Branch
>
> On Thu, Nov 16, 2017 at 5:08 AM, Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
> > Hi guys,
> >
> > To illustrate the current discussion about Spark version support, you can
> > take a look at:
> >
> > --
> > Spark 1 & Spark 2 Support Branch
> >
> > https://github.com/jbonofre/beam/tree/BEAM-1920-SPARK2-MODULES
> >
> > This branch contains a Spark runner common module compatible with both
> > Spark 1.x and 2.x. For convenience, we introduced spark1 & spark2
> > modules/artifacts containing just a pom.xml to define the dependency set.
> >
> > --
> > Spark 2 Only Branch
> >
> > https://github.com/jbonofre/beam/tree/BEAM-1920-SPARK2-ONLY
> >
> > This branch is an upgrade to Spark 2.x and drops support for Spark 1.x.
> >
> > As I'm ready to merge one or the other in the PR, I would like to
> > complete the vote/discussion pretty soon.
> >
> > Correct me if I'm wrong, but it seems that the preference is to drop
> > Spark 1.x and focus only on Spark 2.x (i.e. the Spark 2 Only Branch).
> >
> > I would like to call a final vote to confirm the merge I will do:
> >
> > [ ] Use Spark 1 & Spark 2 Support Branch
> > [ ] Use Spark 2 Only Branch
> >
> > This informal vote is open for 48 hours.
> >
> > Please let me know what your preference is.
> >
> > Thanks!
> > Regards
> > JB
> >
> > On 11/13/2017 09:32 AM, Jean-Baptiste Onofré wrote:
> >> Hi Beamers,
> >>
> >> I'm forwarding this discussion & vote from the dev mailing list to the
> >> user mailing list. The goal is to have your feedback as users.
> >>
> >> Basically, we have two options:
> >> 1. Right now, in the PR, we support both Spark 1.x and 2.x using three
> >> artifacts (common, spark1, spark2). You, as users, pick spark1 or
> >> spark2 in your dependency set depending on the Spark version you target.
> >> 2.
> >> The other option is to upgrade and focus on Spark 2.x in Beam 2.3.0.
> >> If you still want to use Spark 1.x, then you will be stuck at Beam 2.2.0.
> >>
> >> Thoughts?
> >>
> >> Thanks!
> >> Regards
> >> JB
> >>
> >>
> >> -------- Forwarded Message --------
> >> Subject: [VOTE] Drop Spark 1.x support to focus on Spark 2.x
> >> Date: Wed, 8 Nov 2017 08:27:58 +0100
> >> From: Jean-Baptiste Onofré <j...@nanthrax.net>
> >> Reply-To: dev@beam.apache.org
> >> To: dev@beam.apache.org
> >>
> >> Hi all,
> >>
> >> As you might know, we are working on Spark 2.x support in the Spark
> >> runner.
> >>
> >> I'm working on a PR about that:
> >>
> >> https://github.com/apache/beam/pull/3808
> >>
> >> Today, we have something working with both Spark 1.x and 2.x from a code
> >> standpoint, but I have to deal with dependencies. It's the first step of
> >> the update, as I'm still using RDDs; the second step would be to support
> >> DataFrames (but for that, I would need PCollection elements with schemas,
> >> which is another topic Eugene, Reuven, and I are discussing).
> >>
> >> However, as all major distributions now ship Spark 2.x, I don't think
> >> it's necessary anymore to support Spark 1.x.
> >>
> >> If we agree, I will update and clean up the PR to only support and
> >> focus on Spark 2.x.
> >>
> >> So, that's why I'm calling for a vote:
> >>
> >> [ ] +1 to drop Spark 1.x support and upgrade to Spark 2.x only
> >> [ ] 0 (I don't care ;))
> >> [ ] -1, I would like to still support Spark 1.x, and so have support
> >> of both Spark 1.x and 2.x (please provide a specific comment)
> >>
> >> This vote is open for 48 hours (I have the commits ready, just waiting
> >> for the end of the vote to push to the PR).
> >>
> >> Thanks!
> >> Regards
> >> JB
> >
> > --
> > Jean-Baptiste Onofré
> > jbono...@apache.org
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com

--
-Ben
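[Editor's note] Under option 1 discussed above, a user would choose the Spark major version purely through their dependency set, by depending on the spark1 or spark2 artifact. A minimal Maven sketch of what that selection might look like; the artifact IDs and version below are illustrative assumptions, not confirmed coordinates from the branch:

```xml
<!-- Hypothetical sketch: artifact IDs beam-runners-spark1/beam-runners-spark2
     and the version are assumptions for illustration only. -->
<dependency>
  <groupId>org.apache.beam</groupId>
  <!-- Pick exactly one: beam-runners-spark1 for a Spark 1.x cluster,
       beam-runners-spark2 for Spark 2.x. -->
  <artifactId>beam-runners-spark2</artifactId>
  <version>2.3.0</version>
</dependency>
```

The appeal of this layout is that the runner code lives in a shared common module, and each thin spark1/spark2 module carries only the pom.xml pinning the matching Spark dependency set.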