Re: [VOTE] Release 2.2.0, release candidate #4

2017-11-19 Thread Lukasz Cwik
Eugene, you can set up your ~/.m2/settings.xml to point to the repository
containing the release candidate.

  
   
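<settings>
  <profiles>
    <profile>
      <id>release-repo</id>
      <activation>
        <activeByDefault>true</activeByDefault>
      </activation>
      <repositories>
        <repository>
          <id>Release 2.2.0 RC4</id>
          <name>Release 2.2.0 RC4</name>
          <url>https://repository.apache.org/content/repositories/orgapachebeam-1025/</url>
        </repository>
      </repositories>
    </profile>
  </profiles>
</settings>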
   
 
   
  


The URL for the release candidate is always part of the vote e-mail.
For more details about having multiple repositories, take a look at
https://maven.apache.org/guides/mini/guide-multiple-repositories.html
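
As a rough illustration of the settings file described above, the Python
sketch below builds the same kind of profile with the standard-library
xml.etree.ElementTree module. The element names follow the Maven settings
schema; the function name build_settings and its parameters are made up for
this example, and the staging URL is the one quoted in this message.

```python
import xml.etree.ElementTree as ET

# Staging repository URL copied from this message (Release 2.2.0 RC4).
STAGING_URL = "https://repository.apache.org/content/repositories/orgapachebeam-1025/"


def build_settings(repo_url, profile_id="release-repo", repo_name="Release 2.2.0 RC4"):
    """Return a Maven settings.xml document (as a string) containing one
    profile that is active by default and adds a single extra repository.

    build_settings and its parameter names are hypothetical helpers for
    this sketch, not part of Maven or Beam.
    """
    settings = ET.Element("settings")
    profiles = ET.SubElement(settings, "profiles")
    profile = ET.SubElement(profiles, "profile")
    ET.SubElement(profile, "id").text = profile_id

    # activeByDefault=true means no -P flag is needed on the mvn command line.
    activation = ET.SubElement(profile, "activation")
    ET.SubElement(activation, "activeByDefault").text = "true"

    repositories = ET.SubElement(profile, "repositories")
    repository = ET.SubElement(repositories, "repository")
    ET.SubElement(repository, "id").text = repo_name
    ET.SubElement(repository, "name").text = repo_name
    ET.SubElement(repository, "url").text = repo_url

    return ET.tostring(settings, encoding="unicode")


if __name__ == "__main__":
    print(build_settings(STAGING_URL))
```

Writing the printed XML to a separate file and passing it to Maven with
"mvn -s <file>" (the option for an alternate user settings file) would avoid
editing ~/.m2/settings.xml for a one-off RC validation.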

On Fri, Nov 17, 2017 at 5:09 PM, Reuven Lax 
wrote:

> hmmm, I thought I removed those generated files from the zip file before
> sending this email. Let me check again.
>
> Reuven
>
> On Sat, Nov 18, 2017 at 8:52 AM, Robert Bradshaw <
> rober...@google.com.invalid> wrote:
>
> > The source distribution contains a couple of files not on github (e.g.
> > folders that were added on master, Python generated files). The pom
> > files differed only by missing -SNAPSHOT, other than that presumably
> > the source release should just be "wget
> > https://github.com/apache/beam/archive/release-2.2.0.zip"?
> >
> > diff -rq apache-beam-2.2.0 beam/ | grep -v pom.xml
> >
> > # OK?
> >
> > Only in apache-beam-2.2.0: DEPENDENCIES
> >
> > # Expected.
> >
> > Only in beam/: .git
> > Only in beam/: .gitattributes
> > Only in beam/: .gitignore
> >
> > # These folders are probably from switching around between master and
> > git branches.
> >
> > Only in apache-beam-2.2.0: model
> > Only in apache-beam-2.2.0/runners/flink: examples
> > Only in apache-beam-2.2.0/runners/flink: runner
> > Only in apache-beam-2.2.0/runners/gearpump: jarstore
> > Only in apache-beam-2.2.0/sdks/java/extensions: gcp-core
> > Only in apache-beam-2.2.0/sdks/java/extensions: sketching
> > Only in apache-beam-2.2.0/sdks/java/io: file-based-io-tests
> > Only in apache-beam-2.2.0/sdks/java/io: hdfs
> > Only in apache-beam-2.2.0/sdks/java/maven-archetypes/examples/src/
> > main/resources/archetype-resources:
> > src
> > Only in apache-beam-2.2.0/sdks/java/maven-archetypes/examples-
> > java8/src/main/resources/archetype-resources:
> > src
> > Only in apache-beam-2.2.0/sdks/java: microbenchmarks
> >
> > # Here's the generated protos.
> >
> > Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api:
> > beam_artifact_api_pb2_grpc.py
> > Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api:
> > beam_artifact_api_pb2.py
> > Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api:
> > beam_fn_api_pb2_grpc.py
> > Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api:
> > beam_fn_api_pb2.py
> > Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api:
> > beam_job_api_pb2_grpc.py
> > Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api:
> > beam_job_api_pb2.py
> > Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api:
> > beam_provision_api_pb2_grpc.py
> > Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api:
> > beam_provision_api_pb2.py
> > Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api:
> > beam_runner_api_pb2_grpc.py
> > Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api:
> > beam_runner_api_pb2.py
> > Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api:
> > endpoints_pb2_grpc.py
> > Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api:
> > endpoints_pb2.py
> > Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api:
> > standard_window_fns_pb2_grpc.py
> > Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api:
> > standard_window_fns_pb2.py
> >
> > And some other sdist generated Python files.
> >
> > Only in apache-beam-2.2.0/sdks/python: .eggs
> > Only in apache-beam-2.2.0/sdks/python: LICENSE
> > Only in apache-beam-2.2.0/sdks/python: NOTICE
> > Only in apache-beam-2.2.0/sdks/python: README.md
> >
> > Presumably we should just purge these files from the rc?
> >
> >
> > FWIW, the Python tarball looks fine.
> >
> > On Fri, Nov 17, 2017 at 4:40 PM, Eugene Kirpichov
> >  wrote:
> > > How can I specify a dependency on the staged RC? E.g. I'm trying to
> > > validate the quickstart per
> > > https://beam.apache.org/get-started/quickstart-java/ and specifying
> > version
> > > 2.2.0 doesn't work I suppose because it's not released yet. Should I
> pass
> > > some command-line flag to mvn to make it fetch the version from the
> > staging
> > > area?
> > >
> > > On Fri, Nov 17, 2017 at 4:37 PM Lukasz Cwik 
> > > wrote:
> > >
> > >> It's open to all; it's just that there are binding votes and non-binding
> > >> votes.
> > >>
> > >> On Fri, Nov 17, 2017 at 4:26 PM, Valentyn Tymofieiev <
> > >> valen...@google.com.invalid> wrote:
> > >>
> > >> > I have a process question: is the vote open for committers only or
> for
> > >> all
> > >> > contributors?
> > >> >
> > >> > On Fri, Nov 17, 2017 at 

Re: [DISCUSS] Move away from Apache Maven as build tool

2017-11-19 Thread Jean-Baptiste Onofré

Thanks for the update Luke.

I'm updating my local working copy to do new tests.

Regards
JB

On 11/19/2017 08:21 PM, Lukasz Cwik wrote:

The Gradle build rules have been merged; I'm adding a precommit[1] to start
collecting data about the build times. It currently only mirrors the Java
mvn install precommit. I'll gather data over the next two weeks and provide
a summary here.

You can rerun the precommit by issuing "Run Java Gradle PreCommit".

1: https://github.com/apache/beam/pull/4146


On Mon, Nov 13, 2017 at 9:08 AM, Lukasz Cwik wrote:

There has been plenty of time for comments on the PR and the approach.

So far Ken Knowles has provided the most feedback on the PR. Ken, would you
like to finish the review?

On Fri, Nov 10, 2017 at 1:22 PM, Romain Manni-Bucau wrote:

This is only a setup thing, and it is better not to break the master history
for PoC/tests, in particular when the change is not very localized. An
alternative is to ask infra for another temp repo and keep the two in sync,
but I don't think it is worth it personally.

On Nov 10, 2017 18:57, "Lukasz Cwik" wrote:

The reason to get it on master is because that is where all the PRs are. An
upstream branch without any development means no data.
Also, our Jenkins setup via job-dsl doesn't honor using the Jenkins
configuration on the branch because the seed job always runs against
master.

On Thu, Nov 9, 2017 at 9:59 PM, Romain Manni-Bucau <rmannibu...@gmail.com>
wrote:


What about pushing it on an "upstream" branch and testing it for 1 week in
parallel with the maven reference build? If gradle is always 50% faster on
jenkins then it could become master setup without much discussion I guess.
We can even have 2 jenkins jobs: one with the daemon etc and one without.

Also noticed yesterday that gradle build is killing my machine (all 8 cores
are 100%) during the first minutes vs maven build which lets me do something
else. Then all the consumed time which makes gradle not that fast is about
python. Will try to send figures later today.

On Nov 10, 2017 00:10, "Lukasz Cwik" wrote:

I wouldn't mind merging this change in so I could set up those Gradle
Jenkins precommits.

As per our contribution guidelines, any committer willing to sign off on
the PR?

On Thu, Nov 9, 2017 at 2:12 PM, Romain Manni-Bucau <rmannibu...@gmail.com>
wrote:

On Nov 9, 2017 21:31, "Kenneth Knowles" wrote:


Keep in mind that a clean build is unusual during development (it is common
for mvn use and that is a bug) and also not necessary for precommits if the
build tool is correct enough that caching is safe. So while this number
matters, it is not the most important.


Not sure, in dev you bypass the build tool most of the time anyway - thanks
to IDE or other shortcuts - but not on PR and CI. Keep in mind that not
doing a clean and killing gradle daemon makes the build not reproducible
and therefore useful :(. Starting to build from a subpart of the reactor -
with the mentioned mvn plugin for instance - can be nice on some CI like
travis if the caching is well configured but still not a guarantee the
build is "green".

My trade off is to ensure an easy build and relevant result over the time
criteria. Do you share it as well or prefer time over other criteria -
which leads to other conclusions and options indeed and can make us not
understanding each other?


On Thu, Nov 9, 2017 at 11:30 AM, Romain Manni-Bucau <rmannibu...@gmail.com>
wrote:


I will try next week yes but the 2 runs I did were 28 min vs 32 min from
memory - after having downloaded all deps once.

On Nov 9, 2017 19:45, "Lukasz Cwik" wrote:

If Gradle was slow, do you mind running the build with --profile and
sharing that and also sharing the Maven build log?

On Thu, Nov 9, 2017 at 10:43 AM, Lukasz Cwik <lc...@google.com> wrote:

Romain, I don't understand your last comment, were you trying to say that
you had the same Gradle build times like I did and it was an improvement
over Maven or that you did not and you experienced build times that were
equivalent to Maven?

On Thu, Nov 9, 2017 at 9:51 AM, Romain Manni-Bucau <rmannibu...@gmail.com>
wrote:


2017-11-09 18:38 GMT+01:00 Kenneth Knowles



wrote:


(this is another topic so we can maybe open another thread) issue is
not much about python but more about the fact the build is not self
contained. it is a maven build and maven should be sufficient without
having to install python + dependencies.


Let's leave out the topic of whether our build should install things like
JDKs, Python, Golang, Docker, protoc, findbugs, RAT, etc. That issue is
somewhat independent of build tool, and the new build

Re: [DISCUSS] Move away from Apache Maven as build tool

2017-11-19 Thread Lukasz Cwik
The Gradle build rules have been merged; I'm adding a precommit[1] to start
collecting data about the build times. It currently only mirrors the Java
mvn install precommit. I'll gather data over the next two weeks and provide
a summary here.

You can rerun the precommit by issuing "Run Java Gradle PreCommit".

1: https://github.com/apache/beam/pull/4146


On Mon, Nov 13, 2017 at 9:08 AM, Lukasz Cwik  wrote:

> There has been plenty of time for comments on the PR and the approach.
>
> So far Ken Knowles has provided the most feedback on the PR. Ken, would you
> like to finish the review?
>
>
>
> On Fri, Nov 10, 2017 at 1:22 PM, Romain Manni-Bucau wrote:
>
>> This is only a setup thing, and it is better not to break the master history
>> for PoC/tests, in particular when the change is not very localized. An
>> alternative is to ask infra for another temp repo and keep the two in sync,
>> but I don't think it is worth it personally.
>>
>>
>>
>> On Nov 10, 2017 18:57, "Lukasz Cwik" wrote:
>>
>> > The reason to get it on master is because that is where all the PRs
>> are. An
>> > upstream branch without any development means no data.
>> > Also, our Jenkins setup via job-dsl doesn't honor using the Jenkins
>> > configuration on the branch because the seed job always runs against
>> > master.
>> >
>> > On Thu, Nov 9, 2017 at 9:59 PM, Romain Manni-Bucau <
>> rmannibu...@gmail.com>
>> > wrote:
>> >
>> > > What about pushing it on an "upstream" branch and testing it for 1
>> week in
>> > > parallel with the maven reference build? If gradle is always 50% faster
>> on
>> > > jenkins then it could become master setup without much discussion I
>> > guess.
>> > > We can even have 2 jenkins jobs: one with the daemon etc and one
>> without.
>> > >
>> > > Also noticed yesterday that gradle build is killing my machine (all 8
>> > cores
>> > > are 100%) during the first minutes vs maven build which lets me do
>> > something
>> > > else. Then all the consumed time which makes gradle not that fast is
>> > about
>> > > python. Will try to send figures later today.
>> > >
>> > > On Nov 10, 2017 00:10, "Lukasz Cwik" wrote:
>> > >
>> > > > I wouldn't mind merging this change in so I could set up those Gradle
>> > > > Jenkins precommits.
>> > > >
>> > > > As per our contribution guidelines, any committer willing to sign
>> off
>> > on
>> > > > the PR?
>> > > >
>> > > > On Thu, Nov 9, 2017 at 2:12 PM, Romain Manni-Bucau <
>> > > rmannibu...@gmail.com>
>> > > > wrote:
>> > > >
>> > > > > On Nov 9, 2017 21:31, "Kenneth Knowles" wrote:
>> > > > >
>> > > > > Keep in mind that a clean build is unusual during development (it
>> is
>> > > > common
>> > > > > for mvn use and that is a bug) and also not necessary for
>> precommits
>> > if
>> > > > the
>> > > > > build tool is correct enough that caching is safe. So while this
>> > number
>> > > > > matters, it is not the most important.
>> > > > >
>> > > > >
>> > > > > Not sure, in dev you bypass the build tool most of the time
>> anyway -
>> > > > thanks
>> > > > > to IDE or other shortcuts - but not on PR and CI. Keep in mind
>> that
>> > not
>> > > > > doing a clean and killing gradle daemon makes the build not
>> > > reproducible
>> > > > > and therefore useful :(. Starting to build from a subpart of the
>> > > reactor
>> > > > -
>> > > > > with the mentioned mvn plugin for instance - can be nice on some
>> CI
>> > > like
>> > > > > travis if the caching is well configured but still not a guarantee
>> > the
>> > > > > build is "green".
>> > > > >
>> > > > > My trade off is to ensure an easy build and relevant result over
>> the
>> > > time
>> > > > > criteria. Do you share it as well or prefer time over other
>> criteria
>> > -
>> > > > > which leads to other conclusions and options indeed and can make
>> us
>> > not
>> > > > > understanding each other?
>> > > > >
>> > > > >
>> > > > > On Thu, Nov 9, 2017 at 11:30 AM, Romain Manni-Bucau <
>> > > > rmannibu...@gmail.com
>> > > > > >
>> > > > > wrote:
>> > > > >
>> > > > > > I will try next week yes but the 2 runs I did were 28 min vs 32 min
>> > from
>> > > > > memory
>> > > > > > - after having downloaded all deps once.
>> > > > > >
>> > > > > > On Nov 9, 2017 19:45, "Lukasz Cwik" wrote:
>> > > > > >
>> > > > > > > If Gradle was slow, do you mind running the build with
>> --profile
>> > > and
>> > > > > > > sharing that and also sharing the Maven build log?
>> > > > > > >
>> > > > > > > On Thu, Nov 9, 2017 at 10:43 AM, Lukasz Cwik <
>> lc...@google.com>
>> > > > wrote:
>> > > > > > >
>> > > > > > > > Romain, I don't understand your last comment, were you
>> trying
>> > to
>> > > > say
>> > > > > > that
>> > > > > > > > you had the same Gradle build times like I did and it was an
>> > > > > > improvement
>> > > > > > > > over Maven or that you did not and you experienced 

Re: [VOTE] Choose the "new" Spark runner

2017-11-19 Thread Tyler Akidau
[ ] Use Spark 1 & Spark 2 Support Branch
[X] Use Spark 2 Only Branch

On Sun, Nov 19, 2017 at 1:46 PM Amit Sela  wrote:

> [X] Use Spark 2 Only Branch
>
> On Sun, Nov 19, 2017, 02:46 Reuven Lax  wrote:
>
> > [ ] Use Spark 1 & Spark 2 Support Branch
> >  [X] Use Spark 2 Only Branch
> >
> > On Sat, Nov 18, 2017 at 1:54 AM, Ben Sidhom 
> > wrote:
> >
> > > [ ] Use Spark 1 & Spark 2 Support Branch
> > > [X] Use Spark 2 Only Branch
> > >
> > > On Fri, Nov 17, 2017 at 9:46 AM, Ted Yu  wrote:
> > >
> > > > [ ] Use Spark 1 & Spark 2 Support Branch
> > > > [X] Use Spark 2 Only Branch
> > > >
> > > > On Thu, Nov 16, 2017 at 5:08 AM, Jean-Baptiste Onofré <
> j...@nanthrax.net
> > >
> > > > wrote:
> > > >
> > > > > Hi guys,
> > > > >
> > > > > To illustrate the current discussion about Spark versions support,
> > you
> > > > can
> > > > > take a look at:
> > > > >
> > > > > --
> > > > > Spark 1 & Spark 2 Support Branch
> > > > >
> > > > > https://github.com/jbonofre/beam/tree/BEAM-1920-SPARK2-MODULES
> > > > >
> > > > > This branch contains a Spark runner common module compatible with
> > both
> > > > > Spark 1.x and 2.x. For convenience, we introduced spark1 & spark2
> > > > > modules/artifacts containing just a pom.xml to define the
> > dependencies
> > > > set.
> > > > >
> > > > > --
> > > > > Spark 2 Only Branch
> > > > >
> > > > > https://github.com/jbonofre/beam/tree/BEAM-1920-SPARK2-ONLY
> > > > >
> > > > > This branch is an upgrade to Spark 2.x and "drop" support of Spark
> > 1.x.
> > > > >
> > > > > As I'm ready to merge one or the other in the PR, I would like to
> > > > complete
> > > > > the vote/discussion pretty soon.
> > > > >
> > > > > Correct me if I'm wrong, but it seems that the preference is to
> drop
> > > > Spark
> > > > > 1.x to focus only on Spark 2.x (for the Spark 2 Only Branch).
> > > > >
> > > > > I would like to call a final vote to act the merge I will do:
> > > > >
> > > > > [ ] Use Spark 1 & Spark 2 Support Branch
> > > > > [ ] Use Spark 2 Only Branch
> > > > >
> > > > > This informal vote is open for 48 hours.
> > > > >
> > > > > Please, let me know what your preference is.
> > > > >
> > > > > Thanks !
> > > > > Regards
> > > > > JB
> > > > >
> > > > > On 11/13/2017 09:32 AM, Jean-Baptiste Onofré wrote:
> > > > >
> > > > >> Hi Beamers,
> > > > >>
> > > > >> I'm forwarding this discussion & vote from the dev mailing list to
> > the
> > > > >> user mailing list.
> > > > >> The goal is to have your feedback as user.
> > > > >>
> > > > >> Basically, we have two options:
> > > > >> 1. Right now, in the PR, we support both Spark 1.x and 2.x using
> > three
> > > > >> artifacts (common, spark1, spark2). You, as users, pick up spark1
> or
> > > > spark2
> > > > > >> in your dependencies set depending on the Spark target version you
> > want.
> > > > >> 2. The other option is to upgrade and focus on Spark 2.x in Beam
> > > 2.3.0.
> > > > >> If you still want to use Spark 1.x, then, you will be stuck up to
> > Beam
> > > > >> 2.2.0.
> > > > >>
> > > > >> Thoughts ?
> > > > >>
> > > > >> Thanks !
> > > > >> Regards
> > > > >> JB
> > > > >>
> > > > >>
> > > > > >> -------- Forwarded Message --------
> > > > >> Subject: [VOTE] Drop Spark 1.x support to focus on Spark 2.x
> > > > >> Date: Wed, 8 Nov 2017 08:27:58 +0100
> > > > >> From: Jean-Baptiste Onofré 
> > > > >> Reply-To: dev@beam.apache.org
> > > > >> To: dev@beam.apache.org
> > > > >>
> > > > >> Hi all,
> > > > >>
> > > > >> as you might know, we are working on Spark 2.x support in the
> Spark
> > > > >> runner.
> > > > >>
> > > > >> I'm working on a PR about that:
> > > > >>
> > > > >> https://github.com/apache/beam/pull/3808
> > > > >>
> > > > >> Today, we have something working with both Spark 1.x and 2.x from
> a
> > > code
> > > > >> standpoint, but I have to deal with dependencies. It's the first
> > step
> > > of
> > > > >> the update as I'm still using RDD, the second step would be to
> > support
> > > > >> dataframe (but for that, I would need PCollection elements with
> > > schemas,
> > > > >> that's another topic on which Eugene, Reuven and I are
> discussing).
> > > > >>
> > > > >> However, as all major distributions now ship Spark 2.x, I don't
> > think
> > > > >> it's required anymore to support Spark 1.x.
> > > > >>
> > > > >> If we agree, I will update and cleanup the PR to only support and
> > > focus
> > > > >> on Spark 2.x.
> > > > >>
> > > > >> So, that's why I'm calling for a vote:
> > > > >>
> > > > >>[ ] +1 to drop Spark 1.x support and upgrade to Spark 2.x only
> > > > >>[ ] 0 (I don't care ;))
> > > > >>[ ] -1, I would like to still support Spark 1.x, and so having
> > > > support
> > > > >> of both Spark 1.x and 2.x (please provide specific comment)
> > > > >>
> > > > >> This vote is open for 48 hours (I have the commits ready, just
> > waiting
> > > > >> the end of the 

Re: [VOTE] Choose the "new" Spark runner

2017-11-19 Thread Amit Sela
[X] Use Spark 2 Only Branch

On Sun, Nov 19, 2017, 02:46 Reuven Lax  wrote:

> [ ] Use Spark 1 & Spark 2 Support Branch
>  [X] Use Spark 2 Only Branch
>
> On Sat, Nov 18, 2017 at 1:54 AM, Ben Sidhom 
> wrote:
>
> > [ ] Use Spark 1 & Spark 2 Support Branch
> > [X] Use Spark 2 Only Branch
> >
> > On Fri, Nov 17, 2017 at 9:46 AM, Ted Yu  wrote:
> >
> > > [ ] Use Spark 1 & Spark 2 Support Branch
> > > [X] Use Spark 2 Only Branch
> > >
> > > On Thu, Nov 16, 2017 at 5:08 AM, Jean-Baptiste Onofré
> > > wrote:
> > >
> > > > Hi guys,
> > > >
> > > > To illustrate the current discussion about Spark versions support,
> you
> > > can
> > > > take a look at:
> > > >
> > > > --
> > > > Spark 1 & Spark 2 Support Branch
> > > >
> > > > https://github.com/jbonofre/beam/tree/BEAM-1920-SPARK2-MODULES
> > > >
> > > > This branch contains a Spark runner common module compatible with
> both
> > > > Spark 1.x and 2.x. For convenience, we introduced spark1 & spark2
> > > > modules/artifacts containing just a pom.xml to define the
> dependencies
> > > set.
> > > >
> > > > --
> > > > Spark 2 Only Branch
> > > >
> > > > https://github.com/jbonofre/beam/tree/BEAM-1920-SPARK2-ONLY
> > > >
> > > > This branch is an upgrade to Spark 2.x and "drop" support of Spark
> 1.x.
> > > >
> > > > As I'm ready to merge one or the other in the PR, I would like to
> > > complete
> > > > the vote/discussion pretty soon.
> > > >
> > > > Correct me if I'm wrong, but it seems that the preference is to drop
> > > Spark
> > > > 1.x to focus only on Spark 2.x (for the Spark 2 Only Branch).
> > > >
> > > > I would like to call a final vote to act the merge I will do:
> > > >
> > > > [ ] Use Spark 1 & Spark 2 Support Branch
> > > > [ ] Use Spark 2 Only Branch
> > > >
> > > > This informal vote is open for 48 hours.
> > > >
> > > > Please, let me know what your preference is.
> > > >
> > > > Thanks !
> > > > Regards
> > > > JB
> > > >
> > > > On 11/13/2017 09:32 AM, Jean-Baptiste Onofré wrote:
> > > >
> > > >> Hi Beamers,
> > > >>
> > > >> I'm forwarding this discussion & vote from the dev mailing list to
> the
> > > >> user mailing list.
> > > >> The goal is to have your feedback as user.
> > > >>
> > > >> Basically, we have two options:
> > > >> 1. Right now, in the PR, we support both Spark 1.x and 2.x using
> three
> > > >> artifacts (common, spark1, spark2). You, as users, pick up spark1 or
> > > spark2
> > > > >> in your dependencies set depending on the Spark target version you
> want.
> > > >> 2. The other option is to upgrade and focus on Spark 2.x in Beam
> > 2.3.0.
> > > >> If you still want to use Spark 1.x, then, you will be stuck up to
> Beam
> > > >> 2.2.0.
> > > >>
> > > >> Thoughts ?
> > > >>
> > > >> Thanks !
> > > >> Regards
> > > >> JB
> > > >>
> > > >>
> > > > >> -------- Forwarded Message --------
> > > >> Subject: [VOTE] Drop Spark 1.x support to focus on Spark 2.x
> > > >> Date: Wed, 8 Nov 2017 08:27:58 +0100
> > > >> From: Jean-Baptiste Onofré 
> > > >> Reply-To: dev@beam.apache.org
> > > >> To: dev@beam.apache.org
> > > >>
> > > >> Hi all,
> > > >>
> > > >> as you might know, we are working on Spark 2.x support in the Spark
> > > >> runner.
> > > >>
> > > >> I'm working on a PR about that:
> > > >>
> > > >> https://github.com/apache/beam/pull/3808
> > > >>
> > > >> Today, we have something working with both Spark 1.x and 2.x from a
> > code
> > > >> standpoint, but I have to deal with dependencies. It's the first
> step
> > of
> > > >> the update as I'm still using RDD, the second step would be to
> support
> > > >> dataframe (but for that, I would need PCollection elements with
> > schemas,
> > > >> that's another topic on which Eugene, Reuven and I are discussing).
> > > >>
> > > >> However, as all major distributions now ship Spark 2.x, I don't
> think
> > > >> it's required anymore to support Spark 1.x.
> > > >>
> > > >> If we agree, I will update and cleanup the PR to only support and
> > focus
> > > >> on Spark 2.x.
> > > >>
> > > >> So, that's why I'm calling for a vote:
> > > >>
> > > >>[ ] +1 to drop Spark 1.x support and upgrade to Spark 2.x only
> > > >>[ ] 0 (I don't care ;))
> > > >>[ ] -1, I would like to still support Spark 1.x, and so having
> > > support
> > > >> of both Spark 1.x and 2.x (please provide specific comment)
> > > >>
> > > >> This vote is open for 48 hours (I have the commits ready, just
> waiting
> > > >> the end of the vote to push on the PR).
> > > >>
> > > >> Thanks !
> > > >> Regards
> > > >> JB
> > > >>
> > > >
> > > > --
> > > > Jean-Baptiste Onofré
> > > > jbono...@apache.org
> > > > http://blog.nanthrax.net
> > > > Talend - http://www.talend.com
> > > >
> > >
> >
> >
> >
> > --
> > -Ben
> >
>