Re: [RESULT] [VOTE] Release 0.6.0, release candidate #2

2017-03-15 Thread Jean-Baptiste Onofré
By the way, this step is in the "Release Guide". But you are right, it means the release manager needs "permission" on the Jira or must ask to change the version state. Regards JB On 03/16/2017 02:42 AM, Ahmet Altay wrote: JB, 0.6.0 is flagged as released now, thank you for catching this. As a

Re: Beam spark 2.x runner status

2017-03-15 Thread Jean-Baptiste Onofré
Hi guys, sorry, due to the time zone shift, I answer a bit late ;) I think we can have the same runner dealing with the two major Spark versions, introducing some adapters. For instance, in CarbonData, we created some adapters to work with Spark 1.5, Spark 1.6 and Spark 2.1. The dependencies

Re: [RESULT] [VOTE] Release 0.6.0, release candidate #2

2017-03-15 Thread Jean-Baptiste Onofré
Thanks ! Regards JB On 03/16/2017 02:42 AM, Ahmet Altay wrote: JB, 0.6.0 is flagged as released now, thank you for catching this. As a side note, I did not have enough permissions to do this and asked Davor to do it. I will add this to the release notes. Ahmet On Wed, Mar 15, 2017 at 7:16 AM,

Re: Docker image dependencies

2017-03-15 Thread Stephen Sisk
thanks for the discussion! In general, I agree with the sentiments expressed here. I updated https://docs.google.com/document/d/153J9jPQhMCNi_eBzJfhAg-NprQ7vbf1jNVRgdqeEE8I/edit#heading=h.hlirex1vus1a to reflect this discussion. (The plan is still that I will put that on the website.) Apache

Re: [RESULT] [VOTE] Release 0.6.0, release candidate #2

2017-03-15 Thread Ahmet Altay
JB, 0.6.0 is flagged as released now, thank you for catching this. As a side note, I did not have enough permissions to do this and asked Davor to do it. I will add this to the release notes. Ahmet On Wed, Mar 15, 2017 at 7:16 AM, Jesse Anderson wrote: > Excellent! > > On

Re: Beam spark 2.x runner status

2017-03-15 Thread Amit Sela
I answered inline to Abbass' comment, but I think he hit something - how about we have a branch with those adaptations ? same RDD implementation, but depending on the latest 2.x version with the minimal changes required. I'd be happy to do that, or guide anyone who wants to (I did most of it on my

Re: Beam spark 2.x runner status

2017-03-15 Thread amarouni
+1 for Spark runners based on different APIs RDD/Dataset and keeping the Spark versions as a deployment dependency. The RDD API is stable & mature enough so it makes sense to have it on master, the Dataset API still has some work to do and from our own experience it just reached a comparable RDD

Re: Call for help: let's add Splittable DoFn to Spark, Flink and Apex runners

2017-03-15 Thread Amit Sela
Great! so we'll use the hangout you added here, see you then. On Wed, Mar 15, 2017 at 7:22 PM Eugene Kirpichov wrote: > Amit - 8am is fine with me, let's do that. > > On Wed, Mar 15, 2017 at 6:00 AM Jean-Baptiste Onofré > wrote: > > > Hi, > > >

Re: Call for help: let's add Splittable DoFn to Spark, Flink and Apex runners

2017-03-15 Thread Eugene Kirpichov
Amit - 8am is fine with me, let's do that. On Wed, Mar 15, 2017 at 6:00 AM Jean-Baptiste Onofré wrote: > Hi, > > Anyway, I hope it will result with some notes on the mailing list as it > could be > helpful. > > I'm not against a video call to move forward, but, from ma

Re: Beam spark 2.x runner status

2017-03-15 Thread Amit Sela
So you're suggesting we copy-paste the current runner and adapt whatever is necessary so it runs with Spark 2 ? This also means any bug-fix / improvement would have to be maintained in two runners, and I wouldn't wanna do that. I don't like to think in terms of Spark1/2 but in terms of

Re: Beam spark 2.x runner status

2017-03-15 Thread Ismaël Mejía
> However, I do feel that we should use the Dataset API, starting with batch > support first. WDYT ? Well, this is the exact current status quo, and it will take us some time to have something as complete as what we have with the spark 1 runner for the spark 2. The other proposal has two

Re: Beam spark 2.x runner status

2017-03-15 Thread Amit Sela
I feel that as we're getting closer to supporting streaming with Spark 1 runner, and having Structured Streaming advance in Spark 2, we could start work on Spark 2 runner in a separate branch. However, I do feel that we should use the Dataset API, starting with batch support first. WDYT ? On

Re: Beam spark 2.x runner status

2017-03-15 Thread Ismaël Mejía
> So you propose to have the Spark 2 branch a clone of the current one with > adaptations around Context->Session, Accumulator->AccumulatorV2 etc. while > still using the RDD API ? Yes this is exactly what I have in mind. > I think that having another Spark runner is great if it has value, >

Re: Performance Testing Next Steps

2017-03-15 Thread Ismaël Mejía
Excellent proposal, sorry to jump into this discussion so late. This was in my to-read list for almost two weeks, and I finally got the time to read the document. I have two minor comments: I have the impression that the strict separation of Providers (the data-processing systems) and Resources

Re: Beam spark 2.x runner status

2017-03-15 Thread Amit Sela
So you propose to have the Spark 2 branch a clone of the current one with adaptations around Context->Session, Accumulator->AccumulatorV2 etc. while still using the RDD API ? I think that having another Spark runner is great if it has value, otherwise, let's just bump the version. My idea of

Re: Beam spark 2.x runner status

2017-03-15 Thread Ismaël Mejía
BIG +1 JB, If we can just jump the version number with minor changes staying as close as possible to the current implementation for spark 1 we can go faster and offer in principle the exact same support but for version 2. I know that the advanced streaming stuff based on the DataSet API won't be

Re: [RESULT] [VOTE] Release 0.6.0, release candidate #2

2017-03-15 Thread Jesse Anderson
Excellent! On Wed, Mar 15, 2017, 6:13 AM Jean-Baptiste Onofré wrote: > Hi Ahmet, > > it seems Jira is not up to date: 0.6.0 version is not flagged as > "Released". > > Can you fix that please ? > > Thanks ! > Regards > JB > > On 03/15/2017 05:22 AM, Ahmet Altay wrote: > > I'm

Re: Docker image dependencies

2017-03-15 Thread Ismaël Mejía
Hi, Thanks for bringing this subject to the mailing list. +1 We definitely need a consensus on this, and I agree with your proposal and JB’s comments modulo certain clarifications: I think we shall go in this priority order if the version of the image we want is available: 1. Image provided by

Re: [RESULT] [VOTE] Release 0.6.0, release candidate #2

2017-03-15 Thread Jean-Baptiste Onofré
Hi Ahmet, it seems Jira is not up to date: 0.6.0 version is not flagged as "Released". Can you fix that please ? Thanks ! Regards JB On 03/15/2017 05:22 AM, Ahmet Altay wrote: I'm happy to announce that we have unanimously approved this release. There are 7 approving votes, 4 of which are

Re: Beam spark 2.x runner status

2017-03-15 Thread Jean-Baptiste Onofré
Hi Amit, What do you think of the following: - in the meantime, while you reintroduce the Spark 2 branch, what about "extending" the version in the current Spark runner ? Still using RDD/DStream, I think we can support Spark 2.x even if we don't yet leverage the new provided features.

Re: Beam spark 2.x runner status

2017-03-15 Thread Amit Sela
Hi Cody, I will re-introduce this branch soon as part of the work on BEAM-913. For now, and from previous experience with the mentioned branch, batch implementation should be straightforward. Only issue is with streaming support - in the current

Re: Style: how much testing for transform builder classes?

2017-03-15 Thread Ismaël Mejía
+1 to Vikas' point: maybe the right place to enforce correct builds is in validate, which would reduce the test boilerplate so that we only test validate, but I wonder if this totally covers both cases (the buildsCorrectly and buildsCorrectlyInDifferentOrder ones). I answer Eugene’s

Re: [RESULT] [VOTE] Release 0.6.0, release candidate #2

2017-03-15 Thread Ismaël Mejía
Thanks Ahmet for dealing with the release, I just tried the pip install apache-beam and the wordcount example and as you said it feels awesome to see this working so easily now. Congrats to everyone working on the python SDK ! On Wed, Mar 15, 2017 at 8:17 AM, Ahmet Altay

Jenkins build is still unstable: beam_Release_NightlySnapshot #357

2017-03-15 Thread Apache Jenkins Server
See

Re: [RESULT] [VOTE] Release 0.6.0, release candidate #2

2017-03-15 Thread Ahmet Altay
This release is now complete. Thanks to everyone who has helped make this release possible! Before sending a note to users@, I would like to make a pass over the website and simplify things now that we have an official python release. I did the first 'pip install apache-beam' today and it felt