Re: Samza Runner

2018-01-25 Thread Jean-Baptiste Onofré
That's awesome! Happy to see a new runner. As the build is OK and the runner contains validation, why not simply merge the PR on master once the 2.3.0 release branch is there? It would give better visibility to this new feature and maybe attract contributions at an early stage. Regards JB On 01/26/2018

Re: Samza Runner

2018-01-25 Thread Henning Rohde
+1 Exciting to see a new runner! On Thu, Jan 25, 2018 at 8:56 PM, Jesse Anderson wrote: > Excellent! > > On Fri, Jan 26, 2018, 5:37 AM Kenneth Knowles wrote: > >> Hi all, >> >> In case you haven't noticed or followed, there's a new runner in PR: >> Samza! >> >> https://github.com/apache/bea

Re: [HEADS UP] Preparing Beam 2.3.0

2018-01-25 Thread Jean-Baptiste Onofré
Just to be clear on the CassandraIO issues: they are not regressions (it has been like this since CassandraIO was added), and they do not cause data loss. I consider them a bug/improvement, as the read is performed on a single worker (the split always returns 1). As said, I am going to work today on those issue

Re: [HEADS UP] Preparing Beam 2.3.0

2018-01-25 Thread Reuven Lax
I agree - if CassandraIO issues are not regressions (and are not critical data-loss bugs), I don't think the release should block on them. Reuven On Thu, Jan 25, 2018 at 9:07 PM, Jean-Baptiste Onofré wrote: > I disagree: the CassandraIO issues are not blockers as they are not > regressions. > > In

Re: [HEADS UP] Preparing Beam 2.3.0

2018-01-25 Thread Jean-Baptiste Onofré
I disagree: the CassandraIO issues are not blockers as they are not regressions. In order to ensure the release pace, I will go forward; these issues will be fixed in the next release cycle. For the blog, it's up to you. However, with the next release pace we do, I'm not sure it makes sense to do it

Re: Samza Runner

2018-01-25 Thread Jesse Anderson
Excellent! On Fri, Jan 26, 2018, 5:37 AM Kenneth Knowles wrote: > Hi all, > > In case you haven't noticed or followed, there's a new runner in PR: Samza! > > https://github.com/apache/beam/pull/4340 > > It has been under review and revision for some time. In local mode it > passes a solid su

Samza Runner

2018-01-25 Thread Kenneth Knowles
Hi all, In case you haven't noticed or followed, there's a new runner in PR: Samza! https://github.com/apache/beam/pull/4340 It has been under review and revision for some time. In local mode it passes a solid suite of ValidatesRunner tests (I don't have a Samza deployment handy to test non-

Re: Removing the PValueCache from the Beam Python DirectRunner

2018-01-25 Thread Robert Bradshaw
Sounds good. On Thu, Jan 25, 2018 at 4:12 PM, Charles Chen wrote: > Yes, that is correct. The scope of the attached fix is for in-process > runners. For remote runners, we should think about how to make PCollection > contents available after pipeline execution. We may also need to better > des

Re: Removing the PValueCache from the Beam Python DirectRunner

2018-01-25 Thread Charles Chen
Yes, that is correct. The scope of the attached fix is for in-process runners. For remote runners, we should think about how to make PCollection contents available after pipeline execution. We may also need to better design eager / interactive execution for that use case, since our current use o

Re: Removing the PValueCache from the Beam Python DirectRunner

2018-01-25 Thread Robert Bradshaw
Sounds good. I assume there will still need to be runner-specific support for any runner that chooses to implement this (e.g. writing to remote files then reading them in?) On Thu, Jan 25, 2018 at 3:25 PM, Charles Chen wrote: > Currently, the Python SDK supports an eager execution mode. For exam

Removing the PValueCache from the Beam Python DirectRunner

2018-01-25 Thread Charles Chen
Currently, the Python SDK supports an eager execution mode. For example, a list can be directly passed into a PTransform to obtain its result: result = [1, 2, 3] | MyPTransform() To support this use, the Python DirectRunner has an option to cache its intermediate results into a PValueCache. The
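A minimal, library-free sketch of how the `[1, 2, 3] | MyPTransform()` form quoted above can work in plain Python: the right-hand operand defines `__ror__`, so the list on the left is handed to the transform and evaluated immediately. `ToyPTransform`, `Doubler`, and the list-based `expand` are hypothetical names for illustration only, not the Beam SDK classes; in the real SDK the work goes through a pipeline and a runner rather than a direct function call.

```python
class ToyPTransform:
    """Hypothetical stand-in for a transform; NOT the Beam SDK class."""

    def expand(self, elements):
        # Subclasses describe what to do with the input elements.
        raise NotImplementedError

    def __ror__(self, left):
        # Python falls back to this method for `left | self` because a
        # plain list does not define `|` for this operand type.
        return self.expand(list(left))


class Doubler(ToyPTransform):
    """Illustrative transform that doubles every element."""

    def expand(self, elements):
        return [2 * x for x in elements]


# Eager use: the result is available as soon as the expression is evaluated.
result = [1, 2, 3] | Doubler()
print(result)
```

The sketch only shows the operator dispatch; it does not model the caching of intermediate results that the message goes on to discuss.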

Re: [HEADS UP] Preparing Beam 2.3.0

2018-01-25 Thread Eugene Kirpichov
CassandraIO issue is not a regression - the bugs have been there all along (3 JIRAs filed), I suppose just previous users were okay with the single-threaded read performance. That said, I don't mind blocking on this issue if it only takes a couple of days. On Thu, Jan 25, 2018 at 1:40 PM Kenneth K

Re: Great work closing PRs for 2.3.0 release

2018-01-25 Thread Kenneth Knowles
Nice! Back under 100. On Wed, Jan 24, 2018 at 4:57 PM, Lukasz Cwik wrote: > I would like to give praise to the community for closing about 30 PRs in > the past couple of days for the 2.3.0 release. >

Re: [HEADS UP] Preparing Beam 2.3.0

2018-01-25 Thread Kenneth Knowles
Agreed. I would say if a previously usable IO became unusable that is (on a case-by-case basis) a fine cause to block a release. Are the JIRAs filed? On Thu, Jan 25, 2018 at 12:56 PM, Ismaël Mejía wrote: > I saw some recent reports on issues with CassandraIO that are not > blockers (not data los

Re: [HEADS UP] Preparing Beam 2.3.0

2018-01-25 Thread Ismaël Mejía
I saw some recent reports on issues with CassandraIO that are not blockers (not data loss) but IMO deserve to be included because basically the issues imply that users cannot read from Cassandra in parallel, and they were reported by production users. Probably a good idea to finish these before the

Re: [HEADS UP] Preparing Beam 2.3.0

2018-01-25 Thread Jean-Baptiste Onofré
Team work! ;) Thanks Reuven! Regards JB On 25/01/2018 21:08, Reuven Lax wrote: Thank you for running this JB! On Thu, Jan 25, 2018 at 11:50 AM, Jean-Baptiste Onofré wrote: Hi guys, Kenn and I are doing the latest triage. I'm creating some PRs that

Re: [HEADS UP] Preparing Beam 2.3.0

2018-01-25 Thread Reuven Lax
Thank you for running this JB! On Thu, Jan 25, 2018 at 11:50 AM, Jean-Baptiste Onofré wrote: > Hi guys, > > Kenn and I are doing the latest triage. I'm creating some PRs that would > be good > for 2.3.0 (but not blocker). > > As discussed, I plan to start the release process tomorrow evening (my

Re: Gradle / Mvn diff

2018-01-25 Thread Romain Manni-Bucau
Well it is more about consistency and reliability than speed here. The compilation result is just corrupted :( On 25 Jan 2018 20:33, "Lukasz Cwik" wrote: > You can only get incremental support at the build system level, not at the > individual tool level like javac. The task that represents com

Re: [HEADS UP] Preparing Beam 2.3.0

2018-01-25 Thread Jean-Baptiste Onofré
Hi guys, Kenn and I are doing the latest triage. I'm creating some PRs that would be good for 2.3.0 (but not blocker). As discussed, I plan to start the release process tomorrow evening (my time). Thanks ! Regards JB On 01/23/2018 10:39 AM, Jean-Baptiste Onofré wrote: > Hi guys, > > Some days

Re: Gradle / Mvn diff

2018-01-25 Thread Lukasz Cwik
You can only get incremental support at the build system level, not at the individual tool level like javac. The task that represents compilation would need to be broken up into smaller tasks with smaller source sets to speed up compilation of really large modules. On Wed, Jan 24, 2018 at 11:12 PM,

Build failed in Jenkins: beam_PostRelease_NightlySnapshot #1

2018-01-25 Thread Apache Jenkins Server
See -- Started by timer [EnvInject] - Loading node environment variables. Building remotely on beam8 (beam) in workspace

Re: [Proposal] Apache Beam Event's Calendar

2018-01-25 Thread Ismaël Mejía
+1 I think it makes sense to separate the calendar in two: one for the CFPs, more interesting for dev@, and one for confirmed events where there will be presentations on Beam, which concern users (user@) more. It also probably makes sense to include this one in the website too. On Thu, Jan

Re: [Proposal] Apache Beam Event's Calendar

2018-01-25 Thread Etienne Chauchot
+1, great initiative! On 25/01/2018 at 01:05, Griselda Cuevas wrote: Hi Beam Community, I've created this public calendar to curate events that the Apache Beam

Jenkins build became unstable: beam_Release_NightlySnapshot #663

2018-01-25 Thread Apache Jenkins Server
See