Getting Ready for the Apache Community Summit @ San Francisco, CA

2018-03-13 Thread Griselda Cuevas
Hi Everyone, As you might remember from this thread [1] we're hosting the first Apache Beam Community Summit in San Francisco tomorrow. I've prepared a notes document [2] so that people can read after the sessions. Additionally, folks who cannot attend can add questions starting now so we can add

[Proposal] Defining and Adding SDK Metrics - Looking for Feedback

2018-03-13 Thread Alex Amato
Hello Beam community, I have put together a proposal, and I would like to get some initial feedback to improve upon the ideas here. The proposal describes how metrics can be defined and communicated across the

Re: NoSuchElementException in reader.getCurrent*.

2018-03-13 Thread Thomas Groh
I'm not sure what you mean, JB. getCurrent* is source-implementor visible (and must be), and users don't need to interact with it directly via Read/other IO transforms. With regard to Eugene's point - I still am strongly in favor of telling a source that they at least *SHOULD* throw an exception i

Re: (java) stream & beam?

2018-03-13 Thread Romain Manni-Bucau
Yep, as long as we can pass lambdas I guess it is fine; otherwise we have to use proxies to hide the mutation, but I don't think we need to be that purist to move to a more expressive DSL. On 13 March 2018 at 19:49, "Ben Chambers" wrote: > The CombineFn API has three type parameters (input, accumulator, and >

Re: (java) stream & beam?

2018-03-13 Thread Ben Chambers
The CombineFn API has three type parameters (input, accumulator, and output) and methods that approximately correspond to those parts of the collector:
CombineFn.createAccumulator = supplier
CombineFn.addInput = accumulator
CombineFn.mergeAccumulators = combiner
CombineFn.extractOutput = finisher
T
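
The correspondence listed above can be sketched in plain Python. This is only an illustration under stated assumptions: `MeanFn` and `combine` are hypothetical names, the class is a stand-in that mirrors the four steps rather than the Beam class itself, and the "bundles" are ordinary lists standing in for partitions of the input.

```python
from typing import Iterable, List, Tuple

Acc = Tuple[float, int]  # (running total, element count)


class MeanFn:
    """Hypothetical stand-in mirroring the four CombineFn steps; not the Beam class."""

    def create_accumulator(self) -> Acc:
        # Plays the role of the collector's supplier.
        return (0.0, 0)

    def add_input(self, acc: Acc, value: float) -> Acc:
        # Plays the role of the collector's accumulator.
        total, count = acc
        return (total + value, count + 1)

    def merge_accumulators(self, accs: Iterable[Acc]) -> Acc:
        # Plays the role of the collector's combiner.
        total, count = 0.0, 0
        for t, c in accs:
            total += t
            count += c
        return (total, count)

    def extract_output(self, acc: Acc) -> float:
        # Plays the role of the collector's finisher.
        total, count = acc
        return total / count if count else float("nan")


def combine(fn: MeanFn, bundles: List[List[float]]) -> float:
    """Accumulate each bundle separately, then merge the partial results."""
    partials = []
    for bundle in bundles:
        acc = fn.create_accumulator()
        for value in bundle:
            acc = fn.add_input(acc, value)
        partials.append(acc)
    return fn.extract_output(fn.merge_accumulators(partials))
```

The accumulator type is kept separate from the output type here because a mean cannot be merged from two means alone; that separation is the reason the API carries three type parameters.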

Re: (java) stream & beam?

2018-03-13 Thread Romain Manni-Bucau
That misses the collect split in three (supplier, combiner, aggregator), but overall I agree. I'd just take Java Stream, remove the "client" methods or make them big-data capable if possible, ensure all hooks are serializable to avoid hacks, and add an unwrap to be able to access the pipeline in case we need a very custo

Re: (java) stream & beam?

2018-03-13 Thread Ben Chambers
I think the existing rationale (not introducing lots of special fluent methods) makes sense. However, if we look at the Java Stream API, we probably wouldn't need to introduce *a lot* of fluent builders to get most of the functionality. Specifically, if we focus on map, flatMap, and collect from th

Re: Advice on parallelizing network calls in DoFn

2018-03-13 Thread Romain Manni-Bucau
On 13 March 2018 at 18:45, "Lukasz Cwik" wrote: Thanks for the data, as it is clear that the fast completion stage doubles the overhead of the code segment that you're going through and the regular completion stage quadruples the overhead. It's good that you also added a simple function and compared

Re: (java) stream & beam?

2018-03-13 Thread Romain Manni-Bucau
Yep, I know the rationale and it makes sense, but it also raises the entry barrier for users and is not that smooth in IDEs, in particular for custom code. So I really think it makes sense to build a user-friendly API on top of the Beam core dev one. On 13 March 2018 at 18:35, "Aljoscha Krettek" wr

Re: Advice on parallelizing network calls in DoFn

2018-03-13 Thread Lukasz Cwik
Thanks for the data, as it is clear that the fast completion stage doubles the overhead of the code segment that you're going through and the regular completion stage quadruples the overhead. It's good that you also added a simple function and compared the run since it gives relative overhead that cou

Re: (java) stream & beam?

2018-03-13 Thread Aljoscha Krettek
https://beam.apache.org/blog/2016/05/27/where-is-my-pcollection-dot-map.html > On 11. Mar 2018, at 22:21, Romain Manni-Bucau wrote: > > > > On 12 March 2018 at 00:16, "Reuven Lax" > wrote: > I think it would be interesting to see what a Java stream-based API would > l

Dealing with AWS Regions

2018-03-13 Thread Jacob Marble
Starting a new thread just for dealing with AWS regions better, in the context of S3 and Redshift. The S3FileSystem.amazonS3 build could be refactored to select the region based on [1]:
1. the flag value region
2. the EC2 region, if found in environment (running in EC2 VM)
3. the default region (us-east-1)
For act
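
The three-step precedence above could look roughly like the following. This is a sketch, not the S3FileSystem code: `resolve_region` and the `ec2_region_lookup` callable are hypothetical names introduced here, and how the EC2 region is actually discovered is left to the caller.

```python
from typing import Callable, Optional

DEFAULT_REGION = "us-east-1"  # step 3 in the list above


def resolve_region(
    flag_region: Optional[str],
    ec2_region_lookup: Callable[[], Optional[str]],
) -> str:
    """Pick a region using the precedence: flag value, then EC2, then default.

    ec2_region_lookup is a hypothetical hook that returns the region of the
    current EC2 VM, or None when not running in EC2.
    """
    if flag_region:
        return flag_region
    ec2_region = ec2_region_lookup()
    if ec2_region:
        return ec2_region
    return DEFAULT_REGION
```

Passing the EC2 lookup as a callable keeps it from being invoked when the flag is set, and makes the precedence easy to test without a real VM.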

Re: Releases and user support

2018-03-13 Thread Jean-Baptiste Onofré
You can also take a look at the Karaf download page for the schedule and active table. Regards JB On 13 March 2018 at 07:52, Romain Manni-Bucau wrote: >Just to illustrate what I was looking for @beam: this kind of page >https://tomcat.apache.org/tomcat-80-eol.html but maybe not that fin

Re: Releases and user support

2018-03-13 Thread Romain Manni-Bucau
Just to illustrate what I was looking for @beam: this kind of page https://tomcat.apache.org/tomcat-80-eol.html but maybe not that fine grained.

Re: Releases and user support

2018-03-13 Thread Reuven Lax
I agree with this. Support guarantee makes more sense for products. Specifically, there are several organizations that have products based on Beam (Talend, Google, Data Artisans, Spotify, etc.). These companies may provide support guarantees to their customers, which essentially means that they are

Re: Releases and user support

2018-03-13 Thread Jean-Baptiste Onofré
+1 On 13 March 2018 at 05:54, Romain Manni-Bucau wrote: >sounds good, let's wait for today (so tomorrow with the timezone issue) >to >see if there are any other proposals and if not start a vote tomorrow. > > >Romain Manni-Bucau >@rmannibucau | Blog >

Re: Releases and user support

2018-03-13 Thread Romain Manni-Bucau
Sounds good, let's wait for today (so tomorrow with the timezone issue) to see if there are any other proposals and, if not, start a vote tomorrow.

Re: Releases and user support

2018-03-13 Thread Jean-Baptiste Onofré
I would start a formal vote to have feedback from everyone and propose where to add such detail (I would suggest directly on the download/version page), then we can create a PR on the website. Regards JB On 13 March 2018 at 05:44, Romain Manni-Bucau wrote: >Works for me. > >What's th

Re: Releases and user support

2018-03-13 Thread Romain Manni-Bucau
Works for me. What's the procedure to add it on the website (and where can we add it)?

Re: Releases and user support

2018-03-13 Thread Jean-Baptiste Onofré
That's the statement I make in Karaf: I have two active branches, with a backward compatibility guarantee on both. If we introduce a new branch, then the oldest one is flagged as "not active" (I prefer the "not active" wording to "EOL", as a release can happen on a non-active branch). In that sen

Re: Releases and user support

2018-03-13 Thread Romain Manni-Bucau
2018-03-13 12:50 GMT+01:00 Jean-Baptiste Onofré : > Hi > > I don't think this statement is appropriate as it sounds more like product > than project. > > Let me explain. > > At Apache, anyone can propose and do a release based on any version, > including very old ones. > Support sounds like the as

Re: NoSuchElementException in reader.getCurrent*.

2018-03-13 Thread Jean-Baptiste Onofré
I agree. I don't think it's useful to expose getCurrent to the user. That's more runner related. Regards JB On 12 March 2018 at 11:06, Romain Manni-Bucau wrote: >I agree Thomas but I kind of read it as "yes we can drop that >constraint". >If not we should also check we are used in a t

Re: Releases and user support

2018-03-13 Thread Jean-Baptiste Onofré
Hi I don't think this statement is appropriate as it sounds more like product than project. Let me explain. At Apache, anyone can propose and do a release based on any version, including very old ones. Support sounds like the assessment that we are committed to provide fixes. That's more a pr

Re: "Radically modular data ingestion APIs in Apache Beam" @ Strata - slides available

2018-03-13 Thread James
Very informative, thanks! On Fri, Mar 9, 2018 at 4:49 PM Etienne Chauchot wrote: > Great! > > Thanks for sharing. > > Etienne > > On Thursday, 08 March 2018 at 19:49 +, Eugene Kirpichov wrote: > > Hey all, > > The slides for my yesterday's talk at Strata San Jose > https://conferences.oreilly.

Re: Advice on parallelizing network calls in DoFn

2018-03-13 Thread Romain Manni-Bucau
Here are some figures (a small caveat: I favored Beam a lot in this benchmark, a bit more than a real implementation should; I'll say more about it afterwards): I copied the code at: https://gist.github.com/rmannibucau/fd98fb6a10f9557613fb145c8e7e2de1 And results at: https://gist.github.com/rmannib

Re: Releases and user support

2018-03-13 Thread Romain Manni-Bucau
2018-03-13 9:37 GMT+01:00 Robert Bradshaw : > I think there is never any prohibition on doing a minor or bugfix release > to an old release, if we deem it worth the effort. EOLs, etc. are more > about a promise/obligation to do a bugfix release if bugs (of a certain > type or severity?) are discov

Re: Releases and user support

2018-03-13 Thread Robert Bradshaw
I think there is never any prohibition on doing a minor or bugfix release to an old release, if we deem it worth the effort. EOLs, etc. are more about a promise/obligation to do a bugfix release if bugs (of a certain type or severity?) are discovered. Given how hard it's been just to get normal rel

Re: Releases and user support

2018-03-13 Thread Romain Manni-Bucau
Up? What about this proposal:
1. majors (X.y.z) are supported for 3 years
2. minors (x.Y.z) are supported for 6 months (1 year? does it sound doable?)
Just to ensure it is clear: the implication is that if we have 3.0.0 today then we may have to do a 3.x.y in 3 years even if we are at Beam 10. This is t
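
To make the arithmetic of this proposal concrete, here is a small sketch that computes when support would end for a release. The function names and the `"major"`/`"minor"` labels are hypothetical, and the windows are the figures floated in the message (3 years and 6 months), not an adopted policy.

```python
import calendar
from datetime import date

# Proposed support windows, in months (assumption: the values from the proposal).
SUPPORT_MONTHS = {"major": 36, "minor": 6}


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, clamping the day to the target month's length."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def support_end(release_date: date, kind: str) -> date:
    """Return the date on which support for a release of the given kind would end."""
    return add_months(release_date, SUPPORT_MONTHS[kind])


def is_supported(release_date: date, kind: str, today: date) -> bool:
    """True while `today` is still before the end of the support window."""
    return today < support_end(release_date, kind)
```

Under these assumptions, a major released on 2018-03-13 would remain supported until 2021-03-13 regardless of how many later majors ship in between.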