Re: Move mock classes out of test directory in BeamSQL

2018-10-03 Thread Rui Wang
In #6566, I moved mock classes to sdk/extensions/sql/meta/provider/test, which is in BeamSQL's src/main and has

Re: Python 3: final step

2018-10-03 Thread Manu Zhang
Thanks Valentyn. Note that some of the test failures are covered by “Finish Python 3 porting for *** module” issues, e.g. https://issues.apache.org/jira/browse/BEAM-5315. Manu On Oct 3, 2018, at 4:18 PM +0800, Valentyn Tymofieiev wrote: > Hi Rakesh and Manu, > > Thanks to both of you for offering help (in different

Re: Does anyone have a strong intelliJ setup?

2018-10-03 Thread Thomas Weise
Current content on CWiki is outdated and needs to be replaced. +1 for moving the instructions there (delete from website) On Wed, Oct 3, 2018 at 8:04 PM Mikhail Gryzykhin < gryzykhin.mikh...@gmail.com> wrote: > @Scott Wegner > > Would be really great if we can get good hints. However I would

Re: Does anyone have a strong intelliJ setup?

2018-10-03 Thread Mikhail Gryzykhin
@Scott Wegner It would be really great if we can get good hints. However, I would suggest updating the corresponding page on cwiki, not the website. It will be easier to keep that one up to date. Some of the tips are already present there. https://cwiki.apache.org/confluence/display/BEAM/IntelliJ+Tips

Re: Is Splittable DoFn suitable for fetch data from a socket server?

2018-10-03 Thread flyisland
Hi Raghu, > Assuming you need to ack on the same connection that served the records, finalize() functionality in UnboundedSource API is an important case. You can use the UnboundedSource API for now. I have a new question now: where should I keep the connection for the later ack action? The

Can we allow SimpleFunction and SerializableFunction to throw Exception?

2018-10-03 Thread Jeff Klukas
I'm working on https://issues.apache.org/jira/browse/BEAM-5638 to add exception handling options to single message transforms in the Java SDK. MapElements' via() method is overloaded to accept either a SimpleFunction, a SerializableFunction, or a Contextful, all of which are ultimately stored as

Re: [Proposal] Add exception handling option to MapElements

2018-10-03 Thread Jeff Klukas
Jira issues for adding exception handling in Java and Python SDKs: https://issues.apache.org/jira/browse/BEAM-5638 https://issues.apache.org/jira/browse/BEAM-5639 I'll plan to have a complete PR for the Java SDK put together in the next few days. On Wed, Oct 3, 2018 at 1:29 PM Jeff Klukas

Re: Metrics Pusher support on Dataflow

2018-10-03 Thread Scott Wegner
Another point that we discussed at ApacheCon is that a difference between Dataflow and other runners is Dataflow is service-based and doesn't need a locally executing "driver" program. A local driver context is a good place to implement MetricsPusher because it is a singleton process. In fact,

Re: [PROPOSAL] Prepare Beam 2.8.0 release

2018-10-03 Thread Ahmet Altay
Great. I will do the cut on 10/10. Let's start by triaging the open issues targeted for 2.8.0 [1]. If you have any issues in this list please resolve them or move to the next release. If you are aware of any critical issues please add to this list. Ahmet [1]

Re: Does anyone have a strong intelliJ setup?

2018-10-03 Thread Scott Wegner
At ApacheCon I heard from a number of people that the IntelliJ setup isn't as good as it used to be with Maven. Bad tooling makes me sad and I want to make it better :( It seems everyone has their own magic to get things working. If we got these tips added to the website [1], do you think we'd

Re: [VOTE] Donating the Dataflow Worker code to Apache Beam

2018-10-03 Thread Boyuan Zhang
Hey all, We are tracking the Dataflow worker donation process here: https://issues.apache.org/jira/browse/BEAM-5634. Boyuan Zhang On Mon, Sep 17, 2018 at 5:05 PM Lukasz Cwik wrote: > Thanks all, closing the vote with 18 +1s, 5 of which are binding. > > I'll try to get this code out and

Re: Add cleanup flag to DockerPayload

2018-10-03 Thread Henning Rohde
IMO it's the runner's responsibility to do container garbage collection and disk space management. This flag seems like an implementation-specific option that would apply only to some runner/deployment combinations, so it doesn't seem to belong in that proto. Dataflow would not be able to honor such

Add cleanup flag to DockerPayload

2018-10-03 Thread Ankur Goenka
Hi, In the portable Flink runner, SDK Harness Docker containers are created dynamically and are not garbage collected. Each SDK Harness container pulls the staging artifacts and generates logs and tmp files, which are stored as an additional layer on top of the image. These dead container layers accumulate over time

Re: Why not adding all coders into ModelCoderRegistrar?

2018-10-03 Thread Shen Li
Hi Lukasz, Is there a way to get the SDK coders (LengthPrefixCoder, LengthPrefixCoder etc.) instead of a LengthPrefixCoder on the runner side from RunnerApi.Pipeline? Our runner needs to serialize the key and use its hash value to keep some per-key states. Now I am getting the ClassCastException

Re: [PROPOSAL] Prepare Beam 2.8.0 release

2018-10-03 Thread Thomas Weise
+1 On Wed, Oct 3, 2018 at 12:33 PM Ted Yu wrote: > +1 > > On Wed, Oct 3, 2018 at 9:52 AM Jean-Baptiste Onofré > wrote: > >> +1 >> >> but we have to be fast in release process. 2.7.0 took more than 1 month >> to be cut ! >> >> If no blocker, we have to just move forward. >> >> Regards >> JB >>

[ANNOUNCE] Apache Beam 2.7.0 released!

2018-10-03 Thread Charles Chen
The Apache Beam team is pleased to announce the release of version 2.7.0! Apache Beam is an open source unified programming model to define and execute data processing pipelines, including ETL, batch and stream (continuous) processing. See https://beam.apache.org You can download the release

Re: [PROPOSAL] Prepare Beam 2.8.0 release

2018-10-03 Thread Andrew Pilloud
+1 for the 2.7.0 release schedule. Thanks for volunteering. Do we want a standing owner for the LTS branch (like the Linux kernel has) or will we just take volunteers for each LTS release as they arise? Andrew On Wed, Oct 3, 2018 at 12:33 PM Ted Yu wrote: > +1 > > On Wed, Oct 3, 2018 at 9:52

Re: Java SDK Extensions

2018-10-03 Thread Ben Chambers
On Wed, Oct 3, 2018 at 12:16 PM Jean-Baptiste Onofré wrote: > Hi Anton, > > jackson is the json extension as we have XML. Agree that it should be > documented. > > Agree about join-library. > > sketching is some statistic extensions providing ready to use stats > CombineFn. > > Regards > JB > >

Re: [PROPOSAL] Prepare Beam 2.8.0 release

2018-10-03 Thread Ted Yu
+1 On Wed, Oct 3, 2018 at 9:52 AM Jean-Baptiste Onofré wrote: > +1 > > but we have to be fast in release process. 2.7.0 took more than 1 month > to be cut ! > > If no blocker, we have to just move forward. > > Regards > JB > > On 03/10/2018 18:25, Ahmet Altay wrote: > > Hi all, > > > > Release

Re: Java SDK Extensions

2018-10-03 Thread Jean-Baptiste Onofré
Hi Anton, jackson is the JSON extension, just as we have one for XML. Agree that it should be documented. Agree about join-library. sketching is a statistics extension providing ready-to-use stats CombineFns. Regards JB On 03/10/2018 20:25, Anton Kedin wrote: Hi dev@, *TL;DR:*

Java SDK Extensions

2018-10-03 Thread Anton Kedin
Hi dev@, *TL;DR:* `sdks/java/extensions` is hard to discover, navigate and understand. *Current State:* I was looking at `sdks/java/extensions`[1] and realized that I don't know what half of those things are. Only `join library` and `sorter` seem to be documented and discoverable on Beam

Re: Move mock classes out of test directory in BeamSQL

2018-10-03 Thread Rui Wang
Thanks. Looks like at least moving the mocks is an accepted idea. I will come up with a plan for the move later (either to a separate module or to src/main, whichever makes sense) and share it with you. -Rui On Wed, Oct 3, 2018 at 8:15 AM Andrew Pilloud wrote: > The sql module's tests depend on mocks and mocks

Re: [Proposal] Add exception handling option to MapElements

2018-10-03 Thread Jeff Klukas
I don't personally have experience with the Python SDK, so am not immediately in a position to comment on how feasible it would be to introduce a similar change there. I'll plan to write up two separate issues for adding exception handling in the Java and Python SDKs. On Wed, Oct 3, 2018 at 12:17

Re: [PROPOSAL] Prepare Beam 2.8.0 release

2018-10-03 Thread Jean-Baptiste Onofré
+1 but we have to be fast in release process. 2.7.0 took more than 1 month to be cut ! If no blocker, we have to just move forward. Regards JB On 03/10/2018 18:25, Ahmet Altay wrote: > Hi all, > > Release cut date for the next release is 10/10 according to Beam release > calendar [1]. Since

[PROPOSAL] Prepare Beam 2.8.0 release

2018-10-03 Thread Ahmet Altay
Hi all, Release cut date for the next release is 10/10 according to Beam release calendar [1]. Since the previous release is already mostly wrapped up (modulo blog post), I would like to propose starting the next release on time (10/10). Additionally I propose designating this release as the

Re: [Proposal] Add exception handling option to MapElements

2018-10-03 Thread Thomas Weise
+1 for the proposal as well as the suggestion to offer it in other SDKs, where applicable On Wed, Oct 3, 2018 at 8:58 AM Chamikara Jayalath wrote: > Sounds like a very good addition. I'd say this can be a single PR since > changes are related. Please open a JIRA for tracking. > > Have you

Re: [Proposal] Add exception handling option to MapElements

2018-10-03 Thread Chamikara Jayalath
Sounds like a very good addition. I'd say this can be a single PR since the changes are related. Please open a JIRA for tracking. Have you thought about introducing a similar change to the Python SDK? (It doesn't have to be the same PR.) - Cham On Wed, Oct 3, 2018 at 8:31 AM Jeff Klukas wrote: > If this

Re: [apachecon 2018] Universal metrics with apache beam

2018-10-03 Thread Chamikara Jayalath
It was a very interesting talk indeed and was very well delivered. Thanks Etienne. - Cham On Wed, Oct 3, 2018 at 8:16 AM Ted Yu wrote: > Very interesting talk, Etienne. > > Looking forward to the audio recording. > > Cheers >

Re: [Proposal] Add exception handling option to MapElements

2018-10-03 Thread Jeff Klukas
If this looks good for MapElements, I agree that it makes sense to extend to FlatMapElements and Filter and to keep the API consistent between them. Do you have suggestions on how to submit changes with that wider scope? Would one PR altering MapElements, FlatMapElements, Filter, ParseJsons, and

Re: [apachecon 2018] Universal metrics with apache beam

2018-10-03 Thread Ted Yu
Very interesting talk, Etienne. Looking forward to the audio recording. Cheers

Re: Move mock classes out of test directory in BeamSQL

2018-10-03 Thread Andrew Pilloud
The sql module's tests depend on mocks and mocks depend on sql module, so moving this to a separate module creates a weird dependency graph. I don't think it is strictly circular but it comes close. Can we just move the folder from 'src/test' to 'src/main' and mark everything @Experimental?

[apachecon 2018] Universal metrics with apache beam

2018-10-03 Thread Etienne Chauchot
Hi everyone, At ApacheCon 2018 on Sept 26th, I gave a talk on "Universal metrics with Apache Beam". This talk describes the metrics system in Beam, its integration with the runners and how the metrics can be everywhere: extracted to external sinks independent from the chosen runner and also

Re: [Proposal] Add exception handling option to MapElements

2018-10-03 Thread Reuven Lax
Sounds cool. Why not support this on other transforms as well? (FlatMapElements, Filter, etc.) Reuven On Tue, Oct 2, 2018 at 4:49 PM Jeff Klukas wrote: > I've seen a few Beam users mention the need to handle errors in their > transforms by using a try/catch and routing to different outputs
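
The pattern quoted above (catch errors inside a transform and route the failed elements to a separate output) can be sketched without any Beam dependency. This is a plain-Python illustration only, not the API proposed in BEAM-5638 or BEAM-5639; `map_with_failures` is a hypothetical helper name, and Python's `try`/`except` stands in for the `try/catch` mentioned in the thread.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def map_with_failures(
    fn: Callable[[T], R], elements: Iterable[T]
) -> Tuple[List[R], List[Tuple[T, Exception]]]:
    """Apply fn to each element, routing failures to a second output.

    Hypothetical helper for illustration; it is not part of any Beam SDK.
    """
    successes: List[R] = []
    failures: List[Tuple[T, Exception]] = []
    for element in elements:
        try:
            successes.append(fn(element))
        except Exception as exc:
            # Keep the original element with the error so it can be
            # inspected or reprocessed later.
            failures.append((element, exc))
    return successes, failures


parsed, failed = map_with_failures(int, ["1", "x", "3"])
```

In a real pipeline the two lists would presumably correspond to two output collections, one for results and one for failed elements; how that is exposed on MapElements is what the thread is discussing.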

Re: Metrics Pusher support on Dataflow

2018-10-03 Thread Etienne Chauchot
Hi Scott, Thanks for the update. Both solutions look good to me, though each has pluses and minuses. I will let the Googlers choose which is more appropriate: - DAG modification: less intrusive in Dataflow, but the DAG executed and shown in the DAG UI in Dataflow will contain an extra step that the

Re: Move mock classes out of test directory in BeamSQL

2018-10-03 Thread Kai Jiang
Big +1. Best, Kai On Mon, Oct 1, 2018 at 10:42 PM Jean-Baptiste Onofré wrote: > +1 > > it makes sense. > > Regards > JB > > On 02/10/2018 01:32, Rui Wang wrote: > > Hi Community, > > > > BeamSQL defines some mock classes (see mock > > < >

Re: Python 3: final step

2018-10-03 Thread Valentyn Tymofieiev
Hi Rakesh and Manu, Thanks to both of you for offering help (in different threads). It's great to see that more and more people get involved with helping to make Beam Python 3 compatible! There are a few PRs in flight, and several people in the community actively work on Python 3 support now. I

Re: Purpose of GcpApiSurfaceTest in google-cloud-platform SDK

2018-10-03 Thread Ismaël Mejía
Just adding a bit to what Kenn said, this is the JIRA that covers the classpath issue (which seems not to affect you, at least) https://issues.apache.org/jira/browse/BEAM-3748 Note also that the classpath-based resolution is not future-compatible; I saw it break when working on the migration to Java

Re: SplittableDoFn

2018-10-03 Thread Alex Van Boxel
Yes, but we need at least Mongo 4.0 to make it production ready. (I wouldn't let anyone work with anything less because you can't checkpoint.) I'm waiting till our test cluster is 4.0 to continue on this. Alex Van Boxel On Wed, Oct 3, 2018 at 9:43 AM Ismaël Mejía wrote: > Hello Alex,

Re: SplittableDoFn

2018-10-03 Thread Ismaël Mejía
Hello Alex, Thanks for sharing, really interesting quest story, we really need more like this (kudos for the animations too). Are you planning to contribute the continuous SDF-based version of the Mongo connector to Beam upstream (once ready)? On Wed, Oct 3, 2018 at 7:07 AM Jean-Baptiste Onofré