Re: Thoughts on a reference runner to invest in?

2019-02-14 Thread Ismaël Mejía
This is a really interesting and important discussion. Having multiple reference runners has its pros and cons; it is all about tradeoffs. From the end user's point of view it can feel weird to deal with the tools and packaging of a different ecosystem, e.g. Python devs dealing with all the quirkine

Re: Thoughts on a reference runner to invest in?

2019-02-14 Thread Robert Bradshaw
I think it's good to distinguish between direct runners (which would be good to have in every language, and can grow in sophistication with the userbase) and a fully universal reference runner. We should of course continue to grow and maintain the java-runners-core shared library, possibly as drive

Re: Beam Python streaming pipeline on Flink Runner

2019-02-14 Thread Maximilian Michels
I've revised the document and included your feedback: https://s.apache.org/beam-cross-language-io I think it reads much better now. I moved away from the JSON configuration in favor of an explicit Proto-based configuration approach which leaves it up to the transform what to include in the Pro

[BEAM-6671] Possible dependency issue in 2.9.0 NoSuchFieldError

2019-02-14 Thread Alex Amato
Filed this: https://issues.apache.org/jira/browse/BEAM-6671 I received a report from a Dataflow user encountering this in Beam 2.9.0 when creating a Spanner instance. I wanted to post this here because this kind of error has been related to dependency conflicts in the past (https://stackoverflow.com/questions

Re: pipeline steps

2019-02-14 Thread Yi Pan
@Chamikara, if adding the metadata interface class is too much effort right now, I would accept the solution with some special PTransform method that adds the metadata to the output data types. What I wonder is: if this kind of PTransform becomes popular across many different BeamIOs, I may as w

Signing off

2019-02-14 Thread Scott Wegner
I wanted to let you all know that I've decided to pursue a new adventure in my career, which will take me away from Apache Beam development. It's been a fun and fulfilling journey. Apache Beam has been my first significant experience working in open source. I'm inspired observing how the community

[SQL] External schema providers

2019-02-14 Thread Anton Kedin
Hi dev@, A quick update about a new Beam SQL feature. In short, we have wired up support for plugging table providers into the Beam SQL API, which allows obtaining table schemas from external sources. *What does it even mean?* Previously, in Java pipelines, you could apply a Beam SQL query to exi

Re: Signing off

2019-02-14 Thread Tim Robertson
What a shame for the project but best of luck for the future Scott. Thanks for all your contributions - they have been significant! Tim On Thu, Feb 14, 2019 at 7:37 PM Scott Wegner wrote: > I wanted to let you all know that I've decided to pursue a new adventure > in my career, which will take

Re: [SQL] External schema providers

2019-02-14 Thread Rui Wang
Thanks Anton for contributing it! It's big progress that BeamSQL can connect to the Hive metastore! The HCatalogTableProvider implementation is also a good reference for people who want to implement a table provider for their metastore services. Just adding another design discussion that I am aware of: F

Re: Signing off

2019-02-14 Thread Thomas Weise
Hi Scott, Thank you for the many contributions to Beam and best of luck with the new endeavor! Thomas On Thu, Feb 14, 2019 at 10:37 AM Scott Wegner wrote: > I wanted to let you all know that I've decided to pursue a new adventure > in my career, which will take me away from Apache Beam develo

Re: [PROPOSAL] Prepare Beam 2.11.0 release

2019-02-14 Thread Ahmet Altay
Update: Post-commit tests were mostly green and I cut the branch earlier today. There are 27 issues tagged for this release [1]; I will start triaging them. I would appreciate it if you could start moving non-release-blocking issues off this list. [1] https://issues.apache.org/jira/projects/BEAM/

Re: Signing off

2019-02-14 Thread Kenneth Knowles
+1 Thanks for the contributions to community & code, and enjoy the new chapter! Kenn On Thu, Feb 14, 2019 at 3:25 PM Thomas Weise wrote: > Hi Scott, > > Thank you for the many contributions to Beam and best of luck with the new > endeavor! > > Thomas > > > On Thu, Feb 14, 2019 at 10:37 AM Scot

Re: [SQL] External schema providers

2019-02-14 Thread Kenneth Knowles
This is great. Being able to simply query existing schema-fied data is such a huge win. Kenn On Thu, Feb 14, 2019 at 12:30 PM Rui Wang wrote: > Thanks Anton for contributing it! It's big progress that BeamSQL can > connect to the Hive metastore! The HCatalogTableProvider implementation is al

Re: Thoughts on a reference runner to invest in?

2019-02-14 Thread Kenneth Knowles
Interesting point about community and the fact that it didn't build a Java-based ULR even though it has been a possibility for a long time. It makes sense to me. A non-Java SDK needs portability to run on Beam's distributed runners, so building the portable SDK harness is key, unlike for Java. And

Hazelcast Jet Runner

2019-02-14 Thread Can Gencer
We at Hazelcast are looking into writing a Beam runner for Hazelcast Jet (https://github.com/hazelcast/hazelcast-jet). I wanted to introduce myself as we'll likely have questions as we start development. Some of the things I'm wondering about currently: * Currently there seems to be a guide avai