Re: Cross-language pipelines status

2020-02-19 Thread Chamikara Jayalath
To clarify my previous point, I think the transform KafkaIO.Read.TypedWithoutMetadata [1], which produces a KV<K, V> (for example KV<byte[], byte[]> if we use ByteArraySerializer for keys and values), should work in its current form if we don't have a runner-specific override for the source (hence allowing source and the subs

Re: Cross-language pipelines status

2020-02-19 Thread Robert Bradshaw
Ah, yes, registering a RowCoder seems like a fine solution here. (Either that or have a wrapping PTransform that explicitly converts to Rows or similar.) On Wed, Feb 19, 2020 at 10:03 PM Chad Dombrova wrote: > > The Java deps are only half of the problem. The other half is that PubsubIO > and Ka

Re: Cross-language pipelines status

2020-02-19 Thread Chad Dombrova
The Java deps are only half of the problem. The other half is that PubsubIO and KafkaIO are using classes that do not have a python equivalent and thus no universal coder. The solution discussed in the issue I linked above was to use row coder registries in Java, to convert from these types to row

Re: Cross-language pipelines status

2020-02-19 Thread Robert Bradshaw
Hopefully this should be resolved by https://issues.apache.org/jira/browse/BEAM-9229 On Wed, Feb 19, 2020 at 5:52 PM Chad Dombrova wrote: > > We are using external transforms to get access to PubSubIO within python. It > works well, but there is one major issue remaining to fix: we have to bui

Re: Cross-language pipelines status

2020-02-19 Thread Chamikara Jayalath
On Wed, Feb 19, 2020 at 5:52 PM Chad Dombrova wrote: > We are using external transforms to get access to PubSubIO within python. > It works well, but there is one major issue remaining to fix: we have to > build a custom beam with a hack to add the PubSubIO java deps and fix up > the coders. Th

Re: Cross-language pipelines status

2020-02-19 Thread Chad Dombrova
We are using external transforms to get access to PubSubIO within python. It works well, but there is one major issue remaining to fix: we have to build a custom beam with a hack to add the PubSubIO java deps and fix up the coders. This affects KafkaIO as well. There's an issue here: https://iss

Re: Cross-language pipelines status

2020-02-12 Thread Chamikara Jayalath
On Wed, Feb 12, 2020 at 8:10 AM Alexey Romanenko wrote: > > AFAIK, there's no official guide for cross-language pipelines. But there >> are examples and test cases you can use as reference such as: >> >> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/wordcount_xlang.p

Re: Cross-language pipelines status

2020-02-12 Thread Alexey Romanenko
> AFAIK, there's no official guide for cross-language pipelines. But there are > examples and test cases you can use as reference such as: > https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/wordcount_xlang.py > >

Re: Cross-language pipelines status

2020-02-12 Thread Alexey Romanenko
Thank you for the response! > AFAIK, there's no official guide for cross-language pipelines. But there are > examples and test cases you can use as reference such as: > https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/wordcount_xlang.py > >

Re: Cross-language pipelines status

2020-02-11 Thread Chamikara Jayalath
On Tue, Feb 11, 2020 at 11:13 AM Heejong Lee wrote: > > > On Tue, Feb 11, 2020 at 9:37 AM Alexey Romanenko > wrote: > >> Hi all, >> >> I just wanted to ask for more details about the status of cross-language >> pipelines (rather, transforms). I see some discussions about that here, but >> I thin

Re: Cross-language pipelines status

2020-02-11 Thread Heejong Lee
On Tue, Feb 11, 2020 at 9:37 AM Alexey Romanenko wrote: > Hi all, > > I just wanted to ask for more details about the status of cross-language > pipelines (rather, transforms). I see some discussions about that here, but > I think it’s more around cross-language IOs. > > I’d appreciate any i

Re: Cross-language pipelines

2019-01-24 Thread Robert Bradshaw
On Fri, Jan 25, 2019 at 12:18 AM Reuven Lax wrote: > > On Thu, Jan 24, 2019 at 2:38 PM Robert Bradshaw wrote: >> >> On Thu, Jan 24, 2019 at 6:43 PM Reuven Lax wrote: >> > >> > Keep in mind that these user-supplied lambdas are commonly used in our >> > IOs. One common usage is in Sink IOs, to al

Re: Cross-language pipelines

2019-01-24 Thread Reuven Lax
On Thu, Jan 24, 2019 at 2:38 PM Robert Bradshaw wrote: > On Thu, Jan 24, 2019 at 6:43 PM Reuven Lax wrote: > > > > Keep in mind that these user-supplied lambdas are commonly used in our > IOs. One common usage is in Sink IOs, to allow dynamic destinations. e.g. > in BigQueryIO.Write, a user-supp

Re: Cross-language pipelines

2019-01-24 Thread Robert Bradshaw
On Thu, Jan 24, 2019 at 6:43 PM Reuven Lax wrote: > > Keep in mind that these user-supplied lambdas are commonly used in our IOs. > One common usage is in Sink IOs, to allow dynamic destinations. e.g. in > BigQueryIO.Write, a user-supplied lambda determines what table a record > should be writt

Re: Cross-language pipelines

2019-01-24 Thread Reuven Lax
Keep in mind that these user-supplied lambdas are commonly used in our IOs. One common usage is in Sink IOs, to allow dynamic destinations. e.g. in BigQueryIO.Write, a user-supplied lambda determines what table a record should be written to. Given that IOs are one of the big selling points of cros

Re: Cross-language pipelines

2019-01-24 Thread Robert Bradshaw
On Thu, Jan 24, 2019 at 5:08 PM Thomas Weise wrote: > > Exciting to see the cross-language train gathering steam :) > > It may be useful to flesh out the user facing aspects a bit more before going > too deep on the service / expansion side or maybe that was done elsewhere? It's been discussed,

Re: Cross-language pipelines

2019-01-24 Thread Thomas Weise
Exciting to see the cross-language train gathering steam :) It may be useful to flesh out the user facing aspects a bit more before going too deep on the service / expansion side or maybe that was done elsewhere? A few examples (of varying complexity) of how the shim/proxy transforms would look l

Re: Cross-language pipelines

2019-01-23 Thread Chamikara Jayalath
On Wed, Jan 23, 2019 at 1:03 PM Robert Bradshaw wrote: > On Wed, Jan 23, 2019 at 6:38 PM Maximilian Michels wrote: > > > > Thank you for starting on the cross-language feature Robert! > > > > Just to recap: Each SDK runs an ExpansionService which can be contacted > during > > pipeline translatio

Re: Cross-language pipelines

2019-01-23 Thread Robert Bradshaw
On Wed, Jan 23, 2019 at 6:38 PM Maximilian Michels wrote: > > Thank you for starting on the cross-language feature Robert! > > Just to recap: Each SDK runs an ExpansionService which can be contacted during > pipeline translation to expand transforms that are unknown to the SDK. The > service retur

Re: Cross-language pipelines

2019-01-23 Thread Maximilian Michels
Thank you for starting on the cross-language feature Robert! Just to recap: Each SDK runs an ExpansionService which can be contacted during pipeline translation to expand transforms that are unknown to the SDK. The service returns the Proto definitions to the querying process. There will be m

Re: Cross-language pipelines

2019-01-23 Thread Robert Bradshaw
No, this PR simply takes an endpoint address as a parameter, expecting it to already be up and available. More convenient APIs, e.g. ones that spin up an endpoint and tear it down, or catalog and locate code and services offering these endpoints, could be provided as wrappers on top of or extensio
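
For readers skimming the archive, here is a rough sketch of what such a convenience wrapper might look like. It is not code from the PR and does not use any Beam API: `temporary_endpoint`, `call_endpoint`, and the echo handler are hypothetical names, and the echo server only stands in for a real expansion service. The only idea taken from the message above is that the core API accepts an address, and a wrapper can own the start-up and tear-down around it.

```python
import contextlib
import socket
import socketserver
import threading


class _EchoHandler(socketserver.BaseRequestHandler):
    """Hypothetical stand-in for an expansion service: echoes one request back."""

    def handle(self):
        data = self.request.recv(1024)
        self.request.sendall(data)


@contextlib.contextmanager
def temporary_endpoint(handler=_EchoHandler):
    """Start a local service on a free port, yield its 'host:port', then stop it."""
    # Port 0 asks the OS for any free port.
    server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        host, port = server.server_address
        yield f"{host}:{port}"
    finally:
        # Tear-down: stop the serve loop, close the socket, wait for the thread.
        server.shutdown()
        server.server_close()
        thread.join()


def call_endpoint(address, message):
    """Hypothetical client: like the API described above, it only needs an address."""
    host, port = address.rsplit(":", 1)
    with socket.create_connection((host, int(port)), timeout=5) as conn:
        conn.sendall(message)
        return conn.recv(1024)
```

Usage would be along the lines of `with temporary_endpoint() as address: call_endpoint(address, b"ping")`. The client code stays the same whether the address comes from this wrapper or from a service that was already running.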

Re: Cross-language pipelines

2019-01-22 Thread Kenneth Knowles
Nice! If I recall correctly, there was mostly concern about how to launch and manage the expansion service (Docker? Vendor-specific? Etc). Does this PR take a position on that question? Kenn On Tue, Jan 22, 2019 at 1:44 PM Chamikara Jayalath wrote: > > > On Tue, Jan 22, 2019 at 11:35 AM Udi Meiri w

Re: Cross-language pipelines

2019-01-22 Thread Chamikara Jayalath
On Tue, Jan 22, 2019 at 11:35 AM Udi Meiri wrote: > Also debuggability: collecting logs from each of these systems. > Agree. > > On Tue, Jan 22, 2019 at 10:53 AM Chamikara Jayalath > wrote: > >> Thanks Robert. >> >> On Tue, Jan 22, 2019 at 4:39 AM Robert Bradshaw >> wrote: >> >>> Now that we

Re: Cross-language pipelines

2019-01-22 Thread Udi Meiri
Also debuggability: collecting logs from each of these systems. On Tue, Jan 22, 2019 at 10:53 AM Chamikara Jayalath wrote: > Thanks Robert. > > On Tue, Jan 22, 2019 at 4:39 AM Robert Bradshaw > wrote: > >> Now that we have the FnAPI, I started playing around with support for >> cross-language pi

Re: Cross-language pipelines

2019-01-22 Thread Chamikara Jayalath
Thanks Robert. On Tue, Jan 22, 2019 at 4:39 AM Robert Bradshaw wrote: > Now that we have the FnAPI, I started playing around with support for > cross-language pipelines. This will allow things like IOs to be shared > across all languages, SQL to be invoked from non-Java, TFX tensorflow > transfo