Re: Cross-language pipelines status

2020-02-19 Thread Chamikara Jayalath
here: https://issues.apache.org/jira/browse/BEAM-7870 >>>> I consider this to be the most pressing problem with external transforms right now. >>>> -chad >>>> On Wed, Feb 12

Re: Cross-language pipelines status

2020-02-19 Thread Robert Bradshaw
>>> On Wed, Feb 12, 2020 at 9:28 AM Chamikara Jayalath wrote: >>>> On Wed, Feb 12, 2020 at 8:10 AM Alexey Romanenko wrote:

Re: Cross-language pipelines status

2020-02-19 Thread Chad Dombrova
>>> On Wed, Feb 12, 2020 at 8:10 AM Alexey Romanenko <aromanenko@gmail.com> wrote: >>>>> AFAIK, there's no official guide for cross-language pipelines. But there are examples and test c

Re: Cross-language pipelines status

2020-02-19 Thread Robert Bradshaw
forms right now. > -chad > On Wed, Feb 12, 2020 at 9:28 AM Chamikara Jayalath wrote: >> On Wed, Feb 12, 2020 at 8:10 AM Alexey Romanenko wrote: >>>> AFAIK, there's no off

Re: Cross-language pipelines status

2020-02-19 Thread Chamikara Jayalath
AM Alexey Romanenko <aromanenko....@gmail.com> wrote: >>> AFAIK, there's no official guide for cross-language pipelines. But there are examples and test cases you can use as reference such as: >>>> h

Re: Cross-language pipelines status

2020-02-19 Thread Chad Dombrova
AFAIK, there's no official guide for cross-language pipelines. But there are examples and test cases you can use as reference such as: https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/wordcount_xlang.py

Re: Cross-language pipelines status

2020-02-12 Thread Chamikara Jayalath
On Wed, Feb 12, 2020 at 8:10 AM Alexey Romanenko wrote: >> AFAIK, there's no official guide for cross-language pipelines. But there are examples and test cases you can use as reference such as: >> https://github.com/apache/beam/blob/master/sdk

Re: Cross-language pipelines status

2020-02-12 Thread Alexey Romanenko
> AFAIK, there's no official guide for cross-language pipelines. But there are examples and test cases you can use as reference such as: > https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/wordcount_xlang.py > <https://github.com/apach

Re: Cross-language pipelines status

2020-02-12 Thread Alexey Romanenko
Thank you for the response! > AFAIK, there's no official guide for cross-language pipelines. But there are examples and test cases you can use as reference such as: > https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/wordcount_xlang.py > <ht

Re: Cross-language pipelines status

2020-02-11 Thread Chamikara Jayalath
On Tue, Feb 11, 2020 at 11:13 AM Heejong Lee wrote: > On Tue, Feb 11, 2020 at 9:37 AM Alexey Romanenko wrote: >> Hi all, >> I just wanted to ask for more details about the status of cross-language pipelines (rather, transforms). I see som

Re: Cross-language pipelines status

2020-02-11 Thread Heejong Lee
On Tue, Feb 11, 2020 at 9:37 AM Alexey Romanenko wrote: > Hi all, > I just wanted to ask for more details about the status of cross-language pipelines (rather, transforms). I see some discussions about that here, but I think it’s more around cross-language IOs. > I’

Cross-language pipelines status

2020-02-11 Thread Alexey Romanenko
Hi all, I just wanted to ask for more details about the status of cross-language pipelines (rather, transforms). I see some discussions about that here, but I think it’s more around cross-language IOs. I’d appreciate any information on that topic and answers to these questions: - Are

Re: Artifact staging in cross-language pipelines

2019-12-17 Thread Robert Bradshaw
>>> I don't think so. It would be great to push this forward. >>> Thanks, >>> Max >>> On 26.11.19 02:49, Heejong Lee wrote: >>>> Hi, >>>> Is anyone actively working on artifact staging extension

Re: Artifact staging in cross-language pipelines

2019-12-17 Thread Heejong Lee
> On Tue, Nov 26, 2019 at 3:54 AM Maximilian Michels wrote: >> Hey Heejong, >> I don't think so. It would be great to push this forward. >> Thanks, >> Max >> On 26.11.19 02:49, Heejong Lee wrote: >>> Hi,

Re: Artifact staging in cross-language pipelines

2019-12-12 Thread Heejong Lee
v 26, 2019 at 3:54 AM Maximilian Michels wrote: > Hey Heejong, > I don't think so. It would be great to push this forward. > Thanks, > Max > On 26.11.19 02:49, Heejong Lee wrote: >> Hi, >> Is anyone actively working on artifact staging e

Re: Artifact staging in cross-language pipelines

2019-11-26 Thread Maximilian Michels
Hey Heejong, I don't think so. It would be great to push this forward. Thanks, Max On 26.11.19 02:49, Heejong Lee wrote: Hi, Is anyone actively working on artifact staging extension for cross-language pipelines? I'm thinking I can contribute to it in coming Dec. If anyone has an

Re: Artifact staging in cross-language pipelines

2019-11-25 Thread Heejong Lee
Hi, Is anyone actively working on artifact staging extension for cross-language pipelines? I'm thinking I can contribute to it in coming Dec. If anyone has any progress on this and needs help, please let me know. Thanks, On Wed, Jun 12, 2019 at 2:42 AM Ismaël Mejía wrote: > Can you pl

Re: Artifact staging in cross-language pipelines

2019-06-12 Thread Ismaël Mejía
Can you please add this to the design documents webpage? https://beam.apache.org/contribute/design-documents/ On Wed, May 8, 2019 at 7:29 PM Chamikara Jayalath wrote: > On Tue, May 7, 2019 at 10:21 AM Maximilian Michels wrote: >> Here's the first draft: >> https://docs.google.com/docume

Re: Artifact staging in cross-language pipelines

2019-05-08 Thread Chamikara Jayalath
On Tue, May 7, 2019 at 10:21 AM Maximilian Michels wrote: > Here's the first draft: > https://docs.google.com/document/d/1XaiNekAY2sptuQRIXpjGAyaYdSc-wlJ-VKjl04c8N48/edit?usp=sharing > It's rather high-level. We may want to add more details once we have finalized the design. Feel free to ma

Re: Artifact staging in cross-language pipelines

2019-05-07 Thread Maximilian Michels
Here's the first draft: https://docs.google.com/document/d/1XaiNekAY2sptuQRIXpjGAyaYdSc-wlJ-VKjl04c8N48/edit?usp=sharing It's rather high-level. We may want to add more details once we have finalized the design. Feel free to make comments and edits. All of this goes back to the idea that I t

Re: Artifact staging in cross-language pipelines

2019-05-07 Thread Robert Bradshaw
Looking forward to your writeup, Max. In the meantime, some comments below. From: Lukasz Cwik Date: Thu, May 2, 2019 at 6:45 PM To: dev > On Thu, May 2, 2019 at 7:20 AM Robert Bradshaw wrote: >> On Sat, Apr 27, 2019 at 1:14 AM Lukasz Cwik wrote: >>> We should stick with URN + pay

Re: Artifact staging in cross-language pipelines

2019-05-02 Thread Maximilian Michels
BTW what are the next steps here? Heejong or Max, will one of you be able to come up with a detailed proposal around this? Thank you for all the additional comments and ideas. I will try to capture them in a document and share it here. Of course we can continue the discussion in the meantim

Re: Artifact staging in cross-language pipelines

2019-05-02 Thread Lukasz Cwik
On Thu, May 2, 2019 at 7:20 AM Robert Bradshaw wrote: > On Sat, Apr 27, 2019 at 1:14 AM Lukasz Cwik wrote: >> We should stick with URN + payload + artifact metadata[1] where the only mandatory one that all SDKs and expansion services understand is the "bytes" artifact type. This allows

Re: Artifact staging in cross-language pipelines

2019-05-02 Thread Robert Bradshaw
On Sat, Apr 27, 2019 at 1:14 AM Lukasz Cwik wrote: > We should stick with URN + payload + artifact metadata[1] where the only mandatory one that all SDKs and expansion services understand is the "bytes" artifact type. This allows us to add optional URNs for file://, http://, Maven, PyPi

Re: Artifact staging in cross-language pipelines

2019-04-30 Thread Lukasz Cwik
Agree on adding the 5.5 and the resolution of conflicts/duplicates could be done by either the runner or the artifact staging service. On Tue, Apr 30, 2019 at 10:03 AM Chamikara Jayalath wrote: > On Fri, Apr 26, 2019 at 4:14 PM Lukasz Cwik wrote: >> We should stick with URN + payload + arti

Re: Artifact staging in cross-language pipelines

2019-04-30 Thread Chamikara Jayalath
On Fri, Apr 26, 2019 at 4:14 PM Lukasz Cwik wrote: > We should stick with URN + payload + artifact metadata[1] where the only mandatory one that all SDKs and expansion services understand is the "bytes" artifact type. This allows us to add optional URNs for file://, http://, Maven, PyPi, ..

Re: Artifact staging in cross-language pipelines

2019-04-26 Thread Lukasz Cwik
We should stick with URN + payload + artifact metadata[1] where the only mandatory one that all SDKs and expansion services understand is the "bytes" artifact type. This allows us to add optional URNs for file://, http://, Maven, PyPi, ... in the future. I would make the artifact staging service us

Re: Artifact staging in cross-language pipelines

2019-04-24 Thread Robert Bradshaw
On Wed, Apr 24, 2019 at 12:21 PM Maximilian Michels wrote: > Good idea to let the client expose an artifact staging service that the ExpansionService could use to stage artifacts. This solves two problems: > (1) The Expansion Service not being able to access the Job Server artifact staging

Re: Artifact staging in cross-language pipelines

2019-04-24 Thread Maximilian Michels
s the environment that all runners should support is that containers provides a solution for

Re: Artifact staging in cross-language pipelines

2019-04-23 Thread Heejong Lee
ry artifacts). For the existing "external" environment, it should already come with all the

Re: Artifact staging in cross-language pipelines

2019-04-23 Thread Robert Bradshaw
y contain all the prepackaged resources. Note that both "external" and "process" will require the

Re: Artifact staging in cross-language pipelines

2019-04-22 Thread Thomas Weise
limiting and expanding their capabilities will quickly have us building something like a docker

Re: Artifact staging in cross-language pipelines

2019-04-22 Thread Ankur Goenka
also a design document [2]. Subsequently, we've added wrappers for cross-language transforms to the

Re: Artifact staging in cross-language pipelines

2019-04-22 Thread Maximilian Michels
that is a better solution than adding required Jars to the SDK Harness directly, but it is not very convenient for users.

Re: Artifact staging in cross-language pipelines

2019-04-19 Thread Chamikara Jayalath
necessary files. For my PR [3] I've naively added ":beam-sdks-java-io-kafka" to the SDK Harness which caused dependency

Re: Artifact staging in cross-language pipelines

2019-04-19 Thread Maximilian Michels
e entire classpath like we do in PipelineResources for Java pipelines. This provides many unneeded classes but would work. Do you think i

Re: Artifact staging in cross-language pipelines

2019-04-18 Thread Thomas Weise
e environments which is why the default for the expansion service should be the "docker" environment. Note that a major reason f

Re: Artifact staging in cross-language pipelines

2019-04-18 Thread Chamikara Jayalath
cker container because we'll quickly find ourselves solving the same problems that docker containers provide (resources, file layout, permissions, ...)

Re: Artifact staging in cross-language pipelines

2019-04-18 Thread Chamikara Jayalath
We have previously merged support for configuring transforms across languages. Please see Cham's summary on the discussion [1]. There is also a design document [2]. Subseque

Re: Artifact staging in cross-language pipelines

2019-04-18 Thread Lukasz Cwik
Subsequently, we've added wrappers for cross-language transforms to the Python SDK, i.e. GenerateSequence, ReadFromKafka, and there is a pending PR [1] for WriteToKafka. All of them ut

Re: Artifact staging in cross-language pipelines

2019-04-18 Thread Ankur Goenka
s all pretty exciting :) We still have some issues to solve, one being how to stage artifacts from a foreign environment. When we run external transforms which are part of

Re: Artifact staging in cross-language pipelines

2019-04-18 Thread Chamikara Jayalath
t;>> necessary files. >>>> >>>> For my PR [3] I've naively added ":beam-sdks-java-io-kafka" to the SDK >>>> Harness which caused dependency problems [4]. Those could be resolved >>>> but the bigger question is how to sta

Re: Artifact staging in cross-language pipelines

2019-04-18 Thread Lukasz Cwik
gt;>> >>> Heejong has solved this by adding a "--jar_package" option to the Python >>> SDK to stage Java files [5]. I think that is a better solution than >>> adding required Jars to the SDK Harness directly, but it is not very >>

Re: Artifact staging in cross-language pipelines

2019-04-18 Thread Chamikara Jayalath
nd we both figured that the expansion service needs to provide a list of required Jars with the ExpansionResponse it provides. It's not entirely clear how we determine which artifacts are necessary for an external transform. We could just dump the entire

Re: Artifact staging in cross-language pipelines

2019-04-18 Thread Lukasz Cwik
of required Jars with the ExpansionResponse it provides. It's not entirely clear how we determine which artifacts are necessary for an external transform. We could just dump the entire classpath like we do in PipelineResources for Java pipelines. This provides many unneeded c

Artifact staging in cross-language pipelines

2019-04-18 Thread Maximilian Michels
Perhaps you have a better idea how to resolve the staging problem in cross-language pipelines? Thanks, Max [1] https://lists.apache.org/thread.html/b99ba8527422e31ec7bb7ad9dc3a6583551ea392ebdc5527b5fb4a67@%3Cdev.beam.apache.org%3E [2] https://s.apache.org/beam-cross-language-io [3] https://gith

Re: Cross-language pipelines

2019-01-24 Thread Robert Bradshaw
>>>>>> On Wed, Jan 23, 2019 at 1:03 PM Robert Bradshaw wrote: >>>>>>> On Wed, Jan 23, 2019 at 6:38 PM Maximilian Michels wrote:

Re: Cross-language pipelines

2019-01-24 Thread Reuven Lax
Robert Bradshaw wrote: >>>>>> On Wed, Jan 23, 2019 at 6:38 PM Maximilian Michels wrote: >>>>>>> Thank you for starting on the cross-language feature Robert!

Re: Cross-language pipelines

2019-01-24 Thread Robert Bradshaw
Maximilian Michels wrote: >>>>>> Thank you for starting on the cross-language feature Robert! >>>>>> Just to recap: Each SDK runs an ExpansionService which can be

Re: Cross-language pipelines

2019-01-24 Thread Reuven Lax
>>>>> Just to recap: Each SDK runs an ExpansionService which can be contacted during pipeline translation to expand transforms that are unknown to the SDK. The service returns the Proto definitions to the quer

Re: Cross-language pipelines

2019-01-24 Thread Robert Bradshaw
>>> Yep. Technically it doesn't have to be the SDK, or even if it is there may be a variety of services (e.g. one offering SQL, one offering different IOs). >>>> There will be multiple environments such that during execution

Re: Cross-language pipelines

2019-01-24 Thread Thomas Weise
vice returns the Proto definitions to the querying process. >> Yep. Technically it doesn't have to be the SDK, or even if it is there may be a variety of services (e.g. one offering SQL, one offering different IOs). >>> There will be multiple e

Re: Cross-language pipelines

2019-01-23 Thread Chamikara Jayalath
ne offering SQL, one offering different IOs). >> There will be multiple environments such that during execution cross-language pipelines select the appropriate environment for a transform. > Exactly. And fuses only those steps with compatible environm

Re: Cross-language pipelines

2019-01-23 Thread Robert Bradshaw
. The > service returns the Proto definitions to the querying process. Yep. Technically it doesn't have to be the SDK, or even if it is there may be a variety of services (e.g. one offering SQL, one offering different IOs). > There will be multiple environments such that during execution

Re: Cross-language pipelines

2019-01-23 Thread Maximilian Michels
multiple environments such that during execution cross-language pipelines select the appropriate environment for a transform. It's not clear to me, should the expansion happen during pipeline construction or during translation by the Runner? Thanks, Max On 23.01.19 04:12, Robert Bra

Re: Cross-language pipelines

2019-01-23 Thread Robert Bradshaw
>>> On Tue, Jan 22, 2019 at 10:53 AM Chamikara Jayalath wrote: >>>> Thanks Robert. >>>> On Tue, Jan 22, 2019 at 4:39 AM Robert Bradshaw wrote: >>>>> Now

Re: Cross-language pipelines

2019-01-22 Thread Kenneth Knowles
9 at 4:39 AM Robert Bradshaw wrote: >>>> Now that we have the FnAPI, I started playing around with support for cross-language pipelines. This will allow things like IOs to be shared across all languages, SQL to be invoked

Re: Cross-language pipelines

2019-01-22 Thread Chamikara Jayalath
adshaw wrote: >>> Now that we have the FnAPI, I started playing around with support for cross-language pipelines. This will allow things like IOs to be shared across all languages, SQL to be invoked from non-Java, TFX tensorflow

Re: Cross-language pipelines

2019-01-22 Thread Udi Meiri
Also debuggability: collecting logs from each of these systems. On Tue, Jan 22, 2019 at 10:53 AM Chamikara Jayalath wrote: > Thanks Robert. > On Tue, Jan 22, 2019 at 4:39 AM Robert Bradshaw wrote: >> Now that we have the FnAPI, I started playing around with support

Re: Cross-language pipelines

2019-01-22 Thread Chamikara Jayalath
Thanks Robert. On Tue, Jan 22, 2019 at 4:39 AM Robert Bradshaw wrote: > Now that we have the FnAPI, I started playing around with support for cross-language pipelines. This will allow things like IOs to be shared across all languages, SQL to be invoked from non-Java, TFX t

Cross-language pipelines

2019-01-22 Thread Robert Bradshaw
Now that we have the FnAPI, I started playing around with support for cross-language pipelines. This will allow things like IOs to be shared across all languages, SQL to be invoked from non-Java, TFX tensorflow transforms to be invoked from non-Python, etc. and I think is the next step in