here:
https://issues.apache.org/jira/browse/BEAM-7870

I consider this to be the most pressing problem with external transforms
right now.

-chad
> >> > On Wed, Feb 12
t; > On Wed, Feb 12, 2020 at 9:28 AM Chamikara Jayalath
>> > wrote:
>> >>
>> >>
>> >>
>> >> On Wed, Feb 12, 2020 at 8:10 AM Alexey Romanenko
>> >> wrote:
>> >>>
>> >>>
>> >>>&g
>
> >>
> >>
> >> On Wed, Feb 12, 2020 at 8:10 AM Alexey Romanenko <
> aromanenko@gmail.com> wrote:
> >>>
> >>>
On Wed, Feb 12, 2020 at 8:10 AM Alexey Romanenko wrote:

Thank you for the response!

> AFAIK, there's no official guide for cross-language pipelines. But there are
> examples and test cases you can use as reference such as:
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/wordcount_xlang.py
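For readers skimming the thread, the handshake that examples like wordcount_xlang.py build on can be modeled in a few lines of plain Python. This is a sketch of the idea only, not the Beam API: the class names, the registry, and the environment string are invented for illustration (the URN mirrors the one used in the example).

```python
# Plain-Python model of the cross-language expansion handshake: the pipeline
# author references a transform by URN, and an expansion service returns the
# expanded sub-transforms plus the environment they must execute in.
# All names here are illustrative stand-ins, not Beam internals.
from dataclasses import dataclass

@dataclass
class ExpansionRequest:
    urn: str        # identifies the external transform to expand
    payload: bytes  # language-agnostic configuration for the transform

@dataclass
class ExpansionResponse:
    subtransforms: list  # ids of the expanded sub-transforms
    environment: str     # where the expanded transforms must execute

class FakeExpansionService:
    """Stands in for a real expansion service reachable over gRPC."""
    REGISTRY = {
        'beam:transforms:xlang:count': (
            ['Count/KeyWithVoid', 'Count/CombinePerKey'],
            'docker:java-sdk-harness'),
    }

    def expand(self, request):
        subtransforms, env = self.REGISTRY[request.urn]
        return ExpansionResponse(list(subtransforms), env)

service = FakeExpansionService()
response = service.expand(ExpansionRequest('beam:transforms:xlang:count', b''))
```

The point of the model: the calling SDK never needs to understand the foreign transform, only the URN, the opaque payload, and the environment attached to the response.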
On Tue, Feb 11, 2020 at 11:13 AM Heejong Lee wrote:

On Tue, Feb 11, 2020 at 9:37 AM Alexey Romanenko wrote:
Hi all,

I just wanted to ask for more details about the status of cross-language
pipelines (rather, transforms). I see some discussions about that here, but I
think it's more around cross-language IOs.

I'd appreciate any information about that topic and answers to these
questions:
- Are
On Tue, Nov 26, 2019 at 3:54 AM Maximilian Michels wrote:
Hey Heejong,

I don't think so. It would be great to push this forward.

Thanks,
Max

On 26.11.19 02:49, Heejong Lee wrote:

Hi,

Is anyone actively working on artifact staging extension for cross-language
pipelines? I'm thinking I can contribute to it in coming Dec. If anyone has
any progress on this and needs help, please let me know.

Thanks,
On Wed, Jun 12, 2019 at 2:42 AM Ismaël Mejía wrote:
Can you please add this to the design documents webpage.
https://beam.apache.org/contribute/design-documents/
On Wed, May 8, 2019 at 7:29 PM Chamikara Jayalath wrote:
On Tue, May 7, 2019 at 10:21 AM Maximilian Michels wrote:

Here's the first draft:
https://docs.google.com/document/d/1XaiNekAY2sptuQRIXpjGAyaYdSc-wlJ-VKjl04c8N48/edit?usp=sharing

It's rather high-level. We may want to add more details once we have
finalized the design. Feel free to make comments and edits.
All of this goes back to the idea that I t
Looking forward to your writeup, Max. In the meantime, some comments below.
From: Lukasz Cwik
Date: Thu, May 2, 2019 at 6:45 PM
To: dev
BTW, what are the next steps here? Heejong or Max, will one of you be able to
come up with a detailed proposal around this?

Thank you for all the additional comments and ideas. I will try to
capture them in a document and share it here. Of course we can continue
the discussion in the meantime.
On Thu, May 2, 2019 at 7:20 AM Robert Bradshaw wrote:

On Sat, Apr 27, 2019 at 1:14 AM Lukasz Cwik wrote:
Agree on adding the 5.5, and the resolution of conflicts/duplicates could be
done by either the runner or the artifact staging service.

On Tue, Apr 30, 2019 at 10:03 AM Chamikara Jayalath wrote:
On Fri, Apr 26, 2019 at 4:14 PM Lukasz Cwik wrote:

We should stick with URN + payload + artifact metadata[1] where the only
mandatory one that all SDKs and expansion services understand is the
"bytes" artifact type. This allows us to add optional URNs for file://,
http://, Maven, PyPi, ... in the future. I would make the artifact staging
service us[...]
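The scheme described above can be sketched as follows. This is a hypothetical model, not Beam's actual protos or URNs: every artifact carries a type URN plus an opaque payload, only a "bytes" type is mandatory everywhere, and richer types (file://, http://, Maven, PyPi, ...) stay optional capabilities.

```python
# Toy model of "URN + payload" artifact metadata with one mandatory type.
# The URNs and field names are illustrative, not the real Beam protos.
from dataclasses import dataclass

@dataclass
class ArtifactMetadata:
    type_urn: str   # names how the payload should be interpreted
    payload: bytes  # opaque to SDKs that only know the type URN

# The single type every SDK and expansion service must understand:
MANDATORY_BYTES = 'artifact:type:bytes'

def retrieve(artifact, optional_types=frozenset({'artifact:type:file'})):
    """Fetch an artifact's contents; only the 'bytes' type is guaranteed."""
    if artifact.type_urn == MANDATORY_BYTES:
        return artifact.payload  # contents are carried inline
    if artifact.type_urn in optional_types:
        # A real service would resolve file://, http://, Maven, PyPi, ...
        raise NotImplementedError('optional type not implemented in sketch')
    raise ValueError(f'unsupported artifact type: {artifact.type_urn}')

data = retrieve(ArtifactMetadata(MANDATORY_BYTES, b'jar-bytes'))
```

The design choice being modeled: a lowest common denominator ("bytes") keeps every SDK interoperable, while optional URNs let services negotiate cheaper transports later without breaking old clients.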
On Wed, Apr 24, 2019 at 12:21 PM Maximilian Michels wrote:

Good idea to let the client expose an artifact staging service that the
ExpansionService could use to stage artifacts. This solves two problems:

(1) The Expansion Service not being able to access the Job Server
artifact staging [...]
[...] the environment that all runners should support is that containers
provides a solution for [...]

[...]ry artifacts).

For the existing "external" environment, it should already come with all
the [...] contain all the prepackaged resources. Note that both "external"
and "process" will require the [...]

[...] limiting and expanding their capabilities will quickly have us
building something like a docker [...]
[...] environments, which is why the default for the expansion service
should be the "docker" environment.

Note that a major reason f[...]

[...] docker container, because we'll quickly find ourselves solving the
same problems that docker containers provide (resources, file layout,
permissions, ...)
We have previously merged support for configuring transforms across
languages. Please see Cham's summary on the discussion [1]. There is
also a design document [2].

Subsequently, we've added wrappers for cross-language transforms to the
Python SDK, i.e. GenerateSequence, ReadFromKafka, and there is a pending
PR [1] for WriteToKafka. All of them ut[...]

[...]s all pretty exciting :)

We still have some issues to solve, one being how to stage artifacts from
a foreign environment. When we run external transforms which are part of
[...] necessary files.

For my PR [3] I've naively added ":beam-sdks-java-io-kafka" to the SDK
Harness which caused dependency problems [4]. Those could be resolved,
but the bigger question is how to sta[...]

Heejong has solved this by adding a "--jar_package" option to the Python
SDK to stage Java files [5]. I think that is a better solution than
adding required Jars to the SDK Harness directly, but it is not very
convenient for users.

[...]nd we both figured that the expansion service needs to provide a list
of required Jars with the ExpansionResponse it provides. It's not entirely
clear how we determine which artifacts are necessary for an external
transform. We could just dump the entire classpath like we do in
PipelineResources for Java pipelines. This provides many unneeded classes
but would work. Do you think i[...]

Perhaps you have a better idea how to resolve the staging problem in
cross-language pipelines?

Thanks,
Max

[1] https://lists.apache.org/thread.html/b99ba8527422e31ec7bb7ad9dc3a6583551ea392ebdc5527b5fb4a67@%3Cdev.beam.apache.org%3E
[2] https://s.apache.org/beam-cross-language-io
[3] https://gith[...]
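One way to picture the idea floated in this message, that the expansion service reports its required Jars with the ExpansionResponse: the client then stages only what the artifact staging service does not already hold. A speculative sketch; the function and its inputs are invented for illustration, and deduplicating by file name is just one possible conflict policy (the thread notes the runner or the staging service could resolve conflicts instead).

```python
# Sketch: merge the expansion service's required jars against what is
# already staged, skipping duplicates by file name. Illustrative only.
import os

def artifacts_to_stage(expansion_required, already_staged):
    """Return required jar paths that are not staged yet.

    expansion_required: jar paths reported alongside the ExpansionResponse.
    already_staged: file names the artifact staging service already holds.
    """
    staged = set(already_staged)
    to_stage = []
    for path in expansion_required:
        name = os.path.basename(path)
        if name not in staged:
            staged.add(name)        # also dedups within the request itself
            to_stage.append(path)
    return to_stage

extra = artifacts_to_stage(
    ['/repo/beam-sdks-java-io-kafka.jar', '/repo/kafka-clients.jar'],
    ['kafka-clients.jar'])
```

Keying on bare file names is deliberately naive: a real implementation would need version-aware conflict detection, which is exactly the open question in the message above.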
On Wed, Jan 23, 2019 at 1:03 PM Robert Bradshaw wrote:

On Wed, Jan 23, 2019 at 6:38 PM Maximilian Michels wrote:

> Thank you for starting on the cross-language feature Robert!
> Just to recap: Each SDK runs an ExpansionService which can be contacted
> during pipeline translation to expand transforms that are unknown to the
> SDK. The service returns the Proto definitions to the querying process.

Yep. Technically it doesn't have to be the SDK, or even if it is there
may be a variety of services (e.g. one offering SQL, one offering
different IOs).

> There will be multiple environments such that during execution
> cross-language pipelines select the appropriate environment for a
> transform.

Exactly. And fuses only those steps with compatible environm[...]
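The fusion constraint mentioned here, fusing only steps with compatible environments, can be sketched as grouping consecutive pipeline steps by environment. A toy model, not runner internals; the step and environment names are made up.

```python
# Environment-aware fusion sketch: consecutive steps are merged into one
# executable stage only while they share the same execution environment.
def fuse_by_environment(steps):
    """steps: list of (step_name, environment_id) in topological order."""
    stages = []
    for name, env in steps:
        if stages and stages[-1]['environment'] == env:
            stages[-1]['steps'].append(name)   # compatible: extend stage
        else:
            stages.append({'environment': env, 'steps': [name]})
    return stages

steps = [('Read', 'java'), ('Decode', 'java'),
         ('Map', 'python'), ('Sum', 'python')]
stages = fuse_by_environment(steps)
```

In this toy run the four steps collapse into two stages, one per environment, which is why a cross-language pipeline pays a materialization cost exactly at the language boundary.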
It's not clear to me: should the expansion happen during pipeline
construction or during translation by the Runner?

Thanks,
Max

On 23.01.19 04:12, Robert Bra[...]
Also debuggability: collecting logs from each of these systems.

On Tue, Jan 22, 2019 at 10:53 AM Chamikara Jayalath wrote:

Thanks Robert.

On Tue, Jan 22, 2019 at 4:39 AM Robert Bradshaw wrote:
Now that we have the FnAPI, I started playing around with support for
cross-language pipelines. This will allow things like IOs to be shared
across all languages, SQL to be invoked from non-Java, TFX tensorflow
transforms to be invoked from non-Python, etc., and I think it is the next
step in