If it's in Java, also be careful to align with the current Google Cloud
IOs, and certainly with their dependencies. The Google IOs do not depend on
the newest client libraries, and that's something we sometimes struggle
with when we depend on our own client libraries. So make sure to align them.

Also note that although gRPC is vendored, the Google IOs still have
their own dependency on gRPC, and this is the biggest source of trouble.

 _/
_/ Alex Van Boxel


On Wed, Jan 15, 2020 at 1:18 AM Luke Cwik <[email protected]> wrote:

> It depends on what language the client libraries are exposed in. For
> example, if the client libraries are in Java, sdks/java/extensions makes
> sense, while if it's Python then integrating it within the gcp extension
> within sdks/python/apache_beam makes sense.
>
> Adding additional dependencies is ok depending on the licensing and the
> process is slightly different for each language.
>
> For transforms that are complicated, there is a cross-language effort
> going on so that one can execute one language's transforms within another
> language's pipeline, which may remove the need to write the transforms more
> than once.
>
> On Tue, Jan 14, 2020 at 7:43 AM Ismaël Mejía <[email protected]> wrote:
>
>> Nice idea, IO looks like a good place for them but there is another path
>> that could fit this case: `sdks/java/extensions`, some module like
>> `google-cloud-platform-ai` in that folder or something like that, no?
>>
>> In any case great initiative. +1
>>
>>
>>
>> On Tue, Jan 14, 2020 at 4:22 PM Kamil Wasilewski <
>> [email protected]> wrote:
>>
>>> Hi all,
>>>
>>> We’d like to implement a set of PTransforms that would allow users to
>>> use some of the Google Cloud AI services in Beam pipelines.
>>>
>>> Here's the full list of services and functionalities we’d like to
>>> integrate Beam with:
>>>
>>> * Video Intelligence [1]
>>>
>>> * Cloud Natural Language [2]
>>>
>>> * Cloud AI Platform Prediction [3]
>>>
>>> * Data Masking/Tokenization [4]
>>>
>>> * Inspecting image data for sensitive information using Cloud Vision [5]
>>>
>>> However, we're not sure whether to put those transforms directly into
>>> Beam, because they would require some additional GCP dependencies. One of
>>> our ideas is a separate library that depends on Beam and can be
>>> installed optionally, stored somewhere in the Beam repository (e.g. in the
>>> BEAM_ROOT/extras directory). Do you think it is a reasonable approach? Or
>>> maybe it is totally fine to put them into SDKs, just like other IOs?
>>>
>>> If you have any other thoughts, do not hesitate to let us know.
>>>
>>> Best,
>>>
>>> Kamil
>>>
>>> [1] https://cloud.google.com/video-intelligence/
>>>
>>> [2] https://cloud.google.com/natural-language/
>>>
>>> [3] https://cloud.google.com/ml-engine/docs/prediction-overview
>>>
>>> [4]
>>> https://cloud.google.com/dataflow/docs/guides/templates/provided-streaming#dlptexttobigquerystreaming
>>>
>>> [5] https://cloud.google.com/vision/
>>>
>>
