[ https://issues.apache.org/jira/browse/BEAM-11065?focusedWorklogId=509708&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-509708 ]
ASF GitHub Bot logged work on BEAM-11065:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 10/Nov/20 14:15
Start Date: 10/Nov/20 14:15
Worklog Time Spent: 10m
Work Description: ilya-kozyrev commented on pull request #13112:
URL: https://github.com/apache/beam/pull/13112#issuecomment-724728674
> Hi Ilya,
> Thank you so much for contributing this template.
> However, Beam is not the right repo to contain templates.
> All the Dataflow templates are located here
> (https://github.com/GoogleCloudPlatform/DataflowTemplates).
> Can you please create the PR here and I would help to get it reviewed and
> approved.
> Apologies for all the confusion and for the back and forth.
>
> Thank you so much again for this.
>
> PS: The DataflowTemplates repo holds many other template examples too.
Hi Manav,
Thank you very much for your comment! Let me clarify a bit.
We implemented this template in the Beam repository for users who want to
use Beam for common use cases like this one (Kafka -> Pub/Sub). They get a
solution that requires no changes, or only minor ones. We want to grow this
part of the Beam repository.
The important point is that the Dataflow runner is only one optional runner
for this template. The template targets multiple runners and does not include
any runner-specific libraries. Of course, you can build it for GCP and run
the template in Google Dataflow, and the Javadoc and README document that case
in detail, but that is only one of many ways to use templates.
By proposing to add templates to Beam, we do not want to focus only on GCP. We
want to give Beam users ready-to-use solutions for common use cases.
> Also, I already see a PR for the same there:
> [GoogleCloudPlatform/DataflowTemplates#176](https://github.com/GoogleCloudPlatform/DataflowTemplates/pull/176)
> I guess this should be sufficient ?
As you rightly mentioned, we implemented a different template in the
[Google Dataflow templates repository](https://github.com/GoogleCloudPlatform/DataflowTemplates)
for a similar use case. The
[template for Dataflow](https://github.com/GoogleCloudPlatform/DataflowTemplates/pull/176)
uses specific libraries, coders, and functions that work only with the
Dataflow runner and are built from the DataflowTemplates repository. In this
PR, by contrast, we propose a more generic template that the community can
extend in the future.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 509708)
Time Spent: 1h (was: 50m)
> Apache Beam Template to ingest from Apache Kafka to Google Pub/Sub
> ------------------------------------------------------------------
>
> Key: BEAM-11065
> URL: https://issues.apache.org/jira/browse/BEAM-11065
> Project: Beam
> Issue Type: Improvement
> Components: examples-java
> Reporter: Ilya Kozyrev
> Assignee: Ilya Kozyrev
> Priority: P3
> Time Spent: 1h
> Remaining Estimate: 0h
>
> In the Beam repository, we have awesome examples in Java and Python; however,
> there are no templates.
> The reason for adding templates is that some users would like to use
> parameterized flex templates for common use cases. For example, if I want to
> build a typical pipeline that reads from one source and writes to another
> destination, I would be happy if Beam provided a fully implemented pipeline
> for my common case, so that I only need to supply parameters to customize the
> input and output.
> I propose adding a new directory named "templates" under the examples folder.
> It will be a new Gradle module in the project. As the first template, I
> propose implementing a Kafka to Pub/Sub template.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)