[jira] [Commented] (BEAM-3788) Implement a Kafka IO for Python SDK

2020-05-27 Thread Chamikara Madhusanka Jayalath (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17117906#comment-17117906
 ] 

Chamikara Madhusanka Jayalath commented on BEAM-3788:
-

KafkaIO is expected to be available as a cross-language transforms for Dataflow 
with Beam 2.22. Flink/Spark already support this. So closing the JIRA.

 

We can track further improvements to the cross-language KafkaIO in other JIRAs.

> Implement a Kafka IO for Python SDK
> ---
>
> Key: BEAM-3788
> URL: https://issues.apache.org/jira/browse/BEAM-3788
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Chamikara Madhusanka Jayalath
>Assignee: Chamikara Madhusanka Jayalath
>Priority: P2
> Fix For: 2.22.0
>
>
> Java KafkaIO will be made available to Python users as a cross-language 
> transform.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-3788) Implement a Kafka IO for Python SDK

2020-03-15 Thread Nicolae Rosia (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17059941#comment-17059941
 ] 

Nicolae Rosia commented on BEAM-3788:
-

until this is fixed, is there a way to implement an unbounded source in Python? 
If yes, then I could work around this by using a Kafka library in Python, such 
as [https://pypi.org/project/kafka-python/]

> Implement a Kafka IO for Python SDK
> ---
>
> Key: BEAM-3788
> URL: https://issues.apache.org/jira/browse/BEAM-3788
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Chamikara Madhusanka Jayalath
>Assignee: Chamikara Madhusanka Jayalath
>Priority: Major
>
> Java KafkaIO will be made available to Python users as a cross-language 
> transform.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-3788) Implement a Kafka IO for Python SDK

2020-02-20 Thread Brian Hulette (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17041230#comment-17041230
 ] 

Brian Hulette commented on BEAM-3788:
-

[~chadrik] for the coder issue: I'm +1 on the approach you suggested in your 
[BEAM-7870 
comment|https://issues.apache.org/jira/browse/BEAM-7870?focusedCommentId=16959135=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16959135]
 - use portable row coder as the transport, then on the Python side make a 
PubsubMessage TypedDict (and possibly manually convert to a different more 
useful type, until we have BEAM-8732 and we can do this more cleanly).

So there's still work to do there, but there's a clear path forward.



> Implement a Kafka IO for Python SDK
> ---
>
> Key: BEAM-3788
> URL: https://issues.apache.org/jira/browse/BEAM-3788
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Chamikara Madhusanka Jayalath
>Assignee: Chamikara Madhusanka Jayalath
>Priority: Major
>
> Java KafkaIO will be made available to Python users as a cross-language 
> transform.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-3788) Implement a Kafka IO for Python SDK

2020-02-19 Thread Chamikara Madhusanka Jayalath (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040565#comment-17040565
 ] 

Chamikara Madhusanka Jayalath commented on BEAM-3788:
-

Added a comment to the email thread. I think this is a Flink specific issue 
that should go away after SDF.

> Implement a Kafka IO for Python SDK
> ---
>
> Key: BEAM-3788
> URL: https://issues.apache.org/jira/browse/BEAM-3788
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Chamikara Madhusanka Jayalath
>Assignee: Chamikara Madhusanka Jayalath
>Priority: Major
>
> Java KafkaIO will be made available to Python users as a cross-language 
> transform.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-3788) Implement a Kafka IO for Python SDK

2020-02-19 Thread Chad Dombrova (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040558#comment-17040558
 ] 

Chad Dombrova commented on BEAM-3788:
-

Unless something has changed recently, 
https://issues.apache.org/jira/browse/BEAM-7870 is still a blocker for using 
KafkaIO in python out of the box.  As the title suggests, it's also blocking 
PubSubIO in python and conceptually any external transform with a non-trivial 
coder. 

[~mxm], [~bhulette] has anything changed on that issue lately?

 

> Implement a Kafka IO for Python SDK
> ---
>
> Key: BEAM-3788
> URL: https://issues.apache.org/jira/browse/BEAM-3788
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Chamikara Madhusanka Jayalath
>Assignee: Chamikara Madhusanka Jayalath
>Priority: Major
>
> Java KafkaIO will be made available to Python users as a cross-language 
> transform.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-3788) Implement a Kafka IO for Python SDK

2020-02-13 Thread Chamikara Madhusanka Jayalath (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17036376#comment-17036376
 ] 

Chamikara Madhusanka Jayalath commented on BEAM-3788:
-

We are actively working on making Java KafkaIO available to Python users as a 
cross-language transform for all runners. Current solution [1]makes Java Kafka 
source which is based on the UnboundedSource framework available through 
cross-language transforms framework. But this will not really work for any 
portable runners without transform overrides since UnboundedSource framework is 
not available for portable runners (This currently works for Flink since 
FlinkRunner overrides this source with a native source implementation). We need 
a Kafka cross-language transform based on a SDF version of KafkaIO which can be 
made available to Python users.

On the other hand we are working on an UnboundedSource to SDF converter. So 
this can just be a cross-language transform for a Kafka transform that uses a 
SDF for existing Kafka UnboundedSource generated through this converter.

 

[1][https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/external/kafka.py]

> Implement a Kafka IO for Python SDK
> ---
>
> Key: BEAM-3788
> URL: https://issues.apache.org/jira/browse/BEAM-3788
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Chamikara Madhusanka Jayalath
>Assignee: Chamikara Madhusanka Jayalath
>Priority: Major
>
> Java KafkaIO will be made available to Python users as a cross-language 
> transform.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-3788) Implement a Kafka IO for Python SDK

2019-11-26 Thread Jing Chen (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16982762#comment-16982762
 ] 

Jing Chen commented on BEAM-3788:
-

i am curious if there is a plan to integrate confluent's schema registry etc or 
simply vanilla kafka

> Implement a Kafka IO for Python SDK
> ---
>
> Key: BEAM-3788
> URL: https://issues.apache.org/jira/browse/BEAM-3788
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Chamikara Madhusanka Jayalath
>Priority: Major
>
> This will be implemented using the Splittable DoFn framework.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-3788) Implement a Kafka IO for Python SDK

2019-10-21 Thread Chamikara Madhusanka Jayalath (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956261#comment-16956261
 ] 

Chamikara Madhusanka Jayalath commented on BEAM-3788:
-

I believe unbounded SDF support for Python SDK is few months away. We are also 
working on adding cross-language transforms support to Dataflow but I don't 
have an ETA yet. As I mentioned in a previous comment either of these will pave 
the way for a KafkaIO in Python SDK on Dataflow. KafkaIO is already available 
as a cross-language transform for Flink: 
[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/external/kafka.py]

> Implement a Kafka IO for Python SDK
> ---
>
> Key: BEAM-3788
> URL: https://issues.apache.org/jira/browse/BEAM-3788
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Chamikara Madhusanka Jayalath
>Priority: Major
>
> This will be implemented using the Splittable DoFn framework.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-3788) Implement a Kafka IO for Python SDK

2019-10-21 Thread Chethan UK (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955943#comment-16955943
 ] 

Chethan UK commented on BEAM-3788:
--

[~chamikara] Any Docs? Wanted to use Kafka in Dataflow pipelines...

Thanks!.

> Implement a Kafka IO for Python SDK
> ---
>
> Key: BEAM-3788
> URL: https://issues.apache.org/jira/browse/BEAM-3788
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Chamikara Madhusanka Jayalath
>Priority: Major
>
> This will be implemented using the Splittable DoFn framework.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-3788) Implement a Kafka IO for Python SDK

2019-06-10 Thread Chamikara Jayalath (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860419#comment-16860419
 ] 

Chamikara Jayalath commented on BEAM-3788:
--

This is blocked till we have a streaming source framework for Python SDK for 
portable runners. To this end, Splittable DoFn is currently under development.

Note that, for folks using Flink runner, native Kafka source of Flink is 
currently available to Python SDK users through the cross-language transform 
API.

> Implement a Kafka IO for Python SDK
> ---
>
> Key: BEAM-3788
> URL: https://issues.apache.org/jira/browse/BEAM-3788
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>Priority: Major
>
> This will be implemented using the Splittable DoFn framework.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-3788) Implement a Kafka IO for Python SDK

2019-06-09 Thread Willem Pienaar (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16859396#comment-16859396
 ] 

Willem Pienaar commented on BEAM-3788:
--

[~chamikara]:Is there anything still blocking this from being implemented? 
Might have some time over the next few weeks.

> Implement a Kafka IO for Python SDK
> ---
>
> Key: BEAM-3788
> URL: https://issues.apache.org/jira/browse/BEAM-3788
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>Priority: Major
>
> This will be implemented using the Splittable DoFn framework.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-3788) Implement a Kafka IO for Python SDK

2019-05-20 Thread Chad Dombrova (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1683#comment-1683
 ] 

Chad Dombrova commented on BEAM-3788:
-

Hi, any updates on this?

 

> Implement a Kafka IO for Python SDK
> ---
>
> Key: BEAM-3788
> URL: https://issues.apache.org/jira/browse/BEAM-3788
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>Priority: Major
>
> This will be implemented using the Splittable DoFn framework.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)