Hi Katarzyna,

I have linked the Python Snowflake connector with the cross-platform one.
And I agree that we can, if needed in the future, reopen the issue based on
users' feedback.
*Regards*
Shashanka Balakuntala Srinivasa

On Thu, Jun 18, 2020 at 7:54 PM Katarzyna Kucharczyk <[email protected]> wrote:

> @Shashanka Balakuntala <[email protected]> We are waiting for the
> Java Snowflake connector to be reviewed [1] and when it will be available
> we will publish code for a cross-language connector (it should be rather
> soon). We tested this code on Flink runner and the only issue we found for
> now is that it's impossible to use a cross-language connector with Create
> method [2]. From the other threads I remember that cross language on
> Dataflow should be available soon [3] (or even it is currently for
> KafkaIO). In terms of performance I think it's too early to say how it will
> perform. But what I may add here is that the benefit of writing it in a
> cross-language provides users the same set of features both in Java and
> Python (or other languages in future) and we may avoid introducing
> differences in the future. What I may suggest is that maybe the
> issue BEAM-9466 with a Python snowflake connector (not cross-language)
> should be reopened when cross-language will be insufficient for users? What
> do you think?
>
> Thank you @Chamikara Jayalath <[email protected]> for those hints. We
> will adjust the java connector in a similar way.
>
> [1] https://github.com/apache/beam/pull/11794
> [2] https://issues.apache.org/jira/browse/BEAM-10020
> [3]
> https://lists.apache.org/thread.html/r15ae137c3b5a58283b7228131d1a32c1470e2cc7e1fa27bdfac5ab9b%40<dev.beam.apache.org>
> <https://lists.apache.org/thread.html/r15ae137c3b5a58283b7228131d1a32c1470e2cc7e1fa27bdfac5ab9b%40%3Cdev.beam.apache.org%3E>
>
> On Wed, Jun 17, 2020 at 7:45 PM Chamikara Jayalath <[email protected]>
> wrote:
>
>> There's some work needed to make the Java connector available as a
>> cross-language transform for Python.
>> More specifically,
>>
>> (1) Add a Java builder and registrar to register Java transforms with the
>> expansion service (see [1] and [2] for Kafka)
>> (2) Add a Python wrapper (see [3] for Kafka)
>>
>> Thanks,
>> Cham
>>
>> [1]
>> https://github.com/apache/beam/blob/master/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java#L396
>> [2]
>> https://github.com/apache/beam/blob/master/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java#L1429
>> [3]
>> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/external/kafka.py
>>
>> On Wed, Jun 17, 2020 at 8:57 AM Shashanka Balakuntala
>> <[email protected]> wrote:
>>
>>> Hi All,
>>> In regards with this discussion, I created a JIRA issue[1]. Now since
>>> there is a talk here on cross-platform connector, should I just close the
>>> issue with a link to Java Snowflake connector, or does anyone think writing
>>> python based connector has some advantage in terms of performance or
>>> usability. Please let me know what you guys think, so that i can take the
>>> necessary step on this.
>>>
>>> [1] - https://issues.apache.org/jira/browse/BEAM-9466
>>>
>>> *Regards*
>>> Shashanka Balakuntala Srinivasa
>>>
>>> On Wed, Mar 11, 2020 at 2:25 AM Chamikara Jayalath <[email protected]>
>>> wrote:
>>>
>>>> On Tue, Mar 10, 2020 at 1:18 PM Tyler Akidau <[email protected]>
>>>> wrote:
>>>>
>>>>> On Tue, Mar 10, 2020 at 1:27 AM Elias Djurfeldt
>>>>> <[email protected]> wrote:
>>>>>
>>>>>> From what I can tell, the only difference is that the Python
>>>>>> connector is a pure Python implementation and doesn't rely on ODBC or
>>>>>> JDBC (it's just a pip installable). Whereas the Java version needs
>>>>>> JDBC. But that seems to be the only difference.
>>>>>>
>>>>>
>>>>> Correct me if I'm wrong, but this sounds like a concern around having
>>>>> to install Java dependencies for the cross-language transform.
>>>>> If so, I think the question is: how frictionless can we make the user
>>>>> experience here? If it can be relatively straightforward, even for a
>>>>> Python user with zero Java familiarity, it's going to be a win from a
>>>>> maintainability perspective to only have one implementation (Java, in
>>>>> this case) to keep up to date, as Cham pointed out. Kasia, do you have a
>>>>> sense yet for what the experience for a Python user would be for using
>>>>> the Python-wrapped Java SnowflakeIO connector?
>>>>>
>>>>
>>>> There are many aspects related to usability of cross-language
>>>> transforms that are currently being worked on. We are doing some of the
>>>> usability improvements to cross-language Kafka. But the end goal is to
>>>> make using cross-language transforms as seamless as possible to end
>>>> users. For example,
>>>> (1) Expansion service can be started up automatically if users have
>>>> Java installed in their system.
>>>> (2) Native language wrappers can be aware of the immediate dependencies
>>>> needed for the expansion service.
>>>> (3) Additional dependencies can be obtained as a part of the new
>>>> environment
>>>> <https://github.com/apache/beam/blob/master/model/pipeline/src/main/proto/beam_runner_api.proto#L1280>
>>>> received through the cross-language transform expansion protocol.
>>>>
>>>> Also we need to add better support for converting arbitrary Java types
>>>> to arbitrary Python types using Row coder
>>>> (https://issues.apache.org/jira/browse/BEAM-8732).
>>>>
>>>> So hopefully, the user experience of using cross-language Java
>>>> transforms from Python can be as seamless as "just install JRE and use
>>>> the transforms in Python xyz_io.py".
>>>>
>>>> There might be additional Snowflake specific considerations I'm not
>>>> aware of.
>>>>
>>>> Thanks,
>>>> Cham
>>>>
>>>>>
>>>>> -Tyler
>>>>>
>>>>>>
>>>>>> I don't know enough about the Java side of Beam (or Java in general
>>>>>> really) to say if that's an issue or not though :)
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> On Mon, 9 Mar 2020 at 18:06, Chamikara Jayalath <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Thank you. Elias and Shashanka, do you think the Python connector
>>>>>>> (and API) can offer some additional benefits that a Java
>>>>>>> cross-language
>>>>>>> <https://beam.apache.org/roadmap/connectors-multi-sdk/> connector
>>>>>>> cannot ? It's fine to develop Java and Python versions if it makes
>>>>>>> sense but if cross-language Java version offers the same benefits as
>>>>>>> Python just having one implementation will reduce maintenance burden.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Cham
>>>>>>>
>>>>>>> On Mon, Mar 9, 2020 at 5:41 AM Katarzyna Kucharczyk
>>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> Me and my colleague Dariusz we are working currently on Java
>>>>>>>> connector and we are planning to use cross-language to add Python as
>>>>>>>> well.
>>>>>>>> The proposal should arrive on dev-list in the nearest future.
>>>>>>>> Also we would be happy to help if needed in current work of yours.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Kasia
>>>>>>>>
>>>>>>>> On Mon, Mar 9, 2020 at 9:41 AM Elias Djurfeldt
>>>>>>>> <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Cool Shashanka! Feel free to tag me in the JIRA and update me on
>>>>>>>>> any progress / ponderings.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Elias
>>>>>>>>>
>>>>>>>>> On Sat, 7 Mar 2020 at 03:43, Chamikara Jayalath
>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Absolutely. Please create a JIRA and coordinate with Elias and
>>>>>>>>>> any others that would like to contribute to this.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Cham
>>>>>>>>>>
>>>>>>>>>> On Fri, Mar 6, 2020 at 10:46 AM Shashanka Balakuntala
>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Chamikara and Elias,
>>>>>>>>>>> This seems like an interesting feature. Can I start working on
>>>>>>>>>>> this?
>>>>>>>>>>> *Regards*
>>>>>>>>>>> Shashanka Balakuntala Srinivasa
>>>>>>>>>>>
>>>>>>>>>>> On Sat, Mar 7, 2020 at 12:00 AM Chamikara Jayalath
>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I don't think we have this but contributions are welcome.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Cham
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Mar 3, 2020 at 4:46 AM Elias Djurfeldt
>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I've stumbled upon a use case where I might need a SnowflakeIO
>>>>>>>>>>>>> in Python. Has anyone worked on this before or are there any
>>>>>>>>>>>>> discussions surrounding it?
>>>>>>>>>>>>>
>>>>>>>>>>>>> There is a Snowflake Python library available [1], so looks
>>>>>>>>>>>>> feasible to implement in Beam.
>>>>>>>>>>>>>
>>>>>>>>>>>>> [1]
>>>>>>>>>>>>> https://docs.snowflake.net/manuals/user-guide/python-connector.html
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>> Elias
>>>>>>>>>>>>>
