[PROPOSAL] Adding Contextual TextIO to Beam

2020-07-17 Thread Abhishek Yadav
Hey everyone, I am working on a project for adding Contextual TextIO to Apache Beam. Please provide feedback/suggestions on the design document with reference to the JIRA. Thanks! Abhishek

Re: ReadFromKafka returns error - RuntimeError: cannot encode a null byte[]

2020-07-17 Thread Chamikara Jayalath
Yes, seems like this is due to the key being null. XLang KafkaIO has to be updated to support this. You should not run into this error if you publish keys and values that are not null. On Fri, Jul 17, 2020 at 8:04 PM Luke Cwik wrote: > +dev > > On Fri, Jul 17, 2020 at 8:03 PM Luke Cwik

Re: ReadFromKafka returns error - RuntimeError: cannot encode a null byte[]

2020-07-17 Thread Luke Cwik
+dev On Fri, Jul 17, 2020 at 8:03 PM Luke Cwik wrote: > +Heejong Lee +Chamikara Jayalath > > > Do you know if your trial record has an empty key or value? > If so, then you hit a bug and it seems as though there was a miss > supporting this usecase. > > Heejong and Cham, > It looks like the

Re: No space left on device - beam-jenkins 1 and 7

2020-07-17 Thread Tyson Hamilton
FYI there was a job introduced to do this in Jenkins: beam_Clean_tmp_directory Currently it needs to be run manually. I'm seeing some out of disk related errors in precommit tests currently, perhaps we should schedule this job with cron? On 2020/03/11 19:31:13, Heejong Lee wrote: > Still

Re: [PROPOSAL] Executing Cross-language transforms in the Beam Go SDK

2020-07-17 Thread Robert Bradshaw
On Fri, Jul 17, 2020 at 4:11 PM Robert Burke wrote: > > Thanks for sending this out! I've added my comments and look forward to the > working prototype(s)! I've added some comments as well. > I really appreciate the level of detail you go into the Go SDK structure on > the pipeline

Re: [VOTE] Release 2.23.0, release candidate #1

2020-07-17 Thread Ahmet Altay
Thank you Valentyn. Being a release manager is difficult. It requires balancing between stability, following the process, regressions, timelines. Thank you for following the process, thank you for asking the right questions, thank you for doing the release. On Fri, Jul 17, 2020 at 3:59 PM Robert

Re: [PROPOSAL] Executing Cross-language transforms in the Beam Go SDK

2020-07-17 Thread Robert Burke
Thanks for sending this out! I've added my comments and look forward to the working prototype(s)! I really appreciate the level of detail you go into the Go SDK structure on the pipeline construction side. That's not been materially documented outside of the code itself, so this is a nice

Re: [Proposal] - Publish Content for Apache Beam Channels

2020-07-17 Thread Robert Bradshaw
All of these channels should be public and so using a tool to listen to them should not require extra access. Any insights gained would be interesting to see. I think it'd be useful to see specific examples of "content produced and published" before giving write access (and also understanding what

Re: [VOTE] Release 2.23.0, release candidate #1

2020-07-17 Thread Robert Bradshaw
Thank you, Valentyn! On Fri, Jul 17, 2020 at 3:25 PM Chamikara Jayalath wrote: > > > > On Fri, Jul 17, 2020 at 3:01 PM Valentyn Tymofieiev > wrote: >> >> As a general rule, fixes pertaining to new functionality are not a good >> candidate for a cherry-pick. >> >> A case for an exception can

Re: [Proposal] - Publish Content for Apache Beam Channels

2020-07-17 Thread Brittany Hermann
Hi folks, Happy Friday! I just wanted to check in to see if you had the chance to review my proposal for publishing content for the Apache Beam Channels. Please let me know if you have any additional questions. Have a great weekend! On Thu, Jul 9, 2020 at 2:19 PM Brittany Hermann wrote: > Hi

Re: [VOTE] Release 2.23.0, release candidate #1

2020-07-17 Thread Chamikara Jayalath
On Fri, Jul 17, 2020 at 3:01 PM Valentyn Tymofieiev wrote: > As a general rule, fixes pertaining to new functionality are not a good > candidate for a cherry-pick. > > A case for an exception can be made for polishing features related to > major wide announcements with a hard deadline, which

Re: [VOTE] Release 2.23.0, release candidate #1

2020-07-17 Thread Valentyn Tymofieiev
As a general rule, fixes pertaining to new functionality are not a good candidate for a cherry-pick. A case for an exception can be made for polishing features related to major wide announcements with a hard deadline, which appears to be the case for xlang on Dataflow. I will prepare an RC2 with

Re: Chronically flaky tests

2020-07-17 Thread Ahmet Altay
Another idea: could we replace our "Retest X" phrases with "Retest X (Reason)" phrases? With this change, a PR author would have to look at the failed test logs. They could catch new flakiness introduced by their PR, file a JIRA for flakiness that was not noted before, or ping an existing JIRA

[PROPOSAL] Executing Cross-language transforms in the Beam Go SDK

2020-07-17 Thread Kevin Puthusseri
Hi Folks, I am working on adding support for Executing Cross-language transforms in the Beam Go SDK. Please provide feedback/suggestions on the design document with reference to the Uber JIRA

Re: [VOTE] Release 2.23.0, release candidate #1

2020-07-17 Thread Chamikara Jayalath
On Fri, Jul 17, 2020 at 10:01 AM Robert Bradshaw wrote: > Taking a step back, the goal of avoiding cherry-picks is to reduce > risk and increase the velocity of our releases, as otherwise the > release manager gets inundated by a never ending list of features > people want to get in that puts

Re: [VOTE] Release 2.23.0, release candidate #1

2020-07-17 Thread Robert Bradshaw
Taking a step back, the goal of avoiding cherry-picks is to reduce risk and increase the velocity of our releases, as otherwise the release manager gets inundated by a never ending list of features people want to get in that puts the releases further and further behind (increasing the desire to

Re: [VOTE] Extension name of Interactive Beam Side Panel in JupyterLab

2020-07-17 Thread Alexey Romanenko
+1 for 3 too, thanks > On 16 Jul 2020, at 22:27, David Yan wrote: > > +1 for 3. > > On Thu, Jul 16, 2020 at 12:35 PM Pablo Estrada > wrote: > +1 for 3. Thanks Ning. > > On Thu, Jul 16, 2020 at 10:54 AM Kenneth Knowles > wrote: > +1 for [3]

Re: KafkaIO sending KafkaRecords in CrossLanguage - where is the coder registered?

2020-07-17 Thread Robert Bradshaw
On Fri, Jul 17, 2020 at 9:19 AM Piotr Szuberski wrote: > > I will consider mapping KinesisRecord to Row and then sending it via > cross-language, but I think that for now python's RowCoder does not support > bytes (correct me if I'm wrong) Huh, looks like you're right:

Re: KafkaIO sending KafkaRecords in CrossLanguage - where is the coder registered?

2020-07-17 Thread Piotr Szuberski
I will consider mapping KinesisRecord to Row and then sending it via cross-language, but I think that for now python's RowCoder does not support bytes (correct me if I'm wrong) On 2020/07/16 23:07:06, Luke Cwik wrote: > If you want to send across a "rich" data record, consider defining a

Re: KafkaIO sending KafkaRecords in CrossLanguage - where is the coder registered?

2020-07-17 Thread Piotr Szuberski
Thanks, that's exactly what I was asking for! I really don't know how I could have missed that it's actually TypedWithoutMetadata and not the KafkaIO.Read transform that is used in the external transform. I think it's hard to navigate in such a big file. On 2020/07/16 16:21:26, Boyuan Zhang wrote: > Hi Piotr, >

Re: [PROPOSAL] Azure Filesystem for Beam Java SDK

2020-07-17 Thread Luke Cwik
Thanks, I took a look and left some comments. I saw that you were proposing to use azfs as the scheme but I see wasb/wasbs/abfss used in other data processing systems. I'm not sure which is the common one but wasb/wasbs/abfss show up on the Microsoft site so it might be best to use that instead of

Re: Chronically flaky tests

2020-07-17 Thread Tyson Hamilton
Adding retries can be beneficial in two ways, unblocking a PR, and collecting metrics about the flakes. If we also had a flaky test leaderboard that showed which tests are the most flaky, then we could take action on them. Encouraging someone from the community to fix the flaky test is another