Re: Beam 2.21 release update

2020-05-07 Thread Thomas Weise
No additional stacktraces. Full error output below. It's not clear what is going wrong. There isn't any exception from the subprocess execution since the "WARNING:root:Installing grpcio-tools took 305.39 seconds." is printed. Also, the time it takes to perform the install is equivalent to

Re: TextIO. Writing late files

2020-05-07 Thread Reuven Lax
This is strange - AFAICT the windowing information should not be dropped. Is it possible that the Flink runner isn't handling the WriteFilesResult object (POutput value that embeds a PCollection) properly? Any way to test this with another runner? Reuven On Thu, May 7, 2020 at 3:45 PM Luke Cwik

Re: TextIO. Writing late files

2020-05-07 Thread Luke Cwik
+dev On Mon, May 4, 2020 at 3:56 AM Jose Manuel wrote: > Hi guys, > > I think I have found something interesting about windowing. > > I have a pipeline that gets data from Kafka and writes to HDFS by means of > TextIO. > Once written, generated files are combined to apply some custom

Re: Support for AWS SDK v2 and enhanced fanout in KinesisIO

2020-05-07 Thread Ismaël Mejía
Oops, I meant 'would you like to start rolling this plan?' sorry. On Fri, May 8, 2020 at 12:03 AM Ismaël Mejía wrote: > > Achieving good abstractions will prove elusive since the APIs differ > and we will end up with a ton of extra maintenance work that should > not be Beam's responsibility. I

Re: Support for AWS SDK v2 and enhanced fanout in KinesisIO

2020-05-07 Thread Ismaël Mejía
Achieving good abstractions will prove elusive since the APIs differ and we will end up with a ton of extra maintenance work that should not be Beam's responsibility. I know that similar code (almost copy-pasteable) is not nice to have but we should consider this as a temporary measure and

Re: Python Static Typing: Next Steps

2020-05-07 Thread Ismaël Mejía
Awesome! Congrats to all involved in making the project achieve such a milestone! On Thu, May 7, 2020 at 7:13 PM Luke Cwik wrote: > > Sweet. > > On Wed, May 6, 2020 at 5:05 PM Robert Bradshaw wrote: >> >> Just an update on this: we just merged >> https://github.com/apache/beam/pull/11620 which

Re: Beam 2.21 release update

2020-05-07 Thread Udi Meiri
It's hard to say without more details what's going on. Ahmet you're right that it installs build-requirements.txt and retries calling generate_proto_files(). Thomas, were there additional stacktraces? (after a "During handling of the above exception, another exception occurred:" message?) On

Re: Support for AWS SDK v2 and enhanced fanout in KinesisIO

2020-05-07 Thread Luke Cwik
I think you should try and share as much as is reasonable. Using what is shared between AWS V1 and V2 SDKs would be a good signal as to what should be shared in Beam. There might be some places where a trivial wrapper could help but I wouldn't try to create a bunch of grand abstractions that fit

Re: [DISCUSS] finishBundle once per window

2020-05-07 Thread Reuven Lax
I think startBundle is useful for convenience and performance, but not necessarily needed semantically (as Kenn said, you could write your pipeline without startBundle). finishBundle has a stronger semantic meaning when interpreted as a way of finalizing elements. On Thu, May 7, 2020 at 2:00 PM
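To make Reuven's distinction concrete, here is a minimal sketch in plain Python, not the Beam SDK. The `BufferingWriter` class and the `drive_bundle` helper are hypothetical stand-ins that only mimic the lifecycle order; the sketch assumes the runner calls the finish-bundle hook exactly once after the last element of a bundle. The buffer is set up in the constructor and reset on flush, so no start-bundle hook is needed, while the finish-bundle hook is what actually finalizes the buffered elements.

```python
class BufferingWriter:
    """Hypothetical DoFn-like class: buffers elements, flushes per bundle."""

    def __init__(self, sink):
        self._sink = sink      # a plain list standing in for an external sink
        self._buffer = []      # initialized here, so no start_bundle is required

    def process(self, element):
        # Elements are only staged here; nothing is visible in the sink yet.
        self._buffer.append(element)

    def finish_bundle(self):
        # Finalize the bundle: write one batch and reset for the next bundle.
        self._sink.append(list(self._buffer))
        self._buffer = []


def drive_bundle(fn, elements):
    """Hypothetical stand-in for a runner driving a single bundle."""
    for element in elements:
        fn.process(element)
    fn.finish_bundle()


sink = []
writer = BufferingWriter(sink)
drive_bundle(writer, ["a", "b"])
drive_bundle(writer, ["c"])
```

After the two calls, `sink` holds one batch per bundle. Dropping the `finish_bundle` call would leave the elements staged and never written, which is the sense in which it carries semantic weight.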

Re: [DISCUSS] finishBundle once per window

2020-05-07 Thread Luke Cwik
Start bundle is useful since the framework provides the necessary synchronization while using lazy init requires you to write it yourself and also pay for it on each process element call. On Wed, May 6, 2020 at 8:46 AM Kenneth Knowles wrote: > This is a great idea. I thought that (long ago) we
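The trade-off Luke describes can be sketched in plain Python rather than the Beam SDK. The classes `LazyInitFn` and `BundleInitFn` and the `run_bundle` helper are hypothetical; they assume that the framework calls the start-bundle hook once per bundle, before any element is processed and with whatever synchronization is required.

```python
import threading


class LazyInitFn:
    """Lazy init: user code must guard initialization itself."""

    def __init__(self):
        self._lock = threading.Lock()
        self._resource = None
        self.checks = 0  # counts how often process() pays for the guard

    def process(self, element):
        # The check runs on every element; the lock is taken on the slow path.
        self.checks += 1
        if self._resource is None:
            with self._lock:
                if self._resource is None:
                    self._resource = {"prefix": "item-"}
        return self._resource["prefix"] + str(element)


class BundleInitFn:
    """Start-bundle init: the caller guarantees the hook ran first."""

    def __init__(self):
        self._resource = None

    def start_bundle(self):
        self._resource = {"prefix": "item-"}

    def process(self, element):
        # No guard and no lock in the per-element path.
        return self._resource["prefix"] + str(element)


def run_bundle(fn, elements):
    """Hypothetical stand-in for a runner driving a single bundle."""
    if hasattr(fn, "start_bundle"):
        fn.start_bundle()
    return [fn.process(e) for e in elements]


lazy = LazyInitFn()
lazy_out = run_bundle(lazy, [1, 2, 3])
bundle_out = run_bundle(BundleInitFn(), [1, 2, 3])
```

Both variants produce the same output; the difference is that `LazyInitFn` performs its guard once per element, while `BundleInitFn` relies on the caller for ordering.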

Re: Builtin IOs - Link to Java/Pydoc instead of code?

2020-05-07 Thread Pablo Estrada
Thanks Alexey! On Thu, May 7, 2020 at 4:15 AM Alexey Romanenko wrote: > Good point, Pablo! I think we can use “current” for this purpose, thanks. > > Though, I guess we still have to wait for documentation engine migration > to Hugo (separate thread discussion) and only then update the IO

Re: [REVIEW][please pause website changes] Migrated the Beam website to Hugo

2020-05-07 Thread Aizhamal Nurmamat kyzy
Thank you Ahmet. Robert/Brian, what do you think? The website staging and pre commit tests have passed [1]. If nobody has objections, we could merge it soon. [1] https://github.com/apache/beam/pull/11554 On Thu, May 7, 2020 at 11:38 AM Ahmet Altay wrote: > > > On Thu, May 7, 2020 at 10:50

Re: Beam 2.21 release update

2020-05-07 Thread Ahmet Altay
On Thu, May 7, 2020 at 11:56 AM Thomas Weise wrote: > Thanks Udi! This is the issue. I'm trying to upgrade from 2.18 where > build-requirements.txt didn't exist. > > Is there a reason why this cannot happen automatically when > running python3.6 setup.py sdist bdist_wheel ? > I _believe_ this

Re: Beam 2.21 release update

2020-05-07 Thread Thomas Weise
Thanks Udi! This is the issue. I'm trying to upgrade from 2.18 where build-requirements.txt didn't exist. Is there a reason why this cannot happen automatically when running python3.6 setup.py sdist bdist_wheel ? Thomas On Thu, May 7, 2020 at 11:07 AM Udi Meiri wrote: > Probably not the

Re: [REVIEW][please pause website changes] Migrated the Beam website to Hugo

2020-05-07 Thread Ahmet Altay
On Thu, May 7, 2020 at 10:50 AM Aizhamal Nurmamat kyzy wrote: > Thanks for the writeup Ahmet. > > My bias is to move forward and merge the PR. After this, we'll review the > outcome, and ensure that all the content is there. Nam will help us with > that. > The reason that I'd like to move

Re: Python Precommit significantly flaky

2020-05-07 Thread Ahmet Altay
I am aware of the following two issues that contribute to precommit flakiness. Someone is working on both. https://issues.apache.org/jira/browse/BEAM-9907 - ExternalTransform https://issues.apache.org/jira/browse/BEAM-9767 - test_streaming_wordcount flaky On Thu, May 7, 2020 at 10:54 AM Brian Hulette

Re: Beam 2.21 release update

2020-05-07 Thread Udi Meiri
Probably not the issue, but double checking: are you running "pip install -r sdks/python/build-requirements.txt" first? On Wed, May 6, 2020 at 7:22 PM Thomas Weise wrote: > I'm working on rebasing our fork to 2.21.0 and ran into a problem > installing grpcio-tools that leads to

Re: Python Precommit significantly flaky

2020-05-07 Thread Brian Hulette
It looks like this is likely due to my PR for pipeline options in external transforms: https://github.com/apache/beam/pull/11574 I'll confirm and roll back if it is. On Thu, May 7, 2020 at 10:51 AM Pablo Estrada wrote: > Hi all, > the Precommit tests have been flaky for a couple of days due to

Python Precommit significantly flaky

2020-05-07 Thread Pablo Estrada
Hi all, the Precommit tests have been flaky for a couple of days due to ExternalTransform tests. See https://builds.apache.org/job/beam_PreCommit_Python_Cron/ - and an example failure: https://builds.apache.org/job/beam_PreCommit_Python_Cron/2723/ Does somebody know about this? Best -P.

Re: [REVIEW][please pause website changes] Migrated the Beam website to Hugo

2020-05-07 Thread Aizhamal Nurmamat kyzy
Thanks for the writeup Ahmet. My bias is to move forward and merge the PR. After this, we'll review the outcome, and ensure that all the content is there. Nam will help us with that. The reason that I'd like to move forward and merge what we have now - is that if we don't do that, the work done

Re: Python Static Typing: Next Steps

2020-05-07 Thread Luke Cwik
Sweet. On Wed, May 6, 2020 at 5:05 PM Robert Bradshaw wrote: > Just an update on this: we just merged > https://github.com/apache/beam/pull/11620 which enforces typechecking > for all files that currently pass. > > On Tue, Mar 3, 2020 at 1:12 PM Chad Dombrova wrote: > >> > >> This probably

Re: Validates Runner on Java 11 and the Java SDK Harness

2020-05-07 Thread David Morávek
Great effort Ismaël! ;) Can't wait to try this out :) On Thu, May 7, 2020 at 12:08 PM Ismaël Mejía wrote: > Filed https://issues.apache.org/jira/browse/BEAM-9915 for the moment > to track this. > > On Wed, Apr 22, 2020 at 10:35 PM Mikhail Gryzykhin > wrote: > > > > +Paweł Pasterz > > > > On

Re: Builtin IOs - Link to Java/Pydoc instead of code?

2020-05-07 Thread Alexey Romanenko
Good point, Pablo! I think we can use “current” for this purpose, thanks. Though, I guess we still have to wait for documentation engine migration to Hugo (separate thread discussion) and only then update the IO links. I created a Jira for that: https://issues.apache.org/jira/browse/BEAM-9916

Re: Validates Runner on Java 11 and the Java SDK Harness

2020-05-07 Thread Ismaël Mejía
Filed https://issues.apache.org/jira/browse/BEAM-9915 for the moment to track this. On Wed, Apr 22, 2020 at 10:35 PM Mikhail Gryzykhin wrote: > > +Paweł Pasterz > > On Wed, Apr 22, 2020, 13:23 Pablo Estrada wrote: >> >> +Mikhail Gryzykhin fyi : ) >> >> On Tue, Apr 21, 2020 at 1:25 PM Ismaël