Re: [REVIEW][please pause website changes] Migrated the Beam website to Hugo

2020-05-08 Thread Ahmet Altay
This sounds reasonable to me. Thank you. Nam, does it make sense to you? On Fri, May 8, 2020 at 11:53 AM Robert Bradshaw wrote: > I'd really like to not see this work go to waste, both the original > revision, the further efforts Nam has done in making it more manageable to > review, and the

Re: Python2.7 Beam End-of-Life Date

2020-05-08 Thread Valentyn Tymofieiev
That's good news! Thanks for sharing. Another datapoint, here are a few of Beam's dependencies that no longer release new py2 artifacts (I looked at REQUIRED_PACKAGES + aws, gcp, and interactive extras): hdfs numpy pyarrow ipython There are more if we include transitive dependencies and

Re: Beam 2.21 release update

2020-05-08 Thread Kyle Weaver
Thanks for the heads up Thomas. Please let us know as soon as possible what you find. On Fri, May 8, 2020 at 2:43 PM Thomas Weise wrote: > I could not find a way for the hidden pip install to succeed within the > virtual environment. > > Also, heads up that we found another issue related to

Runner dependent sharding for dynamic destinations in FileIO

2020-05-08 Thread amit kumar
Hi Everyone, We use FileIO's writeDynamic to write dynamically to separate groups based on an attribute's value in the input PCollection. Is there a way to make the sharding runner dependent? Many thanks, Amit

Re: [REVIEW][please pause website changes] Migrated the Beam website to Hugo

2020-05-08 Thread Robert Bradshaw
I'd really like to not see this work go to waste, both the original revision, the further efforts Nam has done in making it more manageable to review, and the work put into reviewing this so far, so we can get the benefits of being on Hugo. How about this for a concrete proposal: (1) We get

Re: Beam 2.21 release update

2020-05-08 Thread Thomas Weise
I could not find a way for the hidden pip install to succeed within the virtual environment. Also, heads up that we found another issue related to timer encoding that looks like a release blocker. Details to follow. On Fri, May 8, 2020 at 9:53 AM Udi Meiri wrote: > +Chad Dombrova , who added

Re: [REVIEW][please pause website changes] Migrated the Beam website to Hugo

2020-05-08 Thread Robert Bradshaw
Here's a script that we could run on the old and new sites that should quickly catch any major issues but not get caught up in formatting minutiae. On Fri, May 8, 2020 at 10:23 AM Robert Bradshaw wrote: > On Fri, May 8, 2020 at 9:58 AM Aizhamal Nurmamat kyzy > wrote: > >> I understand the

Re: [REVIEW][please pause website changes] Migrated the Beam website to Hugo

2020-05-08 Thread Robert Bradshaw
On Fri, May 8, 2020 at 9:58 AM Aizhamal Nurmamat kyzy wrote: > I understand the difficulty, and this certainly comes with lessons learned > for future similar projects. > > To your questions Robert: > (1 and 2) I will commit to review the text in the resulting pages. I will > try and use some

Re: [REVIEW][please pause website changes] Migrated the Beam website to Hugo

2020-05-08 Thread Aizhamal Nurmamat kyzy
Also an update from +Nam Bui : "This commit [1] is up-to-date. So I walked through all of markdown files. Apart from Syntax changes between Jekyll & Hugo, if there were any differences in contents regarding to removed/added/modified, I would have double check the text with the current website. I

Re: TextIO. Writing late files

2020-05-08 Thread Reuven Lax
The window information should still be there. Beam propagates windows through PCollection, and I don't think WriteFiles does anything explicit to stop that. Can you try this with the direct runner to see what happens there? What is your windowing on this PCollection? Reuven On Fri, May 8, 2020

Re: [REVIEW][please pause website changes] Migrated the Beam website to Hugo

2020-05-08 Thread Aizhamal Nurmamat kyzy
I understand the difficulty, and this certainly comes with lessons learned for future similar projects. To your questions Robert: (1 and 2) I will commit to review the text in the resulting pages. I will try and use some automation to extract visible text from each page and diff it with the
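As an illustration of the "extract visible text from each page and diff it" idea, here is a minimal sketch using only the Python standard library. It is not the script shared in this thread; the names VisibleTextExtractor, visible_text and text_diff are hypothetical, and it assumes the old and new pages are already available as HTML strings.

```python
import difflib
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            # Collapse whitespace runs so formatting-only changes are ignored.
            text = " ".join(data.split())
            if text:
                self.chunks.append(text)


def visible_text(html):
    """Return the visible text of an HTML string, one entry per text node."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks


def text_diff(old_html, new_html):
    """Return unified-diff lines between the visible text of two pages."""
    return list(
        difflib.unified_diff(
            visible_text(old_html),
            visible_text(new_html),
            fromfile="old",
            tofile="new",
            lineterm="",
        )
    )
```

An empty result from text_diff means the two pages carry the same visible text even if their markup or whitespace differs. Text that is split differently across inline tags will still show up as a difference, so a real comparison would likely need further normalization.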

Re: Beam 2.21 release update

2020-05-08 Thread Udi Meiri
+Chad Dombrova , who added _find_protoc_gen_mypy. I'm guessing that the code in _install_grpcio_tools_and_generate_proto_files creates a kind of virtualenv, but it only works well for staging Python modules and not binaries like protoc-gen-mypy. (I assume there's a reason why it doesn't invoke

Re: [REVIEW][please pause website changes] Migrated the Beam website to Hugo

2020-05-08 Thread Brian Hulette
I'm -0 on merging as-is. I have the same concerns as Robert and he's voiced them very well so I won't waste time re-airing them. > (2) I spot checked the content, pulled out some common patterns, and > it mostly looks good, but there were also some issues (e.g. several > pages were replaced with

Re: Python2.7 Beam End-of-Life Date

2020-05-08 Thread Robert Bradshaw
It hasn't been 3 months yet, but I wanted to call out a milestone: Python 3 downloads crossed the 50% threshold on PyPI, if just briefly. On Thu, Feb 13, 2020 at 12:40 AM Ismaël Mejía wrote: > > > I would suggest re-evaluating this within the next 3 months again. We need > > to balance

Re: Ways to learn Beam in detail

2020-05-08 Thread Luke Cwik
Start with the contribution guide[1] and pick up some of the starter tasks[2] to learn parts of the codebase. 1: https://beam.apache.org/contribute/ 2: https://s.apache.org/beam-starter-tasks On Thu, May 7, 2020 at 11:58 PM deepak kumar wrote: > Hi There, > Can anyone suggest the best way to

Re: TextIO. Writing late files

2020-05-08 Thread Jose Manuel
I got the same behavior using Spark Runner (with Spark 2.4.3): window information was missing. Just to clarify, the combiner after TextIO had different results. In Flink runner the file names were dropped, and in Spark the combination process happened twice, duplicating data. I think it is

Ways to learn Beam in detail

2020-05-08 Thread deepak kumar
Hi There, Can anyone suggest the best way to get started on Beam APIs, so I can start contributing to the codebase? Thanks Deepak

Re: [REVIEW][please pause website changes] Migrated the Beam website to Hugo

2020-05-08 Thread Robert Bradshaw
This is a tough situation. It would have been much better if this transition had been structured in such a way that the review was more manageable (e.g. the suggestion of scripts, not mixing in voluminous unnecessary changes like whitespace, and not updating content), and possibly even incrementally