Re: [DISCUSS] Publish vendored dependencies independently

2018-11-13 Thread Lukasz Cwik
I have made some incremental progress on this and wanted to release our first vendored dependency of gRPC 1.13.1, since I was able to fix a good number of the import/code completion errors that IntelliJ was experiencing. I have published an example of what the jar/pom looks like in the Apache

Re: How to use "PortableRunner" in Python SDK?

2018-11-13 Thread Ruoyun Huang
A quick follow-up on using the current PortableRunner. I followed the exact three steps that Ankur and Maximilian shared in https://beam.apache.org/roadmap/portability/#python-on-flink . The wordcount example keeps hanging after 10 minutes. I also tried specifying explicit input/output args, either

Re: [Call for items] November Beam Newsletter

2018-11-13 Thread Rui Wang
Hi, I just added something related to BeamSQL. -Rui On Tue, Nov 13, 2018 at 3:26 AM Etienne Chauchot wrote: > Hi, > I just added some things that were done. > > Etienne > > On Monday, 12 November 2018 at 12:22 +, Matthias Baetens wrote: > > Looks great, thanks for the effort and for

Re: [PROPOSAL] ParquetIO support for Python SDK

2018-11-13 Thread Heejong Lee
In the current PR, there will be two parameters that can control the final row group size: row_group_buffer_size and record_batch_size. The records are first stored as a list of columns and then transformed into a record batch (a data structure defined in pyarrow) when the number of records in the
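The buffering described in this excerpt can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `ColumnBuffer` class and the `batches` helper are hypothetical names, and because the excerpt is cut off, the flush trigger (the record count reaching `record_batch_size`) is an assumption. The second parameter, row_group_buffer_size, is not modeled, and a plain dict of column lists stands in for the pyarrow record batch.

```python
from typing import Any, Dict, Iterable, List, Optional


class ColumnBuffer:
    """Hypothetical buffer that stores records column-wise and emits batches."""

    def __init__(self, column_names: List[str], record_batch_size: int):
        self.column_names = column_names
        self.record_batch_size = record_batch_size
        self.columns: Dict[str, List[Any]] = {n: [] for n in column_names}
        self.num_records = 0

    def add(self, record: Dict[str, Any]) -> Optional[Dict[str, List[Any]]]:
        # Append each field to its column list (records stored as columns).
        for name in self.column_names:
            self.columns[name].append(record.get(name))
        self.num_records += 1
        # Assumption: a batch is cut once record_batch_size records are buffered.
        if self.num_records >= self.record_batch_size:
            return self.flush()
        return None

    def flush(self) -> Optional[Dict[str, List[Any]]]:
        # Hand back the buffered columns and start a fresh buffer.
        if self.num_records == 0:
            return None
        batch = self.columns
        self.columns = {n: [] for n in self.column_names}
        self.num_records = 0
        return batch


def batches(records: Iterable[Dict[str, Any]],
            column_names: List[str],
            record_batch_size: int) -> List[Dict[str, List[Any]]]:
    """Group records into column-oriented batches, including a final partial one."""
    buf = ColumnBuffer(column_names, record_batch_size)
    out = []
    for record in records:
        batch = buf.add(record)
        if batch is not None:
            out.append(batch)
    tail = buf.flush()
    if tail is not None:
        out.append(tail)
    return out
```

In a real writer, each emitted dict of columns would presumably be converted into a pyarrow record batch before being written; that step is left out here so the sketch runs without pyarrow installed.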

Re: Spotless and lint precommit

2018-11-13 Thread Udi Meiri
+1 and parallelize the 3 lint tasks On Tue, Nov 13, 2018 at 10:43 AM Thomas Weise wrote: > +1 > > > On Tue, Nov 13, 2018 at 9:06 AM Ruoyun Huang wrote: > >> +1 >> >> On Tue, Nov 13, 2018 at 8:29 AM Maximilian Michels >> wrote: >> >>> +1 >>> >>> On 13.11.18 14:22, Robert Bradshaw wrote: >>> >

Re: Bigquery streaming TableRow size limit

2018-11-13 Thread Lukasz Cwik
I would rather not have the builder method and run into the quota issue than require the builder method and still run into quota issues. On Mon, Nov 12, 2018 at 5:25 PM Reuven Lax wrote: > I'm a bit worried about making this automatic, as it can have unexpected > side effects on BigQuery

Re: Spotless and lint precommit

2018-11-13 Thread Thomas Weise
+1 On Tue, Nov 13, 2018 at 9:06 AM Ruoyun Huang wrote: > +1 > > On Tue, Nov 13, 2018 at 8:29 AM Maximilian Michels wrote: > >> +1 >> >> On 13.11.18 14:22, Robert Bradshaw wrote: >> > I really like how Spotless runs separately and quickly for Java code. >> > Should we do the same for Python

Re: [BEAM-5442] Store duplicate unknown (runner) options in a list argument

2018-11-13 Thread Robert Burke
+1 to Option 3 I'd rather have each SDK have a single point of well defined complexity to do something general, than have to make tiny but simple changes. Less toil and maintenance in the long run per SDK. Similarly I don't have time to make it happen right now. On Tue, Nov 13, 2018, 9:22 AM

Re: [BEAM-5442] Store duplicate unknown (runner) options in a list argument

2018-11-13 Thread Thomas Weise
Discovering options from the job server would be the only way to perform full validation (and provide upfront help to the user). The runner cannot perform full validation, since it is not aware of the user and SDK options (that it has to forward to the SDK worker). Special runner options flag to
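The idea in this thread's subject line, storing duplicate unknown options in a list, can be sketched as below. This is only an illustration under stated assumptions, not the approach chosen in BEAM-5442: the function name `parse_with_unknown` and the option names in the usage note are hypothetical, and the only known option is `--runner`. Options the parser does not recognize are collected into a dict that maps each name to a list of values, so repeated flags are preserved and can be forwarded without validation.

```python
import argparse
from collections import defaultdict


def parse_with_unknown(argv):
    """Parse known options; collect unknown ones as name -> list of values."""
    # allow_abbrev=False keeps unknown flags from matching known ones by prefix.
    parser = argparse.ArgumentParser(allow_abbrev=False)
    parser.add_argument("--runner", default="DirectRunner")
    known, unknown = parser.parse_known_args(argv)

    extras = defaultdict(list)
    i = 0
    while i < len(unknown):
        token = unknown[i]
        if token.startswith("--"):
            key, sep, value = token[2:].partition("=")
            if not sep:
                # "--key value" form, or a bare flag with no value.
                if i + 1 < len(unknown) and not unknown[i + 1].startswith("--"):
                    value = unknown[i + 1]
                    i += 1
                else:
                    value = True
            # Duplicates are appended rather than overwritten.
            extras[key].append(value)
        i += 1
    return known, dict(extras)
```

For example, `parse_with_unknown(["--runner=PortableRunner", "--foo=a", "--foo=b"])` keeps both values of the hypothetical `--foo` option in one list instead of letting the second overwrite the first.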

Re: Spotless and lint precommit

2018-11-13 Thread Ruoyun Huang
+1 On Tue, Nov 13, 2018 at 8:29 AM Maximilian Michels wrote: > +1 > > On 13.11.18 14:22, Robert Bradshaw wrote: > > I really like how Spotless runs separately and quickly for Java code. > > Should we do the same for Python lint? > > > -- Ruoyun Huang

Re: Spotless and lint precommit

2018-11-13 Thread Maximilian Michels
+1 On 13.11.18 14:22, Robert Bradshaw wrote: I really like how Spotless runs separately and quickly for Java code. Should we do the same for Python lint?

Re: Design review for supporting AutoValue Coders and conversions to Row

2018-11-13 Thread Jeff Klukas
Sounds, then, like we need to define a new `AutoValueSchema extends SchemaProvider` and users would opt-in to this via the DefaultSchema annotation: @DefaultSchema(AutoValueSchema.class) @AutoValue public abstract MyClass ... Since we already have the JavaBean and JavaField reflection-based

Re: [VOTE] Mark 2.7.0 branch as a long term support (LTS) branch

2018-11-13 Thread Jean-Baptiste Onofré
+0 Regards JB On 09/11/2018 02:47, Ahmet Altay wrote: > Hi all, > > Please review the following statement: > > "2.7.0 branch will be marked as the long-term-support (LTS) release > branch. This branch will be supported for a window of 6 months starting > from the day it is marked as an LTS

Spotless and lint precommit

2018-11-13 Thread Robert Bradshaw
I really like how Spotless runs separately and quickly for Java code. Should we do the same for Python lint?

Re: [PROPOSAL] ParquetIO support for Python SDK

2018-11-13 Thread Robert Bradshaw
Was there resolution on how to handle row group size, given that it's hard to pick a decent default? IIRC, the ideal was to base this on byte sizes; will this be in v1 or will there be other parameter(s) that we'll have to support going forward? On Tue, Oct 30, 2018 at 10:42 PM Heejong Lee wrote:

Re: [Call for items] November Beam Newsletter

2018-11-13 Thread Etienne Chauchot
Hi, I just added some things that were done. Etienne On Monday, 12 November 2018 at 12:22 +, Matthias Baetens wrote: > Looks great, thanks for the effort and for including the Summit blogpost, > Rose! > On Thu, 8 Nov 2018 at 22:55 Rose Nguyen wrote: > > Hi Beamers: > > > > > > Time to sync