Re: Question about unbounded in-memory PCollection

2019-05-07 Thread Rui Wang
Does TestStream.java [1] satisfy your need? -Rui [1] https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/TestStream.java
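
For context, a minimal sketch of how TestStream can be used to drive an unbounded, in-memory stream with controlled timestamps and watermark advances. The element values, timestamps, and windowing below are illustrative only, and TestStream needs a runner that supports it (e.g. the DirectRunner):

  import org.apache.beam.sdk.Pipeline;
  import org.apache.beam.sdk.coders.StringUtf8Coder;
  import org.apache.beam.sdk.testing.TestStream;
  import org.apache.beam.sdk.transforms.windowing.FixedWindows;
  import org.apache.beam.sdk.transforms.windowing.Window;
  import org.apache.beam.sdk.values.PCollection;
  import org.apache.beam.sdk.values.TimestampedValue;
  import org.joda.time.Duration;
  import org.joda.time.Instant;

  public class TestStreamSketch {
    public static void main(String[] args) {
      Pipeline p = Pipeline.create();

      // An unbounded, in-memory stream with explicit event timestamps and
      // watermark advances; handy for testing triggers and late data.
      TestStream<String> events =
          TestStream.create(StringUtf8Coder.of())
              .addElements(
                  TimestampedValue.of("a", new Instant(0)),
                  TimestampedValue.of("b", new Instant(0).plus(Duration.standardSeconds(30))))
              .advanceWatermarkTo(new Instant(0).plus(Duration.standardMinutes(1)))
              // This element arrives after the watermark has passed its timestamp.
              .addElements(TimestampedValue.of("late", new Instant(0)))
              .advanceWatermarkToInfinity();

      PCollection<String> windowed =
          p.apply(events)
              .apply(Window.<String>into(FixedWindows.of(Duration.standardMinutes(1))));

      p.run().waitUntilFinish();
    }
  }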

Re: Question about unbounded in-memory PCollection

2019-05-07 Thread Chengzhi Zhao
Hi Beam Team, I am new here and recently studied the programming guide. I have a question about in-memory data: https://beam.apache.org/documentation/programming-guide/#creating-a-pcollection Is there a way to create an unbounded PCollection from an in-memory collection? I want to test the unb…
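
Besides TestStream (suggested in the reply above), another standard way to get an unbounded PCollection without any external source is GenerateSequence. A rough sketch, with an arbitrary rate chosen for illustration:

  import org.apache.beam.sdk.Pipeline;
  import org.apache.beam.sdk.io.GenerateSequence;
  import org.apache.beam.sdk.values.PCollection;
  import org.joda.time.Duration;

  public class UnboundedSequenceSketch {
    public static void main(String[] args) {
      Pipeline p = Pipeline.create();

      // Emits one Long per second with no end, so the resulting PCollection
      // is unbounded; the pipeline runs until it is cancelled.
      PCollection<Long> ticks =
          p.apply(GenerateSequence.from(0).withRate(1, Duration.standardSeconds(1)));

      p.run().waitUntilFinish();
    }
  }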

Re: Streaming inserts BQ with Java SDK Beam

2019-05-07 Thread Andres Angel
Pablo, thanks so much. I will explore this method for STREAMING_INSERTS then; this answer is worth a lot to us :) Thanks.

Re: Streaming inserts BQ with Java SDK Beam

2019-05-07 Thread Alex Van Boxel
I think you really need a peculiar reason to force streaming inserts in a batch job. Note that you will quickly hit the quota limit in batch mode: "Maximum rows per second: 100,000 rows per second, per project", since with a batch load you can process a lot more information in a shorter…

Re: Is it safe to cache the value of a singleton view (with a global window) in a DoFn?

2019-05-07 Thread Lukasz Cwik
Keep your code simple and rely on the runner caching the value locally, so it should be very cheap to access. If you have a performance issue due to a runner lacking caching, it would be good to hear about it so we could file a JIRA about it.
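
A minimal sketch of the pattern being discussed: reading a singleton side input directly in the DoFn on each element, rather than caching it yourself. The Create inputs and prefix logic are made up purely for illustration:

  import org.apache.beam.sdk.Pipeline;
  import org.apache.beam.sdk.transforms.Create;
  import org.apache.beam.sdk.transforms.DoFn;
  import org.apache.beam.sdk.transforms.ParDo;
  import org.apache.beam.sdk.transforms.View;
  import org.apache.beam.sdk.values.PCollection;
  import org.apache.beam.sdk.values.PCollectionView;

  public class SingletonSideInputSketch {
    public static void main(String[] args) {
      Pipeline p = Pipeline.create();

      // A singleton view in the global window, e.g. a configuration value.
      PCollectionView<String> configView =
          p.apply("Config", Create.of("prefix-")).apply(View.<String>asSingleton());

      PCollection<String> output =
          p.apply("Input", Create.of("a", "b", "c"))
              .apply(
                  ParDo.of(
                          new DoFn<String, String>() {
                            @ProcessElement
                            public void processElement(ProcessContext c) {
                              // Read the side input on every element; the runner is
                              // expected to cache the materialized value locally,
                              // so this lookup is cheap and needs no manual caching.
                              String prefix = c.sideInput(configView);
                              c.output(prefix + c.element());
                            }
                          })
                      .withSideInputs(configView));

      p.run().waitUntilFinish();
    }
  }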

Re: Streaming inserts BQ with Java SDK Beam

2019-05-07 Thread Pablo Estrada
Hi Andres! You can definitely do streaming inserts using the Java SDK. This is available with BigQueryIO.write(). Specifically, you can use the `withMethod` [1] call to specify whether you want batch loads or streaming inserts. If you specify streaming inserts, Beam should insert rows as they come in…
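
A short sketch of what that might look like with BigQueryIO.writeTableRows() and withMethod. The table spec, dispositions, and row type are placeholders, and this assumes the beam-sdks-java-io-google-cloud-platform dependency is on the classpath:

  import com.google.api.services.bigquery.model.TableRow;
  import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
  import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.CreateDisposition;
  import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.WriteDisposition;
  import org.apache.beam.sdk.values.PCollection;

  public class StreamingInsertsSketch {
    public static void write(PCollection<TableRow> rows) {
      // "my-project:my_dataset.my_table" is a placeholder table spec.
      rows.apply(
          BigQueryIO.writeTableRows()
              .to("my-project:my_dataset.my_table")
              // Force streaming inserts instead of batch load jobs.
              .withMethod(BigQueryIO.Write.Method.STREAMING_INSERTS)
              // CREATE_NEVER avoids needing a schema here; adjust as required.
              .withCreateDisposition(CreateDisposition.CREATE_NEVER)
              .withWriteDisposition(WriteDisposition.WRITE_APPEND));
    }
  }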

Streaming inserts BQ with Java SDK Beam

2019-05-07 Thread Andres Angel
Hello everyone, I need to use BigQuery inserts within my Beam pipeline. I know the built-in IO options offer `BigQueryIO`; however, this will insert into BQ in a batch fashion, creating a BQ load job underneath. I instead need to trigger a streaming insert into BQ, and I was reviewing the J…

Re: Apache BEAM on Flink in production

2019-05-07 Thread Austin Bennett
On the Beam YouTube channel (https://www.youtube.com/channel/UChNnb_YO_7B0HlW6FhAXZZQ) you can see two talks from people at Lyft; they use Beam on Flink. Other users can also chime in as to how they are running it. I would also suggest coming to BeamSummit.org in Berlin in June and/or sharing experienc…

Apache BEAM on Flink in production

2019-05-07 Thread Stephen.Hesketh
Hi all, We currently run Apache Flink based data load processes (fairly simple streaming ETL jobs) and are looking at converting to Apache BEAM to give more flexibility on the runner. Is anyone aware of any organisations running Apache BEAM on Flink in production? Does anyone have any case stu…