Re: [DISCUSS] Processing time timers in "batch" (faster-than-wall-time [re]processing)

2024-02-22 Thread Robert Burke
This is a "timely" discussion because my next step for Prism is to address ProcessingTime. The description of the watermarks matches my understanding and how it's implemented so far in Prism [0], where the "stage" contains one or more transforms to be executed by a worker. My current thinking

[DISCUSS] Processing time timers in "batch" (faster-than-wall-time [re]processing)

2024-02-22 Thread Kenneth Knowles
Forking this thread. The state of processing time timers in this mode of processing is not satisfactory and is discussed a lot but we should make everything explicit. Currently, a state and timer DoFn has a number of logical watermarks: (apologies for fixed width not coming through in email lists

Re: Throttle PTransform

2024-02-22 Thread Robert Bradshaw via dev
On Thu, Feb 22, 2024 at 9:37 AM Reuven Lax via dev wrote: > > On Thu, Feb 22, 2024 at 9:26 AM Kenneth Knowles wrote: >> >> Wow I love your input Reuven. Of course "the source" that you are applying >> backpressure to is often a runner's shuffle so it may be state anyhow, but >> it is good to gi

Re: Bigquery Connector Rate limits

2024-02-22 Thread Taher Koitawala
Hello Ahmed, Thanks for the information; this helps a lot. On Thu, 22 Feb 2024 at 9:09 PM, Ahmed Abualsaud via dev wrote: > Hey Taher, > > Regarding the first question about what API Beam uses, that depends on the > BigQuery method you set in the connector's configuration. We have 4 > different w

Re: Throttle PTransform

2024-02-22 Thread Reuven Lax via dev
On Thu, Feb 22, 2024 at 9:26 AM Kenneth Knowles wrote: > Wow I love your input Reuven. Of course "the source" that you are applying > backpressure to is often a runner's shuffle so it may be state anyhow, but > it is good to give the runner the choice of how to figure that out and > maybe chain b

Re: Pipeline upgrade to 2.55.0-SNAPSHOT broken for FlinkRunner

2024-02-22 Thread Robert Bradshaw via dev
On Wed, Feb 21, 2024 at 11:58 PM Jan Lukavský wrote: > > Reasons we use Java serialization are not fundamental, probably only > historical. Thinking about it, yes, there is a lucky coincidence that we > currently have to change the serialization because of Flink 1.17 support. > Flink 1.17 actuall

Re: Throttle PTransform

2024-02-22 Thread Kenneth Knowles
Wow I love your input Reuven. Of course "the source" that you are applying backpressure to is often a runner's shuffle so it may be state anyhow, but it is good to give the runner the choice of how to figure that out and maybe chain backpressure further. The goal is basically to make a sink that d

Re: [PROPOSAL] Preparing for 2.55.0 Release

2024-02-22 Thread Kenneth Knowles
Hooray! Thank you! On Thu, Feb 22, 2024 at 10:24 AM Yi Hu via dev wrote: > Hey Beam community, > > The next release (2.55.0) branch cut is scheduled on Mar 6th, 2024, > according to > the release calendar [1]. > > I volunteer to perform this release. My plan is to cut the branch on that > date,

Re: Pipeline upgrade to 2.55.0-SNAPSHOT broken for FlinkRunner

2024-02-22 Thread Kenneth Knowles
Great. Let me know if I can help. I broke it after all :-) Kenn On Thu, Feb 22, 2024 at 2:58 AM Jan Lukavský wrote: > Reasons we use Java serialization are not fundamental, probably only > historical. Thinking about it, yes, there is a lucky coincidence that we > currently have to change the seri

Re: Bigquery Connector Rate limits

2024-02-22 Thread Ahmed Abualsaud via dev
Hey Taher, Regarding the first question about what API Beam uses, that depends on the BigQuery method you set in the connector's configuration. We have 4 different write methods, and a high-level description of each can be found in the documentation: https://beam.apache.org/releases/javadoc/curren

[PROPOSAL] Preparing for 2.55.0 Release

2024-02-22 Thread Yi Hu via dev
Hey Beam community, The next release (2.55.0) branch cut is scheduled on Mar 6th, 2024, according to the release calendar [1]. I volunteer to perform this release. My plan is to cut the branch on that date, and cherrypick release-blocking fixes afterwards, if any. Please help me make sure the re

Bigquery Connector Rate limits

2024-02-22 Thread Taher Koitawala
Hi All, I want to ask questions regarding sinking a very high volume stream to BigQuery. I will read messages from a Pubsub topic and write to BigQuery. In this streaming job I am worried about hitting the BigQuery streaming inserts limit of 1 GB per second on streaming API writes. I am fi

Beam High Priority Issue Report (37)

2024-02-22 Thread beamactions
This is your daily summary of Beam's current high priority issues that may need attention. See https://beam.apache.org/contribute/issue-priorities for the meaning and expectations around issue priorities. Unassigned P0 Issues: https://github.com/apache/beam/issues/30377 [Failing Test]: 404