Re: [DISCUSS] Provide MultimapUserStateHandler interface in StateRequestHandlers

2023-02-24 Thread Alan Zhang
Hi Robert, Thanks for confirming that the Fn API already supports the MultimapUserState use cases; I really appreciate it! I totally agree that how to integrate it depends on the runner's implementation. A follow-up question: - is there any runner that already implemented

Re: [DISCUSS] Provide MultimapUserStateHandler interface in StateRequestHandlers

2023-02-24 Thread Robert Burke
The runners should be able to support Multimap User State portably over the FnApi already. https://github.com/apache/beam/blob/master/model/fn-execution/src/main/proto/org/apache/beam/model/fn_execution/v1/beam_fn_api.proto#L937 How that's supported on each SDK is a different matter though. On

Re: [DISCUSS] Provide MultimapUserStateHandler interface in StateRequestHandlers

2023-02-24 Thread Alan Zhang
I would appreciate it if anyone could help confirm and share thoughts. On Wed, Feb 22, 2023 at 11:46 PM Alan Zhang wrote: > Hi Beam devs. > > According to the Fn State API design doc[1], the state type > MultimapUserState is intended for supporting MapState/SetState. And the > implementation[2] for this

Re: Consuming one PCollection before consuming another with Beam

2023-02-24 Thread Kenneth Knowles
My suggestion is to try to solve the problem in terms of what you want to compute. Instead of trying to control operational aspects like "read all the BQ before reading Pubsub", consider that there is presumably some reason that the BQ data naturally "comes first", for example if its timestamps are earlier

Re: Consuming one PCollection before consuming another with Beam

2023-02-24 Thread Reuven Lax via dev
First, PCollections are completely unordered, so there is no guarantee on what order you'll see events in the flattened PCollection. There may be ways to process the BigQuery data in a separate transform first, but it depends on the structure of the data. How large is the BigQuery table? Are you

Re: Consuming one PCollection before consuming another with Beam

2023-02-24 Thread Sahil Modak via dev
Yes, this is a streaming pipeline. Some more details about the existing implementation vs. what we want to achieve. Current implementation: Reading from Pub/Sub: Pipeline input = Pipeline.create(options); PCollection pubsubStream = input.apply("Read From Pubsub",

Beam High Priority Issue Report (36)

2023-02-24 Thread beamactions
This is your daily summary of Beam's current high priority issues that may need attention. See https://beam.apache.org/contribute/issue-priorities for the meaning and expectations around issue priorities. Unassigned P1 Issues: https://github.com/apache/beam/issues/24776 [Bug]: Race