Re: [DISCUSS] Provide MultimapUserStateHandler interface in StateRequestHandlers

2023-02-24 Thread Alan Zhang
Hi Robert,

Thanks for confirming that the Fn API is already able to support the
MultimapUserState use cases; I really appreciate it! And I totally agree
that how to integrate it depends on the runner's implementation.

A follow-up question:

   - Is there any runner that has already implemented these Multimap protocols?
   - I didn't find a runner (e.g. Dataflow, Flink, Spark, or Samza) that
   defines handlers (code like [1]) for handling Multimap state requests, so
   I think the answer is no. But I wanted to double-check with you.
   - I'm asking because I want to know: 1) whether the existing Fn APIs for
   MultimapUserState are already used in production, and 2) whether we can
   build/abstract some generic layers (like what Beam does in
   StateRequestHandlers today) to benefit other runners.

Have a good weekend!

On Fri, Feb 24, 2023 at 1:18 PM Robert Burke wrote:

> The runners should be able to support Multimap User State portably over
> the FnApi already.
>
>
> https://github.com/apache/beam/blob/master/model/fn-execution/src/main/proto/org/apache/beam/model/fn_execution/v1/beam_fn_api.proto#L937
>
> How that's supported on each SDK is a different matter though.
>
>
> On Fri, Feb 24, 2023, 12:57 PM Alan Zhang wrote:
>
>> Appreciate it if anyone can help confirm and share thoughts.
>>
>> On Wed, Feb 22, 2023 at 11:46 PM Alan Zhang wrote:
>>
>>> Hi Beam devs.
>>>
>>> According to the Fn State API design doc[1], the state type
>>> MultimapUserState is intended for supporting MapState/SetState. And the
>>> implementation[2] for this state type is ready on the SDK harness side.
>>> Each runner will be responsible for integrating it if they want to leverage
>>> it.
>>>
>>> Today Beam uses StateRequestHandlers to define handler interfaces for
>>> other state types, e.g. MultimapSideInputHandler for
>>> MultimapSideInput, BagUserStateHandler for BagUserState, etc.[3] This is
>>> great since each runner can implement these handler interfaces then the Fn
>>> state API integration is done.
>>>
>>> In order to support MapState/SetState, I think we will need to provide
>>> a MultimapUserStateHandler interface in StateRequestHandlers and allow the
>>> runners to implement it.
>>>
>>> What do you think?
>>>
>>> Feel free to correct me if there is any incorrect understanding since
>>> I'm new to the Beam world.
>>>
>>> Btw, I saw Flink Python used MultimapSideInput to support MapState[4]
>>> but I think this is not recommended since MultimapUserState is available
>>> today. But please correct me if I'm wrong.
>>>
>>>
>>> [1] https://s.apache.org/beam-fn-state-api-and-bundle-processing
>>> [2] https://github.com/apache/beam/pull/15238
>>> [3]
>>> https://github.com/apache/beam/blob/master/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/state/StateRequestHandlers.java#L192
>>> [4]
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-153%3A+Support+state+access+in+Python+DataStream+API
>>> --
>>> Thanks,
>>> Alan
>>>
>>
>>
>> --
>> Thanks,
>> Alan
>>
>

-- 
Thanks,
Alan


Re: [DISCUSS] Provide MultimapUserStateHandler interface in StateRequestHandlers

2023-02-24 Thread Alan Zhang
I'd appreciate it if anyone could help confirm and share thoughts.

On Wed, Feb 22, 2023 at 11:46 PM Alan Zhang wrote:

> Hi Beam devs.
>
> According to the Fn State API design doc[1], the state type
> MultimapUserState is intended for supporting MapState/SetState. And the
> implementation[2] for this state type is ready on the SDK harness side.
> Each runner will be responsible for integrating it if they want to leverage
> it.
>
> Today Beam uses StateRequestHandlers to define handler interfaces for
> other state types, e.g. MultimapSideInputHandler for
> MultimapSideInput, BagUserStateHandler for BagUserState, etc.[3] This is
> great since each runner can implement these handler interfaces then the Fn
> state API integration is done.
>
> In order to support MapState/SetState, I think we will need to provide
> a MultimapUserStateHandler interface in StateRequestHandlers and allow the
> runners to implement it.
>
> What do you think?
>
> Feel free to correct me if there is any incorrect understanding since I'm
> new to the Beam world.
>
> Btw, I saw Flink Python used MultimapSideInput to support MapState[4] but
> I think this is not recommended since MultimapUserState is available today.
> But please correct me if I'm wrong.
>
>
> [1] https://s.apache.org/beam-fn-state-api-and-bundle-processing
> [2] https://github.com/apache/beam/pull/15238
> [3]
> https://github.com/apache/beam/blob/master/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/state/StateRequestHandlers.java#L192
> [4]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-153%3A+Support+state+access+in+Python+DataStream+API
> --
> Thanks,
> Alan
>


-- 
Thanks,
Alan


[DISCUSS] Provide MultimapUserStateHandler interface in StateRequestHandlers

2023-02-22 Thread Alan Zhang
Hi Beam devs.

According to the Fn State API design doc[1], the state type
MultimapUserState is intended for supporting MapState/SetState. And the
implementation[2] for this state type is ready on the SDK harness side.
Each runner will be responsible for integrating it if they want to leverage
it.

Today Beam uses StateRequestHandlers to define handler interfaces for other
state types, e.g. MultimapSideInputHandler for
MultimapSideInput, BagUserStateHandler for BagUserState, etc.[3] This is
great, since once a runner implements these handler interfaces, its Fn
state API integration is done.

In order to support MapState/SetState, I think we will need to provide
a MultimapUserStateHandler interface in StateRequestHandlers and allow the
runners to implement it.
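
A minimal sketch of what such a handler could look like, assuming state is
addressed by a user key and a window plus a map key. Everything here is an
assumption for illustration: the interface name comes from the proposal
above, but the method names, the type parameters, and the in-memory
implementation are hypothetical and are not an existing Beam API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical interface: the name follows the proposal, the methods are assumed.
// K = user key, MK = map key, V = value, W = window.
interface MultimapUserStateHandler<K, MK, V, W> {
  Iterable<MK> getKeys(K key, W window);

  Iterable<V> get(K key, W window, MK mapKey);

  void append(K key, W window, MK mapKey, Iterable<V> values);

  void clearKey(K key, W window, MK mapKey);

  void clear(K key, W window);
}

// Toy in-memory implementation; a real runner would back this with its own state store.
final class InMemoryMultimapUserStateHandler<K, MK, V, W>
    implements MultimapUserStateHandler<K, MK, V, W> {
  // One multimap per (user key, window) pair.
  private final Map<List<Object>, Map<MK, List<V>>> cells = new HashMap<>();

  private Map<MK, List<V>> cell(K key, W window) {
    return cells.computeIfAbsent(List.<Object>of(key, window), unused -> new LinkedHashMap<>());
  }

  @Override
  public List<MK> getKeys(K key, W window) {
    return new ArrayList<>(cell(key, window).keySet());
  }

  @Override
  public List<V> get(K key, W window, MK mapKey) {
    return new ArrayList<>(cell(key, window).getOrDefault(mapKey, Collections.emptyList()));
  }

  @Override
  public void append(K key, W window, MK mapKey, Iterable<V> values) {
    // The map key is only created once there is at least one value to store.
    for (V value : values) {
      cell(key, window).computeIfAbsent(mapKey, unused -> new ArrayList<>()).add(value);
    }
  }

  @Override
  public void clearKey(K key, W window, MK mapKey) {
    cell(key, window).remove(mapKey);
  }

  @Override
  public void clear(K key, W window) {
    cells.remove(List.<Object>of(key, window));
  }
}

final class MultimapHandlerDemo {
  public static void main(String[] args) {
    InMemoryMultimapUserStateHandler<String, String, Integer, String> handler =
        new InMemoryMultimapUserStateHandler<>();
    handler.append("user-1", "global", "clicks", List.of(1, 2));
    handler.append("user-1", "global", "views", List.of(7));
    System.out.println(handler.getKeys("user-1", "global"));
    System.out.println(handler.get("user-1", "global", "clicks"));
  }
}
```

The operations mirror what a multimap needs beyond a bag: listing map keys,
and reading, appending to, or clearing the values under one map key.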

What do you think?

Feel free to correct me if I have misunderstood anything, since I'm
new to the Beam world.

Btw, I saw Flink Python used MultimapSideInput to support MapState[4] but I
think this is not recommended since MultimapUserState is available today.
But please correct me if I'm wrong.


[1] https://s.apache.org/beam-fn-state-api-and-bundle-processing

[2] https://github.com/apache/beam/pull/15238
[3]
https://github.com/apache/beam/blob/master/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/state/StateRequestHandlers.java#L192
[4]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-153%3A+Support+state+access+in+Python+DataStream+API
-- 
Thanks,
Alan


MapState/SetState(aka, MultimapUserState) are not fully supported in Beam portability framework?

2023-01-24 Thread Alan Zhang via dev
Hi everyone,

Why don’t we have interfaces (e.g. MultimapUserStateHandler and 
MultimapUserStateHandlerFactory) for supporting MultimapUserState defined in 
the class 
StateRequestHandlers<https://github.com/apache/beam/blob/master/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/state/StateRequestHandlers.java#L69>?
 Is this support planned but not implemented yet, or were there concerns that 
made us decide not to support it? Or is this class not the right place to define 
these MultimapUserState-related handler interfaces?

For example, to support BagUserState, this class defines two 
related interfaces, 
BagUserStateHandler<https://github.com/apache/beam/blob/master/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/state/StateRequestHandlers.java#L192>
 and 
BagUserStateHandlerFactory<https://github.com/apache/beam/blob/master/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/state/StateRequestHandlers.java#L215>,
 and the runners (Samza/Flink/Spark) can provide their own implementations (e.g. 
Samza’s 
SamzaStateRequestHandlers<https://github.com/apache/beam/blob/master/runners/samza/src/main/java/org/apache/beam/runners/samza/runtime/SamzaStateRequestHandlers.java#L123>)
 of these interfaces to support ValueState, BagState and CombiningState.

I saw that the existing Fn Harness implementation is able to handle MapState and 
SetState by using 
FnApiStateAccessor<https://github.com/apache/beam/blob/master/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/state/FnApiStateAccessor.java#L444>,
 and that it builds the right 
StateRequest<https://github.com/apache/beam/blob/master/model/fn-execution/src/main/proto/org/apache/beam/model/fn_execution/v1/beam_fn_api.proto#L662>/StateKey
 for them. So if the Beam Fn APIs provided these interfaces for each runner to 
integrate, then I would think MultimapUserState is fully supported in the Beam 
portability framework.
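
To make the relationship between these state types concrete, here is a
minimal sketch of how map and set semantics can be layered on a
multimap-shaped state cell. It is an illustration under stated assumptions,
not the harness's actual code: MultimapCell, MapStateView and SetStateView
are hypothetical names, and storing a placeholder Boolean per set element is
an assumed encoding.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for one multimap user-state cell (fixed user key and window).
final class MultimapCell<MK, V> {
  private final Map<MK, List<V>> entries = new LinkedHashMap<>();

  List<MK> keys() {
    return new ArrayList<>(entries.keySet());
  }

  List<V> get(MK mapKey) {
    return new ArrayList<>(entries.getOrDefault(mapKey, List.of()));
  }

  void append(MK mapKey, V value) {
    entries.computeIfAbsent(mapKey, unused -> new ArrayList<>()).add(value);
  }

  void remove(MK mapKey) {
    entries.remove(mapKey);
  }
}

// Map semantics: at most one value per map key, so put = remove + append.
final class MapStateView<MK, V> {
  private final MultimapCell<MK, V> cell;

  MapStateView(MultimapCell<MK, V> cell) {
    this.cell = cell;
  }

  void put(MK mapKey, V value) {
    cell.remove(mapKey);
    cell.append(mapKey, value);
  }

  V get(MK mapKey) {
    List<V> values = cell.get(mapKey);
    return values.isEmpty() ? null : values.get(0);
  }

  void remove(MK mapKey) {
    cell.remove(mapKey);
  }

  List<MK> keys() {
    return cell.keys();
  }
}

// Set semantics: elements are stored as map keys; the value is only a placeholder.
final class SetStateView<T> {
  private final MultimapCell<T, Boolean> cell;

  SetStateView(MultimapCell<T, Boolean> cell) {
    this.cell = cell;
  }

  void add(T element) {
    cell.remove(element);
    cell.append(element, Boolean.TRUE);
  }

  boolean contains(T element) {
    return !cell.get(element).isEmpty();
  }

  void remove(T element) {
    cell.remove(element);
  }

  List<T> elements() {
    return cell.keys();
  }
}
```

Under this layering, a runner that can serve the multimap operations (list
keys, get, append, remove per map key) does not need separate handling for
map and set state.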


A brief introduction of myself:

This is Alan from LinkedIn. We are building a new managed platform powered by 
the Samza runner and the Beam portability framework, and we want all LinkedIn 
Beam use cases to eventually benefit from this new portable architecture.
But there are a few feature gaps between the classic Samza runner and the 
portable Samza runner, and user state support is one of them. The classic Samza 
runner supports 5 major user state types: ValueState, BagState, CombiningState, 
MapState and SetState, while the existing portable Samza runner only supports 
ValueState, BagState and CombiningState. I’m trying to address this state feature 
gap now.



--
Best,
Alan Zhang


Re: Subscribe

2023-01-24 Thread Alan Zhang via dev
Thanks, Valentyn! I realized that I had made the mistake right after I sent the 
email to this address. Now that I have corrected the addresses, I am able to 
receive the latest emails from both the user@ and dev@ mailing lists.

Best,
Alan Zhang

From: Valentyn Tymofieiev 
Date: Tuesday, January 24, 2023 at 5:23 PM
To: dev@beam.apache.org , Alan Zhang 

Subject: Re: Subscribe
Hello Alan,

To subscribe to the list, you should send an email to 
dev-subscr...@beam.apache.org instead.

Best,
Valentyn

On Tue, Jan 24, 2023 at 5:19 PM Alan Zhang via dev 
<dev@beam.apache.org> wrote:


Subscribe

2023-01-24 Thread Alan Zhang via dev