Re: Re: Is it possible to use OperatorState, when NOT implementing a source or sink function?

2021-06-06 Thread Yun Gao
Hi Marco, It seems to me that the imbalance problem and the state are independent issues here: the data distribution is decided only by the KeySelector used. The only limitation for state is that keyed state is bound to the KeySelector used across the tasks. If the imbalance is the root p
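
To illustrate the point that the key alone decides where a record goes, here is a minimal plain-Java sketch with no Flink dependency. It is a simplification under stated assumptions: Flink itself hashes each key into a key group and then maps key groups to parallel subtasks, whereas this sketch uses a plain `hashCode()` modulus as a stand-in. The class name `KeySkewSketch` and the method `countPerSubtask` are hypothetical, made up for this example.

```java
import java.util.Arrays;
import java.util.List;

// Simplified model of keyed partitioning: every record with the same key
// lands on the same subtask, so few distinct keys means few busy subtasks.
// NOTE: the modulus below is an assumption for illustration only; it is
// not the hashing scheme Flink actually uses.
final class KeySkewSketch {

    // Counts how many records each of the `parallelism` subtasks would receive.
    static int[] countPerSubtask(List<String> keys, int parallelism) {
        int[] counts = new int[parallelism];
        for (String key : keys) {
            int subtask = Math.floorMod(key.hashCode(), parallelism);
            counts[subtask]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // One distinct key: all four records go to a single subtask.
        System.out.println(Arrays.toString(
                countPerSubtask(List.of("a", "a", "a", "a"), 4)));
        // Four distinct keys that happen to hash apart: one record per subtask.
        System.out.println(Arrays.toString(
                countPerSubtask(List.of("a", "b", "c", "d"), 4)));
    }
}
```

In this model, changing what is stored in state does nothing to the counts; only changing the key does.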

Re: Is it possible to use OperatorState, when NOT implementing a source or sink function?

2021-06-05 Thread Marco Villalobos
Oh, that won't work for me either. I needed to use MapState. Perhaps I should describe my problem. I am using a KeyedState process function, but the workload that it is processing is not distributing well across the cluster. I have four task managers, but the way my data is keyed in this opera

Re: Re: Is it possible to use OperatorState, when NOT implementing a source or sink function?

2021-06-05 Thread Yun Gao
Hi Marco, I think yes, the operator state could be used in batch mode. Since there is no checkpoint in batch mode, the operator state would serve as a kind of ordinary in-memory storage. Best, Yun -- Sender: Marco Villalobos Date: 20

Re: Is it possible to use OperatorState, when NOT implementing a source or sink function?

2021-06-05 Thread Marco Villalobos
Does that work in the DataStream API in Batch Execution Mode? On Sat, Jun 5, 2021 at 12:04 AM JING ZHANG wrote: > Hi, > please use `CheckpointedFunction`, you could initialize your operator > state in `initializeState` method by using > context.getOperatorStateStore().*** > > Best regards, > JIN

Re: Is it possible to use OperatorState, when NOT implementing a source or sink function?

2021-06-05 Thread JING ZHANG
Hi, please use `CheckpointedFunction`; you can initialize your operator state in the `initializeState` method by using context.getOperatorStateStore().*** Best regards, JING ZHANG On Sat, Jun 5, 2021 at 1:55 PM, Marco Villalobos wrote: > Is it possible to use OperatorState, when NOT implementing a source or > s