Thanks Arvid,
I like your suggestions. In my case I wanted to use the state value to
decide whether I should make the async call to the external system. The result
of this call would then be a state input. So I would have this:
Process1 (calculate the value or take it from state) -> async call to the
external system to persist/validate the value -> Process2 (feedback loop via
a messaging queue to Process1).
Apart from the fact that Process1 would have to consume two streams, which is
OK, I would actually introduce a delay. I wanted to avoid unnecessary calls to
the external system by keeping the cached/validated value in state.
And this could be done without the delay if I could use state in async
operators.
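To make the intent concrete, here is a minimal sketch in plain Java of the "check the cached value first, call out only on a miss" logic. It is not Flink code: the map is only a stand-in for keyed state, and CachedValidation, validateExternally and the ":validated" suffix are hypothetical names I made up for illustration.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class CachedValidation {
    // Stand-in for keyed state (assumption: a plain map, not Flink MapState).
    private final Map<String, String> validated = new ConcurrentHashMap<>();
    // Counts external calls so the saving is observable.
    final AtomicInteger externalCalls = new AtomicInteger();

    // Hypothetical external system that validates/persists a value asynchronously.
    private CompletableFuture<String> validateExternally(String key, String value) {
        externalCalls.incrementAndGet();
        return CompletableFuture.supplyAsync(() -> value + ":validated");
    }

    // Return the cached value when present, otherwise call out and cache the result.
    CompletableFuture<String> resolve(String key, String value) {
        String cached = validated.get(key);
        if (cached != null) {
            return CompletableFuture.completedFuture(cached);
        }
        return validateExternally(key, value).thenApply(result -> {
            validated.put(key, result);
            return result;
        });
    }
}
```

A second resolve for the same key returns the cached value without touching the external system, which is the behaviour I would like to get inside an async operator.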
I'm thinking about building my own semi-async operator. My idea is that I
would have a normal KeyedProcessFunction that wraps a list of
SingleThreadExecutors.
In the processElement method I will use the key to calculate an index into
that list, to make sure that messages for the same key go to the same
executor. I do want to keep the message order.
I will submit a task like this:
    executor.submit(() -> {
        MyResult result = rservice.process(message, mapState.get(key));
        mapState.put(key, result);
        out.collect(newMessage);
    });
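The key-to-executor routing I have in mind can be sketched in plain Java as follows. This is only an illustration of the ordering idea, with no Flink classes involved; KeyedExecutorPool, indexFor and shutdownAndWait are hypothetical names, and the sketch says nothing about whether state or the collector may be touched from these threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class KeyedExecutorPool {
    private final List<ExecutorService> executors = new ArrayList<>();

    KeyedExecutorPool(int size) {
        for (int i = 0; i < size; i++) {
            executors.add(Executors.newSingleThreadExecutor());
        }
    }

    // The same key always maps to the same index; floorMod keeps it non-negative.
    int indexFor(Object key) {
        return Math.floorMod(key.hashCode(), executors.size());
    }

    // Tasks for one key run on a single thread, in submission order.
    Future<?> submit(Object key, Runnable task) {
        return executors.get(indexFor(key)).submit(task);
    }

    // Stop accepting tasks and wait (up to 10 seconds each) for queued ones to finish.
    void shutdownAndWait() {
        for (ExecutorService e : executors) {
            e.shutdown();
        }
        try {
            for (ExecutorService e : executors) {
                e.awaitTermination(10, TimeUnit.SECONDS);
            }
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Because each executor has exactly one thread and a FIFO queue, two messages with the same key are processed in the order they were submitted, while different keys may run in parallel on different executors.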
Big questions:
1. In my solution out.collect(newMessage) will be called from several threads
(each with a different message). Is it thread-safe?
2. Is using MapState in a multi-threaded environment, as I would here,
thread-safe?
Alternatively I can keep a list of MapStates, one for each
SingleThreadExecutor, so that each one is used by only one thread.
With this setup I will not block my pipeline and I will be able to use
state. I agree that the size of the SingleThreadExecutors list will be a
limiting factor.
Is this setup possible with Flink?
By the way, I will use RocksDBStateBackend.
--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/