Kostas Kloudas created FLINK-7771:
-------------------------------------
Summary: Make the operator state queryable
Key: FLINK-7771
URL: https://issues.apache.org/jira/browse/FLINK-7771
Project: Flink
Issue Type: Improvement
Components: Queryable State
Affects Versions: 1.4.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas
Fix For: 1.4.0
There seem to be some requests for making the operator (non-keyed) state
queryable. This means that the user will specify the *uuid* of the operator and
the *taskId*, and they will be able to access the state that corresponds to that
operator and that specific task.
This issue will serve to document the discussion on the topic, so that
everybody can participate.
Personally, I think that such a feature should wait until some things on state
handling are stabilized (_e.g._ replication and checkpoint management). My main
concerns have to do with the semantics and guarantees that such a feature could
offer *for now*.
First, operator state is essentially a list state that can be reshuffled
arbitrarily upon restoring or rescaling. This means that, at a given execution
attempt, task1 may hold elements _A,B,C_, while after restoring (even without
rescaling) it may hold _D,B,E_, without this implying that something happened to
states _A_ and _C_. They were simply assigned to another task. This makes it
hard to reason about the results that you get at any point in time, as it
provides *no locality/consistency guarantees between executions*.
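
To make the reshuffling concrete, here is a minimal toy model in plain Java. It is not Flink code: the class name {{OperatorStateReshuffle}} and the {{redistribute}} helper are hypothetical, and dealing elements out to tasks in turn is only an assumed stand-in for whatever assignment happens on restore. It shows the same six elements producing different per-task contents across two attempts while the overall set stays intact.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OperatorStateReshuffle {

    // Hypothetical toy redistribution: deal the elements out to the tasks in turn.
    // The order in which elements arrive decides which task receives which element.
    static List<List<String>> redistribute(List<String> elements, int parallelism) {
        List<List<String>> perTask = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            perTask.add(new ArrayList<>());
        }
        for (int i = 0; i < elements.size(); i++) {
            perTask.get(i % parallelism).add(elements.get(i));
        }
        return perTask;
    }

    public static void main(String[] args) {
        // Same six elements, same parallelism, but collected in a different order
        // in each execution attempt (assumed orders, chosen for illustration).
        List<List<String>> attempt1 =
                redistribute(Arrays.asList("A", "D", "B", "E", "C", "F"), 2);
        List<List<String>> attempt2 =
                redistribute(Arrays.asList("D", "A", "B", "C", "E", "F"), 2);

        System.out.println("attempt 1, first task: " + attempt1.get(0)); // [A, B, C]
        System.out.println("attempt 2, first task: " + attempt2.get(0)); // [D, B, E]
        // A and C are not lost; they now live on the second task.
        System.out.println("attempt 2, second task: " + attempt2.get(1)); // [A, C, F]
    }
}
```

A query against the first task alone cannot tell "_A_ was removed" apart from "_A_ moved to another task", which is the locality problem described above.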
The above, in combination with the fact that (for now) it is not possible to
query the state at a specific point in time (_e.g._ the last checkpointed
state), means that there is no easy way to get a consistent view of the state
of an operator. So in the example above, when querying _(operatorA, task1)_ and
_(operatorA, task2)_, the user can get states belonging to different "points in
time", which can result in duplicates, lost values, and all the problems
encountered in distributed systems when there are no consistency guarantees.
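
The duplicate and lost-value cases can be sketched the same way. Again this is a toy model rather than Flink code: the class {{InconsistentQueryView}}, the {{combine}} helper, and the state contents at the two points in time are all assumptions made for illustration. Element _B_ moves from task1 to task2 between the two points in time, and the client queries each task independently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class InconsistentQueryView {

    // Concatenates the answers of two independent, uncoordinated queries,
    // as a client would when assembling a view of the whole operator.
    static List<String> combine(List<String> fromTask1, List<String> fromTask2) {
        List<String> view = new ArrayList<>(fromTask1);
        view.addAll(fromTask2);
        return view;
    }

    public static void main(String[] args) {
        // Hypothetical contents of (operatorA, task1) and (operatorA, task2)
        // at two points in time; B moves from task1 to task2 in between.
        List<String> task1AtT0 = Arrays.asList("A", "B");
        List<String> task2AtT0 = Arrays.asList("C");
        List<String> task1AtT1 = Arrays.asList("A");
        List<String> task2AtT1 = Arrays.asList("B", "C");

        // task1 answered before the move, task2 after it: B is seen twice.
        System.out.println(combine(task1AtT0, task2AtT1)); // [A, B, B, C]
        // task1 answered after the move, task2 before it: B is not seen at all.
        System.out.println(combine(task1AtT1, task2AtT0)); // [A, C]
    }
}
```

Neither combined view matches the actual state at either point in time, which is why being able to query a specific point in time (_e.g._ the last checkpoint) matters here.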
The above illustrates some of the consistency problems that such a feature
could face now.
I am also linking [~till.rohrmann], and [~skonto], who also mentioned that this
feature could be helpful.