When a container fails, it is redeployed along with all the operators in
it. The operators downstream of these operators are also redeployed within
their containers. The operators are restored from their checkpoint and
connect to the appropriate point in the stream according to the processing
mode. In at-least-once mode, for example, the data is replayed from the
same checkpoint.

Restoring operator state from a checkpoint can be a costly operation,
depending on the size of the state. In some use cases, depending on the
operator logic, an operator whose upstream has failed will still produce
the same results if its state is left as is, rather than restored from the
checkpoint, and the data is replayed from the window after its last fully
processed window. This is true of some operators in batch use cases. For
such an operator, the state can stay as it was before the upstream failure:
the engine reuses the same operator instance and resets only the streams
and the window, to the window after the last fully processed window, which
preserves at-least-once processing of tuples. If the container in which the
operator itself is running goes down, the operator would of course still
need to be restored from its checkpoint.
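
To make the two recovery paths concrete, here is a rough sketch of the
decision described above. It is not engine code: RecoveryPlanner, Plan and
Action are hypothetical names, and window ids are treated as plain
consecutive longs, which is a simplifying assumption for illustration only.

```java
// Hypothetical sketch; none of these types exist in the engine.
final class RecoveryPlanner {
    enum Action { RESTORE_FROM_CHECKPOINT, REUSE_INSTANCE }

    static final class Plan {
        final Action action;
        final long resumeWindowId;

        Plan(Action action, long resumeWindowId) {
            this.action = action;
            this.resumeWindowId = resumeWindowId;
        }
    }

    /**
     * Decides how an operator recovers after a failure.
     *
     * Assumption: window ids are consecutive longs, so "the window after X"
     * is simply X + 1.
     */
    static Plan plan(boolean ownContainerFailed,
                     boolean reusableOnUpstreamFailure,
                     long checkpointWindowId,
                     long lastFullyProcessedWindowId) {
        if (ownContainerFailed || !reusableOnUpstreamFailure) {
            // In-memory state is gone, or the operator did not opt in:
            // fall back to today's behavior and restore from checkpoint.
            return new Plan(Action.RESTORE_FROM_CHECKPOINT, checkpointWindowId + 1);
        }
        // Upstream-only failure for an opted-in operator: keep the live
        // instance and resubscribe after its last fully processed window.
        return new Plan(Action.REUSE_INSTANCE, lastFullyProcessedWindowId + 1);
    }
}
```

For example, an opted-in operator with a checkpoint at window 10 that has
fully processed window 17 would keep its instance and resume at window 18
after an upstream failure, but would go back to its checkpoint if its own
container failed.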

I would like to propose two additions: the ability for a user to
explicitly identify operators as being of this type, and the corresponding
functionality in the engine to recover them as described above. That is,
the engine would not restore their state from the checkpoint; it would
reuse the instance and restore the stream to the window after the
operator's last fully processed window. For operators that are not
identified as being of this type, the default behavior stays as it is today
and nothing changes.
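
One possible shape for the user-facing side, purely as a sketch: the
proposal does not fix a mechanism, so the annotation below
(ReuseOnUpstreamFailure) and the helper class are hypothetical and stand in
for whatever annotation or attribute we end up choosing. The point is only
that the marker is opt-in and that unmarked operators keep the default.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical marker; not an existing engine annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface ReuseOnUpstreamFailure {}

// An operator whose results do not depend on rolling state back.
@ReuseOnUpstreamFailure
class BatchAggregator {}

// An unmarked operator: recovery stays exactly as it is today.
class PlainOperator {}

final class RecoveryPolicy {
    // True only for operators that explicitly opted in via the marker.
    static boolean reusesInstance(Class<?> operatorClass) {
        return operatorClass.isAnnotationPresent(ReuseOnUpstreamFailure.class);
    }
}
```

With this, RecoveryPolicy.reusesInstance(BatchAggregator.class) is true and
RecoveryPolicy.reusesInstance(PlainOperator.class) is false, so existing
applications are unaffected unless they opt in.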

I have done some prototyping on the engine side to confirm that this is
possible with our current code base without a massive overhaul. The two
main pieces were the restoration of the operator instance within the Node
in the streaming container, and the re-establishment of the subscriber
stream at a window in the buffer server that the publisher (upstream) has
not yet reached, since it would be restarting from its checkpoint. I have
been able to get it all working successfully.

Thanks
