Hello,

I am using Flink as the query engine for running SQL queries on both
batch and streaming data, with the Blink planner in batch mode for the
former and in streaming mode for the latter.

In my current setup, I execute batch queries synchronously via the
StreamTableEnvironment::execute method. The job uses an OutputFormat
inside a StreamTableSink to consume the results and send them to the
user. If an error/exception occurs in the pipeline (possibly due to
user code), it is not reported to the OutputFormat or the sink. If an
error occurs after the write method has been invoked on the
OutputFormat, the implementation may falsely assume that the result is
successful and complete, since close is called in both the success and
failure cases. I can work around this by checking for exceptions thrown
by the execute method, but that adds extra latency due to the job
tear-down cost.
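
For concreteness, here is a minimal sketch of the batch path as I have
it (MyOutputFormatSink and reportFailure are placeholders for my sink
wrapping an OutputFormat and my error-reporting hook, and the query and
source table are illustrative):

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class BatchQuerySketch {
    public static void main(String[] args) {
        EnvironmentSettings settings = EnvironmentSettings.newInstance()
                .useBlinkPlanner()
                .inBatchMode()
                .build();
        TableEnvironment tEnv = TableEnvironment.create(settings);

        // MyOutputFormatSink is a placeholder for my sink that wraps an
        // OutputFormat and forwards result rows to the user; source
        // tables are registered elsewhere.
        tEnv.registerTableSink("user_results", new MyOutputFormatSink());
        tEnv.sqlQuery("SELECT * FROM orders").insertInto("user_results");

        try {
            tEnv.execute("batch-query"); // blocks until the job finishes
        } catch (Exception e) {
            // The failure only surfaces here; the sink's close() has
            // already been called by this point, so re-checking here
            // pays the job tear-down cost.
            reportFailure(e); // placeholder error-reporting hook
        }
    }
}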

A similar problem also exists for streaming jobs. In my setup, streaming
jobs are executed asynchronously via
StreamExecutionEnvironment::executeAsync. Since the sink interface has
no method for receiving pipeline errors, the user code has to
periodically poll for and handle persistent failures itself.
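
Roughly what that client-side workaround looks like (the polling loop,
interval, and reportFailure hook are my own scaffolding, and the
JobStatus import location depends on the Flink version):

import org.apache.flink.api.common.JobStatus;
import org.apache.flink.core.execution.JobClient;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class StreamingStatusSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        // ... streaming pipeline definition elided ...

        JobClient client = env.executeAsync("streaming-query");

        // Poll job status from the client side, since the sink itself
        // never hears about pipeline errors.
        while (true) {
            JobStatus status = client.getJobStatus().get();
            if (status == JobStatus.FAILED) {
                reportFailure(); // placeholder error-reporting hook
                break;
            }
            if (status.isGloballyTerminalState()) {
                break; // finished or cancelled
            }
            Thread.sleep(5_000L); // polling interval is arbitrary
        }
    }
}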

Have I missed something in the API? Or is there some other way to get
access to the error status in user code?

Regards,
Satyam
