What Arvid said is correct.
The only thing I have to add is that "stop" allows also exactly-once sinks
to push out their buffered data to their final destination (e.g.
Filesystem). In other words, it takes into account side-effects, so it
guarantees exactly-once end-to-end, assuming that you are
using exactly-once sources and sinks. Cancel with savepoint on the other
hand did not necessarily and committing side-effects is was following a
"best-effort" approach.

For more information you can check [1].

Cheers,
Kostas

[1]
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=103090212

On Mon, Jun 8, 2020 at 10:23 AM Arvid Heise <ar...@ververica.com> wrote:

> It was before I joined the dev team, so the following are kind of
> speculative:
>
> The concept of stoppable functions never really took off as it was a bit
> of a clumsy approach. There is no fundamental difference between stopping
> and cancelling on (sub)task level. Indeed if you look in the twitter source
> of 1.6 [1], cancel() and stop() are doing the exact same thing. I'd assume
> that this is probably true for all sources.
>
> So what is the difference between cancel and stop then? It's more the way
> on how you terminate the whole DAG. On cancelling, you cancel() on all
> tasks more or less simultaneously. If you want to stop, it's more a
> fine-grain cancel, where you stop first the sources and then let the tasks
> close themselves when all upstream tasks are done. Just before closing the
> tasks, you also take a snapshot. Thus, the difference should not be visible
> in user code but only in the Flink code itself (task/checkpoint coordinator)
>
> So for your question:
> 1. No, as on task level stop() and cancel() are the same thing on UDF
> level.
> 2. Yes, stop will be more graceful and creates a snapshot. [2]
> 3. Not that I am aware of. In the whole flink code base, there are no more
> (see javadoc). You could of course check if there are some in Bahir. But it
> shouldn't really matter. There is no huge difference between stopping and
> cancelling if you wait for a checkpoint to finish.
> 4. Okay you answered your second question ;) Yes cancel with savepoint =
> stop now to make it easier for new users.
>
> [1]
> https://github.com/apache/flink/blob/release-1.6/flink-connectors/flink-connector-twitter/src/main/java/org/apache/flink/streaming/connectors/twitter/TwitterSource.java#L180-L190
> [2]
> https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/cli.html
>
> On Sun, Jun 7, 2020 at 1:04 AM M Singh <mans2si...@yahoo.com> wrote:
>
>>
>> Hi Arvid:
>>
>> Thanks for the links.
>>
>> A few questions:
>>
>> 1. Is there any particular interface in 1.9+ that identifies the source
>> as stoppable ?
>> 2. Is there any distinction b/w stop and cancel  in 1.9+ ?
>> 3. Is there any list of sources which are documented as stoppable besides
>> the one listed in your SO link ?
>> 4. In 1.9+ there is flink stop command and a flink cancel command. (
>> https://ci.apache.org/projects/flink/flink-docs-stable/ops/cli.html#stop).
>> So it appears that flink stop will take a savepoint and the call cancel,
>> and cancel will just cancel the job (looks like cancel with savepoint is
>> deprecated in 1.10).
>>
>> Thanks again for your help.
>>
>>
>>
>> On Saturday, June 6, 2020, 02:18:57 PM EDT, Arvid Heise <
>> ar...@ververica.com> wrote:
>>
>>
>> Yes, it seems as if FlinkKinesisConsumer does not implement it.
>>
>> Here are the links to the respective javadoc [1] and code [2]. Note that
>> in later releases (1.9+) this interface has been removed. Stop is now
>> implemented through a cancel() on source level.
>>
>> In general, I don't think that in a Kinesis to Kinesis use case, stop is
>> needed anyways, since there is no additional consistency expected over a
>> normal cancel.
>>
>> [1]
>> https://ci.apache.org/projects/flink/flink-docs-release-1.6/api/java/org/apache/flink/api/common/functions/StoppableFunction.html
>> [2]
>> https://github.com/apache/flink/blob/release-1.6/flink-core/src/main/java/org/apache/flink/api/common/functions/StoppableFunction.java
>>
>> On Sat, Jun 6, 2020 at 8:03 PM M Singh <mans2si...@yahoo.com> wrote:
>>
>> Hi Arvid:
>>
>> I check the link and it indicates that only Storm SpoutSource,
>> TwitterSource and NifiSource support stop.
>>
>> Does this mean that FlinkKinesisConsumer is not stoppable ?
>>
>> Also, can you please point me to the Stoppable interface mentioned in the
>> link ?  I found the following but am not sure if TwitterSource implements
>> it :
>>
>> https://github.com/apache/flink/blob/8674b69964eae50cad024f2c5caf92a71bf21a09/flink-runtime/src/main/java/org/apache/flink/runtime/rpc/StartStoppable.java
>>
>> Thanks
>>
>>
>>
>>
>>
>> On Friday, June 5, 2020, 02:48:49 PM EDT, Arvid Heise <
>> ar...@ververica.com> wrote:
>>
>>
>> Hi,
>>
>> could you check if this SO thread [1] helps you already?
>>
>> [1]
>> https://stackoverflow.com/questions/53735318/flink-how-to-solve-error-this-job-is-not-stoppable
>>
>> On Thu, Jun 4, 2020 at 7:43 PM M Singh <mans2si...@yahoo.com> wrote:
>>
>> Hi:
>>
>> I am running a job which consumes data from Kinesis and send data to
>> another Kinesis queue.  I am using an older version of Flink (1.6), and
>> when I try to stop the job I get an exception
>>
>> Caused by: java.util.concurrent.ExecutionException:
>> org.apache.flink.runtime.rest.util.RestClientException: [Job termination
>> (STOP) failed: This job is not stoppable.]
>>
>>
>> I wanted to find out what is a stoppable job and it possible to make a
>> job stoppable if is reading/writing to kinesis ?
>>
>> Thanks
>>
>>
>>
>>
>> --
>>
>> Arvid Heise | Senior Java Developer
>>
>> <https://www.ververica.com/>
>>
>> Follow us @VervericaData
>>
>> --
>>
>> Join Flink Forward <https://flink-forward.org/> - The Apache Flink
>> Conference
>>
>> Stream Processing | Event Driven | Real Time
>>
>> --
>>
>> Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
>>
>> --
>> Ververica GmbH
>> Registered at Amtsgericht Charlottenburg: HRB 158244 B
>> Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji
>> (Toni) Cheng
>>
>>
>>
>> --
>>
>> Arvid Heise | Senior Java Developer
>>
>> <https://www.ververica.com/>
>>
>> Follow us @VervericaData
>>
>> --
>>
>> Join Flink Forward <https://flink-forward.org/> - The Apache Flink
>> Conference
>>
>> Stream Processing | Event Driven | Real Time
>>
>> --
>>
>> Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
>>
>> --
>> Ververica GmbH
>> Registered at Amtsgericht Charlottenburg: HRB 158244 B
>> Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji
>> (Toni) Cheng
>>
>
>
> --
>
> Arvid Heise | Senior Java Developer
>
> <https://www.ververica.com/>
>
> Follow us @VervericaData
>
> --
>
> Join Flink Forward <https://flink-forward.org/> - The Apache Flink
> Conference
>
> Stream Processing | Event Driven | Real Time
>
> --
>
> Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
>
> --
> Ververica GmbH
> Registered at Amtsgericht Charlottenburg: HRB 158244 B
> Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji
> (Toni) Cheng
>

Reply via email to