[jira] [Commented] (FLINK-4809) Operators should tolerate checkpoint failures
[ https://issues.apache.org/jira/browse/FLINK-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16242916#comment-16242916 ]

Jing Fan commented on FLINK-4809:
---------------------------------

Do we have any update on the PR? It has been hanging for weeks.

> Operators should tolerate checkpoint failures
> ---------------------------------------------
>
>                 Key: FLINK-4809
>                 URL: https://issues.apache.org/jira/browse/FLINK-4809
>             Project: Flink
>          Issue Type: Sub-task
>          Components: State Backends, Checkpointing
>            Reporter: Stephan Ewen
>            Assignee: Stefan Richter
>             Fix For: 1.4.0
>
>
> Operators should try/catch exceptions in the synchronous and asynchronous
> part of the checkpoint and send a {{DeclineCheckpoint}} message as a result.
> The decline message should have the failure cause attached to it.
> The checkpoint barrier should be sent anyways as a first step before
> attempting to make a state checkpoint, to make sure that downstream operators
> do not block in alignment.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
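The ordering described in the FLINK-4809 ticket (forward the barrier first, then attempt the snapshot, and decline with the cause attached if it throws) can be illustrated with a minimal Java sketch. This is not Flink code: `Snapshotter`, `CheckpointingStep`, and the event strings are hypothetical names invented for the illustration, and the sketch covers a single snapshot call rather than separate synchronous and asynchronous parts.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the state snapshot logic; not a Flink interface. */
interface Snapshotter {
    void snapshot(long checkpointId) throws Exception;
}

/** Hypothetical operator-side checkpoint step; not a Flink class. */
final class CheckpointingStep {
    /** Ordered record of what the step did, so the ordering can be inspected. */
    final List<String> events = new ArrayList<>();

    /** Cause attached to the most recent decline, or null if none happened. */
    Throwable declineCause;

    void performCheckpoint(long checkpointId, Snapshotter snapshotter) {
        // Forward the barrier before touching state, so that downstream
        // operators are not left blocked in alignment if the snapshot fails.
        events.add("barrier:" + checkpointId);
        try {
            snapshotter.snapshot(checkpointId);
            events.add("ack:" + checkpointId);
        } catch (Exception e) {
            // Tolerate the failure: decline the checkpoint and keep the cause
            // instead of letting the exception fail the operator.
            declineCause = e;
            events.add("decline:" + checkpointId);
        }
    }
}
```

In this sketch a throwing `Snapshotter` produces a `barrier` event followed by a `decline` event, and the operator itself does not fail.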
[jira] [Comment Edited] (FLINK-4809) Operators should tolerate checkpoint failures
[ https://issues.apache.org/jira/browse/FLINK-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16242916#comment-16242916 ]

Jing Fan edited comment on FLINK-4809 at 11/7/17 9:26 PM:
----------------------------------------------------------

Do we have any update on the PR? It has been hanging for weeks.


was (Author: pangzhi):
Do we have any update on the PR? It has been handing for weeks.
[jira] [Commented] (FLINK-4808) Allow skipping failed checkpoints
[ https://issues.apache.org/jira/browse/FLINK-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217699#comment-16217699 ]

Jing Fan commented on FLINK-4808:
---------------------------------

[~ram_krish] [~StephanEwen] [~till.rohrmann] Do we have any follow-up on this problem, or a possible ETA? Being unable to skip failed checkpoints is blocking us from migrating critical jobs onto the Flink platform. We can also contribute to solving this problem if needed.

> Allow skipping failed checkpoints
> ---------------------------------
>
>                 Key: FLINK-4808
>                 URL: https://issues.apache.org/jira/browse/FLINK-4808
>             Project: Flink
>          Issue Type: New Feature
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.1.2, 1.1.3
>            Reporter: Stephan Ewen
>             Fix For: 1.4.0
>
>
> Currently, if Flink cannot complete a checkpoint, it results in a failure and
> recovery.
> To make the impact of less stable storage infrastructure on the performance
> of Flink less severe, Flink should be able to tolerate a certain number of
> failed checkpoints and simply keep executing.
> This should be controllable via a parameter, for example:
> {code}
> env.getCheckpointConfig().setAllowedFailedCheckpoints(3);
> {code}
> A value of {{-1}} could indicate an infinite number of checkpoint failures
> tolerated by Flink.
> The default value should still be {{0}}, to keep compatibility with the
> existing behavior.
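The semantics proposed in the FLINK-4808 ticket ({{0}} keeps the existing fail-on-first-failure behavior, a positive value tolerates that many failures, {{-1}} tolerates any number) can be sketched as a small counter. This is only an illustration under stated assumptions: `FailedCheckpointTracker` is a hypothetical class, not part of Flink, and it counts total failures without resetting after a successful checkpoint, a detail the ticket leaves open.

```java
/** Hypothetical helper, not part of Flink: counts failed checkpoints against a limit. */
final class FailedCheckpointTracker {
    /** Value the ticket proposes for "tolerate an infinite number of failures". */
    static final int UNLIMITED = -1;

    private final int allowedFailedCheckpoints;
    private long failedCheckpoints;

    FailedCheckpointTracker(int allowedFailedCheckpoints) {
        if (allowedFailedCheckpoints < UNLIMITED) {
            throw new IllegalArgumentException(
                    "allowedFailedCheckpoints must be -1 (unlimited) or >= 0");
        }
        this.allowedFailedCheckpoints = allowedFailedCheckpoints;
    }

    /**
     * Records one failed checkpoint.
     *
     * @return true if the job may keep executing, false if the limit is exceeded
     *         and the job should go through failure and recovery.
     */
    boolean onCheckpointFailed() {
        failedCheckpoints++;
        return allowedFailedCheckpoints == UNLIMITED
                || failedCheckpoints <= allowedFailedCheckpoints;
    }

    /** Number of failed checkpoints recorded so far. */
    long getFailedCheckpoints() {
        return failedCheckpoints;
    }
}
```

With a limit of 3, the first three calls to `onCheckpointFailed()` return true and the fourth returns false. With the default of 0, the first failure already returns false, which matches the existing behavior described in the ticket.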
[jira] [Commented] (FLINK-4808) Allow skipping failed checkpoints
[ https://issues.apache.org/jira/browse/FLINK-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16210019#comment-16210019 ]

Jing Fan commented on FLINK-4808:
---------------------------------

Any update on this PR and ticket? I think this is an important feature. In industry, state checkpointing is used frequently, and being unable to skip failed checkpoints will block bringing Flink into production environments.
[jira] [Commented] (FLINK-4808) Allow skipping failed checkpoints
[ https://issues.apache.org/jira/browse/FLINK-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16199614#comment-16199614 ]

Jing Fan commented on FLINK-4808:
---------------------------------

[~StephanEwen] Do we have any update on this JIRA issue?

> Allow skipping failed checkpoints
> ---------------------------------
>
>                 Key: FLINK-4808
>                 URL: https://issues.apache.org/jira/browse/FLINK-4808
>             Project: Flink
>          Issue Type: New Feature
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.1.2, 1.1.3
>            Reporter: Stephan Ewen
>
> Currently, if Flink cannot complete a checkpoint, it results in a failure and
> recovery.
> To make the impact of less stable storage infrastructure on the performance
> of Flink less severe, Flink should be able to tolerate a certain number of
> failed checkpoints and simply keep executing.
> This should be controllable via a parameter, for example:
> {code}
> env.getCheckpointConfig().setAllowedFailedCheckpoints(3);
> {code}
> A value of {{-1}} could indicate an infinite number of checkpoint failures
> tolerated by Flink.
> The default value should still be {{0}}, to keep compatibility with the
> existing behavior.
[jira] [Created] (FLINK-6225) Support Row Stream for CassandraSink
Jing Fan created FLINK-6225:
----------------------------

             Summary: Support Row Stream for CassandraSink
                 Key: FLINK-6225
                 URL: https://issues.apache.org/jira/browse/FLINK-6225
             Project: Flink
          Issue Type: New Feature
          Components: Cassandra Connector
    Affects Versions: 1.3.0
            Reporter: Jing Fan
             Fix For: 1.3.0


Currently, CassandraSink does not support specifying a query for a row stream. The solution should be similar to CassandraTupleSink.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)