[jira] [Commented] (FLINK-4809) Operators should tolerate checkpoint failures

2017-11-07 Thread Jing Fan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16242916#comment-16242916
 ] 

Jing Fan commented on FLINK-4809:
-

Do we have any update on the PR? It has been hanging for weeks.

> Operators should tolerate checkpoint failures
> -
>
> Key: FLINK-4809
> URL: https://issues.apache.org/jira/browse/FLINK-4809
> Project: Flink
>  Issue Type: Sub-task
>  Components: State Backends, Checkpointing
>Reporter: Stephan Ewen
>Assignee: Stefan Richter
> Fix For: 1.4.0
>
>
> Operators should try/catch exceptions in the synchronous and asynchronous 
> parts of the checkpoint and send a {{DeclineCheckpoint}} message as a result.
> The decline message should have the failure cause attached to it.
> The checkpoint barrier should be sent anyway as a first step before 
> attempting to make a state checkpoint, to make sure that downstream operators 
> do not block in alignment.
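
The behavior described above could be sketched roughly as follows. This is a
minimal, self-contained illustration and not Flink's implementation: the
class CheckpointDeclineSketch, the StateSnapshot interface, the
performCheckpoint method, and the local DeclineCheckpoint class are all
hypothetical stand-ins. It covers only the synchronous part; the
asynchronous part would presumably need the same try/catch in its own thread.

```java
import java.util.ArrayList;
import java.util.List;

public class CheckpointDeclineSketch {

    /** Hypothetical stand-in for the decline message; it carries the failure cause. */
    static final class DeclineCheckpoint {
        final long checkpointId;
        final Throwable cause;

        DeclineCheckpoint(long checkpointId, Throwable cause) {
            this.checkpointId = checkpointId;
            this.cause = cause;
        }
    }

    /** Hypothetical hook for the state snapshot, which may fail. */
    interface StateSnapshot {
        void run(long checkpointId) throws Exception;
    }

    /** Ordered record of what the operator did, for illustration only. */
    final List<String> log = new ArrayList<>();

    /** Decline messages that would be sent to the checkpoint coordinator. */
    final List<DeclineCheckpoint> declined = new ArrayList<>();

    /**
     * Forwards the barrier first, then attempts the snapshot. A failure is
     * turned into a decline message instead of propagating to the caller.
     * Returns true if the snapshot succeeded.
     */
    boolean performCheckpoint(long checkpointId, StateSnapshot snapshot) {
        // Step 1: emit the barrier so downstream operators do not block in alignment.
        log.add("barrier:" + checkpointId);
        try {
            // Step 2: attempt the state snapshot.
            snapshot.run(checkpointId);
            log.add("ack:" + checkpointId);
            return true;
        } catch (Exception e) {
            // Step 3: decline the checkpoint and attach the cause.
            declined.add(new DeclineCheckpoint(checkpointId, e));
            log.add("decline:" + checkpointId);
            return false;
        }
    }

    public static void main(String[] args) {
        CheckpointDeclineSketch op = new CheckpointDeclineSketch();
        op.performCheckpoint(1L, id -> { });
        op.performCheckpoint(2L, id -> {
            throw new java.io.IOException("storage unavailable");
        });
        System.out.println(op.log);
    }
}
```

Note that the barrier is recorded before the snapshot is attempted in both the
successful and the failing case, which is the ordering the description asks for.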



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (FLINK-4809) Operators should tolerate checkpoint failures

2017-11-07 Thread Jing Fan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16242916#comment-16242916
 ] 

Jing Fan edited comment on FLINK-4809 at 11/7/17 9:26 PM:
--

Do we have any update on the PR? It has been hanging for weeks.


was (Author: pangzhi):
Do we have any update on the PR? It has been handing for weeks.



[jira] [Commented] (FLINK-4808) Allow skipping failed checkpoints

2017-10-24 Thread Jing Fan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217699#comment-16217699
 ] 

Jing Fan commented on FLINK-4808:
-

[~ram_krish] [~StephanEwen] [~till.rohrmann] Is there any follow-up on this 
issue, or a possible ETA? Being unable to skip failed checkpoints is blocking 
us from migrating critical jobs onto the Flink platform. We can also 
contribute to solving this problem if needed.

> Allow skipping failed checkpoints
> -
>
> Key: FLINK-4808
> URL: https://issues.apache.org/jira/browse/FLINK-4808
> Project: Flink
>  Issue Type: New Feature
>  Components: State Backends, Checkpointing
>Affects Versions: 1.1.2, 1.1.3
>Reporter: Stephan Ewen
> Fix For: 1.4.0
>
>
> Currently, if Flink cannot complete a checkpoint, it results in a failure and 
> recovery.
> To make the impact of less stable storage infrastructure on the performance 
> of Flink less severe, Flink should be able to tolerate a certain number of 
> failed checkpoints and simply keep executing.
> This should be controllable via a parameter, for example:
> {code}
> env.getCheckpointConfig().setAllowedFailedCheckpoints(3);
> {code}
> A value of {{-1}} could indicate an infinite number of checkpoint failures 
> tolerated by Flink.
> The default value should still be {{0}}, to keep compatibility with the 
> existing behavior.
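
The proposed parameter semantics could be sketched as below. This is only an
illustration under stated assumptions: the class CheckpointFailureTolerance
and its methods are hypothetical, and the sketch assumes that the limit
applies to consecutive failures and that a completed checkpoint resets the
count. The ticket itself does not say whether failures are counted
consecutively or in total.

```java
public class CheckpointFailureTolerance {

    /** Value proposed in the ticket for "tolerate any number of failures". */
    static final int UNLIMITED = -1;

    private final int allowedFailedCheckpoints;

    // Assumption: only consecutive failures count toward the limit.
    private int consecutiveFailures = 0;

    CheckpointFailureTolerance(int allowedFailedCheckpoints) {
        if (allowedFailedCheckpoints < UNLIMITED) {
            throw new IllegalArgumentException(
                "allowed failed checkpoints must be -1 (unlimited) or >= 0");
        }
        this.allowedFailedCheckpoints = allowedFailedCheckpoints;
    }

    /**
     * Records a failed checkpoint.
     * Returns true if the job should now fail and recover.
     */
    boolean onCheckpointFailed() {
        if (allowedFailedCheckpoints == UNLIMITED) {
            return false; // never fail the job because of checkpoints
        }
        consecutiveFailures++;
        return consecutiveFailures > allowedFailedCheckpoints;
    }

    /** Records a completed checkpoint and resets the failure count. */
    void onCheckpointCompleted() {
        consecutiveFailures = 0;
    }

    public static void main(String[] args) {
        // Default of 0: the first failure already triggers recovery.
        System.out.println(new CheckpointFailureTolerance(0).onCheckpointFailed());

        // Limit of 3: three failures are tolerated, the fourth is not.
        CheckpointFailureTolerance three = new CheckpointFailureTolerance(3);
        for (int i = 1; i <= 4; i++) {
            System.out.println("failure " + i + " -> fail job: " + three.onCheckpointFailed());
        }
    }
}
```

With a limit of 0 the first failed checkpoint triggers recovery, which matches
the existing behavior that the default is meant to preserve.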





[jira] [Commented] (FLINK-4808) Allow skipping failed checkpoints

2017-10-18 Thread Jing Fan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16210019#comment-16210019
 ] 

Jing Fan commented on FLINK-4808:
-

Any update on this PR and ticket? I think this is an important feature. In 
industry, state checkpointing is used frequently, and being unable to skip 
failed checkpoints will block bringing Flink into production environments.



[jira] [Commented] (FLINK-4808) Allow skipping failed checkpoints

2017-10-10 Thread Jing Fan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16199614#comment-16199614
 ] 

Jing Fan commented on FLINK-4808:
-

[~StephanEwen] Do we have any update on this jira? 



[jira] [Created] (FLINK-6225) Support Row Stream for CassandraSink

2017-03-30 Thread Jing Fan (JIRA)
Jing Fan created FLINK-6225:
---

 Summary: Support Row Stream for CassandraSink
 Key: FLINK-6225
 URL: https://issues.apache.org/jira/browse/FLINK-6225
 Project: Flink
  Issue Type: New Feature
  Components: Cassandra Connector
Affects Versions: 1.3.0
Reporter: Jing Fan
 Fix For: 1.3.0


Currently, CassandraSink does not support specifying a query for a Row 
stream. The solution should be similar to CassandraTupleSink.
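
One way to picture the requested behavior: the fields of each row are
extracted positionally and bound to the placeholders of a user-supplied
query, much as would be done for a tuple. The sketch below is illustrative
only. The Row class is a minimal hypothetical stand-in rather than Flink's
own type, the query string and table name are made up, and no Cassandra
driver is involved.

```java
import java.util.Arrays;

public class RowQueryBindingSketch {

    /** Minimal hypothetical stand-in for a row with positional fields. */
    static final class Row {
        private final Object[] fields;

        Row(Object... fields) {
            this.fields = fields.clone();
        }

        int getArity() {
            return fields.length;
        }

        Object getField(int pos) {
            return fields[pos];
        }
    }

    /** Extracts the row's fields in order, ready to bind to query placeholders. */
    static Object[] extract(Row row) {
        Object[] values = new Object[row.getArity()];
        for (int i = 0; i < values.length; i++) {
            values[i] = row.getField(i);
        }
        return values;
    }

    public static void main(String[] args) {
        // Hypothetical query; each '?' would be bound to one row field.
        String query = "INSERT INTO example.users (name, age) VALUES (?, ?);";
        Object[] values = extract(new Row("alice", 42));
        System.out.println(query + " <- " + Arrays.toString(values));
    }
}
```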



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)