[jira] [Created] (FLINK-13856) Reduce the delete file api when the checkpoint is completed

2019-08-26 Thread andrew.D.lin (Jira)
andrew.D.lin created FLINK-13856:


 Summary: Reduce the delete file api when the checkpoint is 
completed
 Key: FLINK-13856
 URL: https://issues.apache.org/jira/browse/FLINK-13856
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / State Backends
Affects Versions: 1.9.0, 1.8.1
Reporter: andrew.D.lin
 Attachments: f6cc56b7-2c74-4f4b-bb6a-476d28a22096.png

When the new checkpoint is completed, an old checkpoint will be deleted by 
calling CompletedCheckpoint.discardOnSubsume().

When deleting old checkpoints, follow these steps:
1, drop the metadata
2, discard private state objects
3, discard location as a whole

In some cases, is it possible to delete the checkpoint folder recursively by 
one call?

As far as I know the full amount of checkpoint, it should be possible to delete 
the folder directly.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-15307) Subclasses of FailoverStrategy are easily confused with implementation classes of RestartStrategy

2019-12-17 Thread andrew.D.lin (Jira)
andrew.D.lin created FLINK-15307:


 Summary: Subclasses of FailoverStrategy are easily confused with 
implementation classes of RestartStrategy
 Key: FLINK-15307
 URL: https://issues.apache.org/jira/browse/FLINK-15307
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Configuration
Affects Versions: 1.9.1, 1.9.0, 1.10.0
Reporter: andrew.D.lin
 Attachments: image-2019-12-18-14-59-03-181.png

Subclasses of RestartStrategy
 * FailingRestartStrategy
 * FailureRateRestartStrategy
 * FixedDelayRestartStrategy
 * InfiniteDelayRestartStrategy

Implementation class of FailoverStrategy
 * AdaptedRestartPipelinedRegionStrategyNG
 * RestartAllStrategy
 * RestartIndividualStrategy
 * RestartPipelinedRegionStrategy

 

FailoverStrategy describes how the job computation recovers from task failures.

I think the following names may be easier to understand and easier to 
distinguish:

Implementation class of FailoverStrategy
 * AdaptedPipelinedRegionFailoverStrategyNG
 * FailoverAllStrategy
 * FailoverIndividualStrategy
 * FailoverPipelinedRegionStrategy

FailoverStrategy is currently generated by configuration. If we change the name 
of the implementation class, it will not affect compatibility.

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15568) RestartPipelinedRegionStrategy: not ensure the EXACTLY_ONCE semantics

2020-01-13 Thread Andrew.D.lin (Jira)
Andrew.D.lin created FLINK-15568:


 Summary: RestartPipelinedRegionStrategy: not ensure the 
EXACTLY_ONCE semantics
 Key: FLINK-15568
 URL: https://issues.apache.org/jira/browse/FLINK-15568
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Checkpointing
Affects Versions: 1.8.3, 1.8.1, 1.8.0
Reporter: Andrew.D.lin
 Attachments: image-2020-01-13-16-40-47-888.png

!image-2020-01-13-16-40-47-888.png!

 

In 1.8* versions, FailoverRegion.java restart method  not restore from latest 
checkpoint and marked TODO.

Should we support this feature (region restart) in flink 1.8?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-21858) TaskMetricGroup taskName is too long, especially in sql tasks.

2021-03-18 Thread Andrew.D.lin (Jira)
Andrew.D.lin created FLINK-21858:


 Summary: TaskMetricGroup taskName is too long, especially in sql 
tasks.
 Key: FLINK-21858
 URL: https://issues.apache.org/jira/browse/FLINK-21858
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Metrics
Affects Versions: 1.12.2, 1.12.1, 1.12.0
Reporter: Andrew.D.lin


Now operatorName is limited to 80 by 
org.apache.flink.runtime.metrics.groups.TaskMetricGroup#METRICS_OPERATOR_NAME_MAX_LENGTH.

So propose to limit the maximum length of metric name by configuration.

 

Here is an example:

 

"taskName":"GlobalGroupAggregate(groupBy=[dt, src, src1, src2, src3, ct1, ct2], 
select=[dt, src, src1, src2, src3, ct1, ct2, SUM_RETRACT((sum$0, count$1)) AS 
sx_pv, SUM_RETRACT((sum$2, count$3)) AS sx_uv, MAX_RETRACT(max$4) AS updt_time, 
MAX_RETRACT(max$5) AS time_id]) -> Calc(select=[((MD5((dt CONCAT _UTF-16LE'|' 
CONCAT src CONCAT _UTF-16LE'|' CONCAT src1 CONCAT _UTF-16LE'|' CONCAT src2 
CONCAT _UTF-16LE'|' CONCAT src3 CONCAT _UTF-16LE'|' CONCAT ct1 CONCAT 
_UTF-16LE'|' CONCAT ct2 CONCAT _UTF-16LE'|' CONCAT time_id)) SUBSTR 1 SUBSTR 2) 
CONCAT _UTF-16LE'_' CONCAT (dt CONCAT _UTF-16LE'|' CONCAT src CONCAT 
_UTF-16LE'|' CONCAT src1 CONCAT _UTF-16LE'|' CONCAT src2 CONCAT _UTF-16LE'|' 
CONCAT src3 CONCAT _UTF-16LE'|' CONCAT ct1 CONCAT _UTF-16LE'|' CONCAT ct2 
CONCAT _UTF-16LE'|' CONCAT time_id)) AS rowkey, sx_pv, sx_uv, updt_time]) -> 
LocalGroupAggregate(groupBy=[rowkey], select=[rowkey, MAX_RETRACT(sx_pv) AS 
max$0, MAX_RETRACT(sx_uv) AS max$1, MAX_RETRACT(updt_time) AS max$2, 
COUNT_RETRACT(*) AS count1$3])"


"operatorName":"GlobalGroupAggregate(groupBy=[dt, src, src1, src2, src3, ct1, 
ct2], selec=[dt, s"
 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)