[GitHub] [flink-benchmarks] fredia commented on pull request #23: [FLINK-23399][runtime/state]Add a benchmark for rescaling

2021-07-19 Thread GitBox


fredia commented on pull request #23:
URL: https://github.com/apache/flink-benchmarks/pull/23#issuecomment-882302478


   Thanks for your reply and nice suggestions.
   1. Yes, this code uses some `StreamOperatorTestHarness` classes which are not public. Following the code structure suggestions, we would move the code to 
[flink-state-backends](https://github.com/apache/flink/tree/master/flink-state-backends/flink-statebackend-rocksdb/src/test/java/org/apache/flink/contrib/streaming/state/benchmark),
 but we are not sure whether that is appropriate; could you give some suggestions on where to move it?
   2. Sorry for the careless results; I had only set `iterations=1` and `fork value=3` before.  (。﹏。*)
   I increased the iterations to 10 and re-ran the test; the new results are shown below.
   The state size also affects the results: the larger the state size, the more stable the results. However, the maximum state size is limited by the memory-backed (`heap`) backend, so we will remove the `heap` parameter afterwards.
   ```text
   Benchmark                   (backendType)  (parallelism1)  (parallelism2)  (subtaskIndex)  Mode  Cnt    Score   Error  Units
   RescalingBenchmark.rescale  heap           3               4               0               avgt   30    8.557 ± 0.501  ms/op
   RescalingBenchmark.rescale  heap           3               4               1               avgt   30    6.405 ± 0.217  ms/op
   RescalingBenchmark.rescale  heap           3               4               2               avgt   30    8.446 ± 0.303  ms/op
   RescalingBenchmark.rescale  heap           3               4               3               avgt   30    6.951 ± 0.634  ms/op
   RescalingBenchmark.rescale  fileSystem     3               4               0               avgt   30    8.967 ± 0.508  ms/op
   RescalingBenchmark.rescale  fileSystem     3               4               1               avgt   30    6.615 ± 0.273  ms/op
   RescalingBenchmark.rescale  fileSystem     3               4               2               avgt   30    8.833 ± 0.472  ms/op
   RescalingBenchmark.rescale  fileSystem     3               4               3               avgt   30    6.737 ± 0.302  ms/op
   RescalingBenchmark.rescale  rocksdb        3               4               0               avgt   30  100.949 ± 1.829  ms/op
   RescalingBenchmark.rescale  rocksdb        3               4               1               avgt   30   82.988 ± 6.050  ms/op
   RescalingBenchmark.rescale  rocksdb        3               4               2               avgt   30  104.332 ± 2.500  ms/op
   RescalingBenchmark.rescale  rocksdb        3               4               3               avgt   30   78.468 ± 1.353  ms/op
   RescalingBenchmark.rescale  rocksdbIncr    3               4               0               avgt   30  184.872 ± 4.912  ms/op
   RescalingBenchmark.rescale  rocksdbIncr    3               4               1               avgt   30  240.067 ± 7.431  ms/op
   RescalingBenchmark.rescale  rocksdbIncr    3               4               2               avgt   30  242.117 ± 3.595  ms/op
   RescalingBenchmark.rescale  rocksdbIncr    3               4               3               avgt   30  150.236 ± 2.395  ms/op
   ```
   
   3. The total running time is about 30 min with `warmup iterations = 10`, `measurement iterations = 10`, and `fork value=3`. Agreed on not testing the heap backend; we will remove the `heap` parameter later.
   We also plan to pick out some typical cases (such as restoring from a partial state handle, restoring from multiple state handles, etc.) instead of testing every `subtaskIndex`.
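   
   For context, here is a minimal JMH skeleton of the setup described above. It is not the actual benchmark code in this PR, just an illustration of the knobs discussed; with `fork value=3` and 10 measurement iterations it yields the Cnt=30 shown in the results:
   
   ```java
   import java.util.concurrent.TimeUnit;
   
   import org.openjdk.jmh.annotations.*;
   
   /** Sketch only; the real RescalingBenchmark restores state snapshots on rescaling. */
   @State(Scope.Thread)
   @BenchmarkMode(Mode.AverageTime)
   @OutputTimeUnit(TimeUnit.MILLISECONDS)
   @Fork(3)
   @Warmup(iterations = 10)
   @Measurement(iterations = 10)
   public class RescalingBenchmarkSketch {
   
       @Param({"heap", "fileSystem", "rocksdb", "rocksdbIncr"})
       public String backendType;
   
       @Param({"0", "1", "2", "3"})
       public int subtaskIndex;
   
       @Benchmark
       public void rescale() {
           // Placeholder body: the real benchmark would restore keyed state written
           // at parallelism1 into an operator running at parallelism2 for this subtask.
       }
   }
   ```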
   
   
   Thank you for the great suggestions. I'll improve my code accordingly.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] Tartarus0zm commented on pull request #16275: [FLINK-20518][rest] Add decoding characters for MessageQueryParameter

2021-07-19 Thread GitBox


Tartarus0zm commented on pull request #16275:
URL: https://github.com/apache/flink/pull/16275#issuecomment-882303224


   > > We are like #13514?
   > > Handle special characters single quotes.
   > 
   > I don't understand what you are asking/suggesting, please elaborate.
   
   @zentol  We could add handling of single quotes in MetricQueryService#replaceInvalidChars so that single quotes never reach the query parameter in the first place;
   decoding multiple times is not the best solution.
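   
   For illustration only, a minimal sketch of the two options: sanitizing single quotes up front (in the spirit of replaceInvalidChars; the `sanitizeMetricName` helper below is made up for this example) versus decoding the already-encoded parameter value, which should happen exactly once:
   
   ```java
   import java.net.URLDecoder;
   import java.nio.charset.StandardCharsets;
   
   public class QuoteHandlingSketch {
   
       // Hypothetical sanitizer: replace single quotes before the name is ever
       // used as a query parameter (analogous in spirit to replaceInvalidChars).
       static String sanitizeMetricName(String name) {
           return name.replace('\'', '_');
       }
   
       public static void main(String[] args) throws Exception {
           String encoded = "metric%27name"; // "metric'name" percent-encoded once
           // Decode exactly once; decoding repeatedly can corrupt values that
           // legitimately contain '%' sequences.
           String decodedOnce = URLDecoder.decode(encoded, StandardCharsets.UTF_8.name());
           System.out.println(decodedOnce);                     // metric'name
           System.out.println(sanitizeMetricName(decodedOnce)); // metric_name
       }
   }
   ```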


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink-benchmarks] fredia edited a comment on pull request #23: [FLINK-23399][runtime/state]Add a benchmark for rescaling

2021-07-19 Thread GitBox


fredia edited a comment on pull request #23:
URL: https://github.com/apache/flink-benchmarks/pull/23#issuecomment-882302478


   Thanks for your reply and nice suggestions, @pnowojski.
   1. Yes, this code uses some `StreamOperatorTestHarness` classes which are not public. Following the code structure suggestions, we would move the code to 
[flink-state-backends](https://github.com/apache/flink/tree/master/flink-state-backends/flink-statebackend-rocksdb/src/test/java/org/apache/flink/contrib/streaming/state/benchmark),
 but we are not sure whether that is appropriate; could you give some suggestions on where to move it?
   2. Sorry for the careless results; I had only set `iterations=1` and `fork value=3` before.  (。﹏。*)
   I increased the iterations to 10 and re-ran the test; the new results are shown below.
   The state size also affects the results: the larger the state size, the more stable the results. However, the maximum state size is limited by the memory-backed (`heap`) backend, so we will remove the `heap` parameter afterwards.
   ```text
   Benchmark                   (backendType)  (parallelism1)  (parallelism2)  (subtaskIndex)  Mode  Cnt    Score   Error  Units
   RescalingBenchmark.rescale  heap           3               4               0               avgt   30    8.557 ± 0.501  ms/op
   RescalingBenchmark.rescale  heap           3               4               1               avgt   30    6.405 ± 0.217  ms/op
   RescalingBenchmark.rescale  heap           3               4               2               avgt   30    8.446 ± 0.303  ms/op
   RescalingBenchmark.rescale  heap           3               4               3               avgt   30    6.951 ± 0.634  ms/op
   RescalingBenchmark.rescale  fileSystem     3               4               0               avgt   30    8.967 ± 0.508  ms/op
   RescalingBenchmark.rescale  fileSystem     3               4               1               avgt   30    6.615 ± 0.273  ms/op
   RescalingBenchmark.rescale  fileSystem     3               4               2               avgt   30    8.833 ± 0.472  ms/op
   RescalingBenchmark.rescale  fileSystem     3               4               3               avgt   30    6.737 ± 0.302  ms/op
   RescalingBenchmark.rescale  rocksdb        3               4               0               avgt   30  100.949 ± 1.829  ms/op
   RescalingBenchmark.rescale  rocksdb        3               4               1               avgt   30   82.988 ± 6.050  ms/op
   RescalingBenchmark.rescale  rocksdb        3               4               2               avgt   30  104.332 ± 2.500  ms/op
   RescalingBenchmark.rescale  rocksdb        3               4               3               avgt   30   78.468 ± 1.353  ms/op
   RescalingBenchmark.rescale  rocksdbIncr    3               4               0               avgt   30  184.872 ± 4.912  ms/op
   RescalingBenchmark.rescale  rocksdbIncr    3               4               1               avgt   30  240.067 ± 7.431  ms/op
   RescalingBenchmark.rescale  rocksdbIncr    3               4               2               avgt   30  242.117 ± 3.595  ms/op
   RescalingBenchmark.rescale  rocksdbIncr    3               4               3               avgt   30  150.236 ± 2.395  ms/op
   ```
   
   3. The total running time is about 30 min with `warmup iterations = 10`, `measurement iterations = 10`, and `fork value=3`. Agreed on not testing the heap backend; we will remove the `heap` parameter later.
   We also plan to pick out some typical cases (such as restoring from a partial state handle, restoring from multiple state handles, etc.) instead of testing every `subtaskIndex`.
   
   
   Thank you for the great suggestions. I'll improve my code accordingly.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-18464) ClassCastException during namespace serialization for checkpoint (Heap and RocksDB)

2021-07-19 Thread guxiang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

guxiang updated FLINK-18464:

Attachment: image-2021-07-19-15-20-36-431.png

> ClassCastException during namespace serialization for checkpoint (Heap and 
> RocksDB)
> ---
>
> Key: FLINK-18464
> URL: https://issues.apache.org/jira/browse/FLINK-18464
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.9.3, 1.13.1
>Reporter: Roman Khachatryan
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2021-06-21-20-06-51-323.png, 
> image-2021-06-21-20-07-30-281.png, image-2021-06-21-20-07-43-246.png, 
> image-2021-06-21-20-33-39-295.png, image-2021-06-23-14-34-37-703.png, 
> image-2021-06-24-16-41-54-425.png, image-2021-06-24-17-51-53-734.png, 
> image-2021-07-08-14-50-12-559.png, image-2021-07-08-18-33-17-417.png, 
> image-2021-07-08-18-34-51-910.png, image-2021-07-19-14-40-17-398.png, 
> image-2021-07-19-14-44-59-511.png, image-2021-07-19-14-46-21-682.png, 
> image-2021-07-19-14-47-34-111.png, image-2021-07-19-15-20-36-431.png
>
>
> (see FLINK-23036 for error details with RocksDB)
>  
> From 
> [thread|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Checkpoint-failed-because-of-TimeWindow-cannot-be-cast-to-VoidNamespace-td36310.html]
> {quote}I'm using flink 1.9 on Mesos and I try to use my own trigger and 
> evictor. The state is stored to memory.
> {quote}
>  
>   
> {code:java}
> input.setParallelism(processParallelism)
>         .assignTimestampsAndWatermarks(new UETimeAssigner)
>         .keyBy(_.key)
>         .window(TumblingEventTimeWindows.of(Time.minutes(20)))
>         .trigger(new MyTrigger)
>         .evictor(new MyEvictor)
>         .process(new MyFunction).setParallelism(aggregateParallelism)
>         .addSink(kafkaSink).setParallelism(sinkParallelism)
>         .name("kafka-record-sink"){code}
>  
>  
> {code:java}
> java.lang.Exception: Could not materialize checkpoint 1 for operator 
> Window(TumblingEventTimeWindows(120), JoinTrigger, JoinEvictor, 
> ScalaProcessWindowFunctionWrapper) -> Sink: kafka-record-sink (2/5).
>  at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.handleExecutionException(StreamTask.java:1100)
>  at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:1042)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> Caused by: java.util.concurrent.ExecutionException: 
> java.lang.ClassCastException: 
> org.apache.flink.streaming.api.windowing.windows.TimeWindow cannot be cast to 
> org.apache.flink.runtime.state.VoidNamespace
>  at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>  at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>  at 
> org.apache.flink.runtime.concurrent.FutureUtils.runIfNotDoneAndGet(FutureUtils.java:450)
>  at 
> org.apache.flink.streaming.api.operators.OperatorSnapshotFinalizer.(OperatorSnapshotFinalizer.java:47)
>  at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:1011)
>  
>  ... 3 more 
> Caused by: java.lang.ClassCastException: 
> org.apache.flink.streaming.api.windowing.windows.TimeWindow cannot be cast to 
> org.apache.flink.runtime.state.VoidNamespace
>  at 
> org.apache.flink.runtime.state.VoidNamespaceSerializer.serialize(VoidNamespaceSerializer.java:32)
>  at 
> org.apache.flink.runtime.state.heap.CopyOnWriteStateMapSnapshot.writeState(CopyOnWriteStateMapSnapshot.java:114)
>  at 
> org.apache.flink.runtime.state.heap.AbstractStateTableSnapshot.writeStateInKeyGroup(AbstractStateTableSnapshot.java:121)
>  at 
> org.apache.flink.runtime.state.heap.CopyOnWriteStateTableSnapshot.writeStateInKeyGroup(CopyOnWriteStateTableSnapshot.java:37)
>  at 
> org.apache.flink.runtime.state.heap.HeapSnapshotStrategy$1.callInternal(HeapSnapshotStrategy.java:191)
>  at 
> org.apache.flink.runtime.state.heap.HeapSnapshotStrategy$1.callInternal(HeapSnapshotStrategy.java:158)
>  at 
> org.apache.flink.runtime.state.AsyncSnapshotCallable.call(AsyncSnapshotCallable.java:75)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at 
> org.apache.flink.runtime.concurrent.FutureUtils.runIfNotDoneAndGet(FutureUtils.java:447)
>     
>      ... 5 more
> {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-18464) ClassCastException during namespace serialization for checkpoint (Heap and RocksDB)

2021-07-19 Thread guxiang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

guxiang updated FLINK-18464:

Attachment: image-2021-07-19-15-21-04-214.png

> ClassCastException during namespace serialization for checkpoint (Heap and 
> RocksDB)
> ---
>
> Key: FLINK-18464
> URL: https://issues.apache.org/jira/browse/FLINK-18464
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.9.3, 1.13.1
>Reporter: Roman Khachatryan
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2021-06-21-20-06-51-323.png, 
> image-2021-06-21-20-07-30-281.png, image-2021-06-21-20-07-43-246.png, 
> image-2021-06-21-20-33-39-295.png, image-2021-06-23-14-34-37-703.png, 
> image-2021-06-24-16-41-54-425.png, image-2021-06-24-17-51-53-734.png, 
> image-2021-07-08-14-50-12-559.png, image-2021-07-08-18-33-17-417.png, 
> image-2021-07-08-18-34-51-910.png, image-2021-07-19-14-40-17-398.png, 
> image-2021-07-19-14-44-59-511.png, image-2021-07-19-14-46-21-682.png, 
> image-2021-07-19-14-47-34-111.png, image-2021-07-19-15-20-36-431.png, 
> image-2021-07-19-15-21-04-214.png
>
>
> (see FLINK-23036 for error details with RocksDB)
>  
> From 
> [thread|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Checkpoint-failed-because-of-TimeWindow-cannot-be-cast-to-VoidNamespace-td36310.html]
> {quote}I'm using flink 1.9 on Mesos and I try to use my own trigger and 
> evictor. The state is stored to memory.
> {quote}
>  
>   
> {code:java}
> input.setParallelism(processParallelism)
>         .assignTimestampsAndWatermarks(new UETimeAssigner)
>         .keyBy(_.key)
>         .window(TumblingEventTimeWindows.of(Time.minutes(20)))
>         .trigger(new MyTrigger)
>         .evictor(new MyEvictor)
>         .process(new MyFunction).setParallelism(aggregateParallelism)
>         .addSink(kafkaSink).setParallelism(sinkParallelism)
>         .name("kafka-record-sink"){code}
>  
>  
> {code:java}
> java.lang.Exception: Could not materialize checkpoint 1 for operator 
> Window(TumblingEventTimeWindows(120), JoinTrigger, JoinEvictor, 
> ScalaProcessWindowFunctionWrapper) -> Sink: kafka-record-sink (2/5).
>  at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.handleExecutionException(StreamTask.java:1100)
>  at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:1042)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> Caused by: java.util.concurrent.ExecutionException: 
> java.lang.ClassCastException: 
> org.apache.flink.streaming.api.windowing.windows.TimeWindow cannot be cast to 
> org.apache.flink.runtime.state.VoidNamespace
>  at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>  at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>  at 
> org.apache.flink.runtime.concurrent.FutureUtils.runIfNotDoneAndGet(FutureUtils.java:450)
>  at 
> org.apache.flink.streaming.api.operators.OperatorSnapshotFinalizer.(OperatorSnapshotFinalizer.java:47)
>  at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:1011)
>  
>  ... 3 more 
> Caused by: java.lang.ClassCastException: 
> org.apache.flink.streaming.api.windowing.windows.TimeWindow cannot be cast to 
> org.apache.flink.runtime.state.VoidNamespace
>  at 
> org.apache.flink.runtime.state.VoidNamespaceSerializer.serialize(VoidNamespaceSerializer.java:32)
>  at 
> org.apache.flink.runtime.state.heap.CopyOnWriteStateMapSnapshot.writeState(CopyOnWriteStateMapSnapshot.java:114)
>  at 
> org.apache.flink.runtime.state.heap.AbstractStateTableSnapshot.writeStateInKeyGroup(AbstractStateTableSnapshot.java:121)
>  at 
> org.apache.flink.runtime.state.heap.CopyOnWriteStateTableSnapshot.writeStateInKeyGroup(CopyOnWriteStateTableSnapshot.java:37)
>  at 
> org.apache.flink.runtime.state.heap.HeapSnapshotStrategy$1.callInternal(HeapSnapshotStrategy.java:191)
>  at 
> org.apache.flink.runtime.state.heap.HeapSnapshotStrategy$1.callInternal(HeapSnapshotStrategy.java:158)
>  at 
> org.apache.flink.runtime.state.AsyncSnapshotCallable.call(AsyncSnapshotCallable.java:75)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at 
> org.apache.flink.runtime.concurrent.FutureUtils.runIfNotDoneAndGet(FutureUtils.java:447)
>     
>      ... 5 more
> {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-23402) Expose a consistent GlobalDataExchangeMode

2021-07-19 Thread Till Rohrmann (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann updated FLINK-23402:
--
Description: 
The Table API makes the {{GlobalDataExchangeMode}} configurable via 
{{table.exec.shuffle-mode}}.

In Table API batch mode the StreamGraph is configured with 
{{ALL_EDGES_BLOCKING}} and in DataStream API batch mode 
{{FORWARD_EDGES_PIPELINED}}.

I would vote for unifying the exchange mode of both APIs so that complex SQL 
pipelines behave identically in {{StreamTableEnvironment}} and 
{{TableEnvironment}}. Also, the feedback I got so far suggests that 
{{ALL_EDGES_BLOCKING}} is the safer option for running pipelines successfully 
with limited resources.

[~lzljs3620320]
{quote}
The previous history was like this:
- The default value is pipeline, and we find that many times due to 
insufficient resources, the deployment will hang. And the typical use of batch 
jobs is small resources running large parallelisms, because in batch jobs, the 
granularity of failover is related to the amount of data processed by a single 
task. The smaller the amount of data, the faster the fault tolerance. So most 
of the scenarios are run with small resources and large parallelisms, little by 
little slowly running.

- Later, we switched the default value to blocking. We found that the better 
blocking shuffle implementation would not slow down the running speed much. We 
tested tpc-ds and it took almost the same time.
{quote}

[~dwysakowicz]
{quote}
I don't see a problem with changing the default value for DataStream batch mode 
if you think ALL_EDGES_BLOCKING is the better default option.
{quote}

In any case, we should make this configurable for DataStream API users and make 
the specific Table API option obsolete.

It would include the following steps:

- Move {{GlobalDataExchangeMode}} from {{o.a.f.streaming.api.graph}} to 
{{o.a.f.api.common}} (with reworked JavaDocs) as {{ExchangeMode}} (to have it 
shorter) next to {{RuntimeMode}}
- Add {{StreamExecutionEnvironment.setExchangeMode()}} next to 
{{setRuntimeMode}}
- Add option {{execution.exchange-mode}}
- Add checks for invalid combinations to StreamGraphGenerator
- Deprecate ExecutionMode ([avoid 
confusion|https://stackoverflow.com/questions/68335472/what-is-difference-in-runtimeexecutionmode-and-executionmode])
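
For illustration only, a hedged sketch of how the proposed setup could look from a user's perspective. {{setExchangeMode}}, {{ExchangeMode}}, and {{execution.exchange-mode}} are the proposed names from the steps above and do not exist yet, so they appear only in comments:

{code:java}
import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ExchangeModeSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setRuntimeMode(RuntimeExecutionMode.BATCH);
        // Proposed (not yet existing) API from the steps above:
        //   env.setExchangeMode(ExchangeMode.ALL_EDGES_BLOCKING);
        // Equivalent proposed config option: execution.exchange-mode: ALL_EDGES_BLOCKING
        // A batch pipeline would be defined here; the exchange mode then decides
        // whether its edges are pipelined or blocking.
        env.fromSequence(1, 100).print();
        env.execute("exchange-mode-sketch");
    }
}
{code}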

  was:
The Table API makes the {{GlobalDataExchangeMode}} configurable via 
{{table.exec.shuffle-mode}}.

In Table API batch mode the StreamGraph is configured with 
{{ALL_EDGES_BLOCKING}} and in DataStream API batch mode 
{{FORWARD_EDGES_PIPELINED}}.

I would vote for unifying the exchange mode of both APIs so that complex SQL 
pipelines behave identically in {{StreamTableEnvironment}} and 
{{TableEnvironment}}. Also, the feedback I got so far suggests that 
{{ALL_EDGES_BLOCKING}} is the safer option for running pipelines successfully 
with limited resources.

[~lzljs3620320]
{noformat}
The previous history was like this:
- The default value is pipeline, and we find that many times due to 
insufficient resources, the deployment will hang. And the typical use of batch 
jobs is small resources running large parallelisms, because in batch jobs, the 
granularity of failover is related to the amount of data processed by a single 
task. The smaller the amount of data, the faster the fault tolerance. So most 
of the scenarios are run with small resources and large parallelisms, little by 
little slowly running.

- Later, we switched the default value to blocking. We found that the better 
blocking shuffle implementation would not slow down the running speed much. We 
tested tpc-ds and it took almost the same time.
{noformat}

[~dwysakowicz]
{noformat}
I don't see a problem with changing the default value for DataStream batch mode 
if you think ALL_EDGES_BLOCKING is the better default option.
{noformat}

In any case, we should make this configurable for DataStream API users and make 
the specific Table API option obsolete.

It would include the following steps:

- Move {{GlobalDataExchangeMode}} from {{o.a.f.streaming.api.graph}} to 
{{o.a.f.api.common}} (with reworked JavaDocs) as {{ExchangeMode}} (to have it 
shorter) next to {{RuntimeMode}}
- Add {{StreamExecutionEnvironment.setExchangeMode()}} next to 
{{setRuntimeMode}}
- Add option {{execution.exchange-mode}}
- Add checks for invalid combinations to StreamGraphGenerator
- Deprecate ExecutionMode ([avoid 
confusion|https://stackoverflow.com/questions/68335472/what-is-difference-in-runtimeexecutionmode-and-executionmode])


> Expose a consistent GlobalDataExchangeMode
> --
>
> Key: FLINK-23402
> URL: https://issues.apache.org/jira/browse/FLINK-23402
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / DataStream
>Reporter: Timo Walther
>Priority: Major
>
> The Table API makes the {{GlobalDataExchang

[GitHub] [flink] flinkbot edited a comment on pull request #15703: [FLINK-22275][table] Support random past for timestamp types in datagen connector

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #15703:
URL: https://github.com/apache/flink/pull/15703#issuecomment-823980125


   
   ## CI report:
   
   * 3f344c384cc687042e0835ed655e7067770b7426 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18785)
 
   * 886a0567d25763d1d786432354886942fba5288d Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20669)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] ranqiqiang commented on a change in pull request #16478: [FLINK-23228][docs-zh] Translate "Stateful Stream Processing" page into Chinese

2021-07-19 Thread GitBox


ranqiqiang commented on a change in pull request #16478:
URL: https://github.com/apache/flink/pull/16478#discussion_r672049992



##
File path: docs/content.zh/docs/concepts/stateful-stream-processing.md
##
@@ -24,342 +24,227 @@ under the License.
 
 # 有状态流处理
 
-## What is State?
+## 什么是状态?
 
-While many operations in a dataflow simply look at one individual *event at a
-time* (for example an event parser), some operations remember information
-across multiple events (for example window operators). These operations are
-called **stateful**.
+虽然数据流中的很多操作一次只着眼于一个单独的事件(例如事件解析器),但有些操作会记住多个事件的信息(例如窗口算子)。
+这些操作称为**有状态的**(stateful)。
 
-Some examples of stateful operations:
+有状态操作的一些示例:
 
-  - When an application searches for certain event patterns, the state will
-store the sequence of events encountered so far.
-  - When aggregating events per minute/hour/day, the state holds the pending
-aggregates.
-  - When training a machine learning model over a stream of data points, the
-state holds the current version of the model parameters.
-  - When historic data needs to be managed, the state allows efficient access
-to events that occurred in the past.
+  - 当应用程序搜索某些事件模式时,状态将存储到目前为止遇到的一系列事件。

Review comment:
   Would “状态将按顺序保存目前为止遇到的事件” ("the state will store, in order, the events encountered so far") be more reasonable here?
   
   Or did I misunderstand?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16273: [FLINK-18464][Runtime] when namespaceSerializer is different, forbid getPartitionedState from different namespace

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #16273:
URL: https://github.com/apache/flink/pull/16273#issuecomment-867425757


   
   ## CI report:
   
   * b5a9fc5356693337ef457f67773451117c04a0fd UNKNOWN
   * a4c0d21fb912a1d7d294dfbd8c41adc18729e23c Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20667)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16399: FLIP-177 Draft: Extend Sink API

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #16399:
URL: https://github.com/apache/flink/pull/16399#issuecomment-874970474


   
   ## CI report:
   
   * 76806a6d74441b2bfb2b4e4c2bcf0c8c7194b133 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20288)
 
   * ec5066dda1351e215848ee8793645fee4b9c63d5 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16530: [FLINK-23409][runtime] Add logs for FineGrainedSlotManager#checkResou…

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #16530:
URL: https://github.com/apache/flink/pull/16530#issuecomment-882207317


   
   ## CI report:
   
   * ed632e41f1fd92b53282164ff9556d99d98061d7 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20659)
 
   * 72320c6005bc752c1c98b7f2a33ddebbc9644b5a Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20664)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16531: [FLINK-23204] Provide StateBackends access to MailboxExecutor

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #16531:
URL: https://github.com/apache/flink/pull/16531#issuecomment-882289490


   
   ## CI report:
   
   * 98a41566d8755a72e597c7d530d5a350fd561fbd Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20670)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-23402) Expose a consistent GlobalDataExchangeMode

2021-07-19 Thread Till Rohrmann (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-23402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17383090#comment-17383090
 ] 

Till Rohrmann commented on FLINK-23402:
---

+1 for unifying the way the exchange mode is configured. I also really like 
Stephan's proposal for the 3 different batch modes. I do agree that they make a 
lot more sense than what we currently have and will make it easier for our 
users to use and understand Flink.

Scoping-wise, implementing {{RuntimeMode.BATCH.withAllExchangesBlocking()}} and 
{{RuntimeMode.BATCH.withAllExchangesPipelined()}} first could be a start. However, 
we should consider that changing the default from {{withAllExchangesBlocking()}} 
to pipelined-within-slot-sharing-groups for {{FORWARD}} edges can be a 
behavioural change in terms of resource consumption. Consequently, this might 
break some jobs that suddenly need more resources for a slot.

> Expose a consistent GlobalDataExchangeMode
> --
>
> Key: FLINK-23402
> URL: https://issues.apache.org/jira/browse/FLINK-23402
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / DataStream
>Reporter: Timo Walther
>Priority: Major
>
> The Table API makes the {{GlobalDataExchangeMode}} configurable via 
> {{table.exec.shuffle-mode}}.
> In Table API batch mode the StreamGraph is configured with 
> {{ALL_EDGES_BLOCKING}} and in DataStream API batch mode 
> {{FORWARD_EDGES_PIPELINED}}.
> I would vote for unifying the exchange mode of both APIs so that complex SQL 
> pipelines behave identically in {{StreamTableEnvironment}} and 
> {{TableEnvironment}}. Also, the feedback I got so far suggests that 
> {{ALL_EDGES_BLOCKING}} is the safer option for running pipelines successfully 
> with limited resources.
> [~lzljs3620320]
> {quote}
> The previous history was like this:
> - The default value is pipeline, and we find that many times due to 
> insufficient resources, the deployment will hang. And the typical use of 
> batch jobs is small resources running large parallelisms, because in batch 
> jobs, the granularity of failover is related to the amount of data processed 
> by a single task. The smaller the amount of data, the faster the fault 
> tolerance. So most of the scenarios are run with small resources and large 
> parallelisms, little by little slowly running.
> - Later, we switched the default value to blocking. We found that the better 
> blocking shuffle implementation would not slow down the running speed much. 
> We tested tpc-ds and it took almost the same time.
> {quote}
> [~dwysakowicz]
> {quote}
> I don't see a problem with changing the default value for DataStream batch 
> mode if you think ALL_EDGES_BLOCKING is the better default option.
> {quote}
> In any case, we should make this configurable for DataStream API users and 
> make the specific Table API option obsolete.
> It would include the following steps:
> - Move {{GlobalDataExchangeMode}} from {{o.a.f.streaming.api.graph}} to 
> {{o.a.f.api.common}} (with reworked JavaDocs) as {{ExchangeMode}} (to have it 
> shorter) next to {{RuntimeMode}}
> - Add {{StreamExecutionEnvironment.setExchangeMode()}} next to 
> {{setRuntimeMode}}
> - Add option {{execution.exchange-mode}}
> - Add checks for invalid combinations to StreamGraphGenerator
> - Deprecate ExecutionMode ([avoid 
> confusion|https://stackoverflow.com/questions/68335472/what-is-difference-in-runtimeexecutionmode-and-executionmode])



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] RollsBean commented on a change in pull request #16478: [FLINK-23228][docs-zh] Translate "Stateful Stream Processing" page into Chinese

2021-07-19 Thread GitBox


RollsBean commented on a change in pull request #16478:
URL: https://github.com/apache/flink/pull/16478#discussion_r672054159



##
File path: docs/content.zh/docs/concepts/stateful-stream-processing.md
##
@@ -24,342 +24,227 @@ under the License.
 
 # 有状态流处理
 
-## What is State?
+## 什么是状态?
 
-While many operations in a dataflow simply look at one individual *event at a
-time* (for example an event parser), some operations remember information
-across multiple events (for example window operators). These operations are
-called **stateful**.
+虽然数据流中的很多操作一次只着眼于一个单独的事件(例如事件解析器),但有些操作会记住多个事件的信息(例如窗口算子)。
+这些操作称为**有状态的**(stateful)。
 
-Some examples of stateful operations:
+有状态操作的一些示例:
 
-  - When an application searches for certain event patterns, the state will
-store the sequence of events encountered so far.
-  - When aggregating events per minute/hour/day, the state holds the pending
-aggregates.
-  - When training a machine learning model over a stream of data points, the
-state holds the current version of the model parameters.
-  - When historic data needs to be managed, the state allows efficient access
-to events that occurred in the past.
+  - 当应用程序搜索某些事件模式时,状态将存储到目前为止遇到的一系列事件。
+  - 当每分钟/每小时/每天聚合事件时,状态会持有待处理的聚合。
+  - 当在数据点的流上训练一个机器学习模型时,状态会保存模型参数的当前版本。
+  - 当需要管理历史数据时,状态允许有效访问过去发生的事件。
 
-Flink needs to be aware of the state in order to make it fault tolerant using
+Flink 需要知道状态以便使用
 [checkpoints]({{< ref "docs/dev/datastream/fault-tolerance/checkpointing" >}})
-and [savepoints]({{< ref "docs/ops/state/savepoints" >}}).
+和 [savepoints]({{< ref "docs/ops/state/savepoints" >}}) 进行容错。
 
-Knowledge about the state also allows for rescaling Flink applications, meaning
-that Flink takes care of redistributing state across parallel instances.
+关于状态的知识也允许我们重新调节 Flink 应用程序,这意味着 Flink 负责跨并行实例重新分布状态。
 
-[Queryable state]({{< ref 
"docs/dev/datastream/fault-tolerance/queryable_state" >}}) allows you to access 
state from outside of Flink during runtime.
+[可查询的状态]({{< ref "docs/dev/datastream/fault-tolerance/queryable_state" 
>}})允许你在运行时从 Flink 外部访问状态。
 
-When working with state, it might also be useful to read about [Flink's state
-backends]({{< ref "docs/ops/state/state_backends" >}}). Flink
-provides different state backends that specify how and where state is stored.
+在使用状态时,阅读 [Flink 的状态后端]({{< ref "docs/ops/state/state_backends" >}})可能也很有用。 
+Flink 提供了不同的状态后端,用于指定状态存储的方式和位置。
 
 {{< top >}}
 
 ## Keyed State
 
-Keyed state is maintained in what can be thought of as an embedded key/value
-store.  The state is partitioned and distributed strictly together with the
-streams that are read by the stateful operators. Hence, access to the key/value
-state is only possible on *keyed streams*, i.e. after a keyed/partitioned data
-exchange, and is restricted to the values associated with the current event's
-key. Aligning the keys of streams and state makes sure that all state updates
-are local operations, guaranteeing consistency without transaction overhead.
-This alignment also allows Flink to redistribute the state and adjust the
-stream partitioning transparently.
-
-{{< img src="/fig/state_partitioning.svg" alt="State and Partitioning" 
class="offset" width="50%" >}}
-
-Keyed State is further organized into so-called *Key Groups*. Key Groups are
-the atomic unit by which Flink can redistribute Keyed State; there are exactly
-as many Key Groups as the defined maximum parallelism.  During execution each
-parallel instance of a keyed operator works with the keys for one or more Key
-Groups.
-
-## State Persistence
-
-Flink implements fault tolerance using a combination of **stream replay** and
-**checkpointing**. A checkpoint marks a specific point in each of the
-input streams along with the corresponding state for each of the operators. A
-streaming dataflow can be resumed from a checkpoint while maintaining
-consistency *(exactly-once processing semantics)* by restoring the state of the
-operators and replaying the records from the point of the checkpoint.
-
-The checkpoint interval is a means of trading off the overhead of fault
-tolerance during execution with the recovery time (the number of records that
-need to be replayed).
-
-The fault tolerance mechanism continuously draws snapshots of the distributed
-streaming data flow. For streaming applications with small state, these
-snapshots are very light-weight and can be drawn frequently without much impact
-on performance.  The state of the streaming applications is stored at a
-configurable place, usually in a distributed file system.
-
-In case of a program failure (due to machine-, network-, or software failure),
-Flink stops the distributed streaming dataflow.  The system then restarts the
-operators and resets them to the latest successful checkpoint. The input
-streams are reset to the point of the state snapshot. Any records that are
-processed as part of the restarted parallel dataflow are guaranteed to not have
-affect

[jira] [Commented] (FLINK-21025) SQLClientHBaseITCase fails when untarring HBase

2021-07-19 Thread Dawid Wysakowicz (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17383095#comment-17383095
 ] 

Dawid Wysakowicz commented on FLINK-21025:
--

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20656&view=logs&j=c88eea3b-64a0-564d-0031-9fdcd7b8abee&t=ff888d9b-cd34-53cc-d90f-3e446d355529&l=28350

> SQLClientHBaseITCase fails when untarring HBase
> ---
>
> Key: FLINK-21025
> URL: https://issues.apache.org/jira/browse/FLINK-21025
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / HBase, Table SQL / Ecosystem, Tests
>Affects Versions: 1.12.2, 1.13.0
>Reporter: Dawid Wysakowicz
>Priority: Major
>  Labels: auto-deprioritized-critical, test-stability
> Fix For: 1.14.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=12210&view=logs&j=c88eea3b-64a0-564d-0031-9fdcd7b8abee&t=ff888d9b-cd34-53cc-d90f-3e446d355529
> {code}
> [ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 908.614 s <<< FAILURE! - in 
> org.apache.flink.tests.util.hbase.SQLClientHBaseITCase
> Jan 19 08:19:36 [ERROR] testHBase[1: 
> hbase-version:2.2.3](org.apache.flink.tests.util.hbase.SQLClientHBaseITCase)  
> Time elapsed: 615.099 s  <<< ERROR!
> Jan 19 08:19:36 java.io.IOException: 
> Jan 19 08:19:36 Process execution failed due error. Error output:
> Jan 19 08:19:36   at 
> org.apache.flink.tests.util.AutoClosableProcess$AutoClosableProcessBuilder.runBlocking(AutoClosableProcess.java:133)
> Jan 19 08:19:36   at 
> org.apache.flink.tests.util.AutoClosableProcess$AutoClosableProcessBuilder.runBlocking(AutoClosableProcess.java:108)
> Jan 19 08:19:36   at 
> org.apache.flink.tests.util.AutoClosableProcess.runBlocking(AutoClosableProcess.java:70)
> Jan 19 08:19:36   at 
> org.apache.flink.tests.util.hbase.LocalStandaloneHBaseResource.setupHBaseDist(LocalStandaloneHBaseResource.java:86)
> Jan 19 08:19:36   at 
> org.apache.flink.tests.util.hbase.LocalStandaloneHBaseResource.before(LocalStandaloneHBaseResource.java:76)
> Jan 19 08:19:36   at 
> org.apache.flink.util.ExternalResource$1.evaluate(ExternalResource.java:46)
> Jan 19 08:19:36   at 
> org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
> Jan 19 08:19:36   at 
> org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
> Jan 19 08:19:36   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> Jan 19 08:19:36   at 
> org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> Jan 19 08:19:36   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> Jan 19 08:19:36   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> Jan 19 08:19:36   at 
> org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> Jan 19 08:19:36   at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> Jan 19 08:19:36   at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> Jan 19 08:19:36   at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> Jan 19 08:19:36   at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> Jan 19 08:19:36   at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:363)
> Jan 19 08:19:36   at org.junit.runners.Suite.runChild(Suite.java:128)
> Jan 19 08:19:36   at org.junit.runners.Suite.runChild(Suite.java:27)
> Jan 19 08:19:36   at 
> org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> Jan 19 08:19:36   at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> Jan 19 08:19:36   at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> Jan 19 08:19:36   at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> Jan 19 08:19:36   at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> Jan 19 08:19:36   at 
> org.apache.flink.util.ExternalResource$1.evaluate(ExternalResource.java:48)
> Jan 19 08:19:36   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> Jan 19 08:19:36   at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:363)
> Jan 19 08:19:36   at org.junit.runners.Suite.runChild(Suite.java:128)
> Jan 19 08:19:36   at org.junit.runners.Suite.runChild(Suite.java:27)
> Jan 19 08:19:36   at 
> org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> Jan 19 08:19:36   at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> Jan 19 08:19:36   at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> Jan 19 08:19:36   at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> Jan 19 08:19:36   at 
> org.junit.runner

[jira] [Created] (FLINK-23423) Translate the page of "Elasticsearch Connector" into Chinese

2021-07-19 Thread huxixiang (Jira)
huxixiang created FLINK-23423:
-

 Summary: Translate the page of "Elasticsearch Connector" into 
Chinese
 Key: FLINK-23423
 URL: https://issues.apache.org/jira/browse/FLINK-23423
 Project: Flink
  Issue Type: Improvement
  Components: chinese-translation, Documentation
Affects Versions: 1.14.0
Reporter: huxixiang
 Fix For: 1.14.0


Translate the internal page 
"[https://ci.apache.org/projects/flink/flink-docs-master/docs/connectors/datastream/elasticsearch/]";
 into Chinese.

The doc is located at 
"flink/docs/content/zh/docs/connectors/datastream/elasticsearch.md".



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-18464) ClassCastException during namespace serialization for checkpoint (Heap and RocksDB)

2021-07-19 Thread guxiang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

guxiang updated FLINK-18464:

Attachment: image-2021-07-19-15-45-53-014.png

> ClassCastException during namespace serialization for checkpoint (Heap and 
> RocksDB)
> ---
>
> Key: FLINK-18464
> URL: https://issues.apache.org/jira/browse/FLINK-18464
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.9.3, 1.13.1
>Reporter: Roman Khachatryan
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2021-06-21-20-06-51-323.png, 
> image-2021-06-21-20-07-30-281.png, image-2021-06-21-20-07-43-246.png, 
> image-2021-06-21-20-33-39-295.png, image-2021-06-23-14-34-37-703.png, 
> image-2021-06-24-16-41-54-425.png, image-2021-06-24-17-51-53-734.png, 
> image-2021-07-08-14-50-12-559.png, image-2021-07-08-18-33-17-417.png, 
> image-2021-07-08-18-34-51-910.png, image-2021-07-19-14-40-17-398.png, 
> image-2021-07-19-14-44-59-511.png, image-2021-07-19-14-46-21-682.png, 
> image-2021-07-19-14-47-34-111.png, image-2021-07-19-15-20-36-431.png, 
> image-2021-07-19-15-21-04-214.png, image-2021-07-19-15-45-53-014.png
>
>
> (see FLINK-23036 for error details with RocksDB)
>  
> From 
> [thread|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Checkpoint-failed-because-of-TimeWindow-cannot-be-cast-to-VoidNamespace-td36310.html]
> {quote}I'm using flink 1.9 on Mesos and I try to use my own trigger and 
> evictor. The state is stored to memory.
> {quote}
>  
>   
> {code:java}
> input.setParallelism(processParallelism)
>         .assignTimestampsAndWatermarks(new UETimeAssigner)
>         .keyBy(_.key)
>         .window(TumblingEventTimeWindows.of(Time.minutes(20)))
>         .trigger(new MyTrigger)
>         .evictor(new MyEvictor)
>         .process(new MyFunction).setParallelism(aggregateParallelism)
>         .addSink(kafkaSink).setParallelism(sinkParallelism)
>         .name("kafka-record-sink"){code}
>  
>  
> {code:java}
> java.lang.Exception: Could not materialize checkpoint 1 for operator 
> Window(TumblingEventTimeWindows(120), JoinTrigger, JoinEvictor, 
> ScalaProcessWindowFunctionWrapper) -> Sink: kafka-record-sink (2/5).
>  at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.handleExecutionException(StreamTask.java:1100)
>  at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:1042)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> Caused by: java.util.concurrent.ExecutionException: 
> java.lang.ClassCastException: 
> org.apache.flink.streaming.api.windowing.windows.TimeWindow cannot be cast to 
> org.apache.flink.runtime.state.VoidNamespace
>  at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>  at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>  at 
> org.apache.flink.runtime.concurrent.FutureUtils.runIfNotDoneAndGet(FutureUtils.java:450)
>  at 
> org.apache.flink.streaming.api.operators.OperatorSnapshotFinalizer.(OperatorSnapshotFinalizer.java:47)
>  at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:1011)
>  
>  ... 3 more 
> Caused by: java.lang.ClassCastException: 
> org.apache.flink.streaming.api.windowing.windows.TimeWindow cannot be cast to 
> org.apache.flink.runtime.state.VoidNamespace
>  at 
> org.apache.flink.runtime.state.VoidNamespaceSerializer.serialize(VoidNamespaceSerializer.java:32)
>  at 
> org.apache.flink.runtime.state.heap.CopyOnWriteStateMapSnapshot.writeState(CopyOnWriteStateMapSnapshot.java:114)
>  at 
> org.apache.flink.runtime.state.heap.AbstractStateTableSnapshot.writeStateInKeyGroup(AbstractStateTableSnapshot.java:121)
>  at 
> org.apache.flink.runtime.state.heap.CopyOnWriteStateTableSnapshot.writeStateInKeyGroup(CopyOnWriteStateTableSnapshot.java:37)
>  at 
> org.apache.flink.runtime.state.heap.HeapSnapshotStrategy$1.callInternal(HeapSnapshotStrategy.java:191)
>  at 
> org.apache.flink.runtime.state.heap.HeapSnapshotStrategy$1.callInternal(HeapSnapshotStrategy.java:158)
>  at 
> org.apache.flink.runtime.state.AsyncSnapshotCallable.call(AsyncSnapshotCallable.java:75)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at 
> org.apache.flink.runtime.concurrent.FutureUtils.runIfNotDoneAndGet(FutureUtils.java:447)
>     
>      ... 5 more
> {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-18464) ClassCastException during namespace serialization for checkpoint (Heap and RocksDB)

2021-07-19 Thread guxiang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

guxiang updated FLINK-18464:

Attachment: image-2021-07-19-15-48-23-853.png

> ClassCastException during namespace serialization for checkpoint (Heap and 
> RocksDB)
> ---
>
> Key: FLINK-18464
> URL: https://issues.apache.org/jira/browse/FLINK-18464
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.9.3, 1.13.1
>Reporter: Roman Khachatryan
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2021-06-21-20-06-51-323.png, 
> image-2021-06-21-20-07-30-281.png, image-2021-06-21-20-07-43-246.png, 
> image-2021-06-21-20-33-39-295.png, image-2021-06-23-14-34-37-703.png, 
> image-2021-06-24-16-41-54-425.png, image-2021-06-24-17-51-53-734.png, 
> image-2021-07-08-14-50-12-559.png, image-2021-07-08-18-33-17-417.png, 
> image-2021-07-08-18-34-51-910.png, image-2021-07-19-14-40-17-398.png, 
> image-2021-07-19-14-44-59-511.png, image-2021-07-19-14-46-21-682.png, 
> image-2021-07-19-14-47-34-111.png, image-2021-07-19-15-20-36-431.png, 
> image-2021-07-19-15-21-04-214.png, image-2021-07-19-15-45-53-014.png, 
> image-2021-07-19-15-48-23-853.png
>
>
> (see FLINK-23036 for error details with RocksDB)
>  
> From 
> [thread|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Checkpoint-failed-because-of-TimeWindow-cannot-be-cast-to-VoidNamespace-td36310.html]
> {quote}I'm using flink 1.9 on Mesos and I try to use my own trigger and 
> evictor. The state is stored to memory.
> {quote}
>  
>   
> {code:java}
> input.setParallelism(processParallelism)
>         .assignTimestampsAndWatermarks(new UETimeAssigner)
>         .keyBy(_.key)
>         .window(TumblingEventTimeWindows.of(Time.minutes(20)))
>         .trigger(new MyTrigger)
>         .evictor(new MyEvictor)
>         .process(new MyFunction).setParallelism(aggregateParallelism)
>         .addSink(kafkaSink).setParallelism(sinkParallelism)
>         .name("kafka-record-sink"){code}
>  
>  
> {code:java}
> java.lang.Exception: Could not materialize checkpoint 1 for operator 
> Window(TumblingEventTimeWindows(120), JoinTrigger, JoinEvictor, 
> ScalaProcessWindowFunctionWrapper) -> Sink: kafka-record-sink (2/5).
>  at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.handleExecutionException(StreamTask.java:1100)
>  at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:1042)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> Caused by: java.util.concurrent.ExecutionException: 
> java.lang.ClassCastException: 
> org.apache.flink.streaming.api.windowing.windows.TimeWindow cannot be cast to 
> org.apache.flink.runtime.state.VoidNamespace
>  at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>  at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>  at 
> org.apache.flink.runtime.concurrent.FutureUtils.runIfNotDoneAndGet(FutureUtils.java:450)
>  at 
> org.apache.flink.streaming.api.operators.OperatorSnapshotFinalizer.(OperatorSnapshotFinalizer.java:47)
>  at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:1011)
>  
>  ... 3 more 
> Caused by: java.lang.ClassCastException: 
> org.apache.flink.streaming.api.windowing.windows.TimeWindow cannot be cast to 
> org.apache.flink.runtime.state.VoidNamespace
>  at 
> org.apache.flink.runtime.state.VoidNamespaceSerializer.serialize(VoidNamespaceSerializer.java:32)
>  at 
> org.apache.flink.runtime.state.heap.CopyOnWriteStateMapSnapshot.writeState(CopyOnWriteStateMapSnapshot.java:114)
>  at 
> org.apache.flink.runtime.state.heap.AbstractStateTableSnapshot.writeStateInKeyGroup(AbstractStateTableSnapshot.java:121)
>  at 
> org.apache.flink.runtime.state.heap.CopyOnWriteStateTableSnapshot.writeStateInKeyGroup(CopyOnWriteStateTableSnapshot.java:37)
>  at 
> org.apache.flink.runtime.state.heap.HeapSnapshotStrategy$1.callInternal(HeapSnapshotStrategy.java:191)
>  at 
> org.apache.flink.runtime.state.heap.HeapSnapshotStrategy$1.callInternal(HeapSnapshotStrategy.java:158)
>  at 
> org.apache.flink.runtime.state.AsyncSnapshotCallable.call(AsyncSnapshotCallable.java:75)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at 
> org.apache.flink.runtime.concurrent.FutureUtils.runIfNotDoneAndGet(FutureUtils.java:447)
>     
>      ... 5 more
> {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (FLINK-23423) Translate the page of "Elasticsearch Connector" into Chinese

2021-07-19 Thread huxixiang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-23423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17383114#comment-17383114
 ] 

huxixiang commented on FLINK-23423:
---

Hi [~jark], I'm very interested in this issue. Could you assign it to me? 
Thanks a lot.

> Translate the page of "Elasticsearch Connector" into Chinese
> 
>
> Key: FLINK-23423
> URL: https://issues.apache.org/jira/browse/FLINK-23423
> Project: Flink
>  Issue Type: Improvement
>  Components: chinese-translation, Documentation
>Affects Versions: 1.14.0
>Reporter: huxixiang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> Translate the internal page 
> "[https://ci.apache.org/projects/flink/flink-docs-master/docs/connectors/datastream/elasticsearch/]";
>  into Chinese.
> The doc is located at 
> "flink/docs/content/zh/docs/connectors/datastream/elasticsearch.md".



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-18464) ClassCastException during namespace serialization for checkpoint (Heap and RocksDB)

2021-07-19 Thread guxiang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17383115#comment-17383115
 ] 

guxiang commented on FLINK-18464:
-

[Yun Tang|https://issues.apache.org/jira/secure/ViewProfile.jspa?name=yunta]

Thanks for your reply. Sorry for the late reply these days; I have been a little 
busy at work.

I first tested the performance with a simple for loop:

[https://github.com/apache/flink/pull/16273/files#diff-75c634234efe759f72c64e98d07cb8fb05317d582858a5f7f15eaf33c0c3a857]

 
{code:java}
// rough timing loop around repeated getPartitionedState calls
long start = System.currentTimeMillis();
for (int i = 0; i < 20L; i++) {
    ValueState<Boolean> firstSeen2 = ctx.getPartitionedState(
            new ValueStateDescriptor<>("first-seen", Types.BOOLEAN));
}
long end = System.currentTimeMillis();
System.out.println("the onElement test use time:" + (end - start));

{code}
They took about the same time before and after the change, and I couldn't see a 
difference.

I then tested the performance with the IDEA CPU profiler.

 

I found the following when profiling the three variants:

 
h3. Comparing the TypeSerializer

getPartitionedState takes about 6166 samples, and the TypeSerializer equals check 
takes about 794 samples.

 
{code:java}
// variant 1: full TypeSerializer equality check via equals()
if (lastName != null
        && lastName.equals(stateDescriptor.getName())
        && lastNamespaceSerializer != null
        && lastNamespaceSerializer.equals(namespaceSerializer)) {
    lastState.setCurrentNamespace(namespace);
    return (S) lastState;
}
{code}
 

 

!image-2021-07-19-15-20-36-431.png|width=1018,height=233!

!image-2021-07-19-15-21-04-214.png|width=769,height=125!
h3. Comparing the TypeSerializer class

getPartitionedState takes about 5550 samples; the equals method does not show up 
in the samples.

 
{code:java}
// variant 2: compare only the namespace serializer classes
if (lastName != null
        && lastName.equals(stateDescriptor.getName())
        && lastNamespaceSerializerClass != null
        && lastNamespaceSerializerClass.equals(namespaceSerializer.getClass())) {
    lastState.setCurrentNamespace(namespace);
    return (S) lastState;
}{code}
 

!image-2021-07-19-14-44-59-511.png|width=1101,height=193!

 

 
h3. Not comparing the namespace serializer

getPartitionedState takes about 4132 samples; the equals method does not show up 
in the samples.

 
{code:java}
// reuse the cached state based on the state name only
if (lastName != null && lastName.equals(stateDescriptor.getName())) {
    lastState.setCurrentNamespace(namespace);
    return (S) lastState;
}
{code}
 

!image-2021-07-19-14-47-34-111.png|width=1034,height=229!

 

 

If you think this performance loss is unacceptable, my final solution is: if the
current lastName matches the cached lastName, use the cached result; if it does
not match, then check whether the namespace serializer has the same class
(typeSerializerClass).

Just like this (see the sketch after the link):

[https://github.com/apache/flink/pull/16273/files]
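
A minimal, hypothetical sketch of the check order described above (name first,
then the serializer class); the real change is in the PR linked above and may
differ in details:

{code:java}
// Hypothetical sketch, not the exact PR code.
if (lastName != null && lastName.equals(stateDescriptor.getName())) {
    // Fast path: same state name as the previous access, so the cached state is
    // reused without calling the expensive TypeSerializer#equals at all.
    lastState.setCurrentNamespace(namespace);
    return (S) lastState;
}
if (lastNamespaceSerializerClass != null
        && !lastNamespaceSerializerClass.equals(namespaceSerializer.getClass())) {
    // The state name changed and the namespace serializer class differs as well;
    // this is the situation that previously led to the ClassCastException.
    throw new UnsupportedOperationException(
            "getPartitionedState was called with a different namespace serializer: "
                    + namespaceSerializer.getClass().getName());
}
// Otherwise fall back to the regular lookup and refresh lastName,
// lastNamespaceSerializerClass and lastState for the next call.
{code}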

!image-2021-07-19-15-45-53-014.png|width=613,height=321!

 

This version takes about 4144 samples.

!image-2021-07-19-15-48-23-853.png|width=1063,height=220!

 

Do you think this is OK? [~yunta] [~roman_khachatryan]

 

> ClassCastException during namespace serialization for checkpoint (Heap and 
> RocksDB)
> ---
>
> Key: FLINK-18464
> URL: https://issues.apache.org/jira/browse/FLINK-18464
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.9.3, 1.13.1
>Reporter: Roman Khachatryan
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2021-06-21-20-06-51-323.png, 
> image-2021-06-21-20-07-30-281.png, image-2021-06-21-20-07-43-246.png, 
> image-2021-06-21-20-33-39-295.png, image-2021-06-23-14-34-37-703.png, 
> image-2021-06-24-16-41-54-425.png, image-2021-06-24-17-51-53-734.png, 
> image-2021-07-08-14-50-12-559.png, image-2021-07-08-18-33-17-417.png, 
> image-2021-07-08-18-34-51-910.png, image-2021-07-19-14-40-17-398.png, 
> image-2021-07-19-14-44-59-511.png, image-2021-07-19-14-46-21-682.png, 
> image-2021-07-19-14-47-34-111.png, image-2021-07-19-15-20-36-431.png, 
> image-2021-07-19-15-21-04-214.png, image-2021-07-19-15-45-53-014.png, 
> image-2021-07-19-15-48-23-853.png
>
>
> (see FLINK-23036 for error details with RocksDB)
>  
> From 
> [thread|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Checkpoint-failed-because-of-TimeWindow-cannot-be-cast-to-VoidNamespace-td36310.html]
> {quote}I'm using flink 1.9 on Mesos and I try to use my own trigger and 
> evictor. The state is stored to memory.
> {quote}
>  
>   
> {code:java}
> input.setParallelism(processParallelism)
>         .assignTimestampsAndWatermarks(new UETimeAssigner)
>         .keyBy(_.key)
>         .window(TumblingEventTimeWindows.of(Time.minutes(20)))
>         .trigger(new MyTrigger)
>         .evictor(new MyEvictor)
>         .process(new MyFunction).setParallelism(aggregateParallelism)
>         .addSink(kafkaSink).setParallelism

[GitHub] [flink] flinkbot edited a comment on pull request #15304: [FLINK-20731][connector] Introduce Pulsar Source

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #15304:
URL: https://github.com/apache/flink/pull/15304#issuecomment-803430086


   
   ## CI report:
   
   * 210e95323903516aaaca9766bf2e8d13679530c3 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20666)
 
   * a75d23fe73acce4a4a921d781746ec0b05cc274c UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15972: Add common source and operator metrics.

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #15972:
URL: https://github.com/apache/flink/pull/15972#issuecomment-844955772


   
   ## CI report:
   
   * 9223b2bfe148ae335891393f244dc2c29f39d2ee UNKNOWN
   * 1e6b0f37803e0fa5326184c17b3a1f3669667de2 UNKNOWN
   * 9bf9ec12f659236c7d63f0631c93dc760bbe05cb UNKNOWN
   * 50de329ece9d41953b511f5953c460c95ca46ecb UNKNOWN
   * 59088925896178940f992392d1fe0d09f669b7e9 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20164)
 
   * 6e947124c3c17aedad9e8f7faa5636a7309e90c9 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16273: [FLINK-18464][Runtime] when namespaceSerializer is different, forbid getPartitionedState from different namespace

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #16273:
URL: https://github.com/apache/flink/pull/16273#issuecomment-867425757


   
   ## CI report:
   
   * b5a9fc5356693337ef457f67773451117c04a0fd UNKNOWN
   * a4c0d21fb912a1d7d294dfbd8c41adc18729e23c Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20667)
 
   * 72589d9d592f002fe55e220aa7b07b5cf694f7cf UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16368: ignore

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #16368:
URL: https://github.com/apache/flink/pull/16368#issuecomment-873854539


   
   ## CI report:
   
   * cf9c6bda554fa68a11c30f7bf8f66a80eafafd43 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20657)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16412: Draft Materialization master

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #16412:
URL: https://github.com/apache/flink/pull/16412#issuecomment-875532108


   
   ## CI report:
   
   * 7d16473ddb835363dd0b4e22aa5760322118e07c Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20470)
 
   * 14516245a6f5bb241bcd3778bc5351a3afd73fcf UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16399: FLIP-177 Draft: Extend Sink API

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #16399:
URL: https://github.com/apache/flink/pull/16399#issuecomment-874970474


   
   ## CI report:
   
   * ec5066dda1351e215848ee8793645fee4b9c63d5 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20672)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16465: [FLINK-22910][runtime] Refine ShuffleMaster lifecycle management for pluggable shuffle service framework

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #16465:
URL: https://github.com/apache/flink/pull/16465#issuecomment-878076127


   
   ## CI report:
   
   * 82583fbc85e3f9e301aee7957a9100df25a38f5f Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20660)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16521: [FLINK-22715][runtime] Implement streaming window assigner tablefunction operator.

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #16521:
URL: https://github.com/apache/flink/pull/16521#issuecomment-881917566


   
   ## CI report:
   
   * 19c5a5f5370dba9f11dd921a6018a79d9897e61e Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20661)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-23322) RMQSourceITCase.testStopWithSavepoint fails on azure due to timeout

2021-07-19 Thread Fabian Paul (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-23322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17383119#comment-17383119
 ] 

Fabian Paul commented on FLINK-23322:
-

[~cmick] Thanks for your suggestion. I updated the PR to increase the handshake
timeout; could you take a look?
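
For illustration, a minimal sketch of what raising the handshake timeout on the
RabbitMQ client looks like (hypothetical host and value; the actual change is in
the PR):

{code:java}
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

public class RmqHandshakeTimeoutExample {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");        // assumption: local test broker
        // The default handshake timeout is 10_000 ms; raising it makes connection
        // setup more tolerant of slow CI machines.
        factory.setHandshakeTimeout(30_000); // milliseconds, illustrative value
        try (Connection connection = factory.newConnection()) {
            System.out.println("Connected to " + connection.getAddress());
        }
    }
}
{code}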

> RMQSourceITCase.testStopWithSavepoint fails on azure due to timeout
> ---
>
> Key: FLINK-23322
> URL: https://issues.apache.org/jira/browse/FLINK-23322
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors/ RabbitMQ
>Affects Versions: 1.12.4
>Reporter: Xintong Song
>Assignee: Fabian Paul
>Priority: Major
>  Labels: pull-request-available, test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20196&view=logs&j=d44f43ce-542c-597d-bf94-b0718c71e5e8&t=03dca39c-73e8-5aaf-601d-328ae5c35f20&l=13696
> {code}
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 41.237 s <<< FAILURE! - in 
> org.apache.flink.streaming.connectors.rabbitmq.RMQSourceITCase
> [ERROR] 
> testStopWithSavepoint(org.apache.flink.streaming.connectors.rabbitmq.RMQSourceITCase)
>   Time elapsed: 7.609 s  <<< ERROR!
> java.util.concurrent.TimeoutException
>   at com.rabbitmq.utility.BlockingCell.get(BlockingCell.java:77)
>   at 
> com.rabbitmq.utility.BlockingCell.uninterruptibleGet(BlockingCell.java:120)
>   at 
> com.rabbitmq.utility.BlockingValueOrException.uninterruptibleGetValue(BlockingValueOrException.java:36)
>   at 
> com.rabbitmq.client.impl.AMQChannel$BlockingRpcContinuation.getReply(AMQChannel.java:502)
>   at com.rabbitmq.client.impl.AMQConnection.start(AMQConnection.java:326)
>   at 
> com.rabbitmq.client.impl.recovery.RecoveryAwareAMQConnectionFactory.newConnection(RecoveryAwareAMQConnectionFactory.java:64)
>   at 
> com.rabbitmq.client.impl.recovery.AutorecoveringConnection.init(AutorecoveringConnection.java:156)
>   at 
> com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:1130)
>   at 
> com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:1087)
>   at 
> com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:1045)
>   at 
> com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:1207)
>   at 
> org.apache.flink.streaming.connectors.rabbitmq.RMQSourceITCase.getRMQConnection(RMQSourceITCase.java:133)
>   at 
> org.apache.flink.streaming.connectors.rabbitmq.RMQSourceITCase.setUp(RMQSourceITCase.java:82)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.testcontainers.containers.FailureDetectingExternalResource$1.evaluate(FailureDetectingExternalResource.java:30)
> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
>

[GitHub] [flink] tsreaper commented on a change in pull request #16511: [FLINK-22861][table-planner] Fix return value deduction of TIMESTAMPADD function

2021-07-19 Thread GitBox


tsreaper commented on a change in pull request #16511:
URL: https://github.com/apache/flink/pull/16511#discussion_r672073883



##
File path: 
flink-table/flink-table-planner/src/main/java/org/apache/calcite/sql/fun/SqlTimestampAddFunction.java
##
@@ -0,0 +1,134 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.sql.fun;
+
+import org.apache.calcite.avatica.util.TimeUnit;
+import org.apache.calcite.rel.type.RelDataType;
+import org.apache.calcite.rel.type.RelDataTypeFactory;
+import org.apache.calcite.sql.SqlFunction;
+import org.apache.calcite.sql.SqlFunctionCategory;
+import org.apache.calcite.sql.SqlKind;
+import org.apache.calcite.sql.type.OperandTypes;
+import org.apache.calcite.sql.type.SqlReturnTypeInference;
+import org.apache.calcite.sql.type.SqlTypeFamily;
+import org.apache.calcite.sql.type.SqlTypeName;
+
+/**
+ * The TIMESTAMPADD function, which adds an interval to a 
datetime (TIMESTAMP, TIME or
+ * DATE).
+ *
+ * The SQL syntax is
+ *
+ * 
+ *
+ * TIMESTAMPADD(timestamp interval, quantity,
+ * datetime)
+ *
+ * 
+ *
+ * The interval time unit can one of the following literals:
+ *
+ * 
+ *   NANOSECOND (and synonym SQL_TSI_FRAC_SECOND)
+ *   MICROSECOND (and synonyms SQL_TSI_MICROSECOND, FRAC_SECOND)
+ *   SECOND (and synonym SQL_TSI_SECOND)
+ *   MINUTE (and synonym SQL_TSI_MINUTE)
+ *   HOUR (and synonym SQL_TSI_HOUR)
+ *   DAY (and synonym SQL_TSI_DAY)
+ *   WEEK (and synonym SQL_TSI_WEEK)
+ *   MONTH (and synonym SQL_TSI_MONTH)
+ *   QUARTER (and synonym SQL_TSI_QUARTER)
+ *   YEAR (and synonym SQL_TSI_YEAR)
+ * 
+ *
+ * Returns modified datetime.
+ *
+ * This class was copied over from Calcite to fix the return type deduction 
issue on timestamp
+ * with local time zone type.
+ */
+public class SqlTimestampAddFunction extends SqlFunction {
+
+private static final int MILLISECOND_PRECISION = 3;
+private static final int MICROSECOND_PRECISION = 6;
+
+private static final SqlReturnTypeInference RETURN_TYPE_INFERENCE =
+opBinding -> {
+final RelDataTypeFactory typeFactory = 
opBinding.getTypeFactory();
+return deduceType(
+typeFactory,
+opBinding.getOperandLiteralValue(0, TimeUnit.class),
+opBinding.getOperandType(1),
+opBinding.getOperandType(2));
+};
+
+public static RelDataType deduceType(
+RelDataTypeFactory typeFactory,
+TimeUnit timeUnit,
+RelDataType intervalType,
+RelDataType datetimeType) {
+// CHANGED: this method is changed to deduce return type on timestamp 
with local time zone
+// correctly
+RelDataType type;
+switch (timeUnit) {
+case MILLISECOND:
+type =
+typeFactory.createSqlType(
+SqlTypeName.TIMESTAMP,
+Math.max(MILLISECOND_PRECISION, 
datetimeType.getPrecision()));
+break;
+case MICROSECOND:
+type =
+typeFactory.createSqlType(
+SqlTypeName.TIMESTAMP,
+Math.max(MICROSECOND_PRECISION, 
datetimeType.getPrecision()));
+break;

Review comment:
   Nice catch. However this change cannot be tested now as SQL code 
generation does not support fractions of seconds.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] beyond1920 commented on pull request #16521: [FLINK-22715][runtime] Implement streaming window assigner tablefunction operator.

2021-07-19 Thread GitBox


beyond1920 commented on pull request #16521:
URL: https://github.com/apache/flink/pull/16521#issuecomment-882341859


   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] rkhachatryan commented on a change in pull request #16447: [FLINK-23144][metrics] Add percentiles to checkpoint statistics

2021-07-19 Thread GitBox


rkhachatryan commented on a change in pull request #16447:
URL: https://github.com/apache/flink/pull/16447#discussion_r672086792



##
File path: 
flink-runtime-web/web-dashboard/src/app/pages/job/checkpoints/job-checkpoints.component.html
##
@@ -193,15 +193,74 @@
 {{ checkPointStats['summary']['state_size']['avg'] | 
humanizeBytes }}
 {{ checkPointStats['summary']['processed_data']['avg'] | 
humanizeBytes }} ({{ checkPointStats['summary']['persisted_data']['avg'] | 
humanizeBytes }})
   
-  
-Maximum
-{{ checkPointStats['summary']['end_to_end_duration']['max'] | 
humanizeDuration}}
-{{ checkPointStats['summary']['state_size']['max'] | 
humanizeBytes }}
-{{ checkPointStats['summary']['processed_data']['max'] | 
humanizeBytes }} ({{ checkPointStats['summary']['persisted_data']['max'] | 
humanizeBytes }})
-  
+
+Maximum
+{{ 
checkPointStats['summary']['end_to_end_duration']['max'] | 
humanizeDuration}}
+{{ checkPointStats['summary']['state_size']['max'] | 
humanizeBytes }}
+{{ checkPointStats['summary']['processed_data']['max'] | 
humanizeBytes }} ({{ checkPointStats['summary']['persisted_data']['max'] | 
humanizeBytes }})
+
 
   
 
+
+
+
+
+
+
+
+End to End Duration
+Checkpointed Data Size
+Processed (persisted) in-flight 
data
+
+
+
+
+
+50% percentile
+{{ 
checkPointStats['summary']['end_to_end_duration']['p50'] | 
humanizeDuration}}
+{{ checkPointStats['summary']['state_size']['p50'] | 
humanizeBytes }}
+{{ checkPointStats['summary']['processed_data']['p50'] 
| humanizeBytes }} ({{ checkPointStats['summary']['persisted_data']['p50'] | 
humanizeBytes }})
+
+
+90% percentile
+{{ 
checkPointStats['summary']['end_to_end_duration']['p90'] | 
humanizeDuration}}
+{{ checkPointStats['summary']['state_size']['p90'] | 
humanizeBytes }}
+{{ checkPointStats['summary']['processed_data']['p90'] 
| humanizeBytes }} ({{ checkPointStats['summary']['persisted_data']['p90'] | 
humanizeBytes }})
+
+
+99% percentile
+{{ 
checkPointStats['summary']['end_to_end_duration']['p99'] | 
humanizeDuration}}
+{{ checkPointStats['summary']['state_size']['p99'] | 
humanizeBytes }}
+{{ checkPointStats['summary']['processed_data']['p99'] 
| humanizeBytes }} ({{ checkPointStats['summary']['persisted_data']['p99'] | 
humanizeBytes }})
+
+
+99% percentile
+{{ 
checkPointStats['summary']['end_to_end_duration']['p99'] | 
humanizeDuration}}
+{{ checkPointStats['summary']['state_size']['p99'] | 
humanizeBytes }}
+{{ checkPointStats['summary']['processed_data']['p99'] 
| humanizeBytes }} ({{ checkPointStats['summary']['persisted_data']['p99'] | 
humanizeBytes }})
+
+
+99.9% percentile

Review comment:
   Thanks for reviewing!
   
   The aggregation here is over time, not over tasks. Each data point is a 
checkpoint, without details per-subtask (subtask details are aggregated upon 
finalization using sum/max).
   
   And the "time" is currently limited to 10K checkpoints (sliding window), so 
1000 can be reached
   (10K limit is 
[set](https://github.com/apache/flink/pull/16447/files#diff-8680a6a315910794f0486c3d4234090317573c5335288b28b812f7d03c6e1b84R30)
 when creating the histogram).
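
   To make the aggregation concrete, here is a minimal, hypothetical sketch of "percentiles over the last N checkpoints" in plain Java (not the actual Flink histogram implementation referenced above):

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

class CheckpointDurationPercentiles {
    private final int windowSize;             // e.g. 10_000 checkpoints, as in the PR
    private final Deque<Long> durations = new ArrayDeque<>();

    CheckpointDurationPercentiles(int windowSize) {
        this.windowSize = windowSize;
    }

    void record(long durationMillis) {
        if (durations.size() == windowSize) {
            durations.removeFirst();          // evict the oldest checkpoint
        }
        durations.addLast(durationMillis);
    }

    /** p in (0, 100]; e.g. p = 99.9 only becomes meaningful with >= 1000 data points. */
    long percentile(double p) {
        if (durations.isEmpty()) {
            throw new IllegalStateException("no checkpoints recorded yet");
        }
        long[] sorted = durations.stream().mapToLong(Long::longValue).toArray();
        Arrays.sort(sorted);
        int index = (int) Math.ceil(p / 100.0 * sorted.length) - 1;
        return sorted[Math.max(index, 0)];
    }
}
```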




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] twalthr commented on pull request #16518: [FLINK-23373][task] Fully support object reuse in OperatorChain

2021-07-19 Thread GitBox


twalthr commented on pull request #16518:
URL: https://github.com/apache/flink/pull/16518#issuecomment-882348564


   Thanks for the feedback @dawidwys. I updated the PR. Please have another 
look.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] twalthr commented on pull request #16518: [FLINK-23373][task] Fully support object reuse in OperatorChain

2021-07-19 Thread GitBox


twalthr commented on pull request #16518:
URL: https://github.com/apache/flink/pull/16518#issuecomment-882350374


   Btw I also manually verified the change using the `HiveDialectQueryITCase`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-22861) TIMESTAMPADD + timestamp_ltz type throws CodeGenException

2021-07-19 Thread Caizhi Weng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caizhi Weng updated FLINK-22861:

Summary: TIMESTAMPADD + timestamp_ltz type throws CodeGenException  (was: 
TIMESTAMPADD + timestamp_ltz type throws CodeGenException when comparing with 
timestamp type)

> TIMESTAMPADD + timestamp_ltz type throws CodeGenException
> -
>
> Key: FLINK-22861
> URL: https://issues.apache.org/jira/browse/FLINK-22861
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner, Table SQL / Runtime
>Affects Versions: 1.14.0, 1.13.2
>Reporter: Caizhi Weng
>Priority: Major
>  Labels: auto-unassigned, pull-request-available
>
> Add the following test case to 
> {{org.apache.flink.table.planner.runtime.batch.sql.CalcITCase}} to reproduce 
> this issue.
> {code:scala}
> @Test
> def myTest(): Unit = {
>   checkResult("SELECT TIMESTAMPADD(MINUTE, 10, CURRENT_TIMESTAMP)", Seq())
> }
> {code}
> The exception stack is
> {code}
> org.apache.flink.table.planner.codegen.CodeGenException: Incompatible types 
> of expression and result type. 
> Expression[GeneratedExpression(result$5,isNull$4,
> isNull$4 = false || false;
> result$5 = null;
> if (!isNull$4) {
>   
>   result$5 = 
> org.apache.flink.table.data.TimestampData.fromEpochMillis(((org.apache.flink.table.data.TimestampData)
>  queryStartTimestamp).getMillisecond() + ((long) 60L), 
> ((org.apache.flink.table.data.TimestampData) 
> queryStartTimestamp).getNanoOfMillisecond());
>   
> }
> ,TIMESTAMP_LTZ(3) NOT NULL,None)] type is [TIMESTAMP_LTZ(3) NOT NULL], result 
> type is [TIMESTAMP(6) NOT NULL]
>   at 
> org.apache.flink.table.planner.codegen.ExprCodeGenerator$$anonfun$generateResultExpression$1.apply(ExprCodeGenerator.scala:311)
>   at 
> org.apache.flink.table.planner.codegen.ExprCodeGenerator$$anonfun$generateResultExpression$1.apply(ExprCodeGenerator.scala:299)
>   at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>   at 
> org.apache.flink.table.planner.codegen.ExprCodeGenerator.generateResultExpression(ExprCodeGenerator.scala:299)
>   at 
> org.apache.flink.table.planner.codegen.ExprCodeGenerator.generateResultExpression(ExprCodeGenerator.scala:255)
>   at 
> org.apache.flink.table.planner.codegen.CalcCodeGenerator$.produceProjectionCode$1(CalcCodeGenerator.scala:142)
>   at 
> org.apache.flink.table.planner.codegen.CalcCodeGenerator$.generateProcessCode(CalcCodeGenerator.scala:167)
>   at 
> org.apache.flink.table.planner.codegen.CalcCodeGenerator$.generateCalcOperator(CalcCodeGenerator.scala:50)
>   at 
> org.apache.flink.table.planner.codegen.CalcCodeGenerator.generateCalcOperator(CalcCodeGenerator.scala)
>   at 
> org.apache.flink.table.planner.plan.nodes.exec.common.CommonExecCalc.translateToPlanInternal(CommonExecCalc.java:94)
>   at 
> org.apache.flink.table.planner.plan.nodes.exec.ExecNodeBase.translateToPlan(ExecNodeBase.java:134)
>   at 
> org.apache.flink.table.planner.plan.nodes.exec.ExecEdge.translateToPlan(ExecEdge.java:247)
>   at 
> org.apache.flink.table.planner.plan.nodes.exec.batch.BatchExecSink.translateToPlanInternal(BatchExecSink.java:58)
>   at 
> org.apache.flink.table.planner.plan.nodes.exec.ExecNodeBase.translateToPlan(ExecNodeBase.java:134)
>   at 
> org.apache.flink.table.planner.delegation.BatchPlanner$$anonfun$translateToPlan$1.apply(BatchPlanner.scala:80)
>   at 
> org.apache.flink.table.planner.delegation.BatchPlanner$$anonfun$translateToPlan$1.apply(BatchPlanner.scala:79)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:891)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
>   at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>   at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
>   at scala.collection.AbstractTraversable.map(Traversable.scala:104)
>   at 
> org.apache.flink.table.planner.delegation.BatchPlanner.translateToPlan(BatchPlanner.scala:79)
>   at 
> org.apache.flink.table.planner.delegation.PlannerBase.translate(PlannerBase.scala:165)
>   at 
> org.apache.flink.table.api.internal.TableEnvironmentImpl.translate(TableEnvironmentImpl.java:1657)
>   at 
> org.apache.flink.table.api.internal.TableEnvironmentImpl.executeQueryOperation(TableEnvironmentImpl.java:797)
>   at 
> org.apache.flink.table.api.internal.Table

[GitHub] [flink] flinkbot edited a comment on pull request #15304: [FLINK-20731][connector] Introduce Pulsar Source

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #15304:
URL: https://github.com/apache/flink/pull/15304#issuecomment-803430086


   
   ## CI report:
   
   * a75d23fe73acce4a4a921d781746ec0b05cc274c Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20673)
 
   * ac421c3904c5438bd236c76cf1a3dd7d81822782 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15972: Add common source and operator metrics.

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #15972:
URL: https://github.com/apache/flink/pull/15972#issuecomment-844955772


   
   ## CI report:
   
   * 9223b2bfe148ae335891393f244dc2c29f39d2ee UNKNOWN
   * 1e6b0f37803e0fa5326184c17b3a1f3669667de2 UNKNOWN
   * 9bf9ec12f659236c7d63f0631c93dc760bbe05cb UNKNOWN
   * 50de329ece9d41953b511f5953c460c95ca46ecb UNKNOWN
   * 59088925896178940f992392d1fe0d09f669b7e9 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20164)
 
   * 6e947124c3c17aedad9e8f7faa5636a7309e90c9 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20671)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16020: [FLINK-21911][table-api] Add built-in greatest/least functions support #15821

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #16020:
URL: https://github.com/apache/flink/pull/16020#issuecomment-850341406


   
   ## CI report:
   
   * 5e48f83780ec9bf554fa1d9be3fe04350d1793f6 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20662)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-23424) Disable key sorting for Transformations created within table planner

2021-07-19 Thread Timo Walther (Jira)
Timo Walther created FLINK-23424:


 Summary: Disable key sorting for Transformations created within 
table planner
 Key: FLINK-23424
 URL: https://issues.apache.org/jira/browse/FLINK-23424
 Project: Flink
  Issue Type: Sub-task
  Components: API / DataStream, Table SQL / API
Reporter: Timo Walther
Assignee: Timo Walther


In batch mode, the table planner is fully in control of the generated 
Transformations. So we can disable the key sorting in this case.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #16412: Draft Materialization master

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #16412:
URL: https://github.com/apache/flink/pull/16412#issuecomment-875532108


   
   ## CI report:
   
   * 14516245a6f5bb241bcd3778bc5351a3afd73fcf Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20674)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16445: [FLINK-23322][connectors/rabbitmq] Capture container log output of RM…

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #16445:
URL: https://github.com/apache/flink/pull/16445#issuecomment-877143095


   
   ## CI report:
   
   * 5a0cf10a15e27f43e4e6e89382767491438daa76 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20242)
 
   * 12b8fdd2b212abebbf15f4cbe76d6c9065ef3446 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16518: [FLINK-23373][task] Fully support object reuse in OperatorChain

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #16518:
URL: https://github.com/apache/flink/pull/16518#issuecomment-881579487


   
   ## CI report:
   
   * 23ab649d4f802ee86f19cd6e690e5d607c708d27 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20578)
 
   * faf9679c3a6cc5e077c448781b838c7e998dcaf0 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16521: [FLINK-22715][runtime] Implement streaming window assigner tablefunction operator.

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #16521:
URL: https://github.com/apache/flink/pull/16521#issuecomment-881917566


   
   ## CI report:
   
   *  Unknown: [CANCELED](TBD) 
   * 6b5e7e5dba93d6baf725ee333c7c26c55d348e43 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16530: [FLINK-23409][runtime] Add logs for FineGrainedSlotManager#checkResou…

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #16530:
URL: https://github.com/apache/flink/pull/16530#issuecomment-882207317


   
   ## CI report:
   
   * 72320c6005bc752c1c98b7f2a33ddebbc9644b5a Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20664)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] tsreaper commented on pull request #16511: [FLINK-22861][table-planner] Fix return value deduction of TIMESTAMPADD function

2021-07-19 Thread GitBox


tsreaper commented on pull request #16511:
URL: https://github.com/apache/flink/pull/16511#issuecomment-882359416


   Thanks @leonardBang for the review. The CI failure is caused by the changes
in the result type of `date + hour / minute / second`. The result type used to
be `timestamp(6)`; however, it would be more proper to deduce it as
`timestamp(0)`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] zentol commented on a change in pull request #16500: [FLINK-23395] Bump Okhttp to 3.14.9

2021-07-19 Thread GitBox


zentol commented on a change in pull request #16500:
URL: https://github.com/apache/flink/pull/16500#discussion_r672105773



##
File path: flink-kubernetes/pom.xml
##
@@ -35,17 +35,6 @@ under the License.
4.9.2

 
-   
-   
-   
-   
-   com.squareup.okhttp3
-   okhttp
-   3.12.1

Review comment:
   I assume this upgrade to be safe because it only jumps by 2 minor 
versions, and I didn't see anything major in the release notes.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Closed] (FLINK-5253) Remove special treatment of "dynamic properties"

2021-07-19 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song closed FLINK-5253.
---
Resolution: Fixed

I believe this has been subsumed by FLINK-8349.

> Remove special treatment of "dynamic properties"
> 
>
> Key: FLINK-5253
> URL: https://issues.apache.org/jira/browse/FLINK-5253
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / YARN
> Environment: {{flip-6}} feature branch
>Reporter: Stephan Ewen
>Priority: Minor
>  Labels: auto-deprioritized-major, flip-6, pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The YARN client accepts configuration keys as command line parameters.
> Currently these are send to the AppMaster and TaskManager as "dynamic 
> properties", encoded in a special way via environment variables.
> The mechanism is quite fragile. We should simplify it:
>   - The YARN client takes the local {{flink-conf.yaml}} as the base.
>   - It overwrite config entries with command line properties when preparing 
> the configuration to be shipped to YARN container processes (JM / TM)
>   - No additional handling neccessary



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] RollsBean commented on a change in pull request #16478: [FLINK-23228][docs-zh] Translate "Stateful Stream Processing" page into Chinese

2021-07-19 Thread GitBox


RollsBean commented on a change in pull request #16478:
URL: https://github.com/apache/flink/pull/16478#discussion_r672108028



##
File path: docs/content.zh/docs/concepts/stateful-stream-processing.md
##
@@ -24,342 +24,227 @@ under the License.
 
 # 有状态流处理
 
-## What is State?
+## 什么是状态?
 
-While many operations in a dataflow simply look at one individual *event at a
-time* (for example an event parser), some operations remember information
-across multiple events (for example window operators). These operations are
-called **stateful**.
+虽然数据流中的很多操作一次只着眼于一个单独的事件(例如事件解析器),但有些操作会记住多个事件的信息(例如窗口算子)。
+这些操作称为**有状态的**(stateful)。
 
-Some examples of stateful operations:
+有状态操作的一些示例:
 
-  - When an application searches for certain event patterns, the state will
-store the sequence of events encountered so far.
-  - When aggregating events per minute/hour/day, the state holds the pending
-aggregates.
-  - When training a machine learning model over a stream of data points, the
-state holds the current version of the model parameters.
-  - When historic data needs to be managed, the state allows efficient access
-to events that occurred in the past.
+  - 当应用程序搜索某些事件模式时,状态将存储到目前为止遇到的一系列事件。

Review comment:
   原文 “the state will store the sequence of events encountered so far”,宾语是 
“事件的顺序”




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] twalthr commented on a change in pull request #16515: [FLINK-23369][core][docs] Mechanism to document enums used in ConfigOptions

2021-07-19 Thread GitBox


twalthr commented on a change in pull request #16515:
URL: https://github.com/apache/flink/pull/16515#discussion_r672107230



##
File path: 
flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/EmbeddedRocksDBStateBackend.java
##
@@ -91,12 +91,6 @@
 public class EmbeddedRocksDBStateBackend extends 
AbstractManagedMemoryStateBackend
 implements ConfigurableStateBackend {
 
-/** The options to chose for the type of priority queue state. */
-public enum PriorityQueueStateType {

Review comment:
   yes, let's leave the enum there for now. The class is marked as 
`PublicEvolving` but this is actually not correct. The RocksDB State Backend 
has evolved to `Public` nowadays, so we need to be careful.

##
File path: 
flink-core/src/main/java/org/apache/flink/configuration/description/TextElement.java
##
@@ -53,6 +55,11 @@ public static TextElement text(String text) {
 return new TextElement(text, Collections.emptyList());
 }
 
+/** Wraps a list of {@link InlineElement}s into a single {@link 
TextElement}. */
+public static InlineElement wrap(List elements) {

Review comment:
   wouldn't be a vararg more handy?

##
File path: 
flink-core/src/main/java/org/apache/flink/configuration/description/TextElement.java
##
@@ -53,6 +55,11 @@ public static TextElement text(String text) {
 return new TextElement(text, Collections.emptyList());
 }
 
+/** Wraps a list of {@link InlineElement}s into a single {@link 
TextElement}. */
+public static InlineElement wrap(List elements) {

Review comment:
   btw we should add a test for it, it is not used anywhere in this PR 
right?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16273: [FLINK-18464][Runtime] when namespaceSerializer is different, forbid getPartitionedState from different namespace

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #16273:
URL: https://github.com/apache/flink/pull/16273#issuecomment-867425757


   
   ## CI report:
   
   * b5a9fc5356693337ef457f67773451117c04a0fd UNKNOWN
   * 72589d9d592f002fe55e220aa7b07b5cf694f7cf Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20675)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16445: [FLINK-23322][connectors/rabbitmq] Capture container log output of RM…

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #16445:
URL: https://github.com/apache/flink/pull/16445#issuecomment-877143095


   
   ## CI report:
   
   * 5a0cf10a15e27f43e4e6e89382767491438daa76 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20242)
 
   * 12b8fdd2b212abebbf15f4cbe76d6c9065ef3446 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20676)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16500: [FLINK-23395] Bump Okhttp to 3.14.9

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #16500:
URL: https://github.com/apache/flink/pull/16500#issuecomment-880553080


   
   ## CI report:
   
   * dfc149759a9cf43eac5c6df8f365a0a0268c20cb Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20483)
 
   * 7b80f6b36c39cad394f79c232fc952bc4d9b5c91 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16518: [FLINK-23373][task] Fully support object reuse in OperatorChain

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #16518:
URL: https://github.com/apache/flink/pull/16518#issuecomment-881579487


   
   ## CI report:
   
   * 23ab649d4f802ee86f19cd6e690e5d607c708d27 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20578)
 
   * faf9679c3a6cc5e077c448781b838c7e998dcaf0 UNKNOWN
   * 13435eaaa5b11fec001d7bcf2eeb5527ffe15dda UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16511: [FLINK-22861][table-planner] Fix return value deduction of TIMESTAMPADD function

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #16511:
URL: https://github.com/apache/flink/pull/16511#issuecomment-881154576


   
   ## CI report:
   
   * ab195c2256c333e96605d562ef7ef70966f5a5b8 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20538)
 
   * e017695f3a920b830c0dadb0dcfd7e575cf4e96b UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16525: [hotfix][table-api] Fix typo from Comparision to Comparison

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #16525:
URL: https://github.com/apache/flink/pull/16525#issuecomment-882003167


   
   ## CI report:
   
   * 8d6734e084e93499fcd3a6c6c04a0654263e02ef Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20663)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16521: [FLINK-22715][runtime] Implement streaming window assigner tablefunction operator.

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #16521:
URL: https://github.com/apache/flink/pull/16521#issuecomment-881917566


   
   ## CI report:
   
   *  Unknown: [CANCELED](TBD) 
   * 6b5e7e5dba93d6baf725ee333c7c26c55d348e43 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20678)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-14482) Bump up rocksdb version

2021-07-19 Thread Yu Li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17383180#comment-17383180
 ] 

Yu Li commented on FLINK-14482:
---

Sorry for the late response [~trohrmann], just noticed...

[~yunta] is still trying to locate the root cause of the performance regression
in 6.x, as mentioned in [the above
comment|https://issues.apache.org/jira/browse/FLINK-14482?focusedCommentId=17228997&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17228997],
or to counteract the regression by leveraging some new features of the higher
version. We are not 100% sure whether we can make it into the 1.14 release, but
we will try our best and update here if there is any progress.

> Bump up rocksdb version
> ---
>
> Key: FLINK-14482
> URL: https://issues.apache.org/jira/browse/FLINK-14482
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / State Backends
>Reporter: Yun Tang
>Priority: Minor
>  Labels: auto-deprioritized-major, auto-unassigned, 
> pull-request-available
> Fix For: 1.14.0
>
>
> This JIRA aims at rebasing frocksdb to [newer 
> version|https://github.com/facebook/rocksdb/releases] of official RocksDB.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (FLINK-23423) Translate the page of "Elasticsearch Connector" into Chinese

2021-07-19 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu reassigned FLINK-23423:
---

Assignee: huxixiang

> Translate the page of "Elasticsearch Connector" into Chinese
> 
>
> Key: FLINK-23423
> URL: https://issues.apache.org/jira/browse/FLINK-23423
> Project: Flink
>  Issue Type: Improvement
>  Components: chinese-translation, Documentation
>Affects Versions: 1.14.0
>Reporter: huxixiang
>Assignee: huxixiang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> Translate the internal page 
> "[https://ci.apache.org/projects/flink/flink-docs-master/docs/connectors/datastream/elasticsearch/]";
>  into Chinese.
> The doc located in 
> "flink/docs/content/zh/docs/connectors/datastream/elasticsearch.md"



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-22054) Using a shared watcher for ConfigMap watching

2021-07-19 Thread Yi Tang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Tang updated FLINK-22054:

Labels: k8s-ha pull-request-available  (was: auto-unassigned k8s-ha 
pull-request-available)

> Using a shared watcher for ConfigMap watching
> -
>
> Key: FLINK-22054
> URL: https://issues.apache.org/jira/browse/FLINK-22054
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Affects Versions: 1.12.2, 1.13.0
>Reporter: Yi Tang
>Assignee: Yi Tang
>Priority: Major
>  Labels: k8s-ha, pull-request-available
> Fix For: 1.14.0
>
>
> While using K8s HA service, the watching for ConfigMap is separate for each 
> job. As the number of running jobs increases, this consumes a large amount of 
> connections.
> Here we proposal to use a shard watcher for each FlinkKubeClient, and 
> dispatch events to different listeners. At the same time, we should keep the 
> same semantic with watching separately.
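
A minimal sketch of the proposed dispatch model, in plain Java (all names here are hypothetical and are not the actual FlinkKubeClient API): a single shared watch receives ConfigMap events and fans them out to per-job listeners registered under the ConfigMap name, so each job keeps the same per-ConfigMap semantics while only one connection is used.

{code:java}
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Illustrative only: one shared watch, many listeners keyed by ConfigMap name.
public class SharedConfigMapWatcherSketch {

    public interface ConfigMapEventListener {
        void onEvent(String configMapName, String event);
    }

    private final Map<String, List<ConfigMapEventListener>> listeners = new ConcurrentHashMap<>();

    // Each job registers a listener for the ConfigMap it cares about.
    public void register(String configMapName, ConfigMapEventListener listener) {
        listeners.computeIfAbsent(configMapName, k -> new CopyOnWriteArrayList<>()).add(listener);
    }

    // Called once by the single underlying watch; fans the event out to interested listeners.
    public void dispatch(String configMapName, String event) {
        for (ConfigMapEventListener listener :
                listeners.getOrDefault(configMapName, Collections.emptyList())) {
            listener.onEvent(configMapName, event);
        }
    }
}
{code}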



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] mans2singh closed pull request #16519: [FLINK-23412][table][docs] - Description improvement of dynamic table sink and full stack example

2021-07-19 Thread GitBox


mans2singh closed pull request #16519:
URL: https://github.com/apache/flink/pull/16519


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] mans2singh commented on a change in pull request #16519: [FLINK-23412][table][docs] - Description improvement of dynamic table sink and full stack example

2021-07-19 Thread GitBox


mans2singh commented on a change in pull request #16519:
URL: https://github.com/apache/flink/pull/16519#discussion_r672123456



##
File path: docs/content.zh/docs/dev/table/sourcesSinks.md
##
@@ -334,8 +334,8 @@ a reference implementation.
 In particular, it shows how to
 - create factories that parse and validate options,
 - implement table connectors,
-- implement and discover custom formats,
-- and use provided utilities such as data structure converters and the 
`FactoryUtil`.
+- implement and discover custom formats, and
+- use provided utilities such as data structure converters and the 
`FactoryUtil`.

Review comment:
   Sounds good @leonardBang.  I will close the PR.  Thanks.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] twalthr closed pull request #16497: [FLINK-23371] Disable progressive watermarks for bounded SourceFunctions

2021-07-19 Thread GitBox


twalthr closed pull request #16497:
URL: https://github.com/apache/flink/pull/16497


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Closed] (FLINK-23371) Disable AutoWatermarkInterval for bounded legacy sources

2021-07-19 Thread Timo Walther (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther closed FLINK-23371.

Fix Version/s: 1.14.0
   Resolution: Fixed

Fixed in 1.14.0:

commit 47627b28b3d15c20c3924cf42acfb835f330172b
[table-planner] Disable progressive watermarks for bounded SourceFunctions

commit 54e4d716a35aec357e7741aac2888ab0e72e5012
[streaming-java] Allow disabling progressive watermarks for SourceFunction

> Disable AutoWatermarkInterval for bounded legacy sources
> 
>
> Key: FLINK-23371
> URL: https://issues.apache.org/jira/browse/FLINK-23371
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / DataStream
>Reporter: Timo Walther
>Assignee: Timo Walther
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> {{LegacySourceTransformationTranslator}} has currently no special path for 
> bounded sources. However, the table planner might add bounded legacy sources 
> while AutoWatermarkInterval is still enabled. We should have the same logic 
> as in {{SourceTransformationTranslator}} for disabling intermediate 
> watermarks.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] RollsBean commented on a change in pull request #16478: [FLINK-23228][docs-zh] Translate "Stateful Stream Processing" page into Chinese

2021-07-19 Thread GitBox


RollsBean commented on a change in pull request #16478:
URL: https://github.com/apache/flink/pull/16478#discussion_r672128643



##
File path: docs/content.zh/docs/concepts/stateful-stream-processing.md
##
@@ -24,342 +24,227 @@ under the License.
 
 # 有状态流处理
 
-## What is State?
+## 什么是状态?
 
-While many operations in a dataflow simply look at one individual *event at a
-time* (for example an event parser), some operations remember information
-across multiple events (for example window operators). These operations are
-called **stateful**.
+虽然数据流中的很多操作一次只着眼于一个单独的事件(例如事件解析器),但有些操作会记住多个事件的信息(例如窗口算子)。
+这些操作称为**有状态的**(stateful)。
 
-Some examples of stateful operations:
+有状态操作的一些示例:
 
-  - When an application searches for certain event patterns, the state will
-store the sequence of events encountered so far.
-  - When aggregating events per minute/hour/day, the state holds the pending
-aggregates.
-  - When training a machine learning model over a stream of data points, the
-state holds the current version of the model parameters.
-  - When historic data needs to be managed, the state allows efficient access
-to events that occurred in the past.
+  - 当应用程序搜索某些事件模式时,状态将存储到目前为止遇到的一系列事件。
+  - 当每分钟/每小时/每天聚合事件时,状态会持有待处理的聚合。
+  - 当在数据点的流上训练一个机器学习模型时,状态会保存模型参数的当前版本。
+  - 当需要管理历史数据时,状态允许有效访问过去发生的事件。
 
-Flink needs to be aware of the state in order to make it fault tolerant using
+Flink 需要知道状态以便使用
 [checkpoints]({{< ref "docs/dev/datastream/fault-tolerance/checkpointing" >}})
-and [savepoints]({{< ref "docs/ops/state/savepoints" >}}).
+和 [savepoints]({{< ref "docs/ops/state/savepoints" >}}) 进行容错。
 
-Knowledge about the state also allows for rescaling Flink applications, meaning
-that Flink takes care of redistributing state across parallel instances.
+关于状态的知识也允许我们重新调节 Flink 应用程序,这意味着 Flink 负责跨并行实例重新分布状态。

Review comment:
   @hackergin , for “Flink takes care of redistributing state across parallel 
instances.”, wouldn't translating it as “这意味着 Flink 会跨多个并行实例重新分配状态” be closer to the 
original meaning? The previous sentence says that “state” is what allows rescaling Flink 
applications, so Flink needs to be able to redistribute that state across the parallel instances.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Assigned] (FLINK-23139) State ownership: track and discard private state (registry+changelog)

2021-07-19 Thread Roman Khachatryan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Khachatryan reassigned FLINK-23139:
-

Assignee: Roman Khachatryan

> State ownership: track and discard private state (registry+changelog)
> -
>
> Key: FLINK-23139
> URL: https://issues.apache.org/jira/browse/FLINK-23139
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / State Backends
>Reporter: Roman Khachatryan
>Assignee: Roman Khachatryan
>Priority: Major
> Fix For: 1.14.0
>
>
> TM should own changelog backend state to prevent re-uploading state on 
> checkpoint abortion (or a missing confirmation). A simpler solution that only owns 
> aborted state is less maintainable in the long run.
> For that, state should be tracked on the TM and discarded (on 
> subsumption + materialization; on shutdown). 
> See [state ownership design 
> doc|https://docs.google.com/document/d/1NJJQ30P27BmUvD7oa4FChvkYxMEgjRPTVdO1dHLl_9I/edit?usp=sharing],
>  in particular [Tracking private 
> state|https://docs.google.com/document/d/1NJJQ30P27BmUvD7oa4FChvkYxMEgjRPTVdO1dHLl_9I/edit#heading=h.9dxopqajsy7].
>  
> This ticket is about creating TaskStateRegistry and using it in 
> ChangelogStateBackend (for non-materialized part only; for materialized see 
> FLINK-23344).
>   
> Externalized checkpoints and savepoints should be supported (or please create 
> a separate ticket).
>  
> Retained checkpoints are covered by a separate ticket: FLINK-23251
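
To make the tracking idea above concrete, here is a purely illustrative sketch; the class, method names and signatures are hypothetical and are not the actual Flink API from the design doc:

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only: track state the TM uploaded itself and discard it later.
public class TaskStateRegistrySketch {

    /** Minimal stand-in for a piece of private (non-materialized) state. */
    public interface PrivateState {
        void discard() throws Exception;
    }

    private final Map<String, PrivateState> tracked = new ConcurrentHashMap<>();

    /** Called by the changelog backend when it uploads non-materialized state. */
    public void track(String stateId, PrivateState state) {
        tracked.put(stateId, state);
    }

    /** Called on subsumption/materialization or on shutdown. */
    public void discard(String stateId) throws Exception {
        PrivateState state = tracked.remove(stateId);
        if (state != null) {
            state.discard();
        }
    }
}
{code}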



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23425) The impact of cpu cores on test results for StreamingJobGraphGeneratorTest#testSlotSharingResourceConfigurationWithDefaultSlotSharingGroup

2021-07-19 Thread tartarus (Jira)
tartarus created FLINK-23425:


 Summary: The impact of cpu cores on test results for 
StreamingJobGraphGeneratorTest#testSlotSharingResourceConfigurationWithDefaultSlotSharingGroup
 
 Key: FLINK-23425
 URL: https://issues.apache.org/jira/browse/FLINK-23425
 Project: Flink
  Issue Type: Bug
  Components: Tests
Reporter: tartarus


By default, LocalStreamEnvironment uses the number of CPU 
cores (Runtime.getRuntime().availableProcessors()) as the default parallelism.

In our company, we use a 1-core Docker container to run the test.

Then 
StreamingJobGraphGeneratorTest#testSlotSharingResourceConfigurationWithDefaultSlotSharingGroup
 fails,

because the parallelism of the map operator is 1, so the number of vertices is 1 instead of 
the expected 2.
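
One way to make such a test independent of the machine it runs on (shown only as a sketch, not necessarily how this test will be fixed) is to pin the parallelism explicitly instead of relying on the core-count default:

{code:java}
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class FixedParallelismSketch {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Pin the parallelism instead of relying on the local default
        // (Runtime.getRuntime().availableProcessors()), so the generated JobGraph
        // has the same number of vertices on a 1-core container and on a laptop.
        env.setParallelism(2);
    }
}
{code}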



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #15304: [FLINK-20731][connector] Introduce Pulsar Source

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #15304:
URL: https://github.com/apache/flink/pull/15304#issuecomment-803430086


   
   ## CI report:
   
   * ac421c3904c5438bd236c76cf1a3dd7d81822782 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20679)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16496: [FLINK-23363][table-code-splitter] Java code splitter now supports functions with return statements

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #16496:
URL: https://github.com/apache/flink/pull/16496#issuecomment-880472433


   
   ## CI report:
   
   * 2a92cbea6dd6b606b41bc7b157f3b97d5ab74fd9 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20665)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-22893) ResumeCheckpointManuallyITCase hangs on azure

2021-07-19 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-22893:
-
Affects Version/s: 1.11.1

> ResumeCheckpointManuallyITCase hangs on azure
> -
>
> Key: FLINK-22893
> URL: https://issues.apache.org/jira/browse/FLINK-22893
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.11.1, 1.14.0
>Reporter: Dawid Wysakowicz
>Assignee: Anton Kalashnikov
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Fix For: 1.14.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=18700&view=logs&j=2c3cbe13-dee0-5837-cf47-3053da9a8a78&t=2c7d57b9-7341-5a87-c9af-2cf7cc1a37dc&l=4382



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-22893) ResumeCheckpointManuallyITCase hangs on azure

2021-07-19 Thread Chesnay Schepler (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17383202#comment-17383202
 ] 

Chesnay Schepler commented on FLINK-22893:
--

A similar issue was reporter half a year ago on the mailing lists for 1.11.1: 
http://apache-flink.147419.n8.nabble.com/flink-yarn-ha-tt10715.html#none

> ResumeCheckpointManuallyITCase hangs on azure
> -
>
> Key: FLINK-22893
> URL: https://issues.apache.org/jira/browse/FLINK-22893
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.11.1, 1.14.0
>Reporter: Dawid Wysakowicz
>Assignee: Anton Kalashnikov
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Fix For: 1.14.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=18700&view=logs&j=2c3cbe13-dee0-5837-cf47-3053da9a8a78&t=2c7d57b9-7341-5a87-c9af-2cf7cc1a37dc&l=4382



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #16500: [FLINK-23395] Bump Okhttp to 3.14.9

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #16500:
URL: https://github.com/apache/flink/pull/16500#issuecomment-880553080


   
   ## CI report:
   
   * dfc149759a9cf43eac5c6df8f365a0a0268c20cb Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20483)
 
   * 7b80f6b36c39cad394f79c232fc952bc4d9b5c91 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20680)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16511: [FLINK-22861][table-planner] Fix return value deduction of TIMESTAMPADD function

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #16511:
URL: https://github.com/apache/flink/pull/16511#issuecomment-881154576


   
   ## CI report:
   
   * ab195c2256c333e96605d562ef7ef70966f5a5b8 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20538)
 
   * e017695f3a920b830c0dadb0dcfd7e575cf4e96b Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20681)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16518: [FLINK-23373][task] Fully support object reuse in OperatorChain

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #16518:
URL: https://github.com/apache/flink/pull/16518#issuecomment-881579487


   
   ## CI report:
   
   * 23ab649d4f802ee86f19cd6e690e5d607c708d27 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20578)
 
   * faf9679c3a6cc5e077c448781b838c7e998dcaf0 UNKNOWN
   * 13435eaaa5b11fec001d7bcf2eeb5527ffe15dda Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20677)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink-benchmarks] Jiayi-Liao opened a new pull request #25: [FLINK-23208] Add a benchmark for processing timers

2021-07-19 Thread GitBox


Jiayi-Liao opened a new pull request #25:
URL: https://github.com/apache/flink-benchmarks/pull/25


   This pull request adds a new benchmark for processing timers. 
   
   
   Output example: 
   
   ```
   Result 
"org.apache.flink.benchmark.ProcessingTimerBenchmark.fireProcessingTimers":
 2.762 ±(99.9%) 0.038 ops/ms [Average]
 (min, avg, max) = (2.636, 2.762, 2.903), stdev = 0.057
 CI (99.9%): [2.724, 2.800] (assumes normal distribution)
   
   
   # Run complete. Total time: 00:01:37
   
   Benchmark   Mode  Cnt  Score   Error   
Units
   ProcessingTimerBenchmark.fireProcessingTimers  thrpt   30  2.762 ± 0.038  
ops/ms
   
   Benchmark result is saved to jmh-result.csv
   ```
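   
   For context, scores like the ops/ms numbers above come out of a JMH throughput benchmark. The skeleton below is only an illustrative stand-in (it drains a plain priority queue instead of Flink's timer service, and the 4_000 constant mirrors the timer count mentioned in the discussion); it is not the actual ProcessingTimerBenchmark:
   
   ```java
   import java.util.PriorityQueue;
   import java.util.concurrent.TimeUnit;
   
   import org.openjdk.jmh.annotations.Benchmark;
   import org.openjdk.jmh.annotations.BenchmarkMode;
   import org.openjdk.jmh.annotations.Level;
   import org.openjdk.jmh.annotations.Mode;
   import org.openjdk.jmh.annotations.OutputTimeUnit;
   import org.openjdk.jmh.annotations.Scope;
   import org.openjdk.jmh.annotations.Setup;
   import org.openjdk.jmh.annotations.State;
   
   @State(Scope.Thread)
   @BenchmarkMode(Mode.Throughput)
   @OutputTimeUnit(TimeUnit.MILLISECONDS)
   public class TimerFiringSketch {
   
       private static final int TIMERS_PER_INVOCATION = 4_000;
   
       private PriorityQueue<Long> timers;
   
       @Setup(Level.Invocation)
       public void registerTimers() {
           // Stand-in for registering processing-time timers before each measurement.
           timers = new PriorityQueue<>();
           for (long t = 0; t < TIMERS_PER_INVOCATION; t++) {
               timers.add(t);
           }
       }
   
       @Benchmark
       public long fireTimers() {
           // Stand-in for firing all registered timers; JMH reports invocations per ms.
           long fired = 0;
           while (!timers.isEmpty()) {
               timers.poll();
               fired++;
           }
           return fired;
       }
   }
   ```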
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Comment Edited] (FLINK-23308) Performance regression on 06.07

2021-07-19 Thread Dawid Wysakowicz (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-23308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381500#comment-17381500
 ] 

Dawid Wysakowicz edited comment on FLINK-23308 at 7/19/21, 9:35 AM:


runs: 
* 6f21603cc2386ba08cbf6f7e33634d96f8bc4815(flink) on 
6090943d792333f82609db362215c678468d4b67(benchmarks) -> both pre regression -> 
*33509.096369* : 
http://codespeed.dak8s.net:8080/job/flink-benchmark-request/305/artifact/jmh-result.csv/*view*/
* 764f257de69f7427f1bf6dd7c877dd2171601fde(flink) on 
6090943d792333f82609db362215c678468d4b67(benchmarks) -> flink in the middle of 
regression, benchmarks pre regression -> *31305.252790*: 
http://codespeed.dak8s.net:8080/job/flink-benchmark-request/306/artifact/jmh-result.csv/*view*/
* c46b20343d0e7dcd7ce2ac5fc6e71d02b07df1be (flink) with changes from 
63183d8a2a1267f31fdfafebb628bf61409a094e reverted on 
6090943d792333f82609db362215c678468d4b67(benchmarks): *30,32668* 
http://codespeed.dak8s.net:8080/job/flink-benchmark-request/309/artifact/jmh-result.csv/*view*/


was (Author: dawidwys):
runs: 
* 6f21603cc2386ba08cbf6f7e33634d96f8bc4815(flink) on 
6090943d792333f82609db362215c678468d4b67(benchmarks) -> both pre regression -> 
*33509.096369* : 
http://codespeed.dak8s.net:8080/job/flink-benchmark-request/305/artifact/jmh-result.csv/*view*/
* 764f257de69f7427f1bf6dd7c877dd2171601fde(flink) on 
6090943d792333f82609db362215c678468d4b67(benchmarks) -> flink in the middle of 
regression, benchmarks pre regression -> *31305.252790*: 
http://codespeed.dak8s.net:8080/job/flink-benchmark-request/306/artifact/jmh-result.csv/*view*/

> Performance regression on 06.07
> ---
>
> Key: FLINK-23308
> URL: https://issues.apache.org/jira/browse/FLINK-23308
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.14.0
>Reporter: Dawid Wysakowicz
>Assignee: Dawid Wysakowicz
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> http://codespeed.dak8s.net:8000/timeline/#/?exe=1,3&ben=twoInputMapSink&env=2&revs=200&equid=off&quarts=on&extr=on
> http://codespeed.dak8s.net:8000/timeline/?ben=readFileSplit&env=2



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-23208) Late processing timers need to wait 1ms at least to be fired

2021-07-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-23208:
---
Labels: critical pull-request-available  (was: critical)

> Late processing timers need to wait 1ms at least to be fired
> 
>
> Key: FLINK-23208
> URL: https://issues.apache.org/jira/browse/FLINK-23208
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream, Runtime / Task
>Affects Versions: 1.11.0, 1.11.3, 1.13.0, 1.14.0, 1.12.4
>Reporter: Jiayi Liao
>Assignee: Jiayi Liao
>Priority: Critical
>  Labels: critical, pull-request-available
> Attachments: screenshot-1.png
>
>
> The problem is from the codes below:
> {code:java}
> public static long getProcessingTimeDelay(long processingTimestamp, long 
> currentTimestamp) {
>   // delay the firing of the timer by 1 ms to align the semantics with 
> watermark. A watermark
>   // T says we won't see elements in the future with a timestamp smaller 
> or equal to T.
>   // With processing time, we therefore need to delay firing the timer by 
> one ms.
>   return Math.max(processingTimestamp - currentTimestamp, 0) + 1;
> }
> {code}
> Assuming a Flink job creates 1 timer per millisecond, and is able to 
> consume 1 timer/ms. Here is what will happen: 
> * Timestamp1 (1st ms): timer1 is registered and will be triggered on 
> Timestamp2. 
> * Timestamp2(2nd ms): timer2 is registered and timer1 is triggered
> * Timestamp3(3rd ms): timer3 is registered and timer1 is consumed, after 
> this, {{InternalTimerServiceImpl}} registers next timer, which is timer2, and 
> timer2 will be triggered on Timestamp4(wait 1ms at least)
> * Timestamp4(4th ms): timer4 is registered and timer2 is triggered
> * Timestamp5(5th ms): timer5 is registered and timer2 is consumed, after 
> this, {{InternalTimerServiceImpl}} registers next timer, which is timer3, and 
> timer3 will be triggered on Timestamp6(wait 1ms at least)
> As we can see here, the Flink job is capable of consuming 1 timer/ms, but 
> it actually only consumes 0.5 timers/ms. Another problem is that we 
> cannot observe the delay from the lag metrics of the source (Kafka). Instead, 
> what we can tell is that the moment of output is much later than expected. 
> I've added a metrics in our inner version, we can see the lag of the timer 
> triggering keeps increasing: 
>  !screenshot-1.png! 
> *In other words, we should never make a late processing timer wait 1 ms. I 
> think a simple change would be as below:*
> {code:java}
> return Math.max(processingTimestamp - currentTimestamp, -1) + 1;
> {code}
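
To make the difference concrete, here is a small self-contained sketch (illustrative only) comparing the current formula with the proposed one:

{code:java}
public class ProcessingTimeDelaySketch {

    // Current behaviour: even a late timer waits at least 1 ms.
    static long currentDelay(long processingTimestamp, long currentTimestamp) {
        return Math.max(processingTimestamp - currentTimestamp, 0) + 1;
    }

    // Proposed behaviour: late timers get delay 0; future timers are unchanged.
    static long proposedDelay(long processingTimestamp, long currentTimestamp) {
        return Math.max(processingTimestamp - currentTimestamp, -1) + 1;
    }

    public static void main(String[] args) {
        long now = 1_000;
        System.out.println(currentDelay(990, now));    // 1 -> a late timer still waits 1 ms
        System.out.println(proposedDelay(990, now));   // 0 -> a late timer fires immediately
        System.out.println(currentDelay(1_005, now));  // 6 -> future timers are unaffected
        System.out.println(proposedDelay(1_005, now)); // 6
    }
}
{code}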



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (FLINK-23416) Don't log NoResourceAvailableException stacktrace on Execution state transition

2021-07-19 Thread Till Rohrmann (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann reassigned FLINK-23416:
-

Assignee: Roman Khachatryan

> Don't log NoResourceAvailableException stacktrace on Execution state 
> transition
> ---
>
> Key: FLINK-23416
> URL: https://issues.apache.org/jira/browse/FLINK-23416
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Affects Versions: 1.14.0, 1.13.1
>Reporter: Roman Khachatryan
>Assignee: Roman Khachatryan
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> Logging NoResourceAvailableException stacktrace on Execution state transition 
> results in [1G+ 
> artifacts|https://artprodsu6weu.artifacts.visualstudio.com/A2d3c0ac8-fecf-45be-8407-6d87302181a9/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_apis/artifact/cGlwZWxpbmVhcnRpZmFjdDovL2FwYWNoZS1mbGluay9wcm9qZWN0SWQvOTg0NjM0OTYtMWFmMi00NjIwLThlYWItYTJlY2MxYTJlNmZlL2J1aWxkSWQvMjA0NTQvYXJ0aWZhY3ROYW1lL2xvZ3MtY3Jvbl9oYWRvb3AyNDEtdGVzdF9jcm9uX2hhZG9vcDI0MV9jb25uZWN0b3JzLTE2MjYyOTg3NjY1/content?format=zip]
>  (for [this 
> failure|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20454&view=logs&j=ba53eb01-1462-56a3-8e98-0dd97fbcaab5&t=bfbc6239-57a0-5db0-63f3-41551b4f7d51&l=14595])
> while not providing any additional info:
> {code:java}
> 01:25:15,527 [flink-akka.actor.default-dispatcher-10] INFO  
> org.apache.flink.runtime.executiongraph.ExecutionGraph   [] - Source: 
> Custom Source -> Map -> Sink: Unnamed (1/4) 
> (2a191a61b3a7d7fd416b6d181948b8bd) switched from SCHEDULED to FAILED on 
> [unassigned resource].
> java.util.concurrent.CompletionException: 
> org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: 
> Could not acquire the minimum required resources.
> at 
> java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
>  ~[?:1.8.0_282]
> at 
> java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
>  ~[?:1.8.0_282]
> at 
> java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:607) 
> ~[?:1.8.0_282]
> at 
> java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591)
>  ~[?:1.8.0_282]
> at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
>  ~[?:1.8.0_282]
> at 
> java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
>  ~[?:1.8.0_282]
> at 
> org.apache.flink.runtime.jobmaster.slotpool.DeclarativeSlotPoolBridge$PendingRequest.failRequest(DeclarativeSlotPoolBridge.java:545)
>  ~[flink-runtime_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
> at 
> org.apache.flink.runtime.jobmaster.slotpool.DeclarativeSlotPoolBridge.cancelPendingRequests(DeclarativeSlotPoolBridge.java:127)
>  ~[flink-runtime_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
> at 
> org.apache.flink.runtime.jobmaster.slotpool.DeclarativeSlotPoolBridge.failPendingRequests(DeclarativeSlotPoolBridge.java:355)
>  ~[flink-runtime_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
> at 
> org.apache.flink.runtime.jobmaster.slotpool.DeclarativeSlotPoolBridge.notifyNotEnoughResourcesAvailable(DeclarativeSlotPoolBridge.java:344)
>  ~[flink-runtime_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
> at 
> org.apache.flink.runtime.jobmaster.JobMaster.notifyNotEnoughResourcesAvailable(JobMaster.java:816)
>  ~[flink-runtime_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
> at sun.reflect.GeneratedMethodAccessor77.invoke(Unknown Source) ~[?:?]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_282]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_282]
> at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:301)
>  ~[flink-rpc-akka_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
> at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:212)
>  ~[flink-rpc-akka_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
> at 
> org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:77)
>  ~[flink-rpc-akka_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
> at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:158)
>  ~[flink-rpc-akka_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
> at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26) 
> [akka-actor_2.11-2.5.21.jar:2.5.21]
> at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21) 
> [akka-actor_2.11-2.5.21.jar:2.5.21]
> at scala.PartialFunction$class.applyOrElse(PartialFun

[GitHub] [flink-benchmarks] Jiayi-Liao commented on pull request #25: [FLINK-23208] Add a benchmark for processing timers

2021-07-19 Thread GitBox


Jiayi-Liao commented on pull request #25:
URL: https://github.com/apache/flink-benchmarks/pull/25#issuecomment-882404239


   @pnowojski  1_000_000 timers are too many for the current processing timer 
mechanism. The benchmark consumes 4_000 timers with four subtasks. Please take 
a look when you have time, thanks :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] tillrohrmann commented on a change in pull request #16523: [FLINK-23416][runtime] Don't log NoResourceAvailableException stacktrace

2021-07-19 Thread GitBox


tillrohrmann commented on a change in pull request #16523:
URL: https://github.com/apache/flink/pull/16523#discussion_r672149405



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java
##
@@ -1435,8 +1437,18 @@ private boolean transitionState(
 getAttemptId(),
 currentState,
 targetState);
-} else {
-if (LOG.isInfoEnabled()) {
+} else if (LOG.isInfoEnabled()) {
+Optional noResourceException =
+findThrowable(error, 
NoResourceAvailableException.class);
+if (noResourceException.isPresent()) {
+LOG.info(
+"{} ({}) switched from {} to {}: {}",
+getVertex().getTaskNameWithSubtaskIndex(),
+getAttemptId(),
+currentState,
+targetState,
+noResourceException.get().getMessage());

Review comment:
   I think there are `NoResourceAvailableException`s that contain a cause 
that might be helpful to see. Look for `NoResourceAvailableException(String, 
Throwable)` calls.
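   
   If the cause chain should stay visible without logging the full stack trace, one option (shown only as a hypothetical helper, not an existing Flink utility) is to log a one-line summary of the exception and its causes:
   
   ```java
   public final class ThrowableSummaries {
   
       /** Builds a single-line summary of an exception and its causes, without stack traces. */
       public static String summarize(Throwable t) {
           StringBuilder sb = new StringBuilder();
           for (Throwable current = t; current != null; current = current.getCause()) {
               if (sb.length() > 0) {
                   sb.append(" -> caused by: ");
               }
               sb.append(current.getClass().getSimpleName())
                       .append(": ")
                       .append(current.getMessage());
           }
           return sb.toString();
       }
   
       private ThrowableSummaries() {}
   }
   ```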




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-23418) 'Run kubernetes application HA test' fail on Azure

2021-07-19 Thread Till Rohrmann (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-23418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17383214#comment-17383214
 ] 

Till Rohrmann commented on FLINK-23418:
---

[~fly_in_gis] could you take a look at this instability?

> 'Run kubernetes application HA test' fail on Azure
> --
>
> Key: FLINK-23418
> URL: https://issues.apache.org/jira/browse/FLINK-23418
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / Kubernetes, Runtime / Coordination
>Affects Versions: 1.13.1
>Reporter: Dawid Wysakowicz
>Priority: Major
>  Labels: test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20589&view=logs&j=68a897ab-3047-5660-245a-cce8f83859f6&t=16ca2cca-2f63-5cce-12d2-d519b930a729&l=3747
> {code}
> Jul 16 23:58:49   at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:208)
>  ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> Jul 16 23:58:49   at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:158)
>  ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> Jul 16 23:58:49   at 
> akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26) 
> [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> Jul 16 23:58:49   at 
> akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21) 
> [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> Jul 16 23:58:49   at 
> scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123) 
> [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> Jul 16 23:58:49   at 
> akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21) 
> [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> Jul 16 23:58:49   at 
> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170) 
> [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> Jul 16 23:58:49   at 
> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) 
> [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> Jul 16 23:58:49   at 
> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) 
> [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> Jul 16 23:58:49   at 
> akka.actor.Actor$class.aroundReceive(Actor.scala:517) 
> [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> Jul 16 23:58:49   at 
> akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225) 
> [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> Jul 16 23:58:49   at 
> akka.actor.ActorCell.receiveMessage(ActorCell.scala:592) 
> [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> Jul 16 23:58:49   at akka.actor.ActorCell.invoke(ActorCell.scala:561) 
> [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> Jul 16 23:58:49   at 
> akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258) 
> [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> Jul 16 23:58:49   at akka.dispatch.Mailbox.run(Mailbox.scala:225) 
> [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> Jul 16 23:58:49   at akka.dispatch.Mailbox.exec(Mailbox.scala:235) 
> [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> Jul 16 23:58:49   at 
> akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) 
> [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> Jul 16 23:58:49   at 
> akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) 
> [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> Jul 16 23:58:49   at 
> akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) 
> [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> Jul 16 23:58:49   at 
> akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>  [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> Jul 16 23:58:49 Caused by: akka.pattern.AskTimeoutException: Ask timed out on 
> [Actor[akka.tcp://flink@172.17.0.3:6123/user/rpc/jobmanager_2#2101744934]] 
> after [1 ms]. Message of type 
> [org.apache.flink.runtime.rpc.messages.RemoteFencedMessage]. A typical reason 
> for `AskTimeoutException` is that the recipient actor didn't send a reply.
> Jul 16 23:58:49   at 
> akka.pattern.PromiseActorRef$$anonfun$2.apply(AskSupport.scala:635) 
> ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> Jul 16 23:58:49   at 
> akka.pattern.PromiseActorRef$$anonfun$2.apply(AskSupport.scala:635) 
> ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> Jul 16 23:58:49   at 
> akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:648) 
> ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> Jul 16 23:58:49   at 
> akka.actor.Scheduler$$anon$4.run(Scheduler.scala:205) 
> ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> Jul 16 23:58:49   at 
> scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:60

[GitHub] [flink] yiduwangkai opened a new pull request #16532: [FLINK-13400]Remove Hive and Hadoop dependencies from SQL Client

2021-07-19 Thread GitBox


yiduwangkai opened a new pull request #16532:
URL: https://github.com/apache/flink/pull/16532


   What is the purpose of the change
   This pull request removes all dependencies on Hive/Hadoop from the SQL Client and 
replaces the catalog-related tests with a testing catalog.
   
   Brief change log
   - Add a new module flink-end-to-end-tests-hive in flink-end-to-end-tests
   - Remove all dependencies on Hive/Hadoop from flink-sql-client 
and move them to the new module flink-end-to-end-tests-hive
   - Support catalog-related tests
   - Add SQLClientHiveITCase.java/TestHiveCatalogFactory.java in the new module 
flink-end-to-end-tests-hive
   - Modify SqlParserHelper.java/ExecutionContextTest.java/DependencyTest.java
   
   Verifying this change: no, because some of these modifications are about 
testing
   
   Does this pull request potentially affect one of the following parts:
   Dependencies (does it add or upgrade a dependency): (no)
   The public API, i.e., is any changed class annotated with @public(Evolving): 
(no)
   The serializers: (no)
   The runtime per-record code paths (performance sensitive): (no)
   Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
   The S3 file system connector: (no)
   Documentation
   Does this pull request introduce a new feature? (no)
   If yes, how is the feature documented? (not applicable)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] yiduwangkai commented on pull request #16532: [FLINK-13400]Remove Hive and Hadoop dependencies from SQL Client

2021-07-19 Thread GitBox


yiduwangkai commented on pull request #16532:
URL: https://github.com/apache/flink/pull/16532#issuecomment-882414681


   @fsk119  @lirui-apache pls help me review, thx


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #16532: [FLINK-13400]Remove Hive and Hadoop dependencies from SQL Client

2021-07-19 Thread GitBox


flinkbot commented on pull request #16532:
URL: https://github.com/apache/flink/pull/16532#issuecomment-882415688


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 4745129243f5398aa5dc0e6ec347e8c1fba93fdb (Mon Jul 19 
09:58:54 UTC 2021)
   
   **Warnings:**
* **3 pom.xml files were touched**: Check for build and licensing issues.
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] zuoniduimian commented on pull request #16490: [FLINK-23241][docs-zh] Translate the page of "Working with State " in…

2021-07-19 Thread GitBox


zuoniduimian commented on pull request #16490:
URL: https://github.com/apache/flink/pull/16490#issuecomment-882416867


   @edmondsky @95chenjz  @RollsBean   I have made some changes according to 
your suggestions, please help me review these changes. Thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Assigned] (FLINK-22893) ResumeCheckpointManuallyITCase hangs on azure

2021-07-19 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler reassigned FLINK-22893:


Assignee: Chesnay Schepler  (was: Anton Kalashnikov)

> ResumeCheckpointManuallyITCase hangs on azure
> -
>
> Key: FLINK-22893
> URL: https://issues.apache.org/jira/browse/FLINK-22893
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.11.1, 1.14.0
>Reporter: Dawid Wysakowicz
>Assignee: Chesnay Schepler
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Fix For: 1.14.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=18700&view=logs&j=2c3cbe13-dee0-5837-cf47-3053da9a8a78&t=2c7d57b9-7341-5a87-c9af-2cf7cc1a37dc&l=4382



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-13400) Remove Hive and Hadoop dependencies from SQL Client

2021-07-19 Thread frank wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17382162#comment-17382162
 ] 

frank wang edited comment on FLINK-13400 at 7/19/21, 10:05 AM:
---

[~lirui] I have submitted a PR for this issue: 
[FLINK-13400|https://github.com/apache/flink/pull/16532].
I removed the Hive/Hadoop related packages and some code in sql-client, 
created a new module flink-end-to-end-tests-hive, and added catalog test support for 
sql-client and the hive-connector. I did not implement e2e tests for sql-client 
and Hive, because there is no Hive test container like the one used for [SQLClientHBaseITCase 
|https://issues.apache.org/jira/browse/FLINK-21519]


was (Author: frank wang):
[~lirui] I have submitted a PR for this issue: 
[FLINK-13400|https://github.com/apache/flink/pull/16517].
I removed the Hive/Hadoop related packages and some code in sql-client, 
created a new module flink-end-to-end-tests-hive, and added catalog test support for 
sql-client and the hive-connector. I did not implement e2e tests for sql-client 
and Hive, because there is no Hive test container like the one used for [SQLClientHBaseITCase 
|https://issues.apache.org/jira/browse/FLINK-21519]

> Remove Hive and Hadoop dependencies from SQL Client
> ---
>
> Key: FLINK-13400
> URL: https://issues.apache.org/jira/browse/FLINK-13400
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Client
>Reporter: Timo Walther
>Assignee: frank wang
>Priority: Major
>  Labels: auto-deprioritized-critical, pull-request-available, 
> stale-major
> Fix For: 1.14.0
>
>
> 340/550 lines in the SQL Client {{pom.xml}} are just around Hive and Hadoop 
> dependencies.  Hive has nothing to do with the SQL Client and it will be hard 
> to maintain the long list of  exclusion there. Some dependencies are even in 
> a {{provided}} scope and not {{test}} scope.
> We should remove all dependencies on Hive/Hadoop and replace catalog-related 
> tests by a testing catalog. Similar to how we tests source/sinks.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-22893) ResumeCheckpointManuallyITCase hangs on azure

2021-07-19 Thread Chesnay Schepler (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17383226#comment-17383226
 ] 

Chesnay Schepler commented on FLINK-22893:
--

I have assigned myself to this issue because we believe this to be an issue in 
our Zookeeper code.

> ResumeCheckpointManuallyITCase hangs on azure
> -
>
> Key: FLINK-22893
> URL: https://issues.apache.org/jira/browse/FLINK-22893
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.11.1, 1.14.0
>Reporter: Dawid Wysakowicz
>Assignee: Chesnay Schepler
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Fix For: 1.14.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=18700&view=logs&j=2c3cbe13-dee0-5837-cf47-3053da9a8a78&t=2c7d57b9-7341-5a87-c9af-2cf7cc1a37dc&l=4382



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] tillrohrmann commented on pull request #16514: [FLINK-23406][tests] Harden ClusterUncaughtExceptionHandlerITCase.testExitDueToUncaughtException

2021-07-19 Thread GitBox


tillrohrmann commented on pull request #16514:
URL: https://github.com/apache/flink/pull/16514#issuecomment-882425133


   Thanks for the review @zentol. Merging this PR now.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] tillrohrmann closed pull request #16514: [FLINK-23406][tests] Harden ClusterUncaughtExceptionHandlerITCase.testExitDueToUncaughtException

2021-07-19 Thread GitBox


tillrohrmann closed pull request #16514:
URL: https://github.com/apache/flink/pull/16514


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-23426) Support changelog processing in batch mode

2021-07-19 Thread Timo Walther (Jira)
Timo Walther created FLINK-23426:


 Summary: Support changelog processing in batch mode
 Key: FLINK-23426
 URL: https://issues.apache.org/jira/browse/FLINK-23426
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Reporter: Timo Walther


The DataStream API can execute arbitrary DataStream programs when running in 
batch mode. However, this is not the case for the Table API batch mode. E.g. a 
source with non-insert only changes is not supported and updates/deletes cannot 
be emitted.

In theory, we could make this work by running the "stream mode" of the planner 
(CDC transformations) on top of the "batch mode" of DataStream API (specialized 
state backend, sorted inputs). It is up for discussion if and how we expose 
such functionality.

If we don't allow enabling incremental updates, we can also add a special batch 
operator that materializes the incoming changes for a batch pipeline. However, 
it would require "complete" CDC logs (i.e. no missing UPDATE_AFTER).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-23406) ClusterUncaughtExceptionHandlerITCase.testExitDueToUncaughtException fails on azure

2021-07-19 Thread Till Rohrmann (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann closed FLINK-23406.
-
Resolution: Fixed

Fixed via c034b7ecfab008123e9730d547b460f42ec625a8

> ClusterUncaughtExceptionHandlerITCase.testExitDueToUncaughtException fails on 
> azure
> ---
>
> Key: FLINK-23406
> URL: https://issues.apache.org/jira/browse/FLINK-23406
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.14.0
>Reporter: Xintong Song
>Assignee: Till Rohrmann
>Priority: Major
>  Labels: pull-request-available, test-stability
> Fix For: 1.14.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20523&view=logs&j=d8d26c26-7ec2-5ed2-772e-7a1a1eb8317c&t=be5fb08e-1ad7-563c-4f1a-a97ad4ce4865&l=7837
> {code}
> Jul 15 21:31:22 [ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, 
> Time elapsed: 4.401 s <<< FAILURE! - in 
> org.apache.flink.runtime.entrypoint.ClusterUncaughtExceptionHandlerITCase
> Jul 15 21:31:22 [ERROR] 
> testExitDueToUncaughtException(org.apache.flink.runtime.entrypoint.ClusterUncaughtExceptionHandlerITCase)
>   Time elapsed: 4.338 s  <<< FAILURE!
> Jul 15 21:31:22 java.lang.AssertionError: 
> Jul 15 21:31:22 
> Jul 15 21:31:22 Expected: is <239>
> Jul 15 21:31:22  but: was <1>
> Jul 15 21:31:22   at 
> org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
> Jul 15 21:31:22   at org.junit.Assert.assertThat(Assert.java:964)
> Jul 15 21:31:22   at org.junit.Assert.assertThat(Assert.java:930)
> Jul 15 21:31:22   at 
> org.apache.flink.runtime.entrypoint.ClusterUncaughtExceptionHandlerITCase.testExitDueToUncaughtException(ClusterUncaughtExceptionHandlerITCase.java:64)
> Jul 15 21:31:22   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
> Jul 15 21:31:22   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> Jul 15 21:31:22   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> Jul 15 21:31:22   at java.lang.reflect.Method.invoke(Method.java:498)
> Jul 15 21:31:22   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> Jul 15 21:31:22   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> Jul 15 21:31:22   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> Jul 15 21:31:22   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> Jul 15 21:31:22   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> Jul 15 21:31:22   at 
> org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
> Jul 15 21:31:22   at 
> org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
> Jul 15 21:31:22   at 
> org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> Jul 15 21:31:22   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
> Jul 15 21:31:22   at 
> org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
> Jul 15 21:31:22   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
> Jul 15 21:31:22   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
> Jul 15 21:31:22   at 
> org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
> Jul 15 21:31:22   at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
> Jul 15 21:31:22   at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
> Jul 15 21:31:22   at 
> org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
> Jul 15 21:31:22   at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
> Jul 15 21:31:22   at 
> org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> Jul 15 21:31:22   at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:413)
> Jul 15 21:31:22   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
> Jul 15 21:31:22   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
> Jul 15 21:31:22   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
> Jul 15 21:31:22   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
> Jul 15 21:31:22   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
> Jul 15 21:31:22   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesI

[jira] [Commented] (FLINK-23426) Support changelog processing in batch mode

2021-07-19 Thread Timo Walther (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-23426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17383244#comment-17383244
 ] 

Timo Walther commented on FLINK-23426:
--

[~jark] [~ykt836] what have been your thoughts on this topic?

> Support changelog processing in batch mode
> --
>
> Key: FLINK-23426
> URL: https://issues.apache.org/jira/browse/FLINK-23426
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Timo Walther
>Priority: Major
>
> The DataStream API can execute arbitrary DataStream programs when running in 
> batch mode. However, this is not the case for the Table API batch mode. E.g. 
> a source with non-insert only changes is not supported and updates/deletes 
> cannot be emitted.
> In theory, we could make this work by running the "stream mode" of the 
> planner (CDC transformations) on top of the "batch mode" of DataStream API 
> (specialized state backend, sorted inputs). It is up for discussion if and 
> how we expose such functionality.
> If we don't allow enabling incremental updates, we can also add a special 
> batch operator that materializes the incoming changes for a batch pipeline. 
> However, it would require "complete" CDC logs (i.e. no missing UPDATE_AFTER).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #16490: [FLINK-23241][docs-zh] Translate the page of "Working with State " in…

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #16490:
URL: https://github.com/apache/flink/pull/16490#issuecomment-879692551


   
   ## CI report:
   
   * b3daf19e3d63cf1c19ad3b8e46d1fd332948bfee Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20475)
 
   * a34e830f20cf5317ddbffdae8c44fa98492ff92d UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #16532: [FLINK-13400]Remove Hive and Hadoop dependencies from SQL Client

2021-07-19 Thread GitBox


flinkbot commented on pull request #16532:
URL: https://github.com/apache/flink/pull/16532#issuecomment-882438173


   
   ## CI report:
   
   * 4745129243f5398aa5dc0e6ec347e8c1fba93fdb UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16352: [FLINK-23102][runtime] Accessing FlameGraphs while not being enabled …

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #16352:
URL: https://github.com/apache/flink/pull/16352#issuecomment-872812232


   
   ## CI report:
   
   * 7c00ce144d7716ac0c04d3c0160e814c25dcc5ce Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20230)
 
   * a702e37520cc8860e767ee7fe6119ef5dc60f2a7 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16490: [FLINK-23241][docs-zh] Translate the page of "Working with State " in…

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #16490:
URL: https://github.com/apache/flink/pull/16490#issuecomment-879692551


   
   ## CI report:
   
   * b3daf19e3d63cf1c19ad3b8e46d1fd332948bfee Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20475)
 
   * a34e830f20cf5317ddbffdae8c44fa98492ff92d Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20684)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16531: [FLINK-23204] Provide StateBackends access to MailboxExecutor

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #16531:
URL: https://github.com/apache/flink/pull/16531#issuecomment-882289490


   
   ## CI report:
   
   * 98a41566d8755a72e597c7d530d5a350fd561fbd Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20670)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16532: [FLINK-13400]Remove Hive and Hadoop dependencies from SQL Client

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #16532:
URL: https://github.com/apache/flink/pull/16532#issuecomment-882438173


   
   ## CI report:
   
   * 4745129243f5398aa5dc0e6ec347e8c1fba93fdb Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20685)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-23218) Distribute the ShuffleDescriptors via blob server

2021-07-19 Thread Zhilong Hong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-23218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17383265#comment-17383265
 ] 

Zhilong Hong commented on FLINK-23218:
--

Hi, [~trohrmann]. While implementing the size limit for the blob cache on the 
TaskExecutor, we found that we cannot simply delete the least recently used 
permanent blobs once the size limit is exceeded. The {{BlobLibraryCacheManager}} 
on the TaskExecutor caches the File reference for the user code classloader, so 
if we delete a permanent blob, any later access through these classloaders 
raises an IOException. I'm not sure how to solve this without making the blob 
cache more complicated. Do you think it would be a good idea to implement the 
synchronization instead of the size limit? Thank you in advance.
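
As a toy illustration of the failure mode only (the class and method names 
below are made up and are not the real {{BlobLibraryCacheManager}} API):

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.HashMap;
import java.util.Map;

/**
 * Toy sketch (not the real BlobLibraryCacheManager): the manager keeps File
 * references to permanent blobs and builds user-code classloaders from them.
 */
public class BlobBackedClassLoaderCache {

    private final Map<String, File> cachedBlobFiles = new HashMap<>();

    public void register(String blobKey, File localBlobFile) {
        cachedBlobFiles.put(blobKey, localBlobFile);
    }

    /** Simulates a size-limit eviction that deletes the local blob file. */
    public void evict(String blobKey) {
        File file = cachedBlobFiles.get(blobKey);
        if (file != null) {
            // the cached File reference now points to a missing file
            file.delete();
        }
    }

    /** Resolving the classloader after eviction fails with an IOException. */
    public URLClassLoader createUserCodeClassLoader(String blobKey) throws IOException {
        File file = cachedBlobFiles.get(blobKey);
        if (file == null || !file.exists()) {
            throw new FileNotFoundException("Permanent blob " + blobKey + " was evicted");
        }
        return new URLClassLoader(new URL[] {file.toURI().toURL()});
    }
}
```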

??(Mentioned here) For each TaskManager, its blob cache syncs the status of all 
blobs with the blob server every 5 minutes (this interval is configurable). If a 
blob is removed from the blob server, it is removed from the blob cache, too.??

> Distribute the ShuffleDescriptors via blob server
> -
>
> Key: FLINK-23218
> URL: https://issues.apache.org/jira/browse/FLINK-23218
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Zhilong Hong
>Priority: Major
> Fix For: 1.14.0
>
>
> h3. Introduction
> The optimizations introduced in FLINK-21110 have so far improved the 
> performance of job initialization, failover and partition releasing. 
> However, task deployment is still slow. For a job with two vertices, each 
> with a parallelism of 8k and connected by an all-to-all edge, it takes 
> 32.611s to deploy all the tasks and make them transition to running. If 
> the parallelism is 16k, it may take more than 2 minutes.
> Since the creation of TaskDeploymentDescriptors runs in the main thread of 
> the jobmanager, the jobmanager cannot deal with other Akka messages, such as 
> heartbeats and task status updates, for more than two minutes.
>  
> All in all, there are currently two issues in the deployment of tasks for 
> large-scale jobs:
>  # It takes a long time to deploy tasks, especially for all-to-all edges.
>  # Heartbeat timeouts may happen during or after task deployment. For 
> streaming jobs, this causes the failover of the entire region. The job may 
> never transition to running, since there would be another heartbeat timeout 
> during the redeployment of the tasks.
> h3. Proposal
> Task deployment involves the following procedures:
>  # Jobmanager creates TaskDeploymentDescriptor for each task in the main 
> thread
>  # TaskDeploymentDescriptor is serialized in the future executor
>  # Akka transports TaskDeploymentDescriptors to TaskExecutors via RPC call
>  # TaskExecutors create a new task thread and execute it
> The optimization contains two parts:
> *1. Cache the compressed serialized value of ShuffleDescriptors*
> ShuffleDescriptors in the TaskDeploymentDescriptor are used to describe the 
> IntermediateResultPartitions that a task consumes. For downstream vertices 
> connected by an all-to-all edge with parallelism _N_, we currently calculate 
> the same _N_ ShuffleDescriptors _N_ times. However, these vertices share the 
> same ShuffleDescriptors, since they all consume the same 
> IntermediateResultPartitions. We don't need to calculate the 
> ShuffleDescriptors for each downstream vertex individually; we can just cache 
> them. This decreases the overall complexity of calculating 
> TaskDeploymentDescriptors from O(N^2) to O(N).
> Furthermore, we don't need to serialize the same ShuffleDescriptors _N_ 
> times, so we can cache the serialized value of the ShuffleDescriptors instead 
> of the original objects. To decrease the size of the Akka messages and reduce 
> the transmission of replicated data over the network, the serialized values 
> can be compressed (a rough sketch of this caching idea follows after the 
> quoted description).
> *2. Distribute the ShuffleDescriptors via blob server*
> For ShuffleDescriptors of vertices with 8k parallelism, the size of their 
> serialized value is more than 700 Kilobytes. After the compression, it would 
> be 200 Kilobytes or so. The overall size of 8k TaskDeploymentDescriptors is 
> more than 1.6 Gigabytes. Since Akka cannot send the messages as fast as the 
> TaskDeploymentDescriptors are created, these TaskDeploymentDescriptors would 
> become a heavy burden for the garbage collector to deal with.
> In TaskDeploymentDescriptor, JobInformation and TaskInformation are 
> distributed via the blob server if their sizes exceed a certain threshold 
> (which is defined as {{blob.offload.minsize}}). TaskExecutors request the 
> information from the blob server once they begin to process the 
> TaskDeploymentDescriptor. This makes sure that the jobmanager d
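
To make part 1 of the quoted proposal more concrete, here is a rough sketch of 
caching a compressed serialized value. The class name and the use of plain 
Java serialization with a Deflater are placeholders for illustration, not the 
actual Flink implementation.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;

/**
 * Rough sketch: serialize and compress the shared ShuffleDescriptors once,
 * then reuse the bytes for every downstream task of the all-to-all edge.
 */
public class ShuffleDescriptorCache {

    private byte[] compressedSerializedDescriptors; // computed once, reused N times

    public synchronized byte[] getOrCompute(Serializable sharedDescriptors) throws IOException {
        if (compressedSerializedDescriptors == null) {
            compressedSerializedDescriptors = serializeAndCompress(sharedDescriptors);
        }
        return compressedSerializedDescriptors;
    }

    private static byte[] serializeAndCompress(Serializable value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out =
                new ObjectOutputStream(
                        new DeflaterOutputStream(bytes, new Deflater(Deflater.BEST_SPEED)))) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }
}
```

Part 2 of the proposal would then offload such cached bytes to the blob server 
when they are large (analogous to the existing {{blob.offload.minsize}} 
mechanism mentioned above for JobInformation and TaskInformation), so that 
TaskExecutors fetch them once instead of receiving them inside every Akka 
message.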

[GitHub] [flink] flinkbot edited a comment on pull request #15703: [FLINK-22275][table] Support random past for timestamp types in datagen connector

2021-07-19 Thread GitBox


flinkbot edited a comment on pull request #15703:
URL: https://github.com/apache/flink/pull/15703#issuecomment-823980125


   
   ## CI report:
   
   * 886a0567d25763d1d786432354886942fba5288d Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20669)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



