[jira] [Created] (FLINK-21753) Cycle references between memory manager and gc cleaner action

2021-03-12 Thread Kezhu Wang (Jira)
Kezhu Wang created FLINK-21753:
--

 Summary: Cycle references between memory manager and gc cleaner 
action
 Key: FLINK-21753
 URL: https://issues.apache.org/jira/browse/FLINK-21753
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Task
Affects Versions: 1.12.2, 1.11.3, 1.10.3
Reporter: Kezhu Wang


{{MemoryManager.allocatePages}} uses {{this::releasePage}} as the cleanup action in 
{{MemorySegmentFactory.allocateOffHeapUnsafeMemory}}, where it is installed as the 
gc cleaner action. This creates a cyclic reference between the memory manager and 
the gc cleaner *if the allocated memory segment is not released via 
{{MemoryManager.release}} in time.* The symptoms differ by version:
 * Before 1.12.2: memory will not be reclaimed until gc after 
{{MemoryManager.release}}
 * Since 1.12.2: memory will not be reclaimed until {{MemorySegment.free}} or gc 
after {{MemoryManager.release}}

I quote the javadoc of the JDK {{java.lang.ref.Cleaner}} here for reference:

{quote}The cleaning action is invoked only after the associated object becomes 
phantom reachable, so it is important that the object implementing the cleaning 
action does not hold references to the object. In this example, a static class 
encapsulates the cleaning state and action. An "inner" class, anonymous or not, 
must not be used because it implicitly contains a reference to the outer 
instance, preventing it from becoming phantom reachable. The choice of a new 
cleaner or sharing an existing cleaner is determined by the use case.
{quote}
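The guideline quoted above can be illustrated with a small standalone sketch (hypothetical class and field names, not Flink's actual code). The cleaning action is a static class holding only the cleanup state; registering it does not capture {{this}}, unlike the {{this::releasePage}} method reference described in this issue, which keeps the memory manager reachable from the cleaner:

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicBoolean;

public class CleanerExample {
    private static final Cleaner CLEANER = Cleaner.create();

    // Correct pattern per the Cleaner javadoc: a static class that holds only
    // the cleanup state, never a reference back to the owning object.
    static final class ReleaseAction implements Runnable {
        final AtomicBoolean released;
        ReleaseAction(AtomicBoolean released) { this.released = released; }
        @Override public void run() { released.set(true); }
    }

    final AtomicBoolean released = new AtomicBoolean(false);
    final Cleaner.Cleanable cleanable;

    CleanerExample() {
        // The buggy shape would be CLEANER.register(this, this::release):
        // the method reference captures `this`, so the registered action keeps
        // the object strongly reachable and it can never become phantom
        // reachable. Registering a static action avoids the cycle.
        this.cleanable = CLEANER.register(this, new ReleaseAction(released));
    }

    public static void main(String[] args) {
        CleanerExample ex = new CleanerExample();
        ex.cleanable.clean(); // explicit release path; runs the action at most once
        System.out.println(ex.released.get());
    }
}
```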
See also FLINK-13985 and FLINK-21419.

I pushed a [test 
case|https://github.com/kezhuw/flink/commit/9a5d71d3a6be50611cf2f5c65c39f51353167306] 
to my repository, on top of FLINK-21419 (which was merged in 1.12.2 but not in 
earlier releases), for evaluation.

cc [~azagrebin] [~xintongsong] [~trohrmann] [~ykt836] [~nicholasjiang] 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #15136: [FLINK-21622][table-planner] Introduce function TO_TIMESTAMP_LTZ(numeric [, precision])

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15136:
URL: https://github.com/apache/flink/pull/15136#issuecomment-795141489


   
   ## CI report:
   
   * 38f1cea9a24f244241f6e2a9f069feb7f774652c Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14561)
 
   * bc9e5155087db2f58663c0f13164d07c1716f7cd Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14577)
 
   * 30df8333ffb8cb76b6bf46043a3cc8dc1b2d8420 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15174: [FLINK-21651][sql-client] Migrate module related tests in LocalExecutorITCase to the new integration test framework

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15174:
URL: https://github.com/apache/flink/pull/15174#issuecomment-797861091


   
   ## CI report:
   
   * faa510486a60de5f71d9cf924ae47fcac04560ff Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14576)
 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14572)
 
   
   




[GitHub] [flink] flinkbot edited a comment on pull request #15136: [FLINK-21622][table-planner] Introduce function TO_TIMESTAMP_LTZ(numeric [, precision])

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15136:
URL: https://github.com/apache/flink/pull/15136#issuecomment-795141489


   
   ## CI report:
   
   * 38f1cea9a24f244241f6e2a9f069feb7f774652c Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14561)
 
   * bc9e5155087db2f58663c0f13164d07c1716f7cd UNKNOWN
   
   




[GitHub] [flink] flinkbot edited a comment on pull request #15121: The method $(String) is undefined for the type TableExample

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15121:
URL: https://github.com/apache/flink/pull/15121#issuecomment-793480240


   
   ## CI report:
   
   * 7383416dfaf59f0aeebc77450b6c3dc9b495789a Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14571)
 
   * 621d7c223c5e8044ecb7965faf086431500f7d63 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14575)
 
   
   




[GitHub] [flink] flinkbot edited a comment on pull request #15174: [FLINK-21651][sql-client] Migrate module related tests in LocalExecutorITCase to the new integration test framework

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15174:
URL: https://github.com/apache/flink/pull/15174#issuecomment-797861091


   
   ## CI report:
   
   * faa510486a60de5f71d9cf924ae47fcac04560ff Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14572)
 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14576)
 
   
   




[GitHub] [flink] flinkbot edited a comment on pull request #15121: The method $(String) is undefined for the type TableExample

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15121:
URL: https://github.com/apache/flink/pull/15121#issuecomment-793480240


   
   ## CI report:
   
   * 7383416dfaf59f0aeebc77450b6c3dc9b495789a Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14571)
 
   * 621d7c223c5e8044ecb7965faf086431500f7d63 UNKNOWN
   
   




[GitHub] [flink] LadyForest commented on pull request #15174: [FLINK-21651][sql-client] Migrate module related tests in LocalExecutorITCase to the new integration test framework

2021-03-12 Thread GitBox


LadyForest commented on pull request #15174:
URL: https://github.com/apache/flink/pull/15174#issuecomment-797876066


   @flinkbot run azure







[jira] [Updated] (FLINK-21523) ArrayIndexOutOfBoundsException occurs while run a hive streaming job with partitioned table source

2021-03-12 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu updated FLINK-21523:

Fix Version/s: 1.12.3

> ArrayIndexOutOfBoundsException occurs while run a hive streaming job with 
> partitioned table source 
> ---
>
> Key: FLINK-21523
> URL: https://issues.apache.org/jira/browse/FLINK-21523
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.12.1
>Reporter: zouyunhe
>Assignee: zouyunhe
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.13.0, 1.12.3
>
>
> we have two hive tables, the DDL as below
> {code:java}
> //test_tbl5
> create table test.test_5
>  (dpi int, 
>   uid bigint) 
> partitioned by( day string, hour string) stored as parquet;
> //test_tbl3
> create table test.test_3(
>   dpi int,
>  uid bigint,
>  itime timestamp) stored as parquet;{code}
>  then add a partition to test_tbl5:
> {code:java}
> alter table test_tbl5 add partition(day='2021-02-27',hour='12');
> {code}
> we start a flink streaming job to read hive table test_tbl5 and write the 
> data into test_tbl3; the job's sql is:
> {code:java}
> set test_tbl5.streaming-source.enable = true;
> insert into hive.test.test_tbl3 select dpi, uid, 
> cast(to_timestamp('2020-08-09 00:00:00') as timestamp(9)) from 
> hive.test.test_tbl5 where `day` = '2021-02-27';
> {code}
> and we see the following exception thrown:
> {code:java}
> 2021-02-28 22:33:16,553 ERROR org.apache.flink.runtime.source.coordinator.SourceCoordinatorContext - Exception while handling result from async call in SourceCoordinator-Source: HiveSource-test.test_tbl5. Triggering job failover.
> org.apache.flink.connectors.hive.FlinkHiveException: Failed to enumerate files
>     at org.apache.flink.connectors.hive.ContinuousHiveSplitEnumerator.handleNewSplits(ContinuousHiveSplitEnumerator.java:152) ~[flink-connector-hive_2.12-1.12.1.jar:1.12.1]
>     at org.apache.flink.runtime.source.coordinator.ExecutorNotifier.lambda$null$4(ExecutorNotifier.java:136) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
>     at org.apache.flink.util.ThrowableCatchingRunnable.run(ThrowableCatchingRunnable.java:40) [flink-dist_2.12-1.12.1.jar:1.12.1]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_60]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_60]
>     at java.lang.Thread.run(Thread.java:745) [?:1.8.0_60]
> Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
>     at org.apache.flink.connectors.hive.util.HivePartitionUtils.toHiveTablePartition(HivePartitionUtils.java:184) ~[flink-connector-hive_2.12-1.12.1.jar:1.12.1]
>     at org.apache.flink.connectors.hive.HiveTableSource$HiveContinuousPartitionFetcherContext.toHiveTablePartition(HiveTableSource.java:417) ~[flink-connector-hive_2.12-1.12.1.jar:1.12.1]
>     at org.apache.flink.connectors.hive.ContinuousHiveSplitEnumerator$PartitionMonitor.call(ContinuousHiveSplitEnumerator.java:237) ~[flink-connector-hive_2.12-1.12.1.jar:1.12.1]
>     at org.apache.flink.connectors.hive.ContinuousHiveSplitEnumerator$PartitionMonitor.call(ContinuousHiveSplitEnumerator.java:177) ~[flink-connector-hive_2.12-1.12.1.jar:1.12.1]
>     at org.apache.flink.runtime.source.coordinator.ExecutorNotifier.lambda$notifyReadyAsync$5(ExecutorNotifier.java:133) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_60]
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) ~[?:1.8.0_60]
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) ~[?:1.8.0_60]
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) ~[?:1.8.0_60]
>     ... 3 more
> {code}
> it seems the partitioned field is not found in the source table field list.
>   
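The {{-1}} in the stack trace is consistent with a failed field-name lookup. A minimal sketch of that failure mode (hypothetical names, not Flink's actual code): when projection push-down drops the partition column from the projected field list, {{indexOf}} returns -1, and indexing an array with it throws exactly this exception:

```java
import java.util.Arrays;
import java.util.List;

public class PartitionLookup {
    // Hypothetical helper: resolve a partition column's value from a row by
    // looking the column up in the projected field list.
    static Object partitionValue(List<String> projectedFields,
                                 Object[] row, String partitionKey) {
        int idx = projectedFields.indexOf(partitionKey); // -1 if projected away
        return row[idx]; // throws ArrayIndexOutOfBoundsException when missing
    }

    public static void main(String[] args) {
        // "day" was pushed out of the projection, mirroring the report above.
        List<String> projected = Arrays.asList("dpi", "uid");
        try {
            partitionValue(projected, new Object[]{1, 2L}, "day");
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("lookup failed, index was -1");
        }
    }
}
```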





[jira] [Comment Edited] (FLINK-21523) ArrayIndexOutOfBoundsException occurs while run a hive streaming job with partitioned table source

2021-03-12 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17300210#comment-17300210
 ] 

Jark Wu edited comment on FLINK-21523 at 3/13/21, 6:19 AM:
---

Fixed in 
 - master: 63a6aba6722ae0e3d17381aaeb2fa464ea15d2f5
 - release-1.12: 621d7c223c5e8044ecb7965faf086431500f7d63


was (Author: jark):
Fixed in master: 63a6aba6722ae0e3d17381aaeb2fa464ea15d2f5

> ArrayIndexOutOfBoundsException occurs while run a hive streaming job with 
> partitioned table source 
> ---
>
> Key: FLINK-21523
> URL: https://issues.apache.org/jira/browse/FLINK-21523
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.12.1
>Reporter: zouyunhe
>Assignee: zouyunhe
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.13.0
>
>





[GitHub] [flink] wuchong merged pull request #15169: [BP-1.12][FLINK-21523][hive] Fix ArrayIndexOutOfBoundsException when running hive partitioned source with projection push down

2021-03-12 Thread GitBox


wuchong merged pull request #15169:
URL: https://github.com/apache/flink/pull/15169


   







[jira] [Commented] (FLINK-21747) Encounter an exception that contains "max key length exceeded ..." when reporting metrics to influxdb

2021-03-12 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17300736#comment-17300736
 ] 

Jark Wu commented on FLINK-21747:
-

We had a discussion in FLINK-20388.

> Encounter an exception that contains "max key length exceeded ..."  when 
> reporting metrics to influxdb
> --
>
> Key: FLINK-21747
> URL: https://issues.apache.org/jira/browse/FLINK-21747
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Metrics
>Affects Versions: 1.10.0
>Reporter: tim yu
>Priority: Major
>
> I ran a streaming job whose insert statement is very long, and it reports 
> metrics to influxdb. I found many influxdb exceptions containing "max key 
> length exceeded ..." in the job manager's log file. The job could not write 
> any metrics to influxdb, because "task_name" and "operator_name" are too 
> long.





[jira] [Assigned] (FLINK-21466) Make "embedded" parameter optional when start sql-client.sh

2021-03-12 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu reassigned FLINK-21466:
---

Assignee: zck

> Make "embedded" parameter optional when start sql-client.sh
> ---
>
> Key: FLINK-21466
> URL: https://issues.apache.org/jira/browse/FLINK-21466
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Shengkai Fang
>Assignee: zck
>Priority: Major
>  Labels: starter
>
> Users can use command 
> {code:java}
> >./sql-client.sh embedded -i init1.sql,init2.sql
> {code}
>  to start the sql client, without needing to add {{embedded}}.





[GitHub] [flink] flinkbot edited a comment on pull request #15175: [FLINK-16829] Refactored Prometheus Metric Reporters to use construct…

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15175:
URL: https://github.com/apache/flink/pull/15175#issuecomment-797871668


   
   ## CI report:
   
   * 62141a2aff3348088ceb5eb6c449e9abfe5f9870 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14573)
 
   
   




[jira] [Assigned] (FLINK-21702) Support option `sql-client.verbose` to print the exception stack

2021-03-12 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu reassigned FLINK-21702:
---

Assignee: zhuxiaoshang

> Support option `sql-client.verbose` to print the exception stack
> 
>
> Key: FLINK-21702
> URL: https://issues.apache.org/jira/browse/FLINK-21702
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Client
>Affects Versions: 1.13.0
>Reporter: Shengkai Fang
>Assignee: zhuxiaoshang
>Priority: Major
>  Labels: starer
>
> By enabling this option, users can get the full exception stack.





[GitHub] [flink] flinkbot edited a comment on pull request #15175: [FLINK-16829] Refactored Prometheus Metric Reporters to use construct…

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15175:
URL: https://github.com/apache/flink/pull/15175#issuecomment-797871668


   
   ## CI report:
   
   * 62141a2aff3348088ceb5eb6c449e9abfe5f9870 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14573)
 
   
   




[GitHub] [flink] flinkbot commented on pull request #15175: [FLINK-16829] Refactored Prometheus Metric Reporters to use construct…

2021-03-12 Thread GitBox


flinkbot commented on pull request #15175:
URL: https://github.com/apache/flink/pull/15175#issuecomment-797871668


   
   ## CI report:
   
   * 62141a2aff3348088ceb5eb6c449e9abfe5f9870 UNKNOWN
   
   




[GitHub] [flink] flinkbot edited a comment on pull request #15174: [FLINK-21651][sql-client] Migrate module related tests in LocalExecutorITCase to the new integration test framework

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15174:
URL: https://github.com/apache/flink/pull/15174#issuecomment-797861091


   
   ## CI report:
   
   * faa510486a60de5f71d9cf924ae47fcac04560ff Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14572)
 
   
   




[GitHub] [flink] flinkbot edited a comment on pull request #15121: The method $(String) is undefined for the type TableExample

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15121:
URL: https://github.com/apache/flink/pull/15121#issuecomment-793480240


   
   ## CI report:
   
   * 7383416dfaf59f0aeebc77450b6c3dc9b495789a Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14571)
 
   
   




[GitHub] [flink] rionmonster commented on pull request #15175: [FLINK-16829] Refactored Prometheus Metric Reporters to use construct…

2021-03-12 Thread GitBox


rionmonster commented on pull request #15175:
URL: https://github.com/apache/flink/pull/15175#issuecomment-797867198


   @zentol 
   
   I decided to make another, different pass at this guy since the last PR 
seemed to be a bit all over the place. I migrated almost all of the logic to 
simplify the constructors as much as possible, but I'm sure some additional 
eyes and feedback would be appreciated.
   
   The only thing that really stuck out was the use of the `MetricConfig` that 
was used under the hood, however the factory methods (e.g. 
`createMetricReporter(...)`) accept `Properties`, so I added an explicit cast 
since `MetricConfig` was just a superset of `Properties`. 
   
   Additionally, I updated 1-2 of the available unit tests to call the static 
method that was added to one of the factories and adjusted another that assumed 
an empty constructor for the `PrometheusReporter` which explicitly now requires 
the `port` and `httpServer` respectively. Perhaps we need another empty 
constructor as well?
   
   Anyways - it's late on this end, but hopefully this is a step in a better 
direction.
   
   Thanks much,
   
   Rion







[GitHub] [flink] flinkbot commented on pull request #15175: [FLINK-16829] Refactored Prometheus Metric Reporters to use construct…

2021-03-12 Thread GitBox


flinkbot commented on pull request #15175:
URL: https://github.com/apache/flink/pull/15175#issuecomment-797866980


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 62141a2aff3348088ceb5eb6c449e9abfe5f9870 (Sat Mar 13 
04:54:28 UTC 2021)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

 Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   







[GitHub] [flink] rionmonster opened a new pull request #15175: [FLINK-16829] Refactored Prometheus Metric Reporters to use construct…

2021-03-12 Thread GitBox


rionmonster opened a new pull request #15175:
URL: https://github.com/apache/flink/pull/15175


   ## What is the purpose of the change
   
   Clean-up for the initialization of the two Prometheus Metric Reporters 
(`PrometheusReporter` and `PrometheusPushGatewayReporter`) by explicitly 
passing in arguments to the constructors as opposed to relying on resolving 
these values via the internal configuration properties.
   
   ## Brief change log
   
   - Removed argument resolution against defaults from the `open()` functions 
and instead resolved those within the factories and passed them into the 
respective reporter instances (roughly modeled after the `JmxReporter` 
implementation)
   
   ## Verifying this change
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable)
   







[GitHub] [flink] flinkbot edited a comment on pull request #15174: [FLINK-21651][sql-client] Migrate module related tests in LocalExecutorITCase to the new integration test framework

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15174:
URL: https://github.com/apache/flink/pull/15174#issuecomment-797861091


   
   ## CI report:
   
   * faa510486a60de5f71d9cf924ae47fcac04560ff Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14572)
 
   
   




[GitHub] [flink] flinkbot commented on pull request #15174: [FLINK-21651][sql-client] Migrate module related tests in LocalExecutorITCase to the new integration test framework

2021-03-12 Thread GitBox


flinkbot commented on pull request #15174:
URL: https://github.com/apache/flink/pull/15174#issuecomment-797861091


   
   ## CI report:
   
   * faa510486a60de5f71d9cf924ae47fcac04560ff UNKNOWN
   
   
   







[GitHub] [flink] flinkbot commented on pull request #15174: [FLINK-21651][sql-client] Migrate module related tests in LocalExecutorITCase to the new integration test framework

2021-03-12 Thread GitBox


flinkbot commented on pull request #15174:
URL: https://github.com/apache/flink/pull/15174#issuecomment-797859298


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit faa510486a60de5f71d9cf924ae47fcac04560ff (Sat Mar 13 
03:36:38 UTC 2021)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   







[jira] [Updated] (FLINK-21651) Migrate module-related tests in LocalExecutorITCase to new integration test framework

2021-03-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-21651:
---
Labels: pull-request-available  (was: )

> Migrate module-related tests in LocalExecutorITCase to new integration test 
> framework
> -
>
> Key: FLINK-21651
> URL: https://issues.apache.org/jira/browse/FLINK-21651
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Client
>Affects Versions: 1.13.0
>Reporter: Jane Chan
>Assignee: Jane Chan
>Priority: Major
>  Labels: pull-request-available
>
> Migrate module-related tests in `LocalExecutorITCase` after FLINK-21614 is 
> merged.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] LadyForest opened a new pull request #15174: [FLINK-21651][sql-client] Migrate module related tests in LocalExecutorITCase to the new integration test framework

2021-03-12 Thread GitBox


LadyForest opened a new pull request #15174:
URL: https://github.com/apache/flink/pull/15174


   ## Brief change log
   
   This PR migrates the tests in `LocalExecutorITCase` to `CliClientITCase`, and 
removes `#listModules` and `#listFullModules` in `LocalExecutor` since they are 
no longer needed.
   
   
   ## Verifying this change
   
   This change adds a `module.q` script under the `test/resources/sql/` directory 
and can be verified by running `CliClientITCase`.
   
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / **no** / 
don't know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
 - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)
   







[jira] [Commented] (FLINK-21466) Make "embedded" parameter optional when start sql-client.sh

2021-03-12 Thread zck (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17300669#comment-17300669
 ] 

zck commented on FLINK-21466:
-

[~jark] please assign this to me.

> Make "embedded" parameter optional when start sql-client.sh
> ---
>
> Key: FLINK-21466
> URL: https://issues.apache.org/jira/browse/FLINK-21466
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Shengkai Fang
>Priority: Major
>  Labels: starter
>
> Users can use command 
> {code:java}
> >./sql-client.sh embedded -i init1.sql,init2.sql
> {code}
>  to start the sql client and don't need to add {{embedded}}.




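The change requested above amounts to treating the mode keyword as optional: consume it when present, otherwise default to `embedded` and pass the remaining options through. A minimal sketch of that argument handling — the `parse_mode` helper and the `gateway` keyword are illustrative assumptions, not the actual sql-client.sh logic:

```python
def parse_mode(argv):
    """Return (mode, remaining_args), defaulting the mode to 'embedded'.

    If the first argument is a known mode keyword it is consumed; otherwise
    all arguments (e.g. -i init1.sql,init2.sql) are left untouched.
    'gateway' is included here only as an assumed second mode keyword.
    """
    modes = {"embedded", "gateway"}
    if argv and argv[0] in modes:
        return argv[0], argv[1:]
    return "embedded", argv
```

With this, `./sql-client.sh -i init1.sql,init2.sql` and `./sql-client.sh embedded -i init1.sql,init2.sql` would behave identically.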

[jira] [Commented] (FLINK-21466) Make "embedded" parameter optional when start sql-client.sh

2021-03-12 Thread zck (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17300668#comment-17300668
 ] 

zck commented on FLINK-21466:
-

Please assign this to me.

> Make "embedded" parameter optional when start sql-client.sh
> ---
>
> Key: FLINK-21466
> URL: https://issues.apache.org/jira/browse/FLINK-21466
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Shengkai Fang
>Priority: Major
>  Labels: starter
>





[jira] [Commented] (FLINK-21674) JDBC sink can't get valid connection after 5 minutes using Oracle JDBC driver

2021-03-12 Thread Fuyao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17300667#comment-17300667
 ] 

Fuyao commented on FLINK-21674:
---

I was not aware of this earlier; I finally found someone who knows the relevant 
configurations. It is a little bit tricky.

> JDBC sink can't get valid connection after 5 minutes using Oracle JDBC driver
> -
>
> Key: FLINK-21674
> URL: https://issues.apache.org/jira/browse/FLINK-21674
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / JDBC
>Affects Versions: 1.12.1
> Environment: Flink version: 1.12.1, Scala version: 2.11, Java version: 
> 1.11, Flink system parallelism: 1, JDBC driver: Oracle ojdbc10, Database: 
> Oracle Autonomous Database on Oracle Cloud Infrastructure, version 19c (you 
> can regard this as a cloud-based Oracle Database)
>  
> Flink user mailing list: 
> http://mail-archives.apache.org/mod_mbox/flink-user/202103.mbox/%3CCH2PR10MB402466373B33A3BBC5635A0AEC8C9%40CH2PR10MB4024.namprd10.prod.outlook.com%3E
>Reporter: Fuyao
>Priority: Blocker
>
> I use the JDBCSink.sink() method to sink data to Oracle Autonomous Data 
> Warehouse with the Oracle JDBC driver. I can sink data into the Oracle 
> Autonomous database successfully. However, if the connection is idle for over 
> 5 minutes and an insertion is then attempted, the retry mechanism cannot 
> re-establish the JDBC connection and it runs into the error below. I have set 
> the retry count to 3; even after retrying, it still fails. Only restarting 
> the application (an automated process) from a checkpoint resolves the issue.
>  11:41:04,872 ERROR 
> org.apache.flink.connector.jdbc.internal.JdbcBatchingOutputFormat - JDBC 
> executeBatch error, retry times = 0
>  java.sql.BatchUpdateException: IO Error: Broken pipe
> The failure restarts the application from a checkpoint, after which the JDBC 
> connection can be established correctly.
> The connection timeout can be configured with
> alter system set MAX_IDLE_TIME=1440; // Connections time out after 1440 
> minutes.
> The changed timeout behavior can be verified with SQL Developer. However, 
> Flink still gets a connection error after 5 minutes even with this configured.
> I suspect there is some issue on the Flink side in reading the configuration 
> needed to establish a successful connection.
> Full log:
> {code:java}
> 11:41:04,872 ERROR 
> org.apache.flink.connector.jdbc.internal.JdbcBatchingOutputFormat  - JDBC 
> executeBatch error, retry times = 0
> java.sql.BatchUpdateException: IO Error: Broken pipe
>   at 
> oracle.jdbc.driver.OraclePreparedStatement.executeLargeBatch(OraclePreparedStatement.java:9711)
>   at 
> oracle.jdbc.driver.T4CPreparedStatement.executeLargeBatch(T4CPreparedStatement.java:1447)
>   at 
> oracle.jdbc.driver.OraclePreparedStatement.executeBatch(OraclePreparedStatement.java:9487)
>   at 
> oracle.jdbc.driver.OracleStatementWrapper.executeBatch(OracleStatementWrapper.java:237)
>   at 
> org.apache.flink.connector.jdbc.internal.executor.SimpleBatchStatementExecutor.executeBatch(SimpleBatchStatementExecutor.java:73)
>   at 
> org.apache.flink.connector.jdbc.internal.JdbcBatchingOutputFormat.attemptFlush(JdbcBatchingOutputFormat.java:216)
>   at 
> org.apache.flink.connector.jdbc.internal.JdbcBatchingOutputFormat.flush(JdbcBatchingOutputFormat.java:184)
>   at 
> org.apache.flink.connector.jdbc.internal.JdbcBatchingOutputFormat.writeRecord(JdbcBatchingOutputFormat.java:167)
>   at 
> org.apache.flink.connector.jdbc.internal.GenericJdbcSinkFunction.invoke(GenericJdbcSinkFunction.java:54)
>   at 
> org.apache.flink.streaming.api.operators.StreamSink.processElement(StreamSink.java:54)
>   at 
> org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.pushToOperator(CopyingChainingOutput.java:71)
>   at 
> org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:46)
>   at 
> org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:26)
>   at 
> org.apache.flink.streaming.runtime.tasks.BroadcastingOutputCollector.collect(BroadcastingOutputCollector.java:75)
>   at 
> org.apache.flink.streaming.runtime.tasks.BroadcastingOutputCollector.collect(BroadcastingOutputCollector.java:32)
>   at 
> org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:50)
>   at 
> org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:28)
>   at 
> org.apache.flink.streaming.api.operators.TimestampedCollector.collect(TimestampedCollector.java:50)
>   at 
> org.myorg.quickstart.processor.InvoiceBoProcessFunction.onTimer(InvoiceBoProcessFunction.java:613)
>   at 
> org.apache.flink.

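The failure mode reported above — a batch flush failing on a connection that went stale during idle time, with every retry reusing the same broken connection — can be sketched with a small stand-in. This is a hypothetical Python model, not the Flink connector's actual code: `FlakyConnection` and `flush_with_reconnect` are invented names illustrating why a retry that re-establishes the connection can succeed where retrying on the same broken connection fails every time.

```python
class StaleConnectionError(Exception):
    """Stand-in for java.sql.BatchUpdateException: IO Error: Broken pipe."""


class FlakyConnection:
    """Stand-in for a JDBC connection that may have been cut off while idle."""

    def __init__(self):
        self.broken = False

    def execute_batch(self, rows):
        if self.broken:
            raise StaleConnectionError("IO Error: Broken pipe")
        return len(rows)


def flush_with_reconnect(connect, rows, max_retries=3):
    """Retry a batch flush, re-establishing the connection before each retry.

    Retrying on the same stale connection (the behavior described in the
    report) fails on every attempt; reconnecting first lets a retry succeed.
    """
    conn = connect()
    for attempt in range(max_retries + 1):
        try:
            return conn.execute_batch(rows)
        except StaleConnectionError:
            if attempt == max_retries:
                raise
            conn = connect()  # reconnect instead of reusing the stale connection
    raise AssertionError("unreachable")
```

For example, if the first connection handed out is already broken, the flush still succeeds on the retry because a fresh connection is opened first.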
[jira] [Resolved] (FLINK-21674) JDBC sink can't get valid connection after 5 minutes using Oracle JDBC driver

2021-03-12 Thread Fuyao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fuyao resolved FLINK-21674.
---
Resolution: Fixed

> JDBC sink can't get valid connection after 5 minutes using Oracle JDBC driver
> -
>
> Key: FLINK-21674
> URL: https://issues.apache.org/jira/browse/FLINK-21674
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / JDBC
>Affects Versions: 1.12.1
> Environment: Flink version: 1.12.1 Scala version: 2.11 Java version: 
> 1.11 Flink System parallelism: 1 JDBC Driver: Oracle ojdbc10 Database: Oracle 
> Autonomous Database on Oracle Cloud Infrastructure version 19c(You can regard 
> this as an cloud based Oracle Database)
>  
> Flink user mailing list: 
> http://mail-archives.apache.org/mod_mbox/flink-user/202103.mbox/%3CCH2PR10MB402466373B33A3BBC5635A0AEC8C9%40CH2PR10MB4024.namprd10.prod.outlook.com%3E
>Reporter: Fuyao
>Priority: Blocker
>

[jira] [Updated] (FLINK-21747) Encounter an exception that contains "max key length exceeded ..." when reporting metrics to influxdb

2021-03-12 Thread tim yu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

tim yu updated FLINK-21747:
---
Description: I run a streaming job with a very long insert statement, and it 
reports metrics to InfluxDB. I find many InfluxDB exceptions containing "max 
key length exceeded ..." in the job manager's log file. The job could not write 
any metrics to InfluxDB because "task_name" and "operator_name" are too long.  
(was: I run a stream job with insert statement whose size is too long, it 
report metrics to influxdb. I find many influxdb exceptions that contains "max 
key length exceeded ..." in the log file of job manager . The job could not 
write the metrics to the influxdb, because "task_name" and "operator_name" is 
too long.)

> Encounter an exception that contains "max key length exceeded ..."  when 
> reporting metrics to influxdb
> --
>
> Key: FLINK-21747
> URL: https://issues.apache.org/jira/browse/FLINK-21747
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Metrics
>Affects Versions: 1.10.0
>Reporter: tim yu
>Priority: Major
>
> I run a streaming job with a very long insert statement, and it reports 
> metrics to InfluxDB. I find many InfluxDB exceptions containing "max key 
> length exceeded ..." in the job manager's log file. The job could not write 
> any metrics to InfluxDB because "task_name" and "operator_name" are too long.




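The workaround being weighed in this thread — shortening over-long tag values such as `task_name` and `operator_name` before they reach InfluxDB — can be sketched as follows. The `truncate_tag` name and the 255-byte per-tag budget are illustrative assumptions, not Flink or InfluxDB constants; the point is only that bounding each tag keeps the assembled series key under InfluxDB's maximum key length (roughly 64 KB in the TSM engine).

```python
def truncate_tag(value: str, limit: int = 255) -> str:
    """Truncate a metric tag value on a UTF-8 byte budget.

    InfluxDB rejects writes whose series key exceeds its internal maximum
    key length, so each contributing tag is cut to a fixed budget and marked
    with a trailing ellipsis when truncation happened.
    """
    encoded = value.encode("utf-8")
    if len(encoded) <= limit:
        return value
    # Cut on the byte budget, dropping any partially truncated character.
    return encoded[:limit].decode("utf-8", errors="ignore") + "..."
```

The trade-off, as noted in the thread, is that truncated task/operator names are no longer unique identifiers, so this is a mitigation rather than a perfect fix.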

[jira] [Commented] (FLINK-21674) JDBC sink can't get valid connection after 5 minutes using Oracle JDBC driver

2021-03-12 Thread Fuyao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17300666#comment-17300666
 ] 

Fuyao commented on FLINK-21674:
---

Hi Roman, thanks for your help. 

I was able to figure out the root cause.

I am an employee at Oracle. I need to be on the OCI VPN instead of the Oracle 
VPN to get access to ADW. I used to access it through a proxy. However, the 
proxy imposes a 5-minute hard limit on connections: after 5 minutes, the JDBC 
connection is cut off regardless of what is configured on the DB side or the 
client side.

With the OCI VPN, I am able to get past the issue. Thanks for your help.

> JDBC sink can't get valid connection after 5 minutes using Oracle JDBC driver
> -
>
> Key: FLINK-21674
> URL: https://issues.apache.org/jira/browse/FLINK-21674
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / JDBC
>Affects Versions: 1.12.1
> Environment: Flink version: 1.12.1 Scala version: 2.11 Java version: 
> 1.11 Flink System parallelism: 1 JDBC Driver: Oracle ojdbc10 Database: Oracle 
> Autonomous Database on Oracle Cloud Infrastructure version 19c(You can regard 
> this as an cloud based Oracle Database)
>  
> Flink user mailing list: 
> http://mail-archives.apache.org/mod_mbox/flink-user/202103.mbox/%3CCH2PR10MB402466373B33A3BBC5635A0AEC8C9%40CH2PR10MB4024.namprd10.prod.outlook.com%3E
>Reporter: Fuyao
>Priority: Blocker
>

[jira] [Comment Edited] (FLINK-21747) Encounter an exception that contains "max key length exceeded ..." when reporting metrics to influxdb

2021-03-12 Thread tim yu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17300665#comment-17300665
 ] 

tim yu edited comment on FLINK-21747 at 3/13/21, 1:10 AM:
--

Hi [~chesnay], thanks for your suggestion, but that is not a perfect way.

Should we shorten the task name and operator name of some metrics?

Hi [~jark], do you have any good suggestions?


was (Author: yulei0824):
Hi [~chesnay], Thanks for your suggestion, but that is not perfect way.

Should we shorten task name and operator name of some metrics ?

Hi [~jark], Do you have any good suggestion?

> Encounter an exception that contains "max key length exceeded ..."  when 
> reporting metrics to influxdb
> --
>
> Key: FLINK-21747
> URL: https://issues.apache.org/jira/browse/FLINK-21747
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Metrics
>Affects Versions: 1.10.0
>Reporter: tim yu
>Priority: Major
>





[jira] [Commented] (FLINK-21747) Encounter an exception that contains "max key length exceeded ..." when reporting metrics to influxdb

2021-03-12 Thread tim yu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17300665#comment-17300665
 ] 

tim yu commented on FLINK-21747:


Hi [~chesnay], thanks for your suggestion, but that is not a perfect way.

Should we shorten the task name and operator name of some metrics?

Hi [~jark], do you have any good suggestions?

> Encounter an exception that contains "max key length exceeded ..."  when 
> reporting metrics to influxdb
> --
>
> Key: FLINK-21747
> URL: https://issues.apache.org/jira/browse/FLINK-21747
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Metrics
>Affects Versions: 1.10.0
>Reporter: tim yu
>Priority: Major
>





[GitHub] [flink] flinkbot edited a comment on pull request #15121: The method $(String) is undefined for the type TableExample

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15121:
URL: https://github.com/apache/flink/pull/15121#issuecomment-793480240


   
   ## CI report:
   
   * 76fdf33da7a9809d4c3b5fda8164d86880a39cac Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14566)
 
   * 7383416dfaf59f0aeebc77450b6c3dc9b495789a Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14571)
 
   
   
   







[GitHub] [flink] flinkbot edited a comment on pull request #15121: The method $(String) is undefined for the type TableExample

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15121:
URL: https://github.com/apache/flink/pull/15121#issuecomment-793480240


   
   ## CI report:
   
   * dfe55bcf15e82fa63e85551fba2e0b415bba8dae Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14554)
 
   * 76fdf33da7a9809d4c3b5fda8164d86880a39cac Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14566)
 
   * 7383416dfaf59f0aeebc77450b6c3dc9b495789a Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14571)
 
   
   
   







[GitHub] [flink] flinkbot edited a comment on pull request #15121: The method $(String) is undefined for the type TableExample

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15121:
URL: https://github.com/apache/flink/pull/15121#issuecomment-793480240


   
   ## CI report:
   
   * dfe55bcf15e82fa63e85551fba2e0b415bba8dae Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14554)
 
   * 76fdf33da7a9809d4c3b5fda8164d86880a39cac Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14566)
 
   * 7383416dfaf59f0aeebc77450b6c3dc9b495789a UNKNOWN
   
   
   







[jira] [Commented] (FLINK-21538) Elasticsearch6DynamicSinkITCase.testWritingDocuments fails when submitting job

2021-03-12 Thread Austin Cawley-Edwards (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17300636#comment-17300636
 ] 

Austin Cawley-Edwards commented on FLINK-21538:
---

Test debug build here: 
https://dev.azure.com/austincawley0684/flink/_build/results?buildId=3&view=results

> Elasticsearch6DynamicSinkITCase.testWritingDocuments fails when submitting job
> --
>
> Key: FLINK-21538
> URL: https://issues.apache.org/jira/browse/FLINK-21538
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / ElasticSearch, Runtime / Coordination
>Affects Versions: 1.12.1
>Reporter: Dawid Wysakowicz
>Priority: Major
>  Labels: test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=13868&view=logs&j=3d12d40f-c62d-5ec4-6acc-0efe94cc3e89&t=5d6e4255-0ea8-5e2a-f52c-c881b7872361
> {code}
> 2021-02-27T00:16:06.9493539Z 
> org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
> 2021-02-27T00:16:06.9494494Z  at 
> org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:144)
> 2021-02-27T00:16:06.9495733Z  at 
> org.apache.flink.runtime.minicluster.MiniClusterJobClient.lambda$getJobExecutionResult$2(MiniClusterJobClient.java:117)
> 2021-02-27T00:16:06.9496596Z  at 
> java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616)
> 2021-02-27T00:16:06.9497354Z  at 
> java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591)
> 2021-02-27T00:16:06.9525795Z  at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
> 2021-02-27T00:16:06.9526744Z  at 
> java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975)
> 2021-02-27T00:16:06.9527784Z  at 
> org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.lambda$invokeRpc$0(AkkaInvocationHandler.java:237)
> 2021-02-27T00:16:06.9528552Z  at 
> java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
> 2021-02-27T00:16:06.9529271Z  at 
> java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
> 2021-02-27T00:16:06.9530013Z  at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
> 2021-02-27T00:16:06.9530482Z  at 
> java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975)
> 2021-02-27T00:16:06.9531068Z  at 
> org.apache.flink.runtime.concurrent.FutureUtils$1.onComplete(FutureUtils.java:1046)
> 2021-02-27T00:16:06.9531544Z  at 
> akka.dispatch.OnComplete.internal(Future.scala:264)
> 2021-02-27T00:16:06.9531908Z  at 
> akka.dispatch.OnComplete.internal(Future.scala:261)
> 2021-02-27T00:16:06.9532449Z  at 
> akka.dispatch.japi$CallbackBridge.apply(Future.scala:191)
> 2021-02-27T00:16:06.9532860Z  at 
> akka.dispatch.japi$CallbackBridge.apply(Future.scala:188)
> 2021-02-27T00:16:06.9533245Z  at 
> scala.concurrent.impl.CallbackRunnable.run(Promise.scala:60)
> 2021-02-27T00:16:06.9533721Z  at 
> org.apache.flink.runtime.concurrent.Executors$DirectExecutionContext.execute(Executors.java:73)
> 2021-02-27T00:16:06.9534225Z  at 
> scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:68)
> 2021-02-27T00:16:06.9534697Z  at 
> scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1(Promise.scala:284)
> 2021-02-27T00:16:06.9535217Z  at 
> scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1$adapted(Promise.scala:284)
> 2021-02-27T00:16:06.9535718Z  at 
> scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:284)
> 2021-02-27T00:16:06.9536127Z  at 
> akka.pattern.PromiseActorRef.$bang(AskSupport.scala:573)
> 2021-02-27T00:16:06.9536861Z  at 
> akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:22)
> 2021-02-27T00:16:06.9537394Z  at 
> akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:21)
> 2021-02-27T00:16:06.9537916Z  at 
> scala.concurrent.Future.$anonfun$andThen$1(Future.scala:532)
> 2021-02-27T00:16:06.9605804Z  at 
> scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:29)
> 2021-02-27T00:16:06.9606794Z  at 
> scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:29)
> 2021-02-27T00:16:06.9607642Z  at 
> scala.concurrent.impl.CallbackRunnable.run(Promise.scala:60)
> 2021-02-27T00:16:06.9608419Z  at 
> akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
> 2021-02-27T00:16:06.9609252Z  at 
> akka.dispatch.BatchingExecutor$BlockableBatch.$anonfun$run$1(BatchingExecutor.scala:91)
> 2021-02-27T00:16:06.9610024Z  at 
> scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
> 2021-02-27T00:16:06.9613676Z  at 
> scala.concurrent.BlockContext$.with

[GitHub] [flink] rionmonster commented on pull request #15165: [FLINK-16829] Refactored Prometheus Metric Reporters to Use Constructor Initialization

2021-03-12 Thread GitBox


rionmonster commented on pull request #15165:
URL: https://github.com/apache/flink/pull/15165#issuecomment-797790797


   @zentol 
   
   I'll be submitting another PR once this work is done, ran into some git 
issues and decided to scorch the earth in the interest of keeping things 
clean.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] rionmonster closed pull request #15165: [FLINK-16829] Refactored Prometheus Metric Reporters to Use Constructor Initialization

2021-03-12 Thread GitBox


rionmonster closed pull request #15165:
URL: https://github.com/apache/flink/pull/15165


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] rionmonster commented on a change in pull request #15165: [FLINK-16829] Refactored Prometheus Metric Reporters to Use Constructor Initialization

2021-03-12 Thread GitBox


rionmonster commented on a change in pull request #15165:
URL: https://github.com/apache/flink/pull/15165#discussion_r593477758



##
File path: 
flink-metrics/flink-metrics-prometheus/src/main/java/org/apache/flink/metrics/prometheus/PrometheusPushGatewayReporter.java
##
@@ -56,43 +56,42 @@
 private boolean deleteOnShutdown;
 private Map<String, String> groupingKey;
 
-@Override
-public void open(MetricConfig config) {
-super.open(config);
-
-String host = config.getString(HOST.key(), HOST.defaultValue());
-int port = config.getInteger(PORT.key(), PORT.defaultValue());
-String configuredJobName = config.getString(JOB_NAME.key(), JOB_NAME.defaultValue());
-boolean randomSuffix =
-config.getBoolean(
-RANDOM_JOB_NAME_SUFFIX.key(), RANDOM_JOB_NAME_SUFFIX.defaultValue());
-deleteOnShutdown =
-config.getBoolean(DELETE_ON_SHUTDOWN.key(), DELETE_ON_SHUTDOWN.defaultValue());
-groupingKey =
-parseGroupingKey(config.getString(GROUPING_KEY.key(), GROUPING_KEY.defaultValue()));
-
-if (host == null || host.isEmpty() || port < 1) {
+PrometheusPushGatewayReporter(
+@Nullable final String hostConfig,
+final int portConfig,
+@Nullable final String jobNameConfig,
+final boolean randomJobSuffixConfig,
+final boolean deleteOnShutdownConfig,
+@Nullable final String groupingKeyConfig
+) {
+deleteOnShutdown = deleteOnShutdownConfig;
+groupingKey = parseGroupingKey(groupingKeyConfig);
+
+if (hostConfig == null || hostConfig.isEmpty() || portConfig < 1) {
 throw new IllegalArgumentException(
-"Invalid host/port configuration. Host: " + host + " Port: " + port);
+"Invalid host/port configuration. Host: " + hostConfig + " Port: " + portConfig);
 }
 
-if (randomSuffix) {
-this.jobName = configuredJobName + new AbstractID();
+if (randomJobSuffixConfig) {
+this.jobName = jobNameConfig + new AbstractID();
 } else {
-this.jobName = configuredJobName;
+this.jobName = jobNameConfig;
 }
 
-pushGateway = new PushGateway(host + ':' + port);
+pushGateway = new PushGateway(hostConfig + ':' + portConfig);
 log.info(
 "Configured PrometheusPushGatewayReporter with {host:{}, port:{}, jobName:{}, randomJobNameSuffix:{}, deleteOnShutdown:{}, groupingKey:{}}",
-host,
-port,
-jobName,
-randomSuffix,
+hostConfig,
+portConfig,
+jobNameConfig,
+randomJobSuffixConfig,
 deleteOnShutdown,
 groupingKey);
 }
 
+@Override
+public void open(MetricConfig config) { }
+
 Map<String, String> parseGroupingKey(final String groupingKeyConfig) {

Review comment:
   Okay, totally right. I was _way_ overthinking this. I'll probably burn 
this branch down and pull latest to avoid too much shuffling around. 
   
   Much appreciated!
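   For context on the `parseGroupingKey` method being discussed, a minimal 
standalone sketch of grouping-key parsing is below. This is not the actual 
Flink implementation; it assumes the documented `k1=v1;k2=v2` config format 
and simply skips malformed entries rather than failing the reporter.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hedged sketch, not Flink's code: parses "k1=v1;k2=v2" into an ordered map.
class GroupingKeyParser {

    static Map<String, String> parseGroupingKey(final String groupingKeyConfig) {
        final Map<String, String> groupingKey = new LinkedHashMap<>();
        if (groupingKeyConfig == null || groupingKeyConfig.isEmpty()) {
            return groupingKey; // nothing configured
        }
        for (String kv : groupingKeyConfig.split(";")) {
            int idx = kv.indexOf('=');
            if (idx < 0) {
                continue; // no '=' separator: skip malformed entry
            }
            String key = kv.substring(0, idx);
            String value = kv.substring(idx + 1);
            if (key.isEmpty() || value.isEmpty()) {
                continue; // empty key or value: skip
            }
            groupingKey.put(key, value);
        }
        return groupingKey;
    }
}
```

   Keeping the parser a package-private instance method, as in the diff above, 
makes it easy to unit-test without standing up a push gateway.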





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Assigned] (FLINK-21743) JdbcXaSinkFunction throws XAER_RMFAIL when calling snapshotState and beginTx

2021-03-12 Thread Roman Khachatryan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Khachatryan reassigned FLINK-21743:
-

Assignee: Roman Khachatryan

> JdbcXaSinkFunction throws XAER_RMFAIL when calling snapshotState and beginTx
> 
>
> Key: FLINK-21743
> URL: https://issues.apache.org/jira/browse/FLINK-21743
> Project: Flink
>  Issue Type: Test
>  Components: Connectors / JDBC
>Affects Versions: 1.13.0
> Environment: org.apache.flink:flink-streaming-java_2.11:1.12.1
> org.apache.flink:flink-connector-jdbc_2.11:1.13-SNAPSHOT
>Reporter: Wei Hao
>Assignee: Roman Khachatryan
>Priority: Major
>
> {code:java}
> public void snapshotState(FunctionSnapshotContext context) throws Exception {
> LOG.debug("snapshot state, checkpointId={}", context.getCheckpointId());
> this.rollbackPreparedFromCheckpoint(context.getCheckpointId());
> this.prepareCurrentTx(context.getCheckpointId());
> this.beginTx(context.getCheckpointId() + 1L);
> this.stateHandler.store(JdbcXaSinkFunctionState.of(this.preparedXids, 
> this.hangingXids));
> }
> {code}
> When checkpointing starts, it calls snapshotState(), which ends and prepares 
> the current transaction. The issue I found is with beginTx(), where a new Xid 
> is generated and xaFacade runs a command like 'xa start new_xid', which 
> throws the exception shown below and causes checkpointing failure.
> {code:java}
> Caused by: org.apache.flink.connector.jdbc.xa.XaFacade$TransientXaException: 
> com.mysql.cj.jdbc.MysqlXAException: XAER_RMFAIL: The command cannot be 
> executed when global transaction is in the  PREPARED stateCaused by: 
> org.apache.flink.connector.jdbc.xa.XaFacade$TransientXaException: 
> com.mysql.cj.jdbc.MysqlXAException: XAER_RMFAIL: The command cannot be 
> executed when global transaction is in the  PREPARED state at 
> org.apache.flink.connector.jdbc.xa.XaFacadeImpl.wrapException(XaFacadeImpl.java:353)
>  at 
> org.apache.flink.connector.jdbc.xa.XaFacadeImpl.access$800(XaFacadeImpl.java:66)
>  at 
> org.apache.flink.connector.jdbc.xa.XaFacadeImpl$Command.lambda$fromRunnable$0(XaFacadeImpl.java:288)
>  at 
> org.apache.flink.connector.jdbc.xa.XaFacadeImpl$Command.lambda$fromRunnable$4(XaFacadeImpl.java:327)
>  at 
> org.apache.flink.connector.jdbc.xa.XaFacadeImpl.execute(XaFacadeImpl.java:267)
>  at 
> org.apache.flink.connector.jdbc.xa.XaFacadeImpl.start(XaFacadeImpl.java:160) 
> at 
> org.apache.flink.connector.jdbc.xa.JdbcXaSinkFunction.beginTx(JdbcXaSinkFunction.java:302)
>  at 
> org.apache.flink.connector.jdbc.xa.JdbcXaSinkFunction.snapshotState(JdbcXaSinkFunction.java:241)
>  at 
> org.apache.flink.streaming.util.functions.StreamingFunctionUtils.trySnapshotFunctionState(StreamingFunctionUtils.java:118)
>  at 
> org.apache.flink.streaming.util.functions.StreamingFunctionUtils.snapshotFunctionState(StreamingFunctionUtils.java:99)
>  at 
> org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.snapshotState(AbstractUdfStreamOperator.java:89)
>  at 
> org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.snapshotState(StreamOperatorStateHandler.java:205)
>  ... 23 more
> {code}
> I think the scenario is quite predictable because this is how XA transactions 
> work.
> The MySQL shell example below behaves quite similarly to how JdbcXaSinkFunction 
> works.
> {code:java}
> xa start "";
> # Inserting some rows
> # end the current transaction
> xa end "";
> xa prepare "";
> # start a new transaction with the same connection while the previous one is 
> PREPARED
> xa start "";
> {code}
> This also produces error 'SQL Error [1399] [XAE07]: XAER_RMFAIL: The command 
> cannot be executed when global transaction is in the PREPARED state'.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-21743) JdbcXaSinkFunction throws XAER_RMFAIL when calling snapshotState and beginTx

2021-03-12 Thread Roman Khachatryan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17300627#comment-17300627
 ] 

Roman Khachatryan commented on FLINK-21743:
---

Works fine on postgresql (after increasing max_prepared_transactions):
{code}
postgres=# begin;
BEGIN
postgres=*# prepare transaction '1';
PREPARE TRANSACTION
postgres=# begin;
BEGIN
postgres=*# prepare transaction '2';
PREPARE TRANSACTION
postgres=# commit prepared '1';
COMMIT PREPARED
postgres=# commit prepared '2';
COMMIT PREPARED
{code}

 Probably [https://bugs.mysql.com/bug.php?id=17343]
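
The difference between the two databases can be modeled without a running 
database. The sketch below is a toy state machine, not MySQL's or PostgreSQL's 
implementation: it only captures the one behavior at issue, namely that a 
MySQL-like resource manager rejects XA START on a connection that still holds 
a PREPARED transaction, while a PostgreSQL-like one (as shown in the shell 
session above) allows it.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of an XA connection; "allowStartWhilePrepared" distinguishes the
// postgres-like behavior (true) from the mysql-like behavior (false).
class ToyXaConnection {
    private final boolean allowStartWhilePrepared;
    private final Set<String> prepared = new HashSet<>();
    private String active;

    ToyXaConnection(boolean allowStartWhilePrepared) {
        this.allowStartWhilePrepared = allowStartWhilePrepared;
    }

    void start(String xid) {
        if (!allowStartWhilePrepared && !prepared.isEmpty()) {
            // Mirrors the error text from the stack trace above.
            throw new IllegalStateException(
                    "XAER_RMFAIL: The command cannot be executed when global "
                            + "transaction is in the PREPARED state");
        }
        active = xid;
    }

    void endAndPrepare(String xid) {
        if (!xid.equals(active)) {
            throw new IllegalStateException("unknown xid " + xid);
        }
        active = null;
        prepared.add(xid);
    }

    void commitPrepared(String xid) {
        prepared.remove(xid);
    }
}
```

JdbcXaSinkFunction's snapshotState effectively does endAndPrepare(currentXid) 
followed by start(nextXid) on the same connection, which succeeds against the 
postgres-like model and throws against the mysql-like one.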

> JdbcXaSinkFunction throws XAER_RMFAIL when calling snapshotState and beginTx
> 
>
> Key: FLINK-21743
> URL: https://issues.apache.org/jira/browse/FLINK-21743
> Project: Flink
>  Issue Type: Test
>  Components: Connectors / JDBC
>Affects Versions: 1.13.0
> Environment: org.apache.flink:flink-streaming-java_2.11:1.12.1
> org.apache.flink:flink-connector-jdbc_2.11:1.13-SNAPSHOT
>Reporter: Wei Hao
>Priority: Major
>
> {code:java}
> public void snapshotState(FunctionSnapshotContext context) throws Exception {
> LOG.debug("snapshot state, checkpointId={}", context.getCheckpointId());
> this.rollbackPreparedFromCheckpoint(context.getCheckpointId());
> this.prepareCurrentTx(context.getCheckpointId());
> this.beginTx(context.getCheckpointId() + 1L);
> this.stateHandler.store(JdbcXaSinkFunctionState.of(this.preparedXids, 
> this.hangingXids));
> }
> {code}
> When checkpointing starts, it calls snapshotState(), which ends and prepares 
> the current transaction. The issue I found is with beginTx(), where a new Xid 
> is generated and xaFacade runs a command like 'xa start new_xid', which 
> throws the exception shown below and causes checkpointing failure.
> {code:java}
> Caused by: org.apache.flink.connector.jdbc.xa.XaFacade$TransientXaException: 
> com.mysql.cj.jdbc.MysqlXAException: XAER_RMFAIL: The command cannot be 
> executed when global transaction is in the  PREPARED stateCaused by: 
> org.apache.flink.connector.jdbc.xa.XaFacade$TransientXaException: 
> com.mysql.cj.jdbc.MysqlXAException: XAER_RMFAIL: The command cannot be 
> executed when global transaction is in the  PREPARED state at 
> org.apache.flink.connector.jdbc.xa.XaFacadeImpl.wrapException(XaFacadeImpl.java:353)
>  at 
> org.apache.flink.connector.jdbc.xa.XaFacadeImpl.access$800(XaFacadeImpl.java:66)
>  at 
> org.apache.flink.connector.jdbc.xa.XaFacadeImpl$Command.lambda$fromRunnable$0(XaFacadeImpl.java:288)
>  at 
> org.apache.flink.connector.jdbc.xa.XaFacadeImpl$Command.lambda$fromRunnable$4(XaFacadeImpl.java:327)
>  at 
> org.apache.flink.connector.jdbc.xa.XaFacadeImpl.execute(XaFacadeImpl.java:267)
>  at 
> org.apache.flink.connector.jdbc.xa.XaFacadeImpl.start(XaFacadeImpl.java:160) 
> at 
> org.apache.flink.connector.jdbc.xa.JdbcXaSinkFunction.beginTx(JdbcXaSinkFunction.java:302)
>  at 
> org.apache.flink.connector.jdbc.xa.JdbcXaSinkFunction.snapshotState(JdbcXaSinkFunction.java:241)
>  at 
> org.apache.flink.streaming.util.functions.StreamingFunctionUtils.trySnapshotFunctionState(StreamingFunctionUtils.java:118)
>  at 
> org.apache.flink.streaming.util.functions.StreamingFunctionUtils.snapshotFunctionState(StreamingFunctionUtils.java:99)
>  at 
> org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.snapshotState(AbstractUdfStreamOperator.java:89)
>  at 
> org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.snapshotState(StreamOperatorStateHandler.java:205)
>  ... 23 more
> {code}
> I think the scenario is quite predictable because this is how XA transactions 
> work.
> The MySQL shell example below behaves quite similarly to how JdbcXaSinkFunction 
> works.
> {code:java}
> xa start "";
> # Inserting some rows
> # end the current transaction
> xa end "";
> xa prepare "";
> # start a new transaction with the same connection while the previous one is 
> PREPARED
> xa start "";
> {code}
> This also produces error 'SQL Error [1399] [XAE07]: XAER_RMFAIL: The command 
> cannot be executed when global transaction is in the PREPARED state'.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #15134: [FLINK-21654][test] Added @throws information to tests

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15134:
URL: https://github.com/apache/flink/pull/15134#issuecomment-795077470


   
   ## CI report:
   
   * 63851b4ad1d97292257cb8cb02f6bd211b503979 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14562)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15049: [FLINK-21190][runtime-web] Expose exception history

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15049:
URL: https://github.com/apache/flink/pull/15049#issuecomment-787719691


   
   ## CI report:
   
   * 2cbffce55c35f7e163739d07f88e480870a0fc37 UNKNOWN
   * bd732209f8aaa9653dfeaf7be6ff8396ece70928 UNKNOWN
   * b8744b240d77fc84dee60042317b79585423d403 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14564)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15121: The method $(String) is undefined for the type TableExample

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15121:
URL: https://github.com/apache/flink/pull/15121#issuecomment-793480240


   
   ## CI report:
   
   * dfe55bcf15e82fa63e85551fba2e0b415bba8dae Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14554)
 
   * 76fdf33da7a9809d4c3b5fda8164d86880a39cac Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14566)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15136: [FLINK-21622][table-planner] Introduce function TO_TIMESTAMP_LTZ(numeric [, precision])

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15136:
URL: https://github.com/apache/flink/pull/15136#issuecomment-795141489


   
   ## CI report:
   
   * 38f1cea9a24f244241f6e2a9f069feb7f774652c Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14561)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15049: [FLINK-21190][runtime-web] Expose exception history

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15049:
URL: https://github.com/apache/flink/pull/15049#issuecomment-787719691


   
   ## CI report:
   
   * 2cbffce55c35f7e163739d07f88e480870a0fc37 UNKNOWN
   * bd732209f8aaa9653dfeaf7be6ff8396ece70928 UNKNOWN
   * d772ca219b9437388a0996cb5498789153ea8b1b Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14560)
 
   * b8744b240d77fc84dee60042317b79585423d403 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14564)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15121: The method $(String) is undefined for the type TableExample

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15121:
URL: https://github.com/apache/flink/pull/15121#issuecomment-793480240


   
   ## CI report:
   
   * dfe55bcf15e82fa63e85551fba2e0b415bba8dae Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14554)
 
   * 76fdf33da7a9809d4c3b5fda8164d86880a39cac UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] twalthr closed pull request #15154: [FLINK-21725][core] Name constructor arguments of Tuples like fields

2021-03-12 Thread GitBox


twalthr closed pull request #15154:
URL: https://github.com/apache/flink/pull/15154


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #15173: [FLINK-21367][Connectors / JDBC] Support objectReuse in JDBC sink

2021-03-12 Thread GitBox


flinkbot commented on pull request #15173:
URL: https://github.com/apache/flink/pull/15173#issuecomment-797709378


   
   ## CI report:
   
   * c2d910a15fa478f09f81913837d14aa4782601b2 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15171: [FLINK-21558][coordination] Skip check if slotreport indicates no change

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15171:
URL: https://github.com/apache/flink/pull/15171#issuecomment-797538070


   
   ## CI report:
   
   * eaf04db0fc9193bb77e3ecb917f828f2adbbca14 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14555)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15121: The method $(String) is undefined for the type TableExample

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15121:
URL: https://github.com/apache/flink/pull/15121#issuecomment-793480240


   
   ## CI report:
   
   * dfe55bcf15e82fa63e85551fba2e0b415bba8dae Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14554)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15172: [FLINK-21388] [flink-parquet] support DECIMAL parquet logical type wh…

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15172:
URL: https://github.com/apache/flink/pull/15172#issuecomment-797538175


   
   ## CI report:
   
   * 06cea2c4d6cc6724f86063ddb5974c88ae9629b1 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14556)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15154: [FLINK-21725][core] Name constructor arguments of Tuples like fields

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15154:
URL: https://github.com/apache/flink/pull/15154#issuecomment-796737826


   
   ## CI report:
   
   * 4572f80bab02248b8d5d1f8d2bda99f9a933e17f Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14559)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #15173: Flink-21367

2021-03-12 Thread GitBox


flinkbot commented on pull request #15173:
URL: https://github.com/apache/flink/pull/15173#issuecomment-797679900


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit c2d910a15fa478f09f81913837d14aa4782601b2 (Fri Mar 12 
18:49:37 UTC 2021)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] tillrohrmann merged pull request #15107: [BP-1.11][FLINK-21606] Reject JobManager <-> TaskManager connection if JobManager is not responsible for job

2021-03-12 Thread GitBox


tillrohrmann merged pull request #15107:
URL: https://github.com/apache/flink/pull/15107


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] Bruschkov opened a new pull request #15173: Flink-21367

2021-03-12 Thread GitBox


Bruschkov opened a new pull request #15173:
URL: https://github.com/apache/flink/pull/15173


   ## What is the purpose of the change
   * Users should now be able to use the jdbc-sink when object reuse is enabled 
(or from stateful functions, where object reuse is enabled explicitly)
   
   
   ## Brief change log
   
 - Jdbc-sink and JdbcOutputFormat now implement InputTypeConfigurable, 
through which a serializer is obtained for the incoming records. When object 
reuse is enabled, a copy of each record (made via the serializer) is stored in 
the jdbc-batch; when object reuse is disabled, the record itself is stored.
   
   
   ## Verifying this change
   This change is a trivial rework / code cleanup without any test coverage.
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): don't know
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
 - Does this pull request introduce a new feature? no
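
   The copy-on-add idea behind this change can be sketched as follows. This is 
a hedged illustration, not the PR's code: `UnaryOperator` stands in for 
Flink's `TypeSerializer#copy`, and `BufferingSink` is a hypothetical stand-in 
for the jdbc batching sink. The point is that with object reuse enabled the 
runtime may hand the sink the same mutable instance for every record, so 
buffering the reference itself would corrupt earlier batch entries.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical buffering sink: defensively copies records when object reuse
// is enabled, otherwise stores the reference directly.
class BufferingSink<T> {
    private final boolean objectReuseEnabled;
    private final UnaryOperator<T> copier; // stands in for TypeSerializer#copy
    private final List<T> batch = new ArrayList<>();

    BufferingSink(boolean objectReuseEnabled, UnaryOperator<T> copier) {
        this.objectReuseEnabled = objectReuseEnabled;
        this.copier = copier;
    }

    void invoke(T record) {
        // Copy only when the same instance may be mutated by the runtime later.
        batch.add(objectReuseEnabled ? copier.apply(record) : record);
    }

    List<T> batch() {
        return batch;
    }
}
```

   Copying only when reuse is enabled keeps the non-reuse path free of the 
per-record serialization cost, which matters since this is a per-record code 
path.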
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-21597) testMapAfterRepartitionHasCorrectParallelism2 Fail because of "NoResourceAvailableException"

2021-03-12 Thread Matthias (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17300540#comment-17300540
 ] 

Matthias commented on FLINK-21597:
--

I attached the test failure's log for easier analysis. We have a suspicion that 
it could be related to a known race condition in the declarative slot protocol 
implementation. That can appear in cases where multiple jobs compete for 
limited available slots. FLINK-21751 is covering this issue.

But looking at the logs again, I'm not confident that this is the reason for the 
failure: we ran into timeouts in the range of 10 seconds, matching the internally 
used 10-second timeout threshold for inactive slots, whereas {{PartitionITCase}} 
fails with a timeout of 300s.

> testMapAfterRepartitionHasCorrectParallelism2 Fail because of 
> "NoResourceAvailableException" 
> -
>
> Key: FLINK-21597
> URL: https://issues.apache.org/jira/browse/FLINK-21597
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.13.0
>Reporter: Guowei Ma
>Priority: Major
>  Labels: test-stability
> Attachments: FLINK-21597.log
>
>
> {code:java}
> 2021-03-04T00:17:41.2017402Z [ERROR] 
> testMapAfterRepartitionHasCorrectParallelism2[Execution mode = 
> CLUSTER](org.apache.flink.api.scala.operators.PartitionITCase)  Time elapsed: 
> 300.117 s  <<< ERROR!
> 2021-03-04T00:17:41.2018058Z 
> org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
> 2021-03-04T00:17:41.2018525Z  at 
> org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:144)
> 2021-03-04T00:17:41.2019563Z  at 
> org.apache.flink.runtime.minicluster.MiniClusterJobClient.lambda$getJobExecutionResult$3(MiniClusterJobClient.java:137)
> 2021-03-04T00:17:41.2020129Z  at 
> java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616)
> 2021-03-04T00:17:41.2021974Z  at 
> java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591)
> 2021-03-04T00:17:41.2022634Z  at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
> 2021-03-04T00:17:41.2023118Z  at 
> java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975)
> 2021-03-04T00:17:41.2023682Z  at 
> org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.lambda$invokeRpc$0(AkkaInvocationHandler.java:237)
> 2021-03-04T00:17:41.2024244Z  at 
> java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
> 2021-03-04T00:17:41.2024749Z  at 
> java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
> 2021-03-04T00:17:41.2025261Z  at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
> 2021-03-04T00:17:41.2026070Z  at 
> java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975)
> 2021-03-04T00:17:41.2026814Z  at 
> org.apache.flink.runtime.concurrent.FutureUtils$1.onComplete(FutureUtils.java:1066)
> 2021-03-04T00:17:41.2027633Z  at 
> akka.dispatch.OnComplete.internal(Future.scala:264)
> 2021-03-04T00:17:41.2028245Z  at 
> akka.dispatch.OnComplete.internal(Future.scala:261)
> 2021-03-04T00:17:41.2028796Z  at 
> akka.dispatch.japi$CallbackBridge.apply(Future.scala:191)
> 2021-03-04T00:17:41.2029327Z  at 
> akka.dispatch.japi$CallbackBridge.apply(Future.scala:188)
> 2021-03-04T00:17:41.2030017Z  at 
> scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
> 2021-03-04T00:17:41.2030795Z  at 
> org.apache.flink.runtime.concurrent.Executors$DirectExecutionContext.execute(Executors.java:73)
> 2021-03-04T00:17:41.2031885Z  at 
> scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:44)
> 2021-03-04T00:17:41.2032678Z  at 
> scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:252)
> 2021-03-04T00:17:41.2033428Z  at 
> akka.pattern.PromiseActorRef.$bang(AskSupport.scala:572)
> 2021-03-04T00:17:41.2034197Z  at 
> akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:22)
> 2021-03-04T00:17:41.2035094Z  at 
> akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:21)
> 2021-03-04T00:17:41.2035915Z  at 
> scala.concurrent.Future$$anonfun$andThen$1.apply(Future.scala:436)
> 2021-03-04T00:17:41.2036617Z  at 
> scala.concurrent.Future$$anonfun$andThen$1.apply(Future.scala:435)
> 2021-03-04T00:17:41.2037537Z  at 
> scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
> 2021-03-04T00:17:41.2038019Z  at 
> akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
> 2021-03-04T00:17:41.2038554Z  at 
> akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91)

[jira] [Updated] (FLINK-21597) testMapAfterRepartitionHasCorrectParallelism2 Fail because of "NoResourceAvailableException"

2021-03-12 Thread Matthias (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias updated FLINK-21597:
-
Attachment: FLINK-21597.log

> testMapAfterRepartitionHasCorrectParallelism2 Fail because of 
> "NoResourceAvailableException" 
> -
>
> Key: FLINK-21597
> URL: https://issues.apache.org/jira/browse/FLINK-21597
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.13.0
>Reporter: Guowei Ma
>Priority: Major
>  Labels: test-stability
> Attachments: FLINK-21597.log
>
>

[jira] [Closed] (FLINK-21606) TaskManager connected to invalid JobManager leading to TaskSubmissionException

2021-03-12 Thread Till Rohrmann (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann closed FLINK-21606.
-
Fix Version/s: 1.12.3
   1.11.4
   Resolution: Fixed

Fixed via

1.13.0:
ab2f89940d 
42b94df19d 
0bd837c4fd 
691f87e00f
b24c5e6768

1.12.3:
76fdf33da7 
b025ed27b9 
3f97d0e988 
685baca1bf 
2bcc56ef04

1.11.4:
7edd724e13
2f094da996
db0a5f5b64
969fda2525
172fdfed4d

> TaskManager connected to invalid JobManager leading to TaskSubmissionException
> --
>
> Key: FLINK-21606
> URL: https://issues.apache.org/jira/browse/FLINK-21606
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.13.0
>Reporter: Robert Metzger
>Assignee: Till Rohrmann
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.11.4, 1.13.0, 1.12.3
>
> Attachments: FLINK-21606-logs.tgz
>
>
> While testing reactive mode, I had to start my JobManager a few times to get 
> the configuration right. While doing that, I had at least one TaskManager 
> (TM6) that was first connected to the first JobManager (with a running 
> job) and then to the second one.
> On the second JobManager, I was able to execute my test job (on another 
> TaskManager (TMx)). Once TM6 reconnected and reactive mode tried to utilize 
> all available resources, I repeatedly ran into this issue:
> {code}
> 2021-03-04 15:49:36,322 INFO  
> org.apache.flink.runtime.executiongraph.ExecutionGraph   [] - 
> Window(GlobalWindows(), DeltaTrigger, TimeEvictor, ComparableAggregator, 
> PassThroughWindowFunction) -> Sink: Print to Std. Out (5/7) 
> (ae8f39c8dd88148aff93c8f811fab22e) switched from DEPLOYING to FAILED on 
> 192.168.2.173:64041-4f7521 @ macbook-pro-2.localdomain (dataPort=64044).
> java.util.concurrent.CompletionException: 
> org.apache.flink.runtime.taskexecutor.exceptions.TaskSubmissionException: 
> Could not submit task because there is no JobManager associated for the job 
> bbe8634736b5b1d813dd322cfaaa08ea.
>   at 
> java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:326)
>  ~[?:1.8.0_252]
>   at 
> java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:338)
>  ~[?:1.8.0_252]
>   at 
> java.util.concurrent.CompletableFuture.uniRelay(CompletableFuture.java:925) 
> ~[?:1.8.0_252]
>   at 
> java.util.concurrent.CompletableFuture$UniRelay.tryFire(CompletableFuture.java:913)
>  ~[?:1.8.0_252]
>   at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
>  ~[?:1.8.0_252]
>   at 
> java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
>  ~[?:1.8.0_252]
>   at 
> org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.lambda$invokeRpc$0(AkkaInvocationHandler.java:234)
>  ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
>   at 
> java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
>  ~[?:1.8.0_252]
>   at 
> java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
>  ~[?:1.8.0_252]
>   at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
>  ~[?:1.8.0_252]
>   at 
> java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
>  ~[?:1.8.0_252]
>   at 
> org.apache.flink.runtime.concurrent.FutureUtils$1.onComplete(FutureUtils.java:1064)
>  ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
>   at akka.dispatch.OnComplete.internal(Future.scala:263) 
> ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
>   at akka.dispatch.OnComplete.internal(Future.scala:261) 
> ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
>   at akka.dispatch.japi$CallbackBridge.apply(Future.scala:191) 
> ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
>   at akka.dispatch.japi$CallbackBridge.apply(Future.scala:188) 
> ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
>   at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36) 
> ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
>   at 
> org.apache.flink.runtime.concurrent.Executors$DirectExecutionContext.execute(Executors.java:73)
>  ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
>   at 
> scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:44) 
> ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
>   at 
> scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:252) 
> ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
>   at akka.pattern.PromiseActorRef.$bang(AskSupport.scala:572) 
> ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
>   at akka.remote.DefaultMessageDispatcher.dispatch(Endpoint.scala:101) 
> 

[jira] [Commented] (FLINK-16908) FlinkKafkaProducerITCase testScaleUpAfterScalingDown Timeout expired while initializing transactional state in 60000ms.

2021-03-12 Thread Till Rohrmann (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17300526#comment-17300526
 ] 

Till Rohrmann commented on FLINK-16908:
---

Another instance: 
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=14550&view=logs&j=c5f0071e-1851-543e-9a45-9ac140befc32&t=1fb1a56f-e8b5-5a82-00a0-a2db7757b4f5&l=13043

> FlinkKafkaProducerITCase testScaleUpAfterScalingDown Timeout expired while 
> initializing transactional state in 60000ms.
> ---
>
> Key: FLINK-16908
> URL: https://issues.apache.org/jira/browse/FLINK-16908
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.11.0, 1.12.0
>Reporter: Piotr Nowojski
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.13.0
>
>
> https://dev.azure.com/rmetzger/Flink/_build/results?buildId=6889&view=logs&j=c5f0071e-1851-543e-9a45-9ac140befc32&t=f66652e3-384e-5b25-be29-abfea69ea8da
> {noformat}
> [ERROR] 
> testScaleUpAfterScalingDown(org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerITCase)
>   Time elapsed: 64.353 s  <<< ERROR!
> org.apache.kafka.common.errors.TimeoutException: Timeout expired while 
> initializing transactional state in 60000ms.
> {noformat}
> After this initial error many other tests (I think all following unit tests) 
> failed with errors like:
> {noformat}
> [ERROR] 
> testFailAndRecoverSameCheckpointTwice(org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerITCase)
>   Time elapsed: 7.895 s  <<< FAILURE!
> java.lang.AssertionError: Detected producer leak. Thread name: 
> kafka-producer-network-thread | producer-196
>   at 
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerITCase.checkProducerLeak(FlinkKafkaProducerITCase.java:675)
>   at 
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerITCase.testFailAndRecoverSameCheckpointTwice(FlinkKafkaProducerITCase.java:311)
> {noformat}
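The follow-up failures above come from a leak check that scans live threads for a surviving Kafka producer network thread. Purely as an illustration (this is a hypothetical standalone sketch, not Flink's actual `checkProducerLeak` implementation), such a check can be written as:

```java
import java.util.Optional;

// Hypothetical sketch of a producer-leak check: scan all live threads and
// report the first one whose name marks it as a Kafka producer network thread.
// The class and method names here are illustrative, not Flink's actual code.
public class ProducerLeakCheck {

    static Optional<String> findLeakedThread(String namePrefix) {
        return Thread.getAllStackTraces().keySet().stream()
                .map(Thread::getName)
                .filter(name -> name.startsWith(namePrefix))
                .findFirst();
    }

    public static void main(String[] args) {
        // A test would fail with an assertion like the one quoted above
        // if this returns a thread name after the producer should be closed.
        findLeakedThread("kafka-producer-network-thread")
                .ifPresent(name ->
                        System.out.println("Detected producer leak. Thread name: " + name));
    }
}
```

Because the check inspects global JVM state, one leaked producer thread from an earlier test poisons every subsequent test in the class, which matches the cascade of failures described above.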



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] tillrohrmann edited a comment on pull request #14948: [FLINK-21333][coordination] Add StopWithSavepoint state to declarative scheduler

2021-03-12 Thread GitBox


tillrohrmann edited a comment on pull request #14948:
URL: https://github.com/apache/flink/pull/14948#issuecomment-797663142


   Here is [a branch with a simple StopWithSavepoint 
implementation](https://github.com/tillrohrmann/flink/tree/FLINK-21333). Let me 
know what you think @rmetzger. This can definitely be more beautified and I 
might have overlooked some cases.







[GitHub] [flink] tillrohrmann commented on pull request #14948: [FLINK-21333][coordination] Add StopWithSavepoint state to declarative scheduler

2021-03-12 Thread GitBox


tillrohrmann commented on pull request #14948:
URL: https://github.com/apache/flink/pull/14948#issuecomment-797663142


   Here is [a branch with a simple `StopWithSavepoint` 
implementation](https://github.com/tillrohrmann/flink/tree/FLINK-21333). Let me 
know what you think. This can definitely be more beautified and I might have 
overlooked some cases.







[GitHub] [flink] flinkbot edited a comment on pull request #15134: [FLINK-21654][test] Added @throws information to tests

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15134:
URL: https://github.com/apache/flink/pull/15134#issuecomment-795077470


   
   ## CI report:
   
   * f77abc145b06b6ef3a745ab382c1761ecae752d1 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14447)
 
   * 63851b4ad1d97292257cb8cb02f6bd211b503979 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14562)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #15106: [BP-1.12][FLINK-21606] Reject JobManager <-> TaskManager connection if JobManager is not responsible for job

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15106:
URL: https://github.com/apache/flink/pull/15106#issuecomment-792330426


   
   ## CI report:
   
   * 8bf221d8f00686baafe48f05adc458406189dcd3 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14550)
 
   
   
   







[jira] [Closed] (FLINK-21709) Officially deprecate the legacy planner

2021-03-12 Thread Timo Walther (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther closed FLINK-21709.

Fix Version/s: 1.13.0
 Release Note: The old planner of the Table & SQL API is deprecated and 
will be dropped in Flink 1.14. This means that both the BatchTableEnvironment 
and DataSet API interop are reaching end of life. Use the unified 
TableEnvironment for batch and stream processing with the new planner or the 
DataStream API in batch execution mode.
   Resolution: Fixed

Fixed in 1.13.0: 42a45a584665e40f9629d4ecc00c8af1316b3db5

> Officially deprecate the legacy planner
> ---
>
> Key: FLINK-21709
> URL: https://issues.apache.org/jira/browse/FLINK-21709
> Project: Flink
>  Issue Type: Sub-task
>  Components: Documentation, Table SQL / API
>Reporter: Timo Walther
>Assignee: Timo Walther
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.13.0
>
>
> As discussed in 
> https://lists.apache.org/thread.html/r0851e101e37fbab273775b6a252172c7a9f7c7927107c160de779831%40%3Cdev.flink.apache.org%3E
>  we will perform the following steps in 1.13:
> - Deprecate the `flink-table-planner` module
> - Deprecate `BatchTableEnvironment` for both Java, Scala, and Python 
> - Add a notice to release notes and documentation





[GitHub] [flink] flinkbot edited a comment on pull request #14948: [FLINK-21333][coordination] Add StopWithSavepoint state to declarative scheduler

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #14948:
URL: https://github.com/apache/flink/pull/14948#issuecomment-779946654


   
   ## CI report:
   
   * 097472900d4be5c87466bf7678aa506f6fd5d463 UNKNOWN
   * f13e6af735ee5c4e7917f9d8ee1e8c84208ef2ff Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14548)
 
   
   
   







[GitHub] [flink] zentol commented on a change in pull request #15165: [FLINK-16829] Refactored Prometheus Metric Reporters to Use Constructor Initialization

2021-03-12 Thread GitBox


zentol commented on a change in pull request #15165:
URL: https://github.com/apache/flink/pull/15165#discussion_r593355790



##
File path: 
flink-metrics/flink-metrics-prometheus/src/main/java/org/apache/flink/metrics/prometheus/PrometheusPushGatewayReporter.java
##
@@ -56,43 +56,42 @@
 private boolean deleteOnShutdown;
 private Map<String, String> groupingKey;
 
-@Override
-public void open(MetricConfig config) {
-super.open(config);
-
-String host = config.getString(HOST.key(), HOST.defaultValue());
-int port = config.getInteger(PORT.key(), PORT.defaultValue());
-String configuredJobName = config.getString(JOB_NAME.key(), 
JOB_NAME.defaultValue());
-boolean randomSuffix =
-config.getBoolean(
-RANDOM_JOB_NAME_SUFFIX.key(), 
RANDOM_JOB_NAME_SUFFIX.defaultValue());
-deleteOnShutdown =
-config.getBoolean(DELETE_ON_SHUTDOWN.key(), 
DELETE_ON_SHUTDOWN.defaultValue());
-groupingKey =
-parseGroupingKey(config.getString(GROUPING_KEY.key(), 
GROUPING_KEY.defaultValue()));
-
-if (host == null || host.isEmpty() || port < 1) {
+PrometheusPushGatewayReporter(
+@Nullable final String hostConfig,
+@Nullable final int portConfig,
+@Nullable final String jobNameConfig,
+@Nullable final boolean randomJobSuffixConfig,
+@Nullable final boolean deleteOnShutdownConfig,
+@Nullable final Map<String, String> groupingKeyConfig
+) {
+deleteOnShutdown = deleteOnShutdownConfig
+groupingKey = parseGroupingKey(groupingKeyConfig)
+
+if (hostConfig == null || hostConfig.isEmpty() || portConfig < 1) {
 throw new IllegalArgumentException(
-"Invalid host/port configuration. Host: " + host + " Port: 
" + port);
+"Invalid host/port configuration. Host: " + hostConfig + " 
Port: " + portConfig);
 }
 
-if (randomSuffix) {
-this.jobName = configuredJobName + new AbstractID();
+if (randomJobSuffixConfig) {
+this.jobName = jobNameConfig + new AbstractID();
 } else {
-this.jobName = configuredJobName;
+this.jobName = jobNameConfig;
 }
 
-pushGateway = new PushGateway(host + ':' + port);
+pushGateway = new PushGateway(hostConfig + ':' + portConfig);
 log.info(
 "Configured PrometheusPushGatewayReporter with {host:{}, 
port:{}, jobName:{}, randomJobNameSuffix:{}, deleteOnShutdown:{}, 
groupingKey:{}}",
-host,
-port,
-jobName,
-randomSuffix,
+hostConfig,
+portConfig,
+jobNameConfig,
+randomJobSuffixConfig,
 deleteOnShutdown,
 groupingKey);
 }
 
+@Override
+public void open(MetricConfig config) { }
+
 Map<String, String> parseGroupingKey(final String groupingKeyConfig) {

Review comment:
   I think you're overthinking it.
   
   ```
   public PrometheusPushGatewayReporter createMetricReporter(Properties properties) {
       // do all the stuff the current version does
       groupingKey = parseGroupingKey(groupingKeyConfig);
       return new PrometheusPushGatewayReporter(..., groupingKey);
   }

   static Map<String, String> parseGroupingKey(String groupingKeyConfig) {
       ...
   }
   ```









[GitHub] [flink] twalthr closed pull request #15144: [FLINK-21709][table] Officially deprecate the legacy planner

2021-03-12 Thread GitBox


twalthr closed pull request #15144:
URL: https://github.com/apache/flink/pull/15144


   







[GitHub] [flink] flinkbot edited a comment on pull request #15136: [FLINK-21622][table-planner] Introduce function TO_TIMESTAMP_LTZ(numeric [, precision])

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15136:
URL: https://github.com/apache/flink/pull/15136#issuecomment-795141489


   
   ## CI report:
   
   * fce5cb5d02a79b95ceb3a5bbf41be7d186e2fb85 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14466)
 
   * 38f1cea9a24f244241f6e2a9f069feb7f774652c Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14561)
 
   
   
   







[GitHub] [flink] flinkbot edited a comment on pull request #15134: [FLINK-21654][test] Added @throws information to tests

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15134:
URL: https://github.com/apache/flink/pull/15134#issuecomment-795077470


   
   ## CI report:
   
   * f77abc145b06b6ef3a745ab382c1761ecae752d1 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14447)
 
   * 63851b4ad1d97292257cb8cb02f6bd211b503979 UNKNOWN
   
   
   







[GitHub] [flink] flinkbot edited a comment on pull request #15105: [FLINK-21606] Reject JobManager <-> TaskManager connection if JobManager is not responsible for job

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15105:
URL: https://github.com/apache/flink/pull/15105#issuecomment-792330363


   
   ## CI report:
   
   * 6d709fcabe70bf783f83d9906ae4b1550d894096 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14549)
 
   
   
   







[GitHub] [flink] flinkbot edited a comment on pull request #15107: [BP-1.11][FLINK-21606] Reject JobManager <-> TaskManager connection if JobManager is not responsible for job

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15107:
URL: https://github.com/apache/flink/pull/15107#issuecomment-792330475


   
   ## CI report:
   
   * 7edd724e13616e44c5b9aeff409aecdd910dc3fe Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14551)
 
   
   
   







[GitHub] [flink] XComp commented on pull request #15049: [FLINK-21190][runtime-web] Expose exception history

2021-03-12 Thread GitBox


XComp commented on pull request #15049:
URL: https://github.com/apache/flink/pull/15049#issuecomment-797643595


   I addressed your comments, squashed, and rebased the branch.







[GitHub] [flink] XComp commented on a change in pull request #15049: [FLINK-21190][runtime-web] Expose exception history

2021-03-12 Thread GitBox


XComp commented on a change in pull request #15049:
URL: https://github.com/apache/flink/pull/15049#discussion_r593343570



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/ExceptionHistoryEntry.java
##
@@ -0,0 +1,95 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.scheduler;
+
+import org.apache.flink.annotation.VisibleForTesting;
+import org.apache.flink.runtime.executiongraph.AccessExecution;
+import org.apache.flink.runtime.executiongraph.ErrorInfo;
+import org.apache.flink.runtime.taskmanager.TaskManagerLocation;
+
+import javax.annotation.Nullable;
+
+/**
+ * {@code ExceptionHistoryEntry} collects information about a single task 
failure that should be
+ * exposed through the exception history.
+ */
+public class ExceptionHistoryEntry extends ErrorInfo {
+
+private static final long serialVersionUID = -3855285510064263701L;
+
+@Nullable private final String failingTaskName;
+@Nullable private final TaskManagerLocation assignedResourceLocation;

Review comment:
   I created an `ArchivedTaskManagerLocation` for now, as the refactoring 
would take a larger effort that is out of scope for this ticket. @zentol Do we 
already have a ticket that addresses this refactoring? If not, what should be the 
headline of the refactoring ticket I create? I understood that 
introducing the `TaskManagerLocation` interface was part of a larger effort?









[GitHub] [flink] XComp commented on a change in pull request #15049: [FLINK-21190][runtime-web] Expose exception history

2021-03-12 Thread GitBox


XComp commented on a change in pull request #15049:
URL: https://github.com/apache/flink/pull/15049#discussion_r593341506



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/rest/handler/job/JobExceptionsHandler.java
##
@@ -132,7 +142,62 @@ private static JobExceptionsInfo createJobExceptionsInfo(
 }
 }
 
+final ErrorInfo rootCause = executionGraph.getFailureInfo();
+return new JobExceptionsInfoWithHistory(
+rootCause.getException().getOriginalErrorClassName(),
+rootCause.getExceptionAsString(),
+rootCause.getTimestamp(),
+taskExceptionList,
+truncated,
+createJobExceptionHistory(
+executionGraphInfo.getExceptionHistory(), exceptionToReportMaxSize));
+}
+
+static JobExceptionsInfoWithHistory.JobExceptionHistory createJobExceptionHistory(
+Iterable<ExceptionHistoryEntry> historyEntries, int limit) {
+// we need to reverse the history to have a stable result when doing paging on it
+final List<ExceptionHistoryEntry> reversedHistoryEntries = new ArrayList<>();
+Iterables.addAll(reversedHistoryEntries, historyEntries);
+Collections.reverse(reversedHistoryEntries);
+
+List<JobExceptionsInfo> exceptionHistoryEntries =
+reversedHistoryEntries.stream()
+.limit(limit)
+.map(JobExceptionsHandler::createJobExceptionInfo)
+.collect(Collectors.toList());
+
+return new JobExceptionsInfoWithHistory.JobExceptionHistory(
+exceptionHistoryEntries,
+exceptionHistoryEntries.size() < reversedHistoryEntries.size());
+}
+
+private static JobExceptionsInfo createJobExceptionInfo(ExceptionHistoryEntry historyEntry) {
 return new JobExceptionsInfo(
-rootExceptionMessage, rootTimestamp, taskExceptionList, truncated);
+historyEntry.getException().getOriginalErrorClassName(),
+historyEntry.getExceptionAsString(),
+historyEntry.getTimestamp(),
+Collections.singletonList(

Review comment:
   You have a good point here. I refactored the schema so that it no longer 
relies on the old structure. The refactoring also covers your concern: 
`taskName` and `location` are no longer shared if they are `null`.
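The reverse-then-limit paging behavior in the quoted `createJobExceptionHistory` snippet can be checked in isolation. This is a hedged sketch using plain strings instead of the real `ExceptionHistoryEntry` type; the class and the helper name `newestFirst` are invented for illustration:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

public class HistoryPaging {
    // Newest-first ordering: new entries are appended at the end of the
    // source, so reversing pins the most recent entries to the front and
    // the limit keeps only the newest `limit` of them.
    static List<String> newestFirst(Iterable<String> historyEntries, int limit) {
        List<String> reversed = new ArrayList<>();
        historyEntries.forEach(reversed::add);
        Collections.reverse(reversed);
        return reversed.stream().limit(limit).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> page = newestFirst(List.of("e1", "e2", "e3"), 2);
        if (!page.equals(List.of("e3", "e2"))) {
            throw new AssertionError("unexpected page: " + page);
        }
        System.out.println(page); // [e3, e2]
    }
}
```

Comparing the returned size against the source size, as the quoted code does, then tells the caller whether older entries were truncated away.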









[jira] [Updated] (FLINK-21752) NullPointerException on restore in PojoSerializer

2021-03-12 Thread Roman Khachatryan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Khachatryan updated FLINK-21752:
--
Description: 
As originally reported in 
[thread|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Schema-Evolution-Cannot-restore-from-savepoint-after-deleting-field-from-POJO-td42162.html],
 after removing a field from a class, restoring from a savepoint fails with the 
following exception:

{code}
2021-03-10T20:51:30.406Z INFO  org.apache.flink.runtime.taskmanager.Task:960 … (6/8) (d630d5ff0d7ae4fbc428b151abebab52) switched from RUNNING to FAILED. 
java.lang.Exception: Exception while creating StreamOperatorStateContext.
at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:195)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:253)
at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:901)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:415)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:705)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:530)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flink.util.FlinkException: Could not restore keyed state backend for KeyedCoProcessOperator_c535ac415eeb524d67c88f4a481077d2_(6/8) from any of the 1 provided restore options.
at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:135)
at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.keyedStatedBackend(StreamTaskStateInitializerImpl.java:307)
at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:135)
... 6 common frames omitted
Caused by: org.apache.flink.runtime.state.BackendBuildingException: Failed when trying to restore heap backend
at org.apache.flink.runtime.state.heap.HeapKeyedStateBackendBuilder.build(HeapKeyedStateBackendBuilder.java:116)
at org.apache.flink.runtime.state.memory.MemoryStateBackend.createKeyedStateBackend(MemoryStateBackend.java:347)
at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$keyedStatedBackend$1(StreamTaskStateInitializerImpl.java:291)
at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:142)
at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:121)
... 8 common frames omitted
Caused by: java.lang.NullPointerException: null
at org.apache.flink.api.java.typeutils.runtime.PojoSerializer.<init>(PojoSerializer.java:123)
at org.apache.flink.api.java.typeutils.runtime.PojoSerializer.duplicate(PojoSerializer.java:186)
at org.apache.flink.api.java.typeutils.runtime.PojoSerializer.duplicate(PojoSerializer.java:56)
at org.apache.flink.api.common.typeutils.CompositeSerializer$PrecomputedParameters.precompute(CompositeSerializer.java:228)
at org.apache.flink.api.common.typeutils.CompositeSerializer.<init>(CompositeSerializer.java:51)
at org.apache.flink.runtime.state.ttl.TtlStateFactory$TtlSerializer.<init>(TtlStateFactory.java:250)
at org.apache.flink.runtime.state.ttl.TtlStateFactory$TtlSerializerSnapshot.createOuterSerializerWithNestedSerializers(TtlStateFactory.java:359)
at org.apache.flink.runtime.state.ttl.TtlStateFactory$TtlSerializerSnapshot.createOuterSerializerWithNestedSerializers(TtlStateFactory.java:330)
at org.apache.flink.api.common.typeutils.CompositeTypeSerializerSnapshot.restoreSerializer(CompositeTypeSerializerSnapshot.java:194)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:546)
at java.util.stream.AbstractPipeline.evaluateToArrayNode(AbstractPipeline.java:260)
at java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:505)
at org.apache.flink.api.common.typeutils.NestedSerializersSnapshotDelegate.snapshotsToRestoreSerializers(NestedSerializersSnapshotDelegate.java:225)
at org.apache.flink.api.common.typeutils.NestedSerializersSnapshotDelegate.getRestoredNestedSerializers(NestedSerializersSnapshotDelegate.java:83)
at org.apache.flink.api.common.typ

[jira] [Created] (FLINK-21752) NullPointerException on restore in PojoSerializer

2021-03-12 Thread Roman Khachatryan (Jira)
Roman Khachatryan created FLINK-21752:
-

 Summary: NullPointerException on restore in PojoSerializer
 Key: FLINK-21752
 URL: https://issues.apache.org/jira/browse/FLINK-21752
 Project: Flink
  Issue Type: Bug
  Components: API / Type Serialization System
Affects Versions: 1.9.3
Reporter: Roman Khachatryan


As originally reported in 
[thread|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Schema-Evolution-Cannot-restore-from-savepoint-after-deleting-field-from-POJO-td42162.html],
 after removing a field from a class, restoring from a savepoint fails with the 
following exception:

{code}
2021-03-10T20:51:30.406Z INFO  org.apache.flink.runtime.taskmanager.Task:960 … (6/8) (d630d5ff0d7ae4fbc428b151abebab52) switched from RUNNING to FAILED. 
java.lang.Exception: Exception while creating StreamOperatorStateContext.
at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:195)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:253)
at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:901)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:415)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:705)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:530)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flink.util.FlinkException: Could not restore keyed state backend for KeyedCoProcessOperator_c535ac415eeb524d67c88f4a481077d2_(6/8) from any of the 1 provided restore options.
at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:135)
at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.keyedStatedBackend(StreamTaskStateInitializerImpl.java:307)
at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:135)
... 6 common frames omitted
Caused by: org.apache.flink.runtime.state.BackendBuildingException: Failed when trying to restore heap backend
at org.apache.flink.runtime.state.heap.HeapKeyedStateBackendBuilder.build(HeapKeyedStateBackendBuilder.java:116)
at org.apache.flink.runtime.state.memory.MemoryStateBackend.createKeyedStateBackend(MemoryStateBackend.java:347)
at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$keyedStatedBackend$1(StreamTaskStateInitializerImpl.java:291)
at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:142)
at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:121)
... 8 common frames omitted
Caused by: java.lang.NullPointerException: null
at org.apache.flink.api.java.typeutils.runtime.PojoSerializer.<init>(PojoSerializer.java:123)
at org.apache.flink.api.java.typeutils.runtime.PojoSerializer.duplicate(PojoSerializer.java:186)
at org.apache.flink.api.java.typeutils.runtime.PojoSerializer.duplicate(PojoSerializer.java:56)
at org.apache.flink.api.common.typeutils.CompositeSerializer$PrecomputedParameters.precompute(CompositeSerializer.java:228)
at org.apache.flink.api.common.typeutils.CompositeSerializer.<init>(CompositeSerializer.java:51)
at org.apache.flink.runtime.state.ttl.TtlStateFactory$TtlSerializer.<init>(TtlStateFactory.java:250)
at org.apache.flink.runtime.state.ttl.TtlStateFactory$TtlSerializerSnapshot.createOuterSerializerWithNestedSerializers(TtlStateFactory.java:359)
at org.apache.flink.runtime.state.ttl.TtlStateFactory$TtlSerializerSnapshot.createOuterSerializerWithNestedSerializers(TtlStateFactory.java:330)
at org.apache.flink.api.common.typeutils.CompositeTypeSerializerSnapshot.restoreSerializer(CompositeTypeSerializerSnapshot.java:194)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:546)
at java.util.stream.AbstractPipeline.evaluateToArrayNode(AbstractPipeline.java:260)
at java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:505)
at org.apache.flink.api.common.typeutils.NestedSerializersSnapshotDelegate.snapshotsToRestoreSerial

[GitHub] [flink] XComp commented on pull request #15134: [FLINK-21654][test] Added @throws information to tests

2021-03-12 Thread GitBox


XComp commented on pull request #15134:
URL: https://github.com/apache/flink/pull/15134#issuecomment-797634153


   I applied your comments, squashed the commits and rebased the branch.







[GitHub] [flink] XComp commented on pull request #15134: [FLINK-21654][test] Added @throws information to tests

2021-03-12 Thread GitBox


XComp commented on pull request #15134:
URL: https://github.com/apache/flink/pull/15134#issuecomment-797633919


   > Thanks for creating this PR @XComp. I think the change makes sense. I had 
a few minor comments.
   > 
   > Why is this problem only limited to when we close the `YarnTestBase`? 
According to YARN-7007 it reads as if the failing of 
`YarnClient.getApplications` is not restricted to a specific state/timing?
   
   Are you referring to the PR description here? You're right: it's not limited 
to the `close()` call, and the description was outdated. I realized during the 
implementation that `getApplications` is also called in other situations; this 
is covered by the most recent changes. I updated the PR description accordingly.







[GitHub] [flink] Yanikovic commented on a change in pull request #15140: [FLINK-20628][connectors/rabbitmq2] RabbitMQ connector using new connector API

2021-03-12 Thread GitBox


Yanikovic commented on a change in pull request #15140:
URL: https://github.com/apache/flink/pull/15140#discussion_r593330541



##
File path: 
flink-connectors/flink-connector-rabbitmq2/src/main/java/org/apache/flink/connector/rabbitmq2/sink/RabbitMQSink.java
##
@@ -0,0 +1,229 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.connector.rabbitmq2.sink;
+
+import org.apache.flink.api.common.serialization.DeserializationSchema;
+import org.apache.flink.api.common.serialization.SerializationSchema;
+import org.apache.flink.api.connector.sink.Committer;
+import org.apache.flink.api.connector.sink.GlobalCommitter;
+import org.apache.flink.api.connector.sink.Sink;
+import org.apache.flink.api.connector.sink.SinkWriter;
+import org.apache.flink.connector.rabbitmq2.ConsistencyMode;
+import org.apache.flink.connector.rabbitmq2.RabbitMQConnectionConfig;
+import org.apache.flink.connector.rabbitmq2.sink.state.RabbitMQSinkWriterState;
+import 
org.apache.flink.connector.rabbitmq2.sink.state.RabbitMQSinkWriterStateSerializer;
+import org.apache.flink.connector.rabbitmq2.sink.writer.RabbitMQSinkWriterBase;
+import 
org.apache.flink.connector.rabbitmq2.sink.writer.specalized.RabbitMQSinkWriterAtLeastOnce;
+import 
org.apache.flink.connector.rabbitmq2.sink.writer.specalized.RabbitMQSinkWriterAtMostOnce;
+import 
org.apache.flink.connector.rabbitmq2.sink.writer.specalized.RabbitMQSinkWriterExactlyOnce;
+import org.apache.flink.core.io.SimpleVersionedSerializer;
+import org.apache.flink.util.Preconditions;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import javax.annotation.Nullable;
+
+import java.util.List;
+import java.util.Optional;
+
+/**
+ * RabbitMQ sink (publisher) that publishes messages from upstream flink tasks to a RabbitMQ queue.
+ * It provides at-most-once, at-least-once and exactly-once processing semantics. For at-least-once
+ * and exactly-once, checkpointing needs to be enabled. The sink operates as a StreamingSink and
+ * thus works in a streaming fashion.
+ *
+ * {@code
+ * RabbitMQSink
+ * .builder()
+ * .setConnectionConfig(connectionConfig)
+ * .setQueueName("queue")
+ * .setSerializationSchema(new SimpleStringSchema())
+ * .setConsistencyMode(ConsistencyMode.AT_LEAST_ONCE)
+ * .setMinimalResendInterval(10L)
+ * .build();
+ * }
+ *
+ * When creating the sink, a {@code connectionConfig} must be specified via {@link
+ * RabbitMQConnectionConfig}. It contains required information for the RabbitMQ java client to
+ * connect to the RabbitMQ server. A minimum configuration contains a (virtual) host, a username, a
+ * password and a port. Besides that, the {@code queueName} to publish to and a {@link
+ * SerializationSchema} for the sink input type are required. {@code publishOptions} can be added to
+ * route messages in RabbitMQ.
+ *
+ * If at-least-once is required, an optional number of {@code maxRetry} attempts can be specified
+ * until a failure is triggered. Generally, messages are buffered until an acknowledgement arrives
+ * because delivery needs to be guaranteed. On each checkpoint, all unacknowledged messages will be
+ * resent to RabbitMQ. If the checkpointing interval is set low or a high frequency of resending is
+ * not desired, the {@code minimalResendIntervalMilliseconds} can be specified to prevent the sink
+ * from resending data that has just arrived. In case of a failure, all unacknowledged messages can
+ * be restored and resent.
+ *
+ * In the case of exactly-once, a transactional RabbitMQ channel is used to achieve that all

Review comment:
   Just for clarification:
   The `minimalResendInterval` is used only for at-least-once; `maxRetry` applies to both at-least-once and exactly-once.
   For at-least-once, we don't use a transaction but rather resend all unacknowledged messages on each `snapshotState` invocation.
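The resend-on-snapshot behavior described above can be sketched roughly as follows. All names here (`AtLeastOnceSinkSketch`, `send`, `onAck`, `snapshotState`, `pendingCount`) are illustrative, not the actual connector classes, and real broker acknowledgements arrive asynchronously:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class AtLeastOnceSinkSketch {
    // Messages stay buffered until the broker acknowledges them.
    private final Deque<String> unacknowledged = new ArrayDeque<>();
    private final long minimalResendIntervalMs;
    private long lastResendMs = -1;

    public AtLeastOnceSinkSketch(long minimalResendIntervalMs) {
        this.minimalResendIntervalMs = minimalResendIntervalMs;
    }

    public void send(String message) {
        unacknowledged.add(message); // publish(message) would happen here
    }

    public void onAck(String message) {
        unacknowledged.remove(message);
    }

    // Invoked on each checkpoint: resend every unacked message, unless a
    // resend already happened within minimalResendIntervalMs (throttling).
    // Duplicates are acceptable under at-least-once semantics.
    public List<String> snapshotState(long nowMs) {
        if (lastResendMs >= 0 && nowMs - lastResendMs < minimalResendIntervalMs) {
            return List.of();
        }
        lastResendMs = nowMs;
        return new ArrayList<>(unacknowledged);
    }

    public int pendingCount() {
        return unacknowledged.size();
    }

    public static void main(String[] args) {
        AtLeastOnceSinkSketch sink = new AtLeastOnceSinkSketch(10);
        sink.send("m1");
        sink.send("m2");
        sink.onAck("m1");
        List<String> resent = sink.snapshotState(100);
        if (!resent.equals(List.of("m2"))) {
            throw new AssertionError(resent);
        }
        // A second checkpoint inside the throttle window resends nothing.
        if (!sink.snapshotState(105).isEmpty()) {
            throw new AssertionError("throttling failed");
        }
        System.out.println("resent on checkpoint: " + resent);
    }
}
```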
   
   






[GitHub] [flink] flinkbot edited a comment on pull request #15136: [FLINK-21622][table-planner] Introduce function TO_TIMESTAMP_LTZ(numeric [, precision])

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15136:
URL: https://github.com/apache/flink/pull/15136#issuecomment-795141489


   
   ## CI report:
   
   * fce5cb5d02a79b95ceb3a5bbf41be7d186e2fb85 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14466)
 
   * 38f1cea9a24f244241f6e2a9f069feb7f774652c UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #15054: [FLINK-13550][runtime][ui] Operator's Flame Graph

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15054:
URL: https://github.com/apache/flink/pull/15054#issuecomment-788337524


   
   ## CI report:
   
   * 26a28f2d83f56cb386e1365fd4df4fb8a2f2bf86 UNKNOWN
   * dfb0e42f28b809d98e5dec29e9540111e1aa7b10 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14557)
 
   
   







[GitHub] [flink] flinkbot edited a comment on pull request #15049: [FLINK-21190][runtime-web] Expose exception history

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15049:
URL: https://github.com/apache/flink/pull/15049#issuecomment-787719691


   
   ## CI report:
   
   * 2cbffce55c35f7e163739d07f88e480870a0fc37 UNKNOWN
   * bd732209f8aaa9653dfeaf7be6ff8396ece70928 UNKNOWN
   * ce6fe841168017d9f5a3d6948d69c060514c7ea9 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14495)
 
   * d772ca219b9437388a0996cb5498789153ea8b1b Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14560)
 
   * b8744b240d77fc84dee60042317b79585423d403 UNKNOWN
   
   







[jira] [Commented] (FLINK-21702) Support option `sql-client.verbose` to print the exception stack

2021-03-12 Thread zhuxiaoshang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17300429#comment-17300429
 ] 

zhuxiaoshang commented on FLINK-21702:
--

[~fsk119] [~jark] Please assign this to me.

> Support option `sql-client.verbose` to print the exception stack
> 
>
> Key: FLINK-21702
> URL: https://issues.apache.org/jira/browse/FLINK-21702
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Client
>Affects Versions: 1.13.0
>Reporter: Shengkai Fang
>Priority: Major
>  Labels: starter
>
> By enabling this option, users can get the full exception stack.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-21466) Make "embedded" parameter optional when start sql-client.sh

2021-03-12 Thread zhuxiaoshang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17300427#comment-17300427
 ] 

zhuxiaoshang commented on FLINK-21466:
--

[~fsk119] [~jark] Please assign this to me.

> Make "embedded" parameter optional when start sql-client.sh
> ---
>
> Key: FLINK-21466
> URL: https://issues.apache.org/jira/browse/FLINK-21466
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Shengkai Fang
>Priority: Major
>  Labels: starter
>
> Users can use the command 
> {code:java}
> >./sql-client.sh embedded -i init1.sql,init2.sql
> {code}
>  to start the SQL client without needing to add {{embedded}}.





[GitHub] [flink] flinkbot edited a comment on pull request #15049: [FLINK-21190][runtime-web] Expose exception history

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15049:
URL: https://github.com/apache/flink/pull/15049#issuecomment-787719691


   
   ## CI report:
   
   * 2cbffce55c35f7e163739d07f88e480870a0fc37 UNKNOWN
   * bd732209f8aaa9653dfeaf7be6ff8396ece70928 UNKNOWN
   * ce6fe841168017d9f5a3d6948d69c060514c7ea9 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14495)
 
   * d772ca219b9437388a0996cb5498789153ea8b1b UNKNOWN
   
   







[GitHub] [flink] tillrohrmann commented on a change in pull request #14948: [FLINK-21333][coordination] Add StopWithSavepoint state to declarative scheduler

2021-03-12 Thread GitBox


tillrohrmann commented on a change in pull request #14948:
URL: https://github.com/apache/flink/pull/14948#discussion_r593279092



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/adaptive/Executing.java
##
@@ -143,6 +145,26 @@ public void notifyNewResourcesAvailable() {
 }
 }
 
+CompletableFuture<String> stopWithSavepoint(
+@Nullable final String targetDirectory, boolean terminate) {
+final ExecutionGraph executionGraph = getExecutionGraph();
+
+StopWithSavepointOperationManager.checkStopWithSavepointPreconditions(
+executionGraph.getCheckpointCoordinator(),
+targetDirectory,
+executionGraph.getJobID(),
+getLogger());
+
+getLogger().info("Triggering stop-with-savepoint for job {}.", executionGraph.getJobID());

Review comment:
   nit: This sanity check could also go into the 
`AdaptiveScheduler.goToStopWithSavepoint` implementation if one argues that it 
is not the responsibility of the `Executing` state to decide whether this 
operation is possible or not.

##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/adaptive/StopWithSavepointOperationManagerForAdaptiveScheduler.java
##
@@ -0,0 +1,117 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.scheduler.adaptive;
+
+import org.apache.flink.runtime.checkpoint.CompletedCheckpoint;
+import org.apache.flink.runtime.checkpoint.StopWithSavepointOperations;
+import org.apache.flink.runtime.concurrent.FutureUtils;
+import org.apache.flink.runtime.execution.ExecutionState;
+import org.apache.flink.runtime.executiongraph.Execution;
+import org.apache.flink.runtime.executiongraph.ExecutionGraph;
+import 
org.apache.flink.runtime.scheduler.stopwithsavepoint.StopWithSavepointOperationHandler;
+import 
org.apache.flink.runtime.scheduler.stopwithsavepoint.StopWithSavepointOperationHandlerImpl;
+import org.apache.flink.util.Preconditions;
+
+import org.slf4j.Logger;
+
+import javax.annotation.Nullable;
+
+import java.util.Collection;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.Executor;
+import java.util.stream.Collectors;
+
+/**
+ * A thin wrapper around the StopWithSavepointOperationHandler to handle global failures according
+ * to the needs of AdaptiveScheduler. AdaptiveScheduler currently doesn't support local failover;
+ * hence, any reported failure will lead to a transition to the Restarting or Failing state.
+ */
class StopWithSavepointOperationManagerForAdaptiveScheduler implements GlobalFailureHandler {
+private final StopWithSavepointOperationHandler handler;
+private final ExecutionGraph executionGraph;
+private final StopWithSavepointOperations stopWithSavepointOperations;
+private boolean operationFinished = false;
+
+StopWithSavepointOperationManagerForAdaptiveScheduler(
+ExecutionGraph executionGraph,
+StopWithSavepointOperations stopWithSavepointOperations,
+boolean terminate,
+@Nullable String targetDirectory,
+Executor mainThreadExecutor,
+Logger logger) {
+
+this.stopWithSavepointOperations = stopWithSavepointOperations;
+
+// do not trigger checkpoints while creating the final savepoint. We will start the
+// scheduler onLeave() again.
+stopWithSavepointOperations.stopCheckpointScheduler();
+
+final CompletableFuture<CompletedCheckpoint> savepointFuture =
+stopWithSavepointOperations.triggerSynchronousSavepoint(terminate, targetDirectory);
+
+this.handler =
+new StopWithSavepointOperationHandlerImpl(
+executionGraph.getJobID(), this, stopWithSavepointOperations, logger);
+
+FutureUtils.assertNoException(
+savepointFuture
+// the completedSavepointFuture could also be completed by
+// CheckpointCanceller which doesn't run in the mainThreadExecutor

Review comment:
   This is a very good finding.
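The ordering the quoted constructor enforces — pause periodic checkpoints, trigger the synchronous savepoint, and only resume checkpointing if the operation fails — can be sketched as follows. The `Ops` interface is hypothetical; only the method names are taken from the code under review:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class StopWithSavepointSketch {
    interface Ops {
        void stopCheckpointScheduler();
        void startCheckpointScheduler();
        CompletableFuture<String> triggerSynchronousSavepoint(boolean terminate, String targetDirectory);
    }

    static CompletableFuture<String> stopWithSavepoint(Ops ops, boolean terminate, String dir) {
        // 1. No periodic checkpoints may fire while the final savepoint is in flight.
        ops.stopCheckpointScheduler();
        // 2. Trigger the synchronous savepoint.
        return ops.triggerSynchronousSavepoint(terminate, dir)
                // 3. If it fails, the job keeps running, so checkpointing must resume.
                .whenComplete((path, failure) -> {
                    if (failure != null) {
                        ops.startCheckpointScheduler();
                    }
                });
    }

    public static void main(String[] args) {
        List<String> calls = new ArrayList<>();
        Ops ops = new Ops() {
            public void stopCheckpointScheduler() { calls.add("stop"); }
            public void startCheckpointScheduler() { calls.add("start"); }
            public CompletableFuture<String> triggerSynchronousSavepoint(boolean t, String d) {
                calls.add("savepoint");
                return CompletableFuture.completedFuture("file:/tmp/savepoint");
            }
        };
        stopWithSavepoint(ops, true, "file:/tmp");
        // On success the scheduler is not restarted here (the state is left anyway).
        if (!calls.equals(List.of("stop", "savepoint"))) {
            throw new AssertionError(calls);
        }
    }
}
```

The thread-confinement concern raised in the review (the future may complete outside the main thread) is exactly why the real code re-dispatches the continuation, which this single-threaded sketch omits.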

##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/sch

[GitHub] [flink] flinkbot edited a comment on pull request #15170: [FLINK-21610][tests] Do not attempt to cancel the job

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15170:
URL: https://github.com/apache/flink/pull/15170#issuecomment-797471124


   
   ## CI report:
   
   * 8f387ae8f7bf9ef74c971efd22704d85eec8e68f Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14547)
 
   
   







[GitHub] [flink] flinkbot edited a comment on pull request #15154: [FLINK-21725][core] Name constructor arguments of Tuples like fields

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15154:
URL: https://github.com/apache/flink/pull/15154#issuecomment-796737826


   
   ## CI report:
   
   * 3040b7bdca9f57ca1a34a89cb057ff7b4d7124fe Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14476)
 
   * 4572f80bab02248b8d5d1f8d2bda99f9a933e17f Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14559)
 
   
   







[GitHub] [flink] rionmonster commented on a change in pull request #15165: [FLINK-16829] Refactored Prometheus Metric Reporters to Use Constructor Initialization

2021-03-12 Thread GitBox


rionmonster commented on a change in pull request #15165:
URL: https://github.com/apache/flink/pull/15165#discussion_r593281214



##
File path: 
flink-metrics/flink-metrics-prometheus/src/main/java/org/apache/flink/metrics/prometheus/PrometheusPushGatewayReporterFactory.java
##
@@ -22,13 +22,38 @@
 
 import java.util.Properties;
 
+import static org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporterOptions.DELETE_ON_SHUTDOWN;
+import static org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporterOptions.GROUPING_KEY;
+import static org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporterOptions.HOST;
+import static org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporterOptions.JOB_NAME;
+import static org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporterOptions.PORT;
+import static org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporterOptions.RANDOM_JOB_NAME_SUFFIX;
+
 /** {@link MetricReporterFactory} for {@link PrometheusPushGatewayReporter}. */
 @InterceptInstantiationViaReflection(
 reporterClassName = "org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter")
 public class PrometheusPushGatewayReporterFactory implements MetricReporterFactory {
 
 @Override
 public PrometheusPushGatewayReporter createMetricReporter(Properties properties) {
-return new PrometheusPushGatewayReporter();
+String hostConfig = properties.getString(HOST.key(), HOST.defaultValue());
+int portConfig = properties.getInteger(PORT.key(), PORT.defaultValue());
+String jobNameConfig = properties.getString(JOB_NAME.key(), JOB_NAME.defaultValue())

Review comment:
   I've been working primarily with Kotlin over the last year or so, so semicolons aren't my friends. ;)









[GitHub] [flink] rionmonster commented on a change in pull request #15165: [FLINK-16829] Refactored Prometheus Metric Reporters to Use Constructor Initialization

2021-03-12 Thread GitBox


rionmonster commented on a change in pull request #15165:
URL: https://github.com/apache/flink/pull/15165#discussion_r593279557



##
File path: 
flink-metrics/flink-metrics-prometheus/src/main/java/org/apache/flink/metrics/prometheus/PrometheusPushGatewayReporter.java
##
@@ -56,43 +56,42 @@
 private boolean deleteOnShutdown;
 private Map<String, String> groupingKey;
 
-@Override
-public void open(MetricConfig config) {
-super.open(config);
-
-String host = config.getString(HOST.key(), HOST.defaultValue());
-int port = config.getInteger(PORT.key(), PORT.defaultValue());
-String configuredJobName = config.getString(JOB_NAME.key(), JOB_NAME.defaultValue());
-boolean randomSuffix =
-config.getBoolean(
-RANDOM_JOB_NAME_SUFFIX.key(), RANDOM_JOB_NAME_SUFFIX.defaultValue());
-deleteOnShutdown =
-config.getBoolean(DELETE_ON_SHUTDOWN.key(), DELETE_ON_SHUTDOWN.defaultValue());
-groupingKey =
-parseGroupingKey(config.getString(GROUPING_KEY.key(), GROUPING_KEY.defaultValue()));
-
-if (host == null || host.isEmpty() || port < 1) {
+PrometheusPushGatewayReporter(
+@Nullable final String hostConfig,
+@Nullable final int portConfig,
+@Nullable final String jobNameConfig,
+@Nullable final boolean randomJobSuffixConfig,
+@Nullable final boolean deleteOnShutdownConfig,
+@Nullable final Map<String, String> groupingKeyConfig
+) {
+deleteOnShutdown = deleteOnShutdownConfig
+groupingKey = parseGroupingKey(groupingKeyConfig)
+
+if (hostConfig == null || hostConfig.isEmpty() || portConfig < 1) {
 throw new IllegalArgumentException(
-"Invalid host/port configuration. Host: " + host + " Port: " + port);
+"Invalid host/port configuration. Host: " + hostConfig + " Port: " + portConfig);
 }
 
-if (randomSuffix) {
-this.jobName = configuredJobName + new AbstractID();
+if (randomJobSuffixConfig) {
+this.jobName = jobNameConfig + new AbstractID();
 } else {
-this.jobName = configuredJobName;
+this.jobName = jobNameConfig;
 }
 
-pushGateway = new PushGateway(host + ':' + port);
+pushGateway = new PushGateway(hostConfig + ':' + portConfig);
 log.info(
 "Configured PrometheusPushGatewayReporter with {host:{}, port:{}, jobName:{}, randomJobNameSuffix:{}, deleteOnShutdown:{}, groupingKey:{}}",
-host,
-port,
-jobName,
-randomSuffix,
+hostConfig,
+portConfig,
+jobNameConfig,
+randomJobSuffixConfig,
 deleteOnShutdown,
 groupingKey);
 }
 
+@Override
+public void open(MetricConfig config) { }
+
 Map<String, String> parseGroupingKey(final String groupingKeyConfig) {

Review comment:
   Thanks for the feedback!
   
   This makes sense from a testability perspective. Since the factory only 
presently has a `createMetricReporter` method defined on the interface, I'm 
guessing this static method would just support the sanitization / cleanup calls 
prior to just calling the other method?
   
   Something like:
   
   ```
   static PrometheusPushGatewayReporter nameTbd(Properties properties)
   {
   // Check for nulls, general sanitization code here
   
   return createMetricReporter(...);
   }
   
   @Override
   public PrometheusPushGatewayReporter createMetricReporter(Properties properties) {
   // Omitted for brevity 
   return new PrometheusPushGatewayReporter(...);
   }
   ```
   
   Because otherwise, it seems like `open` (or whatever it ends up being named) would circumvent the need for a factory, unless I'm misunderstanding what the static method would do.









[GitHub] [flink] flinkbot edited a comment on pull request #15163: [FLINK-21710][table-planner-blink] FlinkRelMdUniqueKeys gets incorrect result on TableScan after project push-down

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15163:
URL: https://github.com/apache/flink/pull/15163#issuecomment-797203134


   
   ## CI report:
   
   * 6563e261e69a04e6428e7cd766273e5a3967abe9 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14541)
 
   
   







[GitHub] [flink] flinkbot edited a comment on pull request #15154: [FLINK-21725][core] Name constructor arguments of Tuples like fields

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15154:
URL: https://github.com/apache/flink/pull/15154#issuecomment-796737826


   
   ## CI report:
   
   * 3040b7bdca9f57ca1a34a89cb057ff7b4d7124fe Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14476)
 
   * 4572f80bab02248b8d5d1f8d2bda99f9a933e17f UNKNOWN
   
   







[GitHub] [flink] flinkbot edited a comment on pull request #15144: [FLINK-21709][table] Officially deprecate the legacy planner

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15144:
URL: https://github.com/apache/flink/pull/15144#issuecomment-796504890


   
   ## CI report:
   
   * cf09b912a5b2e94d7df1339d71f532568216e308 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14483)
   * 8296846a743067953fbfd108c22c3fb627f0ef09 UNKNOWN
   * ef1b016a64f7755582e49e0e2a5d7d67a5e7dc16 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14558)
 
   
   







[GitHub] [flink] flinkbot edited a comment on pull request #15054: [FLINK-13550][runtime][ui] Operator's Flame Graph

2021-03-12 Thread GitBox


flinkbot edited a comment on pull request #15054:
URL: https://github.com/apache/flink/pull/15054#issuecomment-788337524


   
   ## CI report:
   
   * 26a28f2d83f56cb386e1365fd4df4fb8a2f2bf86 UNKNOWN
   * abab5960a130986aca48bd51aec2b897205106c9 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14545)
   * dfb0e42f28b809d98e5dec29e9540111e1aa7b10 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14557)
 
   
   







[GitHub] [flink] tillrohrmann commented on a change in pull request #15153: [FLINK-21707][runtime] Do not trigger scheduling of non-CREATED regions in PipelinedRegionSchedulingStrategy

2021-03-12 Thread GitBox


tillrohrmann commented on a change in pull request #15153:
URL: https://github.com/apache/flink/pull/15153#discussion_r593272810



##
File path: 
flink-tests/src/test/java/org/apache/flink/test/scheduling/PipelinedRegionSchedulingITCase.java
##
@@ -204,4 +248,22 @@ public void invoke() throws Exception {
 }
 }
 }
+
+/** Invokable which fails exactly once with a {@link PartitionException}. */
+public static class OneTimeFailingReceiverWithPartitionException extends AbstractInvokable {
+
+private static final AtomicBoolean hasFailed = new AtomicBoolean(false);

Review comment:
   I would suggest making this resettable. Otherwise it won't be possible to run `testRecoverFromPartitionException` repeatedly from your IDE. Only the first run would throw the `PartitionNotFoundException`; all other runs would simply be a `NoOpInvokable`.
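
   A resettable variant could look roughly like this (hypothetical sketch, not the actual test code; the class and method names are illustrative):

   ```java
   import java.util.concurrent.atomic.AtomicBoolean;

   // Hypothetical sketch of a resettable one-time failure flag. Calling reset()
   // in a @Before method lets the test run repeatedly from the IDE.
   class OneTimeFailureFlag {
       private static final AtomicBoolean hasFailed = new AtomicBoolean(false);

       /** Returns true exactly once per reset; subsequent calls return false. */
       static boolean shouldFailNow() {
           return hasFailed.compareAndSet(false, true);
       }

       /** Clears the flag so the next shouldFailNow() call returns true again. */
       static void reset() {
           hasFailed.set(false);
       }
   }
   ```

   The `compareAndSet` ensures only one invokable instance observes the failure even when several subtasks race on the static flag.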









[GitHub] [flink] knaufk commented on a change in pull request #15064: [FLINK-21074][FLINK-21076][docs][coord] Document Reactive Mode

2021-03-12 Thread GitBox


knaufk commented on a change in pull request #15064:
URL: https://github.com/apache/flink/pull/15064#discussion_r593257043



##
File path: docs/content/docs/deployment/elastic_scaling.md
##
@@ -0,0 +1,107 @@
+---
+title: Elastic Scaling
+weight: 5
+type: docs
+
+---
+
+
+# Elastic Scaling
+
+Flink allows you to adjust your cluster size to your workloads. This is possible manually by stopping a job with a savepoint and restarting it from the savepoint with a different parallelism.
+
+This page describes options where Flink automatically adjusts the parallelism.
+
+## Reactive Mode
+
+{{< hint danger >}}
+Reactive mode is an MVP ("minimum viable product") feature. The Flink community is actively looking for feedback from users through our mailing lists. Please check the limitations listed on this page.
+{{< /hint >}}
+
+Reactive Mode configures Flink so that it always uses all resources made available to Flink. Adding a TaskManager will scale up your job, removing resources will scale it down. Flink will manage the parallelism of a job's operators, always setting them to the highest possible values.

Review comment:
   ```suggestion
   Reactive Mode configures a job so that it always uses all resources available in the cluster. Adding a TaskManager will scale up your job, removing resources will scale it down. Flink will manage the parallelism of the job, always setting it to the highest possible value.
   ```

##
File path: docs/content/docs/deployment/elastic_scaling.md
##
@@ -0,0 +1,107 @@
+---
+title: Elastic Scaling
+weight: 5
+type: docs
+
+---
+
+
+# Elastic Scaling
+
+Flink allows you to adjust your cluster size to your workloads. This is possible manually by stopping a job with a savepoint and restarting it from the savepoint with a different parallelism.
+
+This page describes options where Flink automatically adjusts the parallelism.
+
+## Reactive Mode
+
+{{< hint danger >}}
+Reactive mode is an MVP ("minimum viable product") feature. The Flink community is actively looking for feedback from users through our mailing lists. Please check the limitations listed on this page.
+{{< /hint >}}
+
+Reactive Mode configures Flink so that it always uses all resources made available to Flink. Adding a TaskManager will scale up your job, removing resources will scale it down. Flink will manage the parallelism of a job's operators, always setting them to the highest possible values.
+
+The Reactive Mode allows Flink users to implement a powerful autoscaling mechanism, by having an external service monitor certain metrics, such as consumer lag, aggregate CPU utilization, throughput or latency. As soon as these metrics are above or below a certain threshold, additional TaskManagers can be added or removed from the Flink cluster. This could be implemented through changing the [replica factor](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#replicas) of a Kubernetes deployment, or an [autoscaling](https://docs.aws.amazon.com/autoscaling/ec2/userguide/AutoScalingGroup.html) group. This external service only needs to handle the resource allocation and deallocation. Flink will take care of keeping the job running with the resources available.
+ 
+### Getting started
+
+If you just want to try out Reactive Mode, follow these instructions. They assume that you are deploying Flink on one machine.
+
+```bash
+
+# these instructions assume you are in the root directory of a Flink distribution.
+
+# Put Job into lib/ directory
+cp ./examples/streaming/TopSpeedWindowing.jar lib/
+# Submit Job in Reactive Mode
+./bin/standalone-job.sh start -Drestart-strategy=fixeddelay -Drestart-strategy.fixed-delay.attempts=10 -Dscheduler-mode=reactive -j org.apache.flink.streaming.examples.windowing.TopSpeedWindowing
+# Start first TaskManager
+./bin/taskmanager.sh start
+```
+
+Let's quickly examine the used submission command:
+- `./bin/standalone-job.sh start` deploys Flink in [Application Mode]({{< ref "docs/deployment/overview" >}}#application-mode)
+- `-Drestart-strategy=fixeddelay` and `-Drestart-strategy.fixed-delay.attempts=10` configure the job to restart on failure. This is needed for supporting scale-down.
+- `-Dscheduler-mode=reactive` enables Reactive Mode.
+- the last argument is passing the Job's main class name.
+
+You have now started a Flink job in Reactive Mode. The [web interface](http://localhost:8081) shows that the job is running on one TaskManager. If you want to scale up the job, simply add another TaskManager to the cluster:
+```bash
+# Start additional TaskManager
+./bin/taskmanager.sh start
+```
+
+To scale down, remove a TaskManager instance.
+```bash
+# Remove a TaskManager
+./bin/taskmanager.sh stop
+```
+
+### Usage
+
+ Configuration
+
+To enable Reactive Mode, you need to configure `scheduler-mode` to `reactive`.

Review comment:
Why is this configuration called "scheduler-mode"? Is it a 
