[jira] [Created] (FLINK-19688) Flink job gets into restart loop caused by InterruptedExceptions

2020-10-17 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-19688:
--

 Summary: Flink job gets into restart loop caused by 
InterruptedExceptions
 Key: FLINK-19688
 URL: https://issues.apache.org/jira/browse/FLINK-19688
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Network, Runtime / Task
Affects Versions: 1.12.0
Reporter: Robert Metzger


I have a benchmarking test job that throws RuntimeExceptions at any operator 
at a configured, random interval. With low intervals, such as a mean failure 
interval of 60 s, the job gets into a state where it frequently fails with 
InterruptedExceptions.

The same job does not have this problem on Flink 1.11.2 (at least not after 
running the job for 15 hours; on 1.12-SNAPSHOT, it happens within a few minutes).
This is the job: 
https://github.com/rmetzger/flip1-bench/blob/master/flip1-bench-jobs/src/main/java/com/ververica/TPCHQuery3.java
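
A minimal sketch of the failure-injection pattern described above (the class 
name, constructor parameter, and the uniform-random scheduling are assumptions 
for illustration; the linked job contains the actual KillerClientMapper):
{code}
import java.util.Random;

import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;

/** Pass-through mapper that throws a RuntimeException at a random time
 *  around a configured mean failure interval (hypothetical sketch). */
public class RandomFailureMapper<T> extends RichMapFunction<T, T> {

    private final long meanFailureIntervalMs; // e.g. 60_000 for a 60 s mean
    private transient long nextFailureTime;

    public RandomFailureMapper(long meanFailureIntervalMs) {
        this.meanFailureIntervalMs = meanFailureIntervalMs;
    }

    @Override
    public void open(Configuration parameters) {
        // Schedule the injected failure uniformly in [0, 2 * mean], which
        // yields the configured mean failure interval on average.
        nextFailureTime = System.currentTimeMillis()
                + (long) (new Random().nextDouble() * 2 * meanFailureIntervalMs);
    }

    @Override
    public T map(T value) {
        if (System.currentTimeMillis() >= nextFailureTime) {
            throw new RuntimeException("Injected failure for restart benchmarking");
        }
        return value;
    }
}
{code}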

This is the exception:
{code}
2020-10-16 16:02:15,653 WARN  org.apache.flink.runtime.taskmanager.Task 
   [] - CHAIN GroupReduce (GroupReduce at main(TPCHQuery3.java:199)) -> 
Map (Map at appendMapper(KillerClientMapper.java:38)) (8/8)#1 
(06d656f696bf4ed98831938a7ac2359d_c1c4a56fea0536703d37867c057f0cc8_7_1) 
switched from RUNNING to FAILED.
java.lang.Exception: The data preparation for task 'CHAIN GroupReduce 
(GroupReduce at main(TPCHQuery3.java:199)) -> Map (Map at 
appendMapper(KillerClientMapper.java:38))' , caused an error: 
java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error 
obtaining the sorted input: Thread 'SortMerger Reading Thread' terminated due 
to an exception: Connection for partition 
060d457c4163472f65a4b741993c83f8#0@06d656f696bf4ed98831938a7ac2359d_0bcc9fbf9ac242d5aac540917d980e44_0_1
 not reachable.
at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:481) 
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at 
org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:370) 
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722) 
[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:547) 
[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_222]
Caused by: org.apache.flink.util.WrappingRuntimeException: 
java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error 
obtaining the sorted input: Thread 'SortMerger Reading Thread' terminated due 
to an exception: Connection for partition 
060d457c4163472f65a4b741993c83f8#0@06d656f696bf4ed98831938a7ac2359d_0bcc9fbf9ac242d5aac540917d980e44_0_1
 not reachable.
at 
org.apache.flink.runtime.operators.sort.ExternalSorter.getIterator(ExternalSorter.java:253)
 ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at 
org.apache.flink.runtime.operators.BatchTask.getInput(BatchTask.java:1122) 
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at 
org.apache.flink.runtime.operators.GroupReduceDriver.prepare(GroupReduceDriver.java:99)
 ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:475) 
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
... 4 more
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
Error obtaining the sorted input: Thread 'SortMerger Reading Thread' terminated 
due to an exception: Connection for partition 
060d457c4163472f65a4b741993c83f8#0@06d656f696bf4ed98831938a7ac2359d_0bcc9fbf9ac242d5aac540917d980e44_0_1
 not reachable.
at 
java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) 
~[?:1.8.0_222]
at 
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895) 
~[?:1.8.0_222]
at 
org.apache.flink.runtime.operators.sort.ExternalSorter.getIterator(ExternalSorter.java:250)
 ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at 
org.apache.flink.runtime.operators.BatchTask.getInput(BatchTask.java:1122) 
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at 
org.apache.flink.runtime.operators.GroupReduceDriver.prepare(GroupReduceDriver.java:99)
 ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:475) 
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
... 4 more
Caused by: java.lang.RuntimeException: Error obtaining the sorted input: Thread 
'SortMerger Reading Thread' terminated due to an exception: Connection for 
partition 
060d457c4163472f65a4b741993c83f8#0@06d656f696bf4ed98831938a7ac2359d_0bcc9fbf9ac242d5aac540917d980e44_0_1
 not reachable.
at 
org.apache.flink.runtime.operators.sort.ExternalSorter.lambda$getIterator$1(ExternalSorter.java:247)
 ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
...
{code}

[jira] [Created] (FLINK-19665) AlternatingCheckpointBarrierHandlerTest.testMetricsAlternation

2020-10-15 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-19665:
--

 Summary: 
AlternatingCheckpointBarrierHandlerTest.testMetricsAlternation
 Key: FLINK-19665
 URL: https://issues.apache.org/jira/browse/FLINK-19665
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Checkpointing
Affects Versions: 1.12.0
Reporter: Robert Metzger


https://dev.azure.com/rmetzger/Flink/_build/results?buildId=8479=logs=6e58d712-c5cc-52fb-0895-6ff7bd56c46b=f30a8e80-b2cf-535c-9952-7f521a4ae374
{code}
[ERROR] Tests run: 9, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.115 s 
<<< FAILURE! - in 
org.apache.flink.streaming.runtime.io.AlternatingCheckpointBarrierHandlerTest
[ERROR] 
testMetricsAlternation(org.apache.flink.streaming.runtime.io.AlternatingCheckpointBarrierHandlerTest)
  Time elapsed: 0.017 s  <<< FAILURE!
java.lang.AssertionError: 

Expected: a value less than or equal to <74001L>
 but: <137102L> was greater than <74001L>
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
at 
org.apache.flink.streaming.runtime.io.AlternatingCheckpointBarrierHandlerTest.assertMetrics(AlternatingCheckpointBarrierHandlerTest.java:212)
at 
org.apache.flink.streaming.runtime.io.AlternatingCheckpointBarrierHandlerTest.testMetricsAlternation(AlternatingCheckpointBarrierHandlerTest.java:146)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
{code}
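
For context, the failing check is a Hamcrest upper-bound assertion of the 
following shape (a self-contained sketch; the variable names are assumptions, 
the concrete values are taken from the failure message above):
{code}
import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.Matchers.lessThanOrEqualTo;

public class MetricBoundSketch {
    public static void main(String[] args) {
        long upperBound = 74001L;      // bound from the failure message above
        long reportedMetric = 137102L; // value reported by the barrier handler
        // Fails exactly as logged: <137102L> was greater than <74001L>
        assertThat(reportedMetric, lessThanOrEqualTo(upperBound));
    }
}
{code}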



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-19664) Upload logs before end to end tests time out

2020-10-15 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-19664:
--

 Summary: Upload logs before end to end tests time out
 Key: FLINK-19664
 URL: https://issues.apache.org/jira/browse/FLINK-19664
 Project: Flink
  Issue Type: Improvement
  Components: Build System / Azure Pipelines
Affects Versions: 1.12.0
Reporter: Robert Metzger


Due to a bug in Azure Pipelines, we cannot see the e2e output when a run times 
out.
This ticket is about adding tooling to rescue the logs before it's too late.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-19658) Local recovery and sticky scheduling end-to-end test hangs with "Expected to find info here."

2020-10-15 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-19658:
--

 Summary: Local recovery and sticky scheduling end-to-end test 
hangs with "Expected to find info here."
 Key: FLINK-19658
 URL: https://issues.apache.org/jira/browse/FLINK-19658
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.12.0
Reporter: Robert Metzger
Assignee: Robert Metzger


The recent e2e test hangs all seem to be caused by the "Local recovery and 
sticky scheduling" end-to-end test.

It is stuck in a restart loop with this error:
{code}
2020-10-15T13:01:42.4079891Z 2020-10-15 12:54:06,099 INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph   [] - Flat Map -> 
Sink: Unnamed (1/4) 
(78a56f7797be1d41b0b1b31a75bd90e1_20ba6b65f97481d5570070de90e4e791_0_1) 
switched from RUNNING to FAILED on 
org.apache.flink.runtime.jobmaster.slotpool.SingleLogicalSlot@65b70d8d.
2020-10-15T13:01:42.4080637Z java.lang.NullPointerException: Expected to find 
info here.
2020-10-15T13:01:42.4081365Zat 
org.apache.flink.util.Preconditions.checkNotNull(Preconditions.java:78) 
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-10-15T13:01:42.4082067Zat 
org.apache.flink.streaming.tests.StickyAllocationAndLocalRecoveryTestJob$StateCreatingFlatMap.initializeState(StickyAllocationAndLocalRecoveryTestJob.java:343)
 ~[?:?]
2020-10-15T13:01:42.4083125Zat 
org.apache.flink.streaming.util.functions.StreamingFunctionUtils.tryRestoreFunction(StreamingFunctionUtils.java:185)
 ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-10-15T13:01:42.4103820Zat 
org.apache.flink.streaming.util.functions.StreamingFunctionUtils.restoreFunctionState(StreamingFunctionUtils.java:167)
 ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-10-15T13:01:42.4104926Zat 
org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.initializeState(AbstractUdfStreamOperator.java:96)
 ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-10-15T13:01:42.4106020Zat 
org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.initializeOperatorState(StreamOperatorStateHandler.java:107)
 ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-10-15T13:01:42.4107084Zat 
org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:262)
 ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-10-15T13:01:42.4108295Zat 
org.apache.flink.streaming.runtime.tasks.OperatorChain.initializeStateAndOpenOperators(OperatorChain.java:400)
 ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-10-15T13:01:42.4109432Zat 
org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$beforeInvoke$0(StreamTask.java:505)
 ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-10-15T13:01:42.4110458Zat 
org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:47)
 ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-10-15T13:01:42.4111428Zat 
org.apache.flink.streaming.runtime.tasks.StreamTask.beforeInvoke(StreamTask.java:501)
 ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-10-15T13:01:42.4112328Zat 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:533) 
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-10-15T13:01:42.4113167Zat 
org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722) 
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-10-15T13:01:42.4113962Zat 
org.apache.flink.runtime.taskmanager.Task.run(Task.java:547) 
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-10-15T13:01:42.4114434Zat java.lang.Thread.run(Thread.java:748) 
~[?:1.8.0_265]
{code}
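
The NullPointerException above is produced by Flink's Preconditions utility; 
the failing pattern in the test job's initializeState has roughly this shape 
(a hedged sketch; 'restoredInfo' is a hypothetical stand-in for the state the 
test expects local recovery to restore):
{code}
import org.apache.flink.util.Preconditions;

public class CheckNotNullSketch {
    public static void main(String[] args) {
        // Hypothetical stand-in for the value checked in
        // StickyAllocationAndLocalRecoveryTestJob.initializeState.
        Object restoredInfo = null;
        // Throws NullPointerException("Expected to find info here.") when null.
        Preconditions.checkNotNull(restoredInfo, "Expected to find info here.");
    }
}
{code}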



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-19657) YARNSessionCapacitySchedulerITCase.checkForProhibitedLogContents: netty failed with java.io.IOException: Broken pipe

2020-10-15 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-19657:
--

 Summary: 
YARNSessionCapacitySchedulerITCase.checkForProhibitedLogContents: netty failed 
with java.io.IOException: Broken pipe
 Key: FLINK-19657
 URL: https://issues.apache.org/jira/browse/FLINK-19657
 Project: Flink
  Issue Type: Bug
  Components: Deployment / YARN
Affects Versions: 1.12.0
Reporter: Robert Metzger


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=7666=logs=fc5181b0-e452-5c8f-68de-1097947f6483=62110053-334f-5295-a0ab-80dd7e2babbf
{code}
2020-10-15T10:41:39.3168991Z [ERROR] Failures: 
2020-10-15T10:41:39.3172085Z [ERROR]   
YARNSessionCapacitySchedulerITCase.checkForProhibitedLogContents:650->YarnTestBase.ensureNoProhibitedStringInLogFiles:479
 Found a file 
/__w/2/s/flink-yarn-tests/target/flink-yarn-tests-capacityscheduler/flink-yarn-tests-capacityscheduler-logDir-nm-1_0/application_1602757819968_0002/container_1602757819968_0002_01_02/taskmanager.log
 with a prohibited string (one of [Exception, Started 
SelectChannelConnector@0.0.0.0:8081]). Excerpts:
2020-10-15T10:41:39.3173934Z [
2020-10-15T10:41:39.3175238Z 2020-10-15 10:31:08,883 INFO  
org.apache.flink.runtime.taskexecutor.DefaultJobLeaderService [] - Remove job 
5a24db6b51d7c697fe30e49aa1a8412c from job leader monitoring.
2020-10-15T10:41:39.3176858Z 2020-10-15 10:31:08,885 INFO  
org.apache.flink.runtime.taskexecutor.TaskExecutor   [] - Close 
JobManager connection for job 5a24db6b51d7c697fe30e49aa1a8412c.
2020-10-15T10:41:39.3178614Z 2020-10-15 10:31:09,531 INFO  
org.apache.flink.yarn.YarnTaskExecutorRunner [] - RECEIVED 
SIGNAL 15: SIGTERM. Shutting down as requested.
2020-10-15T10:41:39.3199037Z 2020-10-15 10:31:09,538 INFO  
org.apache.flink.runtime.blob.PermanentBlobCache [] - Shutting down 
BLOB cache
2020-10-15T10:41:39.3202316Z 2020-10-15 10:31:09,573 INFO  
org.apache.flink.runtime.io.disk.FileChannelManagerImpl  [] - 
FileChannelManager removed spill file directory 
/__w/2/s/flink-yarn-tests/target/flink-yarn-tests-capacityscheduler/flink-yarn-tests-capacityscheduler-localDir-nm-1_0/usercache/agent07_azpcontainer/appcache/application_1602757819968_0002/flink-io-fa9b2b85-fbe7-429c-85a9-6fcfb4811b01
2020-10-15T10:41:39.3204499Z 2020-10-15 10:31:09,577 INFO  
org.apache.flink.runtime.state.TaskExecutorLocalStateStoresManager [] - 
Shutting down TaskExecutorLocalStateStoresManager.
2020-10-15T10:41:39.3240288Z 2020-10-15 10:31:09,579 INFO  
org.apache.flink.runtime.io.disk.FileChannelManagerImpl  [] - 
FileChannelManager removed spill file directory 
/__w/2/s/flink-yarn-tests/target/flink-yarn-tests-capacityscheduler/flink-yarn-tests-capacityscheduler-localDir-nm-1_0/usercache/agent07_azpcontainer/appcache/application_1602757819968_0002/flink-netty-shuffle-7e95809e-e862-45b1-892e-09835297c82d
2020-10-15T10:41:39.3244086Z 2020-10-15 10:31:09,594 INFO  
org.apache.flink.runtime.filecache.FileCache [] - removed file 
cache directory 
/__w/2/s/flink-yarn-tests/target/flink-yarn-tests-capacityscheduler/flink-yarn-tests-capacityscheduler-localDir-nm-1_0/usercache/agent07_azpcontainer/appcache/application_1602757819968_0002/flink-dist-cache-ed288b32-5c40-48b9-9bcb-7186f5a7bfc2
2020-10-15T10:41:39.3246636Z 2020-10-15 10:31:09,596 INFO  
org.apache.flink.runtime.blob.TransientBlobCache [] - Shutting down 
BLOB cache
2020-10-15T10:41:39.3248653Z 2020-10-15 10:31:09,661 WARN  
akka.remote.transport.netty.NettyTransport   [] - Remote 
connection to [61b81e62b514/192.168.128.2:39365] failed with 
java.io.IOException: Broken pipe
2020-10-15T10:41:39.3250746Z 2020-10-15 10:31:09,661 WARN  
akka.remote.transport.netty.NettyTransport   [] - Remote 
connection to [61b81e62b514/192.168.128.2:39365] failed with 
java.io.IOException: Broken pipe
2020-10-15T10:41:39.3253551Z 2020-10-15 10:31:09,662 WARN  
akka.remote.transport.netty.NettyTransport   [] - Remote 
connection to [61b81e62b514/192.168.128.2:39365] failed with 
java.io.IOException: Broken pipe
2020-10-15T10:41:39.3255688Z 2020-10-15 10:31:09,662 WARN  
akka.remote.transport.netty.NettyTransport   [] - Remote 
connection to [61b81e62b514/192.168.128.2:39365] failed with 
java.io.IOException: Broken pipe
2020-10-15T10:41:39.3267975Z ]
2020-10-15T10:41:39.3270637Z [ERROR]   
YARNSessionCapacitySchedulerITCase.checkForProhibitedLogContents:650->YarnTestBase.ensureNoProhibitedStringInLogFiles:479
 Found a file 
/__w/2/s/flink-yarn-tests/target/flink-yarn-tests-capacityscheduler/flink-yarn-tests-capacityscheduler-logDir-nm-1_0/application_1602757819968_0002/container_1602757819968_0002_01_02/taskmanager.log
 with a prohibited string (one of [Exception, Started 
SelectChannelConnector@0.0.0.0:8081]). Excerpts:
2020-10-15T10:41:39.3272364Z [
2020-10-15 ...
{code}
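
To illustrate the failure mechanism: the test scans all YARN container logs 
for prohibited substrings, so even benign shutdown-race WARN lines containing 
"IOException" fail the run. A minimal sketch of such a check (class, method, 
and path names are illustrative assumptions, not the actual YarnTestBase code):
{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

public class ProhibitedLogScanSketch {

    // Prohibited strings taken from the failure message above.
    static final List<String> PROHIBITED = Arrays.asList(
            "Exception", "Started SelectChannelConnector@0.0.0.0:8081");

    public static void main(String[] args) throws IOException {
        try (Stream<Path> files = Files.walk(Paths.get("target/flink-yarn-tests"))) {
            files.filter(p -> p.getFileName().toString().endsWith(".log"))
                 .forEach(ProhibitedLogScanSketch::checkFile);
        }
    }

    static void checkFile(Path logFile) {
        try {
            for (String line : Files.readAllLines(logFile)) {
                for (String prohibited : PROHIBITED) {
                    if (line.contains(prohibited)) {
                        // A single WARN line containing e.g. "IOException:
                        // Broken pipe" is enough to fail the whole test.
                        throw new AssertionError("Found a file " + logFile
                                + " with a prohibited string: " + prohibited);
                    }
                }
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
{code}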

[jira] [Created] (FLINK-19635) HBaseConnectorITCase.testTableSourceSinkWithDDL is unstable with a result mismatch

2020-10-14 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-19635:
--

 Summary: HBaseConnectorITCase.testTableSourceSinkWithDDL is 
unstable with a result mismatch
 Key: FLINK-19635
 URL: https://issues.apache.org/jira/browse/FLINK-19635
 Project: Flink
  Issue Type: Bug
  Components: Connectors / HBase
Affects Versions: 1.12.0
Reporter: Robert Metzger


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=7562=logs=d44f43ce-542c-597d-bf94-b0718c71e5e8=03dca39c-73e8-5aaf-601d-328ae5c35f20
{code}
2020-10-14T04:35:36.9268975Z testTableSourceSinkWithDDL[planner = 
BLINK_PLANNER, legacy = 
false](org.apache.flink.connector.hbase2.HBaseConnectorITCase)  Time elapsed: 
3.131 sec  <<< FAILURE!
2020-10-14T04:35:36.9276246Z java.lang.AssertionError: 
expected:<[1,10,Hello-1,100,1.01,false,Welt-1,2019-08-18T19:00,2019-08-18,19:00,12345678.0001,
 
2,20,Hello-2,200,2.02,true,Welt-2,2019-08-18T19:01,2019-08-18,19:01,12345678.0002,
 
3,30,Hello-3,300,3.03,false,Welt-3,2019-08-18T19:02,2019-08-18,19:02,12345678.0003,
 
4,40,null,400,4.04,true,Welt-4,2019-08-18T19:03,2019-08-18,19:03,12345678.0004, 
5,50,Hello-5,500,5.05,false,Welt-5,2019-08-19T19:10,2019-08-19,19:10,12345678.0005,
 
6,60,Hello-6,600,6.06,true,Welt-6,2019-08-19T19:20,2019-08-19,19:20,12345678.0006,
 
7,70,Hello-7,700,7.07,false,Welt-7,2019-08-19T19:30,2019-08-19,19:30,12345678.0007,
 
8,80,null,800,8.08,true,Welt-8,2019-08-19T19:40,2019-08-19,19:40,12345678.0008]>
 but 
was:<[1,10,Hello-1,100,1.01,false,Welt-1,2019-08-18T19:00,2019-08-18,19:00,12345678.0001,
 
2,20,Hello-2,200,2.02,true,Welt-2,2019-08-18T19:01,2019-08-18,19:01,12345678.0002,
 
3,30,Hello-3,300,3.03,false,Welt-3,2019-08-18T19:02,2019-08-18,19:02,12345678.0003]>
2020-10-14T04:35:36.9281340Zat org.junit.Assert.fail(Assert.java:88)
2020-10-14T04:35:36.9282023Zat 
org.junit.Assert.failNotEquals(Assert.java:834)
2020-10-14T04:35:36.9328385Zat 
org.junit.Assert.assertEquals(Assert.java:118)
2020-10-14T04:35:36.9338939Zat 
org.junit.Assert.assertEquals(Assert.java:144)
2020-10-14T04:35:36.9339880Zat 
org.apache.flink.connector.hbase2.HBaseConnectorITCase.testTableSourceSinkWithDDL(HBaseConnectorITCase.java:449)
2020-10-14T04:35:36.9341003Zat 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS][Release 1.12] Stale blockers and build instabilities

2020-10-12 Thread Robert Metzger
> ... https://issues.apache.org/jira/browse/FLINK-19388>,
> https://issues.apache.org/jira/browse/FLINK-19249 <
> https://issues.apache.org/jira/browse/FLINK-19249>)
> - 1 HBase connector (https://issues.apache.org/jira/browse/FLINK-19445 <
> https://issues.apache.org/jira/browse/FLINK-19445>)
> - 1 Application mode (https://issues.apache.org/jira/browse/FLINK-19154 <
> https://issues.apache.org/jira/browse/FLINK-19154>)
> - 1 New source API (https://issues.apache.org/jira/browse/FLINK-19384 <
> https://issues.apache.org/jira/browse/FLINK-19384>)
> - 1 Kinesis (https://issues.apache.org/jira/browse/FLINK-19332 <
> https://issues.apache.org/jira/browse/FLINK-19332>)
>
> == Recent notable build instabilities which still have no owners:
> - New source API
>https://issues.apache.org/jira/browse/FLINK-19253 <
> https://issues.apache.org/jira/browse/FLINK-19253>
> SourceReaderTestBase.testAddSplitToExistingFetcher hangs
>https://issues.apache.org/jira/browse/FLINK-19370 <
> https://issues.apache.org/jira/browse/FLINK-19370>
> FileSourceTextLinesITCase.testContinuousTextFileSource failed as results
> mismatch
>https://issues.apache.org/jira/browse/FLINK-19427 <
> https://issues.apache.org/jira/browse/FLINK-19427>
> SplitFetcherTest.testNotifiesWhenGoingIdleConcurrent is unstable,
>https://issues.apache.org/jira/browse/FLINK-19437 <
> https://issues.apache.org/jira/browse/FLINK-19437>
> FileSourceTextLinesITCase.testContinuousTextFileSource failed with
> "SimpleStreamFormat is not splittable, but found split end (0) different
> from file length (198)"
>https://issues.apache.org/jira/browse/FLINK-19448 <
> https://issues.apache.org/jira/browse/FLINK-19448>
> CoordinatedSourceITCase.testEnumeratorReaderCommunication hangs
> - Runtime/Network
>https://issues.apache.org/jira/browse/FLINK-19426 <
> https://issues.apache.org/jira/browse/FLINK-19426>  End-to-end test
> sometimes fails with PartitionConnectionException
> - Unaligned Checkpoint
>https://issues.apache.org/jira/browse/FLINK-19027 <
> https://issues.apache.org/jira/browse/FLINK-19027>
> UnalignedCheckpointITCase.shouldPerformUnalignedCheckpointOnParallelRemoteChannel
> failed because of test timeout
> - Table
>https://issues.apache.org/jira/browse/FLINK-19340 <
> https://issues.apache.org/jira/browse/FLINK-19340>
> AggregateITCase.testListAggWithDistinct failed with "expected: 2,B, 3,C#A, 4,EF)> but was:"
> - HBase connector
>https://issues.apache.org/jira/browse/FLINK-18570 <
> https://issues.apache.org/jira/browse/FLINK-18570>
> SQLClientHBaseITCase.testHBase fails on azure
> https://issues.apache.org/jira/browse/FLINK-19447 <
> https://issues.apache.org/jira/browse/FLINK-19447>
> HBaseConnectorITCase.HBaseTestingClusterAutoStarter failed with "Master not
> initialized after 20ms"
> - Avro
>https://issues.apache.org/jira/browse/FLINK-19422 <
> https://issues.apache.org/jira/browse/FLINK-19422>  Avro Confluent Schema
> Registry nightly end-to-end test failed with "Register operation timed out;
> error code: 50002"
>
> Regards,
> Dian
>
> > On Sep 21, 2020, at 14:32, Robert Metzger  wrote:
> >
> > Hi all,
> >
> > An update on the release status:
> > 1. We have 35 days = *5 weeks left until feature freeze*
> > 2. There are currently 2 blockers for Flink
> > <https://issues.apache.org/jira/browse/FLINK-19264?filter=12349334>, all
> > making progress
> > 3. We have 72 test instabilities
> > <https://issues.apache.org/jira/browse/FLINK-19237> (down 7 from 2 weeks
> > ago). I have pinged people to help address frequent or critical
> issues.
> >
> > Best,
> > Robert
> >
> >
> > On Mon, Sep 7, 2020 at 10:37 AM Robert Metzger 
> wrote:
> >
> >> Hi all,
> >>
> >> another two weeks have passed. We now have 5 blockers
> >> <https://issues.apache.org/jira/browse/FLINK-18682?filter=12349334> (Up
> >> 3 from 2 weeks ago), but they are all making progress.
> >>
> >> We currently have 79 test-instabilities
> >> <https://issues.apache.org/jira/browse/FLINK-18869?filter=12348580>,
> >> since the last report, a few have been resolved, and some others have
> been
> >> added.
> >> I have checked the tickets, closed some old ones and pinged people to
> help
> >> resolve new or frequent ones.
> >> Except for Kafka, there are no major clusters of test instabilities. Most
> >> failures are tests that fail only rarely, spread across the entire system.
> >>

[jira] [Created] (FLINK-19585) UnalignedCheckpointCompatibilityITCase.test:97->runAndTakeSavepoint

2020-10-12 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-19585:
--

 Summary: 
UnalignedCheckpointCompatibilityITCase.test:97->runAndTakeSavepoint
 Key: FLINK-19585
 URL: https://issues.apache.org/jira/browse/FLINK-19585
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Checkpointing
Affects Versions: 1.12.0
Reporter: Robert Metzger






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-19552) JobManager dies with IllegalStateException SharedSlot (physical request SlotRequestId{%}) has been released

2020-10-09 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-19552:
--

 Summary: JobManager dies with IllegalStateException SharedSlot 
(physical request SlotRequestId{%}) has been released
 Key: FLINK-19552
 URL: https://issues.apache.org/jira/browse/FLINK-19552
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.12.0
Reporter: Robert Metzger


While running some benchmarks that involve a lot of failures, I experienced 
fatal JobManager crashes with the following log:
{code}
2020-10-09 09:01:45,001 INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph   [] - CHAIN Filter 
(Filter at main(TPCHQuery3.java:157)) -> Map (Map at 
appendMapper(KillerClientMapper.java:38)) (2/8) 
(cf993790aa641a2287b42939b3037f75_f25e1269f08c88185d8a3b9caad8d0c0_1_2) 
switched from CREATED to SCHEDULED.
2020-10-09 09:01:45,004 ERROR 
org.apache.flink.runtime.util.FatalExitExceptionHandler  [] - FATAL: Thread 
'flink-akka.actor.default-dispatcher-17' produced an uncaught exception. 
Stopping the process...
java.util.concurrent.CompletionException: java.lang.IllegalStateException: 
SharedSlot (physical request SlotRequestId{214e8e202b9b087388a6a17c6ba9bccf}) 
has been released
at 
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
 ~[?:1.8.0_222]
at 
java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
 ~[?:1.8.0_222]
at 
java.util.concurrent.CompletableFuture.uniRun(CompletableFuture.java:708) 
~[?:1.8.0_222]
at 
java.util.concurrent.CompletableFuture$UniRun.tryFire(CompletableFuture.java:687)
 ~[?:1.8.0_222]
at 
java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
 ~[?:1.8.0_222]
at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:402)
 ~[flip1-bench-jobs-1.0-SNAPSHOT.jar:?]
at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:195)
 ~[flip1-bench-jobs-1.0-SNAPSHOT.jar:?]
at 
org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:74)
 ~[flip1-bench-jobs-1.0-SNAPSHOT.jar:?]
at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152)
 ~[flip1-bench-jobs-1.0-SNAPSHOT.jar:?]
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26) 
[flip1-bench-jobs-1.0-SNAPSHOT.jar:?]
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21) 
[flip1-bench-jobs-1.0-SNAPSHOT.jar:?]
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123) 
[flip1-bench-jobs-1.0-SNAPSHOT.jar:?]
at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21) 
[flip1-bench-jobs-1.0-SNAPSHOT.jar:?]
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170) 
[flip1-bench-jobs-1.0-SNAPSHOT.jar:?]
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) 
[flip1-bench-jobs-1.0-SNAPSHOT.jar:?]
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) 
[flip1-bench-jobs-1.0-SNAPSHOT.jar:?]
at akka.actor.Actor$class.aroundReceive(Actor.scala:517) 
[flip1-bench-jobs-1.0-SNAPSHOT.jar:?]
at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225) 
[flip1-bench-jobs-1.0-SNAPSHOT.jar:?]
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592) 
[flip1-bench-jobs-1.0-SNAPSHOT.jar:?]
at akka.actor.ActorCell.invoke(ActorCell.scala:561) 
[flip1-bench-jobs-1.0-SNAPSHOT.jar:?]
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258) 
[flip1-bench-jobs-1.0-SNAPSHOT.jar:?]
at akka.dispatch.Mailbox.run(Mailbox.scala:225) 
[flip1-bench-jobs-1.0-SNAPSHOT.jar:?]
at akka.dispatch.Mailbox.exec(Mailbox.scala:235) 
[flip1-bench-jobs-1.0-SNAPSHOT.jar:?]
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) 
[flip1-bench-jobs-1.0-SNAPSHOT.jar:?]
at 
akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) 
[flip1-bench-jobs-1.0-SNAPSHOT.jar:?]
at 
akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) 
[flip1-bench-jobs-1.0-SNAPSHOT.jar:?]
at 
akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) 
[flip1-bench-jobs-1.0-SNAPSHOT.jar:?]
Caused by: java.lang.IllegalStateException: SharedSlot (physical request 
SlotRequestId{214e8e202b9b087388a6a17c6ba9bccf}) has been released
at 
org.apache.flink.util.Preconditions.checkState(Preconditions.java:217) 
~[flip1-bench-jobs-1.0-SNAPSHOT.jar:?]
at 
org.apache.flink.runtime.scheduler.SharedSlot.allocateLogicalSlot(SharedSlot.java:126)
 ~[flip1-bench-jobs-1.0-SNAPSHOT.jar:?]
at 
org.apache.flink.runtime.scheduler.SlotSharingExecutionSlotAllocator.allocateLogicalSlotsFromSharedSl...
{code}

[jira] [Created] (FLINK-19534) build_wheels: linux failed with "ValueError: Incompatible component merge"

2020-10-08 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-19534:
--

 Summary: build_wheels: linux failed with "ValueError: Incompatible 
component merge"
 Key: FLINK-19534
 URL: https://issues.apache.org/jira/browse/FLINK-19534
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Affects Versions: 1.12.0
Reporter: Robert Metzger


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=7276=logs=fe7ebddc-3e2f-5c50-79ee-226c8653f218=b2830442-93c7-50ff-36f4-5b3e2dca8c83

{code}
2020-10-07T21:26:02.2736541Z + export PATH
2020-10-07T21:26:02.2737242Z + 
/home/vsts/work/1/s/flink-python/dev/.conda/bin/conda install -c conda-forge 
patchelf=0.11 -y
2020-10-07T21:26:08.7666087Z Collecting package metadata 
(current_repodata.json): ...working... failed
2020-10-07T21:26:08.7767051Z 
2020-10-07T21:26:08.7767792Z # >>>>>>>>>>>>>>>>>>>>>> ERROR REPORT 
<<<<<<<<<<<<<<<<<<<<<<
2020-10-07T21:26:08.7768252Z 
2020-10-07T21:26:08.7768724Z Traceback (most recent call last):
2020-10-07T21:26:08.7770155Z   File 
"/home/vsts/work/1/s/flink-python/dev/.conda/lib/python3.7/site-packages/conda/exceptions.py",
 line 1062, in __call__
2020-10-07T21:26:08.7770935Z return func(*args, **kwargs)
2020-10-07T21:26:08.7771943Z   File 
"/home/vsts/work/1/s/flink-python/dev/.conda/lib/python3.7/site-packages/conda/cli/main.py",
 line 84, in _main
2020-10-07T21:26:08.7772700Z exit_code = do_call(args, p)
2020-10-07T21:26:08.7773701Z   File 
"/home/vsts/work/1/s/flink-python/dev/.conda/lib/python3.7/site-packages/conda/cli/conda_argparse.py",
 line 82, in do_call
2020-10-07T21:26:08.7774481Z exit_code = getattr(module, 
func_name)(args, parser)
2020-10-07T21:26:08.7775977Z   File 
"/home/vsts/work/1/s/flink-python/dev/.conda/lib/python3.7/site-packages/conda/cli/main_install.py",
 line 20, in execute
2020-10-07T21:26:08.067Z install(args, parser, 'install')
2020-10-07T21:26:08.7778141Z   File 
"/home/vsts/work/1/s/flink-python/dev/.conda/lib/python3.7/site-packages/conda/cli/install.py",
 line 256, in install
2020-10-07T21:26:08.7778910Z force_reinstall=context.force_reinstall or 
context.force,
2020-10-07T21:26:08.7780010Z   File 
"/home/vsts/work/1/s/flink-python/dev/.conda/lib/python3.7/site-packages/conda/core/solve.py",
 line 112, in solve_for_transaction
2020-10-07T21:26:08.7780853Z force_remove, force_reinstall)
2020-10-07T21:26:08.7781835Z   File 
"/home/vsts/work/1/s/flink-python/dev/.conda/lib/python3.7/site-packages/conda/core/solve.py",
 line 150, in solve_for_diff
2020-10-07T21:26:08.7782503Z force_remove)
2020-10-07T21:26:08.7783604Z   File 
"/home/vsts/work/1/s/flink-python/dev/.conda/lib/python3.7/site-packages/conda/core/solve.py",
 line 249, in solve_final_state
2020-10-07T21:26:08.7784374Z ssc = self._collect_all_metadata(ssc)
2020-10-07T21:26:08.7785369Z   File 
"/home/vsts/work/1/s/flink-python/dev/.conda/lib/python3.7/site-packages/conda/common/io.py",
 line 88, in decorated
2020-10-07T21:26:08.7798322Z return f(*args, **kwds)
2020-10-07T21:26:08.7799259Z   File 
"/home/vsts/work/1/s/flink-python/dev/.conda/lib/python3.7/site-packages/conda/core/solve.py",
 line 389, in _collect_all_metadata
2020-10-07T21:26:08.7799750Z index, r = self._prepare(prepared_specs)
2020-10-07T21:26:08.7800610Z   File 
"/home/vsts/work/1/s/flink-python/dev/.conda/lib/python3.7/site-packages/conda/core/solve.py",
 line 974, in _prepare
2020-10-07T21:26:08.7801063Z self.subdirs, prepared_specs, 
self._repodata_fn)
2020-10-07T21:26:08.7801942Z   File 
"/home/vsts/work/1/s/flink-python/dev/.conda/lib/python3.7/site-packages/conda/core/index.py",
 line 216, in get_reduced_index
2020-10-07T21:26:08.7802388Z push_record(record)
2020-10-07T21:26:08.7803182Z   File 
"/home/vsts/work/1/s/flink-python/dev/.conda/lib/python3.7/site-packages/conda/core/index.py",
 line 190, in push_record
2020-10-07T21:26:08.7803624Z combined_depends = record.combined_depends
2020-10-07T21:26:08.7804456Z   File 
"/home/vsts/work/1/s/flink-python/dev/.conda/lib/python3.7/site-packages/conda/models/records.py",
 line 326, in combined_depends
2020-10-07T21:26:08.7804901Z MatchSpec(spec, optional=True) for spec in 
self.constrains or ()
2020-10-07T21:26:08.7805615Z   File 
"/home/vsts/work/1/s/flink-python/dev/.conda/lib/python3.7/site-packages/conda/models/match_spec.py",
 line 467, in merge
2020-10-07T21:26:08.7806420Z reduce(lambda x, y: x._merge(y, union), 
group) if len(group) > ...
{code}

[jira] [Created] (FLINK-19518) Duration of running job is shown as 0 in web UI

2020-10-07 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-19518:
--

 Summary: Duration of running job is shown as 0 in web UI
 Key: FLINK-19518
 URL: https://issues.apache.org/jira/browse/FLINK-19518
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Web Frontend
Affects Versions: 1.12.0
Reporter: Robert Metzger


Most likely caused by FLINK-16866, the web UI shows the "Duration" of a 
running job as 0 in the overview. Once you open the job's detail page, you see 
the correct duration.





--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[ANNOUNCE] Apache flink-shaded 12.0 released

2020-10-06 Thread Robert Metzger
Hi all!

The Apache Flink community is very happy to announce the release of Apache
Flink-shaded 12.0.

The flink-shaded project contains a number of shaded dependencies for
Apache Flink.

Apache Flink® is an open-source stream processing framework for
distributed, high-performing, always-available, and accurate data streaming
applications.

The release is available for download at:
https://flink.apache.org/downloads.html

The full release notes are available in Jira:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12348339

We would like to thank all contributors of the Apache Flink community who
made this release possible!

Regards,
Robert Metzger


[RESULT][VOTE] Release flink-shaded 12.0

2020-10-05 Thread Robert Metzger
Thank you all for voting. The wait time has passed.

We have the following binding +1 votes:
- Chesnay
- Gordon
- Robert

There were no -1 votes.

I'll proceed with releasing flink-shaded 12.0.

On Mon, Oct 5, 2020 at 1:57 PM Robert Metzger  wrote:

> +1
>
> - Checked diffs between 1.11 and 1.12:
> https://github.com/apache/flink-shaded/compare/release-11.0...release-12.0-rc1
> - staging repository looks good
>
>
> On Mon, Oct 5, 2020 at 11:12 AM Tzu-Li (Gordon) Tai 
> wrote:
>
>> +1
>>
>> - Checked signatures
>> - Source archives contain no binaries
>> - Built from source (mvn clean verify)
>>
>> Cheers,
>> Gordon
>>
>> On Mon, Oct 5, 2020, 4:33 PM Chesnay Schepler  wrote:
>>
>>> +1
>>>
>>> Checked signatures, license files, and that the listed changes are in
>>> the release.
>>>
>>> On 9/30/2020 4:55 PM, Robert Metzger wrote:
>>> > Hi everyone,
>>> >
>>> > Please review and vote on the release candidate #1 for the version
>>> 12.0, as
>>> > follows:
>>> >
>>> > [ ] +1, Approve the release
>>> > [ ] -1, Do not approve the release (please provide specific comments)
>>> >
>>> >
>>> > The complete staging area is available for your review, which includes:
>>> > * JIRA release notes [1],
>>> > * the official Apache source release to be deployed to dist.apache.org
>>> [2],
>>> > which are signed with the key with fingerprint D9839159 [3],
>>> > * all artifacts to be deployed to the Maven Central Repository [4],
>>> > * source code tag "release-12.0-rc1" [5],
>>> > * website pull request listing the new release [6].
>>> >
>>> > The vote will be open for at least 72 hours. It is adopted by majority
>>> > approval, with at least 3 PMC affirmative votes.
>>> >
>>> > Thanks,
>>> > Robert Metzger
>>> >
>>> > [1]
>>> >
>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12348339
>>> > [2]
>>> https://dist.apache.org/repos/dist/dev/flink/flink-shaded-12.0-rc1/
>>> > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
>>> > [4]
>>> https://repository.apache.org/content/repositories/orgapacheflink-1400
>>> > [5]
>>> https://github.com/apache/flink-shaded/releases/tag/release-12.0-rc1
>>> > [6] https://github.com/apache/flink-web/pull/380
>>> >
>>>
>>>


[jira] [Created] (FLINK-19506) UnalignedCheckpointITCase.shouldPerformUnalignedCheckpointOnNonParallelLocalChannel: "Exceeded checkpoint tolerable failure threshold"

2020-10-05 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-19506:
--

 Summary: 
UnalignedCheckpointITCase.shouldPerformUnalignedCheckpointOnNonParallelLocalChannel:
 "Exceeded checkpoint tolerable failure threshold"
 Key: FLINK-19506
 URL: https://issues.apache.org/jira/browse/FLINK-19506
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Checkpointing
Affects Versions: 1.12.0
Reporter: Robert Metzger


https://dev.azure.com/rmetzger/Flink/_build/results?buildId=8433=logs=584fa981-f71a-5840-1c49-f800c954fe4b=532bf1f8-8c75-59c3-eaad-8c773769bc3a

{code}
2020-10-05T12:30:11.4979736Z [ERROR] Tests run: 6, Failures: 0, Errors: 1, 
Skipped: 0, Time elapsed: 97.533 s <<< FAILURE! - in 
org.apache.flink.test.checkpointing.UnalignedCheckpointITCase
2020-10-05T12:30:11.4980464Z [ERROR] 
shouldPerformUnalignedCheckpointOnNonParallelLocalChannel(org.apache.flink.test.checkpointing.UnalignedCheckpointITCase)
  Time elapsed: 32.406 s  <<< ERROR!
2020-10-05T12:30:11.4980971Z 
org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
2020-10-05T12:30:11.4988360Zat 
org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:147)
2020-10-05T12:30:11.4989659Zat 
org.apache.flink.client.program.PerJobMiniClusterFactory$PerJobMiniClusterJobClient.lambda$getJobExecutionResult$2(PerJobMiniClusterFactory.java:196)
2020-10-05T12:30:11.4990584Zat 
java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616)
2020-10-05T12:30:11.4991620Zat 
java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591)
2020-10-05T12:30:11.499Zat 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
2020-10-05T12:30:11.4992654Zat 
java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975)
2020-10-05T12:30:11.4993153Zat 
org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.lambda$invokeRpc$0(AkkaInvocationHandler.java:229)
2020-10-05T12:30:11.4993661Zat 
java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
2020-10-05T12:30:11.4994133Zat 
java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
2020-10-05T12:30:11.4994590Zat 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
2020-10-05T12:30:11.4995201Zat 
java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975)
2020-10-05T12:30:11.4995781Zat 
org.apache.flink.runtime.concurrent.FutureUtils$1.onComplete(FutureUtils.java:926)
2020-10-05T12:30:11.4996228Zat 
akka.dispatch.OnComplete.internal(Future.scala:264)
2020-10-05T12:30:11.4996985Zat 
akka.dispatch.OnComplete.internal(Future.scala:261)
2020-10-05T12:30:11.4997419Zat 
akka.dispatch.japi$CallbackBridge.apply(Future.scala:191)
2020-10-05T12:30:11.4997855Zat 
akka.dispatch.japi$CallbackBridge.apply(Future.scala:188)
2020-10-05T12:30:11.4998312Zat 
scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
2020-10-05T12:30:11.4998951Zat 
org.apache.flink.runtime.concurrent.Executors$DirectExecutionContext.execute(Executors.java:74)
2020-10-05T12:30:11.4999477Zat 
scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:44)
2020-10-05T12:30:11.548Zat 
scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:252)
2020-10-05T12:30:11.5000504Zat 
akka.pattern.PromiseActorRef.$bang(AskSupport.scala:572)
2020-10-05T12:30:11.5000984Zat 
akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:22)
2020-10-05T12:30:11.5001567Zat 
akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:21)
2020-10-05T12:30:11.5002091Zat 
scala.concurrent.Future$$anonfun$andThen$1.apply(Future.scala:436)
2020-10-05T12:30:11.5002534Zat 
scala.concurrent.Future$$anonfun$andThen$1.apply(Future.scala:435)
2020-10-05T12:30:11.5002976Zat 
scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
2020-10-05T12:30:11.5003460Zat 
akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
2020-10-05T12:30:11.5004015Zat 
akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91)
2020-10-05T12:30:11.5004603Zat 
akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
2020-10-05T12:30:11.5005173Zat 
akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
2020-10-05T12:30:11.5005677Zat 
scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
2020-10-05T12:30:11.5006170Zat 
akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:90)
2020-10-05T12:30:11.5006644Zat 
akka.dispatch.TaskInvocation.run(AbstractDispatche...
{code}

Re: [VOTE] Release flink-shaded 12.0

2020-10-05 Thread Robert Metzger
+1

- Checked diffs between 1.11 and 1.12:
https://github.com/apache/flink-shaded/compare/release-11.0...release-12.0-rc1
- staging repository looks good


On Mon, Oct 5, 2020 at 11:12 AM Tzu-Li (Gordon) Tai 
wrote:

> +1
>
> - Checked signatures
> - Source archives contain no binaries
> - Built from source (mvn clean verify)
>
> Cheers,
> Gordon
>
> On Mon, Oct 5, 2020, 4:33 PM Chesnay Schepler  wrote:
>
>> +1
>>
>> Checked signatures, license files, and that the listed changes are in
>> the release.
>>
>> On 9/30/2020 4:55 PM, Robert Metzger wrote:
>> > Hi everyone,
>> >
>> > Please review and vote on the release candidate #1 for the version
>> 12.0, as
>> > follows:
>> >
>> > [ ] +1, Approve the release
>> > [ ] -1, Do not approve the release (please provide specific comments)
>> >
>> >
>> > The complete staging area is available for your review, which includes:
>> > * JIRA release notes [1],
>> > * the official Apache source release to be deployed to dist.apache.org
>> [2],
>> > which are signed with the key with fingerprint D9839159 [3],
>> > * all artifacts to be deployed to the Maven Central Repository [4],
>> > * source code tag "release-12.0-rc1" [5],
>> > * website pull request listing the new release [6].
>> >
>> > The vote will be open for at least 72 hours. It is adopted by majority
>> > approval, with at least 3 PMC affirmative votes.
>> >
>> > Thanks,
>> > Robert Metzger
>> >
>> > [1]
>> >
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12348339
>> > [2] https://dist.apache.org/repos/dist/dev/flink/flink-shaded-12.0-rc1/
>> > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
>> > [4]
>> https://repository.apache.org/content/repositories/orgapacheflink-1400
>> > [5]
>> https://github.com/apache/flink-shaded/releases/tag/release-12.0-rc1
>> > [6] https://github.com/apache/flink-web/pull/380
>> >
>>
>>


[ANNOUNCE] New PMC member: Zhu Zhu

2020-10-04 Thread Robert Metzger
Hi all!

I'm very happy to announce that Zhu Zhu has joined the Flink PMC!

Zhu is helping the community a lot by creating and validating releases,
contributing to FLIP discussions, and making good code contributions to the
scheduler and related components.

Congratulations and welcome Zhu Zhu!

Regards,
Robert


[VOTE] Release flink-shaded 12.0

2020-09-30 Thread Robert Metzger
Hi everyone,

Please review and vote on the release candidate #1 for the version 12.0, as
follows:

[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)


The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release to be deployed to dist.apache.org [2],
which are signed with the key with fingerprint D9839159 [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag "release-12.0-rc1" [5],
* website pull request listing the new release [6].

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

Thanks,
Robert Metzger

[1]
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12348339
[2] https://dist.apache.org/repos/dist/dev/flink/flink-shaded-12.0-rc1/
[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4] https://repository.apache.org/content/repositories/orgapacheflink-1400
[5] https://github.com/apache/flink-shaded/releases/tag/release-12.0-rc1
[6] https://github.com/apache/flink-web/pull/380


[jira] [Created] (FLINK-19461) yarn-session.sh -jm -tm arguments have no effect

2020-09-29 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-19461:
--

 Summary: yarn-session.sh -jm -tm arguments have no effect
 Key: FLINK-19461
 URL: https://issues.apache.org/jira/browse/FLINK-19461
 Project: Flink
  Issue Type: Bug
  Components: Deployment / YARN
Affects Versions: 1.12.0
Reporter: Robert Metzger


It seems that I can set arbitrary values for the documented {{-jm}} and {{-tm}} 
arguments without them having any effect.

Example: {{./bin/yarn-session.sh -jm 512m -tm 512m}} should fail in my opinion, 
but it starts with the default memory configuration (1280 MB / 1200 MB or so).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-19459) flink-dist won't build locally with newer (3.3+) Maven versions

2020-09-29 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-19459:
--

 Summary: flink-dist won't build locally with newer (3.3+) Maven 
versions
 Key: FLINK-19459
 URL: https://issues.apache.org/jira/browse/FLINK-19459
 Project: Flink
  Issue Type: Bug
  Components: Build System
Affects Versions: 1.12.0
Reporter: Robert Metzger


flink-dist will fail to build with Maven versions other than 3.2.5 because of banned dependencies.

These are the messages you'll see:
{code}
[INFO] --- maven-enforcer-plugin:3.0.0-M1:enforce (ban-unsafe-snakeyaml) @ 
flink-dist_2.11 ---
[WARNING] Rule 0: org.apache.maven.plugins.enforcer.BannedDependencies failed 
with message:
Found Banned Dependency: org.yaml:snakeyaml:jar:1.24
Use 'mvn dependency:tree' to locate the source of the banned dependencies.
[INFO] 
[INFO] Reactor Summary for Flink : 1.12-SNAPSHOT:

...

[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-enforcer-plugin:3.0.0-M1:enforce 
(ban-unsafe-snakeyaml) on project flink-dist_2.11: Some Enforcer rules have 
failed. Look above for specific messages explaining why the rule failed. -> 
[Help 1]
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Release flink-shaded 12.0

2020-09-29 Thread Robert Metzger
It seems that we have consensus to create a flink-shaded release.

I'll propose an RC soon.

On Fri, Sep 25, 2020 at 9:11 AM Konstantin Knauf  wrote:

> +1
>
>
>
> On Wed, Sep 23, 2020 at 9:13 AM Yu Li  wrote:
>
> > +1
> >
> > Best Regards,
> > Yu
> >
> >
> > On Tue, 22 Sep 2020 at 17:49, Robert Metzger 
> wrote:
> >
> > > No concerns from my side.
> > >
> > > On Fri, Sep 18, 2020 at 8:25 AM Chesnay Schepler 
> > > wrote:
> > >
> > > > Hello,
> > > >
> > > > I'd like to kickoff the next release of flink-shaded, which will
> > contain
> > > > a bump to netty (4.1.49) and snakeyaml (1.27).
> > > >
> > > > Any concerns? Are there any other dependencies people want upgraded for 1.12?
> > > >
> > > >
> > >
> >
>
>
> --
>
> Konstantin Knauf
>
> https://twitter.com/snntrable
>
> https://github.com/knaufk
>


[jira] [Created] (FLINK-19458) ZooKeeperLeaderElectionITCase.testJobExecutionOnClusterWithLeaderChange: ZooKeeper unexpectedly modified

2020-09-29 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-19458:
--

 Summary: 
ZooKeeperLeaderElectionITCase.testJobExecutionOnClusterWithLeaderChange: 
ZooKeeper unexpectedly modified
 Key: FLINK-19458
 URL: https://issues.apache.org/jira/browse/FLINK-19458
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.12.0
Reporter: Robert Metzger


https://dev.azure.com/rmetzger/Flink/_build/results?buildId=8422=logs=70ad9b63-500e-5dc9-5a3c-b60356162d7e=944c7023-8984-5aa2-b5f8-54922bd90d3a

{code}
2020-09-29T13:34:18.1803081Z [ERROR] 
testJobExecutionOnClusterWithLeaderChange(org.apache.flink.test.runtime.leaderelection.ZooKeeperLeaderElectionITCase)
  Time elapsed: 23.524 s  <<< ERROR!
2020-09-29T13:34:18.1803707Z java.util.concurrent.ExecutionException: 
org.apache.flink.runtime.client.JobSubmissionException: Failed to submit job.
2020-09-29T13:34:18.1804343Zat 
java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
2020-09-29T13:34:18.1804738Zat 
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
2020-09-29T13:34:18.1805274Zat 
org.apache.flink.test.runtime.leaderelection.ZooKeeperLeaderElectionITCase.testJobExecutionOnClusterWithLeaderChange(ZooKeeperLeaderElectionITCase.java:117)
2020-09-29T13:34:18.1805772Zat 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
2020-09-29T13:34:18.1806136Zat 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
2020-09-29T13:34:18.1806555Zat 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2020-09-29T13:34:18.1806936Zat 
java.lang.reflect.Method.invoke(Method.java:498)
2020-09-29T13:34:18.1807313Zat 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
2020-09-29T13:34:18.1807731Zat 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
2020-09-29T13:34:18.1808341Zat 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
2020-09-29T13:34:18.1808973Zat 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
2020-09-29T13:34:18.1809376Zat 
org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
2020-09-29T13:34:18.1809851Zat 
org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
2020-09-29T13:34:18.1810201Zat 
org.junit.rules.RunRules.evaluate(RunRules.java:20)
2020-09-29T13:34:18.1810632Zat 
org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
2020-09-29T13:34:18.1811035Zat 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
2020-09-29T13:34:18.1811700Zat 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
2020-09-29T13:34:18.1812082Zat 
org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
2020-09-29T13:34:18.1812447Zat 
org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
2020-09-29T13:34:18.1812824Zat 
org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
2020-09-29T13:34:18.1813190Zat 
org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
2020-09-29T13:34:18.1813565Zat 
org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
2020-09-29T13:34:18.1813964Zat 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
2020-09-29T13:34:18.1814364Zat 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
2020-09-29T13:34:18.1814752Zat 
org.junit.runners.ParentRunner.run(ParentRunner.java:363)
2020-09-29T13:34:18.1815298Zat 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
2020-09-29T13:34:18.1816096Zat 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
2020-09-29T13:34:18.1816552Zat 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
2020-09-29T13:34:18.1816984Zat 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
2020-09-29T13:34:18.1817421Zat 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
2020-09-29T13:34:18.1817894Zat 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
2020-09-29T13:34:18.1818318Zat 
org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
2020-09-29T13:34:18.181Zat 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
2020-09-29T13:34:18.1819294ZSuppressed: 
org.apache.flink.util.FlinkException: Could not close resource.
2020-09-29T13:34:18.1819698Zat 
org.apache.flink.util.AutoCloseableAsync.close(AutoCloseableAsync.java:42)
2020-09-29T13:34:18 ...
{code}

[jira] [Created] (FLINK-19379) Submitting job to running YARN session fails

2020-09-23 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-19379:
--

 Summary: Submitting job to running YARN session fails
 Key: FLINK-19379
 URL: https://issues.apache.org/jira/browse/FLINK-19379
 Project: Flink
  Issue Type: Bug
  Components: Command Line Client, Deployment / YARN
Affects Versions: 1.11.2
Reporter: Robert Metzger


Steps to reproduce:

1. Start a YARN session.
2. Submit a job using {{./bin/flink run -t yarn-session -yid 
application_1600852002161_0003}}, where application_1600852002161_0003 is the 
id of the session started in step 1.

Expected behavior: the job is submitted to the running session.
Actual behavior: it fails with this unhelpful exception:
{code}

org.apache.flink.client.program.ProgramInvocationException: The main method 
caused an error: null
at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:302)
at 
org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:198)
at 
org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:149)
at 
org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:699)
at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:232)
at 
org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:916)
at 
org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:992)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
at 
org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:992)
Caused by: java.lang.IllegalStateException
at 
org.apache.flink.util.Preconditions.checkState(Preconditions.java:179)
at 
org.apache.flink.client.deployment.executors.AbstractSessionClusterExecutor.execute(AbstractSessionClusterExecutor.java:61)
at 
org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:973)
at 
org.apache.flink.client.program.ContextEnvironment.executeAsync(ContextEnvironment.java:124)
at 
org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:72)
at com.ververica.TPCHQuery3.main(TPCHQuery3.java:184)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:288)
... 11 more
{code}
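
The "null" in the message reflects that the IllegalStateException thrown by 
Preconditions.checkState carries no message text, consistent with the 
message-less variant being used. A sketch of the two variants (the condition 
name is hypothetical, not the executor's actual code):
{code}
import org.apache.flink.util.Preconditions;

public class CheckStateSketch {
    public static void main(String[] args) {
        boolean sessionClusterFound = false; // hypothetical condition

        // A variant with an error message makes the failure self-explanatory:
        Preconditions.checkState(sessionClusterFound,
                "Could not find a running session cluster for the given application id.");

        // The message-less variant, in contrast, throws a bare
        // IllegalStateException that surfaces as "caused an error: null":
        // Preconditions.checkState(sessionClusterFound);
    }
}
{code}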




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-19378) Running "./bin/flink run" without HADOOP_CLASSPATH to submit job to running YARN session fails

2020-09-23 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-19378:
--

 Summary: Running "./bin/flink run" without HADOOP_CLASSPATH to 
submit job to running YARN session fails
 Key: FLINK-19378
 URL: https://issues.apache.org/jira/browse/FLINK-19378
 Project: Flink
  Issue Type: Bug
  Components: Command Line Client, Deployment / YARN
Affects Versions: 1.11.2
Reporter: Robert Metzger


Steps to reproduce:

Shell A:
1. Export HADOOP_CLASSPATH.
2. Start a YARN session.

Shell B (unset HADOOP_CLASSPATH if it is set automatically):
3. Start a Flink job using {{./bin/flink run}}.

It will fail with {{Connection refused: localhost/127.0.0.1:8081}}.
Expected behavior: the client connects to the YARN cluster using the YARN 
properties file.

Workaround: set HADOOP_CLASSPATH; then the client will find the properties file.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Release flink-shaded 12.0

2020-09-22 Thread Robert Metzger
No concerns from my side.

On Fri, Sep 18, 2020 at 8:25 AM Chesnay Schepler  wrote:

> Hello,
>
> I'd like to kickoff the next release of flink-shaded, which will contain
> a bump to netty (4.1.49) and snakeyaml (1.27).
>
> Any concerns? Are there any other dependencies people want upgraded for 1.12?
>
>


Re: [DISCUSS][Release 1.12] Stale blockers and build instabilities

2020-09-21 Thread Robert Metzger
Hi all,

An update on the release status:
1. We have 35 days = *5 weeks left until feature freeze*
2. There are currently 2 blockers for Flink
<https://issues.apache.org/jira/browse/FLINK-19264?filter=12349334>, all
making progress
3. We have 72 test instabilities
<https://issues.apache.org/jira/browse/FLINK-19237> (down 7 from 2 weeks
ago). I have pinged people to help address frequent or critical issues.

Best,
Robert


On Mon, Sep 7, 2020 at 10:37 AM Robert Metzger  wrote:

> Hi all,
>
> another two weeks have passed. We now have 5 blockers
> <https://issues.apache.org/jira/browse/FLINK-18682?filter=12349334> (Up
> 3 from 2 weeks ago), but they are all making progress.
>
> We currently have 79 test-instabilities
> <https://issues.apache.org/jira/browse/FLINK-18869?filter=12348580>,
> since the last report, a few have been resolved, and some others have been
> added.
> I have checked the tickets, closed some old ones and pinged people to help
> resolve new or frequent ones.
> Except for Kafka, there are no major clusters of test instabilities. Most
> failures are tests that fail only rarely, spread across the entire system.
>
>
> On Tue, Aug 25, 2020 at 9:05 AM Rui Li  wrote:
>
>> Thanks Dian for the pointer. I'll take a look.
>>
>> On Tue, Aug 25, 2020 at 3:02 PM Dian Fu  wrote:
>>
>> > Thanks Rui for the info. This issue (hive related)
>> > https://issues.apache.org/jira/browse/FLINK-19025 is marked as a
>> > blocker.
>> >
>> > Regards,
>> > Dian
>> >
>> > > On Aug 25, 2020, at 2:58 PM, Rui Li  wrote:
>> > >
>> > > Hi Dian,
>> > >
>> > > FLINK-18682 has been fixed. Is there any other blocker in the hive
>> > > connector?
>> > >
>> > > On Tue, Aug 25, 2020 at 2:41 PM Dian Fu  wrote:
>> > >
>> > >> Hi all,
>> > >>
>> > >> Two weeks have passed and it seems that none of the test stability
>> > >> issues have been addressed since then.
>> > >>
>> > >> Here is an updated status report of Blockers and Test instabilities:
>> > >>
>> > >> Blockers
>> > >> <https://issues.apache.org/jira/browse/FLINK-18682?filter=12349334>:
>> > >> Currently 2 blockers (1x Hive, 1x CI Infra)
>> > >>
>> > >> Test-Instabilities
>> > >> <https://issues.apache.org/jira/browse/FLINK-18869?filter=12348580>:
>> > >> (total 80)
>> > >>
>> > >> Besides the issues already posted in previous mail, here are the new
>> > >> instability issues which should be taken care of:
>> > >>
>> > >> - FLINK-19012 (https://issues.apache.org/jira/browse/FLINK-19012)
>> > >> E2E test fails with "Cannot register Closeable, this
>> > >> subtaskCheckpointCoordinator is already closed. Closing argument."
>> > >>
>> > >> -> This is a new issue that occurred recently. It has occurred several
>> > >> times and may indicate a bug somewhere, so it should be taken care of.
>> > >>
>> > >> - FLINK-9992 (https://issues.apache.org/jira/browse/FLINK-9992)
>> > >> FsStorageLocationReferenceTest#testEncodeAndDecode failed in CI
>> > >>
>> > >> -> There is already a PR for it and needs review.
>> > >>
>> > >> - FLINK-18842 (https://issues.apache.org/jira/browse/FLINK-18842)
>> > >> e2e test failed to download "localhost:/flink.tgz" in "Wordcount on
>> > >> Docker test"

[jira] [Created] (FLINK-19259) Use classloader release hooks with Kinesis producer to avoid metaspace leak

2020-09-16 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-19259:
--

 Summary: Use classloader release hooks with Kinesis producer to 
avoid metaspace leak
 Key: FLINK-19259
 URL: https://issues.apache.org/jira/browse/FLINK-19259
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Kinesis
Reporter: Robert Metzger


FLINK-17554 introduced hooks for clearing references before unloading a 
classloader.

The Kinesis Producer library is currently preventing the usercode classloader 
from being unloaded because it keeps references around.

This ticket is to use the hooks with the Kinesis producer.
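
A rough sketch of what this could look like (this is not the actual
FlinkKinesisProducer change; the sink class and hook name are invented for
illustration, and the sketch assumes the {{RuntimeContext}} release-hook
method introduced by FLINK-17554; {{KinesisProducer#destroy()}} is the KPL's
shutdown call):

{code}
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

import com.amazonaws.services.kinesis.producer.KinesisProducer;
import com.amazonaws.services.kinesis.producer.KinesisProducerConfiguration;

public class KinesisSinkSketch extends RichSinkFunction<String> {

    private transient KinesisProducer producer;

    @Override
    public void open(Configuration parameters) {
        producer = new KinesisProducer(new KinesisProducerConfiguration());
        // Runs once when the user-code classloader is released, so the KPL's
        // daemon threads and native child process cannot keep the classloader
        // reachable and leak metaspace across restarts.
        getRuntimeContext().registerUserCodeClassLoaderReleaseHookIfAbsent(
                "kinesis-producer-shutdown", () -> producer.destroy());
    }

    @Override
    public void invoke(String value, Context context) {
        producer.addUserRecord(
                "my-stream",
                "partition-key",
                ByteBuffer.wrap(value.getBytes(StandardCharsets.UTF_8)));
    }
}
{code}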





[jira] [Created] (FLINK-19246) TableSourceITCase.testStreamScanParallelism fails on private Azure accounts

2020-09-15 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-19246:
--

 Summary: TableSourceITCase.testStreamScanParallelism fails on 
private Azure accounts
 Key: FLINK-19246
 URL: https://issues.apache.org/jira/browse/FLINK-19246
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / API
Affects Versions: 1.12.0
Reporter: Robert Metzger


Example: 
https://dev.azure.com/rmetzger/Flink/_build/results?buildId=8381&view=logs&j=69332ead-8935-5abf-5b3d-e4280fb1ff4c&t=58eb9526-6bcb-5835-ae76-f5bd5f6df6ac
or
https://dev.azure.com/rmetzger/Flink/_build/results?buildId=8379&view=logs&j=69332ead-8935-5abf-5b3d-e4280fb1ff4c&t=58eb9526-6bcb-5835-ae76-f5bd5f6df6ac
or
https://dev.azure.com/rmetzger/Flink/_build/results?buildId=8369&view=logs&j=69332ead-8935-5abf-5b3d-e4280fb1ff4c&t=58eb9526-6bcb-5835-ae76-f5bd5f6df6ac
(this change is already merged to master, so it is unlikely to cause the error)
{code}
2020-09-15T13:51:34.6773312Z 
org.apache.flink.api.common.InvalidProgramException: The implementation of the 
CollectionInputFormat is not serializable. The object probably contains or 
references non serializable fields.
2020-09-15T13:51:34.6774140Zat 
org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:151)
2020-09-15T13:51:34.6774634Zat 
org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:126)
2020-09-15T13:51:34.6775136Zat 
org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:71)
2020-09-15T13:51:34.6775728Zat 
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.clean(StreamExecutionEnvironment.java:1913)
2020-09-15T13:51:34.6776617Zat 
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.addSource(StreamExecutionEnvironment.java:1601)
2020-09-15T13:51:34.6777322Zat 
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.createInput(StreamExecutionEnvironment.java:1493)
2020-09-15T13:51:34.6778029Zat 
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.createInput(StreamExecutionEnvironment.java:1483)
2020-09-15T13:51:34.6778887Zat 
org.apache.flink.table.factories.utils.TestCollectionTableFactory$CollectionTableSource.getDataStream(TestCollectionTableFactory.scala:159)
2020-09-15T13:51:34.6779659Zat 
org.apache.flink.table.factories.utils.TestCollectionTableFactory$CollectionTableSource.getDataStream(TestCollectionTableFactory.scala:134)
2020-09-15T13:51:34.6780394Zat 
org.apache.flink.table.plan.nodes.datastream.StreamTableSourceScan.translateToPlan(StreamTableSourceScan.scala:105)
2020-09-15T13:51:34.6781058Zat 
org.apache.flink.table.plan.nodes.datastream.DataStreamSink.translateInput(DataStreamSink.scala:189)
2020-09-15T13:51:34.6781655Zat 
org.apache.flink.table.plan.nodes.datastream.DataStreamSink.writeToSink(DataStreamSink.scala:84)
2020-09-15T13:51:34.6782266Zat 
org.apache.flink.table.plan.nodes.datastream.DataStreamSink.translateToPlan(DataStreamSink.scala:59)
2020-09-15T13:51:34.6782951Zat 
org.apache.flink.table.planner.StreamPlanner.org$apache$flink$table$planner$StreamPlanner$$translateToCRow(StreamPlanner.scala:274)
2020-09-15T13:51:34.6783640Zat 
org.apache.flink.table.planner.StreamPlanner$$anonfun$translate$1.apply(StreamPlanner.scala:119)
2020-09-15T13:51:34.6784227Zat 
org.apache.flink.table.planner.StreamPlanner$$anonfun$translate$1.apply(StreamPlanner.scala:116)
2020-09-15T13:51:34.6784799Zat 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
2020-09-15T13:51:34.6785345Zat 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
2020-09-15T13:51:34.6785828Zat 
scala.collection.Iterator$class.foreach(Iterator.scala:891)
2020-09-15T13:51:34.6786285Zat 
scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
2020-09-15T13:51:34.6786760Zat 
scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
2020-09-15T13:51:34.6787210Zat 
scala.collection.AbstractIterable.foreach(Iterable.scala:54)
2020-09-15T13:51:34.6787681Zat 
scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
2020-09-15T13:51:34.6788168Zat 
scala.collection.AbstractTraversable.map(Traversable.scala:104)
2020-09-15T13:51:34.6788648Zat 
org.apache.flink.table.planner.StreamPlanner.translate(StreamPlanner.scala:116)
2020-09-15T13:51:34.6789286Zat 
org.apache.flink.table.api.bridge.scala.internal.StreamTableEnvironmentImpl.toDataStream(StreamTableEnvironmentImpl.scala:178)
2020-09-15T13:51:34.6790031Zat 
org.apache.flink.table.api.bridge.scala.internal.StreamTableEnvironmentImpl.toAppendStream(StreamTableEnvironmentImpl.scala:103)
2020-09-15T13:51:34.6790705Zat 
org.apache.flink.table.api.bridge.scala.TableConversions.toAppendStream(TableConversions.scala:78)
2020-09-15T13:51:34.6791362Z

[jira] [Created] (FLINK-19241) Forward ClusterEntrypoint ioExecutor to ResourceManager

2020-09-15 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-19241:
--

 Summary: Forward ClusterEntrypoint ioExecutor to ResourceManager
 Key: FLINK-19241
 URL: https://issues.apache.org/jira/browse/FLINK-19241
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Coordination
Affects Versions: 1.12.0
Reporter: Robert Metzger


Based on the discussion in FLINK-19037, we want to forward the ioExecutor from 
the ClusterEntrypoint to the ResourceManager as well.





[jira] [Created] (FLINK-19219) Run JobManager initialization in a separate thread, to make it cancellable

2020-09-14 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-19219:
--

 Summary: Run JobManager initialization in a separate thread, to 
make it cancellable
 Key: FLINK-19219
 URL: https://issues.apache.org/jira/browse/FLINK-19219
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Coordination
Affects Versions: 1.12.0
Reporter: Robert Metzger
Assignee: Robert Metzger


FLINK-16866 made the job submission non-blocking: the job submission is
executed asynchronously in a thread pool, submitted through a future.
The problem is that we cannot cancel a hanging job submission once it is
running in the thread pool.
This ticket is about running the initialization in a separate thread, so that
we can interrupt it.
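
A minimal sketch of the idea (illustrative only, not the actual Dispatcher
code; the class and thread names are invented):

{code}
import java.util.concurrent.CompletableFuture;

public class CancellableInitialization {

    private final Thread initThread;
    private final CompletableFuture<Void> result = new CompletableFuture<>();

    public CancellableInitialization(Runnable initialization) {
        // A dedicated, named thread can be interrupted directly, which a
        // task queued in a shared thread pool cannot.
        initThread = new Thread(() -> {
            try {
                initialization.run(); // e.g. building the ExecutionGraph
                result.complete(null);
            } catch (Throwable t) {
                result.completeExceptionally(t);
            }
        }, "jobmanager-initialization");
        initThread.start();
    }

    /** Cancels a hanging initialization by interrupting its thread. */
    public void cancel() {
        initThread.interrupt();
    }

    public CompletableFuture<Void> getResultFuture() {
        return result;
    }
}
{code}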





[ANNOUNCE] New Apache Flink Committer - Niels Basjes

2020-09-14 Thread Robert Metzger
Hi all,

On behalf of the PMC, I’m very happy to announce Niels Basjes as a new
Flink committer.

Niels has been an active community member since the early days of Flink,
with 19 commits dating back to 2015.
Besides his work on the code, he has been driving initiatives on dev@ list,
supporting users and giving talks at conferences.

Please join me in congratulating Niels for becoming a Flink committer!

Best,
Robert Metzger


Re: [VOTE] Release 1.11.2, release candidate #1

2020-09-14 Thread Robert Metzger
Thanks a lot for putting a release candidate together!

+1

Checks:
- Manually checked the git diff
https://github.com/apache/flink/compare/release-1.11.1..release-1.11.2-rc1
  - in flink-kubernetes, the shading configuration got changed, but the
NOTICE file is correct (checked the shade plugin output)
- maven clean install from source
- checked staging repo files

Note: I observed that in one local build from source the log/ directory was
missing. I could not reproduce this issue (and I triggered the build before
the weekend .. maybe I did something weird in between). I believe this is a
problem caused by me, but it would be nice if you could keep your eyes open
for this issue, just to make sure.


On Thu, Sep 10, 2020 at 9:04 AM Zhu Zhu  wrote:

> Hi everyone,
>
> Please review and vote on the release candidate #1 for the version 1.11.2,
> as follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
>
>
> The complete staging area is available for your review, which includes:
> * JIRA release notes [1],
> * the official Apache source release and binary convenience releases to be
> deployed to dist.apache.org [2], which are signed with the key with
> fingerprint C63E230EFFF519A5BBF2C9AE6767487CD505859C [3],
> * all artifacts to be deployed to the Maven Central Repository [4],
> * source code tag "release-1.11.2-rc1" [5],
> * website pull request listing the new release and adding announcement blog
> post [6].
>
> The vote will be open for at least 72 hours. It is adopted by majority
> approval, with at least 3 PMC affirmative votes.
>
> Thanks,
> Zhu
>
> [1]
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12348575
> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.11.2-rc1/
> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> [4]
> https://repository.apache.org/content/repositories/orgapacheflink-1397/
> [5]
>
> https://github.com/apache/flink/commit/fe3613574f76201a8d55d572a639a4ce7e18a9db
> [6] https://github.com/apache/flink-web/pull/377
>


Re: How to schedule Flink Batch Job periodically or daily

2020-09-11 Thread Robert Metzger
Hi Sunitha,

(Note: You've emailed both the dev@ and user@ mailing list. Please only use
the user@ mailing list for questions on how to use Flink. I'm moving the
dev@ list to bcc)

Flink does not have facilities for scheduling batch jobs, and there are no
plans to add such a feature (this is not in the scope of Flink, there are
already a number of workflow management tools).


On Fri, Sep 11, 2020 at 1:10 PM s_penakalap...@yahoo.com <
s_penakalap...@yahoo.com> wrote:

> Hi Team,
>
> We have Flink Batch Jobs which needs to be scheduled as listed below:
> Case 1: 2:00 UTC daily
> Case 2: periodically, every 2 hours
> Case 3: schedule based on an event
>
> Could you help me with how to approach these 3 use cases?
> Can we use Oozie workflows, or is there a better approach?
>
> Regards,
> Sunitha
>


[jira] [Created] (FLINK-19158) Revisit java e2e download timeouts

2020-09-07 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-19158:
--

 Summary: Revisit java e2e download timeouts
 Key: FLINK-19158
 URL: https://issues.apache.org/jira/browse/FLINK-19158
 Project: Flink
  Issue Type: Improvement
  Components: Build System
Affects Versions: 1.12.0
Reporter: Robert Metzger


Consider this failed test case

{code}
Test testHBase(org.apache.flink.tests.util.hbase.SQLClientHBaseITCase) is 
running.

09:38:38,719 [main] INFO  
org.apache.flink.tests.util.cache.PersistingDownloadCache[] - Downloading 
https://archive.apache.org/dist/hbase/1.4.3/hbase-1.4.3-bin.tar.gz.
09:40:38,732 [main] ERROR 
org.apache.flink.tests.util.hbase.SQLClientHBaseITCase   [] - 

Test testHBase(org.apache.flink.tests.util.hbase.SQLClientHBaseITCase) failed 
with:
java.io.IOException: Process ([wget, -q, -P, 
/home/vsts/work/1/e2e_cache/downloads/1598516010, 
https://archive.apache.org/dist/hbase/1.4.3/hbase-1.4.3-bin.tar.gz]) exceeded 
timeout (12) or number of retries (3).
at 
org.apache.flink.tests.util.AutoClosableProcess$AutoClosableProcessBuilder.runBlockingWithRetry(AutoClosableProcess.java:148)
at 
org.apache.flink.tests.util.cache.AbstractDownloadCache.getOrDownload(AbstractDownloadCache.java:127)
at 
org.apache.flink.tests.util.cache.PersistingDownloadCache.getOrDownload(PersistingDownloadCache.java:36)
at 
org.apache.flink.tests.util.hbase.LocalStandaloneHBaseResource.setupHBaseDist(LocalStandaloneHBaseResource.java:76)
at 
org.apache.flink.tests.util.hbase.LocalStandaloneHBaseResource.before(LocalStandaloneHBaseResource.java:70)
at 
org.apache.flink.util.ExternalResource$1.evaluate(ExternalResource.java:46)
{code}

It seems that the download was not retried; the download might be stuck. I
would propose setting a timeout per try and increasing the total time from 2
to 5 minutes.
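
A sketch of the proposed per-attempt timeout using only JDK process APIs (the
actual {{AutoClosableProcess}} utilities are not shown; all names here are
illustrative):

{code}
import java.io.IOException;
import java.time.Duration;
import java.util.concurrent.TimeUnit;

public final class RetryingDownload {

    public static void runWithRetry(ProcessBuilder builder, int retries,
            Duration perAttemptTimeout) throws IOException, InterruptedException {
        for (int attempt = 1; attempt <= retries; attempt++) {
            Process process = builder.start();
            // Bound each attempt individually, so a hung download is detected
            // instead of silently consuming the whole retry budget.
            boolean finished = process.waitFor(
                    perAttemptTimeout.toMillis(), TimeUnit.MILLISECONDS);
            if (finished && process.exitValue() == 0) {
                return; // success
            }
            process.destroyForcibly(); // kill a stuck attempt before retrying
        }
        throw new IOException("Download failed after " + retries + " attempts");
    }
}
{code}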






Re: [DISCUSS][Release 1.12] Stale blockers and build instabilities

2020-09-07 Thread Robert Metzger
Hi all,

another two weeks have passed. We now have 5 blockers
<https://issues.apache.org/jira/browse/FLINK-18682?filter=12349334> (Up
3 from 2 weeks ago), but they are all making progress.

We currently have 79 test-instabilities
<https://issues.apache.org/jira/browse/FLINK-18869?filter=12348580>, since
the last report, a few have been resolved, and some others have been added.
I have checked the tickets, closed some old ones and pinged people to help
resolve new or frequent ones.
Except for Kafka, there are no major clusters of test instabilities. Most
failures are tests that fail only rarely, spread across the entire system.


On Tue, Aug 25, 2020 at 9:05 AM Rui Li  wrote:

> Thanks Dian for the pointer. I'll take a look.
>
> On Tue, Aug 25, 2020 at 3:02 PM Dian Fu  wrote:
>
> > Thanks Rui for the info. This issue (hive related)
> > https://issues.apache.org/jira/browse/FLINK-19025 is marked as a
> > blocker.
> >
> > Regards,
> > Dian
> >
> > > On Aug 25, 2020, at 2:58 PM, Rui Li  wrote:
> > >
> > > Hi Dian,
> > >
> > > FLINK-18682 has been fixed. Is there any other blocker in the hive
> > > connector?
> > >
> > > On Tue, Aug 25, 2020 at 2:41 PM Dian Fu  wrote:
> > >
> > >> Hi all,
> > >>
> > >> Two weeks have passed and it seems that none of the test stability
> > >> issues have been addressed since then.
> > >>
> > >> Here is an updated status report of Blockers and Test instabilities:
> > >>
> > >> Blockers
> > >> <https://issues.apache.org/jira/browse/FLINK-18682?filter=12349334>:
> > >> Currently 2 blockers (1x Hive, 1x CI Infra)
> > >>
> > >> Test-Instabilities
> > >> <https://issues.apache.org/jira/browse/FLINK-18869?filter=12348580>:
> > >> (total 80)
> > >>
> > >> Besides the issues already posted in previous mail, here are the new
> > >> instability issues which should be taken care of:
> > >>
> > >> - FLINK-19012 (https://issues.apache.org/jira/browse/FLINK-19012)
> > >> E2E test fails with "Cannot register Closeable, this
> > >> subtaskCheckpointCoordinator is already closed. Closing argument."
> > >>
> > >> -> This is a new issue that occurred recently. It has occurred several
> > >> times and may indicate a bug somewhere, so it should be taken care of.
> > >>
> > >> - FLINK-9992 (https://issues.apache.org/jira/browse/FLINK-9992)
> > >> FsStorageLocationReferenceTest#testEncodeAndDecode failed in CI
> > >>
> > >> -> There is already a PR for it and needs review.
> > >>
> > >> - FLINK-18842 (https://issues.apache.org/jira/browse/FLINK-18842)
> > >> e2e test failed to download "localhost:/flink.tgz" in "Wordcount on
> > >> Docker test"
> > >>
> > >>
> > >>> On Aug 11, 2020, at 2:08 PM, Robert Metzger  wrote:
> > >>>
> > >>> Hi team,
> > >>>
> > >>> 2 weeks have passed since the last update. None of the test
> > >>> instabilities I've mentioned have been addressed since then.
> > >>>
> > >>> Here's an updated status report of Blockers and Test instabilities:
> > >>>
> > >>> Blockers <
> > >>

Re: HadoopOutputFormat has issues with LocalExecutionEnvironment?

2020-09-03 Thread Robert Metzger
Hi Ken,

sorry for the late reply. This could be a bug in Flink. Does the issue also
occur on Flink 1.11?
Have you set a breakpoint in the HadoopOutputFormat.finalizeGlobal() when
running locally to validate that this method doesn't get called?

What do you mean by "algorithm version 2"? Where can you set this? (Sorry
for the question, I'm not an expert with Hadoop's FileOutputCommitter)

Note to others: There's a related discussion here:
https://issues.apache.org/jira/browse/FLINK-19069

Best,
Robert


On Wed, Aug 26, 2020 at 1:10 AM Ken Krugler 
wrote:

> Hi devs,
>
> In HadoopOutputFormat.close(), I see code that is trying to rename
> /tmp-r-1 to be /1
>
> But when I run my Flink 1.9.2 code using a local MiniCluster, the actual
> location of the tmp-r-1 file is:
>
> /_temporary/0/task___r_01/tmp-r-1
>
> I think this is because the default behavior of Hadoop’s
> FileOutputCommitter (with algorithm == 1) is to put files in task-specific
> sub-dirs.
>
> It’s depending on a post-completion “merge paths” action to be taken by
> what is (for Hadoop) the Application Master.
>
> I assume that when running on a real cluster, the
> HadoopOutputFormat.finalizeGlobal() method’s call to commitJob() would do
> this, but it doesn’t seem to be happening when I run locally.
>
> If I set the algorithm version to 2, then “merge paths” is handled by
> FileOutputCommitter immediately, and the HadoopOutputFormat code finds
> files in the expected location.
>
> Wondering if Flink should always be using version 2 of the algorithm, as
> that’s more performant when there are a lot of results (which is why it was
> added).
>
> Thanks,
>
> — Ken
>
> --
> Ken Krugler
> http://www.scaleunlimited.com
> custom big data solutions & training
> Hadoop, Cascading, Cassandra & Solr
>
>
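
For reference, the algorithm version mentioned above is controlled by the
Hadoop configuration key mapreduce.fileoutputcommitter.algorithm.version
(available since Hadoop 2.7). A minimal sketch of setting it, assuming a
plain Hadoop Job that is later wrapped by Flink's HadoopOutputFormat:

{code}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class CommitterAlgorithmExample {

    public static Job newJobWithV2Committer() throws IOException {
        Job job = Job.getInstance(new Configuration());
        // Algorithm 2 merges task output into the final output directory at
        // task commit time, instead of relying on a job-level "merge paths"
        // step that a local MiniCluster run may never perform.
        job.getConfiguration().setInt(
                "mapreduce.fileoutputcommitter.algorithm.version", 2);
        return job;
    }
}
{code}

Algorithm 2 trades the single job-level merge for per-task merges, which is
also why it performs better for jobs with many output files.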


Re: [ANNOUNCE] New PMC member: Dian Fu

2020-08-27 Thread Robert Metzger
Congratulations Dian!

On Thu, Aug 27, 2020 at 3:09 PM Congxian Qiu  wrote:

> Congratulations Dian
> Best,
> Congxian
>
>
> On Thu, Aug 27, 2020 at 7:50 PM, Xintong Song  wrote:
>
> > Congratulations Dian~!
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> >
> > On Thu, Aug 27, 2020 at 7:42 PM Jark Wu  wrote:
> >
> > > Congratulations Dian!
> > >
> > > Best,
> > > Jark
> > >
> > > On Thu, 27 Aug 2020 at 19:37, Leonard Xu  wrote:
> > >
> > > > Congrats, Dian!  Well deserved.
> > > >
> > > > Best
> > > > Leonard
> > > >
> > > > > On Aug 27, 2020, at 19:34, Kurt Young  wrote:
> > > > >
> > > > > Congratulations Dian!
> > > > >
> > > > > Best,
> > > > > Kurt
> > > > >
> > > > >
> > > > > On Thu, Aug 27, 2020 at 7:28 PM Rui Li 
> > wrote:
> > > > >
> > > > >> Congratulations Dian!
> > > > >>
> > > > >> On Thu, Aug 27, 2020 at 5:39 PM Yuan Mei 
> > > > wrote:
> > > > >>
> > > > >>> Congrats!
> > > > >>>
> > > > >>> On Thu, Aug 27, 2020 at 5:38 PM Xingbo Huang  wrote:
> > > > >>>
> > > >  Congratulations Dian!
> > > > 
> > > >  Best,
> > > >  Xingbo
> > > > 
> > > >  On Thu, Aug 27, 2020 at 5:24 PM, jincheng sun  wrote:
> > > > 
> > > > > Hi all,
> > > > >
> > > > > On behalf of the Flink PMC, I'm happy to announce that Dian Fu
> is
> > > now
> > > > > part of the Apache Flink Project Management Committee (PMC).
> > > > >
> > > > > Dian Fu has been very active on PyFlink component, working on
> > > various
> > > > > important features, such as the Python UDF and Pandas
> > integration,
> > > > and
> > > > > keeps checking and voting for our releases, and also has
> > > successfully
> > > > > produced two releases(1.9.3&1.11.1) as RM, currently working as
> > RM
> > > > to push
> > > > > forward the release of Flink 1.12.
> > > > >
> > > > > Please join me in congratulating Dian Fu for becoming a Flink
> PMC
> > > > > Member!
> > > > >
> > > > > Best,
> > > > > Jincheng(on behalf of the Flink PMC)
> > > > >
> > > > 
> > > > >>
> > > > >> --
> > > > >> Best regards!
> > > > >> Rui Li
> > > > >>
> > > >
> > > >
> > >
> >
>


[jira] [Created] (FLINK-19064) HBaseRowDataInputFormat is leaking resources

2020-08-27 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-19064:
--

 Summary: HBaseRowDataInputFormat is leaking resources
 Key: FLINK-19064
 URL: https://issues.apache.org/jira/browse/FLINK-19064
 Project: Flink
  Issue Type: Bug
  Components: Connectors / HBase
Affects Versions: 1.12.0
Reporter: Robert Metzger


{{HBaseRowDataInputFormat.configure()}} is calling {{connectToTable()}}, which 
creates a connection to HBase that is not closed again.

A user reported this problem on the user@ list: 
https://lists.apache.org/thread.html/ra04f6996eb50ee83aabd2ad0d50bec9afb6a924bfbb48ada3269c6d8%40%3Cuser.flink.apache.org%3E
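
A minimal sketch of the intended fix (illustrative only, not the actual
{{HBaseRowDataInputFormat}} code): keep a handle to the connection created
during configuration and close it when the format shuts down.

{code}
import java.io.IOException;

import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Table;

public class HBaseResourceHandling {

    private Connection connection;
    private Table table;

    public void connectToTable(org.apache.hadoop.conf.Configuration hbaseConf,
            String tableName) throws IOException {
        connection = ConnectionFactory.createConnection(hbaseConf);
        table = connection.getTable(TableName.valueOf(tableName));
    }

    public void close() throws IOException {
        // Close in reverse order of creation; without this, the connection
        // (and its internal threads) leaks every time the format is configured.
        if (table != null) {
            table.close();
        }
        if (connection != null) {
            connection.close();
        }
    }
}
{code}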





Re: [ANNOUNCE] Weekly Community Update 2020/31-34

2020-08-27 Thread Robert Metzger
Thanks a lot for doing these updates!

On Tue, Aug 25, 2020 at 10:23 PM Konstantin Knauf  wrote:

> Dear community,
>
> The "weekly" community update is back after a short summer break! This time
> I've tried to cover most of what happened during the last four weeks, but I
> might pick up some older topics in the next weeks' updates, too.
>
> Activity on the dev@ mailing list has picked up quite a bit as feature
> development & design for the next releases of Apache Flink and Apache Flink
> Stateful Functions is going at full steam. In detail:
>
> Flink Development
> ==
>
> * [releases] [Flink 1.12] The work on Flink 1.12 is well underway with
> feature freeze planned for end of October [1]. Our release managers Robert
> & Dian are periodically reminding the developer community of current
> blockers to reduce time during release testing for this release [2].
>
> * [releases] [Stateful Functions 2.2] Igal has started a discussion about
> releasing Stateful Functions 2.2 soon (proposed feature freeze:
> September 10). The most notable feature is maybe the option to embed a
> stateful functions module in a DataStream program via DataStream
> Ingress/Egress. Check out [3] for a full list of the planned features.
>
> * [releases] [Flink 1.10] Flink 1.10.2 was released. [4]
>
> * [apis] Besides the Stateful Functions API, Flink currently has three
> top-level APIs: DataStream (streaming), DataSet (batch) and TableAPI/SQL
> (unified). A major step towards the goal of a truly unified batch and
> stream processing engine is the unification of the DataStream/DataSet APIs.
> This is one of the main topics of the upcoming release(s), specifically:
> * Aljoscha has published FLIP-131 [5] proposing to deprecate and
> eventually drop the DataSet API. In order to still support the same breadth
> of use cases, we need to make sure that all its use cases are covered by
> the two remaining APIs: a unified DataStream API and the Table API. These
> changes are not part of FLIP-131 itself, but are covered in other FLIPs,
> which already exist (like FLIP-27 [6] or FLIP-129 [7]) or will be published
> over the next few weeks like FLIP-134 (see below). [8]
> * Most importantly, FLIP-134 [9] discusses how the DataStream API could
> be used to efficiently execute batch workloads in the future. In essence
> the FLIP proposes to introduce a BATCH and a STREAMING execution mode for
> DataStream programs. The STREAMING mode corresponds to the current
> behavior, while the BATCH mode adjusts the behavior in various areas to fit
> the requirements of batch processing, e.g. pipelined scheduling with region
> failover, blocking shuffles, no checkpointing, no watermarks, ... [10]
>
> * [apis] Timo proposes FLIP-136 to improve the interoperability between the
> DataStream and Table API. The FLIP covers the conversion between
> DataStream <-> Table (incl. changelog streams, watermarks, etc.) as well
> as additional support for working with the Row type in the DataStream
> API. [11]
>
> * [datastream api] Dawid proposes to remove a set of deprecated methods
> from the DataStream API. [12]
>
> * [runtime] Yuan Mei has started a discussion on FLIP-135 to introduce
> task-local recovery. The FLIP is about the introduction of a new
> failover/recovery strategy for Flink Jobs, that trades consistency for
> availability. Specifically, in the case of approximate task-local recovery
> the failure of some tasks would not trigger a restart of the rest of the
> job, but in turn you can expect data loss or duplication. [13]
>
> * [python] Xingbo Huang proposes to extend the support of Pandas/vectorized
> functions from scalar functions to aggregate functions. For more details on
> Pandas support on PyFlink see the blog post linked below. [14]
>
> * [connectors] Aljoscha has started a discussion on dropping support for
> Kafka 0.10/0.11 in Flink 1.12+. [15]
>
> * [connectors] Robert has revived the discussion on adding support for
> Hbase 2.3.x. There is a consensus to add the HBase 2.x connector Apache
> Flink, but no consensus yet on whether to move the existing HBase 1.x from
> the Flink project to Apache Bahir, too. [16]
> 
> [1]
>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Planning-Flink-1-12-tp43348.html
> [2]
>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Release-1-12-Stale-blockers-and-build-instabilities-tp43477.html
> [3]
>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Next-Stateful-Functions-Release-tp44063.html
> [4] https://flink.apache.org/news/2020/08/25/release-1.10.2.html
> [5]
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=158866741
> [6]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface?src=contextnavpagetreemode
>
> [7]
>
> 

Re: Next Stateful Functions Release

2020-08-27 Thread Robert Metzger
+1 to release Statefun 2.2

On Tue, Aug 25, 2020 at 2:16 PM David Anderson 
wrote:

> Igal,
>
> The feature set you propose sounds great to me -- as a user I see
> plenty there to get excited about. As for the feature freeze date, I
> don't really have an informed opinion.
>
> David
>
> On Mon, Aug 24, 2020 at 10:15 AM Igal Shilman  wrote:
> >
> > Hi Flink devs,
> >
> > We have a few upcoming / implemented features for Stateful Functions on
> the
> > radar, and would like to give a heads up on what to expect for the next
> > release:
> >
> > 1. Upgrade support for Flink 1.11.x. [FLINK-18812]
> > 2. Fine grained control on remote state configuration, such as state TTL.
> > [FLINK-17954]
> > 3. New state construct for dynamic state registration [FLINK-18316]
> > 4. Add a DataStream API to StateFun [FLINK-19001]
> > 5. Support async handlers for the Python SDK [FLINK-18518]
> > 6. Add more metrics around async operations and backpressure
> [FLINK-19020]
> > 7. Out-of-box support for common storage systems in flink-statefun Docker
> > image [FLINK-19019]
> >
> > With these we think the project will be in a good spot for the next
> release.
> > What do you think about aiming at 10.9.2020 for a feature freeze for
> > StateFun 2.2?
> >
> > Kind regards,
> > Igal.
>


[jira] [Created] (FLINK-19049) TableEnvironmentImpl.executeInternal() does not wait for the final job status

2020-08-25 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-19049:
--

 Summary: TableEnvironmentImpl.executeInternal() does not wait for 
the final job status
 Key: FLINK-19049
 URL: https://issues.apache.org/jira/browse/FLINK-19049
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Runtime
Affects Versions: 1.12.0
Reporter: Robert Metzger


While working on another change, I realized that the
{{FunctionITCase.testInvalidUseOfTableFunction()}} test throws a
NullPointerException during execution.

This error is not visible, because TableEnvironmentImpl.executeInternal() does 
not wait for the final job status.
It submits the job using the job client ({{JobClient jobClient = 
execEnv.executeAsync(pipeline);}}), and it doesn't wait for the job to complete 
before returning a result. 
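
For comparison, a caller that needs the final status can block on the
JobClient explicitly. A minimal sketch, assuming a JobClient API where
{{getJobExecutionResult()}} takes no arguments (earlier versions required a
ClassLoader argument):

{code}
import org.apache.flink.api.common.JobExecutionResult;
import org.apache.flink.core.execution.JobClient;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class WaitForFinalStatus {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        env.fromElements(1, 2, 3).print();

        // Submission only; failures after this point stay hidden, which is
        // the behavior described above for executeInternal().
        JobClient jobClient = env.executeAsync("wait-for-final-status");

        // Blocking on the result future surfaces the final job status and
        // rethrows any job failure.
        JobExecutionResult result = jobClient.getJobExecutionResult().get();
        System.out.println("net runtime: " + result.getNetRuntime() + " ms");
    }
}
{code}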

This is the null pointer that is hidden:
{code}

Caused by: org.apache.flink.util.FlinkException: Failed to execute job 
'insert-into_default_catalog.default_database.SinkTable'.
at 
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.executeAsync(StreamExecutionEnvironment.java:1823)
at 
org.apache.flink.table.planner.delegation.ExecutorBase.executeAsync(ExecutorBase.java:57)
at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.executeInternal(TableEnvironmentImpl.java:681)
... 34 more
Caused by: java.util.concurrent.CompletionException: 
org.apache.flink.runtime.JobException: Recovery is suppressed by 
NoRestartBackoffTimeStrategy
at 
org.apache.flink.client.ClientUtils.waitUntilJobInitializationFinished(ClientUtils.java:148)
at 
org.apache.flink.client.program.PerJobMiniClusterFactory.lambda$submitJob$2(PerJobMiniClusterFactory.java:92)
at 
java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616)
at 
java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591)
at 
java.util.concurrent.CompletableFuture$Completion.exec(CompletableFuture.java:457)
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
at 
java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
at 
java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
Caused by: org.apache.flink.runtime.JobException: Recovery is suppressed by 
NoRestartBackoffTimeStrategy
at 
org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.handleFailure(ExecutionFailureHandler.java:116)
at 
org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.getFailureHandlingResult(ExecutionFailureHandler.java:78)
at 
org.apache.flink.runtime.scheduler.DefaultScheduler.handleTaskFailure(DefaultScheduler.java:195)
at 
org.apache.flink.runtime.scheduler.DefaultScheduler.maybeHandleTaskFailure(DefaultScheduler.java:188)
at 
org.apache.flink.runtime.scheduler.DefaultScheduler.updateTaskExecutionStateInternal(DefaultScheduler.java:182)
at 
org.apache.flink.runtime.scheduler.SchedulerBase.updateTaskExecutionState(SchedulerBase.java:523)
at 
org.apache.flink.runtime.jobmaster.JobMaster.updateTaskExecutionState(JobMaster.java:422)
at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:284)
at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:199)
at 
org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:74)
at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152)
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
at akka.actor.ActorCell.invoke(ActorCell.scala:561)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
at akka.dispatch.Mailbox.run(Mailbox.scala:225)
at akka.dispatch.Mailbox.exec

[jira] [Created] (FLINK-19037) Introduce proper IO executor in Dispatcher

2020-08-24 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-19037:
--

 Summary: Introduce proper IO executor in Dispatcher
 Key: FLINK-19037
 URL: https://issues.apache.org/jira/browse/FLINK-19037
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Coordination
Affects Versions: 1.12.0
Reporter: Robert Metzger


Currently, IO operations in the {{Dispatcher}} are scheduled on the 
{{rpcService.getExecutor()}}.

We should introduce a separate executor for IO operations.
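
A minimal sketch of such a dedicated IO executor, using only JDK facilities
(pool size and thread naming are assumptions, not the actual implementation):

{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

public class IoExecutorFactory {

    public static ExecutorService createIoExecutor(int poolSize) {
        ThreadFactory factory = new ThreadFactory() {
            private final AtomicInteger counter = new AtomicInteger();

            @Override
            public Thread newThread(Runnable r) {
                Thread t = new Thread(r, "cluster-io-" + counter.incrementAndGet());
                t.setDaemon(true);
                return t;
            }
        };
        // Blocking IO (filesystem access, HA store reads/writes) runs here
        // instead of on rpcService.getExecutor(), so it cannot starve RPC
        // message processing.
        return Executors.newFixedThreadPool(poolSize, factory);
    }
}
{code}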





[jira] [Created] (FLINK-19012) E2E test fails with "Cannot register Closeable, this subtaskCheckpointCoordinator is already closed. Closing argument."

2020-08-21 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-19012:
--

 Summary: E2E test fails with "Cannot register Closeable, this 
subtaskCheckpointCoordinator is already closed. Closing argument."
 Key: FLINK-19012
 URL: https://issues.apache.org/jira/browse/FLINK-19012
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Checkpointing, Runtime / Task, Tests
Affects Versions: 1.12.0
Reporter: Robert Metzger


Note: This error occurred in a custom branch with unreviewed changes. I don't 
believe my changes affect this error, but I would keep this in mind when 
investigating the error: 
https://dev.azure.com/rmetzger/Flink/_build/results?buildId=8307&view=logs&j=1f3ed471-1849-5d3c-a34c-19792af4ad16&t=0d2e35fc-a330-5cf2-a012-7267e2667b1d
 
{code}
2020-08-20T20:55:30.2400645Z 2020-08-20 20:55:22,373 INFO  
org.apache.flink.runtime.taskmanager.Task[] - Registering 
task at network: Source: Sequence Source -> Flat Map -> Sink: Unnamed (1/1) 
(cbc357ccb763df2852fee8c4fc7d55f2_0_0) [DEPLOYING].
2020-08-20T20:55:30.2402392Z 2020-08-20 20:55:22,401 INFO  
org.apache.flink.streaming.runtime.tasks.StreamTask  [] - No state 
backend has been configured, using default (Memory / JobManager) 
MemoryStateBackend (data in heap memory / checkpoints to JobManager) 
(checkpoints: 'null', savepoints: 'null', asynchronous: TRUE, maxStateSize: 
5242880)
2020-08-20T20:55:30.2404297Z 2020-08-20 20:55:22,413 INFO  
org.apache.flink.runtime.taskmanager.Task[] - Source: 
Sequence Source -> Flat Map -> Sink: Unnamed (1/1) 
(cbc357ccb763df2852fee8c4fc7d55f2_0_0) switched from DEPLOYING to RUNNING.
2020-08-20T20:55:30.2405805Z 2020-08-20 20:55:22,786 INFO  
org.apache.flink.streaming.connectors.elasticsearch6.Elasticsearch6ApiCallBridge
 [] - Pinging Elasticsearch cluster via hosts [http://127.0.0.1:9200] ...
2020-08-20T20:55:30.2407027Z 2020-08-20 20:55:22,848 INFO  
org.apache.flink.streaming.connectors.elasticsearch6.Elasticsearch6ApiCallBridge
 [] - Elasticsearch RestHighLevelClient is connected to [http://127.0.0.1:9200]
2020-08-20T20:55:30.2409277Z 2020-08-20 20:55:29,205 INFO  
org.apache.flink.runtime.checkpoint.channel.ChannelStateWriteRequestExecutorImpl
 [] - Source: Sequence Source -> Flat Map -> Sink: Unnamed (1/1) discarding 0 
drained requests
2020-08-20T20:55:30.2410690Z 2020-08-20 20:55:29,218 INFO  
org.apache.flink.runtime.taskmanager.Task[] - Source: 
Sequence Source -> Flat Map -> Sink: Unnamed (1/1) 
(cbc357ccb763df2852fee8c4fc7d55f2_0_0) switched from RUNNING to FINISHED.
2020-08-20T20:55:30.2412187Z 2020-08-20 20:55:29,218 INFO  
org.apache.flink.runtime.taskmanager.Task[] - Freeing task 
resources for Source: Sequence Source -> Flat Map -> Sink: Unnamed (1/1) 
(cbc357ccb763df2852fee8c4fc7d55f2_0_0).
2020-08-20T20:55:30.2414203Z 2020-08-20 20:55:29,224 INFO  
org.apache.flink.runtime.taskexecutor.TaskExecutor   [] - 
Un-registering task and sending final execution state FINISHED to JobManager 
for task Source: Sequence Source -> Flat Map -> Sink: Unnamed (1/1) 
cbc357ccb763df2852fee8c4fc7d55f2_0_0.
2020-08-20T20:55:30.2415602Z 2020-08-20 20:55:29,219 INFO  
org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable [] - Source: 
Sequence Source -> Flat Map -> Sink: Unnamed (1/1) - asynchronous part of 
checkpoint 1 could not be completed.
2020-08-20T20:55:30.2416411Z java.io.UncheckedIOException: java.io.IOException: 
Cannot register Closeable, this subtaskCheckpointCoordinator is already closed. 
Closing argument.
2020-08-20T20:55:30.2418956Zat 
org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.lambda$registerConsumer$2(SubtaskCheckpointCoordinatorImpl.java:468)
 ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-08-20T20:55:30.2420100Zat 
org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable.run(AsyncCheckpointRunnable.java:91)
 [flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-08-20T20:55:30.2420927Zat 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[?:1.8.0_265]
2020-08-20T20:55:30.2421455Zat 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[?:1.8.0_265]
2020-08-20T20:55:30.2421879Zat java.lang.Thread.run(Thread.java:748) 
[?:1.8.0_265]
2020-08-20T20:55:30.2422348Z Caused by: java.io.IOException: Cannot register 
Closeable, this subtaskCheckpointCoordinator is already closed. Closing 
argument.
2020-08-20T20:55:30.2423416Zat 
org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.registerAsyncCheckpointRunnable(SubtaskCheckpointCoordinatorImpl.java:378)
 ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-08-20T20:55:30.2424635Zat 
org.apache.flink.streaming.runtime.tasks.Subtas

[jira] [Created] (FLINK-19000) Forward JobStatus.INITIALIZING timestamp to ExecutionGraph

2020-08-19 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-19000:
--

 Summary: Forward JobStatus.INITIALIZING timestamp to ExecutionGraph
 Key: FLINK-19000
 URL: https://issues.apache.org/jira/browse/FLINK-19000
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Coordination
Affects Versions: 1.12.0
Reporter: Robert Metzger


This is a follow up to FLINK-16866.

Currently, in the ExecutionGraph, the timestamp for JobStatus.INITIALIZING is 
not set (defaulting to 0), leading to an inconsistent stateTimestamps array.

To resolve this ticket, one needs to forward the timestamp from the Dispatcher 
(where the initialization is started) to the ExecutionGraph. 





Re: [VOTE] Release 1.10.2, release candidate #2

2020-08-17 Thread Robert Metzger
Thanks a lot for creating another release candidate!

+1 (binding)

- checked diff:
https://github.com/apache/flink/compare/release-1.10.1...release-1.10.2-rc2
  - kubernetes client was upgraded from 4.5.2 to 4.9.2 + some shading
changes -> verified NOTICE file with shade output
- source compiles
- source sha is correct
- checked staging repository: versions set correctly, contents seem in line
with the changes




On Mon, Aug 17, 2020 at 1:56 PM Zhu Zhu  wrote:

> Hi everyone,
>
> Please review and vote on the release candidate #2 for the version 1.10.2,
> as follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
>
>
> The complete staging area is available for your review, which includes:
> * JIRA release notes [1],
> * the official Apache source release and binary convenience releases to be
> deployed to dist.apache.org [2], which are signed with the key with
> fingerprint C63E230EFFF519A5BBF2C9AE6767487CD505859C [3],
> * all artifacts to be deployed to the Maven Central Repository [4],
> * source code tag "release-1.10.2-rc2" [5],
> * website pull request listing the new release and adding announcement blog
> post [6].
>
> The vote will be open for at least 72 hours. It is adopted by majority
> approval, with at least 3 PMC affirmative votes.
>
> Thanks,
> Zhu Zhu
>
> [1]
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12347791
> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.10.2-rc2/
> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> [4]
> https://repository.apache.org/content/repositories/orgapacheflink-1395/
> [5]
>
> https://github.com/apache/flink/commit/68bb8b612932e479ca03c7ae7c0080818d89c8a1
> [6] https://github.com/apache/flink-web/pull/366
>


[ANNOUNCE] New Flink Committer: David Anderson

2020-08-12 Thread Robert Metzger
Hi everyone,

On behalf of the PMC, I'm very happy to announce David Anderson as a new
Apache Flink committer.

David has been a Flink community member for a long time. His first commit
dates back to 2016; his code changes mostly involve the documentation, in
particular the recent contribution of Flink training materials.
Besides that, David has been giving numerous talks and trainings on Flink.
On StackOverflow, he's among the most active helping Flink users to solve
their problems (2nd in the all-time ranking, 1st in the last 30 days). A
similar level of activity can be found on the user@ mailing list.

Please join me in congratulating David for becoming a Flink committer!

Best,
Robert


[jira] [Created] (FLINK-18893) Python tests fails with "AppendStreamTableSink requires that Table has only insert changes."

2020-08-11 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-18893:
--

 Summary: Python tests fails with "AppendStreamTableSink requires 
that Table has only insert changes."
 Key: FLINK-18893
 URL: https://issues.apache.org/jira/browse/FLINK-18893
 Project: Flink
  Issue Type: Bug
  Components: API / Python, Tests
Reporter: Robert Metzger


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=5392&view=logs&j=9cada3cb-c1d3-5621-16da-0f718fb86602&t=455fddbf-5921-5b71-25ac-92992ad80b28

{code}
2020-08-11T09:33:00.6405057Z 
2020-08-11T09:33:00.6407257Z === FAILURES 
===
2020-08-11T09:33:00.6407774Z _ 
StreamPandasConversionTests.test_to_pandas_for_retract_table _
2020-08-11T09:33:00.6408007Z 
2020-08-11T09:33:00.6409313Z a = ('xro10353', , 'z:org.apache.flink.table.runtime.arrow.ArrowUtils', 
'collectAsPandasDataFrame')
2020-08-11T09:33:00.6409732Z kw = {}
2020-08-11T09:33:00.6410610Z s = 'org.apache.flink.table.api.TableException: 
AppendStreamTableSink requires that Table has only insert changes.'
2020-08-11T09:33:00.6413478Z stack_trace = 
'org.apache.flink.table.plan.nodes.datastream.DataStreamSink.writeToAppendSink(DataStreamSink.scala:118)\n\t
 at 
org.apapi.python.shaded.py4j.GatewayConnection.run(GatewayConnection.java:238)\n\t
 at java.lang.Thread.run(Thread.java:748)'
2020-08-11T09:33:00.6414468Z exception = 
'org.apache.flink.table.api.TableException'
2020-08-11T09:33:00.6414663Z 
2020-08-11T09:33:00.6414858Z def deco(*a, **kw):
2020-08-11T09:33:00.6415060Z try:
2020-08-11T09:33:00.6415274Z >   return f(*a, **kw)
2020-08-11T09:33:00.6415419Z 
2020-08-11T09:33:00.6415615Z pyflink/util/exceptions.py:147: 
2020-08-11T09:33:00.6415918Z _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
2020-08-11T09:33:00.6416154Z 
2020-08-11T09:33:00.6416459Z answer = 'xro10353'
2020-08-11T09:33:00.6416746Z gateway_client = 
2020-08-11T09:33:00.6419376Z target_id = 
'z:org.apache.flink.table.runtime.arrow.ArrowUtils'
2020-08-11T09:33:00.6420325Z name = 'collectAsPandasDataFrame'
2020-08-11T09:33:00.6420574Z 
2020-08-11T09:33:00.6421009Z def get_return_value(answer, gateway_client, 
target_id=None, name=None):
2020-08-11T09:33:00.6421735Z """Converts an answer received from the 
Java gateway into a Python object.
2020-08-11T09:33:00.6422153Z 
2020-08-11T09:33:00.6422538Z For example, string representation of 
integers are converted to Python
2020-08-11T09:33:00.6423113Z integer, string representation of objects 
are converted to JavaObject
2020-08-11T09:33:00.6423424Z instances, etc.
2020-08-11T09:33:00.6423594Z 
2020-08-11T09:33:00.6423844Z :param answer: the string returned by the 
Java gateway
2020-08-11T09:33:00.6424203Z :param gateway_client: the gateway client 
used to communicate with the Java
2020-08-11T09:33:00.6424604Z Gateway. Only necessary if the answer 
is a reference (e.g., object,
2020-08-11T09:33:00.6424874Z list, map)
2020-08-11T09:33:00.6425375Z :param target_id: the name of the object 
from which the answer comes from
2020-08-11T09:33:00.6425735Z (e.g., *object1* in 
`object1.hello()`). Optional.
2020-08-11T09:33:00.6426069Z :param name: the name of the member from 
which the answer comes from
2020-08-11T09:33:00.6426422Z (e.g., *hello* in `object1.hello()`). 
Optional.
2020-08-11T09:33:00.6426648Z """
2020-08-11T09:33:00.6426868Z if is_error(answer)[0]:
2020-08-11T09:33:00.6427112Z if len(answer) > 1:
2020-08-11T09:33:00.6427358Z type = answer[1]
2020-08-11T09:33:00.6427665Z value = 
OUTPUT_CONVERTER[type](answer[2:], gateway_client)
2020-08-11T09:33:00.6428032Z if answer[1] == REFERENCE_TYPE:
2020-08-11T09:33:00.6428313Z raise Py4JJavaError(
2020-08-11T09:33:00.6428691Z "An error occurred while 
calling {0}{1}{2}.\n".
2020-08-11T09:33:00.6429083Z >   format(target_id, ".", 
name), value)
2020-08-11T09:33:00.6429798Z E   py4j.protocol.Py4JJavaError: 
An error occurred while calling 
z:org.apache.flink.table.runtime.arrow.ArrowUtils.collectAsPandasDataFrame.
2020-08-11T09:33:00.6430529Z E   : 
org.apache.flink.table.api.TableException: AppendStreamTableSink requires that 
Table has only insert changes.
2020-08-11T09:33:00.6431132Z E  at 
org.apache.flink.table.plan.nodes.datastream.DataStreamSink.writeToAppendSink(DataStreamSink.scala:118)
2020-08-11T09:33:00.6431793Z E  at 
org.apache.flink.table.plan.nodes.datastream.DataStreamSin

Re: Error with Flink-Gelly, lastJobExecutionResult is null

2020-08-11 Thread Robert Metzger
Does this mean this is a valid issue?
If so, we could track it as a starter issue?

On Mon, Aug 3, 2020 at 10:59 AM Aljoscha Krettek 
wrote:

> Thanks for the pointer!
>
> On 03.08.20 10:29, Till Rohrmann wrote:
> > Hi Xia Rui,
> >
> > thanks for reporting this issue. I think FLINK-15116 [1] caused the
> > regression. The problem is indeed that we no longer set the
> > lastJobExecutionResult when using the ContextEnvironment. The problem has
> > not surfaced since, with other ExecutionEnvironments, we still set the
> > field correctly. I'm pulling in Aljoscha, Klou and Tison who worked on this
> > feature. I believe they can propose a fix for the problem.
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-15116
> >
> > On Sun, Aug 2, 2020 at 11:28 AM Xia Rui  wrote:
> >
> >> Hello, everyone.
> >>
> >> I am using Flink-Gelly. I got an error when running the example code of
> >> Gelly-example. I have reported the problem in Stackoverflow, and this is
> >> the link
> >> (https://stackoverflow.com/questions/63211746/error-with-flink-gelly-lastjobexecutionresult-is-null-for-executionenvironment)
> >>
> >>
> >>
> >> I am trying to figure out the error point. I traced the env.execute()
> >> function, and it is actually invoked in ContextEnvironment::execute()
> >> (link:
> >> https://github.com/apache/flink/blob/master/flink-clients/src/main/java/org/apache/flink/client/program/ContextEnvironment.java#L71).
> >> In fact, the
> >> variable lastJobExecutionResult (from ContextEnvironment's super class
> >> ExecutionEnvironment) is not set.
> >>
> >>
> >>
> >> I checked the history of ContextEnvironment, and found that
> >> lastJobExecutionResult was excluded from ClusterClient in FLINK-14456
> >> (https://issues.apache.org/jira/browse/FLINK-14456). This was merged to
> >> master after flink-1.10.
> >>
> >>
> >>
> >> I was wondering if I could set the lastJobExecutionResult in
> >> ContextEnvironment::execute() for my case (run Flink-Gelly on flink >=
> >> 1.10)
> >> without significant side effect.
> >>
> >>
> >>
> >> Thank you.
> >>
> >>
> >
>
>


Re: [DISCUSS][Release 1.12] Stale blockers and build instabilities

2020-08-11 Thread Robert Metzger
Hi team,

2 weeks have passed since the last update. None of the test instabilities
I've mentioned have been addressed since then.

Here's an updated status report of Blockers and Test instabilities:

Blockers <https://issues.apache.org/jira/browse/FLINK-18682?filter=12349334>:
Currently 3 blockers (2x Hive, 1x CI Infra)

Test-Instabilities
<https://issues.apache.org/jira/browse/FLINK-18869?filter=12348580> (total
79) which failed recently or frequently:


- FLINK-18807 <https://issues.apache.org/jira/browse/FLINK-18807>
FlinkKafkaProducerITCase.testScaleUpAfterScalingDown
failed with "Timeout expired after 6milliseconds while awaiting
EndTxn(COMMIT)"

- FLINK-18634 <https://issues.apache.org/jira/browse/FLINK-18634>
FlinkKafkaProducerITCase.testRecoverCommittedTransaction
failed with "Timeout expired after 6milliseconds while awaiting
InitProducerId"

- FLINK-16908 <https://issues.apache.org/jira/browse/FLINK-16908>
FlinkKafkaProducerITCase
testScaleUpAfterScalingDown Timeout expired while initializing
transactional state in 6ms.

- FLINK-13733 <https://issues.apache.org/jira/browse/FLINK-13733>
FlinkKafkaInternalProducerITCase.testHappyPath fails on Travis

--> The first three tickets seem related.


- FLINK-17260 <https://issues.apache.org/jira/browse/FLINK-17260>
StreamingKafkaITCase failure on Azure

--> This one seems really hard to reproduce


- FLINK-16768 <https://issues.apache.org/jira/browse/FLINK-16768>
HadoopS3RecoverableWriterITCase.testRecoverWithStateWithMultiPart
hangs

- FLINK-18374 <https://issues.apache.org/jira/browse/FLINK-18374>
HadoopS3RecoverableWriterITCase.testRecoverAfterMultiplePersistsStateWithMultiPart
produced no output for 900 seconds

--> Nobody seems to feel responsible for these tickets. My guess is that
the S3 connector should have shorter timeouts / faster retries to finish
within the 15-minute test timeout. Or there is really something wrong with
the code.


- FLINK-18333 <https://issues.apache.org/jira/browse/FLINK-18333>
UnsignedTypeConversionITCase failed caused by MariaDB4j "Asked to waitFor
Program"

- FLINK-17159 <https://issues.apache.org/jira/browse/FLINK-17159> ES6
ElasticsearchSinkITCase unstable

- FLINK-17949 <https://issues.apache.org/jira/browse/FLINK-17949>
KafkaShuffleITCase.testSerDeIngestionTime:156->testRecordSerDe:388
expected:<310> but was:<0>

- FLINK-18222 <https://issues.apache.org/jira/browse/FLINK-18222> "Avro
Confluent Schema Registry nightly end-to-end test" unstable with "Kafka
cluster did not start after 120 seconds"

- FLINK-17511 <https://issues.apache.org/jira/browse/FLINK-17511> "RocksDB
Memory Management end-to-end test" fails with "Current block cache usage
202123272 larger than expected memory limit 2"




On Mon, Jul 27, 2020 at 8:42 PM Robert Metzger  wrote:

> Hi team,
>
> We would like to use this thread as a permanent thread for
> regularly syncing on stale blockers (need to have somebody assigned within
> a week and progress, or a good plan) and build instabilities (need to check
> if its a blocker).
>
> Recent test-instabilities:
>
>- https://issues.apache.org/jira/browse/FLINK-17159 (ES6 test)
>- https://issues.apache.org/jira/browse/FLINK-16768 (s3 test unstable)
>- https://issues.apache.org/jira/browse/FLINK-18374 (s3 test unstable)
>- https://issues.apache.org/jira/browse/FLINK-17949
>(KafkaShuffleITCase)
>- https://issues.apache.org/jira/browse/FLINK-18634 (Kafka
>transactions)
>
>
> It would be nice if the committers taking care of these components could
> look into the test failures.
> If nothing happens, we'll personally reach out to people I believe they
> could look into the ticket.
>
> Best,
> Dian & Robert
>


Re: Status of FLIPs

2020-08-10 Thread Robert Metzger
Thanks for cleaning up our Wiki.

Seth and Timo: Can you check the status of the FLIPs mentioned in (3)?

On Wed, Aug 5, 2020 at 9:23 AM jincheng sun 
wrote:

> Good job Dian!
> Thank you for helping to maintain the state of FLIPs which is very
> important for the community to understand the progress of FLIPs.
>
> Best,
> Jincheng
>
>
> On Tue, Aug 4, 2020 at 5:41 PM, Dian Fu  wrote:
>
> > Hi all,
> >
> > When going through the status of existing FLIPs [1], I found that the
> > status of quite a few FLIPs is out of date.
> >
> > 1) For the FLIPs which I'm pretty sure are already finished (the
> > umbrella JIRA has been resolved), I have updated their status by
> > moving them from "Adopted/Accepted but unreleased FLIPs" to "Implemented
> > and Released FLIPs":
> >
> > - FLIP-52: Remove legacy Program interface.
> > - FLIP-57: Rework FunctionCatalog
> > - FLIP-63: Rework table partition support
> > - FLIP-68: Extend Core Table System with Pluggable Modules
> > - FLIP-73: Introducing Executors for job submission
> > - FLIP-81: Executor-related new ConfigOptions.
> > - FLIP-84: Improve & Refactor API of TableEnvironment
> > - FLIP-85: Flink Application Mode
> > - FLIP-86: Improve Connector Properties
> > - FLIP-87: Primary key constraints in Table API
> > - FLIP-92: Add N-Ary Stream Operator in Flink
> > - FLIP-103: Better TM/JM Log Display
> > - FLIP-123: DDL and DML compatibility for Hive connector
> > - FLIP-124: Add open/close and Collector to (De)SerializationSchema
> >
> >
> > 2) For the following FLIPs, it seems that the work has already been
> > finished. However, as the umbrella JIRA is still open, I'm leaving
> > their status as it is:
> >
> > - FLIP-55: Introduction of a Table API Java Expression DSL
> > - FLIP-93: JDBC catalog and Postgres catalog
> > - FLIP-110: Support LIKE clause in CREATE TABLE
> >
> >
> > 3) For the following FLIPs, there are still some open subtasks. However,
> > it seems that the main work has already been finished, so I guess (not
> > quite sure) maybe we should also move them to "Implemented and Released
> > FLIPs":
> >
> > - FLIP-43: State Processing API
> > - FLIP-66: Support time attribute in SQL DDL
> > - FLIP-69: Flink SQL DDL Enhancement
> > - FLIP-79: Flink Function DDL Support
> >
> > For the FLIPs under 2) and 3), it would be great if the people who are
> > familiar with them could double-check whether we should update their
> > status.
> >
> > Thanks,
> > Dian
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals
> > <
> >
> https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals
> > >
>


Re: [DISCUSS] Upgrade HBase connector to 2.2.x

2020-08-10 Thread Robert Metzger
@Jark: Thanks for bringing up these concerns.
All the problems you've mentioned are "solvable":
- uber jar: Bahir could provide an hbase1 uber jar (we could theoretically
also add a dependency from Flink to Bahir and provide the uber jar from
Flink)
- e2e tests: we know that the connector is stable, as long as we are not
adding major changes (or we are moving the respective e2e tests to Bahir).

On the other hand, I agree with you that supporting multiple versions of a
connector is pretty common (see Kafka or Elasticsearch), so why can't we
allow it for HBase now?

I'm really torn on this and would like to hear more opinions.


On Fri, Aug 7, 2020 at 11:24 PM Felipe Lolas  wrote:

> Hi all!
>
> I'm new here; I have been using the Flink connector for HBase 1.2, but
> recently opted to upgrade to HBase 2.1 (basically because it was bundled in
> CDH6).
>
> It would be nice to add support for HBase 2.x!
> I found that supporting HBase 1.4.3 and 2.1 needs minimal changes, and
> keeping that in mind I sent a PR last week with a solution supporting
> HBase 1.4.3/2.1.0 (maybe not the best; I'm sorry if I broke some rules by
> sending the PR).
>
> I would be happy to help if needed!
>
>
>
> Felipe.
>
> On Aug 7, 2020, at 10:53, Jark Wu  wrote:
>
> 
> I'm +1 to add HBase 2.x
>
> However, I have some concerns about moving HBase 1.x to Bahir:
> 1) As discussed above, there are still lots of people using HBase 1.x.
> 2) Bahir doesn't have the infrastructure to run the existing HBase E2E
> tests.
> 3) We also spent a lot of effort providing an uber connector jar for HBase
> (not yet released); it helps improve the out-of-the-box experience.
>
> My thought is that adding HBase 2.x doesn't have to mean removing HBase 1.x.
> It doesn't add too much work to maintain a new version.
> Keeping the old version can also help us develop the new one. I would
> suggest keeping HBase 1.x in the repository for at least one more release.
> Another idea is that maybe it's a good time to have an
> "apache/flink-connectors" repository, and move both HBase 1.x and 2.x to
> it.
> It would also be a good place to accept the contribution of the Pulsar
> connector and other connectors.
>
> Best,
> Jark
>
>
> On Fri, 7 Aug 2020 at 17:54, Robert Metzger  wrote:
>
>> Hi,
>>
>> Thank you for picking this up so quickly. I have no objections regarding
>> all the proposed items.
>> @Gyula: Once the bahir contribution is properly reviewed, ping me if you
>> need somebody to merge it.
>>
>>
>> On Fri, Aug 7, 2020 at 10:43 AM Márton Balassi 
>> wrote:
>>
>> > Hi Robert and Gyula,
>> >
>> > Thanks for reviving this thread. We have the implementation (currently
>> for
>> > 2.2.3) and it is straightforward to contribute it back. Miklos (cc'd) has
>> > recently written a readme for said version; he would be interested in
>> > contributing the upgraded connector back. The latest HBase version is
>> > 2.3.0; if we are touching the codebase anyway, I would propose to have
>> that.
>> >
>> > If everyone is comfortable with it, I would assign [1] to Miklos,
>> > double-checking that all functionality Felipe has proposed is
>> included.
>> > [1] https://issues.apache.org/jira/browse/FLINK-18795
>> > [2] https://hbase.apache.org/downloads.html
>> >
>> > On Fri, Aug 7, 2020 at 10:13 AM Gyula Fóra 
>> wrote:
>> >
>> >> Hi Robert,
>> >>
>> >> I completely agree with you on the Bahir based approach.
>> >>
>> >> I am happy to help with the contribution on the bahir side, with
>> thorough
>> >>  review and testing.
>> >>
>> >> Cheers,
>> >> Gyula
>> >>
>> >> On Fri, 7 Aug 2020 at 09:30, Robert Metzger 
>> wrote:
>> >>
>> >>> It seems that this thread is not on dev@ anymore. Adding it back ...
>> >>>
>> >>> On Fri, Aug 7, 2020 at 9:23 AM Robert Metzger 
>> >>> wrote:
>> >>>
>> >>>> I would like to revive this discussion. There's a new JIRA[1] + PR[2]
>> >>>> for adding HBase 2 support.
>> >>>>
>> >>>> It seems that there is demand for an HBase 2 connector, and consensus
>> to
>> >>>> do it.
>> >>>>
>> >>>> The remaining question in this thread seems to be the "how". I would
>> >>>> propose to go the other way around from what Gyula suggested: We move
>> the legacy
>> >>>> connector (1.4.x)

Re: [DISCUSS] Planning Flink 1.12

2020-08-10 Thread Robert Metzger
I updated the release date in the Wiki page.

On Sun, Aug 9, 2020 at 8:18 PM Yun Tang  wrote:

> +1 for extending the feature freeze due date.
> 
> From: Zhijiang 
> Sent: Thursday, August 6, 2020 17:05
> To: dev 
> Subject: Re: [DISCUSS] Planning Flink 1.12
>
> +1 on my side for feature freeze date by the end of Oct.
>
>
> --
> From:Yuan Mei 
> Send Time: Thursday, Aug 6, 2020, 14:54
> To:dev 
> Subject:Re: [DISCUSS] Planning Flink 1.12
>
> +1
>
> > +1 for extending the feature freeze date to the end of October.
>
>
>
> On Thu, Aug 6, 2020 at 12:08 PM Yu Li  wrote:
>
> > +1 for extending feature freeze date to end of October.
> >
> > Feature development in the master branch could be unblocked through
> > creating the release branch, but every coin has its two sides (smile)
> >
> > Best Regards,
> > Yu
> >
> >
> > On Wed, 5 Aug 2020 at 20:12, Robert Metzger  wrote:
> >
> > > Thanks all for your opinion.
> > >
> > > @Chesnay: That is a risk, but I hope the people responsible for
> > individual
> > > FLIPs plan accordingly. Extending the time till the feature freeze
> should
> > > not mean that we are extending the scope of the release.
> > > Ideally, features are done before FF, and they use the time till the
> > freeze
> > > for additional testing and documentation polishing.
> > > This FF will be virtual, there should be less disruption than a
> physical
> > > conference with all the travelling.
> > > Do you have a different proposal for the timing?
> > >
> > >
> > > I'm currently considering splitting the feature freeze and the release
> > > branch creation. Similar to the Linux kernel development, we could
> have a
> > > "merge window" and a stabilization phase. At the end of the
> stabilization
> > > phase, we cut the release branch and open the next merge window (I'll
> > start
> > > a separate thread regarding this towards the end of this release cycle,
> > if
> > > I still like the idea then)
> > >
> > >
> > > On Wed, Aug 5, 2020 at 12:04 PM Chesnay Schepler 
> > > wrote:
> > >
> > > > I'm a bit concerned about end of October, because it means we have
> > Flink
> > > > Forward, which usually means at least 1 week of little-to-no
> activity,
> > > > and then 1 week until feature-freeze.
> > > >
> > > > On 05/08/2020 11:56, jincheng sun wrote:
> > > > > +1 for end of October from me as well.
> > > > >
> > > > > Best,
> > > > > Jincheng
> > > > >
> > > > >
> > > > >> On Wed, Aug 5, 2020 at 4:59 PM, Kostas Kloudas  wrote:
> > > > >
> > > > >> +1 for end of October from me as well.
> > > > >>
> > > > >> Cheers,
> > > > >> Kostas
> > > > >>
> > > > >> On Wed, Aug 5, 2020 at 9:59 AM Till Rohrmann <
> trohrm...@apache.org>
> > > > wrote:
> > > > >>
> > > > >>> +1 for end of October from my side as well.
> > > > >>>
> > > > >>> Cheers,
> > > > >>> Till
> > > > >>>
> > > > >>> On Tue, Aug 4, 2020 at 9:46 PM Stephan Ewen 
> > > wrote:
> > > > >>>
> > > > >>>> The end of October sounds good from my side, unless it collides
> > with
> > > > >> some
> > > > >>>> holidays that affect many committers.
> > > > >>>>
> > > > >>>> Feature-wise, I believe we can definitely make good use of the
> > time
> > > to
> > > > >>> wrap
> > > > >>>> up some critical threads (like finishing the FLIP-27 source
> > > efforts).
> > > > >>>>
> > > > >>>> So +1 to the end of October from my side.
> > > > >>>>
> > > > >>>> Best,
> > > > >>>> Stephan
> > > > >>>>
> > > > >>>>
> > > > >>>> On Tue, Aug 4, 2020 at 8:59 AM Robert Metzger <
> > rmetz...@apache.org>
> > > > >>> wrote:
> > > > >>>>> Thanks a lot for commenting on the featur

Re: Adding a new "Docker Images" component to Jira

2020-08-10 Thread Robert Metzger
Thanks for the feedback.
Transaction committed.

On Mon, Aug 10, 2020 at 5:34 AM Matt Wang  wrote:

> +1 for unifying Deployment / Docker, Dockerfiles and Release System /
> Docker into Docker.
>
>
> --
>
> Best,
> Matt Wang
>
>
> On 08/10/2020 11:21, Yang Wang wrote:
> +1 for the unifying of the component name.
>
> Best,
> Yang
>
On Sat, Aug 8, 2020 at 6:30 PM, Andrey Zagrebin  wrote:
>
> +1 for the consolidation
>
> Best,
> Andrey
>
> On Fri, Aug 7, 2020 at 3:38 PM Till Rohrmann  wrote:
>
> +1 for unifying Deployment / Docker, Dockerfiles and Release System /
> Docker into Docker.
>
> Cheers,
> Till
>
> On Fri, Aug 7, 2020 at 12:18 PM Robert Metzger 
> wrote:
>
> Hi all,
>
> we now have 3 components containing the word "docker":
> - Deployment / Docker
> <https://issues.apache.org/jira/issues/?jql=project+%3D+FLINK+AND+component+%3D+%22Deployment+%2F+Docker%22>
> (63 issues)
> - Dockerfiles
> <https://issues.apache.org/jira/issues/?jql=project+%3D+FLINK+AND+component+%3D+Dockerfiles>
> (12 issues)
> - Release System / Docker
> <https://issues.apache.org/jira/issues/?jql=project+%3D+FLINK+AND+component+%3D+%22Release+System+%2F+Docker%22>
> (3 issues)
>
> I would suggest consolidating these three components into one, as there
> are
> not that many tickets for this aspect of Flink.
> Maybe we should just rename "Deployment / Docker" to "flink-docker",
> and
> merge the two other components into it?
>
>
> On Fri, Feb 21, 2020 at 11:47 AM Patrick Lucas 
> wrote:
>
> Thanks, Chesnay!
>
> On Fri, Feb 21, 2020 at 11:26 AM Chesnay Schepler <
> ches...@apache.org>
> wrote:
>
> I've added a "Release System / Docker" component.
>
> On 21/02/2020 11:19, Patrick Lucas wrote:
> Hi,
>
> Could someone with permissions add a new component to the FLINK project
> in Jira for the Docker images <https://github.com/apache/flink-docker>?
>
> There is already a "Deployment / Docker" component, but that's not quite
> the same as maintenance/improvements on the flink-docker images.
>
> Either top-level "Docker Images" or perhaps "Release / Docker Images"
> would be fine.
>
> Thanks,
> Patrick
>
>
>
>
>
>
>
>


[jira] [Created] (FLINK-18869) Batch SQL end-to-end test unstable due to terminated process

2020-08-10 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-18869:
--

 Summary: Batch SQL end-to-end test unstable due to terminated 
process
 Key: FLINK-18869
 URL: https://issues.apache.org/jira/browse/FLINK-18869
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Runtime, Tests
Affects Versions: 1.12.0
Reporter: Robert Metzger
 Fix For: 1.12.0


{code}
==
Running 'Batch SQL end-to-end test'
==
TEST_DATA_DIR: 
/home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/temp-test-directory-34920900539
Flink dist directory: 
/home/vsts/work/1/s/flink-dist/target/flink-1.12-SNAPSHOT-bin/flink-1.12-SNAPSHOT
Starting cluster.
Starting standalonesession daemon on host fv-az670.
Starting taskexecutor daemon on host fv-az670.
Waiting for Dispatcher REST endpoint to come up...
Waiting for Dispatcher REST endpoint to come up...
Waiting for Dispatcher REST endpoint to come up...
Dispatcher REST endpoint is up.
Job has been submitted with JobID 7a941de4e728b6da942dbff7badc955e
pass BatchSQL
Stopping taskexecutor daemon (pid: 21149) on host fv-az670.
Stopping standalonesession daemon (pid: 20848) on host fv-az670.
Skipping taskexecutor daemon (pid: 20970), because it is not running anymore on 
fv-az670.
Skipping taskexecutor daemon (pid: 21329), because it is not running anymore on 
fv-az670.
Skipping taskexecutor daemon (pid: 21742), because it is not running anymore on 
fv-az670.
Stopping taskexecutor daemon (pid: 22119) on host fv-az670.
Skipping taskexecutor daemon (pid: 25564), because it is not running anymore on 
fv-az670.
Skipping taskexecutor daemon (pid: 9154), because it is not running anymore on 
fv-az670.
Skipping taskexecutor daemon (pid: 9552), because it is not running anymore on 
fv-az670.
Skipping taskexecutor daemon (pid: 9851), because it is not running anymore on 
fv-az670.
Skipping taskexecutor daemon (pid: 15190), because it is not running anymore on 
fv-az670.
Skipping taskexecutor daemon (pid: 18068), because it is not running anymore on 
fv-az670.
Skipping taskexecutor daemon (pid: 18476), because it is not running anymore on 
fv-az670.
Skipping taskexecutor daemon (pid: 18783), because it is not running anymore on 
fv-az670.
/home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/test_batch_sql.sh: line 
72: 22119 Terminated  [[ ${zookeeper_process_count} -gt 0 ]]
[FAIL] Test script contains errors.
Checking for errors...
No errors in log files.
Checking for exceptions...
No exceptions in log files.
Checking for non-empty .out files...
No non-empty .out files.

[FAIL] 'Batch SQL end-to-end test' failed after 0 minutes and 16 seconds! Test 
exited with exit code 1
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Adding a new "Docker Images" component to Jira

2020-08-07 Thread Robert Metzger
Hi all,

we now have 3 components containing the word "docker":
- Deployment / Docker (63 issues)
- Dockerfiles (12 issues)
- Release System / Docker (3 issues)

I would suggest consolidating these three components into one, as there are
not that many tickets for this aspect of Flink.
Maybe we should just rename "Deployment / Docker" to "flink-docker", and
merge the two other components into it?


On Fri, Feb 21, 2020 at 11:47 AM Patrick Lucas 
wrote:

> Thanks, Chesnay!
>
> On Fri, Feb 21, 2020 at 11:26 AM Chesnay Schepler 
> wrote:
>
> > I've added a "Release System / Docker" component.
> >
> > On 21/02/2020 11:19, Patrick Lucas wrote:
> > > Hi,
> > >
> > > Could someone with permissions add a new component to the FLINK project
> > in
> > > Jira for the Docker images?
> > >
> > > There is already a "Deployment / Docker" component, but that's not
> quite
> > > the same as maintenance/improvements on the flink-docker images.
> > >
> > > Either top-level "Docker Images" or perhaps "Release / Docker Images"
> > would
> > > be fine.
> > >
> > > Thanks,
> > > Patrick
> > >
> >
> >
>


Re: [DISCUSS] Upgrade HBase connector to 2.2.x

2020-08-07 Thread Robert Metzger
Hi,

Thank you for picking this up so quickly. I have no objections regarding
all the proposed items.
@Gyula: Once the bahir contribution is properly reviewed, ping me if you
need somebody to merge it.


On Fri, Aug 7, 2020 at 10:43 AM Márton Balassi 
wrote:

> Hi Robert and Gyula,
>
> Thanks for reviving this thread. We have the implementation (currently for
> 2.2.3) and it is straightforward to contribute it back. Miklos (cc'd) has
> recently written a readme for said version; he would be interested in
> contributing the upgraded connector back. The latest HBase version is
> 2.3.0; if we are touching the codebase anyway, I would propose to have that.
>
> If everyone is comfortable with it, I would assign [1] to Miklos,
> double-checking that all functionality Felipe has proposed is included.
> [1] https://issues.apache.org/jira/browse/FLINK-18795
> [2] https://hbase.apache.org/downloads.html
>
> On Fri, Aug 7, 2020 at 10:13 AM Gyula Fóra  wrote:
>
>> Hi Robert,
>>
>> I completely agree with you on the Bahir based approach.
>>
>> I am happy to help with the contribution on the bahir side, with thorough
>>  review and testing.
>>
>> Cheers,
>> Gyula
>>
>> On Fri, 7 Aug 2020 at 09:30, Robert Metzger  wrote:
>>
>>> It seems that this thread is not on dev@ anymore. Adding it back ...
>>>
>>> On Fri, Aug 7, 2020 at 9:23 AM Robert Metzger 
>>> wrote:
>>>
>>>> I would like to revive this discussion. There's a new JIRA[1] + PR[2]
>>>> for adding HBase 2 support.
>>>>
>>>> It seems that there is demand for an HBase 2 connector, and consensus to
>>>> do it.
>>>>
>>>> The remaining question in this thread seems to be the "how". I would
>>>> propose to go the other way around from what Gyula suggested: We move the
>>>> legacy connector (1.4.x) to Bahir and add the new one (2.x.x) to Flink.
>>>> Why? In the Flink repo, we have a pretty solid testing infra, where we
>>>> also run HBase end-to-end tests. This will help us to stabilize the new
>>>> connector and ensure good quality.
>>>> Also, the perception of what goes into Flink and what goes into Bahir is
>>>> a bit clearer if we put the stable, up-to-date stuff into Flink, and
>>>> legacy, experimental, or unstable connectors into Bahir.
>>>>
>>>>
>>>> Who can take care of this effort? (Decide which Hbase 2 PR to take,
>>>> review and contribution to Bahir)
>>>>
>>>>
>>>> [1] https://issues.apache.org/jira/browse/FLINK-18795
>>>> [2] https://github.com/apache/flink/pull/13047
>>>>
>>>> On Mon, Jun 22, 2020 at 3:32 PM Gyula Fóra 
>>>> wrote:
>>>>
>>>>> If we were to go the bahir route, I don't see the point in migrating
>>>>> the 1.4.x version there since that's already available in Flink. To me 
>>>>> that
>>>>> is almost the same as dropping explicit support for 1.4 and telling users
>>>>> to use older connector versions if they wish to keep using it.
>>>>>
>>>>> If we want to keep 1.4 around for legacy users and slowly deprecate
>>>>> that, we can do that inside Flink and only push the 2.4.x version to 
>>>>> bahir.
>>>>>
>>>>> What do you think?
>>>>>
>>>>> Gyula
>>>>>
>>>>> On Mon, Jun 22, 2020 at 3:16 PM Arvid Heise 
>>>>> wrote:
>>>>>
>>>>>> If we support both HBase 1 and 2, maybe it's a good time to pull them
>>>>>> out to Bahir and list them in flink-packages to avoid adding even more
>>>>>> modules to Flink core?
>>>>>>
>>>>>> On Mon, Jun 22, 2020 at 4:05 AM OpenInx  wrote:
>>>>>>
>>>>>>> Hi
>>>>>>>
>>>>>>> According to my observation in the HBase community, there are still
>>>>>>> lots of HBase users running their production clusters with version 1.x 
>>>>>>> (1.4.x
>>>>>>> or 1.5.x), so I'd like to suggest
>>>>>>> supporting both an HBase 1.x and an HBase 2.x connector.
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>> On Sat, Jun 20, 2020 at 2:41 PM Ming Li 
>>>>>>> wrote:
>>>>>>>
>>>>>> +1 to support both HBase 2.x and HBase 1.4.x,  

Re: [DISCUSS] Upgrade HBase connector to 2.2.x

2020-08-07 Thread Robert Metzger
It seems that this thread is not on dev@ anymore. Adding it back ...

On Fri, Aug 7, 2020 at 9:23 AM Robert Metzger  wrote:

> I would like to revive this discussion. There's a new JIRA[1] + PR[2] for
> adding HBase 2 support.
>
> It seems that there is demand for an HBase 2 connector, and consensus to do
> it.
>
> The remaining question in this thread seems to be the "how". I would
> propose to go the other way around from what Gyula suggested: We move the
> legacy connector (1.4.x) to Bahir and add the new one (2.x.x) to Flink.
> Why? In the Flink repo, we have a pretty solid testing infra, where we
> also run HBase end-to-end tests. This will help us to stabilize the new
> connector and ensure good quality.
> Also, the perception of what goes into Flink and what goes into Bahir is a
> bit clearer if we put the stable, up-to-date stuff into Flink, and legacy,
> experimental, or unstable connectors into Bahir.
>
>
> Who can take care of this effort? (Decide which Hbase 2 PR to take, review
> and contribution to Bahir)
>
>
> [1] https://issues.apache.org/jira/browse/FLINK-18795
> [2] https://github.com/apache/flink/pull/13047
>
> On Mon, Jun 22, 2020 at 3:32 PM Gyula Fóra  wrote:
>
>> If we were to go the bahir route, I don't see the point in migrating the
>> 1.4.x version there since that's already available in Flink. To me that is
>> almost the same as dropping explicit support for 1.4 and telling users to
>> use older connector versions if they wish to keep using it.
>>
>> If we want to keep 1.4 around for legacy users and slowly deprecate that,
>> we can do that inside Flink and only push the 2.4.x version to bahir.
>>
>> What do you think?
>>
>> Gyula
>>
>> On Mon, Jun 22, 2020 at 3:16 PM Arvid Heise  wrote:
>>
>>> If we support both HBase 1 and 2, maybe it's a good time to pull them
>>> out to Bahir and list them in flink-packages to avoid adding even more
>>> modules to Flink core?
>>>
>>> On Mon, Jun 22, 2020 at 4:05 AM OpenInx  wrote:
>>>
>>>> Hi
>>>>
>>>> According to my observation in the HBase community, there are still
>>>> lots of HBase users running their production clusters with version 1.x (1.4.x
>>>> or 1.5.x), so I'd like to suggest
>>>> supporting both an HBase 1.x and an HBase 2.x connector.
>>>>
>>>> Thanks.
>>>>
>>>> On Sat, Jun 20, 2020 at 2:41 PM Ming Li  wrote:
>>>>
>>>>> +1 to support both HBase 2.x and HBase 1.4.x, just as we are
>>>>> doing for Kafka.
>>>>>
>>>>> On Fri, Jun 19, 2020 at 4:02 PM Yu Li  wrote:
>>>>>
>>>>>> One supplement:
>>>>>>
>>>>>> I noticed that there are discussions in HBase ML this March about
>>>>>> removing stable-1 pointer and got consensus [1], and will follow up in
>>>>>> HBase community about why we didn't take real action. However, this 
>>>>>> doesn't
>>>>>> change my previous statement / stand due to the number of 1.x usages in
>>>>>> production.
>>>>>>
>>>>>> Best Regards,
>>>>>> Yu
>>>>>>
>>>>>> [1]
>>>>>> http://mail-archives.apache.org/mod_mbox/hbase-dev/202003.mbox/%3c30180be2-bd93-d414-a158-16c9c8d01...@apache.org%3E
>>>>>>
>>>>>> On Fri, 19 Jun 2020 at 15:54, Yu Li  wrote:
>>>>>>
>>>>>>> +1 on upgrading the HBase version of the connector, and 1.4.3 is
>>>>>>> indeed an old version.
>>>>>>>
>>>>>>> OTOH, AFAIK there're still quite some 1.x HBase clusters in
>>>>>>> production. We could also see that the HBase community is still 
>>>>>>> maintaining
>>>>>>> 1.x release lines (with the "stable-1" release pointing to 1.4.13) [1]
>>>>>>>
>>>>>>> Please also notice that HBase follows semantic versioning [2] [3] and
>>>>>>> thus doesn't promise any kind of compatibility (source/binary/wire, etc.)
>>>>>>> between major versions. So if we only maintain a 2.x connector, it would 
>>>>>>> not
>>>>>>> be able to work with 1.x HBase clusters.
>>>>>>>
>>>>>>> I totally understand the additional efforts of maintaining two
>>>>>>> modules, but since we're also reserving multiple versions for kafka
>

Re: [DISCUSS] Releasing Flink 1.10.2

2020-08-07 Thread Robert Metzger
Thanks for taking care of this, Zhu Zhu. The list of bugs you compiled
certainly justifies pushing out a bugfix release.
I would propose to wait until Monday for people to speak up if they want to
have a fix included in the release. Otherwise, we could create the first RC
on Monday evening (China time).





On Thu, Aug 6, 2020 at 2:53 PM Till Rohrmann  wrote:

> Thanks for kicking this discussion off Zhu Zhu. +1 for the 1.10.2 release.
> Also thanks for volunteering as the release manager!
>
> Cheers,
> Till
>
> On Thu, Aug 6, 2020 at 1:26 PM Zhu Zhu  wrote:
>
> > Hi All,
> >
> > It has been more than 2 months since we released Flink 1.10.1. We already
> > have more than 60 resolved improvements/bugs in the release-1.10 branch.
> > Therefore, I propose to create the next bugfix release 1.10.2 for Flink
> > 1.10.
> >
> > Most noticeable fixes are:
> > - FLINK-18663 RestServerEndpoint may prevent server shutdown
> > - FLINK-18595 Deadlock during job shutdown
> > - FLINK-18539 StreamExecutionEnvironment#addSource(SourceFunction,
> > TypeInformation) doesn't use the user defined type information
> > - FLINK-18048 "--host" option could not take effect for standalone
> > application cluster
> > - FLINK-18045 Fix Kerberos credentials checking to unblock Flink on
> secured
> > MapR
> > - FLINK-18035 Executors#newCachedThreadPool could not work as expected
> > - FLINK-18012 Deactivate slot timeout if TaskSlotTable.tryMarkSlotActive
> is
> > called
> > - FLINK-17800 RocksDB optimizeForPointLookup results in missing time
> > windows
> > - FLINK-17558 Partitions are released in TaskExecutor Main Thread
> > - FLINK-17466 toRetractStream doesn't work correctly with Pojo conversion
> > class
> > - FLINK-16451 Fix IndexOutOfBoundsException for DISTINCT AGG with
> constants
> >
> > There is no known blocker issue of 1.10.2 release at the moment.
> >
> > I would volunteer as the release manager and kick off the release
> process.
> > What do you think?
> >
> > Please let me know if there are any concerns or any missing blocker
> issues
> > need to be fixed in 1.10.2.
> >
> > Thanks,
> > Zhu Zhu
> >
>


[jira] [Created] (FLINK-18829) Remove logback-related codepaths and configuration file

2020-08-05 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-18829:
--

 Summary: Remove logback-related codepaths and configuration file
 Key: FLINK-18829
 URL: https://issues.apache.org/jira/browse/FLINK-18829
 Project: Flink
  Issue Type: Task
  Components: Deployment / Scripts, Runtime / Coordination
Reporter: Robert Metzger


Flink currently (1.12-SNAPSHOT) provides support for two logging backends: 
Log4j2 (default) and logback.
The logback support is not really covered by our testing, and there's little 
evidence of widespread use.

Users can use a custom logging backend (as long as it's supported by slf4j) by 
providing the appropriate configuration files and jars in the classpath of all 
Flink services. Therefore, we should remove the logback-related files and 
codepaths in Flink and let our users set things up themselves.
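
For context, an illustrative sketch (an editor's addition, not part of the
ticket): Flink and user code log through the SLF4J facade, so whichever
backend is found on the classpath picks the events up; the class name below
is made up.

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class MyJob {
    // SLF4J binds to log4j2, logback, or any other supported backend at runtime.
    private static final Logger LOG = LoggerFactory.getLogger(MyJob.class);

    public static void main(String[] args) {
        LOG.info("Backend-agnostic logging via SLF4J");
    }
}
{code}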




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Planning Flink 1.12

2020-08-05 Thread Robert Metzger
Thanks all for your opinion.

@Chesnay: That is a risk, but I hope the people responsible for individual
FLIPs plan accordingly. Extending the time till the feature freeze should
not mean that we are extending the scope of the release.
Ideally, features are done before FF, and they use the time till the freeze
for additional testing and documentation polishing.
This FF will be virtual, there should be less disruption than a physical
conference with all the travelling.
Do you have a different proposal for the timing?


I'm currently considering splitting the feature freeze and the release
branch creation. Similar to the Linux kernel development, we could have a
"merge window" and a stabilization phase. At the end of the stabilization
phase, we cut the release branch and open the next merge window (I'll start
a separate thread regarding this towards the end of this release cycle, if
I still like the idea then)


On Wed, Aug 5, 2020 at 12:04 PM Chesnay Schepler  wrote:

> I'm a bit concerned about end of October, because it means we have Flink
> Forward, which usually means at least 1 week of little-to-no activity,
> and then 1 week until feature-freeze.
>
> On 05/08/2020 11:56, jincheng sun wrote:
> > +1 for end of October from me as well.
> >
> > Best,
> > Jincheng
> >
> >
> > On Wed, Aug 5, 2020 at 4:59 PM, Kostas Kloudas  wrote:
> >
> >> +1 for end of October from me as well.
> >>
> >> Cheers,
> >> Kostas
> >>
> >> On Wed, Aug 5, 2020 at 9:59 AM Till Rohrmann 
> wrote:
> >>
> >>> +1 for end of October from my side as well.
> >>>
> >>> Cheers,
> >>> Till
> >>>
> >>> On Tue, Aug 4, 2020 at 9:46 PM Stephan Ewen  wrote:
> >>>
> >>>> The end of October sounds good from my side, unless it collides with
> >> some
> >>>> holidays that affect many committers.
> >>>>
> >>>> Feature-wise, I believe we can definitely make good use of the time to
> >>> wrap
> >>>> up some critical threads (like finishing the FLIP-27 source efforts).
> >>>>
> >>>> So +1 to the end of October from my side.
> >>>>
> >>>> Best,
> >>>> Stephan
> >>>>
> >>>>
> >>>> On Tue, Aug 4, 2020 at 8:59 AM Robert Metzger 
> >>> wrote:
> >>>>> Thanks a lot for commenting on the feature freeze date.
> >>>>>
> >>>>> You are raising a few good points on the timing.
> >>>>> If we have already (2 months before) concerns regarding the deadline,
> >>>> then
> >>>>> I agree that we should move it till the end of October.
> >>>>>
> >>>>> We then just need to be careful not to run into the Christmas season
> >> at
> >>>> the
> >>>>> end of December.
> >>>>>
> >>>>> If nobody objects within a few days, I'll update the feature freeze
> >>> date
> >>>> in
> >>>>> the Wiki.
> >>>>>
> >>>>>
> >>>>> On Tue, Aug 4, 2020 at 7:52 AM Kurt Young  wrote:
> >>>>>
> >>>>>> Regarding setting the feature freeze date to late September, I have
> >>>> some
> >>>>>> concern that it might make
> >>>>>> the development time of 1.12 too short.
> >>>>>>
> >>>>>> One reason for this is we took too much time (about 1.5 month, from
> >>> mid
> >>>>> of
> >>>>>> May to beginning of July)
> >>>>>> for testing 1.11. It's not ideal, but further squeezing the
> >> development
> >>>> time
> >>>>>> of 1.12 won't make this better.
> >>>>>>   Besides, AFAIK July & August is also a popular vacation season for
> >>>>>> Europeans. Given the fact that most
> >>>>>>   committers of Flink come from Europe, I think we should also take
> >>> this
> >>>>>> into consideration.
> >>>>>>
> >>>>>> It's also true that the first week of October is the national
> >> holiday
> >>>> of
> >>>>>> China, so I'm wondering whether the
> >>>>>> end of October could be a candidate feature freeze date.
> >>>>>>
> >>>>>> Best,
> >>>>>> Kurt
> >>>>>

[jira] [Created] (FLINK-18815) AbstractCloseableRegistryTest.testClose unstable

2020-08-04 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-18815:
--

 Summary: AbstractCloseableRegistryTest.testClose unstable
 Key: FLINK-18815
 URL: https://issues.apache.org/jira/browse/FLINK-18815
 Project: Flink
  Issue Type: Bug
  Components: FileSystems, Tests
Affects Versions: 1.12.0
Reporter: Robert Metzger


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=5164=logs=0da23115-68bb-5dcd-192c-bd4c8adebde1=05b74a19-4ee4-5036-c46f-ada307df6cf0
{code}
[ERROR] Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.509 s 
<<< FAILURE! - in org.apache.flink.core.fs.SafetyNetCloseableRegistryTest
[ERROR] testClose(org.apache.flink.core.fs.SafetyNetCloseableRegistryTest)  
Time elapsed: 1.15 s  <<< FAILURE!
java.lang.AssertionError: expected:<0> but was:<-1>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:645)
at org.junit.Assert.assertEquals(Assert.java:631)
at 
org.apache.flink.core.fs.AbstractCloseableRegistryTest.testClose(AbstractCloseableRegistryTest.java:93)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
{code}




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Planning Flink 1.12

2020-08-04 Thread Robert Metzger
Thanks a lot for commenting on the feature freeze date.

You are raising a few good points on the timing.
If we have already (2 months before) concerns regarding the deadline, then
I agree that we should move it till the end of October.

We then just need to be careful not to run into the Christmas season at the
end of December.

If nobody objects within a few days, I'll update the feature freeze date in
the Wiki.


On Tue, Aug 4, 2020 at 7:52 AM Kurt Young  wrote:

> Regarding setting the feature freeze date to late September, I have some
> concern that it might make
> the development time of 1.12 too short.
>
> One reason for this is we took too much time (about 1.5 month, from mid of
> May to beginning of July)
> for testing 1.11. It's not ideal, but further squeezing the development time
> of 1.12 won't make this better.
>  Besides, AFAIK July & August is also a popular vacation season for
> Europeans. Given the fact that most
>  committers of Flink come from Europe, I think we should also take this
> into consideration.
>
> It's also true that the first week of October is the national holiday of
> China, so I'm wondering whether the
> end of October could be a candidate feature freeze date.
>
> Best,
> Kurt
>
>
> On Tue, Jul 28, 2020 at 2:41 AM Robert Metzger 
> wrote:
>
> > Hi all,
> >
> > Thanks a lot for the responses so far. I've put them into this Wiki page:
> > https://cwiki.apache.org/confluence/display/FLINK/1.12+Release to keep
> > track of them. Ideally, post JIRA tickets for your feature, then the
> status
> > will update automatically in the wiki :)
> >
> > Please keep posting features here, or add them to the Wiki yourself 
> >
> > @Prasanna kumar : Dynamic Auto Scaling
> is a
> > feature request the community is well-aware of. Till has posted
> > "Reactive-scaling mode" as a feature he's working on for the 1.12
> release.
> > This work will introduce the basic building blocks and partial support
> for
> > the feature you are requesting.
> > Proper support for dynamic scaling, while maintaining Flink's high
> > performance (throughput, low latency) and correctness is a difficult task
> > that needs a lot of work. It will probably take a little bit of time till
> > this is fully available.
> >
> > Cheers,
> > Robert
> >
> >
> >
> > On Thu, Jul 23, 2020 at 2:27 PM Till Rohrmann 
> > wrote:
> >
> > > Thanks for being our release managers for the 1.12 release Dian &
> Robert!
> > >
> > > Here are some features I would like to work on for this release:
> > >
> > > # Features
> > >
> > > ## Finishing pipelined region scheduling (
> > > https://issues.apache.org/jira/browse/FLINK-16430)
> > > With the pipelined region scheduler we want to implement a scheduler
> > which
> > > can serve streaming as well as batch workloads alike while being able
> to
> > > run jobs under constrained resources. The latter is particularly
> > important
> > > for bounded streaming jobs which, currently, are not well supported.
> > >
> > > ## Reactive-scaling mode
> > > Being able to react to newly available resources and rescaling a
> running
> > > job accordingly will make Flink's operation much easier because
> resources
> > > can then be controlled by an external tool (e.g. GCP autoscaling, K8s
> > > horizontal pod scaler, etc.). In this release we want to make a big
> step
> > > towards this direction. As a first step we want to support the
> execution
> > of
> > > jobs with a parallelism which is lower than the specified parallelism
> in
> > > case that Flink lost a TaskManager or could not acquire enough
> resources.
> > >
> > > # Maintenance/Stability
> > >
> > > ## JM / TM finished task reconciliation (
> > > https://issues.apache.org/jira/browse/FLINK-17075)
> > > This prevents the system from going out of sync if a task state change
> > from
> > > the TM to the JM is lost.
> > >
> > > ## Make metrics services work with Kubernetes deployments (
> > > https://issues.apache.org/jira/browse/FLINK-11127)
> > > Invert the direction in which the MetricFetcher connects to the
> > > MetricQueryServices. That way it will no longer be necessary to expose
> on
> > > K8s for every TaskManager a port on which the MetricQueryService runs.
> > This
> > > will then make the deployment of Flink clusters on K8s easier.
> > >
> > > ## Handle long-blocking operations during

[jira] [Created] (FLINK-18787) Add a "cluster-info" command to the Flink command line client

2020-07-31 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-18787:
--

 Summary: Add a "cluster-info" command to the Flink command line 
client
 Key: FLINK-18787
 URL: https://issues.apache.org/jira/browse/FLINK-18787
 Project: Flink
  Issue Type: New Feature
  Components: Command Line Client
Reporter: Robert Metzger


It would be useful for users to get access to cluster information from the 
command line as well.
This could include information accessible via the REST API: number of task 
managers, running jobs, configuration.
One other bit of information would be the current JobManager REST endpoint
address, as this is difficult to get when running on YARN, with HA, etc.

(This is related to FLINK-17641)
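
For illustration (an editor's sketch, not part of the ticket): much of this
information is already reachable over the REST API; /taskmanagers,
/jobs/overview and /jobmanager/config are existing endpoints, while the REST
address below is an assumption (and exactly what the proposed command would
discover for the user).

{code:java}
import java.io.InputStream;
import java.net.URL;

public class ClusterInfoSketch {
    public static void main(String[] args) throws Exception {
        String rest = "http://localhost:8081"; // assumed JobManager REST address
        for (String path : new String[] {"/taskmanagers", "/jobs/overview", "/jobmanager/config"}) {
            // Print the raw JSON responses; a real "cluster-info" command would format them.
            try (InputStream in = new URL(rest + path).openStream()) {
                System.out.println(path + " -> " + new String(in.readAllBytes()));
            }
        }
    }
}
{code}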



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[DISCUSS][Release 1.12] Stale blockers and build instabilities

2020-07-27 Thread Robert Metzger
Hi team,

We would like to use this thread as a permanent thread for
regularly syncing on stale blockers (need to have somebody assigned within
a week and progress, or a good plan) and build instabilities (need to check
if it's a blocker).

Recent test-instabilities:

   - https://issues.apache.org/jira/browse/FLINK-17159 (ES6 test)
   - https://issues.apache.org/jira/browse/FLINK-16768 (s3 test unstable)
   - https://issues.apache.org/jira/browse/FLINK-18374 (s3 test unstable)
   - https://issues.apache.org/jira/browse/FLINK-17949 (KafkaShuffleITCase)
   - https://issues.apache.org/jira/browse/FLINK-18634 (Kafka transactions)


It would be nice if the committers taking care of these components could
look into the test failures.
If nothing happens, we'll personally reach out to the people we believe
could look into the ticket.

Best,
Dian & Robert


Re: [DISCUSS] Planning Flink 1.12

2020-07-27 Thread Robert Metzger
on, pod affinity/anti-affinity setting, etc.
> >
> > 2. Support running PyFlink on Kubernetes
> > Description:
> > Support running PyFlink on Kubernetes, including session cluster and
> > application cluster.
> > Benefits:
> > Running python application in a containerized environment.
> >
> > 3. Support built-in init-Container
> > Description:
> > We need a built-in init-Container to help solve dependency management
> > in a containerized environment, especially in the application mode.
> > Benefits:
> > Separate the base Flink image from dynamic dependencies.
> >
> > 4. Support accessing secured services via K8s secrets
> > Description:
> > Kubernetes Secrets
> > <https://kubernetes.io/docs/concepts/configuration/secret/> can be used
> to
> > provide credentials for a Flink application to access secured services.
> It
> > helps people who want to use a user-specified K8s Secret through an
> > environment variable.
> > Benefits:
> > Improve user experience.
> >
> > 5. Support configuring replica of JobManager Deployment in ZooKeeper HA
> > setups
> > Description:
> > Make the *replica* of Deployment configurable in the ZooKeeper HA
> > setups.
> > Benefits:
> > Achieve faster failover.
> >
> > 6. Support to configure limit for CPU requirement
> > Description:
> > To leverage the Kubernetes feature of container request/limit CPU.
> > Benefits:
> > Reduce cost.
> >
> > Regards,
> > Canbin Zheng
> >
> > > On Thu, Jul 23, 2020 at 12:44 PM, Harold.Miao  wrote:
> >
> > > I'm excited to hear about this feature,  very, very, very highly
> > encouraged
> > >
> > >
> > > > On Thu, Jul 23, 2020 at 12:10 AM, Prasanna kumar  wrote:
> > >
> > > > Hi Flink Dev Team,
> > > >
> > > > Dynamic AutoScaling Based on the incoming data load would be a great
> > > > feature.
> > > >
> > > > We should be able have some rule say If the load increased by 20% ,
> add
> > > > extra resource should be added.
> > > > Or time based say during these peak hours the pipeline should scale
> > > > automatically by 50%.
> > > >
> > > > This will help a lot in cost reduction.
> > > >
> > > > EMR cluster provides a similar feature for SPARK based application.
> > > >
> > > > Thanks,
> > > > Prasanna.
> > > >
> > > > On Wed, Jul 22, 2020 at 5:40 PM Robert Metzger 
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > Now that the 1.11 release is out, it is time to plan for the next
> > major
> > > > > Flink release.
> > > > >
> > > > > Some items:
> > > > >
> > > > >1.
> > > > >
> > > > >Dian Fu and I volunteer to be the release managers for Flink
> > 1.12.
> > > > >
> > > > >
> > > > >
> > > > >2.
> > > > >
> > > > >Timeline: We propose to stick to our approximate 4 month release
> > > > cycle,
> > > > >thus the release should be done by late October. Given that
> > there’s
> > > a
> > > > >holiday week in China at the beginning of October, I propose to
> do
> > > the
> > > > >feature freeze on master by late September.
> > > > >
> > > > >3.
> > > > >
> > > > >Collecting features: It would be good to have a rough overview
> of
> > > the
> > > > >features that will likely be ready to be merged by late
> September,
> > > and
> > > > > that
> > > > >we want in the release.
> > > > >Based on the discussion, we will update the Roadmap on the Flink
> > > > website
> > > > >again!
> > > > >
> > > > >
> > > > >
> > > > >4.
> > > > >
> > > > >Test instabilities and blockers: I would like to avoid a
> situation
> > > > where
> > > > >we have many blocking issues or build instabilities at the time
> of
> > > the
> > > > >feature freeze. To achieve that, we will try to check every
> build
> > > > >instability within a week, to decide if it is a blocker (make
> sure
> > > to
> > > > > use
> > > > >the “test-stability” label for those tickets!)
> > > > >Blocker issues will need to have somebody assigned (responsible)
> > > > within
> > > > >a week, and we want to see progress on all blocker issues
> > > (downgrade,
> > > > >resolution, a good plan how to proceed if it is more
> complicated)
> > > > >
> > > > >5.
> > > > >
> > > > >Quality and stability of new features: In order to have a short
> > > > feature
> > > > >freeze phase, we encourage developers to only merge well-tested
> > and
> > > > >documented features. In our experience, the feature freeze works
> > > best
> > > > if
> > > > >new features are complete, and the community can focus fully on
> > > > > addressing
> > > > >newly found bugs and voting the release.
> > > > >By having a smooth release process, the next merge-window for
> the
> > > next
> > > > >release will come sooner.
> > > > >
> > > > >
> > > > > Let me know what you think about our items, and share which
> features
> > > you
> > > > > want in Flink 1.12.
> > > > >
> > > > > Best,
> > > > >
> > > > > Robert & Dian
> > > > >
> > > >
> > >
> > >
> > > --
> > >
> > > Best Regards,
> > > Harold Miao
> > >
> >
>


Re: Functions in Flick

2020-07-24 Thread Robert Metzger
Hi David,

1. Here's an example of how to deploy Stateful Functions on Kubernetes:
https://github.com/apache/flink-statefun/tree/master/statefun-examples/statefun-python-k8s-example
You don't need to install anything in a serverless environment to use it.

2. Since the underlying processing happens on the Flink framework, you can
process a lot of data with StateFun. How much data you can process depends
on the hardware you are using (network speed, memory, disk, ...). The
functions can process messages as large as a few megabytes, and
for the state size, the same restrictions as with Flink apply: With
RocksDB, you can store hundreds of GB of state per machine.
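
As an added illustration (not part of the original reply): a minimal sketch of
per-function persisted state with the embedded Java SDK, which is what ends up
in the configured state backend such as RocksDB; the function type and state
name are made up, and the API shape is assumed from StateFun 2.x.

{code:java}
import org.apache.flink.statefun.sdk.Context;
import org.apache.flink.statefun.sdk.FunctionType;
import org.apache.flink.statefun.sdk.StatefulFunction;
import org.apache.flink.statefun.sdk.annotations.Persisted;
import org.apache.flink.statefun.sdk.state.PersistedValue;

public class GreetCounterFn implements StatefulFunction {
    public static final FunctionType TYPE = new FunctionType("example", "greet-counter");

    // Persisted per-address state, kept in Flink's state backend (e.g. RocksDB),
    // so its size is bounded by the backend rather than the JVM heap.
    @Persisted
    private final PersistedValue<Integer> seen = PersistedValue.of("seen", Integer.class);

    @Override
    public void invoke(Context context, Object input) {
        seen.set(seen.getOrDefault(0) + 1);
    }
}
{code}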

Best,
Robert


On Sat, Jul 11, 2020 at 10:43 AM david david  wrote:

> Hi,
> I want to use Stateful Functions in Flink.
> Could you please clarify the points below?
>
> 1. To use the stateful functions in a serverless Kubernetes setup, what do I
> need to install in the serverless and Kubernetes environment?
> Looks like I have many options and I am looking for some detailed info.
>
> 2. How much data (or how large a file) can a stateful function process? Is
> there any limit on what the function can hold or process?
>
> Thanks in advance!
>


Re: [DISCUSS] Adding Azure Platform Support in DataStream, Table and SQL Connectors

2020-07-24 Thread Robert Metzger
Great! Looking forward to your first pull request.

I agree that having a connector in Flink's codebase will probably help its
adoption.
However, we are careful with the connectors we are adding to Flink for the
following reasons:
a) The Flink project is lacking people to maintain all connectors, leading
to a poor user experience. Some connectors have a lot of unresolved bugs
because there's no committer involved in the component.
b) Even if the connector is unmaintained, there's an overhead for the
project to carry it along: we sometimes make changes to the build
system, it complicates our license-checking process, and it slows down our
CI execution time.
c) For you as a contributor / maintainer of a connector, it could be
difficult to maintain the connector, because you will always need a
committer willing to review & merge your changes. We have no bad
intentions, it's just the reality of a big, busy open source project (of
course we will consider contributors actively maintaining a component over
a longer period of time for committership)

I'm not against adding connectors per se, but for "Azure Cognitive" and
"Azure Cosmos" I could not find any evidence (on the user@ mailing list or
google) that people are asking for such connectors. In my opinion, these
connectors should exist on flink-packages.org first, and once we see that
there's a lot of activity around them, we can revisit this decision.

For "Azure Event Hub", I'm open to discuss adding a connector to Flink. It
seems to have a Kafka-compatible endpoint, but I believe it'll lead to a
poor user experience (for authentication, exactly-once, etc.).

Please note that all I wrote above is my personal opinion, based on my
observations of the Flink project. Maybe other PMC members have a different
opinion.


On Fri, Jul 24, 2020 at 4:32 AM Israel Ekpo  wrote:

> Thanks for the guidance Robert. I appreciate the prompt response and will
> share the pull requests for the ADLS support [2] next week.
>
> Regarding the additional connectors, I would like to contribute them to the
> main codebase [1] if possible.
>
> I initially thought maybe it is better to start outside the core codebase,
> but I think adoption would be better if we have it covered in the core
> Flink documentation and reduce the level of effort users need to
> integrate it. We will take time to make sure the docs and tests
> for the connectors are solid, and then we can bring them one at a time into
> the core code base.
>
> [1] https://github.com/apache/flink/tree/master/flink-connectors
>
> [2] https://issues.apache.org/jira/browse/FLINK-18562
>
>
> On Thu, Jul 23, 2020 at 3:41 AM Robert Metzger 
> wrote:
>
> > Hi Israel,
> >
> > thanks a lot for reaching out! I'm very excited about your efforts to
> bring
> > additional Azure support into Flink.
> > There are ~50 threads on the user@ mailing list mentioning Azure --
> that's
> > good evidence that our users use Flink in Azure.
> >
> > Regarding your questions:
> >
> > Do I need to create a FLIP [2] in order to make these changes to bring
> the
> > > new capabilities or the individual JIRA issues are sufficient?
> >
> >
> > For the two DataLake FS tickets, I don't see the need for a FLIP.
> >
> > I am thinking about targeting Flink versions 1.10 through 1.12
> > > For new connectors like this, how many versions can/should this be
> > > integrated into?
> >
> >
> > We generally don't add new features to old releases (unless there's a
> very
> > good reason to backport the feature). Therefore, the new integrations
> will
> > all go into the next major Flink release (in this case probably Flink
> 1.12
> > for the first tickets).
> >
> > Are there any upcoming changes to supported Java and Scala versions that
> I
> > > need to be aware of?
> >
> >
> > No, I'm not aware of any upcoming changes. Java 8 and Java 11 are the two
> > versions we test against.
> >
> > My goal is to add the first two features [FLINK-18562] and [FLINK-18568]
> to
> > > the existing file system capabilities [1] and then have the other
> > > connectors FLINK-1856[3-7] exist as standalone plugins.
> >
> >
> > I like the order in which you approach the tickets. I assigned you to the
> > first ticket and commented on the second one. I'm also willing to help
> > review your pull requests.
> >
> > What do you mean by "standalone plugins" in the context of connectors?
> > Would you like to contribute these connectors to the main Flink codebase,
> > or maintain them outside Flink but list them in flink-packages.org?
> >
> > Best,
> > Robert

[jira] [Created] (FLINK-18697) Adding flink-table-api-java-bridge_2.11 to a Flink job kills the IDE logging

2020-07-23 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-18697:
--

 Summary: Adding flink-table-api-java-bridge_2.11 to a Flink job 
kills the IDE logging
 Key: FLINK-18697
 URL: https://issues.apache.org/jira/browse/FLINK-18697
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / API
Affects Versions: 1.11.0
Reporter: Robert Metzger


Steps to reproduce:
- Set up a Flink project using a Maven archetype
- Add "flink-table-api-java-bridge_2.11" as a dependency
- Running Flink won't produce any log output

Probable cause:
"flink-table-api-java-bridge_2.11" has a dependency to 
"org.apache.flink:flink-streaming-java_2.11:test-jar:tests:1.11.0", which 
contains a "log4j2-test.properties" file.

When I enable Log4j2 debug logging (with "-Dlog4j2.debug"), I see the following
line:
{code}
DEBUG StatusLogger Reconfiguration complete for context[name=3d4eac69] at URI 
jar:file:/Users/robert/.m2/repository/org/apache/flink/flink-streaming-java_2.11/1.11.0/flink-streaming-java_2.11-1.11.0-tests.jar!/log4j2-test.properties
 (org.apache.logging.log4j.core.LoggerContext@568bf312) with optional 
ClassLoader: null
{code}
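
An added diagnostic sketch (an editor's illustration, not from the original
ticket) for confirming which configuration file Log4j2 actually picked up,
using Log4j2 core's public LoggerContext/ConfigurationSource API:

{code:java}
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.core.LoggerContext;

public class WhichLog4jConfig {
    public static void main(String[] args) {
        // Ask the active logger context where its configuration was loaded from.
        LoggerContext ctx = (LoggerContext) LogManager.getContext(false);
        System.out.println(ctx.getConfiguration().getConfigurationSource().getLocation());
    }
}
{code}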





--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18696) Java 14 Records are not recognized as POJOs

2020-07-23 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-18696:
--

 Summary: Java 14 Records are not recognized as POJOs
 Key: FLINK-18696
 URL: https://issues.apache.org/jira/browse/FLINK-18696
 Project: Flink
  Issue Type: Improvement
  Components: API / Type Serialization System
Reporter: Robert Metzger


Having a record (https://openjdk.java.net/jeps/359) such as:
{code}
public record Simple(int id, String name) { }
{code}

Leads to this log message: "class de.robertmetzger.TableApiSql$Simple does not 
contain a setter for field id"

I believe the PojoSerializer should be able to use records.
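
For comparison, a hand-written sketch (added for illustration) of the
conventional POJO shape the analyzer accepts today: a public no-arg
constructor plus a getter and setter per field, which a record does not
provide.

{code:java}
public class Simple {
    private int id;
    private String name;

    public Simple() {}

    public int getId() { return id; }
    public void setId(int id) { this.id = id; }

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}
{code}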



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18685) JobClient.getAccumulators() blocks until streaming job has finished in local environment

2020-07-23 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-18685:
--

 Summary: JobClient.getAccumulators() blocks until streaming job 
has finished in local environment
 Key: FLINK-18685
 URL: https://issues.apache.org/jira/browse/FLINK-18685
 Project: Flink
  Issue Type: Bug
  Components: API / DataStream
Affects Versions: 1.11.0
Reporter: Robert Metzger


*Steps to reproduce:*
{code:java}
JobClient client = env.executeAsync("Test");

CompletableFuture<JobStatus> status = client.getJobStatus();
LOG.info("status = " + status.get());

CompletableFuture<Map<String, Object>> accumulators =
client.getAccumulators(StreamingJob.class.getClassLoader());
LOG.info("accus = " + accumulators.get(5, TimeUnit.SECONDS));
{code}

*Actual behavior*
The accumulators future will never complete for a streaming job when calling 
this just in your main() method from the IDE.

*Expected behavior*
Receive the accumulators of the running streaming job.
The JavaDocs of the method state the following: "Accumulators can be requested 
while it is running or after it has finished.". 
While it is technically true that I can request accumulators, I was expecting 
as a user that I can access the accumulators of a running job.
Also, I can request accumulators if I submit the job to a cluster.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Adding Azure Platform Support in DataStream, Table and SQL Connectors

2020-07-23 Thread Robert Metzger
Hi Israel,

thanks a lot for reaching out! I'm very excited about your efforts to bring
additional Azure support into Flink.
There are ~50 threads on the user@ mailing list mentioning Azure -- that's
good evidence that our users use Flink in Azure.

Regarding your questions:

Do I need to create a FLIP [2] in order to make these changes to bring the
> new capabilities or the individual JIRA issues are sufficient?


For the two DataLake FS tickets, I don't see the need for a FLIP.

I am thinking about targeting Flink versions 1.10 through 1.12
> For new connectors like this, how many versions can/should this be
> integrated into?


We generally don't add new features to old releases (unless there's a very
good reason to backport the feature). Therefore, the new integrations will
all go into the next major Flink release (in this case probably Flink 1.12
for the first tickets).

Are there any upcoming changes to supported Java and Scala versions that I
> need to be aware of?


No, I'm not aware of any upcoming changes. Java 8 and Java 11 are the two
versions we test against.

My goal is to add the first two features [FLINK-18562] and [FLINK-18568] to
> the existing file system capabilities [1] and then have the other
> connectors FLINK-1856[3-7] exist as standalone plugins.


I like the order in which you approach the tickets. I assigned you to the
first ticket and commented on the second one. I'm also willing to help
review your pull requests.

What do you mean by "standalone plugins" in the context of connectors?
Would you like to contribute these connectors to the main Flink codebase,
or maintain them outside Flink but list them in flink-packages.org?

Best,
Robert


On Wed, Jul 22, 2020 at 10:43 AM Israel Ekpo  wrote:

> I have opened the following issues to track new efforts to bring additional
> Azure Support in the following areas to the connectors ecosystem.
>
> My goal is to add the first two features [FLINK-18562] and [FLINK-18568] to
> the existing file system capabilities [1] and then have the other
> connectors FLINK-1856[3-7] exist as standalone plugins.
>
> As more users adopt the additional connectors, we could incrementally bring
> them into the core code base if necessary with sufficient support.
>
> I am new to the process so that I have a few questions:
>
> Do I need to create a FLIP [2] in order to make these changes to bring the
> new capabilities or the individual JIRA issues are sufficient?
>
> I am thinking about targeting Flink versions 1.10 through 1.12
> For new connectors like this, how many versions can/should this be
> integrated into?
>
> Are there any upcoming changes to supported Java and Scala versions that I
> need to be aware of?
>
> Any ideas or suggestions you have would be great.
>
> Below is a summary of the JIRA issues that were created to track the effort
>
> Add Support for Azure Data Lake Store Gen 2 in Flink File System
> https://issues.apache.org/jira/browse/FLINK-18562
>
> Add Support for Azure Data Lake Store Gen 2 in Streaming File Sink
> https://issues.apache.org/jira/browse/FLINK-18568
>
> Add Support for Azure Cosmos DB DataStream Connector
> https://issues.apache.org/jira/browse/FLINK-18563
>
> Add Support for Azure Event Hub DataStream Connector
> https://issues.apache.org/jira/browse/FLINK-18564
>
> Add Support for Azure Event Grid DataStream Connector
> https://issues.apache.org/jira/browse/FLINK-18565
>
> Add Support for Azure Cognitive Search DataStream Connector
> https://issues.apache.org/jira/browse/FLINK-18566
>
> Add Support for Azure Cognitive Search Table & SQL Connector
> https://issues.apache.org/jira/browse/FLINK-18567
>
>
> [1] https://github.com/apache/flink/tree/master/flink-filesystems
> [2]
>
> https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals
>


[DISCUSS] Planning Flink 1.12

2020-07-22 Thread Robert Metzger
Hi all,

Now that the 1.11 release is out, it is time to plan for the next major
Flink release.

Some items:

   1. Dian Fu and I volunteer to be the release managers for Flink 1.12.

   2. Timeline: We propose to stick to our approximate 4 month release cycle,
   thus the release should be done by late October. Given that there’s a
   holiday week in China at the beginning of October, I propose to do the
   feature freeze on master by late September.

   3. Collecting features: It would be good to have a rough overview of the
   features that will likely be ready to be merged by late September, and that
   we want in the release.
   Based on the discussion, we will update the Roadmap on the Flink website
   again!

   4. Test instabilities and blockers: I would like to avoid a situation where
   we have many blocking issues or build instabilities at the time of the
   feature freeze. To achieve that, we will try to check every build
   instability within a week, to decide if it is a blocker (make sure to use
   the “test-stability” label for those tickets!).
   Blocker issues will need to have somebody assigned (responsible) within
   a week, and we want to see progress on all blocker issues (downgrade,
   resolution, or a good plan for how to proceed if it is more complicated).

   5. Quality and stability of new features: In order to have a short feature
   freeze phase, we encourage developers to only merge well-tested and
   documented features. In our experience, the feature freeze works best if
   new features are complete, and the community can focus fully on addressing
   newly found bugs and voting on the release.
   By having a smooth release process, the merge window for the next
   release will come sooner.


Let me know what you think about our items, and share which features you
want in Flink 1.12.

Best,

Robert & Dian


[jira] [Created] (FLINK-18669) Remove testing-related code from production code in PubSub connector

2020-07-22 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-18669:
--

 Summary: Remove testing-related code from production code in 
PubSub connector
 Key: FLINK-18669
 URL: https://issues.apache.org/jira/browse/FLINK-18669
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Google Cloud PubSub
Reporter: Robert Metzger


(See also: https://github.com/apache/flink/pull/12846#discussion_r457380452)

At least PubSubSink.java contains some testing-related code that could 
probably be completely removed from the production code.
As part of this ticket, we should also check if there are other cases like this.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Leverage Maven Wrapper

2020-07-03 Thread Robert Metzger
We could also build releases by calling "mvn package" again in
"flink-dist". But all these solutions are far from elegant.

Ideally the Maven folks have implemented something nicer by now. Let's see
what they say.

On Fri, Jul 3, 2020 at 1:20 PM Chesnay Schepler  wrote:

> It's not that *difficult* for us to work around it, mind you; we "just"
> have to
>
> a) separate the distribution packaging from the flink-dist jar from the
> distribution assembly, so we can have dependencies on the various
> opt/plugins modules without pulling in dependencies into the flink-dist jar,
> b) get rid of as many intermediate shading instances as possible, (i.e.,
> flink-runtime which is bundled into flink-dist)
> c) in cases where we cannot get rid of intermediate shading (e.g., because
> we want to have 2 different versions of a single dependency bundled with
> separate packages), add exclusions to the shaded dependencies. (this is
> essentially what the shade-plugin does for the dependency-reduced pom)
>
> On 03/07/2020 10:12, Robert Metzger wrote:
>
> I just reached out to the users@maven mailing list again to check if
> there's any resolution for shading behavior post 3.2.5 [1]
>
>
> [1]https://lists.apache.org/thread.html/8b2dcf462de814d06d8e30bafce2c886217c5790a3ee07d33d0b8dfc%40%3Cusers.maven.apache.org%3E
>
> On Thu, Jun 4, 2020 at 3:08 PM Chesnay Schepler  
>  wrote:
>
>
> You don't necessarily need to use maven 3.2.5, you just have to know
> what to watch out for.
> For that reason we are _not_ forcing maven 3.2.5 in general, but for
> releases only to be on the safe side.
>
> Some time ago Gradle was brought up as a potential replacement for
> maven, and I'd like to see that discussion concluded before making other
> major changes to the maven development process.
>
> Overall I'm sympathetic to the idea, but not really sold.
> If we switch to gradle we obviously don't need it (duh);
> if we stick with maven we will have to find a solution for the 3.2.5
> limitation eventually, and I'd much rather solve this problem than keep
> working around the limitation.
>
> On 03/06/2020 15:48, tison wrote:
>
> Hi devs,
>
> Flink forces a fixed version (3.2.5) of Maven, while higher versions suffer
> from shade issues and so on.
>
> Since different projects have different requirements for Maven, it seems a
> good idea to add a maven wrapper [1] to our repository, which reduces our
> developers' burden.
>
> Any thoughts?
>
> Best,
> tison.
>
> [1] https://github.com/takari/maven-wrapper
>
>
>


Re: [DISCUSS] Leverage Maven Wrapper

2020-07-03 Thread Robert Metzger
I just reached out to the users@maven mailing list again to check if
there's any resolution for shading behavior post 3.2.5 [1]


[1]
https://lists.apache.org/thread.html/8b2dcf462de814d06d8e30bafce2c886217c5790a3ee07d33d0b8dfc%40%3Cusers.maven.apache.org%3E

On Thu, Jun 4, 2020 at 3:08 PM Chesnay Schepler  wrote:

> You don't necessarily need to use maven 3.2.5, you just have to know
> what to watch out for.
> For that reason we are _not_ forcing maven 3.2.5 in general, but for
> releases only to be on the safe side.
>
> Some time ago Gradle was brought up as a potential replacement for
> maven, and I'd like to see that discussion concluded before making other
> major changes to the maven development process.
>
> Overall I'm sympathetic to the idea, but not really sold.
> If we switch to gradle we obviously don't need it (duh);
> if we stick with maven we will have to find a solution for the 3.2.5
> limitation eventually, and I'd much rather solve this problem than keep
> working around the limitation.
>
> On 03/06/2020 15:48, tison wrote:
> > Hi devs,
> >
> > Flink forces a fixed version (3.2.5) of Maven, while higher versions suffer
> > from shade issues and so on.
> >
> > Since different projects have different requirements for Maven, it seems a
> > good idea to add a maven wrapper [1] to our repository, which reduces our
> > developers' burden.
> >
> > Any thoughts?
> >
> > Best,
> > tison.
> >
> > [1] https://github.com/takari/maven-wrapper
> >
>
>


Re: Moving in and Modify Dependency Source

2020-07-03 Thread Robert Metzger
We have documented how the licensing works here:
https://cwiki.apache.org/confluence/display/FLINK/Licensing
(There's a section on the "licenses directory")

In this case, I don't think you'll need to include the apache license in
the licenses/ directory (because it's the Apache license)

On Tue, Jun 16, 2020 at 1:42 AM Austin Cawley-Edwards <
austin.caw...@gmail.com> wrote:

> Ah, missed Till's response -- thanks as well!
>
> I'll add those headers to the files, so now I'm just wondering about including
> the licenses/ notice in the RMQ connector resources.
>
> On Mon, Jun 15, 2020 at 7:40 PM Austin Cawley-Edwards <
> austin.caw...@gmail.com> wrote:
>
> > Hey Robert,
> >
> > Thanks for getting back to me! Just wasn't sure on the license header
> > requirements for the CI checks in Flink. Not too experienced with working
> > with licenses, especially in large open-source projects. Since we would
> be
> > using APL 2 (and from this link[1] we need to state changes, include the
> > copyright, add to a notice file, add to licenses), would I just include
> > their copyright at the top of the file and then state the changes I've
> made
> > there, or somewhere else? Do I need to create a new NOTICE file/ licenses
> > in the RMQ connector resources or add it to another file?
> >
> > Sorry for all the questions! ... is there anywhere in the docs that
> > addresses this?
> >
> > Best + thanks again,
> > Austin
> >
> > [1]: https://tldrlegal.com/license/apache-license-2.0-%28apache-2.0%29
> >
> > On Mon, Jun 15, 2020 at 4:20 AM Robert Metzger 
> > wrote:
> >
> >> Hi Austin,
> >> Thanks for working on the RMQ connector! There seem to be a few users
> >> affected by that issue.
> >>
> >> The GitHub page confirms that users can choose from the three licenses:
> >> https://github.com/rabbitmq/rabbitmq-java-client#license:
> >>
> >> > This means that the user can consider the library to be licensed under
> >> any
> >> > of the licenses from the list above. For example, you may choose the
> >> > Apache Public License 2.0 and include this client into a commercial
> >> > product. Projects that are licensed under the GPLv2 may choose GPLv2,
> >> and
> >> > so on.
> >>
> >>
> >> Best,
> >> Robert
> >>
> >> On Mon, Jun 15, 2020 at 8:59 AM Till Rohrmann 
> >> wrote:
> >>
> >> > Hi Austin,
> >> >
> >> > usually if source code is multi licensed then this means that the user
> >> can
> >> > choose the license under which he wants to use it. In our case it
> would
> >> be
> >> > the Apache License version 2. But you should check the license text to
> >> make
> >> > sure that this has not been forbidden explicitly.
> >> >
> >> > When copying code from another project, the practice is to annotate it
> >> with
> >> > a comment stating from where the code was obtained. So in your case
> you
> >> > would give these files the ASL license header and add a comment to the
> >> > source code from where it was copied.
> >> >
> >> > Cheers,
> >> > Till
> >> >
> >> > On Sat, Jun 13, 2020 at 10:41 PM Austin Cawley-Edwards <
> >> > austin.caw...@gmail.com> wrote:
> >> >
> >> > > Hi all,
> >> > >
> >> > > I'm working on [FLINK-10195] on the RabbitMQ connector which
> involves
> >> > > modifying some of the RMQ client source code (that has been moved
> out
> >> of
> >> > > that package) and bringing it into Flink. The RMQ client code is
> >> > > triple-licensed under Mozilla Public License 1.1 ("MPL"), the GNU
> >> General
> >> > > Public License version 2 ("GPL"), and the Apache License version 2
> >> > ("ASL").
> >> > >
> >> > > Does anyone have experience doing something similar/ what I would
> >> need to
> >> > > do in terms of the license headers in the Flink source files?
> >> > >
> >> > > Thank you,
> >> > > Austin
> >> > >
> >> > > [FLINK-10195]: https://issues.apache.org/jira/browse/FLINK-10195
> >> > >
> >> >
> >>
> >
>
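To make the above concrete: under the Apache License option, a Flink source file copied from the RMQ client would typically carry the standard ASF header plus a provenance note, roughly like the sketch below (the provenance wording is illustrative, not prescribed):

{code:java}
/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

/*
 * This class is copied (with modifications) from the RabbitMQ Java client
 * (https://github.com/rabbitmq/rabbitmq-java-client), which is
 * triple-licensed under MPL 1.1, GPL v2, and the Apache License v2;
 * it is used here under the Apache License v2 option.
 */
package org.apache.flink.streaming.connectors.rabbitmq;
{code}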


Re: [DISCUSS] Semantics of our JIRA fields

2020-07-03 Thread Robert Metzger
Since there were no further comments on this discussion, I removed the
"draft" label from the Wiki page and I consider the Jira semantics proposal
agreed upon.

On Mon, Jun 15, 2020 at 9:49 AM Piotr Nowojski  wrote:

>
> > On 12 Jun 2020, at 15:44, Robert Metzger  wrote:
> >
> > Piotrek, do you agree with my "affects version" explanation? I would like
> > to bring this discussion to a conclusion.
> >
>
> +0 for this semantic from my side.
>
> > On Tue, May 26, 2020 at 4:51 PM Till Rohrmann 
> wrote:
> >
> >> If we change the meaning of the priority levels, then I would suggest to
> >> have a dedicated discussion for it. This would also be more visible than
> >> compared to being hidden in some lengthy discussion thread. I think the
> >> proposed definitions of priority levels differ slightly from how the
> >> community worked in the past.
> >>
> >> Cheers,
> >> Till
> >>
> >> On Tue, May 26, 2020 at 4:30 PM Robert Metzger 
> >> wrote:
> >>
> >>> Hi,
> >>>
> >>> 1. I'm okay with updating the definition of the priorities for the
> reason
> >>> you've mentioned.
> >>>
> >>> 2. "Affects version"
> >>>
> >>> The reason why I like to mark affects version on unreleased versions is
> to
> >>> clearly indicate which branch is affected by a bug. Given the current
> >> Flink
> >>> release status, if there's a bug only in "release-1.11", but not in
> >>> "master", there is no way of figuring that out, if we only allow
> released
> >>> versions for "affects version" (In my proposal, you would set "affects
> >>> version" to '1.11.0', '1.12.0' to indicate that).
> >>>
> >>> What we could do is introduce "1.12-SNAPSHOT" as version to mark issues
> >> on
> >>> unreleased versions. (But then people might accidentally set the "fix
> >>> version" to a "-SNAPSHOT" version.)
> >>>
> >>> I'm still in favor of my proposal. I have never heard a report from a
> >>> confused user about our Jira fields (I guess they usually check bugs
> for
> >>> released versions only)
> >>>
> >>>
> >>> On Tue, May 26, 2020 at 12:39 PM Piotr Nowojski 
> >>> wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> Sorry for a bit late response. I have two concerns:
> >>>>
> >>>> 1. Priority
> >>>>
> >>>> I would propose to stretch priorities that we are using to
> >> differentiate
> >>>> between things that must be fixed for a given release:
> >>>>
> >>>> BLOCKER - drop anything you are doing, this issue must be fixed right
> >> now
> >>>> CRITICAL - release can not happen without fixing it, but can be fixed
> a
> >>>> bit later (for example without context switching and dropping whatever
> >>> I’m
> >>>> doing right now)
> >>>> MAJOR - default, nice to have
> >>>> Anything below - meh
> >>>>
> >>>> We were already using this semantic for tracking test instabilities
> >>> during
> >>>> the 1.11 release cycle. Good examples:
> >>>>
> >>>> BLOCKER - master branch not compiling, very frequent test failures
> (for
> >>>> example almost every build affected), …
> >>>> CRITICAL - performance regression/bug that we introduced in some
> >> feature,
> >>>> but which is not affecting other developers as much
> >>>> MAJOR - freshly discovered test instability with unknown
> >> impact/frequency
> >>>> (could be happening once a year),
> >>>>
> >>>> 2. Affects version
> >>>>
> >>>> If a bug is only on the master branch, does it affect an unreleased
> >>> version?
> >>>>
> >>>> So far I was assuming that it doesn’t - unreleased bugs would have
> >> empty
> >>>> “affects version” field. My reasoning was that this field should be
> >> used
> >>>> for Flink users, to check which RELEASED Flink versions are affected
> by
> >>>> some bug, that user is searching for. Otherwise it might be a bit
> >>> confusing
> >>>> if there are lots of bugs with both affects version

Re: Does FlinkKinesisConsumer not retry on NoHttpResponseException?

2020-07-03 Thread Robert Metzger
For the others on the dev@ list: I responded on SO.

On Tue, Jun 16, 2020 at 7:56 AM Singh Aulakh, Karanpreet KP
 wrote:

> Hello!
>
> (Apache Flink1.8 on AWS EMR release label 5.28.x)
>
> Our data source is an AWS Kinesis stream (with 450 shards if that
> matters). We use the FlinkKinesisConsumer to read the kinesis stream. Our
> application occasionally (once every couple of days) crashes with a "Target
> server failed to respond" error. The full stack trace is at the bottom.
>
> Looking more into the codebase, I found out that
> 'ProvisionedThroughputExceededException' is the only exception type that is
> retried on. Code:
> https://github.com/apache/flink/blob/2b0a8ceeb131c938d2e41dfee66099bfa5f366ae/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/proxy/KinesisProxy.java#L365
> 1. Wondering why a transient http response exception is not retried by the
> kinesis connector?
> 2. Is there a way I can pass in a retry configuration that will retry on
> these errors?
>
> As a side note, we set the following retry configuration -
>
> env.setRestartStrategy(RestartStrategies.failureRateRestart(12,
>     org.apache.flink.api.common.time.Time.of(60, TimeUnit.MINUTES),
>     org.apache.flink.api.common.time.Time.of(300, TimeUnit.SECONDS)));
>
> Full stack trace of the exception -
>
> at
> org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1201)
>
> at
> org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1147)
>
> at
> org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:796)
>
> at
> org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:764)
>
> at
> org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:738)
>
> at
> org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:698)
>
> at
> org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:680)
>
> at
> org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:544)
>
> at
> org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:524)
>
> at
> org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.AmazonKinesisClient.doInvoke(AmazonKinesisClient.java:2809)
>
> at
> org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.AmazonKinesisClient.invoke(AmazonKinesisClient.java:2776)
>
> at
> org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.AmazonKinesisClient.invoke(AmazonKinesisClient.java:2765)
>
> at
> org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.AmazonKinesisClient.executeGetRecords(AmazonKinesisClient.java:1292)
>
> at
> org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.AmazonKinesisClient.getRecords(AmazonKinesisClient.java:1263)
>
> at
> org.apache.flink.streaming.connectors.kinesis.proxy.KinesisProxy.getRecords(KinesisProxy.java:250)
>
> at
> org.apache.flink.streaming.connectors.kinesis.internals.ShardConsumer.getRecords(ShardConsumer.java:400)
>
> at
> org.apache.flink.streaming.connectors.kinesis.internals.ShardConsumer.run(ShardConsumer.java:243)
>
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>
> at java.lang.Thread.run(Thread.java:748)
>
> (Shamelessly copy pasting from the stack overflow question I posted
> https://stackoverflow.com/questions/62399248/flinkkinesisconsumer-does-not-retry-on-nohttpresponseexception
> )
>
> --
> KP
>
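For reference, here is the retry configuration from the question as a self-contained, compilable sketch. Note that this failure-rate restart strategy only governs full job restarts; it does not change the connector-internal retry behavior asked about above:

{code:java}
import java.util.concurrent.TimeUnit;

import org.apache.flink.api.common.restartstrategy.RestartStrategies;
import org.apache.flink.api.common.time.Time;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class FailureRateRestartExample {

    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Tolerate at most 12 failures within a 60-minute window, waiting
        // 300 seconds between restart attempts; exceeding the rate fails the job.
        env.setRestartStrategy(RestartStrategies.failureRateRestart(
                12,                             // max failures per interval
                Time.of(60, TimeUnit.MINUTES),  // measuring interval
                Time.of(300, TimeUnit.SECONDS)  // delay between attempts
        ));
    }
}
{code}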


Re: [DISCUSS] Drop connectors for 5.x and restart the flink es source connector

2020-07-03 Thread Robert Metzger
The discussion on dropping the ES5 connector was not conclusive when we
discussed it in February 2020. We wanted to revisit it for the 1.12 release.

From Maven Central, we have the following download numbers:
ES2: 500 downloads
ES5: 10500 downloads (the es5_2.10:1.3.1 had 8000 downloads last month. I
guess there's a CI system or something downloading all these)
ES6: 4200 downloads
ES7: 1800 downloads

For 1.10.0 we had the following numbers:
ES5: 500
ES6: 525
ES7: 840

Based on these numbers, I would advise against removing the ES5 connector
for the 1.12 release.



On Fri, Jun 19, 2020 at 9:53 AM Jark Wu  wrote:

> I'm fine with dropping support for es5.
>
> forward to dev@.
>
> Best,
> Jark
>
>
>
> On Fri, 19 Jun 2020 at 15:46, jackylau  wrote:
>
> > Hi all:
> > While coding the elasticsearch source connector here
> >
> > https://github.com/liuyongvs/flink/commit/c397a759d05956629a27bf850458dd4e70330189
> >
> > (the doc is here
> >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-127%3A+Support+Elasticsearch+Source+Connector
> > ),
> > I found a problem with SearchHits at
> >
> > *currentScrollWindowHits = searchResponse.getHits().getHits()*
> >
> > SearchHits is an interface in es5 but a class in es 6 and 7, and if I use
> > it in the flink es connector base (where the es dependency is 5), it will
> > throw this:
> > *Caused by: java.lang.IncompatibleClassChangeError: Found class
> > org.elasticsearch.search.SearchHits, but interface was expected.*
> >
> > To fix it, we can do it one of these ways:
> >
> > 1) Move the logic to ApiCallBridge, e.g. define an ElasticsearchResponse
> > or a Tuple2, but it will make the code awkward:
> >
> > class ElasticsearchResponse
> > {
> >     String scroll;
> >     // every es connector converts its own
> >     // searchResponse.getHits().getHits() into this result
> >     String[] result;
> > }
> >
> > If a user wants to add something, they will need to modify this.
> >
> > 2) Just support es 6 and 7: upgrade the flink-es-connector-base es
> > dependency version to 6 and drop flink-es-connector-5. I found the earlier
> > discussion of dropping es connectors 2 and 5 here:
> >
> > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/DISCUSS-Drop-connectors-for-Elasticsearch-2-x-and-5-x-td32662.html
> >
> > The es5 connector only supports the DataStream API currently. Is it
> > possible to drop the es5 connector and upgrade es-connector-base to es6?
> >
> > I am looking forward to all your responses.
> > Best!
> >
> >
> >
> > --
> > Sent from:
> > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
> >
>
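To make option 1 above concrete, the version-agnostic response type from the pseudocode could look roughly like this sketch (the names are illustrative; each version-specific ApiCallBridge implementation would perform the searchResponse.getHits().getHits() conversion itself):

{code:java}
import java.util.Collections;
import java.util.List;

/**
 * Version-agnostic view of an Elasticsearch scroll response, so that the
 * connector base module never references SearchHits directly (an interface
 * in ES 5 but a class in ES 6/7, which triggers the
 * IncompatibleClassChangeError quoted above). Illustrative sketch only.
 */
public class ElasticsearchResponse {

    /** Scroll id used to request the next window of results. */
    private final String scrollId;

    /** Hits of the current scroll window, converted to source strings. */
    private final List<String> hits;

    public ElasticsearchResponse(String scrollId, List<String> hits) {
        this.scrollId = scrollId;
        this.hits = Collections.unmodifiableList(hits);
    }

    public String getScrollId() {
        return scrollId;
    }

    public List<String> getHits() {
        return hits;
    }
}
{code}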


Re: [VOTE] Release 1.11.0, release candidate #4

2020-07-02 Thread Robert Metzger
+1 (binding)

Checks:
- source archive compiles
- checked artifacts in staging repo
  - flink-azure-fs-hadoop-1.11.0.jar seems to have a correct NOTICE file
  - versions in pom seem correct
  - checked some other jars
- deployed Flink on YARN on Azure HDInsight (which uses Hadoop 3.1.1)
  - Reported some tiny log sanity issue:
https://issues.apache.org/jira/browse/FLINK-18474
  - Wordcount against HDFS works


On Thu, Jul 2, 2020 at 7:07 PM Thomas Weise  wrote:

> Hi Zhijiang,
>
> The performance degradation manifests in backpressure which leads to
> growing backlog in the source. I switched a few times between 1.10 and 1.11
> and the behavior is consistent.
>
> The DAG is:
>
> KinesisConsumer -> (Flat Map, Flat Map, Flat Map)    forward
> -> KinesisProducer
>
> Parallelism: 160
> No shuffle/rebalance.
>
> Checkpointing config:
>
> Checkpointing Mode: Exactly Once
> Interval: 10s
> Timeout: 10m 0s
> Minimum Pause Between Checkpoints: 10s
> Maximum Concurrent Checkpoints: 1
> Persist Checkpoints Externally: Enabled (delete on cancellation)
>
> State backend: rocksdb  (filesystem leads to same symptoms)
> Checkpoint size is tiny (500KB)
>
> An interesting difference to another job that I had upgraded successfully
> is the low checkpointing interval.
>
> Thanks,
> Thomas
>
>
> On Wed, Jul 1, 2020 at 9:02 PM Zhijiang  .invalid>
> wrote:
>
> > Hi Thomas,
> >
> > Thanks for the efficient feedback.
> >
> > Regarding the suggestion of adding the release notes document, I agree
> > with your point. Maybe we should adjust the vote template accordingly in
> > the respective wiki to guide the following release processes.
> >
> > Regarding the performance regression, could you provide some more details
> > so we can measure or reproduce it on our side?
> > E.g. I guess the topology only includes two vertexes source and sink?
> > What is the parallelism for every vertex?
> > The upstream shuffles data to the downstream via rebalance partitioner or
> > other?
> > The checkpoint mode is exactly-once with rocksDB state backend?
> > The backpressure happened in this case?
> > How much percentage regression in this case?
> >
> > Best,
> > Zhijiang
> >
> >
> >
> > --
> > From:Thomas Weise 
> > Send Time:2020年7月2日(星期四) 09:54
> > To:dev 
> > Subject:Re: [VOTE] Release 1.11.0, release candidate #4
> >
> > Hi Till,
> >
> > Yes, we don't have the setting in flink-conf.yaml.
> >
> > Generally, we carry forward the existing configuration and any change to
> > default configuration values would impact the upgrade.
> >
> > Yes, since it is an incompatible change I would state it in the release
> > notes.
> >
> > Thanks,
> > Thomas
> >
> > BTW I found a performance regression while trying to upgrade another
> > pipeline with this RC. It is a simple Kinesis to Kinesis job. Wasn't able
> > to pin it down yet, symptoms include increased checkpoint alignment time.
> >
> > On Wed, Jul 1, 2020 at 12:04 AM Till Rohrmann 
> > wrote:
> >
> > > Hi Thomas,
> > >
> > > just to confirm: When starting the image in local mode, then you don't
> > have
> > > any of the JobManager memory configuration settings configured in the
> > > effective flink-conf.yaml, right? Does this mean that you have
> explicitly
> > > removed `jobmanager.heap.size: 1024m` from the default configuration?
> If
> > > this is the case, then I believe it was more of an unintentional
> artifact
> > > that it worked before and it has been corrected now so that one needs
> to
> > > specify the memory of the JM process explicitly. Do you think it would
> > help
> > > to explicitly state this in the release notes?
> > >
> > > Cheers,
> > > Till
> > >
> > > On Wed, Jul 1, 2020 at 7:01 AM Thomas Weise  wrote:
> > >
> > > > Thanks for preparing another RC!
> > > >
> > > > As mentioned in the previous RC thread, it would be super helpful if
> > the
> > > > release notes that are part of the documentation can be included [1].
> > > It's
> > > > a significant time-saver to have read those first.
> > > >
> > > > I found one more non-backward compatible change that would be worth
> > > > addressing/mentioning:
> > > >
> > > > It is now necessary to configure the jobmanager heap size in
> > > > flink-conf.yaml (with either jobmanager.heap.size
> > > > or jobmanager.memory.heap.size). Why would I not want to do that
> > anyways?
> > > > Well, we set it dynamically for a cluster deployment via the
> > > > flinkk8soperator, but the container image can also be used for
> testing
> > > with
> > > > local mode (./bin/jobmanager.sh start-foreground local). That will
> fail
> > > if
> > > > the heap wasn't configured and that's how I noticed it.
> > > >
> > > > Thanks,
> > > > Thomas
> > > >
> > > > [1]
> > > >
> > > >
> > >
> >
> https://ci.apache.org/projects/flink/flink-docs-release-1.11/release-notes/flink-1.11.html
> > > >
> > > > On Tue, Jun 30, 2020 at 3:18 AM Zhijiang  > > > .invalid>
> > > > wrote:
> > > >
> 
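For anyone trying to reproduce Thomas's setup, his checkpoint settings translate roughly into the following configuration sketch (the checkpoint URI is a placeholder; the other values follow his report):

{code:java}
import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.CheckpointConfig;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointSetupExample {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(160);

        // Exactly-once checkpoints every 10 seconds.
        env.enableCheckpointing(10_000L, CheckpointingMode.EXACTLY_ONCE);

        CheckpointConfig config = env.getCheckpointConfig();
        config.setCheckpointTimeout(10 * 60 * 1000L);   // timeout: 10m
        config.setMinPauseBetweenCheckpoints(10_000L);  // min pause: 10s
        config.setMaxConcurrentCheckpoints(1);
        config.enableExternalizedCheckpoints(
                CheckpointConfig.ExternalizedCheckpointCleanup.DELETE_ON_CANCELLATION);

        // RocksDB state backend; the checkpoint path is a placeholder.
        env.setStateBackend(new RocksDBStateBackend("s3://bucket/checkpoints"));
    }
}
{code}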

[jira] [Created] (FLINK-18474) Flink on YARN produces WARN log statements about the configuration

2020-07-02 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-18474:
--

 Summary: Flink on YARN produces WARN log statements about the 
configuration
 Key: FLINK-18474
 URL: https://issues.apache.org/jira/browse/FLINK-18474
 Project: Flink
  Issue Type: Bug
  Components: Deployment / YARN
Affects Versions: 1.11.0
Reporter: Robert Metzger


While testing the 1.11 release, I saw the following log statements, which might 
confuse new users:

(JM)
{code}
2020-07-02 17:35:33,123 WARN  org.apache.flink.configuration.Configuration  
   [] - Config uses deprecated configuration key 'web.port' instead of 
proper key 'rest.bind-port'
{code}

(TM)
{code}
2020-07-02 17:43:31,574 WARN  
org.apache.flink.configuration.GlobalConfiguration   [] - Error while 
trying to split key and value in configuration file ./flink-conf.yaml:16: 
"pipeline.classpaths: "
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] Release 1.11.0, release candidate #4

2020-07-02 Thread Robert Metzger
Issues found:
-
https://repository.apache.org/content/repositories/orgapacheflink-1377/org/apache/flink/flink-runtime_2.12/1.11.0/flink-runtime_2.12-1.11.0.jar
./META-INF/NOTICE lists "org.uncommons.maths:uncommons-maths:1.2.2a" as a
bundled dependency. However, it seems they are not bundled. I'm waiting
with my vote until we've discussed this issue. I'm leaning towards
continuing the release vote (
https://issues.apache.org/jira/browse/FLINK-18471).

Checks:
- source archive compiles
- checked artifacts in staging repo
  - flink-azure-fs-hadoop-1.11.0.jar seems to have a correct NOTICE file
  - versions in pom seem correct
  - checked some other jars
- ... I will continue later ...

On Thu, Jul 2, 2020 at 3:47 PM Stephan Ewen  wrote:

> +1 (binding) from my side
>
>   - legal files (license, notice) looks correct
>   - no binaries in the release
>   - ran examples from command line
>   - ran some examples from web ui
>   - log files look sane
>   - RocksDB, incremental checkpoints, savepoints, moving savepoints
> all works as expected.
>
> There are some friction points, which have also been mentioned. However, I
> am not sure they need to block the release.
>   - Some batch examples in the web UI have not been working in 1.10. We
> should fix that asap, because it impacts the "getting started" experience,
> but I personally don't vote against the release based on that
>   - Same for the CDC bug. It is unfortunate, but I would not hold the
> release at such a late stage for one special issue in a new connector.
> Let's work on a timely 1.11.1.
>
>
> I would withdraw my vote if we find a fundamental issue in the network
> system causing the increased checkpoint delays and the job regression
> Thomas mentioned.
> Such a core bug would be a deal-breaker for a large fraction of users.
>
>
>
>
> On Thu, Jul 2, 2020 at 11:35 AM Zhijiang  .invalid>
> wrote:
>
> > I also agree with Till and Robert's proposals.
> >
> > In general I think we should not block the release based on current
> > estimation. Otherwise we continuously postpone the release, new blocker
> > bugs will probably occur, and then we might get stuck in a cycle where we
> > never deliver a final release to users in time. But
> > that does not mean RC4 would be the final one, and we can reevaluate the
> > effects in progress with the accumulated issues.
> >
> > Regarding the performance regression, if possible we can reproduce to
> > analysis the reason based on Thomas's feedback, then we can evaluate its
> > effect.
> >
> > Regarding FLINK-18461, after syncing with Jark offline, the bug would
> > affect one of three scenarios for using the CDC feature, and this affected
> > scenario is actually the most commonly used way by users.
> > My suggestion is to merge it into release-1.11 ATM since the PR already
> > open for review, then let's further finalize the conclusion later. If
> this
> > issue is the only one after RC4 going through, then another option is to
> > cover it in next release-1.11.1 as Robert suggested, as we can prepare
> for
> > the next minor release soon. If there are other blockers issues during
> > voting and necessary to be resolved soon, then it is no doubt to cover
> all
> > of them in next RC5.
> >
> > Best,
> > Zhijiang
> >
> >
> > --
> > From:Till Rohrmann 
> > Send Time:2020年7月2日(星期四) 16:46
> > To:dev 
> > Cc:Zhijiang 
> > Subject:Re: [VOTE] Release 1.11.0, release candidate #4
> >
> > I agree with Robert.
> >
> > @Chesnay: The problem has probably already existed in Flink 1.10 and
> > before because we cannot run jobs with eager execution calls from the web
> > ui. I agree with Robert that we can/should improve our documentation in
> > this regard, though.
> >
> > @Thomas:
> > 1. I will update the release notes to add a short section describing that
> > one needs to configure the JobManager memory.
> > 2. Concerning the performance regression we should look into it. I
> believe
> > Zhijiang is very eager to learn more about your exact setup to further
> > debug it. Again I agree with Robert to not block the release on it at the
> > moment.
> >
> > @Jark: How much of a problem is FLINK-18461? Will it make the CDC feature
> > completely unusable or will only make a subset of the use cases to not
> > work? If it is the latter, then I believe that we can document the
> > limitations and try to fix it asap. Depending on the remaining testing
> the
> > fix might make it into the 1.11.

[jira] [Created] (FLINK-18471) flink-runtime lists "org.uncommons.maths:uncommons-maths:1.2.2a" as a bundled dependency, but it isn't

2020-07-02 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-18471:
--

 Summary: flink-runtime lists 
"org.uncommons.maths:uncommons-maths:1.2.2a" as a bundled dependency, but it 
isn't
 Key: FLINK-18471
 URL: https://issues.apache.org/jira/browse/FLINK-18471
 Project: Flink
  Issue Type: Bug
  Components: Build System
Affects Versions: 1.11.0
Reporter: Robert Metzger


This is the relevant section in the NOTICE file

{code}
This project bundles the following dependencies under the Apache Software 
License 2.0. (http://www.apache.org/licenses/LICENSE-2.0.txt)

- com.typesafe.akka:akka-remote_2.11:2.5.21
- io.netty:netty:3.10.6.Final
- org.uncommons.maths:uncommons-maths:1.2.2a
{code}

The uncommons-maths dependency is not declared anywhere, nor is it included in 
the binary file.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18468) TaskExecutorITCase.testJobReExecutionAfterTaskExecutorTermination fails with

2020-07-02 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-18468:
--

 Summary: 
TaskExecutorITCase.testJobReExecutionAfterTaskExecutorTermination fails with 
 Key: FLINK-18468
 URL: https://issues.apache.org/jira/browse/FLINK-18468
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination, Tests
Affects Versions: 1.12.0
Reporter: Robert Metzger


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=4149&view=logs&j=3b6ec2fd-a816-5e75-c775-06fb87cb6670&t=2aff8966-346f-518f-e6ce-de64002a5034
{code}
[ERROR] 
testJobReExecutionAfterTaskExecutorTermination(org.apache.flink.runtime.taskexecutor.TaskExecutorITCase)
  Time elapsed: 1.222 s  <<< ERROR!
java.util.concurrent.ExecutionException: 
org.apache.flink.runtime.client.DuplicateJobSubmissionException: Job has 
already been submitted.
at 
java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
at 
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
at 
org.apache.flink.runtime.taskexecutor.TaskExecutorITCase.testJobReExecutionAfterTaskExecutorTermination(TaskExecutorITCase.java:108)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] Release 1.11.0, release candidate #4

2020-07-02 Thread Robert Metzger
Thanks a lot for the thorough testing Thomas! This is really helpful!

@Chesnay: I would not block the release on this. The web submission does
not seem to be the documented / preferred way of job submission. It is
unlikely to harm the beginner's experience (and they would not read the
release notes anyway). I mention the beginner experience because they are the
primary audience of the examples.

Regarding FLINK-18461 / Jark's issue: I would not block the release on
that, but still try to get it fixed asap. It is likely that this RC doesn't
go through (given the rate at which we are finding issues), and even if it
goes through, we can document it as a known issue in the release
announcement and immediately release 1.11.1.
Blocking the release on this causes quite a bit of work for the release
managers for rolling a new RC. Until we have understood the performance
regression Thomas is reporting, I would keep this RC open, and keep testing.


On Thu, Jul 2, 2020 at 8:34 AM Jark Wu  wrote:

> Hi,
>
> I'm very sorry but we just found a blocker issue FLINK-18461 [1] in the new
> feature of changelog source (CDC).
> This bug means that queries on a changelog source can’t be inserted into an
> upsert sink (e.g. ES, JDBC, HBase),
> which is a common case in production. CDC is one of the important features
> of Table/SQL in this release,
> so from my side, I hope we can have this fix in 1.11.0, otherwise, this is
> a broken feature...
>
> Again, I am terribly sorry for delaying the release...
>
> Best,
> Jark
>
> [1]: https://issues.apache.org/jira/browse/FLINK-18461
>
> On Thu, 2 Jul 2020 at 12:02, Zhijiang 
> wrote:
>
> > Hi Thomas,
> >
> > Thanks for the efficient feedback.
> >
> > Regarding the suggestion of adding the release notes document, I agree
> > with your point. Maybe we should adjust the vote template accordingly in
> > the respective wiki to guide the following release processes.
> >
> > Regarding the performance regression, could you provide some more details
> > so we can measure or reproduce it on our side?
> > E.g. I guess the topology only includes two vertexes source and sink?
> > What is the parallelism for every vertex?
> > The upstream shuffles data to the downstream via rebalance partitioner or
> > other?
> > The checkpoint mode is exactly-once with rocksDB state backend?
> > The backpressure happened in this case?
> > How much percentage regression in this case?
> >
> > Best,
> > Zhijiang
> >
> >
> >
> > --
> > From:Thomas Weise 
> > Send Time:2020年7月2日(星期四) 09:54
> > To:dev 
> > Subject:Re: [VOTE] Release 1.11.0, release candidate #4
> >
> > Hi Till,
> >
> > Yes, we don't have the setting in flink-conf.yaml.
> >
> > Generally, we carry forward the existing configuration and any change to
> > default configuration values would impact the upgrade.
> >
> > Yes, since it is an incompatible change I would state it in the release
> > notes.
> >
> > Thanks,
> > Thomas
> >
> > BTW I found a performance regression while trying to upgrade another
> > pipeline with this RC. It is a simple Kinesis to Kinesis job. Wasn't able
> > to pin it down yet, symptoms include increased checkpoint alignment time.
> >
> > On Wed, Jul 1, 2020 at 12:04 AM Till Rohrmann 
> > wrote:
> >
> > > Hi Thomas,
> > >
> > > just to confirm: When starting the image in local mode, then you don't
> > have
> > > any of the JobManager memory configuration settings configured in the
> > > effective flink-conf.yaml, right? Does this mean that you have
> explicitly
> > > removed `jobmanager.heap.size: 1024m` from the default configuration?
> If
> > > this is the case, then I believe it was more of an unintentional
> artifact
> > > that it worked before and it has been corrected now so that one needs
> to
> > > specify the memory of the JM process explicitly. Do you think it would
> > help
> > > to explicitly state this in the release notes?
> > >
> > > Cheers,
> > > Till
> > >
> > > On Wed, Jul 1, 2020 at 7:01 AM Thomas Weise  wrote:
> > >
> > > > Thanks for preparing another RC!
> > > >
> > > > As mentioned in the previous RC thread, it would be super helpful if
> > the
> > > > release notes that are part of the documentation can be included [1].
> > > It's
> > > > a significant time-saver to have read those first.
> > > >
> > > > I found one more non-backward compatible change that would be worth
> > > > addressing/mentioning:
> > > >
> > > > It is now necessary to configure the jobmanager heap size in
> > > > flink-conf.yaml (with either jobmanager.heap.size
> > > > or jobmanager.memory.heap.size). Why would I not want to do that
> > anyways?
> > > > Well, we set it dynamically for a cluster deployment via the
> > > > flinkk8soperator, but the container image can also be used for
> testing
> > > with
> > > > local mode (./bin/jobmanager.sh start-foreground local). That will
> fail
> > > if
> > > > the heap wasn't configured and that's how I noticed it.
> > 

Re: [ANNOUNCE] Apache Flink 1.11.0, release candidate #2

2020-06-18 Thread Robert Metzger
Thanks a lot for creating another RC.

The good news first: WordCount works on my Mac :)

I didn't find this information in the email: This is the commit the RC is
based on:
https://github.com/apache/flink/commit/c4132de4a50ab9b8f653c69af1ba15af44ff29a2

Additionally, I've created [1] a docker image based on the
"flink-1.11.0-bin-scala_2.12.tgz" binary release candidate. It's on DockerHub:
"rmetzger/flink:1.11.0-rc2-c4132de4a50ab9b8f653c69af1ba15af44ff29a2".

Happy testing :)

[1]
https://github.com/rmetzger/flink-docker-factory/runs/785600229?check_suite_focus=true

On Thu, Jun 18, 2020 at 2:41 PM Piotr Nowojski  wrote:

> Hi all,
>
> Apache Flink-1.11.0-RC2 has been created. It has all the artifacts that we
> would typically have for a release.
>
> This RC might still have a couple of missing licensing notices (like the
> one mentioned by Jingsong Li), but as for today morning, there were no open
> release blocking bugs. No official vote will take place for it, but you can
> treat it as a solid base for release testing. It includes the following:
>
>   * The preview source release and binary convenience releases [1], which
> are signed with the key with fingerprint
> 2DA85B93244FDFA19A6244500653C0A2CEA00D0E [2],
>   * All artifacts that would normally be deployed to the Maven Central
> Repository [3]
>
> To test with these artifacts, you can create a settings.xml file with the
> content shown below [4]. This settings file can be referenced in your maven
> commands
> via --settings /path/to/settings.xml. This is useful for creating a
> quickstart project based on the staged release and also for building
> against the staged jars.
>
> Happy testing!
>
> Best,
> Piotrek
>
> [1] https://dist.apache.org/repos/dist/dev/flink/flink-1.11.0-rc2/
> [2] https://dist.apache.org/repos/dist/release/flink/KEYS
> [3]
> https://repository.apache.org/content/repositories/orgapacheflink-1374/
> [4]
> <settings>
>   <activeProfiles>
>     <activeProfile>flink-1.11.0</activeProfile>
>   </activeProfiles>
>   <profiles>
>     <profile>
>       <id>flink-1.11.0</id>
>       <repositories>
>         <repository>
>           <id>flink-1.11.0</id>
>           <url>https://repository.apache.org/content/repositories/orgapacheflink-1374/</url>
>         </repository>
>         <repository>
>           <id>archetype</id>
>           <url>https://repository.apache.org/content/repositories/orgapacheflink-1374/</url>
>         </repository>
>       </repositories>
>     </profile>
>   </profiles>
> </settings>
>
On Wed, 17 Jun 2020 at 15:29, Piotr Nowojski wrote:
>
> > Hi all,
> >
> > I would like to give an update about the RC2 status. We are now waiting
> > for a green azure build on one final bug fix before creating RC2. This
> bug
> > fix should be merged late afternoon/early evening Berlin time, so RC2
> will
> > be hopefully created tomorrow morning. Until then I would ask to not
> > merge/backport commits to release-1.11 branch, including bug fixes. If
> you
> > have something that's truly essential and should be treated as a release
> > blocker, please reach out to me or Zhijiang.
> >
> > Best,
> > Piotr Nowojski
> >
>


[jira] [Created] (FLINK-18370) Test Flink on Azure-hosted VMs nightly

2020-06-18 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-18370:
--

 Summary: Test Flink on Azure-hosted VMs nightly
 Key: FLINK-18370
 URL: https://issues.apache.org/jira/browse/FLINK-18370
 Project: Flink
  Issue Type: Improvement
  Components: Build System / Azure Pipelines
Reporter: Robert Metzger
Assignee: Robert Metzger


There are some test failures which happen a lot more frequently on the VMs 
provided by Azure (instead of the CI infrastructure hosted in Alibaba Cloud).

Since we have enough CI resources available (at night), we can add a run on the 
Azure machines to get more visibility into those failures, and to increase the 
stability of personal account CI runs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18368) HadoopRecoverableWriterOldHadoopWithNoTruncateSupportTest.createHDFS fails with "Running in secure mode, but config doesn't have a keytab"

2020-06-18 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-18368:
--

 Summary: 
HadoopRecoverableWriterOldHadoopWithNoTruncateSupportTest.createHDFS fails with 
"Running in secure mode, but config doesn't have a keytab" 
 Key: FLINK-18368
 URL: https://issues.apache.org/jira/browse/FLINK-18368
 Project: Flink
  Issue Type: Bug
  Components: FileSystems, Tests
Affects Versions: 1.12.0
Reporter: Robert Metzger


https://dev.azure.com/rmetzger/Flink/_build/results?buildId=8184&view=logs&j=66592496-52df-56bb-d03e-37509e1d9d0f&t=ae0269db-6796-5583-2e5f-d84757d711aa



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18366) Track E2E test durations centrally

2020-06-18 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-18366:
--

 Summary: Track E2E test durations centrally
 Key: FLINK-18366
 URL: https://issues.apache.org/jira/browse/FLINK-18366
 Project: Flink
  Issue Type: Improvement
  Components: Build System / CI
Reporter: Robert Metzger


Every now and then, our E2E tests start timing out (see FLINK-16795), because 
they hit the currently configured time-limit.
To better understand what the expected E2E time is, and to spot potential 
performance regressions, we should track the test execution durations centrally.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18358) TableEnvironmentITCase.testSqlUpdateAndToDataSetWithTableSource:245 unstable: result mismatch

2020-06-18 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-18358:
--

 Summary:  
TableEnvironmentITCase.testSqlUpdateAndToDataSetWithTableSource:245 unstable: 
result mismatch
 Key: FLINK-18358
 URL: https://issues.apache.org/jira/browse/FLINK-18358
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Planner
Affects Versions: 1.12.0
Reporter: Robert Metzger


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=3749&view=logs&j=b2f046ab-ae17-5406-acdc-240be7e870e4&t=93e5ae06-d194-513d-ba8d-150ef6da1d7c

{code}
[ERROR]   TableEnvironmentITCase.testSqlUpdateAndToDataSetWithTableSource:245 
expected:<8,24.95375[]
> but was:<8,24.95375[03]
>

{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18351) ModuleManager creates a lot of duplicate/similar log messages

2020-06-17 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-18351:
--

 Summary: ModuleManager creates a lot of duplicate/similar log 
messages
 Key: FLINK-18351
 URL: https://issues.apache.org/jira/browse/FLINK-18351
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / API
Affects Versions: 1.12.0
Reporter: Robert Metzger


This is a follow-up to FLINK-17977: 
{code}
2020-06-03 15:02:11,982 INFO  org.apache.flink.table.module.ModuleManager   
   [] - Got FunctionDefinition 'as' from 'core' module.
2020-06-03 15:02:11,988 INFO  org.apache.flink.table.module.ModuleManager   
   [] - Got FunctionDefinition 'sum' from 'core' module.
2020-06-03 15:02:12,139 INFO  org.apache.flink.table.module.ModuleManager   
   [] - Got FunctionDefinition 'as' from 'core' module.
2020-06-03 15:02:12,159 INFO  org.apache.flink.table.module.ModuleManager   
   [] - Got FunctionDefinition 'equals' from 'core' module.
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
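A possible mitigation, sketched here hypothetically rather than taken from the actual ModuleManager code, is to log each function/module pair at INFO only once and at DEBUG afterwards:

{code:java}
import java.util.HashSet;
import java.util.Set;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/** Hypothetical helper that suppresses repeated "Got FunctionDefinition" log lines. */
public class DedupLogger {

    private static final Logger LOG = LoggerFactory.getLogger(DedupLogger.class);

    // Remembers which (function, module) pairs have already been logged.
    private final Set<String> alreadyLogged = new HashSet<>();

    public void logFunctionLookup(String function, String module) {
        if (alreadyLogged.add(function + '@' + module)) {
            LOG.info("Got FunctionDefinition '{}' from '{}' module.", function, module);
        } else {
            LOG.debug("Got FunctionDefinition '{}' from '{}' module.", function, module);
        }
    }
}
{code}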


Re: [DISCUSS] Re-renaming "Flink Master" back to JobManager

2020-06-17 Thread Robert Metzger
Thanks a lot for looking into this!

+1 to your proposal

On Wed, Jun 17, 2020 at 10:55 AM David Anderson 
wrote:

> Aljoscha,
>
> I think this is a step in the right direction.
>
> In some cases it may be difficult to talk concretely about the
> differences between different deployment models (e.g., comparing a k8s
> per-job cluster to a YARN-based session cluster, which is something I
> typically present during training) without giving names to the internal
> components. I'm not convinced we can completely avoid mentioning the
> JobMaster (and Dispatcher and ResourceManagers) in some (rare) contexts --
> but I don't see this as an argument against the proposed change.
>
> David
>
> On Mon, Jun 15, 2020 at 2:32 PM Konstantin Knauf 
> wrote:
>
> > Hi Aljoscha,
> >
> > sounds good to me. Let’s also make sure we don’t refer to the JobMaster
> as
> > Jobmanager anywhere then (code, config).
> >
> > I am not sure we can avoid mentioning the Flink ResourceManagers in user
> > facing docs completely. For JobMaster and Dispatcher this seems doable.
> >
> > Best,
> >
> > Konstantin
> >
> > On Mon 15. Jun 2020 at 12:56, Aljoscha Krettek 
> > wrote:
> >
> > > Hi All,
> > >
> > > This came to my mind because of the master/slave discussion in [1] and
> > > the larger discussions about inequality/civil rights happening right
> now
> > > in the world. I think for this reason alone we should use a name that
> > > does not include "master".
> > >
> > > We could rename it back to JobManager, which was the name mostly used
> > > before 2019. Since the beginning of Flink, TaskManager was the term
> used
> > > for the worker component/node and JobManager was the term used for the
> > > orchestrating component/node.
> > >
> > > Currently our glossary [2] defines these terms (paraphrased by me):
> > >
> > >   - "Flink Master": it's the orchestrating component that consists of
> > > resource manager, dispatcher, and JobManager
> > >
> > >   - JobManager: it's the thing that manages a single job and runs as
> > > part of a "Flink Master"
> > >
> > >   - TaskManager: it's the worker process
> > >
> > > Prior to the introduction of the glossary the definition of JobManager
> > > would have been:
> > >
> > >   - It's the orchestrating component that manages execution of jobs and
> > > schedules work on TaskManagers.
> > >
> > > Quite some parts in the code and documentation/configuration options
> > > still use that older meaning of JobManager. Newer parts of the
> > > documentation use "Flink Master" instead.
> > >
> > > I'm proposing to go back to calling the orchestrating component
> > > JobManager, which would mean that we have to touch up the documentation
> > > to remove mentions of "Flink Master". I'm also proposing not to mention
> > > the internal components such as resource manager and dispatcher in the
> > > glossary because they are transparent to users.
> > >
> > > I'm proposing to go back to JobManager instead of an alternative name
> > > also because switching to yet another name would mean many more changes
> > > to code/documentation/peoples minds.
> > >
> > > What do you all think?
> > >
> > > Best,
> > > Aljoscha
> > >
> > >
> > > [1] https://issues.apache.org/jira/browse/FLINK-18209
> > > [2]
> > >
> > >
> >
> https://ci.apache.org/projects/flink/flink-docs-master/concepts/glossary.html
> > >
> > --
> >
> > Konstantin Knauf
> >
> > https://twitter.com/snntrable
> >
> > https://github.com/knaufk
> >
>


[jira] [Created] (FLINK-18321) AbstractCloseableRegistryTest.testClose unstable

2020-06-16 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-18321:
--

 Summary: AbstractCloseableRegistryTest.testClose unstable
 Key: FLINK-18321
 URL: https://issues.apache.org/jira/browse/FLINK-18321
 Project: Flink
  Issue Type: Bug
  Components: FileSystems, Tests
Affects Versions: 1.12.0
Reporter: Robert Metzger


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=3553&view=logs&j=0da23115-68bb-5dcd-192c-bd4c8adebde1&t=05b74a19-4ee4-5036-c46f-ada307df6cf0

{code}
java.lang.AssertionError: expected:<0> but was:<-1>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:645)
at org.junit.Assert.assertEquals(Assert.java:631)
at 
org.apache.flink.core.fs.AbstractCloseableRegistryTest.testClose(AbstractCloseableRegistryTest.java:93)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Moving in and Modify Dependency Source

2020-06-15 Thread Robert Metzger
Hi Austin,
Thanks for working on the RMQ connector! There seem to be a few users
affected by that issue.

The GitHub page confirms that users can choose from the three licenses:
https://github.com/rabbitmq/rabbitmq-java-client#license:

> This means that the user can consider the library to be licensed under any
> of the licenses from the list above. For example, you may choose the
> Apache Public License 2.0 and include this client into a commercial
> product. Projects that are licensed under the GPLv2 may choose GPLv2, and
> so on.


Best,
Robert

On Mon, Jun 15, 2020 at 8:59 AM Till Rohrmann  wrote:

> Hi Austin,
>
> usually if source code is multi-licensed then this means that the user can
> choose the license under which he wants to use it. In our case it would be
> the Apache License version 2. But you should check the license text to make
> sure that this has not been forbidden explicitly.
>
> When copying code from another project, the practice is to annotate it with
> a comment stating from where the code was obtained. So in your case you
> would give these files the ASL license header and add a comment to the
> source code from where it was copied.
>
> Cheers,
> Till
>
> On Sat, Jun 13, 2020 at 10:41 PM Austin Cawley-Edwards <
> austin.caw...@gmail.com> wrote:
>
> > Hi all,
> >
> > I'm working on [FLINK-10195] on the RabbitMQ connector which involves
> > modifying some of the RMQ client source code (that has been moved out of
> > that package) and bringing it into Flink. The RMQ client code is
> > triple-licensed under Mozilla Public License 1.1 ("MPL"), the GNU General
> > Public License version 2 ("GPL"), and the Apache License version 2
> ("ASL").
> >
> > Does anyone have experience doing something similar/ what I would need to
> > do in terms of the license headers in the Flink source files?
> >
> > Thank you,
> > Austin
> >
> > [FLINK-10195]: https://issues.apache.org/jira/browse/FLINK-10195
> >
>


Re: request create flip permission for flink es bounded source/lookup source connector

2020-06-15 Thread Robert Metzger
Ah, sorry. The permission system of Confluence is a bit annoying.
I gave you "Add Attachment" permissions.

On Mon, Jun 15, 2020 at 8:57 AM Jacky Lau  wrote:

> Hi Robert:
>  When I edit the FLIP and upload the images, it shows a prompt
> message like this: "You'll need to ask permission to insert files here".
> Could you also give me the permission for uploading images to the FLIP
> wiki?
>
> Robert Metzger wrote
> > Hi,
> > I gave you access to the Wiki!
> >
> > On Fri, Jun 12, 2020 at 11:50 AM Jacky Lau 
>
> > liuyongvs@
>
> >  wrote:
> >
> >> Hi Jack:
> >>Thank you so much. My wiki name is jackylau
> >>
> >>
> >> Jark Wu-2 wrote
> >> > Hi Jacky,
> >> >
> >> > What's your username in wiki? So that I can give the permission to
> you.
> >> >
> >> > Best,
> >> > Jark
> >> >
> >> > On Fri, 12 Jun 2020 at 11:38, Jacky Lau 
> >>
> >> > liuyongvs@
> >>
> >> >  wrote:
> >> >
> >> >> hi all:
> >> >>After this simple discussion here
> >> >>
> >> >> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Discussion-flink-elasticsearch-connector-supports-td42082.html#a42106
> >> >> ,
> >> >>I should create FLIP-127 to track this. But I don't have
> >> >> create FLIP permission.
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Sent from:
> >> >> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
> >> >>
> >>
> >> --
> >> Sent from:
> >> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
> >>
>
>
>
>
>
> --
> Sent from: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
>


[jira] [Created] (FLINK-18291) Streaming File Sink s3 end-to-end test stalls

2020-06-15 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-18291:
--

 Summary: Streaming File Sink s3 end-to-end test stalls
 Key: FLINK-18291
 URL: https://issues.apache.org/jira/browse/FLINK-18291
 Project: Flink
  Issue Type: Bug
  Components: FileSystems, Tests
Affects Versions: 1.12.0
Reporter: Robert Metzger


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=3444&view=logs&j=5c8e7682-d68f-54d1-16a2-a09310218a49&t=f508e270-48d6-5f1e-3138-42a17e0714f0

{code}
2020-06-12T21:55:57.6277963Z Number of produced values 10870/6
2020-06-12T21:57:10.5467073Z Number of produced values 22960/6
2020-06-12T21:58:01.0025226Z Number of produced values 59650/6
2020-06-12T21:58:52.5624619Z Number of produced values 6/6
2020-06-12T21:58:53.2407133Z Cancelling job 9412dcb358631ab461a3a1e851417b9e.
2020-06-12T21:58:54.0819168Z Cancelled job 9412dcb358631ab461a3a1e851417b9e.
2020-06-12T21:58:54.1097745Z Waiting for job (9412dcb358631ab461a3a1e851417b9e) 
to reach terminal state CANCELED ...
2020-06-13T00:00:35.0502923Z ##[error]The operation was canceled.
2020-06-13T00:00:35.0522780Z ##[section]Finishing: Run e2e tests
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18290) Kafka011ProducerExactlyOnceITCase sometimes crashes with exit code 239

2020-06-14 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-18290:
--

 Summary: Kafka011ProducerExactlyOnceITCase sometimes crashes with 
exit code 239
 Key: FLINK-18290
 URL: https://issues.apache.org/jira/browse/FLINK-18290
 Project: Flink
  Issue Type: Task
  Components: Build System / Azure Pipelines, Connectors / Kafka, Tests
Affects Versions: 1.11.0
Reporter: Robert Metzger


[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=3467&view=logs&j=d44f43ce-542c-597d-bf94-b0718c71e5e8&t=34f486e1-e1e4-5dd2-9c06-bfdd9b9c74a8]

 
{code:java}
2020-06-15T03:24:28.4677649Z [WARNING] The requested profile "skip-webui-build" 
could not be activated because it does not exist.
2020-06-15T03:24:28.4692049Z [ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.22.1:test (integration-tests) 
on project flink-connector-kafka-0.11_2.11: There are test failures.
2020-06-15T03:24:28.4692585Z [ERROR] 
2020-06-15T03:24:28.4693170Z [ERROR] Please refer to 
/__w/2/s/flink-connectors/flink-connector-kafka-0.11/target/surefire-reports 
for the individual test results.
2020-06-15T03:24:28.4693928Z [ERROR] Please refer to dump files (if any exist) 
[date].dump, [date]-jvmRun[N].dump and [date].dumpstream.
2020-06-15T03:24:28.4694423Z [ERROR] ExecutionException The forked VM 
terminated without properly saying goodbye. VM crash or System.exit called?
2020-06-15T03:24:28.4696762Z [ERROR] Command was /bin/sh -c cd 
/__w/2/s/flink-connectors/flink-connector-kafka-0.11/target && 
/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xms256m -Xmx2048m 
-Dlog4j.configurationFile=log4j2-test.properties -Dmvn.forkNumber=2 
-XX:-UseGCOverheadLimit -jar 
/__w/2/s/flink-connectors/flink-connector-kafka-0.11/target/surefire/surefirebooter617700788970993266.jar
 /__w/2/s/flink-connectors/flink-connector-kafka-0.11/target/surefire 
2020-06-15T03-07-01_381-jvmRun2 surefire2676050245109796726tmp 
surefire_602825791089523551074tmp
2020-06-15T03:24:28.4698486Z [ERROR] Error occurred in starting fork, check 
output in log
2020-06-15T03:24:28.4699066Z [ERROR] Process Exit Code: 239
2020-06-15T03:24:28.4699458Z [ERROR] Crashed tests:
2020-06-15T03:24:28.4699960Z [ERROR] 
org.apache.flink.streaming.connectors.kafka.Kafka011ProducerExactlyOnceITCase
2020-06-15T03:24:28.4700849Z [ERROR] 
org.apache.maven.surefire.booter.SurefireBooterForkException: 
ExecutionException The forked VM terminated without properly saying goodbye. VM 
crash or System.exit called?
2020-06-15T03:24:28.4703760Z [ERROR] Command was /bin/sh -c cd 
/__w/2/s/flink-connectors/flink-connector-kafka-0.11/target && 
/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xms256m -Xmx2048m 
-Dlog4j.configurationFile=log4j2-test.properties -Dmvn.forkNumber=2 
-XX:-UseGCOverheadLimit -jar 
/__w/2/s/flink-connectors/flink-connector-kafka-0.11/target/surefire/surefirebooter617700788970993266.jar
 /__w/2/s/flink-connectors/flink-connector-kafka-0.11/target/surefire 
2020-06-15T03-07-01_381-jvmRun2 surefire2676050245109796726tmp 
surefire_602825791089523551074tmp
2020-06-15T03:24:28.4705501Z [ERROR] Error occurred in starting fork, check 
output in log
2020-06-15T03:24:28.4706297Z [ERROR] Process Exit Code: 239
2020-06-15T03:24:28.4706592Z [ERROR] Crashed tests:
2020-06-15T03:24:28.4706895Z [ERROR] 
org.apache.flink.streaming.connectors.kafka.Kafka011ProducerExactlyOnceITCase
2020-06-15T03:24:28.4707386Z [ERROR] at 
org.apache.maven.plugin.surefire.booterclient.ForkStarter.awaitResultsDone(ForkStarter.java:510)
2020-06-15T03:24:28.4708053Z [ERROR] at 
org.apache.maven.plugin.surefire.booterclient.ForkStarter.runSuitesForkPerTestSet(ForkStarter.java:457)
2020-06-15T03:24:28.4708908Z [ERROR] at 
org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:298)
2020-06-15T03:24:28.4709720Z [ERROR] at 
org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:246)
2020-06-15T03:24:28.4710497Z [ERROR] at 
org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeProvider(AbstractSurefireMojo.java:1183)
2020-06-15T03:24:28.4711448Z [ERROR] at 
org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeAfterPreconditionsChecked(AbstractSurefireMojo.java:1011)
2020-06-15T03:24:28.4712395Z [ERROR] at 
org.apache.maven.plugin.surefire.AbstractSurefireMojo.execute(AbstractSurefireMojo.java:857)
2020-06-15T03:24:28.4712997Z [ERROR] at 
org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:132)
2020-06-15T03:24:28.4713524Z [ERROR] at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208)
2020-06-15T03:24:28.4714079Z [ERROR] at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
2020-06-15T03:24:28.4714560Z [ERROR] at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
2020-06-1

Re: [DISCUSS] Add Japanese translation of the flink.apache.org website

2020-06-12 Thread Robert Metzger
for people who want to modify the
> > > > English
> > > > > > docs, but also for people who try to update the translated copies
> > > > > > accordingly. I'm currently working on updating the Chinese
> > > translations
> > > > > for
> > > > > > the memory configuration documents, and I found it very hard to
> > > > identify
> > > > > > the parts that need updates. The English docs are reorganized,
> > > contents
> > > > > are
> > > > > > moved across pages, and also small pieces of details are
> modified.
> > It
> > > > is
> > > > > > not always possible for people who work on the English docs to
> > locate
> > > > the
> > > > > > right place in the translations where the updated contents should
> > be
> > > > > > pasted.
> > > > > >
> > > > > > My two cents on the potential approach. We might label the
> > > translation
> > > > > with
> > > > > > the commit id of the original doc where they are in
> > synchronization,
> > > > and
> > > > > > automatically display a warning on the translation if an
> > out-of-sync
> > > is
> > > > > > detected.
> > > > > >
> > > > > > Thank you~
> > > > > >
> > > > > > Xintong Song
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Mon, Jun 8, 2020 at 4:30 PM Dawid Wysakowicz <
> > > > dwysakow...@apache.org>
> > > > > > wrote:
> > > > > >
> > > > > > > I don't have a well defined opinion on adding new language.
> > > However,
> > > > > one
> > > > > > > thing that I'd like to bring up to the attention on that
> occasion
> > > is
> > > > it
> > > > > > > is already quite cumbersome to update two versions of the docs.
> > > > > > > Especially when we add new sections or change smaller parts of
> an
> > > > > > > existing document. Right now if I add three sections in an
> > English
> > > > > > > version I have three additional places in the Chinese documents
> > > > where I
> > > > > > > need to paste that. With additional language it doubles, making
> > it
> > > 6
> > > > > > > places where I have to manually paste the added parts.
> > > > > > >
> > > > > > > I'd be way more welcoming for adding a new language if we had a
> > > > better
> > > > > > > tooling/process for synchronizing changes across different
> > > languages.
> > > > > Or
> > > > > > > if we agree the translations are best effort documents and we
> do
> > > not
> > > > > > > update them when changing the English documents.
> > > > > > >
> > > > > > > Best,
> > > > > > >
> > > > > > > Dawid
> > > > > > >
> > > > > > > On 08/06/2020 09:44, Robert Metzger wrote:
> > > > > > > > Hi all,
> > > > > > > >
> > > > > > > > we've received a pull request on flink-web.git for adding a
> > > > Japanese
> > > > > > > > translation of the Flink website:
> > > > > > > > https://github.com/apache/flink-web/pull/346
> > > > > > > >
> > > > > > > > Before we accept this PR, I believe we should have a
> discussion
> > > > about
> > > > > > it.
> > > > > > > >
> > > > > > > > *Relevance*: Looking at Google analytics, our users are
> coming
> > > from
> > > > > > China
> > > > > > > > (1.), US (2.), India (3.), Germany (4.), Japan (5.).
> > > > > > > > I'd say that is high enough to consider a Japanese
> translation.
> > > > > > > >
> > > > > > > > *Reviewing*: I'm not aware of any committer who speaks
> > Japanese.
> > > > How
> > > > > > > would
> > > > > > > > we review this?
> > > > > > > > The contributor is offering to open follow-up pull requests,
> > > which
> > > > I
> > > > > > > > believe is fine to bootstrap. Ideally we'll be able to
> attract
> > > more
> > > > > > > > contributors over time.
> > > > > > > >
> > > > > > > > In my opinion, if we find one or two reviewers, I would give
> > > this a
> > > > > > try.
> > > > > > > >
> > > > > > > > What do you think?
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Robert
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
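
To make Xintong's commit-id labeling proposal above concrete, here is a
minimal sketch. The `sync-commit` front-matter key, the docs path, and the
warning banner are all hypothetical; nothing like this exists in flink-web
today:

{code}
# Hypothetical front matter in a translated page, recording the commit of
# the English source it was last synchronized with:
#
#   ---
#   title: "..."
#   sync-commit: 3f1a9c2
#   ---
#
# A reviewer or CI job could then list every English change the
# translation has not yet picked up (the path is illustrative):
git log --oneline 3f1a9c2..HEAD -- docs/ops/memory/
# Non-empty output means the translation is out of sync; the site build
# could react by rendering a "this translation may be outdated" banner.
{code}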


Re: [DISCUSS] Update our EditorConfig file

2020-06-12 Thread Robert Metzger
It seems that nobody cares about the file: it has hardly been touched
since its creation in late 2015.

I would be fine adding the changes you are proposing to the codebase.


On Thu, Jun 11, 2020 at 4:47 AM tison  wrote:

> > is anyone actually using our .editorconfig file?
>
> I think IDEA already takes this file into consideration. So far it works
> well for me.
>
> Best,
> tison.
>
>
> Jingsong Li  于2020年6月11日周四 上午10:26写道:
>
> > +1 looks more friendly to Flink newbies.
> >
> > Best,
> > Jingsong Lee
> >
> > On Wed, Jun 10, 2020 at 8:38 PM Aljoscha Krettek 
> > wrote:
> >
> > > Hi,
> > >
> > > is anyone actually using our .editorconfig file? IntelliJ has a plugin
> > > for this that is actually quite powerful.
> > >
> > > I managed to write a .editorconfig file that I quite like:
> > > https://github.com/aljoscha/flink/commits/new-editorconfig. For me to
> > > use that, we would either need to update our Flink file to what I did
> > > there or remove the "root = true" part from the file to allow me to
> > > place my custom .editorconfig in the directory above.
> > >
> > > It's probably a lost cause to find consensus on what settings we should
> > > have in that file but it could be helpful if we all used the same
> > > settings. For what it's worth, this will format code in such a way that
> > > it pleases our (very lenient) checkstyle rules.
> > >
> > > What do you think?
> > >
> > > Best,
> > > Aljoscha
> > >
> >
> >
> > --
> > Best, Jingsong Lee
> >
>
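
For context on the "root = true" remark above: EditorConfig tooling
searches upward from an edited file and stops at the first .editorconfig
declaring `root = true`, which is why that line would have to go before a
personal .editorconfig in a parent directory could take effect. A minimal
sketch with illustrative settings (not the actual Flink file):

{code}
# .editorconfig -- "root = true" ends the upward search here, so any
# .editorconfig in a parent directory is ignored for this repository.
root = true

# Defaults for all files.
[*]
charset = utf-8
end_of_line = lf
insert_final_newline = true

# Per-pattern sections override the defaults above.
[*.java]
indent_style = tab
{code}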


Re: [DISCUSS] Semantics of our JIRA fields

2020-06-12 Thread Robert Metzger
I agree with you Till -- changing the definition of the priorities should
be a separate discussion.

Piotrek, do you agree with my "affects version" explanation? I would like
to bring this discussion to a conclusion.



On Tue, May 26, 2020 at 4:51 PM Till Rohrmann  wrote:

> If we change the meaning of the priority levels, then I would suggest to
> have a dedicated discussion for it. This would also be more visible than
> compared to being hidden in some lengthy discussion thread. I think the
> proposed definitions of priority levels differ slightly from how the
> community worked in the past.
>
> Cheers,
> Till
>
> On Tue, May 26, 2020 at 4:30 PM Robert Metzger 
> wrote:
>
> > Hi,
> >
> > 1. I'm okay with updating the definition of the priorities for the reason
> > you've mentioned.
> >
> > 2. "Affects version"
> >
> > The reason why I like to mark affects version on unreleased versions is to
> > clearly indicate which branch is affected by a bug. Given the current
> Flink
> > release status, if there's a bug only in "release-1.11", but not in
> > "master", there is no way of figuring that out, if we only allow released
> > versions for "affects version" (In my proposal, you would set "affects
> > version" to '1.11.0', '1.12.0' to indicate that).
> >
> > What we could do is introduce "1.12-SNAPSHOT" as version to mark issues
> on
> > unreleased versions. (But then people might accidentally set the "fix
> > version" to a "-SNAPSHOT" version.)
> >
> > I'm still in favor of my proposal. I have never heard a report from a
> > confused user about our Jira fields (I guess they usually check bugs for
> > released versions only)
> >
> >
> > On Tue, May 26, 2020 at 12:39 PM Piotr Nowojski 
> > wrote:
> >
> > > Hi,
> > >
> > > Sorry for a bit late response. I have two concerns:
> > >
> > > 1. Priority
> > >
> > > I would propose to stretch the priorities that we are using to
> > differentiate
> > > between things that must be fixed for a given release:
> > >
> > > BLOCKER - drop anything you are doing, this issue must be fixed right
> now
> > > CRITICAL - release can not happen without fixing it, but can be fixed a
> > > bit later (for example without context switching and dropping whatever
> > I’m
> > > doing right now)
> > > MAJOR - default, nice to have
> > > Anything below - meh
> > >
> > > We were already using this semantic for tracking test instabilities
> > during
> > > the 1.11 release cycle. Good examples:
> > >
> > > BLOCKER - master branch not compiling, very frequent test failures (for
> > > example almost every build affected), …
> > > CRITICAL - performance regression/bug that we introduced in some
> feature,
> > > but which is not affecting other developers as much
> > > MAJOR - freshly discovered test instability with unknown
> impact/frequency
> > > (could be happening once a year),
> > >
> > > 2. Affects version
> > >
> > > If a bug is only on the master branch, does it affect an unreleased
> > version?
> > >
> > > So far I was assuming that it doesn’t - unreleased bugs would have
> empty
> > > “affects version” field. My reasoning was that this field should be
> used
> > > for Flink users, to check which RELEASED Flink versions are affected by
> > > some bug, that user is searching for. Otherwise it might be a bit
> > confusing
> > > if there are lots of bugs with both affects version and fix version set
> > to
> > > the same value.
> > >
> > > Piotrek
> > >
> > > > On 25 May 2020, at 16:40, Robert Metzger 
> wrote:
> > > >
> > > > Hi all,
> > > > thanks a lot for the feedback. The majority of responses are very
> > > positive
> > > > to my proposal.
> > > >
> > > > I have put my proposal into our wiki:
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=154995514
> > > >
> > > > Regarding the comments so far:
> > > > @Jark: I clarified this in the wiki.
> > > >
> > > > @Israel: I have not considered bulk-changing all 3000 resolved
> tickets
> > > to
> > > > closed yet, but after consideration I don't think it is necessary. If
> > > > others in the community would like to change

Re: request create flip permission for flink es bounded source/lookup source connector

2020-06-12 Thread Robert Metzger
Hi,
I gave you access to the Wiki!

On Fri, Jun 12, 2020 at 11:50 AM Jacky Lau  wrote:

> Hi Jark:
>Thank you so much. My wiki name is jackylau
>
>
> Jark Wu-2 wrote
> > Hi Jacky,
> >
> > What's your username in wiki? So that I can give the permission to you.
> >
> > Best,
> > Jark
> >
> > On Fri, 12 Jun 2020 at 11:38, Jacky Lau 
>
> > liuyongvs@
>
> >  wrote:
> >
> >> Hi all:
> >>After the short discussion here
> >>
> >>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Discussion-flink-elasticsearch-connector-supports-td42082.html#a42106
> >> ,
> >>I should create FLIP-127 to track this, but I don't have
> >> permission to create a FLIP.
> >>
> >>
> >>
> >> --
> >> Sent from:
> >> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
> >>
>
>
>
>
>
>
> --
> Sent from: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
>

