[GitHub] [flink] flinkbot edited a comment on pull request #13029: [FLINK-18763][python] Support basic TypeInformation for Python DataSt…

2020-07-31 Thread GitBox


flinkbot edited a comment on pull request #13029:
URL: https://github.com/apache/flink/pull/13029#issuecomment-666332680


   
   ## CI report:
   
   * 3d89fc037c6cf7b1dfe19514ff831af4a4d33b50 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5066)
 
   * 39f5be60dc81c30f0cbbfb4e219722cbdd52ea21 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5070)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-17851) FlinkKafkaProducerITCase.testResumeTransaction:108->assertRecord NoSuchElement

2020-07-31 Thread Dian Fu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168454#comment-17168454
 ] 

Dian Fu commented on FLINK-17851:
-

Hi [~rmetzger], it seems that this issue is a duplicate of FLINK-13733. I'd like 
to close it and track it in FLINK-13733 instead.

> FlinkKafkaProducerITCase.testResumeTransaction:108->assertRecord NoSuchElement
> --
>
> Key: FLINK-17851
> URL: https://issues.apache.org/jira/browse/FLINK-17851
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka, Tests
>Affects Versions: 1.11.0
>Reporter: Robert Metzger
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.12.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=1957&view=logs&j=d44f43ce-542c-597d-bf94-b0718c71e5e8&t=34f486e1-e1e4-5dd2-9c06-bfdd9b9c74a8
> {code}
> 2020-05-20T16:36:17.1033719Z [ERROR] Tests run: 10, Failures: 0, Errors: 1, 
> Skipped: 0, Time elapsed: 68.936 s <<< FAILURE! - in 
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerITCase
> 2020-05-20T16:36:17.1035011Z [ERROR] 
> testResumeTransaction(org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerITCase)
>   Time elapsed: 11.918 s  <<< ERROR!
> 2020-05-20T16:36:17.1036296Z java.util.NoSuchElementException
> 2020-05-20T16:36:17.1036802Z  at 
> org.apache.kafka.common.utils.AbstractIterator.next(AbstractIterator.java:52)
> 2020-05-20T16:36:17.1037553Z  at 
> org.apache.flink.shaded.guava18.com.google.common.collect.Iterators.getOnlyElement(Iterators.java:302)
> 2020-05-20T16:36:17.1038087Z  at 
> org.apache.flink.shaded.guava18.com.google.common.collect.Iterables.getOnlyElement(Iterables.java:289)
> 2020-05-20T16:36:17.1038654Z  at 
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerITCase.assertRecord(FlinkKafkaProducerITCase.java:201)
> 2020-05-20T16:36:17.1039239Z  at 
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerITCase.testResumeTransaction(FlinkKafkaProducerITCase.java:108)
> 2020-05-20T16:36:17.1039727Z  at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 2020-05-20T16:36:17.1040131Z  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 2020-05-20T16:36:17.1040575Z  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2020-05-20T16:36:17.1040989Z  at 
> java.lang.reflect.Method.invoke(Method.java:498)
> 2020-05-20T16:36:17.1041515Z  at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> 2020-05-20T16:36:17.1041989Z  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 2020-05-20T16:36:17.1042468Z  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> 2020-05-20T16:36:17.1042908Z  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 2020-05-20T16:36:17.1043497Z  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
> 2020-05-20T16:36:17.1044096Z  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
> 2020-05-20T16:36:17.1044833Z  at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 2020-05-20T16:36:17.1045171Z  at java.lang.Thread.run(Thread.java:748)
> 2020-05-20T16:36:17.1045358Z 
> 2020-05-20T16:36:17.6145018Z [INFO] 
> 2020-05-20T16:36:17.6145477Z [INFO] Results:
> 2020-05-20T16:36:17.6145654Z [INFO] 
> 2020-05-20T16:36:17.6145838Z [ERROR] Errors: 
> 2020-05-20T16:36:17.6146898Z [ERROR]   
> FlinkKafkaProducerITCase.testResumeTransaction:108->assertRecord:201 » 
> NoSuchElement
> {code}
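
The {{NoSuchElementException}} in the trace comes from Guava's {{getOnlyElement}}, which throws it when the iterable is empty, i.e. {{assertRecord}} polled no record at all. A minimal illustration (plain Guava here; the trace uses Flink's shaded copy):

{code}
import com.google.common.collect.Iterables;
import java.util.Collections;

public class GetOnlyElementDemo {
    public static void main(String[] args) {
        // Throws java.util.NoSuchElementException because the list is empty,
        // which is exactly the failure mode seen in assertRecord above.
        Iterables.getOnlyElement(Collections.emptyList());
    }
}
{code}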



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-17851) FlinkKafkaProducerITCase.testResumeTransaction:108->assertRecord NoSuchElement

2020-07-31 Thread Dian Fu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dian Fu closed FLINK-17851.
---
Resolution: Duplicate

> FlinkKafkaProducerITCase.testResumeTransaction:108->assertRecord NoSuchElement
> --
>
> Key: FLINK-17851
> URL: https://issues.apache.org/jira/browse/FLINK-17851
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka, Tests
>Affects Versions: 1.11.0
>Reporter: Robert Metzger
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.12.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=1957&view=logs&j=d44f43ce-542c-597d-bf94-b0718c71e5e8&t=34f486e1-e1e4-5dd2-9c06-bfdd9b9c74a8
> {code}
> 2020-05-20T16:36:17.1033719Z [ERROR] Tests run: 10, Failures: 0, Errors: 1, 
> Skipped: 0, Time elapsed: 68.936 s <<< FAILURE! - in 
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerITCase
> 2020-05-20T16:36:17.1035011Z [ERROR] 
> testResumeTransaction(org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerITCase)
>   Time elapsed: 11.918 s  <<< ERROR!
> 2020-05-20T16:36:17.1036296Z java.util.NoSuchElementException
> 2020-05-20T16:36:17.1036802Z  at 
> org.apache.kafka.common.utils.AbstractIterator.next(AbstractIterator.java:52)
> 2020-05-20T16:36:17.1037553Z  at 
> org.apache.flink.shaded.guava18.com.google.common.collect.Iterators.getOnlyElement(Iterators.java:302)
> 2020-05-20T16:36:17.1038087Z  at 
> org.apache.flink.shaded.guava18.com.google.common.collect.Iterables.getOnlyElement(Iterables.java:289)
> 2020-05-20T16:36:17.1038654Z  at 
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerITCase.assertRecord(FlinkKafkaProducerITCase.java:201)
> 2020-05-20T16:36:17.1039239Z  at 
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerITCase.testResumeTransaction(FlinkKafkaProducerITCase.java:108)
> 2020-05-20T16:36:17.1039727Z  at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 2020-05-20T16:36:17.1040131Z  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 2020-05-20T16:36:17.1040575Z  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2020-05-20T16:36:17.1040989Z  at 
> java.lang.reflect.Method.invoke(Method.java:498)
> 2020-05-20T16:36:17.1041515Z  at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> 2020-05-20T16:36:17.1041989Z  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 2020-05-20T16:36:17.1042468Z  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> 2020-05-20T16:36:17.1042908Z  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 2020-05-20T16:36:17.1043497Z  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
> 2020-05-20T16:36:17.1044096Z  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
> 2020-05-20T16:36:17.1044833Z  at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 2020-05-20T16:36:17.1045171Z  at java.lang.Thread.run(Thread.java:748)
> 2020-05-20T16:36:17.1045358Z 
> 2020-05-20T16:36:17.6145018Z [INFO] 
> 2020-05-20T16:36:17.6145477Z [INFO] Results:
> 2020-05-20T16:36:17.6145654Z [INFO] 
> 2020-05-20T16:36:17.6145838Z [ERROR] Errors: 
> 2020-05-20T16:36:17.6146898Z [ERROR]   
> FlinkKafkaProducerITCase.testResumeTransaction:108->assertRecord:201 » 
> NoSuchElement
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-18677) ZooKeeperLeaderRetrievalService does not invalidate leader in case of SUSPENDED connection

2020-07-31 Thread Till Rohrmann (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168458#comment-17168458
 ] 

Till Rohrmann commented on FLINK-18677:
---

Good work [~mapohl]. According to FLINK-10052, the 
{{ZooKeeperLeaderElectionService}} will revoke the leadership of the leader if 
the connection is {{SUSPENDED}}. Hence, as a first step, I believe it would make 
sense to make the behaviour on the {{ZooKeeperLeaderRetrievalService}} side 
symmetric.
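
A minimal sketch of that symmetric behaviour (not Flink's actual implementation; {{LeaderCallback}} is a hypothetical stand-in for the retrieval listener), built on Curator's connection-state callback:

{code}
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.state.ConnectionState;
import org.apache.curator.framework.state.ConnectionStateListener;

interface LeaderCallback {
    void onLeaderLost();
}

class SuspendAwareRetrievalListener implements ConnectionStateListener {
    private final LeaderCallback callback;

    SuspendAwareRetrievalListener(LeaderCallback callback) {
        this.callback = callback;
    }

    @Override
    public void stateChanged(CuratorFramework client, ConnectionState newState) {
        // Invalidate the cached leader as soon as the connection is SUSPENDED
        // (or LOST), so a TaskManager cannot keep acting on a stale leader.
        if (newState == ConnectionState.SUSPENDED || newState == ConnectionState.LOST) {
            callback.onLeaderLost();
        }
    }
}
{code}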

> ZooKeeperLeaderRetrievalService does not invalidate leader in case of 
> SUSPENDED connection
> --
>
> Key: FLINK-18677
> URL: https://issues.apache.org/jira/browse/FLINK-18677
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.10.1, 1.12.0, 1.11.1
>Reporter: Till Rohrmann
>Priority: Major
> Fix For: 1.12.0
>
>
> The {{ZooKeeperLeaderRetrievalService}} does not invalidate the leader if the 
> ZooKeeper connection gets SUSPENDED. This means that a {{TaskManager}} won't 
> cancel its running tasks even though it might miss a leader change. I think 
> we should at least make it configurable whether in such a situation the 
> leader listener should be informed about the lost leadership. Otherwise, we 
> might run into the situation where an old and a newly recovered instance of a 
> {{Task}} can run at the same time.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-17851) FlinkKafkaProducerITCase.testResumeTransaction:108->assertRecord NoSuchElement

2020-07-31 Thread Dian Fu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dian Fu updated FLINK-17851:

Fix Version/s: (was: 1.12.0)

> FlinkKafkaProducerITCase.testResumeTransaction:108->assertRecord NoSuchElement
> --
>
> Key: FLINK-17851
> URL: https://issues.apache.org/jira/browse/FLINK-17851
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka, Tests
>Affects Versions: 1.11.0
>Reporter: Robert Metzger
>Priority: Critical
>  Labels: test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=1957&view=logs&j=d44f43ce-542c-597d-bf94-b0718c71e5e8&t=34f486e1-e1e4-5dd2-9c06-bfdd9b9c74a8
> {code}
> 2020-05-20T16:36:17.1033719Z [ERROR] Tests run: 10, Failures: 0, Errors: 1, 
> Skipped: 0, Time elapsed: 68.936 s <<< FAILURE! - in 
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerITCase
> 2020-05-20T16:36:17.1035011Z [ERROR] 
> testResumeTransaction(org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerITCase)
>   Time elapsed: 11.918 s  <<< ERROR!
> 2020-05-20T16:36:17.1036296Z java.util.NoSuchElementException
> 2020-05-20T16:36:17.1036802Z  at 
> org.apache.kafka.common.utils.AbstractIterator.next(AbstractIterator.java:52)
> 2020-05-20T16:36:17.1037553Z  at 
> org.apache.flink.shaded.guava18.com.google.common.collect.Iterators.getOnlyElement(Iterators.java:302)
> 2020-05-20T16:36:17.1038087Z  at 
> org.apache.flink.shaded.guava18.com.google.common.collect.Iterables.getOnlyElement(Iterables.java:289)
> 2020-05-20T16:36:17.1038654Z  at 
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerITCase.assertRecord(FlinkKafkaProducerITCase.java:201)
> 2020-05-20T16:36:17.1039239Z  at 
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerITCase.testResumeTransaction(FlinkKafkaProducerITCase.java:108)
> 2020-05-20T16:36:17.1039727Z  at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 2020-05-20T16:36:17.1040131Z  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 2020-05-20T16:36:17.1040575Z  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2020-05-20T16:36:17.1040989Z  at 
> java.lang.reflect.Method.invoke(Method.java:498)
> 2020-05-20T16:36:17.1041515Z  at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> 2020-05-20T16:36:17.1041989Z  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 2020-05-20T16:36:17.1042468Z  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> 2020-05-20T16:36:17.1042908Z  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 2020-05-20T16:36:17.1043497Z  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
> 2020-05-20T16:36:17.1044096Z  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
> 2020-05-20T16:36:17.1044833Z  at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 2020-05-20T16:36:17.1045171Z  at java.lang.Thread.run(Thread.java:748)
> 2020-05-20T16:36:17.1045358Z 
> 2020-05-20T16:36:17.6145018Z [INFO] 
> 2020-05-20T16:36:17.6145477Z [INFO] Results:
> 2020-05-20T16:36:17.6145654Z [INFO] 
> 2020-05-20T16:36:17.6145838Z [ERROR] Errors: 
> 2020-05-20T16:36:17.6146898Z [ERROR]   
> FlinkKafkaProducerITCase.testResumeTransaction:108->assertRecord:201 » 
> NoSuchElement
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] leonardBang opened a new pull request #13035: [hotfix][csv][docs] Fix csv format option specification in docs.

2020-07-31 Thread GitBox


leonardBang opened a new pull request #13035:
URL: https://github.com/apache/flink/pull/13035


   ## What is the purpose of the change
   
   * This pull request fixes the CSV format option specification in the docs.
   
   
   ## Brief change log
   
 - Update the files csv.md and csv.zh.md.
   
   ## Verifying this change
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-16510) Task manager safeguard shutdown may not be reliable

2020-07-31 Thread Till Rohrmann (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168487#comment-17168487
 ] 

Till Rohrmann commented on FLINK-16510:
---

I think the current contract is that we terminate the Flink process when 
encountering a fatal error. {{taskmanager.jvm-exit-on-oom}} basically says 
whether an OOM originating from the user code should be considered a fatal 
error or not. Hence, I am not entirely sure what the meaning of 
{{taskmanager.jvm-exit-on-fatal-error}} would be.

What I would suggest is to make the exit behaviour configurable. One could 
introduce 
{{cluster.clean-up-on-fatal-error}}/{{cluster.fatal-error-behaviour}}/{{cluster.halt-jvm-on-fatal-exit}}
 to make the behaviour controllable.
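
A minimal sketch of such an option (the names above are proposals, not existing configuration keys): the flag decides between a clean {{System.exit()}}, which runs shutdown hooks, and a hard {{Runtime.halt()}}, which skips them:

{code}
final class FatalErrorExiter {
    private final boolean haltOnFatalError;

    FatalErrorExiter(boolean haltOnFatalError) {
        this.haltOnFatalError = haltOnFatalError;
    }

    void exitJvm(int exitCode) {
        if (haltOnFatalError) {
            // Terminates immediately; shutdown hooks and clean-up are skipped,
            // so nothing can block on the internal Shutdown lock.
            Runtime.getRuntime().halt(exitCode);
        } else {
            // Runs shutdown hooks; can block if another thread is already
            // halting the JVM (see the trace in the quoted issue below).
            System.exit(exitCode);
        }
    }
}
{code}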

> Task manager safeguard shutdown may not be reliable
> ---
>
> Key: FLINK-16510
> URL: https://issues.apache.org/jira/browse/FLINK-16510
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Attachments: command.txt, stack2-1.txt, stack3-mixed.txt, stack3.txt
>
>
> The {{JvmShutdownSafeguard}} does not always succeed but can hang when 
> multiple threads attempt to shut down the JVM. Apparently, mixing 
> {{System.exit()}} with shutdown hooks and forcefully terminating the JVM via 
> {{Runtime.halt()}} does not play well together:
> {noformat}
> "Jvm Terminator" #22 daemon prio=5 os_prio=0 tid=0x7fb8e82f2800 
> nid=0x5a96 runnable [0x7fb35cffb000]
>java.lang.Thread.State: RUNNABLE
>   at java.lang.Shutdown.$$YJP$$halt0(Native Method)
>   at java.lang.Shutdown.halt0(Shutdown.java)
>   at java.lang.Shutdown.halt(Shutdown.java:139)
>   - locked <0x00047ed67638> (a java.lang.Shutdown$Lock)
>   at java.lang.Runtime.halt(Runtime.java:276)
>   at 
> org.apache.flink.runtime.util.JvmShutdownSafeguard$DelayedTerminator.run(JvmShutdownSafeguard.java:86)
>   at java.lang.Thread.run(Thread.java:748)
>Locked ownable synchronizers:
>   - None
> "FlinkCompletableFutureDelayScheduler-thread-1" #18154 daemon prio=5 
> os_prio=0 tid=0x7fb708a7d000 nid=0x5a8a waiting for monitor entry 
> [0x7fb289d49000]
>java.lang.Thread.State: BLOCKED (on object monitor)
>   at java.lang.Shutdown.halt(Shutdown.java:139)
>   - waiting to lock <0x00047ed67638> (a java.lang.Shutdown$Lock)
>   at java.lang.Shutdown.exit(Shutdown.java:213)
>   - locked <0x00047edb7348> (a java.lang.Class for java.lang.Shutdown)
>   at java.lang.Runtime.exit(Runtime.java:110)
>   at java.lang.System.exit(System.java:973)
>   at 
> org.apache.flink.runtime.taskexecutor.TaskManagerRunner.terminateJVM(TaskManagerRunner.java:266)
>   at 
> org.apache.flink.runtime.taskexecutor.TaskManagerRunner.lambda$onFatalError$1(TaskManagerRunner.java:260)
>   at 
> org.apache.flink.runtime.taskexecutor.TaskManagerRunner$$Lambda$27464/1464672548.accept(Unknown
>  Source)
>   at 
> java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
>   at 
> java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
>   at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
>   at 
> java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
>   at 
> org.apache.flink.runtime.concurrent.FutureUtils$Timeout.run(FutureUtils.java:943)
>   at 
> org.apache.flink.runtime.concurrent.DirectExecutorService.execute(DirectExecutorService.java:211)
>   at 
> org.apache.flink.runtime.concurrent.FutureUtils.lambda$orTimeout$11(FutureUtils.java:361)
>   at 
> org.apache.flink.runtime.concurrent.FutureUtils$$Lambda$27435/159015392.run(Unknown
>  Source)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
>Locked ownable synchronizers:
>   - <0x0006d5e56bd0> (a 
> java.util.concurrent.ThreadPoolExecutor$Worker)
> {noformat}
> Note that under this condition the JVM should terminate but it still hangs. 
> Sometimes it quits after several minutes.
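
For reference, the safeguard pattern the issue refers to looks roughly like this (a sketch, not the actual {{JvmShutdownSafeguard}} code): a daemon thread that sleeps for a grace period and then hard-kills the JVM if a regular shutdown has not completed:

{code}
final class ShutdownSafeguard {
    static void arm(long gracePeriodMillis, int exitCode) {
        Thread terminator = new Thread(() -> {
            try {
                Thread.sleep(gracePeriodMillis);
            } catch (InterruptedException e) {
                // Interrupted: skip the forced termination.
                return;
            }
            // Still running after the grace period: halt without running
            // shutdown hooks. As the issue shows, even halt() can stall when
            // it contends with a concurrent System.exit().
            Runtime.getRuntime().halt(exitCode);
        }, "Jvm Terminator");
        terminator.setDaemon(true);
        terminator.start();
    }
}
{code}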



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[GitHub] [flink] leonardBang opened a new pull request #13036: [hotfix][csv][docs] Fix csv format option specification in docs.

2020-07-31 Thread GitBox


leonardBang opened a new pull request #13036:
URL: https://github.com/apache/flink/pull/13036


   ## What is the purpose of the change
   
   * This pull request fixes the CSV format option specification in the docs.
   
   
   ## Brief change log
   
 - Update the files csv.md and csv.zh.md.
   
   ## Verifying this change
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] hequn8128 commented on a change in pull request #13029: [FLINK-18763][python] Support basic TypeInformation for Python DataSt…

2020-07-31 Thread GitBox


hequn8128 commented on a change in pull request #13029:
URL: https://github.com/apache/flink/pull/13029#discussion_r463382296



##
File path: flink-python/pyflink/common/typeinfo.py
##
@@ -0,0 +1,576 @@
+
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+from abc import ABC, abstractmethod
+
+from py4j.java_gateway import JavaClass, JavaObject
+
+from pyflink.java_gateway import get_gateway
+
+
+class TypeInformation(ABC):
+"""
+TypeInformation is the core class of Flink's type system. FLink requires a 
type information
+for all types that are used as input or return type of a user function. 
This type information
+class acts as the tool to generate serializers and comparators, and to 
perform semantic checks
+such as whether the fields that are used as join/grouping keys actually 
existt.
+
+The type information also bridges between the programming languages object 
model and a logical
+flat schema. It maps fields from the types to columns (fields) in a flat 
schema. Not all fields
+from a type are mapped to a separate fields in the flat schema and often, 
entire types are
+mapped to one field.. It is important to notice that the schema must hold 
for all instances of a
+type. For that reason, elements in lists and arrays are not assigned to 
individual fields, but
+the lists and arrays are considered to be one field in total, to account 
for different lengths
+in the arrays.
+a) Basic types are indivisible and are considered as a single field.
+b) Arrays and collections are one field.
+c) Tuples and case classes represent as many fields as the class has 
fields.
+To represent this properly, each type has an arity (the number of fields 
it contains directly),
+and a total number of fields (number of fields in the entire schema of 
this type, including
+nested types).
+"""
+
+@abstractmethod
+def is_basic_type(self):
+"""
+Checks if this type information represents a basic type.
+Basic types are defined in BasicTypeInfo and are primitives, their 
boxing type, Strings ...
+
+:return:  True, if this type information describes a basic type, false 
otherwise.
+"""
+pass
+
+@abstractmethod
+def is_tuple_type(self):
+"""
+Checks if this type information represents a Tuple type.
+Tuple types are subclasses of the Java API tuples.

Review comment:
   We can remove this line. Tuple types are supported natively in Python. 
We should not simply copy the Java comments. 

##
File path: flink-python/pyflink/common/typeinfo.py
##
@@ -0,0 +1,576 @@
+
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+from abc import ABC, abstractmethod
+
+from py4j.java_gateway import JavaClass, JavaObject
+
+from pyflink.java_gateway import get_gateway
+
+
+class TypeInformation(ABC):
+"""
+TypeInformation is the core class of Flink's type system. FLink requires a 
type information
+for all types that are used as input or return type of a user function. 
This type information
+class acts as the tool 

[GitHub] [flink] flinkbot edited a comment on pull request #13033: [FLINK-18777][catalog] Supports schema registry catalog

2020-07-31 Thread GitBox


flinkbot edited a comment on pull request #13033:
URL: https://github.com/apache/flink/pull/13033#issuecomment-666927378


   
   ## CI report:
   
   * f40e57384a1aff20df20d9163b536593970d9615 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5068)
 
   * 18f6f1b410d5162bb2ab87b0875ac52b152d31ad UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #12917: [FLINK-18355][tests] Simplify tests of SlotPoolImpl

2020-07-31 Thread GitBox


flinkbot edited a comment on pull request #12917:
URL: https://github.com/apache/flink/pull/12917#issuecomment-659940433


   
   ## CI report:
   
   * da7fb96fc00acda2a4c103f5f177efb9bd9be8be UNKNOWN
   * f80ff802ee1f126a58b1617ab4ca82c3f1a40ba8 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5065)
 
   * 1c5870c57b64436014900d67be2005395e007a52 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #13036: [hotfix][csv][docs] Fix csv format option specification in docs.

2020-07-31 Thread GitBox


flinkbot commented on pull request #13036:
URL: https://github.com/apache/flink/pull/13036#issuecomment-666977648


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 82debf0d8c434eca265ae2bf2603421e408d (Fri Jul 31 
07:27:41 UTC 2020)
   
✅ no warnings
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] wuchong merged pull request #13035: [hotfix][csv][docs] Fix csv format option specification in docs.

2020-07-31 Thread GitBox


wuchong merged pull request #13035:
URL: https://github.com/apache/flink/pull/13035


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #13035: [hotfix][csv][docs] Fix csv format option specification in docs.

2020-07-31 Thread GitBox


flinkbot commented on pull request #13035:
URL: https://github.com/apache/flink/pull/13035#issuecomment-666977662


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 9a3f53bb056b68dde089660836c0648013902901 (Fri Jul 31 
07:27:44 UTC 2020)
   
✅ no warnings
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] wuchong merged pull request #13036: [hotfix][csv][docs] Fix csv format option specification in docs.

2020-07-31 Thread GitBox


wuchong merged pull request #13036:
URL: https://github.com/apache/flink/pull/13036


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] shuiqiangchen commented on a change in pull request #13029: [FLINK-18763][python] Support basic TypeInformation for Python DataSt…

2020-07-31 Thread GitBox


shuiqiangchen commented on a change in pull request #13029:
URL: https://github.com/apache/flink/pull/13029#discussion_r463450905



##
File path: flink-python/pyflink/common/typeinfo.py
##
@@ -0,0 +1,576 @@
+
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+from abc import ABC, abstractmethod
+
+from py4j.java_gateway import JavaClass, JavaObject
+
+from pyflink.java_gateway import get_gateway
+
+
+class TypeInformation(ABC):
+"""
+TypeInformation is the core class of Flink's type system. FLink requires a 
type information
+for all types that are used as input or return type of a user function. 
This type information
+class acts as the tool to generate serializers and comparators, and to 
perform semantic checks
+such as whether the fields that are used as join/grouping keys actually 
existt.

Review comment:
   Thank you! I will revise it.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] hequn8128 commented on a change in pull request #13029: [FLINK-18763][python] Support basic TypeInformation for Python DataSt…

2020-07-31 Thread GitBox


hequn8128 commented on a change in pull request #13029:
URL: https://github.com/apache/flink/pull/13029#discussion_r463445937



##
File path: flink-python/pyflink/common/typeinfo.py
##
@@ -0,0 +1,561 @@
+
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+from abc import ABC, abstractmethod
+
+from py4j.java_gateway import JavaClass, JavaObject
+
+from pyflink.java_gateway import get_gateway
+
+
+class TypeInformation(ABC):
+"""
+TypeInformation is the core class of Flink's type system. FLink requires a 
type information
+for all types that are used as input or return type of a user function. 
This type information
+class acts as the tool to generate serializers and comparators, and to 
perform semantic checks
+such as whether the fields that are used as join/grouping keys actually 
existt.
+
+The type information also bridges between the programming languages object 
model and a logical
+flat schema. It maps fields from the types to columns (fields) in a flat 
schema. Not all fields
+from a type are mapped to a separate fields in the flat schema and often, 
entire types are
+mapped to one field.. It is important to notice that the schema must hold 
for all instances of a
+type. For that reason, elements in lists and arrays are not assigned to 
individual fields, but
+the lists and arrays are considered to be one field in total, to account 
for different lengths
+in the arrays.
+a) Basic types are indivisible and are considered as a single field.
+b) Arrays and collections are one field.
+c) Tuples and case classes represent as many fields as the class has 
fields.
+To represent this properly, each type has an arity (the number of fields 
it contains directly),
+and a total number of fields (number of fields in the entire schema of 
this type, including
+nested types).
+"""
+
+@abstractmethod
+def is_basic_type(self):
+"""
+Checks if this type information represents a basic type.
+Basic types are defined in BasicTypeInfo and are primitives, their 
boxing type, Strings ...
+
+:return:  True, if this type information describes a basic type, false 
otherwise.
+"""
+pass
+
+@abstractmethod
+def is_tuple_type(self):
+"""
+Checks if this type information represents a Tuple type.
+Tuple types are subclasses of the Java API tuples.
+
+:return: True, if this type information describes a tuple type, false 
otherwise.
+"""
+pass
+
+@abstractmethod
+def get_arity(self):
+"""
+Gets the arity of this type - the number of fields without nesting.
+
+:return: the number of fields in this type without nesting.
+"""
+pass
+
+@abstractmethod
+def get_total_fields(self):
+"""
+Gets the number of logical fields in this type. This includes its 
nested and transitively
+nested fields, in the case of composite types.
+The total number of fields must be at lest 1.
+
+:return: The number of fields in this type, including its sub-fields 
(for composit types).
+"""
+pass
+
+
+class WrapperTypeInfo(TypeInformation):
+"""
+A wrapper class for java TypeInformation Objects.
+"""
+
+def __init__(self, j_typeinfo):
+self._j_typeinfo = j_typeinfo
+
+def is_basic_type(self):
+return self._j_typeinfo.isBasicType()
+
+def is_tuple_type(self):
+return self._j_typeinfo.isTupleType()
+
+def get_arity(self):
+return self._j_typeinfo.getArity()
+
+def get_total_fields(self):
+return self._j_typeinfo.getTotalFields()
+
+def get_java_type_info(self):
+return self._j_typeinfo
+
+def __eq__(self, o) -> bool:
+if type(o) is type(self):
+return self._j_typeinfo.equals(o._j_typeinfo)
+else:
+return False
+
+def __hash__(self):
+return hash(self._j_typeinfo)
+
+
+class BasicTypeInfo(TypeI

[GitHub] [flink] shuiqiangchen commented on a change in pull request #13029: [FLINK-18763][python] Support basic TypeInformation for Python DataSt…

2020-07-31 Thread GitBox


shuiqiangchen commented on a change in pull request #13029:
URL: https://github.com/apache/flink/pull/13029#discussion_r463451278



##
File path: flink-python/pyflink/common/typeinfo.py
##
@@ -0,0 +1,576 @@
+
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+from abc import ABC, abstractmethod
+
+from py4j.java_gateway import JavaClass, JavaObject
+
+from pyflink.java_gateway import get_gateway
+
+
+class TypeInformation(ABC):
+"""
+TypeInformation is the core class of Flink's type system. FLink requires a 
type information
+for all types that are used as input or return type of a user function. 
This type information
+class acts as the tool to generate serializers and comparators, and to 
perform semantic checks
+such as whether the fields that are used as join/grouping keys actually 
existt.
+
+The type information also bridges between the programming languages object 
model and a logical
+flat schema. It maps fields from the types to columns (fields) in a flat 
schema. Not all fields
+from a type are mapped to a separate fields in the flat schema and often, 
entire types are
+mapped to one field.. It is important to notice that the schema must hold 
for all instances of a
+type. For that reason, elements in lists and arrays are not assigned to 
individual fields, but
+the lists and arrays are considered to be one field in total, to account 
for different lengths
+in the arrays.
+a) Basic types are indivisible and are considered as a single field.
+b) Arrays and collections are one field.
+c) Tuples and case classes represent as many fields as the class has 
fields.

Review comment:
   Yes, I will remove it.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] shuiqiangchen commented on a change in pull request #13029: [FLINK-18763][python] Support basic TypeInformation for Python DataSt…

2020-07-31 Thread GitBox


shuiqiangchen commented on a change in pull request #13029:
URL: https://github.com/apache/flink/pull/13029#discussion_r463451469



##
File path: flink-python/pyflink/common/typeinfo.py
##
@@ -0,0 +1,576 @@
+
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+from abc import ABC, abstractmethod
+
+from py4j.java_gateway import JavaClass, JavaObject
+
+from pyflink.java_gateway import get_gateway
+
+
+class TypeInformation(ABC):
+"""
+TypeInformation is the core class of Flink's type system. FLink requires a 
type information
+for all types that are used as input or return type of a user function. 
This type information
+class acts as the tool to generate serializers and comparators, and to 
perform semantic checks
+such as whether the fields that are used as join/grouping keys actually 
existt.
+
+The type information also bridges between the programming languages object 
model and a logical
+flat schema. It maps fields from the types to columns (fields) in a flat 
schema. Not all fields
+from a type are mapped to a separate fields in the flat schema and often, 
entire types are
+mapped to one field.. It is important to notice that the schema must hold 
for all instances of a
+type. For that reason, elements in lists and arrays are not assigned to 
individual fields, but
+the lists and arrays are considered to be one field in total, to account 
for different lengths
+in the arrays.
+a) Basic types are indivisible and are considered as a single field.
+b) Arrays and collections are one field.
+c) Tuples and case classes represent as many fields as the class has 
fields.
+To represent this properly, each type has an arity (the number of fields 
it contains directly),
+and a total number of fields (number of fields in the entire schema of 
this type, including
+nested types).
+"""
+
+@abstractmethod
+def is_basic_type(self):

Review comment:
   Ok, it's good to add type hints.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
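
For context on the arity / total-fields distinction the reviewed docstring describes: as the WrapperTypeInfo diff earlier in the thread shows, the Python methods delegate to the Java TypeInformation via py4j, where the two values can be observed directly. A small Java illustration (standard Flink `Types` factory):

```java
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.common.typeinfo.Types;

public class ArityDemo {
    public static void main(String[] args) {
        // A nested tuple type: Tuple2<Integer, Tuple2<String, Long>>.
        TypeInformation<?> nested =
            Types.TUPLE(Types.INT, Types.TUPLE(Types.STRING, Types.LONG));

        // Arity counts only the directly contained fields: 2.
        System.out.println(nested.getArity());

        // Total fields counts the flattened schema (INT, STRING, LONG): 3.
        System.out.println(nested.getTotalFields());
    }
}
```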




[GitHub] [flink] xiaolong-sn commented on a change in pull request #13005: Feature/flink 18661 efo de registration

2020-07-31 Thread GitBox


xiaolong-sn commented on a change in pull request #13005:
URL: https://github.com/apache/flink/pull/13005#discussion_r463452532



##
File path: 
flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/internals/ShardConsumer.java
##
@@ -147,9 +228,31 @@ public void run() {
}
} catch (Throwable t) {
fetcherRef.stopWithError(t);
+   } finally {
+   if (this.streamInfoList != null) {
+   try {
+   deregisterStreamConsumer();
+   } catch (Throwable t) {
+   fetcherRef.stopWithError(t);
+   }
+   }
}
}
 
+   private void deregisterStreamConsumer() throws ExecutionException, 
InterruptedException {

Review comment:
   Moved.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
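
The cleanup added in the diff above follows a common shape: run the consumer loop, then deregister in a `finally` block so the consumer is removed even after a failure, routing cleanup errors through the same error path. A generic sketch (names like `Fetcher` are illustrative, not the connector's API):

```java
final class ConsumerLoop {
    interface Fetcher {
        void stopWithError(Throwable t);
    }

    static void run(Fetcher fetcher, Runnable body, Runnable deregister) {
        try {
            body.run();
        } catch (Throwable t) {
            fetcher.stopWithError(t);
        } finally {
            try {
                // Always attempt de-registration, even after a failure.
                deregister.run();
            } catch (Throwable t) {
                // Surface cleanup failures through the same error path.
                fetcher.stopWithError(t);
            }
        }
    }
}
```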




[GitHub] [flink] xiaolong-sn commented on a change in pull request #13005: Feature/flink 18661 efo de registration

2020-07-31 Thread GitBox


xiaolong-sn commented on a change in pull request #13005:
URL: https://github.com/apache/flink/pull/13005#discussion_r463453064



##
File path: 
flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/internals/KinesisDataFetcher.java
##
@@ -400,20 +482,51 @@ private RecordEmitter createRecordEmitter(Properties 
configProps) {
SequenceNumber lastSequenceNum,
MetricGroup metricGroup,
KinesisDeserializationSchema shardDeserializer) {
+   return createShardConsumer(
+   subscribedShardStateIndex,
+   subscribedShard,
+   lastSequenceNum,
+   metricGroup,
+   shardDeserializer,
+   null
+   );
+   }
+
+   /**
+* Create a new shard consumer.
+* Override this method to customize shard consumer behavior in 
subclasses.
+* @param subscribedShardStateIndex the state index of the shard this 
consumer is subscribed to
+* @param subscribedShard the shard this consumer is subscribed to
+* @param lastSequenceNum the sequence number in the shard to start 
consuming
+* @param metricGroup the metric group to report metrics to
+* @param streamInfoList the stream info used for enhanced fan-out to 
consume from
+* @return shard consumer
+*/
+   protected ShardConsumer createShardConsumer(
+   Integer subscribedShardStateIndex,
+   StreamShardHandle subscribedShard,
+   SequenceNumber lastSequenceNum,
+   MetricGroup metricGroup,
+   KinesisDeserializationSchema shardDeserializer,
+   @Nullable List streamInfoList) {

Review comment:
   I reverted the change to ShardConsumer.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] xiaolong-sn commented on a change in pull request #13005: Feature/flink 18661 efo de registration

2020-07-31 Thread GitBox


xiaolong-sn commented on a change in pull request #13005:
URL: https://github.com/apache/flink/pull/13005#discussion_r463452805



##
File path: 
flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/internals/KinesisDataFetcher.java
##
@@ -400,20 +482,51 @@ private RecordEmitter createRecordEmitter(Properties 
configProps) {
SequenceNumber lastSequenceNum,
MetricGroup metricGroup,
KinesisDeserializationSchema shardDeserializer) {
+   return createShardConsumer(
+   subscribedShardStateIndex,
+   subscribedShard,
+   lastSequenceNum,
+   metricGroup,
+   shardDeserializer,
+   null
+   );
+   }
+
+   /**
+* Create a new shard consumer.
+* Override this method to customize shard consumer behavior in 
subclasses.
+* @param subscribedShardStateIndex the state index of the shard this 
consumer is subscribed to
+* @param subscribedShard the shard this consumer is subscribed to
+* @param lastSequenceNum the sequence number in the shard to start 
consuming
+* @param metricGroup the metric group to report metrics to
+* @param streamInfoList the stream info used for enhanced fan-out to 
consume from
+* @return shard consumer
+*/
+   protected ShardConsumer createShardConsumer(
+   Integer subscribedShardStateIndex,
+   StreamShardHandle subscribedShard,
+   SequenceNumber lastSequenceNum,
+   MetricGroup metricGroup,
+   KinesisDeserializationSchema shardDeserializer,
+   @Nullable List streamInfoList) {
 
final KinesisProxyInterface kinesis = 
kinesisProxyFactory.create(configProps);
 
final RecordPublisher recordPublisher = new 
PollingRecordPublisherFactory()
.create(configProps, metricGroup, subscribedShard, 
kinesis);
 
-   return new ShardConsumer<>(
+   return new ShardConsumer(
this,
recordPublisher,
subscribedShardStateIndex,
subscribedShard,
lastSequenceNum,
new ShardConsumerMetricsReporter(metricGroup),
-   shardDeserializer);
+   shardDeserializer,
+   configProps,

Review comment:
   Solved.

##
File path: 
flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/internals/ShardConsumer.java
##
@@ -22,21 +22,29 @@
 import 
org.apache.flink.streaming.connectors.kinesis.config.ConsumerConfigConstants;

Review comment:
   Solved.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] xiaolong-sn commented on a change in pull request #13005: Feature/flink 18661 efo de registration

2020-07-31 Thread GitBox


xiaolong-sn commented on a change in pull request #13005:
URL: https://github.com/apache/flink/pull/13005#discussion_r463453275



##
File path: 
flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/internals/KinesisDataFetcher.java
##
@@ -362,6 +440,10 @@ protected KinesisDataFetcher(List streams,
this.watermarkTracker = watermarkTracker;
this.kinesisProxyFactory = checkNotNull(kinesisProxyFactory);
this.kinesis = kinesisProxyFactory.create(configProps);
+   this.kinesisProxyV2Factory = 
checkNotNull(kinesisProxyV2Factory);
+   if (shouldRegisterConsumerEagerly()) {

Review comment:
   Made the eager registration happen in FlinkKinesisConsumer.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] shuiqiangchen commented on a change in pull request #13029: [FLINK-18763][python] Support basic TypeInformation for Python DataSt…

2020-07-31 Thread GitBox


shuiqiangchen commented on a change in pull request #13029:
URL: https://github.com/apache/flink/pull/13029#discussion_r463453223



##
File path: flink-python/pyflink/common/typeinfo.py
##
@@ -0,0 +1,576 @@
+
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+from abc import ABC, abstractmethod
+
+from py4j.java_gateway import JavaClass, JavaObject
+
+from pyflink.java_gateway import get_gateway
+
+
+class TypeInformation(ABC):
+"""
+TypeInformation is the core class of Flink's type system. FLink requires a 
type information
+for all types that are used as input or return type of a user function. 
This type information
+class acts as the tool to generate serializers and comparators, and to 
perform semantic checks
+such as whether the fields that are used as join/grouping keys actually 
existt.
+
+The type information also bridges between the programming languages object 
model and a logical
+flat schema. It maps fields from the types to columns (fields) in a flat 
schema. Not all fields
+from a type are mapped to a separate fields in the flat schema and often, 
entire types are
+mapped to one field.. It is important to notice that the schema must hold 
for all instances of a
+type. For that reason, elements in lists and arrays are not assigned to 
individual fields, but
+the lists and arrays are considered to be one field in total, to account 
for different lengths
+in the arrays.
+a) Basic types are indivisible and are considered as a single field.
+b) Arrays and collections are one field.
+c) Tuples and case classes represent as many fields as the class has 
fields.
+To represent this properly, each type has an arity (the number of fields 
it contains directly),
+and a total number of fields (number of fields in the entire schema of 
this type, including
+nested types).
+"""
+
+@abstractmethod
+def is_basic_type(self):
+"""
+Checks if this type information represents a basic type.
+Basic types are defined in BasicTypeInfo and are primitives, their 
boxing type, Strings ...
+
+:return:  True, if this type information describes a basic type, false 
otherwise.
+"""
+pass
+
+@abstractmethod
+def is_tuple_type(self):
+"""
+Checks if this type information represents a Tuple type.
+Tuple types are subclasses of the Java API tuples.

Review comment:
   Sure, there should be differences between the Java and Python comments.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] shuiqiangchen commented on a change in pull request #13029: [FLINK-18763][python] Support basic TypeInformation for Python DataSt…

2020-07-31 Thread GitBox


shuiqiangchen commented on a change in pull request #13029:
URL: https://github.com/apache/flink/pull/13029#discussion_r463453354



##
File path: flink-python/pyflink/common/typeinfo.py
##
@@ -0,0 +1,576 @@
+
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+from abc import ABC, abstractmethod
+
+from py4j.java_gateway import JavaClass, JavaObject
+
+from pyflink.java_gateway import get_gateway
+
+
+class TypeInformation(ABC):
+"""
+TypeInformation is the core class of Flink's type system. Flink requires a 
type information
+for all types that are used as input or return type of a user function. 
This type information
+class acts as the tool to generate serializers and comparators, and to 
perform semantic checks
+such as whether the fields that are used as join/grouping keys actually 
exist.
+
+The type information also bridges between the programming language's object 
model and a logical
+flat schema. It maps fields from the types to columns (fields) in a flat 
schema. Not all fields
+from a type are mapped to separate fields in the flat schema and often, 
entire types are
+mapped to one field. It is important to notice that the schema must hold 
for all instances of a
+type. For that reason, elements in lists and arrays are not assigned to 
individual fields, but
+the lists and arrays are considered to be one field in total, to account 
for different lengths
+in the arrays.
+a) Basic types are indivisible and are considered as a single field.
+b) Arrays and collections are one field.
+c) Tuples and case classes represent as many fields as the class has 
fields.
+To represent this properly, each type has an arity (the number of fields 
it contains directly),
+and a total number of fields (number of fields in the entire schema of 
this type, including
+nested types).
+"""
+
+@abstractmethod
+def is_basic_type(self):
+"""
+Checks if this type information represents a basic type.
+Basic types are defined in BasicTypeInfo and are primitives, their 
boxing types, Strings, ...
+
+:return:  True, if this type information describes a basic type, false 
otherwise.
+"""
+pass
+
+@abstractmethod
+def is_tuple_type(self):
+"""
+Checks if this type information represents a Tuple type.
+Tuple types are subclasses of the Java API tuples.
+
+:return: True, if this type information describes a tuple type, false 
otherwise.
+"""
+pass
+
+@abstractmethod
+def get_arity(self):
+"""
+Gets the arity of this type - the number of fields without nesting.
+
+:return: the number of fields in this type without nesting.
+"""
+pass
+
+@abstractmethod
+def get_total_fields(self):
+"""
+Gets the number of logical fields in this type. This includes its 
nested and transitively
+nested fields, in the case of composite types.
+The total number of fields must be at least 1.
+
+:return: The number of fields in this type, including its sub-fields 
(for composite types).

Review comment:
   Thank you! I'll revise it.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] xiaolong-sn commented on a change in pull request #13005: Feature/flink 18661 efo de registration

2020-07-31 Thread GitBox


xiaolong-sn commented on a change in pull request #13005:
URL: https://github.com/apache/flink/pull/13005#discussion_r463453684



##
File path: 
flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/proxy/KinesisProxyV2.java
##
@@ -28,16 +51,210 @@
  */
 @Internal
 public class KinesisProxyV2 implements KinesisProxyV2Interface {
+   private static final Logger LOG = 
LoggerFactory.getLogger(KinesisProxyV2.class);
+
+   /** Random seed used to calculate backoff jitter for Kinesis 
operations. */
+   private static final Random seed = new Random();
 
private final KinesisAsyncClient kinesisAsyncClient;
 
+   private final FanOutRecordPublisherConfiguration 
fanOutRecordPublisherConfiguration;
+
/**
 * Create a new KinesisProxyV2 based on the supplied configuration 
properties.
 *
-* @param kinesisAsyncClient the kinesis async client used to 
communicate with Kinesis
+* @param configProps configuration properties containing AWS 
credential and AWS region info
+*/
+   public KinesisProxyV2(final Properties configProps, List<String> streams) {
+   this.kinesisAsyncClient = createKinesisAsyncClient(configProps);
+   this.fanOutRecordPublisherConfiguration = new 
FanOutRecordPublisherConfiguration(configProps, streams);
+   }
+
+   /**
+* Creates a Kinesis proxy V2.
+*
+* @param configProps configuration properties
+* @param streams list of kinesis stream names
+* @return the created kinesis proxy v2
+*/
+   public static KinesisProxyV2Interface create(Properties configProps, List<String> streams) {
+   return new KinesisProxyV2(configProps, streams);
+   }
+
+   /**
+* Create the Kinesis client, using the provided configuration 
properties.
+* Derived classes can override this method to customize the client 
configuration.
+*
+* @param configProps the properties map used to create the Kinesis 
Client
+* @return a Kinesis Client
+*/
+   protected KinesisAsyncClient createKinesisAsyncClient(final Properties 
configProps) {
+   final ClientConfiguration config = new 
ClientConfigurationFactory().getConfig();
+   return AwsV2Util.createKinesisAsyncClient(configProps, config);
+   }
+
+   /**
+* {@inheritDoc}
 */
-   public KinesisProxyV2(final KinesisAsyncClient kinesisAsyncClient) {
-   this.kinesisAsyncClient = kinesisAsyncClient;
+   @Override
+   public Map describeStream(List streams) throws 
InterruptedException, ExecutionException {

Review comment:
   I just split it and made the loop external.
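
   For illustration, a minimal sketch of that shape: an external loop issuing one `describeStream` call per stream name, retried with the full-jitter backoff that the `seed` field above exists for. The retry bound, backoff constants and method name are hypothetical, not from the PR; `DescribeStreamRequest`/`DescribeStreamResponse` are the AWS SDK v2 model classes:

       private DescribeStreamResponse describeStreamWithBackoff(final String stream)
               throws InterruptedException, ExecutionException {
           long backoffMillis = 1_000L;           // hypothetical base backoff
           final long maxBackoffMillis = 30_000L; // hypothetical cap
           final int maxAttempts = 10;            // hypothetical retry bound
           for (int attempt = 1; ; attempt++) {
               try {
                   return kinesisAsyncClient
                       .describeStream(DescribeStreamRequest.builder().streamName(stream).build())
                       .get();
               } catch (ExecutionException e) {
                   if (attempt == maxAttempts) {
                       throw e;
                   }
                   // Full jitter: sleep a random time in [0, backoffMillis), then widen the window.
                   Thread.sleep((long) (seed.nextDouble() * backoffMillis));
                   backoffMillis = Math.min(backoffMillis * 2, maxBackoffMillis);
               }
           }
       }

   The external loop over the streams then calls this once per stream name and collects the responses into the result map.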





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] shuiqiangchen commented on a change in pull request #13029: [FLINK-18763][python] Support basic TypeInformation for Python DataSt…

2020-07-31 Thread GitBox


shuiqiangchen commented on a change in pull request #13029:
URL: https://github.com/apache/flink/pull/13029#discussion_r463453594



##
File path: flink-python/pyflink/common/typeinfo.py
##
@@ -0,0 +1,576 @@
+
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+from abc import ABC, abstractmethod
+
+from py4j.java_gateway import JavaClass, JavaObject
+
+from pyflink.java_gateway import get_gateway
+
+
+class TypeInformation(ABC):
+"""
+TypeInformation is the core class of Flink's type system. Flink requires a 
type information
+for all types that are used as input or return type of a user function. 
This type information
+class acts as the tool to generate serializers and comparators, and to 
perform semantic checks
+such as whether the fields that are used as join/grouping keys actually 
exist.
+
+The type information also bridges between the programming language's object 
model and a logical
+flat schema. It maps fields from the types to columns (fields) in a flat 
schema. Not all fields
+from a type are mapped to separate fields in the flat schema and often, 
entire types are
+mapped to one field. It is important to notice that the schema must hold 
for all instances of a
+type. For that reason, elements in lists and arrays are not assigned to 
individual fields, but
+the lists and arrays are considered to be one field in total, to account 
for different lengths
+in the arrays.
+a) Basic types are indivisible and are considered as a single field.
+b) Arrays and collections are one field.
+c) Tuples and case classes represent as many fields as the class has 
fields.
+To represent this properly, each type has an arity (the number of fields 
it contains directly),
+and a total number of fields (number of fields in the entire schema of 
this type, including
+nested types).
+"""
+
+@abstractmethod
+def is_basic_type(self):
+"""
+Checks if this type information represents a basic type.
+Basic types are defined in BasicTypeInfo and are primitives, their 
boxing types, Strings, ...
+
+:return:  True, if this type information describes a basic type, false 
otherwise.
+"""
+pass
+
+@abstractmethod
+def is_tuple_type(self):
+"""
+Checks if this type information represents a Tuple type.
+Tuple types are subclasses of the Java API tuples.
+
+:return: True, if this type information describes a tuple type, false 
otherwise.
+"""
+pass
+
+@abstractmethod
+def get_arity(self):
+"""
+Gets the arity of this type - the number of fields without nesting.
+
+:return: the number of fields in this type without nesting.
+"""
+pass
+
+@abstractmethod
+def get_total_fields(self):
+"""
+Gets the number of logical fields in this type. This includes its 
nested and transitively
+nested fields, in the case of composite types.
+The total number of fields must be at least 1.
+
+:return: The number of fields in this type, including its sub-fields 
(for composite types).
+"""
+pass
+
+
+class WrapperTypeInfo(TypeInformation):
+"""
+A wrapper class for Java TypeInformation objects.
+"""
+
+def __init__(self, j_typeinfo):
+self._j_typeinfo = j_typeinfo
+
+def is_basic_type(self):
+return self._j_typeinfo.isBasicType()
+
+def is_tuple_type(self):
+return self._j_typeinfo.isTupleType()
+
+def get_arity(self):
+return self._j_typeinfo.getArity()
+
+def get_total_fields(self):
+return self._j_typeinfo.getTotalFields()
+
+def get_java_type_info(self):
+return self._j_typeinfo
+
+def __eq__(self, o) -> bool:
+if type(o) is type(self):
+return self._j_typeinfo.equals(o._j_typeinfo)
+else:
+return False
+
+def __hash__(self):
+return hash(self._j_typeinfo)
+
+
+class BasicTypeInfo(T
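
The arity vs. total-fields distinction described in the docstring above mirrors the Java `TypeInformation` contract. A small Java illustration; the printed values follow directly from the composite-type rules quoted above:

    import org.apache.flink.api.common.typeinfo.TypeInformation;
    import org.apache.flink.api.common.typeinfo.Types;
    import org.apache.flink.api.java.tuple.Tuple2;

    public class ArityVsTotalFields {
        public static void main(String[] args) {
            // A Tuple2 whose second field is itself a Tuple2.
            TypeInformation<Tuple2<Integer, Tuple2<String, Long>>> info =
                Types.TUPLE(Types.INT, Types.TUPLE(Types.STRING, Types.LONG));
            System.out.println(info.getArity());       // 2: fields contained directly
            System.out.println(info.getTotalFields()); // 3: INT, STRING, LONG across the whole schema
        }
    }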

[jira] [Created] (FLINK-18780) FlinkKafkaProducer should add another constructor accepting SerializationSchema

2020-07-31 Thread Qingsheng Ren (Jira)
Qingsheng Ren created FLINK-18780:
-

 Summary: FlinkKafkaProducer should add another constructor 
accepting SerializationSchema
 Key: FLINK-18780
 URL: https://issues.apache.org/jira/browse/FLINK-18780
 Project: Flink
  Issue Type: New Feature
  Components: Connectors / Kafka
Affects Versions: 1.11.1, 1.11.0
Reporter: Qingsheng Ren


Currently `FlinkKafkaProducer` doesn't have a constructor like

`FlinkKafkaProducer(String, SerializationSchema, Properties, Semantic)`

This is used as an example in Kafka DataStream Connector's documentation. 
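
For reference, usage of the proposed constructor would look like the sketch below. This signature is what the ticket asks for; it does not exist yet in 1.11, so the code is illustrative only:

{code:java}
Properties props = new Properties();
props.setProperty("bootstrap.servers", "localhost:9092");

// Proposed per this ticket; not an existing constructor in 1.11.
FlinkKafkaProducer<String> producer = new FlinkKafkaProducer<>(
    "topic",
    new SimpleStringSchema(),
    props,
    FlinkKafkaProducer.Semantic.AT_LEAST_ONCE);
{code}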



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #12917: [FLINK-18355][tests] Simplify tests of SlotPoolImpl

2020-07-31 Thread GitBox


flinkbot edited a comment on pull request #12917:
URL: https://github.com/apache/flink/pull/12917#issuecomment-659940433


   
   ## CI report:
   
   * da7fb96fc00acda2a4c103f5f177efb9bd9be8be UNKNOWN
   * f80ff802ee1f126a58b1617ab4ca82c3f1a40ba8 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5065)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13019: [FLINK-18700][FLINK-18705][debezium] Fix Debezium-JSON throws NPE when tombstone message and PG table's IDENTITY config is not FULL

2020-07-31 Thread GitBox


flinkbot edited a comment on pull request #13019:
URL: https://github.com/apache/flink/pull/13019#issuecomment-665573947


   
   ## CI report:
   
   * ff74ea871806eb320d9463f652ac1e94bb54f8c6 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5002)
 
   * 6948caea151fbbaa9a5fe829a5b98022b7b879d3 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13029: [FLINK-18763][python] Support basic TypeInformation for Python DataSt…

2020-07-31 Thread GitBox


flinkbot edited a comment on pull request #13029:
URL: https://github.com/apache/flink/pull/13029#issuecomment-666332680


   
   ## CI report:
   
   * 3d89fc037c6cf7b1dfe19514ff831af4a4d33b50 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5066)
 
   * 39f5be60dc81c30f0cbbfb4e219722cbdd52ea21 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5070)
 
   * 1a3c5b93a697dea00a2e8237deb03b5848ffb1a9 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13033: [FLINK-18777][catalog] Supports schema registry catalog

2020-07-31 Thread GitBox


flinkbot edited a comment on pull request #13033:
URL: https://github.com/apache/flink/pull/13033#issuecomment-666927378


   
   ## CI report:
   
   * f40e57384a1aff20df20d9163b536593970d9615 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5068)
 
   * 18f6f1b410d5162bb2ab87b0875ac52b152d31ad Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5072)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] shuiqiangchen commented on a change in pull request #13029: [FLINK-18763][python] Support basic TypeInformation for Python DataSt…

2020-07-31 Thread GitBox


shuiqiangchen commented on a change in pull request #13029:
URL: https://github.com/apache/flink/pull/13029#discussion_r463458488



##
File path: 
flink-python/src/main/java/org/apache/flink/datastream/typeinfo/python/PickledByteArrayTypeInfo.java
##
@@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.datastream.typeinfo.python;
+
+import org.apache.flink.api.common.ExecutionConfig;
+import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.api.common.typeutils.TypeSerializer;
+import 
org.apache.flink.api.common.typeutils.base.array.BytePrimitiveArraySerializer;
+
+/**
+ * A PickledByteArrayTypeInfo indicates that the data of this type is a primitive byte
+ * array generated by pickle.
+ */
+public class PickledByteArrayTypeInfo extends TypeInformation<byte[]> {

Review comment:
   OK, making it a singleton is better.
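
   A minimal sketch of the singleton variant agreed on here; the constant name is hypothetical, and the `TypeInformation` overrides from the diff above stay as they are:

       public class PickledByteArrayTypeInfo extends TypeInformation<byte[]> {

           // Shared instance; the private constructor forces callers to reuse it.
           public static final PickledByteArrayTypeInfo PICKLED_BYTE_ARRAY_TYPE_INFO =
               new PickledByteArrayTypeInfo();

           private PickledByteArrayTypeInfo() {
           }

           // ... the overridden TypeInformation methods from the diff are unchanged ...
       }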





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-18779) Support the SupportsFilterPushDown interface for ScanTableSource.

2020-07-31 Thread jackylau (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168499#comment-17168499
 ] 

jackylau commented on FLINK-18779:
--

hi [~jark], is this title right? It should perhaps be *SupportsFilterPushDown interface 
for LookupTableSource*, since the ScanTableSource part has already been done.

> Support the SupportsFilterPushDown interface for ScanTableSource.
> -
>
> Key: FLINK-18779
> URL: https://issues.apache.org/jira/browse/FLINK-18779
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner
>Reporter: Jark Wu
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] shuiqiangchen commented on pull request #13029: [FLINK-18763][python] Support basic TypeInformation for Python DataSt…

2020-07-31 Thread GitBox


shuiqiangchen commented on pull request #13029:
URL: https://github.com/apache/flink/pull/13029#issuecomment-666990874


   Hi @hequn8128, thank you for your comments. I have updated the PR according 
to your review suggestions; please have a look at it.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13029: [FLINK-18763][python] Support basic TypeInformation for Python DataSt…

2020-07-31 Thread GitBox


flinkbot edited a comment on pull request #13029:
URL: https://github.com/apache/flink/pull/13029#issuecomment-666332680


   
   ## CI report:
   
   * 39f5be60dc81c30f0cbbfb4e219722cbdd52ea21 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5070)
 
   * 1a3c5b93a697dea00a2e8237deb03b5848ffb1a9 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13019: [FLINK-18700][FLINK-18705][debezium] Fix Debezium-JSON throws NPE when tombstone message and PG table's IDENTITY config is not FULL

2020-07-31 Thread GitBox


flinkbot edited a comment on pull request #13019:
URL: https://github.com/apache/flink/pull/13019#issuecomment-665573947


   
   ## CI report:
   
   * ff74ea871806eb320d9463f652ac1e94bb54f8c6 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5002)
 
   * 6948caea151fbbaa9a5fe829a5b98022b7b879d3 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5075)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-18451) Flink HA on yarn may appear TaskManager double running when HA is restored

2020-07-31 Thread ming li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168508#comment-17168508
 ] 

ming li commented on FLINK-18451:
-

Hi [~trohrmann]. I have learned that in both the at-least-once and the 
exactly-once scenario, this double running should not be a problem (though some 
additional guarantees are required).
In fact, in our production environment we have a message middleware similar to 
Kafka. Unlike Kafka, its partitions can only be allocated by the server 
according to the consumer group, and each partition can only be assigned to 
one consumer. In the dual-run scenario, some consumers will therefore not be 
able to obtain partitions, and we cannot allocate new partitions for 
consumption until the original task fails. During that time some data is still 
consumed by the old consumer, and as a result data loss occurs.

> Flink HA on yarn may appear TaskManager double running when HA is restored
> --
>
> Key: FLINK-18451
> URL: https://issues.apache.org/jira/browse/FLINK-18451
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Affects Versions: 1.9.0
>Reporter: ming li
>Priority: Major
>  Labels: high-availability
>
> We found that when NodeManager is lost, the new JobManager will be restored 
> by Yarn's ResourceManager, and the Leader node will be registered on 
> Zookeeper. The original TaskManager will find the new JobManager through 
> Zookeeper and close the old JobManager connection. At this time, all tasks of 
> the TaskManager will fail. The new JobManager will directly perform job 
> recovery and recover from the latest checkpoint.
> However, during the recovery process, when a TaskManager is abnormally 
> connected to Zookeeper, it is not registered with the new JobManager in time. 
> Before the following timeout:
> 1. Connect with Zookeeper
> 2. Heartbeat with JobManager/ResourceManager
> Task will continue to run (assuming that Task can run independently in 
> TaskManager). Assuming that HA recovers fast enough, some Task double runs 
> will occur at this time.
> Do we need to make a persistent record of the cluster resources we allocated 
> during the runtime, and use it to judge all Task stops when HA is restored?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (FLINK-18767) Streaming job stuck when disabling operator chaining

2020-07-31 Thread Zhijiang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijiang reassigned FLINK-18767:


Assignee: Zhijiang

> Streaming job stuck when disabling operator chaining
> 
>
> Key: FLINK-18767
> URL: https://issues.apache.org/jira/browse/FLINK-18767
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network
>Affects Versions: 1.8.3, 1.9.3, 1.10.1, 1.11.1
>Reporter: Nico Kruber
>Assignee: Zhijiang
>Priority: Critical
>
> The following code is stuck sending data from the source to the map operator. 
> Two settings seem to have an influence here: {{env.setBufferTimeout(-1);}} 
> and {{env.disableOperatorChaining();}} - if I remove either of these, the job 
> works as expected.
> (I pre-populated my Kafka topic with one element to reproduce easily)
> {code}
> StreamExecutionEnvironment env = 
> StreamExecutionEnvironment.getExecutionEnvironment();
> // comment either these two and the job works
> env.setBufferTimeout(-1);
> env.disableOperatorChaining(); 
> Properties properties = new Properties();
> properties.setProperty("bootstrap.servers", "localhost:9092");
> properties.setProperty("group.id", "test");
> FlinkKafkaConsumer<String> consumer = new FlinkKafkaConsumer<>("topic", 
> new SimpleStringSchema(), properties);
> consumer.setStartFromEarliest();
> DataStreamSource<String> input = env.addSource(consumer);
> input
> .map((x) -> x)
> .print();
> env.execute();
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #13029: [FLINK-18763][python] Support basic TypeInformation for Python DataSt…

2020-07-31 Thread GitBox


flinkbot edited a comment on pull request #13029:
URL: https://github.com/apache/flink/pull/13029#issuecomment-666332680


   
   ## CI report:
   
   * 39f5be60dc81c30f0cbbfb4e219722cbdd52ea21 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5070)
 
   * 1a3c5b93a697dea00a2e8237deb03b5848ffb1a9 UNKNOWN
   * 9e004f9f8ebc001e7a8d63e39eb89f6b2b5cc431 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Assigned] (FLINK-18677) ZooKeeperLeaderRetrievalService does not invalidate leader in case of SUSPENDED connection

2020-07-31 Thread Till Rohrmann (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann reassigned FLINK-18677:
-

Assignee: Matthias

> ZooKeeperLeaderRetrievalService does not invalidate leader in case of 
> SUSPENDED connection
> --
>
> Key: FLINK-18677
> URL: https://issues.apache.org/jira/browse/FLINK-18677
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.10.1, 1.12.0, 1.11.1
>Reporter: Till Rohrmann
>Assignee: Matthias
>Priority: Major
> Fix For: 1.12.0
>
>
> The {{ZooKeeperLeaderRetrievalService}} does not invalidate the leader if the 
> ZooKeeper connection gets SUSPENDED. This means that a {{TaskManager}} won't 
> cancel its running tasks even though it might miss a leader change. I think 
> we should at least make it configurable whether in such a situation the 
> leader listener should be informed about the lost leadership. Otherwise, we 
> might run into the situation where an old and a newly recovered instance of a 
> {{Task}} can run at the same time.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] azagrebin commented on a change in pull request #13009: [FLINK-18690][runtime] Implement LocalInputPreferredSlotSharingStrategy

2020-07-31 Thread GitBox


azagrebin commented on a change in pull request #13009:
URL: https://github.com/apache/flink/pull/13009#discussion_r463477687



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/LocalInputPreferredSlotSharingStrategy.java
##
@@ -0,0 +1,297 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.scheduler;
+
+import org.apache.flink.runtime.instance.SlotSharingGroupId;
+import org.apache.flink.runtime.jobgraph.JobVertexID;
+import org.apache.flink.runtime.jobmanager.scheduler.CoLocationConstraintDesc;
+import org.apache.flink.runtime.jobmanager.scheduler.CoLocationGroupDesc;
+import org.apache.flink.runtime.jobmanager.scheduler.SlotSharingGroup;
+import org.apache.flink.runtime.scheduler.strategy.ExecutionVertexID;
+import org.apache.flink.runtime.scheduler.strategy.SchedulingExecutionVertex;
+import org.apache.flink.runtime.scheduler.strategy.SchedulingResultPartition;
+import org.apache.flink.runtime.scheduler.strategy.SchedulingTopology;
+
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.IdentityHashMap;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+
+import static org.apache.flink.util.Preconditions.checkNotNull;
+import static org.apache.flink.util.Preconditions.checkState;
+
+/**
+ * This strategy tries to reduce remote data exchanges. Execution vertices, 
which are connected
+ * and belong to the same SlotSharingGroup, tend to be put in the same 
ExecutionSlotSharingGroup.
+ * Co-location constraints will be respected.
+ */
+class LocalInputPreferredSlotSharingStrategy implements SlotSharingStrategy {
+
+   private final Map 
executionSlotSharingGroupMap;
+
+   LocalInputPreferredSlotSharingStrategy(
+   final SchedulingTopology topology,
+   final Set logicalSlotSharingGroups,
+   final Set coLocationGroups) {
+
+   this.executionSlotSharingGroupMap = new 
ExecutionSlotSharingGroupBuilder(
+   topology,
+   logicalSlotSharingGroups,
+   coLocationGroups).build();
+   }
+
+   @Override
+   public ExecutionSlotSharingGroup getExecutionSlotSharingGroup(final 
ExecutionVertexID executionVertexId) {
+   return executionSlotSharingGroupMap.get(executionVertexId);
+   }
+
+   @Override
+   public Set getExecutionSlotSharingGroups() {
+   return new HashSet<>(executionSlotSharingGroupMap.values());
+   }
+
+   static class Factory implements SlotSharingStrategy.Factory {
+
+   public LocalInputPreferredSlotSharingStrategy create(
+   final SchedulingTopology topology,
+   final Set 
logicalSlotSharingGroups,
+   final Set 
coLocationGroups) {
+
+   return new 
LocalInputPreferredSlotSharingStrategy(topology, logicalSlotSharingGroups, 
coLocationGroups);
+   }
+   }
+
+   private static class ExecutionSlotSharingGroupBuilder {
+   private final SchedulingTopology topology;
+
+   private final Map 
slotSharingGroupMap;
+
+   private final Map 
coLocationGroupMap;
+
+   private final Map 
executionSlotSharingGroupMap;
+
+   final Map 
constraintToExecutionSlotSharingGroupMap;
+
+   final Map> 
executionSlotSharingGroups;
+
+   private final Map> 
assignedJobVerticesForGroups;
+
+   private ExecutionSlotSharingGroupBuilder(
+   final SchedulingTopology topology,
+   final Set 
logicalSlotSharingGroups,
+   final Set 
coLocationGroups) {
+
+   this.topology = checkNotNull(topology);
+
+   this.slotSharingGroupMap = new HashMap<>();
+   for (SlotSharingGroup slotSharingGroup : 
logicalSlotSharingGroups) {
+   for (JobVertexID jobVertexId : 
slotSharingGroup.getJobVertexIds()) {
+ 

[GitHub] [flink] hequn8128 commented on a change in pull request #13029: [FLINK-18763][python] Support basic TypeInformation for Python DataSt…

2020-07-31 Thread GitBox


hequn8128 commented on a change in pull request #13029:
URL: https://github.com/apache/flink/pull/13029#discussion_r463473374



##
File path: flink-python/pyflink/common/typeinfo.py
##
@@ -0,0 +1,508 @@
+
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+from abc import ABC, abstractmethod
+
+from py4j.java_gateway import JavaClass, JavaObject
+from typing import List, Union
+
+from pyflink.java_gateway import get_gateway
+
+
+class TypeInformation(ABC):
+"""
+TypeInformation is the core class of Flink's type system. Flink requires a 
type information
+for all types that are used as input or return type of a user function. 
This type information
+class acts as the tool to generate serializers and comparators, and to 
perform semantic checks
+such as whether the fields that are used as join/grouping keys actually 
exist.
+
+The type information also bridges between the programming language's object 
model and a logical
+flat schema. It maps fields from the types to columns (fields) in a flat 
schema. Not all fields
+from a type are mapped to separate fields in the flat schema and often, 
entire types are
+mapped to one field. It is important to notice that the schema must hold 
for all instances of a
+type. For that reason, elements in lists and arrays are not assigned to 
individual fields, but
+the lists and arrays are considered to be one field in total, to account 
for different lengths
+in the arrays.
+a) Basic types are indivisible and are considered as a single field.
+b) Arrays and collections are one field.
+c) Tuples represent as many fields as the class has fields.
+To represent this properly, each type has an arity (the number of fields 
it contains directly),
+and a total number of fields (number of fields in the entire schema of 
this type, including
+nested types).
+"""
+
+@abstractmethod
+def is_basic_type(self) -> bool:
+"""
+Checks if this type information represents a basic type.
+Basic types are defined in BasicTypeInfo and are primitives, their 
boxing types, Strings, ...
+
+:return:  True, if this type information describes a basic type, false 
otherwise.
+"""
+pass
+
+@abstractmethod
+def is_tuple_type(self) -> bool:
+"""
+Checks if this type information represents a Tuple type.
+
+:return: True, if this type information describes a tuple type, false 
otherwise.
+"""
+pass
+
+@abstractmethod
+def get_arity(self) -> int:
+"""
+Gets the arity of this type - the number of fields without nesting.
+
+:return: the number of fields in this type without nesting.
+"""
+pass
+
+@abstractmethod
+def get_total_fields(self) -> int:
+"""
+Gets the number of logical fields in this type. This includes its 
nested and transitively
+nested fields, in the case of composite types.
+The total number of fields must be at least 1.
+
+:return: The number of fields in this type, including its sub-fields 
(for composite types).
+"""
+pass
+
+
+class WrapperTypeInfo(TypeInformation):
+"""
+A wrapper class for Java TypeInformation objects.
+"""
+
+def __init__(self, j_typeinfo):
+self._j_typeinfo = j_typeinfo
+
+def is_basic_type(self) -> bool:
+return self._j_typeinfo.isBasicType()
+
+def is_tuple_type(self) -> bool:
+return self._j_typeinfo.isTupleType()
+
+def get_arity(self) -> int:
+return self._j_typeinfo.getArity()
+
+def get_total_fields(self) -> int:
+return self._j_typeinfo.getTotalFields()
+
+def get_java_type_info(self) -> JavaObject:
+return self._j_typeinfo
+
+def __eq__(self, o) -> bool:
+if type(o) is type(self):
+return self._j_typeinfo.equals(o._j_typeinfo)
+else:
+return False
+
+def __hash__(self) -> int:
+return hash(self._j_type

[GitHub] [flink] flinkbot edited a comment on pull request #12917: [FLINK-18355][tests] Simplify tests of SlotPoolImpl

2020-07-31 Thread GitBox


flinkbot edited a comment on pull request #12917:
URL: https://github.com/apache/flink/pull/12917#issuecomment-659940433


   
   ## CI report:
   
   * da7fb96fc00acda2a4c103f5f177efb9bd9be8be UNKNOWN
   * f80ff802ee1f126a58b1617ab4ca82c3f1a40ba8 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5065)
 
   * 1c5870c57b64436014900d67be2005395e007a52 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Assigned] (FLINK-18748) Savepoint would be queued unexpected if pendingCheckpoints less than maxConcurrentCheckpoints

2020-07-31 Thread Roman Khachatryan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Khachatryan reassigned FLINK-18748:
-

Assignee: tao wang

> Savepoint would be queued unexpected if pendingCheckpoints less than 
> maxConcurrentCheckpoints
> -
>
> Key: FLINK-18748
> URL: https://issues.apache.org/jira/browse/FLINK-18748
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.11.0, 1.11.1
>Reporter: Congxian Qiu(klion26)
>Assignee: tao wang
>Priority: Major
>
> Inspired by a [user-zh 
> email|http://apache-flink.147419.n8.nabble.com/flink-1-11-rest-api-saveppoint-td5497.html]
> After FLINK-17342, when triggering a checkpoint/savepoint, we'll check 
> whether the request can be triggered in 
> {{CheckpointRequestDecider#chooseRequestToExecute}}, the logic is as follows:
> {code:java}
> Preconditions.checkState(Thread.holdsLock(lock));
> // 1. 
> if (isTriggering || queuedRequests.isEmpty()) {
>return Optional.empty();
> }
> // 2. too many ongoing checkpoint/savepoint
> if (pendingCheckpointsSizeSupplier.get() >= maxConcurrentCheckpointAttempts) {
>return Optional.of(queuedRequests.first())
>   .filter(CheckpointTriggerRequest::isForce)
>   .map(unused -> queuedRequests.pollFirst());
> }
> // 3 check the timestamp of last complete checkpoint
> long nextTriggerDelayMillis = nextTriggerDelayMillis(lastCompletionMs);
> if (nextTriggerDelayMillis > 0) {
>return onTooEarly(nextTriggerDelayMillis);
> }
> return Optional.of(queuedRequests.pollFirst());
> {code}
> But if currently {{pendingCheckpointsSizeSupplier.get()}} < 
> {{maxConcurrentCheckpointAttempts}}, and the request is a savepoint, the 
> savepoint will still wait some time in step 3. 
> I think we should trigger the savepoint immediately if 
> {{pendingCheckpointSizeSupplier.get()}} < {{maxConcurrentCheckpointAttempts}}.
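
A minimal sketch of that suggestion, assuming the queued request exposes an {{isPeriodic()}} flag (hypothetical accessor here; a savepoint would be the non-periodic case):

{code:java}
// Hypothetical insertion between steps 2 and 3 of chooseRequestToExecute:
// let a savepoint bypass the min-pause check while capacity remains.
if (!queuedRequests.first().isPeriodic()
        && pendingCheckpointsSizeSupplier.get() < maxConcurrentCheckpointAttempts) {
    return Optional.of(queuedRequests.pollFirst());
}
{code}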



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-15467) Should wait for the end of the source thread during the Task cancellation

2020-07-31 Thread ming li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168520#comment-17168520
 ] 

ming li commented on FLINK-15467:
-

Thanks for this fix. I want to correct the cause of this problem: it is not 
that the blob file is deleted, but that the userClassloader is closed when 
the task ends. When this classloader is later used to load a class (if no 
classloader is specified, the classloader of the calling class is used), a 
ClassNotFoundException is thrown.

> Should wait for the end of the source thread during the Task cancellation
> -
>
> Key: FLINK-15467
> URL: https://issues.apache.org/jira/browse/FLINK-15467
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.9.0, 1.9.1, 1.10.1
>Reporter: ming li
>Assignee: Roman Khachatryan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.2
>
>
>      In the new mailBox model, SourceStreamTask starts a source thread to run 
> user methods, and the current execution thread will block on mailbox.takeMail 
> (). When a task cancels, the TaskCanceler thread will cancel the task and 
> interrupt the execution thread. Therefore, the execution thread of 
> SourceStreamTask will throw InterruptedException, then cancel the task again, 
> and throw an exception.
> {code:java}
> // code placeholder
> @Override
> protected void performDefaultAction(ActionContext context) throws Exception {
>// Against the usual contract of this method, this implementation is not 
> step-wise but blocking instead for
>// compatibility reasons with the current source interface (source 
> functions run as a loop, not in steps).
>sourceThread.start();
>// We run an alternative mailbox loop that does not involve default 
> actions and synchronizes around actions.
>try {
>   runAlternativeMailboxLoop();
>} catch (Exception mailboxEx) {
>   // We cancel the source function if some runtime exception escaped the 
> mailbox.
>   if (!isCanceled()) {
>  cancelTask();
>   }
>   throw mailboxEx;
>}
>sourceThread.join();
>if (!isFinished) {
>   sourceThread.checkThrowSourceExecutionException();
>}
>context.allActionsCompleted();
> }
> {code}
>    When all tasks of this TaskExecutor are canceled, the blob file will be 
> cleaned up. But the real source thread is not finished at this time, which 
> will cause a ClassNotFoundException when loading a new class. In this case, 
> the source thread may not be able to properly clean up and release resources 
> (such as closing child threads, cleaning up local files, etc.). Therefore, I 
> think we should mark this task canceled or finished after the execution of 
> the source thread is completed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-15467) Should wait for the end of the source thread during the Task cancellation

2020-07-31 Thread Piotr Nowojski (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168528#comment-17168528
 ] 

Piotr Nowojski commented on FLINK-15467:


That was also our understanding [~Ming Li]. Either way the problem should be 
gone, as {{SourceStreamTask}} wouldn't finish until {{sourceThread}} is done and 
the life cycle of the Task's class loader is bound to the Task's life. 

> Should wait for the end of the source thread during the Task cancellation
> -
>
> Key: FLINK-15467
> URL: https://issues.apache.org/jira/browse/FLINK-15467
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.9.0, 1.9.1, 1.10.1
>Reporter: ming li
>Assignee: Roman Khachatryan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.2
>
>
>      In the new mailBox model, SourceStreamTask starts a source thread to run 
> user methods, and the current execution thread will block on mailbox.takeMail 
> (). When a task cancels, the TaskCanceler thread will cancel the task and 
> interrupt the execution thread. Therefore, the execution thread of 
> SourceStreamTask will throw InterruptedException, then cancel the task again, 
> and throw an exception.
> {code:java}
> // code placeholder
> @Override
> protected void performDefaultAction(ActionContext context) throws Exception {
>// Against the usual contract of this method, this implementation is not 
> step-wise but blocking instead for
>// compatibility reasons with the current source interface (source 
> functions run as a loop, not in steps).
>sourceThread.start();
>// We run an alternative mailbox loop that does not involve default 
> actions and synchronizes around actions.
>try {
>   runAlternativeMailboxLoop();
>} catch (Exception mailboxEx) {
>   // We cancel the source function if some runtime exception escaped the 
> mailbox.
>   if (!isCanceled()) {
>  cancelTask();
>   }
>   throw mailboxEx;
>}
>sourceThread.join();
>if (!isFinished) {
>   sourceThread.checkThrowSourceExecutionException();
>}
>context.allActionsCompleted();
> }
> {code}
>    When all tasks of this TaskExecutor are canceled, the blob file will be 
> cleaned up. But the real source thread is not finished at this time, which 
> will cause a ClassNotFoundException when loading a new class. In this case, 
> the source thread may not be able to properly clean up and release resources 
> (such as closing child threads, cleaning up local files, etc.). Therefore, I 
> think we should mark this task canceled or finished after the execution of 
> the source thread is completed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-18690) Implement LocalInputPreferredSlotSharingStrategy

2020-07-31 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168531#comment-17168531
 ] 

Zhu Zhu commented on FLINK-18690:
-

There are known limitations of LocalInputPreferredSlotSharingStrategy which 
lead to suboptimal results of ExecutionSlotSharingGroup creation.

> Implement LocalInputPreferredSlotSharingStrategy
> 
>
> Key: FLINK-18690
> URL: https://issues.apache.org/jira/browse/FLINK-18690
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Andrey Zagrebin
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available
>
> Implement ExecutionSlotSharingGroup, SlotSharingStrategy and default 
> LocalInputPreferredSlotSharingStrategy.
> The default strategy would be LocalInputPreferredSlotSharingStrategy. It will 
> try to reduce remote data exchanges. Subtasks, which are connected and belong 
> to the same SlotSharingGroup, tend to be put in the same 
> ExecutionSlotSharingGroup.
> See [design 
> doc|https://docs.google.com/document/d/10CbCVDBJWafaFOovIXrR8nAr2BnZ_RAGFtUeTHiJplw/edit#heading=h.t4vfmm4atqoy]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-15467) Should wait for the end of the source thread during the Task cancellation

2020-07-31 Thread Roman Khachatryan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168532#comment-17168532
 ] 

Roman Khachatryan commented on FLINK-15467:
---

Thanks for the clarification [~Ming Li].

I think the fix covers this case too. In fact, I checked it manually and it 
fixed the problem. But of course, a double check from your side would be highly 
appreciated.

> Should wait for the end of the source thread during the Task cancellation
> -
>
> Key: FLINK-15467
> URL: https://issues.apache.org/jira/browse/FLINK-15467
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.9.0, 1.9.1, 1.10.1
>Reporter: ming li
>Assignee: Roman Khachatryan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.2
>
>
>      In the new mailBox model, SourceStreamTask starts a source thread to run 
> user methods, and the current execution thread will block on mailbox.takeMail 
> (). When a task cancels, the TaskCanceler thread will cancel the task and 
> interrupt the execution thread. Therefore, the execution thread of 
> SourceStreamTask will throw InterruptedException, then cancel the task again, 
> and throw an exception.
> {code:java}
> // code placeholder
> @Override
> protected void performDefaultAction(ActionContext context) throws Exception {
>// Against the usual contract of this method, this implementation is not 
> step-wise but blocking instead for
>// compatibility reasons with the current source interface (source 
> functions run as a loop, not in steps).
>sourceThread.start();
>// We run an alternative mailbox loop that does not involve default 
> actions and synchronizes around actions.
>try {
>   runAlternativeMailboxLoop();
>} catch (Exception mailboxEx) {
>   // We cancel the source function if some runtime exception escaped the 
> mailbox.
>   if (!isCanceled()) {
>  cancelTask();
>   }
>   throw mailboxEx;
>}
>sourceThread.join();
>if (!isFinished) {
>   sourceThread.checkThrowSourceExecutionException();
>}
>context.allActionsCompleted();
> }
> {code}
>    When all tasks of this TaskExecutor are canceled, the blob file will be 
> cleaned up. But the real source thread is not finished at this time, which 
> will cause a ClassNotFoundException when loading a new class. In this case, 
> the source thread may not be able to properly clean up and release resources 
> (such as closing child threads, cleaning up local files, etc.). Therefore, I 
> think we should mark this task canceled or finished after the execution of 
> the source thread is completed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-18690) Implement LocalInputPreferredSlotSharingStrategy

2020-07-31 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168531#comment-17168531
 ] 

Zhu Zhu edited comment on FLINK-18690 at 7/31/20, 8:47 AM:
---

There are known limitations of LocalInputPreferredSlotSharingStrategy which can 
lead to suboptimal results of ExecutionSlotSharingGroup creation.


was (Author: zhuzh):
There are known limitations of LocalInputPreferredSlotSharingStrategy which 
lead to suboptimal results of ExecutionSlotSharingGroup creation.

> Implement LocalInputPreferredSlotSharingStrategy
> 
>
> Key: FLINK-18690
> URL: https://issues.apache.org/jira/browse/FLINK-18690
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Andrey Zagrebin
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available
>
> Implement ExecutionSlotSharingGroup, SlotSharingStrategy and default 
> LocalInputPreferredSlotSharingStrategy.
> The default strategy would be LocalInputPreferredSlotSharingStrategy. It will 
> try to reduce remote data exchanges. Subtasks, which are connected and belong 
> to the same SlotSharingGroup, tend to be put in the same 
> ExecutionSlotSharingGroup.
> See [design 
> doc|https://docs.google.com/document/d/10CbCVDBJWafaFOovIXrR8nAr2BnZ_RAGFtUeTHiJplw/edit#heading=h.t4vfmm4atqoy]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-18690) Implement LocalInputPreferredSlotSharingStrategy

2020-07-31 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168535#comment-17168535
 ] 

Zhu Zhu commented on FLINK-18690:
-

Suboptimal case 1:
The grouping of execution vertices from brother job vertices is possibly 
suboptimal because it does not take connections to consumer vertices into 
account.

Example - case1.png: A(parallelism=4) --forward--> C(parallelism=4), 
B(parallelism=2) --rescale--> C
Execution edges are: A1->C1, A2->C2, A3->C3, A4->C4; B1->C1, B1->C2, 
B2->C3,B2->C4
Optimal grouping: {A1,B1,C1}{A2,C2}{A3,B2,C3}{A4,C4} -> there would be 2 remote 
edges
Current grouping result: {A1,B1,C1} {A2,B2,C2} {A3,C3} {A4,C4} -> there would 
be 3 remote edges

> Implement LocalInputPreferredSlotSharingStrategy
> 
>
> Key: FLINK-18690
> URL: https://issues.apache.org/jira/browse/FLINK-18690
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Andrey Zagrebin
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available
> Attachments: case1.png
>
>
> Implement ExecutionSlotSharingGroup, SlotSharingStrategy and default 
> LocalInputPreferredSlotSharingStrategy.
> The default strategy would be LocalInputPreferredSlotSharingStrategy. It will 
> try to reduce remote data exchanges. Subtasks, which are connected and belong 
> to the same SlotSharingGroup, tend to be put in the same 
> ExecutionSlotSharingGroup.
> See [design 
> doc|https://docs.google.com/document/d/10CbCVDBJWafaFOovIXrR8nAr2BnZ_RAGFtUeTHiJplw/edit#heading=h.t4vfmm4atqoy]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-18690) Implement LocalInputPreferredSlotSharingStrategy

2020-07-31 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168535#comment-17168535
 ] 

Zhu Zhu edited comment on FLINK-18690 at 7/31/20, 8:49 AM:
---

Suboptimal case 1:
The grouping of execution vertices from brother job vertices is possibly 
suboptimal because it does not take connections to consumer vertices into 
account.

Example - case1.png: A(parallelism=4) -- forward --> C(parallelism=4), 
B(parallelism=2) -- rescale --> C
Execution edges are: A1->C1, A2->C2, A3->C3, A4->C4; B1->C1, B1->C2, 
B2->C3,B2->C4

Optimal grouping: {A1,B1,C1}{A2,C2}{A3,B2,C3}{A4,C4} -> there would be 2 remote 
edges

Current grouping result: {A1,B1,C1}{A2,B2,C2}{A3,C3}{A4,C4} -> there would be 3 
remote edges


was (Author: zhuzh):
Suboptimal case 1:
The grouping of execution vertices from brother job vertices is possibly 
suboptimal because it does not take connections to consumer vertices into 
account.

Example - case1.png: A(parallelism=4) --forward--> C(parallelism=4), 
B(parallelism=2) --rescale--> C
Execution edges are: A1->C1, A2->C2, A3->C3, A4->C4; B1->C1, B1->C2, 
B2->C3,B2->C4
Optimal grouping: {A1,B1,C1}{A2,C2}{A3,B2,C3}{A4,C4} -> there would be 2 remote 
edges
Current grouping result: {A1,B1,C1} {A2,B2,C2} {A3,C3} {A4,C4} -> there would 
be 3 remote edges

> Implement LocalInputPreferredSlotSharingStrategy
> 
>
> Key: FLINK-18690
> URL: https://issues.apache.org/jira/browse/FLINK-18690
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Andrey Zagrebin
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available
> Attachments: case1.png
>
>
> Implement ExecutionSlotSharingGroup, SlotSharingStrategy and default 
> LocalInputPreferredSlotSharingStrategy.
> The default strategy would be LocalInputPreferredSlotSharingStrategy. It will 
> try to reduce remote data exchanges. Subtasks, which are connected and belong 
> to the same SlotSharingGroup, tend to be put in the same 
> ExecutionSlotSharingGroup.
> See [design 
> doc|https://docs.google.com/document/d/10CbCVDBJWafaFOovIXrR8nAr2BnZ_RAGFtUeTHiJplw/edit#heading=h.t4vfmm4atqoy]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-18690) Implement LocalInputPreferredSlotSharingStrategy

2020-07-31 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-18690:

Attachment: case1.png

> Implement LocalInputPreferredSlotSharingStrategy
> 
>
> Key: FLINK-18690
> URL: https://issues.apache.org/jira/browse/FLINK-18690
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Andrey Zagrebin
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available
> Attachments: case1.png
>
>
> Implement ExecutionSlotSharingGroup, SlotSharingStrategy and default 
> LocalInputPreferredSlotSharingStrategy.
> The default strategy would be LocalInputPreferredSlotSharingStrategy. It will 
> try to reduce remote data exchanges. Subtasks, which are connected and belong 
> to the same SlotSharingGroup, tend to be put in the same 
> ExecutionSlotSharingGroup.
> See [design 
> doc|https://docs.google.com/document/d/10CbCVDBJWafaFOovIXrR8nAr2BnZ_RAGFtUeTHiJplw/edit#heading=h.t4vfmm4atqoy]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-18690) Implement LocalInputPreferredSlotSharingStrategy

2020-07-31 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168535#comment-17168535
 ] 

Zhu Zhu edited comment on FLINK-18690 at 7/31/20, 8:49 AM:
---

Suboptimal case 1:
The grouping of execution vertices from sibling job vertices is possibly 
suboptimal because it does not take connections to consumer vertices into 
account.

Example - case1.png: 
Topology:
A(parallelism=4) -- forward --> C(parallelism=4), B(parallelism=2) -- rescale 
--> C

Execution edges: 
A1->C1, A2->C2, A3->C3, A4->C4; B1->C1, B1->C2, B2->C3, B2->C4

Optimal grouping: 
{A1,B1,C1}{A2,C2}{A3,B2,C3}{A4,C4} -> there would be 2 remote edges

Current grouping result: 
{A1,B1,C1}{A2,B2,C2}{A3,C3}{A4,C4} -> there would be 3 remote edges


was (Author: zhuzh):
Suboptimal case 1:
The grouping of execution vertices from sibling job vertices is possibly 
suboptimal because it does not take connections to consumer vertices into 
account.

Example - case1.png: A(parallelism=4) -- forward --> C(parallelism=4), 
B(parallelism=2) -- rescale --> C
Execution edges are: A1->C1, A2->C2, A3->C3, A4->C4; B1->C1, B1->C2, 
B2->C3, B2->C4

Optimal grouping: {A1,B1,C1}{A2,C2}{A3,B2,C3}{A4,C4} -> there would be 2 remote 
edges

Current grouping result: {A1,B1,C1}{A2,B2,C2}{A3,C3}{A4,C4} -> there would be 3 
remote edges

> Implement LocalInputPreferredSlotSharingStrategy
> 
>
> Key: FLINK-18690
> URL: https://issues.apache.org/jira/browse/FLINK-18690
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Andrey Zagrebin
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available
> Attachments: case1.png
>
>
> Implement ExecutionSlotSharingGroup, SlotSharingStrategy and default 
> LocalInputPreferredSlotSharingStrategy.
> The default strategy would be LocalInputPreferredSlotSharingStrategy. It will 
> try to reduce remote data exchanges. Subtasks, which are connected and belong 
> to the same SlotSharingGroup, tend to be put in the same 
> ExecutionSlotSharingGroup.
> See [design 
> doc|https://docs.google.com/document/d/10CbCVDBJWafaFOovIXrR8nAr2BnZ_RAGFtUeTHiJplw/edit#heading=h.t4vfmm4atqoy]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-18690) Implement LocalInputPreferredSlotSharingStrategy

2020-07-31 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168535#comment-17168535
 ] 

Zhu Zhu edited comment on FLINK-18690 at 7/31/20, 8:50 AM:
---

Suboptimal case 1:
The grouping of execution vertices from sibling job vertices is possibly 
suboptimal because it does not take connections to consumer vertices into 
account.

Example - case1.png: 
Topology:
A(parallelism=4) -- forward --> C(parallelism=4), B(parallelism=2) -- rescale 
--> C

Execution edges: 
A1->C1, A2->C2, A3->C3, A4->C4; B1->C1, B1->C2, B2->C3, B2->C4

Optimal grouping: 
{A1,B1,C1}{A2,C2}{A3,B2,C3}{A4,C4}  ->  there would be 2 remote edges

Current grouping result: 
{A1,B1,C1}{A2,B2,C2}{A3,C3}{A4,C4}  ->  there would be 3 remote edges


was (Author: zhuzh):
Suboptimal case 1:
The grouping of execution vertices from sibling job vertices is possibly 
suboptimal because it does not take connections to consumer vertices into 
account.

Example - case1.png: 
Topology:
A(parallelism=4) -- forward --> C(parallelism=4), B(parallelism=2) -- rescale 
--> C

Execution edges: 
A1->C1, A2->C2, A3->C3, A4->C4; B1->C1, B1->C2, B2->C3, B2->C4

Optimal grouping: 
{A1,B1,C1}{A2,C2}{A3,B2,C3}{A4,C4} -> there would be 2 remote edges

Current grouping result: 
{A1,B1,C1}{A2,B2,C2}{A3,C3}{A4,C4} -> there would be 3 remote edges

> Implement LocalInputPreferredSlotSharingStrategy
> 
>
> Key: FLINK-18690
> URL: https://issues.apache.org/jira/browse/FLINK-18690
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Andrey Zagrebin
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available
> Attachments: case1.png
>
>
> Implement ExecutionSlotSharingGroup, SlotSharingStrategy and default 
> LocalInputPreferredSlotSharingStrategy.
> The default strategy would be LocalInputPreferredSlotSharingStrategy. It will 
> try to reduce remote data exchanges. Subtasks, which are connected and belong 
> to the same SlotSharingGroup, tend to be put in the same 
> ExecutionSlotSharingGroup.
> See [design 
> doc|https://docs.google.com/document/d/10CbCVDBJWafaFOovIXrR8nAr2BnZ_RAGFtUeTHiJplw/edit#heading=h.t4vfmm4atqoy]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #12917: [FLINK-18355][tests] Simplify tests of SlotPoolImpl

2020-07-31 Thread GitBox


flinkbot edited a comment on pull request #12917:
URL: https://github.com/apache/flink/pull/12917#issuecomment-659940433


   
   ## CI report:
   
   * da7fb96fc00acda2a4c103f5f177efb9bd9be8be UNKNOWN
   * f80ff802ee1f126a58b1617ab4ca82c3f1a40ba8 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5065)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-16267) Flink uses more memory than taskmanager.memory.process.size in Kubernetes

2020-07-31 Thread Yordan Pavlov (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168538#comment-17168538
 ] 

Yordan Pavlov commented on FLINK-16267:
---

Hi [~xintongsong], sorry for the late reply.
Turning `state.backend.rocksdb.memory.managed` back to `true` and increasing 
taskmanager.memory.jvm-overhead.max has indeed helped. I am observing the 
problem more rarely with this setup.
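
For reference, a hedged example of the corresponding flink-conf.yaml entries 
(the values shown here are illustrative, not the ones actually used):
{code}
state.backend.rocksdb.memory.managed: true
taskmanager.memory.jvm-overhead.max: 1g
{code}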

> Flink uses more memory than taskmanager.memory.process.size in Kubernetes
> -
>
> Key: FLINK-16267
> URL: https://issues.apache.org/jira/browse/FLINK-16267
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.10.0
>Reporter: ChangZhuo Chen (陳昌倬)
>Priority: Major
> Attachments: flink-conf_1.10.0.yaml, flink-conf_1.9.1.yaml, 
> oomkilled_taskmanager.log
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This issue is from 
> [https://stackoverflow.com/questions/60336764/flink-uses-more-memory-than-taskmanager-memory-process-size-in-kubernetes]
> h1. Description
>  * In Flink 1.10.0, we try to use `taskmanager.memory.process.size` to limit 
> the resources used by taskmanagers to ensure they are not killed by 
> Kubernetes. However, we still get lots of taskmanager `OOMKilled` events. The 
> setup is in the following section.
>  * The taskmanager log is in attachment [^oomkilled_taskmanager.log].
> h2. Kubernetes
>  * The Kubernetes setup is the same as described in 
> [https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/deployment/kubernetes.html].
>  * The following is resource configuration for taskmanager deployment in 
> Kubernetes:
> {{resources:}}
>  {{  requests:}}
>  {{    cpu: 1000m}}
>  {{    memory: 4096Mi}}
>  {{  limits:}}
>  {{    cpu: 1000m}}
>  {{    memory: 4096Mi}}
> h2. Flink Docker
>  * The Flink docker is built by the following Docker file.
> {{FROM flink:1.10-scala_2.11}}
RUN mkdir -p /opt/flink/plugins/s3 && \
ln -s /opt/flink/opt/flink-s3-fs-presto-1.10.0.jar /opt/flink/plugins/s3/
>  {{RUN ln -s /opt/flink/opt/flink-metrics-prometheus-1.10.0.jar 
> /opt/flink/lib/}}
> h2. Flink Configuration
>  * The following are all memory related configurations in `flink-conf.yaml` 
> in 1.10.0:
> {{jobmanager.heap.size: 820m}}
>  {{taskmanager.memory.jvm-metaspace.size: 128m}}
>  {{taskmanager.memory.process.size: 4096m}}
>  * We use RocksDB and we don't set `state.backend.rocksdb.memory.managed` in 
> `flink-conf.yaml`.
>  ** Use S3 as checkpoint storage.
>  * The code uses DataStream API
>  ** input/output are both Kafka.
> h2. Project Dependencies
>  * The following is our dependencies.
> {{val flinkVersion = "1.10.0"}}{{libraryDependencies += 
> "com.squareup.okhttp3" % "okhttp" % "4.2.2"}}
>  {{libraryDependencies += "com.typesafe" % "config" % "1.4.0"}}
>  {{libraryDependencies += "joda-time" % "joda-time" % "2.10.5"}}
>  {{libraryDependencies += "org.apache.flink" %% "flink-connector-kafka" % 
> flinkVersion}}
>  {{libraryDependencies += "org.apache.flink" % "flink-metrics-dropwizard" % 
> flinkVersion}}
>  {{libraryDependencies += "org.apache.flink" %% "flink-scala" % flinkVersion 
> % "provided"}}
>  {{libraryDependencies += "org.apache.flink" %% "flink-statebackend-rocksdb" 
> % flinkVersion % "provided"}}
>  {{libraryDependencies += "org.apache.flink" %% "flink-streaming-scala" % 
> flinkVersion % "provided"}}
>  {{libraryDependencies += "org.json4s" %% "json4s-jackson" % "3.6.7"}}
>  {{libraryDependencies += "org.log4s" %% "log4s" % "1.8.2"}}
>  {{libraryDependencies += "org.rogach" %% "scallop" % "3.3.1"}}
> h2. Previous Flink 1.9.1 Configuration
>  * The configuration we used in Flink 1.9.1 are the following. It does not 
> have `OOMKilled`.
> h3. Kubernetes
> {{resources:}}
>  {{  requests:}}
>  {{    cpu: 1200m}}
>  {{    memory: 2G}}
>  {{  limits:}}
>  {{    cpu: 1500m}}
>  {{    memory: 2G}}
> h3. Flink 1.9.1
> {{jobmanager.heap.size: 820m}}
>  {{taskmanager.heap.size: 1024m}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #13005: Feature/flink 18661 efo de registration

2020-07-31 Thread GitBox


flinkbot edited a comment on pull request #13005:
URL: https://github.com/apache/flink/pull/13005#issuecomment-665003864


   
   ## CI report:
   
   * c55d9c0b24133265d11f05ef84d0d93b46c67bec Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5067)
 
   * 9f1b031f968c039824b8f895f44877de921a3c92 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13015: [FLINK-18536][kinesis] Adding enhanced fan-out related configurations.

2020-07-31 Thread GitBox


flinkbot edited a comment on pull request #13015:
URL: https://github.com/apache/flink/pull/13015#issuecomment-665526783


   
   ## CI report:
   
   * 773d9cba0b0c76d726d1fae671ec9a50181e6c61 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5038)
 
   * b532233c37983e4dcdf803f43e9560848e5edce1 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] zentol commented on a change in pull request #13020: [FLINK-18663][rest] Fix the exception occurred on AbstractHandler#handleException but not handled

2020-07-31 Thread GitBox


zentol commented on a change in pull request #13020:
URL: https://github.com/apache/flink/pull/13020#discussion_r463493044



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/rest/handler/AbstractHandler.java
##
@@ -114,6 +115,21 @@ protected void respondAsLeader(ChannelHandlerContext ctx, 
RoutedRequest routedRe
log.trace("Received request " + httpRequest.uri() + 
'.');
}
 
+   synchronized (this) {
+   if (terminationFuture != null) {
+   String errorMsg = "The handler instance for " + 
untypedResponseMessageHeaders.getTargetRestEndpointURL()
+   + " had already been closed";
+   log.warn(errorMsg);
+   HandlerUtils.sendErrorResponse(

Review comment:
They are still useful for preventing something similar from happening again.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-18781) Example in table/sql/queries.md Exists section is not correct

2020-07-31 Thread Benchao Li (Jira)
Benchao Li created FLINK-18781:
--

 Summary: Example in table/sql/queries.md Exists section is not 
correct
 Key: FLINK-18781
 URL: https://issues.apache.org/jira/browse/FLINK-18781
 Project: Flink
  Issue Type: Improvement
  Components: Documentation, Table SQL / API
Affects Versions: 1.11.1, 1.12.0
Reporter: Benchao Li


The SQL should be:
{code:SQL}
SELECT user, amount
FROM Orders 
WHERE EXISTS (
SELECT product FROM NewProducts WHERE Orders.product = NewProducts.product
)
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink-docker] zentol commented on a change in pull request #36: [hotfix] Replace forbidden latest-javaX with javaX

2020-07-31 Thread GitBox


zentol commented on a change in pull request #36:
URL: https://github.com/apache/flink-docker/pull/36#discussion_r463493717



##
File path: generator.sh
##
@@ -72,7 +72,7 @@ function generateReleaseMetadata {
 # "1.2.0-java11"
 # "1.2-java11"
 # "latest-java11"

Review comment:
   ```suggestion
   # "java11"
   ```





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-18690) Implement LocalInputPreferredSlotSharingStrategy

2020-07-31 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168543#comment-17168543
 ] 

Zhu Zhu commented on FLINK-18690:
-

Suboptimal case 2: (it is just a potential case to be noted; there is no 
obvious evidence that it will happen)
A vertex can be assigned to any of its producers' execution slot sharing 
groups, even if some are better than others.

Example - case2.png:
Topology:
A(parallelism=2) -- all-to-all --> C(parallelism=2), B(parallelism=2) -- 
forward --> C

Execution edges:
A1->C1, A1->C2, A2->C1, A2->C2; B1->C1, B2->C2

Optimal grouping:
{A1,B1,C1}{A2,B2,C2} -> 2 remote edges

Suboptimal grouping:
{A1,B1,C2}{A2,B2,C1} -> 4 remote edges

Possible solution:
For each vertex, compute a score (= the number of input edges from that group) 
for each available producer execution slot sharing group and choose the group 
with the highest score.
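
A minimal sketch of that scoring idea (illustrative only; it assumes the set 
of candidate producer groups and a way to count a vertex's input edges from 
each group are given, and is not the actual Flink implementation):
{code:java}
import java.util.List;
import java.util.function.ToIntFunction;

public final class GroupScoring {

    /**
     * Returns the candidate group with the highest score, where the score of
     * a group is the number of input edges the vertex receives from it.
     */
    public static <G> G chooseBestGroup(List<G> candidates, ToIntFunction<G> inputEdgeCount) {
        G best = null;
        int bestScore = Integer.MIN_VALUE;
        for (G group : candidates) {
            int score = inputEdgeCount.applyAsInt(group);
            if (score > bestScore) {
                bestScore = score;
                best = group;
            }
        }
        return best;
    }
}
{code}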

> Implement LocalInputPreferredSlotSharingStrategy
> 
>
> Key: FLINK-18690
> URL: https://issues.apache.org/jira/browse/FLINK-18690
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Andrey Zagrebin
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available
> Attachments: case1.png
>
>
> Implement ExecutionSlotSharingGroup, SlotSharingStrategy and default 
> LocalInputPreferredSlotSharingStrategy.
> The default strategy would be LocalInputPreferredSlotSharingStrategy. It will 
> try to reduce remote data exchanges. Subtasks, which are connected and belong 
> to the same SlotSharingGroup, tend to be put in the same 
> ExecutionSlotSharingGroup.
> See [design 
> doc|https://docs.google.com/document/d/10CbCVDBJWafaFOovIXrR8nAr2BnZ_RAGFtUeTHiJplw/edit#heading=h.t4vfmm4atqoy]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #13005: Feature/flink 18661 efo de registration

2020-07-31 Thread GitBox


flinkbot edited a comment on pull request #13005:
URL: https://github.com/apache/flink/pull/13005#issuecomment-665003864


   
   ## CI report:
   
   * 9f1b031f968c039824b8f895f44877de921a3c92 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5077)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13015: [FLINK-18536][kinesis] Adding enhanced fan-out related configurations.

2020-07-31 Thread GitBox


flinkbot edited a comment on pull request #13015:
URL: https://github.com/apache/flink/pull/13015#issuecomment-665526783


   
   ## CI report:
   
   * 773d9cba0b0c76d726d1fae671ec9a50181e6c61 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5038)
 
   * b532233c37983e4dcdf803f43e9560848e5edce1 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5078)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Comment Edited] (FLINK-18690) Implement LocalInputPreferredSlotSharingStrategy

2020-07-31 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168535#comment-17168535
 ] 

Zhu Zhu edited comment on FLINK-18690 at 7/31/20, 9:11 AM:
---

Suboptimal case 1:
The grouping of execution vertices from brother job vertices are possible 
suboptimal because it does not take connections to consumers vertices into 
account.

Example - case1.png: 
Topology:
A(parallelism=4) -- forward --> C(parallelism=4), B(parallelism=2) -- rescale 
--> C

Execution edges: 
A1->C1, A2->C2, A3->C3, A4->C4; B1->C1, B1->C2, B2->C3, B2->C4

Optimal grouping: 
{A1,B1,C1}{A2,C2}{A3,B2,C3}{A4,C4}  ->  there would be 2 remote edges

Current grouping result: 
{A1,B1,C1}{A2,B2,C2}{A3,C3}{A4,C4}  ->  there would be 3 remote edges

Possible solution:
Change the rescale edge connection pattern, e.g. the execution edges from B to 
C in this case would be: B1->C1, B1->C3, B2->C2, B2->C4, and put execution 
vertices with the same index into the same slot sharing group.
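
A minimal sketch of that alternative pattern (hypothetical, 0-based indices, 
for illustration only):
{code:java}
public final class RescalePattern {

    /**
     * Consumer subtask j reads from producer subtask (j % numProducers), so
     * producers and consumers with the same index can share a slot. With
     * numProducers = 2 and 4 consumers this yields the 0-based edges
     * B0->C0, B1->C1, B0->C2, B1->C3, i.e. B1->C1, B2->C2, B1->C3, B2->C4
     * in the 1-based notation used above.
     */
    public static int producerIndexFor(int consumerIndex, int numProducers) {
        return consumerIndex % numProducers;
    }
}
{code}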


was (Author: zhuzh):
Suboptimal case 1:
The grouping of execution vertices from brother job vertices are possible 
suboptimal because it does not take connections to consumers vertices into 
account.

Example - case1.png: 
Topology:
A(parallelism=4) -- forward --> C(parallelism=4), B(parallelism=2) -- rescale 
--> C

Execution edges: 
A1->C1, A2->C2, A3->C3, A4->C4; B1->C1, B1->C2, B2->C3, B2->C4

Optimal grouping: 
{A1,B1,C1}{A2,C2}{A3,B2,C3}{A4,C4}  ->  there would be 2 remote edges

Current grouping result: 
{A1,B1,C1}{A2,B2,C2}{A3,C3}{A4,C4}  ->  there would be 3 remote edges

> Implement LocalInputPreferredSlotSharingStrategy
> 
>
> Key: FLINK-18690
> URL: https://issues.apache.org/jira/browse/FLINK-18690
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Andrey Zagrebin
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available
> Attachments: case1.png
>
>
> Implement ExecutionSlotSharingGroup, SlotSharingStrategy and default 
> LocalInputPreferredSlotSharingStrategy.
> The default strategy would be LocalInputPreferredSlotSharingStrategy. It will 
> try to reduce remote data exchanges. Subtasks, which are connected and belong 
> to the same SlotSharingGroup, tend to be put in the same 
> ExecutionSlotSharingGroup.
> See [design 
> doc|https://docs.google.com/document/d/10CbCVDBJWafaFOovIXrR8nAr2BnZ_RAGFtUeTHiJplw/edit#heading=h.t4vfmm4atqoy]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-16510) Task manager safeguard shutdown may not be reliable

2020-07-31 Thread Maximilian Michels (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168547#comment-17168547
 ] 

Maximilian Michels commented on FLINK-16510:


It might be helpful to define "fatal error". An OOM error can be a fatal error 
because we are not guaranteed to be able to recover from it. We currently do 
not treat OOM errors as fatal, except when it's thrown from the Task thread 
when {{taskmanager.jvm-exit-on-oom}} is set to true. In this case we do not 
stick to the regular {{System.exit()}} routine but we issue a 
{{Runtime.halt()}}. In the recent tests I ran, this behavior prevented the 
problem reported here. I guess the configuration option has value on its own 
and we should not touch it for now.

I'll proceed with the solution discussed here, i.e. adding an option to 
configure forceful exits instead of the default graceful exit.
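
To illustrate the distinction being discussed, a minimal standalone 
demonstration (illustrative only, not Flink code) of how System.exit() and 
Runtime.halt() interact with shutdown hooks:
{code:java}
public class ExitVsHalt {
    public static void main(String[] args) {
        Runtime.getRuntime().addShutdownHook(
                new Thread(() -> System.out.println("shutdown hook ran")));
        if (args.length > 0 && "halt".equals(args[0])) {
            // Terminates the JVM immediately; shutdown hooks do NOT run.
            Runtime.getRuntime().halt(1);
        } else {
            // Runs shutdown hooks first; can block if a hook hangs.
            System.exit(1);
        }
    }
}
{code}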

> Task manager safeguard shutdown may not be reliable
> ---
>
> Key: FLINK-16510
> URL: https://issues.apache.org/jira/browse/FLINK-16510
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Attachments: command.txt, stack2-1.txt, stack3-mixed.txt, stack3.txt
>
>
> The {{JvmShutdownSafeguard}} does not always succeed but can hang when 
> multiple threads attempt to shutdown the JVM. Apparently mixing 
> {{System.exit()}} with ShutdownHooks and forcefully terminating the JVM via 
> {{Runtime.halt()}} does not play together well:
> {noformat}
> "Jvm Terminator" #22 daemon prio=5 os_prio=0 tid=0x7fb8e82f2800 
> nid=0x5a96 runnable [0x7fb35cffb000]
>java.lang.Thread.State: RUNNABLE
>   at java.lang.Shutdown.$$YJP$$halt0(Native Method)
>   at java.lang.Shutdown.halt0(Shutdown.java)
>   at java.lang.Shutdown.halt(Shutdown.java:139)
>   - locked <0x00047ed67638> (a java.lang.Shutdown$Lock)
>   at java.lang.Runtime.halt(Runtime.java:276)
>   at 
> org.apache.flink.runtime.util.JvmShutdownSafeguard$DelayedTerminator.run(JvmShutdownSafeguard.java:86)
>   at java.lang.Thread.run(Thread.java:748)
>Locked ownable synchronizers:
>   - None
> "FlinkCompletableFutureDelayScheduler-thread-1" #18154 daemon prio=5 
> os_prio=0 tid=0x7fb708a7d000 nid=0x5a8a waiting for monitor entry 
> [0x7fb289d49000]
>java.lang.Thread.State: BLOCKED (on object monitor)
>   at java.lang.Shutdown.halt(Shutdown.java:139)
>   - waiting to lock <0x00047ed67638> (a java.lang.Shutdown$Lock)
>   at java.lang.Shutdown.exit(Shutdown.java:213)
>   - locked <0x00047edb7348> (a java.lang.Class for java.lang.Shutdown)
>   at java.lang.Runtime.exit(Runtime.java:110)
>   at java.lang.System.exit(System.java:973)
>   at 
> org.apache.flink.runtime.taskexecutor.TaskManagerRunner.terminateJVM(TaskManagerRunner.java:266)
>   at 
> org.apache.flink.runtime.taskexecutor.TaskManagerRunner.lambda$onFatalError$1(TaskManagerRunner.java:260)
>   at 
> org.apache.flink.runtime.taskexecutor.TaskManagerRunner$$Lambda$27464/1464672548.accept(Unknown
>  Source)
>   at 
> java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
>   at 
> java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
>   at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
>   at 
> java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
>   at 
> org.apache.flink.runtime.concurrent.FutureUtils$Timeout.run(FutureUtils.java:943)
>   at 
> org.apache.flink.runtime.concurrent.DirectExecutorService.execute(DirectExecutorService.java:211)
>   at 
> org.apache.flink.runtime.concurrent.FutureUtils.lambda$orTimeout$11(FutureUtils.java:361)
>   at 
> org.apache.flink.runtime.concurrent.FutureUtils$$Lambda$27435/159015392.run(Unknown
>  Source)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
>Locked ownable synchronizers:
>   - <0x0006d5e56bd0> (a 
> java.util.concurrent.ThreadPoolExecutor$Worker)
> {noformat}
> Note that under this condition the JVM s

[GitHub] [flink-docker] zentol commented on pull request #24: [hotfix] Add readme with simple instructions

2020-07-31 Thread GitBox


zentol commented on pull request #24:
URL: https://github.com/apache/flink-docker/pull/24#issuecomment-667023591


   I'm fine with it.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-17911) K8s e2e: error: timed out waiting for the condition on deployments/flink-native-k8s-session-1

2020-07-31 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168552#comment-17168552
 ] 

Robert Metzger commented on FLINK-17911:


I'm closing this ticket. Looks like the error was just a temporary glitch in 
the infrastructure.

> K8s e2e: error: timed out waiting for the condition on 
> deployments/flink-native-k8s-session-1
> -
>
> Key: FLINK-17911
> URL: https://issues.apache.org/jira/browse/FLINK-17911
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / Kubernetes, Tests
>Affects Versions: 1.11.0
>Reporter: Robert Metzger
>Priority: Major
>  Labels: test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=2062&view=logs&j=91bf6583-3fb2-592f-e4d4-d79d79c3230a&t=94459a52-42b6-5bfc-5d74-690b5d3c6de8
> {code}
> error: timed out waiting for the condition on 
> deployments/flink-native-k8s-session-1
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-17911) K8s e2e: error: timed out waiting for the condition on deployments/flink-native-k8s-session-1

2020-07-31 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger closed FLINK-17911.
--
Resolution: Cannot Reproduce

> K8s e2e: error: timed out waiting for the condition on 
> deployments/flink-native-k8s-session-1
> -
>
> Key: FLINK-17911
> URL: https://issues.apache.org/jira/browse/FLINK-17911
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / Kubernetes, Tests
>Affects Versions: 1.11.0
>Reporter: Robert Metzger
>Priority: Major
>  Labels: test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=2062&view=logs&j=91bf6583-3fb2-592f-e4d4-d79d79c3230a&t=94459a52-42b6-5bfc-5d74-690b5d3c6de8
> {code}
> error: timed out waiting for the condition on 
> deployments/flink-native-k8s-session-1
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend

2020-07-31 Thread ming li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168554#comment-17168554
 ] 

ming li commented on FLINK-9373:


Hi [~sihuazhou]. Recently I was reading the related code of 
flink-statebackend-rocksdb and found that the seekToLast method of 
org.apache.flink.contrib.streaming.state.RocksIteratorWrapper calls 
iterator.seekToFirst. I am puzzled why iterator.seekToLast is not 
called.
{code:java}
// code placeholder
@Override
public void seekToLast() {
    iterator.seekToFirst();
    status();
}
{code}

> Fix potential data losing for RocksDBBackend
> 
>
> Key: FLINK-9373
> URL: https://issues.apache.org/jira/browse/FLINK-9373
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / State Backends
>Affects Versions: 1.5.0
>Reporter: Sihua Zhou
>Assignee: Sihua Zhou
>Priority: Blocker
> Fix For: 1.5.0, 1.6.0
>
>
> Currently, when using RocksIterator we only use the _iterator.isValid()_ to 
> check whether we have reached the end of the iterator. But that is not 
> enough, if we refer to RocksDB's wiki 
> https://github.com/facebook/rocksdb/wiki/Iterator#error-handling we should 
> find that even if _iterator.isValid()=true_, there may also exist some 
> internal error. A safer way to use the _RocksIterator_ is to always call the 
> _iterator.status()_ to check the internal error of _RocksDB_. There is a case 
> from user email seems to lost data because of this 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Missing-MapState-when-Timer-fires-after-restored-state-td20134.html



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-9373) Fix potential data losing for RocksDBBackend

2020-07-31 Thread ming li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168554#comment-17168554
 ] 

ming li edited comment on FLINK-9373 at 7/31/20, 9:17 AM:
--

Hi [~sihuazhou]. Recently I was reading the related code of 
flink-statebackend-rocksdb and found that the seekToLast method of 
org.apache.flink.contrib.streaming.state.RocksIteratorWrapper calls 
iterator.seekToFirst. I am puzzled why iterator.seekToLast is not 
called.
{code:java}
// code placeholder
@Override 
public void seekToLast() { 
   iterator.seekToFirst(); 
   status(); 
}
{code}
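
For comparison, a sketch of the delegation the comment appears to expect 
(illustrative only, not a proposed patch):
{code:java}
@Override
public void seekToLast() {
    iterator.seekToLast();
    status();
}
{code}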


was (Author: ming li):
Hi [~sihuazhou]. Recently I was reading the related code of 
flink-statebackend-rocksdb and found that the seekToLast method of 
org.apache.flink.contrib.streaming.state.RocksIteratorWrapper calls 
iterator.seekToFirst. I am puzzled why iterator.seekToLast is not 
called.
{code:java}
// code placeholder
@Override
public void seekToLast() {
    iterator.seekToFirst();
    status();
}
{code}

> Fix potential data losing for RocksDBBackend
> 
>
> Key: FLINK-9373
> URL: https://issues.apache.org/jira/browse/FLINK-9373
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / State Backends
>Affects Versions: 1.5.0
>Reporter: Sihua Zhou
>Assignee: Sihua Zhou
>Priority: Blocker
> Fix For: 1.5.0, 1.6.0
>
>
> Currently, when using RocksIterator we only use the _iterator.isValid()_ to 
> check whether we have reached the end of the iterator. But that is not 
> enough, if we refer to RocksDB's wiki 
> https://github.com/facebook/rocksdb/wiki/Iterator#error-handling we should 
> find that even if _iterator.isValid()=true_, there may also exist some 
> internal error. A safer way to use the _RocksIterator_ is to always call the 
> _iterator.status()_ to check the internal error of _RocksDB_. There is a case 
> from user email seems to lost data because of this 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Missing-MapState-when-Timer-fires-after-restored-state-td20134.html



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-18748) Savepoint would be queued unexpected if pendingCheckpoints less than maxConcurrentCheckpoints

2020-07-31 Thread Roman Khachatryan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168555#comment-17168555
 ] 

Roman Khachatryan commented on FLINK-18748:
---

Thanks for your analysis and proposal [~wayland],

I think nextTriggerDelayMillis might actually be quite heavy, as it involves a 
system call.

Why not skip this call altogether in chooseRequestToExecute if the request is 
periodic (and not forced)?
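
One way to read that suggestion, as a hedged sketch against the decision logic 
quoted in the issue description below (the isPeriodic accessor is an 
assumption, not necessarily the actual API):
{code:java}
// Sketch only, not actual Flink code: consult the (relatively heavy)
// nextTriggerDelayMillis(...) check only for periodic, non-forced requests,
// so that savepoints are not held back by the min-pause logic.
CheckpointTriggerRequest request = queuedRequests.first();
if (request.isPeriodic() && !request.isForce()) {
    long nextTriggerDelayMillis = nextTriggerDelayMillis(lastCompletionMs);
    if (nextTriggerDelayMillis > 0) {
        return onTooEarly(nextTriggerDelayMillis);
    }
}
return Optional.of(queuedRequests.pollFirst());
{code}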

> Savepoint would be queued unexpected if pendingCheckpoints less than 
> maxConcurrentCheckpoints
> -
>
> Key: FLINK-18748
> URL: https://issues.apache.org/jira/browse/FLINK-18748
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.11.0, 1.11.1
>Reporter: Congxian Qiu(klion26)
>Assignee: tao wang
>Priority: Major
>
> Inspired by a [user-zh 
> email|http://apache-flink.147419.n8.nabble.com/flink-1-11-rest-api-saveppoint-td5497.html]
> After FLINK-17342, when triggering a checkpoint/savepoint, we'll check 
> whether the request can be triggered in 
> {{CheckpointRequestDecider#chooseRequestToExecute}}, the logic is as follow:
> {code:java}
> Preconditions.checkState(Thread.holdsLock(lock));
> // 1. 
> if (isTriggering || queuedRequests.isEmpty()) {
>return Optional.empty();
> }
> // 2. too many ongoing checkpoint/savepoint requests
> if (pendingCheckpointsSizeSupplier.get() >= maxConcurrentCheckpointAttempts) {
>return Optional.of(queuedRequests.first())
>   .filter(CheckpointTriggerRequest::isForce)
>   .map(unused -> queuedRequests.pollFirst());
> }
> // 3 check the timestamp of last complete checkpoint
> long nextTriggerDelayMillis = nextTriggerDelayMillis(lastCompletionMs);
> if (nextTriggerDelayMillis > 0) {
>return onTooEarly(nextTriggerDelayMillis);
> }
> return Optional.of(queuedRequests.pollFirst());
> {code}
> But if currently {{pendingCheckpointsSizeSupplier.get()}} < 
> {{maxConcurrentCheckpointAttempts}}, and the request is a savepoint, the 
> savepoint will still wait some time in step 3. 
> I think we should trigger the savepoint immediately if 
> {{pendingCheckpointSizeSupplier.get()}} < {{maxConcurrentCheckpointAttempts}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-16993) Support SupportsComputedColumnPushDown in planner

2020-07-31 Thread jackylau (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168556#comment-17168556
 ] 

jackylau commented on FLINK-16993:
--

Hi [~jark], thanks. Because of personal and team reasons, I may take this one 
first, and I am very glad to take FLINK-18778 or FLINK-18779 as well.

> Support SupportsComputedColumnPushDown in planner
> -
>
> Key: FLINK-16993
> URL: https://issues.apache.org/jira/browse/FLINK-16993
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner
>Reporter: Timo Walther
>Priority: Major
>
> Support the {{SupportsComputedColumnPushDown}} interface for 
> {{ScanTableSource}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-18121) Support creating Docker image from local Flink distribution

2020-07-31 Thread Chesnay Schepler (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168559#comment-17168559
 ] 

Chesnay Schepler commented on FLINK-18121:
--

Would people be okay with it still requiring a web server, but having a script 
handle all of that?

> Support creating Docker image from local Flink distribution
> ---
>
> Key: FLINK-18121
> URL: https://issues.apache.org/jira/browse/FLINK-18121
> Project: Flink
>  Issue Type: Improvement
>  Components: Dockerfiles
>Affects Versions: docker-1.11.0.0
>Reporter: Till Rohrmann
>Priority: Major
>
> Currently, 
> https://github.com/apache/flink-docker/blob/dev-master/Dockerfile-debian.template
>  only supports to create a Docker image from a Flink distribution which is 
> hosted on a web server. I think it would be helpful if we could also create a 
> Docker image from a Flink distribution which is stored on one's local file 
> system. That way, one would not have to upload the file or start a web server 
> for serving it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-16993) Support SupportsComputedColumnPushDown in planner

2020-07-31 Thread jackylau (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168565#comment-17168565
 ] 

jackylau commented on FLINK-16993:
--

Hi [~twalthr], I read FLIP-95 and found this statement in your docs: "The 
converter might be code generated and works on `RowData`".

Could you give me more details about it? *I have tried but don't know how to 
implement*

RowData convert(RowData producedRow).

 
{code:java}
// this is my code block, but some details are not clear

((SupportsComputedColumnPushDown) newTableSource).applyComputedColumn(new 
SupportsComputedColumnPushDown.ComputedColumnConverter() {
   @Override
   public RowData convert(RowData producedRow) {
  BinaryRowData newRowData = null;
  if (producedRow instanceof BinaryRowData) {
 newRowData = new BinaryRowData(producedRow.getArity() + 
convertedExpressions.size());
 for (int i = 0; i < convertedExpressions.size(); i++) {
DataType dataType = convertedExpressions.get(i).getOutputDataType();
// according to the data type, call the corresponding set**(pos, value)
 }
  } else if (producedRow instanceof GenericRowData ) {
 //TODO
  } else if (producedRow instanceof BoxedWrapperRowData) {
 //TODO
  }
  return newRowData;
   }

   @Override
   public void open(Context context) {

   }
}, 
TypeConversions.fromLogicalToDataType(FlinkTypeFactory.toLogicalType(project.getRowType(;
{code}

> Support SupportsComputedColumnPushDown in planner
> -
>
> Key: FLINK-16993
> URL: https://issues.apache.org/jira/browse/FLINK-16993
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner
>Reporter: Timo Walther
>Priority: Major
>
> Support the {{SupportsComputedColumnPushDown}} interface for 
> {{ScanTableSource}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-17662) testDetachedModeSecureWithPreInstallKeytab: "The JobManager and the TaskManager should both run with Kerberos."

2020-07-31 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger closed FLINK-17662.
--
Resolution: Cannot Reproduce

> testDetachedModeSecureWithPreInstallKeytab: "The JobManager and the 
> TaskManager should both run with Kerberos."
> ---
>
> Key: FLINK-17662
> URL: https://issues.apache.org/jira/browse/FLINK-17662
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN, Tests
>Affects Versions: 1.11.0
>Reporter: Robert Metzger
>Priority: Major
>  Labels: test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=1133&view=logs&j=fc5181b0-e452-5c8f-68de-1097947f6483&t=6b04ca5f-0b52-511d-19c9-52bf0d9fbdfa
> {code}
> [ERROR] Tests run: 5, Failures: 1, Errors: 0, Skipped: 2, Time elapsed: 
> 78.918 s <<< FAILURE! - in org.apache.flink.yarn.YARNSessionFIFOSecuredITCase
> [ERROR] 
> testDetachedModeSecureWithPreInstallKeytab(org.apache.flink.yarn.YARNSessionFIFOSecuredITCase)
>   Time elapsed: 23.927 s  <<< FAILURE!
> java.lang.AssertionError: 
> The JobManager and the TaskManager should both run with Kerberos.
> Expected: is 
>  but: was 
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
>   at org.junit.Assert.assertThat(Assert.java:956)
>   at 
> org.apache.flink.yarn.YARNSessionFIFOSecuredITCase.verifyResultContainsKerberosKeytab(YARNSessionFIFOSecuredITCase.java:161)
>   at 
> org.apache.flink.yarn.YARNSessionFIFOSecuredITCase.lambda$testDetachedModeSecureWithPreInstallKeytab$0(YARNSessionFIFOSecuredITCase.java:133)
>   at org.apache.flink.yarn.YarnTestBase.runTest(YarnTestBase.java:242)
>   at 
> org.apache.flink.yarn.YARNSessionFIFOSecuredITCase.testDetachedModeSecureWithPreInstallKeytab(YARNSessionFIFOSecuredITCase.java:119)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-17662) testDetachedModeSecureWithPreInstallKeytab: "The JobManager and the TaskManager should both run with Kerberos."

2020-07-31 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168566#comment-17168566
 ] 

Robert Metzger edited comment on FLINK-17662 at 7/31/20, 9:26 AM:
--

I'm closing this ticket, since it hasn't happened again for a while. Please 
reopen if it reoccurs.


was (Author: rmetzger):
I'm closing this ticket, since it hasn't happened again.

> testDetachedModeSecureWithPreInstallKeytab: "The JobManager and the 
> TaskManager should both run with Kerberos."
> ---
>
> Key: FLINK-17662
> URL: https://issues.apache.org/jira/browse/FLINK-17662
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN, Tests
>Affects Versions: 1.11.0
>Reporter: Robert Metzger
>Priority: Major
>  Labels: test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=1133&view=logs&j=fc5181b0-e452-5c8f-68de-1097947f6483&t=6b04ca5f-0b52-511d-19c9-52bf0d9fbdfa
> {code}
> [ERROR] Tests run: 5, Failures: 1, Errors: 0, Skipped: 2, Time elapsed: 
> 78.918 s <<< FAILURE! - in org.apache.flink.yarn.YARNSessionFIFOSecuredITCase
> [ERROR] 
> testDetachedModeSecureWithPreInstallKeytab(org.apache.flink.yarn.YARNSessionFIFOSecuredITCase)
>   Time elapsed: 23.927 s  <<< FAILURE!
> java.lang.AssertionError: 
> The JobManager and the TaskManager should both run with Kerberos.
> Expected: is 
>  but: was 
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
>   at org.junit.Assert.assertThat(Assert.java:956)
>   at 
> org.apache.flink.yarn.YARNSessionFIFOSecuredITCase.verifyResultContainsKerberosKeytab(YARNSessionFIFOSecuredITCase.java:161)
>   at 
> org.apache.flink.yarn.YARNSessionFIFOSecuredITCase.lambda$testDetachedModeSecureWithPreInstallKeytab$0(YARNSessionFIFOSecuredITCase.java:133)
>   at org.apache.flink.yarn.YarnTestBase.runTest(YarnTestBase.java:242)
>   at 
> org.apache.flink.yarn.YARNSessionFIFOSecuredITCase.testDetachedModeSecureWithPreInstallKeytab(YARNSessionFIFOSecuredITCase.java:119)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-17662) testDetachedModeSecureWithPreInstallKeytab: "The JobManager and the TaskManager should both run with Kerberos."

2020-07-31 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168566#comment-17168566
 ] 

Robert Metzger commented on FLINK-17662:


I'm closing this ticket, since it hasn't happened again.

> testDetachedModeSecureWithPreInstallKeytab: "The JobManager and the 
> TaskManager should both run with Kerberos."
> ---
>
> Key: FLINK-17662
> URL: https://issues.apache.org/jira/browse/FLINK-17662
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN, Tests
>Affects Versions: 1.11.0
>Reporter: Robert Metzger
>Priority: Major
>  Labels: test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=1133&view=logs&j=fc5181b0-e452-5c8f-68de-1097947f6483&t=6b04ca5f-0b52-511d-19c9-52bf0d9fbdfa
> {code}
> [ERROR] Tests run: 5, Failures: 1, Errors: 0, Skipped: 2, Time elapsed: 
> 78.918 s <<< FAILURE! - in org.apache.flink.yarn.YARNSessionFIFOSecuredITCase
> [ERROR] 
> testDetachedModeSecureWithPreInstallKeytab(org.apache.flink.yarn.YARNSessionFIFOSecuredITCase)
>   Time elapsed: 23.927 s  <<< FAILURE!
> java.lang.AssertionError: 
> The JobManager and the TaskManager should both run with Kerberos.
> Expected: is 
>  but: was 
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
>   at org.junit.Assert.assertThat(Assert.java:956)
>   at 
> org.apache.flink.yarn.YARNSessionFIFOSecuredITCase.verifyResultContainsKerberosKeytab(YARNSessionFIFOSecuredITCase.java:161)
>   at 
> org.apache.flink.yarn.YARNSessionFIFOSecuredITCase.lambda$testDetachedModeSecureWithPreInstallKeytab$0(YARNSessionFIFOSecuredITCase.java:133)
>   at org.apache.flink.yarn.YarnTestBase.runTest(YarnTestBase.java:242)
>   at 
> org.apache.flink.yarn.YARNSessionFIFOSecuredITCase.testDetachedModeSecureWithPreInstallKeytab(YARNSessionFIFOSecuredITCase.java:119)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #13029: [FLINK-18763][python] Support basic TypeInformation for Python DataSt…

2020-07-31 Thread GitBox


flinkbot edited a comment on pull request #13029:
URL: https://github.com/apache/flink/pull/13029#issuecomment-666332680


   
   ## CI report:
   
   * 39f5be60dc81c30f0cbbfb4e219722cbdd52ea21 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5070)
 
   * 1a3c5b93a697dea00a2e8237deb03b5848ffb1a9 UNKNOWN
   * 9e004f9f8ebc001e7a8d63e39eb89f6b2b5cc431 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5079)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13034: [FLINK-9992][tests] Fix FsStorageLocationReferenceTest#testEncodeAndDecode by adding retries to generate a valid path

2020-07-31 Thread GitBox


flinkbot edited a comment on pull request #13034:
URL: https://github.com/apache/flink/pull/13034#issuecomment-666932731


   
   ## CI report:
   
   * f9f773fd1a15b875596ab294ca90db912b2dd2c0 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5069)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-17714) Support custom RestartBackoffTimeStrategy

2020-07-31 Thread Chesnay Schepler (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168581#comment-17168581
 ] 

Chesnay Schepler commented on FLINK-17714:
--

[~trohrmann] Do we have a clear plan on how this should work? Should this use 
the plugin mechanism (I'd wager that users may start injecting rules from the 
outside)?
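
To make the discussion concrete, a hedged sketch of what a user-provided 
strategy could look like (the interface shape is assumed from Flink's internal 
RestartBackoffTimeStrategy and may differ; restarting only on IOExceptions is 
purely illustrative):
{code:java}
import java.io.IOException;

/** Interface shape assumed for illustration; may differ from Flink's. */
interface RestartBackoffTimeStrategy {
    boolean canRestart();
    long getBackoffTime();
    void notifyFailure(Throwable cause);
}

/** Illustrative only: restart the job solely for I/O-related failures. */
class RestartOnIOExceptionStrategy implements RestartBackoffTimeStrategy {

    private boolean lastFailureRestartable;

    @Override
    public boolean canRestart() {
        return lastFailureRestartable;
    }

    @Override
    public long getBackoffTime() {
        return 1_000L; // one second between restarts, illustrative
    }

    @Override
    public void notifyFailure(Throwable cause) {
        lastFailureRestartable = cause instanceof IOException;
    }
}
{code}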

> Support custom RestartBackoffTimeStrategy
> -
>
> Key: FLINK-17714
> URL: https://issues.apache.org/jira/browse/FLINK-17714
> Project: Flink
>  Issue Type: Wish
>  Components: Runtime / Coordination
>Reporter: Zhu Zhu
>Priority: Major
>
> There are cases that users need to customize RestartBackoffTimeStrategy to 
> better control job recovery. 
> One example is that users want a job to restart only on certain errors and 
> fail on others. See this ML 
> [disscusion|https://lists.apache.org/thread.html/rde685552a83d0d146cf83560df1bc6f33d3dd569f69ae7bbcc4ae508%40%3Cuser.flink.apache.org%3E].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #12917: [FLINK-18355][tests] Simplify tests of SlotPoolImpl

2020-07-31 Thread GitBox


flinkbot edited a comment on pull request #12917:
URL: https://github.com/apache/flink/pull/12917#issuecomment-659940433


   
   ## CI report:
   
   * da7fb96fc00acda2a4c103f5f177efb9bd9be8be UNKNOWN
   * f80ff802ee1f126a58b1617ab4ca82c3f1a40ba8 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5065)
 
   * 1c5870c57b64436014900d67be2005395e007a52 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-18782) How to retain the column'name when convert a Table to DataStream

2020-07-31 Thread Ying Z (Jira)
Ying Z created FLINK-18782:
--

 Summary: How to retain the column'name when convert a Table to 
DataStream
 Key: FLINK-18782
 URL: https://issues.apache.org/jira/browse/FLINK-18782
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.9.1
Reporter: Ying Z


mail: 
[http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/How-to-retain-the-column-name-when-convert-a-Table-to-DataStream-td37002.html]

 

I met some field name errors when trying to convert between Table and 
DataStream.
 First, init a DataStream, register it as table 'source_table', and register a 
table function named 'foo':
{code:java}
val sourceStream = env.socketTextStream("127.0.0.1", 8010)
  .map(line => line.toInt)
tableEnv.registerDataStream("source_table", sourceStream, 'a)

class Foo() extends TableFunction[(Int)] {
  def eval(col: Int): Unit = collect((col * 10))
}
tableEnv.registerFunction("foo", new Foo)
{code}
Then, use sqlQuery to generate a new table t1 with columns 'a' and 'b':
{code:java}
val t1 = tableEnv.sqlQuery(
  """
|SELECT source_table.a, b FROM source_table
|, LATERAL TABLE(foo(a)) as T(b)
|""".stripMargin
)
/*
 t1 table schema: root
 |-- a: INT
 |-- b: INT
 */
println(s"t1 table schema: ${t1.getSchema}")
{code}
When I try to convert 't1' to a DataStream and then register it as a new table 
(for some reason) named 't1', the columns change to 'a' and 'f0', not 'a' and 
'b':
{code:java}
val t1Stream = t1.toAppendStream[Row]
// t1 stream schema: Row(a: Integer, f0: Integer)
println(s"t1 stream schema: ${t1Stream.getType()}")
tableEnv.registerDataStream("t1", t1Stream)
/*
new t1 table schema: root
|-- a: INT
|-- f0: INT
 */
println(s"new t1 table schema: ${tableEnv.scan("t1").getSchema}")
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-16566) Change the log level of the launching command and dynamic properties from DEBUG to INFO in Mesos integration

2020-07-31 Thread Chesnay Schepler (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168590#comment-17168590
 ] 

Chesnay Schepler commented on FLINK-16566:
--

Sounds good. [~karmagyz], do you want to work on this?

> Change the log level of the launching command and dynamic properties from 
> DEBUG to INFO in Mesos integration
> 
>
> Key: FLINK-16566
> URL: https://issues.apache.org/jira/browse/FLINK-16566
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / Mesos
>Reporter: Yangze Guo
>Priority: Minor
>
> As discussed in 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-1-10-container-memory-configuration-with-Mesos-td33594.html
>  . It would be helpful for debugging to log the launching command and dynamic 
> properties at INFO level. Since such logs occur only when workers start, the 
> output would not be massive.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (FLINK-17811) Update docker hub Flink page

2020-07-31 Thread Till Rohrmann (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann reassigned FLINK-17811:
-

Assignee: Robert Metzger

> Update docker hub Flink page
> 
>
> Key: FLINK-17811
> URL: https://issues.apache.org/jira/browse/FLINK-17811
> Project: Flink
>  Issue Type: Task
>  Components: Deployment / Docker
>Reporter: Andrey Zagrebin
>Assignee: Robert Metzger
>Priority: Major
>
> In FLINK-17161, we refactored the Flink docker images docs. We should also 
> update and possibly link the related Flink docs about docker integration in 
> [docker hub Flink image 
> description|https://hub.docker.com/_/flink?tab=description].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #12917: [FLINK-18355][tests] Simplify tests of SlotPoolImpl

2020-07-31 Thread GitBox


flinkbot edited a comment on pull request #12917:
URL: https://github.com/apache/flink/pull/12917#issuecomment-659940433


   
   ## CI report:
   
   * da7fb96fc00acda2a4c103f5f177efb9bd9be8be UNKNOWN
   * f80ff802ee1f126a58b1617ab4ca82c3f1a40ba8 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5065)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-18783) Load AkkaRpcService through separate class loader

2020-07-31 Thread Till Rohrmann (Jira)
Till Rohrmann created FLINK-18783:
-

 Summary: Load AkkaRpcService through separate class loader
 Key: FLINK-18783
 URL: https://issues.apache.org/jira/browse/FLINK-18783
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Coordination
Affects Versions: 1.11.1, 1.10.1, 1.12.0
Reporter: Till Rohrmann


In order to reduce the runtime dependency on Scala and also to hide the Akka 
dependency, I suggest loading the AkkaRpcService and its dependencies through a 
separate class loader, similar to what we do with Flink's plugins.
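A minimal sketch of the idea, assuming a placeholder entry-point class (the 
actual service wiring would differ): load the RPC classes through a dedicated 
URLClassLoader, much like Flink's plugin mechanism isolates plugin 
dependencies.
{code:java}
import java.net.{URL, URLClassLoader}

// Sketch: isolate Akka/Scala behind a child class loader.
def loadRpcService(rpcJars: Array[URL], flinkClassLoader: ClassLoader): AnyRef = {
  val rpcClassLoader = new URLClassLoader(rpcJars, flinkClassLoader)
  // Hypothetical entry-point class; resolved inside the child loader only.
  val clazz = rpcClassLoader.loadClass("com.example.rpc.AkkaRpcServiceFactory")
  clazz.getDeclaredConstructor().newInstance()
}
{code}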



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-10919) Flink binary distribution too large

2020-07-31 Thread Till Rohrmann (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-10919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann closed FLINK-10919.
-
Resolution: Abandoned

Closed for inactivity.

> Flink binary distribution too large
> ---
>
> Key: FLINK-10919
> URL: https://issues.apache.org/jira/browse/FLINK-10919
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Affects Versions: 1.7.0
>Reporter: Till Rohrmann
>Priority: Major
>
> Flink's binary distribution is almost 500 MB in size:
> {code}
> 128K  ./bin
>  16K  ./examples/python/streaming
>  28K  ./examples/python/batch
>  44K  ./examples/python
> 189M  ./examples/streaming
> 240K  ./examples/gelly
> 136K  ./examples/batch
> 190M  ./examples
> 131M  ./lib
> 174M  ./opt
>   0B  ./log
>  56K  ./conf
> 494M  .
> {code}
> I think this is far too large and we should try to reduce the size. For 
> example, the examples directory contains 3 different Kafka example jobs, each 
> at least 50 MB in size.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-10903) Shade Internal Akka Dependencies

2020-07-31 Thread Till Rohrmann (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-10903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann closed FLINK-10903.
-
Resolution: Workaround

Closing this ticket as there is a workaround available. The proper fix will be 
done as part of FLINK-18783.

> Shade Internal Akka Dependencies
> 
>
> Key: FLINK-10903
> URL: https://issues.apache.org/jira/browse/FLINK-10903
> Project: Flink
>  Issue Type: Wish
>  Components: Runtime / Coordination
>Reporter: Luka Jurukovski
>Priority: Minor
>
> Akka is not a publicly exposed API, but it forces developers (particularly in 
> Scala) to use an older version. It would be nice if it were shaded so that 
> developers are free to use the version of their choosing without needing to 
> worry about binary backwards compatibility, or about cases where a user is 
> forced to use parent-first classloading.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-10948) Add option to write out termination message with application status

2020-07-31 Thread Till Rohrmann (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-10948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann closed FLINK-10948.
-
Resolution: Abandoned

Closed for inactivity.

> Add option to write out termination message with application status
> ---
>
> Key: FLINK-10948
> URL: https://issues.apache.org/jira/browse/FLINK-10948
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Reporter: Ufuk Celebi
>Assignee: Ufuk Celebi
>Priority: Minor
>
> I propose to add an option to write out a termination message to a file that 
> indicates the terminal application status. With the change proposed in 
> FLINK-10743, we can't use the exit code to differentiate between cancelled 
> and succeeded applications.
> The motivating use case for both this ticket and FLINK-10743 are Flink job 
> clusters ({{StandaloneJobClusterEntryPoint}}) with Kubernetes. The idea of 
> the termination message comes from Kubernetes 
> ([https://kubernetes.io/docs/tasks/debug-application-cluster/determine-reason-pod-failure/)].
>  
> With this in place, a terminated Pod will report the final status as in:
> {code:java}
> state:
>   terminated:
>     exitCode: 0
>     finishedAt: 2018-11-20T11:00:59Z
>     message: CANCELED # <--- termination message
>     reason: Completed
>     startedAt: 2018-11-20T10:59:18Z
> {code}
> The implementation could be done in 
> {{ClusterEntrypoint#runClusterEntrypoint(ClusterEntrypoint)}} which is used 
> by all entry points to run Flink.
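A hedged sketch of writing such a termination message (the file path follows 
the Kubernetes convention; where exactly the hook would live in 
ClusterEntrypoint is an assumption):
{code:java}
import java.nio.file.{Files, Paths, StandardOpenOption}

// Kubernetes surfaces this file's content as the Pod's termination message.
def writeTerminationMessage(
    applicationStatus: String, // e.g. "CANCELED", "SUCCEEDED", "FAILED"
    path: String = "/dev/termination-log"): Unit = {
  Files.write(
    Paths.get(path),
    applicationStatus.getBytes("UTF-8"),
    StandardOpenOption.CREATE,
    StandardOpenOption.TRUNCATE_EXISTING)
}
{code}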



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-11000) Introduce Resource Blacklist Mechanism

2020-07-31 Thread Till Rohrmann (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-11000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann closed FLINK-11000.
-
Resolution: Abandoned

Closed for inactivity.

> Introduce Resource Blacklist Mechanism
> --
>
> Key: FLINK-11000
> URL: https://issues.apache.org/jira/browse/FLINK-11000
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Reporter: Yingjie Cao
>Assignee: Yingjie Cao
>Priority: Major
>
> In large clusters, jobs occasionally encounter hardware and software 
> environment problems, including missing software libraries, bad hardware, and 
> resource shortages such as running out of disk space. These problems lead to 
> task failures; the failover strategy takes care of that and redeploys the 
> relevant tasks. But because of reasons like location preference and limited 
> total resources, the failed task may be scheduled onto the same host, where 
> it fails again and again, many times. The primary cause of this problem is 
> the mismatch between task and resource; currently, the resource allocation 
> algorithm does not take these failures into consideration. 
> The blacklist mechanism can solve this problem. The basic idea is that when a 
> task fails too many times on some resource, the Scheduler will not assign 
> that resource to the task again. The detailed design doc is as follows: 
> [https://docs.google.com/document/d/1Qfb_QPd7CLcGT-kJjWSCdO8xFeobSCHF0vNcfiO4Bkw]
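A minimal sketch of the basic idea (purely illustrative; the design doc covers 
the real integration with the Scheduler):
{code:java}
import scala.collection.mutable

class ResourceBlacklist(maxFailures: Int) {
  // Failure count per (task, host) pair.
  private val failures =
    mutable.Map.empty[(String, String), Int].withDefaultValue(0)

  def recordFailure(taskId: String, host: String): Unit =
    failures((taskId, host)) += 1

  // Stop offering a host for a task once the threshold is reached.
  def isBlacklisted(taskId: String, host: String): Boolean =
    failures((taskId, host)) >= maxFailures

  def filterCandidates(taskId: String, hosts: Seq[String]): Seq[String] =
    hosts.filterNot(isBlacklisted(taskId, _))
}
{code}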



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-18756) Support IF NOT EXISTS for CREATE TABLE statement

2020-07-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-18756:
---
Labels: pull-request-available  (was: )

> Support IF NOT EXISTS for CREATE TABLE statement
> 
>
> Key: FLINK-18756
> URL: https://issues.apache.org/jira/browse/FLINK-18756
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API
>Reporter: Jark Wu
>Assignee: Leonard Xu
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>
> Currently, the CREATE TABLE DDL statement doesn't support IF NOT EXISTS. I 
> think this is a useful feature we have missed, because all the other CREATE 
> DDL statements support it. 
> https://ci.apache.org/projects/flink/flink-docs-master/dev/table/sql/create.html#create-table
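For illustration, a sketch of how the statement would be used once supported, 
assuming a TableEnvironment `tableEnv` (Flink 1.11+ `executeSql`); the table 
name and connector options are made-up examples:
{code:java}
// With IF NOT EXISTS the DDL becomes idempotent: re-running it against an
// existing table is a no-op instead of an error.
tableEnv.executeSql(
  """
    |CREATE TABLE IF NOT EXISTS orders (
    |  order_id BIGINT,
    |  amount DOUBLE
    |) WITH (
    |  'connector' = 'datagen'
    |)
    |""".stripMargin)
{code}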



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] leonardBang opened a new pull request #13037: [FLINK-18756][table-api] Support IF NOT EXISTS for CREATE TABLE statement

2020-07-31 Thread GitBox


leonardBang opened a new pull request #13037:
URL: https://github.com/apache/flink/pull/13037


   ## What is the purpose of the change
   
   * This pull request supports `IF NOT EXISTS` for the CREATE TABLE statement.
   
   
   ## Brief change log
   
- Support parsing `IF NOT EXISTS` in `parserImpls.ftl`
- Simplify the constructor of `SqlCreateTable.java`
   
   ## Verifying this change
   - Add parse tests in `FlinkSqlParserImplTest.java`
   - Add catalog related test in `TableEnvironmentTest.scala`

   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-10981) Add or modify metrics to show the maximum usage of InputBufferPool/OutputBufferPool to help debugging back pressure

2020-07-31 Thread Till Rohrmann (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-10981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168594#comment-17168594
 ] 

Till Rohrmann commented on FLINK-10981:
---

[~pnowojski] is this ticket still relevant or has the backpressure monitoring 
already changed so that the ticket has become obsolete?

> Add or modify metrics to show the maximum usage of 
> InputBufferPool/OutputBufferPool to help debugging back pressure
> ---
>
> Key: FLINK-10981
> URL: https://issues.apache.org/jira/browse/FLINK-10981
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Metrics, Runtime / Network
>Reporter: Yun Gao
>Assignee: Yun Gao
>Priority: Major
>
> Currently the network layer provides two metrics items, namely 
> _InputBufferPoolUsageGauge_ and _OutputBufferPoolUsageGauge_, to show the 
> usage of the input buffer pool and the output buffer pool. When there are 
> multiple inputs (SingleInputGate) or outputs (ResultPartition), the two 
> metrics items show their average usage. 
>  
> However, we found that the maximum usage across all InputBufferPools or 
> OutputBufferPools is also useful in debugging back pressure. Suppose we have a 
> job with the following job graph:
>  
> {code:java}
>               F
>                \
>                 \
>                 _\/
> A ---> B -----> C ---> D
>         \
>          \
>           \-> E
>  {code}
> Also suppose that D is very slow and thus causes back pressure, while E is 
> very fast and F outputs few records, so the usage of their corresponding 
> input/output buffer pools is almost 0.
>  
> Then the average input/output buffer usage of each task will be:
>  
> {code:java}
> A(100%) --> (100%) B (50%) --> (50%) C (100%) --> (100%) D
> {code}
>  
>  
> But the maximum input/output buffer usage of each task will be:
>  
> {code:java}
> A(100%) --> (100%) B (100%) --> (100%) C (100%) --> (100%) D
> {code}
> Users will be able to find the slowest task by finding the first task whose 
> input buffer usage is 100% but output usage is less than 100%.
>  
>  
> If it is reasonable to show the maximum input/output buffer usage, I think 
> there may be three options:
>  # Modify the current computation logic of _InputBufferPoolUsageGauge_ and 
> _OutputBufferPoolUsageGauge_.
>  # Add two new metrics items, _InputBufferPoolMaxUsageGauge_ and 
> _OutputBufferPoolMaxUsageGauge_.
>  # Try to show distinct usage for each input/output buffer pool.
> I think the second option is preferable; a sketch of it follows below.
>  
> What do you think about that?
>  
>  
>  
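A sketch of option 2 under stated assumptions: PoolStats is a stand-in for the 
real buffer pool statistics available internally, and Gauge is Flink's public 
metrics interface.
{code:java}
import org.apache.flink.metrics.Gauge

// Stand-in for the buffer pool statistics actually available internally.
trait PoolStats { def usedBuffers: Int; def totalBuffers: Int }

// Reports the maximum usage across all pools instead of the average.
class MaxBufferPoolUsageGauge(pools: Seq[PoolStats]) extends Gauge[Float] {
  override def getValue: Float =
    if (pools.isEmpty) 0f
    else pools.map(p => p.usedBuffers.toFloat / p.totalBuffers).max
}
{code}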



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-11025) Connector shading is inconsistent

2020-07-31 Thread Till Rohrmann (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-11025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann closed FLINK-11025.
-
Resolution: Abandoned

Closed for inactivity.

> Connector shading is inconsistent
> -
>
> Key: FLINK-11025
> URL: https://issues.apache.org/jira/browse/FLINK-11025
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataSet, Build System, Connectors / Common
>Affects Versions: 1.5.5, 1.6.2, 1.7.0
>Reporter: Chesnay Schepler
>Priority: Major
>
> I've had a look at our connectors, and our shading practices are very 
> inconsistent. Connectors either:
>  # don't shade anything at all
>  # shade dependencies that are prone to causing conflicts (like guava)
>  # shade everything
> Examples:
>  # nifi, hbase, jdbc, orc, ES6, rabbitmq, Kafka
>  # Cassandra, Kinesis
>  # ES1,2,5, twitter
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot commented on pull request #13037: [FLINK-18756][table-api] Support IF NOT EXISTS for CREATE TABLE statement

2020-07-31 Thread GitBox


flinkbot commented on pull request #13037:
URL: https://github.com/apache/flink/pull/13037#issuecomment-667046082


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 00568a9ee6d18beaffeafb6462f9e33924e3cae6 (Fri Jul 31 
10:10:15 UTC 2020)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Closed] (FLINK-11055) Allow Queryable State to be transformed on the TaskManager before being returned to the client

2020-07-31 Thread Till Rohrmann (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-11055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann closed FLINK-11055.
-
Resolution: Abandoned

Closing for inactivity.

> Allow Queryable State to be transformed on the TaskManager before being 
> returned to the client
> --
>
> Key: FLINK-11055
> URL: https://issues.apache.org/jira/browse/FLINK-11055
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Queryable State
>Reporter: Galen Warren
>Priority: Major
> Fix For: 1.7.3
>
>
> The proposal here is to enhance the way Queryable State works to allow for 
> the state object to be transformed on the TaskManager before being returned 
> to the client. As an example, if some MapState were made queryable, such 
> a transform might look up a specific key in the map and return its 
> corresponding value, resulting in only that value being returned to the 
> client instead of the entire map. This could be useful in cases where the 
> client only wants a portion of the state and the state is large (this is my 
> use case).
> At a high level, I think this could be accomplished by adding an (optional) 
> serializable Function into KvStateRequest (and related 
> classes?) and having that transform be applied in the QueryableStateServer 
> (or QueryableStateClientProxy?). I expect some additional TypeInformation 
> would also have to be supplied/used in places. It should be doable in a 
> backwards compatible way such that if the client does not specify a transform 
> it works exactly as it does now.
> Would there be any interest in a PR for this? This would help me with 
> something I'm currently working on, and I'd be willing to take a crack at it. 
> If there is interest, I'll be happy to do some more research to come up with 
> a more concrete proposal.
> Thanks for Flink - it's great!
>  
>  
>  
>  
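A simplified sketch of the proposed transform, with stand-in types rather than 
Flink's actual queryable-state classes:
{code:java}
// Ship a serializable transform with the request and apply it server-side,
// so only the reduced result travels back to the client.
case class StateQuery[S, R](stateName: String, transform: S => R)

def answerOnTaskManager[S, R](state: S, query: StateQuery[S, R]): R =
  query.transform(state)

// e.g. look up one key of a large map instead of returning the whole map:
val query =
  StateQuery[Map[String, Long], Option[Long]]("counts", _.get("user-42"))
{code}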



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-11129) Dashboard for job which contains important information

2020-07-31 Thread Till Rohrmann (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-11129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann closed FLINK-11129.
-
Resolution: Abandoned

Closing for inactivity.

> Dashboard for job which contains important information
> 
>
> Key: FLINK-11129
> URL: https://issues.apache.org/jira/browse/FLINK-11129
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / REST, Runtime / Web Frontend
>Reporter: lining
>Assignee: lining
>Priority: Major
>
> Currently, checking a job's problems requires clicking through many pages. Can 
> we add a dashboard for the job that contains failovers, subtasks whose input 
> queue usage is 100%, and TaskManagers that have GC problems?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-11148) Rescaling operator regression in Flink 1.7

2020-07-31 Thread Till Rohrmann (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-11148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann closed FLINK-11148.
-
Resolution: Duplicate

> Rescaling operator regression in Flink 1.7
> --
>
> Key: FLINK-11148
> URL: https://issues.apache.org/jira/browse/FLINK-11148
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.7.0
>Reporter: Truong Duc Kien
>Priority: Major
>
> We have a job using 20 TaskManagers with 3 slots each.
> Using Flink 1.4, when we rescale a data stream from 60 to 20, each 
> TaskManager will only have one downstream slot, which receives the data from 
> 3 upstream slots in the same TaskManager.
> Using Flink 1.7, this behaviour no longer holds true: multiple downstream 
> slots are being assigned to the same TaskManager. This change is causing an 
> imbalance in our TaskManager load.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

