[jira] [Commented] (FLINK-22472) The real partition data produced time is behind meta(_SUCCESS) file produced

2021-05-25 Thread Kurt Young (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17350842#comment-17350842
 ] 

Kurt Young commented on FLINK-22472:


[~luoyuxia] would you like to take this issue?

> The real partition data produced time is behind meta(_SUCCESS) file produced
> 
>
> Key: FLINK-22472
> URL: https://issues.apache.org/jira/browse/FLINK-22472
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / FileSystem, Connectors / Hive
>Reporter: Leonard Xu
>Priority: Major
> Attachments: image-2021-05-25-14-27-40-563.png
>
>
> I tested writing some data to a csv file with the flink filesystem connector,
> but after the success file was produced, the data file was still uncommitted,
> which seemed very weird to me.
> {code:java}
> bang@mac db1.db $ll 
> /var/folders/55/cw682b314gn8jhfh565hp7q0gp/T/junit8642959834366044048/junit484868942580135598/test-partition-time-commit/d\=2020-05-03/e\=12/
> total 8
> drwxr-xr-x  4 bang  staff  128  4 25 19:57 ./
> drwxr-xr-x  8 bang  staff  256  4 25 19:57 ../
> -rw-r--r--  1 bang  staff   12  4 25 19:57 .part-b703d4b9-067a-4dfe-935e-3afc723aed56-0-4.inprogress.b7d9cf09-0f72-4dce-8591-b61b1d23ae9b
> -rw-r--r--  1 bang  staff    0  4 25 19:57 _MY_SUCCESS
> {code}
>  
> After some debugging I found I have to set the {{sink.rolling-policy.file-size}} or
> {{sink.rolling-policy.rollover-interval}} parameter; the default values of these
> two parameters are pretty big (128MB and 30min). That's not convenient for a
> test/demo. I think we can improve this.
>  
> As the doc [1] describes, for row formats (csv, json), you can set the
> parameter {{sink.rolling-policy.file-size}} or
> {{sink.rolling-policy.rollover-interval}} in the connector properties, together
> with the parameter {{execution.checkpointing.interval}} in flink-conf.yaml, if
> you don't want to wait a long time before observing the data in the file
> system.
> [1] 
> https://ci.apache.org/projects/flink/flink-docs-master/docs/connectors/table/filesystem/#rolling-policy
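> For illustration, a minimal sketch of such a setup (table name, path, and the
> small values below are made up for demo purposes):
> {code:sql}
> CREATE TABLE fs_sink (
>   a STRING,
>   d STRING,
>   e STRING
> ) PARTITIONED BY (d, e) WITH (
>   'connector' = 'filesystem',
>   'path' = 'file:///tmp/fs_sink',
>   'format' = 'csv',
>   -- roll files far earlier than the 128MB/30min defaults
>   'sink.rolling-policy.file-size' = '1MB',
>   'sink.rolling-policy.rollover-interval' = '10s'
> );
> {code}
> plus, in flink-conf.yaml, a short checkpoint interval such as
> {{execution.checkpointing.interval: 10s}}, since in-progress row-format files
> are also committed on checkpoints.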



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-22472) The real partition data produced time is behind meta(_SUCCESS) file produced

2021-05-25 Thread luoyuxia (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17350850#comment-17350850
 ] 

luoyuxia commented on FLINK-22472:
--

[~ykt836] Yes, with pleasure.

> The real partition data produced time is behind meta(_SUCCESS) file produced
> 
>
> Key: FLINK-22472
> URL: https://issues.apache.org/jira/browse/FLINK-22472
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / FileSystem, Connectors / Hive
>Reporter: Leonard Xu
>Priority: Major
> Attachments: image-2021-05-25-14-27-40-563.png
>
>
> I tested writing some data to a csv file with the flink filesystem connector,
> but after the success file was produced, the data file was still uncommitted,
> which seemed very weird to me.
> {code:java}
> bang@mac db1.db $ll 
> /var/folders/55/cw682b314gn8jhfh565hp7q0gp/T/junit8642959834366044048/junit484868942580135598/test-partition-time-commit/d\=2020-05-03/e\=12/
> total 8
> drwxr-xr-x  4 bang  staff  128  4 25 19:57 ./
> drwxr-xr-x  8 bang  staff  256  4 25 19:57 ../
> -rw-r--r--  1 bang  staff   12  4 25 19:57 .part-b703d4b9-067a-4dfe-935e-3afc723aed56-0-4.inprogress.b7d9cf09-0f72-4dce-8591-b61b1d23ae9b
> -rw-r--r--  1 bang  staff    0  4 25 19:57 _MY_SUCCESS
> {code}
>  
> After some debugging I found I have to set the {{sink.rolling-policy.file-size}} or
> {{sink.rolling-policy.rollover-interval}} parameter; the default values of these
> two parameters are pretty big (128MB and 30min). That's not convenient for a
> test/demo. I think we can improve this.
>  
> As the doc [1] describes, for row formats (csv, json), you can set the
> parameter {{sink.rolling-policy.file-size}} or
> {{sink.rolling-policy.rollover-interval}} in the connector properties, together
> with the parameter {{execution.checkpointing.interval}} in flink-conf.yaml, if
> you don't want to wait a long time before observing the data in the file
> system.
> [1] 
> https://ci.apache.org/projects/flink/flink-docs-master/docs/connectors/table/filesystem/#rolling-policy



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #15938: [FLINK-11103][runtime] Set a configurable default uncaught exception handler for all entrypoints

2021-05-25 Thread GitBox


flinkbot edited a comment on pull request #15938:
URL: https://github.com/apache/flink/pull/15938#issuecomment-842314070


   
   ## CI report:
   
   * e7bacdbc8fb620fe2e6c81ff2c664dbd8ab6cc19 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18289)
 
   * b414ce9d89a88fa826e356069dda2da97d58021b Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18299)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-22613) FlinkKinesisITCase.testStopWithSavepoint fails

2021-05-25 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17350869#comment-17350869
 ] 

Robert Metzger commented on FLINK-22613:


Thanks a lot!
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=18294&view=logs&j=e1276d0f-df12-55ec-86b5-c0ad597d83c9&t=906e9244-f3be-5604-1979-e767c8a6f6d9

> FlinkKinesisITCase.testStopWithSavepoint fails
> --
>
> Key: FLINK-22613
> URL: https://issues.apache.org/jira/browse/FLINK-22613
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kinesis
>Affects Versions: 1.13.0, 1.14.0, 1.12.3
>Reporter: Guowei Ma
>Assignee: Arvid Heise
>Priority: Blocker
>  Labels: test-stability
>
> {code:java}
> 2021-05-10T03:09:18.4601182Z May 10 03:09:18 [ERROR] 
> testStopWithSavepoint(org.apache.flink.streaming.connectors.kinesis.FlinkKinesisITCase)
>   Time elapsed: 3.526 s  <<< FAILURE!
> 2021-05-10T03:09:18.4601884Z May 10 03:09:18 java.lang.AssertionError: 
> 2021-05-10T03:09:18.4605902Z May 10 03:09:18 
> 2021-05-10T03:09:18.4616154Z May 10 03:09:18 Expected: a collection with size 
> a value less than <10>
> 2021-05-10T03:09:18.4616818Z May 10 03:09:18  but: collection size <10> 
> was equal to <10>
> 2021-05-10T03:09:18.4618087Z May 10 03:09:18  at 
> org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
> 2021-05-10T03:09:18.4618702Z May 10 03:09:18  at 
> org.junit.Assert.assertThat(Assert.java:956)
> 2021-05-10T03:09:18.4619467Z May 10 03:09:18  at 
> org.junit.Assert.assertThat(Assert.java:923)
> 2021-05-10T03:09:18.4620391Z May 10 03:09:18  at 
> org.apache.flink.streaming.connectors.kinesis.FlinkKinesisITCase.testStopWithSavepoint(FlinkKinesisITCase.java:126)
> 2021-05-10T03:09:18.4621115Z May 10 03:09:18  at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 2021-05-10T03:09:18.4621751Z May 10 03:09:18  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 2021-05-10T03:09:18.4622475Z May 10 03:09:18  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2021-05-10T03:09:18.4623142Z May 10 03:09:18  at 
> java.lang.reflect.Method.invoke(Method.java:498)
> 2021-05-10T03:09:18.4623783Z May 10 03:09:18  at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> 2021-05-10T03:09:18.4624514Z May 10 03:09:18  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 2021-05-10T03:09:18.4625246Z May 10 03:09:18  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> 2021-05-10T03:09:18.4625967Z May 10 03:09:18  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 2021-05-10T03:09:18.4626671Z May 10 03:09:18  at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> 2021-05-10T03:09:18.4627349Z May 10 03:09:18  at 
> org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
> 2021-05-10T03:09:18.4627979Z May 10 03:09:18  at 
> org.junit.rules.RunRules.evaluate(RunRules.java:20)
> 2021-05-10T03:09:18.4628582Z May 10 03:09:18  at 
> org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> 2021-05-10T03:09:18.4629251Z May 10 03:09:18  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> 2021-05-10T03:09:18.4629950Z May 10 03:09:18  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> 2021-05-10T03:09:18.4630616Z May 10 03:09:18  at 
> org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> 2021-05-10T03:09:18.4631339Z May 10 03:09:18  at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> 2021-05-10T03:09:18.4631986Z May 10 03:09:18  at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> 2021-05-10T03:09:18.4632630Z May 10 03:09:18  at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> 2021-05-10T03:09:18.4633269Z May 10 03:09:18  at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> 2021-05-10T03:09:18.4634016Z May 10 03:09:18  at 
> org.testcontainers.containers.FailureDetectingExternalResource$1.evaluate(FailureDetectingExternalResource.java:30)
> 2021-05-10T03:09:18.4634786Z May 10 03:09:18  at 
> org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
> 2021-05-10T03:09:18.4635412Z May 10 03:09:18  at 
> org.junit.rules.RunRules.evaluate(RunRules.java:20)
> 2021-05-10T03:09:18.4635995Z May 10 03:09:18  at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:363)
> 2021-05-10T03:09:18.4636656Z May 10 03:09:18  at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
> 2021-05-10T03:09:18.4637398Z

[jira] [Commented] (FLINK-22762) cannot use set table.sql-dialect=hive;

2021-05-25 Thread luoyuxia (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17350870#comment-17350870
 ] 

luoyuxia commented on FLINK-22762:
--

I think this issue is the same as [HiveParser::setCurrentTimestamp fails with 
hive-3.1.2|https://issues.apache.org/jira/browse/FLINK-22760]

> cannot use  set table.sql-dialect=hive;
> ---
>
> Key: FLINK-22762
> URL: https://issues.apache.org/jira/browse/FLINK-22762
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.13.0
> Environment: flink 1.13  
> hive 3.1.2
>Reporter: xueluo
>Priority: Major
> Attachments: image-2021-05-25-10-11-49-944.png, 
> image-2021-05-25-10-14-22-111.png
>
>
> sh sql-client.sh
> *CREATE CATALOG* myhive *WITH* (
>      'type' *=* 'hive',
>      'default-database' = 'default',
>      'hive-conf-dir' = '/data/hive/conf/'
>  );
>  
> USE CATALOG myhive;
>  set table.sql-dialect=hive;
> then {{show tables;}} or any other command fails with an error:
> !image-2021-05-25-10-14-22-111.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Reopened] (FLINK-4542) Add MULTISET operations

2021-05-25 Thread Timo Walther (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther reopened FLINK-4542:
-
  Assignee: (was: Sergey Nuyanzin)

> Add MULTISET operations
> ---
>
> Key: FLINK-4542
> URL: https://issues.apache.org/jira/browse/FLINK-4542
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API
>Reporter: Timo Walther
>Priority: Minor
>  Labels: auto-closed, stale-assigned
>
> Umbrella issue for MULTISET operations like:
> MULTISET UNION, MULTISET UNION ALL, MULTISET EXCEPT, MULTISET EXCEPT ALL, 
> MULTISET INTERSECT, MULTISET INTERSECT ALL, CARDINALITY, ELEMENT, MEMBER OF, 
> SUBMULTISET OF, IS A SET, FUSION
> At the moment we only support COLLECT.
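> For illustration, once implemented these operations would compose with
> {{COLLECT}} roughly as follows (table and column names are made up):
> {code:sql}
> -- COLLECT aggregates values into a MULTISET;
> -- CARDINALITY would return the number of elements, counting duplicates.
> SELECT CARDINALITY(COLLECT(user_id)) FROM clicks;
>
> -- MULTISET UNION ALL would merge two multisets, keeping duplicates.
> SELECT COLLECT(a) MULTISET UNION ALL COLLECT(b) FROM t;
> {code}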



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-4542) Add MULTISET operations

2021-05-25 Thread Timo Walther (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-4542:

Labels:   (was: auto-closed stale-assigned)

> Add MULTISET operations
> ---
>
> Key: FLINK-4542
> URL: https://issues.apache.org/jira/browse/FLINK-4542
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API
>Reporter: Timo Walther
>Priority: Minor
>
> Umbrella issue for MULTISET operations like:
> MULTISET UNION, MULTISET UNION ALL, MULTISET EXCEPT, MULTISET EXCEPT ALL, 
> MULTISET INTERSECT, MULTISET INTERSECT ALL, CARDINALITY, ELEMENT, MEMBER OF, 
> SUBMULTISET OF, IS A SET, FUSION
> At the moment we only support COLLECT.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-20975) HiveTableSourceITCase.testPartitionFilter fails on AZP

2021-05-25 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-20975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-20975:
---
Priority: Critical  (was: Major)

> HiveTableSourceITCase.testPartitionFilter fails on AZP
> --
>
> Key: FLINK-20975
> URL: https://issues.apache.org/jira/browse/FLINK-20975
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.13.0
>Reporter: Till Rohrmann
>Assignee: Rui Li
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.13.0
>
>
> The test {{HiveTableSourceITCase.testPartitionFilter}} fails on AZP with the 
> following exception:
> {code}
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertFalse(Assert.java:64)
>   at org.junit.Assert.assertFalse(Assert.java:74)
>   at 
> org.apache.flink.connectors.hive.HiveTableSourceITCase.testPartitionFilter(HiveTableSourceITCase.java:278)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Reopened] (FLINK-20975) HiveTableSourceITCase.testPartitionFilter fails on AZP

2021-05-25 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-20975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger reopened FLINK-20975:


> HiveTableSourceITCase.testPartitionFilter fails on AZP
> --
>
> Key: FLINK-20975
> URL: https://issues.apache.org/jira/browse/FLINK-20975
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.13.0
>Reporter: Till Rohrmann
>Assignee: Rui Li
>Priority: Major
>  Labels: test-stability
> Fix For: 1.13.0
>
>
> The test {{HiveTableSourceITCase.testPartitionFilter}} fails on AZP with the 
> following exception:
> {code}
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertFalse(Assert.java:64)
>   at org.junit.Assert.assertFalse(Assert.java:74)
>   at 
> org.apache.flink.connectors.hive.HiveTableSourceITCase.testPartitionFilter(HiveTableSourceITCase.java:278)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-22762) cannot use set table.sql-dialect=hive;

2021-05-25 Thread luoyuxia (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17350870#comment-17350870
 ] 

luoyuxia edited comment on FLINK-22762 at 5/25/21, 7:35 AM:


I think this issue is same as  [HiveParser::setCurrentTimestamp fails with 
hive-3.1.2][https://issues.apache.org/jira/browse/FLINK-22760]


was (Author: luoyuxia):
I think this issue is same as  [HiveParser::setCurrentTimestamp fails with 
hive-3.1.2]https://issues.apache.org/jira/browse/FLINK-22760]

> cannot use  set table.sql-dialect=hive;
> ---
>
> Key: FLINK-22762
> URL: https://issues.apache.org/jira/browse/FLINK-22762
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.13.0
> Environment: flink 1.13  
> hive 3.1.2
>Reporter: xueluo
>Priority: Major
> Attachments: image-2021-05-25-10-11-49-944.png, 
> image-2021-05-25-10-14-22-111.png
>
>
> sh sql-client.sh
> *CREATE CATALOG* myhive *WITH* (
>      'type' *=* 'hive',
>      'default-database' = 'default',
>      'hive-conf-dir' = '/data/hive/conf/'
>  );
>  
> USE CATALOG myhive;
>  set table.sql-dialect=hive;
> then {{show tables;}} or any other command fails with an error:
> !image-2021-05-25-10-14-22-111.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-20975) HiveTableSourceITCase.testPartitionFilter fails on AZP

2021-05-25 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17350871#comment-17350871
 ] 

Robert Metzger commented on FLINK-20975:


I reopened this issue, because it reappeared in the 1.13 branch 
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=18294&view=logs&j=245e1f2e-ba5b-5570-d689-25ae21e5302f&t=e7f339b2-a7c3-57d9-00af-3712d4b15354

> HiveTableSourceITCase.testPartitionFilter fails on AZP
> --
>
> Key: FLINK-20975
> URL: https://issues.apache.org/jira/browse/FLINK-20975
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.13.0
>Reporter: Till Rohrmann
>Assignee: Rui Li
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.13.0
>
>
> The test {{HiveTableSourceITCase.testPartitionFilter}} fails on AZP with the 
> following exception:
> {code}
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertFalse(Assert.java:64)
>   at org.junit.Assert.assertFalse(Assert.java:74)
>   at 
> org.apache.flink.connectors.hive.HiveTableSourceITCase.testPartitionFilter(HiveTableSourceITCase.java:278)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-20975) HiveTableSourceITCase.testPartitionFilter fails on AZP

2021-05-25 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-20975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-20975:
---
Fix Version/s: (was: 1.13.0)
   1.13.1

> HiveTableSourceITCase.testPartitionFilter fails on AZP
> --
>
> Key: FLINK-20975
> URL: https://issues.apache.org/jira/browse/FLINK-20975
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.13.0
>Reporter: Till Rohrmann
>Assignee: Rui Li
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.13.1
>
>
> The test {{HiveTableSourceITCase.testPartitionFilter}} fails on AZP with the 
> following exception:
> {code}
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertFalse(Assert.java:64)
>   at org.junit.Assert.assertFalse(Assert.java:74)
>   at 
> org.apache.flink.connectors.hive.HiveTableSourceITCase.testPartitionFilter(HiveTableSourceITCase.java:278)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-22762) cannot use set table.sql-dialect=hive;

2021-05-25 Thread luoyuxia (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17350870#comment-17350870
 ] 

luoyuxia edited comment on FLINK-22762 at 5/25/21, 7:35 AM:


I think this issue is same as [HiveParser::setCurrentTimestamp fails with 
hive-3.1.2|https://issues.apache.org/jira/browse/FLINK-22760]


was (Author: luoyuxia):
I think this issue is same as  [HiveParser::setCurrentTimestamp fails with 
hive-3.1.2][https://issues.apache.org/jira/browse/FLINK-22760]

> cannot use  set table.sql-dialect=hive;
> ---
>
> Key: FLINK-22762
> URL: https://issues.apache.org/jira/browse/FLINK-22762
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.13.0
> Environment: flink 1.13  
> hive 3.1.2
>Reporter: xueluo
>Priority: Major
> Attachments: image-2021-05-25-10-11-49-944.png, 
> image-2021-05-25-10-14-22-111.png
>
>
> sh sql-client.sh
> *CREATE CATALOG* myhive *WITH* (
>      'type' *=* 'hive',
>      'default-database' = 'default',
>      'hive-conf-dir' = '/data/hive/conf/'
>  );
>  
> USE CATALOG myhive;
>  set table.sql-dialect=hive;
> then {{show tables;}} or any other command fails with an error:
> !image-2021-05-25-10-14-22-111.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-22685) Write data to hive table in batch mode throws FileNotFoundException.

2021-05-25 Thread Moses (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17350872#comment-17350872
 ] 

Moses commented on FLINK-22685:
---

Hi [~lirui], thanks for your response. Maybe we can change this behavior so that 
the SQL client can run jobs asynchronously.

> Write data to hive table in batch mode throws FileNotFoundException.
> 
>
> Key: FLINK-22685
> URL: https://issues.apache.org/jira/browse/FLINK-22685
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
> Environment: Flink Based on Flink 1.11.1.
>Reporter: Moses
>Priority: Minor
>
> h3. Scenario
> I want to launch a batch job to process hive table data and write the result to 
> another table (*T1*), and my SQL statements are written as below:
> {code:sql}
> -- use hive dialect
> SET table.sql-dialect=hive;
> -- insert into hive table
> insert overwrite table T1
>   partition (p_day_id,p_file_id)
> select distinct 
> {code}
> The job was successfully launched, but it failed on the *Sink* operator. On the 
> Flink UI page I saw that all task states were `*FINISHED*`, but *the job failed 
> and restarted again*.
>  And I found exception information like below: (*the path is masked*)
> {code:java}
> java.lang.Exception: Failed to finalize execution on master
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraph.vertexFinished(ExecutionGraph.java:1291)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionVertex.executionFinished(ExecutionVertex.java:870)
>   at 
> org.apache.flink.runtime.executiongraph.Execution.markFinished(Execution.java:1125)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraph.updateStateInternal(ExecutionGraph.java:1491)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraph.updateState(ExecutionGraph.java:1464)
>   at 
> org.apache.flink.runtime.scheduler.SchedulerBase.updateTaskExecutionState(SchedulerBase.java:497)
>   at 
> org.apache.flink.runtime.jobmaster.JobMaster.updateTaskExecutionState(JobMaster.java:386)
>   at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:284)
>   at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:199)
>   at 
> org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:74)
>   at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152)
>   at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
>   at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
>   at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
>   at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
>   at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
>   at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>   at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>   at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
>   at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
>   at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
>   at akka.actor.ActorCell.invoke(ActorCell.scala:561)
>   at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
>   at akka.dispatch.Mailbox.run(Mailbox.scala:225)
>   at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
>   at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at 
> akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>   at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>   at 
> akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Caused by: org.apache.flink.table.api.TableException: Exception in 
> finalizeGlobal
>   at 
> org.apache.flink.table.filesystem.FileSystemOutputFormat.finalizeGlobal(FileSystemOutputFormat.java:97)
>   at 
> org.apache.flink.runtime.jobgraph.InputOutputFormatVertex.finalizeOnMaster(InputOutputFormatVertex.java:132)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraph.vertexFinished(ExecutionGraph.java:1286)
>   ... 31 more
> Caused by: java.io.FileNotFoundException: File 
> //XX/XXX/.staging_1621244168369 does not exist.
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:814)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.access$700(DistributedFileS

[GitHub] [flink-statefun] tzulitai closed pull request #230: [FLINK-22468] Fix Java 11 build by adding dependency to javax.annotations

2021-05-25 Thread GitBox


tzulitai closed pull request #230:
URL: https://github.com/apache/flink-statefun/pull/230


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Closed] (FLINK-16277) StreamTableEnvironment.toAppendStream fails with Decimal types

2021-05-25 Thread Timo Walther (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther closed FLINK-16277.

Resolution: Fixed

Fixed as part of FLIP-136 in FLINK-21934.

> StreamTableEnvironment.toAppendStream fails with Decimal types
> --
>
> Key: FLINK-16277
> URL: https://issues.apache.org/jira/browse/FLINK-16277
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.10.0
>Reporter: Benoît Paris
>Priority: Minor
>  Labels: auto-deprioritized-major
> Attachments: DecimalType 38 18 Logical - stacktrace.txt, 
> flink-test-schema-update.zip
>
>
> The following fails when there is a Decimal type in the underlying 
> TableSource:
>  
> {code:java}
> DataStream<Row> appendStream = tEnv.toAppendStream(
>   asTable,
>   asTable.getSchema().toRowType()
> );{code}
> Yielding the following error:
>  
> ValidationException: Type ROW<`y` DECIMAL(38, 18)> of table field 'payload' 
> does not match with the physical type ROW<`y` LEGACY('DECIMAL', 'DECIMAL')> 
> of the 'payload' field of the TableSource return type
> 
>  
> Remarks:
>  * toAppendStream is not ready for the new type system, does not accept the 
> new DataTypes
>  * The LegacyTypeInformationType transition type hinders things. Replacing it 
> with the new DataTypes.DECIMAL type makes things work.
>  * flink-json is not ready for the new type system, does not give the new 
> DataTypes
>  
> Workaround: post-process the output of TypeConversions.fromLegacyInfoToDataType, 
> replacing LegacyTypeInformationType types that have the DECIMAL type root 
> with the new types.
>  
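> A minimal sketch of that workaround, assuming a fixed DECIMAL(38, 18) and a
> non-nested field (nested ROW fields would need the same replacement applied
> recursively via {{DataTypes.ROW(DataTypes.FIELD(...))}}):
> {code:java}
> import org.apache.flink.table.api.DataTypes;
> import org.apache.flink.table.types.DataType;
> import org.apache.flink.table.types.logical.LegacyTypeInformationType;
> import org.apache.flink.table.types.logical.LogicalTypeRoot;
>
> // Swap a LEGACY('DECIMAL', ...) transition type for the new DECIMAL type.
> static DataType replaceLegacyDecimal(DataType type) {
>   boolean legacyDecimal =
>       type.getLogicalType() instanceof LegacyTypeInformationType
>           && type.getLogicalType().getTypeRoot() == LogicalTypeRoot.DECIMAL;
>   return legacyDecimal ? DataTypes.DECIMAL(38, 18) : type;
> }
> {code}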
> Included is reproduction and workaround (activated by line 127) code, with 
> java + pom + stacktrace files.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-22468) Make StateFun build with Java 11

2021-05-25 Thread Tzu-Li (Gordon) Tai (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tzu-Li (Gordon) Tai closed FLINK-22468.
---
Resolution: Fixed

flink-statefun/master: a6f2e9fc799edd9a52c8c684df41b7e371d546d0

> Make StateFun build with Java 11
> 
>
> Key: FLINK-22468
> URL: https://issues.apache.org/jira/browse/FLINK-22468
> Project: Flink
>  Issue Type: Improvement
>  Components: Stateful Functions
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Tzu-Li (Gordon) Tai
>Priority: Major
>  Labels: pull-request-available
> Fix For: statefun-3.1.0, statefun-3.0.1
>
>
> StateFun is currently not building with Java 11 due to the removal of 
> javax.annotations in JDK 11. This should be fixable by manually adding the 
> dependency.
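> A minimal sketch of such a fix, assuming the standalone
> {{javax.annotation-api}} artifact is used:
> {code:xml}
> <!-- Restores javax.annotation.* (e.g. @Generated), removed from the JDK in Java 11. -->
> <dependency>
>   <groupId>javax.annotation</groupId>
>   <artifactId>javax.annotation-api</artifactId>
>   <version>1.3.2</version>
> </dependency>
> {code}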



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-16277) StreamTableEnvironment.toAppendStream fails with Decimal types

2021-05-25 Thread Timo Walther (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-16277:
-
Labels:   (was: auto-deprioritized-major)

> StreamTableEnvironment.toAppendStream fails with Decimal types
> --
>
> Key: FLINK-16277
> URL: https://issues.apache.org/jira/browse/FLINK-16277
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.10.0
>Reporter: Benoît Paris
>Priority: Minor
> Attachments: DecimalType 38 18 Logical - stacktrace.txt, 
> flink-test-schema-update.zip
>
>
> The following fails when there is a Decimal type in the underlying 
> TableSource:
>  
> {code:java}
> DataStream<Row> appendStream = tEnv.toAppendStream(
>   asTable,
>   asTable.getSchema().toRowType()
> );{code}
> Yielding the following error:
>  
> ValidationException: Type ROW<`y` DECIMAL(38, 18)> of table field 'payload' 
> does not match with the physical type ROW<`y` LEGACY('DECIMAL', 'DECIMAL')> 
> of the 'payload' field of the TableSource return type
> 
>  
> Remarks:
>  * toAppendStream is not ready for the new type system, does not accept the 
> new DataTypes
>  * The LegacyTypeInformationType transition type hinders things. Replacing it 
> with the new DataTypes.DECIMAL type makes things work.
>  * flink-json is not ready for the new type system, does not give the new 
> DataTypes
>  
> Workaround: post-process the output of TypeConversions.fromLegacyInfoToDataType, 
> replacing LegacyTypeInformationType types that have the DECIMAL type root 
> with the new types.
>  
> Included is reproduction and workaround (activated by line 127) code, with 
> java + pom + stacktrace files.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-22468) Make StateFun build with Java 11

2021-05-25 Thread Tzu-Li (Gordon) Tai (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tzu-Li (Gordon) Tai updated FLINK-22468:

Fix Version/s: (was: statefun-3.0.1)

> Make StateFun build with Java 11
> 
>
> Key: FLINK-22468
> URL: https://issues.apache.org/jira/browse/FLINK-22468
> Project: Flink
>  Issue Type: Improvement
>  Components: Stateful Functions
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Tzu-Li (Gordon) Tai
>Priority: Major
>  Labels: pull-request-available
> Fix For: statefun-3.1.0
>
>
> StateFun is currently not building with Java 11 due to the removal of 
> javax.annotations in JDK 11. This should be fixable by manually adding the 
> dependency.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-22520) KafkaSourceLegacyITCase.testMultipleSourcesOnePartition hangs

2021-05-25 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17350876#comment-17350876
 ] 

Robert Metzger commented on FLINK-22520:


{code}
May 24 22:00:33 [INFO] Running 
org.apache.flink.connector.kafka.source.KafkaSourceLegacyITCase
java.lang.InterruptedException
at 
java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:385)
at 
java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999)
at org.apache.flink.test.util.TestUtils.tryExecute(TestUtils.java:49)
at 
org.apache.flink.streaming.connectors.kafka.KafkaConsumerTestBase.runMultipleSourcesOnePartitionExactlyOnceTest(KafkaConsumerTestBase.java:1112)
at 
org.apache.flink.connector.kafka.source.KafkaSourceLegacyITCase.testMultipleSourcesOnePartition(KafkaSourceLegacyITCase.java:87)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.lang.Thread.run(Thread.java:834)
May 24 22:03:08 [ERROR] Tests run: 21, Failures: 0, Errors: 1, Skipped: 0, Time 
elapsed: 155.393 s <<< FAILURE! - in 
org.apache.flink.connector.kafka.source.KafkaSourceLegacyITCase
May 24 22:03:08 [ERROR] 
testMultipleSourcesOnePartition(org.apache.flink.connector.kafka.source.KafkaSourceLegacyITCase)
  Time elapsed: 60.02 s  <<< ERROR!
May 24 22:03:08 org.junit.runners.model.TestTimedOutException: test timed out 
after 60000 milliseconds
May 24 22:03:08 at 
java.base@11.0.10/jdk.internal.misc.Unsafe.park(Native Method)
May 24 22:03:08 at 
java.base@11.0.10/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
May 24 22:03:08 at 
java.base@11.0.10/java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1796)
May 24 22:03:08 at 
java.base@11.0.10/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3128)
May 24 22:03:08 at 
java.base@11.0.10/java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1823)
May 24 22:03:08 at 
java.base@11.0.10/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1998)
May 24 22:03:08 at 
app//org.apache.flink.test.util.TestUtils.tryExecute(TestUtils.java:49)
May 24 22:03:08 at 
app//org.apache.flink.streaming.connectors.kafka.KafkaConsumerTestBase.runMultipleSourcesOnePartitionExactlyOnceTest(KafkaConsumerTestBase.java:1112)
May 24 22:03:08 at 
app//org.apache.flink.connector.kafka.source.KafkaSourceLegacyITCase.testMultipleSourcesOnePartition(KafkaSourceLegacyITCase.java:87)
May 24 22:03:08 at 
java.base@11.0.10/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)
May 24 22:03:08 at 
java.base@11.0.10/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
May 24 22:03:08 at 
java.base@11.0.10/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
May 24 22:03:08 at 
java.base@11.0.10/java.lang.reflect.Method.invoke(Method.java:566)
May 24 22:03:08 at 
app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
May 24 22:03:08 at 
app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
May 24 22:03:08 at 
app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
May 24 22:03:08 at 
app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
May 24 22:03:08 at 
app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
May 24 22:03:08 at 
app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
May 24 22:03:08 at 
java.base@11.0.10/java.util.concurrent.FutureTask.run(FutureTask.java:264)
May 24 22:03

[jira] [Reopened] (FLINK-4262) Consider null handling during sorting

2021-05-25 Thread Timo Walther (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-4262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther reopened FLINK-4262:
-
  Assignee: (was: Aleksandr Blatov)

> Consider null handling during sorting
> -
>
> Key: FLINK-4262
> URL: https://issues.apache.org/jira/browse/FLINK-4262
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API
>Reporter: Timo Walther
>Priority: Minor
>  Labels: auto-closed, stale-assigned
>
> Calcite's SQL parser allows to specify how to handle NULLs during sorting.
> {code}
> orderItem:
>   expression [ ASC | DESC ] [ NULLS FIRST | NULLS LAST ]
> {code}
> Currently, the NULLS FIRST/NULLS LAST clause is completely ignored but might be 
> helpful.
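> For example, the parser already accepts the clause, but it has no effect on
> the produced order today (table and column names are made up):
> {code:sql}
> -- Requests that NULL prices sort after all non-NULL prices.
> SELECT name, price FROM products ORDER BY price DESC NULLS LAST;
> {code}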



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-22194) KafkaSourceReaderTest.testCommitOffsetsWithoutAliveFetchers fail due to commit timeout

2021-05-25 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17350878#comment-17350878
 ] 

Robert Metzger commented on FLINK-22194:


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=18292&view=logs&j=c5f0071e-1851-543e-9a45-9ac140befc32&t=1fb1a56f-e8b5-5a82-00a0-a2db7757b4f5

> KafkaSourceReaderTest.testCommitOffsetsWithoutAliveFetchers fail due to 
> commit timeout
> --
>
> Key: FLINK-22194
> URL: https://issues.apache.org/jira/browse/FLINK-22194
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.13.0
>Reporter: Guowei Ma
>Priority: Major
>  Labels: stale-major, test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=16308&view=logs&j=b0097207-033c-5d9a-b48c-6d4796fbe60d&t=e8fcc430-213e-5cce-59d4-6942acf09121&l=6535
> {code:java}
> [ERROR] 
> testCommitOffsetsWithoutAliveFetchers(org.apache.flink.connector.kafka.source.reader.KafkaSourceReaderTest)
>   Time elapsed: 60.123 s  <<< ERROR!
> java.util.concurrent.TimeoutException: The offset commit did not finish 
> before timeout.
>   at 
> org.apache.flink.core.testutils.CommonTestUtils.waitUtil(CommonTestUtils.java:210)
>   at 
> org.apache.flink.connector.kafka.source.reader.KafkaSourceReaderTest.pollUntil(KafkaSourceReaderTest.java:285)
>   at 
> org.apache.flink.connector.kafka.source.reader.KafkaSourceReaderTest.testCommitOffsetsWithoutAliveFetchers(KafkaSourceReaderTest.java:129)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239)
>   at 
> org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-4262) Consider null handling during sorting

2021-05-25 Thread Timo Walther (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-4262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-4262:

Labels:   (was: auto-closed stale-assigned)

> Consider null handling during sorting
> -
>
> Key: FLINK-4262
> URL: https://issues.apache.org/jira/browse/FLINK-4262
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API
>Reporter: Timo Walther
>Priority: Minor
>
> Calcite's SQL parser allows to specify how to handle NULLs during sorting.
> {code}
> orderItem:
>   expression [ ASC | DESC ] [ NULLS FIRST | NULLS LAST ]
> {code}
> Currently, the NULLS FIRST/NULLS LAST clause is completely ignored but might be 
> helpful.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] wangyang0918 commented on pull request #15501: [FLINK-22054][k8s] Using a shared watcher for ConfigMap watching

2021-05-25 Thread GitBox


wangyang0918 commented on pull request #15501:
URL: https://github.com/apache/flink/pull/15501#issuecomment-847629287


   I have verified this PR with a minikube cluster. It works well. Thanks 
@yittg for the great work.
   However, I still find a suspicious log line which shows up every 5 seconds. It 
might be related to 
https://github.com/fabric8io/kubernetes-client/issues/2651. This will greatly 
increase the pressure on etcd and affect the stability of the whole K8s cluster. 
Could you please have a look?
   
   ```
   2021-05-25 07:04:45,580 DEBUG [pool-5-thread-1] 
io.fabric8.kubernetes.client.informers.cache.Reflector   [] - Listing items 
(4) for resource class io.fabric8.kubernetes.api.model.ConfigMap v3656102
   ```
   
   
   I have another comment about FLINK-22006. We may need to update the related 
doc here [1], because all the jobs now share the same web socket connection, as 
well as the concurrent request limit. As for 
`FlinkKubeClientFactory#trySetMaxConcurrentRequest`, I think we could still 
keep it until we bump the fabric8 Kubernetes client version.
   
   [1]. 
https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/resource-providers/native_kubernetes/#configuring-flink-on-kubernetes


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-13400) Remove Hive and Hadoop dependencies from SQL Client

2021-05-25 Thread Timo Walther (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17350879#comment-17350879
 ] 

Timo Walther commented on FLINK-13400:
--

Thanks for offering your help [~frank wang]. Ideally, the SQL Client should 
simply use a small test catalog instead of a big Hive catalog. I can assign 
you to this ticket if you are still interested. Maybe [~fsk119] can help 
review a PR?

> Remove Hive and Hadoop dependencies from SQL Client
> ---
>
> Key: FLINK-13400
> URL: https://issues.apache.org/jira/browse/FLINK-13400
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Client
>Reporter: Timo Walther
>Priority: Critical
>  Labels: auto-unassigned, stale-critical
> Fix For: 1.14.0
>
>
> 340 of the 550 lines in the SQL Client {{pom.xml}} are just around Hive and Hadoop 
> dependencies. Hive has nothing to do with the SQL Client, and it will be hard 
> to maintain the long list of exclusions there. Some dependencies are even in 
> {{provided}} scope rather than {{test}} scope.
> We should remove all dependencies on Hive/Hadoop and replace catalog-related 
> tests with a testing catalog, similar to how we test sources/sinks.
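> A minimal sketch of such a testing setup, assuming the existing
> {{GenericInMemoryCatalog}} is a suitable stand-in (names are illustrative):
> {code:java}
> import org.apache.flink.table.api.EnvironmentSettings;
> import org.apache.flink.table.api.TableEnvironment;
> import org.apache.flink.table.catalog.GenericInMemoryCatalog;
>
> // Register a lightweight in-memory catalog instead of a Hive catalog.
> TableEnvironment tEnv =
>     TableEnvironment.create(EnvironmentSettings.newInstance().build());
> tEnv.registerCatalog("test_cat", new GenericInMemoryCatalog("test_cat", "db"));
> tEnv.useCatalog("test_cat");
> {code}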



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-22765) ExceptionUtilsITCase.testIsMetaspaceOutOfMemoryError is unstable

2021-05-25 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-22765:
--

 Summary: ExceptionUtilsITCase.testIsMetaspaceOutOfMemoryError is 
unstable
 Key: FLINK-22765
 URL: https://issues.apache.org/jira/browse/FLINK-22765
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.14.0
Reporter: Robert Metzger
 Fix For: 1.14.0


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=18292&view=logs&j=39d5b1d5-3b41-54dc-6458-1e2ddd1cdcf3&t=a99e99c7-21cd-5a1f-7274-585e62b72f56

{code}
May 25 00:56:38 java.lang.AssertionError: 
May 25 00:56:38 
May 25 00:56:38 Expected: is ""
May 25 00:56:38  but: was "The system is out of resources.\nConsult the 
following stack trace for details."
May 25 00:56:38 at 
org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
May 25 00:56:38 at org.junit.Assert.assertThat(Assert.java:956)
May 25 00:56:38 at org.junit.Assert.assertThat(Assert.java:923)
May 25 00:56:38 at 
org.apache.flink.runtime.util.ExceptionUtilsITCase.run(ExceptionUtilsITCase.java:94)
May 25 00:56:38 at 
org.apache.flink.runtime.util.ExceptionUtilsITCase.testIsMetaspaceOutOfMemoryError(ExceptionUtilsITCase.java:70)
May 25 00:56:38 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)
May 25 00:56:38 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
May 25 00:56:38 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
May 25 00:56:38 at java.lang.reflect.Method.invoke(Method.java:498)
May 25 00:56:38 at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
May 25 00:56:38 at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
May 25 00:56:38 at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
May 25 00:56:38 at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
May 25 00:56:38 at 
org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
May 25 00:56:38 at 
org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
May 25 00:56:38 at org.junit.rules.RunRules.evaluate(RunRules.java:20)
May 25 00:56:38 at 
org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
May 25 00:56:38 at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
May 25 00:56:38 at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
May 25 00:56:38 at 
org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
May 25 00:56:38 at 
org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
May 25 00:56:38 at 
org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
May 25 00:56:38 at 
org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
May 25 00:56:38 at 
org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
May 25 00:56:38 at 
org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
May 25 00:56:38 at org.junit.rules.RunRules.evaluate(RunRules.java:20)
May 25 00:56:38 at 
org.junit.runners.ParentRunner.run(ParentRunner.java:363)
May 25 00:56:38 at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
May 25 00:56:38 at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
May 25 00:56:38 at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
May 25 00:56:38 at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
May 25 00:56:38 at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
May 25 00:56:38 at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
May 25 00:56:38 at 
org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
May 25 00:56:38 at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
May 25 00:56:38 

{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-22593) SavepointITCase.testShouldAddEntropyToSavepointPath unstable

2021-05-25 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17350882#comment-17350882
 ] 

Robert Metzger commented on FLINK-22593:


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=18292&view=logs&j=4d4a0d10-fca2-5507-8eed-c07f0bdf4887&t=c2734c79-73b6-521c-e85a-67c7ecae9107

> SavepointITCase.testShouldAddEntropyToSavepointPath unstable
> 
>
> Key: FLINK-22593
> URL: https://issues.apache.org/jira/browse/FLINK-22593
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.14.0
>Reporter: Robert Metzger
>Priority: Critical
>  Labels: test-stability
>
> https://dev.azure.com/rmetzger/Flink/_build/results?buildId=9072&view=logs&j=cc649950-03e9-5fae-8326-2f1ad744b536&t=51cab6ca-669f-5dc0-221d-1e4f7dc4fc85
> {code}
> 2021-05-07T10:56:20.9429367Z May 07 10:56:20 [ERROR] Tests run: 13, Failures: 
> 0, Errors: 1, Skipped: 0, Time elapsed: 33.441 s <<< FAILURE! - in 
> org.apache.flink.test.checkpointing.SavepointITCase
> 2021-05-07T10:56:20.9445862Z May 07 10:56:20 [ERROR] 
> testShouldAddEntropyToSavepointPath(org.apache.flink.test.checkpointing.SavepointITCase)
>   Time elapsed: 2.083 s  <<< ERROR!
> 2021-05-07T10:56:20.9447106Z May 07 10:56:20 
> java.util.concurrent.ExecutionException: 
> java.util.concurrent.CompletionException: 
> org.apache.flink.runtime.checkpoint.CheckpointException: Checkpoint 
> triggering task Sink: Unnamed (3/4) of job 4e155a20f0a7895043661a6446caf1cb 
> has not being executed at the moment. Aborting checkpoint. Failure reason: 
> Not all required tasks are currently running.
> 2021-05-07T10:56:20.9448194Z May 07 10:56:20  at 
> java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
> 2021-05-07T10:56:20.9448797Z May 07 10:56:20  at 
> java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
> 2021-05-07T10:56:20.9449428Z May 07 10:56:20  at 
> org.apache.flink.test.checkpointing.SavepointITCase.submitJobAndTakeSavepoint(SavepointITCase.java:305)
> 2021-05-07T10:56:20.9450160Z May 07 10:56:20  at 
> org.apache.flink.test.checkpointing.SavepointITCase.testShouldAddEntropyToSavepointPath(SavepointITCase.java:273)
> 2021-05-07T10:56:20.9450785Z May 07 10:56:20  at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 2021-05-07T10:56:20.9451331Z May 07 10:56:20  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 2021-05-07T10:56:20.9451940Z May 07 10:56:20  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2021-05-07T10:56:20.9452498Z May 07 10:56:20  at 
> java.lang.reflect.Method.invoke(Method.java:498)
> 2021-05-07T10:56:20.9453247Z May 07 10:56:20  at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> 2021-05-07T10:56:20.9454007Z May 07 10:56:20  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 2021-05-07T10:56:20.9454687Z May 07 10:56:20  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> 2021-05-07T10:56:20.9455302Z May 07 10:56:20  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 2021-05-07T10:56:20.9455909Z May 07 10:56:20  at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> 2021-05-07T10:56:20.9456493Z May 07 10:56:20  at 
> org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
> 2021-05-07T10:56:20.9457074Z May 07 10:56:20  at 
> org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
> 2021-05-07T10:56:20.9457636Z May 07 10:56:20  at 
> org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
> 2021-05-07T10:56:20.9458157Z May 07 10:56:20  at 
> org.junit.rules.RunRules.evaluate(RunRules.java:20)
> 2021-05-07T10:56:20.9458678Z May 07 10:56:20  at 
> org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> 2021-05-07T10:56:20.9459252Z May 07 10:56:20  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> 2021-05-07T10:56:20.9459865Z May 07 10:56:20  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> 2021-05-07T10:56:20.9460433Z May 07 10:56:20  at 
> org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> 2021-05-07T10:56:20.9461058Z May 07 10:56:20  at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> 2021-05-07T10:56:20.9461607Z May 07 10:56:20  at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> 2021-05-07T10:56:20.9462159Z May 07 10:56:20  at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> 2021-05-07T10:56:20.9462705Z May 07 10:

[jira] [Closed] (FLINK-20644) Check return type of ScalarFunction eval method shouldn't be void

2021-05-25 Thread Timo Walther (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-20644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther closed FLINK-20644.

Resolution: Won't Fix

The FLIP-65 functions should be relatively stable by now. I will close this for 
now.
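
For reference, the kind of function the requested check would have rejected looks like this (a minimal sketch, not taken from the ticket):
{code:java}
import org.apache.flink.table.functions.ScalarFunction;

public class BrokenHashCode extends ScalarFunction {
    // A void eval() is useless as a scalar function; registration should fail
    // fast with a clear error instead of producing invalid generated code later.
    public void eval(String s) {
        s.hashCode();
    }
}
{code}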

> Check return type of ScalarFunction eval method shouldn't be void
> -
>
> Key: FLINK-20644
> URL: https://issues.apache.org/jira/browse/FLINK-20644
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.11.1
> Environment: groupId:org.apache.flink
> artifactId:flink-table-api-scala-bridge_2.11
> version:1.11.1
>Reporter: shiyu
>Priority: Major
>  Labels: stale-major, starter
> Attachments: image-2020-12-17-16-04-15-131.png, 
> image-2020-12-17-16-07-39-827.png
>
>
> flink-table-api-scala-bridge_2.11
>   !image-2020-12-17-16-07-39-827.png!
> !image-2020-12-17-16-04-15-131.png!
> console:
> SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
> SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
> SLF4J: Defaulting to no-operation (NOP) logger implementation
> SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
> ERROR StatusLogger Log4j2 could not find a logging implementation. Please add log4j-core to the classpath. Using SimpleLogger to log to the console...
> /* 1 */
> /* 2 */ public class StreamExecCalc$13 extends org.apache.flink.table.runtime.operators.AbstractProcessStreamOperator
> /* 3 */     implements org.apache.flink.streaming.api.operators.OneInputStreamOperator {
> /* 4 */
> /* 5 */   private final Object[] references;
> /* 6 */   private transient org.apache.flink.table.runtime.typeutils.StringDataSerializer typeSerializer$6;
> /* 7 */   private transient org.apache.flink.table.data.util.DataFormatConverters.StringConverter converter$9;
> /* 8 */   private transient cn.bicon.tableapitest.udf.ScalarFunctionTest$HashCode function_cn$bicon$tableapitest$udf$ScalarFunctionTest$HashCode$8999e79cc91b971a8777461fb7698c58;
> /* 9 */   private transient org.apache.flink.table.data.util.DataFormatConverters.GenericConverter converter$12;
> /* 10 */  org.apache.flink.table.data.BoxedWrapperRowData out = new org.apache.flink.table.data.BoxedWrapperRowData(3);
> /* 11 */  private final org.apache.flink.streaming.runtime.streamrecord.StreamRecord outElement = new org.apache.flink.streaming.runtime.streamrecord.StreamRecord(null);
> /* 12 */
> /* 13 */  public StreamExecCalc$13(
> /* 14 */      Object[] references,
> /* 15 */      org.apache.flink.streaming.runtime.tasks.StreamTask task,
> /* 16 */      org.apache.flink.streaming.api.graph.StreamConfig config,
> /* 17 */      org.apache.flink.streaming.api.operators.Output output,
> /* 18 */      org.apache.flink.streaming.runtime.tasks.ProcessingTimeService processingTimeService) throws Exception {
> /* 19 */    this.references = references;
> /* 20 */    typeSerializer$6 = (((org.apache.flink.table.runtime.typeutils.StringDataSerializer) references[0]));
> /* 21 */    converter$9 = (((org.apache.flink.table.data.util.DataFormatConverters.StringConverter) references[1]));
> /* 22 */    function_cn$bicon$tableapitest$udf$ScalarFunctionTest$HashCode$8999e79cc91b971a8777461fb7698c58 = (((cn.bicon.tableapitest.udf.ScalarFunctionTest$HashCode) references[2]));
> /* 23 */    converter$12 = (((org.apache.flink.table.data.util.DataFormatConverters.GenericConverter) references[3]));
> /* 24 */    this.setup(task, config, output);
> /* 25 */    if (this instanceof org.apache.flink.streaming.api.operators.AbstractStreamOperator) {
> /* 26 */      ((org.apache.flink.streaming.api.operators.AbstractStreamOperator) this)
> /* 27 */        .setProcessingTimeService(processingTimeService);
> /* 28 */    }
> /* 29 */  }
> /* 30 */
> /* 31 */  @Override
> /* 32 */  public void open() throws Exception {
> /* 33 */    super.open();
> /* 34 */
> /* 35 */    function_cn$bicon$tableapitest$udf$ScalarFunctionTest$HashCode$8999e79cc91b971a8777461fb7698c58.open(new org.apache.flink.table.functions.FunctionContext(getRuntimeContext()));
> /* 36 */
> /* 37 */  }
> /* 38 */
> /* 39 */  @Override
> /* 40 */  public void processElement(org.apache.flink.streaming.runtime.streamrecord.StreamRecord element) throws Exception {
> /* 41 */    org.apache.flink.table.data.RowData in1 = (org.apache.flink.table.data.RowData) element.getValue();
> /* 42 */
> /* 43 */    org.apache.flink.table.data.binary.BinaryStringData field$5;
> /* 44 */    boolean isN

[jira] [Commented] (FLINK-22194) KafkaSourceReaderTest.testCommitOffsetsWithoutAliveFetchers fail due to commit timeout

2021-05-25 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17350883#comment-17350883
 ] 

Robert Metzger commented on FLINK-22194:


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=18292&view=logs&j=ce8f3cc3-c1ea-5281-f5eb-df9ebd24947f&t=f266c805-9429-58ed-2f9e-482e7b82f58b

> KafkaSourceReaderTest.testCommitOffsetsWithoutAliveFetchers fail due to 
> commit timeout
> --
>
> Key: FLINK-22194
> URL: https://issues.apache.org/jira/browse/FLINK-22194
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.13.0
>Reporter: Guowei Ma
>Priority: Major
>  Labels: stale-major, test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=16308&view=logs&j=b0097207-033c-5d9a-b48c-6d4796fbe60d&t=e8fcc430-213e-5cce-59d4-6942acf09121&l=6535
> {code:java}
> [ERROR] 
> testCommitOffsetsWithoutAliveFetchers(org.apache.flink.connector.kafka.source.reader.KafkaSourceReaderTest)
>   Time elapsed: 60.123 s  <<< ERROR!
> java.util.concurrent.TimeoutException: The offset commit did not finish 
> before timeout.
>   at 
> org.apache.flink.core.testutils.CommonTestUtils.waitUtil(CommonTestUtils.java:210)
>   at 
> org.apache.flink.connector.kafka.source.reader.KafkaSourceReaderTest.pollUntil(KafkaSourceReaderTest.java:285)
>   at 
> org.apache.flink.connector.kafka.source.reader.KafkaSourceReaderTest.testCommitOffsetsWithoutAliveFetchers(KafkaSourceReaderTest.java:129)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239)
>   at 
> org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-22464) OperatorEventSendingCheckpointITCase.testOperatorEventLostWithReaderFailure hangs with `AdaptiveScheduler`

2021-05-25 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17350885#comment-17350885
 ] 

Robert Metzger commented on FLINK-22464:


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=18292&view=logs&j=8fd9202e-fd17-5b26-353c-ac1ff76c8f28&t=a0a633b8-47ef-5c5a-2806-3c13b9e48228

> OperatorEventSendingCheckpointITCase.testOperatorEventLostWithReaderFailure 
> hangs with `AdaptiveScheduler`
> --
>
> Key: FLINK-22464
> URL: https://issues.apache.org/jira/browse/FLINK-22464
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination, Tests
>Affects Versions: 1.14.0
>Reporter: Guowei Ma
>Priority: Critical
>  Labels: stale-critical, test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=17178&view=logs&j=8fd9202e-fd17-5b26-353c-ac1ff76c8f28&t=a0a633b8-47ef-5c5a-2806-3c13b9e48228&l=8171
> {code:java}
>   2021-05-10T02:56:09.3603584Z "main" #1 prio=5 os_prio=0 
> tid=0x7f677000b800 nid=0x40e4 waiting on condition [0x7f6776cc8000]
> 2021-05-10T02:56:09.3604176Zjava.lang.Thread.State: TIMED_WAITING 
> (sleeping)
> 2021-05-10T02:56:09.3604468Z  at java.lang.Thread.sleep(Native Method)
> 2021-05-10T02:56:09.3604925Z  at 
> org.apache.flink.streaming.api.operators.collect.CollectResultFetcher.sleepBeforeRetry(CollectResultFetcher.java:237)
> 2021-05-10T02:56:09.3605582Z  at 
> org.apache.flink.streaming.api.operators.collect.CollectResultFetcher.next(CollectResultFetcher.java:113)
> 2021-05-10T02:56:09.3606205Z  at 
> org.apache.flink.streaming.api.operators.collect.CollectResultIterator.nextResultFromFetcher(CollectResultIterator.java:106)
> 2021-05-10T02:56:09.3606924Z  at 
> org.apache.flink.streaming.api.operators.collect.CollectResultIterator.hasNext(CollectResultIterator.java:80)
> 2021-05-10T02:56:09.3607469Z  at 
> org.apache.flink.streaming.api.datastream.DataStream.executeAndCollect(DataStream.java:1320)
> 2021-05-10T02:56:09.3607996Z  at 
> org.apache.flink.streaming.api.datastream.DataStream.executeAndCollect(DataStream.java:1303)
> 2021-05-10T02:56:09.3608616Z  at 
> org.apache.flink.runtime.operators.coordination.OperatorEventSendingCheckpointITCase.runTest(OperatorEventSendingCheckpointITCase.java:223)
> 2021-05-10T02:56:09.3609378Z  at 
> org.apache.flink.runtime.operators.coordination.OperatorEventSendingCheckpointITCase.testOperatorEventLostWithReaderFailure(OperatorEventSendingCheckpointITCase.java:135)
> 2021-05-10T02:56:09.3609968Z  at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 2021-05-10T02:56:09.3610386Z  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 2021-05-10T02:56:09.3610858Z  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2021-05-10T02:56:09.3611295Z  at 
> java.lang.reflect.Method.invoke(Method.java:498)
> 2021-05-10T02:56:09.3611703Z  at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> 2021-05-10T02:56:09.3612207Z  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 2021-05-10T02:56:09.3612774Z  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> 2021-05-10T02:56:09.3613470Z  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 2021-05-10T02:56:09.3613930Z  at 
> org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
> 2021-05-10T02:56:09.3614401Z  at 
> org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
> 2021-05-10T02:56:09.3614770Z  at 
> org.junit.rules.RunRules.evaluate(RunRules.java:20)
> 2021-05-10T02:56:09.3615138Z  at 
> org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> 2021-05-10T02:56:09.3615584Z  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> 2021-05-10T02:56:09.3616070Z  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> 2021-05-10T02:56:09.3616487Z  at 
> org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> 2021-05-10T02:56:09.3616962Z  at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> 2021-05-10T02:56:09.3617361Z  at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> 2021-05-10T02:56:09.3617785Z  at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> 2021-05-10T02:56:09.3618209Z  at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> 2021-05-10T02:56:09.3618635Z  at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> 2021-05-10T02:56:09.3619101Z  at 
> org.junit.intern

[jira] [Updated] (FLINK-17362) Improve table examples to reflect latest status

2021-05-25 Thread Timo Walther (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-17362:
-
Labels:   (was: auto-deprioritized-major)

> Improve table examples to reflect latest status
> ---
>
> Key: FLINK-17362
> URL: https://issues.apache.org/jira/browse/FLINK-17362
> Project: Flink
>  Issue Type: Improvement
>  Components: Examples
>Reporter: Kurt Young
>Priority: Minor
> Fix For: 1.14.0
>
>
> Currently the table examples seem outdated, especially after the Blink planner 
> became the default choice. We might need to refactor the structure of all 
> examples and cover the following items:
>  # streaming sql & table api examples
>  # batch sql & table api examples
>  # table/sql & datastream interoperation
>  # table/sql & dataset interoperation
>  # DDL & DML examples



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-22590) Add Scala implicit conversions for new API methods

2021-05-25 Thread Timo Walther (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-22590:
-
Labels:   (was: stale-assigned)

> Add Scala implicit conversions for new API methods
> --
>
> Key: FLINK-22590
> URL: https://issues.apache.org/jira/browse/FLINK-22590
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Timo Walther
>Assignee: Timo Walther
>Priority: Major
>
> FLINK-19980 should also be exposed through Scala's implicit conversions, to 
> allow a fluent API such as `table.toDataStream(...)`.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (FLINK-22464) OperatorEventSendingCheckpointITCase.testOperatorEventLostWithReaderFailure hangs with `AdaptiveScheduler`

2021-05-25 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger reassigned FLINK-22464:
--

Assignee: Robert Metzger

> OperatorEventSendingCheckpointITCase.testOperatorEventLostWithReaderFailure 
> hangs with `AdaptiveScheduler`
> --
>
> Key: FLINK-22464
> URL: https://issues.apache.org/jira/browse/FLINK-22464
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination, Tests
>Affects Versions: 1.14.0
>Reporter: Guowei Ma
>Assignee: Robert Metzger
>Priority: Critical
>  Labels: stale-critical, test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=17178&view=logs&j=8fd9202e-fd17-5b26-353c-ac1ff76c8f28&t=a0a633b8-47ef-5c5a-2806-3c13b9e48228&l=8171
> {code:java}
>   2021-05-10T02:56:09.3603584Z "main" #1 prio=5 os_prio=0 
> tid=0x7f677000b800 nid=0x40e4 waiting on condition [0x7f6776cc8000]
> 2021-05-10T02:56:09.3604176Zjava.lang.Thread.State: TIMED_WAITING 
> (sleeping)
> 2021-05-10T02:56:09.3604468Z  at java.lang.Thread.sleep(Native Method)
> 2021-05-10T02:56:09.3604925Z  at 
> org.apache.flink.streaming.api.operators.collect.CollectResultFetcher.sleepBeforeRetry(CollectResultFetcher.java:237)
> 2021-05-10T02:56:09.3605582Z  at 
> org.apache.flink.streaming.api.operators.collect.CollectResultFetcher.next(CollectResultFetcher.java:113)
> 2021-05-10T02:56:09.3606205Z  at 
> org.apache.flink.streaming.api.operators.collect.CollectResultIterator.nextResultFromFetcher(CollectResultIterator.java:106)
> 2021-05-10T02:56:09.3606924Z  at 
> org.apache.flink.streaming.api.operators.collect.CollectResultIterator.hasNext(CollectResultIterator.java:80)
> 2021-05-10T02:56:09.3607469Z  at 
> org.apache.flink.streaming.api.datastream.DataStream.executeAndCollect(DataStream.java:1320)
> 2021-05-10T02:56:09.3607996Z  at 
> org.apache.flink.streaming.api.datastream.DataStream.executeAndCollect(DataStream.java:1303)
> 2021-05-10T02:56:09.3608616Z  at 
> org.apache.flink.runtime.operators.coordination.OperatorEventSendingCheckpointITCase.runTest(OperatorEventSendingCheckpointITCase.java:223)
> 2021-05-10T02:56:09.3609378Z  at 
> org.apache.flink.runtime.operators.coordination.OperatorEventSendingCheckpointITCase.testOperatorEventLostWithReaderFailure(OperatorEventSendingCheckpointITCase.java:135)
> 2021-05-10T02:56:09.3609968Z  at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 2021-05-10T02:56:09.3610386Z  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 2021-05-10T02:56:09.3610858Z  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2021-05-10T02:56:09.3611295Z  at 
> java.lang.reflect.Method.invoke(Method.java:498)
> 2021-05-10T02:56:09.3611703Z  at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> 2021-05-10T02:56:09.3612207Z  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 2021-05-10T02:56:09.3612774Z  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> 2021-05-10T02:56:09.3613470Z  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 2021-05-10T02:56:09.3613930Z  at 
> org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
> 2021-05-10T02:56:09.3614401Z  at 
> org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
> 2021-05-10T02:56:09.3614770Z  at 
> org.junit.rules.RunRules.evaluate(RunRules.java:20)
> 2021-05-10T02:56:09.3615138Z  at 
> org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> 2021-05-10T02:56:09.3615584Z  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> 2021-05-10T02:56:09.3616070Z  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> 2021-05-10T02:56:09.3616487Z  at 
> org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> 2021-05-10T02:56:09.3616962Z  at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> 2021-05-10T02:56:09.3617361Z  at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> 2021-05-10T02:56:09.3617785Z  at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> 2021-05-10T02:56:09.3618209Z  at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> 2021-05-10T02:56:09.3618635Z  at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> 2021-05-10T02:56:09.3619101Z  at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> 2021-05-10T02:56:09.3619507Z  at 
> org.junit.runners.ParentRunner.run(ParentRunner.jav

[jira] [Updated] (FLINK-19913) The precision in document and code of `INTERVAL DAY(p1) TO SECOND(p2)` are inconsistent

2021-05-25 Thread Timo Walther (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-19913:
-
Parent: FLINK-12251
Issue Type: Sub-task  (was: Improvement)

> The precision in document and code of `INTERVAL DAY(p1) TO SECOND(p2)` are 
> inconsistent
> ---
>
> Key: FLINK-19913
> URL: https://issues.apache.org/jira/browse/FLINK-19913
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.11.1, 1.12.0
>Reporter: sunjincheng
>Priority: Minor
>  Labels: auto-deprioritized-major, starter
>
> The precision in document and code of `INTERVAL DAY(p1) TO SECOND(p2)` are 
> inconsistent. In doc:
> {code:java}
> INTERVAL DAY(p1) TO SECOND(p2)
> The type can be declared using the above combinations where p1 is the number 
> of digits of days (day precision) and p2 is the number of digits of 
> fractional seconds (fractional precision). p1 must have a value between 1 and 
> 6 (both inclusive). p2 must have a value between 0 and 9 (both inclusive). If 
> no p1 is specified, it is equal to 2 by default. If no p2 is specified, it is 
> equal to 6 by default. 
> {code}
> In code:
> {code:java}
>  case typeName if DAY_INTERVAL_TYPES.contains(typeName) =>
> if (relDataType.getPrecision > 3) {
>   throw new TableException(
> s"DAY_INTERVAL_TYPES precision is not supported: 
> ${relDataType.getPrecision}")
> }
> {code}
> BTW: We can also refer to Oracle's definition of support for INTERVAL:
>  
> [https://oracle-base.com/articles/misc/oracle-dates-timestamps-and-intervals#interval]
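> For illustration, the documented-but-rejected day precision can be reproduced with a query along these lines (a sketch, not from the original report):
> {code:java}
> // The docs allow a day precision p1 of up to 6, but the quoted check throws
> // "DAY_INTERVAL_TYPES precision is not supported" for anything above 3.
> tableEnv.executeSql("SELECT INTERVAL '12345' DAY(5)");
> {code}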



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink-statefun] tzulitai opened a new pull request #235: [FLINK-20336] [core] Do not silently pass UNRECOGNIZED state mutations

2021-05-25 Thread GitBox


tzulitai opened a new pull request #235:
URL: https://github.com/apache/flink-statefun/pull/235


   We should throw if the state mutation returned from remote functions is of 
type `UNRECOGNIZED`.
   
   The `UNRECOGNIZED` enum constant is a Protobuf auto-generated constant that 
is used if the enum to deserialize does not exist on the receiving side. Thus, 
this should only happen in the case of mismatching protocol versions between 
StateFun and the remote function.
   
   In this case, we should definitely not silently pass.
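
   A minimal sketch of the intended handling (the enum and accessor names here are assumptions based on the request-reply protocol, not the exact StateFun sources):
   ```java
   switch (mutation.getMutationType()) {
       case DELETE:
           state.remove(mutation.getStateName());
           break;
       case MODIFY:
           state.set(mutation.getStateName(), mutation.getStateValue());
           break;
       case UNRECOGNIZED:
       default:
           throw new IllegalStateException(
               "Unrecognized state mutation type; this likely indicates a protocol "
                   + "version mismatch between StateFun and the remote function.");
   }
   ```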


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-20336) RequestReplyFunction should not silently ignore UNRECOGNIZED state value mutations types

2021-05-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-20336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-20336:
---
Labels: auto-unassigned pull-request-available  (was: auto-unassigned)

> RequestReplyFunction should not silently ignore UNRECOGNIZED state value 
> mutations types
> 
>
> Key: FLINK-20336
> URL: https://issues.apache.org/jira/browse/FLINK-20336
> Project: Flink
>  Issue Type: Bug
>  Components: Stateful Functions
>Affects Versions: statefun-2.1.0, statefun-2.2.1
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Tzu-Li (Gordon) Tai
>Priority: Minor
>  Labels: auto-unassigned, pull-request-available
> Fix For: statefun-3.1.0
>
>
> If a function's response has a {{PersistedValueMutation}} type that is 
> {{UNRECOGNIZED}}, we currently just silently ignore that mutation:
> https://github.com/apache/flink-statefun/blob/master/statefun-flink/statefun-flink-core/src/main/java/org/apache/flink/statefun/flink/core/reqreply/PersistedRemoteFunctionValues.java#L84
> This is incorrect. The {{UNRECOGNIZED}} enum constant is a pre-defined 
> constant used by the Protobuf Java SDK to represent a constant that could 
> not be deserialized (because the serialized constant does not match any 
> enum defined in the protobuf message).
> Therefore, it should be handled by throwing an exception, preferably 
> indicating that there is some sort of version mismatch between the function's 
> Protobuf message definitions and StateFun's Protobuf message definitions 
> (i.e. most likely a mismatch in the invocation protocol versions).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] akalash commented on pull request #15959: [FLINK-22684][runtime] Added ability to ignore in-flight data during …

2021-05-25 Thread GitBox


akalash commented on pull request #15959:
URL: https://github.com/apache/flink/pull/15959#issuecomment-847634879


   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] luoyuxia opened a new pull request #15995: [FLINK-22760][Hive] Fix issue of HiveParser::setCurrentTimestamp fails with hive-3.1.2

2021-05-25 Thread GitBox


luoyuxia opened a new pull request #15995:
URL: https://github.com/apache/flink/pull/15995


   …estamp fails with hive-3.1.2
   
   
   
   ## What is the purpose of the change
   
   *This pull request fixes the issue of HiveParser::setCurrentTimestamp 
failing with hive-3.1.2.*
   
   
   ## Brief change log
   
 - *Convert the Instant returned by SessionState#getQueryCurrentTimestamp in 
hive-3.1.2 to a Timestamp (see the sketch below)*
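
   A sketch of such a conversion (an illustration under the assumptions above, not necessarily the exact patch; checked reflection exceptions are omitted for brevity):
   ```java
   import java.lang.reflect.Method;
   import org.apache.hadoop.hive.ql.session.SessionState;

   // The return type of SessionState#getQueryCurrentTimestamp differs across
   // Hive versions (java.sql.Timestamp in older versions, Instant in 3.1.2),
   // so a reflective call avoids binding to either signature at compile time.
   Method m = SessionState.class.getMethod("getQueryCurrentTimestamp");
   Object result = m.invoke(SessionState.get());
   java.sql.Timestamp ts =
       result instanceof java.time.Instant
           ? java.sql.Timestamp.from((java.time.Instant) result)
           : (java.sql.Timestamp) result;
   ```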
   
   
   ## Verifying this change
   
   This change is already covered by existing tests, such as 
*org.apache.flink.connectors.hive.HiveDialectQueryITCase*.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`:   no 
 - The serializers: no 
 - The runtime per-record code paths (performance sensitive):  no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper:  no 
 - The S3 file system connector: no 
   
   ## Documentation
   
 - Does this pull request introduce a new feature?  no
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-22760) HiveParser::setCurrentTimestamp fails with hive-3.1.2

2021-05-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-22760:
---
Labels: pull-request-available  (was: )

> HiveParser::setCurrentTimestamp fails with hive-3.1.2
> -
>
> Key: FLINK-22760
> URL: https://issues.apache.org/jira/browse/FLINK-22760
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Reporter: Rui Li
>Assignee: luoyuxia
>Priority: Critical
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (FLINK-22472) The real partition data produced time is behind meta(_SUCCESS) file produced

2021-05-25 Thread Kurt Young (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kurt Young reassigned FLINK-22472:
--

Assignee: luoyuxia

> The real partition data produced time is behind meta(_SUCCESS) file produced
> 
>
> Key: FLINK-22472
> URL: https://issues.apache.org/jira/browse/FLINK-22472
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / FileSystem, Connectors / Hive
>Reporter: Leonard Xu
>Assignee: luoyuxia
>Priority: Major
> Attachments: image-2021-05-25-14-27-40-563.png
>
>
> I tested writing some data to a csv file with the flink filesystem connector, 
> but after the success file was produced, the data file was still uncommitted, 
> which seemed very weird to me.
> {code:java}
> bang@mac db1.db $ll 
> /var/folders/55/cw682b314gn8jhfh565hp7q0gp/T/junit8642959834366044048/junit484868942580135598/test-partition-time-commit/d\=2020-05-03/e\=12/
> total 8
> drwxr-xr-x  4 bang  staff  128  4 25 19:57 ./
> drwxr-xr-x  8 bang  staff  256  4 25 19:57 ../
> -rw-r--r--  1 bang  staff   12  4 25 19:57 
> .part-b703d4b9-067a-4dfe-935e-3afc723aed56-0-4.inprogress.b7d9cf09-0f72-4dce-8591-b61b1d23ae9b
> -rw-r--r--  1 bang  staff0  4 25 19:57 _MY_SUCCESS
> {code}
>  
> After some debugging I found I have to set the {{sink.rolling-policy.file-size}} 
> or {{sink.rolling-policy.rollover-interval}} parameters; the default values of 
> the two parameters are pretty big (128M and 30min). That is not convenient for 
> tests/demos. I think we can improve this.
>  
> As the doc[1] describes, for row formats (csv, json) you can set the 
> parameter {{sink.rolling-policy.file-size}} or 
> {{sink.rolling-policy.rollover-interval}} in the connector properties and the 
> parameter {{execution.checkpointing.interval}} in flink-conf.yaml together if 
> you don't want to wait a long period before observing the data in the file 
> system.
> [1] 
> https://ci.apache.org/projects/flink/flink-docs-master/docs/connectors/table/filesystem/#rolling-policy
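> For example, a small rolling policy for tests/demos could be configured along these lines (an illustrative sketch; the table name, schema, and path are made up):
> {code:java}
> // Small rollover thresholds, combined with a short
> // execution.checkpointing.interval in flink-conf.yaml, make the written
> // files visible quickly.
> tableEnv.executeSql(
>     "CREATE TABLE fs_sink (a STRING, d STRING, e STRING) " +
>     "PARTITIONED BY (d, e) WITH (" +
>     "  'connector' = 'filesystem'," +
>     "  'format' = 'csv'," +
>     "  'path' = '/tmp/test-output'," +
>     "  'sink.rolling-policy.file-size' = '1MB'," +
>     "  'sink.rolling-policy.rollover-interval' = '10 s'" +
>     ")");
> {code}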



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] alpinegizmo commented on a change in pull request #15991: Adding gcs documentation. Connecting flink to gcs.

2021-05-25 Thread GitBox


alpinegizmo commented on a change in pull request #15991:
URL: https://github.com/apache/flink/pull/15991#discussion_r638530397



##
File path: docs/content/docs/deployment/filesystems/gcs.md
##
@@ -0,0 +1,76 @@
+---
+title: Google cloud storage

Review comment:
   ```suggestion
   title: Google Cloud Storage
   ```

##
File path: docs/content/docs/deployment/filesystems/gcs.md
##
@@ -0,0 +1,76 @@
+---
+title: Google cloud storage
+weight: 3
+type: docs
+aliases:
+  - /deployment/filesystems/gcs.html
+---
+
+
+# Google Cloud Storage
+
+[Google Cloud storage](https://cloud.google.com/storage) (GCS) provides cloud 
storage for variety of usecases. You can use it for **reading** and **writing 
data**.

Review comment:
   By its exclusion, this seems to suggest that GCS cannot be used for 
checkpointing. Is that the case?
   
   ```suggestion
   [Google Cloud storage](https://cloud.google.com/storage) (GCS) provides 
cloud storage for a variety of use cases. You can use it for **reading** and 
**writing data**.
   ```

##
File path: docs/content/docs/deployment/filesystems/gcs.md
##
@@ -0,0 +1,76 @@
+---
+title: Google cloud storage
+weight: 3
+type: docs
+aliases:
+  - /deployment/filesystems/gcs.html
+---
+
+
+# Google Cloud Storage
+
+[Google Cloud storage](https://cloud.google.com/storage) (GCS) provides cloud 
storage for variety of usecases. You can use it for **reading** and **writing 
data**.
+
+
+
+You can use GCS objects like regular files by specifying paths in the 
following format:
+
+```plain
+gs://<bucket>/<endpoint>
+```
+
+The endpoint can either be a single file or a directory, for example:
+
+```java
+// Read from GCS bucket
+env.readTextFile("gs://<bucket>/<endpoint>");
+
+// Write to GCS bucket
+stream.writeAsText("gs://<bucket>/<endpoint>");
+
+```
+
+### Libraries
+
+You would require to include the following libraries to `/lib` for connecting 
flink with gcs. 
+
+```xml
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>flink-shaded-hadoop2-uber</artifactId>
+  <version>${flink.shared_hadoop_latest_version}</version>
+</dependency>
+
+<dependency>
+  <groupId>com.google.cloud.bigdataoss</groupId>
+  <artifactId>gcs-connector</artifactId>
+  <version>hadoop2-2.2.0</version>
+</dependency>
+```
+
+We have tested with `flink-shaded-hadoop2-uber` versions >= `2.8.3-1.8.3`.
+
+### Authentication to access GCS
+
+You would need authentication to access GCS. Please follow the 
[link](https://cloud.google.com/storage/docs/authentication).

Review comment:
   ```suggestion
   Most operations on GCS require authentication. Please see [the documentation 
on Google Cloud Storage 
authentication](https://cloud.google.com/storage/docs/authentication) for more 
information.
   ```

##
File path: docs/content/docs/deployment/filesystems/gcs.md
##
@@ -0,0 +1,76 @@
+---
+title: Google cloud storage
+weight: 3
+type: docs
+aliases:
+  - /deployment/filesystems/gcs.html
+---
+
+
+# Google Cloud Storage
+
+[Google Cloud storage](https://cloud.google.com/storage) (GCS) provides cloud 
storage for variety of usecases. You can use it for **reading** and **writing 
data**.
+
+
+
+You can use GCS objects like regular files by specifying paths in the 
following format:
+
+```plain
+gs://<bucket>/<endpoint>
+```
+
+The endpoint can either be a single file or a directory, for example:
+
+```java
+// Read from GCS bucket
+env.readTextFile("gs://<bucket>/<endpoint>");
+
+// Write to GCS bucket
+stream.writeAsText("gs://<bucket>/<endpoint>");
+
+```
+
+### Libraries
+
+You would require to include the following libraries to `/lib` for connecting 
flink with gcs. 

Review comment:
   Can the GCS filesystem be used as a Flink plugin, or is it really the 
case that it only works by putting the jar files into the lib directory (and I 
assume you meant Flink's lib directory, rather than `/lib`)?
   
   ```suggestion
   You must include the following libraries in Flink's `lib` directory to 
connect Flink with gcs:
   ```

##
File path: docs/content/docs/deployment/filesystems/gcs.md
##
@@ -0,0 +1,76 @@
+---
+title: Google cloud storage
+weight: 3
+type: docs
+aliases:
+  - /deployment/filesystems/gcs.html
+---
+
+
+# Google Cloud Storage
+
+[Google Cloud storage](https://cloud.google.com/storage) (GCS) provides cloud 
storage for variety of usecases. You can use it for **reading** and **writing 
data**.
+
+
+
+You can use GCS objects like regular files by specifying paths in the 
following format:
+
+```plain
+gs://<bucket>/<endpoint>
+```
+
+The endpoint can either be a single file or a directory, for example:
+
+```java
+// Read from GCS bucket
+env.readTextFile("gs://<bucket>/<endpoint>");
+
+// Write to GCS bucket
+stream.writeAsText("gs://<bucket>/<endpoint>");
+
+```
+
+### Libraries
+
+You would require to include the following libraries to `/lib` for connecting 
flink with gcs. 
+
+```xml
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>flink-shaded-hadoop2-uber</artifactId>
+  <version>${flink.shared_hadoop_latest_version}</version>
+</dependency>
+
+<dependency>
+  <groupId>com.google.cloud.bigdataoss</groupId>
+  <artifactId>gcs-connector</artifactId>
+  <version>hadoop2-2.2.0</version>
+</dependency>
+```
+
+We have tested with `flink-shaded-hadoop2-uber` versions >= `2.8.3-1.8.3`.
+
+### Authentication to access GCS
+
+You would need authentication to access GCS. Please follow the 
[l

[GitHub] [flink] flinkbot edited a comment on pull request #15959: [FLINK-22684][runtime] Added ability to ignore in-flight data during …

2021-05-25 Thread GitBox


flinkbot edited a comment on pull request #15959:
URL: https://github.com/apache/flink/pull/15959#issuecomment-844265660


   
   ## CI report:
   
   * a0b3af8f3d5503bbd3cb16c208521ce515ef6c67 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18243)
 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18300)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #15995: [FLINK-22760][Hive] Fix issue of HiveParser::setCurrentTimestamp fails with hive-3.1.2

2021-05-25 Thread GitBox


flinkbot commented on pull request #15995:
URL: https://github.com/apache/flink/pull/15995#issuecomment-847637342


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 031d0fe9b8d8ab2050ad378d158788e51495d0e7 (Tue May 25 
07:51:23 UTC 2021)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] AHeise commented on pull request #15991: Adding gcs documentation. Connecting flink to gcs.

2021-05-25 Thread GitBox


AHeise commented on pull request #15991:
URL: https://github.com/apache/flink/pull/15991#issuecomment-847639207


   @afedulov could you please also take a look? I think you struggled with GCS 
as well. For now, I'd merge it into 1.12 and 1.13 and wait to see whether 
https://github.com/apache/flink/pull/15599 changes things for master (we could 
also merge it into master and let that PR update it, which might be a good 
reminder).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] LiuPeien opened a new pull request #15996: FLINK-22663 Stop NMClient get stuck if there are stopped NodeManagers

2021-05-25 Thread GitBox


LiuPeien opened a new pull request #15996:
URL: https://github.com/apache/flink/pull/15996


   
   
   ## What is the purpose of the change
   
   *(For example: This pull request makes task deployment go through the blob 
server, rather than through RPC. That way we avoid re-transferring them on each 
deployment (during recovery).)*
   
   
   ## Brief change log
   
   *(for example:)*
 - *The TaskInfo is stored in the blob store on job creation time as a 
persistent artifact*
 - *Deployments RPC transmits only the blob storage reference*
 - *TaskManagers retrieve the TaskInfo from the blob cache*
   
   
   ## Verifying this change
   
   *(Please pick either of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe 
tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
 - *Added integration tests for end-to-end deployment with large payloads 
(100MB)*
 - *Extended integration test for recovery after master (JobManager) 
failure*
 - *Added test that validates that TaskInfo is transferred only once across 
recoveries*
 - *Manually verified the change by running a 4 node cluser with 2 
JobManagers and 4 TaskManagers, a stateful streaming program, and killing one 
JobManager and two TaskManagers during the execution, verifying that recovery 
happens correctly.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / no)
 - The serializers: (yes / no / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / no / 
don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / no / don't 
know)
 - The S3 file system connector: (yes / no / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-10211) Time indicators are not correctly materialized for LogicalJoin

2021-05-25 Thread Timo Walther (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-10211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-10211:
-
Labels:   (was: stale-major)

> Time indicators are not correctly materialized for LogicalJoin
> --
>
> Key: FLINK-10211
> URL: https://issues.apache.org/jira/browse/FLINK-10211
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API, Table SQL / Planner
>Affects Versions: 1.6.0
>Reporter: Piotr Nowojski
>Priority: Major
>
> Currently 
> {{org.apache.flink.table.calcite.RelTimeIndicatorConverter#visit(LogicalJoin)}}
>  correctly handles only windowed joins. The output of non-windowed joins 
> shouldn't contain any time indicators.
> A symptom of this is the exception:
> {code}
> Rowtime attributes must not be in the input rows of a regular join. As a 
> workaround you can cast the time attributes of input tables to TIMESTAMP 
> before.
> {code}
> Or this exception:
> {code}
> org.apache.flink.table.api.TableException: Found more than one rowtime field: 
> [orderTime, payTime] in the table that should be converted to a DataStream. 
> Please select the rowtime field that should be used as event-time timestamp 
> for the DataStream by casting all other fields to TIMESTAMP. at 
> org.apache.flink.table.api.StreamTableEnvironment.translate(StreamTableEnvironment.scala:906)
> {code}
> A long-term solution would be:
> The root cause of this issue is the early phase in which 
> {{RelTimeIndicatorConverter}} is called. Due to a lack of information (since 
> the join condition might not have been pushed into the join node), we cannot 
> differentiate between a window and a non-window join. Thus, we cannot perform 
> the time indicator materialization in a more fine-grained way. A solution 
> would be to perform the materialization later, after the logical optimization 
> and before the physical translation. This would also make sense from a 
> semantic perspective, because time indicators are more of a physical 
> characteristic.
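> For illustration, a regular (non-windowed) join whose inputs both carry rowtime attributes triggers the first exception (a sketch, not from the original report; table and column names are made up):
> {code:java}
> import org.apache.flink.table.api.Table;
>
> // Both Orders and Payments expose event-time (rowtime) attributes; joining
> // them without a window forces the planner to materialize time indicators.
> Table result = tableEnv.sqlQuery(
>     "SELECT o.orderId, p.payTime " +
>     "FROM Orders o JOIN Payments p ON o.orderId = p.orderId");
> {code}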



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-22663) Release YARN resource very slow when cancel the job after some NodeManagers shutdown

2021-05-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-22663:
---
Labels: YARN pull-request-available  (was: YARN)

> Release YARN resource very slow when cancel the job after some NodeManagers 
> shutdown
> 
>
> Key: FLINK-22663
> URL: https://issues.apache.org/jira/browse/FLINK-22663
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Affects Versions: 1.12.2
>Reporter: Jinhong Liu
>Priority: Major
>  Labels: YARN, pull-request-available
>
> When I test flink on YARN, there is a case that may cause some problems.
> Hadoop Version: 2.7.3
> Flink Version: 1.12.2
> I deploy a flink job on YARN; while the job is running I stop one NodeManager, 
> and after one or two minutes the job recovers automatically. But in this 
> situation, if I cancel the job, the containers are not released immediately; 
> some containers, including the app master, keep running. About 5 minutes 
> later these containers exit, and about 10 minutes later the app master exits.
> I checked the app master log, and it seems to try to stop the containers on 
> the NodeManager which I have already stopped.
> {code:java}
> 2021-05-14 06:15:17,389 INFO  
> org.apache.flink.runtime.executiongraph.ExecutionGraph   [] - Job class 
> tv.freewheel.reporting.fastlane.Fastlane$ (da883ab39a7a82e4d45a3803bc77dd6f) 
> switched from state CANCELLING to CANCELED.
> 2021-05-14 06:15:17,389 INFO  
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator[] - Stopping 
> checkpoint coordinator for job da883ab39a7a82e4d45a3803bc77dd6f.
> 2021-05-14 06:15:17,390 INFO  
> org.apache.flink.runtime.checkpoint.StandaloneCompletedCheckpointStore [] - 
> Shutting down
> 2021-05-14 06:15:17,408 INFO  
> org.apache.flink.runtime.dispatcher.MiniDispatcher   [] - Job 
> da883ab39a7a82e4d45a3803bc77dd6f reached globally terminal state CANCELED.
> 2021-05-14 06:15:17,409 INFO  
> org.apache.flink.runtime.dispatcher.MiniDispatcher   [] - Shutting 
> down cluster with state CANCELED, jobCancelled: true, executionMode: DETACHED
> 2021-05-14 06:15:17,409 INFO  
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint[] - Shutting 
> YarnJobClusterEntrypoint down with application status CANCELED. Diagnostics 
> null.
> 2021-05-14 06:15:17,409 INFO  
> org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint [] - Shutting 
> down rest endpoint.
> 2021-05-14 06:15:17,420 INFO  org.apache.flink.runtime.jobmaster.JobMaster
>  [] - Stopping the JobMaster for job class 
> tv.freewheel.reporting.fastlane.Fastlane$(da883ab39a7a82e4d45a3803bc77dd6f).
> 2021-05-14 06:15:17,422 INFO  
> org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint [] - Removing 
> cache directory 
> /tmp/flink-web-af72a00c-0ddd-4e5e-a62c-8244d6caa552/flink-web-ui
> 2021-05-14 06:15:17,432 INFO  
> org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint [] - 
> http://ip-10-23-19-197.ec2.internal:43811 lost leadership
> 2021-05-14 06:15:17,432 INFO  
> org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint [] - Shut down 
> complete.
> 2021-05-14 06:15:17,436 INFO  
> org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - 
> Shut down cluster because application is in CANCELED, diagnostics null.
> 2021-05-14 06:15:17,436 INFO  org.apache.flink.yarn.YarnResourceManagerDriver 
>  [] - Unregister application from the YARN Resource Manager with 
> final status KILLED.
> 2021-05-14 06:15:17,458 INFO  
> org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl [] - Suspending 
> SlotPool.
> 2021-05-14 06:15:17,458 INFO  org.apache.flink.runtime.jobmaster.JobMaster
>  [] - Close ResourceManager connection 
> 493862ba148679a4f16f7de5ffaef665: Stopping JobMaster for job class 
> tv.freewheel.reporting.fastlane.Fastlane$(da883ab39a7a82e4d45a3803bc77dd6f)..
> 2021-05-14 06:15:17,458 INFO  
> org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl [] - Stopping 
> SlotPool.
> 2021-05-14 06:15:17,482 INFO  
> org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl[] - Waiting for 
> application to be successfully unregistered.
> 2021-05-14 06:15:17,566 INFO  org.apache.flink.runtime.history.FsJobArchivist 
>  [] - Job da883ab39a7a82e4d45a3803bc77dd6f has been archived at 
> hdfs:/realtime/flink-archive/da883ab39a7a82e4d45a3803bc77dd6f.
> 2021-05-14 06:15:17,589 INFO  
> org.apache.flink.runtime.entrypoint.component.DispatcherResourceManagerComponent
>  [] - Closing components.
> 2021-05-14 06:15:17,590 INFO  
> org.apache.flink.runtime.dispatcher.runner.JobDispatcherLeaderProcess [] - 
> Stopping JobDispatcherLeaderProcess.
> 2021-05-14

[jira] [Commented] (FLINK-22719) WindowJoinUtil.containsWindowStartEqualityAndEndEquality should not throw exception

2021-05-25 Thread Timo Walther (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17350889#comment-17350889
 ] 

Timo Walther commented on FLINK-22719:
--

[~lzljs3620320] I'm not sure the issue has been fixed properly. I haven't 
looked into the PR yet, but I worked on this issue as part of FLINK-10211 
before the Blink merge interrupted a complete fix. Can we guarantee that the 
downstream operators still receive a valid rowtime attribute when removing this 
check?

> WindowJoinUtil.containsWindowStartEqualityAndEndEquality should not throw 
> exception
> ---
>
> Key: FLINK-22719
> URL: https://issues.apache.org/jira/browse/FLINK-22719
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Reporter: Jingsong Lee
>Assignee: Andy
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> This will break regular join SQL.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] twalthr closed pull request #15989: [FLINK-22747] Upgrade to commons-io 2.8.0

2021-05-25 Thread GitBox


twalthr closed pull request #15989:
URL: https://github.com/apache/flink/pull/15989


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #15996: FLINK-22663 Stop YARN NMClient get stuck if it try to stop the containers on dead NodeManagers

2021-05-25 Thread GitBox


flinkbot commented on pull request #15996:
URL: https://github.com/apache/flink/pull/15996#issuecomment-847642931


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 9589ed9beb169f69822faf6f51e1b162c755709a (Tue May 25 
07:59:17 UTC 2021)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
* **This pull request references an unassigned [Jira 
ticket](https://issues.apache.org/jira/browse/FLINK-22663).** According to the 
[code contribution 
guide](https://flink.apache.org/contributing/contribute-code.html), tickets 
need to be assigned before starting with the implementation work.
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-22766) Report metrics of KafkaConsumer in Kafka new source

2021-05-25 Thread Qingsheng Ren (Jira)
Qingsheng Ren created FLINK-22766:
-

 Summary: Report metrics of KafkaConsumer in Kafka new source
 Key: FLINK-22766
 URL: https://issues.apache.org/jira/browse/FLINK-22766
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Kafka
Affects Versions: 1.13.0
Reporter: Qingsheng Ren
 Fix For: 1.14.0, 1.13.1


Currently the new Kafka source doesn't register the metrics of the KafkaConsumer 
used in KafkaPartitionSplitReader. These metrics should be exposed for debugging 
and monitoring purposes.
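
A minimal sketch of how such forwarding could look (my own illustration; the 
helper class, the "KafkaConsumer" group name, and the wiring into 
KafkaPartitionSplitReader are assumptions, not the eventual implementation):
{code:java}
import java.util.Map;

import org.apache.flink.metrics.MetricGroup;

import org.apache.kafka.common.Metric;
import org.apache.kafka.common.MetricName;

/** Hypothetical helper: expose KafkaConsumer metrics as Flink gauges. */
public final class KafkaConsumerMetricForwarder {

    /**
     * Registers every Kafka metric under a "KafkaConsumer" sub-group.
     * kafkaMetrics is the result of KafkaConsumer#metrics(); metricGroup is
     * assumed to be the metric group available to the split reader.
     */
    public static void register(
            Map<MetricName, ? extends Metric> kafkaMetrics, MetricGroup metricGroup) {
        MetricGroup consumerGroup = metricGroup.addGroup("KafkaConsumer");
        for (Map.Entry<MetricName, ? extends Metric> entry : kafkaMetrics.entrySet()) {
            Metric metric = entry.getValue();
            // Kafka's Metric#metricValue() returns Object, which fits a Flink Gauge<Object>.
            consumerGroup.gauge(entry.getKey().name(), metric::metricValue);
        }
    }
}
{code}
The forwarder would be called once per reader, e.g. 
{{register(consumer.metrics(), group)}}; deduplicating per-topic/per-partition 
metric names (for example by encoding the metric tags into the group hierarchy) 
is left out of this sketch.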



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-21920) Optimize DefaultScheduler#allocateSlots

2021-05-25 Thread Zhilong Hong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhilong Hong updated FLINK-21920:
-
Description: 
Based on the scheduler benchmark introduced in FLINK-21731, we find that 
there's a procedure related to {{DefaultScheduler#allocateSlots}} that has 
O(N^2) complexity, which is: 
{{ExecutionGraphToInputsLocationsRetrieverAdapter#getConsumedResultPartitionsProducers}}.

The original implementation is:
{code:java}
for all SchedulingExecutionVertex in DefaultScheduler:
  for all ConsumedPartitionGroup of the SchedulingExecutionVertex:
for all IntermediateResultPartition in the ConsumedPartitionGroup:
  get producer of the IntermediateResultPartition {code}
This procedure has O(N^2) complexity.

We can see that for each SchedulingExecutionVertex, the producers of its 
ConsumedPartitionGroup are calculated separately. SchedulingExecutionVertices 
in the same ConsumerVertexGroup share the same ConsumedPartitionGroup, so we 
don't need to calculate the producers over and over again; we can use a local 
cache instead. This will decrease the complexity from O(N^2) to O(N).
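
To make the caching idea concrete, here is a self-contained toy sketch 
(stand-in types, not the real scheduler classes; it only assumes, as the 
description suggests, that vertices in the same ConsumerVertexGroup share the 
same group instance):
{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of caching the producers of a shared ConsumedPartitionGroup. */
public final class ProducerCacheDemo {

    /** Stand-in for a ConsumedPartitionGroup: a list of producer vertex ids. */
    static final class PartitionGroup {
        final List<String> producerIds;

        PartitionGroup(List<String> producerIds) {
            this.producerIds = producerIds;
        }
    }

    public static void main(String[] args) {
        // Three consumer vertices in the same ConsumerVertexGroup share ONE group.
        PartitionGroup shared = new PartitionGroup(Arrays.asList("producer-1", "producer-2"));
        List<PartitionGroup> groupOfEachConsumer = Arrays.asList(shared, shared, shared);

        // Cache keyed by group identity: the producers of each distinct group are
        // resolved once instead of once per consumer vertex, i.e. O(N^2) -> O(N).
        Map<PartitionGroup, List<String>> producerCache = new IdentityHashMap<>();
        for (PartitionGroup group : groupOfEachConsumer) {
            List<String> producers =
                    producerCache.computeIfAbsent(group, g -> new ArrayList<>(g.producerIds));
            System.out.println(producers); // [producer-1, producer-2], computed only once
        }
    }
}
{code}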

 

  was:
Based on the scheduler benchmark introduced in FLINK-21731, we find that there 
are several procedures related to {{DefaultScheduler#allocateSlots}} have 
O(N^2) complexity. 

 

The first one is: 
{{ExecutionSlotSharingGroupBuilder#tryFindAvailableProducerExecutionSlotSharingGroupFor}}.
 The original implementation is: 
{code:java}
for all SchedulingExecutionVertex in DefaultScheduler:
  for all consumed SchedulingResultPartition of the SchedulingExecutionVertex:
get the result partition's producer vertex and determine whether the 
ExecutionSlotSharingGroup in which the producer vertex is located is available 
for the current vertex{code}
This procedure has O(N^2) complexity.

It's obvious that the result partitions in the same ConsumedPartitionGroup have 
the same producer vertex. So we can just iterate over the 
ConsumedPartitionGroups instead of all the consumed partitions. This will 
decrease the complexity from O(N^2) to O(N).

 

The second one is: 
{{ExecutionGraphToInputsLocationsRetrieverAdapter#getConsumedResultPartitionsProducers}}.
 The original implementation is:
{code:java}
for all SchedulingExecutionVertex in DefaultScheduler:
  for all ConsumedPartitionGroup of the SchedulingExecutionVertex:
for all IntermediateResultPartition in the ConsumedPartitionGroup:
  get producer of the IntermediateResultPartition {code}
This procedure has O(N^2) complexity.

We can see that for each SchedulingExecutionVertex, the producers of its 
ConsumedPartitionGroup are calculated separately. SchedulingExecutionVertices 
in the same ConsumerVertexGroup share the same ConsumedPartitionGroup. Thus, we 
don't need to calculate the producers over and over again; we can use a local 
cache instead. This will decrease the complexity from O(N^2) to O(N).

 


> Optimize DefaultScheduler#allocateSlots
> ---
>
> Key: FLINK-21920
> URL: https://issues.apache.org/jira/browse/FLINK-21920
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Affects Versions: 1.13.0
>Reporter: Zhilong Hong
>Assignee: Zhilong Hong
>Priority: Major
>  Labels: auto-unassigned, pull-request-available, stale-assigned
>
> Based on the scheduler benchmark introduced in FLINK-21731, we find that 
> there's a procedure related to {{DefaultScheduler#allocateSlots}} that has 
> O(N^2) complexity, which is: 
> {{ExecutionGraphToInputsLocationsRetrieverAdapter#getConsumedResultPartitionsProducers}}.
> The original implementation is:
> {code:java}
> for all SchedulingExecutionVertex in DefaultScheduler:
>   for all ConsumedPartitionGroup of the SchedulingExecutionVertex:
> for all IntermediateResultPartition in the ConsumedPartitionGroup:
>   get producer of the IntermediateResultPartition {code}
> This procedure has O(N^2) complexity.
> We can see that for each SchedulingExecutionVertex, the producers of its 
> ConsumedPartitionGroup are calculated separately. SchedulingExecutionVertices 
> in the same ConsumerVertexGroup share the same ConsumedPartitionGroup, so we 
> don't need to calculate the producers over and over again; we can use a local 
> cache instead. This will decrease the complexity from O(N^2) to O(N).
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-21920) Optimize ExecutionGraphToInputsLocationsRetrieverAdapter

2021-05-25 Thread Zhilong Hong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhilong Hong updated FLINK-21920:
-
Summary: Optimize ExecutionGraphToInputsLocationsRetrieverAdapter  (was: 
Optimize DefaultScheduler#allocateSlots)

> Optimize ExecutionGraphToInputsLocationsRetrieverAdapter
> 
>
> Key: FLINK-21920
> URL: https://issues.apache.org/jira/browse/FLINK-21920
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Affects Versions: 1.13.0
>Reporter: Zhilong Hong
>Assignee: Zhilong Hong
>Priority: Major
>  Labels: auto-unassigned, pull-request-available, stale-assigned
>
> Based on the scheduler benchmark introduced in FLINK-21731, we find that 
> there's a procedure related to {{DefaultScheduler#allocateSlots}} that has 
> O(N^2) complexity, which is: 
> {{ExecutionGraphToInputsLocationsRetrieverAdapter#getConsumedResultPartitionsProducers}}.
> The original implementation is:
> {code:java}
> for all SchedulingExecutionVertex in DefaultScheduler:
>   for all ConsumedPartitionGroup of the SchedulingExecutionVertex:
> for all IntermediateResultPartition in the ConsumedPartitionGroup:
>   get producer of the IntermediateResultPartition {code}
> This procedure has O(N^2) complexity.
> We can see that for each SchedulingExecutionVertex, the producers of its 
> ConsumedPartitionGroup are calculated separately. SchedulingExecutionVertices 
> in the same ConsumerVertexGroup share the same ConsumedPartitionGroup, so we 
> don't need to calculate the producers over and over again; we can use a local 
> cache instead. This will decrease the complexity from O(N^2) to O(N).
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-21920) Optimize ExecutionGraphToInputsLocationsRetrieverAdapter

2021-05-25 Thread Zhilong Hong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhilong Hong updated FLINK-21920:
-
Labels: pull-request-available  (was: auto-unassigned 
pull-request-available stale-assigned)

> Optimize ExecutionGraphToInputsLocationsRetrieverAdapter
> 
>
> Key: FLINK-21920
> URL: https://issues.apache.org/jira/browse/FLINK-21920
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Affects Versions: 1.13.0
>Reporter: Zhilong Hong
>Assignee: Zhilong Hong
>Priority: Major
>  Labels: pull-request-available
>
> Based on the scheduler benchmark introduced in FLINK-21731, we find that 
> there's a procedure related to {{DefaultScheduler#allocateSlots}} that has 
> O(N^2) complexity, which is: 
> {{ExecutionGraphToInputsLocationsRetrieverAdapter#getConsumedResultPartitionsProducers}}.
> The original implementation is:
> {code:java}
> for all SchedulingExecutionVertex in DefaultScheduler:
>   for all ConsumedPartitionGroup of the SchedulingExecutionVertex:
> for all IntermediateResultPartition in the ConsumedPartitionGroup:
>   get producer of the IntermediateResultPartition {code}
> This procedure has O(N^2) complexity.
> We can see that for each SchedulingExecutionVertex, the producers of its 
> ConsumedPartitionGroup are calculated separately. SchedulingExecutionVertices 
> in the same ConsumerVertexGroup share the same ConsumedPartitionGroup, so we 
> don't need to calculate the producers over and over again; we can use a local 
> cache instead. This will decrease the complexity from O(N^2) to O(N).
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-22764) Disable Values Source node in streaming plan optimization

2021-05-25 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17350894#comment-17350894
 ] 

Jark Wu commented on FLINK-22764:
-

FYI: FLIP-147 aims to fix the checkpointing problem with finished sources.


https://cwiki.apache.org/confluence/display/FLINK/FLIP-147%3A+Support+Checkpoints+After+Tasks+Finished

> Disable Values Source node in streaming plan optimization
> -
>
> Key: FLINK-22764
> URL: https://issues.apache.org/jira/browse/FLINK-22764
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.13.0
>Reporter: Leonard Xu
>Priority: Major
>
> Currently a `Values` source node may be produced during streaming plan 
> optimization by `FlinkPruneEmptyRules`, and such a `Values` source soon 
> switches to `FINISHED` status, which prevents the whole job from taking 
> checkpoints and savepoints.
>  
> I think we can add an option to enable/disable Values source inputs in 
> streaming jobs so that the job can take checkpoints/savepoints normally; for 
> the situation above we can throw an exception.
>  
>  
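
For illustration, a hypothetical way such a plan can arise (my own sketch, not 
taken from the ticket; whether FlinkPruneEmptyRules actually fires here depends 
on the planner and version):
{code:java}
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

public final class ValuesSourceSketch {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(10_000L); // checkpoint every 10s
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

        tEnv.executeSql("CREATE TABLE src (id INT) WITH ('connector' = 'datagen')");

        // The always-false branch may be reduced to an empty Values node during
        // optimization; once that source task reaches FINISHED, the otherwise
        // healthy job can no longer complete checkpoints (before FLIP-147 lands).
        tEnv.executeSql(
                "SELECT id FROM src UNION ALL SELECT id FROM src WHERE 1 = 0").print();
    }
}
{code}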



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-22767) Optimize the initialization of ExecutionSlotSharingGroupBuilder

2021-05-25 Thread Zhilong Hong (Jira)
Zhilong Hong created FLINK-22767:


 Summary: Optimize the initialization of 
ExecutionSlotSharingGroupBuilder
 Key: FLINK-22767
 URL: https://issues.apache.org/jira/browse/FLINK-22767
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Affects Versions: 1.13.0
Reporter: Zhilong Hong


Based on the scheduler benchmark introduced in FLINK-21731, we find that during 
the initialization of ExecutionSlotSharingGroupBuilder, there's a procedure 
that has O(N^2) complexity: 
{{ExecutionSlotSharingGroupBuilder#tryFindAvailableProducerExecutionSlotSharingGroupFor}}.
 This initialization happens during the initialization of 
LocalInputPreferredSlotSharingStrategy. 

The original implementation is: 
{code:java}
for all SchedulingExecutionVertex in DefaultScheduler:
  for all consumed SchedulingResultPartition of the SchedulingExecutionVertex:
get the result partition's producer vertex and determine whether the 
ExecutionSlotSharingGroup in which the producer vertex is located is available 
for the current vertex{code}
This procedure has O(N^2) complexity.

It's obvious that the result partitions in the same ConsumedPartitionGroup have 
the same producer vertex. So we can just iterate over the 
ConsumedPartitionGroups instead of all the consumed partitions. This will 
decrease the complexity from O(N^2) to O(N).

The optimization of this procedure will speed up the initialization of 
DefaultScheduler. It will accelerate the submission of a new job, especially 
for OLAP jobs.
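
As a toy illustration of why iterating over groups suffices (stand-in types 
again, not the real strategy code):
{code:java}
import java.util.Arrays;
import java.util.List;

/** Toy model: every partition in a group has the same producer vertex. */
public final class GroupIterationDemo {

    static final class PartitionGroup {
        final String producerVertex;     // shared by all partitions of the group
        final List<String> partitionIds; // stand-ins for the individual partitions

        PartitionGroup(String producerVertex, List<String> partitionIds) {
            this.producerVertex = producerVertex;
            this.partitionIds = partitionIds;
        }
    }

    public static void main(String[] args) {
        List<PartitionGroup> consumedGroups =
                Arrays.asList(new PartitionGroup("v-1", Arrays.asList("p-1", "p-2", "p-3")));

        // Old: one producer lookup per partition (3 lookups here) -> O(N^2) overall.
        // New: one lookup per group (1 lookup here) -> O(N) overall.
        for (PartitionGroup group : consumedGroups) {
            String producer = group.producerVertex;
            System.out.println("check slot sharing group of producer " + producer);
        }
    }
}
{code}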



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #15877: [FLINK-22612][python] Restructure the coders in PyFlink

2021-05-25 Thread GitBox


flinkbot edited a comment on pull request #15877:
URL: https://github.com/apache/flink/pull/15877#issuecomment-836433504


   
   ## CI report:
   
   * be2f68b4cd7086048d5a16c1ab5ee1661908a0c0 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18297)
 
   * dc679297beeaf91ca605574190d847262104e379 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-22464) OperatorEventSendingCheckpointITCase.testOperatorEventLostWithReaderFailure hangs with `AdaptiveScheduler`

2021-05-25 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17350895#comment-17350895
 ] 

Robert Metzger commented on FLINK-22464:


The issue is reproducible 100% of the time locally with current master.
The job is in a restart loop due to this error:

{code}
22:48:45,251 [Checkpoint Timer] INFO  
org.apache.flink.runtime.checkpoint.CheckpointCoordinator[] - Triggering 
checkpoint 2 (type=CHECKPOINT) @ 1621896525250 for job 
566fd0f105339e3cecdfb994df6e1cf9.
22:48:45,251 [flink-akka.actor.default-dispatcher-2] INFO  
org.apache.flink.runtime.operators.coordination.OperatorCoordinatorHolder [] - 
Coordinator checkpoint 2 for coordinator cbc357ccb763df2852fee8c4fc7d55f2 is 
awaiting 1 pending events
22:48:45,704 [Checkpoint Timer] WARN  
org.apache.flink.runtime.checkpoint.CheckpointCoordinator[] - Failed to 
trigger checkpoint 2 for job 566fd0f105339e3cecdfb994df6e1cf9. (1 consecutive 
failed attempts so far)
org.apache.flink.util.FlinkException: Failing OperatorCoordinator checkpoint 
because some OperatorEvents before this checkpoint barrier were not received by 
the target tasks.
at 
org.apache.flink.runtime.operators.coordination.OperatorCoordinatorHolder.lambda$completeCheckpointOnceEventsAreDone$4(OperatorCoordinatorHolder.java:344)
 ~[flink-runtime_2.11-1.14-SNAPSHOT.jar:?]
at 
java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
 ~[?:1.8.0_282]
at 
java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
 ~[?:1.8.0_282]
at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) 
~[?:1.8.0_282]
at 
java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
 ~[?:1.8.0_282]
at 
org.apache.flink.runtime.concurrent.FutureUtils$WaitingConjunctFuture.handleCompletedFuture(FutureUtils.java:905)
 ~[flink-runtime_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
at 
java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
 ~[?:1.8.0_282]
at 
java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
 ~[?:1.8.0_282]
at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) 
~[?:1.8.0_282]
at 
java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
 ~[?:1.8.0_282]
at 
org.apache.flink.runtime.concurrent.FutureUtils.lambda$forwardTo$23(FutureUtils.java:1356)
 ~[flink-runtime_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
at 
java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
 ~[?:1.8.0_282]
at 
java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
 ~[?:1.8.0_282]
at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) 
~[?:1.8.0_282]
at 
java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
 ~[?:1.8.0_282]
at 
org.apache.flink.runtime.concurrent.FutureUtils$Timeout.run(FutureUtils.java:1255)
 ~[flink-runtime_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
at 
org.apache.flink.runtime.concurrent.DirectExecutorService.execute(DirectExecutorService.java:217)
 ~[flink-runtime_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
at 
org.apache.flink.runtime.concurrent.FutureUtils.lambda$orTimeout$15(FutureUtils.java:582)
 ~[flink-runtime_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[?:1.8.0_282]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
[?:1.8.0_282]
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
 [?:1.8.0_282]
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
 [?:1.8.0_282]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[?:1.8.0_282]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[?:1.8.0_282]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_282]
22:48:45,706 [flink-akka.actor.default-dispatcher-3] INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph   [] - Source: 
numbers -> Map -> Sink: Data stream collect sink (1/1) 
(fe486f6c9a46e270c73a9832d6e2aab5) switched from RUNNING to FAILED on 
1072e91c-cd0e-4c5a-99fe-46fa9f65f579 @ localhost (dataPort=-1).
org.apache.flink.util.FlinkException: An OperatorEvent from an 
OperatorCoordinator to a task was lost. Triggering task failover to ensure 
consistency. Event: 'AddSplitEvents[[[B@57ec5d1f]]', targetTask: Source: 
numbers -> Map -> Sink: Data stream collect sink (1/1) - execution #0
at 
org

[GitHub] [flink] flinkbot commented on pull request #15995: [FLINK-22760][Hive] Fix issue of HiveParser::setCurrentTimestamp fails with hive-3.1.2

2021-05-25 Thread GitBox


flinkbot commented on pull request #15995:
URL: https://github.com/apache/flink/pull/15995#issuecomment-847655258


   
   ## CI report:
   
   * 031d0fe9b8d8ab2050ad378d158788e51495d0e7 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15959: [FLINK-22684][runtime] Added ability to ignore in-flight data during …

2021-05-25 Thread GitBox


flinkbot edited a comment on pull request #15959:
URL: https://github.com/apache/flink/pull/15959#issuecomment-844265660


   
   ## CI report:
   
   * a0b3af8f3d5503bbd3cb16c208521ce515ef6c67 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18300)
 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18243)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #15996: FLINK-22663 Stop YARN NMClient get stuck if it try to stop the containers on dead NodeManagers

2021-05-25 Thread GitBox


flinkbot commented on pull request #15996:
URL: https://github.com/apache/flink/pull/15996#issuecomment-847655364


   
   ## CI report:
   
   * 9589ed9beb169f69822faf6f51e1b162c755709a UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] AHeise commented on pull request #15156: [FLINK-21393] [formats] Implement ParquetAvroInputFormat

2021-05-25 Thread GitBox


AHeise commented on pull request #15156:
URL: https://github.com/apache/flink/pull/15156#issuecomment-847655919


   > @AHeise The ParquetInputFormat base class was removed after I submitted my PR, 
hence the compilation issues; commit 
[ce3631a](https://github.com/apache/flink/commit/ce3631af7313855f675e29b8faa386f6e5a2d43c)
 removed it. This commit mentions "Use the filesystem connector with a Parquet 
format as a replacement". I guess it refers to 
https://ci.apache.org/projects/flink/flink-docs-master/docs/connectors/table/filesystem/
 which is SQL based. But what if our pipeline does not use SQL but the 
DataSet API?
   
   It's a good and tough question. I spoke to @twalthr offline, and for now the 
plan is as follows:
   * The Table API drops the old planner, and thus most of the removed code in that 
commit is dead code.
   * The Table API, running on DataStream, will support everything that used to work 
when it ran on DataSet.
   * 1.13 will be the last release with full DataSet support 
(BatchTableEnvironment will be dropped).
   
   Since a few features of DataSet are still not supported in DataStream, we 
expect users to stick to 1.13 for a longer time and probably skip 1.14. So 
we'd suggest merging your two PRs into release-1.13 instead of master. Then all 
DataSet users would benefit from your contributions while we unblock future 
developments that would break DataSet as of 1.13.
   
   If it turns out that the community wants to have these features in 1.14 for 
some reason (e.g. the Table API is not as far along as it should be), we can 
still re-add `ParquetInputFormat` and forward-port your PR before the feature 
freeze in three months.
   
   Note 1: It might still be possible to have a Flink 1.14 DataSet application 
using 1.13 formats. In general, it's always possible to copy the old code into 
your own project.
   Note 2: If you are missing combinable aggregations in DataStream, maybe it 
would be better to move to the Table API to begin with. At this point, no one 
really knows how much of DataSet will be ported to DataStream. It doesn't 
really make sense to have two APIs with high-level primitives like joins. The 
main idea is to use the Table API by default, with a rich user experience, and go 
down to DataStream only when needed (timers, user state, ...).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-22762) cannot use set table.sql-dialect=hive;

2021-05-25 Thread xueluo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xueluo updated FLINK-22762:
---
Attachment: image-2021-05-25-16-15-40-431.png

> cannot use  set table.sql-dialect=hive;
> ---
>
> Key: FLINK-22762
> URL: https://issues.apache.org/jira/browse/FLINK-22762
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.13.0
> Environment: flink 1.13  
> hive 3.12
>Reporter: xueluo
>Priority: Major
> Attachments: image-2021-05-25-10-11-49-944.png, 
> image-2021-05-25-10-14-22-111.png, image-2021-05-25-16-15-40-431.png
>
>
> sh sql-client.sh
> *CREATE CATALOG* myhive *WITH* (
>      'type' *=* 'hive',
>      'default-database' = 'default',
>      'hive-conf-dir' = '/data/hive/conf/'
>  );
>  
> USE CATALOG myhive;
>  set table.sql-dialect=hive;
> then "show tables;" or any other command fails with an error
> !image-2021-05-25-10-14-22-111.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-22762) cannot use set table.sql-dialect=hive;

2021-05-25 Thread xueluo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xueluo updated FLINK-22762:
---
Attachment: (was: image-2021-05-25-10-11-49-944.png)

> cannot use  set table.sql-dialect=hive;
> ---
>
> Key: FLINK-22762
> URL: https://issues.apache.org/jira/browse/FLINK-22762
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.13.0
> Environment: flink 1.13  
> hive 3.12
>Reporter: xueluo
>Priority: Major
> Attachments: image-2021-05-25-10-14-22-111.png, 
> image-2021-05-25-16-15-40-431.png
>
>
> sh sql-client.sh
> *CREATE CATALOG* myhive *WITH* (
>      'type' *=* 'hive',
>      'default-database' = 'default',
>      'hive-conf-dir' = '/data/hive/conf/'
>  );
>  
> USE CATALOG myhive;
>  set table.sql-dialect=hive;
> then "show tables;" or any other command fails with an error
> !image-2021-05-25-10-14-22-111.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-22762) cannot use set table.sql-dialect=hive;

2021-05-25 Thread xueluo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xueluo updated FLINK-22762:
---
Description: 
sh sql-client.sh

*CREATE CATALOG* myhive *WITH* (
     'type' *=* 'hive',
     'default-database' = 'default',
     'hive-conf-dir' = '/data/hive/conf/'
 );

 

USE CATALOG myhive;

 set table.sql-dialect=hive;

then "show tables;" or any other command fails with an error

!image-2021-05-25-10-14-22-111.png!

!image-2021-05-25-16-15-40-431.png!

  was:
sh sql-client.sh

*CREATE CATALOG* myhive *WITH* (
     'type' *=* 'hive',
     'default-database' = 'default',
     'hive-conf-dir' = '/data/hive/conf/'
 );

 

USE CATALOG myhive;

 set table.sql-dialect=hive;

then "show tables;" or any other command fails with an error

!image-2021-05-25-10-14-22-111.png!


> cannot use  set table.sql-dialect=hive;
> ---
>
> Key: FLINK-22762
> URL: https://issues.apache.org/jira/browse/FLINK-22762
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.13.0
> Environment: flink 1.13  
> hive 3.12
>Reporter: xueluo
>Priority: Major
> Attachments: image-2021-05-25-10-14-22-111.png, 
> image-2021-05-25-16-15-40-431.png
>
>
> sh sql-client.sh
> *CREATE CATALOG* myhive *WITH* (
>      'type' *=* 'hive',
>      'default-database' = 'default',
>      'hive-conf-dir' = '/data/hive/conf/'
>  );
>  
> USE CATALOG myhive;
>  set table.sql-dialect=hive;
> then "show tables;" or any other command fails with an error
> !image-2021-05-25-10-14-22-111.png!
> !image-2021-05-25-16-15-40-431.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-22762) cannot use set table.sql-dialect=hive;

2021-05-25 Thread xueluo (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17350908#comment-17350908
 ] 

xueluo commented on FLINK-22762:


[~luoyuxia] I added a screenshot of the Flink lib directory.

> cannot use  set table.sql-dialect=hive;
> ---
>
> Key: FLINK-22762
> URL: https://issues.apache.org/jira/browse/FLINK-22762
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.13.0
> Environment: flink 1.13  
> hive 3.12
>Reporter: xueluo
>Priority: Major
> Attachments: image-2021-05-25-10-14-22-111.png, 
> image-2021-05-25-16-15-40-431.png, image-2021-05-25-16-28-05-992.png
>
>
> sh sql-client.sh
> *CREATE CATALOG* myhive *WITH* (
>      'type' *=* 'hive',
>      'default-database' = 'default',
>      'hive-conf-dir' = '/data/hive/conf/'
>  );
>  
> USE CATALOG myhive;
>  set table.sql-dialect=hive;
> then "show tables;" or any other command fails with an error
> !image-2021-05-25-10-14-22-111.png!
> !image-2021-05-25-16-15-40-431.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-22663) Release YARN resource very slow when cancel the job after some NodeManagers shutdown

2021-05-25 Thread Jinhong Liu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17350909#comment-17350909
 ] 

Jinhong Liu edited comment on FLINK-22663 at 5/25/21, 8:32 AM:
---

[~fly_in_gis] 

I read the source code of Flink: when we try to cancel the job, 
YarnResourceManagerDriver calls the "terminate" method. This method tries to 
stop the NMClient it created, and the NMClient will try to clean up all running 
containers by default.

If we look at what it does when cleaning up the running containers in the method 
"cleanupRunningContainers", we find it iterates over all the running containers 
and tries to stop them one by one. But if a container's NodeManager is dead, it 
will get stuck and the remaining containers can't be stopped. So I think this is 
the root cause of why almost all the containers cannot exit: the clean-up 
process is serial and synchronous.
{code:java}
public CompletableFuture<Void> terminate() {
// shut down all components
Exception exception = null;

if (resourceManagerClient != null) {
try {
resourceManagerClient.stop();
} catch (Exception e) {
exception = e;
}
}

if (nodeManagerClient != null) {
try {
nodeManagerClient.stop();
} catch (Exception e) {
exception = ExceptionUtils.firstOrSuppressed(e, exception);
}
}

return exception == null
? FutureUtils.completedVoidFuture()
: FutureUtils.completedExceptionally(exception);
}


protected synchronized void cleanupRunningContainers() {
  for (StartedContainer startedContainer : startedContainers.values()) {
try {
  stopContainer(startedContainer.getContainerId(),
  startedContainer.getNodeId());
} catch (YarnException e) {
  LOG.error("Failed to stop Container " +
  startedContainer.getContainerId() +
  "when stopping NMClientImpl");
} catch (IOException e) {
  LOG.error("Failed to stop Container " +
  startedContainer.getContainerId() +
  "when stopping NMClientImpl");
}
  }
}
{code}


was (Author: jinhongliu):
[~fly_in_gis] 

I read the source code of Flink, when we try to cancel the job, 
YarnResourceManagerDriver calls the "terminate" method. This method tries to 
stop the NMClient it created and NMClient will try to clean up all running 
container as default.

If we see what it does when cleaning up the running containers in the method 
"cleanupRunningContainers", we find it iterates all the running containers and 
tries to stop the container one by one. But if the container's NodeManager is 
dead, it will get stuck and all the left containers can't be stopped. So I 
think this is the root cause of almost all the containers that cannot exist 
because the clean-up process is serial and synchronous.
{code:java}
public CompletableFuture<Void> terminate() {
// shut down all components
Exception exception = null;

if (resourceManagerClient != null) {
try {
resourceManagerClient.stop();
} catch (Exception e) {
exception = e;
}
}

if (nodeManagerClient != null) {
try {
nodeManagerClient.stop();
} catch (Exception e) {
exception = ExceptionUtils.firstOrSuppressed(e, exception);
}
}

return exception == null
? FutureUtils.completedVoidFuture()
: FutureUtils.completedExceptionally(exception);
}


protected synchronized void cleanupRunningContainers() {
  for (StartedContainer startedContainer : startedContainers.values()) {
try {
  stopContainer(startedContainer.getContainerId(),
  startedContainer.getNodeId());
} catch (YarnException e) {
  LOG.error("Failed to stop Container " +
  startedContainer.getContainerId() +
  "when stopping NMClientImpl");
} catch (IOException e) {
  LOG.error("Failed to stop Container " +
  startedContainer.getContainerId() +
  "when stopping NMClientImpl");
}
  }
}
{code}

> Release YARN resource very slow when cancel the job after some NodeManagers 
> shutdown
> 
>
> Key: FLINK-22663
> URL: https://issues.apache.org/jira/browse/FLINK-22663
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Affects Versions: 1.12.2
>Reporter: Jinhong Liu
>Priority: Major
>  Labels: YARN, pull-request-available
>
> When I test Flink on YARN, there is a case that may cause some problems.
> Hadoop Version: 2.7.3
> Flink Version: 1.12.2
> I deploy a Flink job on YARN. While the job is running I stop one NodeManager, 
> and after one or two minutes the job is auto-recovered. But in this situa

[jira] [Comment Edited] (FLINK-22663) Release YARN resource very slow when cancel the job after some NodeManagers shutdown

2021-05-25 Thread Jinhong Liu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17350909#comment-17350909
 ] 

Jinhong Liu edited comment on FLINK-22663 at 5/25/21, 8:32 AM:
---

[~fly_in_gis] 

I read the source code of Flink: when we try to cancel the job, 
YarnResourceManagerDriver calls the "terminate" method. This method tries to 
stop the NMClient it created, and the NMClient will try to clean up all running 
containers by default.

If we look at what it does when cleaning up the running containers in the method 
"cleanupRunningContainers", we find it iterates over all the running containers 
and tries to stop them one by one. But if a container's NodeManager is dead, it 
will get stuck and the remaining containers can't be stopped. So I think this is 
the root cause of why almost all the containers cannot exit: the clean-up 
process is serial and synchronous.
{code:java}
public CompletableFuture<Void> terminate() {
// shut down all components
Exception exception = null;

if (resourceManagerClient != null) {
try {
resourceManagerClient.stop();
} catch (Exception e) {
exception = e;
}
}

if (nodeManagerClient != null) {
try {
nodeManagerClient.stop();
} catch (Exception e) {
exception = ExceptionUtils.firstOrSuppressed(e, exception);
}
}

return exception == null
? FutureUtils.completedVoidFuture()
: FutureUtils.completedExceptionally(exception);
}


protected synchronized void cleanupRunningContainers() {
  for (StartedContainer startedContainer : startedContainers.values()) {
try {
  stopContainer(startedContainer.getContainerId(),
  startedContainer.getNodeId());
} catch (YarnException e) {
  LOG.error("Failed to stop Container " +
  startedContainer.getContainerId() +
  "when stopping NMClientImpl");
} catch (IOException e) {
  LOG.error("Failed to stop Container " +
  startedContainer.getContainerId() +
  "when stopping NMClientImpl");
}
  }
}
{code}


was (Author: jinhongliu):
[~fly_in_gis] 

I read the source code of Flink, when we try to cancel the job, 
YarnResourceManagerDriver calls the "terminate" method. This method tries to 
stop the NMClient it created.

If we see what it does when cleaning up the running containers in the method 
"cleanupRunningContainers", we find it iterates all the running containers and 
tries to stop the container one by one. But if the container's NodeManager is 
dead, it will get stuck and all the left containers can't be stopped. So I 
think this is the root cause of almost all the containers that cannot exist 
because the clean-up process is serial and synchronous.
{code:java}
public CompletableFuture<Void> terminate() {
// shut down all components
Exception exception = null;

if (resourceManagerClient != null) {
try {
resourceManagerClient.stop();
} catch (Exception e) {
exception = e;
}
}

if (nodeManagerClient != null) {
try {
nodeManagerClient.stop();
} catch (Exception e) {
exception = ExceptionUtils.firstOrSuppressed(e, exception);
}
}

return exception == null
? FutureUtils.completedVoidFuture()
: FutureUtils.completedExceptionally(exception);
}


protected synchronized void cleanupRunningContainers() {
  for (StartedContainer startedContainer : startedContainers.values()) {
try {
  stopContainer(startedContainer.getContainerId(),
  startedContainer.getNodeId());
} catch (YarnException e) {
  LOG.error("Failed to stop Container " +
  startedContainer.getContainerId() +
  "when stopping NMClientImpl");
} catch (IOException e) {
  LOG.error("Failed to stop Container " +
  startedContainer.getContainerId() +
  "when stopping NMClientImpl");
}
  }
}
{code}

> Release YARN resource very slow when cancel the job after some NodeManagers 
> shutdown
> 
>
> Key: FLINK-22663
> URL: https://issues.apache.org/jira/browse/FLINK-22663
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Affects Versions: 1.12.2
>Reporter: Jinhong Liu
>Priority: Major
>  Labels: YARN, pull-request-available
>
> When I test Flink on YARN, there is a case that may cause some problems.
> Hadoop Version: 2.7.3
> Flink Version: 1.12.2
> I deploy a Flink job on YARN. While the job is running I stop one NodeManager, 
> and after one or two minutes the job is auto-recovered. But in this situation, 
> if I cancel the job, the containers cannot be released immed

[jira] [Closed] (FLINK-22758) java.lang.IllegalStateException: Trying to access closed classloader

2021-05-25 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-22758.

Resolution: Duplicate

> java.lang.IllegalStateException: Trying to access closed classloader
> 
>
> Key: FLINK-22758
> URL: https://issues.apache.org/jira/browse/FLINK-22758
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Client
>Affects Versions: 1.12.3
>Reporter: ighack
>Priority: Major
>
> When I exit sql-client.sh I get an error:
>  
> Shutting down the session...
> Shutting down the session...done.
> Exception in thread "Thread-4" java.lang.IllegalStateException: Trying to access closed classloader. Please check if you store classloaders directly or indirectly in static fields. If the stacktrace suggests that the leak occurs in a third party library and cannot be fixed immediately, you can disable this check with the configuration 'classloader.check-leaked-classloader'.
> at org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.ensureInner(FlinkUserCodeClassLoaders.java:164)
> at org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.getResource(FlinkUserCodeClassLoaders.java:183)
> at org.apache.hadoop.conf.Configuration.getResource(Configuration.java:2647)
> at org.apache.hadoop.conf.Configuration.getStreamReader(Configuration.java:2905)
> at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2864)
> at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2838)
> at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2715)
> at org.apache.hadoop.conf.Configuration.get(Configuration.java:1186)
> at org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1774)
> at org.apache.hadoop.util.ShutdownHookManager.getShutdownTimeout(ShutdownHookManager.java:183)
> at org.apache.hadoop.util.ShutdownHookManager.shutdownExecutor(ShutdownHookManager.java:145)
> at org.apache.hadoop.util.ShutdownHookManager.access$300(ShutdownHookManager.java:65)
> at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:102)
>  
>  
> I use CDH 6.3.2
> hive 2.1.1
> java version "1.8.0_271"
> flink-1.12.3
>  
> in $FLINK_HOME/lib I add
> flink-connector-hive_2.11-1.12.3.jar
> hive-exec-2.1.1-cdh6.3.2.jar
> libfb303-0.9.3.jar



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-22663) Release YARN resource very slow when cancel the job after some NodeManagers shutdown

2021-05-25 Thread Jinhong Liu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17350909#comment-17350909
 ] 

Jinhong Liu edited comment on FLINK-22663 at 5/25/21, 8:34 AM:
---

[~fly_in_gis] 

I read the source code of Flink: when we try to cancel the job, 
YarnResourceManagerDriver calls the "terminate" method. This method tries to 
stop the NMClient it created, and the NMClient will try to clean up all running 
containers by default.

If we look at what it does when cleaning up the running containers in the method 
"cleanupRunningContainers", we find it iterates over all the started containers 
and tries to stop them one by one. But if a container's NodeManager is dead, it 
will get stuck and the remaining containers can't be stopped. So I think this is 
the root cause of why almost all the containers cannot exit: the clean-up 
process is serial and synchronous.
{code:java}
// org.apache.flink.yarn.YarnResourceManagerDriver
public CompletableFuture<Void> terminate() {
// shut down all components
Exception exception = null;

if (resourceManagerClient != null) {
try {
resourceManagerClient.stop();
} catch (Exception e) {
exception = e;
}
}

if (nodeManagerClient != null) {
try {
nodeManagerClient.stop();
} catch (Exception e) {
exception = ExceptionUtils.firstOrSuppressed(e, exception);
}
}

return exception == null
? FutureUtils.completedVoidFuture()
: FutureUtils.completedExceptionally(exception);
}

// org.apache.hadoop.yarn.client.api.impl.NMClientImpl
protected synchronized void cleanupRunningContainers() {
  for (StartedContainer startedContainer : startedContainers.values()) {
try {
  stopContainer(startedContainer.getContainerId(),
  startedContainer.getNodeId());
} catch (YarnException e) {
  LOG.error("Failed to stop Container " +
  startedContainer.getContainerId() +
  "when stopping NMClientImpl");
} catch (IOException e) {
  LOG.error("Failed to stop Container " +
  startedContainer.getContainerId() +
  "when stopping NMClientImpl");
}
  }
}
{code}
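
One way to keep a dead NodeManager from blocking the whole cleanup would be to 
stop the containers concurrently with a bounded wait. This is only a sketch of 
that idea, reusing the NMClientImpl members shown above (startedContainers, 
stopContainer, LOG); the pool size and timeout are illustrative, and this is 
not necessarily what the linked PR implements:
{code:java}
// Sketch: parallel variant of cleanupRunningContainers() with a bounded wait.
// Requires java.util.concurrent.{ExecutorService, Executors, TimeUnit}.
protected synchronized void cleanupRunningContainersConcurrently() {
    ExecutorService stopPool = Executors.newFixedThreadPool(8);
    for (StartedContainer startedContainer : startedContainers.values()) {
        stopPool.submit(() -> {
            try {
                stopContainer(startedContainer.getContainerId(),
                        startedContainer.getNodeId());
            } catch (YarnException | IOException e) {
                LOG.error("Failed to stop Container " +
                        startedContainer.getContainerId() +
                        " when stopping NMClientImpl", e);
            }
        });
    }
    stopPool.shutdown();
    try {
        // A container on a dead NodeManager can now only stall its own worker
        // thread; the rest of the containers are stopped in parallel.
        if (!stopPool.awaitTermination(30, TimeUnit.SECONDS)) {
            stopPool.shutdownNow();
        }
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
    }
}
{code}
Whether 8 threads and 30 seconds are sensible defaults would need discussion; 
the point is only that the stop calls no longer serialize behind an unreachable 
NodeManager.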


was (Author: jinhongliu):
[~fly_in_gis] 

I read the source code of Flink, when we try to cancel the job, 
YarnResourceManagerDriver calls the "terminate" method. This method tries to 
stop the NMClient it created and NMClient will try to clean up all running 
containers as default.

If we see what it does when cleaning up the running containers in the method 
"cleanupRunningContainers", we find it iterates all the running containers and 
tries to stop the container one by one. But if the container's NodeManager is 
dead, it will get stuck and all the left containers can't be stopped. So I 
think this is the root cause of almost all the containers that cannot exist 
because the clean-up process is serial and synchronous.
{code:java}
public CompletableFuture<Void> terminate() {
// shut down all components
Exception exception = null;

if (resourceManagerClient != null) {
try {
resourceManagerClient.stop();
} catch (Exception e) {
exception = e;
}
}

if (nodeManagerClient != null) {
try {
nodeManagerClient.stop();
} catch (Exception e) {
exception = ExceptionUtils.firstOrSuppressed(e, exception);
}
}

return exception == null
? FutureUtils.completedVoidFuture()
: FutureUtils.completedExceptionally(exception);
}


protected synchronized void cleanupRunningContainers() {
  for (StartedContainer startedContainer : startedContainers.values()) {
try {
  stopContainer(startedContainer.getContainerId(),
  startedContainer.getNodeId());
} catch (YarnException e) {
  LOG.error("Failed to stop Container " +
  startedContainer.getContainerId() +
  "when stopping NMClientImpl");
} catch (IOException e) {
  LOG.error("Failed to stop Container " +
  startedContainer.getContainerId() +
  "when stopping NMClientImpl");
}
  }
}
{code}

> Release YARN resource very slow when cancel the job after some NodeManagers 
> shutdown
> 
>
> Key: FLINK-22663
> URL: https://issues.apache.org/jira/browse/FLINK-22663
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Affects Versions: 1.12.2
>Reporter: Jinhong Liu
>Priority: Major
>  Labels: YARN, pull-request-available
>
> When I test Flink on YARN, there is a case that may cause some problems.
> Hadoop Version: 2.7.3
> Flink Version: 1.12.2
> I deploy a Flink job on YARN. While the job is

[jira] [Updated] (FLINK-22762) cannot use set table.sql-dialect=hive;

2021-05-25 Thread xueluo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xueluo updated FLINK-22762:
---
Description: 
 

sh sql-client.sh

*CREATE CATALOG* myhive *WITH* (
     'type' *=* 'hive',
     'default-database' = 'default',
     'hive-conf-dir' = '/data/hive/conf/'
 );

 

USE CATALOG myhive;

 set table.sql-dialect=hive;

then "show tables;" or any other command fails with an error

!image-2021-05-25-10-14-22-111.png!

!image-2021-05-25-16-15-40-431.png!

!image-2021-05-25-16-28-05-992.png!

  was:
sh sql-client.sh

*CREATE CATALOG* myhive *WITH* (
     'type' *=* 'hive',
     'default-database' = 'default',
     'hive-conf-dir' = '/data/hive/conf/'
 );

 

USE CATALOG myhive;

 set table.sql-dialect=hive;

then "show tables;" or any other command fails with an error

!image-2021-05-25-10-14-22-111.png!

!image-2021-05-25-16-15-40-431.png!


> cannot use  set table.sql-dialect=hive;
> ---
>
> Key: FLINK-22762
> URL: https://issues.apache.org/jira/browse/FLINK-22762
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.13.0
> Environment: flink 1.13  
> hive 3.12
>Reporter: xueluo
>Priority: Major
> Attachments: image-2021-05-25-10-14-22-111.png, 
> image-2021-05-25-16-15-40-431.png, image-2021-05-25-16-28-05-992.png
>
>
>  
> sh sql-client.sh
> *CREATE CATALOG* myhive *WITH* (
>      'type' *=* 'hive',
>      'default-database' = 'default',
>      'hive-conf-dir' = '/data/hive/conf/'
>  );
>  
> USE CATALOG myhive;
>  set table.sql-dialect=hive;
> then "show tables;" or any other command fails with an error
> !image-2021-05-25-10-14-22-111.png!
> !image-2021-05-25-16-15-40-431.png!
> !image-2021-05-25-16-28-05-992.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-22762) cannot use set table.sql-dialect=hive;

2021-05-25 Thread xueluo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xueluo updated FLINK-22762:
---
Attachment: image-2021-05-25-16-28-05-992.png

> cannot use  set table.sql-dialect=hive;
> ---
>
> Key: FLINK-22762
> URL: https://issues.apache.org/jira/browse/FLINK-22762
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.13.0
> Environment: flink 1.13  
> hive 3.12
>Reporter: xueluo
>Priority: Major
> Attachments: image-2021-05-25-10-14-22-111.png, 
> image-2021-05-25-16-15-40-431.png, image-2021-05-25-16-28-05-992.png
>
>
> sh sql-client.sh
> *CREATE CATALOG* myhive *WITH* (
>      'type' *=* 'hive',
>      'default-database' = 'default',
>      'hive-conf-dir' = '/data/hive/conf/'
>  );
>  
> USE CATALOG myhive;
>  set table.sql-dialect=hive;
> then "show tables;" or any other command fails with an error
> !image-2021-05-25-10-14-22-111.png!
> !image-2021-05-25-16-15-40-431.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-22663) Release YARN resource very slow when cancel the job after some NodeManagers shutdown

2021-05-25 Thread Jinhong Liu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17350909#comment-17350909
 ] 

Jinhong Liu commented on FLINK-22663:
-

[~fly_in_gis] 

I read the source code of Flink: when we try to cancel the job, 
YarnResourceManagerDriver calls the "terminate" method. This method tries to 
stop the NMClient it created.

If we look at what it does when cleaning up the running containers in the method 
"cleanupRunningContainers", we find it iterates over all the running containers 
and tries to stop them one by one. But if a container's NodeManager is dead, it 
will get stuck and the remaining containers can't be stopped. So I think this is 
the root cause of why almost all the containers cannot exit: the clean-up 
process is serial and synchronous.
{code:java}
public CompletableFuture<Void> terminate() {
// shut down all components
Exception exception = null;

if (resourceManagerClient != null) {
try {
resourceManagerClient.stop();
} catch (Exception e) {
exception = e;
}
}

if (nodeManagerClient != null) {
try {
nodeManagerClient.stop();
} catch (Exception e) {
exception = ExceptionUtils.firstOrSuppressed(e, exception);
}
}

return exception == null
? FutureUtils.completedVoidFuture()
: FutureUtils.completedExceptionally(exception);
}


protected synchronized void cleanupRunningContainers() {
  for (StartedContainer startedContainer : startedContainers.values()) {
try {
  stopContainer(startedContainer.getContainerId(),
  startedContainer.getNodeId());
} catch (YarnException e) {
  LOG.error("Failed to stop Container " +
  startedContainer.getContainerId() +
  "when stopping NMClientImpl");
} catch (IOException e) {
  LOG.error("Failed to stop Container " +
  startedContainer.getContainerId() +
  "when stopping NMClientImpl");
}
  }
}
{code}

> Release YARN resource very slow when cancel the job after some NodeManagers 
> shutdown
> 
>
> Key: FLINK-22663
> URL: https://issues.apache.org/jira/browse/FLINK-22663
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Affects Versions: 1.12.2
>Reporter: Jinhong Liu
>Priority: Major
>  Labels: YARN, pull-request-available
>
> When I test Flink on YARN, there is a case that may cause some problems.
> Hadoop Version: 2.7.3
> Flink Version: 1.12.2
> I deploy a Flink job on YARN. While the job is running I stop one NodeManager, 
> and after one or two minutes the job is auto-recovered. But in this situation, 
> if I cancel the job, the containers cannot be released immediately; there are 
> still some running containers, including the app master. About 5 
> minutes later these containers exit, and about 10 minutes later the app 
> master exits.
> I checked the log of the app master; it seems it tries to stop the containers on 
> the NodeManager which I have already stopped.
> {code:java}
> 2021-05-14 06:15:17,389 INFO  
> org.apache.flink.runtime.executiongraph.ExecutionGraph   [] - Job class 
> tv.freewheel.reporting.fastlane.Fastlane$ (da883ab39a7a82e4d45a3803bc77dd6f) 
> switched from state CANCELLING to CANCELED.
> 2021-05-14 06:15:17,389 INFO  
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator[] - Stopping 
> checkpoint coordinator for job da883ab39a7a82e4d45a3803bc77dd6f.
> 2021-05-14 06:15:17,390 INFO  
> org.apache.flink.runtime.checkpoint.StandaloneCompletedCheckpointStore [] - 
> Shutting down
> 2021-05-14 06:15:17,408 INFO  
> org.apache.flink.runtime.dispatcher.MiniDispatcher   [] - Job 
> da883ab39a7a82e4d45a3803bc77dd6f reached globally terminal state CANCELED.
> 2021-05-14 06:15:17,409 INFO  
> org.apache.flink.runtime.dispatcher.MiniDispatcher   [] - Shutting 
> down cluster with state CANCELED, jobCancelled: true, executionMode: DETACHED
> 2021-05-14 06:15:17,409 INFO  
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint[] - Shutting 
> YarnJobClusterEntrypoint down with application status CANCELED. Diagnostics 
> null.
> 2021-05-14 06:15:17,409 INFO  
> org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint [] - Shutting 
> down rest endpoint.
> 2021-05-14 06:15:17,420 INFO  org.apache.flink.runtime.jobmaster.JobMaster
>  [] - Stopping the JobMaster for job class 
> tv.freewheel.reporting.fastlane.Fastlane$(da883ab39a7a82e4d45a3803bc77dd6f).
> 2021-05-14 06:15:17,422 INFO  
> org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint [] - Removing 
> cache directory 
> /tmp/flink-web-af72a00c-0ddd-4e5e-a62c-8244d6caa552/flink-web-ui
> 2021-05-14 06:15:17,4

[GitHub] [flink] yittg commented on pull request #15501: [FLINK-22054][k8s] Using a shared watcher for ConfigMap watching

2021-05-25 Thread GitBox


yittg commented on pull request #15501:
URL: https://github.com/apache/flink/pull/15501#issuecomment-847675307


   Thanks @wangyang0918, I will check the periodic resync and add the doc 
update later.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-13400) Remove Hive and Hadoop dependencies from SQL Client

2021-05-25 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17350916#comment-17350916
 ] 

Jark Wu commented on FLINK-13400:
-

Hi [~twalthr], I think this will be a large PR/task if we want to fix the hive 
dependency. Currently, we test some Hive connector features in the SQL Client. 
On the other hand, the SQL Client also relies on the Hive catalog/module to 
test statements about catalogs and modules. The tests are mainly located in 
{{flink-table/flink-sql-client/src/test/resources/sql/}} and 
{{DependencyTest}}, etc.

If we want to remove the Hive dependency, we would first need to:
1. migrate the Hive-related tests into the Hive connector (the Hive connector 
would then have a flink-sql-client dependency with test scope).
2. support a testing catalog and testing modules, e.g. FLINK-17909, and replace 
the hive-related tests in flink-sql-client with them.
3. remove the hive dependency from flink-sql-client.

So this would be a huge amount of work, and it would be better to break it 
into sub-tasks.

I'm fine with this refactoring. However, I have some concerns that it would 
cost us a lot of time while the benefit may be small. From my point of view, 
the problem in this issue is just the messy Hive dependencies in 
flink-sql-client. Could we just update all Hive dependencies to test scope (a 
sketch follows below), so we don't have to worry about packaging them into the 
sql client jar by mistake? All the "exclusions" lists could then (ideally) be 
removed. In this way, the list of Hive dependencies would be small and easy to 
maintain. What do you think [~twalthr]?
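
For illustration, moving one Hive dependency to test scope in the 
flink-sql-client {{pom.xml}} could look like the minimal sketch below; the 
{{hive-exec}} artifact and the {{hive.version}} property are assumptions here, 
not the actual dependency list:

{code:xml}
<!-- Hypothetical sketch: keep Hive test-only so it can never leak into the
     shaded sql-client jar. -->
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-exec</artifactId>
    <version>${hive.version}</version>
    <scope>test</scope>
</dependency>
{code}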

cc [~lirui]




> Remove Hive and Hadoop dependencies from SQL Client
> ---
>
> Key: FLINK-13400
> URL: https://issues.apache.org/jira/browse/FLINK-13400
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Client
>Reporter: Timo Walther
>Priority: Critical
>  Labels: auto-unassigned, stale-critical
> Fix For: 1.14.0
>
>
> 340/550 lines in the SQL Client {{pom.xml}} are just around Hive and Hadoop 
> dependencies. Hive has nothing to do with the SQL Client and it will be hard 
> to maintain the long list of exclusions there. Some dependencies are even in 
> a {{provided}} scope and not {{test}} scope.
> We should remove all dependencies on Hive/Hadoop and replace catalog-related 
> tests with a testing catalog, similar to how we test sources/sinks.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] AnkushKhanna commented on a change in pull request #15991: Adding gcs documentation. Connecting flink to gcs.

2021-05-25 Thread GitBox


AnkushKhanna commented on a change in pull request #15991:
URL: https://github.com/apache/flink/pull/15991#discussion_r638583296



##
File path: docs/content/docs/deployment/filesystems/gcs.md
##
@@ -0,0 +1,76 @@
+---
+title: Google cloud storage
+weight: 3
+type: docs
+aliases:
+  - /deployment/filesystems/gcs.html
+---
+
+
+# Google Cloud Storage
+
+[Google Cloud storage](https://cloud.google.com/storage) (GCS) provides cloud 
storage for a variety of use cases. You can use it for **reading** and 
**writing data**.
+
+
+
+You can use GCS objects like regular files by specifying paths in the 
following format:
+
+```plain
+gs://<bucket>/<endpoint>
+```
+
+The endpoint can either be a single file or a directory, for example:
+
+```java
+// Read from GCS bucket
+env.readTextFile("gs://<bucket>/<endpoint>");
+
+// Write to GCS bucket
+stream.writeAsText("gs://<bucket>/<endpoint>");
+
+```
+
+### Libraries
+
+You need to include the following libraries in `/lib` to connect Flink with 
GCS. 

Review comment:
   Hi, I tried running the same application by copying the jars to 
`plugins/gcs`. It does not seem to work that way. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] AnkushKhanna commented on pull request #15991: Adding gcs documentation. Connecting flink to gcs.

2021-05-25 Thread GitBox


AnkushKhanna commented on pull request #15991:
URL: https://github.com/apache/flink/pull/15991#issuecomment-847679973


   Hi, so I did try copying the jars to `plugins/gcs`; it does not work. 
   I am deploying on k8s.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15877: [FLINK-22612][python] Restructure the coders in PyFlink

2021-05-25 Thread GitBox


flinkbot edited a comment on pull request #15877:
URL: https://github.com/apache/flink/pull/15877#issuecomment-836433504


   
   ## CI report:
   
   * be2f68b4cd7086048d5a16c1ab5ee1661908a0c0 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18297)
 
   * dc679297beeaf91ca605574190d847262104e379 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18302)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15995: [FLINK-22760][Hive] Fix issue of HiveParser::setCurrentTimestamp fails with hive-3.1.2

2021-05-25 Thread GitBox


flinkbot edited a comment on pull request #15995:
URL: https://github.com/apache/flink/pull/15995#issuecomment-847655258


   
   ## CI report:
   
   * 031d0fe9b8d8ab2050ad378d158788e51495d0e7 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18303)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15996: FLINK-22663 Stop YARN NMClient get stuck if it try to stop the containers on dead NodeManagers

2021-05-25 Thread GitBox


flinkbot edited a comment on pull request #15996:
URL: https://github.com/apache/flink/pull/15996#issuecomment-847655364


   
   ## CI report:
   
   * 9589ed9beb169f69822faf6f51e1b162c755709a Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18304)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] zentol merged pull request #15927: [FLINK-22639][runtime] ClassLoaderUtil cannot print classpath of Flin…

2021-05-25 Thread GitBox


zentol merged pull request #15927:
URL: https://github.com/apache/flink/pull/15927


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] LiuPeien commented on pull request #15996: FLINK-22663 Stop YARN NMClient get stuck if it try to stop the containers on dead NodeManagers

2021-05-25 Thread GitBox


LiuPeien commented on pull request #15996:
URL: https://github.com/apache/flink/pull/15996#issuecomment-847682716


   If a NodeManager crashes while Flink is running and we try to cancel the 
job at that time, we find that the containers of the job cannot be released 
immediately. The root cause is that NMClient tries to stop the containers on 
the dead NodeManager and gets stuck because it can't connect to that 
NodeManager. Because the clean-up process is serial and synchronous, once the 
process gets stuck, the containers on the healthy NodeManagers cannot be 
stopped either.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Closed] (FLINK-22639) ClassLoaderUtil cannot print classpath of FlinkUserCodeClassLoader

2021-05-25 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-22639.

Fix Version/s: 1.14.0
   Resolution: Fixed

master: d67064d6829335bcbdbb12961ea821dd1565dd1e

> ClassLoaderUtil cannot print classpath of FlinkUserCodeClassLoader
> --
>
> Key: FLINK-22639
> URL: https://issues.apache.org/jira/browse/FLINK-22639
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Configuration
>Affects Versions: 1.13.0
>Reporter: Adrian Zhong
>Priority: Major
>  Labels: Classloader, pull-request-available, runtime
> Fix For: 1.14.0
>
>
> Hello, community.
> I found that FlinkUserCodeClassLoader is wrapped by 
> SafetyNetWrapperClassLoader, but the wrapper cuts the getURLs invocation chain.
>  
> ClassLoaderUtil.getUserCodeClassLoaderInfo:
> {code:java}
> public static String getUserCodeClassLoaderInfo(ClassLoader loader) {
> if (loader instanceof URLClassLoader) {
> URLClassLoader cl = (URLClassLoader) loader;
> try {
> StringBuilder bld = new StringBuilder();
> if (cl == ClassLoader.getSystemClassLoader()) {
> bld.append("System ClassLoader: ");
> } else {
> bld.append("URL ClassLoader:");
> }
> for (URL url : cl.getURLs()) {
> }
> }{code}
> {code:java}
> SafetyNetWrapperClassLoader(FlinkUserCodeClassLoader inner, ClassLoader 
> parent) {
> super(new URL[0], parent);
> this.inner = inner;
> }
> {code}
> The constructor passes an empty array to the super class, so 
> SafetyNetWrapperClassLoader.getURLs {color:#505f79}*should dispatch this 
> invocation*{color}.
> {code:java}
> @Override
> public URL[] getURLs() {
> return inner.getURLs();
> }
> {code}
> Otherwise,  
> {code:java}
> ClassLoaderUtil.getUserCodeClassLoaderInfo(theJarClassLoader);
> {code}
> will print an empty list, like below:
> {code:java}
> URL ClassLoader:
> {code}
>  
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (FLINK-22639) ClassLoaderUtil cannot print classpath of FlinkUserCodeClassLoader

2021-05-25 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler reassigned FLINK-22639:


Assignee: Yao Zhang

> ClassLoaderUtil cannot print classpath of FlinkUserCodeClassLoader
> --
>
> Key: FLINK-22639
> URL: https://issues.apache.org/jira/browse/FLINK-22639
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Configuration
>Affects Versions: 1.13.0
>Reporter: Adrian Zhong
>Assignee: Yao Zhang
>Priority: Major
>  Labels: Classloader, pull-request-available, runtime
> Fix For: 1.14.0
>
>
> Hello, community.
> I found that FlinkUserCodeClassLoader is wrapped by 
> SafetyNetWrapperClassLoader, but the wrapper cuts the getURLs invocation chain.
>  
> ClassLoaderUtil.getUserCodeClassLoaderInfo:
> {code:java}
> public static String getUserCodeClassLoaderInfo(ClassLoader loader) {
> if (loader instanceof URLClassLoader) {
> URLClassLoader cl = (URLClassLoader) loader;
> try {
> StringBuilder bld = new StringBuilder();
> if (cl == ClassLoader.getSystemClassLoader()) {
> bld.append("System ClassLoader: ");
> } else {
> bld.append("URL ClassLoader:");
> }
> for (URL url : cl.getURLs()) {
> }
> }{code}
> {code:java}
> SafetyNetWrapperClassLoader(FlinkUserCodeClassLoader inner, ClassLoader 
> parent) {
> super(new URL[0], parent);
> this.inner = inner;
> }
> {code}
> The constructor passes an empty array to the super class, so 
> SafetyNetWrapperClassLoader.getURLs {color:#505f79}*should dispatch this 
> invocation*{color}.
> {code:java}
> @Override
> public URL[] getURLs() {
> return inner.getURLs();
> }
> {code}
> Otherwise,  
> {code:java}
> ClassLoaderUtil.getUserCodeClassLoaderInfo(theJarClassLoader);
> {code}
> will print an empty list, like below:
> {code:java}
> URL ClassLoader:
> {code}
>  
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] zentol closed pull request #15993: Merge pull request #1 from apache/master

2021-05-25 Thread GitBox


zentol closed pull request #15993:
URL: https://github.com/apache/flink/pull/15993


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] fsk119 commented on a change in pull request #15986: [FLINK-22155][table] EXPLAIN statement should validate insert and query separately

2021-05-25 Thread GitBox


fsk119 commented on a change in pull request #15986:
URL: https://github.com/apache/flink/pull/15986#discussion_r638593315



##
File path: 
flink-table/flink-table-planner-blink/src/main/java/org/apache/flink/table/planner/operations/SqlToOperationConverter.java
##
@@ -844,16 +843,19 @@ private Operation convertShowViews(SqlShowViews 
sqlShowViews) {
 return new ShowViewsOperation();
 }
 
-/** Convert EXPLAIN statement. */
-private Operation convertExplain(SqlExplain sqlExplain) {
-Operation operation = convertSqlQuery(sqlExplain.getExplicandum());
-
-if (sqlExplain.getDetailLevel() != SqlExplainLevel.EXPPLAN_ATTRIBUTES
-|| sqlExplain.getDepth() != SqlExplain.Depth.PHYSICAL
-|| sqlExplain.getFormat() != SqlExplainFormat.TEXT) {
-throw new TableException("Only default behavior is supported now, 
EXPLAIN PLAN FOR xx");
+/** Convert RICH EXPLAIN statement. */
+private Operation convertRichExplain(SqlRichExplain sqlExplain) {
+Operation operation;
+SqlNode sqlNode = sqlExplain.getStatement();
+if (sqlNode instanceof RichSqlInsert) {
+operation = convertSqlInsert((RichSqlInsert) sqlNode);
+} else if (sqlNode instanceof SqlSelect) {
+operation = convertSqlQuery(sqlExplain.getStatement());

Review comment:
   Flink has its own logic for validating an INSERT statement, which is 
different from Calcite's. Here we only validate the query part of the INSERT 
statement, and when the statement is an INSERT we additionally check whether 
the sink schema is the same as the query schema. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] wuchong commented on a change in pull request #15925: [FLINK-21464][sql-client] Support ADD JAR in SQL Client

2021-05-25 Thread GitBox


wuchong commented on a change in pull request #15925:
URL: https://github.com/apache/flink/pull/15925#discussion_r638593659



##
File path: 
flink-table/flink-sql-client/src/main/java/org/apache/flink/table/client/gateway/context/SessionContext.java
##
@@ -275,4 +315,33 @@ private void 
resetSessionConfigurationToDefault(Configuration defaultConf) {
 }
 sessionConfiguration.addAll(defaultConf);
 }
+
+private void buildClassLoaderAndUpdateDependencies(Collection<URL> newDependencies) {
+    // merge the jars in the config with the jars maintained in the session
+    Set<URL> jarsInConfig;
+    try {
+        jarsInConfig =
+                new HashSet<>(
+                        ConfigUtils.decodeListFromConfig(
+                                sessionConfiguration, PipelineOptions.JARS, URL::new));
+    } catch (MalformedURLException e) {
+        throw new SqlExecutionException(
+                "Failed to parse the option `JARS` in configuration.", e);

Review comment:
   Regarding "value of `PipelineOptions.JARS`", I mean `"pipeline.jars"`. I 
think no user can understand what `PipelineOptions.JARS` refers, that is an 
internal field. 
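
   For context, a minimal sketch (not from this PR) of how the internal 
`PipelineOptions.JARS` option backs the user-facing `"pipeline.jars"` key; the 
jar path is a placeholder:

   ```java
   import java.util.Collections;

   import org.apache.flink.configuration.Configuration;
   import org.apache.flink.configuration.PipelineOptions;

   public class PipelineJarsExample {
       public static void main(String[] args) {
           // PipelineOptions.JARS is the ConfigOption behind "pipeline.jars";
           // setting it programmatically is equivalent to setting that key.
           Configuration conf = new Configuration();
           conf.set(PipelineOptions.JARS,
                   Collections.singletonList("file:///path/to/udf.jar"));
           System.out.println(conf.get(PipelineOptions.JARS));
       }
   }
   ```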




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] yittg commented on pull request #15501: [FLINK-22054][k8s] Using a shared watcher for ConfigMap watching

2021-05-25 Thread GitBox


yittg commented on pull request #15501:
URL: https://github.com/apache/flink/pull/15501#issuecomment-847689225


   > However, I still find a suspicious log which shows up every 5 seconds. It 
might be related with 
[fabric8io/kubernetes-client#2651](https://github.com/fabric8io/kubernetes-client/issues/2651).
 This will greatly increase the pressure of ETCD and affect the whole K8s 
cluster stability. Could you please have a look?
   > 
   > ```
   > 2021-05-25 07:04:45,580 DEBUG [pool-5-thread-1] 
io.fabric8.kubernetes.client.informers.cache.Reflector   [] - Listing items 
(4) for resource class io.fabric8.kubernetes.api.model.ConfigMap v3656102
   > ```
   
   I missed the DEFAULT_PERIOD before; how about passing `Long.MAX_VALUE` to 
work around it?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15975: [FLINK-22725][coordination] SlotManagers unregister metrics in suspend()

2021-05-25 Thread GitBox


flinkbot edited a comment on pull request #15975:
URL: https://github.com/apache/flink/pull/15975#issuecomment-845084760


   
   ## CI report:
   
   * c7293cdce7b7b106ec0db3f0bc41bf5cac79bf67 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18252)
 
   * 54d2002e4ce89e914492d5c16c16b89368d7c1e5 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15991: [FLINK-22757][docs] Adding gcs documentation. Connecting flink to gcs.

2021-05-25 Thread GitBox


flinkbot edited a comment on pull request #15991:
URL: https://github.com/apache/flink/pull/15991#issuecomment-846585260


   
   ## CI report:
   
   * d7feeafc28313533866c2599a0c49693cef2e95c Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18268)
 
   * 2323ba8421e39b663261f33a6ca0dda01b027f7b UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-22757) Update GCS documentation

2021-05-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-22757:
---
Labels: pull-request-available  (was: )

> Update GCS documentation
> 
>
> Key: FLINK-22757
> URL: https://issues.apache.org/jira/browse/FLINK-22757
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Ankush Khanna
>Priority: Minor
>  Labels: pull-request-available
>
> Currently, the GCS filesystem documentation points to 
> [https://cloud.google.com/dataproc]. This does not cover the correct way to 
> connect to GCS. 
> Following from this [blog 
> post|https://www.ververica.com/blog/getting-started-with-da-platform-on-google-kubernetes-engine]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-19142) Investigate slot hijacking from preceding pipelined regions after failover

2021-05-25 Thread Till Rohrmann (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann updated FLINK-19142:
--
Labels: pull-request-available  (was: pull-request-available stale-assigned)

> Investigate slot hijacking from preceding pipelined regions after failover
> --
>
> Key: FLINK-19142
> URL: https://issues.apache.org/jira/browse/FLINK-19142
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Affects Versions: 1.12.0
>Reporter: Andrey Zagrebin
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> The ticket originates from [this PR 
> discussion|https://github.com/apache/flink/pull/13181#discussion_r481087221].
> The previous AllocationIDs are used by 
> PreviousAllocationSlotSelectionStrategy to schedule subtasks into the slot 
> where they were previously executed before a failover. If the previous slot 
> (AllocationID) is not available, we do not want subtasks to take previous 
> slots (AllocationIDs) of other subtasks.
> The MergingSharedSlotProfileRetriever gets all previous AllocationIDs of the 
> bulk from SlotSharingExecutionSlotAllocator but only from the current bulk. 
> The previous AllocationIDs of other bulks stay unknown. Therefore, the 
> current bulk can potentially hijack the previous slots from the preceding 
> bulks. On the other hand the previous AllocationIDs of other tasks should be 
> taken if the other tasks are not going to run at the same time, e.g. not 
> enough resources after failover or other bulks are done.
> One way to do it may be to give MergingSharedSlotProfileRetriever all 
> previous AllocationIDs of the bulks which are going to run at the same time.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-19142) Investigate slot hijacking from preceding pipelined regions after failover

2021-05-25 Thread Till Rohrmann (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17350930#comment-17350930
 ] 

Till Rohrmann commented on FLINK-19142:
---

There is a PR available for fixing this problem.

> Investigate slot hijacking from preceding pipelined regions after failover
> --
>
> Key: FLINK-19142
> URL: https://issues.apache.org/jira/browse/FLINK-19142
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Affects Versions: 1.12.0
>Reporter: Andrey Zagrebin
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> The ticket originates from [this PR 
> discussion|https://github.com/apache/flink/pull/13181#discussion_r481087221].
> The previous AllocationIDs are used by 
> PreviousAllocationSlotSelectionStrategy to schedule subtasks into the slot 
> where they were previously executed before a failover. If the previous slot 
> (AllocationID) is not available, we do not want subtasks to take previous 
> slots (AllocationIDs) of other subtasks.
> The MergingSharedSlotProfileRetriever gets all previous AllocationIDs of the 
> bulk from SlotSharingExecutionSlotAllocator but only from the current bulk. 
> The previous AllocationIDs of other bulks stay unknown. Therefore, the 
> current bulk can potentially hijack the previous slots from the preceding 
> bulks. On the other hand the previous AllocationIDs of other tasks should be 
> taken if the other tasks are not going to run at the same time, e.g. not 
> enough resources after failover or other bulks are done.
> One way to do it may be to give MergingSharedSlotProfileRetriever all 
> previous AllocationIDs of the bulks which are going to run at the same time.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-22676) The partition tracker should support remote shuffle properly

2021-05-25 Thread Jin Xing (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jin Xing updated FLINK-22676:
-
Description: In current Flink, a data partition is bound to the ResourceID of 
a TM in Execution#startTrackingPartitions, and the partition tracker stops 
tracking the corresponding partitions when a TM disconnects 
(JobMaster#disconnectTaskManager), i.e. the lifecycle of shuffle data is bound 
to the computing resource (TM). This works fine for the internal shuffle 
service, but not for a remote shuffle service. Since the shuffle data is 
accommodated remotely, the lifecycle of a completed partition can be decoupled 
from the TM, i.e. the TM can safely be released when no computing task runs on 
it, and further shuffle read requests can be directed to the remote shuffle 
cluster. In addition, when a TM is lost, its completed data partitions on the 
remote shuffle cluster do not need to be reproduced.

 

The issue above exists because Flink's JobMasterPartitionTracker mixes up a 
partition's locationID (where the partition is located) and its tmID (which TM 
the partition is produced from). With TM-internal shuffle a partition's 
locationID is the same as its tmID, but with remote shuffle it is not. 
JobMasterPartitionTracker, as an independent component, should be able to 
differentiate the locationID and tmID of a partition and thus handle the 
lifecycle of a partition properly.

We propose that JobMasterPartitionTracker manage and index partitions with 
both locationID and tmID. The process of registration and unregistration 
would be as below:

*A. Partition Registration*
 # Execution#registerProducedPartitions registers a partition to the 
ShuffleMaster and gets a ShuffleDescriptor. The current 
ShuffleDescriptor#storesLocalResourcesOn only returns the location of the 
producing TM if the partition occupies local resources there.
 We propose to rename this method and always return the locationID of the 
partition. It might look as below:

{code:java}
 ResourceID getLocationID();  {code}
 # Execution#registerProducedPartitions then registers the partition to the 
JMPartitionTracker with the tmID (the ResourceID of the TaskManager from 
TaskManagerLocation) and the locationID (acquired in step 1). 
JobMasterPartitionTracker indexes a partition with both tmID and locationID;

*B. Invocations from the JM and the ShuffleMaster*

      JobMasterPartitionTracker listens for invocations from both the JM and 
the ShuffleMaster.
 # When the JMPartitionTracker hears from JobMaster#disconnectTaskManager that 
a TM disconnected, it checks whether the disconnected tmID equals the 
locationID of a partition. If so, tracking of the corresponding partition is 
stopped.
 # When JobMasterPartitionTracker hears from the ShuffleMaster that a data 
location was lost, it unregisters the corresponding partitions by locationID;

*C. Partition Unregistration*
When unregistering a partition, JobMasterPartitionTracker first removes the 
corresponding indexes for tmID and locationID, and then releases the partition 
according to the shuffle service type:
 # If the locationID equals the tmID, the partition is accommodated by the 
TM-internal shuffle service, and the JMPartitionTracker invokes the 
TaskExecutorGateway for the release;
 # If the locationID does not equal the tmID, the partition is accommodated by 
an external shuffle service, and the JMPartitionTracker invokes the 
ShuffleMaster for the release;
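
For illustration only, a minimal sketch of the dual-index idea (class and 
method names here are hypothetical, not the actual JobMasterPartitionTracker 
API):

{code:java}
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: index partitions by both the producing TM (tmID) and
// the data location (locationID) so that either event can release them.
public class DualIndexPartitionTracker {

    private final Map<String, Set<String>> partitionsByTmId = new HashMap<>();
    private final Map<String, Set<String>> partitionsByLocationId = new HashMap<>();

    public void startTracking(String partitionId, String tmId, String locationId) {
        partitionsByTmId.computeIfAbsent(tmId, k -> new HashSet<>()).add(partitionId);
        partitionsByLocationId.computeIfAbsent(locationId, k -> new HashSet<>()).add(partitionId);
    }

    // A TM disconnect only affects partitions whose *location* is that TM;
    // partitions stored on a remote shuffle cluster keep being tracked.
    public Set<String> onTaskManagerDisconnected(String tmId) {
        return partitionsByLocationId.getOrDefault(tmId, Collections.emptySet());
    }

    // The ShuffleMaster reports a lost data location (e.g. a remote shuffle
    // worker); the tracker unregisters the affected partitions by locationID.
    public Set<String> onLocationLost(String locationId) {
        return partitionsByLocationId.getOrDefault(locationId, Collections.emptySet());
    }
}
{code}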

  was:In current Flink, a data partition is bound to the ResourceID of a TM in 
Execution#startTrackingPartitions, and the partition tracker stops tracking 
the corresponding partitions when a TM disconnects 
(JobMaster#disconnectTaskManager), i.e. the lifecycle of shuffle data is bound 
to the computing resource (TM). This works fine for the internal shuffle 
service, but not for a remote shuffle service. Since the shuffle data is 
accommodated remotely, the lifecycle of a completed partition can be decoupled 
from the TM, i.e. the TM can safely be released when no computing task runs on 
it, and further shuffle read requests can be directed to the remote shuffle 
cluster. In addition, when a TM is lost, its completed data partitions on the 
remote shuffle cluster do not need to be reproduced.


> The partition tracker should support remote shuffle properly
> 
>
> Key: FLINK-22676
> URL: https://issues.apache.org/jira/browse/FLINK-22676
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Network
>Reporter: Jin Xing
>Priority: Major
>
> In current Flink, a data partition is bound to the ResourceID of a TM in 
> Execution#startTrackingPartitions, and the partition tracker stops tracking 
> the corresponding partitions when a TM disconnects 
> (JobMaster#disconnectTaskManager), i.e. the lifecycle of shuffle 
> data is bound to the computing resource (TM). It works fine 

[GitHub] [flink] tillrohrmann commented on pull request #15501: [FLINK-22054][k8s] Using a shared watcher for ConfigMap watching

2021-05-25 Thread GitBox


tillrohrmann commented on pull request #15501:
URL: https://github.com/apache/flink/pull/15501#issuecomment-847709963


   I will try to give this PR a pass this week.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-22676) The partition tracker should support remote shuffle properly

2021-05-25 Thread Jin Xing (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17350933#comment-17350933
 ] 

Jin Xing commented on FLINK-22676:
--

Gentle ping [~pnowojski] [~chesnay]

Would you please give some comments on this?

> The partition tracker should support remote shuffle properly
> 
>
> Key: FLINK-22676
> URL: https://issues.apache.org/jira/browse/FLINK-22676
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Network
>Reporter: Jin Xing
>Priority: Major
>
> In current Flink, a data partition is bound to the ResourceID of a TM in 
> Execution#startTrackingPartitions, and the partition tracker stops tracking 
> the corresponding partitions when a TM disconnects 
> (JobMaster#disconnectTaskManager), i.e. the lifecycle of shuffle data is 
> bound to the computing resource (TM). This works fine for the internal 
> shuffle service, but not for a remote shuffle service. Since the shuffle 
> data is accommodated remotely, the lifecycle of a completed partition can be 
> decoupled from the TM, i.e. the TM can safely be released when no computing 
> task runs on it, and further shuffle read requests can be directed to the 
> remote shuffle cluster. In addition, when a TM is lost, its completed data 
> partitions on the remote shuffle cluster do not need to be reproduced.
>  
> The issue above exists because Flink's JobMasterPartitionTracker mixes up a 
> partition's locationID (where the partition is located) and its tmID (which 
> TM the partition is produced from). With TM-internal shuffle a partition's 
> locationID is the same as its tmID, but with remote shuffle it is not. 
> JobMasterPartitionTracker, as an independent component, should be able to 
> differentiate the locationID and tmID of a partition and thus handle the 
> lifecycle of a partition properly;
> We propose that JobMasterPartitionTracker manage and index partitions with 
> both locationID and tmID. The process of registration and unregistration 
> would be as below:
> *A. Partition Registration*
>  # Execution#registerProducedPartitions registers a partition to the 
> ShuffleMaster and gets a ShuffleDescriptor. The current 
> ShuffleDescriptor#storesLocalResourcesOn only returns the location of the 
> producing TM if the partition occupies local resources there.
>  We propose to rename this method and always return the locationID of the 
> partition. It might look as below:
> {code:java}
>  ResourceID getLocationID();  {code}
>  # Execution#registerProducedPartitions then registers the partition to the 
> JMPartitionTracker with the tmID (the ResourceID of the TaskManager from 
> TaskManagerLocation) and the locationID (acquired in step 1). 
> JobMasterPartitionTracker indexes a partition with both tmID and locationID;
> *B. Invocations from the JM and the ShuffleMaster*
>       JobMasterPartitionTracker listens for invocations from both the JM and 
> the ShuffleMaster.
>  # When the JMPartitionTracker hears from JobMaster#disconnectTaskManager 
> that a TM disconnected, it checks whether the disconnected tmID equals the 
> locationID of a partition. If so, tracking of the corresponding partition is 
> stopped.
>  # When JobMasterPartitionTracker hears from the ShuffleMaster that a data 
> location was lost, it unregisters the corresponding partitions by locationID;
> *C. Partition Unregistration*
> When unregistering a partition, JobMasterPartitionTracker first removes the 
> corresponding indexes for tmID and locationID, and then releases the 
> partition according to the shuffle service type:
>  # If the locationID equals the tmID, the partition is accommodated by the 
> TM-internal shuffle service, and the JMPartitionTracker invokes the 
> TaskExecutorGateway for the release;
>  # If the locationID does not equal the tmID, the partition is accommodated 
> by an external shuffle service, and the JMPartitionTracker invokes the 
> ShuffleMaster for the release;



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-20587) Support TIMESTAMP WITH LOCAL TIME ZONE type for datagen source

2021-05-25 Thread luoyuxia (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17350934#comment-17350934
 ] 

luoyuxia commented on FLINK-20587:
--

Hi [~x1q1j1], I think the data type `TIMESTAMP WITH LOCAL TIME ZONE` has been 
supported since Flink 1.12.

> Support TIMESTAMP WITH LOCAL TIME ZONE type for datagen source
> --
>
> Key: FLINK-20587
> URL: https://issues.apache.org/jira/browse/FLINK-20587
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Ecosystem
>Affects Versions: 1.11.1, 1.11.2
>Reporter: Forward Xu
>Priority: Minor
>  Labels: auto-deprioritized-major
>
> TABLE DDL:
> {code:java}
> // code placeholder
> CREATE TABLE sourceTable (
>  userid int,
>  f_random_str STRING,
>  order_time TIMESTAMP(3) WITH LOCAL TIME ZONE
> ) WITH (
>  'connector' = 'datagen',
>  'rows-per-second'='100',
>  'fields.userid.kind'='random',
>  'fields.userid.min'='1',
>  'fields.userid.max'='100',
> 'fields.f_random_str.length'='10'
> );
> CREATE TABLE print_table (
>  userid int,
>  f_random_str STRING,
>  order_time TIMESTAMP(3) WITH LOCAL TIME ZONE
> ) WITH (
>  'connector' = 'print'
> );{code}
> The DML SQL:
> {code:java}
> // code placeholder
> insert into default_catalog.default_database.print_table select * from 
> default_catalog.default_database.sourceTable;{code}
> *Exception:*
> {code:java}
> // code placeholder
> Flink SQL> insert into default_catalog.default_database.print_table select * 
> from default_catalog.default_database.sourceTable;
> [ERROR] Could not execute SQL statement. Reason:
> org.apache.flink.table.api.ValidationException: Unsupported type: 
> TIMESTAMP(3) WITH LOCAL TIME ZONE
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] cuspymd commented on a change in pull request #15986: [FLINK-22155][table] EXPLAIN statement should validate insert and query separately

2021-05-25 Thread GitBox


cuspymd commented on a change in pull request #15986:
URL: https://github.com/apache/flink/pull/15986#discussion_r638628377



##
File path: 
flink-table/flink-table-planner-blink/src/main/java/org/apache/flink/table/planner/operations/SqlToOperationConverter.java
##
@@ -844,16 +843,19 @@ private Operation convertShowViews(SqlShowViews 
sqlShowViews) {
 return new ShowViewsOperation();
 }
 
-/** Convert EXPLAIN statement. */
-private Operation convertExplain(SqlExplain sqlExplain) {
-Operation operation = convertSqlQuery(sqlExplain.getExplicandum());
-
-if (sqlExplain.getDetailLevel() != SqlExplainLevel.EXPPLAN_ATTRIBUTES
-|| sqlExplain.getDepth() != SqlExplain.Depth.PHYSICAL
-|| sqlExplain.getFormat() != SqlExplainFormat.TEXT) {
-throw new TableException("Only default behavior is supported now, 
EXPLAIN PLAN FOR xx");
+/** Convert RICH EXPLAIN statement. */
+private Operation convertRichExplain(SqlRichExplain sqlExplain) {
+Operation operation;
+SqlNode sqlNode = sqlExplain.getStatement();
+if (sqlNode instanceof RichSqlInsert) {
+operation = convertSqlInsert((RichSqlInsert) sqlNode);
+} else if (sqlNode instanceof SqlSelect) {
+operation = convertSqlQuery(sqlExplain.getStatement());

Review comment:
   Do you mean that the results of executions 1 and 2 below are different?
   ```
   // 1
   convertSqlQuery(sqlExplain.getStatement());
   // 2
   convertSqlQuery(sqlNode);
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-22768) ShuffleMaster supports to promote partitions

2021-05-25 Thread Jin Xing (Jira)
Jin Xing created FLINK-22768:


 Summary: ShuffleMaster supports to promote partitions
 Key: FLINK-22768
 URL: https://issues.apache.org/jira/browse/FLINK-22768
 Project: Flink
  Issue Type: Sub-task
Reporter: Jin Xing






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-22768) ShuffleMaster supports to promote partitions

2021-05-25 Thread Jin Xing (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jin Xing updated FLINK-22768:
-
Description: Currently, Flink only supports promoting data partitions in the 
TM-internal shuffle service. The pluggable shuffle service should also support 
promoting partitions via the ShuffleMaster for interactive programming.

> ShuffleMaster supports to promote partitions
> 
>
> Key: FLINK-22768
> URL: https://issues.apache.org/jira/browse/FLINK-22768
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Jin Xing
>Priority: Major
>
> Currently, Flink only supports promoting data partitions in the TM-internal 
> shuffle service. The pluggable shuffle service should also support promoting 
> partitions via the ShuffleMaster for interactive programming.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

