[jira] [Updated] (FLINK-17535) Treat min/max as part of the hierarchy of config option
[ https://issues.apache.org/jira/browse/FLINK-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-17535: - Fix Version/s: (was: 1.14.0) 1.15.0 > Treat min/max as part of the hierarchy of config option > --- > > Key: FLINK-17535 > URL: https://issues.apache.org/jira/browse/FLINK-17535 > Project: Flink > Issue Type: Improvement > Components: Runtime / Configuration >Reporter: Yangze Guo >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.15.0 > > > As discussed in > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Should-max-min-be-part-of-the-hierarchy-of-config-option-td40578.html, > we decided to treat min/max as part of the hierarchy of config options. This > ticket is an umbrella for all tasks related to it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-15645) enable COPY TO/FROM in Postgres JDBC source/sink for faster batch processing
[ https://issues.apache.org/jira/browse/FLINK-15645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-15645: - Fix Version/s: (was: 1.14.0) 1.15.0 > enable COPY TO/FROM in Postgres JDBC source/sink for faster batch processing > > > Key: FLINK-15645 > URL: https://issues.apache.org/jira/browse/FLINK-15645 > Project: Flink > Issue Type: New Feature > Components: Connectors / JDBC >Reporter: Bowen Li >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned > Fix For: 1.15.0 > > > Postgres has its own SQL extension, COPY FROM/TO, available via JDBC for faster bulk > loading/reading: [https://www.postgresql.org/docs/12/sql-copy.html] > Flink should be able to support that for batch use cases. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-15729) Table planner dependency instructions for executing in IDE can be improved
[ https://issues.apache.org/jira/browse/FLINK-15729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-15729: - Fix Version/s: (was: 1.14.0) 1.15.0 > Table planner dependency instructions for executing in IDE can be improved > -- > > Key: FLINK-15729 > URL: https://issues.apache.org/jira/browse/FLINK-15729 > Project: Flink > Issue Type: Improvement > Components: Documentation, Table SQL / API >Reporter: Tzu-Li (Gordon) Tai >Priority: Minor > Fix For: 1.15.0 > > > In the docs: > https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/table/#table-program-dependencies > It would be clearer to add that, for this to work in the IDE, you additionally > need to enable the option to `Include dependencies with "provided" scope`. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-16466) Group by on event time should produce insert only result
[ https://issues.apache.org/jira/browse/FLINK-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-16466: - Fix Version/s: (was: 1.14.0) 1.15.0 > Group by on event time should produce insert only result > > > Key: FLINK-16466 > URL: https://issues.apache.org/jira/browse/FLINK-16466 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Planner >Affects Versions: 1.10.0 >Reporter: Kurt Young >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.15.0 > > > Currently, when doing aggregation queries, we can output insert-only results > only when grouping by windows. But when users have defined an event time and a > watermark, we can also support emitting insert-only results when grouping on > event time. To be more precise, it should only require that event time is one of > the grouping keys. One can think of grouping by event time as a kind of > special window, with both window start and window end equal to the event time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
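The idea in FLINK-16466 can be sketched outside of Flink: treat each distinct event time as a zero-width window, and emit a group's aggregate as a plain insert once the watermark has passed its event time. A minimal illustration (all names are hypothetical, not Flink API; SUM stands in for the aggregate):

```python
def emit_complete_groups(pending, watermark):
    """Grouping on event time acts like a window whose start and end both
    equal the event time: once the watermark passes that time, the group
    can no longer receive rows, so its aggregate is a final, insert-only
    result rather than an update/retract."""
    # Groups whose event time is covered by the watermark are complete.
    complete = sorted(t for t in pending if t <= watermark)
    results = {}
    for t in complete:
        results[t] = sum(pending.pop(t))  # example aggregate: SUM
    return results
```

For example, with pending rows `{1: [10, 5], 3: [7]}` and a watermark of 2, only the group at event time 1 is emitted; the group at time 3 stays pending until the watermark reaches it.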
[jira] [Updated] (FLINK-16143) Turn on more date time functions of blink planner
[ https://issues.apache.org/jira/browse/FLINK-16143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-16143: - Fix Version/s: (was: 1.14.0) 1.15.0 > Turn on more date time functions of blink planner > - > > Key: FLINK-16143 > URL: https://issues.apache.org/jira/browse/FLINK-16143 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API >Reporter: Zili Chen >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.15.0 > > > Currently the blink planner has a series of built-in functions such as > DATEDIFF > DATE_ADD > DATE_SUB > which haven't been put into use so far. We could add the necessary registration, > code generation and conversion logic to make them available in production scope. > > What do you think [~jark]? > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-16776) Support schema evolution for Hive tables
[ https://issues.apache.org/jira/browse/FLINK-16776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-16776: - Fix Version/s: (was: 1.14.0) 1.15.0 > Support schema evolution for Hive tables > > > Key: FLINK-16776 > URL: https://issues.apache.org/jira/browse/FLINK-16776 > Project: Flink > Issue Type: New Feature > Components: Connectors / Hive >Affects Versions: 1.10.0 >Reporter: Rui Li >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned > Fix For: 1.15.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-18199) Translate "Filesystem SQL Connector" page into Chinese
[ https://issues.apache.org/jira/browse/FLINK-18199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-18199: - Fix Version/s: (was: 1.14.0) 1.15.0 > Translate "Filesystem SQL Connector" page into Chinese > -- > > Key: FLINK-18199 > URL: https://issues.apache.org/jira/browse/FLINK-18199 > Project: Flink > Issue Type: Sub-task > Components: chinese-translation, Connectors / FileSystem, > Documentation, Table SQL / Ecosystem >Reporter: Jark Wu >Priority: Major > Labels: auto-unassigned, pull-request-available > Fix For: 1.15.0 > > > The page url is > https://ci.apache.org/projects/flink/flink-docs-master/zh/dev/table/connectors/filesystem.html > The markdown file is located in > flink/docs/dev/table/connectors/filesystem.zh.md -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-17405) add test cases for cancel job in SQL client
[ https://issues.apache.org/jira/browse/FLINK-17405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-17405: - Fix Version/s: (was: 1.14.0) 1.15.0 > add test cases for cancel job in SQL client > --- > > Key: FLINK-17405 > URL: https://issues.apache.org/jira/browse/FLINK-17405 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Client >Reporter: godfrey he >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned, > pull-request-available > Fix For: 1.15.0 > > > as discussed in [FLINK-15669| > https://issues.apache.org/jira/browse/FLINK-15669], we can re-add some tests > to verify cancel job logic in SQL client. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-17808) Rename checkpoint meta file to "_metadata" until it has completed writing
[ https://issues.apache.org/jira/browse/FLINK-17808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-17808: - Fix Version/s: (was: 1.14.0) 1.15.0 > Rename checkpoint meta file to "_metadata" until it has completed writing > - > > Key: FLINK-17808 > URL: https://issues.apache.org/jira/browse/FLINK-17808 > Project: Flink > Issue Type: Improvement > Components: Runtime / Checkpointing >Affects Versions: 1.10.0 >Reporter: Yun Tang >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.15.0 > > > In practice, some developers or customers use some strategy to find the most > recent _metadata file as the checkpoint to recover from (e.g. as many proposals in > FLINK-9043 suggest). However, the existence of a "_metadata" file does not mean > the checkpoint has been completed, since the write that creates the "_metadata" > file could break on a forced quit (e.g. yarn application -kill). > We could make the checkpoint meta stream write its data to a file named > "_metadata.inprogress" and rename it to "_metadata" once writing has completed. > By doing so, we ensure the "_metadata" file is never broken. -- This message was sent by Atlassian Jira (v8.3.4#803005)
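The write-then-rename protocol FLINK-17808 proposes can be sketched with plain file operations (an illustrative sketch, not Flink's actual checkpoint storage code):

```python
import os

def write_metadata_atomically(checkpoint_dir, payload):
    """Write checkpoint metadata to "_metadata.inprogress" first and only
    rename it to "_metadata" after the write has fully completed, so any
    reader that finds "_metadata" can trust the file is not truncated."""
    inprogress = os.path.join(checkpoint_dir, "_metadata.inprogress")
    final = os.path.join(checkpoint_dir, "_metadata")
    with open(inprogress, "wb") as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())  # make sure bytes hit disk before the rename
    os.replace(inprogress, final)  # atomic rename on POSIX filesystems
    return final
```

Note this relies on the filesystem providing atomic rename; object stores such as S3 do not, so a different completion marker would be needed there.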
[jira] [Updated] (FLINK-17785) Refactor orc shim to create a OrcShimFactory in hive
[ https://issues.apache.org/jira/browse/FLINK-17785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-17785: - Fix Version/s: (was: 1.14.0) 1.15.0 > Refactor orc shim to create a OrcShimFactory in hive > > > Key: FLINK-17785 > URL: https://issues.apache.org/jira/browse/FLINK-17785 > Project: Flink > Issue Type: Improvement > Components: Connectors / Hive >Reporter: Jingsong Lee >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.15.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-19038) It doesn't support to call Table.limit() continuously
[ https://issues.apache.org/jira/browse/FLINK-19038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-19038: - Fix Version/s: (was: 1.14.0) 1.15.0 > It doesn't support to call Table.limit() continuously > - > > Key: FLINK-19038 > URL: https://issues.apache.org/jira/browse/FLINK-19038 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Affects Versions: 1.12.0 >Reporter: Dian Fu >Priority: Major > Labels: auto-unassigned, pull-request-available > Fix For: 1.15.0 > > > For example, table.limit(3).limit(2) will fail with "FETCH is already defined.":
{code}
org.apache.flink.table.api.ValidationException: FETCH is already defined.
	at org.apache.flink.table.operations.utils.SortOperationFactory.validateAndGetChildSort(SortOperationFactory.java:125)
	at org.apache.flink.table.operations.utils.SortOperationFactory.createLimitWithFetch(SortOperationFactory.java:105)
	at org.apache.flink.table.operations.utils.OperationTreeBuilder.limitWithFetch(OperationTreeBuilder.java:418)
{code}
However, since we support calling table.limit() without specifying an order, this should arguably be a valid usage and be allowed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
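One way consecutive limits could compose, rather than fail, is to fold each new fetch into the already-defined one by taking the minimum: `limit(3).limit(2)` can never return more than 2 rows in either order of application. A hypothetical sketch of that merge rule (not the actual SortOperationFactory logic):

```python
from typing import Optional

def combine_fetch(existing: Optional[int], new_fetch: int) -> int:
    """Merge a new LIMIT/FETCH into an existing one. Taking the minimum
    matches the semantics of applying the limits one after the other:
    limit(a).limit(b) yields at most min(a, b) rows."""
    if new_fetch < 0:
        raise ValueError("FETCH count must be non-negative")
    return new_fetch if existing is None else min(existing, new_fetch)
```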
[jira] [Updated] (FLINK-18742) Some configuration args do not take effect at client
[ https://issues.apache.org/jira/browse/FLINK-18742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-18742: - Fix Version/s: (was: 1.14.0) 1.15.0 > Some configuration args do not take effect at client > > > Key: FLINK-18742 > URL: https://issues.apache.org/jira/browse/FLINK-18742 > Project: Flink > Issue Type: Improvement > Components: Command Line Client >Affects Versions: 1.11.1 >Reporter: Matt Wang >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned, > pull-request-available > Fix For: 1.15.0 > > > Some configuration args from the command line do not take effect on the client. For > example, if the job sets _classloader.resolve-order_ to _parent-first_, it takes > effect on the TaskManager, but not on the client. > The *FlinkUserCodeClassLoaders* is created before _getEffectiveConfiguration()_ > is called in _org.apache.flink.client.cli.CliFrontend_, so the _Configuration_ > used by _PackagedProgram_ does not include the command-line configuration args. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-18567) Add Support for Azure Cognitive Search Table & SQL Connector
[ https://issues.apache.org/jira/browse/FLINK-18567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-18567: - Fix Version/s: (was: 1.14.0) 1.15.0 > Add Support for Azure Cognitive Search Table & SQL Connector > > > Key: FLINK-18567 > URL: https://issues.apache.org/jira/browse/FLINK-18567 > Project: Flink > Issue Type: Improvement > Components: API / DataStream, Connectors / Common >Affects Versions: 1.12.0 >Reporter: Israel Ekpo >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.15.0 > > > The objective of this improvement is to add Azure Cognitive Search [2] as an > output sink for the Table & SQL connectors [1] > [1] > [https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/connectors/] > [2] > [https://docs.microsoft.com/en-us/azure/search/search-what-is-azure-search] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-18201) Support create_function in Python Table API
[ https://issues.apache.org/jira/browse/FLINK-18201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-18201: - Fix Version/s: (was: 1.14.0) 1.15.0 > Support create_function in Python Table API > --- > > Key: FLINK-18201 > URL: https://issues.apache.org/jira/browse/FLINK-18201 > Project: Flink > Issue Type: Improvement > Components: API / Python >Reporter: Dian Fu >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned > Fix For: 1.15.0 > > > There is an interface *createFunction* in the Java *TableEnvironment*. It's > used to register a Java UserDefinedFunction class as a catalog function. We > should align the Python Table API with Java and add such an interface in the > Python *TableEnvironment*. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-18624) Document CREATE TEMPORARY TABLE
[ https://issues.apache.org/jira/browse/FLINK-18624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-18624: - Fix Version/s: (was: 1.14.0) 1.15.0 > Document CREATE TEMPORARY TABLE > --- > > Key: FLINK-18624 > URL: https://issues.apache.org/jira/browse/FLINK-18624 > Project: Flink > Issue Type: New Feature > Components: Documentation >Reporter: Jingsong Lee >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.11.5, 1.15.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-18442) Move `testSessionWindowsWithContinuousEventTimeTrigger` to `ContinuousEventTimeTriggerTest`
[ https://issues.apache.org/jira/browse/FLINK-18442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-18442: - Fix Version/s: (was: 1.14.0) 1.15.0 > Move `testSessionWindowsWithContinuousEventTimeTrigger` to > `ContinuousEventTimeTriggerTest` > --- > > Key: FLINK-18442 > URL: https://issues.apache.org/jira/browse/FLINK-18442 > Project: Flink > Issue Type: Improvement > Components: Tests >Reporter: Lijie Wang >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned > Fix For: 1.15.0 > > > `testSessionWindowsWithContinuousEventTimeTrigger` in `WindowOperatorTest` was > introduced when fixing > [FLINK-4862|https://issues.apache.org/jira/browse/FLINK-4862]. > But it would be better to move `testSessionWindowsWithContinuousEventTimeTrigger` > into `ContinuousEventTimeTriggerTest`. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-18563) Add Support for Azure Cosmos DB DataStream Connector
[ https://issues.apache.org/jira/browse/FLINK-18563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-18563: - Fix Version/s: (was: 1.14.0) 1.15.0 > Add Support for Azure Cosmos DB DataStream Connector > > > Key: FLINK-18563 > URL: https://issues.apache.org/jira/browse/FLINK-18563 > Project: Flink > Issue Type: Improvement > Components: API / DataStream, Connectors / Common >Affects Versions: 1.12.0 >Reporter: Israel Ekpo >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.15.0 > > > Add Support for Azure Cosmos DB DataStream Connector > The objective of this improvement is to add Azure Cosmos DB [2] as an input > source and output sink for the DataStream connectors [1] > [1] > https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/connectors/#datastream-connectors > [2] https://docs.microsoft.com/en-us/azure/cosmos-db/change-feed -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-18564) Add Support for Azure Event Hub DataStream Connector
[ https://issues.apache.org/jira/browse/FLINK-18564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-18564: - Fix Version/s: (was: 1.14.0) 1.15.0 > Add Support for Azure Event Hub DataStream Connector > > > Key: FLINK-18564 > URL: https://issues.apache.org/jira/browse/FLINK-18564 > Project: Flink > Issue Type: Improvement > Components: API / DataStream, Connectors / Common >Affects Versions: 1.12.0 >Reporter: Israel Ekpo >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.15.0 > > > The objective of this improvement is to add Azure Event Hubs [2] as an input > source and output sink for the DataStream connectors [1] > [1] > https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/connectors/#datastream-connectors > [2] https://docs.microsoft.com/en-us/azure/event-hubs/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-18022) Add e2e test for new streaming file sink
[ https://issues.apache.org/jira/browse/FLINK-18022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-18022: - Fix Version/s: (was: 1.14.0) 1.15.0 > Add e2e test for new streaming file sink > > > Key: FLINK-18022 > URL: https://issues.apache.org/jira/browse/FLINK-18022 > Project: Flink > Issue Type: Improvement > Components: Connectors / FileSystem, Tests >Affects Versions: 1.11.0 >Reporter: Danny Chen >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.15.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-18565) Add Support for Azure Event Grid DataStream Connector
[ https://issues.apache.org/jira/browse/FLINK-18565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-18565: - Fix Version/s: (was: 1.14.0) 1.15.0 > Add Support for Azure Event Grid DataStream Connector > - > > Key: FLINK-18565 > URL: https://issues.apache.org/jira/browse/FLINK-18565 > Project: Flink > Issue Type: Improvement > Components: API / DataStream, Connectors / Common >Affects Versions: 1.12.0 >Reporter: Israel Ekpo >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.15.0 > > > The objective of this improvement is to add Azure Event Grid [2] as an output > sink for the DataStream connectors [1] > [1] > https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/connectors/#datastream-connectors > [2] https://docs.microsoft.com/en-us/azure/event-grid/overview -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-18229) Pending worker requests should be properly cleared
[ https://issues.apache.org/jira/browse/FLINK-18229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-18229: - Fix Version/s: (was: 1.14.0) 1.15.0 > Pending worker requests should be properly cleared > -- > > Key: FLINK-18229 > URL: https://issues.apache.org/jira/browse/FLINK-18229 > Project: Flink > Issue Type: Sub-task > Components: Deployment / Kubernetes, Deployment / YARN, Runtime / > Coordination >Affects Versions: 1.9.3, 1.10.1, 1.11.0 >Reporter: Xintong Song >Priority: Major > Fix For: 1.15.0 > > > Currently, if Kubernetes/Yarn does not have enough resources to fulfill > Flink's resource requirement, there will be pending pod/container requests on > Kubernetes/Yarn. These pending resource requests are never cleared until > either fulfilled or the Flink cluster is shut down. > However, sometimes Flink no longer needs the pending resources. E.g., the > slot request is fulfilled by another slot that becomes available, or the > job fails due to a slot request timeout (in a session cluster). In such cases, > Flink does not remove the resource request until the resource is allocated, > then it discovers that it no longer needs the allocated resource and releases > it. This affects the underlying Kubernetes/Yarn cluster, especially > when the cluster is under heavy workload. > It would be good for Flink to cancel pod/container requests as early as > possible if it can discover that some of the pending workers are no longer > needed. > There are several approaches that could potentially achieve this. > # We can always check whether there's a pending worker that can be canceled > when a {{PendingTaskManagerSlot}} is unassigned. > # We can have a separate timeout for requesting a new worker. If the resource > cannot be allocated within the given time after being requested, we should cancel > that resource request and claim a resource allocation failure.
> # We can share the same timeout for starting new worker (proposed in > FLINK-13554). This is similar to 2), but it requires the worker to be > registered, rather than allocated, before timeout. -- This message was sent by Atlassian Jira (v8.3.4#803005)
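Approach 2 above (a separate allocation timeout) can be sketched as a small bookkeeping structure; the class and method names below are hypothetical, and the actual cancellation call to Kubernetes/Yarn is left to the caller:

```python
import time

class PendingWorkerRequests:
    """Track pending worker (pod/container) requests with a per-request
    allocation deadline; requests not fulfilled in time are dropped so the
    caller can cancel them with the cluster manager instead of letting
    them linger on the Kubernetes/Yarn side."""

    def __init__(self, allocation_timeout_s):
        self.allocation_timeout_s = allocation_timeout_s
        self.deadlines = {}  # request_id -> deadline (monotonic seconds)

    def request(self, request_id, now=None):
        now = time.monotonic() if now is None else now
        self.deadlines[request_id] = now + self.allocation_timeout_s

    def fulfilled(self, request_id):
        # Resource allocated in time: stop tracking it.
        self.deadlines.pop(request_id, None)

    def cancel_expired(self, now=None):
        """Return (and drop) requests whose deadline has passed; the caller
        cancels these and claims a resource allocation failure."""
        now = time.monotonic() if now is None else now
        expired = [r for r, d in self.deadlines.items() if d <= now]
        for r in expired:
            del self.deadlines[r]
        return expired
```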
[jira] [Updated] (FLINK-18202) Introduce Protobuf format
[ https://issues.apache.org/jira/browse/FLINK-18202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-18202: - Fix Version/s: (was: 1.14.0) 1.15.0 > Introduce Protobuf format > - > > Key: FLINK-18202 > URL: https://issues.apache.org/jira/browse/FLINK-18202 > Project: Flink > Issue Type: New Feature > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table > SQL / Ecosystem >Reporter: Benchao Li >Priority: Major > Labels: auto-unassigned, pull-request-available, sprint > Fix For: 1.15.0 > > Attachments: image-2020-06-15-17-18-03-182.png > > > PB[1] is a very famous and widely used (de)serialization framework. The ML[2] > also has some discussions about this. It's a useful feature. > This issue may need some design work, or a FLIP. > [1] [https://developers.google.com/protocol-buffers] > [2] [http://apache-flink.147419.n8.nabble.com/Flink-SQL-UDF-td3725.html] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-17916) Provide API to separate KafkaShuffle's Producer and Consumer to different jobs
[ https://issues.apache.org/jira/browse/FLINK-17916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-17916: - Fix Version/s: (was: 1.14.0) 1.15.0 > Provide API to separate KafkaShuffle's Producer and Consumer to different jobs > -- > > Key: FLINK-17916 > URL: https://issues.apache.org/jira/browse/FLINK-17916 > Project: Flink > Issue Type: Improvement > Components: API / DataStream, Connectors / Kafka >Affects Versions: 1.11.0 >Reporter: Yuan Mei >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned, > pull-request-available > Fix For: 1.15.0 > > > Follow up of FLINK-15670 > *Separate sink (producer) and source (consumer) to different jobs* > * In the same job, a sink and a source are recovered independently according > to regional failover. However, they share the same checkpoint coordinator and, > correspondingly, the same global checkpoint snapshot. > * That is to say, if the consumer fails, the producer cannot commit written data > because of the two-phase commit set-up (the producer needs a checkpoint-complete > signal to complete the second stage). > * The same applies to the producer. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-18627) Get unmatched filter method records to side output
[ https://issues.apache.org/jira/browse/FLINK-18627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-18627: - Fix Version/s: (was: 1.14.0) 1.15.0 > Get unmatched filter method records to side output > > > Key: FLINK-18627 > URL: https://issues.apache.org/jira/browse/FLINK-18627 > Project: Flink > Issue Type: Improvement > Components: API / DataStream >Reporter: Roey Shem Tov >Priority: Minor > Labels: auto-deprioritized-major, pull-request-available > Fix For: 1.15.0 > > > Records that do not match a filter function should somehow be sent to a side output. > Example:
{code:java}
datastream
    .filter(i -> i % 2 == 0)
    .sideOutput(oddNumbersSideOutput);
{code}
That way we can filter multiple times and send the filtered-out records to our side output instead of dropping them immediately; it can be useful in many ways. > What do you think? -- This message was sent by Atlassian Jira (v8.3.4#803005)
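The behaviour proposed above amounts to splitting a stream on a predicate and keeping both halves. A minimal, Flink-free sketch of that semantics:

```python
def filter_with_side_output(records, predicate):
    """Return (main, side): records matching the predicate stay on the
    main output, while non-matching records go to a side output instead
    of being dropped."""
    main, side = [], []
    for record in records:
        (main if predicate(record) else side).append(record)
    return main, side
```

In Flink today a similar effect is usually achieved with a ProcessFunction that emits non-matching elements to an OutputTag; the proposal would make this a one-liner on filter itself.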
[jira] [Updated] (FLINK-18405) Add watermark support for unaligned checkpoints
[ https://issues.apache.org/jira/browse/FLINK-18405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-18405: - Fix Version/s: (was: 1.14.0) 1.15.0 > Add watermark support for unaligned checkpoints > --- > > Key: FLINK-18405 > URL: https://issues.apache.org/jira/browse/FLINK-18405 > Project: Flink > Issue Type: Improvement > Components: Runtime / Network >Affects Versions: 1.12.0 >Reporter: Arvid Heise >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.15.0 > > > Currently, Flink generates the watermark as a first step of recovery instead of > storing the latest watermark in the operators, to ease rescaling. In unaligned > checkpoints, that means on recovery, Flink generates watermarks after it > restores in-flight data. If your pipeline uses an operator that applies the > latest watermark to each record, it will produce incorrect results during > recovery if the watermark is not directly or indirectly part of the operator > state. Thus, the SQL OVER operator should not be used with unaligned > checkpoints, while window operators are safe to use. > A possible solution is to store the watermark in the operator state. If > rescaling may occur, watermarks should be stored per key-group in a > union-state. -- This message was sent by Atlassian Jira (v8.3.4#803005)
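The union-state idea in the last paragraph can be sketched as follows: after rescaling, every subtask sees the union of all per-key-group watermark entries and restores the minimum over the key-groups it now owns (hypothetical names, not a Flink API):

```python
def restored_watermark(per_key_group_watermarks, owned_key_groups):
    """Given the union of all per-key-group watermark entries, a subtask
    restores its operator watermark as the minimum over the key-groups
    assigned to it. The minimum is the safe choice: a watermark must
    never overtake data that has not been processed yet."""
    owned = [wm for kg, wm in per_key_group_watermarks.items()
             if kg in owned_key_groups]
    return min(owned) if owned else None
```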
[jira] [Updated] (FLINK-19263) Enforce alphabetical order in configuration option docs
[ https://issues.apache.org/jira/browse/FLINK-19263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-19263: - Fix Version/s: (was: 1.14.0) 1.15.0 > Enforce alphabetical order in configuration option docs > --- > > Key: FLINK-19263 > URL: https://issues.apache.org/jira/browse/FLINK-19263 > Project: Flink > Issue Type: Improvement > Components: Documentation, Runtime / Configuration >Reporter: Chesnay Schepler >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.15.0 > > > The ConfigDocsGenerator sorts options alphabetically, however there are no > checks to ensure that the generated files adhere to that order. > This is a problem because time and time again these files are manually > modified, breaking the order, causing other PRs that then use the generator > to include unrelated changes. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-18981) Support column comment for Hive tables
[ https://issues.apache.org/jira/browse/FLINK-18981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-18981: - Fix Version/s: (was: 1.14.0) 1.15.0 > Support column comment for Hive tables > -- > > Key: FLINK-18981 > URL: https://issues.apache.org/jira/browse/FLINK-18981 > Project: Flink > Issue Type: New Feature > Components: Connectors / Hive >Reporter: Rui Li >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.15.0 > > > Start working on this once FLINK-18958 is done -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-19499) Expose Metric Groups to Split Assigners
[ https://issues.apache.org/jira/browse/FLINK-19499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-19499: - Fix Version/s: (was: 1.14.0) 1.15.0 > Expose Metric Groups to Split Assigners > --- > > Key: FLINK-19499 > URL: https://issues.apache.org/jira/browse/FLINK-19499 > Project: Flink > Issue Type: Improvement > Components: Connectors / FileSystem >Reporter: Stephan Ewen >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.15.0 > > > Split Assigners should have access to metric groups, so they can report > metrics on assignment, like pending splits, local-, and remote assignments. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-18235) Improve the checkpoint strategy for Python UDF execution
[ https://issues.apache.org/jira/browse/FLINK-18235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-18235: - Fix Version/s: (was: 1.14.0) 1.15.0 > Improve the checkpoint strategy for Python UDF execution > > > Key: FLINK-18235 > URL: https://issues.apache.org/jira/browse/FLINK-18235 > Project: Flink > Issue Type: Improvement > Components: API / Python >Reporter: Dian Fu >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.15.0 > > > Currently, when a checkpoint is triggered for the Python operator, all the > buffered data is flushed to the Python worker to be processed. This increases > the overall checkpoint time when there are a lot of buffered elements and the > Python UDF is slow. We should improve the checkpoint strategy to address this, > e.g. by buffering the data into state instead of flushing it out. We could also > let users configure the checkpoint strategy if needed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
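The suggested strategy, snapshotting the buffer into state instead of pushing it through the (possibly slow) Python worker at checkpoint time, can be sketched like this (a hypothetical operator shape, not PyFlink's internals):

```python
class BufferingCheckpointOperator:
    """Instead of flushing every buffered element to the Python worker
    when a checkpoint barrier arrives, persist the buffer into operator
    state; the elements are replayed into the worker after restore."""

    def __init__(self):
        self.buffer = []  # elements awaiting the Python UDF

    def process(self, element):
        self.buffer.append(element)

    def snapshot_state(self):
        # Checkpoint barrier: no flush, just copy the buffer into state.
        return list(self.buffer)

    def restore(self, state):
        # After recovery, replay the buffered elements into the worker.
        self.buffer = list(state)
```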
[jira] [Updated] (FLINK-18578) Add rejecting checkpoint logic in source
[ https://issues.apache.org/jira/browse/FLINK-18578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-18578: - Fix Version/s: (was: 1.14.0) 1.15.0 > Add rejecting checkpoint logic in source > > > Key: FLINK-18578 > URL: https://issues.apache.org/jira/browse/FLINK-18578 > Project: Flink > Issue Type: New Feature > Components: Connectors / Common, Runtime / Checkpointing >Reporter: Qingsheng Ren >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned > Fix For: 1.15.0 > > > In some databases' change data capture (CDC) cases, the process is usually > divided into two phases: a snapshotting phase (lock captured tables and scan > all records in them) and a log streaming phase (read all changes starting from > the moment of locking the tables) in order to build a complete view of the captured > tables. The snapshotting phase has to be atomic, so we have to give up > all records created in the snapshotting phase if any failure happens, because > the contents of the captured tables might have changed during recovery. > Checkpointing within the snapshotting phase is meaningless too. > As a result, we need to add a new feature in the source to reject a checkpoint > if the source is currently within an atomic operation or some other process > that cannot checkpoint at the moment. This rejection should not be treated > as a failure that could fail the entire job. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
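The rejection contract described above, decline but don't fail, can be sketched as follows (names are hypothetical and this is not the FLIP-27 source API):

```python
class CdcSource:
    """A source that declines checkpoints while inside the atomic
    snapshotting phase; a declined checkpoint is simply skipped by the
    coordinator rather than treated as a job failure."""

    def __init__(self):
        self.in_snapshot_phase = False

    def start_snapshot(self):
        self.in_snapshot_phase = True   # tables locked, full scan running

    def finish_snapshot(self):
        self.in_snapshot_phase = False  # switch to log streaming phase

    def on_checkpoint(self, checkpoint_id):
        if self.in_snapshot_phase:
            return ("declined", checkpoint_id)  # try again later, not an error
        return ("acknowledged", checkpoint_id)
```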
[jira] [Updated] (FLINK-18753) Support local recovery for Unaligned checkpoints
[ https://issues.apache.org/jira/browse/FLINK-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-18753: - Fix Version/s: (was: 1.14.0) 1.15.0 > Support local recovery for Unaligned checkpoints > > > Key: FLINK-18753 > URL: https://issues.apache.org/jira/browse/FLINK-18753 > Project: Flink > Issue Type: Improvement > Components: Runtime / Checkpointing, Runtime / Task >Affects Versions: 1.11.0 >Reporter: Roman Khachatryan >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.15.0 > > > There is a standard mechanism of writing two duplicate streams of state > data: to DFS and to the local FS. On recovery, if local state is present and it > matches the state on the JM, then it is used (instead of downloading the state from > DFS). > Currently, local recovery is not supported when running with unaligned > checkpoints enabled. > To enable it, we need to > # write this second stream (see > CheckpointStreamWithResultProvider.getCheckpointOutputStream) > # make sure localState is included in the snapshot reported to the JM > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-19103) The PushPartitionIntoTableSourceScanRule will lead to a performance problem when there are still many partitions after pruning
[ https://issues.apache.org/jira/browse/FLINK-19103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-19103: - Fix Version/s: (was: 1.14.0) 1.15.0 > The PushPartitionIntoTableSourceScanRule will lead to a performance problem when > there are still many partitions after pruning > --- > > Key: FLINK-19103 > URL: https://issues.apache.org/jira/browse/FLINK-19103 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Planner >Affects Versions: 1.10.2, 1.11.1 >Reporter: fa zheng >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.15.0 > > > The PushPartitionIntoTableSourceScanRule obtains new statistics after > pruning; however, it uses a for loop to get the statistics of each partition and > then merges them together. During this process, Flink calls the > metastore's interface four times in each loop iteration. When many partitions > remain, it spends a lot of time obtaining the new statistics. > > {code:scala} > val newStatistic = { > val tableStats = catalogOption match { > case Some(catalog) => > def mergePartitionStats(): TableStats = { > var stats: TableStats = null > for (p <- remainingPartitions) { > getPartitionStats(catalog, tableIdentifier, p) match { > case Some(currStats) => > if (stats == null) { > stats = currStats > } else { > stats = stats.merge(currStats) > } > case None => return null > } > } > stats > } > mergePartitionStats() > case None => null > } > > FlinkStatistic.builder().statistic(statistic).tableStats(tableStats).build() > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
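One direction for the FLINK-19103 improvement is to fetch statistics for all remaining partitions in a single bulk call and merge them locally, instead of issuing several metastore RPCs per partition inside the loop. The sketch below assumes such a bulk-fetch result already exists as a map — the bulk catalog call itself is a hypothetical API, not something the Hive metastore necessarily offers in this form:

```java
import java.util.List;
import java.util.Map;

// Sketch: merge pre-fetched per-partition row counts with no RPC in the loop.
// The bulk-fetched map is an assumed input; names are illustrative.
public class PartitionStatsMerger {
    /** Returns the total row count, or -1 if any partition lacks statistics
     *  (mirroring the rule's "give up and return null" behaviour). */
    public static long mergeRowCounts(List<String> remainingPartitions,
                                      Map<String, Long> bulkFetchedStats) {
        long total = 0;
        for (String partition : remainingPartitions) {
            Long rows = bulkFetchedStats.get(partition); // local lookup, no metastore call
            if (rows == null) {
                return -1; // unknown statistics for one partition: abandon the merge
            }
            total += rows;
        }
        return total;
    }
}
```

The merge itself is cheap; the expensive part in the current code is the per-partition metastore round-trips, which a bulk fetch would amortize into one call.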
[jira] [Updated] (FLINK-18231) flink kafka connectors contain the same fully qualified class names, which will cause conflicts when a user has all the versions of flink-kafka-connector
[ https://issues.apache.org/jira/browse/FLINK-18231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-18231: - Fix Version/s: (was: 1.14.0) 1.15.0 > flink kafka connectors contain the same fully qualified class names, which will cause > conflicts when a user has all the versions of flink-kafka-connector > -- > > Key: FLINK-18231 > URL: https://issues.apache.org/jira/browse/FLINK-18231 > Project: Flink > Issue Type: Improvement > Components: Connectors / Kafka >Affects Versions: 1.11.0 >Reporter: jackylau >Priority: Minor > Labels: auto-deprioritized-major, pull-request-available > Fix For: 1.15.0 > > > > There are four versions of the kafka connector (0.9/0.10/0.11/2.x) in 1.11, > and three versions (0.10/0.11/2.x) in 1.12-snapshot. > But the flink kafka connectors contain the same fully qualified class names, such as > KafkaConsumerThread and Handover, which will cause conflicts when a user has all > the versions of flink-kafka-connector on the classpath. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-18568) Add Support for Azure Data Lake Store Gen 2 in Streaming File Sink
[ https://issues.apache.org/jira/browse/FLINK-18568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-18568: - Fix Version/s: (was: 1.14.0) 1.15.0 > Add Support for Azure Data Lake Store Gen 2 in Streaming File Sink > -- > > Key: FLINK-18568 > URL: https://issues.apache.org/jira/browse/FLINK-18568 > Project: Flink > Issue Type: Improvement > Components: API / DataStream, Connectors / Common >Affects Versions: 1.12.0 >Reporter: Israel Ekpo >Assignee: Srinivasulu Punuru >Priority: Minor > Labels: auto-deprioritized-major, stale-assigned > Fix For: 1.15.0 > > > The objective of this improvement is to add support for Azure Data Lake Store > Gen 2 (ADLS Gen2) [2] as one of the supported filesystems for the Streaming > File Sink [1] > [1] > https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/connectors/streamfile_sink.html > [2] https://hadoop.apache.org/docs/current/hadoop-azure/abfs.html -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-19774) Introduce Sub Partition View Version for Approximate Local Recovery
[ https://issues.apache.org/jira/browse/FLINK-19774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-19774: - Fix Version/s: (was: 1.14.0) 1.15.0 > Introduce Sub Partition View Version for Approximate Local Recovery > --- > > Key: FLINK-19774 > URL: https://issues.apache.org/jira/browse/FLINK-19774 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Task >Reporter: Yuan Mei >Priority: Major > Labels: auto-unassigned > Fix For: 1.15.0 > > > > This ticket is to solve a corner case where a downstream task continuously > fails multiple times, or an orphan task execution may exist for a short > period of time after new execution is running (as described in the FLIP) > > Here is an idea of how to cleanly and thoroughly solve this kind of problem: > # We go with the simplified release view version: only release view before a > new creation (in thread2). That says we won't clean up view when downstream > task disconnects ({{releaseView}} would not be called from the reference copy > of view) (in thread1 or 2). > * > ** This would greatly simplify the threading model > ** This won't cause any resource leak, since view release is only to notify > the upstream result partition to releaseOnConsumption when all subpartitions > are consumed in PipelinedSubPartitionView. In our case, we do not release the > result partition on consumption any way (the result partition is put in track > in JobMaster, similar to the ResultParition.blocking Type). > 2. Each view is associated with a downstream task execution version > * > ** This is making sense because we actually have different versions of view > now, corresponding to the vertex.version of the downstream task. > ** createView is performed only if the new version to create is greater than > the existing one > ** If we decide to create a new view, the old view should be released. > I think this way, we can completely disconnect the old view with the > subpartition. 
Besides that, the working handler in use would always hold the > freshest view reference. > > Point 1 has already been addressed in FLINK-19632. This ticket is to address > Point 2. > Details discussion in [https://github.com/apache/flink/pull/13648] > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-18734) Add documentation for DynamoStreams Consumer CDC
[ https://issues.apache.org/jira/browse/FLINK-18734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-18734: - Fix Version/s: (was: 1.14.0) 1.15.0 > Add documentation for DynamoStreams Consumer CDC > > > Key: FLINK-18734 > URL: https://issues.apache.org/jira/browse/FLINK-18734 > Project: Flink > Issue Type: Improvement > Components: Connectors / Kinesis, Documentation >Affects Versions: 1.11.1 >Reporter: Vinay >Priority: Minor > Labels: CDC, documentation > Fix For: 1.11.5, 1.15.0 > > > Flink already supports CDC for DynamoDb - > https://issues.apache.org/jira/browse/FLINK-4582 by reading the data from > DynamoStreams but there is no documentation for the same. Given that Flink > now supports CDC for Debezium as well , we should add the documentation for > Dynamo CDC so that more users can use this feature. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-19432) Whether to capture the updates which don't change any monitored columns
[ https://issues.apache.org/jira/browse/FLINK-19432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-19432: - Fix Version/s: (was: 1.14.0) 1.15.0 > Whether to capture the updates which don't change any monitored columns > --- > > Key: FLINK-19432 > URL: https://issues.apache.org/jira/browse/FLINK-19432 > Project: Flink > Issue Type: Improvement > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) >Affects Versions: 1.11.1 >Reporter: shizhengchao >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.15.0 > > > With `debezium-json` and `canal-json`: > whether to capture the updates which don't change any monitored columns. This > may happen if the monitored columns (the columns defined in the Flink SQL DDL) are a > subset of the columns in the database table. We can provide an optional option, > defaulting to 'true', which means all updates will be captured. Users can set it to > 'false' to capture only the updates that change monitored columns. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-19846) Grammar mistakes in annotations and log
[ https://issues.apache.org/jira/browse/FLINK-19846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-19846: - Fix Version/s: (was: 1.14.0) 1.15.0 > Grammar mistakes in annotations and log > --- > > Key: FLINK-19846 > URL: https://issues.apache.org/jira/browse/FLINK-19846 > Project: Flink > Issue Type: Improvement >Affects Versions: 1.11.2 >Reporter: zoucao >Priority: Minor > Fix For: 1.15.0 > > > There exist some grammar mistakes in annotations and documents. The mistakes > include but are not limited to the following examples: > * "a entry" in WebLogAnalysis.java [246:34] and adm-zip.js [291:33] (which > should be "an entry") > * "a input" in JobGraphGenerator.java [1125:69] etc. (which should be "an > input") > * "a intersection" > * "an user-*" in Table.java etc. (which should be "a user") > Using global search in IntelliJ IDEA, more mistakes like these can be found. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-18873) Make the WatermarkStrategy API more scala friendly
[ https://issues.apache.org/jira/browse/FLINK-18873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-18873: - Fix Version/s: (was: 1.14.0) 1.15.0 > Make the WatermarkStrategy API more scala friendly > -- > > Key: FLINK-18873 > URL: https://issues.apache.org/jira/browse/FLINK-18873 > Project: Flink > Issue Type: Improvement > Components: API / Core, API / DataStream >Affects Versions: 1.11.0 >Reporter: Dawid Wysakowicz >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.15.0 > > > Right now there is no reliable way of passing WatermarkGeneratorSupplier > and/or TimestampAssigner as lambdas in scala. > The only way to use this API is: > {code} > .assignTimestampsAndWatermarks( > WatermarkStrategy.forGenerator[(String, Long)]( > new WatermarkGeneratorSupplier[(String, Long)] { > override def createWatermarkGenerator(context: > WatermarkGeneratorSupplier.Context): WatermarkGenerator[(String, Long)] = > new MyPeriodicGenerator > } > ) > .withTimestampAssigner(new SerializableTimestampAssigner[(String, > Long)] { > override def extractTimestamp(t: (String, Long), l: Long): Long = > t._2 > }) > ) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-19425) Correct the usage of BulkWriter#flush and BulkWriter#finish
[ https://issues.apache.org/jira/browse/FLINK-19425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-19425: - Fix Version/s: (was: 1.14.0) 1.15.0 > Correct the usage of BulkWriter#flush and BulkWriter#finish > --- > > Key: FLINK-19425 > URL: https://issues.apache.org/jira/browse/FLINK-19425 > Project: Flink > Issue Type: Improvement > Components: Connectors / Common >Affects Versions: 1.11.0 >Reporter: hailong wang >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.11.0, 1.15.0 > > > According to its comments, the BulkWriter#finish method should flush all buffered data before > close. > But some subclasses of it do not flush data. These classes are as follows: > 1. AvroBulkWriter#finish > 2. HadoopCompressionBulkWriter#finish > 3. NoCompressionBulkWriter#finish > 4. SequenceFileWriter#finish > We should invoke BulkWriter#flush in these finish methods. > On the other hand, we don't have to invoke BulkWriter#flush in the following close methods, > since BulkWriter#finish will already have flushed all data: > 1. HadoopPathBasedPartFileWriter#closeForCommit > 2. BulkPartWriter#closeForCommit > 3. FileSystemTableSink#OutputFormat#close > -- This message was sent by Atlassian Jira (v8.3.4#803005)
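The contract FLINK-19425 asks for — `finish()` is responsible for flushing, so callers need not call `flush()` before closing — can be shown with a minimal buffering writer. This is a sketch of the intended contract with hypothetical names, not the actual `BulkWriter` implementations:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of the intended BulkWriter contract: finish() flushes
// whatever is still buffered, so close paths need no extra flush() call.
public class BufferingWriter {
    private final List<String> buffer = new ArrayList<>();
    private final List<String> sink = new ArrayList<>();

    public void addElement(String element) {
        buffer.add(element);
    }

    /** Moves all buffered elements to the sink. */
    public void flush() {
        sink.addAll(buffer);
        buffer.clear();
    }

    /** Per the contract, finish() must itself flush remaining buffered data. */
    public void finish() {
        flush();
    }

    public List<String> written() {
        return sink;
    }
}
```

With this contract, a `closeForCommit()`-style caller can call `finish()` alone and still be guaranteed that no buffered element is lost.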
[jira] [Updated] (FLINK-19085) Remove deprecated methods for writing CSV and Text files from DataStream
[ https://issues.apache.org/jira/browse/FLINK-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-19085: - Fix Version/s: (was: 1.14.0) 1.15.0 > Remove deprecated methods for writing CSV and Text files from DataStream > > > Key: FLINK-19085 > URL: https://issues.apache.org/jira/browse/FLINK-19085 > Project: Flink > Issue Type: Sub-task > Components: API / DataStream >Reporter: Dawid Wysakowicz >Priority: Major > Labels: auto-unassigned > Fix For: 1.15.0 > > > We can remove long deprecated {{PublicEvolving}} methods: > - DataStream#writeAsText > - DataStream#writeAsCsv -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-18556) Drop the unused options in TableConfig
[ https://issues.apache.org/jira/browse/FLINK-18556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-18556: - Fix Version/s: (was: 1.14.0) 1.15.0 > Drop the unused options in TableConfig > -- > > Key: FLINK-18556 > URL: https://issues.apache.org/jira/browse/FLINK-18556 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Reporter: Jark Wu >Priority: Major > Fix For: 1.15.0 > > > As discussed in FLINK-16835, the following {{TableConfig}} options are not > preserved: > * {{nullCheck}}: Flink will automatically enable null checks based on the > table schema ({{NOT NULL}} property) > * {{decimalContext}}: this configuration is only used by the legacy planner, > which will be removed in one of the next releases > * {{maxIdleStateRetention}}: is automatically derived as 1.5 * > {{idleStateRetention}} until StateTtlConfig is fully supported (at which > point only a single parameter is required). > The blink planner should remove the dependencies on {{nullCheck}} and > {{maxIdleStateRetention}} first. Besides, this may be blocked by the removal of the > old planner, because the old planner is still using them. -- This message was sent by Atlassian Jira (v8.3.4#803005)
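The 1.5x derivation mentioned in FLINK-18556 above is simple enough to state as code. The class and method names here are illustrative, not a real Flink API — this only mirrors the described behaviour that `maxIdleStateRetention` is no longer a separate knob:

```java
// Illustration of the described derivation: the maximum idle state retention
// is derived as 1.5 * idleStateRetention rather than configured separately.
// Names are hypothetical, for illustration only.
public class RetentionDerivation {
    public static long deriveMaxIdleStateRetention(long idleStateRetentionMs) {
        // 1.5x, computed in integer arithmetic to stay in milliseconds
        return idleStateRetentionMs + idleStateRetentionMs / 2;
    }
}
```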
[jira] [Updated] (FLINK-18213) refactor the kafka sql connector to use just one shaded jar compatible with 0.10.2+
[ https://issues.apache.org/jira/browse/FLINK-18213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-18213: - Fix Version/s: (was: 1.14.0) 1.15.0 > refactor the kafka sql connector to use just one shaded jar compatible with 0.10.2+ > --- > > Key: FLINK-18213 > URL: https://issues.apache.org/jira/browse/FLINK-18213 > Project: Flink > Issue Type: Improvement > Components: Connectors / Kafka >Affects Versions: 1.11.0 >Reporter: jackylau >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.15.0 > > > Flink master supports 0.10/0.11/2.x, with three flink-sql-connector shaded jars > currently (1.12-snapshot). > As we all know, the kafka client is compatible across broker versions since 0.10.2.x, so we can use > the 2.x kafka client to access broker servers running 0.10/0.11/2.x. > So we can use just one kafka sql shaded jar. > For this, we should do two things: > 1) refactor to one shaded jar > 2) rename the flink-kafka-connector modules that share fully qualified class names, to avoid > conflicts such as NoSuchMethod or ClassNotFound errors -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-19362) Remove confusing comment for `DOT` operator codegen
[ https://issues.apache.org/jira/browse/FLINK-19362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-19362: - Fix Version/s: (was: 1.14.0) 1.15.0 > Remove confusing comment for `DOT` operator codegen > --- > > Key: FLINK-19362 > URL: https://issues.apache.org/jira/browse/FLINK-19362 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Runtime >Affects Versions: 1.11.0 >Reporter: hailong wang >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.15.0 > > > The `DOT` operator codegen (ExprCodeGenerator#generateCallExpression) has the following comment: > {code:java} > // due to https://issues.apache.org/jira/browse/CALCITE-2162, expression such > as > // "array[1].a.b" won't work now. > if (operands.size > 2) { > throw new CodeGenException( > "A DOT operator with more than 2 operands is not supported yet.") > } > {code} > But `array[1].a.b` actually works in a Flink job. `DOT` is transformed > to `RexFieldAccess` because of CALCITE-2542, and `generateDot` will never be invoked > except to support ITEM for ROW types. > So I think we can simply delete the confusing comment. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-18566) Add Support for Azure Cognitive Search DataStream Connector
[ https://issues.apache.org/jira/browse/FLINK-18566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-18566: - Fix Version/s: (was: 1.14.0) 1.15.0 > Add Support for Azure Cognitive Search DataStream Connector > --- > > Key: FLINK-18566 > URL: https://issues.apache.org/jira/browse/FLINK-18566 > Project: Flink > Issue Type: Improvement > Components: API / DataStream, Connectors / Common >Affects Versions: 1.12.0 >Reporter: Israel Ekpo >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.15.0 > > > The objective of this improvement is to add Azure Cognitive Search [2] as an > output sink for the DataStream connectors [1] > [1] > https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/connectors/#datastream-connectors > [2] https://docs.microsoft.com/en-us/azure/search/search-what-is-azure-search -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-18954) Add documentation for connectors in Python DataStream API.
[ https://issues.apache.org/jira/browse/FLINK-18954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-18954: - Fix Version/s: (was: 1.14.0) 1.15.0 > Add documentation for connectors in Python DataStream API. > -- > > Key: FLINK-18954 > URL: https://issues.apache.org/jira/browse/FLINK-18954 > Project: Flink > Issue Type: Sub-task > Components: API / Python >Reporter: Hequn Cheng >Priority: Major > Labels: auto-unassigned > Fix For: 1.15.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-19743) Add Source metrics definitions
[ https://issues.apache.org/jira/browse/FLINK-19743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-19743: - Fix Version/s: (was: 1.14.0) 1.15.0 > Add Source metrics definitions > -- > > Key: FLINK-19743 > URL: https://issues.apache.org/jira/browse/FLINK-19743 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Common >Affects Versions: 1.11.2 >Reporter: Jiangjie Qin >Priority: Major > Labels: auto-unassigned, pull-request-available > Fix For: 1.12.6, 1.15.0 > > > Add the metrics defined in > [FLIP-33|https://cwiki.apache.org/confluence/display/FLINK/FLIP-33%3A+Standardize+Connector+Metrics] > to {{OperatorMetricsGroup}} and {{SourceReaderContext}} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-19935) Supports configure heap memory of sql-client to avoid OOM
[ https://issues.apache.org/jira/browse/FLINK-19935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-19935: - Fix Version/s: (was: 1.14.0) 1.15.0 > Supports configure heap memory of sql-client to avoid OOM > - > > Key: FLINK-19935 > URL: https://issues.apache.org/jira/browse/FLINK-19935 > Project: Flink > Issue Type: New Feature > Components: Table SQL / Client >Affects Versions: 1.11.2 >Reporter: harold.miao >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.15.0 > > Attachments: image-2020-11-03-10-31-08-294.png > > > Hi, > when using sql-client to submit a job, the command below does not set the JVM heap > parameters, which caused an OOM error in my production environment: > exec $JAVA_RUN $JVM_ARGS "${log_setting[@]}" -classpath "`manglePathList > "$CC_CLASSPATH:$INTERNAL_HADOOP_CLASSPATHS"`" > org.apache.flink.table.client.SqlClient "$@" > > !image-2020-11-03-10-31-08-294.png! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-18887) Add ElasticSearch connector for Python DataStream API
[ https://issues.apache.org/jira/browse/FLINK-18887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-18887: - Fix Version/s: (was: 1.14.0) 1.15.0 > Add ElasticSearch connector for Python DataStream API > - > > Key: FLINK-18887 > URL: https://issues.apache.org/jira/browse/FLINK-18887 > Project: Flink > Issue Type: Sub-task > Components: API / Python >Reporter: Shuiqiang Chen >Priority: Major > Labels: auto-unassigned > Fix For: 1.15.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-19647) Support the limit push down for the hbase connector
[ https://issues.apache.org/jira/browse/FLINK-19647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-19647: - Fix Version/s: (was: 1.14.0) 1.15.0 > Support the limit push down for the hbase connector > --- > > Key: FLINK-19647 > URL: https://issues.apache.org/jira/browse/FLINK-19647 > Project: Flink > Issue Type: Sub-task > Components: Connectors / HBase, Table SQL / Ecosystem >Reporter: Shengkai Fang >Priority: Major > Labels: auto-unassigned, pull-request-available > Fix For: 1.15.0 > > > Support limit push down for hbase. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-19034) Remove deprecated StreamExecutionEnvironment#set/getNumberOfExecutionRetries
[ https://issues.apache.org/jira/browse/FLINK-19034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-19034: - Fix Version/s: (was: 1.14.0) 1.15.0 > Remove deprecated StreamExecutionEnvironment#set/getNumberOfExecutionRetries > > > Key: FLINK-19034 > URL: https://issues.apache.org/jira/browse/FLINK-19034 > Project: Flink > Issue Type: Sub-task > Components: API / DataStream >Reporter: Dawid Wysakowicz >Priority: Major > Labels: auto-unassigned > Fix For: 1.15.0 > > > Remove deprecated > {code} > StreamExecutionEnvironment#setNumberOfExecutionRetries/getNumberOfExecutionRetries > {code} > The corresponding settings in {{ExecutionConfig}} will be removed in a > separate issue, as they are {{Public}}. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-19659) Array type supports equals and not_equals operator when element types are different but castable
[ https://issues.apache.org/jira/browse/FLINK-19659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-19659: - Fix Version/s: (was: 1.14.0) 1.15.0 > Array type supports equals and not_equals operator when element types are > different but castable > > > Key: FLINK-19659 > URL: https://issues.apache.org/jira/browse/FLINK-19659 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API >Affects Versions: 1.11.0 >Reporter: hailong wang >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned, > pull-request-available > Fix For: 1.15.0 > > > Currently, the Array type supports `equals` and `not_equals` when the element types > are the same or cannot be cast. For example, > {code:java} > Array[1] <> Array[1] -> false{code} > {code:java} > Array[1] <> Array[cast(1 as bigint)] -> false > {code} > But for element types which are castable, it throws an error: > {code:java} > org.apache.flink.table.planner.codegen.CodeGenException: Unsupported cast > from 'ARRAY NOT NULL' to 'ARRAY NOT > NULL'.org.apache.flink.table.planner.codegen.CodeGenException: Unsupported > cast from 'ARRAY NOT NULL' to 'ARRAY NOT > NULL'. at > org.apache.flink.table.planner.codegen.calls.ScalarOperatorGens$.generateCast(ScalarOperatorGens.scala:1295) > at > org.apache.flink.table.planner.codegen.ExprCodeGenerator.generateCallExpression(ExprCodeGenerator.scala:703) > at > org.apache.flink.table.planner.codegen.ExprCodeGenerator.visitCall(ExprCodeGenerator.scala:498) > at > org.apache.flink.table.planner.codegen.ExprCodeGenerator.visitCall(ExprCodeGenerator.scala:55) > at org.apache.calcite.rex.RexCall.accept(RexCall.java:288){code} > But the result should be true or false, for example, > {code:java} > Array[1] <> Array[cast(1 as bigint)] -> false > {code} > > BTW, the Map and MultiSet types behave the same way; if this is accepted, I would be pleased to open > other issues to track those. -- This message was sent by Atlassian Jira (v8.3.4#803005)
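The semantics FLINK-19659 asks for — widen both element types to a common type, then compare element-wise — can be sketched for the INT vs BIGINT case. This is an illustration of the desired behaviour, not the planner's actual codegen; the helper name is hypothetical:

```java
import java.util.Arrays;

// Sketch of the desired comparison semantics for ARRAY<INT> vs ARRAY<BIGINT>:
// widen the narrower element type first, then compare element-wise, so the
// result is true/false instead of a CodeGenException. Names are illustrative.
public class ArrayCompare {
    public static boolean arraysEqual(int[] intArray, long[] longArray) {
        // Cast each INT element up to BIGINT (the common wider type)
        long[] widened = Arrays.stream(intArray).asLongStream().toArray();
        return Arrays.equals(widened, longArray);
    }
}
```

Under these semantics, `Array[1] <> Array[cast(1 as bigint)]` evaluates to `false` (the arrays are equal after widening), as the ticket expects.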
[jira] [Updated] (FLINK-19613) Create flink-connector-files-test-utils for formats testing
[ https://issues.apache.org/jira/browse/FLINK-19613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-19613: - Fix Version/s: (was: 1.14.0) 1.15.0 > Create flink-connector-files-test-utils for formats testing > --- > > Key: FLINK-19613 > URL: https://issues.apache.org/jira/browse/FLINK-19613 > Project: Flink > Issue Type: Improvement > Components: Connectors / FileSystem, Tests >Reporter: Jingsong Lee >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.15.0 > > > Since flink-connector-files has some tests with scala dependencies, we cannot > create test-jar for it. > We should create a new module {{flink-connector-files-test-utils}} , it > should be a scala free module, formats can rely on it for complete testing. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-19722) Pushdown Watermark to SourceProvider (new Source)
[ https://issues.apache.org/jira/browse/FLINK-19722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-19722: - Fix Version/s: (was: 1.14.0) 1.15.0 > Pushdown Watermark to SourceProvider (new Source) > - > > Key: FLINK-19722 > URL: https://issues.apache.org/jira/browse/FLINK-19722 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Planner >Reporter: Jingsong Lee >Priority: Major > Labels: auto-deprioritized-critical > Fix For: 1.15.0 > > > See {{StreamExecutionEnvironment.fromSource(Source, WatermarkStrategy)}} > The new source can get watermark strategy to handle split watermark. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20091) Introduce avro.ignore-parse-errors for AvroRowDataDeserializationSchema
[ https://issues.apache.org/jira/browse/FLINK-20091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-20091: - Fix Version/s: (was: 1.14.0) 1.15.0 > Introduce avro.ignore-parse-errors for AvroRowDataDeserializationSchema > > > Key: FLINK-20091 > URL: https://issues.apache.org/jira/browse/FLINK-20091 > Project: Flink > Issue Type: Improvement > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table > SQL / Ecosystem >Affects Versions: 1.12.0 >Reporter: hailong wang >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned, > pull-request-available > Fix For: 1.15.0 > > > Introduce avro.ignore-parse-errors to allow users to skip rows with parse > errors instead of failing when deserializing avro format data. > This is useful when there is dirty data; without this option, users cannot > skip the dirty rows. -- This message was sent by Atlassian Jira (v8.3.4#803005)
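The behaviour FLINK-20091 proposes — skip a record that fails to parse instead of failing the job — follows the pattern below. This is a generic sketch with hypothetical names, not the real `AvroRowDataDeserializationSchema`:

```java
// Sketch of an ignore-parse-errors option: a record that fails to parse
// yields null (i.e. is skipped) when the option is set, instead of
// propagating the exception and failing the job. Names are illustrative.
public class LenientDeserializer {
    private final boolean ignoreParseErrors;

    public LenientDeserializer(boolean ignoreParseErrors) {
        this.ignoreParseErrors = ignoreParseErrors;
    }

    /** Returns the parsed value, or null for a skipped dirty record. */
    public Integer deserialize(String raw) {
        try {
            return Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            if (ignoreParseErrors) {
                return null; // skip the dirty row
            }
            throw e; // default behaviour: fail fast
        }
    }
}
```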
[jira] [Updated] (FLINK-19007) Automatically set presto staging directory to io.tmp.dirs
[ https://issues.apache.org/jira/browse/FLINK-19007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-19007: - Fix Version/s: (was: 1.14.0) 1.15.0 > Automatically set presto staging directory to io.tmp.dirs > - > > Key: FLINK-19007 > URL: https://issues.apache.org/jira/browse/FLINK-19007 > Project: Flink > Issue Type: Improvement > Components: Connectors / FileSystem, Runtime / Configuration >Reporter: Chesnay Schepler >Priority: Minor > Labels: auto-deprioritized-major, usability > Fix For: 1.15.0 > > > The presto S3 filesystem uses a staging directory on the local disk (for > buffering data or something). This directory by default points to {{/tmp}}. > We often see users configuring a special directory for temporary files via > {{io.tmp.dirs}}, and it would make sense to automatically set the > {{staging-directory}} to the same path, iff it was not explicitly configured. > The corresponding property is {{presto.s3.staging-directory}}, found at > {{com.facebook.presto.hive.s3.PrestoS3FileSystem.S3_STAGING_DIRECTORY}}. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-19819) SourceReaderBase supports limit push down
[ https://issues.apache.org/jira/browse/FLINK-19819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-19819: - Fix Version/s: (was: 1.14.0) 1.15.0 > SourceReaderBase supports limit push down > - > > Key: FLINK-19819 > URL: https://issues.apache.org/jira/browse/FLINK-19819 > Project: Flink > Issue Type: Improvement > Components: Connectors / Common >Reporter: Jingsong Lee >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.15.0 > > > User requirement: > users need to look at a few random pieces of data in a table to see what the > data looks like, so they often use the SQL: > "select * from table limit 10" > For a large table, the query is expected to end soon because only a few pieces of data are > queried. > DataStream and BoundedStream are push-based execution models, so the > downstream cannot control when the source operator ends. > We need to push the limit down to the source operator, so that the source operator can end > early. -- This message was sent by Atlassian Jira (v8.3.4#803005)
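Limit push-down as described in FLINK-19819 means the source itself stops after emitting `limit` records. The sketch below uses a hypothetical reader, not the real `SourceReaderBase` API, to show why this lets `select * from table limit 10` finish without scanning the whole table:

```java
import java.util.Iterator;

// Sketch of limit push-down into a source reader (hypothetical API): once
// `limit` records have been emitted, the reader signals end-of-input, so a
// push-based pipeline can end early without downstream intervention.
public class LimitedReader<T> {
    private final Iterator<T> records;
    private final long limit;
    private long emitted;

    public LimitedReader(Iterator<T> records, long limit) {
        this.records = records;
        this.limit = limit;
    }

    /** Returns the next record, or null to signal end-of-input. */
    public T poll() {
        if (emitted >= limit || !records.hasNext()) {
            return null; // source ends here; the rest of the table is never read
        }
        emitted++;
        return records.next();
    }
}
```

The important part is that the end-of-input signal originates at the source: in a push-based model the downstream operator has no way to stop the producer, so the limit must live in the reader.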
[jira] [Updated] (FLINK-19230) Support Python UDAF on blink batch planner
[ https://issues.apache.org/jira/browse/FLINK-19230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-19230: - Fix Version/s: (was: 1.14.0) 1.15.0 > Support Python UDAF on blink batch planner > -- > > Key: FLINK-19230 > URL: https://issues.apache.org/jira/browse/FLINK-19230 > Project: Flink > Issue Type: New Feature > Components: API / Python >Reporter: Wei Zhong >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned > Fix For: 1.15.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-19895) Unify Life Cycle Management of ResultPartitionType Pipelined Family
[ https://issues.apache.org/jira/browse/FLINK-19895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-19895: - Fix Version/s: (was: 1.14.0) 1.15.0 > Unify Life Cycle Management of ResultPartitionType Pipelined Family > --- > > Key: FLINK-19895 > URL: https://issues.apache.org/jira/browse/FLINK-19895 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Task >Reporter: Yuan Mei >Priority: Major > Fix For: 1.15.0 > > > This ticket is to unify the lifecycle management of > `ResultPartitionType.PIPELINED(_BOUNDED)` and > `ResultPartitionType.PIPELINED_APPROXIMATE`, so that we can get rid of the > hacky attribute `reconnectable` introduced in FLINK-19693. > > In short: > *The current behavior of PIPELINED(_BOUNDED) is* -> > Release the partition as soon as the consumer exits > Release the partition as soon as the producer fails/is canceled > *The current behavior of PIPELINED_APPROXIMATE is* -> > Release the partition as soon as the producer fails/is canceled > Release the partition when the job exits > *Unify the Pipelined family to* -> > Release the partition when the producer exits. > > One more question: > *whether we can unify the Blocking + Pipelined families to* -> > Release the partition when the producer fails/is canceled > Release the partition when the job exits -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20246) Add documentation on Python worker memory tuning in the memory tuning page
[ https://issues.apache.org/jira/browse/FLINK-20246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-20246: - Fix Version/s: (was: 1.14.0) 1.15.0 > Add documentation on Python worker memory tuning in the memory tuning page > -- > > Key: FLINK-20246 > URL: https://issues.apache.org/jira/browse/FLINK-20246 > Project: Flink > Issue Type: Improvement > Components: API / Python, Documentation >Reporter: Dian Fu >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned > Fix For: 1.15.0 > > > Per the discussion in FLINK-19177, we need to add some documentation on > Python worker memory tuning in the memory tuning page. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-19845) Migrate all FileSystemFormatFactory implementations
[ https://issues.apache.org/jira/browse/FLINK-19845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-19845: - Fix Version/s: (was: 1.14.0) 1.15.0 > Migrate all FileSystemFormatFactory implementations > --- > > Key: FLINK-19845 > URL: https://issues.apache.org/jira/browse/FLINK-19845 > Project: Flink > Issue Type: Improvement > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) >Reporter: Jingsong Lee >Priority: Major > Fix For: 1.15.0 > > > We should use interfaces introduced by FLINK-19599 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20036) Join Has NoUniqueKey when using mini-batch
[ https://issues.apache.org/jira/browse/FLINK-20036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-20036: - Fix Version/s: (was: 1.14.0) 1.15.0 > Join Has NoUniqueKey when using mini-batch > -- > > Key: FLINK-20036 > URL: https://issues.apache.org/jira/browse/FLINK-20036 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Planner >Affects Versions: 1.11.2 >Reporter: Rex Remind >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.15.0 > > > Hello, > > We tried out mini-batch mode and our Join suddenly had NoUniqueKey. > Join: > {code:java} > Table membershipsTable = tableEnv.from(SOURCE_MEMBERSHIPS) > .renameColumns($("id").as("membership_id")) > .select($("*")).join(usersTable, $("user_id").isEqual($("id"))); > {code} > Mini-batch config: > {code:java} > configuration.setString("table.exec.mini-batch.enabled", "true"); // enable > mini-batch optimization > configuration.setString("table.exec.mini-batch.allow-latency", "5 s"); // use > 5 seconds to buffer input records > configuration.setString("table.exec.mini-batch.size", "5000"); // the maximum > number of records can be buffered by each aggregate operator task > {code} > > Join with mini-batch: > {code:java} > Join(joinType=[InnerJoin], where=[(user_id = id0)], select=[id, > group_id, user_id, uuid, owner, id0, deleted_at], > leftInputSpec=[NoUniqueKey], rightInputSpec=[NoUniqueKey]) > {code} > Join without mini-batch: > {code:java} > Join(joinType=[InnerJoin], where=[(user_id = id0)], select=[id, group_id, > user_id, uuid, owner, id0, deleted_at], leftInputSpec=[HasUniqueKey], > rightInputSpec=[JoinKeyContainsUniqueKey]) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-19965) Refactor HiveMapredSplitReader to adapt to the new hive source
[ https://issues.apache.org/jira/browse/FLINK-19965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-19965: - Fix Version/s: (was: 1.14.0) 1.15.0 > Refactor HiveMapredSplitReader to adapt to the new hive source > -- > > Key: FLINK-19965 > URL: https://issues.apache.org/jira/browse/FLINK-19965 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Hive >Reporter: Rui Li >Priority: Major > Fix For: 1.15.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20110) Support 'merge' method for first_value and last_value UDAF
[ https://issues.apache.org/jira/browse/FLINK-20110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-20110: - Fix Version/s: (was: 1.14.0) 1.15.0 > Support 'merge' method for first_value and last_value UDAF > -- > > Key: FLINK-20110 > URL: https://issues.apache.org/jira/browse/FLINK-20110 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Runtime >Affects Versions: 1.12.0 >Reporter: hailong wang >Priority: Major > Labels: auto-unassigned, pull-request-available > Fix For: 1.15.0 > > > From the user-zh email, when using the first_value function in a hop window, it > throws an exception because first_value does not implement the merge method. > We can support the 'merge' method for the first_value and last_value UDAFs. -- This message was sent by Atlassian Jira (v8.3.4#803005)
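The merge semantics the ticket asks for can be sketched in plain Java. This is an illustrative, self-contained sketch under assumed names (`Acc`, `mergeFirst`, `mergeLast`, an `order` field standing in for a rowtime or sequence number); the real Flink UDAF would extend `AggregateFunction` and operate on internal accumulator types.

```java
// Sketch of merging partial FIRST_VALUE / LAST_VALUE aggregates, as needed
// when overlapping hop-window panes are combined. Not actual Flink code.
public class FirstLastValueSketch {
    /** Partial aggregate: the tracked value plus the order it arrived with. */
    static final class Acc {
        final String value;
        final long order; // timestamp or sequence number of the tracked value
        Acc(String value, long order) { this.value = value; this.order = order; }
    }

    /** Merging two partial FIRST_VALUE aggregates keeps the earlier one. */
    static Acc mergeFirst(Acc a, Acc b) {
        return a.order <= b.order ? a : b;
    }

    /** Merging two partial LAST_VALUE aggregates keeps the later one. */
    static Acc mergeLast(Acc a, Acc b) {
        return a.order >= b.order ? a : b;
    }

    public static void main(String[] args) {
        Acc pane1 = new Acc("x", 10L); // partial result of one window pane
        Acc pane2 = new Acc("y", 20L); // partial result of an overlapping pane
        System.out.println(mergeFirst(pane1, pane2).value); // x
        System.out.println(mergeLast(pane1, pane2).value);  // y
    }
}
```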
[jira] [Updated] (FLINK-20190) A New Window Trigger that can trigger window operation both by event time interval and event count for DataStream API
[ https://issues.apache.org/jira/browse/FLINK-20190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-20190: - Fix Version/s: (was: 1.14.0) 1.15.0 > A New Window Trigger that can trigger window operation both by event time > interval and event count for DataStream API > - > > Key: FLINK-20190 > URL: https://issues.apache.org/jira/browse/FLINK-20190 > Project: Flink > Issue Type: New Feature > Components: API / DataStream >Reporter: GaryGao >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.15.0 > > > In production environments, when doing window operations such as window > aggregation with the DataStream API, developers are often asked to trigger the > window operation not only when the watermark passes the max timestamp of the > window, but also at a fixed event time interval and at a fixed count of > events. The reason is that we want frequently updated window results, rather > than waiting a long time until the watermark passes the max timestamp of the > window. This is very useful in reporting and other BI applications. > For now the default triggers provided by Flink cannot meet this > requirement, so I developed a new trigger, > CountAndContinuousEventTimeTrigger, which combines ContinuousEventTimeTrigger > with CountTrigger to do the above. > > To use CountAndContinuousEventTimeTrigger, you should specify two parameters, > as revealed in its constructor: > {code:java} > private CountAndContinuousEventTimeTrigger(Time interval, long > maxCount);{code} > * Time interval: the trigger continuously fires based on the given > time interval, the same as ContinuousEventTimeTrigger. > * long maxCount: the trigger fires once the count of elements > in a pane reaches the given count, the same as CountTrigger. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
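The combined firing decision described above can be sketched as a few lines of plain Java. This is a minimal illustration of the logic, not Flink's actual `Trigger` interface; the class and method names (`CombinedTriggerSketch`, `onElement`) are assumptions for the example.

```java
// Sketch of what CountAndContinuousEventTimeTrigger combines: fire early when
// the pane's element count reaches maxCount, OR when the watermark passes the
// next periodic interval timestamp. Plain Java, not actual Flink Trigger code.
public class CombinedTriggerSketch {
    private final long intervalMs; // continuous event-time firing interval
    private final long maxCount;   // element count that forces an early firing
    private long count = 0;
    private long nextFireTimestamp;

    public CombinedTriggerSketch(long intervalMs, long maxCount, long windowStart) {
        this.intervalMs = intervalMs;
        this.maxCount = maxCount;
        this.nextFireTimestamp = windowStart + intervalMs;
    }

    /** Returns true if the window should fire after one element at the given watermark. */
    public boolean onElement(long currentWatermark) {
        count++;
        if (count >= maxCount) {
            count = 0; // reset after an early firing, like CountTrigger
            return true;
        }
        if (currentWatermark >= nextFireTimestamp) {
            nextFireTimestamp += intervalMs; // schedule the next periodic firing
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        CombinedTriggerSketch trigger = new CombinedTriggerSketch(5000L, 3L, 0L);
        System.out.println(trigger.onElement(1000L)); // false
        System.out.println(trigger.onElement(2000L)); // false
        System.out.println(trigger.onElement(3000L)); // true: count reached 3
        System.out.println(trigger.onElement(6000L)); // true: watermark passed 5000
    }
}
```

The final window firing when the watermark passes the window's max timestamp would still be handled by the regular event-time path, which this sketch omits.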
[jira] [Updated] (FLINK-19527) Update SQL Pages
[ https://issues.apache.org/jira/browse/FLINK-19527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-19527: - Fix Version/s: (was: 1.14.0) 1.15.0 > Update SQL Pages > > > Key: FLINK-19527 > URL: https://issues.apache.org/jira/browse/FLINK-19527 > Project: Flink > Issue Type: Sub-task > Components: Documentation >Reporter: Seth Wiesman >Priority: Major > Labels: auto-unassigned, pull-request-available > Fix For: 1.15.0 > > > SQL > Goal: Show users the main features early and link to concepts if necessary. > How to use SQL? Intended for users with SQL knowledge. > Overview > Getting started with link to more detailed execution section. > Full Reference > Available operations in SQL as a table. This location allows us to further > split the page in the future if we think an operation needs more space > without affecting the top-level structure. > Data Definition > Explain special SQL syntax around DDL. > Pattern Matching > Make pattern matching more visible. > ... more features in the future -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20518) WebUI should escape characters in metric names
[ https://issues.apache.org/jira/browse/FLINK-20518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-20518: - Fix Version/s: (was: 1.14.0) 1.15.0 > WebUI should escape characters in metric names > -- > > Key: FLINK-20518 > URL: https://issues.apache.org/jira/browse/FLINK-20518 > Project: Flink > Issue Type: Improvement > Components: Runtime / Web Frontend >Affects Versions: 1.10.0 >Reporter: Chesnay Schepler >Assignee: tartarus >Priority: Minor > Labels: auto-deprioritized-major, pull-request-available, > stale-assigned > Fix For: 1.15.0 > > > Metric names can contain characters like {{+}} or {{?}} that should be > escaped when querying metrics. -- This message was sent by Atlassian Jira (v8.3.4#803005)
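The escaping the ticket asks for amounts to percent-encoding the metric name before it is placed in a metrics-query URL. A minimal JDK-only sketch (the WebUI itself would do the equivalent in TypeScript, e.g. with `encodeURIComponent`; the helper name here is an assumption):

```java
// Sketch: percent-encode metric names containing characters such as '+' or '?'
// so they survive being used as URL query values. JDK only, no Flink code.
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class MetricNameEscaping {
    static String escape(String metricName) {
        try {
            // URLEncoder encodes spaces as '+', so rewrite them to '%20'
            // for use in a URL path/query rather than a form body.
            return URLEncoder.encode(metricName, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        System.out.println(escape("numRecordsIn+lag?")); // prints numRecordsIn%2Blag%3F
    }
}
```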
[jira] [Updated] (FLINK-19967) Clean legacy classes for Parquet and Orc
[ https://issues.apache.org/jira/browse/FLINK-19967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-19967: - Fix Version/s: (was: 1.14.0) 1.15.0 > Clean legacy classes for Parquet and Orc > > > Key: FLINK-19967 > URL: https://issues.apache.org/jira/browse/FLINK-19967 > Project: Flink > Issue Type: Sub-task > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) >Reporter: Jingsong Lee >Priority: Major > Fix For: 1.15.0 > > > We have introduced new BulkFormats, so we can clean up the old inner classes. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-19881) Optimize temporal join with upsert-Source(upsert-kafka)
[ https://issues.apache.org/jira/browse/FLINK-19881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-19881: - Fix Version/s: (was: 1.14.0) 1.15.0 > Optimize temporal join with upsert-Source(upsert-kafka) > --- > > Key: FLINK-19881 > URL: https://issues.apache.org/jira/browse/FLINK-19881 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Runtime >Reporter: Leonard Xu >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.15.0 > > > Currently upsert-kafka performs normalization in a physical node named > `ChangelogNormalize`. The normalization deduplicates using state and > produces `UPDATE_AFTER` and `DELETE` changelog records. We do the same thing in the state of the > temporal join operator, so as an optimization we can merge the two into one if the > query contains a temporal join with an upsert-kafka source. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20280) Support batch mode for Python DataStream API
[ https://issues.apache.org/jira/browse/FLINK-20280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-20280: - Fix Version/s: (was: 1.14.0) 1.15.0 > Support batch mode for Python DataStream API > > > Key: FLINK-20280 > URL: https://issues.apache.org/jira/browse/FLINK-20280 > Project: Flink > Issue Type: New Feature > Components: API / Python >Reporter: Dian Fu >Priority: Major > Labels: auto-unassigned, pull-request-available > Fix For: 1.15.0 > > > Currently, it still doesn't support batch mode for the Python DataStream API. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20188) Add Documentation for new File Source
[ https://issues.apache.org/jira/browse/FLINK-20188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-20188: - Fix Version/s: (was: 1.14.0) 1.15.0 > Add Documentation for new File Source > - > > Key: FLINK-20188 > URL: https://issues.apache.org/jira/browse/FLINK-20188 > Project: Flink > Issue Type: New Feature > Components: Connectors / FileSystem, Documentation >Reporter: Stephan Ewen >Priority: Major > Fix For: 1.15.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20726) Introduce Pulsar connector
[ https://issues.apache.org/jira/browse/FLINK-20726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-20726: - Fix Version/s: (was: 1.14.0) 1.15.0 > Introduce Pulsar connector > -- > > Key: FLINK-20726 > URL: https://issues.apache.org/jira/browse/FLINK-20726 > Project: Flink > Issue Type: New Feature > Components: Connectors / Common >Affects Versions: 1.13.0 >Reporter: Jianyun Zhao >Priority: Major > Labels: auto-unassigned > Fix For: 1.15.0 > > > Pulsar is an important player in messaging middleware, and it is essential > for Flink to support Pulsar. > Our existing code is maintained at > [streamnative/pulsar-flink|https://github.com/streamnative/pulsar-flink] , > and next we will split it into several PRs and merge them back into the community. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20767) add nested field support for SupportsFilterPushDown
[ https://issues.apache.org/jira/browse/FLINK-20767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-20767: - Fix Version/s: (was: 1.14.0) 1.15.0 > add nested field support for SupportsFilterPushDown > --- > > Key: FLINK-20767 > URL: https://issues.apache.org/jira/browse/FLINK-20767 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Planner >Affects Versions: 1.12.0 >Reporter: Jun Zhang >Priority: Major > Fix For: 1.15.0 > > > I think we should add the nested field support for SupportsFilterPushDown -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20787) Improve the Table API to make it usable
[ https://issues.apache.org/jira/browse/FLINK-20787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-20787: - Fix Version/s: (was: 1.14.0) 1.15.0 > Improve the Table API to make it usable > --- > > Key: FLINK-20787 > URL: https://issues.apache.org/jira/browse/FLINK-20787 > Project: Flink > Issue Type: Improvement > Components: Table SQL / API >Reporter: Dian Fu >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.15.0 > > > Currently, there are quite a few bugs in the Table API which make it > difficult to use. Users will encounter all kinds of problems when using the Table > API and have to find various workarounds from time to time. This is an > umbrella JIRA for all the issues specific to the Table API, aiming to make > the Table API smooth to use. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20154) Improve error messages when using CLI with wrong target
[ https://issues.apache.org/jira/browse/FLINK-20154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-20154: - Fix Version/s: (was: 1.14.0) 1.15.0 > Improve error messages when using CLI with wrong target > --- > > Key: FLINK-20154 > URL: https://issues.apache.org/jira/browse/FLINK-20154 > Project: Flink > Issue Type: Improvement > Components: Command Line Client, Documentation >Affects Versions: 1.11.0, 1.12.0 >Reporter: Till Rohrmann >Priority: Minor > Labels: auto-deprioritized-major, usability > Fix For: 1.15.0 > > > According to the [CLI > documentation|https://ci.apache.org/projects/flink/flink-docs-stable/ops/cli.html#job-submission-examples] > one can use the CLI with the following {{--target}} values: "remote", > "local", "kubernetes-session", "yarn-per-job", "yarn-session", > "yarn-application" and "kubernetes-application". However, when running the > following commands: > {{bin/flink run -t yarn-session -p 1 examples/streaming/WindowJoin.jar}} I > get the following exception: > {code} > org.apache.flink.client.program.ProgramInvocationException: The main method > caused an error: null > at > org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:330) > at > org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:198) > at > org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:114) > at > org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:743) > at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:242) > at > org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:971) > at > org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1047) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1754) > at > 
org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41) > at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1047) > Caused by: java.lang.IllegalStateException > at > org.apache.flink.util.Preconditions.checkState(Preconditions.java:182) > at > org.apache.flink.client.deployment.executors.AbstractSessionClusterExecutor.execute(AbstractSessionClusterExecutor.java:63) > at > org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.executeAsync(StreamExecutionEnvironment.java:1917) > at > org.apache.flink.client.program.StreamContextEnvironment.executeAsync(StreamContextEnvironment.java:128) > at > org.apache.flink.client.program.StreamContextEnvironment.execute(StreamContextEnvironment.java:76) > at > org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1799) > at > org.apache.flink.streaming.examples.join.WindowJoin.main(WindowJoin.java:86) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:316) > ... 11 more > {code} > Similarly when running the command {{bin/flink run -t yarn-application -p 1 > examples/streaming/WindowJoin.jar}} I get the exception: > {code} > org.apache.flink.client.program.ProgramInvocationException: The main method > caused an error: No ExecutorFactory found to execute the application. 
> at > org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:330) > at > org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:198) > at > org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:114) > at > org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:743) > at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:242) > at > org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:971) > at > org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1047) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1754) >
[jira] [Updated] (FLINK-20656) Update docs for new KafkaSource connector.
[ https://issues.apache.org/jira/browse/FLINK-20656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-20656: - Fix Version/s: (was: 1.14.0) 1.15.0 > Update docs for new KafkaSource connector. > -- > > Key: FLINK-20656 > URL: https://issues.apache.org/jira/browse/FLINK-20656 > Project: Flink > Issue Type: New Feature > Components: Connectors / Kafka >Affects Versions: 1.12.0 >Reporter: Jiangjie Qin >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned > Fix For: 1.12.6, 1.15.0 > > > We need to add docs for the KafkaSource connector. Namely the following page: > https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/connectors/kafka.html -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20794) Support to select distinct columns in the Table API
[ https://issues.apache.org/jira/browse/FLINK-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-20794: - Fix Version/s: (was: 1.14.0) 1.15.0 > Support to select distinct columns in the Table API > --- > > Key: FLINK-20794 > URL: https://issues.apache.org/jira/browse/FLINK-20794 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Reporter: Dian Fu >Priority: Major > Fix For: 1.15.0 > > > Currently, there is no corresponding functionality in the Table API for the > following SQL: > {code:java} > SELECT DISTINCT users FROM Orders > {code} > For example, for the following job: > {code:java} > table.select("distinct a") > {code} > It will throw the following exception: > {code:java} > org.apache.flink.table.api.ExpressionParserException: Could not parse > expression at column 10: ',' expected but 'a' > foundorg.apache.flink.table.api.ExpressionParserException: Could not parse > expression at column 10: ',' expected but 'a' founddistinct a ^ > at > org.apache.flink.table.expressions.PlannerExpressionParserImpl$.throwError(PlannerExpressionParserImpl.scala:726) > at > org.apache.flink.table.expressions.PlannerExpressionParserImpl$.parseExpressionList(PlannerExpressionParserImpl.scala:710) > at > org.apache.flink.table.expressions.PlannerExpressionParserImpl.parseExpressionList(PlannerExpressionParserImpl.scala:47) > at > org.apache.flink.table.expressions.ExpressionParser.parseExpressionList(ExpressionParser.java:40) > at > org.apache.flink.table.api.internal.TableImpl.select(TableImpl.java:121){code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20572) HiveCatalog should be a standalone module
[ https://issues.apache.org/jira/browse/FLINK-20572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-20572: - Fix Version/s: (was: 1.14.0) 1.15.0 > HiveCatalog should be a standalone module > - > > Key: FLINK-20572 > URL: https://issues.apache.org/jira/browse/FLINK-20572 > Project: Flink > Issue Type: Improvement > Components: Connectors / Hive >Reporter: Rui Li >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.15.0 > > > Currently HiveCatalog is the only implementation that supports persistent > metadata. It's possible that users just want to use HiveCatalog to manage > metadata, and doesn't intend to read/write Hive tables. However HiveCatalog > is part of Hive connector which requires lots of Hive dependencies, and > introducing these dependencies increases the chance of lib conflicts. We > should investigate whether we can move HiveCatalog to a light-weight > standalone module. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20849) Improve JavaDoc and logging of new KafkaSource
[ https://issues.apache.org/jira/browse/FLINK-20849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-20849: - Fix Version/s: (was: 1.14.0) 1.15.0 > Improve JavaDoc and logging of new KafkaSource > -- > > Key: FLINK-20849 > URL: https://issues.apache.org/jira/browse/FLINK-20849 > Project: Flink > Issue Type: Improvement > Components: Connectors / Kafka >Reporter: Qingsheng Ren >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.12.6, 1.15.0 > > > Some JavaDoc and logging messages of the new KafkaSource should be more > descriptive to provide more information to users. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20578) Cannot create empty array using ARRAY[]
[ https://issues.apache.org/jira/browse/FLINK-20578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-20578: - Fix Version/s: (was: 1.14.0) 1.15.0 > Cannot create empty array using ARRAY[] > --- > > Key: FLINK-20578 > URL: https://issues.apache.org/jira/browse/FLINK-20578 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Affects Versions: 1.11.2 >Reporter: Fabian Hueske >Priority: Major > Labels: starter > Fix For: 1.15.0 > > > Calling the ARRAY function without an element (`ARRAY[]`) results in an error > message. > Is that the expected behavior? > How can users create empty arrays? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20834) Add metrics for reporting the memory usage and CPU usage of the Python UDF workers
[ https://issues.apache.org/jira/browse/FLINK-20834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-20834: - Fix Version/s: (was: 1.14.0) 1.15.0 > Add metrics for reporting the memory usage and CPU usage of the Python UDF > workers > -- > > Key: FLINK-20834 > URL: https://issues.apache.org/jira/browse/FLINK-20834 > Project: Flink > Issue Type: Improvement > Components: API / Python >Reporter: Wei Zhong >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.15.0 > > > Currently there is no official approach to access the memory usage and CPU > usage of the Python UDF workers. We need to add these metrics to monitor the > running status of the Python processes. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20687) Missing 'yarn-application' target in CLI help message
[ https://issues.apache.org/jira/browse/FLINK-20687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-20687: - Fix Version/s: (was: 1.14.0) 1.15.0 > Missing 'yarn-application' target in CLI help message > - > > Key: FLINK-20687 > URL: https://issues.apache.org/jira/browse/FLINK-20687 > Project: Flink > Issue Type: Improvement > Components: Command Line Client >Affects Versions: 1.12.0 >Reporter: Ruguo Yu >Priority: Minor > Labels: pull-request-available > Fix For: 1.12.6, 1.15.0 > > Attachments: image-2020-12-20-21-48-18-391.png, > image-2020-12-20-22-02-01-312.png > > > The 'yarn-application' target is missing in the CLI help message when I enter the command > 'flink run-application -h', as follows: > !image-2020-12-20-21-48-18-391.png|width=516,height=372! > The target name is obtained through SPI, and I checked that the SPI > META-INF/services is correct. > > Next, when I put flink-shaded-hadoop-*-.jar into the Flink lib directory or set > HADOOP_CLASSPATH, it does show 'yarn-application', as follows: > !image-2020-12-20-22-02-01-312.png|width=808,height=507! > However, I think it is reasonable to show 'yarn-application' without any such > action. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20895) Support LocalAggregatePushDown in Blink planner
[ https://issues.apache.org/jira/browse/FLINK-20895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-20895: - Fix Version/s: (was: 1.14.0) 1.15.0 > Support LocalAggregatePushDown in Blink planner > --- > > Key: FLINK-20895 > URL: https://issues.apache.org/jira/browse/FLINK-20895 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Planner >Reporter: Sebastian Liu >Priority: Major > Labels: auto-unassigned, pull-request-available > Fix For: 1.15.0 > > > Will add related rule to support LocalAggregatePushDown in Blink planner -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20788) It doesn't support to use cube/rollup/grouping sets in the Table API
[ https://issues.apache.org/jira/browse/FLINK-20788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-20788: - Fix Version/s: (was: 1.14.0) 1.15.0 > It doesn't support to use cube/rollup/grouping sets in the Table API > > > Key: FLINK-20788 > URL: https://issues.apache.org/jira/browse/FLINK-20788 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Reporter: Dian Fu >Priority: Major > Fix For: 1.15.0 > > > Currently, it doesn't support to use cube/rollup/grouping sets in the Table > API. For the following job: > {code} > table.groupBy("cube(a, b)") > {code} > It will throw the following exception: > {code} > org.apache.flink.table.api.ValidationException: Undefined function: cube > at > org.apache.flink.table.expressions.resolver.LookupCallResolver.lambda$visit$0(LookupCallResolver.java:49) > at java.util.Optional.orElseThrow(Optional.java:290) > at > org.apache.flink.table.expressions.resolver.LookupCallResolver.visit(LookupCallResolver.java:49) > at > org.apache.flink.table.expressions.resolver.LookupCallResolver.visit(LookupCallResolver.java:38) > at > org.apache.flink.table.expressions.ApiExpressionVisitor.visit(ApiExpressionVisitor.java:37) > at > org.apache.flink.table.expressions.LookupCallExpression.accept(LookupCallExpression.java:65) > at > org.apache.flink.table.expressions.resolver.rules.LookupCallByNameRule.lambda$apply$0(LookupCallByNameRule.java:38) > at > java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) > at > java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1374) > at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) > at > java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) > at > java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) > at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) > at > 
java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499) > at > org.apache.flink.table.expressions.resolver.rules.LookupCallByNameRule.apply(LookupCallByNameRule.java:38) > at > org.apache.flink.table.expressions.resolver.ExpressionResolver.lambda$null$1(ExpressionResolver.java:211) > at java.util.function.Function.lambda$andThen$1(Function.java:88) > at java.util.function.Function.lambda$andThen$1(Function.java:88) > at java.util.function.Function.lambda$andThen$1(Function.java:88) > at java.util.function.Function.lambda$andThen$1(Function.java:88) > at java.util.function.Function.lambda$andThen$1(Function.java:88) > at java.util.function.Function.lambda$andThen$1(Function.java:88) > at java.util.function.Function.lambda$andThen$1(Function.java:88) > at > org.apache.flink.table.expressions.resolver.ExpressionResolver.resolve(ExpressionResolver.java:178) > at > org.apache.flink.table.operations.utils.OperationTreeBuilder.aggregate(OperationTreeBuilder.java:236) > at > org.apache.flink.table.api.internal.TableImpl$GroupedTableImpl.select(TableImpl.java:632) > at > org.apache.flink.table.api.internal.TableImpl$GroupedTableImpl.select(TableImpl.java:615) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20484) Improve hive temporal table exception
[ https://issues.apache.org/jira/browse/FLINK-20484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-20484: - Fix Version/s: (was: 1.14.0) 1.15.0 > Improve hive temporal table exception > -- > > Key: FLINK-20484 > URL: https://issues.apache.org/jira/browse/FLINK-20484 > Project: Flink > Issue Type: Improvement > Components: Connectors / Hive >Affects Versions: 1.12.0 >Reporter: Leonard Xu >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned > Fix For: 1.15.0 > > Attachments: image-2020-12-04-16-42-28-399.png > > > Users need to set the following options when using the latest Hive partition as a temporal > table: > {code:java} > 'streaming-source.enable'='true', > 'streaming-source.partition.include' = 'latest', > {code} > If users miss the option `streaming-source.partition.include`, the Hive > table becomes an unbounded table. Currently an unbounded table can only be used > as a versioned table in a temporal join, so the framework requires a primary key and > watermark, which is hard for users to understand. > !image-2020-12-04-16-42-28-399.png! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20865) Prevent potential resource deadlock in fine-grained resource management
[ https://issues.apache.org/jira/browse/FLINK-20865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-20865: - Fix Version/s: (was: 1.14.0) 1.15.0 > Prevent potential resource deadlock in fine-grained resource management > --- > > Key: FLINK-20865 > URL: https://issues.apache.org/jira/browse/FLINK-20865 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Reporter: Yangze Guo >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.15.0 > > Attachments: 屏幕快照 2021-01-06 下午2.32.57.png > > > !屏幕快照 2021-01-06 下午2.32.57.png|width=954,height=288! > The above figure demonstrates a potential case of deadlock due to scheduling > dependency. For the given topology, initially the scheduler will request 4 > slots, for A, B, C and D. Assuming only 2 slots are available, if both slots > are assigned to Pipeline Region 0 (as shown on the left), A and B will first > finish execution, then C and D will be executed, and finally E will be > executed. However, if in the beginning the 2 slots are assigned to A and C > (as shown on the right), then neither of A and C can finish execution due to > missing B and D consuming the data they produced. > Currently, with coarse-grained resource management, the scheduler guarantees > to always finish fulfilling requirements of one pipeline region before > starting to fulfill requirements of another. That means the deadlock case > shown on the right of the above figure can never happen. > However, there’s no such guarantee in fine-grained resource management. Since > resource requirements for SSGs can be different, there’s no control on which > requirements will be fulfilled first, when there’s not enough resources to > fulfill all the requirements. Therefore, it’s not always possible to fulfill > one pipeline region prior to another. 
> To solve this problem, for fine-grained resource management, we can make the > scheduler defer requesting slots for other SSGs until the requirements of the > current SSG are fulfilled, at the price of longer scheduling time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20896) Support SupportsAggregatePushDown for JDBC TableSource
[ https://issues.apache.org/jira/browse/FLINK-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-20896: - Fix Version/s: (was: 1.14.0) 1.15.0 > Support SupportsAggregatePushDown for JDBC TableSource > -- > > Key: FLINK-20896 > URL: https://issues.apache.org/jira/browse/FLINK-20896 > Project: Flink > Issue Type: Sub-task > Components: Connectors / JDBC, Table SQL / Ecosystem >Reporter: Sebastian Liu >Priority: Major > Labels: auto-unassigned > Fix For: 1.15.0 > > > Will add SupportsAggregatePushDown implementation for JDBC TableSource. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20955) Refactor HBase Source in accordance with FLIP-27
[ https://issues.apache.org/jira/browse/FLINK-20955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-20955: - Fix Version/s: (was: 1.14.0) 1.15.0 > Refactor HBase Source in accordance with FLIP-27 > > > Key: FLINK-20955 > URL: https://issues.apache.org/jira/browse/FLINK-20955 > Project: Flink > Issue Type: Improvement > Components: Connectors / HBase, Table SQL / Ecosystem >Reporter: Moritz Manner >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned, > pull-request-available > Fix For: 1.15.0 > > > The HBase connector source implementation should be updated in accordance > with [FLIP-27: Refactor Source > Interface|https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface]. > One source should map to one table in HBase. Users can specify which > column[families] to watch; each change in one of the columns triggers a > record with change type, table, column family, column, value, and timestamp. > h3. Idea > The new Flink HBase Source makes use of the internal [replication mechanism > of HBase|https://hbase.apache.org/book.html#_cluster_replication]. The Source > registers with the HBase cluster and will receive all WAL edits written in > HBase. From those WAL edits the Source can create the DataStream. > h3. Split > We're still not 100% sure which information a Split should contain. We have > the following possibilities: > # There is only one Split per Source and the Split contains all the > necessary information to connect with HBase. The SourceReader which processes > the Split will receive all WAL edits for all tables and filter the relevant > edits. > # There are multiple Splits per Source, each Split representing one HBase > Region to read from. This assumes that it is possible to receive WAL > edits only from a specific HBase Region, rather than all WAL edits. 
This would > be preferable as it allows parallel processing of multiple regions, but we > still need to figure out how this is possible. > In both cases the Split will contain information about the HBase instance and > table. > h3. Split Enumerator > Depending on which Split we'll decide on, the split enumerator will connect > to HBase and get all relevant regions or just create one Split. -- This message was sent by Atlassian Jira (v8.3.4#803005)
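Whichever variant is chosen, the description above implies a split that always carries connection and table information, plus an optional region for the per-region variant. A hypothetical sketch in plain Java (the class and field names are illustrative, not from any actual Flink or HBase API):

```java
import java.util.Optional;

// Hypothetical FLIP-27 split for the HBase source: always identifies the
// cluster and table; the region is present only in the per-region variant.
final class HBaseSourceSplitSketch {
    private final String clusterKey; // e.g. the ZooKeeper quorum address
    private final String table;
    private final String region;     // null => one-split-per-source variant

    HBaseSourceSplitSketch(String clusterKey, String table, String region) {
        this.clusterKey = clusterKey;
        this.table = table;
        this.region = region;
    }

    Optional<String> region() {
        return Optional.ofNullable(region);
    }

    // FLIP-27 splits need a stable id for checkpointing and assignment.
    String splitId() {
        return region == null
                ? clusterKey + "#" + table
                : clusterKey + "#" + table + "#" + region;
    }
}
```

Under this sketch, the enumerator for the per-region variant would create one split per region, while the single-split variant would create exactly one split with a null region.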
[jira] [Updated] (FLINK-21559) Python DataStreamTests::test_process_function failed on AZP
[ https://issues.apache.org/jira/browse/FLINK-21559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-21559: - Fix Version/s: (was: 1.14.0) 1.15.0 > Python DataStreamTests::test_process_function failed on AZP > --- > > Key: FLINK-21559 > URL: https://issues.apache.org/jira/browse/FLINK-21559 > Project: Flink > Issue Type: Improvement > Components: API / Python >Affects Versions: 1.13.0 >Reporter: Till Rohrmann >Priority: Minor > Labels: auto-deprioritized-critical, auto-deprioritized-major, > test-stability > Fix For: 1.15.0 > > > The Python test case {{DataStreamTests::test_process_function}} failed on AZP. > {code} > === short test summary info > > FAILED > pyflink/datastream/tests/test_data_stream.py::DataStreamTests::test_process_function > = 1 failed, 705 passed, 22 skipped, 303 warnings in 583.39s (0:09:43) > == > ERROR: InvocationError for command /__w/3/s/flink-python/.tox/py38/bin/pytest > --durations=20 (exited with code 1) > {code} > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=13992=logs=9cada3cb-c1d3-5621-16da-0f718fb86602=8d78fe4f-d658-5c70-12f8-4921589024c3 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-21634) ALTER TABLE statement enhancement
[ https://issues.apache.org/jira/browse/FLINK-21634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-21634: - Fix Version/s: (was: 1.14.0) 1.15.0 > ALTER TABLE statement enhancement > - > > Key: FLINK-21634 > URL: https://issues.apache.org/jira/browse/FLINK-21634 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API, Table SQL / Client >Reporter: Jark Wu >Assignee: Jark Wu >Priority: Major > Labels: auto-unassigned, stale-assigned > Fix For: 1.15.0 > > > We already introduced ALTER TABLE statement in FLIP-69 [1], but only support > to rename table name and change table options. One useful feature of ALTER > TABLE statement is modifying schema. This is also heavily required by > integration with data lakes (e.g. iceberg). > Therefore, I propose to support following ALTER TABLE statements (except > {{SET}} and {{RENAME TO}}, others are all new introduced syntax): > {code:sql} > ALTER TABLE table_name { > ADD { | ( [, ...]) } > | MODIFY { | ( [, ...]) } > | DROP {column_name | (column_name, column_name, ) | PRIMARY KEY | > CONSTRAINT constraint_name | WATERMARK} > | RENAME old_column_name TO new_column_name > | RENAME TO new_table_name > | SET (key1=val1, ...) > | RESET (key1, ...) > } > :: > { | | } > :: > column_name [FIRST | AFTER column_name] > :: > [CONSTRAINT constraint_name] PRIMARY KEY (column_name, ...) 
NOT ENFORCED > :: > WATERMARK FOR rowtime_column_name AS watermark_strategy_expression > :: > { | | > } [COMMENT column_comment] > :: > column_type > :: > column_type METADATA [ FROM metadata_key ] [ VIRTUAL ] > :: > AS computed_column_expression > {code} > And some examples: > {code:sql} > -- add a new column > ALTER TABLE mytable ADD new_column STRING COMMENT 'new_column docs'; > -- add columns, constraint, and watermark > ALTER TABLE mytable ADD ( > log_ts STRING COMMENT 'log timestamp string' FIRST, > ts AS TO_TIMESTAMP(log_ts) AFTER log_ts, > PRIMARY KEY (id) NOT ENFORCED, > WATERMARK FOR ts AS ts - INTERVAL '3' SECOND > ); > -- modify a column type > ALTER TABLE prod.db.sample MODIFY measurement double COMMENT 'unit is bytes > per second' AFTER `id`; > -- modify definitions of columns log_ts and ts, primary key, and watermark; they > must exist in the table schema > ALTER TABLE mytable MODIFY ( > log_ts STRING COMMENT 'log timestamp string' AFTER `id`, -- reorder > columns > ts AS TO_TIMESTAMP(log_ts) AFTER log_ts, > PRIMARY KEY (id) NOT ENFORCED, > WATERMARK FOR ts AS ts - INTERVAL '3' SECOND > ); > -- drop an old column > ALTER TABLE prod.db.sample DROP measurement; > -- drop columns > ALTER TABLE prod.db.sample DROP (col1, col2, col3); > -- drop a watermark > ALTER TABLE prod.db.sample DROP WATERMARK; > -- rename column name > ALTER TABLE prod.db.sample RENAME `data` TO payload; > -- rename table name > ALTER TABLE mytable RENAME TO mytable2; > -- set options > ALTER TABLE kafka_table SET ( > 'scan.startup.mode' = 'specific-offsets', > 'scan.startup.specific-offsets' = 'partition:0,offset:42' > ); > -- reset options > ALTER TABLE kafka_table RESET ('scan.startup.mode', > 'scan.startup.specific-offsets'); > {code} > Note: we don't need to introduce new interfaces, because all the ALTER TABLE > operations will be forwarded to the catalog through {{Catalog#alterTable(tablePath, > newTable, ignoreIfNotExists)}}. 
> [1]: > https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/table/sql/alter/#alter-table > [2]: http://iceberg.apache.org/spark-ddl/#alter-table-alter-column > [3]: https://trino.io/docs/current/sql/alter-table.html > [4]: https://dev.mysql.com/doc/refman/8.0/en/alter-table.html > [5]: https://www.postgresql.org/docs/9.1/sql-altertable.html -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20920) Document how to connect to kerberized HMS
[ https://issues.apache.org/jira/browse/FLINK-20920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-20920: - Fix Version/s: (was: 1.14.0) 1.15.0 > Document how to connect to kerberized HMS > - > > Key: FLINK-20920 > URL: https://issues.apache.org/jira/browse/FLINK-20920 > Project: Flink > Issue Type: Improvement > Components: Connectors / Hive, Documentation >Reporter: Rui Li >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned > Fix For: 1.15.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20946) Optimize Python ValueState Implementation In PyFlink
[ https://issues.apache.org/jira/browse/FLINK-20946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-20946: - Fix Version/s: (was: 1.14.0) 1.15.0 > Optimize Python ValueState Implementation In PyFlink > > > Key: FLINK-20946 > URL: https://issues.apache.org/jira/browse/FLINK-20946 > Project: Flink > Issue Type: Improvement > Components: API / Python >Affects Versions: 1.13.0 >Reporter: Huang Xingbo >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned, > pull-request-available > Fix For: 1.15.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20952) Changelog json formats should support inherit options from JSON format
[ https://issues.apache.org/jira/browse/FLINK-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-20952: - Fix Version/s: (was: 1.14.0) 1.15.0 > Changelog json formats should support inherit options from JSON format > -- > > Key: FLINK-20952 > URL: https://issues.apache.org/jira/browse/FLINK-20952 > Project: Flink > Issue Type: Improvement > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table > SQL / Ecosystem >Reporter: Jark Wu >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned, > pull-request-available > Fix For: 1.15.0 > > > Recently, we introduced several config options for the json format, e.g. in > FLINK-20861. This reveals a potential problem: adding a small config option > to json may require touching the debezium-json, canal-json, and maxwell-json formats. > This is verbose and error-prone. We need an abstraction mechanism to support > reusable options. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-21408) Clarify which DataStream sources support Batch execution
[ https://issues.apache.org/jira/browse/FLINK-21408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-21408: - Fix Version/s: (was: 1.14.0) 1.15.0 > Clarify which DataStream sources support Batch execution > > > Key: FLINK-21408 > URL: https://issues.apache.org/jira/browse/FLINK-21408 > Project: Flink > Issue Type: Improvement > Components: API / DataStream, Documentation >Reporter: Chesnay Schepler >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.15.0 > > > The DataStream "Execution Mode" documentation goes to great lengths to > describe the differences between the modes and their impact on various aspects of > Flink like checkpointing. > However, the topic of connectors, and specifically which of them support Batch > mode, or whether there even are any that don't, is not mentioned at all. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20853) Add reader schema null check for AvroDeserializationSchema when recordClazz is GenericRecord
[ https://issues.apache.org/jira/browse/FLINK-20853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-20853: - Fix Version/s: (was: 1.14.0) 1.15.0 > Add reader schema null check for AvroDeserializationSchema when recordClazz > is GenericRecord > - > > Key: FLINK-20853 > URL: https://issues.apache.org/jira/browse/FLINK-20853 > Project: Flink > Issue Type: Improvement > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) >Affects Versions: 1.12.0 >Reporter: hailong wang >Priority: Minor > Fix For: 1.15.0 > > > The reader schema must not be null when recordClazz is GenericRecord. > Although the constructor has default (package-private) visibility, a null reader schema > combined with a GenericRecord recordClazz causes an NPE in classes that extend it, such as > RegistryAvroDeserializationSchema. -- This message was sent by Atlassian Jira (v8.3.4#803005)
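The guard proposed in FLINK-20853 above could be as simple as a constructor-time null check. The sketch below uses plain Java with stand-in types (the real `AvroDeserializationSchema` constructor and Avro's `Schema` class are not reproduced here); the point is only to fail fast at construction instead of deferring the NPE to deserialization time:

```java
import java.util.Objects;

// Stand-in for Avro's GenericRecord; only the null-check logic matters here.
interface GenericRecord {}

class AvroDeserializationSchemaSketch<T> {
    private final Class<T> recordClazz;
    private final Object readerSchema; // stands in for org.apache.avro.Schema

    AvroDeserializationSchemaSketch(Class<T> recordClazz, Object readerSchema) {
        // Fail fast: a GenericRecord deserializer is unusable without a
        // reader schema, so reject it here rather than NPE-ing later.
        if (GenericRecord.class.isAssignableFrom(recordClazz)) {
            Objects.requireNonNull(readerSchema,
                "Reader schema must not be null when recordClazz is GenericRecord");
        }
        this.recordClazz = recordClazz;
        this.readerSchema = readerSchema;
    }

    // Helper for demonstrating the guard: does construction reject a null
    // reader schema for this record class?
    static boolean rejectsNullSchema(Class<?> clazz) {
        try {
            new AvroDeserializationSchemaSketch<>(clazz, null);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }
}
```

With this guard, subclasses that forward a null reader schema together with a GenericRecord record class fail immediately with a descriptive message.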
[jira] [Updated] (FLINK-21184) Support temporary table/function for hive connector
[ https://issues.apache.org/jira/browse/FLINK-21184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-21184: - Fix Version/s: (was: 1.14.0) 1.15.0 > Support temporary table/function for hive connector > --- > > Key: FLINK-21184 > URL: https://issues.apache.org/jira/browse/FLINK-21184 > Project: Flink > Issue Type: New Feature > Components: Connectors / Hive >Reporter: Rui Li >Priority: Major > Labels: auto-deprioritized-major, auto-unassigned > Fix For: 1.15.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-21179) Make sure that the open/close methods of the Python DataStream Function are not implemented when using in ReducingState and AggregatingState
[ https://issues.apache.org/jira/browse/FLINK-21179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-21179: - Fix Version/s: (was: 1.14.0) 1.15.0 > Make sure that the open/close methods of the Python DataStream Function are > not implemented when using in ReducingState and AggregatingState > > > Key: FLINK-21179 > URL: https://issues.apache.org/jira/browse/FLINK-21179 > Project: Flink > Issue Type: Sub-task > Components: API / Python >Reporter: Wei Zhong >Priority: Major > Fix For: 1.15.0 > > > As the ReducingState and AggregatingState only support non-rich functions, we > need to make sure that the open/close methods of the Python DataStream > Function are not implemented when using in ReducingState and AggregatingState. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-21053) Prevent potential RejectedExecutionExceptions in CheckpointCoordinator failing JM
[ https://issues.apache.org/jira/browse/FLINK-21053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-21053: - Fix Version/s: (was: 1.14.0) 1.15.0 > Prevent potential RejectedExecutionExceptions in CheckpointCoordinator > failing JM > - > > Key: FLINK-21053 > URL: https://issues.apache.org/jira/browse/FLINK-21053 > Project: Flink > Issue Type: Improvement > Components: Runtime / Checkpointing >Reporter: Roman Khachatryan >Priority: Minor > Labels: auto-unassigned > Fix For: 1.15.0 > > > In the past, there were multiple bugs caused by throwing/handling > RejectedExecutionException in CheckpointCoordinator (FLINK-18290, > FLINK-20992). > > And I think it's still possible, as there are many places where an executor is > passed to calls to CompletableFuture.xxxAsync while it can already be shut > down. > > In FLINK-20992 we discussed two approaches to fix this. > One approach is to check the executor state inside a synchronized block every > time it is used. > The second approach is to: > # Create executors inside CheckpointCoordinator (both io & timer thread > pools) > # Check isShutdown() in their RejectedExecution handlers (if yes and it's a > RejectedExecutionException then just log; otherwise delegate to > FatalExitExceptionHandler) > # (this will allow removing such RejectedExecutionException checks from the > coordinator code) > -- This message was sent by Atlassian Jira (v8.3.4#803005)
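The second approach in FLINK-21053 above can be sketched with plain `java.util.concurrent` types. The handler below is illustrative only (names hypothetical, not the actual CheckpointCoordinator code): it swallows rejections that happen after an orderly shutdown and treats any other rejection as fatal.

```java
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;

// Sketch of approach 2: ignore (just log) rejections caused by an orderly
// shutdown, and escalate any other rejection as a fatal error.
class ShutdownAwareRejectionHandler implements RejectedExecutionHandler {
    int ignoredAfterShutdown = 0; // stands in for a log statement

    @Override
    public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
        if (executor.isShutdown()) {
            // Expected during shutdown: the real code would only log here.
            ignoredAfterShutdown++;
        } else {
            // Unexpected: the real code would delegate to something like
            // FatalExitExceptionHandler; rethrowing suffices for the sketch.
            throw new RejectedExecutionException("unexpected rejection");
        }
    }
}
```

A task submitted after `shutdown()` then reaches this handler instead of throwing RejectedExecutionException into coordinator code, which is exactly what removes the scattered checks mentioned in the last bullet.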
[jira] [Updated] (FLINK-21375) Refactor HybridMemorySegment
[ https://issues.apache.org/jira/browse/FLINK-21375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-21375: - Fix Version/s: (was: 1.14.0) 1.15.0 > Refactor HybridMemorySegment > > > Key: FLINK-21375 > URL: https://issues.apache.org/jira/browse/FLINK-21375 > Project: Flink > Issue Type: Technical Debt > Components: Runtime / Coordination >Reporter: Xintong Song >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 1.15.0 > > > Per the discussion in [this PR|https://github.com/apache/flink/pull/14904], > we plan to refactor {{HybridMemorySegment}} as follows. > * Separate into memory type specific implementations: heap / direct / native > (unsafe) > * Remove {{wrap()}}, replacing with {{processAsByteBuffer()}} > * Remove native memory cleaner logic -- This message was sent by Atlassian Jira (v8.3.4#803005)
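The `wrap()`-to-`processAsByteBuffer()` part of the FLINK-21375 plan above can be illustrated with a toy heap segment (names hypothetical, not the actual MemorySegment API): instead of handing out a wrapping ByteBuffer, the segment applies a caller-supplied function to a bounded view, so the buffer never escapes the segment's control.

```java
import java.nio.ByteBuffer;
import java.util.function.Function;

// Toy heap-backed segment demonstrating the processAsByteBuffer() pattern.
final class HeapSegmentSketch {
    private final byte[] memory;

    HeapSegmentSketch(int size) {
        this.memory = new byte[size];
    }

    void putInt(int index, int value) {
        ByteBuffer.wrap(memory).putInt(index, value);
    }

    // Replacement pattern for wrap(): process the segment as a ByteBuffer
    // within a bounded scope and return only the computed result, so no
    // reference to the underlying memory leaks out.
    <T> T processAsByteBuffer(Function<ByteBuffer, T> processor) {
        return processor.apply(ByteBuffer.wrap(memory).asReadOnlyBuffer());
    }
}
```

The design point is scoping: callers can still read the segment as a ByteBuffer, but only inside the function they pass in, which makes memory-type-specific implementations easier to separate.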
[jira] [Updated] (FLINK-21022) flink-connector-es add onSuccess handler after bulk process for sync success data to other third party system for data consistency checking
[ https://issues.apache.org/jira/browse/FLINK-21022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-21022: - Fix Version/s: (was: 1.14.0) 1.15.0 > flink-connector-es add onSuccess handler after bulk process for sync success > data to other third party system for data consistency checking > --- > > Key: FLINK-21022 > URL: https://issues.apache.org/jira/browse/FLINK-21022 > Project: Flink > Issue Type: Improvement > Components: Connectors / ElasticSearch >Reporter: Zheng WEI >Priority: Minor > Labels: auto-deprioritized-major, pull-request-available > Fix For: 1.11.5, 1.15.0 > > > Add an onSuccess handler to flink-connector-es, invoked after a successful bulk > process, in order to sync the succeeded data to a third-party system for data > consistency checking. By default the onSuccess implementation is a no-op; users > can set their own onSuccess handler when needed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
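A minimal sketch of the hook proposed in FLINK-21022 above, with hypothetical names (the real flink-connector-elasticsearch API differs): the handler defaults to a no-op so existing users are unaffected, and a custom handler can mirror succeeded actions to a third-party system.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback invoked after each successful bulk request.
interface BulkSuccessHandler<T> {
    void onSuccess(List<T> succeededActions);

    // Default behaviour: do nothing, so the hook is strictly opt-in.
    static <T> BulkSuccessHandler<T> noOp() {
        return actions -> { };
    }
}

// Toy bulk processor that buffers actions and notifies the handler on a
// (simulated) successful flush.
class BulkProcessorSketch<T> {
    private final BulkSuccessHandler<T> handler;
    private final List<T> buffer = new ArrayList<>();

    BulkProcessorSketch(BulkSuccessHandler<T> handler) {
        this.handler = handler;
    }

    void add(T action) {
        buffer.add(action);
    }

    // After a successful bulk flush, hand the batch to the handler, e.g. to
    // sync it to a third-party system for consistency checking.
    void flushSuccessfully() {
        handler.onSuccess(new ArrayList<>(buffer));
        buffer.clear();
    }
}
```

A consistency checker would then just supply a handler that forwards each succeeded batch, while the default `noOp()` preserves today's behaviour.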