[jira] [Comment Edited] (FLINK-15863) Fix docs stating that savepoints are relocatable
[ https://issues.apache.org/jira/browse/FLINK-15863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17029625#comment-17029625 ] Bastien DINE edited comment on FLINK-15863 at 2/4/20 7:57 AM: -- Hello [~NicoK], maybe we can simplify it by just removing the reference to 1.3. Savepoints modified with the State Processor API can also reference other savepoints (as incremental checkpoints do with other checkpoints), and those references are absolute paths, so such savepoints are not relocatable either. Maybe something like: {code:java} Another important precondition is that all the savepoint data must be accessible from the new installation and reside under the same absolute path. Additional files can be referenced from inside the savepoint file (e.g. the output from state backend snapshots), and a savepoint can reference other savepoints (e.g. after modification with the State Processor API *link* ?). The savepoint data is referenced through absolute paths in the metadata file, and thus the savepoint is typically not relocatable using typical filesystem operations. {code} was (Author: bdine): Hello [~NicoK], Maybe we can just simply simplify, by just removing the reference of 1.3 Savepoint, with modification by State Processor API, can also reference other savepoint (as do incremental checkpoiting with checkpoint) And this is absolute path to, so this is non relocatable too Maybe something like ? {code:java} Another important precondition is that all the savepoint data must be accessible from the new installation and reside under the same absolute path. Additional files can be referenced from inside the savepoint file (e.g. the output from state backend snapshots) and savepoint can referenced others (e.g. modification with State Processor API *link* ?) The savepoint data are referenced through absolute path by meta data file and thus, the savepoint is typically not relocatable using typical filesystem operations. 
{code} > Fix docs stating that savepoints are relocatable > > > Key: FLINK-15863 > URL: https://issues.apache.org/jira/browse/FLINK-15863 > Project: Flink > Issue Type: Bug > Components: Documentation >Affects Versions: 1.9.2, 1.10.0 >Reporter: Nico Kruber >Priority: Blocker > Labels: usability > Fix For: 1.11.0, 1.10.1 > > > This section from > https://ci.apache.org/projects/flink/flink-docs-stable/ops/upgrading.html#preconditions > states that savepoints are relocatable which they are not yet (see > FLINK-5763). It should be fixed and/or removed; I'm unsure what change from > 1.3 it should actually reflect. > {quote}Another important precondition is that for savepoints taken before > Flink 1.3.x, all the savepoint data must be accessible from the new > installation and reside under the same absolute path. Before Flink 1.3.x, the > savepoint data is typically not self-contained in just the created savepoint > file. Additional files can be referenced from inside the savepoint file (e.g. > the output from state backend snapshots). Since Flink 1.3.x, this is no > longer a limitation; savepoints can be relocated using typical filesystem > operations..{quote} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-15863) Fix docs stating that savepoints are relocatable
[ https://issues.apache.org/jira/browse/FLINK-15863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17029625#comment-17029625 ] Bastien DINE commented on FLINK-15863: -- Hello [~NicoK], maybe we can simplify it by just removing the reference to 1.3. Savepoints modified with the State Processor API can also reference other savepoints (as incremental checkpoints do with other checkpoints), and those references are absolute paths, so such savepoints are not relocatable either. Maybe something like: {code:java} Another important precondition is that all the savepoint data must be accessible from the new installation and reside under the same absolute path. Additional files can be referenced from inside the savepoint file (e.g. the output from state backend snapshots), and a savepoint can reference other savepoints (e.g. after modification with the State Processor API *link* ?). The savepoint data is referenced through absolute paths in the metadata file, and thus the savepoint is typically not relocatable using typical filesystem operations. {code} > Fix docs stating that savepoints are relocatable > > > Key: FLINK-15863 > URL: https://issues.apache.org/jira/browse/FLINK-15863 > Project: Flink > Issue Type: Bug > Components: Documentation >Affects Versions: 1.9.2, 1.10.0 >Reporter: Nico Kruber >Priority: Blocker > Labels: usability > Fix For: 1.11.0, 1.10.1 > > > This section from > https://ci.apache.org/projects/flink/flink-docs-stable/ops/upgrading.html#preconditions > states that savepoints are relocatable which they are not yet (see > FLINK-5763). It should be fixed and/or removed; I'm unsure what change from > 1.3 it should actually reflect. > {quote}Another important precondition is that for savepoints taken before > Flink 1.3.x, all the savepoint data must be accessible from the new > installation and reside under the same absolute path. Before Flink 1.3.x, the > savepoint data is typically not self-contained in just the created savepoint > file. 
Additional files can be referenced from inside the savepoint file (e.g. > the output from state backend snapshots). Since Flink 1.3.x, this is no > longer a limitation; savepoints can be relocated using typical filesystem > operations..{quote}
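The non-relocatability argument above can be illustrated with a toy sketch. This is a hypothetical file layout, not Flink's actual savepoint format: a metadata file that records an absolute path keeps pointing at the old location after the directory is moved with a plain filesystem operation.

```java
import java.nio.file.Files;
import java.nio.file.Path;

public class RelocateExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical layout: a savepoint directory with one state file,
        // referenced by absolute path from a _metadata file.
        Path base = Files.createTempDirectory("sp");
        Path savepoint = Files.createDirectories(base.resolve("savepoint-abc123"));
        Path state = Files.createFile(savepoint.resolve("op-state-0001"));
        Files.writeString(savepoint.resolve("_metadata"), "ref: " + state.toAbsolutePath());

        // Relocate the whole directory with a plain move.
        Path moved = Files.move(savepoint, base.resolve("relocated"));

        // The metadata still references the old absolute location, which is gone.
        String ref = Files.readString(moved.resolve("_metadata")).substring("ref: ".length());
        System.out.println(Files.exists(Path.of(ref)) ? "reference intact" : "dangling reference");
        // prints "dangling reference"
    }
}
```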
[GitHub] [flink] GJL commented on a change in pull request #10997: [FLINK-15743][docs] Extend Flink 1.10 release notes
GJL commented on a change in pull request #10997: [FLINK-15743][docs] Extend Flink 1.10 release notes URL: https://github.com/apache/flink/pull/10997#discussion_r374517476 ## File path: docs/release-notes/flink-1.10.md ## @@ -176,6 +182,18 @@ The container cut-off configuration options, `containerized.heap-cutoff-ratio` and `containerized.heap-cutoff-min`, have no effect for task executor processes anymore but they still have the same semantics for the JobManager process. + RocksDB State Backend Memory Control ([FLINK-7289](https://issues.apache.org/jira/browse/FLINK-7289)) +Together with the introduction of the [new Task Executor Memory +Model](#new-task-executor-memory-model-flink-13980), the memory consumption of the RocksDB state backend will be Review comment: yes This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] leventov edited a comment on issue #11005: [FLINK-15838] Dangling CountDownLatch.await(timeout)
leventov edited a comment on issue #11005: [FLINK-15838] Dangling CountDownLatch.await(timeout) URL: https://github.com/apache/flink/pull/11005#issuecomment-581784038 It would be good to follow a systematic approach. What prevents introducing the same bugs in the future? In Apache Druid, we use IntelliJ IDEA inspections as a part of the CI process. @ccaominh created an image that could be integrated with Travis CI: https://github.com/ccaominh/intellij-inspect. See https://github.com/apache/druid/pull/9179 for details. At the IntelliJ inspections level, catching this issue is a matter of adding CountDownLatch.await to the configuration of the "Result of method call ignored" inspection and setting it to the ERROR severity level.
[GitHub] [flink] leventov commented on issue #11005: [FLINK-15838] Dangling CountDownLatch.await(timeout)
leventov commented on issue #11005: [FLINK-15838] Dangling CountDownLatch.await(timeout) URL: https://github.com/apache/flink/pull/11005#issuecomment-581784038 It would be good to have a systematic approach. What prevents introducing the same bugs in the future? In Apache Druid, we use IntelliJ IDEA inspections as a part of the CI process. @ccaominh created an image which could be integrated with Travis CI: https://github.com/ccaominh/intellij-inspect. See https://github.com/apache/druid/pull/9179 for details. At the IntelliJ inspections level, catching this issue is a matter of adding CountDownLatch.await to the configuration of the "Result of method call ignored" inspection and setting it to the ERROR severity level.
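The bug class discussed in this thread is easy to reproduce: `CountDownLatch.await(long, TimeUnit)` returns a `boolean` that is `false` when the timeout elapses, and dropping that result lets the caller continue as if the latch had opened. A minimal sketch of the unchecked versus checked pattern (the class name is illustrative, not from the PR):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class AwaitExample {
    public static void main(String[] args) throws InterruptedException {
        CountDownLatch latch = new CountDownLatch(1);
        new Thread(latch::countDown).start();

        // Dangling pattern: the boolean result is ignored, so a timeout
        // would pass silently and the code would proceed on stale state.
        // latch.await(5, TimeUnit.SECONDS);

        // Checked pattern: fail loudly when the latch does not open in time.
        if (!latch.await(5, TimeUnit.SECONDS)) {
            throw new IllegalStateException("latch not released within timeout");
        }
        System.out.println("latch released");
    }
}
```

This is exactly the call shape the "Result of method call ignored" inspection can flag once `CountDownLatch.await` is added to its method list.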
[GitHub] [flink] StephanEwen closed pull request #10982: [FLINK-15683][docs] Restructure Configuration page
StephanEwen closed pull request #10982: [FLINK-15683][docs] Restructure Configuration page URL: https://github.com/apache/flink/pull/10982
[GitHub] [flink] StephanEwen commented on issue #10982: [FLINK-15683][docs] Restructure Configuration page
StephanEwen commented on issue #10982: [FLINK-15683][docs] Restructure Configuration page URL: https://github.com/apache/flink/pull/10982#issuecomment-581781839 Merged in - `master` in d6a7b01d2424bdf81fce124948bd33ea05a6ce3b - `release-1.10` in 5729f237d2c3f01efc3ebc70d051bec09381fcc8
[GitHub] [flink] JingsongLi commented on a change in pull request #11012: [FLINK-15858][hive] Store generic table schema as properties
JingsongLi commented on a change in pull request #11012: [FLINK-15858][hive] Store generic table schema as properties URL: https://github.com/apache/flink/pull/11012#discussion_r374512995 ## File path: flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/HiveCatalog.java ## @@ -587,57 +599,67 @@ protected static Table instantiateHiveTable(ObjectPath tablePath, CatalogBaseTab // When creating a table, A hive table needs explicitly have a key is_generic = false // otherwise, this is a generic table if 1) the key is missing 2) is_generic = true // this is opposite to reading a table and instantiating a CatalogTable. See instantiateCatalogTable() + boolean isGeneric; if (!properties.containsKey(CatalogConfig.IS_GENERIC)) { - // must be a generic catalog + // must be a generic table + isGeneric = true; properties.put(CatalogConfig.IS_GENERIC, String.valueOf(true)); - properties = maskFlinkProperties(properties); } else { - boolean isGeneric = Boolean.valueOf(properties.get(CatalogConfig.IS_GENERIC)); - - if (isGeneric) { - properties = maskFlinkProperties(properties); - } + isGeneric = Boolean.parseBoolean(properties.get(CatalogConfig.IS_GENERIC)); } - // Table properties - hiveTable.setParameters(properties); - // Hive table's StorageDescriptor StorageDescriptor sd = hiveTable.getSd(); setStorageFormat(sd, properties); - List allColumns = HiveTableUtil.createHiveColumns(table.getSchema()); - - // Table columns and partition keys - if (table instanceof CatalogTableImpl) { - CatalogTable catalogTable = (CatalogTableImpl) table; - - if (catalogTable.isPartitioned()) { - int partitionKeySize = catalogTable.getPartitionKeys().size(); - List regularColumns = allColumns.subList(0, allColumns.size() - partitionKeySize); - List partitionColumns = allColumns.subList(allColumns.size() - partitionKeySize, allColumns.size()); + if (isGeneric) { + DescriptorProperties tableSchemaProps = new DescriptorProperties(true); + 
tableSchemaProps.putTableSchema(HiveCatalogConfig.GENERIC_TABLE_SCHEMA_PREFIX, table.getSchema()); + properties.putAll(tableSchemaProps.asMap()); + + if (table instanceof CatalogTableImpl) { + List partColNames = ((CatalogTableImpl) table).getPartitionKeys(); + if (!partColNames.isEmpty()) { + properties.put(HiveCatalogConfig.GENERIC_PART_COL_NAMES, String.join(",", partColNames)); Review comment: We don't need to support partitions for generic tables. And this serialization may not be right. We should wait for https://github.com/apache/flink/pull/10059
[GitHub] [flink] lirui-apache commented on a change in pull request #11012: [FLINK-15858][hive] Store generic table schema as properties
lirui-apache commented on a change in pull request #11012: [FLINK-15858][hive] Store generic table schema as properties URL: https://github.com/apache/flink/pull/11012#discussion_r374510511 ## File path: flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/HiveCatalogConfig.java ## @@ -32,4 +32,9 @@ // Partition related configs public static final String PARTITION_LOCATION = "partition.location"; + + // config prefix for the table schema of a generic table + public static final String GENERIC_TABLE_SCHEMA_PREFIX = "generic.table.schema"; Review comment: `CatalogConfig.FLINK_PROPERTY_PREFIX` will be added to all the Flink properties. So we'll have something like `flink.generic.table.schema.0...` and `flink.generic.table.schema.1...` in the final table properties.
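The key layout being discussed can be sketched with plain maps. This is illustrative only: the concrete key names are assumptions, and the real serialization comes from Flink's DescriptorProperties and the catalog's maskFlinkProperties step, neither of which is reproduced here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PrefixSketch {
    public static void main(String[] args) {
        // Hypothetical indexed schema properties, one group of keys per column.
        Map<String, String> schemaProps = new LinkedHashMap<>();
        schemaProps.put("generic.table.schema.0.name", "id");
        schemaProps.put("generic.table.schema.0.data-type", "INT");
        schemaProps.put("generic.table.schema.1.name", "name");
        schemaProps.put("generic.table.schema.1.data-type", "STRING");

        // Masking prepends the Flink prefix to every key before the properties
        // are handed to the Hive metastore.
        String flinkPrefix = "flink.";
        Map<String, String> masked = new LinkedHashMap<>();
        schemaProps.forEach((k, v) -> masked.put(flinkPrefix + k, v));

        // Every key now starts with "flink.generic.table.schema."
        masked.keySet().forEach(System.out::println);
    }
}
```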
[GitHub] [flink] flinkbot commented on issue #11012: [FLINK-15858][hive] Store generic table schema as properties
flinkbot commented on issue #11012: [FLINK-15858][hive] Store generic table schema as properties URL: https://github.com/apache/flink/pull/11012#issuecomment-581778343 ## CI report: * df658c89289171ef60b8939090fd53982ce46061 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] flinkbot commented on issue #11013: [hotfix][doc] Fix version settings when building PyFlink
flinkbot commented on issue #11013: [hotfix][doc] Fix version settings when building PyFlink URL: https://github.com/apache/flink/pull/11013#issuecomment-581778365 ## CI report: * 82ec948ab5c2131c5ee0698fd96a3dfbdbe0ee2a UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] flinkbot edited a comment on issue #11010: [FLINK-15836][k8s] Throw fatal exception when the pods watcher in Fabric8FlinkKubeClient is closed
flinkbot edited a comment on issue #11010: [FLINK-15836][k8s] Throw fatal exception when the pods watcher in Fabric8FlinkKubeClient is closed URL: https://github.com/apache/flink/pull/11010#issuecomment-581769412 ## CI report: * e6722c5f11f17384bc94de89095c46d605def5e8 Travis: [PENDING](https://travis-ci.com/flink-ci/flink/builds/147328987) Azure: [FAILURE](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=4807) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] flinkbot edited a comment on issue #11011: [FLINK-15658][table-planner-blink] Fix duplicate field names exception when join on the same key multiple times
flinkbot edited a comment on issue #11011: [FLINK-15658][table-planner-blink] Fix duplicate field names exception when join on the same key multiple times URL: https://github.com/apache/flink/pull/11011#issuecomment-581769441 ## CI report: * e0a75123cffb0bef0a2d155f52d793ee3f6804c6 Travis: [PENDING](https://travis-ci.com/flink-ci/flink/builds/147329001) Azure: [PENDING](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=4808) * 38e5a57f0b14176f4117662cb5e2bc151e30e241 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] flinkbot edited a comment on issue #11009: [Flink-15873] [cep] fix matches result not returned when existing earlier partial matches
flinkbot edited a comment on issue #11009: [Flink-15873] [cep] fix matches result not returned when existing earlier partial matches URL: https://github.com/apache/flink/pull/11009#issuecomment-581740648 ## CI report: * 67c1b7f08b7de55f3d039680b976246cab8cce07 Travis: [FAILURE](https://travis-ci.com/flink-ci/flink/builds/147320241) Azure: [FAILURE](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=4805) * 2352b23c725aefa8d4f51816b8c6432439a84ee5 Travis: [FAILURE](https://travis-ci.com/flink-ci/flink/builds/147326914) Azure: [FAILURE](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=4806) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build
[jira] [Updated] (FLINK-15877) Stop using deprecated methods from TableSource interface
[ https://issues.apache.org/jira/browse/FLINK-15877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Young updated FLINK-15877: --- Priority: Critical (was: Major) > Stop using deprecated methods from TableSource interface > > > Key: FLINK-15877 > URL: https://issues.apache.org/jira/browse/FLINK-15877 > Project: Flink > Issue Type: Improvement > Components: Table SQL / API >Reporter: Kurt Young >Priority: Critical > Fix For: 1.11.0 > > > This is an *umbrella* issue to track the cleanup of the current TableSource > interface. > Currently, methods like `getReturnType` and `getTableSchema` are already > deprecated, but still used by lots of code in various connectors and tests. > We should make sure no connector or test code uses these deprecated methods > anymore, except for backward-compatibility calls. > This prepares for further interface improvements.
[GitHub] [flink] JingsongLi commented on issue #11012: [FLINK-15858][hive] Store generic table schema as properties
JingsongLi commented on issue #11012: [FLINK-15858][hive] Store generic table schema as properties URL: https://github.com/apache/flink/pull/11012#issuecomment-581777662 Can a Hive table have no fields?
[jira] [Created] (FLINK-15895) Stop using TableSource::getReturnType except for compatibility purpose
Kurt Young created FLINK-15895: -- Summary: Stop using TableSource::getReturnType except for compatibility purpose Key: FLINK-15895 URL: https://issues.apache.org/jira/browse/FLINK-15895 Project: Flink Issue Type: Sub-task Components: Table SQL / API Reporter: Kurt Young Fix For: 1.11.0
[jira] [Created] (FLINK-15896) Stop using TableSource::getTableSchema
Kurt Young created FLINK-15896: -- Summary: Stop using TableSource::getTableSchema Key: FLINK-15896 URL: https://issues.apache.org/jira/browse/FLINK-15896 Project: Flink Issue Type: Sub-task Components: Table SQL / API Reporter: Kurt Young Fix For: 1.11.0
[jira] [Created] (FLINK-15894) Stop overriding TableSource::getTableSchema in flink walkthrough
Kurt Young created FLINK-15894: -- Summary: Stop overriding TableSource::getTableSchema in flink walkthrough Key: FLINK-15894 URL: https://issues.apache.org/jira/browse/FLINK-15894 Project: Flink Issue Type: Sub-task Components: Table SQL / API Reporter: Kurt Young Fix For: 1.11.0
[jira] [Created] (FLINK-15893) Stop overriding TableSource::getTableSchema in python table source
Kurt Young created FLINK-15893: -- Summary: Stop overriding TableSource::getTableSchema in python table source Key: FLINK-15893 URL: https://issues.apache.org/jira/browse/FLINK-15893 Project: Flink Issue Type: Sub-task Components: API / Python, Table SQL / API Reporter: Kurt Young Fix For: 1.11.0
[jira] [Created] (FLINK-15892) Stop overriding TableSource::getTableSchema in csv table source
Kurt Young created FLINK-15892: -- Summary: Stop overriding TableSource::getTableSchema in csv table source Key: FLINK-15892 URL: https://issues.apache.org/jira/browse/FLINK-15892 Project: Flink Issue Type: Sub-task Components: Table SQL / API Reporter: Kurt Young Assignee: Jingsong Lee Fix For: 1.11.0
[jira] [Created] (FLINK-15890) Stop overriding TableSource::getTableSchema in orc table source
Kurt Young created FLINK-15890: -- Summary: Stop overriding TableSource::getTableSchema in orc table source Key: FLINK-15890 URL: https://issues.apache.org/jira/browse/FLINK-15890 Project: Flink Issue Type: Sub-task Components: Connectors / ORC, Table SQL / API Reporter: Kurt Young Assignee: Jingsong Lee Fix For: 1.11.0
[jira] [Created] (FLINK-15891) Stop overriding TableSource::getTableSchema in parquet table source
Kurt Young created FLINK-15891: -- Summary: Stop overriding TableSource::getTableSchema in parquet table source Key: FLINK-15891 URL: https://issues.apache.org/jira/browse/FLINK-15891 Project: Flink Issue Type: Sub-task Components: Table SQL / API Reporter: Kurt Young Assignee: Jingsong Lee Fix For: 1.11.0
[jira] [Created] (FLINK-15889) Stop overriding TableSource::getTableSchema in jdbc connector
Kurt Young created FLINK-15889: -- Summary: Stop overriding TableSource::getTableSchema in jdbc connector Key: FLINK-15889 URL: https://issues.apache.org/jira/browse/FLINK-15889 Project: Flink Issue Type: Sub-task Components: Connectors / JDBC, Table SQL / API Reporter: Kurt Young Assignee: Zhenghua Gao Fix For: 1.11.0
[jira] [Created] (FLINK-15888) Stop overriding TableSource::getTableSchema in hbase
Kurt Young created FLINK-15888: -- Summary: Stop overriding TableSource::getTableSchema in hbase Key: FLINK-15888 URL: https://issues.apache.org/jira/browse/FLINK-15888 Project: Flink Issue Type: Sub-task Components: Connectors / HBase, Table SQL / API Reporter: Kurt Young Assignee: Leonard Xu Fix For: 1.11.0
[jira] [Created] (FLINK-15887) Stop overriding TableSource::getTableSchema in kafka
Kurt Young created FLINK-15887: -- Summary: Stop overriding TableSource::getTableSchema in kafka Key: FLINK-15887 URL: https://issues.apache.org/jira/browse/FLINK-15887 Project: Flink Issue Type: Sub-task Components: Connectors / Kafka, Table SQL / API Reporter: Kurt Young Assignee: Jark Wu Fix For: 1.11.0
[jira] [Created] (FLINK-15886) Stop overriding TableSource::getTableSchema in hive table source
Kurt Young created FLINK-15886: -- Summary: Stop overriding TableSource::getTableSchema in hive table source Key: FLINK-15886 URL: https://issues.apache.org/jira/browse/FLINK-15886 Project: Flink Issue Type: Sub-task Components: Connectors / Hive, Table SQL / API Reporter: Kurt Young Assignee: Rui Li Fix For: 1.11.0
[jira] [Created] (FLINK-15885) Stop overriding TableSource::getTableSchema in tests
Kurt Young created FLINK-15885: -- Summary: Stop overriding TableSource::getTableSchema in tests Key: FLINK-15885 URL: https://issues.apache.org/jira/browse/FLINK-15885 Project: Flink Issue Type: Sub-task Components: Table SQL / API, Tests Reporter: Kurt Young Fix For: 1.11.0
[jira] [Created] (FLINK-15884) Stop overriding TableSource::getReturnType in python table source
Kurt Young created FLINK-15884: -- Summary: Stop overriding TableSource::getReturnType in python table source Key: FLINK-15884 URL: https://issues.apache.org/jira/browse/FLINK-15884 Project: Flink Issue Type: Sub-task Components: API / Python, Table SQL / API Reporter: Kurt Young Fix For: 1.11.0
[jira] [Commented] (FLINK-15798) Running ./bin/kubernetes-session.sh -Dkubernetes.cluster-id= -Dexecution.attached=true fails with exception
[ https://issues.apache.org/jira/browse/FLINK-15798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17029617#comment-17029617 ] Yang Wang commented on FLINK-15798: --- I have tested these two commands multiple times in my Kubernetes cluster, and they work quite well. It seems that the token may have expired in your case. Did you run {{./bin/kubernetes-session.sh -Dkubernetes.cluster-id= -Dexecution.attached=true}} and leave it running for a long time? > Running ./bin/kubernetes-session.sh -Dkubernetes.cluster-id= > -Dexecution.attached=true fails with exception > --- > > Key: FLINK-15798 > URL: https://issues.apache.org/jira/browse/FLINK-15798 > Project: Flink > Issue Type: Sub-task > Components: Deployment / Kubernetes >Affects Versions: 1.10.0 >Reporter: Till Rohrmann >Priority: Major > Fix For: 1.11.0, 1.10.1 > > > Running {{./bin/kubernetes-session.sh -Dkubernetes.cluster-id= > -Dexecution.attached=true}} fails with > {code} > 2020-01-28 15:04:28,669 ERROR > org.apache.flink.kubernetes.cli.KubernetesSessionCli - Error while > running the Flink session. > io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: > GET at: https://35.234.77.125/api/v1/namespaces/default/services/testing. > Message: Unauthorized! Token may have expired! Please log-in again. > Unauthorized. 
> at > io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:510) > at > io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:447) > at > io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:413) > at > io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:372) > at > io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:337) > at > io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:318) > at > io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:812) > at > io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:220) > at > io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:164) > at > org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient.getService(Fabric8FlinkKubeClient.java:330) > at > org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient.getInternalService(Fabric8FlinkKubeClient.java:243) > at > org.apache.flink.kubernetes.cli.KubernetesSessionCli.run(KubernetesSessionCli.java:104) > at > org.apache.flink.kubernetes.cli.KubernetesSessionCli.lambda$main$0(KubernetesSessionCli.java:185) > at > org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30) > at > org.apache.flink.kubernetes.cli.KubernetesSessionCli.main(KubernetesSessionCli.java:185) > {code} > even though {{echo "stop" | ./bin/kubernetes-session.sh > -Dkubernetes.cluster-id= -Dexecution.attached=true}} succeeds > with my setup. This is strange as I would expect that the former call should > do exactly the same as the second except for sending the "stop" command right > away. I think we should check whether this is a real problem or only specific > to my setup.
[jira] [Created] (FLINK-15883) Stop overriding TableSource::getReturnType in parquet table source
Kurt Young created FLINK-15883: -- Summary: Stop overriding TableSource::getReturnType in parquet table source Key: FLINK-15883 URL: https://issues.apache.org/jira/browse/FLINK-15883 Project: Flink Issue Type: Sub-task Components: Table SQL / API Reporter: Kurt Young Assignee: Jingsong Lee Fix For: 1.11.0
[jira] [Created] (FLINK-15882) Stop overriding TableSource::getReturnType in orc table source
Kurt Young created FLINK-15882: -- Summary: Stop overriding TableSource::getReturnType in orc table source Key: FLINK-15882 URL: https://issues.apache.org/jira/browse/FLINK-15882 Project: Flink Issue Type: Sub-task Components: Connectors / ORC, Table SQL / API Reporter: Kurt Young Assignee: Jingsong Lee Fix For: 1.11.0
[jira] [Created] (FLINK-15881) Stop overriding TableSource::getReturnType in jdbc connector
Kurt Young created FLINK-15881: -- Summary: Stop overriding TableSource::getReturnType in jdbc connector Key: FLINK-15881 URL: https://issues.apache.org/jira/browse/FLINK-15881 Project: Flink Issue Type: Sub-task Components: Connectors / JDBC, Table SQL / API Reporter: Kurt Young Assignee: Zhenghua Gao Fix For: 1.11.0
[jira] [Created] (FLINK-15879) Stop overriding TableSource::getReturnType in kafka connector
Kurt Young created FLINK-15879: -- Summary: Stop overriding TableSource::getReturnType in kafka connector Key: FLINK-15879 URL: https://issues.apache.org/jira/browse/FLINK-15879 Project: Flink Issue Type: Sub-task Components: Connectors / Kafka, Table SQL / API Reporter: Kurt Young Assignee: Jark Wu Fix For: 1.11.0
[GitHub] [flink] JingsongLi commented on a change in pull request #11012: [FLINK-15858][hive] Store generic table schema as properties
JingsongLi commented on a change in pull request #11012: [FLINK-15858][hive] Store generic table schema as properties URL: https://github.com/apache/flink/pull/11012#discussion_r374506163 ## File path: flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/HiveCatalogConfig.java ## @@ -32,4 +32,9 @@ // Partition related configs public static final String PARTITION_LOCATION = "partition.location"; + + // config prefix for the table schema of a generic table + public static final String GENERIC_TABLE_SCHEMA_PREFIX = "generic.table.schema"; Review comment: consistent with `CatalogConfig.FLINK_PROPERTY_PREFIX`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Created] (FLINK-15880) Stop overriding TableSource::getReturnType in hbase connector
Kurt Young created FLINK-15880: -- Summary: Stop overriding TableSource::getReturnType in hbase connector Key: FLINK-15880 URL: https://issues.apache.org/jira/browse/FLINK-15880 Project: Flink Issue Type: Sub-task Components: Connectors / HBase, Table SQL / API Reporter: Kurt Young Assignee: Leonard Xu Fix For: 1.11.0
[jira] [Created] (FLINK-15878) Stop overriding TableSource::getReturnType in tests
Kurt Young created FLINK-15878: -- Summary: Stop overriding TableSource::getReturnType in tests Key: FLINK-15878 URL: https://issues.apache.org/jira/browse/FLINK-15878 Project: Flink Issue Type: Sub-task Components: Tests Reporter: Kurt Young Fix For: 1.11.0
[GitHub] [flink] dianfu commented on issue #11013: [hotfix][doc] Fix version settings when building PyFlink
dianfu commented on issue #11013: [hotfix][doc] Fix version settings when building PyFlink URL: https://github.com/apache/flink/pull/11013#issuecomment-581774652 @hequn8128 Thanks for the PR. LGTM, merging...
[GitHub] [flink] JingsongLi commented on a change in pull request #11012: [FLINK-15858][hive] Store generic table schema as properties
JingsongLi commented on a change in pull request #11012: [FLINK-15858][hive] Store generic table schema as properties URL: https://github.com/apache/flink/pull/11012#discussion_r374505837 ## File path: flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/HiveCatalog.java ## @@ -587,57 +599,67 @@ protected static Table instantiateHiveTable(ObjectPath tablePath, CatalogBaseTab // When creating a table, A hive table needs explicitly have a key is_generic = false // otherwise, this is a generic table if 1) the key is missing 2) is_generic = true // this is opposite to reading a table and instantiating a CatalogTable. See instantiateCatalogTable() + boolean isGeneric; if (!properties.containsKey(CatalogConfig.IS_GENERIC)) { - // must be a generic catalog + // must be a generic table + isGeneric = true; properties.put(CatalogConfig.IS_GENERIC, String.valueOf(true)); - properties = maskFlinkProperties(properties); } else { - boolean isGeneric = Boolean.valueOf(properties.get(CatalogConfig.IS_GENERIC)); - - if (isGeneric) { - properties = maskFlinkProperties(properties); - } + isGeneric = Boolean.parseBoolean(properties.get(CatalogConfig.IS_GENERIC)); } - // Table properties - hiveTable.setParameters(properties); - // Hive table's StorageDescriptor StorageDescriptor sd = hiveTable.getSd(); setStorageFormat(sd, properties); - List allColumns = HiveTableUtil.createHiveColumns(table.getSchema()); - - // Table columns and partition keys - if (table instanceof CatalogTableImpl) { - CatalogTable catalogTable = (CatalogTableImpl) table; - - if (catalogTable.isPartitioned()) { - int partitionKeySize = catalogTable.getPartitionKeys().size(); - List regularColumns = allColumns.subList(0, allColumns.size() - partitionKeySize); - List partitionColumns = allColumns.subList(allColumns.size() - partitionKeySize, allColumns.size()); + if (isGeneric) { + DescriptorProperties tableSchemaProps = new DescriptorProperties(true); + 
tableSchemaProps.putTableSchema(HiveCatalogConfig.GENERIC_TABLE_SCHEMA_PREFIX, table.getSchema()); + properties.putAll(tableSchemaProps.asMap()); + + if (table instanceof CatalogTableImpl) { + List partColNames = ((CatalogTableImpl) table).getPartitionKeys(); + if (!partColNames.isEmpty()) { + properties.put(HiveCatalogConfig.GENERIC_PART_COL_NAMES, String.join(",", partColNames)); Review comment: For compatibility reasons, this should throw an exception for now.
[jira] [Created] (FLINK-15877) Stop using deprecated methods from TableSource interface
Kurt Young created FLINK-15877: -- Summary: Stop using deprecated methods from TableSource interface Key: FLINK-15877 URL: https://issues.apache.org/jira/browse/FLINK-15877 Project: Flink Issue Type: Improvement Components: Table SQL / API Reporter: Kurt Young Fix For: 1.11.0 This is an *umbrella* issue to track the cleanup of the current TableSource interface. Methods like `getReturnType` and `getTableSchema` are already deprecated, but are still used by lots of code in various connectors and tests. We should make sure no connector or test code uses these deprecated methods anymore, except for backward-compatibility calls. This prepares for further interface improvements.
[GitHub] [flink] wuchong commented on issue #11011: [FLINK-15658][table-planner-blink] Fix duplicate field names exception when join on the same key multiple times
wuchong commented on issue #11011: [FLINK-15658][table-planner-blink] Fix duplicate field names exception when join on the same key multiple times URL: https://github.com/apache/flink/pull/11011#issuecomment-581771189 > Thanks @wuchong , I've always felt that the methods of the `ProjectionCodeGenerator` is a little redundant. > What about add method to `ProjectionCodeGenerator`: > > ``` > def generateProjection( > name: String, > inputType: RowType, > inputMapping: Array[Int]): GeneratedProjection = { > val ctx = CodeGeneratorContext.apply(new TableConfig) > val outputType = RowType.of(inputMapping.map(inputType.getChildren.get):_ *) > generateProjection(ctx, name, inputType, outputType, inputMapping) > } > ``` However, the outputType is still required to build the key BaseRowTypeInfo, so the new method doesn't help much here.
[GitHub] [flink] flinkbot commented on issue #11013: [hotfix][doc] Fix version settings when building PyFlink
flinkbot commented on issue #11013: [hotfix][doc] Fix version settings when building PyFlink URL: https://github.com/apache/flink/pull/11013#issuecomment-581770431 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 82ec948ab5c2131c5ee0698fd96a3dfbdbe0ee2a (Tue Feb 04 07:00:36 UTC 2020) ✅no warnings Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot commented on issue #11012: [FLINK-15858][hive] Store generic table schema as properties
flinkbot commented on issue #11012: [FLINK-15858][hive] Store generic table schema as properties URL: https://github.com/apache/flink/pull/11012#issuecomment-581770442 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit df658c89289171ef60b8939090fd53982ce46061 (Tue Feb 04 07:00:38 UTC 2020) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! * **This pull request references an unassigned [Jira ticket](https://issues.apache.org/jira/browse/FLINK-15858).** According to the [code contribution guide](https://flink.apache.org/contributing/contribute-code.html), tickets need to be assigned before starting with the implementation work. Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. 
For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-15858) Unable to use HiveCatalog and kafka together
[ https://issues.apache.org/jira/browse/FLINK-15858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-15858: --- Labels: pull-request-available (was: ) > Unable to use HiveCatalog and kafka together > > > Key: FLINK-15858 > URL: https://issues.apache.org/jira/browse/FLINK-15858 > Project: Flink > Issue Type: Bug > Components: Table SQL / Ecosystem >Affects Versions: 1.10.0 >Reporter: Jeff Zhang >Priority: Blocker > Labels: pull-request-available > Fix For: 1.10.0 > > > > HiveCatalog only support timestamp(9), but kafka only support timestamp(3). > This make user unable to use HiveCatalog and kafka together > {code:java} > Caused by: org.apache.flink.table.catalog.exceptions.CatalogException: > HiveCatalog currently only supports timestamp of precision 9 > at > org.apache.flink.table.catalog.hive.util.HiveTypeUtil$TypeInfoLogicalTypeVisitor.visit(HiveTypeUtil.java:272) > at > org.apache.flink.table.catalog.hive.util.HiveTypeUtil$TypeInfoLogicalTypeVisitor.visit(HiveTypeUtil.java:173) > at > org.apache.flink.table.types.logical.TimestampType.accept(TimestampType.java:151) > at > org.apache.flink.table.catalog.hive.util.HiveTypeUtil.toHiveTypeInfo(HiveTypeUtil.java:84) > at > org.apache.flink.table.catalog.hive.util.HiveTableUtil.createHiveColumns(HiveTableUtil.java:106) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
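The incompatibility comes down to a hard precision check in the Hive type mapping. A rough illustration only (plain Python, not the actual `HiveTypeUtil` code; the message string is taken from the stack trace above):

```python
def to_hive_timestamp(precision: int) -> str:
    """Toy model of the check in HiveTypeUtil: Hive's TIMESTAMP type has a
    fixed precision of 9, so any other precision is rejected."""
    if precision != 9:
        raise ValueError(
            "HiveCatalog currently only supports timestamp of precision 9")
    return "timestamp"

# The Kafka connector produces TIMESTAMP(3), which this check rejects,
# while a TIMESTAMP(9) column maps cleanly:
try:
    to_hive_timestamp(3)
except ValueError as e:
    print(e)
print(to_hive_timestamp(9))
```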
[GitHub] [flink] lirui-apache commented on issue #11012: [FLINK-15858][hive] Store generic table schema as properties
lirui-apache commented on issue #11012: [FLINK-15858][hive] Store generic table schema as properties URL: https://github.com/apache/flink/pull/11012#issuecomment-581769797 cc @JingsongLi @bowenli86
[GitHub] [flink] hequn8128 opened a new pull request #11013: [hotfix][doc] Fix version settings when building PyFlink
hequn8128 opened a new pull request #11013: [hotfix][doc] Fix version settings when building PyFlink URL: https://github.com/apache/flink/pull/11013 ## What is the purpose of the change Due to the problem in [FLINK-15638](https://issues.apache.org/jira/browse/FLINK-15638), this pull request tries to fix the building errors by adding extra version settings. ## Brief change log - Adds extra version settings for PyFlink build documentation. ## Verifying this change This change is a trivial rework / code cleanup without any test coverage. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] lirui-apache opened a new pull request #11012: [FLINK-15858][hive] Store generic table schema as properties
lirui-apache opened a new pull request #11012: [FLINK-15858][hive] Store generic table schema as properties URL: https://github.com/apache/flink/pull/11012 ## What is the purpose of the change Store schema of generic table as properties, so that `HiveCatalog` is able to support types that are not supported by Hive, e.g. `TIMESTAMP(!=9)`. ## Brief change log - Store table schema as properties when creating generic tables - Retrieve table schema from properties when getting generic tables - Add test ## Verifying this change Existing and new test cases. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? yes - If yes, how is the feature documented? docs This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
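The round-trip of a schema through string properties that this PR relies on can be sketched roughly as follows (plain Python, not the actual `DescriptorProperties` implementation; the exact key layout is an assumption for illustration):

```python
SCHEMA_PREFIX = "generic.table.schema"  # mirrors the config prefix added in the PR

def schema_to_properties(schema):
    """Flatten (name, type) pairs into flat string properties, in the spirit
    of DescriptorProperties.putTableSchema."""
    props = {}
    for i, (name, dtype) in enumerate(schema):
        props[f"{SCHEMA_PREFIX}.{i}.name"] = name
        props[f"{SCHEMA_PREFIX}.{i}.data-type"] = dtype
    return props

def properties_to_schema(props):
    """Reverse operation: rebuild the schema when reading the table back."""
    schema, i = [], 0
    while f"{SCHEMA_PREFIX}.{i}.name" in props:
        schema.append((props[f"{SCHEMA_PREFIX}.{i}.name"],
                       props[f"{SCHEMA_PREFIX}.{i}.data-type"]))
        i += 1
    return schema

# TIMESTAMP(3) survives the round trip because it is stored as an opaque
# string, never mapped to a Hive type:
schema = [("user_id", "BIGINT"), ("event_time", "TIMESTAMP(3)")]
assert properties_to_schema(schema_to_properties(schema)) == schema
```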
[GitHub] [flink] wangyang0918 commented on issue #10956: [FLINK-15646][client]Configurable K8s context support.
wangyang0918 commented on issue #10956: [FLINK-15646][client]Configurable K8s context support. URL: https://github.com/apache/flink/pull/10956#issuecomment-581769583 @zhengcanbin The changes look really good to me. I just have some minor comments. 1. The module in the commit title should be `k8s` or `kubernetes`. Maybe "[FLINK-15646][k8s] Make kubernetes context configurable" is better. 2. Even though the abbreviation `k8s` is reasonable in the commit title, to leave more characters for other useful information, I suggest avoiding it in the code and documentation. This is not mandatory, so you can leave it as you prefer. @TisonKun could you take another look and help to merge?
[GitHub] [flink] flinkbot edited a comment on issue #11009: [Flink-15873] [cep] fix matches result not returned when existing earlier partial matches
flinkbot edited a comment on issue #11009: [Flink-15873] [cep] fix matches result not returned when existing earlier partial matches URL: https://github.com/apache/flink/pull/11009#issuecomment-581740648 ## CI report: * 67c1b7f08b7de55f3d039680b976246cab8cce07 Travis: [FAILURE](https://travis-ci.com/flink-ci/flink/builds/147320241) Azure: [FAILURE](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=4805) * 2352b23c725aefa8d4f51816b8c6432439a84ee5 Travis: [PENDING](https://travis-ci.com/flink-ci/flink/builds/147326914) Azure: [PENDING](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=4806) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] flinkbot commented on issue #11011: [FLINK-15658][table-planner-blink] Fix duplicate field names exception when join on the same key multiple times
flinkbot commented on issue #11011: [FLINK-15658][table-planner-blink] Fix duplicate field names exception when join on the same key multiple times URL: https://github.com/apache/flink/pull/11011#issuecomment-581769441 ## CI report: * e0a75123cffb0bef0a2d155f52d793ee3f6804c6 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] flinkbot commented on issue #11010: [FLINK-15836][k8s] Throw fatal exception when the pods watcher in Fabric8FlinkKubeClient is closed
flinkbot commented on issue #11010: [FLINK-15836][k8s] Throw fatal exception when the pods watcher in Fabric8FlinkKubeClient is closed URL: https://github.com/apache/flink/pull/11010#issuecomment-581769412 ## CI report: * e6722c5f11f17384bc94de89095c46d605def5e8 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] JingsongLi commented on a change in pull request #11011: [FLINK-15658][table-planner-blink] Fix duplicate field names exception when join on the same key multiple times
JingsongLi commented on a change in pull request #11011: [FLINK-15658][table-planner-blink] Fix duplicate field names exception when join on the same key multiple times URL: https://github.com/apache/flink/pull/11011#discussion_r374497851 ## File path: flink-table/flink-table-planner-blink/src/main/java/org/apache/flink/table/planner/plan/utils/KeySelectorUtil.java ## @@ -44,20 +44,20 @@ public static BaseRowKeySelector getBaseRowSelector(int[] keyFields, BaseRowTypeInfo rowType) { if (keyFields.length > 0) { LogicalType[] inputFieldTypes = rowType.getLogicalTypes(); - String[] inputFieldNames = rowType.getFieldNames(); LogicalType[] keyFieldTypes = new LogicalType[keyFields.length]; - String[] keyFieldNames = new String[keyFields.length]; for (int i = 0; i < keyFields.length; ++i) { keyFieldTypes[i] = inputFieldTypes[keyFields[i]]; - keyFieldNames[i] = inputFieldNames[keyFields[i]]; } - RowType returnType = RowType.of(keyFieldTypes, keyFieldNames); + // do not provide field names for the result key type, + // because we may have duplicate key fields and the field names may conflict + RowType returnType = RowType.of(keyFieldTypes); RowType inputType = RowType.of(inputFieldTypes, rowType.getFieldNames()); Review comment: `rowType.toRowType()`
[jira] [Commented] (FLINK-15816) Limit the maximum length of the value of kubernetes.cluster-id configuration option
[ https://issues.apache.org/jira/browse/FLINK-15816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17029601#comment-17029601 ] Yang Wang commented on FLINK-15816: --- [~trohrmann] [~felixzheng] Thanks for your explanation. The limitation of 63 characters in Kubernetes is by design[1]. It seems that this limitation will never change in the future. Even though I think the current exception message is clear enough, I am not against the pre-check approach. [1]. [https://github.com/kubernetes/community/blob/master/contributors/design-proposals/architecture/identifiers.md#identifiers-and-names-in-kubernetes] > Limit the maximum length of the value of kubernetes.cluster-id configuration > option > --- > > Key: FLINK-15816 > URL: https://issues.apache.org/jira/browse/FLINK-15816 > Project: Flink > Issue Type: Sub-task > Components: Deployment / Kubernetes >Affects Versions: 1.10.0 >Reporter: Canbin Zheng >Assignee: Canbin Zheng >Priority: Minor > Fix For: 1.11.0, 1.10.1 > > Attachments: image-2020-01-31-20-54-33-340.png > > > Two Kubernetes Service will be created when a session cluster is deployed, > one is the internal Service and the other is the rest Service, we set the > internal Service name to the value of the _kubernetes.cluster-id_ > configuration option and then set the rest Service name to > _${kubernetes.cluster-id}_ with a suffix *-rest* appended, said if we set the > _kubernetes.cluster-id_ to *session-test*, then the internal Service name > will be *session-test* and the rest Service name be *session-test-rest;* > there is a constraint in Kubernetes that the Service name must be no more > than 63 characters, for the current naming convention it leads to that the > value of _kubernetes.cluster-id_ should not be more than 58 characters, > otherwise there are scenarios that the internal Service is created > successfully then comes up with a ClusterDeploymentException when trying to > create the rest Service. 
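The proposed pre-check follows directly from the numbers in the issue description: Kubernetes limits a Service name to 63 characters, the rest Service appends `-rest` to the cluster-id, so the cluster-id itself can be at most 58 characters. A minimal sketch of such a validation (plain Python, not the actual Flink code; function name is hypothetical):

```python
K8S_MAX_SERVICE_NAME = 63   # Kubernetes limit on Service names
REST_SUFFIX = "-rest"       # appended to the cluster-id for the rest Service

def check_cluster_id(cluster_id: str) -> None:
    """Fail fast at submission time instead of failing after the internal
    Service has already been created."""
    limit = K8S_MAX_SERVICE_NAME - len(REST_SUFFIX)  # 58 characters
    if len(cluster_id) > limit:
        raise ValueError(
            f"kubernetes.cluster-id must be no longer than {limit} "
            f"characters, got {len(cluster_id)}")

check_cluster_id("session-test")   # fine: internal and rest names both fit
try:
    check_cluster_id("x" * 59)     # rest Service name would be 64 chars
except ValueError as e:
    print(e)
```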
[GitHub] [flink] flinkbot commented on issue #11011: [FLINK-15658][table-planner-blink] Fix duplicate field names exception when join on the same key multiple times
flinkbot commented on issue #11011: [FLINK-15658][table-planner-blink] Fix duplicate field names exception when join on the same key multiple times URL: https://github.com/apache/flink/pull/11011#issuecomment-581765091 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit e0a75123cffb0bef0a2d155f52d793ee3f6804c6 (Tue Feb 04 06:38:44 UTC 2020) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-15658) The same sql run in a streaming environment producing a Exception, but a batch env can run normally.
[ https://issues.apache.org/jira/browse/FLINK-15658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-15658: --- Labels: pull-request-available (was: ) > The same sql run in a streaming environment producing a Exception, but a > batch env can run normally. > > > Key: FLINK-15658 > URL: https://issues.apache.org/jira/browse/FLINK-15658 > Project: Flink > Issue Type: Bug > Components: Table SQL / Client >Affects Versions: 1.10.0 > Environment: *Input data:* > tenk1 is: > 4773|9990|1|1|3|13|73|773|773|4773|4773|146|147|PB|GUOAAA|xx > 4093|9991|1|1|3|13|93|93|93|4093|4093|186|187|LB|HUOAAA|xx > 6587|9992|1|3|7|7|87|587|587|1587|6587|174|175|JT|IUOAAA|xx > 6093|9993|1|1|3|13|93|93|93|1093|6093|186|187|JA|JUOAAA|xx > 429|9994|1|1|9|9|29|429|429|429|429|58|59|NQ|KUOAAA|xx > 5780|9995|0|0|0|0|80|780|1780|780|5780|160|161|IO|LUOAAA|xx > 1783|9996|1|3|3|3|83|783|1783|1783|1783|166|167|PQ|MUOAAA|xx > 2992|9997|0|0|2|12|92|992|992|2992|2992|184|185|CL|NUOAAA|xx > 0|9998|0|0|0|0|0|0|0|0|0|0|1|AA|OUOAAA|xx > 2968||0|0|8|8|68|968|968|2968|2968|136|137|EK|PUOAAA|xx > int4_tbl is: > 0 > 123456 > -123456 > 2147483647 > -2147483647 > *The sql-client configuration is :* > execution: > planner: blink > type: batch >Reporter: xiaojin.wy >Assignee: Jark Wu >Priority: Major > Labels: pull-request-available > Fix For: 1.10.0 > > > *summary:* > The same sql can run in a batch environment normally, but in a streaming > environment there will be a exception like this: > [ERROR] Could not execute SQL statement. Reason: > org.apache.flink.table.api.ValidationException: Field names must be unique. 
> Found duplicates: [f1] > *The sql is:* > CREATE TABLE `tenk1` ( > unique1 int, > unique2 int, > two int, > four int, > ten int, > twenty int, > hundred int, > thousand int, > twothousand int, > fivethous int, > tenthous int, > odd int, > even int, > stringu1 varchar, > stringu2 varchar, > string4 varchar > ) WITH ( > > 'connector.path'='/daily_regression_test_stream_postgres_1.10/test_join/sources/tenk1.csv', > 'format.empty-column-as-null'='true', > 'format.field-delimiter'='|', > 'connector.type'='filesystem', > 'format.derive-schema'='true', > 'format.type'='csv' > ); > CREATE TABLE `int4_tbl` ( > f1 INT > ) WITH ( > > 'connector.path'='/daily_regression_test_stream_postgres_1.10/test_join/sources/int4_tbl.csv', > 'format.empty-column-as-null'='true', > 'format.field-delimiter'='|', > 'connector.type'='filesystem', > 'format.derive-schema'='true', > 'format.type'='csv' > ); > select a.f1, b.f1, t.thousand, t.tenthous from > tenk1 t, > (select sum(f1)+1 as f1 from int4_tbl i4a) a, > (select sum(f1) as f1 from int4_tbl i4b) b > where b.f1 = t.thousand and a.f1 = b.f1 and (a.f1+b.f1+999) = t.tenthous; -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] wuchong opened a new pull request #11011: [FLINK-15658][table-planner-blink] Fix duplicate field names exception when join on the same key multiple times
wuchong opened a new pull request #11011: [FLINK-15658][table-planner-blink] Fix duplicate field names exception when join on the same key multiple times URL: https://github.com/apache/flink/pull/11011 ## What is the purpose of the change The example raised in the JIRA runs successfully in batch mode but fails in streaming mode. The root cause is that the join key fields may be duplicated, so the constructed key RowType will have duplicate field names. ## Brief change log We do not provide field names for the result key type, because we may have duplicate key fields and the field names may conflict. ## Verifying this change Added an IT case to verify this; the IT case fails without this fix. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / **no**) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**) - The serializers: (yes / **no** / don't know) - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know) - The S3 file system connector: (yes / **no** / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / **no**) - If yes, how is the feature documented? (**not applicable** / docs / JavaDocs / not documented)
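The root cause can be illustrated independently of Flink with a small sketch (plain Python; `build_key_type` is a hypothetical stand-in for the key-type construction in `KeySelectorUtil`). Joining on the same column twice projects the same input field twice, so reusing the input field names produces duplicates, while generated positional names (`f0`, `f1`, ...) cannot conflict:

```python
def build_key_type(field_names, key_indices, with_names=True):
    """Project key fields out of the input row type. With with_names=True this
    mimics the old behaviour (reuse input names, fail on duplicates); with
    with_names=False it mimics the fix (generate positional names)."""
    names = [field_names[i] for i in key_indices]
    if with_names:
        dups = sorted({n for n in names if names.count(n) > 1})
        if dups:
            raise ValueError(f"Field names must be unique. Found duplicates: {dups}")
        return names
    return [f"f{i}" for i in range(len(key_indices))]

input_fields = ["f1", "thousand", "tenthous"]
try:
    build_key_type(input_fields, [0, 0])               # old behaviour: fails
except ValueError as e:
    print(e)
print(build_key_type(input_fields, [0, 0], with_names=False))
```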
[GitHub] [flink] xintongsong commented on a change in pull request #11004: [FLINK-15143][docs] Add new memory configuration guide before FLIP-49
xintongsong commented on a change in pull request #11004: [FLINK-15143][docs] Add new memory configuration guide before FLIP-49 URL: https://github.com/apache/flink/pull/11004#discussion_r374493546 ## File path: docs/ops/mem_setup.md ## @@ -0,0 +1,129 @@ +--- +title: "Task Manager Memory Configuration" +nav-parent_id: ops +nav-pos: 5 +--- + + +Apache Flink provides efficient workloads on top of the JVM by tightly controlling the memory usage of its various components. +While the community strives to offer sensible defaults to all configurations, the full breadth of applications +that users deploy on Flink means this isn't always possible. To provide the most production value to our users, +Flink allows both high level and fine-grained tuning of memory allocation within clusters. + +* toc +{:toc} + +See also task manager [configuration options](config.html#taskmanager). + +## Total memory + +The *total memory* in Flink consists of JVM heap, [managed memory](#managed-memory) and [network buffers](#network-buffers). +[Managed memory](#managed-memory) can be either part of JVM heap or direct off-heap memory. +For containerised deployments, the *total memory* can additionally include a [container cut-off](#container-cut-off). + +All other memory components are computed from the *total memory* before starting the Flink process. +After the start, the [managed memory](#managed-memory) and [network memory](#network-buffers) are adjusted in certain cases +based on available JVM memory inside the process (see [Adjustments inside Flink process](#adjustments-inside-flink-process)). + + + + + + +When the Flink JVM process is started from scripts in [standalone mode](deployment/cluster_setup.html), +on [Yarn](deployment/yarn_setup.html) or [Mesos](deployment/mesos.html), the *total memory* is defined by +the configuration option [taskmanager.heap.size](config.html#taskmanager-heap-size-1) (or deprecated `taskmanager.heap.mb`). 
+In case of a containerized deployment ([Yarn](deployment/yarn_setup.html) or [Mesos](deployment/mesos.html)), +this is the size of the requested container. + +### Container cut-off + +In case of a containerised deployment, the [total memory](#total-memory) is additionally reduced by a *cut-off* +which is a fraction ([containerized.heap-cutoff-ratio](config.html#containerized-heap-cutoff-ratio) of the [total memory](#total-memory) +but always greater or equal to its minimum value ([containerized.heap-cutoff-min](config.html#containerized-heap-cutoff-min). + +The cut-off is introduced to accommodate for other types of consumed memory, not accounted for in this memory model, +e.g. RocksDB native memory, JVM overhead etc. It is also a safety margin to prevent the container from exceeding +its memory limit and being killed by the container manager. + +## Network buffers + +This memory is used for buffering records while shuffling them between operator tasks and their executors over the network. +It is calculated as: +``` +network = Min(max, Max(min, fraction x total) +``` +where `fraction` is [taskmanager.network.memory.fraction](config.html#taskmanager-network-memory-fraction), +`min` is [taskmanager.network.memory.min](config.html#taskmanager-network-memory-min) and +`max` is [taskmanager.network.memory.max](config.html#taskmanager-network-memory-max). + +See also [setting memory fractions](config.html#setting-memory-fractions). + +If the before mentioned options are not set but the legacy option is used then the *network memory* is assumed +to be set explicitly without fraction as: +``` +network = legacy buffers x page +``` +where `legacy buffers` are `taskmanager.network.numberOfBuffers` and `page` is +[taskmanager.memory.segment-size](config.html#taskmanager-memory-segment-size). +See also [setting number of network buffers directly](config.html#setting-the-number-of-network-buffers-directly). + +## Managed memory + +The *managed memory* is used for *batch* jobs. 
It helps Flink to run the batch operators efficiently and prevents +`OutOfMemoryException`s because Flink knows how much memory it can use to execute operations. If Flink runs out of *managed memory*, Review comment: ```suggestion `OutOfMemoryError`s because Flink knows how much memory it can use to execute operations. If Flink runs out of *managed memory*, ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
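The network-buffer sizing rule quoted in the reviewed document can be sketched as follows. The default values used here (fraction 0.1, min 64 MiB, max 1 GiB) are assumptions based on the pre-FLIP-49 `taskmanager.network.memory.*` defaults and are not stated in this thread:

```python
# Sketch of the network memory rule from the quoted docs:
#   network = Min(max, Max(min, fraction x total))
# Defaults below are assumed pre-FLIP-49 values, not taken from this thread.

MIB = 1024 * 1024

def network_memory(total_bytes, fraction=0.1,
                   min_bytes=64 * MIB, max_bytes=1024 * MIB):
    """Clamp fraction * total between the configured min and max."""
    return min(max_bytes, max(min_bytes, int(fraction * total_bytes)))

# A 4 GiB task manager gets the fraction-based size (~410 MiB):
print(network_memory(4096 * MIB) // MIB)
# A tiny 256 MiB process is clamped up to the 64 MiB minimum:
print(network_memory(256 * MIB) // MIB)
```

The legacy path mentioned in the same section is simply `taskmanager.network.numberOfBuffers` multiplied by `taskmanager.memory.segment-size`, with no fraction or clamping involved.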
[GitHub] [flink] xintongsong commented on a change in pull request #11004: [FLINK-15143][docs] Add new memory configuration guide before FLIP-49
xintongsong commented on a change in pull request #11004: [FLINK-15143][docs] Add new memory configuration guide before FLIP-49 URL: https://github.com/apache/flink/pull/11004#discussion_r374491358 ## File path: docs/ops/mem_setup.md ## @@ -0,0 +1,129 @@ +--- +title: "Task Manager Memory Configuration" +nav-parent_id: ops +nav-pos: 5 +--- + + +Apache Flink provides efficient workloads on top of the JVM by tightly controlling the memory usage of its various components. +While the community strives to offer sensible defaults to all configurations, the full breadth of applications +that users deploy on Flink means this isn't always possible. To provide the most production value to our users, +Flink allows both high level and fine-grained tuning of memory allocation within clusters. + +* toc +{:toc} + +See also task manager [configuration options](config.html#taskmanager). + +## Total memory + +The *total memory* in Flink consists of JVM heap, [managed memory](#managed-memory) and [network buffers](#network-buffers). +[Managed memory](#managed-memory) can be either part of JVM heap or direct off-heap memory. +For containerised deployments, the *total memory* can additionally include a [container cut-off](#container-cut-off). + +All other memory components are computed from the *total memory* before starting the Flink process. +After the start, the [managed memory](#managed-memory) and [network memory](#network-buffers) are adjusted in certain cases +based on available JVM memory inside the process (see [Adjustments inside Flink process](#adjustments-inside-flink-process)). + + + + + + +When the Flink JVM process is started from scripts in [standalone mode](deployment/cluster_setup.html), +on [Yarn](deployment/yarn_setup.html) or [Mesos](deployment/mesos.html), the *total memory* is defined by +the configuration option [taskmanager.heap.size](config.html#taskmanager-heap-size-1) (or deprecated `taskmanager.heap.mb`). 
+In case of a containerized deployment ([Yarn](deployment/yarn_setup.html) or [Mesos](deployment/mesos.html)), +this is the size of the requested container. + +### Container cut-off + +In case of a containerised deployment, the [total memory](#total-memory) is additionally reduced by a *cut-off* +which is a fraction ([containerized.heap-cutoff-ratio](config.html#containerized-heap-cutoff-ratio) of the [total memory](#total-memory) +but always greater or equal to its minimum value ([containerized.heap-cutoff-min](config.html#containerized-heap-cutoff-min). Review comment: ```suggestion but always greater or equal to its minimum value ([containerized.heap-cutoff-min](config.html#containerized-heap-cutoff-min)). ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] xintongsong commented on a change in pull request #11004: [FLINK-15143][docs] Add new memory configuration guide before FLIP-49
xintongsong commented on a change in pull request #11004: [FLINK-15143][docs] Add new memory configuration guide before FLIP-49 URL: https://github.com/apache/flink/pull/11004#discussion_r374494338 ## File path: docs/ops/mem_setup.md ## @@ -0,0 +1,129 @@ +--- +title: "Task Manager Memory Configuration" +nav-parent_id: ops +nav-pos: 5 +--- + + +Apache Flink provides efficient workloads on top of the JVM by tightly controlling the memory usage of its various components. +While the community strives to offer sensible defaults to all configurations, the full breadth of applications +that users deploy on Flink means this isn't always possible. To provide the most production value to our users, +Flink allows both high level and fine-grained tuning of memory allocation within clusters. + +* toc +{:toc} + +See also task manager [configuration options](config.html#taskmanager). + +## Total memory + +The *total memory* in Flink consists of JVM heap, [managed memory](#managed-memory) and [network buffers](#network-buffers). +[Managed memory](#managed-memory) can be either part of JVM heap or direct off-heap memory. +For containerised deployments, the *total memory* can additionally include a [container cut-off](#container-cut-off). + +All other memory components are computed from the *total memory* before starting the Flink process. +After the start, the [managed memory](#managed-memory) and [network memory](#network-buffers) are adjusted in certain cases +based on available JVM memory inside the process (see [Adjustments inside Flink process](#adjustments-inside-flink-process)). + + + + + + +When the Flink JVM process is started from scripts in [standalone mode](deployment/cluster_setup.html), +on [Yarn](deployment/yarn_setup.html) or [Mesos](deployment/mesos.html), the *total memory* is defined by +the configuration option [taskmanager.heap.size](config.html#taskmanager-heap-size-1) (or deprecated `taskmanager.heap.mb`). 
+In case of a containerized deployment ([Yarn](deployment/yarn_setup.html) or [Mesos](deployment/mesos.html)), +this is the size of the requested container. + +### Container cut-off + +In case of a containerised deployment, the [total memory](#total-memory) is additionally reduced by a *cut-off* +which is a fraction ([containerized.heap-cutoff-ratio](config.html#containerized-heap-cutoff-ratio) of the [total memory](#total-memory) +but always greater or equal to its minimum value ([containerized.heap-cutoff-min](config.html#containerized-heap-cutoff-min). + +The cut-off is introduced to accommodate for other types of consumed memory, not accounted for in this memory model, +e.g. RocksDB native memory, JVM overhead etc. It is also a safety margin to prevent the container from exceeding +its memory limit and being killed by the container manager. + +## Network buffers + +This memory is used for buffering records while shuffling them between operator tasks and their executors over the network. +It is calculated as: +``` +network = Min(max, Max(min, fraction x total) +``` +where `fraction` is [taskmanager.network.memory.fraction](config.html#taskmanager-network-memory-fraction), +`min` is [taskmanager.network.memory.min](config.html#taskmanager-network-memory-min) and +`max` is [taskmanager.network.memory.max](config.html#taskmanager-network-memory-max). + +See also [setting memory fractions](config.html#setting-memory-fractions). + +If the before mentioned options are not set but the legacy option is used then the *network memory* is assumed +to be set explicitly without fraction as: +``` +network = legacy buffers x page +``` +where `legacy buffers` are `taskmanager.network.numberOfBuffers` and `page` is +[taskmanager.memory.segment-size](config.html#taskmanager-memory-segment-size). +See also [setting number of network buffers directly](config.html#setting-the-number-of-network-buffers-directly). + +## Managed memory + +The *managed memory* is used for *batch* jobs. 
It helps Flink to run the batch operators efficiently and prevents +`OutOfMemoryException`s because Flink knows how much memory it can use to execute operations. If Flink runs out of *managed memory*, +it utilizes disk space. Using *managed memory*, some operations can be performed directly on the raw data without having +to deserialize the data to convert it into Java objects. All in all, *managed memory* improves the robustness and speed of the system. + +The *managed memory* can be either part of JVM heap (on-heap) or off-heap, it is on-heap by default +([taskmanager.memory.off-heap](config.html#taskmanager-memory-off-heap), default: `false`). + +The managed memory size can be either set **explicitly** by the [taskmanager.memory.size](config.html#taskmanager-memory-size) +or if not set explicitly then it is defined as a **fraction** ([taskmanager.memory.fraction](config.html#taskmanager-memory-fraction)) +of [total memory](#total-memory) minus [network memory](#network-buff
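The managed memory derivation quoted above (explicit size wins; otherwise a fraction of total memory minus network memory) can be sketched like this. The 0.7 default fraction is an assumption based on the pre-FLIP-49 `taskmanager.memory.fraction` default, not stated in this thread:

```python
# Sketch of the managed memory rule from the quoted docs. Units are
# arbitrary (e.g. MiB); the 0.7 fraction is an assumed default.

def managed_memory(total, network, explicit_size=None, fraction=0.7):
    if explicit_size is not None:   # taskmanager.memory.size wins if set
        return explicit_size
    return int(fraction * (total - network))

# Derived as a fraction of what remains after network memory:
print(managed_memory(total=4096, network=409))
# An explicit taskmanager.memory.size overrides the fraction:
print(managed_memory(total=4096, network=409, explicit_size=1024))
```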
[GitHub] [flink] lukess commented on issue #9848: [FLINK-14332] [flink-metrics-signalfx] add flink-metrics-signalfx
lukess commented on issue #9848: [FLINK-14332] [flink-metrics-signalfx] add flink-metrics-signalfx URL: https://github.com/apache/flink/pull/9848#issuecomment-581764022 @aljoscha do you have time to review current progress? thanks This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (FLINK-15836) Throw fatal exception when the watcher in Fabric8FlinkKubeClient is closed
[ https://issues.apache.org/jira/browse/FLINK-15836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17029598#comment-17029598 ] Yang Wang commented on FLINK-15836: --- Yeah, I will start to work on this. > Throw fatal exception when the watcher in Fabric8FlinkKubeClient is closed > -- > > Key: FLINK-15836 > URL: https://issues.apache.org/jira/browse/FLINK-15836 > Project: Flink > Issue Type: Sub-task > Components: Deployment / Kubernetes >Reporter: Yang Wang >Assignee: Yang Wang >Priority: Major > Labels: pull-request-available > Fix For: 1.11.0, 1.10.1 > > Time Spent: 10m > Remaining Estimate: 0h > > As the discussion in the PR[1], if the {{watchReconnectLimit}} is configured > by users via java properties or environment, the watch may be stopped and all > the changes will not be processed properly. So a fatal exception needs to be > thrown when the watcher is closed. > [1]. https://github.com/apache/flink/pull/10965#discussion_r373491974 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot commented on issue #11010: [FLINK-15836][k8s] Throw fatal exception when the pods watcher in Fabric8FlinkKubeClient is closed
flinkbot commented on issue #11010: [FLINK-15836][k8s] Throw fatal exception when the pods watcher in Fabric8FlinkKubeClient is closed URL: https://github.com/apache/flink/pull/11010#issuecomment-581763214 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit e6722c5f11f17384bc94de89095c46d605def5e8 (Tue Feb 04 06:30:58 UTC 2020) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
[GitHub] [flink] carp84 commented on a change in pull request #10997: [FLINK-15743][docs] Extend Flink 1.10 release notes
carp84 commented on a change in pull request #10997: [FLINK-15743][docs] Extend Flink 1.10 release notes URL: https://github.com/apache/flink/pull/10997#discussion_r374493489 ## File path: docs/release-notes/flink-1.10.md ## @@ -176,6 +182,18 @@ The container cut-off configuration options, `containerized.heap-cutoff-ratio` and `containerized.heap-cutoff-min`, have no effect for task executor processes anymore but they still have the same semantics for the JobManager process. + RocksDB State Backend Memory Control ([FLINK-7289](https://issues.apache.org/jira/browse/FLINK-7289)) +Together with the introduction of the [new Task Executor Memory +Model](#new-task-executor-memory-model-flink-13980), the memory consumption of the RocksDB state backend will be Review comment: We are depending on #10999 here to get the valid link, right? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-15836) Throw fatal exception when the watcher in Fabric8FlinkKubeClient is closed
[ https://issues.apache.org/jira/browse/FLINK-15836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-15836: --- Labels: pull-request-available (was: ) > Throw fatal exception when the watcher in Fabric8FlinkKubeClient is closed > -- > > Key: FLINK-15836 > URL: https://issues.apache.org/jira/browse/FLINK-15836 > Project: Flink > Issue Type: Sub-task > Components: Deployment / Kubernetes >Reporter: Yang Wang >Assignee: Yang Wang >Priority: Major > Labels: pull-request-available > Fix For: 1.11.0, 1.10.1 > > > As the discussion in the PR[1], if the {{watchReconnectLimit}} is configured > by users via java properties or environment, the watch may be stopped and all > the changes will not be processed properly. So a fatal exception needs to be > thrown when the watcher is closed. > [1]. https://github.com/apache/flink/pull/10965#discussion_r373491974 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-15658) The same sql run in a streaming environment producing a Exception, but a batch env can run normally.
[ https://issues.apache.org/jira/browse/FLINK-15658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu reassigned FLINK-15658: --- Assignee: Jark Wu > The same sql run in a streaming environment producing a Exception, but a > batch env can run normally. > > > Key: FLINK-15658 > URL: https://issues.apache.org/jira/browse/FLINK-15658 > Project: Flink > Issue Type: Bug > Components: Table SQL / Client >Affects Versions: 1.10.0 > Environment: *Input data:* > tenk1 is: > 4773|9990|1|1|3|13|73|773|773|4773|4773|146|147|PB|GUOAAA|xx > 4093|9991|1|1|3|13|93|93|93|4093|4093|186|187|LB|HUOAAA|xx > 6587|9992|1|3|7|7|87|587|587|1587|6587|174|175|JT|IUOAAA|xx > 6093|9993|1|1|3|13|93|93|93|1093|6093|186|187|JA|JUOAAA|xx > 429|9994|1|1|9|9|29|429|429|429|429|58|59|NQ|KUOAAA|xx > 5780|9995|0|0|0|0|80|780|1780|780|5780|160|161|IO|LUOAAA|xx > 1783|9996|1|3|3|3|83|783|1783|1783|1783|166|167|PQ|MUOAAA|xx > 2992|9997|0|0|2|12|92|992|992|2992|2992|184|185|CL|NUOAAA|xx > 0|9998|0|0|0|0|0|0|0|0|0|0|1|AA|OUOAAA|xx > 2968||0|0|8|8|68|968|968|2968|2968|136|137|EK|PUOAAA|xx > int4_tbl is: > 0 > 123456 > -123456 > 2147483647 > -2147483647 > *The sql-client configuration is :* > execution: > planner: blink > type: batch >Reporter: xiaojin.wy >Assignee: Jark Wu >Priority: Major > Fix For: 1.10.0 > > > *summary:* > The same sql can run in a batch environment normally, but in a streaming > environment there will be a exception like this: > [ERROR] Could not execute SQL statement. Reason: > org.apache.flink.table.api.ValidationException: Field names must be unique. 
> Found duplicates: [f1] > *The sql is:* > CREATE TABLE `tenk1` ( > unique1 int, > unique2 int, > two int, > four int, > ten int, > twenty int, > hundred int, > thousand int, > twothousand int, > fivethous int, > tenthous int, > odd int, > even int, > stringu1 varchar, > stringu2 varchar, > string4 varchar > ) WITH ( > > 'connector.path'='/daily_regression_test_stream_postgres_1.10/test_join/sources/tenk1.csv', > 'format.empty-column-as-null'='true', > 'format.field-delimiter'='|', > 'connector.type'='filesystem', > 'format.derive-schema'='true', > 'format.type'='csv' > ); > CREATE TABLE `int4_tbl` ( > f1 INT > ) WITH ( > > 'connector.path'='/daily_regression_test_stream_postgres_1.10/test_join/sources/int4_tbl.csv', > 'format.empty-column-as-null'='true', > 'format.field-delimiter'='|', > 'connector.type'='filesystem', > 'format.derive-schema'='true', > 'format.type'='csv' > ); > select a.f1, b.f1, t.thousand, t.tenthous from > tenk1 t, > (select sum(f1)+1 as f1 from int4_tbl i4a) a, > (select sum(f1) as f1 from int4_tbl i4b) b > where b.f1 = t.thousand and a.f1 = b.f1 and (a.f1+b.f1+999) = t.tenthous; -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-15305) MemoryMappedBoundedDataTest fails with IOException on ppc64le
[ https://issues.apache.org/jira/browse/FLINK-15305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17029597#comment-17029597 ] Yingjie Cao commented on FLINK-15305: - Sorry for the late response, we were on Chinese Spring Festival vacation last week. The reported exception happens when the data buffer to be written plus the header length (8 bytes) is larger than the region size. Flink always uses Integer.MAX_VALUE (2G) as the region size, which is not configurable, and for the data buffer size, Flink uses 32k by default and this option is configurable. Theoretically, it is possible to trigger the exception if a very large buffer size is configured (especially when huge pages are used). To fix the problem, we can: # Remind the user not to use a too large buffer size in the documentation of MEMORY_SEGMENT_SIZE; # Modify the test case to respect the page size, that is, calculate a proper data buffer size and region size based on the page size. IMO, it is not a critical problem; after all, there is hardly anyone who sets the buffer size to a very large value. [~pnowojski] What do you think? Should we give it a fix? > MemoryMappedBoundedDataTest fails with IOException on ppc64le > - > > Key: FLINK-15305 > URL: https://issues.apache.org/jira/browse/FLINK-15305 > Project: Flink > Issue Type: Bug > Components: Runtime / Network > Environment: arch: ppc64le > os: rhel 7.6 > jdk: 8 > mvn: 3.3.9 >Reporter: Siddhesh Ghadi >Priority: Major > Attachments: surefire-report.txt > > > By reducing the buffer size from 76_687 to 60_787 in > flink-runtime/src/test/java/org/apache/flink/runtime/io/network/partition/BoundedDataTestBase.java:164, > test passes. Any thoughts on this approach? -- This message was sent by Atlassian Jira (v8.3.4#803005)
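The size check described in the comment above can be sketched as follows. The 8-byte header, the 32k default buffer size, and the `Integer.MAX_VALUE` region size are quoted from the comment; `fits_in_region` is a hypothetical helper for illustration, not Flink code:

```python
# A write fails when the data buffer plus the 8-byte header no longer
# fits into the memory-mapped region.

HEADER_LENGTH = 8            # bytes prepended to each buffer (per the comment)
REGION_SIZE = 2**31 - 1      # Integer.MAX_VALUE; not configurable in Flink

def fits_in_region(buffer_size, region_size=REGION_SIZE):
    return buffer_size + HEADER_LENGTH <= region_size

print(fits_in_region(32 * 1024))      # the 32k default fits comfortably
print(fits_in_region(REGION_SIZE))    # the header pushes this past the limit
```

This also illustrates why the problem only surfaces with unusually large configured buffer sizes, as the comment argues.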
[GitHub] [flink] wangyang0918 opened a new pull request #11010: [FLINK-15836][k8s] Throw fatal exception when the pods watcher in Fabric8FlinkKubeClient is closed
wangyang0918 opened a new pull request #11010: [FLINK-15836][k8s] Throw fatal exception when the pods watcher in Fabric8FlinkKubeClient is closed URL: https://github.com/apache/flink/pull/11010 ## What is the purpose of the change As the discussion in the [PR](https://github.com/apache/flink/pull/10965#discussion_r373491974), the watcher will always be reconnected in Kubernetes client. However, if the watchReconnectLimit is configured by users via java properties or environment, the watcher may be stopped and all the changes will not be processed properly. So we need to throw a fatal exception in this case. ## Brief change log * Throw `FlinkRuntimeException` when the watcher is closed ## Verifying this change * none ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? (not applicable) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide
xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide URL: https://github.com/apache/flink/pull/10999#discussion_r374493406 ## File path: docs/ops/memory/mem_tuning.md ## @@ -0,0 +1,86 @@ +--- +title: "Memory tuning guide" +nav-parent_id: ops_mem +nav-pos: 2 +--- + + +In addition to the [main memory setup guide](mem_setup.html), this section explains how to setup memory of task executors +depending on the use case and which options are important in which case. + +* toc +{:toc} + +## Configure memory for standalone deployment + +It is recommended to configure [total Flink memory](mem_setup.html#configure-total-memory) +([taskmanager.memory.flink.size](../config.html#taskmanager-memory-flink-size-1)) or its [components](mem_setup.html#detailed-memory-model) +for [standalone deployment](../deployment/cluster_setup.html) where you want to declare how much memory is given to Flink itself. +Additionally, you can adjust *JVM metaspace* if it causes [problems](mem_trouble.html#outofmemoryerror-metaspace). + +The *total Process memory* is not relevant because *JVM overhead* is not controlled by Flink or deployment environment, +only physical resources of the executing machine matter in this case. + +## Configure memory for containers + +It is recommended to configure [total process memory](mem_setup.html#configure-total-memory) +([taskmanager.memory.process.size](../config.html#taskmanager-memory-process-size-1)) for the containerized deployments +([Kubernetes](../deployment/kubernetes.html), [Yarn](../deployment/yarn_setup.html) or [Mesos](../deployment/mesos.html)). +It declares how much memory in total should be assigned to the Flink *JVM process* and corresponds to the size of the requested container. 
+ +Note: If you configure the *total Flink memory* Flink will implicitly add JVM memory components +to derive the *total process memory* and request a container with the memory of that derived size, +see also [detailed Memory Model](mem_setup.html#detailed-memory-model). + + + Warning: If Flink or user code allocates unmanaged off-heap (native) memory beyond the container size + the job can fail because the deployment environment can kill the offending containers. + +See also description of [container memory exceeded](mem_trouble.html#container-memory-exceeded) failure. + +## Configure memory for state backends + +When deploying a Flink streaming application, the type of [state backend](../state/state_backends.html) used +will dictate the optimal memory configurations of your cluster. + +### Heap state backend + +When running a stateless job or using a heap state backend ([MemoryStateBackend](../state/state_backends.html#the-memorystatebackend) +or [FsStateBackend](../state/state_backends.html#the-fsstatebackend), set [managed memory](mem_setup.html#managed-memory) to zero. +This will ensure that the maximum amount of memory is allocated for user code on the JVM. + +### RocksDB state backend + +The [RocksDBStateBackend](../state/state_backends.html#the-rocksdbstatebackend) uses native memory. By default, +RocksDB is setup to limit native memory allocation to the size of the [managed memory](mem_setup.html#managed-memory). +Therefore, it is important to reserve enough *managed memory* for your state use case. If you disable the default RocksDB memory control, +task executors can be killed in containerized deployments if RocksDB allocates memory above the limit of the requested container size +(the [total process memory](mem_setup.html#configure-total-memory)). +See also [state.backend.rocksdb.memory.managed](../config.html#state-backend-rocksdb-memory-managed). 
+ +## Configure memory for batch jobs + +Flink's batch operators leverage [managed memory](../memory/mem_setup.html#managed-memory) to run more efficiently. +In doing so, some operations can be performed directly on raw data without having to be deserialized into Java objects. +This means that [managed memory](../memory/mem_setup.html#managed-memory) configurations have practical effects +on the performance of your applications. Flink will attempt to allocate and use as much [managed memory](../memory/mem_setup.html#managed-memory) +as configured for batch jobs but not go beyond its limits. This prevents `OutOfMemoryException`'s because Flink knows precisely Review comment: The class name is `OutOfMemoryError` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
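The note quoted in this review (configuring *total Flink memory* makes Flink implicitly add JVM memory components to derive the *total process memory*) can be sketched numerically. The defaults used here (metaspace 96 MiB; JVM overhead 10% of process memory, clamped to [192 MiB, 1 GiB]) are assumptions based on the Flink 1.10 configuration and are not stated in this thread:

```python
# Sketch: deriving total process memory from total Flink memory by adding
# the assumed JVM metaspace and JVM overhead components.

MIB = 1024 * 1024

def total_process_memory(flink_memory, metaspace=96 * MIB,
                         overhead_fraction=0.1,
                         overhead_min=192 * MIB, overhead_max=1024 * MIB):
    # Overhead is a fraction of the *process* memory, so solve
    #   process = flink + metaspace + fraction * process
    process = (flink_memory + metaspace) / (1 - overhead_fraction)
    overhead = min(overhead_max, max(overhead_min, overhead_fraction * process))
    return int(flink_memory + metaspace + overhead)

# Configuring 1.5 GiB of Flink memory requests a noticeably larger container:
print(total_process_memory(1536 * MIB) // MIB)   # 1824 (MiB)
```

This is why the warning above matters: the container is sized from the derived process memory, and any unmanaged native allocations beyond it can get the container killed.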
[GitHub] [flink] flinkbot edited a comment on issue #11008: [hotfix][docs] Fix missing double quotes in catalog docs
flinkbot edited a comment on issue #11008: [hotfix][docs] Fix missing double quotes in catalog docs URL: https://github.com/apache/flink/pull/11008#issuecomment-581740628 ## CI report: * 413f62721188f7a593ba6a5dd6917a1c0a46a4a7 Travis: [SUCCESS](https://travis-ci.com/flink-ci/flink/builds/147320233) Azure: [SUCCESS](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=4804) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #11009: [Flink-15873] [cep] fix matches result not returned when existing earlier partial matches
flinkbot edited a comment on issue #11009: [Flink-15873] [cep] fix matches result not returned when existing earlier partial matches URL: https://github.com/apache/flink/pull/11009#issuecomment-581740648 ## CI report: * 67c1b7f08b7de55f3d039680b976246cab8cce07 Travis: [FAILURE](https://travis-ci.com/flink-ci/flink/builds/147320241) Azure: [FAILURE](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=4805) * 2352b23c725aefa8d4f51816b8c6432439a84ee5 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] carp84 commented on issue #10974: [FLINK-15807][docs] Add Java 11 to supported JDKs
carp84 commented on issue #10974: [FLINK-15807][docs] Add Java 11 to supported JDKs URL: https://github.com/apache/flink/pull/10974#issuecomment-581760713 > What is listed in the PR description (Cassandra, Hive, HBase, Kafka 0.8-0.11). These are the features for which we have tests that are not being run on Java 11. We also have no e2e runs on YARN, but the IT cases are passing. I can see this is already tracked by #10997, so it seems the PR here is ready to go.
[jira] [Commented] (FLINK-15620) State TTL: Remove deprecated enable default background cleanup
[ https://issues.apache.org/jira/browse/FLINK-15620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17029594#comment-17029594 ] Arujit Pradhan commented on FLINK-15620: Is someone working on this issue? I have been working on Flink for the past couple of months and now really want to contribute to it. Can someone please assign this to me? > State TTL: Remove deprecated enable default background cleanup > -- > > Key: FLINK-15620 > URL: https://issues.apache.org/jira/browse/FLINK-15620 > Project: Flink > Issue Type: Task > Components: Runtime / State Backends >Reporter: Andrey Zagrebin >Priority: Blocker > Fix For: 1.11.0 > > > Follow-up for FLINK-15606. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide
xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide URL: https://github.com/apache/flink/pull/10999#discussion_r374454620 ## File path: docs/ops/memory/mem_setup.md ## @@ -0,0 +1,251 @@ +--- +title: "Setup Task Executor Memory" +nav-parent_id: ops_mem +nav-pos: 1 +--- + + +Apache Flink provides efficient workloads on top of the JVM by tightly controlling the memory usage of its various components. +While the community strives to offer sensible defaults to all configurations, the full breadth of applications +that users deploy on Flink means this isn't always possible. To provide the most production value to our users, +Flink allows both high level and fine-grained tuning of memory allocation within clusters. + +* toc +{:toc} + +The further described memory configuration is applicable starting with the release version *1.10*. If you upgrade Flink +from earlier versions, check the [migration guide](mem_migration.html) because many changes were introduced with the 1.10 release. + +Note: This memory setup guide is relevant only for task executors! +Check [job manager related configuration options](../config.html#jobmanager) for the memory setup of job manager. + +## Configure Total Memory + +The *total process memory* of Flink JVM processes consists of memory consumed by Flink application (*total Flink memory*) +and by the JVM to run the process. The *total Flink memory* consumption includes usage of JVM heap, +*managed memory* (managed by Flink) and other direct (or native) memory. + + + + + + +If you run Flink locally (e.g. from your IDE) without creating a cluster, then only a subset of the memory configuration +options are relevant, see also [local execution](#local-execution) for more details. 
+ +Otherwise, the simplest way to setup memory in Flink is to configure either of the two following options: +* Total Flink memory ([taskmanager.memory.flink.size](../config.html#taskmanager-memory-flink-size-1)) +* Total process memory ([taskmanager.memory.process.size](../config.html#taskmanager-memory-process-size-1)) + +The rest of the memory components will be adjusted automatically, based on default values or additionally configured options. +[Here](#detailed-memory-model) are more details about the other memory components. + +Configuring *total Flink memory* is better suited for standalone deployments where you want to declare how much memory +is given to Flink itself. The *total Flink memory* splits up into [managed memory size](#managed-memory) and JVM heap. + +If you configure *total process memory* you declare how much memory in total should be assigned to the Flink *JVM process*. +For the containerized deployments it corresponds to the size of the requested container, see also +[how to configure memory for containers](mem_tuning.html#configure-memory-for-containers) +([Kubernetes](../deployment/kubernetes.html), [Yarn](../deployment/yarn_setup.html) or [Mesos](../deployment/mesos.html)). + +Another way to setup the memory is to set [task heap](#task-operator-heap-memory) and [managed memory](#managed-memory) +([taskmanager.memory.task.heap.size](../config.html#taskmanager-memory-task-heap-size) and [taskmanager.memory.managed.size](../config.html#taskmanager-memory-managed-size)). +This more fine-grained approach is described in more detail [here](#configure-heap-and-managed-memory). + +Note: One of the three mentioned ways has to be used to configure Flink’s memory (except for local execution), or the Flink startup will fail. 
+This means that one of the following option subsets, which do not have default values, have to be configured explicitly in *flink-conf.yaml*: +* [taskmanager.memory.flink.size](../config.html#taskmanager-memory-flink-size-1) +* [taskmanager.memory.process.size](../config.html#taskmanager-memory-process-size-1) +* [taskmanager.memory.task.heap.size](../config.html#taskmanager-memory-task-heap-size) and [taskmanager.memory.managed.size](../config.html#taskmanager-memory-managed-size) + +Note: Explicitly configuring both *total process memory* and *total Flink memory* is not recommended. +It may lead to deployment failures due to potential memory configuration conflicts. Additional configuration +of other memory components also requires caution as it can produce further configuration conflicts. +The conflict can occur e.g. if the sum of sub-components does not add up to the total configured memory or size of some +component is outside of its min/max range, see also [the detailed memory model](#detailed-memory-model). + +## Configure Heap and Managed Memory + +As mentioned before in [total memory description](#configure-total-memory), another way to setup memory in Flink is +to specify explicitly both [task heap](#task-operator-heap-memory) and [managed memory](#managed-memory). +It gives more control over the available JVM heap to Flink’s
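The three alternative ways to configure memory described in the quoted documentation above can be sketched as a flink-conf.yaml fragment. This is a hedged illustration: the option keys are the ones named in the text, but all sizes are made-up example values, and exactly one of the three subsets should be set in a real configuration.

```yaml
# Option A: declare the total Flink memory (suits standalone deployments)
taskmanager.memory.flink.size: 1280m

# Option B (alternative): declare the total JVM process memory
# (suits containerized deployments -- corresponds to the requested container size)
# taskmanager.memory.process.size: 1728m

# Option C (alternative): fine-grained control over task heap and managed memory
# taskmanager.memory.task.heap.size: 512m
# taskmanager.memory.managed.size: 512m
```

As the quoted note warns, explicitly setting both total process memory and total Flink memory risks configuration conflicts.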
[GitHub] [flink] xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide
xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide URL: https://github.com/apache/flink/pull/10999#discussion_r374485798 ## File path: docs/ops/state/state_backends.zh.md ## @@ -115,6 +120,8 @@ RocksDBStateBackend 的适用场景: 然而，这也意味着使用 RocksDBStateBackend 将会使应用程序的最大吞吐量降低。 所有的读写都必须序列化、反序列化操作，这个比基于堆内存的 state backend 的效率要低很多。 +Check also recommendations about the [task executor memory configuration](../memory/mem_tuning.html#rocksdb-state-backend) for the RocksDBStateBackend. Review comment: Translation: 请同时参考 \[Task Executor 内存配置\](../memory/mem_tuning.html#rocksdb-state-backend) 中关于 RocksDBStateBackend 的建议。
[GitHub] [flink] xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide
xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide URL: https://github.com/apache/flink/pull/10999#discussion_r374452675 ## File path: docs/ops/memory/mem_setup.md
[GitHub] [flink] xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide
xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide URL: https://github.com/apache/flink/pull/10999#discussion_r374483636 ## File path: docs/ops/state/state_backends.zh.md ## @@ -71,6 +71,8 @@ MemoryStateBackend 适用场景: - 本地开发和调试。 - 状态很小的 Job，例如：由每次只处理一条记录的函数（Map、FlatMap、Filter 等）构成的 Job。Kafka Consumer 仅仅需要非常小的状态。 +It is also recommended to set [managed memory](mem_setup.html#managed-memory) to zero. +This will ensure that the maximum amount of memory is allocated for user code on the JVM. Review comment: Translation: 建议同时将 \[managed memory\](../memory/mem_setup.html#managed-memory) 设为0，以保证将最大限度的内存分配给 JVM 上的用户代码。
[GitHub] [flink] xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide
xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide URL: https://github.com/apache/flink/pull/10999#discussion_r374454668 ## File path: docs/ops/memory/mem_setup.md
[GitHub] [flink] xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide
xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide URL: https://github.com/apache/flink/pull/10999#discussion_r374455972 ## File path: docs/ops/memory/mem_setup.md
[GitHub] [flink] xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide
xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide URL: https://github.com/apache/flink/pull/10999#discussion_r374483959 ## File path: docs/ops/state/state_backends.md ## @@ -92,6 +94,9 @@ The FsStateBackend is encouraged for: - Jobs with large state, long windows, large key/value states. - All high-availability setups. +It is also recommended to set [managed memory](mem_setup.html#managed-memory) to zero. Review comment: ```suggestion It is also recommended to set [managed memory](../memory/mem_setup.html#managed-memory) to zero. ```
[GitHub] [flink] xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide
xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide URL: https://github.com/apache/flink/pull/10999#discussion_r374476391 ## File path: docs/ops/memory/mem_migration.md ## @@ -0,0 +1,184 @@ +--- +title: "Migration from old configuration (before 1.10 release)" +nav-parent_id: ops_mem +nav-pos: 4 +--- + + +The [memory setup of task managers](mem_setup.html) has changed a lot with the 1.10 release. Many configuration options +were removed or their semantics changed. This guide will help you to understand how to migrate the previous memory configuration to the new one. + +* toc +{:toc} + + + Warning: It is important to review this guide because the legacy and new memory configuration can + result in different sizes of memory components. If you try to reuse your Flink configuration from the previous versions + before 1.10, it can result in changes to the behavior, performance or even configuration failures of your application. + + +Note: The previous memory configuration allows that no memory related options are set at all +as they all have default values. The [new memory configuration](mem_setup.html#configure-total-memory) requires +that at least one subset of the following options is configured explicitly, otherwise the configuration will fail: +* [taskmanager.memory.flink.size](../config.html#taskmanager-memory-flink-size-1) +* [taskmanager.memory.process.size](../config.html#taskmanager-memory-process-size-1) +* [taskmanager.memory.task.heap.size](../config.html#taskmanager-memory-task-heap-size) and [taskmanager.memory.managed.size](../config.html#taskmanager-memory-managed-size) + +The [default ‘flink-conf.yaml’](#default-configuration-in-flink-confyaml) shipped with Flink sets [taskmanager.memory.process.size](../config.html#taskmanager-memory-process-size-1) +to make the default memory configuration consistent. 
+ +This [spreadsheet](https://docs.google.com/spreadsheets/d/1mJaMkMPfDJJ-w6nMXALYmTc4XxiV30P5U7DzgwLkSoE) can also help +to evaluate and compare the results of the legacy and new memory computations. + +## Changes in Configuration Options + +This chapter shortly lists all changes in the memory configuration options before the 1.10 release. +It also references other chapters for more details about the migration to the new configuration options. + +The following options are completely removed. If they are still used, they will be just ignored. + +| **Removed option** | **Note** | +| :- | :-- | +| taskmanager.memory.fraction| Check the description of the new option [taskmanager.memory.managed.fraction](../config.html#taskmanager-memory-managed-fraction). The new option has different semantics and the value of the deprecated option usually has to be adjusted. See also [how to migrate managed memory](#managed-memory). | +| taskmanager.memory.off-heap| on-heap managed memory is no longer supported, see also [how to migrate managed memory](#managed-memory) | +| taskmanager.memory.preallocate | pre-allocation is no longer supported and managed memory is always allocated lazily, see also [how to migrate managed memory](#managed-memory) | + +The following options are deprecated but if they are still used they will be interpreted as new options for backwards compatibility: + +| **Deprecated option** | **Interpreted as** | +| :-- | :
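The removed options in the table above can be contrasted with their closest 1.10 counterparts in a short flink-conf.yaml sketch. This is an assumption-laden illustration, not an exact mapping: in particular, taskmanager.memory.managed.fraction has different semantics from the old taskmanager.memory.fraction, so the value usually needs adjusting rather than copying.

```yaml
# Legacy (pre-1.10) options -- removed in 1.10 and silently ignored:
# taskmanager.memory.fraction: 0.7
# taskmanager.memory.off-heap: true      # on-heap managed memory no longer exists
# taskmanager.memory.preallocate: false  # managed memory is now always allocated lazily

# Closest new (1.10+) options, with illustrative values:
taskmanager.memory.managed.fraction: 0.4
taskmanager.memory.process.size: 1728m
```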
[GitHub] [flink] xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide
xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide URL: https://github.com/apache/flink/pull/10999#discussion_r374483006 ## File path: docs/ops/state/state_backends.md ## @@ -74,6 +74,8 @@ The MemoryStateBackend is encouraged for: - Local development and debugging - Jobs that do hold little state, such as jobs that consist only of record-at-a-time functions (Map, FlatMap, Filter, ...). The Kafka Consumer requires very little state. +It is also recommended to set [managed memory](mem_setup.html#managed-memory) to zero. Review comment: ```suggestion It is also recommended to set [managed memory](../memory/mem_setup.html#managed-memory) to zero. ```
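The review suggestions above for the heap-based state backends recommend giving managed memory back to the JVM heap. A minimal flink-conf.yaml sketch of that advice, assuming a heap-based state backend; the checkpoint path is an illustrative placeholder:

```yaml
# A heap-based state backend keeps all state on the JVM heap,
# so Flink's managed (off-heap) memory would otherwise sit unused.
state.backend: filesystem
state.checkpoints.dir: file:///tmp/flink-checkpoints  # illustrative path
taskmanager.memory.managed.size: 0m
```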
[GitHub] [flink] xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide
xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide URL: https://github.com/apache/flink/pull/10999#discussion_r374473559 ## File path: docs/ops/memory/mem_trouble.md ## @@ -0,0 +1,70 @@ +--- +title: "Troubleshooting" +nav-parent_id: ops_mem +nav-pos: 3 +--- + + +* toc +{:toc} + +## IllegalConfigurationException + +If you see an *IllegalConfigurationException* thrown from *TaskExecutorProcessUtils*, it usually indicates +that there is either an invalid configuration value (e.g., negative memory size, fraction that is greater than 1, etc.) +or configuration conflicts. Check the documentation chapters related to the [memory components](mem_setup.html#detailed-memory-model) +mentioned in the exception message. + +## OutOfMemoryError: Java heap space + +The exception usually indicates that JVM heap is not configured with enough size. You can try to increase the JVM heap size +by configuring larger [total memory](mem_setup.html#configure-total-memory) or [task heap memory](mem_setup.html#task-operator-heap-memory). + +Note: You can also increase [framework heap memory](mem_setup.html#framework-memory) but this option +is advanced and recommended to be changed if you are sure that the Flink framework itself needs more memory. + +## OutOfMemoryError: Direct buffer memory + +The exception usually indicates that the JVM *direct memory* limit is too small if there is no *direct memory leak*. +You can try to increase this limit by adjusting [direct off-heap memory](mem_setup.html#detailed-memory-model). +See also [how to configure off-heap memory](mem_setup.html#configure-off-heap-memory-direct-or-native) and +[JVM arguments](mem_setup.html#jvm-parameters) which Flink sets. + +## OutOfMemoryError: Metaspace + +The exception usually indicates that [JVM metaspace](mem_setup.html#jvm-parameters) limit is configured too small. 
+You can try to increase the [JVM metaspace](../config.html#taskmanager-memory-jvm-metaspace-size). Review comment: ```suggestion You can try to [increase the JVM metaspace](../config.html#taskmanager-memory-jvm-metaspace-size). ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
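For concreteness, increasing the metaspace limit in `flink-conf.yaml` could look like the sketch below; the option name matches the linked configuration reference, but the value is only an illustrative assumption, not a recommendation:

```yaml
# flink-conf.yaml
# Raise the JVM metaspace limit of each TaskManager process.
# 256m is an example value; size it based on the observed metaspace usage.
taskmanager.memory.jvm-metaspace.size: 256m
```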
[GitHub] [flink] xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide
xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide URL: https://github.com/apache/flink/pull/10999#discussion_r374478914 ## File path: docs/ops/memory/mem_migration.md ## @@ -0,0 +1,184 @@ +--- +title: "Migration from old configuration (before 1.10 release)" +nav-parent_id: ops_mem +nav-pos: 4 +--- + + +The [memory setup of task managers](mem_setup.html) has changed a lot with the 1.10 release. Many configuration options +were removed or their semantics changed. This guide will help you understand how to migrate the previous memory configuration to the new one. + +* toc +{:toc} + + + Warning: It is important to review this guide because the legacy and the new memory configuration can + result in different sizes of memory components. If you try to reuse your Flink configuration from versions + before 1.10, it can result in changes to the behavior or performance of your application, or even in configuration failures. + + +Note: The previous memory configuration allowed no memory-related options to be set at all, +as they all have default values. The [new memory configuration](mem_setup.html#configure-total-memory) requires +that at least one subset of the following options is configured explicitly, otherwise the configuration will fail: +* [taskmanager.memory.flink.size](../config.html#taskmanager-memory-flink-size-1) +* [taskmanager.memory.process.size](../config.html#taskmanager-memory-process-size-1) +* [taskmanager.memory.task.heap.size](../config.html#taskmanager-memory-task-heap-size) and [taskmanager.memory.managed.size](../config.html#taskmanager-memory-managed-size) + +The [default ‘flink-conf.yaml’](#default-configuration-in-flink-confyaml) shipped with Flink sets [taskmanager.memory.process.size](../config.html#taskmanager-memory-process-size-1) +to make the default memory configuration consistent. 
+ +This [spreadsheet](https://docs.google.com/spreadsheets/d/1mJaMkMPfDJJ-w6nMXALYmTc4XxiV30P5U7DzgwLkSoE) can also help +to evaluate and compare the results of the legacy and the new memory computations. + +## Changes in Configuration Options + +This chapter briefly lists all changes to the memory configuration options that existed before the 1.10 release. +It also references other chapters for more details about the migration to the new configuration options. + +The following options are completely removed. If they are still used, they will simply be ignored. + +| **Removed option** | **Note** | +| :- | :-- | +| taskmanager.memory.fraction| Check the description of the new option [taskmanager.memory.managed.fraction](../config.html#taskmanager-memory-managed-fraction). The new option has different semantics and the value of the deprecated option usually has to be adjusted. See also [how to migrate managed memory](#managed-memory). | +| taskmanager.memory.off-heap| on-heap managed memory is no longer supported, see also [how to migrate managed memory](#managed-memory) | +| taskmanager.memory.preallocate | pre-allocation is no longer supported and managed memory is always allocated lazily, see also [how to migrate managed memory](#managed-memory) | +The following options are deprecated but if they are still used they will be interpreted as new options for backwards compatibility: + +| **Deprecated option** | **Interpreted as** | +| :-- | :
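As a sketch of what the migration of the removed fraction option might look like in `flink-conf.yaml` (both values below are illustrative assumptions; per the table above, the new option has different semantics, so the value usually has to be re-derived rather than copied):

```yaml
# flink-conf.yaml -- before the 1.10 release (legacy option, now removed):
# taskmanager.memory.fraction: 0.7

# flink-conf.yaml -- 1.10 and later; the fraction now applies to
# total Flink memory with different semantics, so re-derive the value:
taskmanager.memory.managed.fraction: 0.4
```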
[GitHub] [flink] xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide
xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide URL: https://github.com/apache/flink/pull/10999#discussion_r374465989 ## File path: docs/ops/memory/mem_setup.md ## @@ -0,0 +1,251 @@ +--- +title: "Setup Task Executor Memory" +nav-parent_id: ops_mem +nav-pos: 1 +--- + + +Apache Flink provides efficient workloads on top of the JVM by tightly controlling the memory usage of its various components. +While the community strives to offer sensible defaults to all configurations, the full breadth of applications +that users deploy on Flink means this isn't always possible. To provide the most production value to our users, +Flink allows both high level and fine-grained tuning of memory allocation within clusters. + +* toc +{:toc} + +The memory configuration described here is applicable starting with release *1.10*. If you upgrade Flink +from earlier versions, check the [migration guide](mem_migration.html) because many changes were introduced with the 1.10 release. + +Note: This memory setup guide is relevant only for task executors! +Check [job manager related configuration options](../config.html#jobmanager) for the memory setup of the job manager. + +## Configure Total Memory + +The *total process memory* of Flink JVM processes consists of memory consumed by the Flink application (*total Flink memory*) +and by the JVM to run the process. The *total Flink memory* consumption includes usage of JVM heap, +*managed memory* (managed by Flink) and other direct (or native) memory. + + + + + + +If you run Flink locally (e.g. from your IDE) without creating a cluster, then only a subset of the memory configuration +options is relevant, see also [local execution](#local-execution) for more details. 
+ +Otherwise, the simplest way to set up memory in Flink is to configure either of the following two options: +* Total Flink memory ([taskmanager.memory.flink.size](../config.html#taskmanager-memory-flink-size-1)) +* Total process memory ([taskmanager.memory.process.size](../config.html#taskmanager-memory-process-size-1)) + +The rest of the memory components will be adjusted automatically, based on default values or additionally configured options. +[Here](#detailed-memory-model) are more details about the other memory components. + +Configuring *total Flink memory* is better suited for standalone deployments where you want to declare how much memory +is given to Flink itself. The *total Flink memory* splits up into [managed memory size](#managed-memory) and JVM heap. + +If you configure *total process memory* you declare how much memory in total should be assigned to the Flink *JVM process*. +For containerized deployments, it corresponds to the size of the requested container, see also +[how to configure memory for containers](mem_tuning.html#configure-memory-for-containers) +([Kubernetes](../deployment/kubernetes.html), [Yarn](../deployment/yarn_setup.html) or [Mesos](../deployment/mesos.html)). + +Another way to set up the memory is to set [task heap](#task-operator-heap-memory) and [managed memory](#managed-memory) +([taskmanager.memory.task.heap.size](../config.html#taskmanager-memory-task-heap-size) and [taskmanager.memory.managed.size](../config.html#taskmanager-memory-managed-size)). +This more fine-grained approach is described in more detail [here](#configure-heap-and-managed-memory). + +Note: One of the three mentioned ways has to be used to configure Flink’s memory (except for local execution), or the Flink startup will fail. 
+This means that one of the following option subsets, which do not have default values, has to be configured explicitly in *flink-conf.yaml*: +* [taskmanager.memory.flink.size](../config.html#taskmanager-memory-flink-size-1) +* [taskmanager.memory.process.size](../config.html#taskmanager-memory-process-size-1) +* [taskmanager.memory.task.heap.size](../config.html#taskmanager-memory-task-heap-size) and [taskmanager.memory.managed.size](../config.html#taskmanager-memory-managed-size) + +Note: Explicitly configuring both *total process memory* and *total Flink memory* is not recommended. +It may lead to deployment failures due to potential memory configuration conflicts. Additional configuration +of other memory components also requires caution as it can produce further configuration conflicts. +A conflict can occur, e.g., if the sum of sub-components does not add up to the total configured memory or if the size of some +component is outside of its min/max range, see also [the detailed memory model](#detailed-memory-model). + +## Configure Heap and Managed Memory + +As mentioned before in the [total memory description](#configure-total-memory), another way to set up memory in Flink is +to explicitly specify both [task heap](#task-operator-heap-memory) and [managed memory](#managed-memory). +It gives more control over the available JVM heap to Flink’s
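The three configuration alternatives quoted above can be sketched as mutually exclusive `flink-conf.yaml` fragments; the option names come from the linked configuration reference, while all sizes are illustrative examples only:

```yaml
# flink-conf.yaml -- configure exactly one of the following subsets.

# Alternative 1: total Flink memory (suits standalone deployments)
# taskmanager.memory.flink.size: 1280m

# Alternative 2: total process memory (suits containerized deployments)
taskmanager.memory.process.size: 1728m

# Alternative 3: fine-grained task heap plus managed memory
# taskmanager.memory.task.heap.size: 512m
# taskmanager.memory.managed.size: 512m
```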
xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide URL: https://github.com/apache/flink/pull/10999#discussion_r374462876 ## File path: docs/ops/memory/mem_setup.md
xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide URL: https://github.com/apache/flink/pull/10999#discussion_r374463537 ## File path: docs/ops/memory/mem_setup.md
[GitHub] [flink] xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide
xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide URL: https://github.com/apache/flink/pull/10999#discussion_r374474077 ## File path: docs/ops/memory/mem_trouble.md ## +## IOException: Insufficient number of network buffers + +The exception usually indicates that the size of the configured [network memory](mem_setup.html#detailed-memory-model) +is not big enough. + Review comment: Do we also want to say "try increasing network memory" and link to corresponding options here?
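In the spirit of that suggestion, enlarging the network memory component could be sketched in `flink-conf.yaml` as follows; the option names follow the 1.10 configuration reference, and the values are illustrative assumptions:

```yaml
# flink-conf.yaml
# Network memory is derived as a fraction of total Flink memory,
# clamped to the [min, max] range; raising these enlarges the buffer pool.
taskmanager.memory.network.fraction: 0.15
taskmanager.memory.network.min: 128mb
taskmanager.memory.network.max: 1gb
```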
[GitHub] [flink] xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide
xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide URL: https://github.com/apache/flink/pull/10999#discussion_r37629 ## File path: docs/ops/memory/mem_setup.md ## +Note: This memory setup guide is relevant only for task executors! +Check [job manager related configuration options](../config.html#jobmanager) for the memory setup of the job manager. Review comment: It seems the linked page does not have the anchor `#jobmanager`. This link jumps to the beginning of the configuration page.
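For reference, the job manager side in 1.10 is still configured through the legacy heap option rather than the new FLIP-49 model; the fragment below is a sketch with an illustrative value:

```yaml
# flink-conf.yaml
# JobManager memory (pre-FLIP-49 style); 1024m is an example value.
jobmanager.heap.size: 1024m
```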
xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide URL: https://github.com/apache/flink/pull/10999#discussion_r374478983 ## File path: docs/ops/memory/mem_migration.md
xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide URL: https://github.com/apache/flink/pull/10999#discussion_r374454712 ## File path: docs/ops/memory/mem_setup.md
[GitHub] [flink] xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide
xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide
URL: https://github.com/apache/flink/pull/10999#discussion_r374456543

## File path: docs/ops/memory/mem_setup.md

## @@ -0,0 +1,251 @@
+---
+title: "Setup Task Executor Memory"
+nav-parent_id: ops_mem
+nav-pos: 1
+---
+
+Apache Flink runs workloads efficiently on top of the JVM by tightly controlling the memory usage of its various components.
+While the community strives to offer sensible defaults for all configurations, the full breadth of applications
+that users deploy on Flink means this isn't always possible. To provide the most production value to our users,
+Flink allows both high-level and fine-grained tuning of memory allocation within clusters.
+
+* toc
+{:toc}
+
+The memory configuration described here applies starting with release *1.10*. If you upgrade Flink
+from an earlier version, check the [migration guide](mem_migration.html) because many changes were introduced with the 1.10 release.
+
+Note: This memory setup guide is relevant only for task executors!
+Check the [job manager related configuration options](../config.html#jobmanager) for the memory setup of the job manager.
+
+## Configure Total Memory
+
+The *total process memory* of a Flink JVM process consists of the memory consumed by the Flink application (*total Flink memory*)
+and the memory used by the JVM to run the process. The *total Flink memory* consumption includes usage of the JVM heap,
+*managed memory* (managed by Flink) and other direct (or native) memory.
+
+If you run Flink locally (e.g. from your IDE) without creating a cluster, then only a subset of the memory configuration
+options are relevant, see also [local execution](#local-execution) for more details.
+
+Otherwise, the simplest way to set up memory in Flink is to configure either of the following two options:
+* Total Flink memory ([taskmanager.memory.flink.size](../config.html#taskmanager-memory-flink-size-1))
+* Total process memory ([taskmanager.memory.process.size](../config.html#taskmanager-memory-process-size-1))
+
+The rest of the memory components will be adjusted automatically, based on default values or additionally configured options.
+See [the detailed memory model](#detailed-memory-model) for more details about the other memory components.
+
+Configuring *total Flink memory* is better suited for standalone deployments where you want to declare how much memory
+is given to Flink itself. The *total Flink memory* splits up into [managed memory](#managed-memory) and the JVM heap.
+
+If you configure *total process memory*, you declare how much memory in total should be assigned to the Flink *JVM process*.
+For containerized deployments, it corresponds to the size of the requested container, see also
+[how to configure memory for containers](mem_tuning.html#configure-memory-for-containers)
+([Kubernetes](../deployment/kubernetes.html), [Yarn](../deployment/yarn_setup.html) or [Mesos](../deployment/mesos.html)).
+
+Another way to set up the memory is to set [task heap](#task-operator-heap-memory) and [managed memory](#managed-memory)
+([taskmanager.memory.task.heap.size](../config.html#taskmanager-memory-task-heap-size) and [taskmanager.memory.managed.size](../config.html#taskmanager-memory-managed-size)).
+This more fine-grained approach is described in more detail [here](#configure-heap-and-managed-memory).
+
+Note: One of the three mentioned ways has to be used to configure Flink’s memory (except for local execution), or the Flink startup will fail.
+This means that one of the following option subsets, which do not have default values, has to be configured explicitly in *flink-conf.yaml*:
+* [taskmanager.memory.flink.size](../config.html#taskmanager-memory-flink-size-1)
+* [taskmanager.memory.process.size](../config.html#taskmanager-memory-process-size-1)
+* [taskmanager.memory.task.heap.size](../config.html#taskmanager-memory-task-heap-size) and [taskmanager.memory.managed.size](../config.html#taskmanager-memory-managed-size)
+
+Note: Explicitly configuring both *total process memory* and *total Flink memory* is not recommended.
+It may lead to deployment failures due to potential memory configuration conflicts. Additional configuration
+of other memory components also requires caution as it can produce further configuration conflicts.
+A conflict can occur, e.g., if the sum of the sub-components does not add up to the total configured memory, or if the size of some
+component is outside of its min/max range, see also [the detailed memory model](#detailed-memory-model).
+
+## Configure Heap and Managed Memory
+
+As mentioned before in the [total memory description](#configure-total-memory), another way to set up memory in Flink is
+to specify explicitly both [task heap](#task-operator-heap-memory) and [managed memory](#managed-memory).
+It gives more control over the available JVM heap to Flink’s
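For illustration, one of these option subsets could be set in *flink-conf.yaml* roughly as sketched below. The sizes are arbitrary example values, not recommendations, and only one subset should be left uncommented, since combining them can lead to the configuration conflicts described in the quoted docs:

```yaml
# Subset 1: declare the total Flink memory (suited for standalone deployments);
# example value only, adjust to your machine.
taskmanager.memory.flink.size: 1280m

# Subset 2 (alternative): declare the total process memory (suited for containers).
# taskmanager.memory.process.size: 1728m

# Subset 3 (alternative): fine-grained task heap plus managed memory.
# taskmanager.memory.task.heap.size: 512m
# taskmanager.memory.managed.size: 512m
```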
[GitHub] [flink] xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide
xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide
URL: https://github.com/apache/flink/pull/10999#discussion_r374480919

## File path: docs/ops/memory/mem_migration.md

## @@ -0,0 +1,184 @@
+---
+title: "Migration from old configuration (before 1.10 release)"
+nav-parent_id: ops_mem
+nav-pos: 4
+---
+
+The [memory setup of task managers](mem_setup.html) has changed a lot with the 1.10 release. Many configuration options
+were removed or their semantics changed. This guide will help you understand how to migrate the previous memory configuration to the new one.
+
+* toc
+{:toc}
+
+Warning: It is important to review this guide because the legacy and new memory configuration can
+result in different sizes of memory components. If you try to reuse your Flink configuration from versions
+before 1.10, it can result in changes to the behavior, performance or even configuration failures of your application.
+
+Note: The previous memory configuration allowed all memory-related options to be left unset,
+as they all have default values. The [new memory configuration](mem_setup.html#configure-total-memory) requires
+that at least one subset of the following options is configured explicitly, otherwise the configuration will fail:
+* [taskmanager.memory.flink.size](../config.html#taskmanager-memory-flink-size-1)
+* [taskmanager.memory.process.size](../config.html#taskmanager-memory-process-size-1)
+* [taskmanager.memory.task.heap.size](../config.html#taskmanager-memory-task-heap-size) and [taskmanager.memory.managed.size](../config.html#taskmanager-memory-managed-size)
+
+The [default ‘flink-conf.yaml’](#default-configuration-in-flink-confyaml) shipped with Flink sets [taskmanager.memory.process.size](../config.html#taskmanager-memory-process-size-1)
+to make the default memory configuration consistent.
+
+This [spreadsheet](https://docs.google.com/spreadsheets/d/1mJaMkMPfDJJ-w6nMXALYmTc4XxiV30P5U7DzgwLkSoE) can also help
+to evaluate and compare the results of the legacy and new memory computations.
+
+## Changes in Configuration Options
+
+This chapter briefly lists all changes to the memory configuration options introduced with the 1.10 release.
+It also references other chapters for more details about migrating to the new configuration options.
+
+The following options are completely removed. If they are still used, they will simply be ignored.
+
+| **Removed option** | **Note** |
+| :- | :-- |
+| taskmanager.memory.fraction | Check the description of the new option [taskmanager.memory.managed.fraction](../config.html#taskmanager-memory-managed-fraction). The new option has different semantics and the value of the deprecated option usually has to be adjusted. See also [how to migrate managed memory](#managed-memory). |
+| taskmanager.memory.off-heap | On-heap managed memory is no longer supported, see also [how to migrate managed memory](#managed-memory) |
+| taskmanager.memory.preallocate | Pre-allocation is no longer supported and managed memory is always allocated lazily, see also [how to migrate managed memory](#managed-memory) |
+
+The following options are deprecated but if they are still used they will be interpreted as new options for backwards compatibility:
+
+| **Deprecated option** | **Interpreted as** |
+| :-- | :
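As a hedged sketch of what such a migration can look like (the legacy option name `taskmanager.heap.size` and its mapping are assumptions based on the 1.10 deprecation behavior, not taken from the quoted excerpt):

```yaml
# Legacy configuration (pre-1.10), assumed example:
# taskmanager.heap.size: 1024m

# Explicit equivalent with the 1.10 options, assuming a standalone setup
# (in containerized setups the legacy value is assumed to map to total
# process memory instead):
taskmanager.memory.flink.size: 1024m
```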
[GitHub] [flink] xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide
xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide
URL: https://github.com/apache/flink/pull/10999#discussion_r374463869

## File path: docs/ops/memory/mem_setup.md
[GitHub] [flink] xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide
xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide
URL: https://github.com/apache/flink/pull/10999#discussion_r374476080

## File path: docs/ops/memory/mem_migration.md
[GitHub] [flink] xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide
xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide
URL: https://github.com/apache/flink/pull/10999#discussion_r374464483

## File path: docs/ops/memory/mem_setup.md
[GitHub] [flink] xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide
xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide
URL: https://github.com/apache/flink/pull/10999#discussion_r374471963

## File path: docs/ops/memory/mem_tuning.md

## @@ -0,0 +1,86 @@
+---
+title: "Memory tuning guide"
+nav-parent_id: ops_mem
+nav-pos: 2
+---
+
+In addition to the [main memory setup guide](mem_setup.html), this section explains how to set up the memory of task executors
+depending on the use case and which options are important in which case.
+
+* toc
+{:toc}
+
+## Configure memory for standalone deployment
+
+It is recommended to configure [total Flink memory](mem_setup.html#configure-total-memory)
+([taskmanager.memory.flink.size](../config.html#taskmanager-memory-flink-size-1)) or its [components](mem_setup.html#detailed-memory-model)
+for a [standalone deployment](../deployment/cluster_setup.html) where you want to declare how much memory is given to Flink itself.
+Additionally, you can adjust the *JVM metaspace* if it causes [problems](mem_trouble.html#outofmemoryerror-metaspace).
+
+The *total process memory* is not relevant because the *JVM overhead* is not controlled by Flink or the deployment environment;
+only the physical resources of the executing machine matter in this case.
+
+## Configure memory for containers
+
+It is recommended to configure [total process memory](mem_setup.html#configure-total-memory)
+([taskmanager.memory.process.size](../config.html#taskmanager-memory-process-size-1)) for the containerized deployments

Review comment:
```suggestion
([taskmanager.memory.process.size](../config.html#taskmanager-memory-process-size)) for the containerized deployments
```

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards,
Apache Git Services
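For containerized deployments, the recommendation quoted above amounts to a single line in *flink-conf.yaml*, sketched here with an arbitrary example size matching a hypothetical container request:

```yaml
# Total process memory should match the memory requested for the container,
# e.g. a hypothetical 4 GiB Kubernetes/Yarn container (example value only):
taskmanager.memory.process.size: 4096m
```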
[GitHub] [flink] xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide
xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide
URL: https://github.com/apache/flink/pull/10999#discussion_r374473870

## File path: docs/ops/memory/mem_trouble.md

## @@ -0,0 +1,70 @@
+---
+title: "Troubleshooting"
+nav-parent_id: ops_mem
+nav-pos: 3
+---
+
+* toc
+{:toc}
+
+## IllegalConfigurationException

+If you see an *IllegalConfigurationException* thrown from *TaskExecutorProcessUtils*, it usually indicates
+that there is either an invalid configuration value (e.g., a negative memory size, a fraction greater than 1, etc.)
+or a configuration conflict. Check the documentation chapters related to the [memory components](mem_setup.html#detailed-memory-model)
+mentioned in the exception message.
+
+## OutOfMemoryError: Java heap space
+
+The exception usually indicates that the JVM heap is configured too small. You can try to increase the JVM heap size
+by configuring a larger [total memory](mem_setup.html#configure-total-memory) or [task heap memory](mem_setup.html#task-operator-heap-memory).
+
+Note: You can also increase the [framework heap memory](mem_setup.html#framework-memory), but this is an advanced option
+that should only be changed if you are sure that the Flink framework itself needs more memory.
+
+## OutOfMemoryError: Direct buffer memory
+
+If there is no *direct memory leak*, the exception usually indicates that the JVM *direct memory* limit is too small.
+You can try to increase this limit by adjusting the [direct off-heap memory](mem_setup.html#detailed-memory-model).
+See also [how to configure off-heap memory](mem_setup.html#configure-off-heap-memory-direct-or-native) and
+the [JVM arguments](mem_setup.html#jvm-parameters) which Flink sets.
+
+## OutOfMemoryError: Metaspace
+
+The exception usually indicates that the [JVM metaspace](mem_setup.html#jvm-parameters) limit is configured too small.
+You can try to increase the [JVM metaspace size](../config.html#taskmanager-memory-jvm-metaspace-size).

Review comment:
This paragraph has two "JVM metaspace" links pointing to different pages. As a user, I would think these two links are the same thing and would not click on the second.
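A minimal sketch of the metaspace adjustment recommended in the quoted excerpt, as it could appear in *flink-conf.yaml* (the 512 MiB value is an arbitrary example, not a recommendation):

```yaml
# Raise the JVM metaspace limit if jobs load many classes
# (example value only; tune for your workload):
taskmanager.memory.jvm-metaspace.size: 512m
```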
[GitHub] [flink] xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide
xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide
URL: https://github.com/apache/flink/pull/10999#discussion_r374463771

## File path: docs/ops/memory/mem_setup.md
[GitHub] [flink] xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide
xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide
URL: https://github.com/apache/flink/pull/10999#discussion_r374471075

## File path: docs/ops/memory/mem_setup.md

@@ -0,0 +1,251 @@
+---
+title: "Setup Task Executor Memory"
+nav-parent_id: ops_mem
+nav-pos: 1
+---
+
+Apache Flink provides efficient workloads on top of the JVM by tightly controlling the memory usage of its various components.
+While the community strives to offer sensible defaults for all configurations, the full breadth of applications
+that users deploy on Flink means this isn't always possible. To provide the most production value to our users,
+Flink allows both high-level and fine-grained tuning of memory allocation within clusters.
+
+* toc
+{:toc}
+
+The memory configuration described below applies starting with release *1.10*. If you upgrade Flink
+from an earlier version, check the [migration guide](mem_migration.html) because many changes were introduced with the 1.10 release.
+
+Note: This memory setup guide is relevant only for task executors!
+Check the [job manager related configuration options](../config.html#jobmanager) for the memory setup of the job manager.
+
+## Configure Total Memory
+
+The *total process memory* of Flink JVM processes consists of the memory consumed by the Flink application (*total Flink memory*)
+and the memory used by the JVM to run the process. The *total Flink memory* consumption includes usage of the JVM heap,
+*managed memory* (managed by Flink) and other direct (or native) memory.
+
+If you run Flink locally (e.g. from your IDE) without creating a cluster, then only a subset of the memory configuration
+options are relevant, see also [local execution](#local-execution) for more details.
+
+Otherwise, the simplest way to set up memory in Flink is to configure either of the following two options:
+* Total Flink memory ([taskmanager.memory.flink.size](../config.html#taskmanager-memory-flink-size-1))
+* Total process memory ([taskmanager.memory.process.size](../config.html#taskmanager-memory-process-size-1))
+
+The rest of the memory components will be adjusted automatically, based on default values or additionally configured options.
+See [the detailed memory model](#detailed-memory-model) for more details about the other memory components.
+
+Configuring *total Flink memory* is better suited for standalone deployments where you want to declare how much memory
+is given to Flink itself. The *total Flink memory* splits up into [managed memory](#managed-memory) and the JVM heap.
+
+If you configure *total process memory*, you declare how much memory in total should be assigned to the Flink *JVM process*.
+For containerized deployments it corresponds to the size of the requested container, see also
+[how to configure memory for containers](mem_tuning.html#configure-memory-for-containers)
+([Kubernetes](../deployment/kubernetes.html), [Yarn](../deployment/yarn_setup.html) or [Mesos](../deployment/mesos.html)).
+
+Another way to set up the memory is to configure [task heap](#task-operator-heap-memory) and [managed memory](#managed-memory) explicitly
+([taskmanager.memory.task.heap.size](../config.html#taskmanager-memory-task-heap-size) and [taskmanager.memory.managed.size](../config.html#taskmanager-memory-managed-size)).
+This more fine-grained approach is described in more detail [below](#configure-heap-and-managed-memory).
+
+Note: One of the three mentioned ways has to be used to configure Flink’s memory (except for local execution), or the Flink startup will fail.
+This means that one of the following option subsets, none of which has a default value, has to be configured explicitly in *flink-conf.yaml*:
+* [taskmanager.memory.flink.size](../config.html#taskmanager-memory-flink-size-1)
+* [taskmanager.memory.process.size](../config.html#taskmanager-memory-process-size-1)
+* [taskmanager.memory.task.heap.size](../config.html#taskmanager-memory-task-heap-size) and [taskmanager.memory.managed.size](../config.html#taskmanager-memory-managed-size)
+
+Note: Explicitly configuring both *total process memory* and *total Flink memory* is not recommended.
+It may lead to deployment failures due to potential memory configuration conflicts. Additional configuration
+of other memory components also requires caution, as it can produce further configuration conflicts.
+A conflict can occur, e.g., if the sum of the sub-components does not add up to the configured total memory, or if the size of some
+component is outside its min/max range, see also [the detailed memory model](#detailed-memory-model).
+
+## Configure Heap and Managed Memory
+
+As mentioned before in the [total memory description](#configure-total-memory), another way to set up memory in Flink is
+to specify both [task heap](#task-operator-heap-memory) and [managed memory](#managed-memory) explicitly.
+It gives more control over the available JVM heap to Flink’s
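The three configuration alternatives quoted above can be illustrated with a short *flink-conf.yaml* sketch. The option keys are taken from the document itself; the sizes are purely illustrative assumptions, not recommendations:

```yaml
# Alternative 2: declare the total memory of the Flink JVM process.
# In containerized deployments this should match the requested container size.
taskmanager.memory.process.size: 4096m

# Alternative 1 (do not combine with the option above):
# taskmanager.memory.flink.size: 3584m

# Alternative 3, the fine-grained pair (again, do not combine with the above):
# taskmanager.memory.task.heap.size: 2048m
# taskmanager.memory.managed.size: 1024m
```

Exactly one of the three alternatives should be active; the commented-out lines show where the other two would go.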
[GitHub] [flink] xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide
xintongsong commented on a change in pull request #10999: [FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide
URL: https://github.com/apache/flink/pull/10999#discussion_r374474897

## File path: docs/ops/memory/mem_migration.md

@@ -0,0 +1,184 @@
+---
+title: "Migration from old configuration (before 1.10 release)"
+nav-parent_id: ops_mem
+nav-pos: 4
+---
+
+The [memory setup of task managers](mem_setup.html) has changed a lot with the 1.10 release. Many configuration options
+were removed or their semantics changed. This guide will help you understand how to migrate a previous memory configuration to the new one.
+
+* toc
+{:toc}
+
+Warning: It is important to review this guide because the legacy and new memory configuration can
+result in different sizes of memory components. If you try to reuse your Flink configuration from versions
+before 1.10, it can result in changes to the behavior or performance of your application, or even configuration failures.
+
+Note: The previous memory configuration allowed leaving all memory-related options unset,
+as they all had default values. The [new memory configuration](mem_setup.html#configure-total-memory) requires
+that at least one subset of the following options is configured explicitly, otherwise the configuration will fail:
+* [taskmanager.memory.flink.size](../config.html#taskmanager-memory-flink-size-1)
+* [taskmanager.memory.process.size](../config.html#taskmanager-memory-process-size-1)

Review comment:
```suggestion
* [taskmanager.memory.process.size](../config.html#taskmanager-memory-process-size)
```

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services
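The migration requirement quoted above can be sketched as a *flink-conf.yaml* before/after pair. This assumes the legacy option name `taskmanager.heap.size` from the pre-1.10 configuration; the size value is illustrative only:

```yaml
# Before 1.10 (legacy option, assumed name; remove after migrating):
# taskmanager.heap.size: 1024m

# From 1.10 on, at least one of the required option subsets must be set
# explicitly, e.g. the total Flink memory:
taskmanager.memory.flink.size: 1024m
```

Note that, as the warning in the quoted guide says, simply carrying the old value over does not guarantee identical memory component sizes; verify the resulting setup against the detailed memory model.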