[GitHub] [flink] flinkbot edited a comment on issue #9905: [FLINK-14396][network] Implement rudimentary non-blocking network output
flinkbot edited a comment on issue #9905: [FLINK-14396][network] Implement rudimentary non-blocking network output URL: https://github.com/apache/flink/pull/9905#issuecomment-542185017 ## CI report: * 8b3baa21776bbfa97a36413136fba365ea960703 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/131961879) * 2caa4c446127547d127433d5614b6994cd095330 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132760396) * 510f2cc5b515d80e0e316e68adb6fa4960303536 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/132777306) * 79505f7abd7e5c4319fa5328d0aee07360a175d7 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132934966) * 2f3d7b85d84acf575589feefdb2ee9375ba1541e : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/132941969) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Created] (FLINK-14495) Add documentation for memory control of RocksDB state backend with writebuffer manager
Yun Tang created FLINK-14495: Summary: Add documentation for memory control of RocksDB state backend with writebuffer manager Key: FLINK-14495 URL: https://issues.apache.org/jira/browse/FLINK-14495 Project: Flink Issue Type: Sub-task Reporter: Yun Tang Fix For: 1.10.0 Add user documentation of how to use this feature of write buffer manager to control the memory usage of RocksDB state backend. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-14494) Update documentation generator
Dawid Wysakowicz created FLINK-14494: Summary: Update documentation generator Key: FLINK-14494 URL: https://issues.apache.org/jira/browse/FLINK-14494 Project: Flink Issue Type: Sub-task Components: Documentation, Runtime / Configuration Reporter: Dawid Wysakowicz Assignee: Dawid Wysakowicz Fix For: 1.10.0 Update documentation generator to include the type information
[jira] [Assigned] (FLINK-14494) Update documentation generator
[ https://issues.apache.org/jira/browse/FLINK-14494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Wysakowicz reassigned FLINK-14494: Assignee: (was: Dawid Wysakowicz) > Update documentation generator > -- > > Key: FLINK-14494 > URL: https://issues.apache.org/jira/browse/FLINK-14494 > Project: Flink > Issue Type: Sub-task > Components: Documentation, Runtime / Configuration >Reporter: Dawid Wysakowicz >Priority: Major > Fix For: 1.10.0 > > > Update documentation generator to include the type information
[GitHub] [flink] ifndef-SleePy commented on issue #9956: [FLINK-13848][runtime] Enriches RpcEndpoint.MainThreadExecutor by supporting periodic scheduling
ifndef-SleePy commented on issue #9956: [FLINK-13848][runtime] Enriches RpcEndpoint.MainThreadExecutor by supporting periodic scheduling URL: https://github.com/apache/flink/pull/9956#issuecomment-544829995 @flinkbot attention @tillrohrmann , @pnowojski
[GitHub] [flink] danny0405 commented on a change in pull request #9952: [FLINK-14321][sql-parser] Support to parse watermark statement in SQL DDL
danny0405 commented on a change in pull request #9952: [FLINK-14321][sql-parser] Support to parse watermark statement in SQL DDL URL: https://github.com/apache/flink/pull/9952#discussion_r337351602 ## File path: flink-table/flink-sql-parser/src/main/java/org/apache/flink/sql/parser/ddl/SqlCreateTable.java ## @@ -310,9 +322,67 @@ private void printIndent(SqlWriter writer) { public List columnList = new ArrayList<>(); public SqlNodeList primaryKeyList = SqlNodeList.EMPTY; public List uniqueKeysList = new ArrayList<>(); + @Nullable public SqlWatermark watermark; } public String[] fullTableName() { return tableName.names.toArray(new String[0]); } + + // - + + private static final class ColumnValidator { + + private final Set allColumnNames = new HashSet<>(); + + /** +* Adds column name to the registered column set. This will add nested column names recursive. +* Nested column names are qualified using "." separator. +*/ + public void addColumn(SqlNode column) throws SqlValidateException { + String columnName; + if (column instanceof SqlTableColumn) { + SqlTableColumn tableColumn = (SqlTableColumn) column; + columnName = tableColumn.getName().getSimple(); + addNestedColumn(columnName, tableColumn.getType()); + } else if (column instanceof SqlBasicCall) { + SqlBasicCall tableColumn = (SqlBasicCall) column; + columnName = tableColumn.getOperands()[1].toString(); + } else { + throw new UnsupportedOperationException("Unsupported column:" + column); + } + + addColumnName(columnName, column.getParserPosition()); + } + + /** +* Returns true if the column name is existed in the registered column set. +* This supports qualified column name using "." separator. 
+*/ + public boolean contains(String columnName) { + return allColumnNames.contains(columnName); + } + + private void addNestedColumn(String columnName, SqlDataTypeSpec columnType) throws SqlValidateException { + SqlTypeNameSpec typeName = columnType.getTypeNameSpec(); + // validate composite type + if (typeName instanceof ExtendedSqlRowTypeNameSpec) { Review comment: Do we support nested collection types such as `ARRAY` or `MULTISET`?
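For readers following this review thread: the `ColumnValidator` diff above registers nested column names recursively, qualifying them with a "." separator. A minimal Python sketch of that expansion logic follows; the data shapes (`columns`, `type_of`) are illustrative inventions, not Flink's actual parser classes.

```python
# Sketch of the nested-column registration discussed above: nested row
# fields are flattened into dot-qualified names, and duplicates are
# rejected, mirroring what the ColumnValidator in the diff does.

def register_columns(columns, type_of):
    """Collect all column names, expanding nested row types recursively.

    columns: list of top-level column names.
    type_of: dict mapping a qualified name to its list of child field
             names; names absent from the dict are treated as scalars.
    """
    names = set()

    def add(qualified):
        if qualified in names:
            raise ValueError("Duplicate column: " + qualified)
        names.add(qualified)
        for child in type_of.get(qualified, []):
            add(qualified + "." + child)  # qualify with "." separator

    for col in columns:
        add(col)
    return names
```

For example, `register_columns(["id", "addr"], {"addr": ["city", "zip"]})` registers `id`, `addr`, `addr.city`, and `addr.zip`.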
[jira] [Created] (FLINK-14493) Introduce data types to ConfigOptions
Dawid Wysakowicz created FLINK-14493: Summary: Introduce data types to ConfigOptions Key: FLINK-14493 URL: https://issues.apache.org/jira/browse/FLINK-14493 Project: Flink Issue Type: Sub-task Components: Runtime / Configuration Reporter: Dawid Wysakowicz Assignee: Dawid Wysakowicz Fix For: 1.10.0 * Extend the ConfigOption class and its builder to enable richer type support * Update Configuration to support updated types
[GitHub] [flink] danny0405 commented on a change in pull request #9952: [FLINK-14321][sql-parser] Support to parse watermark statement in SQL DDL
danny0405 commented on a change in pull request #9952: [FLINK-14321][sql-parser] Support to parse watermark statement in SQL DDL URL: https://github.com/apache/flink/pull/9952#discussion_r337348164 ## File path: flink-table/flink-sql-parser/src/main/codegen/includes/parserImpls.ftl ## @@ -27,9 +27,30 @@ void TableColumn(TableCreationContext context) : UniqueKey(context.uniqueKeysList) | ComputedColumn(context) +| +Watermark(context) ) } +void Watermark(TableCreationContext context) : +{ +SqlIdentifier columnName; +SqlParserPos pos; +SqlNode watermarkStrategy; +} +{ + +columnName = CompoundIdentifier() {pos = getPos();} + +watermarkStrategy = Expression(ExprContext.ACCEPT_SUB_QUERY) { +if (context.watermark != null) { +throw new ParseException("Multiple WATERMARK statements is not supported yet."); +} else { Review comment: Just a note: the ParseException does not carry position info.
[GitHub] [flink] danny0405 commented on a change in pull request #9952: [FLINK-14321][sql-parser] Support to parse watermark statement in SQL DDL
danny0405 commented on a change in pull request #9952: [FLINK-14321][sql-parser] Support to parse watermark statement in SQL DDL URL: https://github.com/apache/flink/pull/9952#discussion_r337347908 ## File path: flink-table/flink-sql-parser/src/main/codegen/includes/parserImpls.ftl ## @@ -27,9 +27,30 @@ void TableColumn(TableCreationContext context) : UniqueKey(context.uniqueKeysList) | ComputedColumn(context) +| +Watermark(context) ) } +void Watermark(TableCreationContext context) : +{ +SqlIdentifier columnName; +SqlParserPos pos; +SqlNode watermarkStrategy; +} +{ + +columnName = CompoundIdentifier() {pos = getPos();} + +watermarkStrategy = Expression(ExprContext.ACCEPT_SUB_QUERY) { +if (context.watermark != null) { Review comment: Regarding `ExprContext.ACCEPT_SUB_QUERY`: do we really support sub-queries here? I don't think so.
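For context on the two review comments above: the grammar in the diff allows at most one WATERMARK clause per CREATE TABLE, raising a parse error on the second. A minimal Python sketch of that rule, with illustrative names rather than the actual generated-parser types:

```python
# Sketch of the single-WATERMARK rule enforced in the diff above: the
# parse context holds at most one watermark, and a second WATERMARK
# clause is rejected. The class and function names are hypothetical.

class TableCreationContext:
    def __init__(self):
        self.watermark = None  # at most one watermark per CREATE TABLE

def add_watermark(context, column_name, strategy_expr):
    """Record a WATERMARK clause, mirroring the diff's null check."""
    if context.watermark is not None:
        raise ValueError("Multiple WATERMARK statements is not supported yet.")
    context.watermark = (column_name, strategy_expr)
```

A second call such as `add_watermark(ctx, "ts2", ...)` on the same context would raise, just as the generated parser throws a ParseException.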
[jira] [Created] (FLINK-14492) Modify state backend to report reserved memory to MemoryManager
Yu Li created FLINK-14492: - Summary: Modify state backend to report reserved memory to MemoryManager Key: FLINK-14492 URL: https://issues.apache.org/jira/browse/FLINK-14492 Project: Flink Issue Type: Sub-task Components: Runtime / State Backends, Runtime / Task Reporter: Yu Li {{MemoryManager}} will add a new {{reserveMemory}} interface through FLINK-13984, and we should discuss whether the state backend should follow this approach for the initial version of memory control. The status and conclusion of the mailing-list discussion on this topic will be tracked by this JIRA.
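To make the reservation idea above concrete, here is a toy Python sketch of reservation-style memory accounting, where a consumer (such as a state backend) reports reserved bytes to a central manager. The API shape is a guess for illustration; FLINK-13984's actual interface may differ.

```python
# Toy sketch of the reserveMemory-style accounting discussed above: the
# manager tracks reserved bytes per consumer, so reservations from a
# state backend are visible centrally. Hypothetical API, not Flink's.

class MemoryManager:
    def __init__(self, total_bytes):
        self.total = total_bytes
        self.reserved = {}  # consumer name -> reserved bytes

    def reserve_memory(self, consumer, size):
        # Fail the reservation if the remaining budget is too small.
        if self.available() < size:
            raise MemoryError("cannot reserve %d bytes" % size)
        self.reserved[consumer] = self.reserved.get(consumer, 0) + size

    def release_memory(self, consumer, size):
        current = self.reserved.get(consumer, 0)
        self.reserved[consumer] = max(0, current - size)

    def available(self):
        return self.total - sum(self.reserved.values())
```

The key property is that over-reservation fails fast instead of silently exceeding the TaskExecutor budget.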
[jira] [Assigned] (FLINK-14479) Strange exceptions found in log file after executing `test_udf.py`
[ https://issues.apache.org/jira/browse/FLINK-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sunjincheng reassigned FLINK-14479: --- Assignee: Wei Zhong > Strange exceptions found in log file after executing `test_udf.py` > -- > > Key: FLINK-14479 > URL: https://issues.apache.org/jira/browse/FLINK-14479 > Project: Flink > Issue Type: Improvement > Components: API / Python >Reporter: sunjincheng >Assignee: Wei Zhong >Priority: Major > Fix For: 1.10.0 > > > There are several strange exceptions as follow in > `${flink_source}/build-target/log/flink-${username}-python-udf-boot-${machine_name}.local.log` > after executing > `${flink_source}/flink-python/pyflink/table/tests/test_udf.py`: > Traceback (most recent call last): > {code:java} > File > "/usr/local/Cellar/python/3.7.4/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", > line 193, in _run_module_as_main > "__main__", mod_spec) > File > "/usr/local/Cellar/python/3.7.4/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", > line 85, in _run_code > exec(code, run_globals) > File > "/Users/zhongwei/flink/flink-python/pyflink/fn_execution/sdk_worker_main.py", > line 30, in > apache_beam.runners.worker.sdk_worker_main.main(sys.argv) > File > "/Users/zhongwei/pyflink_env/py37/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker_main.py", > line 148, in main > sdk_pipeline_options.view_as(pipeline_options.ProfilingOptions)) > File > "/Users/zhongwei/pyflink_env/py37/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 133, in run > for work_request in control_stub.Control(get_responses()): > File > "/Users/zhongwei/pyflink_env/py37/lib/python3.7/site-packages/grpc/_channel.py", > line 364, in __next__ > return self._next() > File > "/Users/zhongwei/pyflink_env/py37/lib/python3.7/site-packages/grpc/_channel.py", > line 347, in _next > raise self > grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with: > status = 
StatusCode.CANCELLED > details = "Runner closed connection" > debug_error_string = > "{"created":"@1571660342.057172000","description":"Error received from peer > ipv6:[::1]:52699","file":"src/core/lib/surface/call.cc","file_line":1052,"grpc_message":"Runner > closed connection","grpc_status":1}"{code} > It appears randomly when executing test cases of blink planner. Although it > does not affect test results we need to find out why it appears. > Welcome any feedback!
[jira] [Assigned] (FLINK-14478) The python test on travis consumes too much time
[ https://issues.apache.org/jira/browse/FLINK-14478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sunjincheng reassigned FLINK-14478: --- Assignee: Wei Zhong > The python test on travis consumes too much time > > > Key: FLINK-14478 > URL: https://issues.apache.org/jira/browse/FLINK-14478 > Project: Flink > Issue Type: Improvement > Components: API / Python >Reporter: sunjincheng >Assignee: Wei Zhong >Priority: Major > > The python test on travis now takes twice as long as it was a few days ago. > Several days ago it only took about 15 minutes, more detail > can be found in [1] > But now it takes about half an hour, more detail can be found in [2] > It seems that a lot of time is spent on the newly added test cases in > `test_udf.py`. We need some optimization for these test cases to reduce test > time. > What do you think? > [1] [https://travis-ci.org/apache/flink/builds/599058891] > [2] [https://travis-ci.org/apache/flink/builds/600517667]
[jira] [Commented] (FLINK-14478) The python test on travis consumes too much time
[ https://issues.apache.org/jira/browse/FLINK-14478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956742#comment-16956742 ] sunjincheng commented on FLINK-14478: - Thanks [~zhongwei] for taking this ticket. > The python test on travis consumes too much time > > > Key: FLINK-14478 > URL: https://issues.apache.org/jira/browse/FLINK-14478 > Project: Flink > Issue Type: Improvement > Components: API / Python >Reporter: sunjincheng >Priority: Major > > The python test on travis now takes twice as long as it was a few days ago. > Several days ago it only took about 15 minutes, more detail > can be found in [1] > But now it takes about half an hour, more detail can be found in [2] > It seems that a lot of time is spent on the newly added test cases in > `test_udf.py`. We need some optimization for these test cases to reduce test > time. > What do you think? > [1] [https://travis-ci.org/apache/flink/builds/599058891] > [2] [https://travis-ci.org/apache/flink/builds/600517667]
[jira] [Commented] (FLINK-14479) Strange exceptions found in log file after executing `test_udf.py`
[ https://issues.apache.org/jira/browse/FLINK-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956743#comment-16956743 ] sunjincheng commented on FLINK-14479: - Thanks [~zhongwei] for taking this ticket. > Strange exceptions found in log file after executing `test_udf.py` > -- > > Key: FLINK-14479 > URL: https://issues.apache.org/jira/browse/FLINK-14479 > Project: Flink > Issue Type: Improvement > Components: API / Python >Reporter: sunjincheng >Priority: Major > Fix For: 1.10.0 > > > There are several strange exceptions as follow in > `${flink_source}/build-target/log/flink-${username}-python-udf-boot-${machine_name}.local.log` > after executing > `${flink_source}/flink-python/pyflink/table/tests/test_udf.py`: > Traceback (most recent call last): > {code:java} > File > "/usr/local/Cellar/python/3.7.4/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", > line 193, in _run_module_as_main > "__main__", mod_spec) > File > "/usr/local/Cellar/python/3.7.4/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", > line 85, in _run_code > exec(code, run_globals) > File > "/Users/zhongwei/flink/flink-python/pyflink/fn_execution/sdk_worker_main.py", > line 30, in > apache_beam.runners.worker.sdk_worker_main.main(sys.argv) > File > "/Users/zhongwei/pyflink_env/py37/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker_main.py", > line 148, in main > sdk_pipeline_options.view_as(pipeline_options.ProfilingOptions)) > File > "/Users/zhongwei/pyflink_env/py37/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 133, in run > for work_request in control_stub.Control(get_responses()): > File > "/Users/zhongwei/pyflink_env/py37/lib/python3.7/site-packages/grpc/_channel.py", > line 364, in __next__ > return self._next() > File > "/Users/zhongwei/pyflink_env/py37/lib/python3.7/site-packages/grpc/_channel.py", > line 347, in _next > raise self > grpc._channel._Rendezvous: 
<_Rendezvous of RPC that terminated with: > status = StatusCode.CANCELLED > details = "Runner closed connection" > debug_error_string = > "{"created":"@1571660342.057172000","description":"Error received from peer > ipv6:[::1]:52699","file":"src/core/lib/surface/call.cc","file_line":1052,"grpc_message":"Runner > closed connection","grpc_status":1}"{code} > It appears randomly when executing test cases of blink planner. Although it > does not affect test results we need to find out why it appears. > Welcome any feedback!
[GitHub] [flink] jinglining commented on issue #9601: [FLINK-13894][web]Web Ui add log url for subtask of vertex
jinglining commented on issue #9601: [FLINK-13894][web]Web Ui add log url for subtask of vertex URL: https://github.com/apache/flink/pull/9601#issuecomment-544829219 CI failed because of a FlinkKafkaProducerBase error; it can be ignored.
[jira] [Created] (FLINK-14491) Introduce ConfigOptions with Data Types
Dawid Wysakowicz created FLINK-14491: Summary: Introduce ConfigOptions with Data Types Key: FLINK-14491 URL: https://issues.apache.org/jira/browse/FLINK-14491 Project: Flink Issue Type: Improvement Components: Runtime / Configuration Reporter: Dawid Wysakowicz Assignee: Dawid Wysakowicz Fix For: 1.10.0 Implement FLIP 77: https://cwiki.apache.org/confluence/display/FLINK/FLIP-77%3A+Introduce+ConfigOptions+with+Data+Types
[GitHub] [flink] wangyang0918 commented on a change in pull request #9957: [FLINK-10932] Initialize flink-kubernetes module
wangyang0918 commented on a change in pull request #9957: [FLINK-10932] Initialize flink-kubernetes module URL: https://github.com/apache/flink/pull/9957#discussion_r337350721 ## File path: flink-kubernetes/pom.xml ## @@ -0,0 +1,222 @@ + + + +http://maven.apache.org/POM/4.0.0"; +xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"; +xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd";> + 4.0.0 + + org.apache.flink + flink-parent + 1.10-SNAPSHOT + .. + + + flink-kubernetes_${scala.binary.version} + flink-kubernetes + jar + + + 4.5.2 Review comment: The kubernetes client version is only used in the flink-kubernetes module, so I suggest keeping it here.
[jira] [Updated] (FLINK-14472) Implement back-pressure monitor with non-blocking outputs
[ https://issues.apache.org/jira/browse/FLINK-14472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhijiang updated FLINK-14472: - Description: Currently back-pressure monitor relies on detecting task threads that are stuck in `requestBufferBuilderBlocking`. There are actually two cases to cause back-pressure ATM: * There are no available buffers in `LocalBufferPool` and all the given quotas from global pool are also exhausted. Then we need to wait for buffer recycling to `LocalBufferPool`. * No available buffers in `LocalBufferPool`, but the quota has not been used up. While requesting buffer from global pool, it is blocked because of no available buffers in global pool. Then we need to wait for buffer recycling to global pool. We try to implement the non-blocking network output in FLINK-14396, so the back pressure monitor should be adjusted accordingly after the non-blocking output is used in practice. In detail we try to avoid the current monitor way by analyzing the task thread stack, which has some drawbacks discussed before: * If the `requestBuffer` is not triggered by task thread, the current monitor is invalid in practice. * The current monitor is heavy-weight and fragile because it needs to understand more details of LocalBufferPool implementation. We could provide a transparent method for the monitor caller to get the backpressure result directly, and hide the implementation details in the LocalBufferPool. was: Currently back-pressure monitor relies on detecting task threads that are stuck in `requestBufferBuilderBlocking`. There are actually two cases to cause back-pressure ATM: * There are no available buffers in `LocalBufferPool` and all the given quotas from global pool are also exhausted. Then we need to wait for buffer recycling to `LocalBufferPool`. * No available buffers in `LocalBufferPool`, but the quota has not been used up. 
While requesting buffer from global pool, it is blocked because of no available buffers in global pool. Then we need to wait for buffer recycling to global pool. We already implemented the non-blocking output for the first case in [FLINK-14396|https://issues.apache.org/jira/browse/FLINK-14396], and we expect the second case done together with adjusting the back-pressure monitor which could check for `RecordWriter#isAvailable` instead. > Implement back-pressure monitor with non-blocking outputs > - > > Key: FLINK-14472 > URL: https://issues.apache.org/jira/browse/FLINK-14472 > Project: Flink > Issue Type: Task > Components: Runtime / Network >Reporter: zhijiang >Assignee: Yingjie Cao >Priority: Minor > Fix For: 1.10.0 > > > Currently back-pressure monitor relies on detecting task threads that are > stuck in `requestBufferBuilderBlocking`. There are actually two cases to > cause back-pressure ATM: > * There are no available buffers in `LocalBufferPool` and all the given > quotas from global pool are also exhausted. Then we need to wait for buffer > recycling to `LocalBufferPool`. > * No available buffers in `LocalBufferPool`, but the quota has not been used > up. While requesting buffer from global pool, it is blocked because of no > available buffers in global pool. Then we need to wait for buffer recycling > to global pool. > We try to implement the non-blocking network output in FLINK-14396, so the > back pressure monitor should be adjusted accordingly after the non-blocking > output is used in practice. > In detail we try to avoid the current monitor way by analyzing the task > thread stack, which has some drawbacks discussed before: > * If the `requestBuffer` is not triggered by task thread, the current > monitor is invalid in practice. > * The current monitor is heavy-weight and fragile because it needs to > understand more details of LocalBufferPool implementation. 
> We could provide a transparent method for the monitor caller to get the > backpressure result directly, and hide the implementation details in the > LocalBufferPool.
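The FLINK-14472 description above proposes polling an availability flag on the buffer pool instead of analyzing task-thread stacks. A toy Python sketch of that idea, with a simplified pool that is not Flink's actual LocalBufferPool API:

```python
# Sketch of the availability-based back-pressure check proposed above:
# the pool exposes is_available(), so a monitor can read back-pressure
# directly instead of inspecting thread stacks. Toy model, not Flink.

class LocalBufferPool:
    def __init__(self, capacity):
        self.capacity = capacity
        self.in_use = 0

    def request_buffer(self):
        # Non-blocking: return False instead of blocking the task thread.
        if self.in_use < self.capacity:
            self.in_use += 1
            return True
        return False

    def recycle(self):
        self.in_use -= 1

    def is_available(self):
        # Transparent signal for the back-pressure monitor.
        return self.in_use < self.capacity

def is_back_pressured(pool):
    return not pool.is_available()
```

The monitor only depends on `is_available()`, keeping pool internals (local quota vs. global pool) hidden, which is exactly the decoupling the issue argues for.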
[jira] [Created] (FLINK-14490) Add methods for interacting with temporary objects
Dawid Wysakowicz created FLINK-14490: Summary: Add methods for interacting with temporary objects Key: FLINK-14490 URL: https://issues.apache.org/jira/browse/FLINK-14490 Project: Flink Issue Type: Sub-task Components: Table SQL / API, Table SQL / Legacy Planner, Table SQL / Planner Reporter: Dawid Wysakowicz Assignee: Dawid Wysakowicz Fix For: 1.10.0 Implement changes to Java/Scala APIs mentioned in the FLIP
[jira] [Commented] (FLINK-14479) Strange exceptions found in log file after executing `test_udf.py`
[ https://issues.apache.org/jira/browse/FLINK-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956741#comment-16956741 ] Wei Zhong commented on FLINK-14479: --- [~sunjincheng121] I'm willing to investigate this issue. Can you assign it to me? ; ) > Strange exceptions found in log file after executing `test_udf.py` > -- > > Key: FLINK-14479 > URL: https://issues.apache.org/jira/browse/FLINK-14479 > Project: Flink > Issue Type: Improvement > Components: API / Python >Reporter: sunjincheng >Priority: Major > Fix For: 1.10.0 > > > There are several strange exceptions as follow in > `${flink_source}/build-target/log/flink-${username}-python-udf-boot-${machine_name}.local.log` > after executing > `${flink_source}/flink-python/pyflink/table/tests/test_udf.py`: > Traceback (most recent call last): > {code:java} > File > "/usr/local/Cellar/python/3.7.4/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", > line 193, in _run_module_as_main > "__main__", mod_spec) > File > "/usr/local/Cellar/python/3.7.4/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", > line 85, in _run_code > exec(code, run_globals) > File > "/Users/zhongwei/flink/flink-python/pyflink/fn_execution/sdk_worker_main.py", > line 30, in > apache_beam.runners.worker.sdk_worker_main.main(sys.argv) > File > "/Users/zhongwei/pyflink_env/py37/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker_main.py", > line 148, in main > sdk_pipeline_options.view_as(pipeline_options.ProfilingOptions)) > File > "/Users/zhongwei/pyflink_env/py37/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 133, in run > for work_request in control_stub.Control(get_responses()): > File > "/Users/zhongwei/pyflink_env/py37/lib/python3.7/site-packages/grpc/_channel.py", > line 364, in __next__ > return self._next() > File > "/Users/zhongwei/pyflink_env/py37/lib/python3.7/site-packages/grpc/_channel.py", > line 347, in _next > raise 
self > grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with: > status = StatusCode.CANCELLED > details = "Runner closed connection" > debug_error_string = > "{"created":"@1571660342.057172000","description":"Error received from peer > ipv6:[::1]:52699","file":"src/core/lib/surface/call.cc","file_line":1052,"grpc_message":"Runner > closed connection","grpc_status":1}"{code} > It appears randomly when executing test cases of blink planner. Although it > does not affect test results we need to find out why it appears. > Welcome any feedback!
[jira] [Commented] (FLINK-9953) Active Kubernetes integration
[ https://issues.apache.org/jira/browse/FLINK-9953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956740#comment-16956740 ] Canbin Zheng commented on FLINK-9953: - Sorry for the late reply. I agree with [~trohrmann] that we only support session execution mode and use Deployment to manage worker pods in the first MVP; it considerably narrows the scope of the initial commit and helps move this feature forward quickly. > at the moment I don't yet fully understand why we need to persist RM state. I believe there are several reasons to persist the RM state, for failure handling and some edge exception cases; since it does not necessarily block the first experimental release, we can leave this discussion until after that. [~fly_in_gis] thanks, looking forward to your work! > Active Kubernetes integration > - > > Key: FLINK-9953 > URL: https://issues.apache.org/jira/browse/FLINK-9953 > Project: Flink > Issue Type: New Feature > Components: Runtime / Coordination >Reporter: Till Rohrmann >Assignee: Yang Wang >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > This is the umbrella issue tracking Flink's active Kubernetes integration. > Active means in this context that the {{ResourceManager}} can talk to > Kubernetes to launch new pods similar to Flink's Yarn and Mesos integration. > Phase 1 implementation will have complete functions to make Flink run on > Kubernetes. Phase 2 is mainly focused on production optimization, including > k8s native high-availability, storage, network, log collection, etc.
[GitHub] [flink] wangyang0918 commented on a change in pull request #9957: [FLINK-10932] Initialize flink-kubernetes module
wangyang0918 commented on a change in pull request #9957: [FLINK-10932] Initialize flink-kubernetes module URL: https://github.com/apache/flink/pull/9957#discussion_r337350386 ## File path: flink-kubernetes/pom.xml ## @@ -0,0 +1,222 @@ + + + +http://maven.apache.org/POM/4.0.0"; +xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"; +xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd";> + 4.0.0 + + org.apache.flink + flink-parent + 1.10-SNAPSHOT + .. + + + flink-kubernetes_${scala.binary.version} + flink-kubernetes + jar + + + 4.5.2 + + + + + + + + org.apache.flink + flink-clients_${scala.binary.version} + ${project.version} + provided + + + + org.apache.flink + flink-runtime_${scala.binary.version} + ${project.version} + provided + + + + io.fabric8 + kubernetes-client + ${kubernetes.version} Review comment: Agree. I will use kubernetes.client.version instead.
[jira] [Created] (FLINK-14489) Check and make sure state backend fits into TaskExecutor memory configuration
Yu Li created FLINK-14489: - Summary: Check and make sure state backend fits into TaskExecutor memory configuration Key: FLINK-14489 URL: https://issues.apache.org/jira/browse/FLINK-14489 Project: Flink Issue Type: Sub-task Components: Runtime / State Backends, Runtime / Task Reporter: Yu Li [FLIP-49|https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors] and [FLINK-13980|https://issues.apache.org/jira/browse/FLINK-13984] have proposed a unified memory configuration for {{TaskExecutor}}; we should find a proper way to fit state backend memory management into it. This sub-task will track the discussion and solution on this topic.
[GitHub] [flink] jinglining commented on issue #9601: [FLINK-13894][web]Web Ui add log url for subtask of vertex
jinglining commented on issue #9601: [FLINK-13894][web]Web Ui add log url for subtask of vertex URL: https://github.com/apache/flink/pull/9601#issuecomment-544828493 @flinkbot run travis
[jira] [Commented] (FLINK-14478) The python test on travis consumes too much time
[ https://issues.apache.org/jira/browse/FLINK-14478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956738#comment-16956738 ] Wei Zhong commented on FLINK-14478: --- [~sunjincheng121] I would like to take this JIRA. Can you assign it to me? : ) > The python test on travis consumes too much time > > > Key: FLINK-14478 > URL: https://issues.apache.org/jira/browse/FLINK-14478 > Project: Flink > Issue Type: Improvement > Components: API / Python >Reporter: sunjincheng >Priority: Major > > The python test on travis now takes twice as long as it did a few days ago. > Several days ago it only took about 15 minutes; more detail > can be found in [1] > But now it takes about half an hour, more detail can be found in [2] > It seems that a lot of time is spent on the newly added test cases in > `test_udf.py`. We need some optimization for these test cases to reduce test > time. > What do you think? > [1] [https://travis-ci.org/apache/flink/builds/599058891] > [2] [https://travis-ci.org/apache/flink/builds/600517667] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] wangxiyuan opened a new pull request #9964: test Travis arm ci
wangxiyuan opened a new pull request #9964: test Travis arm ci URL: https://github.com/apache/flink/pull/9964 ## What is the purpose of the change Test Travis ARM CI ## Brief change log *(for example:)* - *The TaskInfo is stored in the blob store on job creation time as a persistent artifact* - *Deployments RPC transmits only the blob storage reference* - *TaskManagers retrieve the TaskInfo from the blob cache* ## Verifying this change This change is already covered by existing tests, such as *(please describe tests)*. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / no) - The serializers: (yes / no / don't know) - The runtime per-record code paths (performance sensitive): (yes / no / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / no / don't know) - The S3 file system connector: (yes / no / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / no) - If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] wangxiyuan closed pull request #9964: test Travis arm ci
wangxiyuan closed pull request #9964: test Travis arm ci URL: https://github.com/apache/flink/pull/9964 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (FLINK-14482) Bump up rocksdb version to support WriteBufferManager
[ https://issues.apache.org/jira/browse/FLINK-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956736#comment-16956736 ] Yun Tang commented on FLINK-14482: -- At first, we planned to bump the base rocksDB version from 5.17.2 to 5.18.3 to support the write buffer manager feature. However, when we tested with flink's [benchmark|https://github.com/dataArtisans/flink-benchmarks], we found that all test cases showed a performance regression of 1% ~ 9%; more details can be found in rocksDB's [issue-5774|https://github.com/facebook/rocksdb/issues/5774]. Until now, even though we could reproduce this with RocksDB's built-in db_bench, there is still no exact answer for this problem. > Bump up rocksdb version to support WriteBufferManager > - > > Key: FLINK-14482 > URL: https://issues.apache.org/jira/browse/FLINK-14482 > Project: Flink > Issue Type: Sub-task > Components: Runtime / State Backends >Reporter: Yun Tang >Priority: Major > Fix For: 1.11.0 > > > Current rocksDB-5.17.2 does not support write buffer manager well, we need to > bump rocksdb version to support that feature. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-14488) Update python table API with temporary objects methods
Dawid Wysakowicz created FLINK-14488: Summary: Update python table API with temporary objects methods Key: FLINK-14488 URL: https://issues.apache.org/jira/browse/FLINK-14488 Project: Flink Issue Type: Sub-task Components: API / Python, Table SQL / API Reporter: Dawid Wysakowicz Fix For: 1.10.0 Update python table API with new methods introduced in Java/Scala API -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-14487) Update chinese documentation regarding Temporary Objects
Dawid Wysakowicz created FLINK-14487: Summary: Update chinese documentation regarding Temporary Objects Key: FLINK-14487 URL: https://issues.apache.org/jira/browse/FLINK-14487 Project: Flink Issue Type: Sub-task Components: Documentation, Table SQL / API Reporter: Dawid Wysakowicz Fix For: 1.10.0 Reflect the changes in the English documentation in the Chinese documentation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] zhuzhurk commented on issue #9936: [FLINK-14450][runtime] Change SchedulingTopology to extend base topology
zhuzhurk commented on issue #9936: [FLINK-14450][runtime] Change SchedulingTopology to extend base topology URL: https://github.com/apache/flink/pull/9936#issuecomment-544827263 @tillrohrmann could you take a look at this PR? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Created] (FLINK-14486) Update documentation
Dawid Wysakowicz created FLINK-14486: Summary: Update documentation Key: FLINK-14486 URL: https://issues.apache.org/jira/browse/FLINK-14486 Project: Flink Issue Type: Sub-task Components: Documentation, Table SQL / API Reporter: Dawid Wysakowicz Assignee: Dawid Wysakowicz Fix For: 1.10.0 * update references to deprecated methods * describe the concept of temporary tables -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-14486) Update documentation regarding Temporary Objects
[ https://issues.apache.org/jira/browse/FLINK-14486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Wysakowicz updated FLINK-14486: - Summary: Update documentation regarding Temporary Objects (was: Update documentation) > Update documentation regarding Temporary Objects > > > Key: FLINK-14486 > URL: https://issues.apache.org/jira/browse/FLINK-14486 > Project: Flink > Issue Type: Sub-task > Components: Documentation, Table SQL / API >Reporter: Dawid Wysakowicz >Assignee: Dawid Wysakowicz >Priority: Major > Fix For: 1.10.0 > > > * update references to deprecated methods > * describe the concept of temporary tables -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-14164) Add a metric to show failover count regarding fine grained recovery
[ https://issues.apache.org/jira/browse/FLINK-14164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956735#comment-16956735 ] Zhu Zhu commented on FLINK-14164: - Make this issue a subtask of FLINK-10429 to also add such a metric for the NG scheduler. I plan to change the {{SchedulerBase}} to register a meter 'numberOfRestarts' to exhibit all restarts. The meter is a {{MeterView}} and the underlying counter is determined by each scheduler implementation: 1. for the legacy scheduler, it's the {{ExecutionGraph#numberOfRestartsCounter}} we added in FLINK-10429 2. for the ng scheduler, it's a new counter added in {{ExecutionFailureHandler}} that counts all the task and global failures notified to it (based on FLINK-14232 global failure handling). [~trohrmann] [~gjy] Do you think this would work? If it's OK, could you assign this ticket to me? > Add a metric to show failover count regarding fine grained recovery > --- > > Key: FLINK-14164 > URL: https://issues.apache.org/jira/browse/FLINK-14164 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination, Runtime / Metrics >Affects Versions: 1.10.0 >Reporter: Zhu Zhu >Priority: Major > Fix For: 1.10.0 > > > Previously Flink uses restart all strategy to recover jobs from failures. And > the metric "fullRestart" is used to show the count of failovers. > However, with fine grained recovery introduced in 1.9.0, the "fullRestart" > metric only reveals how many times the entire graph has been restarted, not > including the number of fine grained failure recoveries. > As many users want to build their job alerting based on failovers, I'd > propose to add such a new metric {{numberOfFailures}}/{{numberOfRestarts}} > which also respects fine grained recoveries. -- This message was sent by Atlassian Jira (v8.3.4#803005)
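The plan above — a 'numberOfRestarts' meter whose underlying counter is supplied by the concrete scheduler — can be sketched with simplified stand-ins for Flink's `Counter` and `MeterView`. The classes below are re-implemented so the sketch is self-contained; they mirror the metrics API in spirit only, not in detail:

```java
// Simplified stand-ins for Flink's Counter/MeterView (illustrative only).
import java.util.concurrent.atomic.AtomicLong;

class SimpleCounter {
    private final AtomicLong count = new AtomicLong();
    void inc() { count.incrementAndGet(); }
    long getCount() { return count.get(); }
}

/** Derives a per-second rate from a counter, similar in spirit to Flink's MeterView. */
class SimpleMeterView {
    private final SimpleCounter counter;
    private final int timeSpanSeconds;
    private long lastCount;
    private double rate;

    SimpleMeterView(SimpleCounter counter, int timeSpanSeconds) {
        this.counter = counter;
        this.timeSpanSeconds = timeSpanSeconds;
    }

    /** Called periodically (once per time span) by the metrics system. */
    void update() {
        long current = counter.getCount();
        rate = (current - lastCount) / (double) timeSpanSeconds;
        lastCount = current;
    }

    double getRate() { return rate; }
    long getCount() { return counter.getCount(); }
}

public class RestartMetricSketch {
    public static void main(String[] args) {
        // The counter would be incremented by the scheduler-specific failure handler;
        // the meter is what gets registered under the name 'numberOfRestarts'.
        SimpleCounter restarts = new SimpleCounter();
        SimpleMeterView numberOfRestarts = new SimpleMeterView(restarts, 60);
        restarts.inc();
        restarts.inc();
        numberOfRestarts.update();
        System.out.println("restarts so far: " + numberOfRestarts.getCount());
    }
}
```

The key design point is that the meter only wraps a counter, so each scheduler implementation can plug in its own source of restart counts without changing the registered metric.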
[jira] [Created] (FLINK-14485) Support for Temporary Objects in Table module
Dawid Wysakowicz created FLINK-14485: Summary: Support for Temporary Objects in Table module Key: FLINK-14485 URL: https://issues.apache.org/jira/browse/FLINK-14485 Project: Flink Issue Type: New Feature Components: Table SQL / API, Table SQL / Legacy Planner, Table SQL / Planner Reporter: Dawid Wysakowicz Assignee: Dawid Wysakowicz Fix For: 1.10.0 Implement FLIP-64: https://cwiki.apache.org/confluence/display/FLINK/FLIP-64%3A+Support+for+Temporary+Objects+in+Table+module -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-14484) Modify RocksDB backend to allow setting WriteBufferManager via options
Yun Tang created FLINK-14484: Summary: Modify RocksDB backend to allow setting WriteBufferManager via options Key: FLINK-14484 URL: https://issues.apache.org/jira/browse/FLINK-14484 Project: Flink Issue Type: Sub-task Components: Runtime / State Backends Reporter: Yun Tang Fix For: 1.10.0 Allow the RocksDB state backend to enable the write buffer manager feature via options. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-14483) Build and supply frocksdb version to support WriteBufferManager
Yun Tang created FLINK-14483: Summary: Build and supply frocksdb version to support WriteBufferManager Key: FLINK-14483 URL: https://issues.apache.org/jira/browse/FLINK-14483 Project: Flink Issue Type: Sub-task Components: Runtime / State Backends Reporter: Yun Tang Fix For: 1.10.0 Since we still need to maintain our own FRocksDB, we need to provide a new FRocksDB version that supports the write buffer manager. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on issue #9964: test Travis arm ci
flinkbot edited a comment on issue #9964: test Travis arm ci URL: https://github.com/apache/flink/pull/9964#issuecomment-544799364 ## CI report: * 754182e6b5101abec8af40e3465a9d43c685d63c : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132933631) * ea99f7d1b0cc855256cb6da75c2f88069ca46bde : UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Created] (FLINK-14482) Bump up rocksdb version to support WriteBufferManager
Yun Tang created FLINK-14482: Summary: Bump up rocksdb version to support WriteBufferManager Key: FLINK-14482 URL: https://issues.apache.org/jira/browse/FLINK-14482 Project: Flink Issue Type: Sub-task Components: Runtime / State Backends Reporter: Yun Tang Fix For: 1.11.0 The current rocksDB-5.17.2 does not support the write buffer manager well; we need to bump the rocksdb version to support that feature. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] mbode commented on a change in pull request #9946: [FLINK-14468][docs] Update Kubernetes docs
mbode commented on a change in pull request #9946: [FLINK-14468][docs] Update Kubernetes docs URL: https://github.com/apache/flink/pull/9946#discussion_r337346428 ## File path: docs/ops/deployment/kubernetes.md ## @@ -230,6 +240,8 @@ spec: volumeMounts: - name: flink-config-volume mountPath: /opt/flink/conf/ +securityContext: + runAsUser: Review comment: I stumbled over this trying to deploy the example templates to a Kubernetes cluster where I did not have permissions to run privileged workloads. It seems to me to be a best practice to run unprivileged if possible. The user _flink_ with uid __ is present in the official Flink docker image, e.g. [here](https://github.com/docker-flink/docker-flink/blob/2e4b45b10e8efe04c324e44cacf7df16b2553f0f/1.9/scala_2.12-debian/Dockerfile#L55) for Flink 1.9/Scala 2.12. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9905: [FLINK-14396][network] Implement rudimentary non-blocking network output
flinkbot edited a comment on issue #9905: [FLINK-14396][network] Implement rudimentary non-blocking network output URL: https://github.com/apache/flink/pull/9905#issuecomment-542185017 ## CI report: * 8b3baa21776bbfa97a36413136fba365ea960703 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/131961879) * 2caa4c446127547d127433d5614b6994cd095330 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132760396) * 510f2cc5b515d80e0e316e68adb6fa4960303536 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/132777306) * 79505f7abd7e5c4319fa5328d0aee07360a175d7 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132934966) * 2f3d7b85d84acf575589feefdb2ee9375ba1541e : UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-14164) Add a metric to show failover count regarding fine grained recovery
[ https://issues.apache.org/jira/browse/FLINK-14164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14164: Parent: FLINK-10429 Issue Type: Sub-task (was: Improvement) > Add a metric to show failover count regarding fine grained recovery > --- > > Key: FLINK-14164 > URL: https://issues.apache.org/jira/browse/FLINK-14164 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination, Runtime / Metrics >Affects Versions: 1.10.0 >Reporter: Zhu Zhu >Priority: Major > Fix For: 1.10.0 > > > Previously Flink uses restart all strategy to recover jobs from failures. And > the metric "fullRestart" is used to show the count of failovers. > However, with fine grained recovery introduced in 1.9.0, the "fullRestart" > metric only reveals how many times the entire graph has been restarted, not > including the number of fine grained failure recoveries. > As many users want to build their job alerting based on failovers, I'd > propose to add such a new metric {{numberOfFailures}}/{{numberOfRestarts}} > which also respects fine grained recoveries. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] thousandhu commented on a change in pull request #9957: [FLINK-10932] Initialize flink-kubernetes module
thousandhu commented on a change in pull request #9957: [FLINK-10932] Initialize flink-kubernetes module URL: https://github.com/apache/flink/pull/9957#discussion_r337345302 ## File path: flink-kubernetes/pom.xml ## @@ -0,0 +1,222 @@ + + + +http://maven.apache.org/POM/4.0.0"; +xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"; +xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd";> + 4.0.0 + + org.apache.flink + flink-parent + 1.10-SNAPSHOT + .. + + + flink-kubernetes_${scala.binary.version} + flink-kubernetes + jar + + + 4.5.2 + + + + + + + + org.apache.flink + flink-clients_${scala.binary.version} + ${project.version} + provided + + + + org.apache.flink + flink-runtime_${scala.binary.version} + ${project.version} + provided + + + + io.fabric8 + kubernetes-client + ${kubernetes.version} Review comment: use kubernetes-client.version instead of kubernetes.version? I think kubernetes.version sounds like the kubernetes cluster's version. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] MalcolmSanders commented on a change in pull request #9957: [FLINK-10932] Initialize flink-kubernetes module
MalcolmSanders commented on a change in pull request #9957: [FLINK-10932] Initialize flink-kubernetes module URL: https://github.com/apache/flink/pull/9957#discussion_r337343518 ## File path: flink-kubernetes/pom.xml ## @@ -0,0 +1,222 @@ + + + +http://maven.apache.org/POM/4.0.0"; +xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"; +xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd";> + 4.0.0 + + org.apache.flink + flink-parent + 1.10-SNAPSHOT + .. + + + flink-kubernetes_${scala.binary.version} + flink-kubernetes + jar + + + 4.5.2 Review comment: Probably move this property to root/pom.xml ? Like This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] zhijiangW commented on issue #9905: [FLINK-14396][network] Implement rudimentary non-blocking network output
zhijiangW commented on issue #9905: [FLINK-14396][network] Implement rudimentary non-blocking network output URL: https://github.com/apache/flink/pull/9905#issuecomment-544820371 Thanks for the further review! I have updated the code for the new comments. Some further thoughts on the follow-up work: 1. I would create a separate ticket for refactoring the logic between the local and global pool. It might not be feasible to make the global request completely non-blocking. In other words, we cannot return to the mailbox directly when we fail to get an available buffer from the global pool, because we do not want to keep the already emitted record in the serialization stack. But we could improve the current approach a bit: there is no need to wait 2 seconds in `LocalBufferPool` before the next attempt to request a global buffer. Instead, the request can block there monitoring `isAvailable` from the global pool and be woken up immediately when the future completes. 2. Refactor the backpressure monitor in [FLINK-14472](https://issues.apache.org/jira/browse/FLINK-14472). There are two cases that cause backpressure: one is via `BufferProvider#isAvailable`, and the other is judging whether the pool is waiting for a global buffer to become available. We could provide a transparent interface method that reports the backpressure result. The `LocalBufferPool` implementation could consider the above two cases together, so the backpressure monitor would not need to understand the details of `LocalBufferPool`. There would then be no need to analyze the task thread stack, which also solves the previous problem that backpressure detection is invalid if `requestBuffer` is called by other threads. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
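The improvement described in point 1 above — blocking on an availability future instead of retrying every 2 seconds — can be sketched as follows. This is an assumed design with stand-in types, not the actual `LocalBufferPool`/`NetworkBufferPool` code:

```java
// Sketch: a waiter parks on an availability future that the global pool
// completes as soon as a buffer is recycled, instead of polling on a timeout.
// All types here are illustrative stand-ins, not Flink's network stack.
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;

class GlobalPoolSketch {
    private final Queue<byte[]> segments = new ArrayDeque<>();
    private CompletableFuture<Void> availabilityFuture = new CompletableFuture<>();

    synchronized byte[] pollSegment() {
        byte[] seg = segments.poll();
        if (segments.isEmpty() && availabilityFuture.isDone()) {
            availabilityFuture = new CompletableFuture<>(); // re-arm for the next wait
        }
        return seg;
    }

    synchronized void recycle(byte[] segment) {
        segments.add(segment);
        availabilityFuture.complete(null); // wake up any waiting requester immediately
    }

    synchronized CompletableFuture<Void> isAvailable() {
        return segments.isEmpty() ? availabilityFuture : CompletableFuture.<Void>completedFuture(null);
    }
}

public class NonBlockingRequestSketch {

    /** Parks on the availability future rather than retrying on a fixed 2-second timeout. */
    static byte[] requestBlocking(GlobalPoolSketch global) {
        while (true) {
            byte[] seg = global.pollSegment();
            if (seg != null) {
                return seg;
            }
            global.isAvailable().join(); // parked until recycle() completes the future
        }
    }

    public static void main(String[] args) throws Exception {
        GlobalPoolSketch pool = new GlobalPoolSketch();
        Thread recycler = new Thread(() -> {
            try { Thread.sleep(50); } catch (InterruptedException ignored) { }
            pool.recycle(new byte[32]); // a buffer becomes available while the requester waits
        });
        recycler.start();
        byte[] seg = requestBlocking(pool); // woken as soon as the buffer is recycled
        recycler.join();
        System.out.println("got segment of " + seg.length + " bytes");
    }
}
```

Compared with a fixed retry interval, the waiter wakes up the moment a buffer is recycled and never sleeps longer than necessary.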
[GitHub] [flink] flinkbot edited a comment on issue #9965: [FLINK-10935][kubernetes]Implement KubeClient with Faric8 Kubernetes clients
flinkbot edited a comment on issue #9965: [FLINK-10935][kubernetes]Implement KubeClient with Faric8 Kubernetes clients URL: https://github.com/apache/flink/pull/9965#issuecomment-544813931 ## CI report: * 6f90b457e56a0a8cb45d63c1b05b47d2e38030a1 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/132938440) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9962: [FLINK-14218][table] support precise function reference in FunctionCatalog
flinkbot edited a comment on issue #9962: [FLINK-14218][table] support precise function reference in FunctionCatalog URL: https://github.com/apache/flink/pull/9962#issuecomment-544666842 ## CI report: * 622f328937b3479ed9489d23579ec35fc7037d83 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132881705) * 05d1e78a54104288818f99456259c8663311b516 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132914513) * 1f48369a2d07d5bb815dd461523f3e96839e04bc : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/132938431) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition
flinkbot edited a comment on issue #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition URL: https://github.com/apache/flink/pull/9907#issuecomment-542243031 ## CI report: * dd14191ee919be31148be254070e7a777cf9cb4d : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/131984556) * 1559637102832603a0dc0d09ab730e00f2e9d224 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132628239) * cebed5831381b7bc00f3a56244f8123099f0b5bb : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/132741010) * 207250cb5110fc2fd839e643ab149b9f482e2e0a : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/132938423) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9962: [FLINK-14218][table] support precise function reference in FunctionCatalog
flinkbot edited a comment on issue #9962: [FLINK-14218][table] support precise function reference in FunctionCatalog URL: https://github.com/apache/flink/pull/9962#issuecomment-544666842 ## CI report: * 622f328937b3479ed9489d23579ec35fc7037d83 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132881705) * 05d1e78a54104288818f99456259c8663311b516 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132914513) * 1f48369a2d07d5bb815dd461523f3e96839e04bc : UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot commented on issue #9965: [FLINK-10935][kubernetes]Implement KubeClient with Faric8 Kubernetes clients
flinkbot commented on issue #9965: [FLINK-10935][kubernetes]Implement KubeClient with Faric8 Kubernetes clients URL: https://github.com/apache/flink/pull/9965#issuecomment-544813931 ## CI report: * 6f90b457e56a0a8cb45d63c1b05b47d2e38030a1 : UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition
flinkbot edited a comment on issue #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition URL: https://github.com/apache/flink/pull/9907#issuecomment-542243031 ## CI report: * dd14191ee919be31148be254070e7a777cf9cb4d : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/131984556) * 1559637102832603a0dc0d09ab730e00f2e9d224 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132628239) * cebed5831381b7bc00f3a56244f8123099f0b5bb : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/132741010) * 207250cb5110fc2fd839e643ab149b9f482e2e0a : UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot commented on issue #9965: Flink 10935
flinkbot commented on issue #9965: Flink 10935 URL: https://github.com/apache/flink/pull/9965#issuecomment-544809745 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 6f90b457e56a0a8cb45d63c1b05b47d2e38030a1 (Tue Oct 22 05:26:01 UTC 2019) **Warnings:** * **4 pom.xml files were touched**: Check for build and licensing issues. * No documentation files were touched! Remember to keep the Flink docs up to date! * **Invalid pull request title: No valid Jira ID provided** Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] wangyang0918 opened a new pull request #9965: Flink 10935
wangyang0918 opened a new pull request #9965: Flink 10935 URL: https://github.com/apache/flink/pull/9965 ## What is the purpose of the change This PR is part of FLINK-9953 (Active Kubernetes integration) and is based on #9695. This PR adds the `FlinkKubeClient` API and provides a fabric8 implementation. The following k8s resource decorators are added to support basic k8s integration. * Service: the internal service is used for the tm->jm heartbeat; the rest service is used by the flink client to submit jobs. * ConfigMap: saves the flink-conf.yaml and log4j config files. It will be mounted into the jm/tm pods. * Deployment: for the job manager. When it crashes exceptionally, a new one will be started. * Pod: for the task manager. It will be created by `KubernetesResourceManager`. * OwnerReference: all resources set an owner reference to the internal service. When the internal service is deleted, they will be deleted by gc. ## Brief change log - Add config options and cli options for kubernetes - Add FlinkKubeClient API and basic kubernetes resources - Add fabric8 kubeClient implementation and decorators - Add tests for the fabric8 FlinkKubeClient implementation ## Verifying this change This change added `Fabric8ClientTest` covering the public API of the fabric8 `FlinkKubeClient` implementation. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? (not applicable) This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
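The decorator idea described in the PR above — each decorator adding one concern (labels, owner references, mounted config, ...) to a resource description before it is sent to the cluster — can be sketched in plain Java. The types below are illustrative stand-ins, not the actual flink-kubernetes or fabric8 classes:

```java
// Sketch of a resource-decorator chain (stand-in types, not the real module).
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ResourceSpec {
    final Map<String, String> labels = new LinkedHashMap<>();
    final List<String> ownerReferences = new ArrayList<>();
}

interface Decorator {
    ResourceSpec decorate(ResourceSpec spec);
}

class LabelDecorator implements Decorator {
    public ResourceSpec decorate(ResourceSpec spec) {
        spec.labels.put("app", "flink"); // example label; real decorators add cluster-id etc.
        return spec;
    }
}

class OwnerReferenceDecorator implements Decorator {
    private final String owner;
    OwnerReferenceDecorator(String owner) { this.owner = owner; }
    public ResourceSpec decorate(ResourceSpec spec) {
        // owning the internal service means k8s garbage collection deletes
        // this resource automatically when the service is deleted
        spec.ownerReferences.add(owner);
        return spec;
    }
}

public class DecoratorChainSketch {
    static ResourceSpec apply(ResourceSpec spec, List<Decorator> decorators) {
        for (Decorator d : decorators) {
            spec = d.decorate(spec);
        }
        return spec;
    }

    public static void main(String[] args) {
        ResourceSpec pod = apply(new ResourceSpec(),
                List.of(new LabelDecorator(), new OwnerReferenceDecorator("internal-service")));
        System.out.println(pod.labels + " " + pod.ownerReferences);
    }
}
```

Keeping each concern in its own decorator lets the same chain be reused for the Service, ConfigMap, Deployment, and Pod resources listed in the PR description.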
[GitHub] [flink] dianfu commented on issue #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition
dianfu commented on issue #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition URL: https://github.com/apache/flink/pull/9907#issuecomment-544808067 @hequn8128 Thanks a lot for the review. Updated the PR. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9927: [FLINK-14397][hive] Failed to run Hive UDTF with array arguments
flinkbot edited a comment on issue #9927: [FLINK-14397][hive] Failed to run Hive UDTF with array arguments URL: https://github.com/apache/flink/pull/9927#issuecomment-543180251 ## CI report: * 14b59100e851fa7494838f1b5c68b4dc732da590 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/132339165) * 9293184a08efd0c075be4fa67c02170894287099 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132493044) * 995c2d239889a09bffd87f228b222434cb052820 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132753822) * 5543d30d6d5f3605783dce9070b3dc42de11ffd2 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132932147) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9905: [FLINK-14396][network] Implement rudimentary non-blocking network output
flinkbot edited a comment on issue #9905: [FLINK-14396][network] Implement rudimentary non-blocking network output URL: https://github.com/apache/flink/pull/9905#issuecomment-542185017 ## CI report: * 8b3baa21776bbfa97a36413136fba365ea960703 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/131961879) * 2caa4c446127547d127433d5614b6994cd095330 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132760396) * 510f2cc5b515d80e0e316e68adb6fa4960303536 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/132777306) * 79505f7abd7e5c4319fa5328d0aee07360a175d7 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132934966) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Comment Edited] (FLINK-14472) Implement back-pressure monitor with non-blocking outputs
[ https://issues.apache.org/jira/browse/FLINK-14472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956654#comment-16956654 ] zhijiang edited comment on FLINK-14472 at 10/22/19 5:05 AM: Thanks for looking into this issue. You are right that some known scenarios are not detected by the existing back-pressure monitor. Although the motivation of this ticket is not to remove that limitation, I think we could solve it as well while implementing the new monitoring approach. The current approach is heavyweight and fragile, and it also needs to understand the implementation of `LocalBufferPool`, which is bad design. I tried to provide a transparent method in `BufferProvider` to indicate whether it is back pressured or not; the monitor caller would then rely on this method to get the back-pressure result. There is no need to analyze specific thread stacks inside the monitor tracker or to understand the implementation of `BufferProvider`. It also has the benefit that the REST call only carries lightweight info.
> Implement back-pressure monitor with non-blocking outputs > - > > Key: FLINK-14472 > URL: https://issues.apache.org/jira/browse/FLINK-14472 > Project: Flink > Issue Type: Task > Components: Runtime / Network >Reporter: zhijiang >Assignee: Yingjie Cao >Priority: Minor > Fix For: 1.10.0 > > > Currently back-pressure monitor relies on detecting task threads that are > stuck in `requestBufferBuilderBlocking`. There are actually two cases to > cause back-pressure ATM: > * There are no available buffers in `LocalBufferPool` and all the given > quotas from global pool are also exhausted. Then we need to wait for buffer > recycling to `LocalBufferPool`. > * No available buffers in `LocalBufferPool`, but the quota has not been used > up. While requesting buffer from global pool, it is blocked because of no > available buffers in global pool. Then we need to wait for buffer recycling > to global pool. > We already implemented the non-blocking output for the first case in > [FLINK-14396|https://issues.apache.org/jira/browse/FLINK-14396], and we > expect the second case done together with adjusting the back-pressure monitor > which could check for `RecordWriter#isAvailable` instead. -- This message was sent by Atlassian Jira (v8.3.4#803005)
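The availability-based check proposed above (polling something like `RecordWriter#isAvailable` instead of sampling thread stacks) can be sketched as follows. This is a minimal illustration, not Flink's actual API; the interface and class names here are simplified stand-ins:

```java
import java.util.concurrent.CompletableFuture;

// Illustrative sketch: the output component exposes an availability future,
// and the monitor treats an incomplete future as back pressure.
interface AvailabilityProvider {
    CompletableFuture<?> AVAILABLE = CompletableFuture.completedFuture(null);

    // Complete while output buffers are available; pending while blocked.
    CompletableFuture<?> isAvailable();

    // The back-pressure monitor only needs this cheap, non-intrusive poll.
    default boolean isBackPressured() {
        return !isAvailable().isDone();
    }
}

class AlwaysAvailableWriter implements AvailabilityProvider {
    @Override
    public CompletableFuture<?> isAvailable() {
        return AVAILABLE;
    }
}

class BackPressureSketch {
    public static void main(String[] args) {
        AvailabilityProvider writer = new AlwaysAvailableWriter();
        System.out.println(writer.isBackPressured()); // false
    }
}
```

With such a method, the monitor no longer needs to inspect `LocalBufferPool` internals or analyze thread stacks, and the REST response stays lightweight, as described in the comment above.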
[GitHub] [flink] flinkbot edited a comment on issue #9950: [FLINK-14464][runtime] Introduce the AbstractUserClassPathJobGraphRetriever
flinkbot edited a comment on issue #9950: [FLINK-14464][runtime] Introduce the AbstractUserClassPathJobGraphRetriever URL: https://github.com/apache/flink/pull/9950#issuecomment-544398939 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit decba8623c001e8bbe5dc797de2bc421b2e216ef (Tue Oct 22 04:49:21 UTC 2019) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required. Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Assigned] (FLINK-14388) Support all the data types in Python user-defined functions
[ https://issues.apache.org/jira/browse/FLINK-14388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hequn Cheng reassigned FLINK-14388: --- Assignee: Huang Xingbo > Support all the data types in Python user-defined functions > --- > > Key: FLINK-14388 > URL: https://issues.apache.org/jira/browse/FLINK-14388 > Project: Flink > Issue Type: Sub-task > Components: API / Python >Reporter: Dian Fu >Assignee: Huang Xingbo >Priority: Major > Labels: pull-request-available > Fix For: 1.10.0 > > Time Spent: 10m > Remaining Estimate: 0h > > Currently, only BigInt type is supported and we should support the other > types as well. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] zhuzhurk commented on a change in pull request #9950: [FLINK-14464][runtime] Introduce the AbstractUserClassPathJobGraphRetriever
zhuzhurk commented on a change in pull request #9950: [FLINK-14464][runtime] Introduce the AbstractUserClassPathJobGraphRetriever URL: https://github.com/apache/flink/pull/9950#discussion_r337329436 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/entrypoint/component/AbstractUserClassPathJobGraphRetriever.java ## @@ -0,0 +1,115 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.runtime.entrypoint.component; + +import org.apache.flink.core.fs.Path; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import javax.annotation.Nullable; + +import java.io.File; +import java.io.IOException; +import java.net.URL; +import java.nio.file.FileVisitOption; +import java.nio.file.FileVisitResult; +import java.nio.file.Files; +import java.nio.file.SimpleFileVisitor; +import java.nio.file.attribute.BasicFileAttributes; +import java.util.Collections; +import java.util.EnumSet; +import java.util.LinkedList; +import java.util.List; + +/** + * Abstract class for the JobGraphRetriever, which wants to get classpath user's code depends on. 
+ */ + +public abstract class AbstractUserClassPathJobGraphRetriever implements JobGraphRetriever { + + protected static final Logger LOG = LoggerFactory.getLogger(AbstractUserClassPathJobGraphRetriever.class); + + public static final String DEFAULT_JOB_DIR = "job"; Review comment: This default value is used for both user jar deployment and scanning. I think it's better to move to a common place, like ConfigConstants. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9964: test Travis arm ci
flinkbot edited a comment on issue #9964: test Travis arm ci URL: https://github.com/apache/flink/pull/9964#issuecomment-544799364 ## CI report: * 754182e6b5101abec8af40e3465a9d43c685d63c : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132933631) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] zhuzhurk commented on a change in pull request #9950: [FLINK-14464][runtime] Introduce the AbstractUserClassPathJobGraphRetriever
zhuzhurk commented on a change in pull request #9950: [FLINK-14464][runtime] Introduce the AbstractUserClassPathJobGraphRetriever URL: https://github.com/apache/flink/pull/9950#discussion_r337329154 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/entrypoint/component/AbstractUserClassPathJobGraphRetriever.java ## @@ -0,0 +1,115 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.runtime.entrypoint.component; + +import org.apache.flink.core.fs.Path; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import javax.annotation.Nullable; + +import java.io.File; +import java.io.IOException; +import java.net.URL; +import java.nio.file.FileVisitOption; +import java.nio.file.FileVisitResult; +import java.nio.file.Files; +import java.nio.file.SimpleFileVisitor; +import java.nio.file.attribute.BasicFileAttributes; +import java.util.Collections; +import java.util.EnumSet; +import java.util.LinkedList; +import java.util.List; + +/** + * Abstract class for the JobGraphRetriever, which wants to get classpath user's code depends on. 
+ */ + +public abstract class AbstractUserClassPathJobGraphRetriever implements JobGraphRetriever { + + protected static final Logger LOG = LoggerFactory.getLogger(AbstractUserClassPathJobGraphRetriever.class); + + public static final String DEFAULT_JOB_DIR = "job"; + + /** The directory that contains all the jars which the user code depends on. */ + @Nullable + private final String jobDir; + + private List<URL> userClassPaths; + + public AbstractUserClassPathJobGraphRetriever(String jobDir) { + this.jobDir = jobDir; + } + + public List<URL> getUserClassPaths() throws IOException { + if (userClassPaths == null) { + userClassPaths = scanJarsInJobClassDir(jobDir); + } + return userClassPaths; + } + + private List<URL> scanJarsInJobClassDir(String dir) throws IOException { + + if (dir == null) { + return Collections.emptyList(); + } + + final File dirFile = new File(new Path(dir).toString()); + final List<URL> jarURLs = new LinkedList<>(); + + if (!dirFile.exists()) { + LOG.warn("the job dir " + dirFile + " does not exist."); + return Collections.emptyList(); + } + if (!dirFile.isDirectory()) { + LOG.warn("the job dir " + dirFile + " is not a directory."); + return Collections.emptyList(); + } + + Files.walkFileTree(dirFile.toPath(), + EnumSet.of(FileVisitOption.FOLLOW_LINKS), + Integer.MAX_VALUE, + new SimpleFileVisitor<java.nio.file.Path>() { + + @Override + public FileVisitResult visitFile(java.nio.file.Path file, BasicFileAttributes attrs) + throws IOException { + FileVisitResult fileVisitResult = super.visitFile(file, attrs); + if (file.getFileName().toString().endsWith(".jar")) { + LOG.info("add " + file.toString() + " to user classpath"); + if (file.isAbsolute()) { Review comment: How about making all URLs relative? I think a relative path could supersede an absolute path in all classpath usage cases. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
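The reviewer's suggestion above (keeping user-jar classpath entries relative rather than absolute) might look roughly like this. The helper name `relativize` and the example paths are hypothetical, not the code that was eventually merged:

```java
import java.net.URL;
import java.nio.file.Path;
import java.nio.file.Paths;

class RelativeJarUrlSketch {

    // Relativize a jar path against the working directory; paths that are
    // already relative are kept as-is.
    static Path relativize(Path workingDir, Path jar) {
        return jar.isAbsolute() ? workingDir.relativize(jar) : jar;
    }

    public static void main(String[] args) throws Exception {
        Path rel = relativize(Paths.get("/opt/flink"), Paths.get("/opt/flink/job/udf.jar"));
        System.out.println(rel); // job/udf.jar

        // A file URL built from the relative path carries no absolute prefix,
        // so it stays usable if the deployment directory moves.
        URL url = new URL("file", null, rel.toString());
        System.out.println(url);
    }
}
```

One design consideration: a relative URL is resolved against the process working directory, which is what makes it portable across deployments but also means the retriever must be started from a well-defined directory.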
[GitHub] [flink] flinkbot edited a comment on issue #9912: [FLINK-14370][kafka][test-stability] Fix the cascading failure in kaProducerTestBase.
flinkbot edited a comment on issue #9912: [FLINK-14370][kafka][test-stability] Fix the cascading failure in kaProducerTestBase. URL: https://github.com/apache/flink/pull/9912#issuecomment-542713850 ## CI report: * c1fe0487729f91e54bb58b2ed0913dda8d00b5f6 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132165623) * d3faff4e2b44b01a43d3716ad2eb1f4cd2c4b054 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/132351880) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9905: [FLINK-14396][network] Implement rudimentary non-blocking network output
flinkbot edited a comment on issue #9905: [FLINK-14396][network] Implement rudimentary non-blocking network output URL: https://github.com/apache/flink/pull/9905#issuecomment-542185017 ## CI report: * 8b3baa21776bbfa97a36413136fba365ea960703 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/131961879) * 2caa4c446127547d127433d5614b6994cd095330 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132760396) * 510f2cc5b515d80e0e316e68adb6fa4960303536 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/132777306) * 79505f7abd7e5c4319fa5328d0aee07360a175d7 : UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9963: [FLINK-14388][python][table-planner][table-planner-blink] Support common data types in Python user-defined functions
flinkbot edited a comment on issue #9963: [FLINK-14388][python][table-planner][table-planner-blink] Support common data types in Python user-defined functions URL: https://github.com/apache/flink/pull/9963#issuecomment-544787904 ## CI report: * fe74a771e3f928774398e6cbeccf20b69dc76a58 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132929165) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot commented on issue #9964: test Travis arm ci
flinkbot commented on issue #9964: test Travis arm ci URL: https://github.com/apache/flink/pull/9964#issuecomment-544799364 ## CI report: * 754182e6b5101abec8af40e3465a9d43c685d63c : UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9957: [FLINK-10932] Initialize flink-kubernetes module
flinkbot edited a comment on issue #9957: [FLINK-10932] Initialize flink-kubernetes module URL: https://github.com/apache/flink/pull/9957#issuecomment-544504781 ## CI report: * 41f3075d918578641781d15eb8c4d3cfc9bb4ada : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132817274) * 54c93de3bb19bd2c5bbee71a9cc177f1c405f116 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/132927667) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9927: [FLINK-14397][hive] Failed to run Hive UDTF with array arguments
flinkbot edited a comment on issue #9927: [FLINK-14397][hive] Failed to run Hive UDTF with array arguments URL: https://github.com/apache/flink/pull/9927#issuecomment-543180251 ## CI report: * 14b59100e851fa7494838f1b5c68b4dc732da590 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/132339165) * 9293184a08efd0c075be4fa67c02170894287099 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132493044) * 995c2d239889a09bffd87f228b222434cb052820 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132753822) * 5543d30d6d5f3605783dce9070b3dc42de11ffd2 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/132932147) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9905: [FLINK-14396][network] Implement rudimentary non-blocking network output
flinkbot edited a comment on issue #9905: [FLINK-14396][network] Implement rudimentary non-blocking network output URL: https://github.com/apache/flink/pull/9905#issuecomment-542178315 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 79505f7abd7e5c4319fa5328d0aee07360a175d7 (Tue Oct 22 04:14:41 UTC 2019) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required. Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] zhijiangW commented on a change in pull request #9905: [FLINK-14396][network] Implement rudimentary non-blocking network output
zhijiangW commented on a change in pull request #9905: [FLINK-14396][network] Implement rudimentary non-blocking network output URL: https://github.com/apache/flink/pull/9905#discussion_r337325076 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/ConsumableNotifyingResultPartitionWriterDecorator.java ## @@ -119,6 +121,11 @@ public void fail(Throwable throwable) { partitionWriter.fail(throwable); } + @Override + public CompletableFuture<?> isAvailable() { + return AvailabilityProvider.AVAILABLE; Review comment: Thanks for finding this critical issue. It was careless of me to handle it in the same simple way as the other tests. Actually the `ConsumableNotifyingResultPartitionWriterDecorator` is used by `RecordWriter` in practice, so we should create this decorator in the newly added test `RecordWriterTest#testIsAvailableOrNot` to cover this issue. I have already fixed it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
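The problem flagged in this review — a decorator returning the always-complete `AVAILABLE` constant instead of forwarding availability — can be illustrated with simplified stand-ins for Flink's interfaces (`Writer` and `NotifyingWriterDecorator` below are hypothetical names, not the actual classes):

```java
import java.util.concurrent.CompletableFuture;

// Simplified stand-in for the writer interface; names are illustrative.
interface Writer {
    CompletableFuture<?> isAvailable();
}

class NotifyingWriterDecorator implements Writer {
    private final Writer delegate;

    NotifyingWriterDecorator(Writer delegate) {
        this.delegate = delegate;
    }

    @Override
    public CompletableFuture<?> isAvailable() {
        // Forward to the wrapped writer: returning a constant "available"
        // future here would hide the delegate's back pressure from callers.
        return delegate.isAvailable();
    }
}

class DecoratorSketch {
    public static void main(String[] args) {
        CompletableFuture<Void> pending = new CompletableFuture<>();
        Writer decorated = new NotifyingWriterDecorator(() -> pending);
        System.out.println(decorated.isAvailable().isDone()); // false: back pressure is visible
        pending.complete(null);
        System.out.println(decorated.isAvailable().isDone()); // true
    }
}
```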
[GitHub] [flink] flinkbot edited a comment on issue #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition
flinkbot edited a comment on issue #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition URL: https://github.com/apache/flink/pull/9907#issuecomment-542229982 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit cebed5831381b7bc00f3a56244f8123099f0b5bb (Tue Oct 22 04:01:27 UTC 2019) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required. Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Created] (FLINK-14481) Modify the Flink valid socket port check to 0 to 65535.
ming li created FLINK-14481: --- Summary: Modify the Flink valid socket port check to 0 to 65535. Key: FLINK-14481 URL: https://issues.apache.org/jira/browse/FLINK-14481 Project: Flink Issue Type: Improvement Components: Runtime / Network Reporter: ming li In Flink, I found that Flink's socket port check is `port >= 0 && port <= 65536`. {code:java} checkArgument(serverPort >= 0 && serverPort <= 65536, "Invalid port number.");{code} But when binding the port, the valid range is 0 to 65535 (a port number of zero lets the system pick an ephemeral port in a bind operation). Although binding port 65536 will fail because it is out of range, it has already passed Flink's validity check, which seems very confusing. Should we modify Flink's port check to 0 to 65535? -- This message was sent by Atlassian Jira (v8.3.4#803005)
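The corrected check proposed in this ticket can be sketched as a plain precondition (the method name `checkValidPort` is illustrative, not Flink's actual utility):

```java
class PortCheckSketch {

    // Valid socket ports are 0..65535; port 0 asks the OS to pick an
    // ephemeral port at bind time, so it is accepted as well.
    static void checkValidPort(int port) {
        if (port < 0 || port > 65535) {
            throw new IllegalArgumentException("Invalid port number: " + port);
        }
    }

    public static void main(String[] args) {
        checkValidPort(0);      // OK: ephemeral port
        checkValidPort(65535);  // OK: highest valid port
        try {
            checkValidPort(65536);  // rejected under the corrected range
        } catch (IllegalArgumentException expected) {
            System.out.println("rejected 65536");
        }
    }
}
```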
[GitHub] [flink] dianfu commented on a change in pull request #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition
dianfu commented on a change in pull request #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition URL: https://github.com/apache/flink/pull/9907#discussion_r337323510 ## File path: flink-table/flink-table-planner-blink/src/main/scala/org/apache/flink/table/planner/plan/rules/logical/PythonCalcSplitRule.scala ## @@ -0,0 +1,321 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.table.planner.plan.rules.logical + +import org.apache.calcite.plan.RelOptRule.{any, operand} +import org.apache.calcite.plan.{RelOptRule, RelOptRuleCall} +import org.apache.calcite.rex.{RexCall, RexInputRef, RexNode, RexProgram} +import org.apache.calcite.sql.validate.SqlValidatorUtil +import org.apache.flink.table.functions.FunctionLanguage +import org.apache.flink.table.functions.ScalarFunction +import org.apache.flink.table.planner.functions.utils.ScalarSqlFunction +import org.apache.flink.table.planner.plan.nodes.logical.FlinkLogicalCalc +import org.apache.flink.table.planner.plan.utils.PythonUtil.containsFunctionOf +import org.apache.flink.table.planner.plan.utils.{InputRefVisitor, RexDefaultVisitor} + +import scala.collection.JavaConverters._ +import scala.collection.JavaConversions._ +import scala.collection.mutable + +/** + * Rule that splits [[FlinkLogicalCalc]] into multiple [[FlinkLogicalCalc]]s. + * This is to ensure that the Python [[ScalarFunction]]s which could be + * executed in a batch are grouped into the same [[FlinkLogicalCalc]] node. 
+ */ +abstract class PythonCalcSplitRuleBase(description: String) + extends RelOptRule( +operand(classOf[FlinkLogicalCalc], any), +description) { + + override def onMatch(call: RelOptRuleCall): Unit = { +val calc: FlinkLogicalCalc = call.rel(0).asInstanceOf[FlinkLogicalCalc] +val input = calc.getInput +val rexBuilder = call.builder().getRexBuilder +val program = calc.getProgram +val extractedRexCalls = new mutable.ArrayBuffer[RexCall]() + +val extractedFunctionOffset = input.getRowType.getFieldCount +val splitter = new ScalarFunctionSplitter( + extractedFunctionOffset, + extractedRexCalls, + convertPythonFunction(program)) + +val (bottomCalcCondition, topCalcCondition, topCalcProjects) = split(program, splitter) +val accessedFields = extractRefInputFields( + topCalcProjects, topCalcCondition, extractedFunctionOffset) + +val bottomCalcProjects = + accessedFields.map(RexInputRef.of(_, input.getRowType)) ++ extractedRexCalls +val bottomCalcFieldNames = SqlValidatorUtil.uniquify( + accessedFields.map(i => input.getRowType.getFieldNames.get(i)).toSeq ++ +extractedRexCalls.indices.map("f" + _), + rexBuilder.getTypeFactory.getTypeSystem.isSchemaCaseSensitive) + +val bottomCalc = new FlinkLogicalCalc( + calc.getCluster, + calc.getTraitSet, + input, + RexProgram.create( +input.getRowType, +bottomCalcProjects.toList, +bottomCalcCondition.orNull, +bottomCalcFieldNames, +rexBuilder)) + +val inputRewriter = new ExtractedFunctionInputRewriter(extractedFunctionOffset, accessedFields) +val topCalc = new FlinkLogicalCalc( + calc.getCluster, + calc.getTraitSet, + bottomCalc, + RexProgram.create( +bottomCalc.getRowType, +topCalcProjects.map(_.accept(inputRewriter)), +topCalcCondition.map(_.accept(inputRewriter)).orNull, +calc.getRowType, +rexBuilder)) + +call.transformTo(topCalc) + } + + /** +* Extracts the indices of the input fields referred by the specified projects and condition. 
+*/ + private def extractRefInputFields( + projects: Seq[RexNode], + condition: Option[RexNode], + inputFieldsCount: Int): Array[Int] = { +val visitor = new InputRefVisitor + +// extract referenced input fields from projections +projects.foreach(exp => exp.accept(visitor)) + +// extract referenced input fields from condition +condition.foreach(_.accept(visitor)) + +// fields of indexes greater than inputFieldsCount is the extracted functions and +// should be filter
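The `extractRefInputFields` logic in the diff above can be illustrated with a simplified standalone model (hypothetical types: expressions are flattened to plain index arrays here, whereas the real Scala code walks Calcite `RexNode`s with an `InputRefVisitor`):

```java
import java.util.List;
import java.util.TreeSet;

// Simplified model of extractRefInputFields: collect the distinct input-field
// indices referenced by the projections and the optional condition. Indices
// >= inputFieldsCount refer to the extracted (Python) function results that
// were appended after the real input fields, so they are filtered out.
public class RefFieldExtractor {

    public static int[] extractRefInputFields(
            List<int[]> projects, int[] condition, int inputFieldsCount) {
        TreeSet<Integer> refs = new TreeSet<>();
        for (int[] project : projects) {
            for (int idx : project) {
                refs.add(idx);
            }
        }
        if (condition != null) {
            for (int idx : condition) {
                refs.add(idx);
            }
        }
        // drop references to extracted function results, keep only real inputs
        refs.removeIf(idx -> idx >= inputFieldsCount);
        return refs.stream().mapToInt(Integer::intValue).toArray();
    }
}
```

For example, with projections referencing fields {0, 2} and {5}, a condition referencing {1}, and 4 real input fields, the result is [0, 1, 2]: field 5 is an extracted function result and is excluded.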
[GitHub] [flink] flinkbot commented on issue #9964: test Travis arm ci
flinkbot commented on issue #9964: test Travis arm ci URL: https://github.com/apache/flink/pull/9964#issuecomment-544795857 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 754182e6b5101abec8af40e3465a9d43c685d63c (Tue Oct 22 03:56:34 UTC 2019) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! * **Invalid pull request title: No valid Jira ID provided** Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] wangxiyuan opened a new pull request #9964: test Travis arm ci
wangxiyuan opened a new pull request #9964: test Travis arm ci URL: https://github.com/apache/flink/pull/9964 ## What is the purpose of the change Test Travis ARM CI ## Brief change log *(for example:)* - *The TaskInfo is stored in the blob store on job creation time as a persistent artifact* - *Deployments RPC transmits only the blob storage reference* - *TaskManagers retrieve the TaskInfo from the blob cache* ## Verifying this change This change is already covered by existing tests, such as *(please describe tests)*. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / no) - The serializers: (yes / no / don't know) - The runtime per-record code paths (performance sensitive): (yes / no / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / no / don't know) - The S3 file system connector: (yes / no / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / no) - If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9956: [FLINK-13848][runtime] Enriches RpcEndpoint.MainThreadExecutor by supporting periodic scheduling
flinkbot edited a comment on issue #9956: [FLINK-13848][runtime] Enriches RpcEndpoint.MainThreadExecutor by supporting periodic scheduling URL: https://github.com/apache/flink/pull/9956#issuecomment-544504739 ## CI report: * 6e43a1de7dbeb1783c52f0eec8118cf068716229 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/132813574) * d90bf4ce43725beb1d2a5fad64ec56b12ee9f9b7 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/132927656) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (FLINK-14472) Implement back-pressure monitor with non-blocking outputs
[ https://issues.apache.org/jira/browse/FLINK-14472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956654#comment-16956654 ] zhijiang commented on FLINK-14472: -- Thanks for your attention to this issue. You are right that some known scenarios are not covered by the existing back-pressure monitor. Although the motivation of this ticket is not to solve that limitation, I think we might solve it as well while implementing the new monitoring approach. The current monitoring approach is heavyweight and fragile, and it also needs to understand the implementation of `LocalBufferPool`, which is bad design. I proposed providing a transparent method in `BufferProvider` to indicate whether it is back-pressured or not; the monitor caller would then rely on this method to get the back-pressure result. There would be no need to analyze specific thread stacks inside the monitor tracker or to understand the implementation of `BufferProvider`. It also has the benefit that the REST call only carries lightweight info. > Implement back-pressure monitor with non-blocking outputs > - > > Key: FLINK-14472 > URL: https://issues.apache.org/jira/browse/FLINK-14472 > Project: Flink > Issue Type: Task > Components: Runtime / Network >Reporter: zhijiang >Assignee: Yingjie Cao >Priority: Minor > Fix For: 1.10.0 > > > Currently the back-pressure monitor relies on detecting task threads that are > stuck in `requestBufferBuilderBlocking`. There are actually two cases that > cause back pressure at the moment: > * There are no available buffers in `LocalBufferPool` and all the given > quotas from the global pool are also exhausted. Then we need to wait for buffers > to be recycled to `LocalBufferPool`. > * There are no available buffers in `LocalBufferPool`, but the quota has not been > used up. While requesting a buffer from the global pool, the request blocks because > there are no available buffers in the global pool. Then we need to wait for buffers > to be recycled to the global pool. 
> We already implemented the non-blocking output for the first case in > [FLINK-14396|https://issues.apache.org/jira/browse/FLINK-14396], and we > expect the second case done together with adjusting the back-pressure monitor > which could check for `RecordWriter#isAvailable` instead. -- This message was sent by Atlassian Jira (v8.3.4#803005)
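The availability-based monitoring the comment proposes could look roughly like the sketch below (a hypothetical illustration: the `Availability` interface and sampling loop are made-up stand-ins for checking something like `RecordWriter#isAvailable`, not Flink's actual monitor implementation):

```java
// Hypothetical sketch of the proposed lightweight back-pressure check: instead
// of sampling and analyzing task thread stack traces, the monitor repeatedly
// polls an availability flag exposed by the output side.
public class BackPressureSampler {

    // Stand-in for an output that can report whether it is able to accept data,
    // e.g. something like RecordWriter#isAvailable in the proposal.
    public interface Availability {
        boolean isAvailable();
    }

    // Fraction of samples in which the output was unavailable, i.e. back-pressured.
    public static double backPressureRatio(Availability output, int samples) {
        int unavailable = 0;
        for (int i = 0; i < samples; i++) {
            if (!output.isAvailable()) {
                unavailable++;
            }
        }
        return (double) unavailable / samples;
    }
}
```

The appeal of this design is that the monitor only calls a cheap boolean method and the REST response only needs to carry a ratio, rather than thread dumps that must be matched against `LocalBufferPool` internals.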
[GitHub] [flink] flinkbot edited a comment on issue #9927: [FLINK-14397][hive] Failed to run Hive UDTF with array arguments
flinkbot edited a comment on issue #9927: [FLINK-14397][hive] Failed to run Hive UDTF with array arguments URL: https://github.com/apache/flink/pull/9927#issuecomment-543180251 ## CI report: * 14b59100e851fa7494838f1b5c68b4dc732da590 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/132339165) * 9293184a08efd0c075be4fa67c02170894287099 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132493044) * 995c2d239889a09bffd87f228b222434cb052820 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132753822) * 5543d30d6d5f3605783dce9070b3dc42de11ffd2 : UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9919: [FLINK-13303][hive]add hive e2e connector test
flinkbot edited a comment on issue #9919: [FLINK-13303][hive]add hive e2e connector test URL: https://github.com/apache/flink/pull/9919#issuecomment-542991773 ## CI report: * a7539d69ef1d149aaf1b6a749a04ca94b16ef6d2 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132269780) * f8ded75e3aa2efbc305d52fabb3a71d71ad35080 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132514120) * 1097b5e17cca1f18f46f6ddacc7232b02846911d : UNKNOWN * 72827e1766dd28f66b72777fb8ad393cb1c0feef : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132746507) * bd109e2fa708a7de0f5939c8e508b2032c7b97e3 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132760419) * c3c9f1fb27880820af3095581171e3e4aa19a110 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/132817229) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition
flinkbot edited a comment on issue #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition URL: https://github.com/apache/flink/pull/9907#issuecomment-542229982 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit cebed5831381b7bc00f3a56244f8123099f0b5bb (Tue Oct 22 03:37:00 UTC 2019) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] hequn8128 commented on a change in pull request #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition
hequn8128 commented on a change in pull request #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition URL: https://github.com/apache/flink/pull/9907#discussion_r337315726 ## File path: flink-table/flink-table-planner-blink/src/main/scala/org/apache/flink/table/planner/plan/rules/logical/PythonCalcSplitRule.scala
[GitHub] [flink] hequn8128 commented on a change in pull request #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition
hequn8128 commented on a change in pull request #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition URL: https://github.com/apache/flink/pull/9907#discussion_r337000507 ## File path: flink-table/flink-table-planner-blink/src/main/scala/org/apache/flink/table/planner/plan/rules/logical/PythonCalcSplitRule.scala
[GitHub] [flink] hequn8128 commented on a change in pull request #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition
hequn8128 commented on a change in pull request #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition URL: https://github.com/apache/flink/pull/9907#discussion_r336853308 ## File path: flink-table/flink-table-planner-blink/src/main/scala/org/apache/flink/table/planner/plan/rules/logical/PythonCalcSplitRule.scala
[GitHub] [flink] hequn8128 commented on a change in pull request #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition
hequn8128 commented on a change in pull request #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition URL: https://github.com/apache/flink/pull/9907#discussion_r336828383 ## File path: flink-table/flink-table-planner-blink/src/main/scala/org/apache/flink/table/planner/plan/rules/logical/PythonCalcSplitRule.scala ## @@ -0,0 +1,321 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.table.planner.plan.rules.logical + +import org.apache.calcite.plan.RelOptRule.{any, operand} +import org.apache.calcite.plan.{RelOptRule, RelOptRuleCall} +import org.apache.calcite.rex.{RexCall, RexInputRef, RexNode, RexProgram} +import org.apache.calcite.sql.validate.SqlValidatorUtil +import org.apache.flink.table.functions.FunctionLanguage +import org.apache.flink.table.functions.ScalarFunction +import org.apache.flink.table.planner.functions.utils.ScalarSqlFunction +import org.apache.flink.table.planner.plan.nodes.logical.FlinkLogicalCalc +import org.apache.flink.table.planner.plan.utils.PythonUtil.containsFunctionOf +import org.apache.flink.table.planner.plan.utils.{InputRefVisitor, RexDefaultVisitor} + +import scala.collection.JavaConverters._ +import scala.collection.JavaConversions._ +import scala.collection.mutable + +/** + * Rule that splits [[FlinkLogicalCalc]] into multiple [[FlinkLogicalCalc]]s. + * This is to ensure that the Python [[ScalarFunction]]s which could be + * executed in a batch are grouped into the same [[FlinkLogicalCalc]] node. 
+ */ +abstract class PythonCalcSplitRuleBase(description: String) + extends RelOptRule( +operand(classOf[FlinkLogicalCalc], any), +description) { + + override def onMatch(call: RelOptRuleCall): Unit = { +val calc: FlinkLogicalCalc = call.rel(0).asInstanceOf[FlinkLogicalCalc] +val input = calc.getInput +val rexBuilder = call.builder().getRexBuilder +val program = calc.getProgram +val extractedRexCalls = new mutable.ArrayBuffer[RexCall]() + +val extractedFunctionOffset = input.getRowType.getFieldCount +val splitter = new ScalarFunctionSplitter( + extractedFunctionOffset, + extractedRexCalls, + convertPythonFunction(program)) + +val (bottomCalcCondition, topCalcCondition, topCalcProjects) = split(program, splitter) +val accessedFields = extractRefInputFields( + topCalcProjects, topCalcCondition, extractedFunctionOffset) + +val bottomCalcProjects = + accessedFields.map(RexInputRef.of(_, input.getRowType)) ++ extractedRexCalls +val bottomCalcFieldNames = SqlValidatorUtil.uniquify( + accessedFields.map(i => input.getRowType.getFieldNames.get(i)).toSeq ++ +extractedRexCalls.indices.map("f" + _), + rexBuilder.getTypeFactory.getTypeSystem.isSchemaCaseSensitive) + +val bottomCalc = new FlinkLogicalCalc( + calc.getCluster, + calc.getTraitSet, + input, + RexProgram.create( +input.getRowType, +bottomCalcProjects.toList, +bottomCalcCondition.orNull, +bottomCalcFieldNames, +rexBuilder)) + +val inputRewriter = new ExtractedFunctionInputRewriter(extractedFunctionOffset, accessedFields) +val topCalc = new FlinkLogicalCalc( + calc.getCluster, + calc.getTraitSet, + bottomCalc, + RexProgram.create( +bottomCalc.getRowType, +topCalcProjects.map(_.accept(inputRewriter)), +topCalcCondition.map(_.accept(inputRewriter)).orNull, +calc.getRowType, +rexBuilder)) + +call.transformTo(topCalc) + } + + /** +* Extracts the indices of the input fields referred by the specified projects and condition. 
+*/ + private def extractRefInputFields( + projects: Seq[RexNode], + condition: Option[RexNode], + inputFieldsCount: Int): Array[Int] = { +val visitor = new InputRefVisitor + +// extract referenced input fields from projections +projects.foreach(exp => exp.accept(visitor)) + +// extract referenced input fields from condition +condition.foreach(_.accept(visitor)) + +// fields of indexes greater than inputFieldsCount is the extracted functions and +// should be fil
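The `extractRefInputFields` method quoted above walks the projections and the optional condition with an `InputRefVisitor`, collecting the indices of the input fields they reference; indices at or beyond `inputFieldsCount` point at the extracted Python function results and are excluded. A minimal standalone sketch of the same idea, with hypothetical stand-ins (`Expr`, `InputRef`, `Call`) for Calcite's `RexNode`, `RexInputRef`, and `RexCall` — this is an illustration, not the Flink planner API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Sketch of the "collect referenced input fields" idea from extractRefInputFields.
// Expr, InputRef and Call are hypothetical stand-ins for Calcite's RexNode types.
public class InputRefCollector {
    public interface Expr {}

    public static final class InputRef implements Expr {
        final int index;
        public InputRef(int index) { this.index = index; }
    }

    public static final class Call implements Expr {
        final List<Expr> operands;
        public Call(Expr... ops) { this.operands = List.of(ops); }
    }

    /**
     * Collects the distinct, sorted indices of InputRefs below inputFieldsCount;
     * refs at or above that offset refer to extracted functions and are skipped.
     */
    public static int[] collectInputRefs(List<Expr> projects, Expr condition, int inputFieldsCount) {
        TreeSet<Integer> refs = new TreeSet<>();
        List<Expr> roots = new ArrayList<>(projects);
        if (condition != null) {
            roots.add(condition);
        }
        for (Expr root : roots) {
            visit(root, refs);
        }
        return refs.stream()
                .filter(i -> i < inputFieldsCount)
                .mapToInt(Integer::intValue)
                .toArray();
    }

    // Recursive visitor: record input refs, descend into call operands.
    private static void visit(Expr e, TreeSet<Integer> refs) {
        if (e instanceof InputRef r) {
            refs.add(r.index);
        } else if (e instanceof Call c) {
            c.operands.forEach(op -> visit(op, refs));
        }
    }

    public static void main(String[] args) {
        // Projects reference fields 0 and 2; the condition references field 1.
        // Ref 5 is beyond inputFieldsCount = 4 (an extracted function) and is dropped.
        List<Expr> projects = List.<Expr>of(
                new InputRef(0),
                new Call(new InputRef(2), new InputRef(5)));
        int[] fields = collectInputRefs(projects, new InputRef(1), 4);
        System.out.println(java.util.Arrays.toString(fields)); // [0, 1, 2]
    }
}
```

In the rule itself, the surviving indices become the pass-through projections of the bottom calc, while the top calc's expressions are rewritten against the new, compacted field layout.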
[GitHub] [flink] hequn8128 commented on a change in pull request #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition
hequn8128 commented on a change in pull request #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition URL: https://github.com/apache/flink/pull/9907#discussion_r336816564 ## File path: flink-table/flink-table-planner-blink/src/main/scala/org/apache/flink/table/planner/plan/nodes/physical/batch/BatchExecPythonCalc.scala ## @@ -53,17 +52,15 @@ class BatchExecPythonCalc( override protected def translateToPlanInternal(planner: BatchPlanner): Transformation[BaseRow] = { val inputTransform = getInputNodes.get(0).translateToPlan(planner) .asInstanceOf[Transformation[BaseRow]] -val config = planner.getTableConfig -val ctx = CodeGeneratorContext(config) -createOneInputTransformation( +val ret = createPythonOneInputTransformation( inputTransform, - inputsContainSingleton = false, calcProgram, - getRelDetailedDescription, - config, - ctx, - cluster, - getRowType, -"BatchExecCalc") + "BatchExecCalc") + +if (inputsContainSingleton()) { + ret.setParallelism(1) + ret.setMaxParallelism(1) +} Review comment: We don't need this for batch. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] hequn8128 commented on a change in pull request #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition
hequn8128 commented on a change in pull request #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition URL: https://github.com/apache/flink/pull/9907#discussion_r337309944 ## File path: flink-table/flink-table-planner-blink/src/main/scala/org/apache/flink/table/planner/plan/rules/logical/PythonCalcSplitRule.scala ## @@ -0,0 +1,321 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.table.planner.plan.rules.logical + +import org.apache.calcite.plan.RelOptRule.{any, operand} +import org.apache.calcite.plan.{RelOptRule, RelOptRuleCall} +import org.apache.calcite.rex.{RexCall, RexInputRef, RexNode, RexProgram} +import org.apache.calcite.sql.validate.SqlValidatorUtil +import org.apache.flink.table.functions.FunctionLanguage +import org.apache.flink.table.functions.ScalarFunction +import org.apache.flink.table.planner.functions.utils.ScalarSqlFunction +import org.apache.flink.table.planner.plan.nodes.logical.FlinkLogicalCalc +import org.apache.flink.table.planner.plan.utils.PythonUtil.containsFunctionOf +import org.apache.flink.table.planner.plan.utils.{InputRefVisitor, RexDefaultVisitor} + +import scala.collection.JavaConverters._ +import scala.collection.JavaConversions._ +import scala.collection.mutable + +/** + * Rule that splits [[FlinkLogicalCalc]] into multiple [[FlinkLogicalCalc]]s. + * This is to ensure that the Python [[ScalarFunction]]s which could be + * executed in a batch are grouped into the same [[FlinkLogicalCalc]] node. 
+ */ +abstract class PythonCalcSplitRuleBase(description: String) + extends RelOptRule( +operand(classOf[FlinkLogicalCalc], any), +description) { + + override def onMatch(call: RelOptRuleCall): Unit = { +val calc: FlinkLogicalCalc = call.rel(0).asInstanceOf[FlinkLogicalCalc] +val input = calc.getInput +val rexBuilder = call.builder().getRexBuilder +val program = calc.getProgram +val extractedRexCalls = new mutable.ArrayBuffer[RexCall]() + +val extractedFunctionOffset = input.getRowType.getFieldCount +val splitter = new ScalarFunctionSplitter( + extractedFunctionOffset, + extractedRexCalls, + convertPythonFunction(program)) + +val (bottomCalcCondition, topCalcCondition, topCalcProjects) = split(program, splitter) +val accessedFields = extractRefInputFields( + topCalcProjects, topCalcCondition, extractedFunctionOffset) + +val bottomCalcProjects = + accessedFields.map(RexInputRef.of(_, input.getRowType)) ++ extractedRexCalls +val bottomCalcFieldNames = SqlValidatorUtil.uniquify( + accessedFields.map(i => input.getRowType.getFieldNames.get(i)).toSeq ++ +extractedRexCalls.indices.map("f" + _), + rexBuilder.getTypeFactory.getTypeSystem.isSchemaCaseSensitive) + +val bottomCalc = new FlinkLogicalCalc( + calc.getCluster, + calc.getTraitSet, + input, + RexProgram.create( +input.getRowType, +bottomCalcProjects.toList, +bottomCalcCondition.orNull, +bottomCalcFieldNames, +rexBuilder)) + +val inputRewriter = new ExtractedFunctionInputRewriter(extractedFunctionOffset, accessedFields) +val topCalc = new FlinkLogicalCalc( + calc.getCluster, + calc.getTraitSet, + bottomCalc, + RexProgram.create( +bottomCalc.getRowType, +topCalcProjects.map(_.accept(inputRewriter)), +topCalcCondition.map(_.accept(inputRewriter)).orNull, +calc.getRowType, +rexBuilder)) + +call.transformTo(topCalc) + } + + /** +* Extracts the indices of the input fields referred by the specified projects and condition. 
+*/ + private def extractRefInputFields( + projects: Seq[RexNode], + condition: Option[RexNode], + inputFieldsCount: Int): Array[Int] = { +val visitor = new InputRefVisitor + +// extract referenced input fields from projections +projects.foreach(exp => exp.accept(visitor)) + +// extract referenced input fields from condition +condition.foreach(_.accept(visitor)) + +// fields of indexes greater than inputFieldsCount is the extracted functions and +// should be fil
[GitHub] [flink] hequn8128 commented on a change in pull request #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition
hequn8128 commented on a change in pull request #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition URL: https://github.com/apache/flink/pull/9907#discussion_r337311182 ## File path: flink-table/flink-table-planner-blink/src/main/scala/org/apache/flink/table/planner/plan/rules/logical/PythonCalcSplitRule.scala ## @@ -0,0 +1,321 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.table.planner.plan.rules.logical + +import org.apache.calcite.plan.RelOptRule.{any, operand} +import org.apache.calcite.plan.{RelOptRule, RelOptRuleCall} +import org.apache.calcite.rex.{RexCall, RexInputRef, RexNode, RexProgram} +import org.apache.calcite.sql.validate.SqlValidatorUtil +import org.apache.flink.table.functions.FunctionLanguage +import org.apache.flink.table.functions.ScalarFunction +import org.apache.flink.table.planner.functions.utils.ScalarSqlFunction +import org.apache.flink.table.planner.plan.nodes.logical.FlinkLogicalCalc +import org.apache.flink.table.planner.plan.utils.PythonUtil.containsFunctionOf +import org.apache.flink.table.planner.plan.utils.{InputRefVisitor, RexDefaultVisitor} + +import scala.collection.JavaConverters._ +import scala.collection.JavaConversions._ +import scala.collection.mutable + +/** + * Rule that splits [[FlinkLogicalCalc]] into multiple [[FlinkLogicalCalc]]s. + * This is to ensure that the Python [[ScalarFunction]]s which could be + * executed in a batch are grouped into the same [[FlinkLogicalCalc]] node. 
+ */ +abstract class PythonCalcSplitRuleBase(description: String) + extends RelOptRule( +operand(classOf[FlinkLogicalCalc], any), +description) { + + override def onMatch(call: RelOptRuleCall): Unit = { +val calc: FlinkLogicalCalc = call.rel(0).asInstanceOf[FlinkLogicalCalc] +val input = calc.getInput +val rexBuilder = call.builder().getRexBuilder +val program = calc.getProgram +val extractedRexCalls = new mutable.ArrayBuffer[RexCall]() + +val extractedFunctionOffset = input.getRowType.getFieldCount +val splitter = new ScalarFunctionSplitter( + extractedFunctionOffset, + extractedRexCalls, + convertPythonFunction(program)) + +val (bottomCalcCondition, topCalcCondition, topCalcProjects) = split(program, splitter) +val accessedFields = extractRefInputFields( + topCalcProjects, topCalcCondition, extractedFunctionOffset) + +val bottomCalcProjects = + accessedFields.map(RexInputRef.of(_, input.getRowType)) ++ extractedRexCalls +val bottomCalcFieldNames = SqlValidatorUtil.uniquify( + accessedFields.map(i => input.getRowType.getFieldNames.get(i)).toSeq ++ +extractedRexCalls.indices.map("f" + _), + rexBuilder.getTypeFactory.getTypeSystem.isSchemaCaseSensitive) + +val bottomCalc = new FlinkLogicalCalc( + calc.getCluster, + calc.getTraitSet, + input, + RexProgram.create( +input.getRowType, +bottomCalcProjects.toList, +bottomCalcCondition.orNull, +bottomCalcFieldNames, +rexBuilder)) + +val inputRewriter = new ExtractedFunctionInputRewriter(extractedFunctionOffset, accessedFields) +val topCalc = new FlinkLogicalCalc( + calc.getCluster, + calc.getTraitSet, + bottomCalc, + RexProgram.create( +bottomCalc.getRowType, +topCalcProjects.map(_.accept(inputRewriter)), +topCalcCondition.map(_.accept(inputRewriter)).orNull, +calc.getRowType, +rexBuilder)) + +call.transformTo(topCalc) + } + + /** +* Extracts the indices of the input fields referred by the specified projects and condition. 
+*/ + private def extractRefInputFields( + projects: Seq[RexNode], + condition: Option[RexNode], + inputFieldsCount: Int): Array[Int] = { +val visitor = new InputRefVisitor + +// extract referenced input fields from projections +projects.foreach(exp => exp.accept(visitor)) + +// extract referenced input fields from condition +condition.foreach(_.accept(visitor)) + +// fields of indexes greater than inputFieldsCount is the extracted functions and +// should be fil
[GitHub] [flink] hequn8128 commented on a change in pull request #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition
hequn8128 commented on a change in pull request #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition URL: https://github.com/apache/flink/pull/9907#discussion_r336816648 ## File path: flink-table/flink-table-planner-blink/src/main/scala/org/apache/flink/table/planner/plan/nodes/physical/batch/BatchExecPythonCalc.scala ## @@ -53,17 +52,15 @@ class BatchExecPythonCalc( override protected def translateToPlanInternal(planner: BatchPlanner): Transformation[BaseRow] = { val inputTransform = getInputNodes.get(0).translateToPlan(planner) .asInstanceOf[Transformation[BaseRow]] -val config = planner.getTableConfig -val ctx = CodeGeneratorContext(config) -createOneInputTransformation( +val ret = createPythonOneInputTransformation( inputTransform, - inputsContainSingleton = false, calcProgram, - getRelDetailedDescription, - config, - ctx, - cluster, - getRowType, -"BatchExecCalc") + "BatchExecCalc") Review comment: BatchExecPythonCalc ? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9905: [FLINK-14396][network] Implement rudimentary non-blocking network output
flinkbot edited a comment on issue #9905: [FLINK-14396][network] Implement rudimentary non-blocking network output URL: https://github.com/apache/flink/pull/9905#issuecomment-542178315 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 510f2cc5b515d80e0e316e68adb6fa4960303536 (Tue Oct 22 03:33:58 UTC 2019) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required. Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] zhijiangW commented on a change in pull request #9905: [FLINK-14396][network] Implement rudimentary non-blocking network output
zhijiangW commented on a change in pull request #9905: [FLINK-14396][network] Implement rudimentary non-blocking network output URL: https://github.com/apache/flink/pull/9905#discussion_r337320185 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/consumer/UnionInputGate.java ## @@ -118,7 +118,7 @@ public UnionInputGate(InputGate... inputGates) { } if (!inputGatesWithData.isEmpty()) { - isAvailable = AVAILABLE; + availabilityHelper.getUnavailableToResetAvailable(); Review comment: Yes, that makes sense. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
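The change under review in `UnionInputGate` replaces a direct `isAvailable = AVAILABLE` assignment with a call on an availability helper, i.e. a small object wrapping a `CompletableFuture` that consumers wait on and producers complete or reset. A rough, hypothetical Java sketch of that pattern — the names here are illustrative and not Flink's actual `AvailabilityHelper` API:

```java
import java.util.concurrent.CompletableFuture;

// Sketch of an availability helper: consumers poll isAvailable() or wait on
// getAvailableFuture(); the data producer flips the state. Names are illustrative.
public class AvailabilityHelper {
    // Shared, already-completed future used while data is available.
    private static final CompletableFuture<Void> AVAILABLE =
            CompletableFuture.completedFuture(null);

    private CompletableFuture<Void> availableFuture = new CompletableFuture<>();

    /** Returns a future that completes once data becomes available. */
    public synchronized CompletableFuture<Void> getAvailableFuture() {
        return availableFuture;
    }

    public synchronized boolean isAvailable() {
        return availableFuture.isDone();
    }

    /** Marks the gate available: wakes pending waiters, installs the completed future. */
    public synchronized void markAvailable() {
        if (!availableFuture.isDone()) {
            CompletableFuture<Void> toComplete = availableFuture;
            availableFuture = AVAILABLE;
            toComplete.complete(null);
        }
    }

    /** Marks the gate unavailable again: installs a fresh, uncompleted future. */
    public synchronized void resetUnavailable() {
        if (availableFuture.isDone()) {
            availableFuture = new CompletableFuture<>();
        }
    }
}
```

With such a helper, the constructor path discussed above can mark the union gate available as soon as any member gate already has buffered data, instead of assigning the shared constant directly.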
[GitHub] [flink] flinkbot edited a comment on issue #9927: [FLINK-14397][hive] Failed to run Hive UDTF with array arguments
flinkbot edited a comment on issue #9927: [FLINK-14397][hive] Failed to run Hive UDTF with array arguments URL: https://github.com/apache/flink/pull/9927#issuecomment-543172555 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 5543d30d6d5f3605783dce9070b3dc42de11ffd2 (Tue Oct 22 03:31:55 UTC 2019) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required. Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] lirui-apache commented on issue #9927: [FLINK-14397][hive] Failed to run Hive UDTF with array arguments
lirui-apache commented on issue #9927: [FLINK-14397][hive] Failed to run Hive UDTF with array arguments URL: https://github.com/apache/flink/pull/9927#issuecomment-544792010 @bowenli86 PR updated to address your comments This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9963: [FLINK-14388][python][table-planner][table-planner-blink] Support common data types in Python user-defined functions
flinkbot edited a comment on issue #9963: [FLINK-14388][python][table-planner][table-planner-blink] Support common data types in Python user-defined functions URL: https://github.com/apache/flink/pull/9963#issuecomment-544787904 ## CI report: * fe74a771e3f928774398e6cbeccf20b69dc76a58 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/132929165) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9912: [FLINK-14370][kafka][test-stability] Fix the cascading failure in KafkaProducerTestBase.
flinkbot edited a comment on issue #9912: [FLINK-14370][kafka][test-stability] Fix the cascading failure in KafkaProducerTestBase. URL: https://github.com/apache/flink/pull/9912#issuecomment-542713850 ## CI report: * c1fe0487729f91e54bb58b2ed0913dda8d00b5f6 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/132165623) * d3faff4e2b44b01a43d3716ad2eb1f4cd2c4b054 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/132351880) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9927: [FLINK-14397][hive] Failed to run Hive UDTF with array arguments
flinkbot edited a comment on issue #9927: [FLINK-14397][hive] Failed to run Hive UDTF with array arguments URL: https://github.com/apache/flink/pull/9927#issuecomment-543172555 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 995c2d239889a09bffd87f228b222434cb052820 (Tue Oct 22 03:07:28 UTC 2019) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required. Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] lirui-apache commented on a change in pull request #9927: [FLINK-14397][hive] Failed to run Hive UDTF with array arguments
lirui-apache commented on a change in pull request #9927: [FLINK-14397][hive] Failed to run Hive UDTF with array arguments URL: https://github.com/apache/flink/pull/9927#discussion_r337316929 ## File path: flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/functions/hive/conversion/HiveInspectors.java ## @@ -450,4 +454,56 @@ private static ObjectInspector getObjectInspector(TypeInfo type) { public static DataType toFlinkType(ObjectInspector inspector) { return HiveTypeUtil.toFlinkType(TypeInfoUtils.getTypeInfoFromTypeString(inspector.getTypeName())); } + + // given a Hive ObjectInspector, get the class for corresponding Flink object + private static Class classForObjectInspector(ObjectInspector inspector) { + switch (inspector.getCategory()) { + case PRIMITIVE: { + PrimitiveObjectInspector primitiveOI = (PrimitiveObjectInspector) inspector; + switch (primitiveOI.getPrimitiveCategory()) { + case STRING: + case CHAR: + case VARCHAR: + return String.class; + case INT: + return Integer.class; + case LONG: + return Long.class; + case BYTE: + return Byte.class; + case SHORT: + return Short.class; + case FLOAT: + return Float.class; + case DOUBLE: + return Double.class; + case DECIMAL: + return BigDecimal.class; + case BOOLEAN: + return Boolean.class; + case BINARY: + return byte[].class; + case DATE: + return Date.class; + case TIMESTAMP: Review comment: Not sure whether we have to disable it in the util methods. It's a known limitation documented [here](https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/table/hive/#limitations). So users shouldn't be using it in the first place. Also remembered we already have a JIRA for the bridged types: https://issues.apache.org/jira/browse/FLINK-13438 and we can continue the work there. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot commented on issue #9963: [FLINK-14388][python][table-planner][table-planner-blink] Support common data types in Python user-defined functions
flinkbot commented on issue #9963: [FLINK-14388][python][table-planner][table-planner-blink] Support common data types in Python user-defined functions URL: https://github.com/apache/flink/pull/9963#issuecomment-544787904 ## CI report: * fe74a771e3f928774398e6cbeccf20b69dc76a58 : UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services