[jira] [Created] (FLINK-18726) Support INSERT INTO specific columns
Caizhi Weng created FLINK-18726:
-----------------------------------

Summary: Support INSERT INTO specific columns
Key: FLINK-18726
URL: https://issues.apache.org/jira/browse/FLINK-18726
Project: Flink
Issue Type: Improvement
Components: Table SQL / Planner
Reporter: Caizhi Weng

Currently Flink only supports INSERT INTO a table without specifying columns, while most database systems support inserting into specific columns via
{code:sql}
INSERT INTO table_name(column1, column2, ...) ...
{code}
Columns that are not specified will be filled with their default values, or with {{NULL}} if no default value was given when creating the table. As Flink currently does not support default values when creating tables, we can fill the unspecified columns with {{NULL}} and throw an exception if there are columns with {{NOT NULL}} constraints.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
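The proposed NULL-filling semantics can be sketched outside of Flink as a minimal illustration. This is not Flink code; all class and method names here are made up for the example:

```java
import java.util.*;

// Illustrative sketch only: expand an INSERT INTO t(col, ...) row to the full
// table schema, filling unspecified columns with NULL and rejecting NOT NULL
// violations, as the proposal above describes.
class InsertIntoColumns {
    static List<Object> expandRow(List<String> schema, Set<String> notNull,
                                  List<String> targetCols, List<Object> values) {
        if (targetCols.size() != values.size()) {
            throw new IllegalArgumentException("column/value count mismatch");
        }
        Map<String, Object> given = new HashMap<>();
        for (int i = 0; i < targetCols.size(); i++) {
            given.put(targetCols.get(i), values.get(i));
        }
        List<Object> row = new ArrayList<>();
        for (String col : schema) {
            Object v = given.get(col); // unspecified columns become NULL
            if (v == null && notNull.contains(col)) {
                throw new IllegalStateException(
                        "column '" + col + "' is NOT NULL but no value was provided");
            }
            row.add(v);
        }
        return row;
    }
}
```

For a table (a, b, c) with a NOT NULL, `INSERT INTO t(a, c) VALUES (1, 3)` would produce the row (1, NULL, 3), while omitting column a would fail.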
[GitHub] [flink] klion26 commented on a change in pull request #12665: [FLINK-17886][docs-zh] Update Chinese documentation for new Watermark…
klion26 commented on a change in pull request #12665:
URL: https://github.com/apache/flink/pull/12665#discussion_r460654064

## File path: docs/dev/event_timestamp_extractors.zh.md

@@ -25,83 +25,51 @@ under the License.
 * toc
 {:toc}

-As described in [timestamps and watermark handling]({{ site.baseurl }}/dev/event_timestamps_watermarks.html),
-Flink provides abstractions that allow the programmer to assign their own timestamps and emit their own watermarks. More specifically,
-one can do so by implementing one of the `AssignerWithPeriodicWatermarks` and `AssignerWithPunctuatedWatermarks` interfaces, depending
-on the use case. In a nutshell, the first will emit watermarks periodically, while the second does so based on some property of
-the incoming records, e.g. whenever a special element is encountered in the stream.
+如[生成 Watermark]({%link dev/event_timestamps_watermarks.zh.md %}) 小节中所述,Flink 提供的抽象方法可以允许用户自己去定义时间戳分配方式和 watermark 生成的方式。你可以通过实现 `WatermarkGenerator` 接口来实现上述功能。

-In order to further ease the programming effort for such tasks, Flink comes with some pre-implemented timestamp assigners.
-This section provides a list of them. Apart from their out-of-the-box functionality, their implementation can serve as an example
-for custom implementations.
+为了进一步简化此类任务的编程工作,Flink 框架预设了一些时间戳分配器。本节后续内容有举例。除了开箱即用的已有实现外,其还可以作为自定义实现的示例以供参考。

-### **Assigners with ascending timestamps**
-The simplest special case for *periodic* watermark generation is the case where timestamps seen by a given source task
-occur in ascending order. In that case, the current timestamp can always act as a watermark, because no earlier timestamps will
-arrive.
+## 单调递增时间戳分配器

-Note that it is only necessary that timestamps are ascending *per parallel data source task*. For example, if
-in a specific setup one Kafka partition is read by one parallel data source instance, then it is only necessary that
-timestamps are ascending within each Kafka partition. Flink's watermark merging mechanism will generate correct
-watermarks whenever parallel streams are shuffled, unioned, connected, or merged.
+*周期性* watermark 生成方式的一个最简单特例就是你给定的数据源中数据的时间戳升序出现。在这种情况下,当前时间戳就可以充当 watermark,因为后续到达数据的时间戳不会比当前的小。
+
+注意:在 Flink 应用程序中,如果是并行数据源,则只要求并行数据源中的每个*单分区数据源任务*时间戳递增。例如,设置每一个并行数据源实例都只读取一个 Kafka 分区,则时间戳只需在每个 Kafka 分区内递增即可。Flink 的 watermark 合并机制会在并行数据流进行分发(shuffle)、联合(union)、连接(connect)或合并(merge)时生成正确的 watermark。

 {% highlight java %}
-DataStream stream = ...
-
-DataStream withTimestampsAndWatermarks =
-stream.assignTimestampsAndWatermarks(new AscendingTimestampExtractor() {
-
-@Override
-public long extractAscendingTimestamp(MyEvent element) {
-return element.getCreationTime();
-}
-});
+WatermarkStrategies

Review comment: The code in the English version appears to be:
```
WatermarkStrategy.forMonotonousTimestamps();
```

## File path: docs/dev/event_timestamps_watermarks.zh.md

@@ -175,200 +140,250 @@ withTimestampsAndWatermarks
+使用 `WatermarkStrategy` 去获取流并生成带有时间戳的元素和 watermark 的新流时,如果原始流已经具有时间戳或 watermark,则新指定的时间戳分配器将覆盖原有的时间戳和 watermark。

-**With Periodic Watermarks**
-`AssignerWithPeriodicWatermarks` assigns timestamps and generates watermarks periodically (possibly depending
-on the stream elements, or purely based on processing time).
+## 处理空闲数据源

-The interval (every *n* milliseconds) in which the watermark will be generated is defined via
-`ExecutionConfig.setAutoWatermarkInterval(...)`. The assigner's `getCurrentWatermark()` method will be
-called each time, and a new watermark will be emitted if the returned watermark is non-null and larger than the previous
-watermark.
+如果数据源中的某一个分区/分片在一段时间内未发送事件数据,则意味着 `WatermarkGenerator` 也不会获得任何新数据去生成 watermark。我们称这类数据源为*空闲输入*或*空闲源*。在这种情况下,当某些其他分区仍然发送事件数据的时候就会出现问题。由于下游算子 watermark 的计算方式是取所有不同的上游并行数据源 watermark 的最小值,则其 watermark 将不会发生变化。

-Here we show two simple examples of timestamp assigners that use periodic watermark generation. Note that Flink ships with a `BoundedOutOfOrdernessTimestampExtractor` similar to the `BoundedOutOfOrdernessGenerator` shown below, which you can read about [here]({{ site.baseurl }}/dev/event_timestamp_extractors.html#assigners-allowing-a-fixed-amount-of-lateness).
+为了解决这个问题,你可以使用 `WatermarkStrategy` 来检测空闲输入并将其标记为空闲状态。`WatermarkStrategies` 为此提供了一个工具接口:

Review comment: It looks like FLINK-18011 (https://github.com/apache/flink/pull/12412) removed the `WatermarkStrategies`-related content, so this part needs to be updated again (and please also update the other related places in this document). Sorry that I did not notice this problem earlier.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
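As background for the idle-sources section discussed in the review above: the downstream watermark is the minimum over all upstream inputs, and marking an input idle removes it from that minimum so it cannot stall progress. A minimal sketch of that rule in plain Java (not Flink's implementation; the class and field names are made up):

```java
// Toy model of downstream watermark computation: the combined watermark is the
// minimum over all non-idle input channels. A channel flagged as idle (e.g. via
// an idleness timeout) is excluded, so it cannot hold the watermark back.
class WatermarkMerger {
    static final long NO_WATERMARK = Long.MIN_VALUE;

    // watermarks[i] is the latest watermark of channel i; idle[i] marks
    // channel i as idle.
    static long merge(long[] watermarks, boolean[] idle) {
        long min = Long.MAX_VALUE;
        boolean anyActive = false;
        for (int i = 0; i < watermarks.length; i++) {
            if (!idle[i]) {
                min = Math.min(min, watermarks[i]);
                anyActive = true;
            }
        }
        // if every channel is idle there is nothing to merge, so the
        // combined watermark simply does not advance here
        return anyActive ? min : NO_WATERMARK;
    }
}
```

With three active channels at watermarks 10, 7, and 12 the merged value is 7; once the slowest channel is marked idle, the merged watermark advances to 10.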
[jira] [Commented] (FLINK-16404) Avoid caching buffers for blocked input channels before barrier alignment
[ https://issues.apache.org/jira/browse/FLINK-16404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165452#comment-17165452 ]

Zhijiang commented on FLINK-16404:
----------------------------------

Yes, it is our intention not to remove this metric from the web UI immediately, as mentioned in the above release note. I created a separate ticket [FLINK-17572|https://issues.apache.org/jira/browse/FLINK-17572] to handle this issue, and the PR has already been submitted by an external contributor. But [~chesnay] had some concerns about breaking the compatibility of the REST API when reviewing the PR.

> Avoid caching buffers for blocked input channels before barrier alignment
> -------------------------------------------------------------------------
>
> Key: FLINK-16404
> URL: https://issues.apache.org/jira/browse/FLINK-16404
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Network
> Reporter: Zhijiang
> Assignee: Yingjie Cao
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.11.0
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> One motivation of this issue is to reduce the in-flight data in the case of back pressure, to speed up checkpointing. The current default number of exclusive buffers per channel is 2. If we reduce it to 0 and increase the floating buffers somewhat as compensation, it might cause a deadlock, because all the floating buffers might be requested by some blocked input channels and never recycled until barrier alignment.
> In order to solve the above deadlock concern, we can make some logic changes on both the sender and receiver sides.
> * Sender side: it should revoke previously received credit after sending the checkpoint barrier; that means it would not send any following buffers until receiving new credit.
> * Receiver side: the respective channel releases the requested floating buffers if a barrier is received from the network. After barrier alignment, it would request floating buffers for the channels with a positive backlog and notify the sender side of the available credit. Then the sender can continue transporting the buffers.
> Based on the above changes, we can also remove the `BufferStorage` component completely, because the receiver would never read buffers for blocked channels. Another possible benefit is that the floating buffers might be put to better use before barrier alignment.
> The only side effect would be a somewhat cold start after barrier alignment: the sender side has to wait for credit feedback to transport data just after alignment, which would impact delay and network throughput. But considering that the checkpoint interval is generally not too short, this side effect can be ignored in practice. We can further verify it via the existing micro-benchmark.
> After this ticket is done, we still cannot set exclusive buffers to zero ATM; there exists another deadlock issue which will be solved separately in another ticket.
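The sender-side and receiver-side changes described above amount to a small credit handshake around the barrier. A heavily simplified toy model in plain Java (not the actual Flink classes; all names are invented for illustration):

```java
// Toy model of the proposed credit handling around a checkpoint barrier:
// the sender revokes all remaining credit once it has sent the barrier, so no
// buffers flow on that channel until the receiver re-announces credit after
// barrier alignment.
class CreditChannel {
    int senderCredits = 2;        // credits the sender may still spend
    boolean barrierSent = false;

    // sender side: a buffer may only go out while credit remains
    boolean trySendBuffer() {
        if (senderCredits <= 0) {
            return false;
        }
        senderCredits--;
        return true;
    }

    // sender side: after the barrier, revoke previously received credit
    void sendBarrier() {
        barrierSent = true;
        senderCredits = 0;
    }

    // receiver side: after alignment, grant credit again for the channel's
    // backlog so transfer can resume
    void announceCreditAfterAlignment(int backlog) {
        barrierSent = false;
        senderCredits = backlog;
    }
}
```

The "cold start" mentioned above is visible in the model: between `sendBarrier()` and `announceCreditAfterAlignment(...)`, `trySendBuffer()` always fails.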
[jira] [Commented] (FLINK-17767) Tumble and Hop window support offset
[ https://issues.apache.org/jira/browse/FLINK-17767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165451#comment-17165451 ]

Jark Wu commented on FLINK-17767:
---------------------------------

CALCITE-4000 is for the TUMBLE TVF, not the TUMBLE window group function. AFAIK, the offset type in the TUMBLE window group function is a time type. For example: {{group by tumble(rowtime, interval '2' hour, time '00:12:00')}}. cc [~danny0405]

> Tumble and Hop window support offset
> ------------------------------------
>
> Key: FLINK-17767
> URL: https://issues.apache.org/jira/browse/FLINK-17767
> Project: Flink
> Issue Type: Improvement
> Components: Table SQL / Planner
> Affects Versions: 1.10.0
> Reporter: hailong wang
> Assignee: hailong wang
> Priority: Major
> Fix For: 1.12.0
>
> TUMBLE window and HOP window with alignment are not supported yet. We can support them by
> (, , )
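For the offset semantics in the example above ({{tumble(rowtime, interval '2' hour, time '00:12:00')}}), the start of the tumbling window that a timestamp falls into can be computed with simple floor arithmetic. A plain-Java sketch over epoch milliseconds (illustrative only, not any specific Flink class):

```java
// Tumbling-window assignment with an offset: window boundaries are multiples
// of `size` shifted by `offset`. All values are epoch milliseconds.
class TumbleOffset {
    static long windowStart(long ts, long size, long offset) {
        // shift into offset-aligned space, floor to the window size, shift back;
        // floorMod keeps the result correct for timestamps before the offset
        long rem = Math.floorMod(ts - offset, size);
        return ts - rem;
    }
}
```

With a 2-hour window (7,200,000 ms) and a 12-minute offset (720,000 ms), windows cover [00:12, 02:12), [02:12, 04:12), and so on, instead of the unaligned [00:00, 02:00).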
[jira] [Commented] (FLINK-18713) Allow default ms unit for table.exec.mini-batch.allow-latency etc.
[ https://issues.apache.org/jira/browse/FLINK-18713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165448#comment-17165448 ]

Jark Wu commented on FLINK-18713:
---------------------------------

Yes. I think we should update them. All of them should be defined as {{ConfigOption}} now. {{ConfigOption}} did not yet support {{Duration}} when we added the above options.

> Allow default ms unit for table.exec.mini-batch.allow-latency etc.
> ------------------------------------------------------------------
>
> Key: FLINK-18713
> URL: https://issues.apache.org/jira/browse/FLINK-18713
> Project: Flink
> Issue Type: Improvement
> Components: Table SQL / Planner
> Affects Versions: 1.11.0
> Reporter: hailong wang
> Priority: Major
> Fix For: 1.12.0
>
> We use `scala.concurrent.duration.Duration.create` to parse timeStr in `TableConfigUtils#getMillisecondFromConfigDuration` for the following properties:
> {code:java}
> table.exec.async-lookup.timeout
> table.exec.source.idle-timeout
> table.exec.mini-batch.allow-latency
> table.exec.emit.early-fire.delay
> table.exec.emit.late-fire.delay{code}
> And it must have a unit.
> I think we can replace it with `TimeUtils.parseDuration(timeStr)` to parse timeStr, just like `DescriptorProperties#getOptionalDuration`, so that it has a default unit of ms and is consistent.
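The default-millisecond behavior proposed above can be sketched as follows. This is a hypothetical helper, not `TimeUtils.parseDuration` itself, and it only handles a few common units for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: parse a duration string into milliseconds, defaulting to
// milliseconds when no unit suffix is given (e.g. "500" == "500 ms").
class DurationParse {
    private static final Pattern P = Pattern.compile("(\\d+)\\s*([a-z]*)");

    static long parseMillis(String s) {
        Matcher m = P.matcher(s.trim().toLowerCase());
        if (!m.matches()) {
            throw new IllegalArgumentException("bad duration: " + s);
        }
        long value = Long.parseLong(m.group(1));
        switch (m.group(2)) {
            case "":               // no unit given: default to milliseconds
            case "ms":  return value;
            case "s":   return value * 1_000L;
            case "min": return value * 60_000L;
            case "h":   return value * 3_600_000L;
            default: throw new IllegalArgumentException("unknown unit in: " + s);
        }
    }
}
```

Under this scheme a bare "500" parses as 500 ms, while "3 s" still parses as 3000 ms, which is the consistency the ticket asks for.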
[GitHub] [flink] flinkbot edited a comment on pull request #12917: [WIP] [FLINK-18355][tests] Simplify tests of SlotPoolImpl
flinkbot edited a comment on pull request #12917:
URL: https://github.com/apache/flink/pull/12917#issuecomment-659940433

## CI report:

* 664fa0ad0056da67af2a8580cf6ebfc82dad9ecc Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=4913)
* 1c5870c57b64436014900d67be2005395e007a52 UNKNOWN

Bot commands
The @flinkbot bot supports the following commands:
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-18122) Kubernetes test fails with "error: timed out waiting for the condition on jobs/flink-job-cluster"
[ https://issues.apache.org/jira/browse/FLINK-18122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165436#comment-17165436 ] Dian Fu commented on FLINK-18122: - Thanks [~fly_in_gis], you are right. Have created a separate JIRA: https://issues.apache.org/jira/browse/FLINK-18725 > Kubernetes test fails with "error: timed out waiting for the condition on > jobs/flink-job-cluster" > - > > Key: FLINK-18122 > URL: https://issues.apache.org/jira/browse/FLINK-18122 > Project: Flink > Issue Type: Bug > Components: Deployment / Kubernetes, Tests >Affects Versions: 1.11.0 >Reporter: Robert Metzger >Priority: Major > Labels: test-stability > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=2697=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=1e2bbe5b-4657-50be-1f07-d84bfce5b1f5 > {code} > 2020-06-04T09:25:40.7205843Z service/flink-job-cluster created > 2020-06-04T09:25:40.9661515Z job.batch/flink-job-cluster created > 2020-06-04T09:25:41.2189123Z deployment.apps/flink-task-manager created > 2020-06-04T10:32:32.6402983Z error: timed out waiting for the condition on > jobs/flink-job-cluster > 2020-06-04T10:32:33.8057757Z error: unable to upgrade connection: container > not found ("flink-task-manager") > 2020-06-04T10:32:33.8111302Z sort: cannot read: > '/home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/temp-test-directory-56335570120/out/kubernetes_wc_out*': > No such file or directory > 2020-06-04T10:32:33.8124455Z FAIL WordCount: Output hash mismatch. Got > d41d8cd98f00b204e9800998ecf8427e, expected e682ec6622b5e83f2eb614617d5ab2cf. 
> 2020-06-04T10:32:33.8125379Z head hexdump of actual: > 2020-06-04T10:32:33.8136133Z head: cannot open > '/home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/temp-test-directory-56335570120/out/kubernetes_wc_out*' > for reading: No such file or directory > 2020-06-04T10:32:33.8344715Z Debugging failed Kubernetes test: > 2020-06-04T10:32:33.8345469Z Currently existing Kubernetes resources > 2020-06-04T10:32:36.4977853Z I0604 10:32:36.497383 13191 request.go:621] > Throttling request took 1.198606989s, request: > GET:https://10.1.0.4:8443/apis/rbac.authorization.k8s.io/v1?timeout=32s > 2020-06-04T10:32:46.6975735Z I0604 10:32:46.697234 13191 request.go:621] > Throttling request took 4.398107353s, request: > GET:https://10.1.0.4:8443/apis/authorization.k8s.io/v1?timeout=32s > 2020-06-04T10:32:57.4978637Z I0604 10:32:57.497209 13191 request.go:621] > Throttling request took 1.198449167s, request: > GET:https://10.1.0.4:8443/apis/apps/v1?timeout=32s > 2020-06-04T10:33:07.4980104Z I0604 10:33:07.497320 13191 request.go:621] > Throttling request took 4.198274438s, request: > GET:https://10.1.0.4:8443/apis/apiextensions.k8s.io/v1?timeout=32s > 2020-06-04T10:33:18.4976060Z I0604 10:33:18.497258 13191 request.go:621] > Throttling request took 1.19871495s, request: > GET:https://10.1.0.4:8443/apis/apps/v1?timeout=32s > 2020-06-04T10:33:28.4979129Z I0604 10:33:28.497276 13191 request.go:621] > Throttling request took 4.198369672s, request: > GET:https://10.1.0.4:8443/apis/rbac.authorization.k8s.io/v1?timeout=32s > 2020-06-04T10:33:30.9182069Z NAME READY > STATUS RESTARTS AGE > 2020-06-04T10:33:30.9184099Z pod/flink-job-cluster-dtb67 0/1 > ErrImageNeverPull 0 67m > 2020-06-04T10:33:30.9184869Z pod/flink-task-manager-74ccc9bd9-psqwm 0/1 > ErrImageNeverPull 0 67m > 2020-06-04T10:33:30.9185226Z > 2020-06-04T10:33:30.9185926Z NAMETYPE > CLUSTER-IP EXTERNAL-IP PORT(S) > AGE > 2020-06-04T10:33:30.9186832Z service/flink-job-cluster NodePort > 10.111.92.199 > 
6123:32501/TCP,6124:31360/TCP,6125:30025/TCP,8081:30081/TCP 67m > 2020-06-04T10:33:30.9187545Z service/kubernetes ClusterIP > 10.96.0.1 443/TCP > 68m > 2020-06-04T10:33:30.9187976Z > 2020-06-04T10:33:30.9188472Z NAME READY > UP-TO-DATE AVAILABLE AGE > 2020-06-04T10:33:30.9189179Z deployment.apps/flink-task-manager 0/1 1 > 0 67m > 2020-06-04T10:33:30.9189508Z > 2020-06-04T10:33:30.9189815Z NAME > DESIRED CURRENT READY AGE > 2020-06-04T10:33:30.9190418Z replicaset.apps/flink-task-manager-74ccc9bd9 1 > 1 0 67m > 2020-06-04T10:33:30.9190662Z > 2020-06-04T10:33:30.9190891Z NAME COMPLETIONS > DURATION AGE > 2020-06-04T10:33:30.9191423Z job.batch/flink-job-cluster 0/1
[jira] [Updated] (FLINK-18725) "Run Kubernetes test" failed with "30025: provided port is already allocated"
[ https://issues.apache.org/jira/browse/FLINK-18725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu updated FLINK-18725: Issue Type: Test (was: Bug) > "Run Kubernetes test" failed with "30025: provided port is already allocated" > - > > Key: FLINK-18725 > URL: https://issues.apache.org/jira/browse/FLINK-18725 > Project: Flink > Issue Type: Test > Components: Deployment / Kubernetes, Tests >Affects Versions: 1.11.0, 1.11.1 >Reporter: Dian Fu >Priority: Major > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=4901=logs=08866332-78f7-59e4-4f7e-49a56faa3179=3e8647c1-5a28-5917-dd93-bf78594ea994 > {code} > The Service "flink-job-cluster" is invalid: spec.ports[2].nodePort: Invalid > value: 30025: provided port is already allocated > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-18725) "Run Kubernetes test" failed with "30025: provided port is already allocated"
Dian Fu created FLINK-18725: --- Summary: "Run Kubernetes test" failed with "30025: provided port is already allocated" Key: FLINK-18725 URL: https://issues.apache.org/jira/browse/FLINK-18725 Project: Flink Issue Type: Bug Components: Deployment / Kubernetes, Tests Affects Versions: 1.11.1, 1.11.0 Reporter: Dian Fu https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=4901=logs=08866332-78f7-59e4-4f7e-49a56faa3179=3e8647c1-5a28-5917-dd93-bf78594ea994 {code} The Service "flink-job-cluster" is invalid: spec.ports[2].nodePort: Invalid value: 30025: provided port is already allocated {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #12958: [FLINK-18625][runtime] Maintain redundant taskmanagers to speed up failover
flinkbot edited a comment on pull request #12958:
URL: https://github.com/apache/flink/pull/12958#issuecomment-662378507

## CI report:

* deb09ce58822dda5b7388c2928b8a119879d9749 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=4912)
* 35ddeb1f517c78a9237ab5b5814dd504487acc00 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=4914)

Bot commands
The @flinkbot bot supports the following commands:
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-18712) Flink RocksDB statebackend memory leak issue
[ https://issues.apache.org/jira/browse/FLINK-18712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165422#comment-17165422 ]

Farnight commented on FLINK-18712:
----------------------------------

Thanks a lot [~yunta]! We are trying a simple job (with all business logic removed) and running the test to reproduce this. We will share more details after the testing.

> Flink RocksDB statebackend memory leak issue
> --------------------------------------------
>
> Key: FLINK-18712
> URL: https://issues.apache.org/jira/browse/FLINK-18712
> Project: Flink
> Issue Type: Bug
> Components: Runtime / State Backends
> Affects Versions: 1.10.0
> Reporter: Farnight
> Priority: Critical
>
> When using RocksDB as our statebackend, we found that it leads to a memory leak when restarting a job (manually or in a recovery case).
>
> How to reproduce:
> # Increase the RocksDB block cache size (e.g. 1G); this makes it easier to monitor and reproduce.
> # Start a job using the RocksDB statebackend.
> # When the RocksDB block cache reaches its maximum size, restart the job, and monitor the memory usage (k8s pod working set) of the TM.
> # Go through steps 2-3 a few more times, and the memory will keep rising.
>
> Any solution or suggestion for this? Thanks!
[jira] [Commented] (FLINK-18496) Anchors are not generated based on ZH characters
[ https://issues.apache.org/jira/browse/FLINK-18496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165414#comment-17165414 ]

Zhilong Hong commented on FLINK-18496:
--------------------------------------

Now I'm trying to remove the document files one by one to find the exact one (or ones) with mistakes. But I think it's too slow to figure it out this way. So I'm wondering whether anyone is experienced in debugging the documents or Jekyll, and how I could find the mistake in the Flink docs in a better way?

> Anchors are not generated based on ZH characters
> ------------------------------------------------
>
> Key: FLINK-18496
> URL: https://issues.apache.org/jira/browse/FLINK-18496
> Project: Flink
> Issue Type: Bug
> Components: Project Website
> Reporter: Zhu Zhu
> Assignee: Zhilong Hong
> Priority: Major
> Labels: starter
>
> In ZH version pages of flink-web, the anchors are not generated based on ZH characters. The anchor name would be like 'section-1', 'section-2' if there are no EN characters. An example can be the links in the navigator of https://flink.apache.org/zh/contributing/contribute-code.html
> This makes it impossible to ref an anchor from the content, because the anchor name might change unexpectedly if a new section is added.
> Note that it is a problem for flink-web only. The docs generated from the flink repo can properly generate ZH anchors.
[jira] [Commented] (FLINK-18723) performance test
[ https://issues.apache.org/jira/browse/FLINK-18723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165413#comment-17165413 ] junbiao chen commented on FLINK-18723: -- Thanks,I close it immediately > performance test > > > Key: FLINK-18723 > URL: https://issues.apache.org/jira/browse/FLINK-18723 > Project: Flink > Issue Type: Test >Reporter: junbiao chen >Priority: Major > > what is the best performance benchmark for flink -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Issue Comment Deleted] (FLINK-18723) performance test
[ https://issues.apache.org/jira/browse/FLINK-18723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] junbiao chen updated FLINK-18723: - Comment: was deleted (was: Thanks,I close it immediately) > performance test > > > Key: FLINK-18723 > URL: https://issues.apache.org/jira/browse/FLINK-18723 > Project: Flink > Issue Type: Test >Reporter: junbiao chen >Priority: Major > > what is the best performance benchmark for flink -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (FLINK-18723) performance test
[ https://issues.apache.org/jira/browse/FLINK-18723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] junbiao chen closed FLINK-18723. Resolution: Not A Problem > performance test > > > Key: FLINK-18723 > URL: https://issues.apache.org/jira/browse/FLINK-18723 > Project: Flink > Issue Type: Test >Reporter: junbiao chen >Priority: Major > > what is the best performance benchmark for flink -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-18723) performance test
[ https://issues.apache.org/jira/browse/FLINK-18723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165409#comment-17165409 ]

Kenneth William Krugler edited comment on FLINK-18723 at 7/27/20, 3:38 AM:
---------------------------------------------------------------------------

Hi [~dahaishuantuoba] - this is a question best asked via the [Flink mailing list|https://flink.apache.org/community.html], or on [Stack Overflow|https://stackoverflow.com/questions/tagged/apache-flink]. It would be great if you could ask on one of those two venues, and close this issue, thanks!

was (Author: kkrugler):
Hi [~dahaishuantuoba] - this is a question best asked via the [Flink mailing list|https://flink.apache.org/community.html], or on [Stack Overflow|https://stackoverflow.com/questions/tagged/apache-flink].

> performance test
> ----------------
>
> Key: FLINK-18723
> URL: https://issues.apache.org/jira/browse/FLINK-18723
> Project: Flink
> Issue Type: Test
> Reporter: junbiao chen
> Priority: Major
>
> what is the best performance benchmark for flink
[jira] [Commented] (FLINK-18122) Kubernetes test fails with "error: timed out waiting for the condition on jobs/flink-job-cluster"
[ https://issues.apache.org/jira/browse/FLINK-18122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165399#comment-17165399 ]

Yang Wang commented on FLINK-18122:
-----------------------------------

[~dian.fu] I think your instance is caused by a port conflict. The specified node port "30025" is already used by another process, so it failed to create the "flink-job-cluster" service.
{code:java}
The Service "flink-job-cluster" is invalid: spec.ports[2].nodePort: Invalid value: 30025: provided port is already allocated {code}

> Kubernetes test fails with "error: timed out waiting for the condition on
> jobs/flink-job-cluster"
> -------------------------------------------------------------------------
>
> Key: FLINK-18122
> URL: https://issues.apache.org/jira/browse/FLINK-18122
> Project: Flink
> Issue Type: Bug
> Components: Deployment / Kubernetes, Tests
> Affects Versions: 1.11.0
> Reporter: Robert Metzger
> Priority: Major
> Labels: test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=2697=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=1e2bbe5b-4657-50be-1f07-d84bfce5b1f5
[jira] [Commented] (FLINK-18717) reuse MiniCluster in table integration test class ?
[ https://issues.apache.org/jira/browse/FLINK-18717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165405#comment-17165405 ] godfrey he commented on FLINK-18717: cc [~kkl0u] > reuse MiniCluster in table integration test class ? > > > Key: FLINK-18717 > URL: https://issues.apache.org/jira/browse/FLINK-18717 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Legacy Planner, Table SQL / Planner >Affects Versions: 1.11.0 >Reporter: godfrey he >Priority: Major > > Before 1.11, {{MiniCluster}} could be reused across all test cases in an integration test class (see TestStreamEnvironment#setAsContext). > In 1.11, after we corrected the execution behavior of TableEnvironment, StreamTableEnvironment and BatchTableEnvironment (see [FLINK-16363|https://issues.apache.org/jira/browse/FLINK-16363], [FLINK-17126|https://issues.apache.org/jira/browse/FLINK-17126]), a MiniCluster is created for each test case, even within the same test class (see {{org.apache.flink.client.deployment.executors.LocalExecutor}}). It would be better if we could reuse the {{MiniCluster}} as before. One approach is to provide a new kind of MiniCluster factory (such as SessionMiniClusterFactory) instead of using {{PerJobMiniClusterFactory}}. WDYT? > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-18718) Support for using yarn-cluster CLI parameters on application mode
[ https://issues.apache.org/jira/browse/FLINK-18718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165404#comment-17165404 ] Yang Wang commented on FLINK-18718: --- [~flolas] Since the community has decided to deprecate the yarn-cluster CLI options in the future (see the discussion here [1]), they are not supported in Yarn application mode, Kubernetes session mode, or Kubernetes application mode. [1]. https://issues.apache.org/jira/browse/FLINK-15179?focusedCommentId=16995321=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16995321 > Support for using yarn-cluster CLI parameters on application mode > - > > Key: FLINK-18718 > URL: https://issues.apache.org/jira/browse/FLINK-18718 > Project: Flink > Issue Type: Improvement > Components: Command Line Client >Affects Versions: 1.11.0, 1.11.1 > Environment: *Flink 1.11* > *CDH6.2.1* >Reporter: Felipe Lolas >Priority: Minor > > Hi! > Just a minor improvement here: > With the new Application Mode, we now need to use generic CLI configuration options for setting up YARN deployments. > *Example* > {code:java} > -Djobmanager.memory.process.size=1024m > -Dtaskmanager.memory.process.size=1024m > -Dyarn.application.queue=root.test > -Dyarn.application.name=flink-hbase > -Dyarn.application.type=flink {code} > It would be nice to directly use the CLI options available for yarn-cluster. This works in per-job and session mode *but not in application mode.* > > Thanks > -- This message was sent by Atlassian Jira (v8.3.4#803005)
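For reference, a generic-option invocation in application mode looks roughly like the sketch below. This assumes the Flink 1.11 CLI, where the {{-t yarn-application}} deployment target selects YARN application mode; the jar path is a placeholder, not from the original report.

{code:bash}
# Submit in YARN application mode, passing the settings as -D options
./bin/flink run-application -t yarn-application \
  -Djobmanager.memory.process.size=1024m \
  -Dtaskmanager.memory.process.size=1024m \
  -Dyarn.application.queue=root.test \
  -Dyarn.application.name=flink-hbase \
  -Dyarn.application.type=flink \
  ./path/to/my-job.jar
{code}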
[jira] [Created] (FLINK-18724) Integration with DataStream and DataSet API report error
liang ding created FLINK-18724: -- Summary: Integration with DataStream and DataSet API report error Key: FLINK-18724 URL: https://issues.apache.org/jira/browse/FLINK-18724 Project: Flink Issue Type: Bug Components: API / Core, Connectors / Kafka, Table SQL / API Affects Versions: 1.11.1 Reporter: liang ding I want to create a table from a DataStream (Kafka). There are two reasons I need to use DataStream: # I need to decode messages into columns with a custom format, and in SQL mode I don't know how to do that. # I have already implemented both DeserializationSchema and FlatMapFunction. With a DataStream I can do many things before the data becomes a suitable table, which is my preferred approach in other applications as well. So I do it like this: {code:java} StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(); EnvironmentSettings tSet= EnvironmentSettings.newInstance().useOldPlanner().inStreamingMode().build(); StreamTableEnvironment tEnv=StreamTableEnvironment.create(env,tSet); DataStream stream = env .addSource(new FlinkKafkaConsumer<>("test-log", new SimpleStringSchema(), properties)) .flatMap(new LogParser()); //stream.printToErr(); tEnv.fromDataStream(stream).select("userId,city").execute().print(); tEnv.execute("test-sql"); //env.execute("test"); {code} then I got this message: {noformat} [Kafka Fetcher for Source: Custom Source -> Flat Map ->* -> select: (userId,city) -> to: Row (3/3)] INFO org.apache.kafka.clients.FetchSessionHandler - [Consumer clientId=consumer-flink-3-5, groupId=flink-3] Node 0 sent an invalid full fetch response with extra=(test-log-0, response=( [Kafka Fetcher for Source: Custom Source -> Flat Map ->* -> select: (userId,city) -> to: Row (3/3)] INFO org.apache.kafka.clients.FetchSessionHandler - [Consumer clientId=consumer-flink-3-5, groupId=flink-3] Node 0 sent an invalid full fetch response with extra=(test-log-1, response=({noformat} It seems like both StreamExecutionEnvironment and StreamTableEnvironment start the Kafka fetcher, so neither one succeeds.
Also, there is no integration guide, which left me confused: should I call env.execute, tableEnv.execute, or both (it seems not both)? And what is the Blink planner way? -- This message was sent by Atlassian Jira (v8.3.4#803005)
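A minimal sketch of the pattern the reporter is attempting, under Flink 1.11 Table API semantics: {{Table#execute()}} submits its own job, so no separate {{env.execute()}}/{{tEnv.execute()}} is needed for this pipeline. {{LogParser}} and the Kafka settings are the reporter's own code and are assumed here, as is the Blink planner in place of the old one:

{code:java}
import java.util.Properties;

import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
import org.apache.flink.types.Row;

public class KafkaToTableSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Blink planner instead of the old planner
        EnvironmentSettings settings =
                EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build();
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env, settings);

        Properties properties = new Properties();
        properties.setProperty("bootstrap.servers", "localhost:9092"); // placeholder

        DataStream<Row> stream = env
                .addSource(new FlinkKafkaConsumer<>("test-log", new SimpleStringSchema(), properties))
                .flatMap(new LogParser()) // reporter's FlatMapFunction<String, Row>
                // Row loses field type info through flatMap; declare it explicitly
                .returns(Types.ROW_NAMED(new String[]{"userId", "city"}, Types.STRING, Types.STRING));

        // Table#execute() submits the job itself; do not also call env.execute()
        tEnv.fromDataStream(stream).select("userId, city").execute().print();
    }
}
{code}

The field names and types passed to {{Types.ROW_NAMED}} are illustrative; they must match whatever {{LogParser}} actually emits.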
[jira] [Created] (FLINK-18723) performance test
junbiao chen created FLINK-18723: Summary: performance test Key: FLINK-18723 URL: https://issues.apache.org/jira/browse/FLINK-18723 Project: Flink Issue Type: Test Reporter: junbiao chen What is the best performance benchmark for Flink? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-10674) Fix handling of retractions after clean up
[ https://issues.apache.org/jira/browse/FLINK-10674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165401#comment-17165401 ] Xingxing Di commented on FLINK-10674: - Hi Timo, I saw this issue has been fixed in 1.7.0 but today we still got the exception under flink 1.7.2, should we reopen this issue? !screenshot-1.png! > Fix handling of retractions after clean up > -- > > Key: FLINK-10674 > URL: https://issues.apache.org/jira/browse/FLINK-10674 > Project: Flink > Issue Type: Bug > Components: Table SQL / API >Affects Versions: 1.6.1 > Environment: Flink 1.6.0 >Reporter: ambition >Assignee: Timo Walther >Priority: Minor > Labels: pull-request-available > Fix For: 1.5.6, 1.6.3, 1.7.0 > > Attachments: image-2018-10-25-14-46-03-373.png, screenshot-1.png > > > Our online Flink Job run about a week,job contain sql : > {code:java} > select `time`, > lower(trim(os_type)) as os_type, > count(distinct feed_id) as feed_total_view > from my_table > group by `time`, lower(trim(os_type)){code} > > then occur NPE: > > {code:java} > java.lang.NullPointerException > at scala.Predef$.Long2long(Predef.scala:363) > at > org.apache.flink.table.functions.aggfunctions.DistinctAccumulator.remove(DistinctAccumulator.scala:109) > at NonWindowedAggregationHelper$894.retract(Unknown Source) > at > org.apache.flink.table.runtime.aggregate.GroupAggProcessFunction.processElement(GroupAggProcessFunction.scala:124) > at > org.apache.flink.table.runtime.aggregate.GroupAggProcessFunction.processElement(GroupAggProcessFunction.scala:39) > at > org.apache.flink.streaming.api.operators.LegacyKeyedProcessOperator.processElement(LegacyKeyedProcessOperator.java:88) > at > org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:202) > at > org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:105) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300) > at 
org.apache.flink.runtime.taskmanager.Task.run(Task.java:711) > at java.lang.Thread.run(Thread.java:745) > {code} > > > View DistinctAccumulator.remove > !image-2018-10-25-14-46-03-373.png! > > This NPE is likely caused by currentCnt being null, so we simply handle it like: > {code:java} > def remove(params: Row): Boolean = { > if(!distinctValueMap.contains(params)){ > true > }else{ > val currentCnt = distinctValueMap.get(params) > // > if (currentCnt == null || currentCnt == 1) { > distinctValueMap.remove(params) > true > } else { > var value = currentCnt - 1L > if(value < 0){ > value = 1 > } > distinctValueMap.put(params, value) > false > } > } > }{code} > > Update: > Because state clean up happens in processing time, it might be > the case that retractions are arriving after the state has > been cleaned up. Before these changes, a new accumulator was > created and invalid retraction messages were emitted. This > change drops retraction messages for which no accumulator > exists. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-10674) Fix handling of retractions after clean up
[ https://issues.apache.org/jira/browse/FLINK-10674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xingxing Di updated FLINK-10674: Attachment: screenshot-1.png > Fix handling of retractions after clean up > -- > > Key: FLINK-10674 > URL: https://issues.apache.org/jira/browse/FLINK-10674 > Project: Flink > Issue Type: Bug > Components: Table SQL / API >Affects Versions: 1.6.1 > Environment: Flink 1.6.0 >Reporter: ambition >Assignee: Timo Walther >Priority: Minor > Labels: pull-request-available > Fix For: 1.5.6, 1.6.3, 1.7.0 > > Attachments: image-2018-10-25-14-46-03-373.png, screenshot-1.png > > > Our online Flink Job run about a week,job contain sql : > {code:java} > select `time`, > lower(trim(os_type)) as os_type, > count(distinct feed_id) as feed_total_view > from my_table > group by `time`, lower(trim(os_type)){code} > > then occur NPE: > > {code:java} > java.lang.NullPointerException > at scala.Predef$.Long2long(Predef.scala:363) > at > org.apache.flink.table.functions.aggfunctions.DistinctAccumulator.remove(DistinctAccumulator.scala:109) > at NonWindowedAggregationHelper$894.retract(Unknown Source) > at > org.apache.flink.table.runtime.aggregate.GroupAggProcessFunction.processElement(GroupAggProcessFunction.scala:124) > at > org.apache.flink.table.runtime.aggregate.GroupAggProcessFunction.processElement(GroupAggProcessFunction.scala:39) > at > org.apache.flink.streaming.api.operators.LegacyKeyedProcessOperator.processElement(LegacyKeyedProcessOperator.java:88) > at > org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:202) > at > org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:105) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711) > at java.lang.Thread.run(Thread.java:745) > {code} > > > View DistinctAccumulator.remove > 
!image-2018-10-25-14-46-03-373.png! > > This NPE is likely caused by currentCnt being null, so we simply handle it like: > {code:java} > def remove(params: Row): Boolean = { > if(!distinctValueMap.contains(params)){ > true > }else{ > val currentCnt = distinctValueMap.get(params) > // > if (currentCnt == null || currentCnt == 1) { > distinctValueMap.remove(params) > true > } else { > var value = currentCnt - 1L > if(value < 0){ > value = 1 > } > distinctValueMap.put(params, value) > false > } > } > }{code} > > Update: > Because state clean up happens in processing time, it might be > the case that retractions are arriving after the state has > been cleaned up. Before these changes, a new accumulator was > created and invalid retraction messages were emitted. This > change drops retraction messages for which no accumulator > exists. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-18681) The jar package version conflict causes the task to continue to increase and grab resources
[ https://issues.apache.org/jira/browse/FLINK-18681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165397#comment-17165397 ] Xintong Song commented on FLINK-18681: -- Hi [~apach...@163.com], It seems I've misunderstood what you meant. I thought you were complaining about the dependency conflict, but actually you were complaining about the resource increase caused by the dependency conflict. I apologize for that misunderstanding. I think you are right. It is indeed a problem that one problematic Flink application impacts the entire cluster by taking lots of resources. Could you share a bit more information to help us understand the problem better? I'm particularly interested in how exactly the resources increase. - Does the application keep requesting new containers but never release them? - What is the status of the requested containers? Are they actually allocated or reserved? - Are there AM failovers? (In other words, do you see many application attempts on Yarn?) It would also be helpful if you could share the complete JM logs, if there's no sensitive information. > The jar package version conflict causes the task to continue to increase and > grab resources > --- > > Key: FLINK-18681 > URL: https://issues.apache.org/jira/browse/FLINK-18681 > Project: Flink > Issue Type: Bug >Affects Versions: 1.11.0 >Reporter: wangtaiyang >Priority: Major > > When I submit a Flink task to yarn, the default resource configuration is 1G & 1 core, but in fact this task keeps increasing its resources: 2 cores, 3 cores, and so on... up to 200 cores...
Then I went to look at the JM log and found the following error: > {code:java} > // code placeholder > java.lang.NoSuchMethodError: > org.apache.commons.cli.Option.builder(Ljava/lang/String;)Lorg/apache/commons/cli/Option$Builder;java.lang.NoSuchMethodError: > > org.apache.commons.cli.Option.builder(Ljava/lang/String;)Lorg/apache/commons/cli/Option$Builder; > at > org.apache.flink.runtime.entrypoint.parser.CommandLineOptions.(CommandLineOptions.java:28) > ~[flink-dist_2.11-1.11.1.jar:1.11.1] at > org.apache.flink.runtime.clusterframework.BootstrapTools.lambda$getDynamicPropertiesAsString$0(BootstrapTools.java:648) > ~[flink-dist_2.11-1.11.1.jar:1.11.1] at > java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:267) > ~[?:1.8.0_191] > ... > java.lang.NoClassDefFoundError: Could not initialize class > org.apache.flink.runtime.entrypoint.parser.CommandLineOptionsjava.lang.NoClassDefFoundError: > Could not initialize class > org.apache.flink.runtime.entrypoint.parser.CommandLineOptions at > org.apache.flink.runtime.clusterframework.BootstrapTools.lambda$getDynamicPropertiesAsString$0(BootstrapTools.java:648) > ~[flink-dist_2.11-1.11.1.jar:1.11.1] at > java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:267) > ~[?:1.8.0_191] at > java.util.HashMap$KeySpliterator.forEachRemaining(HashMap.java:1553) > ~[?:1.8.0_191] at > java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) > ~[?:1.8.0_191]{code} > Finally, it was confirmed to be caused by a commons-cli version conflict, but the failing task has not stopped and continues to grab more and more resources. Is this a bug? -- This message was sent by Atlassian Jira (v8.3.4#803005)
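One way to confirm which commons-cli version wins on the job's classpath, assuming a Maven-built job (the jar name below is a placeholder; {{dependency:tree}} and its {{-Dincludes}} filter are standard Maven):

{code:bash}
# List every resolved commons-cli dependency in the job's build
mvn dependency:tree -Dincludes=commons-cli:commons-cli

# Or inspect the packaged job jar directly for the conflicting class
unzip -l target/my-job.jar | grep 'org/apache/commons/cli/Option'
{code}

If the jar bundles a commons-cli older than the one flink-dist expects, shading or excluding it in the job's pom is the usual fix.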
[jira] [Commented] (FLINK-17767) Tumble and Hop window support offset
[ https://issues.apache.org/jira/browse/FLINK-17767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165396#comment-17165396 ] hailong wang commented on FLINK-17767: -- Hi [~jark] , what do you think which type of window offset may be better, SqlTypeFamily.TIME or SqlTypeFamily.DATETIME_INTERVAL? Calcite has a window table function with offset of DATETIME_INTERVAL[1]. [1]https://issues.apache.org/jira/browse/CALCITE-4000 > Tumble and Hop window support offset > > > Key: FLINK-17767 > URL: https://issues.apache.org/jira/browse/FLINK-17767 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Planner >Affects Versions: 1.10.0 >Reporter: hailong wang >Assignee: hailong wang >Priority: Major > Fix For: 1.12.0 > > > TUMBLE window and HOP window with alignment is not supported yet. We can > support by > (, , ) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #12958: [FLINK-18625][runtime] Maintain redundant taskmanagers to speed up failover
flinkbot edited a comment on pull request #12958: URL: https://github.com/apache/flink/pull/12958#issuecomment-662378507 ## CI report: * deb09ce58822dda5b7388c2928b8a119879d9749 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=4912) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-18438) TaskManager start failed
[ https://issues.apache.org/jira/browse/FLINK-18438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165393#comment-17165393 ] Xintong Song commented on FLINK-18438: -- [~JohnSiro], I could not tell what the problem is either. Everything looks fine to me. Sorry for not being more helpful. Maybe you can try a newer build of JDK 8? > TaskManager start failed > > > Key: FLINK-18438 > URL: https://issues.apache.org/jira/browse/FLINK-18438 > Project: Flink > Issue Type: Bug > Components: Client / Job Submission >Affects Versions: 1.10.1 > Environment: Java: java version "1.8.0_101" > Java(TM) SE Runtime Environment (build 1.8.0_101-b13) > Java HotSpot(TM) 64-Bit Server VM (build 25.101-b13, mixed mode) > Flink: 1.10.1 (flink-1.10.1-bin-scala_2.12.tgz) > OS: Windows 10 (1903) / 64-bits >Reporter: JohnSiro >Priority: Major > > > The error in file xxx-taskexecutor-0-xxx.out is: > Error: Could not create the Java Virtual Machine. > Error: A fatal exception has occurred. Program will exit. > Improperly specified VM option 'MaxMetaspaceSize=268435456 ' -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-18719) Define the interfaces and introduce ActiveResourceManager
[ https://issues.apache.org/jira/browse/FLINK-18719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-18719: - Fix Version/s: 1.12.0 > Define the interfaces and introduce ActiveResourceManager > - > > Key: FLINK-18719 > URL: https://issues.apache.org/jira/browse/FLINK-18719 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Reporter: Xintong Song >Priority: Major > Fix For: 1.12.0 > > > * Define the interface ResourceManagerDriver and ResourceEventHandler > * Rename the original ActiveResourceManager to LegacyActiveResourceManager. > * Introduce the new ActiveResourceManager. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-18675) Checkpoint not maintaining minimum pause duration between checkpoints
[ https://issues.apache.org/jira/browse/FLINK-18675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165387#comment-17165387 ] Congxian Qiu(klion26) commented on FLINK-18675: --- Hi [~raviratnakar], from the git history, the code at line number 1512 was added in 1.9.0 (and it seems that change would not affect this problem). IIUC, we check somewhere whether the checkpoint can be triggered; I need to check it carefully as the code has changed a lot. I will reply here if I find anything. > Checkpoint not maintaining minimum pause duration between checkpoints > - > > Key: FLINK-18675 > URL: https://issues.apache.org/jira/browse/FLINK-18675 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing >Affects Versions: 1.11.0 > Environment: !image.png! >Reporter: Ravi Bhushan Ratnakar >Priority: Critical > Attachments: image.png > > > I am running a streaming job with Flink 1.11.0 using kubernetes > infrastructure. I have configured the checkpoint configuration as below: > Interval - 3 minutes > Minimum pause between checkpoints - 3 minutes > Checkpoint timeout - 10 minutes > Checkpointing Mode - Exactly Once > Number of Concurrent Checkpoints - 1 > > Other configs > Time Characteristics - Processing Time > > I am observing an unusual behaviour. *When a checkpoint completes successfully and its end-to-end duration is almost equal to or greater than the minimum pause duration, the next checkpoint is triggered immediately without maintaining the minimum pause duration*. Kindly notice this behaviour from checkpoint id 194 onward in the attached screenshot -- This message was sent by Atlassian Jira (v8.3.4#803005)
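For reference, the reporter's settings expressed as Flink 1.11 configuration keys (a sketch; the same values can also be set programmatically via {{StreamExecutionEnvironment#getCheckpointConfig}}):

{code:yaml}
# flink-conf.yaml
execution.checkpointing.interval: 3min
execution.checkpointing.min-pause: 3min          # the pause reportedly not honored
execution.checkpointing.timeout: 10min
execution.checkpointing.mode: EXACTLY_ONCE
execution.checkpointing.max-concurrent-checkpoints: 1
{code}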
[jira] [Updated] (FLINK-18620) Unify behaviors of active resource managers
[ https://issues.apache.org/jira/browse/FLINK-18620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-18620: - Fix Version/s: 1.12.0 > Unify behaviors of active resource managers > --- > > Key: FLINK-18620 > URL: https://issues.apache.org/jira/browse/FLINK-18620 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Reporter: Xintong Song >Assignee: Xintong Song >Priority: Major > Fix For: 1.12.0 > > > Flink supports various deployment modes: standalone, Kubernetes, Yarn & > Mesos. For each deployment mode, a resource manager is implemented for > managing the resources. > While StandaloneResourceManager is quite different from the others by not > being able to dynamically request and release resources, the other three > (KubernetesResourceManager, YarnResourceManager and MesosResourceManager) > share many logics in common. These common logics are currently duplicately > implemented by each of the active resource managers. Such duplication leads > to extra maintaining overhead and amplifies stability risks. > This ticket proposes a refactor design for the resource managers, with better > abstraction deduplicating common logics implementations and minimizing the > deployment specific behaviors. > This proposal is a pure refactor effort. It does not intend to change any of > the current resource management behaviors. > A detailed design doc and a simplified proof-of-concept implementation for > the Kubernetes deployment are linked to this ticket. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-18719) Define the interfaces and introduce ActiveResourceManager
[ https://issues.apache.org/jira/browse/FLINK-18719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song reassigned FLINK-18719: Assignee: Xintong Song > Define the interfaces and introduce ActiveResourceManager > - > > Key: FLINK-18719 > URL: https://issues.apache.org/jira/browse/FLINK-18719 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Reporter: Xintong Song >Assignee: Xintong Song >Priority: Major > Fix For: 1.12.0 > > > * Define the interface ResourceManagerDriver and ResourceEventHandler > * Rename the original ActiveResourceManager to LegacyActiveResourceManager. > * Introduce the new ActiveResourceManager. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-18722) Migrate MesosResourceManager to the new MesosResourceManagerDriver
Xintong Song created FLINK-18722: Summary: Migrate MesosResourceManager to the new MesosResourceManagerDriver Key: FLINK-18722 URL: https://issues.apache.org/jira/browse/FLINK-18722 Project: Flink Issue Type: Sub-task Components: Deployment / Mesos, Runtime / Coordination Reporter: Xintong Song -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-18721) Migrate YarnResourceManager to the new YarnResourceManagerDriver
Xintong Song created FLINK-18721: Summary: Migrate YarnResourceManager to the new YarnResourceManagerDriver Key: FLINK-18721 URL: https://issues.apache.org/jira/browse/FLINK-18721 Project: Flink Issue Type: Sub-task Components: Deployment / YARN, Runtime / Coordination Reporter: Xintong Song Fix For: 1.12.0 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-18720) Migrate KubernetesResourceManager to the new KubernetesResourceManagerDriver
Xintong Song created FLINK-18720: Summary: Migrate KubernetesResourceManager to the new KubernetesResourceManagerDriver Key: FLINK-18720 URL: https://issues.apache.org/jira/browse/FLINK-18720 Project: Flink Issue Type: Sub-task Components: Deployment / Kubernetes, Runtime / Coordination Reporter: Xintong Song Fix For: 1.12.0 * Introduce KubernetesResourceManagerDriver * Switch to ActiveResourceManager and KubernetesResourceManagerDriver for Kubernetes deployment * Remove KubernetesResourceManager -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-18719) Define the interfaces and introduce ActiveResourceManager
Xintong Song created FLINK-18719: Summary: Define the interfaces and introduce ActiveResourceManager Key: FLINK-18719 URL: https://issues.apache.org/jira/browse/FLINK-18719 Project: Flink Issue Type: Sub-task Components: Runtime / Coordination Reporter: Xintong Song * Define the interface ResourceManagerDriver and ResourceEventHandler * Rename the original ActiveResourceManager to LegacyActiveResourceManager. * Introduce the new ActiveResourceManager. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-18702) Flink elasticsearch connector leaks threads and classloaders thereof
[ https://issues.apache.org/jira/browse/FLINK-18702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165385#comment-17165385 ] Yangze Guo commented on FLINK-18702: I think it is indeed the same issue as FLINK-11205. I'm not sure whether it makes sense to backport FLINK-16408. As a workaround, you could increase taskmanager.memory.jvm-metaspace.size [1]. [1] https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/config.html#taskmanager-memory-jvm-metaspace-size > Flink elasticsearch connector leaks threads and classloaders thereof > > > Key: FLINK-18702 > URL: https://issues.apache.org/jira/browse/FLINK-18702 > Project: Flink > Issue Type: Bug > Components: Connectors / ElasticSearch >Affects Versions: 1.10.0, 1.10.1 >Reporter: Jun Qin >Assignee: Jun Qin >Priority: Major > > The Flink elasticsearch connector leaks threads and the classloaders thereof. This results in a metaspace OOM when the ES sink fails and is restarted many times. > This issue is visible in Flink 1.10 but not in 1.11, because Flink 1.11 does not create new class loaders in case of recoveries (FLINK-16408). > > Reproduction: > * Start a job with an ES sink in a Flink 1.10 cluster, without starting the ES cluster. > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
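The workaround above is a one-line change in flink-conf.yaml; the value below is illustrative, not a recommendation, and only delays rather than fixes the leak:

{code:yaml}
# flink-conf.yaml
# Larger metaspace means leaked classloaders take longer to exhaust it
taskmanager.memory.jvm-metaspace.size: 512m
{code}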
[jira] [Commented] (FLINK-18080) Translate the "Kerberos Authentication Setup and Configuration" page into Chinese
[ https://issues.apache.org/jira/browse/FLINK-18080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165384#comment-17165384 ] pengweibo commented on FLINK-18080: --- Jark Wu, could you show me the original link to this page? > Translate the "Kerberos Authentication Setup and Configuration" page into > Chinese > - > > Key: FLINK-18080 > URL: https://issues.apache.org/jira/browse/FLINK-18080 > Project: Flink > Issue Type: Sub-task > Components: chinese-translation, Documentation >Reporter: Yangze Guo >Assignee: pengweibo >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-18080) Translate the "Kerberos Authentication Setup and Configuration" page into Chinese
[ https://issues.apache.org/jira/browse/FLINK-18080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165382#comment-17165382 ] pengweibo commented on FLINK-18080: --- Hello Jark Wu, I hadn't seen these changes until today, and I am willing to translate this page. Thanks a lot! Weibo PENG > Translate the "Kerberos Authentication Setup and Configuration" page into > Chinese > - > > Key: FLINK-18080 > URL: https://issues.apache.org/jira/browse/FLINK-18080 > Project: Flink > Issue Type: Sub-task > Components: chinese-translation, Documentation >Reporter: Yangze Guo >Assignee: pengweibo >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-17730) HadoopS3RecoverableWriterITCase.testRecoverAfterMultiplePersistsStateWithMultiPart times out
[ https://issues.apache.org/jira/browse/FLINK-17730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165380#comment-17165380 ] Dian Fu commented on FLINK-17730: - https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=4911=logs=e9af9cde-9a65-5281-a58e-2c8511d36983=603cb7fd-6f38-5c99-efca-877e1439232f > HadoopS3RecoverableWriterITCase.testRecoverAfterMultiplePersistsStateWithMultiPart > times out > > > Key: FLINK-17730 > URL: https://issues.apache.org/jira/browse/FLINK-17730 > Project: Flink > Issue Type: Bug > Components: Build System / Azure Pipelines, FileSystems, Tests >Reporter: Robert Metzger >Assignee: Robert Metzger >Priority: Major > Labels: pull-request-available, test-stability > Fix For: 1.11.0, 1.12.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=1374=logs=d44f43ce-542c-597d-bf94-b0718c71e5e8=34f486e1-e1e4-5dd2-9c06-bfdd9b9c74a8 > After 5 minutes > {code} > 2020-05-15T06:56:38.1688341Z "main" #1 prio=5 os_prio=0 > tid=0x7fa10800b800 nid=0x1161 runnable [0x7fa110959000] > 2020-05-15T06:56:38.1688709Zjava.lang.Thread.State: RUNNABLE > 2020-05-15T06:56:38.1689028Z at > java.net.SocketInputStream.socketRead0(Native Method) > 2020-05-15T06:56:38.1689496Z at > java.net.SocketInputStream.socketRead(SocketInputStream.java:116) > 2020-05-15T06:56:38.1689921Z at > java.net.SocketInputStream.read(SocketInputStream.java:171) > 2020-05-15T06:56:38.1690316Z at > java.net.SocketInputStream.read(SocketInputStream.java:141) > 2020-05-15T06:56:38.1690723Z at > sun.security.ssl.InputRecord.readFully(InputRecord.java:465) > 2020-05-15T06:56:38.1691196Z at > sun.security.ssl.InputRecord.readV3Record(InputRecord.java:593) > 2020-05-15T06:56:38.1691608Z at > sun.security.ssl.InputRecord.read(InputRecord.java:532) > 2020-05-15T06:56:38.1692023Z at > sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:975) > 2020-05-15T06:56:38.1692558Z - locked <0xb94644f8> (a > java.lang.Object) > 
2020-05-15T06:56:38.1692946Z at > sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:933) > 2020-05-15T06:56:38.1693371Z at > sun.security.ssl.AppInputStream.read(AppInputStream.java:105) > 2020-05-15T06:56:38.1694151Z - locked <0xb9464d20> (a > sun.security.ssl.AppInputStream) > 2020-05-15T06:56:38.1694908Z at > org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137) > 2020-05-15T06:56:38.1695475Z at > org.apache.http.impl.io.SessionInputBufferImpl.read(SessionInputBufferImpl.java:198) > 2020-05-15T06:56:38.1696007Z at > org.apache.http.impl.io.ContentLengthInputStream.read(ContentLengthInputStream.java:176) > 2020-05-15T06:56:38.1696509Z at > org.apache.http.conn.EofSensorInputStream.read(EofSensorInputStream.java:135) > 2020-05-15T06:56:38.1696993Z at > com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:90) > 2020-05-15T06:56:38.1697466Z at > com.amazonaws.event.ProgressInputStream.read(ProgressInputStream.java:180) > 2020-05-15T06:56:38.1698069Z at > com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:90) > 2020-05-15T06:56:38.1698567Z at > com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:90) > 2020-05-15T06:56:38.1699041Z at > com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:90) > 2020-05-15T06:56:38.1699624Z at > com.amazonaws.event.ProgressInputStream.read(ProgressInputStream.java:180) > 2020-05-15T06:56:38.1700090Z at > com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:90) > 2020-05-15T06:56:38.1700584Z at > com.amazonaws.util.LengthCheckInputStream.read(LengthCheckInputStream.java:107) > 2020-05-15T06:56:38.1701282Z at > com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:90) > 2020-05-15T06:56:38.1701800Z at > com.amazonaws.services.s3.internal.S3AbortableInputStream.read(S3AbortableInputStream.java:125) > 2020-05-15T06:56:38.1702328Z at > 
com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:90) > 2020-05-15T06:56:38.1702804Z at > org.apache.hadoop.fs.s3a.S3AInputStream.lambda$read$3(S3AInputStream.java:445) > 2020-05-15T06:56:38.1703270Z at > org.apache.hadoop.fs.s3a.S3AInputStream$$Lambda$42/1204178174.execute(Unknown > Source) > 2020-05-15T06:56:38.1703677Z at > org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109) > 2020-05-15T06:56:38.1704090Z at > org.apache.hadoop.fs.s3a.Invoker.lambda$retry$3(Invoker.java:260) > 2020-05-15T06:56:38.1704607Z at > org.apache.hadoop.fs.s3a.Invoker$$Lambda$23/1991724700.execute(Unknown Source) > 2020-05-15T06:56:38.1705115Z at >
[GitHub] [flink] flinkbot edited a comment on pull request #12958: [FLINK-18625][runtime] Maintain redundant taskmanagers to speed up failover
flinkbot edited a comment on pull request #12958: URL: https://github.com/apache/flink/pull/12958#issuecomment-662378507 ## CI report: * ca4d714f273259e0c1eb42238d455d331ad93996 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=4747) * deb09ce58822dda5b7388c2928b8a119879d9749 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=4912) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-18496) Anchors are not generated based on ZH characters
[ https://issues.apache.org/jira/browse/FLINK-18496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165379#comment-17165379 ] Zhilong Hong commented on FLINK-18496: -- I tried to upgrade the version of Jekyll from 3.0.5 to 4.0.1. When I attempted to build the website after upgrading, the bundler reported errors like the ones below: {code:bash} /srv/flink-web/.rubydeps/ruby/2.5.0/gems/jekyll-multiple-languages-2.0.3/lib/jekyll-multiple-languages/multilang.rb:55:in `append_data_for_liquid' /srv/flink-web/.rubydeps/ruby/2.5.0/gems/jekyll-multiple-languages-2.0.3/lib/jekyll-multiple-languages/document.rb:71:in `to_liquid' /srv/flink-web/.rubydeps/ruby/2.5.0/gems/jekyll-4.0.1/lib/jekyll/drops/document_drop.rb:40:in `next' /srv/flink-web/.rubydeps/ruby/2.5.0/gems/jekyll-4.0.1/lib/jekyll/drops/drop.rb:47:in `public_send' /srv/flink-web/.rubydeps/ruby/2.5.0/gems/jekyll-4.0.1/lib/jekyll/drops/drop.rb:47:in `[]' /srv/flink-web/.rubydeps/ruby/2.5.0/gems/jekyll-4.0.1/lib/jekyll/drops/drop.rb:168:in `block in each' /srv/flink-web/.rubydeps/ruby/2.5.0/gems/jekyll-4.0.1/lib/jekyll/drops/drop.rb:167:in `each' /srv/flink-web/.rubydeps/ruby/2.5.0/gems/jekyll-4.0.1/lib/jekyll/drops/drop.rb:167:in `each' /srv/flink-web/.rubydeps/ruby/2.5.0/gems/jekyll-4.0.1/lib/jekyll/drops/drop.rb:167:in `each' /srv/flink-web/.rubydeps/ruby/2.5.0/gems/jekyll-4.0.1/lib/jekyll/utils.rb:336:in `duplicate_frozen_values' /srv/flink-web/.rubydeps/ruby/2.5.0/gems/jekyll-4.0.1/lib/jekyll/utils.rb:44:in `deep_merge_hashes!' /srv/flink-web/.rubydeps/ruby/2.5.0/gems/jekyll-4.0.1/lib/jekyll/utils.rb:29:in `deep_merge_hashes' {code} I think some docs may be incompatible with the jekyll-multiple-languages plugin, because flink/docs uses Jekyll 4.0.1 and jekyll-multiple-languages 2.0.3 and builds fine. Also, when I attempted to build only the Chinese version, the anchors were generated correctly based on the characters. 
So I think the major challenge right now is how to fix the docs to make them compatible with the new Jekyll and the jekyll-multiple-languages plugin. I searched for how to get Jekyll's log and found nothing on the internet, which makes this issue harder to solve. > Anchors are not generated based on ZH characters > > > Key: FLINK-18496 > URL: https://issues.apache.org/jira/browse/FLINK-18496 > Project: Flink > Issue Type: Bug > Components: Project Website >Reporter: Zhu Zhu >Assignee: Zhilong Hong >Priority: Major > Labels: starter > > In ZH version pages of flink-web, the anchors are not generated based on ZH > characters. The anchor name would be like 'section-1', 'section-2' if there > is no EN characters. An example can be the links in the navigator of > https://flink.apache.org/zh/contributing/contribute-code.html > This makes it impossible to ref an anchor from the content because the anchor > name might change unexpectedly if a new section is added. > Note that it is a problem for flink-web only. The docs generated from the > flink repo can properly generate ZH anchors. -- This message was sent by Atlassian Jira (v8.3.4#803005)
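As a suggestion beyond what the thread itself found: Jekyll ships with `--verbose` and `--trace` flags that print per-step logging and the full Ruby backtrace, which can help narrow down which document trips the jekyll-multiple-languages plugin. A hedged sketch, assuming the flink-web checkout location from the stack trace above (adjust paths for your environment):

```shell
# Rebuild flink-web with Jekyll's built-in diagnostics enabled.
# --trace shows the full Ruby backtrace on failure; --verbose logs each step.
cd /srv/flink-web
bundle exec jekyll build --verbose --trace
```

This does not require any extra tooling; both flags are standard Jekyll CLI options.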
[GitHub] [flink] flinkbot edited a comment on pull request #12958: [FLINK-18625][runtime] Maintain redundant taskmanagers to speed up failover
flinkbot edited a comment on pull request #12958: URL: https://github.com/apache/flink/pull/12958#issuecomment-662378507 ## CI report: * ca4d714f273259e0c1eb42238d455d331ad93996 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=4747) * deb09ce58822dda5b7388c2928b8a119879d9749 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] eogramow edited a comment on pull request #12791: [FLINK-18362][FLINK-13838][yarn] Add yarn.ship-archives to support LocalResourceType.ARCHIVE
eogramow edited a comment on pull request #12791: URL: https://github.com/apache/flink/pull/12791#issuecomment-664063296 @kl0u Thanks. The test passed on my CentOS machine. I'll check it on ~~Travis~~ Azure. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] eogramow commented on pull request #12791: [FLINK-18362][FLINK-13838][yarn] Add yarn.ship-archives to support LocalResourceType.ARCHIVE
eogramow commented on pull request #12791: URL: https://github.com/apache/flink/pull/12791#issuecomment-664063296 @kl0u Thanks. The test passed on my centos. I'll check it on Travis. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] Myracle commented on a change in pull request #12958: [FLINK-18625][runtime] Maintain redundant taskmanagers to speed up failover
Myracle commented on a change in pull request #12958: URL: https://github.com/apache/flink/pull/12958#discussion_r460593997 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/SlotManagerImpl.java ## @@ -299,6 +303,8 @@ public void start(ResourceManagerId newResourceManagerId, Executor newMainThread TimeUnit.MILLISECONDS); registerSlotManagerMetrics(); + + allocateRedundantSlots(redundantSlotNum); Review comment: After an offline discussion, SlotManagerImpl will not allocate redundant taskmanagers at startup, because it is hard to know whether enough redundant taskmanagers already exist. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-18718) Support for using yarn-cluster CLI parameters on application mode
Felipe Lolas created FLINK-18718: Summary: Support for using yarn-cluster CLI parameters on application mode Key: FLINK-18718 URL: https://issues.apache.org/jira/browse/FLINK-18718 Project: Flink Issue Type: Improvement Components: Command Line Client Affects Versions: 1.11.1, 1.11.0 Environment: *Flink 1.11* *CDH6.2.1* Reporter: Felipe Lolas Hi! Just a minor improvement here: with the new Application Mode, we need to use generic CLI configuration options to set up YARN deployments. *Example* {code:java} -Djobmanager.memory.process.size=1024m -Dtaskmanager.memory.process.size=1024m -Dyarn.application.queue=root.test -Dyarn.application.name=flink-hbase -Dyarn.application.type=flink {code} It would be nice to be able to use the CLI options available for yarn-cluster directly. This works in per-job and session mode *but not in application mode.* Thanks -- This message was sent by Atlassian Jira (v8.3.4#803005)
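For context, the generic `-D` options from the ticket are passed on the command line when submitting in application mode. A hedged sketch of such a submission; the jar path is a placeholder, not something from the ticket:

```shell
# Application-mode submission using the generic options listed in the ticket.
# "-t yarn-application" selects the YARN application-mode deployment target;
# ./usrlib/my-job.jar is an illustrative placeholder for the user jar.
./bin/flink run-application -t yarn-application \
  -Djobmanager.memory.process.size=1024m \
  -Dtaskmanager.memory.process.size=1024m \
  -Dyarn.application.queue=root.test \
  -Dyarn.application.name=flink-hbase \
  -Dyarn.application.type=flink \
  ./usrlib/my-job.jar
```

The ticket's request is that short yarn-cluster options (as accepted by per-job and session mode) would work here instead of the verbose `-D` form.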
[GitHub] [flink] flinkbot edited a comment on pull request #12872: [FLINK-18550][sql-client] use TableResult#collect to get select result for sql client
flinkbot edited a comment on pull request #12872: URL: https://github.com/apache/flink/pull/12872#issuecomment-656681765 ## CI report: * 71ee80061eebe7ee3dc50b2b73e991cc074d3e26 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=4907) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] kl0u commented on pull request #12791: [FLINK-18362][FLINK-13838][yarn] Add yarn.ship-archives to support LocalResourceType.ARCHIVE
kl0u commented on pull request #12791: URL: https://github.com/apache/flink/pull/12791#issuecomment-664001248 Hi @eogramow , Azure fails on your PR, and the same happens after I rebased your branch on the current master. This is why I have not merged it yet. They both fail on the Kerberized YARN test. Could you investigate what the root cause may be? BTW the master seems to be passing. If you also want to have a look, this is my branch with the fix: https://github.com/kl0u/flink/tree/yarn-archives-rebased This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
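To reproduce the rebased state locally, the branch mentioned in the comment can be fetched directly with standard git commands (the remote name `kl0u` is just a local label):

```shell
# Fetch and check out kl0u's rebased branch; the URL is taken from the
# comment above.
git remote add kl0u https://github.com/kl0u/flink.git
git fetch kl0u yarn-archives-rebased
git checkout -b yarn-archives-rebased kl0u/yarn-archives-rebased
```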
[jira] [Closed] (FLINK-18710) ResourceProfileInfo is not serializable
[ https://issues.apache.org/jira/browse/FLINK-18710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann closed FLINK-18710. - Resolution: Fixed Fixed via master: ec8a08eb7aefb837b8c9befddad825ba2459500a 1.11.2: b86ddff71c71452102f3ee908db6c0ddc23240c4 > ResourceProfileInfo is not serializable > --- > > Key: FLINK-18710 > URL: https://issues.apache.org/jira/browse/FLINK-18710 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.11.0 >Reporter: Till Rohrmann >Assignee: Till Rohrmann >Priority: Major > Labels: pull-request-available > Fix For: 1.12.0, 1.11.2 > > > {{ResourceProfileInfo}} should be {{Serializable}} because it is sent as an > RPC payload. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] tillrohrmann closed pull request #12991: [FLINK-18710] Make ResourceProfileInfo serializable
tillrohrmann closed pull request #12991: URL: https://github.com/apache/flink/pull/12991 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #12872: [FLINK-18550][sql-client] use TableResult#collect to get select result for sql client
flinkbot edited a comment on pull request #12872: URL: https://github.com/apache/flink/pull/12872#issuecomment-656681765 ## CI report: * 913aaebdeacc913d6b9f8498d55547aa8d480f10 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=4525) * 71ee80061eebe7ee3dc50b2b73e991cc074d3e26 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=4907) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-7129) Support dynamically changing CEP patterns
[ https://issues.apache.org/jira/browse/FLINK-7129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165212#comment-17165212 ] Konstantin Knauf commented on FLINK-7129: - [~lukefilwalker] Yes, the complexity of the implementation depends on your requirements (mostly complexity of patterns & latency I'd say). I think, Alex provided a simple example of this in his blog post series. With respect to this ticket, I am not aware of anyone working on it. > Support dynamically changing CEP patterns > - > > Key: FLINK-7129 > URL: https://issues.apache.org/jira/browse/FLINK-7129 > Project: Flink > Issue Type: New Feature > Components: Library / CEP >Reporter: Dawid Wysakowicz >Priority: Major > > An umbrella task for introducing mechanism for injecting patterns through > coStream -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #12872: [FLINK-18550][sql-client] use TableResult#collect to get select result for sql client
flinkbot edited a comment on pull request #12872: URL: https://github.com/apache/flink/pull/12872#issuecomment-656681765 ## CI report: * 913aaebdeacc913d6b9f8498d55547aa8d480f10 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=4525) * 71ee80061eebe7ee3dc50b2b73e991cc074d3e26 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-18717) reuse MiniCluster in table integration test class ?
godfrey he created FLINK-18717: -- Summary: reuse MiniCluster in table integration test class ? Key: FLINK-18717 URL: https://issues.apache.org/jira/browse/FLINK-18717 Project: Flink Issue Type: Improvement Components: Table SQL / Legacy Planner, Table SQL / Planner Affects Versions: 1.11.0 Reporter: godfrey he Before 1.11, the {{MiniCluster}} could be reused across the test cases of an integration test class (see TestStreamEnvironment#setAsContext). In 1.11, after we corrected the execution behavior of TableEnvironment, StreamTableEnvironment and BatchTableEnvironment (see [FLINK-16363|https://issues.apache.org/jira/browse/FLINK-16363], [FLINK-17126|https://issues.apache.org/jira/browse/FLINK-17126]), a MiniCluster is created for each test case, even within the same test class (see {{org.apache.flink.client.deployment.executors.LocalExecutor}}). It would be better if we could reuse the {{MiniCluster}} as before. One approach is to provide a new kind of MiniCluster factory (such as: SessionMiniClusterFactory) instead of using {{PerJobMiniClusterFactory}}. WDYT? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #12992: [FLINK-18716][python][docs] Remove the deprecated execute and insert_into calls in PyFlink Table API docs.
flinkbot edited a comment on pull request #12992: URL: https://github.com/apache/flink/pull/12992#issuecomment-663982442 ## CI report: * 7c53366ee4c2dd53f6689ea7caf9f819502d8426 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=4906) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #12992: [FLINK-18716][python][docs] Remove the deprecated execute and insert_into calls in PyFlink Table API docs.
flinkbot commented on pull request #12992: URL: https://github.com/apache/flink/pull/12992#issuecomment-663982442 ## CI report: * 7c53366ee4c2dd53f6689ea7caf9f819502d8426 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #12992: [FLINK-18716][python][docs] Remove the deprecated execute and insert_into calls in PyFlink Table API docs.
flinkbot commented on pull request #12992: URL: https://github.com/apache/flink/pull/12992#issuecomment-663980522 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 7c53366ee4c2dd53f6689ea7caf9f819502d8426 (Sun Jul 26 12:14:10 UTC 2020) **Warnings:** * **This pull request references an unassigned [Jira ticket](https://issues.apache.org/jira/browse/FLINK-18716).** According to the [code contribution guide](https://flink.apache.org/contributing/contribute-code.html), tickets need to be assigned before starting with the implementation work. Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-18716) Remove the deprecated "execute" and "insert_into" calls in PyFlink Table API docs
[ https://issues.apache.org/jira/browse/FLINK-18716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-18716: --- Labels: pull-request-available (was: ) > Remove the deprecated "execute" and "insert_into" calls in PyFlink Table API > docs > - > > Key: FLINK-18716 > URL: https://issues.apache.org/jira/browse/FLINK-18716 > Project: Flink > Issue Type: Improvement > Components: API / Python, Documentation >Affects Versions: 1.11.0 >Reporter: Wei Zhong >Priority: Minor > Labels: pull-request-available > > Currently the TableEnvironment#execute and the Table#insert_into is > deprecated, but the docs of PyFlink Table API still use them. We should > replace them with the recommended API. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] WeiZhong94 opened a new pull request #12992: [FLINK-18716][python][docs] Remove the deprecated execute and insert_into calls in PyFlink Table API docs.
WeiZhong94 opened a new pull request #12992: URL: https://github.com/apache/flink/pull/12992 ## What is the purpose of the change *This pull request removes the deprecated execute and insert_into calls in PyFlink Table API docs.* ## Brief change log - *Remove the deprecated execute and insert_into calls in PyFlink Table API docs.* ## Verifying this change This change is a trivial rework / code cleanup without any test coverage. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? (docs) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-18716) Remove the deprecated "execute" and "insert_into" calls in PyFlink Table API docs
Wei Zhong created FLINK-18716: - Summary: Remove the deprecated "execute" and "insert_into" calls in PyFlink Table API docs Key: FLINK-18716 URL: https://issues.apache.org/jira/browse/FLINK-18716 Project: Flink Issue Type: Improvement Components: API / Python, Documentation Affects Versions: 1.11.0 Reporter: Wei Zhong Currently the TableEnvironment#execute and the Table#insert_into is deprecated, but the docs of PyFlink Table API still use them. We should replace them with the recommended API. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] Myracle commented on a change in pull request #12958: [FLINK-18625][runtime] Maintain redundant taskmanagers to speed up failover
Myracle commented on a change in pull request #12958: URL: https://github.com/apache/flink/pull/12958#discussion_r460508736 ## File path: flink-core/src/main/java/org/apache/flink/configuration/ResourceManagerOptions.java ## @@ -67,6 +67,12 @@ "for streaming workloads, which may fail if there are not enough slots. Note that this configuration option does not take " + "effect for standalone clusters, where how many slots are allocated is not controlled by Flink."); + public static final ConfigOption REDUNDANT_SLOT_NUM = ConfigOptions Review comment: Got it. I will expose slotmanager.redundant-taskmanager-num to users. In flink, the calculation will be based on slots. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
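If the option is exposed as discussed, enabling it would presumably come down to a single configuration entry. A hypothetical flink-conf.yaml fragment; the option name follows the comment above and may still change before the PR is merged:

```shell
# Append the hypothetical redundancy option to a flink-conf.yaml.
# "conf/" stands in for the Flink distribution's conf/ directory.
mkdir -p conf
cat >> conf/flink-conf.yaml <<'EOF'
slotmanager.redundant-taskmanager-num: 2
EOF
```

The intent per the PR title is that the slot manager keeps spare taskmanagers around so failover does not wait for new containers.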
[GitHub] [flink] Myracle commented on a change in pull request #12958: [FLINK-18625][runtime] Maintain redundant taskmanagers to speed up failover
Myracle commented on a change in pull request #12958: URL: https://github.com/apache/flink/pull/12958#discussion_r460508614 ## File path: flink-runtime/src/test/java/org/apache/flink/runtime/resourcemanager/slotmanager/SlotManagerImplTest.java ## @@ -976,6 +976,7 @@ public void testTaskManagerTimeoutDoesNotRemoveSlots() throws Exception { try (final SlotManager slotManager = createSlotManagerBuilder() .setTaskManagerTimeout(taskManagerTimeout) + .setRedundantSlotNum(0) Review comment: If the default value is changed for a certain organization or special situation, developers would need to fix this test. A unit test should pass under any valid config. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-18712) Flink RocksDB statebackend memory leak issue
[ https://issues.apache.org/jira/browse/FLINK-18712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165194#comment-17165194 ] Yun Tang commented on FLINK-18712: -- [~lio_sy], do you have a simple job to reproduce this, and what happens if we move this job to YARN, as it would also kill the container once memory exceeds the memory limit? I ask because I wonder whether k8s takes the OS cache into account in the memory usage for that pod. To know how much memory is used by RocksDB, there are two ways: # Turn on the block cache memory usage metrics: [https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/config.html#state-backend-rocksdb-metrics-block-cache-usage] when managed memory for RocksDB is enabled. And remember that all RocksDB instances within one slot will report the same value. # Use jemalloc and jeprof to see how much memory is allocated from the OS by RocksDB; this is much more precise than the 1st solution, and you could refer to [https://github.com/jeffgriffith/native-jvm-leaks/#going-native-with-jemalloc] to re-build and preload the related ".so" file. For Flink, you could refer to [https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/config.html#forwarding-environment-variables] to learn how to pass the environment variables {{LD_PRELOAD}} and {{MALLOC_CONF}}. I think by doing this, you could find out whether RocksDB has used more memory than expected. > Flink RocksDB statebackend memory leak issue > - > > Key: FLINK-18712 > URL: https://issues.apache.org/jira/browse/FLINK-18712 > Project: Flink > Issue Type: Bug > Components: Runtime / State Backends >Affects Versions: 1.10.0 >Reporter: Farnight >Priority: Critical > > When using RocksDB as our statebackend, we found it will lead to memory leak > when restarting job (manually or in recovery case). > > How to reproduce: > # increase RocksDB blockcache size(e.g. 1G), it is easier to monitor and > reproduce. > # start a job using RocksDB statebackend. 
> # when the RocksDB blockcache reachs maximum size, restart the job. and > monitor the memory usage (k8s pod working set) of the TM. > # go through step 2-3 few more times. and memory will keep raising. > > Any solution or suggestion for this? Thanks! -- This message was sent by Atlassian Jira (v8.3.4#803005)
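Concretely, the second approach above (jemalloc + jeprof) boils down to forwarding two environment variables to the taskmanager containers via the `containerized.taskmanager.env.*` config keys mentioned in the linked docs. A hedged sketch; the libjemalloc path and the jemalloc profiling options are assumptions that depend on your container image:

```shell
# Append env-forwarding entries to a flink-conf.yaml ("conf/" stands in for
# the Flink distribution's conf/ directory). The .so path is an assumption.
mkdir -p conf
cat >> conf/flink-conf.yaml <<'EOF'
containerized.taskmanager.env.LD_PRELOAD: /usr/lib/x86_64-linux-gnu/libjemalloc.so
containerized.taskmanager.env.MALLOC_CONF: prof:true,lg_prof_interval:30,lg_prof_sample:17
EOF
```

With these set, jemalloc periodically dumps heap profiles that jeprof can inspect, per the native-jvm-leaks write-up linked in the comment.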
[jira] [Comment Edited] (FLINK-18663) Fix Flink On YARN AM not exit
[ https://issues.apache.org/jira/browse/FLINK-18663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165176#comment-17165176 ] tartarus edited comment on FLINK-18663 at 7/26/20, 7:48 AM: [~trohrmann] hello, you said the {{terminationFuture}} is {{AbstractHandler.terminationFuture}}? {{AbstractHandler.terminationFuture}} waits for {{AbstractHandler.inFlightRequestTracker}}. was (Author: tartarus): [~trohrmann] hello, you said the \{{ terminationFuture }} is \{{ AbstractHandler.terminationFuture }} ? {\{ AbstractHandler.terminationFuture }} wait for \{{ AbstractHandler.inFlightRequestTracker }}. > Fix Flink On YARN AM not exit > - > > Key: FLINK-18663 > URL: https://issues.apache.org/jira/browse/FLINK-18663 > Project: Flink > Issue Type: Bug > Components: Runtime / REST >Affects Versions: 1.10.0, 1.10.1, 1.11.0 >Reporter: tartarus >Assignee: tartarus >Priority: Critical > Labels: pull-request-available > Attachments: 110.png, 111.png, > C49A7310-F932-451B-A203-6D17F3140C0D.png, e18e00dd6664485c2ff55284fe969474.png > > > AbstractHandler throws NPE caused by FlinkHttpObjectAggregator being null > when rest throws an exception, it will run this code > {code:java} > private CompletableFuture<Void> handleException(Throwable throwable, > ChannelHandlerContext ctx, HttpRequest httpRequest) { > FlinkHttpObjectAggregator flinkHttpObjectAggregator = > ctx.pipeline().get(FlinkHttpObjectAggregator.class); > int maxLength = flinkHttpObjectAggregator.maxContentLength() - > OTHER_RESP_PAYLOAD_OVERHEAD; > if (throwable instanceof RestHandlerException) { > RestHandlerException rhe = (RestHandlerException) throwable; > String stackTrace = ExceptionUtils.stringifyException(rhe); > String truncatedStackTrace = Ascii.truncate(stackTrace, > maxLength, "..."); > if (log.isDebugEnabled()) { > log.error("Exception occurred in REST handler.", rhe); > } else { > log.error("Exception occurred in REST handler: {}", > rhe.getMessage()); > } > return HandlerUtils.sendErrorResponse( > ctx, > httpRequest, > new ErrorResponseBody(truncatedStackTrace), > rhe.getHttpResponseStatus(), > responseHeaders); > } else { > log.error("Unhandled exception.", throwable); > String stackTrace = String.format("<Exception on server side:%n%s%nEnd of exception on server side>", > ExceptionUtils.stringifyException(throwable)); > String truncatedStackTrace = Ascii.truncate(stackTrace, > maxLength, "..."); > return HandlerUtils.sendErrorResponse( > ctx, > httpRequest, > new ErrorResponseBody(Arrays.asList("Internal server > error.", truncatedStackTrace)), > HttpResponseStatus.INTERNAL_SERVER_ERROR, > responseHeaders); > } > } > {code} > but flinkHttpObjectAggregator is null in some cases, so this will throw NPE, but > this method is called by AbstractHandler#respondAsLeader > {code:java} > requestProcessingFuture > .whenComplete((Void ignored, Throwable throwable) -> { > if (throwable != null) { > > handleException(ExceptionUtils.stripCompletionException(throwable), ctx, > httpRequest) > .whenComplete((Void ignored2, Throwable > throwable2) -> finalizeRequestProcessing(finalUploadedFiles)); > } else { > finalizeRequestProcessing(finalUploadedFiles); > } > }); > {code} > the result is that InFlightRequestTracker cannot be cleared, > so the CompletableFuture that the handler's closeAsync returned doesn't complete > !C49A7310-F932-451B-A203-6D17F3140C0D.png! > !e18e00dd6664485c2ff55284fe969474.png! > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-18663) Fix Flink On YARN AM not exit
[ https://issues.apache.org/jira/browse/FLINK-18663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165176#comment-17165176 ]

tartarus commented on FLINK-18663:
----------------------------------

[~trohrmann] Hello, do you mean that the {{terminationFuture}} is {{AbstractHandler.terminationFuture}}? {{AbstractHandler.terminationFuture}} waits for {{AbstractHandler.inFlightRequestTracker}}.

> Fix Flink on YARN AM not exiting
> --------------------------------
>
>                 Key: FLINK-18663
>                 URL: https://issues.apache.org/jira/browse/FLINK-18663
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / REST
>    Affects Versions: 1.10.0, 1.10.1, 1.11.0
>            Reporter: tartarus
>            Assignee: tartarus
>            Priority: Critical
>              Labels: pull-request-available
>         Attachments: 110.png, 111.png, C49A7310-F932-451B-A203-6D17F3140C0D.png, e18e00dd6664485c2ff55284fe969474.png
>
> AbstractHandler throws an NPE because FlinkHttpObjectAggregator is null.
> When a REST handler throws an exception, the following code runs:
> {code:java}
> private CompletableFuture<Void> handleException(Throwable throwable, ChannelHandlerContext ctx, HttpRequest httpRequest) {
>     FlinkHttpObjectAggregator flinkHttpObjectAggregator = ctx.pipeline().get(FlinkHttpObjectAggregator.class);
>     int maxLength = flinkHttpObjectAggregator.maxContentLength() - OTHER_RESP_PAYLOAD_OVERHEAD;
>     if (throwable instanceof RestHandlerException) {
>         RestHandlerException rhe = (RestHandlerException) throwable;
>         String stackTrace = ExceptionUtils.stringifyException(rhe);
>         String truncatedStackTrace = Ascii.truncate(stackTrace, maxLength, "...");
>         if (log.isDebugEnabled()) {
>             log.error("Exception occurred in REST handler.", rhe);
>         } else {
>             log.error("Exception occurred in REST handler: {}", rhe.getMessage());
>         }
>         return HandlerUtils.sendErrorResponse(
>             ctx,
>             httpRequest,
>             new ErrorResponseBody(truncatedStackTrace),
>             rhe.getHttpResponseStatus(),
>             responseHeaders);
>     } else {
>         log.error("Unhandled exception.", throwable);
>         String stackTrace = String.format("<Exception on server side:%n%s%nEnd of exception on server side>",
>             ExceptionUtils.stringifyException(throwable));
>         String truncatedStackTrace = Ascii.truncate(stackTrace, maxLength, "...");
>         return HandlerUtils.sendErrorResponse(
>             ctx,
>             httpRequest,
>             new ErrorResponseBody(Arrays.asList("Internal server error.", truncatedStackTrace)),
>             HttpResponseStatus.INTERNAL_SERVER_ERROR,
>             responseHeaders);
>     }
> }
> {code}
> In some cases, however, flinkHttpObjectAggregator is null, so this throws an NPE. This method is called by AbstractHandler#respondAsLeader:
> {code:java}
> requestProcessingFuture
>     .whenComplete((Void ignored, Throwable throwable) -> {
>         if (throwable != null) {
>             handleException(ExceptionUtils.stripCompletionException(throwable), ctx, httpRequest)
>                 .whenComplete((Void ignored2, Throwable throwable2) -> finalizeRequestProcessing(finalUploadedFiles));
>         } else {
>             finalizeRequestProcessing(finalUploadedFiles);
>         }
>     });
> {code}
> As a result, the InFlightRequestTracker cannot be cleared, so the CompletableFuture returned by the handler's closeAsync never completes.
> !C49A7310-F932-451B-A203-6D17F3140C0D.png!
> !e18e00dd6664485c2ff55284fe969474.png!

-- This message was sent by Atlassian Jira (v8.3.4#803005)
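The NPE described above comes from dereferencing the pipeline lookup result without a null check. A minimal sketch of the guard pattern, assuming a fallback default is acceptable: the pipeline is simulated here with a plain Map, and both constants (`DEFAULT_MAX_CONTENT_LENGTH`, `OTHER_RESP_PAYLOAD_OVERHEAD`) are hypothetical stand-ins, not Flink's actual values. The real fix would apply the same guard to `ctx.pipeline().get(FlinkHttpObjectAggregator.class)`.

```java
import java.util.HashMap;
import java.util.Map;

public class HandlerNullGuard {
    // Hypothetical constants for illustration only; not Flink's real values.
    static final int DEFAULT_MAX_CONTENT_LENGTH = 104_857_600;
    static final int OTHER_RESP_PAYLOAD_OVERHEAD = 256;

    // Stand-in for ctx.pipeline(): maps handler name to its max content length.
    // Returns a usable max length even when the aggregator handler is absent.
    static int safeMaxLength(Map<String, Integer> pipeline) {
        Integer aggregatorMax = pipeline.get("flinkHttpObjectAggregator");
        // Null guard: fall back to a default instead of throwing an NPE.
        int max = (aggregatorMax != null) ? aggregatorMax : DEFAULT_MAX_CONTENT_LENGTH;
        return max - OTHER_RESP_PAYLOAD_OVERHEAD;
    }

    public static void main(String[] args) {
        Map<String, Integer> withAggregator = new HashMap<>();
        withAggregator.put("flinkHttpObjectAggregator", 1024);
        System.out.println(safeMaxLength(withAggregator));      // 768
        System.out.println(safeMaxLength(new HashMap<>()));     // falls back to the default
    }
}
```

Netty's `ChannelPipeline#get(Class)` returns null when no handler of that type is in the pipeline, which is why the lookup result must be checked before calling methods on it.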
[jira] [Comment Edited] (FLINK-18202) Introduce Protobuf format
[ https://issues.apache.org/jira/browse/FLINK-18202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165170#comment-17165170 ]

Zou edited comment on FLINK-18202 at 7/26/20, 7:31 AM:
-------------------------------------------------------

Hi [~libenchao], here's my take:

1.1 I think it would be convenient for both users and developers if we could derive the table schema from the proto definition.
1.2 I prefer the 'compiled proto class' option, and maybe we should restrict the protobuf version used to compile the class.
1.3 Shall we run a simple performance test for both options? I am inclined toward DynamicMessage if there is no significant performance difference.

Furthermore, maybe we should discuss the default value for missing fields in protobuf (such as oneof fields): should we use `null` or the PB default value? I prefer the former.

> Introduce Protobuf format
> -------------------------
>
>                 Key: FLINK-18202
>                 URL: https://issues.apache.org/jira/browse/FLINK-18202
>             Project: Flink
>          Issue Type: New Feature
>          Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table SQL / API
>            Reporter: Benchao Li
>            Priority: Major
>         Attachments: image-2020-06-15-17-18-03-182.png
>
> PB[1] is a very famous and widely used (de)serialization framework. The ML[2] also has some discussion about this. It's a useful feature.
> This issue may need some design work, or a FLIP.
> [1] [https://developers.google.com/protocol-buffers]
> [2] [http://apache-flink.147419.n8.nabble.com/Flink-SQL-UDF-td3725.html]

-- This message was sent by Atlassian Jira (v8.3.4#803005)
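The null-vs-default question in point 1.3 can be made concrete. A minimal sketch, assuming a plain Map as a stand-in for a parsed message (real code would use protobuf-java's `Message.hasField(...)` to detect unset fields); both helper methods and the field names are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

public class MissingFieldPolicy {
    // Policy A (preferred in the comment): a missing field maps to null,
    // so downstream SQL can distinguish "unset" from "set to the default".
    static Object asNull(Map<String, Object> msg, String field) {
        return msg.get(field); // null when the field is absent
    }

    // Policy B: a missing field maps to the protobuf default for its type
    // (e.g. "" for strings, 0 for ints), losing the unset/set distinction.
    static Object asProtoDefault(Map<String, Object> msg, String field, Object protoDefault) {
        return msg.getOrDefault(field, protoDefault);
    }

    public static void main(String[] args) {
        Map<String, Object> msg = new HashMap<>();
        msg.put("id", 42L);
        System.out.println(asNull(msg, "name"));             // null
        System.out.println(asProtoDefault(msg, "name", "")); // empty string
    }
}
```

Note that in proto3, scalar fields set to their default value are indistinguishable on the wire from unset fields, which is part of why the choice matters for oneof and message fields in particular.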