[GitHub] [flink] flinkbot edited a comment on pull request #19216: hive dialect supports dividing by zero.
flinkbot edited a comment on pull request #19216: URL: https://github.com/apache/flink/pull/19216#issuecomment-1077035902 ## CI report: * d6b0ee4743643655b4326b9584928f7e05680d26 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=33674) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #19207: [FLINK-26700][docs] Document restore mode in chinese
flinkbot edited a comment on pull request #19207: URL: https://github.com/apache/flink/pull/19207#issuecomment-1075886786 ## CI report: * 0724fedd6789ee0c8b95dfba9e85e689b9186c20 UNKNOWN * a3818995538b52aef26035d82256b58651ce3a57 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=33670) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Comment Edited] (FLINK-26738) Default value of StateDescriptor is valid when enable state ttl config
[ https://issues.apache.org/jira/browse/FLINK-26738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17511616#comment-17511616 ] Jianhui Dong edited comment on FLINK-26738 at 3/24/22, 6:51 AM: [~yunta], thanks for your explanation. I think it's a good idea to make this clearer in the docs and Javadocs. Should we also mark org.apache.flink.api.common.state.StateDescriptor#default(https://github.com/apache/flink/blob/7d7a111eba368043f8624e114daa29400a74c096/flink-core/src/main/java/org/apache/flink/api/common/state/StateDescriptor.java#L107) as deprecated, and not only the org.apache.flink.api.common.state.ValueStateDescriptor constructor(https://github.com/apache/flink/blob/7d7a111eba368043f8624e114daa29400a74c096/flink-core/src/main/java/org/apache/flink/api/common/state/ValueStateDescriptor.java#L70)? was (Author: lam167): [~yunta]thanks for your explanation, I think it's a good idea to make it clearer in docs and javadocs, and should we mark org.apache.flink.api.common.state.StateDescriptor#default(https://github.com/apache/flink/blob/7d7a111eba368043f8624e114daa29400a74c096/flink-core/src/main/java/org/apache/flink/api/common/state/StateDescriptor.java#L107) as deprecated too, not only mark org.apache.flink.api.common.state.ValueStateDescriptor constructor(https://github.com/apache/flink/blob/7d7a111eba368043f8624e114daa29400a74c096/flink-core/src/main/java/org/apache/flink/api/common/state/ValueStateDescriptor.java#L70) > Default value of StateDescriptor is valid when enable state ttl config > -- > > Key: FLINK-26738 > URL: https://issues.apache.org/jira/browse/FLINK-26738 > Project: Flink > Issue Type: Bug > Components: API / Core >Affects Versions: 1.15.0 >Reporter: Jianhui Dong >Priority: Critical > > Suppose we declare a ValueState like following: > {code:java} > ValueStateDescriptor<Tuple2<Long, Long>> descriptor = > new ValueStateDescriptor<>( > "average", // the state name > TypeInformation.of(new TypeHint<Tuple2<Long, Long>>() > {}), > Tuple2.of(0L, 0L)); > 
{code} > and then we add state ttl config to the state: > {code:java} > descriptor.enableTimeToLive(StateTtlConfigUtil.createTtlConfig(6)); > {code} > the default value Tuple2.of(0L, 0L) will be invalid and may cause NPE. > I don't know if this is a bug cause I see @Deprecated in the comment of the > ValueStateDescriptor constructor with argument defaultValue: > {code:java} > Use {@link #ValueStateDescriptor(String, TypeSerializer)} instead and manually > * manage the default value by checking whether the contents of the > state is {@code null}. > {code} > and if we decide not to use the defaultValue field in the class > StateDescriptor, should we add @Deprecated annotation to the field > defaultValue? -- This message was sent by Atlassian Jira (v8.20.1#820001)
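The issue above turns on the fact that a default value becomes ambiguous once state entries can expire. A small plain-Python toy model of that ambiguity (this is an illustration only, not Flink's API — `TtlValueState` and all its methods are hypothetical names):

```python
import time

# Toy model (NOT Flink's API; all names here are hypothetical) of why a
# state default value becomes ambiguous once a TTL can expire entries.

class TtlValueState:
    def __init__(self, default=None, ttl_seconds=60.0):
        self._default = default
        self._ttl = ttl_seconds
        self._entries = {}  # key -> (value, write_timestamp)

    def update(self, key, value, now=None):
        self._entries[key] = (value, time.time() if now is None else now)

    def value(self, key, now=None):
        now = time.time() if now is None else now
        entry = self._entries.get(key)
        if entry is None:
            return self._default       # never written -> default
        stored, written_at = entry
        if now - written_at > self._ttl:
            return self._default       # expired -> also the default!
        return stored

state = TtlValueState(default=(0, 0), ttl_seconds=60.0)
state.update("a", (5, 2), now=0.0)

assert state.value("a", now=10.0) == (5, 2)   # live value
assert state.value("a", now=120.0) == (0, 0)  # expired: default returned
assert state.value("b", now=10.0) == (0, 0)   # never written: default returned
# The last two reads are indistinguishable from a real stored (0, 0);
# returning None and checking for it explicitly avoids the ambiguity.
```

This is the semantic problem behind deprecating the default value: a caller reading `(0, 0)` cannot tell a live value from an expired or missing one.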
[GitHub] [flink] flinkbot edited a comment on pull request #19207: [FLINK-26700][docs] Document restore mode in chinese
flinkbot edited a comment on pull request #19207: URL: https://github.com/apache/flink/pull/19207#issuecomment-1075886786 ## CI report: * 0724fedd6789ee0c8b95dfba9e85e689b9186c20 UNKNOWN * a3818995538b52aef26035d82256b58651ce3a57 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=33670) * c51c57e88d01803384ebe25d92fd0b3121dae045 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #19218: hive dialect supports select current database
flinkbot edited a comment on pull request #19218: URL: https://github.com/apache/flink/pull/19218#issuecomment-1077287637 ## CI report: * 2a63613728998c9addd18e281f91f02ca2f31ed4 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=33677) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #19218: hive dialect supports select current database
flinkbot commented on pull request #19218: URL: https://github.com/apache/flink/pull/19218#issuecomment-1077287637 ## CI report: * 2a63613728998c9addd18e281f91f02ca2f31ed4 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] luoyuxia opened a new pull request #19218: hive dialect supports select current database
luoyuxia opened a new pull request #19218: URL: https://github.com/apache/flink/pull/19218 ## What is the purpose of the change *(For example: This pull request makes task deployment go through the blob server, rather than through RPC. That way we avoid re-transferring them on each deployment (during recovery).)* ## Brief change log *(for example:)* - *The TaskInfo is stored in the blob store on job creation time as a persistent artifact* - *Deployments RPC transmits only the blob storage reference* - *TaskManagers retrieve the TaskInfo from the blob cache* ## Verifying this change Please make sure both new and modified tests in this PR follows the conventions defined in our code quality guide: https://flink.apache.org/contributing/code-style-and-quality-common.html#testing *(Please pick either of the following options)* This change is a trivial rework / code cleanup without any test coverage. *(or)* This change is already covered by existing tests, such as *(please describe tests)*. *(or)* This change added tests and can be verified as follows: *(example:)* - *Added integration tests for end-to-end deployment with large payloads (100MB)* - *Extended integration test for recovery after master (JobManager) failure* - *Added test that validates that TaskInfo is transferred only once across recoveries* - *Manually verified the change by running a 4 node cluster with 2 JobManagers and 4 TaskManagers, a stateful streaming program, and killing one JobManager and two TaskManagers during the execution, verifying that recovery happens correctly.* ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / no) - The serializers: (yes / no / don't know) - The runtime per-record code paths (performance sensitive): (yes / no / don't know) - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know) - The S3 file system connector: (yes / no / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / no) - If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] KarmaGYZ commented on a change in pull request #19149: [FLINK-26732][runtime] logs key info for DefaultResourceAllocationStr…
KarmaGYZ commented on a change in pull request #19149: URL: https://github.com/apache/flink/pull/19149#discussion_r833960305 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/DefaultResourceAllocationStrategy.java ## @@ -52,6 +55,10 @@ * multi-dimensional resource profiles. The complexity is not necessary. */ public class DefaultResourceAllocationStrategy implements ResourceAllocationStrategy { Review comment: I think it should be also applied to `FineGrainedSlotManager` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Closed] (FLINK-26817) Update ingress docs with templating examples
[ https://issues.apache.org/jira/browse/FLINK-26817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Wang closed FLINK-26817. - Resolution: Fixed Fixed via: main: 3906e66416fd933d091cfa840c4e24cbcf419d41 > Update ingress docs with templating examples > > > Key: FLINK-26817 > URL: https://issues.apache.org/jira/browse/FLINK-26817 > Project: Flink > Issue Type: Sub-task >Reporter: Matyas Orhidi >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (FLINK-26817) Update ingress docs with templating examples
[ https://issues.apache.org/jira/browse/FLINK-26817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Wang reassigned FLINK-26817: - Assignee: Matyas Orhidi > Update ingress docs with templating examples > > > Key: FLINK-26817 > URL: https://issues.apache.org/jira/browse/FLINK-26817 > Project: Flink > Issue Type: Sub-task >Reporter: Matyas Orhidi >Assignee: Matyas Orhidi >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Comment Edited] (FLINK-26718) Limitations of flink+hive dimension table
[ https://issues.apache.org/jira/browse/FLINK-26718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17511605#comment-17511605 ] luoyuxia edited comment on FLINK-26718 at 3/24/22, 6:29 AM: [~kunghsu] If you use Hive as the dimension table, the answer is no. But for other types of dimension tables that support efficient lookup by key, such as HBase, Redis, etc., only the matched data is loaded into memory instead of the whole bulk of data. was (Author: luoyuxia): [~kunghsu] If you use hive as dimension table, the answer is not. > Limitations of flink+hive dimension table > - > > Key: FLINK-26718 > URL: https://issues.apache.org/jira/browse/FLINK-26718 > Project: Flink > Issue Type: Bug > Components: Connectors / Hive >Affects Versions: 1.12.7 >Reporter: kunghsu >Priority: Major > Labels: HIVE > > Limitations of flink+hive dimension table > The scenario I am involved in is a join relationship between the Kafka input > table and the Hive dimension table. The hive dimension table is some user > data, and the data is very large. > When the data volume of the hive table is small, about a few hundred rows, > everything is normal, the partition is automatically recognized and the > entire task is executed normally. > When the hive table reached about 1.3 million, the TaskManager began to fail > to work properly. It was very difficult to even look at the log. I guess it > burst the JVM memory when it tried to load the entire table into memory. You > can see that a heartbeat timeout exception occurs in Taskmanager, such as > Heartbeat TimeoutException.I even increased the parallelism to no avail. > Official website documentation: > [https://nightlies.apache.org/flink/flink-docs-release-1.12/dev/table/connectors/hive/hive_read_write.html#source-parallelism-inference] > So I have a question, does flink+hive not support association of large tables > so far? > Is this solution unusable when the amount of data is too large? 
> > > > Simply estimate, how much memory will 25 million data take up? > Suppose a line of data is 1K, 25 million K is 25000M, or 25G. > If the memory of the TM is set to 32G, can the problem be solved? > It doesn't seem to work either, because this can only be allocated roughly > 16G to the jvm. > Assuming that the official solution can support such a large amount, how > should the memory of the TM be set? > -- This message was sent by Atlassian Jira (v8.20.1#820001)
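The back-of-the-envelope estimate quoted in the issue (25 million rows at roughly 1 KB per row — both figures are the reporter's assumptions) can be checked directly:

```python
# Back-of-the-envelope check of the estimate in the issue: 25 million rows
# at roughly 1 KB per row (both figures are the issue reporter's assumptions).
rows = 25_000_000
bytes_per_row = 1024  # "a line of data is 1K"

total_bytes = rows * bytes_per_row
total_gib = total_bytes / 1024**3
print(f"{total_gib:.1f} GiB")  # roughly the ~25 GB cited in the issue
```

Which matches the issue's point: even a 32 GB TaskManager, of which only part goes to the JVM heap, cannot hold the whole table in memory.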
[jira] [Commented] (FLINK-26718) Limitations of flink+hive dimension table
[ https://issues.apache.org/jira/browse/FLINK-26718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17511605#comment-17511605 ] luoyuxia commented on FLINK-26718: -- [~kunghsu] If you use hive as the dimension table, the answer is no. > Limitations of flink+hive dimension table > - > > Key: FLINK-26718 > URL: https://issues.apache.org/jira/browse/FLINK-26718 > Project: Flink > Issue Type: Bug > Components: Connectors / Hive >Affects Versions: 1.12.7 >Reporter: kunghsu >Priority: Major > Labels: HIVE > > Limitations of flink+hive dimension table > The scenario I am involved in is a join relationship between the Kafka input > table and the Hive dimension table. The hive dimension table is some user > data, and the data is very large. > When the data volume of the hive table is small, about a few hundred rows, > everything is normal, the partition is automatically recognized and the > entire task is executed normally. > When the hive table reached about 1.3 million, the TaskManager began to fail > to work properly. It was very difficult to even look at the log. I guess it > burst the JVM memory when it tried to load the entire table into memory. You > can see that a heartbeat timeout exception occurs in Taskmanager, such as > Heartbeat TimeoutException.I even increased the parallelism to no avail. > Official website documentation: > [https://nightlies.apache.org/flink/flink-docs-release-1.12/dev/table/connectors/hive/hive_read_write.html#source-parallelism-inference] > So I have a question, does flink+hive not support association of large tables > so far? > Is this solution unusable when the amount of data is too large? > > > > Simply estimate, how much memory will 25 million data take up? > Suppose a line of data is 1K, 25 million K is 25000M, or 25G. > If the memory of the TM is set to 32G, can the problem be solved? > It doesn't seem to work either, because this can only be allocated roughly > 16G to the jvm. 
> Assuming that the official solution can support such a large amount, how > should the memory of the TM be set? > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (FLINK-26738) Default value of StateDescriptor is valid when enable state ttl config
[ https://issues.apache.org/jira/browse/FLINK-26738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17511603#comment-17511603 ] Yun Tang commented on FLINK-26738: -- [~lam167], the reason why the Flink community marked the default value in the state descriptor as deprecated is that the user cannot judge whether the returned default value is the real stored answer or just null. The same is true for TTL state with a default value: if we get the default value from TTL state, we cannot tell whether the key has expired, never existed, or the value really is the default itself, which makes the semantics unclear. From my point of view, disabling the default value for TTL state is reasonable, and maybe we need to make this clearer in the docs and Javadocs. WDYT? > Default value of StateDescriptor is valid when enable state ttl config > -- > > Key: FLINK-26738 > URL: https://issues.apache.org/jira/browse/FLINK-26738 > Project: Flink > Issue Type: Bug > Components: API / Core >Affects Versions: 1.15.0 >Reporter: Jianhui Dong >Priority: Critical > > Suppose we declare a ValueState like following: > {code:java} > ValueStateDescriptor<Tuple2<Long, Long>> descriptor = > new ValueStateDescriptor<>( > "average", // the state name > TypeInformation.of(new TypeHint<Tuple2<Long, Long>>() > {}), > Tuple2.of(0L, 0L)); > {code} > and then we add state ttl config to the state: > {code:java} > descriptor.enableTimeToLive(StateTtlConfigUtil.createTtlConfig(6)); > {code} > the default value Tuple2.of(0L, 0L) will be invalid and may cause NPE. > I don't know if this is a bug cause I see @Deprecated in the comment of the > ValueStateDescriptor constructor with argument defaultValue: > {code:java} > Use {@link #ValueStateDescriptor(String, TypeSerializer)} instead and manually > * manage the default value by checking whether the contents of the > state is {@code null}. 
> {code} > and if we decide not to use the defaultValue field in the class > StateDescriptor, should we add @Deprecated annotation to the field > defaultValue? -- This message was sent by Atlassian Jira (v8.20.1#820001)
[GitHub] [flink-table-store] LadyForest commented on a change in pull request #60: [FLINK-26834] Introduce BlockingIterator to help testing
LadyForest commented on a change in pull request #60: URL: https://github.com/apache/flink-table-store/pull/60#discussion_r833951087 ## File path: flink-table-store-core/src/test/java/org/apache/flink/table/store/file/utils/BlockingIterator.java ## @@ -0,0 +1,119 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.store.file.utils; + +import java.util.ArrayList; +import java.util.Collections; +import java.util.Iterator; +import java.util.List; +import java.util.concurrent.ExecutionException; +import java.util.concurrent.ExecutorService; +import java.util.concurrent.Executors; +import java.util.concurrent.Future; +import java.util.concurrent.TimeUnit; +import java.util.concurrent.TimeoutException; +import java.util.function.Function; + +/** Provides the ability to bring timeout to blocking iterators. */ Review comment: Nit: Provide -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
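For readers following the review above: the class under discussion wraps an iterator so that a slow or stuck source cannot block a test forever. A rough plain-Python sketch of the same idea — a worker thread drains the inner iterator while each `next()` call is bounded by a timeout (this is an illustration only, not the Java class in the PR):

```python
import queue
import threading
import time

# Sketch: pull elements from a possibly-blocking iterator on a worker
# thread, and bound each next() call on the consumer side with a timeout.

class BlockingIterator:
    _DONE = object()  # sentinel marking exhaustion of the inner iterator

    def __init__(self, inner, timeout_seconds=1.0):
        self._timeout = timeout_seconds
        self._queue = queue.Queue()
        worker = threading.Thread(target=self._drain, args=(inner,), daemon=True)
        worker.start()

    def _drain(self, inner):
        for element in inner:
            self._queue.put(element)
        self._queue.put(self._DONE)

    def __iter__(self):
        return self

    def __next__(self):
        try:
            element = self._queue.get(timeout=self._timeout)
        except queue.Empty:
            raise TimeoutError(f"no element within {self._timeout}s")
        if element is self._DONE:
            raise StopIteration
        return element

def slow_source():
    for i in range(3):
        time.sleep(0.01)  # simulate a source that takes time per element
        yield i

assert list(BlockingIterator(slow_source(), timeout_seconds=1.0)) == [0, 1, 2]
```

A test that would otherwise hang on a stuck source instead fails fast with `TimeoutError`, which is the behavior the reviewed utility's Javadoc describes.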
[GitHub] [flink] flinkbot edited a comment on pull request #19023: [FLINK-25705][docs]Translate "Metric Reporters" page of "Deployment" …
flinkbot edited a comment on pull request #19023: URL: https://github.com/apache/flink/pull/19023#issuecomment-1062931101 ## CI report: * d3ec7878c8779c901bec0a7497b247cbe354b96e UNKNOWN * 65c05c107320acecd49d3264212cd34d1b75beb2 UNKNOWN * e36f4139caaa7aed67336e59fe56634b3297aa0f Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=33673) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #19217: [FLINK-26789][tests] Fix broken RescaleCheckpointManuallyITCase
flinkbot edited a comment on pull request #19217: URL: https://github.com/apache/flink/pull/19217#issuecomment-1077144718 ## CI report: * 965321417c1b700dcc11e639807b2aeb175885bb Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=33676) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #19217: [FLINK-26789][tests] Fix broken RescaleCheckpointManuallyITCase
flinkbot commented on pull request #19217: URL: https://github.com/apache/flink/pull/19217#issuecomment-1077144718 ## CI report: * 965321417c1b700dcc11e639807b2aeb175885bb UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] fredia opened a new pull request #19217: [FLINK-26789][tests] Fix broken RescaleCheckpointManuallyITCase
fredia opened a new pull request #19217: URL: https://github.com/apache/flink/pull/19217 ## What is the purpose of the change *Fix `RescaleCheckpointManuallyITCase.testCheckpointRescalingInKeyedState` fail* ## Brief change log - *Restore `RescalingITCase` to the original* - *Make CollectionSink no longer shared by `RescaleCheckpointManuallyITCase` and `RescalingITCase`.* ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-12163) Use correct ClassLoader for Hadoop Writable TypeInfo
[ https://issues.apache.org/jira/browse/FLINK-12163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] morvenhuang updated FLINK-12163: Summary: Use correct ClassLoader for Hadoop Writable TypeInfo (was: User correct ClassLoader for Hadoop Writable TypeInfo) > Use correct ClassLoader for Hadoop Writable TypeInfo > > > Key: FLINK-12163 > URL: https://issues.apache.org/jira/browse/FLINK-12163 > Project: Flink > Issue Type: Bug > Components: Connectors / Hadoop Compatibility >Affects Versions: 1.7.2, 1.8.0 > Environment: Flink 1.5.6 standalone, Flink 1.7.2 standalone, > Hadoop 2.9.1 standalone >Reporter: morvenhuang >Assignee: arganzheng >Priority: Critical > Fix For: 1.9.0 > > > For Flink 1.5.6, 1.7.2, I keep getting error when using Hadoop Compatibility, > {code:java} > Caused by: java.lang.RuntimeException: Could not load the TypeInformation for > the class 'org.apache.hadoop.io.Writable'. You may be missing the > 'flink-hadoop-compatibility' dependency. > at > org.apache.flink.api.java.typeutils.TypeExtractor.createHadoopWritableTypeInfo(TypeExtractor.java:2140) > at > org.apache.flink.api.java.typeutils.TypeExtractor.privateGetForClass(TypeExtractor.java:1759) > at > org.apache.flink.api.java.typeutils.TypeExtractor.privateGetForClass(TypeExtractor.java:1701) > at > org.apache.flink.api.java.typeutils.TypeExtractor.createTypeInfoWithTypeHierarchy(TypeExtractor.java:956) > at > org.apache.flink.api.java.typeutils.TypeExtractor.createSubTypesInfo(TypeExtractor.java:1176) > at > org.apache.flink.api.java.typeutils.TypeExtractor.createTypeInfoWithTypeHierarchy(TypeExtractor.java:889) > at > org.apache.flink.api.java.typeutils.TypeExtractor.privateCreateTypeInfo(TypeExtractor.java:839) > at > org.apache.flink.api.java.typeutils.TypeExtractor.createTypeInfo(TypeExtractor.java:805) > at > org.apache.flink.api.java.typeutils.TypeExtractor.createTypeInfo(TypeExtractor.java:798) > at org.apache.flink.api.common.typeinfo.TypeHint.(TypeHint.java:50) > {code} 
> Packaging the flink-hadoop-compatibility dependency with my code into a fat > jar doesn't help. > The error won't go away until I copy the flink-hadoop-compatibility jar to > FLINK_HOME/lib. > This seems to be a classloader issue when looking into > TypeExtractor#createHadoopWritableTypeInfo: > {code:java} > Class typeInfoClass; > try { > typeInfoClass = Class.forName(HADOOP_WRITABLE_TYPEINFO_CLASS, false, > TypeExtractor.class.getClassLoader()); > } > catch (ClassNotFoundException e) { > throw new RuntimeException("Could not load the TypeInformation for the class > '" > + HADOOP_WRITABLE_CLASS + "'. You may be missing the > 'flink-hadoop-compatibility' dependency."); > } > {code} > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Comment Edited] (FLINK-26728) Support min operation in KeyedStream
[ https://issues.apache.org/jira/browse/FLINK-26728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17511587#comment-17511587 ] CaoYu edited comment on FLINK-26728 at 3/24/22, 4:28 AM: - Hi [~dianfu] Currently, I have preliminarily implemented some functions of the min operator. In the process I found that its code is very similar to the sum operator's, and I think the not-yet-implemented max, min_by and max_by will likewise be much the same as min and sum. Here is the min operator code for the preliminary implementation: {code:python} class MinReduceFunction(ReduceFunction): def __init__(self, position_to_min): self._pos = position_to_min self._reduce_func = None def reduce(self, value1, value2): def init_reduce_func(value_to_check): if isinstance(value_to_check, tuple): def reduce_func(v1, v2): v1_list = list(v1) v1_list[self._pos] = \ v2[self._pos] if v2[self._pos] < v1[self._pos] else v1[self._pos] return tuple(v1_list) self._reduce_func = reduce_func elif isinstance(value_to_check, (list, Row)): pass else: if self._pos != 0: raise TypeError( "The %s field selected on a basic type. A field expression on a " "basic type can only select the 0th field (which means selecting " "the entire basic type)." % self._pos) def reduce_func(v1, v2): return v2 if v2 < v1 else v1 self._reduce_func = reduce_func try: value2 < value1 except TypeError as err: raise TypeError("To get a minimum, a given field data must be comparable " "to each other. \n%s" % err) {code} As you can see, the implementation of the core logic, the reduce method, is very similar to sum. Importantly, the only difference between the min and max operators is replacing "<" with ">". So I wondered whether to abstract a top-level method as a basic method, with sum, min, max, min_by and max_by expressed as enumerations that select the implementation logic. 
It looks like it will be: {code:java} def _basic_min_max(self, pos, type, is_by): pass def min(self, position_to_min): self._basic_min_max(pos=position_to_min, type=min, is_by=False) def max(self, position_to_max): self._basic_min_max(pos=position_to_max, type=max, is_by=False) def min_by(self, position_to_min_by): self._basic_min_max(pos=position_to_min_by, type=min, is_by=True) def max_by(self, position_to_max_by): self._basic_min_max(pos=position_to_max_by, type=max, is_by=True) {code} What do you think, waiting for your suggestions. Thanks. was (Author: javacaoyu): Hi [~dianfu] Currently, I have preliminarily implemented some functions of the min operator. In the process I found that the code and the sum operator are very similar. And I think the unrealized max, minby, maxby will be also the same as min and sum. Here is the min operator code for the preliminary implementation: {code:java} class MinReduceFunction(ReduceFunction): def __init__(self, position_to_min): self._pos = position_to_min self._reduce_func = None def reduce(self, value1, value2): def init_reduce_func(value_to_check): if isinstance(value_to_check, tuple): def reduce_func(v1, v2): v1_list = list(v1) v1_list[self._pos] = \ v2[self._pos] if v2[self._pos] < v1[self._pos] else v1[self._pos] return tuple(v1_list) self._reduce_func = reduce_func elif isinstance(value_to_check, (list, Row)): pass else: if self._pos != 0: raise TypeError( "The %s field selected on a basic type. A field expression on a " "basic type can only select the 0th field (which means selecting " "the entire basic type)." % self._pos) def reduce_func(v1, v2): return v2 if v2 < v1 else v1 self._reduce_func = reduce_func try: value2 < value1 except TypeError as err: raise TypeError("To get a minimum, a given field data must be comparable " "to each other. \n%s" % err) {code} As you can see, the implementation of the core logic reduce method is very similar to sum. 
And importantly, the difference between the min operator and the max operator only is "<" replace to ">" So I wondered, whether to abstract a top-level method as a basic method.
[jira] [Commented] (FLINK-26728) Support min operation in KeyedStream
[ https://issues.apache.org/jira/browse/FLINK-26728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17511587#comment-17511587 ] CaoYu commented on FLINK-26728: --- Hi [~dianfu] Currently, I have preliminarily implemented some functions of the min operator. In the process I found that its code is very similar to the sum operator's, and I think the not-yet-implemented max, min_by and max_by will likewise be much the same as min and sum. Here is the min operator code for the preliminary implementation: {code:python} class MinReduceFunction(ReduceFunction): def __init__(self, position_to_min): self._pos = position_to_min self._reduce_func = None def reduce(self, value1, value2): def init_reduce_func(value_to_check): if isinstance(value_to_check, tuple): def reduce_func(v1, v2): v1_list = list(v1) v1_list[self._pos] = \ v2[self._pos] if v2[self._pos] < v1[self._pos] else v1[self._pos] return tuple(v1_list) self._reduce_func = reduce_func elif isinstance(value_to_check, (list, Row)): pass else: if self._pos != 0: raise TypeError( "The %s field selected on a basic type. A field expression on a " "basic type can only select the 0th field (which means selecting " "the entire basic type)." % self._pos) def reduce_func(v1, v2): return v2 if v2 < v1 else v1 self._reduce_func = reduce_func try: value2 < value1 except TypeError as err: raise TypeError("To get a minimum, a given field data must be comparable " "to each other. \n%s" % err) {code} As you can see, the implementation of the core logic, the reduce method, is very similar to sum. Importantly, the only difference between the min and max operators is replacing "<" with ">". So I wondered whether to abstract a top-level method as a basic method, with sum, min, max, min_by and max_by expressed as enumerations that select the implementation logic. 
It looks like: {code:python} def _basic_min_max(self, pos, type, is_by): pass def min(self, position_to_min): self._basic_min_max(pos=position_to_min, type=min, is_by=False) def max(self, position_to_max): self._basic_min_max(pos=position_to_max, type=max, is_by=False) def min_by(self, position_to_min_by): self._basic_min_max(pos=position_to_min_by, type=min, is_by=True) def max_by(self, position_to_max_by): self._basic_min_max(pos=position_to_max_by, type=max, is_by=True) {code} What do you think? I'm waiting for your suggestions. Thanks. > Support min operation in KeyedStream > > > Key: FLINK-26728 > URL: https://issues.apache.org/jira/browse/FLINK-26728 > Project: Flink > Issue Type: Improvement > Components: API / Python >Affects Versions: 1.14.3 >Reporter: CaoYu >Assignee: CaoYu >Priority: Major > > Support min operation in python-flink KeyedStream > -- This message was sent by Atlassian Jira (v8.20.1#820001)
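The shared-implementation idea proposed in this thread can be illustrated with a minimal, runnable example in plain Python (no PyFlink involved; `MinMaxReduceFunction` and its parameter names are invented for this sketch and are not existing PyFlink API):

```python
import operator
from functools import reduce

class MinMaxReduceFunction:
    """Hypothetical shared reduce for min/max/min_by/max_by over tuples.

    `cmp` is operator.lt for the min-style operations and operator.gt for
    the max-style ones, so all four operators share one implementation.
    """

    def __init__(self, pos, cmp, is_by):
        self._pos = pos      # index of the tuple field to compare
        self._cmp = cmp      # operator.lt (min) or operator.gt (max)
        self._is_by = is_by  # min_by/max_by keep the whole winning record

    def reduce(self, v1, v2):
        wins = self._cmp(v2[self._pos], v1[self._pos])
        if self._is_by:
            # min_by/max_by: return the record whose compared field wins
            return v2 if wins else v1
        # min/max: keep the first record but swap in the winning field value
        out = list(v1)
        if wins:
            out[self._pos] = v2[self._pos]
        return tuple(out)

# Folding a keyed group of records, as KeyedStream.reduce would:
records = [(1, 5), (2, 3), (3, 9)]
min_fn = MinMaxReduceFunction(pos=1, cmp=operator.lt, is_by=False)
min_by_fn = MinMaxReduceFunction(pos=1, cmp=operator.lt, is_by=True)
print(reduce(min_fn.reduce, records))     # (1, 3): first record, minimal field
print(reduce(min_by_fn.reduce, records))  # (2, 3): the whole winning record
```

With this shape, min, max, min_by and max_by differ only in the two constructor arguments, which is exactly the kind of abstraction the comment proposes.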
[GitHub] [flink] flinkbot edited a comment on pull request #18958: [FLINK-15854][hive] Use the new type inference for Hive UDTF
flinkbot edited a comment on pull request #18958: URL: https://github.com/apache/flink/pull/18958#issuecomment-1056725576 ## CI report: * 94f1b47ba27eed0de18a2867f5ac25cc48ed0925 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=33669) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Comment Edited] (FLINK-22766) Report metrics of KafkaConsumer in Kafka new source
[ https://issues.apache.org/jira/browse/FLINK-22766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17511578#comment-17511578 ] Adrian Zhong edited comment on FLINK-22766 at 3/24/22, 4:04 AM: I'd like to point out that KafkaPartitionSplitReader uses a KafkaConsumer API that was introduced in kafka-clients 2.4.1 ([KIP-520: Add overloaded Consumer#committed for batching partitions|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=128651203]): {code:java} org.apache.kafka.clients.consumer.KafkaConsumer#committed(java.util.Set) {code} If you are using flink-kafka-connector 1.12.3 or above, kafka-clients 2.4.1 or above is required; otherwise, an exception is thrown: {code:java} Caused by: java.lang.NoSuchMethodError: org.apache.kafka.clients.consumer.KafkaConsumer.committed(Ljava/util/Set;)Ljava/util/Map; at org.apache.flink.connector.kafka.source.reader.KafkaPartitionSplitReader.acquireAndSetStoppingOffsets(KafkaPartitionSplitReader.java:331) ~[flink-application-.jar:?] {code} was (Author: adrian z): I'd like to say: KafkaPartitionSplitReader is using KafkaClient API which was introduced since 2.4.1. {code:java} org.apache.kafka.clients.consumer.KafkaConsumer#committed(java.util.Set) {code} if you are using flink-kafka-connector-1.12.3 or above,Kafka-clients version 2.4.1 or above is required, otherwise, an exception is thrown. {code:java} Caused by: java.lang.NoSuchMethodError: org.apache.kafka.clients.consumer.KafkaConsumer.committed(Ljava/util/Set;)Ljava/util/Map; at org.apache.flink.connector.kafka.source.reader.KafkaPartitionSplitReader.acquireAndSetStoppingOffsets(KafkaPartitionSplitReader.java:331) ~[flink-application-.jar:?] 
{code} > Report metrics of KafkaConsumer in Kafka new source > --- > > Key: FLINK-22766 > URL: https://issues.apache.org/jira/browse/FLINK-22766 > Project: Flink > Issue Type: Improvement > Components: Connectors / Kafka >Affects Versions: 1.13.0 >Reporter: Qingsheng Ren >Assignee: Qingsheng Ren >Priority: Major > Labels: pull-request-available > Fix For: 1.14.0, 1.13.2 > > > Currently Kafka new source doesn't register metrics of KafkaConsumer in > KafkaPartitionSplitReader. These metrics should be added for debugging and > monitoring purpose. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[GitHub] [flink] fredia closed pull request #19215: [FLINK-26789][state] Fix RescaleCheckpointManuallyITCase fail
fredia closed pull request #19215: URL: https://github.com/apache/flink/pull/19215 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Closed] (FLINK-26805) Managed table breaks legacy connector without 'connector.type'
[ https://issues.apache.org/jira/browse/FLINK-26805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee closed FLINK-26805. Resolution: Fixed > Managed table breaks legacy connector without 'connector.type' > -- > > Key: FLINK-26805 > URL: https://issues.apache.org/jira/browse/FLINK-26805 > Project: Flink > Issue Type: Bug > Components: Table SQL / API >Reporter: Jingsong Lee >Assignee: Jingsong Lee >Priority: Major > Labels: pull-request-available > Fix For: 1.15.0 > > > {code:java} > CREATE TABLE T (a INT) WITH ('type'='legacy'); > INSERT INTO T VALUES (1); {code} > This case can be misinterpreted as a managed table, which the user might > expect to be resolved by the legacy table factory. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Comment Edited] (FLINK-26805) Managed table breaks legacy connector without 'connector.type'
[ https://issues.apache.org/jira/browse/FLINK-26805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17511017#comment-17511017 ] Jingsong Lee edited comment on FLINK-26805 at 3/24/22, 3:54 AM: master: 7d7a111eba368043f8624e114daa29400a74c096 release-1.15: 6e63e6c2ab074f070389a0eae181269cfbc82772 was (Author: lzljs3620320): release-1.15: 6e63e6c2ab074f070389a0eae181269cfbc82772 > Managed table breaks legacy connector without 'connector.type' > -- > > Key: FLINK-26805 > URL: https://issues.apache.org/jira/browse/FLINK-26805 > Project: Flink > Issue Type: Bug > Components: Table SQL / API >Reporter: Jingsong Lee >Assignee: Jingsong Lee >Priority: Major > Labels: pull-request-available > Fix For: 1.15.0 > > > {code:java} > CREATE TABLE T (a INT) WITH ('type'='legacy'); > INSERT INTO T VALUES (1); {code} > This case can be misinterpreted as a managed table, which the user might > expect to be resolved by the legacy table factory. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[GitHub] [flink] JingsongLi merged pull request #19209: [FLINK-26805][table] Managed table breaks legacy connector without 'connector.type'
JingsongLi merged pull request #19209: URL: https://github.com/apache/flink/pull/19209 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink-table-store] JingsongLi commented on a change in pull request #58: [FLINK-26669] Refactor ReadWriteTableITCase
JingsongLi commented on a change in pull request #58: URL: https://github.com/apache/flink-table-store/pull/58#discussion_r833892291 ## File path: flink-table-store-connector/src/test/java/org/apache/flink/table/store/connector/ReadWriteTableITCase.java ## @@ -19,73 +19,242 @@ package org.apache.flink.table.store.connector; import org.apache.flink.api.common.RuntimeExecutionMode; +import org.apache.flink.configuration.ExecutionOptions; import org.apache.flink.table.api.TableResult; -import org.apache.flink.table.planner.runtime.utils.TestData; +import org.apache.flink.table.catalog.ObjectIdentifier; import org.apache.flink.table.store.file.FileStoreOptions; import org.apache.flink.types.Row; +import org.apache.flink.types.RowKind; import org.apache.flink.util.CloseableIterator; +import org.apache.flink.util.function.TriFunction; +import org.apache.commons.lang3.tuple.Pair; +import org.assertj.core.api.AbstractThrowableAssert; import org.junit.Test; import org.junit.runner.RunWith; import org.junit.runners.Parameterized; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; -import javax.annotation.Nullable; - -import java.math.BigDecimal; import java.nio.file.Paths; import java.util.ArrayList; import java.util.Arrays; +import java.util.Collections; +import java.util.LinkedHashMap; import java.util.List; +import java.util.Map; import java.util.UUID; +import java.util.concurrent.Callable; +import java.util.concurrent.ExecutorService; +import java.util.concurrent.Executors; +import java.util.concurrent.Future; +import java.util.concurrent.TimeUnit; import java.util.stream.Collectors; import java.util.stream.Stream; -import scala.collection.JavaConverters; - import static org.apache.flink.table.planner.factories.TestValuesTableFactory.changelogRow; import static org.apache.flink.table.planner.factories.TestValuesTableFactory.registerData; import static org.assertj.core.api.Assertions.assertThat; import static org.assertj.core.api.Assertions.assertThatThrownBy; 
-/** IT cases for testing querying managed table dml. */ +/** IT cases for managed table dml. */ @RunWith(Parameterized.class) public class ReadWriteTableITCase extends TableStoreTestBase { -private final boolean hasPk; -@Nullable private final Boolean duplicate; +private static final Logger LOG = LoggerFactory.getLogger(ReadWriteTableITCase.class); + +private static final Map> PROCESSED_RECORDS = new LinkedHashMap<>(); + +private static final TriFunction>> +KEY_VALUE_ASSIGNER = +(record, hasPk, partitioned) -> { +boolean retract = +record.getKind() == RowKind.DELETE +|| record.getKind() == RowKind.UPDATE_BEFORE; +Row key; +Row value; +RowKind rowKind = record.getKind(); +if (hasPk) { +key = +partitioned +? Row.of(record.getField(0), record.getField(2)) +: Row.of(record.getField(0)); +value = record; +} else { +key = record; +value = Row.of(retract ? -1 : 1); +} +key.setKind(RowKind.INSERT); +value.setKind(RowKind.INSERT); +return Pair.of(key, Pair.of(rowKind, value)); +}; + +private static final TriFunction, Boolean, List, List> COMBINER = +(records, insertOnly, schema) -> { +boolean hasPk = schema.get(0); +boolean partitioned = schema.get(1); +records.forEach( +record -> { +Pair> kvPair = +KEY_VALUE_ASSIGNER.apply(record, hasPk, partitioned); +Row key = kvPair.getLeft(); +Pair valuePair = kvPair.getRight(); +if (insertOnly || !PROCESSED_RECORDS.containsKey(key)) { +update(hasPk, key, valuePair); +} else { +Pair existingValuePair = PROCESSED_RECORDS.get(key); +RowKind existingKind = existingValuePair.getLeft(); +Row existingValue = existingValuePair.getRight(); +RowKind newKind = valuePair.getLeft(); +Row newValue = valuePair.getRight(); + +if (hasPk) { +if (existingKind == newKind && existingKind == RowKind.INSERT) { +
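The key/value assignment logic in the diff above can be summarized: with a primary key, the state key is the pk (plus the partition field when partitioned) and rows are upserted; without one, whole rows are counted with +1/-1 for additions and retractions. A simplified, runnable sketch of that idea follows (plain tuples and string row-kinds stand in for the actual Row/RowKind types; all function names here are invented for the sketch, not the test utilities themselves):

```python
# Rows are plain tuples; kind is one of "+I", "-U", "+U", "-D".
RETRACT_KINDS = {"-D", "-U"}

def assign_key_value(kind, row, has_pk, partitioned):
    """Pick the state key and value for one changelog record."""
    if has_pk:
        # Key is the primary key (plus the partition field when partitioned);
        # the value is the whole row, i.e. upsert semantics.
        key = (row[0], row[2]) if partitioned else (row[0],)
        return key, row
    # Without a primary key the whole row is the key and we count
    # occurrences: +1 for additions, -1 for retractions.
    return row, -1 if kind in RETRACT_KINDS else 1

def combine(changelog, has_pk, partitioned):
    """Fold a changelog into final state, mirroring the combiner's idea."""
    state = {}
    for kind, row in changelog:
        key, value = assign_key_value(kind, row, has_pk, partitioned)
        if has_pk:
            if kind in RETRACT_KINDS:
                state.pop(key, None)   # retraction removes the current row
            else:
                state[key] = value     # upsert the latest row for this key
        else:
            count = state.get(key, 0) + value
            state[key] = count
            if count == 0:
                del state[key]         # fully retracted rows disappear
    return state

log = [("+I", (1, "a", "p1")), ("-U", (1, "a", "p1")), ("+U", (1, "b", "p1"))]
print(combine(log, has_pk=True, partitioned=False))   # {(1,): (1, 'b', 'p1')}
print(combine(log, has_pk=False, partitioned=False))  # {(1, 'b', 'p1'): 1}
```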
[jira] [Updated] (FLINK-26834) Introduce BlockingIterator to help testing
[ https://issues.apache.org/jira/browse/FLINK-26834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-26834: --- Labels: pull-request-available (was: ) > Introduce BlockingIterator to help testing > -- > > Key: FLINK-26834 > URL: https://issues.apache.org/jira/browse/FLINK-26834 > Project: Flink > Issue Type: Sub-task > Components: Table Store >Reporter: Jingsong Lee >Assignee: Jingsong Lee >Priority: Major > Labels: pull-request-available > Fix For: table-store-0.1.0 > > > BlockingIterator provides the ability to add timeouts to blocking iterators. > It uses a statically cached \{@link ExecutorService}. We don't limit the number of > threads since the work inside is I/O-bound. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-26834) Introduce BlockingIterator to help testing
Jingsong Lee created FLINK-26834: Summary: Introduce BlockingIterator to help testing Key: FLINK-26834 URL: https://issues.apache.org/jira/browse/FLINK-26834 Project: Flink Issue Type: Sub-task Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.1.0 BlockingIterator provides the ability to add timeouts to blocking iterators. It uses a statically cached \{@link ExecutorService}. We don't limit the number of threads since the work inside is I/O-bound. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[GitHub] [flink-table-store] JingsongLi opened a new pull request #60: [FLINK-26834] Introduce BlockingIterator to help testing
JingsongLi opened a new pull request #60: URL: https://github.com/apache/flink-table-store/pull/60 BlockingIterator provides the ability to add timeouts to blocking iterators. It uses a statically cached {@link ExecutorService}. We don't limit the number of threads since the work inside is I/O-bound. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
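The one-line description above captures the technique: wrap a blocking iterator and run each fetch on a shared thread pool so the caller can bound the wait. A Python sketch of the same idea follows (illustration only — the real BlockingIterator is a Java class in flink-table-store, and every name below is invented for this sketch):

```python
import concurrent.futures
import time

# Shared, uncapped pool: the wrapped work is I/O-bound, so we do not
# limit the number of threads (mirroring the PR's static cached executor).
_EXECUTOR = concurrent.futures.ThreadPoolExecutor()

class BlockingIterator:
    """Wraps a (possibly blocking) iterator and adds per-call timeouts."""

    def __init__(self, inner):
        self._inner = inner

    def collect(self, n, timeout_seconds):
        """Collect n elements, raising a timeout error if it takes too long."""
        future = _EXECUTOR.submit(lambda: [next(self._inner) for _ in range(n)])
        return future.result(timeout=timeout_seconds)

def slow_source():
    for i in range(5):
        time.sleep(0.01)  # stand-in for blocking I/O
        yield i

it = BlockingIterator(slow_source())
print(it.collect(3, timeout_seconds=5))  # [0, 1, 2]
```

If the wrapped iterator stalls, `future.result(timeout=...)` raises `concurrent.futures.TimeoutError` instead of blocking the test forever, which is the property the Java class exists to provide.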
[GitHub] [flink] flinkbot edited a comment on pull request #19216: hive dialect supports dividing by zero.
flinkbot edited a comment on pull request #19216: URL: https://github.com/apache/flink/pull/19216#issuecomment-1077035902 ## CI report: * d6b0ee4743643655b4326b9584928f7e05680d26 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=33674) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #19216: hive dialect supports dividing by zero.
flinkbot commented on pull request #19216: URL: https://github.com/apache/flink/pull/19216#issuecomment-1077035902 ## CI report: * d6b0ee4743643655b4326b9584928f7e05680d26 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #19179: [FLINK-26756][table-planner] Fix the deserialization error for match recognize
flinkbot edited a comment on pull request #19179: URL: https://github.com/apache/flink/pull/19179#issuecomment-1073567241 ## CI report: * 9e677e67f510daf93704bd46cd3d4a8672c498ec Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=33469) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Assigned] (FLINK-26832) Output more status info for JobObserver
[ https://issues.apache.org/jira/browse/FLINK-26832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Wang reassigned FLINK-26832: - Assignee: Biao Geng > Output more status info for JobObserver > --- > > Key: FLINK-26832 > URL: https://issues.apache.org/jira/browse/FLINK-26832 > Project: Flink > Issue Type: Sub-task >Reporter: Biao Geng >Assignee: Biao Geng >Priority: Minor > > For {{JobObserver#observeFlinkJobStatus()}}, we currently only log > {{logger.info("Job status successfully updated");}}. > This could be more informative if we output the actual job status here, to help > users check the status of the job from the flink operator's log instead of only > depending on the flink web ui. > The proposed change looks like: > {{logger.info("Job status successfully updated from {} to {}", currentState, > targetState);}}. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[GitHub] [flink] luoyuxia opened a new pull request #19216: hive dialect supports dividing by zero.
luoyuxia opened a new pull request #19216: URL: https://github.com/apache/flink/pull/19216 ## What is the purpose of the change *(For example: This pull request makes task deployment go through the blob server, rather than through RPC. That way we avoid re-transferring them on each deployment (during recovery).)* ## Brief change log *(for example:)* - *The TaskInfo is stored in the blob store on job creation time as a persistent artifact* - *Deployments RPC transmits only the blob storage reference* - *TaskManagers retrieve the TaskInfo from the blob cache* ## Verifying this change Please make sure both new and modified tests in this PR follows the conventions defined in our code quality guide: https://flink.apache.org/contributing/code-style-and-quality-common.html#testing *(Please pick either of the following options)* This change is a trivial rework / code cleanup without any test coverage. *(or)* This change is already covered by existing tests, such as *(please describe tests)*. *(or)* This change added tests and can be verified as follows: *(example:)* - *Added integration tests for end-to-end deployment with large payloads (100MB)* - *Extended integration test for recovery after master (JobManager) failure* - *Added test that validates that TaskInfo is transferred only once across recoveries* - *Manually verified the change by running a 4 node cluster with 2 JobManagers and 4 TaskManagers, a stateful streaming program, and killing one JobManager and two TaskManagers during the execution, verifying that recovery happens correctly.* ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / no) - The serializers: (yes / no / don't know) - The runtime per-record code paths (performance sensitive): (yes / no / don't know) - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know) - The S3 file system connector: (yes / no / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / no) - If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #19023: [FLINK-25705][docs]Translate "Metric Reporters" page of "Deployment" …
flinkbot edited a comment on pull request #19023: URL: https://github.com/apache/flink/pull/19023#issuecomment-1062931101 ## CI report: * d3ec7878c8779c901bec0a7497b247cbe354b96e UNKNOWN * 65c05c107320acecd49d3264212cd34d1b75beb2 UNKNOWN * b2ec4ede218204cc06141f201572faa6c25c3ce9 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=33645) * e36f4139caaa7aed67336e59fe56634b3297aa0f Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=33673) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #19023: [FLINK-25705][docs]Translate "Metric Reporters" page of "Deployment" …
flinkbot edited a comment on pull request #19023: URL: https://github.com/apache/flink/pull/19023#issuecomment-1062931101 ## CI report: * d3ec7878c8779c901bec0a7497b247cbe354b96e UNKNOWN * 65c05c107320acecd49d3264212cd34d1b75beb2 UNKNOWN * b2ec4ede218204cc06141f201572faa6c25c3ce9 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=33645) * e36f4139caaa7aed67336e59fe56634b3297aa0f UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #19207: [FLINK-26700][docs] Document restore mode in chinese
flinkbot edited a comment on pull request #19207: URL: https://github.com/apache/flink/pull/19207#issuecomment-1075886786 ## CI report: * 0724fedd6789ee0c8b95dfba9e85e689b9186c20 UNKNOWN * 2b16e626ba9f3efac8bd6b796353efaaf8b98de0 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=33663) * a3818995538b52aef26035d82256b58651ce3a57 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=33670) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] ChengkaiYang2022 commented on a change in pull request #19023: [FLINK-25705][docs]Translate "Metric Reporters" page of "Deployment" …
ChengkaiYang2022 commented on a change in pull request #19023: URL: https://github.com/apache/flink/pull/19023#discussion_r833876402 ## File path: docs/content.zh/docs/deployment/metric_reporters.md ## @@ -24,31 +24,33 @@ specific language governing permissions and limitations under the License. --> -# Metric Reporters + -Flink allows reporting metrics to external systems. -For more information about Flink's metric system go to the [metric system documentation]({{< ref "docs/ops/metrics" >}}). +# 指标发送器 +Flink 支持用户将 Flink 的各项运行时指标发送给外部系统。 +了解更多指标方面信息可查看 [metric system documentation]({{< ref "zh/docs/ops/metrics" >}})。 Review comment: Okay, thanks @RocMarshal, I will remove the prefix and figure out the Hugo things first. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #19207: [FLINK-26700][docs] Document restore mode in chinese
flinkbot edited a comment on pull request #19207: URL: https://github.com/apache/flink/pull/19207#issuecomment-1075886786 ## CI report: * 0724fedd6789ee0c8b95dfba9e85e689b9186c20 UNKNOWN * 2b16e626ba9f3efac8bd6b796353efaaf8b98de0 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=33663) * a3818995538b52aef26035d82256b58651ce3a57 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #19207: [FLINK-26700][docs] Document restore mode in chinese
flinkbot edited a comment on pull request #19207: URL: https://github.com/apache/flink/pull/19207#issuecomment-1075886786 ## CI report: * 0724fedd6789ee0c8b95dfba9e85e689b9186c20 UNKNOWN * 2b16e626ba9f3efac8bd6b796353efaaf8b98de0 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=33663) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] zoltar9264 commented on a change in pull request #19207: [FLINK-26700][docs] Document restore mode in chinese
zoltar9264 commented on a change in pull request #19207: URL: https://github.com/apache/flink/pull/19207#discussion_r833871458 ## File path: docs/content.zh/docs/ops/state/savepoints.md ## @@ -157,10 +157,54 @@ $ bin/flink run -s :savepointPath [:runArgs] 默认情况下,resume 操作将尝试将 Savepoint 的所有状态映射回你要还原的程序。 如果删除了运算符,则可以通过 `--allowNonRestoredState`(short:`-n`)选项跳过无法映射到新程序的状态: + Restore 模式 + +`Restore 模式` 决定了在 restore 之后谁拥有组成 Savepoint 或者 [externalized checkpoint]({{< ref "docs/ops/state/checkpoints" >}}/#resuming-from-a-retained-checkpoint)的文件的所有权。在这种语境下 Savepoint 和 externalized checkpoint 的行为相似。这里我们将它们都称为“快照”,除非另有明确说明。 + +如前所述,restore 模式决定了谁来接管我们从中恢复的快照文件的所有权。快照可被用户或者 Flink 自身拥有。如果快照归用户所有,Flink 不会删除其中的文件,而且 Flink 不能依赖该快照中文件的存在,因为它可能在 Flink 的控制之外被删除。 + +每种 restore 模式都有特定的用途。尽管如此,我们仍然认为默认的 *NO_CLAIM* 模式在大多数情况下是一个很好的折中方案,因为它在提供明确的所有权归属的同时只给恢复后第一个 checkpoint 带来较小的代价。 + +你可以通过如下方式指定 restore 模式: ```shell -$ bin/flink run -s :savepointPath -n [:runArgs] +$ bin/flink run -s :savepointPath -restoreMode :mode -n [:runArgs] ``` +**NO_CLAIM (默认的)** + +在 *NO_CLAIM* 模式下,Flink 不会接管快照的所有权。它会将快照的文件置于用户的控制之中,并且永远不会删除其中的任何文件。该模式下可以从同一个快照上启动多个作业。 + +为保证 Flink 不会依赖于该快照的任何文件,它会强制第一个(成功的) checkpoint 为全量 checkpoint 而不是增量的。这仅对`state.backend: rocksdb` 有影响,因为其他 backend 总是制作全量 checkpoint。 + +一旦第一个全量的 checkpoint 完成后,所有后续的 checkpoint 会照常制作(按照配置)。所以,一旦一个 checkpoint 成功制作,就可以删除原快照。在此之前不能删除原快照,因为没有任何完成的 checkpoint,Flink 会在故障时尝试从初始的快照恢复。 + + + {{< img src="/fig/restore-mode-no_claim.svg" alt="NO_CLAIM restore mode" width="70%" >}} + + +**CLAIM** + +另一个可选的模式是 *CLAIM* 模式。该模式下 Flink 将声称拥有快照的所有权,并且本质上将其作为 checkpoint 对待:控制其生命周期并且可能会在其永远不会被用于恢复的时候删除它。因此,手动删除快照和从同一个快照上启动两个作业都是不安全的。Flink 会保持[配置数量]({{< ref "docs/dev/datastream/fault-tolerance/checkpointing" >}}/#state-checkpoints-num-retained)的 checkpoint。 + + + {{< img src="/fig/restore-mode-claim.svg" alt="CLAIM restore mode" width="70%" >}} + + +{{< hint info >}} +**注意:** +1. 
Retained checkpoints 被存储在 `//chk_` 这样的目录中。Flink 不会接管 `/` 目录的所有权,而只会接管 `chk_` 的所有权。Flink 不会删除旧作业的目录。 Review comment: Got it, and done. thanks. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-26833) missing link to setup-pyflink-virtual-env.sh / error during deploy
[ https://issues.apache.org/jira/browse/FLINK-26833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kafka Chris updated FLINK-26833: Description: When I navigate to [https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/python/faq/#preparing-python-virtual-environment] there is not a functioning link to the `setup-pyflink-virtual-env.sh` script. When I try to deploy to a remote cluster with a virtualenv that is created by pycharm I see the following error: {code:java} (venv) chris@chrisvb~/PycharmProjects/mythril_pyflink$ /home/chris/PycharmProjects/mythril_pyflink/venv/lib/python3.8/site-packages/pyflink/bin/flink run --jobmanager flinkmaster.myxyzdomain.com:8081 --python timescale_profit_calc_stream.py WARNING: An illegal reflective access operation has occurred WARNING: Illegal reflective access by org.apache.flink.api.java.ClosureCleaner (file:/home/chris/PycharmProjects/mythril_pyflink/venv/lib/python3.8/site-packages/pyflink/lib/flink-dist_2.11-1.14.3.jar) to field java.util.Properties.serialVersionUID WARNING: Please consider reporting this to the maintainers of org.apache.flink.api.java.ClosureCleaner WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations WARNING: All illegal access operations will be denied in a future release Job has been submitted with JobID fea43e68c13d0cc3183a6ad6d6157748 Traceback (most recent call last): File "timescale_profit_calc_stream.py", line 169, in execute_timescale_profit_calc_stream() File "timescale_profit_calc_stream.py", line 157, in execute_timescale_profit_calc_stream env.execute() File "/home/chris/PycharmProjects/mythril_pyflink/venv/lib/python3.8/site-packages/pyflink/datastream/stream_execution_environment.py", line 691, in execute return JobExecutionResult(self._j_stream_execution_environment.execute(j_stream_graph)) File "/home/chris/PycharmProjects/mythril_pyflink/venv/lib/python3.8/site-packages/py4j/java_gateway.py", line 1285, in 
_call_ return_value = get_return_value( File "/home/chris/PycharmProjects/mythril_pyflink/venv/lib/python3.8/site-packages/pyflink/util/exceptions.py", line 146, in deco return f(*a, **kw) File "/home/chris/PycharmProjects/mythril_pyflink/venv/lib/python3.8/site-packages/py4j/protocol.py", line 326, in get_return_value raise Py4JJavaError( py4j.protocol.Py4JJavaError: An error occurred while calling o10.execute. : java.util.concurrent.ExecutionException: org.apache.flink.client.program.ProgramInvocationException: Job failed (JobID: fea43e68c13d0cc3183a6ad6d6157748) at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395) at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999) at org.apache.flink.client.program.StreamContextEnvironment.getJobExecutionResult(StreamContextEnvironment.java:123) at org.apache.flink.client.program.StreamContextEnvironment.execute(StreamContextEnvironment.java:80) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at org.apache.flink.api.python.shaded.py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244) at org.apache.flink.api.python.shaded.py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) at org.apache.flink.api.python.shaded.py4j.Gateway.invoke(Gateway.java:282) at org.apache.flink.api.python.shaded.py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132) at org.apache.flink.api.python.shaded.py4j.commands.CallCommand.execute(CallCommand.java:79) at org.apache.flink.api.python.shaded.py4j.GatewayConnection.run(GatewayConnection.java:238) at java.base/java.lang.Thread.run(Thread.java:829) Caused by: 
org.apache.flink.client.program.ProgramInvocationException: Job failed (JobID: fea43e68c13d0cc3183a6ad6d6157748) at org.apache.flink.client.deployment.ClusterClientJobClientAdapter.lambda$null$6(ClusterClientJobClientAdapter.java:125) at java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:642) at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073) at org.apache.flink.util.concurrent.FutureUtils.lambda$retryOperationWithDelay$9(FutureUtils.java:403) at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859) at java.base/java.
[jira] [Created] (FLINK-26833) missing link to setup-pyflink-virtual-env.sh / error during deploy
Kafka Chris created FLINK-26833: --- Summary: missing link to setup-pyflink-virtual-env.sh / error during deploy Key: FLINK-26833 URL: https://issues.apache.org/jira/browse/FLINK-26833 Project: Flink Issue Type: Bug Components: API / Python, Documentation Affects Versions: 1.14.4 Reporter: Kafka Chris When I navigate to [https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/python/faq/#preparing-python-virtual-environment] there is not a functioning link to the `setup-pyflink-virtual-env.sh` script. When I try to deploy to a remote cluster with a virtualenv that is created by pycharm I see the following error: ``` (venv) chris@chrisvb~/PycharmProjects/mythril_pyflink$ /home/chris/PycharmProjects/mythril_pyflink/venv/lib/python3.8/site-packages/pyflink/bin/flink run --jobmanager flinkmaster.myxyzdomain.com:8081 --python timescale_profit_calc_stream.py WARNING: An illegal reflective access operation has occurred WARNING: Illegal reflective access by org.apache.flink.api.java.ClosureCleaner (file:/home/chris/PycharmProjects/mythril_pyflink/venv/lib/python3.8/site-packages/pyflink/lib/flink-dist_2.11-1.14.3.jar) to field java.util.Properties.serialVersionUID WARNING: Please consider reporting this to the maintainers of org.apache.flink.api.java.ClosureCleaner WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations WARNING: All illegal access operations will be denied in a future release Job has been submitted with JobID fea43e68c13d0cc3183a6ad6d6157748 Traceback (most recent call last): File "timescale_profit_calc_stream.py", line 169, in execute_timescale_profit_calc_stream() File "timescale_profit_calc_stream.py", line 157, in execute_timescale_profit_calc_stream env.execute() File "/home/chris/PycharmProjects/mythril_pyflink/venv/lib/python3.8/site-packages/pyflink/datastream/stream_execution_environment.py", line 691, in execute return 
JobExecutionResult(self._j_stream_execution_environment.execute(j_stream_graph)) File "/home/chris/PycharmProjects/mythril_pyflink/venv/lib/python3.8/site-packages/py4j/java_gateway.py", line 1285, in __call__ return_value = get_return_value( File "/home/chris/PycharmProjects/mythril_pyflink/venv/lib/python3.8/site-packages/pyflink/util/exceptions.py", line 146, in deco return f(*a, **kw) File "/home/chris/PycharmProjects/mythril_pyflink/venv/lib/python3.8/site-packages/py4j/protocol.py", line 326, in get_return_value raise Py4JJavaError( py4j.protocol.Py4JJavaError: An error occurred while calling o10.execute. : java.util.concurrent.ExecutionException: org.apache.flink.client.program.ProgramInvocationException: Job failed (JobID: fea43e68c13d0cc3183a6ad6d6157748) at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395) at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999) at org.apache.flink.client.program.StreamContextEnvironment.getJobExecutionResult(StreamContextEnvironment.java:123) at org.apache.flink.client.program.StreamContextEnvironment.execute(StreamContextEnvironment.java:80) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at org.apache.flink.api.python.shaded.py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244) at org.apache.flink.api.python.shaded.py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) at org.apache.flink.api.python.shaded.py4j.Gateway.invoke(Gateway.java:282) at org.apache.flink.api.python.shaded.py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132) at 
org.apache.flink.api.python.shaded.py4j.commands.CallCommand.execute(CallCommand.java:79) at org.apache.flink.api.python.shaded.py4j.GatewayConnection.run(GatewayConnection.java:238) at java.base/java.lang.Thread.run(Thread.java:829) Caused by: org.apache.flink.client.program.ProgramInvocationException: Job failed (JobID: fea43e68c13d0cc3183a6ad6d6157748) at org.apache.flink.client.deployment.ClusterClientJobClientAdapter.lambda$null$6(ClusterClientJobClientAdapter.java:125) at java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:642) at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073) at org.apache.flink.util
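Independent of the broken documentation link, the virtual environment the FAQ intends can be prepared by hand. A minimal sketch follows; the pinned `apache-flink` version and the Python 3.8 path are assumptions taken from the report (which mixes a 1.14.3 dist jar with a 1.14.4 docs page — the versions should match), not from the missing `setup-pyflink-virtual-env.sh` script itself:

```shell
# Create and activate a fresh virtual environment (assumes Python 3.8+).
python3 -m venv venv
. venv/bin/activate

# Install PyFlink pinned to the cluster's Flink version (assumption: 1.14.4).
pip install "apache-flink==1.14.4"

# Submit via the interpreter inside the venv so client and cluster use
# matching Flink jars (hostname and job file are placeholders):
# venv/lib/python3.8/site-packages/pyflink/bin/flink run \
#     --jobmanager <jobmanager-host>:8081 --python job.py
```

A PyCharm-created venv works the same way in principle; the usual failure mode is a version mismatch between the `apache-flink` package in the venv and the Flink running on the remote cluster.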
[jira] [Comment Edited] (FLINK-26799) StateChangeFormat#read not seek to offset correctly
[ https://issues.apache.org/jira/browse/FLINK-26799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17511558#comment-17511558 ] Feifan Wang edited comment on FLINK-26799 at 3/24/22, 2:32 AM: --- [~roman] , here the problem is that *`underlyingStream.getPos() == offset` does not mean the wrapper stream is at the correct position.* was (Author: feifan wang): [~roman] , here the problem is that *`underlyingStream.getPos() == offset` does not mean the wrapper stream is at the correct position.* > StateChangeFormat#read not seek to offset correctly > --- > > Key: FLINK-26799 > URL: https://issues.apache.org/jira/browse/FLINK-26799 > Project: Flink > Issue Type: Bug > Components: Runtime / State Backends >Reporter: Feifan Wang >Priority: Major > > StateChangeFormat#read must seek to the offset before reading; the current implementation is as > follows: > > {code:java} > FSDataInputStream stream = handle.openInputStream(); > DataInputViewStreamWrapper input = wrap(stream); > if (stream.getPos() != offset) { > LOG.debug("seek from {} to {}", stream.getPos(), offset); > input.skipBytesToRead((int) offset); > }{code} > But the if condition is incorrect: stream.getPos() returns the position of the > underlying stream, which is different from the position of input. > By the way, because it is wrapped by a BufferedInputStream, the position of the underlying > stream is always at n*bufferSize or at the end of the file. > Actually, input is always at position 0 at the beginning, so I think we can seek > to the offset directly. > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (FLINK-26799) StateChangeFormat#read not seek to offset correctly
[ https://issues.apache.org/jira/browse/FLINK-26799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17511558#comment-17511558 ] Feifan Wang commented on FLINK-26799: - [~roman] , here the problem is that *`underlyingStream.getPos() == offset` does not mean the wrapper stream is at the correct position.* > StateChangeFormat#read not seek to offset correctly > --- > > Key: FLINK-26799 > URL: https://issues.apache.org/jira/browse/FLINK-26799 > Project: Flink > Issue Type: Bug > Components: Runtime / State Backends >Reporter: Feifan Wang >Priority: Major > > StateChangeFormat#read must seek to the offset before reading; the current implementation is as > follows: > > {code:java} > FSDataInputStream stream = handle.openInputStream(); > DataInputViewStreamWrapper input = wrap(stream); > if (stream.getPos() != offset) { > LOG.debug("seek from {} to {}", stream.getPos(), offset); > input.skipBytesToRead((int) offset); > }{code} > But the if condition is incorrect: stream.getPos() returns the position of the > underlying stream, which is different from the position of input. > By the way, because it is wrapped by a BufferedInputStream, the position of the underlying > stream is always at n*bufferSize or at the end of the file. > Actually, input is always at position 0 at the beginning, so I think we can seek > to the offset directly. > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[GitHub] [flink] flinkbot edited a comment on pull request #19179: [FLINK-26756][table-planner] Fix the deserialization error for match recognize
flinkbot edited a comment on pull request #19179: URL: https://github.com/apache/flink/pull/19179#issuecomment-1073567241 ## CI report: * 9e677e67f510daf93704bd46cd3d4a8672c498ec Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=33469) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] PatrickRen commented on pull request #18863: [FLINK-26033][flink-connector-kafka]Fix the problem that robin does not take effect due to upgrading kafka client to 2.4.1 since Flink1.11
PatrickRen commented on pull request #18863: URL: https://github.com/apache/flink/pull/18863#issuecomment-1077001529 Thanks for the update. LGTM. @fapaul Could you help to take a look and merge this PR? Thanks -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-26799) StateChangeFormat#read not seek to offset correctly
[ https://issues.apache.org/jira/browse/FLINK-26799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17511550#comment-17511550 ] Feifan Wang commented on FLINK-26799: - Hi [~roman] , I don't mean seeking on the underlying stream before wrapping. We should always seek on the wrapper stream as long as the offset is not equal to 0. I think the code should look like the following: {code:java} if (offset != 0) { LOG.debug("seek from {} to {}", stream.getPos(), offset); input.skipBytesToRead((int) offset); } {code} > StateChangeFormat#read not seek to offset correctly > --- > > Key: FLINK-26799 > URL: https://issues.apache.org/jira/browse/FLINK-26799 > Project: Flink > Issue Type: Bug > Components: Runtime / State Backends >Reporter: Feifan Wang >Priority: Major > > StateChangeFormat#read must seek to the offset before reading; the current implementation is as > follows: > > {code:java} > FSDataInputStream stream = handle.openInputStream(); > DataInputViewStreamWrapper input = wrap(stream); > if (stream.getPos() != offset) { > LOG.debug("seek from {} to {}", stream.getPos(), offset); > input.skipBytesToRead((int) offset); > }{code} > But the if condition is incorrect: stream.getPos() returns the position of the > underlying stream, which is different from the position of input. > By the way, because it is wrapped by a BufferedInputStream, the position of the underlying > stream is always at n*bufferSize or at the end of the file. > Actually, input is always at position 0 at the beginning, so I think we can seek > to the offset directly. > -- This message was sent by Atlassian Jira (v8.20.1#820001)
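The buffering effect described in this report can be reproduced outside Flink. Below is a minimal, self-contained sketch using plain JDK streams; `PosTrackingStream` is a hypothetical stand-in for a position-aware `FSDataInputStream`, not Flink code. After a single one-byte read through the wrapper, the underlying stream has already advanced by a full buffer, so comparing the underlying stream's position against an offset says nothing about the wrapper's logical position.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-in for a position-aware stream (like FSDataInputStream):
// it counts how many bytes have actually been pulled from it.
class PosTrackingStream extends FilterInputStream {
    long pos = 0;

    PosTrackingStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b >= 0) {
            pos++;
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            pos += n;
        }
        return n;
    }
}

public class BufferedPosDemo {
    public static void main(String[] args) throws IOException {
        PosTrackingStream underlying =
                new PosTrackingStream(new ByteArrayInputStream(new byte[100]));
        // Wrapping in a BufferedInputStream decouples the two positions:
        BufferedInputStream wrapper = new BufferedInputStream(underlying, 64);
        wrapper.read(); // the wrapper's logical position is now 1 ...
        // ... but the wrapper eagerly filled its 64-byte buffer, so the
        // underlying stream has already advanced by a whole buffer. A check
        // like `underlying.pos == offset` therefore cannot tell whether the
        // wrapper is at the right position.
        System.out.println("wrapper read 1 byte, underlying pos = " + underlying.pos); // 64
    }
}
```

This is exactly why the original `if (stream.getPos() != offset)` guard can be wrongly skipped: the underlying position may happen to equal the offset while the wrapper is still at logical position 0.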
[GitHub] [flink] godfreyhe commented on pull request #19179: [FLINK-26756][table-planner] Fix the deserialization error for match recognize
godfreyhe commented on pull request #19179: URL: https://github.com/apache/flink/pull/19179#issuecomment-1076999830 @flinkbot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink-ml] yunfengzhou-hub commented on pull request #70: [FLINK-26313] Add Transformer and Estimator of OnlineKMeans
yunfengzhou-hub commented on pull request #70: URL: https://github.com/apache/flink-ml/pull/70#issuecomment-1076988591 Thanks for the comments. I have updated the PR according to the comments. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #18958: [FLINK-15854][hive] Use the new type inference for Hive UDTF
flinkbot edited a comment on pull request #18958: URL: https://github.com/apache/flink/pull/18958#issuecomment-1056725576 ## CI report: * d34eb21ae0bda39ec119ef18a9782fbaad2310bc Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32660) * 94f1b47ba27eed0de18a2867f5ac25cc48ed0925 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=33669) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-26827) Error when integrating FlinkSQL with Hive
[ https://issues.apache.org/jira/browse/FLINK-26827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17511545#comment-17511545 ] zhushifeng commented on FLINK-26827: OK, no problem, I have done it. Can you help me solve the problem? > Error when integrating FlinkSQL with Hive > - > > Key: FLINK-26827 > URL: https://issues.apache.org/jira/browse/FLINK-26827 > Project: Flink > Issue Type: Bug > Components: Table SQL / API >Affects Versions: 1.13.3 > Environment: CDH 6.2.1, Linux, JDK 1.8 >Reporter: zhushifeng >Priority: Major > Attachments: image-2022-03-24-09-33-31-786.png > > > Topic : FlinkSQL combined with Hive > > *step1:* > environment: > HIVE2.1 > Flink1.13.3 > FlinkCDC2.1 > CDH6.2.1 > > *step2:* > when I do the following I come across some problems. For example, > copy the following jars to /flink-1.13.3/lib/ > // Flink's Hive connector > flink-connector-hive_2.11-1.13.3.jar > // Hive dependencies > hive-exec-2.1.0.jar. == hive-exec-2.1.1-cdh6.2.1.jar > // add antlr-runtime if you need to use hive dialect > antlr-runtime-3.5.2.jar > !image-2022-03-24-09-33-31-786.png! > > *step3:* restart the Flink Cluster > # ./start-cluster.sh > # Starting cluster. > # Starting standalonesession daemon on host xuehai-cm. > # Starting taskexecutor daemon on host xuehai-cm. > # Starting taskexecutor daemon on host xuehai-nn. > # Starting taskexecutor daemon on host xuehai-dn. > > *step4:* > CREATE CATALOG myhive WITH ( > 'type' = 'hive', > 'default-database' = 'default', > 'hive-conf-dir' = '/etc/hive/conf' > ); > -- set the HiveCatalog as the current catalog of the session > USE CATALOG myhive; > > *step5:* use the hive > Flink SQL> select * from rptdata.basic_xhsys_user ; > Exception in thread "main" org.apache.flink.table.client.SqlClientException: > Unexpected exception. This is a bug. Please consider filing an issue. 
> at > org.apache.flink.table.client.SqlClient.startClient(SqlClient.java:201) > at org.apache.flink.table.client.SqlClient.main(SqlClient.java:161) > Caused by: java.lang.ExceptionInInitializerError > at java.lang.Class.forName0(Native Method) > at java.lang.Class.forName(Class.java:348) > at > org.apache.flink.connectors.hive.HiveSourceFileEnumerator.createMRSplits(HiveSourceFileEnumerator.java:94) > at > org.apache.flink.connectors.hive.HiveSourceFileEnumerator.createInputSplits(HiveSourceFileEnumerator.java:71) > at > org.apache.flink.connectors.hive.HiveTableSource.lambda$getDataStream$1(HiveTableSource.java:212) > at > org.apache.flink.connectors.hive.HiveParallelismInference.logRunningTime(HiveParallelismInference.java:107) > at > org.apache.flink.connectors.hive.HiveParallelismInference.infer(HiveParallelismInference.java:95) > at > org.apache.flink.connectors.hive.HiveTableSource.getDataStream(HiveTableSource.java:207) > at > org.apache.flink.connectors.hive.HiveTableSource$1.produceDataStream(HiveTableSource.java:123) > at > org.apache.flink.table.planner.plan.nodes.exec.common.CommonExecTableSourceScan.translateToPlanInternal(CommonExecTableSourceScan.java:96) > at > org.apache.flink.table.planner.plan.nodes.exec.ExecNodeBase.translateToPlan(ExecNodeBase.java:134) > at > org.apache.flink.table.planner.plan.nodes.exec.ExecEdge.translateToPlan(ExecEdge.java:247) > at > org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecSink.translateToPlanInternal(StreamExecSink.java:114) > at > org.apache.flink.table.planner.plan.nodes.exec.ExecNodeBase.translateToPlan(ExecNodeBase.java:134) > at > org.apache.flink.table.planner.delegation.StreamPlanner.$anonfun$translateToPlan$1(StreamPlanner.scala:70) > at > scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:233) > at scala.collection.Iterator.foreach(Iterator.scala:937) > at scala.collection.Iterator.foreach$(Iterator.scala:937) > at 
scala.collection.AbstractIterator.foreach(Iterator.scala:1425) > at scala.collection.IterableLike.foreach(IterableLike.scala:70) > at scala.collection.IterableLike.foreach$(IterableLike.scala:69) > at scala.collection.AbstractIterable.foreach(Iterable.scala:54) > at scala.collection.TraversableLike.map(TraversableLike.scala:233) > at scala.collection.TraversableLike.map$(TraversableLike.scala:226) > at scala.collection.AbstractTraversable.map(Traversable.scala:104) > at > org.apache.flink.table.planner.delegation.StreamPlanner.translateToPlan(StreamPlanner.scala:69) > at > org.apache.flink.table.planner.delegation.PlannerBase.translate(PlannerBase.scala:165) > at > org.apache
[GitHub] [flink] chenzihao5 commented on pull request #19101: [FLINK-26634][docs-zh] Update Chinese version of Elasticsearch connector docs
chenzihao5 commented on pull request #19101: URL: https://github.com/apache/flink/pull/19101#issuecomment-1076986910 Hi, @gaoyunhaii . Can you help to review this? Thanks a lot.
[GitHub] [flink] luoyuxia commented on pull request #19012: [FLINK-26540][hive] Support handle join involving complex types in on…
luoyuxia commented on pull request #19012: URL: https://github.com/apache/flink/pull/19012#issuecomment-1076986032 @beyond1920 Thanks for your reminder. I'm glad to improve. But could you please explain a bit more about what kind of improvement you mean, so that I can pay attention to such things next time?
[GitHub] [flink] flinkbot edited a comment on pull request #18958: [FLINK-15854][hive] Use the new type inference for Hive UDTF
flinkbot edited a comment on pull request #18958: URL: https://github.com/apache/flink/pull/18958#issuecomment-1056725576 ## CI report: * d34eb21ae0bda39ec119ef18a9782fbaad2310bc Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32660) * 94f1b47ba27eed0de18a2867f5ac25cc48ed0925 UNKNOWN
[jira] [Commented] (FLINK-26718) Limitations of flink+hive dimension table
[ https://issues.apache.org/jira/browse/FLINK-26718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17511543#comment-17511543 ] kunghsu commented on FLINK-26718: - [~luoyuxia] Considering the amount of data is 25 million, when using dimension tables, can I let flink load this data to an external storage such as RocksDB? Will this avoid OOM problems? > Limitations of flink+hive dimension table > - > > Key: FLINK-26718 > URL: https://issues.apache.org/jira/browse/FLINK-26718 > Project: Flink > Issue Type: Bug > Components: Connectors / Hive >Affects Versions: 1.12.7 >Reporter: kunghsu >Priority: Major > Labels: HIVE > > Limitations of flink+hive dimension table > The scenario I am involved in is a join relationship between the Kafka input > table and the Hive dimension table. The hive dimension table is some user > data, and the data is very large. > When the data volume of the hive table is small, about a few hundred rows, > everything is normal, the partition is automatically recognized and the > entire task is executed normally. > When the hive table reached about 1.3 million rows, the TaskManager began to fail > to work properly. It was very difficult to even look at the log. I guess it > burst the JVM memory when it tried to load the entire table into memory. You > can see that a heartbeat timeout exception occurs in the TaskManager, such as > Heartbeat TimeoutException. I even increased the parallelism to no avail. > Official website documentation: > [https://nightlies.apache.org/flink/flink-docs-release-1.12/dev/table/connectors/hive/hive_read_write.html#source-parallelism-inference] > So I have a question, does flink+hive not support joining against large tables > so far? > Is this solution unusable when the amount of data is too large? > > > > As a rough estimate, how much memory will 25 million rows take up? > Suppose a row of data is 1K; 25 million K is 25000M, or 25G. 
> If the memory of the TM is set to 32G, can the problem be solved? > It doesn't seem to work either, because only roughly > 16G of it can be allocated to the JVM. > Assuming that the official solution can support such a large amount, how > should the memory of the TM be set? > -- This message was sent by Atlassian Jira (v8.20.1#820001)
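The back-of-envelope sizing above can be checked mechanically. A minimal sketch, assuming ~1 KB per row (decimal units) and that roughly half of the TaskManager process memory is usable as JVM heap — both figures taken from the discussion, not measured values:

```java
public class HiveDimTableSizing {

    // Raw data volume in GB (decimal units, matching the estimate in the thread).
    static double dataGb(long rows, long bytesPerRow) {
        return rows * bytesPerRow / 1e9;
    }

    public static void main(String[] args) {
        double tableGb = dataGb(25_000_000L, 1_000L); // ~25 GB of dimension data
        double usableHeapGb = 32.0 * 0.5;             // ~16 GB heap out of a 32 GB TM (assumption)
        System.out.printf("table ~%.0f GB, usable heap ~%.0f GB%n", tableGb, usableHeapGb);
        // The table alone is larger than the usable heap, so loading it fully
        // into memory cannot work regardless of parallelism.
        System.out.println(tableGb > usableHeapGb); // prints "true"
    }
}
```

This is why an external store (e.g. RocksDB, as asked above) or a lookup-style join is the usual way out once the table no longer fits in heap.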
[GitHub] [flink] luoyuxia edited a comment on pull request #18958: [FLINK-15854][hive] Use the new type inference for Hive UDTF
luoyuxia edited a comment on pull request #18958: URL: https://github.com/apache/flink/pull/18958#issuecomment-1076984299 @twalthr Thanks for your reminder. Rebased now.
[GitHub] [flink] luoyuxia commented on pull request #18958: [FLINK-15854][hive] Use the new type inference for Hive UDTF
luoyuxia commented on pull request #18958: URL: https://github.com/apache/flink/pull/18958#issuecomment-1076984299 @twalthr Thanks for your reminder. I rebased now.
[jira] [Commented] (FLINK-26829) ClassCastException will be thrown when the second operand of divide is a function call
[ https://issues.apache.org/jira/browse/FLINK-26829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17511542#comment-17511542 ] xuyang commented on FLINK-26829: I'll try to fix it. > ClassCastException will be thrown when the second operand of divide is a > function call > --- > > Key: FLINK-26829 > URL: https://issues.apache.org/jira/browse/FLINK-26829 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.15.0 >Reporter: luoyuxia >Priority: Major > Fix For: 1.16.0 > > > Can be reproduced by adding the following code in > SqlExpressionTest#testDivideFunctions > {code:java} > testExpectedSqlException( > "1/POWER(5, 5)", divisorZeroException, classOf[ArithmeticException]) {code} > Then the method ExpressionReducer#skipAndValidateExprs will throw the > exception: > {code:java} > java.lang.ClassCastException: org.apache.calcite.rex.RexCall cannot be cast > to org.apache.calcite.rex.RexLiteral {code} > The following code will cast the DEVIDE's second op to RexLiteral, but it > may be a function call. > {code:java} > // according to BuiltInFunctionDefinitions, the DEVIDE's second op must be > numeric > assert(RexUtil.isDeterministic(divisionLiteral)) > val divisionComparable = { > > divisionLiteral.asInstanceOf[RexLiteral].getValue.asInstanceOf[Comparable[Any]] > } {code}
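The root cause described above is an unconditional `asInstanceOf[RexLiteral]` on an expression that may actually be a function call (`RexCall`). A minimal, self-contained sketch of the guard pattern; the `RexNode`/`RexLiteral`/`RexCall` classes below are simplified stand-ins for the Calcite types, not the real API:

```java
// Simplified stand-ins for the Calcite expression classes (illustration only).
abstract class RexNode {}

class RexLiteral extends RexNode {
    final Comparable<?> value;
    RexLiteral(Comparable<?> value) { this.value = value; }
}

class RexCall extends RexNode {} // e.g. POWER(5, 5)

public class DivisorGuardSketch {

    // Only read the divisor's value when it really is a literal. A RexCall
    // divisor cannot be inspected statically, so it is simply not treated
    // as a constant zero -- no ClassCastException is possible.
    static boolean isLiteralZero(RexNode divisor) {
        if (divisor instanceof RexLiteral) {
            Comparable<?> v = ((RexLiteral) divisor).value;
            return v instanceof Number && ((Number) v).doubleValue() == 0.0;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isLiteralZero(new RexLiteral(0))); // prints "true"
        System.out.println(isLiteralZero(new RexCall()));     // prints "false"
    }
}
```

An `instanceof` check before the cast is one way to avoid the exception; how a non-literal divisor should then be validated (e.g. deferring the divide-by-zero check to runtime) is a separate design decision for the actual fix.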
[jira] [Updated] (FLINK-26827) Error when integrating Flink SQL with Hive
[ https://issues.apache.org/jira/browse/FLINK-26827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhushifeng updated FLINK-26827: --- Description: Topic : FlinkSQL combine with Hive *step1:* environment: HIVE2.1 Flink1.13.3 FlinkCDC2.1 CDH6.2.1 *step2:* when I do the following thing I come across some problems. For example, copy the following jar to /flink-1.13.3/lib/ // Flink's Hive connector flink-connector-hive_2.11-1.13.3.jar // Hive dependencies hive-exec-2.1.0.jar. == hive-exec-2.1.1-cdh6.2.1.jar // add antlr-runtime if you need to use hive dialect antlr-runtime-3.5.2.jar !image-2022-03-24-09-33-31-786.png! *step3:* restart the Flink Cluster # ./start-cluster.sh # Starting cluster. # Starting standalonesession daemon on host xuehai-cm. # Starting taskexecutor daemon on host xuehai-cm. # Starting taskexecutor daemon on host xuehai-nn. # Starting taskexecutor daemon on host xuehai-dn. *step4:* CREATE CATALOG myhive WITH ( 'type' = 'hive', 'default-database' = 'default', 'hive-conf-dir' = '/etc/hive/conf' ); -- set the HiveCatalog as the current catalog of the session USE CATALOG myhive; *step5:* use the hive Flink SQL> select * from rptdata.basic_xhsys_user ; Exception in thread "main" org.apache.flink.table.client.SqlClientException: Unexpected exception. This is a bug. Please consider filing an issue. 
at org.apache.flink.table.client.SqlClient.startClient(SqlClient.java:201) at org.apache.flink.table.client.SqlClient.main(SqlClient.java:161) Caused by: java.lang.ExceptionInInitializerError at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:348) at org.apache.flink.connectors.hive.HiveSourceFileEnumerator.createMRSplits(HiveSourceFileEnumerator.java:94) at org.apache.flink.connectors.hive.HiveSourceFileEnumerator.createInputSplits(HiveSourceFileEnumerator.java:71) at org.apache.flink.connectors.hive.HiveTableSource.lambda$getDataStream$1(HiveTableSource.java:212) at org.apache.flink.connectors.hive.HiveParallelismInference.logRunningTime(HiveParallelismInference.java:107) at org.apache.flink.connectors.hive.HiveParallelismInference.infer(HiveParallelismInference.java:95) at org.apache.flink.connectors.hive.HiveTableSource.getDataStream(HiveTableSource.java:207) at org.apache.flink.connectors.hive.HiveTableSource$1.produceDataStream(HiveTableSource.java:123) at org.apache.flink.table.planner.plan.nodes.exec.common.CommonExecTableSourceScan.translateToPlanInternal(CommonExecTableSourceScan.java:96) at org.apache.flink.table.planner.plan.nodes.exec.ExecNodeBase.translateToPlan(ExecNodeBase.java:134) at org.apache.flink.table.planner.plan.nodes.exec.ExecEdge.translateToPlan(ExecEdge.java:247) at org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecSink.translateToPlanInternal(StreamExecSink.java:114) at org.apache.flink.table.planner.plan.nodes.exec.ExecNodeBase.translateToPlan(ExecNodeBase.java:134) at org.apache.flink.table.planner.delegation.StreamPlanner.$anonfun$translateToPlan$1(StreamPlanner.scala:70) at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:233) at scala.collection.Iterator.foreach(Iterator.scala:937) at scala.collection.Iterator.foreach$(Iterator.scala:937) at scala.collection.AbstractIterator.foreach(Iterator.scala:1425) at scala.collection.IterableLike.foreach(IterableLike.scala:70) 
at scala.collection.IterableLike.foreach$(IterableLike.scala:69) at scala.collection.AbstractIterable.foreach(Iterable.scala:54) at scala.collection.TraversableLike.map(TraversableLike.scala:233) at scala.collection.TraversableLike.map$(TraversableLike.scala:226) at scala.collection.AbstractTraversable.map(Traversable.scala:104) at org.apache.flink.table.planner.delegation.StreamPlanner.translateToPlan(StreamPlanner.scala:69) at org.apache.flink.table.planner.delegation.PlannerBase.translate(PlannerBase.scala:165) at org.apache.flink.table.api.internal.TableEnvironmentImpl.translate(TableEnvironmentImpl.java:1518) at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeQueryOperation(TableEnvironmentImpl.java:791) at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeInternal(TableEnvironmentImpl.java:1225) at org.apache.flink.table.client.gateway.local.LocalExecutor.lambda$executeOperation$3(LocalExecutor.java:213) at org.apache.flink.table.client.gateway.context.ExecutionContext.wrapClassLoader(ExecutionContext.java:90) at org.apache.flink.table.client.gateway.local.LocalExecutor.executeOperation(LocalExecutor.java:213) at org.apache.flink.table.client.ga
[jira] [Updated] (FLINK-26827) Error when integrating Flink SQL with Hive
[ https://issues.apache.org/jira/browse/FLINK-26827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhushifeng updated FLINK-26827: --- Attachment: image-2022-03-24-09-33-31-786.png > Error when integrating Flink SQL with Hive > - > > Key: FLINK-26827 > URL: https://issues.apache.org/jira/browse/FLINK-26827 > Project: Flink > Issue Type: Bug > Components: Table SQL / API >Affects Versions: 1.13.3 > Environment: CDH 6.2.1, Linux, JDK 1.8 >Reporter: zhushifeng >Priority: Major > Attachments: image-2022-03-24-09-33-31-786.png > > > HIVE2.1 Flink1.13.3 FlinkCDC2.1 I integrated them following the official docs and got the following error: > Flink SQL> select * from rptdata.basic_xhsys_user ; > Exception in thread "main" org.apache.flink.table.client.SqlClientException: > Unexpected exception. This is a bug. Please consider filing an issue. > at > org.apache.flink.table.client.SqlClient.startClient(SqlClient.java:201) > at org.apache.flink.table.client.SqlClient.main(SqlClient.java:161) > Caused by: java.lang.ExceptionInInitializerError > at java.lang.Class.forName0(Native Method) > at java.lang.Class.forName(Class.java:348) > at > org.apache.flink.connectors.hive.HiveSourceFileEnumerator.createMRSplits(HiveSourceFileEnumerator.java:94) > at > org.apache.flink.connectors.hive.HiveSourceFileEnumerator.createInputSplits(HiveSourceFileEnumerator.java:71) > at > org.apache.flink.connectors.hive.HiveTableSource.lambda$getDataStream$1(HiveTableSource.java:212) > at > org.apache.flink.connectors.hive.HiveParallelismInference.logRunningTime(HiveParallelismInference.java:107) > at > org.apache.flink.connectors.hive.HiveParallelismInference.infer(HiveParallelismInference.java:95) > at > org.apache.flink.connectors.hive.HiveTableSource.getDataStream(HiveTableSource.java:207) > at > org.apache.flink.connectors.hive.HiveTableSource$1.produceDataStream(HiveTableSource.java:123) > at > org.apache.flink.table.planner.plan.nodes.exec.common.CommonExecTableSourceScan.translateToPlanInternal(CommonExecTableSourceScan.java:96) > at > 
org.apache.flink.table.planner.plan.nodes.exec.ExecNodeBase.translateToPlan(ExecNodeBase.java:134) > at > org.apache.flink.table.planner.plan.nodes.exec.ExecEdge.translateToPlan(ExecEdge.java:247) > at > org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecSink.translateToPlanInternal(StreamExecSink.java:114) > at > org.apache.flink.table.planner.plan.nodes.exec.ExecNodeBase.translateToPlan(ExecNodeBase.java:134) > at > org.apache.flink.table.planner.delegation.StreamPlanner.$anonfun$translateToPlan$1(StreamPlanner.scala:70) > at > scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:233) > at scala.collection.Iterator.foreach(Iterator.scala:937) > at scala.collection.Iterator.foreach$(Iterator.scala:937) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1425) > at scala.collection.IterableLike.foreach(IterableLike.scala:70) > at scala.collection.IterableLike.foreach$(IterableLike.scala:69) > at scala.collection.AbstractIterable.foreach(Iterable.scala:54) > at scala.collection.TraversableLike.map(TraversableLike.scala:233) > at scala.collection.TraversableLike.map$(TraversableLike.scala:226) > at scala.collection.AbstractTraversable.map(Traversable.scala:104) > at > org.apache.flink.table.planner.delegation.StreamPlanner.translateToPlan(StreamPlanner.scala:69) > at > org.apache.flink.table.planner.delegation.PlannerBase.translate(PlannerBase.scala:165) > at > org.apache.flink.table.api.internal.TableEnvironmentImpl.translate(TableEnvironmentImpl.java:1518) > at > org.apache.flink.table.api.internal.TableEnvironmentImpl.executeQueryOperation(TableEnvironmentImpl.java:791) > at > org.apache.flink.table.api.internal.TableEnvironmentImpl.executeInternal(TableEnvironmentImpl.java:1225) > at > org.apache.flink.table.client.gateway.local.LocalExecutor.lambda$executeOperation$3(LocalExecutor.java:213) > at > org.apache.flink.table.client.gateway.context.ExecutionContext.wrapClassLoader(ExecutionContext.java:90) > at > 
org.apache.flink.table.client.gateway.local.LocalExecutor.executeOperation(LocalExecutor.java:213) > at > org.apache.flink.table.client.gateway.local.LocalExecutor.executeQuery(LocalExecutor.java:235) > at > org.apache.flink.table.client.cli.CliClient.callSelect(CliClient.java:479) > at > org.apache.flink.table.client.cli.CliClient.callOperation(CliClient.java:412) > at > org.apache.flink.table.client.cli.CliClient.lambda$executeStatement$0(CliClient.java:327) > at java.util.Optional.ifPresent(Optional.java:159) >
[GitHub] [flink-ml] yunfengzhou-hub commented on a change in pull request #70: [FLINK-26313] Add Transformer and Estimator of OnlineKMeans
yunfengzhou-hub commented on a change in pull request #70: URL: https://github.com/apache/flink-ml/pull/70#discussion_r833839951 ## File path: flink-ml-lib/src/main/java/org/apache/flink/ml/clustering/kmeans/OnlineKMeans.java ## @@ -0,0 +1,562 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.ml.clustering.kmeans; + +import org.apache.flink.api.common.functions.AggregateFunction; +import org.apache.flink.api.common.functions.FlatMapFunction; +import org.apache.flink.api.common.functions.MapFunction; +import org.apache.flink.api.common.state.ListState; +import org.apache.flink.api.common.state.ListStateDescriptor; +import org.apache.flink.api.common.typeinfo.BasicTypeInfo; +import org.apache.flink.api.common.typeinfo.TypeInformation; +import org.apache.flink.api.java.tuple.Tuple2; +import org.apache.flink.api.java.typeutils.ObjectArrayTypeInfo; +import org.apache.flink.api.java.typeutils.TupleTypeInfo; +import org.apache.flink.iteration.DataStreamList; +import org.apache.flink.iteration.IterationBody; +import org.apache.flink.iteration.IterationBodyResult; +import org.apache.flink.iteration.Iterations; +import org.apache.flink.iteration.operator.OperatorStateUtils; +import org.apache.flink.ml.api.Estimator; +import org.apache.flink.ml.common.distance.DistanceMeasure; +import org.apache.flink.ml.linalg.BLAS; +import org.apache.flink.ml.linalg.DenseVector; +import org.apache.flink.ml.linalg.Vectors; +import org.apache.flink.ml.linalg.typeinfo.DenseVectorTypeInfo; +import org.apache.flink.ml.param.Param; +import org.apache.flink.ml.util.ParamUtils; +import org.apache.flink.ml.util.ReadWriteUtils; +import org.apache.flink.runtime.state.StateInitializationContext; +import org.apache.flink.streaming.api.datastream.DataStream; +import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment; +import org.apache.flink.streaming.api.operators.AbstractStreamOperator; +import org.apache.flink.streaming.api.operators.TwoInputStreamOperator; +import org.apache.flink.streaming.runtime.streamrecord.StreamRecord; +import org.apache.flink.table.api.Table; +import org.apache.flink.table.api.bridge.java.StreamTableEnvironment; +import org.apache.flink.table.api.bridge.java.internal.StreamTableEnvironmentImpl; +import 
org.apache.flink.table.api.internal.TableImpl; +import org.apache.flink.types.Row; +import org.apache.flink.util.Collector; +import org.apache.flink.util.Preconditions; + +import org.apache.commons.collections.IteratorUtils; +import org.apache.commons.lang3.ArrayUtils; + +import java.io.IOException; +import java.nio.file.Files; +import java.nio.file.Path; +import java.nio.file.Paths; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.Collections; +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.Objects; +import java.util.Random; + +/** + * OnlineKMeans extends the function of {@link KMeans}, supporting to train a K-Means model + * continuously according to an unbounded stream of train data. + * + * OnlineKMeans makes updates with the "mini-batch" KMeans rule, generalized to incorporate + * forgetfulness (i.e. decay). After the centroids estimated on the current batch are acquired, + * OnlineKMeans computes the new centroids from the weighted average between the original and the + * estimated centroids. The weight of the estimated centroids is the number of points assigned to + * them. The weight of the original centroids is also the number of points, but additionally + * multiplying with the decay factor. + * + * The decay factor scales the contribution of the clusters as estimated thus far. If decay + * factor is 1, all batches are weighted equally. If decay factor is 0, new centroids are determined + * entirely by recent data. Lower values correspond to more forgetting. + */ +public class OnlineKMeans +implements Estimator, OnlineKMeansParams { +private final Map, Object> paramMap = new HashMap<>(); +private Table initModelDataTable; + +public OnlineKMeans() { +ParamUtils.initializeMapWithDefaultValues(paramMap, this); +} + +public OnlineKMeans(Table initModelDataTable) { +
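The decay-weighted centroid update described in the Javadoc above can be sketched as plain arithmetic. A minimal illustration of the rule only, not the PR's operator code:

```java
import java.util.Arrays;

public class DecayedCentroidUpdate {

    // New centroid = weighted average of the old centroid (weight: points
    // assigned so far, scaled by the decay factor) and the centroid estimated
    // on the current mini-batch (weight: points assigned in this batch).
    static double[] update(double[] oldCenter, double oldCount,
                           double[] batchCenter, double batchCount, double decay) {
        double oldWeight = oldCount * decay;
        double total = oldWeight + batchCount;
        double[] merged = new double[oldCenter.length];
        for (int i = 0; i < oldCenter.length; i++) {
            merged[i] = (oldCenter[i] * oldWeight + batchCenter[i] * batchCount) / total;
        }
        return merged;
    }

    public static void main(String[] args) {
        double[] oldCenter = {0.0, 0.0};
        double[] batchCenter = {10.0, 10.0};
        // decay = 0: history is forgotten, centroids come entirely from the batch.
        System.out.println(Arrays.toString(update(oldCenter, 100, batchCenter, 10, 0.0))); // prints "[10.0, 10.0]"
        // decay = 1: all batches weighted equally, i.e. a running average
        // dominated here by the 100 previously assigned points.
        System.out.println(Arrays.toString(update(oldCenter, 100, batchCenter, 10, 1.0)));
    }
}
```

Intermediate decay values interpolate between these two extremes, which is the "forgetfulness" the Javadoc refers to.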
[jira] [Commented] (FLINK-26799) StateChangeFormat#read does not seek to offset correctly
[ https://issues.apache.org/jira/browse/FLINK-26799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17511536#comment-17511536 ] Roman Khachatryan commented on FLINK-26799: --- Hi [~Feifan Wang], thanks for looking into it. > But the if condition is incorrect, stream.getPos() returns the position of > the underlying stream, which is different from the position of input. Exactly, this is the underlying stream that needs to be positioned. Then "offset" bytes are skipped from the buffer. So I think this is correct. > Actually, input is always at position 0 at the beginning, so I think we can seek > to the offset directly. Are you proposing to seek on the underlying stream {*}before wrapping{*}? I think that won't work, because the compression flag is written at the beginning. Alternatively, seeking on the underlying stream *after wrapping* seems dangerous to me: there is no contract that no bytes are buffered by the constructor AFAIK. > StateChangeFormat#read does not seek to offset correctly > --- > > Key: FLINK-26799 > URL: https://issues.apache.org/jira/browse/FLINK-26799 > Project: Flink > Issue Type: Bug > Components: Runtime / State Backends >Reporter: Feifan Wang >Priority: Major > > StateChangeFormat#read must seek to offset before reading; the current implementation is as > follows: > > {code:java} > FSDataInputStream stream = handle.openInputStream(); > DataInputViewStreamWrapper input = wrap(stream); > if (stream.getPos() != offset) { > LOG.debug("seek from {} to {}", stream.getPos(), offset); > input.skipBytesToRead((int) offset); > }{code} > But the if condition is incorrect: stream.getPos() returns the position of the > underlying stream, which is different from the position of input. > By the way, because it is wrapped by BufferedInputStream, the position of the underlying > stream is always at n*bufferSize or the end of the file. > Actually, input is always at position 0 at the beginning, so I think we can seek > to the offset directly. 
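The core pitfall in this thread — comparing the underlying stream's position with a logical offset while a buffering wrapper sits in between — can be shown with plain `java.io`. A standalone sketch, not the actual `StateChangeFormat` code; the one-byte header standing in for the compression flag is an assumption:

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class SeekOnWrapperSketch {

    // Read the payload byte at `offset`, skipping on the WRAPPER after the
    // header has been consumed. Checking or seeking the raw stream's position
    // instead would be wrong: the BufferedInputStream may already have read
    // ahead, so the raw position does not reflect what the reader consumed.
    static int readAtOffset(byte[] data, int offset) {
        try (DataInputStream input =
                new DataInputStream(new BufferedInputStream(new ByteArrayInputStream(data)))) {
            int flag = input.read();  // header byte (e.g. a compression flag)
            input.skipBytes(offset);  // logical offset within the payload
            return input.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = {0x01, 10, 20, 30, 40, 50}; // header + payload
        System.out.println(readAtOffset(data, 3)); // prints "40"
    }
}
```

This mirrors Roman's point: skip on the wrapped input after the header is read, rather than repositioning the underlying stream.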
[jira] [Updated] (FLINK-26817) Update ingress docs with templating examples
[ https://issues.apache.org/jira/browse/FLINK-26817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-26817: --- Labels: pull-request-available (was: ) > Update ingress docs with templating examples > > > Key: FLINK-26817 > URL: https://issues.apache.org/jira/browse/FLINK-26817 > Project: Flink > Issue Type: Sub-task >Reporter: Matyas Orhidi >Priority: Major > Labels: pull-request-available
[GitHub] [flink] pnowojski commented on a change in pull request #19198: [FLINK-26783] Restore from a stop-with-savepoint if failed during committing
pnowojski commented on a change in pull request #19198: URL: https://github.com/apache/flink/pull/19198#discussion_r833684250 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinator.java ## @@ -1243,7 +1243,7 @@ private void completePendingCheckpoint(PendingCheckpoint pendingCheckpoint) // the pending checkpoint must be discarded after the finalization Preconditions.checkState(pendingCheckpoint.isDisposed() && completedCheckpoint != null); -if (!props.isSavepoint()) { +if (!props.isSavepoint() || props.isSynchronous()) { lastSubsumed = addCompletedCheckpointToStoreAndSubsumeOldest( Review comment: Should we subsume anything in this case? Also, isn't this breaking savepoint ownership? I mean, what if the user deletes the savepoint?
[GitHub] [flink] pnowojski commented on a change in pull request #19198: [FLINK-26783] Restore from a stop-with-savepoint if failed during committing
pnowojski commented on a change in pull request #19198: URL: https://github.com/apache/flink/pull/19198#discussion_r833684250 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinator.java ## @@ -1243,7 +1243,7 @@ private void completePendingCheckpoint(PendingCheckpoint pendingCheckpoint) // the pending checkpoint must be discarded after the finalization Preconditions.checkState(pendingCheckpoint.isDisposed() && completedCheckpoint != null); -if (!props.isSavepoint()) { +if (!props.isSavepoint() || props.isSynchronous()) { lastSubsumed = addCompletedCheckpointToStoreAndSubsumeOldest( Review comment: Should we subsume anything in this case? Also, isn't this breaking savepoint ownership?
[GitHub] [flink] flinkbot edited a comment on pull request #19205: [hotfix][table-planner] Cleanup code around TableConfig/ReadableConfig
flinkbot edited a comment on pull request #19205: URL: https://github.com/apache/flink/pull/19205#issuecomment-1075253969 ## CI report: * 0abeef29c3501174803ec2d7ce4421b914db9626 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=33667)
[GitHub] [flink] flinkbot edited a comment on pull request #19215: [FLINK-26789][state] Fix RescaleCheckpointManuallyITCase fail
flinkbot edited a comment on pull request #19215: URL: https://github.com/apache/flink/pull/19215#issuecomment-1076533796 ## CI report: * 8a5e2f0be2021d6fdf42ec913ad7199b0e78f9ef Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=33665)
[GitHub] [flink] flinkbot edited a comment on pull request #19207: [FLINK-26700][docs] Document restore mode in chinese
flinkbot edited a comment on pull request #19207: URL: https://github.com/apache/flink/pull/19207#issuecomment-1075886786 ## CI report: * 0724fedd6789ee0c8b95dfba9e85e689b9186c20 UNKNOWN * 2b16e626ba9f3efac8bd6b796353efaaf8b98de0 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=33663)
[jira] [Commented] (FLINK-26639) Publish flink-kubernetes-operator maven artifacts
[ https://issues.apache.org/jira/browse/FLINK-26639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17511396#comment-17511396 ] Thomas Weise commented on FLINK-26639: -- [~chesnay] thanks for the pointer. I was mainly looking to understand the credential setup. It appears that this can also be solved through GH actions: https://issues.apache.org/jira/browse/INFRA-21167 https://github.com/apache/spark/pull/30623 > Publish flink-kubernetes-operator maven artifacts > - > > Key: FLINK-26639 > URL: https://issues.apache.org/jira/browse/FLINK-26639 > Project: Flink > Issue Type: Sub-task >Reporter: Thomas Weise >Priority: Major > > We should publish the Maven artifacts in addition to the Docker images so > that downstream Java projects can utilize the CRD classes directly. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[GitHub] [flink] flinkbot edited a comment on pull request #19207: [FLINK-26700][docs] Document restore mode in chinese
flinkbot edited a comment on pull request #19207: URL: https://github.com/apache/flink/pull/19207#issuecomment-1075886786 ## CI report: * 0724fedd6789ee0c8b95dfba9e85e689b9186c20 UNKNOWN * 98001b33616f6c01f235d95342c27d4aae7e0a85 Azure: [CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=33666) * 2b16e626ba9f3efac8bd6b796353efaaf8b98de0 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=33663) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #19205: [hotfix][table-planner] Cleanup code around TableConfig/ReadableConfig
flinkbot edited a comment on pull request #19205: URL: https://github.com/apache/flink/pull/19205#issuecomment-1075253969 ## CI report: * 189e31dfbd7cf3196de8191bec0ec01b9ff20fd5 Azure: [CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=33664) * 0abeef29c3501174803ec2d7ce4421b914db9626 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=33667) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-26830) Move jobs to suspended state before upgrading
[ https://issues.apache.org/jira/browse/FLINK-26830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-26830: --- Labels: pull-request-available (was: ) > Move jobs to suspended state before upgrading > - > > Key: FLINK-26830 > URL: https://issues.apache.org/jira/browse/FLINK-26830 > Project: Flink > Issue Type: Sub-task > Components: Kubernetes Operator >Reporter: Gyula Fora >Assignee: Gyula Fora >Priority: Major > Labels: pull-request-available > > We should not upgrade jobs in one step, as that might cause us to lose track > of which parts of the upgrade have succeeded and which have not. > For example, when upgrading with the savepoint strategy, we need to record the > savepoint info in the status before attempting the deployment, because if the > deployment fails we lose the savepoint info. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[GitHub] [flink] flinkbot edited a comment on pull request #19207: [FLINK-26700][docs] Document restore mode in chinese
flinkbot edited a comment on pull request #19207: URL: https://github.com/apache/flink/pull/19207#issuecomment-1075886786 ## CI report: * 55dba54d849604a41e5dfcb8e8eef9fbdbc093ed Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=33630) * 0724fedd6789ee0c8b95dfba9e85e689b9186c20 UNKNOWN * 98001b33616f6c01f235d95342c27d4aae7e0a85 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=33666) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #19205: [hotfix][table-planner] Cleanup code around TableConfig/ReadableConfig
flinkbot edited a comment on pull request #19205: URL: https://github.com/apache/flink/pull/19205#issuecomment-1075253969 ## CI report: * 189e31dfbd7cf3196de8191bec0ec01b9ff20fd5 Azure: [CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=33664) * 0abeef29c3501174803ec2d7ce4421b914db9626 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #19205: [hotfix][table-planner] Cleanup code around TableConfig/ReadableConfig
flinkbot edited a comment on pull request #19205: URL: https://github.com/apache/flink/pull/19205#issuecomment-1075253969 ## CI report: * 189e31dfbd7cf3196de8191bec0ec01b9ff20fd5 Azure: [CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=33664) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #19215: [FLINK-26789][state] Fix RescaleCheckpointManuallyITCase fail
flinkbot edited a comment on pull request #19215: URL: https://github.com/apache/flink/pull/19215#issuecomment-1076533796 ## CI report: * 8a5e2f0be2021d6fdf42ec913ad7199b0e78f9ef Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=33665) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-26738) Default value of StateDescriptor is valid when enable state ttl config
[ https://issues.apache.org/jira/browse/FLINK-26738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17511363#comment-17511363 ] Jianhui Dong commented on FLINK-26738: -- I think this is an issue worth discussing. From my personal point of view, if the default value is valid for TTL state, there is no difference between having the user set a default value on the state descriptor and manually replacing the expired state with a default value in code. Even if a user sets a default value, they can still replace it with a new value; there is no conflict, and it would make the concept of a default value clearer. And one more question: why do we add the @Deprecated annotation to the ValueStateDescriptor constructor? Will the default value in class StateDescriptor be removed in the future? > Default value of StateDescriptor is valid when enable state ttl config > -- > > Key: FLINK-26738 > URL: https://issues.apache.org/jira/browse/FLINK-26738 > Project: Flink > Issue Type: Bug > Components: API / Core >Affects Versions: 1.15.0 >Reporter: Jianhui Dong >Priority: Critical > > Suppose we declare a ValueState like the following: > {code:java} > ValueStateDescriptor<Tuple2<Long, Long>> descriptor = > new ValueStateDescriptor<>( > "average", // the state name > TypeInformation.of(new TypeHint<Tuple2<Long, Long>>() > {}), > Tuple2.of(0L, 0L)); > {code} > and then we add a state TTL config to the state: > {code:java} > descriptor.enableTimeToLive(StateTtlConfigUtil.createTtlConfig(6)); > {code} > the default value Tuple2.of(0L, 0L) will be invalid and may cause an NPE. > I don't know if this is a bug, because I see @Deprecated in the comment of the > ValueStateDescriptor constructor with the argument defaultValue: > {code:java} > Use {@link #ValueStateDescriptor(String, TypeSerializer)} instead and manually > * manage the default value by checking whether the contents of the > state is {@code null}.
> {code} > and if we decide not to use the defaultValue field in the class > StateDescriptor, should we add @Deprecated annotation to the field > defaultValue? -- This message was sent by Atlassian Jira (v8.20.1#820001)
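The deprecation note quoted above points at the recommended pattern: do not bake a default into the descriptor; instead, treat a `null` read as "no value" at the use site. The sketch below is deliberately Flink-free; `ValueStateStub` is a hypothetical stand-in for Flink's `ValueState`, whose `value()` returns `null` for a missing or TTL-expired entry. It only illustrates why the explicit null check keeps expiry observable, whereas a descriptor-level default would silently mask it:

```java
import java.util.HashMap;
import java.util.Map;

public class NullCheckDefaultDemo {
    /**
     * Hypothetical stand-in for Flink's ValueState: value() returns null when
     * the entry is missing or has been removed by TTL cleanup.
     */
    static final class ValueStateStub<T> {
        private final Map<String, T> store = new HashMap<>();

        T value(String key) { return store.get(key); }

        void update(String key, T v) { store.put(key, v); }

        // Simulates TTL expiry removing the entry.
        void expire(String key) { store.remove(key); }
    }

    public static void main(String[] args) {
        ValueStateStub<long[]> average = new ValueStateStub<>();

        // Recommended pattern: check for null and initialize explicitly,
        // instead of relying on a descriptor-level default value.
        long[] current = average.value("key");
        if (current == null) {
            current = new long[] {0L, 0L}; // explicit default, visible at the use site
        }
        current[0] += 1;   // count
        current[1] += 10;  // sum
        average.update("key", current);
        System.out.println(current[0] + "," + current[1]); // 1,10

        average.expire("key"); // TTL removed the entry
        // Expiry stays observable: the read is null rather than a stale default.
        System.out.println(average.value("key") == null); // true
    }
}
```

With a descriptor-level default, the post-expiry read would come back as the default tuple again, and the job could not distinguish "expired" from "never written", which is exactly the ambiguity the deprecation tries to remove.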
[GitHub] [flink] Myasuka commented on a change in pull request #19207: [FLINK-26700][docs] Document restore mode in chinese
Myasuka commented on a change in pull request #19207: URL: https://github.com/apache/flink/pull/19207#discussion_r833469968 ## File path: docs/content.zh/docs/ops/state/savepoints.md ## @@ -157,10 +157,54 @@ $ bin/flink run -s :savepointPath [:runArgs] 默认情况下,resume 操作将尝试将 Savepoint 的所有状态映射回你要还原的程序。 如果删除了运算符,则可以通过 `--allowNonRestoredState`(short:`-n`)选项跳过无法映射到新程序的状态: + Restore 模式 + +`Restore 模式` 决定了在 restore 之后谁拥有组成 Savepoint 或者 [externalized checkpoint]({{< ref "docs/ops/state/checkpoints" >}}/#resuming-from-a-retained-checkpoint)的文件的所有权。在这种语境下 Savepoint 和 externalized checkpoint 的行为相似。这里我们将它们都称为“快照”,除非另有明确说明。 + +如前所述,restore 模式决定了谁来接管我们从中恢复的快照文件的所有权。快照可被用户或者 Flink 自身拥有。如果快照归用户所有,Flink 不会删除其中的文件,而且 Flink 不能依赖该快照中文件的存在,因为它可能在 Flink 的控制之外被删除。 + +每种 restore 模式都有特定的用途。尽管如此,我们仍然认为默认的 *NO_CLAIM* 模式在大多数情况下是一个很好的折中方案,因为它在提供明确的所有权归属的同时只给恢复后第一个 checkpoint 带来较小的代价。 + +你可以通过如下方式指定 restore 模式: ```shell -$ bin/flink run -s :savepointPath -n [:runArgs] +$ bin/flink run -s :savepointPath -restoreMode :mode -n [:runArgs] ``` +**NO_CLAIM (默认的)** + +在 *NO_CLAIM* 模式下,Flink 不会接管快照的所有权。它会将快照的文件置于用户的控制之中,并且永远不会删除其中的任何文件。该模式下可以从同一个快照上启动多个作业。 + +为保证 Flink 不会依赖于该快照的任何文件,它会强制第一个(成功的) checkpoint 为全量 checkpoint 而不是增量的。这仅对`state.backend: rocksdb` 有影响,因为其他 backend 总是制作全量 checkpoint。 + +一旦第一个全量的 checkpoint 完成后,所有后续的 checkpoint 会照常制作(按照配置)。所以,一旦一个 checkpoint 成功制作,就可以删除原快照。在此之前不能删除原快照,因为没有任何完成的 checkpoint,Flink 会在故障时尝试从初始的快照恢复。 + + + {{< img src="/fig/restore-mode-no_claim.svg" alt="NO_CLAIM restore mode" width="70%" >}} + + +**CLAIM** + +另一个可选的模式是 *CLAIM* 模式。该模式下 Flink 将声称拥有快照的所有权,并且本质上将其作为 checkpoint 对待:控制其生命周期并且可能会在其永远不会被用于恢复的时候删除它。因此,手动删除快照和从同一个快照上启动两个作业都是不安全的。Flink 会保持[配置数量]({{< ref "docs/dev/datastream/fault-tolerance/checkpointing" >}}/#state-checkpoints-num-retained)的 checkpoint。 + + + {{< img src="/fig/restore-mode-claim.svg" alt="CLAIM restore mode" width="70%" >}} + + +{{< hint info >}} +**注意:** +1. 
Retained checkpoints 被存储在 `//chk_` 这样的目录中。Flink 不会接管 `/` 目录的所有权,而只会接管 `chk_` 的所有权。Flink 不会删除旧作业的目录。 Review comment: I think we should leave the 1st commit as a hotfix to resolve the description of `chk-` and squash all other commits into another one. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #19215: [FLINK-26789][state] Fix RescaleCheckpointManuallyITCase fail
flinkbot commented on pull request #19215: URL: https://github.com/apache/flink/pull/19215#issuecomment-1076533796 ## CI report: * 8a5e2f0be2021d6fdf42ec913ad7199b0e78f9ef Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=33665) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink-table-store] LadyForest commented on a change in pull request #58: [FLINK-26669] Refactor ReadWriteTableITCase
LadyForest commented on a change in pull request #58: URL: https://github.com/apache/flink-table-store/pull/58#discussion_r833463128 ## File path: flink-table-store-connector/src/test/java/org/apache/flink/table/store/connector/ReadWriteTableITCase.java ## @@ -19,73 +19,242 @@ package org.apache.flink.table.store.connector; import org.apache.flink.api.common.RuntimeExecutionMode; +import org.apache.flink.configuration.ExecutionOptions; import org.apache.flink.table.api.TableResult; -import org.apache.flink.table.planner.runtime.utils.TestData; +import org.apache.flink.table.catalog.ObjectIdentifier; import org.apache.flink.table.store.file.FileStoreOptions; import org.apache.flink.types.Row; +import org.apache.flink.types.RowKind; import org.apache.flink.util.CloseableIterator; +import org.apache.flink.util.function.TriFunction; +import org.apache.commons.lang3.tuple.Pair; +import org.assertj.core.api.AbstractThrowableAssert; import org.junit.Test; import org.junit.runner.RunWith; import org.junit.runners.Parameterized; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; -import javax.annotation.Nullable; - -import java.math.BigDecimal; import java.nio.file.Paths; import java.util.ArrayList; import java.util.Arrays; +import java.util.Collections; +import java.util.LinkedHashMap; import java.util.List; +import java.util.Map; import java.util.UUID; +import java.util.concurrent.Callable; +import java.util.concurrent.ExecutorService; +import java.util.concurrent.Executors; +import java.util.concurrent.Future; +import java.util.concurrent.TimeUnit; import java.util.stream.Collectors; import java.util.stream.Stream; -import scala.collection.JavaConverters; - import static org.apache.flink.table.planner.factories.TestValuesTableFactory.changelogRow; import static org.apache.flink.table.planner.factories.TestValuesTableFactory.registerData; import static org.assertj.core.api.Assertions.assertThat; import static org.assertj.core.api.Assertions.assertThatThrownBy; 
-/** IT cases for testing querying managed table dml. */ +/** IT cases for managed table dml. */ @RunWith(Parameterized.class) public class ReadWriteTableITCase extends TableStoreTestBase { -private final boolean hasPk; -@Nullable private final Boolean duplicate; +private static final Logger LOG = LoggerFactory.getLogger(ReadWriteTableITCase.class); + +private static final Map> PROCESSED_RECORDS = new LinkedHashMap<>(); Review comment: > Why use a static collection? It is very dangerous for thread safety and memory leaks. Because the concurrent read/write case specs have not been added yet. The current cases are restricted to sequential read/write. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] RocMarshal commented on a change in pull request #19023: [FLINK-25705][docs]Translate "Metric Reporters" page of "Deployment" …
RocMarshal commented on a change in pull request #19023: URL: https://github.com/apache/flink/pull/19023#discussion_r833463004 ## File path: docs/content.zh/docs/deployment/metric_reporters.md ## @@ -24,31 +24,33 @@ specific language governing permissions and limitations under the License. --> -# Metric Reporters + -Flink allows reporting metrics to external systems. -For more information about Flink's metric system go to the [metric system documentation]({{< ref "docs/ops/metrics" >}}). +# 指标发送器 +Flink 支持用户将 Flink 的各项运行时指标发送给外部系统。 +了解更多指标方面信息可查看 [metric system documentation]({{< ref "zh/docs/ops/metrics" >}})。 Review comment: About your build error log, I'm not sure of the real reason; you may need to set up a Hugo environment on your PC. In short, when I ran the command to build the site on my PC, I found the broken links pointed out in the comments. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-26789) RescaleCheckpointManuallyITCase.testCheckpointRescalingInKeyedState failed
[ https://issues.apache.org/jira/browse/FLINK-26789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-26789: --- Labels: pull-request-available (was: ) > RescaleCheckpointManuallyITCase.testCheckpointRescalingInKeyedState failed > -- > > Key: FLINK-26789 > URL: https://issues.apache.org/jira/browse/FLINK-26789 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing >Affects Versions: 1.16.0 >Reporter: Matthias Pohl >Assignee: Yanfei Lei >Priority: Critical > Labels: pull-request-available > Fix For: 1.16.0 > > > [This > build|https://dev.azure.com/mapohl/flink/_build/results?buildId=894&view=logs&j=0a15d512-44ac-5ba5-97ab-13a5d066c22c&t=9a028d19-6c4b-5a4e-d378-03fca149d0b1&l=5687] > failed due to > {{RescaleCheckpointManuallyITCase.testCheckpointRescalingInKeyedState}}: > {code} > Mar 21 17:05:32 [ERROR] > org.apache.flink.test.checkpointing.RescaleCheckpointManuallyITCase.testCheckpointRescalingInKeyedState > Time elapsed: 23.966 s <<< FAILURE! 
> Mar 21 17:05:32 java.lang.AssertionError: expected:<[(0,24000), (1,22500), > (0,34500), (1,33000), (0,21000), (0,45000), (2,31500), (2,42000), (1,6000), > (0,28500), (0,52500), (2,15000), (1,3000), (1,51000), (0,1500), (0,49500), > (2,12000), (2,6), (0,36000), (1,10500), (1,58500), (0,46500), (0,9000), > (0,57000), (2,19500), (2,43500), (1,7500), (1,55500), (2,3), (1,18000), > (0,54000), (2,40500), (1,4500), (0,16500), (2,27000), (1,39000), (2,13500), > (1,25500), (0,37500), (0,61500), (2,0), (2,48000)]> but was:<[(1,22500), > (1,33000), (0,21000), (2,18000), (1,6000), (0,20500), (0,52500), (0,15000), > (0,31000), (2,12000), (2,6), (0,36000), (1,58500), (1,10500), (0,46500), > (0,25000), (0,41000), (0,9000), (0,57000), (2,43500), (0,3), (1,4500), > (2,27000), (1,15000), (0,35000), (0,19000), (0,3000), (1,25500), (0,61500), > (2,48000), (2,0), (0,24000), (0,34500), (0,45000), (2,31500), (1,19500), > (2,1), (2,42000), (0,12500), (0,28500), (2,15000), (1,3000), (1,51000), > (0,23000), (0,49500), (0,1500), (0,33000), (0,1000), (2,19500), (1,7500), > (1,55500), (2,3), (1,18000), (0,6000), (0,38000), (0,54000), (2,40500), > (0,500), (0,16500), (1,39000), (1,7000), (0,11000), (2,13500), (0,37500)]> > Mar 21 17:05:32 at org.junit.Assert.fail(Assert.java:89) > Mar 21 17:05:32 at org.junit.Assert.failNotEquals(Assert.java:835) > Mar 21 17:05:32 at org.junit.Assert.assertEquals(Assert.java:120) > Mar 21 17:05:32 at org.junit.Assert.assertEquals(Assert.java:146) > Mar 21 17:05:32 at > org.apache.flink.test.checkpointing.RescaleCheckpointManuallyITCase.restoreAndAssert(RescaleCheckpointManuallyITCase.java:218) > Mar 21 17:05:32 at > org.apache.flink.test.checkpointing.RescaleCheckpointManuallyITCase.testCheckpointRescalingKeyedState(RescaleCheckpointManuallyITCase.java:122) > Mar 21 17:05:32 at > org.apache.flink.test.checkpointing.RescaleCheckpointManuallyITCase.testCheckpointRescalingInKeyedState(RescaleCheckpointManuallyITCase.java:88) > {code} -- This message was sent by 
Atlassian Jira (v8.20.1#820001)
[GitHub] [flink] fredia opened a new pull request #19215: [FLINK-26789][state] Fix RescaleCheckpointManuallyITCase fail
fredia opened a new pull request #19215: URL: https://github.com/apache/flink/pull/19215 ## What is the purpose of the change *Fix `RescaleCheckpointManuallyITCase.testCheckpointRescalingInKeyedState` fail* ## Brief change log - *Restore `RescalingITCase` to the original* - *Make CollectionSink no longer shared by `RescaleCheckpointManuallyITCase` and `RescalingITCase`.* ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
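The second bullet in the change log above, un-sharing `CollectionSink` between `RescaleCheckpointManuallyITCase` and `RescalingITCase`, guards against classic cross-test pollution. The classes below are a hypothetical, Flink-free reduction of that bug: a static collection reused across "jobs" accumulates records from earlier runs, which is the same shape as the unexpected extra tuples in the FLINK-26789 assertion failure:

```java
import java.util.ArrayList;
import java.util.List;

public class SharedSinkDemo {
    // Anti-pattern: one static buffer shared by every test run.
    static final List<Integer> SHARED_SINK = new ArrayList<>();

    // Writes `records` elements into the shared static sink and returns it.
    static List<Integer> runJobWithSharedSink(int records) {
        for (int i = 0; i < records; i++) {
            SHARED_SINK.add(i);
        }
        return SHARED_SINK;
    }

    // Fix: each run gets its own sink instance, so results cannot leak.
    static List<Integer> runJobWithOwnSink(int records) {
        List<Integer> sink = new ArrayList<>();
        for (int i = 0; i < records; i++) {
            sink.add(i);
        }
        return sink;
    }

    public static void main(String[] args) {
        // "Test A" and "Test B" both emit 3 records.
        int a = runJobWithSharedSink(3).size();
        int b = runJobWithSharedSink(3).size(); // sees A's leftovers
        System.out.println(a + " " + b); // 3 6

        int c = runJobWithOwnSink(3).size();
        int d = runJobWithOwnSink(3).size(); // isolated
        System.out.println(c + " " + d); // 3 3
    }
}
```

Giving each test suite its own sink (or clearing shared state in a setup method) makes each assertion depend only on the job under test, not on whichever suite happened to run first.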