[jira] [Commented] (FLINK-24586) SQL functions should return STRING instead of VARCHAR(2000)
[ https://issues.apache.org/jira/browse/FLINK-24586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17431015#comment-17431015 ] liwei li commented on FLINK-24586: -- I'm interested in changing the return of JSON functions, can I have this ticket and start writing JSON-related code first? > SQL functions should return STRING instead of VARCHAR(2000) > --- > > Key: FLINK-24586 > URL: https://issues.apache.org/jira/browse/FLINK-24586 > Project: Flink > Issue Type: Improvement > Components: Table SQL / API >Reporter: Ingo Bürk >Priority: Major > > There are some SQL functions which currently return VARCHAR(2000). With more > strict CAST behavior from FLINK-24413, this could become an issue. > The following functions return VARCHAR(2000) and should be changed to return > STRING instead: > * JSON_VALUE > * JSON_QUERY > * JSON_OBJECT > * JSON_ARRAY > There are also some more functions which should be evaluated: > * CHR > * REVERSE > * SPLIT_INDEX > * PARSE_URL > * FROM_UNIXTIME > * DECODE -- This message was sent by Atlassian Jira (v8.3.4#803005)
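The motivation above can be illustrated with a toy example. Under the stricter CAST behavior of FLINK-24413, a value longer than the declared VARCHAR length can no longer be silently truncated, so a JSON function declared as VARCHAR(2000) would fail on large documents. The sketch below is purely illustrative plain Java (not Flink's type system or API); `castToVarchar` is a hypothetical helper that mimics strict length checking.

```java
// Illustrative only: why VARCHAR(2000) breaks under strict CAST semantics.
public class StrictCastSketch {
    // Strict cast: values longer than the declared length are rejected
    // instead of being silently truncated.
    static String castToVarchar(String value, int length) {
        if (value.length() > length) {
            throw new IllegalArgumentException("value exceeds VARCHAR(" + length + ")");
        }
        return value;
    }

    public static void main(String[] args) {
        // A small JSON document fits into VARCHAR(2000) without trouble.
        System.out.println(castToVarchar("{\"a\":1}", 2000));

        // A JSON document longer than 2000 characters is rejected under
        // strict casting, which is why an unbounded STRING return type is
        // safer for JSON_VALUE, JSON_QUERY, JSON_OBJECT and JSON_ARRAY.
        StringBuilder big = new StringBuilder("[");
        for (int i = 0; i < 1000; i++) { big.append(i).append(','); }
        big.append(']');
        boolean rejected = false;
        try {
            castToVarchar(big.toString(), 2000);
        } catch (IllegalArgumentException e) {
            rejected = true;
        }
        System.out.println("rejected=" + rejected); // rejected=true
    }
}
```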
[GitHub] [flink] dawidwys closed pull request #17447: [hotfix] Update release notes with changes to StreamStatus
dawidwys closed pull request #17447: URL: https://github.com/apache/flink/pull/17447 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #17525: [FLINK-24597][state] fix the problem that RocksStateKeysAndNamespaceIterator would return duplicate data
flinkbot edited a comment on pull request #17525: URL: https://github.com/apache/flink/pull/17525#issuecomment-947380227 ## CI report: * bf7f144b587313f1ab5cd706c52a25587151b923 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25246) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] flinkbot commented on pull request #17525: [FLINK-24597][state] fix the problem that RocksStateKeysAndNamespaceIterator would return duplicate data
flinkbot commented on pull request #17525: URL: https://github.com/apache/flink/pull/17525#issuecomment-947380401 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit bf7f144b587313f1ab5cd706c52a25587151b923 (Wed Oct 20 06:53:48 UTC 2021) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! * **This pull request references an unassigned [Jira ticket](https://issues.apache.org/jira/browse/FLINK-24597).** According to the [code contribution guide](https://flink.apache.org/contributing/contribute-code.html), tickets need to be assigned before starting with the implementation work. Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items.
For consensus, approval by a Flink committer or PMC member is required. Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier
[GitHub] [flink] flinkbot commented on pull request #17525: [FLINK-24597][state] fix the problem that RocksStateKeysAndNamespaceIterator would return duplicate data
flinkbot commented on pull request #17525: URL: https://github.com/apache/flink/pull/17525#issuecomment-947380227 ## CI report: * bf7f144b587313f1ab5cd706c52a25587151b923 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink-ml] guoweiM commented on a change in pull request #6: [FLINK-2][iteration] Add broadcast output to broadcast events to all the downstream tasks
guoweiM commented on a change in pull request #6: URL: https://github.com/apache/flink-ml/pull/6#discussion_r732456097 ## File path: flink-ml-iteration/src/main/java/org/apache/flink/iteration/broadcast/ChainingBroadcastOutput.java ## @@ -0,0 +1,43 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.iteration.broadcast; + +import org.apache.flink.streaming.api.operators.Output; +import org.apache.flink.streaming.runtime.streamrecord.StreamRecord; +import org.apache.flink.util.OutputTag; + +/** The broadcast output corresponding to a chained output. */ +public class ChainingBroadcastOutput<OUT> implements BroadcastOutput<OUT> { +private final Output<StreamRecord<OUT>> rawOutput; +private final OutputTag<OUT> outputTag; + +ChainingBroadcastOutput(Output<StreamRecord<OUT>> rawOutput, OutputTag<OUT> outputTag) { +this.rawOutput = rawOutput; +this.outputTag = outputTag; +} + +@SuppressWarnings("unchecked") +@Override +public void broadcastEmit(StreamRecord<OUT> record) { Review comment: If I understand correctly, broadcast means that no matter what the tag is, the data should be broadcast downstream. If that is the case, I understand that rawOutput.collect should be fine here?
## File path: flink-ml-iteration/src/main/java/org/apache/flink/iteration/broadcast/BroadcastOutputFactory.java ## @@ -0,0 +1,78 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.iteration.broadcast; + +import org.apache.flink.api.common.typeutils.TypeSerializer; +import org.apache.flink.metrics.Counter; +import org.apache.flink.runtime.io.network.api.writer.RecordWriter; +import org.apache.flink.runtime.plugable.SerializationDelegate; +import org.apache.flink.streaming.api.operators.Output; +import org.apache.flink.streaming.runtime.streamrecord.StreamElement; +import org.apache.flink.streaming.runtime.streamrecord.StreamRecord; +import org.apache.flink.util.OutputTag; + +import java.util.ArrayList; +import java.util.List; + +/** Factor that creates the corresponding {@link BroadcastOutput} from the {@link Output}. */ Review comment: Factory ## File path: flink-ml-iteration/src/main/java/org/apache/flink/iteration/broadcast/ChainingBroadcastOutput.java ## @@ -0,0 +1,43 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. 
+ * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.iteration.broadcast; + +import org.apache.flink.streaming.api.operators.Output; +import org.apache.flink.streaming.runtime.streamrecord.StreamRecord; +import org.apache.flink.util.OutputTag; + +/** The broadcast output corresponding to a chained output. */ Review comment: Here I suggest that you can refer to specific
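The point raised in the review above (for a chained output, broadcasting degenerates to a plain collect, since there is only one local downstream consumer) can be sketched as follows. This is a simplified, self-contained illustration, not the actual flink-ml classes; `BroadcastOutput` and `chaining` here are stand-ins for the real interfaces.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Simplified sketch of a chained broadcast output: every record is
// forwarded to the single chained downstream output, regardless of tag.
public class ChainingSketch {
    interface BroadcastOutput<T> { void broadcastEmit(T record); }

    // With exactly one chained consumer, "broadcast" is just collect.
    static <T> BroadcastOutput<T> chaining(Consumer<T> rawOutput) {
        return rawOutput::accept;
    }

    public static void main(String[] args) {
        List<String> received = new ArrayList<>();
        BroadcastOutput<String> out = chaining(received::add);
        out.broadcastEmit("event-1");
        out.broadcastEmit("event-2");
        System.out.println(received); // [event-1, event-2]
    }
}
```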
[jira] [Updated] (FLINK-24597) RocksdbStateBackend getKeysAndNamespaces would return duplicate data when using MapState
[ https://issues.apache.org/jira/browse/FLINK-24597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-24597: --- Labels: pull-request-available (was: ) > RocksdbStateBackend getKeysAndNamespaces would return duplicate data when > using MapState > - > > Key: FLINK-24597 > URL: https://issues.apache.org/jira/browse/FLINK-24597 > Project: Flink > Issue Type: Bug > Components: API / State Processor, Runtime / State Backends > Affects Versions: 1.14.0, 1.12.4, 1.13.3 > Reporter: Yue Ma > Priority: Major > Labels: pull-request-available > > For example, in RocksdbStateBackend, if we work in VoidNamespace and > use ValueState like below: > {code:java} > // insert record > for (int i = 0; i < 3; ++i) { > keyedStateBackend.setCurrentKey(i); > testValueState.update(String.valueOf(i)); > } > {code} > Then we get all the keysAndNamespaces according to the method > RocksDBKeyedStateBackend#getKeysAndNamespaces(). The result of the traversal is > <1,VoidNamespace>,<2,VoidNamespace>,<3,VoidNamespace>, which is as expected. > However, if we use MapState and update the MapState with different user keys, > getKeysAndNamespaces would return duplicate data with the same > keyAndNamespace: > {code:java} > // insert record > for (int i = 0; i < 3; ++i) { > keyedStateBackend.setCurrentKey(i); > mapState.put("userKeyA_" + i, "userValue"); > mapState.put("userKeyB_" + i, "userValue"); > } > {code} > The result of the traversal is > <1,VoidNamespace>,<1,VoidNamespace>,<2,VoidNamespace>,<2,VoidNamespace>,<3,VoidNamespace>,<3,VoidNamespace>. > By reading the code, I found that the main reason for this problem is in the > implementation of _RocksStateKeysAndNamespaceIterator_: in the _hasNext_ method, > when a new keyAndNamespace is created, there is no comparison with the > previousKeyAndNamespace. Referring to _RocksStateKeysIterator_ and implementing > the same deduplication logic should solve this problem.
[GitHub] [flink] mayuehappy opened a new pull request #17525: [FLINK-24597][state] fix the problem that RocksStateKeysAndNamespaceIterator would return duplicate data
mayuehappy opened a new pull request #17525: URL: https://github.com/apache/flink/pull/17525 …terator would return duplicate data ## What is the purpose of the change *(For example: This pull request makes task deployment go through the blob server, rather than through RPC. That way we avoid re-transferring them on each deployment (during recovery).)* ## Brief change log *(for example:)* - *The TaskInfo is stored in the blob store on job creation time as a persistent artifact* - *Deployments RPC transmits only the blob storage reference* - *TaskManagers retrieve the TaskInfo from the blob cache* ## Verifying this change *(Please pick either of the following options)* This change is a trivial rework / code cleanup without any test coverage. *(or)* This change is already covered by existing tests, such as *(please describe tests)*. *(or)* This change added tests and can be verified as follows: *(example:)* - *Added integration tests for end-to-end deployment with large payloads (100MB)* - *Extended integration test for recovery after master (JobManager) failure* - *Added test that validates that TaskInfo is transferred only once across recoveries* - *Manually verified the change by running a 4 node cluser with 2 JobManagers and 4 TaskManagers, a stateful streaming program, and killing one JobManager and two TaskManagers during the execution, verifying that recovery happens correctly.* ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / no) - The serializers: (yes / no / don't know) - The runtime per-record code paths (performance sensitive): (yes / no / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know) - The S3 file system connector: (yes / no / don't know) ## Documentation - Does this 
pull request introduce a new feature? (yes / no) - If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)
[jira] [Updated] (FLINK-24597) RocksdbStateBackend getKeysAndNamespaces would return duplicate data when using MapState
[ https://issues.apache.org/jira/browse/FLINK-24597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Wysakowicz updated FLINK-24597: - Attachment: (was: image-2021-10-20-14-23-20-333.png) > RocksdbStateBackend getKeysAndNamespaces would return duplicate data when > using MapState > - > > Key: FLINK-24597 > URL: https://issues.apache.org/jira/browse/FLINK-24597 > Project: Flink > Issue Type: Bug > Components: API / State Processor, Runtime / State Backends > Affects Versions: 1.14.0, 1.12.4, 1.13.3 > Reporter: Yue Ma > Priority: Major > > For example, in RocksdbStateBackend, if we work in VoidNamespace and > use ValueState like below: > {code:java} > // insert record > for (int i = 0; i < 3; ++i) { > keyedStateBackend.setCurrentKey(i); > testValueState.update(String.valueOf(i)); > } > {code} > Then we get all the keysAndNamespaces according to the method > RocksDBKeyedStateBackend#getKeysAndNamespaces(). The result of the traversal is > <1,VoidNamespace>,<2,VoidNamespace>,<3,VoidNamespace>, which is as expected. > However, if we use MapState and update the MapState with different user keys, > getKeysAndNamespaces would return duplicate data with the same > keyAndNamespace: > {code:java} > // insert record > for (int i = 0; i < 3; ++i) { > keyedStateBackend.setCurrentKey(i); > mapState.put("userKeyA_" + i, "userValue"); > mapState.put("userKeyB_" + i, "userValue"); > } > {code} > The result of the traversal is > <1,VoidNamespace>,<1,VoidNamespace>,<2,VoidNamespace>,<2,VoidNamespace>,<3,VoidNamespace>,<3,VoidNamespace>. > By reading the code, I found that the main reason for this problem is in the > implementation of _RocksStateKeysAndNamespaceIterator_: in the _hasNext_ method, > when a new keyAndNamespace is created, there is no comparison with the > previousKeyAndNamespace. Referring to _RocksStateKeysIterator_ and implementing > the same deduplication logic should solve this problem.
[GitHub] [flink] RocMarshal commented on a change in pull request #17508: [FLINK-24351][docs] Translate "JSON Function" pages into Chinese
RocMarshal commented on a change in pull request #17508: URL: https://github.com/apache/flink/pull/17508#discussion_r732397717 ## File path: docs/data/sql_functions_zh.yml ## @@ -742,9 +740,9 @@ json: - sql: JSON_EXISTS(jsonValue, path [ { TRUE | FALSE | UNKNOWN | ERROR } ON ERROR ]) table: STRING.jsonExists(STRING path [, JsonExistsOnError onError]) description: | - Determines whether a JSON string satisfies a given path search criterion. + 判定一个 JSON 字符串是否满足给定的路径搜索条件。 Review comment: ```suggestion 判定 JSON 字符串是否满足给定的路径搜索条件。 ``` ## File path: docs/data/sql_functions_zh.yml ## @@ -799,22 +792,20 @@ json: DEFAULT FALSE ON ERROR) -- 0.998D - JSON_VALUE('{"a.b": [0.998,0.996]}','$.["a.b"][0]' + JSON_VALUE('{"a.b": [0.998,0.996]}','$.["a.b"][0]' RETURNING DOUBLE) ``` - sql: JSON_QUERY(jsonValue, path [ { WITHOUT | WITH CONDITIONAL | WITH UNCONDITIONAL } [ ARRAY ] WRAPPER ] [ { NULL | EMPTY ARRAY | EMPTY OBJECT | ERROR } ON EMPTY ] [ { NULL | EMPTY ARRAY | EMPTY OBJECT | ERROR } ON ERROR ]) table: STRING.jsonQuery(path [, JsonQueryWrapper [, JsonQueryOnEmptyOrError, JsonQueryOnEmptyOrError ] ]) description: | - Extracts JSON values from a JSON string. + 从 JSON 字符串中提取 JSON 值。 - The result is always returned as a `STRING`. The `RETURNING` clause is currently not supported. + 结果总是以 'STRING' 的形式返回。'RETURNING' 的子句目前不受支持。 Review comment: ```suggestion 结果总是以 `STRING` 的形式返回。目前尚不支持 `RETURNING` 子句。 ``` ## File path: docs/data/sql_functions_zh.yml ## @@ -707,11 +707,9 @@ json: - sql: IS JSON [ { VALUE | SCALAR | ARRAY | OBJECT } ] table: STRING.isJson([JsonType type]) description: | - Determine whether a given string is valid JSON. + 判定给定字符串是否为有效的 JSON。 - Specifying the optional type argument puts a constraint on which type of JSON object is - allowed. If the string is valid JSON, but not that type, `false` is returned. The default is - `VALUE`. + 指定一个可选类型参数将会限制允许哪种类型的 JSON 对象。如果字符串是有效的 JSON,但不是指定的类型,则返回 'false'。默认值为 'VALUE'。 Review comment: 允许指定一个可选类型的参数限制 JSON 对象是否为指定的可选类型。? 
## File path: docs/data/sql_functions_zh.yml ## @@ -799,22 +792,20 @@ json: DEFAULT FALSE ON ERROR) -- 0.998D - JSON_VALUE('{"a.b": [0.998,0.996]}','$.["a.b"][0]' + JSON_VALUE('{"a.b": [0.998,0.996]}','$.["a.b"][0]' RETURNING DOUBLE) ``` - sql: JSON_QUERY(jsonValue, path [ { WITHOUT | WITH CONDITIONAL | WITH UNCONDITIONAL } [ ARRAY ] WRAPPER ] [ { NULL | EMPTY ARRAY | EMPTY OBJECT | ERROR } ON EMPTY ] [ { NULL | EMPTY ARRAY | EMPTY OBJECT | ERROR } ON ERROR ]) table: STRING.jsonQuery(path [, JsonQueryWrapper [, JsonQueryOnEmptyOrError, JsonQueryOnEmptyOrError ] ]) description: | - Extracts JSON values from a JSON string. + 从 JSON 字符串中提取 JSON 值。 - The result is always returned as a `STRING`. The `RETURNING` clause is currently not supported. + 结果总是以 'STRING' 的形式返回。'RETURNING' 的子句目前不受支持。 - The `wrappingBehavior` determines whether the extracted value should be wrapped into an array, - and whether to do so unconditionally or only if the value itself isn't an array already. + 'wrappingBehavior' 决定是否将提取的值包装到一个数组中,以及是否无条件地这样做,还是只有当值本身不是数组时才这样做。 Review comment: It would be better if you could change the `'` to \` at the corresponding references. ## File path: docs/data/sql_functions_zh.yml ## @@ -742,9 +740,9 @@ json: - sql: JSON_EXISTS(jsonValue, path [ { TRUE | FALSE | UNKNOWN | ERROR } ON ERROR ]) table: STRING.jsonExists(STRING path [, JsonExistsOnError onError]) description: | - Determines whether a JSON string satisfies a given path search criterion. + 判定一个 JSON 字符串是否满足给定的路径搜索条件。 - If the error behavior is omitted, `FALSE ON ERROR` is assumed as the default. + 如果忽略错误行为, 将 'FALSE ON ERROR' 作为默认设定。 Review comment: ```suggestion 如果忽略错误行为, 那么将 `FALSE ON ERROR` 设为默认值。 ```
+ 将一个数值列表构建成一个 JSON 数组字符串。 Review comment: ```suggestion 将数值列表构建成为 JSON 数组字符串。 ``` ## File path: docs/data/sql_functions_zh.yml ## @@ -765,23 +763,18 @@ json: - sql: JSON_VALUE(jsonValue, path [RETURNING ] [ { NULL | ERROR | DEFAULT } ON EMPTY ] [ { NULL | ERROR | DEFAULT } ON ERROR ]) table: STRING.jsonValue(STRING path [, returnType, onEmpty, defaultOnEmpty, onError, defaultOnError]) description: | - Extracts a scalar from a JSON string. + 从 JSON 字符串中提取标量。 - This method searches a JSON
[jira] [Created] (FLINK-24597) RocksdbStateBackend getKeysAndNamespaces would return duplicate data when using MapState
Yue Ma created FLINK-24597: -- Summary: RocksdbStateBackend getKeysAndNamespaces would return duplicate data when using MapState Key: FLINK-24597 URL: https://issues.apache.org/jira/browse/FLINK-24597 Project: Flink Issue Type: Bug Components: API / State Processor, Runtime / State Backends Affects Versions: 1.13.3, 1.12.4, 1.14.0 Reporter: Yue Ma Attachments: image-2021-10-20-14-23-20-333.png For example, in RocksdbStateBackend, if we work in VoidNamespace and use ValueState like below: {code:java} // insert record for (int i = 0; i < 3; ++i) { keyedStateBackend.setCurrentKey(i); testValueState.update(String.valueOf(i)); } {code} Then we get all the keysAndNamespaces according to the method RocksDBKeyedStateBackend#getKeysAndNamespaces(). The result of the traversal is <1,VoidNamespace>,<2,VoidNamespace>,<3,VoidNamespace>, which is as expected. However, if we use MapState and update the MapState with different user keys, getKeysAndNamespaces would return duplicate data with the same keyAndNamespace: {code:java} // insert record for (int i = 0; i < 3; ++i) { keyedStateBackend.setCurrentKey(i); mapState.put("userKeyA_" + i, "userValue"); mapState.put("userKeyB_" + i, "userValue"); } {code} The result of the traversal is <1,VoidNamespace>,<1,VoidNamespace>,<2,VoidNamespace>,<2,VoidNamespace>,<3,VoidNamespace>,<3,VoidNamespace>. By reading the code, I found that the main reason for this problem is in the implementation of _RocksStateKeysAndNamespaceIterator_: in the _hasNext_ method, when a new keyAndNamespace is created, there is no comparison with the previousKeyAndNamespace. Referring to _RocksStateKeysIterator_ and implementing the same deduplication logic should solve this problem.
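The fix suggested in the report (compare each entry against the previously returned one, as RocksStateKeysIterator does) can be sketched as below. This is a hedged simplification that iterates an in-memory sorted sequence instead of RocksDB bytes; the class name and structure are illustrative, not the actual Flink implementation.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Sketch of a key/namespace iterator that remembers the previously emitted
// entry and skips consecutive duplicates (the missing comparison from the
// bug report).
public class DedupIteratorSketch<T> implements Iterator<T> {
    private final Iterator<T> raw;   // entries sorted by key bytes, as in RocksDB
    private T next;
    private T previous;              // last emitted key-and-namespace

    public DedupIteratorSketch(Iterator<T> raw) { this.raw = raw; }

    @Override
    public boolean hasNext() {
        while (next == null && raw.hasNext()) {
            T candidate = raw.next();
            // Only emit an entry if it differs from the previously returned one.
            if (!candidate.equals(previous)) {
                next = candidate;
            }
        }
        return next != null;
    }

    @Override
    public T next() {
        if (!hasNext()) throw new NoSuchElementException();
        previous = next;
        next = null;
        return previous;
    }

    public static void main(String[] args) {
        // MapState with two user keys per primary key yields duplicate entries:
        Iterator<String> dup = Arrays.asList("1", "1", "2", "2", "3", "3").iterator();
        DedupIteratorSketch<String> it = new DedupIteratorSketch<>(dup);
        StringBuilder sb = new StringBuilder();
        while (it.hasNext()) sb.append(it.next()).append(' ');
        System.out.println(sb.toString().trim()); // 1 2 3
    }
}
```

Note this only works because RocksDB returns entries in sorted key order, so duplicates are always adjacent.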
[GitHub] [flink] flinkbot edited a comment on pull request #17522: [FLINK-24462][table] Introduce CastRule interface to reorganize casting code
flinkbot edited a comment on pull request #17522: URL: https://github.com/apache/flink/pull/17522#issuecomment-94673 ## CI report: * def050276334d930dec36d5cffe52a47c1bf039d Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25223) * 01af2774693597d1c4f7c23b9ce911fa800645a6 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25245) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] flinkbot edited a comment on pull request #17522: [FLINK-24462][table] Introduce CastRule interface to reorganize casting code
flinkbot edited a comment on pull request #17522: URL: https://github.com/apache/flink/pull/17522#issuecomment-94673 ## CI report: * def050276334d930dec36d5cffe52a47c1bf039d Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25223) * 01af2774693597d1c4f7c23b9ce911fa800645a6 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
[jira] [Commented] (FLINK-24596) Bugs in sink.buffer-flush before upsert-kafka
[ https://issues.apache.org/jira/browse/FLINK-24596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430999#comment-17430999 ] Fabian Paul commented on FLINK-24596: - I can take it. > Bugs in sink.buffer-flush before upsert-kafka > - > > Key: FLINK-24596 > URL: https://issues.apache.org/jira/browse/FLINK-24596 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka > Affects Versions: 1.14.0, 1.15.0 > Reporter: Jingsong Lee > Assignee: Fabian Paul > Priority: Blocker > > There is no ITCase for sink.buffer-flush before upsert-kafka. We should add > it. > FLINK-23735 brings some bugs: > * SinkBufferFlushMode bufferFlushMode is not Serializable > * Function valueCopyFunction is not Serializable > * The planner does not support DataStreamProvider with the new Sink
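The first two bugs listed are of the same class: Flink ships function/operator objects to task managers via Java serialization, so every field of a function must be Serializable. A minimal sketch of that failure mode (class names here are illustrative stand-ins, not the actual upsert-kafka classes):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializableCheckSketch {
    // Missing "implements Serializable" -- analogous to the reported
    // SinkBufferFlushMode / valueCopyFunction fields.
    static class FlushMode { final int intervalMs = 1000; }

    // The fix: make the field type Serializable.
    static class FixedFlushMode implements Serializable { final int intervalMs = 1000; }

    // Returns true iff the object survives Java serialization.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(o);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize(new FlushMode()));      // false
        System.out.println(canSerialize(new FixedFlushMode())); // true
    }
}
```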
[jira] [Assigned] (FLINK-24596) Bugs in sink.buffer-flush before upsert-kafka
[ https://issues.apache.org/jira/browse/FLINK-24596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Paul reassigned FLINK-24596: --- Assignee: Fabian Paul > Bugs in sink.buffer-flush before upsert-kafka > - > > Key: FLINK-24596 > URL: https://issues.apache.org/jira/browse/FLINK-24596 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka > Affects Versions: 1.14.0, 1.15.0 > Reporter: Jingsong Lee > Assignee: Fabian Paul > Priority: Blocker > > There is no ITCase for sink.buffer-flush before upsert-kafka. We should add > it. > FLINK-23735 brings some bugs: > * SinkBufferFlushMode bufferFlushMode is not Serializable > * Function valueCopyFunction is not Serializable > * The planner does not support DataStreamProvider with the new Sink
[jira] [Resolved] (FLINK-24546) Acknowledged description miss on Monitoring Checkpointing page
[ https://issues.apache.org/jira/browse/FLINK-24546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang resolved FLINK-24546. -- Fix Version/s: 1.15.0 Resolution: Fixed merged in master: 6b405f6318b82d759fbd93f9a6af213cda72374d > Acknowledged description miss on Monitoring Checkpointing page > -- > > Key: FLINK-24546 > URL: https://issues.apache.org/jira/browse/FLINK-24546 > Project: Flink > Issue Type: Bug > Components: Documentation >Reporter: camilesing >Assignee: camilesing >Priority: Minor > Labels: pull-request-available > Fix For: 1.15.0 > > Attachments: image-2021-10-14-19-26-33-289.png > > > !image-2021-10-14-19-26-33-289.png! > Acknowledged description miss on Monitoring Checkpointing page
[jira] [Commented] (FLINK-24546) Acknowledged description miss on Monitoring Checkpointing page
[ https://issues.apache.org/jira/browse/FLINK-24546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430987#comment-17430987 ] Yun Tang commented on FLINK-24546: -- [~Alston Williams], thanks for your enthusiasm! As I wrote in a [previous comment](https://github.com/apache/flink/pull/17482#issuecomment-947302284), I think you could create an issue to help improve the documentation of subtask details in the history tab, which is missing the description of Processed (persisted) data. > Acknowledged description miss on Monitoring Checkpointing page > -- > > Key: FLINK-24546 > URL: https://issues.apache.org/jira/browse/FLINK-24546 > Project: Flink > Issue Type: Bug > Components: Documentation >Reporter: camilesing >Assignee: camilesing >Priority: Minor > Labels: pull-request-available > Attachments: image-2021-10-14-19-26-33-289.png > > > !image-2021-10-14-19-26-33-289.png! > Acknowledged description miss on Monitoring Checkpointing page
[jira] [Assigned] (FLINK-24546) Acknowledged description miss on Monitoring Checkpointing page
[ https://issues.apache.org/jira/browse/FLINK-24546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang reassigned FLINK-24546: Assignee: camilesing > Acknowledged description miss on Monitoring Checkpointing page > -- > > Key: FLINK-24546 > URL: https://issues.apache.org/jira/browse/FLINK-24546 > Project: Flink > Issue Type: Bug > Components: Documentation >Reporter: camilesing >Assignee: camilesing >Priority: Minor > Labels: pull-request-available > Attachments: image-2021-10-14-19-26-33-289.png > > > !image-2021-10-14-19-26-33-289.png! > Acknowledged description miss on Monitoring Checkpointing page
[GitHub] [flink] Myasuka closed pull request #17482: [FLINK-24546][docs]fix miss Acknowledged description on Monitoring Ch…
Myasuka closed pull request #17482: URL: https://github.com/apache/flink/pull/17482
[GitHub] [flink-ml] gaoyunhaii commented on a change in pull request #18: [FLINK-24279] Support withBroadcast in DataStream by caching in static variables
gaoyunhaii commented on a change in pull request #18: URL: https://github.com/apache/flink-ml/pull/18#discussion_r732437805 ## File path: flink-ml-lib/src/main/java/org/apache/flink/ml/common/broadcast/BroadcastUtils.java ## @@ -0,0 +1,142 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.ml.common.broadcast; + +import org.apache.flink.annotation.PublicEvolving; +import org.apache.flink.api.common.typeinfo.TypeInformation; +import org.apache.flink.ml.common.broadcast.operator.BroadcastWrapper; +import org.apache.flink.ml.common.broadcast.operator.CacheStreamOperatorFactory; +import org.apache.flink.ml.iteration.compile.DraftExecutionEnvironment; +import org.apache.flink.streaming.api.datastream.DataStream; +import org.apache.flink.streaming.api.datastream.MultipleConnectedStreams; +import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment; +import org.apache.flink.streaming.api.transformations.MultipleInputTransformation; +import org.apache.flink.util.Preconditions; + +import java.util.ArrayList; +import java.util.Arrays; +import java.util.List; +import java.util.Map; +import java.util.function.Function; + +public class BroadcastUtils { + +private static DataStream cacheBroadcastVariables( +StreamExecutionEnvironment env, +Map> bcStreams, +TypeInformation outType) { +int numBroadcastInput = bcStreams.size(); +String[] broadcastInputNames = bcStreams.keySet().toArray(new String[0]); +DataStream[] broadcastInputs = bcStreams.values().toArray(new DataStream[0]); +TypeInformation[] broadcastInTypes = new TypeInformation[numBroadcastInput]; +for (int i = 0; i < numBroadcastInput; i++) { +broadcastInTypes[i] = broadcastInputs[i].getType(); +} + +MultipleInputTransformation transformation = +new MultipleInputTransformation( +"broadcastInputs", +new CacheStreamOperatorFactory(broadcastInputNames, broadcastInTypes), +outType, +env.getParallelism()); +for (DataStream dataStream : bcStreams.values()) { + transformation.addInput(dataStream.broadcast().getTransformation()); +} +env.addOperator(transformation); +return new MultipleConnectedStreams(env).transform(transformation); +} + +private static String getCoLocationKey(String[] broadcastNames) { +StringBuilder sb = new StringBuilder(); 
+sb.append("Flink-ML-broadcast-co-location"); +for (String name : broadcastNames) { +sb.append(name); +} +return sb.toString(); +} + +private static DataStream buildGraph( +StreamExecutionEnvironment env, +List> inputList, +String[] broadcastStreamNames, +Function>, DataStream> graphBuilder) { +TypeInformation[] inTypes = new TypeInformation[inputList.size()]; +for (int i = 0; i < inputList.size(); i++) { +TypeInformation type = inputList.get(i).getType(); +inTypes[i] = type; +} +// blocking all non-broadcast input edges by default. +boolean[] isBlocking = new boolean[inTypes.length]; +Arrays.fill(isBlocking, true); +DraftExecutionEnvironment draftEnv = +new DraftExecutionEnvironment( +env, new BroadcastWrapper<>(broadcastStreamNames, inTypes, isBlocking)); + +List> draftSources = new ArrayList<>(); +for (int i = 0; i < inputList.size(); i++) { +draftSources.add(draftEnv.addDraftSource(inputList.get(i), inputList.get(i).getType())); +} +DataStream draftOutStream = graphBuilder.apply(draftSources); + Review comment: Users may have a.map(xx).map(xx), then we would have two operators.
[GitHub] [flink-ml] gaoyunhaii commented on a change in pull request #18: [FLINK-24279] Support withBroadcast in DataStream by caching in static variables
gaoyunhaii commented on a change in pull request #18: URL: https://github.com/apache/flink-ml/pull/18#discussion_r732431833 ## File path: flink-ml-lib/src/main/java/org/apache/flink/ml/common/broadcast/operator/AbstractBroadcastWrapperOperator.java ## @@ -96,31 +97,36 @@ protected final StreamOperatorFactory operatorFactory; -/** Metric group for the operator. */ protected final OperatorMetricGroup metrics; protected final S wrappedOperator; -/** variables for withBroadcast operators. */ -protected final MailboxExecutor mailboxExecutor; - -protected final String[] broadcastStreamNames; +protected transient StreamOperatorStateHandler stateHandler; -protected final boolean[] isBlocking; +protected transient InternalTimeServiceManager timeServiceManager; +protected final MailboxExecutor mailboxExecutor; +/** variables specific for withBroadcast functionality. */ Review comment: In general one empty line before each instance variable ## File path: flink-ml-lib/src/main/java/org/apache/flink/ml/common/broadcast/operator/AbstractBroadcastWrapperOperator.java ## @@ -196,17 +203,25 @@ public AbstractBroadcastWrapperOperator( } /** - * check whether all of broadcast variables are ready. + * checks whether all of broadcast variables are ready. Besides it maintains a state + * {broadcastVariablesReady} to avoiding invoking {@code BroadcastContext.isCacheFinished(...)} + * repeatedly. Finally, it sets broadcast variables for ${@link HasBroadcastVariable} if the + * broadcast variables are ready. * - * @return + * @return true if all broadcast variables are ready, false otherwise. 
*/ protected boolean areBroadcastVariablesReady() { if (broadcastVariablesReady) { return true; } for (String name : broadcastStreamNames) { -if (!BroadcastContext.isCacheFinished(Tuple2.of(name, indexOfSubtask))) { +if (!BroadcastContext.isCacheFinished(name + "-" + indexOfSubtask)) { return false; +} else if (wrappedOperator instanceof HasBroadcastVariable) { +String key = name + "-" + indexOfSubtask; +String userKey = name.substring(name.indexOf('-') + 1); +((HasBroadcastVariable) wrappedOperator) Review comment: Use `OperatorUtils#processOperatorOrUdfIfSatisfy` instead since we may need to handle both of operators and UDF if either of them implements the interface. ## File path: flink-ml-lib/src/main/java/org/apache/flink/ml/common/broadcast/BroadcastUtils.java ## @@ -22,121 +22,168 @@ import org.apache.flink.api.common.typeinfo.TypeInformation; import org.apache.flink.ml.common.broadcast.operator.BroadcastWrapper; import org.apache.flink.ml.common.broadcast.operator.CacheStreamOperatorFactory; +import org.apache.flink.ml.common.broadcast.operator.HasBroadcastVariable; import org.apache.flink.ml.iteration.compile.DraftExecutionEnvironment; import org.apache.flink.streaming.api.datastream.DataStream; import org.apache.flink.streaming.api.datastream.MultipleConnectedStreams; import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment; +import org.apache.flink.streaming.api.operators.ChainingStrategy; import org.apache.flink.streaming.api.transformations.MultipleInputTransformation; +import org.apache.flink.streaming.api.transformations.PhysicalTransformation; +import org.apache.flink.util.AbstractID; import org.apache.flink.util.Preconditions; import java.util.ArrayList; -import java.util.Arrays; import java.util.List; import java.util.Map; +import java.util.UUID; import java.util.function.Function; +/** Utility class to support withBroadcast in DataStream. */ public class BroadcastUtils { +/** + * supports withBroadcastStream in DataStream API. 
Broadcast data streams are available at all + * parallel instances of an operator that implements ${@link HasBroadcastVariable}. An operator + * that wants to access broadcast variables must implement ${@link HasBroadcastVariable}. + * + * In detail, the broadcast input data streams will be consumed first and further set by + * {@code HasBroadcastVariable.setBroadcastVariable(...)}. For now the non-broadcast input are + * cached by default to avoid the possible deadlocks. + * + * @param inputList non-broadcast input list. + * @param bcStreams map of the broadcast data streams, where the key is the name and the value + * is the corresponding data stream. + * @param userDefinedFunction the user defined logic in which users can access the broadcast + * data streams and produce the output data stream. Note that users can add only one + * operator in this function, otherwise it raises an exception. + * @return the output data stream. + */ +@
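The per-subtask key handling touched on in the review above (cache keys of the form name + "-" + subtaskIndex, with the user-visible variable name recovered by stripping the prefix before the first '-') can be sketched in plain Java. The class and example names below are hypothetical; only the string scheme mirrors the quoted diff:

```java
/** Hypothetical sketch of the broadcast cache-key scheme from the reviewed diff. */
public class BroadcastKeys {

    /** Builds the per-subtask cache key, e.g. ("bcId-modelData", 3) -> "bcId-modelData-3". */
    public static String cacheKey(String broadcastName, int subtaskIndex) {
        return broadcastName + "-" + subtaskIndex;
    }

    /** Recovers the user-visible variable name by stripping everything up to the first '-'. */
    public static String userKey(String broadcastName) {
        return broadcastName.substring(broadcastName.indexOf('-') + 1);
    }

    public static void main(String[] args) {
        System.out.println(cacheKey("bcId-modelData", 3)); // bcId-modelData-3
        System.out.println(userKey("bcId-modelData"));     // modelData
    }
}
```

Note that a user-chosen broadcast name containing '-' would be truncated by the substring logic, which is one reason the reviewers discuss the key construction explicitly.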
[GitHub] [flink] flinkbot edited a comment on pull request #17519: [FLINK-24368][testutils] Print logs JobManager and TaskManager on STDOUT in Flink container for redirecting to log4j in tests
flinkbot edited a comment on pull request #17519: URL: https://github.com/apache/flink/pull/17519#issuecomment-946579332 ## CI report: * 3b693fd284dec0633e572d6e0428bc1153bc0ed9 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25211)
[GitHub] [flink] flinkbot edited a comment on pull request #17521: [FLINK-24558][API/DataStream]make parent ClassLoader variable which c…
flinkbot edited a comment on pull request #17521: URL: https://github.com/apache/flink/pull/17521#issuecomment-946651087 ## CI report: * 3b8fa52e5763fb02a0b786d938f1e478f5a7155e Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25241)
[GitHub] [flink] flinkbot edited a comment on pull request #16606: [FLINK-21357][runtime/statebackend]Periodic materialization for generalized incremental checkpoints
flinkbot edited a comment on pull request #16606: URL: https://github.com/apache/flink/pull/16606#issuecomment-887431748 ## CI report: * 594c229d544573b2697781bc2ef1e53baaccba5a Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25214) * a320ebdce38a0dbc34710e1670e64d9d17aa6e33 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25243)
[GitHub] [flink] flinkbot edited a comment on pull request #16606: [FLINK-21357][runtime/statebackend]Periodic materialization for generalized incremental checkpoints
flinkbot edited a comment on pull request #16606: URL: https://github.com/apache/flink/pull/16606#issuecomment-887431748 ## CI report: * 594c229d544573b2697781bc2ef1e53baaccba5a Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25214) * a320ebdce38a0dbc34710e1670e64d9d17aa6e33 UNKNOWN
[jira] [Commented] (FLINK-23169) Support user-level app staging directory when yarn.staging-directory is specified
[ https://issues.apache.org/jira/browse/FLINK-23169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430878#comment-17430878 ] Xintong Song commented on FLINK-23169: -- hi [~hackergin], I'm trying to understand: # Why is it a problem for you to have staging files of all users in the same directory? # Can this be worked around by configuring different staging directories for the users? > Support user-level app staging directory when yarn.staging-directory is > specified > - > > Key: FLINK-23169 > URL: https://issues.apache.org/jira/browse/FLINK-23169 > Project: Flink > Issue Type: Improvement > Components: Deployment / YARN >Reporter: jinfeng >Priority: Major > > When yarn.staging-directory is specified, different users will use the same > directory as the staging directory. This may not be friendly for a job platform > that submits jobs for different users. I propose to use a user-level directory > by default when yarn.staging-directory is specified. We only need to make > small changes to the getStagingDir function in > YarnClusterDescriptor
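The proposal in this ticket amounts to resolving a per-user subdirectory under the shared base directory. A minimal sketch of the idea in plain Java, assuming the user name is available to the caller; the class and method names here are illustrative stand-ins, not the actual YarnClusterDescriptor code:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Illustrative sketch: derive a user-level staging dir from a shared base dir. */
public class StagingDirs {

    /** Returns base/userName, so different users no longer share one staging directory. */
    public static Path userStagingDir(String configuredStagingDir, String userName) {
        return Paths.get(configuredStagingDir, userName);
    }

    public static void main(String[] args) {
        // With yarn.staging-directory = /flink/staging, user "alice" would stage
        // under /flink/staging/alice on a POSIX file system.
        System.out.println(userStagingDir("/flink/staging", "alice"));
    }
}
```

In the real implementation the user name would come from the Hadoop security context rather than a method parameter, which is exactly the detail the discussion above is probing.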
[GitHub] [flink] Myasuka commented on pull request #17482: [FLINK-24546][docs]fix miss Acknowledged description on Monitoring Ch…
Myasuka commented on pull request #17482: URL: https://github.com/apache/flink/pull/17482#issuecomment-947302284 > I have added the description, could you please review it? @wuchong The pull request is [this](https://github.com/apache/flink/pull/17495). @AlstonWilliams, since camilesing created the issue first, he has priority to resolve it, as long as it is not too difficult and he is willing to. If you don't mind, I will merge this PR to resolve the issue. BTW, I can see you are enthusiastic about contributing to the Flink community. Would you please help improve the doc of [subtask details in history tab](https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/ops/monitoring/checkpoint_monitoring/#history-tab), which is missing the description of **Processed (persisted) data**? If you agree, please create a related issue, and I will assign it to you.
[GitHub] [flink] JingsongLi commented on a change in pull request #17520: [FLINK-24565][avro] Port avro file format factory to BulkReaderFormatFactory
JingsongLi commented on a change in pull request #17520: URL: https://github.com/apache/flink/pull/17520#discussion_r732391776 ## File path: flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/AbstractAvroBulkFormat.java ## @@ -0,0 +1,190 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.formats.avro; + +import org.apache.flink.configuration.Configuration; +import org.apache.flink.connector.file.src.FileSourceSplit; +import org.apache.flink.connector.file.src.reader.BulkFormat; +import org.apache.flink.connector.file.src.util.CheckpointedPosition; +import org.apache.flink.connector.file.src.util.IteratorResultIterator; +import org.apache.flink.core.fs.FileSystem; +import org.apache.flink.core.fs.Path; +import org.apache.flink.formats.avro.utils.FSDataInputStreamWrapper; + +import org.apache.avro.file.DataFileReader; +import org.apache.avro.file.SeekableInput; +import org.apache.avro.generic.GenericDatumReader; +import org.apache.avro.io.DatumReader; + +import javax.annotation.Nullable; + +import java.io.IOException; +import java.util.Iterator; +import java.util.concurrent.BlockingQueue; +import java.util.concurrent.LinkedBlockingQueue; + +/** Provides a {@link BulkFormat} for Avro records. */ +public abstract class AbstractAvroBulkFormat +implements BulkFormat { + +private static final long serialVersionUID = 1L; + +@Override +public AvroReader createReader(Configuration config, SplitT split) throws IOException { +open(split); +return createReader(split); +} + +@Override +public AvroReader restoreReader(Configuration config, SplitT split) throws IOException { +open(split); +return createReader(split); +} + +@Override +public boolean isSplittable() { +return true; +} + +private AvroReader createReader(SplitT split) throws IOException { +long end = split.offset() + split.length(); +if (split.getReaderPosition().isPresent()) { +CheckpointedPosition position = split.getReaderPosition().get(); +return new AvroReader( +split.path(), +split.offset(), +end, +position.getOffset(), +position.getRecordsAfterOffset()); +} else { +return new AvroReader(split.path(), split.offset(), end, -1, 0); +} +} + +protected void open(SplitT split) {} + +abstract T convert(A record); Review comment: protected ## File path: 
flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/AbstractAvroBulkFormat.java ## @@ -0,0 +1,190 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.formats.avro; + +import org.apache.flink.configuration.Configuration; +import org.apache.flink.connector.file.src.FileSourceSplit; +import org.apache.flink.connector.file.src.reader.BulkFormat; +import org.apache.flink.connector.file.src.util.CheckpointedPosition; +import org.apache.flink.connector.file.src.util.IteratorResultIterator; +import org.apache.flink.core.fs.FileSystem; +import org.apache.flink.core.fs.Path; +import org.apache.flink.formats.avro.utils.FSDataInputStreamWrapper; + +import org.apache.avro.file.DataFileReader; +import org.apache.avro.file.SeekableInput; +import org.apache.avro.generic.GenericDatumReader; +import org.ap
[GitHub] [flink] JingsongLi commented on pull request #17520: [FLINK-24565][avro] Port avro file format factory to BulkReaderFormatFactory
JingsongLi commented on pull request #17520: URL: https://github.com/apache/flink/pull/17520#issuecomment-947295434 Hi @slinkydeveloper @tsreaper , I think Avro and StreamFormat are in conflict conceptually. In the comments of StreamFormat: > Compared to the {@link BulkFormat}, the stream format handles a few things out-of-the-box, like deciding how to batch records or dealing with compression. But Avro has its own batching and compression logic. This conflict does lead to poor results in some places.
[GitHub] [flink] tsreaper edited a comment on pull request #17520: [FLINK-24565][avro] Port avro file format factory to BulkReaderFormatFactory
tsreaper edited a comment on pull request #17520: URL: https://github.com/apache/flink/pull/17520#issuecomment-947280429 @slinkydeveloper There are four reasons why I did not choose `StreamFormat`. 1. The biggest concern is that `StreamFormatAdapter.Reader#readBatch` stores all results in a batch in heap memory. This is bad because avro is a format which supports compression. You'll never know how much data will be stuffed into heap memory after inflation. 2. `StreamFormat`, from its concept, is for a stream of bytes where each record is shipped independently. Avro is a file format which organizes the records in its own blocks, so they do not match conceptually. I would say csv format will be more suitable for `StreamFormat`. 3. `StreamFormatAdapter` cuts batches by counting the number of bytes read from the file stream. If the sync size of avro is 2MB it will read 2M bytes from the file in one go and produce a batch containing no records. However this only happens at the beginning of reading a file so this might be OK. 4. Both orc and parquet formats have implemented `BulkFormat` instead of `StreamFormat`, so why not `StreamFormat` for them?
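The byte-counting behavior described above (a batch is cut once a byte budget is consumed, regardless of how many records were decoded, so a large leading block can yield an empty batch) can be modeled with a toy sketch. This is not Flink's actual StreamFormatAdapter code, just an illustration of the failure mode under that assumption:

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of byte-budget batching (illustrative, not the real StreamFormatAdapter). */
public class ByteBudgetBatcher {

    /**
     * Returns the indices of records in the first batch. recordSizes[i] is the number
     * of stream bytes consumed to decode record i; headerBytes models bytes consumed
     * before the first record (e.g. a large avro sync block read in one go).
     */
    public static List<Integer> firstBatch(int headerBytes, int[] recordSizes, int batchBytes) {
        List<Integer> batch = new ArrayList<>();
        long consumed = headerBytes;
        if (consumed >= batchBytes) {
            return batch; // budget exhausted before any record decoded: an empty batch
        }
        for (int i = 0; i < recordSizes.length; i++) {
            batch.add(i);
            consumed += recordSizes[i];
            if (consumed >= batchBytes) {
                break; // cut the batch purely on bytes consumed
            }
        }
        return batch;
    }

    public static void main(String[] args) {
        // A 2 MB header against a 1 MiB budget yields an empty first batch.
        System.out.println(firstBatch(2_000_000, new int[] {100, 100}, 1_048_576)); // []
    }
}
```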
[GitHub] [flink] tsreaper edited a comment on pull request #17520: [FLINK-24565][avro] Port avro file format factory to BulkReaderFormatFactory
tsreaper edited a comment on pull request #17520: URL: https://github.com/apache/flink/pull/17520#issuecomment-947280429 @slinkydeveloper There are four reasons why I did not choose `StreamFormat`. 1. The biggest concern is that `StreamFormatAdapter.Reader#readBatch` stores all results in a batch in heap memory. This is bad because avro is a format which supports compression. You'll never know how much data will be stuffed into heap memory after inflation. 2. `StreamFormat`, from its concept, is for a stream of bytes where each record is shipped independently. Avro is a file format which organizes the records in its own blocks, so they do not match conceptually. I would say csv format will be more suitable for `StreamFormat`. 3. `StreamFormatAdapter` cuts batches by counting the number of bytes read from the file stream. If the sync size of avro is 2MB it will read 2M bytes from the file in one go and produce a batch containing no records. However this only happens at the beginning of reading a file so this might be OK. 4. Both orc and parquet formats have implemented `BulkFormat` instead of `StreamFormat`, so why not `StreamFormat` for them?
[GitHub] [flink] tsreaper commented on a change in pull request #17520: [FLINK-24565][avro] Port avro file format factory to BulkReaderFormatFactory
tsreaper commented on a change in pull request #17520: URL: https://github.com/apache/flink/pull/17520#discussion_r732386149 ## File path: flink-formats/flink-avro/pom.xml ## @@ -47,13 +47,11 @@ under the License. - org.apache.flink - flink-table-common + flink-table-runtime_${scala.binary.version} Review comment: For `FileSystemConnectorOptions.PARTITION_DEFAULT_NAME` in `AvroFileFormatFactory`.
[jira] [Updated] (FLINK-24596) Bugs in sink.buffer-flush before upsert-kafka
[ https://issues.apache.org/jira/browse/FLINK-24596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-24596: - Affects Version/s: 1.14.0 > Bugs in sink.buffer-flush before upsert-kafka > - > > Key: FLINK-24596 > URL: https://issues.apache.org/jira/browse/FLINK-24596 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: 1.14.0, 1.15.0 >Reporter: Jingsong Lee >Priority: Blocker > Fix For: 1.15.0 > > > There is no ITCase for sink.buffer-flush before upsert-kafka. We should add > it. > FLINK-23735 brings some bugs: > * SinkBufferFlushMode bufferFlushMode not Serializable > * Function valueCopyFunction not Serializable > * Planner does not support DataStreamProvider with new Sink
[jira] [Updated] (FLINK-24596) Bugs in sink.buffer-flush before upsert-kafka
[ https://issues.apache.org/jira/browse/FLINK-24596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-24596: - Fix Version/s: (was: 1.15.0) > Bugs in sink.buffer-flush before upsert-kafka > - > > Key: FLINK-24596 > URL: https://issues.apache.org/jira/browse/FLINK-24596 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: 1.14.0, 1.15.0 >Reporter: Jingsong Lee >Priority: Blocker > > There is no ITCase for sink.buffer-flush before upsert-kafka. We should add > it. > FLINK-23735 brings some bugs: > * SinkBufferFlushMode bufferFlushMode not Serializable > * Function valueCopyFunction not Serializable > * Planner does not support DataStreamProvider with new Sink
[GitHub] [flink] tsreaper commented on a change in pull request #17520: [FLINK-24565][avro] Port avro file format factory to BulkReaderFormatFactory
tsreaper commented on a change in pull request #17520: URL: https://github.com/apache/flink/pull/17520#discussion_r732384420 ## File path: flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/AvroFileFormatFactory.java ## @@ -20,39 +20,135 @@ import org.apache.flink.annotation.Internal; import org.apache.flink.api.common.serialization.BulkWriter; +import org.apache.flink.api.common.typeinfo.TypeInformation; import org.apache.flink.configuration.ConfigOption; import org.apache.flink.configuration.ReadableConfig; +import org.apache.flink.connector.file.src.FileSourceSplit; +import org.apache.flink.connector.file.src.reader.BulkFormat; import org.apache.flink.core.fs.FSDataOutputStream; import org.apache.flink.formats.avro.typeutils.AvroSchemaConverter; import org.apache.flink.table.connector.ChangelogMode; +import org.apache.flink.table.connector.format.BulkDecodingFormat; import org.apache.flink.table.connector.format.EncodingFormat; import org.apache.flink.table.connector.sink.DynamicTableSink; +import org.apache.flink.table.connector.source.DynamicTableSource; +import org.apache.flink.table.data.GenericRowData; import org.apache.flink.table.data.RowData; +import org.apache.flink.table.factories.BulkReaderFormatFactory; import org.apache.flink.table.factories.BulkWriterFormatFactory; import org.apache.flink.table.factories.DynamicTableFactory; +import org.apache.flink.table.filesystem.FileSystemConnectorOptions; import org.apache.flink.table.types.DataType; +import org.apache.flink.table.types.logical.LogicalType; import org.apache.flink.table.types.logical.RowType; +import org.apache.flink.table.utils.PartitionPathUtils; import org.apache.avro.Schema; import org.apache.avro.file.CodecFactory; import org.apache.avro.file.DataFileWriter; +import org.apache.avro.generic.GenericData; import org.apache.avro.generic.GenericDatumWriter; import org.apache.avro.generic.GenericRecord; import org.apache.avro.io.DatumWriter; import java.io.IOException; import 
java.io.OutputStream; import java.util.HashSet; +import java.util.List; import java.util.Set; +import java.util.stream.Collectors; import static org.apache.flink.formats.avro.AvroFormatOptions.AVRO_OUTPUT_CODEC; /** Avro format factory for file system. */ @Internal -public class AvroFileFormatFactory implements BulkWriterFormatFactory { +public class AvroFileFormatFactory implements BulkReaderFormatFactory, BulkWriterFormatFactory { public static final String IDENTIFIER = "avro"; +@Override +public BulkDecodingFormat createDecodingFormat( +DynamicTableFactory.Context context, ReadableConfig formatOptions) { +return new BulkDecodingFormat() { +@Override +public BulkFormat createRuntimeDecoder( +DynamicTableSource.Context sourceContext, DataType producedDataType) { +RowType physicalRowype = +(RowType) +context.getCatalogTable() +.getResolvedSchema() +.toPhysicalRowDataType() +.getLogicalType(); Review comment: I don't quite get it. `producedDataType` is the data type produced by this source operator after projection push down. It might be different from what is stored in the avro file. For example, if this source is partitioned then partition keys are stored in file path, not in avro file. Also the order of fields in `producedDataType` might be different from `physicalRowType`, and I need to map the fields in `AvroGenericRecordBulkFormat`.
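The field mapping tsreaper describes (after projection pushdown the produced row type may be a reordered subset of the physical row type, so each produced field must be looked up by name in the physical schema) can be illustrated with a small, self-contained sketch. The class and method names here are made up for illustration and are not the actual connector code:

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch: map produced (projected/reordered) fields back to physical positions. */
public class FieldMapping {

    /** For each produced field name, returns its index in the physical row type. */
    public static int[] producedToPhysical(List<String> physicalFields, List<String> producedFields) {
        return producedFields.stream().mapToInt(physicalFields::indexOf).toArray();
    }

    public static void main(String[] args) {
        // Physical schema stored in the file: [a, b, c]; the source produces [c, a]
        // after projection pushdown, so record fields are read at indices 2 and 0.
        System.out.println(Arrays.toString(
                producedToPhysical(List.of("a", "b", "c"), List.of("c", "a")))); // [2, 0]
    }
}
```

Partition keys would return -1 here, since they live in the file path rather than the physical file schema, which is the other mismatch noted in the comment.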
[jira] [Updated] (FLINK-23849) Support react to the node decommissioning change state on yarn and do graceful restart
[ https://issues.apache.org/jira/browse/FLINK-23849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-23849: - Affects Version/s: (was: 1.13.2) (was: 1.13.1) (was: 1.12.2) > Support react to the node decommissioning change state on yarn and do > graceful restart > -- > > Key: FLINK-23849 > URL: https://issues.apache.org/jira/browse/FLINK-23849 > Project: Flink > Issue Type: New Feature > Components: Deployment / YARN >Reporter: zlzhang0122 >Priority: Major > Fix For: 1.15.0 > > > Now we are not interested in node updates in > YarnContainerEventHandler.onNodesUpdated, but sometimes we want to evict the > running Flink process from one node and gracefully restart it on another node > for some unexpected reason, such as the physical machine needing to be recycled > or the cloud computing cluster needing to be migrated. Thus, we can react to > the node decommissioning state change, call the stopWithSavepoint function, > and then restart the job.
[jira] [Commented] (FLINK-24596) Bugs in sink.buffer-flush before upsert-kafka
[ https://issues.apache.org/jira/browse/FLINK-24596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430850#comment-17430850 ] Jingsong Lee commented on FLINK-24596: -- CC: [~arvid] [~fabian.paul] > Bugs in sink.buffer-flush before upsert-kafka > - > > Key: FLINK-24596 > URL: https://issues.apache.org/jira/browse/FLINK-24596 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: 1.15.0 >Reporter: Jingsong Lee >Priority: Blocker > Fix For: 1.15.0 > > > There is no ITCase for sink.buffer-flush before upsert-kafka. We should add > it. > FLINK-23735 brings some bugs: > * SinkBufferFlushMode bufferFlushMode not Serializable > * Function valueCopyFunction not Serializable > * Planner does not support DataStreamProvider with new Sink
[GitHub] [flink] tsreaper commented on pull request #17520: [FLINK-24565][avro] Port avro file format factory to BulkReaderFormatFactory
tsreaper commented on pull request #17520: URL: https://github.com/apache/flink/pull/17520#issuecomment-947280429 @slinkydeveloper There are three reasons why I did not choose `StreamFormat`. 1. The biggest concern is that `StreamFormatAdapter.Reader#readBatch` stores all results of a batch in heap memory. This is risky because Avro is a format that supports compression, so you never know how much data will end up in heap memory after inflation. 2. `StreamFormatAdapter` cuts batches by counting the number of bytes read from the file stream. If the Avro sync size is 2 MB, it will read 2 MB from the file in one go and may produce a batch containing no records. However, this only happens at the beginning of reading a file, so it might be OK. 3. Both the ORC and Parquet formats implement `BulkFormat` instead of `StreamFormat`; if `StreamFormat` were sufficient, why was it not used for them as well? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
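To make the first concern concrete, here is a small hypothetical sketch (plain Java, not Flink code; the method name and the inflation model are invented for illustration): cutting a batch by the number of compressed bytes read says nothing about how much decompressed data the batch pins in heap memory.

```java
// Hypothetical sketch, not Flink code: batches cut by a compressed-byte
// budget can still occupy far more heap once the records are decompressed.
public class CompressedBatchSketch {

    /**
     * Simulates reading one batch: stop once the compressed-byte budget is
     * reached, and report how many bytes the decompressed records occupy.
     */
    static long heapBytesForOneBatch(long compressedBudget, long recordCompressedSize, long inflationRatio) {
        long compressedRead = 0;
        long heapBytes = 0;
        while (compressedRead < compressedBudget) {
            compressedRead += recordCompressedSize;
            heapBytes += recordCompressedSize * inflationRatio; // decompressed record buffered on the heap
        }
        return heapBytes;
    }

    public static void main(String[] args) {
        // A 1 MiB compressed budget with a 100x compression ratio
        // buffers roughly 100 MiB of decompressed records on the heap.
        System.out.println(heapBytesForOneBatch(1L << 20, 1L << 10, 100));
    }
}
```

With a `BulkFormat`, by contrast, the reader itself decides how records are batched and can bound the decompressed footprint directly.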
[jira] [Created] (FLINK-24596) Bugs in sink.buffer-flush before upsert-kafka
Jingsong Lee created FLINK-24596: Summary: Bugs in sink.buffer-flush before upsert-kafka Key: FLINK-24596 URL: https://issues.apache.org/jira/browse/FLINK-24596 Project: Flink Issue Type: Bug Components: Connectors / Kafka Affects Versions: 1.15.0 Reporter: Jingsong Lee Fix For: 1.15.0 There is no ITCase for sink.buffer-flush before upsert-kafka. We should add it. FLINK-23735 introduced some bugs: * SinkBufferFlushMode bufferFlushMode not Serializable * Function valueCopyFunction not Serializable * Planner does not support DataStreamProvider with new Sink -- This message was sent by Atlassian Jira (v8.3.4#803005)
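The first two bugs listed above are instances of a common Flink pitfall: any object held as a field of an operator or sink function must survive Java serialization at job-submission time. A minimal hypothetical sketch (the class names are stand-ins, not the actual FLINK-23735 code):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializabilitySketch {

    // Hypothetical stand-in for the buggy case: a flush mode carried as a
    // field of a sink function but not implementing Serializable.
    static class NonSerializableFlushMode {}

    // The fix: make the carried configuration Serializable.
    static class SerializableFlushMode implements Serializable {
        final int batchSize = 100;
        final long flushIntervalMs = 1000L;
    }

    /** Returns true if the object survives Java serialization. */
    static boolean serializes(Object o) {
        try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(o); // throws NotSerializableException for the buggy case
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(serializes(new NonSerializableFlushMode())); // false
        System.out.println(serializes(new SerializableFlushMode()));    // true
    }
}
```

An ITCase that actually submits the job, as the ticket proposes, would catch such fields at serialization time rather than in production.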
[jira] [Updated] (FLINK-24122) Add support to do clean in history server
[ https://issues.apache.org/jira/browse/FLINK-24122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-24122: - Affects Version/s: (was: 1.13.2) (was: 1.12.3) > Add support to do clean in history server > - > > Key: FLINK-24122 > URL: https://issues.apache.org/jira/browse/FLINK-24122 > Project: Flink > Issue Type: Improvement > Components: Runtime / REST > Reporter: zlzhang0122 > Priority: Major > Labels: pull-request-available > Fix For: 1.14.1 > > > Currently, the history server can clean history jobs in two ways: > # if users have configured {code:java} historyserver.archive.clean-expired-jobs: true{code} then it compares the files in HDFS across two clean intervals, finds the deleted ones, and cleans the local cache files. > # if users have configured {code:java} historyserver.archive.retained-jobs:{code} to a positive number, then it cleans the oldest files in HDFS and locally. > But the retained-jobs number is difficult to determine. For example, users may want to check yesterday's history jobs, but if many jobs fail today and push the total over the retained-jobs number, yesterday's history jobs will be deleted. So what if we added a configuration with a retained-time that indicates the maximum time a history job is retained? > Also, the history server can't clean job history files that are no longer in HDFS but are still cached in the local filesystem; these files are kept forever and can't be cleaned unless users do it manually. Maybe we can add an option and perform this cleanup when the option is set to true. -- This message was sent by Atlassian Jira (v8.3.4#803005)
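The retained-time idea proposed in the ticket, a maximum age instead of (or in addition to) a retained-jobs count, can be sketched as a simple age check. This is an illustrative sketch, not the HistoryServer implementation; the method name and configuration shape are invented:

```java
import java.time.Duration;
import java.time.Instant;

public class RetentionSketch {

    /** Keep an archive only while its age is within the configured maximum. */
    static boolean shouldRetain(Instant archivedAt, Instant now, Duration maxAge) {
        return Duration.between(archivedAt, now).compareTo(maxAge) <= 0;
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2021-10-20T00:00:00Z");
        Duration maxAge = Duration.ofDays(7);
        // A job archived yesterday survives a 7-day retention window...
        System.out.println(shouldRetain(Instant.parse("2021-10-19T00:00:00Z"), now, maxAge)); // true
        // ...while one archived a month ago is cleaned regardless of how few jobs exist.
        System.out.println(shouldRetain(Instant.parse("2021-09-20T00:00:00Z"), now, maxAge)); // false
    }
}
```

Unlike a count-based limit, this keeps yesterday's jobs visible even when many jobs fail today.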
[jira] [Updated] (FLINK-24122) Add support to do clean in history server
[ https://issues.apache.org/jira/browse/FLINK-24122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-24122: - Priority: Major (was: Minor) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-24122) Add support to do clean in history server
[ https://issues.apache.org/jira/browse/FLINK-24122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-24122: - Issue Type: Improvement (was: Bug) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-24122) Add support to do clean in history server
[ https://issues.apache.org/jira/browse/FLINK-24122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-24122: - Fix Version/s: (was: 1.14.1) 1.15.0 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (FLINK-24562) YarnResourceManagerDriverTest should not use ContainerStatusPBImpl.newInstance
[ https://issues.apache.org/jira/browse/FLINK-24562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song closed FLINK-24562. Fix Version/s: 1.15.0 Resolution: Fixed Fixed in - master (1.15): a08e7c6577557dce874b30df9c8b3692d9575282 > YarnResourceManagerDriverTest should not use ContainerStatusPBImpl.newInstance > -- > > Key: FLINK-24562 > URL: https://issues.apache.org/jira/browse/FLINK-24562 > Project: Flink > Issue Type: Bug > Components: Deployment / YARN >Affects Versions: 1.14.0 >Reporter: Feifan Wang >Assignee: Feifan Wang >Priority: Major > Labels: pull-request-available > Fix For: 1.15.0 > > > In YarnResourceManagerDriverTest, we create ContainerStatus with the static > method ContainerStatusPBImpl{{.newInstance}}, which is annotated as private > and unstable. > Although this method is still available in the latest version of yarn, some > third-party versions of yarn may modify it. In fact, this method was modified > in the internal version provided by our yarn team, which caused flink-1.14.0 > to fail to compile. > Moreover, there is already an org.apache.flink.yarn.TestingContainerStatus, I > think we should use it directly. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
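The fix merged above replaces calls to a dependency-internal factory with a project-owned test helper. A hypothetical sketch of the idea (the nested class here is modeled loosely on the role of org.apache.flink.yarn.TestingContainerStatus, not copied from it):

```java
public class TestDoubleSketch {

    // Project-owned test double: the test controls this class, so internal
    // changes in the dependency (e.g. a private/unstable factory method being
    // altered in a vendor's fork of YARN) cannot break test compilation.
    static final class TestingContainerStatus {
        private final int exitStatus;

        TestingContainerStatus(int exitStatus) {
            this.exitStatus = exitStatus;
        }

        int getExitStatus() {
            return exitStatus;
        }
    }

    public static void main(String[] args) {
        TestingContainerStatus status = new TestingContainerStatus(-1);
        System.out.println(status.getExitStatus()); // -1
    }
}
```

The general rule: tests should depend only on surfaces their own project controls or that the dependency declares stable.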
[jira] [Updated] (FLINK-24562) YarnResourceManagerDriverTest should not use ContainerStatusPBImpl.newInstance
[ https://issues.apache.org/jira/browse/FLINK-24562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-24562: - Issue Type: Bug (was: Improvement) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] becketqin merged pull request #17310: [FLINK-24308][docs] Translate Kafka DataStream connector documentation to Chinese
becketqin merged pull request #17310: URL: https://github.com/apache/flink/pull/17310 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] xintongsong closed pull request #17496: [FLINK-24562][yarn] YarnResourceManagerDriverTest should not use ContainerStatusPBImpl.newInstance
xintongsong closed pull request #17496: URL: https://github.com/apache/flink/pull/17496 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #17519: [FLINK-24368][testutils] Print logs JobManager and TaskManager on STDOUT in Flink container for redirecting to log4j in tests
flinkbot edited a comment on pull request #17519: URL: https://github.com/apache/flink/pull/17519#issuecomment-946579332 ## CI report: * 3b693fd284dec0633e572d6e0428bc1153bc0ed9 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25211) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] PatrickRen commented on pull request #17519: [FLINK-24368][testutils] Print logs JobManager and TaskManager on STDOUT in Flink container for redirecting to log4j in tests
PatrickRen commented on pull request #17519: URL: https://github.com/apache/flink/pull/17519#issuecomment-947255847 @flinkbot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #17521: [FLINK-24558][API/DataStream]make parent ClassLoader variable which c…
flinkbot edited a comment on pull request #17521: URL: https://github.com/apache/flink/pull/17521#issuecomment-946651087 ## CI report: * 4f842b98d3aaee4d1073a3621aa0f8a125e46528 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25216) * 3b8fa52e5763fb02a0b786d938f1e478f5a7155e Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25241) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #17524: Update rabbitmq.md
flinkbot edited a comment on pull request #17524: URL: https://github.com/apache/flink/pull/17524#issuecomment-947119476 ## CI report: * fbe360124a6e29ed8059246acc6a261f53b5448b Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25239) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #17521: [FLINK-24558][API/DataStream]make parent ClassLoader variable which c…
flinkbot edited a comment on pull request #17521: URL: https://github.com/apache/flink/pull/17521#issuecomment-946651087 ## CI report: * 4f842b98d3aaee4d1073a3621aa0f8a125e46528 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25216) * 3b8fa52e5763fb02a0b786d938f1e478f5a7155e UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] baisui1981 commented on pull request #17521: [FLINK-24558][API/DataStream]make parent ClassLoader variable which c…
baisui1981 commented on pull request #17521: URL: https://github.com/apache/flink/pull/17521#issuecomment-947244756 > checkstyle failed. You can run `mvn spotless:apply` to fix it. I have fixed it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Closed] (FLINK-24585) Print the change in the size of the compacted files
[ https://issues.apache.org/jira/browse/FLINK-24585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee closed FLINK-24585. Resolution: Fixed master: 69d5780abc681064f23545f2d4fb4c5ce9ba5403 > Print the change in the size of the compacted files > --- > > Key: FLINK-24585 > URL: https://issues.apache.org/jira/browse/FLINK-24585 > Project: Flink > Issue Type: Improvement > Components: Connectors / FileSystem > Affects Versions: 1.15.0 > Reporter: Yubin Li > Assignee: Yubin Li > Priority: Major > Labels: pull-request-available > > {code:java} > LOG.info( > "Compaction time cost is '{}S', target file is '{}', input files are '{}'", > costSeconds, > target, > paths); > {code} > This only prints the file names and the time cost of compacting; maybe we also need to print the size change. > We have a need for this and have already implemented it; please assign this to me, thanks. -- This message was sent by Atlassian Jira (v8.3.4#803005)
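A hypothetical sketch of the logging improvement this ticket describes, extending the quoted log line with the total input size (the helper name and the exact message format are invented; the merged commit may differ):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CompactionLogSketch {

    /** Builds a log line that also reports the total size of the input files. */
    static String logLine(long costSeconds, String target, Map<String, Long> inputFileSizes) {
        long totalInputBytes = inputFileSizes.values().stream().mapToLong(Long::longValue).sum();
        return String.format(
                "Compaction time cost is '%dS', target file is '%s', input files are '%s', total input size is %d bytes",
                costSeconds, target, inputFileSizes.keySet(), totalInputBytes);
    }

    public static void main(String[] args) {
        Map<String, Long> sizes = new LinkedHashMap<>(); // deterministic order for the log line
        sizes.put("part-0", 1024L);
        sizes.put("part-1", 2048L);
        System.out.println(logLine(3, "compacted-0", sizes));
    }
}
```

Comparing the total input size with the resulting target file's size is what makes the space saved by compaction visible in the logs.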
[jira] [Updated] (FLINK-24585) Print the change in the size of the compacted files
[ https://issues.apache.org/jira/browse/FLINK-24585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-24585: - Fix Version/s: 1.15.0 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] JingsongLi merged pull request #17517: [FLINK-24585][filesystem] Print the change in the size of the compacted files
JingsongLi merged pull request #17517: URL: https://github.com/apache/flink/pull/17517 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-24562) YarnResourceManagerDriverTest should not use ContainerStatusPBImpl.newInstance
[ https://issues.apache.org/jira/browse/FLINK-24562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430830#comment-17430830 ] Feifan Wang commented on FLINK-24562: - Hi [~xtsong], I have modified the PR following your comment. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] tisonkun commented on pull request #17521: [FLINK-24558][API/DataStream]make parent ClassLoader variable which c…
tisonkun commented on pull request #17521: URL: https://github.com/apache/flink/pull/17521#issuecomment-947226017 cc @AHeise from the mailing list discussion. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] tisonkun commented on pull request #17521: [FLINK-24558][API/DataStream]make parent ClassLoader variable which c…
tisonkun commented on pull request #17521: URL: https://github.com/apache/flink/pull/17521#issuecomment-947225450 cc @guoweiM @StephanEwen Also I'd like to know who can be cc to for API/DataStream related changes now? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] tisonkun commented on pull request #17521: [FLINK-24558][API/DataStream]make parent ClassLoader variable which c…
tisonkun commented on pull request #17521: URL: https://github.com/apache/flink/pull/17521#issuecomment-947224927 @baisui1981 checkstyle failed. You can run `mvn spotless:apply` to fix it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink-ml] zhipeng93 commented on a change in pull request #18: [FLINK-24279] Support withBroadcast in DataStream by caching in static variables
zhipeng93 commented on a change in pull request #18: URL: https://github.com/apache/flink-ml/pull/18#discussion_r730339576 ## File path: flink-ml-lib/src/main/java/org/apache/flink/ml/common/broadcast/BroadcastContext.java ## @@ -0,0 +1,123 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.ml.common.broadcast; + +import org.apache.flink.api.java.tuple.Tuple2; + +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.concurrent.locks.ReentrantReadWriteLock; + +public class BroadcastContext { +/** + * Store broadcast DataStreams in a Map. The key is (broadcastName, partitionId) and the value + * is (isBroaddcastVariableReady, cacheList). + */ +private static Map, Tuple2>> broadcastVariables = +new HashMap<>(); Review comment: Hi Yun, thanks for the feedback. Yes, previously users had to make sure that the keys were unique. To solve this problem, I have updated the implementation to use id + name + subtaskIndex as the key. Please refer to `BroadcastUtils#line71` and `HasBroadcastVariable` for details. -- This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
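The collision-avoidance idea from the reply, keying the shared cache by id + name + subtaskIndex rather than by name alone, can be sketched as follows (a hypothetical, simplified illustration; the key format and names here are invented, and the real BroadcastContext also guards the map with a ReentrantReadWriteLock):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BroadcastKeySketch {

    /**
     * Composite key: two broadcast variables with the same name but different
     * ids (or subtask indices) map to distinct cache entries.
     */
    static String key(String id, String broadcastName, int subtaskIndex) {
        return id + "-" + broadcastName + "-" + subtaskIndex;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> cache = new HashMap<>();
        cache.put(key("opA", "model", 0), List.of(1, 2, 3));
        cache.put(key("opB", "model", 0), List.of(4, 5, 6));
        // Same variable name "model", yet no collision in the shared static map.
        System.out.println(cache.size()); // 2
    }
}
```

With name-only keys, the second `put` would silently overwrite the first operator's cached broadcast data.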
[GitHub] [flink] flinkbot edited a comment on pull request #17524: Update rabbitmq.md
flinkbot edited a comment on pull request #17524: URL: https://github.com/apache/flink/pull/17524#issuecomment-947119476 ## CI report: * fbe360124a6e29ed8059246acc6a261f53b5448b Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25239) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #17524: Update rabbitmq.md
flinkbot commented on pull request #17524: URL: https://github.com/apache/flink/pull/17524#issuecomment-947120495 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit fbe360124a6e29ed8059246acc6a261f53b5448b (Tue Oct 19 21:29:26 UTC 2021) **Warnings:** * Documentation files were touched, but no `docs/content.zh/` files: Update Chinese documentation or file Jira ticket. * **Invalid pull request title: No valid Jira ID provided** Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #17524: Update rabbitmq.md
flinkbot commented on pull request #17524: URL: https://github.com/apache/flink/pull/17524#issuecomment-947119476 ## CI report: * fbe360124a6e29ed8059246acc6a261f53b5448b UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] mHz28 opened a new pull request #17524: Update rabbitmq.md
mHz28 opened a new pull request #17524: URL: https://github.com/apache/flink/pull/17524 It might be helpful to set the actual out-of-the-box default port value for RabbitMQ. The updated code will work with a new installation of RabbitMQ. ## What is the purpose of the change *(For example: This pull request makes task deployment go through the blob server, rather than through RPC. That way we avoid re-transferring them on each deployment (during recovery).)* ## Brief change log *(for example:)* - *The TaskInfo is stored in the blob store on job creation time as a persistent artifact* - *Deployments RPC transmits only the blob storage reference* - *TaskManagers retrieve the TaskInfo from the blob cache* ## Verifying this change *(Please pick either of the following options)* This change is a trivial rework / code cleanup without any test coverage. *(or)* This change is already covered by existing tests, such as *(please describe tests)*. *(or)* This change added tests and can be verified as follows: *(example:)* - *Added integration tests for end-to-end deployment with large payloads (100MB)* - *Extended integration test for recovery after master (JobManager) failure* - *Added test that validates that TaskInfo is transferred only once across recoveries* - *Manually verified the change by running a 4 node cluster with 2 JobManagers and 4 TaskManagers, a stateful streaming program, and killing one JobManager and two TaskManagers during the execution, verifying that recovery happens correctly.* ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / no) - The serializers: (yes / no / don't know) - The runtime per-record code paths (performance sensitive): (yes / no / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know) - The S3 file system connector: (yes / no / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / no) - If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
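The out-of-the-box AMQP port of a fresh RabbitMQ installation is 5672. A minimal sketch of the fallback behavior the PR describes — the class and method names below are hypothetical illustrations, not the actual Flink connector code:

```java
// Hypothetical sketch: fall back to RabbitMQ's out-of-the-box AMQP port when
// no port is configured explicitly. Not the actual Flink RabbitMQ connector code.
public class RabbitMQDefaults {

    // 5672 is the standard AMQP port of a fresh RabbitMQ installation.
    static final int DEFAULT_AMQP_PORT = 5672;

    // Return the configured port if present, otherwise the RabbitMQ default.
    static int portOrDefault(Integer configuredPort) {
        return configuredPort != null ? configuredPort : DEFAULT_AMQP_PORT;
    }

    public static void main(String[] args) {
        System.out.println(portOrDefault(null));  // falls back to 5672
        System.out.println(portOrDefault(5673));  // explicit value wins
    }
}
```

With such a fallback, connecting to a default local broker would not require the user to spell out the port at all.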
[GitHub] [flink] flinkbot edited a comment on pull request #17496: [FLINK-24562][yarn] YarnResourceManagerDriverTest should not use ContainerStatusPBImpl.newInstance
flinkbot edited a comment on pull request #17496: URL: https://github.com/apache/flink/pull/17496#issuecomment-944116990 ## CI report: * 106c7d11694d96a167cc3a3315ba91fe15d14632 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25234) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] flinkbot edited a comment on pull request #17459: [FLINK-24397] Remove TableSchema usages from Flink connectors
flinkbot edited a comment on pull request #17459: URL: https://github.com/apache/flink/pull/17459#issuecomment-940996979 ## CI report: * 0a7a93b33cebac710002818515ba9cec493c6dc2 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25228)
[GitHub] [flink] flinkbot edited a comment on pull request #17472: [FLINK-24486][rest] Make async result store duration configurable
flinkbot edited a comment on pull request #17472: URL: https://github.com/apache/flink/pull/17472#issuecomment-943076656 ## CI report: * 6c6aa3b08eb8bee6f96f27a8b84eaebfba015d7f Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25229)
[GitHub] [flink] flinkbot edited a comment on pull request #17440: [FLINK-24468][runtime] Wait for the channel activation before creating partition request client
flinkbot edited a comment on pull request #17440: URL: https://github.com/apache/flink/pull/17440#issuecomment-938816662 ## CI report: * 658994fafd3947a2a9a49fa91d3c7e4fce3e52b5 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25227)
[GitHub] [flink] flinkbot edited a comment on pull request #17496: [FLINK-24562][yarn] YarnResourceManagerDriverTest should not use ContainerStatusPBImpl.newInstance
flinkbot edited a comment on pull request #17496: URL: https://github.com/apache/flink/pull/17496#issuecomment-944116990 ## CI report: * Unknown: [CANCELED](TBD) * 106c7d11694d96a167cc3a3315ba91fe15d14632 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25234)
[GitHub] [flink] flinkbot edited a comment on pull request #17496: [FLINK-24562][yarn] YarnResourceManagerDriverTest should not use ContainerStatusPBImpl.newInstance
flinkbot edited a comment on pull request #17496: URL: https://github.com/apache/flink/pull/17496#issuecomment-944116990 ## CI report: * Unknown: [CANCELED](TBD) * 106c7d11694d96a167cc3a3315ba91fe15d14632 UNKNOWN
[GitHub] [flink] flinkbot edited a comment on pull request #17439: [FLINK-23271][table-planner] Disallow cast from decimal numerics to boolean
flinkbot edited a comment on pull request #17439: URL: https://github.com/apache/flink/pull/17439#issuecomment-938696330 ## CI report: * 711c3b24a34da8fbc1065487b3815599a7c25af7 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25226)
[GitHub] [flink] zoltar9264 commented on pull request #17496: [FLINK-24562][yarn] YarnResourceManagerDriverTest should not use ContainerStatusPBImpl.newInstance
zoltar9264 commented on pull request #17496: URL: https://github.com/apache/flink/pull/17496#issuecomment-946961163 @flinkbot run azure
[GitHub] [flink] zoltar9264 commented on a change in pull request #17496: [FLINK-24562][yarn] YarnResourceManagerDriverTest should not use ContainerStatusPBImpl.newInstance
zoltar9264 commented on a change in pull request #17496: URL: https://github.com/apache/flink/pull/17496#discussion_r732104243 ## File path: flink-yarn/src/test/java/org/apache/flink/yarn/TestingContainerStatus.java ## @@ -21,16 +21,23 @@ import org.apache.hadoop.yarn.api.records.ContainerId; import org.apache.hadoop.yarn.api.records.ContainerState; import org.apache.hadoop.yarn.api.records.ContainerStatus; -import org.apache.hadoop.yarn.api.records.impl.pb.ContainerStatusPBImpl; /** A {@link ContainerStatus} implementation for testing. */ -class TestingContainerStatus extends ContainerStatusPBImpl { +class TestingContainerStatus extends ContainerStatus { private final ContainerId containerId; private final ContainerState containerState; private final String diagnostics; private final int exitStatus; +public static ContainerStatus newInstance( +ContainerId containerId, +ContainerState containerState, +String diagnostics, +int exitStatus) { +return new TestingContainerStatus(containerId, containerState, diagnostics, exitStatus); +} Review comment: This is really not necessary, I will remove it later.
[GitHub] [flink] zoltar9264 commented on a change in pull request #17496: [FLINK-24562][yarn] YarnResourceManagerDriverTest should not use ContainerStatusPBImpl.newInstance
zoltar9264 commented on a change in pull request #17496: URL: https://github.com/apache/flink/pull/17496#discussion_r732103414 ## File path: flink-yarn/src/test/java/org/apache/flink/yarn/TestingContainerStatus.java ## @@ -21,16 +21,23 @@ import org.apache.hadoop.yarn.api.records.ContainerId; import org.apache.hadoop.yarn.api.records.ContainerState; import org.apache.hadoop.yarn.api.records.ContainerStatus; -import org.apache.hadoop.yarn.api.records.impl.pb.ContainerStatusPBImpl; /** A {@link ContainerStatus} implementation for testing. */ -class TestingContainerStatus extends ContainerStatusPBImpl { +class TestingContainerStatus extends ContainerStatus { Review comment: Thanks for the reply. I made this change because ContainerStatusPBImpl is marked 'Private', without considering what you said. Now that I understand this, I think extending ContainerStatusPBImpl is better.
[GitHub] [flink] flinkbot edited a comment on pull request #17458: [hotfix][docs] add more information about how to speed up maven build.
flinkbot edited a comment on pull request #17458: URL: https://github.com/apache/flink/pull/17458#issuecomment-940890804 ## CI report: * 2a955c618facf78f9953cb6159f201c2262a78eb Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25231)
[jira] [Created] (FLINK-24595) Programmatic configuration of S3 doesn't pass parameters to Hadoop FS
Pavel Penkov created FLINK-24595: Summary: Programmatic configuration of S3 doesn't pass parameters to Hadoop FS Key: FLINK-24595 URL: https://issues.apache.org/jira/browse/FLINK-24595 Project: Flink Issue Type: Bug Components: Connectors / Hadoop Compatibility Affects Versions: 1.14.0 Environment: Flink 1.14.0 JDK 8 {{openjdk version "1.8.0_302"}} {{OpenJDK Runtime Environment (Zulu 8.56.0.23-CA-macos-aarch64) (build 1.8.0_302-b08)}} {{OpenJDK 64-Bit Server VM (Zulu 8.56.0.23-CA-macos-aarch64) (build 25.302-b08, mixed mode)}} Reporter: Pavel Penkov Attachments: FlinkApp.java, TickingSource.java, flink_exception.txt When running in mini-cluster mode Flink apparently doesn't pass S3 configuration to underlying Hadoop FS. With a code like this {code:java} Configuration conf = new Configuration(); conf.setString("s3.endpoint", "http://localhost:4566";); conf.setString("s3.aws.credentials.provider","org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider"); conf.setString("s3.access.key", "harvester"); conf.setString("s3.secret.key", "harvester"); StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironment(conf); {code} Application fails with an exception with most relevant error being {{Caused by: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials provided by SimpleAWSCredentialsProvider EnvironmentVariableCredentialsProvider InstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Failed to connect to service endpoint: }} So Hadoop lists all the providers but it should use only the one set in configuration. Full project that reproduces this behaviour is available at [https://github.com/PavelPenkov/flink-s3-conf] and relevant files are attached to this issue. -- This message was sent by Atlassian Jira (v8.3.4#803005)
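For context on the reported behavior: the S3 filesystem plugin is expected to mirror Flink configuration keys onto the underlying Hadoop configuration by rewriting the key prefix (e.g. `s3.endpoint` → `fs.s3a.endpoint`). A self-contained, illustrative sketch of that prefix translation — the class and method names are hypothetical, not Flink's actual factory code:

```java
// Illustrative sketch of the key mirroring the S3 filesystem factory is
// expected to perform; names here are hypothetical, not Flink's actual code.
import java.util.HashMap;
import java.util.Map;

public class S3KeyMirror {

    // Rewrite a Flink-style "s3.*" key to its Hadoop "fs.s3a.*" counterpart.
    static String toHadoopKey(String flinkKey) {
        if (flinkKey.startsWith("s3.")) {
            return "fs.s3a." + flinkKey.substring("s3.".length());
        }
        return flinkKey; // non-S3 keys pass through unchanged
    }

    // Mirror every entry of a Flink configuration into a Hadoop-style map.
    static Map<String, String> mirror(Map<String, String> flinkConf) {
        Map<String, String> hadoopConf = new HashMap<>();
        flinkConf.forEach((k, v) -> hadoopConf.put(toHadoopKey(k), v));
        return hadoopConf;
    }

    public static void main(String[] args) {
        Map<String, String> flinkConf = new HashMap<>();
        flinkConf.put("s3.endpoint", "http://localhost:4566");
        flinkConf.put("s3.access.key", "harvester");
        System.out.println(mirror(flinkConf).get("fs.s3a.endpoint"));
    }
}
```

The bug report suggests this translation is not applied to a configuration passed programmatically to `createLocalEnvironment`, which would explain why Hadoop falls back to its default credential provider chain instead of the configured provider.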
[jira] [Commented] (FLINK-24384) Count checkpoints failed in trigger phase into numberOfFailedCheckpoints
[ https://issues.apache.org/jira/browse/FLINK-24384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430664#comment-17430664 ] Feifan Wang commented on FLINK-24384: - Hi [~pnowojski], I tested the PR of FLINK-24344 just now. It makes the IOException thrown by _*CheckpointStorageCoordinatorView#initializeLocationForCheckpoint(long checkpointId)*_ reach the _*CheckpointFailureManager*_, so tolerableCheckpointFailureNumber can take effect. But it still does not count these failures into numberOfFailedCheckpoints. A common use case is to configure a relatively large tolerableCheckpointFailureNumber to prevent a full job restart, while still wanting to be made aware of failures in _*CheckpointStorageCoordinatorView#initializeLocationForCheckpoint(long checkpointId)*_ through the numberOfFailedCheckpoints metric. > Count checkpoints failed in trigger phase into numberOfFailedCheckpoints > > > Key: FLINK-24384 > URL: https://issues.apache.org/jira/browse/FLINK-24384 > Project: Flink > Issue Type: Improvement > Components: Runtime / Checkpointing >Reporter: Feifan Wang >Priority: Major > > h1. *Problem* > In the current implementation, checkpoints that fail in the trigger phase are not counted > into the metric 'numberOfFailedCheckpoints', so users cannot become aware of stopped > checkpoints through this metric. > Users can fall back on alerting rules like _*'numberOfCompletedCheckpoints' has not > increased in the past few minutes*_ (maybe checkpoint interval + timeout), but that is > roundabout and cannot alert in a timely manner. > > h1. *Proposal* > As the title says, count checkpoints that fail in the trigger phase into > 'numberOfFailedCheckpoints'.
[GitHub] [flink] flinkbot edited a comment on pull request #17523: [FLINK-21407] Update formats in connectors doc for Flink 1.13 for DataSet and DataStream APIs
flinkbot edited a comment on pull request #17523: URL: https://github.com/apache/flink/pull/17523#issuecomment-946841463 ## CI report: * cd03025238198ad4fdd48009e412f3594ca613ac Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25230)
[GitHub] [flink] flinkbot edited a comment on pull request #17522: [FLINK-24462][table] Introduce CastRule interface to reorganize casting code
flinkbot edited a comment on pull request #17522: URL: https://github.com/apache/flink/pull/17522#issuecomment-94673 ## CI report: * def050276334d930dec36d5cffe52a47c1bf039d Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25223)
[GitHub] [flink] flinkbot edited a comment on pull request #17439: [FLINK-23271][table-planner] Disallow cast from decimal numerics to boolean
flinkbot edited a comment on pull request #17439: URL: https://github.com/apache/flink/pull/17439#issuecomment-938696330 ## CI report: * 5af68936a0eec1623b38a5f73ba5f5e9d62e387a Azure: [CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25219) * 711c3b24a34da8fbc1065487b3815599a7c25af7 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25226)
[GitHub] [flink] flinkbot edited a comment on pull request #17458: [hotfix][docs] add more information about how to speed up maven build.
flinkbot edited a comment on pull request #17458: URL: https://github.com/apache/flink/pull/17458#issuecomment-940890804 ## CI report: * 2c0a31553ff573e2ac794d0dc5f11c33144c24c8 Azure: [CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25225) * 2a955c618facf78f9953cb6159f201c2262a78eb Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25231)
[jira] [Closed] (FLINK-24355) Expose the flag for enabling checkpoints after tasks finish in the Web UI
[ https://issues.apache.org/jira/browse/FLINK-24355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Wysakowicz closed FLINK-24355. Resolution: Implemented Implemented in: * master ** 91151920806a51b8fef8901ab34cdd40735786ea * 1.14.1 ** 59e3ec7fcc95224c9c655949c2289f0702eac8a8 > Expose the flag for enabling checkpoints after tasks finish in the Web UI > - > > Key: FLINK-24355 > URL: https://issues.apache.org/jira/browse/FLINK-24355 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Configuration, Runtime / Web Frontend >Affects Versions: 1.14.0 >Reporter: Dawid Wysakowicz >Assignee: Nicholas Jiang >Priority: Major > Labels: pull-request-available > Fix For: 1.15.0, 1.14.1 > > > We should present the value of > {{execution.checkpointing.checkpoints-after-tasks-finish.enabled}} in the Web > UI.
[jira] [Closed] (FLINK-24487) Add DELETE endpoint for operation results
[ https://issues.apache.org/jira/browse/FLINK-24487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler closed FLINK-24487. Fix Version/s: (was: 1.15.0) Resolution: Later May add this later. The configurable cache timeout from FLINK-24486 should serve a lot of use-cases. > Add DELETE endpoint for operation results > - > > Key: FLINK-24487 > URL: https://issues.apache.org/jira/browse/FLINK-24487 > Project: Flink > Issue Type: Sub-task > Components: Runtime / REST >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > > Add a REST endpoint for deleting operation results. > In the first version we will only allow the deletion of completed results for > simplicity.
[jira] [Closed] (FLINK-24484) Support manual cleanup of async operation results
[ https://issues.apache.org/jira/browse/FLINK-24484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler closed FLINK-24484. Fix Version/s: (was: 1.15.0) Resolution: Later > Support manual cleanup of async operation results > - > > Key: FLINK-24484 > URL: https://issues.apache.org/jira/browse/FLINK-24484 > Project: Flink > Issue Type: Improvement > Components: Runtime / REST >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > > The REST API state for asynchronous operations results is currently kept for > 5 minutes. > This behavior is fine for human interactions. > For automated interactions it is not ideal because if the external system > fails to persist the result then it may not be able to retrieve the result > again. > To solve this issue I propose: > a) make the caching timeout configurable (currently hard-coded to 5 minutes) > b) add a DELETE endpoint for operation results.
[GitHub] [flink] flinkbot edited a comment on pull request #17512: [FLINK-24563][table-planner][test] Re-enable tests for invalid strings->timestamp_tz cast
flinkbot edited a comment on pull request #17512: URL: https://github.com/apache/flink/pull/17512#issuecomment-945871287 ## CI report: * 6c70fbff77debc375858f142cb325f6c0942d978 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25220)
[jira] [Updated] (FLINK-24486) Configurable async operation result cache timeout
[ https://issues.apache.org/jira/browse/FLINK-24486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-24486: - Parent: (was: FLINK-24484) Issue Type: Improvement (was: Sub-task) > Configurable async operation result cache timeout > - > > Key: FLINK-24486 > URL: https://issues.apache.org/jira/browse/FLINK-24486 > Project: Flink > Issue Type: Improvement > Components: Runtime / REST >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Labels: pull-request-available > Fix For: 1.15.0 > > > Make the caching timeout of the CompletedOperationCache configurable. This > should also allow the user to completely disable the timeout.
[GitHub] [flink] slinkydeveloper commented on a change in pull request #17522: [FLINK-24462][table] Introduce CastRule interface to reorganize casting code
slinkydeveloper commented on a change in pull request #17522: URL: https://github.com/apache/flink/pull/17522#discussion_r732012043 ## File path: flink-table/flink-table-planner/src/test/java/org/apache/flink/table/data/casting/CastRulesTest.java ## @@ -0,0 +1,165 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.table.data.casting; + +import org.apache.flink.table.data.StringData; +import org.apache.flink.table.data.TimestampData; +import org.apache.flink.table.types.DataType; + +import org.junit.jupiter.api.DynamicTest; +import org.junit.jupiter.api.TestFactory; + +import java.time.LocalDateTime; +import java.time.ZoneId; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.List; +import java.util.stream.Stream; + +import static org.apache.flink.table.api.DataTypes.BIGINT; +import static org.apache.flink.table.api.DataTypes.INT; +import static org.apache.flink.table.api.DataTypes.SMALLINT; +import static org.apache.flink.table.api.DataTypes.STRING; +import static org.apache.flink.table.api.DataTypes.TIMESTAMP; +import static org.apache.flink.table.api.DataTypes.TIMESTAMP_LTZ; +import static org.apache.flink.table.api.DataTypes.TINYINT; +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertNotNull; + +class CastRulesTest { Review comment: No need to make test classes public in JUnit 5
[GitHub] [flink] slinkydeveloper commented on a change in pull request #17522: [FLINK-24462][table] Introduce CastRule interface to reorganize casting code
slinkydeveloper commented on a change in pull request #17522: URL: https://github.com/apache/flink/pull/17522#discussion_r732007560 ## File path: flink-table/flink-table-planner/src/main/java/org/apache/flink/table/data/casting/CastExpression.java ## @@ -0,0 +1,56 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.data.casting; + +import org.apache.flink.table.types.logical.LogicalType; + +import java.util.Objects; + +/** Generated cast expression result. */ Review comment: Added some details. ## File path: flink-table/flink-table-common/src/main/java/org/apache/flink/table/data/casting/CastRulePredicate.java ## @@ -0,0 +1,147 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. 
You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.data.casting; + +import org.apache.flink.annotation.Internal; +import org.apache.flink.table.types.logical.LogicalType; +import org.apache.flink.table.types.logical.LogicalTypeFamily; +import org.apache.flink.table.types.logical.LogicalTypeRoot; + +import javax.annotation.Nullable; + +import java.util.Collections; +import java.util.HashSet; +import java.util.Set; +import java.util.function.BiPredicate; + +/** + * In order to apply a {@link CastRule}, the runtime checks if a particular rule matches the tuple + * of input and target type using this class. In particular, a rule is applied if: + * + * + * {@link #getTargetTypes()} includes the {@link LogicalTypeRoot} of target type and either + * + * {@link #getInputTypes()} includes the {@link LogicalTypeRoot} of input type or + * {@link #getInputTypeFamilies()} includes one of the {@link LogicalTypeFamily} of + * input type + * + * Or {@link #getTargetTypeFamilies()} includes one of the {@link LogicalTypeFamily} of target + * type and either + * + * {@link #getInputTypes()} includes the {@link LogicalTypeRoot} of input type or + * {@link #getInputTypeFamilies()} includes one of the {@link LogicalTypeFamily} of + * input type + * + * Or, if {@link #getCustomPredicate()} is not null, the input type and target type matches Review comment: I tried to clarify it a bit, check it out now
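The matching order described in that Javadoc can be condensed into a small predicate: a rule applies when the input side matches by root or family AND the target side matches by root or family, with an optional custom predicate as the final fallback. A self-contained sketch — the enums and mapping below are stand-ins for Flink's LogicalTypeRoot / LogicalTypeFamily, not the real classes:

```java
import java.util.EnumSet;
import java.util.Set;
import java.util.function.BiPredicate;

public class CastPredicateSketch {

    // Stand-ins for Flink's LogicalTypeRoot and LogicalTypeFamily.
    enum Root { TINYINT, INT, BIGINT, VARCHAR, TIMESTAMP }
    enum Family { NUMERIC, CHARACTER_STRING, DATETIME }

    // Hypothetical root -> families mapping, just enough for the sketch.
    static Set<Family> familiesOf(Root root) {
        switch (root) {
            case TINYINT:
            case INT:
            case BIGINT:
                return EnumSet.of(Family.NUMERIC);
            case VARCHAR:
                return EnumSet.of(Family.CHARACTER_STRING);
            default:
                return EnumSet.of(Family.DATETIME);
        }
    }

    static boolean intersects(Set<Family> a, Set<Family> b) {
        return a.stream().anyMatch(b::contains);
    }

    // A rule matches if the input side matches (by root or by family) AND the
    // target side matches (by root or by family); otherwise the optional custom
    // predicate gets the final say.
    static boolean matches(
            Set<Root> inputTypes,
            Set<Family> inputTypeFamilies,
            Set<Root> targetTypes,
            Set<Family> targetTypeFamilies,
            BiPredicate<Root, Root> customPredicate,
            Root input,
            Root target) {
        boolean inputMatches =
                inputTypes.contains(input) || intersects(inputTypeFamilies, familiesOf(input));
        boolean targetMatches =
                targetTypes.contains(target) || intersects(targetTypeFamilies, familiesOf(target));
        if (inputMatches && targetMatches) {
            return true;
        }
        return customPredicate != null && customPredicate.test(input, target);
    }

    public static void main(String[] args) {
        // A "numeric -> string" rule: matches INT -> VARCHAR, not VARCHAR -> TIMESTAMP.
        System.out.println(matches(
                EnumSet.noneOf(Root.class), EnumSet.of(Family.NUMERIC),
                EnumSet.of(Root.VARCHAR), EnumSet.noneOf(Family.class),
                null, Root.INT, Root.VARCHAR));
    }
}
```

Grouping the conditions this way makes the Javadoc's three bullet points collapse into one boolean expression, which is essentially the clarification being discussed in the review thread.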
[GitHub] [flink] flinkbot edited a comment on pull request #17521: [FLINK-24558][API/DataStream]make parent ClassLoader variable which c…
flinkbot edited a comment on pull request #17521: URL: https://github.com/apache/flink/pull/17521#issuecomment-946651087 ## CI report: * 4f842b98d3aaee4d1073a3621aa0f8a125e46528 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25216)
[GitHub] [flink] slinkydeveloper commented on a change in pull request #17522: [FLINK-24462][table] Introduce CastRule interface to reorganize casting code
slinkydeveloper commented on a change in pull request #17522:
URL: https://github.com/apache/flink/pull/17522#discussion_r732007179

## File path: flink-table/flink-table-planner/src/main/java/org/apache/flink/table/data/casting/CastRuleProvider.java

## @@ -0,0 +1,155 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.data.casting;
+
+import org.apache.flink.annotation.Internal;
+import org.apache.flink.table.data.casting.rules.IdentityCastRule$;
+import org.apache.flink.table.data.casting.rules.TimestampToStringCastRule$;
+import org.apache.flink.table.data.casting.rules.UpcastToBigIntCastRule$;
+import org.apache.flink.table.types.DataType;
+import org.apache.flink.table.types.logical.LogicalType;
+import org.apache.flink.table.types.logical.LogicalTypeFamily;
+import org.apache.flink.table.types.logical.LogicalTypeRoot;
+
+import javax.annotation.Nullable;
+
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+/** This class resolves {@link CastRule} starting from the input and the target type. */

Review comment:
I think that's an implementation detail of this class; what matters to the user is that we provide a rule matching the input and target types. I prefer to keep it this way since it's more descriptive, IMHO.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
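The class comment above describes resolving a `CastRule` from an input and a target type. As a rough illustration of that lookup pattern, the following sketch keys rules by an (input, target) type pair; all names, the `TypeRoot` enum, and the rules themselves are assumptions for illustration, not Flink's actual implementation:

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of resolving a cast rule from input and target types. */
class CastRuleSketch {

    // Stand-ins for logical type roots; the names here are assumptions.
    enum TypeRoot { INTEGER, BIGINT, TIMESTAMP, VARCHAR }

    /** A rule converts a value of the input type to the target type. */
    interface CastRule {
        Object cast(Object value);
    }

    // Registry keyed by "input->target"; populated once at class load.
    static final Map<String, CastRule> RULES = new HashMap<>();

    static {
        // An upcast rule, loosely analogous to an "upcast to BIGINT" rule.
        RULES.put(key(TypeRoot.INTEGER, TypeRoot.BIGINT), v -> ((Integer) v).longValue());
        // Identity rules when input and target types coincide.
        for (TypeRoot t : TypeRoot.values()) {
            RULES.putIfAbsent(key(t, t), v -> v);
        }
    }

    static String key(TypeRoot in, TypeRoot out) {
        return in + "->" + out;
    }

    /** Resolves a rule from the input and target type, or null if none matches. */
    static CastRule resolve(TypeRoot in, TypeRoot out) {
        return RULES.get(key(in, out));
    }

    public static void main(String[] args) {
        CastRule rule = resolve(TypeRoot.INTEGER, TypeRoot.BIGINT);
        System.out.println(rule.cast(42)); // prints 42 (as a boxed Long)
    }
}
```

From the caller's perspective this matches the comment's point: the user asks for a rule by input and target type and need not know how the registry is organized internally.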
[GitHub] [flink] flinkbot edited a comment on pull request #17482: [FLINK-24546][docs]fix miss Acknowledged description on Monitoring Ch…
flinkbot edited a comment on pull request #17482:
URL: https://github.com/apache/flink/pull/17482#issuecomment-943277366

## CI report:

* 5adaf0c3a0a2d0d6a98f0e095603aea1a87e91e0 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25215)

Bot commands:
The @flinkbot bot supports the following commands:
- `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] flinkbot edited a comment on pull request #17458: [hotfix][docs] add more information about how to speed up maven build.
flinkbot edited a comment on pull request #17458:
URL: https://github.com/apache/flink/pull/17458#issuecomment-940890804

## CI report:

* 1527085f5c1365c300b47c783f655cb8494a92f5 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25003)
* 2c0a31553ff573e2ac794d0dc5f11c33144c24c8 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25225)
* 2a955c618facf78f9953cb6159f201c2262a78eb Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25231)

Bot commands:
The @flinkbot bot supports the following commands:
- `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] dawidwys merged pull request #17510: [FLINK-24355][runtime] Expose the flag for enabling checkpoints after tasks finish in the Web UI
dawidwys merged pull request #17510:
URL: https://github.com/apache/flink/pull/17510
[GitHub] [flink] flinkbot edited a comment on pull request #17458: [hotfix][docs] add more information about how to speed up maven build.
flinkbot edited a comment on pull request #17458:
URL: https://github.com/apache/flink/pull/17458#issuecomment-940890804

## CI report:

* 1527085f5c1365c300b47c783f655cb8494a92f5 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25003)
* 2c0a31553ff573e2ac794d0dc5f11c33144c24c8 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25225)
* 2a955c618facf78f9953cb6159f201c2262a78eb UNKNOWN

Bot commands:
The @flinkbot bot supports the following commands:
- `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] flinkbot edited a comment on pull request #17523: [FLINK-21407] Update formats in connectors doc for Flink 1.13 for DataSet and DataStream APIs
flinkbot edited a comment on pull request #17523:
URL: https://github.com/apache/flink/pull/17523#issuecomment-946841463

## CI report:

* cd03025238198ad4fdd48009e412f3594ca613ac Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25230)

Bot commands:
The @flinkbot bot supports the following commands:
- `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] flinkbot commented on pull request #17523: [FLINK-21407] Update formats in connectors doc for Flink 1.13 for DataSet and DataStream APIs
flinkbot commented on pull request #17523:
URL: https://github.com/apache/flink/pull/17523#issuecomment-946841463

## CI report:

* cd03025238198ad4fdd48009e412f3594ca613ac UNKNOWN

Bot commands:
The @flinkbot bot supports the following commands:
- `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] echauchot commented on pull request #17523: [FLINK-21407] Update formats in connectors doc for Flink 1.13 for DataSet and DataStream APIs
echauchot commented on pull request #17523:
URL: https://github.com/apache/flink/pull/17523#issuecomment-946841301

R: @zentol
CC: @AHeise

Thanks guys, feel free to point me to another committer if you lack time to review this.
[jira] [Commented] (FLINK-14954) Publish OpenAPI specification of REST Monitoring API
[ https://issues.apache.org/jira/browse/FLINK-14954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430609#comment-17430609 ]

Chesnay Schepler commented on FLINK-14954:
--
flink-shaded-swagger added to flink-shaded master in db48d62f7b52f012b83533f8aa3f39fc0c0ee3db.

> Publish OpenAPI specification of REST Monitoring API
>
> Key: FLINK-14954
> URL: https://issues.apache.org/jira/browse/FLINK-14954
> Project: Flink
> Issue Type: Improvement
> Components: Documentation, Runtime / REST
> Reporter: Michaël Melchiore
> Assignee: Chesnay Schepler
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.15.0, shaded-15.0
>
> Hello,
> Flink provides a very helpful REST Monitoring API.
> OpenAPI is a convenient standard for generating clients, in a variety of languages, for REST APIs documented according to its specification. In this case, such clients would help automate the management of Flink clusters.
> Currently, there is no "official" OpenAPI specification of the Flink REST Monitoring API. [Some|https://github.com/nextbreakpoint/flink-client] have been written by users, but their consistency across Flink releases is uncertain.
> I think it would be beneficial to have an OpenAPI specification provided and maintained by the Flink project.
>
> Kind regards,

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
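To make concrete what the issue is asking for, an OpenAPI document for the REST Monitoring API could start from a fragment like the one below. This is only an illustrative sketch: the `/jobs/overview` path and the `jid`/`name`/`state` fields are assumptions modeled on Flink's documented REST responses, not content taken from the issue or from any published specification.

```yaml
# Hypothetical, minimal OpenAPI 3.0 fragment covering a single endpoint
# of the Flink REST Monitoring API; shapes and names are assumptions.
openapi: "3.0.3"
info:
  title: Flink REST Monitoring API (sketch)
  version: "0.1.0"
paths:
  /jobs/overview:
    get:
      summary: Returns an overview of all jobs known to the cluster.
      responses:
        "200":
          description: Job overview.
          content:
            application/json:
              schema:
                type: object
                properties:
                  jobs:
                    type: array
                    items:
                      type: object
                      properties:
                        jid:
                          type: string
                        name:
                          type: string
                        state:
                          type: string
```

Given such a document, standard OpenAPI tooling can generate typed clients in many languages, which is exactly the cross-release consistency benefit the issue describes.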