[jira] [Created] (FLINK-28603) Flink runtime.rpc.RpcServiceUtils code style
wuqingzhi created FLINK-28603: - Summary: Flink runtime.rpc.RpcServiceUtils code style Key: FLINK-28603 URL: https://issues.apache.org/jira/browse/FLINK-28603 Project: Flink Issue Type: Improvement Components: Runtime / RPC Reporter: wuqingzhi Hello, I found a code style problem in RpcServiceUtils: the constant nextNameOffset should be capitalized and separated by underscores. e.g.: private static final AtomicLong nextNameOffset = new AtomicLong(0L); -- This message was sent by Atlassian Jira (v8.20.10#820010)
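A minimal sketch of the suggested rename (the surrounding class and helper method are invented for illustration; only the field declaration reflects the issue), following the Java convention that static final constants use UPPER_SNAKE_CASE:

```java
import java.util.concurrent.atomic.AtomicLong;

class RpcServiceNaming {
    // Renamed per the Java convention: static final constants are UPPER_SNAKE_CASE.
    private static final AtomicLong NEXT_NAME_OFFSET = new AtomicLong(0L);

    // Hypothetical accessor, just to exercise the constant.
    static long nextOffset() {
        return NEXT_NAME_OFFSET.getAndIncrement();
    }
}
```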
[jira] [Created] (FLINK-28602) Changelog cannot close stream normally while enabling compression
Hangxiang Yu created FLINK-28602: Summary: Changelog cannot close stream normally while enabling compression Key: FLINK-28602 URL: https://issues.apache.org/jira/browse/FLINK-28602 Project: Flink Issue Type: Bug Components: Runtime / State Backends Affects Versions: 1.15.1, 1.16.0 Reporter: Hangxiang Yu Fix For: 1.16.0, 1.15.2 When compression is enabled, the changelog part wraps the output stream using StreamCompressionDecorator#decorateWithCompression. As the comment says, "IMPORTANT: For streams returned by this method, \{@link OutputStream#close()} is not propagated to the inner stream. The inner stream must be closed separately.". However, StateChangeFsUploader does not close the inner stream after the wrapped stream has been closed. So the upload does not complete when compression is enabled, even though it reports success. -- This message was sent by Atlassian Jira (v8.20.10#820010)
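The contract can be illustrated with a minimal, hypothetical sketch (class names invented; the real classes are StreamCompressionDecorator and StateChangeFsUploader): the compressing wrapper's close() deliberately does not reach the inner stream, so an uploader that closes only the wrapper leaks the inner stream.

```java
import java.io.IOException;
import java.io.OutputStream;

// Inner stream that records whether close() was ever called on it.
class TrackedStream extends OutputStream {
    boolean closed = false;
    @Override public void write(int b) {}
    @Override public void close() { closed = true; }
}

// Sketch of the documented contract: close() is NOT propagated to `inner`.
class CompressingWrapper extends OutputStream {
    private final OutputStream inner;
    CompressingWrapper(OutputStream inner) { this.inner = inner; }
    @Override public void write(int b) throws IOException { inner.write(b); }
    @Override public void close() {
        // A real decorator would finish the compression frame here,
        // but it deliberately leaves `inner` open.
    }
}
```

Closing only the wrapper leaves the inner stream open, which is why the upload can report success without ever completing; the fix is to close the inner stream separately after the wrapped one.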
[jira] [Created] (FLINK-28601) Support FeatureHasher in FlinkML
weibo zhao created FLINK-28601: -- Summary: Support FeatureHasher in FlinkML Key: FLINK-28601 URL: https://issues.apache.org/jira/browse/FLINK-28601 Project: Flink Issue Type: New Feature Components: Library / Machine Learning Reporter: weibo zhao Support FeatureHasher in FlinkML. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-28600) Support FilterPushDown in flink-connector-jdbc
Hailin Wang created FLINK-28600: --- Summary: Support FilterPushDown in flink-connector-jdbc Key: FLINK-28600 URL: https://issues.apache.org/jira/browse/FLINK-28600 Project: Flink Issue Type: Improvement Components: Connectors / JDBC Reporter: Hailin Wang Support FilterPushDown in flink-connector-jdbc -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-28599) Adding FlinkJoinToMultiJoinRule to support left/right outer join can be translated to multi join
Yunhong Zheng created FLINK-28599: - Summary: Adding FlinkJoinToMultiJoinRule to support left/right outer join can be translated to multi join Key: FLINK-28599 URL: https://issues.apache.org/jira/browse/FLINK-28599 Project: Flink Issue Type: Improvement Components: Table SQL / Planner Affects Versions: 1.16.0 Reporter: Yunhong Zheng Fix For: 1.16.0 Now, Flink uses Calcite's rule {code:java} JOIN_TO_MULTI_JOIN{code} to convert multiple joins into a join set, which can be used by join reordering. However, Calcite's rule cannot handle all outer joins. Left and right outer joins that meet certain conditions can also be converted to a multi join. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-28598) ClusterEntryPoint can't get the real exit reason when shutting down
zlzhang0122 created FLINK-28598: --- Summary: ClusterEntryPoint can't get the real exit reason when shutting down Key: FLINK-28598 URL: https://issues.apache.org/jira/browse/FLINK-28598 Project: Flink Issue Type: Bug Components: Deployment / YARN, Runtime / Task Affects Versions: 1.15.1, 1.14.2 Reporter: zlzhang0122 When the cluster is starting and some error occurs, the ClusterEntryPoint will shut down the cluster asynchronously, but if it can't get a Throwable, the shutdown reason will be null; this can happen, for example, when the failure is caused by user code. I think we can get the real exit reason caused by user code and pass it to the diagnostics parameter, which may help users a lot. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [DISCUSS] FLIP-248: Introduce dynamic partition pruning
Hi Jingsong, Jark, Jing, Thanks for the important inputs. Lake storage is a very important scenario, and considering more generic and extended cases, I would also like to use the "dynamic filtering" concept instead of "dynamic partition". >maybe the FLIP should also demonstrate the EXPLAIN result, which is also an API. I will add a section to describe the EXPLAIN result. >Does DPP also support streaming queries? Yes, but only for bounded sources. >it requires that the SplitEnumerator implement the newly introduced `SupportsHandleExecutionAttemptSourceEvent` interface, +1 I will update the document and the PoC code. Best, Godfrey Jing Zhang wrote on Wed, Jul 13, 2022 at 20:22: > > Hi Godfrey, > Thanks for driving this discussion. > This is an important improvement for batch SQL jobs. > I agree with Jingsong to expand the capability to more than just partitions. > Besides, I have two points: > 1. Based on FLIP-248[1], > > > Dynamic partition pruning mechanism can improve performance by avoiding > > reading large amounts of irrelevant data, and it works for both batch and > > streaming queries. > > Does DPP also support streaming queries? > It seems the proposed changes in FLIP-248 do not work for streaming > queries, > because the dimension table might be an unbounded input. > Or does it require all dimension tables to be bounded inputs for streaming > jobs if the job wants to enable DPP? > > 2. I notice there are changes on SplitEnumerator for the Hive source and File > source. > And they now depend on SourceEvent to pass PartitionData. > In FLIP-245, if speculative execution is enabled for sources based on FLIP-27 > which use SourceEvent, > it requires that the SplitEnumerator implement the newly introduced > `SupportsHandleExecutionAttemptSourceEvent` interface, > otherwise an exception would be thrown. > Since Hive and File sources are commonly used for batch jobs, it's better > to take this point into consideration. 
> > Best, > Jing Zhang > > [1] FLIP-248: > https://cwiki.apache.org/confluence/display/FLINK/FLIP-248%3A+Introduce+dynamic+partition+pruning > [2] FLIP-245: > https://cwiki.apache.org/confluence/display/FLINK/FLIP-245%3A+Source+Supports+Speculative+Execution+For+Batch+Job > > > Jark Wu wrote on Tue, Jul 12, 2022 at 13:16: > > > I agree with Jingsong. DPP is a particular case of Dynamic Filter Pushdown > > where the join key contains partition fields. Extending this FLIP to > > general filter > > pushdown can benefit more optimizations, and they can share the same > > interface. > > > > For example, the Trino Hive Connector leverages dynamic filtering to support: > > - dynamic partition pruning for partitioned tables > > - dynamic bucket pruning for bucketed tables > > - dynamic filters pushed into the ORC and Parquet readers to perform > > stripe > > or row-group pruning and save on disk I/O. > > > > Therefore, +1 to extend this FLIP to Dynamic Filter Pushdown (or Dynamic > > Filtering), > > just like Trino [1]. The interfaces should also be adapted for that. > > > > Besides, maybe the FLIP should also demonstrate the EXPLAIN result, which > > is also an API. > > > > Best, > > Jark > > > > [1]: https://trino.io/docs/current/admin/dynamic-filtering.html > > > > On Tue, 12 Jul 2022 at 09:59, Jingsong Li wrote: > > > > > Thanks Godfrey for driving. > > > > > > I like this FLIP. > > > > > > We can extend this capability to more than just partitions. > > > Here are some inputs from Lake Storage. > > > > > > The format of the splits generated by Lake Storage is roughly as follows: > > > Split { > > >Path filePath; > > >Statistics[] fieldStats; > > > } > > > > > > Stats contain the min and max of each column. > > > > > > If the storage is sorted by a column, this means that split > > > filtering on that column will be very effective, so not only the partition > > > fields but also this column is worth pushing down to the > > > RuntimeFilter. 
> > > This information can only be known by the source, so I suggest that the source > > > return which fields are worth pushing down. > > > > > > My overall point is: > > > this FLIP can be extended to support source runtime filter push-down > > > for all fields, not just dynamic partition pruning. > > > > > > What do you think? > > > > > > Best, > > > Jingsong > > > > > > On Fri, Jul 8, 2022 at 10:12 PM godfrey he wrote: > > > > > > > > Hi all, > > > > > > > > I would like to open a discussion on FLIP-248: Introduce dynamic > > > > partition pruning. > > > > > > > > Currently, Flink supports static partition pruning: the conditions in > > > > the WHERE clause are analyzed > > > > to determine in advance which partitions can be safely skipped in the > > > > optimization phase. > > > > Another common scenario: the partition information is not available > > > > in the optimization phase but only in the execution phase. > > > > That's the problem this FLIP is trying to solve: dynamic partition > > > > pruning, which could reduce the partition
Re: [DISCUSS] FLIP-238: Introduce FLIP-27-based Data Generator Source
Hi all, I updated the FLIP [1] to make it more extensible with the introduction of *SourceReaderFactory*. It gives users the ability to further customize the data generation and emission process if needed. I also incorporated the suggestion from Qingsheng and moved to the generator function design with an initializer method to support more sophisticated functions with non-serializable fields. I am personally pretty happy with the current prototype [2], [3]. Let me know if you have any other feedback, otherwise, I am going to start the vote. [1] https://cwiki.apache.org/confluence/x/9Av1D [2] https://github.com/afedulov/flink/blob/e5f8e43a67b9983c42b5db2a7617361fa459/flink-core/src/main/java/org/apache/flink/api/connector/source/lib/DataGeneratorSourceV4.java#L52 [3] https://github.com/afedulov/flink/blob/e5f8e43a67b9983c42b5db2a7617361fa459/flink-examples/flink-examples-streaming/src/main/java/org/apache/flink/streaming/examples/wordcount/GeneratorSourcePOC.java#L92 Best, Alexander Fedulov On Thu, Jul 7, 2022 at 12:08 AM Alexander Fedulov wrote: > Hi Becket, > > interesting points about the discrepancies in the *RuntimeContext* > "wrapping" throughout the framework, but I agree - this is something that > needs to be tackled separately. > For now, I adjusted the FLIP and the PoC implementation to only expose the > parallelism. > > Best, > Alexander Fedulov > > On Wed, Jul 6, 2022 at 2:42 AM Becket Qin wrote: > >> Hi Alex, >> >> Personally I prefer the latter option, i.e. just add the >> currentParallelism() method. It is easy to add more stuff to the >> SourceReaderContext in the future, and it is likely that most of the stuff >> in the RuntimeContext is not required by the SourceReader implementations. >> For the purpose of this FLIP, adding the method is probably good enough. >> >> That said, I don't see a consistent pattern adopted in the project to >> handle similar cases. The FunctionContext wraps the RuntimeContext and >> only >> exposes necessary stuff. 
CEPRuntimeContext extends the RuntimeContext and >> overrides some methods that it does not want to expose with exception >> throwing logic. Some internal context classes simply expose the entire >> RuntimeContext with some additional methods. If we want to make things >> clean, I'd imagine all these variations of context can become some >> specific >> combination of a ReadOnlyRuntimeContext and some "write" methods. But this >> may require a closer look at all these cases to make sure the >> ReadOnlyRuntimeContext is generally suitable. I feel that it will take >> some >> time and could be a bigger discussion than the data generator source >> itself. So maybe we can just go with adding a method at the moment, and >> evolve the SourceReaderContext to use the ReadOnlyRuntimeContext in the >> future. >> >> Thanks, >> >> Jiangjie (Becket) Qin >> >> On Tue, Jul 5, 2022 at 8:31 PM Alexander Fedulov > > >> wrote: >> >> > Hi Becket, >> > >> > I agree with you. We could introduce a *ReadOnlyRuntimeContext* that >> would >> > act as a holder for the *RuntimeContext* data. This would also require >> > read-only wrappers for the exposed fields, such as *ExecutionConfig*. >> > Alternatively, we just add the *currentParallelism()* method for now and >> > see if anything else might actually be needed later on. What do you >> think? >> > >> > Best, >> > Alexander Fedulov >> > >> > On Tue, Jul 5, 2022 at 2:30 AM Becket Qin wrote: >> > >> > > Hi Alex, >> > > >> > > While it is true that the RuntimeContext gives access to all the stuff >> > the >> > > framework can provide, it seems a little overkill for the >> > SourceReader. >> > > It is probably OK to expose all the read-only information in the >> > > RuntimeContext to the SourceReader, but we may want to hide the >> "write" >> > > methods, such as creating states, writing stuff to distributed cache, >> > etc, >> > > because these methods may not work well with the SourceReader design >> and >> > > cause confusion. 
For example, users may wonder why the snapshotState() >> > > method exists while they can use the state directly. >> > > >> > > Thanks, >> > > >> > > Jiangjie (Becket) Qin >> > > >> > > >> > > >> > > On Tue, Jul 5, 2022 at 7:37 AM Alexander Fedulov < >> > alexan...@ververica.com> >> > > wrote: >> > > >> > > > Hi Becket, >> > > > >> > > > I updated and extended FLIP-238 accordingly. >> > > > >> > > > Here is also my POC branch [1]. >> > > > DataGeneratorSourceV3 is the class that I currently converged on >> [2]. >> > It >> > > is >> > > > based on the expanded SourceReaderContext. >> > > > A couple more relevant classes [3] [4] >> > > > >> > > > Would appreciate it if you could take a quick look. >> > > > >> > > > [1] >> > https://github.com/afedulov/flink/tree/FLINK-27919-generator-source >> > > > [2] >> > > > >> > > > >> > > >> > >>
[jira] [Created] (FLINK-28597) Empty checkpoint folders not deleted on job cancellation if their shared state is still in use
Roman Khachatryan created FLINK-28597: - Summary: Empty checkpoint folders not deleted on job cancellation if their shared state is still in use Key: FLINK-28597 URL: https://issues.apache.org/jira/browse/FLINK-28597 Project: Flink Issue Type: Bug Components: Runtime / Checkpointing Affects Versions: 1.16.0 Reporter: Roman Khachatryan Assignee: Roman Khachatryan Fix For: 1.16.0 After FLINK-25872, SharedStateRegistry registers all state handles, including private ones. Only once the state isn't in use AND the checkpoint is subsumed will it actually be discarded. This is done to prevent premature deletion when recovering in CLAIM mode: 1. RocksDB native savepoint folder (shared state is stored in the chk-xx folder, so deletion might fail) 2. Initial non-changelog checkpoint when switching to changelog-based checkpoints (private state of the initial checkpoint might be included in later checkpoints, and its deletion would invalidate them) Additionally, checkpoint folders are now not deleted for a longer time, which might be confusing. In case of a crash, more folders will remain. cc: [~Yanfei Lei], [~ym] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-28596) Support writing arrays to postgres array columns in Flink SQL JDBC connector
Bobby Richard created FLINK-28596: - Summary: Support writing arrays to postgres array columns in Flink SQL JDBC connector Key: FLINK-28596 URL: https://issues.apache.org/jira/browse/FLINK-28596 Project: Flink Issue Type: Improvement Components: Connectors / JDBC Affects Versions: 1.15.0 Reporter: Bobby Richard -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-28595) KafkaSource should not read metadata of unmatched regex topics
Afzal Mazhar created FLINK-28595: Summary: KafkaSource should not read metadata of unmatched regex topics Key: FLINK-28595 URL: https://issues.apache.org/jira/browse/FLINK-28595 Project: Flink Issue Type: Improvement Components: Connectors / Kafka Reporter: Afzal Mazhar When we use a regex to subscribe to topics, the current connector gets a list of all topics, then runs describe against all of them, and finally filters by the regex pattern. This is not performant, and it could also trigger audit alarms against sensitive topics that do not match the regex. Proposed fix: move the regex filtering from TopicPatternSubscriber's set() down into KafkaSubscriberUtils' getAllTopicMetadata(). Get the list of topics, filter by the pattern (if any), then get the metadata. Create appropriate tests. -- This message was sent by Atlassian Jira (v8.20.10#820010)
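A hedged sketch of the proposed ordering (the class and method below are illustrative, not the actual connector code): match topic *names* against the subscription pattern first, and only then fetch metadata for the matches, so unmatched topics are never described.

```java
import java.util.Set;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Illustrative helper: apply the subscription regex before any describe call,
// so unmatched (possibly sensitive) topics are never touched.
class TopicFilter {
    static Set<String> matchTopics(Set<String> allTopicNames, Pattern pattern) {
        return allTopicNames.stream()
                .filter(name -> pattern.matcher(name).matches())
                .collect(Collectors.toSet());
    }
}
```

With the standard Kafka AdminClient, the name list would come from `listTopics().names()`, and only the filtered set would be passed on to `describeTopics(...)`.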
[jira] [Created] (FLINK-28594) Add metrics for FlinkService
Matyas Orhidi created FLINK-28594: - Summary: Add metrics for FlinkService Key: FLINK-28594 URL: https://issues.apache.org/jira/browse/FLINK-28594 Project: Flink Issue Type: Improvement Components: Kubernetes Operator Reporter: Matyas Orhidi Fix For: kubernetes-operator-1.2.0 We would need some metrics for the `FlinkService` to be able to tell how long it takes to perform most of the blocking operations in this service. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-28593) Introduce default ingress templates at operator level
Matyas Orhidi created FLINK-28593: - Summary: Introduce default ingress templates at operator level Key: FLINK-28593 URL: https://issues.apache.org/jira/browse/FLINK-28593 Project: Flink Issue Type: Improvement Components: Kubernetes Operator Reporter: Matyas Orhidi Fix For: kubernetes-operator-1.2.0 Ingress templates are currently [defined at CR level|https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/operations/ingress/], but these rules can be enabled globally at operator level too. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-28592) Implement custom resource counters as counters not gauges
Matyas Orhidi created FLINK-28592: - Summary: Implement custom resource counters as counters not gauges Key: FLINK-28592 URL: https://issues.apache.org/jira/browse/FLINK-28592 Project: Flink Issue Type: Improvement Components: Kubernetes Operator Affects Versions: kubernetes-operator-1.1.0 Reporter: Matyas Orhidi Fix For: kubernetes-operator-1.2.0 * change the current implementation to counters * add counters at the global level -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-28591) Array<Row> is not serialized correctly
Andrzej Swatowski created FLINK-28591: - Summary: Array<Row> is not serialized correctly Key: FLINK-28591 URL: https://issues.apache.org/jira/browse/FLINK-28591 Project: Flink Issue Type: Bug Components: Table SQL / API, Table SQL / Planner Affects Versions: 1.15.0 Reporter: Andrzej Swatowski When using the Table API to insert data into an array of rows, the data apparently is serialized incorrectly internally, which leads to incorrect serialization at the connectors. E.g., the following table: {code:java} CREATE TABLE wrongArray ( foo bigint, bar ARRAY<ROW<foo1 STRING, foo2 STRING>>, strings ARRAY<STRING>, intRows ARRAY<ROW<a INT, b INT>> ) WITH ( 'connector' = 'filesystem', 'path' = 'file:///home/jovyan/issue.json', 'format' = 'json' ) {code} along with the following insert: {code:java} insert into wrongArray ( SELECT 1, array[ ('Field1', 'Value1'), ('Field2', 'Value2') ], array['foo', 'bar', 'foobar'], array[ROW(1, 1), ROW(2, 2)] FROM (VALUES(1)) ) {code} gets serialized into: {code:java} { "foo":1, "bar":[ { "foo1":"Field2", "foo2":"Value2" }, { "foo1":"Field2", "foo2":"Value2" } ], "strings":[ "foo", "bar", "foobar" ], "intRows":[ { "a":2, "b":2 }, { "a":2, "b":2 } ] }{code} It is easy to spot that `strings` (being an array of strings) yields the correct values. However, both `bar` (an array of rows with two strings) and `intRows` (an array of rows with two integers) consist of duplicates of the last row in the array. The error is not connected to a specific connector or format. I have done a bit of debugging while trying to implement my own format, and it seems that the `BinaryArrayData` holding the row values has wrong data saved in its `MemorySegment`, i.e. calling: {code:java} for (var i = 0; i < array.size(); i++) { Object element = arrayDataElementGetter.getElementOrNull(array, i); }{code} correctly calculates offsets but yields the same element every time, as the data is malformed in the array's `MemorySegment`. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-28590) flink on yarn checkpoint exception
wangMu created FLINK-28590: -- Summary: flink on yarn checkpoint exception Key: FLINK-28590 URL: https://issues.apache.org/jira/browse/FLINK-28590 Project: Flink Issue Type: Bug Affects Versions: 1.15.1, 1.15.0 Reporter: wangMu Attachments: image-2022-07-18-17-52-59-892.png When I submit a job with Flink on YARN on CDH 6.2.1, the JobManager log prints the following exception: !image-2022-07-18-17-52-59-892.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-28589) Enhance Web UI for Speculative Execution
Gen Luo created FLINK-28589: --- Summary: Enhance Web UI for Speculative Execution Key: FLINK-28589 URL: https://issues.apache.org/jira/browse/FLINK-28589 Project: Flink Issue Type: Sub-task Components: Runtime / Web Frontend Affects Versions: 1.16.0 Reporter: Gen Luo -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-28588) Enhance REST API for Speculative Execution
Gen Luo created FLINK-28588: --- Summary: Enhance REST API for Speculative Execution Key: FLINK-28588 URL: https://issues.apache.org/jira/browse/FLINK-28588 Project: Flink Issue Type: Sub-task Components: Runtime / REST Affects Versions: 1.16.0 Reporter: Gen Luo -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-28587) FLIP-249: Flink Web UI Enhancement for Speculative Execution
Gen Luo created FLINK-28587: --- Summary: FLIP-249: Flink Web UI Enhancement for Speculative Execution Key: FLINK-28587 URL: https://issues.apache.org/jira/browse/FLINK-28587 Project: Flink Issue Type: New Feature Components: Runtime / REST, Runtime / Web Frontend Affects Versions: 1.16.0 Reporter: Gen Luo As a follow-up step of FLIP-168 and FLIP-224, the Flink Web UI needs to be enhanced to display the related information if the speculative execution mechanism is enabled. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-28586) Speculative execution for new sources
Zhu Zhu created FLINK-28586: --- Summary: Speculative execution for new sources Key: FLINK-28586 URL: https://issues.apache.org/jira/browse/FLINK-28586 Project: Flink Issue Type: Sub-task Components: Connectors / Common, Runtime / Coordination Reporter: Zhu Zhu Fix For: 1.16.0 This task enables new sources (FLIP-27) for speculative execution. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-28585) Speculative execution for InputFormat sources
Zhu Zhu created FLINK-28585: --- Summary: Speculative execution for InputFormat sources Key: FLINK-28585 URL: https://issues.apache.org/jira/browse/FLINK-28585 Project: Flink Issue Type: Sub-task Components: Runtime / Coordination Reporter: Zhu Zhu Fix For: 1.16.0 This task enables InputFormat sources for speculative execution. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-28584) Add an option to ConfigMap that can be set to immutable
liuzhuo created FLINK-28584: --- Summary: Add an option to ConfigMap that can be set to immutable Key: FLINK-28584 URL: https://issues.apache.org/jira/browse/FLINK-28584 Project: Flink Issue Type: Improvement Components: Deployment / Kubernetes Reporter: liuzhuo When a job is started in the Kubernetes environment, multiple ConfigMaps are usually created to mount data (e.g. flink-config, hadoop-config, etc.). If a cluster runs too many jobs, the ConfigMaps become a performance bottleneck, and occasionally a file mount failure exception occurs, resulting in slower pod startup times. According to Kubernetes' description of ConfigMap, enabling the immutable parameter greatly reduces the pressure on kube-apiserver and improves cluster performance. [configmap-immutable|https://kubernetes.io/zh-cn/docs/concepts/configuration/configmap/#configmap-immutable] In my understanding, configuration data such as flink-config and hadoop-config is loaded at startup, and even if it is subsequently modified, it cannot affect the running job. Should we provide a control switch to choose whether to set the ConfigMap to immutable? -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-28583) make flink dist log4j dependency simple and clear
jackylau created FLINK-28583: Summary: make flink dist log4j dependency simple and clear Key: FLINK-28583 URL: https://issues.apache.org/jira/browse/FLINK-28583 Project: Flink Issue Type: Improvement Components: Deployment / Scripts Affects Versions: 1.16.0 Reporter: jackylau Fix For: 1.16.0 Flink doesn't shade log4j; it puts it into flink/lib using a shade exclusion like this {code:java} org.apache.logging.log4j:* {code} and adds these to put them into flink/lib {code:java} org.apache.logging.log4j:log4j-api org.apache.logging.log4j:log4j-core org.apache.logging.log4j:log4j-slf4j-impl org.apache.logging.log4j:log4j-1.2-api {code} I suggest making the log4j dependencies provided. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-28582) LSM tree structure may be incorrect when multiple jobs are committing into the same bucket
Caizhi Weng created FLINK-28582: --- Summary: LSM tree structure may be incorrect when multiple jobs are committing into the same bucket Key: FLINK-28582 URL: https://issues.apache.org/jira/browse/FLINK-28582 Project: Flink Issue Type: Bug Components: Table Store Affects Versions: table-store-0.2.0 Reporter: Caizhi Weng Fix For: table-store-0.2.0 Currently `FileStoreCommitImpl` only checks for conflicts by verifying that the files we're going to delete (due to compaction) are still there. However, consider two jobs committing into the same LSM tree at the same time. For their first compactions no conflict is detected, because they'll only delete their own level 0 files. But they will both produce level 1 files, and the key ranges of these files may overlap. This is not correct for our LSM tree structure. -- This message was sent by Atlassian Jira (v8.20.10#820010)