Re: [PR] [FLINK-32850][flink-runtime][JUnit5 Migration] The io.network.netty package of flink-runtime module [flink]
flinkbot commented on PR #23607: URL: https://github.com/apache/flink/pull/23607#issuecomment-1782370076 ## CI report: * 044da359c0c11df6c477f93f60b8739056a8d4ff UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-33033][olap][haservice] Add haservice micro benchmark for olap [flink-benchmarks]
XComp commented on code in PR #78: URL: https://github.com/apache/flink-benchmarks/pull/78#discussion_r1374121055 ## src/main/java/org/apache/flink/olap/benchmark/HighAvailabilityServiceBenchmark.java: ## @@ -0,0 +1,133 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.olap.benchmark; + +import org.apache.curator.test.TestingServer; +import org.apache.flink.api.common.JobID; +import org.apache.flink.benchmark.BenchmarkBase; +import org.apache.flink.benchmark.FlinkEnvironmentContext; +import org.apache.flink.configuration.Configuration; +import org.apache.flink.configuration.HighAvailabilityOptions; +import org.apache.flink.runtime.jobgraph.JobGraph; +import org.apache.flink.runtime.jobgraph.JobVertex; +import org.apache.flink.runtime.jobmanager.HighAvailabilityMode; +import org.apache.flink.runtime.testtasks.NoOpInvokable; +import org.apache.flink.util.FileUtils; +import org.openjdk.jmh.annotations.Benchmark; +import org.openjdk.jmh.annotations.OutputTimeUnit; +import org.openjdk.jmh.annotations.Param; +import org.openjdk.jmh.annotations.State; +import org.openjdk.jmh.runner.Runner; +import org.openjdk.jmh.runner.RunnerException; +import org.openjdk.jmh.runner.options.Options; +import org.openjdk.jmh.runner.options.OptionsBuilder; +import org.openjdk.jmh.runner.options.VerboseMode; + +import java.io.File; +import java.io.IOException; +import java.nio.file.Files; +import java.util.UUID; + +import static java.util.concurrent.TimeUnit.SECONDS; +import static org.openjdk.jmh.annotations.Scope.Thread; + +/** The benchmark for submitting short-lived jobs with and without high availability service. */ +@OutputTimeUnit(SECONDS) +public class HighAvailabilityServiceBenchmark extends BenchmarkBase { + public static void main(String[] args) throws RunnerException { + Options options = + new OptionsBuilder() + .verbosity(VerboseMode.NORMAL) + .include(".*" + HighAvailabilityServiceBenchmark.class.getCanonicalName() + ".*") + .build(); + + new Runner(options).run(); + } + + @Benchmark + public void submitJobThroughput(HighAvailabilityContext context) throws Exception { Review Comment: sorry, I was out yesterday. But my point was that even with a warm up, some volatility in the metrics can happen. Therefore, one would run the operation in question multiple times and get the average instead of a single operation. I haven't fully understood here, why this is not necessary in this specific case. :thinking: -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[PR] [FLINK-32850][flink-runtime][JUnit5 Migration] The io.network.netty package of flink-runtime module [flink]
Jiabao-Sun opened a new pull request, #23607: URL: https://github.com/apache/flink/pull/23607 ## What is the purpose of the change [FLINK-32850][flink-runtime][JUnit5 Migration] The io.network.netty package of flink-runtime module ## Brief change log [FLINK-32850][flink-runtime][JUnit5 Migration] The io.network.netty package of flink-runtime module ## Verifying this change This change is already covered by existing tests ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? (not documented) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-33304] Introduce mutationBuffer to resolve the mutation write conflicts problem [flink-connector-hbase]
Tan-JiaLiang commented on PR #30: URL: https://github.com/apache/flink-connector-hbase/pull/30#issuecomment-1782360791 @ferenc-csaky I was double check my code and found the `DeduplicatedMutator` should be thread-safe, so i using the `synchronized` to ensure it. I rebase the main branch and squash my changes into one commit, PTAL. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-33364] Introduce standard YAML for flink configuration. [flink]
flinkbot commented on PR #23606: URL: https://github.com/apache/flink/pull/23606#issuecomment-1782352138 ## CI report: * 7b95683d46fe4864573cc5690ec3ef2275829b16 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-33364) Introduce standard YAML for flink configuration
[ https://issues.apache.org/jira/browse/FLINK-33364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-33364: --- Labels: pull-request-available (was: ) > Introduce standard YAML for flink configuration > --- > > Key: FLINK-33364 > URL: https://issues.apache.org/jira/browse/FLINK-33364 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Configuration >Affects Versions: 1.19.0 >Reporter: Junrui Li >Assignee: Junrui Li >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[PR] [FLINK-33364] Introduce standard YAML for flink configuration. [flink]
JunRuiLee opened a new pull request, #23606: URL: https://github.com/apache/flink/pull/23606 ## What is the purpose of the change Introduce standard YAML for flink configuration. ## Brief change log - Use snakeyaml as the standard yaml parser - Introduce configuration file "config.yaml" for standard yaml - Global Configuration will first load the legacy Flink configuration file, "flink-conf.yaml". If this file does not exist, it will then load the standard YAML configuration file, "config.yaml", which supports the YAML format. - When the loaded file name is "config.yaml", the Configuration will convert the value to a YAML syntax string and convert it to an object. ## Verifying this change - Update ConfigurationTest, ConfigurationUtilsTest, GlobalConfigurationTest, and FlinkConfigLoaderTest to parameterized tests, which will test both new and legacy configuration approaches separately. - Add YamlParserUtilsTest to test the standard YAML parser. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / **no**) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**) - The serializers: (yes / **no** / don't know) - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / **no** / don't know) - The S3 file system connector: (yes / **no** / don't know) ## Documentation - Does this pull request introduce a new feature? (**yes** / no) - If yes, how is the feature documented? (not applicable / **docs** / JavaDocs / not documented) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-33033][olap][haservice] Add haservice micro benchmark for olap [flink-benchmarks]
KarmaGYZ merged PR #78: URL: https://github.com/apache/flink-benchmarks/pull/78 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Closed] (FLINK-33033) Add haservice micro benchmark for olap
[ https://issues.apache.org/jira/browse/FLINK-33033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yangze Guo closed FLINK-33033. -- Fix Version/s: 1.19.0 Resolution: Fixed > Add haservice micro benchmark for olap > -- > > Key: FLINK-33033 > URL: https://issues.apache.org/jira/browse/FLINK-33033 > Project: Flink > Issue Type: Sub-task > Components: Benchmarks >Affects Versions: 1.19.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > Labels: pull-request-available > Fix For: 1.19.0 > > > Add micro benchmarks of haservice for olap to improve the performance for > short-lived jobs -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33033) Add haservice micro benchmark for olap
[ https://issues.apache.org/jira/browse/FLINK-33033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17780163#comment-17780163 ] Yangze Guo commented on FLINK-33033: master: 6e61678dc7fffa4529d27ec674a6b57a1e79b097 > Add haservice micro benchmark for olap > -- > > Key: FLINK-33033 > URL: https://issues.apache.org/jira/browse/FLINK-33033 > Project: Flink > Issue Type: Sub-task > Components: Benchmarks >Affects Versions: 1.19.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > Labels: pull-request-available > > Add micro benchmarks of haservice for olap to improve the performance for > short-lived jobs -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33121) Failed precondition in JobExceptionsHandler due to concurrent global failures
[ https://issues.apache.org/jira/browse/FLINK-33121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated FLINK-33121: --- Description: {{JobExceptionsHandler#createRootExceptionInfo}} makes the assumption that *Global* Failures (with null Task name) may *only* be RootExceptions (jobs are considered in FAILED state when this happens and no further exceptions are captured) and *Local/Task* may be part of concurrent exceptions List *--* if this precondition is violated, an assertion is thrown as part of {{{}asserLocalExceptionInfo{}}}. The issue lies within [convertFailures|[https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/adaptive/StateWithExecutionGraph.java#L422]] logic where we take the failureCollection pointer and convert it to a HistoryEntry. In more detail, we are passing the first Failure and a pointer to the remaining failures collection as part of HistoryEntry creation — and then add the entry in the exception History. In our specific scenario a Local Failure first comes in, we call convertFailures that creates a HistoryEntry and removes the LocalFailure from the collection while also passing a pointer to the empty failureCollection. Then a Global failure comes in (and before conversion), it is added to the failureCollection (that was empty) just before serving the requestJob that returns the List of History Entries. This messes things up, as the LocalFailure now has a ConcurrentExceptionsCollection with a Global Failure that should never happen (causing the assertion). A solution is to create a Copy of the failureCollection in the conversion instead of passing the pointer around (as I did in the updated PR) This PR also fixes a smaller bug where we dont pass the [taskName|[https://github.com/apache/flink/pull/23440/files#diff-0c8b850bbd267631fbe04bb44d8bb3c7e87c3c6aabae904fabdb758026f7fa76R104]|https://github.com/apache/flink/pull/23440/files#diff-0c8b850bbd267631fbe04bb44d8bb3c7e87c3c6aabae904fabdb758026f7fa76R104] properly. Note: DefaultScheduler does not suffer from this issue as it treats failures directly as HistoryEntries (no conversion step) was: {{JobExceptionsHandler#createRootExceptionInfo}} makes the assumption that *Global* Failures (with null Task name) may *only* be RootExceptions (jobs are considered in FAILED state when this happens and no further exceptions are captured) and *Local/Task* may be part of concurrent exceptions List *--* if this precondition is violated, an assertion is thrown as part of {{{}asserLocalExceptionInfo{}}}. The issue lies within [convertFailures](https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/adaptive/StateWithExecutionGraph.java#L422) logic where we take the failureCollection pointer and convert it to a HistoryEntry. In more detail, we are passing the first Failure and a pointer to the remaining failures collection as part of HistoryEntry creation — and then add the entry in the exception History. In our specific scenario a Local Failure first comes in, we call convertFailures that creates a HistoryEntry and removes the LocalFailure from the collection while also passing a pointer to the empty failureCollection. Then a Global failure comes in (and before conversion), it is added to the failureCollection (that was empty) just before serving the requestJob that returns the List of History Entries. This messes things up, as the LocalFailure now has a ConcurrentExceptionsCollection with a Global Failure that should never happen (causing the assertion). A solution is to create a Copy of the failureCollection in the conversion instead of passing the pointer around (as I did in the updated PR) This PR also fixes a smaller bug where we dont pass the [taskName](https://github.com/apache/flink/pull/23440/files#diff-0c8b850bbd267631fbe04bb44d8bb3c7e87c3c6aabae904fabdb758026f7fa76R104) properly. Note: DefaultScheduler does not suffer from this issue as it treats failures directly as HistoryEntries (no conversion step) > Failed precondition in JobExceptionsHandler due to concurrent global failures > - > > Key: FLINK-33121 > URL: https://issues.apache.org/jira/browse/FLINK-33121 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Reporter: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > > {{JobExceptionsHandler#createRootExceptionInfo}} makes the assumption that > *Global* Failures (with null Task name) may *only* be RootExceptions (jobs > are considered in FAILED state when this happens and no further exceptions > are captured) and *Local/Task* may be part of concurrent exceptio
[jira] [Updated] (FLINK-33121) Failed precondition in JobExceptionsHandler due to concurrent global failures
[ https://issues.apache.org/jira/browse/FLINK-33121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated FLINK-33121: --- Description: {{JobExceptionsHandler#createRootExceptionInfo}} makes the assumption that *Global* Failures (with null Task name) may *only* be RootExceptions (jobs are considered in FAILED state when this happens and no further exceptions are captured) and *Local/Task* may be part of concurrent exceptions List *--* if this precondition is violated, an assertion is thrown as part of {{{}asserLocalExceptionInfo{}}}. The issue lies within [convertFailures](https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/adaptive/StateWithExecutionGraph.java#L422) logic where we take the failureCollection pointer and convert it to a HistoryEntry. In more detail, we are passing the first Failure and a pointer to the remaining failures collection as part of HistoryEntry creation — and then add the entry in the exception History. In our specific scenario a Local Failure first comes in, we call convertFailures that creates a HistoryEntry and removes the LocalFailure from the collection while also passing a pointer to the empty failureCollection. Then a Global failure comes in (and before conversion), it is added to the failureCollection (that was empty) just before serving the requestJob that returns the List of History Entries. This messes things up, as the LocalFailure now has a ConcurrentExceptionsCollection with a Global Failure that should never happen (causing the assertion). A solution is to create a Copy of the failureCollection in the conversion instead of passing the pointer around (as I did in the updated PR) This PR also fixes a smaller bug where we dont pass the [taskName](https://github.com/apache/flink/pull/23440/files#diff-0c8b850bbd267631fbe04bb44d8bb3c7e87c3c6aabae904fabdb758026f7fa76R104) properly. Note: DefaultScheduler does not suffer from this issue as it treats failures directly as HistoryEntries (no conversion step) was: {{JobExceptionsHandler#createRootExceptionInfo}} makes the assumption that *Global* Failures (with null Task name) may *only* be RootExceptions (jobs are considered in FAILED state when this happens and no further exceptions are captured) and *Local/Task* may be part of concurrent exceptions List *--* if this precondition is violated, an assertion is thrown as part of {{{}asserLocalExceptionInfo{}}}. However, in the existing logic in the AdaptiveScheduler, we always add both the Global and the Local failures at the *end* of the [failure collection list|https://github.com/confluentinc/flink/blob/b8482260622c14db00f9dc88bbf9e82233613235/flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/adaptive/StateWithExecutionGraph.java#L338] and when converting them to history entries, we *remove from the Head* the [oldest failure exception.|https://github.com/confluentinc/flink/blob/b8482260622c14db00f9dc88bbf9e82233613235/flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/adaptive/StateWithExecutionGraph.java#L386] As a result, when there is a concurrent Task failure (first) with a Global failure (second terminating the job), the global failure ends up in the concurrent exception list, violating the precondition. Note: DefaultScheduler does not suffer from this issue as it treats failures directly as HistoryEntries (no conversion step) Solution is to only add Global failures in the *head* of the List as part of handleGlobalFailure method to ensure they are ending up as RootExceptionEntries. > Failed precondition in JobExceptionsHandler due to concurrent global failures > - > > Key: FLINK-33121 > URL: https://issues.apache.org/jira/browse/FLINK-33121 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Reporter: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > > {{JobExceptionsHandler#createRootExceptionInfo}} makes the assumption that > *Global* Failures (with null Task name) may *only* be RootExceptions (jobs > are considered in FAILED state when this happens and no further exceptions > are captured) and *Local/Task* may be part of concurrent exceptions List *--* > if this precondition is violated, an assertion is thrown as part of > {{{}asserLocalExceptionInfo{}}}. > The issue lies within > [convertFailures](https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/adaptive/StateWithExecutionGraph.java#L422) > logic where we take the failureCollection pointer and convert it to a > HistoryEntry. > In more detail, we are passing the first Failure and a pointer to the > remaining f
[jira] [Commented] (FLINK-26050) Too many small sst files in rocksdb state backend when using processing time window
[ https://issues.apache.org/jira/browse/FLINK-26050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17780139#comment-17780139 ] wuzq commented on FLINK-26050: -- [~shenjiaqi] [~mayuehappy] Is there a solution to this problem.Using _state.backend.rocksdb.timer-service.factory to heap, but rocksdb small sst is still growing_ > Too many small sst files in rocksdb state backend when using processing time > window > --- > > Key: FLINK-26050 > URL: https://issues.apache.org/jira/browse/FLINK-26050 > Project: Flink > Issue Type: Bug > Components: Runtime / State Backends >Affects Versions: 1.10.2, 1.14.3 >Reporter: shen >Priority: Major > Attachments: image-2022-02-09-21-22-13-920.png, > image-2022-02-11-10-32-14-956.png, image-2022-02-11-10-36-46-630.png, > image-2022-02-14-13-04-52-325.png > > > When using processing time window, in some workload, there will be a lot of > small sst files(serveral KB) in rocksdb local directory and may cause "Too > many files error". > Use rocksdb tool ldb to find out content in sst files: > * column family of these small sst files is "processing_window-timers". > * most sst files are in level-1. > * records in sst files are almost kTypeDeletion. > * creation time of sst file correspond to checkpoint interval. > These small sst files seem to be generated when flink checkpoint is > triggered. Although all content in sst are delete tags, they are not > compacted and deleted in rocksdb compaction because of not intersecting with > each other(rocksdb [compaction trivial > move|https://github.com/facebook/rocksdb/wiki/Compaction-Trivial-Move]). And > there seems to be no chance to delete them because of small size and not > intersect with other sst files. > > I will attach a simple program to reproduce the problem. > > Since timer in processing time window is generated in strictly ascending > order(both put and delete). So If workload of job happen to generate level-0 > sst files not intersect with each other.(for example: processing window size > much smaller than checkpoint interval, and no window content cross checkpoint > interval or no new data in window crossing checkpoint interval). There will > be many small sst files generated until job restored from savepoint, or > incremental checkpoint is disabled. > > May be similar problem exists when user use timer in operators with same > workload. > > Code to reproduce the problem: > {code:java} > package org.apache.flink.jira; > import lombok.extern.slf4j.Slf4j; > import org.apache.flink.configuration.Configuration; > import org.apache.flink.configuration.RestOptions; > import org.apache.flink.configuration.TaskManagerOptions; > import org.apache.flink.contrib.streaming.state.RocksDBStateBackend; > import org.apache.flink.streaming.api.TimeCharacteristic; > import org.apache.flink.streaming.api.checkpoint.ListCheckpointed; > import org.apache.flink.streaming.api.datastream.DataStreamSource; > import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment; > import org.apache.flink.streaming.api.functions.source.SourceFunction; > import > org.apache.flink.streaming.api.functions.windowing.ProcessWindowFunction; > import > org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows; > import org.apache.flink.streaming.api.windowing.time.Time; > import org.apache.flink.streaming.api.windowing.windows.TimeWindow; > import org.apache.flink.util.Collector; > import java.util.Collections; > import java.util.List; > import java.util.Random; > @Slf4j > public class StreamApp { > public static void main(String[] args) throws Exception { > Configuration config = new Configuration(); > config.set(RestOptions.ADDRESS, "127.0.0.1"); > config.set(RestOptions.PORT, 10086); > config.set(TaskManagerOptions.NUM_TASK_SLOTS, 6); > new > StreamApp().configureApp(StreamExecutionEnvironment.createLocalEnvironment(1, > config)); > } > public void configureApp(StreamExecutionEnvironment env) throws Exception { > env.enableCheckpointing(2); // 20sec > RocksDBStateBackend rocksDBStateBackend = > new > RocksDBStateBackend("file:///Users/shenjiaqi/Workspace/jira/flink-51/checkpoints/", > true); // need to be reconfigured > > rocksDBStateBackend.setDbStoragePath("/Users/shenjiaqi/Workspace/jira/flink-51/flink/rocksdb_local_db"); > // need to be reconfigured > env.setStateBackend(rocksDBStateBackend); > env.getCheckpointConfig().setCheckpointTimeout(10); > env.getCheckpointConfig().setTolerableCheckpointFailureNumber(5); > env.setParallelism(1); > env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime); > env.getConfig().setTaskCancellationInterval(1
Re: [PR] [FLINK-33341][state] Add support for rescaling from local keyed state [flink]
rkhachatryan commented on code in PR #23591: URL: https://github.com/apache/flink/pull/23591#discussion_r1373904163 ## flink-runtime/src/main/java/org/apache/flink/runtime/state/IncrementalLocalKeyedStateHandle.java: ## @@ -138,22 +108,43 @@ public void discardState() throws Exception { @Override public long getStateSize() { -return super.getStateSize() + metaDataState.getStateSize(); +return directoryStateHandle.getStateSize() + metaStateHandle.getStateSize(); } @Override public int hashCode() { -int result = super.hashCode(); -result = 31 * result + getMetaDataState().hashCode(); +int result = directoryStateHandle.hashCode(); +result = 31 * result + getKeyGroupRange().hashCode(); +result = 31 * result + getMetaDataStateHandle().hashCode(); return result; } @Override public String toString() { return "IncrementalLocalKeyedStateHandle{" + "metaDataState=" -+ metaDataState ++ metaStateHandle + "} " -+ super.toString(); ++ "DirectoryKeyedStateHandle{" ++ "directoryStateHandle=" ++ directoryStateHandle ++ ", keyGroupRange=" ++ keyGroupRange ++ '}'; +} + +@Override +public void registerSharedStates(SharedStateRegistry stateRegistry, long checkpointID) { +// Nothing to do, this is for local use only. Review Comment: nit: throw an exception? ## flink-runtime/src/test/java/org/apache/flink/runtime/checkpoint/PrioritizedOperatorSubtaskStateTest.java: ## @@ -51,6 +55,108 @@ class PrioritizedOperatorSubtaskStateTest { private static final Random RANDOM = new Random(0x42); +@Test +void testTryCreateMixedLocalAndRemoteAlternative() { +testTryCreateMixedLocalAndRemoteAlternative( +StateHandleDummyUtil::createKeyedStateHandleFromSeed, +KeyedStateHandle::getKeyGroupRange); +} + + void testTryCreateMixedLocalAndRemoteAlternative( +IntFunction stateHandleFactory, Function idExtractor) { +List jmState = +Arrays.asList( +stateHandleFactory.apply(0), +stateHandleFactory.apply(1), +stateHandleFactory.apply(2), +stateHandleFactory.apply(3)); + +List alternativeA = +Arrays.asList(stateHandleFactory.apply(0), stateHandleFactory.apply(3)); + +List alternativeB = +Arrays.asList( +stateHandleFactory.apply(1), +stateHandleFactory.apply(3), +stateHandleFactory.apply(5)); + +List> alternatives = +Arrays.asList( +new StateObjectCollection<>(alternativeA), +new StateObjectCollection<>(Collections.emptyList()), +new StateObjectCollection<>(alternativeB)); + +StateObjectCollection result = + PrioritizedOperatorSubtaskState.Builder.tryComputeMixedLocalAndRemoteAlternative( +new StateObjectCollection<>(jmState), alternatives, idExtractor) +.get(); + +Assertions.assertEquals(4, result.size()); +List expected = new ArrayList<>(alternativeA); +expected.add(alternativeB.get(0)); +expected.add(jmState.get(2)); +Assertions.assertTrue(result.containsAll(expected)); Review Comment: I find this test more difficult to follow than it could be. How about: 1. Use `org.assertj.core.api.Assertions.assertThat(result).hasSameElementsAs(...)` 2. Use constants for handles: ``` SH remote0 = stateHandleFactory.apply(0); SH remote1 = stateHandleFactory.apply(1); ... List jmState = Arrays.asList( remote0, remote1 , remote2 , remote3 ); SH local0 = stateHandleFactory.apply(0); SH local3 = stateHandleFactory.apply(3); ... List alternativeA = Arrays.asList(...); List alternativeB = Arrays.asList(...); ... assertThat(result).hasSameElementsAs(Arrays.asList(local0, local3, local1, remote0)); ## flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/PrioritizedOperatorSubtaskState.java: ## @@ -313,22 +317,121 @@ public PrioritizedOperatorSubtaskState build() { restoredCheckpointId); } +/** + * This method creates an alternative recovery option by replacing as much job manager state + * with higher prioritized (=local) alternatives as possible. + * + * @param jobManagerState the state that the task got assigned from the job manager (this + * state lives in remote storage). + * @param alternativesByPriority l
Re: [PR] Draft [FLINK-33335] (CI) [flink]
afedulov commented on PR #23605: URL: https://github.com/apache/flink/pull/23605#issuecomment-1781935549 @flinkbot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Draft [FLINK-33335] (CI) [flink]
afedulov closed pull request #23605: Draft [FLINK-5] (CI) URL: https://github.com/apache/flink/pull/23605 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Draft [FLINK-33335] (CI) [flink]
flinkbot commented on PR #23605: URL: https://github.com/apache/flink/pull/23605#issuecomment-1781931508 ## CI report: * d9061be6ae8e9d7d246f19d7103c8d01b9db1574 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-31332] Limit the use of ExecutionConfig on JdbcOutputFormat [flink-connector-jdbc]
snuyanzin commented on code in PR #73: URL: https://github.com/apache/flink-connector-jdbc/pull/73#discussion_r1373843729 ## flink-connector-jdbc/src/main/java/org/apache/flink/connector/jdbc/table/JdbcOutputFormatBuilder.java: ## @@ -100,10 +98,7 @@ public JdbcOutputFormatBuilder setFieldDataTypes(DataType[] fieldDataTypes) { return new JdbcOutputFormat<>( new SimpleJdbcConnectionProvider(jdbcOptions), executionOptions, -ctx -> -createBufferReduceExecutor( -dmlOptions, ctx, rowDataTypeInformation, logicalTypes), Review Comment: since all the usages of `rowDataTypeInformation` are removed, could we remove the field itself? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[PR] Draft [FLINK-33335] (CI) [flink]
afedulov opened a new pull request, #23605: URL: https://github.com/apache/flink/pull/23605 ## What is the purpose of the change *(For example: This pull request makes task deployment go through the blob server, rather than through RPC. That way we avoid re-transferring them on each deployment (during recovery).)* ## Brief change log *(for example:)* - *The TaskInfo is stored in the blob store on job creation time as a persistent artifact* - *Deployments RPC transmits only the blob storage reference* - *TaskManagers retrieve the TaskInfo from the blob cache* ## Verifying this change Please make sure both new and modified tests in this PR follows the conventions defined in our code quality guide: https://flink.apache.org/contributing/code-style-and-quality-common.html#testing *(Please pick either of the following options)* This change is a trivial rework / code cleanup without any test coverage. *(or)* This change is already covered by existing tests, such as *(please describe tests)*. *(or)* This change added tests and can be verified as follows: *(example:)* - *Added integration tests for end-to-end deployment with large payloads (100MB)* - *Extended integration test for recovery after master (JobManager) failure* - *Added test that validates that TaskInfo is transferred only once across recoveries* - *Manually verified the change by running a 4 node cluster with 2 JobManagers and 4 TaskManagers, a stateful streaming program, and killing one JobManager and two TaskManagers during the execution, verifying that recovery happens correctly.* ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / no) - The serializers: (yes / no / don't know) - The runtime per-record code paths (performance sensitive): (yes / no / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know) - The S3 file system connector: (yes / no / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / no) - If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-31332] Limit the use of ExecutionConfig on JdbcOutputFormat [flink-connector-jdbc]
snuyanzin commented on code in PR #73: URL: https://github.com/apache/flink-connector-jdbc/pull/73#discussion_r1373838435 ## flink-connector-jdbc/src/test/java/org/apache/flink/connector/jdbc/internal/JdbcOutputSerializerTest.java: ## @@ -0,0 +1,36 @@ +package org.apache.flink.connector.jdbc.internal; + +import org.apache.flink.api.common.ExecutionConfig; +import org.apache.flink.api.common.typeinfo.TypeInformation; +import org.apache.flink.api.common.typeutils.TypeSerializer; +import org.apache.flink.types.Row; + +import org.junit.jupiter.api.Test; + +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertNotEquals; Review Comment: since in most of the places there is `org.assertj.core.api.Assertions.assertThat;` and IIRC somewhere in Flink guide it was mentioned that it's better to use `org.assertj.core.api.Assertions.assertThat;` could we use it here -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-33335) Reactivate missing e2e tests
[ https://issues.apache.org/jira/browse/FLINK-5?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-5: --- Labels: pull-request-available (was: ) > Reactivate missing e2e tests > > > Key: FLINK-5 > URL: https://issues.apache.org/jira/browse/FLINK-5 > Project: Flink > Issue Type: Improvement >Reporter: Alexander Fedulov >Assignee: Alexander Fedulov >Priority: Major > Labels: pull-request-available > > FLINK-17375 removed _run-pre-commit-tests.sh_ in Flink 1.12 [1]. Since then > the following tests are not executed anymore: > _test_state_migration.sh_ > _test_state_evolution.sh_ > _test_streaming_kinesis.sh_ > _test_streaming_classloader.sh_ > _test_streaming_distributed_cache_via_blob.sh_ > [1] > https://github.com/apache/flink/pull/12268/files#diff-39f0aea40d2dd3f026544bb4c2502b2e9eab4c825df5f2b68c6d4ca8c39d7b5e -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [FLINK-32563] Allow to execute archunit tests only with Flink version that connectors were built against [flink-connector-shared-utils]
snuyanzin commented on code in PR #23: URL: https://github.com/apache/flink-connector-shared-utils/pull/23#discussion_r1373724108 ## .github/workflows/ci.yml: ## @@ -88,6 +93,10 @@ jobs: if: ${{ inputs.run_dependency_convergence }} run: echo "MVN_DEPENDENCY_CONVERGENCE=-Dflink.convergence.phase=install -Pcheck-convergence" >> $GITHUB_ENV + - name: "Disable archunit tests" +if: ${{ inputs.skip_archunit_tests }} +run: echo "MVN_ARCHUNIT_TESTS=-Dtest='!*ArchitectureTest'" >> $GITHUB_ENV Review Comment: I tried similar settings locally like `mvn -Dflink.version=1.18.0 -Dtest=\!*ArchitectureTest* clean install` and for dirs without archunit tests it fails like ``` No tests were executed! (Set -DfailIfNoTests=false to ignore this error.) ``` then if we add mentioned flag to the command then surefire mixes integration and non integration tests (looks like a bug on surefire side, which does not happen without this flag). The problem is that there is for instance test called `PackagingITCase` assuming that there is already a built jar... however if surefire runs it together with default tests instead of integration then this test fails would be great if this is somehow covered here... -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-32563] Allow to execute archunit tests only with Flink version that connectors were built against [flink-connector-shared-utils]
snuyanzin commented on code in PR #23: URL: https://github.com/apache/flink-connector-shared-utils/pull/23#discussion_r1373724108 ## .github/workflows/ci.yml: ## @@ -88,6 +93,10 @@ jobs: if: ${{ inputs.run_dependency_convergence }} run: echo "MVN_DEPENDENCY_CONVERGENCE=-Dflink.convergence.phase=install -Pcheck-convergence" >> $GITHUB_ENV + - name: "Disable archunit tests" +if: ${{ inputs.skip_archunit_tests }} +run: echo "MVN_ARCHUNIT_TESTS=-Dtest='!*ArchitectureTest'" >> $GITHUB_ENV Review Comment: I tried similar settings locally like `mvn -Dflink.version=1.18.0 -Dtest=\!*ArchitectureTest* clean install` and for dirs without archunit tests it fails like ``` No tests were executed! (Set -DfailIfNoTests=false to ignore this error.) ``` then if we add mentioned flag to the command then surefire mixes integration and non integration tests (looks like a bug on surefire side, which does not happen without this flag). The problem is that there tests called `PackagingITCase` assuming that there is already a built jar... however if surefire runs it together with default tests instead of integration then this test fails would be great if this is somehow covered here... -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-32563] Allow to execute archunit tests only with Flink version that connectors were built against [flink-connector-shared-utils]
snuyanzin commented on code in PR #23: URL: https://github.com/apache/flink-connector-shared-utils/pull/23#discussion_r1373724108 ## .github/workflows/ci.yml: ## @@ -88,6 +93,10 @@ jobs: if: ${{ inputs.run_dependency_convergence }} run: echo "MVN_DEPENDENCY_CONVERGENCE=-Dflink.convergence.phase=install -Pcheck-convergence" >> $GITHUB_ENV + - name: "Disable archunit tests" +if: ${{ inputs.skip_archunit_tests }} +run: echo "MVN_ARCHUNIT_TESTS=-Dtest='!*ArchitectureTest'" >> $GITHUB_ENV Review Comment: I tried similar settings locally like `mvn -Dflink.version=1.18.0 -Dtest=\!*ArchitectureTest* clean install` and for dirs without archunit tests it fails like ``` No tests were executed! (Set -DfailIfNoTests=false to ignore this error.) ``` then if we add mentioned flag to the command then surefire mixes integration and non integration tests (which does not happen without this flag) would be great if this is somehow covered here... -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Closed] (FLINK-33367) Invalid Check in DefaultFileFilter
[ https://issues.apache.org/jira/browse/FLINK-33367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Fedulov closed FLINK-33367. - Release Note: Not an actual issue. Resolution: Won't Fix I am closing this as there is no description of the actual issue. > Invalid Check in DefaultFileFilter > -- > > Key: FLINK-33367 > URL: https://issues.apache.org/jira/browse/FLINK-33367 > Project: Flink > Issue Type: Bug > Components: Connectors / FileSystem >Affects Versions: 1.16.2 >Reporter: Chirag Dewan >Priority: Minor > > There is a null check in DefaultFileFilter: > > if (fileName == null || fileName.length() == 0) > { return true; } > > So 2 questions here - > 1) Can a file name ever be null? > 2) What will be the behavior with return true? Should it be return false > rather? -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33368) Support for SNI in the Flink Client
[ https://issues.apache.org/jira/browse/FLINK-33368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17780027#comment-17780027 ] Mihai L Lalescu commented on FLINK-33368: - Yes this is a duplicate of the above. > Support for SNI in the Flink Client > --- > > Key: FLINK-33368 > URL: https://issues.apache.org/jira/browse/FLINK-33368 > Project: Flink > Issue Type: New Feature > Components: Command Line Client > Environment: Flink Cluster on OpenShift > VIP requiring SNI > Flink client running on a VM > Flink version 16.2 > Java 8 >Reporter: Mihai L Lalescu >Priority: Major > > We have Flink clusters running on OpenShift behind a VIP that requires SNI > (Server Name Information). The Flink client fails to connect to the Job > Manager REST API through the VIP due to lack of SNI support in the Client. > The connection was using TLS 1.2. > If required, I can provide Wireshark traces. The TLS 1.2 Client Hello package > does not contain any SNI info. > I have also searched the Flink source code for the netty SniHandler class and > I could not find any use of that class. > I have not seen any SNI references here > https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/rest/RestClient.java -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [FLINK-32850][flink-runtime][JUnit5 Migration] The io.network.buffer package of flink-runtime module [flink]
flinkbot commented on PR #23604: URL: https://github.com/apache/flink/pull/23604#issuecomment-1781511688 ## CI report: * a3ca65d40309be72c6ca7117e1e8cc2a80aaba86 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-32850][flink-runtime][JUnit5 Migration] The io.network.buffer package of flink-runtime module [flink]
Jiabao-Sun commented on PR #23604: URL: https://github.com/apache/flink/pull/23604#issuecomment-1781511559 Hi @RocMarshal, sorry for bothering you again. Would you mind to help review this PR as well? Many thanks for that. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-32661][sql-gateway] Fix unstable OperationRelatedITCase.testOperationRelatedApis [flink]
Jiabao-Sun commented on PR #23564: URL: https://github.com/apache/flink/pull/23564#issuecomment-1781504928 Hi @leonardBang, do you have time to help review it? Thanks. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[PR] [FLINK-32850][flink-runtime][JUnit5 Migration] The io.network.buffer package of flink-runtime module [flink]
Jiabao-Sun opened a new pull request, #23604: URL: https://github.com/apache/flink/pull/23604 ## What is the purpose of the change [FLINK-32850][flink-runtime][JUnit5 Migration] The io.network.buffer package of flink-runtime module ## Brief change log [FLINK-32850][flink-runtime][JUnit5 Migration] The io.network.buffer package of flink-runtime module ## Verifying this change This change is already covered by existing tests ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? (not documented) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Assigned] (FLINK-33360) HybridSource fails to clear the previous round's state when switching sources, leading to data loss
[ https://issues.apache.org/jira/browse/FLINK-33360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Leonard Xu reassigned FLINK-33360: -- Assignee: Feng Jiajie > HybridSource fails to clear the previous round's state when switching > sources, leading to data loss > --- > > Key: FLINK-33360 > URL: https://issues.apache.org/jira/browse/FLINK-33360 > Project: Flink > Issue Type: Bug > Components: Connectors / HybridSource >Affects Versions: 1.16.2, 1.18.0, 1.17.1 >Reporter: Feng Jiajie >Assignee: Feng Jiajie >Priority: Major > Labels: pull-request-available > Fix For: 1.7.3, 1.18.1 > > > org.apache.flink.connector.base.source.hybrid.HybridSourceSplitEnumerator: > {code:java} > // track readers that have finished processing for current > enumerator > finishedReaders.add(subtaskId); > if (finishedReaders.size() == context.currentParallelism()) { > LOG.debug("All readers finished, ready to switch > enumerator!"); > if (currentSourceIndex + 1 < sources.size()) { > switchEnumerator(); > // switch all readers prior to sending split assignments > for (int i = 0; i < context.currentParallelism(); i++) { > sendSwitchSourceEvent(i, currentSourceIndex); > } > } > } {code} > I think that *finishedReaders* is used to keep track of all the subTaskIds > that have finished reading the current round of the source. Therefore, in the > *switchEnumerator* function, *finishedReaders* should be cleared: > If it's not cleared, then in the next source reading, whenever any > SourceReader reports a *SourceReaderFinishedEvent* (while other SourceReaders > may not have finished processing in parallel), the condition > *finishedReaders.size() == context.currentParallelism()* will be satisfied > and it will trigger {*}sendSwitchSourceEvent{*}(i, currentSourceIndex), > sending a *SwitchSourceEvent* to all SourceReaders. > If a SourceReader receives a SwitchSourceEvent before it finishes reading the > previous source, it will execute {*}currentReader.close(){*}, and some data > may not be fully read, resulting in a partial data loss in the source. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33360) HybridSource fails to clear the previous round's state when switching sources, leading to data loss
[ https://issues.apache.org/jira/browse/FLINK-33360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17779998#comment-17779998 ] Leonard Xu commented on FLINK-33360: Thanks [~fengjiajie] for report this issue, I assigned this ticket to you as you have raised a PR. > HybridSource fails to clear the previous round's state when switching > sources, leading to data loss > --- > > Key: FLINK-33360 > URL: https://issues.apache.org/jira/browse/FLINK-33360 > Project: Flink > Issue Type: Bug > Components: Connectors / HybridSource >Affects Versions: 1.16.2, 1.18.0, 1.17.1 >Reporter: Feng Jiajie >Priority: Major > Labels: pull-request-available > Fix For: 1.7.3, 1.18.1 > > > org.apache.flink.connector.base.source.hybrid.HybridSourceSplitEnumerator: > {code:java} > // track readers that have finished processing for current > enumerator > finishedReaders.add(subtaskId); > if (finishedReaders.size() == context.currentParallelism()) { > LOG.debug("All readers finished, ready to switch > enumerator!"); > if (currentSourceIndex + 1 < sources.size()) { > switchEnumerator(); > // switch all readers prior to sending split assignments > for (int i = 0; i < context.currentParallelism(); i++) { > sendSwitchSourceEvent(i, currentSourceIndex); > } > } > } {code} > I think that *finishedReaders* is used to keep track of all the subTaskIds > that have finished reading the current round of the source. Therefore, in the > *switchEnumerator* function, *finishedReaders* should be cleared: > If it's not cleared, then in the next source reading, whenever any > SourceReader reports a *SourceReaderFinishedEvent* (while other SourceReaders > may not have finished processing in parallel), the condition > *finishedReaders.size() == context.currentParallelism()* will be satisfied > and it will trigger {*}sendSwitchSourceEvent{*}(i, currentSourceIndex), > sending a *SwitchSourceEvent* to all SourceReaders. > If a SourceReader receives a SwitchSourceEvent before it finishes reading the > previous source, it will execute {*}currentReader.close(){*}, and some data > may not be fully read, resulting in a partial data loss in the source. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [FLINK-33309] Add `-Djava.security.manager=allow` [flink]
snuyanzin commented on PR #23547: URL: https://github.com/apache/flink/pull/23547#issuecomment-1781460690 @flinkbot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-33377) When Flink version >= 1.15 and Flink Operator is used, there is a waste of resources when running Flink batch jobs.
[ https://issues.apache.org/jira/browse/FLINK-33377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17779984#comment-17779984 ] hjw commented on FLINK-33377: - [~gsomogyi] Can you take a look? > When Flink version >= 1.15 and Flink Operator is used, there is a waste of > resources when running Flink batch jobs. > --- > > Key: FLINK-33377 > URL: https://issues.apache.org/jira/browse/FLINK-33377 > Project: Flink > Issue Type: Bug > Components: Kubernetes Operator >Affects Versions: kubernetes-operator-1.5.0 >Reporter: hjw >Priority: Major > > According to > [FLINK-29376|https://issues.apache.org/jira/browse/FLINK-29376],SHUTDOWN_ON_APPLICATION_FINISH > always be set false when Flink version 1.15 and above. > However,the JobManager still exists after a Flink batch job runs normally,Is > this a waste of resources? -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-33377) When Flink version >= 1.15 and Flink Operator is used, there is a waste of resources when running Flink batch jobs.
hjw created FLINK-33377: --- Summary: When Flink version >= 1.15 and Flink Operator is used, there is a waste of resources when running Flink batch jobs. Key: FLINK-33377 URL: https://issues.apache.org/jira/browse/FLINK-33377 Project: Flink Issue Type: Bug Components: Kubernetes Operator Affects Versions: kubernetes-operator-1.5.0 Reporter: hjw According to [FLINK-29376|https://issues.apache.org/jira/browse/FLINK-29376],SHUTDOWN_ON_APPLICATION_FINISH always be set false when Flink version 1.15 and above. However,the JobManager still exists after a Flink batch job runs normally,Is this a waste of resources? -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [FLINK-33121] Failed precondition in JobExceptionsHandler due to concurrent global failures [flink]
pgaref commented on code in PR #23440: URL: https://github.com/apache/flink/pull/23440#discussion_r1373418508 ## flink-runtime/src/test/java/org/apache/flink/runtime/scheduler/adaptive/AdaptiveSchedulerTest.java: ## @@ -1619,12 +1619,18 @@ void testExceptionHistoryWithTaskFailureFromStopWithSavepoint() throws Exception @Test void testExceptionHistoryWithTaskConcurrentGlobalFailure() throws Exception { Review Comment: Surprisignly yes, its a `TaskConcurrentGlobalFailure` -- we could also rename to `ConcurrentTaskGlobalFailure` if more clear -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-33121] Failed precondition in JobExceptionsHandler due to concurrent global failures [flink]
dmvk commented on code in PR #23440: URL: https://github.com/apache/flink/pull/23440#discussion_r1373417213 ## flink-runtime/src/test/java/org/apache/flink/runtime/scheduler/adaptive/AdaptiveSchedulerTest.java: ## @@ -1619,12 +1619,18 @@ void testExceptionHistoryWithTaskFailureFromStopWithSavepoint() throws Exception @Test void testExceptionHistoryWithTaskConcurrentGlobalFailure() throws Exception { Review Comment: is the test name still matching what's happening in the test? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Closed] (FLINK-33128) TestValuesRuntimeFunctions$TestValuesLookupFunction does not call open() on converter
[ https://issues.apache.org/jira/browse/FLINK-33128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Gao closed FLINK-33128. --- > TestValuesRuntimeFunctions$TestValuesLookupFunction does not call open() on > converter > - > > Key: FLINK-33128 > URL: https://issues.apache.org/jira/browse/FLINK-33128 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.19.0 >Reporter: Jerome Gagnon >Assignee: Jerome Gagnon >Priority: Major > Labels: pull-request-available > > When using the TestValues connector with nested Row values relying on > BinaryArrayWriter the following exception happen : > {code:java} > java.lang.NullPointerException: Cannot invoke > "org.apache.flink.table.data.writer.BinaryArrayWriter.getNumElements()" > because "this.reuseWriter" is null > at > org.apache.flink.table.data.conversion.ArrayObjectArrayConverter.allocateWriter(ArrayObjectArrayConverter.java:140) > at > org.apache.flink.table.data.conversion.ArrayObjectArrayConverter.toBinaryArrayData(ArrayObjectArrayConverter.java:114) > at > org.apache.flink.table.data.conversion.ArrayObjectArrayConverter.toInternal(ArrayObjectArrayConverter.java:93) > at > org.apache.flink.table.data.conversion.ArrayObjectArrayConverter.toInternal(ArrayObjectArrayConverter.java:40) > at > org.apache.flink.table.data.conversion.DataStructureConverter.toInternalOrNull(DataStructureConverter.java:61) > at > org.apache.flink.table.data.conversion.RowRowConverter.toInternal(RowRowConverter.java:90) > at > org.apache.flink.table.data.conversion.RowRowConverter.toInternal(RowRowConverter.java:37) > at > org.apache.flink.table.data.conversion.DataStructureConverter.toInternalOrNull(DataStructureConverter.java:61) > at > org.apache.flink.table.data.conversion.RowRowConverter.toInternal(RowRowConverter.java:90) > at > org.apache.flink.table.data.conversion.RowRowConverter.toInternal(RowRowConverter.java:37) > at > org.apache.flink.table.data.conversion.DataStructureConverter.toInternalOrNull(DataStructureConverter.java:61) > at > org.apache.flink.table.data.conversion.RowRowConverter.toInternal(RowRowConverter.java:75) > at > org.apache.flink.table.data.conversion.RowRowConverter.toInternal(RowRowConverter.java:37) > at > org.apache.flink.table.data.conversion.DataStructureConverter.toInternalOrNull(DataStructureConverter.java:61) > at > org.apache.flink.table.runtime.connector.source.DataStructureConverterWrapper.toInternal(DataStructureConverterWrapper.java:51) > at > org.apache.flink.table.planner.factories.TestValuesRuntimeFunctions$TestValuesLookupFunction.lambda$indexDataByKey$0(TestValuesRuntimeFunctions.java:626) > at java.base/java.util.ArrayList.forEach(ArrayList.java:1511) > at > org.apache.flink.table.planner.factories.TestValuesRuntimeFunctions$TestValuesLookupFunction.indexDataByKey(TestValuesRuntimeFunctions.java:624) > at > org.apache.flink.table.planner.factories.TestValuesRuntimeFunctions$TestValuesLookupFunction.open(TestValuesRuntimeFunctions.java:601) > at LookupFunction$370.open(Unknown Source) > at > org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:34) > at > org.apache.flink.table.runtime.operators.join.lookup.LookupJoinRunner.open(LookupJoinRunner.java:67) > at > org.apache.flink.table.runtime.operators.join.lookup.LookupJoinWithCalcRunner.open(LookupJoinWithCalcRunner.java:51) > at > org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:34) > at > org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:100) > at > org.apache.flink.streaming.api.operators.ProcessOperator.open(ProcessOperator.java:56) > at > org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:107) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:731) > at > org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.call(StreamTaskActionExecutor.java:100) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:706) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:672) > at > org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:935) > at > org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:904) > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:728) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:550){code} > > This is
[jira] [Resolved] (FLINK-33128) TestValuesRuntimeFunctions$TestValuesLookupFunction does not call open() on converter
[ https://issues.apache.org/jira/browse/FLINK-33128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Gao resolved FLINK-33128. - Resolution: Fixed > TestValuesRuntimeFunctions$TestValuesLookupFunction does not call open() on > converter > - > > Key: FLINK-33128 > URL: https://issues.apache.org/jira/browse/FLINK-33128 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.19.0 >Reporter: Jerome Gagnon >Assignee: Jerome Gagnon >Priority: Major > Labels: pull-request-available > > When using the TestValues connector with nested Row values relying on > BinaryArrayWriter the following exception happen : > {code:java} > java.lang.NullPointerException: Cannot invoke > "org.apache.flink.table.data.writer.BinaryArrayWriter.getNumElements()" > because "this.reuseWriter" is null > at > org.apache.flink.table.data.conversion.ArrayObjectArrayConverter.allocateWriter(ArrayObjectArrayConverter.java:140) > at > org.apache.flink.table.data.conversion.ArrayObjectArrayConverter.toBinaryArrayData(ArrayObjectArrayConverter.java:114) > at > org.apache.flink.table.data.conversion.ArrayObjectArrayConverter.toInternal(ArrayObjectArrayConverter.java:93) > at > org.apache.flink.table.data.conversion.ArrayObjectArrayConverter.toInternal(ArrayObjectArrayConverter.java:40) > at > org.apache.flink.table.data.conversion.DataStructureConverter.toInternalOrNull(DataStructureConverter.java:61) > at > org.apache.flink.table.data.conversion.RowRowConverter.toInternal(RowRowConverter.java:90) > at > org.apache.flink.table.data.conversion.RowRowConverter.toInternal(RowRowConverter.java:37) > at > org.apache.flink.table.data.conversion.DataStructureConverter.toInternalOrNull(DataStructureConverter.java:61) > at > org.apache.flink.table.data.conversion.RowRowConverter.toInternal(RowRowConverter.java:90) > at > org.apache.flink.table.data.conversion.RowRowConverter.toInternal(RowRowConverter.java:37) > at > org.apache.flink.table.data.conversion.DataStructureConverter.toInternalOrNull(DataStructureConverter.java:61) > at > org.apache.flink.table.data.conversion.RowRowConverter.toInternal(RowRowConverter.java:75) > at > org.apache.flink.table.data.conversion.RowRowConverter.toInternal(RowRowConverter.java:37) > at > org.apache.flink.table.data.conversion.DataStructureConverter.toInternalOrNull(DataStructureConverter.java:61) > at > org.apache.flink.table.runtime.connector.source.DataStructureConverterWrapper.toInternal(DataStructureConverterWrapper.java:51) > at > org.apache.flink.table.planner.factories.TestValuesRuntimeFunctions$TestValuesLookupFunction.lambda$indexDataByKey$0(TestValuesRuntimeFunctions.java:626) > at java.base/java.util.ArrayList.forEach(ArrayList.java:1511) > at > org.apache.flink.table.planner.factories.TestValuesRuntimeFunctions$TestValuesLookupFunction.indexDataByKey(TestValuesRuntimeFunctions.java:624) > at > org.apache.flink.table.planner.factories.TestValuesRuntimeFunctions$TestValuesLookupFunction.open(TestValuesRuntimeFunctions.java:601) > at LookupFunction$370.open(Unknown Source) > at > org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:34) > at > org.apache.flink.table.runtime.operators.join.lookup.LookupJoinRunner.open(LookupJoinRunner.java:67) > at > org.apache.flink.table.runtime.operators.join.lookup.LookupJoinWithCalcRunner.open(LookupJoinWithCalcRunner.java:51) > at > org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:34) > at > org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:100) > at > org.apache.flink.streaming.api.operators.ProcessOperator.open(ProcessOperator.java:56) > at > org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:107) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:731) > at > org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.call(StreamTaskActionExecutor.java:100) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:706) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:672) > at > org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:935) > at > org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:904) > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:728) > at org.apache.flink.runtime.taskmanager.Task.run(Task.jav
[jira] [Updated] (FLINK-33128) TestValuesRuntimeFunctions$TestValuesLookupFunction does not call open() on converter
[ https://issues.apache.org/jira/browse/FLINK-33128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Gao updated FLINK-33128: Affects Version/s: 1.19.0 (was: 1.16.2) > TestValuesRuntimeFunctions$TestValuesLookupFunction does not call open() on > converter > - > > Key: FLINK-33128 > URL: https://issues.apache.org/jira/browse/FLINK-33128 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.19.0 >Reporter: Jerome Gagnon >Assignee: Jerome Gagnon >Priority: Major > Labels: pull-request-available > > When using the TestValues connector with nested Row values relying on > BinaryArrayWriter the following exception happen : > {code:java} > java.lang.NullPointerException: Cannot invoke > "org.apache.flink.table.data.writer.BinaryArrayWriter.getNumElements()" > because "this.reuseWriter" is null > at > org.apache.flink.table.data.conversion.ArrayObjectArrayConverter.allocateWriter(ArrayObjectArrayConverter.java:140) > at > org.apache.flink.table.data.conversion.ArrayObjectArrayConverter.toBinaryArrayData(ArrayObjectArrayConverter.java:114) > at > org.apache.flink.table.data.conversion.ArrayObjectArrayConverter.toInternal(ArrayObjectArrayConverter.java:93) > at > org.apache.flink.table.data.conversion.ArrayObjectArrayConverter.toInternal(ArrayObjectArrayConverter.java:40) > at > org.apache.flink.table.data.conversion.DataStructureConverter.toInternalOrNull(DataStructureConverter.java:61) > at > org.apache.flink.table.data.conversion.RowRowConverter.toInternal(RowRowConverter.java:90) > at > org.apache.flink.table.data.conversion.RowRowConverter.toInternal(RowRowConverter.java:37) > at > org.apache.flink.table.data.conversion.DataStructureConverter.toInternalOrNull(DataStructureConverter.java:61) > at > org.apache.flink.table.data.conversion.RowRowConverter.toInternal(RowRowConverter.java:90) > at > org.apache.flink.table.data.conversion.RowRowConverter.toInternal(RowRowConverter.java:37) > at > org.apache.flink.table.data.conversion.DataStructureConverter.toInternalOrNull(DataStructureConverter.java:61) > at > org.apache.flink.table.data.conversion.RowRowConverter.toInternal(RowRowConverter.java:75) > at > org.apache.flink.table.data.conversion.RowRowConverter.toInternal(RowRowConverter.java:37) > at > org.apache.flink.table.data.conversion.DataStructureConverter.toInternalOrNull(DataStructureConverter.java:61) > at > org.apache.flink.table.runtime.connector.source.DataStructureConverterWrapper.toInternal(DataStructureConverterWrapper.java:51) > at > org.apache.flink.table.planner.factories.TestValuesRuntimeFunctions$TestValuesLookupFunction.lambda$indexDataByKey$0(TestValuesRuntimeFunctions.java:626) > at java.base/java.util.ArrayList.forEach(ArrayList.java:1511) > at > org.apache.flink.table.planner.factories.TestValuesRuntimeFunctions$TestValuesLookupFunction.indexDataByKey(TestValuesRuntimeFunctions.java:624) > at > org.apache.flink.table.planner.factories.TestValuesRuntimeFunctions$TestValuesLookupFunction.open(TestValuesRuntimeFunctions.java:601) > at LookupFunction$370.open(Unknown Source) > at > org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:34) > at > org.apache.flink.table.runtime.operators.join.lookup.LookupJoinRunner.open(LookupJoinRunner.java:67) > at > org.apache.flink.table.runtime.operators.join.lookup.LookupJoinWithCalcRunner.open(LookupJoinWithCalcRunner.java:51) > at > org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:34) > at > org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:100) > at > org.apache.flink.streaming.api.operators.ProcessOperator.open(ProcessOperator.java:56) > at > org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:107) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:731) > at > org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.call(StreamTaskActionExecutor.java:100) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:706) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:672) > at > org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:935) > at > org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:904) > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:728) > at org.apa
[jira] [Assigned] (FLINK-33128) TestValuesRuntimeFunctions$TestValuesLookupFunction does not call open() on converter
[ https://issues.apache.org/jira/browse/FLINK-33128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Gao reassigned FLINK-33128: --- Assignee: Jerome Gagnon > TestValuesRuntimeFunctions$TestValuesLookupFunction does not call open() on > converter > - > > Key: FLINK-33128 > URL: https://issues.apache.org/jira/browse/FLINK-33128 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.16.2 >Reporter: Jerome Gagnon >Assignee: Jerome Gagnon >Priority: Major > Labels: pull-request-available > > When using the TestValues connector with nested Row values relying on > BinaryArrayWriter the following exception happen : > {code:java} > java.lang.NullPointerException: Cannot invoke > "org.apache.flink.table.data.writer.BinaryArrayWriter.getNumElements()" > because "this.reuseWriter" is null > at > org.apache.flink.table.data.conversion.ArrayObjectArrayConverter.allocateWriter(ArrayObjectArrayConverter.java:140) > at > org.apache.flink.table.data.conversion.ArrayObjectArrayConverter.toBinaryArrayData(ArrayObjectArrayConverter.java:114) > at > org.apache.flink.table.data.conversion.ArrayObjectArrayConverter.toInternal(ArrayObjectArrayConverter.java:93) > at > org.apache.flink.table.data.conversion.ArrayObjectArrayConverter.toInternal(ArrayObjectArrayConverter.java:40) > at > org.apache.flink.table.data.conversion.DataStructureConverter.toInternalOrNull(DataStructureConverter.java:61) > at > org.apache.flink.table.data.conversion.RowRowConverter.toInternal(RowRowConverter.java:90) > at > org.apache.flink.table.data.conversion.RowRowConverter.toInternal(RowRowConverter.java:37) > at > org.apache.flink.table.data.conversion.DataStructureConverter.toInternalOrNull(DataStructureConverter.java:61) > at > org.apache.flink.table.data.conversion.RowRowConverter.toInternal(RowRowConverter.java:90) > at > org.apache.flink.table.data.conversion.RowRowConverter.toInternal(RowRowConverter.java:37) > at > org.apache.flink.table.data.conversion.DataStructureConverter.toInternalOrNull(DataStructureConverter.java:61) > at > org.apache.flink.table.data.conversion.RowRowConverter.toInternal(RowRowConverter.java:75) > at > org.apache.flink.table.data.conversion.RowRowConverter.toInternal(RowRowConverter.java:37) > at > org.apache.flink.table.data.conversion.DataStructureConverter.toInternalOrNull(DataStructureConverter.java:61) > at > org.apache.flink.table.runtime.connector.source.DataStructureConverterWrapper.toInternal(DataStructureConverterWrapper.java:51) > at > org.apache.flink.table.planner.factories.TestValuesRuntimeFunctions$TestValuesLookupFunction.lambda$indexDataByKey$0(TestValuesRuntimeFunctions.java:626) > at java.base/java.util.ArrayList.forEach(ArrayList.java:1511) > at > org.apache.flink.table.planner.factories.TestValuesRuntimeFunctions$TestValuesLookupFunction.indexDataByKey(TestValuesRuntimeFunctions.java:624) > at > org.apache.flink.table.planner.factories.TestValuesRuntimeFunctions$TestValuesLookupFunction.open(TestValuesRuntimeFunctions.java:601) > at LookupFunction$370.open(Unknown Source) > at > org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:34) > at > org.apache.flink.table.runtime.operators.join.lookup.LookupJoinRunner.open(LookupJoinRunner.java:67) > at > org.apache.flink.table.runtime.operators.join.lookup.LookupJoinWithCalcRunner.open(LookupJoinWithCalcRunner.java:51) > at > org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:34) > at > org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:100) > at > org.apache.flink.streaming.api.operators.ProcessOperator.open(ProcessOperator.java:56) > at > org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:107) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:731) > at > org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.call(StreamTaskActionExecutor.java:100) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:706) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:672) > at > org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:935) > at > org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:904) > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:728) > at org.apache.flink.runtime.taskmanager.Task.r
[jira] [Commented] (FLINK-33128) TestValuesRuntimeFunctions$TestValuesLookupFunction does not call open() on converter
[ https://issues.apache.org/jira/browse/FLINK-33128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17779959#comment-17779959 ] Yun Gao commented on FLINK-33128: - Merged on master via f31770fcf5769052f1ac32a6529de979eaf339a4 > TestValuesRuntimeFunctions$TestValuesLookupFunction does not call open() on > converter > - > > Key: FLINK-33128 > URL: https://issues.apache.org/jira/browse/FLINK-33128 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.16.2 >Reporter: Jerome Gagnon >Assignee: Jerome Gagnon >Priority: Major > Labels: pull-request-available > > When using the TestValues connector with nested Row values relying on > BinaryArrayWriter the following exception happen : > {code:java} > java.lang.NullPointerException: Cannot invoke > "org.apache.flink.table.data.writer.BinaryArrayWriter.getNumElements()" > because "this.reuseWriter" is null > at > org.apache.flink.table.data.conversion.ArrayObjectArrayConverter.allocateWriter(ArrayObjectArrayConverter.java:140) > at > org.apache.flink.table.data.conversion.ArrayObjectArrayConverter.toBinaryArrayData(ArrayObjectArrayConverter.java:114) > at > org.apache.flink.table.data.conversion.ArrayObjectArrayConverter.toInternal(ArrayObjectArrayConverter.java:93) > at > org.apache.flink.table.data.conversion.ArrayObjectArrayConverter.toInternal(ArrayObjectArrayConverter.java:40) > at > org.apache.flink.table.data.conversion.DataStructureConverter.toInternalOrNull(DataStructureConverter.java:61) > at > org.apache.flink.table.data.conversion.RowRowConverter.toInternal(RowRowConverter.java:90) > at > org.apache.flink.table.data.conversion.RowRowConverter.toInternal(RowRowConverter.java:37) > at > org.apache.flink.table.data.conversion.DataStructureConverter.toInternalOrNull(DataStructureConverter.java:61) > at > org.apache.flink.table.data.conversion.RowRowConverter.toInternal(RowRowConverter.java:90) > at > org.apache.flink.table.data.conversion.RowRowConverter.toInternal(RowRowConverter.java:37) > at > org.apache.flink.table.data.conversion.DataStructureConverter.toInternalOrNull(DataStructureConverter.java:61) > at > org.apache.flink.table.data.conversion.RowRowConverter.toInternal(RowRowConverter.java:75) > at > org.apache.flink.table.data.conversion.RowRowConverter.toInternal(RowRowConverter.java:37) > at > org.apache.flink.table.data.conversion.DataStructureConverter.toInternalOrNull(DataStructureConverter.java:61) > at > org.apache.flink.table.runtime.connector.source.DataStructureConverterWrapper.toInternal(DataStructureConverterWrapper.java:51) > at > org.apache.flink.table.planner.factories.TestValuesRuntimeFunctions$TestValuesLookupFunction.lambda$indexDataByKey$0(TestValuesRuntimeFunctions.java:626) > at java.base/java.util.ArrayList.forEach(ArrayList.java:1511) > at > org.apache.flink.table.planner.factories.TestValuesRuntimeFunctions$TestValuesLookupFunction.indexDataByKey(TestValuesRuntimeFunctions.java:624) > at > org.apache.flink.table.planner.factories.TestValuesRuntimeFunctions$TestValuesLookupFunction.open(TestValuesRuntimeFunctions.java:601) > at LookupFunction$370.open(Unknown Source) > at > org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:34) > at > org.apache.flink.table.runtime.operators.join.lookup.LookupJoinRunner.open(LookupJoinRunner.java:67) > at > org.apache.flink.table.runtime.operators.join.lookup.LookupJoinWithCalcRunner.open(LookupJoinWithCalcRunner.java:51) > at > org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:34) > at > org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:100) > at > org.apache.flink.streaming.api.operators.ProcessOperator.open(ProcessOperator.java:56) > at > org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:107) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:731) > at > org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.call(StreamTaskActionExecutor.java:100) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:706) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:672) > at > org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:935) > at > org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:904) > at org.apache.flink.runtime.taskm
Re: [PR] [FLINK-33128] Add converter.open() method call on TestValuesRuntimeFunctions [flink]
gaoyunhaii closed pull request #23453: [FLINK-33128] Add converter.open() method call on TestValuesRuntimeFunctions URL: https://github.com/apache/flink/pull/23453 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-33375] Implement restore test base [flink]
flinkbot commented on PR #23603: URL: https://github.com/apache/flink/pull/23603#issuecomment-1781372184 ## CI report: * d8d49116263be7f4819fec276692ee59ef5f9003 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Assigned] (FLINK-33359) Kubernetes operator supports compiling with Java 17
[ https://issues.apache.org/jira/browse/FLINK-33359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Nuyanzin reassigned FLINK-33359: --- Assignee: Sergey Nuyanzin > Kubernetes operator supports compiling with Java 17 > --- > > Key: FLINK-33359 > URL: https://issues.apache.org/jira/browse/FLINK-33359 > Project: Flink > Issue Type: Improvement > Components: Kubernetes Operator >Reporter: Rui Fan >Assignee: Sergey Nuyanzin >Priority: Critical > Labels: pull-request-available > Fix For: kubernetes-operator-1.7.0 > > > In the voting mailing list for flink-kubernetes-operator version 1.6.1, > Thomas mentioned Kubernetes operator cannot compile with java 17. > Offline discussion with [~gyfora] , we hope Kubernetes operator supports > compiling with Java 17 as a critical ticket in 1.7.0. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33375) Add a RestoreTestBase
[ https://issues.apache.org/jira/browse/FLINK-33375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-33375: --- Labels: pull-request-available (was: ) > Add a RestoreTestBase > - > > Key: FLINK-33375 > URL: https://issues.apache.org/jira/browse/FLINK-33375 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Planner >Reporter: Dawid Wysakowicz >Assignee: Dawid Wysakowicz >Priority: Major > Labels: pull-request-available > Fix For: 1.19.0 > > > Add a test base class for writing restore tests. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-33359) Kubernetes operator supports compiling with Java 17
[ https://issues.apache.org/jira/browse/FLINK-33359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gyula Fora closed FLINK-33359. -- Resolution: Fixed Merged to main: d0ee0e947badcba7ed351a3ce5fdf95ee5b79847 0b6ff5a9cbfd13dddea45c82c470e44d0139ecc7 > Kubernetes operator supports compiling with Java 17 > --- > > Key: FLINK-33359 > URL: https://issues.apache.org/jira/browse/FLINK-33359 > Project: Flink > Issue Type: Improvement > Components: Kubernetes Operator >Reporter: Rui Fan >Priority: Critical > Labels: pull-request-available > > In the voting mailing list for flink-kubernetes-operator version 1.6.1, > Thomas mentioned Kubernetes operator cannot compile with java 17. > Offline discussion with [~gyfora] , we hope Kubernetes operator supports > compiling with Java 17 as a critical ticket in 1.7.0. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33359) Kubernetes operator supports compiling with Java 17
[ https://issues.apache.org/jira/browse/FLINK-33359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gyula Fora updated FLINK-33359: --- Fix Version/s: kubernetes-operator-1.7.0 > Kubernetes operator supports compiling with Java 17 > --- > > Key: FLINK-33359 > URL: https://issues.apache.org/jira/browse/FLINK-33359 > Project: Flink > Issue Type: Improvement > Components: Kubernetes Operator >Reporter: Rui Fan >Priority: Critical > Labels: pull-request-available > Fix For: kubernetes-operator-1.7.0 > > > In the voting mailing list for flink-kubernetes-operator version 1.6.1, > Thomas mentioned Kubernetes operator cannot compile with java 17. > Offline discussion with [~gyfora] , we hope Kubernetes operator supports > compiling with Java 17 as a critical ticket in 1.7.0. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[PR] [FLINK-33375] Implement restore test base [flink]
dawidwys opened a new pull request, #23603: URL: https://github.com/apache/flink/pull/23603 ## What is the purpose of the change This introduces a test base for writing restore tests. ## Verifying this change It contains a single test as an example for a simple calc. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / **no**) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**) - The serializers: (yes / **no** / don't know) - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / **no** / don't know) - The S3 file system connector: (yes / **no** / don't know) ## Documentation - Does this pull request introduce a new feature? (**yes** / no) - If yes, how is the feature documented? (not applicable / docs / **JavaDocs** / not documented) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-33359][FLINK-25002] add-opens jvm args to support jdk17 for kubernetes operator [flink-kubernetes-operator]
gyfora merged PR #691: URL: https://github.com/apache/flink-kubernetes-operator/pull/691 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-33359][FLINK-25002] add-opens jvm args to support jdk17 for kubernetes operator [flink-kubernetes-operator]
gyfora commented on PR #691: URL: https://github.com/apache/flink-kubernetes-operator/pull/691#issuecomment-1781354452 looks good! thank you :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-29549) Add Aws Glue Catalog support in Flink
[ https://issues.apache.org/jira/browse/FLINK-29549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17779953#comment-17779953 ] Danny Cranmer commented on FLINK-29549: --- [~martijnvisser] yes, but have not had capacity to finish the code review recently. I will get back to it but appreciate support if anyone has spare cycles > Add Aws Glue Catalog support in Flink > - > > Key: FLINK-29549 > URL: https://issues.apache.org/jira/browse/FLINK-29549 > Project: Flink > Issue Type: New Feature > Components: Connectors / AWS, Connectors / Common >Reporter: Samrat Deb >Assignee: Samrat Deb >Priority: Major > Labels: pull-request-available > > Currently , Flink sql hive connector support hive metastore as hardcoded > metastore-uri. > It would be good if flink provide feature to have configurable metastore (eg. > AWS glue). > This would help many Users of flink who uses AWS > Glue([https://docs.aws.amazon.com/glue/latest/dg/start-data-catalog.html]) as > their common(unified) catalog and process data. > cc [~prabhujoseph] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-29549) Add Aws Glue Catalog support in Flink
[ https://issues.apache.org/jira/browse/FLINK-29549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Cranmer reassigned FLINK-29549: - Assignee: Samrat Deb > Add Aws Glue Catalog support in Flink > - > > Key: FLINK-29549 > URL: https://issues.apache.org/jira/browse/FLINK-29549 > Project: Flink > Issue Type: New Feature > Components: Connectors / AWS, Connectors / Common >Reporter: Samrat Deb >Assignee: Samrat Deb >Priority: Major > Labels: pull-request-available > > Currently , Flink sql hive connector support hive metastore as hardcoded > metastore-uri. > It would be good if flink provide feature to have configurable metastore (eg. > AWS glue). > This would help many Users of flink who uses AWS > Glue([https://docs.aws.amazon.com/glue/latest/dg/start-data-catalog.html]) as > their common(unified) catalog and process data. > cc [~prabhujoseph] -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [FLINK-33371] Make TestValues sinks return results as Rows [flink]
twalthr commented on code in PR #23601: URL: https://github.com/apache/flink/pull/23601#discussion_r1373348800 ## flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/factories/TestValuesRuntimeFunctions.java: ## @@ -522,11 +540,13 @@ public void invoke(RowData value, Context context) throws Exception { synchronized (LOCK) { localRawResult.add(row); if (kind == RowKind.INSERT || kind == RowKind.UPDATE_AFTER) { -row.setKind(RowKind.INSERT); -localRetractResult.add(row); +final Row retractRow = Row.copy(row); Review Comment: no worries :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-33376) Add AuthInfo config option for Zookeeper configuration
[ https://issues.apache.org/jira/browse/FLINK-33376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17779947#comment-17779947 ] Oleksandr Nitavskyi commented on FLINK-33376: - For implementation we could add an additional Map config option and Flink users will be able to pass AuthInfo. There is some miss-alignment, AuthInfo type is while Map is . As simplest workaround we get accept on Flink config interface and use _getBytes()_ method in order to adapt interfaces. > Add AuthInfo config option for Zookeeper configuration > -- > > Key: FLINK-33376 > URL: https://issues.apache.org/jira/browse/FLINK-33376 > Project: Flink > Issue Type: Improvement >Reporter: Oleksandr Nitavskyi >Priority: Major > > In certain cases ZooKeeper requires additional Authentication information. > For example list of valid [names for > ensemble|https://zookeeper.apache.org/doc/r3.8.0/zookeeperAdmin.html#:~:text=for%20secure%20authentication.-,zookeeper.ensembleAuthName,-%3A%20(Java%20system%20property] > in order to prevent the accidental connecting to a wrong ensemble. > Curator allows to add additional AuthInfo object for such configuration. Thus > it would be useful to add one more additional Map property which would allow > to pass AuthInfo objects during Curator client creation. > *Acceptance Criteria:* For Flink users it is possible to configure auth info > list for Curator framework client. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-33376) Add AuthInfo config option for Zookeeper configuration
Oleksandr Nitavskyi created FLINK-33376: --- Summary: Add AuthInfo config option for Zookeeper configuration Key: FLINK-33376 URL: https://issues.apache.org/jira/browse/FLINK-33376 Project: Flink Issue Type: Improvement Reporter: Oleksandr Nitavskyi In certain cases ZooKeeper requires additional Authentication information. For example list of valid [names for ensemble|https://zookeeper.apache.org/doc/r3.8.0/zookeeperAdmin.html#:~:text=for%20secure%20authentication.-,zookeeper.ensembleAuthName,-%3A%20(Java%20system%20property] in order to prevent the accidental connecting to a wrong ensemble. Curator allows to add additional AuthInfo object for such configuration. Thus it would be useful to add one more additional Map property which would allow to pass AuthInfo objects during Curator client creation. *Acceptance Criteria:* For Flink users it is possible to configure auth info list for Curator framework client. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [FLINK-33371] Make TestValues sinks return results as Rows [flink]
dawidwys commented on code in PR #23601: URL: https://github.com/apache/flink/pull/23601#discussion_r1373338635 ## flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/factories/TestValuesRuntimeFunctions.java: ## @@ -522,11 +540,13 @@ public void invoke(RowData value, Context context) throws Exception { synchronized (LOCK) { localRawResult.add(row); if (kind == RowKind.INSERT || kind == RowKind.UPDATE_AFTER) { -row.setKind(RowKind.INSERT); -localRetractResult.add(row); +final Row retractRow = Row.copy(row); Review Comment: It's late :( I thought there is `else if` not `else` in the second branch 🤦 ## flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/factories/TestValuesRuntimeFunctions.java: ## @@ -522,11 +540,13 @@ public void invoke(RowData value, Context context) throws Exception { synchronized (LOCK) { localRawResult.add(row); if (kind == RowKind.INSERT || kind == RowKind.UPDATE_AFTER) { -row.setKind(RowKind.INSERT); -localRetractResult.add(row); +final Row retractRow = Row.copy(row); Review Comment: It's late :( I thought there is `else if` not `else` in the second branch 🤦 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-33371] Make TestValues sinks return results as Rows [flink]
twalthr commented on code in PR #23601: URL: https://github.com/apache/flink/pull/23601#discussion_r1373335550 ## flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/factories/TestValuesRuntimeFunctions.java: ## @@ -522,11 +540,13 @@ public void invoke(RowData value, Context context) throws Exception { synchronized (LOCK) { localRawResult.add(row); if (kind == RowKind.INSERT || kind == RowKind.UPDATE_AFTER) { -row.setKind(RowKind.INSERT); -localRetractResult.add(row); +final Row retractRow = Row.copy(row); Review Comment: deduplicate code and only copy and set kind at a single location -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [hotfix] Update NOTICE files to reflect year 2023 [flink-connector-elasticsearch]
boring-cyborg[bot] commented on PR #75: URL: https://github.com/apache/flink-connector-elasticsearch/pull/75#issuecomment-1781294565 Awesome work, congrats on your first merged pull request! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [hotfix] Update NOTICE files to reflect year 2023 [flink-connector-elasticsearch]
MartijnVisser merged PR #75: URL: https://github.com/apache/flink-connector-elasticsearch/pull/75 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-25857] Add committer metrics to track the status of committables [flink]
pvary commented on PR #23555: URL: https://github.com/apache/flink/pull/23555#issuecomment-1781292345 @flinkbot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-29549) Add Aws Glue Catalog support in Flink
[ https://issues.apache.org/jira/browse/FLINK-29549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17779939#comment-17779939 ] Martijn Visser commented on FLINK-29549: [~dannycranmer] Do you have eyes on this? > Add Aws Glue Catalog support in Flink > - > > Key: FLINK-29549 > URL: https://issues.apache.org/jira/browse/FLINK-29549 > Project: Flink > Issue Type: New Feature > Components: Connectors / AWS, Connectors / Common >Reporter: Samrat Deb >Priority: Major > Labels: pull-request-available > > Currently , Flink sql hive connector support hive metastore as hardcoded > metastore-uri. > It would be good if flink provide feature to have configurable metastore (eg. > AWS glue). > This would help many Users of flink who uses AWS > Glue([https://docs.aws.amazon.com/glue/latest/dg/start-data-catalog.html]) as > their common(unified) catalog and process data. > cc [~prabhujoseph] -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [docs] Add a README to the flink-autoscaler module [flink-kubernetes-operator]
mxm merged PR #694: URL: https://github.com/apache/flink-kubernetes-operator/pull/694 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-28050][connectors] Migrate StreamExecutionEnvironment#fromElements() implementation to FLIP-27 Source API [flink]
afedulov commented on PR #23553: URL: https://github.com/apache/flink/pull/23553#issuecomment-1781183357 @zentol I made two major changes as per our discussions above: - https://github.com/apache/flink/commit/2712c1813ca6420905e06b9e417de0eb61d586d9 - direct type passing without the requirement to use returns() (please see my [comment](https://github.com/apache/flink/pull/23553#discussion_r1372403767) above) - https://github.com/apache/flink/pull/23553/commits/78cb92bc86e9dded9bf2458de119d549be7ad281 - allow parallel execution of fromElements Sources The second one might need some additional test fixes, but I cannot get to them at the moment because of the `japicmp` failures: ``` Failed to execute goal io.github.zentol.japicmp:japicmp-maven-plugin:0.17.1.1_m325:cmp (default) on project flink-streaming-java: There is at least one incompatibility: org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.fromElements(org.apache.flink.api.common.typeinfo.TypeInformation,java.lang.Object[]):CLASS_GENERIC_TEMPLATE_CHANGED -> [Help 1] ``` This is the diff: ``` +++* NEW METHOD: PUBLIC(+) FINAL(+) org.apache.flink.streaming.api.datastream.DataStreamSource fromElements(org.apache.flink.api.common.typeinfo.TypeInformation, java.lang.Object[]) +++ NEW ANNOTATION: java.lang.SafeVarargs GENERIC TEMPLATES: +++ OUT:java.lang.Object ``` What is the issue with adding the new method? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] add announcement blog post for Flink 1.18 [flink-web]
knaufk merged PR #680: URL: https://github.com/apache/flink-web/pull/680 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-33375) Add a RestoreTestBase
Dawid Wysakowicz created FLINK-33375: Summary: Add a RestoreTestBase Key: FLINK-33375 URL: https://issues.apache.org/jira/browse/FLINK-33375 Project: Flink Issue Type: Sub-task Components: Table SQL / Planner Reporter: Dawid Wysakowicz Assignee: Dawid Wysakowicz Fix For: 1.19.0 Add a test base class for writing restore tests. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] test [flink]
JunRuiLee commented on PR #23598: URL: https://github.com/apache/flink/pull/23598#issuecomment-1781151972 @flinkbot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-33374) Execute REMOVE JAR command failed via SQL gateway
Xianxun Ye created FLINK-33374: -- Summary: Execute REMOVE JAR command failed via SQL gateway Key: FLINK-33374 URL: https://issues.apache.org/jira/browse/FLINK-33374 Project: Flink Issue Type: Improvement Components: Table SQL / Gateway Affects Versions: 1.18.0 Reporter: Xianxun Ye Execute the below steps could reproduce the exception: At first, I added a specified jar to the classloader via the ADD JAR command, and using the SHOW JARS command also displayed the jars. But the REMOVE JAR command is not supported right now. {code:java} Caused by: java.lang.UnsupportedOperationException: SQL Gateway doesn't support REMOVE JAR syntax now. at org.apache.flink.table.gateway.service.operation.OperationExecutor.callRemoveJar(OperationExecutor.java:550) ~[flink-sql-gateway-1.18.0.jar:1.18.0] at org.apache.flink.table.gateway.service.operation.OperationExecutor.executeOperation(OperationExecutor.java:442) ~[flink-sql-gateway-1.18.0.jar:1.18.0] at org.apache.flink.table.gateway.service.operation.OperationExecutor.executeStatement(OperationExecutor.java:207) ~[flink-sql-gateway-1.18.0.jar:1.18.0] at org.apache.flink.table.gateway.service.SqlGatewayServiceImpl.lambda$executeStatement$1(SqlGatewayServiceImpl.java:212) ~[flink-sql-gateway-1.18.0.jar:1.18.0] at org.apache.flink.table.gateway.service.operation.OperationManager.lambda$submitOperation$1(OperationManager.java:119) ~[flink-sql-gateway-1.18.0.jar:1.18.0] {code} It seems the RemoveJarOperation is ignored. https://github.com/apache/flink/blob/release-1.18.0/flink-table/flink-sql-gateway/src/main/java/org/apache/flink/table/gateway/service/operation/OperationExecutor.java#L550 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-33316) Avoid unnecessary heavy getStreamOperatorFactory
[ https://issues.apache.org/jira/browse/FLINK-33316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17778510#comment-17778510 ] Rui Fan edited comment on FLINK-33316 at 10/26/23 1:31 PM: --- The change is subtle, so I push this commit directly. Merged 1.19: a2681f6a85aaad21179f91e03a91b4a05158841e and 0388b760fc66975c70f797ad07f2e073738a7171 Merged 1.17: 024fa4776d0246a283a70743f1ce3c04981daeb9 Merged 1.18: 0dd3b4ce9f0b9f193926445bf9c1f8579fa86161 was (Author: fanrui): The change is subtle, so I push this commit directly. Merged 1.17: 024fa4776d0246a283a70743f1ce3c04981daeb9 Merged 1.18: 0dd3b4ce9f0b9f193926445bf9c1f8579fa86161 > Avoid unnecessary heavy getStreamOperatorFactory > > > Key: FLINK-33316 > URL: https://issues.apache.org/jira/browse/FLINK-33316 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Configuration >Affects Versions: 1.17.0, 1.18.0 >Reporter: Rui Fan >Assignee: Rui Fan >Priority: Major > Labels: pull-request-available > Fix For: 1.17.2, 1.19.0, 1.18.1 > > > See FLINK-33315 for details. > This Jira focus on avoid unnecessary heavy getStreamOperatorFactory, it can > optimize the memory and cpu cost of Replica_2 in FLINK-33315. > Solution: We can store the serializedUdfClassName at StreamConfig, and using > the getStreamOperatorFactoryClassName instead of the heavy > getStreamOperatorFactory in OperatorChain#getOperatorRecordsOutCounter. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [FLINK-33316][runtime] Using SERIALIZED_UDF_CLASS instead of SERIALIZED_UDF_CLASS_NAME [flink]
1996fanrui merged PR #23597: URL: https://github.com/apache/flink/pull/23597 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-33164] Support write option sink.ignore-null-value [flink-connector-hbase]
Tan-JiaLiang commented on PR #21: URL: https://github.com/apache/flink-connector-hbase/pull/21#issuecomment-1781118161 @ferenc-csaky @MartijnVisser Thanks for the patience guide. Making sense to me. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] test [flink]
JunRuiLee commented on PR #23598: URL: https://github.com/apache/flink/pull/23598#issuecomment-1781116611 @flinkbot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-33373][build] Capture build scans on ge.apache.org to benefit from deep build insights [flink]
flinkbot commented on PR #23602: URL: https://github.com/apache/flink/pull/23602#issuecomment-1781113268 ## CI report: * 416aec02e4876e743df8c76656ad4cf71214cf69 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-33373][build] Capture build scans on ge.apache.org to benefit from deep build insights [flink]
clayburn commented on code in PR #23602: URL: https://github.com/apache/flink/pull/23602#discussion_r1373154854 ## tools/azure-pipelines/build-apache-repo.yml: ## @@ -55,6 +55,7 @@ variables: SECRET_S3_BUCKET: $[variables.IT_CASE_S3_BUCKET] SECRET_S3_ACCESS_KEY: $[variables.IT_CASE_S3_ACCESS_KEY] SECRET_S3_SECRET_KEY: $[variables.IT_CASE_S3_SECRET_KEY] + SECRET_GE_ACCESS_KEY : $[variables.GE_ACCESS_KEY] Review Comment: To whoever reviews this, this is an access key that can be provided by the ASF Infra team directly to an individual that can set it in your Azure DevOps organization. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-33373) Capture build scans on ge.apache.org to benefit from deep build insights
[ https://issues.apache.org/jira/browse/FLINK-33373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-33373: --- Labels: pull-request-available (was: ) > Capture build scans on ge.apache.org to benefit from deep build insights > > > Key: FLINK-33373 > URL: https://issues.apache.org/jira/browse/FLINK-33373 > Project: Flink > Issue Type: Improvement > Components: Build System / CI >Reporter: Clay Johnson >Priority: Minor > Labels: pull-request-available > > This improvement will enhance the functionality of the Flink build by > publishing build scans to [ge.apache.org|https://ge.apache.org/], hosted by > the Apache Software Foundation and run in partnership between the ASF and > Gradle. This Develocity instance has all features and extensions enabled and > is freely available for use by the Apache Flink project and all other Apache > projects. > On this Develocity instance, Apache Flink will have access not only to all of > the published build scans but other aggregate data features such as: > * Dashboards to view all historical build scans, along with performance > trends over time > * Build failure analytics for enhanced investigation and diagnosis of build > failures > * Test failure analytics to better understand trends and causes around slow, > failing, and flaky tests -- This message was sent by Atlassian Jira (v8.20.10#820010)
[PR] [FLINK-33373][build] Capture build scans on ge.apache.org to benefit from deep build insights [flink]
clayburn opened a new pull request, #23602: URL: https://github.com/apache/flink/pull/23602 It was nice meeting some of you at the Gradle booth at Community over Code. We discussed Develocity with some of you, and this would be the PR that enables it. ## What is the purpose of the change The build scans of the Apache Flink project are published to the Develocity instance at [ge.apache.org](https://ge.apache.org/), hosted by the Apache Software Foundation and run in partnership between the ASF and Gradle. This Develocity instance has all features and extensions enabled and is freely available for use by the Apache Flink project and all other Apache projects. On this Develocity instance, Apache Flink will have access not only to all of the published build scans but other aggregate data features such as: - Dashboards to view all historical build scans, along with performance trends over time - Build failure analytics for enhanced investigation and diagnosis of build failures - Test failure analytics to better understand trends and causes around slow, failing, and flaky tests Please let me know if there are any questions about the value of Develocity or the changes in this pull request and I’d be happy to address them. ## Brief change log - Adds and configures a Maven extension to publish build scans to ge.apache.org ## Verifying this change This change is a trivial rework / code cleanup without any test coverage. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no (only the build time extensions) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? no -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[PR] [docs] Add a README to the flink-autoscaler module [flink-kubernetes-operator]
mxm opened a new pull request, #694: URL: https://github.com/apache/flink-kubernetes-operator/pull/694 This adds a README to the flink-autoscaler module, to clarify its purpose and usage. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-33373) Capture build scans on ge.apache.org to benefit from deep build insights
Clay Johnson created FLINK-33373: Summary: Capture build scans on ge.apache.org to benefit from deep build insights Key: FLINK-33373 URL: https://issues.apache.org/jira/browse/FLINK-33373 Project: Flink Issue Type: Improvement Components: Build System / CI Reporter: Clay Johnson This improvement will enhance the functionality of the Flink build by publishing build scans to [ge.apache.org|https://ge.apache.org/], hosted by the Apache Software Foundation and run in partnership between the ASF and Gradle. This Develocity instance has all features and extensions enabled and is freely available for use by the Apache Flink project and all other Apache projects. On this Develocity instance, Apache Flink will have access not only to all of the published build scans but other aggregate data features such as: * Dashboards to view all historical build scans, along with performance trends over time * Build failure analytics for enhanced investigation and diagnosis of build failures * Test failure analytics to better understand trends and causes around slow, failing, and flaky tests -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-33372) Cryptic exception for a sub query in a CompiledPlan
Dawid Wysakowicz created FLINK-33372: Summary: Cryptic exception for a sub query in a CompiledPlan Key: FLINK-33372 URL: https://issues.apache.org/jira/browse/FLINK-33372 Project: Flink Issue Type: Bug Components: Table SQL / Planner Affects Versions: 1.18.0 Reporter: Dawid Wysakowicz SQL statements with a SUBQUERY can be compiled to a plan, but such plans can not be executed and they fail with a cryptic exception. Example: {code} final CompiledPlan compiledPlan = tEnv.compilePlanSql("insert into MySink SELECT * FROM LATERAL TABLE(func1(select c from MyTable))"); tEnv.loadPlan(PlanReference.fromJsonString(compiledPlan.asJsonString())).execute(); {code} fails with: {code} org.apache.flink.table.planner.codegen.CodeGenException: Unsupported call: $SCALAR_QUERY() If you think this function should be supported, you can create an issue and start a discussion for it. {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [FLINK-33359][FLINK-25002] add-opens jvm args to support jdk17 for kubernetes operator [flink-kubernetes-operator]
snuyanzin commented on code in PR #691: URL: https://github.com/apache/flink-kubernetes-operator/pull/691#discussion_r1373126000 ## .github/workflows/ci.yml: ## @@ -24,12 +24,15 @@ jobs: test_ci: runs-on: ubuntu-latest name: test_ci +strategy: + matrix: +java-version: [ 11, 17 ] Review Comment: ok, now done for 1.18 only -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-33371] Make TestValues sinks return results as Rows [flink]
twalthr commented on code in PR #23601: URL: https://github.com/apache/flink/pull/23601#discussion_r1373121357 ## flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/factories/TestValuesRuntimeFunctions.java: ## @@ -283,22 +293,29 @@ private abstract static class AbstractExactlyOnceSink extends RichSinkFunction rawResultState; +protected transient List localRawResult; -protected transient ListState rawResultState; -protected transient List localRawResult; - -protected AbstractExactlyOnceSink(String tableName) { +protected AbstractExactlyOnceSink( +String tableName, DataType consumedDataType, DataStructureConverter converter) { this.tableName = tableName; +this.consumedDataType = consumedDataType; +this.converter = converter; } @Override public void initializeState(FunctionInitializationContext context) throws Exception { this.rawResultState = context.getOperatorStateStore() -.getListState(new ListStateDescriptor<>("sink-results", Types.STRING)); +.getListState( +new ListStateDescriptor<>( +"sink-results", + ExternalSerializer.of(consumedDataType, true))); Review Comment: Let's use the restore tests for deeply testing this utility then. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Closed] (FLINK-25809) Introduce test infra for building FLIP-190 tests
[ https://issues.apache.org/jira/browse/FLINK-25809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timo Walther closed FLINK-25809. Fix Version/s: 1.19.0 Resolution: Fixed Fixed in master: 347e4ca6c265334a35969d1c8358ff5a9f066e92 > Introduce test infra for building FLIP-190 tests > - > > Key: FLINK-25809 > URL: https://issues.apache.org/jira/browse/FLINK-25809 > Project: Flink > Issue Type: Sub-task >Reporter: Francesco Guardiani >Assignee: Timo Walther >Priority: Major > Labels: pull-request-available > Fix For: 1.19.0 > > > The FLIP-190 requires to build a new test infra. For this test infra, we want > to define test cases and data once, and then for each case we want to execute > the following: > * Integration test that roughly does {{create plan -> execute job -> trigger > savepoint -> stop job -> restore plan -> restore savepoint -> execute job -> > stop and assert}}. Plan and savepoint should be commited to git, so running > this tests when a plan and savepoint is available will not regenerate plan > and savepoint. > * Change detection test to check that for the defined test cases, the plan > hasn't been changed. Similar to the existing {{JsonPlanITCase}} tests. > * Completeness of tests/Coverage, that is count how many times ExecNodes > (including versions) are used in the test cases. Fail if an ExecNode version > is never covered. > Other requirements includes to "version" the test cases, that is for each > test case we can retain different versions of the plan and savepoint, in > order to make sure that after we introduce a new plan change, the old plan > still continues to run -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [FLINK-33371] Make TestValues sinks return results as Rows [flink]
dawidwys commented on code in PR #23601: URL: https://github.com/apache/flink/pull/23601#discussion_r1373034259 ## flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/factories/TestValuesRuntimeFunctions.java: ## @@ -283,22 +293,29 @@ private abstract static class AbstractExactlyOnceSink extends RichSinkFunction rawResultState; +protected transient List localRawResult; -protected transient ListState rawResultState; -protected transient List localRawResult; - -protected AbstractExactlyOnceSink(String tableName) { +protected AbstractExactlyOnceSink( +String tableName, DataType consumedDataType, DataStructureConverter converter) { this.tableName = tableName; +this.consumedDataType = consumedDataType; +this.converter = converter; } @Override public void initializeState(FunctionInitializationContext context) throws Exception { this.rawResultState = context.getOperatorStateStore() -.getListState(new ListStateDescriptor<>("sink-results", Types.STRING)); +.getListState( +new ListStateDescriptor<>( +"sink-results", + ExternalSerializer.of(consumedDataType, true))); Review Comment: This would be tested only if the state is snapshotted. Apparently, no tests do that. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-25809][table-api-java] Add table test program infrastructure [flink]
twalthr merged PR #23584: URL: https://github.com/apache/flink/pull/23584 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-33164] Support write option sink.ignore-null-value [flink-connector-hbase]
MartijnVisser commented on PR #21: URL: https://github.com/apache/flink-connector-hbase/pull/21#issuecomment-1781047396 It's recommended to have multiple commits in a lot of situations, see https://flink.apache.org/how-to-contribute/code-style-and-quality-pull-requests/ - We can squash before merging -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-30481][FLIP-277] GlueCatalog Implementation [flink-connector-aws]
Samrat002 commented on PR #47: URL: https://github.com/apache/flink-connector-aws/pull/47#issuecomment-1781024903 [gentle ping] @dannycranmer, i have addressed to all the review comments . Please review the PR whenever time -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-33164] Support write option sink.ignore-null-value [flink-connector-hbase]
ferenc-csaky commented on PR #21: URL: https://github.com/apache/flink-connector-hbase/pull/21#issuecomment-1781023023 > @ferenc-csaky I rebase the main branch and squash all my changes. > > And one more question master. there will be multiple commits during code reviews, Should I squash all commit into one and apply force push to the origin? Or keep it and wait for the maintainer to squash and merge? I think you can follow whichever path is more comfortable for you and for the specific case at hand. If there are multiple commits in the PR the maintainer will squash it, no problem. @MartijnVisser any thoughts on this? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-29549) Add Aws Glue Catalog support in Flink
[ https://issues.apache.org/jira/browse/FLINK-29549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17779870#comment-17779870 ] Samrat Deb commented on FLINK-29549: please help reviewing the PR [https://github.com/apache/flink-connector-aws/pull/47] for glue Catalog. PR is open for very long time time. > Add Aws Glue Catalog support in Flink > - > > Key: FLINK-29549 > URL: https://issues.apache.org/jira/browse/FLINK-29549 > Project: Flink > Issue Type: New Feature > Components: Connectors / AWS, Connectors / Common >Reporter: Samrat Deb >Priority: Major > Labels: pull-request-available > > Currently , Flink sql hive connector support hive metastore as hardcoded > metastore-uri. > It would be good if flink provide feature to have configurable metastore (eg. > AWS glue). > This would help many Users of flink who uses AWS > Glue([https://docs.aws.amazon.com/glue/latest/dg/start-data-catalog.html]) as > their common(unified) catalog and process data. > cc [~prabhujoseph] -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [FLINK-33304] Introduce mutationBuffer to resolve the mutation write conflicts problem [flink-connector-hbase]
ferenc-csaky commented on PR #30: URL: https://github.com/apache/flink-connector-hbase/pull/30#issuecomment-1780977002 @Tan-JiaLiang that's even better 👍 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-33371] Make TestValues sinks return results as Rows [flink]
dawidwys commented on code in PR #23601: URL: https://github.com/apache/flink/pull/23601#discussion_r1373034259 ## flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/factories/TestValuesRuntimeFunctions.java: ## @@ -283,22 +293,29 @@ private abstract static class AbstractExactlyOnceSink extends RichSinkFunction rawResultState; +protected transient List localRawResult; -protected transient ListState rawResultState; -protected transient List localRawResult; - -protected AbstractExactlyOnceSink(String tableName) { +protected AbstractExactlyOnceSink( +String tableName, DataType consumedDataType, DataStructureConverter converter) { this.tableName = tableName; +this.consumedDataType = consumedDataType; +this.converter = converter; } @Override public void initializeState(FunctionInitializationContext context) throws Exception { this.rawResultState = context.getOperatorStateStore() -.getListState(new ListStateDescriptor<>("sink-results", Types.STRING)); +.getListState( +new ListStateDescriptor<>( +"sink-results", + ExternalSerializer.of(consumedDataType, true))); Review Comment: This would be tested only if * RocksDB is used * or state is snapshotted -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-33371] Make TestValues sinks return results as Rows [flink]
dawidwys commented on code in PR #23601: URL: https://github.com/apache/flink/pull/23601#discussion_r1373034259 ## flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/factories/TestValuesRuntimeFunctions.java: ## @@ -283,22 +293,29 @@ private abstract static class AbstractExactlyOnceSink extends RichSinkFunction rawResultState; +protected transient List localRawResult; -protected transient ListState rawResultState; -protected transient List localRawResult; - -protected AbstractExactlyOnceSink(String tableName) { +protected AbstractExactlyOnceSink( +String tableName, DataType consumedDataType, DataStructureConverter converter) { this.tableName = tableName; +this.consumedDataType = consumedDataType; +this.converter = converter; } @Override public void initializeState(FunctionInitializationContext context) throws Exception { this.rawResultState = context.getOperatorStateStore() -.getListState(new ListStateDescriptor<>("sink-results", Types.STRING)); +.getListState( +new ListStateDescriptor<>( +"sink-results", + ExternalSerializer.of(consumedDataType, true))); Review Comment: This would be tested only if * RocksDB is used * or state is restored -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-33371] Make TestValues sinks return results as Rows [flink]
dawidwys commented on code in PR #23601: URL: https://github.com/apache/flink/pull/23601#discussion_r1373031168 ## flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/factories/TestValuesRuntimeFunctions.java: ## @@ -283,22 +293,29 @@ private abstract static class AbstractExactlyOnceSink extends RichSinkFunction rawResultState; +protected transient List localRawResult; -protected transient ListState rawResultState; -protected transient List localRawResult; - -protected AbstractExactlyOnceSink(String tableName) { +protected AbstractExactlyOnceSink( +String tableName, DataType consumedDataType, DataStructureConverter converter) { this.tableName = tableName; +this.consumedDataType = consumedDataType; +this.converter = converter; } @Override public void initializeState(FunctionInitializationContext context) throws Exception { this.rawResultState = context.getOperatorStateStore() -.getListState(new ListStateDescriptor<>("sink-results", Types.STRING)); +.getListState( +new ListStateDescriptor<>( +"sink-results", + ExternalSerializer.of(consumedDataType, true))); Review Comment: Maybe no tests, depend on this particular one. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-33371] Make TestValues sinks return results as Rows [flink]
twalthr commented on code in PR #23601: URL: https://github.com/apache/flink/pull/23601#discussion_r1373020057 ## flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/factories/TestValuesRuntimeFunctions.java: ## @@ -283,22 +293,29 @@ private abstract static class AbstractExactlyOnceSink extends RichSinkFunction rawResultState; +protected transient List localRawResult; -protected transient ListState rawResultState; -protected transient List localRawResult; - -protected AbstractExactlyOnceSink(String tableName) { +protected AbstractExactlyOnceSink( +String tableName, DataType consumedDataType, DataStructureConverter converter) { this.tableName = tableName; +this.consumedDataType = consumedDataType; +this.converter = converter; } @Override public void initializeState(FunctionInitializationContext context) throws Exception { this.rawResultState = context.getOperatorStateStore() -.getListState(new ListStateDescriptor<>("sink-results", Types.STRING)); +.getListState( +new ListStateDescriptor<>( +"sink-results", + ExternalSerializer.of(consumedDataType, true))); Review Comment: I'm confused why did no tests fail with this? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-33058][formats] Add encoding option to Avro format [flink]
JingGe commented on code in PR #23395: URL: https://github.com/apache/flink/pull/23395#discussion_r1373016512 ## docs/content/docs/connectors/table/formats/avro.md: ## @@ -80,6 +80,14 @@ Format Options String Specify what format to use, here should be 'avro'. + + avro.encoding + optional + yes + binary + String + Serialization encoding to use. The valid enumerations are: binary, json. https://avro.apache.org/docs/current/specification/#encodings";>(reference) + Review Comment: Thanks for binging this topic up again. Since json format is slow, I would suggest adding more info here to describe the pros and cons and pointed out binary format is commonly recommended. Binary is the default setting, not only because of the backward compatibility, but also because of the performance. WDYT? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-33371] Make TestValues sinks return results as Rows [flink]
dawidwys commented on code in PR #23601: URL: https://github.com/apache/flink/pull/23601#discussion_r1373006392 ## flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/factories/TestValuesRuntimeFunctions.java: ## @@ -283,22 +293,29 @@ private abstract static class AbstractExactlyOnceSink extends RichSinkFunction rawResultState; +protected transient List localRawResult; -protected transient ListState rawResultState; -protected transient List localRawResult; - -protected AbstractExactlyOnceSink(String tableName) { +protected AbstractExactlyOnceSink( +String tableName, DataType consumedDataType, DataStructureConverter converter) { this.tableName = tableName; +this.consumedDataType = consumedDataType; +this.converter = converter; } @Override public void initializeState(FunctionInitializationContext context) throws Exception { this.rawResultState = context.getOperatorStateStore() -.getListState(new ListStateDescriptor<>("sink-results", Types.STRING)); +.getListState( +new ListStateDescriptor<>( +"sink-results", + ExternalSerializer.of(consumedDataType, true))); Review Comment: right, initially I started with `RowData` and didn't remove the flag. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-33359][FLINK-25002] add-opens jvm args to support jdk17 for kubernetes operator [flink-kubernetes-operator]
gyfora commented on code in PR #691: URL: https://github.com/apache/flink-kubernetes-operator/pull/691#discussion_r1373005879 ## .github/workflows/ci.yml: ## @@ -24,12 +24,15 @@ jobs: test_ci: runs-on: ubuntu-latest name: test_ci +strategy: + matrix: +java-version: [ 11, 17 ] Review Comment: We should not run it for all jobs. We can target only the latest Flink version (1.18) for now -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-33359][FLINK-25002] add-opens jvm args to support jdk17 for kubernetes operator [flink-kubernetes-operator]
snuyanzin commented on code in PR #691: URL: https://github.com/apache/flink-kubernetes-operator/pull/691#discussion_r1372999282 ## .github/workflows/ci.yml: ## @@ -24,12 +24,15 @@ jobs: test_ci: runs-on: ubuntu-latest name: test_ci +strategy: + matrix: +java-version: [ 11, 17 ] Review Comment: I made it for my own fork, so it should not be complecated. The only thing that stopped me from adding this to the PR is that after that there are 200+ jobs running in ci... and the total time is more than an hour If it's ok, then i can add it here as well ## .github/workflows/ci.yml: ## @@ -24,12 +24,15 @@ jobs: test_ci: runs-on: ubuntu-latest name: test_ci +strategy: + matrix: +java-version: [ 11, 17 ] Review Comment: I made it for my own fork, so it should not be complecated. The only thing that stopped me from adding this to the PR is that after that there are 200+ jobs running in ci... and the total time is more than an hour https://github.com/snuyanzin/flink-kubernetes-operator/actions/runs/6644874859 If it's ok, then i can add it here as well -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-33371] Make TestValues sinks return results as Rows [flink]
twalthr commented on code in PR #23601: URL: https://github.com/apache/flink/pull/23601#discussion_r1372989815 ## flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/factories/TestValuesRuntimeFunctions.java: ## @@ -283,22 +293,29 @@ private abstract static class AbstractExactlyOnceSink extends RichSinkFunction rawResultState; +protected transient List localRawResult; -protected transient ListState rawResultState; -protected transient List localRawResult; - -protected AbstractExactlyOnceSink(String tableName) { +protected AbstractExactlyOnceSink( +String tableName, DataType consumedDataType, DataStructureConverter converter) { this.tableName = tableName; +this.consumedDataType = consumedDataType; +this.converter = converter; } @Override public void initializeState(FunctionInitializationContext context) throws Exception { this.rawResultState = context.getOperatorStateStore() -.getListState(new ListStateDescriptor<>("sink-results", Types.STRING)); +.getListState( +new ListStateDescriptor<>( +"sink-results", + ExternalSerializer.of(consumedDataType, true))); Review Comment: Is the `true` correct here? The input is not internal but `Row`, no? ## flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/factories/TestValuesRuntimeFunctions.java: ## @@ -283,22 +293,29 @@ private abstract static class AbstractExactlyOnceSink extends RichSinkFunction rawResultState; +protected transient List localRawResult; -protected transient ListState rawResultState; -protected transient List localRawResult; - -protected AbstractExactlyOnceSink(String tableName) { +protected AbstractExactlyOnceSink( +String tableName, DataType consumedDataType, DataStructureConverter converter) { this.tableName = tableName; +this.consumedDataType = consumedDataType; +this.converter = converter; } @Override public void initializeState(FunctionInitializationContext context) throws Exception { this.rawResultState = context.getOperatorStateStore() -.getListState(new ListStateDescriptor<>("sink-results", Types.STRING)); +.getListState( +new ListStateDescriptor<>( +"sink-results", + ExternalSerializer.of(consumedDataType, true))); Review Comment: if `localRawResult` would be RowData this would be true and we won't need DataStructureConverter anymore -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-33359][FLINK-25002] add-opens jvm args to support jdk17 for kubernetes operator [flink-kubernetes-operator]
gyfora commented on code in PR #691: URL: https://github.com/apache/flink-kubernetes-operator/pull/691#discussion_r1372996355 ## .github/workflows/ci.yml: ## @@ -24,12 +24,15 @@ jobs: test_ci: runs-on: ubuntu-latest name: test_ci +strategy: + matrix: +java-version: [ 11, 17 ] Review Comment: In addition to running the CI with both java version it would be great to somehow run e2e-s as well on different versions. That may be a bit trickier because we would have to change the docker base image etc. What do you think? If it's a lot of work we can open a follow-up ticket for that as well , but we need that before we can say java 17 is supported -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-33360] HybridSourceSplitEnumerator clear finishedReaders when … [flink]
fengjiajie commented on code in PR #23593: URL: https://github.com/apache/flink/pull/23593#discussion_r1372988296 ## flink-connectors/flink-connector-base/src/test/java/org/apache/flink/connector/base/source/hybrid/HybridSourceSplitEnumeratorTest.java: ## @@ -252,6 +252,41 @@ public void testInterceptNoMoreSplitEvent() { assertThat(context.hasNoMoreSplits(0)).isTrue(); } +@Test +public void testMultiSubtaskSwitchEnumerator() { +context = new MockSplitEnumeratorContext<>(2); +source = +HybridSource.builder(MOCK_SOURCE) +.addSource(MOCK_SOURCE) +.addSource(MOCK_SOURCE) +.build(); + +enumerator = (HybridSourceSplitEnumerator) source.createEnumerator(context); +enumerator.start(); + +registerReader(context, enumerator, SUBTASK0); +registerReader(context, enumerator, SUBTASK1); +enumerator.handleSourceEvent(SUBTASK0, new SourceReaderFinishedEvent(-1)); +enumerator.handleSourceEvent(SUBTASK1, new SourceReaderFinishedEvent(-1)); + +assertThat(getCurrentSourceIndex(enumerator)).isEqualTo(0); +enumerator.handleSourceEvent(SUBTASK0, new SourceReaderFinishedEvent(0)); +enumerator.handleSourceEvent(SUBTASK1, new SourceReaderFinishedEvent(0)); +assertThat(getCurrentSourceIndex(enumerator)) +.as("all reader finished source-0") +.isEqualTo(1); + +enumerator.handleSourceEvent(SUBTASK0, new SourceReaderFinishedEvent(1)); +assertThat(getCurrentSourceIndex(enumerator)) +.as( +"only reader-0 has finished reading, reader-1 is not yet done, so do not switch to the next source") +.isEqualTo(1); Review Comment: Before this PR modification, the assert will fail here, which means switching to the next source only after one reader completes reading -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org