Re: [DISCUSS] FLIP-495: Support AdaptiveScheduler record and query the rescale history

2025-09-02 Thread Matthias Pohl
Rescale structure? That seems to > be > > a transient field that should be derived via the AdaptiveScheduler's > state. > > Thank you very much for the reminder. > This is exactly the point I have been reconsidering. > After introducing TerminalState and TerminatedRea

Re: [DISCUSS] FLIP-495: Support AdaptiveScheduler record and query the rescale history

2025-08-25 Thread Matthias Pohl
f this current rescale > process terminates, an immutable rescale snapshot of this event is created > that is saved in the rescale history. > > Sorry for my previous wording was ambiguous and inconsistent with the > documentation. Thanks a lot for pointing it out. This is actually

Re: [DISCUSS] FLIP-495: Support AdaptiveScheduler record and query the rescale history

2025-08-21 Thread Matthias Pohl
5#FLIP495:SupportAdaptiveSchedulerrecordandquerytherescalehistory-Aboutrescaleeventsstorage.1 > [2]https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=334760525#FLIP495:SupportAdaptiveSchedulerrecordandquerytherescalehistory-ThemainscenarioswhereRescalestatusswitchestoterminated

Re: [DISCUSS] FLIP-495: Support AdaptiveScheduler record and query the rescale history

2025-08-10 Thread Matthias Pohl
Hi Yuepeng, thanks for reminding me of this FLIP. I went over it and have a few items which we might need to address before we can actually finalize the vote: 1. You mention a few options for when it comes to storing the data which is good. The FLIP doesn't point out, though, what option you're go

[jira] [Created] (FLINK-37412) Add slot allocation strategy to for selecting TMs first that have already slots assigned

2025-03-03 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-37412: - Summary: Add slot allocation strategy to for selecting TMs first that have already slots assigned Key: FLINK-37412 URL: https://issues.apache.org/jira/browse/FLINK-37412

Re: Migrating CI to Github Actions

2025-02-13 Thread Matthias Pohl
e/FLINK-34331 On Thu, Feb 13, 2025 at 2:21 PM Tom Cooper wrote: > Hi all, > > I was hoping to sync up on the progress on moving from Azure CI to GitHub > Actions for the main Flink repository. There was FLIP-396 [1] by Matthias > Pohl detailing the plan to trial GitHub Actions and

[jira] [Created] (FLINK-37259) StreamCheckpointingITCase times out

2025-02-05 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-37259: - Summary: StreamCheckpointingITCase times out Key: FLINK-37259 URL: https://issues.apache.org/jira/browse/FLINK-37259 Project: Flink Issue Type: Bug

[jira] [Created] (FLINK-37232) FLIP-272 breaks some synchronization assumption on the AdaptiveScheduler's side

2025-01-28 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-37232: - Summary: FLIP-272 breaks some synchronization assumption on the AdaptiveScheduler's side Key: FLINK-37232 URL: https://issues.apache.org/jira/browse/FLINK-

Re: [DISCUSS] Flink 1.18.2 release & grace period for EOL versions

2025-01-27 Thread Matthias Pohl
FYI: The nightly runs for 1.18 are deactivated in Azure and GHA On Mon, Jan 6, 2025 at 1:02 PM Matthias Pohl wrote: > Ok, thanks for your feedback. I will go ahead with disabling CI for 1.18 > next week if no other objections are shared. > > On Fri, Jan 3, 2025 at 5:21 PM Rui

[jira] [Created] (FLINK-37220) GuavaRateLimiter creates an ExecutorService without shutting it down properly leaking threads

2025-01-24 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-37220: - Summary: GuavaRateLimiter creates an ExecutorService without shutting it down properly leaking threads Key: FLINK-37220 URL: https://issues.apache.org/jira/browse/FLINK-37220

[jira] [Created] (FLINK-37215) SlotAllocationException throwing can be moved from TaskExecutor#allocateSlot to TaskSlotTableImpl#allocateSlot

2025-01-23 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-37215: - Summary: SlotAllocationException throwing can be moved from TaskExecutor#allocateSlot to TaskSlotTableImpl#allocateSlot Key: FLINK-37215 URL: https://issues.apache.org/jira

[jira] [Created] (FLINK-37214) Migrate TaskSlotTableImplTest from using the test thread as the main thread

2025-01-23 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-37214: - Summary: Migrate TaskSlotTableImplTest from using the test thread as the main thread Key: FLINK-37214 URL: https://issues.apache.org/jira/browse/FLINK-37214

[jira] [Created] (FLINK-37170) Disable 1.18 CI

2025-01-19 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-37170: - Summary: Disable 1.18 CI Key: FLINK-37170 URL: https://issues.apache.org/jira/browse/FLINK-37170 Project: Flink Issue Type: Technical Debt

Re: Re: [DISCUSS] FLIP-495: Support AdaptiveScheduler record and query the rescale history

2025-01-12 Thread Matthias Pohl
Hi Yuepeng, I managed to find some time to respond. See my answers inlined below. Matthias On Fri, Jan 3, 2025 at 11:48 AM Yuepeng Pan wrote: > [...] > Sorry for not expressing this part clearly earlier. > IIUC, based on the Adaptive Scheduler state diagram [1], > when a stop-with-savepoint oper

Re: [DISCUSS] Flink 1.18.2 release & grace period for EOL versions

2025-01-06 Thread Matthias Pohl
est, > Rui > > On Fri, Jan 3, 2025 at 11:45 PM Robert Metzger > wrote: > > > Hey Matthias, > > > > it seems that there is no traction for creating another 1.18.x bugfix > > release, so *I'm +1 for disabling CI for 1.18.* > > > > On Wed, Dec 1

Re: [DISCUSS] Is it a bug that the AdaptiveScheduler does not prioritize releasing TaskManagers during downscaling in Application mode?

2025-01-05 Thread Matthias Pohl
Hi everyone and sorry for the late reply. I was mostly off in November and forgot about that topic in December last year. Thanks for summarizing and bringing up user feedback. I see the problem and agree with your view that it's a topic that we might want to address in the 1.x LTS version. I see h

Re: [DISCUSS] FLIP-495: Support AdaptiveScheduler record and query the rescale history

2025-01-02 Thread Matthias Pohl
Thanks Yuepeng for your response. I added my comments to the individual paragraphs below: > Thank you very much for the reminding The proposal makes sense to me. > Additionally, I'd like to confirm whether each rescale cycle/event > requires a status field, such as FAILED, IGNORED, SUCCESS, PENDI

[jira] [Created] (FLINK-36979) Revert netty bump for 1.20 and 1.19

2024-12-29 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-36979: - Summary: Revert netty bump for 1.20 and 1.19 Key: FLINK-36979 URL: https://issues.apache.org/jira/browse/FLINK-36979 Project: Flink Issue Type: Bug

Re: Memory leak in pekko's Netty 4 transport.

2024-12-29 Thread Matthias Pohl
fyi: The following Flink Jira issues are related to this comment: - FLINK-36290 [1] OOM in CI - FLINK-36510 [2]: netty version bump which was backported to 1.20 and 1.19 [1] https://issues.apache.org/jira/browse/FLINK-36290 [2] https://issues.apache.org/jira/browse/FLINK-36510 On Sat, Dec 28, 202

[DISCUSS] Flink 1.18.2 release & grace period for EOL versions

2024-12-18 Thread Matthias Pohl
Hi everyone, with the release of 1.20.0 [1], 1.18 reached its EOL. The community has decided to do a final "flush out" release in this case if it's requested (and, I guess, there is someone volunteering to do this). There was a question about release 1.18.2 [2] in the past. 1.18.2 contains 76 fixes

Re: CHI: Automatic Stale PR Cleanup

2024-12-18 Thread Matthias Pohl
+1 for adding the automation considering that other projects are doing this as well and the high amount of open PRs doesn't help. Thanks for summarizing the current state in such detail. X + Y should probably be larger than our usual release cycles. X=6m and Y=3m might be alright in this regard.

Re: [DISCUSS] FLIP-495: Support AdaptiveScheduler record and query the rescale history

2024-12-18 Thread Matthias Pohl
Hi Yuepeng, Sorry for not finding the time to respond earlier. I went over FLIP-495 [1] and the previous FLIP-487 discussion [2]. Thanks for putting it all together in a FLIP. That makes it easier to discuss the next iteration. Here are a few comments I have: Rescale ID section - How is the resour

Re: [QUESTION] Several unreleased 1.20.x and and 1.19.x versions in jira

2024-12-17 Thread Matthias Pohl
x > > > > On Mon, 2 Dec 2024 at 21:12, Sergey Nuyanzin wrote: > > > Thanks, Matthias > > > > confirm, now there is only one unreleased version for 1.19 and one for > 1.20 > > > > On Mon, Dec 2, 2024 at 10:17 AM Matthias Pohl wrote: > > > > >

[jira] [Created] (FLINK-36888) Building flink-shaded-netty-tcnative-static for the OpenSSL e2e tests timed out

2024-12-11 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-36888: - Summary: Building flink-shaded-netty-tcnative-static for the OpenSSL e2e tests timed out Key: FLINK-36888 URL: https://issues.apache.org/jira/browse/FLINK-36888

Re: [QUESTION] Several unreleased 1.20.x and and 1.19.x versions in jira

2024-12-02 Thread Matthias Pohl
Thanks for bringing this up, Sergey. You're right. I went over the issues that had these versions assigned, assigned them to the right patch versions and deleted the versions 1.19.3 and 1.20.2 so that contributors don't accidentally assign new Jira issues to those versions again. Best, Matthias

Re: [DISCUSS] FLIP-487: Show history of rescales in Web UI for AdaptiveScheduler

2024-12-02 Thread Matthias Pohl
Hi Yuepeng, thanks for the proposal. Having a way to see the history of rescales is a nice feature, I guess. I went over the draft and have a few questions: Can we reorganize the draft? Right now, we have some (for RescaleEvent, Required/AcquiredParallelism) schema defined in the "Proposed Changes

Re: [VOTE] Release flink-connector-kafka v3.3.0, release candidate #1

2024-10-15 Thread Matthias Pohl
+1 (binding) * Downloaded all artifacts * Extracted and built sources * Diff of git tag checkout with downloaded sources * Verified SHA512 checksums & GPG certification * Checked that all POMs have the right expected version * Generated diffs to compare pom file changes with NOTICE files Thanks A

[jira] [Created] (FLINK-36512) Make rescale trigger based on failed checkpoints depend on the cause

2024-10-11 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-36512: - Summary: Make rescale trigger based on failed checkpoints depend on the cause Key: FLINK-36512 URL: https://issues.apache.org/jira/browse/FLINK-36512 Project

[jira] [Created] (FLINK-36356) HadoopRecoverableWriterTest.testRecoverWithState due to IOException

2024-09-24 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-36356: - Summary: HadoopRecoverableWriterTest.testRecoverWithState due to IOException Key: FLINK-36356 URL: https://issues.apache.org/jira/browse/FLINK-36356 Project: Flink

[jira] [Created] (FLINK-36350) IllegalAccessError detected in JDK17+ runs

2024-09-23 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-36350: - Summary: IllegalAccessError detected in JDK17+ runs Key: FLINK-36350 URL: https://issues.apache.org/jira/browse/FLINK-36350 Project: Flink Issue Type: Bug

[jira] [Created] (FLINK-36349) ClassNotFoundException due to org.apache.flink.runtime.types.FlinkScalaKryoInstantiator missing

2024-09-23 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-36349: - Summary: ClassNotFoundException due to org.apache.flink.runtime.types.FlinkScalaKryoInstantiator missing Key: FLINK-36349 URL: https://issues.apache.org/jira/browse/FLINK-36349

[jira] [Created] (FLINK-36324) MiscAggFunctionITCase expected to raise Throwable

2024-09-19 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-36324: - Summary: MiscAggFunctionITCase expected to raise Throwable Key: FLINK-36324 URL: https://issues.apache.org/jira/browse/FLINK-36324 Project: Flink Issue

[jira] [Created] (FLINK-36317) Populate the ArchivedExecutionGraph with CheckpointStatsSnapshot data if in WaitingForResources state with a previousExecutionGraph being set

2024-09-18 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-36317: - Summary: Populate the ArchivedExecutionGraph with CheckpointStatsSnapshot data if in WaitingForResources state with a previousExecutionGraph being set Key: FLINK-36317 URL

[jira] [Created] (FLINK-36302) FileSourceTextLinesITCase timed out

2024-09-17 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-36302: - Summary: FileSourceTextLinesITCase timed out Key: FLINK-36302 URL: https://issues.apache.org/jira/browse/FLINK-36302 Project: Flink Issue Type: Bug

[jira] [Created] (FLINK-36301) TPC-H end-to-end test fails due to TimeoutException

2024-09-17 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-36301: - Summary: TPC-H end-to-end test fails due to TimeoutException Key: FLINK-36301 URL: https://issues.apache.org/jira/browse/FLINK-36301 Project: Flink Issue

[jira] [Created] (FLINK-36300) TableEnvHiveConnectorITCase.testDateTimestampPartitionColumns times out

2024-09-17 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-36300: - Summary: TableEnvHiveConnectorITCase.testDateTimestampPartitionColumns times out Key: FLINK-36300 URL: https://issues.apache.org/jira/browse/FLINK-36300 Project

[jira] [Created] (FLINK-36299) AdaptiveSchedulerTest.testStatusMetrics times out

2024-09-17 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-36299: - Summary: AdaptiveSchedulerTest.testStatusMetrics times out Key: FLINK-36299 URL: https://issues.apache.org/jira/browse/FLINK-36299 Project: Flink Issue

[jira] [Created] (FLINK-36298) NullPointerException in Calcite causes a PyFlink test failure

2024-09-17 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-36298: - Summary: NullPointerException in Calcite causes a PyFlink test failure Key: FLINK-36298 URL: https://issues.apache.org/jira/browse/FLINK-36298 Project: Flink

[jira] [Created] (FLINK-36297) SIGSEGV caused CI failure

2024-09-17 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-36297: - Summary: SIGSEGV caused CI failure Key: FLINK-36297 URL: https://issues.apache.org/jira/browse/FLINK-36297 Project: Flink Issue Type: Bug

[jira] [Created] (FLINK-36295) AdaptiveSchedulerClusterITCase. testCheckpointStatsPersistedAcrossRescale failed with

2024-09-17 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-36295: - Summary: AdaptiveSchedulerClusterITCase. testCheckpointStatsPersistedAcrossRescale failed with Key: FLINK-36295 URL: https://issues.apache.org/jira/browse/FLINK-36295

[jira] [Created] (FLINK-36294) table stage failed with general junit5 TestEngine failure

2024-09-17 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-36294: - Summary: table stage failed with general junit5 TestEngine failure Key: FLINK-36294 URL: https://issues.apache.org/jira/browse/FLINK-36294 Project: Flink

[jira] [Created] (FLINK-36293) RocksDBWriteBatchWrapperTest.testAsyncCancellation

2024-09-17 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-36293: - Summary: RocksDBWriteBatchWrapperTest.testAsyncCancellation Key: FLINK-36293 URL: https://issues.apache.org/jira/browse/FLINK-36293 Project: Flink Issue

[jira] [Created] (FLINK-36292) SplitFetcherManagerTest.testCloseCleansUpPreviouslyClosedFetcher times out

2024-09-17 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-36292: - Summary: SplitFetcherManagerTest.testCloseCleansUpPreviouslyClosedFetcher times out Key: FLINK-36292 URL: https://issues.apache.org/jira/browse/FLINK-36292 Project

[jira] [Created] (FLINK-36291) java.lang.IllegalMonitorStateException causing a fatal error on the TaskManager side

2024-09-16 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-36291: - Summary: java.lang.IllegalMonitorStateException causing a fatal error on the TaskManager side Key: FLINK-36291 URL: https://issues.apache.org/jira/browse/FLINK-36291

[jira] [Created] (FLINK-36290) OutOfMemoryError in connect test run

2024-09-16 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-36290: - Summary: OutOfMemoryError in connect test run Key: FLINK-36290 URL: https://issues.apache.org/jira/browse/FLINK-36290 Project: Flink Issue Type: Bug

[jira] [Created] (FLINK-36279) RescaleOnCheckpointITCase.testRescaleOnCheckpoint fails

2024-09-13 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-36279: - Summary: RescaleOnCheckpointITCase.testRescaleOnCheckpoint fails Key: FLINK-36279 URL: https://issues.apache.org/jira/browse/FLINK-36279 Project: Flink

[jira] [Created] (FLINK-36272) YarnFileStageTestS3ITCase fails on master

2024-09-12 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-36272: - Summary: YarnFileStageTestS3ITCase fails on master Key: FLINK-36272 URL: https://issues.apache.org/jira/browse/FLINK-36272 Project: Flink Issue Type: Bug

Re: [DISCUSSION] Disabling japicmp plugin in master for 2.0

2024-09-03 Thread Matthias Pohl
risk it that we break something which was not > > intended, but fixing those > > hopefully small amount of cases is less effort than maintaining an > endless > > list. > > > > BR, > > G > > > > > > On Thu, Aug 29, 2024 at 11:40 AM Matthias Pohl

[jira] [Created] (FLINK-36207) Disabling japicmp plugin for deprecated APIs

2024-09-03 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-36207: - Summary: Disabling japicmp plugin for deprecated APIs Key: FLINK-36207 URL: https://issues.apache.org/jira/browse/FLINK-36207 Project: Flink Issue Type

[jira] [Created] (FLINK-36194) Shutdown hook for ExecutionGraphInfo store runs concurrently to cluster shutdown hook causing race conditions

2024-09-02 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-36194: - Summary: Shutdown hook for ExecutionGraphInfo store runs concurrently to cluster shutdown hook causing race conditions Key: FLINK-36194 URL: https://issues.apache.org/jira

[DISCUSSION] Disabling japicmp plugin in master for 2.0

2024-08-29 Thread Matthias Pohl
Hi everyone, for the 2.0 work, we are expecting to run into public API changes quite a bit. This would get picked up by the japicmp plugin. The usual way is to add exclusions to the plugin configuration [1] generating a (presumably long) list of API changes. I'm wondering whether we, instead, woul

[jira] [Created] (FLINK-36168) AdaptiveSchedulerTest doesn't follow the production lifecycle

2024-08-28 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-36168: - Summary: AdaptiveSchedulerTest doesn't follow the production lifecycle Key: FLINK-36168 URL: https://issues.apache.org/jira/browse/FLINK-36168 Project:

[jira] [Created] (FLINK-36147) Removes deprecated location field

2024-08-23 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-36147: - Summary: Removes deprecated location field Key: FLINK-36147 URL: https://issues.apache.org/jira/browse/FLINK-36147 Project: Flink Issue Type: Technical

[jira] [Created] (FLINK-36099) JobIDLoggingITCase fails due to "Cannot find task to fail for execution [...]" info log message in TM logs

2024-08-19 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-36099: - Summary: JobIDLoggingITCase fails due to "Cannot find task to fail for execution [...]" info log message in TM logs Key: FLINK-36099 URL: https://issues.apache.org/j

Re: [VOTE] FLIP-472: Aligning timeout logic in the AdaptiveScheduler's WaitingForResources and Executing states

2024-08-07 Thread Matthias Pohl
+1 (binding) On Mon, Aug 5, 2024 at 11:05 AM David Morávek wrote: > +1 (binding) > > Best, > D. > > On Mon, Aug 5, 2024 at 9:24 AM yuanfeng hu wrote: > > > +1 (no-binding) > > > > On Mon, Aug 5, 2024 at 3:18 PM Rui Fan <1996fan...@gmail.com> wrote: > > > > > +1(binding) > > > > > > Best, > > > Rui > > > > > >

Re: [DISCUSS] FLIP-XXX: Aligning timeout logic in the AdaptiveScheduler's WaitingForResources and Executing states

2024-08-05 Thread Matthias Pohl
ectly, and don't need to consider them as > fallback > > options, right? > > > > - jobmanager.adaptive-scheduler.scale-on-failed-checkpoints-count > > - jobmanager.adaptive-scheduler.max-delay-for-scale-trigger > > > > Best, > > Rui > > >

Re: [DISCUSS] FLIP-XXX: Aligning timeout logic in the AdaptiveScheduler's WaitingForResources and Executing states

2024-08-02 Thread Matthias Pohl
> >>> > Thanks, Mathias, for your opinions. > >>> > > >>> > I see two scenarios where different values for starting and rescaling > >>> would > >>> > be appropriate: > >>> > > >>> > 1) Flink serv

Re: [DISCUSS] FLIP-XXX: Aligning timeout logic in the AdaptiveScheduler's WaitingForResources and Executing states

2024-07-16 Thread Matthias Pohl
Thanks Zdenek for your proposal on aligning the resource control logic within the AdaptiveScheduler and cleaning up the rescaling code. Consolidating the parameters and the code as part of the 2.0 release makes sense in my opinion: The proposed change adds consistent behavior to the WaitingForReso

Re: failed to submit Job to Flink standalone ZooKeeper-HA-cluster

2024-07-12 Thread Matthias Pohl
Hi love_h1...@126.com, Thanks for reaching out to the Flink community. Just a few general remarks: - Flink's Jira [1] should be used to report potential bugs. The dev mailing list is used for design and community-related discussions. - It's also useful to provide not only snippets of the logs but a

Re: Out of space on connector runner (jdbc)

2024-07-10 Thread Matthias Pohl
FYI: We're doing something similar in the GHA workflow for Apache Flink [1]. [1] https://github.com/apache/flink/blob/master/.github/actions/job_init/action.yml#L54-L69 On Wed, Jul 10, 2024 at 3:53 PM João Boto wrote: > I will send this with better format.. > Sorry for that > > On 2024/07/10 1

Re: [2.0] How to handle on-going feature development in Flink 2.0?

2024-07-05 Thread Matthias Pohl
> > might > > > > > be a little early to drop support for Java 11. We can discuss this > > > > > separately. > > > > > > > > > > Thanks, > > > > > > > > > > Jiangjie (Becket) Qin > > > > > > >

[jira] [Created] (FLINK-35748) DeduplicateITCase.testLastRowWithoutAllChangelogOnRowtime with MiniBatch mode and RocksDB backend enabled

2024-07-03 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-35748: - Summary: DeduplicateITCase.testLastRowWithoutAllChangelogOnRowtime with MiniBatch mode and RocksDB backend enabled Key: FLINK-35748 URL: https://issues.apache.org/jira/browse

[jira] [Created] (FLINK-35729) HiveITCase.testReadWriteHive

2024-06-28 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-35729: - Summary: HiveITCase.testReadWriteHive Key: FLINK-35729 URL: https://issues.apache.org/jira/browse/FLINK-35729 Project: Flink Issue Type: Bug

[jira] [Created] (FLINK-35728) PyFlink end-to-end test because miniconda couldn't be downloaded

2024-06-28 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-35728: - Summary: PyFlink end-to-end test because miniconda couldn't be downloaded Key: FLINK-35728 URL: https://issues.apache.org/jira/browse/FLINK-35728 Project:

[jira] [Created] (FLINK-35727) "Run kubernetes pyflink application test" failed due to access denied issue

2024-06-28 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-35727: - Summary: "Run kubernetes pyflink application test" failed due to access denied issue Key: FLINK-35727 URL: https://issues.apache.org/jira/browse/F

[jira] [Created] (FLINK-35722) CoordinatorEventsToStreamOperatorRecipientExactlyOnceITCase.testCheckpoint fails because of missed operator event

2024-06-28 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-35722: - Summary: CoordinatorEventsToStreamOperatorRecipientExactlyOnceITCase.testCheckpoint fails because of missed operator event Key: FLINK-35722 URL: https://issues.apache.org/jira

[2.0] How to handle on-going feature development in Flink 2.0?

2024-06-25 Thread Matthias Pohl
Hi 2.0 release managers, With the 1.20 release branch being cut [1], master is now referring to 2.0-SNAPSHOT. I remember that, initially, the community had the idea of keeping the 2.0 release as small as possible focusing on API changes [2]. What does this mean for new features? I guess blocking t

Re: [VOTE] FLIP-461: Synchronize rescaling with checkpoint creation to minimize reprocessing for the AdaptiveScheduler

2024-06-20 Thread Matthias Pohl
, 2024 at 11:38 AM Gabor Somogyi > wrote: > > > +1 (binding) > > > > G > > > > > > On Mon, Jun 17, 2024 at 10:24 AM Matthias Pohl > wrote: > > > > > Hi everyone, > > > the discussion in [1] about FLIP-461 [2] is kind of concluded. I

[RESULT][VOTE] FLIP-461: Synchronize rescaling with checkpoint creation to minimize reprocessing for the AdaptiveScheduler

2024-06-20 Thread Matthias Pohl
Hi everyone, the vote [1] for FLIP-461 [2] is over. The number of required binding votes (3) was reached (total: 10, binding: 7, non-binding: 3). No objections were raised. - David Morávek (binding) - Rui Fan (binding) - Zakelly Lan (binding) - Gyula Fóra (binding) - Weijie Guo (binding) - Gabor S

[VOTE] FLIP-461: Synchronize rescaling with checkpoint creation to minimize reprocessing for the AdaptiveScheduler

2024-06-17 Thread Matthias Pohl
Hi everyone, the discussion in [1] about FLIP-461 [2] is kind of concluded. I am starting a vote on this one here. The vote will be open for at least 72 hours (i.e. until June 20, 2024; 8:30am UTC) unless there are any objections. The FLIP will be considered accepted if 3 binding votes (from activ

Re: [DISCUSS] FLIP-461: FLIP-461: Synchronize rescaling with checkpoint creation to minimize reprocessing

2024-06-17 Thread Matthias Pohl
://cwiki.apache.org/confluence/display/FLINK/FLIP-461%3A+Synchronize+rescaling+with+checkpoint+creation+to+minimize+reprocessing+for+the+AdaptiveScheduler On Fri, Jun 7, 2024 at 6:42 PM Matthias Pohl wrote: > Hi Zakelly, > good point. I updated the FLIP to use "scale-on-failed-checkpoints-count"

Re: [VOTE] Release 1.19.1, release candidate #1

2024-06-10 Thread Matthias Pohl
+1 (binding) * Downloaded all artifacts * Extracted sources and ran compilation on sources * Diff of git tag checkout with downloaded sources * Verified SHA512 & GPG checksums * Checked that all POMs have the right expected version * Generated diffs to compare pom file changes with NOTICE files *

Re: [DISCUSS] FLIP-461: FLIP-461: Synchronize rescaling with checkpoint creation to minimize reprocessing

2024-06-07 Thread Matthias Pohl
we might have to reprocess a substantial backlog. > > > > I think in the future we might actually want to enhance this by > triggering > > some kind of specialized "rescaling" checkpoint that prepares the cluster > > for rescaling (eg. by replicating state to ne

[jira] [Created] (FLINK-35553) Integrate newly added trigger interface with checkpointing

2024-06-07 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-35553: - Summary: Integrate newly added trigger interface with checkpointing Key: FLINK-35553 URL: https://issues.apache.org/jira/browse/FLINK-35553 Project: Flink

[jira] [Created] (FLINK-35552) Move CheckpointStatsTracker out of ExecutionGraph into Scheduler

2024-06-07 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-35552: - Summary: Move CheckpointStatsTracker out of ExecutionGraph into Scheduler Key: FLINK-35552 URL: https://issues.apache.org/jira/browse/FLINK-35552 Project: Flink

[jira] [Created] (FLINK-35551) Introduces RescaleManager#onTrigger endpoint

2024-06-07 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-35551: - Summary: Introduces RescaleManager#onTrigger endpoint Key: FLINK-35551 URL: https://issues.apache.org/jira/browse/FLINK-35551 Project: Flink Issue Type

[jira] [Created] (FLINK-35550) Introduce new component RescaleManager

2024-06-07 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-35550: - Summary: Introduce new component RescaleManager Key: FLINK-35550 URL: https://issues.apache.org/jira/browse/FLINK-35550 Project: Flink Issue Type: Sub

[jira] [Created] (FLINK-35549) FLIP-461: Synchronize rescaling with checkpoint creation to minimize reprocessing for the AdaptiveScheduler

2024-06-07 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-35549: - Summary: FLIP-461: Synchronize rescaling with checkpoint creation to minimize reprocessing for the AdaptiveScheduler Key: FLINK-35549 URL: https://issues.apache.org/jira/browse

Re: Savepoints not considered during failover

2024-06-06 Thread Matthias Pohl
One reason could be that the savepoints are self-contained, owned by the user rather than Flink and, therefore, could be moved. Flink wouldn't have a proper reference in that case anymore. I don't have a link to a discussion, though. Best, Matthias On Fri, Jun 7, 2024 at 8:47 AM Gyula Fóra wrot

Re: [DISCUSS] FLIP-461: FLIP-461: Synchronize rescaling with checkpoint creation to minimize reprocessing

2024-06-05 Thread Matthias Pohl
id confusion. > Best, > Zakelly > > On Wed, Jun 5, 2024 at 3:02 PM Matthias Pohl wrote: > > > Hi ConradJam, > > thanks for your response. > > > > The CheckpointStatsTracker gets notified about the checkpoint completion > > after th

Re: [DISCUSS] FLIP-461: FLIP-461: Synchronize rescaling with checkpoint creation to minimize reprocessing

2024-06-05 Thread Matthias Pohl
I updated the FLIPs title. > > Also, I also don't understand why this proposal needs to care about the > checkpoint type is unaligned checkpoint or aligned checkpoint. > > Please correct me if anything is wrong, thanks. > > Best, > Rui On Wed, Jun 5, 2024 at 3:01 PM Ma

Re: [DISCUSS] FLIP-461: FLIP-461: Synchronize rescaling with checkpoint creation to minimize reprocessing

2024-06-05 Thread Matthias Pohl
d, Jun 5, 2024 at 4:46 AM ConradJam wrote: > I have a few questions: > Unaligned checkpoints Do we need to enable this feature? Whether this > feature should be disabled for checkpoints that do not check it > > On Tue, Jun 4, 2024 at 6:03 PM Matthias Pohl wrote: > > > Hi everyone, > >

[DISCUSS] FLIP-461: FLIP-461: Synchronize rescaling with checkpoint creation to minimize reprocessing

2024-06-04 Thread Matthias Pohl
Hi everyone, I'd like to discuss FLIP-461 [1]. The FLIP proposes the synchronization of rescaling and the completion of checkpoints. The idea is to reduce the amount of data that needs to be processed after rescaling happened. A more detailed motivation can be found in FLIP-461. I'm looking forwar

Re: [ANNOUNCE] New Apache Flink PMC Member - Weijie Guo

2024-06-04 Thread Matthias Pohl
Congratulations, Weijie! Matthias On Tue, Jun 4, 2024 at 11:12 AM Guowei Ma wrote: > Congratulations! > > Best, > Guowei > > > On Tue, Jun 4, 2024 at 4:55 PM gongzhongqiang > wrote: > > > Congratulations Weijie! Best, > > Zhongqiang Gong > > > > On Tue, Jun 4, 2024 at 2:46 PM Xintong Song wrote: > > > > > Hi

Re: [DISCUSS] Proposing an LTS Release for the 1.x Line

2024-05-27 Thread Matthias Pohl
> > > > formalize > > > > > > > > > > the result of this discussion in a FLIP. That's just > easier > > > to > > > > > > point > > > > > > > > > >

[FYI] The Azure CI for PRs is currently not triggered

2024-04-03 Thread Matthias Pohl
I run isn't the best option, either. But I still want to be transparent about your options. Matthias [1] https://issues.apache.org/jira/browse/FLINK-34999 [2] https://dev.azure.com/apache-flink/apache-flink/_build?definitionId=1&_a=summary

[jira] [Created] (FLINK-35000) PullRequest template doesn't use the correct format to refer to the testing code convention

2024-04-03 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-35000: - Summary: PullRequest template doesn't use the correct format to refer to the testing code convention Key: FLINK-35000 URL: https://issues.apache.org/jira/browse/FLINK-

[jira] [Created] (FLINK-34999) PR CI stopped operating

2024-04-03 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-34999: - Summary: PR CI stopped operating Key: FLINK-34999 URL: https://issues.apache.org/jira/browse/FLINK-34999 Project: Flink Issue Type: Bug

[jira] [Created] (FLINK-34989) Apache Infra requests to reduce the runner usage for a project

2024-04-02 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-34989: - Summary: Apache Infra requests to reduce the runner usage for a project Key: FLINK-34989 URL: https://issues.apache.org/jira/browse/FLINK-34989 Project: Flink

[jira] [Created] (FLINK-34988) Class loading issues in JDK17 and JDK21

2024-04-02 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-34988: - Summary: Class loading issues in JDK17 and JDK21 Key: FLINK-34988 URL: https://issues.apache.org/jira/browse/FLINK-34988 Project: Flink Issue Type: Bug

[jira] [Created] (FLINK-34961) GitHub Actions statistcs can be monitored per workflow name

2024-03-28 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-34961: - Summary: GitHub Actions statistcs can be monitored per workflow name Key: FLINK-34961 URL: https://issues.apache.org/jira/browse/FLINK-34961 Project: Flink

[jira] [Created] (FLINK-34940) LeaderContender implementations handle invalid state

2024-03-26 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-34940: - Summary: LeaderContender implementations handle invalid state Key: FLINK-34940 URL: https://issues.apache.org/jira/browse/FLINK-34940 Project: Flink Issue

[jira] [Created] (FLINK-34939) Harden TestingLeaderElection

2024-03-26 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-34939: - Summary: Harden TestingLeaderElection Key: FLINK-34939 URL: https://issues.apache.org/jira/browse/FLINK-34939 Project: Flink Issue Type: Bug

[jira] [Created] (FLINK-34937) Apache Infra GHA policy update

2024-03-26 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-34937: - Summary: Apache Infra GHA policy update Key: FLINK-34937 URL: https://issues.apache.org/jira/browse/FLINK-34937 Project: Flink Issue Type: Bug

[jira] [Created] (FLINK-34933) JobMasterServiceLeadershipRunnerTest#testResultFutureCompletionOfOutdatedLeaderIsIgnored isn't implemented properly

2024-03-25 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-34933: - Summary: JobMasterServiceLeadershipRunnerTest#testResultFutureCompletionOfOutdatedLeaderIsIgnored isn't implemented properly Key: FLINK-34933 URL: https://issues.apach

[jira] [Created] (FLINK-34921) SystemProcessingTimeServiceTest fails due to missing output

2024-03-22 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-34921: - Summary: SystemProcessingTimeServiceTest fails due to missing output Key: FLINK-34921 URL: https://issues.apache.org/jira/browse/FLINK-34921 Project: Flink

[jira] [Created] (FLINK-34897) JobMasterServiceLeadershipRunnerTest#testJobMasterServiceLeadershipRunnerCloseWhenElectionServiceGrantLeaderShip needs to be enabled again

2024-03-20 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-34897: - Summary: JobMasterServiceLeadershipRunnerTest#testJobMasterServiceLeadershipRunnerCloseWhenElectionServiceGrantLeaderShip needs to be enabled again Key: FLINK-34897 URL: https

[jira] [Created] (FLINK-34695) Move Flink's CI docker container into a public repo

2024-03-15 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-34695: - Summary: Move Flink's CI docker container into a public repo Key: FLINK-34695 URL: https://issues.apache.org/jira/browse/FLINK-34695 Project: Flink

Re: [DISCUSS] Removing documentation on Azure Pipelines for Flink forks

2024-03-14 Thread Matthias Pohl
s working ok > > however to be on the safe side what if we mark it for removal or deprecated > first > and then remove together with dropping support of 1.17 where GHA is not > supported IIUC? > > On Thu, Mar 14, 2024 at 11:42 AM Matthias Pohl > wrote: > > > Hi everyone,

[DISCUSS] Removing documentation on Azure Pipelines for Flink forks

2024-03-14 Thread Matthias Pohl
Hi everyone, I'm wondering whether anyone has objections against removing the Azure Pipelines Tutorial to "set up CI for a fork of the Flink repository" in the Flink wiki. Flink's GitHub Actions workflow seems to work fine for forks (at least for 1.18+ changes). No need to guide contributors to the
