Re: [DISCUSS] FLIP-227: Support overdraft buffer

2022-05-25 Thread rui fan
Hi everyone, Offline Confirmed with Anton. He has replied in an earlier email: "I vote for 5 as 'max-overdraft-buffers-per-gate'." So as we understand everybody agrees this. This FLIP-227 discussion is over, I've updated FLIP-227. It will be split into 3 tickets to complete. Thanks for all the d

Re: [DISCUSS] FLIP-227: Support overdraft buffer

2022-05-06 Thread rui fan
Hi I created the FLINK-27530[1] as the parent ticket. And I updated it to FLIP. [1] https://issues.apache.org/jira/browse/FLINK-27530 Thanks fanrui On Fri, May 6, 2022 at 4:27 PM Piotr Nowojski wrote: > Hi, > > I'm not sure. Maybe 5 will be fine? Anton, Dawid, what do you think? > > Can you c

Re: [DISCUSS] FLIP-227: Support overdraft buffer

2022-05-06 Thread Piotr Nowojski
Hi, I'm not sure. Maybe 5 will be fine? Anton, Dawid, what do you think? Can you create a parent ticket for the whole FLIP to group all of the issues together? Also FLIP should be officially voted first. Best, Piotrek pt., 6 maj 2022 o 09:08 rui fan <1996fan...@gmail.com> napisaƂ(a): > Hi Ant

Re: [DISCUSS] FLIP-227: Support overdraft buffer

2022-05-06 Thread rui fan
Hi Anton, Piotrek and Dawid, Thanks for your help. I created FLINK-27522[1] as the first task. And I will finish it asap. @Piotrek, for the default value, do you think it should be less than 5? What do you think about 3? Actually, I think 5 isn't big. It's 1 or 3 or 5 that doesn't matter much, t

Re: [DISCUSS] FLIP-227: Support overdraft buffer

2022-05-05 Thread Piotr Nowojski
Hi fanrui, > How to identify legacySource? legacy sources are always using the SourceStreamTask class and SourceStreamTask is used only for legacy sources. But I'm not sure how to enable/disable that. Adding `disableOverdraft()` call in SourceStreamTask would be better compared to relying on the

Re: [DISCUSS] FLIP-227: Support overdraft buffer

2022-05-05 Thread Anton Kalashnikov
Hi, Thanks Fanrui, It looks correct for me. I vote for 5 as 'max-overdraft-buffers-per-gate'. If I understand correctly, Legacy source can be detected by the operator which is an instance of StreamSource and it is also can be detected by invokable which is an instance of SourceStreamTask. We

Re: [DISCUSS] FLIP-227: Support overdraft buffer

2022-05-05 Thread rui fan
Hi, Thanks a lot for your discussion. After several discussions, I think it's clear now. I updated the "Proposed Changes" of FLIP-227[1]. If I have something missing, please help to add it to FLIP, or add it in the mail and I can add it to FLIP. If everything is OK, I will create a new JIRA for t

Re: [DISCUSS] FLIP-227: Support overdraft buffer

2022-05-05 Thread Piotr Nowojski
Hi again, After sleeping over this, if both versions (reserve and overdraft) have the same complexity, I would also prefer the overdraft. > `Integer.MAX_VALUE` as default value was my idea as well but now, as > Dawid mentioned, I think it is dangerous since it is too implicit for > the user and i

Re: [DISCUSS] FLIP-227: Support overdraft buffer

2022-05-04 Thread rui fan
Hi everyone, I still have some questions. 1. If the total buffers of LocalBufferPool <= the reserve buffers, will LocalBufferPool never be available? Can't process data? 2. If the overdraft buffer use the extra buffers, when the downstream task inputBuffer is insufficient, it should fail to start

Re: [DISCUSS] FLIP-227: Support overdraft buffer

2022-05-04 Thread Anton Kalashnikov
Hi, It is a good point about possible deadlock I have no answer for it right now but we definitely should take it into account. `Integer.MAX_VALUE` as default value was my idea as well but now, as Dawid mentioned, I think it is dangerous since it is too implicit for the user and if the user

Re: [DISCUSS] FLIP-227: Support overdraft buffer

2022-05-04 Thread Dawid Wysakowicz
Hey all, I have not replied in the thread yet, but I was following the discussion. Personally, I like Fanrui's and Anton's idea. As far as I understand it the idea to distinguish between inside flatMap & outside would be fairly simple, but maybe slightly indirect. The checkAvailability would r

Re: [DISCUSS] FLIP-227: Support overdraft buffer

2022-05-04 Thread Piotr Nowojski
Hi, Thanks for the answers. > we may still need to discuss whether the > overdraft/reserve/spare should use extra buffers or buffers > in (exclusive + floating buffers)? and > These things resolve the different problems (at least as I see that). > The current hardcoded "1" says that we switch

Re: [DISCUSS] FLIP-227: Support overdraft buffer

2022-05-03 Thread Anton Kalashnikov
Hi, >> Do you mean to ignore it while processing records, but keep using `maxBuffersPerChannel` when calculating the availability of the output? Yes, it is correct. >> Would it be a big issue if we changed it to check if at least "overdraft number of buffers are available", where "overdraft

Re: [DISCUSS] FLIP-227: Support overdraft buffer

2022-05-03 Thread rui fan
Hi Thanks for Martijn Visser and Piotrek's feedback. I agree with ignoring the legacy source, it will affect our design. User should use the new Source Api as much as possible. Hi Piotrek, we may still need to discuss whether the overdraft/reserve/spare should use extra buffers or buffers in (ex

Re: [DISCUSS] FLIP-227: Support overdraft buffer

2022-05-03 Thread Piotr Nowojski
Hi fanrui, > Do you mean don't add the extra buffers? We just use (exclusive buffers * > parallelism + floating buffers)? The LocalBufferPool will be available when > (usedBuffers+overdraftBuffers <= exclusiveBuffers*parallelism+floatingBuffers) > and all subpartitions don't reach the maxBuffersPe

Re: [DISCUSS] FLIP-227: Support overdraft buffer

2022-05-03 Thread Martijn Visser
Hi everyone, Just wanted to chip in on the discussion of legacy sources: IMHO, we should not focus too much on improving/adding capabilities for legacy sources. We want to persuade and push users to use the new Source API. Yes, this means that there's work required by the end users to port any cus

Re: [DISCUSS] FLIP-227: Support overdraft buffer

2022-05-02 Thread rui fan
Hi Piotrek > Do you mean to ignore it while processing records, but keep using > `maxBuffersPerChannel` when calculating the availability of the output? I think yes, and please Anton Kalashnikov to help double check. > +1 for just having this as a separate configuration. Is it a big problem > th

Re: [DISCUSS] FLIP-227: Support overdraft buffer

2022-05-02 Thread Piotr Nowojski
Hi, +1 for the general proposal from my side. It would be a nice workaround flatMaps, WindowOperators and large records issues with unaligned checkpoints. > The first task is about ignoring max buffers per channel. This means if > we request a memory segment from LocalBufferPool and the > maxBuff

Re: [DISCUSS] FLIP-227: Support overdraft buffer

2022-04-29 Thread rui fan
Let me add some information about the LegacySource. If we want to disable the overdraft buffer for LegacySource. Could we add the enableOverdraft in LocalBufferPool? The default value is false. If the getAvailableFuture is called, change enableOverdraft=true. It indicates whether there are checks

Re: [DISCUSS] FLIP-227: Support overdraft buffer

2022-04-29 Thread rui fan
Hi, Thanks for your quick response. For question 1/2/3, we think they are clear. We just need to discuss the default value in PR. For the legacy source, you are right. It's difficult for general implementation. Currently, we implement ensureRecordWriterIsAvailable() in SourceFunction.SourceConte

Re: [DISCUSS] FLIP-227: Support overdraft buffer

2022-04-29 Thread Anton Kalashnikov
Hi. -- 1. Do you mean split this into two JIRAs or two PRs or two commits in a PR? Perhaps, the separated ticket will be better since this task has fewer questions but we should find a solution for LegacySource first. -- 2. For the first task, if the flink user disables the Unaligned Ch

Re: [DISCUSS] FLIP-227: Support overdraft buffer

2022-04-29 Thread rui fan
Hi, Thanks for your feedback. I have a servel of questions. 1. Do you mean split this into two JIRAs or two PRs or two commits in a PR? 2. For the first task, if the flink user disables the Unaligned Checkpoint, do we ignore max buffers per channel? Because the overdraft isn't usef

Re: [DISCUSS] FLIP-227: Support overdraft buffer

2022-04-29 Thread Anton Kalashnikov
Hi, We discuss about it a little with Dawid Wysakowicz. Here is some conclusion: First of all, let's split this into two tasks. The first task is about ignoring max buffers per channel. This means if we request a memory segment from LocalBufferPool and the maxBuffersPerChannel is reached for

Re: [DISCUSS] FLIP-227: Support overdraft buffer

2022-04-29 Thread rui fan
Hi Anton Kalashnikov, I think you agree with we should limit the maximum number of overdraft segments that each LocalBufferPool can apply for, right? I prefer to hard code the maxOverdraftBuffers due to don't add the new configuration. And I hope to hear more from the community. Best wishes fanr

Re: [DISCUSS] FLIP-227: Support overdraft buffer

2022-04-28 Thread Anton Kalashnikov
Hi, You are right about LegacySource(is the same about BoundedStreamTask?). I am not really sure about hardcoded limit. We need to think about that. Theoretically, when we create a task we know should it use overdraft or not but I am not really sure how we can use this knoweledge yet. 28.04.2

Re: [DISCUSS] FLIP-227: Support overdraft buffer

2022-04-27 Thread rui fan
Hi Anton Kalashnikov, Thanks for your very clear reply, I think you are totally right. The 'maxBuffersNumber - buffersInUseNumber' can be used as the overdraft buffer, it won't need the new buffer configuration.Flink users can turn up the maxBuffersNumber to control the overdraft buffer size. Al

Re: [DISCUSS] FLIP-227: Support overdraft buffer

2022-04-27 Thread Anton Kalashnikov
Hi fanrui, Thanks for creating the FLIP. In general, I think the overdraft is good idea and it should help in described above cases. Here are my thoughts about configuration: Please, correct me if I am wrong but as I understand right now we have following calculation. maxBuffersNumber(per

[DISCUSS] FLIP-227: Support overdraft buffer

2022-04-26 Thread rui fan
Hi everyone, Unaligned Checkpoint (FLIP-76 [1]) is a major feature of Flink. It effectively solves the problem of checkpoint timeout or slow checkpoint when backpressure is severe. We found that UC(Unaligned Checkpoint) does not work well when the back pressure is severe and multiple output buffe