Re: Backpressure handling in FileSource APIs - Flink 1.16
Hi Kamal, The network buffer will be full for specific `FileSource` when the job has back pressure which will block the source subtask. You can refer to network buffer [1] for more information. [1] https://flink.apache.org/2019/06/05/a-deep-dive-into-flinks-network-stack/ Best, Shammon FY On Fri, May 26, 2023 at 7:13 PM Kamal Mittal wrote: > Hello Shammon, > > Can you please point out the classes where like for "FileSource" slow down > logic is placed? > > Just wanted to understand it more better and try it at my end by running > various perf. runs, also apply changes in my application if any. > > Rgds, > Kamal > > On Thu, May 25, 2023 at 9:16 AM Kamal Mittal wrote: > >> Hello Shammon, >> >> Can you please point out the classes where like for "FileSource" slow >> down logic is placed? >> >> Just wanted to understand it more better and try it at my end by running >> various perf. runs, also apply changes in my application if any. >> >> Rgds, >> Kamal >> >> On Wed, May 24, 2023 at 11:41 AM Kamal Mittal wrote: >> >>> Thanks Shammon for clarification. >>> >>> On Wed, May 24, 2023 at 11:01 AM Shammon FY wrote: >>> Hi Kamal, The source will slow down when there is backpressure in the flink job, you can refer to docs [1] and [2] to get more detailed information about backpressure mechanism. Currently there's no API or Callback in source for users to do some customized operations for backpressure, but users can collect the metrics of the job and analysis, for example, the metrics in [1] and [3]. I hope this can help you. [1] https://flink.apache.org/2021/07/07/how-to-identify-the-source-of-backpressure/#:~:text=Backpressure%20is%20an%20indicator%20that,the%20queues%20before%20being%20processed . [2] https://www.alibabacloud.com/blog/analysis-of-network-flow-control-and-back-pressure-flink-advanced-tutorials_596632 [3] https://nightlies.apache.org/flink/flink-docs-master/docs/ops/monitoring/back_pressure/ On Tue, May 23, 2023 at 9:40 PM Kamal Mittal wrote: > Hello Community, > > Can you please share views about the query asked above w.r.t back > pressure for FileSource APIs for Bulk and Record stream formats. > Planning to use these APIs w.r.t AVRO to Parquet and vice-versa > conversion. > > Rgds, > Kamal > > On Tue, 23 May 2023, 12:26 pm Kamal Mittal, > wrote: > >> Added Flink community DL as well. >> >> -- Forwarded message - >> From: Kamal Mittal >> Date: Tue, May 23, 2023 at 7:57 AM >> Subject: Re: Backpressure handling in FileSource APIs - Flink 1.16 >> To: Shammon FY >> >> >> Hello, >> >> Yes, want to take some custom actions and also if there is any >> default behavior of slowing down sending data in pipeline further or >> reading data from source somehow? >> >> Rgds, >> Kamal >> >> On Tue, May 23, 2023 at 6:06 AM Shammon FY wrote: >> >>> Hi Kamal, >>> >>> If I understand correctly, do you want the source to do some custom >>> actions, such as current limiting, when there is backpressure in the >>> job? >>> >>> Best, >>> Shammon FY >>> >>> >>> On Mon, May 22, 2023 at 2:12 PM Kamal Mittal >>> wrote: >>> Hello Community, Can you please share views about the query asked above w.r.t back pressure for FileSource APIs for Bulk and Record stream formats. Planning to use these APIs w.r.t AVRO to Parquet and vice-versa conversion. Rgds, Kamal On Thu, May 18, 2023 at 2:33 PM Kamal Mittal wrote: > Hello Community, > > Does FileSource APIs for Bulk and Record stream formats handle > back pressure by any way like slowing down sending data in piepline > further > or reading data from source somehow? > Or does it give any callback/handle so that any action can be > taken? Can you please share details if any? > > Rgds, > Kamal >
Re: [ANNOUNCE] Apache Flink 1.16.2 released
Hi Jing, Thank you for caring about the releasing process. It has to be said that the entire process went smoothly. We have very comprehensive documentation[1] to guide my work, thanks to the contribution of previous release managers and the community. Regarding the obstacles, I actually only have one minor problem: We used an older twine(1.12.0) to deploy python artifacts to PyPI, and its compatible dependencies (such as urllib3) are also older. When I tried twine upload, the process couldn't work as expected as the version of urllib3 installed in my machine was relatively higher. In order to solve this, I had to proactively downgrade the version of some dependencies. I added a notice in the cwiki page[1] to prevent future release managers from encountering the same problem. It seems that this is a known issue(see comments in [2]), which has been resolved in the higher version of twine, I wonder if we can upgrade the version of twine? Does anyone remember the reason why we fixed a very old version(1.12.0)? Best regards, Weijie [1] https://cwiki.apache.org/confluence/display/FLINK/Creating+a+Flink+Release [2] https://github.com/pypa/twine/issues/997 Jing Ge 于2023年5月27日周六 00:15写道: > Hi Weijie, > > Thanks again for your effort. I was wondering if there were any obstacles > you had to overcome while releasing 1.16.2 and 1.17.1 that could lead us to > any improvement wrt the release process and management? > > Best regards, > Jing > > On Fri, May 26, 2023 at 4:41 PM Martijn Visser > wrote: > > > Thank you Weijie and those who helped with testing! > > > > On Fri, May 26, 2023 at 1:06 PM weijie guo > > wrote: > > > > > The Apache Flink community is very happy to announce the release of > > > Apache Flink 1.16.2, which is the second bugfix release for the Apache > > > Flink 1.16 series. > > > > > > > > > > > > Apache Flink® is an open-source stream processing framework for > > > distributed, high-performing, always-available, and accurate data > > > streaming applications. > > > > > > > > > > > > The release is available for download at: > > > > > > https://flink.apache.org/downloads.html > > > > > > > > > > > > Please check out the release blog post for an overview of the > > > improvements for this bugfix release: > > > > > > https://flink.apache.org/news/2023/05/25/release-1.16.2.html > > > > > > > > > > > > The full release notes are available in Jira: > > > > > > > > > > > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352765 > > > > > > > > > > > > We would like to thank all contributors of the Apache Flink community > > > who made this release possible! > > > > > > > > > > > > Feel free to reach out to the release managers (or respond to this > > > thread) with feedback on the release process. Our goal is to > > > constantly improve the release process. Feedback on what could be > > > improved or things that didn't go so well are appreciated. > > > > > > > > > > > > Regards, > > > > > > Release Manager > > > > > >
[DISCUSS] Hive dialect shouldn't fall back to Flink's default dialect
Hi, community . I want to start the discussion about Hive dialect shouldn't fall back to Flink's default dialect. Currently, when the HiveParser fail to parse the sql in Hive dialect, it'll fall back to Flink's default parser[1] to handle flink-specific statements like "CREATE CATALOG xx with (xx);". As I‘m involving with Hive dialect and have some communication with community users who use Hive dialectrecently, I'm thinking throw exception directly instead of falling back to Flink's default dialect when fail to parse the sql in Hive dialect Here're some reasons: First of all, it'll hide some error with Hive dialect. For example, we found we can't use Hive dialect any more with Flink sql client in release validation phase[2], finally we find a modification in Flink sql client cause it, but our test case can't find it earlier for although HiveParser faill to parse it but then it'll fall back to default parser and pass test case successfully. Second, conceptually, Hive dialect should be do nothing with Flink's default dialect. They are two totally different dialect. If we do need a dialect mixing Hive dialect and default dialect , may be we need to propose a new hybrid dialect and announce the hybrid behavior to users. Also, It made some users confused for the fallback behavior. The fact comes from I had been ask by community users. Throw an excpetioin directly when fail to parse the sql statement in Hive dialect will be more intuitive. Last but not least, it's import to decouple Hive with Flink planner[3] before we can externalize Hive connector[4]. If we still fall back to Flink default dialct, then we will need depend on `ParserImpl` in Flink planner, which will block us removing the provided dependency of Hive dialect as well as externalizing Hive connector. Although we hadn't announced the fall back behavior ever, but some users may implicitly depend on this behavior in theirs sql jobs. So, I hereby open the dicussion about abandoning the fall back behavior to make Hive dialect clear and isoloted. Please remember it won't break the Hive synatax but the syntax specified to Flink may fail after then. But for the failed sql, you can use `SET table.sql-dialect=default;` to switch to Flink dialect. If there's some flink-specific statements we found should be included in Hive dialect to be easy to use, I think we can still add them as specific cases to Hive dialect. Look forwards to your feedback. I'd love to listen the feedback from community to take the next steps. [1]:https://github.com/apache/flink/blob/678370b18e1b6c4a23e5ce08f8efd05675a0cc17/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/planner/delegation/hive/HiveParser.java#L348 [2]:https://issues.apache.org/jira/browse/FLINK-26681 [3]:https://issues.apache.org/jira/browse/FLINK-31413 [4]:https://issues.apache.org/jira/browse/FLINK-30064 Best regards, Yuxia