[jira] [Updated] (FLINK-34679) "Core Concept" Pages for Flink CDC Documentation

2024-03-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-34679:
---
Labels: pull-request-available  (was: )

> "Core Concept" Pages for Flink CDC Documentation
> 
>
> Key: FLINK-34679
> URL: https://issues.apache.org/jira/browse/FLINK-34679
> Project: Flink
>  Issue Type: Sub-task
>  Components: Documentation, Flink CDC
>Affects Versions: cdc-3.1.0
>Reporter: Qingsheng Ren
>Assignee: LvYanquan
>Priority: Major
>  Labels: pull-request-available
> Fix For: cdc-3.1.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] [FLINK-34679][cdc] add doc under core-concept. [flink-cdc]

2024-03-15 Thread via GitHub


LYanquan opened a new pull request, #3153:
URL: https://github.com/apache/flink-cdc/pull/3153

   A follow-up of https://github.com/apache/flink-cdc/pull/3146 to complete the 
docs.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-34671) Update the content of README.md in FlinkCDC project

2024-03-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-34671:
---
Labels: pull-request-available  (was: )

> Update the content of README.md in FlinkCDC project
> ---
>
> Key: FLINK-34671
> URL: https://issues.apache.org/jira/browse/FLINK-34671
> Project: Flink
>  Issue Type: Improvement
>  Components: Flink CDC
>Affects Versions: cdc-3.1.0
>Reporter: LvYanquan
>Priority: Major
>  Labels: pull-request-available
> Fix For: cdc-3.1.0
>
>
> As we have updated the doc site of FlinkCDC, we should modify the content of 
> README.md to update those links and add a more accurate description.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] [FLINK-34671][cdc] update README.md file to update links and description. [flink-cdc]

2024-03-15 Thread via GitHub


LYanquan opened a new pull request, #3152:
URL: https://github.com/apache/flink-cdc/pull/3152

   Some documentation links should be updated after 
https://github.com/apache/flink-cdc/pull/3146 is merged.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-34700) CLONE - Create Git tag and mark version as released in Jira

2024-03-15 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee updated FLINK-34700:

Reporter: lincoln lee  (was: Sergey Nuyanzin)

> CLONE - Create Git tag and mark version as released in Jira
> ---
>
> Key: FLINK-34700
> URL: https://issues.apache.org/jira/browse/FLINK-34700
> Project: Flink
>  Issue Type: Sub-task
>Reporter: lincoln lee
>Assignee: lincoln lee
>Priority: Major
>
> Create and push a new Git tag for the released version by copying the tag for 
> the final release candidate, as follows:
> {code:java}
> $ git tag -s "release-${RELEASE_VERSION}" refs/tags/${TAG}^{} -m "Release 
> Flink ${RELEASE_VERSION}"
> $ git push <remote> refs/tags/release-${RELEASE_VERSION}
> {code}
> In JIRA, inside [version 
> management|https://issues.apache.org/jira/plugins/servlet/project-config/FLINK/versions],
>  hover over the current release and a settings menu will appear. Click 
> Release, and select today’s date.
> (Note: Only PMC members have access to the project administration. If you do 
> not have access, ask on the mailing list for assistance.)
> If PRs have been merged to the release branch after the last release 
> candidate was tagged, make sure that the corresponding Jira tickets have the 
> correct Fix Version set.
>  
> 
> h3. Expectations
>  * Release tagged in the source code repository
>  * Release version finalized in JIRA. (Note: Not all committers have 
> administrator access to JIRA. If you end up getting permissions errors ask on 
> the mailing list for assistance)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-34701) Publish the Dockerfiles for the new release

2024-03-15 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee reassigned FLINK-34701:
---

Assignee: lincoln lee  (was: Jing Ge)

> Publish the Dockerfiles for the new release
> ---
>
> Key: FLINK-34701
> URL: https://issues.apache.org/jira/browse/FLINK-34701
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Sergey Nuyanzin
>Assignee: lincoln lee
>Priority: Major
>  Labels: pull-request-available
>
> Note: the official Dockerfiles fetch the binary distribution of the target 
> Flink version from an Apache mirror. After publishing the binary release 
> artifacts, mirrors can take some hours to start serving the new artifacts, so 
> you may want to wait to do this step until you are ready to continue with the 
> "Promote the release" steps in the follow-up Jira.
> Follow the [release instructions in the flink-docker 
> repo|https://github.com/apache/flink-docker#release-workflow] to build the 
> new Dockerfiles and send an updated manifest to Docker Hub so the new images 
> are built and published.
>  
> 
> h3. Expectations
>  * Dockerfiles in [flink-docker|https://github.com/apache/flink-docker] 
> updated for the new Flink release and pull request opened on the Docker 
> official-images with an updated manifest



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-34701) Publish the Dockerfiles for the new release

2024-03-15 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee updated FLINK-34701:

Reporter: lincoln lee  (was: Sergey Nuyanzin)

> Publish the Dockerfiles for the new release
> ---
>
> Key: FLINK-34701
> URL: https://issues.apache.org/jira/browse/FLINK-34701
> Project: Flink
>  Issue Type: Sub-task
>Reporter: lincoln lee
>Assignee: lincoln lee
>Priority: Major
>  Labels: pull-request-available
>
> Note: the official Dockerfiles fetch the binary distribution of the target 
> Flink version from an Apache mirror. After publishing the binary release 
> artifacts, mirrors can take some hours to start serving the new artifacts, so 
> you may want to wait to do this step until you are ready to continue with the 
> "Promote the release" steps in the follow-up Jira.
> Follow the [release instructions in the flink-docker 
> repo|https://github.com/apache/flink-docker#release-workflow] to build the 
> new Dockerfiles and send an updated manifest to Docker Hub so the new images 
> are built and published.
>  
> 
> h3. Expectations
>  * Dockerfiles in [flink-docker|https://github.com/apache/flink-docker] 
> updated for the new Flink release and pull request opened on the Docker 
> official-images with an updated manifest



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-34701) Publish the Dockerfiles for the new release

2024-03-15 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee updated FLINK-34701:

Summary: Publish the Dockerfiles for the new release  (was: CLONE - Publish 
the Dockerfiles for the new release)

> Publish the Dockerfiles for the new release
> ---
>
> Key: FLINK-34701
> URL: https://issues.apache.org/jira/browse/FLINK-34701
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Sergey Nuyanzin
>Assignee: Jing Ge
>Priority: Major
>  Labels: pull-request-available
>
> Note: the official Dockerfiles fetch the binary distribution of the target 
> Flink version from an Apache mirror. After publishing the binary release 
> artifacts, mirrors can take some hours to start serving the new artifacts, so 
> you may want to wait to do this step until you are ready to continue with the 
> "Promote the release" steps in the follow-up Jira.
> Follow the [release instructions in the flink-docker 
> repo|https://github.com/apache/flink-docker#release-workflow] to build the 
> new Dockerfiles and send an updated manifest to Docker Hub so the new images 
> are built and published.
>  
> 
> h3. Expectations
>  * Dockerfiles in [flink-docker|https://github.com/apache/flink-docker] 
> updated for the new Flink release and pull request opened on the Docker 
> official-images with an updated manifest



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-34700) Create Git tag and mark version as released in Jira

2024-03-15 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee updated FLINK-34700:

Summary: Create Git tag and mark version as released in Jira  (was: CLONE - 
Create Git tag and mark version as released in Jira)

> Create Git tag and mark version as released in Jira
> ---
>
> Key: FLINK-34700
> URL: https://issues.apache.org/jira/browse/FLINK-34700
> Project: Flink
>  Issue Type: Sub-task
>Reporter: lincoln lee
>Assignee: lincoln lee
>Priority: Major
>
> Create and push a new Git tag for the released version by copying the tag for 
> the final release candidate, as follows:
> {code:java}
> $ git tag -s "release-${RELEASE_VERSION}" refs/tags/${TAG}^{} -m "Release 
> Flink ${RELEASE_VERSION}"
> $ git push <remote> refs/tags/release-${RELEASE_VERSION}
> {code}
> In JIRA, inside [version 
> management|https://issues.apache.org/jira/plugins/servlet/project-config/FLINK/versions],
>  hover over the current release and a settings menu will appear. Click 
> Release, and select today’s date.
> (Note: Only PMC members have access to the project administration. If you do 
> not have access, ask on the mailing list for assistance.)
> If PRs have been merged to the release branch after the last release 
> candidate was tagged, make sure that the corresponding Jira tickets have the 
> correct Fix Version set.
>  
> 
> h3. Expectations
>  * Release tagged in the source code repository
>  * Release version finalized in JIRA. (Note: Not all committers have 
> administrator access to JIRA. If you end up getting permissions errors ask on 
> the mailing list for assistance)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-34700) Create Git tag and mark version as released in Jira

2024-03-15 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee closed FLINK-34700.
---
Resolution: Fixed

https://github.com/apache/flink/releases/tag/release-1.19.0

> Create Git tag and mark version as released in Jira
> ---
>
> Key: FLINK-34700
> URL: https://issues.apache.org/jira/browse/FLINK-34700
> Project: Flink
>  Issue Type: Sub-task
>Reporter: lincoln lee
>Assignee: lincoln lee
>Priority: Major
>
> Create and push a new Git tag for the released version by copying the tag for 
> the final release candidate, as follows:
> {code:java}
> $ git tag -s "release-${RELEASE_VERSION}" refs/tags/${TAG}^{} -m "Release 
> Flink ${RELEASE_VERSION}"
> $ git push <remote> refs/tags/release-${RELEASE_VERSION}
> {code}
> In JIRA, inside [version 
> management|https://issues.apache.org/jira/plugins/servlet/project-config/FLINK/versions],
>  hover over the current release and a settings menu will appear. Click 
> Release, and select today’s date.
> (Note: Only PMC members have access to the project administration. If you do 
> not have access, ask on the mailing list for assistance.)
> If PRs have been merged to the release branch after the last release 
> candidate was tagged, make sure that the corresponding Jira tickets have the 
> correct Fix Version set.
>  
> 
> h3. Expectations
>  * Release tagged in the source code repository
>  * Release version finalized in JIRA. (Note: Not all committers have 
> administrator access to JIRA. If you end up getting permissions errors ask on 
> the mailing list for assistance)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-34700) CLONE - Create Git tag and mark version as released in Jira

2024-03-15 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee reassigned FLINK-34700:
---

Assignee: (was: Jing Ge)

> CLONE - Create Git tag and mark version as released in Jira
> ---
>
> Key: FLINK-34700
> URL: https://issues.apache.org/jira/browse/FLINK-34700
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Sergey Nuyanzin
>Priority: Major
>
> Create and push a new Git tag for the released version by copying the tag for 
> the final release candidate, as follows:
> {code:java}
> $ git tag -s "release-${RELEASE_VERSION}" refs/tags/${TAG}^{} -m "Release 
> Flink ${RELEASE_VERSION}"
> $ git push <remote> refs/tags/release-${RELEASE_VERSION}
> {code}
> In JIRA, inside [version 
> management|https://issues.apache.org/jira/plugins/servlet/project-config/FLINK/versions],
>  hover over the current release and a settings menu will appear. Click 
> Release, and select today’s date.
> (Note: Only PMC members have access to the project administration. If you do 
> not have access, ask on the mailing list for assistance.)
> If PRs have been merged to the release branch after the last release 
> candidate was tagged, make sure that the corresponding Jira tickets have the 
> correct Fix Version set.
>  
> 
> h3. Expectations
>  * Release tagged in the source code repository
>  * Release version finalized in JIRA. (Note: Not all committers have 
> administrator access to JIRA. If you end up getting permissions errors ask on 
> the mailing list for assistance)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-34699) Deploy artifacts to Maven Central Repository

2024-03-15 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee closed FLINK-34699.
---
Resolution: Fixed

release: [https://dist.apache.org/repos/dist/release/flink/flink-1.19.0/]

dev cleanup: https://dist.apache.org/repos/dist/dev/flink/

> Deploy artifacts to Maven Central Repository
> 
>
> Key: FLINK-34699
> URL: https://issues.apache.org/jira/browse/FLINK-34699
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Lincoln Lee
>Assignee: lincoln lee
>Priority: Major
>
> Use the [Apache Nexus repository|https://repository.apache.org/] to release 
> the staged binary artifacts to the Maven Central repository. In the Staging 
> Repositories section, find the relevant release candidate orgapacheflink-XXX 
> entry and click Release. Drop all other release candidates that are not being 
> released.
> h3. Deploy source and binary releases to dist.apache.org
> Copy the source and binary releases from the dev repository to the release 
> repository at [dist.apache.org|http://dist.apache.org/] using Subversion.
> {code:java}
> $ svn move -m "Release Flink ${RELEASE_VERSION}" 
> https://dist.apache.org/repos/dist/dev/flink/flink-${RELEASE_VERSION}-rc${RC_NUM}
>  https://dist.apache.org/repos/dist/release/flink/flink-${RELEASE_VERSION}
> {code}
> (Note: Only PMC members have access to the release repository. If you do not 
> have access, ask on the mailing list for assistance.)
> h3. Remove old release candidates from 
> [dist.apache.org|http://dist.apache.org/]
> Remove the old release candidates from 
> [https://dist.apache.org/repos/dist/dev/flink] using Subversion.
> {code:java}
> $ svn checkout https://dist.apache.org/repos/dist/dev/flink --depth=immediates
> $ cd flink
> $ svn remove flink-${RELEASE_VERSION}-rc*
> $ svn commit -m "Remove old release candidates for Apache Flink 
> ${RELEASE_VERSION}"
> {code}
>  
> 
> h3. Expectations
>  * Maven artifacts released and indexed in the [Maven Central 
> Repository|https://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.apache.flink%22]
>  (usually takes about a day to show up)
>  * Source & binary distributions available in the release repository of 
> [https://dist.apache.org/repos/dist/release/flink/]
>  * Dev repository [https://dist.apache.org/repos/dist/dev/flink/] is empty
>  * Website contains links to new release binaries and sources in download page
>  * (for minor version updates) the front page references the correct new 
> major release version and directs to the correct link



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-34700) CLONE - Create Git tag and mark version as released in Jira

2024-03-15 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee reassigned FLINK-34700:
---

Assignee: lincoln lee

> CLONE - Create Git tag and mark version as released in Jira
> ---
>
> Key: FLINK-34700
> URL: https://issues.apache.org/jira/browse/FLINK-34700
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Sergey Nuyanzin
>Assignee: lincoln lee
>Priority: Major
>
> Create and push a new Git tag for the released version by copying the tag for 
> the final release candidate, as follows:
> {code:java}
> $ git tag -s "release-${RELEASE_VERSION}" refs/tags/${TAG}^{} -m "Release 
> Flink ${RELEASE_VERSION}"
> $ git push <remote> refs/tags/release-${RELEASE_VERSION}
> {code}
> In JIRA, inside [version 
> management|https://issues.apache.org/jira/plugins/servlet/project-config/FLINK/versions],
>  hover over the current release and a settings menu will appear. Click 
> Release, and select today’s date.
> (Note: Only PMC members have access to the project administration. If you do 
> not have access, ask on the mailing list for assistance.)
> If PRs have been merged to the release branch after the last release 
> candidate was tagged, make sure that the corresponding Jira tickets have the 
> correct Fix Version set.
>  
> 
> h3. Expectations
>  * Release tagged in the source code repository
>  * Release version finalized in JIRA. (Note: Not all committers have 
> administrator access to JIRA. If you end up getting permissions errors ask on 
> the mailing list for assistance)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-34699) CLONE - Deploy artifacts to Maven Central Repository

2024-03-15 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee reassigned FLINK-34699:
---

Assignee: lincoln lee  (was: Jing Ge)

> CLONE - Deploy artifacts to Maven Central Repository
> 
>
> Key: FLINK-34699
> URL: https://issues.apache.org/jira/browse/FLINK-34699
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Sergey Nuyanzin
>Assignee: lincoln lee
>Priority: Major
>
> Use the [Apache Nexus repository|https://repository.apache.org/] to release 
> the staged binary artifacts to the Maven Central repository. In the Staging 
> Repositories section, find the relevant release candidate orgapacheflink-XXX 
> entry and click Release. Drop all other release candidates that are not being 
> released.
> h3. Deploy source and binary releases to dist.apache.org
> Copy the source and binary releases from the dev repository to the release 
> repository at [dist.apache.org|http://dist.apache.org/] using Subversion.
> {code:java}
> $ svn move -m "Release Flink ${RELEASE_VERSION}" 
> https://dist.apache.org/repos/dist/dev/flink/flink-${RELEASE_VERSION}-rc${RC_NUM}
>  https://dist.apache.org/repos/dist/release/flink/flink-${RELEASE_VERSION}
> {code}
> (Note: Only PMC members have access to the release repository. If you do not 
> have access, ask on the mailing list for assistance.)
> h3. Remove old release candidates from 
> [dist.apache.org|http://dist.apache.org/]
> Remove the old release candidates from 
> [https://dist.apache.org/repos/dist/dev/flink] using Subversion.
> {code:java}
> $ svn checkout https://dist.apache.org/repos/dist/dev/flink --depth=immediates
> $ cd flink
> $ svn remove flink-${RELEASE_VERSION}-rc*
> $ svn commit -m "Remove old release candidates for Apache Flink 
> ${RELEASE_VERSION}"
> {code}
>  
> 
> h3. Expectations
>  * Maven artifacts released and indexed in the [Maven Central 
> Repository|https://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.apache.flink%22]
>  (usually takes about a day to show up)
>  * Source & binary distributions available in the release repository of 
> [https://dist.apache.org/repos/dist/release/flink/]
>  * Dev repository [https://dist.apache.org/repos/dist/dev/flink/] is empty
>  * Website contains links to new release binaries and sources in download page
>  * (for minor version updates) the front page references the correct new 
> major release version and directs to the correct link



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-34699) CLONE - Deploy artifacts to Maven Central Repository

2024-03-15 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee updated FLINK-34699:

Reporter: Lincoln Lee  (was: Sergey Nuyanzin)

> CLONE - Deploy artifacts to Maven Central Repository
> 
>
> Key: FLINK-34699
> URL: https://issues.apache.org/jira/browse/FLINK-34699
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Lincoln Lee
>Assignee: lincoln lee
>Priority: Major
>
> Use the [Apache Nexus repository|https://repository.apache.org/] to release 
> the staged binary artifacts to the Maven Central repository. In the Staging 
> Repositories section, find the relevant release candidate orgapacheflink-XXX 
> entry and click Release. Drop all other release candidates that are not being 
> released.
> h3. Deploy source and binary releases to dist.apache.org
> Copy the source and binary releases from the dev repository to the release 
> repository at [dist.apache.org|http://dist.apache.org/] using Subversion.
> {code:java}
> $ svn move -m "Release Flink ${RELEASE_VERSION}" 
> https://dist.apache.org/repos/dist/dev/flink/flink-${RELEASE_VERSION}-rc${RC_NUM}
>  https://dist.apache.org/repos/dist/release/flink/flink-${RELEASE_VERSION}
> {code}
> (Note: Only PMC members have access to the release repository. If you do not 
> have access, ask on the mailing list for assistance.)
> h3. Remove old release candidates from 
> [dist.apache.org|http://dist.apache.org/]
> Remove the old release candidates from 
> [https://dist.apache.org/repos/dist/dev/flink] using Subversion.
> {code:java}
> $ svn checkout https://dist.apache.org/repos/dist/dev/flink --depth=immediates
> $ cd flink
> $ svn remove flink-${RELEASE_VERSION}-rc*
> $ svn commit -m "Remove old release candidates for Apache Flink 
> ${RELEASE_VERSION}"
> {code}
>  
> 
> h3. Expectations
>  * Maven artifacts released and indexed in the [Maven Central 
> Repository|https://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.apache.flink%22]
>  (usually takes about a day to show up)
>  * Source & binary distributions available in the release repository of 
> [https://dist.apache.org/repos/dist/release/flink/]
>  * Dev repository [https://dist.apache.org/repos/dist/dev/flink/] is empty
>  * Website contains links to new release binaries and sources in download page
>  * (for minor version updates) the front page references the correct new 
> major release version and directs to the correct link



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-34699) Deploy artifacts to Maven Central Repository

2024-03-15 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee updated FLINK-34699:

Summary: Deploy artifacts to Maven Central Repository  (was: CLONE - Deploy 
artifacts to Maven Central Repository)

> Deploy artifacts to Maven Central Repository
> 
>
> Key: FLINK-34699
> URL: https://issues.apache.org/jira/browse/FLINK-34699
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Lincoln Lee
>Assignee: lincoln lee
>Priority: Major
>
> Use the [Apache Nexus repository|https://repository.apache.org/] to release 
> the staged binary artifacts to the Maven Central repository. In the Staging 
> Repositories section, find the relevant release candidate orgapacheflink-XXX 
> entry and click Release. Drop all other release candidates that are not being 
> released.
> h3. Deploy source and binary releases to dist.apache.org
> Copy the source and binary releases from the dev repository to the release 
> repository at [dist.apache.org|http://dist.apache.org/] using Subversion.
> {code:java}
> $ svn move -m "Release Flink ${RELEASE_VERSION}" 
> https://dist.apache.org/repos/dist/dev/flink/flink-${RELEASE_VERSION}-rc${RC_NUM}
>  https://dist.apache.org/repos/dist/release/flink/flink-${RELEASE_VERSION}
> {code}
> (Note: Only PMC members have access to the release repository. If you do not 
> have access, ask on the mailing list for assistance.)
> h3. Remove old release candidates from 
> [dist.apache.org|http://dist.apache.org/]
> Remove the old release candidates from 
> [https://dist.apache.org/repos/dist/dev/flink] using Subversion.
> {code:java}
> $ svn checkout https://dist.apache.org/repos/dist/dev/flink --depth=immediates
> $ cd flink
> $ svn remove flink-${RELEASE_VERSION}-rc*
> $ svn commit -m "Remove old release candidates for Apache Flink 
> ${RELEASE_VERSION}"
> {code}
>  
> 
> h3. Expectations
>  * Maven artifacts released and indexed in the [Maven Central 
> Repository|https://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.apache.flink%22]
>  (usually takes about a day to show up)
>  * Source & binary distributions available in the release repository of 
> [https://dist.apache.org/repos/dist/release/flink/]
>  * Dev repository [https://dist.apache.org/repos/dist/dev/flink/] is empty
>  * Website contains links to new release binaries and sources in download page
>  * (for minor version updates) the front page references the correct new 
> major release version and directs to the correct link



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-34698) Deploy Python artifacts to PyPI

2024-03-15 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee closed FLINK-34698.
---
Resolution: Fixed

pypi:

[https://pypi.org/project/apache-flink/#files]

[https://pypi.org/project/apache-flink-libraries/#files]

 

Release Wiki Page Updates:
 * switch to using a token for uploading the Python artifacts
!https://cwiki.apache.org/confluence/download/attachments/276105702/image-2024-3-15_20-54-28.png?version=1=1710507268804=v2|height=150!
 * no need to upload signatures to PyPI
!https://cwiki.apache.org/confluence/download/attachments/276105702/image-2024-3-15_20-55-54.png?version=1=1710507355000=v2|height=150!
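For reference, a token-based twine upload looks roughly like the following (a minimal sketch; the token value and file paths are placeholders, not taken from the wiki page):
{code:bash}
# Authenticate with a PyPI API token instead of a username/password.
# "__token__" is the literal username twine expects for token-based auth;
# the pypi-... value is a placeholder, not a real token.
export TWINE_USERNAME=__token__
export TWINE_PASSWORD=pypi-XXXXXXXXXXXXXXXX

# Upload the built wheels and source distributions (paths are illustrative).
twine upload --repository-url https://upload.pypi.org/legacy/ python/*.whl
twine upload --repository-url https://upload.pypi.org/legacy/ apache-flink-*.tar.gz
{code}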

> Deploy Python artifacts to PyPI
> ---
>
> Key: FLINK-34698
> URL: https://issues.apache.org/jira/browse/FLINK-34698
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Lincoln Lee
>Assignee: lincoln lee
>Priority: Major
>
> The release manager should create a PyPI account and ask the PMC to add this 
> account to the pyflink collaborator list with the Maintainer role (The PyPI admin account 
> info can be found here. NOTE, only visible to PMC members) to deploy the 
> Python artifacts to PyPI. The artifacts could be uploaded using 
> twine([https://pypi.org/project/twine/]). To install twine, just run:
> {code:java}
> pip install --upgrade twine==1.12.0
> {code}
> Download the Python artifacts from dist.apache.org and upload them to pypi.org:
> {code:java}
> svn checkout 
> https://dist.apache.org/repos/dist/dev/flink/flink-${RELEASE_VERSION}-rc${RC_NUM}
> cd flink-${RELEASE_VERSION}-rc${RC_NUM}
>  
> cd python
>  
> #uploads wheels
> for f in *.whl; do twine upload --repository-url 
> https://upload.pypi.org/legacy/ $f $f.asc; done
>  
> #upload source packages
> twine upload --repository-url https://upload.pypi.org/legacy/ 
> apache-flink-libraries-${RELEASE_VERSION}.tar.gz 
> apache-flink-libraries-${RELEASE_VERSION}.tar.gz.asc
>  
> twine upload --repository-url https://upload.pypi.org/legacy/ 
> apache-flink-${RELEASE_VERSION}.tar.gz 
> apache-flink-${RELEASE_VERSION}.tar.gz.asc
> {code}
> If the upload failed or was incorrect for some reason (e.g. a network 
> transmission problem), you need to delete the already uploaded release package 
> of the same version (if it exists), rename the artifact to 
> {{apache-flink-${RELEASE_VERSION}.post0.tar.gz}}, and then re-upload.
> (!) Note: re-uploading to pypi.org must be avoided as much as possible 
> because it will cause some irreparable problems. If that happens, users 
> cannot install the apache-flink package by explicitly specifying the package 
> version, i.e. the following command "pip install 
> apache-flink==${RELEASE_VERSION}" will fail. Instead they have to run "pip 
> install apache-flink" or "pip install apache-flink==${RELEASE_VERSION}.post0" 
> to install the apache-flink package.
>  
> 
> h3. Expectations
>  * Python artifacts released and indexed in the 
> [PyPI|https://pypi.org/project/apache-flink/] Repository



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-34698) Deploy Python artifacts to PyPI

2024-03-15 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee reassigned FLINK-34698:
---

Assignee: lincoln lee  (was: Jing Ge)

> Deploy Python artifacts to PyPI
> ---
>
> Key: FLINK-34698
> URL: https://issues.apache.org/jira/browse/FLINK-34698
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Sergey Nuyanzin
>Assignee: lincoln lee
>Priority: Major
>
> The release manager should create a PyPI account and ask the PMC to add this 
> account to the pyflink collaborator list with the Maintainer role (The PyPI admin account 
> info can be found here. NOTE, only visible to PMC members) to deploy the 
> Python artifacts to PyPI. The artifacts could be uploaded using 
> twine([https://pypi.org/project/twine/]). To install twine, just run:
> {code:java}
> pip install --upgrade twine==1.12.0
> {code}
> Download the Python artifacts from dist.apache.org and upload them to pypi.org:
> {code:java}
> svn checkout 
> https://dist.apache.org/repos/dist/dev/flink/flink-${RELEASE_VERSION}-rc${RC_NUM}
> cd flink-${RELEASE_VERSION}-rc${RC_NUM}
>  
> cd python
>  
> #uploads wheels
> for f in *.whl; do twine upload --repository-url 
> https://upload.pypi.org/legacy/ $f $f.asc; done
>  
> #upload source packages
> twine upload --repository-url https://upload.pypi.org/legacy/ 
> apache-flink-libraries-${RELEASE_VERSION}.tar.gz 
> apache-flink-libraries-${RELEASE_VERSION}.tar.gz.asc
>  
> twine upload --repository-url https://upload.pypi.org/legacy/ 
> apache-flink-${RELEASE_VERSION}.tar.gz 
> apache-flink-${RELEASE_VERSION}.tar.gz.asc
> {code}
> If the upload failed or was incorrect for some reason (e.g. a network 
> transmission problem), you need to delete the already uploaded release package 
> of the same version (if it exists), rename the artifact to 
> {{apache-flink-${RELEASE_VERSION}.post0.tar.gz}}, and then re-upload.
> (!) Note: re-uploading to pypi.org must be avoided as much as possible 
> because it will cause some irreparable problems. If that happens, users 
> cannot install the apache-flink package by explicitly specifying the package 
> version, i.e. the following command "pip install 
> apache-flink==${RELEASE_VERSION}" will fail. Instead they have to run "pip 
> install apache-flink" or "pip install apache-flink==${RELEASE_VERSION}.post0" 
> to install the apache-flink package.
>  
> 
> h3. Expectations
>  * Python artifacts released and indexed in the 
> [PyPI|https://pypi.org/project/apache-flink/] Repository



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34701) CLONE - Publish the Dockerfiles for the new release

2024-03-15 Thread lincoln lee (Jira)
lincoln lee created FLINK-34701:
---

 Summary: CLONE - Publish the Dockerfiles for the new release
 Key: FLINK-34701
 URL: https://issues.apache.org/jira/browse/FLINK-34701
 Project: Flink
  Issue Type: Sub-task
Reporter: Sergey Nuyanzin
Assignee: Jing Ge


Note: the official Dockerfiles fetch the binary distribution of the target 
Flink version from an Apache mirror. After publishing the binary release 
artifacts, mirrors can take some hours to start serving the new artifacts, so 
you may want to wait to do this step until you are ready to continue with the 
"Promote the release" steps in the follow-up Jira.

Follow the [release instructions in the flink-docker 
repo|https://github.com/apache/flink-docker#release-workflow] to build the new 
Dockerfiles and send an updated manifest to Docker Hub so the new images are 
built and published.

 

h3. Expectations

 * Dockerfiles in [flink-docker|https://github.com/apache/flink-docker] updated 
for the new Flink release and pull request opened on the Docker official-images 
with an updated manifest



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-34698) Deploy Python artifacts to PyPI

2024-03-15 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee updated FLINK-34698:

Reporter: Lincoln Lee  (was: Sergey Nuyanzin)

> Deploy Python artifacts to PyPI
> ---
>
> Key: FLINK-34698
> URL: https://issues.apache.org/jira/browse/FLINK-34698
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Lincoln Lee
>Assignee: lincoln lee
>Priority: Major
>
> The release manager should create a PyPI account and ask the PMC to add this 
> account to the pyflink collaborator list with the Maintainer role (The PyPI admin account 
> info can be found here. NOTE, only visible to PMC members) to deploy the 
> Python artifacts to PyPI. The artifacts could be uploaded using 
> twine([https://pypi.org/project/twine/]). To install twine, just run:
> {code:java}
> pip install --upgrade twine==1.12.0
> {code}
> Download the Python artifacts from dist.apache.org and upload them to pypi.org:
> {code:java}
> svn checkout 
> https://dist.apache.org/repos/dist/dev/flink/flink-${RELEASE_VERSION}-rc${RC_NUM}
> cd flink-${RELEASE_VERSION}-rc${RC_NUM}
>  
> cd python
>  
> #uploads wheels
> for f in *.whl; do twine upload --repository-url 
> https://upload.pypi.org/legacy/ $f $f.asc; done
>  
> #upload source packages
> twine upload --repository-url https://upload.pypi.org/legacy/ 
> apache-flink-libraries-${RELEASE_VERSION}.tar.gz 
> apache-flink-libraries-${RELEASE_VERSION}.tar.gz.asc
>  
> twine upload --repository-url https://upload.pypi.org/legacy/ 
> apache-flink-${RELEASE_VERSION}.tar.gz 
> apache-flink-${RELEASE_VERSION}.tar.gz.asc
> {code}
> If the upload failed or was incorrect for some reason (e.g. a network 
> transmission problem), you need to delete the already uploaded release package 
> of the same version (if it exists), rename the artifact to 
> {{apache-flink-${RELEASE_VERSION}.post0.tar.gz}}, and then re-upload.
> (!) Note: re-uploading to pypi.org must be avoided as much as possible 
> because it will cause some irreparable problems. If that happens, users 
> cannot install the apache-flink package by explicitly specifying the package 
> version, i.e. the following command "pip install 
> apache-flink==${RELEASE_VERSION}" will fail. Instead they have to run "pip 
> install apache-flink" or "pip install apache-flink==${RELEASE_VERSION}.post0" 
> to install the apache-flink package.
>  
> 
> h3. Expectations
>  * Python artifacts released and indexed in the 
> [PyPI|https://pypi.org/project/apache-flink/] Repository



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-34698) Deploy Python artifacts to PyPI

2024-03-15 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee updated FLINK-34698:

Summary: Deploy Python artifacts to PyPI  (was: CLONE - Deploy Python 
artifacts to PyPI)

> Deploy Python artifacts to PyPI
> ---
>
> Key: FLINK-34698
> URL: https://issues.apache.org/jira/browse/FLINK-34698
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Sergey Nuyanzin
>Assignee: Jing Ge
>Priority: Major
>
> The release manager should create a PyPI account and ask the PMC to add this 
> account to the pyflink collaborator list with the Maintainer role (The PyPI admin account 
> info can be found here. NOTE, only visible to PMC members) to deploy the 
> Python artifacts to PyPI. The artifacts could be uploaded using 
> twine([https://pypi.org/project/twine/]). To install twine, just run:
> {code:java}
> pip install --upgrade twine==1.12.0
> {code}
> Download the Python artifacts from dist.apache.org and upload them to pypi.org:
> {code:java}
> svn checkout 
> https://dist.apache.org/repos/dist/dev/flink/flink-${RELEASE_VERSION}-rc${RC_NUM}
> cd flink-${RELEASE_VERSION}-rc${RC_NUM}
>  
> cd python
>  
> #uploads wheels
> for f in *.whl; do twine upload --repository-url 
> https://upload.pypi.org/legacy/ $f $f.asc; done
>  
> #upload source packages
> twine upload --repository-url https://upload.pypi.org/legacy/ 
> apache-flink-libraries-${RELEASE_VERSION}.tar.gz 
> apache-flink-libraries-${RELEASE_VERSION}.tar.gz.asc
>  
> twine upload --repository-url https://upload.pypi.org/legacy/ 
> apache-flink-${RELEASE_VERSION}.tar.gz 
> apache-flink-${RELEASE_VERSION}.tar.gz.asc
> {code}
> If the upload failed or was incorrect for some reason (e.g. a network 
> transmission problem), you need to delete the already uploaded release package 
> of the same version (if it exists), rename the artifact to 
> {{apache-flink-${RELEASE_VERSION}.post0.tar.gz}}, and then re-upload.
> (!) Note: re-uploading to pypi.org must be avoided as much as possible 
> because it will cause some irreparable problems. If that happens, users 
> cannot install the apache-flink package by explicitly specifying the package 
> version, i.e. the following command "pip install 
> apache-flink==${RELEASE_VERSION}" will fail. Instead they have to run "pip 
> install apache-flink" or "pip install apache-flink==${RELEASE_VERSION}.post0" 
> to install the apache-flink package.
>  
> 
> h3. Expectations
>  * Python artifacts released and indexed in the 
> [PyPI|https://pypi.org/project/apache-flink/] Repository



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-34697) Finalize release 1.19.0

2024-03-15 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee updated FLINK-34697:

Reporter: Lincoln Lee  (was: Sergey Nuyanzin)

> Finalize release 1.19.0
> ---
>
> Key: FLINK-34697
> URL: https://issues.apache.org/jira/browse/FLINK-34697
> Project: Flink
>  Issue Type: New Feature
>Reporter: Lincoln Lee
>Assignee: lincoln lee
>Priority: Major
>  Labels: release
> Fix For: 1.19.0
>
>
> Once the release candidate has been reviewed and approved by the community, 
> the release should be finalized. This involves the final deployment of the 
> release candidate to the release repositories, merging of the website 
> changes, etc.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-34697) Finalize release 1.19.0

2024-03-15 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee reassigned FLINK-34697:
---

Assignee: lincoln lee  (was: Jing Ge)

> Finalize release 1.19.0
> ---
>
> Key: FLINK-34697
> URL: https://issues.apache.org/jira/browse/FLINK-34697
> Project: Flink
>  Issue Type: New Feature
>Reporter: Sergey Nuyanzin
>Assignee: lincoln lee
>Priority: Major
>  Labels: release
> Fix For: 1.19.0
>
>
> Once the release candidate has been reviewed and approved by the community, 
> the release should be finalized. This involves the final deployment of the 
> release candidate to the release repositories, merging of the website 
> changes, etc.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-34697) Finalize release 1.19.0

2024-03-15 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee updated FLINK-34697:

Labels: release  (was: )

> Finalize release 1.19.0
> ---
>
> Key: FLINK-34697
> URL: https://issues.apache.org/jira/browse/FLINK-34697
> Project: Flink
>  Issue Type: New Feature
>Reporter: Sergey Nuyanzin
>Assignee: Jing Ge
>Priority: Major
>  Labels: release
> Fix For: 1.19.0
>
>
> Once the release candidate has been reviewed and approved by the community, 
> the release should be finalized. This involves the final deployment of the 
> release candidate to the release repositories, merging of the website 
> changes, etc.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34700) CLONE - Create Git tag and mark version as released in Jira

2024-03-15 Thread lincoln lee (Jira)
lincoln lee created FLINK-34700:
---

 Summary: CLONE - Create Git tag and mark version as released in 
Jira
 Key: FLINK-34700
 URL: https://issues.apache.org/jira/browse/FLINK-34700
 Project: Flink
  Issue Type: Sub-task
Reporter: Sergey Nuyanzin
Assignee: Jing Ge


Create and push a new Git tag for the released version by copying the tag for 
the final release candidate, as follows:
{code:java}
$ git tag -s "release-${RELEASE_VERSION}" refs/tags/${TAG}^{} -m "Release Flink 
${RELEASE_VERSION}"
$ git push <remote> refs/tags/release-${RELEASE_VERSION}
{code}
In JIRA, inside [version 
management|https://issues.apache.org/jira/plugins/servlet/project-config/FLINK/versions],
 hover over the current release and a settings menu will appear. Click Release, 
and select today’s date.

(Note: Only PMC members have access to the project administration. If you do 
not have access, ask on the mailing list for assistance.)

If PRs have been merged to the release branch after the last release 
candidate was tagged, make sure that the corresponding Jira tickets have the 
correct Fix Version set.
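
For example, the commits that still need a Fix Version check can be listed roughly like this (a sketch only; ${TAG} is the final release candidate tag and the release branch name is illustrative for the 1.19 line):
{code:bash}
# Show commits on the release branch that are not reachable from the last RC tag;
# every FLINK-XXXXX issue mentioned here should carry the correct Fix Version in Jira.
git fetch origin
git log --oneline refs/tags/${TAG}..origin/release-1.19
{code}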

 

h3. Expectations
 * Release tagged in the source code repository
 * Release version finalized in JIRA. (Note: Not all committers have 
administrator access to JIRA. If you end up getting permissions errors ask on 
the mailing list for assistance)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34699) CLONE - Deploy artifacts to Maven Central Repository

2024-03-15 Thread lincoln lee (Jira)
lincoln lee created FLINK-34699:
---

 Summary: CLONE - Deploy artifacts to Maven Central Repository
 Key: FLINK-34699
 URL: https://issues.apache.org/jira/browse/FLINK-34699
 Project: Flink
  Issue Type: Sub-task
Reporter: Sergey Nuyanzin
Assignee: Jing Ge


Use the [Apache Nexus repository|https://repository.apache.org/] to release the 
staged binary artifacts to the Maven Central repository. In the Staging 
Repositories section, find the relevant release candidate orgapacheflink-XXX 
entry and click Release. Drop all other release candidates that are not being 
released.
h3. Deploy source and binary releases to dist.apache.org

Copy the source and binary releases from the dev repository to the release 
repository at [dist.apache.org|http://dist.apache.org/] using Subversion.
{code:java}
$ svn move -m "Release Flink ${RELEASE_VERSION}" 
https://dist.apache.org/repos/dist/dev/flink/flink-${RELEASE_VERSION}-rc${RC_NUM}
 https://dist.apache.org/repos/dist/release/flink/flink-${RELEASE_VERSION}
{code}
(Note: Only PMC members have access to the release repository. If you do not 
have access, ask on the mailing list for assistance.)
h3. Remove old release candidates from [dist.apache.org|http://dist.apache.org/]

Remove the old release candidates from 
[https://dist.apache.org/repos/dist/dev/flink] using Subversion.
{code:java}
$ svn checkout https://dist.apache.org/repos/dist/dev/flink --depth=immediates
$ cd flink
$ svn remove flink-${RELEASE_VERSION}-rc*
$ svn commit -m "Remove old release candidates for Apache Flink 
${RELEASE_VERSION}"
{code}
 

h3. Expectations
 * Maven artifacts released and indexed in the [Maven Central 
Repository|https://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.apache.flink%22]
 (usually takes about a day to show up)
 * Source & binary distributions available in the release repository of 
[https://dist.apache.org/repos/dist/release/flink/]
 * Dev repository [https://dist.apache.org/repos/dist/dev/flink/] is empty
 * Website contains links to new release binaries and sources in download page
 * (for minor version updates) the front page references the correct new major 
release version and directs to the correct link



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-34697) Finalize release 1.19.0

2024-03-15 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee updated FLINK-34697:

Fix Version/s: 1.19.0

> Finalize release 1.19.0
> ---
>
> Key: FLINK-34697
> URL: https://issues.apache.org/jira/browse/FLINK-34697
> Project: Flink
>  Issue Type: New Feature
>Reporter: Sergey Nuyanzin
>Assignee: Jing Ge
>Priority: Major
> Fix For: 1.19.0
>
>
> Once the release candidate has been reviewed and approved by the community, 
> the release should be finalized. This involves the final deployment of the 
> release candidate to the release repositories, merging of the website 
> changes, etc.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34697) Finalize release 1.19.0

2024-03-15 Thread lincoln lee (Jira)
lincoln lee created FLINK-34697:
---

 Summary: Finalize release 1.19.0
 Key: FLINK-34697
 URL: https://issues.apache.org/jira/browse/FLINK-34697
 Project: Flink
  Issue Type: New Feature
Reporter: Sergey Nuyanzin
Assignee: Jing Ge


Once the release candidate has been reviewed and approved by the community, the 
release should be finalized. This involves the final deployment of the release 
candidate to the release repositories, merging of the website changes, etc.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34698) CLONE - Deploy Python artifacts to PyPI

2024-03-15 Thread lincoln lee (Jira)
lincoln lee created FLINK-34698:
---

 Summary: CLONE - Deploy Python artifacts to PyPI
 Key: FLINK-34698
 URL: https://issues.apache.org/jira/browse/FLINK-34698
 Project: Flink
  Issue Type: Sub-task
Reporter: Sergey Nuyanzin
Assignee: Jing Ge


The release manager should create a PyPI account and ask the PMC to add this 
account to the pyflink collaborator list with the Maintainer role (The PyPI admin account info 
can be found here. NOTE, only visible to PMC members) to deploy the Python 
artifacts to PyPI. The artifacts could be uploaded using 
twine([https://pypi.org/project/twine/]). To install twine, just run:
{code:java}
pip install --upgrade twine==1.12.0
{code}
Download the Python artifacts from dist.apache.org and upload them to pypi.org:
{code:java}
svn checkout 
https://dist.apache.org/repos/dist/dev/flink/flink-${RELEASE_VERSION}-rc${RC_NUM}
cd flink-${RELEASE_VERSION}-rc${RC_NUM}
 
cd python
 
#uploads wheels
for f in *.whl; do twine upload --repository-url 
https://upload.pypi.org/legacy/ $f $f.asc; done
 
#upload source packages
twine upload --repository-url https://upload.pypi.org/legacy/ 
apache-flink-libraries-${RELEASE_VERSION}.tar.gz 
apache-flink-libraries-${RELEASE_VERSION}.tar.gz.asc
 
twine upload --repository-url https://upload.pypi.org/legacy/ 
apache-flink-${RELEASE_VERSION}.tar.gz 
apache-flink-${RELEASE_VERSION}.tar.gz.asc
{code}
If the upload failed or was incorrect for some reason (e.g. a network 
transmission problem), you need to delete the already uploaded release package 
of the same version (if it exists), rename the artifact to 
{{apache-flink-${RELEASE_VERSION}.post0.tar.gz}}, and then re-upload.

(!) Note: re-uploading to pypi.org must be avoided as much as possible because 
it will cause some irreparable problems. If that happens, users cannot install 
the apache-flink package by explicitly specifying the package version, i.e. the 
following command "pip install apache-flink==${RELEASE_VERSION}" will fail. 
Instead they have to run "pip install apache-flink" or "pip install 
apache-flink==${RELEASE_VERSION}.post0" to install the apache-flink package.
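
A rough sketch of the rename-and-re-upload path described above (illustrative only; it assumes the source package is the artifact being replaced, and depending on the packaging the version inside the package metadata may also need the .post0 suffix):
{code:bash}
# Rename the locally built artifacts with a .post0 suffix and upload them again.
mv apache-flink-${RELEASE_VERSION}.tar.gz apache-flink-${RELEASE_VERSION}.post0.tar.gz
mv apache-flink-${RELEASE_VERSION}.tar.gz.asc apache-flink-${RELEASE_VERSION}.post0.tar.gz.asc
twine upload --repository-url https://upload.pypi.org/legacy/ \
  apache-flink-${RELEASE_VERSION}.post0.tar.gz \
  apache-flink-${RELEASE_VERSION}.post0.tar.gz.asc
{code}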

 

h3. Expectations
 * Python artifacts released and indexed in the 
[PyPI|https://pypi.org/project/apache-flink/] Repository



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] Bump follow-redirects from 1.15.1 to 1.15.6 in /flink-runtime-web/web-dashboard [flink]

2024-03-15 Thread via GitHub


flinkbot commented on PR #24507:
URL: https://github.com/apache/flink/pull/24507#issuecomment-2000400848

   
   ## CI report:
   
   * 9904004e6d4f37ebeaaa8ac157741509f4fe60e1 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] Bump follow-redirects from 1.15.1 to 1.15.6 in /flink-runtime-web/web-dashboard [flink]

2024-03-15 Thread via GitHub


dependabot[bot] opened a new pull request, #24507:
URL: https://github.com/apache/flink/pull/24507

   Bumps 
[follow-redirects](https://github.com/follow-redirects/follow-redirects) from 
1.15.1 to 1.15.6.
   
   Commits
   
   - [35a517c](https://github.com/follow-redirects/follow-redirects/commit/35a517c5861d79dc8bff7db8626013d20b711b06) Release version 1.15.6 of the npm package.
   - [c4f847f](https://github.com/follow-redirects/follow-redirects/commit/c4f847f85176991f95ab9c88af63b1294de8649b) Drop Proxy-Authorization across hosts.
   - [8526b4a](https://github.com/follow-redirects/follow-redirects/commit/8526b4a1b2ab3a2e4044299377df623a661caa76) Use GitHub for disclosure.
   - [b1677ce](https://github.com/follow-redirects/follow-redirects/commit/b1677ce00110ee50dc5da576751d39b281fc4944) Release version 1.15.5 of the npm package.
   - [d8914f7](https://github.com/follow-redirects/follow-redirects/commit/d8914f7982403ea096b39bd594a00ee9d3b7e224) Preserve fragment in responseUrl.
   - [6585820](https://github.com/follow-redirects/follow-redirects/commit/65858205e59f1e23c9bf173348a7a7cbb8ac47f5) Release version 1.15.4 of the npm package.
   - [7a6567e](https://github.com/follow-redirects/follow-redirects/commit/7a6567e16dfa9ad18a70bfe91784c28653fbf19d) Disallow bracketed hostnames.
   - [05629af](https://github.com/follow-redirects/follow-redirects/commit/05629af696588b90d64e738bc2e809a97a5f92fc) Prefer native URL instead of deprecated url.parse.
   - [1cba8e8](https://github.com/follow-redirects/follow-redirects/commit/1cba8e85fa73f563a439fe460cf028688e4358df) Prefer native URL instead of legacy url.resolve.
   - [72bc2a4](https://github.com/follow-redirects/follow-redirects/commit/72bc2a4229bc18dc9fbd57c60579713e6264cb92) Simplify _processResponse error handling.
   - Additional commits viewable in [compare view](https://github.com/follow-redirects/follow-redirects/compare/v1.15.1...v1.15.6)
   
   
   
   
   
   [![Dependabot compatibility 
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=follow-redirects=npm_and_yarn=1.15.1=1.15.6)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
   
   Dependabot will resolve any conflicts with this PR as long as you don't 
alter it yourself. You can also trigger a rebase manually by commenting 
`@dependabot rebase`.
   
   [//]: # (dependabot-automerge-start)
   [//]: # (dependabot-automerge-end)
   
   ---
   
   
   Dependabot commands and options
   
   
   You can trigger Dependabot actions by commenting on this PR:
   - `@dependabot rebase` will rebase this PR
   - `@dependabot recreate` will recreate this PR, overwriting any edits that 
have been made to it
   - `@dependabot merge` will merge this PR after your CI passes on it
   - `@dependabot squash and merge` will squash and merge this PR after your CI 
passes on it
   - `@dependabot cancel merge` will cancel a previously requested merge and 
block automerging
   - `@dependabot reopen` will reopen this PR if it is closed
   - `@dependabot close` will close this PR and stop Dependabot recreating it. 
You can achieve the same result by closing it manually
   - `@dependabot show <dependency name> ignore conditions` will show all of 
the ignore conditions of the specified dependency
   - `@dependabot ignore this major version` will close this PR and stop 
Dependabot creating any more for this major version (unless you reopen the PR 
or upgrade to it yourself)
   - `@dependabot ignore this minor version` will close this PR and stop 
Dependabot creating any more for this minor version (unless you reopen the PR 
or upgrade to it yourself)
   - `@dependabot ignore this dependency` will close this PR and stop 
Dependabot creating any more for this dependency (unless you reopen the PR or 
upgrade to it yourself)
   You can disable automated security fix PRs for this repo from the [Security 
Alerts page](https://github.com/apache/flink/network/alerts).
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] Bump org.apache.zookeeper:zookeeper from 3.6.3 to 3.8.4 [flink-ml]

2024-03-15 Thread via GitHub


dependabot[bot] opened a new pull request, #259:
URL: https://github.com/apache/flink-ml/pull/259

   Bumps org.apache.zookeeper:zookeeper from 3.6.3 to 3.8.4.
   
   
   [![Dependabot compatibility 
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=org.apache.zookeeper:zookeeper=maven=3.6.3=3.8.4)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
   
   Dependabot will resolve any conflicts with this PR as long as you don't 
alter it yourself. You can also trigger a rebase manually by commenting 
`@dependabot rebase`.
   
   [//]: # (dependabot-automerge-start)
   [//]: # (dependabot-automerge-end)
   
   ---
   
   
   Dependabot commands and options
   
   
   You can trigger Dependabot actions by commenting on this PR:
   - `@dependabot rebase` will rebase this PR
   - `@dependabot recreate` will recreate this PR, overwriting any edits that 
have been made to it
   - `@dependabot merge` will merge this PR after your CI passes on it
   - `@dependabot squash and merge` will squash and merge this PR after your CI 
passes on it
   - `@dependabot cancel merge` will cancel a previously requested merge and 
block automerging
   - `@dependabot reopen` will reopen this PR if it is closed
   - `@dependabot close` will close this PR and stop Dependabot recreating it. 
You can achieve the same result by closing it manually
   - `@dependabot show <dependency name> ignore conditions` will show all of 
the ignore conditions of the specified dependency
   - `@dependabot ignore this major version` will close this PR and stop 
Dependabot creating any more for this major version (unless you reopen the PR 
or upgrade to it yourself)
   - `@dependabot ignore this minor version` will close this PR and stop 
Dependabot creating any more for this minor version (unless you reopen the PR 
or upgrade to it yourself)
   - `@dependabot ignore this dependency` will close this PR and stop 
Dependabot creating any more for this dependency (unless you reopen the PR or 
upgrade to it yourself)
   You can disable automated security fix PRs for this repo from the [Security 
Alerts page](https://github.com/apache/flink-ml/network/alerts).
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Closed] (FLINK-34643) JobIDLoggingITCase failed

2024-03-15 Thread Roman Khachatryan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Khachatryan closed FLINK-34643.
-
Fix Version/s: 1.20.0
   Resolution: Fixed

Merged into master as 6b5ae445724b68db05a3f9687cff6dd68e2129d7.

> JobIDLoggingITCase failed
> -
>
> Key: FLINK-34643
> URL: https://issues.apache.org/jira/browse/FLINK-34643
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.20.0
>Reporter: Matthias Pohl
>Assignee: Roman Khachatryan
>Priority: Major
>  Labels: pull-request-available, test-stability
> Fix For: 1.20.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=58187=logs=8fd9202e-fd17-5b26-353c-ac1ff76c8f28=ea7cf968-e585-52cb-e0fc-f48de023a7ca=7897
> {code}
> Mar 09 01:24:23 01:24:23.498 [ERROR] Tests run: 1, Failures: 0, Errors: 1, 
> Skipped: 0, Time elapsed: 4.209 s <<< FAILURE! -- in 
> org.apache.flink.test.misc.JobIDLoggingITCase
> Mar 09 01:24:23 01:24:23.498 [ERROR] 
> org.apache.flink.test.misc.JobIDLoggingITCase.testJobIDLogging(ClusterClient) 
> -- Time elapsed: 1.459 s <<< ERROR!
> Mar 09 01:24:23 java.lang.IllegalStateException: Too few log events recorded 
> for org.apache.flink.runtime.jobmaster.JobMaster (12) - this must be a bug in 
> the test code
> Mar 09 01:24:23   at 
> org.apache.flink.util.Preconditions.checkState(Preconditions.java:215)
> Mar 09 01:24:23   at 
> org.apache.flink.test.misc.JobIDLoggingITCase.assertJobIDPresent(JobIDLoggingITCase.java:148)
> Mar 09 01:24:23   at 
> org.apache.flink.test.misc.JobIDLoggingITCase.testJobIDLogging(JobIDLoggingITCase.java:132)
> Mar 09 01:24:23   at java.lang.reflect.Method.invoke(Method.java:498)
> Mar 09 01:24:23   at 
> java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:189)
> Mar 09 01:24:23   at 
> java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
> Mar 09 01:24:23   at 
> java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
> Mar 09 01:24:23   at 
> java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
> Mar 09 01:24:23   at 
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)
> Mar 09 01:24:23 
> {code}
> The other test failures of this build were also caused by the same test:
> * 
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=58187=logs=2c3cbe13-dee0-5837-cf47-3053da9a8a78=b78d9d30-509a-5cea-1fef-db7abaa325ae=8349
> * 
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=58187=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=8209



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-34643][tests] Fix JobIDLoggingITCase [flink]

2024-03-15 Thread via GitHub


rkhachatryan merged PR #24484:
URL: https://github.com/apache/flink/pull/24484


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-34696) GSRecoverableWriterCommitter is generating excessive data blobs

2024-03-15 Thread Galen Warren (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17827590#comment-17827590
 ] 

Galen Warren commented on FLINK-34696:
--

One more thing, regarding:
{quote}To mitigate this problem, we propose modifying the `composeBlobs` method 
to immediately delete source blobs once they have been successfully combined. 
This change could significantly reduce data duplication and associated costs. 
{quote}
I'm not sure it would be safe to delete the raw temporary blobs (i.e. the 
uncomposed ones) until the commit succeeds, because they are referenced in the 
Recoverable object and would need to be there if a commit were retried. I 
suppose it would be fine to delete truly intermediate blobs along the way 
during the composition process, but these are deleted anyway at the end of the 
commit, so does that buy much? Perhaps it does with very large files.

 

> GSRecoverableWriterCommitter is generating excessive data blobs
> ---
>
> Key: FLINK-34696
> URL: https://issues.apache.org/jira/browse/FLINK-34696
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / FileSystem
>Reporter: Simon-Shlomo Poil
>Priority: Major
>
> The `composeBlobs` method in 
> `org.apache.flink.fs.gs.writer.GSRecoverableWriterCommitter` is designed to 
> merge multiple small blobs into a single large blob using Google Cloud 
> Storage's compose method. This process is iterative, combining the result 
> from the previous iteration with 31 new blobs until all blobs are merged. 
> Upon completion of the composition, the method proceeds to remove the 
> temporary blobs.
> *Issue:*
> This methodology results in significant, unnecessary data storage consumption 
> during the blob composition process, incurring considerable costs due to 
> Google Cloud Storage pricing models.
> *Example to Illustrate the Problem:*
>  - Initial state: 64 blobs, each 1 GB in size (totaling 64 GB).
>  - After 1st step: 32 blobs are merged into a single blob, increasing total 
> storage to 96 GB (64 original + 32 GB new).
>  - After 2nd step: The newly created 32 GB blob is merged with 31 more blobs, 
> raising the total to 159 GB.
>  - After 3rd step: The final blob is merged, culminating in a total of 223 GB 
> to combine the original 64 GB of data. This results in an overhead of 159 GB.
> *Impact:*
> This inefficiency has a profound impact, especially at scale, where terabytes 
> of data can incur overheads in the petabyte range, leading to unexpectedly 
> high costs. Additionally, we have observed an increase in storage exceptions 
> thrown by the Google Storage library, potentially linked to this issue.
> *Suggested Solution:*
> To mitigate this problem, we propose modifying the `composeBlobs` method to 
> immediately delete source blobs once they have been successfully combined. 
> This change could significantly reduce data duplication and associated costs. 
> However, the implications for data recovery and integrity need careful 
> consideration to ensure that this optimization does not compromise the 
> ability to recover data in case of a failure during the composition process.
> *Steps to Reproduce:*
> 1. Initiate the blob composition process in an environment with a significant 
> number of blobs (e.g., 64 blobs of 1 GB each).
> 2. Observe the temporary increase in data storage as blobs are iteratively 
> combined.
> 3. Note the final amount of data storage used compared to the initial total 
> size of the blobs.
> *Expected Behavior:*
> The blob composition process should minimize unnecessary data storage use, 
> efficiently managing resources to combine blobs without generating excessive 
> temporary data overhead.
> *Actual Behavior:*
> The current implementation results in significant temporary increases in data 
> storage, leading to high costs and potential system instability due to 
> frequent storage exceptions.
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [table] Add support for temporal join on rolling aggregates [flink]

2024-03-15 Thread via GitHub


flinkbot commented on PR #24506:
URL: https://github.com/apache/flink/pull/24506#issuecomment-2000214771

   
   ## CI report:
   
   * 12a272d8e372799a8f2b8469ada493a7bc1ce32f UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] [table] Add support for temporal join on rolling aggregates [flink]

2024-03-15 Thread via GitHub


schevalley2 opened a new pull request, #24506:
URL: https://github.com/apache/flink/pull/24506

   ## What is the purpose of the change
   
   This is more of a proposal to demonstrate a possible fix. I am looking for 
feedback from people who are more knowledgeable.
   
   Following this thread on the mailing list: 
https://lists.apache.org/thread/9q7sjyqptcnw1371wc190496nwpdv1tz 
   
   Given an order table:
   
   ```sql
   CREATE TABLE orders (
   order_id INT,
   price DECIMAL(6, 2),
   currency_id INT,
   order_time AS NOW(),
   WATERMARK FOR order_time AS order_time - INTERVAL '2' SECOND
   ) WITH (…)
   ```
   
   and a currency rate table:
   
   ```sql
   CREATE TABLE currency_rates (
   currency_id INT,
   conversion_rate DECIMAL(4, 3),
   created_at AS NOW(),
   WATERMARK FOR created_at AS created_at - INTERVAL '2' SECOND
   PRIMARY KEY (currency_id) NOT ENFORCED
   ) WITH (…)
   ```
   
   that we would aggregate in an unbounded way like this:
   
   ```sql
   CREATE TEMPORARY VIEW max_rates AS (
   SELECT
   currency_id,
   MAX(conversion_rate) AS max_rate
   FROM currency_rates
   GROUP BY currency_id
   );
   ```
   
   It's not possible to do a temporal join between `orders` and `max_rates` and 
it fails with the following error:
   
   ```
   Exception in thread "main" org.apache.flink.table.api.ValidationException:
   Event-Time Temporal Table Join  requires both primary key and row time 
   attribute in versioned table, but no row time attribute can be found.
   ```
   
   After some investigation we realised that the temporal join checks for 
event/proc time by looking at whether the row type contains a time attribute, 
so we added another column to `max_rates` like this:
   
   ```sql
   CREATE TEMPORARY VIEW max_rates AS (
   SELECT
   currency_id,
   MAX(conversion_rate) AS max_rate,
   LAST_VALUE(created_at) AS updated_at
   FROM currency_rates
   GROUP BY currency_id
   );
   ```
   
   However, `LAST_VALUE` does not support the timestamp type 
([FLINK-15867](https://issues.apache.org/jira/browse/FLINK-15867)). We added 
that support and ended up with a planner assertion error:
   
   ```
   java.lang.AssertionError: Sql optimization: Assertion error: type mismatch:
   ref:
   TIMESTAMP_LTZ(3) *ROWTIME* NOT NULL
   input:
   TIMESTAMP_WITH_LOCAL_TIME_ZONE(3) NOT NULL
   
at 
org.apache.flink.table.planner.plan.optimize.program.FlinkVolcanoProgram.optimize(FlinkVolcanoProgram.scala:79)
at 
org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram.$anonfun$optimize$1(FlinkChainedProgram.scala:59)
   ```
   
   We understood the issue was in the way `RelTimeIndicatorConverter` rewrites 
the FlinkLogicalJoin. Left and right input expressions get their 
`TimeIndicatorRelDataType` types replaced by normal timestamps but in the 
temporal join case, the `condition` is not rewritten (but in the `else` it's 
actually done).
   
   However, I thought this might not be the right solution to the problem, so I also 
compared what happens if the `max_rates` definition is replaced with a simple `SELECT`, as in:
   
   ```sql
   CREATE TEMPORARY VIEW max_rates AS (
  SELECT
   currency_id,
   conversion_rate AS max_rate,
   created_at AS updated_at
   FROM currency_rates
   );
   ```
   
   What I've noticed is that the timestamp on the `right` side of the join is 
not replaced and stays a `TimeIndicatorRelDataType`. This is because the graph 
of `RelNode`s on the right side is:
   
   ```
   FlinkLogicalTableSourceScan -> FlinkLogicalCalc -> 
FlinkLogicalWatermarkAssigner -> FlinkLogicalSnapshot -> FlinkLogicalJoin
   ```
   
   and `WatermarkAssigner` overrides `deriveRowType`, which actually forces the 
`TimeIndicatorRelDataType` to be there, whereas for the `FlinkLogicalAggregate` 
it simply gets converted into a normal timestamp.
   
   So the use of `LAST_VALUE(…)` here is a bit of a hack to keep the time 
information in the query; it would not even work for every aggregation one 
might want to write.
   
   However, it seems that supporting temporal joins on rolling aggregates would 
be a good idea.
   
   Looking forward to discussing this more with you.
   
   ## Brief change log
   
   ## Verifying this change
   
  - *Added tests that execute the type of queries we wanted to support*
  - *Added tests that check that, with the new feature disabled, the join is 
not supported*
 - 
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
  - The runtime per-record code paths (performance sensitive): don't know / no 
(it's part of the planner)
 - Anything that affects deployment or recovery: JobManager (and its 

[jira] [Comment Edited] (FLINK-34696) GSRecoverableWriterCommitter is generating excessive data blobs

2024-03-15 Thread Galen Warren (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17827568#comment-17827568
 ] 

Galen Warren edited comment on FLINK-34696 at 3/15/24 5:49 PM:
---

No, this just moves the storage of the temporary files to a different bucket so 
that a lifecycle policy can be applied to them. The intermediate blobs still 
have to be composed into the final blob either way. And yes there is one extra 
copy if you use the temp bucket, but the benefit is you won't orphan temporary 
files (more below).

Upon rereading and looking back at the code (it's been a while!), I may have 
misunderstood the initial issue. There could be a couple things going on here.

1) There could be orphaned temporary blobs sticking around, because they were 
written at one point and then due to restores from check/savepoints, they were 
"forgotten". Writing all intermediate files to a temporary bucket and then 
applying a lifecycle policy to that bucket would address that issue. It does 
come at the cost of one extra copy operation; in GCP blobs cannot be composed 
into a different bucket, so using the temporary bucket incurs one extra copy 
that doesn't occur if the temporary bucket is the same as the final bucket.

2) There could be large numbers of temporary blobs being composed together, 
i.e. the writing pattern is such that large numbers of writes occur between 
checkpoints. Since there's a limit to how many blobs can be composed together 
at one time (32), there's no way to avoid a temporary blob's data being 
involved in more than one compose operation, in principle, but I do think the 
composition could be optimized differently.

I think #2 is the issue here, not #1? Though one should also be careful about 
orphaned blobs if state is ever restored from check/savepoints.

Regarding optimization: Currently, the composing of blobs occurs 
[here|https://github.com/apache/flink/blob/f6e1b493bd6292a87efd130a0e76af8bd750c1c9/flink-filesystems/flink-gs-fs-hadoop/src/main/java/org/apache/flink/fs/gs/writer/GSRecoverableWriterCommitter.java#L131].

It does it in a way that minimizes the number of compose operations but not 
necessarily the total volume of data involved in those compose operations, 
which I think is the issue here.

Consider the situation where there are 94 temporary blobs to be composed into a 
committed blob. As it stands now, the first 32 would be committed into 1 blob, 
leaving 1 + (94 - 32) = 63 blobs remaining. This process would be repeated, 
composing the first 32 again into 1, leaving 1 + (63 - 32) = 32 blobs. Then 
these remaining 32 blobs would be composed into 1 final blob. So, 3 total 
compose operations.

This could be done another way – the first 32 blobs could be composed into 1 
blob, the second 32 blobs could be composed into 1 blob, and the third 30 blobs 
could be composed into 1 blob, resulting in three intermediate blobs that would 
then be composed into the final blob. So, 4 total compose operations in this 
case. Still potentially recursive with large numbers of blobs due to the 
32-blob limit, but this would avoid the piling up of data in the first blob 
when there are large numbers of temporary blobs.

Even though the second method would generally involve more total compose 
operations than the first, it would minimize the total bytes being composed. I 
doubt the difference would be significant for small numbers of temporary blobs 
and could help with large numbers of temporary blobs. That would seem like a 
reasonable enhancement to me.


was (Author: galenwarren):
No, this just moves the storage of the temporary files to a different bucket so 
that a lifecycle policy can be applied to them. The intermediate blobs still 
have to be composed into the final blob either way. And yes there is one extra 
copy if you use the temp bucket, but the benefit is you won't orphan temporary 
files (more below).

Upon rereading and looking back at the code (it's been a while!), I may have 
misunderstood the initial issue. There could be a couple things going on here.

1) There could be orphaned temporary blobs sticking around, because they were 
written at one point and then due to restores from check/savepoints, they were 
"forgotten". Writing all intermediate files to a temporary bucket and then 
applying a lifecycle policy to that bucket would address that issue. It does 
come at the cost of one extra copy operation; in GCP blobs cannot be composed 
into a different bucket, so using the temporary bucket incurs one extra copy 
that doesn't occur if the temporary bucket is the same as the final bucket.

2) There could be large numbers of temporary blobs being composed together, 
i.e. the writing pattern is such that large numbers of writes occur between 
checkpoints. Since there's a limit to how many blobs can be composed together 
at one time (32), there's no way to avoid a temporary 

[jira] [Commented] (FLINK-34696) GSRecoverableWriterCommitter is generating excessive data blobs

2024-03-15 Thread Galen Warren (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17827568#comment-17827568
 ] 

Galen Warren commented on FLINK-34696:
--

No, this just moves the storage of the temporary files to a different bucket so 
that a lifecycle policy can be applied to them. The intermediate blobs still 
have to be composed into the final blob either way. And yes there is one extra 
copy if you use the temp bucket, but the benefit is you won't orphan temporary 
files (more below).

Upon rereading and looking back at the code (it's been a while!), I may have 
misunderstood the initial issue. There could be a couple things going on here.

1) There could be orphaned temporary blobs sticking around, because they were 
written at one point and then due to restores from check/savepoints, they were 
"forgotten". Writing all intermediate files to a temporary bucket and then 
applying a lifecycle policy to that bucket would address that issue. It does 
come at the cost of one extra copy operation; in GCP blobs cannot be composed 
into a different bucket, so using the temporary bucket incurs one extra copy 
that doesn't occur if the temporary bucket is the same as the final bucket.

2) There could be large numbers of temporary blobs being composed together, 
i.e. the writing pattern is such that large numbers of writes occur between 
checkpoints. Since there's a limit to how many blobs can be composed together 
at one time (32), there's no way to avoid a temporary blob's data being 
involved in more than one compose operation, in principle, but I do think the 
composition could be optimized differently.

I think #2 is the issue here, not #1? Though one should also be careful about 
orphaned blobs if state is ever restored from check/savepoints.

Regarding optimization: Currently, the composing of blobs occurs 
[here|https://github.com/apache/flink/blob/f6e1b493bd6292a87efd130a0e76af8bd750c1c9/flink-filesystems/flink-gs-fs-hadoop/src/main/java/org/apache/flink/fs/gs/writer/GSRecoverableWriterCommitter.java#L131].

It does it in a way that minimizes the number of compose operations but not 
necessarily the total volume of data involved in those compose operations, 
which I think is the issue here.

Consider the situation where there are 94 temporary blobs to be composed into a 
committed blob. As it stands now, the first 32 would be committed into 1 blob, 
leaving 1 + (94 - 32) = 63 blobs remaining. This process would be repeated, 
composing the first 32 again into 1, leaving 1 + (63 - 32) = 32 blobs. Then 
these remaining 32 blobs would be composed into 1 final blob. So, 3 total 
compose operations.

This could be done another way – the first 32 blobs could be composed into 1 
blob, the second 32 blobs could be composed into 1 blob, and the third 30 blobs 
could be composed into 1 blob, resulting in three intermediate blobs that would 
then be composed into the final blob. So, 4 total compose operations in this 
case. Still potentially recursive with large numbers of blobs due to the 
32-blob limit, but this would avoid the piling up of data in the first blob 
when there are large numbers of temporary blobs.

Even though the second method would generally involve more total compose 
operations than the first, it would minimize the total bytes being composed. I 
doubt the difference would be significant for small numbers of temporary blobs 
and could help with large numbers of temporary blobs. That would seem like a 
reasonable enhancement to me.
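
To make the second strategy concrete, here is a minimal sketch of the batched composition (my own illustration, not existing Flink code; composeInto is a hypothetical stand-in for a single GCS compose call):

{code:java}
import java.util.ArrayList;
import java.util.List;

/** Sketch of the batched compose strategy: compose disjoint groups of 32, then compose the results. */
public class BatchedComposeSketch {

    private static final int MAX_COMPOSE_SOURCES = 32; // GCS limit per compose call

    /** Hypothetical stand-in for one GCS compose call (sources -> target blob). */
    static void composeInto(List<String> sources, String target) {
        // storage.compose(...) would go here in a real implementation
    }

    /** Composes any number of blobs into finalBlob without dragging already-composed data through every round. */
    static void composeAll(List<String> blobs, String finalBlob) {
        List<String> current = blobs;
        int round = 0;
        while (current.size() > MAX_COMPOSE_SOURCES) {
            List<String> next = new ArrayList<>();
            for (int i = 0; i < current.size(); i += MAX_COMPOSE_SOURCES) {
                List<String> chunk =
                        current.subList(i, Math.min(i + MAX_COMPOSE_SOURCES, current.size()));
                String intermediate = finalBlob + ".intermediate-" + round + "-" + (i / MAX_COMPOSE_SOURCES);
                composeInto(chunk, intermediate);
                next.add(intermediate);
                // once recoverability allows it, the chunk's source blobs could be deleted here
            }
            current = next;
            round++;
        }
        composeInto(current, finalBlob);
    }
}
{code}

For the 94-blob example this gives groups of 32 + 32 + 30 -> 3 intermediate blobs -> 1 final blob, i.e. 4 compose calls, and each original byte passes through at most two compose operations instead of being re-composed in every iteration.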

 

 

 

 

> GSRecoverableWriterCommitter is generating excessive data blobs
> ---
>
> Key: FLINK-34696
> URL: https://issues.apache.org/jira/browse/FLINK-34696
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / FileSystem
>Reporter: Simon-Shlomo Poil
>Priority: Major
>
> The `composeBlobs` method in 
> `org.apache.flink.fs.gs.writer.GSRecoverableWriterCommitter` is designed to 
> merge multiple small blobs into a single large blob using Google Cloud 
> Storage's compose method. This process is iterative, combining the result 
> from the previous iteration with 31 new blobs until all blobs are merged. 
> Upon completion of the composition, the method proceeds to remove the 
> temporary blobs.
> *Issue:*
> This methodology results in significant, unnecessary data storage consumption 
> during the blob composition process, incurring considerable costs due to 
> Google Cloud Storage pricing models.
> *Example to Illustrate the Problem:*
>  - Initial state: 64 blobs, each 1 GB in size (totaling 64 GB).
>  - After 1st step: 32 blobs are merged into a single blob, increasing total 
> storage to 96 GB (64 original + 32 GB new).
>  - After 2nd step: The newly created 32 GB blob is merged with 31 more blobs, 
> raising the total 

[jira] [Commented] (FLINK-34696) GSRecoverableWriterCommitter is generating excessive data blobs

2024-03-15 Thread Tobias Hofer (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17827557#comment-17827557
 ] 

Tobias Hofer commented on FLINK-34696:
--

Will the temporary bucket prevent the use of object compose, i.e. stop the 
GSRecoverableWriterCommitter from recursively composing objects? To ask more 
precisely: will the use of a temporary bucket just double the amount of data 
that is written (once in the temporary bucket and again in the final target)?

 

> GSRecoverableWriterCommitter is generating excessive data blobs
> ---
>
> Key: FLINK-34696
> URL: https://issues.apache.org/jira/browse/FLINK-34696
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / FileSystem
>Reporter: Simon-Shlomo Poil
>Priority: Major
>
> The `composeBlobs` method in 
> `org.apache.flink.fs.gs.writer.GSRecoverableWriterCommitter` is designed to 
> merge multiple small blobs into a single large blob using Google Cloud 
> Storage's compose method. This process is iterative, combining the result 
> from the previous iteration with 31 new blobs until all blobs are merged. 
> Upon completion of the composition, the method proceeds to remove the 
> temporary blobs.
> *Issue:*
> This methodology results in significant, unnecessary data storage consumption 
> during the blob composition process, incurring considerable costs due to 
> Google Cloud Storage pricing models.
> *Example to Illustrate the Problem:*
>  - Initial state: 64 blobs, each 1 GB in size (totaling 64 GB).
>  - After 1st step: 32 blobs are merged into a single blob, increasing total 
> storage to 96 GB (64 original + 32 GB new).
>  - After 2nd step: The newly created 32 GB blob is merged with 31 more blobs, 
> raising the total to 159 GB.
>  - After 3rd step: The final blob is merged, culminating in a total of 223 GB 
> to combine the original 64 GB of data. This results in an overhead of 159 GB.
> *Impact:*
> This inefficiency has a profound impact, especially at scale, where terabytes 
> of data can incur overheads in the petabyte range, leading to unexpectedly 
> high costs. Additionally, we have observed an increase in storage exceptions 
> thrown by the Google Storage library, potentially linked to this issue.
> *Suggested Solution:*
> To mitigate this problem, we propose modifying the `composeBlobs` method to 
> immediately delete source blobs once they have been successfully combined. 
> This change could significantly reduce data duplication and associated costs. 
> However, the implications for data recovery and integrity need careful 
> consideration to ensure that this optimization does not compromise the 
> ability to recover data in case of a failure during the composition process.
> *Steps to Reproduce:*
> 1. Initiate the blob composition process in an environment with a significant 
> number of blobs (e.g., 64 blobs of 1 GB each).
> 2. Observe the temporary increase in data storage as blobs are iteratively 
> combined.
> 3. Note the final amount of data storage used compared to the initial total 
> size of the blobs.
> *Expected Behavior:*
> The blob composition process should minimize unnecessary data storage use, 
> efficiently managing resources to combine blobs without generating excessive 
> temporary data overhead.
> *Actual Behavior:*
> The current implementation results in significant temporary increases in data 
> storage, leading to high costs and potential system instability due to 
> frequent storage exceptions.
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-34696) GSRecoverableWriterCommitter is generating excessive data blobs

2024-03-15 Thread Piotr Nowojski (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski updated FLINK-34696:
---
Issue Type: Improvement  (was: Bug)

> GSRecoverableWriterCommitter is generating excessive data blobs
> ---
>
> Key: FLINK-34696
> URL: https://issues.apache.org/jira/browse/FLINK-34696
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / FileSystem
>Reporter: Simon-Shlomo Poil
>Priority: Major
>
> The `composeBlobs` method in 
> `org.apache.flink.fs.gs.writer.GSRecoverableWriterCommitter` is designed to 
> merge multiple small blobs into a single large blob using Google Cloud 
> Storage's compose method. This process is iterative, combining the result 
> from the previous iteration with 31 new blobs until all blobs are merged. 
> Upon completion of the composition, the method proceeds to remove the 
> temporary blobs.
> *Issue:*
> This methodology results in significant, unnecessary data storage consumption 
> during the blob composition process, incurring considerable costs due to 
> Google Cloud Storage pricing models.
> *Example to Illustrate the Problem:*
>  - Initial state: 64 blobs, each 1 GB in size (totaling 64 GB).
>  - After 1st step: 32 blobs are merged into a single blob, increasing total 
> storage to 96 GB (64 original + 32 GB new).
>  - After 2nd step: The newly created 32 GB blob is merged with 31 more blobs, 
> raising the total to 159 GB.
>  - After 3rd step: The final blob is merged, culminating in a total of 223 GB 
> to combine the original 64 GB of data. This results in an overhead of 159 GB.
> *Impact:*
> This inefficiency has a profound impact, especially at scale, where terabytes 
> of data can incur overheads in the petabyte range, leading to unexpectedly 
> high costs. Additionally, we have observed an increase in storage exceptions 
> thrown by the Google Storage library, potentially linked to this issue.
> *Suggested Solution:*
> To mitigate this problem, we propose modifying the `composeBlobs` method to 
> immediately delete source blobs once they have been successfully combined. 
> This change could significantly reduce data duplication and associated costs. 
> However, the implications for data recovery and integrity need careful 
> consideration to ensure that this optimization does not compromise the 
> ability to recover data in case of a failure during the composition process.
> *Steps to Reproduce:*
> 1. Initiate the blob composition process in an environment with a significant 
> number of blobs (e.g., 64 blobs of 1 GB each).
> 2. Observe the temporary increase in data storage as blobs are iteratively 
> combined.
> 3. Note the final amount of data storage used compared to the initial total 
> size of the blobs.
> *Expected Behavior:*
> The blob composition process should minimize unnecessary data storage use, 
> efficiently managing resources to combine blobs without generating excessive 
> temporary data overhead.
> *Actual Behavior:*
> The current implementation results in significant temporary increases in data 
> storage, leading to high costs and potential system instability due to 
> frequent storage exceptions.
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34696) GSRecoverableWriterCommitter is generating excessive data blobs

2024-03-15 Thread Galen Warren (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17827528#comment-17827528
 ] 

Galen Warren commented on FLINK-34696:
--

Yes, the issue is recoverability. However, there is one thing you can do. 
Create a separate bucket in GCP for temporary files and initialize the 
filesystem configuration (via FileSystem.initialize) with 
*gs.writer.temporary.bucket.name* set to the name of that bucket. This will 
cause the GSRecoverableWriter to write intermediate/temporary files to that 
bucket instead of the "real" bucket. Then, you can apply a TTL [lifecycle 
policy|https://cloud.google.com/storage/docs/lifecycle] to the temporary bucket 
to have files be deleted after whatever TTL you want. 
 
If you try to recover to a check/savepoint farther back in time than that TTL 
interval, the recovery will probably fail, but this will let you dial in 
whatever recoverability period you want, i.e. longer (at higher storage cost) 
or shorter (at lower storage cost).
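
For reference, a minimal sketch of that initialization (my own illustration; the bucket name is made up, and the single-argument FileSystem.initialize(Configuration) variant is used for brevity):

{code:java}
import org.apache.flink.configuration.Configuration;
import org.apache.flink.core.fs.FileSystem;

public class GsTemporaryBucketSetup {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // route the GS writer's intermediate/temporary blobs to a dedicated bucket
        conf.setString("gs.writer.temporary.bucket.name", "my-flink-gs-tmp"); // hypothetical bucket name
        FileSystem.initialize(conf);
        // ... build and run the Flink job as usual; a TTL lifecycle rule on "my-flink-gs-tmp"
        // then bounds how long orphaned temporary blobs can linger
    }
}
{code}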

> GSRecoverableWriterCommitter is generating excessive data blobs
> ---
>
> Key: FLINK-34696
> URL: https://issues.apache.org/jira/browse/FLINK-34696
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / FileSystem
>Reporter: Simon-Shlomo Poil
>Priority: Major
>
> The `composeBlobs` method in 
> `org.apache.flink.fs.gs.writer.GSRecoverableWriterCommitter` is designed to 
> merge multiple small blobs into a single large blob using Google Cloud 
> Storage's compose method. This process is iterative, combining the result 
> from the previous iteration with 31 new blobs until all blobs are merged. 
> Upon completion of the composition, the method proceeds to remove the 
> temporary blobs.
> *Issue:*
> This methodology results in significant, unnecessary data storage consumption 
> during the blob composition process, incurring considerable costs due to 
> Google Cloud Storage pricing models.
> *Example to Illustrate the Problem:*
>  - Initial state: 64 blobs, each 1 GB in size (totaling 64 GB).
>  - After 1st step: 32 blobs are merged into a single blob, increasing total 
> storage to 96 GB (64 original + 32 GB new).
>  - After 2nd step: The newly created 32 GB blob is merged with 31 more blobs, 
> raising the total to 159 GB.
>  - After 3rd step: The final blob is merged, culminating in a total of 223 GB 
> to combine the original 64 GB of data. This results in an overhead of 159 GB.
> *Impact:*
> This inefficiency has a profound impact, especially at scale, where terabytes 
> of data can incur overheads in the petabyte range, leading to unexpectedly 
> high costs. Additionally, we have observed an increase in storage exceptions 
> thrown by the Google Storage library, potentially linked to this issue.
> *Suggested Solution:*
> To mitigate this problem, we propose modifying the `composeBlobs` method to 
> immediately delete source blobs once they have been successfully combined. 
> This change could significantly reduce data duplication and associated costs. 
> However, the implications for data recovery and integrity need careful 
> consideration to ensure that this optimization does not compromise the 
> ability to recover data in case of a failure during the composition process.
> *Steps to Reproduce:*
> 1. Initiate the blob composition process in an environment with a significant 
> number of blobs (e.g., 64 blobs of 1 GB each).
> 2. Observe the temporary increase in data storage as blobs are iteratively 
> combined.
> 3. Note the final amount of data storage used compared to the initial total 
> size of the blobs.
> *Expected Behavior:*
> The blob composition process should minimize unnecessary data storage use, 
> efficiently managing resources to combine blobs without generating excessive 
> temporary data overhead.
> *Actual Behavior:*
> The current implementation results in significant temporary increases in data 
> storage, leading to high costs and potential system instability due to 
> frequent storage exceptions.
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [hotfix] In case of unexpected errors do not loose the primary failur [flink]

2024-03-15 Thread via GitHub


pnowojski merged PR #24487:
URL: https://github.com/apache/flink/pull/24487


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [hotfix] In case of unexpected errors do not loose the primary failur [flink]

2024-03-15 Thread via GitHub


pnowojski commented on PR #24487:
URL: https://github.com/apache/flink/pull/24487#issuecomment-1999835148

   Merging. Builds are failing due to unrelated test instabilities/bugs.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-34696) GSRecoverableWriterCommitter is generating excessive data blobs

2024-03-15 Thread Simon-Shlomo Poil (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon-Shlomo Poil updated FLINK-34696:
--
Description: 
*Description:*

The `composeBlobs` method in 
`org.apache.flink.fs.gs.writer.GSRecoverableWriterCommitter` is designed to 
merge multiple small blobs into a single large blob using Google Cloud 
Storage's compose method. This process is iterative, combining the result from 
the previous iteration with 31 new blobs until all blobs are merged. Upon 
completion of the composition, the method proceeds to remove the temporary 
blobs.

*Issue:*

This methodology results in significant, unnecessary data storage consumption 
during the blob composition process, incurring considerable costs due to Google 
Cloud Storage pricing models.

*Example to Illustrate the Problem:*

- Initial state: 64 blobs, each 1 GB in size (totaling 64 GB).
- After 1st step: 32 blobs are merged into a single blob, increasing total 
storage to 96 GB (64 original + 32 GB new).
- After 2nd step: The newly created 32 GB blob is merged with 31 more blobs, 
raising the total to 159 GB.
- After 3rd step: The final blob is merged, culminating in a total of 223 GB to 
combine the original 64 GB of data. This results in an overhead of 159 GB.

*Impact:*

This inefficiency has a profound impact, especially at scale, where terabytes 
of data can incur overheads in the petabyte range, leading to unexpectedly high 
costs. Additionally, we have observed an increase in storage exceptions thrown 
by the Google Storage library, potentially linked to this issue.

*Suggested Solution:*

To mitigate this problem, we propose modifying the `composeBlobs` method to 
immediately delete source blobs once they have been successfully combined. This 
change could significantly reduce data duplication and associated costs. 
However, the implications for data recovery and integrity need careful 
consideration to ensure that this optimization does not compromise the ability 
to recover data in case of a failure during the composition process.

*Steps to Reproduce:*

1. Initiate the blob composition process in an environment with a significant 
number of blobs (e.g., 64 blobs of 1 GB each).
2. Observe the temporary increase in data storage as blobs are iteratively 
combined.
3. Note the final amount of data storage used compared to the initial total 
size of the blobs.

*Expected Behavior:*

The blob composition process should minimize unnecessary data storage use, 
efficiently managing resources to combine blobs without generating excessive 
temporary data overhead.

*Actual Behavior:*

The current implementation results in significant temporary increases in data 
storage, leading to high costs and potential system instability due to frequent 
storage exceptions.
 
 
 

  was:
In the "composeBlobs" method of 
org.apache.flink.fs.gs.writer.GSRecoverableWriterCommitter
many small blobs are combined to generate a final single blob using the google 
storage compose method. This compose action is performed iteratively each time 
composing the resulting blob from the previous step with 31 new blobs until 
there are no remaining blobs. When the compose action is completed, the 
temporary blobs are removed.
 
This unfortunately leads to significant excessive use of data storage (which 
for google storage is a rather costly situation). 
 
*Simple example*
We have 64 blobs each 1 GB; i.e. 64 GB
1st step: 32 blobs are composed into one blob; i.e. now 64 GB + 32 GB = 96 GB
2nd step: The 32 GB blob from previous step is composed with 31 blobs; now we 
have 64 GB + 32 GB + 63 GB = 159 GB
3rd step: The last remaining blob is composed with the blob from the previous 
step; now we have: 64 GB + 32 GB + 63 GB + 64 GB = 223 GB
I.e. in order to combine 64 GB of data we had an overhead of 159 GB. 
 
*Why is this a big issue?*
With large amounts of data the overhead becomes significant. With TiB of data we 
experienced peaks of PiB, leading to unexpectedly high costs and (maybe 
unrelated) frequent storage exceptions thrown by the Google Storage library.
 
*Suggested solution:* 
When the blobs are composed together they should be deleted to not duplicate 
data.
Maybe this has implications for recoverability?
 
 
 
 
 


> GSRecoverableWriterCommitter is generating excessive data blobs
> ---
>
> Key: FLINK-34696
> URL: https://issues.apache.org/jira/browse/FLINK-34696
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / FileSystem
>Reporter: Simon-Shlomo Poil
>Priority: Major
>
> *Description:*
> The `composeBlobs` method in 
> `org.apache.flink.fs.gs.writer.GSRecoverableWriterCommitter` is designed to 
> merge multiple small blobs into a single large blob using Google Cloud 
> Storage's compose method. This process is 

[jira] [Updated] (FLINK-34696) GSRecoverableWriterCommitter is generating excessive data blobs

2024-03-15 Thread Simon-Shlomo Poil (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon-Shlomo Poil updated FLINK-34696:
--
Description: 
The `composeBlobs` method in 
`org.apache.flink.fs.gs.writer.GSRecoverableWriterCommitter` is designed to 
merge multiple small blobs into a single large blob using Google Cloud 
Storage's compose method. This process is iterative, combining the result from 
the previous iteration with 31 new blobs until all blobs are merged. Upon 
completion of the composition, the method proceeds to remove the temporary 
blobs.

*Issue:*

This methodology results in significant, unnecessary data storage consumption 
during the blob composition process, incurring considerable costs due to Google 
Cloud Storage pricing models.

*Example to Illustrate the Problem:*
 - Initial state: 64 blobs, each 1 GB in size (totaling 64 GB).
 - After 1st step: 32 blobs are merged into a single blob, increasing total 
storage to 96 GB (64 original + 32 GB new).
 - After 2nd step: The newly created 32 GB blob is merged with 31 more blobs, 
raising the total to 159 GB.
 - After 3rd step: The final blob is merged, culminating in a total of 223 GB 
to combine the original 64 GB of data. This results in an overhead of 159 GB.

*Impact:*

This inefficiency has a profound impact, especially at scale, where terabytes 
of data can incur overheads in the petabyte range, leading to unexpectedly high 
costs. Additionally, we have observed an increase in storage exceptions thrown 
by the Google Storage library, potentially linked to this issue.

*Suggested Solution:*

To mitigate this problem, we propose modifying the `composeBlobs` method to 
immediately delete source blobs once they have been successfully combined. This 
change could significantly reduce data duplication and associated costs. 
However, the implications for data recovery and integrity need careful 
consideration to ensure that this optimization does not compromise the ability 
to recover data in case of a failure during the composition process.

*Steps to Reproduce:*

1. Initiate the blob composition process in an environment with a significant 
number of blobs (e.g., 64 blobs of 1 GB each).
2. Observe the temporary increase in data storage as blobs are iteratively 
combined.
3. Note the final amount of data storage used compared to the initial total 
size of the blobs.

*Expected Behavior:*

The blob composition process should minimize unnecessary data storage use, 
efficiently managing resources to combine blobs without generating excessive 
temporary data overhead.

*Actual Behavior:*

The current implementation results in significant temporary increases in data 
storage, leading to high costs and potential system instability due to frequent 
storage exceptions.
 
 
 

  was:
*Description:*

The `composeBlobs` method in 
`org.apache.flink.fs.gs.writer.GSRecoverableWriterCommitter` is designed to 
merge multiple small blobs into a single large blob using Google Cloud 
Storage's compose method. This process is iterative, combining the result from 
the previous iteration with 31 new blobs until all blobs are merged. Upon 
completion of the composition, the method proceeds to remove the temporary 
blobs.

*Issue:*

This methodology results in significant, unnecessary data storage consumption 
during the blob composition process, incurring considerable costs due to Google 
Cloud Storage pricing models.

*Example to Illustrate the Problem:*

- Initial state: 64 blobs, each 1 GB in size (totaling 64 GB).
- After 1st step: 32 blobs are merged into a single blob, increasing total 
storage to 96 GB (64 original + 32 GB new).
- After 2nd step: The newly created 32 GB blob is merged with 31 more blobs, 
raising the total to 159 GB.
- After 3rd step: The final blob is merged, culminating in a total of 223 GB to 
combine the original 64 GB of data. This results in an overhead of 159 GB.

*Impact:*

This inefficiency has a profound impact, especially at scale, where terabytes 
of data can incur overheads in the petabyte range, leading to unexpectedly high 
costs. Additionally, we have observed an increase in storage exceptions thrown 
by the Google Storage library, potentially linked to this issue.

*Suggested Solution:*

To mitigate this problem, we propose modifying the `composeBlobs` method to 
immediately delete source blobs once they have been successfully combined. This 
change could significantly reduce data duplication and associated costs. 
However, the implications for data recovery and integrity need careful 
consideration to ensure that this optimization does not compromise the ability 
to recover data in case of a failure during the composition process.

*Steps to Reproduce:*

1. Initiate the blob composition process in an environment with a significant 
number of blobs (e.g., 64 blobs of 1 GB each).
2. Observe the temporary increase in data storage as blobs 

[jira] [Created] (FLINK-34696) GSRecoverableWriterCommitter is generating excessive data blobs

2024-03-15 Thread Simon-Shlomo Poil (Jira)
Simon-Shlomo Poil created FLINK-34696:
-

 Summary: GSRecoverableWriterCommitter is generating excessive data 
blobs
 Key: FLINK-34696
 URL: https://issues.apache.org/jira/browse/FLINK-34696
 Project: Flink
  Issue Type: Bug
  Components: Connectors / FileSystem
Reporter: Simon-Shlomo Poil


In the "composeBlobs" method of 
org.apache.flink.fs.gs.writer.GSRecoverableWriterCommitter
many small blobs are combined to generate a single final blob using the Google 
Storage compose method. This compose action is performed iteratively, each time 
composing the resulting blob from the previous step with 31 new blobs until 
there are no remaining blobs. When the compose action is completed, the 
temporary blobs are removed.
 
This unfortunately leads to significant excess use of data storage (which 
for Google Storage is a rather costly situation). 
 
*Simple example*
We have 64 blobs each 1 GB; i.e. 64 GB
1st step: 32 blobs are composed into one blob; i.e. now 64 GB + 32 GB = 96 GB
2nd step: The 32 GB blob from previous step is composed with 31 blobs; now we 
have 64 GB + 32 GB + 63 GB = 159 GB
3rd step: The last remaining blob is composed with the blob from the previous 
step; now we have: 64 GB + 32 GB + 63 GB + 64 GB = 223 GB
I.e. in order to combine 64 GB of data we had an overhead of 159 GB. 
 
*Why is this a big issue?*
With large amounts of data the overhead becomes significant. With TiB of data we 
experienced peaks of PiB, leading to unexpectedly high costs and (maybe 
unrelated) frequent storage exceptions thrown by the Google Storage library.
 
*Suggested solution:* 
When the blobs are composed together they should be deleted to not duplicate 
data.
Maybe this has implications for recoverability?
 
 
 
 
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-34643][tests] Fix JobIDLoggingITCase [flink]

2024-03-15 Thread via GitHub


rkhachatryan commented on code in PR #24484:
URL: https://github.com/apache/flink/pull/24484#discussion_r1526332843


##
flink-tests/src/test/java/org/apache/flink/test/misc/JobIDLoggingITCase.java:
##


Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34643][tests] Fix JobIDLoggingITCase [flink]

2024-03-15 Thread via GitHub


rkhachatryan commented on code in PR #24484:
URL: https://github.com/apache/flink/pull/24484#discussion_r1526329297


##
flink-test-utils-parent/flink-test-utils-junit/src/main/java/org/apache/flink/testutils/logging/LoggerAuditingExtension.java:
##
@@ -81,7 +81,9 @@ public void beforeEach(ExtensionContext context) throws 
Exception {
 new AbstractAppender("test-appender", null, null, false, 
Property.EMPTY_ARRAY) {
 @Override
 public void append(LogEvent event) {
-loggingEvents.add(event.toImmutable());
+if (event.toImmutable() != null) {
+loggingEvents.add(event.toImmutable());
+}

Review Comment:
   I've seen NPE logged, but don't see it anymore - will remove the check.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34643][tests] Fix JobIDLoggingITCase [flink]

2024-03-15 Thread via GitHub


XComp commented on code in PR #24484:
URL: https://github.com/apache/flink/pull/24484#discussion_r1526327468


##
flink-tests/src/test/java/org/apache/flink/test/misc/JobIDLoggingITCase.java:
##
@@ -205,6 +268,8 @@ private static JobID runJob(ClusterClient clusterClient) 
throws Exception {
 // wait for all tasks ready and then checkpoint
 while (true) {
 try {
+clusterClient.triggerCheckpoint(jobId, 
CheckpointType.DEFAULT).get();
+// get checkpoint notification

Review Comment:
   :+1: That might be a more descriptive comment than the one in the code right 
now. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32074][checkpoint] Merge file across checkpoints [flink]

2024-03-15 Thread via GitHub


Zakelly commented on PR #24497:
URL: https://github.com/apache/flink/pull/24497#issuecomment-1999698683

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Closed] (FLINK-33816) SourceStreamTaskTest.testTriggeringStopWithSavepointWithDrain failed due async checkpoint triggering not being completed

2024-03-15 Thread Piotr Nowojski (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski closed FLINK-33816.
--
Fix Version/s: 2.0.0
 Assignee: jiabao.sun
   Resolution: Fixed

merged commit 5aebb04 into apache:master 

> SourceStreamTaskTest.testTriggeringStopWithSavepointWithDrain failed due 
> async checkpoint triggering not being completed 
> -
>
> Key: FLINK-33816
> URL: https://issues.apache.org/jira/browse/FLINK-33816
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Checkpointing, Runtime / Coordination
>Affects Versions: 1.19.0
>Reporter: Matthias Pohl
>Assignee: jiabao.sun
>Priority: Major
>  Labels: github-actions, pull-request-available, test-stability
> Fix For: 2.0.0
>
> Attachments: screenshot-1.png
>
>
> [https://github.com/XComp/flink/actions/runs/7182604625/job/19559947894#step:12:9430]
> {code:java}
> rror: 14:39:01 14:39:01.930 [ERROR] Tests run: 16, Failures: 1, Errors: 0, 
> Skipped: 0, Time elapsed: 1.878 s <<< FAILURE! - in 
> org.apache.flink.streaming.runtime.tasks.SourceStreamTaskTest
> 9426Error: 14:39:01 14:39:01.930 [ERROR] 
> org.apache.flink.streaming.runtime.tasks.SourceStreamTaskTest.testTriggeringStopWithSavepointWithDrain
>   Time elapsed: 0.034 s  <<< FAILURE!
> 9427Dec 12 14:39:01 org.opentest4j.AssertionFailedError: 
> 9428Dec 12 14:39:01 
> 9429Dec 12 14:39:01 Expecting value to be true but was false
> 9430Dec 12 14:39:01   at 
> java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62)
> 9431Dec 12 14:39:01   at 
> java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:502)
> 9432Dec 12 14:39:01   at 
> org.apache.flink.streaming.runtime.tasks.SourceStreamTaskTest.testTriggeringStopWithSavepointWithDrain(SourceStreamTaskTest.java:710)
> 9433Dec 12 14:39:01   at 
> java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
> 9434Dec 12 14:39:01   at 
> java.base/java.lang.reflect.Method.invoke(Method.java:580)
> [...] {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-33816) SourceStreamTaskTest.testTriggeringStopWithSavepointWithDrain failed due async checkpoint triggering not being completed

2024-03-15 Thread Piotr Nowojski (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17827503#comment-17827503
 ] 

Piotr Nowojski edited comment on FLINK-33816 at 3/15/24 1:33 PM:
-

Thanks for the fix!

merged commit 5aebb04 into apache:master 


was (Author: pnowojski):
merged commit 5aebb04 into apache:master 

> SourceStreamTaskTest.testTriggeringStopWithSavepointWithDrain failed due 
> async checkpoint triggering not being completed 
> -
>
> Key: FLINK-33816
> URL: https://issues.apache.org/jira/browse/FLINK-33816
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Checkpointing, Runtime / Coordination
>Affects Versions: 1.19.0
>Reporter: Matthias Pohl
>Assignee: jiabao.sun
>Priority: Major
>  Labels: github-actions, pull-request-available, test-stability
> Fix For: 2.0.0
>
> Attachments: screenshot-1.png
>
>
> [https://github.com/XComp/flink/actions/runs/7182604625/job/19559947894#step:12:9430]
> {code:java}
> rror: 14:39:01 14:39:01.930 [ERROR] Tests run: 16, Failures: 1, Errors: 0, 
> Skipped: 0, Time elapsed: 1.878 s <<< FAILURE! - in 
> org.apache.flink.streaming.runtime.tasks.SourceStreamTaskTest
> 9426Error: 14:39:01 14:39:01.930 [ERROR] 
> org.apache.flink.streaming.runtime.tasks.SourceStreamTaskTest.testTriggeringStopWithSavepointWithDrain
>   Time elapsed: 0.034 s  <<< FAILURE!
> 9427Dec 12 14:39:01 org.opentest4j.AssertionFailedError: 
> 9428Dec 12 14:39:01 
> 9429Dec 12 14:39:01 Expecting value to be true but was false
> 9430Dec 12 14:39:01   at 
> java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62)
> 9431Dec 12 14:39:01   at 
> java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:502)
> 9432Dec 12 14:39:01   at 
> org.apache.flink.streaming.runtime.tasks.SourceStreamTaskTest.testTriggeringStopWithSavepointWithDrain(SourceStreamTaskTest.java:710)
> 9433Dec 12 14:39:01   at 
> java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
> 9434Dec 12 14:39:01   at 
> java.base/java.lang.reflect.Method.invoke(Method.java:580)
> [...] {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-34643][tests] Fix JobIDLoggingITCase [flink]

2024-03-15 Thread via GitHub


rkhachatryan commented on code in PR #24484:
URL: https://github.com/apache/flink/pull/24484#discussion_r1526297069


##
flink-tests/src/test/java/org/apache/flink/test/misc/JobIDLoggingITCase.java:
##
@@ -205,6 +268,8 @@ private static JobID runJob(ClusterClient clusterClient) 
throws Exception {
 // wait for all tasks ready and then checkpoint
 while (true) {
 try {
+clusterClient.triggerCheckpoint(jobId, 
CheckpointType.DEFAULT).get();
+// get checkpoint notification

Review Comment:
   I wanted to check the log message when checkpoint completion notification is 
received. 
   That notification is not guaranteed to be sent if the job is cancelled right 
after the checkpoint, so I added a 2nd checkpoint.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33816][streaming] Fix unstable test SourceStreamTaskTest.testTriggeringStopWithSavepointWithDrain [flink]

2024-03-15 Thread via GitHub


pnowojski merged PR #24016:
URL: https://github.com/apache/flink/pull/24016


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33816][streaming] Fix unstable test SourceStreamTaskTest.testTriggeringStopWithSavepointWithDrain [flink]

2024-03-15 Thread via GitHub


pnowojski commented on PR #24016:
URL: https://github.com/apache/flink/pull/24016#issuecomment-1999674592

   Thanks for the fix!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Resolved] (FLINK-34194) Upgrade Flink CI Docker container to Ubuntu 22.04

2024-03-15 Thread Matthias Pohl (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Pohl resolved FLINK-34194.
---
Fix Version/s: 1.20.0
   Resolution: Fixed

master: 
[2b94a57f59ee5a4fa09831236764712d1f5affc6|https://github.com/apache/flink/commit/2b94a57f59ee5a4fa09831236764712d1f5affc6]

> Upgrade Flink CI Docker container to Ubuntu 22.04
> -
>
> Key: FLINK-34194
> URL: https://issues.apache.org/jira/browse/FLINK-34194
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.17.2, 1.19.0, 1.18.1
>Reporter: Matthias Pohl
>Assignee: Matthias Pohl
>Priority: Major
>  Labels: github-actions, pull-request-available
> Fix For: 1.20.0
>
>
> The current CI Docker image is based on Ubuntu 16.04. We already use 20.04 
> for the e2e tests. We can update the Docker image to a newer version to be on 
> par with what we need in GitHub Actions (FLINK-33923).
> This issue can cover the following topics:
>  * Update to 22.04
>  ** OpenSSL 1.0.0 dependency should be added for netty-tcnative support
>  ** Use Python3 instead of Python 2.7 (python symlink needs to be added due 
> to FLINK-34195) 
>  * Removal of Maven (FLINK-33501 makes us rely on the Maven wrapper)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-34194][ci] Updates test CI container to be based on Ubuntu 22.04 [flink]

2024-03-15 Thread via GitHub


XComp merged PR #24165:
URL: https://github.com/apache/flink/pull/24165


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Closed] (FLINK-34533) Propose a pull request for website updates

2024-03-15 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee closed FLINK-34533.
---
Resolution: Won't Fix

> Propose a pull request for website updates
> --
>
> Key: FLINK-34533
> URL: https://issues.apache.org/jira/browse/FLINK-34533
> Project: Flink
>  Issue Type: Sub-task
>Affects Versions: 1.19.0
>Reporter: Lincoln Lee
>Assignee: lincoln lee
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.19.0
>
>
> The final step of building the candidate is to propose a website pull request 
> containing the following changes:
>  # update 
> [apache/flink-web:_config.yml|https://github.com/apache/flink-web/blob/asf-site/_config.yml]
>  ## update {{FLINK_VERSION_STABLE}} and {{FLINK_VERSION_STABLE_SHORT}} as 
> required
>  ## update version references in quickstarts ({{{}q/{}}} directory) as 
> required
>  ## (major only) add a new entry to {{flink_releases}} for the release 
> binaries and sources
>  ## (minor only) update the entry for the previous release in the series in 
> {{flink_releases}}
>  ### Please pay attention to the ids assigned to the download entries. They 
> should be unique and reflect their corresponding version number.
>  ## add a new entry to {{release_archive.flink}}
>  # add a blog post announcing the release in _posts
>  # add an organized release notes page under docs/content/release-notes and 
> docs/content.zh/release-notes (like 
> [https://nightlies.apache.org/flink/flink-docs-release-1.15/release-notes/flink-1.15/]).
>  The page is based on the non-empty release notes collected from the issues, 
> and only the issues that affect existing users should be included (i.e., not 
> new functionality). It should be in a separate PR since it would 
> be merged to the flink project.
> (!) Don’t merge the PRs before finalizing the release.
>  
> 
> h3. Expectations
>  * Website pull request proposed to list the 
> [release|http://flink.apache.org/downloads.html]
>  * (major only) Check {{docs/config.toml}} to ensure that
>  ** the version constants refer to the new version
>  ** the {{baseurl}} does not point to {{flink-docs-master}}  but 
> {{flink-docs-release-X.Y}} instead



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34695) Move Flink's CI docker container into a public repo

2024-03-15 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17827500#comment-17827500
 ] 

Matthias Pohl commented on FLINK-34695:
---

This should be discussed in the dev ML before going ahead with the change, I 
guess. ...if someone wants to pick this up.

> Move Flink's CI docker container into a public repo
> ---
>
> Key: FLINK-34695
> URL: https://issues.apache.org/jira/browse/FLINK-34695
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System / CI
>Affects Versions: 1.19.0, 1.18.1, 1.20.0
>Reporter: Matthias Pohl
>Priority: Major
>
> Currently, Flink's CI (GitHub Actions and Azure Pipelines) uses a container to 
> run the logic. The intention behind it is to have a way to mimic the CI 
> setup locally as well.
> The current Docker image is maintained in the 
> [zentol/flink-ci-docker|https://github.com/zentol/flink-ci-docker] fork 
> (owned by [~chesnay]) of 
> [flink-ci/flink-ci-docker|https://github.com/flink-ci/flink-ci-docker] (owned 
> by Ververica), which is not ideal. We should move this repo into an 
> Apache-owned repository.
> Additionally, there's no workflow pushing the image automatically to a 
> registry from where it can be used. Instead, the images were pushed to 
> personal Docker Hub repos in the past (rmetzger, chesnay, mapohl). This is 
> also not ideal. We should use a public repo and a GHA workflow to push the 
> image to it.
> Questions to answer here:
> # Where shall the Docker image code be located?
> # Which Docker registry should be used?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-34194][ci] Updates test CI container to be based on Ubuntu 22.04 [flink]

2024-03-15 Thread via GitHub


XComp commented on PR #24165:
URL: https://github.com/apache/flink/pull/24165#issuecomment-1999640749

   I agree with you. I had this discussion with @zentol recently as well. 
Ideally, we should move the Docker image into an Apache-owned repo and push it to a 
public Docker registry. I created FLINK-34695 to cover this.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-34695) Move Flink's CI docker container into a public repo

2024-03-15 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17827499#comment-17827499
 ] 

Matthias Pohl commented on FLINK-34695:
---

{quote}
Where shall the Docker image code be located?
{quote}

* A docker subfolder in 
[apache/flink:.github/|https://github.com/apache/flink/tree/master/.github] 
since that's the closest it can be to the actual CI. Right now, there is some 
redundancy with 
[apache/flink-connector-shared-utils|https://github.com/apache/flink-connector-shared-utils/blob/ci_utils/docker/base/Dockerfile]
 that is used for connector CI. Having a dedicated CI Docker image in 
{{apache/flink}} would not resolve this redundancy. The question is just: Do we 
want to resolve it?
* Use 
[apache/flink-connector-shared-utils|https://github.com/apache/flink-connector-shared-utils/blob/ci_utils/docker/base/Dockerfile]
 as the Docker image not only for the connector repos but also for the Flink main 
repo. This would remove the redundancy.

{quote}
Which Docker registry should be used?
{quote}
* We already use ghcr.io for the Flink nightly builds (see snapshot workflow 
config in 
[apache/flink-docker:.github/workflows/snapshot.yml:24|https://github.com/apache/flink-docker/blob/master/.github/workflows/snapshot.yml#L24].
 That could be used for {{flink-docker-ci}} as well.

> Move Flink's CI docker container into a public repo
> ---
>
> Key: FLINK-34695
> URL: https://issues.apache.org/jira/browse/FLINK-34695
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System / CI
>Affects Versions: 1.19.0, 1.18.1, 1.20.0
>Reporter: Matthias Pohl
>Priority: Major
>
> Currently, Flink's CI (GitHub Actions and Azure Pipelines) uses a container to 
> run the logic. The intention behind it is to have a way to mimic the CI 
> setup locally as well.
> The current Docker image is maintained in the 
> [zentol/flink-ci-docker|https://github.com/zentol/flink-ci-docker] fork 
> (owned by [~chesnay]) of 
> [flink-ci/flink-ci-docker|https://github.com/flink-ci/flink-ci-docker] (owned 
> by Ververica), which is not ideal. We should move this repo into an 
> Apache-owned repository.
> Additionally, there's no workflow pushing the image automatically to a 
> registry from where it can be used. Instead, the images were pushed to 
> personal Docker Hub repos in the past (rmetzger, chesnay, mapohl). This is 
> also not ideal. We should use a public repo and a GHA workflow to push the 
> image to it.
> Questions to answer here:
> # Where shall the Docker image code be located?
> # Which Docker registry should be used?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34693) Memory leak in KafkaWriter

2024-03-15 Thread Leonard Xu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17827496#comment-17827496
 ] 

Leonard Xu commented on FLINK-34693:


[~srichter] Thanks for your report, but the Affects version should be the Kafka 
connector version instead of Flink’s version, as the Kafka connector has been 
moved out of the Flink main repo and is released separately. 

> Memory leak in KafkaWriter
> --
>
> Key: FLINK-34693
> URL: https://issues.apache.org/jira/browse/FLINK-34693
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.18.0, 1.19.0, 1.18.1
>Reporter: Stefan Richter
>Priority: Blocker
> Attachments: image-2024-03-15-10-30-08-280.png
>
>
> KafkaWriter is keeping objects in a Deque of closeables 
> ({{{}producerCloseables{}}}) that are never removed, so they can never be GC’ed.
> From heap dump:
>   !image-2024-03-15-10-30-08-280.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34693) Memory leak in KafkaWriter

2024-03-15 Thread Leonard Xu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17827495#comment-17827495
 ] 

Leonard Xu commented on FLINK-34693:


[~renqs] Would you like to look into this issue?

> Memory leak in KafkaWriter
> --
>
> Key: FLINK-34693
> URL: https://issues.apache.org/jira/browse/FLINK-34693
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.18.0, 1.19.0, 1.18.1
>Reporter: Stefan Richter
>Priority: Blocker
> Attachments: image-2024-03-15-10-30-08-280.png
>
>
> KafkaWriter is keeping objects in a Deque of closeables 
> ({{{}producerCloseables{}}}) that are never removed, so they can never be GC’ed.
> From heap dump:
>   !image-2024-03-15-10-30-08-280.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34695) Move Flink's CI docker container into a public repo

2024-03-15 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-34695:
-

 Summary: Move Flink's CI docker container into a public repo
 Key: FLINK-34695
 URL: https://issues.apache.org/jira/browse/FLINK-34695
 Project: Flink
  Issue Type: Improvement
  Components: Build System / CI
Affects Versions: 1.18.1, 1.19.0, 1.20.0
Reporter: Matthias Pohl


Currently, Flink's CI (GitHub Actions and Azure Pipelines) uses a container to 
run the logic. The intention behind it is to have a way to mimic the CI setup 
locally as well.

The current Docker image is maintained in the 
[zentol/flink-ci-docker|https://github.com/zentol/flink-ci-docker] fork (owned 
by [~chesnay]) of 
[flink-ci/flink-ci-docker|https://github.com/flink-ci/flink-ci-docker] (owned 
by Ververica), which is not ideal. We should move this repo into an Apache-owned 
repository.

Additionally, there's no workflow pushing the image automatically to a 
registry from where it can be used. Instead, the images were pushed to personal 
Docker Hub repos in the past (rmetzger, chesnay, mapohl). This is also not 
ideal. We should use a public repo and a GHA workflow to push the image to it.

Questions to answer here:
# Where shall the Docker image code be located?
# Which Docker registry should be used?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34694) Delete num of associations for streaming outer join

2024-03-15 Thread Roman Boyko (Jira)
Roman Boyko created FLINK-34694:
---

 Summary: Delete num of associations for streaming outer join
 Key: FLINK-34694
 URL: https://issues.apache.org/jira/browse/FLINK-34694
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Runtime
Affects Versions: 1.18.1, 1.19.0, 1.17.2, 1.16.3
Reporter: Roman Boyko
 Attachments: image-2024-03-15-19-51-29-282.png, 
image-2024-03-15-19-52-24-391.png

Currently, in StreamingJoinOperator (non-window), in the case of an OUTER JOIN the 
OuterJoinRecordStateView is used to store an additional field - the number of 
associations for every record. This leads to storing additional Tuple2 and 
Integer data for every record in the outer state.

This functionality is used only for sending:
 * -D[nullPaddingRecord] in case of the first Accumulate record
 * +I[nullPaddingRecord] in case of the last Revoke record

The overhead of storing additional data and updating the counter for 
associations can be avoided by checking the input state for these events.

 

The proposed solution can be found here - 
[https://github.com/rovboyko/flink/commit/1ca2f5bdfc2d44b99d180abb6a4dda123e49d423]

 

According to the nexmark q20 test (changed to OUTER JOIN), it could improve 
performance by up to 20%:
 * Before:

!image-2024-03-15-19-52-24-391.png!
 * After:

!image-2024-03-15-19-51-29-282.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
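
A minimal, illustrative sketch of the state layout described in FLINK-34694 above. It is not the actual Flink runtime code; all class and method names are made up. It only shows why keeping an association counter next to every outer-side record costs an extra state write for each matching row, which is the overhead the ticket proposes to remove by probing the other side's state instead.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Illustrative only: an outer-join state view that keeps the number of
 * associations next to every record, mirroring the Tuple2/Integer overhead
 * described in FLINK-34694. Not the real OuterJoinRecordStateView.
 */
class CountingOuterJoinStateSketch<R> {

    // record -> number of currently matching rows on the other side of the join
    private final Map<R, Integer> recordsAndCounts = new HashMap<>();

    void addRecord(R record) {
        // The counter has to be written together with the record itself.
        recordsAndCounts.put(record, 0);
    }

    void onOtherSideInsert(R record) {
        // Extra state update for every matching row from the other input.
        recordsAndCounts.merge(record, 1, Integer::sum);
    }

    void onOtherSideRetract(R record) {
        recordsAndCounts.merge(record, -1, Integer::sum);
    }

    boolean hasNoAssociations(R record) {
        // Counter == 0 decides whether -D/+I[nullPaddingRecord] is emitted; the
        // proposal linked above derives this by checking the other side's state
        // instead of maintaining the counter.
        return recordsAndCounts.getOrDefault(record, 0) == 0;
    }
}
```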


Re: [PR] [FLINK-31472] Remove external timer service reference from AsyncSinkThrottling Test [flink]

2024-03-15 Thread via GitHub


XComp commented on PR #24481:
URL: https://github.com/apache/flink/pull/24481#issuecomment-1999570415

   @dannycranmer do you have capacity to look into it? I saw that you reviewed 
the FLINK-25793 work which introduced this test. I'm not too familiar with this 
part of the code. That's why I'd prefer someone else to review this one.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-34693) Memory leak in KafkaWriter

2024-03-15 Thread Stefan Richter (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Richter updated FLINK-34693:
---
Attachment: (was: image-2024-03-15-10-30-50-902.png)

> Memory leak in KafkaWriter
> --
>
> Key: FLINK-34693
> URL: https://issues.apache.org/jira/browse/FLINK-34693
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.18.0, 1.19.0, 1.18.1
>Reporter: Stefan Richter
>Priority: Blocker
> Attachments: image-2024-03-15-10-30-08-280.png
>
>
> KafkaWriter is keeping objects in a Deque of closeables 
> ({{{}producerCloseables{}}}) that are never removed, so they can never be GC’ed.
> From heap dump:
>   !image-2024-03-15-10-30-08-280.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-34693) Memory leak in KafkaWriter

2024-03-15 Thread Stefan Richter (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Richter updated FLINK-34693:
---
Description: 
KafkaWriter is keeping objects in a Deque of closeables 
({{{}producerCloseables{}}}) that are never removed, so they can never be GC’ed.

From heap dump:
  !image-2024-03-15-10-30-08-280.png!

  was:
KafkaWriter is keeping instances of {{TwoPhaseCommitProducer}} in a Deque of 
closeables ({{{}producerCloseables{}}}). We are only adding instances to the 
queue (for each txn?) but never removing them, so they can never be GC’ed.

From heap dump:

!image-2024-03-15-10-30-50-902.png!
 !image-2024-03-15-10-30-08-280.png!


> Memory leak in KafkaWriter
> --
>
> Key: FLINK-34693
> URL: https://issues.apache.org/jira/browse/FLINK-34693
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.18.0, 1.19.0, 1.18.1
>Reporter: Stefan Richter
>Priority: Blocker
> Attachments: image-2024-03-15-10-30-08-280.png, 
> image-2024-03-15-10-30-50-902.png
>
>
> KafkaWriter is keeping objects in a Deque of closeables 
> ({{{}producerCloseables{}}}) that are never removed, so they can never be GC’ed.
> From heap dump:
>   !image-2024-03-15-10-30-08-280.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-34516] Move CheckpointingMode to flink-core [flink]

2024-03-15 Thread via GitHub


Zakelly commented on PR #24381:
URL: https://github.com/apache/flink/pull/24381#issuecomment-1999431941

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32074][checkpoint] Merge file across checkpoints [flink]

2024-03-15 Thread via GitHub


Zakelly commented on PR #24497:
URL: https://github.com/apache/flink/pull/24497#issuecomment-1999420909

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32079][checkpoint] Support to read/write checkpoint metadata of merged files [flink]

2024-03-15 Thread via GitHub


Zakelly commented on PR #24480:
URL: https://github.com/apache/flink/pull/24480#issuecomment-1999414694

   > @Zakelly @fredia Could you take a look ?
   
   @masteryhx 
   How about waiting for FLINK-34668, where some new state handles will be 
introduced?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-34585) [JUnit5 Migration] Module: Flink CDC

2024-03-15 Thread Jiabao Sun (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17827462#comment-17827462
 ] 

Jiabao Sun commented on FLINK-34585:


Thanks [~kunni] for volunteering.
Assigned to you.

> [JUnit5 Migration] Module: Flink CDC
> 
>
> Key: FLINK-34585
> URL: https://issues.apache.org/jira/browse/FLINK-34585
> Project: Flink
>  Issue Type: Sub-task
>  Components: Flink CDC
>Reporter: Hang Ruan
>Assignee: LvYanquan
>Priority: Major
>
> Most tests in Flink CDC are still using Junit 4. We need to use Junit 5 
> instead.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-34585) [JUnit5 Migration] Module: Flink CDC

2024-03-15 Thread Jiabao Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiabao Sun reassigned FLINK-34585:
--

Assignee: LvYanquan

> [JUnit5 Migration] Module: Flink CDC
> 
>
> Key: FLINK-34585
> URL: https://issues.apache.org/jira/browse/FLINK-34585
> Project: Flink
>  Issue Type: Sub-task
>  Components: Flink CDC
>Reporter: Hang Ruan
>Assignee: LvYanquan
>Priority: Major
>
> Most tests in Flink CDC are still using Junit 4. We need to use Junit 5 
> instead.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-25544) [JUnit5 Migration] Module: flink-streaming-java

2024-03-15 Thread Jiabao Sun (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17827460#comment-17827460
 ] 

Jiabao Sun commented on FLINK-25544:


master: 62f44e0118539c1ed0dedf47099326f97c9d0427

> [JUnit5 Migration] Module: flink-streaming-java
> ---
>
> Key: FLINK-25544
> URL: https://issues.apache.org/jira/browse/FLINK-25544
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Hang Ruan
>Assignee: Jiabao Sun
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-25544) [JUnit5 Migration] Module: flink-streaming-java

2024-03-15 Thread Jiabao Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiabao Sun updated FLINK-25544:
---
Release Note:   (was: master: 62f44e0118539c1ed0dedf47099326f97c9d0427)

> [JUnit5 Migration] Module: flink-streaming-java
> ---
>
> Key: FLINK-25544
> URL: https://issues.apache.org/jira/browse/FLINK-25544
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Hang Ruan
>Assignee: Jiabao Sun
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (FLINK-25544) [JUnit5 Migration] Module: flink-streaming-java

2024-03-15 Thread Jiabao Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiabao Sun resolved FLINK-25544.

Fix Version/s: 1.20.0
 Release Note: master: 62f44e0118539c1ed0dedf47099326f97c9d0427
   Resolution: Fixed

> [JUnit5 Migration] Module: flink-streaming-java
> ---
>
> Key: FLINK-25544
> URL: https://issues.apache.org/jira/browse/FLINK-25544
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Hang Ruan
>Assignee: Jiabao Sun
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-25544][streaming][JUnit5 Migration] The runtime package of module flink-stream-java [flink]

2024-03-15 Thread via GitHub


Jiabao-Sun merged PR #24483:
URL: https://github.com/apache/flink/pull/24483


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34334][state] Add sub-task level RocksDB file count metrics [flink]

2024-03-15 Thread via GitHub


masteryhx commented on code in PR #24322:
URL: https://github.com/apache/flink/pull/24322#discussion_r1526040906


##
flink-state-backends/flink-statebackend-rocksdb/src/test/java/org/apache/flink/contrib/streaming/state/RocksDBNativeMetricOptionsTest.java:
##
@@ -29,7 +29,11 @@ public class RocksDBNativeMetricOptionsTest {
 public void testNativeMetricsConfigurable() {
 for (RocksDBProperty property : RocksDBProperty.values()) {
 Configuration config = new Configuration();
-config.setBoolean(property.getConfigKey(), true);
+if (property.getConfigKey().contains("num-files-at-level")) {
+
config.setBoolean("state.backend.rocksdb.metrics.num-files-at-level", true);

Review Comment:
   ```suggestion
   
config.setBoolean(RocksDBNativeMetricOptions.MONITOR_NUM_FILES_AT_LEVEL.key(), 
true);
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34593][release] Add release note for version 1.19 [flink]

2024-03-15 Thread via GitHub


flinkbot commented on PR #24505:
URL: https://github.com/apache/flink/pull/24505#issuecomment-1999291335

   
   ## CI report:
   
   * d936cd9edb37ccf9b7b70631b3463d10d2c4e33c UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34533][release] Add release note for version 1.19 [flink]

2024-03-15 Thread via GitHub


lincoln-lil merged PR #24394:
URL: https://github.com/apache/flink/pull/24394


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34690] Cast decimal to VARCHAR/INT/BIGINT/LARGEINT as primary … [flink-cdc]

2024-03-15 Thread via GitHub


loserwang1024 commented on code in PR #3150:
URL: https://github.com/apache/flink-cdc/pull/3150#discussion_r1526000193


##
flink-cdc-connect/flink-cdc-pipeline-connectors/flink-cdc-pipeline-connector-starrocks/src/main/java/org/apache/flink/cdc/connectors/starrocks/sink/StarRocksUtils.java:
##
@@ -297,10 +297,29 @@ public StarRocksColumn.Builder visit(DoubleType 
doubleType) {
 
 @Override
 public StarRocksColumn.Builder visit(DecimalType decimalType) {
-builder.setDataType(DECIMAL);
+// StarRocks is not support Decimal as primary key, so decimal 
should be cast to INT,
+// BIGINT, LARGEINT or VARHCAR.
+if (!isPrimaryKeys) {

Review Comment:
   Based on Flink Schema passed to sink, SEE 
org.apache.flink.cdc.connectors.starrocks.sink.StarRocksUtils#toStarRocksTable 
-> 
org.apache.flink.cdc.connectors.starrocks.sink.StarRocksUtils#toStarRocksDataType



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-34693) Memory leak in KafkaWriter

2024-03-15 Thread Stefan Richter (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Richter updated FLINK-34693:
---
Description: 
KafkaWriter is keeping instances of {{TwoPhaseCommitProducer}} in a Deque of 
closeables ({{{}producerCloseables{}}}). We are only adding instances to the 
queue (for each txn?) but never removing them, so they can never be GC’ed.

From heap dump:

!image-2024-03-15-10-30-50-902.png!
 !image-2024-03-15-10-30-08-280.png!

  was:
KafkaWriter is keeping instances of {{TwoPhaseCommitProducer}} in a Deque of 
closeables ({{{}producerCloseables{}}}). We are only adding instances to the 
queue (for each txn?) but never removing them, so they can never be GC’ed.

From heap dump:

!image-2024-03-15-10-30-50-902.png!
!image-2024-03-15-10-30-08-280.png!
!image-2024-03-15-10-28-48-591.png!


> Memory leak in KafkaWriter
> --
>
> Key: FLINK-34693
> URL: https://issues.apache.org/jira/browse/FLINK-34693
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.18.0, 1.19.0, 1.18.1
>Reporter: Stefan Richter
>Priority: Blocker
> Attachments: image-2024-03-15-10-30-08-280.png, 
> image-2024-03-15-10-30-50-902.png
>
>
> KafkaWriter is keeping instances of {{TwoPhaseCommitProducer}} in a Deque of 
> closeables ({{{}producerCloseables{}}}). We are only adding instances to the 
> queue (for each txn?) but never removing them, so they can never be GC’ed.
> From heap dump:
> !image-2024-03-15-10-30-50-902.png!
>  !image-2024-03-15-10-30-08-280.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34693) Memory leak in KafkaWriter

2024-03-15 Thread Stefan Richter (Jira)
Stefan Richter created FLINK-34693:
--

 Summary: Memory leak in KafkaWriter
 Key: FLINK-34693
 URL: https://issues.apache.org/jira/browse/FLINK-34693
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka
Affects Versions: 1.18.1, 1.19.0, 1.18.0
Reporter: Stefan Richter
 Attachments: image-2024-03-15-10-30-08-280.png, 
image-2024-03-15-10-30-50-902.png

KafkaWriter is keeping instances of {{TwoPhaseCommitProducer}} in a Deque of 
closeables ({{{}producerCloseables{}}}). We are only adding instances to the 
queue (for each txn?) but never removing them, so they can never be GC’ed.

From heap dump:

!image-2024-03-15-10-30-50-902.png!
!image-2024-03-15-10-30-08-280.png!
!image-2024-03-15-10-28-48-591.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
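
Below is a minimal sketch of the leak pattern described in FLINK-34693 above, under the assumption that the writer registers every newly created producer in a Deque of Closeables so it can close them on shutdown, but never removes entries once a producer has finished. The class names are placeholders, not the actual KafkaWriter code; one possible remedy is indicated as a comment.

```java
import java.io.Closeable;
import java.util.ArrayDeque;
import java.util.Deque;

/** Illustrative only: an ever-growing Deque of closeables keeps producers reachable. */
class LeakyTransactionalWriterSketch {

    /** Stand-in for the real transactional producer; holds on to some memory. */
    static final class Producer implements Closeable {
        private final byte[] buffers = new byte[1 << 20]; // ~1 MiB kept alive per producer

        @Override
        public void close() {
            // release the resources of this producer
        }
    }

    // Mirrors the producerCloseables field mentioned in the ticket.
    private final Deque<Closeable> producerCloseables = new ArrayDeque<>();

    Producer beginTransaction() {
        Producer producer = new Producer();
        // A new closeable is registered for every transaction ...
        producerCloseables.add(producer);
        return producer;
    }

    void commitTransaction(Producer producer) {
        producer.close();
        // ... but nothing is ever removed, so the deque and every producer it
        // references stay strongly reachable for the lifetime of the writer.
        // A possible remedy: producerCloseables.remove(producer);
    }
}
```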


Re: [PR] [FLINK-34533][release] Add release note for version 1.19 [flink]

2024-03-15 Thread via GitHub


lincoln-lil commented on PR #24394:
URL: https://github.com/apache/flink/pull/24394#issuecomment-1999261121

   Will merge this.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Update oracle-cdc.md [flink-cdc]

2024-03-15 Thread via GitHub


e-mhui commented on PR #3151:
URL: https://github.com/apache/flink-cdc/pull/3151#issuecomment-1999260453

   If you are using Flink SQL, you need to add "debezium." as a prefix. However, 
when using the DataStream API, you don't need the "debezium." prefix.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
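
A hedged illustration of the prefix rule from the comment above, based on the documentation linked in this thread: the DataStream API takes the raw Debezium key on the properties object handed to the source builder, while in Flink SQL the same key is written with the "debezium." prefix in the WITH clause. Values and the surrounding setup are placeholders, not a complete pipeline.

```java
import java.util.Properties;

/** Illustrative only: passing the same Debezium option through the two APIs. */
public class OracleCdcOptionExample {

    public static void main(String[] args) {
        // DataStream API: the raw Debezium key, without a prefix. These properties
        // would be handed to the Oracle CDC source builder via its
        // debeziumProperties(...) setter, as in the documentation snippet quoted above.
        Properties debeziumProperties = new Properties();
        debeziumProperties.setProperty("log.mining.strategy", "online_catalog");

        // Flink SQL: the same key carries the "debezium." prefix in the WITH clause
        // of the CREATE TABLE statement, e.g.
        //   'debezium.log.mining.strategy' = 'online_catalog'
        System.out.println(debeziumProperties);
    }
}
```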



Re: [PR] [FLINK-34643][tests] Fix JobIDLoggingITCase [flink]

2024-03-15 Thread via GitHub


XComp commented on code in PR #24484:
URL: https://github.com/apache/flink/pull/24484#discussion_r1525984268


##
flink-test-utils-parent/flink-test-utils-junit/src/main/java/org/apache/flink/testutils/logging/LoggerAuditingExtension.java:
##
@@ -81,7 +81,9 @@ public void beforeEach(ExtensionContext context) throws 
Exception {
 new AbstractAppender("test-appender", null, null, false, 
Property.EMPTY_ARRAY) {
 @Override
 public void append(LogEvent event) {
-loggingEvents.add(event.toImmutable());
+if (event.toImmutable() != null) {
+loggingEvents.add(event.toImmutable());
+}

Review Comment:
   It's not clear to me why this change is necessary. The `LogEvent` interface 
doesn't indicate that `null` is a valid return value. :thinking: You might want 
to add a comment here if we really need to check for `null` values here.



##
flink-tests/src/test/java/org/apache/flink/test/misc/JobIDLoggingITCase.java:
##


Review Comment:
   Just two minor nit-picky general comments on JUnit5 testing:
   * JUnit5 doesn't require the test class to be public anymore. This also 
removes the necessity to add JavaDoc to please checkstyle.
   * Same is true for the test methods



##
flink-tests/src/test/java/org/apache/flink/test/misc/JobIDLoggingITCase.java:
##
@@ -205,6 +268,8 @@ private static JobID runJob(ClusterClient clusterClient) 
throws Exception {
 // wait for all tasks ready and then checkpoint
 while (true) {
 try {
+clusterClient.triggerCheckpoint(jobId, 
CheckpointType.DEFAULT).get();
+// get checkpoint notification

Review Comment:
   why do we need a second checkpoint trigger to get the checkpoint notification? 
because the 2nd checkpoint is waiting for the first one to finish? :thinking: 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [release] Update version to 1.9-SNAPSHOT [flink-kubernetes-operator]

2024-03-15 Thread via GitHub


mxm merged PR #798:
URL: https://github.com/apache/flink-kubernetes-operator/pull/798


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [release] Adjust website for Kubernetes operator 1.8.0 release [flink-web]

2024-03-15 Thread via GitHub


mxm commented on PR #726:
URL: https://github.com/apache/flink-web/pull/726#issuecomment-1999239653

   Yes, the blog post is still a TODO, although it isn't strictly needed for 
the release.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-34670) The asyncOperationsThreadPool in SubtaskCheckpointCoordinatorImpl can only create one worker thread

2024-03-15 Thread Jinzhong Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinzhong Li updated FLINK-34670:

Description: 
Now, the asyncOperations ThreadPoolExecutor of SubtaskCheckpointCoordinatorImpl 
is created with a LinkedBlockingQueue and zero corePoolSize.

!image-2024-03-14-20-24-14-198.png|width=614,height=198!

And in the ThreadPoolExecutor, except for the first time the task is submitted, 
*no* new thread is created until the queue is full. But the capacity of 
LinkedBlockingQueue is Integer.Max. This means that there is almost *only one 
thread* working in this thread pool, *even if* {*}there are many concurrent 
checkpoint requests or checkpoint abort requests waiting to be processed{*}.

!image-2024-03-14-20-27-37-540.png|width=614,height=175!

This problem can be verified by changing ExecutorService implementation in UT 
SubtaskCheckpointCoordinatorTest#testNotifyCheckpointAbortedDuringAsyncPhase. 
When the LinkedBlockingQueue is used, this UT will deadlock because only one 
worker thread can be created.
!image-2024-03-14-20-33-28-851.png|width=606,height=235!

  was:
Now, the asyncOperations ThreadPoolExecutor of SubtaskCheckpointCoordinatorImpl 
is created with a LinkedBlockingQueue and zero corePoolSize.

!image-2024-03-14-20-24-14-198.png|width=614,height=198!

And in the ThreadPoolExecutor, except for the first time the task is submitted, 
*no* new thread is created until the queue is full. But the capacity of 
LinkedBlockingQueue is Integer.Max. This means that there is almost *only one 
thread* working in this thread pool, even if {*}there are many concurrent 
checkpoint requests or checkpoint abort requests waiting to be processed{*}.

!image-2024-03-14-20-27-37-540.png|width=614,height=175!

This problem can be verified by changing ExecutorService implementation in UT 
SubtaskCheckpointCoordinatorTest#testNotifyCheckpointAbortedDuringAsyncPhase. 
When the LinkedBlockingQueue is used, this UT will deadlock because only one 
worker thread can be created.
!image-2024-03-14-20-33-28-851.png|width=606,height=235!


> The asyncOperationsThreadPool in SubtaskCheckpointCoordinatorImpl can only 
> create one worker thread
> ---
>
> Key: FLINK-34670
> URL: https://issues.apache.org/jira/browse/FLINK-34670
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.18.0, 1.19.0
>Reporter: Jinzhong Li
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.20.0
>
> Attachments: image-2024-03-14-20-24-14-198.png, 
> image-2024-03-14-20-27-37-540.png, image-2024-03-14-20-33-28-851.png
>
>
> Now, the asyncOperations ThreadPoolExecutor of 
> SubtaskCheckpointCoordinatorImpl is created with a LinkedBlockingQueue and 
> zero corePoolSize.
> !image-2024-03-14-20-24-14-198.png|width=614,height=198!
> And in the ThreadPoolExecutor, except for the first time the task is 
> submitted, *no* new thread is created until the queue is full. But the 
> capacity of LinkedBlockingQueue is Integer.Max. This means that there is 
> almost *only one thread* working in this thread pool, *even if* {*}there are 
> many concurrent checkpoint requests or checkpoint abort requests waiting to 
> be processed{*}.
> !image-2024-03-14-20-27-37-540.png|width=614,height=175!
> This problem can be verified by changing ExecutorService implementation in UT 
> SubtaskCheckpointCoordinatorTest#testNotifyCheckpointAbortedDuringAsyncPhase. 
> When the LinkedBlockingQueue is used, this UT will deadlock because only one 
> worker thread can be created.
> !image-2024-03-14-20-33-28-851.png|width=606,height=235!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
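
The queueing behavior described in FLINK-34670 above can be reproduced with a small, self-contained JDK demo (this shows generic ThreadPoolExecutor behavior, not the Flink code itself): with zero core threads and an unbounded LinkedBlockingQueue only one worker is ever started, because additional workers are only created once the queue rejects a task; a SynchronousQueue, shown here as one possible alternative, grows the pool up to its maximum instead.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class AsyncOperationsPoolDemo {

    public static void main(String[] args) throws InterruptedException {
        // Configuration described in the ticket: zero core threads and an unbounded queue.
        ThreadPoolExecutor linkedQueuePool =
                new ThreadPoolExecutor(0, 4, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        // Possible alternative: a SynchronousQueue hands tasks directly to workers,
        // so new threads are created up to the maximum pool size.
        ThreadPoolExecutor synchronousQueuePool =
                new ThreadPoolExecutor(0, 4, 60L, TimeUnit.SECONDS, new SynchronousQueue<>());

        System.out.println("LinkedBlockingQueue workers: " + startedWorkers(linkedQueuePool));     // 1
        System.out.println("SynchronousQueue workers:    " + startedWorkers(synchronousQueuePool)); // 4

        linkedQueuePool.shutdownNow();
        synchronousQueuePool.shutdownNow();
    }

    /** Submits four long-running tasks and reports how many worker threads were started. */
    private static int startedWorkers(ThreadPoolExecutor pool) throws InterruptedException {
        CountDownLatch blocker = new CountDownLatch(1);
        for (int i = 0; i < 4; i++) {
            pool.execute(() -> {
                try {
                    blocker.await(); // keep the worker busy, like a long async checkpoint
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        Thread.sleep(500); // give the executor time to spawn workers
        // Workers beyond the core size are only added when the queue rejects a task;
        // an unbounded LinkedBlockingQueue never rejects, so only the single worker
        // created for the very first submission ever exists.
        int poolSize = pool.getPoolSize();
        blocker.countDown();
        return poolSize;
    }
}
```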


[jira] [Updated] (FLINK-34670) The asyncOperationsThreadPool in SubtaskCheckpointCoordinatorImpl can only create one worker thread

2024-03-15 Thread Jinzhong Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinzhong Li updated FLINK-34670:

Description: 
Now, the asyncOperations ThreadPoolExecutor of SubtaskCheckpointCoordinatorImpl 
is created with a LinkedBlockingQueue and zero corePoolSize.

!image-2024-03-14-20-24-14-198.png|width=614,height=198!

And in the ThreadPoolExecutor, except for the first time the task is submitted, 
*no* new thread is created until the queue is full. But the capacity of 
LinkedBlockingQueue is Integer.Max. This means that there is almost *only one 
thread* working in this thread pool, even if {*}there are many concurrent 
checkpoint requests or checkpoint abort requests waiting to be processed{*}.

!image-2024-03-14-20-27-37-540.png|width=614,height=175!

This problem can be verified by changing ExecutorService implementation in UT 
SubtaskCheckpointCoordinatorTest#testNotifyCheckpointAbortedDuringAsyncPhase. 
When the LinkedBlockingQueue is used, this UT will deadlock because only one 
worker thread can be created.
!image-2024-03-14-20-33-28-851.png|width=606,height=235!

  was:
Now, the asyncOperations ThreadPoolExecutor of SubtaskCheckpointCoordinatorImpl 
is created with a LinkedBlockingQueue and zero corePoolSize.

!image-2024-03-14-20-24-14-198.png|width=614,height=198!

And in the ThreadPoolExecutor, except for the first time the task is submitted, 
*no* new thread is created until the queue is full. But the capacity of 
LinkedBlockingQueue is Integer.Max. This means that there is almost *only one 
thread* working in this thread pool, even if there are many concurrent 
checkpoint requests or checkpoint abort requests waiting to be processed.

!image-2024-03-14-20-27-37-540.png|width=614,height=175!

This problem can be verified by changing ExecutorService implementation in UT 
SubtaskCheckpointCoordinatorTest#testNotifyCheckpointAbortedDuringAsyncPhase. 
When the LinkedBlockingQueue is used, this UT will deadlock because only one 
worker thread can be created.
!image-2024-03-14-20-33-28-851.png|width=606,height=235!


> The asyncOperationsThreadPool in SubtaskCheckpointCoordinatorImpl can only 
> create one worker thread
> ---
>
> Key: FLINK-34670
> URL: https://issues.apache.org/jira/browse/FLINK-34670
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.18.0, 1.19.0
>Reporter: Jinzhong Li
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.20.0
>
> Attachments: image-2024-03-14-20-24-14-198.png, 
> image-2024-03-14-20-27-37-540.png, image-2024-03-14-20-33-28-851.png
>
>
> Now, the asyncOperations ThreadPoolExecutor of 
> SubtaskCheckpointCoordinatorImpl is created with a LinkedBlockingQueue and 
> zero corePoolSize.
> !image-2024-03-14-20-24-14-198.png|width=614,height=198!
> And in the ThreadPoolExecutor, except for the first time the task is 
> submitted, *no* new thread is created until the queue is full. But the 
> capacity of LinkedBlockingQueue is Integer.Max. This means that there is 
> almost *only one thread* working in this thread pool, even if {*}there are 
> many concurrent checkpoint requests or checkpoint abort requests waiting to 
> be processed{*}.
> !image-2024-03-14-20-27-37-540.png|width=614,height=175!
> This problem can be verified by changing ExecutorService implementation in UT 
> SubtaskCheckpointCoordinatorTest#testNotifyCheckpointAbortedDuringAsyncPhase. 
> When the LinkedBlockingQueue is used, this UT will deadlock because only one 
> worker thread can be created.
> !image-2024-03-14-20-33-28-851.png|width=606,height=235!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-34670) The asyncOperationsThreadPool in SubtaskCheckpointCoordinatorImpl can only create one worker thread

2024-03-15 Thread Jinzhong Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinzhong Li updated FLINK-34670:

Summary: The asyncOperationsThreadPool in SubtaskCheckpointCoordinatorImpl 
can only create one worker thread  (was: Use SynchronousQueue to create 
asyncOperationsThreadPool for SubtaskCheckpointCoordinatorImpl)

> The asyncOperationsThreadPool in SubtaskCheckpointCoordinatorImpl can only 
> create one worker thread
> ---
>
> Key: FLINK-34670
> URL: https://issues.apache.org/jira/browse/FLINK-34670
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.18.0, 1.19.0
>Reporter: Jinzhong Li
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.20.0
>
> Attachments: image-2024-03-14-20-24-14-198.png, 
> image-2024-03-14-20-27-37-540.png, image-2024-03-14-20-33-28-851.png
>
>
> Now, the asyncOperations ThreadPoolExecutor of 
> SubtaskCheckpointCoordinatorImpl is created with a LinkedBlockingQueue and 
> zero corePoolSize.
> !image-2024-03-14-20-24-14-198.png|width=614,height=198!
> And in the ThreadPoolExecutor, except for the first time the task is 
> submitted, *no* new thread is created until the queue is full. But the 
> capacity of LinkedBlockingQueue is Integer.Max. This means that there is 
> almost *only one thread* working in this thread pool, even if there are many 
> concurrent checkpoint requests or checkpoint abort requests waiting to be 
> processed.
> !image-2024-03-14-20-27-37-540.png|width=614,height=175!
> This problem can be verified by changing ExecutorService implementation in UT 
> SubtaskCheckpointCoordinatorTest#testNotifyCheckpointAbortedDuringAsyncPhase. 
> When the LinkedBlockingQueue is used, this UT will deadlock because only one 
> worker thread can be created.
> !image-2024-03-14-20-33-28-851.png|width=606,height=235!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] Update oracle-cdc.md [flink-cdc]

2024-03-15 Thread via GitHub


stardustman opened a new pull request, #3151:
URL: https://github.com/apache/flink-cdc/pull/3151

   fix error the "log.mining.strategy" should be start with prefix "debezium." 
according to 
https://nightlies.apache.org/flink/flink-cdc-docs-release-3.0/zh/docs/connectors/cdc-connectors/oracle-cdc/#connector-options
 and the help from @e-mhui with the link 
https://github.com/apache/flink-cdc/pull/2315#issuecomment-1999166980


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34643][tests] Fix JobIDLoggingITCase [flink]

2024-03-15 Thread via GitHub


XComp commented on code in PR #24484:
URL: https://github.com/apache/flink/pull/24484#discussion_r1525054790


##
flink-tests/src/test/java/org/apache/flink/test/misc/JobIDLoggingITCase.java:
##
@@ -116,18 +116,55 @@ public void testJobIDLogging(@InjectClusterClient 
ClusterClient clusterClient
 
 assertJobIDPresent(jobID, 3, checkpointCoordinatorLogging);
 assertJobIDPresent(jobID, 6, streamTaskLogging);
+
+// Expect *at least* the following
+// - Receive slot request 0d0aa777e9ac73047d35ec6657270462 for job
+// 3cfdefe3c1ff1ea91661ee3d00c0f21a from resource manager with leader 
id
+// b7c6e00948bd0a00a1d15c6d87454ce0.,
+// - Establish JobManager connection for job 
3cfdefe3c1ff1ea91661ee3d00c0f21a.,
+// - Offer reserved slots to the leader of job 
3cfdefe3c1ff1ea91661ee3d00c0f21a.,
+// - Received task Source: Sequence Source -> Sink: Unnamed (1/1)#0
+// 
(1e4894ed9ef9533cac2effb6f7f3e6b3_cbc357ccb763df2852fee8c4fc7d55f2_0_0), deploy 
into slot
+// with allocation id 0d0aa777e9ac73047d35ec6657270462.,
+// - Operator event for
+// 
1e4894ed9ef9533cac2effb6f7f3e6b3_cbc357ccb763df2852fee8c4fc7d55f2_0_0 -
+// cbc357ccb763df2852fee8c4fc7d55f2,
+// - Trigger checkpoint 1@1710170116122 for
+// 
1e4894ed9ef9533cac2effb6f7f3e6b3_cbc357ccb763df2852fee8c4fc7d55f2_0_0.,
 assertJobIDPresent(
 jobID,
-9,

Review Comment:
   can't we make this number of expected log events another set of patterns 
that would be passed in? That would allow for a more detailed assertion in 
`assertJobIDPresent` and would avoid having this long comment here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32079][checkpoint] Support to read/write checkpoint metadata of merged files [flink]

2024-03-15 Thread via GitHub


masteryhx commented on PR #24480:
URL: https://github.com/apache/flink/pull/24480#issuecomment-1999194013

   @Zakelly @fredia Could you take a look ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-34677) Refactor the structure of documentation for Flink CDC

2024-03-15 Thread Leonard Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonard Xu updated FLINK-34677:
---
Component/s: Documentation

> Refactor the structure of documentation for Flink CDC
> -
>
> Key: FLINK-34677
> URL: https://issues.apache.org/jira/browse/FLINK-34677
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, Flink CDC
>Affects Versions: cdc-3.1.0
>Reporter: Qingsheng Ren
>Assignee: Qingsheng Ren
>Priority: Major
>  Labels: pull-request-available
> Fix For: cdc-3.1.0
>
>
> The documentation structure of Flink CDC is not quite in good shape 
> currently. We plan to refactor it as below (✅ for existed pages and ️  for 
> new pages to write):
>  * Get Started
>  ** ️ Introduction
>  ** ✅ Quickstart
>  * Core Concept
>  ** ️ (Pages for data pipeline / sources / sinks / table ID / transform / 
> route)
>  * Connectors
>  ** ️ Overview
>  ** ✅ (Pages for connectors)
>  ** ✅ Legacy Flink CDC Sources (For CDC sources before 3.0)
>  * Deployment
>  ** ️ Standalone
>  ** ️ Kubernetes
>  ** ️ YARN
>  * Developer Guide
>  ** ️ Understand Flink CDC API
>  ** ️ Contribute to Flink CDC
>  ** ️ Licenses
>  * FAQ
>  ** ✅ Frequently Asked Questions



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-34409][ci] Enables (most) tests that were disabled for the AdaptiveScheduler due to missing features [flink]

2024-03-15 Thread via GitHub


XComp commented on PR #24285:
URL: https://github.com/apache/flink/pull/24285#issuecomment-1999190279

   I fixed an indentation issue and rebased the branch. @MartijnVisser @JingGe 
can one of you go ahead and approve this one so that we can finalize the 
effort and merge this PR? I'm going to create backports to enable the 
tests in the release branches as well. I guess increasing the test coverage 
in those branches doesn't hurt either.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34546] Emit span with failure labels on failure in AdaptiveScheduler. [flink]

2024-03-15 Thread via GitHub


StefanRRichter merged PR #24498:
URL: https://github.com/apache/flink/pull/24498


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34660][checkpoint] Parse cluster configuration for AutoRescalingITCase#testCheckpointRescalingInKeyedState [flink]

2024-03-15 Thread via GitHub


flinkbot commented on PR #24504:
URL: https://github.com/apache/flink/pull/24504#issuecomment-1999168324

   
   ## CI report:
   
   * 63123deb7d39d9ee30bc7991bace080755468dde UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [Oracle][MySQL][SqlServer][PostgresSQL] Fix Oracle/MySQL/SqlServer/PostgresSQL CDC parser schema change event failed [flink-cdc]

2024-03-15 Thread via GitHub


e-mhui commented on PR #2315:
URL: https://github.com/apache/flink-cdc/pull/2315#issuecomment-1999166980

   > > ```sql
   > > debezium.database.history.store.only.captured.tables.ddl
   > > ```
   > 
   > I read the doc 
[docs/connectors/cdc-connectors/oracle-cdc/#connector-options](https://nightlies.apache.org/flink/flink-cdc-docs-release-3.0/zh/docs/connectors/cdc-connectors/oracle-cdc/#connector-options).
 Should config options that belong to `debezium` start with "debezium." when 
using Flink CDC? But in the demo code 
[cdc-connectors/oracle-cdc/#datastream-source](https://nightlies.apache.org/flink/flink-cdc-docs-release-3.0/zh/docs/connectors/cdc-connectors/oracle-cdc/#datastream-source)
 the line `debeziumProperties.setProperty("log.mining.strategy", 
"online_catalog");` in the main method does not start `log.mining.strategy` with 
`debezium`. After reading 
[log.mining.strategy](https://debezium.io/documentation/reference/1.9/connectors/oracle.html#oracle-property-log-mining-strategy)
 it seems the property belongs to Debezium, and does `log.mining.strategy` 
also need to be set to `redo_log_catalog` to capture the DDL?
   > 
   > ```java
   >  public static void main(String[] args) throws Exception {
   > Properties debeziumProperties = new Properties();
   > debeziumProperties.setProperty("log.mining.strategy", 
"online_catalog"); 
   > ```
   > 
   > will the "log.mining.strategy" need to start with "debezium"? thanks a lot 
for bothering you many times. @e-mhui
   
   It is a parameter of debezium, so you should add the prefix `debezium.`.
   
   https://github.com/apache/flink-cdc/assets/111486498/bb8c4dd3-b080-4011-a3a4-cc9a8dc79f2c
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


