Re: [PR] [FLINK-35217] Add missing fsync to #closeForCommit in some subclasses of RecoverableFsDataOutputStream. [flink]

2024-04-30 Thread via GitHub


StefanRRichter merged PR #24722:
URL: https://github.com/apache/flink/pull/24722


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35217] Add missing fsync to #closeForCommit in some subclasses of RecoverableFsDataOutputStream. [flink]

2024-04-29 Thread via GitHub


rkhachatryan commented on PR #24722:
URL: https://github.com/apache/flink/pull/24722#issuecomment-2083081066

   Thanks for updating the PR, LGTM
   (please re-format the code and commit history before merging)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35217] Add missing fsync to #closeForCommit in some subclasses of RecoverableFsDataOutputStream. [flink]

2024-04-26 Thread via GitHub


StefanRRichter commented on PR #24722:
URL: https://github.com/apache/flink/pull/24722#issuecomment-2078937897

   Thanks for the review, for
   1. I don't know how we can test this without OS level tools, but I'm open to 
ideas.
   2. I don't think `GSRecoverableFsDataOutputStream` requires any fix because 
the commit method is calling the same methods internally as the persist method. 
I also don't see the connection to `FSDataOutputStreamWrapper` and what you 
think should be fixed there, maybe you can give more detail?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35217] Add missing fsync to #closeForCommit in some subclasses of RecoverableFsDataOutputStream. [flink]

2024-04-26 Thread via GitHub


rkhachatryan commented on PR #24722:
URL: https://github.com/apache/flink/pull/24722#issuecomment-2078909771

   Thanks for the fix, I think it should solve the problem. However, 
   1. Is there a way to test it?
   2. Should `GSRecoverableFsDataOutputStream` and `FSDataOutputStreamWrapper` 
also be fixed?
   
   I suspect that there might be more bugs like this. So if we come up with 
some generic solution for (1), then its value will be more than just testing 
this PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35217] Add missing fsync to #closeForCommit in some subclasses of RecoverableFsDataOutputStream. [flink]

2024-04-25 Thread via GitHub


flinkbot commented on PR #24722:
URL: https://github.com/apache/flink/pull/24722#issuecomment-2077228887

   
   ## CI report:
   
   * 9cd6ad61e5ad7bdc825c9bb634a067747cb4b84a UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] [FLINK-35217] Add missing fsync to #closeForCommit in some subclasses of RecoverableFsDataOutputStream. [flink]

2024-04-25 Thread via GitHub


StefanRRichter opened a new pull request, #24722:
URL: https://github.com/apache/flink/pull/24722

   ## What is the purpose of the change
   
   Sync'ing back written data to disk is seemingly missing from some 
implementation of RecoverableFsDataOutputStream. This PR adds sync, typically 
by calling #persist() inside #closeForCommit.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org