Sounds great, thank you!

> I was thinking the label name could be picked from the file system's
> defined scheme, i.e. from FileSystemFactory.getScheme(). Would that be
> better, or is it unnecessary complexity with little return?
I think it'd make more sense with Approach B (i.e. some MonitoringFileSystem
that is a wrapper around the actual file system); but for Approach D I don't
see much benefit.

Regards,
Roman

On Wed, May 13, 2026 at 6:45 PM Samrat Deb <[email protected]> wrote:

> Hi Roman,
> Thanks for the review.
>
> > 1. Is it possible to expose file size metrics? It might be helpful to
> > troubleshoot slow recoveries caused by downloading many small files for
> > example
>
> Yes, this is feasible, and I'll include it. The native S3 FS already has
> the raw data at the call sites. NativeS3InputStream wraps a
> GetObjectResponse; response.contentLength() gives the object size before
> the first byte is read. NativeS3OutputStream already accumulates the byte
> count before the final PutObject is issued.
>
> > 2. Is bulkCopyHelper covered by the proposal? I think it would be
> > helpful to have requests.size() and total bytes received as metrics
>
> Yes, NativeS3BulkCopyHelper will be in scope, but in the next phase. This
> is important for multipart uploads and for catching zombie files. At a
> high level, the idea is to expose:
> 1. s3.bulk-copy.files - Histogram of files per batch (i.e.,
> requests.size() per copyFiles invocation). This is the "requests.size()"
> signal you mentioned.
> 2. s3.bulk-copy.bytes - Counter of total bytes transferred. After each
> FileDownload.completionFuture() resolves, Files.size(destinationPath)
> gives the exact byte count without additional S3 API calls.
> 3. s3.bulk-copy.duration.ms - Histogram of end-to-end copyFiles
> wall-clock time.
>
> WDYT about adding these values?
>
> > 3. Ideally, such metrics should be exposed by other file systems; then
> > I'd suggest having "s3n" as a label rather than a part of the metric
> > name
>
> Yes, the suggestion to have "s3n" as a label rather than as part of the
> metric name is a good one. I will update the FLIP and add a section about
> it. So MetricGroup.addGroup("filesystem_type", "s3n") creates a key-value
> labelled subgroup.
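[Editor's sketch] The key-value subgroup behaviour described just above can be illustrated with a small self-contained toy model. `ToyMetricGroup` below is a hypothetical class invented for illustration (it is not Flink's actual `MetricGroup` implementation); it only shows how a Prometheus-style reporter can flatten `addGroup("filesystem_type", "s3n")` into a label on the metric instead of a segment of the metric name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of a key-value metric subgroup (hypothetical; not Flink's API).
// Illustrates how a reporter can flatten a key-value group into a
// Prometheus-style label set on the metric name.
public class LabelledGroupSketch {

    static class ToyMetricGroup {
        private final Map<String, String> labels = new LinkedHashMap<>();

        // Mirrors the idea of MetricGroup.addGroup(key, value):
        // the child group carries the parent's labels plus the new pair.
        ToyMetricGroup addGroup(String key, String value) {
            ToyMetricGroup child = new ToyMetricGroup();
            child.labels.putAll(this.labels);
            child.labels.put(key, value);
            return child;
        }

        // Render a metric the way a Prometheus-style reporter would:
        // name followed by {key="value", ...}.
        String render(String metricName) {
            if (labels.isEmpty()) {
                return metricName;
            }
            StringBuilder sb = new StringBuilder(metricName).append('{');
            boolean first = true;
            for (Map.Entry<String, String> e : labels.entrySet()) {
                if (!first) sb.append(", ");
                sb.append(e.getKey()).append("=\"").append(e.getValue()).append('"');
                first = false;
            }
            return sb.append('}').toString();
        }
    }

    public static void main(String[] args) {
        ToyMetricGroup root = new ToyMetricGroup();
        ToyMetricGroup s3n = root.addGroup("filesystem_type", "s3n");
        System.out.println(s3n.render("requests"));
        // prints: requests{filesystem_type="s3n"}
    }
}
```

If the label name were taken from FileSystemFactory.getScheme() as discussed below, only the `"s3n"` argument would change per file system; the flattening itself stays the same.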
> In Prometheus reporters, this flattens to
> requests{filesystem_type="s3n"}.
>
> I was thinking the label name could be picked from the file system's
> defined scheme, i.e. from FileSystemFactory.getScheme(). Would that be
> better, or is it unnecessary complexity with little return?
>
> > We use something similar to Approach B internally; I don't think it
> > "Adds overhead to the per-record path" (because we don't have
> > per-record file operations); but it lacks lower-level signals indeed.
> > So the recommended approach makes sense to me.
>
> Yes, for the lower-level signals, Approach D is better and provides an
> easy way to add metrics as requirements evolve.
>
> Bests,
> Samrat
>
>
> On Tue, May 5, 2026 at 4:32 PM Roman Khachatryan <[email protected]> wrote:
>
> > Hi Samrat,
> >
> > Thanks for the proposal, such a feature would be very helpful!
> >
> > I have several questions:
> > 1. Is it possible to expose file size metrics? It might be helpful to
> > troubleshoot slow recoveries caused by downloading many small files,
> > for example.
> > 2. Is bulkCopyHelper covered by the proposal? I think it would be
> > helpful to have requests.size() and total bytes received as metrics.
> > 3. Ideally, such metrics should be exposed by other file systems; then
> > I'd suggest having "s3n" as a label rather than a part of the metric
> > name.
> >
> > As for the "Open questions for community discussion" section, I agree
> > with both points:
> > - enable the feature by default, and
> > - don't correlate with checkpoints (it might be more tricky than
> > ThreadLocal).
> >
> > We use something similar to Approach B internally; I don't think it
> > "Adds overhead to the per-record path" (because we don't have
> > per-record file operations); but it lacks lower-level signals indeed.
> > So the recommended approach makes sense to me.
> >
> > Regards,
> > Roman
> >
> >
> > On Tue, May 5, 2026 at 11:58 AM Samrat Deb <[email protected]> wrote:
> >
> > > Hi All,
> > >
> > > I'd like to open a discussion on FLIP-576: Filesystem-Plugin
> > > Observability for flink-s3-fs-native [1].
> > >
> > > Apache Flink's filesystem layer is critical to core operations like
> > > checkpoints, savepoints, and state access, most of which rely heavily
> > > on S3. Despite this, the current S3<>Flink observability offers
> > > little insight into underlying issues. Engineers lack visibility into
> > > key failure signals, including S3 throttling, retry behaviour, slow
> > > operations, load distribution, multipart upload leaks, and
> > > intermittent stream failures. As a result, diagnosing production
> > > issues often requires manual correlation across logs and external
> > > systems, making troubleshooting slow and unreliable. This
> > > observability gap significantly impacts the operability of Flink in
> > > real-world large-scale deployments. This FLIP addresses that gap and
> > > builds observability support into the native S3 FS.
> > >
> > > Looking forward to your feedback.
> > >
> > > Bests,
> > > Samrat
> > >
> > > [1]
> > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=421957173
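[Editor's sketch] The three bulk-copy signals proposed earlier in the thread (files per batch via requests.size(), total bytes via Files.size() on the destination, and end-to-end wall-clock time) can be sketched with only the JDK. `BulkCopyMetricsSketch` and `BulkCopyMetrics` below are hypothetical names invented for illustration; local `Files.copy` stands in for the S3 download that NativeS3BulkCopyHelper would perform, and the plain fields stand in for whatever histogram/counter objects the FLIP ultimately registers.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Sketch of the three bulk-copy signals discussed in the thread.
// Hypothetical names; a local copy stands in for the S3 download.
public class BulkCopyMetricsSketch {

    static class BulkCopyMetrics {
        long lastBatchSize;      // would feed the s3.bulk-copy.files histogram
        long totalBytes;         // would feed the s3.bulk-copy.bytes counter
        long lastDurationNanos;  // would feed the s3.bulk-copy.duration.ms histogram
    }

    // Wraps a copyFiles-like batch operation and records all three signals.
    static void copyFilesInstrumented(List<Path> sources, Path destDir,
                                      BulkCopyMetrics metrics) throws IOException {
        long start = System.nanoTime();
        metrics.lastBatchSize = sources.size(); // the requests.size() signal
        for (Path src : sources) {
            Path dest = destDir.resolve(src.getFileName());
            Files.copy(src, dest); // stand-in for the per-file S3 download
            // Exact byte count from the local destination file,
            // with no additional S3 API call.
            metrics.totalBytes += Files.size(dest);
        }
        metrics.lastDurationNanos = System.nanoTime() - start;
    }

    public static void main(String[] args) throws IOException {
        Path srcDir = Files.createTempDirectory("bulk-copy-src");
        Path dstDir = Files.createTempDirectory("bulk-copy-dst");
        Path f1 = Files.write(srcDir.resolve("a.bin"), new byte[128]);
        Path f2 = Files.write(srcDir.resolve("b.bin"), new byte[256]);

        BulkCopyMetrics metrics = new BulkCopyMetrics();
        copyFilesInstrumented(List.of(f1, f2), dstDir, metrics);

        System.out.println("files=" + metrics.lastBatchSize
                + " bytes=" + metrics.totalBytes);
        // prints: files=2 bytes=384
    }
}
```

In the real helper, the bytes signal would be recorded after each FileDownload.completionFuture() resolves, as described in the thread, rather than synchronously in a loop.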
