Agree; the idea behind it was to use these stats - though they could be pulled from Thanos (or the like) too - to say "ok, if we go down that path we'll be throttled", a kind of cost estimation. But you are right.
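For illustration, roughly the client-side signal I have in mind - a minimal sketch, assuming "store_io_throttled" is the counter name hadoop-aws publishes (I believe it is); the threshold logic is invented:

  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.statistics.IOStatistics;
  import static org.apache.hadoop.fs.statistics.IOStatisticsSupport.retrieveIOStatistics;

  // After a sample of the workload has run against the table's FileSystem:
  static boolean likelyThrottled(FileSystem fs, long threshold) {
    IOStatistics stats = retrieveIOStatistics(fs);  // null if the fs is not instrumented
    long throttled = stats == null
        ? 0L
        : stats.counters().getOrDefault("store_io_throttled", 0L);
    // The same counter scraped into Thanos/Prometheus would do just as well.
    return throttled > threshold;
  }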
Romain Manni-Bucau
@rmannibucau <https://x.com/rmannibucau> | .NET Blog <https://dotnetbirdie.github.io/> | Blog <https://rmannibucau.github.io/> | Old Blog <http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> | LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book <https://www.packtpub.com/en-us/product/java-ee-8-high-performance-9781788473064>
Javaccino founder (Java/.NET service - contact via linkedin)

On Fri, 13 Feb 2026 at 19:30, Steve Loughran <[email protected]> wrote:

> On Fri, 13 Feb 2026 at 12:53, Romain Manni-Bucau <[email protected]> wrote:
>
>> Hi Steve,
>>
>> Fully agree with you on all the points - and thanks for the details, BTW.
>> My main concern - and why I sent the mail - is "what is provided by
>> default". To make it more concrete, what "stops" me from going further is
>> that it is not built in, so basically I have to redo it myself... and if
>> I have to pull the logic out to that extent, I can also just redo my full
>> Spark integration, see what I mean?
>>
>> I know most vendors solved it somehow, so my question is: do we want to
>> integrate it as a standard in Iceberg?
>> Should it relate to the REST catalog, to have metrics awareness/stats?
>
> Generally different stats, though:
> catalog: tables, files, etc.
> stuff collected by the engine during a query: engine-specific and based on
> the configuration and deployment.
>
> Probably more broadly relevant: end-to-end telemetry, where the metrics go
> to whatever telemetry DB is deployed.
>
> There's also the need for the endpoint signers to log something about
> every request that was signed, especially because they'll end up in the
> CloudTrail log as actions by the assumed role, not the principal querying
> the table.
>
>> Any will to move in that direction?
>>
>> On Fri, 13 Feb 2026 at 12:02, Steve Loughran <[email protected]> wrote:
>>
>>> On Thu, 12 Feb 2026 at 20:52, Romain Manni-Bucau <[email protected]> wrote:
>>>
>>>> Commented inline.
>>>>
>>>> On Thu, 12 Feb 2026 at 21:13, Steve Loughran <[email protected]> wrote:
>>>>
>>>>> you get all thread-local stats for a specific thread
>>>>> from IOStatisticsContext.getCurrentIOStatisticsContext().getIOStatistics()
>>>>
>>>> How is it supposed to work? My understanding is that it is basically a
>>>> thread-local-like impl based on a map - the important point being that
>>>> it works within the same bound thread - whereas the data is pulled by
>>>> the sink on a scheduled-executor thread, so I would still need to do my
>>>> own registry and sync it with the Spark metrics system, no?
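(Concretely, the hand-off I think this imposes - a minimal sketch, assuming the thread-level IOStatistics API in hadoop-common 3.3.5+, I believe; the PENDING registry and runInstrumented() are placeholders of mine, not Hadoop or Spark classes:)

  import java.util.Queue;
  import java.util.concurrent.ConcurrentLinkedQueue;
  import org.apache.hadoop.fs.statistics.IOStatisticsContext;
  import org.apache.hadoop.fs.statistics.IOStatisticsSnapshot;
  import static org.apache.hadoop.fs.statistics.IOStatisticsSupport.snapshotIOStatistics;

  public final class StatsHandoff {
    // Placeholder registry of my own: workers publish, the sink drains.
    private static final Queue<IOStatisticsSnapshot> PENDING = new ConcurrentLinkedQueue<>();

    // Runs on the worker thread, i.e. the thread actually doing the IO.
    static void runInstrumented(Runnable task) {
      IOStatisticsContext ctx = IOStatisticsContext.getCurrentIOStatisticsContext();
      ctx.reset();                     // drop the stats of the previous task
      task.run();                      // the actual IO work
      PENDING.add(snapshotIOStatistics(ctx.getIOStatistics()));  // snapshot on the owning thread
    }

    // Runs on the sink's scheduled-executor thread: it never touches the
    // workers' thread-bound contexts, only the published snapshots.
    static IOStatisticsSnapshot drain() {
      IOStatisticsSnapshot total = new IOStatisticsSnapshot();
      for (IOStatisticsSnapshot s; (s = PENDING.poll()) != null; ) {
        total.aggregate(s);            // snapshots aggregate
      }
      return total;  // total.counters() etc. can then be mapped into Spark metrics
    }
  }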
>>>>> Take a snapshot of that and you have something JSON-marshallable or
>>>>> Java-serializable which aggregates nicely.
>>>>>
>>>>> Call IOStatisticsContext.getCurrentIOStatisticsContext().reset() when
>>>>> your worker thread starts a specific task to ensure you only get the
>>>>> stats for that task (s3a & I think gcs).
>>>>
>>>> Do you mean implementing my own S3A or FileIO? This is the
>>>> instrumentation I tried to avoid, since I think it should be built in,
>>>> not in apps.
>>>
>>> More that Spark worker threads need to reset the stats once they pick up
>>> their next piece of work, collect the changes, then push up the stats on
>>> task commit, and job commit aggregates these.
>>>
>>> The s3a committers do all this behind the scenes (first into the
>>> intermediate manifest, then into the final _SUCCESS file). Now that Spark
>>> builds with a version with the API, someone could consider doing it there
>>> and lining up with the Spark history server. Then whatever fs client,
>>> input stream or any other instrumented component would just add its
>>> numbers.
>>>
>>>>> From the fs, getIOStatistics() gives you all the stats of all
>>>>> filesystems, and of streams after close() - which, from a quick look at
>>>>> some S3 IO to a non-AWS store, shows a couple of failures, interestingly
>>>>> enough. We collect separate averages for success and failure on every
>>>>> op, so you can see the difference.
>>>>>
>>>>> The JMX stats we collect are a very small subset of the statistics;
>>>>> stuff like "bytes drained in close" and the time to wait for an executor
>>>>> in the thread pool (action_executor_acquired) are important as they're
>>>>> generally a sign of misconfiguration.
>>>>
>>>> Yep; my high-level focus is to see whether the jobs or the tables must
>>>> be tuned, so 429s, volumes and latencies are key there.
>>>
>>> If you turn on AWS S3 server logging you will get the numbers of 503
>>> throttle events and the paths; 429 is other stores. Bear in mind that the
>>> recipient of a throttle event may not be the only caller triggering
>>> it... things like bulk delete (hello, compaction) can throttle other work
>>> going on against the same shard.
>>>
>>>> Another thing I don't get: why not reuse hadoop-aws in Spark? It would
>>>> at least enable mixing datasources more nicely and concentrate the work
>>>> in a single location (it is already done).
>>>
>>> Well, in Cloudera we do. Nothing to stop you.
>>>
>>> I also have a PoC of an S3 signer for Hadoop 3.4.3+ which gets its
>>> credentials from the REST server; it simply wraps the existing one but
>>> picks up its binding info from the filesystem Configuration.
>>>
>>> -Steve
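PS: for completeness, the fs-level collection described above, as I understand it (the bucket and setup are made up; IOStatisticsLogging and IOStatisticsSupport are the hadoop-common helpers):

  import java.net.URI;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.statistics.IOStatistics;
  import static org.apache.hadoop.fs.statistics.IOStatisticsLogging.ioStatisticsToPrettyString;
  import static org.apache.hadoop.fs.statistics.IOStatisticsSupport.retrieveIOStatistics;

  FileSystem fs = FileSystem.get(URI.create("s3a://some-bucket/"), new Configuration());
  // ... run the IO, close streams so their stats merge back into the fs ...
  IOStatistics stats = retrieveIOStatistics(fs);  // null for non-instrumented filesystems
  System.out.println(ioStatisticsToPrettyString(stats));  // dump counters, gauges, means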
