alamb commented on issue #7000: URL: https://github.com/apache/datafusion/issues/7000#issuecomment-2094807338
> Looking at the trace in

> @alamb I'd like to mention, that extending of mutable batch spends a lot of time (MutableArrayData::Extend, utils::extend_offsets) and related allocator's work.

I think those particular functions are the ones that actually copy data, so I am not sure how much more they can be optimized

> I suppose that it's much better to preallocate bigger arrow buffer instead of extending it by small portions. And I believe that it will give us an effect.

I agree it may well

> Also I noticed that **~18%** was spent by `asm_exc_page_fault` which is probably an issue of enabled transparent huge pages (which is bad for databases workloads). I will investigate more on that and post some conclusions later

👍
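
For readers following along, here is a minimal sketch of the preallocation idea discussed above. It is not the arrow-rs `MutableArrayData` code path: a plain `Vec<u8>` stands in for the Arrow buffer, and `append_chunks` is a hypothetical helper written only for this illustration. It counts how often the buffer's capacity changes when many small chunks are appended, with and without reserving the total size up front.

```rust
/// Hypothetical helper (not part of arrow-rs or DataFusion): appends every
/// chunk to a byte buffer and counts how many times the capacity changed,
/// which is a proxy for how often the allocator had to grow the buffer.
fn append_chunks(chunks: &[&[u8]], preallocate: bool) -> (Vec<u8>, usize) {
    let total: usize = chunks.iter().map(|c| c.len()).sum();

    // Either reserve the full size once, or start empty and grow on demand.
    let mut buf: Vec<u8> = if preallocate {
        Vec::with_capacity(total)
    } else {
        Vec::new()
    };

    let mut capacity_changes = 0;
    for chunk in chunks {
        let before = buf.capacity();
        buf.extend_from_slice(chunk);
        if buf.capacity() != before {
            capacity_changes += 1;
        }
    }
    (buf, capacity_changes)
}

fn main() {
    // 1024 small chunks of 64 bytes each, standing in for small extends.
    let chunk = [0u8; 64];
    let chunks: Vec<&[u8]> = (0..1024).map(|_| &chunk[..]).collect();

    let (grown, grown_changes) = append_chunks(&chunks, false);
    let (reserved, reserved_changes) = append_chunks(&chunks, true);

    // Both strategies produce the same bytes; only the allocation pattern differs.
    assert_eq!(grown, reserved);
    println!("grow on demand: {grown_changes} capacity changes");
    println!("preallocated:   {reserved_changes} capacity changes");
}
```

The preallocated run never grows the buffer, while the grow-on-demand run does so several times, each of which may copy the existing contents. Whether this translates into a measurable win for the workload in this issue would still need to be confirmed by profiling.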
