On 06/02/2020 at 20:20, Wes McKinney wrote:
>> Actually, on a more high-level basis, is the goal to prefetch for
>> sequential consumption of row groups?
>>
>
> Essentially yes. One "easy" optimization is to prefetch the entire
> serialized row group. This is an evolution of that idea where we want to
> prefetch only the needed parts of a row group in a minimum number of IO
> calls (consider reading the first 10 columns from a file with 1000 columns
> -- so we want to do one IO call instead of 10 like we do now).
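To make the "one IO call instead of 10" idea concrete, here is a minimal sketch of range coalescing in Python. It is only an illustration, not Arrow's or Parquet's actual API: the function name `coalesce_ranges`, the `hole_size_limit` parameter and the byte layout of the column chunks are all hypothetical.

```python
def coalesce_ranges(ranges, hole_size_limit=0):
    """Merge (offset, length) byte ranges into as few reads as possible.

    Two ranges are merged when the gap between them is at most
    `hole_size_limit` bytes (hypothetical tuning knob, not a real Arrow option).
    """
    merged = []  # list of [start, end) pairs, kept sorted by start
    for offset, length in sorted(ranges):
        if length <= 0:
            continue  # nothing to read for an empty range
        end = offset + length
        if merged and offset - merged[-1][1] <= hole_size_limit:
            # Close enough to the previous read: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([offset, end])
    return [(start, end - start) for start, end in merged]


# Assumed layout: the first 10 column chunks of a row group are stored
# back to back, 100 bytes each, starting at byte offset 4.
column_chunks = [(4 + i * 100, 100) for i in range(10)]
reads = coalesce_ranges(column_chunks)
print(reads)
```

With adjacent chunks the ten requests collapse into a single read. A non-zero `hole_size_limit` would also merge chunks separated by small gaps, at the cost of fetching some unneeded bytes.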
Are there no situations where you would want to consume a scattered
subset of row groups (e.g. predicate pushdown)?

Regards

Antoine.