+1 to keeping the result set loader. Also, IMO, the Parquet effort should move to using the result set loader (I believe Salim has a plan to do so).

On Sun, Jul 15, 2018 at 6:50 PM, Paul Rogers <par0...@yahoo.com.invalid> wrote:

> Hi All,
>
> Over the last six months I've been slowly trying to get the "result set
> loader" work committed to Drill. As a recap, this was supposed to provide a
> uniform way to optimally pack a record batch up to a prescribed memory
> limit. This technique is particularly useful in readers that do not have
> much information about incoming data sizes.
>
> In the meantime, the team has done a great job using the "sizer" approach
> to get a good-enough solution for all internal operators. The sizer simply
> uses statistics about incoming batches to predict outgoing batch size.
>
> At the same time, work has been done to create a one-off solution for
> Parquet. Since Parquet is, by far, Drill's most important data source, this
> means the reader problem is solved for the most critical use case.
>
> As time goes on, I get less and less time to maintain the result set loader
> code. My knowledge of team priorities and of Drill code drifts out of date.
>
> So, the question for the group is, is the result set loader work still
> needed? If not, we can wait to do the remaining commits until a compelling
> need presents itself. If it is needed, it would be good to know how the
> team plans to use it so that we stay in sync.
>
> Thanks,
>
> - Paul