Hi All,

Over the last six months I've been slowly trying to get the "result set loader" 
work committed to Drill. As a recap, this was supposed to provide a uniform way 
to optimally pack a record batch up to a prescribed memory limit. This 
technique is particularly useful in readers which do not have much information 
about incoming data sizes.

In the meantime, the team has done a great job using the "sizer" approach to 
get a good-enough solution for all internal operators. The sizer simply uses 
statistics about incoming batches to predict outgoing batch size.

At the same time, work has been done to create a one-off solution for Parquet. 
Since Parquet is, by far, Drill's most important data source, this means 
the reader problem is solved for the most critical use case.

As time goes on, I get less and less time to maintain the result set loader 
code. My knowledge of team priorities and of Drill code drifts out of date.

So, the question for the group is: is the result set loader work still needed? 
If not, we can wait to do the remaining commits until a compelling need 
presents itself. If it is needed, it would be good to know how the team plans 
to use it so that we stay in sync.

Thanks,

- Paul
