Suggest topics for hangout tomorrow (9/20)

2016-09-19 Thread Aman Sinha
I'll start the hangout tomorrow at the usual time. I don't have a set agenda yet, but if there are any topics folks wish to discuss, please respond on this thread so that others who might be interested can also join. Thanks.

Re: LIMIT push down to parquet row group

2016-09-19 Thread Aman Sinha
Adding to what Jinfeng said, LIMIT handling relies on a 'kill incoming input stream' API, which the parent operator calls on its child once the parent (Limit) has received the required number of rows. Since the unit of processing in Drill is record batche
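The kill mechanism described above can be illustrated with a small simulation. This is only a sketch and not Drill code: `BatchReader`, `run_limit`, and the batch size of 32768 are hypothetical, with the batch size chosen to match the record count reported in the original question. It shows how a reader that produces whole batches can process far more than n rows before the Limit operator tells it to stop.

```python
class BatchReader:
    """Hypothetical reader that hands out rows one whole batch at a time."""

    def __init__(self, total_rows: int, batch_size: int) -> None:
        self.remaining = total_rows
        self.batch_size = batch_size
        self.rows_read = 0      # rows the reader has actually processed
        self.killed = False

    def next_batch(self) -> int:
        """Return the row count of the next batch, or 0 if done or killed."""
        if self.killed or self.remaining == 0:
            return 0
        size = min(self.batch_size, self.remaining)
        self.remaining -= size
        self.rows_read += size
        return size

    def kill(self) -> None:
        """Stand-in for the 'kill incoming input stream' call."""
        self.killed = True


def run_limit(reader: BatchReader, limit: int) -> int:
    """Pull batches until `limit` rows are collected, then kill the reader."""
    emitted = 0
    while emitted < limit:
        size = reader.next_batch()
        if size == 0:
            break
        # Only the rows still needed are passed on; the rest are dropped.
        emitted += min(size, limit - emitted)
    reader.kill()
    return emitted


# Assumed numbers: 53,000 rows in the file, 32768 rows per batch.
reader = BatchReader(total_rows=53000, batch_size=32768)
emitted = run_limit(reader, limit=1)
```

Under these assumed numbers the Limit passes on a single row, while the reader has already processed one full batch before it is killed.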

Re: LIMIT push down to parquet row group

2016-09-19 Thread Jinfeng Ni
Drill applies LIMIT filtering at the row group level. For LIMIT n, it will scan the first m row groups that together have at least n rows, and discard the rest of the row groups. In your case, since you have only 1 row group, there is no row group filtering for LIMIT 1. I'm not sure where 32767 comes from.
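The row group selection described above could be sketched roughly as follows. This is an illustration of the idea, not Drill's implementation; `select_row_groups` and its arguments are hypothetical names.

```python
from typing import List


def select_row_groups(row_counts: List[int], limit: int) -> List[int]:
    """Return indices of the leading row groups needed to cover `limit` rows.

    `row_counts` holds the number of rows in each row group, in file order.
    """
    selected: List[int] = []
    covered = 0
    for index, count in enumerate(row_counts):
        if covered >= limit:
            break  # enough rows already covered; later row groups are skipped
        selected.append(index)
        covered += count
    return selected


# A single row group of 53,000 rows with LIMIT 1: nothing can be pruned.
single = select_row_groups([53000], limit=1)

# Three row groups of 100 rows each with LIMIT 150: the third is discarded.
several = select_row_groups([100, 100, 100], limit=150)
```

With only one row group, the selection always returns that row group, so pruning at this level cannot reduce the number of rows the reader sees.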

LIMIT push down to parquet row group

2016-09-19 Thread Veera Naranammalpuram
Does anyone know whether, and how, the LIMIT push down to Parquet files works? I have a Parquet file with 53K records in 1 row group. When I run a SELECT * from LIMIT 1, I see the Parquet reader operator process 32768 records. I would have expected either 1 or 53K. So, questions: 1) Does the Parquet MR l