Makes sense.

Is there a way we can do this with lazy materialization rather than writing
complex expression tree logic? I hate having all this custom expression
tree manipulation logic.

Also, it seems like this should be N-phase rather than two-phase, where N
is the number of directory levels below the base path.
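
Roughly what I have in mind for the N-phase version, as a sketch only. None
of these names are Drill APIs: LevelwisePruner, prune, listChildren,
isDirectory and dirFilters are all made up for illustration, and I'm assuming
each dirK filter can be evaluated against the directory name alone. Files are
passed through unfiltered here; the remaining filters would still run on them
afterwards.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical sketch, not Drill code: prune one directory level per phase.
final class LevelwisePruner {

    /**
     * @param basePath     table root
     * @param listChildren returns the immediate child names of a directory
     * @param isDirectory  tells directories from files
     * @param dirFilters   dirFilters.get(k) is the filter on dir<k>; levels
     *                     beyond the end of the list are not filtered
     * @return paths of the files that survive directory pruning
     */
    static List<String> prune(String basePath,
                              Function<String, List<String>> listChildren,
                              Predicate<String> isDirectory,
                              List<Predicate<String>> dirFilters) {
        List<String> files = new ArrayList<>();
        List<String> frontier = new ArrayList<>();
        frontier.add(basePath);

        // Phase k lists only the children of directories that survived phase k-1.
        for (int level = 0; !frontier.isEmpty(); level++) {
            Predicate<String> filter =
                level < dirFilters.size() ? dirFilters.get(level) : any -> true;
            List<String> next = new ArrayList<>();
            for (String parent : frontier) {
                for (String name : listChildren.apply(parent)) {
                    String path = parent + "/" + name;
                    if (!isDirectory.test(path)) {
                        files.add(path);   // simplification: files are not filtered here
                    } else if (filter.test(name)) {
                        next.add(path);    // descend only into matching directories
                    }
                }
            }
            frontier = next;
        }
        return files;
    }
}
```

A directory rejected at level k is never listed, so the listing cost should
track the surviving subtree rather than the whole table.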

Thoughts?
On Sep 9, 2015 10:54 AM, "Aman Sinha" <amansi...@apache.org> wrote:

> Currently, partition pruning gets all file names in the table and applies
> the pruning. Suppose the files are spread out over several directories and
> there is a filter on dirN. This is not efficient, both in terms of
> elapsed time and memory usage. This has been seen in a few use cases
> recently.
>
> We should ideally perform the pruning in 2 steps:  first get the top-level
> directory names only and apply the directory filter, then get the filenames
> within that directory and apply remaining filters.
>
> I will create a JIRA for this enhancement but let me know your thoughts...
>
> Aman
>
