Here's a doc page that explains how to leverage the directory structure in
a query: https://drill.apache.org/docs/querying-directories/
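
As a rough illustration of what that page describes, here is a minimal Python sketch that only builds the SQL text. It assumes a hypothetical layout of /data/logs/<year>/<month>/ under the dfs storage plugin, which Drill exposes as the implicit columns dir0 and dir1. The path, the view name, and the column aliases are made up for the example, and the view statement has not been run against a cluster, so treat it as a starting point rather than a tested recipe.

```python
def build_partition_query(root, year=None, month=None):
    """Return a Drill SELECT that filters on the directory columns.

    dir0 is the first directory level below `root` and dir1 the second.
    The year/month layout is an assumption made for this example.
    """
    filters = []
    if year is not None:
        filters.append("dir0 = '{:04d}'".format(year))
    if month is not None:
        filters.append("dir1 = '{:02d}'".format(month))
    sql = "SELECT * FROM dfs.`{}`".format(root)
    if filters:
        sql += " WHERE " + " AND ".join(filters)
    return sql


def build_view_sql(view_name, root, columns):
    """Return a CREATE VIEW statement that renames dir0/dir1.

    Clients can then filter on log_year/log_month without knowing
    the directory layout. `columns` lists the data columns to expose.
    """
    select_list = ["dir0 AS log_year", "dir1 AS log_month"] + list(columns)
    return "CREATE OR REPLACE VIEW {} AS SELECT {} FROM dfs.`{}`".format(
        view_name, ", ".join(select_list), root
    )


if __name__ == "__main__":
    # Limits the scan to /data/logs/2015/06
    print(build_partition_query("/data/logs", year=2015, month=6))
    # Hypothetical view in the writable dfs.tmp workspace
    print(build_view_sql("dfs.tmp.`logs_by_date`", "/data/logs", ["host", "msg"]))
```

A query with no year or month simply scans the whole root, which matches Paul's point below that clients unaware of the layout still get correct results, just without the reduced scan.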


On Mon, Jun 1, 2015 at 2:34 PM, Paul Mogren <pmog...@commercehub.com> wrote:

> On 6/1/15, 12:14 PM, "Matt" <bsg...@gmail.com> wrote:
>
>
> >Segmenting data into directories in HDFS would require clients to
> >structure queries accordingly, but would there be benefit in reduced
> >query time by limiting scan ranges?
>
> Yes. I am just a newbie user, but I have already seen that work with
> localFS and S3; I fully expect it will work for HDFS also, as I have seen
> mention of such a strategy for HDFS outside the context of Drill. Ignorant
> clients can also still query the root directory and just not get the
> benefit. I believe you could even define a view that would allow clients
> to apply WHERE clause filters against artificial columns of date
> information that you map to the directory structure, thereby hiding the
> structure from the client.
>
> HTH,
> Paul
>
>
