Here's a doc page that explains how to leverage the directory structure in a query: https://drill.apache.org/docs/querying-directories/
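
As a rough sketch of what that page describes: Drill exposes each directory level below the queried path as an implicit column (dir0, dir1, ...). The layout /data/logs/<year>/<month>/, the dfs.tmp workspace, the view name, and the columns event_id and message are all assumptions for illustration, not something from your setup.

```sql
-- Assumed layout (hypothetical): /data/logs/2015/06/*.json
-- dir0 is the first directory level under the queried path, dir1 the second.
-- Filtering on them restricts the query to the matching subdirectories.
SELECT event_id, message
FROM dfs.`/data/logs`
WHERE dir0 = '2015' AND dir1 = '06';

-- A view can give the directory levels friendlier names, so clients
-- filter on ordinary-looking columns instead of dir0/dir1.
-- dfs.tmp is assumed to be a writable workspace.
CREATE VIEW dfs.tmp.logs_by_date AS
SELECT dir0 AS log_year,
       dir1 AS log_month,
       event_id,
       message
FROM dfs.`/data/logs`;

-- Clients then query the view without knowing the directory structure.
SELECT event_id, message
FROM dfs.tmp.logs_by_date
WHERE log_year = '2015' AND log_month = '06';
```

Note that the directory values come back as strings, so the comparisons above use quoted literals. Whether a filter applied through the view limits the scan as well as a direct dir0/dir1 filter is worth confirming against the query plan on your own data.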

On Mon, Jun 1, 2015 at 2:34 PM, Paul Mogren <pmog...@commercehub.com> wrote:

> On 6/1/15, 12:14 PM, "Matt" <bsg...@gmail.com> wrote:
>
> >
> >Segmenting data into directories in HDFS would require clients to
> >structure queries accordingly, but would there be benefit in reduced
> >query time by limiting scan ranges?
>
> Yes. I am just a newbie user, but I have already seen that work with
> localFS and S3; I fully expect it will work for HDFS also, as I have seen
> mention of such a strategy for HDFS outside the context of Drill. Ignorant
> clients can also still query the root directory and just not get the
> benefit. I believe you could even define a view that would allow clients
> to apply WHERE clause filters against artificial columns of date
> information that you map to the directory structure, thereby hiding the
> structure from the client.
>
> HTH,
> Paul