Hey all,

So while I've dabbled in Drill, this past week I've really dug in, and
honestly, I think this project is a game changer. I was able to do some
amazing things with Drill. Kudos to everyone for all the hard work that
has gone into it.

I have one question and a potential feature request:

When using Drill this weekend, I had a workspace set up, and I found myself
using the SHOW FILES command often to find my directories etc. The thing
is, the output of SHOW FILES is not ordered. And when looking at file
system data, there are many ways a user might want to order the results
to work efficiently.

Consider the ls command in Unix. The ability to specify different sort
orders is built in there. I checked out
http://drill.apache.org/docs/show-files-command/ and also tried the
"obvious" show files order by name. That didn't work, and I couldn't find
a way to do it in the documentation.

So, is there a way to order the output? If there isn't now, could that be
added? I think just adding standard SQL ORDER BY would be perfect here.
You have 9 fields (seen below), and ordering by any one of them, or a group
of them, with ASC/DESC just like a standard ORDER BY would be a huge win.

I suppose one could ask for a WHERE clause too, and maybe a select (which
fields to show). I am more concerned with the ordering, but if I had to
implement all three, I could see:

(All three: select, where, and order. I.e., after FILES, if the token
isn't WHERE or ORDER, then check for a field list; if it's not a valid
field list, raise an error.)

SHOW FILES name, accessTime where name like '%.csv' order by name;

(Where clause and order, note the token after FILES is WHERE)
SHOW FILES WHERE name like '%.csv' order by length ASC, name DESC;

(Only order; ORDER is the first token after FILES)
SHOW FILES ORDER BY length ASC, name DESC;
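
To make the token check above a bit more concrete, here is a rough Python
sketch of how the three optional parts could be split apart. This is only
an illustration of the idea, not how Drill's parser works; the function
name split_show_files is made up, and the naive splitting would be fooled
by the words WHERE or ORDER BY inside a quoted string.

```python
import re

# The nine column names from the SHOW FILES header shown in this email.
FIELDS = ["name", "isDirectory", "isFile", "length", "owner", "group",
          "permissions", "accessTime", "modificationTime"]


def split_show_files(statement):
    """Split a proposed SHOW FILES statement into (fields, where, order_by).

    Hypothetical helper: it only illustrates the token dispatch and does
    not validate the WHERE or ORDER BY expressions themselves.
    """
    text = statement.strip().rstrip(";").strip()
    m = re.match(r"show\s+files\b(.*)$", text, flags=re.I | re.S)
    if not m:
        raise ValueError("not a SHOW FILES statement")
    rest = m.group(1).strip()

    # Peel off ORDER BY first, then WHERE; what remains is the field list.
    order_by = where = None
    parts = re.split(r"\border\s+by\b", rest, maxsplit=1, flags=re.I)
    if len(parts) == 2:
        rest, order_by = parts[0].strip(), parts[1].strip()
    parts = re.split(r"\bwhere\b", rest, maxsplit=1, flags=re.I)
    if len(parts) == 2:
        rest, where = parts[0].strip(), parts[1].strip()

    # Anything left before WHERE / ORDER must be a valid field list.
    fields = list(FIELDS)
    if rest:
        fields = [f.strip() for f in rest.split(",")]
        known = {f.lower() for f in FIELDS}
        bad = [f for f in fields if f.lower() not in known]
        if bad:
            raise ValueError("unknown field(s): " + ", ".join(bad))
    return fields, where, order_by
```

With no field list, the sketch falls back to all columns, which matches
what SHOW FILES displays today.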

I don't think we have to grant full SQL functionality here, just the
ability to display selected fields, filter on criteria, and order the
results. No aggregates, etc. If you wanted to get fancy, I suppose you
could turn the results into a full-on table, i.e. take the results, make
a quick in-memory table, and then use the whole Drill stack of functions
(minus aggregates) on it. Lots of options. I just wanted to get this down
in an email, as it was something I found myself wishing I had over and
over during data exploration.
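
As a rough illustration of the ordering I have in mind, here is a small
Python sketch that treats the listing as an in-memory table (a list of
dicts) and applies a multi-column ORDER BY to it. The helper name
order_rows and the sample rows are made up for the example; this is not
Drill code or real SHOW FILES output.

```python
def order_rows(rows, order_by):
    """Sort dict rows by a list of (field, "ASC" | "DESC") pairs."""
    result = list(rows)
    # Python's sort is stable, so sorting by the last key first and the
    # first key last behaves like a multi-column ORDER BY.
    for field, direction in reversed(order_by):
        result.sort(key=lambda r: r[field],
                    reverse=(direction.upper() == "DESC"))
    return result


# Made-up rows for illustration only.
rows = [
    {"name": "b.csv", "length": 10},
    {"name": "a.csv", "length": 10},
    {"name": "c.csv", "length": 5},
]

# Equivalent of: ORDER BY length ASC, name DESC
ordered = order_rows(rows, [("length", "ASC"), ("name", "DESC")])
```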


| name | isDirectory | isFile | length | owner | group | permissions | accessTime | modificationTime |



John
