Re: directory structure containing multiple file types

2015-10-19 Thread Aman Sinha
With regard to the last comment on directory-based pruning, please watch DRILL-3759 (https://issues.apache.org/jira/browse/DRILL-3759). I don't have a timeline for it yet, but hopefully it will make the next Drill release. Aman On Mon, Oct 19, 2015 at 3:50 AM, Dhruv Gohil wrote: > "What's needed in Dril

Re: directory structure containing multiple file types

2015-10-19 Thread Dhruv Gohil
"What's needed in Drill to truly eliminate ETL" +1 but in another thread ;-) few 'hacks' we want to share there of our 'work rounds' related to various drill limitations on multi directory queries (99% of our workload) , e.g. avoiding empty directory failures, building queries with direc

Re: directory structure containing multiple file types

2015-10-19 Thread Stefán Baxter
Hi Ted, Your approach only works for a single directory, not a directory structure. I will create an improvement request later today. I would welcome a session on "What's needed in Drill to truly eliminate ETL" (Just an idea) Regards, -Stefan On Sun, Oct 18, 2015 at 10:30 PM, Stefán Baxter w

Re: directory structure containing multiple file types

2015-10-18 Thread Stefán Baxter
Thank you Jacques, I will. On Sun, Oct 18, 2015 at 10:01 PM, Jacques Nadeau wrote: > Stefan, can you open a JIRA for reading multiple file types in a single > directory? It isn't the most common case we've run across but is definitely > something that should be addressed. > > -- > Jacques Nadeau

Re: directory structure containing multiple file types

2015-10-18 Thread Jacques Nadeau
Stefan, can you open a JIRA for reading multiple file types in a single directory? It isn't the most common case we've run across but is definitely something that should be addressed. -- Jacques Nadeau CTO and Co-Founder, Dremio On Sat, Oct 17, 2015 at 10:33 AM, Stefán Baxter wrote: > Thanks A

Re: directory structure containing multiple file types

2015-10-17 Thread Ted Dunning
Yes. This is a pain in the butt. One thing that might work for you is to use a union of different wild-cards. Here is an example where I have a directory with both csv and json files. select * from ( select columns[0] as a, columns[1] as b from dfs.tdunning.`foo1/*.csv` ) union ( s
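
The union-of-wildcards idea can be sketched for more than one directory by generating one branch per directory and file type. This is only an illustration, not the query from the message above: the helper `build_union_query`, the workspace `dfs.tmp`, the directory `events/2015/10`, and the column names `a` and `b` are all hypothetical, and the generated SQL has not been checked against any particular Drill release.

```python
def build_union_query(workspace, directories, columns):
    """Build a Drill-style UNION ALL over the JSON and Avro files in each directory.

    All arguments are illustrative; the caller is assumed to pass trusted,
    already-validated names (no escaping is done here).
    """
    select_list = ", ".join(columns)
    branches = []
    for directory in directories:
        for extension in ("json", "avro"):
            # One branch per directory and file type, so that each wildcard
            # only ever matches files of a single format.
            branches.append(
                f"SELECT {select_list} FROM {workspace}.`{directory}/*.{extension}`"
            )
    return "\nUNION ALL\n".join(branches)


# Hypothetical workspace, directory, and column names.
query = build_union_query("dfs.tmp", ["events/2015/10"], ["a", "b"])
print(query)
```

Listing the directories explicitly keeps each branch to a single format, at the cost of a query that grows with the number of directories.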

Re: directory structure containing multiple file types

2015-10-17 Thread Stefán Baxter
Thanks Abhishek, I think Drill is still quite far from eliminating ETL, and the list of obstacles on the way there seems to be growing. (yeah, disappointment got me for a bit) Regards, -Stefan

Re: directory structure containing multiple file types

2015-10-17 Thread Abhishek Girish
When querying a directory on a file system, Drill expects all files within it to be of the same format/type. Heterogeneous types aren't supported afaik. I've seen a case where Drill would start off querying but would fail later. And another case where it would fail right away. I think this is a kn

directory structure containing multiple file types

2015-10-17 Thread Stefán Baxter
Hi, I have a single directory structure containing both .avro and .json files. Their content is the same and they use the same schema (Avro files explicitly and JSON files implicitly). When I query the directory Drill returns an error informing me that the Avro files cannot be read as JSON files