Re: Query local files on cluster? [Beginner]

2015-05-27 Thread Matt
Drill can process a lot of data quickly, and for best performance and consistency you will likely find that the sooner you get the data to the DFS the better. Already most of the way there. Initial confusion came from the features to query the local / native filesystem, and how that does not

Re: Query local files on cluster? [Beginner]

2015-05-26 Thread Andries Engelbrecht
Perhaps I’m missing something here. Why not create a DFS plugin for HDFS and put the file in HDFS? On May 26, 2015, at 4:54 PM, Matt bsg...@gmail.com wrote: New installation with Hadoop 2.7 and Drill 1.0 on 4 nodes, it appears text files need to be on all nodes in a cluster? Using the

Re: Query local files on cluster? [Beginner]

2015-05-26 Thread Matt
That might be the end goal, but currently I don't have an HDFS ingest mechanism. We are not currently a Hadoop shop - can you suggest simple approaches for bulk loading data from delimited files into HDFS? On May 26, 2015, at 8:04 PM, Andries Engelbrecht aengelbre...@maprtech.com

Re: Query local files on cluster? [Beginner]

2015-05-26 Thread Matt
Thanks, I am incorrectly conflating the file system with data storage. Looking to experiment with the Parquet format, and was looking at CTAS queries as an import approach. Are direct queries over local files meant for an embedded Drill, whereas on a cluster files should be moved into HDFS

Re: Query local files on cluster? [Beginner]

2015-05-26 Thread Matt
New installation with Hadoop 2.7 and Drill 1.0 on 4 nodes, it appears text files need to be on all nodes in a cluster? Using the dfs config below, I am only able to query if a csv file is on all 4 nodes. If the file is only on the local node and not others, I get errors in the form of: ~~~

Re: Query local files on cluster? [Beginner]

2015-05-26 Thread Andries Engelbrecht
You can use the HDFS shell command hadoop fs -put to copy from the local file system to HDFS. For more robust mechanisms from remote systems, you can look at using NFS; MapR has a really robust NFS integration, and you can use it with the community edition. On May 26, 2015, at 5:11 PM, Matt
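
A minimal sketch of the hadoop fs -put approach, for illustration only. The local path reuses the /localdata directory and testdata.csv file named elsewhere in this thread; the HDFS target directory /data/csv is a made-up placeholder, not something from the original messages. The script only calls hadoop if it is found on the PATH.

```shell
#!/bin/sh
# Assumed paths: LOCAL_FILE follows the thread's example names,
# HDFS_DIR is a hypothetical target directory. Override via environment.
LOCAL_FILE="${LOCAL_FILE:-/localdata/testdata.csv}"
HDFS_DIR="${HDFS_DIR:-/data/csv}"

if command -v hadoop >/dev/null 2>&1; then
  # Create the target directory (no error if it already exists).
  hadoop fs -mkdir -p "$HDFS_DIR"
  # Copy the local file into HDFS so every node can read it.
  hadoop fs -put "$LOCAL_FILE" "$HDFS_DIR/"
  # List the directory to confirm the file arrived.
  hadoop fs -ls "$HDFS_DIR"
else
  echo "hadoop not found on PATH; would run: hadoop fs -put $LOCAL_FILE $HDFS_DIR/"
fi
```

Once the file is in HDFS, a Drill dfs storage plugin pointed at the HDFS namenode rather than file:/// should be able to read it from any node, without keeping a copy of the file on each node's local disk.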

Re: Query local files on cluster? [Beginner]

2015-05-25 Thread Matt
That does represent something I have not tried yet. Will test as soon as I can. Thanks! On 25 May 2015, at 20:39, Kristine Hahn wrote: The storage plugin location needs to be the full path to the localdata directory. This partial storage plugin definition works for the user named mapr: {

Re: Query local files on cluster? [Beginner]

2015-05-24 Thread USC
Because you already defined dfs.root as '/localdata'. You might just need to say dfs.root.`testdata.csv` in your FROM clause. On May 24, 2015, at 1:56 PM, Matt bsg...@gmail.com wrote: I have used a single node install (unzip and run) to query local text / csv files,