Drill can process a lot of data quickly, and for best performance and
consistency you will likely find that the sooner you get the data into
the DFS, the better.
Already most of the way there. The initial confusion came from the feature
for querying the local / native filesystem, and how that does not carry
over to a multi-node cluster.
Perhaps I’m missing something here.
Why not create a DFS plugin for HDFS and put the file in HDFS?
On May 26, 2015, at 4:54 PM, Matt bsg...@gmail.com wrote:
New installation with Hadoop 2.7 and Drill 1.0 on 4 nodes, it appears text
files need to be on all nodes in a cluster?
That might be the end goal, but currently I don't have an HDFS ingest
mechanism.
We are not currently a Hadoop shop - can you suggest simple approaches for bulk
loading data from delimited files into HDFS?
On May 26, 2015, at 8:04 PM, Andries Engelbrecht aengelbre...@maprtech.com wrote:
Thanks, I am incorrectly conflating the file system with data storage.
Looking to experiment with the Parquet format, and was considering CTAS queries
as an import approach.
Are direct queries over local files meant for an embedded Drill, whereas on a
cluster files should be moved into HDFS?
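As a sketch of the CTAS-to-Parquet approach mentioned here (the `dfs.tmp` workspace and the column positions are assumptions, not from the thread):

~~~sql
-- Write output as Parquet; dfs.tmp is a writable workspace by default
ALTER SESSION SET `store.format` = 'parquet';

-- CSV files without headers expose a single `columns` array in Drill
CREATE TABLE dfs.tmp.`testdata_parquet` AS
SELECT columns[0] AS id, columns[1] AS name
FROM dfs.root.`testdata.csv`;
~~~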
New installation with Hadoop 2.7 and Drill 1.0 on 4 nodes, it appears
text files need to be on all nodes in a cluster?
Using the dfs config below, I am only able to query if a csv file is on
all 4 nodes. If the file is only on the local node and not the others, I
get errors.
You can use the HDFS shell command hadoop fs -put to copy from the local
file system to HDFS.
For more robust ingest from remote systems you can look at using NFS; MapR
has a really robust NFS integration, and you can use it with the community
edition.
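A minimal sketch of that workflow (file path, contents, and the /localdata target directory are illustrative assumptions; the hadoop call is guarded so the script is harmless on a machine without a Hadoop client):

~~~shell
# Create a small delimited file locally (illustrative sample data)
printf 'id,name\n1,alpha\n2,beta\n' > /tmp/testdata.csv

# Copy it into HDFS so every Drillbit sees the same path
if command -v hadoop >/dev/null 2>&1; then
  hadoop fs -mkdir -p /localdata
  hadoop fs -put -f /tmp/testdata.csv /localdata/
fi
~~~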
On May 26, 2015, at 5:11 PM, Matt wrote:
That does represent something I have not tried yet. Will test as soon as
I can.
Thanks!
On 25 May 2015, at 20:39, Kristine Hahn wrote:
The storage plugin location needs to be the full path to the localdata
directory. This partial storage plugin definition works for the user named
mapr:
{
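The quoted definition is cut off after the opening brace; as a hedged sketch, a dfs storage plugin with a /localdata workspace in Drill typically looks something like this (the connection string and writable flag are assumptions):

~~~json
{
  "type": "file",
  "enabled": true,
  "connection": "file:///",
  "workspaces": {
    "root": {
      "location": "/localdata",
      "writable": false,
      "defaultInputFormat": null
    }
  }
}
~~~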
Because you already defined dfs.root as '/localdata', you might just need
to say dfs.root.`testdata.csv` in your FROM clause.
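With a workspace rooted at /localdata, the query can reference the file directly; a sketch (the column positions are assumed, since the file layout isn't shown in the thread):

~~~sql
SELECT columns[0], columns[1]
FROM dfs.root.`testdata.csv`
LIMIT 10;
~~~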
On May 24, 2015, at 1:56 PM, Matt bsg...@gmail.com wrote:
I have used a single node install (unzip and run) to query local text / csv
files,