Hello all,
I’m working with some data that is a little messy, and I have two questions
about CSV files:
1. When reading the headers, which characters are NOT allowed in the field
names?
2. Is there a way to tell Drill to skip n lines at the beginning of the file?
(I know about the
Firstly, I don't think this is a default setting, so you will need to add it
explicitly under every text format plugin ("csv", "tsv", ...) and inside every
dfs storage plugin (if you have more than one). Then turn on the new text
reader system/session option before you can query.
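For the format-plugin part, here is a sketch of what the "csv" entry under a
dfs storage plugin's "formats" section might look like (option names as in the
Drill text-format docs; exact names and defaults can vary by Drill version):

```json
"formats": {
  "csv": {
    "type": "text",
    "extensions": ["csv"],
    "delimiter": ",",
    "extractHeader": true,
    "skipFirstLine": false
  }
}
```

As I understand it, extractHeader reads field names from the first line, while
skipFirstLine drops it instead; both depend on the new text reader, which you
can enable with ALTER SESSION SET `exec.storage.enable_new_text_reader` = true;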
Secondly,
Thanks Abhishek. This helped.
On Fri, Apr 15, 2016 at 3:13 PM, Abhishek Girish
wrote:
> Can you take a look at
>
> https://drill.apache.org/docs/s3-storage-plugin/#quering-parquet-format-files-on-s3
> ? It could be an issue of connection to s3 timing out.
>
> On Fri,
Absolutely. Use the JDBC/ODBC interfaces together with a workspace / storage
plugin that writes the data into a distributed file system.
https://drill.apache.org/docs/odbc-jdbc-interfaces/
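A minimal sketch of a remote JDBC connection from a client machine (URL format
and driver class per the Drill JDBC docs; the ZooKeeper host names and the
query path are placeholders for your own cluster):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class DrillRemoteQuery {

    // Drill's JDBC URL can point at the ZooKeeper quorum so the driver
    // discovers the drillbits in the cluster: jdbc:drill:zk=host:2181,...
    static String drillUrl(String zkQuorum) {
        return "jdbc:drill:zk=" + zkQuorum;
    }

    // Runs a query against the remote cluster and prints the first column.
    // Requires the Drill JDBC driver (org.apache.drill.jdbc.Driver) on the
    // client classpath.
    static void runQuery(String url, String sql) throws Exception {
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Placeholder ZooKeeper hosts; substitute your own quorum.
        String url = drillUrl("zk1:2181,zk2:2181,zk3:2181");
        System.out.println(url);
        if (args.length > 0) {
            // Only attempt a live connection when a query is supplied.
            runQuery(url, args[0]);
        }
    }
}
```

Since the query executes on the drillbits, any CTAS / write output lands in
the cluster-side workspace, not on the client.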
On Mon, Apr 18, 2016 at 12:25 PM, M Gates wrote:
Question on drill, completely new to this tool and it looks awesome.
Is there a way to run drill remotely and, say, over the ODBC/JDBC driver send a
query to my hadoop cluster? i.e.: drill is on my client computer but the write
stays in my hadoop workspace.
Thanks,
Mark.
I found that the dfs storage sections for csv file types did not all have
the extractHeader setting in place. Manually putting it in on all four of
my nodes may have resolved the issue.
In my vanilla Hadoop 2.7.0 setup on the same servers, I don't recall
having to set it on all nodes.
Did I