That works with the file that has no row header, Sudhakar. On Sun, Feb 8, 2015 at 10:34 PM, Sudhakar Thota <sth...@maprtech.com> wrote:
> May be you have to cast it to integer. > > Sudhakar Thota > Sent from my iPhone > > > On Feb 8, 2015, at 6:54 PM, Minnow Noir <minnown...@gmail.com> wrote: > > > > The error actually happens when I remove the header row from the file. > > > > cat test2.csv > > "Ed",100 > > "Pete",200 > > "Ed",100 > > "Pete",400 > > > > > > 0: jdbc:drill:zk=local> select columns[0], columns[1] from > > dfs.`/data/test2.csv`; > > +------------+------------+ > > | EXPR$0 | EXPR$1 | > > +------------+------------+ > > | "Ed" | 100 | > > | "Pete" | 200 | > > | "Ed" | 100 | > > | "Pete" | 400 | > > +------------+------------+ > > 4 rows selected (0.837 seconds) > > > > 0: jdbc:drill:zk=local> select columns[0], sum(columns[1]) from > > dfs.`/data/test2.csv` group by columns[0]; > > Query failed: Query failed: Failure while running fragment., Only COUNT > > aggregate function supported for Boolean type [ > > a9344c36-ebb4-4b9d-9c7d-75a0f7e52edd on sandbox.hortonworks.com:31010 ] > > [ a9344c36-ebb4-4b9d-9c7d-75a0f7e52edd on sandbox.hortonworks.com:31010 > ] > > > > > > Error: exception while executing query: Failure while executing query. > > (state=,code=0) > > > > Also, I don't see anything on the Drill wiki about filtering out the > header > > row. How is that done? > > > > Thanks > > > > > > On Sun, Feb 8, 2015 at 9:23 PM, Andries Engelbrecht < > > aengelbre...@maprtech.com> wrote: > > > >> You need to filter out the header line on the CSV file as you are trying > >> to sum a string in column1. > >> > >> > >> —Andries > >> > >> > >>> On Feb 8, 2015, at 6:12 PM, Minnow Noir <minnown...@gmail.com> wrote: > >>> > >>> I'm trying to perform a basic query in order to learn Drill, but > getting > >> an > >>> the error message in the subject line. > >>> > >>> I created a dead simple CSV file on disk. Note that Sales values are > not > >>> quoted. > >>> > >>> cat test.csv > >>> Employee,Sales > >>> Ed,100 > >>> Pete,200 > >>> Ed,100 > >>> Pete,400 > >>> > >>> When I query it without performing a sum, it returns the expected > values. > >>> > >>> 0: jdbc:drill:zk=local> select columns[0], columns[1] as TotalSales > from > >>> dfs.`/data/test.csv`; > >>> +------------+------------+ > >>> | EXPR$0 | TotalSales | > >>> +------------+------------+ > >>> | Employee | Sales | > >>> | Ed | 100 | > >>> | Pete | 200 | > >>> | Ed | 100 | > >>> | Pete | 400 | > >>> +------------+------------+ > >>> 5 rows selected (0.11 seconds) > >>> > >>> However, if I throw a sum() in there, I get the confusing error message > >> in > >>> the subject line: > >>> > >>> 0: jdbc:drill:zk=local> select columns[0], sum(columns[1]) as > TotalSales > >>> from dfs.` > >>> /data/test.csv` group by columns[0]; > >>> Query failed: Query failed: Failure while running fragment., Only COUNT > >>> aggregate function supported for Boolean type [ > >>> bfd34bd1-2fac-4d9e-a9bd-26bced552120 on sandbox.hortonworks.com:31010 > ] > >>> [ bfd34bd1-2fac-4d9e-a9bd-26bced552120 on > sandbox.hortonworks.com:31010 > >> ] > >>> > >>> The message seems to be saying that Drill interpreted the Sales column > >> data > >>> as being Boolean somehow, and therefore, the only function that can be > >>> called is count(). It's not clear why Drill would interpret the Sales > >>> column values as being Boolean. > >>> > >>> Some Googling turned up this thread, which says the error also occurs > for > >>> character data: https://issues.apache.org/jira/browse/DRILL-1998 > >>> > >>> Of course, it's not clear why Drill would interpret 100, 200, etc. as > >> being > >>> character data unless it's getting thrown off by the header row...which > >> of > >>> course is present in every CSV and TSV file. However, I created a copy > >> of > >>> test.csv *without* the header row, and reran the query, but got the > same > >>> error. > >>> > >>> Any ideas what's causing the issue and how to resolve it? > >>> > >>> Thanks > >> > >> >