Re: Error using sum on numeric data -- Only COUNT aggregate function supported for Boolean type

Minnow Noir Sun, 08 Feb 2015 19:46:13 -0800

That works with the file that has no row header, Sudhakar.

On Sun, Feb 8, 2015 at 10:34 PM, Sudhakar Thota <sth...@maprtech.com> wrote:


> May be you have to cast it to integer.
>
> Sudhakar Thota
> Sent from my iPhone
>
> > On Feb 8, 2015, at 6:54 PM, Minnow Noir <minnown...@gmail.com> wrote:
> >
> > The error actually happens when I remove the header row from the file.
> >
> > cat test2.csv
> > "Ed",100
> > "Pete",200
> > "Ed",100
> > "Pete",400
> >
> >
> > 0: jdbc:drill:zk=local> select columns[0], columns[1] from
> > dfs.`/data/test2.csv`;
> > +------------+------------+
> > |   EXPR$0   |   EXPR$1   |
> > +------------+------------+
> > | "Ed"       | 100        |
> > | "Pete"     | 200        |
> > | "Ed"       | 100        |
> > | "Pete"     | 400        |
> > +------------+------------+
> > 4 rows selected (0.837 seconds)
> >
> > 0: jdbc:drill:zk=local> select columns[0], sum(columns[1]) from
> > dfs.`/data/test2.csv` group by columns[0];
> > Query failed: Query failed: Failure while running fragment., Only COUNT
> > aggregate function supported for Boolean type [
> > a9344c36-ebb4-4b9d-9c7d-75a0f7e52edd on sandbox.hortonworks.com:31010 ]
> > [ a9344c36-ebb4-4b9d-9c7d-75a0f7e52edd on sandbox.hortonworks.com:31010
> ]
> >
> >
> > Error: exception while executing query: Failure while executing query.
> > (state=,code=0)
> >
> > Also, I don't see anything on the Drill wiki about filtering out the
> header
> > row.  How is that done?
> >
> > Thanks
> >
> >
> > On Sun, Feb 8, 2015 at 9:23 PM, Andries Engelbrecht <
> > aengelbre...@maprtech.com> wrote:
> >
> >> You need to filter out the header line on the CSV file as you are trying
> >> to sum a string in column1.
> >>
> >>
> >> —Andries
> >>
> >>
> >>> On Feb 8, 2015, at 6:12 PM, Minnow Noir <minnown...@gmail.com> wrote:
> >>>
> >>> I'm trying to perform a basic query in order to learn Drill, but
> getting
> >> an
> >>> the error message in the subject line.
> >>>
> >>> I created a dead simple CSV file on disk.  Note that Sales values are
> not
> >>> quoted.
> >>>
> >>> cat test.csv
> >>> Employee,Sales
> >>> Ed,100
> >>> Pete,200
> >>> Ed,100
> >>> Pete,400
> >>>
> >>> When I query it without performing a sum, it returns the expected
> values.
> >>>
> >>> 0: jdbc:drill:zk=local> select columns[0], columns[1] as TotalSales
> from
> >>> dfs.`/data/test.csv`;
> >>> +------------+------------+
> >>> |   EXPR$0   | TotalSales |
> >>> +------------+------------+
> >>> | Employee   | Sales      |
> >>> | Ed         | 100        |
> >>> | Pete       | 200        |
> >>> | Ed         | 100        |
> >>> | Pete       | 400        |
> >>> +------------+------------+
> >>> 5 rows selected (0.11 seconds)
> >>>
> >>> However, if I throw a sum() in there, I get the confusing error message
> >> in
> >>> the subject line:
> >>>
> >>> 0: jdbc:drill:zk=local> select columns[0], sum(columns[1]) as
> TotalSales
> >>> from dfs.`
> >>> /data/test.csv` group by columns[0];
> >>> Query failed: Query failed: Failure while running fragment., Only COUNT
> >>> aggregate function supported for Boolean type [
> >>> bfd34bd1-2fac-4d9e-a9bd-26bced552120 on sandbox.hortonworks.com:31010
> ]
> >>> [ bfd34bd1-2fac-4d9e-a9bd-26bced552120 on
> sandbox.hortonworks.com:31010
> >> ]
> >>>
> >>> The message seems to be saying that Drill interpreted the Sales column
> >> data
> >>> as being Boolean somehow, and therefore, the only function that can be
> >>> called is count().  It's not clear why Drill would interpret the Sales
> >>> column values as being Boolean.
> >>>
> >>> Some Googling turned up this thread, which says the error also occurs
> for
> >>> character data: https://issues.apache.org/jira/browse/DRILL-1998
> >>>
> >>> Of course, it's not clear why Drill would interpret 100, 200, etc. as
> >> being
> >>> character data unless it's getting thrown off by the header row...which
> >> of
> >>> course is present in every CSV and TSV file.  However, I created a copy
> >> of
> >>> test.csv *without* the header row, and reran the query, but got the
> same
> >>> error.
> >>>
> >>> Any ideas what's causing the issue and how to resolve it?
> >>>
> >>> Thanks
> >>
> >>
>

Re: Error using sum on numeric data -- Only COUNT aggregate function supported for Boolean type

Reply via email to