Hi,

event_count is a column alias for count(*) (without the as).

Regards,
 -Stefan

On Thu, Jul 30, 2015 at 3:32 PM, Jacques Nadeau <jacq...@dremio.com> wrote:

> select p.client_ip.country_code country_code, count(*) event_count from
> dfs.tmp.`/events` as p where event_count > 30000 group by
> p.client_ip.country_code;
> +---------------+--------------+
> | country_code  | event_count  |
> +---------------+--------------+
> +---------------+--------------+
>
> In this situation, a fully-schemaed system would have stated that the
> event_count column does not exist.  In the case of json, we don't know if
> you're unlying data has an event_count column.  When we actually try to get
> it, we find it doesn't exist and is therefore null.  Null is never greater
> than 30000 so we exclude all records.  Modify your data to look like this:
>
> {event_count: 30001, "client_ip":{"country_code":"US"}}
> {event_count: 30001, "client_ip":{"country_code":"US"}}
> {event_count: 30001, "client_ip":{"country_code":"US"}}
> {event_count: 30001, "client_ip":{"country_code":"GB"}}
> {event_count: 30001, "client_ip":{"country_code":"US"}}
>
> And you'll see what your query is actually doing.  In SQL, an aggregate
> function is applied after the WHERE clause.  The only way to apply a filter
> after an aggregate is with the HAVING clause.
>
>
>
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
>
> On Thu, Jul 30, 2015 at 8:20 AM, Andries Engelbrecht <
> aengelbre...@maprtech.com> wrote:
>
> > Last I checked group by and having clauses do not support column aliases.
> >
> > —Andries
> >
> >
> > > On Jul 30, 2015, at 7:19 AM, Stefán Baxter <ste...@activitystream.com>
> > wrote:
> > >
> > > Hi,
> > >
> > > That last case works as expected, sorry, this test data does have null
> > > values for country_code.
> > >
> > > That means that I have a working solution but that it would be nice if
> v1
> > > (above) would work.
> > >
> > > Thank you,
> > > -Stefán
> > >
> > > On Thu, Jul 30, 2015 at 2:15 PM, Stefán Baxter <
> > ste...@activitystream.com>
> > > wrote:
> > >
> > >> make that "Count in where or having clause does not work as expected"
> > >>
> > >> On Thu, Jul 30, 2015 at 2:14 PM, Stefán Baxter <
> > ste...@activitystream.com>
> > >> wrote:
> > >>
> > >>> Hi,
> > >>>
> > >>> I have data that can be reduced to this:
> > >>>
> > >>>   - {"client_ip":{"country_code":"US"}}
> > >>>   - {"client_ip":{"country_code":"US"}}
> > >>>   - {"client_ip":{"country_code":"US"}}
> > >>>   - {"client_ip":{"country_code":"GB"}}
> > >>>   - {"client_ip":{"country_code":"US"}}
> > >>>
> > >>> This works fine:
> > >>>
> > >>> select p.client_ip.country_code country_code, count(*) event_count
> from
> > >>> dfs.tmp.`/events` as p group by p.client_ip.country_code;
> > >>>
> > >>>
> > >>> +---------+--------------+
> > >>> | country_code  | event_count  |
> > >>> +---------+--------------+
> > >>> | US      | 21516        |
> > >>> | GB      | 323594       |
> > >>> +---------+--------------+
> > >>>
> > >>>
> > >>> This simple where clause returns nothing:
> > >>>
> > >>> select p.client_ip.country_code country_code, count(*) event_count
> from
> > >>> dfs.tmp.`/events` as p where event_count > 30000 group by
> > >>> p.client_ip.country_code;
> > >>> +---------------+--------------+
> > >>> | country_code  | event_count  |
> > >>> +---------------+--------------+
> > >>> +---------------+--------------+
> > >>>
> > >>>
> > >>> Using a having clause (v1):
> > >>>
> > >>> select p.client_ip.country_code country_code, count(*) event_count
> from
> > >>> dfs.tmp.`/events` as p group by p.client_ip.country_code having
> > event_count
> > >>>> 30000;
> > >>>
> > >>> Error: PARSE ERROR: From line 1, column 169 to line 1, column 179:
> > >>> Expression 'event_count' is not being grouped
> > >>>
> > >>>
> > >>> Using a having clause (v2):
> > >>>
> > >>> select p.client_ip.country_code country_code, count(*) event_count
> from
> > >>> dfs.tmp.`/events` as p group by p.client_ip.country_code having
> > count(*) >
> > >>> 30000;
> > >>> +---------------+--------------+
> > >>> | country_code  | event_count  |
> > >>> +---------------+--------------+
> > >>> | GB            | 323594       |
> > >>> | null*          | 566667       |
> > >>> +---------------+--------------+
> > >>>
> > >>>
> > >>> * there are no null values in the database and this looks to be the
> > >>> "total for the rest"
> > >>>
> > >>>
> > >>> Does anyone know how this can be made to work?
> > >>>
> > >>> Regards,
> > >>> -Stefan
> > >>>
> > >>
> > >>
> >
> >
>

Reply via email to