The problem is that "day" is the value in the for loop.

I've tried doing a join with a table that contains the set of days, but the
problem is that you can't do a join on a range ... Hive only support
equality in the join. For example:

INSERT OVERWRITE TABLE mausummary SELECT day, COUNT(DISTINCT(userid))
    FROM days
    JOIN logins ON date_add(logins.t, 30) >= days.day AND logins.t <=
days.day
    GROUP BY day;

fails because of the range in the join.

Tom


On Wed, Oct 10, 2012 at 8:50 PM, MiaoMiao <liy...@gmail.com> wrote:

> How about
> SELECT day, COUNT(DISTINCT(userid)) FROM logins WHERE day - logins.day
> < 30 GROUP BY day;
>
> On Thu, Oct 11, 2012 at 6:05 AM, Tom Hubina <t...@z2live.com> wrote:
> > I'm trying to compute the number of active users in the previous 30 days
> for
> > each day over a date range. I can't think of any way to do it directly
> > within Hive so I'm wondering if you guys have any ideas.
> >
> > Basically the algorithm is something like:
> >
> > For each day in date range:
> >    SELECT day, COUNT(DISTINCT(userid)) FROM logins WHERE day -
> logins.day <
> > 30;
> >
> > Thanks for your help!
> >
> > Tom
> >
>

Reply via email to