On Mon, 2004-03-08 at 15:52, David Scott wrote:
> OK, here is a problem that has been bugging me for some time.
> 
> If I have a table like this, showing 3 visitors to a website and the pages they 
> view....
> 
> id    visitorID    pageID
> 1    1                3
> 2    1                4
> 3    1                2
> 4    2                3
> 5    2                4
> 6    3                2
> 7    3                4
> 
> so to get the count for each entry page to the site I would do:
> 
> SELECT pageid, count(pageID) as 'count' FROM hits GROUP BY visitorid ORDER BY 
> 'count' DESC
> 
> pageid    count
> 3            2
> 2            1
> 
> so page 3 has been the first page people visit 2 times and page 2 has been the first 
> page once.

I think you've misunderstood the results of this query.  First off, with
the data you've provided, the result of the above query will be:

mysql> SELECT pageid, count(pageID) as 'count' FROM hits GROUP BY
visitorid ORDER BY 'count' DESC;
+--------+-------+
| pageid | count |
+--------+-------+
|      3 |     3 |
|      3 |     2 |
|      2 |     2 |
+--------+-------+
3 rows in set (0.05 sec)


Note the extra row and that page ID 2 has a count of 2.  The problem
here is that the 'count' column isn't a count of the number of times a
page ID has been the first page (and I assume "first" is defined by
having the lowest 'id' value for a particular user) hit by a visitor. 
Rather, its the count of the *total* pages seen by each visitor.

Your query tells MySQL you want to group the result set on visitorid,
which means that you only want to see one row returned per unique
visitorid.  Aggregate functions like "count" and "sum" tell MySQL what
do to with the other non-grouped columns on each row that a particular
visitorid stays the same.  You've specified that you want a count of
each pageid while visitorid stays the same, which is what the 'count'
result column shows.  You've also specified that you want to show
pageid.  However, pageid isn't grouped and its isn't part of an
aggregate function, so in this case MySQL happens to show you the first
value it found for that visitorid.

However, as stated in the MySQL docs, you aren't guaranteed to get the
first value found for these non-grouped columns:

    http://www.mysql.com/doc/en/GROUP-BY-hidden-fields.html

Since the field you've left out of the group clause aren't unique in the
group, you can't depend on it being what you think.

Unfortunately, I don't think you can do what you want in one SQL
statement.  You could do it with temporary tables though:

  create temporary table first ( id int(11), pageid int(11) );

  insert into first (pageid, id)
    select pageid, min(id) as firstid from hits group by visitorid;

  select pageid, count(pageid) as count from first group by pageid order
by count;

> BUT how do I get the last page they have visited? a sort of reverse GROUP BY??? 
> Which would return...
> 
> pageid    count
> 4            2
> 2            1
> 
> Please Help its driving me nuts :S

If you want the last page seen, then you could change the 'min' to a
'max' in the statements above.

-- 

 |- Garth Webb       -|
 |- [EMAIL PROTECTED] -|

Attachment: signature.asc
Description: This is a digitally signed message part

Reply via email to