On Thu, 17 Dec 2009 10:38:32 +0100
"A. Kretschmer" <andreas.kretsch...@schollglas.com> wrote:

> In response to Ivan Sergio Borgonovo :
> > I've a web application.
> > 
> > I'm logging data related to the landing page and subsequent
> > selected hits.
> > 
> > create table track_hit (
> >   hitid serial not null,
> >   /* pk? I don't mind if serial wrap around
> >   pk could be (hitid, tracking_time) */
> >   esid varchar(32), -- related to session
> >   tracking_time timestamp not null default now(),
> >   -- some other stats
> >   -- first_hit boolean not null default false, -- ???
> > );
> >   
> > I'd like to be sure I just count one "first hit" in a session
> > (same esid) in an efficient way that means I'd like to mark them
> > with a
> 
> select distinct on (esid) esid, tracking_time from track_hit order
> by esid, tracking_time;
> 
> returns only one record for each esid, ordered by tracking_time.
> Should work with 8.x, maybe sice 7.x (I'm not sure)

I think I've tried to resolve a concurrency problem in the wrong
place... still... what is the difference between:

select min(hitid) as h from track_hit group by esid;

and

select distinct on (esid) hitid from track_hit order by esid,
track_time;

I haven't enough test data to see if they perform differently. The
second form seems to perform a little bit faster.
I'd expect the opposite: the first performing better.

I think I'll add an index on track_time for reporting and maybe make
primary key (hitid, tracking_time).
I don't want to be bothered by hitid wrap around, so I don't want to
make it a pk alone, still I may need a pk.

thanks

-- 
Ivan Sergio Borgonovo
http://www.webthatworks.it


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Reply via email to