You can group by multiple keys, so perhaps something like this, which
gives the hit count for each (sid, ip) pair:

prod_grouped = group prod by (sid, ip);
prod_hits = foreach prod_grouped generate FLATTEN(group) as (sid, ip),
    COUNT($1) as prod_hit_count;
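
If what you need is a single number per sid, with each distinct
(ip, session) pair counted once, a nested DISTINCT inside the foreach is
one way to get there. This is only a sketch: it assumes the `prod`
relation from your script with its `ip`, `session` and `sid` fields, and
the names `prod_by_sid`, `visitors`, `uniq_visitors`, `prod_uniques` and
`unique_visits` are made up for illustration.

```pig
-- Assumes prod has the fields (time, ip, session, sid) as in the quoted script.
prod_by_sid = group prod by sid;

prod_uniques = foreach prod_by_sid {
    -- Project only the fields that identify a visit.
    visitors = prod.(ip, session);
    -- Collapse repeated (ip, session) pairs within this sid.
    uniq_visitors = distinct visitors;
    generate group as sid, COUNT(uniq_visitors) as unique_visits;
};

dump prod_uniques;
```

The DISTINCT runs per group, so an sid with a very large number of hits
may be slow; grouping by (sid, ip, session) first and then grouping the
result by sid is an alternative to consider in that case.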

On Sun, Jan 16, 2011 at 5:02 PM, Cam Bazz wrote:

> Hello,
>
> I have rigged my web application so it generates some sort of custom
> access log. Each line in my access log has the ipnumber,
> sessionCookie, idOfPage.
>
> How can I count unique visits per idOfPage?
>
> I followed the tutorial to write a script for calculating number of
> visits per idOfPage:
>
> raw = load '/home/cambazz/my.log' using PigStorage('\t');
> rawprod = filter raw by $2=='PROD';
> prod = foreach rawprod generate $0 as time, $3 as ip, $4 as session, $9 as sid;
> prod_grouped = group prod by sid;
> prod_hits = foreach prod_grouped generate group, COUNT($1);
> dump prod_hits;
>
> which was easy.
>
> I now want to calculate the number of unique visits, where visits from
> the same (ip, sessionCookie) pair count as 1 per sid.
>
> I tried various schemes, but could not quite come up with it.
>
> Any ideas / suggestions / help greatly appreciated.
>
>
> Best Regards,
> C.B.
>
