Gerald Taylor <[EMAIL PROTECTED]> wrote on 10/14/2005 10:42:33 AM:

> Thank you for your answer. What I'm trying to do is return the scores of 

>    the diff values of datapoints between a particular user and all the 
> other users in the system. (hence the mypmd.uid != pmd.uid)   Some 
> datapoints might be missing.   I also want to know how many data point 
> diffs we got for each user.   I no longer need the avg.
> 
> The tables are pretty straightforward
> 
> CREATE TABLE pmd (
>    id bigint unsigned not null auto_increment primary key,
>    uid bigint unsigned not null,
>    pmid bigint unsigned not null,
>    value int not null
> ) engine InnoDB DEFAULT CHARSET Latin1
> 
> and mnames is just the names of the datapoints being measured
> 
> 
> CREATE TABLE mnames (
>    pmid bigint unsigned not null auto_increment primary key,
>    mname VARCHAR(20)
> ) engine InnoDB DEFAULT CHARSET Latin1
> 
> 
> If an outer join would do it, I will like to learn something.
> Thnka you again.
> 
> [EMAIL PROTECTED] wrote:
> > 
> > 
> > Gerald Taylor <[EMAIL PROTECTED]> wrote on 10/13/2005 12:20:53 PM:
> > 
> >  >   SELECT avg(abs(pmd.value) - (mypmd.value))) as diff, pm.mname, 
pmd.uid
> >  >   FROM pmdata pmd, mnames pm, pmdata mypmd
> >  > WHERE mypmd.uid= ?
> >  > AND pmd.pmid=pm.pmid AND mypmd.pmid=pm.pmid
> >  > AND pmd.uid != mypmd.uid GROUP BY pmd.pmid, pmd.uid ORDER BY 
pmd.uid
> >  >
> >  > What I would like also to return in this query is a COUNT of the 
number
> >  > of pmd.uid of each different value  so I know how many values I got 
from
> >  > uid#1, uid#2 etc.
> >  >
> >  > I tried putting COUNT(pmd.uid) as numdelta but it just gave me a 1 
in
> >  > every row.   The problem is I am not doing a straight GROUP BY 
pmd.uid
> >  >
> > 
> > Can you explain, in simple language, the question you are trying to 
> > answer with this query? I see an "INNER JOIN" and a "!=" used together 

> > which makes me wonder if you needed to use one of the OUTER JOINs 
> > instead.  Along with your explanation, please respond with the results 
of
> > 
> > SHOW CREATE TABLE pmdata\G
> > SHOW CREATE TABLE mnames\G
> > 
> > Thanks!
> > 
> > Shawn Green
> > Database Administrator
> > Unimin Corporation - Spruce Pine
> 
> 

If you know how to make an explicit INNER JOIN, making either one of the 
OUTER JOINs becomes easy. I try to only use the explicit form of creating 
JOINs and I avoid the implicit form (the comma separated list that you 
used) like the plague. In your case, the INNER JOIN is appropriate because 
if there is only one user, there can be no "differences" and you shouldn't 
expect any results.

If you want just a table of differences, that does not require a GROUP BY:

SELECT me.uid as me
        , them.uid as them
        , n.mname as name
        , (me.value - them.value) as diff
FROM pmd me
INNER JOIN pmd them
        ON me.uid != them.uid
        AND me.pmid = them.pmid
INNER JOIN mnames n
        ON me.pmid = n.pmid;

If you want some statistics on your differences, that's another story. 
Assuming that each pmid value can appear more than once per uid (which may 
not be true for your data, but this assumption generalizes the solution) 
or that some pmids do not apply to some uids, this query should provide 
for each pmid of each uid, the name of the pmid, the sums of the 
differences, the sums of the absolute differences, the sum of the squares 
of the differences, the average difference, the average absolute 
difference, the standard deviation of the differences, and the count of 
how many other uids and pmids were compared to for each pmid that belongs 
to the user. In this case I would like to list ALL of the pmids for each 
uid, regardless if any other uid has the same one, so I need to use an 
OUTER JOIN ( the LEFT JOIN, in particular).

SELECT me.uid 
        , n.mname as name
        , sum(me.value - them.value) as sum_diff
        , sum(abs(me.value - them.value)) as sum_abs_diff
        , sum((me.value - them.value)*(me.value - them.value)) as 
sum_sq_diff
        , avg(me.value - them.value) as avg_diff
        , avg(abs(me.value - them.value)) as avg_abs_diff
        , std(me.value - them.value) as std_diff
        , count(distinct them.uid) as otheruids
        , count(them.pmid) as datapoints
FROM pmd me
INNER JOIN mnames n
        ON me.pmid = n.pmid;
LEFT JOIN pmd them
        ON me.uid != them.uid
        AND me.pmid = them.pmid
GROUP BY me.uid, n.mname;

Let me know if that does what you want. And don't forget to CC: the list 
on all responses :-o

Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine

Reply via email to