A little sort/group by help please

2006-12-01 Thread T.J. Mahaffey

I've developed several job tracking applications here at my job.
One of them relies heavily on a series of date stamps in the main table.
Each job has about 6 milestones, each with a respective "sibling"  
column in the same table. The first sibling reflects the job's  
predicted date milestone and the second sibling gets populated with  
the *actual* date when that milestone is met. LIke this:


col1_date_predicted		col1_date_actual		col2_date_predicted		 
col2_date_actual


The "_date_actual" columns contain "-00-00" until their  
respective milestones have actually passed and is set to a real  
datestamp.
Now, when displaying the records in this table, I'm highlighting a  
row in blue if all of the *actual* dates are populated with legit  
datestamp.


My question: I can't for the life of me figure out how I might sort  
or group these all-populated rows (colored in blue) together in the  
list.

Can someone give me some guidance?
Thanks in advance.

--
T.J. Mahaffey
[EMAIL PROTECTED]




--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Order By or Group By Help Please

2002-12-08 Thread Bill Easton
SOLUTION 1:

It's probably simplest, conceptually, to build a temporary table which
provides the proper ordering.  The "ordering" column below is a computed
value which determines the desired orderint of the File's.  Then, you
can join with an ORDER BY clause that gives the desired order.

mysql> create temporary table temp
-> select substring(FileNumber,1,6) as File,
->min(concat(1-RUSH, PDate, PTime, substring(FileNumber,1,6)))
as ordering
-> from mytable
-> group by File;
Query OK, 7 rows affected (0.04 sec)
Records: 7  Duplicates: 0  Warnings: 0

mysql> select * from temp;
++---+
| File   | ordering  |
++---+
| 1-1023 | 012/0408:001-1023 |
| 1- | 012/0408:301- |
| 1- | 012/0406:001- |
| 1-6655 | 112/1108:401-6655 |
| 1-7654 | 012/0508:001-7654 |
| 1-9868 | 112/0514:001-9868 |
| 1- | 012/0408:001- |
++---+
7 rows in set (0.00 sec)

mysql> select mytable.* from
-> mytable, temp
-> where substring(mytable.FileNumber,1,6) = temp.File
-> order by ordering,RUSH desc,PDate,PTime;
+--++---+---+
| RUSH | FileNumber | PTime | PDate |
+--++---+---+
|1 | 1--001 | 06:00 | 12/04 |
|1 | 1-1023-001 | 08:00 | 12/04 |
|0 | 1-1023-002 | 14:00 | 12/09 |
|0 | 1-1023-003 | 11:00 | 12/10 |
|1 | 1--123 | 08:00 | 12/04 |
|0 | 1--124 | 09:30 | 12/09 |
|0 | 1--125 | 10:00 | 12/15 |
|1 | 1--000 | 08:30 | 12/04 |
|1 | 1-7654-043 | 08:00 | 12/05 |
|0 | 1-9868-000 | 14:00 | 12/05 |
|0 | 1-6655-021 | 08:40 | 12/11 |
|0 | 1-6655-022 | 13:30 | 12/15 |
+--++---+---+
12 rows in set (0.01 sec)

SOLUTION 2:

Yes, you can do it in one select statement.  You join two copies of your
table,
using one to compute the ordering.  I don't know how to get rid of the
ordering column in the result, as the expression won't work in the ORDER BY
clause.

mysql> select A.*,
->min(concat(1-B.RUSH, B.PDate, B.PTime,
substring(B.FileNumber,1,6))) as ordering
-> from mytable A, mytable B
-> where substring(A.FileNumber,1,6) = substring(B.FileNumber,1,6)
-> group by A.FileNumber
-> order by ordering, A.RUSH desc, A.PDate, A.PTime;
+--++---+---+---+
| RUSH | FileNumber | PTime | PDate | ordering  |
+--++---+---+---+
|1 | 1--001 | 06:00 | 12/04 | 012/0406:001- |
|1 | 1-1023-001 | 08:00 | 12/04 | 012/0408:001-1023 |
|0 | 1-1023-002 | 14:00 | 12/09 | 012/0408:001-1023 |
|0 | 1-1023-003 | 11:00 | 12/10 | 012/0408:001-1023 |
|1 | 1--123 | 08:00 | 12/04 | 012/0408:001- |
|0 | 1--124 | 09:30 | 12/09 | 012/0408:001- |
|0 | 1--125 | 10:00 | 12/15 | 012/0408:001- |
|1 | 1--000 | 08:30 | 12/04 | 012/0408:301- |
|1 | 1-7654-043 | 08:00 | 12/05 | 012/0508:001-7654 |
|0 | 1-9868-000 | 14:00 | 12/05 | 112/0514:001-9868 |
|0 | 1-6655-021 | 08:40 | 12/11 | 112/1108:401-6655 |
|0 | 1-6655-022 | 13:30 | 12/15 | 112/1108:401-6655 |
+--++---+---+---+
12 rows in set (0.03 sec)

From: "Roger Davis" <[EMAIL PROTECTED]>
To: "Mysql" <[EMAIL PROTECTED]>
Subject: Order By or Group By Help Please
Date: Thu, 5 Dec 2002 19:54:57 -0500

Ok, I will try this one again.

I need some help on a select if it is possible.  Take for Example the
following data.

--
| RUSH | FileNumber | PTime  | PDate |
--
|  1   | 1-1023-001 | 08:00  | 12/04 |
|  1   | 1--001 | 06:00  | 12/04 |
|  0   | 1-1023-002 | 14:00  | 12/09 |
|  1   | 1--000 | 08:30  | 12/04 |
|  0   | 1-1023-003 | 11:00  | 12/10 |
|  1   | 1--123 | 08:00  | 12/04 |
|  0   | 1--124 | 09:30  | 12/09 |
|  0   | 1-6655-021 | 08:40  | 12/11 |
|  0   | 1--125 | 10:00  | 12/15 |
|  1   | 1-7654-043 | 08:00  | 12/05 |
|  0   | 1-6655-022 | 13:30  | 12/15 |
|  0   | 1-9868-000 | 14:00  | 12/05 |
--

To end up grouped like this.

--
| RUSH | FileNumber | PTime  | PDate |
--
|  1   | 1--001 | 06:00  | 12/04 |
|  1   | 1-1023-001 | 08:00  | 12/04 |
|  0   | 1-1023-002 | 14:00  | 12/09 |
|  0   | 1-1023-003 | 11:00  | 12/10 |
|  1   | 1--123 | 08:00  | 12/04 |
|  0   | 1--124 | 09:30  | 12/09 |
|  0   | 1--125 | 10:00  | 12/15 |
|  1   | 1--000 | 08:30  | 12/04 |
|  1   | 1-7654-043 | 08:00  | 12/05 |
|  0   | 1-9868-000 | 14:00  | 12/05 |
|  0   | 1-6655-021 | 08:40  | 12/11 |
|  0   | 1-6655-022 | 13:30  | 12/15 |
--

Basically, I need this.

If it is a RUSH (1), It needs to be

Order By or Group By Help Please

2002-12-05 Thread Roger Davis
Ok, I will try this one again.

I need some help on a select if it is possible.  Take for Example the
following data.

--
| RUSH | FileNumber | PTime  | PDate |
--
|  1   | 1-1023-001 | 08:00  | 12/04 |
|  1   | 1--001 | 06:00  | 12/04 |
|  0   | 1-1023-002 | 14:00  | 12/09 |
|  1   | 1--000 | 08:30  | 12/04 |
|  0   | 1-1023-003 | 11:00  | 12/10 |
|  1   | 1--123 | 08:00  | 12/04 |
|  0   | 1--124 | 09:30  | 12/09 |
|  0   | 1-6655-021 | 08:40  | 12/11 |
|  0   | 1--125 | 10:00  | 12/15 |
|  1   | 1-7654-043 | 08:00  | 12/05 |
|  0   | 1-6655-022 | 13:30  | 12/15 |
|  0   | 1-9868-000 | 14:00  | 12/05 |
--

To end up grouped like this.

--
| RUSH | FileNumber | PTime  | PDate |
--
|  1   | 1--001 | 06:00  | 12/04 |
|  1   | 1-1023-001 | 08:00  | 12/04 |
|  0   | 1-1023-002 | 14:00  | 12/09 |
|  0   | 1-1023-003 | 11:00  | 12/10 |
|  1   | 1--123 | 08:00  | 12/04 |
|  0   | 1--124 | 09:30  | 12/09 |
|  0   | 1--125 | 10:00  | 12/15 |
|  1   | 1--000 | 08:30  | 12/04 |
|  1   | 1-7654-043 | 08:00  | 12/05 |
|  0   | 1-9868-000 | 14:00  | 12/05 |
|  0   | 1-6655-021 | 08:40  | 12/11 |
|  0   | 1-6655-022 | 13:30  | 12/15 |
--

Basically, I need this.

If it is a RUSH (1), It needs to be first
If there are duplicate files (Like 1- (it has 2 dups) or 1-1023 (2
dups)) and one of them is a rush
They need to be grouped together.

Then sort it by PDate and PTime

I have tried several group by combinations but no luck so far.  Any help
will be appreciated.

Thanks
Roger

Oh yeah,   SQL, MySQL

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




Re: group by help?

2002-04-22 Thread Mick Watson

You're right, it wasn't the answer I wanted to hear ;-)
But it has a ring of truth to it.

Thanks for all your help

Mick

Paul DuBois wrote:

> At 13:34 +0100 4/19/02, Mick Watson wrote:
> >Thank you :-)
> >
> >I do realise the SQL is not being used as it should be, but as it is not my
> >database and nor is it my SQL, there's not much I can do to really change
> >either of these I merely have to work with a database that has been
> >produced by a third party group, and I must say at this point that I have
> >nothing but praise for the ensembl group for the difficult job that they have
> >done very well (see http://www.ensembl.org)
> >
> >So, I realise this is not the problem of mySQL, but what I want to try and
> >find out is really the internal workings of mySQL when it performs this
> >operation so that I can understand why my copy of ensembl, which is produced
> >from direct dumps of the main ensembl, behaves differently to the main
> >ensembl.
>
> I guess you won't want to hear this, but that is completely the wrong
> approach.  You *might* be able to achieve the same query output on both
> machines by dumping the database on one machine, then dropping and reloading
> the tables on both machines using the same copy of the dump.  Then presumably
> both tables will have both records loaded in the same order, and the query
> *might* produce the same result on both machines.
>
> - ...until the table gets modified
> - ...until MySQL is upgraded on one machine and not the other, and something
>happens in the query optimizer that changes the query output
> - ...etc.
>
> The group you're working with *must* change the queries to be correct.
> Otherwise, you're entirely at the mercy of any number of external factors
> that can change the output and over which you have no control.
>
> >
> >I guess from what you are saying, that the data is stored randomly, that there
> >is very little I can do to actually make my database behave the same as the
> >main database?  Could it be affected in any way by operating system and/or
> >file system?  Superficially the data is organised in exactly the same way in
> >both databases, but I have no doubt that things like memory locations are
> >completely different, but possibly if I could understand what the variables
> >are that affect this behaviour I could minimise the inconsistancy...?
> >
> >Thanks for your time
> >
> >Mick
> >
> >Richard Emery wrote:
> >
> >>  mysql is acting correctly.
> >>
> >>  GROUP BY is used to consolidate data for SUMming, COUNTing, etc.  Your
> >>  SELECT statement makes not such request.  You have simply requested the
> >>  value of a specific field.  Data are stored in mysql databases randomly.
> >>  Therefore, when you request a field's data, you are getting whatever is
> >>  first in the list of records matching your WHERE clause.
> >>
> >>  Bottom line: you are NOT using GROUP BY as it is supposed to be used.  Your
> >  > SQL is in error, not mysql.


-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




Re: group by help?

2002-04-20 Thread Paul DuBois

At 13:34 +0100 4/19/02, Mick Watson wrote:
>Thank you :-)
>
>I do realise the SQL is not being used as it should be, but as it is not my
>database and nor is it my SQL, there's not much I can do to really change
>either of these I merely have to work with a database that has been
>produced by a third party group, and I must say at this point that I have
>nothing but praise for the ensembl group for the difficult job that they have
>done very well (see http://www.ensembl.org)
>
>So, I realise this is not the problem of mySQL, but what I want to try and
>find out is really the internal workings of mySQL when it performs this
>operation so that I can understand why my copy of ensembl, which is produced
>from direct dumps of the main ensembl, behaves differently to the main
>ensembl.

I guess you won't want to hear this, but that is completely the wrong
approach.  You *might* be able to achieve the same query output on both
machines by dumping the database on one machine, then dropping and reloading
the tables on both machines using the same copy of the dump.  Then presumably
both tables will have both records loaded in the same order, and the query
*might* produce the same result on both machines.

- ...until the table gets modified
- ...until MySQL is upgraded on one machine and not the other, and something
   happens in the query optimizer that changes the query output
- ...etc.

The group you're working with *must* change the queries to be correct.
Otherwise, you're entirely at the mercy of any number of external factors
that can change the output and over which you have no control.

>
>I guess from what you are saying, that the data is stored randomly, that there
>is very little I can do to actually make my database behave the same as the
>main database?  Could it be affected in any way by operating system and/or
>file system?  Superficially the data is organised in exactly the same way in
>both databases, but I have no doubt that things like memory locations are
>completely different, but possibly if I could understand what the variables
>are that affect this behaviour I could minimise the inconsistancy...?
>
>Thanks for your time
>
>Mick
>
>Richard Emery wrote:
>
>>  mysql is acting correctly.
>>
>>  GROUP BY is used to consolidate data for SUMming, COUNTing, etc.  Your
>>  SELECT statement makes not such request.  You have simply requested the
>>  value of a specific field.  Data are stored in mysql databases randomly.
>>  Therefore, when you request a field's data, you are getting whatever is
>>  first in the list of records matching your WHERE clause.
>>
>>  Bottom line: you are NOT using GROUP BY as it is supposed to be used.  Your
>  > SQL is in error, not mysql.


-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




Re: group by help?

2002-04-19 Thread Mick Watson

It's ok, I do actually fully understand relational database theory and what the
group by statement is used for, but like I say, I personally have no control
over the database or the SQL :-)

I was hoping to get hold of someone who maybe knows the source code quite well
already and could  help me out before I take the drastic step of trawling
through the source code myself.

It does strike me that, on the rare occasions when a truly arbitrary decision
must be made, when there is a choice but it really doesn't matter which is
chosen, there is no method in relational databases to make this choice in a
consistent manner.  A tough task probably, and obviously group by is not the
perfect answer, but you can see why the developer made that choice as it does
make mySQL make an arbitrary decision... just not a consistent one :-)

Thanks

Mick

Richard Emery wrote:

> As a relational database, it doe not matter the order in which data are
> stored.
>
> Rather, what matters are the methods by which data are extracted via the
> SELECT statement.  In order to derive benefits of GROUP BY, you must use it
> with SELECTs that use GROUP BY functionality, such as SUM(), COUNT().
>
> If you need to know the internal machinations of mysql, I can only suggest
> that you study the source code.
> - Original Message -
> From: Mick Watson <[EMAIL PROTECTED]>
> To: Richard Emery <[EMAIL PROTECTED]>
> Cc: <[EMAIL PROTECTED]>
> Sent: Friday, April 19, 2002 7:34 AM
> Subject: Re: group by help?
>
> Thank you :-)
>
> I do realise the SQL is not being used as it should be, but as it is not my
> database and nor is it my SQL, there's not much I can do to really change
> either of these I merely have to work with a database that has been
> produced by a third party group, and I must say at this point that I have
> nothing but praise for the ensembl group for the difficult job that they
> have
> done very well (see http://www.ensembl.org)
>
> So, I realise this is not the problem of mySQL, but what I want to try and
> find out is really the internal workings of mySQL when it performs this
> operation so that I can understand why my copy of ensembl, which is produced
> from direct dumps of the main ensembl, behaves differently to the main
> ensembl.
>
> I guess from what you are saying, that the data is stored randomly, that
> there
> is very little I can do to actually make my database behave the same as the
> main database?  Could it be affected in any way by operating system and/or
> file system?  Superficially the data is organised in exactly the same way in
> both databases, but I have no doubt that things like memory locations are
> completely different, but possibly if I could understand what the variables
> are that affect this behaviour I could minimise the inconsistancy...?
>
> Thanks for your time
>
> Mick
>
> Richard Emery wrote:
>
> > mysql is acting correctly.
> >
> > GROUP BY is used to consolidate data for SUMming, COUNTing, etc.  Your
> > SELECT statement makes not such request.  You have simply requested the
> > value of a specific field.  Data are stored in mysql databases randomly.
> > Therefore, when you request a field's data, you are getting whatever is
> > first in the list of records matching your WHERE clause.
> >
> > Bottom line: you are NOT using GROUP BY as it is supposed to be used.
> Your
> > SQL is in error, not mysql.
> >
> > hope this helps...
> > - Original Message -
> > From: Mick Watson <[EMAIL PROTECTED]>
> > To: <[EMAIL PROTECTED]>
> > Sent: Friday, April 19, 2002 6:01 AM
> > Subject: group by help?
> >
> > Hi
> >
> > I am having some trouble figuring out how mySQL interprets the group by
> > clause in a particular set of data that I have (the ensembl database if
> > anyone is familar with it!)
> >
> > Now, I have some data like this:
> >
> > +-++--+
> > | gene_id | display_id | db_name  |
> > +-++--+
> > |   24173 | Q9H701 | SPTREMBL |
> > |   24173 | Q96GS5 | SPTREMBL |
> > +-++--+
> >
> > and this is produced by the following SQL:
> >
> >  select t.gene_id, x.display_id,e.db_name
> >   from ensembl_core_test.objectXref as ox,
> > ensembl_core_test.Xref   as x,
> > ensembl_core_test.transcript as t,
> > ensembl_core_test.externalDB as e
> >  where e.db_name='SPTREMBL' and
> > x.xrefID = ox.xrefID and
> >t.translation_id = ox.ensembl_id and

Re: group by help?

2002-04-19 Thread Richard Emery

As a relational database, it doe not matter the order in which data are
stored.

Rather, what matters are the methods by which data are extracted via the
SELECT statement.  In order to derive benefits of GROUP BY, you must use it
with SELECTs that use GROUP BY functionality, such as SUM(), COUNT().

If you need to know the internal machinations of mysql, I can only suggest
that you study the source code.
- Original Message -
From: Mick Watson <[EMAIL PROTECTED]>
To: Richard Emery <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Friday, April 19, 2002 7:34 AM
Subject: Re: group by help?


Thank you :-)

I do realise the SQL is not being used as it should be, but as it is not my
database and nor is it my SQL, there's not much I can do to really change
either of these I merely have to work with a database that has been
produced by a third party group, and I must say at this point that I have
nothing but praise for the ensembl group for the difficult job that they
have
done very well (see http://www.ensembl.org)

So, I realise this is not the problem of mySQL, but what I want to try and
find out is really the internal workings of mySQL when it performs this
operation so that I can understand why my copy of ensembl, which is produced
from direct dumps of the main ensembl, behaves differently to the main
ensembl.

I guess from what you are saying, that the data is stored randomly, that
there
is very little I can do to actually make my database behave the same as the
main database?  Could it be affected in any way by operating system and/or
file system?  Superficially the data is organised in exactly the same way in
both databases, but I have no doubt that things like memory locations are
completely different, but possibly if I could understand what the variables
are that affect this behaviour I could minimise the inconsistancy...?

Thanks for your time

Mick

Richard Emery wrote:

> mysql is acting correctly.
>
> GROUP BY is used to consolidate data for SUMming, COUNTing, etc.  Your
> SELECT statement makes not such request.  You have simply requested the
> value of a specific field.  Data are stored in mysql databases randomly.
> Therefore, when you request a field's data, you are getting whatever is
> first in the list of records matching your WHERE clause.
>
> Bottom line: you are NOT using GROUP BY as it is supposed to be used.
Your
> SQL is in error, not mysql.
>
> hope this helps...
> - Original Message -
> From: Mick Watson <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Friday, April 19, 2002 6:01 AM
> Subject: group by help?
>
> Hi
>
> I am having some trouble figuring out how mySQL interprets the group by
> clause in a particular set of data that I have (the ensembl database if
> anyone is familar with it!)
>
> Now, I have some data like this:
>
> +-++--+
> | gene_id | display_id | db_name  |
> +-++--+
> |   24173 | Q9H701 | SPTREMBL |
> |   24173 | Q96GS5 | SPTREMBL |
> +-++--+
>
> and this is produced by the following SQL:
>
>  select t.gene_id, x.display_id,e.db_name
>   from ensembl_core_test.objectXref as ox,
> ensembl_core_test.Xref   as x,
> ensembl_core_test.transcript as t,
> ensembl_core_test.externalDB as e
>  where e.db_name='SPTREMBL' and
> x.xrefID = ox.xrefID and
>t.translation_id = ox.ensembl_id and
>e.externalDBId=x.externalDBId and gene_id =24173;
>
> Now, the SQL is not important, what is is that we have two display_ids
> for one gene_id.  Now, if we add a "group by gene_id" clause into the
> above SQL, then presumably mySQL must make an arbitrary decision on
> which display_id to choose.  And the odd thing is that in the main
> ensembl database it chooses one, and in my local copy it chooses the
> other!
>
> So what I want to figure out is how mySQL makes that arbitrary decision
> - is it based on which it comes across first in memory, which it comes
> across last, alphabetical order, random choice (though mySQL is always
> consistent in which it chooses) ... or is there some other way it will
> make the decision?
>
> Furthermore, and more confusingly, mySQL chooses differently if I
> parameterise the SQL.  For example:
>
> mysql> create table test
> -> select t.gene_id, x.display_id,e.db_name
> ->   from ensembl_core_test.objectXref as ox,
> ->ensembl_core_test.Xref   as x,
> ->ensembl_core_test.transcript as t,
> ->   ensembl_core_test.externalDB as e
> ->  where e.db_name='SPTREMBL&#x

Re: group by help?

2002-04-19 Thread Mick Watson

Thank you :-)

I do realise the SQL is not being used as it should be, but as it is not my
database and nor is it my SQL, there's not much I can do to really change
either of these I merely have to work with a database that has been
produced by a third party group, and I must say at this point that I have
nothing but praise for the ensembl group for the difficult job that they have
done very well (see http://www.ensembl.org)

So, I realise this is not the problem of mySQL, but what I want to try and
find out is really the internal workings of mySQL when it performs this
operation so that I can understand why my copy of ensembl, which is produced
from direct dumps of the main ensembl, behaves differently to the main
ensembl.

I guess from what you are saying, that the data is stored randomly, that there
is very little I can do to actually make my database behave the same as the
main database?  Could it be affected in any way by operating system and/or
file system?  Superficially the data is organised in exactly the same way in
both databases, but I have no doubt that things like memory locations are
completely different, but possibly if I could understand what the variables
are that affect this behaviour I could minimise the inconsistancy...?

Thanks for your time

Mick

Richard Emery wrote:

> mysql is acting correctly.
>
> GROUP BY is used to consolidate data for SUMming, COUNTing, etc.  Your
> SELECT statement makes not such request.  You have simply requested the
> value of a specific field.  Data are stored in mysql databases randomly.
> Therefore, when you request a field's data, you are getting whatever is
> first in the list of records matching your WHERE clause.
>
> Bottom line: you are NOT using GROUP BY as it is supposed to be used.  Your
> SQL is in error, not mysql.
>
> hope this helps...
> - Original Message -
> From: Mick Watson <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Friday, April 19, 2002 6:01 AM
> Subject: group by help?
>
> Hi
>
> I am having some trouble figuring out how mySQL interprets the group by
> clause in a particular set of data that I have (the ensembl database if
> anyone is familar with it!)
>
> Now, I have some data like this:
>
> +-++--+
> | gene_id | display_id | db_name  |
> +-++--+
> |   24173 | Q9H701 | SPTREMBL |
> |   24173 | Q96GS5 | SPTREMBL |
> +-++--+
>
> and this is produced by the following SQL:
>
>  select t.gene_id, x.display_id,e.db_name
>   from ensembl_core_test.objectXref as ox,
> ensembl_core_test.Xref   as x,
> ensembl_core_test.transcript as t,
> ensembl_core_test.externalDB as e
>  where e.db_name='SPTREMBL' and
> x.xrefID = ox.xrefID and
>t.translation_id = ox.ensembl_id and
>e.externalDBId=x.externalDBId and gene_id =24173;
>
> Now, the SQL is not important, what is is that we have two display_ids
> for one gene_id.  Now, if we add a "group by gene_id" clause into the
> above SQL, then presumably mySQL must make an arbitrary decision on
> which display_id to choose.  And the odd thing is that in the main
> ensembl database it chooses one, and in my local copy it chooses the
> other!
>
> So what I want to figure out is how mySQL makes that arbitrary decision
> - is it based on which it comes across first in memory, which it comes
> across last, alphabetical order, random choice (though mySQL is always
> consistent in which it chooses) ... or is there some other way it will
> make the decision?
>
> Furthermore, and more confusingly, mySQL chooses differently if I
> parameterise the SQL.  For example:
>
> mysql> create table test
> -> select t.gene_id, x.display_id,e.db_name
> ->   from ensembl_core_test.objectXref as ox,
> ->ensembl_core_test.Xref   as x,
> ->ensembl_core_test.transcript as t,
> ->   ensembl_core_test.externalDB as e
> ->  where e.db_name='SPTREMBL' and
> ->x.xrefID = ox.xrefID and
> ->t.translation_id = ox.ensembl_id and
> ->e.externalDBId=x.externalDBId and gene_id = 24173
> ->   group by gene_id
> ->   order by gene_id;
> Query OK, 1 row affected (0.00 sec)
> Records: 1  Duplicates: 0  Warnings: 0
>
> mysql> select * from test;
> +-++--+
> | gene_id | display_id | db_name  |
> +-++--+
> |   24173 | Q9H701 | SPTREMBL |
> +-++--+
> 1 row in set (

Re: group by help?

2002-04-19 Thread Richard Emery

mysql is acting correctly.

GROUP BY is used to consolidate data for SUMming, COUNTing, etc.  Your
SELECT statement makes not such request.  You have simply requested the
value of a specific field.  Data are stored in mysql databases randomly.
Therefore, when you request a field's data, you are getting whatever is
first in the list of records matching your WHERE clause.

Bottom line: you are NOT using GROUP BY as it is supposed to be used.  Your
SQL is in error, not mysql.

hope this helps...
- Original Message -
From: Mick Watson <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Friday, April 19, 2002 6:01 AM
Subject: group by help?


Hi

I am having some trouble figuring out how mySQL interprets the group by
clause in a particular set of data that I have (the ensembl database if
anyone is familar with it!)

Now, I have some data like this:

+-++--+
| gene_id | display_id | db_name  |
+-++--+
|   24173 | Q9H701 | SPTREMBL |
|   24173 | Q96GS5 | SPTREMBL |
+-++--+

and this is produced by the following SQL:

 select t.gene_id, x.display_id,e.db_name
  from ensembl_core_test.objectXref as ox,
ensembl_core_test.Xref   as x,
ensembl_core_test.transcript as t,
ensembl_core_test.externalDB as e
 where e.db_name='SPTREMBL' and
x.xrefID = ox.xrefID and
   t.translation_id = ox.ensembl_id and
   e.externalDBId=x.externalDBId and gene_id =24173;

Now, the SQL is not important, what is is that we have two display_ids
for one gene_id.  Now, if we add a "group by gene_id" clause into the
above SQL, then presumably mySQL must make an arbitrary decision on
which display_id to choose.  And the odd thing is that in the main
ensembl database it chooses one, and in my local copy it chooses the
other!

So what I want to figure out is how mySQL makes that arbitrary decision
- is it based on which it comes across first in memory, which it comes
across last, alphabetical order, random choice (though mySQL is always
consistent in which it chooses) ... or is there some other way it will
make the decision?

Furthermore, and more confusingly, mySQL chooses differently if I
parameterise the SQL.  For example:

mysql> create table test
-> select t.gene_id, x.display_id,e.db_name
->   from ensembl_core_test.objectXref as ox,
->ensembl_core_test.Xref   as x,
->ensembl_core_test.transcript as t,
->   ensembl_core_test.externalDB as e
->  where e.db_name='SPTREMBL' and
->x.xrefID = ox.xrefID and
->t.translation_id = ox.ensembl_id and
->e.externalDBId=x.externalDBId and gene_id = 24173
->   group by gene_id
->   order by gene_id;
Query OK, 1 row affected (0.00 sec)
Records: 1  Duplicates: 0  Warnings: 0

mysql> select * from test;
+-++--+
| gene_id | display_id | db_name  |
+-++--+
|   24173 | Q9H701 | SPTREMBL |
+-++--+
1 row in set (0.00 sec)


compare this to:


mysql> create table test
-> select t.gene_id, x.display_id,e.db_name
->   from ensembl_core_test.objectXref as ox,
->ensembl_core_test.Xref   as x,
->ensembl_core_test.transcript as t,
->   ensembl_core_test.externalDB as e
->  where e.db_name='SPTREMBL' and
->x.xrefID = ox.xrefID and
->t.translation_id = ox.ensembl_id and
->e.externalDBId=x.externalDBId
->   group by gene_id
->   order by gene_id;
Query OK, 11674 rows affected (6.84 sec)
Records: 11674  Duplicates: 0  Warnings: 0

mysql> select * from test where gene_id = 24173;
+-++--+
| gene_id | display_id | db_name  |
+-++--+
|   24173 | Q96GS5 | SPTREMBL |
+-++--+
1 row in set (0.01 sec)


So here we see that mySQL has chosen differently simply because of the
presence or absence of the gene_id = 24173 in the create table command

I want to try and figure out why mySQL is making these rather
inconsistant decisions and see if it's possible to remove this "feature"

Thanks for your time, if there is another mailing list which is more
appropriate, please tell me!

Thanks
Mick


-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail
<[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://

group by help?

2002-04-19 Thread Mick Watson

Hi

I am having some trouble figuring out how mySQL interprets the group by
clause in a particular set of data that I have (the ensembl database if
anyone is familar with it!)

Now, I have some data like this:

+-++--+
| gene_id | display_id | db_name  |
+-++--+
|   24173 | Q9H701 | SPTREMBL |
|   24173 | Q96GS5 | SPTREMBL |
+-++--+

and this is produced by the following SQL:

 select t.gene_id, x.display_id,e.db_name
  from ensembl_core_test.objectXref as ox,
ensembl_core_test.Xref   as x,
ensembl_core_test.transcript as t,
ensembl_core_test.externalDB as e
 where e.db_name='SPTREMBL' and
x.xrefID = ox.xrefID and
   t.translation_id = ox.ensembl_id and
   e.externalDBId=x.externalDBId and gene_id =24173;

Now, the SQL is not important, what is is that we have two display_ids
for one gene_id.  Now, if we add a "group by gene_id" clause into the
above SQL, then presumably mySQL must make an arbitrary decision on
which display_id to choose.  And the odd thing is that in the main
ensembl database it chooses one, and in my local copy it chooses the
other!

So what I want to figure out is how mySQL makes that arbitrary decision
- is it based on which it comes across first in memory, which it comes
across last, alphabetical order, random choice (though mySQL is always
consistent in which it chooses) ... or is there some other way it will
make the decision?

Furthermore, and more confusingly, mySQL chooses differently if I
parameterise the SQL.  For example:

mysql> create table test
-> select t.gene_id, x.display_id,e.db_name
->   from ensembl_core_test.objectXref as ox,
->ensembl_core_test.Xref   as x,
->ensembl_core_test.transcript as t,
->   ensembl_core_test.externalDB as e
->  where e.db_name='SPTREMBL' and
->x.xrefID = ox.xrefID and
->t.translation_id = ox.ensembl_id and
->e.externalDBId=x.externalDBId and gene_id = 24173
->   group by gene_id
->   order by gene_id;
Query OK, 1 row affected (0.00 sec)
Records: 1  Duplicates: 0  Warnings: 0

mysql> select * from test;
+-++--+
| gene_id | display_id | db_name  |
+-++--+
|   24173 | Q9H701 | SPTREMBL |
+-++--+
1 row in set (0.00 sec)


compare this to:


mysql> create table test
-> select t.gene_id, x.display_id,e.db_name
->   from ensembl_core_test.objectXref as ox,
->ensembl_core_test.Xref   as x,
->ensembl_core_test.transcript as t,
->   ensembl_core_test.externalDB as e
->  where e.db_name='SPTREMBL' and
->x.xrefID = ox.xrefID and
->t.translation_id = ox.ensembl_id and
->e.externalDBId=x.externalDBId
->   group by gene_id
->   order by gene_id;
Query OK, 11674 rows affected (6.84 sec)
Records: 11674  Duplicates: 0  Warnings: 0

mysql> select * from test where gene_id = 24173;
+-++--+
| gene_id | display_id | db_name  |
+-++--+
|   24173 | Q96GS5 | SPTREMBL |
+-++--+
1 row in set (0.01 sec)


So here we see that mySQL has chosen differently simply because of the
presence or absence of the gene_id = 24173 in the create table command

I want to try and figure out why mySQL is making these rather
inconsistant decisions and see if it's possible to remove this "feature"

Thanks for your time, if there is another mailing list which is more
appropriate, please tell me!

Thanks
Mick


-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php