Re: Help needed with Query UPDATE INFO

2001-04-27 Thread Eric Fitzgerald

Ok, a few things I see...first off, slap some keys into those tables on the
join portions.  Secondly, upon examining your query a second time, your
never joining subsnp and locuslink.  I see your trying to use a full join,
but you didn't put the criteria for it in the where clause.  That would
cause some VERY weird join results, and may be your problem.  Try adding in
the join clause in the where clause.

- Original Message -
From: "Bryan Coon" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Friday, April 27, 2001 2:52 PM
Subject: RE: Help needed with Query UPDATE INFO


> Okay, here is the full monty from the database regarding the query in
> question, any suggestions on how to improve any of it are greatly
> appreciated!
>
> Also, if any more info is needed, just let me know.
>
> Thanks!
> Bryan
>
>
> QUERY:
> mysql> select distinct subsnp_pk as sqnmid, c.chrom, chrompos, source as
> panel,  null as first_pcrp, null as second_pcrp, null as prb_seq, null as
> term,  null as validation, null as freq, null as population  from
locuslink
> l, subsnp left outer join chromosome_position c  on subsnp_pk = c.snp_fk
> left outer join locus_annotation a  on subsnp_pk = a.snp_fk where
a.locusid
> = substring(l.locusid, 4) and description like '%GABA%' order by
subsnp_pk;
>
> EXPLAIN:
>
++--+---+--+-+--+---+---
> --+
> | table  | type | possible_keys | key  | key_len | ref  | rows  | Extra
> |
>
++--+---+--+-+--+---+---
> --+
> | subsnp | ALL  | NULL  | NULL |NULL | NULL |  1201 | Using
> temporary; Using filesort |
> | l  | ALL  | NULL  | NULL |NULL | NULL | 21294 | where
used
> |
> | c  | ALL  | NULL  | NULL |NULL | NULL |   579 |
> |
> | a  | ALL  | NULL  | NULL |NULL | NULL |   704 | where
> used; Distinct|
>
++--+---+--+-+--+---+---
> --+
> 4 rows in set (0.00 sec)
>
> DESCRIBE:
> mysql> describe locuslink;
> ++--+--+-+-+---+
> | Field  | Type | Null | Key | Default | Extra |
> ++--+--+-+-+---+
> | LOCUSID| varchar(12)  |  | | |   |
> | SYMBOL | varchar(15)  | YES  | | NULL|   |
> | INTERIM_SYMBOL | varchar(15)  | YES  | | NULL|   |
> | MIM_NUMBER | varchar(15)  | YES  | | NULL|   |
> | CHROM  | varchar(6)   | YES  | | NULL|   |
> | BAND   | varchar(20)  | YES  | | NULL|   |
> | DESCRIPTION| varchar(150) | YES  | | NULL|   |
> | SPECIES| varchar(10)  | YES  | | NULL|   |
> | SOURCESEQ  | varchar(15)  | YES  | | NULL|   |
> ++--+--+-+-+---+
> 9 rows in set (0.01 sec)
>
> mysql> describe subsnp;
> ++-+--+-+-+---+
> | Field  | Type| Null | Key | Default | Extra |
> ++-+--+-+-+---+
> | SUBSNP_PK  | int(11) |  | PRI | 0   |   |
> | SOURCE | varchar(40) |  | | |   |
> | SOURCEID   | varchar(20) |  | | |   |
> | SNPPOSITION| int(11) |  | | 0   |   |
> | TOTALSEQLENGTH | int(11) |  | | 0   |   |
> | VARIATION  | varchar(20) | YES  | | NULL|   |
> | NUM_ALLELES| int(11) | YES  | | NULL|   |
> ++-+--+-+-+---+
> 7 rows in set (0.00 sec)
>
> mysql> describe chromosome_position;
> +--++--+-+-+---+
> | Field| Type   | Null | Key | Default | Extra |
> +--++--+-+-+---+
> | SNP_FK   | int(11)|  | | 0   |   |
> | CHROM| varchar(5) | YES  | | NULL|   |
> | CHROMPOS | int(11)| YES  | | NULL|   |
> +--++--+-+-+---+
> 3 rows in set (0.00 sec)
>
> mysql> describe locus_annotation;
> +-+-+--+-+-+---+
> | Field   | Type| Null | Key | Default | Extra |
> +-+-+--+-+-+---+
> | SNP_FK  | int(11) |  | | 0   |   |
> | LOCUS  

RE: Help needed with Query UPDATE INFO

2001-04-27 Thread Bryan Coon

Okay, here is the full monty from the database regarding the query in
question, any suggestions on how to improve any of it are greatly
appreciated!

Also, if any more info is needed, just let me know.

Thanks!
Bryan


QUERY:
mysql> select distinct subsnp_pk as sqnmid, c.chrom, chrompos, source as
panel,  null as first_pcrp, null as second_pcrp, null as prb_seq, null as
term,  null as validation, null as freq, null as population  from locuslink
l, subsnp left outer join chromosome_position c  on subsnp_pk = c.snp_fk
left outer join locus_annotation a  on subsnp_pk = a.snp_fk where a.locusid
= substring(l.locusid, 4) and description like '%GABA%' order by subsnp_pk;

EXPLAIN:
++--+---+--+-+--+---+---
--+
| table  | type | possible_keys | key  | key_len | ref  | rows  | Extra
|
++--+---+--+-+--+---+---
--+
| subsnp | ALL  | NULL  | NULL |NULL | NULL |  1201 | Using
temporary; Using filesort |
| l  | ALL  | NULL  | NULL |NULL | NULL | 21294 | where used
|
| c  | ALL  | NULL  | NULL |NULL | NULL |   579 |
|
| a  | ALL  | NULL  | NULL |NULL | NULL |   704 | where
used; Distinct|
++--+---+--+-+--+---+---
--+
4 rows in set (0.00 sec)

DESCRIBE:
mysql> describe locuslink;
++--+--+-+-+---+
| Field  | Type | Null | Key | Default | Extra |
++--+--+-+-+---+
| LOCUSID| varchar(12)  |  | | |   |
| SYMBOL | varchar(15)  | YES  | | NULL|   |
| INTERIM_SYMBOL | varchar(15)  | YES  | | NULL|   |
| MIM_NUMBER | varchar(15)  | YES  | | NULL|   |
| CHROM  | varchar(6)   | YES  | | NULL|   |
| BAND   | varchar(20)  | YES  | | NULL|   |
| DESCRIPTION| varchar(150) | YES  | | NULL|   |
| SPECIES| varchar(10)  | YES  | | NULL|   |
| SOURCESEQ  | varchar(15)  | YES  | | NULL|   |
++--+--+-+-+---+
9 rows in set (0.01 sec)

mysql> describe subsnp;
++-+--+-+-+---+
| Field  | Type| Null | Key | Default | Extra |
++-+--+-+-+---+
| SUBSNP_PK  | int(11) |  | PRI | 0   |   |
| SOURCE | varchar(40) |  | | |   |
| SOURCEID   | varchar(20) |  | | |   |
| SNPPOSITION| int(11) |  | | 0   |   |
| TOTALSEQLENGTH | int(11) |  | | 0   |   |
| VARIATION  | varchar(20) | YES  | | NULL|   |
| NUM_ALLELES| int(11) | YES  | | NULL|   |
++-+--+-+-+---+
7 rows in set (0.00 sec)

mysql> describe chromosome_position;
+--++--+-+-+---+
| Field| Type   | Null | Key | Default | Extra |
+--++--+-+-+---+
| SNP_FK   | int(11)|  | | 0   |   |
| CHROM| varchar(5) | YES  | | NULL|   |
| CHROMPOS | int(11)| YES  | | NULL|   |
+--++--+-+-+---+
3 rows in set (0.00 sec)

mysql> describe locus_annotation;
+-+-+--+-+-+---+
| Field   | Type| Null | Key | Default | Extra |
+-+-+--+-+-+---+
| SNP_FK  | int(11) |  | | 0   |   |
| LOCUS   | varchar(20) | YES  | | NULL|   |
| LOCUSID | int(11) | YES  | | NULL|   |
+-+-+--+-+-+---+
3 rows in set (0.00 sec)

mysql> 

KEYS:
mysql> show keys from locuslink;
Empty set (0.00 sec)

mysql> show keys from subsnp;
+++--+--+-+---+-
+--++-+
| Table  | Non_unique | Key_name | Seq_in_index | Column_name | Collation |
Cardinality | Sub_part | Packed | Comment |
+++--+--+-+---+-
+--++-+
| subsnp |  0 | PRIMARY  |1 | SUBSNP_PK   | A |
1201 | NULL | NULL   | |
+++--+--+-+---+-
+--++-+
1 row in set (0.00 sec)

mysql> show keys from chromosome_position;
Empty set (0.00 sec)

mysql> show keys from locus_annotation;
Empty set (0.00 sec)

mysql> 


-
Before posting, please check:
   http://www.mysql.com

Re: Help needed with Query

2001-04-27 Thread Eric Fitzgerald

Please do a "SHOW FIELDS FROM" and "SHOW KEYS FROM" on all tables involved
so we can see what's going on here.  Also, a warning, if desacription is
indexed, by doing LIKE '%GABA%' with wildcard at beginning, it won't use
index's.

Would also help if you did an EXPLAIN on that query and sent that.

- Original Message -
From: "Bryan Coon" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Friday, April 27, 2001 1:20 PM
Subject: Help needed with Query


> I am working on a mysql database with many large tables (1.5 million rows
on
> some) and came across a beeg problem.
>
> I have need for a query like this:
> mysql> select distinct subsnp_pk as sqnmid, c.chrom, chrompos, source as
> panel,  null as first_pcrp, null as second_pcrp, null as prb_seq, null as
> term,  null as validation, null as freq, null as population  from
locuslink
> l, subsnp left outer join chromosome_position c  on subsnp_pk = c.snp_fk
> left outer join locus_annotation a  on subsnp_pk = a.snp_fk where
a.locusid
> = substring(l.locusid, 4) and description like '%GABA%' order by
subsnp_pk;
>
> Which seems insane.  This 4 table join in another situation would become a
6
> table join.  As it is, it takes 1min 10.60 seconds for this query to
return
> one result.  Clearly I am not a MySQL expert, and even though this query
> works, there must be a better way to do this.
>
> Is it a matter of indexing?  A matter of poor schema design?  Poor query
> design?  Is it unavoidable?
>
> As far as I know, none of the tables are indexed (I did not set up the
> schema).
>
> Please let me know if more info is required to help with this.
>
> Thanks,
> Bryan
>
>
> -
> Before posting, please check:
>http://www.mysql.com/manual.php   (the manual)
>http://lists.mysql.com/   (the list archive)
>
> To request this thread, e-mail <[EMAIL PROTECTED]>
> To unsubscribe, e-mail <[EMAIL PROTECTED]>
> Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php
>
>
>


-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




RE: Help needed with Query

2001-04-27 Thread Ravi Raman

hi.

by skimming that query, the first thing that will take a long time is the
'description like "%GABA%"' part...that's a pretty inefficient method of
searching due to the wildcard at the beginning of the string.

if there are no indexes on the tables, you should probably add some.
"show index from tablename" will tell you for sure.

check out the mysql manual section re: indexes for more information...also
use the keyword EXPLAIN before the select query to find out where/if indexes
are being used and where they would likely be the most helpful.

hth.
-ravi.

-Original Message-
From: Bryan Coon [mailto:[EMAIL PROTECTED]]
Sent: Friday, April 27, 2001 4:20 PM
To: '[EMAIL PROTECTED]'
Subject: Help needed with Query


I am working on a mysql database with many large tables (1.5 million rows on
some) and came across a beeg problem.

I have need for a query like this:
mysql> select distinct subsnp_pk as sqnmid, c.chrom, chrompos, source as
panel,  null as first_pcrp, null as second_pcrp, null as prb_seq, null as
term,  null as validation, null as freq, null as population  from locuslink
l, subsnp left outer join chromosome_position c  on subsnp_pk = c.snp_fk
left outer join locus_annotation a  on subsnp_pk = a.snp_fk where a.locusid
= substring(l.locusid, 4) and description like '%GABA%' order by subsnp_pk;

Which seems insane.  This 4 table join in another situation would become a 6
table join.  As it is, it takes 1min 10.60 seconds for this query to return
one result.  Clearly I am not a MySQL expert, and even though this query
works, there must be a better way to do this.

Is it a matter of indexing?  A matter of poor schema design?  Poor query
design?  Is it unavoidable?

As far as I know, none of the tables are indexed (I did not set up the
schema).

Please let me know if more info is required to help with this.

Thanks,
Bryan


-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php



-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




Help needed with Query

2001-04-27 Thread Bryan Coon

I am working on a mysql database with many large tables (1.5 million rows on
some) and came across a beeg problem.

I have need for a query like this:
mysql> select distinct subsnp_pk as sqnmid, c.chrom, chrompos, source as
panel,  null as first_pcrp, null as second_pcrp, null as prb_seq, null as
term,  null as validation, null as freq, null as population  from locuslink
l, subsnp left outer join chromosome_position c  on subsnp_pk = c.snp_fk
left outer join locus_annotation a  on subsnp_pk = a.snp_fk where a.locusid
= substring(l.locusid, 4) and description like '%GABA%' order by subsnp_pk;

Which seems insane.  This 4 table join in another situation would become a 6
table join.  As it is, it takes 1min 10.60 seconds for this query to return
one result.  Clearly I am not a MySQL expert, and even though this query
works, there must be a better way to do this.

Is it a matter of indexing?  A matter of poor schema design?  Poor query
design?  Is it unavoidable?

As far as I know, none of the tables are indexed (I did not set up the
schema).

Please let me know if more info is required to help with this.

Thanks,
Bryan


-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php