indexing on column having duplicate values
Gurus, I am going to have a big table with a lot of records. To expedite searching, I wanted to put an index on a key field (which is a numeric value). But there will be records that have the same value for the key field (the other columns will be different), so how can I do this? Right now I am getting an error about duplicate entries, and they are being discarded. All entries are important, and I have to find a way to locate records based on this key field. Thanks for the help. Rajeev -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/mysql
Re: indexing on column having duplicate values
On 28.05.2014 21:43, Rajeev Prasad wrote: I am going to have a big table with a lot of records, and to expedite searching I wanted to index on a key field (a numeric value). But there will be records with the same value for the key field, and right now I am getting an error about duplicate entries being discarded.

Who said that a key needs to be unique? Just use phpMyAdmin to learn such things: there you can see the different types of keys, and after assigning one you get the SQL command.
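Reindl's point is that only PRIMARY and UNIQUE keys reject duplicates; a plain secondary index accepts repeated values. A minimal sketch, using hypothetical table and column names (`events`, `key_field`) for illustration:

```sql
-- A non-unique secondary index: duplicate key_field values are allowed.
CREATE TABLE events (
  key_field INT NOT NULL,      -- duplicates permitted here
  payload   VARCHAR(255)
);
CREATE INDEX idx_key_field ON events (key_field);

-- Both rows are accepted; a UNIQUE index would have rejected the second.
INSERT INTO events VALUES (42, 'first'), (42, 'second');
```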
Re: indexing on column having duplicate values
On 28.05.2014 22:29, Rajeev Prasad wrote: I am using phpMyAdmin; I looked closely and found "Index". Is this the index you are talking about? (Earlier I used "Primary".)

Surely "Primary" is a unique key; honestly, consider reading some manuals.

My further question is: the index key here is going to be epoch system time, which I currently have as a 10-digit integer. The table will grow very fast (about 5 million rows within 12 months) and will be updated at least twice in 24 hours, so I have to consider the price I am going to pay to update the index on each INSERT (about ~42,000 records every 12 hours).

You need to try it out when in doubt.

What would you all suggest? Should I have an index key? All searches will mostly be based on the index key, and sometimes on another unindexed column. Should I change my index key to a character type? Should I index the other unindexed column as well? Or should I simply query an unindexed table?

Those are basic questions: http://dev.mysql.com/doc/refman/5.5/en/optimization-indexes.html
Re: indexing on column having duplicate values
(Re-sending; I got an error from Yahoo.) Thanks, Reindl. I am using phpMyAdmin; I looked closely and found "Index". Is this the index you are talking about? (Earlier I used "Primary".) My further question is: the index key here is going to be epoch system time, which I currently have as a 10-digit integer. The table will grow very fast (about 5 million rows within 12 months) and will be updated at least twice in 24 hours, so I have to consider the price I am going to pay to update the index on each INSERT (about ~42,000 records every 12 hours). What would you all suggest? Should I have an index key? All searches will mostly be based on the index key, and sometimes on another unindexed column. Should I change my index key to a character type? Should I index the other unindexed column as well? Or should I simply query an unindexed table? Thanks and regards, Rajeev

On Wednesday, May 28, 2014 2:50 PM, Reindl Harald h.rei...@thelounge.net wrote: Who said that a key needs to be unique? Just use phpMyAdmin to learn such things: there you can see the different types of keys, and after assigning one you get the SQL command.
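For the access pattern described above (frequent lookups by an epoch timestamp, with duplicates allowed), the usual answer is a non-unique index on the integer column; keeping it numeric is cheaper to compare and store than a character type. A sketch with hypothetical names (`log_table`, `epoch_ts`), plus EXPLAIN to check whether the optimizer actually uses the index:

```sql
-- Hypothetical table: epoch seconds stored as an integer, duplicates allowed.
CREATE TABLE log_table (
  epoch_ts INT UNSIGNED NOT NULL,  -- epoch system time; not unique
  detail   VARCHAR(255),
  INDEX idx_epoch (epoch_ts)       -- non-unique secondary index
);

-- EXPLAIN reports whether idx_epoch is chosen for this lookup.
EXPLAIN SELECT * FROM log_table WHERE epoch_ts = 1401308580;
```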
Re: indexing on column having duplicate values
On 28.05.2014 22:39, Rajeev Prasad wrote: (Re-sending; I got an error from Yahoo.)

Your previous message made it off-list to me. *Don't use reply-all on mailing lists.*
Re: Indexing about 40 Billion Entries
Thank you a lot. The first indexing process finished after about 13 hours, so I think the problem is solved now. I set myisam_sort_buffer_size to 10GB. For the query, I will switch to your version.

On 20.06.2012 23:32, Rick James wrote:

SELECT ruid1, ruid2, overlap FROM l4_link WHERE ruid1=xxx OR ruid2=xxx ORDER BY overlap DESC

MySQL does not optimize that kind of OR well. This will run _much_ faster (with your two indexes):

( SELECT ruid1, ruid2, overlap FROM l4_link WHERE ruid1=xxx
  UNION
  SELECT ruid1, ruid2, overlap FROM l4_link WHERE ruid2=xxx
) ORDER BY overlap DESC;

Make it UNION ALL or UNION DISTINCT depending on whether xxx can be in both fields of one row; UNION DISTINCT makes a pass over the temp table to dedup. Your version (OR) will do a table scan. My version will:
1. scan the ruid1 index (.MYI) (1+ disk hits for the index range scan)
2. for each matching ruid1, fetch the data row (in .MYD) (1 hit/row) and write it to a temp table
3, 4. ditto for ruid2, appending to the same temp table
5. sort the temp table (ORDER BY) (probably 0 disk hits, due to using MEMORY)

Once the indexes are in place... Depending on the version you are using, key_buffer_size is limited to 4GB or is not. For your huge machine, 24GB might be wise. The key buffer will help steps 1 and 3; the rest of RAM will be available for the OS to cache the data blocks (steps 2 and 4). As for the 16 cores: a single query (even with the UNION I suggested) will use only one core. How many rows (average, max) do you expect from SELECT ... FROM l4_link WHERE ruid1=xxx? I ask because that might run as slowly as 100 rows/sec., simply because of the disk hits.

I'm expecting an average of 5000 rows.

PARTITIONing, per se, does not help performance. There are only a few use cases where PARTITION shines. (I have not seen such [yet] in your application.)
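Rick's claim (the OR over two differently indexed columns degenerates to a table scan, while the UNION rewrite uses both indexes) can be verified with EXPLAIN. A sketch, with 12345 as a placeholder value; note that newer MySQL versions may instead report an index_merge union for the OR form:

```sql
-- Likely shows type=ALL (full table scan) despite both single-column indexes:
EXPLAIN SELECT ruid1, ruid2, overlap
FROM l4_link WHERE ruid1 = 12345 OR ruid2 = 12345;

-- Each branch should show type=ref on its own index:
EXPLAIN
SELECT ruid1, ruid2, overlap FROM l4_link WHERE ruid1 = 12345
UNION
SELECT ruid1, ruid2, overlap FROM l4_link WHERE ruid2 = 12345
ORDER BY overlap DESC;
```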
-Original Message- From: Christian Koetteritzsch [mailto:ckoetteritz...@e-humanities.net] Sent: Wednesday, June 20, 2012 1:42 PM To: Rick James Cc: Ananda Kumar; mysql@lists.mysql.com Subject: Re: Indexing about 40 Billion Entries …
Re: Indexing about 40 Billion Entries
On 20/06/2012 11:45, Christian Koetteritzsch wrote: Hi guys, as the title says, I'm trying to index 40 billion entries with two indexes on a server with 16 cores and 128GB RAM. The table is a MyISAM table, and the *.MYD file is about 640GB.

Hiya. I am unable to help, but one question that was running through my mind as I was reading this is: how do you do your backups? I use mylvmbackup on a slave in a master-slave replication setup, but I would love to know how and what you use to do your backups. Hope you come right with your problem. Thanks, Brent
Re: Indexing about 40 Billion Entries
At 02:04 AM 6/21/2012, you wrote: Thank you a lot. The first indexing process finished after about 13 hours, so I think the problem is solved now. I set myisam_sort_buffer_size to 10GB.

The first indexing process??? You should have created all of your indexes with one ALTER statement. Otherwise it will take another 13+ hours to build the second index, because it has to create a copy of the table all over again. Mike
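Mike's point (one table rebuild instead of two) looks like this in SQL, sketched against the l4_link table from this thread:

```sql
-- Both secondary indexes built in a single pass over the table:
ALTER TABLE l4_link
  ADD INDEX idx_ruid1 (ruid1),
  ADD INDEX idx_ruid2 (ruid2);
```

Running two separate ALTER TABLE ... ADD INDEX statements on a MyISAM table of this size copies the 640GB data file once per statement; combining them into one ALTER halves the work.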
Re: Indexing about 40 Billion Entries
Hi, thanks for the hint. I was testing the settings for myisam_sort_buffer_size, so I totally forgot it. But I have another three of these databases to do, so next time I'll do it with one ALTER statement. Christian

On 21.06.2012 16:50, mos wrote: The first indexing process??? You should have created all of your indexes with one ALTER statement. …
RE: Indexing about 40 Billion Entries
Possibly worse than that, since it will rebuild the 'first' index again.

-Original Message- From: mos [mailto:mo...@fastmail.fm] Sent: Thursday, June 21, 2012 7:51 AM To: mysql@lists.mysql.com Subject: Re: Indexing about 40 Billion Entries …
Indexing about 40 Billion Entries
Hi guys, as the title says, I'm trying to index 40 billion entries with two indexes on a server with 16 cores and 128GB RAM. The table is the one below, and it is a MyISAM table. The *.MYD file is about 640GB.

DROP TABLE IF EXISTS `l4_link`;
CREATE TABLE `l4_link` (
  `ruid1` int NOT NULL,
  `ruid2` int NOT NULL,
  `overlap` int NOT NULL
);

I need an index for ruid1 and for ruid2. The status for this table is the following:

Name: l4_link
Engine: MyISAM
Version: 10
Row_format: Fixed
Rows: 39806500262
Avg_row_length: 17
Data_length: 676710504454
Max_data_length: 4785074604081151
Index_length: 1024
Data_free: 0
Auto_increment: NULL
Create_time: 2012-06-19 14:51:29
Update_time: 2012-06-19 16:26:35
Check_time: NULL
Collation: utf8_general_ci
Checksum: NULL
Create_options:
Comment:

The variables for MyISAM are the following:

mysql> show global variables like '%myisam%';
+---------------------------+----------------------+
| Variable_name             | Value                |
+---------------------------+----------------------+
| myisam_data_pointer_size  | 6                    |
| myisam_max_sort_file_size | 9223372036853727232  |
| myisam_mmap_size          | 18446744073709551615 |
| myisam_recover_options    | BACKUP               |
| myisam_repair_threads     | 1                    |
| myisam_sort_buffer_size   | 8388608              |
| myisam_stats_method       | nulls_unequal        |
| myisam_use_mmap           | OFF                  |
+---------------------------+----------------------+
8 rows in set (0.00 sec)

The temp folder has about 16TB of free space. When I start the indexing process, it copies the 640GB into a temp file and then starts with "Repair with keycache". On the internet I found that if it says "Repair with keycache" you should increase myisam_max_sort_file_size, but this didn't work. It still says "Repair with keycache" after it has copied the data to the temp dir. I hope you have an idea how to fix this. Thanks in advance. Christian
Re: Indexing about 40 Billion Entries
It looks like the value that you gave for myisam_max_sort_file_size is not enough for the index creation, and hence it is doing a REPAIR WITH KEYCACHE. Use the query below to find the minimum value required for myisam_max_sort_file_size to avoid repair with keycache:

select a.index_name as index_name,
  ceil( (select count(*) from `db_name`.`table_name`) * (
    @@global.myisam_data_pointer_size +
    sum(ifnull(1 + `a`.`sub_part` * (`b`.`character_octet_length` / `b`.`character_maximum_length`),
      ifnull(1 + `b`.`character_octet_length`,
        case when `b`.`data_type` = 'tinyint'   then 1
             when `b`.`data_type` = 'smallint'  then 2
             when `b`.`data_type` = 'mediumint' then 3
             when `b`.`data_type` = 'int'       then 4
             when `b`.`data_type` = 'bigint'    then 8
             when `b`.`data_type` = 'float'     then 4
             when `b`.`data_type` = 'double'    then 8
             when `b`.`data_type` = 'real'      then 8
             when `b`.`data_type` = 'bit'       then 8
             when `b`.`data_type` = 'date'      then 3
             when `b`.`data_type` = 'datetime'  then 8
             when `b`.`data_type` = 'timestamp' then 4
             when `b`.`data_type` = 'time'      then 3
             when `b`.`data_type` = 'year'      then 1
             when `b`.`data_type` = 'enum'      then 2
             when `b`.`data_type` = 'set'       then 8
             when `b`.`data_type` = 'decimal'   then 8
        end ) )
    + if(`a`.`nullable` = 'YES', 1, 0)
    + if(`b`.`character_octet_length` >= 255, 2, 0)
  ) / 1048576 ) * 1048576 as `index_size`
from `information_schema`.`statistics` `a`, `information_schema`.`columns` `b`
where `a`.`table_name` = `b`.`table_name`
  and `a`.`table_schema` = `b`.`table_schema`
  and `a`.`column_name` = `b`.`column_name`
  and `a`.`table_schema` = 'db_name'
  and `a`.`table_name` = 'table_name'
group by `a`.`index_name`
order by `index_size` desc
limit 1;

On Wed, Jun 20, 2012 at 3:15 PM, Christian Koetteritzsch ckoetteritz...@e-humanities.net wrote: …
RE: Indexing about 40 Billion Entries
Even if you get past the REPAIR WITH KEYCACHE, the ALTER will still take days, maybe weeks. I strongly recommend you do not try to access that many rows directly. Instead, build summary tables and access them. We can discuss further. Unfortunately, you did not even include a PRIMARY KEY when you built the table. This makes any operations slow. "I need an index for ruid1 and for ruid2." What will your SELECTs look like? You may be better off with INDEX(ruid1, ruid2, overlap) and INDEX(ruid2, ruid1, overlap). Will you be adding more rows to this table? Or is it now static? What are the semantics of the fields?

-Original Message- From: Ananda Kumar [mailto:anan...@gmail.com] Sent: Wednesday, June 20, 2012 4:37 AM To: Christian Koetteritzsch Cc: mysql@lists.mysql.com Subject: Re: Indexing about 40 Billion Entries …
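Rick's suggested covering indexes would let a query like SELECT ruid1, ruid2, overlap FROM l4_link WHERE ruid1=xxx be answered from the index alone, without touching the 640GB .MYD data file. A sketch, combined into a single ALTER as also recommended elsewhere in the thread:

```sql
-- Each index covers one lookup pattern: the WHERE column leads, and the
-- other two columns ride along so no data-row fetch is needed.
ALTER TABLE l4_link
  ADD INDEX idx_r1 (ruid1, ruid2, overlap),
  ADD INDEX idx_r2 (ruid2, ruid1, overlap);
```

With these in place, EXPLAIN on a query over one of the indexed columns should show "Using index" in the Extra column.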
Re: Indexing about 40 Billion Entries
Thanks for the information. It is no problem if it takes days or weeks, because the server is specifically for such tasks that take time and use lots of resources.

On 20.06.2012 19:55, Rick James wrote: Even if you get past the REPAIR WITH KEYCACHE, the ALTER will still take days, maybe weeks. I strongly recommend you do not try to access that many rows directly. Instead, build summary tables, and access them. We can discuss further.

Did you mean that I should make several partitions with, for example, 1 billion entries each, and then take the union of the results from the partitions?

Unfortunately, you did not even include a PRIMARY KEY when you built the table. This makes any operations slow.

There is no PRIMARY KEY because ruid1 and ruid2 are not unique values. I need an index for ruid1 and for ruid2.

What will your SELECTs look like? You may be better off with INDEX(ruid1, ruid2, overlap), and INDEX(ruid2, ruid1, overlap). Will you be adding more rows to this table? Or is it now static?

No data will be added to this table, so it is a static table. And the indexes, as I have planned them, will be INDEX(ruid1) and INDEX(ruid2). My SELECT looks like this: SELECT ruid1, ruid2, overlap FROM l4_link WHERE ruid1=xxx OR ruid2=xxx (and the xxx are the same int values).

What are the semantics of the fields?

The semantics behind these fields are that ruid1 and ruid2 are IDs for sentences that have common words, and overlap is the number of words they have in common. I hope the information I gave you is helpful. If you have more questions, just ask.
Kind regards, Christian

-Original Message- From: Ananda Kumar [mailto:anan...@gmail.com] Sent: Wednesday, June 20, 2012 4:37 AM To: Christian Koetteritzsch Cc: mysql@lists.mysql.com Subject: Re: Indexing about 40 Billion Entries …
Re: Indexing about 40 Billion Entries
Sorry, I forgot the ORDER BY in the SELECT statement, so the correct SELECT statement is:

SELECT ruid1, ruid2, overlap FROM l4_link WHERE ruid1=xxx OR ruid2=xxx ORDER BY overlap DESC

On 20.06.2012 22:42, Christian Koetteritzsch wrote: Thanks for the information. It is no problem if it takes days or weeks, because the server is specifically for such tasks. …
Re: Indexing about 40 Billion Entries
On 6/20/2012 5:45 AM, Christian Koetteritzsch wrote: Hi guys, as the title says I'm trying to index 40 billion entries with two indexes on a server with 16 cores and 128GB RAM. The table is the one below, and it is a MyISAM table; the *.MYD file is about 640GB. DROP TABLE IF EXISTS `l4_link`; CREATE TABLE `l4_link` ( `ruid1` int NOT NULL, `ruid2` int NOT NULL, `overlap` int NOT NULL ); I need an index for ruid1 and for ruid2.

Actually, based on your proposed query, I believe you want an index on (ruid1, ruid2), not separate indexes for each column. …
Try maxing out the following settings:

  myisam_sort_buffer_size
  key_buffer_size

http://dev.mysql.com/doc/refman/5.5/en/server-system-variables.html#sysvar_myisam_sort_buffer_size
http://dev.mysql.com/doc/refman/5.5/en/server-system-variables.html#sysvar_key_buffer_size

You will need to experiment with how large a maximum value will be permitted for your platform.

-- Shawn Green, MySQL Principal Technical Support Engineer, Oracle USA, Inc. - Hardware and Software, Engineered to Work Together. Office: Blountville, TN

-- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/mysql
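For reference, the two settings Shawn names could be raised in a my.cnf fragment like the sketch below. The buffer sizes are illustrative assumptions for a 128GB machine (24G for key_buffer_size follows Rick's later suggestion in this thread), not tested recommendations; as Shawn says, experiment with what your platform permits.

```ini
# Hypothetical my.cnf sketch for a large MyISAM index build -- values are
# assumptions for a 128GB server, not thread-tested recommendations.
[mysqld]
# Buffer used when sorting index entries during REPAIR / index rebuild:
myisam_sort_buffer_size   = 8G
# Cache for MyISAM index blocks (also helps the keycache repair path):
key_buffer_size           = 24G
# Must exceed the projected index file size, or MySQL falls back to the
# much slower "repair with keycache" method (value from this thread):
myisam_max_sort_file_size = 9223372036853727232
```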
RE: Indexing about 40 Billion Entries
(ruid1, ruid2) will help for AND, but not at all for OR.

-----Original Message-----
From: Shawn Green [mailto:shawn.l.gr...@oracle.com]
Sent: Wednesday, June 20, 2012 2:30 PM
To: mysql@lists.mysql.com
Subject: Re: Indexing about 40 Billion Entries

[quoted text snipped -- see Shawn's message above]

-- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/mysql
RE: Indexing about 40 Billion Entries
SELECT ruid1, ruid2, overlap FROM l4_link WHERE ruid1=xxx OR ruid2=xxx ORDER BY overlap DESC

MySQL does not optimize that kind of OR well. This will run _much_ faster (with your two indexes):

( SELECT ruid1, ruid2, overlap FROM l4_link WHERE ruid1=xxx
  UNION
  SELECT ruid1, ruid2, overlap FROM l4_link WHERE ruid2=xxx
) ORDER BY overlap DESC;

Make it UNION ALL or UNION DISTINCT depending on whether xxx can be in both fields of one row. UNION DISTINCT makes a pass over the temp table to dedup. Your version (OR) will do a table scan. My version will:

1. scan the ruid1 index (.MYI) (1+ disk hits for the index range scan)
2. for each matching ruid1, fetch the data row (in .MYD) (1 hit/row); write it to a temp table
3,4. ditto for ruid2, appending to the same temp table
5. sort the temp table (ORDER BY) (probably 0 disk hits, due to using MEMORY)

Once the indexes are in place... Depending on the version you are using, key_buffer_size is limited to 4G or is not. For your huge machine, 24G might be wise. The key_buffer will help steps 1 and 3; the rest of RAM will be available for the OS to cache the data blocks (steps 2 and 4). As for the 16 cores: a single query (even with the UNION I suggested) will use only one core.

How many rows (average, max) do you expect from

SELECT ... FROM l4_link WHERE ruid1=xxx

I ask because that might run as slowly as 100 rows/sec, simply because of the disk hits. PARTITIONing, per se, does not help performance; there are only a few use cases where PARTITION shines. (I have not seen such [yet] in your application.)

-----Original Message-----
From: Christian Koetteritzsch [mailto:ckoetteritz...@e-humanities.net]
Sent: Wednesday, June 20, 2012 1:42 PM
To: Rick James
Cc: Ananda Kumar; mysql@lists.mysql.com
Subject: Re: Indexing about 40 Billion Entries

Thanks for the information. It is no problem if it takes days or weeks, because the server is specifically for tasks that take time and use lots of resources.
Am 20.06.2012 19:55, schrieb Rick James:
> Even if you get past the REPAIR WITH KEYCACHE, the ALTER will still take
> days, maybe weeks. I strongly recommend you do not try to access that many
> rows directly. Instead, build summary tables, and access them. We can
> discuss further.

Did you mean that I should make several partitions with, for example, 1 billion entries each, and then make a union of the results from the partitions?

> Unfortunately, you did not even include a PRIMARY KEY when you built the
> table. This makes any operations slow.

There is no PRIMARY KEY because ruid1 and ruid2 are not unique values. I need an index for ruid1 and for ruid2.

> What will your SELECTs look like? You may be better off with
> INDEX(ruid1, ruid2, overlap) and INDEX(ruid2, ruid1, overlap).
> Will you be adding more rows to this table? Or is it now static?

No data will be added to this table, so it is a static table. The indexes, as I have planned them, will be INDEX(ruid1) and INDEX(ruid2). My SELECT looks like this:

SELECT ruid1, ruid2, overlap FROM l4_link WHERE ruid1=xxx OR ruid2=xxx

and the xxx are the same int values.

> What are the semantics of the fields?

The semantics behind these fields are that ruid1 and ruid2 are IDs for sentences that have common words, and overlap is the number of words they have in common. I hope the information I gave you is helpful. If you have more questions, then you can ask them.
Kind regards
Christian

-----Original Message-----
From: Ananda Kumar [mailto:anan...@gmail.com]
Sent: Wednesday, June 20, 2012 4:37 AM
To: Christian Koetteritzsch
Cc: mysql@lists.mysql.com
Subject: Re: Indexing about 40 Billion Entries

It looks like the value you give for myisam_max_sort_file_size is not enough for the index creation, and hence it is doing a REPAIR WITH KEYCACHE. Use the query below to find the minimum value required for myisam_max_sort_file_size to avoid "repair with keycache":

select a.index_name as index_name,
ceil( (select count(*) from `db_name`.`table_name`) *
( @@global.myisam_data_pointer_size +
sum(ifnull(1 + `a`.`sub_part` * (`b`.`character_octet_length`/`b`.`character_maximum_length`),
ifnull(1 + `b`.`character_octet_length`,
case when `b`.`data_type` = 'tinyint'   then 1
     when `b`.`data_type` = 'smallint'  then 2
     when `b`.`data_type` = 'mediumint' then 3
     when `b`.`data_type` = 'int'       then 4
     when `b`.`data_type` = 'bigint'    then 8
     when `b`.`data_type` = 'float'     then 4
     when `b`.`data_type` = 'double'    then 8
     when `b`.`data_type` = 'real'      then 8
     when `b`.`data_type` = 'bit'       then 8
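Rick's OR-to-UNION rewrite from this thread can be sketched end to end. The sketch below uses Python's built-in sqlite3 as a self-contained stand-in (an assumption made only so it runs anywhere; MySQL's optimizer handles the OR form differently, which is the whole point of the rewrite). The table and query shapes follow the thread:

```python
import sqlite3

# Tiny stand-in for l4_link (the real table has ~40 billion rows).
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE l4_link (ruid1 INT NOT NULL, ruid2 INT NOT NULL,
                          overlap INT NOT NULL);
    CREATE INDEX idx_ruid1 ON l4_link (ruid1);
    CREATE INDEX idx_ruid2 ON l4_link (ruid2);
    INSERT INTO l4_link VALUES (1, 2, 5), (3, 1, 7), (2, 3, 1), (1, 1, 9);
""")

xxx = 1

# The original query: OR across two columns forces a table scan in MySQL.
or_rows = con.execute(
    "SELECT ruid1, ruid2, overlap FROM l4_link "
    "WHERE ruid1 = ? OR ruid2 = ? ORDER BY overlap DESC", (xxx, xxx)).fetchall()

# Rick's rewrite: one index-driven SELECT per column, UNIONed, then sorted.
# Plain UNION (i.e. UNION DISTINCT) dedups the row where ruid1 = ruid2 = xxx.
union_rows = con.execute(
    "SELECT ruid1, ruid2, overlap FROM l4_link WHERE ruid1 = ? "
    "UNION "
    "SELECT ruid1, ruid2, overlap FROM l4_link WHERE ruid2 = ? "
    "ORDER BY overlap DESC", (xxx, xxx)).fetchall()

# Both forms return the same result set.
assert or_rows == union_rows
print(union_rows)
```

On MySQL each branch of the UNION can drive off its own single-column index, which is why the rewrite beats the OR on a 40-billion-row table.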
Re: Indexing question
Thanks Gavin and Joerg, that was very helpful!

-- Jonas

On Sun, Oct 3, 2010 at 12:44 PM, Joerg Bruehe joerg.bru...@oracle.com wrote:
[quoted text snipped]
Re: Indexing question
Hi,

Find attached the first of my queries I require to be checked over. Let me know if you need any more information. I've included the query, EXPLAIN output, and the relevant SHOW CREATE TABLE.

Regards
Neil

On Mon, Oct 4, 2010 at 9:50 PM, Neil Tompkins neil.tompk...@googlemail.com wrote:
> I've got a fair few queries to be checked over. Will send them tomorrow.

On 4 Oct 2010, at 18:27, Gavin Towey gto...@ffn.com wrote:
> Include the query, EXPLAIN output, and the relevant SHOW CREATE TABLE
> `table` \G output. Someone should be able to offer suggestions.

[remaining quoted text snipped]

'players_bids', 'CREATE TABLE `players_bids` (
  `players_bids_id` bigint(20) NOT NULL auto_increment,
  `worlds_id` int(10) NOT NULL,
  `seasons_id` int(10) NOT NULL,
  `teams_id_from` bigint(20) NOT NULL,
  `teams_id_to` bigint(20) NOT NULL,
  `users_id_from` bigint(20) NOT NULL,
  `users_id_to` bigint(20) NOT NULL,
  `players_id` bigint(20) NOT NULL,
  `bid_value` double NOT NULL default ''0'',
  `bid_type` enum(''transfer'',''loan'',''exchange'') collate utf8_unicode_ci NOT NULL,
  `bid_status` varchar(45) collate utf8_unicode_ci NOT NULL,
  `bid_date` datetime NOT NULL,
  `bid_completed_date` datetime default NULL,
  PRIMARY KEY (`players_bids_id`),
  KEY `FK_players_bids_worlds_id` (`worlds_id`),
  KEY `IDX_bid_date` (`bid_date`),
  KEY `IDX_bid_status` (`bid_status`),
  KEY `IDX_bid_type` (`bid_type`),
  KEY
Re: Indexing question
Hi!

Neil Tompkins wrote:
> Thanks for your reply. So should we create individual indexes on each
> field, or a multiple-column index?

This question cannot be answered without checking and measuring your installation. The decision whether to create an index is always an act of balancing:

- If there is an index, the database server can use it to find data records by looking up the index, not scanning the base data. This results in load reduction (both CPU and disk IO) and speeds up query execution.
- If there is an index, the database server must maintain it whenever data are altered (insert/update/delete), in addition to the base data. This is increased load (both CPU and disk IO) and slows down data changes.

So obviously you want to create only those indexes that are helpful for query execution: you will never (voluntarily) create an index on a column which isn't used in search conditions, or whose use is already provided by other indexes. Of the remaining candidate indexes, you will never (voluntarily) create one that provides less gain in searches than it costs in data changes.

With MySQL, AFAIK there is the limitation that on one table only one index can be used per query. As a result, the choice of indexes to create depends on the searches executed by your commands, their relative frequency, and the frequency of data changes.

To answer your other question: if you run aggregate functions (like SUM(), MIN(), or MAX()) on all records of a table, their results could be computed by accessing a matching index only. I don't know whether MySQL does this; I propose you check that yourself using EXPLAIN. If you run them on subsets of a table only, an index on that column will not help in general.

In database implementations, there is the concept of a "covering index": if you have an index on columns A and B of some table, its contents (without the base data) would suffice to answer

SELECT SUM(B) WHERE A = x

Again, I don't know whether MySQL does this, and I refer you to EXPLAIN.
HTH,
Jörg

-- Joerg Bruehe, MySQL Build Team, joerg.bru...@oracle.com ORACLE Deutschland B.V. Co. KG, Komturstrasse 18a, D-12099 Berlin Geschaeftsfuehrer: Juergen Kunz, Marcel v.d. Molen, Alexander v.d. Ven Amtsgericht Muenchen: HRA 95603
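Jörg's covering-index remark can be checked mechanically. The sketch below uses SQLite's EXPLAIN QUERY PLAN as a self-contained stand-in (an assumption; on MySQL the equivalent check is EXPLAIN and looking for "Using index" in the Extra column). With an index on (A, B), a SUM(B) filtered on A is answered from the index alone:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE t (a INT, b INT, c INT);
    CREATE INDEX idx_ab ON t (a, b);
""")

# Join the EXPLAIN QUERY PLAN detail rows into one inspectable string.
plan = " ".join(
    row[-1] for row in con.execute(
        "EXPLAIN QUERY PLAN SELECT SUM(b) FROM t WHERE a = 1").fetchall())

# The plan reads only the index -- the base table is never touched,
# which is exactly what "covering index" means.
assert "COVERING INDEX" in plan
print(plan)
```

A query that also needed column c would lose this property, since c is not stored in idx_ab.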
Re: Indexing question
Jörg,

Thanks for the useful reply. Maybe I can EXPLAIN my select queries for you to advise if any changes need to be made?

Regards
Neil

On Mon, Oct 4, 2010 at 3:38 PM, Joerg Bruehe joerg.bru...@oracle.com wrote:
[quoted text snipped]
RE: Indexing question
Include the query, EXPLAIN output, and the relevant SHOW CREATE TABLE `table` \G output. Someone should be able to offer suggestions.

-----Original Message-----
From: Tompkins Neil [mailto:neil.tompk...@googlemail.com]
Sent: Monday, October 04, 2010 8:54 AM
To: Joerg Bruehe
Cc: [MySQL]
Subject: Re: Indexing question

[quoted text snipped]
Re: Indexing question
I've got a fair few queries to be checked over. Will send them tomorrow.

On 4 Oct 2010, at 18:27, Gavin Towey gto...@ffn.com wrote:
> Include the query, EXPLAIN output, and the relevant SHOW CREATE TABLE
> `table` \G output. Someone should be able to offer suggestions.

[remaining quoted text snipped]
Re: Indexing question
So if you have individual indexes on, for example, field_1, field_2 and field_3, and then perform a search like WHERE field_1 = 10 AND field_3 = 'abc', this wouldn't improve the search? You have to create an index for all possible combined field searches?

On Fri, Oct 1, 2010 at 9:35 PM, Gavin Towey gto...@ffn.com wrote:
> You can't use an index to select records in a range and order them; the
> ORDER BY will cause a filesort in that case. Additionally, indexes are
> always read left to right. So an index on ('user_id', 'product_id') will
> help when doing WHERE user_id=N AND product_id IN (1,2,3), but wouldn't
> help for just the condition on product_id. See the manual for full details
> on how MySQL uses indexes:
> http://dev.mysql.com/doc/refman/5.1/en/mysql-indexes.html
>
> -----Original Message-----
> From: Jonas Galvez [mailto:jonasgal...@gmail.com]
> Sent: Friday, October 01, 2010 11:48 AM
> To: mysql@lists.mysql.com
> Subject: Indexing question
>
> Suppose I wanted to be able to perform queries against three columns of
> my table: 'user_id', 'product_id' and 'created'. Most of the time I'll
> just be range-selecting records from the table, ordering by 'created'.
> But I may also want to select where 'user_id' = something and
> 'product_id' in (list, of, ids), ordered by 'created'. Do I need two
> separate indexes, one on 'created' and another on ('user_id',
> 'product_id', 'created'), or does having only the latter suffice for the
> former case?
>
> -- Jonas, http://jonasgalvez.com.br
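Jonas's two cases can be sketched concretely. The example below uses SQLite's EXPLAIN QUERY PLAN as a self-contained stand-in (an assumption; MySQL's EXPLAIN shows the analogous "Using filesort" note), with a hypothetical orders table: equality on both leading columns of ('user_id', 'product_id', 'created') delivers rows already in 'created' order, while a bare range-plus-order on 'created' gets no help from that index, which is why a separate index on 'created' is suggested:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE orders (user_id INT, product_id INT, created INT, note TEXT);
    CREATE INDEX idx_upc ON orders (user_id, product_id, created);
""")

def plan(sql):
    # Flatten the EXPLAIN QUERY PLAN detail column into one string.
    return " ".join(r[-1] for r in con.execute("EXPLAIN QUERY PLAN " + sql))

# Equality on both leading columns: rows come out of the index already
# sorted by 'created', so no separate sort step appears in the plan.
eq = plan("SELECT * FROM orders WHERE user_id = 1 AND product_id = 2 "
          "ORDER BY created")
assert "USING INDEX" in eq
assert "TEMP B-TREE" not in eq   # no filesort-equivalent needed

# A range on 'created' alone cannot use idx_upc at all, since 'created'
# is not the leading column -- the latter index does NOT suffice here.
rng = plan("SELECT * FROM orders WHERE created > 5 ORDER BY created")
assert "idx_upc" not in rng
```

The IN-list case from the question sits in between: the index finds the rows, but interleaving several index ranges may still require a sort step.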
Re: Indexing question
Hi Neil, all!

Tompkins Neil wrote:
> So if you have individual indexes on, for example, field_1, field_2 and
> field_3, and then perform a search like WHERE field_1 = 10 AND field_3 =
> 'abc', this wouldn't improve the search? You have to create an index for
> all possible combined field searches?

No -- you didn't read Gavin's mail carefully enough:

On Fri, Oct 1, 2010 at 9:35 PM, Gavin Towey gto...@ffn.com wrote:
> [[...]] Additionally, indexes are always read left to right. So an index
> on ('user_id', 'product_id') will help when doing WHERE user_id=N AND
> product_id IN (1,2,3), but wouldn't help for just the condition on
> product_id.

What Gavin calls "left to right" is what I call "most significant first"; the result is the same: in a multi-column index, the columns are listed in the order of their significance. Any DBMS (this is not limited to MySQL) can use such an index only if a condition on the first (= most significant) field(s) is specified.

Example: assume the index is on fields A, B, and C, in that order.

- ... WHERE A = x AND B = y AND C = z  can use the index.
- ... WHERE A = x AND B = y  can use the index, limited to the first two fields.
- ... WHERE A = x  can use the index, the first field only.
- ... WHERE A = x AND C = z  can also use the index for A, but will have to evaluate the condition on C by scanning all records matching A.
- ... WHERE B = y AND C = z  cannot use the index, because there is no condition on A.

If there are many searches based on A and C only (not B), and there are many records matching A with different values of C, then an additional index on these two columns may be helpful.

Compare the index with a phone book, which (typically) lists the entries sorted by last name (most significant), then first name, then ...: if you don't know the last name, you cannot profit from the sorting and have to scan the whole book.
See the manual for full details on how MySQL uses indexes: http://dev.mysql.com/doc/refman/5.1/en/mysql-indexes.html HTH, Jörg -- Joerg Bruehe, MySQL Build Team, joerg.bru...@oracle.com ORACLE Deutschland B.V. & Co. KG, Komturstrasse 18a, D-12099 Berlin Geschaeftsfuehrer: Juergen Kunz, Marcel v.d. Molen, Alexander v.d. Ven Amtsgericht Muenchen: HRA 95603
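Joerg's A-B-C example can be sketched in SQL (the table and index names here are made up for illustration); running EXPLAIN on each SELECT against a MySQL server shows which statements can use the index:

```sql
-- Hypothetical table with a multi-column index on (A, B, C), in that order
CREATE TABLE t (
  A INT,
  B INT,
  C INT,
  KEY idx_abc (A, B, C)
);

SELECT * FROM t WHERE A = 1 AND B = 2 AND C = 3;  -- uses idx_abc fully
SELECT * FROM t WHERE A = 1 AND B = 2;            -- uses the first two fields
SELECT * FROM t WHERE A = 1;                      -- uses the first field only
SELECT * FROM t WHERE A = 1 AND C = 3;            -- uses idx_abc for A; condition on C checked by scanning A's matches
SELECT * FROM t WHERE B = 2 AND C = 3;            -- cannot use idx_abc (no condition on A)
```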
Re: Indexing question
Thanks for your reply. So should we create individual indexes on each field, or one multiple-column index? On 3 Oct 2010, at 16:44, Joerg Bruehe joerg.bru...@oracle.com wrote: [[...]]
Re: Indexing question
Following on from my previous email: I have columns containing numbers which are then used in SUM and MIN/MAX functions; should these be indexed too? On 3 Oct 2010, at 16:44, Joerg Bruehe joerg.bru...@oracle.com wrote: [[...]]
Indexing question
Suppose I wanted to be able to perform queries against three columns of my table: 'user_id', 'product_id' and 'created'. Most of the time I'll just be range-selecting records from the table ordering by 'created'. But I may also want to select where 'user_id' = something and 'product_id' in (list, of, ids), ordered by 'created'. Do I need two separate indexes, one on 'created' and another on ('user_id', 'product_id', 'created'), or does having only the latter suffice for the former case? -- Jonas, http://jonasgalvez.com.br
RE: Indexing question
You can't use an index both to select records in a range on one column and order them by another column; the ORDER BY will cause a filesort in that case. Additionally, indexes are always read left to right. So an index on ('user_id', 'product_id') will help when doing WHERE user_id=N AND product_id IN (1,2,3), but wouldn't help for just the condition on product_id. See the manual for full details on how MySQL uses indexes: http://dev.mysql.com/doc/refman/5.1/en/mysql-indexes.html -Original Message- From: Jonas Galvez [mailto:jonasgal...@gmail.com] Sent: Friday, October 01, 2010 11:48 AM To: mysql@lists.mysql.com Subject: Indexing question [[...]] -- Jonas, http://jonasgalvez.com.br This message contains confidential information and is intended only for the individual named. If you are not the named addressee, you are notified that reviewing, disseminating, disclosing, copying or distributing this e-mail is strictly prohibited. Please notify the sender immediately by e-mail if you have received this e-mail by mistake and delete this e-mail from your system. E-mail transmission cannot be guaranteed to be secure or error-free as information could be intercepted, corrupted, lost, destroyed, arrive late or incomplete, or contain viruses. The sender therefore does not accept liability for any loss or damage caused by viruses or errors or omissions in the contents of this message, which arise as a result of e-mail transmission. 
[FriendFinder Networks, Inc., 220 Humbolt court, Sunnyvale, CA 94089, USA, FriendFinder.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org
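Jonas's two candidate indexes from the question above can be written out in SQL (the table name is assumed for illustration):

```sql
-- Assumed table name; the columns come from the question above
ALTER TABLE purchases
  ADD INDEX idx_created (created),
  ADD INDEX idx_user_prod_created (user_id, product_id, created);

-- idx_user_prod_created locates the rows by user_id and product_id;
-- per Gavin's point, the IN () range means the ORDER BY may still
-- require a filesort on the matched rows:
SELECT * FROM purchases
 WHERE user_id = 42 AND product_id IN (1, 2, 3)
 ORDER BY created;

-- idx_user_prod_created cannot serve this query (no condition on
-- user_id, the leftmost column), so the separate idx_created is what
-- makes the range scan possible:
SELECT * FROM purchases
 WHERE created >= '2010-09-01'
 ORDER BY created;
```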
Re: Intro to indexing?
muhammad subair wrote: On Mon, Jul 27, 2009 at 9:25 PM, Ken D'Ambrosio k...@jots.org wrote: [[...]] When you set up the table, you should have a primary key. You make your best guess as to what would make it easier for the SQL engine to find the data you want, but the SQL engine, in its own wisdom, will decide whether or not to use an index. You do not explicitly tell it to do a SELECT using an index. Considerations: every time you do an INSERT, REPLACE, or UPDATE of data included in an index, the index needs to be updated - which takes a small amount of time. Indexes take space on disk - usually not a problem. If the engine can use an index, finding is much faster. If the database is small, who cares.
Intro to indexing?
Hey, all. I'm trying to get indexing -- like, when do you specify an index name during index creation, is index use implicit or explicit, and, honestly, how exactly does it work, anyway? I've been RTFM'ing, but haven't found anything that really laid it out in black and white; usually, they'd give an example or two, but were awfully sparse on the whys and wherefores. So, if anyone has something they could point me to -- electronic or dead tree -- I'd be deeply appreciative. Thanks! -Ken
Re: Intro to indexing?
On Mon, Jul 27, 2009 at 9:25 PM, Ken D'Ambrosio k...@jots.org wrote: [[...]] Indexes can be on a single column or can span multiple columns (just like keys). An index will be used when running a query if the search is being performed on the following: - A single column that has a single-column index; for example, if we index departments on departmentID and perform a query like SELECT...WHERE departmentID=n. - A set of columns that forms a multicolumn index; for example, if we have created an index on the employee.assignment table on (clientID, employeeID, workdate) and we perform a query like SELECT...WHERE clientID=x AND employeeID=y AND workdate=z. - A column or set of columns that forms a subset of a multicolumn index, as long as there is a leftmost prefix of the index columns; for example, with the assignment table as before, with an index on (clientID, employeeID, workdate), indexes would be used for these types of queries: [[...]] Source: MySQL Tutorial - SAMS Publishing -- Muhammad Subair
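The tutorial's examples above can be written out as statements (the index names are invented here; the table and column names come from the quoted text):

```sql
CREATE INDEX idx_dept   ON departments (departmentID);
CREATE INDEX idx_assign ON assignment (clientID, employeeID, workdate);

SELECT * FROM departments WHERE departmentID = 7;   -- single-column index
SELECT * FROM assignment
 WHERE clientID = 1 AND employeeID = 2
   AND workdate = '2009-07-27';                     -- full multicolumn index
SELECT * FROM assignment WHERE clientID = 1;        -- leftmost prefix of idx_assign
```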
When does indexing happen?
Using version 5.0.6x on RH. The question I have is about the updating of indexes. Say I have a table with a primary key and one or more indexes. I run an INSERT statement by way of a call to mysql_real_query() in the C api. If that function call returns zero, i.e., indicating success, does that mean that everything is guaranteed to be completed? In particular that all the table indexes have been updated, so that an immediately-subsequent SELECT statement with an appropriate WHERE clause seeking for the same row will find it using the index? -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org
Re: When does indexing happen?
[[...]] Update with information I forgot to mention: using MyISAM tables for this. Also, the table is a MERGE table.
Re: When does indexing happen?
In the last episode (Jul 26), buf...@biffco.net said: [[...]] does that mean that everything is guaranteed to be completed? In particular that all the table indexes have been updated, so that an immediately-subsequent SELECT statement with an appropriate WHERE clause seeking the same row will find it using the index? Yes. -- Dan Nelson dnel...@allantgroup.com
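In other words (a sketch with an assumed table; the C API call is the one named in the question): once the INSERT has returned success, the index entries exist, so a follow-up SELECT can use them immediately:

```sql
-- Assumed MyISAM table with an indexed column
CREATE TABLE events (
  id  INT NOT NULL,
  val VARCHAR(32),
  KEY idx_id (id)
) ENGINE=MyISAM;

INSERT INTO events (id, val) VALUES (42, 'x');
-- A zero return from mysql_real_query() for the INSERT means the row
-- and all of its index entries are in place, so this statement can
-- immediately find the row via idx_id:
SELECT val FROM events WHERE id = 42;
```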
Re: Indexing dynamics in MySQL Community Edition 5.1.34
mo...@fastmail.fm (mos) writes: At 12:37 AM 6/25/2009, you wrote: ... my.cnf based on my-huge.cnf, expanding key_buffer to 8G, myisam_sort_buffer_size to 256M, and putting tmpdir on the fiber channel disk. You mean key_buffer_size don't you, and not key_buffer? If you are using MySQL 5.0 then the upper limit is 4GB, but 5.1 allows for more than 4GB. That's not entirely true. Later versions of 5.0 (above 5.0.56?) also allow key_buffer_size to be greater than 4GB, and we are using that on several machines. Earlier versions of 5.0 did indeed have this problem. Simon
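To check which value is actually in effect (the my.cnf entry may use the old key_buffer spelling, but the server variable is key_buffer_size):

```sql
SHOW GLOBAL VARIABLES LIKE 'key_buffer_size';
-- On 5.1 (and later 5.0 releases, per Simon's note), values above 4GB work:
SET GLOBAL key_buffer_size = 8589934592;  -- 8 GB
```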
Re: Indexing dynamics in MySQL Community Edition 5.1.34
[SNIP - tail of a SHOW STATUS listing] Thanks, Mike Spreitzer mos mo...@fastmail.fm 06/25/09 01:05 PM To mysql@lists.mysql.com Subject Re: Indexing dynamics in MySQL Community Edition 5.1.34 Mike, I re-posted your Show Status to the group to see if anyone can offer a way to speed up the indexing for you. BTW, you are adding ALL of the indexes to the table using ONE sql statement right? And not a separate SQL statement to build each index? Mike At 02:01 AM 6/25/2009, you wrote: Like I said in the subject line, I am using 5.1.34. I started with my-huge.cnf, which says key_buffer rather than key_buffer_size; SHOW GLOBAL VARIABLES confirms, however, that my key_buffer_size is 8GB. That indexing operation finally finished after about 1.5 hours; that was about 0.5 hours ago. Now I am on to other things. Here is the status you suggested: [SNIP] Thanks, Mike Spreitzer mos mo...@fastmail.fm 06/25/09 02:32 AM To mysql@lists.mysql.com Subject Re: Indexing dynamics in MySQL Community Edition 5.1.34 At 12:37 AM 6/25/2009, you wrote: Actually, my characterization of the current state is wrong. It appears that one core is completely busy; I suppose MySQL does this indexing work in a single thread. Is it reasonable for indexing to be CPU bound? 
my.cnf based on my-huge.cnf, expanding key_buffer to 8G, myisam_sort_buffer_size to 256M, and putting tmpdir on the fiber channel disk. Mike, You mean key_buffer_size don't you, and not key_buffer? If you are using MySQL 5.0 then the upper limit is 4GB, but 5.1 allows for more than 4GB. Also try increasing sort_buffer_size. Posting your Show Status will help people see where the bottleneck is. Mike
Re: Indexing dynamics in MySQL Community Edition 5.1.34
Today's instance finished shortly after I sent the email below. BTW, here are some specifics on the table (which uses MyISAM). Thursday's instance has 11 GB of data and 0.78 GB of index. Today's instance has 26 GB of data and 1.8 GB of index. Thanks, Mike Spreitzer Mike Spreitzer/Watson/i...@ibmus 06/27/09 09:48 AM To mos mo...@fastmail.fm cc mysql@lists.mysql.com Subject Re: Indexing dynamics in MySQL Community Edition 5.1.34 Yes, all the indices are added in one ALTER TABLE statement. Thursday's incarnation took about 1.5 hours, on a table created from about 8 GB of CSV. Today's has already taken over 8 hours, on a table created from about 22 GB of data. The logarithm of 22 GB is about 24/23 of the logarithm of 8 GB. I seem to have fallen off an additional cliff. As a reminder, here is the situation. I load a table from CSV with zero indices defined. Then I add some indices, and that takes a long time. On Thursday it kept one core busy, but the disk was not very busy for much of that time (about 600 blocks out per sec). Now I am seeing about 2.5 blocks out per second --- still nowhere near capacity --- and one core busy. The MySQL server is running on a 64-bit RHEL 5 machine with 16 Intel cores at 2.4 GHz, and 64 GB RAM. The db storage is on fiber channel. I created my.cnf based on my-huge.cnf, expanding key_buffer[_size] to 8G, myisam_sort_buffer_size to 256M, and putting tmpdir on the fiber channel disk. 
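Adding all indices in one statement, as Mike describes, looks like this (the table and column names are placeholders):

```sql
-- One ALTER TABLE builds every index in a single pass over the data,
-- instead of rebuilding the table once per index
ALTER TABLE big_table
  ADD INDEX idx_a  (col_a),
  ADD INDEX idx_b  (col_b),
  ADD INDEX idx_ac (col_a, col_c);
```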
Here is my current SHOW STATUS: [SNIP]
Re: Indexing dynamics in MySQL Community Edition 5.1.34
On Sat, Jun 27, 2009 at 7:03 AM, Mike Spreitzermspre...@us.ibm.com wrote: Today's instance finished shortly after I sent the email below. BTW, here are some specifics on the table (which uses MyISAM). Thursday's instance has 11 GB of data and 0.78 GB of index. Today's instance has 26 GB of data and 1.8 GB of index. If you have the ability to test, I'd compare that to importing the csv into the table with the indexes already defined. The way you did it should be faster, but since you see that it's only using one core, I'd try splitting the data up into 16 separate files and importing them all at once. In theory (SWAG actually), multiple imports would each use their own core to whatever thread count you have innodb defined to use ... Oh, just saw that you were using myisam. Never mind. At any rate, I'd be very surprised if importing into a table with indexes already defined was the same speed or faster, but doing so could give you some useful information, such as at what point the import (and concurrent index creation) drops from expected level X to much reduced level Y. You could then (hopefully) find a correlation between some cache or buffer setting that will explain the sudden drop in speed. -- Regards... Todd -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org
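One way to run the comparison Todd suggests on a MyISAM table (the file and table names are placeholders) is to defer index maintenance during the load:

```sql
ALTER TABLE big_table DISABLE KEYS;  -- stop per-row updates of non-unique indexes
LOAD DATA INFILE '/tmp/data.csv' INTO TABLE big_table
  FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n';
ALTER TABLE big_table ENABLE KEYS;   -- rebuild the indexes in one sorted pass
```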
Re: Indexing dynamics in MySQL Community Edition 5.1.34
Can someone please tell me what is meant by "The db storage is on fiber channel"? On Fri, Jun 26, 2009 at 1:05 AM, mos mo...@fastmail.fm wrote: Mike, I re-posted your Show Status to the group to see if anyone can offer a way to speed up the indexing for you. BTW, you are adding ALL of the indexes to the table using ONE sql statement right? And not a separate SQL statement to build each index? Mike At 02:01 AM 6/25/2009, you wrote: [[...]] Here is the status you suggested: [SNIP]
RE: Indexing dynamics in MySQL Community Edition 5.1.34
-Original Message- From: Moon's Father [mailto:yueliangdao0...@gmail.com] Sent: Friday, June 26, 2009 4:12 AM To: mos Cc: mysql@lists.mysql.com Subject: Re: Indexing dynamics in MySQL Community Edition 5.1.34 Can someone please tell me what is meant by "The db storage is on fiber channel"? [JS] Fiber Channel (and that is the official way it is spelled, even in the USA) is an ultra-high-speed network for disk arrays, computers, and other devices. Although the physical medium is usually fiber optic, there is a lot more to it than that. It is more like an intranet in some ways, in that devices have addresses and there is intelligence in the network itself. On Fri, Jun 26, 2009 at 1:05 AM, mos mo...@fastmail.fm wrote: Mike, I re-posted your Show Status to the group to see if anyone can offer a way to speed up the indexing for you. BTW, you are adding ALL of the indexes to the table using ONE sql statement right? And not a separate SQL statement to build each index? Mike At 02:01 AM 6/25/2009, you wrote: Like I said in the subject line, I am using 5.1.34. I started with my-huge.cnf, which says key_buffer rather than key_buffer_size; SHOW GLOBAL VARIABLES confirms, however, that my key_buffer_size is 8GB. That indexing operation finally finished after about 1.5 hours; that was about 0.5 hours ago. Now I am on to other things. 
Here is the status you suggested: [SNIP]
RE: Indexing dynamics in MySQL Community Edition 5.1.34
Oops, my spell checker got the better of me: it should be Fibre Channel. Regards, Jerry Schwartz The Infoshop by Global Information Incorporated 195 Farmington Ave. Farmington, CT 06032 860.674.8796 / FAX: 860.674.8341 www.the-infoshop.com -Original Message- From: Jerry Schwartz [mailto:jschwa...@the-infoshop.com] Sent: Friday, June 26, 2009 10:22 AM To: 'Moon's Father'; 'mos' Cc: mysql@lists.mysql.com Subject: RE: Indexing dynamics in MySQL Community Edition 5.1.34 [[...]]
Here is the status you suggested: [SNIP]
RE: Indexing? (Warning: relative newbie.)
After a few off-list e-mails with Tim, I issued ALTER TABLE dbmail_messageblks ADD INDEX ( blocksize, physmessage_id ); which took almost 11 hours to build. Once done, however, my select statement went from a hair over 50 minutes to 15 seconds (1.69 seconds after the index was cached). Wow. Thanks for the help, all! -Ken On Wed, June 24, 2009 12:03 pm, Little, Timothy wrote: To answer your questions in no particular order: YES, you can speed it up with indexing. You might want to first create an index on ( blocksize, physmessage_id ). Why, you might ask, index on physmessage_id? Because then the db won't have to do a fetch on rows from the table, since the value is in the INDEX itself, saving any unnecessary reads. Realistically, I can't see the query taking more than a few seconds, at most, to execute. However, building the index might take a serious bit of time. Please let us all know how it does or does not work. Tim... -Original Message- From: Ken D'Ambrosio [mailto:k...@jots.org] Sent: Wednesday, June 24, 2009 11:07 AM To: mysql@lists.mysql.com Subject: Indexing? (Warning: relative newbie.) -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org
Re: Indexing? (Warning: relative newbie.)
Hi Walter, Walter Heck - OlinData.com wrote: Hey Tim, all On Wed, Jun 24, 2009 at 10:03 AM, Little, Timothytlit...@thomaspublishing.com wrote: Why, you might ask, index on physmessage_id? Because then the db won't have to do a fetch on items from the table since it's in the INDEX itself, saving any unnecessary reads. FYI: That only holds true for InnoDB, not for MyISAM. I think you have confused the InnoDB behavior of using the entire PRIMARY KEY as the unique row identifier for each entry in a secondary key with the practice of defining a covering index. Tim was correct: an index on (blocksize, physmessage_id) would allow that query to avoid any direct reads of the data table, as all of the information for the query comes from the index itself (regardless of database engine). -- Shawn Green, MySQL Senior Support Engineer Sun Microsystems, Inc. Office: Blountville, TN
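[Editor's note: Shawn's covering-index point can be checked with EXPLAIN; when an index alone satisfies a query, MySQL reports "Using index" in the Extra column. A minimal sketch against Ken's table, with the threshold value taken from the thread as quoted:]

```sql
-- With the composite index in place, both selected columns live in the
-- index, so no data-file reads are needed, on MyISAM or InnoDB:
EXPLAIN SELECT physmessage_id, blocksize
FROM dbmail_messageblks
WHERE blocksize < 50;
-- "Using index" in the Extra column confirms the covering behavior.
```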
Re: Indexing dynamics in MySQL Community Edition 5.1.34
At 12:37 AM 6/25/2009, you wrote: Actually, my characterization of the current state is wrong. It appears that one core is completely busy; I suppose MySQL does this indexing work in a single thread. Is it reasonable for indexing to be CPU bound? my.cnf based on my-huge.cnf, expanding key_buffer to 8G, myisam_sort_buffer_size to 256M, and putting tmpdir on the fiber channel disk. Mike, You mean key_buffer_size, don't you, and not key_buffer? If you are using MySQL 5.0 then the upper limit is 4GB, but 5.1 allows for more than 4GB. Also try increasing sort_buffer_size. Posting your Show Status will help people see where the bottleneck is. Mike
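[Editor's note: the variables mos mentions can be inspected and raised at runtime; a sketch with illustrative values, not recommendations:]

```sql
-- Confirm what the server actually uses (my-huge.cnf writes key_buffer,
-- which the server accepts as an alias for key_buffer_size):
SHOW GLOBAL VARIABLES LIKE 'key_buffer_size';
SHOW GLOBAL VARIABLES LIKE 'myisam_sort_buffer_size';

-- Both are dynamic, so they can be raised without a restart:
SET GLOBAL key_buffer_size = 8 * 1024 * 1024 * 1024;     -- 8 GB, 5.1+
SET GLOBAL myisam_sort_buffer_size = 256 * 1024 * 1024;  -- 256 MB
```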
RE: Indexing dynamics in MySQL Community Edition 5.1.34
-Original Message- From: Mike Spreitzer [mailto:mspre...@us.ibm.com] Sent: Thursday, June 25, 2009 1:38 AM To: mysql@lists.mysql.com Subject: Re: Indexing dynamics in MySQL Community Edition 5.1.34 Actually, my characterization of the current state is wrong. It appears that one core is completely busy; I suppose MySQL does this indexing work in a single thread. Is it reasonable for indexing to be CPU bound? [JS] Yes, it is very reasonable if you have enough data cached. One of the things it has to do is sort, and because memory access is much faster than disk access you want to avoid the latter if at all possible. If you'd ever seen a sort using magnetic tape (pre-disk), you'd have a vivid mental image of this. As for using a single thread, that is not compulsory. You could partition the data and do a multi-threaded sort-merge operation. I no longer remember the results of the sort wars of the early days, and I have no idea what MySQL does. Thanks, Mike Spreitzer Mike Spreitzer/Watson/i...@ibmus 06/25/09 01:30 AM To mysql@lists.mysql.com cc Subject Indexing dynamics in MySQL Community Edition 5.1.34
Re: Indexing dynamics in MySQL Community Edition 5.1.34
Mike, I re-posted your Show Status to the group to see if anyone can offer a way to speed up the indexing for you. BTW, you are adding ALL of the indexes to the table using ONE sql statement right? And not a separate SQL statement to build each index? Mike At 02:01 AM 6/25/2009, you wrote: Like I said in the subject line, I am using 5.1.34. I started with my-huge.cnf, which says key_buffer rather than key_buffer_size; SHOW GLOBAL VARIABLES confirms, however, that my key_buffer_size is 8GB. That indexing operation finally finished after about 1.5 hours; that was about 0.5 hours ago. Now I am on to other things. Here is the status you suggested: +---+---+ | Variable_name | Value | +---+---+ | Aborted_clients | 0 | | Aborted_connects | 0 | | Binlog_cache_disk_use | 0 | | Binlog_cache_use | 0 | | Bytes_received| 135 | | Bytes_sent| 1405 | | Com_admin_commands| 0 | | Com_assign_to_keycache| 0 | | Com_alter_db | 0 | | Com_alter_db_upgrade | 0 | | Com_alter_event | 0 | | Com_alter_function| 0 | | Com_alter_procedure | 0 | | Com_alter_server | 0 | | Com_alter_table | 0 | | Com_alter_tablespace | 0 | | Com_analyze | 0 | | Com_backup_table | 0 | | Com_begin | 0 | | Com_binlog| 0 | | Com_call_procedure| 0 | | Com_change_db | 0 | | Com_change_master | 0 | | Com_check | 0 | | Com_checksum | 0 | | Com_commit| 0 | | Com_create_db | 0 | | Com_create_event | 0 | | Com_create_function | 0 | | Com_create_index | 0 | | Com_create_procedure | 0 | | Com_create_server | 0 | | Com_create_table | 0 | | Com_create_trigger| 0 | | Com_create_udf| 0 | | Com_create_user | 0 | | Com_create_view | 0 | | Com_dealloc_sql | 0 | | Com_delete| 0 | | Com_delete_multi | 0 | | Com_do| 0 | | Com_drop_db | 0 | | Com_drop_event| 0 | | Com_drop_function | 0 | | Com_drop_index| 0 | | Com_drop_procedure| 0 | | Com_drop_server | 0 | | Com_drop_table| 0 | | Com_drop_trigger | 0 | | Com_drop_user | 0 | | Com_drop_view | 0 | | Com_empty_query | 0 | | Com_execute_sql | 0 | | Com_flush | 0 | | Com_grant | 0 | | 
Com_ha_close | 0 | | Com_ha_open | 0 | | Com_ha_read | 0 | | Com_help | 0 | | Com_insert| 0 | | Com_insert_select | 0 | | Com_install_plugin| 0 | | Com_kill | 0 | | Com_load | 0 | | Com_load_master_data | 0 | | Com_load_master_table | 0 | | Com_lock_tables | 0 | | Com_optimize | 0 | | Com_preload_keys | 0 | | Com_prepare_sql | 0 | | Com_purge | 0 | | Com_purge_before_date | 0 | | Com_release_savepoint | 0 | | Com_rename_table | 0 | | Com_rename_user | 0 | | Com_repair| 0 | | Com_replace | 0 | | Com_replace_select| 0 | | Com_reset | 0 | | Com_restore_table | 0 | | Com_revoke| 0 | | Com_revoke_all| 0 | | Com_rollback | 0
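[Editor's note: mos's question about using one statement matters because, with MyISAM, each ALTER TABLE that adds indexes rewrites the table; batching the additions pays that cost once. A sketch with illustrative table and column names:]

```sql
-- One table rebuild covers all three indexes:
ALTER TABLE big_table
    ADD INDEX idx_a (col_a),
    ADD INDEX idx_b (col_b),
    ADD INDEX idx_ab (col_a, col_b);

-- ...whereas three separate ALTER TABLE statements would rebuild the
-- table (and re-sort the already-built keys) three times.
```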
Indexing? (Warning: relative newbie.)
Hi, all. I'm a long-time MySQL user who's only recently had to start learning some administrative stuff, largely because I finally have a decently-sized database. My database is about 100 GB; I'm using it -- via dbmail (www.dbmail.org) -- as a mail server for my company. While dbmail is well-and-good with its IMAP front-end, I'm thinking of writing a Python front-end to do some queries directly against MySQL. But some of them take a l-o-n-g time. As an example, I've got a table with slightly over a million records; I'd like to be able to show (say) only IDs of messages under a half-MB. The query would look something like this: select physmessage_id,blocksize from dbmail_messageblks where blocksize < 50; That query takes 50 minutes. A smidge long to wait. So I said, Huh. That's impressive. And I tried it without the physmessage_id: select blocksize from dbmail_messageblks where blocksize < 50; That took 14 seconds. A bit more in my timeframe. Can I optimize this with indexing? Should I be using a different DB engine? Is there a site/book I should be learning DBA fundamentals from that might offer me direction for stuff like this? Sorry for all the newbie questions, but I haven't done serious database stuff since Foxbase/dBase III days. Things have changed a little since then. Thanks! -Ken
RE: Indexing? (Warning: relative newbie.)
To answer your questions in no particular order: YES, you can speed it up with indexing. You might want to first create an index on ( blocksize, physmessage_id ). Why, you might ask, index on physmessage_id? Because then the db won't have to do a fetch on rows from the table, since the value is in the INDEX itself, saving any unnecessary reads. Realistically, I can't see the query taking more than a few seconds, at most, to execute. However, building the index might take a serious bit of time. Please let us all know how it does or does not work. Tim... -Original Message- From: Ken D'Ambrosio [mailto:k...@jots.org] Sent: Wednesday, June 24, 2009 11:07 AM To: mysql@lists.mysql.com Subject: Indexing? (Warning: relative newbie.)
Re: Indexing? (Warning: relative newbie.)
Hey Tim, all On Wed, Jun 24, 2009 at 10:03 AM, Little, Timothytlit...@thomaspublishing.com wrote: Why, you might ask, index on physmessage_id? Because then the db won't have to do a fetch on items from the table since it's in the INDEX itself, saving any unnecessary reads. FYI: That only holds true for InnoDB, not for MyISAM. cheers, -- Walter Heck, Engineer @ Open Query (http://openquery.com) Affordable Training and ProActive Support for MySQL related technologies Follow our blog at http://openquery.com/blog/ OurDelta: free enhanced builds for MySQL @ http://ourdelta.org
Indexing dynamics in MySQL Community Edition 5.1.34
Using MyISAM on a table loaded from 8GB of CSV, I am now adding some indices. In a separate shell I monitor the progress, alternately with `vmstat` and show full processlist. At first vmstat shows rapid progress; an example is # vmstat 5 procs ---memory-- ---swap-- -io --system-- -cpu-- r b swpdfree buffcache si sobi bo in cs us sy id wa st 1 6 0 8542108 256860 5394040000 0 152783 1314 298 3 4 60 32 0 1 6 0 8541000 256868 5394039200 0 147868 1301 287 3 4 68 25 0 1 5 0 8541380 256876 5394040000 0 150633 1310 277 3 4 72 21 0 1 6 0 8541108 256884 5394039200 0 152066 1307 271 3 4 71 21 0 1 7 0 8541116 256892 5394040000 0 151452 1312 311 3 4 64 29 0 1 6 0 8541992 256900 5394039200 0 192175 1402 295 3 4 66 26 0 1 6 0 8535684 256908 5394040000 0 108783 1227 276 3 4 69 24 0 1 8 0 8539116 256916 5394039200 0 155958 1318 262 3 4 82 11 0 1 6 0 8540860 256924 5394039200 0 166599 1340 328 3 4 66 27 0 1 9 0 8538512 256932 5394039200 0 165386 1336 319 3 4 62 31 0 1 6 0 8536776 256940 5394039200 0 175106 1358 303 3 5 66 27 0 2 0 0 8538884 256944 5394039600 0 187839 1402 305 3 5 70 22 0 1 1 0 8517060 256952 5394040000 0 188694 1379 307 3 4 66 27 0 1 10 0 8511604 256960 5394040000 0 175821 1335 294 2 5 69 24 0 1 10 0 8513340 256968 5394040000 0 164252 1335 300 3 4 65 28 0 2 0 0 8523012 256976 5394039200 0 151527 1318 305 3 5 60 33 0 1 10 0 8490152 256976 5394040000 0 178613 1352 301 2 5 67 26 0 2 0 0 8499576 256976 5394040000 0 142186 1319 302 2 5 69 23 0 1 10 0 8474280 256984 5394040000 0 185598 1348 301 2 5 57 36 0 2 9 0 8440676 256984 5394040000 0 166807 1334 306 2 5 53 39 0 1 9 0 8465228 256988 5394039600 0 114594 1268 306 1 6 56 36 0 1 9 0 16819736 256992 4554294400 0 185034 1342 301 1 6 56 36 0 1 9 0 20314428 257028 4213472400 0 186163 1371 282 1 6 56 37 0 1 9 0 20276856 257068 4217125200 0 166406 1342 281 2 5 59 34 0 1 9 0 20237672 257108 4220934400 0 166810 1333 252 2 5 56 37 0 At this point, and not for the first time, I stop vmstat and show full processlist. 
It says ++--+---+--+-+--+---+-+ | Id | User | Host | db | Command | Time | State | Info | ++--+---+--+-+--+---+-+ | 1 | root | localhost | cel_4x52 | Query | 542 | copy to tmp table | ALTER TABLE ntfelt ADD PRIMARY KEY (p, epoch, ssi, q, kind, ev, c), ADD UNIQUE INDEX pesqekvc(p, epoch, ssi, q, eqoch, kind, version, c), ADD INDEX tc(t, c), ORDER BY p, epoch, ssi, q, kind, ev, c | | 3 | root | localhost | NULL | Query |0 | NULL | show full processlist | ++--+---+--+-+--+---+-+ OK, so it is still indexing. Then I start up `vmstat` again, and it shows very different dynamics: # vmstat 5 procs ---memory-- ---swap-- -io --system-- -cpu-- r b swpd free buffcache si sobibo in cs us sy id wa st 1 0 0 32429508 257248 3043925600 03076 0 0 100 0 0 1 1 0 32416124 257272 3045166800 0 2471 1020 111 6 0 94 0 0 1 0 0 32405096 257292 3046178000 0 2467 1017 109 6 0 94 0 0 1 0 0 32391828 257312 3047443600 0 2056 1019 107 6 0 94 0 0 1 0 0 32378684 257332 3048635600 0 2563 1040 109 6 0 94 0 0 1 0 0 32358224 257352 3050082400 0 3756 1038 109 6 0 93 0 0 1 0 0 32342600 257380 3051949200 0 3356 1035 112 6 0 93 0 0 1 0 0 32322140 257404 3053768800 0 3696 1023 108 6 0 94 0 0 I check
Re: Indexing dynamics in MySQL Community Edition 5.1.34
Actually, my characterization of the current state is wrong. It appears that one core is completely busy; I suppose MySQL does this indexing work in a single thread. Is it reasonable for indexing to be CPU bound? Thanks, Mike Spreitzer Mike Spreitzer/Watson/i...@ibmus 06/25/09 01:30 AM To mysql@lists.mysql.com cc Subject Indexing dynamics in MySQL Community Edition 5.1.34
INDEXING ALL COLUMNS
Hi all, Is there any specific reason for not indexing all the columns of a table? What is the impact on performance? Indexing is meant for great performance, so why is indexing all columns not feasible? (I read in the docs that all columns should not be indexed.) -- Krishna Chandra Prajapati
Re: INDEXING ALL COLUMNS
As your table grows, your inserts will start to get slower and slower, since every index must be updated on each insert. You also run into the issue of locking the table due to re-creating the indexes, plus wasted space for the indexes.
Re: INDEXING ALL COLUMNS
Hi, Following on from what Mike mentioned, indexing all columns does not really help, as MySQL will at most use one index for a query, so it's important to pick your indexes carefully and consider constructing composite indexes. An index on a single column may not even be used due to poor cardinality. Ewen
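[Editor's note: Ewen's composite-index advice hinges on the leftmost-prefix rule; a small sketch with an illustrative table:]

```sql
CREATE TABLE t (a INT, b INT, c VARCHAR(32), INDEX idx_ab (a, b));

-- These can use idx_ab (they constrain a leftmost prefix of the index):
SELECT c FROM t WHERE a = 1;
SELECT c FROM t WHERE a = 1 AND b = 2;

-- This cannot (b alone is not a leftmost prefix of idx_ab):
SELECT c FROM t WHERE b = 2;
```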
Re: INDEXING ALL COLUMNS
Hi, What would you say about the below table? What can I do to make it more efficient? CREATE TABLE mailer_student_status ( student_id decimal(22,0) NOT NULL default '0', param varchar(128) NOT NULL default '', value varchar(128) default NULL, PRIMARY KEY (student_id,param), KEY idx_value (value) )
Re: INDEXING ALL COLUMNS
We'd need more information on what the where clauses of the queries look like to assist with this. -Aaron
Re: INDEXING ALL COLUMNS
Hi, Well, at first glance it's hard to tell, since param and value don't say a lot about the nature of the data. If this is InnoDB, you can have a PRIMARY KEY of student_id (assuming it's unique) and a separate index on param; because of the way InnoDB is structured, the primary key is always implied in the makeup of any other index. You could perhaps consider how much of param and value are interesting and create a composite index on them, idx_param_value (param(10),value(10)) or something similar. What's important for data types is what is going to be held, and what's important for indexes is how the data is going to be queried. Ewen
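[Editor's note: Ewen's prefix-index idea as a sketch; the prefix length 10 is his example, and the selectivity check below is only one assumed way of validating it:]

```sql
ALTER TABLE mailer_student_status
    ADD INDEX idx_param_value (param(10), value(10));

-- Gauge whether a 10-character prefix distinguishes values nearly as
-- well as the full column (a ratio close to 1 is good):
SELECT COUNT(DISTINCT LEFT(param, 10)) / COUNT(DISTINCT param) AS coverage
FROM mailer_student_status;
```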
Re: INDEXING ALL COLUMNS
More details. CREATE TABLE mailer_student_status ( student_id decimal(22,0) NOT NULL default '0', param varchar(128) NOT NULL default '', value varchar(128) default NULL, PRIMARY KEY (student_id,param), KEY idx_value (value) ) SELECT VALUE FROM mailer_student_status WHERE student_id = 586925 AND VALUE = 0; SELECT VALUE FROM mailer_student_status WHERE PARAM = 'FIRST_MAILER_COUPON_CODE' AND VALUE = 0;
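[Editor's note: given those two queries, a sketch of indexes shaped to their WHERE clauses, assuming these are the dominant access patterns:]

```sql
-- Query 1 (student_id = ?, value = ?): the PRIMARY KEY (student_id, param)
-- already narrows the scan to one student's rows.
-- Query 2 (param = ?, value = ?): a composite index matches both
-- predicates and may make the single-column idx_value redundant:
ALTER TABLE mailer_student_status
    ADD INDEX idx_param_value (param, value),
    DROP INDEX idx_value;
```

Dropping idx_value is optional and worth verifying with EXPLAIN against the real workload first.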
Show indexing status
Hi, I just finished restoring a 22gig SQL dump but the server is not performing anywhere near where it should be. I'm assuming this is because it's still rebuilding indexes on the imported tables. Is there any way to see the indexing status so I can gauge how far it's got? Thanks. -Stut -- http://stut.net/
Re: Show indexing status
As far as I am aware, the indexes are built on import. It may be that the key buffer or InnoDB buffer (depending on engine), and the query cache, are all cold. They may take a day or so to build up, depending on size and load. Other than that there must be some external difference. Is it the same server? Same config file? Same default engine? Ben
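[Editor's note: a cold cache alone can explain a large slowdown right after a restore; a sketch of what to compare between the old and new servers:]

```sql
SHOW GLOBAL VARIABLES LIKE 'innodb_buffer_pool_size';
SHOW GLOBAL VARIABLES LIKE 'key_buffer_size';

-- For InnoDB, a high ratio of Innodb_buffer_pool_reads (physical) to
-- Innodb_buffer_pool_read_requests (logical) means the pool is still cold:
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';
```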
Re: Show indexing status
On 4 Jun 2008, at 15:53, Ben Clewett wrote: As far as I am aware, the index's are built on import. It may be that the key-buffer or innodb-buffer (depending on engine), and the query cache, are all cold. May take a day or so to build them up depending on size and load. Other than that there must be some external difference. Is it the same server? Same config file? Same default engine? All tables are InnoDB. Different server (temporarily on one of the Apache servers until replacement HDDs for the DB server arrive) but CPU is 50% idle with MySQL taking ~33% continuously. Load is hovering just over 2 (2 CPUs in there). Config file is probably a little different - we didn't have a copy of it before the DB HDD's went belly up. Stut wrote: Hi, I just finished restoring a 22gig SQL dump but the server is not performing anywhere near where it should be. I'm assuming this is because it's still rebuilding indexes on the imported tables. Is there any way to see the indexing status so I can gauge how far it's got? Thanks. -Stut Thanks. -Stut -- http://stut.net/ -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
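MySQL of that era exposes no per-index build progress meter, but a few standard server commands (real MySQL statements, not specific to this poster's setup) show whether the server is still busy on the import and whether the InnoDB buffer pool is warming up, which is what the replies above suspect:

```sql
-- Any long-running import or ALTER shows up here with its elapsed time:
SHOW FULL PROCESSLIST;

-- The InnoDB monitor output includes pending I/O and buffer pool activity:
SHOW ENGINE INNODB STATUS;

-- A rising ratio of read requests to physical reads means the buffer
-- pool is warming up and fewer queries are hitting disk:
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';
```

If the process list is idle and the buffer pool counters look healthy, the slowdown is more likely the config-file differences mentioned in the thread than any ongoing index build.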
Indexing question
Hi, I have created a rather large table containing about 16M records. Most of the indexed fields are smallint, but there is one field that is a text field that I am using fulltext indexing on. The total size of the smallint indexes is only about 30 MB, but the fulltext index brings the total index size to about 2 GB. My question is: When mysql adds indexes to the keycache, does it add each index individually or does it dump the entire .MYI file to the keycache. If it is the latter, I should move the fulltext index to another table. Thanks in advance, Christian -- J. Christian Hesketh M.Sc. CEO - Ion Channel Media Group 2028 Harvard Avenue, Suite 103 Montreal, QC CANADA H4A 2V9 Tel: +1(514)245-8107 http://www.IonChannelMedia.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Indexing question
What is the size of the text field you're fulltext indexing? How often is that index used? You might be best off to create a table containing only that column and a PK that is equal to the PK in the original table. You might also keep a portion of the text field (say 50 characters) in the original table, for head-first indexing. Arthur On Tue, Mar 25, 2008 at 9:56 AM, J. Christian Hesketh [EMAIL PROTECTED] wrote: Hi, I have created a rather large table containing about 16M records. Most of the indexed fields are smallint, but there is one field that is a text field that I am using fulltext indexing on. The total size of the smallint indexes is only about 30 MB, but the fulltext index brings the total index size to about 2 GB. My question is: When mysql adds indexes to the keycache, does it add each index individually or does it dump the entire .MYI file to the keycache. If it is the latter, I should move the fulltext index to another table. Thanks in advance, Christian -- J. Christian Hesketh M.Sc. CEO - Ion Channel Media Group 2028 Harvard Avenue, Suite 103 Montreal, QC CANADA H4A 2V9 Tel: +1(514)245-8107 http://www.IonChannelMedia.com http://www.ionchannelmedia.com/ -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]
Re: Indexing question
The field has up to 1000 characters, usually well over 255. It is searched constantly (2-3 times every second). The smallint values are scanned up to 40 times a second. So, are you saying that the entire MYI file is dumped into the keycache? If so, your suggestion to create a separate table with PK and text field seems to be the only sensible decision. Thanks in advance, Christian On Tue, Mar 25, 2008 at 10:11 AM, Arthur Fuller [EMAIL PROTECTED] wrote: What is the size of the text field you're fulltext indexing? How often is that index used? You might be best off to create a table containing only that column and a PK that is equal to the PK in the original table. You might also keep a portion of the text field (say 50 characters) in the original table, for head-first indexing. Arthur On Tue, Mar 25, 2008 at 9:56 AM, J. Christian Hesketh [EMAIL PROTECTED] wrote: Hi, I have created a rather large table containing about 16M records. Most of the indexed fields are smallint, but there is one field that is a text field that I am using fulltext indexing on. The total size of the smallint indexes is only about 30 MB, but the fulltext index brings the total index size to about 2 GB. My question is: When mysql adds indexes to the keycache, does it add each index individually or does it dump the entire .MYI file to the keycache. If it is the latter, I should move the fulltext index to another table. Thanks in advance, Christian -- J. Christian Hesketh M.Sc. CEO - Ion Channel Media Group 2028 Harvard Avenue, Suite 103 Montreal, QC CANADA H4A 2V9 Tel: +1(514)245-8107 http://www.IonChannelMedia.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED] -- J. Christian Hesketh M.Sc. 
CEO - Ion Channel Media Group 2028 Harvard Avenue, Suite 103 Montreal, QC CANADA H4A 2V9 Tel: +1(514)245-8107 http://www.IonChannelMedia.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
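Arthur's suggestion of splitting the fulltext column into its own table can be sketched as follows; the table and column names here are hypothetical stand-ins for the poster's schema, and the idea is simply to keep the 2 GB fulltext index in a separate .MYI file from the ~30 MB of smallint indexes:

```sql
-- Side table holding only the PK and the fulltext-indexed column.
-- MyISAM is assumed because FULLTEXT required MyISAM in MySQL of this era.
CREATE TABLE article_body (
    article_id INT UNSIGNED NOT NULL PRIMARY KEY,  -- same PK as the main table
    body TEXT NOT NULL,
    FULLTEXT KEY ft_body (body)
) ENGINE=MyISAM;

-- Search the side table, then join back to the main table for the
-- smallint columns (main_table is a placeholder name):
SELECT m.*
FROM article_body b
JOIN main_table m ON m.article_id = b.article_id
WHERE MATCH(b.body) AGAINST ('ion channel');
```

The main table's key cache then only ever holds the small indexes, and the fulltext index competes for memory separately.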
Re: Indexing one byte flags - what implementation is better
Garris, Nicole wrote: Is id a sequential number? And is it referenced by other tables? If so, and if over time new products become old products, then CASE 2 is more complex, because when moving a product (i.e., a row) from the new product table to the old product table, the value of id needs to stay the same. So for CASE 2 you'll need a third object to keep track of the highest value for id. I think you did not get it right. There is no MOVING of products. All products are stored in the 'products' table only, and newproducts is just a kind of FLAG table. It contains ONLY the ids of products considered new, nothing else. So, if a product is new then it is of course in the products table and in the newproducts table, and if it is not new then it is only in the 'products' table. ID is a sequential number (but no auto_increment - i hate it). PS: top posting is really popular in the mysql list! -- Artem -Original Message- From: Artem Kuchin [mailto:[EMAIL PROTECTED] Sent: Thursday, December 27, 2007 1:19 PM To: mysql@lists.mysql.com Subject: Indexing one byte flags - what implementation is better Maybe someone could provide good, reasonable input on this issue. Let's say i have a table products CASE 1: table: products id int unsigned not null, name char(128) not null, f_new tinyint not null id - is basically the id of a product name - is the name of a product f_new - is a one byte flag. If it is 1 the product is considered new. In this case to select all new products including name i need to do: select id, name from products where f_new=1 CASE 2: The above can be done another way - via two tables, one products table and another one listing all ids of new products: create table products ( id int unsigned not null, name char(128) not null, primary key (id) ); create table newproducts ( product_id int unsigned not null, primary key (product_id) ); If a product's id is in the newproducts table then it is a new product. To choose all new products including name i need to do: SELECT id,name FROM newproducts INNER JOIN products ON products.id=newproducts.product_id The questions are: 1) which way is FASTER? 2) which way eats less memory? 3) which way eats less cpu? 4) which way eats less hdd io? There are several cases for each question: 1) 1000 products - i think both methods are pretty much the same in this case because all of the data would be cached in memory 2) 10 products, 3 new products - interesting to know which method is better here and how each of the methods performs. 3) 10 products, 50 new products - interesting to know which method is better here and how each of the methods performs. I will greatly appreciate input on this issue. -- Artem -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Indexing one byte flags - what implementation is better
Maybe someone could provide good, reasonable input on this issue. Let's say i have a table products CASE 1: table: products id int unsigned not null, name char(128) not null, f_new tinyint not null id - is basically the id of a product name - is the name of a product f_new - is a one byte flag. If it is 1 the product is considered new. In this case to select all new products including name i need to do: select id, name from products where f_new=1 CASE 2: The above can be done another way - via two tables, one products table and another one listing all ids of new products: create table products ( id int unsigned not null, name char(128) not null, primary key (id) ); create table newproducts ( product_id int unsigned not null, primary key (product_id) ); If a product's id is in the newproducts table then it is a new product. To choose all new products including name i need to do: SELECT id,name FROM newproducts INNER JOIN products ON products.id=newproducts.product_id The questions are: 1) which way is FASTER? 2) which way eats less memory? 3) which way eats less cpu? 4) which way eats less hdd io? There are several cases for each question: 1) 1000 products - i think both methods are pretty much the same in this case because all of the data would be cached in memory 2) 10 products, 3 new products - interesting to know which method is better here and how each of the methods performs. 3) 10 products, 50 new products - interesting to know which method is better here and how each of the methods performs. I will greatly appreciate input on this issue. -- Artem -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
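Both of Artem's designs can be put side by side and compared empirically with EXPLAIN rather than argued in the abstract. This sketch corrects the spelling of the posted DDL and adds the index that CASE 1 needs to avoid a full table scan; it is a comparison harness, not a verdict on which case wins:

```sql
-- CASE 1: flag column, with an index on the flag so the WHERE can use it.
CREATE TABLE products (
    id    INT UNSIGNED NOT NULL,
    name  CHAR(128) NOT NULL,
    f_new TINYINT NOT NULL,
    PRIMARY KEY (id),
    KEY idx_f_new (f_new)
);

-- CASE 2: separate flag table keyed on the product id.
CREATE TABLE newproducts (
    product_id INT UNSIGNED NOT NULL,
    PRIMARY KEY (product_id)
);

-- Compare the access plans for the two equivalent queries:
EXPLAIN SELECT id, name FROM products WHERE f_new = 1;
EXPLAIN SELECT p.id, p.name
FROM newproducts n
JOIN products p ON p.id = n.product_id;
```

A caveat worth noting: a two-value flag has very low cardinality, so when most products are new the optimizer may reasonably ignore idx_f_new and scan anyway, whereas the CASE 2 join touches only as many rows as there are new products.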
RE: Indexing one byte flags - what implementation is better
Is id a sequential number? And is it referenced by other tables? If so, and if over time new products become old products, then CASE 2 is more complex, because when moving a product (i.e., a row) from the new product table to the old product table, the value of id needs to stay the same. So for CASE 2 you'll need a third object to keep track of the highest value for id. -Original Message- From: Artem Kuchin [mailto:[EMAIL PROTECTED] Sent: Thursday, December 27, 2007 1:19 PM To: mysql@lists.mysql.com Subject: Indexing one byte flags - what implementation is better Maybe someone could provide good, reasonable input on this issue. Let's say i have a table products CASE 1: table: products id int unsigned not null, name char(128) not null, f_new tinyint not null id - is basically the id of a product name - is the name of a product f_new - is a one byte flag. If it is 1 the product is considered new. In this case to select all new products including name i need to do: select id, name from products where f_new=1 CASE 2: The above can be done another way - via two tables, one products table and another one listing all ids of new products: create table products ( id int unsigned not null, name char(128) not null, primary key (id) ); create table newproducts ( product_id int unsigned not null, primary key (product_id) ); If a product's id is in the newproducts table then it is a new product. To choose all new products including name i need to do: SELECT id,name FROM newproducts INNER JOIN products ON products.id=newproducts.product_id The questions are: 1) which way is FASTER? 2) which way eats less memory? 3) which way eats less cpu? 4) which way eats less hdd io? There are several cases for each question: 1) 1000 products - i think both methods are pretty much the same in this case because all of the data would be cached in memory 2) 10 products, 3 new products - interesting to know which method is better here and how each of the methods performs. 3) 10 products, 50 new products - interesting to know which method is better here and how each of the methods performs. I will greatly appreciate input on this issue. -- Artem -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
indexing tables using my own functions
Hi I've created a function that returns a float value. The code for it is:

create function IDR(pin1 varchar(20), pin4 varchar(20), pin6 varchar(20), pin7 varchar(20), pin9 varchar(20), MOL varchar(20)) returns float DETERMINISTIC
begin
declare output float;
declare P1 float; declare P4 float; declare P6 float; declare P7 float; declare P9 float;
select VALUE into P1 from PSSMS where AA=pin1 and POS='1' and MOLEC=MOL;
select VALUE into P4 from PSSMS where AA=pin4 and POS='4' and MOLEC=MOL;
select VALUE into P6 from PSSMS where AA=pin6 and POS='6' and MOLEC=MOL;
select VALUE into P7 from PSSMS where AA=pin7 and POS='7' and MOLEC=MOL;
select VALUE into P9 from PSSMS where AA=pin9 and POS='9' and MOLEC=MOL;
select P1+P4+P6+P7+P9 into output;
return output;
end //

And it works. Now I would like to index a table using this function. The table description is:

mysql> describe precalc;
+-------+---------+------+-----+---------+----------------+
| Field | Type    | Null | Key | Default | Extra          |
+-------+---------+------+-----+---------+----------------+
| id    | int(6)  | NO   | PRI | NULL    | auto_increment |
| P1    | char(1) | YES  |     | NULL    |                |
| P4    | char(1) | YES  |     | NULL    |                |
| P6    | char(1) | YES  |     | NULL    |                |
| P7    | char(1) | YES  |     | NULL    |                |
| P9    | char(1) | YES  |     | NULL    |                |
+-------+---------+------+-----+---------+----------------+
6 rows in set (0.01 sec)

and I try to create the index with the following command:

mysql> create index AA on precalc (IDR(P1,P4,P6,P7,P9,'HLA-DRB13'));

But I get the following error: ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''P1','P4','P6','P7','P9','HLA-DRB13'))' at line 1 Does anyone know where the error is?
Thanks Pau -- Pau Marc Muñoz Torres Laboratori de Biologia Computacional Institut de Biotecnologia i Biomedicina Vicent Villar Universitat Autonoma de Barcelona E-08193 Bellaterra (Barcelona) telèfon: 93 5812807 Email : [EMAIL PROTECTED]
Re: indexing tables using my own functions
mysql create index AA on precalc (IDR(P1,P4,P6,P7,P9,'HLA-DRB13')); But i Get the following error: ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''P1','P4','P6','P7','P9','HLA-DRB13'))' at line 1 Some one knows where is the error? As far as I can see ( http://dev.mysql.com/doc/refman/5.0/en/create-index.html ) you can only use columns, not a function. Martijn Tonies Database Workbench - tool for InterBase, Firebird, MySQL, NexusDB, Oracle MS SQL Server Upscene Productions http://www.upscene.com My thoughts: http://blog.upscene.com/martijn/ Database development questions? Check the forum! http://www.databasedevelopmentforum.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: indexing tables using my own functions
As far as I can see, from MySQL 5.0 and up it is possible to create an index using functions: http://www.faqs.org/docs/ppbook/r24254.htm But I keep having problems with the example from the link. Is there a bug in mysql 5.0.24a-log? 2007/11/13, Martijn Tonies [EMAIL PROTECTED]: mysql> create index AA on precalc (IDR(P1,P4,P6,P7,P9,'HLA-DRB13')); But I get the following error: ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''P1','P4','P6','P7','P9','HLA-DRB13'))' at line 1 Does anyone know where the error is? As far as I can see ( http://dev.mysql.com/doc/refman/5.0/en/create-index.html ) you can only use columns, not a function. Martijn Tonies Database Workbench - tool for InterBase, Firebird, MySQL, NexusDB, Oracle MS SQL Server Upscene Productions http://www.upscene.com My thoughts: http://blog.upscene.com/martijn/ Database development questions? Check the forum! http://www.databasedevelopmentforum.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED] -- Pau Marc Muñoz Torres Laboratori de Biologia Computacional Institut de Biotecnologia i Biomedicina Vicent Villar Universitat Autonoma de Barcelona E-08193 Bellaterra (Barcelona) telèfon: 93 5812807 Email : [EMAIL PROTECTED]
Re: indexing tables using my own functions
As far as I can see, from MySQL 5.0 and up it is possible to create an index using functions: http://www.faqs.org/docs/ppbook/r24254.htm But I keep having problems with the example from the link. Is there a bug in mysql 5.0.24a-log? The above website says: Practical PostgreSQL. I cannot find MySQL anywhere on that page. Martijn Tonies Database Workbench - development tool for MySQL, and more! Upscene Productions http://www.upscene.com My thoughts: http://blog.upscene.com/martijn/ Database development questions? Check the forum! http://www.databasedevelopmentforum.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
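Martijn is right that the linked page documents PostgreSQL, and MySQL 5.0 has no functional indexes. The usual workaround on MySQL of that era is to materialize the function's result into an ordinary column and index that instead; the column name idr_value below is made up for illustration:

```sql
-- Precompute IDR() into a plain indexed column.
ALTER TABLE precalc ADD COLUMN idr_value FLOAT;

UPDATE precalc
SET idr_value = IDR(P1, P4, P6, P7, P9, 'HLA-DRB13');

CREATE INDEX AA ON precalc (idr_value);

-- Queries then filter on the indexed column instead of calling
-- the function once per row:
SELECT id FROM precalc WHERE idr_value > 3.5;
```

The trade-off is that idr_value must be kept current, either by re-running the UPDATE after loads or by maintaining it from triggers on INSERT/UPDATE.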
Re: Advanced Indexing
Hi Cory, Cory Robin wrote: Is there a way to only include certain matching conditions in indexes? Example if I have a row I want to index that is mysql dates (2007-06-07) and I only want to include CURRENT and FUTURE dates in the index and ignore any past dates. Is that possible at all? The issue I have is that the ratio of queries on old vs. new data is like 1:10. And searches would be MUCH faster if I could force my queries that are looking at current or future data to use an index that ONLY had that information in them.. You can't do this. The strategies to do what you want will usually involve archiving off old data to other tables with fewer indexes, and creating a VIEW over the old and new tables, or moving all old data to a data warehouse, or something like that. Incidentally, this is exactly what I built MySQL Archiver to do (just released a few days ago): http://www.xaprb.com/blog/2007/06/06/mysql-archiver-091-released/ http://www.xaprb.com/blog/2006/05/02/how-to-write-efficient-archiving-and-purging-jobs-in-sql/ http://sourceforge.net/projects/mysqltoolkit Baron -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Advanced Indexing
On 6/7/07, Cory Robin [EMAIL PROTECTED] wrote: The issue I have is that the ratio of queries on old vs. new data is like 1:10. And searches would be MUCH faster if I could force my queries that are looking at current or future data to use an index that ONLY had that information in them.. You might want to look at partition pruning in MySQL 5.1. - Perrin -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
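Perrin's pointer to partition pruning can be sketched concretely. In MySQL 5.1, RANGE-partitioning the table on the date column means a query bounded to current or future dates never scans the archived rows, which approximates the "index on only new data" the original poster wanted; all table and column names here are illustrative, not from the poster's schema:

```sql
-- 5.1 sketch: the partitioning column must appear in every unique key,
-- hence the composite primary key.
CREATE TABLE events (
    id INT UNSIGNED NOT NULL,
    event_date DATE NOT NULL,
    PRIMARY KEY (id, event_date)
)
PARTITION BY RANGE (TO_DAYS(event_date)) (
    PARTITION p_archive VALUES LESS THAN (TO_DAYS('2007-01-01')),
    PARTITION p_current VALUES LESS THAN MAXVALUE
);

-- Pruning restricts this query to p_current; the old rows in
-- p_archive are never read:
SELECT id FROM events WHERE event_date >= '2007-06-07';
```

On MySQL 5.0 the same effect needs Baron's approach instead: archive old rows into a lightly indexed table and union the two behind a VIEW.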
Advanced Indexing
Is there a way to only include certain matching conditions in indexes? For example, if I have a column I want to index that contains MySQL dates (2007-06-07), and I only want to include CURRENT and FUTURE dates in the index and ignore any past dates. Is that possible at all? The issue I have is that the ratio of queries on old vs. new data is like 1:10, and searches would be MUCH faster if I could force the queries that look at current or future data to use an index that ONLY had that information in it. Thanks in advance for insight here. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
indexing order column
hi, if I have column order_id(int(4)) null do I have to index it too. I'm going to use it ONLY for sorting records. thanks. -afan -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: indexing order column
Hi Afan, Afan Pasalic wrote: hi, if I have column order_id(int(4)) null do I have to index it too. I'm going to use it ONLY for sorting records. It depends a lot on how much data is in the table, etc etc. An index will make sorting more efficient in the general case when you have a decent amount of data. But it's hard to be specific with so little information. Baron -- Baron Schwartz http://www.xaprb.com/ -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: indexing order column
Baron Schwartz wrote: Hi Afan, Afan Pasalic wrote: hi, if I have column order_id(int(4)) null do I have to index it too. I'm going to use it ONLY for sorting records. It depends a lot on how much data is in the table, etc etc. An index will make sorting more efficient in the general case when you have a decent amount of data. But it's hard to be specific with so little information. Baron I have table products (product_id is PK). I have table categories (cat_id is PK). since the product can be in more than one category, I have prod_cat table: create prod_cat( cat_id int(8) unsigned not null, prod_id int(8) unsigned not null, order_id int(4) unsigned null, PRIMARY KEY (cat_id, prod_id) ) engine=Innodb; would it be enough info? thanks. -afan -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: indexing order column
Hi, Afan Pasalic wrote: Baron Schwartz wrote: Hi Afan, Afan Pasalic wrote: hi, if I have column order_id(int(4)) null do I have to index it too. I'm going to use it ONLY for sorting records. It depends a lot on how much data is in the table, etc etc. An index will make sorting more efficient in the general case when you have a decent amount of data. But it's hard to be specific with so little information. Baron I have table products (product_id is PK). I have table categories (cat_id is PK). since the product can be in more than one category, I have prod_cat table: create prod_cat( cat_id int(8) unsigned not null, prod_id int(8) unsigned not null, order_id int(4) unsigned null, PRIMARY KEY (cat_id, prod_id) ) engine=Innodb; Okay, so your order_id is really sort order, not id of the customer's request to buy something. (As an aside, perhaps sort_order would confuse you less in the future when you don't remember the column's purpose anymore). It probably makes sense to index the column if you want to use it for sorting. You could also just order by the primary key. But I understand there are times when you want to have a different ordering. Baron -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: indexing order column
Baron Schwartz wrote: Hi, Afan Pasalic wrote: Baron Schwartz wrote: Hi Afan, Afan Pasalic wrote: hi, if I have column order_id(int(4)) null do I have to index it too. I'm going to use it ONLY for sorting records. It depends a lot on how much data is in the table, etc etc. An index will make sorting more efficient in the general case when you have a decent amount of data. But it's hard to be specific with so little information. Baron I have table products (product_id is PK). I have table categories (cat_id is PK). since the product can be in more than one category, I have prod_cat table: create prod_cat( cat_id int(8) unsigned not null, prod_id int(8) unsigned not null, order_id int(4) unsigned null, PRIMARY KEY (cat_id, prod_id) ) engine=Innodb; Okay, so your order_id is really sort order, not id of the customer's request to buy something. (As an aside, perhaps sort_order would confuse you less in the future when you don't remember the column's purpose anymore). It probably makes sense to index the column if you want to use it for sorting. You could also just order by the primary key. But I understand there are times when you want to have a different ordering. Baron yes, you're right. sort_order does make more sense :) order by PK, in my case, is let's say impossible because I'm sorting products in ONE category. e.g.:

cat_id | prod_id | order_id
     1 |      23 |        1
     1 |      25 |        2
     1 |      36 |        3
     1 |      13 |        4
     2 |      13 |        1
     2 |      45 |        2
     2 |      47 |        3
     2 |      51 |        4
     3 |      32 |        1
     3 |      33 |        2
     3 |      34 |        3
     3 |      35 |        4

-afan -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
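Given Afan's access pattern of sorting the products of ONE category, a composite index whose leading column is the equality predicate and whose second column is the sort column lets MySQL read the rows in order with no filesort. A small sketch under those assumptions (the index name is made up):

```sql
-- cat_id narrows the scan to one category; within that prefix the
-- entries are already stored in order_id order:
ALTER TABLE prod_cat ADD INDEX idx_cat_sort (cat_id, order_id);

SELECT prod_id
FROM prod_cat
WHERE cat_id = 1
ORDER BY order_id;
```

EXPLAIN on this query should show the index being used without "Using filesort"; if it still sorts, the index definition and query predicates are worth re-checking against each other.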
RE: Joins versus Grouping/Indexing: Normalization Excessive?
Do you really mean 1:1 relationships? A song might have more than one artist, album, or genre. Could a song have no album at all, such as something that was released in MP3 format? Also, you might have more than one version of a song. You might need a separate table for recordings, with each song having one or more recordings. That would be a good place to keep the release information, rather than storing it with the song. The biggest problem would be to figure out how to index the songs, themselves. I don't have a good suggestion for that off the top of my head. That all being said, there's no reason I can think of not to normalize the data. Regards, Jerry Schwartz Global Information Incorporated 195 Farmington Ave. Farmington, CT 06032 860.674.8796 / FAX: 860.674.8341 -Original Message- From: Daniel Cousineau [mailto:[EMAIL PROTECTED] Sent: Tuesday, April 03, 2007 12:15 AM To: mysql@lists.mysql.com Subject: Joins versus Grouping/Indexing: Normalization Excessive? So I'm currently designing a database for a web site and intra net for my campuses student radio. Since I'm not getting paid for this and I'm doing this in my free time I kinda want to take my time and have the system setup as perfectly as any one college student can. I'm currently debating on how I want to store all the songs in the system. Namely I'm not quite sure which way will eek the most performance out of MySQL. My plan so far is quite normalized, a songs table with 1:1 relationships with an Albums, Artists, and Genres table. The big benefits I see from this is when I'm building the intra net application I'll want to pull just all of the artists or all of the albums, etc. However I feel like I'm encountering issues with where to store the year field of an mp3 (do I want it on the album, song, or both) along with issues like printing everything out at once. 
The only other way I can think of thats relatively efficient is to have the singular songs table and have indexes on albums, artists, and genres. My question, more out of curiosity than necessity, is which of these would be more efficient (given that I'll be using the InnoDB storage engine)? Other relevant facts include it'll be using the latest, stable release of MySQL 5 and I'll be developing in PHP5 (through CakePHP's database abstraction layer). -- Thanks and Gig 'Em! Daniel Cousineau http://www.terminalfuture.com/ http://www.linkedin.com/in/dcousineau [EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Joins versus Grouping/Indexing: Normalization Excessive?
So I'm currently designing a database for a web site and intranet for my campus's student radio. Since I'm not getting paid for this and I'm doing it in my free time, I kinda want to take my time and have the system set up as perfectly as any one college student can. I'm currently debating how I want to store all the songs in the system. Namely, I'm not quite sure which way will eke the most performance out of MySQL. My plan so far is quite normalized: a songs table with 1:1 relationships with an Albums, Artists, and Genres table. The big benefit I see from this is that when I'm building the intranet application I'll want to pull just all of the artists, or all of the albums, etc. However, I feel like I'm encountering issues with where to store the year field of an mp3 (do I want it on the album, the song, or both) along with issues like printing everything out at once. The only other way I can think of that's relatively efficient is to have the singular songs table and have indexes on albums, artists, and genres. My question, more out of curiosity than necessity, is which of these would be more efficient (given that I'll be using the InnoDB storage engine)? Other relevant facts: it'll be using the latest stable release of MySQL 5 and I'll be developing in PHP5 (through CakePHP's database abstraction layer). -- Thanks and Gig 'Em! Daniel Cousineau http://www.terminalfuture.com/ http://www.linkedin.com/in/dcousineau [EMAIL PROTECTED]
Re: Joins versus Grouping/Indexing: Normalization Excessive?
I think you're approaching this from the wrong angle. You'll want to put the data at the highest level at which it changes. i.e. If every song on an album is always the same year, put it at the album level, however, if it changes from song to song on a particular album, then you want it at the song level. Year wouldn't ever apply to artist I don't think, unless they're truly a one hit wonder. :) -Micah On 04/02/2007 09:14 PM, Daniel Cousineau wrote: So I'm currently designing a database for a web site and intra net for my campuses student radio. Since I'm not getting paid for this and I'm doing this in my free time I kinda want to take my time and have the system setup as perfectly as any one college student can. I'm currently debating on how I want to store all the songs in the system. Namely I'm not quite sure which way will eek the most performance out of MySQL. My plan so far is quite normalized, a songs table with 1:1 relationships with an Albums, Artists, and Genres table. The big benefits I see from this is when I'm building the intra net application I'll want to pull just all of the artists or all of the albums, etc. However I feel like I'm encountering issues with where to store the year field of an mp3 (do I want it on the album, song, or both) along with issues like printing everything out at once. The only other way I can think of thats relatively efficient is to have the singular songs table and have indexes on albums, artists, and genres. My question, more out of curiosity than necessity, is which of these would be more efficient (given that I'll be using the InnoDB storage engine)? Other relevant facts include it'll be using the latest, stable release of MySQL 5 and I'll be developing in PHP5 (through CakePHP's database abstraction layer). -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
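Jerry's point that a song may have more than one artist (and possibly no album) means the relationship is many-to-many, not 1:1, so a junction table fits better than a foreign key on the songs table. One hedged sketch of that layout, with all names invented for illustration rather than taken from Daniel's actual schema:

```sql
CREATE TABLE artists (
    artist_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(255) NOT NULL
) ENGINE=InnoDB;

CREATE TABLE songs (
    song_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(255) NOT NULL,
    album_id INT UNSIGNED NULL,   -- NULL for MP3-only releases with no album
    year SMALLINT UNSIGNED NULL   -- per Micah: store year where it varies
) ENGINE=InnoDB;

-- Junction table: one row per (song, artist) pairing.
CREATE TABLE song_artist (
    song_id INT UNSIGNED NOT NULL,
    artist_id INT UNSIGNED NOT NULL,
    PRIMARY KEY (song_id, artist_id),
    KEY idx_artist (artist_id),
    FOREIGN KEY (song_id) REFERENCES songs (song_id),
    FOREIGN KEY (artist_id) REFERENCES artists (artist_id)
) ENGINE=InnoDB;
```

The composite primary key covers "artists of a song", and the idx_artist key covers the reverse "songs by an artist" lookup, which is the query the intranet application would run most.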
Help indexing this query.
--- ptsSignups ---
id int(20) No
mid int(20) No 0
ptsID int(20) No 0
pps double No 0
points double No 0
loginID varchar(255) No
emailConfirm longtext No
time timestamp Yes CURRENT_TIMESTAMP
reasonForDeny longtext No
status int(1) No 1

--- index (timeframe) ---
timeframe (mid, status, time)

--- query ---
SELECT SUM(s.pps) AS earned, m.id, m.username
FROM ptsSignups s FORCE INDEX (timeframe)
JOIN members m ON s.mid = m.id AND m.status = 'Member'
LEFT JOIN ptsContestExclude e ON e.cid = 1 AND e.mid = m.id
WHERE s.status = '2'
  AND s.time >= '2004-06-08'
  AND s.time < '2008-06-08' + INTERVAL 1 DAY
  AND e.mid IS NULL
GROUP BY s.mid
HAVING earned > 0
ORDER BY earned DESC

--- problem ---
`ptsSignups` is a table listing everything my members have completed. Sometimes I like to run contests to see who has earned the most. `members` is a table that contains all my users. `ptsContestExclude` is a table of members whom I would like to exclude from the contest. What I do first is group the table `ptsSignups` by member id and calculate a sum of how much each member earned. Then I reorder that sum in descending order so the highest earner is on top. This `ptsSignups` table contains 82752 rows and is 75KB big. It runs extremely slowly. I tried to create an index for it, but it failed to increase performance. Any help is appreciated. -- View this message in context: http://www.nabble.com/Help-indexing-this-query.-tf3059045.html#a8505554 Sent from the MySQL - General mailing list archive at Nabble.com.
Re: Help indexing this query.
Andrew, can you post the result of EXPLAIN for your query? Minus the FORCE INDEX, too. Also, the structure of the other two tables would be helpful as well. Thanks, Dan
Re: Help indexing this query.
--- EXPLAIN ---
| id | select_type | table | type   | possible_keys        | key     | key_len | ref         | rows  | Extra                                        |
|  1 | SIMPLE      | e     | system | cid                  | NULL    | NULL    | NULL        |     0 | const row not found                          |
|  1 | SIMPLE      | s     | ref    | sid,status,timeframe | status  | 4       | const       | 20438 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | m     | eq_ref | PRIMARY,status       | PRIMARY | 4       | sc_72.s.mid |     1 | Using where                                  |

--- members ---
id int(20) No
first varchar(255) No
last varchar(255) No
username varchar(25) No
email varchar(255) No
password varchar(25) No
PRIMARY id

--- ptsContestExclude ---
cid int(20) No 0
mid int(20) No 0
UNIQUE (cid,mid)

--- ptsSignups (all indexes) ---
PRIMARY id
INDEX (mid,ptsID)
INDEX (status,ptsID)
INDEX timeframe (mid, status, time)
Re: Help indexing this query.
Andrew, a couple of suggestions:

1 - Where you use s.status='2', change it to s.status=2; otherwise MySQL is likely casting your data from int to string, which is slow and also precludes using an index.

2 - In this case, instead of using a left join, try using a subquery: WHERE ... AND s.mid NOT IN (SELECT mid FROM ptsContestExclude) - or - change your index around, from UNIQUE (cid,mid) to UNIQUE (mid,cid). Due to the way MySQL uses indices, you need the queried-upon column(s) listed first (earlier) in the index.

These might speed things up. HTH, Dan
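Dan's second suggestion, replacing the exclusion LEFT JOIN with a NOT IN subquery, would look roughly like this applied to Andrew's query. This is a sketch only: the comparison operators in the archived message were mangled, so the date bounds are a guess at the original intent, and whether the rewrite is actually faster would need to be measured.

```sql
-- Sketch: the LEFT JOIN ... IS NULL anti-join rewritten as NOT IN,
-- with status compared as an integer to match the column type.
SELECT SUM(s.pps) AS earned, m.id, m.username
FROM ptsSignups s
JOIN members m ON s.mid = m.id AND m.status = 'Member'
WHERE s.status = 2
  AND s.time >= '2004-06-08'
  AND s.time <  '2008-06-08' + INTERVAL 1 DAY
  AND s.mid NOT IN (SELECT mid FROM ptsContestExclude WHERE cid = 1)
GROUP BY s.mid
HAVING earned > 0
ORDER BY earned DESC;
```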
Re: Help indexing this query.
Thanks for the casting tip.
Indexing issue in slave !!
Hi, I am facing a peculiar problem. When I execute a query on the slave server, it doesn't use the indexes that have been created and hence reads all the records present in the table. The same query, when executed on the master, yields proper results. The table structures and table types are the same. What could be wrong? regards Himanshu Raina
Re: Indexing issue in slave !!
In the last episode (Jan 06), Himanshu Raina reported that a query on his slave server doesn't use the indexes it uses on the master. If you compare the EXPLAIN SELECT ... outputs for the query on both servers, what's different? -- Dan Nelson [EMAIL PROTECTED]
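A minimal way to act on Dan's suggestion, with a hypothetical table and query standing in for the poster's:

```sql
-- Run the identical statement on both master and slave and compare
-- the type, key, and rows columns of the output:
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

-- If the slave shows type=ALL while the master uses the index, stale
-- index statistics on the slave are one common cause; refreshing them
-- may let the optimizer pick the index again:
ANALYZE TABLE orders;
```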
Foreign Language FULLTEXT Indexing
I'm guessing that the way MySQL handles foreign-language full-text indexing is through parser plug-ins and custom stopword lists. Am I right? And if so, these must already have been created for the common Western languages such as German, French, etc. Where can I find these plug-ins? Is there any MySQL or third-party documentation that anyone knows of that can help me out? I'm evaluating MySQL for an application doing full-text indexing and searching of English and many different foreign-language documents. Catherine Macheret
Re: example when indexing hurts simple select?
Hi Gaspar, MySQL allows you to pack the index, to make it smaller and to gain performance. Some information about that can be found here: http://www.mysqlperformanceblog.com/2006/05/13/to-pack-or-not-to-pack-myisam-key-compression/

Gaspar Bakos schrieb:
Hi, RE: Have you tried analyze table x; This was quick:

mysql> analyze table TEST;
| Table    | Op      | Msg_type | Msg_text                    |
| CAT.TEST | analyze | status   | Table is already up to date |

mysql> show index from TEST;
| Table | Non_unique | Key_name   | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
| TEST  |          1 | MMtestfiel |            1 | MMtestfiel  | A         |         838 | NULL     | NULL   |      | BTREE      | NULL    |
| TEST  |          1 | MMi_m      |            1 | MMi_m       | A         |       25857 | NULL     | NULL   | YES  | BTREE      | NULL    |

I am trying to figure out what the Packed field means. Gaspar
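The key compression the linked post describes is controlled by the MyISAM PACK_KEYS table option. A sketch, reusing the TEST table name from the thread; the option values shown are illustrative, not a recommendation:

```sql
-- PACK_KEYS=1 compresses both string and numeric keys;
-- PACK_KEYS=DEFAULT compresses only long string keys.
-- Rebuilding the table applies the new setting to existing indexes.
ALTER TABLE TEST ENGINE=MyISAM, PACK_KEYS=1;

-- The Packed column of SHOW INDEX then reports the compression in use.
SHOW INDEX FROM TEST;
```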
example when indexing hurts simple select?
Hello, There is a table (TEST) with ~100 million records and 70 columns (mostly integers, some doubles, and a few short fixed char()), about 100GB in size. The table has a single (non-unique) index on one integer column: MMi. If I invoke a simple select based on MMi, the selection is VERY slow:

nohup time mysql CAT -u catadmin -p$MPWD -e "create table test2 select * from TEST where MMi < 9000;"

(This selects only ~0.5% of the table, by the way, so test2 is a small table, and the time is not spent writing it to disk.)

Time used: 47 minutes: 0.00user 0.00system 47:17.37elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+416minor)pagefaults 0swaps

If I do the same but ignore the index, the select time drops to 1/5th!!!

nohup time mysql CAT -u catadmin -p$MPWD -e "create table test3 \
  select * from TEST ignore index (MMi) where MMi < 9000;"

Time used: 11 minutes: 0.00user 0.00system 11:08.23elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+415minor)pagefaults 0swaps

Without the index, MySQL does a simple thing: it reads the 100GB database sequentially and, while reading, parses the rows and determines whether MMi is below 9000. This is done at about 16MB/s. With the index, it performs a large number of random seeks; the data (.MYD) is probably not organized on disk in sorted MMi order.

Questions:
1. Is there a way to decrease random seeks? E.g. mysqld config parameters, increasing some buffer/cache sizes?
2. Optimize table: is there a way to rearrange the data so that random seeks are minimized?
3. If we have to live with a large number of random seeks, does anyone know how the underlying FS can be optimized?

Cheers, Gaspar

(All this TEST DB is kept on FC3 RedHat + MySQL 5.0.22, XFS filesystem on a 1.5TB RAID-5 array of 4 SATA disks, served by a 3ware RAID controller. The computer runs two 2.0GHz Opteron CPUs and 4GB RAM.)
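For question 2, making an index range scan do mostly sequential reads, MyISAM can physically sort the data file by a column. A sketch with the table and column names from the thread; note this is a one-time sort that later inserts do not maintain, and on a 100GB table the rebuild itself will take a long time:

```sql
-- Physically reorder the .MYD file by MMi so rows with nearby MMi
-- values sit adjacent on disk; a range scan through the MMi index
-- then does far fewer random seeks.
ALTER TABLE TEST ORDER BY MMi;
```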
Re: example when indexing hurts simple select?
What is the EXPLAIN output of each? Have you tried analyze table x; optimize table x; ? On a 100GB table, these might take a while. You probably want to LOCK the table before running them. Is it MyISAM or InnoDB? -- Philip M. Gollucci ([EMAIL PROTECTED]) 323.219.4708 Consultant / http://p6m7g8.net/Resume/resume.shtml Senior Software Engineer - TicketMaster - http://ticketmaster.com
Re: example when indexing hurts simple select?
Hi, Philip,

RE: What is the EXPLAIN output of each?

OK, first I naively typed: explain create table test2 select * from TEST where MMi < 9000; but of course, this does not work.

The simple select that uses MMi_m as the index (and takes up to an hour):

mysql> explain select * from TEST where MMi_m < 9000;
| id | select_type | table | type  | possible_keys | key   | key_len | ref  | rows   | Extra       |
|  1 | SIMPLE      | TEST  | range | MMi_m         | MMi_m | 3       | NULL | 406649 | Using where |

The select ignoring the index (takes only 11 minutes):

mysql> explain select * from TEST ignore key (MMi_m) where MMi_m < 9000;
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows      | Extra       |
|  1 | SIMPLE      | TEST  | ALL  | NULL          | NULL | NULL    | NULL | 470992970 | Using where |

Have you tried analyze table x; optimize table x;

Not yet. As regards OPTIMIZE TABLE, I thought it would not make much sense, because: "OPTIMIZE TABLE should be used if you have deleted a large part of a table or if you have made many changes to a table with variable-length rows (tables that have VARCHAR, VARBINARY, BLOB, or TEXT columns)." But I have just uploaded this table from ASCII and made no changes. Nevertheless, I will give it a try; maybe there is some feature of OPTIMIZE TABLE I don't know of. What did you have in mind?

Is it MyISAM or InnoDB?

MyISAM. I'll keep you posted. I am very curious about how this can be resolved. Cheers, Gaspar
Re: need white papers on performace tuning of full text indexing
A simple search on Google for "mysql fulltext indexing" provides many links, including:

http://jeremy.zawodny.com/blog/archives/000576.html
http://epsilondelta.wordpress.com/2006/02/08/dissecting-mysql-fulltext-indexing/ (overviews of how it works)

and

http://dev.mysql.com/doc/refman/5.0/en/fulltext-fine-tuning.html (fine-tuning full-text search)

Sincerely, Sheeri
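The fine-tuning page linked above centres on a few server variables and an index rebuild step. A sketch of the usual workflow; the table name is hypothetical and the setting shown is an example, not a recommendation:

```sql
-- Check the current full-text settings (ft_min_word_len defaults to 4,
-- so shorter words are not indexed):
SHOW VARIABLES LIKE 'ft%';

-- ft_min_word_len is read-only at runtime: lower it in my.cnf
-- (e.g. ft_min_word_len=3) and restart the server, then rebuild
-- existing FULLTEXT indexes so they pick up the new minimum:
REPAIR TABLE articles QUICK;
```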
need white papers on performace tuning of full text indexing
Anyone know where I can get these? It would be highly appreciated. This is a feature of MySQL 5.0, and my platform is Linux.
Re: MySQL 4.0, FULL-TEXT Indexing and Search Arabic Data, Unicode
Mohsen wrote: But he himself solved his problem, with: mysql_query("SET NAMES utf8"); Even on 4.0.x.

Wrong. I decided to prepare two different versions of my software:
- A MySQL 4.0-friendly version using a romanizing method (hats off to you, Ehsan)
- A MySQL 4.1-compatible version.

The code you mentioned belongs to the 2nd version. "SET NAMES indicates what is in the SQL statements that the client sends. Thus, SET NAMES 'cp1251' tells the server: future incoming messages from this client are in character set cp1251. It also specifies the character set for results that the server sends back to the client. (For example, it indicates what character set column values are if you use a SELECT statement.)" -- MySQL Manual 4.1, 10.3.6, Connection Character Sets and Collations.

Kind regards, Behzad