Re: large temp files created by mysql

2012-05-24 Thread Luis Daniel Lucio Quiroz
I got a solution, maybe

step 1:
mysql> explain select * from users;
+----+-------------+-------+------+---------------+------+---------+------+----------+-------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows     | Extra |
+----+-------------+-------+------+---------------+------+---------+------+----------+-------+
|  1 | SIMPLE      | users | ALL  | NULL          | NULL | NULL    | NULL | 32883093 |       |
+----+-------------+-------+------+---------------+------+---------+------+----------+-------+
1 row in set (0.00 sec)

so you get the rows field (an approximate row count)

Step 2: pick a random offset $r below that count and run

select * from users limit $r,1


What do you think? It is the only way I found that takes seconds, not
minutes. USERS is a 19GB table for me.
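
A minimal sketch of the same two steps in pure SQL, for anyone who wants
to script it (assumes MySQL 5.x; TABLE_ROWS is the same estimate EXPLAIN
shows, and can be stale for InnoDB):

  -- Step 1: approximate row count, without a full COUNT(*) scan
  SELECT TABLE_ROWS INTO @n
  FROM information_schema.TABLES
  WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = 'users';

  -- Step 2: LIMIT does not accept variables, so prepare the statement
  SET @r = FLOOR(RAND() * @n);
  SET @sql = CONCAT('SELECT * FROM users LIMIT ', @r, ',1');
  PREPARE stmt FROM @sql;
  EXECUTE stmt;
  DEALLOCATE PREPARE stmt;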

LD

2011/10/30 Jan Steinman j...@bytesmiths.com:
 [snip]





RE: large temp files created by mysql

2012-05-24 Thread Rick James
Maybe.

1. The rows estimate is approximate, so you could over-shoot or under-shoot the
end of the table.

2. OFFSET (LIMIT $r,1) still has to scan $r rows.

3. SELECT * with an OFFSET reads entire rows.  SELECT `foo`, where
`foo` is indexed but is not the InnoDB PRIMARY KEY, will scan only
INDEX(foo).  That is likely to be much faster.  But you are unlikely to do that
unless foo is UNIQUE.

A slight improvement (addressing both issues) is to decide which end to 
approach from.  But scanning from the end needs an ORDER BY, preferably on the 
PRIMARY KEY.

etc.
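
A sketch of the index-scan variant in point 3 (the column and offsets
are illustrative; assumes a secondary index on `foo`):

  -- Only the index pages for foo (plus the PK pointer) are read,
  -- not the full 19GB of row data:
  SELECT foo FROM users ORDER BY foo LIMIT 12345678, 1;

  -- "Decide which end to approach from": if the random offset lands in
  -- the upper half, scan backwards so at most half the index is walked.
  SELECT foo FROM users ORDER BY foo DESC LIMIT 4567890, 1;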


 -----Original Message-----
 From: Luis Daniel Lucio Quiroz [mailto:luis.daniel.lu...@gmail.com]
 Sent: Thursday, May 24, 2012 12:00 AM
 To: Jan Steinman
 Cc: mysql@lists.mysql.com
 Subject: Re: large temp files created by mysql

 [snip]





RE: large temp files created by mysql

2012-05-24 Thread Luis Daniel Lucio Quiroz
For my purposes, approximate is fine. I guess for others it isn't.
On May 24, 2012 9:59 a.m., Rick James rja...@yahoo-inc.com wrote:

 [snip]




Re: large temp files created by mysql

2011-10-30 Thread Jan Steinman
 From: mos mo...@fastmail.fm
 
 At 10:34 AM 10/24/2011, you wrote:
 select id from table order by rand() limit 1;
 is, as an example, creating a dumb temporary table of the full table size
 
 Because it has to sort the entire table, then it returns the one row. This of
 course is extremely inefficient. :)

That is absolutely incredible and counter-intuitive, and (as you say) extremely 
inefficient!

This is used everywhere. Perhaps it is one of the biggest anti-patterns in 
SQL. I just checked two different SQL cookbook sites, and they both recommend 
ORDER BY RAND().

I just googled around a bit, and found that putting RAND() in the WHERE clause 
is very efficient:

SELECT id FROM table WHERE RAND() < 0.01 LIMIT 1

The comparison constant can be optimized for the number of rows you have. The 
above returns the first record of 1% of the table. If you have a million rows, 
you might want to bump that to something like 100 parts per million or so.

But really, folks, this is something so ubiquitous and so recommended, why 
can't the query optimizer look out for ORDER BY RAND() and simply skip the 
table sort and just grab some record? (Hopefully using something better than 
Knuth's LCRNG...)


Learning to think wholistically requires an overriding, or reversal, of much of 
the cultural heritage of the last few hundred years. -- David Holmgren
 Jan Steinman, EcoReality Co-op 





Re: large temp files created by mysql

2011-10-30 Thread Jan Steinman
Actually, having tried that, you still need the ORDER BY RAND() in there. 
Otherwise, I keep getting the same record over and over. But it surely cuts way 
down on the number of rows that need to be sorted.

So if your table size is fairly stable, and you pick a good number for the 
WHERE constant, you can make this quite speedy.

Still, it seems there should be a better way...
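
A sketch of the combined form (the 0.01 is illustrative and should be
tuned to the table size, as above):

  -- The WHERE clause thins the table to roughly 1% before the sort, so
  -- ORDER BY RAND() only sorts that sample instead of every row:
  SELECT id FROM users WHERE RAND() < 0.01 ORDER BY RAND() LIMIT 1;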

On 30 Oct 11, at 18:51, Jan Steinman wrote:

 [snip]


Within a few human generations, the low-energy patterns observable in natural 
landscapes will again form the basis of human system design after the richest 
deposits of fossil fuels and minerals are exhausted. -- David Holmgren
 Jan Steinman, EcoReality Co-op 





Re: large temp files created by mysql

2011-10-26 Thread Reindl Harald


On 26.10.2011 05:14, mos wrote:
  It is not braindead. You told MySQL to sort by rand() which is a
  non-indexed column. It needs to assign a value to each row of the result
  set (all ids of the table) and sort it to get the lowest random number.
  This is very inefficient for large tables.

 but there is no need to do this with the whole table
 if the only requested field is the primary key

 Sure, but if the table has 100 million rows and you want 1 random id, that
 means sorting 100 million ids from the index to disk. This is still grossly
 inefficient. It may work fine on tables with a couple thousand rows, but not
 for million-row tables. That's why the two methods I suggested don't use
 sorting.

this is not the topic
the topic is that ORDER BY RAND() works in a braindead way
all other things are workarounds

and it still matters on a small table of only 512KB whether mysqld
makes a 512KB temp file on each query or not - have fun under
concurrent load even with small tables!

only the auto-ids are requested here as the result - tiny data - so
there is no reason to copy the whole table with all its fields around






Re: large temp files created by mysql

2011-10-25 Thread Reindl Harald


On 25.10.2011 05:45, mos wrote:
 [snip]
 
 It is not braindead. You told MySQL to sort by rand() which is a
 non-indexed column. It needs to assign a value to each row of the result
 set (all ids of the table) and sort it to get the lowest random number.
 This is very inefficient for large tables.

but there is no need to do this with the whole table
if the only requested field is the primary key







Re: large temp files created by mysql

2011-10-25 Thread mos

At 03:32 AM 10/25/2011, you wrote:



[snip]


but there is no need to do this with the whole table
if the only requested field is the primary key


Sure, but if the table has 100 million rows and you want 1 random id,
that means sorting 100 million ids from the index to disk. This is
still grossly inefficient. It may work fine on tables with a couple
thousand rows, but not for million-row tables. That's why the two
methods I suggested don't use sorting.


Mike  






Re: large temp files created by mysql

2011-10-24 Thread Joey L
On Mon, Oct 24, 2011 at 10:52 AM, Joey L mjh2...@gmail.com wrote:
 I have a very large table - approx 3 or 4 gig in size.
 When I initiate a process on my webpage - mysql starts to create a
 temporary table in the /tmp directory.
 Is there a way I can create this file/table ahead of time so mysql
 does not have to create it ?

 thanks
 mjh


Sorry - it is a 30 gig file.




Re: large temp files created by mysql

2011-10-24 Thread Antony T Curtis
Something you're doing is creating a very large temporary table as
part of handling its query. The usual culprit would be a full table
join combined with an ORDER BY or GROUP BY, which typically causes
MySQL to need to create a temp table.


You should run EXPLAIN on your slow-running queries to find out which
ones are likely... If it is generating a 30gig file, I'd expect it
must be a very slow query.
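
A sketch of what to look for (the query shape is illustrative):

  -- "Using temporary; Using filesort" in the Extra column marks the
  -- queries that can spill a temp table to /tmp:
  EXPLAIN SELECT t1.category, COUNT(*)
  FROM t1 JOIN t2 ON t2.t1_id = t1.id
  GROUP BY t1.category
  ORDER BY COUNT(*) DESC;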


Regards
Antony,


On 24 Oct 2011, at 08:03, Joey L wrote:


[snip]







Re: large temp files created by mysql

2011-10-24 Thread Reindl Harald
select id from table order by rand() limit 1;
is, as an example, creating a dumb temporary table of the full table size

On 24.10.2011 17:03, Joey L wrote:

[snip]



-- 

With best regards, Reindl Harald
the lounge interactive design GmbH
A-1060 Vienna, Hofmühlgasse 17
CTO / software-development / cms-solutions
p: +43 (1) 595 3999 33, m: +43 (676) 40 221 40
icq: 154546673, http://www.thelounge.net/

http://www.thelounge.net/signature.asc.what.htm






Re: large temp files created by mysql

2011-10-24 Thread mos

At 10:34 AM 10/24/2011, you wrote:

select id from table order by rand() limit 1;
is, as an example, creating a dumb temporary table of the full table size


Because it has to sort the entire table, then it 
returns the one row. This of course is extremely inefficient. :)

You need to choose a random row by using an auto-inc field. Something like:

select id from table where id >= myrandomnum limit 1

Here is the complete article:
http://www.greggdev.com/web/articles.php?id=6

Keep in mind that if there are a lot of holes in your table where id is
not sequential, then the randomness will favor the number following the
missing ids. So if you had rows with ids from 1 to 1000, but the rows
100 to 199 were missing, then the number 200 would appear 100x more
often than any other random number (assuming there were no other
missing numbers). So for this to be as random as possible, you should
keep the number of missing ids as small as possible. The other solution
is to renumber a non-autoinc column every so often to make sure the
numbers are truly sequential, and of course use a stored procedure
to find the next highest number when adding new rows.
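
A sketch of that approach (table and column names are illustrative, and
it assumes id has few holes):

  -- Pick a random value in [MIN(id), MAX(id)] and take the first id at
  -- or above it. MIN and MAX come straight from the index, and the
  -- ORDER BY makes "the first id at or above" deterministic:
  SELECT id FROM users
  WHERE id >= (SELECT FLOOR(MIN(id) + RAND() * (MAX(id) - MIN(id)))
               FROM users)
  ORDER BY id
  LIMIT 1;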


Mike



On 24.10.2011 17:03, Joey L wrote:

[snip]









Re: large temp files created by mysql

2011-10-24 Thread Joey L

 You should do EXPLAINs on your slow-running queries to find out which ones
 are likely... If it is generating a 30gig file, I'd expect it must be a very
 slow query.


I know why the files are being created - but is there anything I can
do on the server to accommodate large file operations?
In my.cnf, possibly?
I have 22 gig of RAM -- should I bump these up? Anything else?
table_cache = 256M
max_heap_table_size = 256M
tmp_table_size = 256M
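
For reference, a sketch of the relevant my.cnf section (values are
illustrative, not recommendations; note that table_cache counts open
table descriptors, not bytes):

  [mysqld]
  # An in-memory temp table spills to disk once it exceeds the smaller
  # of these two limits, so they are usually raised together:
  max_heap_table_size = 256M
  tmp_table_size      = 256M

  # A count of cached table descriptors, not a size:
  table_cache = 512

  # Optionally point the spill directory at a larger/faster volume:
  tmpdir = /var/lib/mysql-tmp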


thanks
mjh




Re: large temp files created by mysql

2011-10-24 Thread Joey L
 >>  MySQLTuner 1.2.0 - Major Hayden ma...@mhtx.net
 >>  Bug reports, feature requests, and downloads at http://mysqltuner.com/
 >>  Run with '--help' for additional options and output filtering
Please enter your MySQL administrative login: root
Please enter your MySQL administrative password:

 General Statistics --
[--] Skipped version check for MySQLTuner script
[OK] Currently running supported MySQL version 5.1.49-3-log
[OK] Operating on 64-bit architecture

 Storage Engine Statistics ---
[--] Status: +Archive -BDB -Federated +InnoDB -ISAM -NDBCluster
[--] Data in MyISAM tables: 9G (Tables: 2512)
[--] Data in InnoDB tables: 1M (Tables: 40)
[!!] Total fragmented tables: 58

 Performance Metrics -
[--] Up for: 10s (2K q [237.300 qps], 591 conn, TX: 947K, RX: 261K)
[--] Reads / Writes: 97% / 3%
[--] Total buffers: 2.3G global + 34.6M per thread (200 max threads)
[OK] Maximum possible memory usage: 9.1G (41% of installed RAM)
[OK] Slow queries: 0% (0/2K)
[OK] Highest usage of available connections: 3% (6/200)
[!!] Key buffer size / total MyISAM indexes: 2.0G/12.0G
[!!] Key buffer hit rate: 89.9% (5K cached / 524 reads)
[!!] Query cache efficiency: 17.9% (298 cached / 1K selects)
[OK] Query cache prunes per day: 0
[OK] Sorts requiring temporary tables: 0% (0 temp sorts / 45 sorts)
[!!] Joins performed without indexes: 25
[!!] Temporary tables created on disk: 29% (93 on disk / 316 total)
[OK] Thread cache hit rate: 98% (6 created / 591 connections)
[!!] Table cache hit rate: 3% (407 open / 13K opened)
[OK] Open file limit used: 78% (808/1K)
[OK] Table locks acquired immediately: 100% (997 immediate / 997 locks)
[OK] InnoDB data size / buffer pool: 1.3M/8.0M

 Recommendations -
General recommendations:
Run OPTIMIZE TABLE to defragment tables for better performance
MySQL started within last 24 hours - recommendations may be inaccurate
Enable the slow query log to troubleshoot bad queries
Adjust your join queries to always utilize indexes
Temporary table size is already large - reduce result set size
Reduce your SELECT DISTINCT queries without LIMIT clauses
Increase table_cache gradually to avoid file descriptor limits
Variables to adjust:
key_buffer_size (> 12.0G)
query_cache_limit (> 2M, or use smaller result sets)
join_buffer_size (> 32.0M, or always use indexes with joins)
table_cache (> 407)

I just did some adjustments and used mysqltuner.pl - any advice?
I have 22 gigs of RAM on the server - so if I need to allocate more, I can.
thanks
mjh




Re: large temp files created by mysql

2011-10-24 Thread Reindl Harald


On 24.10.2011 18:02, mos wrote:
 At 10:34 AM 10/24/2011, you wrote:
 select id from table order by rand() limit 1;
 is, as an example, creating a dumb temporary table of the full table size
 
 Because it has to sort the entire table, then it returns the one row. This of
 course is extremely inefficient. :)
 You need to choose a random row by using an auto-inc field. Something like:
 
 select id from table where id >= myrandomnum limit 1

but this is TOTALLY braindead if id is a primary-KEY with auto-increment





Re: large temp files created by mysql

2011-10-24 Thread Reindl Harald


On 24.10.2011 18:27, Joey L wrote:
  Performance Metrics -
 [--] Up for: 10s (2K q [237.300 qps], 591 conn, TX: 947K, RX: 261K)
 [!!] Key buffer size / total MyISAM indexes: 2.0G/12.0G
 [!!] Key buffer hit rate: 89.9% (5K cached / 524 reads)

 I just did some adjustments and used mysqltuner.pl ---any advice ??
 I have 22gigs of ram on the server - so if i need to allocate - i can

mysqltuner after 10 seconds is mostly useless

but "variables to adjust: key_buffer_size (> 12.0G)" is mandatory,
since you do not benefit as much as you could while your keys are
permanently loaded from disk





Re: large temp files created by mysql

2011-10-24 Thread Johan De Meersman
- Original Message -
 From: Joey L mjh2...@gmail.com
 
 I know why the files are being created - but is there anything I can
 do on the server to accommodate large file operations?

*We* don't, however :-) If you'd care to enlighten us as to the particular job 
and why [you believe that] the tempfile creation is unavoidable, we might be 
able to offer something more.

 max_heap_table_size = 256M
 tmp_table_size  = 256M

Maybe. Those would help as long as your intermediate resultset is lower than 
what you can allocate to them, but you're talking 22G in the machine and 30G of 
data. You also need to realize that those are going to be allocated 
per-connection (albeit not necessarily at full size), so that can be a 
potential disaster.
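
One way to see how often that's already happening (standard status
counters; a high disk/total ratio means the per-connection limits or
the queries themselves need work):

  SHOW GLOBAL STATUS LIKE 'Created_tmp%';
  -- Created_tmp_disk_tables / Created_tmp_tables = share of implicit
  -- temp tables that spilled to disk; Created_tmp_files counts other
  -- temporary files (e.g. filesorts).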


-- 
Beer with grenadine
Is like mustard with wine
She who drinks it is a prude
He who drinks it is soon an ass




Re: large temp files created by mysql

2011-10-24 Thread mos

At 11:32 AM 10/24/2011, Reindl Harald wrote:



On 24.10.2011 18:02, mos wrote:
 At 10:34 AM 10/24/2011, you wrote:
 select id from table order by rand() limit 1;
 is, as an example, creating a dumb temporary table of the full table size

 Because it has to sort the entire table, then it returns the one row.
 This of course is extremely inefficient. :)

 You need to choose a random row by using an auto-inc field. Something like:

 select id from table where id >= myrandomnum limit 1

but this is TOTALLY braindead if id is a primary-KEY with auto-increment


It all depends on how many holes you have in the sequence and how
random you want the selections to be. If there are no holes then it
will work. You need of course to get the first and last id and
generate myrandomnum within that range. If there are a lot of holes
in the sequence then build another table holding the ids and an
autoinc column, and pick one of those rows randomly. Regenerate the
table once an hour or once a day.


Either way it is going to be a LOT FASTER than sorting the entire table.
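
A sketch of that mapping table (names illustrative; the map is rebuilt
periodically, not per query):

  -- Rebuild the gap-free map once an hour/day:
  DROP TABLE IF EXISTS users_rnd_map;
  CREATE TABLE users_rnd_map (
    n INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY
  ) SELECT id FROM users;

  -- Pick through the map; @r must be computed once, outside the WHERE
  -- clause (RAND() in a WHERE is re-evaluated per row):
  SET @r = FLOOR(1 + RAND() * (SELECT MAX(n) FROM users_rnd_map));
  SELECT u.* FROM users u
  JOIN users_rnd_map m ON m.id = u.id
  WHERE m.n = @r;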

Mike






Re: large temp files created by mysql

2011-10-24 Thread Reindl Harald


On 24.10.2011 23:31, mos wrote:
 [snip]
 
 Either way it is going to be a LOT FASTER than sorting the entire table

and why in the world, with the query above, is the WHOLE table
copied into a temp table, while fetching the whole id-list into a
PHP array and taking a random one is more than 1000 times faster?

the implementation of ORDER BY RAND() is totally braindead
___

SELECT pri_key FROM table ORDER BY RAND() LIMIT 1;

let's say 'pri_key' is an autoincrement integer
let's say the table has some blob fields and is 4 GB large

fetching ONE random id via mysql will explode your webserver
because 4 GB of data is copied - there is no valid reason for
such a crappy implementation, really no single reason







Re: large temp files created by mysql

2011-10-24 Thread mos

At 05:31 PM 10/24/2011, Reindl Harald wrote:



[snip]

and why in the world, with the query above, is the WHOLE table
copied into a temp table, while fetching the whole id-list into a
PHP array and taking a random one is more than 1000 times faster?

the implementation of ORDER BY RAND() is totally braindead


It is not braindead. You told MySQL to sort by rand() which is a
non-indexed column. It needs to assign a value to each row of the
result set (all ids of the table) and sort it to get the lowest
random number. This is very inefficient for large tables.


That is why I recommended using:

select id from table where id >= myrandomnum limit 1

provided id is uniformly distributed (not too many holes). If there
are a lot of holes in the distribution then copy the ids to a
separate table with its own auto-inc record id. This will take about
1-2 seconds. Now you can select a random row from this table using the
same SQL statement above and get a random id in about 1 ms. I didn't
say to load all of the ids into the separate table (or a PHP array)
every time you wanted a random id. That would be braindead. You
will need to update this secondary table whenever your table gets
rows inserted or deleted (or reload the contents of the secondary
table every few minutes).


Now there is a 3rd way to get a random row that is easier to
implement. Add an extra column to your table called RndNum Float.
When a row is added to the table, assign a random number to the
column. This column needs to be indexed. Now you can execute:

select id from table where RndNum >= rand() limit 1

and this will return a random id in a couple of ms. You should use a
stored procedure and handle the occasions when no row is returned
because the rand() that was generated was larger than the largest
RndNum in the table. Depending on how random you want the results,
you can make the index unique and handle the occurrences when a
duplicate rand() number is generated.
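
A sketch of that third method (names illustrative):

  -- One-time setup: an indexed float column, populated at insert time
  -- (shown here as a backfill):
  ALTER TABLE users ADD COLUMN RndNum FLOAT NOT NULL DEFAULT 0,
                    ADD INDEX idx_rndnum (RndNum);
  UPDATE users SET RndNum = RAND();

  -- Lookup: the index finds the first RndNum at or above a fresh
  -- random value; the second query covers the "no row returned" case:
  SET @r = RAND();
  SELECT id FROM users WHERE RndNum >= @r ORDER BY RndNum LIMIT 1;
  SELECT id FROM users ORDER BY RndNum LIMIT 1;  -- fallback if empty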




___

SELECT pri_key FROM table ORDER BY RAND() LIMIT 1;

let's say 'pri_key' is an autoincrement integer
let's say the table has some blob fields and is 4 GB large

fetching ONE random id via mysql will explode your webserver
because 4 GB of data is copied - there is no valid reason for
such a crappy implementation, really no single reason


The blob field is not included in the query so it won't be copied and
the server will not explode. If it worked the way you said it did,
then we would have exploding web servers with so much blob material
encircling every small town that not even Steve McQueen could
eradicate it.


What does happen is a temporary table is created with all of the
pri_key values and is sorted on Rand(), and a single pri_key is
returned. Since pri_key is an index, it only needs to access the
index file and won't even bother accessing the data in the table.


And that's the way it is.

Mike


