Re: Speed of SELECT ... LIMIT #,#?

2003-03-25 Thread Joseph Bueno
If you want to speed it up, you have to make it use an index.
You need to add a WHERE or an ORDER BY clause.
Have you tried:
SELECT Message_ID, Body FROM Body_etc ORDER BY Message_ID LIMIT N,M
Of course, I assume that Message_ID is indexed ;)

Regards,
Joseph Bueno
Nick Arnett wrote:
...


RE: Speed of SELECT ... LIMIT #,#?

2003-03-25 Thread Nick Arnett
 -Original Message-
 From: Joseph Bueno [mailto:[EMAIL PROTECTED]
 Sent: Tuesday, March 25, 2003 1:36 AM

..

 If you want to speed it up, you have to make it use an index.
 You need to add a WHERE or an ORDER BY clause.
 Have you tried:
 SELECT Message_ID, Body FROM Body_etc ORDER BY Message_ID LIMIT N,M

 Of course, I assume that Message_ID is indexed ;)

Message_ID is the primary key.  But your suggestion doesn't help.  If
anything, it is slower.

However, I think I've figured out the right way to do this -- use a
server-side cursor.  I can completely get rid of the need for a LIMIT in the
SELECT statement.  I've never used server-side cursors before, so I am a bit
surprised to see that even when I do a SELECT for all 1.5 million records,
MySQL's memory usage doesn't increase a bit above where it was when I was
doing the same queries using a normal cursor.  All I have to do is figure
out how many records I can safely insert at one shot, which is not a
problem.
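
Something along these lines, more or less (a rough, untested sketch rather
than my exact code; the connection details, batch size and the process() stub
are placeholders):

import MySQLdb
from MySQLdb.cursors import SSCursor

def process(rows):
    pass                            # placeholder for the real work (indexing, etc.)

# A server-side (unbuffered) cursor streams rows from the server instead
# of loading the whole result set into client memory at once.
conn = MySQLdb.connect(host="localhost", user="me", passwd="secret",
                       db="archive", cursorclass=SSCursor)
cur = conn.cursor()
cur.execute("SELECT Message_ID, Body FROM Body_etc")

while True:
    rows = cur.fetchmany(5000)      # pull a manageable batch at a time
    if not rows:
        break
    process(rows)

cur.close()                         # reads anything left unread (see below)
conn.close()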

Just noticed something odd, though, with the MySQLdb SSCursor.  When close()
is called, it does a fetchall(), getting any records that you hadn't
retrieved, trying to load all of them into memory.  It's actually calling
nextset(), even though MySQL doesn't support multiple result sets.

Nick





RE: Speed of SELECT ... LIMIT #,#?

2003-03-25 Thread Nick Arnett
 -Original Message-
 From: Nick Arnett [mailto:[EMAIL PROTECTED]
 Sent: Tuesday, March 25, 2003 8:48 AM

...

 Just noticed something odd, though, with the MySQLdb SSCursor.
 When close()
 is called, it does a fetchall(), getting any records that you hadn't
 retrieved, trying to load all of them into memory.  It's actually calling
 nextset(), even though MySQL doesn't support multiple result sets.

This is because MySQL requires every row of a server-side (unbuffered) result
set to be read before another query can be issued on that connection, Andy
Dustman tells me.  It doesn't seem to be a problem as long as you do, in fact, read
all the rows, so this is more an issue of the way I was testing than the
real world.
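
In practice that just means the INSERTs have to go over a second connection
while the streaming SELECT is still open, or the result has to be read to the
end first.  A rough sketch of the two-connection pattern (connection details
and the destination table name are placeholders):

import MySQLdb
from MySQLdb.cursors import SSCursor

read_conn = MySQLdb.connect(db="archive", cursorclass=SSCursor)
write_conn = MySQLdb.connect(db="archive")

rd = read_conn.cursor()
wr = write_conn.cursor()

rd.execute("SELECT Message_ID, Body FROM Body_etc")
while True:
    rows = rd.fetchmany(5000)
    if not rows:
        break
    # The streaming result on read_conn is still open at this point, so any
    # other statement has to go through write_conn, not read_conn.
    wr.executemany(
        "INSERT INTO Body_copy (Message_ID, Body) VALUES (%s, %s)", rows)

rd.close()      # every row has been read by now, so close() finds nothing left
wr.close()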

Nick





Re: Speed of SELECT ... LIMIT #,#?

2003-03-24 Thread Jeremy Zawodny
On Mon, Mar 24, 2003 at 04:48:01PM -0800, Nick Arnett wrote:
 I'm reading 1,000 records at a time from a large table (overcoming the FT
 indexing problem I wrote about yesterday) and I'm discovering that as the
 starting record number grows larger, the retrieve speed is dropping rapidly.
 Any suggestions for how to speed this up?  It's a strategy I use fairly
 often, mainly to keep from using excess memory when retrieving and/or
 killing the connection when inserting records.  In the current case, I'm
 doing a simple select, no ordering, grouping, etc.  This is on MySQL
 4.0.12-nt.  Somewhere in the vicinity of 700,000, retrieval speed dropped
 tremendously.  I'm guessing that that's where index caching was no longer
 sufficient...?

Please post the query and the output of running it thru EXPLAIN.

It is likely sorting the results without an index and having to weed
thru more and more records the farther back you look in the list.

Jeremy
-- 
Jeremy D. Zawodny |  Perl, Web, MySQL, Linux Magazine, Yahoo!
[EMAIL PROTECTED]  |  http://jeremy.zawodny.com/

MySQL 4.0.8: up 49 days, processed 1,697,931,880 queries (395/sec. avg)




RE: Speed of SELECT ... LIMIT #,#?

2003-03-24 Thread Nick Arnett
 -Original Message-
 From: Jeremy Zawodny [mailto:[EMAIL PROTECTED]
 Sent: Monday, March 24, 2003 9:17 PM

 Please post the query and the output of running it thru EXPLAIN.

 It is likely sorting the results without an index and having to weed
 thru more and more records the farther back you look in the list.

I'm fairly certain that's true.  The query is very simple: SELECT
Message_ID, Body FROM Body_etc LIMIT N,M.  Those are the only fields in the
table; Message_ID is a CHAR field, Body is TEXT.

Here's what EXPLAIN gives:

+----------+------+---------------+------+---------+------+---------+-------+
| table    | type | possible_keys | key  | key_len | ref  | rows    | Extra |
+----------+------+---------------+------+---------+------+---------+-------+
| body_etc | ALL  | NULL          | NULL |    NULL | NULL | 1586994 |       |
+----------+------+---------------+------+---------+------+---------+-------+

Sorry if I didn't make that clear in my first posting.  I can't quite see
how to speed things up, though it dawned on me that perhaps I could add a
sequence field, index it, then use it, rather than the LIMIT clause, to
select the records.  It was quite fast at the beginning and caught me by
surprise that it slowed down so much.
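
Something like this, I imagine (untested sketch; the Seq column name is
arbitrary, and it assumes the table will accept an AUTO_INCREMENT column on a
secondary index alongside the CHAR primary key):

import MySQLdb

conn = MySQLdb.connect(db="archive")          # placeholder connection details
cur = conn.cursor()

# One-time change: add an indexed sequence column.
cur.execute("ALTER TABLE Body_etc "
            "ADD Seq INT UNSIGNED NOT NULL AUTO_INCREMENT, "
            "ADD INDEX (Seq)")

# Each batch is then a range scan on the Seq index; LIMIT only caps the
# batch size and never has to skip over an ever-growing offset.
last = 0
while True:
    cur.execute("SELECT Seq, Message_ID, Body FROM Body_etc "
                "WHERE Seq > %s ORDER BY Seq LIMIT 5000", (last,))
    rows = cur.fetchall()
    if not rows:
        break
    last = rows[-1][0]                        # highest Seq fetched so far
    # ... hand rows off to the indexing step here ...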

I'm getting there, slowly, by bumping up max_allowed_packet and doing three
INSERTs of 5,000 for every SELECT, so I'm grabbing about 15,000 records at a
time, greatly reducing the number of SELECTs from when I was only getting
1,000.  Still taking hours, but I'll be letting it run all night.
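
For the INSERT side the batches look roughly like this (sketch; the
destination table and the sample rows are placeholders).  As far as I can
tell, MySQLdb's executemany() folds a plain "INSERT ... VALUES (%s, %s)" into
a single multi-row INSERT, so an entire batch travels as one statement, and
that one statement is what has to fit within max_allowed_packet:

import MySQLdb

conn = MySQLdb.connect(db="archive")          # placeholder connection details
cur = conn.cursor()

# Example rows; in practice this is one 5,000-record batch from the SELECT.
batch = [("<msg-id-1>", "first body ..."),
         ("<msg-id-2>", "second body ...")]

cur.executemany(
    "INSERT INTO Messages_indexed (Message_ID, Body) VALUES (%s, %s)",
    batch)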

By the way, I noticed that in one of your presentations, you noted that
Handler_read_rnd_next is better if it's smaller.  It's at 5.7 million now,
after doing about 900K records.  I'm assuming that's an indication that
there's some inefficiency, but I'm still stumped as to how to address it.
The machine I'm running MySQL on has 1 GB of memory, so I have lots of
headroom to play with.
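
(For anyone following along, that counter comes from SHOW STATUS; a quick way
to read it, with placeholder connection details:)

import MySQLdb

conn = MySQLdb.connect(db="archive")          # placeholder connection details
cur = conn.cursor()

cur.execute("SHOW STATUS")
status = dict(cur.fetchall())

# Handler_read_rnd_next counts "read next row" requests of the kind a full
# table scan makes, so a value that keeps climbing during the job points at
# unindexed reads.
rnd_next = int(status["Handler_read_rnd_next"])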

And since people tend to start asking questions when they realize I'm
indexing large amounts of e-mail, etc., I'll explain a little.  I'm doing
traffic and content analysis to track what's going on in certain large
software development communities.

Thanks!

Nick


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]