Re: Speed of SELECT ... LIMIT #,#?
If you want to speed it up, you have to make it use an index; you need to add a WHERE or an ORDER BY clause. Have you tried:

SELECT Message_ID, Body FROM Body_etc ORDER BY Message_ID LIMIT N,M

Of course, I assume that Message_ID is indexed ;)

Regards,
Joseph Bueno

Nick Arnett wrote:
> The query is very simple: SELECT Message_ID, Body FROM Body_etc LIMIT
> N,M. Those are the only fields in the table; Message_ID is a CHAR
> field, Body is TEXT. Here's what EXPLAIN gives: ...

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]
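[Editor's note: Joseph's ORDER BY form still makes the server step over the first N rows of the index on every page. A common refinement, not stated in the thread but offered here as a sketch, is keyset ("seek") pagination: remember the last Message_ID of the previous page and start the next page strictly after it, so each page begins with an index seek. The simulation below stands in for the server with a sorted in-memory list, since no live database is assumed.]

```python
# Keyset ("seek") pagination sketch: instead of LIMIT N,M, remember the
# last key of the previous page and ask only for rows after it. With an
# index on Message_ID each page starts at an index seek rather than an
# N-row skip. The SQL shape (table/column names from the thread) would be:
#   SELECT Message_ID, Body FROM Body_etc
#   WHERE Message_ID > %s ORDER BY Message_ID LIMIT %s
# Simulated here against a sorted in-memory "table".
import bisect

def fetch_page(sorted_keys, last_seen, page_size):
    """Return the next page of keys strictly after last_seen."""
    start = bisect.bisect_right(sorted_keys, last_seen)  # the index seek
    return sorted_keys[start:start + page_size]

table = [f"msg{i:07d}" for i in range(10_000)]  # stand-in for Body_etc
pages = []
last = ""                                        # '' sorts before any key
while True:
    page = fetch_page(table, last, 1_000)
    if not page:
        break
    pages.append(page)
    last = page[-1]                              # key to seek past next time
```

Each iteration touches only `page_size` rows no matter how deep into the table it is, which is exactly the property plain LIMIT N,M lacks.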
RE: Speed of SELECT ... LIMIT #,#?
-Original Message-
From: Joseph Bueno [mailto:[EMAIL PROTECTED]
Sent: Tuesday, March 25, 2003 1:36 AM

> If you want to speed it up, you have to make it use an index. You need
> to add a WHERE or an ORDER BY clause. Have you tried:
> SELECT Message_ID, Body FROM Body_etc ORDER BY Message_ID LIMIT N,M
> Of course, I assume that Message_ID is indexed ;)

Message_ID is the primary key, but your suggestion doesn't help; if anything, it is slower. However, I think I've figured out the right way to do this: use a server-side cursor, which completely eliminates the need for a LIMIT clause in the SELECT statement.

I've never used server-side cursors before, so I was a bit surprised to see that even when I do a SELECT for all 1.5 million records, MySQL's memory usage doesn't increase at all above where it was when I was doing the same queries with a normal cursor. All I have to do is figure out how many records I can safely insert in one shot, which is not a problem.

Just noticed something odd, though, with the MySQLdb SSCursor. When close() is called, it does a fetchall(), retrieving any records that you hadn't fetched and trying to load all of them into memory. It's actually calling nextset(), even though MySQL doesn't support multiple result sets.

Nick
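[Editor's note: the server-side-cursor pattern Nick describes is one unbounded SELECT streamed in batches, instead of repeated LIMIT queries. With MySQLdb that means `conn.cursor(MySQLdb.cursors.SSCursor)` followed by `fetchmany()` calls; below, a stub cursor stands in for the live connection so the batching loop itself can be shown and run.]

```python
# Streaming a large result set in batches, as with MySQLdb's SSCursor:
# one SELECT, then repeated fetchmany() calls, so client memory stays
# bounded at one batch regardless of total rows. A stub cursor stands in
# for the live connection here; with a real one the setup would be:
#   cur = conn.cursor(MySQLdb.cursors.SSCursor)
#   cur.execute("SELECT Message_ID, Body FROM Body_etc")

class StubSSCursor:
    """Minimal stand-in for a server-side cursor over a fixed result."""
    def __init__(self, rows):
        self._it = iter(rows)
    def fetchmany(self, size):
        batch = []
        for row in self._it:
            batch.append(row)
            if len(batch) == size:
                break
        return batch

# A generator feeds the stub, so rows are produced lazily, mimicking the
# server holding the result set instead of the client.
cur = StubSSCursor((f"id{i}", f"body{i}") for i in range(150_000))

batches = 0
seen = 0
while True:
    rows = cur.fetchmany(15_000)   # ~15,000 records at a shot, as in the thread
    if not rows:
        break
    batches += 1
    seen += len(rows)
```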
RE: Speed of SELECT ... LIMIT #,#?
-Original Message-
From: Nick Arnett [mailto:[EMAIL PROTECTED]
Sent: Tuesday, March 25, 2003 8:48 AM

> Just noticed something odd, though, with the MySQLdb SSCursor. When
> close() is called, it does a fetchall(), getting any records that you
> hadn't retrieved, trying to load all of them into memory. ...

Andy Dustman tells me this is because MySQL requires all rows to be read from a server-side result set before another query can be issued on that connection. It doesn't seem to be a problem as long as you do, in fact, read all the rows, so this is more an issue with the way I was testing than with real-world use.

Nick
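[Editor's note: the constraint Andy Dustman describes can be made concrete: a connection with a pending server-side result set cannot accept a new query until every row has been read, which is why SSCursor.close() fetches the leftovers. The toy model below captures that rule; it involves no real MySQL connection.]

```python
# Toy model of the server-side result protocol: a connection with unread
# result rows refuses new queries, which is why MySQLdb's SSCursor.close()
# reads and discards any leftover rows before releasing the connection.

class ToyConnection:
    def __init__(self):
        self._pending = []          # rows not yet read by the client
    def execute(self, rows):
        if self._pending:
            raise RuntimeError("commands out of sync: unread result rows")
        self._pending = list(rows)
    def fetchone(self):
        return self._pending.pop(0) if self._pending else None
    def drain(self):
        """What close() amounts to: read and discard the leftovers."""
        n = len(self._pending)
        self._pending.clear()
        return n

conn = ToyConnection()
conn.execute(range(5))
first = conn.fetchone()             # read only part of the result
try:
    conn.execute(range(3))          # second query while rows remain
    blocked = False
except RuntimeError:
    blocked = True
discarded = conn.drain()            # now the connection is reusable
conn.execute(range(3))
```

Reading every row before reusing the connection, as Nick notes, makes the drain step a no-op in practice.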
Re: Speed of SELECT ... LIMIT #,#?
On Mon, Mar 24, 2003 at 04:48:01PM -0800, Nick Arnett wrote:

> I'm reading 1,000 records at a time from a large table (overcoming the
> FT indexing problem I wrote about yesterday), and I'm discovering that
> as the starting record number grows larger, the retrieval speed drops
> rapidly. Any suggestions for how to speed this up? It's a strategy I
> use fairly often, mainly to keep from using excess memory when
> retrieving and/or killing the connection when inserting records. In
> the current case, I'm doing a simple select, no ordering, grouping,
> etc. This is on MySQL 4.0.12-nt. Somewhere in the vicinity of record
> 700,000, retrieval speed dropped tremendously. I'm guessing that's
> where index caching was no longer sufficient...?

Please post the query and the output of running it through EXPLAIN. It is likely sorting the results without an index and having to weed through more and more records the farther back you look in the list.

Jeremy
-- 
Jeremy D. Zawodny | Perl, Web, MySQL, Linux Magazine, Yahoo!
[EMAIL PROTECTED] | http://jeremy.zawodny.com/
MySQL 4.0.8: up 49 days, processed 1,697,931,880 queries (395/sec. avg)
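[Editor's note: Jeremy's diagnosis can be put in rough numbers. Without a usable index, LIMIT N,M forces the server to generate and discard the first N rows of every page, so per-page work grows linearly with the offset and total work for a full pass grows quadratically in the number of pages. A back-of-the-envelope model, using the row counts from this thread:]

```python
# Rough cost model for paging with LIMIT offset,limit over a full scan:
# the server must walk past `offset` rows before returning `limit` rows,
# so rows examined per page grows linearly with the offset.

def rows_examined(offset, limit, table_rows):
    """Rows the server must touch for LIMIT offset,limit with no index."""
    return min(offset + limit, table_rows)

# Paging 1,000 rows at a time up to record 700,000 (where Nick saw the
# slowdown), out of the 1,586,994-row table from his EXPLAIN output.
total = 0
for offset in range(0, 700_000, 1_000):
    total += rows_examined(offset, 1_000, 1_586_994)

print(total)   # total rows touched across all pages
```

Under this model the 700 pages cost roughly 245 million row visits between them, versus 700,000 if each page could start at an index seek, which is consistent with the steady slowdown Nick reports.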
RE: Speed of SELECT ... LIMIT #,#?
-Original Message-
From: Jeremy Zawodny [mailto:[EMAIL PROTECTED]
Sent: Monday, March 24, 2003 9:17 PM

> Please post the query and the output of running it thru EXPLAIN. It is
> likely sorting the results without an index and having to weed thru
> more and more records the farther back you look in the list.

I'm fairly certain that's true. The query is very simple: SELECT Message_ID, Body FROM Body_etc LIMIT N,M. Those are the only fields in the table; Message_ID is a CHAR field, Body is TEXT. Here's what EXPLAIN gives:

+----------+------+---------------+------+---------+------+---------+-------+
| table    | type | possible_keys | key  | key_len | ref  | rows    | Extra |
+----------+------+---------------+------+---------+------+---------+-------+
| body_etc | ALL  | NULL          | NULL | NULL    | NULL | 1586994 |       |
+----------+------+---------------+------+---------+------+---------+-------+

Sorry if I didn't make that clear in my first posting. I can't quite see how to speed things up, though it dawned on me that perhaps I could add a sequence field, index it, and then use it to select the records instead of the LIMIT clause. It was quite fast at the beginning, and it caught me by surprise that it slowed down so much.

I'm getting there, slowly, by bumping up max_allowed_packet and doing three INSERTs of 5,000 rows for every SELECT, so I'm grabbing about 15,000 records at a time, greatly reducing the number of SELECTs from when I was only getting 1,000. It's still taking hours, but I'll be letting it run all night.

By the way, I noticed in one of your presentations that Handler_read_rnd_next is better if it's smaller. It's at 5.7 million now, after doing about 900K records. I'm assuming that's an indication of some inefficiency, but I'm still stumped as to how to address it.

The machine I'm running MySQL on has 1 GB of memory, so I have lots of headroom to play with. And since people tend to start asking questions when they realize I'm indexing large amounts of e-mail, etc., I'll explain a little: I'm doing traffic and content analysis to track what's going on in certain large software development communities.

Thanks!

Nick
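[Editor's note: the sequence-field idea Nick floats amounts to replacing LIMIT N,M with a range predicate on an indexed column, so each page becomes a bounded range scan. A sketch of how the paging queries would be generated; the column name `Seq` is hypothetical, while the table and row counts come from the thread.]

```python
# Paging by an indexed sequence column instead of LIMIT N,M: each page is
# a range scan on the index, so cost no longer grows with the page number.
# The column name Seq is hypothetical; Body_etc and the row/page counts
# come from the thread.

def range_queries(total_rows, page_size):
    """Yield (sql, params) pairs covering Seq values 1..total_rows."""
    sql = ("SELECT Message_ID, Body FROM Body_etc "
           "WHERE Seq BETWEEN %s AND %s")
    for lo in range(1, total_rows + 1, page_size):
        hi = min(lo + page_size - 1, total_rows)
        yield sql, (lo, hi)

# 1,586,994 rows in pages of 15,000, matching the batch size Nick settled on.
queries = list(range_queries(1_586_994, 15_000))
```

Each (sql, params) pair would be passed to cursor.execute(); parameterizing the bounds rather than interpolating them keeps the statements safe and cacheable.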