Matt,

Many thanks for the answer. It has helped enormously.

First, I have been getting the odd index corruption that has proved to be 
very annoying. I had checked the changes document for releases since 
4.0.13 and there didn't seem to be any mention of an index problem but now 
I'll upgrade asap. Thanks for that.

Next, I had extended my key buffer too much. When I calculated the high 
water mark for key buffer usage, I found that indeed it was considerably 
less than the space I had allocated. I will modify accordingly. However, I 
was just thinking about what you said about this only being a high water 
mark ... I can't see any way, apart from dropping an index or table, that 
information is going to be purged from the cache especially as you say 
that MySQL updates the contents of the cache when an index is modified, so 
won't that mean that during normal operation the key_blocks_used should 
indicate exactly how many blocks are currently in use ? Some of the 
contents may of not been used for a while but still they won't be purged 
unless the maximum extent of the cache is reached ?

With regards to the caching on myd data, is the fact that MySQL doesn't 
cache myd data a design choice ? It makes perfect sense for MySQL 
installations on a dedicated machine as its a fair assumption that there's 
no other nasty apps around filling your system cache with other data and 
the OS is probably in the best position to cache the disks. However, in my 
case (and I would guess in the proportion of the cases), the database 
shares the machine and in my case this is with Apache which depending on 
the usage on the website, is likely to flush the cache reasonably quickly. 
It would be easier to get more consistent query execution times if MySQL 
maintained it's own caches (for myd data as well) so that more control 
could be kept on cache contents. In a previous life I worked with Sybase 
and one of the advantages (only ?) is that the administrator has control 
on the caching of index and data for each individual table. Very handy if 
it was benficial to ensure the contents of specific tables were available 
in a cache.

Lastly, I'd love to use the query cache but I do have to update the 
indexes every 5 minutes (the system revolves around retrieving SNMP data 
from a bunch of routers every 5 mins then dumping it into the db ... a 
user then requests a report periodically) so the QCache is invalidated 
every 5 mins anyway. So, a) do  you (or anyone else) know of any plans to 
extend the caching functionality and b) are there any other parameters 
that may be worth a tweak ? One thing I had considered was to extend a 
composite index to incorporate the data that is required in the 
problematic query then the query should be able to extract the data 
required without having to search the myd file. I understand that this 
will increase the index size and slow the inserts but otherwise is this a 
legitamate solution i.e. there must be some other downside surely ?

Anyway, thanks again for the response. 

John










"Matt W" <[EMAIL PROTECTED]>
2004-01-14 11:18

 
        To:     John Everitt/EHV/CORP/[EMAIL PROTECTED]
<[EMAIL PROTECTED]>
        cc: 
        Subject:        Re: How does key buffer work ?
        Classification: 



Hi John,

I'll give my comments. :-)


----- Original Message -----
From: <[EMAIL PROTECTED]>
Sent: Wednesday, January 14, 2004 2:04 AM
Subject: How does key buffer work ?


> I've been trying to optimise the operation of a MySQL (4.0.13)
> installation on a Windows 2000 based web server.

First, I'd upgrade MySQL to the latest 4.0.x for bug fixes, etc.  The
biggest of which may be possible index corruption (I think) in versions
before 4.0.14 (or is it .15?).


> After going through the
> docs and browsing the net for some time, it seems that after ensuring
that
> your database design is sound and your queries correctly structured
with
> the right indexes present then further optimisation can be realised by
> tinkering with the MySQL server's startup parameters, principally the
key
> buffer size. It seems that the key buffer is solely used as an index
cache
> and that extending this, up to a point, potentially will significantly
> improve performance.

Yup, table/index design and optimized queries are very important for
performance.

I'm not of the opinion that a huge key_buffer is as important as a lot
of people make it. :-)  Sure, it's important, but I don't know if many
changes will "significantly" improve performance. :-)  Making it too
large may actually hurt overall performance.

You see, MySQL doesn't cache any row data from the .MYD files.  The OS
will use any free RAM to cache file data such as that (to save costly
disk seeks/reads).  (BTW, in Win2k, the Performance tab of Task Manager,
where it says System Cache, I *think* is a good indicator of how much
file data is cached.)  And if you make your key_buffer too big, this
will be [more] memory that a program (MySQL) has allocated, from the
OS's view.  That's that much less free RAM that could be used to cache
the data file.

Compared to randomly reading data file rows (especially larger, variable
length ones) after index lookups, it's much faster to read the index
file, even from disk (if key_buffer is too small).  That's because the
index file is smaller and everything is in order, to be read more
sequentially, thus saving random disk seeks.  Besides, even if the
key_buffer is "too small," the OS will also cache the index file data,
so it may not actually have to be read from disk.


> However, after playing with this value on my system
> for a while, I have a number of questions about how it works...
> 1) I assume that the key buffer caches the contents of the myi files
(I'm
> only talking MyISAM tables here) but is this a direct copy of the
contents?

Yes.


> i.e. if you extend the key buffer so that it is bigger than the sum of
> the size of the myi files on your system, then will this be sufficient
to
> be able to cache all the indexes all the time ?

Yes it will.  Making it as big as your .MYI files is the *maximum* you
should use.  BUT, it's probably not the best. :-)  It's more like, "How
much of those .MYI files are accessed *regularly*?"  Probably not all of
them.

And remember about leaving enough memory to cache row data.

After the server's been running awhile, I think if Key_reads divided by
Key_read_requests (from SHOW STATUS) is less than 0.01 like it says in
the manual, you should be pretty good.


> 2) Does the whole index get loaded into the cache on the first time
it's
> accessed or do only 'parts' of the index get loaded as they are used
for
> various queries ?

Only parts. :-)  Blocks, actually.  A block is usually 1024 bytes;
though if you have an index more than like 255 bytes long, the blocks
will be 2048 bytes.

They are loaded on demand when they're not in the key_buffer (Key_reads
status var).  The status variable Key_blocks_used is like a "high water
mark."  It's the most blocks that were ever in the key_buffer (not
necessarily currently for some reason *shrug*) since the server was
started.  If the blocks are the usual 1K size, then 16384
Key_blocks_used, for example, would mean 16MB of indexes were in the
key_buffer at some point -- and may still be, of course.

Again, after MySQL's been running awhile (doing typical queries), if
Key_blocks_used divided by 1024 is LESS THAN your key_buffer_size (in
MB), your key_buffer is probably too big -- as it's never getting
filled.


> 3) If an index is updated for any reason, is the whole cache copy of
the
> index then invalidated or is the cache copy updated at the same time
as
> the disk file?

I think the block in the key_buffer is updated first, then on disk.
Don't hold me to this, though. :-)  If anything was invalidated, it
would just be the block(s) that were updated.


> One idea I was toying with was to 'delay' all inserts to the sensitive
> tables (an update is done every five minutes for my particular system)
so
> that the tables are updated pretty much in one single go and then
ensure
> the key buffer is refreshed so that all select queries on these tables
for
> the next five minute period will use only cached indexes. Does this
sound
> plausible and or sensible ?

To me, not really for the sake of indexes.  Although, you may be able to
get more speed if you do many writes at once -- especially if you use
LOCK TABLES around them.

Where you may get a benefit from doing many writes together, if the same
SELECTs are repeated, is from MySQL's Query Cache if you're using it.
Since updates to a table invalidate any of its cached SELECTs, the more
time between them, the more chance the SELECTs can be fetched from the
query cache. :-)


> Thanks and Regards,
>
> John Everitt
> PGN MSS
> Philips C/IT.

Hope that helps.


Matt



Reply via email to