Re: Perf Advice Needed: cache buffers chains, high waits, _db_block_hash_buckets
Don't forget that Oracle has redesigned the whole area of hash buckets and hash latches. In 8.1, the number of buckets is roughly 2 x db_block_buffers, but each latch covers multiple buckets (don't forget that a latch is about 160 bytes, so a noticeable amount of memory overhead) Because each latch covers multiple buckets, the latches can be held for a short time, as they need only be held whilst one bucket is searched. Unfortunately, there is no easy way to see the number of buffered blocks per bucket. Steve Adams has a formula relating db_block_buffers to the number of latches, but for most reasonable sized database, the answer is 1024. Typically you are likely to see between 64 and 128 buffers per latch. Jonathan Lewis http://www.jlcomp.demon.co.uk Now running 3-day intensive seminars http://www.jlcomp.demon.co.uk/seminar.html Host to The Co-Operative Oracle Users' FAQ http://www.jlcomp.demon.co.uk/faq/ind_faq.html Author of: Practical Oracle 8i: Building Efficient Databases -Original Message- [EMAIL PROTECTED] To: Multiple recipients of list ORACLE-L [EMAIL PROTECTED] Date: 04 March 2002 09:12 _db_block_hash_buckets |$)CFrom my R11 11.0.3 system : (11.0.3 against 8.1.7.2.1 on Tru64, 4CPUs, |db_block_lru_latches=16) | |MIN(BUFFERS_PER) MAX(BUFFERS_PER) AVG(BUFFERS_PER) SUM(BUFFERS_PER) | | 69 119 93.824218896076 | |Do I have a problem ? Very long buffer-chains, huh ! Nobody's complaining |about |performance. CPU util is generally 40% to 80% only occasssionally at 90%+ |when there |are more than 6 reports running concurrently. |Can't remember why I set db_block_lru_latches to 16, back in June 2001 when |the DB |was 8.0.5 | |Hemant K Chitale |Principal DBA |Chartered Semiconductor Manufacturing Ltd -- Please see the official ORACLE-L FAQ: http://www.orafaq.com -- Author: Jonathan Lewis INET: [EMAIL PROTECTED] Fat City Network Services-- (858) 538-5051 FAX: (858) 538-5051 San Diego, California-- Public Internet access / Mailing Lists To REMOVE yourself from this mailing list, send an E-Mail message to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in the message BODY, include a line containing: UNSUB ORACLE-L (or the name of mailing list you want to be removed from). You may also send the HELP command for other information (like subscribing).
RE: Perf Advice Needed: cache buffers chains, high waits, _db_block_hash_buckets
Hi, while we are on this topic, I would to ask you all about a system I was working on recently. The main problem was that approx. 64 threads were almost continuously doing full table scans on a small table of 800 rows (the developers insisted this was necessary). This table was in 1 or 2 blocks and was having a huge amount of cache buffers chains latching. When we forced the sql to use the index, the latching moved to the index. I did all the usual tricks like spreading the table out across a lot more blocks , increase the spin count etc. with limited success. My view was that with so much activity going on on this small table, the latching would never be fully eliminated, and it was poorly designed code. I just want to check that everyone agrees with me, or would you expect to be able to eliminate the waits even under these conditions? Thanks for your advice, Jim -Original Message- Manning Sent: 28 February 2002 17:14 To: Multiple recipients of list ORACLE-L _db_block_hash_buckets [Mogens Nørgaard] Amen. Contention for cache buffers chains means too much logical IO, ie. find and exterminate heavy SQL. I don't see why the heavy SQL would result in the chain having 66 buffer heads in it, though, or why the sleep count would be so skewed. And my core question is still whether the number of buckets being non-prime is normal or not - it seems awfully wrong to me. That there's a lot of contention *is* a factor of the SQL, but the fact that it's so skewed to only a few chains is what worries me more. Once I have the contention down to a particular latch, but that latch protects a buffer chain with 66 buffer heads in it, how can I find out which ones of the 66 are generating the most attempts at that latch? Tell ya what - can I get a few ppl to run this query? It tells the min/max/avg for the number of buffers associated with each chain and if my numbers are high I can at least have a chance of spreading out the buffers over more chains (by upping the number of latches from 4k to 16k, 32, whatever) - it won't drop the actual IO any, of course, but since I don't have a hard fix on which buffers of the 66 are really the source of my contention, I'm not sure where to go from here. SELECT min(buffers_per), max(buffers_per), avg(buffers_per), sum(buffers_per) FROM ( SELECT count(*) buffers_per, hladdr FROM x$bh b, all_objects o, v$latch_children v WHERE b.HLADDR=v.addr AND b.obj=o.object_id AND v.name LIKE '%cache buffers %' GROUP BY hladdr ) My results: min = 39 max = 119 avg = 55.06 sum = 22 If this shows to be about the same in other (well-tuned) Oracle DB's, then I won't worry as much about the number of buffers in each chain and would then focus on trying to isolate the specific buffers, then the source SQL causing the problem, etc. Given my previous sql trace analyses, I have a good idea what the problem SQL statement is, but it's a bit of a necessary evil right now (a join of a table (260k rows) and a materialized view (2k rows), 6 conditions in there where, and it gets executed a ton, probably on the order of 10x a second at peak) - all indexes that helped performance are created and around already. :( But, ideally I'd like to be able to prove this is the cause of the hot buffers before fixing anything. Thanks, guys!! James -- James Manning [EMAIL PROTECTED] GPG Key fingerprint = B913 2FBD 14A9 CE18 B2B7 9C8E A0BF B026 EEBB F6E4 -- Please see the official ORACLE-L FAQ: http://www.orafaq.com -- Author: James Manning INET: [EMAIL PROTECTED] Fat City Network Services-- (858) 538-5051 FAX: (858) 538-5051 San Diego, California-- Public Internet access / Mailing Lists To REMOVE yourself from this mailing list, send an E-Mail message to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in the message BODY, include a line containing: UNSUB ORACLE-L (or the name of mailing list you want to be removed from). You may also send the HELP command for other information (like subscribing). -- Please see the official ORACLE-L FAQ: http://www.orafaq.com -- Author: James McCann INET: [EMAIL PROTECTED] Fat City Network Services-- (858) 538-5051 FAX: (858) 538-5051 San Diego, California-- Public Internet access / Mailing Lists To REMOVE yourself from this mailing list, send an E-Mail message to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in the message BODY, include a line containing: UNSUB ORACLE-L (or the name of mailing list you want to be removed from). You may also send the HELP command for other information (like subscribing).
Re: Perf Advice Needed: cache buffers chains, high waits, _db_block_hash_buckets
Really, The only thing to do is fix the SQL. Each logical I/O or buffer get results in a cache buffer chain latch get. So by doing less LIO, you will get fewer latch gets and as a result fewer sleeps on latches. This is how you fix the *problem*. You can also fix the *symptom*: bump up _spin_count (assuming that you run on a SMP) or set _db_ block_hash_latches to a higher value. Fixing the SQL is the right way to go. Are you shooting for a 99.9 percent buffer cache hit ratio ? If you are than that could also be a reason for the problem. Oh and there is a bug in Oracle 8.1.6/8.1.7 I believe that causes an additional buffer get for the index root block (assuming that the hash latches with the high sleeps cover index root blocks). Anjo Kolk http://www.oraperf.com - Original Message - To: Multiple recipients of list ORACLE-L [EMAIL PROTECTED] Sent: Thursday, February 28, 2002 7:53 AM _db_block_hash_buckets db_block_buffers = 360448 db_block_lru_latches = 4 db_block_size = 8192 _db_block_hash_buckets = 720896 Ok, what I have so far is: - using itrprof, I saw that 35% of my elapsed time was based on waits of cache buffers chains latches. - checking v$latch_children (latch#=66), there are a good number (8-10 I'd guess) of the 4096 children that have a very high (10k+) number of sleeps - the rest of the children (of this type latch) have sleep counts are 10-12, so we have a ton of contention on a low # of cache buffers chains latches. - joining with x$bh (v$latch_children.addr=x$bh.hladdr), I see that the most contentioned-for of these latches (51,240 sleeps!) has 66 blocks on the chain. Checking with all_objects, I'm noticing that these blocks are scattered in some of the more important (and most-accessed) tables and indexes - The other latch children that have high sleep counts also have 30-50 buffers in their chains Questions: - to me, 66 seems awfully high - is it? - the sleep count is obviously high from what I can tell - is it definitely tied to the buffer chain this latch is protecting being so long and just happening to be 66 buffers that are mostly important tables and indexes? - I haven't set it by hand, but _db_block_hash_buckets = 720896 and this is 11 * 2^16. Everything I've read says it should be a prime number (and that jives with my comp sci background) - why is it not prime, why is it exactly twice db_block_buffers? - the number of children for cache buffers chains is 4096. Now, increasing that could have a positive effect on distributing the contention, but since the sleps are so heavily skewed to only a few of the children as it stands, I don't get the feeling that's the right fix. Anyone have any advice to offer? Pages/URL's that can help give some advice? It's worth noting that these latches are basically non-existant as wait events at low load - log file sync is about the only wait event I see at low loads, and I'm working on reducing my commit counts much further to help tackle that. Thanks!! James -- James Manning [EMAIL PROTECTED] GPG Key fingerprint = B913 2FBD 14A9 CE18 B2B7 9C8E A0BF B026 EEBB F6E4 -- Please see the official ORACLE-L FAQ: http://www.orafaq.com -- Author: James Manning INET: [EMAIL PROTECTED] Fat City Network Services-- (858) 538-5051 FAX: (858) 538-5051 San Diego, California-- Public Internet access / Mailing Lists To REMOVE yourself from this mailing list, send an E-Mail message to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in the message BODY, include a line containing: UNSUB ORACLE-L (or the name of mailing list you want to be removed from). You may also send the HELP command for other information (like subscribing). -- Please see the official ORACLE-L FAQ: http://www.orafaq.com -- Author: Anjo Kolk INET: [EMAIL PROTECTED] Fat City Network Services-- (858) 538-5051 FAX: (858) 538-5051 San Diego, California-- Public Internet access / Mailing Lists To REMOVE yourself from this mailing list, send an E-Mail message to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in the message BODY, include a line containing: UNSUB ORACLE-L (or the name of mailing list you want to be removed from). You may also send the HELP command for other information (like subscribing).
Re: Perf Advice Needed: cache buffers chains, high waits, _db_block_hash_buckets
Amen. Contention for cache buffers chains means too much logical IO, ie. find and exterminate heavy SQL. Anjo Kolk wrote: Really,The only thing to do is fix the SQL. Each logical I/O or buffer get resultsin a cache buffer chain latch get. So by doing less LIO, you will get fewerlatch gets and as a result fewer sleeps on latches. This is how you fix the*problem*. You can also fix the *symptom*: bump up _spin_count (assumingthat you run on a SMP) or set _db_ block_hash_latches to a higher value.Fixing the SQL is the right way to go. Are you shooting for a 99.9percent buffer cache hit ratio ? If you are than that could also be a reasonfor the problem. Oh and there is a bug in Oracle 8.1.6/8.1.7 I believe thatcauses an additional buffer get for the index root block (assuming that thehash latches with the high sleeps cover index root blocks).Anjo Kolkhttp://www.oraperf.com- Original Message -To: "Multiple recipients of list ORACLE-L" [EMAIL PROTECTED]Sent: Thursday, February 28, 2002 7:53 AM_db_block_hash_buckets db_block_buffers = 360448db_block_lru_latches = 4db_block_size = 8192_db_block_hash_buckets = 720896Ok, what I have so far is:- using itrprof, I saw that 35% of my elapsed time was based on waits of "cache buffers chains" latches.- checking v$latch_children (latch#=66), there are a good number (8-10 I'd guess) of the 4096 children that have a very high (10k+) number of sleeps - the rest of the children (of this type latch) have sleep counts are 10-12, so we have a ton of contention on a low # of "cache buffers chains" latches.- joining with x$bh (v$latch_children.addr=x$bh.hladdr), I see that the most contentioned-for of these latches (51,240 sleeps!) has 66 blocks on the chain. Checking with all_objects, I'm noticing that these blocks are scattered in some of the more important (and most-accessed) tables and indexes- The other latch children that have high sleep counts also have 30-50 buffers in their chainsQuestions:- to me, 66 seems awfully high - is it?- the sleep count is obviously high from what I can tell - is it definitely tied to the buffer chain this latch is protecting being so long and just happening to be 66 buffers that are mostly important tables and indexes?- I haven't set it by hand, but _db_block_hash_buckets = 720896 and this is 11 * 2^16. Everything I've read says it should be a prime number (and that jives with my comp sci background) - why is it not prime, why is it exactly twice db_block_buffers?- the number of children for "cache buffers chains" is 4096. Now, increasing that could have a positive effect on distributing the contention, but since the sleps are so heavily skewed to only a few of the children as it stands, I don't get the feeling that's the right fix.Anyone have any advice to offer? Pages/URL's that can help giv e someadvice?It's worth noting that these latches are basically non-existant aswait events at low load - "log file sync" is about the only waitevent I see at low loads, and I'm working on reducing my commitcounts much further to help tackle that.Thanks!!James--James Manning [EMAIL PROTECTED]GPG Key fingerprint = B913 2FBD 14A9 CE18 B2B7 9C8E A0BF B026 EEBB F6E4--Please see the official ORACLE-L FAQ: http://www.orafaq.com--Author: James Manning INET: [EMAIL PROTECTED]Fat City Network Services-- (858) 538-5051 FAX: (858) 538-5051San Diego, California-- Public Internet access / Mailing ListsTo REMOVE yourself from this mailing list, send an E-Mail messageto: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and inthe message BODY, include a line containing: UNSUB ORACLE-L(or the name of mailing list you want to be removed from). You mayalso send the HELP command for other information (like subscribing).
Re: Perf Advice Needed: cache buffers chains, high waits, _db_block_hash_buckets
[Mogens Nørgaard] Amen. Contention for cache buffers chains means too much logical IO, ie. find and exterminate heavy SQL. I don't see why the heavy SQL would result in the chain having 66 buffer heads in it, though, or why the sleep count would be so skewed. And my core question is still whether the number of buckets being non-prime is normal or not - it seems awfully wrong to me. That there's a lot of contention *is* a factor of the SQL, but the fact that it's so skewed to only a few chains is what worries me more. Once I have the contention down to a particular latch, but that latch protects a buffer chain with 66 buffer heads in it, how can I find out which ones of the 66 are generating the most attempts at that latch? Tell ya what - can I get a few ppl to run this query? It tells the min/max/avg for the number of buffers associated with each chain and if my numbers are high I can at least have a chance of spreading out the buffers over more chains (by upping the number of latches from 4k to 16k, 32, whatever) - it won't drop the actual IO any, of course, but since I don't have a hard fix on which buffers of the 66 are really the source of my contention, I'm not sure where to go from here. SELECT min(buffers_per), max(buffers_per), avg(buffers_per), sum(buffers_per) FROM ( SELECT count(*) buffers_per, hladdr FROM x$bh b, all_objects o, v$latch_children v WHERE b.HLADDR=v.addr AND b.obj=o.object_id AND v.name LIKE '%cache buffers %' GROUP BY hladdr ) My results: min = 39 max = 119 avg = 55.06 sum = 22 If this shows to be about the same in other (well-tuned) Oracle DB's, then I won't worry as much about the number of buffers in each chain and would then focus on trying to isolate the specific buffers, then the source SQL causing the problem, etc. Given my previous sql trace analyses, I have a good idea what the problem SQL statement is, but it's a bit of a necessary evil right now (a join of a table (260k rows) and a materialized view (2k rows), 6 conditions in there where, and it gets executed a ton, probably on the order of 10x a second at peak) - all indexes that helped performance are created and around already. :( But, ideally I'd like to be able to prove this is the cause of the hot buffers before fixing anything. Thanks, guys!! James -- James Manning [EMAIL PROTECTED] GPG Key fingerprint = B913 2FBD 14A9 CE18 B2B7 9C8E A0BF B026 EEBB F6E4 -- Please see the official ORACLE-L FAQ: http://www.orafaq.com -- Author: James Manning INET: [EMAIL PROTECTED] Fat City Network Services-- (858) 538-5051 FAX: (858) 538-5051 San Diego, California-- Public Internet access / Mailing Lists To REMOVE yourself from this mailing list, send an E-Mail message to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in the message BODY, include a line containing: UNSUB ORACLE-L (or the name of mailing list you want to be removed from). You may also send the HELP command for other information (like subscribing).
Re: Perf Advice Needed: cache buffers chains, high waits, _db_block_hash_buckets
From my biggest problem child. No bind variables, tends to get busy at times. Mission critical of course. MIN(BUFFERS_PER) MAX(BUFFERS_PER) AVG(BUFFERS_PER) SUM(BUFFERS_PER) 1 17 5.45231072 5545 1 row selected. From our production SAP system: MIN(BUFFERS_PER) MAX(BUFFERS_PER) AVG(BUFFERS_PER) SUM(BUFFERS_PER) 1 13 4.0584075941899 1 row selected. Jared James Manning [EMAIL PROTECTED] Sent by: [EMAIL PROTECTED] 02/28/02 09:13 AM Please respond to ORACLE-L To: Multiple recipients of list ORACLE-L [EMAIL PROTECTED] cc: Subject:Re: Perf Advice Needed: cache buffers chains, high waits, _db_block_hash_buckets [Mogens Nørgaard] Amen. Contention for cache buffers chains means too much logical IO, ie. find and exterminate heavy SQL. I don't see why the heavy SQL would result in the chain having 66 buffer heads in it, though, or why the sleep count would be so skewed. And my core question is still whether the number of buckets being non-prime is normal or not - it seems awfully wrong to me. That there's a lot of contention *is* a factor of the SQL, but the fact that it's so skewed to only a few chains is what worries me more. Once I have the contention down to a particular latch, but that latch protects a buffer chain with 66 buffer heads in it, how can I find out which ones of the 66 are generating the most attempts at that latch? Tell ya what - can I get a few ppl to run this query? It tells the min/max/avg for the number of buffers associated with each chain and if my numbers are high I can at least have a chance of spreading out the buffers over more chains (by upping the number of latches from 4k to 16k, 32, whatever) - it won't drop the actual IO any, of course, but since I don't have a hard fix on which buffers of the 66 are really the source of my contention, I'm not sure where to go from here. SELECT min(buffers_per), max(buffers_per), avg(buffers_per), sum(buffers_per) FROM ( SELECT count(*) buffers_per, hladdr FROM x$bh b, all_objects o, v$latch_children v WHERE b.HLADDR=v.addr AND b.obj=o.object_id AND v.name LIKE '%cache buffers %' GROUP BY hladdr ) My results: min = 39 max = 119 avg = 55.06 sum = 22 If this shows to be about the same in other (well-tuned) Oracle DB's, then I won't worry as much about the number of buffers in each chain and would then focus on trying to isolate the specific buffers, then the source SQL causing the problem, etc. Given my previous sql trace analyses, I have a good idea what the problem SQL statement is, but it's a bit of a necessary evil right now (a join of a table (260k rows) and a materialized view (2k rows), 6 conditions in there where, and it gets executed a ton, probably on the order of 10x a second at peak) - all indexes that helped performance are created and around already. :( But, ideally I'd like to be able to prove this is the cause of the hot buffers before fixing anything. Thanks, guys!! James -- James Manning [EMAIL PROTECTED] GPG Key fingerprint = B913 2FBD 14A9 CE18 B2B7 9C8E A0BF B026 EEBB F6E4 -- Please see the official ORACLE-L FAQ: http://www.orafaq.com -- Author: James Manning INET: [EMAIL PROTECTED] Fat City Network Services-- (858) 538-5051 FAX: (858) 538-5051 San Diego, California-- Public Internet access / Mailing Lists To REMOVE yourself from this mailing list, send an E-Mail message to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in the message BODY, include a line containing: UNSUB ORACLE-L (or the name of mailing list you want to be removed from). You may also send the HELP command for other information (like subscribing). -- Please see the official ORACLE-L FAQ: http://www.orafaq.com -- Author: INET: [EMAIL PROTECTED] Fat City Network Services-- (858) 538-5051 FAX: (858) 538-5051 San Diego, California-- Public Internet access / Mailing Lists To REMOVE yourself from this mailing list, send an E-Mail message to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in the message BODY, include a line containing: UNSUB ORACLE-L (or the name of mailing list you want to be removed from). You may also send the HELP command for other information (like subscribing).