Re: Perf Advice Needed: cache buffers chains, high waits, _db_block_hash_buckets

2002-03-04 Thread Jonathan Lewis


Don't forget that Oracle has redesigned the whole
area of hash buckets and hash latches.

In 8.1, the number of buckets is roughly 2 x db_block_buffers,
but each latch covers multiple buckets (don't forget that
a latch is about 160 bytes, so a noticeable amount of
memory overhead)

Because each latch covers multiple buckets, the latches
can be held for a short time, as they need only be held
whilst one bucket is searched.  Unfortunately, there is no
easy way to see the number of buffered blocks per bucket.

Steve Adams has a formula relating db_block_buffers to the
number of latches, but for most reasonable sized database,
the answer is 1024.  Typically you are likely to see between 64
and 128 buffers per latch.



Jonathan Lewis
http://www.jlcomp.demon.co.uk

Now running 3-day intensive seminars
http://www.jlcomp.demon.co.uk/seminar.html

Host to The Co-Operative Oracle Users' FAQ
http://www.jlcomp.demon.co.uk/faq/ind_faq.html

Author of:
Practical Oracle 8i: Building Efficient Databases


-Original Message-
[EMAIL PROTECTED]
To: Multiple recipients of list ORACLE-L [EMAIL PROTECTED]
Date: 04 March 2002 09:12
_db_block_hash_buckets


|$)CFrom my R11 11.0.3 system :  (11.0.3 against 8.1.7.2.1 on Tru64,
4CPUs,
|db_block_lru_latches=16)
|
|MIN(BUFFERS_PER) MAX(BUFFERS_PER) AVG(BUFFERS_PER) SUM(BUFFERS_PER)
|   
|  69  119   93.824218896076
|
|Do I have a problem ?  Very long buffer-chains, huh !  Nobody's
complaining
|about
|performance.  CPU util is generally 40% to 80% only occasssionally at
90%+
|when there
|are more than 6 reports running concurrently.
|Can't remember why I set db_block_lru_latches to 16, back in June
2001 when
|the DB
|was 8.0.5
|
|Hemant K Chitale
|Principal DBA
|Chartered Semiconductor Manufacturing Ltd


-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.com
-- 
Author: Jonathan Lewis
  INET: [EMAIL PROTECTED]

Fat City Network Services-- (858) 538-5051  FAX: (858) 538-5051
San Diego, California-- Public Internet access / Mailing Lists

To REMOVE yourself from this mailing list, send an E-Mail message
to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).



RE: Perf Advice Needed: cache buffers chains, high waits, _db_block_hash_buckets

2002-03-01 Thread James McCann

Hi,
   while we are on this topic, I would to ask you all about a system I was
working on recently.

The main problem was that approx. 64 threads were almost continuously doing
full table scans on a small table of 800 rows (the developers insisted this
was necessary). This table was in 1 or 2 blocks and was having a huge amount
of cache buffers chains latching.

When we forced the sql to use the index, the latching moved to the index.

I did all the usual tricks like spreading the table out across a lot more
blocks , increase the spin count etc. with limited success.

My view was that with so much activity going on on this small table, the
latching would never be fully eliminated, and it was poorly designed code.

I just want to check that everyone agrees with me, or would you expect to be
able to eliminate the waits even under these conditions?

Thanks for your advice,

Jim



-Original Message-
Manning
Sent: 28 February 2002 17:14
To: Multiple recipients of list ORACLE-L
_db_block_hash_buckets


[Mogens Nørgaard]
Amen.  Contention  for cache buffers chains means too much logical IO,
ie. find and exterminate heavy SQL.

I don't see why the heavy SQL would result in the chain having 66 buffer
heads in it, though, or why the sleep count would be so skewed.

And my core question is still whether the number of buckets being
non-prime is normal or not - it seems awfully wrong to me.

That there's a lot of contention *is* a factor of the SQL, but the
fact that it's so skewed to only a few chains is what worries me more.

Once I have the contention down to a particular latch, but that latch
protects a buffer chain with 66 buffer heads in it, how can I find out
which ones of the 66 are generating the most attempts at that latch?

Tell ya what - can I get a few ppl to run this query?  It tells the
min/max/avg for the number of buffers associated with each chain and if
my numbers are high I can at least have a chance of spreading out the
buffers over more chains (by upping the number of latches from 4k to 16k,
32, whatever) - it won't drop the actual IO any, of course, but since
I don't have a hard fix on which buffers of the 66 are really the source
of my contention, I'm not sure where to go from here.


SELECT min(buffers_per), max(buffers_per),
   avg(buffers_per), sum(buffers_per)
FROM (
   SELECT count(*) buffers_per, hladdr
   FROM x$bh b, all_objects o, v$latch_children v
   WHERE
   b.HLADDR=v.addr
   AND b.obj=o.object_id
   AND v.name LIKE '%cache buffers %'
   GROUP BY hladdr
)

My results:
min = 39
max = 119
avg = 55.06
sum = 22

If this shows to be about the same in other (well-tuned) Oracle DB's, then
I won't worry as much about the number of buffers in each chain and would
then focus on trying to isolate the specific buffers, then the source SQL
causing the problem, etc.

Given my previous sql trace analyses, I have a good idea what the problem
SQL statement is, but it's a bit of a necessary evil right now (a join
of a table (260k rows) and a materialized view (2k rows), 6 conditions
in there where, and it gets executed a ton, probably on the order of 10x
a second at peak) - all indexes that helped performance are created and
around already. :(  But, ideally I'd like to be able to prove this is
the cause of the hot buffers before fixing anything.

Thanks, guys!!

James
--
James Manning [EMAIL PROTECTED]
GPG Key fingerprint = B913 2FBD 14A9 CE18 B2B7  9C8E A0BF B026 EEBB F6E4
--
Please see the official ORACLE-L FAQ: http://www.orafaq.com
--
Author: James Manning
  INET: [EMAIL PROTECTED]

Fat City Network Services-- (858) 538-5051  FAX: (858) 538-5051
San Diego, California-- Public Internet access / Mailing Lists

To REMOVE yourself from this mailing list, send an E-Mail message
to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).

-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.com
-- 
Author: James McCann
  INET: [EMAIL PROTECTED]

Fat City Network Services-- (858) 538-5051  FAX: (858) 538-5051
San Diego, California-- Public Internet access / Mailing Lists

To REMOVE yourself from this mailing list, send an E-Mail message
to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).



Re: Perf Advice Needed: cache buffers chains, high waits, _db_block_hash_buckets

2002-02-28 Thread Anjo Kolk

Really,

The only thing to do is fix the SQL. Each logical I/O or buffer get results
in a cache buffer chain latch get. So by doing less LIO, you will get fewer
latch gets and as a result fewer sleeps on latches. This is how you fix the
*problem*. You can also fix the *symptom*: bump up _spin_count (assuming
that you run on a SMP) or set _db_ block_hash_latches to a higher value.

Fixing the SQL is the right way to go. Are you shooting for a 99.9
percent buffer cache hit ratio ? If you are than that could also be a reason
for the problem. Oh and there is a bug in Oracle 8.1.6/8.1.7 I believe that
causes an additional buffer get for the index root block (assuming that the
hash latches with the high sleeps cover index root blocks).

Anjo Kolk
http://www.oraperf.com

- Original Message -
To: Multiple recipients of list ORACLE-L [EMAIL PROTECTED]
Sent: Thursday, February 28, 2002 7:53 AM
_db_block_hash_buckets


 db_block_buffers = 360448
 db_block_lru_latches = 4
 db_block_size = 8192

 _db_block_hash_buckets = 720896

 Ok, what I have so far is:

 - using itrprof, I saw that 35% of my elapsed time was based on waits of
   cache buffers chains latches.
 - checking v$latch_children (latch#=66), there are a good number (8-10
   I'd guess) of the 4096 children that have a very high (10k+) number
   of sleeps - the rest of the children (of this type latch) have sleep
   counts are 10-12, so we have a ton of contention on a low # of cache
   buffers chains latches.
 - joining with x$bh (v$latch_children.addr=x$bh.hladdr), I see that
   the most contentioned-for of these latches (51,240 sleeps!) has 66
   blocks on the chain.  Checking with all_objects, I'm noticing that these
   blocks are scattered in some of the more important (and most-accessed)
   tables and indexes
 - The other latch children that have high sleep counts also have 30-50
   buffers in their chains

 Questions:

 - to me, 66 seems awfully high - is it?
 - the sleep count is obviously high from what I can tell - is it
   definitely tied to the buffer chain this latch is protecting being
   so long and just happening to be 66 buffers that are mostly important
   tables and indexes?
 - I haven't set it by hand, but _db_block_hash_buckets = 720896
   and this is 11 * 2^16.  Everything I've read says it should be
   a prime number (and that jives with my comp sci background) - why
   is it not prime, why is it exactly twice db_block_buffers?
 - the number of children for cache buffers chains is 4096.  Now,
   increasing that could have a positive effect on distributing the
   contention, but since the sleps are so heavily skewed to only a few
   of the children as it stands, I don't get the feeling that's the
   right fix.

 Anyone have any advice to offer?  Pages/URL's that can help give some
 advice?

 It's worth noting that these latches are basically non-existant as
 wait events at low load - log file sync is about the only wait
 event I see at low loads, and I'm working on reducing my commit
 counts much further to help tackle that.

 Thanks!!

 James
 --
 James Manning [EMAIL PROTECTED]
 GPG Key fingerprint = B913 2FBD 14A9 CE18 B2B7  9C8E A0BF B026 EEBB F6E4
 --
 Please see the official ORACLE-L FAQ: http://www.orafaq.com
 --
 Author: James Manning
   INET: [EMAIL PROTECTED]

 Fat City Network Services-- (858) 538-5051  FAX: (858) 538-5051
 San Diego, California-- Public Internet access / Mailing Lists
 
 To REMOVE yourself from this mailing list, send an E-Mail message
 to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in
 the message BODY, include a line containing: UNSUB ORACLE-L
 (or the name of mailing list you want to be removed from).  You may
 also send the HELP command for other information (like subscribing).


-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.com
-- 
Author: Anjo Kolk
  INET: [EMAIL PROTECTED]

Fat City Network Services-- (858) 538-5051  FAX: (858) 538-5051
San Diego, California-- Public Internet access / Mailing Lists

To REMOVE yourself from this mailing list, send an E-Mail message
to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).



Re: Perf Advice Needed: cache buffers chains, high waits, _db_block_hash_buckets

2002-02-28 Thread Mogens Nørgaard



Amen. Contention for cache buffers chains means too much logical IO, ie.
find and exterminate heavy SQL.

Anjo Kolk wrote:

  Really,The only thing to do is fix the SQL. Each logical I/O or buffer get resultsin a cache buffer chain latch get. So by doing less LIO, you will get fewerlatch gets and as a result fewer sleeps on latches. This is how you fix the*problem*. You can also fix the *symptom*: bump up _spin_count (assumingthat you run on a SMP) or set _db_ block_hash_latches to a higher value.Fixing the SQL is the right way to go. Are you shooting for a 99.9percent buffer cache hit ratio ? If you are than that could also be a reasonfor the problem. Oh and there is a bug in Oracle 8.1.6/8.1.7 I believe thatcauses an additional buffer get for the index root block (assuming that thehash latches with the high sleeps cover index root blocks).Anjo Kolkhttp://www.oraperf.com- Original Message -To: "Multiple recipients of list ORACLE-L" [EMAIL PROTECTED]Sent: Thursday, February 28, 2002 7:53 AM_db_block_hash_buckets
  
db_block_buffers = 360448db_block_lru_latches = 4db_block_size = 8192_db_block_hash_buckets = 720896Ok, what I have so far is:- using itrprof, I saw that 35% of my elapsed time was based on waits of  "cache buffers chains" latches.- checking v$latch_children (latch#=66), there are a good number (8-10  I'd guess) of the 4096 children that have a very high (10k+) number  of sleeps - the rest of the children (of this type latch) have sleep  counts are 10-12, so we have a ton of contention on a low # of "cache  buffers chains" latches.- joining with x$bh (v$latch_children.addr=x$bh.hladdr), I see that  the most contentioned-for of these latches (51,240 sleeps!) has 66  blocks on the chain.  Checking with all_objects, I'm noticing that these  blocks are scattered in some of the more important (and most-accessed)  tables and indexes- The other latch children that have high sleep counts
 also have 30-50  buffers in their chainsQuestions:- to me, 66 seems awfully high - is it?- the sleep count is obviously high from what I can tell - is it  definitely tied to the buffer chain this latch is protecting being  so long and just happening to be 66 buffers that are mostly important  tables and indexes?- I haven't set it by hand, but _db_block_hash_buckets = 720896  and this is 11 * 2^16.  Everything I've read says it should be  a prime number (and that jives with my comp sci background) - why  is it not prime, why is it exactly twice db_block_buffers?- the number of children for "cache buffers chains" is 4096.  Now,  increasing that could have a positive effect on distributing the  contention, but since the sleps are so heavily skewed to only a few  of the children as it stands, I don't get the feeling that's the  right fix.Anyone have any advice to offer?  Pages/URL's that can help giv
e someadvice?It's worth noting that these latches are basically non-existant aswait events at low load - "log file sync" is about the only waitevent I see at low loads, and I'm working on reducing my commitcounts much further to help tackle that.Thanks!!James--James Manning [EMAIL PROTECTED]GPG Key fingerprint = B913 2FBD 14A9 CE18 B2B7  9C8E A0BF B026 EEBB F6E4--Please see the official ORACLE-L FAQ: http://www.orafaq.com--Author: James Manning  INET: [EMAIL PROTECTED]Fat City Network Services-- (858) 538-5051  FAX: (858) 538-5051San Diego, California-- Public Internet access / Mailing ListsTo 
REMOVE yourself from this mailing list, send an E-Mail messageto: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and inthe message BODY, include a line containing: UNSUB ORACLE-L(or the name of mailing list you want to be removed from).  You mayalso send the HELP command for other information (like subscribing).








Re: Perf Advice Needed: cache buffers chains, high waits, _db_block_hash_buckets

2002-02-28 Thread James Manning

[Mogens Nørgaard]
Amen.  Contention  for cache buffers chains means too much logical IO,
ie. find and exterminate heavy SQL.

I don't see why the heavy SQL would result in the chain having 66 buffer
heads in it, though, or why the sleep count would be so skewed.

And my core question is still whether the number of buckets being
non-prime is normal or not - it seems awfully wrong to me.

That there's a lot of contention *is* a factor of the SQL, but the
fact that it's so skewed to only a few chains is what worries me more.

Once I have the contention down to a particular latch, but that latch
protects a buffer chain with 66 buffer heads in it, how can I find out
which ones of the 66 are generating the most attempts at that latch?

Tell ya what - can I get a few ppl to run this query?  It tells the
min/max/avg for the number of buffers associated with each chain and if
my numbers are high I can at least have a chance of spreading out the
buffers over more chains (by upping the number of latches from 4k to 16k,
32, whatever) - it won't drop the actual IO any, of course, but since
I don't have a hard fix on which buffers of the 66 are really the source
of my contention, I'm not sure where to go from here.


SELECT min(buffers_per), max(buffers_per), 
   avg(buffers_per), sum(buffers_per) 
FROM (
   SELECT count(*) buffers_per, hladdr
   FROM x$bh b, all_objects o, v$latch_children v
   WHERE
   b.HLADDR=v.addr
   AND b.obj=o.object_id
   AND v.name LIKE '%cache buffers %'
   GROUP BY hladdr
)

My results:
min = 39
max = 119
avg = 55.06
sum = 22

If this shows to be about the same in other (well-tuned) Oracle DB's, then
I won't worry as much about the number of buffers in each chain and would
then focus on trying to isolate the specific buffers, then the source SQL
causing the problem, etc.

Given my previous sql trace analyses, I have a good idea what the problem
SQL statement is, but it's a bit of a necessary evil right now (a join
of a table (260k rows) and a materialized view (2k rows), 6 conditions
in there where, and it gets executed a ton, probably on the order of 10x
a second at peak) - all indexes that helped performance are created and
around already. :(  But, ideally I'd like to be able to prove this is
the cause of the hot buffers before fixing anything.

Thanks, guys!!

James
-- 
James Manning [EMAIL PROTECTED]
GPG Key fingerprint = B913 2FBD 14A9 CE18 B2B7  9C8E A0BF B026 EEBB F6E4
-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.com
-- 
Author: James Manning
  INET: [EMAIL PROTECTED]

Fat City Network Services-- (858) 538-5051  FAX: (858) 538-5051
San Diego, California-- Public Internet access / Mailing Lists

To REMOVE yourself from this mailing list, send an E-Mail message
to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).



Re: Perf Advice Needed: cache buffers chains, high waits, _db_block_hash_buckets

2002-02-28 Thread Jared . Still

From my biggest problem child.  No bind variables, tends to get busy at 
times.
Mission critical of course.

MIN(BUFFERS_PER) MAX(BUFFERS_PER) AVG(BUFFERS_PER) SUM(BUFFERS_PER)
   
   1   17   5.45231072 5545

1 row selected.


From our production SAP system:

MIN(BUFFERS_PER) MAX(BUFFERS_PER) AVG(BUFFERS_PER) SUM(BUFFERS_PER)
   
   1   13   4.0584075941899

1 row selected.

Jared





James Manning [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
02/28/02 09:13 AM
Please respond to ORACLE-L

 
To: Multiple recipients of list ORACLE-L [EMAIL PROTECTED]
cc: 
Subject:Re: Perf Advice Needed: cache buffers chains, high waits, 
_db_block_hash_buckets


[Mogens Nørgaard]
Amen.  Contention  for cache buffers chains means too much logical 
IO,
ie. find and exterminate heavy SQL.

I don't see why the heavy SQL would result in the chain having 66 buffer
heads in it, though, or why the sleep count would be so skewed.

And my core question is still whether the number of buckets being
non-prime is normal or not - it seems awfully wrong to me.

That there's a lot of contention *is* a factor of the SQL, but the
fact that it's so skewed to only a few chains is what worries me more.

Once I have the contention down to a particular latch, but that latch
protects a buffer chain with 66 buffer heads in it, how can I find out
which ones of the 66 are generating the most attempts at that latch?

Tell ya what - can I get a few ppl to run this query?  It tells the
min/max/avg for the number of buffers associated with each chain and if
my numbers are high I can at least have a chance of spreading out the
buffers over more chains (by upping the number of latches from 4k to 16k,
32, whatever) - it won't drop the actual IO any, of course, but since
I don't have a hard fix on which buffers of the 66 are really the source
of my contention, I'm not sure where to go from here.


SELECT min(buffers_per), max(buffers_per), 
   avg(buffers_per), sum(buffers_per) 
FROM (
   SELECT count(*) buffers_per, hladdr
   FROM x$bh b, all_objects o, v$latch_children v
   WHERE
   b.HLADDR=v.addr
   AND b.obj=o.object_id
   AND v.name LIKE '%cache buffers %'
   GROUP BY hladdr
)

My results:
min = 39
max = 119
avg = 55.06
sum = 22

If this shows to be about the same in other (well-tuned) Oracle DB's, then
I won't worry as much about the number of buffers in each chain and would
then focus on trying to isolate the specific buffers, then the source SQL
causing the problem, etc.

Given my previous sql trace analyses, I have a good idea what the problem
SQL statement is, but it's a bit of a necessary evil right now (a join
of a table (260k rows) and a materialized view (2k rows), 6 conditions
in there where, and it gets executed a ton, probably on the order of 10x
a second at peak) - all indexes that helped performance are created and
around already. :(  But, ideally I'd like to be able to prove this is
the cause of the hot buffers before fixing anything.

Thanks, guys!!

James
-- 
James Manning [EMAIL PROTECTED]
GPG Key fingerprint = B913 2FBD 14A9 CE18 B2B7  9C8E A0BF B026 EEBB F6E4
-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.com
-- 
Author: James Manning
  INET: [EMAIL PROTECTED]

Fat City Network Services-- (858) 538-5051  FAX: (858) 538-5051
San Diego, California-- Public Internet access / Mailing Lists

To REMOVE yourself from this mailing list, send an E-Mail message
to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).



--
Please see the official ORACLE-L FAQ: http://www.orafaq.com
--
Author:
  INET: [EMAIL PROTECTED]

Fat City Network Services-- (858) 538-5051  FAX: (858) 538-5051
San Diego, California-- Public Internet access / Mailing Lists

To REMOVE yourself from this mailing list, send an E-Mail message
to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).