RE: Arch configuration -- I/O stuck

Steve Adams Thu, 01 Nov 2001 19:03:38 -0800

Hi Pablo,

I've seen an 'ls' hang for more than a minute under HP-UX 10.20 when
there were a lot of delayed writes pending on an unrelated file system.
A colleague of mine (Chris Bunting) did some testing to reproduce the
problem and concluded that all file systems of the same type (JFS or
HFS) were affected. HP made some kernel changes for 11.0 that have
reduced the severity of the problem, but it can still occur.


In your case the archive writes are not delayed writes, because Oracle
explicitly opens the files in synchronous mode. So you should not see a
delay any longer than the time it would take your Symmetrix box to
destage the cache allocations for the target LUNs, unless there happens
to be simultaneous heavy delayed write activity elsewhere on the system.

The failure of the multiblock_read_test.sql script probably indicates
that the "large" table that you scanned already had a large number of
blocks in the cache.
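
To illustrate the point above, here is a minimal Python sketch of how
blocks that are already cached can shorten multiblock reads. It rests on
a simplifying assumption, not on Oracle's actual code: a scan gathers
consecutive uncached blocks into one read and ends that read when it
meets a cached block or reaches the read cap. The function name
multiblock_read_sizes, the 64-block table, the 32-block cap and the
pattern of cached blocks are all made up for illustration.

```python
def multiblock_read_sizes(total_blocks, cached, max_read):
    """Return the size, in blocks, of each read issued by a simulated scan.

    Simplified model (an assumption, not Oracle internals): blocks are
    visited in order, cached blocks are skipped, and a read ends early
    when a cached block is met or when max_read blocks were gathered.
    """
    sizes = []
    run = 0  # uncached blocks gathered for the current read
    for block in range(total_blocks):
        if block in cached:
            if run:
                sizes.append(run)
                run = 0
            continue
        run += 1
        if run == max_read:
            sizes.append(run)
            run = 0
    if run:
        sizes.append(run)
    return sizes


# Hypothetical numbers: a 64-block table, reads capped at 32 blocks.
cold = multiblock_read_sizes(64, set(), 32)
# Every sixth block is already in the cache.
warm = multiblock_read_sizes(64, set(range(5, 64, 6)), 32)

print("largest read, empty cache:", max(cold))
print("largest read, partly cached table:", max(warm))
```

In this model the largest read is the full 32-block cap when the cache
is empty, but no read exceeds 5 blocks once every sixth block is cached.
So a p3 value observed on a partly cached table says little about the
maximum I/O size.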

@   Regards,
@   Steve Adams
@   http://www.ixora.com.au/              -  For DBAs
@   http://www.christianity.net.au/       -  For all


-----Original Message-----
Sent: Friday, 2 November 2001 6:39
To: Steve Adams; Multiple recipients of list ORACLE-L


Steve, thanks for the help, for the URL, and for the advice on
striping.

I don't understand the part I'm pasting below. I'm running 'ls' in a
file system that is on a different disk, in different LUNs (on the same
Symmetrix), so why does it still get stuck? Shouldn't it be placed in a
different queue?

"The 'ls' is probably getting stuck because the I/O is very slow and
file system metadata writes are stuck in the I/O queue while locks are
held on the file system metadata pending the completion of those
writes."



One more question. Besides what you just advised, I've been trying to
reduce the ARCH bandwidth (as I read in a tip on your site), to spread
the ARCH work over a longer time and reduce the impact on foreground
processes. So I've changed log_archive_buffers from 4 to 2, and today I
tried to set log_archive_buffer_size to the MAX_IO_SIZE of the OS. But I
ran into a problem with this.

I tried to find out what the MAX_IO_SIZE is, so I set event 10046 and
looked at the scattered reads during a big full table scan (as you do
in your scripts), and I always got p3=5. I checked this in 2 different
databases running on the same box. Both reported p3=5 (5 blocks, I
think), but the surprise is that one of them has db_block_size=4K and
the other has db_block_size=8K.

How is that possible? According to this test, MAX_IO_SIZE could be 20K
or 40K. What's wrong here?

And worse, MAX_IO_SIZE can't be that small, right? I thought it was 1MB
or 512K in HP-UX 11.0.
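
The arithmetic behind this question can be checked with a short Python
sketch. It only multiplies out the figures quoted in this message
(p3=5, 4K and 8K blocks) and, as an assumption for comparison, treats
the 512K and 1MB guesses as a byte limit on a single read. None of
these are measured limits.

```python
# Figures quoted in the message: p3=5 on both databases.
observed_p3 = 5
block_sizes = {"4K database": 4 * 1024, "8K database": 8 * 1024}

# Bytes moved by one scattered read of observed_p3 blocks.
read_bytes = {name: observed_p3 * size for name, size in block_sizes.items()}
for name, nbytes in read_bytes.items():
    print(f"{name}: {nbytes // 1024}K per read")

# Assumed byte limits (the 512K and 1MB guesses, not measured values):
# how many blocks one read could hold on each database.
blocks_at_limit = {}
for limit in (512 * 1024, 1024 * 1024):
    blocks_at_limit[limit] = {
        name: limit // size for name, size in block_sizes.items()
    }
    print(f"{limit // 1024}K limit: {blocks_at_limit[limit]}")
```

A limit expressed in bytes would allow twice as many 4K blocks as 8K
blocks per read, so seeing the same block count on both databases
suggests that something other than a byte limit is capping the reads.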

Thanks for your time.
TIA

 --- Steve Adams <[EMAIL PROTECTED]> wrote:
> Hi Pablo,
>
> The 'ls' is probably getting stuck because the I/O is very slow and
> file system metadata writes are stuck in the I/O queue while locks are
> held on the file system metadata pending the completion of those
> writes.
>
> The problem could be that you are saturating the cache allocations for
> the EMC LUNs containing your archive destination file system. See the
> answer at http://www.ixora.com.au/q+a/0010/20102738.htm for a bit
> about the EMC cache allocation policy. To solve the problem you can
> use LVM to stripe a large number of small LUNs together so as to
> increase the total amount of cache available for the archival writes.
> You would also do well to avoid RAID-S of course!
>
> @   Regards,
> @   Steve Adams
> @   http://www.ixora.com.au/              -  For DBAs
> @   http://www.christianity.net.au/       -  For all
>
>
> -----Original Message-----
> From: Pablo ksksksk [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, 1 November 2001 5:45
> To: Multiple recipients of list ORACLE-L
> Subject: Arch configuration -- I/O stuck
>
>
> Hi list,
>
>   Oracle 7.3.4
>   HP-UX
>   log_archive_buffer_size=32 (redo log blocks = 1K)
>   log_archive_buffers=4
>   Filesystem based (no direct I/O)
>
>   I've noticed that my box gets stuck for some time every now and
>   then. When this happens I can't even do an "ls" (it does execute,
>   but it takes a long time). If I check the CPU with top, I see 47%
>   idle time and no process monopolizing the CPU. But when I check
>   disk activity with sar -d, I see that one disk is 100% busy and its
>   avwait+avserv is over 1000 ms. The other disks are fine. I then
>   check disk activity with Glance, and I can identify the process
>   that is writing to and reading from this disk: ARCH (ARCH is
>   writing a 1.9 GB redo log).
>
>   So here are my questions:
>       1) If only one disk is saturated (I've got about 30 disks in
>          this box, a Symmetrix array, with some controllers), why
>          does the whole box get stuck? Why are even other
>          applications connected to other instances running on this
>          box affected? (Maybe because the HP-UX LVM system gets
>          saturated?)
>
>       2) What can I do to avoid this problem? (Maybe reduce the
>          log_archive_buffers parameter, or increase
>          log_archive_buffer_size?)
>
> Please help me with this.
> Thanks
>
