Hi Pablo,

I've seen an 'ls' hang for more than a minute under 10.20 when there were a lot of delayed writes pending on an unrelated file system. A colleague of mine (Chris Bunting) did some testing to reproduce the problem and concluded that all file systems of the same type (JFS or HFS) were affected. HP made some kernel changes for 11.0 that have reduced the severity of the problem, but it can still occur.

In your case the archive writes are not delayed writes, because Oracle explicitly opens the files in synchronous mode. So you should not see a delay any longer than the time it would take your Symmetrix box to destage the cache allocations for the target LUNs, unless there happens to be simultaneous heavy delayed write activity elsewhere on the system.

The failure of the multiblock_read_test.sql script probably indicates that the "large" table that you scanned already had a large number of blocks in the cache.

@   Regards,
@   Steve Adams
@   http://www.ixora.com.au/ - For DBAs
@   http://www.christianity.net.au/ - For all

-----Original Message-----
Sent: Friday, 2 November 2001 6:39
To: Steve Adams; Multiple recipients of list ORACLE-L

Steve, thanks for the help, the URL, and the advice on striping.

I don't understand what I'm pasting here. I'm executing an 'ls' in a FS that's on a different disk in different LUNs (on the same Symmetrix), so why is it still getting stuck? Shouldn't it be placed in a different queue?

"The 'ls' is probably getting stuck because the I/O is very slow and file system metadata writes are stuck in the I/O queue while locks are held on the file system metadata pending the completion of those writes."

One more question. Besides what you just advised, I've been trying to reduce ARCH bandwidth (as I read in a tip on your site), to spread the ARCH work over more time and reduce the impact on foreground processes. So I've changed log_archive_buffers from 4 to 2, and today I tried to set log_archive_buffer_size to the MAX_IO_SIZE of the OS.

But I found a problem with this. To find out what MAX_IO_SIZE was, I used the 10046 event and checked the scattered reads in a big FTS (as you do in your scripts), and I always got p3=5. I checked this in 2 different databases running on the same box. Both reported p3=5 (5 blocks, I think), but the surprise is that one of them has db_block_size=4K and the other db_block_size=8K.
How can this be possible? According to this test, MAX_IO_SIZE could be 20K or 40K. What's wrong here? And something worse: MAX_IO_SIZE can't be that small, right? I thought it was 1MB or 512K in HP-UX 11.0.

Thanks for your time.
TIA

--- Steve Adams <[EMAIL PROTECTED]> wrote:
> Hi Pablo,
>
> The 'ls' is probably getting stuck because the I/O is very slow and
> file system metadata writes are stuck in the I/O queue while locks
> are held on the file system metadata pending the completion of
> those writes.
>
> The problem could be that you are saturating the cache allocations
> for the EMC LUNs containing your archive destination file system.
> See the answer at http://www.ixora.com.au/q+a/0010/20102738.htm for
> a bit about the EMC cache allocation policy. To solve the problem
> you can use LVM to stripe a large number of small LUNs together so
> as to increase the total amount of cache available for the archival
> writes. You would also do well to avoid RAID-S of course!
>
> @   Regards,
> @   Steve Adams
> @   http://www.ixora.com.au/ - For DBAs
> @   http://www.christianity.net.au/ - For all
>
> -----Original Message-----
> From: Pablo ksksksk [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, 1 November 2001 5:45
> To: Multiple recipients of list ORACLE-L
> Subject: Arch configuration -- I/O stuck
>
> Hi list,
>
> Oracle 7.3.4
> HP-UX
> log_archive_buffer_size=32 (redo log blocks = 1K)
> log_archive_buffers=4
> Filesystem based (no direct I/O)
>
> I've noticed that my box occasionally gets stuck for a while.
> When this happens I can't even do an "ls" (it actually executes,
> but it takes a long time).
> If I check my CPU with top, I see 47% idle time and no process is
> monopolizing the CPU.
> But when I check disk activity with sar -d, I see that one disk is
> 100% busy and its avwait+avserv > 1000 ms. The other disks are fine.
> I then check disk activity with Glance and can see that the process
> writing to and reading from this disk is ARCH (ARCH is writing a
> 1.9 GB redo log).
>
> So here are my questions:
> 1) If only one disk is saturated (I've got about 30 disks in this
> box (a SYMMETRIX array) with some controllers), why does the whole
> box get stuck? Why are even other applications connected to other
> instances running on this box affected? (Maybe because the HP-UX
> LVM system gets saturated?)
>
> 2) What can I do to avoid this problem? (Maybe reduce the
> log_archive_buffers parameter, or increase
> log_archive_buffer_size?)
>
> Help me on this.
> Thanks

--
Please see the official ORACLE-L FAQ: http://www.orafaq.com
--
Author: Steve Adams
  INET: [EMAIL PROTECTED]
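
A footnote on the p3=5 measurement discussed in this thread: the arithmetic behind the "20K or 40K" figures can be sketched in a few lines of Python. This is only an illustration, not the method used by multiblock_read_test.sql. The trace line layout in the sample (`WAIT #n: nam='...' ela= ... p1= p2= p3=`) is an assumption about what the 10046 trace files look like, the sample lines are made up, and the function names are hypothetical.

```python
import re

# Assumed layout of a 10046 trace WAIT line; adjust the pattern if the
# trace files on your release look different.
SCATTERED_READ = re.compile(r"nam='db file scattered read'.*?p3=(\d+)")


def max_multiblock_read(trace_lines):
    """Return the largest p3 (blocks per scattered read) seen, or None."""
    largest = None
    for line in trace_lines:
        match = SCATTERED_READ.search(line)
        if match:
            blocks = int(match.group(1))
            if largest is None or blocks > largest:
                largest = blocks
    return largest


def implied_io_bytes(blocks, db_block_size):
    """Size of one multiblock read, in bytes."""
    return blocks * db_block_size


# Made-up sample lines, for illustration only.
sample = [
    "WAIT #1: nam='db file scattered read' ela= 2 p1=5 p2=1201 p3=5",
    "WAIT #1: nam='db file sequential read' ela= 1 p1=5 p2=1300 p3=1",
    "WAIT #1: nam='db file scattered read' ela= 3 p1=5 p2=1206 p3=3",
]

blocks = max_multiblock_read(sample)
for block_size in (4096, 8192):
    kb = implied_io_bytes(blocks, block_size) // 1024
    print(f"db_block_size={block_size}: {blocks} blocks -> {kb}K")
```

With p3=5 the same block count gives 20K at a 4K block size and 40K at 8K. Since the two implied byte sizes disagree, the figure cannot be read as a single OS limit, which fits Steve's suggestion that the scans were cut short by blocks already in the cache.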