All I have a meeting tomorrow where I am going to point out why SAN and Oracle does not go very well together. Here are my thoughts. Can you pick holes in this argument, modify it or suggest any changes....
TIA Babu SAN and Oracle ? Conflicting IO behavior * There are four types of IO in Oracle 1. Random Reads (RR) ? DBWR - Using indexes 2. Sequential Reads (SR) ? DBWR - Full table scans 3. Random Writes (RW) ? DBWR ? Writing dirty blocks 4. Sequential Writes (SW) ? LGWR, Arch ? Writing redo logs and Redo Archival + Control files * Bulk of any Oracle database's IO is done in RR, SR and RW. If SW is very high it denotes configuration problems. * SAN (or for that matter any RAID device) is configured for writing or reading large chunks at a time. The stripe size on most SANs and RAID devices are 256K or more. Compare this to the Oracle block size of 4k/8k in most databases (going upto 32K in datawarehouses) * SANs do *Read Ahead*. If one block is requested, they read more than one blocks *while at the disk* hoping that the same process will request the other blocks some time soon. Here is the conflict. * When ever Oracle does a RR, SR or RW it writes randomly and not sequentially. It will read/write a particular block at a time in case of RR and RW and 'x' blocks (where x = dbfile_multi_block_read_count) in case of SR. Therefore only during SR will Oracle use the entire stripe width. In all other cases, The difference in the stripe width and db_block_size will be excess IO. * Why *read ahead* will cause a conflict : * The internal structure of a datafile could be as follows. The file consists of 10 blocks. These are occupied by 3 tables. The blocks shown below are numbered using table_name.block_number |---------+---------+---------+---------+---------+---------+---------+---------+---------+---------| | | | | | | | | | | | | | 1.1 | 1.2 | 2.1 | 3.1 | 3.2 | 3.3 | 2.2 | 1.3 | 2.3 | | 3.4 | | | | | | | | | | | | | |---------+---------+---------+---------+---------+---------+---------+---------+---------+---------| * The first block on the datafile is the first block of table 1, second block is the second block of table 1, the third block is the first block of table 2 and so on.. (For simplicity sake, I am assuming Oracle will allocate space in blocks and not in extents) * Now assume Oracle requests the first block of table 1. Assume read ahead is set to three blocks (three blocks will be read instead of 2 blocks). In this case the SAN will read 2.1, 3.1,3.2. * The blocks 3.1 and 3.2 will be entirely useless as Oracle is never going to read it. SAN cannot tell that the block 2.2 that Oracle might possible request next is the 7th block in the datafile and so it can never *read ahead* intelligently. Why the buffer of SAN has very little impact w.r.t Oracle read performance? * Oracle has its own buffering for all IO types * DBWR reads and writes uses the DB Buffer Cache * LGWR uses the Log buffer * Db buffer Cache is managed by a LRU Algorithm (Touchcount from 9I). * Bulk of the IO done by Oracle is Logical IO (LIO) and not Physical IO (PIO). * Assume the buffer cache hit ratio is 80%. This means that only 20% of the IO calls are PIO. Only 20% of the calls ever hit the SAN's cache. Since this 20% is probably the least requested/never requested data (going by Oracle's LRU algorithm) , its quite likely that the SAN's buffers don't have this either. * Given that Oracle is going to cache even this 20% in its buffers, the next PIO call is going to be for something totally different ? which is not there in the SAN's buffer. * Couple this with the read-ahead (discussed earlier), Our SAN's buffer is now populated with lots of data that Oracle might never use a PIO to retrieve. * Thus the SAN's buffer can never really provide to Oracle the data it reads most ? Its already there in Oracle. To be fair, SAN's huge buffers will come as a boon to small databases ? where the entire database can be cached in the SAN's buffers. -- Please see the official ORACLE-L FAQ: http://www.orafaq.com -- Author: INET: [EMAIL PROTECTED] Fat City Network Services -- (858) 538-5051 FAX: (858) 538-5051 San Diego, California -- Public Internet access / Mailing Lists -------------------------------------------------------------------- To REMOVE yourself from this mailing list, send an E-Mail message to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in the message BODY, include a line containing: UNSUB ORACLE-L (or the name of mailing list you want to be removed from). You may also send the HELP command for other information (like subscribing).