I'm studying the behavior of the page cache vs. direct i/o in the kernel (so far up to 2.6.9-42.0.10 for CentOS 4.4, moving to 2.6.18 for CentOS 5 soon).
I have found that large I/O requests get broken up differently between direct i/o and cached i/o (not surprising, but it's the way they're broken up that puzzles me). While both direct and cached i/o break the i/o requests down to page sized chunks, these are subsequently merged back together until a limit is reached. For direct i/o on my SCSI disks, that limit is 88 segments, or 704 sectors (352k of a 512k block), so when direct i/os of 512k chunks are performed via dd, each 512k i/o usually gets broken into one 704 sector chunk and one 320 sector chunk. However, when this same kind of i/o is done through the page cache, the pages also get chunked together, and each 704 sector chunk limits the size of an i/o sent to the scheduler, but this time, the 704 sector chunks typically get merged into a 1408 sector chunk because the limiting factor here is a maximum sectors per i/o size of 2048. My question is: what is the relationship between the hardware segment limit (enforced in ll_new_segment and __bio_add_page) of 88 pages per segment and the hardware maximum sectors per i/o of 2048 (enforced by ll_back_merge, and also ll_front_merge, though I haven't seen that one happen yet)? Where do these limits come from and why don't they match? (The other subject of my study is why EMC NAS devices take at least 2x as long as SCSI "DAS" disks, and why paged i/o appears to be no more than 1/2 the speed of direct i/o on both types of disk, specifically for large i/os (>256k chunks.) Please reply with CC to me directly as I am still not on the list (but I will be soon.) Thanks. -- Mark Hull-Richter, Linux Kernel Engineer DATAllegro (www.datallegro.com) 85 Enterprise, Second Floor, Aliso Viejo, CA 92656 949-680-3082 - Office 949-330-7691 - fax - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/

