> From: Richard Elling [mailto:rich...@nexenta.com]
>
> > Regardless of multithreading, multiprocessing, it's absolutely
> > possible to have contiguous files, and/or file fragmentation. That's
> > not a characteristic which depends on the threading model.
>
> Possible, yes. Probable, no. Consider that a file system is allocating
> space for multiple, concurrent file writers.
Process A is writing. Suppose it starts writing at block 10,000 out of my
1,000,000 block device. Process B is also writing. Suppose it starts writing
at block 50,000. These two processes write simultaneously, and no
fragmentation occurs unless Process A writes more than 40,000 blocks. In that
case, A's file gets fragmented, and the 2nd fragment might begin at block
300,000.

The thing which causes fragmentation (not counting COW) is the size of the
span of unallocated blocks. Most filesystems will allocate blocks from the
largest unallocated contiguous area of the physical device, so as to minimize
fragmentation. I can't say authoritatively how ZFS behaves, but I'd be
extremely surprised if two processes writing different files as fast as
possible ended up with all their blocks interleaved with each other on the
physical disk. I think that's possible if you have multiple processes lazily
writing at less than full speed, because then ZFS might remap a bunch of
small writes into a single contiguous write.

> > Also regardless of raid, it's possible to have contiguous or
> > fragmented files. The same concept applies to multiple disks.
>
> RAID works against the efforts to gain performance by contiguous access
> because the access becomes non-contiguous.

These might as well have been words randomly selected from the dictionary to
me - I recognize that it's a complete sentence, but you might as well have
said "processors aren't needed in computers anymore," or something equally
illogical.

Suppose you have a 3-disk raid stripe set, using traditional simple striping,
because it's very easy to explain. Suppose a process is writing as fast as it
can, and suppose it's going to write block 0 through block 99 of a virtual
device.

virtual block 0 = block 0 of disk 0
virtual block 1 = block 0 of disk 1
virtual block 2 = block 0 of disk 2
virtual block 3 = block 1 of disk 0
virtual block 4 = block 1 of disk 1
virtual block 5 = block 1 of disk 2
virtual block 6 = block 2 of disk 0
virtual block 7 = block 2 of disk 1
virtual block 8 = block 2 of disk 2
virtual block 9 = block 3 of disk 0
...
virtual block 96 = block 32 of disk 0
virtual block 97 = block 32 of disk 1
virtual block 98 = block 32 of disk 2
virtual block 99 = block 33 of disk 0

Thanks to buffering and command queueing, the OS tells the RAID controller to
write blocks 0-8, and the RAID controller tells disk 0 to write blocks 0-2,
tells disk 1 to write blocks 0-2, and tells disk 2 to write blocks 0-2,
simultaneously. So the total throughput is the sum of all 3 disks writing
continuously and contiguously to sequential blocks. This accelerates
performance for continuous sequential writes. It does not "work against
efforts to gain performance by contiguous access." The same concept is true
for raid-5 or raidz, but it's more complicated. The filesystem or raid
controller does in fact know how to write sequential filesystem blocks to
sequential physical blocks on the physical devices, for the sake of
performance enhancement on contiguous reads and writes.

If you don't believe me, there's a very easy test to prove it: Create a zpool
with 1 disk in it, and time writing 100G (or some amount of data much larger
than RAM). Then create a zpool with several disks in a raidz set, and time
writing 100G. The speed scales up linearly with the number of disks, until
you reach some other hardware bottleneck, such as bus speed or something like
that.
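For anyone who wants to play with the arithmetic, here's a tiny python sketch
of the simple-striping mapping in the table above. It's only an illustration
of the modulo arithmetic, not how ZFS or a real RAID controller actually lays
things out; the 3-disk count and the block numbers are just the example
numbers from this thread.

  # Sketch of simple striping: virtual block N lands on disk (N % ndisks)
  # at offset (N // ndisks).

  def stripe_map(virtual_block, ndisks=3):
      """Map a virtual block number to (disk index, block offset on disk)."""
      return virtual_block % ndisks, virtual_block // ndisks

  if __name__ == "__main__":
      # Reproduces the mapping table above.
      for vb in (0, 1, 2, 3, 4, 5, 96, 97, 98, 99):
          disk, offset = stripe_map(vb)
          print(f"virtual block {vb:3d} = block {offset} of disk {disk}")

      # A sequential run of virtual blocks 0-8 becomes three contiguous
      # runs (blocks 0-2) on each of the three disks, which the disks can
      # write simultaneously -- that's the throughput argument above.
      writes = {}
      for vb in range(9):
          disk, offset = stripe_map(vb)
          writes.setdefault(disk, []).append(offset)
      print(writes)  # {0: [0, 1, 2], 1: [0, 1, 2], 2: [0, 1, 2]}

The point of the second loop is that each disk sees a purely sequential,
contiguous run of its own blocks, so sequential access stays sequential on
every member of the stripe.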