Anton B. Rang wrote:
>> There are many different ways to place the data on the media and we
>> would typically strive for a diverse stochastic spread.
>
> Err ... why?
>
> A random distribution makes reasonable sense if you assume that future
> read requests are independent, or that they are dependent in
> unpredictable ways. Now, if you've got sufficient I/O streams, you could
> argue that requests *are* independent, but in many other cases they are
> not, and they're usually predictable (particularly after a startup
> period). Optimizing for the predicted access cases makes sense.
> (Optimizing for observed access may make sense in some cases as well.)
For modern disks, media bandwidths are now getting to be > 100 MBytes/s.
If you need 500 MBytes/s of sequential read, you'll never get it from one
disk; at roughly 100 MBytes/s each, that takes at least five disks working
in parallel. You can get it from multiple disks, so the questions are:

1. How do you avoid other bottlenecks, such as a shared fibre channel
   path? Diversity.

2. How do you predict the data layout such that you can guarantee a wide
   spread? This is much more difficult. But you can use a random
   distribution to reduce the probability (stochastically) that you'll be
   reading all of the blocks from one disk. There are pathological cases,
   especially for block-aligned data, but those tend to be rather easy to
   identify when you look at the performance data.

 -- richard
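[A back-of-the-envelope sketch of the stochastic argument, in Python. The
8-disk pool, 64-block read, and uniform independent placement are
illustrative assumptions for the probability math, not how ZFS actually
allocates blocks:]

    import random

    DISKS = 8            # disks in the pool (assumed for illustration)
    PER_DISK_MBS = 100   # per-disk media bandwidth, per the post
    BLOCKS = 64          # consecutive logical blocks in one sequential read
    TRIALS = 10000

    worst = DISKS
    total = 0
    for _ in range(TRIALS):
        # Random placement: each logical block lands on a uniformly
        # chosen disk, independent of the others.
        placement = [random.randrange(DISKS) for _ in range(BLOCKS)]
        spread = len(set(placement))   # distinct disks servicing this read
        total += spread
        worst = min(worst, spread)

    avg = total / TRIALS
    print(f"average spread: {avg:.2f} of {DISKS} disks "
          f"(~{avg * PER_DISK_MBS:.0f} MBytes/s aggregate)")
    print(f"worst observed spread: {worst} disks")

    # Analytic expectation for uniform placement: n * (1 - (1 - 1/n)**k)
    expected = DISKS * (1 - (1 - 1 / DISKS) ** BLOCKS)
    print(f"analytic expectation: {expected:.2f} disks")

    # Probability that ALL k blocks land on a single disk (the feared case)
    p_one_disk = DISKS * (1 / DISKS) ** BLOCKS
    print(f"P(all {BLOCKS} blocks on one disk): {p_one_disk:.3e}")

With those numbers the expected spread comes out near all 8 disks, so a
500 MBytes/s sequential read is comfortably covered, while the
all-blocks-on-one-disk case is vanishingly improbable. The block-aligned
pathologies Richard mentions arise precisely when placement stops being
independent, which is why they show up clearly in performance data.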