On 02.10.2010 14:44, Florian Philipp wrote:
> On 02.10.2010 14:11, Volker Armin Hemmann wrote:
>> On Saturday 02 October 2010, Florian Philipp wrote:
> [...]
>>>
>>> Assumptions:
>>>
>>> 1. Seek time is constant. For HDDs we can take an average value. Of
>>> course this doesn't work for tapes. They have a seek time which
>>> increases linearly with the distance between the fragments.
>>
>> I think you misunderstood my remark.
>>
>> Tapes try to stream. Take an old DLT drive with 5-10 MB/s streaming
>> speed. Slow, isn't it?
>>
>> But when you do a backup on such an old tape, even with a modern hard
>> disk you have problems keeping it streaming. As soon as you hit a
>> directory with many small files - like ~/Mail or /usr/portage - you are
>> screwed.
>>
>> Yes, you have a wonderful 100 MB/s when you read a big, fat file. Or a
>> single small file. But when you have hundreds, thousands or hundreds of
>> thousands of small files, hard disks suck. And your tape drive has to
>> stop and rewind every couple of seconds because your hard disks were
>> not able to keep up the required 10 MB/s. Truly pathetic.
>
> Well, that's exactly what my little math shows. When you read 4 kB
> files, you can end up with 0.0065 * 50 MB/s = 0.32 MB/s effective
> throughput (worst case).
>
>>
>> Besides, seek times are not constant ;)
>>
>
> Sure they aren't. That's why it is stated as an assumption. It is just
> a model. Like every model, it has its limits. [1] It doesn't take into
> account prefetching, caching and NCQ/TCQ, for example.
>
> Still, it is a valid assumption: *on average*, the read/write head has
> to move across half the radius of the platter to reach its next
> position, and it has to wait for half a rotation until the right block
> is under the head. If we assume that fragments are uniformly
> distributed over the whole disk, we can simply take an average value
> for seek times.
>
> The model also doesn't take into account that even with no
> fragmentation, there might be some seek operations: blocks on an HDD
> are organized in concentric rings (tracks), not in a spiral like the
> sound track on a good old LP. That means that at some point, the r/w
> head has to switch to the next track when the file does not fit on one
> track alone.
>
> [1] A bit off-topic: I work in applied sciences and engineering. There
> I've learned two basic rules about models: 1. Truth doesn't matter,
> usefulness does. 2. Every model has its limits. Knowing these limits is
> the single most important thing when using a model.
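To make that worst-case figure easy to reproduce, here is a quick Python
sketch of the model. The 12.3 ms seek time and 50 MB/s transfer rate are
assumed values chosen to match the numbers quoted above, not measurements:

# Model from the discussion above: every file (or fragment) costs one
# average seek before its data streams at the sequential transfer rate.
# Both constants are assumptions picked to reproduce the 0.32 MB/s case.

SEEK_S = 12.3e-3   # assumed average seek + rotational latency, seconds
RATE_BPS = 50e6    # assumed sequential transfer rate, bytes/second

def effective_throughput(file_size_bytes):
    """Bytes/s when reading many files of the given size, one seek each."""
    transfer_time = file_size_bytes / RATE_BPS
    return file_size_bytes / (SEEK_S + transfer_time)

for size in (4e3, 64e3, 1e6, 10e6):
    print(f"{size / 1e3:8.0f} kB files -> "
          f"{effective_throughput(size) / 1e6:5.2f} MB/s effective")

# 4 kB files come out at ~0.32 MB/s; only around 10 MB per file does the
# disk get anywhere near its sequential rate.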
Hmm, I've just looked up some specs on this page:
http://www.wdc.com/en/products/products.asp?driveid=512

They make me wonder a bit: the average latency is 5.5 ms. That's the time
the disk needs for half a rotation. However, the read seek time is 12 ms,
which is a bit more than one full rotation. If that is an average value
rather than a maximum, it makes me wonder what takes so long. It can't be
rotational delay (that's their average latency), so it must be the time it
takes the r/w head to move into position. I find it hard to believe that
this takes so long. If that really were the limiting factor, a high-end
server disk couldn't get much faster simply by increasing the RPM.

Well, I guess their "seek time" is the maximum value. Then it makes sense:
one rotation is 11.111 ms, and on top of that they might add some latency
for data processing etc.

Any different thoughts?

Another interesting bit of information: the track-to-track seek time is
2 ms. That's the seek time I mentioned above which also occurs for
sequential reads/writes.

Regards,
Florian Philipp
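P.S.: Here is the arithmetic behind that guess as a quick Python check.
Note that the 5400 RPM figure is inferred from the quoted 5.5 ms average
latency; it is not stated on the product page:

# Sanity check: average rotational latency is half a revolution, so the
# quoted 5.5 ms pins down the spindle speed. The RPM below is inferred
# from that figure, not taken from the datasheet.

def rotation_period_ms(rpm):
    """Time for one full revolution in milliseconds."""
    return 60_000 / rpm

rpm = 5400                       # inferred from the 5.5 ms average latency
full = rotation_period_ms(rpm)   # ~11.111 ms per revolution
half = full / 2                  # ~5.556 ms average rotational latency
print(f"{rpm} RPM: {full:.3f} ms/rev, {half:.3f} ms average latency")

# 5400 RPM: 11.111 ms/rev, 5.556 ms average latency
# => a 12 ms "seek time" is indeed slightly more than one revolution.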