On Fri, 11 May 2012 12:48:08 +0200 Edgar Fuß <e...@math.uni-bonn.de> wrote:
> > Edgar is describing the desideratum for a minimum-latency
> > application.
> Yes, I'm looking for minimum latency.
> I've logged the current file server's disc "business" and the only
> time they really are busy is during the nightly backup. I suppose
> that mainly consists of random reads traversing the directory tree
> followed by random reads transferring the modified files (which, I
> suppose, are typically small).
>
> Of course I finally will have to experiment, but a) I'm used to
> trying to understand the theory first and b) the rate of experiments
> will probably be limited to one per day or less given the amount of
> real data I would have to re-transfer each time.
>
> Since my understanding of the NetBSD I/O system may be incorrect,
> incomplete or whatever, I better try to ask people first who know
> better:
>
> Given 6.0/amd64 and a RAID 5 across 4+1 SAS discs.
> Suppose I have a 16k fsbsize and a stripe size such that there is one
> full fs block per disc (i.e. the stripe is 4*16k large). Suppose
> everything is correctly aligned (apart from where I say it isn't, of
> course).
>
> Suppose there is no disc I/O apart from that I describe.

Ok :)

> Scenario A:
> I have two processes each reading a single file system block. Those
> two blocks happen to end up on two different discs (there is a 3/4
> probability for that being true). Will I end up with those two discs
> actually seeking in parallel?

Yes. Absolutely.

> Scenario B:
> I have one process reading a 64k chunk that is 16k-aligned, but not
> 64k-aligned (so it's not just reading a full stripe). Will I end up
> in four discs seeking and reading in parallel?

Yes. Aligned or not, there will be 4 discs busy. (Think of it as
reading the last part of one stripe, and the first part of another.
Those are always aligned to be non-overlapping...)

> I.e. will this be
> degraded wrt. a stripe-aligned read?

No.

Consider the following layout:

 d0   d1   d2   d3   p0
 d5   d6   d7   p1   d4
 d10  d11  p2   d8   d9
 d15  p3   d12  d13  d14
 p4   d16  d17  d18  d19

where dn is 16K data block n of the RAID set and px is the parity
block for stripe x. (This is what the layout will be for your
configuration.) If you read 64K that is stripe aligned, then you
would maybe be reading d0-d3 or d4-d7 or d8-d11 or d12-d15 or
d16-d19. As you can see, all of those span 4 different discs. If you
read something that isn't stripe aligned (e.g. d1-d4 or d7-d10 or
d11-d14 or whatever) you also see that those IOs are evenly
distributed among the discs.

> Scenario C:
> I have one process doing something largely resulting in meta-data
> reads (i.e. traversing a very large directory tree). Will the kernel
> only issue sequential reads or will it be able to parallelise, e.g.
> reading indirect blocks?

I don't know the answer to this off the top of my head...

> What will change if I scale all sizes by a factor of, say, 4, such
> that the full stripe exceeds MAXPHYS?

RAIDframe won't be able to directly handle a request larger than
MAXPHYS, so a single MAXPHYS-sized request to RAIDframe will end up
accessing only a single component. The parallelism seen in A will
still be there, but not the parallelism in B.

> Sorry if the answers to those questions are obvious from what has
> already been written so far for someone more familiar with the matter
> than I am.

Later...

Greg Oster
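
For anyone who wants to check the layout argument above without
touching real discs, here is a minimal Python sketch. It is not taken
from the RAIDframe source. It assumes the rotation that can be read
off the layout table: data block n sits on disc n mod 5, and the
parity block of stripe x sits on disc (4 - x) mod 5. The function
names are made up for illustration, and the 64 KiB value used for
MAXPHYS is an assumption that should be checked against the kernel
in use.

```python
N_DISCS = 5                      # 4 data + 1 parity component, as in the thread
DATA_PER_STRIPE = N_DISCS - 1
KIB = 1024
MAXPHYS = 64 * KIB               # ASSUMED value, check your kernel's definition


def data_disc(n):
    """Disc holding data block n (assumption read off the layout table)."""
    return n % N_DISCS


def parity_disc(stripe):
    """Disc holding the parity block of a stripe (same assumption)."""
    return (N_DISCS - 1 - stripe) % N_DISCS


def layout_row(stripe):
    """Rebuild one row of the layout table as a list of block labels."""
    row = [None] * N_DISCS
    row[parity_disc(stripe)] = f"p{stripe}"
    first = stripe * DATA_PER_STRIPE
    for n in range(first, first + DATA_PER_STRIPE):
        row[data_disc(n)] = f"d{n}"
    return row


def discs_for_read(offset, length, stripe_unit):
    """Set of discs touched by a read of `length` bytes at byte `offset`."""
    first = offset // stripe_unit
    last = (offset + length - 1) // stripe_unit
    return {data_disc(n) for n in range(first, last + 1)}


# Print the table for comparison with the one in the mail.
for stripe in range(5):
    print(" ".join(f"{label:4}" for label in layout_row(stripe)))

# Scenario B: 16k stripe unit, 64k read at every 16k-aligned offset.
# Stripe aligned or not, four distinct discs are involved.
for start_block in range(20):
    touched = discs_for_read(start_block * 16 * KIB, 64 * KIB, 16 * KIB)
    assert len(touched) == 4

# Sizes scaled by 4: with a 64k stripe unit, one aligned MAXPHYS-sized
# request fits inside a single stripe unit, so only one disc is touched.
print(sorted(discs_for_read(0, MAXPHYS, 64 * KIB)))
```

The sketch only models where the data lives. It says nothing about
how the kernel queues or splits requests, so it illustrates the
geometry behind scenario B and the scaled case rather than measuring
latency.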