On Mon, Oct 26, 2015 at 09:14:00AM +0000, Duncan wrote:
> Dmitry Katsubo posted on Sun, 18 Oct 2015 11:44:08 +0200 as excerpted:
> 
> >> Meanwhile, the present btrfs raid1 read-scheduler is both pretty simple
> >> to code up and pretty simple to arrange tests for that run either one
> >> side or the other, but not both, or that are well balanced to both.
> >> However, it's pretty poor in terms of ensuring optimized real-world
> >> deployment read-scheduling.
> >> 
> >> What it does is simply this.  Remember, btrfs raid1 is specifically two
> >> copies.  It chooses which copy of the two will be read very simply,
> >> based on the PID making the request.  Odd PIDs get assigned one copy,
> >> even PIDs the other.  As I said, simple to code, great for ensuring
> >> testing of one copy or the other or both, but not really optimized at
> >> all for real-world usage.
> >> 
> >> If your workload happens to be a bunch of all odd or all even PIDs,
> >> well, enjoy your testing-grade read-scheduler, bottlenecking everything
> >> reading one copy, while the other sits entirely idle.
> > 
> > > I think the PID-based solution is not the best one. Why not simply take a
> > > random device? Then at least all drives in the volume are equally loaded
> > > (on average).
> 
> Nobody argues that the even/odd-PID-based read-scheduling solution is 
> /optimal/, in a production sense at least.  But at the time and for the 
> purpose it was written it was pretty good, arguably reasonably close to 
> "best", because the implementation is at once simple and transparent for 
> debugging purposes, and real easy to test either one side or the other, 
> or both, and equally important, to duplicate the results of those tests, 
> by simply arranging for the testing to have either all even or all odd 
> PIDs, or both.  And for ordinary use, it's good /enough/, as ordinarily, 
> PIDs will be evenly distributed even/odd.
> 
> In that context, your random device read-scheduling algorithm would be 
> far worse, because while being reasonably simple, it's anything *but* 
> easy to ensure reads go to only one side or equally to both, or for that 
> matter, to duplicate the tests, because randomization, by definition 
> does /not/ lend itself to duplication.

   For what it's worth, David tried implementing round-robin (IIRC)
some time ago, and found that it performed *worse* than the PID-based
system. (It may have been random, but memory says it was round-robin).
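
   To make the three policies in this thread concrete, here is a toy
model in Python. It is only a sketch of the selection rules as described
above (PID parity, round-robin, random), not the kernel code; every
function name in it is made up for illustration. It only counts which
copy each read would be sent to, so it says nothing about throughput or
about why one policy might perform worse than another on real disks.

```python
import itertools
import random
from collections import Counter

NUM_COPIES = 2  # btrfs raid1 keeps exactly two copies of each block


def pick_by_pid(pid, num_copies=NUM_COPIES):
    """PID-parity policy: even PIDs read copy 0, odd PIDs read copy 1."""
    return pid % num_copies


def make_round_robin(num_copies=NUM_COPIES):
    """Return a picker that alternates copies on every request (hypothetical)."""
    counter = itertools.count()

    def pick(_pid):
        # The PID is ignored; only the request order matters.
        return next(counter) % num_copies

    return pick


def make_random(seed=None, num_copies=NUM_COPIES):
    """Return a picker that chooses a copy at random (hypothetical)."""
    rng = random.Random(seed)

    def pick(_pid):
        # The PID is ignored; results are repeatable only with a fixed seed.
        return rng.randrange(num_copies)

    return pick


def load_per_copy(pick, pids, num_copies=NUM_COPIES):
    """Count how many reads land on each copy, one read per PID in `pids`."""
    counts = Counter(pick(pid) for pid in pids)
    return [counts.get(i, 0) for i in range(num_copies)]


if __name__ == "__main__":
    all_even = [1000, 1002, 1004, 1006]
    mixed = [1000, 1001, 1002, 1003]

    print("pid, all even PIDs:", load_per_copy(pick_by_pid, all_even))  # [4, 0]
    print("pid, mixed PIDs:   ", load_per_copy(pick_by_pid, mixed))     # [2, 2]
    print("round-robin, even: ", load_per_copy(make_round_robin(), all_even))  # [2, 2]
    print("random, even:      ", load_per_copy(make_random(), all_even))
```

   The all-even case shows the bottleneck described above: every read
goes to one copy while the other sits idle. It also shows why the
PID-parity rule is easy to test against, since the same set of PIDs
always gives the same split, whereas the random picker does not unless
you pin the seed.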

   Hugo.

-- 
Hugo Mills             | Great films about cricket: The Umpire Strikes Back
hugo@... carfax.org.uk |
http://carfax.org.uk/  |
PGP: E2AB1DE4          |
