On Mon, Oct 26, 2015 at 09:14:00AM +0000, Duncan wrote:
> Dmitry Katsubo posted on Sun, 18 Oct 2015 11:44:08 +0200 as excerpted:
>
> >> Meanwhile, the present btrfs raid1 read-scheduler is both pretty simple
> >> to code up and pretty simple to arrange tests for that run either one
> >> side or the other, but not both, or that are well balanced to both.
> >> However, it's pretty poor in terms of ensuring optimized real-world
> >> deployment read-scheduling.
> >>
> >> What it does is simply this. Remember, btrfs raid1 is specifically two
> >> copies. It chooses which copy of the two will be read very simply,
> >> based on the PID making the request. Odd PIDs get assigned one copy,
> >> even PIDs the other. As I said, simple to code, great for ensuring
> >> testing of one copy or the other or both, but not really optimized at
> >> all for real-world usage.
> >>
> >> If your workload happens to be a bunch of all odd or all even PIDs,
> >> well, enjoy your testing-grade read-scheduler, bottlenecking everything
> >> reading one copy, while the other sits entirely idle.
> >
> > I think PID-based solution is not the best one. Why not simply take a
> > random device? Then at least all drives in the volume are equally loaded
> > (in average).
>
> Nobody argues that the even/odd-PID-based read-scheduling solution is
> /optimal/, in a production sense at least. But at the time and for the
> purpose it was written it was pretty good, arguably reasonably close to
> "best", because the implementation is at once simple and transparent for
> debugging purposes, and real easy to test either one side or the other,
> or both, and equally important, to duplicate the results of those tests,
> by simply arranging for the testing to have either all even or all odd
> PIDs, or both. And for ordinary use, it's good /enough/, as ordinarily,
> PIDs will be evenly distributed even/odd.
>
> In that context, your random device read-scheduling algorithm would be
> far worse, because while being reasonably simple, it's anything *but*
> easy to ensure reads go to only one side or equally to both, or for that
> matter, to duplicate the tests, because randomization, by definition
> does /not/ lend itself to duplication.
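
To make the trade-off in the quoted discussion concrete, here is a
minimal Python sketch of the two policies being compared. It is an
illustration only, not the btrfs code: the function names are made up
for this example, and mapping even PIDs to copy 0 (rather than copy 1)
is an arbitrary assumption, since the text only says that odd and even
PIDs get different copies.

```python
import random
from collections import Counter

NUM_COPIES = 2  # btrfs raid1 keeps exactly two copies, per the discussion


def pick_copy_by_pid(pid: int, num_copies: int = NUM_COPIES) -> int:
    """PID-parity policy: the same PID always reads the same copy.

    Assumption for this sketch: even PIDs map to copy 0, odd PIDs to copy 1.
    """
    return pid % num_copies


def pick_copy_random(rng: random.Random, num_copies: int = NUM_COPIES) -> int:
    """Random policy (the suggested alternative): pick a copy per read."""
    return rng.randrange(num_copies)


def load_per_copy(pids):
    """Count how many readers land on each copy under the PID-parity policy."""
    counts = Counter(pick_copy_by_pid(pid) for pid in pids)
    return [counts.get(copy, 0) for copy in range(NUM_COPIES)]


# Four even and four odd PIDs: the load splits evenly across both copies.
mixed_load = load_per_copy(range(1000, 1008))

# Eight even PIDs: every read hits one copy while the other sits idle.
all_even_load = load_per_copy(range(1000, 1016, 2))
```

The PID-parity choice is a pure function of the PID, so a test that
arranges for all-even or all-odd PIDs gets the same copy on every run.
The random choice depends on generator state, which is what makes it
harder to steer reads to one side or to repeat a test exactly.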

For what it's worth, David tried implementing round-robin (IIRC) some
time ago, and found that it performed *worse* than the pid-based
system. (It may have been random, but memory says it was round-robin).

   Hugo.

-- 
Hugo Mills             | Great films about cricket: The Umpire Strikes Back
hugo@... carfax.org.uk |
http://carfax.org.uk/  |
PGP: E2AB1DE4          |