Hi Eberhard,
Le 08.09.2010 à 22:00, Eberhard Moenkeberg a écrit :
> Can mirrorbrain test latencies?
Not properly. The check that runs every minute doesn't pay much attention to the
response time; it only assesses whether a mirror responded within a fixed
timeout (usually 20s). However, it would be straightforward to extend the script
to do more with the response time, like saving a history, plotting it, or
pulling some trigger when the response time suddenly grows.
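Just to sketch the idea (nothing of this exists in MirrorBrain yet; the function
name, URL handling and log path below are made up for illustration): a probe
could time a small GET request and append the result to a per-mirror history
file, which could later be plotted or compared against a threshold:

  import time
  import urllib2

  def probe_latency(url, timeout=20, history='/var/tmp/mirror_latency.log'):
      """Time one request to 'url' and append the latency to a history file."""
      start = time.time()
      try:
          urllib2.urlopen(url, timeout=timeout)
          elapsed = time.time() - start
      except Exception:
          elapsed = None  # mirror did not answer within the timeout
      log = open(history, 'a')
      log.write('%d %s %s\n' % (time.time(), url, elapsed))
      log.close()
      return elapsed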
I never made a directed effort to assess latencies, other than experimenting
subjectively. My general impression is that while latency is easy to measure,
the interpretation could be difficult. The distance to the tested mirror (from
the point where the test is run) plays a role, and the results would need to be
related to it. Or one might measure the latency relative to the "usual"
(previously observed) latency, but that again depends on other factors. I don't
have an easy recipe, although I like the idea.
> For the efficiency of the cache effect, it would be good if mirrorbrain could
> for each file request prefer a server to whom this request recently already
> got redirected. Is that possible?
It could be done. Christoph Thiel, my predecessor and former colleague at SUSE,
suggested it some years ago as well. The idea was to make better use of the
mirrors' buffer caches.
I brainstormed a little bit and came up with the following conclusions.
It seems definitely possible, and if I'm right, actually feasible.
It is, in principle, an old problem: optimizing the utilization of caches that
are located between users and servers. The goal is to distribute requests evenly
into "buckets", close both to the users and to the servers, ideally in a stable
fashion, so that if one bucket is added, or one drops out, not all caches are
flushed by a redistribution.
Characteristically, the classic problem involves a large number of cached
objects (say, web pages) compared to a relatively small number of caches
(intermediaries). In today's mirroring, we often have a few particular files
(the hot spots) that are comparably large but few in number (say, 2 OOo files,
or 5 Linux ISO images), and 1-10 mirrors per country. However, the most obvious
approach, a simple hash function that distributes the requests, won't work here.
Why not? I'll demonstrate below.
First, there are conditions that must be met.
1) The mirrors considered must be in *equal proximity* to the user. It would
obviously not make much sense to send all requests for DVD #1 to national
mirrors while DVD #2 is served from another continent.
2) The mirrors considered must have *equal capacity*. If there are two mirrors
nearby, but one of them has 10x the bandwidth, the requests can't be split
fifty-fifty. If there are two DVDs to be served, it wouldn't be good to serve
one from a strong mirror and one from a much weaker one.
3) If all the users run after just *one* file, the whole idea doesn't apply.
4) The files to be distributed intelligently must be similar in size and
popularity. It would not work to send all requests for DVDs to mirror A and all
requests for the release notes to mirror B...
Fifth, it is worth asking how many mirrors are really disk-bound in their
operation, and how many are rather bandwidth-bound. Maybe the situation of some
of the largest mirrors, whose bottleneck is disk throughput, is not
representative for everyone. There are mirrors, in India for instance, where
sheer bandwidth seems to be the biggest bottleneck. So the first question is:
what percentage of mirrors would benefit from better buffer cache utilization
at all? Is it worth putting time into this?
Implications and limitations:
- Currently, mirror selection tries to alternate between available mirrors;
this means users have a certain chance to work around a broken or slow mirror
simply by repeating their request. The technique proposed here would remove
that: mirror selection would be stuck to one mirror, or to a smaller set of
mirrors, and users would have no chance to get to a better mirror with the next
request.
- Assuming the method works as designed, there can still be requests that
"disturb" the cache: people accessing the mirror directly, and "local" requests
that are currently rightfully routed to a certain mirror because they originate
from the network or the autonomous system of that mirror. Depending on their
frequency, such requests might kill the whole effort. I am not sure about their
effect; it probably depends on their absolute frequency.
+ Even if the attempt to split up the requests achieves only a small effect,
some mirrors will benefit very much from it.
+ Byterange requests (partial GETs), which can reach mirrors in random order,
become less of a problem, because it is obviously better for mirrors if they
are limited to fewer files.
+ Intermediary proxy caches also benefit from the technique, automatically, and
for free.
Now, how to go about it?
* The technique should not require too much manual tweaking. Ideally, none. A
need to configure manually which files are redirected to which mirror would not
be very usable.
* Obviously, mirrors that are considered for splitting requests up between them
need to be "grouped": Mirrors of similar location (same country) and similar
capacity (same priority) can be put together into a group. Inside such an
entity, there is room for the "fanciness".
* A classic mechanism to distribute requests would be to calculate a quick hash
of the URL or filename, and take the modulo (remainder) of a division by the
number of eligible mirrors. However, this alone results in an uneven
distribution among mirrors, up to the extreme that if there is only one really
popular file, all requests are sent to a single mirror instead of to all
mirrors. Or, considering 2 popular files and 7 mirrors, only 2 mirrors would
get requests. Thus, the usual "hash % number of buckets" algorithm is not
suitable. (It would have been easy to just calculate "inode % # mirrors"
and be done...)
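A quick illustration of that (just a throwaway sketch; the file names are
invented): with 2 popular files and 7 mirrors, "hash % 7" pins each file to
exactly one mirror, so at most 2 of the 7 mirrors ever see requests for them:

  import hashlib

  mirrors = 7
  hot_files = ['distro-dvd1.iso', 'distro-dvd2.iso']

  for name in hot_files:
      h = int(hashlib.md5(name).hexdigest(), 16)
      # every request for 'name' always lands on this one mirror
      print '%s -> mirror%d' % (name, h % mirrors)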
I think it is illustrative to look at the different constellations that need to
be handled. In any case, there is a number of files (for which requests are to
be split up) and a number of mirrors (that fall into the same group of
geo-proximity and capacity):
- With only 1 file, the exercise is pointless.
- With only 1 mirror, the exercise is pointless as well.
- With 2 files, and an even number of mirrors, a perfect distribution is
possible.
- With 2 files, and an odd number of mirrors, the requests can be split between
n-1 mirrors, leaving one mirror unconsidered, which should receive a fair share
of requests on all files.
- With 3 files it gets a little more complicated:
With 2 mirrors, about a third of the buffer cache can be saved by splitting 2
files between them and sending the 3rd file to both mirrors.
With 3 mirrors, we have the ideal case where every mirror gets requests for one
file.
With 4 mirrors, we need to ensure that mirror #4 does not get 3 times as many
requests as the other mirrors. 1/4 (one fourth) of the requests needs to go to
this "remaining" mirror.
With 5 mirrors, it's principally the same as with 4 mirrors: split requests
between 3 of them, and give the remaining two a fair share of the requests (two
fifths = 40%).
With 6 mirrors, we have an "ideal case" again.
With 7 mirrors, ... you get the pattern by now:
7 mirrors % 3 files = 1
which can be transcribed by stating that 1 of the 7 mirrors needs to get 1/7 of
the requests, and the rest of the requests is distributed round-robin to the
other 6 mirrors.
Stable round-robin distribution to a "fitting" number of mirrors is
straightforward if the mirrors are ordered by some internal id that doesn't
change.
So to cast that into pseudo code, the mirror selection could be implemented
like this:
nF = number of files
nM = number of mirrors
f  = # of the file in question (integer between 0 and nF-1)
selected = # of the selected mirror (integer between 0 and nM-1)
selected = int( rand(0..1) * (nM/nF) + f * (nM/nF) )
This simple and lightweight computation does the job. It is not even necessary
to take separate care of the remaining mirrors (nR = nM % nF), because they are
dealt with automatically.
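To make the formula concrete with 3 files and 7 mirrors: nM/nF is 7/3 ≈ 2.33,
so file 0 maps to the interval [0, 2.33), i.e. mirrors 0-2; file 1 maps to
[2.33, 4.67), i.e. mirrors 2-4; and file 2 maps to [4.67, 7), i.e. mirrors 4-6.
The boundary mirrors 2 and 4 serve two files each, but every mirror receives
1/7 of the total requests, which is exactly what the "--files 3 --mirrors 7"
run further below shows.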
To see whether this pencil-and-paper exercise is workable, I wrote a little
script. It is attached to this mail. It runs random "requests" to a defined
number of files and uses the above algorithm to assign them to a defined number
of mirrors. It appears to work wonderfully to me.
Below, you see the output of the attached test script with various parameters -
let's start with 1 file. The output is as expected. (Sorry, no colourful
diagrams ;-)
% ./ra.py --files 1 --mirrors 1
100000 requests (100%) to mirror0 for file0
% ./ra.py --files 1 --mirrors 2
50095 requests (50%) to mirror0 for file0
49905 requests (49%) to mirror1 for file0
% ./ra.py --files 1 --mirrors 3
33459 requests (33%) to mirror0 for file0
33170 requests (33%) to mirror1 for file0
33371 requests (33%) to mirror2 for file0
With 2 files:
% ./ra.py --files 2 --mirrors 2
50237 requests (50%) to mirror0 for file0
49763 requests (49%) to mirror1 for file1
% ./ra.py --files 2 --mirrors 3
33329 requests (33%) to mirror0 for file0
33381 requests (33%) to mirror1 for file1, file0
33290 requests (33%) to mirror2 for file1
% ./ra.py --files 2 --mirrors 4
25081 requests (25%) to mirror0 for file0
24968 requests (24%) to mirror1 for file0
25054 requests (25%) to mirror2 for file1
24897 requests (24%) to mirror3 for file1
% ./ra.py --files 2 --mirrors 5
19851 requests (19%) to mirror0 for file0
20040 requests (20%) to mirror1 for file0
20008 requests (20%) to mirror2 for file1, file0
20019 requests (20%) to mirror3 for file1
20082 requests (20%) to mirror4 for file1
% ./ra.py --files 2 --mirrors 6
16645 requests (16%) to mirror0 for file0
16808 requests (16%) to mirror1 for file0
16804 requests (16%) to mirror2 for file0
16759 requests (16%) to mirror3 for file1
16554 requests (16%) to mirror4 for file1
16430 requests (16%) to mirror5 for file1
Perfect!
Now, how does it cope with 3 files?
% ./ra.py --files 3 --mirrors 1
100000 requests (100%) to mirror0 for file2, file1, file0
% ./ra.py --files 3 --mirrors 2
49806 requests (49%) to mirror0 for file1, file0
50194 requests (50%) to mirror1 for file2, file1
% ./ra.py --files 3 --mirrors 3
33284 requests (33%) to mirror0 for file0
33323 requests (33%) to mirror1 for file1
33393 requests (33%) to mirror2 for file2
% ./ra.py --files 3 --mirrors 4
25095 requests (25%) to mirror0 for file0
25024 requests (25%) to mirror1 for file1, file0
24912 requests (24%) to mirror2 for file2, file1
24969 requests (24%) to mirror3 for file2
% ./ra.py --files 3 --mirrors 5
20026 requests (20%) to mirror0 for file0
20102 requests (20%) to mirror1 for file1, file0
19871 requests (19%) to mirror2 for file1
19991 requests (19%) to mirror3 for file2, file1
20010 requests (20%) to mirror4 for file2
% ./ra.py --files 3 --mirrors 6
16613 requests (16%) to mirror0 for file0
16626 requests (16%) to mirror1 for file0
16598 requests (16%) to mirror2 for file1
16876 requests (16%) to mirror3 for file1
16743 requests (16%) to mirror4 for file2
16544 requests (16%) to mirror5 for file2
% ./ra.py --files 3 --mirrors 7
14251 requests (14%) to mirror0 for file0
14125 requests (14%) to mirror1 for file0
14257 requests (14%) to mirror2 for file1, file0
14447 requests (14%) to mirror3 for file1
14308 requests (14%) to mirror4 for file2, file1
14252 requests (14%) to mirror5 for file2
14360 requests (14%) to mirror6 for file2
Fine. This is as good as it can get, and it saves buffer cache on all mirrors.
And with 4 files?
% ./ra.py --files 4 --mirrors 1
100000 requests (100%) to mirror0 for file3, file2, file1, file0
% ./ra.py --files 4 --mirrors 2
49845 requests (49%) to mirror0 for file1, file0
50155 requests (50%) to mirror1 for file3, file2
% ./ra.py --files 4 --mirrors 3
33213 requests (33%) to mirror0 for file1, file0
33212 requests (33%) to mirror1 for file2, file1
33575 requests (33%) to mirror2 for file3, file2
% ./ra.py --files 4 --mirrors 4
24978 requests (24%) to mirror0 for file0
25034 requests (25%) to mirror1 for file1
25137 requests (25%) to mirror2 for file2
24851 requests (24%) to mirror3 for file3
% ./ra.py --files 4 --mirrors 5
20206 requests (20%) to mirror0 for file0
19819 requests (19%) to mirror1 for file1, file0
20070 requests (20%) to mirror2 for file2, file1
20103 requests (20%) to mirror3 for file3, file2
19802 requests (19%) to mirror4 for file3
% ./ra.py --files 4 --mirrors 6
16518 requests (16%) to mirror0 for file0
16533 requests (16%) to mirror1 for file1, file0
16755 requests (16%) to mirror2 for file1
16761 requests (16%) to mirror3 for file2
16602 requests (16%) to mirror4 for file3, file2
16831 requests (16%) to mirror5 for file3
Excellent!
And with 5 files (is anyone still following this far?):
% ./ra.py --files 5 --mirrors 1
100000 requests (100%) to mirror0 for file3, file2, file1, file0, file4
% ./ra.py --files 5 --mirrors 2
50059 requests (50%) to mirror0 for file2, file1, file0
49941 requests (49%) to mirror1 for file3, file2, file4
% ./ra.py --files 5 --mirrors 3
33404 requests (33%) to mirror0 for file1, file0
33249 requests (33%) to mirror1 for file3, file2, file1
33347 requests (33%) to mirror2 for file3, file4
% ./ra.py --files 5 --mirrors 4
24904 requests (24%) to mirror0 for file1, file0
25022 requests (25%) to mirror1 for file2, file1
25109 requests (25%) to mirror2 for file3, file2
24965 requests (24%) to mirror3 for file3, file4
% ./ra.py --files 5 --mirrors 5
20192 requests (20%) to mirror0 for file0
19836 requests (19%) to mirror1 for file1
20202 requests (20%) to mirror2 for file2
19869 requests (19%) to mirror3 for file3
19901 requests (19%) to mirror4 for file4
% ./ra.py --files 5 --mirrors 6
16880 requests (16%) to mirror0 for file0
16654 requests (16%) to mirror1 for file1, file0
16731 requests (16%) to mirror2 for file2, file1
16593 requests (16%) to mirror3 for file3, file2
16592 requests (16%) to mirror4 for file3, file4
16550 requests (16%) to mirror5 for file4
Fantastic.
To my surprise, it seems to work beautifully even for larger numbers of files!
Or am I overshooting now? See:
% ./ra.py --files 100 --mirrors 5
20083 requests (20%) to mirror0 for file3, file12, file1, file10, file7,
file15, file5, file4, file8, file11, file9, file18, file16, file0, file6,
file13, file17, file2, file19, file14
19777 requests (19%) to mirror1 for file32, file23, file27, file35, file36,
file34, file39, file38, file28, file29, file24, file31, file22, file30, file20,
file21, file26, file33, file37, file25
19823 requests (19%) to mirror2 for file57, file59, file56, file47, file40,
file41, file42, file43, file44, file45, file46, file58, file48, file49, file55,
file54, file53, file52, file51, file50
20059 requests (20%) to mirror3 for file75, file63, file73, file67, file65,
file61, file64, file66, file74, file77, file76, file62, file70, file60, file72,
file71, file79, file78, file68, file69
20258 requests (20%) to mirror4 for file87, file94, file80, file82, file96,
file93, file99, file98, file88, file89, file86, file92, file84, file85, file91,
file90, file97, file81, file95, file83
It seems fair and efficient.
When mirrors drop out or are added, the distribution changes -- it is not
stable against such changes, but I suppose we can live with that. (After all,
we live with complete entropy at present!)
This all works only under a further condition: files that should be treated
this way need to be indexed in some way. Maybe one would want to apply the
special handling only to certain "hotspots" in the tree, like freshly released
DVDs of a new distro. Practically, these files could be (automatically)
assigned an integer index during configuration. That would be the missing
number "f" in the formula. Or it could be the n-th file inside a directory.
That's an implementation-specific detail, I guess. I don't know about other
redirectors, but for MirrorBrain I have a rough idea how to solve it in
an easy way.
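To sketch just one possibility (hypothetical, not how anything is implemented
today): the index f could be the position of the file within the sorted list of
"hotspot" files of a directory, so that every server derives the same f without
extra configuration:

  def file_index(hotspot_files, requested):
      """Return the index f of 'requested' among the sorted hotspot files,
      or None if the file gets no special treatment."""
      ordered = sorted(hotspot_files)
      if requested not in ordered:
          return None
      return ordered.index(requested)

  # e.g. with ['dvd-i586.iso', 'dvd-x86_64.iso'], every server computes
  # f=0 for the i586 image and f=1 for the x86_64 image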
Tagging only certain "hotspot" files for this extra treatment would allow
sticking to a simpler mechanism for the mass of other files. Because, not to
forget, for the buffer-cache-friendly request splitting the mirrors need to be
grouped in a sensible way. That is not a big task in itself, but the little
things add up. In the picture that forms in my mind, though, I would expect the
code to be not very costly.
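To give an idea of the grouping I have in mind (again only a sketch; the
'country', 'priority' and 'id' fields are placeholders for whatever the mirror
records actually contain): mirrors sharing the same country and the same
priority class would form one group, and the formula above would be applied
within each group:

  def group_mirrors(mirrors):
      """Group mirrors by (country, priority); the request splitting is then
      done separately within each group."""
      groups = {}
      for m in mirrors:
          key = (m['country'], m['priority'])
          groups.setdefault(key, []).append(m)
      # order mirrors inside a group by an id that doesn't change,
      # so the file-to-mirror mapping stays stable
      for key in groups:
          groups[key].sort(key=lambda m: m['id'])
      return groups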
Peter
#!/usr/bin/env python

import sys
import random
from optparse import OptionParser


def main():
    usage = "usage: %prog --files N --mirrors N [--samples N]"
    parser = OptionParser(usage=usage)
    parser.add_option("-f", "--files", dest="nf",
                      help="number of files", metavar="N")
    parser.add_option("-m", "--mirrors", dest="nm",
                      help="number of mirrors", metavar="N")
    parser.add_option("-s", "--samples", dest="samplesize", default=100000,
                      help="number of samples to randomly run", metavar="N")
    (options, args) = parser.parse_args()

    if not options.nf or not options.nm:
        sys.exit(parser.get_usage())

    # keep nf and nm as floats, so that nm/nf below is a float division
    nf = float(options.nf)
    nm = float(options.nm)
    samplesize = int(options.samplesize)

    mirrors = {}
    for i in range(int(nm)):
        mirrors[i] = {'name': 'mirror%s' % i,
                      'count_req': 0,
                      'files': set()}

    files = {}
    for i in range(int(nf)):
        files[i] = 'file%s' % i

    for s in range(samplesize):
        # pick a random file, and map it onto a mirror with the formula
        # selected = int( rand(0..1) * (nM/nF) + f * (nM/nF) )
        f = random.randint(0, int(nf) - 1)
        sel = int(random.random() * (nm / nf) + f * (nm / nf))
        mirrors[sel]['count_req'] += 1
        mirrors[sel]['files'].add(files[f])

    for i in sorted(mirrors.keys()):
        m = mirrors[i]
        print '%6d requests (%d%%) to %s for %s' % \
            (m['count_req'],
             m['count_req'] / float(samplesize) * 100,
             m['name'],
             ', '.join([f for f in m['files']]))


if __name__ == '__main__':
    main()