Ryan Schmidt said the following, on 09-08-13 9:15 PM:
On Aug 9, 2013, at 19:00, Roman Naumenko wrote:
Ryan Schmidt said the following, on 09-08-13 7:12 PM:
You can configure any number of read-only slaves which maintain copies of the master
repository with a very slight delay. The mirroring and keeping in sync would be
accomplished using svnsync. To access the repositories, users would use the hostname of a
mirror near to them. Read operations would occur on the mirror and therefore be
faster than accessing the farther-away master. For write operations, you configure the
mirror to proxy those requests back to the master. (Search for "write-through
proxy" for more on this.) In this way the users only need to know the address of
their closest mirror; they do not need to know which is the master or to know its address.
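
For reference, the mirror side of such a write-through proxy might look roughly
like the sketch below. It assumes Apache with mod_dav_svn 1.5 or later; the
hostnames, paths, and IP address are placeholders, not values from this thread.

```apache
# Hypothetical mirror (slave) configuration fragment.
# Assumes mod_dav, mod_dav_svn and mod_proxy/mod_proxy_http are loaded.

# What users see: reads are served from the local copy,
# writes are proxied back to the master.
<Location /svn>
  DAV svn
  SVNPath /var/svn/repos
  SVNMasterURI http://master.example.com/svn
</Location>

# What only the master uses to push new revisions with svnsync.
# There is no SVNMasterURI here, so writes land in the local copy.
<Location /svn-proxy-sync>
  DAV svn
  SVNPath /var/svn/repos
  # Apache 2.4 syntax; on 2.2 use Order/Deny/Allow instead.
  Require ip 192.0.2.10
</Location>
```

With something like this in place, the mirror would be initialized once with
"svnsync init", and the master's post-commit hook would run "svnsync sync"
against the /svn-proxy-sync URL. The mirror also needs a pre-revprop-change
hook that lets svnsync set revision properties.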
I wanted to have a universal URL that resolves to a different IP depending on
location, for performance.
I'm not familiar with how to set that up at the DNS level but if you are then
go for it.
Views in BIND or something similar: the DNS server replies with an IP that
depends on the request's originating network.
But more importantly, I'd like to have a few nodes handling writes.
Ah yes. Well then that's different.
You must have one heck of a large svn installation for that to be a bottleneck.
One day it might grow to that size, but even with a moderate load it is still a huge
convenience to have a pair or more of frontends available to handle the load: you can
take one down for maintenance at any time. VMs can also be used instead of a physical
box and sized more appropriately.
Of course, it would be ideal if Subversion nodes could just share
storage, so that any request from a load balancer can be processed by
any node without the need to replicate changes over the network.
If your storage is robust (i.e. a cluster filesystem, such as Xsan) and you
want to run multiple Subversion servers that each have access to the same
repositories on the same storage, then yes, you can do that instead.
The storage is robust enough - NetApp or possibly a SAN with all the enterprise
bells and whistles.
It would need to be not just a SAN but a SAN with a cluster filesystem, based
on previous conversations (see below).
Yeah, of course - SAN storage will require its own layer to handle data
sharing.
A few people mentioned that GFS worked.
Ok, so if multiple nodes are accessing the same mount point with the repository data,
will they be able to handle writes from multiple clients correctly? Thinking
out loud: yes, they should, since it makes no difference to a repository whether
multiple clients are committing through the same server or through a few distributed
nodes. Or is it different when the same process handles all requests?
I have not set it up myself, but I participated in discussions about it on this
list some years ago:
http://svn.haxx.se/users/archive-2006-10/0195.shtml
http://svn.haxx.se/users/archive-2007-05/0214.shtml
You may want to read those threads completely and carefully to get all the
nuances. And of course information may have changed since then.
Tom Mornini <tmornini_at_engineyard.com> confirmed that GFS works in
that thread and in another one, too:
http://svn.haxx.se/users/archive-2007-01/1307.shtml
But again, there is no "official" confirmation or reference architecture.
It seems like the number of repositories or the load on a server is
never large enough to make administrators (or, to some extent, Subversion
developers) design or implement a load-balancing cluster. Or maybe
it is close to huge, but in most cases svnsync + write-through solves the
problem.
On the other hand, there are commercial solutions available, so demand
must be there :)
--Roman