Re: Balancing and proxing
On 2013/08/12 8:35 PM, Nico Kadel-Garcia wrote:

> Nico Kadel-Garcia
> Email: nka...@gmail.com
> Sent from iPhone
>
> On Aug 9, 2013, at 20:12, Roman Naumenko ro...@naumenko.ca wrote:
>
>> You mean this one (svn clustering)?
>> http://www.wandisco.com/get?f=documentation/datasheets/DataSheet-Clustering.pdf
>> It doesn't look like it's a simple load-balancing architecture with shared storage for repositories.
>
> Right. Shared storage is very vulnerable to corrupting that single shared back end. This seems to be a well thought out multi-master setup, and should be far more resilient for most environments.

I tend to agree, although that direction limits scalability and administrative 'KISS'-ness.

>> There is some replication and synchronization involved, automatic failover, etc.
>> Is anybody using it? What is it like?
>> --Roman
>
> I'm working from the public specs. I'm not a Subversion master in my current workplace, so I lack the 3 hosts needed to really test it out.

--Roman Naumenko

___
This email is intended only for the use of the individual(s) to whom it is addressed and may be privileged and confidential. Unauthorised use or disclosure is prohibited. If you receive this e-mail in error, please advise immediately and delete the original message. This message may have been altered without your or our knowledge and the sender does not accept any liability for any errors or omissions in the message.

This email is confidential and protected. The sender does not waive the rights and obligations relating to it. Any distribution, use or copying of this message or of the information it contains by anyone other than the intended recipient(s) is prohibited. If you receive this email in error, please notify me immediately, by return email or by other means.
Re: Balancing and proxing
Nico Kadel-Garcia
Email: nka...@gmail.com
Sent from iPhone

On Aug 9, 2013, at 20:12, Roman Naumenko ro...@naumenko.ca wrote:

> You mean this one (svn clustering)?
> http://www.wandisco.com/get?f=documentation/datasheets/DataSheet-Clustering.pdf
> It doesn't look like it's a simple load-balancing architecture with shared storage for repositories.

Right. Shared storage is very vulnerable to corrupting that single shared back end. This seems to be a well thought out multi-master setup, and should be far more resilient for most environments.

> There is some replication and synchronization involved, automatic failover, etc.
> Is anybody using it? What is it like?
> --Roman

I'm working from the public specs. I'm not a Subversion master in my current workplace, so I lack the 3 hosts needed to really test it out.
Re: Balancing and proxing
If sharing storage is an option you can load balance the servers. See:
http://www.orcaware.com/svn/wiki/Server_performance_tuning_for_Linux_and_Unix

Sent from my iPad

On Aug 9, 2013, at 4:40 PM, Naumenko, Roman roman.naume...@rbccm.com wrote:

> Hi,
>
> I wanted to check if it's possible to configure Subversion in master-slave mode with some sort of common URL on the proxy server or load balancer, so end users wouldn't have to bother with different names for slave/master/read-only and geolocal names.
>
> Of course, it would be ideal if Subversion nodes could just share storage, so any sort of request from a load balancer can be processed by any node without the need to replicate changes over the network.
>
> Right now it's something like this:
>
>          rw                        ro
>  https://svnm.ab.org       https://svns.ab.org
>          |                    | r      | rw
>          | rw                 |        | redirect ---> https://svnm.ab.org
>        master --replication-- slave
>
> Which means that users should be aware of which name to use for commits and which for pulls (in case of a mistake, the write request will be proxied transparently to the master).
>
> I'd like to do something like this, where the LB or proxy takes care of request types and directs them accordingly (maybe with some specific svn-awareness):
>
>      https://svn.ab.org
>              LB
>      rw ---- | ---- ro
>      |               |
>      |               |
>    master -------- slave
>
> Thanks,
> --Roman
>
> PS Not sure if the text diagram will keep its format.
Re: Balancing and proxing
On Fri, Aug 9, 2013 at 4:40 PM, Naumenko, Roman roman.naume...@rbccm.com wrote:

> Hi,
>
> I wanted to check if it's possible to configure Subversion in master-slave mode with some sort of common URL on the proxy server or load balancer, so end users wouldn't have to bother with different names for slave/master/read-only and geolocal names.
>
> Of course, it would be ideal if Subversion nodes could just share storage, so any sort of request from a load balancer can be processed by any node without the need to replicate changes over the network.

Wandisco publishes a multiple-master toolkit that might solve your issues. They charge money for it, but it seems to be quite intelligent and has had good reports here about its high-availability behavior.
Re: Balancing and proxing
On Aug 9, 2013, at 15:40, Naumenko, Roman wrote:

> I wanted to check if it's possible to configure subversion in master-slave mode with some sort of common URL on the proxy server or loadbalancer, so end users wouldn't bother with different names for slave/master/readonly and geolocal names.

You can configure any number of read-only slaves which maintain copies of the master repository with a very slight delay. The mirroring and keeping in sync would be accomplished using svnsync. To access the repositories, users would use the hostname of a mirror near to them. Read operations would occur on the mirror and therefore be faster than accessing the farther-away master. For write operations, you configure the mirror to proxy those requests back to the master. (Search for "write-through proxy" for more on this.) In this way the users only need to know the address of their closest mirror; they do not need to know which is the master or to know its address.

> Of course, it would be ideal if subversion nodes could just share a storage, so any sort of requests from a load balancer can be processed by any node without need to replicate changes over network.

If your storage is robust (i.e. a cluster filesystem, such as Xsan) and you want to run multiple Subversion servers that each have access to the same repositories on the same storage, then yes, you can do that instead.
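[Editor's illustration] The read-on-mirror, write-to-master split described above can be pictured with a small sketch. This is only a model of the idea, not how mod_dav_svn or any load balancer actually decides: the function name `route_request`, the `READ_ONLY_METHODS` set, and the assumption that the HTTP method alone is enough to tell reads from writes are all hypothetical. The two host names are the ones used elsewhere in this thread.

```python
# Hypothetical sketch of write-through routing: reads stay on the nearby
# mirror, everything else is forwarded to the single master.
# Assumption: these WebDAV/DeltaV methods only read repository data.
READ_ONLY_METHODS = frozenset({"GET", "HEAD", "OPTIONS", "PROPFIND", "REPORT"})


def route_request(method: str, mirror_url: str, master_url: str) -> str:
    """Return the base URL of the server that should handle the request."""
    if method.upper() in READ_ONLY_METHODS:
        # Served from the local copy that svnsync keeps up to date.
        return mirror_url
    # Commits, locks and other modifying requests go to the one writer.
    return master_url


if __name__ == "__main__":
    mirror = "https://svns.ab.org"
    master = "https://svnm.ab.org"
    for method in ("PROPFIND", "REPORT", "MKACTIVITY", "MERGE"):
        print(method, "->", route_request(method, mirror, master))
```

Because the mirror only ever receives changes through replication from the master, there is still exactly one node accepting writes, which is why this layout does not by itself give several write-handling nodes.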
Re: Balancing and proxing
Ryan Schmidt said the following, on 09-08-13 7:12 PM:

> On Aug 9, 2013, at 15:40, Naumenko, Roman wrote:
>
>> I wanted to check if it's possible to configure subversion in master-slave mode with some sort of common URL on the proxy server or loadbalancer, so end users wouldn't bother with different names for slave/master/readonly and geolocal names.
>
> You can configure any number of read-only slaves which maintain copies of the master repository with a very slight delay. The mirroring and keeping in sync would be accomplished using svnsync. To access the repositories, users would use the hostname of a mirror near to them. Read operations would occur on the mirror and therefore be faster than accessing the farther-away master. For write operations, you configure the mirror to proxy those requests back to the master. (Search for "write-through proxy" for more on this.) In this way the users only need to know the address of their closest mirror; they do not need to know which is the master or to know its address.

I wanted to have a universal URL, which might resolve to a different IP based on location, for performance. But more importantly, I'd like to have a few nodes handling writes.

>> Of course, it would be ideal if subversion nodes could just share a storage, so any sort of requests from a load balancer can be processed by any node without need to replicate changes over network.
>
> If your storage is robust (i.e. a cluster filesystem, such as Xsan) and you want to run multiple Subversion servers that each have access to the same repositories on the same storage, then yes, you can do that instead.

The storage is robust enough - NetApp or possibly SAN with all the enterprise bells and whistles.

OK, so if multiple nodes are accessing the same mount point with the repository data, will they be able to handle writes from multiple clients correctly?

Thinking out loud: yes, they should, since it makes no difference to a repository whether multiple clients are committing through the same server or through a few distributed nodes. Or is it different when the same process handles all requests?

Does it mean that HA and load balancing should be pretty easy to set up? It should be, yet there is almost no information about examples of such an architecture. I must be missing something here.

--Roman
Re: Balancing and proxing
Nico Kadel-Garcia said the following, on 09-08-13 6:45 PM:

> On Fri, Aug 9, 2013 at 4:40 PM, Naumenko, Roman roman.naume...@rbccm.com wrote:
>
>> Hi,
>>
>> I wanted to check if it's possible to configure Subversion in master-slave mode with some sort of common URL on the proxy server or load balancer, so end users wouldn't have to bother with different names for slave/master/read-only and geolocal names.
>>
>> Of course, it would be ideal if Subversion nodes could just share storage, so any sort of request from a load balancer can be processed by any node without the need to replicate changes over the network.
>
> Wandisco publishes a multiple-master toolkit that might solve your issues. They charge money for it, but it seems to be quite intelligent and has had good reports here about its high-availability behavior.

You mean this one (svn clustering)?
http://www.wandisco.com/get?f=documentation/datasheets/DataSheet-Clustering.pdf

It doesn't look like it's a simple load-balancing architecture with shared storage for repositories. There is some replication and synchronization involved, automatic failover, etc.

Is anybody using it? What is it like?

--Roman
Re: Balancing and proxing
On Aug 9, 2013, at 19:00, Roman Naumenko wrote:

> Ryan Schmidt said the following, on 09-08-13 7:12 PM:
>
>> You can configure any number of read-only slaves which maintain copies of the master repository with a very slight delay. The mirroring and keeping in sync would be accomplished using svnsync. To access the repositories, users would use the hostname of a mirror near to them. Read operations would occur on the mirror and therefore be faster than accessing the farther-away master. For write operations, you configure the mirror to proxy those requests back to the master. (Search for "write-through proxy" for more on this.) In this way the users only need to know the address of their closest mirror; they do not need to know which is the master or to know its address.
>
> I wanted to have a universal URL, which might resolve to a different IP based on location, for performance.

I'm not familiar with how to set that up at the DNS level, but if you are then go for it.

> But more importantly, I'd like to have a few nodes handling writes.

Ah yes. Well then that's different. You must have one heck of a large svn installation for that to be a bottleneck.

>>> Of course, it would be ideal if subversion nodes could just share a storage, so any sort of requests from a load balancer can be processed by any node without need to replicate changes over network.
>>
>> If your storage is robust (i.e. a cluster filesystem, such as Xsan) and you want to run multiple Subversion servers that each have access to the same repositories on the same storage, then yes, you can do that instead.
>
> The storage is robust enough - NetApp or possibly SAN with all the enterprise bells and whistles.

It would need to be not just a SAN but a SAN with a cluster filesystem, based on previous conversations (see below).

> OK, so if multiple nodes are accessing the same mount point with the repository data, will they be able to handle writes from multiple clients correctly?
>
> Thinking out loud: yes, they should, since it makes no difference to a repository whether multiple clients are committing through the same server or through a few distributed nodes. Or is it different when the same process handles all requests?

I have not set it up myself, but I participated in discussions about it on this list some years ago:

http://svn.haxx.se/users/archive-2006-10/0195.shtml
http://svn.haxx.se/users/archive-2007-05/0214.shtml

You may want to read those threads completely and carefully to get all the nuances. And of course information may have changed since then.
Re: Balancing and proxing
Ryan Schmidt said the following, on 09-08-13 9:15 PM:

> On Aug 9, 2013, at 19:00, Roman Naumenko wrote:
>
>> Ryan Schmidt said the following, on 09-08-13 7:12 PM:
>>
>>> You can configure any number of read-only slaves which maintain copies of the master repository with a very slight delay. The mirroring and keeping in sync would be accomplished using svnsync. To access the repositories, users would use the hostname of a mirror near to them. Read operations would occur on the mirror and therefore be faster than accessing the farther-away master. For write operations, you configure the mirror to proxy those requests back to the master. (Search for "write-through proxy" for more on this.) In this way the users only need to know the address of their closest mirror; they do not need to know which is the master or to know its address.
>>
>> I wanted to have a universal URL, which might resolve to a different IP based on location, for performance.
>
> I'm not familiar with how to set that up at the DNS level, but if you are then go for it.

Views in BIND or something similar: the DNS server replies with an IP that depends on the request's originating network.

>> But more importantly, I'd like to have a few nodes handling writes.
>
> Ah yes. Well then that's different. You must have one heck of a large svn installation for that to be a bottleneck.

One day it might grow to that size, but even with a moderate load it is still a huge convenience to have a pair or more of front ends available to handle the load: you can take one down for maintenance at any time. VMs can also be used instead of a physical box and sized more appropriately.

>>>> Of course, it would be ideal if subversion nodes could just share a storage, so any sort of requests from a load balancer can be processed by any node without need to replicate changes over network.
>>>
>>> If your storage is robust (i.e. a cluster filesystem, such as Xsan) and you want to run multiple Subversion servers that each have access to the same repositories on the same storage, then yes, you can do that instead.
>>
>> The storage is robust enough - NetApp or possibly SAN with all the enterprise bells and whistles.
>
> It would need to be not just a SAN but a SAN with a cluster filesystem, based on previous conversations (see below).

Yeah, of course - SAN storage will require its own layer to handle data sharing. A few people mentioned that GFS worked.

>> OK, so if multiple nodes are accessing the same mount point with the repository data, will they be able to handle writes from multiple clients correctly?
>>
>> Thinking out loud: yes, they should, since it makes no difference to a repository whether multiple clients are committing through the same server or through a few distributed nodes. Or is it different when the same process handles all requests?
>
> I have not set it up myself, but I participated in discussions about it on this list some years ago:
>
> http://svn.haxx.se/users/archive-2006-10/0195.shtml
> http://svn.haxx.se/users/archive-2007-05/0214.shtml
>
> You may want to read those threads completely and carefully to get all the nuances. And of course information may have changed since then.

Tom Mornini (tmornini_at_engineyard.com) confirmed in that thread, and in this other one too, that GFS works:
http://svn.haxx.se/users/archive-2007-01/1307.shtml

But again, there is no official confirmation or reference architecture. It seems that the number of repositories or the load on a server is never large enough to push administrators (or, to some extent, Subversion developers) into designing or implementing a load-balancing cluster. Or maybe the load does get close to huge, but in most cases svnsync + write-through proxying solves the problem.

On the other hand, there are commercial solutions available, so the demand must be there :)

--Roman
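[Editor's illustration] The DNS views idea mentioned in this message (one name, a different answer depending on the network the query comes from) can be modelled in a few lines. This is a sketch of the selection logic only, not BIND configuration; the networks, the addresses, and the names `VIEWS`, `DEFAULT_ADDRESS` and `resolve_for_client` are all made up for the example.

```python
import ipaddress

# Hypothetical mapping of client networks to the nearest Subversion front end.
VIEWS = [
    (ipaddress.ip_network("10.1.0.0/16"), "10.1.0.10"),  # assumed site A
    (ipaddress.ip_network("10.2.0.0/16"), "10.2.0.10"),  # assumed site B
]

# Answer given to clients that match none of the networks above.
DEFAULT_ADDRESS = "10.0.0.10"


def resolve_for_client(client_ip: str) -> str:
    """Return the server address a client on this network should be given."""
    addr = ipaddress.ip_address(client_ip)
    for network, server_ip in VIEWS:
        if addr in network:
            return server_ip
    return DEFAULT_ADDRESS


if __name__ == "__main__":
    for client in ("10.1.42.7", "10.2.3.4", "192.168.1.1"):
        print(client, "->", resolve_for_client(client))
```

The first matching network wins, so more specific networks would need to be listed before broader ones.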