Re: Balancing and proxing

2013-08-13 Thread Naumenko, Roman
On 2013/08/12 8:35 PM, Nico Kadel-Garcia wrote:
 Nico Kadel-Garcia
 Email: nka...@gmail.com
 Sent from iPhone

 On Aug 9, 2013, at 20:12, Roman Naumenko ro...@naumenko.ca

 You mean this one (svn clustering)?
 http://www.wandisco.com/get?f=documentation/datasheets/DataSheet-Clustering.pdf

 It doesn't look like a simple load-balancing architecture with shared 
 storage for repositories.
 Right. Shared storage is very vulnerable to corruption of that single shared 
 back end. This seems to be a well-thought-out multi-master setup, and should 
 be far more resilient for most environments.

I tend to agree, although that direction limits scalability and makes 
administration less simple (less 'KISS').

 There is some replication and synchronization involved, automatic failover, 
 etc.
 Is anybody using it? What is it like?

 --Roman
 I'm working from the public specs. I'm not a Subversion master in my current 
 workplace, so I lack the 3 hosts needed to really test it out.

--Roman Naumenko
___

This email is intended only for the use of the individual(s) to whom it is 
addressed and may be privileged and confidential.
Unauthorised use or disclosure is prohibited. If you receive This e-mail in 
error, please advise immediately and delete the original message. 
This message may have been altered without your or our knowledge and the sender 
does not accept any liability for any errors or omissions in the message.

Ce courriel est confidentiel et protégé. L'expéditeur ne renonce pas aux droits 
et obligations qui s'y rapportent. 
Toute diffusion, utilisation ou copie de ce message ou des renseignements qu'il 
contient par une personne autre que le (les) destinataire(s) désigné(s) est 
interdite.
Si vous recevez ce courriel par erreur, veuillez m'en aviser immédiatement, par 
retour de courriel ou par un autre moyen.



Re: Balancing and proxing

2013-08-12 Thread Nico Kadel-Garcia


Nico Kadel-Garcia
Email: nka...@gmail.com
Sent from iPhone

On Aug 9, 2013, at 20:12, Roman Naumenko ro...@naumenko.ca

 You mean this one (svn clustering)?
 http://www.wandisco.com/get?f=documentation/datasheets/DataSheet-Clustering.pdf
 
 It doesn't look like a simple load-balancing architecture with shared 
 storage for repositories.

Right. Shared storage is very vulnerable to corruption of that single shared 
back end. This seems to be a well-thought-out multi-master setup, and should 
be far more resilient for most environments.

 There is some replication and synchronization involved, automatic failover, 
 etc.
 Is anybody using it? What is it like?
 
 --Roman

I'm working from the public specs. I'm not a Subversion master in my current 
workplace, so I lack the 3 hosts needed to really test it out.

Re: Balancing and proxing

2013-08-09 Thread Mark Phippard
If sharing storage is an option you can load balance the servers.  See:

http://www.orcaware.com/svn/wiki/Server_performance_tuning_for_Linux_and_Unix

Sent from my iPad

On Aug 9, 2013, at 4:40 PM, Naumenko, Roman roman.naume...@rbccm.com wrote:

 Hi,
 
 I wanted to check if it's possible to configure Subversion in 
 master-slave mode with some sort of common URL on the proxy server or 
 load balancer, so end users wouldn't have to bother with different names 
 for slave/master/read-only and geo-local names.
 
 Of course, it would be ideal if Subversion nodes could just share 
 storage, so any request from a load balancer can be processed by 
 any node without needing to replicate changes over the network.
 
 Right now it's something like this:
 
        rw                               ro
  https://svnm.ab.org            https://svns.ab.org
        rw |                          | r
           |                       rw |
           | <------ redirect --------+   (to https://svnm.ab.org)
        master --- replication --->  slave
 
 This means that users have to know which name to use for commits and 
 which for pulls (if they get it wrong, the write request is proxied 
 transparently to the master).
 
 I'd like to do something like this, where the LB or proxy takes care of 
 request types and directs them accordingly (maybe with some specific 
 svn-awareness).
 
                https://svn.ab.org
                        LB
            rw <--------+--------> ro
            |                      |
            |                      |
          master  -------------> slave
 
 Thanks,
 --Roman
 PS Not sure if text diagram will keep format.
 


Re: Balancing and proxing

2013-08-09 Thread Nico Kadel-Garcia
On Fri, Aug 9, 2013 at 4:40 PM, Naumenko, Roman
roman.naume...@rbccm.com wrote:
 Hi,

 I wanted to check if it's possible to configure subversion in
 master-slave mode with some sort of common URL on the proxy server or
 loadbalancer, so end users wouldn't bother with different names for
 slave/master/readonly and geolocal names.

 Of course, it would be ideal if subversion nodes could just share a
 storage, so any sort of requests from a load balancer can be processed by
 any node without need to replicate changes over network.

WANdisco publishes a multi-master toolkit that might solve your
issues. They charge money for it, but it seems to be quite intelligent,
and there have been good reports here of its high-availability behavior.


Re: Balancing and proxing

2013-08-09 Thread Ryan Schmidt
On Aug 9, 2013, at 15:40, Naumenko, Roman wrote:

 I wanted to check if it's possible to configure subversion in 
 master-slave mode with some sort of common URL on the proxy server or 
 loadbalancer, so end users wouldn't bother with different names for 
 slave/master/readonly and geolocal names.

You can configure any number of read-only slaves which maintain copies of the 
master repository with a very slight delay. The mirroring and syncing would be 
accomplished using svnsync. To access the repositories, users would use the 
hostname of a mirror near them. Read operations would occur on the mirror and 
therefore be faster than accessing the farther-away master. For write 
operations, you configure the mirror to proxy those requests back to the 
master. (Search for "write-through proxy" for more on this.) In this way the 
users only need to know the address of their closest mirror; they do not need 
to know which is the master or to know its address.
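
As a rough illustration of the request-type split discussed in this thread 
(reads served by a nearby mirror, writes sent on to the master), here is a 
minimal Python sketch. It is not how mod_dav_svn or any particular load 
balancer implements it; the hostnames are the example names used elsewhere in 
the thread, and the function name and the list of read-only methods are 
assumptions made for the example.

```python
# Example hostnames from the thread; substitute your own.
MASTER_URL = "https://svnm.ab.org"
MIRROR_URL = "https://svns.ab.org"

# Assumption: these HTTP/WebDAV methods never modify the repository.
# Everything else (MKACTIVITY, CHECKOUT, PUT, MERGE, ...) is treated as a write.
READ_ONLY_METHODS = frozenset({"GET", "HEAD", "OPTIONS", "PROPFIND", "REPORT"})


def choose_backend(method: str) -> str:
    """Return the backend URL that should serve a request with this method."""
    if method.upper() in READ_ONLY_METHODS:
        return MIRROR_URL
    return MASTER_URL


if __name__ == "__main__":
    for m in ("PROPFIND", "REPORT", "MKACTIVITY", "MERGE"):
        print(m, "->", choose_backend(m))
```

Because the mirror follows the master with the slight delay mentioned above, 
a client that has just committed may briefly see an older revision when it 
reads through the mirror.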


 Of course, it would be ideal if subversion nodes could just share a 
 storage, so any sort of requests from a load balancer can be processed by 
 any node without need to replicate changes over network.

If your storage is robust (i.e. a cluster filesystem, such as Xsan) and you 
want to run multiple Subversion servers that each have access to the same 
repositories on the same storage, then yes, you can do that instead.



Re: Balancing and proxing

2013-08-09 Thread Roman Naumenko

Ryan Schmidt said the following, on 09-08-13 7:12 PM:

On Aug 9, 2013, at 15:40, Naumenko, Roman wrote:

I wanted to check if it's possible to configure subversion in
master-slave mode with some sort of common URL on the proxy server or
loadbalancer, so end users wouldn't bother with different names for
slave/master/readonly and geolocal names.

You can configure any number of read-only slaves which maintain copies of the 
master repository with a very slight delay. The mirroring and syncing would be 
accomplished using svnsync. To access the repositories, users would use the 
hostname of a mirror near them. Read operations would occur on the mirror and 
therefore be faster than accessing the farther-away master. For write 
operations, you configure the mirror to proxy those requests back to the 
master. (Search for "write-through proxy" for more on this.) In this way the 
users only need to know the address of their closest mirror; they do not need 
to know which is the master or to know its address.

I wanted to have a universal URL, which might resolve to a different IP 
based on location - for performance.

But more importantly, I'd like to have a few nodes handling writes.

Of course, it would be ideal if subversion nodes could just share a
storage, so any sort of requests from a load balancer can be processed by
any node without need to replicate changes over network.

If your storage is robust (i.e. a cluster filesystem, such as Xsan) and you 
want to run multiple Subversion servers that each have access to the same 
repositories on the same storage, then yes, you can do that instead.

The storage is robust enough - NetApp or possibly a SAN with all the 
enterprise bells and whistles.

OK, so if multiple nodes are accessing the same mount point with the 
repository data, will they be able to handle writes from multiple clients 
correctly? Thinking out loud: yes, they should, since it makes no 
difference to a repository whether multiple clients are committing through 
the same server or through a few distributed nodes. Or is it different when 
the same process handles all requests?

Does it mean that HA and load balancing should be pretty easy to set up? 
It should be, yet there is almost no information about examples of 
such an architecture. I must be missing something here.

--Roman


Re: Balancing and proxing

2013-08-09 Thread Roman Naumenko

Nico Kadel-Garcia said the following, on 09-08-13 6:45 PM:

On Fri, Aug 9, 2013 at 4:40 PM, Naumenko, Roman
roman.naume...@rbccm.com wrote:

Hi,

I wanted to check if it's possible to configure subversion in
master-slave mode with some sort of common URL on the proxy server or
loadbalancer, so end users wouldn't bother with different names for
slave/master/readonly and geolocal names.

Of course, it would be ideal if subversion nodes could just share a
storage, so any sort of requests from a load balancer can be processed by
any node without need to replicate changes over network.

Wandisco publishes a multiple master toolkit that might solve your
issues. They charge money for it, but it seems to be quite intelligent
and has good reports here of its high availability behavior.

You mean this one (svn clustering)?
http://www.wandisco.com/get?f=documentation/datasheets/DataSheet-Clustering.pdf

It doesn't look like a simple load-balancing architecture with 
shared storage for repositories.
There is some replication and synchronization involved, automatic 
failover, etc.

Is anybody using it? What is it like?

--Roman


Re: Balancing and proxing

2013-08-09 Thread Ryan Schmidt
On Aug 9, 2013, at 19:00, Roman Naumenko wrote:
 Ryan Schmidt said the following, on 09-08-13 7:12 PM:
 You can configure any number of read-only slaves which maintain copies of 
 the master repository with a very slight delay. The mirroring and syncing 
 would be accomplished using svnsync. To access the repositories, users 
 would use the hostname of a mirror near them. Read operations would occur 
 on the mirror and therefore be faster than accessing the farther-away 
 master. For write operations, you configure the mirror to proxy those 
 requests back to the master. (Search for "write-through proxy" for more on 
 this.) In this way the users only need to know the address of their closest 
 mirror; they do not need to know which is the master or to know its address.
 I wanted to have a universal URL, which might resolve to a different IP 
 based on location - for performance.

I'm not familiar with how to set that up at the DNS level but if you are then 
go for it.


 But more important, I'd like to have a few nodes handling writes.

Ah yes. Well then that's different.

You must have one heck of a large svn installation for that to be a bottleneck.


 Of course, it would be ideal if subversion nodes could just share a
 storage, so any sort of requests from a load balancer can be processed by
 any node without need to replicate changes over network.
 If your storage is robust (i.e. a cluster filesystem, such as Xsan) and you 
 want to run multiple Subversion servers that each have access to the same 
 repositories on the same storage, then yes, you can do that instead.
 The storage is robust enough - NetApp or possibly SAN with all enterprise 
 bells and whistles.

It would need to be not just a SAN but a SAN with a cluster filesystem, based 
on previous conversations (see below).


 OK, so if multiple nodes are accessing the same mount point with repository 
 data, will they be able to handle writes from multiple clients correctly? 
 Thinking out loud: yes, they should, since it makes no difference to a 
 repository whether multiple clients are committing through the same server 
 or through a few distributed nodes. Or is it different when the same process 
 handles all requests?

I have not set it up myself, but I participated in discussions about it on this 
list some years ago:

http://svn.haxx.se/users/archive-2006-10/0195.shtml

http://svn.haxx.se/users/archive-2007-05/0214.shtml

You may want to read those threads completely and carefully to get all the 
nuances. And of course information may have changed since then.





Re: Balancing and proxing

2013-08-09 Thread Roman Naumenko


Ryan Schmidt said the following, on 09-08-13 9:15 PM:

On Aug 9, 2013, at 19:00, Roman Naumenko wrote:

Ryan Schmidt said the following, on 09-08-13 7:12 PM:

You can configure any number of read-only slaves which maintain copies of the 
master repository with a very slight delay. The mirroring and syncing would be 
accomplished using svnsync. To access the repositories, users would use the 
hostname of a mirror near them. Read operations would occur on the mirror and 
therefore be faster than accessing the farther-away master. For write 
operations, you configure the mirror to proxy those requests back to the 
master. (Search for "write-through proxy" for more on this.) In this way the 
users only need to know the address of their closest mirror; they do not need 
to know which is the master or to know its address.

I wanted to have a universal URL, which might resolve to a different IP based 
on location - for performance.

I'm not familiar with how to set that up at the DNS level but if you are then 
go for it.

Views in BIND or something similar: the DNS server replies with an IP that 
depends on the request's originating network.
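
A minimal Python sketch of that idea, assuming made-up site networks and 
mirror addresses (the answers are documentation-range IPs, not real servers, 
and the names VIEWS and resolve_for_client are hypothetical). A real setup 
would express this in the DNS server's own configuration rather than in code; 
this only shows the lookup logic.

```python
import ipaddress

# Hypothetical mapping: client network -> address of the nearest svn node.
VIEWS = [
    (ipaddress.ip_network("10.1.0.0/16"), "192.0.2.10"),     # e.g. first site
    (ipaddress.ip_network("10.2.0.0/16"), "198.51.100.10"),  # e.g. second site
]

# Answer for clients that are outside every listed network.
DEFAULT_ADDRESS = "203.0.113.10"


def resolve_for_client(client_ip: str) -> str:
    """Return the address the common svn hostname resolves to for this client."""
    addr = ipaddress.ip_address(client_ip)
    for network, answer in VIEWS:
        if addr in network:
            return answer
    return DEFAULT_ADDRESS


if __name__ == "__main__":
    for ip in ("10.1.5.7", "10.2.0.1", "172.16.0.1"):
        print(ip, "->", resolve_for_client(ip))
```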



But more important, I'd like to have a few nodes handling writes.

Ah yes. Well then that's different.

You must have one heck of a large svn installation for that to be a bottleneck.


One day it might grow to that point, but even with a moderate load it is still 
a huge convenience to have a pair or more of frontends available to handle the 
load: you can take one down for maintenance at any time. VMs can also be used 
instead of a physical box and sized more appropriately.

Of course, it would be ideal if subversion nodes could just share a
storage, so any sort of requests from a load balancer can be processed by
any node without need to replicate changes over network.



If your storage is robust (i.e. a cluster filesystem, such as Xsan) and you 
want to run multiple Subversion servers that each have access to the same 
repositories on the same storage, then yes, you can do that instead.

The storage is robust enough - NetApp or possibly SAN with all enterprise bells 
and whistles.

It would need to be not just a SAN but a SAN with a cluster filesystem, based 
on previous conversations (see below).

Yeah, of course - SAN storage will require its own layer to handle data 
sharing.

A few people mentioned that GFS worked.

OK, so if multiple nodes are accessing the same mount point with repository 
data, will they be able to handle writes from multiple clients correctly? 
Thinking out loud: yes, they should, since it makes no difference to a 
repository whether multiple clients are committing through the same server or 
through a few distributed nodes. Or is it different when the same process 
handles all requests?

I have not set it up myself, but I participated in discussions about it on this 
list some years ago:

http://svn.haxx.se/users/archive-2006-10/0195.shtml

http://svn.haxx.se/users/archive-2007-05/0214.shtml

You may want to read those threads completely and carefully to get all the 
nuances. And of course information may have changed since then.

Tom Mornini (tmornini_at_engineyard.com) confirmed in that thread, and in 
this other one, that GFS works:
http://svn.haxx.se/users/archive-2007-01/1307.shtml

But again, there is no official confirmation or reference architecture.

It seems like the number of repositories or the load on a server is 
never large enough to make administrators (or Subversion developers, to 
some extent) design or implement a load-balancing cluster. Or maybe 
it does get close to huge, but in most cases svnsync + write-through 
proxying solves the problem.
On the other hand, there are commercial solutions available, so the demand 
must be there :)


--Roman