Re: [Gluster-users] Replicate Over VPN

2014-03-18 Thread Brock Nanson
Thanks guys, for your responses!  I get the digest, so I'm going to
cut/paste the juicier bits into one message... And a warning... if some of
my comments suggest I really don't know what I'm doing - well, that could
very well be right.  I'm definitely down the learning curve a way - IT is
not my real job or background.



 -- Forwarded message --
 From: Alex Chekholko ch...@stanford.edu
 To: gluster-users@gluster.org
 Cc:
 Date: Mon, 17 Mar 2014 11:23:15 -0700
 Subject: Re: [Gluster-users] Replicate Over VPN


 On 03/13/2014 04:50 PM, Brock Nanson wrote:


  ...

 2) I've seen it suggested that the write function isn't considered
 complete until it's complete on all bricks in the volume. My write
 speeds would seem to confirm this.


 Yes, the write will return when all replicas are written.  AKA synchronous
 replication.  Usually replication means synchronous replication.


OK, so the replication is bit by bit, real time across all the replicas.
 'Synchronous' meaning 'common clock' in essence.
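
For what it's worth, 'synchronous' in replication usually describes when the write call returns to the caller, rather than a shared clock. A minimal Python sketch of the difference follows. The `Replica` class, the node names and the 2 ms / 80 ms latencies are illustrative assumptions, not GlusterFS internals.

```python
class Replica:
    """Toy replica: stores data and reports a fixed, simulated write latency."""

    def __init__(self, name, latency_ms):
        self.name = name
        self.latency_ms = latency_ms
        self.files = {}

    def write(self, path, data):
        self.files[path] = data
        return self.latency_ms


def synchronous_write(replicas, path, data):
    """Release the caller only when every replica has the data.

    Replicas are assumed to be written in parallel, so the caller
    waits for the slowest one.
    """
    return max(r.write(path, data) for r in replicas)


def asynchronous_write(local, backlog, path, data):
    """Release the caller after the local write; queue the rest for later."""
    latency = local.write(path, data)
    backlog.append((path, data))
    return latency


local = Replica("office-a", latency_ms=2)    # assumed local disk latency
remote = Replica("office-b", latency_ms=80)  # assumed latency over the VPN

sync_ms = synchronous_write([local, remote], "junk.txt", b"hello")
backlog = []
async_ms = asynchronous_write(local, backlog, "junk2.txt", b"hello")

print(sync_ms, async_ms, len(backlog))
```

In the synchronous case the caller pays the remote latency on every write. In the asynchronous case it pays only the local one, but the remote copy lags until the backlog is shipped.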



  Is this correct and is there any way
 to cache the data and allow it to trickle over the link in the
 background?


 You're talking about asynchronous replication.  Which GlusterFS calls
 geo-replication.


Understood... so this means one direction only in reality, at least until
the nut of doing the replication in both directions can be cracked.
 'Asynchronous' might be a bit of a misdirection though, because it would
suggest (to me at least), communication in *both* directions, but not based
on the same clock.


 ...

 Geo-replication would seem to be the ideal solution, except for the fact
 that it apparently only works in one direction (although it was
 evidently hoped it would be upgraded in 3.4.0 to go in both directions I
 understand).


 So if you allow replication to be delayed, and you allow writes on both
 sides, how would you deal with the same file simultaneously being written
 on both sides?  Which would win in the end?


This is the big question of course, and I think the answer requires more
knowledge than I have relating to how the replication process occurs.  In
my unsophisticated way, I would assume that under the hood, gluster would
sound something like this whenever a new file is written to Node A:

1) Samba wants to write a file, I'm awake!
2) Hey Node B, wake up, we're about to start writing some bits
synchronously.  File is called 'junk.txt'.
3) OK, we've both opened that file for writing... Samba, start your
transmission.
4) 'write, write, write', in Node A/B perfect harmony
5) Close that file and make sure the file listing is updated.

This bit level understanding is something I don't have.  At some point, the
directory listing would be updated to show the new or updated file.  When
does that happen?  Before or after the file is written?

So to answer your question about which file would win if simultaneously
written, I need to understand whether simply having the file opened for
writing is enough to take control of it.  That is, can Node A tell Node B
that junk.txt is going to be written, thus preventing Node B from accepting
a local write request?  If this is the case, then gluster would only need
to send enough information from Node A to Node B to indicate the write was
coming and that the file is off limits until further notice.  The write
could occur as fast as possible on the local node, and dribble across the
VPN as fast as the link allows to the other.  So #4 above would be 'write,
write, write as fast as each node reasonably can, but not necessarily in
harmony'.  And if communication was broken during the process, the heal
function would be called upon to sort it out when communication is restored.
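
The idea above (announce the write, lock the file on the other node, write locally, let the data dribble across later) could be sketched roughly as follows. This is a toy model in Python under stated assumptions: the `Node` class, its `outbox` queue and the `drain` method are hypothetical names for illustration and do not describe how GlusterFS actually works.

```python
from collections import deque


class Node:
    """Toy node: local files, paths locked by the peer, and an outbox."""

    def __init__(self, name):
        self.name = name
        self.files = {}
        self.locked_by_peer = set()
        self.outbox = deque()  # writes still waiting to cross the link
        self.peer = None

    def write(self, path, data):
        if path in self.locked_by_peer:
            raise PermissionError(f"{path} is being written on {self.peer.name}")
        # Small message over the link: "this file is off limits until further notice".
        self.peer.locked_by_peer.add(path)
        # Fast local write; the bulk data is queued rather than sent now.
        self.files[path] = data
        self.outbox.append((path, data))

    def drain(self):
        """Push queued writes to the peer, then release the peer-side lock."""
        while self.outbox:
            path, data = self.outbox.popleft()
            self.peer.files[path] = data
            self.peer.locked_by_peer.discard(path)


a, b = Node("A"), Node("B")
a.peer, b.peer = b, a

a.write("junk.txt", b"v1")
try:
    b.write("junk.txt", b"other")
    blocked = False
except PermissionError:
    blocked = True

a.drain()
b.write("junk.txt", b"v2")  # allowed again once the data has arrived
```

The sketch ignores two things that matter in practice: the lock announcement still costs a round trip over the VPN, and nothing here handles both nodes announcing a write to the same file at the same moment.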



 So are there any configuration tricks (write-behind, compression etc)
 that might help me out?  Is there a way to fool geo-replication into
 working in both directions, recognizing my application isn't seeing
 serious read/write activity and some reasonable amount of risk is
 acceptable?


 You're basically talking about running rsyncs in both directions.  How
 will you handle any file conflicts?


Yes, I suppose in a way I am, but not based on a cron job... it would
ideally be a full time synchronization, like gluster does, but without the
requirement of perfect Synchronicity (wasn't that a Police album?).

Assuming my kindergarten understanding above could be applied here, the
file conflicts would presumably only exist if the VPN link went down,
preventing the 'open the file for writing' command from being completed on both
ends.  If the link went down part way through a dribbling write to Node B,
the healing process would presumably have a go at fixing the problem after
the link is reinstated.  If someone wrote to the remote copy during the
outage, the typical heal issues would come into play.
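
One way to picture the 'typical heal issues' after an outage is a per-copy version counter compared against the last version both sides agreed on. The Python sketch below is only an illustration of that idea. The function name and the counters are assumptions, and it is not meant to describe GlusterFS's actual self-heal bookkeeping.

```python
def classify(base_version, a_version, b_version):
    """Compare each copy's version counter with the last agreed version."""
    a_changed = a_version > base_version
    b_changed = b_version > base_version
    if a_changed and b_changed:
        return "conflict"      # both sides wrote during the outage
    if a_changed:
        return "copy A -> B"   # only A wrote; B can simply catch up
    if b_changed:
        return "copy B -> A"   # only B wrote; A can simply catch up
    return "in sync"


print(classify(base_version=3, a_version=5, b_version=3))
print(classify(base_version=3, a_version=5, b_version=4))
```

Only the case where both counters moved needs a human (or a policy) to decide which copy wins. The other cases can be healed mechanically.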



 --
 Alex Chekholko ch...@stanford.edu



Re: [Gluster-users] Replicate Over VPN

2014-03-17 Thread Alex Chekholko



On 03/13/2014 04:50 PM, Brock Nanson wrote:

Yeah... I found the Joe Julian do's and don'ts blog post too late; it
pretty much says I shouldn't have started down this road.  But I have
started down the road, so I'd like to make the best of it.
(http://joejulian.name/blog/glusterfs-replication-dos-and-donts/)

...

2) I've seen it suggested that the write function isn't considered
complete until it's complete on all bricks in the volume. My write
speeds would seem to confirm this.


Yes, the write will return when all replicas are written.  AKA 
synchronous replication.  Usually replication means synchronous 
replication.



Is this correct and is there any way
to cache the data and allow it to trickle over the link in the
background?


You're talking about asynchronous replication.  Which GlusterFS calls 
geo-replication.


I'm thinking about the write-behind-window size setting,
etc.  It would be nice if something like DRBD Protocol A could be
implemented, where writes are considered complete when the fast local
one is done.  I realize the potential for data loss if something goes
wrong, but in my case the heal would take care of almost every scenario
I can envision.

Geo-replication would seem to be the ideal solution, except for the fact
that it apparently only works in one direction (although it was
evidently hoped it would be upgraded in 3.4.0 to go in both directions I
understand).


So if you allow replication to be delayed, and you allow writes on both
sides, how would you deal with the same file simultaneously being
written on both sides?  Which would win in the end?




So are there any configuration tricks (write-behind, compression etc)
that might help me out?  Is there a way to fool geo-replication into
working in both directions, recognizing my application isn't seeing
serious read/write activity and some reasonable amount of risk is
acceptable?



You're basically talking about running rsyncs in both directions.  How 
will you handle any file conflicts?



--
Alex Chekholko ch...@stanford.edu
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Replicate Over VPN

2014-03-17 Thread Marcus Bointon
On 17 Mar 2014, at 23:11, Alex Chekholko ch...@stanford.edu wrote:

 For your async use case, how often does the shared data change?  Perhaps 
 something like a plain rsync every night would be sufficient?  Or a ZFS 
 send/receive if that's faster than rsync?

(This should really have been in reply to Brock, but I lost his post somewhere)

There are some fairly simple solutions for this that may be workable, 
especially if writes are somewhat constrained. If all reads and writes by a 
single client go to the same back-end server, perhaps because of cookie or 
IP-based stickiness, they can cope with longish latency propagating to other 
servers, read-what-you-just-wrote will always succeed, and simultaneous writes 
to the same file are very unlikely. A classic use case would be user-uploaded 
image files for a web server cluster.
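
A minimal Python sketch of IP-based stickiness, assuming a fixed list of back-end servers. The server names and the `pick_backend` helper are made up for illustration; a real deployment would more likely use the load balancer's own stickiness feature.

```python
import hashlib


def pick_backend(client_ip, backends):
    """Map a client IP to one back-end, the same one on every call."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


backends = ["server-a", "server-b"]  # hypothetical back-end names
first = pick_backend("203.0.113.7", backends)
second = pick_backend("203.0.113.7", backends)
print(first == second)
```

Because a given client always lands on the same server, it reads back what it just wrote even if propagation to the other server is slow. Note that changing the number of back-ends reshuffles the mapping.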

Bidirectional rsync has serious issues with deletions. Other systems worth 
looking at include:
csync2: http://oss.linbit.com/csync2/
Unison: http://www.cis.upenn.edu/~bcpierce/unison/
Bsync: https://github.com/dooblem/bsync

None of these do what gluster does of course, and may create their own issues!
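
The deletion problem with bidirectional rsync can be shown with a small Python sketch. When a file exists on only one side, a stateless comparison cannot tell 'deleted there' from 'created here'; a record of the previous sync resolves it. The `plan_sync` function and the file names are illustrative assumptions, not the algorithm of any of the tools listed above.

```python
def plan_sync(a_files, b_files, last_synced=None):
    """Return an action for each name present on only one side."""
    actions = {}
    for name in sorted(a_files ^ b_files):
        present_on = "A" if name in a_files else "B"
        missing_on = "B" if present_on == "A" else "A"
        if last_synced is None:
            # No history: cannot tell a deletion from a new file.
            actions[name] = "ambiguous"
        elif name in last_synced:
            # Existed at the last sync, so the other side deleted it.
            actions[name] = f"delete on {present_on}"
        else:
            # Did not exist at the last sync, so it is new.
            actions[name] = f"copy to {missing_on}"
    return actions


a = {"plan.dwg", "notes.txt"}
b = {"plan.dwg", "new.dwg"}
previous = {"plan.dwg", "notes.txt"}  # state recorded at the last sync

print(plan_sync(a, b))
print(plan_sync(a, b, previous))
```

Without the recorded state, a plain two-way copy would resurrect `notes.txt` on B even though it was deleted there on purpose.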

Marcus
-- 
Marcus Bointon
Technical Director, Synchromedia Limited

Creators of http://www.smartmessages.net/
UK 1CRM solutions http://www.syniah.com/
mar...@synchromedia.co.uk | http://www.synchromedia.co.uk/




[Gluster-users] Replicate Over VPN

2014-03-13 Thread Brock Nanson
Yeah... I found the Joe Julian do's and don'ts blog post too late; it pretty
much says I shouldn't have started down this road.  But I have started
down the road, so I'd like to make the best of it. (
http://joejulian.name/blog/glusterfs-replication-dos-and-donts/)

So now I'm wondering what I can do to speed things up as much as possible.
 First of all, I'll describe why I'm in this situation.

I need to maintain the same read/write access to all files in two
geographically-distant offices.  Essentially, the way Autodesk sets up
project files makes moving them between offices problematic, so GlusterFS
with a fast connection would solve the problem.  But the VPN connection is
slow for the next year (under 10 Mbit/s, hopefully 100 Mbit/s eventually,
which still isn't ideally fast).  The nature of the file use is such that
it would be surprising if a file was accessed from both offices at the same
time.  99% of the time I could disconnect the bricks from each other and
reconnect at the end of the day and the heal function would do fine after
hours, no split-brain problems.  Except for that 1%, I could even rsync...

In testing (Samba, GlusterFS 3.4.2, Ubuntu 12.04 LTS), I'm seeing two
issues:  1) Browsing folders is slow; and 2) file writes are being done at
approximately the speed of the VPN connection.

Before I can attempt to improve the situation (if that's even possible),
I'd like to know if my understandings are correct!

1) My reading suggests that every brick's file list is read when a folder
is browsed by the client.  Meaning the latency of the link is the
bottleneck.  Does this actually happen?  Is there a way to prevent it?  If
bricks are supposedly exact replicas of each other, why get the file
listings from all the other bricks in the volume instead of trusting the
local one?  If there were actually discrepancies found, wouldn't that
suggest a bigger problem with the replication?
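
To put a rough number on the latency concern in (1), here is a back-of-the-envelope Python sketch. It assumes a very simple model in which each directory entry costs one round trip to the remote brick. The folder size (200 entries) and round-trip time (40 ms) are made-up figures, and the real number of round trips per entry is not something this sketch claims to know.

```python
def browse_seconds(entries, round_trip_ms, round_trips_per_entry=1):
    """Time spent waiting on the link while listing a folder."""
    return entries * round_trips_per_entry * round_trip_ms / 1000


# Assumed figures: 200 entries, 40 ms round trip over the VPN.
print(browse_seconds(entries=200, round_trip_ms=40))
```

Under this model the delay scales with latency and entry count, not with bandwidth, so a faster link with the same round-trip time would not help browsing much.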

2) I've seen it suggested that the write function isn't considered complete
until it's complete on all bricks in the volume. My write speeds would seem
to confirm this.  Is this correct and is there any way to cache the data
and allow it to trickle over the link in the background?  I'm thinking
about the write-behind-window size setting, etc.  It would be nice if
something like DRBD Protocol A could be implemented, where writes are
considered complete when the fast local one is done.  I realize the
potential for data loss if something goes wrong, but in my case the heal
would take care of almost every scenario I can envision.
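
For a sense of scale on (2): if a write only completes once the data has crossed the link, the best case is bounded by the link speed. A small Python calculation follows, assuming a 50 MB file (an invented size) and ignoring VPN and protocol overhead.

```python
def transfer_seconds(size_megabytes, link_megabits_per_s):
    """Ideal transfer time: 8 bits per byte, no protocol overhead."""
    return size_megabytes * 8 / link_megabits_per_s


# Assumed 50 MB file on the current and the hoped-for link speeds.
print(transfer_seconds(50, 10))   # 40.0 seconds at 10 Mbit/s
print(transfer_seconds(50, 100))  # 4.0 seconds at 100 Mbit/s
```

Real transfers will be slower than these figures, since encryption and protocol overhead take their share of the link.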

Geo-replication would seem to be the ideal solution, except for the fact
that it apparently only works in one direction (although it was evidently
hoped it would be upgraded in 3.4.0 to go in both directions I understand).

So are there any configuration tricks (write-behind, compression etc) that
might help me out?  Is there a way to fool geo-replication into working in
both directions, recognizing my application isn't seeing serious read/write
activity and some reasonable amount of risk is acceptable?