Re: [tahoe-dev] split brain? how handled in tahoe -- docs?

2012-08-08 Thread Zooko Wilcox-O'Hearn
On Wed, Aug 8, 2012 at 2:42 AM, Tony Arcieri wrote: > Awesome! Is the merge algorithm documented anywhere? Is it a patch-style > algorithm, or more advanced? It is only for mutable directories, not for mutable files. Unfortunately it is not documented, I don't think, which is one of the reasons t

Re: [tahoe-dev] split brain? how handled in tahoe -- docs?

2012-08-08 Thread Tony Arcieri
On Wed, Aug 8, 2012 at 1:46 AM, James A. Donald wrote: > What is, however, avoidable, and should be avoided, is that you might > write the most recent version of a file, and then get a mangled mixture of > more recent and less recent versions. > Most editors will attempt to inform you if you are

Re: [tahoe-dev] split brain? how handled in tahoe -- docs?

2012-08-08 Thread Zooko Wilcox-O'Hearn
On Wed, Aug 8, 2012 at 2:36 AM, Tony Arcieri wrote: > >> with the Tahoe-LAFS access control >> architecture -- in which most things are immutable, and most mutable >> things are writable by few or only one writer -- such cases appear to >> be very rare. > > > I operate a Friendgrid, and we have a

Re: [tahoe-dev] split brain? how handled in tahoe -- docs?

2012-08-08 Thread James A. Donald
On 2012-08-08 5:57 PM, Tony Arcieri wrote: There are only two options: - Available: Tahoe still accepts writes (and all other operations, but writes are the hardest) in the middle of a network partition, like it does today (provided sufficient nodes are available). The current mechanism is "last

Re: [tahoe-dev] split brain? how handled in tahoe -- docs?

2012-08-08 Thread Tony Arcieri
Awesome! Is the merge algorithm documented anywhere? Is it a patch-style algorithm, or more advanced? On Wed, Aug 8, 2012 at 1:40 AM, Zooko Wilcox-O'Hearn wrote: > On Wed, Aug 8, 2012 at 2:31 AM, Zooko Wilcox-O'Hearn > wrote: > > On Mon, Aug 6, 2012 at 1:38 PM, Tony Arcieri > wrote: > > > >> Fr

Re: [tahoe-dev] split brain? how handled in tahoe -- docs?

2012-08-08 Thread Zooko Wilcox-O'Hearn
On Wed, Aug 8, 2012 at 2:31 AM, Zooko Wilcox-O'Hearn wrote: > On Mon, Aug 6, 2012 at 1:38 PM, Tony Arcieri wrote: > >> From what I've read of how Tahoe handles conflicts, it employs a monotonic >> version number and timestamps. So it sounds like in the event of a conflict, >> Tahoe employs a last

Re: [tahoe-dev] split brain? how handled in tahoe -- docs?

2012-08-08 Thread Tony Arcieri
On Wed, Aug 8, 2012 at 1:31 AM, Zooko Wilcox-O'Hearn wrote: > In fact, it is quite likely that there were no files or > directories to which write access was held by people on both sides of > the partition! > > So, empirically, all this distributed consistency stuff that we're > talking about is t

Re: [tahoe-dev] split brain? how handled in tahoe -- docs?

2012-08-08 Thread Zooko Wilcox-O'Hearn
On Mon, Aug 6, 2012 at 1:38 PM, Tony Arcieri wrote: > On Mon, Aug 6, 2012 at 12:30 PM, Zooko Wilcox-O'Hearn > wrote: >> >> How can both that story and also the things that have already been >> posted on this thread both be true? ... > As far as CAP theorem goes, it sounds like Tahoe falls into

Re: [tahoe-dev] split brain? how handled in tahoe -- docs?

2012-08-08 Thread Michael Rogers
On 08/08/12 08:57, Tony Arcieri wrote: > These are the only two options. For those who desire > "reliability", these are the only buckets that reliability can be > segmented into. So far I have not heard *any* good arguments > towards strong consistenc

Re: [tahoe-dev] split brain? how handled in tahoe -- docs?

2012-08-08 Thread Tony Arcieri
On Tue, Aug 7, 2012 at 8:04 PM, Two Spirit wrote: > in split brain scenario, wouldn't both halves have write capabilities? > unless you made some requirement that more than half of the nodes needed to > be up in order to write. Hi everyone, This is very much a formalized topic of computer scie

Re: [tahoe-dev] split brain? how handled in tahoe -- docs?

2012-08-07 Thread Two Spirit
in split brain scenario, wouldn't both halves have write capabilities? unless you made some requirement that more than half of the nodes needed to be up in order to write. On Tue, Aug 7, 2012 at 6:46 PM, James A. Donald wrote: > On 2012-08-07 5:16 PM, Two Spirit wrote: > >> Since when is lost da

Re: [tahoe-dev] split brain? how handled in tahoe -- docs?

2012-08-07 Thread James A. Donald
On 2012-08-07 5:16 PM, Two Spirit wrote: Since when is lost data considered highly RELIABLE storage? It isn't storage if it doesn't store. in my eyes vanishing data is not acceptable storage. double penalty to guy who worked hard to finish early, since the one who wrote last wins. we might as wel

Re: [tahoe-dev] split brain? how handled in tahoe -- docs?

2012-08-07 Thread Frederik Braun
Considering your argument about lacking documentation and lost data: As far as I know, Tahoe defaults to immutable files. This means that updates (regardless of whether they happen during a split or not) are new files and the old ones are still recoverable. Since garbage collection is pretty slow

Re: [tahoe-dev] split brain? how handled in tahoe -- docs?

2012-08-07 Thread Michael Rogers
I don't think there would be any lost data if you followed erpo41's suggestion and set H higher than half the number of storage nodes. In the event of a partition there would be at most one component with H or more nodes, where writes would succeed; al
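
To make the majority-threshold argument above concrete, here is a minimal sketch. It is not Tahoe-LAFS code: the function names, the node counts, and the reading of H as "minimum number of reachable storage nodes required for a write" are assumptions made for illustration only.

```python
# Illustrative sketch only; names and numbers are hypothetical, not Tahoe-LAFS APIs.

def can_write(reachable_nodes: int, h: int) -> bool:
    """A write succeeds only if at least h storage nodes are reachable."""
    return reachable_nodes >= h


def writable_components(component_sizes, h):
    """Return the sizes of the partition components that could accept writes."""
    return [size for size in component_sizes if can_write(size, h)]


total_nodes = 10
h = total_nodes // 2 + 1  # strictly more than half of the nodes (6 here)

# A 6/4 split: only the larger side can reach h nodes.
print(writable_components([6, 4], h))  # [6]

# An even 5/5 split: neither side can reach h nodes, so no writes succeed.
print(writable_components([5, 5], h))  # []
```

Because h is more than half of the total, two disjoint components can never both reach it, so at most one side of a partition accepts writes. The cost is availability: in the even split, nobody can write.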

Re: [tahoe-dev] split brain? how handled in tahoe -- docs?

2012-08-07 Thread Two Spirit
Since when is lost data considered highly RELIABLE storage? It isn't storage if it doesn't store. in my eyes vanishing data is not acceptable storage. double penalty to guy who worked hard to finish early, since the one who wrote last wins. we might as well rename it to "lossy storage" or "leakage"

Re: [tahoe-dev] split brain? how handled in tahoe -- docs?

2012-08-06 Thread David-Sarah Hopwood
On 07/08/12 01:23, Shawn Willden wrote: > On Mon, Aug 6, 2012 at 5:12 PM, Tony Arcieri > wrote: > > An alternative to make it more robust would be to have vector clocks of > which nodes > modified which data. Tahoe could use this information to produce > "

Re: [tahoe-dev] split brain? how handled in tahoe -- docs?

2012-08-06 Thread Shawn Willden
On Mon, Aug 6, 2012 at 5:12 PM, Tony Arcieri wrote: > An alternative to make it more robust would be to have vector clocks of > which nodes modified which data. Tahoe could use this information to > produce "siblings" in the event that the same file is modified by several > parties. In the event
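
A rough sketch of the vector-clock idea quoted above, for readers unfamiliar with it. This is a generic illustration, not something Tahoe-LAFS implements: the `compare` function, the node names, and the dict-of-counters representation are all assumptions.

```python
# Generic vector-clock comparison; hypothetical names, not Tahoe-LAFS code.

def compare(a: dict, b: dict) -> str:
    """Compare two vector clocks (node name -> update counter).

    Returns "a_newer", "b_newer", "equal", or "concurrent".
    "concurrent" means neither version descends from the other,
    which is the case where both would be kept as siblings.
    """
    nodes = set(a) | set(b)
    a_ahead = any(a.get(n, 0) > b.get(n, 0) for n in nodes)
    b_ahead = any(b.get(n, 0) > a.get(n, 0) for n in nodes)
    if a_ahead and b_ahead:
        return "concurrent"
    if a_ahead:
        return "a_newer"
    if b_ahead:
        return "b_newer"
    return "equal"


# Both sites start from the same version, then each writes during a partition.
base = {"site1": 1, "site2": 1}
site1_version = {"site1": 2, "site2": 1}
site2_version = {"site1": 1, "site2": 2}

print(compare(site1_version, base))           # a_newer
print(compare(site1_version, site2_version))  # concurrent
```

Unlike "last writer wins", the concurrent case is detected rather than silently resolved, so both versions can be preserved for a later merge.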

Re: [tahoe-dev] split brain? how handled in tahoe -- docs?

2012-08-06 Thread Greg Troxel
Two Spirit writes: > If the algorithm is "last writer wins", then any edits by the other > disconnected half are lost. Wouldn't it make sense to approach it like a > source control merge conflict where both revisions are preserved and > presented to the user for the user to resolve? Depending on

Re: [tahoe-dev] split brain? how handled in tahoe -- docs?

2012-08-06 Thread erp...@gmail.com
The way I see it, the goal of tahoe is to provide highly reliable storage--not highly available storage. In my mind, the right solution is to set H to a number higher than half the number of storage nodes and call the problem solved. On Aug 6, 2012 5:12 PM, "Tony Arcieri" wrote: > On Mon, Aug 6,

Re: [tahoe-dev] split brain? how handled in tahoe -- docs?

2012-08-06 Thread Tony Arcieri
On Mon, Aug 6, 2012 at 4:08 PM, Two Spirit wrote: > If the algorithm is "last writer wins", then any edits by the other > disconnected half are lost. Wouldn't it make sense to approach it like a > source control merge conflict where both revisions are preserved and > presented to the user for the

Re: [tahoe-dev] split brain? how handled in tahoe -- docs?

2012-08-06 Thread Two Spirit
If the algorithm is "last writer wins", then any edits by the other disconnected half are lost. Wouldn't it make sense to approach it like a source control merge conflict where both revisions are preserved and presented to the user for the user to resolve? Depending on the length of outage, this co

Re: [tahoe-dev] split brain? how handled in tahoe -- docs?

2012-08-06 Thread Tony Arcieri
On Mon, Aug 6, 2012 at 12:30 PM, Zooko Wilcox-O'Hearn wrote: > “At Virginia Tech Linux and Unix Users Group, we have a working > Tahoe-LAFS deployment of about 9-14 nodes. It's incredibly reliable. > It's based at Virginia Tech, with the introducer on a > university-hosted servers, plus a few node

Re: [tahoe-dev] split brain? how handled in tahoe -- docs?

2012-08-06 Thread Zooko Wilcox-O'Hearn
It isn't that what has been said on this thread so far is *wrong*, exactly. I think it is correct. But contrast the overall impression that one gets from this discussion with this story: “At Virginia Tech Linux and Unix Users Group, we have a working Tahoe-LAFS deployment of about 9-14 nodes. It's

Re: [tahoe-dev] split brain? how handled in tahoe -- docs?

2012-08-04 Thread Wim Lewis
On 8/3/12 4:16 PM, Two Spirit wrote: > well, let me back up one then, if two remote offices share a directory, > and the connectivity between them gets disconnected, each side thinks it > just lost half its repository, then you site1 adds file1, and site2 adds > file2 to their respective halfs, not

Re: [tahoe-dev] split brain? how handled in tahoe -- docs?

2012-08-03 Thread Greg Troxel
Two Spirit writes: > well, let me back up one then, if two remote offices share a directory, and > the connectivity between them gets disconnected, each side thinks it just > lost half its repository, then you site1 adds file1, and site2 adds file2 > to their respective halfs, not knowing the ot

Re: [tahoe-dev] split brain? how handled in tahoe -- docs?

2012-08-03 Thread Two Spirit
well, let me back up one then, if two remote offices share a directory, and the connectivity between them gets disconnected, each side thinks it just lost half its repository, then you site1 adds file1, and site2 adds file2 to their respective halfs, not knowing the other site is not connected. whe

Re: [tahoe-dev] split brain? how handled in tahoe -- docs?

2012-08-03 Thread erp...@gmail.com
Afaik, tahoe doesn't handle multiple users updating the same file at the same time. If you have two workers in two offices working on the same file at the same time, they had better be calling each other on the phone while they're doing it. On Aug 3, 2012 12:13 AM, "Two Spirit" wrote: > Is there

[tahoe-dev] split brain? how handled in tahoe -- docs?

2012-08-02 Thread Two Spirit
Are there any docs on how the "split-brain" scenario is handled in tahoe, where the distributed data stores get split into two halves, both halves work independently of each other, and any file changes cause the halves to diverge? Maybe a simple scenario is two remote corporate sites lo