On 20/04/14 19:33, Trevor Perrin wrote: > On Sat, Apr 19, 2014 at 9:16 AM, Ximin Luo <[email protected]> wrote: >> On 19/04/14 16:45, Trevor Perrin wrote: >>> On Sat, Apr 19, 2014 at 6:53 AM, Ximin Luo <[email protected]> wrote: >>>> On 19/04/14 00:54, Trevor Perrin wrote: >>>>> >>>>> To handle "join" and "part" events, I think we can assume these >>>>> actions are represented as messages which operate on the "member-set" >>>>> each participant recognizes. The question Ximin has thought about >>>>> earlier is how to handle "merging" partially-ordered changes to the >>>>> member-set >>> [...] >>>> >>>> I have an algorithm (with source code and a few test cases) for this, plus >>>> a semi-formal argument on why it works, and also why it's the unique >>>> solution. It would be good to formalise it. It's probably a little too >>>> technical to post here but I can explain it in more detail next time we >>>> meet. >>> >>> Thinking more: if a group chat allows member-add but not >>> member-delete, then this is is easy, right? The merged member-set is >>> just the union of the parent member-sets? >>> >>> With delete, there's not a perfect solution: what if one parent branch >>> adds Bob, and another adds Bob, then deletes him? Since these >>> branches aren't ordered, I think it's ambiguous whether the merge >>> should contain Bob or not. >>> >> >> I'd argue the correct merge is that it should contain Bob. Perhaps I'm >> biased in my assessment, since this is what my algorithm does :p > > OK. So a "member-delete" message would cancel all causally-prior > "member-adds" of that member (though not necessarily all > temporally-prior ones). > > Seems reasonable. Does the below work? - > > Clients track sent/received messages, but garbage-collect messages > more than X seconds older than the latest message (i.e. messages are > garbage-collected once they are old enough that new messages are > expected to refer to later ones). > > For each message, clients track: > - a member-set M, each member associated with the messageID(s) that added it > - a deleted-member-set D, each member associated with the > causally-prior messageID(s) that added the member being deleted > > The client's view of the current M and D is derived by merging M and D > from its immediate predecessor messages: > - M = union(M from predecessors), D = union(D from predecessors) > - if there is a member listed in M and D, delete from the member in M > all the messageIDs listed for the member in D. Then, if the member > has no messageIDs in M left, delete the member from M. > > When messages are garbage-collected, if the last mention of a > messageID for a member in M is removed, then the same messageID can be > removed for that member in all D. (In other words: once a member-add > has been fully cancelled by a member-delete in all messages that a new > message could refer to, the member-delete no longer needs to be > tracked.) >
I'm unsure what you're trying to achieve with M, D. If you only want to achieve
the cancel behaviour, tracking which message "joined" a member is unnecessary.
The cancel behaviour is actually only a by-product of the general merge
algorithm, which works by finding the Latest-Common-Ancestor between merge
parents. This is what git does, and I assume darcs/hg too, and is what inspired
my algorithm.
Come to think of it, I think you will actually never get *any* items in D. If
an entry (mId, member) is in D for a message, then it must also be in M, since
mId is causally before that message. But then, you ought to delete this entry
from both D and M.
If you designed the above specifically to be able to discard old history, I
think this is still possible with my merge algorithm - it will never need
information before LCA({everyone's latest message}), so you can discard
everything before those messages.
For your example above, it would work as follows (same behaviour for git, and
my unordered-sets version):
Say r is tip of the first branch (members={A}), and s is the tip of the second
branch (members={A,B}). Then p (members={A}) is the LCA of r, s. Then, the
merged message m must have member-set:
members(m)
= apply(r, diff(p, s)) (or equivalently) apply(s, diff(p, r))
= apply({A}, (+{B}, -{})) (or equivalently) apply({A, B}, (+{}, -{}))
= {A, B}
So you see, the fact that you added-then-deleted Bob from the first branch, is
bypassed by the algorithm. I am pretty certain the LHS == RHS always, which
makes the algorithm symmetrical wrt the order of merge-parents, but I haven't
yet formally proved this.
If the LCA is multiple sets, you recursively merge these first - this is what
git means when it says "merge by recursive". If you merge multiple parents, you
fold over the pairwise merge - this is what "git merge-base" does. (I
originally had a simpler behaviour that I thought was a short-cut, but it did
the wrong thing; I now better understand why git-merge-base works the way it
does.)
X
> ---
>
> This is getting complicated, would be great if someone could create a
> wiki or something and try to list:
> - threats we're defending against
> - various assumptions that can be made about the underlying network,
> group semantics, and UI
> - algorithms and their options
>
>
> Trevor
>
--
GPG: 4096R/1318EFAC5FBBDBCE
git://github.com/infinity0/pubkeys.git
signature.asc
Description: OpenPGP digital signature
_______________________________________________ Messaging mailing list [email protected] https://moderncrypto.org/mailman/listinfo/messaging
