Re: [messaging] Merging partially-ordered membership operations was Re: Message order in group chat (attempt at summary)

Ximin Luo Mon, 21 Apr 2014 06:25:25 -0700

On 21/04/14 04:31, Trevor Perrin wrote:
> On Sun, Apr 20, 2014 at 5:20 PM, Ximin Luo <[email protected]> wrote:
>>
>> I'm unsure what you're trying to achieve with M, D.
> 
> I'd like to understand the algorithms for tracking causal order and
> group membership.
> 
> There's different membership policies, depending on who can add
> members (anyone? group leader?) and who can delete (anyone? group
> leader? you can delete yourself?).  But I think the "anyone can add
> and delete" policy is the most complex, so if we can handle that we're
> in good shape.
> 
> I think my algorithm works.  Yours probably works too, though I'd like
> to see a fuller writeup.
> 
> 
>> Come to think of it, I think you will actually never get *any* items in D. 
>> If an entry (mId, member) is in D for a message, then it must also be in M, 
>> since mId is causally before that message.
> 
> No, I meant the following:
>  - A "member-add" message would result in (mId, member) being added to
> that message's M.
>  - A "member-delete" message would move all entries associated with
> member from M to D.
> 
> 
>> But then, you ought to delete this entry from both D and M.
> 
> The only case where an entry in D could be removed (or ignored) was
> once it becomes irrelevant (all references to mId in any message's M
> have been garbage-collected).
> 
>


I think this does work, but I see a few issues. It leaks some information about 
previous members in the session - I don't think the garbage-collection would be 
enough to hide this. Also, I'm generally uncomfortable with relying on 
redundant information - M and D can be deduced by examining previous history, 
which is the canonical place where this information is generated.

If you want to support incomplete views of history (e.g. so that new joiners 
don't see old history) this redundancy may present some problems, since new 
joiners can't verify that M and D are consistent with the old history.

In a more complex situation, this could lead to unreasonable behaviour (perhaps 
severe enough to call "an attack"). In the diagram below, we do a merge (C) of 
2 branches (A,B) where Bob joined separately in each branch. Then (in C) M 
should have two entries (A, Bob), (B, Bob). When Alice then adds Carol (in F) 
she can misrepresent M to only contain one entry (B, Bob). Will the other 
members raise an alert about this? If not, then Carol is forced to trust Alice. 
When she deletes Bob (in G), she will only move one entry to D, when she ought 
to move both entries. Later, a merge (H) of (A,G) will cause Bob to be added 
back into the conversation, even though this should not be the case, since Bob 
was removed in a message (G) strictly later than both A and B.

O -- A -----------------
 \       \              \
  -- B -- C -- F -- G -- H

In the situation where everyone can add-and-delete, this can be fixed manually 
(if the humans notice...). But this may cause problems in more 
restricted-policy scenarios.

In general, I believe that {minimising reliance on redundant information} is a 
good principle to follow when designing secure protocols, even if it makes your 
algorithms more complex. Though, I don't think my algorithm is *too* complex, 
at least less so than e.g. a block cipher.

To support partial views, my approach is to have a looser concept of "when" 
someone joins/parts. I believe the general merge algorithm is compatible with 
subjective incomplete views of the history. For example, using the syntax 
mId{members}:

A{p} <-- B{p,q} <-- C{p,q,r}.

p thinks that q joined at message B. However, from r's point of view, q only 
joined at message C (B and A don't exist for them).

I appreciate all of this is getting a bit complicated and I will try to write 
this up. The algorithm itself is not too bad (being very similar to git; see 
below), but the explanation on "why it works" is more complex, especially when 
extending the justification to the incomplete-view scenario above.

>> If the LCA is multiple sets, you recursively merge these first - this is 
>> what git means when it says "merge by recursive". If you merge multiple 
>> parents, you fold over the pairwise merge - this is what "git merge-base" 
>> does.
> 
> Do you mean something like this?
> 
> http://codicesoftware.blogspot.com/2011/09/merge-recursive-strategy.html
> 
> That looks complicated.  Maybe you should try to write out the
> pseudocode for all this, so we can evaluate.
> 

Here is the pseudocode with type annotations, but it will take me a lot longer 
to write up the justification:

(set[UId] is the type of a member-set, A|B is set union, A-B is set subtraction)

merge_mem(parents: set[MId]) -> set[UId]:
    """The merged members of the given set of messages.

    If parents is empty, return empty set.
    If parents is a singleton, return members(parents[0]).
    Otherwise, same as members(foldr(merge_mem_2, parents)), except that we
    do not actually add any nodes to the DAG, where merge_mem_2, etc. are
    defined as:

    merge_mem_2(a: MId, b: MId) -> MId:
        merged = new Message {
            parents = {a, b}
            members = 3_way_merge(merge_mem(lca(a,b)), members(a), members(b))
        }
        graph.add(merged)
        return merged.mId

    3_way_merge(O: set[UId], A: set[UId], B: set[UId]) -> set[UId]:
        := A | (B-O) - (O-B)
        := A - (O-B) | (B-O)
        := B | (A-O) - (O-A)
        := B - (O-A) | (A-O)
        # all four definitions are equivalent

    lca(messages: set[MId]) -> set[MId]:
        := max(anc(messages)) # latest common ancestors

    anc(messages: set[MId]) -> set[MId]:
        := { p | p <= m for all m in messages }

    merge_mem_2 is commutative and associative, so the order of the fold
    doesn't affect the final result of merge_mem. (TODO: formally prove
    associativity. So far we only do this "by example" in the tests.)

    NOTE: implementations will want a slightly modified version of 
    merge_mem_2 that doesn't add any new nodes, and uses an accumulator. 
    We decided to describe this simpler version instead, so as to not to 
    visually detract from the core purpose of our algorithm.
    """

(I do have the actual implementation as well, but this version is easier to 
explain and understand. The performance is fine, at least good enough for 
chatting with; one can optimise the LCA calculation.)

X

-- 
GPG: 4096R/1318EFAC5FBBDBCE
git://github.com/infinity0/pubkeys.git

signature.asc
Description: OpenPGP digital signature

_______________________________________________
Messaging mailing list
[email protected]
https://moderncrypto.org/mailman/listinfo/messaging

Re: [messaging] Merging partially-ordered membership operations was Re: Message order in group chat (attempt at summary)

Reply via email to