On 14/03/14 07:14, Trevor Perrin wrote: > > On Wed, Mar 12, 2014 at 1:42 PM, Ximin Luo <[email protected] > <mailto:[email protected]>> wrote: > > Hey guys, I am doing work on partially-ordered transcripts. This was > inspired by Old Blue[0], discussions with George Kadianakis and Trevor > Perrin, as well as background knowledge on git, and immutable > content-addressing schemes as seen in Freenet and Tahoe-LAFS. > > For everyone's interest, here is what I have so far, followed by a > statement of some issues that (AFAICS) have not been discussed before, and > directions for solving them. > > > Hey Ximin, > > We may need a more gentle introduction to this, and pseudocode - not everyone > here saw the discussions on OTR-dev, and I think you've worked out new > details I don't fully understand. >
Thanks for the input - I'll work on spending more time to explain things at a
slower pace, and either re-post or upload it somewhere.
> My take: You're considering problems that arise when trying to achieve
> "transcript consistency" for a multiparty conversation.
>
> The general approach you're considering is for parties to piggyback hash
> values on the messages they send, referring to the messages they've
> previously sent and received. The goal is to protect the context of your
> message, so that if you send a message saying "yes", an attacker can't
> perform replay/reorder/deletion to change what it appears you're responding
> to.
>
The hashIDs also serve another purpose - they give some level of assurance to
the other recipients that they are *seeing the same view* of what I send.
Otherwise, I can send different things to different people.
> You're raising a couple questions about what happens when users join / leave
> the conversation.
>
> You point out that piggybacked hashes sent to new users might leak
> information about messages before the new users joined.
>
> I'm not sure that's a big deal. But at least in a "pairwise" situation where
> each message is separately encrypted to each recipient (instead of using a
> group key and broadcast medium), wouldn't it be easy to omit old hashes to a
> new member?
>
Due to hashIDs needing to be the same for each message, we send the same actual
ciphertext to all recipients, but they decrypt/read different portions of it.
To save space, there is only one copy of the message contents, including the
parent pointers - but the path of which parts each recipient needs to decrypt,
is different, e.g. roughly ciphertext = {E(K1,K) E(K2, K) E(K,Msg)}.
I could indeed separately-encrypt different parents to different recipients.
The difficulty would be to do this in a restricted manner, to *only* allow the
case we're describing, to make sure that people can't send
arbitrarily-inconsistent parents. It would also make the message format more
complex.
I'll explore this idea a bit more - thanks for the tip. It is similar in
principle to my other idea ("show the same parents, but add some annotation to
tell the new user he shouldn't expect to see a subset of them") and probably
not much more complex.
> I think you were also concerned about whether users joining/leaving the
> conversation could get things out of sync? It seems you've resolved that
> concern, but I admit I didn't quite follow.
>
The stuff about "not-after"/"after" in the first email was maybe not the most
direct way of explaining things, but it was what I had in my confused head at
the time. Perhaps the following model is easier to follow:
At all times, the chat must have a member-set ("current participants") - this
is how you decide, "who to send my next message to". If your transcript only
has one "most recent" message, we can define it simply as the member-set of the
previous message (i.e. its sender plus recipients, embedded into each message),
plus/minus anyone you want to invite/kick.
This definition is also *constant* (wrt who you are in the chat), given a
constant transcript. This is important, for consistency between members.
The problem is when we have a fork in our transcript, then who do we send our
next message to? It turns out that, we need a merge algorithm. Each node
(message) has a member-set, edges represent join/part/NOP operations to this
member-set, and we need to merge these together when we have a fork.
Every version-control system has one of these algorithms, but sometimes they
result in edit conflicts. With line-by-line content, it is theoretically
impossible to automatically resolve some of these (there is a proof floating
around somewhere).
This was my concern in the first email. However, I later showed that with
unordered sets (with no additional structure that needs to be preserved), like
"members in a chat", we can actually automatically-resolve all of these.
However, the merge algorithm in my previous email has a (fixable) flaw, which I
describe below. But if you didn't follow the previous few paragraphs, please
let me know - the rest of this email will not make sense otherwise.
I explored the area a bit more today, and concluded that the merge_mem I
described in my previous email doesn't quite work like we need it to. It is
indeed consistent and verifiable, however it does not actually behave
intuitively, as I asserted.
In the following graphs, the node labels are the set of members (i.e.
"participants in the chat") of each node, e.g. "bc" means "this node/message
has users b and c". For brevity, I also use this as an identifier for the node,
when describing inputs to merge_mem(). In the diagrams below, this is ok,
because each node has a unique member-set that no other node has (which is
generally not the case, of course).
~~~~
digraph multiroot {
x;
bcd -> bc [label="+d"];
b -> bc [label="-c"];
rankdir=BT;
{ rank=same; bc x }
}
~~~~
https://people.torproject.org/~infinity0/res/mcrypt/po-merge-set-multiroot.png
Intuitively, we have {bc}, +d, -c and {x}, so we want merge_mem(bcd, b, x) ==
bdx. However, the algorithm I outlined in my previous email returns bcdx
instead.
At first I thought it was a problem with my generosity of not disallowing
multiple roots. But, in the algorithm I defined previously, it simply treats
multiple roots *as if they all have one same (invisible) parent*, where that
parent's member-set is the empty set.
The problem shows up in the same way, if we add a member {a} to all of the
nodes, and show the empty-set (now with {a}) node explicitly:
~~~~
digraph singleroot {
ax -> a [label="+x"];
abc -> a [label="+bc"];
abcd -> abc [label="+d"];
ab -> abc [label="-c"];
rankdir=BT;
{ rank=max; a }
}
~~~~
https://people.torproject.org/~infinity0/res/mcrypt/po-merge-set-singleroot.png
Again, we want merge_mem(abcd, ab, ax) == abdx, but the algorithm defined
previously returns abcdx.
The problem stems from the fact that our previous algorithm treats all nodes
equally. However, in the cases above, {bcd, b} have a more recent parent than
x, so we should really merge bcd, b together first, before we try to merge it
with x. It will be a bit more complex trying to specify this precisely, though.
X
--
GPG: 4096R/1318EFAC5FBBDBCE
git://github.com/infinity0/pubkeys.git
signature.asc
Description: OpenPGP digital signature
_______________________________________________ Messaging mailing list [email protected] https://moderncrypto.org/mailman/listinfo/messaging
