Nathaniel Smith wrote:
> On Fri, Mar 02, 2007 at 03:37:50PM +0100, Ulf Ochsenfahrt wrote:
*snip*
> Note that the way netsync is currently set up, every new revision is first sent without any branch info, and then the branch info is sent for that revision. So effectively every branch cert you send looks like you are trying to steal permission to look at a pre-existing revision.
>
> I'm not sure how to solve this -- I suppose each connection would have to track in memory the complete set of revisions that read access was granted to, and also all revisions that have been sent down this connection?
Only if netsync keeps that ordering. If netsync orders revisions and branch certs so that the related pieces of information are sent close together, the receiving side can already write a bunch of stuff to the database as it arrives. That might make netsync more complicated, though. I presume that the revisions are already ordered so that the oldest stuff comes first?
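To make the ordering idea concrete, here is a minimal sketch (not monotone's actual netsync code; the names `sync_stream`, `parents`, and `branch_certs` are hypothetical) of a stream that sends revisions in topological order, with each revision's branch certs immediately after it, so the receiver can authorize and commit as it goes:

```python
# Sketch: interleave revisions and their branch certs in one topologically
# sorted stream. Parents always precede children, and a cert never arrives
# before the revision it belongs to.
from collections import deque

def sync_stream(revisions, parents, branch_certs):
    """revisions: iterable of ids; parents: id -> list of parent ids;
    branch_certs: id -> list of (branch, cert) pairs."""
    indegree = {r: 0 for r in revisions}
    children = {r: [] for r in revisions}
    for r in revisions:
        for p in parents.get(r, []):
            if p in indegree:          # only count parents inside this sync set
                indegree[r] += 1
                children[p].append(r)
    queue = deque(r for r in revisions if indegree[r] == 0)
    while queue:
        r = queue.popleft()
        yield ("revision", r)
        for cert in branch_certs.get(r, []):
            yield ("cert", r, cert)    # cert follows its revision immediately
        for c in children[r]:
            indegree[c] -= 1
            if indegree[c] == 0:
                queue.append(c)
```

With this ordering, the receiver can check read/write permission per item and write each revision-plus-certs group to the database right away, instead of buffering the whole transfer.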
> I don't know if there are other similar security problems -- netsync is complicated. But I guess if you want this feature, you should make sure you know that you have thought of everything :-). Everyone would be perfectly happy to see more capable access controls for netsync, it's just not clear how to actually _get_ such a thing without redesigning whole chunks.
Conceptually, I think there is an easy way to think about this: at the db abstraction layer, simply let netsync not 'see' revisions that the other side doesn't have read access to, and block branch certificates for branches that the other side doesn't have write access to. The only problem is that you could end up with revisions without a branch cert.
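A minimal sketch of that 'filtering at the db layer' idea (all names here, `FilteredDB`, `perms`, etc., are hypothetical illustrations, not monotone's real interfaces):

```python
# Sketch: wrap the database so that unreadable revisions are simply
# invisible to the peer, and certs on unwritable branches are refused.
class FilteredDB:
    def __init__(self, db, perms, peer):
        self.db, self.perms, self.peer = db, perms, peer

    def revisions(self):
        # The peer never 'sees' revisions it cannot read.
        return [r for r in self.db.revisions()
                if self.perms.can_read(self.peer, r)]

    def put_branch_cert(self, rev, branch, cert):
        # Block certs for branches the peer may not write to.
        if not self.perms.can_write(self.peer, branch):
            return False
        self.db.put_branch_cert(rev, branch, cert)
        return True
```

Since netsync would only ever talk to the wrapper, the rest of the protocol wouldn't need to know about permissions at all; the open question from above remains what to do with revisions that end up with no branch cert.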
> BTW, the sourceforge example is a red herring for other reasons -- all our code assumes that network operations are generally database-wide, so their efficiency tends to be O(whole database), not O(subsets of database involved in this particular sync).
I can't argue with you here, but I was under the impression that the merkle trie is only built for the branch pattern that is to be synced.
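What I mean by that, as a toy sketch (a gross simplification of netsync's real merkle tries; `merkle_root` and `root_for_pattern` are made-up names): a refinement structure built only over the certs whose branch matches the sync pattern, so comparison cost scales with the selected subset, not the whole database.

```python
# Sketch: hash only the items selected by the branch pattern into a
# merkle root; two sides can compare roots for just that subset.
import hashlib
import fnmatch

def merkle_root(items):
    """items: list of byte strings; pairwise-hash up to a single root."""
    level = [hashlib.sha1(i).digest() for i in sorted(items)]
    if not level:
        return hashlib.sha1(b"").digest()
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            pair = level[i] + (level[i + 1] if i + 1 < len(level) else b"")
            nxt.append(hashlib.sha1(pair).digest())
        level = nxt
    return level[0]

def root_for_pattern(certs, pattern):
    """certs: list of (branch, payload); hash only matching branches."""
    return merkle_root([p for b, p in certs if fnmatch.fnmatch(b, pattern)])
```

If the trie really is built per pattern like this, the O(whole database) cost would only apply to whatever indexing is needed to find the matching certs in the first place.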
> This is not particularly fixable -- trying to sieve out 100 megabytes of relevant data from a multi-gigabyte (or multi-terabyte, for sourceforge) database is never going to be fast; you need some kind of lower level data partitioning.
Google performs fairly well for multi-???byte queries, although I don't know if that is a valid comparison.
On the other hand, if netsync is extended to handle multi-db syncs on a single connection, wouldn't that solve that very same problem?
With the downside that identical data can't be shared among dbs.
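To make that trade-off concrete, a purely illustrative sketch (not monotone's storage; `SharedStore` is a made-up name): with one store per database the same file content is kept once per db, whereas a content-addressed store shared across dbs would keep a single copy keyed by hash.

```python
# Sketch: a content-addressed blob store that multiple databases could
# share, so identical content is stored only once.
import hashlib

class SharedStore:
    def __init__(self):
        self.blobs = {}                # hash -> content, shared by all dbs

    def put(self, content):
        key = hashlib.sha1(content).hexdigest()
        self.blobs[key] = content      # identical content stored only once
        return key

    def get(self, key):
        return self.blobs[key]
```

With separate per-project databases, each one would hold its own copy of any content they happen to have in common.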
> So in any kind of large hosting situation, you would certainly be using some sort of vhost support and multiple databases anyway. Also, sourceforge probably wants a security model that is simple to analyze, which netsync-based security is unlikely to be...
Is there any documentation on the database layout/netsync protocol?

Cheers,

-- Ulf
_______________________________________________
Monotone-devel mailing list
Monotone-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/monotone-devel