Re: [Monotone-devel] Re: encrypted monotone (and digression on

2006-07-11 Thread Daniel Carosone
On Mon, Jul 10, 2006 at 05:35:53PM -0700, Graydon Hoare wrote:
 3. That buffer is immediately appended to a heap std::string and data is 
 parsed from there using safer extractor functions. The extractor 
 functions all test the length of every extraction against the string 
 length, and assert fatally if they are asked to pass the end of the 
 string they're reading from.

Although an example of careful programming for different objectives,
this sounds like a way to DoS/crash a server.
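
For readers following along, here is a minimal sketch of the kind of
length-checked extractor described in the quoted point 3. It is not
monotone's netsync code: extract_u32, extract_bytes and extract_error are
made-up names, and the wire format (big-endian, fixed width) is an
assumption. The sketch throws instead of aborting, so a server could in
principle drop the offending session rather than crash.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical error type: raised when a packet is shorter than claimed.
struct extract_error : std::runtime_error
{
  using std::runtime_error::runtime_error;
};

// Read a big-endian 32-bit value from BUF at POS, advancing POS.
// The remaining length is checked before any byte is touched.
inline std::uint32_t
extract_u32(std::string const & buf, std::size_t & pos)
{
  if (pos > buf.size() || buf.size() - pos < 4)
    throw extract_error("extract_u32: read past end of buffer");
  std::uint32_t v = 0;
  for (int k = 0; k < 4; ++k)
    v = (v << 8) | static_cast<unsigned char>(buf[pos + k]);
  pos += 4;
  return v;
}

// Read LEN raw bytes from BUF at POS, advancing POS.
inline std::string
extract_bytes(std::string const & buf, std::size_t & pos, std::size_t len)
{
  if (pos > buf.size() || buf.size() - pos < len)
    throw extract_error("extract_bytes: read past end of buffer");
  std::string out = buf.substr(pos, len);
  pos += len;
  return out;
}
```

A caller would catch extract_error at the session boundary and close only
that connection.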

The other points all sound good - at least necessary, if not
sufficient :-)

Another possible interpretation of the question is around data
confidentiality, assuming all the other points are addressed. If I
expose a monotone server containing a collection of branches, even
with all the process containment tricks, I have to rely on monotone's
internal security controls regarding selective access to db contents.
So it's valid to question the robustness of these controls and any
implementation or deployment caveats around them.  I'm not really sure
if this was part of the OP's concern.

--
Dan.



___
Monotone-devel mailing list
Monotone-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/monotone-devel


Re: [Monotone-devel] Re: encrypted monotone (and digression on

2006-07-11 Thread jcrisp
I believe a company called Fortify will allow you to run their security
validation tool (DFA style expert system) against open source code for
free. If I remember properly they found several exploitable issues in the
Kernel.

Might be worth a look.

Joel

 Rob Schoening wrote:
 I have a somewhat unrelated question that touches on a more fundamental
 security issue:

 what is the relative security risk of running netsync on a public port
 assuming it's running as a non privileged user?  how much of a
 vulnerability is it for the host that's serving it?

 Nobody's shown us exploits yet, but it would be foolish for me to imply
 that none exist or are possible. I can point to a few things that might
 reassure you. Whether they do is another matter.

 1. Monotone authenticates users (by RSA-signing a nonce and requesting
 an RSA signature in response) before anything else. One may be able to
 DoS the server (in a CPU sense) if anonymous requests are permitted; if
 you insist on authenticated connections from known clients, this risk is
 reduced.

 2. Monotone does ::read() off a network socket and into a fixed-size
 stack buffer. However, it does this in exactly one place (netsync.cc,
 session::read_some()) and always issues the read call for the full
 length of the buffer, starting at its beginning, and never restarts the
 read or tries to mix parsing and reading.

 3. That buffer is immediately appended to a heap std::string and data is
 parsed from there using safer extractor functions. The extractor
 functions all test the length of every extraction against the string
 length, and assert fatally if they are asked to pass the end of the
 string they're reading from. If there's insufficient data for a complete
 command packet during parsing, we give up and restart parsing from the
 string's beginning next time we receive data.

 4. Other major parsing points are basic_io.{cc,hh} and xdelta.cc; it is
 possible that those contain logic that can be tricked into indexing past
 the end of the std::strings they're reading from. I'd be happy to go
 through them with a concerned reader doing an audit / inserting more
 dynamic checks / adding tests that try specific attacks.

 5. With the exception of misbehavior in glibc during getaddrinfo() and
 setlocale(), we appear to be valgrind-clean.

 6. You should be able to chroot / jail / zone / otherwise sandbox us, so
 long as we can access libstdc++, libc, libnss, and our database. We also
 need to be able to create transient journal files in the directory we
 keep the database in, as part of the page-transaction system in sqlite.

 Still, it's a nontrivial program, you're right to be concerned. Even if
 you trust our code, we also inherit the possibility of vulnerabilities
 from sqlite, botan, lua, idna, and boost. We do a fair bit of input
 validation, don't call printf, are careful to avoid malloc/free or use
 of raw pointers, etc. but it's hard to be sure.

 -graydon





Re: [Monotone-devel] Re: encrypted monotone (and digression on

2006-07-11 Thread Nathaniel Smith
On Tue, Jul 11, 2006 at 10:16:18AM +0100, [EMAIL PROTECTED] wrote:
 I believe a company called Fortify will allow you to run their security
 validation tool (DFA style expert system) against open source code for
 free. If I remember properly they found several exploitable issues in the
 Kernel.
 
 Might be worth a look.

Well, of course, feel free :-).

-- Nathaniel

-- 
- Don't let your informants burn anything.
- Don't grow old.
- Be good grad students.
  -- advice of Murray B. Emeneau on the occasion of his 100th birthday




[Monotone-devel] Review of diff-p branch

2006-07-11 Thread Nathaniel Smith
+/* Find, and write to ENCLOSER, the nearest line before POS which matches
+   ENCLOSER_PATTERN.  We remember the last line scanned, and the matched, to
+   avoid duplication of effort.  */
+
+void
+hunk_consumer::find_encloser(size_t pos, string & encloser)
+{
+  typedef vector<string>::const_reverse_iterator riter;
+
+  if (!encloser_re)
+    return;
+
+  riter last = a.rbegin() + (a.size() - encloser_last_search);
+  encloser_last_search = pos;
+
+  for (riter i = a.rbegin() + (a.size() - pos); i != last; i++) {
+    if (boost::regex_search (*i, *encloser_re))
+      {
+        encloser_last_match = a.size() - (i - a.rbegin());
+        L(FL("find_encloser: from %u matching line %d, \"%s\"")
+          % pos % encloser_last_match % *i);
+
+        // the number 40 is chosen to match GNU diff.  it could safely be
+        // increased up to about 60 without overflowing the standard
+        // terminal width.
+        encloser = string(" ") + (*i).substr(0, 40);
+        return;
+      }
+  }
+
+  if (encloser_last_match)
+    {
+      ssize_t i = encloser_last_match;
+      L(FL("find_encloser: from %u matching cached %d, \"%s\"")
+        % pos % i % a[i]);
+      encloser = string(" ") + a[i].substr(0, 40);
+    }
+}
^^ I think I'd feel more comfortable with some I()'s scattered
around here?  It is Clever, and involves Pointers, you see.


+    for (i = hunk.begin(); i != hunk.end(); i++)
+      if ((*i)[0] != ' ')
+        {
+          first_mod = i - hunk.begin();
+          break;
+        }
+
+    find_encloser(a_begin + first_mod, encloser);
+    ost << " @@" << encloser << endl;
^^ The way the unidiff and context diff writers go back and parse
their own output seems a bit... well, odd, anyway?


The docs weren't fully updated when the code changed (and had a
weird formatting issue, with a line starting with a period).  I fixed
this.

Merged to mainline.

-- Nathaniel

-- 
Damn the Solar System.  Bad light; planets too distant; pestered with
comets; feeble contrivance; could make a better one myself.
  -- Lord Jeffrey




Re: [Monotone-devel] encrypted monotone (and digression on

2006-07-11 Thread Jeronimo Pellegrini
On Mon, Jul 10, 2006 at 12:21:46PM -0700, Nathaniel Smith wrote:
 Well, umm, blame cmarcelo, I guess :-):
   http://del.icio.us/tag/monotone

Ah, right. That's Caio.

 As a practical matter, I find it unlikely that the FSF will release a
 GPL v3 that somehow cannot be applied to, say... gnupg.
 
 Consult a lawyer etc. etc., but personally I'd just slap v2 or later
 on it and worry about v3... later.  Like, after it actually exists
 :-).

I think you're right. I'm a bit paranoid when it comes to legal
issues, but OK -- GPL v2 it is!

 (In the mean time, a number of people, myself included, will not want
 to look at any non-free code, regardless of author's expressed plans.)

I never intended to make it non-free. It's just that I wasn't
OK with that problem (in particular, since the system could be used to
encrypt source code, this could be a problem -- but I'll just use the
GPL anyway).

 Ah, makes sense -- so it is push/pull only?  What do you do to allow
 incremental pull?  (Or do you?  And if not, how does it differ from
 gpg --encrypt foo.mtn? ;-))

Actually, it converts your ordinary database to an encrypted one (so you
keep both on your desktop/laptop/whatever). When you synchronize, you
use the encrypted database. On untrusted hosts you can keep only the
encrypted one (the keys won't ever get there, since they're not
necessary for syncing).
Since the packets are individual files in the encrypted database, when
they are synced only the relevant ones are transmitted.

I'll work more on that page and on packaging the code later. Right now
the site is probably not up to date with the work, and I'm not sure if
the mtn server is working there. (Have to work a lot on another project
now, can't stop!)

J.





Re: [Monotone-devel] Re: Monotone-devel Digest, Vol 39, Issue 15

2006-07-11 Thread Nathaniel Smith
On Mon, Jul 10, 2006 at 10:47:35AM -0700, Eric Anderson wrote:
 [EMAIL PROTECTED] writes:
   From: Nathaniel Smith [EMAIL PROTECTED]
   Subject: Re: [Monotone-devel] Re: Monotone-devel Digest, Vol 39, Issue
  15
   
   [ code to check that mtn process is still alive after sleep is wrong ]
 
 I just saw the code in mtn.py that does a sleep(3) in order to wait
 for the server to get going.  Take a look at the revised attached
 patch which does the check and will also bail out quickly if the
 sub-process fails.

Thanks!  I just committed a modified version of this same idea as
a3490b231ad2574cb876ee7dcd5e062a8dd0e1d8.

-- Nathaniel

-- 
But in Middle-earth, the distinct accusative case disappeared from
the speech of the Noldor (such things happen when you are busy
fighting Orcs, Balrogs, and Dragons).




[Monotone-devel] diff -p default?

2006-07-11 Thread Nathaniel Smith
Now that we have diff -p support on mainline, is there any reason we
shouldn't make it the default?

(For those who haven't run across this before, diff -p gives output
like:

--- hello   80ad86578e12a12c838cd4ff7ca226aa6bcc44e9
+++ hello   94ebfe438b30bf18631c1846b2891b818f46aa23
@@ -9,3 +9,9 @@ int main()
 {
 say_hello();
 }
+
+void say_goodbye()
+{
+printf("goodbye\n");
+}
+

In particular, note the int main() stuck on the end of the @@ line,
to give you context when reading the patch.)

AFAIK it's still compatible with patch(1) and the various other tools
out there.

-- Nathaniel

-- 
Sentience can be such a burden.




Re: [Monotone-devel] Review of diff-p branch

2006-07-11 Thread Nathaniel Smith
On Tue, Jul 11, 2006 at 02:48:22PM -0700, Zack Weinberg wrote:
 On 7/11/06, Nathaniel Smith [EMAIL PROTECTED] wrote:
 +/* Find, and write to ENCLOSER, the nearest line before POS which matches
 +   ENCLOSER_PATTERN.  We remember the last line scanned, and the matched, to
 +   avoid duplication of effort.  */
 ...
 ^^ I think I'd feel more comfortable with some I()'s scattered
 around here?  It is Clever, and involves Pointers, you see.
 
 Not really pointers, just iterators and a lot of complicated
 vector-index arithmetic, but see attached; maybe it's clearer?

std::vector iterators are pretty thin wrappers around pointers -- have
a lot of the same risks.  The patch looks fine; I just feel more
warm-and-fuzzy when complicated arithmetic is going on, if the
assumptions are also documented in code :-).
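
As an illustration of documenting such assumptions in code, here is a
small sketch of reverse-iterator arithmetic with the invariants spelled
out. It uses plain assert() where monotone would use its I() macro, and
line_before and index_of are made-up helpers, not part of the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

typedef std::vector<std::string>::const_reverse_iterator riter;

// Return a reverse iterator positioned on a[pos - 1], so that walking it
// forward visits a[pos - 1], a[pos - 2], ..., a[0].  For pos == 0 the
// result is a.rend().  (Hypothetical helper for this sketch.)
inline riter
line_before(std::vector<std::string> const & a, std::size_t pos)
{
  // Assumption made explicit: POS may equal a.size() but not exceed it,
  // otherwise the unsigned subtraction below would wrap around.
  assert(pos <= a.size());
  riter i = a.rbegin() + static_cast<std::ptrdiff_t>(a.size() - pos);
  // i.base() is the forward iterator one past the element i refers to.
  assert(static_cast<std::size_t>(i.base() - a.begin()) == pos);
  return i;
}

// Convert a reverse iterator back to the forward index of its element.
inline std::size_t
index_of(std::vector<std::string> const & a, riter i)
{
  // rend() refers to no element, so it has no index.
  assert(i != a.rend());
  return a.size() - 1 - static_cast<std::size_t>(i - a.rbegin());
}
```

If the index arithmetic is ever changed, the assertions fail at the point
where the assumption breaks rather than somewhere downstream.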

Looks fine to commit.

 ...
 The way the unidiff and context diff writers go back and parse
 their own output seems a bit... well, odd, anyway?
 
 The requirement is to pass find_encloser() an index which is one less
 than the first changed line (+, -, or !) in the hunk; this is not
 always a_begin+ctx, as a previous version tried.  I could have the
 writers keep track of which line this is, but honestly it seemed
 clearer to me to do it this way.

Fair enough.  It's hardly the weirdest bit of code in monotone :-).

-- Nathaniel

-- 
...these, like all words, have single, decontextualized meanings: everyone
knows what each of these words means, everyone knows what constitutes an
instance of each of their referents.  Language is fixed.  Meaning is
certain.  Santa Claus comes down the chimney at midnight on December 24.
  -- The Language War, Robin Lakoff




Re: [Monotone-devel] Review of diff-p branch

2006-07-11 Thread Zack Weinberg

 Not really pointers, just iterators and a lot of complicated
 vector-index arithmetic, but see attached; maybe it's clearer?

std::vector iterators are pretty thin wrappers around pointers -- have
a lot of the same risks.  The patch looks fine; I just feel more
warm-and-fuzzy when complicated arithmetic is going on, if the
assumptions are also documented in code :-).

Looks fine to commit.


Done.

zw




Re: [Monotone-devel] Re: branch review for net.venge.monotone.multihead

2006-07-11 Thread Zack Weinberg

I rewrote CMD(merge) again according to your suggestions; please have a look?

I was thinking about using commit date as a further heuristic, i.e.
when we have two LCAs neither of which is an ancestor of the other,
merge the newest one first; furthermore, when we have three or more
heads with the same LCA, merge the newest two first.  However, it
seems like a huge pain to get from a revision_id to its commit date,
and in fact I'm not sure the date cert is guaranteed to exist.
Thoughts?

zw




[Monotone-devel] Re: branch review for net.venge.monotone.multihead

2006-07-11 Thread Bruce Stephens
Zack Weinberg [EMAIL PROTECTED] writes:

 I was thinking about using commit date as a further heuristic, i.e.
 when we have two LCAs neither of which is an ancestor of the other,
 merge the newest one first; furthermore, when we have three or more
 heads with the same LCA, merge the newest two first.  However, it
 seems like a huge pain to get from a revision_id to its commit date,
 and in fact I'm not sure the date cert is guaranteed to exist.

I think mtn always creates a date cert.  There may be more than one,
of course (with different values), so there's a certain amount of
flexibility in how you might determine the newest two.

Intuitively it seems like you might try to merge the two closest in
terms of numbers of files that differ, or something.  I've no idea
whether that would be a good heuristic, though.




Re: [Monotone-devel] Re: branch review for net.venge.monotone.multihead

2006-07-11 Thread Timothy Brownawell
On Wed, 2006-07-12 at 02:51 +0100, Bruce Stephens wrote:
 Zack Weinberg [EMAIL PROTECTED] writes:
 
  I was thinking about using commit date as a further heuristic, i.e.
  when we have two LCAs neither of which is an ancestor of the other,
  merge the newest one first; furthermore, when we have three or more
  heads with the same LCA, merge the newest two first.  However, it
  seems like a huge pain to get from a revision_id to its commit date,
  and in fact I'm not sure the date cert is guaranteed to exist.
 
 I think mtn always creates a date cert.  There may be more than one,
 of course (with different values), so there's a certain amount of
 flexibility in how you might determine the newest two.

It will always generate one, yes. But it's perfectly valid for there not
to be one (probably can only happen with 'db execute'), or, as Bruce said,
for there to be more than one. So don't rely on there being a unique
date cert, or even on there being a date cert at all.
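
A sketch of what "don't rely on it" might look like in code, assuming the
date cert values for a revision have already been fetched as ISO 8601
strings (which sort chronologically when compared as plain text).
newest_date is a hypothetical helper, not a monotone function, and
taking the latest of several certs is just one possible policy.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Pick one date for a revision from zero or more date cert values.
// Returns false and leaves OUT untouched when there is no date cert at
// all; when there are several, OUT gets the latest one (an arbitrary
// but deterministic choice for this sketch).
inline bool
newest_date(std::vector<std::string> const & date_certs, std::string & out)
{
  if (date_certs.empty())
    return false;
  out = *std::max_element(date_certs.begin(), date_certs.end());
  return true;
}
```

A caller that wants to order heads by date then has to decide what to do
when the function returns false, instead of assuming a date is there.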

Tim






Re: [Monotone-devel] Re: branch review for net.venge.monotone.multihead

2006-07-11 Thread Daniel Carosone
On Tue, Jul 11, 2006 at 06:40:54PM -0700, Zack Weinberg wrote:
 I was thinking about using commit date as a further heuristic, i.e.
 when we have two LCAs neither of which is an ancestor of the other,
 merge the newest one first; furthermore, when we have three or more
 heads with the same LCA, merge the newest two first. 

Absent other clearly-obvious better choices (such as conflicting vs.
non-conflicting), I like the simple predictability of merging in
alpha-sorted revision-id order.

In particular, this is important to avoid a lot of merge fan-out.  On
a busy project with many developers syncing and merging at the same
time, all with slightly different mostly-overlapping sets of
revisions, we want to minimise the number of additional intermediate
merge nodes that will be created because different users will merge
subsets of nodes in different orders.  Those merge nodes are only
going to have to be merged again, creating mostly pointless tangle.

As a simple dumb example, consider a minor modification of the present
algorithm, that doesn't use any of your additional smarts:

 1. make a sorted list of heads
 2. attempt to merge the first pair
 3. if successful, start again with a new list
    (that's pretty much what we do now).
 4. if not successful, move one slot down the list and try again for
    the next pair, rather than failing
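
The four steps above can be sketched roughly as follows. try_merge is a
hypothetical callback standing in for the real merge: it returns true and
fills in the merged revision id on success. Reading "the next pair" as the
next adjacent pair in the sorted list is also an assumption. This only
illustrates the ordering; it is not monotone's CMD(merge).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

typedef std::string revision_id;  // stand-in for monotone's own type

// try_merge(left, right, merged) returns true and sets MERGED on success.
typedef std::function<bool (revision_id const &, revision_id const &,
                            revision_id &)> merge_fn;

// Repeatedly merge adjacent pairs of heads in sorted order.  On a failed
// pair, move one slot down the list instead of giving up.  Returns the
// heads that remain once no adjacent pair can be merged.
inline std::vector<revision_id>
merge_heads(std::vector<revision_id> heads, merge_fn const & try_merge)
{
  bool progress = true;
  while (progress && heads.size() > 1)
    {
      progress = false;
      std::sort(heads.begin(), heads.end());              // step 1
      for (std::size_t i = 0; i + 1 < heads.size(); ++i)
        {
          revision_id merged;
          if (try_merge(heads[i], heads[i + 1], merged))  // step 2
            {
              // step 3: replace the pair with its merge and start over
              heads.erase(heads.begin() + i, heads.begin() + i + 2);
              heads.push_back(merged);
              progress = true;
              break;
            }
          // step 4: otherwise fall through and try the next pair
        }
    }
  return heads;
}
```

Because the order depends only on the sorted ids, two people holding the
same set of heads would attempt the same pairs in the same order.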

I'm not advocating this change. This is by no means going to produce
the best chances for a successful or least-manual-assistance merge, as
you're trying to do.

To illustrate my point, however, it will do a pretty good job of
producing convergent sets of merge nodes amongst multiple people
attempting to merge, while allowing further progress than we currently
do.

Both objectives are important, and you need to consider the tradeoffs
between them. I don't have a clear picture of what those tradeoffs
might be, but I'm nervous that, between developers with partial views
of each other's work, the nodes that diverged recently from LCAs are
perhaps the *least* likely nodes for them to have in common.

A counterargument against any more eager merging algorithm is that if
such merges are stopped at the first failure, there's another
opportunity to sync and learn of more nodes before attempting again.
If we merge eagerly in such cases, we're going to produce a more
complex set of internal merge nodes and multiple re-merges before
finally getting to a single head.

I'm not trying to discourage you, nor to suggest that having extra
merge nodes is really something to be frightened of: just that we need
to consider this too.  For all I know, we can come up with a selection
order algorithm that will actually improve this situation.  One
strawman example that comes immediately to mind is to merge nodes with
common author certs first, on the assumption an author is more likely
to know about their own revisions than those of others.

 However, it
 seems like a huge pain to get from a revision_id to its commit date,
 and in fact I'm not sure the date cert is guaranteed to exist.

It's not certain to exist. It's not certain to be correct. It's not
certain to be unique. It's not certain to represent time-of-commit; I
have at least one case where I set the date according to the time the
content was current, rather than the time I'm later committing that
record.

The most pertinent example here, though, is common merge nodes that
have been created by different people; they'll have different dates on
them and will be sorted differently by different viewers until the
date certs meet up.  I don't think it's a good idea.

--
Dan.

