Re: [gentoo-amd64] Re: "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?)

Mark Knecht Fri, 08 Aug 2014 11:35:35 -0700

Hi Duncan,
   Responding to one thing here, the rest in-line:

[QUOTE]
(Meanwhile, one further personal note FWIW.  You may think that all these
long explanations take quite some time to type up, and you'd be correct.
But don't make the mistake of thinking that I don't get a benefit from it
myself.  My dad was a teacher, and one of the things he used to say that
I've found to be truer than true, is that the best way to /learn/
something is to try to teach it to someone.
[/QUOTE]


I couldn't agree more and appreciate your efforts. And even if I might
already understand some of what you document I'm sure there are
others that come later looking for answers who get lots from these
conversations, solve problems and we never hear about it. Anyway,
a big thanks.

On Thu, Aug 7, 2014 at 2:18 PM, Duncan <1i5t5.dun...@cox.net> wrote:
> Mark Knecht posted on Thu, 07 Aug 2014 11:16:23 -0700 as excerpted:
>
>> So that's all looking pretty good, as a first step. If it's a matter of
>> 3 1/2 minutes instead of 1-2 minutes then I can live with that part.
>> However that's just (I think) the portage tree and not signed source
>> code, correct?
>
> [I just posted a reply to the gpg specific stuff.]
>
> Technically correct, but not really so in implementation.  See below...
>
>> Now, is the idea that I have a validated portage snapshot at this point
>> and still have to actually get the code using the regular emerge which
>> will do the checking because I have:
>>
>> FEATURES="buildpkg strict webrsync-gpg"
>
> No...  It doesn't work that way.
>
>> I don't see any evidence that emerge checked what it downloaded, but
>> maybe those checks are only done when I really build the code?
>
> Here's what happens.
>
> FEATURES=webrsync-gpg simply tells the webrsync stuff to gpg-verify the
> snapshot-tarball that webrsync downloads.  Without that, it'd still
> download it the same, but it wouldn't verify the signature.  This allows
> people who use the webrsync only because they're behind a firewall that
> wouldn't allow normal rsync, but who don't care about the gpg signing
> security stuff, to use the same tool as the people who actually use
> webrsync for the security aspect, regardless of whether they could use
> normal rsync or not.
>

And to clarify, I believe this step is responsible for putting into place on
a Gentoo machine much of what's in /usr/portage, most specifically in the
app categorization directories. In the old days the Gentoo Install Guide
used to have us download the portage snapshots from a location such as

http://distfiles.gentoo.org/snapshots/

That's now been replaced by a call to emerge-webrsync so newbies
might not have that view. Additionally, even if we're downloading the
snapshot tarball it appears, at least on my system, it's deleted after
it's expanded. Or at least it's not showing up in a locate command.


> So that gets you a signed and verified tree.  Correct so far.
>
> But as part of that tree, there are digest files for each package that
> verify the integrity of the ebuild as well as of the sources tarballs
> (distfiles).
>

Yep.

> Now it's important to grasp the difference between gpg signing and simple
> hash digests, here.
>
> Anybody with the appropriate tools (md5sum, for example, does md5 hashes,
> but there's sha and other hashes as well, and the portage tree uses
> several hash algorithms in case one is broken) can take a hash of a file,
> and provided it's exactly the same bit-for-bit file they should get
> exactly the same hash.
>
> In fact, that's how portage checks the hashes of both the ebuild files
> and the distfiles it uses, regardless of this webrsync-gpg stuff.  The
> tree ships the hash values that the gentoo package maintainer took of the
> files in its digest files, and portage takes its own hash of the files
> and compares it to the hash value stored in the digest files.  If they
> match, portage is happy.  If they don't, depending on how strict you have
> portage set to be (FEATURES=strict), it will either warn about (without
> strict) or entirely refuse to merge that package (with strict), until
> either the digest is updated, or a new file matching the old digest is
> downloaded.
>
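
To make the digest check described above concrete, here is a minimal Python
sketch. It is not portage's actual code: the function names, the choice of
sha256/sha512 and the warn-versus-refuse behaviour are assumptions taken from
the description in this thread.

```python
import hashlib
import warnings


def file_digests(data: bytes, algorithms=("sha256", "sha512")) -> dict:
    """Hash the same bytes with several algorithms, in case one is broken."""
    return {name: hashlib.new(name, data).hexdigest() for name in algorithms}


def check_against_digest(data: bytes, recorded: dict, strict: bool) -> bool:
    """Re-hash the file and compare with the values shipped in the tree.

    With strict=True a mismatch refuses the merge (raises); without it the
    mismatch is only warned about, mirroring the description above.
    """
    actual = file_digests(data, tuple(recorded))
    if actual == recorded:
        return True
    if strict:
        raise ValueError("digest mismatch, refusing to merge")
    warnings.warn("digest mismatch")
    return False


# Hypothetical distfile contents; the maintainer records its digests once.
distfile = b"pretend this is foo-1.0.tar.gz"
recorded = file_digests(distfile)

# A bit-for-bit identical download passes the check.
download_ok = check_against_digest(distfile, recorded, strict=True)
```
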
> So far so good, but while the hashes protect against accidental damage as
> the file was being downloaded, because anyone can take a hash of the
> file, without something stronger, if say one of the mirror operators was
> a bad guy, they could replace the files with hacked files and as long as
> they replaced the digest files with the new ones they created for the
> hacked files at the same time, portage wouldn't know.
>
> So while hashes/digests alone protect quite well from accidental damage,
> they can't protect, by themselves, from deliberate replacement of those
> files with malware infested copies.
>
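
A small Python illustration of that limitation, under the assumption that the
"mirror" is just a dict holding a file and its digest (both names are made up
for this example). Accidental corruption is caught, but a mirror that swaps
the file and the digest together is not.

```python
import hashlib


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def passes_hash_check(data: bytes, recorded_digest: str) -> bool:
    return digest(data) == recorded_digest


original = b"genuine source tarball"
mirror = {"file": original, "digest": digest(original)}

# Accidental damage: the file changes in transit, the digest does not.
corrupted = original[:-1] + b"X"
accident_detected = not passes_hash_check(corrupted, mirror["digest"])

# Deliberate replacement: the bad mirror regenerates the digest as well,
# so the file and digest agree with each other and the check passes.
hacked = b"source tarball with a backdoor"
evil_mirror = {"file": hacked, "digest": digest(hacked)}
attack_detected = not passes_hash_check(evil_mirror["file"], evil_mirror["digest"])
```
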
> Which is where the gpg signed tree snapshots come in.  But before we can
> understand how they help, we need to understand how gpg signing differs
> from simple hashes.
>

Some years ago (1997/98) I purchased one of Bruce Schneier's books - looking
at Amazon I recollect "Applied Cryptography: Protocols, Algorithms, and
Source Code in C" - so I've been through a lot of this in the area of
semiconductor design. (5C Encryption model for 'protecting' movie content.
What a joke...)

> PGP, gpg, and various other public/private-pair key signing (and
> encryption) take advantage of a particular mathematical relationship
> property between the public and private keys.  I'm not a cryptographer
> nor a mathematician, so I'm content to leave it at that rather handwavy
> assertion and not get into the details, but enough people I trust say the
> same thing about the details, and enough of our modern Internet banking
> and the like, depends upon the same idea, that I'm relatively confident
> in the general principle, at least.
>
> It works like this.  People keep the private key from the pair private --
> if it gets out, they've lost the secret.  But people publish the public
> half of the key.  The relationship of the keys is such that people can't
> figure out the private key from the public key, but if you have the
> private key, you can sign stuff with it, and people with the public key
> can verify the signature and thus trust that it really was the person
> with that key that signed the content.  Similarly, people can use the
> public key to encrypt something, and only the person with the private key
> will be able to decrypt it -- having the public key doesn't help.
>
> Actually, as I understand it signing is simply a combination of hashing
> and encryption, such that a hash of the content to be signed is taken,
> and then that hash is encrypted with the private key.  Now anyone with
> the public key can "decrypt" the hash and verify the content with it,
> thereby verifying that the private key used to sign the content by
> encrypting the hash was the one used.  If some other key had been used,
> attempting to decrypt the hash with an unmatched public key would simply
> produce gibberish, and the supposedly "decrypted" hash wouldn't be the
> hash produced when checking the content, thereby failing to verify that
> the signed content actually came from the person that it was claimed to
> have come from.
>
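
The "hash, then encrypt the hash with the private key" idea can be sketched
with textbook RSA in Python. This is a toy under loud assumptions: the primes
are well-known Mersenne primes, there is no padding, and the helper names are
invented here. Real gpg signatures use proper padding schemes and far larger
keys, and not every signature algorithm works this way.

```python
import hashlib


def make_key(p: int, q: int, e: int = 65537):
    """Build a toy RSA key pair from two primes (needs Python 3.8+)."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))  # private exponent
    return (n, e), (n, d)              # (public key, private key)


def content_hash(content: bytes, n: int) -> int:
    # Reduce the SHA-256 value modulo n so it fits the toy key.
    return int.from_bytes(hashlib.sha256(content).digest(), "big") % n


def sign(content: bytes, private_key) -> int:
    n, d = private_key
    return pow(content_hash(content, n), d, n)   # "encrypt" the hash


def verify(content: bytes, signature: int, public_key) -> bool:
    n, e = public_key
    return pow(signature, e, n) == content_hash(content, n)  # "decrypt", compare


# Toy keys only; these primes are public knowledge and far too small.
PUBLIC, PRIVATE = make_key(2**61 - 1, 2**31 - 1)
OTHER_PUBLIC, OTHER_PRIVATE = make_key(2**89 - 1, 2**107 - 1)

snapshot = b"portage snapshot contents"
signature = sign(snapshot, PRIVATE)
```

Checking `verify(snapshot, signature, PUBLIC)` succeeds, while changing the
content or signing with `OTHER_PRIVATE` makes the comparison fail.
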

If I recall correctly the flow looks like:

File -> (Sender Private/Receiver Public) -> Encrypted File

Encrypted File -> (Sender Public/Receiver Private) -> File

and this should be safe, albeit Rich's comment early on was

"3.  Have an army of the best cryptographers in the world, etc."

coupled with lots of compute power leaves me with little doubt it's
not a 100% thing...
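
Reading that flow as "sign with the sender's private key, encrypt with the
receiver's public key", a toy Python sketch could look like the following.
It assumes textbook RSA with no padding and publicly known primes, and the
signature travels next to the ciphertext rather than inside it, so it only
illustrates which key is used at which end. It is not how gpg packages
messages.

```python
import hashlib


def make_key(p: int, q: int, e: int = 65537):
    """Toy RSA key pair from two primes (needs Python 3.8+)."""
    n = p * q
    return (n, e), (n, pow(e, -1, (p - 1) * (q - 1)))


def rsa(value: int, key) -> int:
    n, exponent = key
    return pow(value, exponent, n)


def digest_int(data: bytes, n: int) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


SENDER_PUB, SENDER_PRIV = make_key(2**61 - 1, 2**31 - 1)
RECEIVER_PUB, RECEIVER_PRIV = make_key(2**89 - 1, 2**107 - 1)


def send(data: bytes):
    """File -> (sender private / receiver public) -> encrypted file."""
    signature = rsa(digest_int(data, SENDER_PRIV[0]), SENDER_PRIV)
    ciphertext = rsa(int.from_bytes(data, "big"), RECEIVER_PUB)
    return ciphertext, signature


def receive(ciphertext: int, signature: int):
    """Encrypted file -> (receiver private / sender public) -> file."""
    number = rsa(ciphertext, RECEIVER_PRIV)
    data = number.to_bytes((number.bit_length() + 7) // 8, "big")
    genuine = rsa(signature, SENDER_PUB) == digest_int(data, SENDER_PUB[0])
    return data, genuine


# The message must be short enough to fit below the receiver's toy modulus.
message = b"foo-1.0.tar.gz"
ciphertext, signature = send(message)
recovered, genuine = receive(ciphertext, signature)
```
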

>
> OK, we've now established that hashes simply verify that the content
> didn't get modified in transit, but they do NOT by themselves verify who
> SENT that content, so indeed, a man-in-the-middle could have replaced
> BOTH the content and the hash, and someone relying on just hashes
> couldn't tell the difference.
>
> And we've also established that a signature verifies that the content
> actually came from the person who had the private key matching the public
> key used to verify it, by mechanism of encrypting the hash of that
> content with the private key, so only by "decrypting" it with the
> matching public key, does the hash of the content match the one taken at
> the other end and encrypted with the private key.
>
> *NOW* we're equipped to see how the portage tree snapshot signing method
> actually allows us to verify distfiles as well.  Because the tree
> includes digests that we can now verify came from our trusted source,
> gentoo, NOW those digests can be used to verify the distfiles, because
> the digests were part of the signed tree and nobody could tamper with
> that signed tree including those digests without detection.
>
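
The chain described above (verified snapshot, then digests inside it, then
distfiles) can be sketched in Python. Everything here is hypothetical: the
snapshot is modelled as a JSON blob of name-to-digest entries, and
`signature_ok` is only a stand-in for the outcome of a real gpg check, done
here by comparing against a pinned digest of the genuine snapshot.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def build_snapshot(manifest: dict) -> bytes:
    """Model the tree snapshot as a blob that carries the digests."""
    return json.dumps(manifest, sort_keys=True).encode()


def distfile_is_trusted(snapshot: bytes, snapshot_verified, name: str, data: bytes) -> bool:
    # Step 1: the snapshot itself must pass the signature check.
    if not snapshot_verified(snapshot):
        return False
    # Step 2: only then are the digests inside it worth comparing against.
    manifest = json.loads(snapshot)
    return manifest.get(name) == sha256_hex(data)


genuine_src = b"genuine foo-1.0 sources"
genuine_snapshot = build_snapshot({"foo-1.0.tar.gz": sha256_hex(genuine_src)})
pinned = sha256_hex(genuine_snapshot)


def signature_ok(snapshot: bytes) -> bool:
    # Stand-in for gpg verification, NOT a real signature check.
    return sha256_hex(snapshot) == pinned


# A bad mirror swaps the sources and regenerates the digests to match.
hacked_src = b"hacked foo-1.0 sources"
forged_snapshot = build_snapshot({"foo-1.0.tar.gz": sha256_hex(hacked_src)})
```

With `signature_ok` in place the forged snapshot is rejected before its
digests are ever consulted. Passing a check that always returns True, which
is roughly the position of a user who does no gpg verification, lets the
forged pair through.
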

Correct. Hashes for all that stuff are in the Manifest files and I never
create my own Manifests.

> If our nefarious gentoo mirror operator tried to switch out the source
> tarballs AND the digests, he could do so for normal rsync users, and for
> webrsync users not doing gpg verification, without detection.  But should
> he try that with someone that's using webrsync-gpg, he has no way to sign
> the tampered with tarball with the correct private key since he doesn't
> have it, and those using webrsync with FEATURES=webrsync-gpg would detect
> the tampered tarball as portage (via webrsync, via eix in your case)
> would reject that tarball as unverified.
>

Well, maybe yes, maybe no as per the comment above, but agreed in general.

> So the hash-digest method used to protect ordinary rsync users (and
> webrsync users without webrsync-gpg turned on) from ACCIDENTAL damage,
> now protects webrsync-gpg users from DELIBERATE man-in-the-middle attacks
> as well, not because the digests themselves are different, but because we
> can now trust and verify that they came from a legitimate source.
>
> Tho it should be noted that "legitimate source" is defined as anyone
> having access to that private signing key.  So should someone break in
> to the snapshotting server and steal that private key doing the signing,
> they now become a "legitimate source" as far as webrsync-gpg is concerned.
>

Yep.

>
> So where does that leave us in practice?
>
> Basically here:
>
> You're now verifying that the snapshot tarballs are coming from a source
> with the private signing key, and we're assuming that gentoo security
> hasn't been broken and thus that only gentoo's snapshot signing servers
> (and their admins, of course) have access to the private signing key,
> which in turn means we're assuming the machine with that signing key must
> be gentoo, and thus that the snapshotted tarballs are legit.
>
> But it's actually webrsync in combination with FEATURES=webrsync-gpg
> that's doing that verification.
>
> Once the verified tarball is actually unpacked on our system, portage
> operates just as it normally does, simply verifying the usual hash digests
> against the ebuilds and the distfiles /exactly/ as it normally would.
>

Understood.

> Repeating in different words to hopefully ensure it's understood:
>
> It's *ONLY* the fact that we have actually gpg-verified that snapshot
> tarball and thus the digests within it, that gives us any more security
> than an ordinary rsync user.  After that's downloaded, verified and
> unpacked, portage operates exactly as it normally does.
>
>
> Meanwhile, part of that normal operation includes FEATURES=strict, if
> you've set it, which causes portage to refuse to merge the package if
> those digests don't match.  But that part of things is just normal
> portage operation.  Rsync users get it too -- they just don't have the
> additional assurance that those digest files actually came from gentoo
> (or at least from someone with gentoo's private signing key), that
> webrsync with FEATURES=webrsync-gpg provides.
>

Yep, I set that first before I got the gpg stuff working. I'll leave
it in place for now.

>
> (Meanwhile, one further personal note FWIW.  You may think that all these
> long explanations take quite some time to type up, and you'd be correct.
> But don't make the mistake of thinking that I don't get a benefit from it
> myself.  My dad was a teacher, and one of the things he used to say that
> I've found to be truer than true, is that the best way to /learn/
> something is to try to teach it to someone.  That's exactly what I'm
> doing, and all the unexpected questions and corner cases that I'd have
> never thought about on my own, that people bring up and force me to think
> about in order to answer them, help me improve my own previously more
> handwavy and fuzzy "general concept" understanding as well.  I'm much
> more confident in my own understanding of the general public/private key
> concepts, how gpg actually uses them and how its web-of-trust works, and
> more specifically, how portage can use that via webrsync-gpg to actually
> improve the gentooer's own security, than I ever was before.
>
> And it has been quite some time since I worked with gpg and saw it in
> interactive mode like that, too, and it turns out that in the intervening
> years, I've actually understood quite a bit more about how it all works
> than I did back then, thus my ability to dig that all up and present it
> here, while back a few years ago, I was just as clueless about how all
> that web-of-trust stuff worked, and made exactly the same mistake of
> "ultimately trusting" the distro's package-signing key, for exactly the
> same reasons.  Turns out I absorbed rather more from all those security
> and encryption articles I've read over the years than I realized, but it
> actually took my replies right here in this thread to lay it all out
> logically so I too realized how much more I understand what's going on
> now, than I did back then.)
>
> So... Thanks for the thread! =:^)
>
> --
> Duncan - List replies preferred.   No HTML msgs.
> "Every nonfree program has a lord, a master --
> and if you use the program, he is your master."  Richard Stallman
>
>
