Alan McKinnon <alan.mckinnon <at> gmail.com> writes:

> > But why not just use a simple script:

> > <scriptname> package.just.downloaded package.just.downloaded.DIGESTS


Right now I perform manual inspections, and only when they seem essential;
they are prone to (visual inspection) mistakes and time consuming. Are
there scripts, programs, GUI interfaces and such (that I'm unaware of)
which would greatly simplify this manual, random spot-checking approach?
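
I don't know of a polished GUI for this, but even a very small script
beats eyeballing hex strings. A minimal Python sketch follows (the
assumption here is a DIGESTS file containing "<hexdigest>  <filename>"
lines, the layout written by sha256sum/sha512sum and used in Gentoo's
release DIGESTS files; adjust the parsing for anything else):

#!/usr/bin/env python3
# verify_digest.py -- sketch only: verify_digest.py <download> <DIGESTS>
# Assumes "<hexdigest>  <filename>" lines in the DIGESTS file.
import hashlib
import os
import sys

def file_digest(path, algo):
    # Hash in chunks so large ISOs don't need to fit in RAM.
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def main(download, digests):
    name = os.path.basename(download)
    local = {a: file_digest(download, a) for a in ("sha256", "sha512")}
    # Tell the algorithms apart by the length of the hex digest.
    by_len = {len(v): a for a, v in local.items()}
    matched = False
    with open(digests) as f:
        for line in f:
            parts = line.split()
            if len(parts) != 2:
                continue
            digest, fname = parts
            if os.path.basename(fname) != name or len(digest) not in by_len:
                continue
            algo = by_len[len(digest)]
            matched = True
            if digest.lower() == local[algo]:
                print("OK   %s  %s" % (algo, name))
            else:
                print("FAIL %s  %s" % (algo, name))
                sys.exit(1)
    if not matched:
        print("no usable digest for %s found in %s" % (name, digests))
        sys.exit(2)

if __name__ == "__main__":
    main(sys.argv[1], sys.argv[2])

That is the whole of the spot check automated; the interesting part of
this thread is where the DIGESTS file itself comes from.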



> > http://arstechnica.com/information-technology/2014/04/openssl-code-beyond-repair-claims-creator-of-libressl-fork/
> 
> Thanks, now I understand better the question you are asking.


Ok, cleaning up this tool (the openssl code) is but one part of the work
that needs to be done. The Rat is well qualified to clean up this code.


> I don't think it can be solved at all in the general case, 
> for two reasons.

Them's fightin' words......


> One, the internet and its core protocols are inherently not worthy of
> trust. There just isn't any way to prove that traffic is what it claims
> to be and no crypto verification built into the core of it. You either
> trust the traffic or you don't, but there's nothing inherent in the
> traffic to help you decide. So, all the download protocols have security
> checking bolted on afterwards by individual apps. These apps may or may
> not be compatible with each other and may or may not do their checks
> similarly from one protocol to the next. Somebody would have to garner
> enough support so that all the major projects doing file and data
> transfers agree on some way to implement crypto checks. Good luck with
> that; if they do agree on something, we have the second problem.
> 
> Internet downloads have an inherent problem - you download an unknown
> bunch of bits from somewhere and can't fully trust the result. You can
> check hashes against the downloaded file, but you have to get them from
> somewhere. And the method to get them is the same as getting the data
> file itself - a bunch of bits from somewhere and you can't trust it. How
> can you download trusted hash data from a source where you don't trust
> the regular downloads? Can't work; two no trusts don't make a one trust.
> 
> And whose global hash store of all known hashes of all known
> downloadables would you trust anyway? The NSAs? 
> 
> Best you can do is make something for the specific case. The Gentoo tree
> and distfiles can be GPG signed and if you agree to trust Gentoo's keys
> then you are good to go and it can be automated (which is the easy bit
> btw).
> 
> For the general case, I can't see that working at all. I trust Gentoo with
> Gentoo, but I don't see myself ever trusting $ARB_3RD_PARTY 
> with $EVERYTHING


Your comments are well received and I do not even disagree with your points.
I think you need to relax, grab your favorite beverage, recline and put
on your "deep thinking hat". Perhaps a foot massage from your least 
productive  underling would set your mind at ease?


So, let us assume you are correct in everything you have stated. But try
this idea on and shoot away. Note that in this context I use the terms
code=package=software=download interchangeably, so as to focus on the
10,000 foot view of the idea, not the minutiae.


Premise:
Any individual code/software/package/download can be hacked, as can its
keys/hashes, regardless of where they are located. But it would be very
difficult for an interloper to inject into such codes at a thousand
different locations without detection. Note that at each repository the
hashes can be regenerated, and they had better match the hashes of the
origination site(s).

Proposal:
So rather than a static, singular checkpoint where you verify code, why
not develop checking tools that verify the integrity of any given piece
of code against many locations? (Fault tolerance via redundancy, if you
like.)


Possible solution:
1) Source archives usually contain revision histories and sync those up
with revision releases. So maintain a master list of hashes/keys on their
sources in the form of a histogram. A package periodically updated n(10)
times would have n(10) hashes with n(10) timestamps as the basis of the
histogram. Think of a digital (camera) histogram. [1] This would build up
a histogram of the changes in the hashes for a given code/package, not
only at the source-code repository, but also at those institutional
repositories which generate their own hashes/keys and link them to
release date-time-stamps; these had better converge with the development
sources.

Now we would not only have the hashes, which can be manually checked
anywhere, anytime, but also a histogram image check, based on the
historical dates at which the code is known to have changed. Not every
code change has to be included, only significant, periodic releases. Code
could be checked bit by bit, number by number, as well as against a
single image that is a compilation of those bits in the form of a
histogram. [2]

The archive sites (common download repositories) should be able to check
the histograms each time a package they offer changes. Nothing would stop
capable users from using these sorts of tools too; a rough sketch of the
data involved follows. Please observe: these "histograms", particularly
if well distributed across the net, would greatly enhance forensic and
integrity-assurance efforts.
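
To make the histogram idea concrete, here is a rough Python sketch (the
JSON layout, file names and field names are my own assumptions, purely to
show the shape of the data): each release appends one
timestamp/version/digest point to a per-package timeline, and two sites'
timelines for the same package can then be compared entry by entry.

#!/usr/bin/env python3
# hash_history.py -- sketch of the per-package "hash histogram": a
# timeline of (timestamp, version, digest) entries that upstream,
# mirrors and users can all regenerate and compare.
import hashlib
import json
import time
from pathlib import Path

HISTORY = Path("hash-history.json")   # hypothetical local store

def sha512_of(path):
    h = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def record_release(package, version, tarball):
    # Append one point to the package's hash timeline.
    history = json.loads(HISTORY.read_text()) if HISTORY.exists() else {}
    history.setdefault(package, []).append({
        "version": version,
        "timestamp": int(time.time()),
        "sha512": sha512_of(tarball),
    })
    HISTORY.write_text(json.dumps(history, indent=2))

def compare_histories(mine, theirs):
    # Two sites' timelines for the same package had better agree on the
    # digest of every release they both know about.
    theirs_by_version = {e["version"]: e["sha512"] for e in theirs}
    for entry in mine:
        other = theirs_by_version.get(entry["version"])
        if other is not None and other != entry["sha512"]:
            print("MISMATCH at", entry["version"])

The "image" part, plotting those digests against their timestamps, is
just presentation on top of this data.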


2) The individual could maintain a master list of hashes/keys on their
(gentoo) system(s). Yes, it would have to be periodically updated, but
with an archival database approach each change of a particular package's
hash/key would be logged, per package, complete with a timestamp (see the
sketch below). This could probably be a complement to portage.
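
On a Gentoo box much of the raw material is already in the tree: the
Manifest files carry the distfile digests. A rough sketch (the tree path,
log layout and "DIST <file> <size> ALGO hex ..." parsing reflect my
reading of the Manifest format; treat them as assumptions):

#!/usr/bin/env python3
# manifest_log.py -- sketch of item 2: keep a local, timestamped log of
# the distfile digests in portage's Manifest files, and flag whenever a
# digest for an already-seen distfile changes.
import json
import time
from pathlib import Path

PORTDIR = Path("/usr/portage")          # adjust to your tree location
LOG = Path("distfile-hash-log.json")    # hypothetical local archive

def dist_entries(manifest):
    # Yield (distfile, {ALGO: hexdigest}) from DIST lines.
    for line in manifest.read_text().splitlines():
        parts = line.split()
        if len(parts) >= 5 and parts[0] == "DIST":
            yield parts[1], dict(zip(parts[3::2], parts[4::2]))

def update_log():
    log = json.loads(LOG.read_text()) if LOG.exists() else {}
    for manifest in PORTDIR.glob("*/*/Manifest"):
        for fname, algos in dist_entries(manifest):
            history = log.setdefault(fname, [])
            if history and history[-1]["algos"] == algos:
                continue                  # unchanged since the last run
            if history:
                print("digest changed:", fname)
            history.append({"timestamp": int(time.time()), "algos": algos})
    LOG.write_text(json.dumps(log, indent=2))

if __name__ == "__main__":
    update_log()

Legitimate version bumps change digests too, of course; the value is in
having the timestamped trail to look back over.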


3) For every individual (non-gentoo users too), a distributed checking
tool could be developed to simultaneously check their copy of the
hash/key against dozens or hundreds of hashes from random sites; a sketch
follows. It'd be pretty hard to hack many of those sources in a
coordinated fashion.
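
As a sketch of what such a tool could look like (the mirror URLs are
placeholders and the path layout is an assumption; a real tool would
fetch small signed digest files rather than whole tarballs):

#!/usr/bin/env python3
# multi_mirror_check.py -- sketch of item 3: hash the same file as served
# by several independent mirrors, in parallel, and compare the answers
# with the locally computed sha512.
import concurrent.futures
import hashlib
import sys
import urllib.request

MIRRORS = [
    "https://mirror-a.example.org/distfiles",
    "https://mirror-b.example.org/distfiles",
    "https://mirror-c.example.org/distfiles",
]

def remote_sha512(mirror, filename):
    with urllib.request.urlopen("%s/%s" % (mirror, filename), timeout=60) as r:
        h = hashlib.sha512()
        for chunk in iter(lambda: r.read(1 << 20), b""):
            h.update(chunk)
        return h.hexdigest()

def check(filename, local_sha512):
    agree = disagree = 0
    with concurrent.futures.ThreadPoolExecutor(max_workers=len(MIRRORS)) as pool:
        futures = {pool.submit(remote_sha512, m, filename): m for m in MIRRORS}
        for fut in concurrent.futures.as_completed(futures):
            mirror = futures[fut]
            try:
                digest = fut.result()
            except OSError as exc:
                print("unreachable: %s: %s" % (mirror, exc))
                continue
            if digest == local_sha512:
                agree += 1
            else:
                disagree += 1
                print("MISMATCH from", mirror)
    print("%d mirrors agree, %d disagree" % (agree, disagree))
    return disagree == 0 and agree > 0

if __name__ == "__main__":
    sys.exit(0 if check(sys.argv[1], sys.argv[2]) else 1)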


For the paranoid, USB stick(s) could be used to house the hashes for
transient usage/updates. The pathologically paranoid could download, drop
their ethernet connection, insert the USB stick(s) and perform hash, code
and system checks.


Nothing in this scenario would stop tainted code from the original
development team. But wait, holy_oscars_batman: the fact that those
(trusted) codes are developed in an open fashion, where other folks can
audit the sources historically and concurrently, should drastically
reduce nefarious code, as we currently have evidence to support.


So, a torrent_style tool that uses distributed hashes/keys to check code
integrity; is that possible?

Surely the code histogram idea is possible?


James


[1] http://digital-photography-school.com/understanding-histograms/

[2] Not the proper forum here to refine this part, but Z and Fourier
transforms make quick, easy work of this sort of image parsing.





