Re: Tricking peer review

2021-10-28 Thread Ludovic Courtès
Hi!

Christine Lemmer-Webber  skribis:

> Here's another way to think of it, borrowing from some ocap systems: the
> hash is the actual, canonical identifier of this package revision.  The
> URL to get the package is merely a "hint" as to where to get it.

Yes, definitely.  It remains that we need to educate ourselves not to
give too much credit to this hint.

Thanks,
Ludo’.



Re: Tricking peer review

2021-10-25 Thread Christine Lemmer-Webber
Ludovic Courtès  writes:

> It builds just fine:
>
> $ guix build -f /tmp/content-addressed.scm  
> /gnu/store/lpais26sjwxcyl7y7jqns6f5qrbrnb34-sed-4.8
> $ guix build -f /tmp/content-addressed.scm -S --check -v0
> /gnu/store/mgais6lk92mm8n5kyx70knr11jbwgfhr-sed-4.8.tar.gz
>
>
> Did you spot a problem?
>
> …
>
>
> So, what did we just build?
>
> $ ls $(guix build -f /tmp/content-addressed.scm)/bin
> egrep  fgrep  grep
>
>
> Oh oh!  This ‘sed’ package is giving us ‘grep’!  How come?
>
> The trick is easy: we give a URL that’s actually 404, with the hash of a
> file that can be found on Software Heritage (in this case, that of
> ‘grep-3.4.tar.xz’).  When downloading the source, the automatic
> content-addressed fallback kicks in, and voilà:
>
> $ guix build -f /tmp/content-addressed.scm  -S --check 
> La jena derivaĵo estos konstruata:
>/gnu/store/nq2jdzbv3nh9b1mglan54dcpfz4l7bli-sed-4.8.tar.gz.drv
> building /gnu/store/nq2jdzbv3nh9b1mglan54dcpfz4l7bli-sed-4.8.tar.gz.drv...
>
> Starting download of 
> /gnu/store/1mlpazwwa2mi35v7jab5552lm3ssvn6r-sed-4.8.tar.gz
> From https://ftpmirror.gnu.org/gnu/zed/sed-4.8.tar.gz...
> following redirection to 
> `https://mirror.cyberbits.eu/gnu/zed/sed-4.8.tar.gz'...
> download failed "https://mirror.cyberbits.eu/gnu/zed/sed-4.8.tar.gz" 404
> "Not Found"
>
> [...]
>
> Starting download of 
> /gnu/store/1mlpazwwa2mi35v7jab5552lm3ssvn6r-sed-4.8.tar.gz
> From
> https://archive.softwareheritage.org/api/1/content/sha256:58e6751c41a7c25bfc6e9363a41786cff3ba5709cf11d5ad903cf7cce31cc3fb/raw/...
> downloading from 
> https://archive.softwareheritage.org/api/1/content/sha256:58e6751c41a7c25bfc6e9363a41786cff3ba5709cf11d5ad903cf7cce31cc3fb/raw/
>  ...
>
> warning: rewriting hashes in 
> `/gnu/store/mgais6lk92mm8n5kyx70knr11jbwgfhr-sed-4.8.tar.gz'; cross fingers
> successfully built 
> /gnu/store/nq2jdzbv3nh9b1mglan54dcpfz4l7bli-sed-4.8.tar.gz.drv
>
>
> It’s nothing new, it’s what I do when I want to test the download
> fallbacks (see also ‘GUIX_DOWNLOAD_FALLBACK_TEST’ in commit
> c4a7aa82e25503133a1bd33148d17968c899a5f5).  Still, I wonder if it could
> somehow be abused to have malicious packages pass review.

Here's another way to think of it, borrowing from some ocap systems: the
hash is the actual, canonical identifier of this package revision.  The
URL to get the package is merely a "hint" as to where to get it.

Therefore, there can be many other "hints" as to where to get it too,
enabling mirrors to be elevated to the "same" priority as the original
source.
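The hash-as-identifier model can be sketched in miniature.  The following is
an illustrative simulation only, not Guix's actual download code; the function
name, the in-memory `mirrors` mapping, and the URLs are all invented:

```python
import hashlib

def fetch_by_hash(expected_sha256, hints, mirrors):
    """Try each URL hint in order; accept content only if its SHA-256
    matches the canonical identifier.  `mirrors` simulates the network
    as a mapping from URL to bytes (None means 404)."""
    for url in hints:
        content = mirrors.get(url)           # simulated download
        if content is None:
            continue                         # 404: try the next hint
        if hashlib.sha256(content).hexdigest() == expected_sha256:
            return content                   # hash matches: accept
        # Content was served but does not match the identifier: skip it.
    raise LookupError("no hint served content matching the hash")

# The situation from the thread: the primary URL is dead, but a
# content-addressed archive still serves bytes matching the hash.
payload = b"GNU sed 4.8 tarball (stand-in bytes)"
digest = hashlib.sha256(payload).hexdigest()
mirrors = {
    "https://example.org/sed-4.8.tar.gz": None,   # hypothetical 404
    "https://archive.example/by-hash": payload,   # content-addressed fallback
}
result = fetch_by_hash(digest, list(mirrors), mirrors)
assert result == payload   # the fallback satisfied the request
```

The point reviewers should take away: the second hint succeeds even though the
first URL never worked, so a plausible-looking but dead URL is no evidence of
provenance.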




Re: Tricking peer review

2021-10-21 Thread zimoun
Hi,

On Wed, 20 Oct 2021 at 19:03, Leo Famulari  wrote:
> On Tue, Oct 19, 2021 at 10:39:12AM +0200, zimoun wrote:
>> Drifting from the initial comment.  One could say “tragic” commits are
>> those which break “guix pull”.  It is rare these days, but some are
>> still reachable via “guix time-machine”.
>
> That's a good point. Is it a good idea to teach Guix to help (not force)
> users to avoid these commits?

Somehow, it means extending the mechanism behind
’channel-with-substitutes-available’ [1], I guess.

1: 


Cheers,
simon





Re: Tricking peer review

2021-10-21 Thread Ludovic Courtès
Hi,

Leo Famulari  skribis:

> On Fri, Oct 15, 2021 at 08:54:09PM +0200, Ludovic Courtès wrote:
>> The trick is easy: we give a URL that’s actually 404, with the hash of a
>> file that can be found on Software Heritage (in this case, that of
>> ‘grep-3.4.tar.xz’).  When downloading the source, the automatic
>> content-addressed fallback kicks in, and voilà:
> [...]
>> Thoughts?
>
> It's a real risk... another illustration that our security model trusts
> committers implicitly (not saying that's a bad thing or even avoidable).
>
> In years past I mentioned a similar technique but based on using
> old/vulnerable versions of security-critical packages like OpenSSL. The
> same approach would have worked since we started using Nix's
> content-addressed mirror.

Right.  Like zimoun wrote, the SWH fallback makes this even more
stealthily exploitable.

>> It’s nothing new, it’s what I do when I want to test the download
>> fallbacks (see also ‘GUIX_DOWNLOAD_FALLBACK_TEST’ in commit
>> c4a7aa82e25503133a1bd33148d17968c899a5f5).  Still, I wonder if it could
>> somehow be abused to have malicious packages pass review.
>
> Nice feature! Sorry if this was already suggested, but is it possible to
> create an argument to this variable that disallows use of the fallback
> mechanisms? I would certainly use that while reviewing and testing my
> own patches.

Yes, you can do “GUIX_DOWNLOAD_FALLBACK_TEST=none” (added in
bd61d62182bfda4a695757ec66810b28e8e1a6d0).

Thanks,
Ludo’.



Re: Tricking peer review

2021-10-20 Thread Leo Famulari
On Fri, Oct 15, 2021 at 08:54:09PM +0200, Ludovic Courtès wrote:
> The trick is easy: we give a URL that’s actually 404, with the hash of a
> file that can be found on Software Heritage (in this case, that of
> ‘grep-3.4.tar.xz’).  When downloading the source, the automatic
> content-addressed fallback kicks in, and voilà:
[...]
> Thoughts?

It's a real risk... another illustration that our security model trusts
committers implicitly (not saying that's a bad thing or even avoidable).

In years past I mentioned a similar technique but based on using
old/vulnerable versions of security-critical packages like OpenSSL. The
same approach would have worked since we started using Nix's
content-addressed mirror.

> It’s nothing new, it’s what I do when I want to test the download
> fallbacks (see also ‘GUIX_DOWNLOAD_FALLBACK_TEST’ in commit
> c4a7aa82e25503133a1bd33148d17968c899a5f5).  Still, I wonder if it could
> somehow be abused to have malicious packages pass review.

Nice feature! Sorry if this was already suggested, but is it possible to
create an argument to this variable that disallows use of the fallback
mechanisms? I would certainly use that while reviewing and testing my
own patches.



Re: Tricking peer review

2021-10-20 Thread Leo Famulari
On Tue, Oct 19, 2021 at 10:39:12AM +0200, zimoun wrote:
> Drifting from the initial comment.  One could say “tragic” commits are
> those which break “guix pull”.  It is rare these days, but some are
> still reachable via “guix time-machine”.

That's a good point. Is it a good idea to teach Guix to help (not force)
users to avoid these commits?



Re: Tricking peer review

2021-10-20 Thread zimoun
Hi,

On Wed, 20 Oct 2021 at 10:22, Giovanni Biscuolo  wrote:

> I think the "final" result of this discussion should be condensed in a
> few (one?) additional paragraphs in the Contributing section of the Guix
> manual

Running “guix lint” is already listed.  What further additions do you
have in mind?


> Well done Simon: AFAIU this is a complete analysis of the possible
> "source" attacks, or is something missing?

To my knowledge, yes, it is exhaustive for the current ways of tricking
the content-addressed system.

On top of that, content is addressed by hash functions, so the scheme is
vulnerable to preimage attacks on those functions.  SWH uses SHA-1 for
addressing, and I do not know how they deal with potential collisions.

For instance, mounting an attack on SHA-1 [1] is still really expensive.
Interested readers can follow the discussion at [2].  The SHA-1 space is
2^160 (~10^48.2), to be compared with 10^50, the estimated number of
atoms in Earth.  Speaking of content addressability, SHA-1 seems fine.
We are talking about content addressability, not about using SHA-1 as a
security hash function, IMHO.  It is the same situation as with Git, for
instance.
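As a back-of-the-envelope check of the numbers above (a quick illustration of
the arithmetic, not a security analysis; note that for collision attacks the
relevant figures are much smaller than the full 2^160 space):

```python
import math

# SHA-1 produces 160-bit digests.
sha1_space = 2 ** 160
print(math.log10(sha1_space))        # ≈ 48.16, i.e. 2^160 ≈ 10^48.2

# Rough estimate of the number of atoms in Earth, as cited above.
atoms_in_earth = 10 ** 50
print(atoms_in_earth / sha1_space)   # ≈ 68: within two orders of magnitude
```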

The attack surface is very small because:

 a) SWH is an archive and not a forge,
 b) a chosen-prefix attack [3] could not work if review is done
correctly, which means running “guix lint”,
 c) an attacker has to trick both the checksum (SHA-256) and the address
(SHA-1), at various locations: Guix history (now signed), SWH,
Disarchive-DB.
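Point (c) can be illustrated with a short sketch: a substituted payload goes
unnoticed only if it matches both digests at once.  This is a simplification
(SWH's real identifiers are Merkle-style, not a plain SHA-1 of the tarball),
and the function name is invented:

```python
import hashlib

def verify_bindings(content, expected_sha256, expected_sha1):
    """A substitution succeeds only if the payload matches *both* the
    package checksum (SHA-256, recorded in Guix) and the content
    address (SHA-1, used by SWH)."""
    ok_checksum = hashlib.sha256(content).hexdigest() == expected_sha256
    ok_address = hashlib.sha1(content).hexdigest() == expected_sha1
    return ok_checksum and ok_address

genuine = b"release tarball bytes"
h256 = hashlib.sha256(genuine).hexdigest()
h1 = hashlib.sha1(genuine).hexdigest()

assert verify_bindings(genuine, h256, h1)
# A different payload fails both checks; even a (hypothetical) collision
# on one of the two hashes would still fail the other.
assert not verify_bindings(b"malicious bytes", h256, h1)
```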

1: 
2: 
3: 

>>> Also, just because a URL looks nice and is reachable doesn’t mean the
>>> source is trustworthy either.  An attacker could submit a package for an
>>> obscure piece of software that happens to be malware.  The difference
>>> here is that the trick above would allow targeting a high-impact
>>> package.
>>
>> I agree.
>
> I also agree (obviously) and I think this kind of attack should also be
> documented in the manual (if not already done)

Well, nothing new here, IMHO.  A distribution relies on content, i.e.,
any distribution points to that content.  Whatever the nature of the
pointing arrow (URL, Git commit, hash, etc.), the material pointed to
must be carefully checked at packaging time, as explained in «Submitting
Patches» [4]. :-) That’s why I am advocating [5] that:

 new packages should *always* go via guix-patches; wait 15 days,
then push if there is no remark.  It gives the community time to
chime in.  And if no one does, it only slows things down by two
weeks.


4: 
5: 

Cheers,
simon



patches for new packages proper workflow (Re: Tricking peer review)

2021-10-20 Thread Giovanni Biscuolo
Hi,

zimoun  writes:

[...]

>> All in all, it’s probably not as worrisome as it first seems.  However,
>> it’s worth keeping in mind when reviewing a package.
>
> I agree, with a minor comment.  In my opinion, not enough patches go
> via guix-patches; many are pushed directly.
>
> For instance, the «Commit policy» section says «For patches that just
> add a new package, and a simple one, it’s OK to commit, if you’re
> confident (which means you successfully built it in a chroot setup, and
> have done a reasonable copyright and license auditing).»
>
> And from my point of view, new packages should *always* go via
> guix-patches: wait 15 days, then push if there is no remark.  It gives
> the community time to chime in.  And if no one does, it only slows
> things down by two weeks.

I agree with Simon; the policy for adding new packages should be changed
as he proposes.

Thanks! Giovanni

-- 
Giovanni Biscuolo

Xelera IT Infrastructures


signature.asc
Description: PGP signature


Re: Tricking peer review

2021-10-20 Thread Giovanni Biscuolo
Hi Simon and Ludovic,

very interesting thread, thank you!

I think the "final" result of this discussion should be condensed in a
few (one?) additional paragraphs in the Contributing section of the Guix
manual

zimoun  writes:

[...]

>  - url-fetch: the attacker has to introduce the tarballs into SWH.
>    There are not many ways to do this, from my understanding: SWH
>    ingests tarballs via loaders, for instance from gnu.org,
>    sources.json, debian.org, etc.  Therefore the attacker has to
>    introduce the malicious code on these platforms.
>
>  - url-fetch without metadata (as in your example): indeed, the
>    reviewer could be fooled; mitigated by the fact that “guix lint”
>    spots the potential issue.
>
>  - url-fetch with metadata: the attacker also has to corrupt
>    Disarchive-DB.  Otherwise, the tarball returned by SWH will not
>    match the checksum.
>
>  - svn-fetch, hg-fetch, cvs-fetch: no attack possible, yet.
>
>  - git-fetch: this is the *real* issue, because it is easy for the
>    attacker to introduce malicious code into SWH (create a repo on
>    GitHub, click Save, done), then submit a package using it as you
>    did.  It is the same case as url-fetch without metadata but easier
>    for the attacker.  It is mitigated by “guix lint”.

Well done Simon: AFAIU this is a complete analysis of the possible
"source" attacks, or is something missing?

> That said, if I were an attacker wanting to corrupt Guix, I would
> create a fake project mimicking a complex piece of software.  For
> instance, Gmsh is a complex C++ scientific package.  The correct URL is
>  and the source at
> .  Then, as an attacker, I buy the
> domain, say gmsh.org

or onelab.org or onehab.info, and also set up a https://onehab.info web
site identical to the legitimate one just to trick people

> and put malicious code there.  Finally, I submit for inclusion a
> package using this latter URL.  The reviewer would be fooled.
> That’s why: more eyes, fewer issues. :-)

I agree, but eyes should also be aware of this class of possible attacks

>> Also, just because a URL looks nice and is reachable doesn’t mean the
>> source is trustworthy either.  An attacker could submit a package for an
>> obscure piece of software that happens to be malware.  The difference
>> here is that the trick above would allow targeting a high-impact
>> package.
>
> I agree.

I also agree (obviously) and I think this kind of attack should also be
documented in the manual (if not already done)

[...]

Thanks! Gio'

-- 
Giovanni Biscuolo

Xelera IT Infrastructures


signature.asc
Description: PGP signature


Re: Tricking peer review

2021-10-19 Thread zimoun
Hi Ludo,

On Tue, 19 Oct 2021 at 14:56, Ludovic Courtès  wrote:
> zimoun  skribis:
>
>> I agree, with a minor comment.  In my opinion, not enough patches go
>> via guix-patches; many are pushed directly.
>>
>> For instance, the «Commit policy» section says «For patches that just
>> add a new package, and a simple one, it’s OK to commit, if you’re
>> confident (which means you successfully built it in a chroot setup, and
>> have done a reasonable copyright and license auditing).»
>>
>> And from my point of view, new packages should *always* go via
>> guix-patches: wait 15 days, then push if there is no remark.  It gives
>> the community time to chime in.  And if no one does, it only slows
>> things down by two weeks.
>
> Three comments: (1) two weeks for a trivial Python/R/C package (the
> “simple one” rule above) can be a lot, (2) committers are by definition
> trusted to not mess up and to clean up behind them, and (3) when a
> simple package is pushed there’s after-the-fact peer review in practice
> (sometimes I reply to guix-commits notifications, for example.)

While I agree for updates and various fixes, I disagree when it comes to
adding new packages.  I do not see what is so urgent; I mean, if it were
so urgent, why had the package not been added before?  Therefore, I
disagree with (1) when adding new packages.

About (2), yes, I agree that committers are by definition trusted.  No
question here.  However, more eyes prevent (a) mistakes and (b) some
ramblings.

One question is “encouragement” for reviewing, somehow.  Asking for new
package additions to go via guix-patches is a way of establishing a kind
of equality between contributors.  As someone without commit access, I
can tell you that it is often demotivating to send a trivial addition,
wait forever, and ping people (as an aside, I know whom I have to
ping :-)).  Usually, it means people are busy elsewhere, so I try to
help reduce the workload by reviewing stuff or doing bug triage.  At the
same time, however, I see committers push their own trivial additions.
It appears “unfair” to me.  Why are committers’ trivial additions more
“urgent” than mine?

I do not blame anyone, obviously not!  I am just commenting on the
current process from my side, in order to improve it.

About (3), if peer review before pushing can be used, how is
after-the-fact peer review a justification?  Mistakes happen and I am
fine with that.  And hopefully they can be fixed.  However, they remain
forever in the Git history; so here we have avoidable “tragic” commits.

(This tricking of peer review is a corollary of a more general rambling
about review. :-))

> I think it’s about finding the right balance to be reasonably efficient
> while not compromising on quality.

I totally agree.  And I do not see or understand where the inefficiency
is in asking new packages to go via guix-patches and wait two weeks.

Could you provide one concrete example of an urgent trivial package
addition?


Cheers,
simon



Re: Tricking peer review

2021-10-19 Thread Ludovic Courtès
Hi,

zimoun  skribis:

> I agree, with a minor comment.  In my opinion, not enough patches go
> via guix-patches; many are pushed directly.
>
> For instance, the «Commit policy» section says «For patches that just
> add a new package, and a simple one, it’s OK to commit, if you’re
> confident (which means you successfully built it in a chroot setup, and
> have done a reasonable copyright and license auditing).»
>
> And from my point of view, new packages should *always* go via
> guix-patches: wait 15 days, then push if there is no remark.  It gives
> the community time to chime in.  And if no one does, it only slows
> things down by two weeks.

Three comments: (1) two weeks for a trivial Python/R/C package (the
“simple one” rule above) can be a lot, (2) committers are by definition
trusted to not mess up and to clean up behind them, and (3) when a
simple package is pushed there’s after-the-fact peer review in practice
(sometimes I reply to guix-commits notifications, for example.)

I think it’s about finding the right balance to be reasonably efficient
while not compromising on quality.

Ludo’.



Re: Tricking peer review

2021-10-19 Thread zimoun
Hi Ludo,

On Mon, 18 Oct 2021 at 09:40, Ludovic Courtès  wrote:

> I don’t think there are “tragic” commits either.  Usually, one records
> a revision that works for them and uses the same one months later.

Drifting from the initial comment.  One could say “tragic” commits are
those which break “guix pull”.  It is rare these days, but some are
still reachable via “guix time-machine”.


Cheers,
simon



Re: Tricking peer review

2021-10-19 Thread zimoun
Hi,

On Fri, 15 Oct 2021 at 20:54, Ludovic Courtès  wrote:

> --8<---cut here---start->8---
> $ guix build -f /tmp/content-addressed.scm  -S --check 
> La jena derivaĵo estos konstruata:
>/gnu/store/nq2jdzbv3nh9b1mglan54dcpfz4l7bli-sed-4.8.tar.gz.drv
> building /gnu/store/nq2jdzbv3nh9b1mglan54dcpfz4l7bli-sed-4.8.tar.gz.drv...
>
> Starting download of 
> /gnu/store/1mlpazwwa2mi35v7jab5552lm3ssvn6r-sed-4.8.tar.gz
> From https://ftpmirror.gnu.org/gnu/zed/sed-4.8.tar.gz...
> following redirection to 
> `https://mirror.cyberbits.eu/gnu/zed/sed-4.8.tar.gz'...
> download failed "https://mirror.cyberbits.eu/gnu/zed/sed-4.8.tar.gz" 404
> "Not Found"
>
> [...]
>
> Starting download of 
> /gnu/store/1mlpazwwa2mi35v7jab5552lm3ssvn6r-sed-4.8.tar.gz
> From 
> https://archive.softwareheritage.org/api/1/content/sha256:58e6751c41a7c25bfc6e9363a41786cff3ba5709cf11d5ad903cf7cce31cc3fb/raw/...
> downloading from 
> https://archive.softwareheritage.org/api/1/content/sha256:58e6751c41a7c25bfc6e9363a41786cff3ba5709cf11d5ad903cf7cce31cc3fb/raw/
>  ...
>
> warning: rewriting hashes in 
> `/gnu/store/mgais6lk92mm8n5kyx70knr11jbwgfhr-sed-4.8.tar.gz'; cross fingers
> successfully built 
> /gnu/store/nq2jdzbv3nh9b1mglan54dcpfz4l7bli-sed-4.8.tar.gz.drv
> --8<---cut here---end--->8---

The message can be even more “spottable” than the
’archive.softwareheritage.org’ URL if the vault requires cooking: one
will see «SWH vault: requested bundle cooking, waiting for
completion...» or «SWH vault: retrying...».

Yeah, it is hidden with the option ’-v0’, but then it is the user’s
responsibility to check, IMHO.
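A reviewer can grep a verbose build log for these tell-tale lines
mechanically.  A minimal sketch: the indicator strings are taken from the
messages quoted in this thread, while the function name and the sample log are
made up:

```python
# Indicator strings taken from the log excerpts quoted in this thread.
INDICATORS = (
    "archive.softwareheritage.org",
    "SWH vault: requested bundle cooking",
    "SWH vault: retrying",
    "rewriting hashes in",
)

def fallback_hits(build_log):
    """Return log lines suggesting the source came from a fallback
    rather than from its declared URL."""
    return [line for line in build_log.splitlines()
            if any(marker in line for marker in INDICATORS)]

sample_log = """\
Starting download of /gnu/store/...-sed-4.8.tar.gz
download failed "https://mirror.cyberbits.eu/gnu/zed/sed-4.8.tar.gz" 404 "Not Found"
downloading from https://archive.softwareheritage.org/api/1/content/sha256:.../raw/
warning: rewriting hashes in `/gnu/store/...-sed-4.8.tar.gz'; cross fingers
"""
hits = fallback_hits(sample_log)
assert len(hits) == 2   # the SWH download line and the hash-rewriting warning
```

As noted above, a build run with ’-v0’ suppresses these lines, so review
builds should be run verbosely.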

> It’s nothing new, it’s what I do when I want to test the download
> fallbacks (see also ‘GUIX_DOWNLOAD_FALLBACK_TEST’ in commit
> c4a7aa82e25503133a1bd33148d17968c899a5f5).  Still, I wonder if it could
> somehow be abused to have malicious packages pass review.

Yes, it is nothing new.  Somehow, it is an issue with any
content-addressed system.  Here, we need to split the issue depending on
the origin method:

 - url-fetch: the attacker has to introduce the tarballs into SWH.
   There are not many ways to do this, from my understanding: SWH
   ingests tarballs via loaders, for instance from gnu.org,
   sources.json, debian.org, etc.  Therefore the attacker has to
   introduce the malicious code on these platforms.

 - url-fetch without metadata (as in your example): indeed, the reviewer
   could be fooled; mitigated by the fact that “guix lint” spots the
   potential issue.

 - url-fetch with metadata: the attacker also has to corrupt
   Disarchive-DB.  Otherwise, the tarball returned by SWH will not
   match the checksum.

 - svn-fetch, hg-fetch, cvs-fetch: no attack possible, yet.

 - git-fetch: this is the *real* issue, because it is easy for the
   attacker to introduce malicious code into SWH (create a repo on
   GitHub, click Save, done), then submit a package using it as you
   did.  It is the same case as url-fetch without metadata but easier
   for the attacker.  It is mitigated by “guix lint”.
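The taxonomy above can be captured as a small lookup table, purely as a
restatement of the list (the labels and the function are informal inventions,
not anything in Guix):

```python
# The per-origin attack-surface analysis, restated as a lookup table.
ORIGIN_RISK = {
    ("url-fetch", "trusted loaders"): "attacker must compromise gnu.org, sources.json, debian.org, etc.",
    ("url-fetch", "no metadata"):     "reviewer can be fooled; mitigated by 'guix lint'",
    ("url-fetch", "with metadata"):   "attacker must also corrupt Disarchive-DB",
    ("svn-fetch", None):              "no attack possible, yet",
    ("hg-fetch", None):               "no attack possible, yet",
    ("cvs-fetch", None):              "no attack possible, yet",
    ("git-fetch", None):              "easiest path: push a repo, let SWH ingest it; mitigated by 'guix lint'",
}

def risk(method, variant=None):
    """Look up the informal risk note for an origin method."""
    return ORIGIN_RISK.get((method, variant), "unclassified")

assert "Disarchive" in risk("url-fetch", "with metadata")
assert risk("svn-fetch") == "no attack possible, yet"
```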

That said, if I were an attacker wanting to corrupt Guix, I would create
a fake project mimicking a complex piece of software.  For instance,
Gmsh is a complex C++ scientific package.  The correct URL is
 and the source at
.  Then, as an attacker, I buy the domain, say gmsh.org, put
malicious code there, and finally submit for inclusion a package using
this latter URL.  The reviewer would be fooled.

That’s why: more eyes, fewer issues. :-)

> Also, just because a URL looks nice and is reachable doesn’t mean the
> source is trustworthy either.  An attacker could submit a package for an
> obscure piece of software that happens to be malware.  The difference
> here is that the trick above would allow targeting a high-impact
> package.

I agree.

> On the plus side, such an attack would be recorded forever in Git
> history.

I agree again. :-)

> All in all, it’s probably not as worrisome as it first seems.  However,
> it’s worth keeping in mind when reviewing a package.

I agree, with a minor comment.  In my opinion, not enough patches go via
guix-patches; many are pushed directly.

For instance, the «Commit policy» section says «For patches that just
add a new package, and a simple one, it’s OK to commit, if you’re
confident (which means you successfully built it in a chroot setup, and
have done a reasonable copyright and license auditing).»

And from my point of view, new packages should *always* go via
guix-patches: wait 15 days, then push if there is no remark.  It gives
the community time to chime in.  And if no one does, it only slows
things down by two weeks.

Cheers,
simon



Re: Tricking peer review

2021-10-18 Thread Ryan Prior
On Monday, October 18th, 2021 at 7:40 AM, Ludovic Courtès 
 wrote:

> Hi Ryan,
> How would we define “bad” though?

A definition isn't necessary; this can be an "I know it when I see it" thing. 
If we have an oops or discover an issue (say, "oh darn, that lives in the repo 
forever now"), we'd be able to leave a note for all who later try to visit 
impacted commits that all was not well.

Some of this is already captured by our CVE scanning feature, but other things 
(like your hypothetical "somebody snuck a bad `sed` in!") would benefit from 
yet more explanation. We don't need a perfect and complete definition of "bad" 
to agree that any commit where `sed` is actually `grep` (or malware) is a bad 
commit & merits a warning. This should not interfere with people who want to 
keep using their pinned version of Guix & aren't impacted by the bad package, 
which remains useful as you note.



Re: Tricking peer review

2021-10-18 Thread Ludovic Courtès
Hello,

Thiago Jung Bauermann  skribis:

> I’ve been thinking lately that Guix {sh,c}ould have a new ’release-signing-
> keys’ field in the package record which would list the keys that are known 
> to sign official releases of the package. Then Guix would check the tarball/
> git commit/git tag when downloading it. It would be an additional (and IMHO 
> important) source of truth.

Yes, it’s been discussed a few times and I agree it’d be nice.

The difficulty here is that it’s “silent” metadata: it’s not used, or at
least not necessarily used, as part of the download process.  But maybe
that’s OK: we can have the download process check signatures when
possible.

Right now we rely on ‘guix refresh -u’ or on contributors/reviewers to
perform this check.

> There are details that would need to be hashed out such as how to deal with 
> revoked keys or whether to store the keys themselves on the Guix repo or 
> anywhere else in Guix’s infrastructure, but I think it’s possible to arrive 
> at a reasonable solution.

Perhaps a first step would be to download keys opportunistically.

We have (guix openpgp) which can be used to verify signatures without
taking revocation into account.

Thanks,
Ludo’.



Re: Tricking peer review

2021-10-18 Thread Ludovic Courtès
Hi Ryan,

Ryan Prior  skribis:

> I've suggested this before and this seems like a good time to bring it
> up again: can we create a database of known "bad" Guix commit hashes,
> and make time-machine fetch the list and warn before it'll visit one
> of those hashes? This would resolve the land-mine problem and
> generally de-risk our git tree, which is maintained by fallible
> volunteers who will occasionally push tragic commits.

How would we define “bad” though?

By definition, packages found in past Guix revisions are missing
features and have unfixed bugs and unpatched security vulnerabilities.
Yet, pinning the Guix revision or re-deploying a past revision as-is has
its uses; I’d even say it’s a key feature.

I don’t think there are “tragic” commits either.  Usually, one records a
revision that works for them and uses the same one months later.

Thanks,
Ludo’.



Re: Tricking peer review

2021-10-18 Thread Ludovic Courtès
Moin!

Liliana Marie Prikler  skribis:

> On Friday, 2021-10-15 at 20:54 +0200, Ludovic Courtès wrote:

[...]

>> It’s nothing new, it’s what I do when I want to test the download
>> fallbacks (see also ‘GUIX_DOWNLOAD_FALLBACK_TEST’ in commit
>> c4a7aa82e25503133a1bd33148d17968c899a5f5).  Still, I wonder if it
>> could somehow be abused to have malicious packages pass review.
> I don't think this is much of a problem for packages where we have
> another source of truth (in this case mirrors/archives of sed), but it
> does point at a bigger problem when SWH is our only source of truth. 
> I.e. when trying to conserve such software for the future, when other
> archives might fail and perhaps SHA256 itself might be broken, we can
> no longer be sure that the Guix time-machine indeed does what it
> promises.

At the time a package is committed, its source is normally not
downloaded from SWH—at least that’s what we aim for, and ‘guix lint’
warns against 404 source URLs.  So when the package is reviewed and
committed, people can check the origin of the source, verify it against
published signatures when possible, and so on.

>> Also, just because a URL looks nice and is reachable doesn’t mean the
>> source is trustworthy either.  An attacker could submit a package for
>> an obscure piece of software that happens to be malware.  The
>> difference here is that the trick above would allow targeting a high-
>> impact package.
> Again, less of an issue w.r.t. review because the reviewers can at
> review time check that the tarball matches their expectations.  I
> personally find "I can't find this source anywhere but on SWH" to be a
> perfect reason to reject software in the main Guix channel, though
> perhaps that rule is a bit softer in Guix Past.

Right.  SWH is a fallback, meaning that, eventually, most source gets
downloaded from there (because the original hosting sites vanish); but
again, at the time of review, source must be available elsewhere.

>> On the plus side, such an attack would be recorded forever in Git
>> history.
> On the minus side, time-machine makes said record a landmine to step
> into.

That’s one way to look at it; the same could be said of unpatched
vulnerabilities found in old versions.  It remains that deploying from a
pinned Guix revision has its uses.

[...]

> I agree that cross-checking with “guix download” might be good
> practice for review.

Reviewing includes at least building the package, thus downloading its
source, and running ‘guix lint’.  So there’s nothing really new here, I
guess.

> Perhaps in light of this we should extend it to Git/SVN/other VCS?

Thanks,
Ludo’.



Re: Tricking peer review

2021-10-15 Thread Thiago Jung Bauermann
Hello,

On Friday, October 15, 2021, at 19:03:22 -03, Liliana Marie 
Prikler wrote:
> On Friday, 2021-10-15 at 20:54 +0200, Ludovic Courtès wrote:
> > Consider this file as if it were a patch you’re reviewing:
> > 
> > (define-module (content-addressed))
> > (use-modules (guix)
> >              (guix build-system gnu)
> >              (guix licenses)
> >              (gnu packages perl))
> > 
> > (define-public sed
> >   (package
> >     (name "sed")
> >     (version "4.8")
> >     (source (origin
> >               (method url-fetch)
> >               (uri (string-append "mirror://gnu/zed/sed-" version
> >                                   ".tar.gz"))
> 
> To be fair, gnu/zed sounds wonky, but you could try inserting a version
> that does not exist (e.g. 1+ the current latest version) and as a
> committer thereby bypass review entirely.  However, given that we trust
> committers in this aspect, I'd say they should be able to verify both
> URI and version field.  This is trivially possible with most schemes
> save for the mirror:// one.
> 
> >               (sha256
> >                (base32
> >                 "1yy33kiwrxrwj2nxa4fg15bvmwyghqbs8qwkdvy5phm784f7brjq"))))
> >     (build-system gnu-build-system)
> >     (synopsis "Stream editor")
> >     (native-inputs
> >      `(("perl" ,perl)))   ;for tests
> >     (description
> >      "Sed is a non-interactive, text stream editor.  It receives a text
> > input from a file or from standard input and it then applies a series of text
> > editing commands to the stream and prints its output to standard output.  It
> > is often used for substituting text patterns in a stream.  The GNU
> > implementation offers several extensions over the standard utility.")
> >     (license gpl3+)
> >     (home-page "https://www.gnu.org/software/sed/")))
> > 
> > sed
> > 
> > It builds just fine:
> > 
> > --8<---cut here---start->8---
> > $ guix build -f /tmp/content-addressed.scm
> > /gnu/store/lpais26sjwxcyl7y7jqns6f5qrbrnb34-sed-4.8
> > $ guix build -f /tmp/content-addressed.scm -S --check -v0
> > /gnu/store/mgais6lk92mm8n5kyx70knr11jbwgfhr-sed-4.8.tar.gz
> > --8<---cut here---end--->8---
> > 
> > Did you spot a problem?
> > 
> > …
> > 
> > 
> > So, what did we just build?
> > 
> > --8<---cut here---start->8---
> > $ ls $(guix build -f /tmp/content-addressed.scm)/bin
> > egrep  fgrep  grep
> > --8<---cut here---end--->8---
> > 
> > Oh oh!  This ‘sed’ package is giving us ‘grep’!  How come?
> > 
> > The trick is easy: we give a URL that’s actually 404, with the hash
> > of a file that can be found on Software Heritage (in this case, that
> > of ‘grep-3.4.tar.xz’).  When downloading the source, the automatic
> > content-addressed fallback kicks in, and voilà:
> > 
> > --8<---cut here---start->8---
> > $ guix build -f /tmp/content-addressed.scm  -S --check
> > 
> > La jena derivaĵo estos konstruata:
> >/gnu/store/nq2jdzbv3nh9b1mglan54dcpfz4l7bli-sed-4.8.tar.gz.drv
> > 
> > building /gnu/store/nq2jdzbv3nh9b1mglan54dcpfz4l7bli-sed-
> > 4.8.tar.gz.drv...
> > 
> > Starting download of /gnu/store/1mlpazwwa2mi35v7jab5552lm3ssvn6r-sed-
> > 4.8.tar.gz
> > 
> > From https://ftpmirror.gnu.org/gnu/zed/sed-4.8.tar.gz...
> > 
> > following redirection to `
> > https://mirror.cyberbits.eu/gnu/zed/sed-4.8.tar.gz'...
> > download failed "https://mirror.cyberbits.eu/gnu/zed/sed-4.8.tar.gz"
> > 404 "Not Found"
> > 
> > [...]
> > 
> > Starting download of /gnu/store/1mlpazwwa2mi35v7jab5552lm3ssvn6r-sed-
> > 4.8.tar.gz
> > 
> > From https://archive.softwareheritage.org/api/1/content/sha256:58e6751c41a7c25bfc6e9363a41786cff3ba5709cf11d5ad903cf7cce31cc3fb/raw/ ...
> > 
> > downloading from https://archive.softwareheritage.org/api/1/content/sha256:58e6751c41a7c25bfc6e9363a41786cff3ba5709cf11d5ad903cf7cce31cc3fb/raw/ ...
> > 
> > warning: rewriting hashes in
> > `/gnu/store/mgais6lk92mm8n5kyx70knr11jbwgfhr-sed-4.8.tar.gz'; cross
> > fingers
> > successfully built /gnu/store/nq2jdzbv3nh9b1mglan54dcpfz4l7bli-sed-
> > 4.8.tar.gz.drv
> > --8<---cut here---end--->8---
> > 
> > It’s nothing new, it’s what I do when I want to test the download
> > fallbacks (see also ‘GUIX_DOWNLOAD_FALLBACK_TEST’ in commit
> > c4a7aa82e25503133a1bd33148d17968c899a5f5).  Still, I wonder if it
> > could somehow be abused to have malicious packages pass review.
> 
> I don't think this is much of a problem for packages where we have
> another source of truth (in this case mirrors/archives of sed), but it
> does point at a bigger problem when SWH is our only source of truth.
> I.e. when trying to conserve such software for the future

Re: Tricking peer review

2021-10-15 Thread Ryan Prior
‐‐‐ Original Message ‐‐‐
> A "bad" commit might still be perfectly fine to fetch certain things from if 
> they're unaffected by it

The database could store a comment with each "bad" commit hash to help people 
decide if they're affected. It could even go further and include a list of 
tainted packages, so you could programmatically determine whether one of them 
is in your dependency tree.
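The programmatic check suggested above could look something like this (an illustrative Python sketch; the entry schema, commit hashes, and function names are hypothetical, not an existing Guix interface):

```python
# Hypothetical denylist schema: commit hash -> comment + tainted packages.
BAD_COMMITS = {
    "0123abc": {                        # hypothetical commit hash
        "comment": "source swapped via content-addressed fallback",
        "tainted": {"sed"},
    },
}

def tainted_in_closure(commit, closure, bad_commits=BAD_COMMITS):
    """Return the tainted packages from COMMIT that occur in CLOSURE,
    the set of package names in the user's dependency tree; an empty
    set means the bad commit does not affect this user."""
    entry = bad_commits.get(commit)
    return entry["tainted"] & set(closure) if entry else set()
```

So `tainted_in_closure("0123abc", ["sed", "grep"])` would flag `sed`, while a closure without any tainted package would come back empty and the commit could still be used safely for everything else.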

> you're now tasked with the job of keeping the list of bad commits safe 
> somehow.

Right now afaik Ludovic's key is the root of trust (is this still true?) so I 
imagine we'd sign the list too, with that key or some other key signed by it.

> In some situations resetting a branch might work, but obviously not for 
> months old sleeper commits.

Not sure what you mean by this, can you explain?



Re: Tricking peer review

2021-10-15 Thread Liliana Marie Prikler
Am Freitag, den 15.10.2021, 22:28 + schrieb Ryan Prior:
> On Friday, October 15th, 2021 at 10:03 PM, Liliana Marie Prikler <
> liliana.prik...@gmail.com> wrote:
> 
> > > On the plus side, such an attack would be recorded forever in Git
> > > 
> > > history.
> > 
> > On the minus side, time-machine makes said record a landmine to
> > step
> > 
> > into.
> 
> I've suggested this before and this seems like a good time to bring
> it up again: can we create a database of known "bad" Guix commit
> hashes, and make time-machine fetch the list and warn before it'll
> visit one of those hashes? This would resolve the land-mine problem
> and generally de-risk our git tree, which is maintained by fallible
> volunteers who will occasionally push tragic commits.
I don't think things would be quite as simple.  A "bad" commit might
still be perfectly fine to fetch certain things from if they're
unaffected by it, plus you're now tasked with the job of keeping the
list of bad commits safe somehow.  In some situations resetting a
branch might work, but obviously not for months old sleeper commits.




Re: Tricking peer review

2021-10-15 Thread Ryan Prior
On Friday, October 15th, 2021 at 10:03 PM, Liliana Marie Prikler 
 wrote:

> > On the plus side, such an attack would be recorded forever in Git
> >
> > history.
>
> On the minus side, time-machine makes said record a landmine to step
>
> into.

I've suggested this before and this seems like a good time to bring it up 
again: can we create a database of known "bad" Guix commit hashes, and make 
time-machine fetch the list and warn before it'll visit one of those hashes? 
This would resolve the land-mine problem and generally de-risk our git tree, 
which is maintained by fallible volunteers who will occasionally push tragic 
commits.
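The warn-before-visit step could be as simple as the following sketch (illustrative Python, not actual time-machine code; in practice the denylist would be fetched from a trusted location and signature-checked before use):

```python
def check_before_travel(commit, denylist):
    """Return a warning string if COMMIT appears in DENYLIST (a mapping
    from commit hash to a human-readable comment), else None.  A real
    implementation would verify the denylist's signature first."""
    comment = denylist.get(commit)
    if comment is None:
        return None
    return ("guix time-machine: warning: commit %s is known bad: %s"
            % (commit, comment))
```

time-machine would then print the returned warning and ask for confirmation before building anything from that commit.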



Re: Tricking peer review

2021-10-15 Thread Liliana Marie Prikler
Hi,

Am Freitag, den 15.10.2021, 20:54 +0200 schrieb Ludovic Courtès:
> Hello,
> 
> Consider this file as if it were a patch you’re reviewing:

> (define-module (content-addressed))
> (use-modules (guix)
>  (guix build-system gnu)
>  (guix licenses)
>  (gnu packages perl))
> 
> (define-public sed
>   (package
>(name "sed")
>(version "4.8")
>(source (origin
> (method url-fetch)
> (uri (string-append "mirror://gnu/zed/sed-" version ".tar.gz"))
To be fair, gnu/zed sounds wonky, but you could try inserting a version
that does not exist (e.g. 1+ the current latest version) and as a
committer thereby bypass review entirely.  However, given that we trust
committers in this aspect, I'd say they should be able to verify both
URI and version field.  This is trivially possible with most schemes
save for the mirror:// one.
> (sha256
>  (base32
>   "1yy33kiwrxrwj2nxa4fg15bvmwyghqbs8qwkdvy5phm784f7brjq"))))
>(build-system gnu-build-system)
>(synopsis "Stream editor")
>(native-inputs
> `(("perl" ,perl)));for tests
>(description
> "Sed is a non-interactive, text stream editor.  It receives a text
> input from a file or from standard input and it then applies a series
> of text editing commands to the stream and prints its output to
> standard output.  It is often used for substituting text patterns in a
> stream.  The GNU implementation offers several extensions over the
> standard utility.")
>(license gpl3+)
>(home-page "https://www.gnu.org/software/sed/")))
> 
> sed

> It builds just fine:
> 
> --8<---cut here---start->8---
> $ guix build -f /tmp/content-addressed.scm  
> /gnu/store/lpais26sjwxcyl7y7jqns6f5qrbrnb34-sed-4.8
> $ guix build -f /tmp/content-addressed.scm -S --check -v0
> /gnu/store/mgais6lk92mm8n5kyx70knr11jbwgfhr-sed-4.8.tar.gz
> --8<---cut here---end--->8---
> 
> Did you spot a problem?
> 
> …
> 
> 
> So, what did we just build?
> 
> --8<---cut here---start->8---
> $ ls $(guix build -f /tmp/content-addressed.scm)/bin
> egrep  fgrep  grep
> --8<---cut here---end--->8---
> 
> Oh oh!  This ‘sed’ package is giving us ‘grep’!  How come?
> 
> The trick is easy: we give a URL that’s actually 404, with the hash
> of a file that can be found on Software Heritage (in this case, that
> of ‘grep-3.4.tar.xz’).  When downloading the source, the automatic
> content-addressed fallback kicks in, and voilà:
> 
> --8<---cut here---start->8---
> $ guix build -f /tmp/content-addressed.scm  -S --check
> the following derivation will be built:
>    /gnu/store/nq2jdzbv3nh9b1mglan54dcpfz4l7bli-sed-4.8.tar.gz.drv
> building /gnu/store/nq2jdzbv3nh9b1mglan54dcpfz4l7bli-sed-4.8.tar.gz.drv...
>
> Starting download of /gnu/store/1mlpazwwa2mi35v7jab5552lm3ssvn6r-sed-4.8.tar.gz
> From https://ftpmirror.gnu.org/gnu/zed/sed-4.8.tar.gz...
> following redirection to `https://mirror.cyberbits.eu/gnu/zed/sed-4.8.tar.gz'...
> download failed "https://mirror.cyberbits.eu/gnu/zed/sed-4.8.tar.gz" 404 "Not Found"
>
> [...]
>
> Starting download of /gnu/store/1mlpazwwa2mi35v7jab5552lm3ssvn6r-sed-4.8.tar.gz
> From https://archive.softwareheritage.org/api/1/content/sha256:58e6751c41a7c25bfc6e9363a41786cff3ba5709cf11d5ad903cf7cce31cc3fb/raw/...
> downloading from https://archive.softwareheritage.org/api/1/content/sha256:58e6751c41a7c25bfc6e9363a41786cff3ba5709cf11d5ad903cf7cce31cc3fb/raw/ ...
>
> warning: rewriting hashes in `/gnu/store/mgais6lk92mm8n5kyx70knr11jbwgfhr-sed-4.8.tar.gz'; cross fingers
> successfully built /gnu/store/nq2jdzbv3nh9b1mglan54dcpfz4l7bli-sed-4.8.tar.gz.drv
> --8<---cut here---end--->8---
> 
> It’s nothing new, it’s what I do when I want to test the download
> fallbacks (see also ‘GUIX_DOWNLOAD_FALLBACK_TEST’ in commit
> c4a7aa82e25503133a1bd33148d17968c899a5f5).  Still, I wonder if it
> could somehow be abused to have malicious packages pass review.
I don't think this is much of a problem for packages where we have
another source of truth (in this case mirrors/archives of sed), but it
does point at a bigger problem when SWH is our only source of truth. 
I.e. when trying to conserve such software for the future, when other
archives might fail and perhaps SHA256 itself might be broken, we can
no longer be sure that the Guix time-machine indeed does what it
promises.

> Also, just because a URL looks nice and is reachable doesn’t mean the
> source is trustworthy either.  An attacker could submit a package for
> an obscure piece of software that happens to be malware.  The
> difference here is that the trick above would allow targeting a high-
> impact package.
Again, less

Tricking peer review

2021-10-15 Thread Ludovic Courtès
Hello,

Consider this file as if it were a patch you’re reviewing:

(define-module (content-addressed))
(use-modules (guix)
 (guix build-system gnu)
 (guix licenses)
 (gnu packages perl))

(define-public sed
  (package
    (name "sed")
    (version "4.8")
    (source (origin
              (method url-fetch)
              (uri (string-append "mirror://gnu/zed/sed-" version
                                  ".tar.gz"))
              (sha256
               (base32
                "1yy33kiwrxrwj2nxa4fg15bvmwyghqbs8qwkdvy5phm784f7brjq"))))
    (build-system gnu-build-system)
    (synopsis "Stream editor")
    (native-inputs
     `(("perl" ,perl)))                 ; for tests
    (description
     "Sed is a non-interactive, text stream editor.  It receives a text
input from a file or from standard input and it then applies a series of text
editing commands to the stream and prints its output to standard output.  It
is often used for substituting text patterns in a stream.  The GNU
implementation offers several extensions over the standard utility.")
    (license gpl3+)
    (home-page "https://www.gnu.org/software/sed/")))

sed

It builds just fine:

--8<---cut here---start->8---
$ guix build -f /tmp/content-addressed.scm  
/gnu/store/lpais26sjwxcyl7y7jqns6f5qrbrnb34-sed-4.8
$ guix build -f /tmp/content-addressed.scm -S --check -v0
/gnu/store/mgais6lk92mm8n5kyx70knr11jbwgfhr-sed-4.8.tar.gz
--8<---cut here---end--->8---

Did you spot a problem?

…


So, what did we just build?

--8<---cut here---start->8---
$ ls $(guix build -f /tmp/content-addressed.scm)/bin
egrep  fgrep  grep
--8<---cut here---end--->8---

Oh oh!  This ‘sed’ package is giving us ‘grep’!  How come?

The trick is easy: we give a URL that’s actually 404, with the hash of a
file that can be found on Software Heritage (in this case, that of
‘grep-3.4.tar.xz’).  When downloading the source, the automatic
content-addressed fallback kicks in, and voilà:

--8<---cut here---start->8---
$ guix build -f /tmp/content-addressed.scm  -S --check
the following derivation will be built:
   /gnu/store/nq2jdzbv3nh9b1mglan54dcpfz4l7bli-sed-4.8.tar.gz.drv
building /gnu/store/nq2jdzbv3nh9b1mglan54dcpfz4l7bli-sed-4.8.tar.gz.drv...

Starting download of /gnu/store/1mlpazwwa2mi35v7jab5552lm3ssvn6r-sed-4.8.tar.gz
From https://ftpmirror.gnu.org/gnu/zed/sed-4.8.tar.gz...
following redirection to `https://mirror.cyberbits.eu/gnu/zed/sed-4.8.tar.gz'...
download failed "https://mirror.cyberbits.eu/gnu/zed/sed-4.8.tar.gz" 404 "Not Found"

[...]

Starting download of /gnu/store/1mlpazwwa2mi35v7jab5552lm3ssvn6r-sed-4.8.tar.gz
From https://archive.softwareheritage.org/api/1/content/sha256:58e6751c41a7c25bfc6e9363a41786cff3ba5709cf11d5ad903cf7cce31cc3fb/raw/...
downloading from https://archive.softwareheritage.org/api/1/content/sha256:58e6751c41a7c25bfc6e9363a41786cff3ba5709cf11d5ad903cf7cce31cc3fb/raw/ ...

warning: rewriting hashes in `/gnu/store/mgais6lk92mm8n5kyx70knr11jbwgfhr-sed-4.8.tar.gz'; cross fingers
successfully built /gnu/store/nq2jdzbv3nh9b1mglan54dcpfz4l7bli-sed-4.8.tar.gz.drv
--8<---cut here---end--->8---

It’s nothing new, it’s what I do when I want to test the download
fallbacks (see also ‘GUIX_DOWNLOAD_FALLBACK_TEST’ in commit
c4a7aa82e25503133a1bd33148d17968c899a5f5).  Still, I wonder if it could
somehow be abused to have malicious packages pass review.
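The essence of the fallback that makes this possible can be sketched as follows (illustrative Python, not Guix's actual implementation; the function and parameter names are made up for the sketch).  The key point is that only the content hash is verified, never the URL, so any archive serving bytes with the declared hash is accepted:

```python
import hashlib
from urllib.error import URLError

def fetch_with_fallback(urls, expected_sha256, download):
    """Try each URL in turn; on failure, fall back to fetching by
    content hash from an archive such as Software Heritage.  DOWNLOAD
    is a function url -> bytes that raises URLError on failure."""
    candidates = list(urls) + [
        "https://archive.softwareheritage.org/api/1/content/sha256:%s/raw/"
        % expected_sha256
    ]
    for url in candidates:
        try:
            data = download(url)
        except URLError:
            continue
        # The hash, not the URL, is what is actually checked: any bytes
        # matching EXPECTED_SHA256 are accepted, wherever they came from.
        if hashlib.sha256(data).hexdigest() == expected_sha256:
            return data
    raise ValueError("no source matched the expected hash")
```

With a 404 URL and the hash of some other tarball, the loop falls through to the content-addressed archive and happily returns that other tarball, which is exactly the trick shown above.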

Granted, ‘guix lint’ immediately flags the issue:

--8<---cut here---start->8---
$ guix lint -L /tmp/p sed
guix lint: warning: ambiguous package specification 'sed'
guix lint: warning: choosing sed@4.8 from /tmp/content-addressed.scm:8:2
/tmp/content-addressed.scm:11:11: sed@4.8: all the source URIs are unreachable:
/tmp/content-addressed.scm:11:11: sed@4.8: URI https://ftpmirror.gnu.org/gnu/zed/sed-4.8.tar.gz not reachable: 404 ("Not Found")
/tmp/content-addressed.scm:11:11: sed@4.8: URI ftp://ftp.cs.tu-berlin.de/pub/gnu/zed/sed-4.8.tar.gz domain not found: Name or service not known
/tmp/content-addressed.scm:11:11: sed@4.8: URI ftp://ftp.funet.fi/pub/mirrors/ftp.gnu.org/gnu/zed/sed-4.8.tar.gz not reachable: 550 ("Can't change directory to zed: No such file or directory")
/tmp/content-addressed.scm:11:11: sed@4.8: URI http://ftp.gnu.org/pub/gnu/zed/sed-4.8.tar.gz not reachable: 404 ("Not Found")
--8<---cut here---end--->8---

Also, just because a URL looks nice and is reachable doesn’t mean the
source is trustworthy either.  An attacker could submit a package for an
obscure piece of software that happens to be malware.  The difference
here is that the trick above would allow targeting a high-impact
package.

On the plus side, such an attack would be recorded forever in Git
history.

Als