Bug#884095: flag to force file types

2018-03-22 Thread Hans-Christoph Steiner

Chris Lamb:
> Hi Hans,
> 
>> It would be literally impossible to auto-detect since a Janus APK is
>> both a valid DEX file (starting with the bytes "dex") and […]
> 
> Oh dear, I got a little lost in the weeds of Janus/APK/ZIP here..
> 
> Could you excuse my pedanticness and ask for direct links to files,
> what you are seeing and what you are expecting?  That would immediately
> clarify a few questions and avoid a lengthy back-and-forth :)
> 
> 
> Best wishes,

https://www.androidpolice.com/wp-content/uploads/janus-poc/HelloWorld-Janus.apk

Or create your own:
https://github.com/odensc/janus

.hc

___
Reproducible-builds mailing list
Reproducible-builds@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/reproducible-builds

Bug#884095: flag to force file types

2018-03-22 Thread Hans-Christoph Steiner


Chris Lamb:
> Hi Hans!
> 
>>> Have we really exhausted the detection route for this? :)
>>
>> I think the detection route has been exhausted.  It seems that no one
>> wants to do what it takes to reliably detect APKs. 
> 
> I'm sorry you think so and, with the greatest of respect, I'm not
> sure this is entirely accurate... at least from my point of view.
> 
> Could you perhaps attach or otherwise link to some testcases where
> diffoscope gets the detection wrong? It sounds like a fun challenge,
> if nothing else..

Any Janus APK, including the examples linked from the GitHub repository,
is a test case.

It would be literally impossible to auto-detect since a Janus APK is
both a valid DEX file (starting with the bytes "dex") and a functional
ZIP and APK.  Most ZIP readers will happily skip any bytes that don't
make sense before the ZIP contents, since the file information is stored
at the end of a ZIP.  A Janus APK is technically not an officially valid
ZIP, since it has non-ZIP bytes before the ZIP header. The most recent
APK tools now reject Janus APKs as invalid, but zip tools will still
happily work with them.  So in my case, I'd want to compare a valid APK
with a modified version of the valid APK that turns it into a Janus APK.
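
To make the ambiguity concrete, here is a minimal Python sketch. It is not diffoscope or libfile code; the names `container_kinds` and `make_zip` are made up for illustration, and the "Janus-like" input is only a crude stand-in (DEX-looking bytes glued in front of a ZIP), not a real exploit file. It relies on the standard `zipfile` module, which reads the archive from the end of the data and therefore tolerates leading bytes.

```python
import io
import zipfile

DEX_MAGIC = b"dex\n"  # a DEX file starts with these four bytes


def container_kinds(data: bytes) -> set:
    """Return every container type the given bytes can be read as."""
    kinds = set()
    if data.startswith(DEX_MAGIC):
        kinds.add("dex")
    try:
        # zipfile locates the archive via the end-of-central-directory
        # record, so bytes prepended before the ZIP data are tolerated.
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            zf.namelist()
        kinds.add("zip")
    except zipfile.BadZipFile:
        pass
    return kinds


def make_zip(members: dict) -> bytes:
    """Build a small in-memory ZIP from a {name: content} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, content in members.items():
            zf.writestr(name, content)
    return buf.getvalue()


plain = make_zip({"AndroidManifest.xml": b"<manifest/>"})
# Hypothetical stand-in for a Janus file: DEX-looking bytes, then a ZIP.
janus_like = DEX_MAGIC + b"035\x00" + b"\x00" * 32 + plain

print(sorted(container_kinds(plain)))       # ['zip']
print(sorted(container_kinds(janus_like)))  # ['dex', 'zip']
```

Both answers are correct for the second input, so a detector that has to pick exactly one type has no principled way to choose.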

As for increasing the reliability of the auto-detection, I think libfile
could do a quick check for APK Signature v2 or v3, then reliably mark
the file as an APK (vs. ZIP or JAR).

APK Signature v2:
https://source.android.com/security/apksigning/v2

APK Signature v3:
https://android-review.googlesource.com/c/platform/tools/apksig/+/587834/
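
As a rough illustration of such a check, the sketch below assumes the layout described in the v2 documentation: the APK Signing Block sits immediately before the ZIP central directory and ends with the 16-byte magic `APK Sig Block 42`. The function names are hypothetical, ZIP64 and prepended data are ignored, and the "signed" sample is a hand-built empty block rather than a real signature.

```python
import io
import struct
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"       # ZIP end-of-central-directory record
APK_SIG_MAGIC = b"APK Sig Block 42"  # trailer of the APK Signing Block
EOCD_MIN_SIZE = 22


def central_directory_offset(data: bytes):
    """Return (eocd_position, cd_offset), or None if no usable EOCD is found."""
    eocd = data.rfind(EOCD_SIGNATURE)
    if eocd < 0 or eocd + EOCD_MIN_SIZE > len(data):
        return None
    # The central directory offset is the little-endian uint32 at EOCD + 16.
    (cd_offset,) = struct.unpack_from("<I", data, eocd + 16)
    if cd_offset > eocd:
        return None
    return eocd, cd_offset


def has_apk_signing_block(data: bytes) -> bool:
    """Check for the signing-block magic right before the central directory."""
    located = central_directory_offset(data)
    if located is None:
        return False
    _, cd_offset = located
    magic_len = len(APK_SIG_MAGIC)
    return cd_offset >= magic_len and data[cd_offset - magic_len:cd_offset] == APK_SIG_MAGIC


# Demo: a plain ZIP, then the same ZIP with a fake, empty signing block
# spliced in before the central directory (EOCD offset patched to match).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("AndroidManifest.xml", b"<manifest/>")
plain = buf.getvalue()

eocd, cd = central_directory_offset(plain)
empty_block = struct.pack("<Q", 24) + struct.pack("<Q", 24) + APK_SIG_MAGIC
patched = bytearray(plain[:cd] + empty_block + plain[cd:])
struct.pack_into("<I", patched, eocd + len(empty_block) + 16, cd + len(empty_block))
signed = bytes(patched)

print(has_apk_signing_block(plain), has_apk_signing_block(signed))  # False True
```

This only identifies APKs that carry a v2 or later signature; APKs with just a JAR (v1) signature, or unsigned ones, would still look like ordinary ZIPs.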

.hc



Bug#884095: flag to force file types

2018-03-21 Thread Hans-Christoph Steiner
Chris Lamb:
> severity 884095 wishlist
> thanks
> 
> Hi hc,
> 
>> Something like --force=apk would solve both.
> 
> So, I'm a little nervous about introducing such a directive.
> 
> This is primarily in terms that diffoscope should really just Do The
> Right Thing by default in all cases and not need magic flags to get
> the desired result. :)
> 
> This is not just a better user experience but also has real practical
> implications; it is not tidy (or even possible) to specify such flags
> in automated or hosted CI environments such as tests.reproducible-builds.org,
> try.diffoscope.org, Travis CI, etc. on a per-package basis.
> 
> Whilst we might have other flags that you could point to that would
> violate this informal "rule", I would certainly cheer their removal.
> 
> (There are also — entirely secondary — concerns around whether this
> flag would change the behaviour in nested files as well, but we can
> leave that for now..)
> 
> Have we really exhausted the detection route for this? :)
> 
> 
> Regards,
> 

I think the detection route has been exhausted.  It seems that no one
wants to do what it takes to reliably detect APKs.  I understand why
libfile does not want to include more elaborate checks like:

* ZIP file
* with AndroidManifest.xml in it

There are also often cases when working with malware samples that they
are deliberately created to avoid being detected as APKs, for example
the "Janus" vuln https://github.com/odensc/janus.  That works by making
the APK seem like a DEX file.


Bug#890904: diffoscope does not show classes.dex diff

2018-02-26 Thread Hans-Christoph Steiner
Sorry, the server was down.  It's back up, so you should be able to access
those links now.



Bug#890904: diffoscope does not show classes.dex diff

2018-02-20 Thread Hans-Christoph Steiner

Package: diffoscope
Version: 90~bpo9+1

Attached are two APKs that have different classes.dex files.  They are
the same size, but have different contents.  diffoscope does not show a
diff for them.  When I extract the classes.dex files from the APK, diff
and vbindiff do show the differences.
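
For reference, the member-level comparison I expected can be approximated with a few lines of Python. This is only a sketch of the idea, not how diffoscope works internally; the helper names and the tiny sample archives are invented for illustration.

```python
import hashlib
import io
import zipfile


def member_digests(data: bytes) -> dict:
    """Map each member name of a ZIP (given as bytes) to the SHA-256 of its content."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return {name: hashlib.sha256(zf.read(name)).hexdigest() for name in zf.namelist()}


def differing_members(a: bytes, b: bytes) -> list:
    """Return sorted names whose content differs or that exist in only one archive."""
    da, db = member_digests(a), member_digests(b)
    return sorted(name for name in da.keys() | db.keys() if da.get(name) != db.get(name))


def make_zip(members: dict) -> bytes:
    """Build a small in-memory ZIP from a {name: content} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, content in members.items():
            zf.writestr(name, content)
    return buf.getvalue()


# Two archives whose classes.dex has the same size but different bytes.
first = make_zip({"AndroidManifest.xml": b"<manifest/>", "classes.dex": b"dex\n035\x00AAAA"})
second = make_zip({"AndroidManifest.xml": b"<manifest/>", "classes.dex": b"dex\n035\x00BBBB"})

print(differing_members(first, second))  # ['classes.dex']
```

Comparing content hashes rather than sizes is what catches the same-size case described above.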

Here are the test files:
https://verification.f-droid.org/tmp/a2dp.Vol_137.apk
https://verification.f-droid.org/tmp/sigcp_a2dp.Vol_137.apk

And the report:
https://verification.f-droid.org/tmp/a2dp.Vol_137.apk.diffoscope.txt
https://verification.f-droid.org/tmp/a2dp.Vol_137.apk.diffoscope.html



Bug#884095: flag to force file types

2017-12-11 Thread Hans-Christoph Steiner

Package: diffoscope
Version: 88

The Janus bug for Android works by making a valid APK file that is also
a valid DEX file.

https://www.guardsquare.com/en/blog/new-android-vulnerability-allows-attackers-modify-apps-without-affecting-their-signatures

Diffoscope sees these files as different file types, so there is no way
to inspect the malware payload. Given this and the file detection issues
in #849782, there should be a way to force which kind of comparison
diffoscope does.  Something like --force=apk would solve both.

There are two example files attached.


HelloWorld.apk
Description: application/vnd.android.package-archive


HelloWorld-Janus.apk
Description: application/vnd.android.package-archive


signature.asc
Description: OpenPGP digital signature

Bug#875451: diffoscope crashes when using --max-diff-block-lines

2017-09-11 Thread Hans-Christoph Steiner

Package: diffoscope
Version: 85~bpo9+1

`fdroid verify` calls diffoscope like this:

diffoscope --max-report-size 12345678 \
   --max-diff-block-lines 100 \
   --html foo.html --text foo.txt \
   foo.apk another_foo.apk

And it has recently started to crash like this:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/diffoscope/main.py", line 396, in main
    sys.exit(run_diffoscope(parsed_args))
  File "/usr/lib/python3/dist-packages/diffoscope/main.py", line 356, in run_diffoscope
    Config().check_constraints()
  File "/usr/lib/python3/dist-packages/diffoscope/config.py", line 62, in check_constraints
    self.check_ge("max_diff_block_lines", "max_page_diff_block_lines")
  File "/usr/lib/python3/dist-packages/diffoscope/config.py", line 59, in check_ge
    raise ValueError("{0} ({1}) cannot be smaller than {2} ({3})".format(a, va, b, vb))
ValueError: max_diff_block_lines (100) cannot be smaller than max_page_diff_block_lines (128)


Since we're not setting max_page_diff_block_lines, this should not crash.
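
One possible way to avoid the crash, sketched as standalone Python rather than as a patch to diffoscope's Config class: derive the per-page limit when the user did not set it, and only raise when both values were given explicitly and conflict. The function and its signature are hypothetical; only the option names and the value 128 come from the traceback above.

```python
DEFAULT_MAX_PAGE_DIFF_BLOCK_LINES = 128  # value reported in the traceback


def resolve_limits(max_diff_block_lines, max_page_diff_block_lines=None):
    """Return (max_diff_block_lines, max_page_diff_block_lines) such that the
    first value is never smaller than the second."""
    if max_page_diff_block_lines is None:
        # The user did not set the per-page limit, so derive it instead of
        # comparing the user's value against an unrelated default.
        derived = min(DEFAULT_MAX_PAGE_DIFF_BLOCK_LINES, max_diff_block_lines)
        return max_diff_block_lines, derived
    if max_diff_block_lines < max_page_diff_block_lines:
        # Both values were given explicitly and contradict each other.
        raise ValueError(
            "max_diff_block_lines ({0}) cannot be smaller than "
            "max_page_diff_block_lines ({1})".format(
                max_diff_block_lines, max_page_diff_block_lines
            )
        )
    return max_diff_block_lines, max_page_diff_block_lines


print(resolve_limits(100))       # (100, 100)
print(resolve_limits(500))       # (500, 128)
print(resolve_limits(500, 200))  # (500, 200)
```

With this behaviour the `fdroid verify` invocation above would run with both limits at 100 instead of aborting.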

.hc



Bug#868486: diffoscope often fails to detect APKs

2017-07-24 Thread Hans-Christoph Steiner

The APK format is a ZIP file that always includes the files
AndroidManifest.xml and classes.dex.  Then it also always
has a JAR signature (i.e. META-INF/).  It does not have the
JAR magic number CAFEBABE in it.
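
Those rules translate into a short structural check. The following is a sketch only: `looks_like_apk` and its `require_signature` parameter are invented names, and this is not how libfile or diffoscope detect anything. Note that an unsigned APK has no META-INF/ signature, so the last rule has to be optional.

```python
import io
import zipfile

REQUIRED_MEMBERS = ("AndroidManifest.xml", "classes.dex")


def looks_like_apk(data: bytes, require_signature: bool = True) -> bool:
    """Apply the structural rules above to a file given as bytes."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            names = zf.namelist()
    except zipfile.BadZipFile:
        return False
    if not all(member in names for member in REQUIRED_MEMBERS):
        return False
    if require_signature:
        # JAR-style signatures live under META-INF/.
        return any(name.startswith("META-INF/") for name in names)
    return True


def make_zip(member_names) -> bytes:
    """Build an in-memory ZIP containing the given (empty) members."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in member_names:
            zf.writestr(name, b"")
    return buf.getvalue()


signed_apk = make_zip(["AndroidManifest.xml", "classes.dex", "META-INF/MANIFEST.MF"])
unsigned_apk = make_zip(["AndroidManifest.xml", "classes.dex"])
plain_jar = make_zip(["META-INF/MANIFEST.MF", "Example.class"])

print(looks_like_apk(signed_apk))                             # True
print(looks_like_apk(unsigned_apk))                           # False
print(looks_like_apk(unsigned_apk, require_signature=False))  # True
print(looks_like_apk(plain_jar))                              # False
```

The check needs to read the ZIP directory, which is exactly the kind of logic that byte-pattern matching cannot express.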



Bug#868486: updated URL

2017-07-17 Thread Hans-Christoph Steiner

I had to move this APK to here:

https://verification.f-droid.org/logs/Zom-15.1.0-alpha-5-zomrelease-release-unsigned.apk



Bug#868486: diffoscope often fails to detect APKs

2017-07-15 Thread Hans-Christoph Steiner

Package: diffoscope
Version: 83

APKs are basically a ZIP file with a JAR signature, but not necessarily
the CAFEBABE byte sequence that marks a JAR.  This means that comparing
APKs with diffoscope often results in a straight binary diff, which is
useless.

Here's one example:
https://verification.f-droid.org/im.zom.messenger_1510005.binary.apk.diffoscope.html

im.zom.messenger_1510005.binary.apk is available here:
https://verification.f-droid.org/Zom-15.1.0-alpha-5-zomrelease-release-unsigned.apk


im.zom.messenger_1510005.apk is available here:
https://github.com/zom/Zom-Android/releases/download/15.1.0-alpha-5/Zom-15.1.0-alpha-5-zomrelease-release.apk

You can get lots and lots of APKs from here:
https://f-droid.org/packages


I'd like a way to force the file type in diffoscope.   We are calling it
from a build process, so we already know all files are going to be APKs.
Also,  I tried to get this added to libfile, but upstream is not willing
to accept detection routines that rely on more complicated things like
presence of a file in a ZIP. They just want byte patterns, which is not
enough to consistently detect APKs.



Bug#851147: --max-report-size does not apply to --text reports

2017-01-12 Thread Hans-Christoph Steiner

Package: diffoscope
Version: 67

On https://verification.fdroid.org, diffoscope is run like this:

diffoscope --max-report-size 12345678 --max-diff-block-lines 100 \
  --html foo.html  --text bar.txt

The HTML reports are being size-limited, but there are still some giant
text reports, including a few that are hundreds of MB:

https://verification.f-droid.org/?C=S;O=D



Bug#850501: rerunning using diffoscope 67-21-gfe7ae15

2017-01-09 Thread Hans-Christoph Steiner

Thanks for your work on the APK diffing!  To get it running, I had to fix
a typo introduced at the bottom of diffoscope commit
fe7ae15e1c177866acd478af4cc4a51bd5002017: it turned 'f_out' into a
non-existent 'w'.

With that change, diffoscope is now working for me again.  I'm running
it on the fdroid verification server, you can see the output here:

https://verification.f-droid.org/


Also, about apktool.yml, I don't think we need it at all in diffoscope.
It is not a file from the APK and it is not produced by Android SDK
tools. As far as I know, it is only used by apktool if you want to use
apktool to reconstruct a previously "decoded" APK.



Bug#849638: related bug

2017-01-03 Thread Hans-Christoph Steiner

FYI, I filed https://bugs.debian.org/849782 about APKs being
inconsistently detected.



Bug#849638: diffoscope 66 does binary diff on APK files

2016-12-29 Thread Hans-Christoph Steiner


Reiner Herrmann:
> On Thu, Dec 29, 2016 at 12:41:16PM +0100, Hans-Christoph Steiner wrote:
>> When running diffoscope on two APKs using version 66, it now just does a
>> straight binary comparison of the direct file itself.  Running
>> diffoscope 64 generated a nice output of the individual files in the ZIP
>> (an APK is a signed JAR with some other special features). Attached is
>> the output from v66.  You can find v64 output here:
>>
>> http://37.218.242.117/
>>
>> You can download these two APKs to test with from here:
>>
>> http://37.218.242.117/aarddict.android_26.apk
>> https://f-droid.org/repo/aarddict.android_26.apk
> 
> Just had a short look at this. The reason is that file/libmagic detects
> the files differently:
> 
> $ file 1/aarddict.android_26.apk 2/aarddict.android_26.apk 
> 1/aarddict.android_26.apk: Java archive data (JAR)
> 2/aarddict.android_26.apk: Zip archive data, at least v2.0 to extract
> 
> So the APK comparator should probably also work on files detected as zip.
> 

Yeah, that makes sense as long as libmagic cannot reliably detect APKs
vs JAR vs ZIP.
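
A sketch of what "also work on files detected as zip" could look like, assuming the comparator is selected by matching the libmagic description against a regular expression. The pattern and the function name are illustrative assumptions, not diffoscope's actual code; the two descriptions are the ones from the `file` output quoted above.

```python
import re

# Accept both descriptions that libmagic reported for the same kind of file.
ZIP_FAMILY_RE = re.compile(r"^(?:Java archive data|Zip archive data)")


def may_be_apk(filename: str, magic_description: str) -> bool:
    """Treat a file as an APK candidate if it is named *.apk and libmagic
    reports any member of the ZIP family."""
    named_apk = filename.lower().endswith(".apk")
    return named_apk and bool(ZIP_FAMILY_RE.match(magic_description))


print(may_be_apk("1/aarddict.android_26.apk", "Java archive data (JAR)"))
print(may_be_apk("2/aarddict.android_26.apk", "Zip archive data, at least v2.0 to extract"))
```

Both calls print True. Falling back on the file name is a workaround, though, and does not help with files that are deliberately mislabelled.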



Bug#849638: downgrading back to 64 gives me the same

2016-12-29 Thread Hans-Christoph Steiner

So it seems that the issue is not in diffoscope per se, since now
downgrading back to 64 from snapshot.debian.org generates the same
output.  I'm guessing then this is related to interactions with the
dependencies, since I also did an `apt upgrade` at the same time. This
is on a machine running stretch.



Bug#849638: diffoscope 66 does binary diff on APK files

2016-12-29 Thread Hans-Christoph Steiner

Package: diffoscope
Version: 66
Severity: important

When running diffoscope on two APKs using version 66, it now just does a
straight binary comparison of the direct file itself.  Running
diffoscope 64 generated a nice output of the individual files in the ZIP
(an APK is a signed JAR with some other special features). Attached is
the output from v66.  You can find v64 output here:

http://37.218.242.117/

You can download these two APKs to test with from here:

http://37.218.242.117/aarddict.android_26.apk
https://f-droid.org/repo/aarddict.android_26.apk


aarddict.android_26.apk.html.bz2
Description: application/bzip

Re: [Reproducible-builds] Preliminary review of dpkg-genbuildinfo

2015-02-16 Thread Hans-Christoph Steiner
Daniel Kahn Gillmor:
> On Fri 2015-02-13 03:36:20 -0500, Hans-Christoph Steiner wrote:
>> I think it would be much simpler to just have the single package signature
>> that is embedded in the package file itself, like Android APKs and Java JARs.
>> Since the package is built reproducibly, anyone who builds it can just copy
>> the canonical signature into their copy of the package they just built, and
>> it'll match the sha512sum of the signed apt metadata.
>
> It seems like you're saying everyone will be able to agree on which
> signing authority is canonical for any given package.  I'm not
> convinced that's the case.
>
>> The big question there is determining where the canonical
>> signature happens.  It seems like it should be the official Debian
>> build process, since it is the only process guaranteed to be the same
>
> Even though i'm personally likely to treat Debian as the canonical
> source i care about, i don't want it to be that way.  I would like
> Debian to be able to be a downstream as well as an upstream (see the
> work feeding back into debian from ubuntu, for example); if a .deb
> package can contain an internal signature, and i'm looking at a given
> .deb in isolation on my debian system, i want to see it signed *by
> debian*, not by whoever happened to produce it first.  Otherwise, it's
> not clear to me that the embedded signature is useful to me as an end
> user at all.
>
>> Another question is whether dpkg checks whether the signers match when
>> upgrading, like the Android model (a package can only be upgraded by
>> another package signed by the same key).  This would be nice, but
>> seems optional and hard to do in Debian.
>
> Maybe this is the question we need to answer to move the discussion
> forward to make sure we're taking the desire for embedded signatures
> into account when thinking about reproducible .debs: how exactly do we
> expect an embedded signature to be used/evaluated?  by who, and in what
> context?
>
>> I think the .buildinfo file is useful, but for a separate process.  It should
>> be the canonical file for running a reproducible build.
>
> I'm not sure what this means.  I'd be very happy if *all* of my debian
> packages were reproducible builds, and i could have a way of verifying
> it.  I'd consider that more valuable than knowing that all my .debs were
> signed by any individual authority.  So if we're really talking about a
> tradeoff between signed buildinfo files and signed packages, i'd
> certainly prefer signed buildinfo files.
>
> But my proposal was an attempt to let people have both, without forcing
> the entire ecosystem to agree on who is a canonical authority for
> package X, without whom a reproducible package is impossible
>
>   --dkg

I think this topic is far too vast with far too many dependencies to really
have a useful discussion on without a full time, dedicated team.  Since that
seems highly unlikely in the near future, we need to break it down into chunks
of work that we can achieve with the time and resources we actually have.

So we need to focus on drilling down to what is the simplest useful form of
package signing that will cause the least amount of problems when we decide to
change how package signing works.  This means we get a prototype out as soon
as possible, and we can learn a lot from that. I think that's pretty easy to
do, something like this:

* make dpkg optionally check package sigs, and refuse to install on bad sig
* use apt signing model: signatures verified from the apt key ring
* signing can start happening in the build tools, by the uploader
* start work towards getting the Debian built/apt infrastructure signing

.hc


-- 
PGP fingerprint: 5E61 C878 0F86 295C E17D  8677 9F0F E587 374B BE81
https://pgp.mit.edu/pks/lookup?op=vindex&search=0x9F0FE587374BBE81



Re: [Reproducible-builds] Wiki reorganization

2015-01-07 Thread Hans-Christoph Steiner

Looks drastically better!  I think this wiki is really the central resource
for anyone interested in making reproducible builds, Debian or not.  So I'm
glad to see it reorganized to look like a community resource rather than the
giant notepad it was before.

.hc

Jérémy Bobbio:
 Hi!
 
 While waiting on builds and rebuilds of linux, I started to reorganize,
 refresh and improve the documentation on the wiki.
 
  .O.
 === HEADS-UP! ==o===o= HEADS-UP! ===
 
 If you were previously subscribed to the ReproducibleBuilds wiki
 page, you should edit your notifications:
 
 https://wiki.debian.org/?action=userprefs&sub=notification
 
 And add to the list of pages the following regex:
 
 ReproducibleBuilds/.*
 
 Otherwise, you are likely to miss some of the fun!
 
 
 
 The main page is now a landing page with links based on potential
 reader's interests. Have a look:
 https://wiki.debian.org/ReproducibleBuilds
 
 I'm happy with the status of the About, Contribute, and
 ExperimentalToolchain sub-pages. I should be able to work on Howto
 and History tomorrow. Don't wait for me, though. :)
 
 
 
 

-- 
PGP fingerprint: 5E61 C878 0F86 295C E17D  8677 9F0F E587 374B BE81
https://pgp.mit.edu/pks/lookup?op=vindex&search=0x9F0FE587374BBE81



signature.asc
Description: OpenPGP digital signature

Re: [Reproducible-builds] [RFC] debbindiff

2014-09-30 Thread Hans-Christoph Steiner

Definitely use setup.py.  It makes the packaging easy and standardized, and it
is the standard way to build python.  It also makes it easy to publish
releases to pypi, the central package repository for python.  I attached a
quick untested one for you.

.hc

Jérémy Bobbio wrote:
 Hi!
 
 I've been working at high pace since Sunday on a replacement for the
 diffp script [1]. These GPLv3 lines of Python are called debbindiff.
 
 Get it from Git:
 
 https://anonscm.debian.org/cgit/reproducible/debbindiff.git/
 
 Attached is an output produced for the attr package. The new tool is at
 least as capable as diffp, is way more extensible, and the result is
 more readable.
 
 Example usage:
 
 $ ./debbindiff.py --html /tmp/debbindiff.html b1/*.changes b2/*.changes
 
 There's no requirements for actually comparing .changes. You can use it
 to compare jar files directly if that's your kick.
 
 I'd love to see reviews of the code. It's scarce on comments but names
 should be explicit enough, or so I hope.
 
 It's missing Debian packaging. I guess I should learn how to write a
 setup.cfg or similar. Pointers or patches welcome.
 
 One thing this codebase should enable is writing “hints”. Once the tree
 of differences is generated, it should be doable to run through it to
 generate statements like: “Many files in data.tar have different
 timestamps, dh_fixmtimes has probably not been called. Are you
 using dh?” This still needs to be done though.
 
 Last note: I've been pushing everything else aside while I had the
 thrills to work on this. It's unclear when will be the next time, so
 patches are preferred rather than suggestion.
 
  [1]: https://anonscm.debian.org/cgit/reproducible/misc.git/tree/diffp
 
 
 
 

-- 
PGP fingerprint: 5E61 C878 0F86 295C E17D  8677 9F0F E587 374B BE81
#!/usr/bin/env python2

from setuptools import setup
import sys

setup(
    name='debbindiff',
    version='0.1',
    description='display differences between files',
    long_description=open('README').read(),
    author='Lunar',
    author_email='lu...@debian.org',
    url='https://wiki.debian.org/ReproducibleBuilds',
    packages=['debbindiff'],
    scripts=['debbindiff'],
    install_requires=[
        'python-debian',
    ],
    classifiers=[
        'Development Status :: 3 - Alpha',
        'Intended Audience :: Developers',
        'License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)',
        'Operating System :: POSIX',
        'Topic :: Utilities',
    ],
)


signature.asc
Description: OpenPGP digital signature

Re: [Reproducible-builds] [RFC] debbindiff

2014-09-30 Thread Hans-Christoph Steiner

Comparing jars gives a stacktrace, looks like a missing import.

$ ./debbindiff.py \
    ~/code/guardianproject/cacheword/cachewordlib/cachewordlib-v0.1-1-g04cb18e.jar \
    /tmp/cachewordlib-v0.1-1-g04cb18e.jar
Traceback (most recent call last):
  File "./debbindiff.py", line 53, in <module>
    main()
  File "./debbindiff.py", line 43, in main
    differences = debbindiff.comparators.compare_files(parsed_args.file1, parsed_args.file2)
  File "/media/share/code/reproducible/debbindiff/debbindiff/comparators/__init__.py", line 85, in compare_files
    return comparator(path1, path2, source)
  File "/media/share/code/reproducible/debbindiff/debbindiff/comparators/utils.py", line 51, in with_fallback
    inside_differences = original_function(path1, path2, source)
  File "/media/share/code/reproducible/debbindiff/debbindiff/comparators/zip.py", line 57, in compare_zip_files
    zipinfo1 = get_zipinfo(path1)
  File "/media/share/code/reproducible/debbindiff/debbindiff/comparators/zip.py", line 31, in get_zipinfo
    return re.sub(re.escape(path), os.path.basename(path), output)
NameError: global name 're' is not defined


Also, I updated the setup.py for two small things. I recommend using code
checkers like pyflakes and pylint:

$ pyflakes *.py debbindiff/*.py
debbindiff/difference.py:20: 'difflib' imported but unused
debbindiff/difference.py:41: redefinition of function 'comment' from line 37
hans@palatschinken debbindiff $ pylint *.py debbindiff/*.py
...


.hc


-- 
PGP fingerprint: 5E61 C878 0F86 295C E17D  8677 9F0F E587 374B BE81
#!/usr/bin/env python2

from setuptools import setup

setup(
    name='debbindiff',
    version='0.1',
    description='display differences between files',
    long_description=open('README').read(),
    author='Lunar',
    author_email='lu...@debian.org',
    url='https://wiki.debian.org/ReproducibleBuilds',
    packages=['debbindiff'],
    scripts=['debbindiff.py'],
    install_requires=[
        'python-debian',
    ],
    classifiers=[
        'Development Status :: 3 - Alpha',
        'Intended Audience :: Developers',
        'License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)',
        'Operating System :: POSIX',
        'Topic :: Utilities',
    ],
)


signature.asc
Description: OpenPGP digital signature

Re: [Reproducible-builds] concrete steps for improving apt downloading security and privacy

2014-09-29 Thread Hans-Christoph Steiner


Stefan Fritsch wrote:
> On Sunday 21 September 2014 21:13:50, Richard van den Berg wrote:
>> Package formats like apk and jar avoid this chicken and egg problem
>> by hashing the files inside a package, and storing those hashes in
>> a manifest file. Signatures only sign the manifest file. The
>> manifest itself and the signature files are not part of the
>> manifest, but are part of the package. So a package including its
>> signature(s) is still a single file.
>
> This is bad design and will inevitably lead to security issues (as has
> been demonstrated by Android and apk). One must check the signature
> first, and only if the signature matches, start parsing complex file
> formats. And yes, zip is complex enough to be a problem.

It is true that an embedded signature requires more complicated code, but it
also simplifies the parts that the user has to understand.  Perfect code with
a bad user experience will also inevitably lead to security issues.

I'm guessing that ar format is simpler than zip, so that'd be helpful.
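
For readers unfamiliar with the manifest scheme discussed above, here is a simplified Python sketch of the idea. It is not the real JAR format (actual manifests are text files with base64 digests under META-INF/MANIFEST.MF); the function names and sample members are invented. It hashes every member except those under META-INF/, so adding signature files later does not change the data being signed.

```python
import hashlib
import io
import zipfile


def build_manifest(data: bytes) -> dict:
    """Hash every member except those under META-INF/, where the manifest
    and signature files themselves are stored."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return {
            name: hashlib.sha256(zf.read(name)).hexdigest()
            for name in sorted(zf.namelist())
            if not name.startswith("META-INF/")
        }


def make_zip(members: dict) -> bytes:
    """Build a small in-memory ZIP from a {name: content} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, content in members.items():
            zf.writestr(name, content)
    return buf.getvalue()


payload = {"data.txt": b"hello", "bin/tool": b"\x7fELF"}
unsigned = make_zip(payload)
# Placeholder bytes stand in for a real signature file.
signed = make_zip(dict(payload, **{"META-INF/CERT.RSA": b"placeholder signature"}))

# The manifest is identical with or without the embedded signature file.
print(build_manifest(unsigned) == build_manifest(signed))  # True
```

The verifier still has to parse the ZIP before it can check anything, which is the point of Stefan's objection.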

.hc

-- 
PGP fingerprint: 5E61 C878 0F86 295C E17D  8677 9F0F E587 374B BE81



Re: [Reproducible-builds] concrete steps for improving apt downloading security and privacy

2014-09-24 Thread Hans-Christoph Steiner


Daniel Kahn Gillmor wrote:
> On 09/22/2014 04:06 PM, Hans-Christoph Steiner wrote:
>> I think we're starting to nail down the moving parts here, so I want to
>> outline that so we can find out the parts where we agree and where we
>> disagree.
>>
>> * I hope we can all agree that the package itself should not change once it
>> has hit the official repos.
>>
>> * I believe we can achieve what we want without taking a shortcut and
>> introducing a new core package type (.sdeb .debs or whatever).  We can figure
>> out how to do this with the .deb file.  Personally, I would accept a new
>> package type after a thorough exploration of keeping .deb fails to deliver,
>> but not before.
>>
>> * There should be at least one verification build before a package becomes
>> official.
>>
>> * Then there needs to be a channel for people to submit the results of their
>> own builds.  That could be only positive results or only negative results, or
>> both.
>>
>> * the .buildinfo file should contain all info needed to reproduce the build,
>> given a standard Debian build environment
>
> Thanks, the above is a very useful summary.
>
>> Anything I left out?
>
> I think the summary above hints at but doesn't answer the question of
> what an official package means, and the fact that there may be
> multiple repositories (possibly operated by different organizations)
> with different rules about what should make a package official.
>
> I think we need to ask whether we care about byte-for-byte identical
> .deb files *across* different repositories or not.
>
> If we don't care about cross-repo (or cross-organization) byte-for-byte
> reproducibility, then an embedded signature in the .deb might be
> acceptable (though the data it contains would be redundant to signatures
> over the buildinfo files, which would eventually be necessary for
> external policies or corroboration anyway).
>
> If we *do* decide that we care about cross-repo byte-for-byte
> compatibility, then embedding a signature in the .deb suggests that one
> repo can act as the gating factor for another, because repos
> collaborating in this reproducibility push cannot both hold the key that
> makes a .deb official.
>
> I don't think that's a good tradeoff.  As tempting as it might be to try
> to cement debian's authoritative role via such a lock-in, i'd much
> rather that debian derivatives, blends, side projects, etc, can all take
> initiative that can then be absorbed back into debian cleanly and
> reproducibly.
>
> i also suspect that the redundancy between internal signatures and
> signed .buildinfo records is likely to cause some increase in confusion,
> but i don't think that's as serious of a problem as the question of
> which signing keys get to be authoritative.
>
>   --dkg

Cross-repo byte-for-byte compatibility is a nice thing to strive for, but it
sounds quite difficult to achieve and will require lots of social coordination
as well as technical work.

In terms of builds of a particular .deb by multiple distros, each distro will
have to use the exact same toolchain to build the .deb for most packages.
Different versions of gcc, javac, etc. will produce different binaries.
That raises the same problems as the canonical signature, for example when two
distros build a new package at the same time but with different standards (gcc
version, signer, etc.).  Ubuntu's gcc version will create a .deb with one hash,
and Debian's gcc will create a .deb with a different hash, and each distro
will mark theirs as canonical. That seems to be a much harder thing to manage
across the distros.  So if another distro or repo is going to buy into
Debian's reproducible system, adding a bit about canonical signatures seems
totally feasible to manage.

The canonical signature would just need to be done by a key in the
debian-keyring for Debian, ubuntu-keyring for Ubuntu, etc.  While a static
canonical signer is desirable, I don't think it can be supported without
adding some restrictions (I think it would be worth it, for the record).

Whoever is the first to publish a given package version claims the canonical
role for both build setup and signature.  To prevent accidental collisions,
dput could check the various NEW queues, the various package repos, etc. to
look for an existing canonical package.  Then the first distro to publish is
canonical.

Shall we have a real time discussion on this topic? voice, video, or in person
all work for me.

.hc

-- 
PGP fingerprint: 5E61 C878 0F86 295C E17D  8677 9F0F E587 374B BE81



Re: [Reproducible-builds] concrete steps for improving apt downloading security and privacy

2014-09-22 Thread Hans-Christoph Steiner


Elmar Stellnberger wrote:
 Am 22.09.14 um 01:52 schrieb Paul Wise:
 On Mon, Sep 22, 2014 at 2:04 AM, Elmar Stellnberger wrote:

 A package with some new signatures added is no longer the old package.
 That is exactly what we do *not* want for reproducible builds.

 It should have a different checksum and be made available again for update.
 The Debian archive does not allow files to change their checksum, so
 every signature addition requires a new version number. That sounds
 like a bad idea to me.
 Yes, that is something we definitely do not want.
 Nonetheless it would still be an issue to have the package and the signatures
 in one file because we usually need them together. My only idea to realize
 this in spite of the said objection would be another proposal:
 Put the .deb and the signatures into one .ar called .sdeb and make tools like
 dpkg work on .sdebs or on .deb + signatures respectively. Whenever someone
 offers some packages for download that will be in the form of .sdebs while
 official debian repositories may separate both kinds of files. User interfaces
 like http://debtags.debian.net/search/ could then generate .sdebs on the fly
 to satisfy petted users.

I think we're starting to nail down the moving parts here, so I want to
outline that so we can find out the parts where we agree and where we disagree.

* I hope we can all agree that the package itself should not change once it
has hit the official repos.

* I believe we can achieve what we want without taking a shortcut and
introducing a new core package type (.sdeb, .debs, or whatever).  We can figure
out how to do this with the .deb file.  Personally, I would accept a new
package type only if a thorough exploration of keeping .deb fails to deliver,
but not before.

* There should be at least one verification build before a package becomes
official.

* Then there needs to be a channel for people to submit the results of their
own builds.  That could be only positive results or only negative results, or
both.

* the .buildinfo file should contain all info needed to reproduce the build,
given a standard Debian build environment

Anything I left out?

.hc


-- 
PGP fingerprint: 5E61 C878 0F86 295C E17D  8677 9F0F E587 374B BE81



Re: [Reproducible-builds] concrete steps for improving apt downloading security and privacy

2014-09-19 Thread Hans-Christoph Steiner


Daniel Kahn Gillmor wrote:
 Hi Hans--
 
 I think we're in agreement here about most things actually, despite our
 back-and-forth.  hopefully this is a clarifying response:
 
 Daniel Kahn Gillmor wrote:
 In that case, the .deb that was installed on a sid system *is not* the
 .deb that is installed on a testing system.

 If i run a mixed unstable/testing system (i do, actually, this is not
 hypothetical) should i need to re-install foo_1.2-3_mipsel.deb when that
 package transitions from unstable to testing (without any changes made
 to it other than new signatures)?  That seems odd, but the .debs are now
 no longer bytewise identical.  should archive operators who are doing
 rsync mirroring of a number of pools update their .debs as new
 signatures are added to them?  can they still use rsync for this cleanly
 without a massive increase in bandwidth between mirrors or do we need to
 define a new synchronization mechanism?
 
 The question of what is a canonical, immutable signature for any given
 distribution is also problematic, because it ties the policy of the
 distribution (already defined by what that apt repository includes and
 references in Release.gpg) to a set of individual package signatures.
 But this is exactly the point where we'd like more flexibility.  People
 who care about apt repo X and use it online can use Release.gpg, while
 people who are *not* using the apt repo might have a different set of
 policies.  And some repos might want to share specific packages with
 each other -- what if their signing policies about the canonical
 signature conflict?  should they have to rebuild the package?

Packages should not be accepted into any official repo, sid included, without
some verification builds.  A .deb should remain unchanged once it is accepted
into any official repo (maybe experimental could be an exception, but not
sid).  I think that is essential.

I see no reason for changing the .deb between sid and testing, except for
perhaps how existing implementations are doing it.  It is usually worth the
work to do things the right way, rather than the easy way.

The build verification process needs to happen between the package upload and
publishing to sid or security updates.  Two builds is easy: the .deb that the
uploader generates and the one the Debian process makes.  That is probably
enough.

In Debian's case, it probably is too complicated to include multiple
signatures.  In that case, there should be only one canonical signature by dak
once the build verification signature threshold has been passed. Then all of
the other signatures could be added to .buildinfo or .changes or whatever
other file.

Another option is to do it like f-droid.org does.  F-droid.org generates an APK
signing key for each app, then manages the signing on a specialized signing
server.  Or another option is just requiring all the signers to be from the
debian-keyring, rather than an exact match for previous signers.  In any case,
the .deb needs to remain unchanged.




 You're entirely right that when fetching files via the web directly
 (instead of an apt repository) or sneakernet, people tend to transfer
 only the minimal set of possible files, and therefore having detached
 signatures is a bad idea for adoption.
 
 But i don't think this addresses the concern raised above that specific
 .deb files have constant size and contents, which is an assumption that
 permeates the repository, mirror, and distribution mechanisms.
 Rejecting that assumption means potentially breaking a lot of
 infrastructure that currently works, as well as forcing incompatible
 policy changes on archive operators.
 
 So i'd like to have this cake and eat it too, please :)
 
 Here's a proposal for chewing over:
 
  * define a new package format called .debs
 
  * foo_1.2-3_mipsel.debs is a tarball that contains at least three files:
 
 foo_1.2-3_mipsel.deb
 foo_1.2-3_mipsel.buildinfo
 foo_1.2-3_mipsel.buildinfo.0EE5BE979282D80B9F7540F1CCD2ED94D21739E9.asc
 
   (it can contain more than one .asc if it wants to include multiple
 signatures)
 
  * if you invoke dpkg -i foo_1.2-3_mipsel.debs, dpkg should unpack and
 inspect the .debs, and the signatures, and refuse to install the .deb if
 the signatures don't meet local policy.  (i'm hand-waving here about
 what local policy is, since i think that's a separate discussion)
 
 Now we can leave the current online archive distribution alone -- apt
 works (modulo bugs) and archive operators can continue to function as
 they currently do.  But we tell users and upstream developers that if
 they want to install packages via sneakernet or by downloading them
 individually from the web that they really should be passing around
 .debs files, and not .deb files.  We could even modify dpkg to reject
 installations of plain .deb files unless a package manager (which has
 presumably already verified the package by other means) is doing the
 installation.
 
 what do you think?
 
   --dkg
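
To make the proposed bundle layout above concrete, here is a minimal Python
sketch.  The .debs format is only a proposal in this thread, so everything
here is an assumption: the helper names make_bundle and check_bundle are made
up, and the check only looks at which files are present.  Verifying the
signatures and applying local policy are deliberately left out.

```python
import io
import tarfile


def make_bundle(files: dict[str, bytes]) -> bytes:
    """Build an uncompressed tarball in memory (hypothetical .debs layout)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, payload in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def check_bundle(bundle: bytes, stem: str) -> list[str]:
    """Return the signature member names, or raise if the layout is incomplete."""
    with tarfile.open(fileobj=io.BytesIO(bundle), mode="r") as tar:
        names = tar.getnames()
    for required in (f"{stem}.deb", f"{stem}.buildinfo"):
        if required not in names:
            raise ValueError(f"missing {required}")
    # One or more detached signatures over the .buildinfo file.
    sigs = [
        n for n in names
        if n.startswith(f"{stem}.buildinfo.") and n.endswith(".asc")
    ]
    if not sigs:
        raise ValueError("no .buildinfo signature found")
    return sigs


STEM = "foo_1.2-3_mipsel"
FPR = "0EE5BE979282D80B9F7540F1CCD2ED94D21739E9"
good = make_bundle({
    f"{STEM}.deb": b"package bytes",
    f"{STEM}.buildinfo": b"build description",
    f"{STEM}.buildinfo.{FPR}.asc": b"detached signature",
})
print(check_bundle(good, STEM))
```

A real tool would go on to verify each .asc against the .buildinfo and compare
the .deb digest recorded there before installing anything.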

I think this can 

Re: [Reproducible-builds] concrete steps for improving apt downloading security and privacy

2014-09-19 Thread Hans-Christoph Steiner


Daniel Kahn Gillmor wrote:
 Thanks for the discussion, Hans.
 
 On 09/19/2014 02:47 PM, Hans-Christoph Steiner wrote:
 Packages should not be accepted into any official repo, sid included, without
 some verification builds.  A .deb should remain unchanged once it is accepted
 into any official repo (maybe experimental could be an exception, but not
 sid).  I think that is essential.
 
 But some repositories could have different rules for package inclusion
 than others, right?  for example, say debian wanted to offer an
 unstable-reproducible suite, which only permitted packages that had been
 independently rebuilt reproducibly by multiple DDs and at least two
 different buildds.  Ideally, the packages that are shared between this
 repository and other repositories would be identical.
 
 Note that if .deb files are internally signed, two developers *cannot*
 create the same exact .deb if they do not share their secret keys.

You're missing one key detail here, let's see if I can suss it out:

* the builds are _exactly_ the same, except the signatures
* the embedded signature does not sign the signature files (see
  jar and apk formats, which are almost the same, for examples)
* anyone can just copy another dev's signature into the package and it
  will validate because the package contents are exactly the same
* the signature files sign the package contents (i.e. control.tar.gz
  and data.tar.gz), not the hash of the whole .deb file.

Therefore two developers can easily create the same .deb if they have access
to the signature file since they can just copy it.  No need to run the signing
process again.  If people create their own .deb files in a reproducible
process, then copy in the same signature files, then the hash of the .deb will
be the same.
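
As a rough illustration of the copy-the-signature point, here is a small
Python sketch.  It is a toy model and not the real .deb or dpkg-sig format:
the package is just length-prefixed blobs, an HMAC stands in for an OpenPGP
signature, and all function names and keys are made up for the example.

```python
import hashlib
import hmac


def content_digest(control: bytes, data: bytes) -> bytes:
    """Digest over the package contents only, never over the signature."""
    h = hashlib.sha256()
    for part in (control, data):
        h.update(len(part).to_bytes(8, "big"))  # length prefix avoids ambiguity
        h.update(part)
    return h.digest()


def sign(key: bytes, control: bytes, data: bytes) -> bytes:
    """Stand-in for a real signature: an HMAC over the content digest."""
    return hmac.new(key, content_digest(control, data), hashlib.sha256).digest()


def assemble(control: bytes, data: bytes, signature: bytes) -> bytes:
    """Toy package layout: contents followed by the embedded signature."""
    parts = (control, data, signature)
    return b"".join(len(p).to_bytes(8, "big") + p for p in parts)


# Developer A builds and signs.
control, data = b"control.tar.gz bytes", b"data.tar.gz bytes"
sig_a = sign(b"developer A secret key", control, data)
pkg_a = assemble(control, data, sig_a)

# Developer B rebuilds reproducibly, then copies A's signature file.
pkg_b = assemble(b"control.tar.gz bytes", b"data.tar.gz bytes", sig_a)
same_hash = hashlib.sha256(pkg_a).hexdigest() == hashlib.sha256(pkg_b).hexdigest()

# If B signed with their own key instead, the package bytes would differ.
pkg_b_own = assemble(control, data, sign(b"developer B secret key", control, data))
differs = pkg_a != pkg_b_own

print(same_hash, differs)
```

The point is only that B never needs A's secret key: identical contents plus
the copied signature bytes give an identical file.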


 I see no reason for changing the .deb between sid and testing, except for
 perhaps how existing implementations are doing it.  It is usually worth the
 work to do things the right way, rather than the easy way.
 
 I agree with this sentiment, i think we're trying to sort out what is
 the right way.
 
 The build verification process needs to happen between the package upload and
 publishing to sid or security updates.  Two builds is easy: the .deb that the
 uploader generates and the one the Debian process makes.  That is probably
 enough.

 In Debian's case, it probably is too complicated to include multiple
 signatures.  In that case, there should be only one canonical signature by dak
 once the build verification signature threshold has been passed. Then all of
 the other signatures could be added to .buildinfo or .changes or whatever
 other file.
 
 but the .buildinfo file is designed to say "i generated the .deb that
 matches this digest exactly", which the corroborating builder cannot do,
 because they cannot produce the internal signature.

No need to produce the signature, just copy it!


 Plus, we now have two different places to look for signatures.  one
 canonical one and then some external ones, and the signatures
 themselves have different properties (one signs parts of the deb, the
 other signs the whole .deb; one signs the build environment, the other
 does not, etc)

Definitely look at jar signing, it handles multiple signatures fine.  I see no
reason why you can't include an unlimited number of signatures in a .deb.
Changing the number of signatures will change the hash of the .deb, which is
why there needs to be a canonical set of signatures for each .deb.

As for signing the hash of the entire .deb, that is what apt already gives us,
so it does not need to be reproduced in the dpkg-sig embedded signature.  To
verify the contents of a .deb against any kind of signature, a tool will have
to compare the hashes of control.tar.gz and data.tar.gz.
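
Here is a sketch of what such a comparison tool might do, in Python.  A .deb
is an ar archive, so the sketch parses the ar members and hashes only the
control and data tarballs.  The parser handles only simple archives with
short member names, the payloads are fake, and the signature member name
"_gpgbuilder" is picked purely for illustration.

```python
import hashlib

AR_MAGIC = b"!<arch>\n"


def ar_member(name: str, payload: bytes) -> bytes:
    """Serialize one ar member: 60-byte header, payload, padded to even length."""
    fields = f"{name:<16}{0:<12}{0:<6}{0:<6}{'100644':<8}{len(payload):<10}"
    header = fields.encode("ascii") + b"`\n"
    return header + payload + (b"\n" if len(payload) % 2 else b"")


def ar_members(blob: bytes) -> dict[str, bytes]:
    """Parse a simple ar archive into {member name: payload}."""
    if not blob.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    members, pos = {}, len(AR_MAGIC)
    while pos + 60 <= len(blob):
        header = blob[pos:pos + 60]
        if header[58:60] != b"`\n":
            raise ValueError("bad member header")
        name = header[0:16].decode("ascii").rstrip().rstrip("/")
        size = int(header[48:58].decode("ascii"))
        start = pos + 60
        members[name] = blob[start:start + size]
        pos = start + size + (size % 2)  # skip the padding byte, if any
    return members


def content_digests(deb: bytes) -> dict[str, str]:
    """SHA-256 of the control and data tarballs, ignoring all other members."""
    return {
        name: hashlib.sha256(payload).hexdigest()
        for name, payload in ar_members(deb).items()
        if name.startswith(("control.tar", "data.tar"))
    }


base = [
    ("debian-binary", b"2.0\n"),
    ("control.tar.gz", b"fake control tarball"),
    ("data.tar.gz", b"fake data tarball"),
]
unsigned = AR_MAGIC + b"".join(ar_member(n, p) for n, p in base)
signed = unsigned + ar_member("_gpgbuilder", b"fake embedded signature")

# Adding a signature member changes the whole-file hash ...
whole_file_changed = hashlib.sha256(unsigned).digest() != hashlib.sha256(signed).digest()
# ... but leaves the digests of the package contents untouched.
contents_unchanged = content_digests(unsigned) == content_digests(signed)
print(whole_file_changed, contents_unchanged)
```

This is also why the set of embedded signatures has to be fixed for each
.deb: the content digests survive any change in signatures, but the file hash
that apt checks does not.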


 Another option is to do it like f-droid.org does.  F-droid.org generates an APK
 signing key for each app, then manages the signing on a specialized signing
 server.  Or another option is just requiring all the signers to be from the
 debian-keyring, rather than an exact match for previous signers.
 
 Again, i think this is getting ahead of the discussion.  i'm not
 proposing that we try to set debian (or other derived distro) archive
 policy here, i just think we want to think
 
  In any case, the .deb needs to remain unchanged.
 
 right.  but it can't be unchanged if the archive distributor decides
 that a different signer is the canonical signer.  So you're making the
 contents of the .deb dependent on archive policy, rather than the other
 way around.
 
 I *want* ubuntu and debian and mint to all ship the exact same .deb for
 any packages that are reproducible (and eventually, all packages!) that
 they share, and i also want those different distros to be able to
 produce the reproducible .deb independently of one another.  If
 foo_1.2-3_mipsel.deb is built first on the ubuntu builders and ubuntu
 decides to include it in the archive, and then debian is able to
 reproduce that build