Bug#884095: flag to force file types
Chris Lamb:
> Hi Hans,
>
>> It would be literally impossible to auto-detect since a Janus APK is
>> both a valid DEX file (starting with the bytes "dex") and […]
>
> Oh dear, I got a little lost in the weeds of Janus/APK/ZIP here..
>
> Could you excuse my pedanticness and ask for direct links to files,
> what you are seeing and what you are expecting? That would immediately
> clarify a few questions and avoid a lengthy back-and-forth :)
>
> Best wishes,

https://www.androidpolice.com/wp-content/uploads/janus-poc/HelloWorld-Janus.apk

Or create your own: https://github.com/odensc/janus

.hc
___
Reproducible-builds mailing list
Reproducible-builds@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/reproducible-builds
Bug#884095: flag to force file types
Chris Lamb:
> Hi Hans!
>
>>> Have we really exhausted the detection route for this? :)
>>
>> I think the detection route has been exhausted. It seems that no one
>> wants to do what it takes to reliably detect APKs.
>
> I'm sorry you think so and, with the greatest of respect, I'm not
> sure this is entirely accurate... at least from my point of view.
>
> Could you perhaps attach or otherwise link to some testcases where
> diffoscope gets the detection wrong? It sounds like a fun challenge,
> if nothing else..

Any Janus APK, including the examples linked from the GitHub repository, is a test case. It would be literally impossible to auto-detect since a Janus APK is both a valid DEX file (starting with the bytes "dex") and a functional ZIP and APK. Most ZIP readers will happily skip any bytes that don't make sense before the ZIP contents, since the file information is stored at the end of a ZIP. A Janus APK is technically not an officially valid ZIP, since it has non-ZIP bytes before the ZIP header. The most recent APK tools now reject Janus APKs as invalid, but zip tools will still happily work with them.

So in my case, I'd want to compare a valid APK with a modified version of that APK that turns it into a Janus APK.

As for increasing the reliability of the auto-detection, I think libfile could do a quick check for APK Signature v2 or v3, then reliably mark the file as an APK (vs. ZIP or JAR).

APK Signature v2: https://source.android.com/security/apksigning/v2
APK Signature v3: https://android-review.googlesource.com/c/platform/tools/apksig/+/587834/

.hc
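To make the ambiguity described above concrete, here is a minimal Python sketch of a file that satisfies two format checks at once. It is only an illustration: the `describe` and `make_zip` helpers are hypothetical names, the DEX test is just the "starts with the bytes dex" rule from this message, and none of this is how diffoscope or libfile actually detect files.

```python
import io
import zipfile

# Assumption: the only DEX rule used here is the one from the thread,
# namely that the file starts with the bytes "dex".
DEX_PREFIX = b"dex"


def describe(data: bytes) -> dict:
    """Report which container formats a blob appears to satisfy."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            names = archive.namelist()
    except zipfile.BadZipFile:
        names = None
    return {
        "starts_like_dex": data.startswith(DEX_PREFIX),
        "readable_as_zip": names is not None,
        "has_manifest": names is not None and "AndroidManifest.xml" in names,
    }


def make_zip(entries: dict) -> bytes:
    """Build an in-memory ZIP from a name -> bytes mapping (demo helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, payload in entries.items():
            archive.writestr(name, payload)
    return buffer.getvalue()


plain_apk = make_zip({
    "AndroidManifest.xml": b"<manifest/>",
    "classes.dex": b"dex\n035\x00",
})
# Prepending bytes mimics the Janus layout: DEX-looking data first, ZIP after.
janus_like = b"dex\n035\x00" + b"\x00" * 32 + plain_apk

print(describe(plain_apk))
print(describe(janus_like))
```

Python's `zipfile` tolerates the prepended bytes because it locates the archive from the end of the file, which is the same behaviour the message attributes to most ZIP readers. A detector that stops at the first matching magic number would therefore pick a different type for the second blob than for the first.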
Bug#884095: flag to force file types
Chris Lamb:
> severity 884095 wishlist
> thanks
>
> Hi hc,
>
>> Something like --force=apk would solve both.
>
> So, I'm a little nervous about introducing such a directive.
>
> This is primarily in terms that diffoscope should really just Do The
> Right Thing by default in all cases and not need magic flags to get
> the desired result. :)
>
> This is just a better user experience but also has real practical
> implications; it is not tidy (or even possible) to specify such flags
> in automated or hosted CI environments such as tests.reproducible-builds.org,
> try.diffoscope.org, Travis CI, etc. on a per-package basis.
>
> Whilst we might have other flags that you could point to that would
> violate this informal "rule", I would certainly cheer their removal.
>
> (There are also -- entirely secondary -- concerns around whether this
> flag would change the behaviour in nested files as well, but we can
> leave that for now..)
>
> Have we really exhausted the detection route for this? :)
>
> Regards,

I think the detection route has been exhausted. It seems that no one wants to do what it takes to reliably detect APKs. I understand why libfile does not want to include more elaborate checks like:

* ZIP file
* with AndroidManifest.xml in it

Malware samples are also often deliberately crafted to avoid being detected as APKs, for example the "Janus" vulnerability: https://github.com/odensc/janus. That works by making the APK look like a DEX file.
Bug#890904: diffoscope does not show classes.dex diff
Sorry, the server was down. It's back up, so you should be able to access those links now.
Bug#890904: diffoscope does not show classes.dex diff
Package: diffoscope
Version: 90~bpo9+1

Attached are two APKs that have different classes.dex files. They are the same size, but have different contents. diffoscope does not show a diff for them. When I extract the classes.dex files from the APK, diff and vbindiff do show the differences.

Here are the test files:
https://verification.f-droid.org/tmp/a2dp.Vol_137.apk
https://verification.f-droid.org/tmp/sigcp_a2dp.Vol_137.apk

And the report:
https://verification.f-droid.org/tmp/a2dp.Vol_137.apk.diffoscope.txt
https://verification.f-droid.org/tmp/a2dp.Vol_137.apk.diffoscope.html
Bug#884095: flag to force file types
Package: diffoscope
Version: 88

The Janus bug for Android works by making a valid APK file that is also a valid DEX file.

https://www.guardsquare.com/en/blog/new-android-vulnerability-allows-attackers-modify-apps-without-affecting-their-signatures

Diffoscope sees these files as different file types, so there is no way to inspect the malware payload. Given this and the issues in file detection in #849782, there should be a way to force which kind of comparison diffoscope does. Something like --force=apk would solve both.

There are two example files attached:
HelloWorld.apk
HelloWorld-Janus.apk
Bug#875451: diffoscope crashes when using --max-diff-block-lines
Package: diffoscope
Version: 85~bpo9+1

`fdroid verify` calls diffoscope like this:

    diffoscope --max-report-size 12345678 \
        --max-diff-block-lines 100 \
        --html foo.html --text foo.txt \
        foo.apk another_foo.apk

And it has recently started to crash like this:

    Traceback (most recent call last):
      File "/usr/lib/python3/dist-packages/diffoscope/main.py", line 396, in main
        sys.exit(run_diffoscope(parsed_args))
      File "/usr/lib/python3/dist-packages/diffoscope/main.py", line 356, in run_diffoscope
        Config().check_constraints()
      File "/usr/lib/python3/dist-packages/diffoscope/config.py", line 62, in check_constraints
        self.check_ge("max_diff_block_lines", "max_page_diff_block_lines")
      File "/usr/lib/python3/dist-packages/diffoscope/config.py", line 59, in check_ge
        raise ValueError("{0} ({1}) cannot be smaller than {2} ({3})".format(a, va, b, vb))
    ValueError: max_diff_block_lines (100) cannot be smaller than max_page_diff_block_lines (128)

Since we're not setting max_page_diff_block_lines, this should not crash.

.hc
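One way to get the behaviour asked for here is to treat a value the user never set as adjustable rather than as a hard constraint. The sketch below is not diffoscope's code or its eventual fix: the `Config` class is a hypothetical stand-in, the default of 256 is made up for illustration, and only the option names, the default of 128 and the error message are taken from the traceback above.

```python
class Config:
    """Hypothetical stand-in for a settings object; not diffoscope's class."""

    DEFAULTS = {
        "max_diff_block_lines": 256,       # illustrative value, an assumption
        "max_page_diff_block_lines": 128,  # default seen in the traceback
    }

    def __init__(self, **explicit):
        unknown = set(explicit) - set(self.DEFAULTS)
        if unknown:
            raise TypeError("unknown options: {0}".format(sorted(unknown)))
        # Remember which options the caller actually set.
        self.explicit = set(explicit)
        self.values = dict(self.DEFAULTS, **explicit)

    def check_ge(self, a, b):
        """Require values[a] >= values[b], clamping b if it is only a default."""
        va, vb = self.values[a], self.values[b]
        if va >= vb:
            return
        if b not in self.explicit:
            # The lower bound was never chosen by the user, so lower it
            # quietly instead of rejecting the value the user did choose.
            self.values[b] = va
            return
        raise ValueError(
            "{0} ({1}) cannot be smaller than {2} ({3})".format(a, va, b, vb)
        )


config = Config(max_diff_block_lines=100)
config.check_ge("max_diff_block_lines", "max_page_diff_block_lines")
print(config.values)
```

With this rule the error is still raised when both options are given explicitly and contradict each other, which is the only case where the user can act on the message.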
Bug#868486: diffoscope often fails to detect APKs
The APK format is a ZIP file that always includes the files AndroidManifest.xml and classes.dex. It also always has a JAR signature (i.e. META-INF/). It does not have the JAR magic number CAFEBABE in it.
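The description above translates into a content-based check fairly directly. The following Python sketch is an assumption-laden illustration, not diffoscope's or libfile's detection code: `looks_like_apk` is a hypothetical helper, and the `require_meta_inf` switch is added only because unsigned APKs (which also appear in these reports) would lack the META-INF/ entries.

```python
import io
import zipfile

# Entries the message says every APK contains.
REQUIRED_ENTRIES = ("AndroidManifest.xml", "classes.dex")


def looks_like_apk(source, require_meta_inf=True) -> bool:
    """Heuristic: a readable ZIP holding the entries an APK always has.

    `source` may be a path or a binary file-like object.
    """
    try:
        with zipfile.ZipFile(source) as archive:
            names = set(archive.namelist())
    except (zipfile.BadZipFile, OSError):
        return False
    if not all(name in names for name in REQUIRED_ENTRIES):
        return False
    if require_meta_inf:
        # A JAR-style signature lives under META-INF/.
        return any(name.startswith("META-INF/") for name in names)
    return True


def build_archive(names) -> io.BytesIO:
    """Create an in-memory ZIP with empty entries (demo helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in names:
            archive.writestr(name, b"")
    buffer.seek(0)
    return buffer


signed = build_archive(["AndroidManifest.xml", "classes.dex", "META-INF/MANIFEST.MF"])
unsigned = build_archive(["AndroidManifest.xml", "classes.dex"])
plain_zip = build_archive(["README.txt"])
```

Calling `looks_like_apk(signed)` accepts the first archive, while `unsigned` only passes with `require_meta_inf=False`. This is exactly the kind of check that needs to open the archive, which is why it does not fit a detector that works from byte patterns alone.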
Bug#868486: updated URL
I had to move this APK to here: https://verification.f-droid.org/logs/Zom-15.1.0-alpha-5-zomrelease-release-unsigned.apk
Bug#868486: diffoscope often fails to detect APKs
Package: diffoscope
Version: 83

APKs are basically a ZIP file with a JAR signature, but not necessarily the CAFEBABE byte sequence that marks a JAR. This means that comparing APKs with diffoscope often results in a straight binary diff, which is useless. Here's one example:

https://verification.f-droid.org/im.zom.messenger_1510005.binary.apk.diffoscope.html

im.zom.messenger_1510005.binary.apk is available here:
https://verification.f-droid.org/Zom-15.1.0-alpha-5-zomrelease-release-unsigned.apk

im.zom.messenger_1510005.apk is available here:
https://github.com/zom/Zom-Android/releases/download/15.1.0-alpha-5/Zom-15.1.0-alpha-5-zomrelease-release.apk

You can get lots and lots of APKs from here: https://f-droid.org/packages

I'd like a way to force the file type in diffoscope. We are calling it from a build process, so we already know all files are going to be APKs.

Also, I tried to get this added to libfile, but upstream is not willing to accept detection routines that rely on more complicated things like the presence of a file in a ZIP. They just want byte patterns, which is not enough to consistently detect APKs.
Bug#851147: --max-report-size does not apply to --text reports
Package: diffoscope
Version: 67

On https://verification.fdroid.org, diffoscope is run like this:

    diffoscope --max-report-size 12345678 --max-diff-block-lines 100 \
        --html foo.html --text bar.txt

The HTML reports are being size-limited, but there are still some giant text reports, including a few that are hundreds of MB:

https://verification.f-droid.org/?C=S;O=D
Bug#850501: rerunning using diffoscope 67-21-gfe7ae15
Thanks for your work on the APK diffing! I had to fix a typo to get it running that was introduced in diffoscope commit fe7ae15e1c177866acd478af4cc4a51bd5002017 at the bottom of it. It turned 'f_out' into a non-existent 'w'. With that change, diffoscope is now working for me again. I'm running it on the fdroid verification server, you can see the output here:

https://verification.f-droid.org/

Also, about apktool.yml, I don't think we need it at all in diffoscope. It is not a file from the APK and it is not produced by Android SDK tools. As far as I know, it is only used by apktool if you want to use apktool to reconstruct a previously "decoded" APK.
Bug#849638: related bug
FYI, I filed https://bugs.debian.org/849782 about APKs being inconsistently detected.
Bug#849638: diffoscope 66 does binary diff on APK files
Reiner Herrmann:
> On Thu, Dec 29, 2016 at 12:41:16PM +0100, Hans-Christoph Steiner wrote:
>> When running diffoscope on two APKs using version 66, it now just does a
>> straight binary comparison of the direct file itself. Running
>> diffoscope 64 generated a nice output of the individual files in the ZIP
>> (an APK is a signed JAR with some other special features). Attached is
>> the output from v66. You can find v64 output here:
>>
>> http://37.218.242.117/
>>
>> You can download these two APKs to test with from here:
>>
>> http://37.218.242.117/aarddict.android_26.apk
>> https://f-droid.org/repo/aarddict.android_26.apk
>
> Just had a short look at this. The reason is that file/libmagic detects
> the files differently:
>
> $ file 1/aarddict.android_26.apk 2/aarddict.android_26.apk
> 1/aarddict.android_26.apk: Java archive data (JAR)
> 2/aarddict.android_26.apk: Zip archive data, at least v2.0 to extract
>
> So the APK comparator should probably also work on files detected as zip.

Yeah, that makes sense as long as libmagic cannot reliably detect APKs vs JAR vs ZIP.
Bug#849638: downgrading back to 64 gives me the same
So it seems that the issue is not in diffoscope per se, since now downgrading back to 64 from snapshot.debian.org generates the same output. I'm guessing then this is related to interactions with the dependencies, since I also did an `apt upgrade` at the same time. This is on a machine running stretch.
Bug#849638: diffoscope 66 does binary diff on APK files
Package: diffoscope
Version: 66
Severity: important

When running diffoscope on two APKs using version 66, it now just does a straight binary comparison of the direct file itself. Running diffoscope 64 generated a nice output of the individual files in the ZIP (an APK is a signed JAR with some other special features). Attached is the output from v66. You can find v64 output here:

http://37.218.242.117/

You can download these two APKs to test with from here:

http://37.218.242.117/aarddict.android_26.apk
https://f-droid.org/repo/aarddict.android_26.apk

Attachment: aarddict.android_26.apk.html.bz2
Re: [Reproducible-builds] Preliminary review of dpkg-genbuildinfo
Daniel Kahn Gillmor:
> On Fri 2015-02-13 03:36:20 -0500, Hans-Christoph Steiner wrote:
>> I think it would be much simpler to just have the single package signature that is embedded in the package file itself, like Android APKs and Java JARs. Since the package is built reproducibly, anyone who builds it can just copy the canonical signature into their copy of the package they just built, and it'll match the sha512sum of the signed apt metadata.
>
> It seems like you're saying everyone will be able to agree on which signing authority is canonical for any given package. I'm not convinced that's the case.
>
>> The big question there is determining where the canonical signature happens. It seems like it should be the official Debian build process, since it is the only process guaranteed to be the same
>
> Even though i'm personally likely to treat Debian as the canonical source i care about, i don't want it to be that way. I would like Debian to be able to be a downstream as well as an upstream (see the work feeding back into debian from ubuntu, for example); if a .deb package can contain an internal signature, and i'm looking at a given .deb in isolation on my debian system, i want to see it signed *by debian*, not by whoever happened to produce it first. Otherwise, it's not clear to me that the embedded signature is useful to me as an end user at all.
>
>> Another question is whether dpkg checks whether the signers match when upgrading, like the Android model (a package can only be upgraded by another package signed by the same key). This would be nice, but seems optional and hard to do in Debian.
>
> Maybe this is the question we need to answer to move the discussion forward to make sure we're taking the desire for embedded signatures into account when thinking about reproducible .debs: how exactly do we expect an embedded signature to be used/evaluated? by who, and in what context?
>
>> I think the .buildinfo file is useful, but for a separate process. It should be the canonical file for running a reproducible build.
>
> I'm not sure what this means. I'd be very happy if *all* of my debian packages were reproducible builds, and i could have a way of verifying it. I'd consider that more valuable than knowing that all my .debs were signed by any individual authority. So if we're really talking about a tradeoff between signed buildinfo files and signed packages, i'd certainly prefer signed buildinfo files. But my proposal was an attempt to let people have both, without forcing the entire ecosystem to agree on who is a canonical authority for package X, without whom a reproducible package is impossible
>
> --dkg

I think this topic is far too vast, with far too many dependencies, to really have a useful discussion on without a full time, dedicated team. Since that seems highly unlikely in the near future, we need to break it down into chunks of work that we can achieve with the time and resources we actually have.

So we need to focus on drilling down to the simplest useful form of package signing that will cause the fewest problems when we decide to change how package signing works. This means we get a prototype out as soon as possible, and we can learn a lot from that. I think that's pretty easy to do, something like this:

* make dpkg optionally check package sigs, and refuse to install on bad sig
* use apt signing model: signatures verified from the apt key ring
* signing can start happening in the build tools, by the uploader
* start work towards getting the Debian build/apt infrastructure signing

.hc

--
PGP fingerprint: 5E61 C878 0F86 295C E17D 8677 9F0F E587 374B BE81
https://pgp.mit.edu/pks/lookup?op=vindexsearch=0x9F0FE587374BBE81
Re: [Reproducible-builds] Wiki reorganization
Looks drastically better! I think this wiki is really the central resource for anyone interested in making reproducible builds, Debian or not. So I'm glad to see it reorganized to look like a community resource rather than the giant notepad it was before.

.hc

Jérémy Bobbio:
> Hi!
>
> While waiting on builds and rebuilds of linux, I started to reorganize, refresh and improve the documentation on the wiki.
>
> === HEADS-UP! ===
>
> If you were previously subscribed to the ReproducibleBuilds wiki page, you should edit your notifications:
>
> https://wiki.debian.org/?action=userprefssub=notification
>
> And add to the list of pages the following regex:
>
> ReproducibleBuilds/.*
>
> Otherwise, you are likely to miss some of the fun!
>
> The main page is now a landing page with links based on potential reader's interests. Have a look:
>
> https://wiki.debian.org/ReproducibleBuilds
>
> I'm happy with the status of the About, Contribute, and ExperimentalToolchain sub-pages. I should be able to work on Howto and History tomorrow. Don't wait for me, though. :)

--
PGP fingerprint: 5E61 C878 0F86 295C E17D 8677 9F0F E587 374B BE81
https://pgp.mit.edu/pks/lookup?op=vindexsearch=0x9F0FE587374BBE81
Re: [Reproducible-builds] [RFC] debbindiff
Definitely use setup.py. It makes the packaging easy and standardized, and it is the standard way to build python. It also makes it easy to publish releases to pypi, the central package repository for python. I attached a quick untested one for you.

.hc

Jérémy Bobbio wrote:
> Hi!
>
> I've been working at high pace since Sunday on a replacement for the diffp script [1]. These GPLv3 lines of Python are called debbindiff.
>
> Get it from Git: https://anonscm.debian.org/cgit/reproducible/debbindiff.git/
>
> Attached is an output produced for the attr package. The new tool is at least as capable as diffp, is way more extensible, and the result is more readable.
>
> Example usage:
>
> $ ./debbindiff.py --html /tmp/debbindiff.html b1/*.changes b2/*.changes
>
> There's no requirements for actually comparing .changes. You can use it to compare jar files directly if that's your kick.
>
> I'd love to see reviews of the code. It's scarce on comments but names should be explicit enough, or so I hope.
>
> It's missing Debian packaging. I guess I should learn how to write a setup.cfg or similar. Pointers or patches welcome.
>
> One thing this codebase should enable is writing “hints”. Once the tree of differences is generated, it should be doable to run through it to generate statements like: “Many files in data.tar have different timestamps, dh_fixmtimes has probably not been called. Are you using dh?” This still needs to be done though.
>
> Last note: I've been pushing everything else aside while I had the thrills to work on this. It's unclear when will be the next time, so patches are preferred rather than suggestion.
>
> [1]: https://anonscm.debian.org/cgit/reproducible/misc.git/tree/diffp

--
PGP fingerprint: 5E61 C878 0F86 295C E17D 8677 9F0F E587 374B BE81

Attachment: setup.py

    #!/usr/bin/env python2

    from setuptools import setup
    import sys

    setup(name='debbindiff',
          version='0.1',
          description='display differences between files',
          long_description=open('README').read(),
          author='Lunar',
          author_email='lu...@debian.org',
          url='https://wiki.debian.org/ReproducibleBuilds',
          packages=['debbindiff'],
          scripts=['debbindiff'],
          install_requires=[
              'python-debian',
          ],
          classifiers=[
              'Development Status :: 3 - Alpha',
              'Intended Audience :: Developers',
              'License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)',
              'Operating System :: POSIX',
              'Topic :: Utilities',
          ],
          )
Re: [Reproducible-builds] [RFC] debbindiff
Comparing jars gives a stacktrace, looks like a missing import.

    $ ./debbindiff.py ~/code/guardianproject/cacheword/cachewordlib/cachewordlib-v0.1-1-g04cb18e.jar /tmp/cachewordlib-v0.1-1-g04cb18e.jar
    Traceback (most recent call last):
      File "./debbindiff.py", line 53, in <module>
        main()
      File "./debbindiff.py", line 43, in main
        differences = debbindiff.comparators.compare_files(parsed_args.file1, parsed_args.file2)
      File "/media/share/code/reproducible/debbindiff/debbindiff/comparators/__init__.py", line 85, in compare_files
        return comparator(path1, path2, source)
      File "/media/share/code/reproducible/debbindiff/debbindiff/comparators/utils.py", line 51, in with_fallback
        inside_differences = original_function(path1, path2, source)
      File "/media/share/code/reproducible/debbindiff/debbindiff/comparators/zip.py", line 57, in compare_zip_files
        zipinfo1 = get_zipinfo(path1)
      File "/media/share/code/reproducible/debbindiff/debbindiff/comparators/zip.py", line 31, in get_zipinfo
        return re.sub(re.escape(path), os.path.basename(path), output)
    NameError: global name 're' is not defined

Also, I updated the setup.py for two small things. I recommend using code checkers like pyflakes and pylint:

    $ pyflakes *.py debbindiff/*.py
    debbindiff/difference.py:20: 'difflib' imported but unused
    debbindiff/difference.py:41: redefinition of function 'comment' from line 37
    $ pylint *.py debbindiff/*.py
    ...

.hc

--
PGP fingerprint: 5E61 C878 0F86 295C E17D 8677 9F0F E587 374B BE81

Attachment: setup.py

    #!/usr/bin/env python2

    from setuptools import setup

    setup(name='debbindiff',
          version='0.1',
          description='display differences between files',
          long_description=open('README').read(),
          author='Lunar',
          author_email='lu...@debian.org',
          url='https://wiki.debian.org/ReproducibleBuilds',
          packages=['debbindiff'],
          scripts=['debbindiff.py'],
          install_requires=[
              'python-debian',
          ],
          classifiers=[
              'Development Status :: 3 - Alpha',
              'Intended Audience :: Developers',
              'License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)',
              'Operating System :: POSIX',
              'Topic :: Utilities',
          ],
          )
Re: [Reproducible-builds] concrete steps for improving apt downloading security and privacy
Stefan Fritsch wrote:
> On Sunday 21 September 2014 21:13:50, Richard van den Berg wrote:
>> Package formats like apk and jar avoid this chicken and egg problem by hashing the files inside a package, and storing those hashes in a manifest file. Signatures only sign the manifest file. The manifest itself and the signature files are not part of the manifest, but are part of the package. So a package including its signature(s) is still a single file.
>
> This is bad design and will inevitably lead to security issues (as has been demonstrated by Android and apk). One must check the signature first, and only if the signature matches, start parsing complex file formats. And yes, zip is complex enough to be a problem.

It is true that an embedded signature requires more complicated code, but it also simplifies the parts that the user has to understand. Perfect code with a bad user experience will also inevitably lead to security issues. I'm guessing that the ar format is simpler than zip, so that'd be helpful.

.hc

--
PGP fingerprint: 5E61 C878 0F86 295C E17D 8677 9F0F E587 374B BE81
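The manifest scheme described in the quoted text can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the real JAR or APK signing format: the `build_manifest`, `matches_manifest` and `make_package` helpers are hypothetical, the manifest is a plain dictionary rather than a MANIFEST.MF file, and the signing step itself is left out.

```python
import hashlib
import io
import zipfile

# Assumption: signature material lives under META-INF/, so it is
# excluded from the digests it is supposed to protect.
EXCLUDED_PREFIX = "META-INF/"


def build_manifest(package: bytes) -> dict:
    """Map each payload entry of a ZIP package to its SHA-256 digest."""
    manifest = {}
    with zipfile.ZipFile(io.BytesIO(package)) as archive:
        for name in sorted(archive.namelist()):
            if name.startswith(EXCLUDED_PREFIX):
                continue
            manifest[name] = hashlib.sha256(archive.read(name)).hexdigest()
    return manifest


def matches_manifest(package: bytes, manifest: dict) -> bool:
    """True when the package payload hashes to exactly the given manifest."""
    return build_manifest(package) == manifest


def make_package(entries: dict) -> bytes:
    """Build an in-memory ZIP from a name -> bytes mapping (demo helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, payload in entries.items():
            archive.writestr(name, payload)
    return buffer.getvalue()


unsigned = make_package({"bin/tool": b"payload"})
signed = make_package({"bin/tool": b"payload", "META-INF/SIG": b"signature bytes"})
tampered = make_package({"bin/tool": b"evil", "META-INF/SIG": b"signature bytes"})

manifest = build_manifest(unsigned)
print(matches_manifest(signed, manifest), matches_manifest(tampered, manifest))
```

Because only the manifest would be signed, adding the signature entry leaves the digests unchanged. The sketch also shows the objection raised in the reply: the ZIP has to be parsed before anything can be checked.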
Re: [Reproducible-builds] concrete steps for improving apt downloading security and privacy
Daniel Kahn Gillmor wrote:
> On 09/22/2014 04:06 PM, Hans-Christoph Steiner wrote:
>> I think we're starting to nail down the moving parts here, so I want to outline that so we can find out the parts where we agree and where we disagree.
>>
>> * I hope we can all agree that the package itself should not change once it has hit the official repos.
>>
>> * I believe we can achieve what we want without taking a shortcut and introducing a new core package type (.sdeb .debs or whatever). We can figure out how to do this with the .deb file. Personally, I would accept a new package type after a thorough exploration of keeping .deb fails to deliver, but not before.
>>
>> * There should be at least one verification build before a package becomes official.
>>
>> * Then there needs to be a channel for people to submit the results of their own builds. That could be only positive results or only negative results, or both.
>>
>> * the .buildinfo file should contain all info needed to reproduce the build, given a standard Debian build environment
>
> Thanks, the above is a very useful summary.
>
>> Anything I left out?
>
> I think the summary above hints at but doesn't answer the question of what an official package means, and the fact that there may be multiple repositories (possibly operated by different organizations) with different rules about what should make a package official.
>
> I think we need to ask whether we care about byte-for-byte identical .deb files *across* different repositories or not.
>
> If we don't care about cross-repo (or cross-organization) byte-for-byte reproducibility, then an embedded signature in the .deb might be acceptable (though the data it contains would be redundant to signatures over the buildinfo files, which would eventually be necessary for external policies or corroboration anyway).
>
> If we *do* decide that we care about cross-repo byte-for-byte compatibility, then embedding a signature in the .deb suggests that one repo can act as the gating factor for another, because repos collaborating in this reproducibility push cannot both hold the key that makes a .deb official. I don't think that's a good tradeoff. As tempting as it might be to try to cement debian's authoritative role via such a lock-in, i'd much rather that debian derivatives, blends, side projects, etc, can all take initiative that can then be absorbed back into debian cleanly and reproducibly.
>
> i also suspect that the redundancy between internal signatures and signed .buildinfo records is likely to cause some increase in confusion, but i don't think that's as serious of a problem as the question of which signing keys get to be authoritative.
>
> --dkg

Cross-repo byte-for-byte compatibility is a nice thing to strive for, but it sounds quite difficult to achieve and will require lots of social coordination as well as technical work. In terms of builds of a particular .deb by multiple distros, each distro will have to use the exact same toolchain to build the .deb for most packages. Different versions of gcc, javac, etc. will produce different binaries.

You'll have the same problems as with the canonical signature, for example if two distros make a new package at the same time, but with different standards (gcc version, signer, etc). Ubuntu's gcc version will create a .deb with one hash, and Debian's gcc will create a .deb with a different hash, and each distro will mark theirs as canonical. That seems to be a much harder thing to manage across the distros.

So if another distro or repo is going to buy into Debian's reproducible system, adding a bit about canonical signatures seems totally feasible to manage. The canonical signature would just need to be done by a key in the debian-keyring for Debian, ubuntu-keyring for Ubuntu, etc.

While a static canonical signer is desirable, I don't think it can be supported without adding some restrictions (I think it would be worth it, for the record). Whoever is the first to publish a given package version claims the canonical role for both build setup and signature. To prevent accidental collisions, dput could check the various NEW queues, the various package repos, etc. to look for an existing canonical package. Then the first distro to publish is canonical.

Shall we have a real time discussion on this topic? Voice, video, or in person all work for me.

.hc

--
PGP fingerprint: 5E61 C878 0F86 295C E17D 8677 9F0F E587 374B BE81
Re: [Reproducible-builds] concrete steps for improving apt downloading security and privacy
Elmar Stellnberger wrote:

On 22.09.14 at 01:52, Paul Wise wrote:

On Mon, Sep 22, 2014 at 2:04 AM, Elmar Stellnberger wrote:

A package with some new signatures added is no more the old package. That is exactly what we do *not* want for reproducible builds. It should have a different checksum and be made available again for update.

The Debian archive does not allow files to change their checksum, so every signature addition requires a new version number. That sounds like a bad idea to me.

Yes, that is something we definitely do not want. Nonetheless it would still be an issue to have the package and the signatures in one file because we usually need them together. My only idea to realize this in spite of the said objection would be another proposal: Put the .deb and the signatures into one .ar called .sdeb and make tools like dpkg work on .sdebs or on .deb + signatures respectively. Whenever someone offers some packages for download that will be in the form of .sdebs while official debian repositories may separate both kinds of files. User interfaces like http://debtags.debian.net/search/ could then generate .sdebs on the fly to satisfy petted users.

I think we're starting to nail down the moving parts here, so I want to outline that so we can find out the parts where we agree and where we disagree.

* I hope we can all agree that the package itself should not change once it has hit the official repos.
* I believe we can achieve what we want without taking a shortcut and introducing a new core package type (.sdeb .debs or whatever). We can figure out how to do this with the .deb file. Personally, I would accept a new package type only if a thorough exploration of keeping .deb fails to deliver, but not before.
* There should be at least one verification build before a package becomes official.
* Then there needs to be a channel for people to submit the results of their own builds. That could be only positive results or only negative results, or both.
* the .buildinfo file should contain all info needed to reproduce the build, given a standard Debian build environment

Anything I left out?

.hc

--
PGP fingerprint: 5E61 C878 0F86 295C E17D 8677 9F0F E587 374B BE81
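The points about verification builds and a channel for submitted results can be sketched as a simple digest comparison. This is a minimal sketch, not an existing Debian tool: `verify_builds`, the report format, and the `threshold` parameter are all assumptions made for illustration.

```python
def verify_builds(uploaded_sha256, reports, threshold=1):
    """Compare independent rebuild digests against the uploaded .deb.

    reports is a list of (builder, sha256) pairs as they might arrive over a
    hypothetical submission channel. Both positive (matching) and negative
    (differing) results are kept.
    """
    matching = sorted({b for b, digest in reports if digest == uploaded_sha256})
    differing = sorted({b for b, digest in reports if digest != uploaded_sha256})
    return {
        # official only once enough independent builders reproduced it
        "accepted": len(matching) >= threshold,
        "matching": matching,
        "differing": differing,
    }


reports = [("buildd-a", "abc"), ("dev-b", "abc"), ("dev-c", "def")]
result = verify_builds("abc", reports, threshold=2)
```

With these reports, two builders match and one differs, so a threshold of 2 is met while the negative result from `dev-c` stays visible for investigation.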
Re: [Reproducible-builds] concrete steps for improving apt downloading security and privacy
Daniel Kahn Gillmor wrote:

Hi Hans-- I think we're in agreement here about most things actually, despite our back-and-forth. hopefully this is a clarifying response:

Daniel Kahn Gillmor wrote:

In that case, the .deb that was installed on a sid system *is not* the .deb that is installed on a testing system. If i run a mixed unstable/testing system (i do, actually, this is not hypothetical) should i need to re-install foo_1.2-3_mipsel.deb when that package transitions from unstable to testing (without any changes made to it other than new signatures)? That seems odd, but the .debs are now no longer bytewise identical.

should archive operators who are doing rsync mirroring of a number of pools update their .debs as new signatures are added to them? can they still use rsync for this cleanly without a massive increase in bandwidth between mirrors or do we need to define a new synchronization mechanism?

The question of what is a canonical, immutable signature for any given distribution is also problematic, because it ties the policy of the distribution (already defined by what that apt repository includes and references in Release.gpg) to a set of individual package signatures. But this is exactly the point where we'd like more flexibility. People who care about apt repo X and use it online can use Release.gpg, while people who are *not* using the apt repo might have a different set of policies. And some repos might want to share specific packages with each other -- what if their signing policies about the canonical signature conflict? should they have to rebuild the package?

Packages should not be accepted into any official repo, sid included, without some verification builds. A .deb should remain unchanged once it is accepted into any official repo (maybe experimental could be an exception, but not sid). I think that is essential.

I see no reason for changing the .deb between sid and testing, except for perhaps how existing implementations are doing it.
It is usually worth the work to do things the right way, rather than the easy way.

The build verification process needs to happen between the package upload and publishing to sid or security updates. Two builds is easy: the .deb that the uploader generates and the one the Debian process makes. That is probably enough.

In Debian's case, it probably is too complicated to include multiple signatures. In that case, there should be only one canonical signature by dak once the build verification signature threshold has been passed. Then all of the other signatures could be added to .buildinfo or .changes or whatever other file.

Another option is to do it like f-droid.org does. F-droid.org generates an APK signing key for each app, then manages the signing on a specialized signing server. Or another option is just requiring all the signers to be from the debian-keyring, rather than an exact match for previous signers.

In any case, the .deb needs to remain unchanged.

You're entirely right that when fetching files via the web directly (instead of an apt repository) or sneakernet, people tend to transfer only the minimal set of possible files, and therefore having detached signatures is a bad idea for adoption.

But i don't think this addresses the concern raised above that specific .deb files have constant size and contents, which is an assumption that permeates the repository, mirror, and distribution mechanisms. Rejecting that assumption means potentially breaking a lot of infrastructure that currently works, as well as forcing incompatible policy changes on archive operators.
So i'd like to have this cake and eat it too, please :) Here's a proposal for chewing over:

* define a new package format called .debs
* foo_1.2-3_mipsel.debs is a tarball that contains at least three files:
    foo_1.2-3_mipsel.deb
    foo_1.2-3_mipsel.buildinfo
    foo_1.2-3_mipsel.buildinfo.0EE5BE979282D80B9F7540F1CCD2ED94D21739E9.asc
  (it can contain more than one .asc if it wants to include multiple signatures)
* if you invoke dpkg -i foo_1.2-3_mipsel.debs, dpkg should unpack and inspect the .debs, and the signatures, and refuse to install the .deb if the signatures don't meet local policy. (i'm hand-waving here about what local policy is, since i think that's a separate discussion)

Now we can leave the current online archive distribution alone -- apt works (modulo bugs) and archive operators can continue to function as they currently do. But we tell users and upstream developers that if they want to install packages via sneakernet or by downloading them individually from the web that they really should be passing around .debs files, and not .deb files.

We could even modify dpkg to reject installations of plain .deb files unless a package manager (which has presumably already verified the package by other means) is doing the installation.

what do you think?

--dkg

I think this can
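The structural part of the .debs proposal above (a tarball holding a .deb, a .buildinfo, and one or more detached signatures) can be sketched with Python's standard tarfile module. This is a sketch under assumptions: `inspect_debs` and `make_debs` are hypothetical helpers, the check only looks at member names, and actual signature verification against local policy is left out, as in the proposal.

```python
import io
import re
import tarfile

# <name>.buildinfo.<40 hex digit key fingerprint>.asc, as in the proposal
ASC_RE = re.compile(r"\.buildinfo\.[0-9A-F]{40}\.asc$")


def inspect_debs(fileobj):
    """Return (deb, buildinfo, [signatures]) member names or raise ValueError."""
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        names = tar.getnames()
    debs = [n for n in names if n.endswith(".deb")]
    infos = [n for n in names if n.endswith(".buildinfo")]
    sigs = [n for n in names if ASC_RE.search(n)]
    if len(debs) != 1 or len(infos) != 1 or not sigs:
        raise ValueError("not a well-formed .debs bundle")
    return debs[0], infos[0], sigs


def make_debs(members):
    """Build an in-memory tarball from a {name: bytes} mapping (demo helper)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in members.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf


FPR = "0EE5BE979282D80B9F7540F1CCD2ED94D21739E9"
bundle = make_debs({
    "foo_1.2-3_mipsel.deb": b"deb bytes",
    "foo_1.2-3_mipsel.buildinfo": b"buildinfo bytes",
    f"foo_1.2-3_mipsel.buildinfo.{FPR}.asc": b"signature bytes",
})
deb, buildinfo, signatures = inspect_debs(bundle)
```

A bundle without any .asc member is rejected with `ValueError`, which is where a real tool would refuse to install.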
Re: [Reproducible-builds] concrete steps for improving apt downloading security and privacy
Daniel Kahn Gillmor wrote:

Thanks for the discussion, Hans.

On 09/19/2014 02:47 PM, Hans-Christoph Steiner wrote:

Packages should not be accepted into any official repo, sid included, without some verification builds. A .deb should remain unchanged once it is accepted into any official repo (maybe experimental could be an exception, but not sid). I think that is essential.

But some repositories could have different rules for package inclusion than others, right? for example, say debian wanted to offer an unstable-reproducible suite, which only permitted packages that had been independently rebuilt reproducibly by multiple DDs and at least two different buildds. Ideally, the packages that are shared between this repository and other repositories would be identical. Note that if .deb files are internally signed, two developers *cannot* create the same exact .deb if they do not share their secret keys.

You're missing one key detail here, let's see if I can suss it out:

* the builds are _exactly_ the same, except the signatures
* the embedded signature does not sign the signature files (see jar and apk formats, which are almost the same, for examples)
* anyone can just copy other dev's signature into the package and it will validate because the package contents are exactly the same
* the signature files sign the package contents (i.e. control.tar.gz and data.tar.gz), not the hash of the whole .deb file.

Therefore two developers can easily create the same .deb if they have access to the signature file since they can just copy it. No need to run the signing process again. If people create their own .deb files in a reproducible process, then copy in the same signature files, then the hash of the .deb will be the same.

I see no reason for changing the .deb between sid and testing, except for perhaps how existing implementations are doing it. It is usually worth the work to do things the right way, rather than the easy way.
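The signature-copying argument above can be sketched with a toy ar archive. Several things here are assumptions made for illustration: an HMAC stands in for a real OpenPGP signature, the member name `_gpgorigin` is used as a placeholder for the embedded signature file, and the fixed zero timestamps and owner IDs are just the simplest way to keep the archive deterministic.

```python
import hashlib
import hmac


def ar_member(name: str, data: bytes) -> bytes:
    # 60-byte ar header: name(16) mtime(12) uid(6) gid(6) mode(8) size(10) + "`\n"
    header = f"{name:<16}{0:<12}{0:<6}{0:<6}{'100644':<8}{len(data):<10}`\n"
    # member data is padded to an even length
    return header.encode("ascii") + data + (b"\n" if len(data) % 2 else b"")


def build_deb(control: bytes, data: bytes, signature: bytes) -> bytes:
    members = [
        ("debian-binary", b"2.0\n"),
        ("control.tar.gz", control),
        ("data.tar.gz", data),
        ("_gpgorigin", signature),  # assumed name for the embedded signature
    ]
    return b"!<arch>\n" + b"".join(ar_member(n, d) for n, d in members)


def sign_contents(key: bytes, control: bytes, data: bytes) -> bytes:
    # Stand-in signature over the member digests only, never over the .deb itself.
    digest = hashlib.sha256(control).digest() + hashlib.sha256(data).digest()
    return hmac.new(key, digest, hashlib.sha256).hexdigest().encode("ascii")


control, data = b"control bytes", b"data bytes"   # identical reproducible builds
alice_sig = sign_contents(b"alice-secret", control, data)

alice_deb = build_deb(control, data, alice_sig)
bob_copied = build_deb(control, data, alice_sig)  # Bob copies Alice's signature
bob_resigned = build_deb(control, data, sign_contents(b"bob-secret", control, data))

same = hashlib.sha256(alice_deb).digest() == hashlib.sha256(bob_copied).digest()
```

`same` is true because Bob never needs Alice's key, only her published signature file; `bob_resigned` differs, which is the case where each builder signs independently.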
I agree with this sentiment, i think we're trying to sort out what is the right way.

The build verification process needs to happen between the package upload and publishing to sid or security updates. Two builds is easy: the .deb that the uploader generates and the one the Debian process makes. That is probably enough.

In Debian's case, it probably is too complicated to include multiple signatures. In that case, there should be only one canonical signature by dak once the build verification signature threshold has been passed. Then all of the other signatures could be added to .buildinfo or .changes or whatever other file.

but the .buildinfo file is designed to say "i generated the .deb that matches this digest exactly", which the corroborating builder cannot do, because they cannot produce the internal signature.

No need to produce the signature, just copy it!

Plus, we now have two different places to look for signatures. one canonical one and then some external ones, and the signatures themselves have different properties (one signs parts of the deb, the other signs the whole .deb; one signs the build environment, the other does not, etc)

Definitely look at jar signing, it handles multiple signatures fine. I see no reason why you can't include an unlimited number of signatures in a .deb. Changing the number of signatures will change the hash of the .deb, that is why there needs to be a canonical set of signatures for each .deb.

As for signing the hash of the entire .deb, that is what apt already gives us, that does not need to be reproduced in the dpkg-sig embedded signature. For people who want to verify the contents of a .deb with any kind of signature, a tool will have to compare the hashes of control.tar.gz and data.tar.gz.

Another option is to do it like f-droid.org does. F-droid.org generates an APK signing key for each app, then manages the signing on a specialized signing server.
Or another option is just requiring all the signers to be from the debian-keyring, rather than an exact match for previous signers.

Again, i think this is getting ahead of the discussion. i'm not proposing that we try to set debian (or other derived distro) archive policy here, i just think we want to think

In any case, the .deb needs to remain unchanged.

right. but it can't be unchanged if the archive distributor decides that a different signer is the canonical signer. So you're making the contents of the .deb dependent on archive policy, rather than the other way around.

I *want* ubuntu and debian and mint to all ship the exact same .deb for any packages that are reproducible (and eventually, all packages!) that they share, and i also want those different distros to be able to produce the reproducible .deb independently of one another.

If foo_1.2-3_mipsel.deb is built first on the ubuntu builders and ubuntu decides to include it in the archive, and then debian is able to reproduce that build