Hi all,

Here's my first pass at proposed Thandy changes as discussed on IRC a
couple of weeks ago.

Justin


0. Proposed Thandy Changes
==========================

This is a set of proposals that includes a section of simple changes
that can be considered on their own (Section 1) as well as a more
fundamental Thandy restructuring proposal (Section 2).

This isn't meant to be at the level of detail needed for a spec and
subsequent implementation. This is to get feedback and promote
discussion. It's not an official proposal at this point but more of a
request for comment.

A few relevant documents for reference:

  * Thandy spec:
    https://gitweb.torproject.org/thandy.git/blob_plain/HEAD:/specs/thandy-spec.txt

  * TUF spec:
    https://www.updateframework.com/browser/specs/tuf-spec.txt

  * High-level differences between Thandy and TUF:
    https://www.updateframework.com/wiki/ThandyDifferences

  * Paper on TUF:
    http://www.freehaven.net/~arma/tuf-ccs2010.pdf


1. Individual Thandy Changes
============================

These are changes that could be made to Thandy without a major overhaul
and can be considered separately from the restructuring proposal
(Section 2).

1.1. Multiple File Hashes
-------------------------

Make all file hashes be a set of (algorithm, digest) pairs rather than
a single digest of a predefined algorithm. Thus, instead of describing
a file's hash in metadata with:

  "hash" : "349dceb3de2db82e363c3d73063f031c56c5aac5"

it would be described as:

  "hash" : { "sha1"   : "349dceb3de2db82e363c3d73063f031c56c5aac5",
             "sha256" : "95eaa1682a99fba24b26c94499b545...747d7759ba845c8b5c" }

Listing only one digest would still be allowed; this change just makes
it possible to list several. It's up to the client implementation to
determine which are checked.

1.2. Refer to Keys by Their ID when Delegating
----------------------------------------------

In the Thandy key list file (the "root metadata"), the full keys are
listed each time they are referenced. This may decrease the human
readability of the key list.
An alternative approach is to use a separate section in the file that
defines the keys that will be used in the rest of this metadata file,
list them with their ID (a hash of the canonical format of the key),
and then refer to them later by this ID. This is similar to how
signatures are already done in Thandy: the ID of the key is listed
along with the signature.

An implementation of this needs to check for ID collisions when reading
keys from metadata. It's fine to see the same key specified with the
same ID, but a different key with the same ID as one that has already
been seen indicates that something is wrong. (Note that the
implementation would always check that the specified IDs are correct
for the corresponding key, even without collisions.)

1.3. Indicate Signature Thresholds
----------------------------------

The current Thandy spec isn't clear about how multiple keys are
specified for a role. There also doesn't appear to be a way to specify
a threshold that is less than the total number of keys. The method of
specifying multiple keys should be made clear, and the ability to
indicate how many signatures from those keys are required should be
added. (There's an example of how this can be specified in metadata in
Section 2.2.)

1.4. Add a 'Release' Role
-------------------------

Thandy currently lists the hashes of all other metadata in the
timestamp file. There are certain attacks that could be mitigated if
the Timestamp role signed a separate Release role's metadata that
listed the hashes of all other metadata files.

The idea here is that an attacker who compromises only the Timestamp
role cannot present clients with a mix-and-match of signed metadata
files that were available from the repository at different times. The
separation helps because the Timestamp role's keys are used in an
automated fashion and so are more likely to be compromised, whereas the
Release role's keys would not be used in an automated fashion.
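As a rough illustration of the Release role idea, the sketch below shows
one way a client might confirm that the metadata files it holds all
belong to the single snapshot pinned by the timestamp file. The function
names and the dictionary layouts ("release_sha256", "meta") are
assumptions made for this example and are not taken from the Thandy or
TUF specs; signature verification is assumed to have been done before
this check runs.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of raw bytes."""
    return hashlib.sha256(data).hexdigest()


def consistent_with_release(timestamp: dict, release_raw: bytes,
                            other_metadata: dict) -> bool:
    """Return True if every metadata file matches one Release snapshot.

    timestamp:      parsed timestamp metadata, hypothetical layout
                    {"release_sha256": "<hex digest>"}
    release_raw:    raw bytes of the release metadata, hypothetical
                    layout {"meta": {"<path>": "<hex digest>", ...}}
    other_metadata: path -> raw bytes of each remaining metadata file
    Signatures are assumed to have been verified already.
    """
    # The timestamp file pins exactly one version of the release file.
    if sha256_hex(release_raw) != timestamp["release_sha256"]:
        return False
    listed = json.loads(release_raw)["meta"]
    # Each file the client holds must be listed in that release file
    # with the same digest; anything else is a mix-and-match.
    for path, raw in other_metadata.items():
        if listed.get(path) != sha256_hex(raw):
            return False
    return True


# Hypothetical data: the release file lists the newer targets metadata.
targets_old = b'{"version": 1}'
targets_new = b'{"version": 2}'
release_raw = json.dumps(
    {"meta": {"targets.txt": sha256_hex(targets_new)}}).encode()
timestamp = {"release_sha256": sha256_hex(release_raw)}

print(consistent_with_release(timestamp, release_raw,
                              {"targets.txt": targets_new}))
print(consistent_with_release(timestamp, release_raw,
                              {"targets.txt": targets_old}))
```

With this data the first call accepts the matching file and the second
rejects the older targets metadata, since it is not the version the
release file lists.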
Though the idea of a metadata mix-and-match attack is in general
something worth keeping in mind, it may be the case that Thandy isn't
at much risk because bundles serve a similar role of grouping together
package versions in a way that attackers can't cause the clients to use
an unintended combination of package versions. The risk to Thandy
depends on whether packagers ever replace a package version rather than
increment it (they aren't supposed to ever replace a version) and
whether Thandy bundles always specify exact package versions rather
than minimum/maximum package versions or package version ranges.


2. Thandy Restructuring Proposal
================================

Primary goal: Keep Thandy's concepts of bundles and packages but
overlay them on top of the generic 'targets' approach of TUF.

Note: This proposal is not advocating using/maintaining/relying on TUF
as a separate project. That depends on factors such as the future of
TUF according to the current TUF maintainers, whether Python is an
appropriate choice for Windows clients, etc.

2.1. Approach
-------------

Two separate layers:

  1. An authentication layer that downloads and authenticates opaque
     'target' files according to metadata it understands that lists
     hashes and sizes of the target files. This layer doesn't
     understand what bundles and packages are.

  2. A decision/installation layer that uses the authentication layer
     to download bundle/package info and associated files. This layer
     doesn't know the details of the authentication mechanisms or
     roles; it gets files from the authentication layer that the
     authentication layer has already authenticated.

     * Note that the update decision and installation code are
       probably separate, but for the sake of this proposal all that
       matters is that the Thandy authentication layer is logically
       separate from the rest of Thandy.

For the authentication layer, we start with the following roles (the
same as TUF uses):

  * Root
    o Root of trust for the entire PKI. Indicates through signed
      metadata which keys are trusted for the Release, Targets,
      Timestamp, and Mirror roles.

  * Timestamp
    o Signs a frequently regenerated timestamp file with a short
      expiration indicating the most recent release metadata.

  * Release
    o Signs the release metadata which lists the hashes and sizes of
      all other metadata files (other than the timestamp file). Note
      that bundleinfo and pkginfo are not considered metadata at the
      authentication layer.

  * Targets
    o Signs a metadata file that lists the hashes and sizes of target
      files: the files that the decision layer ultimately wants to
      obtain.
    o Can delegate to sub-roles the responsibility for providing
      target files from specific paths on the repository (e.g. Role A
      is trusted to provide files from the /targets/role_a/
      directory).

  * Mirror
    o Signs a metadata file that lists the locations and details of
      repository mirrors.

From here we use delegation by the Targets role to create the roles
for bundlers and packagers. The top-level Targets role delegates a
separate role for each bundle and each package. The targets role
hierarchy looks like this (with many more bundle and package roles):

  Root
  `-- Targets
      |-- bundles/tor-browser-stable
      |-- bundles/tor-browser-beta
      `-- pkgs/openssl

Each bundle version and package version that bundlers and packagers
release has a separate bundleinfo and pkginfo file, respectively.
These bundleinfo and pkginfo files are opaque to the authentication
layer: it considers them target files like any other. However, the
decision layer understands the contents of these files and uses them
to make subsequent download and installation decisions (with the
downloads always being done through the authentication layer).

2.2. Repository Structure
-------------------------

Top-level metadata files are:

  /meta/root.txt
  /meta/release.txt
  /meta/timestamp.txt
  /meta/targets.txt
  /meta/mirrors.txt

The /meta/targets.txt file would include a delegations section such
as:

  delegations : {
    keys : {
      'ABC...' : { details },
      '123...' : { details },
      ...
    },
    roles : {
      'bundles/tor-browser-stable' : {
        keys : ['ABC...', '123...'],
        threshold : 2,
        paths : ['bundles/tor-browser-stable/**'],
      },
      'pkgs/openssl' : {
        keys : ['DEF...', '456...'],
        threshold : 2,
        paths : ['pkgs/openssl/**'],
      },
      ...
    }
  }

The above would mean that the top-level Targets role had delegated a
role whose full name would be targets/bundles/tor-browser-stable (as
it is delegated by the targets role, the prepended targets/ is
implicit in the delegated role's name). This role for the
tor-browser-stable bundle would be trusted for the specified paths
relative to the repository's targets/ directory.

Thus, a specific version's bundleinfo file created by the bundler
could be placed on the repository at, for example:

  /targets/bundles/tor-browser-stable/win32/0.1/tor-browser-stable_win32_0.1.bundleinfo

(Note that this bundle role is trusted for all targets files matching
the path 'bundles/tor-browser-stable/**' under the repository's
targets/ directory, as specified when this role was created through
the above delegation.)

The bundle maintainer would sign a metadata file listing the hash and
size of this bundleinfo. This metadata would be placed on the
repository at:

  /meta/targets/bundles/tor-browser-stable/win32/0.1/tor-browser-stable_win32_0.1.txt

(Note that the basename of these files isn't crucial to this aspect of
the design. They don't need to repeat the path info, though that's
probably helpful for humans.)

More generally, the metadata location is:

  /meta/ROLE_NAME/[ANY_PATH/]ANY_NAME.txt

Packages are similar to bundles with the difference that there are one
or more target files in addition to the pkginfo file.
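As a rough illustration of how an authentication layer might apply a
delegation entry like the one shown earlier in this section, the sketch
below checks that a target path falls under a role's delegated paths and
that enough distinct trusted keys have signed. The function names, the
dictionary layout, and the reading of '**' as "anything beneath this
directory" are assumptions made for this example, and the set of key IDs
with valid signatures is assumed to come from a separate verification
step.

```python
def path_is_delegated(target_path: str, patterns: list) -> bool:
    """True if target_path falls under one of the role's path patterns.

    Assumption: a pattern ending in '/**' covers everything beneath that
    directory; any other pattern must match the path exactly. This
    sketch does not normalise '..' components, which a real
    implementation would need to handle.
    """
    for pattern in patterns:
        if pattern.endswith("/**"):
            prefix = pattern[:-2]  # drop '**', keep the trailing '/'
            if target_path.startswith(prefix) and len(target_path) > len(prefix):
                return True
        elif target_path == pattern:
            return True
    return False


def role_may_provide(role: dict, target_path: str,
                     valid_signer_ids: set) -> bool:
    """Combine the path check with the signature threshold."""
    # Only signatures from keys listed for this role count.
    trusted = set(role["keys"]) & valid_signer_ids
    return (path_is_delegated(target_path, role["paths"])
            and len(trusted) >= role["threshold"])


# Hypothetical role entry mirroring the delegation example; the key IDs
# are placeholders, not real identifiers.
role = {
    "keys": ["ABC...", "123..."],
    "threshold": 2,
    "paths": ["bundles/tor-browser-stable/**"],
}
bundleinfo = ("bundles/tor-browser-stable/win32/0.1/"
              "tor-browser-stable_win32_0.1.bundleinfo")

print(role_may_provide(role, bundleinfo, {"ABC...", "123..."}))
print(role_may_provide(role, bundleinfo, {"ABC..."}))
print(role_may_provide(role, "pkgs/openssl/win32/0.9.8m/libeay32.dll",
                       {"ABC...", "123..."}))
```

Only the first call succeeds: the second has too few signatures for the
threshold, and the third asks about a path outside the role's delegated
directory.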
A package maintainer may supply the following files to be placed on
the repository:

  /targets/pkgs/openssl/win32/0.9.8m/openssl_win32_0.9.8m.pkginfo
  /targets/pkgs/openssl/win32/0.9.8m/libeay32.dll
  /targets/pkgs/openssl/win32/0.9.8m/ssleay32.dll

The hashes and sizes of these files are listed in metadata signed by
the targets/pkgs/openssl role (that is, the openssl package
maintainer's role). This metadata would be placed on the repository
at:

  /meta/targets/pkgs/openssl/win32/0.9.8m/openssl_win32_0.9.8m.txt

2.3. Update Procedure
---------------------

The update procedure is:

  * The decision layer uses the authentication layer to retrieve a
    list of all available bundleinfo files.
    o Implementation: the decision layer asks the authentication
      layer for a list of all available metadata file paths/names.
      The authentication layer obtains this information from the
      release metadata.

  * Looking at the paths/names of available bundleinfo files, the
    decision layer identifies whether there is a newer version of a
    bundle it is interested in.
    o Implementation: the bundle names, OS, arch, and bundle version
      are all contained in paths of the available bundle metadata
      files.

  * The decision layer notices a bundle version in the list that it
    wants and uses the authentication layer to retrieve the
    bundleinfo file for that version.

  * The decision layer reads the contents of the bundleinfo file
    which indicate the necessary package versions and any other info
    the decision layer needs.

  * The decision layer uses the authentication layer to retrieve the
    pkginfo files for each of the package versions that it wants.

  * The decision layer understands the contents of the pkginfo files.
    These files indicate the individual files that are part of this
    version of the package.

  * The decision layer uses the authentication layer to retrieve the
    individual files (e.g.
    /targets/pkgs/openssl/win32/0.9.8m/libeay32.dll) that are needed.

  * The decision layer hands off the relevant installation
    instructions (from the bundleinfo and pkginfo files) and
    individual package files to the code that performs the
    installation/upgrade.

2.4. bundleinfo and pkginfo
---------------------------

As the contents of the bundleinfo and pkginfo are opaque to the
authentication layer, essentially there are two completely separate
sets of metadata in this design. It would make sense to have them use
the same format (e.g. Canonical JSON) and be parsed/generated by the
same code.

The bundleinfo and pkginfo files would contain largely the same
information as these files do in the current Thandy spec (though they
wouldn't be directly signed but rather would be described in signed
authentication-layer metadata).

There are a few reasons it is good to have the bundleinfo/pkginfo be
opaque to the authentication layer. One reason is that changes to
bundleinfo/pkginfo fields can be tested independently of the
authentication layer. Also, non-backwards-compatible changes could be
made by introducing a new file name such as bundleinfo.v2 which would
be effectively invisible to legacy clients.

2.5. Differences with TUF
-------------------------

The authentication layer's metadata and roles are very similar to the
current TUF specification. However, there are a few differences.

TUF currently does not allow a single role to directly delegate
multiple roles deep. In TUF, one would need the following role
structure:

  Root
  `-- Targets
      |-- bundles
      |   `-- tor-browser-stable
      `-- pkgs
          `-- openssl

That is, the Targets role would have to first delegate a bundles role
which then delegates a tor-browser-stable role.

Relatedly, TUF gives each delegated role the ability to sign a single
metadata file whose name is exactly the role's name.
This may be non-ideal for Thandy because bundlers and packagers would
need to keep a continuously growing metadata file that lists all of
the versions that they want to be available to clients or,
alternatively, delegate subroles for each version in order to use
separate metadata files for each. (Note that this is talking about the
authentication layer's metadata, not bundleinfo and pkginfo files.) In
contrast, with this proposal, a bundler/packager would sign a metadata
file that lists only the new target files they are adding to the
repository.

This isn't a case where there's one correct way to do things, but my
understanding is that Thandy would like old versions to remain
available within their expiration times and would like
bundlers/packagers to not have to deal with issues such as
accidentally removing an old version they didn't mean to remove when
generating and signing metadata to make a new version available.

[end of proposal]
