On Thu, Nov 02, 2017 at 08:11:59PM +0100, Michał Górny wrote:
> Next version. Now without MISC/OPTIONAL, and with many clarifications.
Huge improvements in this version; I found it much easier to understand.

Nits:
- Please stick to the ASCII ellipsis ("..."). The Unicode ellipsis is
  unreadable in some monospace fonts.

Further items inline:
> Directory tree coverage
> -----------------------
...
> The file entries (except for ``IGNORE``) can be specified for regular
> files only. Symbolic links are followed when opening files
> and traversing directories. It is an error to specify an entry for
> a different file type. If the tree contain files of other types
> that are not otherwise ignored, they need to be covered by an explicit
> ``IGNORE``.
> 
> All the local (non-``DIST``) files covered by a Manifest tree must
> reside on the same filesystem. It is an error to specify entries
> applying to files on another filesystem. If subdirectories
> that are not otherwise ignored reside on a different filesystem, they
> must be explicitly excluded via ``IGNORE``.
I would prefer this to say:
'If files that are not otherwise ignored reside on a different
filesystem', i.e. widened from subdirectories to all files.
This implicitly forbids following a symlink that crosses a filesystem
boundary, and so matches the similar part of 'Tree layout
restrictions'.
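
To illustrate what I mean, here is a rough sketch of such a check, not
a proposal for how any package manager should implement it. The function
name and the injectable `stat` parameter are my own inventions; the only
real mechanism used is comparing `st_dev` after following symlinks.

```python
import os


def entries_on_other_filesystems(root, stat=os.stat):
    """List paths under root (relative to it) whose device differs from root's.

    stat follows symlinks, matching the 'symlinks are followed' rule. It is a
    parameter only so the check can be exercised without a second filesystem.
    """
    root_dev = stat(root).st_dev
    offenders = []

    def on_root_device(path):
        try:
            return stat(path).st_dev == root_dev
        except OSError:
            return None  # e.g. a dangling symlink; not this check's concern

    # NOTE: followlinks=True can loop forever on symlink cycles; a real
    # implementation would need to track visited directories.
    for dirpath, dirnames, filenames in os.walk(root, followlinks=True):
        kept = []
        for name in dirnames:
            full = os.path.join(dirpath, name)
            same = on_root_device(full)
            if same is False:
                offenders.append(os.path.relpath(full, root))
            elif same:
                kept.append(name)
        dirnames[:] = kept  # never descend into another filesystem
        for name in filenames:
            full = os.path.join(dirpath, name)
            if on_root_device(full) is False:
                offenders.append(os.path.relpath(full, root))
    return sorted(offenders)
```

Anything this returns would need an explicit IGNORE under the wording
I am suggesting, whether it is a directory or a single symlinked file.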

> Rationale
> =========
...
> Tree layout restrictions
> ------------------------
> 
> The algorithm is meant to work primarily with ebuild repositories which
> normally contain only files and directories. Directories provide
> no useful metadata for verification, and specifying special entries
> for additional file types is purposeless. Therefore, the specification
> is restricted to dealing with regular files.
> 
> The Gentoo repository does not use symbolic links. Some Gentoo
> repositories do, however. To provide a simple solution for dealing with
> symlinks without having to take care to implement special handling for
> them, the common behavior of implicitly resolving them is used.
> Therefore, symbolic links to files are stored as if they were regular
> files, and symbolic links to directories are followed as if they were
> regular directories.
> 
> Dotfiles are implicitly ignored as that is a common notion used
> in software written for POSIX systems. All other common filenames
> require explicit ``IGNORE`` lines.
'common' in the second sentence seems odd. What about uncommon
filenames? Maybe just s/other common filenames/other filenames/.

> An ability to inject additional ignore entries is provided to account
> for site configuration affecting the repository tree — placing
> additional files in it, skipping some of the categories from syncing.
Mention that the package manager may support wildcards or regexes in the
additional entries. E.g.: 'IGNORE **/metadata.xml'
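
A minimal sketch of how such a wildcard entry could be matched. The
semantics here are purely my assumption (`**/` spans zero or more
directories, `*` and `?` stay within one path component); the spec does
not define any of this, and the function names are hypothetical.

```python
import re


def ignore_pattern_to_regex(pattern):
    """Translate a wildcard IGNORE pattern into an anchored regex.

    Assumed semantics (not from the spec): '**/' matches zero or more
    leading directories, '*' and '?' never match a '/'.
    """
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")
            i += 3
        elif pattern[i] == "*":
            out.append("[^/]*")
            i += 1
        elif pattern[i] == "?":
            out.append("[^/]")
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out) + r"\Z")


def is_ignored(path, patterns):
    """True if the repository-relative path matches any wildcard pattern."""
    return any(ignore_pattern_to_regex(p).match(path) for p in patterns)
```

With these assumed rules, `is_ignored("dev-libs/foo/metadata.xml",
["**/metadata.xml"])` is true, while `metadata.xml.bak` is not matched.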

> Non-strict Manifest verification
> --------------------------------
...
> The cases for stripping unnecessary files mostly focused around space
> savings. For this purpose, stripping ``metadata.xml`` and similar files
> has little value. It is much more common for users to strip whole
> categories which can not be handled via the ``MISC`` type, and needs
> a dedicated package manager mechanism. The same mechanism can also
> handle files that used the ``MISC`` type.
Exclusion by package does happen as well. A list of categories or
packages can be used for both the rsync exclusion and the IGNORE.
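
Roughly what I have in mind, as a sketch only: one list of categories
or category/package names drives both outputs. The helper name is made
up, and I am assuming plain names without wildcards. The rsync side uses
ordinary `--exclude=` patterns anchored at the transfer root.

```python
def exclusion_outputs(entries):
    """Derive rsync excludes and IGNORE lines from one list of names.

    entries: category ("games-fps") or category/package ("dev-lang/ghc")
    names, relative to the repository root. Hypothetical helper.
    """
    rsync_args = []
    ignore_lines = []
    for entry in entries:
        path = entry.strip("/")
        if not path:
            continue  # skip empty entries rather than excluding the root
        # Leading '/' anchors at the transfer root; trailing '/' restricts
        # the match to directories.
        rsync_args.append(f"--exclude=/{path}/")
        ignore_lines.append(f"IGNORE {path}")
    return rsync_args, ignore_lines
```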

> Splitting distfile checksums from file checksums
> ------------------------------------------------
> 
> Another problem with the current Manifest format is that the checksums
> for fetched files are combined with checksums for local files
> in a single file inside the package directory. It has been specifically
> pointed out that:
> 
> - since distfiles are sometimes reused across different packages,
>   the repeating checksums are redundant,
Comment: 8.4% of all DIST entries are duplicates, representing a 2MiB
saving in tree size (25MiB of DIST entries altogether).
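
For anyone who wants to measure this on their own checkout, a sketch
along these lines should do. It is not the script behind the numbers
above. It assumes DIST lines have the distfile name as the second
whitespace-separated field and treats a repeated name as a duplicate
without comparing the checksums.

```python
import os


def dist_duplicate_stats(root):
    """Return (total DIST entries, duplicate entries, bytes in duplicates).

    A DIST entry counts as a duplicate when its filename was already seen
    in an earlier Manifest. Only files named 'Manifest' are read.
    """
    seen = set()
    total = duplicates = duplicate_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        if "Manifest" not in filenames:
            continue
        with open(os.path.join(dirpath, "Manifest"), encoding="utf-8") as f:
            for line in f:
                fields = line.split()
                if len(fields) < 2 or fields[0] != "DIST":
                    continue
                total += 1
                if fields[1] in seen:
                    duplicates += 1
                    duplicate_bytes += len(line.encode("utf-8"))
                else:
                    seen.add(fields[1])
    return total, duplicates, duplicate_bytes
```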

> - mirror admins were interested in the possibility of verifying all
>   the distfiles with a single tool.
> 
> This specification does not provide a clean solution to this problem.
> It technically permits moving ``DIST`` entries to higher-level Manifests
> but the usefulness of such a solution is doubtful.
This solution would require the package manager to consider
higher-level Manifests or all Manifests in the tree when searching for
the DIST entry. The most useful implementation of this would be for the
git->rsync process to move all DIST entries elsewhere (metadata/ maybe).

Either way, this would have many downsides, and make manual work on the
Manifest DIST entries painful.
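
To make the lookup cost concrete, this is a sketch of the "consider
higher-level Manifests" variant: walk from the package directory up to
the repository root and take the first matching DIST line. The function
name and return shape are hypothetical, and it ignores compressed or
otherwise relocated Manifests.

```python
import os


def find_dist_entry(package_dir, repo_root, filename):
    """Search Manifests from package_dir up to repo_root for a DIST entry.

    Returns (manifest_path, line) for the nearest match, or None.
    """
    current = os.path.abspath(package_dir)
    root = os.path.abspath(repo_root)
    while True:
        manifest = os.path.join(current, "Manifest")
        if os.path.isfile(manifest):
            with open(manifest, encoding="utf-8") as f:
                for line in f:
                    fields = line.split()
                    if (len(fields) >= 2 and fields[0] == "DIST"
                            and fields[1] == filename):
                        return manifest, line.rstrip("\n")
        if current == root:
            return None
        parent = os.path.dirname(current)
        if parent == current:
            return None  # package_dir was not below repo_root
        current = parent
```

Every distfile lookup then touches up to three Manifests in a normal
category/package layout, instead of the single one used today.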

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Asst. Treasurer
E-Mail   : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
