Bug#788864: python-debian: License field in files paragraph should be required not optional
Hi Orestis, Thanks for your feedback > > * It introduces a `MachineReadableFormatError` which is used for format > > errors; I think it's worth distinguishing between an error in the format > > and the copyright file not being in the format at all > > (`NotMachineReadableError`). `MachineReadableFormatError` is derived from > > `ValueError` which I think makes sense. > > I think the module should raise errors inherited by Error > https://salsa.debian.org/python-debian-team/python-debian/blob/8ed0ff84e7f41 > d0ba08ac11dcc0d7c66597f37b1/lib/debian/copyright.py#L46 otherwise there is > not much use to it. Hence i suggest that > MachineReadableFormatError inherits from that as well. I think I have now addressed that and we now have: class MachineReadableFormatError(Error, ValueError): so that all errors that the class may raise are derived from copyright.Error as you suggest but backwards compatibility and the semantics of the ValueError are also preserved. Does that look right? https://salsa.debian.org/python-debian-team/python-debian/merge_requests/1 Any other comments? (from you or anyone else?!) cheers Stuart -- Stuart Prescotthttp://www.nanonanonano.net/ stu...@nanonanonano.net Debian Developer http://www.debian.org/ stu...@debian.org GPG fingerprint90E2 D2C1 AD14 6A1B 7EBB 891D BBC1 7EBB 1396 F2F7
Bug#788864: python-debian: License field in files paragraph should be required not optional
Hello, Thanks for taking the time to do this :) and sorry that it took me time to respond. As the user of Copyright module in debsources here are few comments On 02/18/2018 05:59 AM, Stuart Prescott wrote: > > In particular: > > * It introduces a `MachineReadableFormatError` which is used for format > errors; I think it's worth distinguishing between an error in the format and > the copyright file not being in the format at all > (`NotMachineReadableError`). > `MachineReadableFormatError` is derived from `ValueError` which I think makes > sense. > I think the module should raise errors inherited by Error https://salsa.debian.org/python-debian-team/python-debian/blob/8ed0ff84e7f41d0ba08ac11dcc0d7c66597f37b1/lib/debian/copyright.py#L46 otherwise there is not much use to it. Hence i suggest that MachineReadableFormatError inherits from that as well. > > By throwing an exception as soon as an error is found, this becomes a bit of > an all-or-nothing approach. Would a better approach be an incremental > validation where as much as is possible is read with an `valid` attribute per > stanza that propagates to the whole `Copyright` instance? Users would then > check that the file is valid after read in rather than using exception > handling. > I prefer the all-or-nothing approach for the strict mode at least. While the incremental could be interesting for some use cases i feel that for the copyright module in debsources it would create an unnecessary overhead and it would be harder to qualify d/copyright files as machine readable or not. When the copyright hook (parses d/copyright files of packages and then adds in the db the license of each file) is active on debsources (the case these days) we parse the whole file and treat the files mentioned in it, so it will be a bit confusing to have files within a package without their license and other files of the same package with a license. License rendering such as https://sources.debian.org/copyright/license/python-pip/9.0.1-2/ would be more confusing. Or when searching a license of file using its sha256 checksum we would need to maybe handle differently files that their packages have a "partial" machine readable copyright format. (https://sources.debian.org/copyright/sha256/?checksum=d77d235e41d54594865151f4751e835c5a82322b0e87ace266567c3391a4b912) I hope what i am describing is clear here :D Cheers, Orestis
Bug#788864: python-debian: License field in files paragraph should be required not optional
Control: tags -1 + patch Dear Orestis, python-debian mainatiners & debsources folks, > Debian standard [1] suggests that the license field in the files > paragraph is required whereas when you parse it is only optional. > > This is sometimes causing a trouble when using the package since > the user has to verify that the license object in the files > paragraph is not None and thus raising AttributeError when > accessing the synopsis for example. > > I guess the solution must not be to consider the d/copyright file > as non machine readable but you might want to omit that specific > paragraph and log this error. I think this is a good suggestion, and looking at the use of `debian.copyright.Copyright` in debsources, I can see that they have had to do a similar dance to what you describe. https://salsa.debian.org/qa/debsources/blob/master/lib/debsources/ license_helper.py#L105 I've taken a first cut at making the `Copyright` reader more strict with regards to required fields but I would like some feedback from users of the `Copyright` class before merging it. https://salsa.debian.org/python-debian-team/python-debian/merge_requests/1 In particular: * It introduces a `MachineReadableFormatError` which is used for format errors; I think it's worth distinguishing between an error in the format and the copyright file not being in the format at all (`NotMachineReadableError`). `MachineReadableFormatError` is derived from `ValueError` which I think makes sense. * I've changed other uses of `ValueError` within `Copyright` to use `MachineReadableFormatError` to be consistent (but that should also be backwards compatible) * As suggested within comments already in the code, I've allowed a `strict=False` mode which continues to use python warnings rather than raising exceptions. * The comments in the code also talked about treating the http and https versions of the copyright spec as being the same; the spec has since been changed to explicitly say that both are OK but that the https URL is preferred and so the code will silently upgrade from http to https too. By throwing an exception as soon as an error is found, this becomes a bit of an all-or-nothing approach. Would a better approach be an incremental validation where as much as is possible is read with an `valid` attribute per stanza that propagates to the whole `Copyright` instance? Users would then check that the file is valid after read in rather than using exception handling. comments, please! (Either to this bug or to the MR on salsa) thanks Stuart -- Stuart Prescotthttp://www.nanonanonano.net/ stu...@nanonanonano.net Debian Developer http://www.debian.org/ stu...@debian.org GPG fingerprint90E2 D2C1 AD14 6A1B 7EBB 891D BBC1 7EBB 1396 F2F7
Bug#788864: python-debian: License field in files paragraph should be required not optional
Package: python-debian Version: 0.1.27 Severity: normal Dear Maintainer, Debian standard [1] suggests that the license field in the files paragraph is required whereas when you parse it is only optional. This is sometimes causing a trouble when using the package since the user has to verify that the license object in the files paragraph is not None and thus raising AttributeError when accessing the synopsis for example. I guess the solution must not be to consider the d/copyright file as non machine readable but you might want to omit that specific paragraph and log this error. Thanks for maintaining python-debian, Orestis [1] https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/ -- System Information: Debian Release: 8.1 APT prefers stable-updates APT policy: (500, 'stable-updates'), (500, 'stable') Architecture: amd64 (x86_64) Foreign Architectures: i386 Kernel: Linux 3.16.0-4-amd64 (SMP w/4 CPU cores) Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8) Shell: /bin/sh linked to /bin/dash Init: systemd (via /run/systemd/system) Versions of packages python-debian depends on: ii python-chardet 2.3.0-1 ii python-six 1.8.0-1 pn python:any Versions of packages python-debian recommends: ii python-apt 0.9.3.11 Versions of packages python-debian suggests: ii gpgv 1.4.18-7 -- no debconf information -- To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org