Bug#844026: Bug #844026: Please handle malformed changelogs with more specific exceptions

2016-12-26 Thread Stuart Prescott
Dear python-debian maintainers,

The changelog entry in this case has a malformed date line and this is 
correctly found by Changelog:parse_changelog() but the failure is suppressed 
by __init__ even when strict=True is set:

In [1]: import debian.changelog

In [2]: version = 
debian.changelog.Changelog(file=open('changelog').read()).get_version()
---
IndexErrorTraceback (most recent call last)
 in ()
> 1 version = 
debian.changelog.Changelog(file=open('changelog').read()).get_version()

/usr/lib/python3/dist-packages/debian/changelog.py in get_version(self)
463 def get_version(self):
464 """Return a Version object for the last version"""
--> 465 return self._blocks[0].version
466 
467 def set_version(self, version):

IndexError: list index out of range


By way of contrast, doing exactly the same thing in two steps and with the 
same value of strict=True, the ChangelogParseError is correctly raised.


In [3]: ch = debian.changelog.Changelog()

In [4]: ch.parse_changelog(open('changelog').read())
---
ChangelogParseError   Traceback (most recent call last)
 in ()
> 1 ch.parse_changelog(open('changelog').read())

/usr/lib/python3/dist-packages/debian/changelog.py in parse_changelog(self, 
file, max_blocks, allow_empty_author, strict, encoding)
412 if end_match.group(3) != '  ':
413 self._parse_error("Badly formatted trailer "
--> 414 "line: %s" % line, strict)
415 current_block._trailer_separator = 
end_match.group(3)
416 current_block.author = "%s <%s>" \

/usr/lib/python3/dist-packages/debian/changelog.py in _parse_error(self, 
message, strict)
274 def _parse_error(self, message, strict):
275 if strict:
--> 276 raise ChangelogParseError(message)
277 else:
278 warnings.warn(message)

ChangelogParseError: Could not parse changelog: Badly formatted trailer line:  
-- John Smith  Tue, 27 Sep 2016 14:08:04 -0600


The offending piece of code that I don't understand is in Changelog:__init__ 
(changelog.py:266):

if file is not None:
try:
self.parse_changelog(file, max_blocks=max_blocks,
allow_empty_author=allow_empty_author,
strict=strict)
except ChangelogParseError:
pass

ChangelogParseError is only raised by parse_changelog if strict=True is given 
to parse_changelog. Does it make sense to have a setting for strict=True if 
any parse errors that are detected in the strict mode will be silently eaten 
in any case? That sounds like the opposite of strict.

To have the code match the intent (the documentation is a little odd here 
too), removing the try/except block is probably correct and I'm happy to do 
that and adjust the documentation to make this clearer. Since the default mode 
here is (unintentionally) strict=False, that should perhaps become the default 
mode in the code so as to not accidentally break users of this code that will 
suddenly have exceptions thrown at them that they have never seen before.

My feeling is that the try/except is incorrect and should be removed but I'd 
like one of the original authors or someone more familiar with the code to 
comment on this first.

regards
Stuart



-- 
Stuart Prescotthttp://www.nanonanonano.net/   stu...@nanonanonano.net
Debian Developer   http://www.debian.org/ stu...@debian.org
GPG fingerprint90E2 D2C1 AD14 6A1B 7EBB 891D BBC1 7EBB 1396 F2F7



Bug#844026: Please handle malformed changelogs with more specific exceptions

2016-11-11 Thread Joe Bisch
Package: python-debian
Version: 0.1.29

I am getting a generic IndexError when I try to parse the attached
changelog using the attached reproducer script. I would expect a more
specific exception from python-debian. Dpkg-parsechangelog is able to
identify that the trailer line is badly formatted. (The issue is that
there is only a single space between the email address and the
timestamp).

Using the attached changelog with dpkg-parsechangelog, I get the following
results:

$ dpkg-parsechangelog
dpkg-parsechangelog: warning: debian/changelog(l5): badly formatted trailer 
line
LINE:  -- John Smith  Tue, 27 Sep 2016 14:08:04 -0600
Source: package
Version: 1.0-1
Distribution: codename
Urgency: medium
Maintainer: John Smith 
Timestamp: 1475006884
Date: Tue, 27 Sep 2016 14:08:04 -0600
Changes:
 package (1.0-1) codename; urgency=medium
  .
 * minimal example reproducer

whereas the script I have attached gives:

$ python3 reproducer.py
Traceback (most recent call last):
  File "reproducer.py", line 5, in 
version = debian.changelog.Changelog(file=contents).get_version()
  File "/usr/lib/python3.5/site-packages/debian/changelog.py", line 465, in 
get_version
return self._blocks[0].version
IndexError: list index out of range

-- 
Joe Bisch
HPE Linux, Hewlett Packard Enterprise
package (1.0-1) codename; urgency=medium

  * minimal example reproducer

 -- John Smith  Tue, 27 Sep 2016 14:08:04 -0600
import debian.changelog

with open('changelog') as f:
contents = f.read()
version = debian.changelog.Changelog(file=contents).get_version()