Your message dated Tue, 02 Nov 2021 10:03:33 +0000
with message-id <[email protected]>
and subject line Bug#948467: fixed in feedparser 6.0.8-1
has caused the Debian Bug report #948467,
regarding python3-feedparser: Handling of invalid XHTML differs from upstream package
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact [email protected]
immediately.)


-- 
948467: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=948467
Debian Bug Tracking System
Contact [email protected] with problems
--- Begin Message ---
Package: python3-feedparser
Version: 5.2.1-1
Severity: normal


Dear maintainer(s),

The attached script uses feedparser to parse an invalid XHTML document.

If feedparser is installed from PyPI with pip, then the script
exits without error.

If feedparser is installed from the Debian 10 repositories (or from Arch
Linux, I am told), it fails with: "TypeError: startswith first arg must
be bytes or a tuple of bytes, not str" (full traceback attached).

In all cases, feedparser 5.2.1 is used (5.2.1-1 on Debian).


I did not investigate further, but this might be caused by a different
version of sgmllib (bundled in Debian's python3-feedparser package).
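
For reference, the error message itself can be reproduced outside
feedparser. The sketch below is only an illustration, not feedparser's
code: `piece` and `starts_with_close_tag` are hypothetical names, and it
assumes (without having verified it) that a bytes object reached the
startswith('</') check shown at the end of the attached traceback.

```python
# Hypothetical stand-in for a value reaching the startswith('</') check;
# the real value inside feedparser was not inspected.
piece = b"</div>"

# Calling bytes.startswith() with a str prefix raises TypeError on Python 3.
try:
    piece.startswith("</")
    error_message = None
except TypeError as exc:
    error_message = str(exc)


def starts_with_close_tag(value):
    """Return True if value begins with '</', for either str or bytes.

    Illustrative helper only: it picks a prefix of the same type as the
    value so that the comparison never mixes str and bytes.
    """
    prefix = b"</" if isinstance(value, bytes) else "</"
    return value.startswith(prefix)
```

A type-matched prefix, as in the helper, avoids the exception for both
input types; whether that is the appropriate fix for the package is a
separate question.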



-- System Information:
Debian Release: 10.2
  APT prefers oldstable-debug
  APT policy: (500, 'oldstable-debug'), (500, 'stable'), (500,
'oldstable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: armhf

Kernel: Linux 4.19.0-6-amd64 (SMP w/4 CPU cores)
Kernel taint flags: TAINT_DIE, TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
Locale: LANG=fr_FR.UTF-8, LC_CTYPE=fr_FR.UTF-8 (charmap=UTF-8),
LANGUAGE=fr_FR.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages python3-feedparser depends on:
ii  python3  3.7.3-1

python3-feedparser recommends no packages.

python3-feedparser suggests no packages.

-- no debconf information
import feedparser

data = '''<?xml version='1.0' encoding='utf-8'?>
<feed xmlns='http://www.w3.org/2005/Atom'>

<entry> 
    <content type='xhtml'><div xmlns='http://www.w3.org/1999/xhtml'>
<p><i></p>
    </div></content> 
</entry>
<entry> 
    <content type='xhtml'><div xmlns='http://www.w3.org/1999/xhtml'>
<p>&#8482;</p>
    </div></content> 
</entry>
</feed>
'''

feedparser.parse(data)
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/feedparser_debian/sgmllib3.py", line 352, in finish_endtag
    method = getattr(self, 'end_' + tag)
AttributeError: '_LooseFeedParser' object has no attribute 'end_content'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "feedparser_invalid_xhtml.py", line 19, in <module>
    feedparser.parse(data)
  File "/usr/lib/python3/dist-packages/feedparser.py", line 3972, in parse
    feedparser.feed(data.decode('utf-8', 'replace'))
  File "/usr/lib/python3/dist-packages/feedparser.py", line 2131, in feed
    sgmllib.SGMLParser.feed(self, data)
  File "/usr/lib/python3/dist-packages/feedparser_debian/sgmllib3.py", line 98, in feed
    self.goahead(0)
  File "/usr/lib/python3/dist-packages/feedparser_debian/sgmllib3.py", line 137, in goahead
    k = self.parse_endtag(i)
  File "/usr/lib/python3/dist-packages/feedparser_debian/sgmllib3.py", line 314, in parse_endtag
    self.finish_endtag(tag)
  File "/usr/lib/python3/dist-packages/feedparser_debian/sgmllib3.py", line 354, in finish_endtag
    self.unknown_endtag(tag)
  File "/usr/lib/python3/dist-packages/feedparser.py", line 704, in unknown_endtag
    method()
  File "/usr/lib/python3/dist-packages/feedparser.py", line 1840, in _end_content
    value = self.popContent('content')
  File "/usr/lib/python3/dist-packages/feedparser.py", line 1011, in popContent
    value = self.pop(tag)
  File "/usr/lib/python3/dist-packages/feedparser.py", line 863, in pop
    if piece.startswith('</'):
TypeError: startswith first arg must be bytes or a tuple of bytes, not str


--- End Message ---
--- Begin Message ---
Source: feedparser
Source-Version: 6.0.8-1
Done: Jochen Sprickerhof <[email protected]>

We believe that the bug you reported is fixed in the latest version of
feedparser, which is due to be installed in the Debian FTP archive.

A summary of the changes between this version and the previous one is
attached.

Thank you for reporting the bug, which will now be closed.  If you
have further comments please address them to [email protected],
and the maintainer will reopen the bug report if appropriate.

Debian distribution maintenance software
pp.
Jochen Sprickerhof <[email protected]> (supplier of updated feedparser 
package)

(This message was generated automatically at their request; if you
believe that there is a problem with it please contact the archive
administrators by mailing [email protected])


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

Format: 1.8
Date: Tue, 02 Nov 2021 09:56:58 +0100
Source: feedparser
Architecture: source
Version: 6.0.8-1
Distribution: unstable
Urgency: medium
Maintainer: Debian Python Team <[email protected]>
Changed-By: Jochen Sprickerhof <[email protected]>
Closes: 946788 948467 964816 997649
Changes:
 feedparser (6.0.8-1) unstable; urgency=medium
 .
   * Team upload.
 .
   [ Debian Janitor ]
   * Set upstream metadata fields: Bug-Submit.
 .
   [ Jochen Sprickerhof ]
   * New upstream version 6.0.8 (Closes: #946788, #948467, #964816, #997649)
   * Update d/copyright
   * Drop upstream applied patches
   * Bump policy and debhelper versions
   * Update packaging
Checksums-Sha1:
 d4b53db3ec0737eb9e728cbeae6c117668a440c4 2026 feedparser_6.0.8-1.dsc
 bec0a6c2216cd492c1827db43d5e9ea05aacc92b 285827 feedparser_6.0.8.orig.tar.gz
 858eb92879ba8a733f6a41fe447f8582b3cb02a4 6040 feedparser_6.0.8-1.debian.tar.xz
 d433cd59ea3440265d3b375dfd0558712361914b 6841 
feedparser_6.0.8-1_source.buildinfo
Checksums-Sha256:
 3a86b6003248e0c7e186a375d3a92856f02321022a95e7b72d6287fafa02136e 2026 
feedparser_6.0.8-1.dsc
 5ce0410a05ab248c8c7cfca3a0ea2203968ee9ff4486067379af4827a59f9661 285827 
feedparser_6.0.8.orig.tar.gz
 3c6648f33eda746124ce93528b8cab9f191c38138df7c4219e33d252aa6dcb59 6040 
feedparser_6.0.8-1.debian.tar.xz
 b39289b55da2db489af3b5ee6fad63f7b4fafaa639f3b8c4b1957da88ba363eb 6841 
feedparser_6.0.8-1_source.buildinfo
Files:
 3eb63ff151de5ff79df2d507f0c1d08c 2026 python optional feedparser_6.0.8-1.dsc
 8d0ba773e049e8f1edc2541737593a92 285827 python optional 
feedparser_6.0.8.orig.tar.gz
 5b044e36e8dab29d37f0ebfca1c477b2 6040 python optional 
feedparser_6.0.8-1.debian.tar.xz
 386e1dd83c7308b917015d8bdffa4636 6841 python optional 
feedparser_6.0.8-1_source.buildinfo

-----BEGIN PGP SIGNATURE-----

iQIyBAEBCgAdFiEEc7KZy9TurdzAF+h6W//cwljmlDMFAmGBC4kACgkQW//cwljm
lDPP3g/4pBX/XlICfo7JuEOYVmChQFP0hIwdeH9wnKIiGfjVvcFssgcHb1N8NtNy
o3Hzkdo/0edmhhGbvnpERrjWhyyPZgzFQMRuO+U/GJBmKfwSbhUiheCUt+4UP0pS
f9xX1imhtJzsVgUFWsLyTD5iD/Lzi1yrcmO1p5is9E11rZt7SnX/yYfUOj05aFjL
8ts1s4zTVV1Brt0H/mezxY3ktr1HMONPJ0IBp3F5Sfjes6uT3bzg3VYH4yAfSK6r
1x/QKE5QySot8Iv/v3NNfwIkM8pdJ+joISgp8TvF5NoPRqxbfGAKAW3A2I8k4Rri
ZJqpq2ImBHYrfvFSVaA1o2jypMjWSSOcI51Ap+MGi8S8WLFiUIDqXXDuhjdGQmsj
0aE5VMsXyXB1kqxxcK0xULP/hrSH/2VTxn02OlY1nao47CVySAbrGeys563FNUQe
Zp0I6BD6B0uiWxlVfBxH8IUOgAKNyOu31Sew7spiy+fbL2MFK4qzPX4fjigo3/Uu
K+GbLo4vH8uLGmf5vNY4dyR4GFFVXZiMEEPq/PE+522kCFem8w8nR97xEzOO0Yca
VRnusT6bMu2JzsIA2lohyx06MA6h3PaDwFL4z3XIQ5yvG3Om/0L9klsO0wHkj5F/
46tQsVWq4rNP94zHmy+J4ZvxpJaEeyVHS+e9MmLzDzZbQjLJIw==
=kcLo
-----END PGP SIGNATURE-----

--- End Message ---
