Subject: webcheck: crash when checking the debian website
Package: webcheck
Version: 1.10.0
Severity: important

I was running webcheck on the Debian website when it crashed:

 > webcheck:   http://www.nl.debian.org/intl/french/typographie
 > webcheck:   ftp://ftp.icm.edu.pl/pub/Linux/distributions/debian-non-US/
 > Traceback (most recent call last):
 >   File "/usr/bin/webcheck", line 249, in ?
 >     main()
 >   File "/usr/bin/webcheck", line 211, in main
 >     site = serialize.deserialize(fp)
 >   File "/usr/share/webcheck/serialize.py", line 329, in deserialize
 >     _deserialize_link(link, key, value)
 >   File "/usr/share/webcheck/serialize.py", line 284, in _deserialize_link
 >     link.add_linkproblem(_readstring(value, False))
 >   File "/usr/share/webcheck/serialize.py", line 167, in _readstring
 >     return str(_unescape(txt))
 > UnicodeEncodeError: 'ascii' codec can't encode character u'\ufffd' in 
 > position 205: ordinal not in range(128)

The last page that it shows (ftp://ftp.icm.edu.pl/...) doesn't seem to
be the cause of this though, as that one parses fine when I run
webcheck on it directly.

I've put the webcheck.dat file at
http://zoetekouw.net/Zooi/webcheck.dat.bz2 as the BTS won't accept it
as an attachment.  Note that webcheck has to run in continuation mode
with this webcheck.dat for quite a while (more than 30 minutes or so)
before the crash occurs.
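
For reference, the traceback suggests that _readstring() passes a
unicode string containing U+FFFD to str(), which in Python 2 encodes
with the default ASCII codec.  The sketch below is not webcheck code:
the function names are made up for illustration, and it uses an
explicit encode("ascii") call so that it behaves the same on a current
Python.  It only reproduces the same class of error and shows one
possible lenient alternative; whether replacing characters is
acceptable for webcheck.dat is an assumption on my part.

```python
# Illustration only, not taken from webcheck.
# to_byte_string() is a hypothetical stand-in for the str() call in
# _readstring(); Python 2's str() on a unicode object encodes with the
# default codec, which is normally ASCII.

def to_byte_string(text):
    """Encode strictly as ASCII; fails on any non-ASCII character."""
    return text.encode("ascii")


def to_byte_string_lenient(text):
    """Possible workaround: substitute '?' for unencodable characters."""
    return text.encode("ascii", "replace")


# U+FFFD is the replacement character reported in the traceback.
sample = u"bad byte here: \ufffd"

try:
    to_byte_string(sample)
    failed = False
    message = ""
except UnicodeEncodeError as exc:
    failed = True
    message = str(exc)

lenient = to_byte_string_lenient(sample)
```

With the strict variant the encode raises UnicodeEncodeError, as in the
traceback above; the lenient variant returns a byte string in which the
offending character has been replaced.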


-- System Information:
Debian Release: lenny/sid
  APT prefers unstable
  APT policy: (500, 'unstable'), (500, 'stable'), (1, 'experimental')
Architecture: i386 (x86_64)

Kernel: Linux 2.6.18.8 (SMP w/1 CPU core; PREEMPT)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages webcheck depends on:
ii  python                        2.4.4-6    An interactive high-level object-o
ii  python-support                0.6.4      automated rebuilding support for p

Versions of packages webcheck recommends:
ii  python-beautifulsoup          3.0.4-1    error-tolerant HTML parser for Pyt

-- no debconf information

