Rob Tanner wrote:
>
>I am trying to rebuild archives -- actually porting archives over from
>another machine and then doing a rebuild, but the problem below shows up
>in all the archives.
>
>I run the command "bin/arch small_centers" and get the following error:
>
>#00000 <[EMAIL PROTECTED]>
>figuring article archives
>2008-February
>Pickling archive state into
>/var/lib/mailman/archives/private/small_centers/pipermail.pck
>Traceback (most recent call last):
> File "bin/arch", line 200, in <module>
> main()
> File "bin/arch", line 188, in main
> archiver.processUnixMailbox(fp, start, end)
> File "/usr/lib/mailman/Mailman/Archiver/pipermail.py", line 580, in
>processUnixMailbox
> self.add_article(a)
> File "/usr/lib/mailman/Mailman/Archiver/pipermail.py", line 624, in
>add_article
> author = fixAuthor(article.decoded['author'])
> File "/usr/lib/mailman/Mailman/Archiver/pipermail.py", line 62, in
>fixAuthor
> while i>0 and (L[i-1][0] in lowercase or
>UnicodeDecodeError: 'ascii' codec can't decode byte 0xb5 in position 26:
>ordinal not in range(128)
>
>This actually looks like a problem in a specific email message in the
>archive. How do I identify the message, and how do I fix it?
If you're doing a 'rebuild', you probably want the --wipe option with
bin/arch, but that isn't the problem.
The message has a hex b5 byte (the micro sign, Greek mu, in ISO-8859-1),
I think in the From: header, since the traceback fails in fixAuthor.
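
To illustrate why that byte trips the archiver, here is a minimal sketch in Python 3 (the traceback above is from Python 2, but the codec behaves the same way). It assumes the message was written in ISO-8859-1; in another 8-bit charset 0xb5 would be a different character. The helper name try_decode is hypothetical.

```python
def try_decode(raw, encoding):
    """Return the decoded text, or None if `raw` is not valid in `encoding`."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None

raw = b'\xb5'

# ASCII only covers bytes 0x00-0x7f, so 0xb5 cannot be decoded.
print(try_decode(raw, 'ascii'))

# Assumption: the sender used ISO-8859-1, where 0xb5 is the micro sign.
print(try_decode(raw, 'latin-1'))
```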
You can try the attached patch to cleanarch which should enable
cleanarch to find the problems. It won't "fix" them, but it will tell
you where they are.
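
If patching cleanarch is not convenient, the same check can be sketched as a standalone Python 3 script. This is only an approximation of what the patch does, not part of Mailman: the function names are hypothetical, and it treats every line beginning with "From " as a message separator, which is cruder than cleanarch's own Unix-From test.

```python
import re

# Same byte range as the patch's r'[\177-\377]': DEL and everything above it.
BAD_BYTES = re.compile(rb'[\x7f-\xff]')

def find_non_ascii_headers(lines):
    """Yield (lineno, line) for each header line containing non-ASCII bytes.

    `lines` is any iterable of bytes lines, e.g. an mbox file opened in
    binary mode.
    """
    in_headers = False
    for lineno, line in enumerate(lines, start=1):
        if line.startswith(b'From '):
            # Simplification: assume this is a real Unix-From separator,
            # so the lines that follow are message headers.
            in_headers = True
        elif not line.strip(b'\r\n'):
            # A blank line ends the header block.
            in_headers = False
        elif in_headers and BAD_BYTES.search(line):
            yield lineno, line

def scan_mbox(path):
    """Print every offending header line found in the mbox file at `path`."""
    with open(path, 'rb') as fp:
        for lineno, line in find_non_ascii_headers(fp):
            print('Non-ascii in %r\nat line number %d' % (line, lineno))
```

Calling scan_mbox() with the path to the list's mbox file prints each header line that contains a suspect byte, together with its line number in the file. Like the patch, it only reports the lines; it does not repair them.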
--
Mark Sapiro <[EMAIL PROTECTED]> The highway is for gamblers,
San Francisco Bay Area, California better use your sense - B. Dylan
--- test-mailman-2.1/bin/cleanarch 2007-06-18 08:35:57.000000000 -0700
+++ bin/cleanarch 2008-02-06 14:39:35.859375000 -0800
@@ -117,6 +117,8 @@
statuscnt = 0
messages = 0
prevline = None
+ inheaders = False
+ badre = re.compile(r'[\177-\377]')
while True:
lineno += 1
line = sys.stdin.readline()
@@ -144,6 +146,7 @@
else:
# It's a valid Unix-From line
messages += 1
+ inheaders = True
if output:
# Before we spit out the From_ line, make sure the
# previous line was blank.
@@ -154,9 +157,15 @@
else:
# This is a bogus Unix-From line
escape_line(line, lineno, quiet, output)
- elif output:
+ else:
# Any old line
- sys.stdout.write(line)
+ if len(line.strip('\r\n')) == 0:
+ inheaders = False
+ if inheaders:
+ if badre.search(line):
+ print >> sys.stderr, 'Non-ascii in %s\nat line number %d' % (line, lineno)
+ if output:
+ sys.stdout.write(line)
if status > 0 and (lineno % status) == 0:
sys.stderr.write('#')
statuscnt += 1
------------------------------------------------------
Mailman-Users mailing list