[Mailman-Users] Getting python errors with bin/arch

2008-02-06 Thread Rob Tanner
Hi,

I am trying to rebuild archives -- actually porting archives over from 
another machine and then doing a rebuild, but the problem below shows up 
in all the archives.

I run the command "bin/arch small_centers" and get the following error:

#0 <[EMAIL PROTECTED]>
figuring article archives
2008-February
Pickling archive state into /var/lib/mailman/archives/private/small_centers/pipermail.pck
Traceback (most recent call last):
  File "bin/arch", line 200, in <module>
    main()
  File "bin/arch", line 188, in main
    archiver.processUnixMailbox(fp, start, end)
  File "/usr/lib/mailman/Mailman/Archiver/pipermail.py", line 580, in processUnixMailbox
    self.add_article(a)
  File "/usr/lib/mailman/Mailman/Archiver/pipermail.py", line 624, in add_article
    author = fixAuthor(article.decoded['author'])
  File "/usr/lib/mailman/Mailman/Archiver/pipermail.py", line 62, in fixAuthor
    while i>0 and (L[i-1][0] in lowercase or
UnicodeDecodeError: 'ascii' codec can't decode byte 0xb5 in position 26: ordinal not in range(128)

This actually looks like a problem in a specific email message in the 
archive.  How do I identify the message, and how do I fix it?

Thanks,
Rob




-- 
Rob Tanner
UNIX Services Manager
Linfield College, McMinnville OR
--
Mailman-Users mailing list
Mailman-Users@python.org
http://mail.python.org/mailman/listinfo/mailman-users
Mailman FAQ: http://www.python.org/cgi-bin/faqw-mm.py
Searchable Archives: http://www.mail-archive.com/mailman-users%40python.org/
Unsubscribe: 
http://mail.python.org/mailman/options/mailman-users/archive%40jab.org

Security Policy: 
http://www.python.org/cgi-bin/faqw-mm.py?req=show&file=faq01.027.htp


Re: [Mailman-Users] Getting python errors with bin/arch

2008-02-06 Thread Mark Sapiro
Rob Tanner wrote:
>
>I am trying to rebuild archives -- actually porting archives over from 
>another machine and then doing a rebuild, but the problem below shows up 
>in all the archives.
>
>I run the command "bin/arch small_centers" and get the following error:
>
>#0 <[EMAIL PROTECTED]>
>figuring article archives
>2008-February
>Pickling archive state into /var/lib/mailman/archives/private/small_centers/pipermail.pck
>Traceback (most recent call last):
>  File "bin/arch", line 200, in <module>
>    main()
>  File "bin/arch", line 188, in main
>    archiver.processUnixMailbox(fp, start, end)
>  File "/usr/lib/mailman/Mailman/Archiver/pipermail.py", line 580, in processUnixMailbox
>    self.add_article(a)
>  File "/usr/lib/mailman/Mailman/Archiver/pipermail.py", line 624, in add_article
>    author = fixAuthor(article.decoded['author'])
>  File "/usr/lib/mailman/Mailman/Archiver/pipermail.py", line 62, in fixAuthor
>    while i>0 and (L[i-1][0] in lowercase or
>UnicodeDecodeError: 'ascii' codec can't decode byte 0xb5 in position 26: ordinal not in range(128)
>
>This actually looks like a problem in a specific email message in the 
>archive.  How do I identify the message, and how do I fix it?


If you're doing a 'rebuild', you probably want the --wipe option with
bin/arch, but that isn't the problem.

The message has a hex b5 (Greek mu, micro sign), I think in the From:
header.
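
To illustrate why that byte trips the 'ascii' codec: 0xb5 is outside the
7-bit ASCII range, so an ASCII decode rejects it, while ISO-8859-1 maps it
to the micro sign. The following is a minimal sketch in Python 3 syntax,
not the Mailman code path itself (Mailman 2.1 runs on Python 2, where the
decode happens implicitly when a byte string is mixed with a unicode
string). The sample header value is made up for the example.

```python
# Hypothetical From: header value containing the raw byte 0xb5.
raw = b"Some \xb5 Sender"

try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError as exc:
    ascii_ok = False
    bad_offset = exc.start  # index of the first byte ASCII cannot decode

# ISO-8859-1 assigns a character to every byte value, so this succeeds.
latin1_text = raw.decode("iso-8859-1")
```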

You can try the attached patch to cleanarch which should enable
cleanarch to find the problems. It won't "fix" them, but it will tell
you where they are.
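
The same kind of check can also be sketched as a standalone script, for
anyone who would rather not patch cleanarch. This is a rough Python 3
sketch under stated assumptions, not part of Mailman: the function name
find_non_ascii_headers and the script file name are hypothetical, and
unlike cleanarch it treats every line starting with "From " as the start
of a message without validating it, so an unescaped "From " line in a
message body can produce false reports.

```python
import re
import sys

# Bytes 0x7f-0xff, the same range as the patch's r'[\177-\377]' pattern.
BAD_BYTE = re.compile(rb"[\x7f-\xff]")


def find_non_ascii_headers(lines):
    """Yield (lineno, line) for header lines containing bytes >= 0x7f.

    `lines` is any iterable of bytes lines from an mbox file.  A line
    starting with b"From " is assumed to begin a message, and the first
    blank line after it ends that message's headers.
    """
    in_headers = False
    for lineno, line in enumerate(lines, start=1):
        if line.startswith(b"From "):
            in_headers = True
            continue
        if not line.strip(b"\r\n"):
            # Blank line: headers are over until the next "From " line.
            in_headers = False
        elif in_headers and BAD_BYTE.search(line):
            yield lineno, line


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical file name): python find_bad_headers.py LIST.mbox
    with open(sys.argv[1], "rb") as mbox:
        for lineno, line in find_non_ascii_headers(mbox):
            print("Non-ascii at line %d: %r" % (lineno, line.rstrip(b"\r\n")))
```

Like the patch, this only reports line numbers; the offending headers
still have to be corrected by hand in the mbox file.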

-- 
Mark Sapiro <[EMAIL PROTECTED]>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan

--- test-mailman-2.1/bin/cleanarch  2007-06-18 08:35:57.0 -0700
+++ bin/cleanarch   2008-02-06 14:39:35.859375000 -0800
@@ -117,6 +117,8 @@
     statuscnt = 0
     messages = 0
     prevline = None
+    inheaders = False
+    badre = re.compile(r'[\177-\377]')
     while True:
         lineno += 1
         line = sys.stdin.readline()
@@ -144,6 +146,7 @@
                 else:
                     # It's a valid Unix-From line
                     messages += 1
+                    inheaders = True
                     if output:
                         # Before we spit out the From_ line, make sure the
                         # previous line was blank.
@@ -154,9 +157,15 @@
             else:
                 # This is a bogus Unix-From line
                 escape_line(line, lineno, quiet, output)
-        elif output:
+        else:
             # Any old line
-            sys.stdout.write(line)
+            if len(line.strip('\r\n')) == 0:
+                inheaders = False
+            if inheaders:
+                if badre.search(line):
+                    print >> sys.stderr, 'Non-ascii in %s\nat line number %d' % (line, lineno)
+            if output:
+                sys.stdout.write(line)
         if status > 0 and (lineno % status) == 0:
             sys.stderr.write('#')
             statuscnt += 1