Re: [Mailman-Users] Mailman and UTF8

2006-11-23 Thread Mark Sapiro
Jan Kohnert wrote:
>
>Mark Sapiro schrieb:
>>
>> Thus it appears that there may be a scrubber issue when the character
>> set of the incoming message is iso-8859-1 but the i18n translated
>> canned messages are utf-8.
>>
>> What mailman version is this?
>
>Leaving your comment completely in here for reference; as said above,
>this is v2.1.9.

Sorry, somehow I overlooked your mention of the version in the OP.

Anyway, this is definitely a scrubber issue. I see why it occurs, but
I'm not yet sure how to fix it. The problem is that when the character
set of the translation returned by

  _('-- next part --\n')

is not compatible with the character set of the message being scrubbed,
the translation can be garbled.

I think we should be using the character set of the list's preferred
language rather than the character set of the message in this case,
but the process is complicated and I'm not sure how to do it.

If you want, you can try the attached scrubber.patch.txt (apply to
Mailman/Handlers/Scrubber.py).
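
For readers on a current interpreter, here is a minimal Python 3 sketch of
the coercion the patch attempts. It is an illustration only, not Mailman
code: the function name coerce_separator and the sample strings are made up
for this example, and Mailman 2.1 itself runs on Python 2, where the patch
uses unicode() instead of bytes.decode().

```python
def coerce_separator(sep, lcset, charset):
    """Re-encode the byte string `sep` from `lcset` into `charset`.

    Like the patch, fall back to the unchanged separator if either
    codec is unknown or the conversion fails.
    """
    try:
        text = sep.decode(lcset, 'replace')
        return text.encode(charset, 'replace')
    except (UnicodeError, LookupError, ValueError):
        return sep


# German separator text as UTF-8 bytes, as a UTF-8 catalog would supply it.
sep_utf8 = 'n\u00e4chster Teil\n'.encode('utf-8')

# iso-8859-1 message: the umlaut becomes the single byte 0xE4.
latin1 = coerce_separator(sep_utf8, 'utf-8', 'iso-8859-1')

# us-ascii message: the umlaut cannot be represented and becomes '?'.
ascii_sep = coerce_separator(sep_utf8, 'utf-8', 'us-ascii')

# Unknown codec name: LookupError is caught, input returned unchanged.
unchanged = coerce_separator(sep_utf8, 'utf-8', 'no-such-charset')
```

The 'replace' error handler trades accuracy for robustness: a separator
with a '?' in it is still readable, whereas an unhandled exception would
stop the message from being scrubbed at all.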

-- 
Mark Sapiro <[EMAIL PROTECTED]>   The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan

--- Scrubber.py 2006-02-19 13:03:38.0 -0800
+++ Scrubber.new.py 2006-11-23 10:34:41.328565700 -0800
@@ -380,6 +380,11 @@
             text.append(t)
         # Now join the text and set the payload
         sep = _('-- next part --\n')
+        try:
+            s = unicode(sep, lcset, 'replace')
+            sep = s.encode(charset, 'replace')
+        except (UnicodeError, LookupError, ValueError):
+            pass
         replace_payload_by_text(msg, sep.join(text), charset)
     return msg
 
--
Mailman-Users mailing list
Mailman-Users@python.org
http://mail.python.org/mailman/listinfo/mailman-users
Mailman FAQ: http://www.python.org/cgi-bin/faqw-mm.py
Searchable Archives: http://www.mail-archive.com/mailman-users%40python.org/
Unsubscribe: 
http://mail.python.org/mailman/options/mailman-users/archive%40jab.org

Security Policy: 
http://www.python.org/cgi-bin/faqw-mm.py?req=show&file=faq01.027.htp

Re: [Mailman-Users] Mailman and UTF8

2006-11-23 Thread Jan Kohnert

Mark Sapiro schrieb:
> Jan Kohnert wrote:
>>
>>So I found out, I have to encode the german mailman.po file in UTF-8 and
>>then rebuild the *.mo out of it. Now it works, so I can provide this
>>version (too large to attach to this list). (Mailman 2.1.9_rc1).
>
>
> I18n issues like the above are better discussed on the mailman-i18n
> list.

Agreed, so I'm cross-posting this one to the I18N list for reference.
Follow-ups to that list, please.

>>But there is one (small) thing left:
>>If you look in [1] you will notice one incorrectly displayed character
>>(the ---next part---, in German ---nächster Teil---, does not work in
>>all cases; [1] does not work, [2] does), although all my editors say the
>>umlaut is correctly declared...
>
>
> It looks like in [1] somehow the utf-8 encoded message got interpreted
> as some other character set (maybe iso-8859-1) and then got encoded
> again as utf-8 so that instead of the a with umlaut, you see the bytes
> of the utf-8 encoding of a with umlaut displayed as characters.
>
> This may be a scrubber issue of some kind, but I am not sure why it
> would occur with only one of two apparently structurally identical
> messages from the same poster, but here is a clue.
>
> I looked at the text file
> .
> While there are no Content-Type: headers in that file, I can see the
> encoding of the Subject: header. It appears that the 'bad' posts are
> 'original' posts and are iso-8859-1 encoded by the poster's (you) MUA,
> and the 'good' posts are 'replies' and are utf-8 encoded by the MUA.
>
> Thus it appears that there may be a scrubber issue when the character
> set of the incoming message is iso-8859-1 but the i18n translated
> canned messages are utf-8.
>
> What mailman version is this?

Leaving your comment completely in here for reference; as said above,
this is v2.1.9.

Regards Jan





Re: [Mailman-Users] Mailman and UTF8

2006-11-22 Thread Mark Sapiro
Jan Kohnert wrote:
>
>So I found out, I have to encode the german mailman.po file in UTF-8 and
>then rebuild the *.mo out of it. Now it works, so I can provide this
>version (too large to attach to this list). (Mailman 2.1.9_rc1).


I18n issues like the above are better discussed on the mailman-i18n
list.
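
As an illustration of the re-encoding step Jan describes, here is a minimal
Python 3 sketch. It assumes the catalog was originally iso-8859-1; the
function name reencode_po and the sample catalog are invented for this
example and are not part of Mailman. The .mo file would still have to be
rebuilt afterwards, for instance with GNU gettext's msgfmt.

```python
import re


def reencode_po(data, src='iso-8859-1', dst='utf-8'):
    """Return catalog bytes `data` re-encoded from `src` to `dst`.

    The charset named in the catalog's Content-Type header entry is
    rewritten as well, so it matches the new encoding.
    """
    text = data.decode(src)
    text = re.sub(r'(Content-Type:[^\n]*charset=)[-\w]+',
                  r'\g<1>' + dst, text, count=1)
    return text.encode(dst)


# A tiny made-up catalog, stored as iso-8859-1 bytes.
sample = (
    'msgid ""\n'
    'msgstr ""\n'
    '"Content-Type: text/plain; charset=ISO-8859-1\\n"\n'
    '\n'
    'msgid "next part"\n'
    'msgstr "n\u00e4chster Teil"\n'
).encode('iso-8859-1')

converted = reencode_po(sample)
```

Changing the bytes without changing the declared charset (or the other way
round) would leave the catalog inconsistent, which is why the sketch does
both in one step.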


>But there is one (small) thing left:
>If you look in [1] you will notice one incorrectly displayed character
>(the ---next part---, in German ---nächster Teil---, does not work in
>all cases; [1] does not work, [2] does), although all my editors say the
>umlaut is correctly declared...


It looks like in [1] somehow the utf-8 encoded message got interpreted
as some other character set (maybe iso-8859-1) and then got encoded
again as utf-8 so that instead of the a with umlaut, you see the bytes
of the utf-8 encoding of a with umlaut displayed as characters.
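
The double encoding described above can be reproduced in a few lines of
Python 3. This is only a demonstration of the effect; the sample string is
chosen for illustration, and nothing here is taken from Mailman's code.

```python
original = 'n\u00e4chster Teil'            # a with umlaut is U+00E4

# Correct utf-8 encoding: the umlaut becomes the two bytes C3 A4.
utf8_bytes = original.encode('utf-8')

# Misreading those bytes as iso-8859-1 turns each byte into one character,
# so the umlaut shows up as the two characters U+00C3 and U+00A4.
misread = utf8_bytes.decode('iso-8859-1')

# Encoding the misread text as utf-8 again gives four bytes for one umlaut.
double_encoded = misread.encode('utf-8')

# As long as no bytes were lost, the damage can be undone by reversing
# the two steps.
repaired = double_encoded.decode('utf-8').encode('iso-8859-1').decode('utf-8')
```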

This may be a scrubber issue of some kind, but I am not sure why it
would occur with only one of two apparently structurally identical
messages from the same poster, but here is a clue.

I looked at the text file
.
While there are no Content-Type: headers in that file, I can see the
encoding of the Subject: header. It appears that the 'bad' posts are
'original' posts and are iso-8859-1 encoded by the poster's (you) MUA,
and the 'good' posts are 'replies' and are utf-8 encoded by the MUA.
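
The charset of an encoded Subject: header can also be read programmatically
with the standard library's email.header.decode_header. The sketch below is
written for Python 3, and the two sample subjects are invented for
illustration; they are not the headers from the archive in question.

```python
from email.header import decode_header


def subject_charsets(subject):
    """Return the charset of each part of a raw Subject: header value.

    Parts that are not RFC 2047 encoded words are reported as None.
    """
    return [charset for _text, charset in decode_header(subject)]


# Hypothetical raw header values as they might appear in an mbox file.
original_post = '=?iso-8859-1?q?Gr=FC=DFe?='
reply = '=?utf-8?q?Re=3A_Gr=C3=BC=C3=9Fe?='

original_charsets = subject_charsets(original_post)
reply_charsets = subject_charsets(reply)
```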

Thus it appears that there may be a scrubber issue when the character
set of the incoming message is iso-8859-1 but the i18n translated
canned messages are utf-8.

What mailman version is this?

-- 
Mark Sapiro <[EMAIL PROTECTED]>   The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan



Re: [Mailman-Users] Mailman and utf8

2005-09-08 Thread Jan Kohnert
Dan Phillips schrieb:
> On Sep 8, 2005, at 8:29 AM, Jan Kohnert wrote:
> > In the html-pages of the archive there is still the html-meta tag:
> > 
>
> Did you regenerate the archives? They are static html created at the
> time a message is added, so for a change in character set to appear,
> you must run bin/arch --wipe listname

Many thanks, now it is working.

> Dan

Best regards Jan

-- 
OpenPGP Public-Key Fingerprint:
0E9B 4052 C661 5018 93C3 4E46 651A 7A28 4028 FF7A



Re: [Mailman-Users] Mailman and utf8

2005-09-08 Thread Jan Kohnert
Mark Sapiro schrieb:
> Jan Kohnert wrote:
> >Is there a parameter to tell pipermail to use utf8 encoding for the web
> >archive? Or at least the encoding of the mail itself (although this would
> >obviously be more difficult to do...)?
>
> There is an LC_DESCRIPTIONS dictionary that has an entry for each
> Mailman supported language giving the language name and character set.
> Look at the end of Defaults.py.
>
> You can put something like
>
>
> def _(s):
>     return s
> add_language('en', _('English (USA)'), 'utf-8')
> del _

So I added the lines:
def _(s):
    return s
add_language('de',_('German'),   'iso-8859-15')
                                 ^^^
                                 if I set utf-8 here, I get
                                 horrible font problems, must
                                 figure out why...
del _

and also:
DEFAULT_CHARSET = 'utf-8'

to mm_cfg.py

but that didn't do the trick.

In the html-pages of the archive there is still the html-meta tag:


and so the content is not displayed correctly...

Best regards Jan

-- 
OpenPGP Public-Key Fingerprint:
0E9B 4052 C661 5018 93C3 4E46 651A 7A28 4028 FF7A



Re: [Mailman-Users] Mailman and utf8

2005-09-07 Thread Mark Sapiro
Jan Kohnert wrote:
>
>Is there a parameter to tell pipermail to use utf8 encoding for the web
>archive? Or at least the encoding of the mail itself (although this would
>obviously be more difficult to do...)?

There is an LC_DESCRIPTIONS dictionary that has an entry for each
Mailman supported language giving the language name and character set.
Look at the end of Defaults.py.

You can put something like


def _(s):
    return s
add_language('en', _('English (USA)'), 'utf-8')
del _

in mm_cfg.py. The above would change the character set for US English
from the default us-ascii to utf-8. Note that since this example gives
a new value to LC_DESCRIPTIONS['en'], it overrides the entry in
Defaults.py.
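
To see why the later call wins, here is a small self-contained Python
sketch. The LC_DESCRIPTIONS dictionary and add_language function below are
simplified stand-ins written for this example, assumed to behave like a
plain dictionary assignment; they are not copied from Defaults.py.

```python
# Stand-in for the table that Defaults.py builds (assumption: a dict
# keyed by language code).
LC_DESCRIPTIONS = {}


def add_language(code, description, charset):
    """Hypothetical simplification: record description and charset."""
    LC_DESCRIPTIONS[code] = (description, charset)


def _(s):
    # Marks the string for translation without translating it here.
    return s


# What the defaults would register first.
add_language('en', _('English (USA)'), 'us-ascii')

# What the mm_cfg.py snippet adds later: same key, so the entry is replaced.
add_language('en', _('English (USA)'), 'utf-8')
del _

description, charset = LC_DESCRIPTIONS['en']
```

Because the key 'en' is reused, no second entry is created; the table still
holds one record for English, now with utf-8 as its character set.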

--
Mark Sapiro <[EMAIL PROTECTED]>   The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan



[Mailman-Users] Mailman and utf8

2005-09-07 Thread Jan Kohnert
Hello guys,

sorry if I missed something in the docs...

I recently changed my system to use utf8 encoding and now get strange
characters in the mailing list archives for non-US-ASCII letters like
ä, ö, ü, ß and so on.

Is there a parameter to tell pipermail to use utf8 encoding for the web 
archive? Or at least the encoding of the mail itself (although this would 
obviously be more difficult to do...)?

TIA,
Best regards Jan

-- 
OpenPGP Public-Key Fingerprint:
0E9B 4052 C661 5018 93C3 4E46 651A 7A28 4028 FF7A

