Package: readpst
Version: 0.5.2-1
Severity: normal

Hi,
Thanks for an interesting tool. It is helping me move my wife's old Windows mail to a different platform.

I realized that old Windows versions (95, 98, Me) used non-UTF-8 encodings that differ by country. So when one extracts the content of a pst file, the filenames are in the traditional non-UTF-8 encoding (names like Inbox are translated into each language in the pst). readpst just writes its content to the filename as is. In the case of Japan, the encoding was Shift-JIS, but my sid/etch box is UTF-8. Filenames need to be converted to UTF-8 on a modern Linux system and on all other new platforms (Mac, NT, ...).

So it would be good to have the capability to pipe the filenames created by readpst through the iconv command when options such as these are given:

  -f shift_jis -t utf-8

(Just use iconv or equivalent library calls, please.)

Also, the file contents seem to be in the traditional non-UTF-8 encoding, i.e., Shift-JIS for Japan. This issue needs a bit more investigation. A simple run of iconv errors out. The -c option helps in processing them but leaves some mails with "mojibake", i.e., decoding errors. The funny thing is that even if the original mail was a 7-bit encoded ISO-2022-JP file, the stored content seems to be in Shift-JIS, and thus, once converted, readable on a UTF-8 console running vim. (A Unix mbox usually keeps the original 7-bit encoding when storing mail. MS may be taking some shortcut here since pst is their proprietary format.)

So far, just running iconv to convert the entire generated file does a decent job. For multipart text/plain contents, I needed to change the starting part:

  ----boundary-LibPST-iamunique-1804289383_-_-
  Content-type: text/plain

to be

  ----boundary-LibPST-iamunique-1804289383_-_-
  Content-type: text/plain; charset=utf-8

for mutt to read them OK. (Not an extensive test. vim was always able to read the file since it does not care about these mailbox-specific encoding directives.) For mutt on Linux, I may have other ways to read it, but this was a problem when moving the file to a Mac. This kind of rewrite is best done in readpst.
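The two conversions described above can be sketched outside readpst as follows. This is only a rough Python illustration, not readpst code: the function names are hypothetical, Shift-JIS is assumed as the source encoding, and undecodable bytes are replaced with U+FFFD (whereas `iconv -c` drops them). Windows-produced text is often in code page 932, Microsoft's variant of Shift-JIS, so passing `cp932` may be worth trying where `shift_jis` fails; that is a guess and was not checked against a real pst export.

```python
def legacy_name_to_utf8(raw_name: bytes, source_encoding: str = "shift_jis") -> bytes:
    """Re-encode a legacy-encoded file name as UTF-8 bytes.

    Roughly what `iconv -f shift_jis -t utf-8` does to the name.
    Raises UnicodeDecodeError if the name is not valid in source_encoding.
    """
    return raw_name.decode(source_encoding).encode("utf-8")


def add_charset(message: str, charset: str = "utf-8") -> str:
    """Append a charset parameter to bare 'Content-type: text/plain' lines.

    Lines that already carry parameters are left untouched, and each
    line keeps its original line ending.
    """
    out = []
    for line in message.splitlines(keepends=True):
        body = line.rstrip("\r\n")
        eol = line[len(body):]
        if body.rstrip().lower() == "content-type: text/plain":
            body = body.rstrip() + f"; charset={charset}"
        out.append(body + eol)
    return "".join(out)


def convert_message(raw: bytes, source_encoding: str = "shift_jis") -> bytes:
    """Convert extracted mail text to UTF-8 and label its text/plain parts.

    Assumption: the whole file is in one legacy encoding. Bytes that do
    not decode are replaced with U+FFFD instead of aborting.
    """
    text = raw.decode(source_encoding, errors="replace")
    return add_charset(text).encode("utf-8")
```

A converted file could then be written back with something like `open(path, "wb").write(convert_message(open(path, "rb").read()))`, and an extracted folder renamed with `os.rename(raw_name, legacy_name_to_utf8(raw_name))`, using bytes paths in both cases.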
Once you address the first issue, please downgrade this to wishlist: "Lacks capability to address encoding issue for the mail content". I will post a more detailed report soon.

I think addressing these issues properly will help everyone who uses non-ASCII text (Eastern Europe, Asia, ... even ISO-8859-1).

-- System Information:
Debian Release: 4.0
  APT prefers unstable
  APT policy: (500, 'unstable'), (500, 'testing')
Architecture: amd64 (x86_64)
Shell: /bin/sh linked to /bin/bash
Kernel: Linux 2.6.18-mactel64
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)

Versions of packages readpst depends on:
ii  libc6    2.3.6.ds1-13    GNU C Library: Shared libraries

readpst recommends no packages.

-- debconf-show failed