Re: [Nmh-workers] General question - unsupported charset conversion
Ken Hornstein writes:
> >Unfortunately, I have a lot of experience and troubles with character
> >set conversion.
>
> Well, if you just bit the bullet and switched to UTF-8, you wouldn't have
> all of these problems! :-)

It is not that simple. UTF-8 solves a couple of problems but creates some new ones =:-) The advantages and disadvantages of UTF-8 are a very wide topic.

> >In practice it means spam in an exotic language, and at this point I know
> >that I do not want to read such a message.
>
> I can see that, but I'm not sure that's an appropriate choice for all
> cases (like, for instance, MIME parameters).

That is right. On the other hand, you can never prevent malformed MIME parameters.

> >This is very frequent and causes a lot of trouble. An entire message is in
> >English, with one foreign family name in the original. The message is sent
> >in UTF-8 but (suppose) my terminal supports only ASCII. The conversion
> >would fail.
>
> Errr ... really? In the case I'm thinking, the one foreign family
> name would have the offending character output as a '?' (or whatever).
> The conversion would go through fine.

Well, it depends on the meaning of the word "fail". Formally, it is not possible to convert every UTF-8 character into the 256 characters of an 8-bit iso/cp/... set, so the conversion would fail. Ignoring the absent symbols, or substituting something else for them, lets the conversion go through fine. But ignoring symbols or substituting '?' for them makes the conversion non-reversible, and the result may be difficult to read.

This is not a problem when one or two symbols are missing or substituted in a long text. We can guess what is the me?ning of the word. With many non-convertible symbols, reading such a text is more like solving a crossword puzzle. What could '??o??w??d' be?

> >In my personal opinion, a very good choice is conversion into
> >HTML entities, like ą or ł. It remains quite readable and
> >is still unique enough to convert back in case of need.
>
> Um, ouch. Unless there's a common library that already implements
> that behavior, that's not on the table at all.

This is a serious argument. However, the Recode library mentioned earlier has something like that: ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MISC/SGML.TXT

I do not know whether it is useful or not.

max

___
Nmh-workers mailing list
Nmh-workers@nongnu.org
https://lists.nongnu.org/mailman/listinfo/nmh-workers
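A minimal sketch of the difference discussed in the message above, written in Python rather than in anything nmh itself uses. It contrasts '?' substitution with numeric character references. The sample name is made up for illustration, and the round trip assumes the original text contains no entity-like sequences of its own.

```python
import html

text = "Regards, Mr. Łukasiewicz"  # hypothetical sample text

# Substitution: every unconvertible character becomes '?', irreversibly.
lossy = text.encode("ascii", errors="replace").decode("ascii")

# Numeric character references: still pure ASCII, but unique enough to undo.
entities = text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")

# Converting back. Note that this would also expand any literal
# "&amp;"-style sequences that were already present in the original text.
restored = html.unescape(entities)

print(lossy)             # Regards, Mr. ?ukasiewicz
print(entities)          # Regards, Mr. &#321;ukasiewicz
print(restored == text)  # True
```

The entity form is longer, but a reader can still recognise the name, and a program can recover the exact original.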
Re: [Nmh-workers] General question - unsupported charset conversion
Ken Hornstein writes:
> I've been grappling with to do when we have issues with character set
> conversion.

Unfortunately, I have a lot of experience and troubles with character set conversion.

> Specifically, I have two issues:
>
> - What to do if the character set is unsupported.
>   Should we return the original bytes?

That is not the best idea. Some byte sequences are control sequences for the terminal, and they sometimes leave the terminal in an unusable state.

>   An error? [..] Some string which says, "We cannot convert
>   klingon-8842 to us-ascii" or the equivalent?

In practice it means spam in an exotic language, and at this point I know that I do not want to read such a message. In the rare cases when I do want to read a message in a charset unsupported by my configuration, the advantage of the mh system is that it is possible to handle it separately. Save, decode, convert ... whatever.

> - What to do when we cannot convert a particular character. This is a
>   little more clear; the general trend is to use a substitution
>   character.

This is very frequent and causes a lot of trouble. An entire message is in English, with one foreign family name in the original. The message is sent in UTF-8 but (suppose) my terminal supports only ASCII. The conversion would fail. I can prepare an example, but including it in this message could make it difficult to read.

In my personal opinion, a very good choice is conversion into HTML entities, like ą or ł. It remains quite readable and is still unique enough to convert back in case of need.

max
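The concern about raw bytes reaching the terminal can be illustrated with a small Python sketch. It is not how nmh handles the case; the function name and the escaping style are assumptions chosen for the example. Printable ASCII, tab and newline pass through, and everything else is made visible instead of being interpreted by the terminal.

```python
def make_safe_for_terminal(raw: bytes) -> str:
    """Render control bytes and 8-bit bytes visibly instead of sending them raw."""
    out = []
    for byte in raw:
        if byte in (0x09, 0x0A) or 0x20 <= byte < 0x7F:
            out.append(chr(byte))         # printable ASCII, tab, newline
        else:
            out.append(f"\\x{byte:02x}")  # ESC, other controls, 8-bit bytes
    return "".join(out)


# A body in an unknown charset that happens to contain an escape sequence.
sample = b"Hello \x1b[2J w\xf3rld\n"
print(make_safe_for_terminal(sample), end="")
```

The escape sequence that would otherwise clear the screen is printed as harmless text, at the cost of showing 8-bit letters as hex escapes.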
Re: [Nmh-workers] UTF-8 message bodies
Ken Hornstein writes:
> >It is possible to keep the current state almost unchanged by adding
> >one more clause to mhbuild, like a pair #off #on which marks the
> >region where ^# is not interpreted as a directive.
>
> But to me it seems dumb that # characters can't be in the
> beginning of a line, and having people have to know about
> #on/#off directives just seems like the wrong solution. [..]
> But if you run "mime" at a WhatNow? prompt then presumably
> you're smart enough to know you have to escape any leading #
> characters.

I'm trying to write as briefly as possible (to make fewer grammar mistakes =:-)), but sometimes it is too brief.

Suppose you use automimeproc: 1 and you want to include (as a part of the message) some lines from a program source, a shell script or whatever. You can type

#off
:r whatever (or copy by mouse)
#on

and you do not have to edit (and remember to edit) the included part to escape leading #. As it is an additional directive, the user doesn't have to know about it unless he needs it.

max
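The #off / #on idea could also be prototyped as a preprocessor that runs before mhbuild. This Python sketch is only an illustration of the proposal. It assumes that mhbuild treats a doubled leading '#' ('##') as a literal '#', and the marker names are the ones proposed in the message, not existing behaviour that this sketch relies on.

```python
def escape_protected_regions(lines):
    """Double the leading '#' of lines between '#off' and '#on' markers."""
    protected = False
    for line in lines:
        stripped = line.rstrip("\n")
        if stripped == "#off":
            protected = True    # the marker itself is dropped
        elif stripped == "#on":
            protected = False   # the marker itself is dropped
        elif protected and line.startswith("#"):
            yield "#" + line    # '#include' becomes '##include'
        else:
            yield line


draft = [
    "Here is the code:\n",
    "#off\n",
    "#include <stdio.h>\n",
    "int main(void) { return 0; }\n",
    "#on\n",
    "#some-real-directive\n",   # placeholder for a genuine directive
]
print("".join(escape_protected_regions(draft)), end="")
```

Lines outside the protected region are passed through unchanged, so genuine directives still reach mhbuild.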
Re: [Nmh-workers] UTF-8 message bodies
Ken Hornstein writes:
> >Well, it seems that both approaches can coexist if buildmimeproc
> >would do nothing for an already MIMEfied message, instead of
> >reporting an error.
>
> There is one caution here ... if you have a line that begins with a "#"
> that is NOT an mhbuild directive then you'll get an error. That's fine for
> us that know about it, but it can bite you.

I know it too well already. I have to use automimeproc=1 because of the charset. Every time the message contains #include or #! I see an error.

I've been thinking a bit about this and I can see (as a user, not a programmer) a couple of possibilities.

It is possible to keep the current state almost unchanged by adding one more clause to mhbuild, like a pair #off #on which marks the region where ^# is not interpreted as a directive.

Alternatively, the directive typed by the user and the (long) directive for mhbuild can be separated. The latter can be almost unique, since it would never be typed directly. The user directive can then be very flexible, with any starting sign (e.g. one can use @attach file.ext). Processing user directives would require an additional preprocessor (supplied or user-made, like a sed script). The real MIMEfying would be done by mhbuild before sending.

Those are just ideas. I would like to know your opinion.

max
Re: [Nmh-workers] More than one parameters in .mh_profile
Ken Hornstein writes:
> >I need moreproc to be "less -force" but show (nmh-1.3) refuses
> >this.
>
> Yeah, I guess what happens there is mhl (or whatever) is trying to
> exec("less -force"). Which as you've noted doesn't work.
>
> Other people have complained about this as well. But in this case you
> could just set the environment variable LESS to "f", right?

Not quite. In fact I need -force only in show, to force silent display of incompatible charsets.

> >The workaround is to make a shell script like vim-mail, which is in
> >fact a call to vim -c ":set ft=mail" .
> >
> >Is it possible to do such things more simply?
>
> Right now ... no. To start, I have no idea how this interface should
> look like. Suggestions here are welcome; code is even more welcome :-)

It does not seem too difficult to implement a function which splits any string into separate pieces and prepends them to the exec* parameters. But the discussion shows that the fundamental question is rather: should the string be passed to the shell (which gives the chance to use !$ or some such), or should the code replace the shell's job and interpret the string itself?

Finally, I think it is not worth solving now.

max
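The "split the string and prepend the pieces" variant, the one that does not involve a shell, can be sketched in a few lines of Python. The function name build_argv is made up for this example, and nmh itself is written in C, so this only shows the intended behaviour, including the handling of quoted arguments.

```python
import shlex


def build_argv(profile_value, extra_args):
    """Split a profile entry into words and prepend them to the arguments."""
    return shlex.split(profile_value) + list(extra_args)


print(build_argv("less -force", ["/tmp/msg"]))
# ['less', '-force', '/tmp/msg']
print(build_argv('vim -c ":set ft=mail"', ["draft"]))
# ['vim', '-c', ':set ft=mail', 'draft']

# The resulting list could then be handed to an exec*-style call such as
# os.execvp(argv[0], argv). No shell is involved, so history expansion,
# globbing and variables are not interpreted.
```

Passing the string to a shell instead would allow shell features, but it would also make quoting of message file names the caller's problem.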
Re: [Nmh-workers] UTF-8 message bodies
Ralph Corderoy writes:
> Hi max,
>
> > Why not set in .mh_profile
> > automimeproc: 1
>
> I like to look over the mime'd draft before sending to check I got the
> directives right.

Well, it seems that both approaches can coexist if buildmimeproc would do nothing for an already MIMEfied message, instead of reporting an error.

max
Re: [Nmh-workers] UTF-8 message bodies
Joel Uckelman writes:
> My wife pointed out to me today that for the past seven years (!) since
> the Linux distribution we use switched to UTF-8, when she sends messages
> containing non-ASCII characters [..] and that nmh is
> producing messages which lack any headers indicating that the contents
> are UTF-8.
>
> Is there a solution to this with nmh 1.4 or 1.5? All I turn up when I
> search for things about character sets and nmh are things related to
> displaying received messages in nmh, nothing about outgoing mail.

Why not set in .mh_profile

automimeproc: 1

Today non-MIME messages are slightly obsolete, even if they have to be accepted for backward compatibility.

max
[Nmh-workers] More than one parameters in .mh_profile
In .mh_profile some entries specify programs, like:

Editor: vim-mail
moreproc: less
postproc: /usr/lib/mh/post

Some of those programs require options or parameters, but apparently this is not accepted. I need moreproc to be "less -force" but show (nmh-1.3) refuses this.

The workaround is to make a shell script like vim-mail, which is in fact a call to vim -c ":set ft=mail" .

Is it possible to do such things more simply?

max
Re: [Nmh-workers] mailxi and attachements
Ralph Corderoy writes:
> Hi,
>
> I still use mail(1) for sending one-line emails or in pipes.

More important (for me) is the possibility to send just file(s) as attachments.

echo Enclosed | nail -a some.file some...@somewhere.net

mutt has similar functionality with its -a switch. I can't do it via nmh. Composing a message with an attachment is not very easy either. For me, some form of -a switch is missing in comp and repl. Using a mail client from a different subsystem is not convenient, e.g. I can't use .mh_aliases.

> It's provided here by the heirloom-mailx package, derived from
> Berkeley Mail 8.1 but brought up to date. A concise list of
> features, http://heirloom.sourceforge.net/mailx.html

It depends on the package. Sometimes it is installed as nail and -a means attachment; sometimes it is installed as mailx and, because of backward compatibility, -a has a different meaning. Then heirloom-mailx loses its most important feature.

Max
Re: [Nmh-workers] repl and mime handling
Ken Hornstein writes:
> > A couple of years ago emacs switched to "internal" coding in utf-8. I
> > had to stop using emacs and mh-e.
>
> See, this is what I'm missing - why, exactly? I assume the problem was not
> just philosophical.

Neither political nor religious =:-) Purely practical.

Imagine replying to a utf-8 message. The appended signature is in iso, so I start with a mixture of charsets. Then I have a lot of codings to play with: conversion of the utf-8 part of the message, the headers of the message, three types of internal emacs codings. It was possible to set it all up correctly, but I realized that my life is too short for such games. How to process ... and have some work done. =:-)

I was advised to switch to utf-8, but this would not help much. Imagine replying to an iso message and inserting a text file from disk.

With regret, I had to find another editor.

Max
Re: [Nmh-workers] repl and mime handling
Lyndon Nerenberg writes:
> >> But restricting the entire nmh to the utf-8 charset would cripple the system.
>
> How so, specifically? Plan9 has run a native UTF8-only mail environment
> for ages (with a very MH-like mailstore, as well), and it's far from
> crippled.

There is a tiny difference between "entire" and "internally". Normally, no one is interested in how the TV signal is processed inside the TV set, as long as he can see what he wants on the screen. So, I am not against utf-8 used internally. Almost.

A couple of years ago emacs switched to "internal" coding in utf-8. I had to stop using emacs and mh-e. So, there can be another tiny difference between "internally" and "internally".

Max
Re: [Nmh-workers] repl and mime handling
Ralph Corderoy writes:
> Hi Aleksander,
>
> > For English-speaking countries UTF-8 mostly means ASCII; they
> > can see no difference.
>
> I don't think that's the case. Even North Americans, who have $ in
> ASCII, still find ` ' " " and ... cropping up, especially when services
> automatically convert ` ' " " and And then there's L and Euro.

Well, some people/software find it funny to use LEFT SINGLE QUOTATION MARK instead of '. But some converters are clever. Mine treats it correctly, and it took me some time to find out what you are writing about.

> > As an advantage they can use foreign names like Moebius in the original,
> > which makes a message more readable. But I'm afraid they wouldn't be
> > happy with a message written in Russian, Chinese or Korean.
>
> The UTF-8 fonts on systems like Linux, and I assume Windows and Mac too,
> handle these just fine; Cyrillic, Chinese, and Japanese spam turns up
> here daily and mhshow copes.

Do not confuse a message perfectly displayed with a message perfectly readable.

> > But restricting the entire nmh to the utf-8 charset would cripple the system.
>
> What language/charset/locale is it that you have where UTF-8 causes
> problems?

My mental system. Try a test. Take a file with code (any programming language) and replace all normal characters like space, '"- etc. with their funny equivalents from utf-8. Send this file to a programmer who works in a utf-8 environment. Measure the time until he finds the reason for the problems.

I do not want to discuss the (dis)advantages of this or that. There are people who do not want to work with utf-8, for better or worse reasons. No software should force them to.

Cheers,
Max
Re: [Nmh-workers] repl and mime handling
Joel Uckelman writes:
> Thus spake Oliver Kiddle:
> >
> > The limitations occur where e-mails use characters that can't be
> > displayed in the current locale but we can't do anything about that.
>
> How likely is it that a message containing characters undisplayable in
> the user's locale will be useful for the user? (This isn't meant
> rhetorically, it's a serious question.)

This is not that simple. For years I forced display of the iso-8859-1 charset on a terminal supporting only iso-8859-2, and it works.

1. Charset declared in the mail header. Quite a lot of people have an incorrectly configured charset.
2. Language of the message. It might be different from what the charset suggests. For almost any charset the basic ASCII is the same, so a message written in English would be readable.
3. Rare non-Latin characters (e.g. names, cities) may force the MUA to switch to another charset, while almost the whole text is readable.

On the other hand, a message written in (say) Japanese would be unreadable even when perfectly displayed =:-) But the same would happen with a message written in a foreign language that uses the same charset as mine. The relation "supported charset => readable message" is not so strong.

Max
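What displaying iso-8859-1 text on an iso-8859-2 terminal looks like can be simulated in Python. The sample sentence is invented for the illustration. The ASCII part survives untouched, and only the few positions where the two charsets differ come out as a different letter.

```python
# Bytes as a sender with an iso-8859-1 setup would produce them.
raw = "Señor Müller sends his regards".encode("iso-8859-1")

# What an iso-8859-2 terminal makes of the very same bytes.
shown = raw.decode("iso-8859-2")

print(shown)  # Seńor Müller sends his regards
```

Here only one letter changes (the byte 0xF1 is 'ñ' in iso-8859-1 and 'ń' in iso-8859-2), which is why such a message stays readable in practice.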
Re: [Nmh-workers] repl and mime handling
Ken Hornstein writes:
> There is a good chunk of code inside of nmh that assumes ASCII (in
> terms of "what is a space", "what is a newline", and other things).
> [..]
> internal representation UTF-8.

What do you mean by "internal representation"? Conversion from any charset to utf-8, processing by the code and conversion back to the original charset is really internal, transparent to the user.

> Now my plan
> was to convert from UTF-8 to the native character set, but that
> conversion won't be perfect.

But such an internal conversion would be perfect: no new characters are introduced (except formatting like newlines and spaces). The question is: which charset will the draft be in for editing? The original one, one converted to something else (e.g. according to the locale), or utf-8? This is no longer internal.

> I'm writing the code, I'm the one
> who makes the decisions. [..]
> you can give me your OPINION,

Clear statement.

Max
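The claim that a purely internal conversion is lossless, and that trouble starts only when new characters are introduced, can be checked with a short Python sketch. The helper name round_trip and the sample text are assumptions made for the illustration.

```python
def round_trip(raw: bytes, charset: str) -> bytes:
    """Convert to UTF-8 for 'internal' processing, then back to the charset."""
    utf8 = raw.decode(charset).encode("utf-8")    # any charset -> UTF-8
    return utf8.decode("utf-8").encode(charset)   # UTF-8 -> original charset


original = "Zażółć gęślą jaźń".encode("iso-8859-2")
print(round_trip(original, "iso-8859-2") == original)  # True

# Once the text is edited and gains a character outside the original
# charset (here: typographic quotes), the way back is no longer possible.
edited = original.decode("iso-8859-2") + " \u201cquoted\u201d"
try:
    edited.encode("iso-8859-2")
    convertible = True
except UnicodeEncodeError:
    convertible = False
print(convertible)  # False
```

So the conversion itself is not the problem; the choice of the charset in which the draft is edited and sent is.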
Re: [Nmh-workers] repl and mime handling
Ken Hornstein writes:
> >*Please, no!* Conversion from any charset to utf-8 is possible, but
> >conversion back, according to user preferences, is not. People
> >start to use funny characters like non-breakable space and so on.
>
> Unfortunately, we don't have unlimited development resources.
>
> Here's my reading of the world:
>
> - The general trend (especially in English-speaking countries) is to move
>   toward Unicode (specifically, UTF-8).

For English-speaking countries UTF-8 mostly means ASCII; they can see no difference. As an advantage they can use foreign names like Moebius in the original, which makes a message more readable. But I'm afraid they wouldn't be happy with a message written in Russian, Chinese or Korean.

> - People in Eastern Europe aren't crazy about this.

I know at least one exception. =:-)

> - Given the lack of unlimited development resources,
>   I don't really see people
>   willing to change all of the internal APIs to include character set
>   information. That means we pretty much have to choose one character set
>   for an internal representation inside of nmh.

In fact, I know very little about the API, so it might be difficult. But restricting the entire nmh to the utf-8 charset would cripple the system.

Besides putting charset information into the API, the correct solution from my point of view, though it consumes too many resources, is to move the conversion module outside, as a separate program. The default program would convert to utf-8, but anyone could provide his own conversion program according to his taste. Suppose an entry in .mh_profile

mh-text-convert: prog

This (or something similar) could also handle conversion to and from html, not only charsets.

Max
Re: [Nmh-workers] repl and mime handling
Joel Uckelman writes:
> > > So, I have some thoughts in this direction, but I'm wondering: what do
> > > you want out of repl in terms of better MIME handling?
> >
> > All the "text" parts turned into UTF-8 and quoted would be a good start.
> > I can then trim down in vi as normal.
>
> Yes, please.

*Please, no!* Conversion from any charset to utf-8 is possible, but conversion back, according to user preferences, is not. People start to use funny characters like non-breakable space and so on.

The problem seems impossible to solve, not because of technical difficulty but because of very different user preferences. A much more flexible mechanism is needed. Any charset conversion raises the problem of what the sending charset should be and how to change it.

> Even just decoding the text parts which are base64 would
> be a huge improvement.

*Yes, please!* For now, I'm filtering messages via slocal:

default - pipe ? "/usr/bin/mimedecode|/usr/lib/mh/rcvpack /var/mail/max -mbox"

which is not the best practice, but it works.

max
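Decoding base64 text parts without touching their charset can be sketched with Python's standard email package. This is an illustration only, not what repl or mimedecode do; the sample message and the function name decoded_text_parts are made up for the example.

```python
import email

# A made-up message whose only part is base64-encoded plain text.
raw_message = (
    "From: someone@example.org\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "SGVsbG8sIHdvcmxkIQ==\n"
)


def decoded_text_parts(msg):
    """Yield every text/* part with its transfer encoding removed."""
    for part in msg.walk():
        if part.get_content_maintype() == "text":
            payload = part.get_payload(decode=True)  # undoes base64 or QP
            charset = part.get_content_charset() or "us-ascii"
            yield payload.decode(charset, errors="replace")


msg = email.message_from_string(raw_message)
parts = list(decoded_text_parts(msg))
print(parts)  # ['Hello, world!']
```

Only the transfer encoding is removed here. Converting the result to another charset would be a separate step, which is exactly the part where user preferences differ.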
[Nmh-workers] vmh and other unused files
I am new to the list. Sorry for the direct (and perhaps stupid) question.

Can anyone compile vmh from nmh-1.4-RC2? For me it gives a lot of errors.

I read in the list archive that there is something like vmh in nmh. This is what I need, an ncurses interface to nmh, but it is not part of the Debian nmh package and does not compile. Difficult to test...

AM