Re: Encoding problem
A.J.Mechelynck wrote:
David Woodfall wrote:
SOLVED! Well I think I fixed by rtfm:

    :set termenc=cp1252

Seems to work, but I don't know yet whether it breaks anything else.

'termencoding' tells Vim (in both the Console and GUI versions) how your keyboard translates data and (in the Console version only) how the terminal displays it. By default it is set to empty, which means "use the value of 'encoding'". This is OK as long as you don't change 'encoding' in your vimrc. If you do, then it is "prudent" to save in 'termencoding' your "locale" encoding, i.e., whatever 'encoding' was set to at startup, like this:

ERRATUM

    if has("multi_byte")        " if not, we don't have Unicode support
      if &enc !~? '^u'          " if 'encoding' starts with u or U,
                                " then Unicode is already set
        if &tenc == ""
          let &tenc = &enc      " avoid clobbering the keyboard encoding
        endif
        set enc=utf-8
      endif
      set fencs=ucs-bom,utf-8,latin1    " heuristics for existing files
      setglobal bomb fenc=latin1        " defaults for newly-created files
    else
      echomsg "Warning: No multibyte support"
    endif

    " the following adds (among other things) the current 'fileencoding'
    " to the statusline text
    if has("statusline")
      exe 'set statusline=%<%f\ %h%m%r%=%k[%{(&fenc\ '
        \ . '==\ \"\"?&enc:&fenc).(&bomb?\",BOM\":\"\")}]\ %l,%c%V[%b=0x%02B]\ %P'
    endif

Best regards,
Tony.
--
"Wagner's music is better than it sounds."
-- Mark Twain
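To see what a snippet like the one above actually did on a given setup, the options involved can simply be queried. This is a minimal sketch and assumes nothing beyond a stock Vim; it only reads values and changes nothing.

```vim
" Show the global values; an empty 'termencoding' means "same as 'encoding'"
set encoding? termencoding? fileencodings?
" Show what was recorded for the current buffer when its file was read
setlocal fileencoding? bomb?
" :verbose additionally reports which script last set the option
verbose set encoding?
```

Running these before and after sourcing the vimrc block makes it easier to tell whether a display problem comes from 'encoding', from 'termencoding', or from the per-file 'fileencoding'.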
Re: Encoding problem
David Woodfall wrote:
SOLVED! Well I think I fixed by rtfm:

    :set termenc=cp1252

Seems to work, but I don't know yet whether it breaks anything else.

'termencoding' tells Vim (in both the Console and GUI versions) how your keyboard translates data and (in the Console version only) how the terminal displays it. By default it is set to empty, which means "use the value of 'encoding'". This is OK as long as you don't change 'encoding' in your vimrc. If you do, then it is "prudent" to save in 'termencoding' your "locale" encoding, i.e., whatever 'encoding' was set to at startup, like this:

    if has("multi_byte")        " if not, we don't have Unicode support
      if &enc !~? '^u'          " if 'encoding' starts with u or U,
                                " then Unicode is already set
        let &tenc = &enc        " avoid clobbering the keyboard encoding
        set enc=utf-8
      endif
      set fencs=ucs-bom,utf-8,latin1    " heuristics for existing files
      setglobal bomb fenc=latin1        " defaults for newly-created files
    else
      echomsg "Warning: No multibyte support"
    endif

    " the following adds (among other things) the current 'fileencoding'
    " to the statusline text
    if has("statusline")
      exe 'set statusline=%<%f\ %h%m%r%=%k[%{(&fenc\ '
        \ . '==\ \"\"?&enc:&fenc).(&bomb?\",BOM\":\"\")}]\ %l,%c%V[%b=0x%02B]\ %P'
    endif

Best regards,
Tony.
--
All true wisdom is found on T-shirts.
Re: Encoding problem
David Woodfall wrote:
I have a bit of a problem with encoding. A particular file (made in windows btw) shows characters wrong in vim, but ok in gvim. Example:

¹²³€

(made by holding alt-gr key and typing 1234).

Gvim shows encoding as utf-8 as does vim, so I thought maybe it was a problem with my terminal (mrxvt) but in Irssi I have set char-set as cp1252 and these characters show correctly in my term. CP1252 doesn't appear to be an option in vim though. Also the pound £ sign doesn't show correctly.

Any ideas how to get around this problem?

When setting "Character Encoding" to UTF-8 in my mailer, I see your first string as

¹²³€

i.e. (exponent 1)(exponent 2)(exponent 3)(Euro sign) and I see your pound sign as -- well, a pound sign: £

Even if your 'encoding' is set to UTF-8, you can read and write files in any other encoding. For instance:

    :e ++enc=cp1252 filename

for Windows-1252. In that case ":setlocal fenc?" will answer " fileencoding=cp1252".

see ":help ++opt"

Best regards,
Tony.
--
hundred-and-one symptoms of being an internet addict:
34. You laugh at people with 14400 baud modems.
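If the aim is to convert such a file rather than only view it, the ++enc read can be followed by a change of 'fileencoding' before writing. A sketch, assuming the file really is Windows-1252 and that 'encoding' is utf-8 (or anything else able to hold all of its characters); `filename` is a placeholder, as in Tony's example.

```vim
" Read the file, telling Vim that its bytes are Windows-1252
edit ++enc=cp1252 filename
" Check what Vim recorded for this buffer
setlocal fileencoding?
" Ask for UTF-8 on the next write of this buffer, then write it
setlocal fileencoding=utf-8
write
```

Keeping a copy of the original file is prudent, since a wrong guess about the source encoding produces a file that is valid UTF-8 but contains the wrong characters.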
Re: Encoding problem
SOLVED! Well I think I fixed by rtfm:

    :set termenc=cp1252

Seems to work, but I don't know yet whether it breaks anything else.

On (15:20 15/02/07), David Woodfall <[EMAIL PROTECTED]> put forth the proposition:
> I have a bit of a problem with encoding. A particular file (made in windows
> btw) shows characters wrong in vim, but ok in gvim. Example:
>
> ¹²³€
>
> (made by holding alt-gr key and typing 1234).
>
> Gvim shows encoding as utf-8 as does vim, so I thought maybe it was a
> problem with my terminal (mrxvt) but in Irssi I have set char-set as cp1252
> and these characters show correctly in my term. CP1252 doesn't appear to be
> an option in vim though. Also the pound £ sign doesn't show correctly.
>
> Any ideas how to get around this problem?
>
> --
> "A fractal is by definition a set for which the Hausdorff Besicovitch
> dimension strictly exceeds the topological dimension."
> -- Mandelbrot, "The Fractal Geometry of Nature"

--
"The society which scorns excellence in plumbing as a humble activity and tolerates shoddiness in philosophy because it is an exalted activity will have neither good plumbing nor good philosophy ... neither its pipes nor its theories will hold water."
Encoding problem
I have a bit of a problem with encoding. A particular file (made in windows btw) shows characters wrong in vim, but ok in gvim. Example:

¹²³€

(made by holding alt-gr key and typing 1234).

Gvim shows encoding as utf-8 as does vim, so I thought maybe it was a problem with my terminal (mrxvt) but in Irssi I have set char-set as cp1252 and these characters show correctly in my term. CP1252 doesn't appear to be an option in vim though. Also the pound £ sign doesn't show correctly.

Any ideas how to get around this problem?

--
"A fractal is by definition a set for which the Hausdorff Besicovitch dimension strictly exceeds the topological dimension."
-- Mandelbrot, "The Fractal Geometry of Nature"
Re: Encoding problem
Hi Tony :) * A.J.Mechelynck <[EMAIL PROTECTED]> dixit: > >>":scriptencoding" applies no farther than the end of the current script. > > > >And does it affect sourced scripts or should I put that line in all > >scripts? > > It doesn't affect sourced scripts. Each script should include or not > include a ":scriptencoding" statement according to what bytes are found in > that script itself. OK, I thought it was more like a flag, meaning "from now on, everything you source is latin1". Of course, it wouldn't make much sense that way, I should have noticed O:) > >>OK, let's try the opposite: edit options.vim, remove the sriptencoding > >>statement, then save it with > >> > >>:setlocal bomb fenc=utf-8 > >>:x > >> > >>Then restart Vim and see if it works. > > > >No, it doesn't work, but the strange thing is that vim barfs *only* > >with 'showbreak'. I have latin1 (well, utf-8 now) characters in the > >script, namely in 'foldtext' and 'listchars' at least, and they are > >processed correctly. Maybe the codes I'm using are considered printable > >in latin1 and nonprintable in utf8? > > What characters are seen as printable in Vim depends on the 'isprint' > option. Oh, I didn't remember that, I assumed that Vim was using the "isprint()" functions in ctype for that. > That option's default is OS-dependent, but apparently not > locale-dependent. ASCII characters from 0x20 (space) to 0x7E (tilde), > including all digits and letters, are always "printable", even if the > option doesn't mention them. Anyway, I have that option set to "@,161-255", so probably if I set my encoding to utf8, the multibyte characters (division and left guillemot) should be printable, but in this case looks like Vim doesn't like them (it likes them on 'listchars', so the problem is not the encoding, definitely). > >Oops, I think I know what's happening. 
I don't have an utf8 locale, > >and I don't mean active, I mean *installed*, so if vim is trying to use > >an utf-8 locale to see if a character is printable or not, it won't work > >unless vim itself knows if some character is printable or not under > >utf8. That's why the error is E595 and only shows with 'showbreak'. Vim > >is considering the division sign and the left guillemot non printable > >under utf8 encoding (which, BTW, is not right). Probably if I install an > >utf8 locale, things will work OK. By now I'll leave 'encoding' as > >default, 'fenc' and 'fencs' empty and will set utf-8 by hand when needed > >(which is not very frequently for me). > > There used to be a limitation on 'listchars', and possibly it still applies > to 'showbreak': the characters in that option had to be valid in the > current 'encoding'. If you change the 'encoding', the option may become > invalid in the new 'encoding'. If you use 7-bit characters in 'showbreak' > it should be OK in all 'encoding's. Yes, but if I set the encoding to utf8 and save the file *as* utf8, then Vim should handle it, am I wrong?. Those characters will be valid utf8 and will be printable :? > >Problem solved! Thanks a lot for everything, Tony :) > > > De nada, hombre. Do you know that with that kind of expressions you would pass as a spanish native? Your spanish is much better than you think ;))) I'll try to learn a bit of esperanto to correspond to your kindness :) Raúl Núñez de Arenas Coronado -- Linux Registered User 88736 | http://www.dervishd.net It's my PC and I'll cry if I want to... RAmen!
Re: Encoding problem
A.J.Mechelynck wrote: [...] If you leave 'encoding' set at Latin1, Vim won't be able to represent in memory any Unicode codepoints higher than U+00FF, even if you use ":e ++enc=utf-8 filename". See for instance the Russian and Arabic text in my front page, http://users.skynet.be/antoine.mechelynck/index.htm . If you /don't/ use ++enc, then with 'fencs' empty (which is not the default) there will be no translation, and every codepoint above U+007F in a UTF-8 file will appear as two or more bytes of gibberish. For instance, "Raúl Núñez" would be shown as "Raúl Núñez" which is not very pretty to look at. [...] Oops! That page plays a dirty trick: it sets your 'encoding' to UTF-8. To see the difference, download it on your computer, remove the modeline at line 3, and restart Vim. Best regards, Tony.
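The gibberish Tony describes is what appears when UTF-8 bytes are taken for Latin1. One way to reproduce the effect from inside Vim is iconv(), which converts a string's bytes from one encoding to another. This is only a sketch: it assumes 'encoding' is utf-8, that your build can convert between latin1 and utf-8, and that the command is typed on the command line rather than sourced from a script.

```vim
" The string is held in memory as UTF-8 bytes. Pretend those bytes are
" Latin1 and convert them to UTF-8 again, so that every byte above 0x7F
" is turned into a separate (wrong) character.
echo iconv("Raúl Núñez", "latin1", "utf-8")
```

If the conversion is not available, iconv() returns an empty string, so an empty result says nothing about the file or terminal in question.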
Re: Encoding problem
DervishD wrote: Hi Tony :) * A.J.Mechelynck <[EMAIL PROTECTED]> dixit: DervishD wrote: ":scriptencoding" is used to tell Vim's sourcing engine in which 'fileencoding' the script was written. There are two cases where it is not necessary: - the same as 'encoding', or - UTF-8 with BOM. IOW, yes, if you set 'encoding' to UTF-8 you may have to also issue ":scriptencoding latin1". I have this line as the first line of my "options.vim", but it doesn't seem to work. Probably because I do the following: my /etc/vimrc sources /etc/vim/options.vim, which is the problematic script and the only one that has "scriptencoding" on it. Probably when vim is parsing the file, it already has decided that the rc files are utf-8, since /etc/vimrc has no latin1 characters on it. ":scriptencoding" applies no farther than the end of the current script. And does it affect sourced scripts or should I put that line in all scripts? It doesn't affect sourced scripts. Each script should include or not include a ":scriptencoding" statement according to what bytes are found in that script itself. OK, let's try the opposite: edit options.vim, remove the sriptencoding statement, then save it with :setlocal bomb fenc=utf-8 :x Then restart Vim and see if it works. No, it doesn't work, but the strange thing is that vim barfs *only* with 'showbreak'. I have latin1 (well, utf-8 now) characters in the script, namely in 'foldtext' and 'listchars' at least, and they are processed correctly. Maybe the codes I'm using are considered printable in latin1 and nonprintable in utf8? What characters are seen as printable in Vim depends on the 'isprint' option. That option's default is OS-dependent, but apparently not locale-dependent. ASCII characters from 0x20 (space) to 0x7E (tilde), including all digits and letters, are always "printable", even if the option doesn't mention them. 
Multibyte characters above 256 (but not necessarily Unicode codepoints in the range U+0080 to U+00FF, which are multibyte in all Unicode encodings but are not above 256) are also always "printable"; however, some of them don't display and may be handled specially. Oops, I think I know what's happening. I don't have an utf8 locale, and I don't mean active, I mean *installed*, so if vim is trying to use an utf-8 locale to see if a character is printable or not, it won't work unless vim itself knows if some character is printable or not under utf8. That's why the error is E595 and only shows with 'showbreak'. Vim is considering the division sign and the left guillemot non printable under utf8 encoding (which, BTW, is not right). Probably if I install an utf8 locale, things will work OK. By now I'll leave 'encoding' as default, 'fenc' and 'fencs' empty and will set utf-8 by hand when needed (which is not very frequently for me). There used to be a limitation on 'listchars', and possibly it still applies to 'showbreak': the characters in that option had to be valid in the current 'encoding'. If you change the 'encoding', the option may become invalid in the new 'encoding'. If you use 7-bit characters in 'showbreak' it should be OK in all 'encoding's. If you leave 'encoding' set at Latin1, Vim won't be able to represent in memory any Unicode codepoints higher than U+00FF, even if you use ":e ++enc=utf-8 filename". See for instance the Russian and Arabic text in my front page, http://users.skynet.be/antoine.mechelynck/index.htm . If you /don't/ use ++enc, then with 'fencs' empty (which is not the default) there will be no translation, and every codepoint above U+007F in a UTF-8 file will appear as two or more bytes of gibberish. For instance, "Raúl Núñez" would be shown as "Raúl Núñez" which is not very pretty to look at. Problem solved! Thanks a lot for everything, Tony :) Raúl Núñez de Arenas Coronado De nada, hombre. Best regards, Tony.
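Tony's remark that 7-bit characters in 'showbreak' should be fine under every 'encoding' suggests a defensive fallback for a shared vimrc. A minimal sketch; the particular marker strings are arbitrary choices and not taken from the thread.

```vim
" Plain ASCII markers: the same bytes in US-ASCII, Latin1 and UTF-8,
" so they do not depend on how this script file is encoded
let &showbreak = '>> '
set listchars=tab:>-,trail:.,eol:$
```

Using `:let &showbreak` instead of `:set showbreak=` avoids having to escape the trailing space.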
Re: Encoding problem
Hi Tony :) * A.J.Mechelynck <[EMAIL PROTECTED]> dixit: > DervishD wrote: > >>":scriptencoding" is used to tell Vim's sourcing engine in which > >>'fileencoding' the script was written. There are two cases where it is > >>not necessary: > >>- the same as 'encoding', or > >>- UTF-8 with BOM. > >>IOW, yes, if you set 'encoding' to UTF-8 you may have to also issue > >>":scriptencoding latin1". > > > >I have this line as the first line of my "options.vim", but it > >doesn't seem to work. Probably because I do the following: my /etc/vimrc > >sources /etc/vim/options.vim, which is the problematic script and the > >only one that has "scriptencoding" on it. Probably when vim is parsing > >the file, it already has decided that the rc files are utf-8, since > >/etc/vimrc has no latin1 characters on it. > > ":scriptencoding" applies no farther than the end of the current script. And does it affect sourced scripts or should I put that line in all scripts? > OK, let's try the opposite: edit options.vim, remove the sriptencoding > statement, then save it with > > :setlocal bomb fenc=utf-8 > :x > > Then restart Vim and see if it works. No, it doesn't work, but the strange thing is that vim barfs *only* with 'showbreak'. I have latin1 (well, utf-8 now) characters in the script, namely in 'foldtext' and 'listchars' at least, and they are processed correctly. Maybe the codes I'm using are considered printable in latin1 and nonprintable in utf8? Oops, I think I know what's happening. I don't have an utf8 locale, and I don't mean active, I mean *installed*, so if vim is trying to use an utf-8 locale to see if a character is printable or not, it won't work unless vim itself knows if some character is printable or not under utf8. That's why the error is E595 and only shows with 'showbreak'. Vim is considering the division sign and the left guillemot non printable under utf8 encoding (which, BTW, is not right). Probably if I install an utf8 locale, things will work OK. 
By now I'll leave 'encoding' as default, 'fenc' and 'fencs' empty and will set utf-8 by hand when needed (which is not very frequently for me). Problem solved! Thanks a lot for everything, Tony :) Raúl Núñez de Arenas Coronado -- Linux Registered User 88736 | http://www.dervishd.net It's my PC and I'll cry if I want to... RAmen!
Re: Encoding problem
DervishD wrote: Hi Tony :) * A.J.Mechelynck <[EMAIL PROTECTED]> dixit: DervishD wrote: * A.J.Mechelynck <[EMAIL PROTECTED]> dixit: [...] As long as your vimrc includes only 7-bit ASCII, there's no problem. But in the particular case of your vimrc, you could add the following lines at top, do ":setlocal fenc=latin1", and (IIUC) it will always be _read_ as Latin1 in the future, because of the accented letters in your name: Won't "scriptencoding" work? I have latin1 characters in my vimrc and setting "encoding=utf8" now causes vim to spill an error when reading it :((( I'm afraid I will have to keep it at the default value. Maybe I didn't express myself clearly enough. Unless your vimrc includes codepoints higher than U+00FF, it can be represented in Latin1. Any Latin1 file which includes the words "Raúl Núñez" will cause the UTF-8 heuristic to fail in 'fileencodings', and Vim will see it as Latin1. Which doesn't work if 'encoding' is utf8, I've tested :(( Vim barfs in some latin1 characters I use in 'showbreak' (I don't know the Unicode code point of the characters, but in latin1 they're 0xf7 and 0xbb). ":scriptencoding" is used to tell Vim's sourcing engine in which 'fileencoding' the script was written. There are two cases where it is not necessary: - the same as 'encoding', or - UTF-8 with BOM. IOW, yes, if you set 'encoding' to UTF-8 you may have to also issue ":scriptencoding latin1". I have this line as the first line of my "options.vim", but it doesn't seem to work. Probably because I do the following: my /etc/vimrc sources /etc/vim/options.vim, which is the problematic script and the only one that has "scriptencoding" on it. Probably when vim is parsing the file, it already has decided that the rc files are utf-8, since /etc/vimrc has no latin1 characters on it. ":scriptencoding" applies no farther than the end of the current script. I'll make a test... OK, it still fails. 
I've put "scriptencoding" at the top of my vimrc file, and vim barfs in the same latin1 characters.

0xF7 in Latin1, or Unicode U+00F7, is the divide sign (English style, a colon with a dash in the middle). 0xBB, or U+00BB, is the closing French quote (Ctrl-K >>).

Again, thanks a lot for your help, you're great :
Raúl Núñez de Arenas Coronado

OK, let's try the opposite: edit options.vim, remove the scriptencoding statement, then save it with

    :setlocal bomb fenc=utf-8
    :x

Then restart Vim and see if it works.

Best regards,
Tony.
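The code points Tony gives can be checked from within Vim. A small sketch; it assumes the characters typed on the command line reach Vim correctly, which itself depends on 'termencoding' matching the terminal.

```vim
" Print the code point of the divide sign and of the closing French quote
echo printf('U+%04X', char2nr('÷'))
echo printf('U+%04X', char2nr('»'))
```

Placing the cursor on a character in a buffer and typing `ga` in Normal mode reports the same value in decimal, hexadecimal and octal.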
Re: Encoding problem
Hi Tony :) * A.J.Mechelynck <[EMAIL PROTECTED]> dixit: > DervishD wrote: > > * A.J.Mechelynck <[EMAIL PROTECTED]> dixit: > [...] > >>As long as your vimrc includes only 7-bit ASCII, there's no problem. But > >>in the particular case of your vimrc, you could add the following lines > >>at top, do ":setlocal fenc=latin1", and (IIUC) it will always be _read_ > >>as Latin1 in the future, because of the accented letters in your name: > > > >Won't "scriptencoding" work? I have latin1 characters in my vimrc > >and setting "encoding=utf8" now causes vim to spill an error when > >reading it :((( I'm afraid I will have to keep it at the default value. > > Maybe I didn't express myself clearly enough. Unless your vimrc includes > codepoints higher than U+00FF, it can be represented in Latin1. Any Latin1 > file which includes the words "Raúl Núñez" will cause the UTF-8 heuristic > to fail in 'fileencodings', and Vim will see it as Latin1. Which doesn't work if 'encoding' is utf8, I've tested :(( Vim barfs in some latin1 characters I use in 'showbreak' (I don't know the Unicode code point of the characters, but in latin1 they're 0xf7 and 0xbb). > ":scriptencoding" is used to tell Vim's sourcing engine in which > 'fileencoding' the script was written. There are two cases where it is not > necessary: > - the same as 'encoding', or > - UTF-8 with BOM. > IOW, yes, if you set 'encoding' to UTF-8 you may have to also issue > ":scriptencoding latin1". I have this line as the first line of my "options.vim", but it doesn't seem to work. Probably because I do the following: my /etc/vimrc sources /etc/vim/options.vim, which is the problematic script and the only one that has "scriptencoding" on it. Probably when vim is parsing the file, it already has decided that the rc files are utf-8, since /etc/vimrc has no latin1 characters on it. I'll make a test... OK, it still fails. I've put "scriptencoding" at the top of my vimrc file, and vim barfs in the same latin1 characters. 
Again, thanks a lot for your help, you're great : Raúl Núñez de Arenas Coronado -- Linux Registered User 88736 | http://www.dervishd.net It's my PC and I'll cry if I want to... RAmen!
Re: Encoding problem
DervishD wrote: Hi Tony :) * A.J.Mechelynck <[EMAIL PROTECTED]> dixit: [...] As long as your vimrc includes only 7-bit ASCII, there's no problem. But in the particular case of your vimrc, you could add the following lines at top, do ":setlocal fenc=latin1", and (IIUC) it will always be _read_ as Latin1 in the future, because of the accented letters in your name: Won't "scriptencoding" work? I have latin1 characters in my vimrc and setting "encoding=utf8" now causes vim to spill an error when reading it :((( I'm afraid I will have to keep it at the default value. Thanks for all the help :) Raúl Núñez de Arenas Coronado Maybe I didn't express myself clearly enough. Unless your vimrc includes codepoints higher than U+00FF, it can be represented in Latin1. Any Latin1 file which includes the words "Raúl Núñez" will cause the UTF-8 heuristic to fail in 'fileencodings', and Vim will see it as Latin1. ":scriptencoding" is used to tell Vim's sourcing engine in which 'fileencoding' the script was written. There are two cases where it is not necessary: - the same as 'encoding', or - UTF-8 with BOM. IOW, yes, if you set 'encoding' to UTF-8 you may have to also issue ":scriptencoding latin1". Or else you can use a UTF-8 vimrc, provided that you can be sure that it will never be sourced (or that characters >0x7F in it will always be skipped) in a Vim compiled with -multi_byte. Best regards, Tony.
Re: Encoding problem
Hi Tony :) * A.J.Mechelynck <[EMAIL PROTECTED]> dixit: > >>Your problem lies in the relation between UTF-8, Latin1 and US-ASCII. > >>Characters 0x00 to 0x7F are represented identically in all three, > >>therefore if a file contains only 7-bit ASCII characters, it won't make > >>any difference whether it is interpreted as US-ASCII, Latin1 or UTF-8 -- > >>the data will be the same, *represented the same way*, in all three cases. > > > >I know that, ucs-bom and utf-8 are tried before latin1 and utf-8 > >always succeeds for US-ASCII files :((( > > No need to frown: US-ASCII "is" UTF-8 (but the reciprocal is not always > true): or if you prefer, a UTF-8 file containing only codepoints below > U+0080 can be read correctly, with no errors or misreadings, by any program > accepting US-ASCII. Sorry, I'm afraid I didn't use the proper smiley O:) I wasn't frowning, what I wanted to express was more in the lines of "how unfortunate am I", or something like that. Sorry for the mistake... > >A partial solution for me would be to force "latin1" when saving a > >file, but then I take the risk of messing the encoding of a couple of > >projects where I may add code which are utf-8 :(( > > > >Probably my best bet is to map "save as latin1" and do this > >manually. > > Rather than map ":w ++enc=latin1" I would map ":setlocal fenc=latin1", > because with the latter (but not with the former) all saves of the file > will be in latin1 until you ":quit" the file Nice idea :)) Thanks a lot :)) > >BTW, and regarding your suggestion above, I just forgot to do it > >back when I wrote my vimrc while reading the documentation. I missed the > >prominent note, sorry O: > > As long as your vimrc includes only 7-bit ASCII, there's no problem. But in > the particular case of your vimrc, you could add the following lines at > top, do ":setlocal fenc=latin1", and (IIUC) it will always be _read_ as > Latin1 in the future, because of the accented letters in your name: Won't "scriptencoding" work? 
I have latin1 characters in my vimrc and setting "encoding=utf8" now causes vim to spill an error when reading it :((( I'm afraid I will have to keep it at the default value. Thanks for all the help :) Raúl Núñez de Arenas Coronado -- Linux Registered User 88736 | http://www.dervishd.net It's my PC and I'll cry if I want to... RAmen!
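The mapping Tony suggests in place of ":w ++enc=latin1" could look like the following. A sketch only: the key `<Leader>l` is an arbitrary, hypothetical choice and does not come from the thread.

```vim
" Mark the current buffer as Latin1; every later :w of this buffer
" is then written as Latin1 until the buffer is abandoned
nnoremap <Leader>l :setlocal fileencoding=latin1<CR>
```

Because the option is set with `:setlocal`, other buffers, including UTF-8 project files open in the same session, keep their own 'fileencoding'.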
Re: Encoding problem
DervishD wrote: Hi Tony :) * A.J.Mechelynck <[EMAIL PROTECTED]> dixit: DervishD wrote: My system is latin-1, so I want my files written using latin-1 encoding. But sometimes I get files in utf8 encoding, so I set up my vim like this: set encoding =latin1 set fileencoding =latin1 set fileencodings =ucs-bom,utf-8,latin1 This last line is causing big problems to me. Everytime I edit one of MY files, not the utf8 imported files, vim converts it to utf-8, because while ucs-bom may fail as an encoding, utf-8 not. My problem will be gone if I set "fileencodings" to just latin1, but then I won't get utf-8 files automagically converted and presented to me in a readable form. Is there any way to get what I want, that is, to have ALL my files edited as latin1 but convert utf-8 files properly without using the "++enc" thing? Your problem lies in the relation between UTF-8, Latin1 and US-ASCII. Characters 0x00 to 0x7F are represented identically in all three, therefore if a file contains only 7-bit ASCII characters, it won't make any difference whether it is interpreted as US-ASCII, Latin1 or UTF-8 -- the data will be the same, *represented the same way*, in all three cases. I know that, ucs-bom and utf-8 are tried before latin1 and utf-8 always succeeds for US-ASCII files :((( No need to frown: US-ASCII "is" UTF-8 (but the reciprocal is not always true): or if you prefer, a UTF-8 file containing only codepoints below U+0080 can be read correctly, with no errors or misreadings, by any program accepting US-ASCII. You can do it in advance by intentionally placing some upper-ASCII in the file, for instance by underlining the top title with ÷ (a line of "divide-by" signs, 0xF7), then saving the file as Latin1. Yes, I thought about that solution, but it's messy and not always applicable (I cannot place upper latin1 characters in some files at the beginning, or remember to save it as latin1). I think you can place upper-half characters anywhere. 
If there are _no_ upper-ascii characters anywhere in the file, then it's us-ascii and the above remark applies. Note that in order to edit Unicode files properly, it is more prudent to set 'encoding' to UTF-8, otherwise if you happen to edit a file containing anything which your current 'encoding' cannot represent, it will get garbled, and Vim won't be able to restore the original value when saving the file. You can do it as follows (in your vimrc): if &encoding !~? "^u" if &termencoding == "" let &termencoding = &encoding endif set encoding=utf-8 fileencodings=ucs-bom,utf-8,latin1 setglobal bomb fileencoding=latin1 endif So, there is no way of solving my problem unless I put "latin1" before "utf8" in "fileencodings", but then nothing will work because "latin1" will always succeed :((( Yes, 'fileencodings' should contain at most one 8-bit encoding, and if there is one, it should be last, because 8-bit encodings never give a "fail" signal. A partial solution for me would be to force "latin1" when saving a file, but then I take the risk of messing the encoding of a couple of projects where I may add code which are utf-8 :(( Probably my best bet is to map "save as latin1" and do this manually. Rather than map ":w ++enc=latin1" I would map ":setlocal fenc=latin1", because with the latter (but not with the former) all saves of the file will be in latin1 until you ":quit" the file BTW, and regarding your suggestion above, I just forgot to do it back when I wrote my vimrc while reading the documentation. I missed the prominent note, sorry O: Thanks for your help :) Raúl Núñez de Arenas Coronado As long as your vimrc includes only 7-bit ASCII, there's no problem. 
But in the particular case of your vimrc, you could add the following lines at top, do ":setlocal fenc=latin1", and (IIUC) it will always be _read_ as Latin1 in the future, because of the accented letters in your name:

    " Vim configuration file
    " Maintainer: Raúl Núñez de Arenas Coronado <[EMAIL PROTECTED]>
    " Last change: 11-Jan-2007

(The "Last change" line is _not_ updated automagically.)

Best regards,
Tony.
Re: Encoding problem
Hi Tony :) * A.J.Mechelynck <[EMAIL PROTECTED]> dixit: > DervishD wrote: > >My system is latin-1, so I want my files written using latin-1 > >encoding. But sometimes I get files in utf8 encoding, so I set up my vim > >like this: > > > >set encoding =latin1 > >set fileencoding =latin1 > >set fileencodings =ucs-bom,utf-8,latin1 > > > >This last line is causing big problems to me. Everytime I edit one > >of MY files, not the utf8 imported files, vim converts it to utf-8, > >because while ucs-bom may fail as an encoding, utf-8 not. > > > >My problem will be gone if I set "fileencodings" to just latin1, but > >then I won't get utf-8 files automagically converted and presented to me > >in a readable form. > > > >Is there any way to get what I want, that is, to have ALL my files > >edited as latin1 but convert utf-8 files properly without using the > >"++enc" thing? > > Your problem lies in the relation between UTF-8, Latin1 and US-ASCII. > Characters 0x00 to 0x7F are represented identically in all three, therefore > if a file contains only 7-bit ASCII characters, it won't make any > difference whether it is interpreted as US-ASCII, Latin1 or UTF-8 -- the > data will be the same, *represented the same way*, in all three cases. I know that, ucs-bom and utf-8 are tried before latin1 and utf-8 always succeeds for US-ASCII files :((( > You can do it in advance by intentionally placing some upper-ASCII in the > file, for instance by underlining the top title with > ÷ (a line of "divide-by" signs, 0xF7), then saving > the file as Latin1. Yes, I thought about that solution, but it's messy and not always applicable (I cannot place upper latin1 characters in some files at the beginning, or remember to save it as latin1). 
> Note that in order to edit Unicode files properly, it is more prudent to > set 'encoding' to UTF-8, otherwise if you happen to edit a file containing > anything which your current 'encoding' cannot represent, it will get > garbled, and Vim won't be able to restore the original value when saving > the file. You can do it as follows (in your vimrc): > > if &encoding !~? "^u" > if &termencoding == "" > let &termencoding = &encoding > endif > set encoding=utf-8 fileencodings=ucs-bom,utf-8,latin1 > setglobal bomb fileencoding=latin1 > endif So, there is no way of solving my problem unless I put "latin1" before "utf8" in "fileencodings", but then nothing will work because "latin1" will always succeed :((( A partial solution for me would be to force "latin1" when saving a file, but then I take the risk of messing the encoding of a couple of projects where I may add code which are utf-8 :(( Probably my best bet is to map "save as latin1" and do this manually. BTW, and regarding your suggestion above, I just forgot to do it back when I wrote my vimrc while reading the documentation. I missed the prominent note, sorry O: Thanks for your help :) Raúl Núñez de Arenas Coronado -- Linux Registered User 88736 | http://www.dervishd.net It's my PC and I'll cry if I want to... RAmen!
Re: Encoding problem
Hi Scot :) * Scot Becker <[EMAIL PROTECTED]> dixit: > Try removing both the > set encoding and > set fileencoding lines. > > And see if it does what you want. > It should do latin1 still by default (based on your system settings), > and still let you see utf files. If that fails, leave the 'set > encoding', but leave out the 'set fileencoding'. I think that you only > need a fileencoding line when you want to force conversion. Otherwise > 'set encoding' does the trick. > > I had this problem too, when I WANTED to set everything to utf-8. The problem is still the same, because ucs-bom and utf8 are tried before latin1 for ascii-7 files. I want latin1 by default for ascii-7 files. Thanks anyway! :))) Raúl Núñez de Arenas Coronado -- Linux Registered User 88736 | http://www.dervishd.net It's my PC and I'll cry if I want to... RAmen!
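What Raúl asks for here (pure 7-bit files defaulting to latin1 while genuine UTF-8 files are still detected) is not something 'fileencodings' can express by itself, but an autocommand might approximate it. The following is a rough sketch and not a solution proposed in the thread: the group name `AsciiAsLatin1` is made up, and it assumes 'fileencodings' is ucs-bom,utf-8,latin1 as in the original post.

```vim
augroup AsciiAsLatin1
  autocmd!
  " After a file is read: if it was detected as utf-8 without a BOM but
  " holds no character above 0x7F, relabel the buffer as latin1 so that
  " non-ASCII text typed later is written as Latin1.
  autocmd BufReadPost *
        \ if &l:fileencoding ==# 'utf-8' && !&l:bomb
        \    && search('[^\x01-\x7F]', 'nw') == 0
        \ |   setlocal fileencoding=latin1
        \ | endif
augroup END
```

Files that already contain valid multi-byte UTF-8 sequences are left alone, so imported UTF-8 files would keep their encoding; the cost is a full-buffer search on every read, which may be noticeable on very large files.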
Re: Encoding problem
DervishD wrote:
> Hi all :)
>
> My system is latin-1, so I want my files written using latin-1
> encoding. But sometimes I get files in utf-8 encoding, so I set up my
> vim like this:
>
>     set encoding =latin1
>     set fileencoding =latin1
>     set fileencodings =ucs-bom,utf-8,latin1
>
> This last line is causing me big problems. Every time I edit one of
> MY files, not the utf-8 imported files, vim converts it to utf-8,
> because while ucs-bom may fail as an encoding, utf-8 does not.
>
> My problem will be gone if I set "fileencodings" to just latin1, but
> then I won't get utf-8 files automagically converted and presented to
> me in a readable form.
>
> Is there any way to get what I want, that is, to have ALL my files
> edited as latin1 but convert utf-8 files properly without using the
> "++enc" thing?
>
> Thanks a lot in advance :))
>
> Raúl Núñez de Arenas Coronado

Your problem lies in the relation between UTF-8, Latin1 and US-ASCII. Characters 0x00 to 0x7F are represented identically in all three, therefore if a file contains only 7-bit ASCII characters, it won't make any difference whether it is interpreted as US-ASCII, Latin1 or UTF-8: the data will be the same, *represented the same way*, in all three cases.

If you want Vim to explicitly see a file as Latin1 with the 'fileencodings' above, it must contain some character(s) in the range 0x80-0xFF, because otherwise it won't contain anything which is "invalid" as UTF-8. That doesn't create any problem as long as there is only 7-bit data in the file, because Latin1 and UTF-8 are both supersets of US-ASCII and represent all 128 US-ASCII characters the same way; the first time you type something above 0x7F, _then_ you should make sure to use ":setlocal fileencoding=latin1". You can do it in advance by intentionally placing some "upper-ASCII" character (anything above 0x7F) in the file, for instance by underlining the top title with ÷ (a line of "divide-by" signs, 0xF7), then saving the file as Latin1.

Note that in order to edit Unicode files properly, it is more prudent to set 'encoding' to UTF-8; otherwise, if you happen to edit a file containing anything which your current 'encoding' cannot represent, it will get garbled, and Vim won't be able to restore the original value when saving the file. You can do it as follows (in your vimrc):

    " if 'encoding' starts with u or U, Unicode is already set
    if &encoding !~? "^u"
        " avoid clobbering the keyboard encoding
        if &termencoding == ""
            let &termencoding = &encoding
        endif
        " heuristics for existing files
        set encoding=utf-8 fileencodings=ucs-bom,utf-8,latin1
        " defaults for newly-created files
        setglobal bomb fileencoding=latin1
    endif

Best regards,
Tony.
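Tony's point about 7-bit files can be checked from inside Vim. The sketch below reports the current buffer's 'fileencoding' and whether any character above 0x7F is present. The function name AsciiReport and the command name CheckAscii are made up for illustration and are not part of Vim or of the thread.

```vim
" Hypothetical helper; the name AsciiReport is an assumption for illustration.
" search() with the 'n' flag leaves the cursor where it is and 'w' wraps
" around the end of the buffer; it returns 0 when nothing matches.
function! AsciiReport()
  let l:hit = search('[^\x00-\x7F]', 'nw')
  if l:hit
    return 'non-ASCII character on line ' . l:hit
  endif
  return 'only 7-bit ASCII'
endfunction

" Hypothetical command name; shows the detected encoding next to the report.
command! CheckAscii echo 'fenc=' . &fileencoding . ': ' . AsciiReport()
```

Running :CheckAscii on one of the files that shows up as utf-8 should report "only 7-bit ASCII" when the detection was harmless. In that case ":setlocal fileencoding=latin1" can be issued before the first character above 0x7F is typed, as described above.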
Re: Encoding problem
Try removing both the 'set encoding' and 'set fileencoding' lines, and see if it does what you want.

It should do latin1 still by default (based on your system settings), and still let you see utf files. If that fails, leave the 'set encoding', but leave out the 'set fileencoding'. I think that you only need a fileencoding line when you want to force conversion. Otherwise 'set encoding' does the trick.

I had this problem too, when I WANTED to set everything to utf-8.

Scot
Encoding problem
Hi all :)

My system is latin-1, so I want my files written using latin-1 encoding. But sometimes I get files in utf-8 encoding, so I set up my vim like this:

    set encoding =latin1
    set fileencoding =latin1
    set fileencodings =ucs-bom,utf-8,latin1

This last line is causing me big problems. Every time I edit one of MY files, not the utf-8 imported files, vim converts it to utf-8, because while ucs-bom may fail as an encoding, utf-8 does not.

My problem will be gone if I set "fileencodings" to just latin1, but then I won't get utf-8 files automagically converted and presented to me in a readable form.

Is there any way to get what I want, that is, to have ALL my files edited as latin1 but convert utf-8 files properly without using the "++enc" thing?

Thanks a lot in advance :))

Raúl Núñez de Arenas Coronado
Re: Encoding problem - Ubuntu 5.10 & Vim 7.0
I don't know what file to edit to get ISO-8859-1 automatically. When editing a file, I do :set fenc=latin1 or :set enc=latin1. Sometimes it works, sometimes it doesn't. I want something that sets my encoding at startup or, better, according to the characters I'm entering. Help me please.