Hi Tony,

* A.J.Mechelynck on Saturday, September 23, 2006 at 17:35:25 +0200:
> Christian Ebert wrote:
>> * A.J.Mechelynck on Saturday, September 23, 2006 at 09:57:40 +0200:
>>> #1.
>>>     cat file1.utf8.txt file2.latin1.txt file3.utf8.txt > file99.utf8.txt
>>>
>>> will produce invalid output unless the Latin1 input file is actually
>>> 7-bit US-ASCII. This is not a limitation of the "cat" program (which
>>> inherently never translates anything) but a false manoeuver on the part
>>> of the user.
>>
>> Hm, I want illegal stuff, hehe.
>
> Then don't use UTF-8 files.

Yup. Basically I can't edit files with mixed encodings.

What fooled me was that if I do in a utf-8 environment:

$ echo 'Vögel' >file-utf8.txt

and then "illegally":

$ echo 'Vögel' | iconv -f utf-8 -t iso-8859-1 >>file-utf8.txt
$ vim file-utf8.txt

Vim then decides to convert to latin1 automatically for
representation:

#v+
Vögel
Vögel
#v-

Makes sense as Vim considers 'ö' as legal latin1 chars. And
apparently there is no way to force Vim to behave in a less
sensible way ;) like representing the illegal chars with a
placeholder. Blinded by my (dirty workaround) purpose I hoped for
a way to force Vim /not/ to convert.

>>> #2.
>>>     gvim
>>>     :if &tenc == "" | let &tenc = &enc | endif
>>>     :set enc=utf-8 fencs=utf-bom,utf-8,latin1
>>                          ucs-bom
>>>     :e ++enc=utf-8 file1.utf8.txt
>>>     :$r ++enc=latin1 file2.latin1.txt
>>>     :$r ++enc=utf-8 file3.utf-8.txt
>>>     :saveas file99.utf8.txt
>>
>> Then file99.utf8.txt is the same as the one produced with the
>> cat command. Which is actually what I want.
>
> No. It is what the one produced with the cat command should have been, with
> the Latin1 accented characters properly converted to UTF-8.

You are right, of course.

To summarize: I tried to work around a shortcoming in a LaTeX
package (it can't parse utf-8 input). For my purposes the easiest
workaround would have been the dirtiest:

[LaTeX pseudo-code]
#v+
\usepackage[utf8]{inputenc}
\usepackage{soul}% <- the package in question
....
Loads of legal utf-8 text ...
\begingroup\inputencoding{latin1}
\caps{short text in illegal iso-8859-1}
\endgroup
Loads of legal utf-8 text ...
#v-

This does not work in one file if I want to continue to edit the
"loads of legal utf-8 text" in Vim.

In the above simple case I could do:

$ voeg=`echo 'Vögel' | iconv -f utf-8 -t iso-8859-1`; \
  sed -i~ -e "s/\\caps{.*}/\\caps{$voeg}/" file-utf8.tex

to get the result (LaTeX output) I wanted.

Or I could write the group around \caps in a latin1 file and
\input it, or decide to switch to a latin1 environment ...

...
or rewrite the LaTeX-package to accept utf-8 encoding -- which
would be the cleanest solution, but unfortunately over my head
ATM.

So, what I had in mind was too dirty (for Vim).

Thanks for taking the time, Tony.

c
--
_B A U S T E L L E N_ lesen! --->> <http://www.blacktrash.org/baustellen.html>
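
PS: The fallback to latin1 described above can be reproduced outside
Vim. What follows is only a minimal Python sketch, assuming the two
echo commands were run in a utf-8 terminal. The helper
decode_with_fallback is a hypothetical name; it merely imitates the
idea behind 'fileencodings' (the first candidate that decodes the
whole buffer wins) and is not Vim's actual implementation.

```python
# Bytes as the two shell commands would have written them
# (assumption: the terminal itself was UTF-8).
utf8_line = "Vögel\n".encode("utf-8")          # b'V\xc3\xb6gel\n'
latin1_line = "Vögel\n".encode("iso-8859-1")   # b'V\xf6gel\n'
mixed = utf8_line + latin1_line                # the "illegal" file content


def decode_with_fallback(data, encodings=("utf-8", "latin-1")):
    """Hypothetical helper: return (encoding, text) for the first
    candidate encoding that decodes the *whole* buffer without error."""
    for enc in encodings:
        try:
            return enc, data.decode(enc)
        except UnicodeDecodeError:
            continue  # one bad byte rules this encoding out for the whole file
    raise ValueError("no candidate encoding fits")


enc, text = decode_with_fallback(mixed)
print(enc)    # the lone 0xF6 byte is not valid UTF-8, so latin-1 is chosen
print(text)   # the UTF-8 line now shows as two latin-1 characters per 'ö'

# Decoding as UTF-8 with a placeholder for each undecodable byte.
placeholder = mixed.decode("utf-8", errors="replace")
print(placeholder)
```

The last statement is the kind of display I was hoping for: the legal
utf-8 line stays intact and only the stray latin1 byte becomes U+FFFD.
That is Python's "replace" error handler, though, not a Vim option.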