Hi Ben
thank you for your detailed reply and friendly advice.  Sorry if my message
was misunderstood: my intent was to add another view, which might eventually
help Pavel and Bram to save work time or to extend the scope of solved
problems.

On 2015-06-24 Wednesday at 07:49 -0700 Ben Fritz wrote:
> On Wednesday, June 24, 2015 at 5:12:01 AM UTC-5, Roland Eggner wrote:
> > There are more cases, where vim tries to fix “errors” or “irregularities”
> > of files and thereby damages them, unless “++binary” has been included in
> > the reading command.  Just two examples:
> >
> > (1) viminfo files with register contents resulting from alternating
> >     fileencodings, e.g. utf-8, latin1, latin9:  When viewing or editing
> >     such viminfo files, including “++binary” in the reading command avoids
> >     data damage.
>
> First of all, .viminfo files are not really intended to be hand-edited,
> although the format is simple enough that it's certainly possible most of the
> time.

Maybe I am a “villain” for calling vim via a wrapper script which edits
.viminfo files.  This gives me a buffer list which keeps entries until the
referenced files disappear from the filesystem or their atime is older than
3 weeks.  It also decreases the frequency of losing the text marks `a - `z
through deletion of jump lists.

> However, you're expecting something fairly unreasonable. Vim has no way of
> marking different regions of a file as having different encodings. In fact I
> am not aware of any text editors that DO allow this. How does the editor
> know what encoding to apply to any new text? How are the regions delimited,
> especially if the delimiter could have different representations in different
> encodings? In the case that you have multiple encodings in a file, the file
> really and truly *IS* a binary file.

My intent was exactly for readers to draw this conclusion.
If line-specific encodings could be implemented properly, they would be of
little use in the case of my example (2): the patch files would appear
redundant.

Bram added this probably related entry to the todo list more than 5 years ago:

> When a register contains illegal bytes, writing viminfo in utf-8 and reading
> it back doesn't result in utf-8.  (Devin Bayer)

The last sentence of “:help viminfo-encoding” appears to hint that different
encodings for different lines might be intentional under certain
circumstances:

> …
>         :set viminfo+=c
> Vim will then attempt to convert the text in the viminfo file from the
> 'encoding' value it was written with to the current 'encoding' value.
> This requires Vim to be compiled with the |+iconv| feature.  Filenames
> are not converted.

> But why do you have multiple encodings in the file? The encoding of text in
> the _viminfo file should only depend on the 'encoding' option of Vim, it
> should not depend on the fileencoding option of the various files. Are you
> setting 'encoding' differently as you open files in different files? You
> should not be doing that...you should keep 'encoding' set to utf-8 and change
> 'fileencoding' as needed.

Yes, in theory it should, but practice differs, although I have been doing
exactly what you recommend for many years.  Detailed bug reports on this
topic must wait until I find more spare time.

> > How can I specify the binary attribute, when vim tries to restore this
> > register contents in a later session?  vim-7.4 appears to ignore the
> > line “*encoding=utf-8” on reading of viminfo files.
>
> Vim does not, by default, detect any encoding from any text in the file,
> except for reading a BOM to detect certain Unicode encodings.

If you distinguish the _guessing_ of encodings performed in
src/fileio.c:readfile() from the _detection_ of declared encodings: fully
agreed.

> For that, you need a plugin.
> I am fond of Autofenc:
> http://www.vim.org/scripts/script.php?script_id=2721

Thank you for this reference.  I will check it the next time I update my vim
installations (not in the near future).

> > (2) The patch file resulting from the diff between old and new files after
> >     a command similar to “iconv -f ISO-8859-1 -t utf-8 …” usually needs to
> >     be treated as binary.  vim damages such patch files on writing, unless
> >     the reading command includes “++binary”.
> >
>
> See above. Such a file *is* a binary file, it cannot be anything else.

My intent was exactly for readers to draw this conclusion.

> If you forget to read it with ++binary in the first place, you can always ":e
> ++binary" after loading.
>
> > A concept which can be reused _consistently_ for the solution of all
> > problems of this class probably can save a lot of future work time.
> >
>
> I disagree that these problems are in the same class at all, …

Two points of disagreement (here and in the last-but-one paragraph below)
alongside agreement on every other point: perhaps we can live with that?

> … but the option for preventing such issues already exists: ":e ++binary".
> The binary option tells Vim not to mess with any of the bytes in the file.
> That's what you want, right? Is something wrong with this option?

The option is ok for me.  Vim should just use it _automatically_ whenever an
invalid utf-8 sequence occurs in a file being read.  This would protect the
user from data loss much better than the heuristic “if not valid utf-8 then
latin1” resulting from the default value of option “fileencodings”.
Conversion from any commonly used multibyte encoding to any extended latin
encoding is always a partial data loss, _even_ if all used characters
actually have codepoints in the latin encoding.  For this reason e.g. the
“recode” utility in such cases warns “ambiguous output” and refuses to
perform the conversion, unless option “--force” is used.
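
To make the proposed policy concrete, here is a minimal sketch in Python.  It
is not Vim code and does not describe how readfile() works; the function
names choose_read_mode and latin1_fallback_accepts are hypothetical and exist
only for this illustration.  It shows the two points made above: an invalid
utf-8 sequence can be detected reliably, while a latin1 fallback accepts
every byte sequence and therefore can never signal a wrong guess.

```python
def choose_read_mode(data: bytes) -> str:
    """Hypothetical policy: 'utf-8' if data is valid utf-8, else 'binary'."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        # Invalid utf-8 sequence: do not guess an encoding, keep the bytes.
        return "binary"
    return "utf-8"


def latin1_fallback_accepts(data: bytes) -> bool:
    """Report whether decoding as latin1 succeeds (it does for any bytes)."""
    try:
        data.decode("latin-1")
    except UnicodeDecodeError:
        return False
    return True


# One line mixing utf-8 "é" (0xC3 0xA9) with latin1 "é" (0xE9).
mixed = b"caf\xc3\xa9 caf\xe9\n"
print(choose_read_mode(mixed))         # binary
print(latin1_fallback_accepts(mixed))  # True
```

With the mixed line above, the sketch chooses “binary”, whereas a latin1
fallback would accept the same bytes without any complaint.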

Similarly, and in line with its many other efforts to protect users from data
loss, vim should abort and give the following warning when the execution of
a command would involve conversion from a multibyte encoding to an extended
latin encoding:

    “Encoding conversion from … to latinX would cause data loss.  Aborted.
    If you are absolutely sure, add option "I-want-to-lose-some-data" and
    retry.”

When I was a less experienced user many years ago, this would have helped me
much more than the mere mention of data loss in “:help 'fileencodings'”.

> > Transparent decompression and compression of *.gz *.bz2 … files is
> > implemented with certain autocommands.  This autocommands require much
> > less lines of code than the patch proposed by Pavel.  Why not solving the
> > “missing trailing EOL” problem with similar autocommands?  Less lines of
> > code means less time to wait until Bram can merge this and other patches.
> >
>
> The autocmds to prevent messing with the EOL are here:
> http://vim.wikia.com/wiki/Preserve_missing_end-of-line_at_end_of_text_files
>
> > Are the more lines of code of Pavels patch outweight by better reusability,
> > compared to autocommands?
> >
>
> ":set respecteol" is much easier for the end user than installing a plugin or
> writing a few dozen lines of vimscript to work around the editor's lack of an
> option to turn off an undesired behavior.

Working with *.csproj files is a topic for software engineers, not for “end
users”.  The missing EOL problem is and will remain rare, because rareness is
the precondition to benefit from breaking a standard.  Bacteria learned this
several billions of years ago.  Microsoft apparently just reuses this
knowledge for its business policy.

> … And probably less fragile as well, since it will be baked into the editor
> code. And autocmds can be bypassed or interfere with each other.
> And see the discussion on that link about weird behavior with respect to
> undo for the autocmds...that's not very comforting that we never figured
> that out.

“… less fragile …” seems to me a commonplace rather than an argument:
Driving faster _always_ increases risk and requires additional attention.
You can use seat belts or go on foot, as you prefer.  Software engineering is
similar.  And the undo discussion does not matter: after an _automatically_
added EOL there is no desire for an undo.

If he does not like autocommands for this task, maybe the second alternative
solution proposed in my reply to Pavel in this thread can provide him a
satisfying solution with less long-term effort.

-- 
Best regards,
Roland Eggner