Bill Moseley wrote:
I have a UTF-8 file that uses the Unicode line separator. Not
something I've come across very often. In UTF-8 the sequence is:
0xE2 0x80 0xA8 (e280a8)
In a uxterm, Vim correctly detects (and sets) the file encoding as utf-8
(there's no BOM on the file), but the U+2028 character is shown as an
undisplayable character and not rendered as a new line.
That is, all the text is displayed as a single line.
Can anyone educate me a bit on the use of the Line Separator character
and if or how it can be supported in Vim?
I'm having other problems too, such as the Perl script that reads
this file not seeing the character as a new line (although it does
see it as matching a \s regular expression).
I may be wrong, but IIUC this codepoint plays the same role as the HTML <br>
tag: it does not define an "end of line" in the text file which contains it,
but it means that, when rendered typographically, as in a browser or a WYSIWYG
editor (neither of which Vim is, or tries to mimic), the rendered output must
have a linebreak at this point.
IOW: I think it's a feature, not a bug.
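
To illustrate the distinction, here is a small sketch in Python (chosen only
for illustration; the script in the question is Perl, and the sample string
is made up). It suggests that a file-style line reader does not treat U+2028
as an end of line, even though the character counts as whitespace for \s and
is recognised as a line boundary by a Unicode-aware splitter.

```python
import io
import re

LINE_SEPARATOR = "\u2028"  # U+2028 LINE SEPARATOR

# Hypothetical sample text: two "visual" lines joined by U+2028,
# followed by one ordinary newline.
text = "first" + LINE_SEPARATOR + "second\n"

# UTF-8 encoding of the code point: the bytes quoted in the question.
encoded = LINE_SEPARATOR.encode("utf-8")
print(encoded.hex())        # e280a8

# A file-like reader ends lines only at the usual newline characters,
# so U+2028 stays inside a single "line".
file_lines = io.StringIO(text).readlines()
print(len(file_lines))      # 1

# str.splitlines() is Unicode-aware and does split at U+2028.
split_lines = text.splitlines()
print(split_lines)          # ['first', 'second']

# For str patterns, \s matches Unicode whitespace, which includes U+2028.
is_whitespace = re.fullmatch(r"\s", LINE_SEPARATOR) is not None
print(is_whitespace)        # True
```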
You can add a linebreak after every occurrence of that codepoint by using
:exe "%s/\<Char-0x2028>/" . '\0\r/g'
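Note that I intentionally use double quotes in the first part (so that
"\<Char-0x2028>" is expanded to the actual character) and single quotes in
the second part (so that the backslashes in \0\r reach :s unchanged).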
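
If the file also has to be preprocessed outside Vim, the same transformation
(keep each U+2028 and add a real newline after it) can be sketched in Python.
The function name and the sample string are hypothetical, and this is only
one possible approach, not something taken from the script in question.

```python
LINE_SEPARATOR = "\u2028"  # U+2028 LINE SEPARATOR


def add_newline_after_ls(text: str) -> str:
    """Return text with a newline inserted after every U+2028.

    Like the :s command above, this keeps the separator itself
    and only appends an ordinary line break after it.
    """
    return text.replace(LINE_SEPARATOR, LINE_SEPARATOR + "\n")


# Hypothetical sample input with two separators and no real newlines.
sample = "one" + LINE_SEPARATOR + "two" + LINE_SEPARATOR + "three"
converted = add_newline_after_ls(sample)
print(len(converted.split("\n")))  # 3
```

To drop the separator instead of keeping it, replace it with "\n" alone.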
Best regards,
Tony.
--
It is said that the lonely eagle flies to the mountain peaks while the
lowly ant crawls the ground, but cannot the soul of the ant soar as
high as the eagle?