Bill Moseley wrote:
I have a utf-8 file that uses the unicode line separator.  Not
something I've come across very often.  In utf-8 the sequence is:

    0xE2 0x80 0xA8 (e280a8)

In a uxterm, Vim correctly detects (and sets) the file encoding as utf-8
(there's no BOM on the file), but the U+2028 character is shown as an
undisplayable character rather than as a line break.
That is, all the text is displayed as a single line.

Can anyone educate me a bit on the use of the Line Separator character
and if or how it can be supported in Vim?

I'm having other problems -- for example, the Perl script that reads
this file doesn't see the character as a new line (although it does
see it as matching a \s regular expression).




I may be wrong, but IIUC this codepoint plays the same role as the HTML <br> tag: it does not define an "end of line" in the text file which contains it, but it means that, when rendered typographically, as in a browser or a WYSIWYG editor (neither of which Vim is, or tries to mimic), the rendered output must have a linebreak at this point.

IOW: I think it's a feature, not a bug.

You can add a linebreak after every occurrence of that codepoint by using

        :exe "%s/\<Char-0x2028>/" . '\0\r/g'

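Note that I intentionally use double quotes in the first part and single quotes in the second part: inside double quotes Vim expands "\<Char-0x2028>" to the actual character, while inside single quotes the backslashes in \0 (the matched text) and \r (a line break) are left untouched for :s to interpret.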
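
A possible variant, in case you would rather turn each separator into an ordinary line break instead of keeping it: the sketch below assumes that dropping the U+2028 characters from the buffer is acceptable, which may not be true if another program expects them. It uses the \%u pattern item, which matches a character by its hexadecimal codepoint, so no :exe or quoting is needed.

```vim
" Replace every U+2028 with a real line break; the separator itself is dropped.
" \%u2028 matches the character by codepoint, \r in the replacement splits the line.
" The 'e' flag suppresses the error message when the buffer has no match.
:%s/\%u2028/\r/ge
```

The same pattern item works in a search, so /\%u2028 should jump to the next occurrence if you want to inspect them first.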


Best regards,
Tony.
--
It is said that the lonely eagle flies to the mountain peaks while the
lowly ant crawls the ground, but cannot the soul of the ant soar as
high as the eagle?
