Matthew Winn wrote:
On Fri, 02 Mar 2007 20:24:44 +0100, "A.J.Mechelynck"
<[EMAIL PROTECTED]> wrote:

Bill Moseley wrote:
I have a UTF-8 file that uses the Unicode line separator.  Not
something I've come across very often.  In UTF-8 the sequence is:

    0xE2 0x80 0xA8 (e280a8)

In a uxterm, Vim correctly detects (and sets) the file encoding as
utf-8 (there's no BOM in the file), but the U+2028 character is shown
as an undisplayable character rather than as a line break.
That is, all the text is displayed as a single line.
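
The byte sequence quoted above can be checked outside Vim. The following is only a minimal sketch in Python (the choice of language is an assumption; nothing in this thread uses it), and it confirms that U+2028 and the three bytes E2 80 A8 are the same thing in UTF-8:

```python
# U+2028 LINE SEPARATOR and its UTF-8 encoding.
LSEP = "\u2028"

encoded = LSEP.encode("utf-8")
print(encoded.hex())  # e280a8

# Decoding the three bytes gives back the single code point.
decoded = bytes([0xE2, 0x80, 0xA8]).decode("utf-8")
print(decoded == LSEP)  # True
```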

Can anyone educate me a bit on the use of the Line Separator character
and whether or how it can be supported in Vim?

I may be wrong, but IIUC this codepoint plays the same role as the HTML <br> tag: it does not define an "end of line" in the text file that contains it. It means that, when the text is rendered typographically, as in a browser or a WYSIWYG editor (neither of which Vim is or tries to mimic), the rendered output must have a linebreak at this point.

IOW: I think it's a feature, not a bug.

You can add a linebreak after every occurrence of that codepoint by using

        :exe "%s/\<Char-0x2028>/" . '\0\r/g'

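Note that I intentionally use double quotes in the first part and single quotes in the second part: the double quotes let Vim expand \<Char-0x2028> into the actual character, while the single quotes pass \0 (the matched text) and \r (a line break) to :s unchanged.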
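
For readers who want to see the effect of that substitution outside Vim, here is a rough Python sketch of the same idea: keep every U+2028 and append an ASCII newline after it. The function name add_newline_after_lsep is made up for illustration, and "\n" stands in for whatever Vim writes as the end of line.

```python
LSEP = "\u2028"  # U+2028 LINE SEPARATOR


def add_newline_after_lsep(text: str) -> str:
    """Keep every U+2028 and append an ASCII newline after it."""
    return text.replace(LSEP, LSEP + "\n")


sample = "first" + LSEP + "second" + LSEP + "third"
result = add_newline_after_lsep(sample)

# Split on "\n" only; str.splitlines() would also split on U+2028 itself.
lines = result.split("\n")
print(len(lines))  # 3

# Nothing is lost: dropping the added newlines restores the original text.
restored = result.replace(LSEP + "\n", LSEP)
print(restored == sample)  # True
```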

According to http://www.unicode.org/reports/tr13/tr13-9.html the
correct way to treat U+2028 and U+2029 (paragraph separator) is to
translate them into the platform's standard sequence for representing
the end of a line. (What it actually says is that if the purpose of
the line break is unambiguously known -- that is, whether it is the
end of a line or the end of a paragraph -- then the corresponding
Unicode character should be used. But Vim is a text editor and knows
nothing of paragraphs, so I would expect both these characters to be
translated into the platform's end-of-line representation.)

However, this would be lossy, so if this were to be implemented I
suspect an option would be required for the benefit of people who want
to edit Unicode text without losing the distinction between line and
paragraph endings.
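
To make the "lossy" point concrete, here is a small Python sketch, offered only as an illustration and not as a proposal for how Vim should do it. The helper name to_platform_newlines and the default "\n" end-of-line are assumptions. Once both separators are translated to the same end-of-line sequence, two different inputs become indistinguishable:

```python
LSEP = "\u2028"  # U+2028 LINE SEPARATOR
PSEP = "\u2029"  # U+2029 PARAGRAPH SEPARATOR


def to_platform_newlines(text: str, eol: str = "\n") -> str:
    """Translate both Unicode separators into one end-of-line sequence."""
    return text.replace(LSEP, eol).replace(PSEP, eol)


line_break = "one" + LSEP + "two"
para_break = "one" + PSEP + "two"

print(line_break == para_break)  # False: the inputs differ
print(to_platform_newlines(line_break) == to_platform_newlines(para_break))  # True
```

After the translation there is no way to tell which separator the file originally contained, which is why an option to preserve the characters would be needed.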


That's why I suggested adding an ASCII linebreak after the LSEP, not replacing it.

Best regards,
Tony.
--
There was a young lady from Maine
Who claimed she had men on her brain.
        But you knew from the view,
        As her abdomen grew,
It was not on her brain that he'd lain.
