On 31/03/2012 18:47, john Culleton wrote:
> I note that when importing text prepared in an external editor
> like Gvim every end of line is interpreted by Scribus text
> importer as an end of paragraph marker. I have set up Gvim to
> automatically generate an EOL after I type 72 characters. For a
> paragraph end I insert a blank line, following the TeX
> convention. But when I import the text file into Scribus every
> line becomes a paragraph. 

This is most likely caused by gvim using line feed (0x0A) or carriage
return (0x0D) as its line separator:
- this approach is ancient and non-portable, as the interpretation of
  those characters is entirely platform dependent
- this method is not Unicode compliant: for Unicode, both characters are
  paragraph separators, with the exception that the 0x0D0A sequence must
  be interpreted as a single paragraph separator instead of two
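
As a rough illustration of the second point, here is a minimal Python
sketch that splits text on the legacy newline sequences while counting
0x0D0A as a single separator. The name split_on_nlf is made up for this
example; it does not describe how the Scribus importer is written.

```python
import re

# CR+LF is listed first so the pair is consumed as one separator
# instead of being matched as a CR followed by an LF.
NLF = re.compile(r"\r\n|\r|\n")

def split_on_nlf(text):
    """Split text at every CR+LF, CR or LF (hypothetical helper)."""
    return NLF.split(text)

print(split_on_nlf("one\r\ntwo\nthree\rfour"))
```

Each of the three separators produces exactly one break, so the sample
string comes back as four pieces.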

Unicode has defined unambiguous and portable characters to use as line
and paragraph separators:
- line separator: 0x2028
  (http://www.fileformat.info/info/unicode/char/2028/index.htm)
- paragraph separator: 0x2029
  (http://www.fileformat.info/info/unicode/char/2029/index.htm)
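
For anyone who wants to check these two characters, the short Python
sketch below looks them up in the Unicode character database and
compares them with the legacy control characters. The helper name
describe is only an illustration.

```python
import unicodedata

def describe(ch):
    """Return (code point, general category, name) for one character."""
    # Control characters such as LF and CR have no Unicode name, so a
    # default is supplied to avoid a ValueError.
    name = unicodedata.name(ch, "(no name)")
    return (f"U+{ord(ch):04X}", unicodedata.category(ch), name)

for ch in ("\u2028", "\u2029", "\n", "\r"):
    print(describe(ch))

# Python's own str.splitlines() treats both separators as line boundaries.
print("a\u2028b\u2029c".splitlines())
```

The two separators have their own general categories (Zl and Zp), while
line feed and carriage return are plain control characters (Cc).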

In gvim's case, it should be using the 0x2028 character and not the
carriage return.

> if I delete all EOL markers in Gvim then Gvim
> will still wrap the lines visually but may divide lines in the middle
> of a word to accomplish the visual wrap. That's a bit clumsy.

This only underlines that gvim uses a basic, non-Unicode-compliant
line-breaking algorithm.

> 
> HTML does not so interpret an EOL for example. And the blank line
> convention is followed when typing emails.

For HTML, line feed and carriage return are equivalent to a space, but
this is specific to HTML. As most mail programs implement the Unicode
line-breaking algorithm, text wraps without problems.

> 
> Is there a setting somewhere that prevents Scribus from
> interpreting an EOL as an end of paragraph marker when importing text? 

No, because here we follow the Unicode guidelines on the interpretation
of characters in a word-processing context. See section 4.2, point 2, at
the following link:

http://unicode.org/standard/reports/tr13/tr13-5.html#Interpreting%20characters%20in%20text

"In word processing, interpret any NLF the same as PS."

That means the 0x0D0A, 0x0A and 0x0D sequences are to be interpreted as
paragraph separators.
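
Given that rule, one possible workaround is to pre-process the text
before importing it, so that each paragraph sits on a single line. The
Python sketch below assumes the TeX-style convention described in the
original message (hard-wrapped lines, blank line between paragraphs);
the function name unwrap_paragraphs is hypothetical and this is not a
Scribus feature.

```python
import re

def unwrap_paragraphs(text):
    """Join hard-wrapped lines; keep one newline per paragraph."""
    # Normalise CR+LF and lone CR to LF first.
    text = re.sub(r"\r\n|\r", "\n", text)
    # One or more blank (or whitespace-only) lines end a paragraph.
    paragraphs = re.split(r"\n(?:[ \t]*\n)+", text.strip())
    joined = []
    for para in paragraphs:
        # Replace the line breaks inside a paragraph with single spaces.
        lines = [line.strip() for line in para.split("\n")]
        joined.append(" ".join(line for line in lines if line))
    return "\n".join(joined)

sample = "first line\nsecond line\n\nnext para\r\ncontinues\n"
print(unwrap_paragraphs(sample))
```

After this step every remaining line break marks a real paragraph end,
which matches how the importer interprets them.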

However, as far as I remember, we currently do not interpret the new
Unicode line and paragraph separators. This is something we'll have to
fix at some point. But gvim should be using the proper line separator
too.

Jean

