On Feb 6, 2006, at 08:17, Manuel Mall wrote:
[ME:]
<snip/>
A preserved carriage return can be treated the same way as a
linefeed, under the very exceptional condition that it survives
white-space handling:
* white-space-treatment="ignore-if-*"
* the CR does not follow/precede a linefeed
* it is the first character in a sequence of whitespace, so
it survives white-space-collapse
Shouldn't a CR always survive whitespace handling?
Not always:
If white-space-treatment="preserve" then any XML whitespace other
than a linefeed is converted into a normal space. IMO, the editors
put it this way because of the possibility of Windows-specific
line-endings, where a CR is followed by a linefeed.
For starters, it is fairly difficult to get a CR out of an XML parser.
Difficult? It's simply a characters event, just like any other...
Only if the CR is hidden in an entity reference can it survive.
Also, as Simon pointed out in some other contribution, whitespace
handling is designed to deal with whitespace introduced by pretty
printing and readable XML layout. A CR preserved by the XML parser
certainly does not fall into that category.
Oh yes it does... Remember that not all our users are unix/linux-based,
which means for Windows users, you're likely to get the sequence
'&#x0D;&#x0A;' as line-terminator, while Mac-users saving a source
file with native line-endings will simply get a '&#x0D;'.
(UTF-8 encoding is recommended, but not enforced... An XML file can
be any encoding the parser supports on top of the UTF-8 minimum.)
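Whether a literal CR in the source file actually reaches the
application is worth checking against the parser in use, since the
XML Recommendation (section 2.11, end-of-line handling) asks parsers
to normalize line breaks before passing character data on. A minimal
sketch of such a check, using Python's standard ElementTree parser
purely as a stand-in (FOP's own parser setup is not shown here, and
the element name "block" is only an illustration):

```python
import xml.etree.ElementTree as ET


def parsed_text(xml_bytes: bytes) -> str:
    """Return the root element's character data as the parser reports it."""
    return ET.fromstring(xml_bytes).text


# Literal line endings, as an editor on Windows or classic Mac OS would save them.
literal_crlf = parsed_text(b"<block>one\r\ntwo</block>")
literal_cr = parsed_text(b"<block>one\rtwo</block>")

# The same CR written as a numeric character reference.
char_ref_cr = parsed_text(b"<block>one&#x0D;two</block>")

for label, value in [("CRLF", literal_crlf),
                     ("CR", literal_cr),
                     ("&#x0D;", char_ref_cr)]:
    print(label, repr(value))
```

Comparing the three printed values shows which form of CR the layout
code can expect to see from that particular parser.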
A carriage-return can survive white-space-handling, for instance, in
the following case (suppose Mac-encoding):
<fo:block>
First line, then a CR&#x0D; some spaces, and more text
</fo:block>
The CR (which isn't necessarily a numeric character reference, but
could be just the byte '0D') is not converted into a space
(white-space-treatment="ignore-if-surrounding-linefeed").
It does not precede or follow a linefeed.
It is the first character in a sequence of whitespace, so no matter
what the value of white-space-collapse, it will survive...
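The three conditions above can be written down as a small predicate.
This is only a sketch of the reasoning in this mail, not FOP code and
not a full implementation of the XSL-FO white-space properties; the
function name cr_survives and its arguments are made up for the
illustration:

```python
XML_WHITESPACE = frozenset("\t\n\r ")


def cr_survives(text: str, i: int, treatment: str) -> bool:
    """Apply the three conditions from this thread to the CR at text[i]."""
    if text[i] != "\r":
        raise ValueError("text[i] is not a carriage return")

    # 1. Only the ignore-if-* values leave a lone CR alone; per the
    #    discussion above, "preserve" converts it into a plain space.
    if not treatment.startswith("ignore-if-"):
        return False

    before = text[i - 1] if i > 0 else ""
    after = text[i + 1] if i + 1 < len(text) else ""

    # 2. The CR must not precede or follow a linefeed.
    if before == "\n" or after == "\n":
        return False

    # 3. It must be the first character of its whitespace run, so that
    #    white-space-collapse keeps it and only drops what follows.
    return before not in XML_WHITESPACE


sample = "First line, then a CR\r some spaces, and more text"
print(cr_survives(sample, sample.index("\r"), "ignore-if-surrounding-linefeed"))
```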
I am also not aware that the XSL-FO spec mentions CR as falling
under whitespace. IMO for whitespace handling CR is just a non
whitespace character.
Nope, it does fall into the category of XML whitespace. There are
exactly four of those: &#x09; (tab), &#x0A; (linefeed), &#x0D;
(carriage-return) and &#x20; (space). If you don't believe me, it's
indeed not in the XSL-FO Rec, but you might want to check the XML
Recommendation...
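For reference, the four characters match the S production of the XML
Recommendation (#x20 | #x9 | #xD | #xA). A minimal sketch of the
classification, with a hypothetical helper name:

```python
# Code points of the four XML whitespace characters and their usual names.
XML_WHITESPACE_NAMES = {
    0x09: "tab",
    0x0A: "linefeed",
    0x0D: "carriage-return",
    0x20: "space",
}


def is_xml_whitespace(ch: str) -> bool:
    """True if ch is one of the four XML whitespace characters."""
    return ord(ch) in XML_WHITESPACE_NAMES


for code, name in sorted(XML_WHITESPACE_NAMES.items()):
    print(f"&#x{code:02X}; ({name})")
```

Note that other Unicode spaces, such as the no-break space U+00A0,
are not XML whitespace under this definition.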
So, we only need to consider what fop layout should do if it
encounters a CR. I would say, keep it simple, throw it away and log
a warning.
Now, what about a tab character under the same circumstances? Do we
use an elastic width of X spaces optimum, where X is purely
conventional?
Similar considerations as for CR apply to TAB.
...
Cheers,
Andreas