On Feb 6, 2006, at 17:04, Luca Furini wrote:
Hi Manuel / Luca,
Manuel Mall wrote:
IMO yes, there can be a break, and no, only the space needs to be
removed. Again, the argument is that nbsp is not whitespace as per the
XSL-FO definition and need not be removed.
What makes you think that both the nbsp and the space need to be
removed around a FOP-generated line break?
Oops, I forgot to add an important condition: if the user
explicitly states that the nbsp must be discarded around a line break:
<fo:inline suppress-at-line-break="suppress"> </fo:inline>
Oops, typo? suppress-at-line-break is a non-inherited property, only
applicable to fo:character :-)
Well, the more I look at this, the more it seems unlikely to ever
happen ... we are probably having a highly theoretical
disquisition! :-)
<fo:character character=" " suppress-at-line-break="suppress" />
followed by a space is indeed very theoretical.
So is (another alternative):
<fo:inline suppress-at-line-break="suppress">
<fo:character character=" "
suppress-at-line-break="inherit" /> </fo:inline>
OTOH, if we can make the algorithm work in these exotic cases, then
the commonly used scenarios will be a cake-walk. :-)
This does, in any case, shed some different light on the notion of
'pretty printing whitespace', since currently --at least that was my
understanding of the discussions, and that's what I worked towards--
a fo:character is considered the same as a regular character, in that
fo:characters representing XML whitespace are subject to whitespace-
removal... Yet, one can arguably defend the idea that any
*fo:*character is inserted for *XML* pretty printing purposes, no?
Should this change be reverted then?
[Maybe partly, because suppose:
<fo:block>
<fo:character character=" " suppress-at-line-break="retain" />
...
Currently, the fact that it is a fo:character is not known when
running this through the algorithm. The CharIterators deal with the
characters. The XMLWhiteSpaceHandler makes a decision based purely on
the value of the character property. It is agnostic to the suppress-
at-line-break property's value... I myself would tend to use a non-
breaking space in this case, since it escapes the whitespace
handling, but it is a theoretical possibility. :-)
Another alternative would be to introduce a member to the
CharIterators...
Something like isSuppressible(), which would return true if:
  ( the current element is a regular character
    and it has codepoint U+0020 )
  or ( the current element is a fo:character
       and
       ( ( the value of its character property is codepoint U+0020
           and suppress-at-line-break="auto" )
         or ( suppress-at-line-break="suppress" ) ) )
As such, (white-space) character removal during refinement could
operate on this basis, and already resolve such issues at that stage.
The current approach is still not 100% correct anyway...]
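To make the isSuppressible() idea above a bit more concrete, here is a
rough, self-contained Java sketch. It is not FOP code: the class name
SuppressibleCheck, the enum and the two overloads are made up purely for
illustration, and it assumes that "inherit" has already been resolved to
one of the other three values before the check is made.

```java
// Hypothetical sketch only; none of these names exist in FOP.
final class SuppressibleCheck {

    /** Resolved values of suppress-at-line-break ("inherit" assumed resolved). */
    enum SuppressAtLineBreak { AUTO, SUPPRESS, RETAIN }

    static final char SPACE = 0x0020;
    static final char NBSP = 0x00A0;

    /** Regular text character: suppressible only if it is U+0020. */
    static boolean isSuppressible(char c) {
        return c == SPACE;
    }

    /**
     * fo:character: suppressible if it is U+0020 with "auto",
     * or if "suppress" was requested explicitly, whatever the character.
     */
    static boolean isSuppressible(char c, SuppressAtLineBreak salb) {
        return (c == SPACE && salb == SuppressAtLineBreak.AUTO)
                || salb == SuppressAtLineBreak.SUPPRESS;
    }

    public static void main(String[] args) {
        // An nbsp in an fo:character with an explicit "suppress".
        System.out.println(isSuppressible(NBSP, SuppressAtLineBreak.SUPPRESS));
        // A space in an fo:character with an explicit "retain".
        System.out.println(isSuppressible(SPACE, SuppressAtLineBreak.RETAIN));
    }
}
```

In a real CharIterator the two cases would presumably collapse into a
single no-argument method that knows what the current element is; the
overloads here only keep the sketch independent of any iterator class.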
Anyway, I was still not sure whether there could be a break so I
looked back at the Unicode Annex #14.
<snip />
So, it seems there could be a break between SPACE and NBSP (with
NBSP starting the next line), but not between NBSP and SPACE. Can
we say this is settled?
Yes! Definitely. We're looking for UAX#14 'compliance' as well here.
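Just to pin the conclusion down, a tiny Java sketch of the rule as stated
above, restricted to these two characters. The class and method names are
invented for this mail, and the SPACE/SPACE and NBSP/NBSP results are my
own reading of UAX#14 (no break before a space, no break after a
non-breaking space) rather than something we discussed.

```java
// Hypothetical sketch only; covers SPACE and NBSP, nothing else.
final class SpaceNbspBreak {

    static final char SPACE = 0x0020; // ordinary space
    static final char NBSP = 0x00A0;  // no-break space

    /** True if a line break is allowed between 'before' and 'after'. */
    static boolean breakAllowedBetween(char before, char after) {
        if (!isSpaceOrNbsp(before) || !isSpaceOrNbsp(after)) {
            throw new IllegalArgumentException("only SPACE and NBSP are handled");
        }
        // A break is possible only after a SPACE and before an NBSP,
        // in which case the NBSP starts the next line.
        return before == SPACE && after == NBSP;
    }

    private static boolean isSpaceOrNbsp(char c) {
        return c == SPACE || c == NBSP;
    }

    public static void main(String[] args) {
        System.out.println(breakAllowedBetween(SPACE, NBSP));
        System.out.println(breakAllowedBetween(NBSP, SPACE));
    }
}
```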
My 2 cents.
Cheers,
Andreas