On Jan 5, 2006, at 18:48, Andreas L Delmelle wrote:
<snip />
To summarize this thread (it has taken long enough :-))
I thought it over a bit more, and what I'm currently working on (and
will most likely finish during the weekend) is the following:
1) Basically keep the algorithm the way I recently altered it, but
containing some additional processing for trailing inline FOs that
end with a sequence of white-space. Determining this last bit is easy
enough, since it just means that XMLWhiteSpaceHandler.inWhiteSpace
will still be true after handleWhiteSpace(). At the end of the block, we
will do one more pass over all those trailing inlines, if any.
IMO, in the vast majority of use-cases there will be either zero, one
or at most two of those, but theoretically this could be any
number... If there are any, then if white-space-collapse has the
default value of "true" there will be only one trailing white-space
character left at that point, so this additional bit of processing
will cost virtually nothing.
2) Simplify the CharIterator structure, in the sense that we'll
only need iterators over FOText and Character. Unless layout needs
access to the iterators, I think charIterator() can be pushed down to
be specific to FObjMixed, and then the overrides of this method can
be removed from all other FOs apart from FOText and Character. For
1), it could turn out handy if I add the possibility to iterate
backwards until the last non-white-space is encountered...
3) Exclude markers (and their descendants) from white-space handling
during refinement, for the mentioned reasons:
* retrieve-marker's ancestor's white-space properties govern the
treatment in this case
* possibly page-break context is needed when dealing with
alternating static-contents
* retrieve-markers with retrieve-boundary="document"
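
To make 1) and the backwards iteration mentioned under 2) a bit more
concrete, here is a rough sketch in plain Java. It is not FOP code: the
class and method names are made up for illustration, inlines are
modelled as plain strings, all XML white-space is treated alike
(linefeed-treatment is ignored), and collapsing across inline
boundaries is left out. It only shows the idea of remembering inlines
that end in white-space and revisiting them once at the end of the
block.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; none of these names exist in FOP. */
final class TrailingWhiteSpaceSketch {

    /** XML white-space: space, tab, linefeed, carriage return. */
    static boolean isXmlWhiteSpace(char c) {
        return c == ' ' || c == '\t' || c == '\n' || c == '\r';
    }

    static boolean hasNonWhiteSpace(CharSequence text) {
        for (int i = 0; i < text.length(); i++) {
            if (!isXmlWhiteSpace(text.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    /**
     * Collapses every run of white-space to a single space, in place.
     * Returns true if the text ends in white-space, i.e. the handler
     * would still be "in white-space" afterwards.
     */
    static boolean collapse(StringBuilder text) {
        boolean inWhiteSpace = false;
        int out = 0;
        for (int in = 0; in < text.length(); in++) {
            char c = text.charAt(in);
            if (isXmlWhiteSpace(c)) {
                if (!inWhiteSpace) {
                    text.setCharAt(out++, ' ');
                }
                inWhiteSpace = true;
            } else {
                text.setCharAt(out++, c);
                inWhiteSpace = false;
            }
        }
        text.setLength(out);
        return inWhiteSpace;
    }

    /** Walks backwards to the last non-white-space and cuts off the rest. */
    static void stripTrailing(StringBuilder text) {
        int end = text.length();
        while (end > 0 && isXmlWhiteSpace(text.charAt(end - 1))) {
            end--;
        }
        text.setLength(end);
    }

    /** Treats the inlines of one block in order, then revisits the trailing ones. */
    static List<String> handleBlock(List<String> inlines) {
        List<StringBuilder> treated = new ArrayList<>();
        List<StringBuilder> pending = new ArrayList<>();
        for (String inline : inlines) {
            StringBuilder sb = new StringBuilder(inline);
            boolean endsInWhiteSpace = collapse(sb);
            if (hasNonWhiteSpace(sb)) {
                // Real text follows the earlier candidates: they are not trailing.
                pending.clear();
            }
            if (endsInWhiteSpace) {
                pending.add(sb);
            }
            treated.add(sb);
        }
        // The one extra pass at the end of the block.
        for (StringBuilder sb : pending) {
            stripTrailing(sb);
        }
        List<String> result = new ArrayList<>();
        for (StringBuilder sb : treated) {
            result.add(sb.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(handleBlock(Arrays.asList("Hello  ", "world \n ", "  ")));
    }
}
```

With collapsing already done in the first pass, each pending inline
has at most one white-space character left to remove, which is why the
extra pass is cheap.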
Point 3) of course means the recently enabled marker_bug.xml testcase will
have to be disabled again until we find a way to tackle this in
layout. I had thought of using XMLWhiteSpaceHandler itself for this,
but the tricky part is that, once a Marker (and its descendants) has
been white-space-treated, the stripped white-space is permanently
gone, while that same Marker can be retrieved again in a different
context, where other white-space properties may apply.
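
A toy illustration of why treating the Marker up front is lossy (again
plain Java with made-up names, not FOP code, and not a proposal for
how layout should do it): once the marker text has been collapsed, a
later retrieval into a context with white-space-collapse="false"
cannot get the original spacing back, whereas applying the treatment
to the untouched text at retrieval time can.

```java
/** Hypothetical illustration only; none of these names exist in FOP. */
final class MarkerWhiteSpaceSketch {

    /** Collapses each run of XML white-space to a single space. */
    static String collapse(String text) {
        return text.replaceAll("[ \\t\\r\\n]+", " ");
    }

    /**
     * Returns the marker text as seen by one retrieving context.
     * The flag stands in for that context's white-space-collapse value.
     */
    static String retrieve(String markerText, boolean collapseInContext) {
        return collapseInContext ? collapse(markerText) : markerText;
    }

    public static void main(String[] args) {
        String original = "a   b";
        String treatedDuringRefinement = collapse(original);

        // Stripped too early: the preserving context can no longer see "a   b".
        System.out.println("[" + retrieve(treatedDuringRefinement, false) + "]");

        // Treated per retrieval: each context gets what its properties ask for.
        System.out.println("[" + retrieve(original, false) + "]");
        System.out.println("[" + retrieve(original, true) + "]");
    }
}
```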
[end-of-thread, I hope ;-)]
Cheers,
Andreas