Manuel Mall wrote:
> I added a boolean attribute in SpaceArea that is true for adjustable
> spaces (at the moment it is not used, but I will fix it soon).
Why would you envisage this is required by the renderers?
I can only speak of the PDFRenderer, as I don't know the other formats
very well.
Remember that this started from bug 36238: in the PDF format, the
multi-byte "normal space" character is not affected by the Tw operator
(which sets the word spacing), so we needed a way to apply this
adjustment in another manner. The PDF format allows specifying a
horizontal offset between fragments of text, so that their distance can be
increased or reduced: if the font is multi-byte, we can use this feature to
adjust the spaces.
But we need to know which spaces can be adjusted, and which cannot. If we
don't want to duplicate the "space recognition" logic in every renderer,
the SpaceAreas must simply carry a boolean value stating whether the space
is adjustable, so that the renderers won't need to look at the space and
decide for themselves.
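
To make this concrete, here is a rough sketch of how a PDF renderer could
use such a flag together with the TJ operator. The names (TjSketch, Piece,
buildTj) are made up for illustration and are not FOP's actual classes.
The sketch assumes the adjustment is given in millipoints and the font size
in points, and it skips string escaping and the hex encoding a multi-byte
font would need. TJ numbers are expressed in thousandths of a text space
unit and are subtracted from the current position, so widening a space
needs a negative value.

```java
import java.util.ArrayList;
import java.util.List;

final class TjSketch {
    // Hypothetical stand-in for a WordArea / SpaceArea child of a text area.
    static final class Piece {
        final String text;
        final boolean adjustableSpace; // the proposed boolean attribute
        Piece(String text, boolean adjustableSpace) {
            this.text = text;
            this.adjustableSpace = adjustableSpace;
        }
    }

    // Converts an adjustment in millipoints into a TJ array number.
    // A positive adjustment (wider gap) yields a negative TJ value.
    static long tjValue(int adjustMpt, double fontSizePt) {
        return Math.round(-adjustMpt / fontSizePt);
    }

    // Builds a TJ array, adding the offset only after adjustable spaces.
    static String buildTj(List<Piece> pieces, int adjustMpt, double fontSizePt) {
        List<String> parts = new ArrayList<String>();
        for (Piece p : pieces) {
            parts.add("(" + p.text + ")");
            if (p.adjustableSpace) {
                parts.add(Long.toString(tjValue(adjustMpt, fontSizePt)));
            }
        }
        return "[" + String.join(" ", parts) + "] TJ";
    }
}
```

With a 12pt font and a 600mpt adjustment, the pieces "Hello", an
adjustable " " and "world" would give [(Hello) ( ) -50 (world)] TJ. The
renderer only reads the flag; it never inspects the space character itself.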
> At the moment the offset in SpaceArea and WordArea are unused, but
> this is how I think they could be used: if, because of the rounding in
> the adjustment computation, the applied adjustment is different than
> the needed one, the TextLM should distribute this difference (a few
> millipoints) among the SpaceAreas and / or WordAreas, setting their
> offset.
I see. Not sure if offset is the correct attribute though. Its stated
purpose is for offsetting in the bpd direction. Overloading it with a
different meaning is questionable.
Maybe "offset" is not the most appropriate name, even though the PDF
specification calls it that (last line in the last row of table 5.6);
at the very least it can be misleading in the context of FOP.
Instead for SpaceAreas we could use the IPD trait. Basically the LM
tells the renderers via the SpaceArea 'I want an <ipd> mpt wide space
here'. How the renderers make this space happen is up to them. Not sure
if we should even consider distributing the rounding errors to the
WordAreas. I would leave them with the SpaceAreas.
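
As an illustration of keeping the rounding error in the SpaceAreas only,
here is a small sketch. SpaceWidths and distribute are hypothetical names,
all values are assumed to be integer millipoints, and nothing here is
existing FOP code. Each space gets the same share of the line's total
adjustment, and the few leftover millipoints go to the first spaces, one
each, so that the widths add up exactly.

```java
final class SpaceWidths {
    // Returns the ipd (in millipoints) to request for each space on a line.
    // baseWidth:   unadjusted width of one space
    // totalAdjust: total stretch (positive) or shrink (negative) for the line
    // spaceCount:  number of adjustable spaces on the line
    static int[] distribute(int baseWidth, int totalAdjust, int spaceCount) {
        if (spaceCount <= 0) {
            return new int[0];
        }
        int[] widths = new int[spaceCount];
        // floorDiv/floorMod keep the leftover in the range 0..spaceCount-1,
        // also when the line has to shrink.
        int each = Math.floorDiv(totalAdjust, spaceCount);
        int leftover = Math.floorMod(totalAdjust, spaceCount);
        for (int i = 0; i < spaceCount; i++) {
            widths[i] = baseWidth + each + (i < leftover ? 1 : 0);
        }
        return widths;
    }
}
```

For example, three spaces of 2780mpt that must absorb 1000mpt in total
would come out as 3114, 3113 and 3113mpt.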
At the moment neither the WordAreas nor the SpaceAreas have the ipd
explicitly set, so it is 0 (the ipd is set for the parent TextArea): these
kinds of areas were initially conceived as "lightweight areas", in order
to increase the size of the area tree as little as possible, and we could
even define them in a minimal fashion, so that they have no unused
inherited attributes and methods.
Setting the ipd for each word and each space is another approach: it would
give FOP's layout engine full control over the text positioning, whereas
at the moment we control the overall text using the adjustment
parameters.
The drawback of this increased control is that the renderers would need to
be more complex in order to handle the extra information.
A different approach would be not to have the space areas at all and
simply set space-start/space-end traits on the word areas but that may
be a bit radical (in the sense of its impact on the renderers) at this
stage (but see below - CJK).
[1] My only concern is: if we "show" spaces but don't insert space
characters, what would the text extracted from such an output look like?
Wouldn'titbesomethinglikethis? :-)
Of course we could also use space-start/space-end traits on the
SpaceAreas to model shrink/stretch instead of the IPD trait. Maybe that
would be preferable?
Well, this is not very different from what is now (improperly) called
offset :-)
> The renderers will use this according to their own adjustment rule:
> for example the PDFRenderer would add it to the text adjustment if the
> character is multibyte.
>
> The offset could come in handy for the cjk support (bug 36977): in
> this case there are no adjustable spaces, and if text is justified all
> the difference between line width and unadjusted character width could
> be handled modifying the offsets of some special characters.
Hmm, in the CJK case it appears most characters will become a single
WordArea. If justification is required we either need to insert
SpaceAreas between the WordAreas or add a property to the WordAreas.
If adding a SpaceArea means adding some kind of space character, this
could "corrupt" the CJK text that could be extracted from the output;
otherwise go to [1] :-)
If we go for the additional property I would opt for setting
space-start/space-end traits on those areas. This seems to me to be more
in 'the spirit of XSL-FO'.
So, what if we rename offset -> spaceAfter? It seems to me that we are
here speaking of the same thing using two different names. :-)
Regards
Luca