Re: Improving Keeps and Breaks

Vincent Hennebert Fri, 19 Oct 2007 04:33:35 -0700

Hi Andreas,

Andreas L Delmelle wrote:
> On Oct 18, 2007, at 19:23, Vincent Hennebert wrote:
<snip/>
>> I think I see your point. Basically you’re proposing a push method (a LM
>> notifies its parent LM that it has a break-before) while mine is a pull
>> method (a LM asks its children LMs if they have break-before).
> 
> Yep, although it would not be the LM but rather the FO that pushes the
> break-before upwards to its parent if it is also the first child. The


Sure, of course. BTW, this may be a dumb question, but: how does an FO
know that it is the first child of its parent? AFAICT there’s no such
information in the FObjs. Which means that an object would blindly
notify its parent that it has a break, and it would be up to the
parent to figure out whether it should take it into account or not.
That would be a bit overkill, wouldn’t it?
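
To illustrate the concern, here is a rough Java sketch of the push variant. It is not FOP code: SketchFO, addChild, notifyBreakBefore and the inheritedBreakBefore flag are all made-up names, and the sketch assumes the tree is built top-down as children arrive. The child notifies blindly, and only the parent can decide whether the notification counts.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an FO node; not FOP's actual FObj class. */
class SketchFO {
    final String name;
    final boolean breakBefore;        // forced break-before set on this FO itself
    boolean inheritedBreakBefore;     // set when a first descendant pushed a break up
    SketchFO parent;
    final List<SketchFO> children = new ArrayList<>();

    SketchFO(String name, boolean breakBefore) {
        this.name = name;
        this.breakBefore = breakBefore;
    }

    void addChild(SketchFO child) {
        child.parent = this;
        children.add(child);
        // The child does not know its own position, so it notifies blindly.
        if (child.breakBefore || child.inheritedBreakBefore) {
            notifyBreakBefore(child);
        }
    }

    void notifyBreakBefore(SketchFO child) {
        // Only the parent can tell whether the notifier is its first child.
        if (children.get(0) == child) {
            inheritedBreakBefore = true;
            if (parent != null) {
                parent.notifyBreakBefore(this);   // keep pushing upwards
            }
        }
        // Otherwise the notification is simply dropped.
    }
}

class PushDemo {
    public static void main(String[] args) {
        SketchFO b1 = new SketchFO("b1", false);
        SketchFO b2 = new SketchFO("b2", false);
        b1.addChild(b2);
        b2.addChild(new SketchFO("b3", true));    // first child: propagates to b1
        b2.addChild(new SketchFO("b4", true));    // second child: notification dropped
        System.out.println("b1 inherits a break-before: " + b1.inheritedBreakBefore);
    }
}
```

Every FO with a break pays for a notification, even though most of them are thrown away by the parent.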


> LMs would largely continue to work as they do now, except that under a
> certain set of conditions, they don't need to check the outside anymore:
> only take into account the forced break on its own FO. If there is none,
> then no need to recursively check for first descendants having forced
> breaks.

Of course, but I’m concerned about spreading code relevant to the same 
functionality over several places of the codebase.


> Currently (sorry if it becomes boring to stress this) the construction
> of the layout-tree starts only when the end-of-page-sequence event
> occurs. I still see room for changing this in the future, and so I need
> to consider the effects on the layout-algorithm as well: the algorithm
> will, for instance, no longer be able to rely on *all* childLMs being
> available the first time it enters the loop... The last childLM in an
> iteration might turn out to be not-the-last-one-after-all. For many
> following FONodes, the LMs do not exist yet at that point. Not in my
> head, at least. ;-)

I think nothing would prevent the layout process from being started 
earlier. On most FOs only the first child needs to be checked for 
a break. And for a table, only the first row needs to be retrieved in 
order to know if a break must occur before it. That’s another topic, but 
we could imagine that FObj instances and corresponding LMs are 
dynamically created when requested by a parent LM. Whereas at the 
(child) FObj level, it’s unclear to me how we will be able to say that 
we know enough to start the layout, and that we don’t need to grab 
further children FObjs.


> Anyway, I remember that when I implemented implicit column-numbers, I
> also gave TableBody an instance member to check whether we are adding
> cells in the first row or not, so this particular case would be easily
> addressed. (Checking... yep, it's still there.)
> 
> Come to think of tables, I'd consider 'propagation' in terms of pushing
> a forced break on a cell to the first cell in the row.
> In the table-layout code, at the point where we have a reference to the
> row or the first cell in a row, we would immediately know whether there
> is a forced break on a first descendant in any of the following sibling
> cells without having to request the corresponding childLMs and trigger a
> tree-traversal of who-knows-how-many levels.
> 
> Keeping in mind the above mentioned idea of triggering layout sooner, if
> we can guarantee that the layoutengine always receives complete rows,
> then the table-layout job should become a bit simpler in the general
> use-case, while still not adding much complexity in trickier, more
> exotic cases, like:
> //table-cell/block[position() > [EMAIL PROTECTED]'page']

That one triggers a break /inside/ the table-row, not before it. Anyway, 
at a given LM level the work to do looks simple enough to me.


> especially where the cell's column-number corresponds to the highest
> column-number.
> 
> Triggering layout sooner is the only way we are ever going to get FOP to
> accept arbitrarily large tables, without consuming massive amounts of
> heap. A 'simple' grid of 5 x 500 cells generates +5000 FONodes
> (table-cells must have at least one block each) that stay in memory
> until the page-sequence is completely finished. I wonder how many
> break-possibilities that generates... :/

As I said above, I don’t think anything in my approach prevents that
from happening.


>> A matter of taste, probably, but I think I’d prefer the pull method: the
>> LM performs requests to the appropriate children LMs exactly when and if
>> needed.
> 
> The only thing an LM should initially pull/request from its children,
> AFAIU, is a list of elements, given a certain LayoutContext.
> When composing its own element list, an LM should ideally be able to
> rely on the lists it receives from its children. Then add/delete/update
> elements and (un)wrap, depending on context that is unknown or
> irrelevant to the child.

I don’t quite agree with that one, although I thought of it too at
first. I think it’s more complicated to play with a list of
elements, try and get the first one from children if any, create a new
one if necessary, change the break context (column, page) if needed,
etc., than to simply ask the applicable child LMs for their
break-before values. And again, in the case of tables that means that
the merging algorithm needs to deal with break elements occurring in
many possible places. That’s really not its job, I think.
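
A minimal sketch of what such a pull request could look like, in Java. The names (PullLM, SketchBreak, getBreakBefore) are hypothetical and not taken from FOP, and the break values are reduced to none/column/page with the assumption that the wider context wins when both an LM and its first descendant carry a break.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified break values, ordered so that a later constant is the stronger one. */
enum SketchBreak { NONE, COLUMN, PAGE }

/** Hypothetical layout manager; only the parts relevant to break-before. */
class PullLM {
    private final SketchBreak ownBreakBefore;
    private final List<PullLM> children = new ArrayList<>();

    PullLM(SketchBreak ownBreakBefore) {
        this.ownBreakBefore = ownBreakBefore;
    }

    PullLM add(PullLM child) {
        children.add(child);
        return this;
    }

    /**
     * Pull method: combine this LM's own break-before with the one of its
     * first child, which in turn asks its own first child, and so on.
     * Later siblings are never consulted.
     */
    SketchBreak getBreakBefore() {
        SketchBreak result = ownBreakBefore;
        if (!children.isEmpty()) {
            SketchBreak fromChild = children.get(0).getBreakBefore();
            if (fromChild.compareTo(result) > 0) {
                result = fromChild;   // assumption: page beats column beats none
            }
        }
        return result;
    }
}

class PullDemo {
    public static void main(String[] args) {
        PullLM b5 = new PullLM(SketchBreak.PAGE);
        PullLM b4 = new PullLM(SketchBreak.NONE).add(b5);
        PullLM b3 = new PullLM(SketchBreak.NONE).add(b4);
        System.out.println("break before b3: " + b3.getBreakBefore());
    }
}
```

The parent asks only when it is about to stack the child, and no break element has to be extracted from or patched into an element list.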


>> That may simplify code as well (and improve its readability) as
>> some form of pull method is necessary anyway (the
>> mustKeepWithPrevious/WithNext/Together methods).
> 
> Keeps are a different story indeed. Big difference is that keeps have
> strengths, and breaks do not.

Yes, but keeps and breaks are handled in the same place, mainly when
an LM considers the stacking of its child LMs (BlockLM, for example).
And the treatment is very similar.


> Consider:
> 
> <fo:block id="b1">
>   ...
>   <fo:block id="b2">
>     <fo:block id="b3" keep-with-previous.within-page="...">
>       <fo:block id="b4">
>         <fo:block id="b5" break-before="page">
> 
> This may be interpretation: you cannot specify a 'strength' for a break.
> It is either there or not. I take this to mean that a forced break
> overrules any keep.

Indeed, Section 4.8 of XSL 1.1.

<snip/>
>> - the code would be about the same for Block- and InlineStackingLM
>> - we could factorize it into a common super-class
> 
> 
> AbstractStackingLM...?

Rather StackedLM. That’s the StackingLM that would have StackedLM 
children implementing the necessary methods. Ideally that would be an 
abstract class, but since multiple inheritance is impossible in Java I’m 
not sure that would be feasible. Hence static methods in a separate 
class.
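
Roughly what I have in mind, as a Java sketch. Only the StackedLM name and the mustKeepWith... methods come from this discussion; hasBreakBefore, StackingUtil, mayBreakBetween and SimpleStacked are invented here for illustration, and keep strengths are ignored to keep it short.

```java
/** What a stacking LM would need to know about each of its stacked children. */
interface StackedLM {
    boolean mustKeepWithPrevious();
    boolean mustKeepWithNext();
    boolean hasBreakBefore();   // hypothetical name
}

/** Shared logic as static methods, since the LMs cannot inherit a second class. */
final class StackingUtil {
    private StackingUtil() { }

    /** True if a break is allowed (or forced) between two consecutive children. */
    static boolean mayBreakBetween(StackedLM prev, StackedLM next) {
        if (next.hasBreakBefore()) {
            return true;   // a forced break overrules any keep
        }
        return !prev.mustKeepWithNext() && !next.mustKeepWithPrevious();
    }
}

/** Trivial implementation, for demonstration only. */
class SimpleStacked implements StackedLM {
    private final boolean keepWithPrevious;
    private final boolean keepWithNext;
    private final boolean breakBefore;

    SimpleStacked(boolean keepWithPrevious, boolean keepWithNext, boolean breakBefore) {
        this.keepWithPrevious = keepWithPrevious;
        this.keepWithNext = keepWithNext;
        this.breakBefore = breakBefore;
    }

    public boolean mustKeepWithPrevious() { return keepWithPrevious; }
    public boolean mustKeepWithNext() { return keepWithNext; }
    public boolean hasBreakBefore() { return breakBefore; }
}
```

Any LM that can be stacked (a graphic included) would implement the interface, while the LMs to which breaks don’t apply would simply not implement it.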


> I kind of like the idea. For the really shared portions,
> AbstractStackingLM could then implement a set of static methods.
> 
>> but both those classes
>>   have subclasses to which breaks don’t apply (Flow-, StaticContentLM,
>>   for example).
> 
> I wouldn't really see this as a problem. The related methods will never
> be called, unless there is a flaw in our logic[*]. To stress the fact
> that they serve no purpose there, we could add overrides that always
> return false.

That really looks like bad design to me. If some methods don’t apply to 
an object then that object shouldn’t inherit the related 
class/interface.


> [*] (They won't be called, precisely because breaks don't apply?)
> 
>> OTOH keeps apply to AbstractGraphicsLM which doesn’t
>>   inherit any of those classes.
> 
> That's a special case, since in principle a graphic does not itself
> consist of more layout-objects that need to be stacked. To the
> layoutengine, a graphic is simply a monolithic box. Graphics are inline
> by definition nonetheless, so it could be InlineStackingLM with the same
> reservations as for FlowLM and StaticContentLM, but for other methods
> (the actual 'inline-stacking' can be considered to be delegated to the
> producer of the graphic, here).

So it’s not stacking, but stacked. Might make sense to introduce that 
concept, actually.

To sum up, my main concern is finding code for the same functionality
in several places of the codebase. Some form of treatment is necessary
at the LM level anyway, so why not just put all the code there? It may
also be good to keep the FO tree building code as independent as
possible from the layout code: simpler, easier to understand, easier to
debug, easier to replace one component with another, etc. The
collaborative approach you’re proposing looks interesting, but for
practical reasons it may be best to keep things separate. The codebase
is quite large and it’s difficult to have a detailed understanding of
all its parts. It would be good to be able to concentrate on just one
part. Easier for newcomers, too.


WDYT?
Vincent
