Re: String.intern() test and measurement

2003-12-04 Thread Peter B. West
J.Pietschmann wrote:
John Austin wrote:

A high runner in FOP 0.20.5 is: PropertyList.findProperty().
It calls other functions in org.apache.fop.fo that consume
significant CPU resources. In one example it called itself
recursively to a depth of 10.


Without taking a closer look at the code, I suspect it tries
to find inherited values. One approach to cope with this is
to resolve inherited and default values immediately during
FO construction. Because the full property set is stored,
in contrast to only specified properties now, only the
parent has to be looked up. This comes, of course, at the
price of consuming more memory, and there are functions like
from-nearest-specified-value() which require specified
properties to be marked. If someone can come up with a
space-efficient storage, this may be a solution.

You could also look at the way alt.design tries for the best of both
worlds.  See fop.fo.FONode.java in the alt.design tree; notably
  makeSparsePropsSet()
  BitSet specifiedProps
  PropertyValue[] propertySet
  PropertyValue[] sparsePropsSet
  final int[] sparsePropsMap
  final int[] sparseIndices
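
For readers without the alt.design tree to hand, the field names above suggest a dense array filled while a node is built, then compacted into a smaller array. The following is only a guess at that shape, not the alt.design code: the property count, the PropertyValue stand-in, and every method body are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Hypothetical stand-in for a resolved property value. */
final class PropertyValue {
    final String text;
    PropertyValue(String text) { this.text = text; }
}

/** Sketch: a dense property array compacted into a sparse one. */
final class SparseNodeSketch {
    static final int PROPERTY_COUNT = 8; // assumed; the real count is far larger

    private final BitSet specifiedProps = new BitSet(PROPERTY_COUNT);
    private PropertyValue[] propertySet = new PropertyValue[PROPERTY_COUNT];
    private PropertyValue[] sparsePropsSet;
    private final int[] sparsePropsMap = new int[PROPERTY_COUNT];
    private int[] sparseIndices;

    /** Records a value in the dense array while the node is being built. */
    void set(int propIndex, PropertyValue value) {
        propertySet[propIndex] = value;
        specifiedProps.set(propIndex);
    }

    /** Copies only the populated slots into the sparse array. */
    void makeSparsePropsSet() {
        int count = specifiedProps.cardinality();
        sparsePropsSet = new PropertyValue[count];
        sparseIndices = new int[count];
        Arrays.fill(sparsePropsMap, -1); // -1 marks "not stored"
        int slot = 0;
        for (int i = specifiedProps.nextSetBit(0); i >= 0;
                 i = specifiedProps.nextSetBit(i + 1)) {
            sparsePropsSet[slot] = propertySet[i];
            sparseIndices[slot] = i;   // sparse slot -> property index
            sparsePropsMap[i] = slot;  // property index -> sparse slot
            slot++;
        }
        propertySet = null; // the dense array is no longer needed
    }

    /** Looks a property up in whichever array is currently live. */
    PropertyValue get(int propIndex) {
        if (propertySet != null) {
            return propertySet[propIndex];
        }
        int slot = sparsePropsMap[propIndex];
        return slot < 0 ? null : sparsePropsSet[slot];
    }

    int storedCount() {
        return sparsePropsSet.length;
    }
}
```

In this sketch the index map is held per node, which saves little by itself; presumably a real design would share the map between nodes of the same type, but that is a guess.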

Peter
--
Peter B. West 

Re: String.intern() thoughts

2003-12-04 Thread Glen Mazza
Thanks for the explanation.

Glen

--- "Peter B. West" <[EMAIL PROTECTED]> wrote:
> Glen Mazza wrote:
> ...
> > I think the next thing to consider is the storage of
> > specified vs. computed values.  Let's say we store
> > pointers for many properties to the same
> > {"property-name", "property-value"} pair.  A specified
> > property value of "10%" would not make this a very
> > helpful data structure if that percentage resolves to
> > different computed values for each property sharing
> > this pair.  I believe the goal for us then would be
> > just store the computed value for each pair (meaning
> > many more pairs), as long as we take into account the
> > can't-resolve-everything-without-knowledge-of-layout
> > issue.
>
> alt.design makes no attempt to look for commonalities here.  It resolves
> every possible property value, and keeps a partly-resolved value for
> percentages.  Inheritance is (almost exclusively) of computed values.
>
> For a given inheritable property, if that property is present on a
> child, then that value of the property is used for that child (and its
> descendants until explicitly re-set in a lower descendant); otherwise,
> the specified value of that property on the child is the computed value
> of that property on the parent formatting object.
>  5.1.4 Inheritance
>
> There is an exception that comes to mind.  For "line-height" a value may
> be specified which, although not specified as a percentage, is a factor
> by which the font-size (from memory) is multiplied.  When such a value
> is inherited, it is the factor, not the computed value.
>
> In general, the computed value of a percentage is inherited.  That still
> leaves a problem, because the computed value is unknown.  In alt.design,
> what is effectively a link back to the unresolved property is specified
> as the value.  When the parent property is resolved, that resolved value
> is then available to the inheriting descendants.
>
> Yet-to-be-implemented is the handling of expressions involving
> percentages, as mentioned in an earlier post.
>
> In essence, alt.design attempts to resolve every property to its
> computed value, and store the result only on nodes to which the property
> applies.  There is no attempt to reduce storage by procedures like
> interning strings.  Note, though, that the process of resolving
> properties does eliminate most strings.  My objective was that when
> areas were being laid out, the relevant properties would all be directly
> available.
>
> Peter
> --
> Peter B. West




Re: String.intern() thoughts

2003-12-04 Thread Peter B. West
Glen Mazza wrote:
...
I think the next thing to consider is the storage of
specified vs. computed values.  Let's say we store
pointers for many properties to the same
{"property-name", "property-value"} pair.  A specified
property value of "10%" would not make this a very
helpful data structure if that percentage resolves to
different computed values for each property sharing
this pair.  I believe the goal for us then would be
just store the computed value for each pair (meaning
many more pairs), as long as we take into account the
can't-resolve-everything-without-knowledge-of-layout
issue.

alt.design makes no attempt to look for commonalities here.  It resolves 
every possible property value, and keeps a partly-resolved value for 
percentages.  Inheritance is (almost exclusively) of computed values.

For a given inheritable property, if that property is present on a 
child, then that value of the property is used for that child (and its 
descendants until explicitly re-set in a lower descendant); otherwise, 
the specified value of that property on the child is the computed value 
of that property on the parent formatting object.
 (XSL 1.0 Recommendation, section 5.1.4, Inheritance)
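
Read as a lookup rule, the quoted text says: use the node's own value if it has one, otherwise the nearest ancestor's computed value. A minimal sketch of that rule follows; the class name and the use of strings for computed values are assumptions for illustration, not alt.design code.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of an FO node holding computed values for inheritable properties. */
final class InheritNodeSketch {
    private final InheritNodeSketch parent;
    private final Map<String, String> computed = new HashMap<>();

    InheritNodeSketch(InheritNodeSketch parent) {
        this.parent = parent;
    }

    /** Records the computed value of a property present on this node. */
    void specify(String name, String computedValue) {
        computed.put(name, computedValue);
    }

    /**
     * Own value if present, otherwise the computed value of the nearest
     * ancestor that has one. Returns null if no ancestor sets it; a caller
     * would then fall back to the property's initial value.
     */
    String inherited(String name) {
        for (InheritNodeSketch n = this; n != null; n = n.parent) {
            String value = n.computed.get(name);
            if (value != null) {
                return value;
            }
        }
        return null;
    }
}
```

This walks the ancestry on every lookup, which is the cost the earlier findProperty() profile points at; storing the inherited value on each node at construction time trades that walk for memory.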

There is an exception that comes to mind.  For "line-height" a value may 
be specified which, although not specified as a percentage, is a factor 
by which the font-size (from memory) is multiplied.  When such a value 
is inherited, it is the factor, not the computed value.
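
To make the difference concrete, here is a small sketch that assumes line-height is given either as a percentage such as "120%" or as a bare number such as "1.2". The method name and the string parsing are hypothetical, and lengths are treated as plain point values.

```java
/** Sketch: what a child ends up with when it inherits line-height. */
final class LineHeightSketch {
    /**
     * parentSpec is the parent's specified line-height, either "NN%" or a
     * bare number. Returns the child's line-height in points.
     */
    static double childLineHeight(String parentSpec,
                                  double parentFontSize,
                                  double childFontSize) {
        if (parentSpec.endsWith("%")) {
            // Percentage: resolved against the parent's font-size, and the
            // resulting length is what the child inherits.
            String digits = parentSpec.substring(0, parentSpec.length() - 1);
            return parentFontSize * Double.parseDouble(digits) / 100.0;
        }
        // Number: the factor itself is inherited and is applied to the
        // child's own font-size.
        return childFontSize * Double.parseDouble(parentSpec);
    }
}
```

With a 10pt parent and a 20pt child, "120%" gives the child 12pt while "1.2" gives it 24pt.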

In general, the computed value of a percentage is inherited.  That still 
leaves a problem, because the computed value is unknown.  In alt.design, 
what is effectively a link back to the unresolved property is specified 
as the value.  When the parent property is resolved, that resolved value 
is then available to the inheriting descendants.
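
The "link back" idea can be sketched as a value object that the descendants merely point at. Both class names below are hypothetical and the reference length is assumed to arrive later from layout; this illustrates the idea and is not the alt.design implementation.

```java
/** Sketch: a percentage whose reference length is not yet known. */
final class DeferredLengthSketch {
    private final double percent;
    private Double resolved; // null until layout supplies the reference

    DeferredLengthSketch(double percent) {
        this.percent = percent;
    }

    /** Called once the reference length is known. */
    void resolve(double referenceLength) {
        resolved = referenceLength * percent / 100.0;
    }

    boolean isResolved() {
        return resolved != null;
    }

    double value() {
        if (resolved == null) {
            throw new IllegalStateException("percentage not yet resolved");
        }
        return resolved;
    }
}

/** Sketch: what a descendant stores in place of an inherited length. */
final class InheritedLengthRef {
    private final DeferredLengthSketch source;

    InheritedLengthRef(DeferredLengthSketch source) {
        this.source = source;
    }

    boolean isResolved() {
        return source.isResolved();
    }

    /** Sees the parent's value as soon as the parent has been resolved. */
    double value() {
        return source.value();
    }
}
```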

Yet-to-be-implemented is the handling of expressions involving 
percentages, as mentioned in an earlier post.

In essence, alt.design attempts to resolve every property to its 
computed value, and store the result only on nodes to which the property 
applies.  There is no attempt to reduce storage by procedures like 
interning strings.  Note, though, that the process of resolving 
properties does eliminate most strings.  My objective was that when 
areas were being laid out, the relevant properties would all be directly 
available.

Peter
--
Peter B. West 


Re: [VOTE] Properties API

2003-12-04 Thread Peter B. West
Victor Mote wrote:
Peter B. West wrote:
...
Yes, this is the real issue. Since an fo:marker's content can be used in more
than one place, this requires that its contents be "grafted" into the tree
where needed.
I think the only trick here is to pass the static content context back to
the "get" method so that it knows how to get the information it needs. Sec
6.11.4 says that fo:retrieve-marker "is (conceptually) replaced by the
children of the fo:marker that it retrieves." The most general way that I
can think of to implement this is to force the passage of a parent
fop.fo.flow.RetrieveMarker in the "get" method's signature. This tells the
"get" method: "One of your ancestors is an fo:marker object, and, for
purposes of this "get", consider that ancestor grafted into the tree at this
fo:retrieve-marker's location." Of course, if there is no ancestor
fo:marker, pass a null.
Now, this raises another issue. FONode has a getParent() method. This method
may need to be expanded to include this concept. Any child could then ask
for its parent either with null (go up the tree through fo:marker, i.e. the
way the input specifies, and the way it works now), or with a "grafting
point" specified, so that if a grafting point is specified, it will go up
the tree in that direction instead. In fact, it may be good to create a
GraftingPoint interface that RetrieveMarker implements, in case there are
additional similar items now or in the future.
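class Marker {
    ...
    // Returns this marker's parent, honoring a grafting point if one is given.
    FONode getParent(GraftingPoint gp) {
        if (gp == null) {
            // No graft: go up the tree the way the input specifies.
            return this.parent;
        }
        // Graft: continue upward from the grafting point's own parent.
        return gp.getParent(null);
    }
    ...
}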
So, let's use:
font-size="12pt+2%+0.8*(from-nearest-specified-value(height) div 32)"
as an example. Let's assume an FOTree fragment that looks like this:
  fo:marker
    fo:block
      fo:inline
For both the block and the inline, the "get" will need to research its
ancestry to resolve the expression. If we pass the grafting point to the
"get", and the "get" directly or indirectly uses the getParent(GraftingPoint
gp) method to find that ancestry, it seems to me that everybody has
everything they need.
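
A self-contained sketch of how that lookup might behave is given here. GraftNode, MarkerSketch and RetrieveMarkerSketch are simplified stand-ins invented for this example, not the FOP classes; only the GraftingPoint name and the getParent(GraftingPoint) signature come from the proposal above.

```java
/** Proposed interface: something that can act as a graft location. */
interface GraftingPoint {
    GraftNode getParent(GraftingPoint gp);
}

/** Simplified stand-in for FONode. */
class GraftNode {
    final String name;
    final GraftNode parent;

    GraftNode(String name, GraftNode parent) {
        this.name = name;
        this.parent = parent;
    }

    /** Ordinary nodes ignore the grafting point and return their real parent. */
    public GraftNode getParent(GraftingPoint gp) {
        return parent;
    }
}

/** Stand-in for fo:marker: redirects the upward walk when grafted. */
class MarkerSketch extends GraftNode {
    MarkerSketch(GraftNode parent) {
        super("fo:marker", parent);
    }

    @Override
    public GraftNode getParent(GraftingPoint gp) {
        if (gp == null) {
            return parent; // the way the input specifies
        }
        return gp.getParent(null); // continue from the graft location
    }
}

/** Stand-in for fo:retrieve-marker, which serves as the grafting point. */
class RetrieveMarkerSketch extends GraftNode implements GraftingPoint {
    RetrieveMarkerSketch(GraftNode parent) {
        super("fo:retrieve-marker", parent);
    }
}
```

With a marker under fo:flow and a retrieve-marker under fo:static-content, a block inside the marker still reports the marker as its parent, while the marker reports fo:static-content when the retrieve-marker is passed in and fo:flow when null is passed.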
To restate a particular point here about grafting:-  In the model I am
developing for grafting, it is a pure implementation activity.  It is a
method of building static-content FO subtrees in a "transparent" manner.
 Once these static-content/marker subtrees are built, they look just
like the other subtrees of the FO tree.  In particular, they still
suffer from the same sort of problems of interaction with the Area tree
that afflict the rest of the FO tree.
What that means for higher level interfaces is that the grafting is
transparent - higher level interfaces, like your "get", just won't see it.
The key insight for me here is that *none* of this is actually dependent on
the Area Tree at all, that what we are really doing is grafting. I had
originally thought that some Area Tree information would need to be passed
in, but I really think the above is much more elegant, and more clearly
follows the concepts that are in play. Of course, I rely on the rest of you
guys to tell me if I have missed something (a real possibility).
Peter
--
Peter B. West 




Re: [VOTE] Properties API

2003-12-04 Thread Peter B. West
Victor Mote wrote:
Peter B. West wrote:

2% of what?  Of a reference area.  Of what actually gets laid out on a
page.  If a single flow object gets laid out over more than one page,
that reference may vary, but nothing changes in the FO Tree.  It makes no
sense to second-guess the Area tree within the FO tree.  It's within the
Area tree that all of these flow objects begin to take on concrete
dimensions.


Sec. 7.8.4 indicates that font-size percentages apply to the parent element's
font size, which would be from the FOTree, not from areas.
However, I fear that in the general case you may be right. The relative
column-width problem in tables may fall into this category. If so, then the
solution is to pass the relevant Area object to the "get" method so that it
can see more of the Area's context. Any Area can (or should) be able to see
not only its Area Tree ancestry, but its FOTree ancestry as well.
See Section 7.3 Reference Rectangle for Percentage Computations
http://www.w3.org/TR/xsl/slice7.html#percrule
are maintained in spite of any to-ing and fro-ing with the Area Tree.
Markers are an exception, and because marker properties are resolved in
the context of the static-content into which they are eventually placed,
all the information required for from-nearest-specified() must be
available in the static-content FO subtrees.


Yes, this is the real issue.
Only one of the real issues, I'm afraid.


OK, what are the others?

The thorns that immediately stick in my finger are

1) markers
I have a pretty well-developed idea of how to implement the FO tree
building side of this in alt.design.  It involves
static-content-specific changes to the current FO tree building
algorithm, and a slight generalization of the SyncedFoXmlEventsBuffer
class to allow events to be read from a variety of sources.  In a word,
grafting.
2) Area tree dependencies for FO expressions.  Basically, length
calculations involving percentages.
2a) Handling the expressions.
2b) Designing the interaction between the FO builder and the Area tree
builder.
3) Layout look-ahead and backtracking.  Closely related to 2b.

4) Managing the association of out-of-line areas (footnotes and floats)
with the FONode/Area in which they were defined and with the higher-level
areas (e.g. before-float-reference-area, footnote-reference-area,
main-reference-area) which are juggled as a result of the lower-level
contents.

More on these design issues in a subsequent post.
--
Peter B. West