Morten Jorgensen <[EMAIL PROTECTED]> wrote:
> At a first glance it
> seems very much like our internal DOM
I am worried most about the way we did our "expanded type IDs", which keep
three integer values to represent the type, name, and namespace. You use a
single value to represent the concatenated name and namespace, which we
didn't like, but our scheme may also have problems for what you need to do.
So I think we'll have to play around with this until we're both happy.
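
To make the three-value idea concrete, here is a minimal sketch of a table that interns a (type, namespace, name) triple and hands back one integer ID per distinct triple. Every name in it (ExpandedNameTableSketch, getExpandedTypeId, and so on) is made up for illustration and is an assumption, not the actual DTM code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: not the real DTM expanded-name table.
final class ExpandedNameTableSketch {
    // String pool: each namespace URI or local name gets a small int.
    private final Map<String, Integer> stringIds = new HashMap<>();
    private final List<String> strings = new ArrayList<>();

    // Each entry is {nodeType, namespaceId, localNameId}; the index is the expanded type ID.
    private final Map<List<Integer>, Integer> ids = new HashMap<>();
    private final List<int[]> entries = new ArrayList<>();

    private int intern(String s) {
        Integer id = stringIds.get(s);
        if (id == null) {
            id = strings.size();
            stringIds.put(s, id);
            strings.add(s);
        }
        return id;
    }

    // Returns the same ID for the same (type, namespace, local name) triple.
    int getExpandedTypeId(int nodeType, String namespace, String localName) {
        int ns = intern(namespace);
        int local = intern(localName);
        List<Integer> key = Arrays.asList(nodeType, ns, local);
        Integer id = ids.get(key);
        if (id == null) {
            id = entries.size();
            ids.put(key, id);
            entries.add(new int[] { nodeType, ns, local });
        }
        return id;
    }

    int getNodeType(int id) { return entries.get(id)[0]; }
    String getNamespace(int id) { return strings.get(entries.get(id)[1]); }
    String getLocalName(int id) { return strings.get(entries.get(id)[2]); }
}
```

With something like this, comparing two node names is a single int comparison, and the three parts can still be recovered separately when a caller needs only the namespace or only the type.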
> Have you any intention of creating more DTM builders, such as your existing
> DOM2DTM and SAX2DTM - for example a JDBC2DTM or an LDAP2DTM.
Yes indeed. John Gentilin is working on this now (our SQL extension). I
assume that LDAP2DTM is an obvious one that someone will work on (anyone
interested?). Someone else is working on a compact version (currently in
DTMDocumentImpl... this will be renamed) that will be yet smaller than
SAX2DTM (though probably a bit slower). Also, we're hoping to do a
Xerces-native implementation that can directly take advantage of some of
the internal parser data structures.
>          ast.SyntaxTreeNode
>                  |
>           ast.Instruction
>                  |
>             ast.ForEach
>              /       \
> compiler.ForEach   process.ForEach
>
> I suggest one general AST package that contains all the AST stuff up
> to and including the parsing level, and then keeping the compiler and
> interpreter stuff separate:
Yep. My question is whether compiler.ForEach and process.ForEach should
derive from ast.Instruction or should use a visitor pattern. I would think a
visitor pattern is best. I don't know how much structural rewriting you
are doing now, but I believe that structural rewrites are very important
for optimization. I would see the rewrites as working by multiple
optimization iterations over the AST, rewriting it probably several times
(redundant expression elimination, dead code elimination, tail merging,
inline expansion, etc.), until it can be optimized no more, and then using
the final version to produce the compiled form. I think this is how an
optimizing compiler works, though I am no expert. Perhaps the derivation
vs. tree walking really doesn't make a difference for this... not sure.
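
A rough sketch of the visitor variant, to show the shape of it: the AST classes know nothing about compiling or interpreting, a rewrite pass is one visitor that is rerun until it reports no change, and the back end is another visitor. All of the names here (AstNode, AstVisitor, DeadCodePass, EmitPass, Optimizer) are hypothetical and come from neither code base, and the "dead code" rule (empty text, or a for-each with no children) is a stand-in for a real optimization.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature AST; none of these names come from Xalan or XSLTC.
interface AstVisitor<R> {
    R visitTemplate(TemplateNode n);
    R visitForEach(ForEachNode n);
    R visitText(TextNode n);
}

abstract class AstNode {
    final List<AstNode> children = new ArrayList<>();
    AstNode add(AstNode child) { children.add(child); return this; }
    abstract <R> R accept(AstVisitor<R> v);
}

final class TemplateNode extends AstNode {
    <R> R accept(AstVisitor<R> v) { return v.visitTemplate(this); }
}

final class ForEachNode extends AstNode {
    final String select;
    ForEachNode(String select) { this.select = select; }
    <R> R accept(AstVisitor<R> v) { return v.visitForEach(this); }
}

final class TextNode extends AstNode {
    final String value;
    TextNode(String value) { this.value = value; }
    <R> R accept(AstVisitor<R> v) { return v.visitText(this); }
}

// Rewrite pass: drops dead children, then descends. Returns true if the tree changed.
final class DeadCodePass implements AstVisitor<Boolean> {
    private static boolean isDead(AstNode n) {
        if (n instanceof TextNode) return ((TextNode) n).value.isEmpty();
        return n.children.isEmpty();
    }
    private Boolean prune(AstNode parent) {
        boolean changed = parent.children.removeIf(DeadCodePass::isDead);
        for (AstNode child : parent.children) changed |= child.accept(this);
        return changed;
    }
    public Boolean visitTemplate(TemplateNode n) { return prune(n); }
    public Boolean visitForEach(ForEachNode n) { return prune(n); }
    public Boolean visitText(TextNode n) { return false; }
}

// Stand-in for a compiler back end: walks the final tree and emits a string.
final class EmitPass implements AstVisitor<String> {
    private String kids(AstNode n) {
        StringBuilder sb = new StringBuilder();
        for (AstNode c : n.children) sb.append(c.accept(this));
        return sb.toString();
    }
    public String visitTemplate(TemplateNode n) { return "template{" + kids(n) + "}"; }
    public String visitForEach(ForEachNode n) { return "for-each(" + n.select + "){" + kids(n) + "}"; }
    public String visitText(TextNode n) { return "text(" + n.value + ")"; }
}

final class Optimizer {
    // Reruns the rewrite pass until it reports no change; returns the number of passes run.
    static int optimize(AstNode root) {
        int passes = 0;
        boolean changed;
        do {
            changed = root.accept(new DeadCodePass());
            passes++;
        } while (changed);
        return passes;
    }
}
```

Because this pass prunes a node's children before descending into them, a nested run of empty for-each elements only collapses one level per pass, which is why the loop in Optimizer.optimize matters. An interpreter would be a third visitor over the same classes.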
> Does this make sense?
Yes, or at least abstractly close enough for the time being.
> It is just an initial thought, and I agree
> with you that we should focus on the DOM/DTM integration first.
Actually I would like to jump on the AST integration pretty quickly. In
our code we have some less-than-beautiful stuff (namely XPath opmaps) that
I need to send on its way. This is important for us because our existing
structures have become blockers in terms of what we want to do with the
code. Having a solid AST structure enables us to move forward more quickly
with other initiatives that we want to do. We will probably want to use
the AST structure we develop for the C++ version of Xalan also.
> How does the Xalan core interact with the serializers? Translets use a
> defined internal interface to send all their output to an output post-
> processor (org.apache.xalan.xsltc.runtime.TextOutput - bad name, I know)
> that removes duplicate attributes etc. and generates SAX events for the
> final SAX output handler (we only support SAX output, as I am sure you
> know already).
I think it's almost exactly the same. Our post-processor is the
ResultTreeHandler. I think there is very little or no difference between
the serialization processes, though currently our ResultTreeHandler has to
do some extra work for tooling (we allow a ContentHandler to call back to
get the current node, current template, etc.). We can handle this through
layering, I think.
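
For illustration, a minimal sketch of the kind of post-processor we are both describing: it holds back the pending start tag so that a later attribute with the same name replaces the earlier one, then forwards ordinary SAX events to whatever ContentHandler sits below it. The class and its methods (OutputPostProcessorSketch, attribute, flush) are hypothetical and are not the TextOutput or ResultTreeHandler API; only the org.xml.sax types are real. It assumes attributes are identified by qualified name and ignores namespaces.

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical sketch: not TextOutput and not ResultTreeHandler.
final class OutputPostProcessorSketch {
    private final ContentHandler out;                       // next layer down
    private String pendingElement;                          // start tag not yet sent
    private final AttributesImpl pendingAttrs = new AttributesImpl();

    OutputPostProcessorSketch(ContentHandler out) { this.out = out; }

    void startElement(String name) throws SAXException {
        flush();                                            // send any earlier start tag
        pendingElement = name;
    }

    // A repeated attribute name overwrites the earlier value instead of being emitted twice.
    void attribute(String name, String value) {
        if (pendingElement == null) return;                 // no open start tag: ignored in this sketch
        int i = pendingAttrs.getIndex(name);                // lookup by qualified name
        if (i >= 0) pendingAttrs.setValue(i, value);
        else pendingAttrs.addAttribute("", name, name, "CDATA", value);
    }

    void characters(String text) throws SAXException {
        flush();
        out.characters(text.toCharArray(), 0, text.length());
    }

    void endElement(String name) throws SAXException {
        flush();
        out.endElement("", name, name);
    }

    // Emits the buffered start tag, if any, with its de-duplicated attributes.
    private void flush() throws SAXException {
        if (pendingElement != null) {
            out.startElement("", pendingElement, pendingElement, pendingAttrs);
            pendingElement = null;
            pendingAttrs.clear();
        }
    }
}
```

The tooling callbacks could then live in a separate wrapper around a class like this, so the common serialization path stays the same for both code bases.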
-scott