Joseph Kesselman/CAM/Lotus wrote:
> For what it's worth, that's the impression this user got too... XNI's goal,
> as I understood it, was specifically to componentize the parser so those
> components -- and others -- could be assembled in whatever combination or

Yes, it is. But the general user who is only interested in using DOM or SAX
doesn't care (and shouldn't care) about the XNI API. However, the advanced
user will likely use it to solve advanced problems -- things you just can't
do with normal XML parsers (like creating your own pipelines, parsing other
document formats as XML, etc.).

> XNI should probably still be considered experimental and subject to change.
> And there are probably some calls which are currently lumped under the XNI
> banner which _should_ be considered parser internals. But there needs to be

Which ones?

> (Personal pet peeve, which I've cited before: Incremental parsing. That's
> definitely an application-level issue, and the only way to access it right
> now is through XNI.)

True. We've been asked for this often enough that we should definitely make
it public. I avoided it until now because I wasn't sure what we should do
when the parser configuration in use doesn't support pull parsing. But I
think we could simply throw an exception from those methods if the
underlying configuration doesn't support pull parsing.

And speaking of pull parsing, I have a comment and a question for the
Xerces-J developers out there.

First, my comment: right now the incremental parser cannot guarantee that
each call to the parse method results in one and only one call to the
registered handlers. And in general this isn't possible anyway, because some
events may be removed and/or synthesized downstream from the scanner, where
we have no control over them (e.g. think about the notification of
start/endPrefixMapping for namespaces). However, I was thinking of writing a
more "true" incremental parser from Xerces by queueing events and only
dispatching the top (or is that bottom???) event in the queue.
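
As a rough illustration of that queueing idea, here is a minimal Java sketch.
It does not use any Xerces or XNI types: EventSource, EventHandler, and
QueuedPullParser are hypothetical names made up for the example, and events
are plain strings. It also assumes first-in, first-out dispatch, so the
oldest queued event is the one that goes out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Queue;

final class QueuedPullSketch {

    /** Hypothetical stand-in for a registered handler; not a SAX or XNI type. */
    interface EventHandler {
        void handle(String event);
    }

    /**
     * Hypothetical stand-in for one scanning step. A step may append zero,
     * one, or several events to 'out'. Returns false once input is exhausted.
     */
    interface EventSource {
        boolean scanNext(List<String> out);
    }

    /** Buffers whatever the source emits and hands out one event per call. */
    static final class QueuedPullParser {
        private final EventSource source;
        private final EventHandler handler;
        private final Queue<String> pending = new ArrayDeque<>();
        private boolean exhausted = false;

        QueuedPullParser(EventSource source, EventHandler handler) {
            this.source = source;
            this.handler = handler;
        }

        /** Dispatches at most one event; returns false when none are left. */
        boolean next() {
            // Keep scanning until at least one event is queued or input ends.
            while (pending.isEmpty() && !exhausted) {
                List<String> batch = new ArrayList<>();
                exhausted = !source.scanNext(batch);
                pending.addAll(batch);
            }
            String event = pending.poll(); // oldest event first (FIFO)
            if (event == null) {
                return false;
            }
            handler.handle(event);
            return true;
        }
    }

    /** Test helper: a source that replays fixed batches, one per scan step. */
    static EventSource scannerOf(List<List<String>> batches) {
        Iterator<List<String>> it = batches.iterator();
        return out -> {
            if (!it.hasNext()) {
                return false;
            }
            out.addAll(it.next());
            return true;
        };
    }
}
```

With this shape, each call to next() results in at most one handler call,
even when a single scanning step produces several events (or none at all).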
I figure the only thing I would need to change to support this well is to
add a feature that tells the entity manager to NOT re-use character buffers
-- this would allow components (and apps) to keep references to the
character buffers, knowing that their contents would never change. But I
digress...

My question is related to the pull parsing JSR (StAX). There are some big
names involved in that work (e.g. James Clark) and I am concerned that the
completed API will be at odds with the current design and implementation of
Xerces. I'm fine with them being different beasts because they are designed
to do different things. But I worry that people will come to us and say "why
don't you build the parser from pull events?" when this doesn't make sense
for us.

The power of XNI and Xerces2 is that the components in the pipeline can be
interchanged easily to create new configurations. Trying to make this same
thing possible as a pull parser is extremely difficult, if not impossible
(for some of the very reasons I was alluding to previously). Again, I run at
the mouth...

On to my question! I am considering volunteering for the "Expert Group" of
this JSR to make sure that my concerns are taken into consideration as this
work continues. This would require me to sign an agreement that is currently
at odds with the Apache Foundation. So my question is this: is this
situation being resolved? Or are we stuck in a limbo that keeps us from
helping to guide these specifications that could very well directly impact
the work that we are currently doing?

> Re grammar caching: Y'know, I'm _NOT_ convinced that there should be a
> single Grammar Cache for all grammars. This sounds like the sort of thing
> that ought to be a feature of the grammar engine rather than of Xerces as a
> whole.

I'm currently in favor of having a single grammar cache but I'd like to hear
more of your concerns.
I think a single cache would make it easier to share pre-parsed grammars
across multiple parser instances, not just between multiple validator
components in the same parser pipeline.

--
Andy Clark * [EMAIL PROTECTED]
