Simon - Sorry I haven't had a chance to respond to your email. I was actually concentrating more on answering your question about use-cases, but it sounds like I don't need to sell this need as much as I thought I did (at least for now).
> If someone (eg Justin) is keen to work on this now, we could potentially get it in the next release. Otherwise I suggest this could go on the to-do list for post-1.6.

I don't know if "keen" is the right word, but I'm committed to "digesting" mixed content XML for an application now, i.e. I need to solve this problem one way or another (or use some other XML ingesting mechanism, which I don't want to do for purely selfish reasons). At this point, I'm planning on an implementation as a subclass. Whether this subclass is accepted and put into the Digester release depends on a variety of factors, but my intention is to develop a solution either way. I have sign-off on contributing modifications back to Apache, so that's not an issue. Of course, I have a high level of respect for the members of this list (not blowing smoke, I swear), so I'm very interested in crafting the mixed content solution based on any feedback Commons developers may have.

Just to be clear, is there a timeframe for 1.6?

> Justin, if you have any arguments to back your original design, please speak up! Or if you are willing to try implementing some other approach that doesn't involve "@text" patterns, please speak up too.

Let's separate the two issues in my original design: using a special text designator at all, and the specific designator used. To be honest, @text was really just a placeholder on my end. The only requirement I have for this designator is that it be an illegal XML element name, so that there's no conflict (e.g. if the designator were just text, it would clash with an element named text).

My core argument in favor of using a specific designator is that it explicitly indicates that the pattern (i.e. /element/@text) uses different functionality than the traditional Digester method. This is a pretty weak argument, I'll admit.
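To make the designator idea concrete, here is a rough sketch. It is not Digester code: TextDesignatorSketch and its methods are hypothetical names, and "@text" is only the placeholder discussed above. It shows two things: splitting a text pattern from the element pattern it hangs off, and asking the JDK's DOM implementation (Document.createElement) whether a name is accepted as an element name, which is the property the designator needs to fail.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;

// Hypothetical helper, for illustration only; not part of Commons Digester.
class TextDesignatorSketch {
    // Placeholder designator from the discussion; any illegal element name would do.
    static final String TEXT_DESIGNATOR = "@text";

    /** True if the last segment of the pattern is the text designator. */
    static boolean isTextPattern(String pattern) {
        return pattern.equals(TEXT_DESIGNATOR)
                || pattern.endsWith("/" + TEXT_DESIGNATOR);
    }

    /** Strips the designator, e.g. "element/@text" becomes "element". */
    static String elementPattern(String pattern) {
        if (!isTextPattern(pattern)) {
            return pattern;
        }
        int cut = pattern.length() - TEXT_DESIGNATOR.length() - 1;
        return cut > 0 ? pattern.substring(0, cut) : "";
    }

    /** Asks the JDK DOM implementation whether the name is a legal element name. */
    static boolean isLegalElementName(String name) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
        try {
            doc.createElement(name);
            return true;
        } catch (DOMException e) {
            // The DOM reports illegal names with INVALID_CHARACTER_ERR.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isTextPattern("element/@text"));
        System.out.println(elementPattern("element/@text"));
        System.out.println(isLegalElementName("@text"));
        System.out.println(isLegalElementName("text"));
    }
}
```

A plain "text" designator passes the element-name check, which is exactly the collision the illegal-name requirement is meant to rule out.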
I also feel (and can't prove yet) that this method is better performance-wise because the extra iterations are over a smaller list of rules. As indicated above, I'm very willing to try alternate implementations, including the interface solution you suggested.

> How do people feel about my initial proposed solution to this (as follows)?

My only concern about the interface solution (for lack of a better name) is where you wrote 'for each rule matched by the last call to startElement'. In my original subclassed version, I had a call to super.startElement(), but in order to do what you've described, I think you'd need to replicate all the code in Digester.startElement() in the subclass's startElement() method. Otherwise, the overridden startElement() in the subclass would have to make an extra call to Rules.match(). I was originally worried about having to maintain the subclass's startElement() method to reflect changes in the Digester implementation, thus the call to super.startElement(). Is this too dogmatic? I'm not looking to rehash the cut-and-paste vs. "eating our own dog food" discussion recently seen in the context of [lang]. I'll have some time later in the week to take a crack at implementing the interface solution.

A third implementation that I've been thinking about would be to take the new interface and create some additional interfaces around it: MixedContentRules and MixedContentRuleSet. These objects would basically parallel the Rule/Rules/RuleSet interfaces. Within the MixedContentDigester subclass, there'd be a new instance variable called mixedContentRules. In short, the concept is that the classes that implement MixedContentRule would be segregated from the traditional Digester rules. The core reason for this is that I'm concerned about the performance impact of both my original and Simon's solutions.
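The segregation idea could look roughly like the sketch below. Only the names MixedContentRule, MixedContentRules and mixedContentRules come from the proposal; the method signatures, the exact-match lookup and the SegregatedRulesSketch driver are assumptions for illustration, and nothing here touches the real Digester or Rules classes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical callback for one text segment found inside mixed content.
interface MixedContentRule {
    void text(String pattern, String segment);
}

// Hypothetical registry that holds only mixed-content rules, keyed by exact pattern.
class MixedContentRules {
    private final Map<String, List<MixedContentRule>> byPattern = new HashMap<>();

    void add(String pattern, MixedContentRule rule) {
        byPattern.computeIfAbsent(pattern, k -> new ArrayList<>()).add(rule);
    }

    /** Looks only at mixed-content rules; ordinary element rules are never scanned. */
    List<MixedContentRule> match(String pattern) {
        return byPattern.getOrDefault(pattern, Collections.emptyList());
    }
}

// Hypothetical driver standing in for what a MixedContentDigester might do.
class SegregatedRulesSketch {
    /** Feeds two text segments for "doc/para" through the registry. */
    static List<String> demo() {
        List<String> seen = new ArrayList<>();
        MixedContentRules mixedContentRules = new MixedContentRules();
        mixedContentRules.add("doc/para",
                (pattern, segment) -> seen.add(pattern + ":" + segment));

        String current = "doc/para";
        for (String segment : new String[] {"Hello ", " world"}) {
            for (MixedContentRule rule : mixedContentRules.match(current)) {
                rule.text(current, segment);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```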
By segregating the rules, I ensure that a match() call on a MixedContentRules object searches only the mixed content rules, which should lead to better performance.

> And if you think there are other features that could be added to digester using "@...."-style patterns then that would also be good to mention.

I had worked up a use-case for @comment that would allow comments to be digested (imagine a JavaDoc/XDoclet-style application that reads comments out of a Struts config XML file). But then I remembered that SAX ignores comments, so this is a bigger can of worms.

I've gone ahead and submitted my test case to Bugzilla (#28068). I can never remember how Bugzilla reacts to XML submitted in its forms, so I kept that out. I assume that we're all on the same page as to what "mixed content" means, but I can easily add an example.

Thanks for the interest. I was a bit surprised at first that some were so willing to write off Digester as just for configuration files.

Justin