Re: [digester2] Additions
Hi Oliver, On Mon, 2005-02-14 at 08:16 +0100, Oliver Zeigermann wrote: [update] - now all member variables of DefaultRuleManager are private - introduced copy ctor for copying members from subclasses thanks - that looks good to me. - DefaultRuleManager now returns an unmodifiable list upon getMatchingActions I'm not so sure about this one. This change means a new object is created each time getMatchingActions is called, and that is once for every xml element in the input document. And in addition, each method call is relayed by the UnmodifiableList wrapper to the underlying one, resulting in a performance hit too. Is this significant in the context of parsing an xml document? I don't know; maybe it's not measurable. However the benefits of this change are really limited only to preventing bad code in SAXHandler or RuleManager classes. Cases where users write their own versions of those will be extremely rare. And mistakes in the core digester classes are, I think, likely to be picked up by our unit tests. So my current feeling is that while the above change **is theoretically correct**, I would leave it out and return the actual list, just documenting that it should be treated as unmodifiable. Comments? - created a new FallbackRuleManager class that only takes care about fallback actions [see my comments at end first!] Have you considered writing this class as a *decorator* rather than a subclass of DefaultRuleManager? If it was a decorator, it could then be applied to any rulemanager class, not just the default. It would be used as follows: RuleManager realrm = new DefaultRuleManager(); // or any other type RuleManager rm = new FallbackRuleManagerDecorator(realrm); Digester digester = new Digester(rm); The disadvantage would be that it would be necessary to implement each of the RuleManager methods in order to forward them to the decorated instance, together with the performance hit that would entail. [ah, if only this were Ruby] - SupplementaryRuleManager now takes account of the explicitely unmodifiable list returned by DefaultRuleManager and does a wild mixture of caching and concatenating the list returned from DefaultRuleManager and its own list of supplementary actions [see my comments at end first!] The ReadOnlyConcatList is cute. I don't think it would save any time or effort over List list = new ArrayList(l1.size() + l2.size()); list.appendAll(l1); list.appendAll(l2); but I guess the unmodifiable feature is thrown in for free. I've been thinking about the caching and see a mild problem. If wildcard pattern a (equivalent to */a in digester1), and a elements are scattered around in various places in the input document, then the cache is going to grow fairly large. So basically, whether caching improve things or not depends upon whether the input xml is simple and repetitive (caching wins) or reasonably unstructured (caching loses). And given we can't tell which is more likely over the set of digester users (at least I don't have a clue) I would probably prefer the simplest solution - not to cache. I'm also thinking about the possibility that we may move to a RuleManager style that matches based on much richer state than just the current path. FOr example, if we want to support patterns like this: /foo/[EMAIL PROTECTED]'me']/baz[pos()==last()][1] then getMatchingActions will definitely not be taking a String path any more, but instead an array of w3c.dom.Element objects or similar so it has sufficient info to decide whether the pattern matches or not. That won't affect many places in the code, but would make this caching impossible to implement so we'd have to take it out then anyway. [1] sorry my xpath is a bit rusty - this is probably not valid - Both FallbackRuleManager and SupplementaryRuleManager now can have several callback actions - Still docs and tests need improvement - will do so once the code seems somewhat stable Comments? You're probably going to hit me for this :-) After some thought, I think my preferred option would be to include fallback and supplementary features as standard, ie define them in the RuleManager interface, provide an implementation in the DefaultRuleManager class, and to remove the FallbackRuleManager and SupplementaryRuleManager classes completely. I've always said I'm happy for fallback features to be in DefaultRuleManager, and after thought I believe it ought to be defined as part of the main RuleManager interface. And as you're so keen for supplementary actions, I'm ok with adding them too as long as there is no performance hit for people who don't use them. Note that this would definitely be the easiest solution from the user point of view, with no special classes required in order to use supplementary actions. And I would go for the simplest implementation of supplementary actions, by simply creating a new list each time getMatchingActions is called when one or more supplementary actions are
Re: [digester2] Additions
On Tue, 15 Feb 2005 01:08:53 +1300, Simon Kitching [EMAIL PROTECTED] wrote: - DefaultRuleManager now returns an unmodifiable list upon getMatchingActions I'm not so sure about this one. This change means a new object is created each time getMatchingActions is called, and that is once for every xml element in the input document. And in addition, each method call is relayed by the UnmodifiableList wrapper to the underlying one, resulting in a performance hit too. Is this significant in the context of parsing an xml document? I don't know; maybe it's not measurable. However the benefits of this change are really limited only to preventing bad code in SAXHandler or RuleManager classes. Cases where users write their own versions of those will be extremely rare. And mistakes in the core digester classes are, I think, likely to be picked up by our unit tests. So my current feeling is that while the above change **is theoretically correct**, I would leave it out and return the actual list, just documenting that it should be treated as unmodifiable. Well, think of the situation where people inherit and do things like I have done on my first try. So, I would be all for keeping it in, but would not protest if you removed it. - created a new FallbackRuleManager class that only takes care about fallback actions [see my comments at end first!] Have you considered writing this class as a *decorator* rather than a subclass of DefaultRuleManager? If it was a decorator, it could then be applied to any rulemanager class, not just the default. It would be used as follows: RuleManager realrm = new DefaultRuleManager(); // or any other type RuleManager rm = new FallbackRuleManagerDecorator(realrm); Digester digester = new Digester(rm); The disadvantage would be that it would be necessary to implement each of the RuleManager methods in order to forward them to the decorated instance, together with the performance hit that would entail. [ah, if only this were Ruby] - SupplementaryRuleManager now takes account of the explicitely unmodifiable list returned by DefaultRuleManager and does a wild mixture of caching and concatenating the list returned from DefaultRuleManager and its own list of supplementary actions [see my comments at end first!] The ReadOnlyConcatList is cute. I don't think it would save any time or effort over List list = new ArrayList(l1.size() + l2.size()); list.appendAll(l1); list.appendAll(l2); but I guess the unmodifiable feature is thrown in for free. ReadOnlyConcatList certainly would be faster and have a smaller memory footprint. I've been thinking about the caching and see a mild problem. If wildcard pattern a (equivalent to */a in digester1), and a elements are scattered around in various places in the input document, then the cache is going to grow fairly large. So basically, whether caching improve things or not depends upon whether the input xml is simple and repetitive (caching wins) or reasonably unstructured (caching loses). And given we can't tell which is more likely over the set of digester users (at least I don't have a clue) I would probably prefer the simplest solution - not to cache. If the cache gets too large we can easily limit its size using LRUMap or something. I'm also thinking about the possibility that we may move to a RuleManager style that matches based on much richer state than just the current path. FOr example, if we want to support patterns like this: /foo/[EMAIL PROTECTED]'me']/baz[pos()==last()][1] then getMatchingActions will definitely not be taking a String path any more, but instead an array of w3c.dom.Element objects or similar so it has sufficient info to decide whether the pattern matches or not. That won't affect many places in the code, but would make this caching impossible to implement so we'd have to take it out then anyway. We can worry about this as soon as getMatchingActions takes anything else than a String. Right now the method signature that I implemented says it is a String, so there is no problem. You're probably going to hit me for this :-) I can't, you are too far away ;) After some thought, I think my preferred option would be to include fallback and supplementary features as standard, ie define them in the RuleManager interface, provide an implementation in the DefaultRuleManager class, and to remove the FallbackRuleManager and SupplementaryRuleManager classes completely. I've always said I'm happy for fallback features to be in DefaultRuleManager, and after thought I believe it ought to be defined as part of the main RuleManager interface. And as you're so keen for supplementary actions, I'm ok with adding them too as long as there is no performance hit for people who don't use them. Note that this would definitely be the easiest solution from the user point of view, with no special classes required in order to
Re: [digester2] Additions
On Mon, 2005-02-14 at 17:24 +0100, Oliver Zeigermann wrote: On Tue, 15 Feb 2005 01:08:53 +1300, Simon Kitching [EMAIL PROTECTED] wrote: - DefaultRuleManager now returns an unmodifiable list upon getMatchingActions I'm not so sure about this one. This change means a new object is created each time getMatchingActions is called, and that is once for every xml element in the input document. And in addition, each method call is relayed by the UnmodifiableList wrapper to the underlying one, resulting in a performance hit too. Is this significant in the context of parsing an xml document? I don't know; maybe it's not measurable. However the benefits of this change are really limited only to preventing bad code in SAXHandler or RuleManager classes. Cases where users write their own versions of those will be extremely rare. And mistakes in the core digester classes are, I think, likely to be picked up by our unit tests. So my current feeling is that while the above change **is theoretically correct**, I would leave it out and return the actual list, just documenting that it should be treated as unmodifiable. Well, think of the situation where people inherit and do things like I have done on my first try. So, I would be all for keeping it in, but would not protest if you removed it. I said above: Cases where users write their own versions of those will be extremely rare. I really believe that the number of people writing their own custom RuleManager class will be about 3, worldwide, over the next decade. But the number of people who get impacted by the (admittedly small) performance hit for the proposed change is 100% of users. I guess we have a tie here: +1 from you, and -1 from me. The best thing would be if we got a third opinion from either Craig or Robert. Otherwise, we could just have a remote game of bzflag (http://bzflag.org/screenshots/bzfi0026.jpg) or something, with winner takes all :-) [snip] The ReadOnlyConcatList is cute. I don't think it would save any time or effort over List list = new ArrayList(l1.size() + l2.size()); list.appendAll(l1); list.appendAll(l2); but I guess the unmodifiable feature is thrown in for free. ReadOnlyConcatList certainly would be faster and have a smaller memory footprint. Can you describe why ReadOnlyConcatList is faster and smaller? The array list is (in c terms) just an array of pointers to Actions. To create a new one, I am sure the ArrayList constructor does: * malloc a manager object with a few ints (size, capacity) and a reference to the data block. * malloc the data block, which is capacity*sizeof(reference) * for each entry in the source list, copy reference to new list [1] [2] Once the new ArrayList has been created, access is very fast. [1] The list is likely to be 1, 2, or 3 elements. Having more than 3 Actions match a particular input xml element would be very unusual.. [2] There is even the possibility that ArrayList has been optimised to do a memcpy if its source is an ArrayList. With the ReadOnlyConcatList, the work done is: * malloc a block of data for the ReadOnlyConcatList, which contains one int and the two references to ArrayList objects. * copy 2 references and call size once And accesses are fractionally slower, because get invokes the ReadOnlyConcatList, which performs an if-statement then invokes the get method on the appropriate underlying list. So on the positive side I think that ReadOnlyConcatList is marginally smaller (it doesn't need the data block, which is typically 4-12 bytes), and slightly more efficient to create (one less malloc, a few less copies of references) On the negative, the get method is slightly slower to use. It also adds a dozen lines to the source and an extra .class file to the jar. All in all, we are debating microseconds and handfuls of bytes here. Both seem *acceptable* solutions to me, though I do prefer simple in this case. And like the Unmodifiable discussion, the easiest resolution would be if some other Digester developer (eg Robert or Craig) would give their call, then we could get on with it. I'm also thinking about the possibility that we may move to a RuleManager style that matches based on much richer state than just the current path. FOr example, if we want to support patterns like this: /foo/[EMAIL PROTECTED]'me']/baz[pos()==last()][1] then getMatchingActions will definitely not be taking a String path any more, but instead an array of w3c.dom.Element objects or similar so it has sufficient info to decide whether the pattern matches or not. That won't affect many places in the code, but would make this caching impossible to implement so we'd have to take it out then anyway. We can worry about this as soon as getMatchingActions takes anything else than a String. Right now the method signature that I implemented says it is a String, so there is no problem. It just seems a
Re: [digester2] Additions
On Tue, 15 Feb 2005 11:27:29 +1300, Simon Kitching [EMAIL PROTECTED] wrote: On Mon, 2005-02-14 at 17:24 +0100, Oliver Zeigermann wrote: On Tue, 15 Feb 2005 01:08:53 +1300, Simon Kitching [EMAIL PROTECTED] wrote: - DefaultRuleManager now returns an unmodifiable list upon getMatchingActions I'm not so sure about this one. This change means a new object is created each time getMatchingActions is called, and that is once for every xml element in the input document. And in addition, each method call is relayed by the UnmodifiableList wrapper to the underlying one, resulting in a performance hit too. Is this significant in the context of parsing an xml document? I don't know; maybe it's not measurable. However the benefits of this change are really limited only to preventing bad code in SAXHandler or RuleManager classes. Cases where users write their own versions of those will be extremely rare. And mistakes in the core digester classes are, I think, likely to be picked up by our unit tests. So my current feeling is that while the above change **is theoretically correct**, I would leave it out and return the actual list, just documenting that it should be treated as unmodifiable. Well, think of the situation where people inherit and do things like I have done on my first try. So, I would be all for keeping it in, but would not protest if you removed it. I said above: Cases where users write their own versions of those will be extremely rare. I really believe that the number of people writing their own custom RuleManager class will be about 3, worldwide, over the next decade. But the number of people who get impacted by the (admittedly small) performance hit for the proposed change is 100% of users. I guess we have a tie here: +1 from you, and -1 from me. The best thing would be if we got a third opinion from either Craig or Robert. Otherwise, we could just have a remote game of bzflag (http://bzflag.org/screenshots/bzfi0026.jpg) or something, with winner takes all :-) I am not +1. I am +0. I have reverted that change. [snip] The ReadOnlyConcatList is cute. I don't think it would save any time or effort over List list = new ArrayList(l1.size() + l2.size()); list.appendAll(l1); list.appendAll(l2); but I guess the unmodifiable feature is thrown in for free. ReadOnlyConcatList certainly would be faster and have a smaller memory footprint. Can you describe why ReadOnlyConcatList is faster and smaller? The array list is (in c terms) just an array of pointers to Actions. To create a new one, I am sure the ArrayList constructor does: * malloc a manager object with a few ints (size, capacity) and a reference to the data block. * malloc the data block, which is capacity*sizeof(reference) * for each entry in the source list, copy reference to new list [1] [2] Once the new ArrayList has been created, access is very fast. [1] The list is likely to be 1, 2, or 3 elements. Having more than 3 Actions match a particular input xml element would be very unusual.. [2] There is even the possibility that ArrayList has been optimised to do a memcpy if its source is an ArrayList. With the ReadOnlyConcatList, the work done is: * malloc a block of data for the ReadOnlyConcatList, which contains one int and the two references to ArrayList objects. * copy 2 references and call size once And accesses are fractionally slower, because get invokes the ReadOnlyConcatList, which performs an if-statement then invokes the get method on the appropriate underlying list. So on the positive side I think that ReadOnlyConcatList is marginally smaller (it doesn't need the data block, which is typically 4-12 bytes), and slightly more efficient to create (one less malloc, a few less copies of references) On the negative, the get method is slightly slower to use. It also adds a dozen lines to the source and an extra .class file to the jar. All in all, we are debating microseconds and handfuls of bytes here. Both seem *acceptable* solutions to me, though I do prefer simple in this case. And like the Unmodifiable discussion, the easiest resolution would be if some other Digester developer (eg Robert or Craig) would give their call, then we could get on with it. No problem, really, simply replace it. If neither Robert nor Craig offer their opinion, then I'm willing to settle for the following compromise. It won't make either of us 100% happy, but I don't think there's anything we can't live with, either, and we can move forward again: * merge fallback rules into RuleManager and DefaultRuleManager, and remove FallbackRuleManager * leave SupplementaryRuleManager exactly as you wrote it, with caching and ReadOnlyConcatList (except for the changes needed due to removal of FallbackRuleManager) * remove the UnmodifiableList stuff from DefaultRuleManager
Re: [digester2] Additions
On Tue, 2005-02-15 at 01:35 +0100, Oliver Zeigermann wrote: After I have thought about it, I am +1 to do it the way you proposed initially: have it all in one class. That's not because I am frustrated or am no longer interested, but I think that's a good solution and causes the least controversy. Go ahead with it please. Ok, done. I've used the term mandatoryAction rather than supplementaryAction as it's equivalent, shorter and hopefully more familiar to non-english speakers. I've moved the path-matching functionality from SupplementaryRuleManager into a new class SimpleMatchUtils. And of course new class SimpleMatchUtilsTestCase now contains the basic xmlio demonstration adapted from the old SupplementaryRuleManagerTestCase. Cheers, Simon - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: [digester2] Additions
One design goal was to avoid using protected members. The problem here is that using protected for a *member variable* locks us into an implementation forever (a protected member is part of the external api that users can access). I think it preferable to add a public or protected method that provides access to the member rather than exposing the member directly. The method can later get the data in some manner other than reading from a member if we change the implementation. Note that having a method that returns a Map isn't a *whole* lot better. I think what we need is a copy-helper type method rather than exposing the namespace member just so a subclass can copy it. I can't think what the best solution is at the moment, but I feel there's an elegant (and probably fairly well known) solution nearby. How does clone() solve this issue? Note that I have (I think) been rigorous about avoiding protected members in the o.a.c.digester2 package, but not so rigorous in the action subpackage. I think it probably is a good idea to apply this to the action package too, though. In case those member variables really change this would be a general redesign I guess. In this case you might just as well create a copy from DefaultRuleManager maybe called NewDefaultRuleManager or DefaultRuleManager2 or whatever. I agree that passing the information using a method is no better. And having something like a copy helper is not enough as people might want have access to DefaultRuleManager guts in other scenarios as well. === I would suggest that the match method be placed somewhere other than on the Path class. Digester could well provide multiple different matching implementations in future. I think a static method on some XMLIOUtils class might be appropriate. The user's hand-written Action can then choose whether to use that matching algorithm or some other. All that Path currently does is return a String representing the canonical form of the Path, and that seems clean focussed to me. OK. If you think so. Done. === This bit in SupplementaryRuleManager will cause problems I think: public List getMatchingActions(String path) { List actionList = super.getMatchingActions(path); ... if (supplementaryAction != null) { actionList.add(supplementaryAction); } The list the RuleManager returns should not be modified; it is actually an internal data structure. This does point out that DefaultRuleManager I see. You are right. should be a little more defensive, perhaps returning Collections.unmodifiableList(actionList) which would cause the above add call to fail and throw an exception. That would, however, cause an object to be created on each getMatchingActions call, which I would prefer to avoid if possible. How about this instead? public List getMatchingActions(String path) { List actionList = super.getMatchingActions(path); ... if (supplementaryAction != null) { newActionList = new ArrayList(actionList); newActionList.add(supplementaryAction) } I suppose this could be done instead: if (supplementaryAction != null) { if (actionList.contains(supplementaryAction) { // nothing to do, we must have added the // supplementary action to this list earlier } else { // permanently modify DefaultRuleManager's // internal data structure; yecch! actionList.add(action) } } but as the comments indicate, I think that having one class manipulate another's internal data is not in terribly good taste, nor easy to maintain. Got that. I will think about that and provide something new. Why is there only facility for one fallback action, and one supplementary action? Wouldn't having a List of actions for each be no more difficult and potentially more useful? Maybe. Will do so. === On a minor SVN note, the SVN property svn:keywords needs to be set on a file in order to enable expansion of $Id$ into the appropriate text. This can be done manually via: svn propset svn:keywords Id filename It takes a commit to update the repository. I suggest this would be nice to do next time you update the SupplementaryRuleManager class, otherwise that text in the first line of the file will never change. Adding the following to your ~/.subversion/config file will ensure that each .java file that you add to subversion automatically gets $Id$ expansion enabled: [auto-props] *.java = svn:keywords=Id Tried that. Hopefully done now. (2) Why are the guts of Context accessible to the Action as well? They should only be interesting to the Digester core, not to the application. Why not adding an interface that hides the implentation details? Like public interface Context { Path getCurrentPath(); String getMatchPath();
Re: [digester2] Additions
[update] - now all member variables of DefaultRuleManager are private - introduced copy ctor for copying members from subclasses - DefaultRuleManager now returns an unmodifiable list upon getMatchingActions - created a new FallbackRuleManager class that only takes care about fallback actions - SupplementaryRuleManager now takes account of the explicitely unmodifiable list returned by DefaultRuleManager and does a wild mixture of caching and concatenating the list returned from DefaultRuleManager and its own list of supplementary actions - Both FallbackRuleManager and SupplementaryRuleManager now can have several callback actions - Still docs and tests need improvement - will do so once the code seems somewhat stable Comments? Oliver - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: [digester2] Additions
On Sat, 2005-02-12 at 08:28 +0100, Oliver Zeigermann wrote: Hi Simon, I have now committed the first proposal for the XMLIO like rule manager called SupplementaryRuleManager. I also added a very basic test for it. I had to make a minor visibility change in DefaultRuleManager that it extends. Path now has match methods. All this isn't perfect, especially Javadoc is missing and tests need to be extended, but I wanted to make this public for discussion as soon as possible. So, comments? One design goal was to avoid using protected members. The problem here is that using protected for a *member variable* locks us into an implementation forever (a protected member is part of the external api that users can access). I think it preferable to add a public or protected method that provides access to the member rather than exposing the member directly. The method can later get the data in some manner other than reading from a member if we change the implementation. Note that having a method that returns a Map isn't a *whole* lot better. I think what we need is a copy-helper type method rather than exposing the namespace member just so a subclass can copy it. I can't think what the best solution is at the moment, but I feel there's an elegant (and probably fairly well known) solution nearby. How does clone() solve this issue? Note that I have (I think) been rigorous about avoiding protected members in the o.a.c.digester2 package, but not so rigorous in the action subpackage. I think it probably is a good idea to apply this to the action package too, though. === I would suggest that the match method be placed somewhere other than on the Path class. Digester could well provide multiple different matching implementations in future. I think a static method on some XMLIOUtils class might be appropriate. The user's hand-written Action can then choose whether to use that matching algorithm or some other. All that Path currently does is return a String representing the canonical form of the Path, and that seems clean focussed to me. === This bit in SupplementaryRuleManager will cause problems I think: public List getMatchingActions(String path) { List actionList = super.getMatchingActions(path); ... if (supplementaryAction != null) { actionList.add(supplementaryAction); } The list the RuleManager returns should not be modified; it is actually an internal data structure. This does point out that DefaultRuleManager should be a little more defensive, perhaps returning Collections.unmodifiableList(actionList) which would cause the above add call to fail and throw an exception. That would, however, cause an object to be created on each getMatchingActions call, which I would prefer to avoid if possible. How about this instead? public List getMatchingActions(String path) { List actionList = super.getMatchingActions(path); ... if (supplementaryAction != null) { newActionList = new ArrayList(actionList); newActionList.add(supplementaryAction) } I suppose this could be done instead: if (supplementaryAction != null) { if (actionList.contains(supplementaryAction) { // nothing to do, we must have added the // supplementary action to this list earlier } else { // permanently modify DefaultRuleManager's // internal data structure; yecch! actionList.add(action) } } but as the comments indicate, I think that having one class manipulate another's internal data is not in terribly good taste, nor easy to maintain. === As I mentioned earlier, I'm happy to put the fallbackAction functionality into DefaultRuleManager. === Why is there only facility for one fallback action, and one supplementary action? Wouldn't having a List of actions for each be no more difficult and potentially more useful? === On a minor SVN note, the SVN property svn:keywords needs to be set on a file in order to enable expansion of $Id$ into the appropriate text. This can be done manually via: svn propset svn:keywords Id filename It takes a commit to update the repository. I suggest this would be nice to do next time you update the SupplementaryRuleManager class, otherwise that text in the first line of the file will never change. Adding the following to your ~/.subversion/config file will ensure that each .java file that you add to subversion automatically gets $Id$ expansion enabled: [auto-props] *.java = svn:keywords=Id And, by the, now that I have done some work and I came across some new questions. Maybe they are just dumb, but what do you think about this? (1) Why aren't relative paths to actions also cached? The current DefaultRuleManager implementation is straight from the old RulesBase class, except that absolute paths are expected to start with /. I have plans for a major rework of
Re: [digester2] Additions
Hi Simon, I have now committed the first proposal for the XMLIO like rule manager called SupplementaryRuleManager. I also added a very basic test for it. I had to make a minor visibility change in DefaultRuleManager that it extends. Path now has match methods. All this isn't perfect, especially Javadoc is missing and tests need to be extended, but I wanted to make this public for discussion as soon as possible. So, comments? And, by the, now that I have done some work and I came across some new questions. Maybe they are just dumb, but what do you think about this? (1) Why aren't relative paths to actions also cached? (2) Why are the guts of Context accessible to the Action as well? They should only be interesting to the Digester core, not to the application. Why not adding an interface that hides the implentation details? Like public interface Context { Path getCurrentPath(); String getMatchPath(); ClassLoader getClassLoader(); Object getRoot(); boolean isEmpty(); int getStackSize(); Object peek(); Object peek(int n); putItem(ItemId id, Object data); Object getItem(ItemId id); } and ContextImpl being what Context is now. Similar thing with Path. If you agree I could do the work. (3) Bringing this up again: Wouldn't it be more flexible to allow more than a String as a pattern in RuleManager#addRule(String pattern, Action action)? Oliver - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
[digester2] Additions
Folks, as I noticed Simon has done so much work on Digester2, I just wanted to be sure that my scheduled additions still are appropriate and aligned to the overall design. Here they are: (1) XMLIORuleManager: A rule manager that takes an action and unconditionally calls it on any event. It's useful to imitate xmlio style processing, but not limited to this. One application might be logging. As it is more general any proposal for a better name? Quoting code sample from Simon: // xmlio-style digester Action myHandler = new AbstractAction() { public void begin( Context context, String namespace, String name, Attributes attrs) { String path = context.getMatchPath(); if (path.equals(..)) { } else { } } public void body(...) { } } RuleManager xmlioRuleManager = new XMLIORuleManager(myHandler); Digester d = new Digester(); d.setRuleManager(xmlioRuleManager); As an option it might be possible to register an action with any rule manager that gets called unconditionally - epsecially for logging and debugging this can be really helpful. However, I wasn't quite sure if Simon was happy with that. (2) Next to the body and bodySegment call back methods there might be one that gives you the complete body of an element as a string composed of all character and markup data. This might be useful when you want to verbosely take over formatted mixed content like in XHTML. For performance reasons there should be some mechanism that tells digester when such content is needed on an element basis. Maybe something returned by the begin method or - maybe cleaner - something returned by an additional call back directly called after begin like boolean requiresComposedBody(String namespace, String name) or anything similar. This method could alternatively be part of an additional or extended interface. Comments? Oliver - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: [digester2] Additions
Hi Oliver, On Thu, 2005-02-10 at 19:08 +0100, Oliver Zeigermann wrote: Folks, as I noticed Simon has done so much work on Digester2, I just wanted to be sure that my scheduled additions still are appropriate and aligned to the overall design. Here they are: (1) XMLIORuleManager: A rule manager that takes an action and unconditionally calls it on any event. It's useful to imitate xmlio style processing, but not limited to this. One application might be logging. As it is more general any proposal for a better name? Quoting code sample from Simon: // xmlio-style digester Action myHandler = new AbstractAction() { public void begin( Context context, String namespace, String name, Attributes attrs) { String path = context.getMatchPath(); if (path.equals(..)) { } else { } } public void body(...) { } } RuleManager xmlioRuleManager = new XMLIORuleManager(myHandler); Digester d = new Digester(); d.setRuleManager(xmlioRuleManager); As an option it might be possible to register an action with any rule manager that gets called unconditionally - epsecially for logging and debugging this can be really helpful. However, I wasn't quite sure if Simon was happy with that. Adding functionality to DefaultRuleManager that returns a set of default actions *if no other action matches the current path* is not a problem. I'm all for it. And this would meet your requirements, yes? So I'm happy for you to go with this if you agree. And as a bonus, implementation should be trivial: a dozen lines or so. I'm not sure whether this API should then be exposed via the RuleManager interface (ie required for all RuleManagers) or just left on DefaultRuleManager. Adding functionality that returns a set of actions *in addition to* the actions that match the current path is trickier. In what order are they returned? And can we avoid instantiating a new List object each time getMatchingActions is called? I'm not saying I'm opposed to it, I'm just not sure right now. (2) Next to the body and bodySegment call back methods there might be one that gives you the complete body of an element as a string composed of all character and markup data. This might be useful when you want to verbosely take over formatted mixed content like in XHTML. For performance reasons there should be some mechanism that tells digester when such content is needed on an element basis. Maybe something returned by the begin method or - maybe cleaner - something returned by an additional call back directly called after begin like boolean requiresComposedBody(String namespace, String name) or anything similar. This method could alternatively be part of an additional or extended interface. Yep. I'm in favour of this. I would prefer an eye to be kept on performance; the requiresComposedBody solution doesn't tempt me greatly as that is a call that must be made on each Action each time it matches, and 99.999% of those calls will return false. I could live with it if there is no better alternative. However if an action's begin method could enable the body text collection and the end method disable it then wouldn't that have the same effect without additional work when the feature isn't being used? I agree it's a little less intuitive for Action writers though...and I haven't thought it through fully. Note that I am tentatively planning to do some significant rework of the DefaultRuleManager class. But that certainly won't be for some weeks so hack away! I guess you'll also be wanting to work on the Path class, so the custom Action class can more easily test the current path? I'll make sure I check on this list before altering this class (though I've got no plans to do so). Cheers, Simon - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]