Hi Fritz,
Thanks for raising this. This is an issue I had been considering addressing for a _long_ time. It is unfortunately not a really trivial thing to 'do right'. There are three things that jump out at me as being potentially problematic with your current implementation.
1) getElementById I believe works on a subtree of the document. So unless the 'globally' registered element is a child of the element getElementById is called on the result should be null.
2) Adding/removing elements/subtree's. Someone can add a whole subtree to a document and all those nodes need to have their id's added, also when a subtree is removed there id's need to be removed (which might reveal a hidden duplicate id...).
3) For things like document fragments or simply disconnected node tree's they should probably have an independent id space.
The first is fairly easily solved by simply ensuring that 'this' (in getElementById) is an ancestor of the element found in the hash table. If it isn't then return null.
The second is trickier. A few ideas, detecting that an old element has been removed from the tree should be fairly easy given the 'parent' check above (it will fail). Although it points out that the hash entries should be soft references so that elements that were once part of the tree with an id can still be GC'ed. There is a significant issue about what should happen if the hashed node isn't listed (see next section).
Adding nodes, the simple thing would be to just walk the new subtree and add it's nodes to the parent document. One issue will be the handling of duplicate Id's. They may be duplicates because the old element was recently removed, or they may be duplicates because for the next instant both nodes will be in the tree (they add then remove for example). In the second case the "hard ass" response is throw an illegal state exception... This will not win very many friends however it is probably the 'correct' thing to do. If you don't do this, then you run the risk that when you remove a subtree there were element's that had the same id that were shadowed which would be a pain to track.
Another option would be to be slightly lazy about it, so when subtree's are added mark them as 'dirty', when someone asks for an id - check the hash table if it is present and still part of the document great return it. Otherwise start walking the 'dirty' subtree's adding id's as you go, until you get to the one you want (or return null). This is nice from the point of view of not making 'add' overly expensive.
Anyway I'm kinda rambling but I hope this helps somewhat!
Sieker, Fritz wrote:
Hi:
I have a number of SVG files produced by Visio that take a long time (> 20 minutes) to load. I used Jprobe and found that the biggest time sink was in Batik.dom.svg.SVGOMDocument.getById(). As an experiment, I replaced the scan of "protected static Element getById(String id, Node node)" by a hash table.
Specifically, I added the following in SVGOMDocument.java
private HashMap elementsWithIDs = new HashMap();
public void registerElement (String id, Element ele) { elementsWithIDs.put(id, ele); }
And modified the getElementById() call to:
public Element getElementById(String elementId) { return (Element) elementsWithIDs.get(elementId); }
Then to actually, load the hash table, I modified org.apache.batik.dom.AbstractElement.java to register elements when the "id" was set. For example, in "public void setAttributeNS()", at the end I added code
if (qualifiedName.equals("id")) { ((org.apache.batik.dom.svg.SVGOMDocument)getOwnerDocument()) .registerElement(value, this); }
I similarly modified "public void setAttribute()"
The results were dramatic, with the > 20 minute load dropping to just a few seconds.
What other changes would be necessary to make this change "correct"?
Thanks
Fritz Sieker
[EMAIL PROTECTED]
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
