Hello All,

I have dug into the issue a little further and have found a solution to my
problem. Assuming that the implementation's current tendency to only copy
the UserData for the top-most Node being adopted is correct, and assuming
my implementation in the original post is correct, then here is a somewhat
hacky work-around:

Create a subclass of CoreDocumentImpl in the org.apache.xerces.dom
package. Because protected members in Java are also visible to classes in
the same package, this class will have access to the xerces
CoreDocumentImpl's internal protected userData hashtable that's used to
hang onto UserData for associated nodes.

Here is my very simplistic/non-optimal solution:

package org.apache.xerces.dom;

import java.util.Hashtable;

import org.apache.xerces.dom.CoreDocumentImpl;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

public class DocumentImplHax extends CoreDocumentImpl
{
  public static void adoptNodeWithCorrectUserDataCopy(Node node,
      Document adoptingDocument)
  {
    Document srcDoc = node.getOwnerDocument();

    Hashtable srcUserDataTable = null;

    if (srcDoc instanceof CoreDocumentImpl)
    {
      srcUserDataTable = ((CoreDocumentImpl) srcDoc).userData;
    }

    adoptingDocument.adoptNode(node);

    if (adoptingDocument instanceof CoreDocumentImpl)
    {
      CoreDocumentImpl coreAdoptingDocument =
        (CoreDocumentImpl) adoptingDocument;

      if (srcUserDataTable != null)
      {
        if (coreAdoptingDocument.userData == null)
        {
          coreAdoptingDocument.userData = new Hashtable();
        }
        // Copies every entry of the source table, not just the adopted subtree.
        coreAdoptingDocument.userData.putAll(srcUserDataTable);
      }
    }
    else
    {
      System.out.println("Warning, User data may be lost because adopting " +
        "document is not an org.apache.xerces.dom.CoreDocumentImpl, it is a: " +
        adoptingDocument.getClass().getName());
    }
  }
}

All this class is doing is grabbing a reference to the source document's
userData hashtable and adding every object in that hashtable to the
adopting document's userData hashtable.

The xerces document<-->node implementation model stores Node userData
information inside the Document userData hashtable, keyed by the Node
objects themselves. After the adoptNode() operation the new
adoptingDocument does not get the userData information for Node objects
other than the root node you're trying to adopt, so the putAll() call in
the class above handles the problem by copying the entire source
document's userData map into the adopting document's userData map.
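
To make the storage model described above concrete, here is a toy sketch.
It is not Xerces code: ToyDocument, ToyNode and all their members are
made-up names, and the adopt() method only imitates the behaviour observed
in this thread (the root node's entry is moved, descendants' entries are
left behind in the source document's table).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

public class ToyUserDataModel
{
  /** Hypothetical node: holds no user data itself, only a pointer to its owner. */
  public static class ToyNode
  {
    public ToyDocument owner;
    public final List<ToyNode> children = new ArrayList<ToyNode>();

    public ToyNode(ToyDocument owner) { this.owner = owner; }

    public Object getUserData(String key)
    {
      // Lookup always goes through the *current* owner document's table.
      Map<String, Object> table = owner.userData.get(this);
      return table == null ? null : table.get(key);
    }

    public void setUserData(String key, Object value)
    {
      Map<String, Object> table = owner.userData.get(this);
      if (table == null)
      {
        table = new HashMap<String, Object>();
        owner.userData.put(this, table);
      }
      table.put(key, value);
    }
  }

  /** Hypothetical document: owns the user data of all its nodes, keyed by node. */
  public static class ToyDocument
  {
    public final Map<ToyNode, Map<String, Object>> userData =
      new IdentityHashMap<ToyNode, Map<String, Object>>();

    public void adopt(ToyNode root)
    {
      // Only the adopted root's entry is moved across...
      Map<String, Object> moved = root.owner.userData.remove(root);
      reown(root);
      if (moved != null)
      {
        userData.put(root, moved);
      }
    }

    private void reown(ToyNode node)
    {
      // ...while every descendant just gets a new owner, so its old entry
      // stays in the source table and is no longer found by getUserData().
      node.owner = this;
      for (ToyNode child : node.children)
      {
        reown(child);
      }
    }
  }
}
```

In this model, a child that had "startLine" data before adopt() reports
null afterwards, while the root keeps its value, which is the same shape
as the test output in the original post.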

Beware that if you're only adopting a small portion of some document's DOM
tree into another document, then this implementation above is overkill as
it's going to put the entire set of UserData information for the entire
source document's Nodes into the target document. For our particular use
case this is a non-issue, since we are adopting the entire source tree
into the adopting document.

Also beware of memory issues associated with UserData in these maps.
Several comments in the xerces documentation and code note that a Node
with UserData won't be garbage collected until its owner document is: the
UserData isn't owned by the Node object, it's owned by the owning
document, in a hashtable keyed by the Node object, so the Node sticks
around until that hashtable is garbage collected (when the owning document
is). The solution here could make the problem worse: every node in the
source document with userData will now also be prevented from garbage
collection until the adopting document is garbage collected. So if you
are adopting only part of a document's DOM tree into another document,
you will most likely need to rework the method above to walk the adopted
subtree and copy only its entries.
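
One possible shape for that tree walk, staying on the public org.w3c.dom
API instead of the internal hashtable, is sketched below. It is only a
sketch under some assumptions: you must know the UserData key up front
(the DOM API offers no way to enumerate keys), attributes are not visited,
and the class and method names (SubtreeUserDataAdopter, adoptWithUserData,
newDocument) are made up for illustration.

```java
import java.util.IdentityHashMap;
import java.util.Map;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.UserDataHandler;

public class SubtreeUserDataAdopter
{
  /**
   * Adopts node into target, then re-attaches the value stored under key
   * on every node of the adopted subtree. Returns null if adoption fails.
   */
  public static Node adoptWithUserData(Node node, Document target,
      String key, UserDataHandler handler)
  {
    // Snapshot the values before adoption, keyed by node identity.
    Map<Node, Object> saved = new IdentityHashMap<Node, Object>();
    collect(node, key, saved);

    Node adopted = target.adoptNode(node);
    if (adopted == null)
    {
      return null; // adoptNode may return null when the operation fails
    }

    // Re-attach the values now that the nodes belong to the target document.
    for (Map.Entry<Node, Object> entry : saved.entrySet())
    {
      entry.getKey().setUserData(key, entry.getValue(), handler);
    }
    return adopted;
  }

  // Depth-first walk over child nodes only (attributes are not visited).
  private static void collect(Node node, String key, Map<Node, Object> saved)
  {
    Object data = node.getUserData(key);
    if (data != null)
    {
      saved.put(node, data);
    }
    for (Node child = node.getFirstChild(); child != null;
        child = child.getNextSibling())
    {
      collect(child, key, saved);
    }
  }

  /** Convenience for trying the sketch out: an empty Document from JAXP. */
  public static Document newDocument()
  {
    try
    {
      return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
    }
    catch (ParserConfigurationException e)
    {
      throw new IllegalStateException(e);
    }
  }
}
```

With the line number sample this would be called as
adoptWithUserData(rootOfXmlB, xmlADocument, "startLine", handler). Only the
adopted subtree's values end up in the adopting document, but this sketch
does nothing about entries left behind in the source document.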

- Jason

> Hello,
>
> I am working with the java version of xerces and am having trouble
> figuring out how to get adopted nodes to carry attached UserData objects
> along with them from one document object to another.
>
> In particular, I am trying to add Line Number information to Node objects
> in a xerces document. There is a handy xerces sample application that
> shows me how to do this, you can find the sample in
> Xerces-J-src.2.9.1.zip:xerces-2_9_1\samples\dom\DOMAddLines.java.
>
> This sample is great! It got me started in the right direction and works
> perfectly as long as you are not trying to clone nodes or adopt them to
> other documents with the expectation that your line numbers will stick.
>
> The first problem is that the sample does not provide a UserDataHandler
> when it calls setUserData(), so I changed my sample around to include my
> own handler:
>
> public class DOMLineNumberUserDataHandler implements UserDataHandler
> {
>
>   public void handle(short operation, String key, Object data, Node src,
>       Node dst)
>   {
>     System.out.println("\nDOMLineNumberUserDataHandler Called: operation="
>         + operation + ", key=" + key + ", data=" + data + ", src=" + src
>         + ", dst=" + dst);
>
>     if (operation == UserDataHandler.NODE_ADOPTED)
>     {
>       System.out.println("(Current Operation is NODE_ADOPTED)");
>     }
>
>     if (dst != null)
>     {
>       dst.setUserData(key, data, this);
>     }
>    }
>
> }
>
> That's my handler code. The sample then needed to be changed around: I
> added a private UserDataHandler at the top of DOMAddLines, like so:
>
> private UserDataHandler userDataHandler =
>     new DOMLineNumberUserDataHandler();
>
> Then I changed the two lines from this:
>
> node.setUserData("startLine", String.valueOf(locator.getLineNumber()),
>     null);
>
> to this:
>
> node.setUserData("startLine", String.valueOf(locator.getLineNumber()),
>     this.userDataHandler);
>
> Those two lines are in the DOMAddLines "startDocument" and "startElement"
> methods.
>
> With my handler in place, cloning nodes works beautifully.
>
> However, adopting nodes from one document to another does not work.
>
> The scenario we're developing for is this: We have XML a, it has some
> specialized tags that are place holders for other XML documents (b) we
> pull in. When our code is searching through the DOM tree for XML-a and we
> find one of these tags that needs to be replaced with XML-b's contents,
> then we read in XML-b, adopt XML-b's nodes into the XML-a document, and do
> the replace. The adoptNode call does not copy UserData for nodes other
> than the top level node we're adopting though. We determined that we must
> call adoptNode on the XML-b's root element before sticking the nodes into
> the XML-a DOM tree in order to avoid 'WRONG_DOCUMENT_ERR' errors.
>
> Here is a brief test case that exemplifies the behaviour I am seeing:
>
> public void testLineNumberStuff()
> {
>   try
>   {
>     String originalDocument =
>       "<some>\n<document>\n<with>\n<no>content</no></with></document></some>";
>     DOMParser originalParser = new DOMAddLines();
>     originalParser.parse(
>       new InputSource(new StringReader(originalDocument)));
>     Element originalDocumentElement =
>       originalParser.getDocument().getDocumentElement();
>
>     String adoptingDocument = "<some>\n<other>\ndocument</other></some>";
>     DOMParser adoptingParser = new DOMAddLines();
>     adoptingParser.parse(
>       new InputSource(new StringReader(adoptingDocument)));
>     Element adoptingDocumentElement =
>       adoptingParser.getDocument().getDocumentElement();
>
>     DOMAddLines someUnrelatedParserForPrinting = new DOMAddLines();
>
>     System.out.println("ORIGINAL DOCUMENT BEFORE ADOPTION:");
>     someUnrelatedParserForPrinting.print(originalDocumentElement);
>
>     System.out.println("ADOPTING DOCUMENT BEFORE ADOPTION: ");
>     someUnrelatedParserForPrinting.print(adoptingDocumentElement);
>
>     adoptingDocumentElement.getOwnerDocument()
>       .adoptNode(originalDocumentElement);
>
>     someUnrelatedParserForPrinting = new DOMAddLines();
>
>     System.out.println("ORIGINAL DOCUMENT AFTER ADOPTION:");
>     someUnrelatedParserForPrinting.print(originalDocumentElement);
>
>     System.out.println("ADOPTING DOCUMENT AFTER ADOPTION: ");
>     someUnrelatedParserForPrinting.print(adoptingDocumentElement);
>   }
>   catch (Exception e)
>   {
>     System.out.println("Caught exception: " + e.getMessage());
>     e.printStackTrace();
>   }
> }
>
> The output of this test case is:
>
> ORIGINAL DOCUMENT BEFORE ADOPTION:
>
> content4:<no></no>3:<with></with>2:<document></document>1:<some></some>
>
> ADOPTING DOCUMENT BEFORE ADOPTION:
>
> document2:<other></other>1:<some></some>
>
> DOMLineNumberUserDataHandler Called: operation=5, key=startLine, data=1,
> src=[some: null], dst=null
> (Current Operation is NODE_ADOPTED)
>
> ORIGINAL DOCUMENT AFTER ADOPTION:
>
> contentnull:<no></no>null:<with></with>null:<document></document>1:<some></some>
>
> ADOPTING DOCUMENT AFTER ADOPTION:
>
>
> document2:<other></other>1:<some></some>
>
> As you can see, all of the line number UserData objects except one are
> dropped during the adoptNode scenario. All of the numbers in the "ORIGINAL
> DOCUMENT" are replaced with nulls except for the "1" that is the UserData
> on the top-most node in our example DOM tree.
>
> I have dug around a little in an eclipse debug view of the Document
> objects and I have also dug around a little on google and have not found a
> clear statement of this issue elsewhere, but I have seen discussion of the
> UserData objects being stored in the parent Document object in a hash map.
> It appears only the initial root node's UserData in the DOM tree for a
> document is being "adopted" over when an adoptNode call is made with that
> root node.
>
> Am I doing this incorrectly? Is the correct way to get the UserData copied
> over simply a manual traversal of the DOM tree I want to adopt and calling
> adoptNode() on each of those nodes?
>
> Thanks for your time,
>
> Jason Baker
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>


