Is the following a bug or a feature? (Sorry it takes a while to explain.)
I have a moderately large XML document which, for maintainability, is
divided into about 25 physical files, included using entity references.
It's validated against a DTD (also in a separate file).
It has lots of elements with tagname "const". Most of them have an
ID attribute called "name".
Depending on whether or not I include the line
m_parser->setCreateEntityReferenceNodes(false);
while configuring an instance of XercesDOMParser, I see either what I
expect from DOMDocument::getElementById or something that seems very
weird: it appears to find a different element from the one I started
with, which I already know has the sought-after id.
Here is a fragment of the test program, where "doc" is the pointer
to the document returned by the parser:
* * * * * * * * * * * *
// Lots of elements with tagname = "const" have ID attribute called "name"
XMLCh* xmlchConst = XMLString::transcode("const");
XMLCh* xmlchName = XMLString::transcode("name");
DOMNodeList* constElts = doc->getElementsByTagName(xmlchConst);

// For each one, write out address. Get value of name attribute, if any.
// If has a name, find it via getElementById
// and write out that address. Should match
unsigned int nElt = constElts->getLength();
for (unsigned int iElt = 0; iElt < nElt; iElt++) {
  DOMNode* item = constElts->item(iElt);
  DOMElement* itemElt = dynamic_cast<DOMElement *>(item);
  std::cout << std::endl << "Const elt " << iElt << " Address as node: "
            << item << " and as element: " << itemElt << std::endl;
  const XMLCh* xmlchNamevalue = itemElt->getAttribute(xmlchName);
  if (XMLString::stringLen(xmlchNamevalue) > 0 ) {
    char* namevalue = XMLString::transcode(xmlchNamevalue);
    std::cout << "element has name " << namevalue << std::endl;
    DOMElement* byIdElt = doc->getElementById(xmlchNamevalue);
    std::cout << "Address from getElementById: " << byIdElt << std::endl;
    XMLString::release(&namevalue);
  }
}

// Free the transcoded tag and attribute names once the loop is done
XMLString::release(&xmlchConst);
XMLString::release(&xmlchName);
* * * * * * * * * * * *
Here is the output without the problematic parser setting:
Const elt 0 Address as node: 0x8172560 and as element: 0x8172560
element has name CsISegLength
Address from getElementById: 0x8172560
Const elt 1 Address as node: 0x8173058 and as element: 0x8173058
element has name nCsISegM1
Address from getElementById: 0x8173058
Const elt 2 Address as node: 0x8173a10 and as element: 0x8173a10
Const elt 3 Address as node: 0x8174158 and as element: 0x8174158
element has name CALMaxLayer
Address from getElementById: 0x8174158
* * * * * * * * * * * *
...and so forth.
With the line included, I get:
* * * * * * * * * * * *
Const elt 0 Address as node: 0x83e2960 and as element: 0x83e2960
element has name CsISegLength
Address from getElementById: 0x82716e8
Const elt 1 Address as node: 0x83e31e8 and as element: 0x83e31e8
element has name nCsISegM1
Address from getElementById: 0x82721e0
Const elt 2 Address as node: 0x83e3480 and as element: 0x83e3480
Const elt 3 Address as node: 0x83e3ef0 and as element: 0x83e3ef0
element has name CALMaxLayer
Address from getElementById: 0x82732e0
* * * * * * * * * * * *
...and so forth. That is, the address returned by getElementById is
never what I started out with. Furthermore, the element it points
to has the READ_ONLY flag set whereas the original doesn't.
The problem with omitting that parser setting is that it breaks another
piece of code I have, which checks whether a particular element has a
particular parent. I can try to find a work-around for that one,
but I'm wondering whether the above behavior is actually what's intended.
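One possible shape for that work-around, sketched on a toy node type rather
than on the real Xerces classes: when entity reference nodes are kept in the
tree, an element's immediate parent may be the entity reference rather than
the enclosing element, so the parent check would have to walk upward past
entity reference nodes. Everything named here (Node, Kind, logicalParent,
hasLogicalParent) is hypothetical and only illustrates the logic. With Xerces
the same walk would presumably use DOMNode::getParentNode() and compare
getNodeType() against DOMNode::ENTITY_REFERENCE_NODE.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a DOM tree. This is NOT the Xerces-C API; it only
// models the three node kinds that matter for the parent check.
enum class Kind { Document, Element, EntityReference };

struct Node {
    Kind kind;
    std::string name;
    Node* parent = nullptr;
    std::vector<std::unique_ptr<Node>> children;

    Node(Kind k, std::string n) : kind(k), name(std::move(n)) {}

    // Append a new child and return a non-owning pointer to it.
    Node* add(Kind k, const std::string& n) {
        children.push_back(std::unique_ptr<Node>(new Node(k, n)));
        children.back()->parent = this;
        return children.back().get();
    }
};

// Nearest ancestor that is an element, skipping any entity reference
// nodes in between. Returns nullptr if there is no such element
// (for example, for the document's root element).
const Node* logicalParent(const Node* node) {
    assert(node != nullptr);
    const Node* p = node->parent;
    while (p != nullptr && p->kind == Kind::EntityReference) {
        p = p->parent;
    }
    return (p != nullptr && p->kind == Kind::Element) ? p : nullptr;
}

// True if the nearest element ancestor of 'node' has the given tag name.
bool hasLogicalParent(const Node* node, const std::string& tagName) {
    const Node* p = logicalParent(node);
    return p != nullptr && p->name == tagName;
}
```

For a tree in which a "const" element sits inside an entity reference that is
itself a child of a (made-up) "section" element, the immediate parent is the
entity reference, while logicalParent returns the "section" element, so the
check gives the same answer whether or not entity reference nodes are created.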
I first ran into this with Xerces 2.4. This morning I fetched snapshot
http://cvs.apache.org/snapshots/xml-xerces/xml-xerces_20040217232946.tar.gz
and built that (on a machine running Red Hat 9 with the gcc 3.2 compiler), but it
didn't change anything. Comparable code for Xerces 1.7 behaved as expected,
but for 1.7 I never had a reason to touch the default setting of the
'include entity references' flag.
Joanne
------------
Joanne Bogart
[EMAIL PROTECTED]