I have a problem using xerces-2_7_1: I have to print out
the internal subset of XML documents, but the parameter entity refs
are always resolved by the DOM parser. If I set the feature
http://xml.org/sax/features/external-parameter-entities to false, then
the ref is resolved to an empty string (so the parameter entity ref
disappears).
To reproduce the problem:
1. Save this as an XML document:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE PEtest PUBLIC "-//TEST//DTD PETest XML//EN"
"petest.dtd" [
<!ENTITY % local.ent PUBLIC "-//TEST//DTD PETest Extension XML//EN" "">
%local.ent;
]>
<PETest>
<front>
</front>
<body>
</body></PETest>
2. Set the feature http://xml.org/sax/features/external-parameter-entities to
false and call the DOM parser. Here is a sample program:
public static void main(String argv[]) {
    ParserWrapper parser = null;
    // create parser
    try {
        parser = (ParserWrapper)Class.forName("dom.wrappers.Xerces").newInstance();
        parser.setFeature("http://xml.org/sax/features/namespaces", false);
        parser.setFeature("http://xml.org/sax/features/validation", false);
        parser.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        parser.setFeature("http://apache.org/xml/features/validation/schema", false);
        parser.setFeature("http://apache.org/xml/features/validation/dynamic", false);
        parser.setFeature("http://apache.org/xml/features/xinclude", false);
        parser.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        // parse file
        Document document = parser.parse(argv[0]);
        System.out.println(document.getDoctype().getInternalSubset());
    }
    catch (Exception e) {
        e.printStackTrace(System.err);
    }
} // main(String[])
Result: getInternalSubset() returns
<!ENTITY % local.ent PUBLIC "-//TEST//DTD PETest Extension XML//EN" "">
which means that the entity ref is resolved to an empty string. (If I set
http://xml.org/sax/features/external-parameter-entities to true, then the
parameter entity ref is correctly resolved and returned by getInternalSubset();
however, I need the reference without resolving it.)
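For anyone who wants to try this without the ParserWrapper class from the Xerces samples, the same setup can probably be reproduced through plain JAXP. The following is only a sketch: the class name InternalSubsetDemo and the helper internalSubsetOf are made up for illustration, the test document is inlined as a string, and it assumes that the JAXP implementation on the classpath is Xerces-based and accepts both feature URIs. I have not checked that every parser version prints the same text as xerces-2_7_1.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of Xerces or its samples.
final class InternalSubsetDemo {

    // The test document from step 1, inlined so that no files are needed.
    static final String SAMPLE =
          "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n"
        + "<!DOCTYPE PEtest PUBLIC \"-//TEST//DTD PETest XML//EN\"\n"
        + "\"petest.dtd\" [\n"
        + "<!ENTITY % local.ent PUBLIC \"-//TEST//DTD PETest Extension XML//EN\" \"\">\n"
        + "%local.ent;\n"
        + "]>\n"
        + "<PETest>\n<front>\n</front>\n<body>\n</body></PETest>\n";

    // Parses the given XML text with external DTD loading and external
    // parameter entities switched off, and returns the internal subset
    // as reported by the DOM DocumentType node.
    static String internalSubsetOf(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(false);
        factory.setValidating(false);
        factory.setFeature(
            "http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        factory.setFeature(
            "http://xml.org/sax/features/external-parameter-entities", false);

        DocumentBuilder builder = factory.newDocumentBuilder();
        Document document = builder.parse(new InputSource(new StringReader(xml)));
        return document.getDoctype().getInternalSubset();
    }

    public static void main(String[] args) throws Exception {
        // Prints whatever the parser reports as the internal subset.
        System.out.println(internalSubsetOf(SAMPLE));
    }
}
```

If the printed text contains the ENTITY declaration but no "%local.ent;" line, the parser in use shows the same behaviour as described above.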
I have found that if I add the following code into
org.apache.xerces.parsers.AbstractDOMParser.startParameterEntity(String,
XMLResourceIdentifier, String, Augmentations):
...
// append the parameter entity reference if it was not resolved
if (fInDTD && fInternalSubset != null && !fInDTDExternalSubset && augs != null) {
    Object skip = augs.getItem(Constants.ENTITY_SKIPPED);
    // equals() instead of == so the check does not depend on object identity
    if (Boolean.TRUE.equals(skip)) {
        fInternalSubset.append(name);
        fInternalSubset.append(";\n");
    }
}
then the parameter entity ref is included in the internal subset.
Can somebody confirm whether this is a bug or a feature?
Thanks!
Laszlo Bartos
