XML Digital Signature requires a rigorous solution to the canonicalization problem in order to make hashing work. (See http://www.w3.org/TR/2008/REC-xmldsig-core-20080610/ and http://www.w3.org/TR/2001/REC-xml-c14n-20010315.) One implementation is Apache Santuario (http://santuario.apache.org/cindex.html), which might be useful.
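
To illustrate why canonicalization matters for hashing, here is a minimal C++ sketch. It assumes a plain 64-bit FNV-1a hash as a stand-in for whatever digest you end up choosing; the function names are made up for this example and are not part of Xerces-C or Santuario. Two documents that differ only in attribute order are equivalent XML, yet their raw bytes are different, so a byte-level hash will not treat them as the same document unless both are canonicalized first.

```cpp
#include <cstdint>
#include <string>

// 64-bit FNV-1a over raw bytes. Hypothetical helper: any digest
// (MD5, SHA-1, ...) would show the same effect.
std::uint64_t fnv1a(const std::string& bytes) {
    std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
    for (unsigned char c : bytes) {
        h ^= c;                  // mix in one byte
        h *= 1099511628211ULL;   // FNV prime
    }
    return h;
}

// Compares two serialized documents by digest only.
// Hypothetical helper: it sees bytes, not XML structure.
bool same_digest(const std::string& xml_a, const std::string& xml_b) {
    return fnv1a(xml_a) == fnv1a(xml_b);
}

// Equivalent XML, different bytes: only the attribute order differs.
const std::string doc_a = "<doc x=\"1\" y=\"2\"/>";
const std::string doc_b = "<doc y=\"2\" x=\"1\"/>";
```

Calling same_digest(doc_a, doc_b) compares the two in-memory strings byte for byte via the hash, so the differing attribute order is enough to make them look like distinct documents. Canonical XML fixes the attribute order (among other things) so that equivalent documents serialize to identical bytes.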

If you decide to do your own thing, it's worth reviewing the DSig spec to make sure you handle all the cases. You'll need to do some sort of serialization in order to do a hash. "Write it out" sounds like you mean to write to disk, which is not necessary.

-----Original Message-----
From: Ben Griffin [mailto:[email protected]]
Sent: Friday, May 06, 2011 9:03 AM
To: [email protected]
Subject: A xercesc api access for a digest ?

Within any of the DOM/etc frameworks that Xercesc implements, is there a digest of a DOMDocument available, or will I have to write the document out and then digest it myself?

Primarily, I am looking for a means of being able to identify if a particular DOMDocument is the same as another as a part of a rapid-access hashmap - so I need something that is fast. Typically, there will be not more than a few hundred hashmap insertions, of which 80% will be insertion clashes (duplicate documents), but there will be hundreds of thousands of finds.

So, my current implementation involves digesting each hashmap candidate, which entails having to write it out. (This is necessary so as to ensure that the encoding is consistent - the sources use inconsistent encodings, and they cannot be preprocessed, as some of them are available via e.g. URLs.)
