On 8/21/07, Mark H. Wood <[EMAIL PROTECTED]> wrote:
> However, character entities such as mdash and sect
> are rejected as "referenced but not defined". Where would I find the
> rules? Where would I find the code defining the rules?

The rules are the rules for well-formed but not valid XML, which is
what the dublin_core.xml file essentially is. In a nutshell, XML that
is well-formed but has no DTD can't use named character entities
(apart from the five XML predefines: &amp;, &lt;, &gt;, &apos; and
&quot;), because every other named entity has to be declared in a DTD
before a parser can resolve it.

If you don't have a translator to UTF-8 handy, you *can* use numeric
entities (e.g. &#8212; for an em dash) in a brute-force
search-and-replace.

I'll send you, under separate cover, a bit of Python I wrote long ago
to help fix this precise problem. Nothing fancy, just a dictionary of
named-to-numbers.

Dorothea

-- 
Dorothea Salo                [EMAIL PROTECTED]
Digital Repository Librarian
AIM: mindsatuw
University of Wisconsin
Rm 218, Memorial Library
(608) 262-5493
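
For readers who want to try the named-to-numeric replacement described above, here is a minimal sketch in Python 3. It is not the script mentioned in the message; it assumes the standard library's html.entities.name2codepoint table is an acceptable stand-in for a hand-built dictionary, and the function name named_to_numeric is made up for illustration. The five XML predefined entities are left alone, as are names the table does not know.

```python
import re
from html.entities import name2codepoint  # HTML 4 entity names -> code points

# The five entities every XML parser already understands; leave them as-is.
XML_PREDEFINED = {"amp", "lt", "gt", "apos", "quot"}

# Matches a named entity reference such as &mdash; or &sect;
_NAMED_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def named_to_numeric(text):
    """Replace named entities with decimal numeric character references.

    Hypothetical helper for illustration: predefined XML entities and
    unrecognised names are returned unchanged.
    """
    def _swap(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return "&#%d;" % name2codepoint[name]

    return _NAMED_ENTITY.sub(_swap, text)


if __name__ == "__main__":
    sample = "Statutes &mdash; see &sect;2 &amp; following"
    print(named_to_numeric(sample))
    # prints: Statutes &#8212; see &#167;2 &amp; following
```

Leaving unknown names untouched is a deliberate choice in this sketch: the XML parser will still complain about them, which points you at anything the table missed rather than silently dropping it.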

