On 8/21/07, Mark H. Wood <[EMAIL PROTECTED]> wrote:
>  However, character entities such as mdash and sect
> are rejected as "referenced but not defined".  Where would I find the
> rules?  Where would I find the code defining the rules?

The rules are the rules for well-formed but not valid XML, which is
what the dublin_core.xml file essentially is. In a nutshell, XML
without a DTD can only use the five predefined named entities (&amp;,
&lt;, &gt;, &apos; and &quot;). Anything else, mdash and sect
included, has to be declared in a DTD before the parser will accept
it, and there is no DTD here to declare it.
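
To see the behaviour for yourself, here is a minimal sketch using
Python's standard xml.etree.ElementTree parser. It is only an
illustration: DSpace uses its own Java parser, so the error wording
will differ, and the parses() helper and the sample dcvalue strings
are made up for this example.

```python
import xml.etree.ElementTree as ET


def parses(xml_text):
    """Return True if xml_text is accepted as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        # Undeclared named entities land here ("undefined entity").
        return False


# Hypothetical sample values, not taken from a real dublin_core.xml.
named = "<dcvalue>Foo &mdash; bar</dcvalue>"      # undeclared named entity
numeric = "<dcvalue>Foo &#8212; bar</dcvalue>"    # numeric reference
predefined = "<dcvalue>Foo &amp; bar</dcvalue>"   # one of the built-in five

for label, text in [("named", named), ("numeric", numeric),
                    ("predefined", predefined)]:
    print(label, "parses" if parses(text) else "is rejected")
```

The named form is rejected while the numeric and predefined forms go
through, which matches the "referenced but not defined" complaint.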

If you don't have a translator to UTF-8 handy, you *can* use numeric
character references (e.g. &#8212; for an em dash) in a brute-force
search-and-replace. Under separate cover, I'll send you a bit of
Python I wrote long ago to help fix this precise problem. Nothing
fancy, just a dictionary mapping entity names to numbers.
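
For anyone reading along without that script, the named-to-number
idea can be sketched roughly as follows. This is not the script
mentioned above, just an approximation under some assumptions: it
borrows the name table from Python 3's standard html.entities module
instead of a hand-built dictionary, and the function name
named_to_numeric is invented here.

```python
import re
from html.entities import name2codepoint

# The five entities every XML parser accepts without a DTD; leave them alone.
XML_PREDEFINED = {"amp", "lt", "gt", "apos", "quot"}

# Matches things shaped like &name; (letters first, then letters or digits).
ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def named_to_numeric(text):
    """Replace known named entities with numeric character references."""
    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            # Predefined or unknown names are passed through untouched.
            return match.group(0)
        return "&#%d;" % name2codepoint[name]
    return ENTITY_RE.sub(replace, text)


print(named_to_numeric("Foo &mdash; bar, &sect; 3 &amp; more"))
```

Names the table doesn't know are left as they are, so anything still
rejected after a pass through this needs fixing by hand.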

Dorothea

-- 
Dorothea Salo                [EMAIL PROTECTED]
Digital Repository Librarian      AIM: mindsatuw
University of Wisconsin
Rm 218, Memorial Library
(608) 262-5493
