Alan,
I'd recommend starting with xdmp:document-load() -
http://developer.marklogic.com/pubs/3.2/apidocs/UpdateBuiltins.html#document-load
You might also be interested in
http://developer.marklogic.com/howto/tutorials/2006-06-recordloader.xqy
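If it turns out you do need to pre-process before loading, here is a minimal sketch in Python (not MarkLogic code) of the encoding and CDATA steps you list. It assumes the files really are ISO-8859-1, as your sample declares; the function name to_utf8 is hypothetical. It does not resolve DTD entities, which would need a DTD-aware parser with access to the DTD. Note that unwrapping CDATA turns the embedded markup into real markup, so the output is only well-formed if the CDATA content itself is.

```python
import re

# Encoding pseudo-attribute of an XML declaration at the very start of the text.
_XML_DECL = re.compile(r'^(<\?xml\s[^>]*?encoding=)(["\'])[^"\']+\2')
# A CDATA section; group 1 is the raw content between the delimiters.
_CDATA = re.compile(r'<!\[CDATA\[(.*?)\]\]>', re.DOTALL)


def to_utf8(raw: bytes, source_encoding: str = "iso-8859-1") -> bytes:
    """Transcode an XML document to UTF-8 and unwrap its CDATA sections.

    Assumption: source_encoding matches what the file actually uses.
    DTD entity references are left untouched.
    """
    text = raw.decode(source_encoding)
    # Rewrite the declared encoding so it matches the new byte encoding.
    text = _XML_DECL.sub(r'\g<1>\g<2>UTF-8\g<2>', text, count=1)
    # Drop the CDATA delimiters but keep the content, markup included.
    text = _CDATA.sub(lambda m: m.group(1), text)
    return text.encode("utf-8")
```

You would call to_utf8() on the bytes of each file and write the result out before pointing the loader at it.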
-- Mike
Alan Darnell wrote:
I have a number of documents (sample below) in XML format but not UTF-8
encoding and with an externally referenced DTD and rendering
stylesheet. What's the best way to get these documents into MarkLogic
so that:
- the encoding is changed to UTF-8
- any entities in the DTD are resolved to UTF-8 encoded characters
- any CDATA sections are removed with the content left intact, including
markup embedded in the CDATA content
Do I need to pre-process the files before loading or can MarkLogic
handle these kinds of conversion as part of the load functions?
Also, does anyone know of any good strategies for converting math in TeX
format to MathML?
Thanks,
Alan
Alan Darnell
University of Toronto
<?xml version="1.0" encoding="iso-8859-1"?><?xml-stylesheet
type="text/xsl" href="file://batchgate1\StyleS\bpg4
0.xsl"?>
<!DOCTYPE content PUBLIC "-//BLACKWELL PUBLISHING GROUP//DTD 4.0//EN"
"\\Batchgate1\bpgdtd\4-0\bpg4-0.dtd">
<content dtdver="4.0" docfmt="xml">