Hi Folks,
I wrote an XQuery module (http-load.xqy) that accepts an HTTP POST whose body is an XML record. The record has sub-elements for the XML content itself and for the repair, location, URI, permissions, collection, and document properties required to execute xdmp:document-load. The module is served by MarkLogic Server through an HTTP app server.

I invoke http-load.xqy from a C#/.NET application using the HttpWebRequest class, and I explicitly set the content type as follows:

    wb.ContentType = "text/xml;charset=\"utf-8\"";

This worked very well until one of our vendors started sending content that explicitly specifies the XML encoding as ISO-8859-1 in the XML declaration (something none of their earlier content included):

    <?xml version="1.0" encoding="ISO-8859-1"?>

As a result, the loader now gives me the following error:

    XDMP-DOCUTF8SEQ: Invalid UTF-8 escape sequence at http://[server]/[path]/doc.xml line 44 -- document is not UTF-8 encoded

I could change the C# content type to drop the character set declaration:

    wb.ContentType = "text/xml";

That is what I'm inclined to try, just to see what happens. However, I need to support both UTF-8 and ISO-8859-1, and I don't want to have to determine the encoding myself before loading into MarkLogic. Content comes from a variety of vendors, so it would be nice to let MarkLogic figure out the encoding.

The encoding can be explicitly specified as either UTF-8 or ISO-8859-1 in the xdmp:document-load() options, but I'm wondering whether the encoding can be discovered automatically, and/or whether MarkLogic assumes UTF-8 unless an encoding is explicitly set in the XML declaration.

Thanks in advance for any help!

Tim Meagher
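
P.S. For reference, here is a minimal sketch of the explicit-encoding variant mentioned above, i.e. passing the encoding in the xdmp:document-load() options. It is only an illustration, not the actual body of http-load.xqy: the target URI /docs/doc.xml is made up, the location is the same placeholder as in the error message, and the "1.0-ml" version declaration assumes a server release that supports it.

```xquery
xquery version "1.0-ml";

(: Sketch only: load one document and state its encoding explicitly.
   The location is a placeholder and the <uri> value is hypothetical. :)
xdmp:document-load(
  "http://[server]/[path]/doc.xml",
  <options xmlns="xdmp:document-load">
    <uri>/docs/doc.xml</uri>
    <encoding>ISO-8859-1</encoding>
  </options>)
```

With this approach the caller has to know the encoding of each document up front, which is exactly what I am hoping to avoid.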