I think the issue is more likely to be about the encoding used for
serialization, not the encoding used for input. Everything in MarkLogic
is stored in UTF-8. The character escapes you see are produced when the
result is serialized; they are not what is stored. We store the actual
characters.

You can force the issue with the output settings on the app server.
Look at the output-encoding setting in particular.
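
To see the difference between what is stored and what is serialized, a
minimal sketch along these lines can be run in Query Console. It assumes
that a per-query xdmp:output option is an acceptable stand-in for the
app server setting; the sample element and its text are made up for
illustration.

```xquery
xquery version "1.0-ml";

(: Sketch only. Assumption: a per-query serialization option is used here
   instead of changing the app server's output settings. :)
declare option xdmp:output "encoding=utf-8";

(: Hypothetical sample element; its text contains one non-ASCII character. :)
let $doc := <title>caf&#233;</title>
return (
  (: The data model holds the character itself, not an escape:
     the fourth character is a single code point with value 233. :)
  fn:string-to-codepoints(fn:substring(fn:string($doc), 4, 1)),
  (: How this node is written out depends on the output encoding. :)
  $doc
)
```

If the output encoding cannot represent a character (us-ascii, for
example), a serializer has to write it as a numeric character reference,
which would explain seeing escapes on one server and not on another even
though the stored content is identical.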


On Sat, 14 Oct 2017 03:48:12 -0700, Zakiya Tamimi  
<[email protected]> wrote:

> I have posted my question at stackoverflow
> https://stackoverflow.com/questions/46722188/marklogic-encoding-xdmpdocument-load
>
> Here's the text of the question:
>
> I have noticed that utf-8 xml documents loaded (xdmp:document-get() +
> xdmp:document-insert()) into our development marklogic server (7.0-6.8)
> have ascii encoding. Meanwhile back on production server (7.0-5.1), there
> is no problem; utf-8 is loaded as utf-8. I traced the problem and found it
> to be caused by xdmp:document-get().
>
> So I wrote the following code snippet and ran it on both server consoles
> and got incorrect encoding on the development server and correct encoding
> on production.
>
> let $options := <options xmlns="xdmp:document-get">
>   <repair>full</repair>
>   <encoding>UTF-8</encoding>
>   <format>xml</format>
> </options>
> let $url := "http://******/ref_batches/electronic/20170801_e31_004/201731780-004.xml"
> return xdmp:document-get($url, $options)
>
> My initial guess: different version numbers may have caused this. So I
> tested on a local server (7.0-6-12) and got correct utf-8 encoding. Later
> we upgraded our development server to (7.0-6-12) and re-tested to get
> incorrect encoding (ascii)
>
> Are there MarkLogic configuration settings that are responsible for this
> transcoding?
>
> Thanks

