[ 
https://issues.apache.org/jira/browse/SOLR-2381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13000027#comment-13000027
 ] 

Bernd Fehling commented on SOLR-2381:
-------------------------------------

This already looks much better.
Initial tests show that with DIH, Unicode characters above the BMP are stored
correctly in a string index field.
Displayed with wt=json, the output is correct Unicode.
Displayed with wt=xml, the output is invalid Unicode.

Example (Mathematical sans-serif capital S):
loaded unicode with DIH - U+1D5B2 (F0 9D 96 B2)
displayed with wt=json  - U+1D5B2 (F0 9D 96 B2)
displayed with wt=xml   - ??????? (ED A0 B5 ED B6 B2)

These bytes were captured with Wireshark directly from the network.
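
As a side note on the byte values above: the six bytes seen with wt=xml are what results when the two UTF-16 surrogates of U+1D5B2 are each encoded as a separate 3-byte sequence, instead of the code point being encoded once as 4 bytes. The following is only a minimal Python sketch to reproduce that arithmetic; the function names are made up for illustration, and it does not show which component (Jetty or Solr) actually produces the bytes.

```python
def surrogate_pair(code_point):
    """Split a supplementary code point (above U+FFFF) into its UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return high, low


def encode_surrogates_separately(code_point):
    """Encode each surrogate as its own 3-byte sequence (not valid UTF-8)."""
    high, low = surrogate_pair(code_point)
    # "surrogatepass" lets Python's UTF-8 codec emit lone surrogates,
    # which a strict encoder would reject.
    return b"".join(chr(s).encode("utf-8", "surrogatepass") for s in (high, low))


def hex_bytes(data):
    """Format bytes the way they appear in the example above."""
    return " ".join(f"{b:02X}" for b in data)


code_point = 0x1D5B2  # Mathematical sans-serif capital S

print(hex_bytes(chr(code_point).encode("utf-8")))          # F0 9D 96 B2
print(hex_bytes(encode_surrogates_separately(code_point)))  # ED A0 B5 ED B6 B2
```

The second line matches the wt=xml capture, which suggests the XML path handles the character one UTF-16 code unit at a time somewhere between the writer and the socket.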

Open question:
- Is the invalid XML output caused by Jetty or by the XML writer in Lucene/Solr?


> The included jetty server does not support UTF-8
> ------------------------------------------------
>
>                 Key: SOLR-2381
>                 URL: https://issues.apache.org/jira/browse/SOLR-2381
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Robert Muir
>            Assignee: Robert Muir
>            Priority: Blocker
>             Fix For: 3.1, 4.0
>
>         Attachments: SOLR-2381.patch, SOLR-ServletOutputWriter.patch, 
> jetty-6.1.26-patched-JETTY-1340.jar, jetty-util-6.1.26-patched-JETTY-1340.jar
>
>
> Some background here: 
> http://www.lucidimagination.com/search/document/6babe83bd4a98b64/which_unicode_version_is_supported_with_lucene
> Some possible solutions:
> * wait and see if we get resolution on 
> http://jira.codehaus.org/browse/JETTY-1340. To be honest, I am not even sure 
> where jetty is being maintained (there is a separate jetty project at 
> eclipse.org with another bugtracker, but the older releases are at codehaus).
> * include a patched version of jetty with correct utf-8, using that patch.
> * remove jetty and include a different container instead.

