[
https://issues.apache.org/jira/browse/SLING-5973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15426098#comment-15426098
]
Stefan Seifert edited comment on SLING-5973 at 8/18/16 8:32 AM:
----------------------------------------------------------------
I tried out your application in Sling Launchpad and opened the URL
http://localhost:8080/emojistrip/ - what should I see there if it works correctly?
Looking at http://localhost:8080/emojistrip/phrases/list/good-luck.json or
https://github.com/Whistlepost/emojistrip/blob/master/emojistrip-content/src/main/resources/SLING-INF/content/phrases.json
I think the problem is the escaping you use in the JCR content.
Escaping in the form {{&\#x2602;}} is HTML-specific and not compatible with the
JSON spec. You should store the UTF-8 data natively in JSON/JCR and let the
HTML serializer take care of the escaping only when rendering the page. So you
can either put the UTF-8 characters directly into the JSON file, or escape them
as defined in "2.5. Strings" in [RFC4627|http://www.ietf.org/rfc/rfc4627.txt],
e.g. {{\u005C}}. When storing data using the JCR API or Sling API you should not
do any escaping manually, just store the UTF-8 string.
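
To illustrate the difference between the two escaping styles, here is a minimal
Java sketch. The class and method names are hypothetical and are not part of
Sling or Cocoon; it only shows what an RFC 4627 style escape looks like for the
stored JSON and what an HTML numeric entity looks like when produced at render
time from the natively stored string.

```java
public final class EscapeSketch {

    private EscapeSketch() {
    }

    /**
     * JSON-style escaping as described in RFC 4627, section 2.5:
     * every non-ASCII UTF-16 code unit becomes a backslash-u escape with
     * four hex digits, so a character outside the BMP yields a surrogate pair.
     */
    public static String toJsonEscapes(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '"' || c == '\\') {
                out.append('\\').append(c);
            } else if (c < 0x20 || c > 0x7E) {
                out.append(String.format("\\u%04X", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    /**
     * HTML escaping, meant to happen only at render time: one numeric
     * character reference per Unicode code point (not per UTF-16 unit).
     */
    public static String toHtmlEntities(String s) {
        StringBuilder out = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (cp == '&') {
                out.append("&amp;");
            } else if (cp == '<') {
                out.append("&lt;");
            } else if (cp == '>') {
                out.append("&gt;");
            } else if (cp == '"') {
                out.append("&quot;");
            } else if (cp > 0x7E) {
                out.append("&#x")
                   .append(Integer.toHexString(cp).toUpperCase())
                   .append(';');
            } else {
                out.append((char) cp);
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        String umbrella = "\u2602";        // U+2602, stored natively
        String grin = "\uD83D\uDE01";      // U+1F601, a surrogate pair in UTF-16
        System.out.println(toJsonEscapes(umbrella));
        System.out.println(toJsonEscapes(grin));
        System.out.println(toHtmlEntities(umbrella));
        System.out.println(toHtmlEntities(grin));
    }
}
```

The point of the sketch is that the content itself holds the plain string (or
its JSON escape), while the entity form is only derived from it when HTML is
written.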
was (Author: [email protected]):
I tried out your application in Sling Launchpad and opened the URL
http://localhost:8080/emojistrip/ - what should I see there if it works correctly?
Looking at http://localhost:8080/emojistrip/phrases/list/good-luck.json or
https://github.com/Whistlepost/emojistrip/blob/master/emojistrip-content/src/main/resources/SLING-INF/content/phrases.json
I think the problem is the escaping you use in the JCR content.
Escaping in the form {{☂}} is HTML-specific and not compatible with the JSON
spec. You should store the UTF-8 data natively in JSON/JCR and let the HTML
serializer take care of the escaping only when rendering the page. So you can
either put the UTF-8 characters directly into the JSON file, or escape them as
defined in "2.5. Strings" in [RFC4627|http://www.ietf.org/rfc/rfc4627.txt],
e.g. {{\u005C}}. When storing data using the JCR API or Sling API you should not
do any escaping manually, just store the UTF-8 string.
> HTMLSerializer not handling some unicode characters (emoji, etc.)
> -----------------------------------------------------------------
>
> Key: SLING-5973
> URL: https://issues.apache.org/jira/browse/SLING-5973
> Project: Sling
> Issue Type: Bug
> Reporter: Ben Fortuna
>
> I've noticed that when I have unicode special characters (e.g. emoji) in my
> sling content and the sling rewriter is enabled the characters are not output
> correctly to the browser. For example:
> {code}😁{code} becomes {code}��{code}
> If I disable the rewriter pipeline the output is as expected.
> I've looked in the code and I suspect the issue is in the HTMLSerializer from
> the Cocoon library, however I'm not sure why as it should be using the
> default encoding for output (which is UTF-8). My rewriter pipeline is using
> the default html-generator and html-serializer provided by sling.
> My code is available on GitHub here:
> https://github.com/Whistlepost/emojistrip
> It provides a very simple app/content project pair with some emoji characters
> in the content (see src/main/resources/SLING-INF/content/phrases.json). Many
> thanks.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)