[
https://issues.apache.org/jira/browse/SLING-5973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15427858#comment-15427858
]
Ben Fortuna edited comment on SLING-5973 at 8/19/16 9:26 AM:
-------------------------------------------------------------
Ah ok, so my rewriter pipeline includes a reasonably simple transformer
implementation that extends DefaultTransformer. See here for details:
https://github.com/micronode/whistlepost/tree/master/whistlepost-rewrite-lib
You'll find that the implementation (LinkTransformerFactory) is only concerned
with rewriting links in the output to remove the /content prefix from the path.
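
To illustrate the kind of link rewriting described above, here is a minimal
sketch in plain Java. It is not the actual LinkTransformerFactory code from the
repository; the class name ContentPrefixStripper and the method
stripContentPrefix are hypothetical, and the sketch assumes that only links
starting with the /content path segment should be changed.

```java
/**
 * Hypothetical helper, not taken from whistlepost-rewrite-lib.
 * Removes a leading "/content" path segment from a link.
 */
final class ContentPrefixStripper {

    private static final String PREFIX = "/content";

    static String stripContentPrefix(String href) {
        if (href == null) {
            return null;
        }
        // A link to the content root itself maps to the site root.
        if (href.equals(PREFIX)) {
            return "/";
        }
        // Only strip when "/content" is a whole path segment,
        // so "/contentious/..." is left untouched.
        if (href.startsWith(PREFIX + "/")) {
            return href.substring(PREFIX.length());
        }
        // External or unrelated links pass through unchanged.
        return href;
    }

    public static void main(String[] args) {
        System.out.println(stripContentPrefix("/content/site/page.html"));
        System.out.println(stripContentPrefix("/contentious/page.html"));
        System.out.println(stripContentPrefix("https://example.org/page.html"));
    }
}
```

A transformer would typically apply a function like this to the href attribute
of each link element as it passes through the pipeline.
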
The other references in that config (html-generator, html-serializer) are
provided by Sling itself. I should probably also mention that I am using the
official Sling Docker container to test this; however, I have confirmed that
the default file encoding, etc. is still UTF-8 in this environment.
Now that I think of it, I guess the rewriter (and the html-serializer) is only
"activated" on the paths specified by the transformer configuration, which
means you won't see the issue unless the path matches that configuration.
> HTMLSerializer not handling some unicode characters (emoji, etc.)
> -----------------------------------------------------------------
>
> Key: SLING-5973
> URL: https://issues.apache.org/jira/browse/SLING-5973
> Project: Sling
> Issue Type: Bug
> Components: Extensions
> Reporter: Ben Fortuna
> Attachments: emoji-no-sling-rewriter.png,
> emoji-with-sling-rewriter.png
>
>
> I've noticed that when I have Unicode special characters (e.g. emoji) in my
> Sling content and the Sling rewriter is enabled, the characters are not output
> correctly to the browser. For example:
> {code}😁{code} becomes {code}��{code}
> If I disable the rewriter pipeline the output is as expected.
> I've looked at the code and I suspect the issue is in the HTMLSerializer from
> the Cocoon library; however, I'm not sure why, as it should be using the
> default encoding for output (which is UTF-8). My rewriter pipeline is using
> the default html-generator and html-serializer provided by Sling.
> My code is available on GitHub here:
> https://github.com/Whistlepost/emojistrip
> It provides a very simple app/content project pair with some emoji characters
> in the content (see src/main/resources/SLING-INF/content/phrases.json). Many
> thanks.
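
For context on the symptom described above, the following plain-Java sketch
shows one way an emoji can turn into two garbage characters even when the
output encoding is UTF-8. In Java, U+1F601 is stored as a surrogate pair of two
char values, and encoding those values one at a time produces two replacement
bytes instead of the four-byte UTF-8 sequence. Whether this is what actually
happens inside the HTMLSerializer is an assumption and has not been confirmed
in this issue; the class and method names below are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical demo, not taken from Sling or Cocoon. */
final class SurrogateEncodingDemo {

    /** Encodes the whole string at once, so surrogate pairs stay together. */
    static byte[] encodeWhole(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    /** Encodes one UTF-16 code unit at a time, which splits surrogate pairs. */
    static byte[] encodePerChar(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < text.length(); i++) {
            // A lone surrogate is malformed input for the UTF-8 encoder,
            // so getBytes substitutes the replacement byte '?'.
            byte[] bytes = String.valueOf(text.charAt(i)).getBytes(StandardCharsets.UTF_8);
            out.write(bytes, 0, bytes.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String emoji = "\uD83D\uDE01"; // U+1F601 as a surrogate pair
        System.out.println("chars: " + emoji.length());
        System.out.println("bytes (whole): " + encodeWhole(emoji).length);
        System.out.println("per-char result: "
                + new String(encodePerChar(emoji), StandardCharsets.UTF_8));
    }
}
```

If the serializer does process characters individually, this would explain why
content inside the Basic Multilingual Plane is unaffected while emoji are not.
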