[ 
https://issues.apache.org/jira/browse/TEXT-216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17513868#comment-17513868
 ] 

Richard Bunel commented on TEXT-216:
------------------------------------

Well, my target usage (in my web application) is to run the "unescapeHtml5" 
method over HTML content, to detect potential XSS attacks, before it is sent 
to and rendered in the browser. Leaving character entities escaped creates 
vulnerabilities.

For example, if I try to guard against JavaScript injection in images, a 
simple string like this will bypass the filter because the &colon; entity 
remains escaped.
{code:java}
<img src=javascript&colon;alert("XSS")> {code}
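
To make the bypass concrete, here is a minimal sketch that uses only the JDK. Everything in it is hypothetical and for illustration only: the class name, the two tiny entity tables and the naive "javascript:" check are assumptions of mine, not the Commons Text implementation and not a real XSS filter. It only shows that an unescaper which does not know the HTML 5 "colon" entity leaves the payload in a form the check cannot see.

```java
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EntityBypassSketch {

    // Hypothetical, deliberately tiny tables; NOT the tables shipped with Commons Text.
    static final Map<String, String> HTML4_SUBSET = Map.of(
            "lt", "<", "gt", ">", "amp", "&", "quot", "\"");
    static final Map<String, String> WITH_HTML5_COLON = Map.of(
            "lt", "<", "gt", ">", "amp", "&", "quot", "\"", "colon", ":");

    private static final Pattern NAMED_ENTITY =
            Pattern.compile("&([A-Za-z][A-Za-z0-9]*);");

    // Replaces named entities found in the table; unknown entities are left untouched.
    static String unescape(String input, Map<String, String> table) {
        Matcher m = NAMED_ENTITY.matcher(input);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String replacement = table.getOrDefault(m.group(1), m.group());
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Naive stand-in for a filter: only looks for a literal "javascript:" scheme.
    static boolean looksLikeScriptUrl(String html) {
        return html.toLowerCase(Locale.ROOT).contains("javascript:");
    }

    public static void main(String[] args) {
        String payload = "<img src=javascript&colon;alert(\"XSS\")>";
        // "colon" is unknown to the first table, so the scheme stays hidden.
        System.out.println(looksLikeScriptUrl(unescape(payload, HTML4_SUBSET)));     // false
        // Once "colon" is decoded, the same check catches the payload.
        System.out.println(looksLikeScriptUrl(unescape(payload, WITH_HTML5_COLON))); // true
    }
}
```

The browser decodes the entity when it renders the attribute, so the first result is the problematic one: the content passes the check and still ends up as a "javascript:" URL.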
 

The use of "escapeHtml5" is admittedly less obvious, but the same is true of 
the "escapeHtml4" and "escapeHtml3" methods, and they are still part of the 
library.

> HTML 5.0 Entities are not supported
> -----------------------------------
>
>                 Key: TEXT-216
>                 URL: https://issues.apache.org/jira/browse/TEXT-216
>             Project: Commons Text
>          Issue Type: Improvement
>    Affects Versions: 1.0
>            Reporter: Richard Bunel
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> As noted in 
> [TEXT-193|https://issues.apache.org/jira/projects/TEXT/issues/TEXT-193] and 
> probably other tickets, HTML 5.0 entities are not supported.
> A nice evolution would be to include them all.
> Tentative PR: https://github.com/apache/commons-text/pull/312



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
