[ https://issues.apache.org/jira/browse/TEXT-215?focusedWorklogId=768321&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-768321 ]
ASF GitHub Bot logged work on TEXT-215:
---------------------------------------

                Author: ASF GitHub Bot
            Created on: 10/May/22 05:44
            Start Date: 10/May/22 05:44
    Worklog Time Spent: 10m
      Work Description: kinow commented on PR #310:
URL: https://github.com/apache/commons-text/pull/310#issuecomment-1121955771

   > Thanks for the quick answer ! I'm not subscribed yet, but will do it right now, and then I'll send the email to explain my PR. Thank you very much !

   Brilliant! Every component is discussed there, that's the only downside. Use the following prefix for your subject, please: "[text] Enter your email subject here". I have a rule in GMail to move it to another folder so that I can take a look when I'm not busy. You can ignore emails for components you are not interested in (I'm about to write one for Commons Configuration).

   Hopefully we will find a solution for this issue and fix & release it soon.

   Thanks!!!

Issue Time Tracking
-------------------

    Worklog Id:     (was: 768321)
    Time Spent: 2h  (was: 1h 50m)

> NumericEntityUnescaper may miss decimal entity
> ----------------------------------------------
>
>                 Key: TEXT-215
>                 URL: https://issues.apache.org/jira/browse/TEXT-215
>             Project: Commons Text
>          Issue Type: Bug
>    Affects Versions: 1.0
>            Reporter: Richard Bunel
>            Assignee: Bruno P. Kinoshita
>            Priority: Major
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> *Description:*
> A security vulnerability in NumericEntityUnescaper can be exploited through
> the use of decimal character entities.
> At
> [line|https://github.com/apache/commons-text/blob/master/src/main/java/org/apache/commons/text/translate/NumericEntityUnescaper.java#L117]
> 117 a string of hexadecimal characters is searched for, whether or not the
> entity is a hexadecimal one.
> Therefore, if the "semiColonOptional" option is enabled and a decimal entity
> without a semicolon is immediately followed by one or several letters from A
> to F, these letters will be captured as part of the entity. The Integer
> parsing with a radix of 10 will then fail and the whole entity will be
> ignored.
> *Example:*
> If one uses the following string:
> {code:java}
> <iframe src=\"&#106avascript:alert(1)\">{code}
> The sequence identifying the entity will wrongly be "&#106a" instead of
> "&#106".
> As "&#106a" is not a valid decimal entity, its Integer parsing fails and the
> whole entity remains escaped.
> Such code would then trigger the alert in all modern browsers.
> *Solution:*
> The fix for this is to restrict hexadecimal characters to hexadecimal
> entities and decimal characters to decimal entities.

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
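
For readers following along, the scanning behaviour in the issue description can be sketched in plain Java. This is a simplified illustration and not the Commons Text source: the class EntityScanSketch, its methods, and the restrictDigits flag are hypothetical names introduced here. It assumes the semicolon is optional and uses "&#106" (the decimal entity for "j") immediately followed by the letter "a".

```java
// Simplified sketch of the scan described in TEXT-215; NOT the Commons Text implementation.
// EntityScanSketch, accepts, unescape and restrictDigits are hypothetical names.
class EntityScanSketch {

    // Decides whether c belongs to the entity body.
    // With restrict == false, hex letters are accepted even for decimal entities (the reported behaviour).
    static boolean accepts(char c, boolean hex, boolean restrict) {
        boolean decimal = c >= '0' && c <= '9';
        boolean hexLetter = (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
        if (restrict && !hex) {
            return decimal;
        }
        return decimal || hexLetter;
    }

    // Unescapes "&#NNN" and "&#xHHH" entities; the trailing semicolon is optional.
    static String unescape(String input, boolean restrictDigits) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            if (input.startsWith("&#", i) && i + 2 < input.length()) {
                int start = i + 2;
                boolean hex = false;
                char first = input.charAt(start);
                if (first == 'x' || first == 'X') {
                    hex = true;
                    start++;
                }
                int end = start;
                while (end < input.length() && accepts(input.charAt(end), hex, restrictDigits)) {
                    end++;
                }
                if (end > start) {
                    try {
                        int codePoint = Integer.parseInt(input.substring(start, end), hex ? 16 : 10);
                        out.appendCodePoint(codePoint);
                        // Skip the semicolon if one is present.
                        i = (end < input.length() && input.charAt(end) == ';') ? end + 1 : end;
                        continue;
                    } catch (IllegalArgumentException e) {
                        // Parsing failed (NumberFormatException) or invalid code point:
                        // fall through and copy the entity text unchanged.
                    }
                }
            }
            out.append(input.charAt(i));
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String payload = "&#106avascript:alert(1)";
        System.out.println(unescape(payload, false)); // hex letters accepted for decimal entities
        System.out.println(unescape(payload, true));  // digits restricted by entity type
    }
}
```

With restrictDigits set to false the scan captures "106a", the radix-10 parse fails, and the entity is left escaped. With it set to true the scan stops at "106" and the entity is decoded, which is the behaviour the proposed solution asks for.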