[
https://issues.apache.org/jira/browse/LANG-935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14361627#comment-14361627
]
Fabian Lange commented on LANG-935:
-----------------------------------
Benchmark:
{code}
@Benchmark
public String bench() {
    // Long input whose only escapable characters ('&') sit near the end.
    return StringEscapeUtils.escapeHtml4("this is a very long text with the escaped special char & at the end&");
}
{code}
Result on Java 8_40
before
{code}
Result: 153589.535 ±(99.9%) 12944.377 ops/s [Average]
Statistics: (min, avg, max) = (150350.500, 153589.535, 158659.365), stdev = 3361.614
Confidence interval (99.9%): [140645.158, 166533.912]
{code}
after
{code}
Result: 361767.593 ±(99.9%) 78626.649 ops/s [Average]
Statistics: (min, avg, max) = (338958.103, 361767.593, 379619.757), stdev = 20419.091
Confidence interval (99.9%): [283140.944, 440394.242]
{code}
Result on Java 7_55
before
{code}
Result: 161540.572 ±(99.9%) 36142.539 ops/s [Average]
Statistics: (min, avg, max) = (151321.894, 161540.572, 172699.558), stdev = 9386.103
Confidence interval (99.9%): [125398.033, 197683.111]
{code}
after
{code}
Result: 359147.459 ±(99.9%) 77912.414 ops/s [Average]
Statistics: (min, avg, max) = (335531.130, 359147.459, 380033.890), stdev = 20233.607
Confidence interval (99.9%): [281235.045, 437059.874]
{code}
As you can see, it more than doubles throughput (about 2.4x on Java 8 and
about 2.2x on Java 7, going by the averages).
I still need to work out how to run this on Java 6, because I don't want to
pollute my Mac with 6 again :)
> Possible performance improvement on string escape functions
> -----------------------------------------------------------
>
> Key: LANG-935
> URL: https://issues.apache.org/jira/browse/LANG-935
> Project: Commons Lang
> Issue Type: Improvement
> Components: lang.text.translate.*
> Affects Versions: 3.1
> Reporter: Peter Wall
> Priority: Minor
> Labels: performance
> Fix For: Patch Needed
>
> Attachments: tempproject1.zip
>
>
> The escape functions for HTML etc. use the same code and the same
> initialisation tables for the escape and unescape functions, and while this
> is an elegant approach it leads to a number of deficiencies:
> 1. The code is very much less efficient than it could be
> 2. A new output string is created even when no conversion is required
> 3. No mapping is provided for characters that do not have a specific
> representation (for example HTML 0x101 should become &#x101;)
> The proposal is to use a new mapping technique to address these issues
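To make points 2 and 3 of the description concrete, here is a minimal sketch of what such an escaper could look like. It is not the Commons Lang implementation and not the attached patch: the class name HtmlEscapeSketch, its four-entry entity table, and the choice to emit hexadecimal numeric references for every non-ASCII code point are assumptions made purely for illustration.

```java
// Hypothetical sketch: not Commons Lang code and not the patch attached to LANG-935.
final class HtmlEscapeSketch {

    private HtmlEscapeSketch() {
    }

    // Deliberately tiny table (assumption); real HTML 4 escaping has many more named entities.
    private static String namedEntity(char c) {
        switch (c) {
            case '&': return "&amp;";
            case '<': return "&lt;";
            case '>': return "&gt;";
            case '"': return "&quot;";
            default:  return null;
        }
    }

    private static boolean needsEscape(char c) {
        // In this sketch every non-ASCII char is escaped numerically.
        return c > 0x7F || namedEntity(c) != null;
    }

    static String escape(String input) {
        if (input == null) {
            return null;
        }
        int len = input.length();
        int i = 0;
        // Scan for the first character that needs escaping.
        while (i < len && !needsEscape(input.charAt(i))) {
            i++;
        }
        if (i == len) {
            // Point 2: nothing to convert, so return the input itself without allocating.
            return input;
        }
        StringBuilder out = new StringBuilder(len + 16);
        out.append(input, 0, i); // copy the clean prefix in one call
        while (i < len) {
            int cp = input.codePointAt(i);
            i += Character.charCount(cp);
            if (cp < 0x80) {
                String named = namedEntity((char) cp);
                if (named != null) {
                    out.append(named);
                } else {
                    out.append((char) cp);
                }
            } else {
                // Point 3: fall back to a numeric reference, e.g. U+0101 becomes &#x101;
                out.append("&#x").append(Integer.toHexString(cp)).append(';');
            }
        }
        return out.toString();
    }
}
```

With this sketch, input that contains nothing to escape comes back as the very same String instance, and a character without a named entity in the table is written as a numeric reference instead of being passed through unchanged.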