[
https://issues.apache.org/jira/browse/LANG-935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14360394#comment-14360394
]
ASF GitHub Bot commented on LANG-935:
-------------------------------------
GitHub user CodingFabian opened a pull request:
https://github.com/apache/commons-lang/pull/50
LANG-935 optimize lookup of translations by LookupTranslator
The previous implementation extracted substrings from the input and checked
whether it had a replacement for each one. The problem is that this always
allocates substrings (which are no longer "free" since JDK 7), even for
substrings that obviously have no mapping.
The new implementation no longer hashes substrings; instead it looks for
translations that could be applied to the input.
Usually the very first character is enough to rule out a translation, so the
first character is the new key for the mapping table.
This is twice as fast as the previous implementation and avoids a lot of
substring allocation.
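To illustrate the idea described above, here is a minimal sketch of a lookup
keyed by the first character. It is not the code from the pull request: the
class name FirstCharLookupSketch, the String[][] pair layout, and the
longest-key-first ordering are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the actual LookupTranslator from the pull request.
public class FirstCharLookupSketch {

    // Candidate {key, replacement} pairs, grouped by the first char of the key.
    private final Map<Character, List<String[]>> candidatesByFirstChar = new HashMap<>();

    // Assumption: every pair is {key, replacement} and every key is non-empty.
    public FirstCharLookupSketch(String[][] pairs) {
        for (String[] pair : pairs) {
            candidatesByFirstChar
                .computeIfAbsent(pair[0].charAt(0), c -> new ArrayList<>())
                .add(pair);
        }
        // Try longer keys first so the longest matching key wins.
        for (List<String[]> candidates : candidatesByFirstChar.values()) {
            candidates.sort(
                Comparator.comparingInt((String[] p) -> p[0].length()).reversed());
        }
    }

    public String translate(String input) {
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            // One map lookup on a single char rules out most positions.
            List<String[]> candidates = candidatesByFirstChar.get(input.charAt(i));
            String[] hit = null;
            if (candidates != null) {
                for (String[] pair : candidates) {
                    // startsWith(prefix, offset) compares in place, no substring.
                    if (input.startsWith(pair[0], i)) {
                        hit = pair;
                        break;
                    }
                }
            }
            if (hit == null) {
                out.append(input.charAt(i));
                i++;
            } else {
                out.append(hit[1]);
                i += hit[0].length();
            }
        }
        return out.toString();
    }
}
```

The point of the sketch is that positions whose first character has no entry
cost one map lookup and no allocation, and matching keys are compared in place
rather than by extracting and hashing a substring.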
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/CodingFabian/commons-lang LANG-935
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/commons-lang/pull/50.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #50
----
commit d8c59abd376a8aaf24fcca46a571efffe20b7a76
Author: Fabian Lange
Date: 2015-03-13T14:00:07Z
LANG-935 optimize lookup of translations by LookupTranslator
----
> Possible performance improvement on string escape functions
> -----------------------------------------------------------
>
> Key: LANG-935
> URL: https://issues.apache.org/jira/browse/LANG-935
> Project: Commons Lang
> Issue Type: Improvement
> Components: lang.text.translate.*
> Affects Versions: 3.1
> Reporter: Peter Wall
> Priority: Minor
> Labels: performance
> Fix For: Patch Needed
>
> Attachments: tempproject1.zip
>
>
> The escape functions for HTML etc. use the same code and the same
> initialisation tables for the escape and unescape functions, and while this
> is an elegant approach it leads to a number of deficiencies:
> 1. The code is much less efficient than it could be
> 2. A new output string is created even when no conversion is required
> 3. No mapping is provided for characters that do not have a specific
> representation (for example, HTML 0x101 should become &#257;)
> The proposal is to use a new mapping technique to address these issues.
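As a rough illustration of points 2 and 3 in the quoted issue, the sketch
below returns the input string itself when nothing needs escaping and falls
back to a numeric character reference for characters without a named entity.
It is not the mapping technique proposed in the attachment: the class name
EscapeSketch, the small entity table, and the 0x7E cut-off are all assumptions
chosen for the example.

```java
// Hypothetical sketch, not the code proposed in LANG-935.
public class EscapeSketch {

    public static String escapeHtml(String input) {
        StringBuilder out = null; // created only once an escape is needed
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            String replacement = replacementFor(cp);
            if (replacement != null) {
                if (out == null) {
                    // First escape found: copy the untouched prefix once.
                    out = new StringBuilder(input.length() + 16);
                    out.append(input, 0, i);
                }
                out.append(replacement);
            } else if (out != null) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        // No escape needed: hand back the original instance, no new string.
        return out == null ? input : out.toString();
    }

    private static String replacementFor(int cp) {
        switch (cp) {
            case '&': return "&amp;";
            case '<': return "&lt;";
            case '>': return "&gt;";
            case '"': return "&quot;";
            default:
                // Assumed policy: anything above 0x7E becomes a decimal reference.
                return cp > 0x7E ? "&#" + cp + ";" : null;
        }
    }
}
```

With this policy, the character 0x101 from the issue text comes out as the
decimal reference for 257, and a plain ASCII string with no special characters
is returned unchanged without building an output buffer.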