Dmitry Potapov created MIME4J-283:
-------------------------------------
Summary: DecoderUtil performance fix
Key: MIME4J-283
URL: https://issues.apache.org/jira/browse/MIME4J-283
Project: James Mime4j
Issue Type: Improvement
Components: parser (core)
Affects Versions: master, 0.8.2
Reporter: Dmitry Potapov
Attachments: patch
DecoderUtil currently uses the following regex pattern for rfc2047-encoded
words:
{code:java}
"(.*?)=\\?(.+?)\\?(\\w)\\?(.*?)\\?="
{code}
The first capturing group, {{(.*?)}}, is very expensive: it forces the matcher
to attempt the rest of the pattern at every input character. Because of this,
decoding a 4 KB input (a {{To:}} field with 40-80 recipients) takes up to 200 ms
on modern CPUs.
At the same time, this capturing group is used only to store the separator text
between encoded words. The proposed patch reuses the existing {{tailIndex}} to
extract the separator text, and decoding the same input now takes only 1-2 ms.
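
To illustrate the idea, here is a minimal sketch. It is not the attached patch. The class {{EncodedWordScan}} and the method {{separators}} are hypothetical names used only for this example. Only the pattern (minus its leading group) and the {{tailIndex}} name come from the description above. The sketch drops {{(.*?)}} and recovers the text between encoded words from the match offsets instead:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Mime4j.
final class EncodedWordScan {
    // The pattern from the report with the leading (.*?) group removed.
    private static final Pattern ENCODED_WORD =
            Pattern.compile("=\\?(.+?)\\?(\\w)\\?(.*?)\\?=");

    // Returns the plain text found before, between and after encoded words.
    static List<String> separators(String body) {
        List<String> result = new ArrayList<>();
        int tailIndex = 0; // end offset of the previous encoded word
        Matcher matcher = ENCODED_WORD.matcher(body);
        while (matcher.find()) {
            // Separator text is whatever lies between the previous match
            // and the start of this one; no capturing group is needed.
            result.add(body.substring(tailIndex, matcher.start()));
            tailIndex = matcher.end();
        }
        // Trailing text after the last encoded word.
        result.add(body.substring(tailIndex));
        return result;
    }
}
```

With this shape, {{Matcher.find()}} locates each encoded word on its own and the separator costs a single {{substring}} call per match. The reluctant group no longer has to grow one character at a time.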