Dmitry Potapov created MIME4J-283:
-------------------------------------

             Summary: DecoderUtil performance fix
                 Key: MIME4J-283
                 URL: https://issues.apache.org/jira/browse/MIME4J-283
             Project: James Mime4j
          Issue Type: Improvement
          Components: parser (core)
    Affects Versions: master, 0.8.2
            Reporter: Dmitry Potapov
         Attachments: patch

DecoderUtil currently uses the following regex pattern for rfc2047-encoded 
words: 
{code:java}
"(.*?)=\\?(.+?)\\?(\\w)\\?(.*?)\\?="
{code}
The first capturing group, {{(.*?)}}, is very expensive: because it is lazy, 
the regex engine evaluates the next pattern node at every input character. As 
a result, decoding a 4 KB input (a {{To:}} field with 40-80 recipients) takes 
up to 200 ms on modern CPUs.

At the same time, this capturing group is used only to store the separator 
text between encoded words. The proposed patch reuses the existing 
{{tailIndex}} to extract the separator text, and decoding the same input now 
takes only 1-2 ms.
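
For illustration only, here is a minimal sketch of the idea. It is not the attached patch. The class {{EncodedWordScan}}, the method {{split}} and the {{sep:}}/{{word:}} labels are hypothetical names made up for this example; only the pattern (minus its leading group) and the {{tailIndex}} name come from the description above. The sketch drops {{(.*?)}} from the pattern and recovers the separator text as the substring between the end of the previous match and the start of the current one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Mime4j.
final class EncodedWordScan {
    // The pattern from the issue without the leading lazy group (.*?).
    private static final Pattern ENCODED_WORD =
            Pattern.compile("=\\?(.+?)\\?(\\w)\\?(.*?)\\?=");

    // Splits the input into separator text and encoded words, in input order.
    // Encoded words are reported as charset/encoding/text without decoding them.
    static List<String> split(String body) {
        List<String> parts = new ArrayList<>();
        int tailIndex = 0; // end of the previous match
        Matcher m = ENCODED_WORD.matcher(body);
        while (m.find()) {
            // Separator text is whatever lies between the previous match and this one.
            String separator = body.substring(tailIndex, m.start());
            if (!separator.isEmpty()) {
                parts.add("sep:" + separator);
            }
            parts.add("word:" + m.group(1) + "/" + m.group(2) + "/" + m.group(3));
            tailIndex = m.end();
        }
        // Text after the last encoded word.
        if (tailIndex < body.length()) {
            parts.add("sep:" + body.substring(tailIndex));
        }
        return parts;
    }

    public static void main(String[] args) {
        String header = "=?UTF-8?Q?Hello?= <a@example.org>, "
                + "=?UTF-8?B?V29ybGQ=?= <b@example.org>";
        for (String part : split(header)) {
            System.out.println(part);
        }
    }
}
```

With this shape, {{Matcher.find()}} locates each encoded word and no capturing group has to carry the text between matches.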



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
