[ https://issues.apache.org/jira/browse/TIKA-863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13404580#comment-13404580 ]
Nick Burch commented on TIKA-863:
---------------------------------
Should be fixed in r1355782. We now avoid creating a new AutoDetectParser (and
implicit TikaConfig) for each part in an RFC822 message. Instead, we check for
one on the ParseContext; failing that, we cache the TikaConfig for the lifetime
of the message being parsed and grab the parser from that.
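
For anyone following along, here is a rough sketch of the lookup order
described above. It is not the committed code: Parser, Config, Context and
MessageHandler are hypothetical stand-ins for the Tika types so that the
snippet compiles on its own, and looking the parser up by its interface type
is an assumption. The LOADS counter only exists to show how often the
expensive configuration step runs.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-ins only; none of these are the real Tika classes.
final class PartParserSketch {

    /** Stand-in for a parser that handles one message part. */
    interface Parser {
        String parse(String part);
    }

    /** Stand-in for a configuration that is expensive to build. */
    static final class Config {
        /** Counts constructions, to make the cost visible. */
        static final AtomicInteger LOADS = new AtomicInteger();

        private final Parser parser = part -> "auto:" + part;

        Config() {
            LOADS.incrementAndGet(); // represents parsing the XML config files
        }

        Parser defaultParser() {
            return parser;
        }
    }

    /** Stand-in for a type-keyed context passed along with a parse. */
    static final class Context {
        private final Map<Class<?>, Object> entries = new HashMap<>();

        <T> void set(Class<T> key, T value) {
            entries.put(key, value);
        }

        <T> T get(Class<T> key) {
            return key.cast(entries.get(key)); // null when nothing was set
        }
    }

    /** One instance per message; resolves the parser for each part. */
    static final class MessageHandler {
        private final Context context;
        private Config cachedConfig; // built lazily, at most once per message

        MessageHandler(Context context) {
            this.context = context;
        }

        Parser parserForPart() {
            // 1. Prefer a parser the caller supplied on the context.
            Parser supplied = context.get(Parser.class);
            if (supplied != null) {
                return supplied;
            }
            // 2. Otherwise build the config once and reuse it for every part.
            if (cachedConfig == null) {
                cachedConfig = new Config();
            }
            return cachedConfig.defaultParser();
        }
    }
}
```

With this shape, a message with many parts pays for the configuration at most
once, and a caller that puts its own parser on the context never pays for it
at all.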
> MailContentHandler should not create AutoDetectParser on each call
> ------------------------------------------------------------------
>
> Key: TIKA-863
> URL: https://issues.apache.org/jira/browse/TIKA-863
> Project: Tika
> Issue Type: Bug
> Components: parser
> Affects Versions: 1.1
> Reporter: Andrzej Bialecki
> Attachments: TIKA-863.patch
>
>
> MailContentHandler is called from RFC822Parser, and it creates an
> AutoDetectParser on each call to parse(...). Creating an AutoDetectParser
> involves reading TikaConfig (not cached), which in turn involves parsing XML
> config files. Besides being wasteful and heavy, in a highly concurrent setup
> this leads to multiple threads blocking on SAX parser creation.