[ 
https://issues.apache.org/jira/browse/TIKA-863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210242#comment-13210242
 ] 

Andrzej Bialecki  commented on TIKA-863:
----------------------------------------

This would work in my case, too. However, the same pattern would have to be 
applied in other places where Tika creates heavy parsers (not to mention the 
TikaConfig re-creation on each get, which is quite costly), so I think a 
convenience method should be provided to avoid repeating the same logic in 
various places.
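
To make the idea of a convenience method concrete, here is a minimal sketch 
of a lazily initialized, shared holder. It is only an illustration: the class 
name SharedInstance is hypothetical and not part of Tika, and it assumes the 
wrapped object is safe to share between threads once built.

```java
import java.util.function.Supplier;

/**
 * Hypothetical helper (not a Tika class): builds an expensive object once,
 * on first use, and hands the same instance to every later caller.
 */
final class SharedInstance<T> implements Supplier<T> {
    private final Supplier<T> factory;
    // volatile so a fully constructed instance is visible to all threads
    private volatile T instance;

    SharedInstance(Supplier<T> factory) {
        this.factory = factory;
    }

    @Override
    public T get() {
        T result = instance;
        if (result == null) {
            // Only the first callers contend on this lock; once the instance
            // is set, later calls return without synchronizing.
            synchronized (this) {
                result = instance;
                if (result == null) {
                    // Assumes the factory returns a non-null value; a null
                    // result would cause the factory to run again next time.
                    result = factory.get();
                    instance = result;
                }
            }
        }
        return result;
    }
}
```

A caller such as MailContentHandler could then keep one static holder, for 
example new SharedInstance<>(() -> new AutoDetectParser()), and call get() 
inside parse(...) instead of constructing a parser each time. Whether one 
shared parser instance suits every call site is an assumption that would 
need checking per parser.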
                
> MailContentHandler should not create AutoDetectParser on each call
> ------------------------------------------------------------------
>
>                 Key: TIKA-863
>                 URL: https://issues.apache.org/jira/browse/TIKA-863
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.1
>            Reporter: Andrzej Bialecki 
>         Attachments: TIKA-863.patch
>
>
> MailContentHandler is called from RFC822Parser, and it creates an 
> AutoDetectParser on each call to parse(...). Creating an AutoDetectParser 
> involves reading TikaConfig (which is not cached), which in turn involves 
> parsing XML config files. Besides being wasteful and heavy, in a highly 
> concurrent setup this leads to multiple threads blocking on SAX parser 
> creation.


        
