Out-of-process text extraction
------------------------------
Key: TIKA-416
URL: https://issues.apache.org/jira/browse/TIKA-416
Project: Tika
Issue Type: New Feature
Components: parser
Reporter: Jukka Zitting
Priority: Minor
There is currently no easy way to guard against JVM crashes or excessive memory
or CPU use caused by parsing very large, broken, or intentionally malicious
input documents. To protect against such cases, and to make Tika's resource
consumption easier to manage in general, it would be great to have a way to run
Tika parsers in separate JVM processes. This could be handled either as a
separate "Tika parser daemon" or as an explicitly managed pool of forked JVMs.
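
As a rough illustration of the forked-JVM option, the sketch below starts a
child JVM with a capped heap and kills it if it runs past a time limit. It uses
only java.lang.ProcessBuilder and java.lang.Process from the JDK and is not an
existing Tika API. The class name ForkedParserSketch, the TIMED_OUT sentinel,
and the child main class com.example.ParseOneFile are hypothetical names made
up for this example.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not part of Tika: isolate one parse in a child JVM.
public class ForkedParserSketch {

    // Sentinel returned when the child had to be killed (an assumption of this sketch).
    static final int TIMED_OUT = Integer.MIN_VALUE;

    // Build the command line for a child JVM whose heap is capped with -Xmx,
    // reusing the java binary and classpath of the current JVM.
    static List<String> buildCommand(String mainClass, int maxHeapMb, String... args) {
        String javaBin = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        List<String> cmd = new ArrayList<>();
        cmd.add(javaBin);
        cmd.add("-Xmx" + maxHeapMb + "m");
        cmd.add("-cp");
        cmd.add(System.getProperty("java.class.path"));
        cmd.add(mainClass);
        for (String arg : args) {
            cmd.add(arg);
        }
        return cmd;
    }

    // Run the command and return its exit code, or TIMED_OUT if the child
    // did not finish within the limit and was forcibly terminated.
    static int runWithTimeout(List<String> cmd, long timeoutSeconds)
            throws IOException, InterruptedException {
        Process child = new ProcessBuilder(cmd)
                .redirectErrorStream(true)                       // merge stderr into stdout
                .redirectOutput(ProcessBuilder.Redirect.INHERIT) // avoid blocking on a full pipe
                .start();
        if (!child.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            child.destroyForcibly(); // runaway CPU use or a hang: kill the child
            child.waitFor();         // reap the process before returning
            return TIMED_OUT;
        }
        return child.exitValue();    // non-zero covers crashes and OutOfMemoryError exits
    }

    public static void main(String[] args) throws Exception {
        // com.example.ParseOneFile is a placeholder for a class that parses one file.
        List<String> cmd = buildCommand("com.example.ParseOneFile", 256, "input.pdf");
        int exit = runWithTimeout(cmd, 60);
        System.out.println(exit == 0 ? "parsed" : "parser failed or timed out");
    }
}
```

A crash, out-of-memory error, or hang in the child then only costs that one
document; the parent JVM sees a non-zero exit code or a timeout and carries on.
Starting a fresh JVM per document is expensive, which is one reason a daemon or
a reusable pool of forked JVMs, as suggested above, may be preferable.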