[jira] [Commented] (CONNECTORS-1625) When processing a specific PDF Manifold goes out of memory

Karl Wright (Jira) Fri, 04 Oct 2019 03:33:35 -0700


    [ 
https://issues.apache.org/jira/browse/CONNECTORS-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16944386#comment-16944386
 ]


Karl Wright commented on CONNECTORS-1625:
-----------------------------------------

Also, FWIW, the default Java memory sizes in the example are not guaranteed to 
allow N simultaneous Tika extractions (one per worker thread) when the 
documents need a lot of memory.  The memory allocated to the JVM is set in 
the start-options files, so the first thing to try is increasing those values 
to see if the problem goes away for you.
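
To make the point about heap versus worker threads concrete, here is a rough 
sketch (not ManifoldCF code) that reports the maximum heap the running JVM was 
given and divides it evenly across a worker-thread count. The class name 
HeapBudget, the helper perWorkerBytes, and the fallback of 30 threads are 
assumptions made up for illustration; only Runtime.maxMemory() is a real JDK 
call. An even split is a crude upper bound, since one large PDF can need far 
more than its share.

```java
// Illustrative sketch only: HeapBudget and perWorkerBytes are hypothetical names.
class HeapBudget {

    // Splits the maximum heap evenly across worker threads (a crude estimate).
    static long perWorkerBytes(long maxHeapBytes, int workerThreads) {
        if (workerThreads <= 0) {
            throw new IllegalArgumentException("workerThreads must be positive");
        }
        return maxHeapBytes / workerThreads;
    }

    public static void main(String[] args) {
        // Assumption: 30 is an arbitrary example value; pass your own count as args[0].
        int workerThreads = args.length > 0 ? Integer.parseInt(args[0]) : 30;

        // Maximum heap the JVM will attempt to use (reflects -Xmx if it was set).
        long maxHeapBytes = Runtime.getRuntime().maxMemory();

        long mib = 1024L * 1024L;
        System.out.println("Max heap (MiB): " + maxHeapBytes / mib);
        System.out.println("Worker threads: " + workerThreads);
        System.out.println("Even share per worker (MiB): "
                + perWorkerBytes(maxHeapBytes, workerThreads) / mib);
    }
}
```

Running it with the same -Xmx value as the crawler process gives a quick feel 
for how little heap each simultaneous extraction may get before raising the 
values.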


> When processing a specific PDF Manifold goes out of memory
> ----------------------------------------------------------
>
>                 Key: CONNECTORS-1625
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1625
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Tika extractor
>    Affects Versions: ManifoldCF 2.12
>            Reporter: Donald Van den Driessche
>            Assignee: Karl Wright
>            Priority: Major
>         Attachments: abd-serotec-antibodies-uk.pdf
>
>
> When processing the attached file with ManifoldCF 2.12, we keep getting an 
> out-of-memory error.
> When just parsing it through Tika 1.18, no issues are found.
> Can anyone look into it?
> Thanks in advance!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
