[jira] [Updated] (JCR-3146) Text extraction may congest thread pool in the repository

Alex Parvulescu (Updated) (JIRA) Mon, 14 Nov 2011 06:09:19 -0800

     [ https://issues.apache.org/jira/browse/JCR-3146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Alex Parvulescu updated JCR-3146:
---------------------------------

    Attachment: JCR-3146.patch

The solution is to define a second queue for tasks considered low priority, 
so that they don't fill the execution queue. The executor then polls this 
secondary queue for additional work items, depending on its current load.
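
As a rough illustration of the two-queue idea (this is not the code in 
JCR-3146.patch), the sketch below parks low-priority tasks in a secondary 
queue and hands them to the executor only while the number of in-flight tasks 
is below a limit. The class name LowPriorityGate, its methods, and the use of 
an in-flight counter as the measure of load are all assumptions made for the 
example.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.Executor;

// Hypothetical sketch of the two-queue idea; not the code from JCR-3146.patch.
class LowPriorityGate {

    private final Executor executor;   // the shared pool (any Executor works)
    private final int maxInFlight;     // assumed load limit for low-priority work
    private final Queue<Runnable> lowPriority = new ArrayDeque<>();
    private int inFlight;              // tasks handed to the executor, not yet finished

    LowPriorityGate(Executor executor, int maxInFlight) {
        this.executor = executor;
        this.maxInFlight = maxInFlight;
    }

    // Normal tasks always go straight to the executor.
    synchronized void execute(Runnable task) {
        inFlight++;
        executor.execute(wrap(task));
    }

    // Low-priority tasks wait in the secondary queue until the load permits.
    synchronized void executeLowPriority(Runnable task) {
        lowPriority.add(task);
        promote();
    }

    // Number of low-priority tasks still waiting in the secondary queue.
    synchronized int parkedLowPriorityTasks() {
        return lowPriority.size();
    }

    // Wrap a task so that finishing it frees a slot and polls the secondary queue.
    private Runnable wrap(Runnable task) {
        return () -> {
            try {
                task.run();
            } finally {
                taskFinished();
            }
        };
    }

    private synchronized void taskFinished() {
        inFlight--;
        promote();
    }

    // Move parked tasks to the executor while the load is below the limit.
    private synchronized void promote() {
        while (inFlight < maxInFlight) {
            Runnable next = lowPriority.poll();
            if (next == null) {
                return;
            }
            inFlight++;
            executor.execute(wrap(next));
        }
    }
}
```

With a limit of 2 and two normal tasks already in flight, a low-priority task 
stays parked until one of the normal tasks finishes. A real implementation 
would also have to handle rejected submissions and pool shutdown, which this 
sketch ignores.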

The secondary queue is used only as needed. The load threshold is 
configurable via the system property 
"org.apache.jackrabbit.core.JackrabbitThreadPool.maxLoadForLowPriorityTasks".
The value is interpreted as a percentage: 0 means disabled, and the default 
is 75.
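
For illustration only, the property could be read along these lines. The 
property name, the default of 75 and the meaning of 0 are taken from the 
description above; the helper class LowPriorityLoadSetting, its method names, 
and the definition of load as busy workers relative to pool size are 
assumptions, and the patch may compute the load differently.

```java
// Hypothetical helper; only the property name, the default of 75 and the
// meaning of 0 come from the description of the patch.
final class LowPriorityLoadSetting {

    static final String PROPERTY =
            "org.apache.jackrabbit.core.JackrabbitThreadPool.maxLoadForLowPriorityTasks";
    static final int DEFAULT_PERCENT = 75;

    private LowPriorityLoadSetting() {
    }

    // Reads the percentage; a missing or non-numeric value yields the default.
    static int maxLoadPercent() {
        return Integer.getInteger(PROPERTY, DEFAULT_PERCENT);
    }

    // 0 means disabled. Treating negative values the same way is an assumption.
    static boolean isEnabled() {
        return maxLoadPercent() > 0;
    }

    // Assumption: load is the share of busy workers in the pool, and
    // low-priority work may start while that share does not exceed the maximum.
    static boolean withinLoad(int busyWorkers, int poolSize, int maxPercent) {
        if (poolSize <= 0) {
            return false;
        }
        return busyWorkers * 100L <= (long) maxPercent * poolSize;
    }
}
```

The value would typically be supplied on the JVM command line with -D, for 
example 
-Dorg.apache.jackrabbit.core.JackrabbitThreadPool.maxLoadForLowPriorityTasks=50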

There are some timing issues with the indexing tests on account of this new 
async text extraction. I've tried to fix all of them, but there may be more.

I haven't yet touched the Tika extraction that happens in a separate 
process. I think that will need some minor refactoring as well.

Attaching proposed patch.


                
> Text extraction may congest thread pool in the repository
> ---------------------------------------------------------
>
>                 Key: JCR-3146
>                 URL: https://issues.apache.org/jira/browse/JCR-3146
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: jackrabbit-core
>            Reporter: Alex Parvulescu
>            Priority: Minor
>         Attachments: JCR-3146.patch
>
>
> Text extraction congests the thread pool in the repository when, e.g., many 
> PDFs are loaded into the workspace. As a result, tasks submitted by the 
> index merger are delayed, which leads to many index segment folders.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
