[ http://issues.apache.org/jira/browse/JCR-390?page=all ]
Jukka Zitting updated JCR-390:
------------------------------
Version: 1.0.1
> Move text extraction into a background thread
> ---------------------------------------------
>
> Key: JCR-390
> URL: http://issues.apache.org/jira/browse/JCR-390
> Project: Jackrabbit
> Type: Improvement
> Components: indexing
> Versions: 1.0, 1.0.1
> Environment: all
> Reporter: Marcel Reutegger
> Assignee: Marcel Reutegger
> Priority: Minor
>
> Although text extraction is not performed directly in save(), most of the
> extraction work is still done later by a client thread. A mechanism is in
> place that commits the deferred work in a background thread, but that thread
> is only triggered by a timer and does not continuously write back pending
> index changes. For regular index changes this is intentional and should not
> be changed. Text extraction, however, should be moved completely into a
> background thread, because indexing a large document often takes a
> considerable amount of time.
> Outline of a possible solution:
> - all text filtering tasks are put into a work queue
> - the work queue is processed by a background thread
> - basic indexing of nt:resource without text filtering takes place
> - the background thread updates the index when text filtering has completed
> for an nt:resource
> There should be a configuration parameter that allows text filtering to be
> executed without the background thread. This makes it possible to keep the
> existing behaviour of Jackrabbit, where the fulltext index is always
> up-to-date and can be used immediately.
> With the background process this is no longer the case.
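
The outline in the description could be sketched roughly as follows. This is only an illustration, not Jackrabbit code: the class DeferredTextIndexer, its methods, and the boolean "background" switch are hypothetical names, and a plain map stands in for the real index. It assumes a single worker thread fed by an executor queue, with the switch selecting the synchronous behaviour instead.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

/** Hypothetical sketch; none of these names are Jackrabbit classes. */
class DeferredTextIndexer {

    // Stand-in for the index: node id -> extracted text ("" until filtering ran).
    private final Map<String, String> fulltext = new ConcurrentHashMap<>();

    // Stand-in for a text filter that turns binary content into plain text.
    private final Function<byte[], String> textFilter;

    // Single background thread with its own work queue; null in synchronous mode.
    private final ExecutorService worker;

    DeferredTextIndexer(Function<byte[], String> textFilter, boolean background) {
        this.textFilter = textFilter;
        this.worker = background ? Executors.newSingleThreadExecutor() : null;
    }

    /** Indexes a resource: basic entry first, text filtering as a separate task. */
    void index(String nodeId, byte[] content) {
        // Basic indexing without text filtering happens immediately.
        fulltext.put(nodeId, "");

        // The text filtering task updates the entry once extraction completes.
        Runnable task = () -> fulltext.put(nodeId, textFilter.apply(content));

        if (worker == null) {
            task.run();           // synchronous mode: index is up-to-date on return
        } else {
            worker.execute(task); // background mode: task is queued for the worker
        }
    }

    /** Returns the indexed text, "" if filtering is pending, null if unknown. */
    String fulltextOf(String nodeId) {
        return fulltext.get(nodeId);
    }

    /** Stops accepting work and waits for queued tasks; true if all finished. */
    boolean shutdownAndAwait(long timeoutMillis) {
        if (worker == null) {
            return true;
        }
        worker.shutdown();
        try {
            return worker.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With background set to false, the extracted text is available as soon as index() returns, which corresponds to the existing behaviour. With background set to true, a query may see the empty placeholder until the worker has processed the queued task.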