[ http://issues.apache.org/jira/browse/JCR-264?page=all ] Marcel Reutegger closed JCR-264: --------------------------------
> TextFilters get called three times within checkin() method > ---------------------------------------------------------- > > Key: JCR-264 > URL: http://issues.apache.org/jira/browse/JCR-264 > Project: Jackrabbit > Type: Improvement > Components: indexing > Environment: all > Reporter: Martin Perez > Fix For: 1.0.1 > > If you want to add a PDF document to a repository using a PdfTextFilter, and > you do the following steps: > session.save() > node.checkin(); > The method PdfTextFilter.doFilter() gets called 4 times!!! > session's save method calls doFilter one time. This is normal > But checkin method calls doFilter three times. Is this normal? I do not see > the sense. > ------------------ > > Marcel Reutegger > <[EMAIL PROTECTED]> to jackrabbit-dev > More options 11:43 am (13 minutes ago) > Hi Martin, > this is unfortunate and should be improved. the reason why this happens > is the following: > the search index implementation always indexes a node as a whole to > improve query performance. that means even if a single property changes > the parent node with all its properties is re-indexed. > unfortunately the checkin method sets properties in three separate > 'transactions', causing the search to re-index the according node three > times. > usually this is not an issue, because the index implementation keeps a > buffer for pending index work. that is, if you change the same property > several times and save after each setProperty() call, it won't actually > get re-indexed several times. but text filters behave differently here, > because they extract the text even though the text will never be used. > eventually this will improve without any change to the search index > implementation, because as soon as versioning participates properly in > transactions there will only be one call to index a node on checkin(). > as a quick fix we could improve the text filter classes to only parse > the binary when the returned reader is acutally used. -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira