[ 
https://issues.apache.org/jira/browse/LUCENE-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558935#action_12558935
 ] 

Karl Wettin commented on LUCENE-1016:
-------------------------------------

{quote}
I'm curious if the build part of this would be faster than reanalyzing a 
document.
{quote}

It is a slow process on an index with many terms. Each term has to be iterated 
and matched against the document number.
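
To illustrate the cost, here is a minimal sketch that uses a toy in-memory 
inverted index instead of Lucene's classes. The class name, the method name and 
the map layout are hypothetical and are not taken from the patch. The point is 
only that every term in the index is visited, even when the document contains 
just a few of them.

```java
import java.util.Map;
import java.util.TreeMap;

public class TermVectorRebuildSketch {

    // Toy inverted index: term -> (document number -> term frequency).
    // This is a hypothetical stand-in, not Lucene's API. A real index would
    // scan each term's postings instead of doing a map lookup.
    static Map<String, Integer> rebuild(Map<String, Map<Integer, Integer>> postings,
                                        int docNumber) {
        Map<String, Integer> vector = new TreeMap<>();
        // Every term is iterated, whether or not the document contains it.
        for (Map.Entry<String, Map<Integer, Integer>> entry : postings.entrySet()) {
            Integer freq = entry.getValue().get(docNumber);
            if (freq != null) {
                // The term occurs in the requested document, so keep it.
                vector.put(entry.getKey(), freq);
            }
        }
        return vector;
    }

    public static void main(String[] args) {
        Map<String, Map<Integer, Integer>> postings = new TreeMap<>();
        postings.put("apache", Map.of(0, 1, 1, 2));
        postings.put("lucene", Map.of(1, 3));
        postings.put("vector", Map.of(0, 2));
        System.out.println(rebuild(postings, 1)); // prints {apache=2, lucene=3}
    }
}
```

The work grows with the number of terms in the whole index, not with the size 
of the document.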

{quote}
Just thinking out loud, but I have been wondering about a Highlighter that uses 
the new TermVectorMapper, but using that doesn't account for non-TermVector 
based Documents that need to be analyzed. Was thinking this might account for 
both cases, all through the TermVectorMapper mechanism. Just doesn't seem like 
it would be very fast.
{quote}

This patch is mostly about when you don't have access to the source data. It 
was used together with a TermVectorMappingCachedTokenStreamFactory to extract 
re-indexable documents from any directory.

If you think of this piece of code and the highlighter together, I would 
consider something else, perhaps a tool that could add the term vector to all 
documents missing one in a single sweep of the index. I know very little about 
the file format and the highlighter, though.



> TermVectorAccessor, transparent vector space access 
> ----------------------------------------------------
>
>                 Key: LUCENE-1016
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1016
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Term Vectors
>    Affects Versions: 2.2
>            Reporter: Karl Wettin
>            Priority: Minor
>         Attachments: LUCENE-1016.txt
>
>
> This class visits TermVectorMapper and populates it with information 
> transparently, either by passing it down to the default terms cache 
> (documents indexed with Field.TermVector) or by resolving the inverted index.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

