Re: Re-indexing a particular field only without re-indexing the entire enclosing document in the index

KARTHIK SHIVAKUMAR Tue, 24 Apr 2012 09:28:07 -0700

Hi

A simple technique is to use "Update Index" for the dynamic data column
rather than re-indexing the whole document.
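
To make the suggestion above concrete, here is a minimal sketch in plain Java.
It is a toy model and not the Lucene API: the class SplitIndexSketch and its
methods indexDocument, updateMetadata and search are hypothetical names made
up for illustration. It assumes the large text is tokenized once into a
term-to-document map, while the small, frequently changing value lives in a
separate map keyed by document id, so changing it never touches the large
field. The search-time filtering cost raised in the quoted question still
applies to this layout.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: NOT the Lucene API. All names here are hypothetical.
class SplitIndexSketch {
    // Static part: term -> ids of documents whose large text contains the term.
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    // Dynamic part: document id -> current metadata value.
    private final Map<Integer, String> metadata = new HashMap<>();

    // Expensive path: tokenizes the large text. Intended for first-time
    // indexing; this sketch does not handle text changes or deletions.
    void indexDocument(int docId, String largeText, String meta) {
        for (String term : largeText.toLowerCase().split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
        metadata.put(docId, meta);
    }

    // Cheap path: replaces the metadata value without touching the postings.
    void updateMetadata(int docId, String meta) {
        metadata.put(docId, meta);
    }

    // Matches the term against the static part, then filters by metadata.
    List<Integer> search(String term, String requiredMeta) {
        List<Integer> hits = new ArrayList<>();
        Set<Integer> candidates =
                postings.getOrDefault(term.toLowerCase(), Collections.<Integer>emptySet());
        for (int docId : candidates) {
            if (requiredMeta.equals(metadata.get(docId))) {
                hits.add(docId);
            }
        }
        return hits;
    }
}
```

In this sketch, calling updateMetadata(1, "final") changes which searches
document 1 matches without re-tokenizing its text. The filter loop in search
still visits every candidate from the static part, which is the cost the
original question is concerned about.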




with regards
karthik

On Mon, Apr 23, 2012 at 9:01 PM, Jong Kim <jong.luc...@gmail.com> wrote:

> Hi,
>
> I'm sure this is a very common use case and that hundreds of people have
> probably asked the same question in the past, but I haven't been able to
> find an exact answer to my question.
>
> I have a system where each document in the Lucene index consists of at
> least one field containing a very large number of terms (for example, the
> entire text of potentially very large text files) and another metadata
> field that is much smaller. The first field is rarely modified and hence
> remains mostly static, while the second field is modified very
> frequently.
>
> Currently, I'm re-indexing the entire Lucene document whenever the value of
> the second field changes on the source side. Needless to say, this yields
> a very inefficient system, because a significant amount of system
> resources is wasted in effectively re-indexing what has not changed.
>
> Is there any good way to solve this design problem? Obviously, an
> alternative design would be to split the index into two, and maintain the
> static (and large) data in one index and the dynamic part in the other
> index. However, this approach is not acceptable due to our data pattern,
> where a match on the first index yields a very large result set, and
> filtering it against the second index is very inefficient due to the high
> ratio of disjoint data. In other words, while the alternative approach
> significantly reduces the indexing-time overhead, the resulting search is
> unacceptably expensive.
>
> Any design help would be highly appreciated.
>
> Thanks
> /Jong
>



-- 
*N.S.KARTHIK
R.M.S.COLONY
BEHIND BANK OF INDIA
R.M.V 2ND STAGE
BANGALORE
560094*
