Hi,

A simple technique is to use "Update Index" for the dynamic data column
rather than re-indexing the whole document.

With regards,
Karthik

On Mon, Apr 23, 2012 at 9:01 PM, Jong Kim <jong.luc...@gmail.com> wrote:

> Hi,
>
> I'm sure that this is a very common use case and that probably hundreds of
> people have asked the same question in the past, but I haven't been able to
> find an exact answer to my question.
>
> I have a system where each document in the Lucene index comprises at
> least one field containing a very large number of terms (for example, the
> entire text of potentially very large text files) and another
> metadata field that is much smaller. The first field is rarely modified,
> hence remains mostly static, while the second field is modified very
> frequently.
>
> Currently, I'm re-indexing the entire Lucene document whenever the value of
> the second field changes on the source side. Needless to say, this yields
> a very inefficient system, because a significant amount of system resources
> is wasted in effectively re-indexing what has not changed.
>
> Is there any good way to solve this design problem? Obviously, an
> alternative design would be to split the index into two, and maintain
> the static (and large) data in one index and the dynamic part in the
> other index. However, this approach is not acceptable due to our data
> pattern, where a match on the first index yields a very large result set,
> and filtering it against the second index is very inefficient due to the
> high ratio of disjoint data. In other words, while the alternative approach
> significantly reduces the indexing-time overhead, the resulting search is
> unacceptably expensive.
>
> Any design help would be highly appreciated.
>
> Thanks
> /Jong

-- 
*N.S.KARTHIK R.M.S.COLONY BEHIND BANK OF INDIA R.M.V 2ND STAGE BANGALORE 560094*
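
For readers following the thread, the idea of updating only the dynamic field can be illustrated with a small toy index. This is a minimal sketch and not Lucene code: the class `SidecarIndex` and all of its methods are hypothetical names invented for illustration. It assumes the large text field is tokenized once into postings, while the small metadata value lives in a per-document slot addressed by document id. Note that Lucene's own `IndexWriter.updateDocument(Term, Document)` deletes and re-adds the whole document, so it is not what this sketch models.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy in-memory index. NOT Lucene; every name here is hypothetical. */
public class SidecarIndex {
    // term -> ids of documents whose large text field contains that term
    private final Map<String, List<Integer>> postings = new HashMap<>();
    // doc id -> current value of the small, frequently changing metadata field
    private final List<String> metadata = new ArrayList<>();
    // how many times the expensive text analysis has run
    private int tokenizations = 0;

    /** Tokenizes the large text once, stores the initial metadata, returns the doc id. */
    public int add(String text, String meta) {
        int docId = metadata.size();
        metadata.add(meta);
        tokenizations++;
        for (String term : text.toLowerCase().split("\\W+")) {
            if (term.isEmpty()) continue;
            List<Integer> docs = postings.computeIfAbsent(term, t -> new ArrayList<>());
            // skip duplicates when a term repeats within the same document
            if (docs.isEmpty() || docs.get(docs.size() - 1) != docId) docs.add(docId);
        }
        return docId;
    }

    /** Changes only the metadata slot; the postings of the text field are untouched. */
    public void updateMetadata(int docId, String newMeta) {
        metadata.set(docId, newMeta);
    }

    /** Returns ids of documents that contain the term and whose metadata equals meta. */
    public List<Integer> search(String term, String meta) {
        List<Integer> hits = new ArrayList<>();
        List<Integer> docs = postings.getOrDefault(term.toLowerCase(), Collections.emptyList());
        for (int docId : docs) {
            if (metadata.get(docId).equals(meta)) hits.add(docId);
        }
        return hits;
    }

    public int tokenizations() {
        return tokenizations;
    }

    public static void main(String[] args) {
        SidecarIndex index = new SidecarIndex();
        int id = index.add("Entire text of a potentially very large file", "draft");
        index.updateMetadata(id, "published"); // no re-tokenization happens here
        System.out.println(index.search("large", "published")); // prints [0]
        System.out.println(index.tokenizations());              // prints 1
    }
}
```

In this sketch the metadata filter is a direct lookup by document id inside the same index, which is different from joining two separate indexes as described in the quoted message. Whether a given Lucene version offers a comparable per-field update depends on that version and should be checked against its documentation.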