Re: Combine data from index and db before sorting and pagination

fulin tang Wed, 08 Sep 2010 21:15:50 -0700

That is exactly what I am looking for now!

Our mail search system has a field named flags (read/unread etc.),
and its value changes after the email has been indexed, so we need an update.


But we only update one field, more precisely one Field.Index.NOT_ANALYZED and
Field.Store.YES field. How can we avoid updating the whole document?
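
To make the question concrete, here is a minimal sketch of the postings change Chris describes in the message quoted below. It is a toy in-memory model in plain Java, not Lucene code: the class PostingsSketch and its methods add, move and docs are hypothetical names invented for illustration. It only shows that the logical change amounts to moving a doc id from one term's list to another. It does not address the delta-encoded on-disk storage that Chris says such a feature would affect.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy model only: NOT Lucene's API. All names here are hypothetical.
class PostingsSketch {
    // term -> sorted doc ids; stands in for one field's postings lists
    private final Map<String, TreeSet<Integer>> postings = new TreeMap<>();

    // Record that each of the given docs contains the term.
    public void add(String term, int... docIds) {
        TreeSet<Integer> docs = postings.computeIfAbsent(term, t -> new TreeSet<>());
        for (int id : docIds) {
            docs.add(id);
        }
    }

    // Move one doc from one term's postings to another's.
    // Returns false if the doc was not listed under fromTerm.
    public boolean move(int docId, String fromTerm, String toTerm) {
        TreeSet<Integer> from = postings.get(fromTerm);
        if (from == null || !from.remove(docId)) {
            return false;
        }
        postings.computeIfAbsent(toTerm, t -> new TreeSet<>()).add(docId);
        return true;
    }

    // Sorted copy of the doc ids currently listed under the term.
    public List<Integer> docs(String term) {
        TreeSet<Integer> docs = postings.get(term);
        return docs == null ? new ArrayList<>() : new ArrayList<>(docs);
    }

    public static void main(String[] args) {
        PostingsSketch index = new PostingsSketch();
        index.add("category_1", 1, 2, 5, 10);
        index.add("category_2", 3, 4, 7, 8);

        // doc5 and doc10 change category
        index.move(5, "category_1", "category_2");
        index.move(10, "category_1", "category_2");

        System.out.println("category_1: " + index.docs("category_1")); // category_1: [1, 2]
        System.out.println("category_2: " + index.docs("category_2")); // category_2: [3, 4, 5, 7, 8, 10]
    }
}
```

In the toy model the change touches two small sets, while re-indexing rebuilds every affected document, which is the inefficiency the question is about.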


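The dream's beginning struggles at the edge of the city
The heart's far horizon holds fast in the instant of a footstep
My fate has buried the eternity of loneliness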



2010/9/2 Chris Lu <[email protected]>:
> If there were an API to adjust the inverted index directly, it would be much
> more efficient.
>
> I guess Mirko's problem is similar to this: There could be a "main_record"
> table and "category" table. Each "main_record" has a "category".
> When one "category" is changed, quite a few "main_record" rows are affected.
>
> If we denormalize the data, which is the only way currently for good sorting
> performance, we would need to re-index all the affected documents.
> However, all the re-indexing work is quite inefficient.
>
> Let's suppose the "category" is using Field.Index.NOT_ANALYZED and
> Field.Store.YES.
>
> So the inverted index is conceptually like this:
>  "category_1": doc1,doc2,doc5,doc10.
>  "category_2": doc3,doc4,doc7,doc8.
> If the only change is that several "category_1" records move to
> "category_2" (doc5 and doc10, for example), then after all the reindexing
> effort the only change is:
>  "category_1": doc1,doc2.
>  "category_2": doc3,doc4,doc5,doc7,doc8,doc10.
>
> Of course, to support this efficiently could be a big change, affecting all
> the nice efficient DocDelta storage.
>
> --
> Chris Lu
> -------------------------
> Instant Scalable Full-Text Search On Any Database/Application
> site: http://www.dbsight.net
> demo: http://search.dbsight.com
> Lucene Database Search in 3 minutes:
> http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes
> DBSight customer, a shopping comparison site, (anonymous per request) got
> 2.6 Million Euro funding!
>
> On Wed, Sep 1, 2010 at 4:29 PM, Erick Erickson <[email protected]> wrote:
>
>> The usual first choice when using Lucene to search database data is to
>> denormalize the db data into the index. Yes, it's redundant, but it's often
>> a better solution than trying to use both. Synchronization can be an issue,
>> but you have to deal with that in any case, since you're indexing from the
>> db.
>>
>>  But you haven't given us any indication of how much data you're talking
>> about here. Without some such detail, it's really hard to make a
>> recommendation.
>>
>> Best
>> Erick
>>
>> On Wed, Sep 1, 2010 at 9:30 AM, Sertic Mirko, Bedag
>> <[email protected]> wrote:
>>
>> > The data from the db is required for sorting, and one db entry matches
>> > many index entries, so storing it in the index would be redundant. Also,
>> > there would be the challenge of keeping index and db in sync. Any ideas?
>> >
>> > Mirko
>> >
>> > -----Original Message-----
>> > From: Ian Lea [mailto:[email protected]]
>> > Sent: Wednesday, September 1, 2010 15:17
>> > To: [email protected]
>> > Subject: Re: Combine data from index and db before sorting and pagination
>> >
>> > If the sorting and pagination don't require data from the database,
>> > just do db lookups for the hits on a page, page by page as required.
>> > But if the db data is required I'd suggest storing it in the index.
>> >
>> >
>> > --
>> > Ian.
>> >
>> > On Wed, Sep 1, 2010 at 1:43 PM, Sertic Mirko, Bedag
>> > <[email protected]> wrote:
>> > > Hi
>> > >
>> > >
>> > >
>> > > I need to implement sorting and pagination of Lucene search results.
>> > > This is quite easy, but I have to combine data from the index with data
>> > > from a database. The index has the fulltext data plus a unique
>> > > identifier for a record from the database. The database stores
>> > > additional data. Fulltext search is only done on the index. I need to
>> > > combine the search results from the index and the additional data from
>> > > the database before sorting and pagination.
>> > >
>> > >
>> > >
>> > > Is the IndexReader.document() method the right place to enrich the data
>> > > from the index with data from the db? How should I implement this
>> > > functionality with Lucene?
>> > >
>> > >
>> > >
>> > > Thanks in advance
>> > >
>> > > Mirko
>> > >
>> > >
>> > >
>> > >
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: [email protected]
>> > For additional commands, e-mail: [email protected]
>> >
>> >
>> >
>> >
>>
>

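Ian's suggestion quoted above (let the index do the sorting, then do db lookups only for the hits on the requested page) might look roughly like the sketch below. It is plain Java with no Lucene or JDBC calls. The class PageLookupSketch, the methods pageOf and enrich, and the in-memory map standing in for the database are all hypothetical. It assumes the hit ids already arrive sorted from the index, which only works when the sort key does not live in the db.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: paginate the sorted hits first, then look up
// database rows only for the ids on the requested page.
class PageLookupSketch {
    // Slice of the sorted hit ids that falls on the given zero-based page.
    public static List<String> pageOf(List<String> sortedHitIds, int page, int pageSize) {
        if (page < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("page must be >= 0 and pageSize > 0");
        }
        int from = Math.min(page * pageSize, sortedHitIds.size());
        int to = Math.min(from + pageSize, sortedHitIds.size());
        return new ArrayList<>(sortedHitIds.subList(from, to));
    }

    // One lookup per id on the page; dbLookup stands in for a real query.
    public static List<String> enrich(List<String> pageIds, Function<String, String> dbLookup) {
        List<String> rows = new ArrayList<>();
        for (String id : pageIds) {
            rows.add(id + " -> " + dbLookup.apply(id));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Ids in the order the index returned them (assumed already sorted).
        List<String> hits = Arrays.asList("id7", "id3", "id9", "id1", "id4");

        // In-memory stand-in for the database table.
        Map<String, String> db = new HashMap<>();
        for (String id : hits) {
            db.put(id, "extra data for " + id);
        }

        // Second page (zero-based page 1) with two hits per page: id9 and id1.
        for (String row : enrich(pageOf(hits, 1, 2), db::get)) {
            System.out.println(row);
        }
    }
}
```

The cost per request is bounded by the page size rather than by the total number of hits. When the sort key itself lives in the db, as in Mirko's case, this no longer applies, and denormalizing into the index is what the replies above recommend.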
