Hi, I have to index (tokenized) documents which may have very much pages, up to 10.000. I also have to know on which pages the search phrase occurs. I have to update some stored index fields for my document. The content is never changed.
Thus I think I have to add one lucene document with the index fields and one lucene document per page. Mapping ======= MyDocument -ID -Field 1-N -Page 1-N Lucene -Lucene Document with ID, page number 0 and Field1 - N (stored fields) -Lucene Document 1 with ID, page number 1 and tokenized content of Page 1 ... -Lucene Document N with ID, page number N and tokenized content of Page N Delete of MyDocument -> IndexWriter#deleteDocuments(Term:ID=foo) Update of stored index fields -> IndexWriter#updateDocument(Term: ID=foo, page number = 0) Search with index and content. Step 1: Search on stored index fields -> List of IDs Step 2: Search on ID field (list from above OR'ed together) and content -> List of IDs and page numbers Does this work? What drawbacks has this approch? Is there another way to achieve what I want? Thank you. P.S. There are millions of documents with a page range from 1 to 10.000. --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]