hi li, i looked at doing something similar - where we only index the text but retrieve search results / highlight from files -- we ended up giving up because of the amount of customisation required in solr -- mainly because we wanted the distributed search functionality in solr which meant making sure the original file ended up the same filing system i.e. machine too!).
we ended up just storing the main text field too even though there was a bit of text -- in the end solr/lucene can handle the index size fine and disk space is cheaper than man-hours to customise solr/lucene to work in this way! that was our conclusion anyway and it works fine -- we also have separate index / search server(s) so we don't care about merge time either -- and as i said above - we use the distributed search so don't tend to need to merge very large indexes anyway. when your system grows / you go into production you'll probably split the indexes too to use solr's distributed search func. for the sake of query speed). hope that helps, bec :) On 7 July 2010 14:07, Li Li <fancye...@gmail.com> wrote: > I used to store full text into lucene index. But I found it's very > slow when merging index because when merging 2 segments it copy the > fdt files into a new one. So I want to only index full text. But When > searching I need the full text for applications such as hightlight and > view full text. I can store the full text by <url,full text> pair in > database and load it to memory. And When I search in lucene(or solr), > I retrive url of doc first, then use url to get full text. But when > they are stored separately, it is hard to managed. They may be not > consistent with each other. Does lucene or solr provied any method to > ease this problem? Or any one has some experience of this problem? > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > > --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org