Hi agatone,

I agree with markharw00 that highlighting is the main reason to store fields in Lucene.

I want to remind Sascha Fahl that stored fields in Lucene are not part of the inverted index structure. The implementation of stored fields is very simple: a (.fdt) file holding the "field-name"/"field-value" pairs in document order, plus an index file (.fdx) with the map "documentID" --> "first pair in file". (See "Stored fields" in http://lucene.apache.org/java/2_3_2/fileformats.html#Fields ) You can search with no stored fields at all.

I agree with chrislusf that you should store as little data in Lucene as possible. If you store large byte arrays for the "full view", you will probably have a lot more IO even for hit lists that do not use these byte arrays. (You would have to use a FieldSelector, but even with a FieldSelector a hard drive does not like having to skip over this field data (= seek).)

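To make the layout described above concrete, here is a toy sketch in plain Java. It is not Lucene's real binary format and uses no Lucene classes; the class name StoredFieldsSketch and its methods are made up for illustration. It only models the two pieces: a data area with the name/value pairs in document order, and a map from document ID to the offset of that document's first pair.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a stored-fields layout; NOT Lucene's actual on-disk format. */
public class StoredFieldsSketch {
    // Plays the role of the data file: name/value pairs in document order.
    private final ByteArrayOutputStream data = new ByteArrayOutputStream();
    // Plays the role of the index file: docID -> offset of the first pair.
    private final List<Integer> firstPairOffset = new ArrayList<>();

    /** Appends one document and returns its ID (its position in insertion order). */
    public int addDocument(Map<String, String> fields) throws IOException {
        int docId = firstPairOffset.size();
        firstPairOffset.add(data.size());
        DataOutputStream out = new DataOutputStream(data);
        out.writeInt(fields.size());
        for (Map.Entry<String, String> e : fields.entrySet()) {
            // writeUTF is limited to 65535 encoded bytes; enough for a sketch.
            out.writeUTF(e.getKey());
            out.writeUTF(e.getValue());
        }
        out.flush();
        return docId;
    }

    /** Jumps to the document's offset and reads its pairs one after another. */
    public Map<String, String> document(int docId) throws IOException {
        byte[] bytes = data.toByteArray();
        int offset = firstPairOffset.get(docId);
        DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(bytes, offset, bytes.length - offset));
        int pairCount = in.readInt();
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < pairCount; i++) {
            String name = in.readUTF();
            String value = in.readUTF();
            fields.put(name, value);
        }
        return fields;
    }
}
```

Note that nothing here takes part in searching: the pairs are only reached by document ID. In this sketch, getting at any field of a document means going to its offset and walking through its pairs, which is why one very large stored value makes every read of that document more expensive.
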
So if you have no highlighting at all, you could keep a map "lucene document id" (int) --> "database id" (hopefully also of type int) in main memory, and convert each Lucene result list into a small select statement. This is completely OK. Lucene is very good at searching, not at storing data. Take a look at the thread
http://www.nabble.com/Using-lucene-as-a-database...-good-idea-or-bad-idea--to18703473.html

In my company we decided to use Lucene as storage. But we now have two index directories: one for searching and showing hit lists, the other as storage with only two fields: "key" & "data". Performance tests show that reading the storage is between 2 and 5 times slower than a solution with a database (this was OK for our use case).

Best regards
Karsten


agatone wrote:
>
> Hi,
> I asked this question already on "lucene-general" list but also got
> advised to ask here too.
>
> I'm working on a project that has big database in the background (some
> tables have about 1500000 rows). We decided to use Lucene for "faster"
> search. Our search works similar as all searches: you write search string,
> get list of hits with detail link. But there is dilemma if we should store
> more data into index than it's needed.
>
> One side of developing team insists that we should use lucene index as
> somekind of storage for data so when you get hit, you go onto details and
> then again use lucene to find document that matches the selected ID and
> take the data from Lucene index. So in the end you end with copying
> complete database tables into the lucene index.
>
> Other side insists on storing to index only data that is displayed
> directly to the user when showing the search results list and needed for
> search criteria. When you go onto details, you have the matching ID so you
> can pickup that row from database by that ID rather than search it inside
> Lucene index.
>
> Can someone please describe drawbacks and advantages of both approaches.
> Actually can someone write down what's the actual profit, where and when
> of the Lucene itself in real production env.
>
> IT would be great if there is anyone who could write his experience with
> indexing and searching large amount of data.
>
>
> Thank you
>

--
View this message in context: http://www.nabble.com/Lucene-vs.-Database-tp19755932p19757274.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.