Hello all,

We have 100 GB of data in mixed formats (doc, txt, pdf, ppt, etc.), with a separate parser for each file format, and we are going to index it all with Lucene. (We didn't use Nutch because its setup intimidated us.) My question is: will this scale when we index those documents? Our plan is to build a separate index per file format and search across them with a MultiReader; a rough sketch of what we have in mind follows.
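To make the plan concrete, here is roughly the indexing side, assuming a Lucene 2.4-era API; the paths, field names, and settings are just placeholders of ours, not settled choices:

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;

public class FormatIndexer {
    public static void main(String[] args) throws Exception {
        // One writer per format, kept open for the whole run (shown here for pdf only).
        IndexWriter writer = new IndexWriter(
                FSDirectory.getDirectory("indexes/pdf"),
                new StandardAnalyzer(),
                IndexWriter.MaxFieldLength.UNLIMITED);
        writer.setMergeFactor(10);        // the default; higher = faster indexing, more segments to search
        writer.setRAMBufferSizeMB(64.0);  // flush to disk after ~64 MB of buffered documents

        // For each file, the format-specific parser would supply the extracted text.
        String parsedText = "text extracted by our pdf parser";  // placeholder
        Document doc = new Document();
        doc.add(new Field("path", "/data/sample.pdf", Field.Store.YES, Field.Index.NOT_ANALYZED));
        doc.add(new Field("contents", parsedText, Field.Store.NO, Field.Index.ANALYZED));
        writer.addDocument(doc);

        writer.close();  // commits any remaining buffered documents
    }
}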
My questions:

1. Are we on the right track?
2. Can you suggest settings for mergeFactor and segments?
3. How large an index can Lucene handle?
4. Will this cause a Java OutOfMemoryError?

Any suggestions would be much appreciated.
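For context, the search side we have in mind looks roughly like this (again assuming a Lucene 2.4-era API; the index paths are placeholders):

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.MultiReader;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TopDocs;

public class MultiFormatSearch {
    public static void main(String[] args) throws Exception {
        // One sub-reader per per-format index.
        IndexReader[] readers = {
                IndexReader.open("indexes/doc"),
                IndexReader.open("indexes/txt"),
                IndexReader.open("indexes/pdf"),
                IndexReader.open("indexes/ppt"),
        };
        MultiReader multi = new MultiReader(readers);
        IndexSearcher searcher = new IndexSearcher(multi);

        Query query = new QueryParser("contents", new StandardAnalyzer()).parse("lucene");
        TopDocs hits = searcher.search(query, 10);
        System.out.println("total hits across all formats: " + hits.totalHits);

        searcher.close();
        multi.close();  // closing the MultiReader also closes its sub-readers
    }
}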