The underlying data formats are different. For example, because Lucene42Codec loads terms into RAM, it uses an FST. DiskDV instead uses a simpler storage layout for the terms that's more suitable for being disk-resident.
There are also different compression block sizes and so on in use.

You can pick and choose the formats on a per-field basis, just as you mentioned. In Solr it's also hooked into schema.xml, so you can set docValuesFormat="Disk" as an attribute on the field type (similar to postingsFormat).

On Fri, Mar 8, 2013 at 2:02 PM, David Smiley (@MITRE.org) <dsmi...@mitre.org> wrote:
> DiskDocValues is a codec (or part of a codec, apparently) for accessing the
> DocValues from disk, with minimal RAM usage for things like offsets.
> Lucene42Codec alternatively puts all of DocValues in RAM. Is the actual
> disk resident data format the same between them? And how do you pick &
> choose the formats? i.e. can I use Lucene42Codec for all the non-DocValues
> stuff but then use DiskDocValues so that I can let the OS's cache govern
> access to DV data while lowering my Java heap and giving the GC a break.
>
> Ok I'm going to answer the 2nd question as I just discovered
> Lucene42Codec.getDocValuesFormatForField which I can customize. But that
> still leaves the 1st question. It would be nice to not have to re-index.
>
> ~ David
>
> -----
> Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/DiskDocValues-vs-Lucene42Codec-tp4044061p4045871.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
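
For the schema.xml hook mentioned in the reply, a minimal sketch of what the configuration might look like follows. The type name "string_disk_dv" and the field name "category" are made up for illustration, and the sketch assumes solrconfig.xml is set up with a schema-aware codec factory (solr.SchemaCodecFactory) so that per-field formats are honored; check this against your Solr version before relying on it.

```xml
<!-- schema.xml fragment (illustrative; type and field names are hypothetical) -->

<!-- Field type whose doc values are written with the disk-resident format -->
<fieldType name="string_disk_dv" class="solr.StrField" docValuesFormat="Disk"/>

<!-- A field using that type; docValues must be enabled on the field -->
<field name="category" type="string_disk_dv" indexed="true" stored="false"
       docValues="true"/>

<!-- Assumed to be present in solrconfig.xml, not schema.xml:
     <codecFactory class="solr.SchemaCodecFactory"/>
-->
```

Fields that do not use such a type keep the default doc values format, so only the fields you choose move to the disk-resident layout.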