DocValues formats hold large byte[][]s even when using MMapDirectory

2013-10-02 Thread Steven Schlansker
Hi, I have a search application using Lucene 4.4.0 with various BinaryDocValues and SortedSetDocValues. We use MMapDirectory to help keep the Java heap small / GC pause times short and instead rely on the OS buffer cache to keep things fast, which I gather is generally considered a "best practi
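
The approach described in this message, reading file data through a memory mapping so the bytes stay in the OS buffer cache rather than on the Java heap, can be sketched with plain `java.nio`. This is only an illustration of the general technique and not Lucene's MMapDirectory implementation; the class name `MmapSketch`, the helper `readSlice`, and the sample file contents are all hypothetical.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapSketch {

    /**
     * Hypothetical helper: maps the whole file read-only and copies out only
     * the requested slice. The mapped pages live outside the Java heap; only
     * the small result array is heap-allocated.
     */
    public static byte[] readSlice(Path file, int offset, int length) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            byte[] out = new byte[length];
            buf.position(offset);   // seek within the mapping
            buf.get(out);           // copy just the slice onto the heap
            return out;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("mmap-sketch", ".bin");
        try {
            // Sample data standing in for an index file (an assumption for the demo).
            Files.write(tmp, "hello docvalues".getBytes(StandardCharsets.UTF_8));
            byte[] slice = readSlice(tmp, 6, 9);
            System.out.println(new String(slice, StandardCharsets.UTF_8));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

The point of the sketch is the contrast with loading a whole file into a `byte[]`: with a mapping, the heap cost is proportional to what is copied out, not to the file size, which is why large on-heap `byte[][]` structures stand out in an otherwise mmap-based setup.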

Re: DocValues formats hold large byte[][]s even when using MMapDirectory

2013-10-02 Thread Michael McCandless
In Lucene 4.5 (coming out any day now) we've switched by default to a "mostly on disk" impl for doc values. Before that, you can use DiskDocValuesFormat instead. But you'll need to re-index (or create a new index and use IW.addIndexes) to cut over your current index to the DiskDVFormat. Mike McCa

Re: DocValues formats hold large byte[][]s even when using MMapDirectory

2013-10-02 Thread Steven Schlansker
On Oct 2, 2013, at 11:16 AM, Michael McCandless wrote:
> In Lucene 4.5 (coming out any day now) we've switched by default to a "mostly on disk" impl for doc values.
Awesome! Looking forward to that then.
> Before that, you can use DiskDocValuesFormat instead.
> But you'll need to re-

Re: DocValues formats hold large byte[][]s even when using MMapDirectory

2013-10-02 Thread Michael McCandless
On Wed, Oct 2, 2013 at 2:37 PM, Steven Schlansker wrote:
> On Oct 2, 2013, at 11:16 AM, Michael McCandless wrote:
>> In Lucene 4.5 (coming out any day now) we've switched by default to a "mostly on disk" impl for doc values.
> Awesome! Looking forward to that then.
>> Before tha