> This is more about compressing strings in TermsIndex, I think.

Ah, because they're sorted.  But if the string lookup cost degrades,
I think it's not worth it?  That needs to be tested in the MMap case
as well, e.g., are ByteBuffers somehow slowing everything down by 10%?

On Thu, May 19, 2011 at 6:30 AM, Earwin Burrfoot <ear...@gmail.com> wrote:
> This is more about compressing strings in TermsIndex, I think.
> And ability to use said TermsIndex directly in some cases that
> required FieldCache before. (Maybe FC is still needed, but it can be
> degraded to docId->ord map, storing actual strings in TI).
> This yields fat space savings when we, e.g., need to both look up on a
> field and build facets out of it.
>
> mmap is cool :)  What I want to see is an FST-based TermsDict that is
> simply mmaped into memory, without building intermediate indexes, like
> Lucene does now.
> And docvalues are orthogonal to that, no?
>
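
To make the docId->ord idea above concrete, here is a rough sketch of
what I understand the layout to be. It is plain Java, not Lucene code:
the class and method names are made up for illustration, and a sorted
String[] stands in for the terms index. The point is only that lookup
becomes docId -> ord -> string, and faceting only touches ints.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names are Lucene APIs.
class OrdFieldSketch {
    final String[] sortedTerms; // stand-in for the terms index: unique values, sorted
    final int[] docToOrd;       // the "degraded" FieldCache: docId -> ord

    OrdFieldSketch(String[] valuePerDoc) {
        sortedTerms = Arrays.stream(valuePerDoc).distinct().sorted().toArray(String[]::new);
        docToOrd = new int[valuePerDoc.length];
        for (int doc = 0; doc < valuePerDoc.length; doc++) {
            // Binary search is valid because sortedTerms is sorted and contains every value.
            docToOrd[doc] = Arrays.binarySearch(sortedTerms, valuePerDoc[doc]);
        }
    }

    // Lookup: docId -> ord -> string. Each string is stored once, in sortedTerms.
    String value(int docId) {
        return sortedTerms[docToOrd[docId]];
    }

    // Facet counting works on ords only; strings are resolved once per distinct term.
    Map<String, Integer> facetCounts() {
        int[] counts = new int[sortedTerms.length];
        for (int ord : docToOrd) {
            counts[ord]++;
        }
        Map<String, Integer> out = new LinkedHashMap<>();
        for (int ord = 0; ord < counts.length; ord++) {
            out.put(sortedTerms[ord], counts[ord]);
        }
        return out;
    }

    public static void main(String[] args) {
        OrdFieldSketch f = new OrdFieldSketch(new String[] {"red", "blue", "red", "green"});
        System.out.println(f.value(2));
        System.out.println(f.facetCounts());
    }
}
```

Whether the extra indirection on each string lookup is cheap enough,
especially when the term data sits behind an mmapped ByteBuffer rather
than a Java array, is exactly the part that would need measuring.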
