I agree with Robert and Dawid that once you go past the page fault
border you will loose performance. The problem here is that you don't
realize it immediately. IMO the byte[] representation is so compact
that it doesn't really matter if you use FS Cache memory or JVM memory
so I'd rather go for jvm memory here too.

it seems you are having a slightly different usecase here:

> On a related note, is it OK to place lets say 1K byte[]s into the
> Output portion of the FST?

here you should rather store pointers to another file and mmap that
file. Keep your FST as lean and compact as possible and make sure its
in memory. The compression should do a good job for you here!

simon
On Fri, Jun 3, 2011 at 8:19 AM, Dawid Weiss
<dawid.we...@cs.put.poznan.pl> wrote:
>> In any case I browsed the code a bit more, it'd be easy to test out
>> the MMap'ing because the FST simply reads directly from a byte[],
>> which could be converted to a ByteBuffer (that's MMap'd).  Nice!
>
> The FST is basically a byte[], but like Robert said -- the layout of
> transitions and states will require nearly random access patterns all
> around it. There are some optimizations one can make to lay out
> initial nodes close to each other, for example (and I've done it as
> part of another project) -- these do optimize CPU cache utilization
> and I would assume MMAP'ped buffers as well -- but once you go past
> page fault zone the performance will be terrible. And I assume the
> degradation will not be smooth, but very sharp.
>
> This said, I'd be interested in looking at the benchmarks if you want
> to invest some time and do it ;)
>
> Dawid
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to