On Jan 23, 2008 4:29 PM, Chris Harris <[EMAIL PROTECTED]> wrote:
> Supposing you could do this -- i.e. that you could get Solr to pass a
> particular field's data to Lucene without reading it all into memory
> first -- are there any potential problems on the Lucene end? It's not
> going to turn around and slurp the whole field into memory itself, is
> it?

Well, yes and no... Reader-based fields are for indexed fields only
(and Lucene won't read the whole value into memory before indexing
it), but the Lucene index structures still need to be built in memory.
So using a Reader would save memory, but memory use would still grow
with the size of the document.
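To make that concrete, here's a minimal sketch of the Reader-based indexing path, using Lucene's Field(String, Reader) constructor (the file path is a placeholder):

```java
import java.io.FileReader;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

// A Reader-valued field is tokenized in a streaming fashion, so the
// raw text is never fully buffered -- but the resulting index
// structures (terms, postings) are still built in memory.
Document doc = new Document();
doc.add(new Field("id", "doc1", Field.Store.YES, Field.Index.UN_TOKENIZED));
doc.add(new Field("body", new FileReader("/path/to/huge.txt"))); // indexed only
```

Note the "body" field above is indexed but cannot be stored, which is the restriction discussed next.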

> That was the indexing side. You also have the searching side, in
> particular when you need to retrieve the value of a huge stored field.

Reader-based fields can't be stored (and even if that restriction were
lifted, it wouldn't help memory unless you buffered the value to
disk).

> It looks like Lucene will give you a stored field's value as a stream
> (a Java Reader), but that won't do any good if, behind the scenes, it
> brings the whole field into memory first. Then there's the question of
> whether Solr needs to slurp that whole stream into memory before
> outputting that field's contents as XML. (I doubt it does, but I
> haven't looked at any of the code recently.) And then if you're using
> a client such as solrsharp, there's the question of whether *it* will
> slurp the whole stream into memory.

Lucene doesn't currently provide a Reader interface to stored fields,
but it should be possible to add one.
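As things stand, retrieving a stored field materializes the whole value in memory as a String. A sketch (assuming an IndexSearcher named searcher and a hit's docId from context):

```java
import org.apache.lucene.document.Document;

// searcher.doc(docId) loads the stored fields for that document;
// get() returns the entire stored value as a String -- there is no
// streaming (Reader) accessor for stored fields.
Document hit = searcher.doc(docId);
String body = hit.get("body"); // whole stored value is in memory here
```

So even if Solr streamed the XML response, the stored value would already have been fully loaded by Lucene.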

> Maybe this is something to take up on JIRA or solr-dev, rather than
> here. I was just trying to get a sense of how difficult the proposed
> feature would be.

One should consider storing really huge fields outside of Solr/Lucene.
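One common pattern for that: index the content for search, but store only a pointer to it (the "body_path" field name and paths here are hypothetical):

```java
import java.io.FileReader;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

// Keep the huge body on disk (or in an external store) and put only a
// small reference in the index. Retrieval then streams the file
// directly instead of pulling a giant stored field out of Lucene.
Document doc = new Document();
doc.add(new Field("id", "doc1", Field.Store.YES, Field.Index.UN_TOKENIZED));
doc.add(new Field("body", new FileReader("/data/doc1.txt")));   // searchable, not stored
doc.add(new Field("body_path", "/data/doc1.txt",
                  Field.Store.YES, Field.Index.NO));            // retrievable pointer
```

At query time the application reads "body_path" from the hit and streams the file itself, so memory use stays bounded regardless of document size.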

-Yonik
