Re: Lazy Field Loading

Grant Ingersoll Tue, 04 Apr 2006 13:40:21 -0700

You're right, more flexibility is needed, but it goes beyond just field loading in my mind. I think this is what Doug was getting at (at least partially) with http://wiki.apache.org/jakarta-lucene/Lucene2Whiteboard#12. That item focuses on indexing, but I think it should be considered for searching as well. I am not sure we should just continue adding more and more methods onto IndexReader. I think the 2.x move gives us an opportunity to refactor some of the things we think we can make better.

I am not sure you need LUCENE-509 when you have lazy loading. In my mind, you have the best of both worlds. You can get all the meta-info about all the stored fields on the Document w/o the penalty of loading the actual data.

My use case is below (my guess is this is quite common).

Run a search, get back your hits and display summary information on the hits (i.e. the "small" fields). The user picks the Hit they want to see more info on, and you display the full document, including, most likely, the info in the really large stored fields (i.e. the original document). To date, I have been storing this info elsewhere b/c of the loading penalty. With lazy loading, I don't need to do this. I can just defer loading until the second-level access is needed, and I never load it if the user doesn't ask for it.

In the case where you only get a few smaller fields, you have to go back and get the document again when you want to display the contents of the large field.

Of course, there are several other use cases where you may only want certain fields, but I don't think there is much cost associated with loading small fields, just the large ones, so you can just make the large ones lazy.
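
To make the deferred-loading idea concrete, here is a minimal sketch in plain Java. It is not Lucene code: LazyField and LazyDocument are hypothetical names made up for illustration, and the Supplier stands in for whatever would actually seek to and read the stored value. The only point is that field names are available up front while a value is read on first access, and at most once.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for a stored field whose value is read on first access.
final class LazyField {
    private final String name;
    private final Supplier<String> loader; // assumption: wraps the real stored-field read
    private String value;
    private boolean loaded;

    LazyField(String name, Supplier<String> loader) {
        this.name = name;
        this.loader = loader;
    }

    String name() { return name; }

    boolean isLoaded() { return loaded; }

    // Loads the value at most once, then serves it from memory.
    String stringValue() {
        if (!loaded) {
            value = loader.get();
            loaded = true;
        }
        return value;
    }
}

// Hypothetical document: field names (the meta-info) are known without loading values.
final class LazyDocument {
    private final Map<String, LazyField> fields = new LinkedHashMap<>();

    void add(LazyField field) { fields.put(field.name(), field); }

    LazyField getField(String name) { return fields.get(name); }

    Iterable<String> fieldNames() { return fields.keySet(); }
}
```

With something like this, the hit list would call stringValue() only on the small fields, and the loader for the large field would run only if the user opens the full document.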



Yonik Seeley wrote:

On 3/31/06, Yonik Seeley <[EMAIL PROTECTED]> wrote:

        <https://issues.apache.org:443/jira/browse/LUCENE-509>

Yes, I'd personally find a way to retrieve just fields x,y, and z more
useful than lazy loading.


Thinking a little more, it would be nice if the field reading API was
opened up a little more so that multiple things could be done... even
construct different field/document objects (say a document
implementation that indexed the fields, etc).
That could be used to implement either lazy field loading, or loading
of specific fields.

The lazy loading alone doesn't really address LUCENE-509

I was thinking something along the lines of

// an IndexReader would call FieldReader methods for each
abstract class FieldReader {
  // users return true if this field should be read.
  boolean readField(int fieldnum, String fieldName);
  // returns true to keep reading next field
  boolean stringField(int fieldnum, byte[] utf8);
  //   OR
  // returns true to keep reading next field
  boolean stringField(int fieldnum, String str);
  // returns true to keep reading next field
  boolean binaryField(int fieldnum, byte[] data);
}

class IndexReader {
  // expert level API
  void readFields(int doc, FieldReader reader);
}
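
As a rough illustration of how the callback shape above could cover "load just fields x, y, and z", here is a self-contained sketch in plain Java. It is only one possible reading of the proposal: it keeps the String variant of stringField alone, and SelectedFieldsReader and StoredFields are hypothetical names. StoredFields is an in-memory stand-in for the reader side; a real implementation would skip bytes on disk instead of list entries.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Callback shape from the proposal above, limited to the String variant.
abstract class FieldReader {
    // Return true if this field's value should be read.
    abstract boolean readField(int fieldnum, String fieldName);

    // Return true to keep reading the next field.
    abstract boolean stringField(int fieldnum, String str);
}

// Hypothetical reader that keeps only the requested fields.
final class SelectedFieldsReader extends FieldReader {
    private final Set<String> wanted;
    private final Map<Integer, String> names = new LinkedHashMap<>();
    final Map<String, String> values = new LinkedHashMap<>();

    SelectedFieldsReader(Set<String> wanted) { this.wanted = wanted; }

    @Override
    boolean readField(int fieldnum, String fieldName) {
        if (!wanted.contains(fieldName)) return false;
        names.put(fieldnum, fieldName);
        return true;
    }

    @Override
    boolean stringField(int fieldnum, String str) {
        values.put(names.get(fieldnum), str);
        // Stop once every requested field has been seen.
        return values.size() < wanted.size();
    }
}

// Hypothetical in-memory stand-in for the IndexReader side of the API.
final class StoredFields {
    private final List<String[]> fields = new ArrayList<>(); // {name, value}; list index = fieldnum
    int valuesRead = 0; // counts values actually handed to the callback

    void add(String name, String value) { fields.add(new String[] {name, value}); }

    void readFields(FieldReader reader) {
        for (int i = 0; i < fields.size(); i++) {
            String[] f = fields.get(i);
            if (!reader.readField(i, f[0])) continue; // value is skipped entirely
            valuesRead++;
            if (!reader.stringField(i, f[1])) return; // callback asked to stop
        }
    }
}
```

A lazy-loading reader could be built on the same two callbacks by returning false from readField for large fields and remembering where to find them later.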

Just brainstorming so far...

-Yonik
http://incubator.apache.org/solr Solr, The Open Source Lucene Search Server

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

--

Grant Ingersoll
Sr. Software Engineer
Center for Natural Language Processing
Syracuse University
School of Information Studies
335 Hinds Hall
Syracuse, NY 13244
http://www.cnlp.org
Voice: 315-443-5484
Fax: 315-443-6886


