[jira] Commented: (LUCENE-2649) FieldCache should include a BitSet for matching docs

Ryan McKinley (JIRA) Thu, 23 Sep 2010 09:20:59 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12914082#action_12914082
 ]


Ryan McKinley commented on LUCENE-2649:
---------------------------------------

bq. Q: why aren't the CachePopulator methods just directly on EntryConfig - was 
it easier to share implementations that way or something?

Two reasons (but I can be talked out of it)
1. this approach separates what you are asking for (bits/values/etc) from how 
they are actually generated (the "populator").  Something makes me 
uncomfortable about the caller asking for Values needing to also know how they 
are generated.  Seems easy to mess up.  With this approach the 'populator' is 
attached to the field cache, and defines how stuff is read, vs the 
'EntryConfig' that defines what the user is asking for (particulary since they 
may change what they are asking for in subsequent calls)

2. The 'populator' is attached to the FieldCache so it has consistent behavior 
across subsequet calls to getXxxxValues().  Note that with this approach, if 
you ask the field cache for just the 'values' then later want the 'bits' it 
uses the same populator and adds the results to the existing CachedArray value.


bq. It doesn't seem like we need two methods fillValidBits , fillByteValues

The 'fillValidBits' just fills up the valid bits w/o actually parsing (or 
caching) the values.  This is useful when:
1. you only want the ValidBits, but not the values (Mike seems to want this)
2. you first ask for just values, then later want the bits.  

Thinking some more, I think the populator should look like this:
{code:java}
public abstract class CachePopulator 
  {
    public abstract ByteValues   createByteValues(   IndexReader reader, String 
field, EntryConfig config ) throws IOException;
    public abstract ShortValues  createShortValues(  IndexReader reader, String 
field, EntryConfig config ) throws IOException;
    public abstract IntValues    createIntValues(    IndexReader reader, String 
field, EntryConfig config ) throws IOException;
    public abstract FloatValues  createFloatValues(  IndexReader reader, String 
field, EntryConfig config ) throws IOException;
    public abstract DoubleValues createDoubleValues( IndexReader reader, String 
field, EntryConfig config ) throws IOException;
    
    public abstract void fillByteValues(   ByteValues   vals, IndexReader 
reader, String field, EntryConfig config ) throws IOException;
    public abstract void fillShortValues(  ShortValues  vals, IndexReader 
reader, String field, EntryConfig config ) throws IOException;
    public abstract void fillIntValues(    IntValues    vals, IndexReader 
reader, String field, EntryConfig config ) throws IOException;
    public abstract void fillFloatValues(  FloatValues  vals, IndexReader 
reader, String field, EntryConfig config ) throws IOException;
    public abstract void fillDoubleValues( DoubleValues vals, IndexReader 
reader, String field, EntryConfig config ) throws IOException;

    // This will only fill in the ValidBits w/o parsing any actual values
    public abstract void fillValidBits( CachedArray  vals, IndexReader reader, 
String field, EntryConfig config ) throws IOException;
  }
{code}

The default 'create' implementation could look something like this:

{code:java}

    @Override
    public ShortValues createShortValues( IndexReader reader, String field, 
EntryConfig config ) throws IOException 
    {
      if( config == null ) {
        config = new SimpleEntryConfig();
      }
      ShortValues vals = new ShortValues();
      if( config.cacheValues() ) {
        this.fillShortValues(vals, reader, field, config);
      }
      else if( config.cacheValidBits() ) {
        this.fillValidBits(vals, reader, field, config);
      }
      else {
        throw new RuntimeException( "the config must cache values and/or bits" 
);
      }
      return vals;
    }
{code}

And the Cache 'createValue' would looks somethign like this:
{code:java}

  static final class ByteCache extends Cache {
    ByteCache(FieldCache wrapper) {
      super(wrapper);
    }
    
    @Override
    protected final ByteValues createValue(IndexReader reader, Entry entry, 
CachePopulator populator) throws IOException {
      String field = entry.field;
      EntryConfig config = (EntryConfig)entry.custom;
      if (config == null) {
        return wrapper.getByteValues(reader, field, new SimpleEntryConfig() );
      }
      return populator.createByteValues(reader, field, config);
    }
  }
{code}

thoughts?  This would open up lots more of the field cache... so if we go this 
route, lets make sure it addresses the other issues people have with 
FieldCache.  IIUC, the other big request is to load the values from an external 
source -- that should be possible with this approach.

> FieldCache should include a BitSet for matching docs
> ----------------------------------------------------
>
>                 Key: LUCENE-2649
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2649
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Ryan McKinley
>             Fix For: 4.0
>
>         Attachments: LUCENE-2649-FieldCacheWithBitSet.patch, 
> LUCENE-2649-FieldCacheWithBitSet.patch, 
> LUCENE-2649-FieldCacheWithBitSet.patch, 
> LUCENE-2649-FieldCacheWithBitSet.patch, LUCENE-2649-FieldCacheWithBitSet.patch
>
>
> The FieldCache returns an array representing the values for each doc.  
> However there is no way to know if the doc actually has a value.
> This should be changed to return an object representing the values *and* a 
> BitSet for all valid docs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] Commented: (LUCENE-2649) FieldCache should include a BitSet for matching docs

Reply via email to