[ 
https://issues.apache.org/jira/browse/LUCENE-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910591#action_12910591
 ] 

Shai Erera commented on LUCENE-2649:
------------------------------------

One thing I've wanted to do for a long time, but haven't gotten around to, is 
open up FieldCache to allow the application to populate the entries from other 
sources - specifically payloads. I wrote a sorting solution which relies solely 
on payloads, and wanted to contribute it to Lucene, but due to the lack of 
FieldCache hook points, I didn't find the time to do the necessary refactoring.

Sorting based on payload data has several advantages:
# It's much faster to read than iterating over the lexicon and parsing the term 
values into sortable values.
# If your application needs to sort over tens of millions of documents, or 
if it needs to keep its RAM usage low, you can do the sort while reading the 
payload data as the search happens. It's slower than if the values were in RAM, 
but the current FieldCache does not allow you to sort w/o RAM consumption at all.
# You don't inflate your lexicon w/ sort values, which affects other searches. 
In some situations, you can end up adding a unique term per document for the 
sort values (such as when sorting by date and requiring up to millisecond 
precision).
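
To make points 1 and 2 concrete, here is a minimal sketch in plain Java. It is not the sort-by-payload package mentioned above and it uses no Lucene API: the class PayloadSortSketch, the PayloadReader interface and the 8-byte big-endian encoding are all assumptions made for illustration. It shows a sort value stored as fixed-width bytes, decoded without any string parsing, and a top-N selection that keeps only N entries in memory while visiting the matching docs.

```java
import java.nio.ByteBuffer;
import java.util.PriorityQueue;

// Hypothetical helper, not part of Lucene: sort values kept as fixed-width payload bytes.
final class PayloadSortSketch {

    // Hypothetical stand-in for whatever reads a document's payload during the search.
    interface PayloadReader {
        byte[] payload(int doc);
    }

    // Encode a sort value (e.g. a millisecond timestamp) as 8 big-endian bytes.
    static byte[] encode(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    // Decode the 8 payload bytes back into the sort value; no term parsing involved.
    static long decode(byte[] payload) {
        return ByteBuffer.wrap(payload).getLong();
    }

    // Return the doc IDs of the n smallest sort values, smallest first.
    // Only n {value, doc} pairs are held at any time, never a per-document array.
    static int[] topN(int[] matchingDocs, PayloadReader reader, int n) {
        if (n <= 0) {
            return new int[0];
        }
        // Max-heap on the value, so the worst entry kept so far sits at the head.
        PriorityQueue<long[]> heap =
            new PriorityQueue<>((a, b) -> Long.compare(b[0], a[0]));
        for (int doc : matchingDocs) {
            long value = decode(reader.payload(doc));
            if (heap.size() < n) {
                heap.add(new long[] {value, doc});
            } else if (value < heap.peek()[0]) {
                heap.poll();
                heap.add(new long[] {value, doc});
            }
        }
        // Drain the heap from worst to best, filling the result back to front.
        int[] result = new int[heap.size()];
        for (int i = result.length - 1; i >= 0; i--) {
            result[i] = (int) heap.poll()[1];
        }
        return result;
    }
}
```

In a real index the reader would pull the payload from the postings as each hit is collected; here any lambda that maps a doc ID to 8 bytes will do.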

I'm bringing it up so that if you consider any refactoring to FieldCache, I'd 
appreciate it if you could keep this in mind. If the right hooks open up, I'll 
make time to contribute my sort-by-payload package. If not, then it'll 
need to wait until I can find the time to do the refactoring.



> FieldCache should include a BitSet for matching docs
> ----------------------------------------------------
>
>                 Key: LUCENE-2649
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2649
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Ryan McKinley
>             Fix For: 4.0
>
>         Attachments: LUCENE-2649-FieldCacheWithBitSet.patch, 
> LUCENE-2649-FieldCacheWithBitSet.patch
>
>
> The FieldCache returns an array representing the values for each doc.  
> However there is no way to know if the doc actually has a value.
> This should be changed to return an object representing the values *and* a 
> BitSet for all valid docs.
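
The idea in the issue description can be sketched as follows. This is only an illustration under assumed names (IntValuesWithBits, set, hasValue); it is not the API from the attached patch. A plain int[] cannot tell a stored 0 apart from a missing value, while the accompanying java.util.BitSet can.

```java
import java.util.BitSet;

// Hypothetical shape, not the patch's API: per-document values plus a BitSet
// marking which documents actually have a value.
final class IntValuesWithBits {
    final int[] values;
    final BitSet valid;

    IntValuesWithBits(int maxDoc) {
        this.values = new int[maxDoc];
        this.valid = new BitSet(maxDoc);
    }

    // Record a value for a document and mark the document as having one.
    void set(int doc, int value) {
        values[doc] = value;
        valid.set(doc);
    }

    // True only if a value was explicitly stored for this document.
    boolean hasValue(int doc) {
        return valid.get(doc);
    }
}
```

With this shape, a document whose value is 0 and a document with no value both read 0 from the array, but only the former reports hasValue(doc) == true.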


