Michael McCandless created LUCENE-7989:
------------------------------------------

             Summary: Add computed (at segment flush) doc values fields
                 Key: LUCENE-7989
                 URL: https://issues.apache.org/jira/browse/LUCENE-7989
             Project: Lucene - Core
          Issue Type: Improvement
            Reporter: Michael McCandless


This is a failed experiment but I thought I'd open an issue and post the patch 
in case it inspires others.

It adds a new feature to Lucene that lets you provide a function (set via 
{{IndexWriterConfig}}) that is invoked at segment flush time to create a new 
doc values field as a function of all other doc values fields in that segment.  
The newly created field is "first class", i.e. it behaves as if you had indexed 
actual doc values fields on your documents: it can participate in index sort, 
etc.  The interesting thing about it is that the function has access to all 
other documents that made it into the flushed segment (by pulling doc values 
iterators for that segment).
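
To make the idea concrete, here is a rough plain-Java model of such a 
flush-time function. It does not use Lucene and is not taken from the patch: 
the {{SegmentFieldFunction}} interface, the "family" and "score" fields, and 
the choice of "highest score in the family" as the computed value are all 
assumptions made up for illustration. Arrays stand in for the per-document doc 
values of one flushed segment.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Plain-Java model only: none of these names are Lucene APIs or come from the patch.
class FlushComputedFieldSketch {

    // Hypothetical callback: it sees whole columns of the flushed segment
    // and returns one computed value per document.
    interface SegmentFieldFunction {
        long[] compute(long[] family, long[] score);
    }

    // Example function (an assumption): every doc gets the highest score
    // found among the docs of its family within this one segment.
    static long[] familyMaxScore(long[] family, long[] score) {
        Map<Long, Long> best = new HashMap<>();
        // First pass: collect the best score per family across the segment.
        for (int doc = 0; doc < family.length; doc++) {
            best.merge(family[doc], score[doc], Long::max);
        }
        // Second pass: write the per-family value back onto every doc.
        long[] computed = new long[family.length];
        for (int doc = 0; doc < family.length; doc++) {
            computed[doc] = best.get(family[doc]);
        }
        return computed;
    }

    public static void main(String[] args) {
        long[] family = {1, 2, 1, 2};
        long[] score = {5, 7, 9, 3};
        SegmentFieldFunction fn = FlushComputedFieldSketch::familyMaxScore;
        System.out.println(Arrays.toString(fn.compute(family, score))); // [9, 7, 9, 7]
    }
}
```

The point of the two passes is that the value for one document depends on 
other documents in the same segment, which is exactly what a per-document 
indexed field cannot express.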

Anyway, I got the feature working, and it's a surprisingly small core code 
change.  But I had a very specific use case in mind: to "coalesce" documents 
by their families while sorting them by another field.  I realized that even 
though the feature is working, I cannot use it for this particular use case, 
since the coalescing would break during merge (it's not just a simple "merge 
sort").  The test case I added, simulating my use case, fails on those seeds / 
test multipliers that trigger merging of the random index.
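
One way to see the merge problem, under the assumption that the computed sort 
key is something like "best score of the family within the segment" (my guess 
at the shape of the use case, not a description of the patch or its test): 
each segment on its own keeps families contiguous, but the same family can 
carry a different key in each segment, so a key-ordered merge interleaves 
them. The class and method names below are hypothetical, and each doc is 
modelled as a {{long[]{family, flushKey}}} pair.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Plain-Java model only: not Lucene code and not the test from the patch.
class MergeBreaksCoalescingSketch {

    // Standard two-way merge of segments that are each sorted by flushKey (index 1).
    static List<long[]> mergeByKey(List<long[]> a, List<long[]> b) {
        List<long[]> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.size() || j < b.size()) {
            boolean takeA = j == b.size() || (i < a.size() && a.get(i)[1] <= b.get(j)[1]);
            out.add(takeA ? a.get(i++) : b.get(j++));
        }
        return out;
    }

    // True if every family (index 0) occupies one contiguous run of docs.
    static boolean familiesContiguous(List<long[]> docs) {
        Set<Long> seen = new HashSet<>();
        Long previous = null;
        for (long[] doc : docs) {
            boolean startsNewRun = previous == null || doc[0] != previous;
            // A family that starts a second run has been split apart.
            if (startsNewRun && !seen.add(doc[0])) {
                return false;
            }
            previous = doc[0];
        }
        return true;
    }

    public static void main(String[] args) {
        // Family 1 got key 5 in segment A but key 9 in segment B.
        List<long[]> segA = Arrays.asList(new long[]{1, 5}, new long[]{2, 7});
        List<long[]> segB = Arrays.asList(new long[]{3, 6}, new long[]{1, 9});
        System.out.println(familiesContiguous(segA));                   // true
        System.out.println(familiesContiguous(segB));                   // true
        System.out.println(familiesContiguous(mergeByKey(segA, segB))); // false
    }
}
```

In this toy example the merged order of families is 1, 3, 2, 1, so family 1 
is no longer coalesced. Keeping it together would require recomputing the key 
over the merged segment rather than merge-sorting the keys stored at flush.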

I'll post a patch but I don't plan to push this any further!



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
