[jira] [Updated] (LUCENE-5882) add 4.10 docvaluesformat

Robert Muir (JIRA) Thu, 14 Aug 2014 06:52:35 -0700

     [ 
https://issues.apache.org/jira/browse/LUCENE-5882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Robert Muir updated LUCENE-5882:
--------------------------------

    Attachment: LUCENE-5882.patch

Thank you Ryan. It's more than that actually: we had stupidity at read-time too 
to handle the empty terms case (this can happen when all values are merged 
away, and yes, we test it explicitly). 

I removed the max'ing and replaced it with asserts.

I also added new random TermsEnum tests to TestLucene410DocValuesFormat. These 
test the TermsEnum behavior with large numbers of terms (very large numbers in 
nightly runs). It would be nice to factor them into the base class to improve 
testing of all DVFs, but that's a little more complicated and noisy, so I left a 
TODO. I intend to address it after this issue though.

> add 4.10 docvaluesformat
> ------------------------
>
>                 Key: LUCENE-5882
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5882
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Robert Muir
>         Attachments: LUCENE-5882.patch, LUCENE-5882.patch, LUCENE-5882.patch
>
>
> We can improve the current format in a few ways:
> * speed up Sorted/SortedSet byte[] lookup by structuring the term blocks 
> differently (allow random access, more efficient bulk i/o)
> * speed up reverse lookup by adding a reverse index (small: just every 
> 1024'th term with useless suffixes removed).
> * use slice API for access to binary content, too.
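
For readers unfamiliar with the reverse-index idea in the second bullet, here is a minimal sketch in plain Java. It is not the code in the attached patch or in Lucene410DocValuesFormat: the class name `ReverseIndexSketch`, the method names, and the in-memory `List<byte[]>` representation are all assumptions made for illustration. It keeps one entry per block boundary (every `interval`'th term, 1024 in the description), shortened to the shortest prefix that still sorts after the preceding term, and uses those entries to narrow a term-to-ordinal lookup to a single block.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; hypothetical names, not the Lucene implementation. */
final class ReverseIndexSketch {
    private final List<byte[]> terms;      // sorted, unique terms (unsigned byte order)
    private final List<byte[]> separators; // one shortened entry per block boundary
    private final int interval;            // e.g. 1024, as in the issue description

    ReverseIndexSketch(List<byte[]> sortedTerms, int interval) {
        this.terms = sortedTerms;
        this.interval = interval;
        this.separators = new ArrayList<>();
        // Index every interval'th term, keeping only the bytes needed to
        // tell it apart from the term just before it.
        for (int i = interval; i < sortedTerms.size(); i += interval) {
            separators.add(shortestSeparator(sortedTerms.get(i - 1), sortedTerms.get(i)));
        }
    }

    /** Shortest prefix of term that still sorts strictly after prev (requires prev < term). */
    static byte[] shortestSeparator(byte[] prev, byte[] term) {
        int limit = Math.min(prev.length, term.length);
        int common = 0;
        while (common < limit && prev[common] == term[common]) {
            common++;
        }
        // Common prefix plus the first differing byte; the rest is a "useless suffix".
        return Arrays.copyOf(term, Math.min(common + 1, term.length));
    }

    /** Returns the ordinal of target, or -1 if it is not present. */
    int lookupOrd(byte[] target) {
        // Step 1: count separators <= target; that count is the block number.
        int lo = 0, hi = separators.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (Arrays.compareUnsigned(separators.get(mid), target) <= 0) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        // Step 2: binary search inside that one block only.
        int low = lo * interval;
        int high = Math.min(low + interval, terms.size()) - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1;
            int cmp = Arrays.compareUnsigned(terms.get(mid), target);
            if (cmp == 0) {
                return mid;
            } else if (cmp < 0) {
                low = mid + 1;
            } else {
                high = mid - 1;
            }
        }
        return -1; // also covers the empty-terms case: no separators, empty block
    }
}
```

The sketch holds everything on the heap for simplicity; a real format would read the blocks from the index files, and `Arrays.compareUnsigned` requires Java 9 or later.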



--
This message was sent by Atlassian JIRA
(v6.2#6252)

