Hi Mike,

13 billion unique terms (CheckIndex output appended below).

Tom
------

 test: terms, freq, prox...OK [13,068,302,002 terms; 187,284,275,343
terms/docs pairs; 786,014,075,745 tokens]

Segments file=segments_6 numSegments=2 version=4.0.0.2 format=
userData={commitTimeMSec=1357596564850}
  1 of 2: name=_uhj docCount=866984
    codec=Lucene40
    compound=false
    numFiles=10
    size (MB)=2,048,537.68
    diagnostics = {os=Linux, os.version=2.6.18-308.24.1.el5, mergeFactor=8,
source=merge, lucene.version=4.0.0 1394950 - rmuir - 2012-10-06 03:00:40,
os.arch=amd64, mergeMaxNumSegments=1, java.version=1.6.0_16,
java.vendor=Sun Microsystems Inc.}
    no deletions
    test: open reader.........OK
    test: fields..............OK [92 fields]
    test: field norms.........OK [46 fields]
    test: terms, freq, prox...OK [13068302002 terms; 187284275343
terms/docs pairs; 786014075745 tokens]
    test: stored fields.......OK [34172522 total field count; avg 39.415
fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
    test: DocValues........OK [0 total doc Count; Num DocValues Fields 0
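
For scale, the counts reported above can be compared with the signed 32-bit limit (2^31 - 1, roughly the "2.1 B" mentioned in the quoted thread below). This is a minimal sketch in Python that uses only the numbers printed by CheckIndex for segment _uhj. The variable names are illustrative and not part of Lucene.

```python
# Figures copied from the CheckIndex output for segment _uhj above.
terms = 13_068_302_002            # unique terms
term_doc_pairs = 187_284_275_343  # terms/docs pairs
tokens = 786_014_075_745          # total tokens
doc_count = 866_984               # docCount for the segment

INT32_MAX = 2**31 - 1  # 2,147,483,647, the signed 32-bit ceiling

# How many times over the 32-bit ceiling the term count is.
terms_over_int32 = terms / INT32_MAX

# Average number of documents each term appears in.
avg_docs_per_term = term_doc_pairs / terms

# Average number of tokens per document in this segment.
avg_tokens_per_doc = tokens / doc_count

print(f"terms exceed int32 max: {terms > INT32_MAX}")
print(f"terms / int32 max:      {terms_over_int32:.2f}")
print(f"avg docs per term:      {avg_docs_per_term:.2f}")
print(f"avg tokens per doc:     {avg_tokens_per_doc:,.0f}")
```

The ratios are plain arithmetic on the reported counts. They say nothing about how Lucene stores the terms internally.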



On Tue, Jun 4, 2013 at 1:00 PM, Tom Burton-West <[email protected]> wrote:

> Thanks Mike.
>
> I'm running CheckIndex on the 2TB index right now. Hopefully it will
> finish running by tomorrow. I'll send you a copy of the output.
>
> Tom
>
>
> On Mon, Jun 3, 2013 at 9:04 PM, Michael McCandless <
> [email protected]> wrote:
>
>> Hi Tom,
>>
>> On Mon, Jun 3, 2013 at 12:11 PM, Tom Burton-West <[email protected]>
>> wrote:
>>
>> > What is the current limit?
>>
>> I *think* (but it would be nice to hear back how many terms you were
>> able to index into one segment ;) ) there is no hard limit on the
>> maximum number of terms, now that FSTs can handle more than 2.1 B
>> bytes/nodes/arcs.
>>
>> I'll update those javadocs, thanks!
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>>
>
