Just checked it out, and it's not a bad win on multi-term queries. It's
not the same exponential gain as field cache loading, but I bet there will
be lots of 2-3x type speedups. You appear to save a decent amount by not
applying every term to every segment, which matters because of the
logarithmic segment sizing (a few large segments and many small ones).
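
To make the "not applying every term to each segment" point concrete, here is a toy sketch in plain Java. It is not Lucene code and not what the patch does internally: segments are modelled as sorted sets of terms, and the class and method names (SegmentTermSketch, topLevelLookups, perSegmentLookups) are made up for illustration. It only counts how many term lookups each approach would issue for a prefix-style query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Toy model only: segments are plain sorted sets of terms, not Lucene readers.
class SegmentTermSketch {

    // Terms in one segment that start with the prefix.
    // Assumes no term contains U+FFFF, which is used as the upper bound.
    public static SortedSet<String> matching(SortedSet<String> segment, String prefix) {
        return segment.subSet(prefix, prefix + Character.MAX_VALUE);
    }

    // Top-level style: merge the matching terms from all segments,
    // then probe every segment with every merged term.
    public static long topLevelLookups(List<SortedSet<String>> segments, String prefix) {
        SortedSet<String> merged = new TreeSet<String>();
        for (SortedSet<String> segment : segments) {
            merged.addAll(matching(segment, prefix));
        }
        return (long) merged.size() * segments.size();
    }

    // Per-segment style: each segment is probed only with its own matching terms.
    public static long perSegmentLookups(List<SortedSet<String>> segments, String prefix) {
        long lookups = 0;
        for (SortedSet<String> segment : segments) {
            lookups += matching(segment, prefix).size();
        }
        return lookups;
    }

    public static void main(String[] args) {
        List<SortedSet<String>> segments = new ArrayList<SortedSet<String>>();
        segments.add(new TreeSet<String>(Arrays.asList("001", "002", "abc")));
        segments.add(new TreeSet<String>(Arrays.asList("003", "xyz")));
        segments.add(new TreeSet<String>(Arrays.asList("zzz")));
        System.out.println("top-level lookups:   " + topLevelLookups(segments, "00"));   // 9
        System.out.println("per-segment lookups: " + perSegmentLookups(segments, "00")); // 3
    }
}
```

In this model the top-level count grows with the number of segments, while the per-segment count does not. If each of the 789 hits came from a distinct term living in a single segment (the equal term and document counts in the index listing suggest this but do not prove it), that would be 789 x 34 = 26826 lookups against 789. Lookup counts are only a rough proxy for query time, so this is not a prediction of the measured speedup.
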
My query of new WildcardQuery(new Term("string", "00*")) gets 789 hits
and takes half the time with this patch. The index is below:
Segments file=segments_a numSegments=34 version=FORMAT_USER_DATA [Lucene
2.9]
1 of 34: name=_bb48 docCount=159845
compound=true
hasProx=true
numFiles=2
size (MB)=294.529
docStoreOffset=0
docStoreSegment=_bb34
docStoreIsCompoundFile=true
no deletions
test: open reader.........OK
test: fields, norms.......OK [1 fields]
test: terms, freq, prox...OK [159845 terms; 159845 terms/docs pairs;
159845 tokens]
test: stored fields.......OK [0 total field count; avg 0 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
2 of 34: name=_bb5d docCount=159977
compound=true
hasProx=true
numFiles=2
size (MB)=294.681
docStoreOffset=159845
docStoreSegment=_bb34
docStoreIsCompoundFile=true
no deletions
test: open reader.........OK
test: fields, norms.......OK [1 fields]
test: terms, freq, prox...OK [159977 terms; 159977 terms/docs pairs;
159977 tokens]
test: stored fields.......OK [0 total field count; avg 0 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
3 of 34: name=_bb6i docCount=159691
compound=true
hasProx=true
numFiles=2
size (MB)=294.701
docStoreOffset=319822
docStoreSegment=_bb34
docStoreIsCompoundFile=true
no deletions
test: open reader.........OK
test: fields, norms.......OK [1 fields]
test: terms, freq, prox...OK [159691 terms; 159691 terms/docs pairs;
159691 tokens]
test: stored fields.......OK [0 total field count; avg 0 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
4 of 34: name=_bb6j docCount=3978
compound=true
hasProx=true
numFiles=2
size (MB)=7.382
docStoreOffset=479513
docStoreSegment=_bb34
docStoreIsCompoundFile=true
no deletions
test: open reader.........OK
test: fields, norms.......OK [1 fields]
test: terms, freq, prox...OK [3978 terms; 3978 terms/docs pairs;
3978 tokens]
test: stored fields.......OK [0 total field count; avg 0 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
5 of 34: name=_bb6k docCount=4002
compound=true
hasProx=true
numFiles=2
size (MB)=7.353
docStoreOffset=483491
docStoreSegment=_bb34
docStoreIsCompoundFile=true
no deletions
test: open reader.........OK
test: fields, norms.......OK [1 fields]
test: terms, freq, prox...OK [4002 terms; 4002 terms/docs pairs;
4002 tokens]
test: stored fields.......OK [0 total field count; avg 0 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
6 of 34: name=_bb6l docCount=3959
compound=true
hasProx=true
numFiles=2
size (MB)=7.365
docStoreOffset=487493
docStoreSegment=_bb34
docStoreIsCompoundFile=true
no deletions
test: open reader.........OK
test: fields, norms.......OK [1 fields]
test: terms, freq, prox...OK [3959 terms; 3959 terms/docs pairs;
3959 tokens]
test: stored fields.......OK [0 total field count; avg 0 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
7 of 34: name=_bb6m docCount=3938
compound=true
hasProx=true
numFiles=2
size (MB)=7.35
docStoreOffset=491452
docStoreSegment=_bb34
docStoreIsCompoundFile=true
no deletions
test: open reader.........OK
test: fields, norms.......OK [1 fields]
test: terms, freq, prox...OK [3938 terms; 3938 terms/docs pairs;
3938 tokens]
test: stored fields.......OK [0 total field count; avg 0 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
8 of 34: name=_bb6n docCount=4020
compound=true
hasProx=true
numFiles=2
size (MB)=7.379
docStoreOffset=495390
docStoreSegment=_bb34
docStoreIsCompoundFile=true
no deletions
test: open reader.........OK
test: fields, norms.......OK [1 fields]
test: terms, freq, prox...OK [4020 terms; 4020 terms/docs pairs;
4020 tokens]
test: stored fields.......OK [0 total field count; avg 0 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
9 of 34: name=_bb6o docCount=3973
compound=true
hasProx=true
numFiles=2
size (MB)=7.385
docStoreOffset=499410
docStoreSegment=_bb34
docStoreIsCompoundFile=true
no deletions
test: open reader.........OK
test: fields, norms.......OK [1 fields]
test: terms, freq, prox...OK [3973 terms; 3973 terms/docs pairs;
3973 tokens]
test: stored fields.......OK [0 total field count; avg 0 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
10 of 34: name=_bb6p docCount=3923
compound=true
hasProx=true
numFiles=2
size (MB)=7.338
docStoreOffset=503383
docStoreSegment=_bb34
docStoreIsCompoundFile=true
no deletions
test: open reader.........OK
test: fields, norms.......OK [1 fields]
test: terms, freq, prox...OK [3923 terms; 3923 terms/docs pairs;
3923 tokens]
test: stored fields.......OK [0 total field count; avg 0 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
11 of 34: name=_bb6q docCount=3978
compound=true
hasProx=true
numFiles=2
size (MB)=7.382
docStoreOffset=507306
docStoreSegment=_bb34
docStoreIsCompoundFile=true
no deletions
test: open reader.........OK
test: fields, norms.......OK [1 fields]
test: terms, freq, prox...OK [3978 terms; 3978 terms/docs pairs;
3978 tokens]
test: stored fields.......OK [0 total field count; avg 0 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
12 of 34: name=_bb6r docCount=4001
compound=true
hasProx=true
numFiles=2
size (MB)=7.382
docStoreOffset=511284
docStoreSegment=_bb34
docStoreIsCompoundFile=true
no deletions
test: open reader.........OK
test: fields, norms.......OK [1 fields]
test: terms, freq, prox...OK [4001 terms; 4001 terms/docs pairs;
4001 tokens]
test: stored fields.......OK [0 total field count; avg 0 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
13 of 34: name=_bb6s docCount=3944
compound=true
hasProx=true
numFiles=2
size (MB)=7.337
docStoreOffset=515285
docStoreSegment=_bb34
docStoreIsCompoundFile=true
no deletions
test: open reader.........OK
test: fields, norms.......OK [1 fields]
test: terms, freq, prox...OK [3944 terms; 3944 terms/docs pairs;
3944 tokens]
test: stored fields.......OK [0 total field count; avg 0 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
14 of 34: name=_bb6t docCount=4047
compound=true
hasProx=true
numFiles=2
size (MB)=7.373
docStoreOffset=519229
docStoreSegment=_bb34
docStoreIsCompoundFile=true
no deletions
test: open reader.........OK
test: fields, norms.......OK [1 fields]
test: terms, freq, prox...OK [4047 terms; 4047 terms/docs pairs;
4047 tokens]
test: stored fields.......OK [0 total field count; avg 0 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
15 of 34: name=_bb6u docCount=4014
compound=true
hasProx=true
numFiles=2
size (MB)=7.349
docStoreOffset=523276
docStoreSegment=_bb34
docStoreIsCompoundFile=true
no deletions
test: open reader.........OK
test: fields, norms.......OK [1 fields]
test: terms, freq, prox...OK [4014 terms; 4014 terms/docs pairs;
4014 tokens]
test: stored fields.......OK [0 total field count; avg 0 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
16 of 34: name=_bb6v docCount=3959
compound=true
hasProx=true
numFiles=2
size (MB)=7.361
docStoreOffset=527290
docStoreSegment=_bb34
docStoreIsCompoundFile=true
no deletions
test: open reader.........OK
test: fields, norms.......OK [1 fields]
test: terms, freq, prox...OK [3959 terms; 3959 terms/docs pairs;
3959 tokens]
test: stored fields.......OK [0 total field count; avg 0 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
17 of 34: name=_bb6w docCount=4033
compound=true
hasProx=true
numFiles=2
size (MB)=7.354
docStoreOffset=531249
docStoreSegment=_bb34
docStoreIsCompoundFile=true
no deletions
test: open reader.........OK
test: fields, norms.......OK [1 fields]
test: terms, freq, prox...OK [4033 terms; 4033 terms/docs pairs;
4033 tokens]
test: stored fields.......OK [0 total field count; avg 0 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
18 of 34: name=_bb6x docCount=4048
compound=true
hasProx=true
numFiles=2
size (MB)=7.391
docStoreOffset=535282
docStoreSegment=_bb34
docStoreIsCompoundFile=true
no deletions
test: open reader.........OK
test: fields, norms.......OK [1 fields]
test: terms, freq, prox...OK [4048 terms; 4048 terms/docs pairs;
4048 tokens]
test: stored fields.......OK [0 total field count; avg 0 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
19 of 34: name=_bb6y docCount=4038
compound=true
hasProx=true
numFiles=2
size (MB)=7.371
docStoreOffset=539330
docStoreSegment=_bb34
docStoreIsCompoundFile=true
no deletions
test: open reader.........OK
test: fields, norms.......OK [1 fields]
test: terms, freq, prox...OK [4038 terms; 4038 terms/docs pairs;
4038 tokens]
test: stored fields.......OK [0 total field count; avg 0 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
20 of 34: name=_bb6z docCount=3947
compound=true
hasProx=true
numFiles=2
size (MB)=7.35
docStoreOffset=543368
docStoreSegment=_bb34
docStoreIsCompoundFile=true
no deletions
test: open reader.........OK
test: fields, norms.......OK [1 fields]
test: terms, freq, prox...OK [3947 terms; 3947 terms/docs pairs;
3947 tokens]
test: stored fields.......OK [0 total field count; avg 0 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
21 of 34: name=_bb70 docCount=3991
compound=true
hasProx=true
numFiles=2
size (MB)=7.356
docStoreOffset=547315
docStoreSegment=_bb34
docStoreIsCompoundFile=true
no deletions
test: open reader.........OK
test: fields, norms.......OK [1 fields]
test: terms, freq, prox...OK [3991 terms; 3991 terms/docs pairs;
3991 tokens]
test: stored fields.......OK [0 total field count; avg 0 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
22 of 34: name=_bb71 docCount=3890
compound=true
hasProx=true
numFiles=2
size (MB)=7.345
docStoreOffset=551306
docStoreSegment=_bb34
docStoreIsCompoundFile=true
no deletions
test: open reader.........OK
test: fields, norms.......OK [1 fields]
test: terms, freq, prox...OK [3890 terms; 3890 terms/docs pairs;
3890 tokens]
test: stored fields.......OK [0 total field count; avg 0 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
23 of 34: name=_bb72 docCount=3931
compound=true
hasProx=true
numFiles=2
size (MB)=7.347
docStoreOffset=555196
docStoreSegment=_bb34
docStoreIsCompoundFile=true
no deletions
test: open reader.........OK
test: fields, norms.......OK [1 fields]
test: terms, freq, prox...OK [3931 terms; 3931 terms/docs pairs;
3931 tokens]
test: stored fields.......OK [0 total field count; avg 0 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
24 of 34: name=_bb73 docCount=4025
compound=true
hasProx=true
numFiles=2
size (MB)=7.359
docStoreOffset=559127
docStoreSegment=_bb34
docStoreIsCompoundFile=true
no deletions
test: open reader.........OK
test: fields, norms.......OK [1 fields]
test: terms, freq, prox...OK [4025 terms; 4025 terms/docs pairs;
4025 tokens]
test: stored fields.......OK [0 total field count; avg 0 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
25 of 34: name=_bb74 docCount=4039
compound=true
hasProx=true
numFiles=2
size (MB)=7.392
docStoreOffset=563152
docStoreSegment=_bb34
docStoreIsCompoundFile=true
no deletions
test: open reader.........OK
test: fields, norms.......OK [1 fields]
test: terms, freq, prox...OK [4039 terms; 4039 terms/docs pairs;
4039 tokens]
test: stored fields.......OK [0 total field count; avg 0 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
26 of 34: name=_bb75 docCount=4002
compound=true
hasProx=true
numFiles=2
size (MB)=7.35
docStoreOffset=567191
docStoreSegment=_bb34
docStoreIsCompoundFile=true
no deletions
test: open reader.........OK
test: fields, norms.......OK [1 fields]
test: terms, freq, prox...OK [4002 terms; 4002 terms/docs pairs;
4002 tokens]
test: stored fields.......OK [0 total field count; avg 0 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
27 of 34: name=_bb76 docCount=4012
compound=true
hasProx=true
numFiles=2
size (MB)=7.394
docStoreOffset=571193
docStoreSegment=_bb34
docStoreIsCompoundFile=true
no deletions
test: open reader.........OK
test: fields, norms.......OK [1 fields]
test: terms, freq, prox...OK [4012 terms; 4012 terms/docs pairs;
4012 tokens]
test: stored fields.......OK [0 total field count; avg 0 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
28 of 34: name=_bb77 docCount=3991
compound=true
hasProx=true
numFiles=2
size (MB)=7.331
docStoreOffset=575205
docStoreSegment=_bb34
docStoreIsCompoundFile=true
no deletions
test: open reader.........OK
test: fields, norms.......OK [1 fields]
test: terms, freq, prox...OK [3991 terms; 3991 terms/docs pairs;
3991 tokens]
test: stored fields.......OK [0 total field count; avg 0 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
29 of 34: name=_bb78 docCount=4008
compound=true
hasProx=true
numFiles=2
size (MB)=7.356
docStoreOffset=579196
docStoreSegment=_bb34
docStoreIsCompoundFile=true
no deletions
test: open reader.........OK
test: fields, norms.......OK [1 fields]
test: terms, freq, prox...OK [4008 terms; 4008 terms/docs pairs;
4008 tokens]
test: stored fields.......OK [0 total field count; avg 0 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
30 of 34: name=_bb79 docCount=4037
compound=true
hasProx=true
numFiles=2
size (MB)=7.366
docStoreOffset=583204
docStoreSegment=_bb34
docStoreIsCompoundFile=true
no deletions
test: open reader.........OK
test: fields, norms.......OK [1 fields]
test: terms, freq, prox...OK [4037 terms; 4037 terms/docs pairs;
4037 tokens]
test: stored fields.......OK [0 total field count; avg 0 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
31 of 34: name=_bb7a docCount=3984
compound=true
hasProx=true
numFiles=2
size (MB)=7.347
docStoreOffset=587241
docStoreSegment=_bb34
docStoreIsCompoundFile=true
no deletions
test: open reader.........OK
test: fields, norms.......OK [1 fields]
test: terms, freq, prox...OK [3984 terms; 3984 terms/docs pairs;
3984 tokens]
test: stored fields.......OK [0 total field count; avg 0 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
32 of 34: name=_bb7b docCount=4049
compound=true
hasProx=true
numFiles=2
size (MB)=7.349
docStoreOffset=591225
docStoreSegment=_bb34
docStoreIsCompoundFile=true
no deletions
test: open reader.........OK
test: fields, norms.......OK [1 fields]
test: terms, freq, prox...OK [4049 terms; 4049 terms/docs pairs;
4049 tokens]
test: stored fields.......OK [0 total field count; avg 0 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
33 of 34: name=_bb7c docCount=3992
compound=true
hasProx=true
numFiles=2
size (MB)=7.353
docStoreOffset=595274
docStoreSegment=_bb34
docStoreIsCompoundFile=true
no deletions
test: open reader.........OK
test: fields, norms.......OK [1 fields]
test: terms, freq, prox...OK [3992 terms; 3992 terms/docs pairs;
3992 tokens]
test: stored fields.......OK [0 total field count; avg 0 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
34 of 34: name=_bb7d docCount=734
compound=true
hasProx=true
numFiles=2
size (MB)=1.338
docStoreOffset=599266
docStoreSegment=_bb34
docStoreIsCompoundFile=true
no deletions
test: open reader.........OK
test: fields, norms.......OK [1 fields]
test: terms, freq, prox...OK [734 terms; 734 terms/docs pairs; 734
tokens]
test: stored fields.......OK [0 total field count; avg 0 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
No problems were detected with this index.