[ https://issues.apache.org/jira/browse/LUCENE-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15434935#comment-15434935 ]
Ferenczi Jim edited comment on LUCENE-7423 at 8/25/16 9:05 AM:
---------------------------------------------------------------

(edited since the results of the autoprefix were wrong due to a bug in the code to generate the prefixes)

I've added a small benchmark AutoPrefixPerf.java (modified from [~mikemccand] utils).
For the benchmark I used the English Wikipedia titles and a standard analyzer:

{panel:title=Standard analyzer|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1|bgColor=#FFFFCE}
A single field in this test:
* "field": standard analyzer

{noformat}
Indexed 12600000: 33.756 sec
Final Indexed 12696047: 33.9 sec
Optimize...
After force merge: 37.794 sec
Close...
After close: 37.798 sec
Done
CheckIndex:
Segments file=segments_1 numSegments=1 version=7.0.0 id=ex11gzoft89z21le5c93bpett
  1 of 1: name=_j maxDoc=12696047
    version=7.0.0
    id=ex11gzoft89z21le5c93bpets
    codec=Lucene62
    compound=false
    numFiles=7
    size (MB)=78.562
    diagnostics = {os=Mac OS X, java.vendor=Oracle Corporation, java.version=1.8.0_77, java.vm.version=25.77-b03, lucene.version=7.0.0, mergeMaxNumSegments=1, os.arch=x86_64, java.runtime.version=1.8.0_77-b03, source=merge, mergeFactor=9, os.version=10.11.4, timestamp=1472043738648}
    no deletions
    test: open reader.........OK [took 0.002 sec]
    test: check integrity.....OK [took 0.046 sec]
    test: check live docs.....OK [took 0.000 sec]
    test: field infos.........OK [1 fields] [took 0.000 sec]
    test: field norms.........OK [0 fields] [took 0.000 sec]
    test: terms, freq, prox...OK [2513966 terms; 34713220 terms/docs pairs; 0 tokens] [took 2.321 sec]
      field "field":
        index FST:
          699982 bytes
        terms:
          2513966 terms
          20843092 bytes (8.3 bytes/term)
        blocks:
          80953 blocks
          59384 terms-only blocks
          10 sub-block-only blocks
          21559 mixed blocks
          18273 floor blocks
          25611 non-floor blocks
          55342 floor sub-blocks
          13294379 term suffix bytes (164.2 suffix-bytes/block)
          2538232 term stats bytes (31.4 stats-bytes/block)
          8829391 other bytes (109.1 other-bytes/block)
          by prefix length:
            0: 5 1: 421 2: 5620 3: 18794 4: 31598 5: 16630 6: 5322 7: 1709 8: 443 9: 138 10: 249 11: 14 12: 2 13: 6 14: 2
    test: stored fields.......OK [0 total field count; avg 0.0 fields per doc] [took 0.257 sec]
    test: term vectors........OK [0 total term vector count; avg 0.0 term/freq vector fields per doc] [took 0.000 sec]
    test: docvalues...........OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 SORTED; 0 SORTED_NUMERIC; 0 SORTED_SET] [took 0.000 sec]
    test: points..............OK [0 fields, 0 points] [took 0.000 sec]
detailed segment RAM usage:
_j(7.0.0):C12696047: 741.9 KB
|-- postings [PerFieldPostings(segment=_j formats=1)]: 683.8 KB
    |-- format 'Lucene50_0' [BlockTreeTermsReader(fields=1,delegate=Lucene50PostingsReader(positions=false,payloads=false))]: 683.8 KB
        |-- field 'field' [BlockTreeTerms(terms=2513966,postings=34713220,positions=-1,docs=12682564)]: 683.7 KB
            |-- term index [FST(input=BYTE1,output=ByteSequenceOutputs,packed=false]: 683.6 KB
        |-- delegate [Lucene50PostingsReader(positions=false,payloads=false)]: 32 bytes
|-- stored fields [CompressingStoredFieldsReader(mode=FAST,chunksize=16384)]: 58.1 KB
    |-- stored field index [CompressingStoredFieldsIndexReader(blocks=97)]: 58.1 KB
        |-- doc base deltas: 29.1 KB
        |-- start pointer deltas: 26.6 KB
No problems were detected with this index.
{noformat}
{panel}

{panel:title=EdgeNgram analyzer min=2 max=5 |borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1|bgColor=#FFFFCE}
Two fields for this test:
* "field": standard analyzer
* "field-edge": edge ngram analyzer (min=2, max=5) on top of a standard analyzer.

{noformat}
Indexed 12600000: 70.831 sec
Final Indexed 12696047: 71.484 sec
Optimize...
After force merge: 80.344 sec
Close...
After close: 80.347 sec
Done
CheckIndex:
Segments file=segments_1 numSegments=1 version=7.0.0 id=8bm8xy2peb5wo3td0ptgwv036
  1 of 1: name=_19 maxDoc=12696047
    version=7.0.0
    id=8bm8xy2peb5wo3td0ptgwv035
    codec=Lucene62
    compound=false
    numFiles=7
    size (MB)=224.803
    diagnostics = {os=Mac OS X, java.vendor=Oracle Corporation, java.version=1.8.0_77, java.vm.version=25.77-b03, lucene.version=7.0.0, mergeMaxNumSegments=1, os.arch=x86_64, java.runtime.version=1.8.0_77-b03, source=merge, mergeFactor=15, os.version=10.11.4, timestamp=1472044255056}
    no deletions
    test: open reader.........OK [took 0.002 sec]
    test: check integrity.....OK [took 0.130 sec]
    test: check live docs.....OK [took 0.000 sec]
    test: field infos.........OK [2 fields] [took 0.000 sec]
    test: field norms.........OK [0 fields] [took 0.000 sec]
    test: terms, freq, prox...OK [3459987 terms; 155467747 terms/docs pairs; 0 tokens] [took 3.736 sec]
      field "field":
        index FST:
          699967 bytes
        terms:
          2513966 terms
          20843092 bytes (8.3 bytes/term)
        blocks:
          80953 blocks
          59384 terms-only blocks
          10 sub-block-only blocks
          21559 mixed blocks
          18273 floor blocks
          25611 non-floor blocks
          55342 floor sub-blocks
          13294377 term suffix bytes (164.2 suffix-bytes/block)
          2538232 term stats bytes (31.4 stats-bytes/block)
          8836971 other bytes (109.2 other-bytes/block)
          by prefix length:
            0: 5 1: 421 2: 5620 3: 18794 4: 31598 5: 16630 6: 5322 7: 1709 8: 443 9: 138 10: 249 11: 14 12: 2 13: 6 14: 2
      field "field-edge":
        index FST:
          265903 bytes
        terms:
          946021 terms
          4693480 bytes (5.0 bytes/term)
        blocks:
          30830 blocks
          26448 terms-only blocks
          16 sub-block-only blocks
          4366 mixed blocks
          6054 floor blocks
          5852 non-floor blocks
          24978 floor sub-blocks
          2954296 term suffix bytes (95.8 suffix-bytes/block)
          990273 term stats bytes (32.1 stats-bytes/block)
          2750060 other bytes (89.2 other-bytes/block)
          by prefix length:
            0: 5 1: 313 2: 6051 3: 21746 4: 2272 5: 396 6: 28 7: 16 8: 3
    test: stored fields.......OK [0 total field count; avg 0.0 fields per doc] [took 0.319 sec]
    test: term vectors........OK [0 total term vector count; avg 0.0 term/freq vector fields per doc] [took 0.000 sec]
    test: docvalues...........OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 SORTED; 0 SORTED_NUMERIC; 0 SORTED_SET] [took 0.000 sec]
    test: points..............OK [0 fields, 0 points] [took 0.000 sec]
detailed segment RAM usage:
_19(7.0.0):C12696047: 1 MB
|-- postings [PerFieldPostings(segment=_19 formats=1)]: 943.6 KB
    |-- format 'Lucene50_0' [BlockTreeTermsReader(fields=2,delegate=Lucene50PostingsReader(positions=false,payloads=false))]: 943.6 KB
        |-- field 'field' [BlockTreeTerms(terms=2513966,postings=34713220,positions=-1,docs=12682564)]: 683.7 KB
            |-- term index [FST(input=BYTE1,output=ByteSequenceOutputs,packed=false]: 683.6 KB
        |-- field 'field-edge' [BlockTreeTerms(terms=946021,postings=120754527,positions=-1,docs=12645321)]: 259.8 KB
            |-- term index [FST(input=BYTE1,output=ByteSequenceOutputs,packed=false]: 259.7 KB
        |-- delegate [Lucene50PostingsReader(positions=false,payloads=false)]: 32 bytes
|-- stored fields [CompressingStoredFieldsReader(mode=FAST,chunksize=16384)]: 95.2 KB
    |-- stored field index [CompressingStoredFieldsIndexReader(blocks=97)]: 95.2 KB
        |-- doc base deltas: 47.5 KB
        |-- start pointer deltas: 45.3 KB
No problems were detected with this index.
Took 4.209 sec total.
Total index size: 235722542 bytes
{noformat}
{panel}

For the results of the AutoPrefix PostingsFormat please check the next comment.


was (Author: jim.ferenczi):
Another iteration. I fixed the prefix selection (the term "aa" should not increment the number of terms accounted for the term "a"). This reduces the index size greatly.
I've added a small benchmark AutoPrefixPerf.java (modified from [~mikemccand] utils).
For the benchmark I used the English Wikipedia titles and a standard analyzer:

{panel:title=AutoPrefix minPrefixTerms=2|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1|bgColor=#FFFFCE}
Two indexed fields:
* "field": standard analyzer
* "field-autoprefix": the autoprefix of the field "field" with a minPrefixTerms set to 2.

{noformat}
Indexed 12600000: 52.49 sec
Final Indexed 12696047: 52.717 sec
Optimize...
After force merge: 68.699 sec
Close...
After close: 68.704 sec
Done
CheckIndex:
Segments file=segments_1 numSegments=1 version=7.0.0 id=1gb0m3msddxzckhpfj9lzsneq
  1 of 1: name=_j maxDoc=12696047
    version=7.0.0
    id=1gb0m3msddxzckhpfj9lzsnep
    codec=Lucene62
    compound=false
    numFiles=7
    size (MB)=120.032
    diagnostics = {os=Mac OS X, java.vendor=Oracle Corporation, java.version=1.8.0_77, java.vm.version=25.77-b03, lucene.version=7.0.0, mergeMaxNumSegments=1, os.arch=x86_64, java.runtime.version=1.8.0_77-b03, source=merge, mergeFactor=9, os.version=10.11.4, timestamp=1472044414055}
    no deletions
    test: open reader.........OK [took 0.002 sec]
    test: check integrity.....OK [took 0.067 sec]
    test: check live docs.....OK [took 0.000 sec]
    test: field infos.........OK [2 fields] [took 0.000 sec]
    test: field norms.........OK [0 fields] [took 0.000 sec]
    test: terms, freq, prox...OK [3034551 terms; 60351742 terms/docs pairs; 0 tokens] [took 2.566 sec]
      field "field-autoprefix":
        index FST:
          152510 bytes
        terms:
          520585 terms
          3436438 bytes (6.6 bytes/term)
        blocks:
          16779 blocks
          12264 terms-only blocks
          1 sub-block-only blocks
          4514 mixed blocks
          3880 floor blocks
          5187 non-floor blocks
          11592 floor sub-blocks
          2140329 term suffix bytes (127.6 suffix-bytes/block)
          539804 term stats bytes (32.2 stats-bytes/block)
          729244 other bytes (43.5 other-bytes/block)
          by prefix length:
            0: 9 1: 286 2: 1746 3: 6942 4: 5237 5: 1722 6: 577 7: 191 8: 31 9: 18 10: 19 11: 1
      field "field":
        index FST:
          699987 bytes
        terms:
          2513966 terms
          20843092 bytes (8.3 bytes/term)
        blocks:
          80953 blocks
          59384 terms-only blocks
          10 sub-block-only blocks
          21559 mixed blocks
          18273 floor blocks
          25611 non-floor blocks
          55342 floor sub-blocks
          13294384 term suffix bytes (164.2 suffix-bytes/block)
          2538232 term stats bytes (31.4 stats-bytes/block)
          8847612 other bytes (109.3 other-bytes/block)
          by prefix length:
            0: 5 1: 421 2: 5620 3: 18794 4: 31598 5: 16630 6: 5322 7: 1709 8: 443 9: 138 10: 249 11: 14 12: 2 13: 6 14: 2
    test: stored fields.......OK [0 total field count; avg 0.0 fields per doc] [took 0.281 sec]
    test: term vectors........OK [0 total term vector count; avg 0.0 term/freq vector fields per doc] [took 0.000 sec]
    test: docvalues...........OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 SORTED; 0 SORTED_NUMERIC; 0 SORTED_SET] [took 0.000 sec]
    test: points..............OK [0 fields, 0 points] [took 0.000 sec]
detailed segment RAM usage:
_j(7.0.0):C12696047: 894.8 KB
|-- postings [PerFieldPostings(segment=_j formats=1)]: 832.9 KB
    |-- format 'AutoPrefix_0' [BlockTreeTermsReader(fields=2,delegate=Lucene50PostingsReader(positions=false,payloads=false))]: 832.9 KB
        |-- field 'field' [BlockTreeTerms(terms=2513966,postings=34713220,positions=-1,docs=12682564)]: 683.7 KB
            |-- term index [FST(input=BYTE1,output=ByteSequenceOutputs,packed=false]: 683.6 KB
        |-- field 'field-autoprefix' [BlockTreeTerms(terms=520585,postings=25638522,positions=-1,docs=9493306)]: 149.1 KB
            |-- term index [FST(input=BYTE1,output=ByteSequenceOutputs,packed=false]: 148.9 KB
        |-- delegate [Lucene50PostingsReader(positions=false,payloads=false)]: 32 bytes
|-- stored fields [CompressingStoredFieldsReader(mode=FAST,chunksize=16384)]: 61.9 KB
    |-- stored field index [CompressingStoredFieldsIndexReader(blocks=97)]: 61.9 KB
        |-- doc base deltas: 30.5 KB
        |-- start pointer deltas: 29.1 KB
No problems were detected with this index.
Took 2.933 sec total.
Total index size: 125862986 bytes
{noformat}
{panel}

The autoprefix format performs better than the 2-5 edge ngram solution: it produces 520,585 terms, about half as many as the 2-5 edge ngram field (946,021 terms, roughly 1M), it is faster to build (52.717 sec vs 71.484 sec), and the index is smaller (120M vs 225M).


> AutoPrefixPostingsFormat: a PostingsFormat optimized for prefix queries on
> text fields.
> ---------------------------------------------------------------------------------------
>
>                 Key: LUCENE-7423
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7423
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/sandbox
>            Reporter: Ferenczi Jim
>            Priority: Minor
>         Attachments: LUCENE-7423.patch
>
>
> The autoprefix terms dict added in
> https://issues.apache.org/jira/browse/LUCENE-5879 has been removed with
> https://issues.apache.org/jira/browse/LUCENE-7317.
> The new points API is now used to do efficient range queries but the
> replacement for prefix string queries is unclear. The edge ngrams could be
> used instead but they have a lot of drawbacks and are hard to configure
> correctly. The completion postings format is also a good replacement but it
> requires a big FST in RAM and it cannot be intersected with other
> fields.
> This patch is a proposal for a new PostingsFormat optimized for prefix queries
> on string fields. It detects prefixes that match "enough" terms and writes
> auto-prefix terms into their own virtual field.
> At search time the virtual field is used to speed up prefix queries that
> match "enough" terms.
> The auto-prefix terms are built in two passes:
> * The first pass builds a compact prefix tree. Since the terms enum is sorted
> the prefixes are flushed on the fly depending on the input. For each prefix
> we build its corresponding inverted list using a DocIdSetBuilder. The first
> pass visits each term of the field TermsEnum only once. When a prefix is
> flushed from the prefix tree its inverted list is dumped into a temporary
> file for further use. This is necessary since the prefixes are not sorted
> when they are removed from the tree. The selected auto prefixes are sorted at
> the end of the first pass.
> * The second pass is a sorted scan of the prefixes and the temporary file is
> used to read the corresponding inverted lists.
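
The two passes described above can be illustrated with a small in-memory sketch. This is not code from the patch: the class and method names (AutoPrefixSketch, selectPrefixes, OpenPrefix) are hypothetical, a TreeMap stands in for the sorted TermsEnum, a java.util.BitSet stands in for the DocIdSetBuilder, and an in-memory list stands in for the temporary file. The counting rule (every term increments every prefix it starts with) is also an assumption; the selection rule in the patch may differ.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of two-pass auto-prefix selection; not code from the patch. */
public class AutoPrefixSketch {

    /** A prefix that is still open: its text, the terms seen under it, and their merged doc ids. */
    static final class OpenPrefix {
        final String text;
        final BitSet docs = new BitSet(); // stand-in for a DocIdSetBuilder
        int termCount = 0;

        OpenPrefix(String text) {
            this.text = text;
        }
    }

    /**
     * sortedTerms maps each term to its doc ids and is iterated in sorted order.
     * Returns every prefix matching at least minPrefixTerms terms, with its merged doc ids.
     */
    public static TreeMap<String, BitSet> selectPrefixes(
            TreeMap<String, int[]> sortedTerms, int minPrefixTerms) {
        // First pass: prefixes are flushed out of order, so they are buffered
        // (the description uses a temporary file for this step).
        List<OpenPrefix> flushed = new ArrayList<>();
        Deque<OpenPrefix> stack = new ArrayDeque<>();

        for (Map.Entry<String, int[]> entry : sortedTerms.entrySet()) {
            String term = entry.getKey();
            // Close every open prefix that the current term does not start with.
            while (!stack.isEmpty() && !term.startsWith(stack.peek().text)) {
                flush(stack.pop(), minPrefixTerms, flushed);
            }
            // Open the prefixes of this term that are not on the stack yet.
            int openLength = stack.isEmpty() ? 0 : stack.peek().text.length();
            for (int len = openLength + 1; len <= term.length(); len++) {
                stack.push(new OpenPrefix(term.substring(0, len)));
            }
            // Account for the term in every open prefix.
            for (OpenPrefix prefix : stack) {
                prefix.termCount++;
                for (int doc : entry.getValue()) {
                    prefix.docs.set(doc);
                }
            }
        }
        while (!stack.isEmpty()) {
            flush(stack.pop(), minPrefixTerms, flushed);
        }

        // Second pass: a sorted scan over the selected prefixes.
        TreeMap<String, BitSet> selected = new TreeMap<>();
        for (OpenPrefix prefix : flushed) {
            selected.put(prefix.text, prefix.docs);
        }
        return selected;
    }

    /** Keeps a closed prefix only if it matched enough terms. */
    private static void flush(OpenPrefix prefix, int minPrefixTerms, List<OpenPrefix> out) {
        if (prefix.termCount >= minPrefixTerms) {
            out.add(prefix);
        }
    }

    public static void main(String[] args) {
        TreeMap<String, int[]> terms = new TreeMap<>();
        terms.put("bar", new int[] {2});
        terms.put("fob", new int[] {1});
        terms.put("foo", new int[] {0});
        // "fo" is flushed before "f", which is why the final sort is needed.
        System.out.println(selectPrefixes(terms, 2)); // prints {f={0, 1}, fo={0, 1}}
    }
}
```

Because the input is sorted, all terms sharing a prefix are contiguous, so each term is visited once and a prefix can be closed as soon as the next term no longer starts with it.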
> The patch is just a POC and there is room for optimization, but the first
> results are promising:
> I tested the patch with the geonames dataset. I indexed all the titles with
> the KeywordAnalyzer and compared the index/merge time and the size of the
> indices.
> The edge ngram index (with a min edge ngram size of 2 and a max of 20) takes
> 572M on disk and it took 130s to index and optimize the 11M titles.
> The auto prefix index takes 287M on disk and took 70s to index and optimize
> the same 11M titles. Among the 287M, only 170M are used for the auto prefix
> fields and the rest is for the regular keyword field. All the auto prefixes
> were generated for this test (at least 2 terms per auto-prefix).
> The queries have similar performance since we are sure on both sides that one
> inverted list can answer any prefix query.