Re: Statically store sub-collections for search (faceted search?)
On 12.04.2013 20:08, SUJIT PAL wrote:
> Hi Carsten, Why not use your idea of the BooleanQuery but wrap it in a
> Filter instead? Since you are not doing any scoring (only filtering), the
> max boolean clauses limit should not apply to a filter.

Hi Sujit,

thanks for your suggestion! I wasn't aware that the max clause limit does not apply to a BooleanQuery wrapped in a filter. I suppose the ideal way would be to use a BooleanFilter rather than a QueryWrapperFilter, right?

However, I am also not sure how to apply a filter in my use case because I perform a SpanQuery. Although SpanQuery#getSpans() does take a Bits object as an argument (acceptDocs), I haven't been able to figure out how to generate this Bits object correctly from a Filter object.

Best,
Carsten

--
Institut für Deutsche Sprache | http://www.ids-mannheim.de
Projekt KorAP | http://korap.ids-mannheim.de
Tel. +49-(0)621-43740789 | schno...@ids-mannheim.de
Korpusanalyseplattform der nächsten Generation
Next Generation Corpus Analysis Platform

-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
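The missing step asked about here, turning a Filter into the acceptDocs Bits that SpanQuery#getSpans() expects, can be sketched per index segment roughly as follows. This is an untested sketch against the Lucene 4.x API; the class name `FilteredSpans`, the method `collect`, and the variable names are invented for illustration:

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.lucene.index.AtomicReaderContext;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermContext;
import org.apache.lucene.search.DocIdSet;
import org.apache.lucene.search.Filter;
import org.apache.lucene.search.spans.SpanQuery;
import org.apache.lucene.search.spans.Spans;
import org.apache.lucene.util.Bits;

public class FilteredSpans {
    // Iterate spans per segment, restricted to the documents the filter accepts.
    static void collect(IndexReader reader, SpanQuery spanQuery, Filter filter)
            throws IOException {
        Map<Term, TermContext> termContexts = new HashMap<Term, TermContext>();
        for (AtomicReaderContext atomic : reader.leaves()) {
            // getDocIdSet intersects with the liveDocs passed in as acceptDocs
            DocIdSet docIdSet =
                filter.getDocIdSet(atomic, atomic.reader().getLiveDocs());
            if (docIdSet == null) {
                continue; // null DocIdSet means: no documents match this segment
            }
            // Caveat: bits() may return null if the DocIdSet has no
            // random-access view (e.g. some iterator-only implementations).
            Bits acceptDocs = docIdSet.bits();
            Spans spans = spanQuery.getSpans(atomic, acceptDocs, termContexts);
            while (spans.next()) {
                // process spans.doc(), spans.start(), spans.end() ...
            }
        }
    }
}
```

Note the caveat on `bits()`: passing null acceptDocs to getSpans() means "all documents allowed", not "none", so a null return from bits() must be handled separately.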
Re: Statically store sub-collections for search (faceted search?)
On 15.04.2013 11:27, Uwe Schindler wrote:
> Hi again, you are somehow misusing acceptDocs and DocIdSet here, so you
> have to take care, the semantics are different:
> - For acceptDocs, null means all documents are allowed (no deleted documents)
> - For DocIdSet, null means no documents matched

Okay, as described above, I would now pass either the result of getLiveDocs() or Bits.MatchAllDocuments() as the acceptDocs argument to getDocIdSet():

  Map<Term, TermContext> termContexts = new HashMap<Term, TermContext>();
  AtomicReaderContext atomic = ...
  ChainedFilter filter = ...

> You just pass getLiveDocs(), no null check needed. Using your code would
> bring a slowdown for indexes without deletions.

This makes sense to me, but now I get zero matches in all searches using the filter. I am pondering this remark in the documentation of Filter.getDocIdSet(AtomicReaderContext context, Bits acceptDocs):

  acceptDocs - Bits that represent the allowable docs to match (typically
  deleted docs but possibly filtering other documents)

I understand that getLiveDocs() returns the bit set that represents the NON-deleted documents, which seems to match the first part of the description (allowable docs). However, why does it say "typically deleted docs" in brackets? I had ignored this so far, but as I get zero results now, this might be relevant.

I am also thinking about how to possibly make use of a BitsFilteredDocIdSet in the following way:

  ChainedFilter filter = ...
  AtomicReaderContext atomic = ...
  Bits alldocs = atomic.reader().getLiveDocs();
  DocIdSet docids = filter.getDocIdSet(atomic, alldocs);
  BitsFilteredDocIdSet filtered = new BitsFilteredDocIdSet(docids, alldocs);
  Spans luceneSpans = sq.getSpans(atomic, filtered.bits(), termContexts);

However, the documentation of the constructor

  public BitsFilteredDocIdSet(DocIdSet innerSet, Bits acceptDocs)

does not make it clear to me whether I am applying the arguments correctly. I especially fail to understand the acceptDocs argument again:

  acceptDocs - Allowed docs, all docids not in this set will not be returned
  by this DocIdSet

Would this be the correct way to apply a filter on a SpanQuery?

Thanks!
Carsten
RE: Statically store sub-collections for search (faceted search?)
Hi,

> Okay, as described above, I would now pass either the result of
> getLiveDocs() or Bits.MatchAllDocuments() as the acceptDocs argument to
> getDocIdSet():
>
>   Map<Term, TermContext> termContexts = new HashMap<Term, TermContext>();
>   AtomicReaderContext atomic = ...
>   ChainedFilter filter = ...

You just pass getLiveDocs(), no null check needed. Using your code would bring a slowdown for indexes without deletions.

> This makes sense to me, but now I get zero matches in all searches using
> the filter. I am pondering this remark in the documentation of
> Filter.getDocIdSet(AtomicReaderContext context, Bits acceptDocs):
>
>   acceptDocs - Bits that represent the allowable docs to match (typically
>   deleted docs but possibly filtering other documents)

This just means you can pass liveDocs as obtained from AtomicReader (live == inverse of deleted docs), but you can also pass any other Bits implementation that may remove more documents from the results. This is what you are doing with spans. Passing null means all documents are allowed; if this were not the case, whole Lucene queries and filters would not work at all. So if you get 0 docs, you must have missed something else; if that is not the case, your filter may behave wrongly. Look at e.g. FilteredQuery, IndexSearcher, or any other query in Lucene that handles acceptDocs - those pass getLiveDocs() down. If they are null, that means all documents are allowed. The javadocs on Scorer/Filter/... should be clearer about this. Can you open an issue about the javadocs?

> I understand that getLiveDocs() returns the bit set that represents the
> NON-deleted documents, which seems to match the first part of the
> description (allowable docs). However, why does it say "typically deleted
> docs" in brackets? I had ignored this so far, but as I get zero results
> now, this might be relevant.

See above.

> I am also thinking about how to possibly make use of a
> BitsFilteredDocIdSet in the following way:
>
>   ChainedFilter filter = ...
>   AtomicReaderContext atomic = ...
>   Bits alldocs = atomic.reader().getLiveDocs();
>   DocIdSet docids = filter.getDocIdSet(atomic, alldocs);
>   BitsFilteredDocIdSet filtered = new BitsFilteredDocIdSet(docids, alldocs);
>   Spans luceneSpans = sq.getSpans(atomic, filtered.bits(), termContexts);

You should use BitsFilteredDocIdSet.wrap(); the constructor does not do null checks.

> Would this be the correct way to apply a filter on a SpanQuery?

new FilteredQuery(SpanQuery, Filter)?

Uwe
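Uwe's two suggestions, BitsFilteredDocIdSet.wrap() instead of the constructor and FilteredQuery for applying a Filter to a SpanQuery at search time, might look like the following. An untested sketch against the Lucene 4.x API; the class `FilterOnSpanQuery` and its method names are invented for illustration:

```java
import java.io.IOException;

import org.apache.lucene.search.BitsFilteredDocIdSet;
import org.apache.lucene.search.DocIdSet;
import org.apache.lucene.search.Filter;
import org.apache.lucene.search.FilteredQuery;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.spans.SpanQuery;
import org.apache.lucene.util.Bits;

public class FilterOnSpanQuery {
    // Search-time filtering: FilteredQuery combines any Query (a SpanQuery
    // is a Query) with a Filter; the searcher handles acceptDocs internally.
    static TopDocs search(IndexSearcher searcher, SpanQuery sq, Filter filter)
            throws IOException {
        return searcher.search(new FilteredQuery(sq, filter), 10);
    }

    // Null-safe intersection of a DocIdSet with acceptDocs: wrap() handles
    // a null docIdSet and a null acceptDocs, unlike the public constructor.
    static DocIdSet intersect(DocIdSet docIdSet, Bits acceptDocs) {
        return BitsFilteredDocIdSet.wrap(docIdSet, acceptDocs);
    }
}
```

This covers the case where the top-N results are enough; as the next message explains, it does not help when the Spans object itself must be processed.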
Re: Statically store sub-collections for search (faceted search?)
On 15.04.2013 13:43, Uwe Schindler wrote:
> Passing null means all documents are allowed; if this were not the case,
> whole Lucene queries and filters would not work at all. So if you get 0
> docs, you must have missed something else; if that is not the case, your
> filter may behave wrongly. Look at e.g. FilteredQuery, IndexSearcher, or
> any other query in Lucene that handles acceptDocs - those pass
> getLiveDocs() down. If they are null, that means all documents are
> allowed. The javadocs on Scorer/Filter/... should be clearer about this.
> Can you open an issue about the javadocs?

I'll open an issue as soon as I have understood how this should be corrected. :)

I think I've pinpointed my problem: I use a TermsFilter, get a DocIdSet with TermsFilter.getDocIdSet(atomic, atomic.reader().getLiveDocs()), and eventually retrieve a Bits object from that with DocIdSet.bits(). However, the latter always returns null. Wrapping the TermsFilter in a CachingWrapperFilter doesn't change that. I was using a QueryWrapperFilter before, which would give me a DocIdSet object from which I could get a proper Bits object to pass to SpanQuery.getSpans(). Is there any way I could extract a Bits object from a TermsFilter?

> > Would this be the correct way to apply a filter on a SpanQuery?
> new FilteredQuery(SpanQuery, Filter)?

Okay, I formulated the question wrongly. I need to call SpanQuery.getSpans() because I have to process the resulting Spans object. Therefore, I actually meant: what is the general way to generate a Bits object from a Filter that can be used as the acceptDocs argument?

Best,
Carsten
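One general way to get a random-access Bits view out of any Filter, including a TermsFilter whose DocIdSet.bits() returns null, is to drain the DocIdSetIterator into a FixedBitSet, which itself implements Bits. The following is an untested sketch against the Lucene 4.x API; the class `FilterBits` and its method name are invented for illustration:

```java
import java.io.IOException;

import org.apache.lucene.index.AtomicReaderContext;
import org.apache.lucene.search.DocIdSet;
import org.apache.lucene.search.DocIdSetIterator;
import org.apache.lucene.search.Filter;
import org.apache.lucene.util.Bits;
import org.apache.lucene.util.FixedBitSet;

public final class FilterBits {
    // Returns a Bits view of the filter's matches in this segment,
    // or null if the filter matches no documents here.
    public static Bits bits(Filter filter, AtomicReaderContext atomic)
            throws IOException {
        DocIdSet docIdSet =
            filter.getDocIdSet(atomic, atomic.reader().getLiveDocs());
        if (docIdSet == null) {
            return null; // no matches; the caller should skip this segment
        }
        Bits bits = docIdSet.bits();
        if (bits != null) {
            return bits; // the DocIdSet supports random access directly
        }
        // No random-access view available: materialize the iterator.
        FixedBitSet fbs = new FixedBitSet(atomic.reader().maxDoc());
        DocIdSetIterator it = docIdSet.iterator();
        if (it != null) {
            fbs.or(it); // sets one bit per matching doc id
        }
        return fbs;
    }
}
```

The resulting FixedBitSet can then be passed as the acceptDocs argument to SpanQuery.getSpans() for that segment. Materializing costs one pass over the filter's matches per segment, so caching the bit set (or the filter) is worth considering for repeated searches.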
Re: Statically store sub-collections for search (faceted search?)
Hi Uwe,

Thanks for the info; I was under the impression that it didn't... I got this info (that filters don't have a limit because they are not scoring) from a document like the one below. I can't say this is the exact doc because it's been a while since I saw it, though.

http://searchhub.org/2009/06/08/bringing-the-highlighter-back-to-wildcard-queries-in-solr-14/

"As a response to this performance pitfall on very large indices (and the infamous TooManyClauses exception), new queries were developed that relied on a new Query class called ConstantScoreQuery. ConstantScoreQuerys accept a filter of matching documents and then score with a constant value equal to the boost. Depending on the qualities of your index, this method can be faster than the Boolean expansion method, and more importantly, does not suffer from TooManyClauses exceptions. Rather than matching and scoring n BooleanQuery clauses (potentially thousands of clauses), a single filter is enumerated and then traversed for scoring. On the other hand, constructing and scoring with a BooleanQuery containing a few clauses is likely to be much faster than constructing and traversing a Filter."

-sujit

On Apr 15, 2013, at 1:04 AM, Uwe Schindler wrote:
> The limit also applies for filters. If you have a list of terms ORed
> together, the fastest way is not to use a BooleanQuery at all, but
> instead a TermsFilter (which has no limits).
>
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
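Uwe's TermsFilter suggestion, applied to the original use case of restricting a search to thousands of stored document IDs, might look like this untested Lucene 4.x sketch; `idsFromDatabase` and the field name "id" are placeholders taken from the problem description:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.apache.lucene.index.Term;
import org.apache.lucene.queries.TermsFilter;

public class SubCollectionFilter {
    // Build a filter that matches any document whose "id" field equals one
    // of the stored external IDs. Unlike a BooleanQuery of TermQueries,
    // a TermsFilter has no maxClauseCount limit.
    static TermsFilter build(List<String> idsFromDatabase) {
        List<Term> idTerms = new ArrayList<Term>();
        for (String id : idsFromDatabase) {
            idTerms.add(new Term("id", id));
        }
        return new TermsFilter(idTerms);
    }

    public static void main(String[] args) {
        // e.g. IDs previously loaded from the database
        TermsFilter f = build(Arrays.asList("Doc1", "Doc2", "Doc3"));
        System.out.println(f); // filter ORs the three id terms
    }
}
```

The filter can then be combined with any query (e.g. via FilteredQuery) or evaluated per segment with getDocIdSet().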
RE: Statically store sub-collections for search (faceted search?)
Hi,

> Thanks for the info; I was under the impression that it didn't... I got
> this info (that filters don't have a limit because they are not scoring)
> from a document like the one below.
> [...]
> "Rather than matching and scoring n BooleanQuery clauses (potentially
> thousands of clauses), a single filter is enumerated and then traversed
> for scoring."

This is true, but you misunderstood it: this is about MultiTermQueries (MultiTermQuery is the superclass of WildcardQuery, fuzzy, and range queries). Those queries are not native Lucene queries, so they rewrite to basic/native queries. In earlier Lucene versions, wildcards were always rewritten to BooleanQueries with many TermQueries (one for each term that matches the wildcard), leading to the problem with too many terms. This is still the case, but only within some limits (this mode is only used if the wildcard expands to few terms). Those BooleanQueries are then used with ConstantScoreQuery(Query).

The above text talks about another mode (which is used for many terms today): *no* BooleanQuery is built at all; instead, all matching terms' documents are marked in a BitSet, and this BitSet is used with a Filter to construct a different query type: ConstantScoreQuery(Filter). The BooleanQuery max clause count does not apply, because no BooleanQuery is involved in the whole process. If you use ConstantScoreQuery(BooleanQuery), the limit still applies, but not for ConstantScoreQuery(internalWildcardFilter).

Uwe
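The two rewrite modes described above can be selected explicitly on any MultiTermQuery. An untested Lucene 4.x sketch; the field name "text" and the wildcard pattern are made-up examples:

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.MultiTermQuery;
import org.apache.lucene.search.WildcardQuery;

public class RewriteModes {
    public static void main(String[] args) {
        WildcardQuery wq = new WildcardQuery(new Term("text", "foo*"));

        // Filter mode: no BooleanQuery is built, so maxClauseCount can
        // never be hit, however many terms the wildcard expands to.
        wq.setRewriteMethod(MultiTermQuery.CONSTANT_SCORE_FILTER_REWRITE);

        // The default, CONSTANT_SCORE_AUTO_REWRITE_DEFAULT, switches
        // between the BooleanQuery mode (few expanded terms) and the
        // filter mode (many terms) automatically.
    }
}
```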
Re: Statically store sub-collections for search (faceted search?)
Hi Uwe,

I see, that makes sense. Thanks very much for the info, and sorry about giving you wrong info, Carsten.

-sujit

On Apr 15, 2013, at 1:06 PM, Uwe Schindler wrote:
> This is true, but you misunderstood it: this is about MultiTermQueries
> (MultiTermQuery is the superclass of WildcardQuery, fuzzy, and range
> queries). Those queries are not native Lucene queries, so they rewrite
> to basic/native queries. In earlier Lucene versions, wildcards were
> always rewritten to BooleanQueries with many TermQueries (one for each
> term that matches the wildcard), leading to the problem with too many
> terms. This is still the case, but only within some limits (this mode is
> only used if the wildcard expands to few terms). Those BooleanQueries
> are then used with ConstantScoreQuery(Query).
>
> The above text talks about another mode (which is used for many terms
> today): *no* BooleanQuery is built at all; instead, all matching terms'
> documents are marked in a BitSet, and this BitSet is used with a Filter
> to construct a different query type: ConstantScoreQuery(Filter). The
> BooleanQuery max clause count does not apply, because no BooleanQuery is
> involved in the whole process. If you use
> ConstantScoreQuery(BooleanQuery), the limit still applies, but not for
> ConstantScoreQuery(internalWildcardFilter).
>
> Uwe
Re: Statically store sub-collections for search (faceted search?)
Hi Carsten,

You're right that Lucene document numbers are ephemeral, but they are consistent for a certain IndexReader instance. So perhaps you can use SearcherLifetimeManager to obtain a 'version' of the reader that returned the original results and store a bitset together with that version. Then, when the user further searches this subset of documents, you pull the relevant reader from the SearcherLifetimeManager given the 'version' information. I think you can write your own Pruner which prunes IndexReader instances/versions when their corresponding doc-subset tables are no longer needed...

Shai

On Fri, Apr 12, 2013 at 9:08 PM, SUJIT PAL <sujit@comcast.net> wrote:
> Hi Carsten, why not use your idea of the BooleanQuery but wrap it in a
> Filter instead? Since you are not doing any scoring (only filtering),
> the max boolean clauses limit should not apply to a filter.
>
> -sujit
>
> On Apr 12, 2013, at 7:34 AM, Carsten Schnober wrote:
>> Dear list,
>> I would like to create a sub-set of the documents in an index that is
>> to be used for further searches. However, the criteria that lead to the
>> creation of that sub-set are not predefined, so I think that faceted
>> search cannot be applied to my use case.
>> For instance: a user searches for documents that contain token 'A' in a
>> field 'text'. These results form a set of documents that is
>> persistently stored (in a database). Each document in the index has a
>> field 'id' that identifies it, so these external IDs are stored in the
>> database. Later on, a user loads the document IDs from the database and
>> wants to execute another search on this set of documents only.
>> However, performing a search on the full index and subsequently
>> filtering the results against that list of documents takes very long
>> if there are many matches. This is obvious, as I have to retrieve the
>> external id from each matching document and check whether it is part
>> of the desired sub-set. Constructing a BooleanQuery in the style
>>   id:Doc1 OR id:Doc2 ...
>> is not suitable either, because there could be thousands of documents,
>> exceeding any limit for Boolean clauses.
>> Any suggestions how to solve this? I would have gone for the Lucene
>> document numbers and stored them as a bit set that I could use as a
>> filter during later searches, but I read that the document numbers are
>> ephemeral. One possible way out seems to be to create another index
>> from the documents that matched the initial search, but this seems
>> quite an overkill, especially if there are plenty of them...
>> Thanks for any hint!
>> Carsten
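Shai's SearcherLifetimeManager idea, keeping the point-in-time reader that produced the first result set alive so a stored bitset of its per-reader-stable doc IDs stays valid, might be sketched like this. Untested, Lucene 4.x API; the token storage and the restricted search are placeholders:

```java
import java.io.IOException;

import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.SearcherLifetimeManager;

public class SubsetSessions {
    private final SearcherLifetimeManager mgr = new SearcherLifetimeManager();

    // After the initial search: record the searcher; persist the returned
    // token alongside the doc-id bitset (e.g. in the database).
    long remember(IndexSearcher searcher) throws IOException {
        return mgr.record(searcher);
    }

    // Later: re-acquire the exact same point-in-time view for the
    // follow-up search restricted to the stored bitset.
    void searchSubset(long token) throws IOException {
        IndexSearcher same = mgr.acquire(token);
        if (same == null) {
            // the reader was already pruned; the bitset is no longer valid
            return;
        }
        try {
            // ... run the restricted search against "same" ...
        } finally {
            mgr.release(same);
        }
    }

    // Periodically drop old readers; a custom Pruner could instead check
    // which doc-subset tables are still referenced.
    void cleanup() throws IOException {
        mgr.prune(new SearcherLifetimeManager.PruneByAge(600.0));
    }
}
```

The trade-off: held readers pin index files and memory until pruned, so the pruning policy has to match how long the stored subsets are expected to live.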
Re: Statically store sub-collections for search (faceted search?)
Hi Carsten,

Why not use your idea of the BooleanQuery but wrap it in a Filter instead? Since you are not doing any scoring (only filtering), the max boolean clauses limit should not apply to a filter.

-sujit

On Apr 12, 2013, at 7:34 AM, Carsten Schnober wrote:
> Dear list,
> I would like to create a sub-set of the documents in an index that is to
> be used for further searches. However, the criteria that lead to the
> creation of that sub-set are not predefined, so I think that faceted
> search cannot be applied to my use case.
> For instance: a user searches for documents that contain token 'A' in a
> field 'text'. These results form a set of documents that is persistently
> stored (in a database). Each document in the index has a field 'id' that
> identifies it, so these external IDs are stored in the database. Later
> on, a user loads the document IDs from the database and wants to execute
> another search on this set of documents only.
> However, performing a search on the full index and subsequently
> filtering the results against that list of documents takes very long if
> there are many matches. This is obvious, as I have to retrieve the
> external id from each matching document and check whether it is part of
> the desired sub-set. Constructing a BooleanQuery in the style
>   id:Doc1 OR id:Doc2 ...
> is not suitable either, because there could be thousands of documents,
> exceeding any limit for Boolean clauses.
> Any suggestions how to solve this? I would have gone for the Lucene
> document numbers and stored them as a bit set that I could use as a
> filter during later searches, but I read that the document numbers are
> ephemeral. One possible way out seems to be to create another index from
> the documents that matched the initial search, but this seems quite an
> overkill, especially if there are plenty of them...
> Thanks for any hint!
> Carsten