[jira] [Updated] (LUCENE-9609) When the term of more than 16, highlight the query does not return
[ https://issues.apache.org/jira/browse/LUCENE-9609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

WangFeiCheng updated LUCENE-9609:
---------------------------------
Description:     (was: I noticed that when there are too many terms, the highlighted query is restricted

I know that in TermInSetQuery, when there are fewer entries, BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16 is used to improve query efficiency
{code:java}
static final int BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16;

public Query rewrite(IndexReader reader) throws IOException {
  final int threshold = Math.min(BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD, BooleanQuery.getMaxClauseCount());
  if (termData.size() <= threshold) {
    BooleanQuery.Builder bq = new BooleanQuery.Builder();
    TermIterator iterator = termData.iterator();
    for (BytesRef term = iterator.next(); term != null; term = iterator.next()) {
      bq.add(new TermQuery(new Term(iterator.field(), BytesRef.deepCopyOf(term))), Occur.SHOULD);
    }
    return new ConstantScoreQuery(bq.build());
  }
  return super.rewrite(reader);
}
{code}
However, when the query contains more than 16 terms, the extractTerms method in TermInSetQuery is used
{code:java}
@Override
public void extractTerms(Set<Term> terms) {
  // no-op
  // This query is for abuse cases when the number of terms is too high to
  // run efficiently as a BooleanQuery. So likewise we hide its terms in
  // order to protect highlighters
}
{code}
I want to ask: why does the comment say "So likewise we hide its terms in order to protect highlighters"?

Why can this threshold protect highlighting, and how is this "protection" implemented?
)

> When the term of more than 16, highlight the query does not return
> ------------------------------------------------------------------
>
>                 Key: LUCENE-9609
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9609
>             Project: Lucene - Core
>          Issue Type: Wish
>          Components: core/search
>    Affects Versions: 7.7.3
>            Reporter: WangFeiCheng
>            Priority: Minor
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (LUCENE-9609) When the term of more than 16, highlight the query does not return
[ https://issues.apache.org/jira/browse/LUCENE-9609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

WangFeiCheng updated LUCENE-9609:
---------------------------------
Description:
I noticed that when there are too many terms, the highlighted query is restricted

I know that in TermInSetQuery, when there are fewer entries, BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16 is used to improve query efficiency
{code:java}
static final int BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16;

public Query rewrite(IndexReader reader) throws IOException {
  final int threshold = Math.min(BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD, BooleanQuery.getMaxClauseCount());
  if (termData.size() <= threshold) {
    BooleanQuery.Builder bq = new BooleanQuery.Builder();
    TermIterator iterator = termData.iterator();
    for (BytesRef term = iterator.next(); term != null; term = iterator.next()) {
      bq.add(new TermQuery(new Term(iterator.field(), BytesRef.deepCopyOf(term))), Occur.SHOULD);
    }
    return new ConstantScoreQuery(bq.build());
  }
  return super.rewrite(reader);
}
{code}
However, when the query contains more than 16 terms, the extractTerms method in TermInSetQuery is used
{code:java}
@Override
public void extractTerms(Set<Term> terms) {
  // no-op
  // This query is for abuse cases when the number of terms is too high to
  // run efficiently as a BooleanQuery. So likewise we hide its terms in
  // order to protect highlighters
}
{code}
I want to ask: why does the comment say "So likewise we hide its terms in order to protect highlighters"?

Why can this threshold protect highlighting, and how is this "protection" implemented?
> When the term of more than 16, highlight the query does not return
> ------------------------------------------------------------------
>
>                 Key: LUCENE-9609
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9609
>             Project: Lucene - Core
>          Issue Type: Wish
>          Components: core/search
>    Affects Versions: 7.7.3
>            Reporter: WangFeiCheng
>            Priority: Minor
>
> I noticed that when there are too many terms, the highlighted query is restricted
> I know that in TermInSetQuery, when there are fewer entries, BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16 is used to improve query efficiency
> {code:java}
> static final int BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16;
>
> public Query rewrite(IndexReader reader) throws IOException {
>   final int threshold = Math.min(BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD, BooleanQuery.getMaxClauseCount());
>   if (termData.size() <= threshold) {
>     BooleanQuery.Builder bq = new BooleanQuery.Builder();
>     TermIterator iterator = termData.iterator();
>     for (BytesRef term = iterator.next(); term != null; term = iterator.next()) {
>       bq.add(new TermQuery(new Term(iterator.field(), BytesRef.deepCopyOf(term))), Occur.SHOULD);
>     }
>     return new ConstantScoreQuery(bq.build());
>   }
>   return super.rewrite(reader);
> }
> {code}
> However, when the query contains more than 16 terms, the extractTerms method in TermInSetQuery is used
> {code:java}
> @Override
> public void extractTerms(Set<Term> terms) {
>   // no-op
>   // This query is for abuse cases when the number of terms is too high to
>   // run efficiently as a BooleanQuery. So likewise we hide its terms in
>   // order to protect highlighters
> }
> {code}
> I want to ask: why does the comment say "So likewise we hide its terms in order to protect highlighters"?
> Why can this threshold protect highlighting, and how is this "protection" implemented?
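To make the behaviour asked about in this issue concrete, here is a minimal plain-Java sketch of the two code paths quoted above. It does not use Lucene. The class `HighlightTermsSketch`, the method `visibleTerms`, and the `maxClauseCount` parameter (standing in for `BooleanQuery.getMaxClauseCount()`) are hypothetical names chosen for illustration. It assumes a highlighter that only sees the terms a query exposes, and that the term set is de-duplicated before its size is compared with the threshold.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy model of the two code paths quoted in the issue; this is not Lucene code. */
public class HighlightTermsSketch {
    // Same value as BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD in the quoted snippet.
    public static final int BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16;

    /**
     * Returns the terms that a term-extracting highlighter could see.
     * maxClauseCount is a stand-in for BooleanQuery.getMaxClauseCount().
     */
    public static Set<String> visibleTerms(List<String> terms, int maxClauseCount) {
        // Assumption: duplicates are collapsed before the size check.
        Set<String> unique = new LinkedHashSet<>(terms);
        int threshold = Math.min(BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD, maxClauseCount);
        if (unique.size() <= threshold) {
            // Small set: modelled as a rewrite to SHOULD clauses, so every term is exposed.
            return unique;
        }
        // Large set: extractTerms is a no-op in the quoted code, so nothing is exposed.
        return Collections.emptySet();
    }

    public static void main(String[] args) {
        List<String> terms = new ArrayList<>();
        for (int i = 1; i <= 17; i++) {
            terms.add("term" + i);
        }
        // 1024 is only a placeholder for the max clause count in this sketch.
        System.out.println(visibleTerms(terms.subList(0, 16), 1024).size()); // prints 16
        System.out.println(visibleTerms(terms, 1024).size());                // prints 0
    }
}
```

Under these assumptions, a query with 16 terms still exposes all of them, while a query with 17 terms exposes none, which matches the symptom in the issue title.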
[jira] [Updated] (LUCENE-9609) When the term of more than 16, highlight the query does not return
[ https://issues.apache.org/jira/browse/LUCENE-9609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

WangFeiCheng updated LUCENE-9609:
---------------------------------
Description:
I noticed that when there are too many terms, the highlighted query is restricted

I know that in TermInSetQuery, when there are fewer terms, BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16 will be used to improve query efficiency
{code:java}
static final int BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16;

public Query rewrite(IndexReader reader) throws IOException {
  final int threshold = Math.min(BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD, BooleanQuery.getMaxClauseCount());
  if (termData.size() <= threshold) {
    BooleanQuery.Builder bq = new BooleanQuery.Builder();
    TermIterator iterator = termData.iterator();
    for (BytesRef term = iterator.next(); term != null; term = iterator.next()) {
      bq.add(new TermQuery(new Term(iterator.field(), BytesRef.deepCopyOf(term))), Occur.SHOULD);
    }
    return new ConstantScoreQuery(bq.build());
  }
  return super.rewrite(reader);
}
{code}
When the term of the query statement exceeds 16, the createWeight method in TermInSetQuery will be used
{code:java}
public Weight createWeight(IndexSearcher searcher, boolean needsScores, float boost) throws IOException {
  return new ConstantScoreWeight(this, boost) {
    @Override
    public void extractTerms(Set<Term> terms) {
      // no-op
      // This query is for abuse cases when the number of terms is too high to
      // run efficiently as a BooleanQuery. So likewise we hide its terms in
      // order to protect highlighters
    }
    ..
}
{code}
I want to ask, why do I say "we hide its terms in order to protect highlighters"

Why this threshold can highlight protection, or how to implement such " protect highlighters"?
was:
I noticed that when there are too many terms, the highlighted query is restricted

I know that in TermInSetQuery, when there are fewer entries, BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16 is used to improve query efficiency
{code:java}
static final int BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16;

public Query rewrite(IndexReader reader) throws IOException {
  final int threshold = Math.min(BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD, BooleanQuery.getMaxClauseCount());
  if (termData.size() <= threshold) {
    BooleanQuery.Builder bq = new BooleanQuery.Builder();
    TermIterator iterator = termData.iterator();
    for (BytesRef term = iterator.next(); term != null; term = iterator.next()) {
      bq.add(new TermQuery(new Term(iterator.field(), BytesRef.deepCopyOf(term))), Occur.SHOULD);
    }
    return new ConstantScoreQuery(bq.build());
  }
  return super.rewrite(reader);
}
{code}
However, when the query contains more than 16 terms, the extractTerms method in TermInSetQuery is used
{code:java}
@Override
public void extractTerms(Set<Term> terms) {
  // no-op
  // This query is for abuse cases when the number of terms is too high to
  // run efficiently as a BooleanQuery. So likewise we hide its terms in
  // order to protect highlighters
}
{code}
I want to ask: why does the comment say "So likewise we hide its terms in order to protect highlighters"?

Why can this threshold protect highlighting, and how is this "protection" implemented?
[jira] [Updated] (LUCENE-9609) When the term of more than 16, highlight the query does not return
[ https://issues.apache.org/jira/browse/LUCENE-9609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

WangFeiCheng updated LUCENE-9609:
---------------------------------
Description:
I noticed that when there are too many terms, the highlighted query is restricted

I know that in TermInSetQuery, when there are fewer terms, BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16 will be used to improve query efficiency
{code:java}
static final int BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16;

public Query rewrite(IndexReader reader) throws IOException {
  final int threshold = Math.min(BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD, BooleanQuery.getMaxClauseCount());
  if (termData.size() <= threshold) {
    BooleanQuery.Builder bq = new BooleanQuery.Builder();
    TermIterator iterator = termData.iterator();
    for (BytesRef term = iterator.next(); term != null; term = iterator.next()) {
      bq.add(new TermQuery(new Term(iterator.field(), BytesRef.deepCopyOf(term))), Occur.SHOULD);
    }
    return new ConstantScoreQuery(bq.build());
  }
  return super.rewrite(reader);
}
{code}
When the term of the query statement exceeds 16, the createWeight method in TermInSetQuery will be used
{code:java}
public Weight createWeight(IndexSearcher searcher, boolean needsScores, float boost) throws IOException {
  return new ConstantScoreWeight(this, boost) {
    @Override
    public void extractTerms(Set<Term> terms) {
      // no-op
      // This query is for abuse cases when the number of terms is too high to
      // run efficiently as a BooleanQuery. So likewise we hide its terms in
      // order to protect highlighters
    }
    ..
}
{code}
I want to ask, why do you say "we hide its terms in order to protect highlighters"

Why this threshold can highlight protection, or how to implement such " protect highlighters"?
was:
I noticed that when there are too many terms, the highlighted query is restricted

I know that in TermInSetQuery, when there are fewer terms, BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16 will be used to improve query efficiency
{code:java}
static final int BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16;

public Query rewrite(IndexReader reader) throws IOException {
  final int threshold = Math.min(BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD, BooleanQuery.getMaxClauseCount());
  if (termData.size() <= threshold) {
    BooleanQuery.Builder bq = new BooleanQuery.Builder();
    TermIterator iterator = termData.iterator();
    for (BytesRef term = iterator.next(); term != null; term = iterator.next()) {
      bq.add(new TermQuery(new Term(iterator.field(), BytesRef.deepCopyOf(term))), Occur.SHOULD);
    }
    return new ConstantScoreQuery(bq.build());
  }
  return super.rewrite(reader);
}
{code}
When the term of the query statement exceeds 16, the createWeight method in TermInSetQuery will be used
{code:java}
public Weight createWeight(IndexSearcher searcher, boolean needsScores, float boost) throws IOException {
  return new ConstantScoreWeight(this, boost) {
    @Override
    public void extractTerms(Set<Term> terms) {
      // no-op
      // This query is for abuse cases when the number of terms is too high to
      // run efficiently as a BooleanQuery. So likewise we hide its terms in
      // order to protect highlighters
    }
    ..
}
{code}
I want to ask, why do I say "we hide its terms in order to protect highlighters"

Why this threshold can highlight protection, or how to implement such " protect highlighters"?
[jira] [Updated] (LUCENE-9609) When the term of more than 16, highlight the query does not return
[ https://issues.apache.org/jira/browse/LUCENE-9609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

WangFeiCheng updated LUCENE-9609:
---------------------------------
Description:
I noticed that when there are too many terms, the highlighted query is restricted

I know that in TermInSetQuery, when there are fewer terms, BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16 will be used to improve query efficiency
{code:java}
static final int BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16;

public Query rewrite(IndexReader reader) throws IOException {
  final int threshold = Math.min(BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD, BooleanQuery.getMaxClauseCount());
  if (termData.size() <= threshold) {
    BooleanQuery.Builder bq = new BooleanQuery.Builder();
    TermIterator iterator = termData.iterator();
    for (BytesRef term = iterator.next(); term != null; term = iterator.next()) {
      bq.add(new TermQuery(new Term(iterator.field(), BytesRef.deepCopyOf(term))), Occur.SHOULD);
    }
    return new ConstantScoreQuery(bq.build());
  }
  return super.rewrite(reader);
}
{code}
When the term of the query statement exceeds 16, the createWeight method in TermInSetQuery will be used
{code:java}
public Weight createWeight(IndexSearcher searcher, boolean needsScores, float boost) throws IOException {
  return new ConstantScoreWeight(this, boost) {
    @Override
    public void extractTerms(Set<Term> terms) {
      // no-op
      // This query is for abuse cases when the number of terms is too high to
      // run efficiently as a BooleanQuery. So likewise we hide its terms in
      // order to protect highlighters
    }
    ..
}
{code}
I want to ask, why do you say "we hide its terms in order to protect highlighters"

How to implement such " protect highlighters"?
was:
I noticed that when there are too many terms, the highlighted query is restricted

I know that in TermInSetQuery, when there are fewer terms, BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16 will be used to improve query efficiency
{code:java}
static final int BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16;

public Query rewrite(IndexReader reader) throws IOException {
  final int threshold = Math.min(BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD, BooleanQuery.getMaxClauseCount());
  if (termData.size() <= threshold) {
    BooleanQuery.Builder bq = new BooleanQuery.Builder();
    TermIterator iterator = termData.iterator();
    for (BytesRef term = iterator.next(); term != null; term = iterator.next()) {
      bq.add(new TermQuery(new Term(iterator.field(), BytesRef.deepCopyOf(term))), Occur.SHOULD);
    }
    return new ConstantScoreQuery(bq.build());
  }
  return super.rewrite(reader);
}
{code}
When the term of the query statement exceeds 16, the createWeight method in TermInSetQuery will be used
{code:java}
public Weight createWeight(IndexSearcher searcher, boolean needsScores, float boost) throws IOException {
  return new ConstantScoreWeight(this, boost) {
    @Override
    public void extractTerms(Set<Term> terms) {
      // no-op
      // This query is for abuse cases when the number of terms is too high to
      // run efficiently as a BooleanQuery. So likewise we hide its terms in
      // order to protect highlighters
    }
    ..
}
{code}
I want to ask, why do you say "we hide its terms in order to protect highlighters"

Why this threshold can highlight protection, or how to implement such " protect highlighters"?