[jira] [Updated] (LUCENE-9609) When the term of more than 16, highlight the query does not return

2020-11-14 Thread WangFeiCheng (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

WangFeiCheng updated LUCENE-9609:
-
Description: 
I noticed that when a query contains too many terms, highlighting for that
query is restricted.

I know that TermInSetQuery rewrites itself to a BooleanQuery when it holds few
terms, using BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16 as the cutoff, to
improve query efficiency:
{code:java}
static final int BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16;

public Query rewrite(IndexReader reader) throws IOException {
  final int threshold = Math.min(BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD, BooleanQuery.getMaxClauseCount());
  if (termData.size() <= threshold) {
    BooleanQuery.Builder bq = new BooleanQuery.Builder();
    TermIterator iterator = termData.iterator();
    for (BytesRef term = iterator.next(); term != null; term = iterator.next()) {
      bq.add(new TermQuery(new Term(iterator.field(), BytesRef.deepCopyOf(term))), Occur.SHOULD);
    }
    return new ConstantScoreQuery(bq.build());
  }
  return super.rewrite(reader);
}
{code}
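
To make the cutoff concrete, here is a minimal plain-Java sketch of the threshold check in the snippet above. It is only a model and not Lucene code: the class name RewriteDecisionSketch and the method rewritesToBooleanQuery are made up for illustration, and the maxClauseCount parameter stands in for BooleanQuery.getMaxClauseCount().

```java
/** Plain-Java model of the threshold check; this is not Lucene code. */
final class RewriteDecisionSketch {
  // Same value as the constant quoted in the issue.
  static final int BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16;

  /**
   * Returns true when a term set of the given size would take the
   * BooleanQuery rewrite path. The maxClauseCount argument is a stand-in
   * for BooleanQuery.getMaxClauseCount().
   */
  static boolean rewritesToBooleanQuery(int termCount, int maxClauseCount) {
    final int threshold = Math.min(BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD, maxClauseCount);
    return termCount <= threshold;
  }
}
```

Under this model, with a maximum clause count of 1024, a set of 16 terms takes the BooleanQuery path and a set of 17 terms does not.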
When the query contains more than 16 terms, it is not rewritten and the
createWeight method of TermInSetQuery is used instead:
{code:java}
public Weight createWeight(IndexSearcher searcher, boolean needsScores, float boost) throws IOException {
  return new ConstantScoreWeight(this, boost) {

    @Override
    public void extractTerms(Set<Term> terms) {
      // no-op
      // This query is for abuse cases when the number of terms is too high to
      // run efficiently as a BooleanQuery. So likewise we hide its terms in
      // order to protect highlighters
    }

    ..
  };
}
{code}
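
The effect described here can be illustrated with a toy model, assuming the highlighter learns which words to mark by asking the query for its terms through extractTerms. Everything in the sketch is hypothetical (HiddenTermsSketch, visibleTerms, highlight); it uses plain strings rather than Lucene's Term and Weight classes and is not how any Lucene highlighter is actually written.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of term-based highlighting; all names are hypothetical, not Lucene API. */
final class HiddenTermsSketch {
  static final int THRESHOLD = 16;

  /** Stand-in for extractTerms: small term sets expose their terms, large ones expose none. */
  static Set<String> visibleTerms(List<String> queryTerms) {
    Set<String> visible = new HashSet<>();
    if (queryTerms.size() <= THRESHOLD) {
      visible.addAll(queryTerms); // rewritten path: each term is an ordinary clause
    }
    // Above the threshold nothing is added, mirroring the no-op extractTerms.
    return visible;
  }

  /** Wraps each space-separated token of text that is a visible query term in <b> tags. */
  static String highlight(String text, List<String> queryTerms) {
    Set<String> terms = visibleTerms(queryTerms);
    List<String> out = new ArrayList<>();
    for (String token : text.split(" ")) {
      out.add(terms.contains(token) ? "<b>" + token + "</b>" : token);
    }
    return String.join(" ", out);
  }
}
```

In this model a query with two terms marks matching words, while a query with 17 terms returns the text unchanged, because the highlighter never sees any terms and so does no per-term work.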
I want to ask: why does the comment say "we hide its terms in order to protect
highlighters"?

How is this protection of highlighters implemented?

 

 


> When the term of more than 16, highlight the query does not return
> --
>
> Key: LUCENE-9609
> URL: https://issues.apache.org/jira/browse/LUCENE-9609
> Project: Lucene - Core
> Issue Type: Wish
> Components: core/search
> Affects Versions: 7.7.3
> Reporter: WangFeiCheng
> Priority: Minor
>

[jira] [Updated] (LUCENE-9609) When the term of more than 16, highlight the query does not return

2020-11-13 Thread WangFeiCheng (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

WangFeiCheng updated LUCENE-9609:
-
Description: 
I noticed that when a query contains too many terms, highlighting for that
query is restricted.

I know that TermInSetQuery rewrites itself to a BooleanQuery when it holds few
terms, using BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16 as the cutoff, to
improve query efficiency:
{code:java}
static final int BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16;

public Query rewrite(IndexReader reader) throws IOException {
  final int threshold = Math.min(BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD, BooleanQuery.getMaxClauseCount());
  if (termData.size() <= threshold) {
    BooleanQuery.Builder bq = new BooleanQuery.Builder();
    TermIterator iterator = termData.iterator();
    for (BytesRef term = iterator.next(); term != null; term = iterator.next()) {
      bq.add(new TermQuery(new Term(iterator.field(), BytesRef.deepCopyOf(term))), Occur.SHOULD);
    }
    return new ConstantScoreQuery(bq.build());
  }
  return super.rewrite(reader);
}
{code}
When the query contains more than 16 terms, it is not rewritten and the
createWeight method of TermInSetQuery is used instead:
{code:java}
public Weight createWeight(IndexSearcher searcher, boolean needsScores, float boost) throws IOException {
  return new ConstantScoreWeight(this, boost) {

    @Override
    public void extractTerms(Set<Term> terms) {
      // no-op
      // This query is for abuse cases when the number of terms is too high to
      // run efficiently as a BooleanQuery. So likewise we hide its terms in
      // order to protect highlighters
    }

    ..
  };
}
{code}
I want to ask: why does the comment say "we hide its terms in order to protect
highlighters"?

Why does this threshold protect highlighters, and how is such protection
implemented?

 

 



[jira] [Updated] (LUCENE-9609) When the term of more than 16, highlight the query does not return

2020-11-13 Thread WangFeiCheng (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

WangFeiCheng updated LUCENE-9609:
-
Description: 
I noticed that when a query contains too many terms, highlighting for that
query is restricted.

I know that TermInSetQuery rewrites itself to a BooleanQuery when it holds few
terms, using BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16 as the cutoff, to
improve query efficiency:
{code:java}
static final int BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16;

public Query rewrite(IndexReader reader) throws IOException {
  final int threshold = Math.min(BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD, BooleanQuery.getMaxClauseCount());
  if (termData.size() <= threshold) {
    BooleanQuery.Builder bq = new BooleanQuery.Builder();
    TermIterator iterator = termData.iterator();
    for (BytesRef term = iterator.next(); term != null; term = iterator.next()) {
      bq.add(new TermQuery(new Term(iterator.field(), BytesRef.deepCopyOf(term))), Occur.SHOULD);
    }
    return new ConstantScoreQuery(bq.build());
  }
  return super.rewrite(reader);
}
{code}
When the query contains more than 16 terms, it is not rewritten and the
createWeight method of TermInSetQuery is used instead:
{code:java}
public Weight createWeight(IndexSearcher searcher, boolean needsScores, float boost) throws IOException {
  return new ConstantScoreWeight(this, boost) {

    @Override
    public void extractTerms(Set<Term> terms) {
      // no-op
      // This query is for abuse cases when the number of terms is too high to
      // run efficiently as a BooleanQuery. So likewise we hide its terms in
      // order to protect highlighters
    }

    ..
  };
}
{code}
I want to ask: why does the comment say "we hide its terms in order to protect
highlighters"?

Why does this threshold protect highlighters, and how is such protection
implemented?

 

 



[jira] [Updated] (LUCENE-9609) When the term of more than 16, highlight the query does not return

2020-11-13 Thread WangFeiCheng (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

WangFeiCheng updated LUCENE-9609:
-
Description: 
I noticed that when a query contains too many terms, highlighting for that
query is restricted.

I know that TermInSetQuery rewrites itself to a BooleanQuery when it holds few
terms, using BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16 as the cutoff, to
improve query efficiency:
{code:java}
static final int BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16;

public Query rewrite(IndexReader reader) throws IOException {
  final int threshold = Math.min(BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD, BooleanQuery.getMaxClauseCount());
  if (termData.size() <= threshold) {
    BooleanQuery.Builder bq = new BooleanQuery.Builder();
    TermIterator iterator = termData.iterator();
    for (BytesRef term = iterator.next(); term != null; term = iterator.next()) {
      bq.add(new TermQuery(new Term(iterator.field(), BytesRef.deepCopyOf(term))), Occur.SHOULD);
    }
    return new ConstantScoreQuery(bq.build());
  }
  return super.rewrite(reader);
}
{code}
However, when the query contains more than 16 terms, highlighting goes through
the extractTerms method of TermInSetQuery:

{code:java}
@Override
public void extractTerms(Set<Term> terms) {
  // no-op
  // This query is for abuse cases when the number of terms is too high to
  // run efficiently as a BooleanQuery. So likewise we hide its terms in
  // order to protect highlighters
}
{code}
I would like to ask why the comment says "So likewise we hide its terms in
order to protect highlighters".

Why does this threshold protect highlighting, and how is this "protection"
implemented?

 

 




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (LUCENE-9609) When the term of more than 16, highlight the query does not return

2020-11-13 Thread WangFeiCheng (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

WangFeiCheng updated LUCENE-9609:
-
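Description: (was: I noticed that when a query contains too many terms,
highlighting for that query is restricted.

I know that TermInSetQuery rewrites itself to a BooleanQuery when it holds few
terms, using BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16 as the cutoff, to
improve query efficiency:
{code:java}
static final int BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16;

public Query rewrite(IndexReader reader) throws IOException {
  final int threshold = Math.min(BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD, BooleanQuery.getMaxClauseCount());
  if (termData.size() <= threshold) {
    BooleanQuery.Builder bq = new BooleanQuery.Builder();
    TermIterator iterator = termData.iterator();
    for (BytesRef term = iterator.next(); term != null; term = iterator.next()) {
      bq.add(new TermQuery(new Term(iterator.field(), BytesRef.deepCopyOf(term))), Occur.SHOULD);
    }
    return new ConstantScoreQuery(bq.build());
  }
  return super.rewrite(reader);
}
{code}
However, when the query contains more than 16 terms, highlighting goes through
the extractTerms method of TermInSetQuery:

{code:java}
@Override
public void extractTerms(Set<Term> terms) {
  // no-op
  // This query is for abuse cases when the number of terms is too high to
  // run efficiently as a BooleanQuery. So likewise we hide its terms in
  // order to protect highlighters
}
{code}
I would like to ask why the comment says "So likewise we hide its terms in
order to protect highlighters".

Why does this threshold protect highlighting, and how is this "protection"
implemented?

 )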








[jira] [Created] (LUCENE-9609) When the term of more than 16, highlight the query does not return

2020-11-13 Thread WangFeiCheng (Jira)
WangFeiCheng created LUCENE-9609:


 Summary: When the term of more than 16, highlight the query does not return
 Key: LUCENE-9609
 URL: https://issues.apache.org/jira/browse/LUCENE-9609
 Project: Lucene - Core
 Issue Type: Wish
 Components: core/search
 Affects Versions: 7.7.3
 Reporter: WangFeiCheng


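I noticed that when a query contains too many terms, highlighting for that
query is restricted.

I know that TermInSetQuery rewrites itself to a BooleanQuery when it holds few
terms, using BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16 as the cutoff, to
improve query efficiency:
{code:java}
static final int BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16;

public Query rewrite(IndexReader reader) throws IOException {
  final int threshold = Math.min(BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD, BooleanQuery.getMaxClauseCount());
  if (termData.size() <= threshold) {
    BooleanQuery.Builder bq = new BooleanQuery.Builder();
    TermIterator iterator = termData.iterator();
    for (BytesRef term = iterator.next(); term != null; term = iterator.next()) {
      bq.add(new TermQuery(new Term(iterator.field(), BytesRef.deepCopyOf(term))), Occur.SHOULD);
    }
    return new ConstantScoreQuery(bq.build());
  }
  return super.rewrite(reader);
}
{code}
However, when the query contains more than 16 terms, highlighting goes through
the extractTerms method of TermInSetQuery:

{code:java}
@Override
public void extractTerms(Set<Term> terms) {
  // no-op
  // This query is for abuse cases when the number of terms is too high to
  // run efficiently as a BooleanQuery. So likewise we hide its terms in
  // order to protect highlighters
}
{code}
I would like to ask why the comment says "So likewise we hide its terms in
order to protect highlighters".

Why does this threshold protect highlighting, and how is this "protection"
implemented?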

 

 


