[jira] [Commented] (LUCENE-7698) CommonGramsQueryFilter in the query analyzer chain breaks phrase queries

2019-04-03 Thread Zhu JiaJun (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-7698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808465#comment-16808465
 ] 

Zhu JiaJun commented on LUCENE-7698:


Hi [~emaijala],

I followed the steps and found that a query on "hello with an accent" returns 
an empty result, although it should match the field "features":"Good unicode 
support: héllo (hello with an accent over the e)" of the document (id: 
SOLR1000). I'm trying to apply "CommonGramsFilterFactory + StopFilterFactory" 
in the query analyzer for our Solr environment, but this issue causes some 
queries to return empty results.

The issue seems to happen when two stopwords are adjacent in a phrase search, 
for example "with an". Also, I didn't add "an" to the stopwords.txt file, but 
in the Analysis Tool I found that CommonGramsFilterFactory still treats it as 
a stopword. 

JiaJun

> CommonGramsQueryFilter in the query analyzer chain breaks phrase queries
> 
>
> Key: LUCENE-7698
> URL: https://issues.apache.org/jira/browse/LUCENE-7698
> Project: Lucene - Core
> Issue Type: Bug
> Components: core/queryparser
> Affects Versions: 6.4, 6.4.1
> Reporter: Ere Maijala
> Priority: Blocker
> Labels: regression
> Fix For: 6.4.2, 7.0
>
> Attachments: LUCENE-7698.patch
>
>
> (Please pardon me if the project or component is wrong!)
> CommonGramsQueryFilter breaks phrase queries. The behavior also seems to 
> change with addition or removal of adjacent terms.
> Steps to reproduce:
> 1.) Download and extract Solr (in my test case version 6.4.1) somewhere.
> 2.) Edit 
> server/solr/configsets/sample_techproducts_configs/conf/managed-schema and 
> modify the text_general fieldType by adding CommonGrams(Query)Filter before 
> the stop filter:
>  positionIncrementGap="100">
>   
> 
>  words="stopwords.txt" />
>  words="stopwords.txt" />
> 
> 
>   
>   
> 
>  words="stopwords.txt"/>
>  words="stopwords.txt" />
>  ignoreCase="true" expand="true"/>
> 
>   
> 
> 3.) Add "with" to 
> server/solr/configsets/sample_techproducts_configs/conf/stopwords.txt and 
> make sure the file has correct line endings (as extracted from the Solr zip 
> it seems to contain DOS/Windows line endings, which may break things).
> 4.) Run the techproducts example with "bin/solr -e techproducts"
> 5.) Browse to 
> 
> 6.) Observe that parsedquery in the debug output is empty
> 7.) Browse to 
> 
> 8.) Observe that parsedquery contains ipod_with as expected but not 
> with_video.
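
For step 3 of the quoted report, a minimal sketch of one way to check a 
stopwords file for DOS/Windows line endings and normalize them. The class and 
method names are hypothetical and nothing here is Solr API; it works on the 
file content as a plain string.

```java
class StopwordLineEndings {
    /** True if the content contains any carriage return (CRLF or bare CR). */
    static boolean hasCarriageReturns(String content) {
        return content.indexOf('\r') >= 0;
    }

    /** Converts CRLF and bare CR line endings to LF. */
    static String toUnixLineEndings(String content) {
        return content.replace("\r\n", "\n").replace("\r", "\n");
    }

    public static void main(String[] args) {
        // Illustrative content only; a real check would read stopwords.txt.
        String dos = "with\r\n";
        System.out.println(hasCarriageReturns(dos));
        System.out.println(hasCarriageReturns(toUnixLineEndings(dos)));
    }
}
```

In practice the content would come from reading the stopwords.txt file named 
in step 3 and writing the normalized text back.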



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7698) CommonGramsQueryFilter in the query analyzer chain breaks phrase queries

2017-02-22 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877989#comment-15877989
 ] 

ASF subversion and git services commented on LUCENE-7698:
-

Commit 92ff8682b281a28f40826de4b94548671e580bd8 in lucene-solr's branch 
refs/heads/branch_6_4 from Mike McCandless
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=92ff868 ]

LUCENE-7698: fix CommonGramsQueryFilter to not produce a disconnected token 
graph





[jira] [Commented] (LUCENE-7698) CommonGramsQueryFilter in the query analyzer chain breaks phrase queries

2017-02-22 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877919#comment-15877919
 ] 

ASF subversion and git services commented on LUCENE-7698:
-

Commit d8e493c502d234099c927339426dfe4a01a94219 in lucene-solr's branch 
refs/heads/branch_6x from Mike McCandless
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=d8e493c ]

LUCENE-7698: fix CommonGramsQueryFilter to not produce a disconnected token 
graph





[jira] [Commented] (LUCENE-7698) CommonGramsQueryFilter in the query analyzer chain breaks phrase queries

2017-02-22 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877916#comment-15877916
 ] 

ASF subversion and git services commented on LUCENE-7698:
-

Commit b9c9cddff7cef08e8b0433a203771e48e662e7b1 in lucene-solr's branch 
refs/heads/master from Mike McCandless
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=b9c9cdd ]

LUCENE-7698: fix CommonGramsQueryFilter to not produce a disconnected token 
graph





[jira] [Commented] (LUCENE-7698) CommonGramsQueryFilter in the query analyzer chain breaks phrase queries

2017-02-22 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877886#comment-15877886
 ] 

Michael McCandless commented on LUCENE-7698:


OK thanks for confirming [~emaijala]; I'll fix that test on back port.




[jira] [Commented] (LUCENE-7698) CommonGramsQueryFilter in the query analyzer chain breaks phrase queries

2017-02-20 Thread Ere Maijala (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15874209#comment-15874209
 ] 

Ere Maijala commented on LUCENE-7698:
-

[~mikemccand], thanks for the fix. An initial check indicates that the patch 
fixes my use case. I ran the tests in branch_6x. The patch didn't quite apply 
cleanly to branch_6_4, and after applying it manually a test didn't compile:

{code}
common.compile-test:
[mkdir] Created dir: /Users/eremaijala/src/solr/lucene/build/analysis/common/classes/test
[javac] Compiling 279 source files to /Users/eremaijala/src/solr/lucene/build/analysis/common/classes/test
[javac] /Users/eremaijala/src/solr/lucene/analysis/common/src/test/org/apache/lucene/analysis/commongrams/TestCommonGramsQueryFilterFactory.java:103: error: cannot find symbol
[javac] assertGraphStrings(stream, "testing_the the_factory factory works");
[javac] ^
[javac]   symbol:   method assertGraphStrings(TokenStream,String)
[javac]   location: class TestCommonGramsQueryFilterFactory
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] 1 error
{code}
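
For readers unfamiliar with the string that the failing assertion expects, 
here is a simplified, Lucene-free model of how a query-side common-grams 
filter could arrive at it. The input sentence "testing the factory works" and 
the single common word "the" are assumptions; only the expected string comes 
from the compiler output above. The class and method names are hypothetical, 
and the model does not try to mirror every edge case of the real filter (for 
instance, how a trailing unigram is handled).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class CommonGramsQuerySketch {
    /**
     * Simplified model: a word that starts a bigram (because it or the next
     * word is a common word) is replaced by that bigram; any other word is
     * emitted as a plain unigram.
     */
    static List<String> queryGrams(List<String> words, Set<String> commonWords) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < words.size(); i++) {
            String word = words.get(i);
            boolean hasNext = i + 1 < words.size();
            if (hasNext && (commonWords.contains(word)
                    || commonWords.contains(words.get(i + 1)))) {
                out.add(word + "_" + words.get(i + 1)); // bigram replaces the unigram
            } else {
                out.add(word); // no bigram starts here, keep the unigram
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> out = queryGrams(
                List.of("testing", "the", "factory", "works"), Set.of("the"));
        // prints: testing_the the_factory factory works
        System.out.println(String.join(" ", out));
    }
}
```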




[jira] [Commented] (LUCENE-7698) CommonGramsQueryFilter in the query analyzer chain breaks phrase queries

2017-02-17 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871777#comment-15871777
 ] 

Michael McCandless commented on LUCENE-7698:


OK I see what's happening: this filter ({{CommonGramsQueryFilter}}) deletes the 
unigram tokens, but keeps {{posLength=2}} on the bigram tokens, which makes a 
disconnected graph, and then the query parser does the wrong thing.

I think the right fix is for it to set {{posLength}} to 1 when it drops unigram 
tokens .. I'll work on a patch.
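
To make the "disconnected graph" concrete, here is a small Lucene-free sketch 
that treats each token as an edge from its start position to start + 
posLength and reports tokens whose start position cannot be reached from the 
first position. The terms ipod_with and with_video come from the report; the 
position increments of 1 and all class and method names are assumptions made 
for illustration, not the actual filter or query parser code.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class TokenGraphSketch {
    /** Hypothetical stand-in for a token: term, position increment, position length. */
    static final class Token {
        final String term;
        final int posInc;
        final int posLen;

        Token(String term, int posInc, int posLen) {
            this.term = term;
            this.posInc = posInc;
            this.posLen = posLen;
        }
    }

    /** Returns the terms whose start position is not reachable from position 0. */
    static List<String> unreachableTerms(List<Token> tokens) {
        Set<Integer> reachable = new HashSet<>();
        reachable.add(0);
        List<String> orphans = new ArrayList<>();
        int pos = -1;
        for (Token t : tokens) {
            pos += t.posInc; // start position of this token
            if (reachable.contains(pos)) {
                reachable.add(pos + t.posLen); // the token's end becomes reachable
            } else {
                orphans.add(t.term); // no path from the start leads to this token
            }
        }
        return orphans;
    }

    public static void main(String[] args) {
        // Unigrams dropped, bigrams still carrying posLength=2.
        List<Token> before = List.of(
                new Token("ipod_with", 1, 2), new Token("with_video", 1, 2));
        // Same tokens with posLength reset to 1, as proposed above.
        List<Token> after = List.of(
                new Token("ipod_with", 1, 1), new Token("with_video", 1, 1));
        System.out.println(unreachableTerms(before)); // prints [with_video]
        System.out.println(unreachableTerms(after));  // prints []
    }
}
```

With posLength=2, ipod_with spans positions 0 to 2, so nothing ends at 
position 1 where with_video starts; that is consistent with the reported 
parsedquery containing ipod_with but not with_video.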




[jira] [Commented] (LUCENE-7698) CommonGramsQueryFilter in the query analyzer chain breaks phrase queries

2017-02-17 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871620#comment-15871620
 ] 

Michael McCandless commented on LUCENE-7698:


Hmm, no good, sorry about this ... thank you for reporting this [~emaijala]; 
I'll try to make a Lucene test case showing this.




[jira] [Commented] (LUCENE-7698) CommonGramsQueryFilter in the query analyzer chain breaks phrase queries

2017-02-17 Thread Ere Maijala (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871575#comment-15871575
 ] 

Ere Maijala commented on LUCENE-7698:
-

Looks to me like LUCENE-7603 broke this.




[jira] [Commented] (LUCENE-7698) CommonGramsQueryFilter in the query analyzer chain breaks phrase queries

2017-02-17 Thread Ere Maijala (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871485#comment-15871485
 ] 

Ere Maijala commented on LUCENE-7698:
-

This seems to be a regression in Solr 6.4.0. At least a quick test shows 
correct results in 6.3.0.
