[jira] [Commented] (SOLR-2927) SolrIndexSearcher's register do not match close and SolrCore's closeSearcher
[ https://issues.apache.org/jira/browse/SOLR-2927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196333#comment-14196333 ]

Michael Dodsworth commented on SOLR-2927:

Thanks, [~shalinmangar]. Much appreciated.

SolrIndexSearcher's register do not match close and SolrCore's closeSearcher

Key: SOLR-2927
URL: https://issues.apache.org/jira/browse/SOLR-2927
Project: Solr
Issue Type: Bug
Components: search
Affects Versions: 4.0-ALPHA
Environment: JDK1.6/CentOS
Reporter: tom liu
Assignee: Shalin Shekhar Mangar
Fix For: 5.0, Trunk
Attachments: SOLR-2927.patch, SOLR-2927.patch, mbean-leak-jira.png

# SolrIndexSearcher's register method puts the name of the searcher into infoRegistry, but SolrCore's closeSearcher method removes the name of currentSearcher from infoRegistry.
# SolrIndexSearcher's register method puts the names of the caches into infoRegistry, but SolrIndexSearcher's close does not remove them.

So there may be a memory leak.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
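The register/close mismatch reported in SOLR-2927 can be illustrated with a minimal sketch. This is not Solr's code: `RegistrySketch`, its `registry` map (standing in for infoRegistry), and the key names are hypothetical and exist only for illustration. The sketch assumes the remedy is simply that close removes exactly the keys that register added.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a searcher that registers itself and its caches
// in a shared registry. This is NOT Solr's SolrIndexSearcher.
class RegistrySketch {
    // Plays the role of infoRegistry: name -> registered object.
    static final Map<String, Object> registry = new ConcurrentHashMap<>();

    private final String name;
    // Every key this instance put into the registry, so close() can undo them.
    private final List<String> registeredKeys = new ArrayList<>();

    RegistrySketch(String name) {
        this.name = name;
    }

    // Registers the searcher name and each cache name, remembering every key.
    void register(List<String> cacheNames) {
        put(name);
        for (String cache : cacheNames) {
            put(cache);
        }
    }

    private void put(String key) {
        registry.put(key, this);
        registeredKeys.add(key);
    }

    // Removes exactly the keys that register() added. The two-argument remove
    // only deletes an entry that still maps to this instance, so an entry
    // re-registered by a newer instance under the same name is left alone.
    void close() {
        for (String key : registeredKeys) {
            registry.remove(key, this);
        }
        registeredKeys.clear();
    }

    public static void main(String[] args) {
        RegistrySketch searcher = new RegistrySketch("Searcher@1");
        searcher.register(Arrays.asList("filterCache", "queryResultCache"));
        System.out.println("entries after register: " + registry.size());
        searcher.close();
        System.out.println("entries after close: " + registry.size());
    }
}
```

Tracking the keys on the instance keeps the two methods symmetric by construction; if close removed a differently derived name, the stale entries would keep the closed object reachable.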
[jira] [Commented] (SOLR-2927) SolrIndexSearcher's register do not match close and SolrCore's closeSearcher
[ https://issues.apache.org/jira/browse/SOLR-2927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14194864#comment-14194864 ] Michael Dodsworth commented on SOLR-2927: - [~shalinmangar] any feedback on this? SolrIndexSearcher's register do not match close and SolrCore's closeSearcher Key: SOLR-2927 URL: https://issues.apache.org/jira/browse/SOLR-2927 Project: Solr Issue Type: Bug Components: search Affects Versions: 4.0-ALPHA Environment: JDK1.6/CentOS Reporter: tom liu Assignee: Shalin Shekhar Mangar Fix For: 4.9, Trunk Attachments: SOLR-2927.patch, mbean-leak-jira.png # SolrIndexSearcher's register method put the name of searcher, but SolrCore's closeSearcher method remove name of currentSearcher on infoRegistry. # SolrIndexSearcher's register method put the name of cache, but SolrIndexSearcher's close do not remove the name of cache. so, there maybe lost some memory leak. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6212) upgrade Saxon-HE to 9.5.1-5 and reinstate Morphline tests that were affected under java 8/9 with 9.5.1-4
[ https://issues.apache.org/jira/browse/SOLR-6212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14116277#comment-14116277 ] Michael Dodsworth commented on SOLR-6212: - Once morphlines is updated, we can upgrade to a version of guava that doesn't have the problematic annotation. upgrade Saxon-HE to 9.5.1-5 and reinstate Morphline tests that were affected under java 8/9 with 9.5.1-4 Key: SOLR-6212 URL: https://issues.apache.org/jira/browse/SOLR-6212 Project: Solr Issue Type: Bug Affects Versions: 4.7, 5.0 Reporter: Michael Dodsworth Assignee: Mark Miller Priority: Minor Attachments: SOLR-6212.patch From SOLR-1301: For posterity, there is a thread on the dev list where we are working through an issue with Saxon on java 8 and ibm's j9. Wolfgang filed https://saxonica.plan.io/issues/1944 upstream. (Saxon is pulled in via cdk-morphlines-saxon). Due to this issue, several Morphline tests were made to be 'ignored' in java 8+. The Saxon issue has been fixed in 9.5.1-5, so we should upgrade and reinstate those tests. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-6455) Ambiguous reference to Base64 in ClientUtils causes the Java 8 build to fail
Michael Dodsworth created SOLR-6455:

Summary: Ambiguous reference to Base64 in ClientUtils causes the Java 8 build to fail
Key: SOLR-6455
URL: https://issues.apache.org/jira/browse/SOLR-6455
Project: Solr
Issue Type: Bug
Environment: Oracle JDK 1.8.0_20-b26 Mac OSX 10.9.4
Reporter: Michael Dodsworth
Priority: Blocker

Compilation fails with:
{code}
common.compile-core:
[mkdir] Created dir: /Users/mdodsworth/dev/lucene-solr/solr/build/solr-solrj/classes/java
[javac] Compiling 122 source files to /Users/mdodsworth/dev/lucene-solr/solr/build/solr-solrj/classes/java
[javac] warning: [options] bootstrap class path not set in conjunction with -source 1.6
[javac] /Users/mdodsworth/dev/lucene-solr/solr/solrj/src/java/org/apache/solr/client/solrj/util/ClientUtils.java:116: error: reference to Base64 is ambiguous
[javac] v = Base64.byteArrayToBase64(bytes, 0,bytes.length);
[javac]     ^
[javac] both class org.apache.solr.common.util.Base64 in org.apache.solr.common.util and class java.util.Base64 in java.util match
[javac] /Users/mdodsworth/dev/lucene-solr/solr/solrj/src/java/org/apache/solr/client/solrj/util/ClientUtils.java:119: error: reference to Base64 is ambiguous
[javac] v = Base64.byteArrayToBase64(bytes.array(), bytes.position(),bytes.limit() - bytes.position());
[javac]     ^
[javac] both class org.apache.solr.common.util.Base64 in org.apache.solr.common.util and class java.util.Base64 in java.util match
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.
[javac] 2 errors
[javac] 1 warning
{code}
[jira] [Updated] (SOLR-6455) Ambiguous reference to Base64 in ClientUtils causes the Java 8 build to fail
[ https://issues.apache.org/jira/browse/SOLR-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Dodsworth updated SOLR-6455: Attachment: SOLR-6455.patch Using fully-qualified class name to disambiguate. The alternative would be to avoid the wildcard import (or include an import for the solr version of Base64) but seems likely an (auto-)import reorganise done by someone using java 7 would break it again. Ambiguous reference to Base64 in ClientUtils causes the Java 8 build to fail Key: SOLR-6455 URL: https://issues.apache.org/jira/browse/SOLR-6455 Project: Solr Issue Type: Bug Environment: Oracle JDK 1.8.0_20-b26 Mac OSX 10.9.4 Reporter: Michael Dodsworth Priority: Blocker Attachments: SOLR-6455.patch Compilation fails with: {code} common.compile-core: [mkdir] Created dir: /Users/mdodsworth/dev/lucene-solr/solr/build/solr-solrj/classes/java [javac] Compiling 122 source files to /Users/mdodsworth/dev/lucene-solr/solr/build/solr-solrj/classes/java [javac] warning: [options] bootstrap class path not set in conjunction with -source 1.6 [javac] /Users/mdodsworth/dev/lucene-solr/solr/solrj/src/java/org/apache/solr/client/solrj/util/ClientUtils.java:116: error: reference to Base64 is ambiguous [javac] v = Base64.byteArrayToBase64(bytes, 0,bytes.length); [javac] ^ [javac] both class org.apache.solr.common.util.Base64 in org.apache.solr.common.util and class java.util.Base64 in java.util match [javac] /Users/mdodsworth/dev/lucene-solr/solr/solrj/src/java/org/apache/solr/client/solrj/util/ClientUtils.java:119: error: reference to Base64 is ambiguous [javac] v = Base64.byteArrayToBase64(bytes.array(), bytes.position(),bytes.limit() - bytes.position()); [javac] ^ [javac] both class org.apache.solr.common.util.Base64 in org.apache.solr.common.util and class java.util.Base64 in java.util match [javac] Note: Some input files use unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. 
[javac] 2 errors [javac] 1 warning {code}
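The kind of ambiguity reported in SOLR-6455, and the fully-qualified-name workaround proposed in the patch comment, can be reproduced with JDK classes alone. The sketch below is an analogy, not the ClientUtils code: it assumes two wildcard imports that both supply a class with the same simple name (here `Proxy` from `java.lang.reflect` and `java.net`, instead of the two `Base64` classes), and the class name `AmbiguousImportSketch` is hypothetical.

```java
import java.lang.reflect.*; // contains java.lang.reflect.Proxy
import java.net.*;          // contains java.net.Proxy

// Hypothetical illustration; not the Solr ClientUtils source.
class AmbiguousImportSketch {

    static String proxyType() {
        // With both wildcard imports above, the simple name is ambiguous:
        //   Proxy p = Proxy.NO_PROXY;   // error: reference to Proxy is ambiguous
        // Wildcard imports only clash when the simple name is actually used,
        // so the file compiles as long as the name is qualified.
        java.net.Proxy p = java.net.Proxy.NO_PROXY;
        return p.type().name();
    }

    public static void main(String[] args) {
        System.out.println(proxyType());
    }
}
```

A single-type import (for example `import java.net.Proxy;`) would also resolve the clash, because it shadows the wildcard imports; as the comment on the patch notes, that form is easier to lose when imports are reorganised automatically.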
[jira] [Updated] (SOLR-6455) Ambiguous reference to Base64 in ClientUtils causes the Java 8 build to fail
[ https://issues.apache.org/jira/browse/SOLR-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Dodsworth updated SOLR-6455: Priority: Major (was: Blocker) Ambiguous reference to Base64 in ClientUtils causes the Java 8 build to fail Key: SOLR-6455 URL: https://issues.apache.org/jira/browse/SOLR-6455 Project: Solr Issue Type: Bug Environment: Oracle JDK 1.8.0_20-b26 Mac OSX 10.9.4 Reporter: Michael Dodsworth Attachments: SOLR-6455.patch Compilation fails with: {code} common.compile-core: [mkdir] Created dir: /Users/mdodsworth/dev/lucene-solr/solr/build/solr-solrj/classes/java [javac] Compiling 122 source files to /Users/mdodsworth/dev/lucene-solr/solr/build/solr-solrj/classes/java [javac] warning: [options] bootstrap class path not set in conjunction with -source 1.6 [javac] /Users/mdodsworth/dev/lucene-solr/solr/solrj/src/java/org/apache/solr/client/solrj/util/ClientUtils.java:116: error: reference to Base64 is ambiguous [javac] v = Base64.byteArrayToBase64(bytes, 0,bytes.length); [javac] ^ [javac] both class org.apache.solr.common.util.Base64 in org.apache.solr.common.util and class java.util.Base64 in java.util match [javac] /Users/mdodsworth/dev/lucene-solr/solr/solrj/src/java/org/apache/solr/client/solrj/util/ClientUtils.java:119: error: reference to Base64 is ambiguous [javac] v = Base64.byteArrayToBase64(bytes.array(), bytes.position(),bytes.limit() - bytes.position()); [javac] ^ [javac] both class org.apache.solr.common.util.Base64 in org.apache.solr.common.util and class java.util.Base64 in java.util match [javac] Note: Some input files use unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. [javac] 2 errors [javac] 1 warning {code} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6455) Ambiguous reference to Base64 in ClientUtils causes the Java 8 build to fail
[ https://issues.apache.org/jira/browse/SOLR-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14116510#comment-14116510 ] Michael Dodsworth commented on SOLR-6455: - Downgraded from 'BLOCKER' as it looks like we're setting java.compat.version to 1.6 (which I don't have installed). Ambiguous reference to Base64 in ClientUtils causes the Java 8 build to fail Key: SOLR-6455 URL: https://issues.apache.org/jira/browse/SOLR-6455 Project: Solr Issue Type: Bug Environment: Oracle JDK 1.8.0_20-b26 Mac OSX 10.9.4 Reporter: Michael Dodsworth Attachments: SOLR-6455.patch Compilation fails with: {code} common.compile-core: [mkdir] Created dir: /Users/mdodsworth/dev/lucene-solr/solr/build/solr-solrj/classes/java [javac] Compiling 122 source files to /Users/mdodsworth/dev/lucene-solr/solr/build/solr-solrj/classes/java [javac] warning: [options] bootstrap class path not set in conjunction with -source 1.6 [javac] /Users/mdodsworth/dev/lucene-solr/solr/solrj/src/java/org/apache/solr/client/solrj/util/ClientUtils.java:116: error: reference to Base64 is ambiguous [javac] v = Base64.byteArrayToBase64(bytes, 0,bytes.length); [javac] ^ [javac] both class org.apache.solr.common.util.Base64 in org.apache.solr.common.util and class java.util.Base64 in java.util match [javac] /Users/mdodsworth/dev/lucene-solr/solr/solrj/src/java/org/apache/solr/client/solrj/util/ClientUtils.java:119: error: reference to Base64 is ambiguous [javac] v = Base64.byteArrayToBase64(bytes.array(), bytes.position(),bytes.limit() - bytes.position()); [javac] ^ [javac] both class org.apache.solr.common.util.Base64 in org.apache.solr.common.util and class java.util.Base64 in java.util match [javac] Note: Some input files use unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. 
[javac] 2 errors [javac] 1 warning {code}
[jira] [Comment Edited] (SOLR-6455) Ambiguous reference to Base64 in ClientUtils causes the Java 8 build to fail
[ https://issues.apache.org/jira/browse/SOLR-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14116515#comment-14116515 ] Michael Dodsworth edited comment on SOLR-6455 at 8/30/14 6:37 PM: -- bah. Apologies (github fork didn't inform me I already had a fork (so I was cloning an out-of-date version)). was (Author: mdodswo...@salesforce.com): bah. Apologies (github fork didn't inform me I already had a fork (so I was cloning an out-of-date version). Ambiguous reference to Base64 in ClientUtils causes the Java 8 build to fail Key: SOLR-6455 URL: https://issues.apache.org/jira/browse/SOLR-6455 Project: Solr Issue Type: Bug Environment: Oracle JDK 1.8.0_20-b26 Mac OSX 10.9.4 Reporter: Michael Dodsworth Attachments: SOLR-6455.patch Compilation fails with: {code} common.compile-core: [mkdir] Created dir: /Users/mdodsworth/dev/lucene-solr/solr/build/solr-solrj/classes/java [javac] Compiling 122 source files to /Users/mdodsworth/dev/lucene-solr/solr/build/solr-solrj/classes/java [javac] warning: [options] bootstrap class path not set in conjunction with -source 1.6 [javac] /Users/mdodsworth/dev/lucene-solr/solr/solrj/src/java/org/apache/solr/client/solrj/util/ClientUtils.java:116: error: reference to Base64 is ambiguous [javac] v = Base64.byteArrayToBase64(bytes, 0,bytes.length); [javac] ^ [javac] both class org.apache.solr.common.util.Base64 in org.apache.solr.common.util and class java.util.Base64 in java.util match [javac] /Users/mdodsworth/dev/lucene-solr/solr/solrj/src/java/org/apache/solr/client/solrj/util/ClientUtils.java:119: error: reference to Base64 is ambiguous [javac] v = Base64.byteArrayToBase64(bytes.array(), bytes.position(),bytes.limit() - bytes.position()); [javac] ^ [javac] both class org.apache.solr.common.util.Base64 in org.apache.solr.common.util and class java.util.Base64 in java.util match [javac] Note: Some input files use unchecked or unsafe operations. 
[javac] Note: Recompile with -Xlint:unchecked for details. [javac] 2 errors [javac] 1 warning {code}
[jira] [Closed] (SOLR-6455) Ambiguous reference to Base64 in ClientUtils causes the Java 8 build to fail
[ https://issues.apache.org/jira/browse/SOLR-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Dodsworth closed SOLR-6455. --- Resolution: Invalid Ambiguous reference to Base64 in ClientUtils causes the Java 8 build to fail Key: SOLR-6455 URL: https://issues.apache.org/jira/browse/SOLR-6455 Project: Solr Issue Type: Bug Environment: Oracle JDK 1.8.0_20-b26 Mac OSX 10.9.4 Reporter: Michael Dodsworth Attachments: SOLR-6455.patch Compilation fails with: {code} common.compile-core: [mkdir] Created dir: /Users/mdodsworth/dev/lucene-solr/solr/build/solr-solrj/classes/java [javac] Compiling 122 source files to /Users/mdodsworth/dev/lucene-solr/solr/build/solr-solrj/classes/java [javac] warning: [options] bootstrap class path not set in conjunction with -source 1.6 [javac] /Users/mdodsworth/dev/lucene-solr/solr/solrj/src/java/org/apache/solr/client/solrj/util/ClientUtils.java:116: error: reference to Base64 is ambiguous [javac] v = Base64.byteArrayToBase64(bytes, 0,bytes.length); [javac] ^ [javac] both class org.apache.solr.common.util.Base64 in org.apache.solr.common.util and class java.util.Base64 in java.util match [javac] /Users/mdodsworth/dev/lucene-solr/solr/solrj/src/java/org/apache/solr/client/solrj/util/ClientUtils.java:119: error: reference to Base64 is ambiguous [javac] v = Base64.byteArrayToBase64(bytes.array(), bytes.position(),bytes.limit() - bytes.position()); [javac] ^ [javac] both class org.apache.solr.common.util.Base64 in org.apache.solr.common.util and class java.util.Base64 in java.util match [javac] Note: Some input files use unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. [javac] 2 errors [javac] 1 warning {code} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6455) Ambiguous reference to Base64 in ClientUtils causes the Java 8 build to fail
[ https://issues.apache.org/jira/browse/SOLR-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14116515#comment-14116515 ] Michael Dodsworth commented on SOLR-6455: - bah. Apologies (github fork didn't inform me I already had a fork (so I was cloning an out-of-date version). Ambiguous reference to Base64 in ClientUtils causes the Java 8 build to fail Key: SOLR-6455 URL: https://issues.apache.org/jira/browse/SOLR-6455 Project: Solr Issue Type: Bug Environment: Oracle JDK 1.8.0_20-b26 Mac OSX 10.9.4 Reporter: Michael Dodsworth Attachments: SOLR-6455.patch Compilation fails with: {code} common.compile-core: [mkdir] Created dir: /Users/mdodsworth/dev/lucene-solr/solr/build/solr-solrj/classes/java [javac] Compiling 122 source files to /Users/mdodsworth/dev/lucene-solr/solr/build/solr-solrj/classes/java [javac] warning: [options] bootstrap class path not set in conjunction with -source 1.6 [javac] /Users/mdodsworth/dev/lucene-solr/solr/solrj/src/java/org/apache/solr/client/solrj/util/ClientUtils.java:116: error: reference to Base64 is ambiguous [javac] v = Base64.byteArrayToBase64(bytes, 0,bytes.length); [javac] ^ [javac] both class org.apache.solr.common.util.Base64 in org.apache.solr.common.util and class java.util.Base64 in java.util match [javac] /Users/mdodsworth/dev/lucene-solr/solr/solrj/src/java/org/apache/solr/client/solrj/util/ClientUtils.java:119: error: reference to Base64 is ambiguous [javac] v = Base64.byteArrayToBase64(bytes.array(), bytes.position(),bytes.limit() - bytes.position()); [javac] ^ [javac] both class org.apache.solr.common.util.Base64 in org.apache.solr.common.util and class java.util.Base64 in java.util match [javac] Note: Some input files use unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. 
[javac] 2 errors [javac] 1 warning {code}
[jira] [Commented] (SOLR-6062) Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query)
[ https://issues.apache.org/jira/browse/SOLR-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14095691#comment-14095691 ]

Michael Dodsworth commented on SOLR-6062:

Fantastic. Thank you, [~ehatcher].

Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query)

Key: SOLR-6062
URL: https://issues.apache.org/jira/browse/SOLR-6062
Project: Solr
Issue Type: Bug
Components: query parsers
Affects Versions: 4.0
Reporter: Michael Dodsworth
Assignee: Erik Hatcher
Priority: Minor
Attachments: combined-phrased-dismax.patch

SOLR-2058 subtly changed how phrase queries, created through the pf, pf2 and pf3 parameters, are merged into the main user query.

For the query: 'term1 term2' with pf2:[field1, field2, field3] we now get (omitting the non phrase query section for clarity):
{code:java}
main query
DisjunctionMaxQuery((field1:term1 term2^1.0)~0.1)
DisjunctionMaxQuery((field2:term1 term2^1.0)~0.1)
DisjunctionMaxQuery((field3:term1 term2^1.0)~0.1)
{code}
Prior to this change, we had:
{code:java}
main query
DisjunctionMaxQuery((field1:term1 term2^1.0 | field2:term1 term2^1.0 | field3:term1 term2^1.0)~0.1)
{code}
The upshot being that if the phrase query term1 term2 appears in multiple fields, it will get a significant boost over the previous implementation.
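The size of the boost described in SOLR-6062 can be estimated with a little arithmetic. The sketch below is a simplification, not Lucene's scoring code: it assumes a disjunction-max score of the best clause plus the tie breaker (the `~0.1` above) times the remaining clauses, that separate optional clauses of the enclosing query simply add up, and it ignores coord and query normalisation. The class and method names are hypothetical, and the per-field phrase score of 1.0 is made up for illustration.

```java
// Hypothetical arithmetic sketch; not Lucene's DisjunctionMaxQuery implementation.
class DismaxBoostSketch {

    // Disjunction-max style score: the best clause counts fully, the other
    // clauses are scaled by the tie breaker. Scores are assumed non-negative.
    static double dismax(double tie, double... scores) {
        double max = 0.0;
        double sum = 0.0;
        for (double s : scores) {
            max = Math.max(max, s);
            sum += s;
        }
        return max + tie * (sum - max);
    }

    // One single-clause dismax per field, with the contributions summed,
    // as happens when each field gets its own clause in the main query.
    static double separate(double tie, double... scores) {
        double total = 0.0;
        for (double s : scores) {
            total += dismax(tie, s);
        }
        return total;
    }

    public static void main(String[] args) {
        double tie = 0.1;
        double[] perField = {1.0, 1.0, 1.0}; // phrase matches in field1..field3
        System.out.printf("combined dismax: %.2f%n", dismax(tie, perField));
        System.out.printf("separate clauses: %.2f%n", separate(tie, perField));
    }
}
```

Under these assumptions, a phrase that matches in all three fields contributes roughly the sum of the per-field scores when the clauses are separate, but only slightly more than the best single field when they are combined.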
[jira] [Commented] (SOLR-6062) Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query)
[ https://issues.apache.org/jira/browse/SOLR-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14094812#comment-14094812 ] Michael Dodsworth commented on SOLR-6062: - [~jdyer], any feedback? Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query) - Key: SOLR-6062 URL: https://issues.apache.org/jira/browse/SOLR-6062 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 4.0 Reporter: Michael Dodsworth Priority: Minor Attachments: combined-phrased-dismax.patch https://issues.apache.org/jira/browse/SOLR-2058 subtly changed how phrase queries, created through the pf, pf2 and pf3 parameters, are merged into the main user query. For the query: 'term1 term2' with pf2:[field1, field2, field3] we now get (omitting the non phrase query section for clarity): {code:java} main query DisjunctionMaxQuery((field1:term1 term2^1.0)~0.1) DisjunctionMaxQuery((field2:term1 term2^1.0)~0.1) DisjunctionMaxQuery((field3:term1 term2^1.0)~0.1) {code} Prior to this change, we had: {code:java} main query DisjunctionMaxQuery((field1:term1 term2^1.0 | field2:term1 term2^1.0 | field3:term1 term2^1.0)~0.1) {code} The upshot being that if the phrase query term1 term2 appears in multiple fields, it will get a significant boost over the previous implementation. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6212) upgrade Saxon-HE to 9.5.1-5 and reinstate Morphline tests that were affected under java 8/9 with 9.5.1-4
[ https://issues.apache.org/jira/browse/SOLR-6212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14094814#comment-14094814 ] Michael Dodsworth commented on SOLR-6212: - [~markrmil...@gmail.com], any feedback? upgrade Saxon-HE to 9.5.1-5 and reinstate Morphline tests that were affected under java 8/9 with 9.5.1-4 Key: SOLR-6212 URL: https://issues.apache.org/jira/browse/SOLR-6212 Project: Solr Issue Type: Bug Affects Versions: 4.7, 5.0 Reporter: Michael Dodsworth Assignee: Mark Miller Priority: Minor Attachments: SOLR-6212.patch From SOLR-1301: For posterity, there is a thread on the dev list where we are working through an issue with Saxon on java 8 and ibm's j9. Wolfgang filed https://saxonica.plan.io/issues/1944 upstream. (Saxon is pulled in via cdk-morphlines-saxon). Due to this issue, several Morphline tests were made to be 'ignored' in java 8+. The Saxon issue has been fixed in 9.5.1-5, so we should upgrade and reinstate those tests. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6062) Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query)
[ https://issues.apache.org/jira/browse/SOLR-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14094907#comment-14094907 ] Michael Dodsworth commented on SOLR-6062: - Thanks for looking at this, [~ehatcher]. Any suggestions on folks to pull in? Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query) - Key: SOLR-6062 URL: https://issues.apache.org/jira/browse/SOLR-6062 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 4.0 Reporter: Michael Dodsworth Priority: Minor Attachments: combined-phrased-dismax.patch SOLR-2058 subtly changed how phrase queries, created through the pf, pf2 and pf3 parameters, are merged into the main user query. For the query: 'term1 term2' with pf2:[field1, field2, field3] we now get (omitting the non phrase query section for clarity): {code:java} main query DisjunctionMaxQuery((field1:term1 term2^1.0)~0.1) DisjunctionMaxQuery((field2:term1 term2^1.0)~0.1) DisjunctionMaxQuery((field3:term1 term2^1.0)~0.1) {code} Prior to this change, we had: {code:java} main query DisjunctionMaxQuery((field1:term1 term2^1.0 | field2:term1 term2^1.0 | field3:term1 term2^1.0)~0.1) {code} The upshot being that if the phrase query term1 term2 appears in multiple fields, it will get a significant boost over the previous implementation. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6062) Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query)
[ https://issues.apache.org/jira/browse/SOLR-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14094935#comment-14094935 ] Michael Dodsworth commented on SOLR-6062: - not that I know of -- the wanted behavior of SOLR-2058 is supported (by supplying different slop values for the same field) as well as the original behavior. Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query) - Key: SOLR-6062 URL: https://issues.apache.org/jira/browse/SOLR-6062 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 4.0 Reporter: Michael Dodsworth Priority: Minor Attachments: combined-phrased-dismax.patch SOLR-2058 subtly changed how phrase queries, created through the pf, pf2 and pf3 parameters, are merged into the main user query. For the query: 'term1 term2' with pf2:[field1, field2, field3] we now get (omitting the non phrase query section for clarity): {code:java} main query DisjunctionMaxQuery((field1:term1 term2^1.0)~0.1) DisjunctionMaxQuery((field2:term1 term2^1.0)~0.1) DisjunctionMaxQuery((field3:term1 term2^1.0)~0.1) {code} Prior to this change, we had: {code:java} main query DisjunctionMaxQuery((field1:term1 term2^1.0 | field2:term1 term2^1.0 | field3:term1 term2^1.0)~0.1) {code} The upshot being that if the phrase query term1 term2 appears in multiple fields, it will get a significant boost over the previous implementation. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6062) Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query)
[ https://issues.apache.org/jira/browse/SOLR-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14086543#comment-14086543 ] Michael Dodsworth commented on SOLR-6062: - Any feedback, [~ramayer] Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query) - Key: SOLR-6062 URL: https://issues.apache.org/jira/browse/SOLR-6062 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 4.0 Reporter: Michael Dodsworth Priority: Minor Attachments: combined-phrased-dismax.patch https://issues.apache.org/jira/browse/SOLR-2058 subtly changed how phrase queries, created through the pf, pf2 and pf3 parameters, are merged into the main user query. For the query: 'term1 term2' with pf2:[field1, field2, field3] we now get (omitting the non phrase query section for clarity): {code:java} main query DisjunctionMaxQuery((field1:term1 term2^1.0)~0.1) DisjunctionMaxQuery((field2:term1 term2^1.0)~0.1) DisjunctionMaxQuery((field3:term1 term2^1.0)~0.1) {code} Prior to this change, we had: {code:java} main query DisjunctionMaxQuery((field1:term1 term2^1.0 | field2:term1 term2^1.0 | field3:term1 term2^1.0)~0.1) {code} The upshot being that if the phrase query term1 term2 appears in multiple fields, it will get a significant boost over the previous implementation. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (LUCENE-4730) SmartChineseAnalyzer got wrong matched offset
[ https://issues.apache.org/jira/browse/LUCENE-4730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14047046#comment-14047046 ] Michael Dodsworth edited comment on LUCENE-4730 at 6/29/14 3:39 PM: This appears to be a symptom of LUCENE-4984 (fixed in 4.8). The following test fails: {code:java} // note Version.LUCENE_4_7 assertAnalyzesTo(new SmartChineseAnalyzer(Version.LUCENE_4_7, true), My China , new String[] { my, china}, new int[] {0,3}, new int[] {2, 8}); {code} whereas this passes: {code:java} // note Version.LUCENE_4_8 assertAnalyzesTo(new SmartChineseAnalyzer(Version.LUCENE_4_8, true), My China , new String[] { my, china}, new int[] {0,3}, new int[] {2, 8}); {code} I'll add a test to verify this double-whitespace case but otherwise, this can be closed out. was (Author: mdodswo...@salesforce.com): This appears to be a symptom of LUCENE-4984 (fixed in 4.8). The following test fails: {code:java} // note Version.LUCENE_4_7 assertAnalyzesTo(new SmartChineseAnalyzer(Version.LUCENE_4_7, true), My China , new String[] { my, china}, new int[] {0,3}, new int[] {2, 8}); {code} whereas this passes: {code:java} note Version.LUCENE_4_8 assertAnalyzesTo(new SmartChineseAnalyzer(Version.LUCENE_4_8, true), My China , new String[] { my, china}, new int[] {0,3}, new int[] {2, 8}); {code} I'll add a test to verify this double-whitespace case but otherwise, this can be closed out. 
SmartChineseAnalyzer got wrong matched offset

Key: LUCENE-4730
URL: https://issues.apache.org/jira/browse/LUCENE-4730
Project: Lucene - Core
Issue Type: Bug
Components: modules/analysis
Affects Versions: 4.0, 4.1
Environment: JDK1.7 Linux/Windows
Reporter: Jinsong Hu
Priority: Critical
Attachments: LUCENE-4730.patch

We found that SmartChineseAnalyzer got wrong matched offset with the following test code:

    public void testHighlight() throws Exception {
        String text = "My China  ";
        String queryText = "China";
        StringBuilder builder = new StringBuilder("<html>");
        Analyzer analyzer = new SmartChineseAnalyzer(Version.LUCENE_40);
        //Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_40);
        QueryParser parser = new QueryParser(Version.LUCENE_40, "text", analyzer);
        Query query = parser.parse(queryText);
        SimpleHTMLFormatter formatter = new SimpleHTMLFormatter("<span style=\"background: yellow\">", "</span>");
        TokenStream tokens = analyzer.tokenStream("text", new StringReader(text));
        QueryScorer scorer = new QueryScorer(query, "text");
        Highlighter highlighter = new Highlighter(formatter, scorer);
        highlighter.setTextFragmenter(new SimpleSpanFragmenter(scorer));
        String result = highlighter.getBestFragments(tokens, text, 10, "...");
        if (result.length() < text.length()) {
            result = text;
        }
        builder.append("<body>");
        builder.append(result);
        builder.append("</body>");
        builder.append("</html>");
        System.out.println(builder.toString());
    }

This method generates highlighted text; however, the highlight position is obviously wrong. If we remove one space from the text, that is, change the text from "My China  " (ends with two spaces) to "My China " (ends with one space), the highlight is correct. If we change the analyzer from SmartChineseAnalyzer to StandardAnalyzer, the highlight issue disappears.
[jira] [Commented] (SOLR-5109) Solr 4.4 will not deploy in Glassfish 4.x
[ https://issues.apache.org/jira/browse/SOLR-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14047162#comment-14047162 ]

Michael Dodsworth commented on SOLR-5109:

Ah, fun. Those tests don't run under Java 8 due to a Saxon-HE issue (which is actually fixed in 9.5.1-5).

The Guava issue has been fixed in kite-sdk:master - https://github.com/kite-sdk/kite/commit/0ab2795872e4e5721f477d79e5049371a17ab8db. We'll have to wait for the next drop of kite-sdk before Guava can be upgraded.

I'll create a separate issue for updating Saxon-HE and reinstating the affected tests.

Apologies and thanks for looking at this, Shalin.

Solr 4.4 will not deploy in Glassfish 4.x

Key: SOLR-5109
URL: https://issues.apache.org/jira/browse/SOLR-5109
Project: Solr
Issue Type: Bug
Affects Versions: 4.4
Environment: Glassfish 4.x
Reporter: jamon camisso
Priority: Blocker
Labels: guava
Attachments: LUCENE-5109.patch, guava-15.0-SNAPSHOT.jar

The bundled Guava 14.0.1 JAR blocks deploying Solr 4.4 in Glassfish 4.x. This failure is a known issue with upstream Guava and is described here: https://code.google.com/p/guava-libraries/issues/detail?id=1433

Building Guava guava-15.0-SNAPSHOT.jar from master and bundling it in Solr allows for a successful deployment. Until the Guava developers release version 15, using their HEAD or even an RC tag seems like the only way to resolve this.

This is frustrating since it was proposed that Guava be removed as a dependency before Solr 4.0 was released and yet it remains and blocks upgrading: https://issues.apache.org/jira/browse/SOLR-3601
[jira] [Created] (SOLR-6212) upgrade Saxon-HE to 9.5.1-5 and reinstate Morphline tests that were affected under java 8/9 with 9.5.1-4
Michael Dodsworth created SOLR-6212: --- Summary: upgrade Saxon-HE to 9.5.1-5 and reinstate Morphline tests that were affected under java 8/9 with 9.5.1-4 Key: SOLR-6212 URL: https://issues.apache.org/jira/browse/SOLR-6212 Project: Solr Issue Type: Bug Affects Versions: 4.7, 5.0 Reporter: Michael Dodsworth Priority: Minor From SOLR-1301: For posterity, there is a thread on the dev list where we are working through an issue with Saxon on java 8 and ibm's j9. Wolfgang filed https://saxonica.plan.io/issues/1944 upstream. (Saxon is pulled in via cdk-morphlines-saxon). Due to this issue, several Morphline tests were made to be 'ignored' in java 8+. The Saxon issue has been fixed in 9.5.1-5, so we should upgrade and reinstate those tests. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6212) upgrade Saxon-HE to 9.5.1-5 and reinstate Morphline tests that were affected under java 8/9 with 9.5.1-4
[ https://issues.apache.org/jira/browse/SOLR-6212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Dodsworth updated SOLR-6212: Attachment: SOLR-6212.patch upgrading Saxon-HE (to 9.5.1-5) and morphlines (to 0.14.1) upgrade Saxon-HE to 9.5.1-5 and reinstate Morphline tests that were affected under java 8/9 with 9.5.1-4 Key: SOLR-6212 URL: https://issues.apache.org/jira/browse/SOLR-6212 Project: Solr Issue Type: Bug Affects Versions: 4.7, 5.0 Reporter: Michael Dodsworth Assignee: Mark Miller Priority: Minor Attachments: SOLR-6212.patch From SOLR-1301: For posterity, there is a thread on the dev list where we are working through an issue with Saxon on java 8 and ibm's j9. Wolfgang filed https://saxonica.plan.io/issues/1944 upstream. (Saxon is pulled in via cdk-morphlines-saxon). Due to this issue, several Morphline tests were made to be 'ignored' in java 8+. The Saxon issue has been fixed in 9.5.1-5, so we should upgrade and reinstate those tests. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5109) Solr 4.4 will not deploy in Glassfish 4.x
[ https://issues.apache.org/jira/browse/SOLR-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Dodsworth updated SOLR-5109: Attachment: LUCENE-5109.patch Here's a patch for branch_4x to upgrade Guava to 17.0 (in which, the problematic annotation has been removed). Solr 4.4 will not deploy in Glassfish 4.x - Key: SOLR-5109 URL: https://issues.apache.org/jira/browse/SOLR-5109 Project: Solr Issue Type: Bug Affects Versions: 4.4 Environment: Glassfish 4.x Reporter: jamon camisso Priority: Blocker Labels: guava Attachments: LUCENE-5109.patch, guava-15.0-SNAPSHOT.jar The bundled Guava 14.0.1 JAR blocks deploying Solr 4.4 in Glassfish 4.x. This failure is a known issue with upstream Guava and is described here: https://code.google.com/p/guava-libraries/issues/detail?id=1433 Building Guava guava-15.0-SNAPSHOT.jar from master and bundling it in Solr allows for a successful deployment. Until the Guava developers release version 15 using their HEAD or even an RC tag seems like the only way to resolve this. This is frustrating since it was proposed that Guava be removed as a dependency before Solr 4.0 was released and yet it remains and blocks upgrading: https://issues.apache.org/jira/browse/SOLR-3601 -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-4787) The QueryScorer.getMaxWeight method is not found.
[ https://issues.apache.org/jira/browse/LUCENE-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Dodsworth updated LUCENE-4787: -- Attachment: LUCENE-4787.patch The QueryScorer.getMaxWeight method is not found. - Key: LUCENE-4787 URL: https://issues.apache.org/jira/browse/LUCENE-4787 Project: Lucene - Core Issue Type: Bug Components: modules/highlighter Affects Versions: 4.1 Reporter: Hao Zhong Priority: Critical Attachments: LUCENE-4787.patch The following API documents refer to the QueryScorer.getMaxWeight method: http://lucene.apache.org/core/4_1_0/highlighter/org/apache/lucene/search/highlight/package-summary.html The QueryScorer.getMaxWeight method is useful when passed to the GradientFormatter constructor to define the top score which is associated with the top color. http://lucene.apache.org/core/4_1_0/highlighter/org/apache/lucene/search/highlight/GradientFormatter.html See QueryScorer.getMaxWeight which can be used to calibrate scoring scale However, the QueryScorer class does not declare a getMaxWeight method in lucene 4.1, according to its document: http://lucene.apache.org/core/4_1_0/highlighter/org/apache/lucene/search/highlight/QueryScorer.html Instead, the class declares a getMaxTermWeight method. Is that the correct method in the preceding two documents? If it is, please revise the two documents. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4730) SmartChineseAnalyzer got wrong matched offset
[ https://issues.apache.org/jira/browse/LUCENE-4730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14047046#comment-14047046 ]

Michael Dodsworth commented on LUCENE-4730:

This appears to be a symptom of LUCENE-4984 (fixed in 4.8). The following test fails:

{code:java}
// note Version.LUCENE_4_7
assertAnalyzesTo(new SmartChineseAnalyzer(Version.LUCENE_4_7, true), "My China  ",
    new String[] { "my", "china" }, new int[] {0, 3}, new int[] {2, 8});
{code}

whereas this passes:

{code:java}
// note Version.LUCENE_4_8
assertAnalyzesTo(new SmartChineseAnalyzer(Version.LUCENE_4_8, true), "My China  ",
    new String[] { "my", "china" }, new int[] {0, 3}, new int[] {2, 8});
{code}

I'll add a test to verify this double-whitespace case but otherwise, this can be closed out.

SmartChineseAnalyzer got wrong matched offset

Key: LUCENE-4730
URL: https://issues.apache.org/jira/browse/LUCENE-4730
Project: Lucene - Core
Issue Type: Bug
Components: modules/analysis
Affects Versions: 4.0, 4.1
Environment: JDK1.7 Linux/Windows
Reporter: Jinsong Hu
Priority: Critical

We found that SmartChineseAnalyzer got wrong matched offset with the following test code:

public void testHighlight() throws Exception {
    String text = "My China  ";
    String queryText = "China";
    StringBuilder builder = new StringBuilder("<html>");
    Analyzer analyzer = new SmartChineseAnalyzer(Version.LUCENE_40);
    //Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_40);
    QueryParser parser = new QueryParser(Version.LUCENE_40, text, analyzer);
    Query query = parser.parse(queryText);
    SimpleHTMLFormatter formatter = new SimpleHTMLFormatter("<span style=\"background: yellow\">", "</span>");
    TokenStream tokens = analyzer.tokenStream(text, new StringReader(text));
    QueryScorer scorer = new QueryScorer(query, text);
    Highlighter highlighter = new Highlighter(formatter, scorer);
    highlighter.setTextFragmenter(new SimpleSpanFragmenter(scorer));
    String result = highlighter.getBestFragments(tokens, text, 10, "...");
    if (result.length() < text.length()) {
        result = text;
    }
    builder.append("<body>");
    builder.append(result);
    builder.append("</body>");
    builder.append("</html>");
    System.out.println(builder.toString());
}

This method generates highlighted text; however, the highlight position is obviously wrong. If we remove one space from the text, that is, change text from "My China  " (ends with two spaces) to "My China " (ends with one space), it generates text with the correct highlight. If we change the analyzer from SmartChineseAnalyzer to StandardAnalyzer, the highlight issue disappears.
[jira] [Updated] (LUCENE-4730) SmartChineseAnalyzer got wrong matched offset
[ https://issues.apache.org/jira/browse/LUCENE-4730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Dodsworth updated LUCENE-4730:

Attachment: LUCENE-4730.patch

SmartChineseAnalyzer got wrong matched offset

Key: LUCENE-4730
URL: https://issues.apache.org/jira/browse/LUCENE-4730
Project: Lucene - Core
Issue Type: Bug
Components: modules/analysis
Affects Versions: 4.0, 4.1
Environment: JDK1.7 Linux/Windows
Reporter: Jinsong Hu
Priority: Critical
Attachments: LUCENE-4730.patch

We found that SmartChineseAnalyzer got wrong matched offset with the following test code:

public void testHighlight() throws Exception {
    String text = "My China  ";
    String queryText = "China";
    StringBuilder builder = new StringBuilder("<html>");
    Analyzer analyzer = new SmartChineseAnalyzer(Version.LUCENE_40);
    //Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_40);
    QueryParser parser = new QueryParser(Version.LUCENE_40, text, analyzer);
    Query query = parser.parse(queryText);
    SimpleHTMLFormatter formatter = new SimpleHTMLFormatter("<span style=\"background: yellow\">", "</span>");
    TokenStream tokens = analyzer.tokenStream(text, new StringReader(text));
    QueryScorer scorer = new QueryScorer(query, text);
    Highlighter highlighter = new Highlighter(formatter, scorer);
    highlighter.setTextFragmenter(new SimpleSpanFragmenter(scorer));
    String result = highlighter.getBestFragments(tokens, text, 10, "...");
    if (result.length() < text.length()) {
        result = text;
    }
    builder.append("<body>");
    builder.append(result);
    builder.append("</body>");
    builder.append("</html>");
    System.out.println(builder.toString());
}

This method generates highlighted text; however, the highlight position is obviously wrong. If we remove one space from the text, that is, change text from "My China  " (ends with two spaces) to "My China " (ends with one space), it generates text with the correct highlight. If we change the analyzer from SmartChineseAnalyzer to StandardAnalyzer, the highlight issue disappears.
[jira] [Commented] (LUCENE-4787) The QueryScorer.getMaxWeight method is not found.
[ https://issues.apache.org/jira/browse/LUCENE-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14047050#comment-14047050 ] Michael Dodsworth commented on LUCENE-4787: --- doc fixed in the attached patch. The QueryScorer.getMaxWeight method is not found. - Key: LUCENE-4787 URL: https://issues.apache.org/jira/browse/LUCENE-4787 Project: Lucene - Core Issue Type: Bug Components: modules/highlighter Affects Versions: 4.1 Reporter: Hao Zhong Priority: Critical Attachments: LUCENE-4787.patch The following API documents refer to the QueryScorer.getMaxWeight method: http://lucene.apache.org/core/4_1_0/highlighter/org/apache/lucene/search/highlight/package-summary.html The QueryScorer.getMaxWeight method is useful when passed to the GradientFormatter constructor to define the top score which is associated with the top color. http://lucene.apache.org/core/4_1_0/highlighter/org/apache/lucene/search/highlight/GradientFormatter.html See QueryScorer.getMaxWeight which can be used to calibrate scoring scale However, the QueryScorer class does not declare a getMaxWeight method in lucene 4.1, according to its document: http://lucene.apache.org/core/4_1_0/highlighter/org/apache/lucene/search/highlight/QueryScorer.html Instead, the class declares a getMaxTermWeight method. Is that the correct method in the preceding two documents? If it is, please revise the two documents. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6062) Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather than them being combined in a single dismax query)
[ https://issues.apache.org/jira/browse/SOLR-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034070#comment-14034070 ] Michael Dodsworth commented on SOLR-6062: - [~ramayer] does that output look correct to you? Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query) - Key: SOLR-6062 URL: https://issues.apache.org/jira/browse/SOLR-6062 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 4.0 Reporter: Michael Dodsworth Priority: Minor Attachments: combined-phrased-dismax.patch https://issues.apache.org/jira/browse/SOLR-2058 subtly changed how phrase queries, created through the pf, pf2 and pf3 parameters, are merged into the main user query. For the query: 'term1 term2' with pf2:[field1, field2, field3] we now get (omitting the non phrase query section for clarity): {code:java} main query DisjunctionMaxQuery((field1:term1 term2^1.0)~0.1) DisjunctionMaxQuery((field2:term1 term2^1.0)~0.1) DisjunctionMaxQuery((field3:term1 term2^1.0)~0.1) {code} Prior to this change, we had: {code:java} main query DisjunctionMaxQuery((field1:term1 term2^1.0 | field2:term1 term2^1.0 | field3:term1 term2^1.0)~0.1) {code} The upshot being that if the phrase query term1 term2 appears in multiple fields, it will get a significant boost over the previous implementation. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6062) Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather than them being combined in a single dismax query)
[ https://issues.apache.org/jira/browse/SOLR-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14012659#comment-14012659 ]

Michael Dodsworth commented on SOLR-6062:

Thanks for looking at this, [~ramayer],

Here's an example that shows both the grouping within a particular pf? query (where the supplied fields have the same slop) and the splitting out/layering of queries when different slops are used for the same field(s). Hold on to your hats...

{q, , qf, phrase_sw phrase1_sw, pf, phrase_sw~1^10 phrase_sw~2^20 phrase_sw^30, pf2, phrase_sw~2^22 phrase_sw^33 phrase1_sw~2^44 phrase1_sw~4^55, pf3, phrase_sw~2^222 phrase_sw^333 phrase1_sw~2^444 phrase1_sw~4^555}

# pf -- phrase_sw with 3 different slop values results in 3 independent dismax queries
DisjunctionMaxQuery((phrase_sw: ~1^10.0))
DisjunctionMaxQuery((phrase_sw: ~2^20.0))
DisjunctionMaxQuery((phrase_sw: ^30.0))

# pf2 -- phrase_sw and phrase1_sw were both supplied with a slop of 2, so those queries are grouped
(
  DisjunctionMaxQuery((phrase_sw: ~2^22.0 | phrase1_sw: ~2^44.0))
  DisjunctionMaxQuery((phrase_sw: ~2^22.0 | phrase1_sw: ~2^44.0))
)
(
  DisjunctionMaxQuery((phrase_sw: ^33.0))
  DisjunctionMaxQuery((phrase_sw: ^33.0))
)
(
  DisjunctionMaxQuery((phrase1_sw: ~4^55.0))
  DisjunctionMaxQuery((phrase1_sw: ~4^55.0))
)

# pf3
DisjunctionMaxQuery((phrase_sw: ~2^222.0 | phrase1_sw: ~2^444.0))
DisjunctionMaxQuery((phrase_sw: ^333.0))
DisjunctionMaxQuery((phrase1_sw: ~4^555.0)))

Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather than them being combined in a single dismax query)

Key: SOLR-6062
URL: https://issues.apache.org/jira/browse/SOLR-6062
Project: Solr
Issue Type: Bug
Components: query parsers
Affects Versions: 4.0
Reporter: Michael Dodsworth
Priority: Minor
Attachments: combined-phrased-dismax.patch

https://issues.apache.org/jira/browse/SOLR-2058 subtly changed how phrase queries, created through the pf, pf2 and pf3 parameters, are merged into the main user query.

For the query: 'term1 term2' with pf2:[field1, field2, field3] we now get (omitting the non phrase query section for clarity):

{code:java}
main query
DisjunctionMaxQuery((field1:"term1 term2"^1.0)~0.1)
DisjunctionMaxQuery((field2:"term1 term2"^1.0)~0.1)
DisjunctionMaxQuery((field3:"term1 term2"^1.0)~0.1)
{code}

Prior to this change, we had:

{code:java}
main query
DisjunctionMaxQuery((field1:"term1 term2"^1.0 | field2:"term1 term2"^1.0 | field3:"term1 term2"^1.0)~0.1)
{code}

The upshot being that if the phrase query "term1 term2" appears in multiple fields, it will get a significant boost over the previous implementation.
[jira] [Comment Edited] (SOLR-6062) Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather than them being combined in a single dismax query)
[ https://issues.apache.org/jira/browse/SOLR-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14012659#comment-14012659 ] Michael Dodsworth edited comment on SOLR-6062 at 5/29/14 6:25 PM: -- Thanks for looking at this, [~ramayer], Here's an example that shows both the grouping within a particular pf? query (where the supplied fields have the same slop) and the splitting out/layering of queries when different slops are used for the same field(s). Hold on to your hats... {code} {q, , qf, phrase_sw phrase1_sw, pf, phrase_sw~1^10 phrase_sw~2^20 phrase_sw^30, pf2, phrase_sw~2^22 phrase_sw^33 phrase1_sw~2^44 phrase1_sw~4^55, pf3, phrase_sw~2^222 phrase_sw^333 phrase1_sw~2^444 phrase1_sw~4^555} # pf -- phrase_sw with 3 different slop values results in 3 independent dismax queries DisjunctionMaxQuery((phrase_sw: ~1^10.0)) DisjunctionMaxQuery((phrase_sw: ~2^20.0)) DisjunctionMaxQuery((phrase_sw: ^30.0)) # pf2 -- phrase_sw and phrase1_sw were both supplied with a slop of 2, so those queries are grouped ( DisjunctionMaxQuery((phrase_sw: ~2^22.0 | phrase1_sw: ~2^44.0)) DisjunctionMaxQuery((phrase_sw: ~2^22.0 | phrase1_sw: ~2^44.0)) ) ( DisjunctionMaxQuery((phrase_sw: ^33.0)) DisjunctionMaxQuery((phrase_sw: ^33.0)) ) ( DisjunctionMaxQuery((phrase1_sw: ~4^55.0)) DisjunctionMaxQuery((phrase1_sw: ~4^55.0)) ) # pf3 DisjunctionMaxQuery((phrase_sw: ~2^222.0 | phrase1_sw: ~2^444.0)) DisjunctionMaxQuery((phrase_sw: ^333.0)) DisjunctionMaxQuery((phrase1_sw: ~4^555.0))) {code} was (Author: mdodswo...@salesforce.com): Thanks for looking at this, [~ramayer], Here's an example that shows both the grouping within a particular pf? query (where the supplied fields have the same slop) and the splitting out/layering of queries when different slops are used for the same field(s). Hold on to your hats... 
{q, , qf, phrase_sw phrase1_sw, pf, phrase_sw~1^10 phrase_sw~2^20 phrase_sw^30, pf2, phrase_sw~2^22 phrase_sw^33 phrase1_sw~2^44 phrase1_sw~4^55, pf3, phrase_sw~2^222 phrase_sw^333 phrase1_sw~2^444 phrase1_sw~4^555} # pf -- phrase_sw with 3 different slop values results in 3 independent dismax queries DisjunctionMaxQuery((phrase_sw: ~1^10.0)) DisjunctionMaxQuery((phrase_sw: ~2^20.0)) DisjunctionMaxQuery((phrase_sw: ^30.0)) # pf2 -- phrase_sw and phrase1_sw were both supplied with a slop of 2, so those queries are grouped ( DisjunctionMaxQuery((phrase_sw: ~2^22.0 | phrase1_sw: ~2^44.0)) DisjunctionMaxQuery((phrase_sw: ~2^22.0 | phrase1_sw: ~2^44.0)) ) ( DisjunctionMaxQuery((phrase_sw: ^33.0)) DisjunctionMaxQuery((phrase_sw: ^33.0)) ) ( DisjunctionMaxQuery((phrase1_sw: ~4^55.0)) DisjunctionMaxQuery((phrase1_sw: ~4^55.0)) ) # pf3 DisjunctionMaxQuery((phrase_sw: ~2^222.0 | phrase1_sw: ~2^444.0)) DisjunctionMaxQuery((phrase_sw: ^333.0)) DisjunctionMaxQuery((phrase1_sw: ~4^555.0))) Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query) - Key: SOLR-6062 URL: https://issues.apache.org/jira/browse/SOLR-6062 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 4.0 Reporter: Michael Dodsworth Priority: Minor Attachments: combined-phrased-dismax.patch https://issues.apache.org/jira/browse/SOLR-2058 subtly changed how phrase queries, created through the pf, pf2 and pf3 parameters, are merged into the main user query. 
For the query: 'term1 term2' with pf2:[field1, field2, field3] we now get (omitting the non phrase query section for clarity): {code:java} main query DisjunctionMaxQuery((field1:term1 term2^1.0)~0.1) DisjunctionMaxQuery((field2:term1 term2^1.0)~0.1) DisjunctionMaxQuery((field3:term1 term2^1.0)~0.1) {code} Prior to this change, we had: {code:java} main query DisjunctionMaxQuery((field1:term1 term2^1.0 | field2:term1 term2^1.0 | field3:term1 term2^1.0)~0.1) {code} The upshot being that if the phrase query term1 term2 appears in multiple fields, it will get a significant boost over the previous implementation. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands,
[jira] [Comment Edited] (SOLR-6062) Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather than them being combined in a single dismax query)
[ https://issues.apache.org/jira/browse/SOLR-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14012659#comment-14012659 ] Michael Dodsworth edited comment on SOLR-6062 at 5/29/14 6:25 PM: -- Thanks for looking at this, [~ramayer], Here's an example (post fix) that shows both the grouping within a particular pf? query (where the supplied fields have the same slop) and the splitting out/layering of queries when different slops are used for the same field(s). Hold on to your hats... {code} {q, , qf, phrase_sw phrase1_sw, pf, phrase_sw~1^10 phrase_sw~2^20 phrase_sw^30, pf2, phrase_sw~2^22 phrase_sw^33 phrase1_sw~2^44 phrase1_sw~4^55, pf3, phrase_sw~2^222 phrase_sw^333 phrase1_sw~2^444 phrase1_sw~4^555} # pf -- phrase_sw with 3 different slop values results in 3 independent dismax queries DisjunctionMaxQuery((phrase_sw: ~1^10.0)) DisjunctionMaxQuery((phrase_sw: ~2^20.0)) DisjunctionMaxQuery((phrase_sw: ^30.0)) # pf2 -- phrase_sw and phrase1_sw were both supplied with a slop of 2, so those queries are grouped ( DisjunctionMaxQuery((phrase_sw: ~2^22.0 | phrase1_sw: ~2^44.0)) DisjunctionMaxQuery((phrase_sw: ~2^22.0 | phrase1_sw: ~2^44.0)) ) ( DisjunctionMaxQuery((phrase_sw: ^33.0)) DisjunctionMaxQuery((phrase_sw: ^33.0)) ) ( DisjunctionMaxQuery((phrase1_sw: ~4^55.0)) DisjunctionMaxQuery((phrase1_sw: ~4^55.0)) ) # pf3 DisjunctionMaxQuery((phrase_sw: ~2^222.0 | phrase1_sw: ~2^444.0)) DisjunctionMaxQuery((phrase_sw: ^333.0)) DisjunctionMaxQuery((phrase1_sw: ~4^555.0))) {code} was (Author: mdodswo...@salesforce.com): Thanks for looking at this, [~ramayer], Here's an example that shows both the grouping within a particular pf? query (where the supplied fields have the same slop) and the splitting out/layering of queries when different slops are used for the same field(s). Hold on to your hats... 
{code} {q, , qf, phrase_sw phrase1_sw, pf, phrase_sw~1^10 phrase_sw~2^20 phrase_sw^30, pf2, phrase_sw~2^22 phrase_sw^33 phrase1_sw~2^44 phrase1_sw~4^55, pf3, phrase_sw~2^222 phrase_sw^333 phrase1_sw~2^444 phrase1_sw~4^555} # pf -- phrase_sw with 3 different slop values results in 3 independent dismax queries DisjunctionMaxQuery((phrase_sw: ~1^10.0)) DisjunctionMaxQuery((phrase_sw: ~2^20.0)) DisjunctionMaxQuery((phrase_sw: ^30.0)) # pf2 -- phrase_sw and phrase1_sw were both supplied with a slop of 2, so those queries are grouped ( DisjunctionMaxQuery((phrase_sw: ~2^22.0 | phrase1_sw: ~2^44.0)) DisjunctionMaxQuery((phrase_sw: ~2^22.0 | phrase1_sw: ~2^44.0)) ) ( DisjunctionMaxQuery((phrase_sw: ^33.0)) DisjunctionMaxQuery((phrase_sw: ^33.0)) ) ( DisjunctionMaxQuery((phrase1_sw: ~4^55.0)) DisjunctionMaxQuery((phrase1_sw: ~4^55.0)) ) # pf3 DisjunctionMaxQuery((phrase_sw: ~2^222.0 | phrase1_sw: ~2^444.0)) DisjunctionMaxQuery((phrase_sw: ^333.0)) DisjunctionMaxQuery((phrase1_sw: ~4^555.0))) {code} Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query) - Key: SOLR-6062 URL: https://issues.apache.org/jira/browse/SOLR-6062 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 4.0 Reporter: Michael Dodsworth Priority: Minor Attachments: combined-phrased-dismax.patch https://issues.apache.org/jira/browse/SOLR-2058 subtly changed how phrase queries, created through the pf, pf2 and pf3 parameters, are merged into the main user query. 
For the query: 'term1 term2' with pf2:[field1, field2, field3] we now get (omitting the non phrase query section for clarity): {code:java} main query DisjunctionMaxQuery((field1:term1 term2^1.0)~0.1) DisjunctionMaxQuery((field2:term1 term2^1.0)~0.1) DisjunctionMaxQuery((field3:term1 term2^1.0)~0.1) {code} Prior to this change, we had: {code:java} main query DisjunctionMaxQuery((field1:term1 term2^1.0 | field2:term1 term2^1.0 | field3:term1 term2^1.0)~0.1) {code} The upshot being that if the phrase query term1 term2 appears in multiple fields, it will get a significant boost over the previous implementation. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
[jira] [Commented] (SOLR-6062) Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather than them being combined in a single dismax query)
[ https://issues.apache.org/jira/browse/SOLR-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010368#comment-14010368 ] Michael Dodsworth commented on SOLR-6062: - adding [~jdyer] [~janhoy], as you were involved in https://issues.apache.org/jira/browse/SOLR-2058 Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query) - Key: SOLR-6062 URL: https://issues.apache.org/jira/browse/SOLR-6062 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 4.0 Reporter: Michael Dodsworth Priority: Minor Attachments: combined-phrased-dismax.patch https://issues.apache.org/jira/browse/SOLR-2058 subtly changed how phrase queries, created through the pf, pf2 and pf3 parameters, are merged into the main user query. For the query: 'term1 term2' with pf2:[field1, field2, field3] we now get (omitting the non phrase query section for clarity): {code:java} main query DisjunctionMaxQuery((field1:term1 term2^1.0)~0.1) DisjunctionMaxQuery((field2:term1 term2^1.0)~0.1) DisjunctionMaxQuery((field3:term1 term2^1.0)~0.1) {code} Prior to this change, we had: {code:java} main query DisjunctionMaxQuery((field1:term1 term2^1.0 | field2:term1 term2^1.0 | field3:term1 term2^1.0)~0.1) {code} The upshot being that if the phrase query term1 term2 appears in multiple fields, it will get a significant boost over the previous implementation. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6062) Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather than them being combined in a single dismax query)
[ https://issues.apache.org/jira/browse/SOLR-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999275#comment-13999275 ] Michael Dodsworth commented on SOLR-6062: - all comments and feedback welcome. Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query) - Key: SOLR-6062 URL: https://issues.apache.org/jira/browse/SOLR-6062 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 4.0 Reporter: Michael Dodsworth Priority: Minor Attachments: combined-phrased-dismax.patch https://issues.apache.org/jira/browse/SOLR-2058 subtly changed how phrase queries, created through the pf, pf2 and pf3 parameters, are merged into the main user query. For the query: 'term1 term2' with pf2:[field1, field2, field3] we now get (omitting the non phrase query section for clarity): {code:java} main query DisjunctionMaxQuery((field1:term1 term2^1.0)~0.1) DisjunctionMaxQuery((field2:term1 term2^1.0)~0.1) DisjunctionMaxQuery((field3:term1 term2^1.0)~0.1) {code} Prior to this change, we had: {code:java} main query DisjunctionMaxQuery((field1:term1 term2^1.0 | field2:term1 term2^1.0 | field3:term1 term2^1.0)~0.1) {code} The upshot being that if the phrase query term1 term2 appears in multiple fields, it will get a significant boost over the previous implementation. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6062) a phrase query is created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather than them being combined in a single dismax query)
[ https://issues.apache.org/jira/browse/SOLR-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996085#comment-13996085 ]

Michael Dodsworth commented on SOLR-6062:

As was mentioned on this issue, the behavioral change was not desirable.

a phrase query is created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather than them being combined in a single dismax query)

Key: SOLR-6062
URL: https://issues.apache.org/jira/browse/SOLR-6062
Project: Solr
Issue Type: Bug
Components: query parsers
Affects Versions: 4.0
Reporter: Michael Dodsworth
Priority: Minor

https://issues.apache.org/jira/browse/SOLR-2058 subtly changed how phrase queries, created through the pf, pf2 and pf3 parameters, are merged into the main user query.

For the query: 'term1 term2' with pf2:[field1, field2, field3] we now get (omitting the non phrase query section for clarity):

{code:java}
main query
DisjunctionMaxQuery((field1:"term1 term2"^1.0)~0.1)
DisjunctionMaxQuery((field2:"term1 term2"^1.0)~0.1)
DisjunctionMaxQuery((field3:"term1 term2"^1.0)~0.1)
{code}

Prior to this change, we had:

{code:java}
main query
DisjunctionMaxQuery((field1:"term1 term2"^1.0 | field2:"term1 term2"^1.0 | field3:"term1 term2"^1.0)~0.1)
{code}

The upshot being that if the phrase query "term1 term2" appears in multiple fields, it will get a significant boost over the previous implementation.
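The size of the boost described in this issue can be sketched with a little arithmetic. This is an illustration, not Lucene's scorer: it assumes the usual dismax combination (the best clause score plus the tie-breaker times the sum of the remaining clause scores), assumes the enclosing boolean query simply sums its optional clauses, and ignores coord and normalization factors. The class DismaxScoreSketch, its method names, and the per-field score of 1.0 are made up for the example; the tie-breaker 0.1 is the ~0.1 shown in the query output.

```java
public class DismaxScoreSketch {

    /** Dismax combination: best clause score plus tieBreaker times the other clause scores. */
    public static double dismax(double tieBreaker, double... clauseScores) {
        if (clauseScores.length == 0) {
            return 0.0;
        }
        double max = Double.NEGATIVE_INFINITY;
        double sum = 0.0;
        for (double s : clauseScores) {
            max = Math.max(max, s);
            sum += s;
        }
        return max + tieBreaker * (sum - max);
    }

    /** One single-clause dismax query per field, summed by the enclosing query (the "we now get" form). */
    public static double perFieldQueries(double tieBreaker, double... fieldScores) {
        double total = 0.0;
        for (double s : fieldScores) {
            total += dismax(tieBreaker, s);
        }
        return total;
    }

    /** A single dismax query across all fields (the "prior to this change" form). */
    public static double combinedQuery(double tieBreaker, double... fieldScores) {
        return dismax(tieBreaker, fieldScores);
    }

    public static void main(String[] args) {
        double tie = 0.1;
        // Hypothetical case: the phrase matches in all three fields with the same score.
        System.out.println("per-field: " + perFieldQueries(tie, 1.0, 1.0, 1.0));
        System.out.println("combined:  " + combinedQuery(tie, 1.0, 1.0, 1.0));
    }
}
```

Under these assumptions, a phrase that matches equally in three fields contributes about three times a single field's score in the per-field form, against roughly 1.2 times in the combined form. When the phrase matches in only one field the two forms agree, so the difference only shows up for multi-field matches.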
[jira] [Updated] (SOLR-6062) Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query)
[ https://issues.apache.org/jira/browse/SOLR-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Dodsworth updated SOLR-6062:

Attachment: combined-phrased-dismax.patch

Rather than sending each FieldParam through addShingledPhraseQueries individually (which results in a dismax query per field), we're now grouping the phrase fields by their wordGram count and sending each *group* through addShingledPhraseQueries.

One slight complication is that the original/linked issue allowed the *same* field to be passed through a pf parameter with differing slop values, the intent being that those scores would be combined rather than the max being used across those fields. To keep supporting that feature, we're also grouping FieldParams by their associated slop values (passing each group through independently).

I've added a test for the multi-field case. If people are happy with the approach, I can combine the wordGram and slop value grouping into a single pass.

Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query)

Key: SOLR-6062
URL: https://issues.apache.org/jira/browse/SOLR-6062
Project: Solr
Issue Type: Bug
Components: query parsers
Affects Versions: 4.0
Reporter: Michael Dodsworth
Priority: Minor
Attachments: combined-phrased-dismax.patch
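The grouping described in this update can be sketched roughly as follows. This is a minimal illustration, not the code in combined-phrased-dismax.patch: `PhraseField` is a hypothetical stand-in for Solr's FieldParams, and the string group key is an assumption made for brevity. Each resulting group would then be handed to addShingledPhraseQueries as a unit, so that fields sharing a word-gram size and slop end up in one dismax query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PhraseFieldGrouping {

    /** Hypothetical stand-in for edismax's FieldParams; not the Solr class. */
    static final class PhraseField {
        final String field;
        final int wordGrams; // 0 = pf (whole phrase), 2 = pf2, 3 = pf3
        final int slop;
        final float boost;

        PhraseField(String field, int wordGrams, int slop, float boost) {
            this.field = field;
            this.wordGrams = wordGrams;
            this.slop = slop;
            this.boost = boost;
        }
    }

    /** Groups fields that share (wordGrams, slop) in a single pass, keeping first-seen order. */
    static Map<String, List<PhraseField>> group(List<PhraseField> fields) {
        Map<String, List<PhraseField>> groups = new LinkedHashMap<>();
        for (PhraseField f : fields) {
            String key = f.wordGrams + "/" + f.slop; // assumed key format, for illustration only
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(f);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<PhraseField> pf2 = Arrays.asList(
            new PhraseField("field1", 2, 0, 1.0f),
            new PhraseField("field2", 2, 0, 1.0f),
            new PhraseField("field1", 2, 5, 0.5f)); // same field, different slop
        group(pf2).forEach((key, members) ->
            System.out.println(key + " -> " + members.size() + " field(s)"));
    }
}
```

Keying on both values in one map is one way to do the "single pass" mentioned at the end of the comment; the same field with two slop values lands in two groups, so its scores are still combined rather than maxed.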
[jira] [Commented] (SOLR-6062) Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query)
[ https://issues.apache.org/jira/browse/SOLR-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996547#comment-13996547 ]

Michael Dodsworth commented on SOLR-6062:

[~ndushay] [~ramayer]

Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query)

Key: SOLR-6062
URL: https://issues.apache.org/jira/browse/SOLR-6062
Project: Solr
Issue Type: Bug
Components: query parsers
Affects Versions: 4.0
Reporter: Michael Dodsworth
Priority: Minor
Attachments: combined-phrased-dismax.patch
[jira] [Updated] (SOLR-6062) Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query)
[ https://issues.apache.org/jira/browse/SOLR-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Dodsworth updated SOLR-6062:

Summary: Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query) (was: a phrase query is created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query))

Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query)

Key: SOLR-6062
URL: https://issues.apache.org/jira/browse/SOLR-6062
Project: Solr
Issue Type: Bug
Components: query parsers
Affects Versions: 4.0
Reporter: Michael Dodsworth
Priority: Minor
[jira] [Created] (SOLR-6062) a phrase query is created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query)
Michael Dodsworth created SOLR-6062:

Summary: a phrase query is created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query)
Key: SOLR-6062
URL: https://issues.apache.org/jira/browse/SOLR-6062
Project: Solr
Issue Type: Bug
Components: query parsers
Affects Versions: 4.0
Reporter: Michael Dodsworth
Priority: Minor
[jira] [Commented] (SOLR-2058) Adds optional phrase slop to edismax pf2, pf3 and pf parameters with field~slop^boost syntax
[ https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472796#comment-13472796 ]

Michael Dodsworth commented on SOLR-2058:

Can anyone comment on whether this change was intentional or accidental (and unwanted)?

Adds optional phrase slop to edismax pf2, pf3 and pf parameters with field~slop^boost syntax

Key: SOLR-2058
URL: https://issues.apache.org/jira/browse/SOLR-2058
Project: Solr
Issue Type: Improvement
Components: query parsers
Environment: n/a
Reporter: Ron Mayer
Assignee: James Dyer
Priority: Minor
Fix For: 4.0-ALPHA
Attachments: edismax_pf_with_slop_v2.1.patch, edismax_pf_with_slop_v2.patch, pf2_with_slop.patch, SOLR-2058-and-3351-not-finished.patch, SOLR-2058.patch

http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3c4c659119.2010...@0ape.com%3E

{quote}
From Ron Mayer r...@0ape.com

... my results might be even better if I had a couple different pf2s with different ps's at the same time. In particular. One with ps=0 to put a high boost on ones the have the right ordering of words. For example insuring that [the query]: red hat black jacket boosts only documents with red hats and not black hats. And another pf2 with a more modest boost with ps=5 or so to handle the query above also boosting docs with red baseball hat.
{quote}

[http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3caanlktimd+v3g6d_mnhp+jykkd+dej8fvmvf_1lqoi...@mail.gmail.com%3E]

{quote}
From Yonik Seeley yo...@lucidimagination.com

Perhaps fold it into the pf/pf2 syntax?

pf=text^2    // current syntax... makes phrases with a boost of 2
pf=text~1^2  // proposed syntax... makes phrases with a slop of 1 and a boost of 2

That actually seems pretty natural given the lucene query syntax - an actual boosted sloppy phrase query already looks like {{text:"foo bar"~1^2}}

-Yonik
{quote}

[http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3calpine.deb.1.10.1008161300510.6...@radix.cryptio.net%3E]

{quote}
From Chris Hostetter hossman_luc...@fucit.org

Big +1 to this idea ... the existing ps param can stick arround as the default for any field that doesn't specify it's own slop in the pf/pf2/pf3 fields using the ~ syntax.

-Hoss
{quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
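To make the field~slop^boost syntax from the quoted thread concrete, here is a small parsing sketch. It is an assumption-laden illustration and not the parser from any of the attached patches: the class name, the regular expression, and the default boost of 1.0 are all hypothetical. The `defaultSlop` argument plays the role of the existing ps parameter, which the thread suggests should remain the fallback for fields that do not specify their own slop.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PhraseFieldSpec {
    // field name, then optional "~slop", then optional "^boost"
    private static final Pattern SPEC =
        Pattern.compile("([^~^\\s]+)(?:~(\\d+))?(?:\\^(\\d+(?:\\.\\d+)?))?");

    final String field;
    final int slop;
    final float boost;

    private PhraseFieldSpec(String field, int slop, float boost) {
        this.field = field;
        this.slop = slop;
        this.boost = boost;
    }

    /** Parses e.g. "text", "text^2" or "text~1^2"; defaultSlop stands in for the ps parameter. */
    static PhraseFieldSpec parse(String spec, int defaultSlop) {
        Matcher m = SPEC.matcher(spec.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unparseable phrase field: " + spec);
        }
        int slop = m.group(2) != null ? Integer.parseInt(m.group(2)) : defaultSlop;
        float boost = m.group(3) != null ? Float.parseFloat(m.group(3)) : 1.0f;
        return new PhraseFieldSpec(m.group(1), slop, boost);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"text^2", "text~1^2"}) {
            PhraseFieldSpec p = parse(s, 0);
            System.out.println(s + " -> field=" + p.field + " slop=" + p.slop + " boost=" + p.boost);
        }
    }
}
```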
[jira] [Commented] (SOLR-2058) Adds optional phrase slop to edismax pf2, pf3 and pf parameters with field~slop^boost syntax
[ https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463009#comment-13463009 ]

Michael Dodsworth commented on SOLR-2058:

It looks like this change also altered the way phrase queries are merged into the main query.

For the query 'term1 term2' with pf2:[field1, field2, field3], we now get (omitting the non-phrase-query section for clarity):

{code}
main query
DisjunctionMaxQuery((field1:"term1 term2"^1.0)~0.1)
DisjunctionMaxQuery((field2:"term1 term2"^1.0)~0.1)
DisjunctionMaxQuery((field3:"term1 term2"^1.0)~0.1)
{code}

Prior to this change, we got:

{code}
main query
DisjunctionMaxQuery((field1:"term1 term2"^1.0 | field2:"term1 term2"^1.0 | field3:"term1 term2"^1.0)~0.1)
{code}

The upshot being that if the phrase query "term1 term2" appears in multiple fields, it will get a significant boost over the previous implementation.

The presence of the dismax queries makes me think this behavioral change was not intentional; if that's the case, let me know and I'll get a fix together.

Thanks.

Adds optional phrase slop to edismax pf2, pf3 and pf parameters with field~slop^boost syntax

Key: SOLR-2058
URL: https://issues.apache.org/jira/browse/SOLR-2058
Project: Solr
Issue Type: Improvement
Components: query parsers
Environment: n/a
Reporter: Ron Mayer
Assignee: James Dyer
Priority: Minor
Fix For: 4.0-ALPHA
Attachments: edismax_pf_with_slop_v2.1.patch, edismax_pf_with_slop_v2.patch, pf2_with_slop.patch, SOLR-2058-and-3351-not-finished.patch, SOLR-2058.patch
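The size of the boost difference described in this comment can be estimated with a little arithmetic. The sketch below is a simplified model, not Lucene code: it assumes a dismax query scores as the best sub-score plus the tie-breaker times the remaining sub-scores, that separate optional clauses simply add up, and it ignores coord and query normalization. The per-field phrase scores of 1.0 are made-up inputs.

```java
class DismaxScoreSketch {

    /** Dismax combination: best sub-score plus tie * (sum of the other sub-scores). */
    static double dismax(double tie, double... scores) {
        double max = 0.0;
        double sum = 0.0;
        for (double s : scores) {
            max = Math.max(max, s);
            sum += s;
        }
        return max + tie * (sum - max);
    }

    /** Old behaviour: one dismax query across all phrase fields. */
    static double combined(double tie, double... perField) {
        return dismax(tie, perField);
    }

    /** New behaviour: one single-field dismax query per field, added together. */
    static double perFieldClauses(double tie, double... perField) {
        double total = 0.0;
        for (double s : perField) {
            total += dismax(tie, s);
        }
        return total;
    }

    public static void main(String[] args) {
        double tie = 0.1; // the ~0.1 in the debug output above
        double[] scores = {1.0, 1.0, 1.0}; // hypothetical phrase score in each of three fields
        System.out.println("combined dismax:   " + combined(tie, scores));
        System.out.println("per-field clauses: " + perFieldClauses(tie, scores));
    }
}
```

Under these assumptions, a phrase matching equally in all three fields contributes about 1.2 when the fields share one dismax query and 3.0 when each field gets its own, which is the "significant boost" the comment refers to.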
[jira] [Commented] (SOLR-3725) package-local-src-tgz target is pulling in non-source jars, dist/** and package/**
[ https://issues.apache.org/jira/browse/SOLR-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434466#comment-13434466 ]

Michael Dodsworth commented on SOLR-3725:

Excellent. Thanks, Robert.

package-local-src-tgz target is pulling in non-source jars, dist/** and package/**

Key: SOLR-3725
URL: https://issues.apache.org/jira/browse/SOLR-3725
Project: Solr
Issue Type: Improvement
Components: Build
Affects Versions: 4.0-ALPHA
Reporter: Michael Dodsworth
Priority: Minor
Fix For: 5.0, 4.0
Attachments: SOLR-3725.patch

package-local-src-tgz generates a 141M archive which contains a bunch of non-source jars:

{code}
tar tfz apache-solr-4.0-SNAPSHOT-src.tgz | grep -E '(war|jar)$' | wc -l
134
{code}

It looks like we're expecting dist/** and package/** to be excluded:

{code:xml}
<tarfileset dir="." prefix="${fullnamever}/solr" excludes="build ${package.dir}/** ${dist}/** example/webapps/*.war example/exampledocs/post.jar lib/README.committers.txt **/data/ **/logs/* **/*.sh **/bin/ scripts/ .idea/ **/*.iml **/pom.xml" />
{code}

The issue is that package.dir and dist refer to absolute paths; excludes assumes relative paths. It's also pulling in all the contrib/**/lib/ and example/lib/ jars.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
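The absolute-versus-relative mismatch in the description can be illustrated outside of Ant. The sketch below is only an illustration of the path arithmetic, not what SOLR-3725.patch does: the class name and the `/work/solr` directories are hypothetical, and it assumes Unix-style paths. It shows how an absolute directory would have to be rewritten relative to the fileset's base directory before it could match as an exclude pattern.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class RelativeExclude {

    /**
     * Turns an absolute directory into an exclude pattern relative to baseDir.
     * Both arguments are expected to be absolute; a directory outside baseDir
     * yields a "../" prefix, which could never match inside the fileset.
     */
    static String toExcludePattern(String baseDir, String absoluteDir) {
        Path base = Paths.get(baseDir).normalize();
        Path target = Paths.get(absoluteDir).normalize();
        String relative = base.relativize(target).toString().replace('\\', '/');
        return relative + "/**";
    }

    public static void main(String[] args) {
        // Hypothetical values standing in for ${dist} and ${package.dir}.
        System.out.println(toExcludePattern("/work/solr", "/work/solr/dist"));
        System.out.println(toExcludePattern("/work/solr", "/work/solr/package"));
    }
}
```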
[jira] [Created] (SOLR-3725) package-local-src-tgz target is pulling in non-source jars, dist/** and package/**
Michael Dodsworth created SOLR-3725:

Summary: package-local-src-tgz target is pulling in non-source jars, dist/** and package/**
Key: SOLR-3725
URL: https://issues.apache.org/jira/browse/SOLR-3725
Project: Solr
Issue Type: Improvement
Components: Build
Affects Versions: 4.0-ALPHA
Reporter: Michael Dodsworth
Priority: Minor
Fix For: 4.1
[jira] [Commented] (SOLR-3725) package-local-src-tgz target is pulling in non-source jars, dist/** and package/**
[ https://issues.apache.org/jira/browse/SOLR-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13432136#comment-13432136 ]

Michael Dodsworth commented on SOLR-3725:

It's also including everything from solr/build and solr/lib/.

package-local-src-tgz target is pulling in non-source jars, dist/** and package/**

Key: SOLR-3725
URL: https://issues.apache.org/jira/browse/SOLR-3725
Project: Solr
Issue Type: Improvement
Components: Build
Affects Versions: 4.0-ALPHA
Reporter: Michael Dodsworth
Priority: Minor
Fix For: 4.1
[jira] [Updated] (SOLR-3725) package-local-src-tgz target is pulling in non-source jars, dist/** and package/**
[ https://issues.apache.org/jira/browse/SOLR-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Dodsworth updated SOLR-3725:

Attachment: SOLR-3725.patch

The generated archive is now 31M.

package-local-src-tgz target is pulling in non-source jars, dist/** and package/**

Key: SOLR-3725
URL: https://issues.apache.org/jira/browse/SOLR-3725
Project: Solr
Issue Type: Improvement
Components: Build
Affects Versions: 4.0-ALPHA
Reporter: Michael Dodsworth
Priority: Minor
Fix For: 4.1
Attachments: SOLR-3725.patch
[jira] [Commented] (SOLR-3580) In ExtendedDismax, lowercase 'not' operator is not being treated as an operator when 'lowercaseOperators' is enabled
[ https://issues.apache.org/jira/browse/SOLR-3580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13425847#comment-13425847 ]

Michael Dodsworth commented on SOLR-3580:

Does this seem like a reasonable direction to everyone?

In ExtendedDismax, lowercase 'not' operator is not being treated as an operator when 'lowercaseOperators' is enabled

Key: SOLR-3580
URL: https://issues.apache.org/jira/browse/SOLR-3580
Project: Solr
Issue Type: Bug
Components: query parsers
Affects Versions: 4.0-ALPHA
Reporter: Michael Dodsworth
Priority: Minor
Fix For: 4.0
Attachments: SOLR-3580-proposal.patch, SOLR-3580.patch

When lowercase operator support is enabled (for edismax), the lowercase 'not' operator is being wrongly treated as a literal term (and not as an operator).
[jira] [Commented] (SOLR-3580) In ExtendedDismax, lowercase 'not' operator is not being treated as an operator when 'lowercaseOperators' is enabled
[ https://issues.apache.org/jira/browse/SOLR-3580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13425932#comment-13425932 ]

Michael Dodsworth commented on SOLR-3580:

Thanks for the feedback, Jack/Yonik.

1 - support for mixed-case operators is as before: they are interpreted as operators. Having said that, there appears to be a subtle bug with the 'mm' toggling behaviour. The operator counting (used to determine whether 'mm' needs to be disabled) only accepts strict uppercase and lowercase, whereas the query rebuild accepts mixed-case. I can also fix that up and add a test.

2 - the 'supportedLowercaseOperators' parameter would be in addition to 'lowercaseOperators', rather than replacing it. If 'lowercaseOperators' is true, we look for a 'supportedLowercaseOperators' value. If no value is provided, we use the default (and, or), which means we have backwards compatibility.

Yonik - yeah, Jan's proposal is absolutely the most flexible. I guess my concerns were:
- that it might snowball into wanting to have an external, stopword-esque file for per-language operator support (minor concern)
- that we'd lose some backwards compatibility, as currently mixed-case operators are supported (although the default set could be expanded to accommodate this, if needed)
- the interaction between the 'lowercaseOperators' parameter and 'valid*' might get a little funky. For example, if we simply ignore 'lowercaseOperators' when a 'valid*' parameter is present, there is no potential for confusion BUT toggling lowercase operator support per query then becomes a headache (as the upstream client needs to pass through the supported uppercase operators). If we allow interaction between 'lowercaseOperators' and 'valid*', which parameter takes priority? To allow toggling per-query, lowercaseOperators *should* take priority.

Perhaps a good dollop of documentation would be enough here.

Let me extend the patch to switch over to Jan's proposal so people can take a look.

Cheers,

In ExtendedDismax, lowercase 'not' operator is not being treated as an operator when 'lowercaseOperators' is enabled

Key: SOLR-3580
URL: https://issues.apache.org/jira/browse/SOLR-3580
Project: Solr
Issue Type: Bug
Components: query parsers
Affects Versions: 4.0-ALPHA
Reporter: Michael Dodsworth
Priority: Minor
Fix For: 4.0
Attachments: SOLR-3580-proposal.patch, SOLR-3580.patch
[jira] [Commented] (SOLR-3467) ExtendedDismax escaping is missing several reserved characters
[ https://issues.apache.org/jira/browse/SOLR-3467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13405147#comment-13405147 ]

Michael Dodsworth commented on SOLR-3467:

Much appreciated, Jan. Thank you.

ExtendedDismax escaping is missing several reserved characters

Key: SOLR-3467
URL: https://issues.apache.org/jira/browse/SOLR-3467
Project: Solr
Issue Type: Bug
Components: query parsers
Affects Versions: 3.6
Reporter: Michael Dodsworth
Assignee: Jan Høydahl
Priority: Minor
Fix For: 4.0, 5.0
Attachments: SOLR-3467-lucene_solr_3_6.patch, SOLR-3467.patch, SOLR-3467.patch, SOLR-3467.patch, SOLR-3467.patch

When edismax is unable to parse the original user query, it retries using an escaped version of that query (where all reserved chars have been escaped).

Currently, the escaping done in {{splitIntoClauses}} appears to be missing several chars from {{QueryParserBase#escape(String)}}, namely {{'\\', '|', '&', '/'}}
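For reference, the escaping that the description compares against can be approximated in a few lines. This is a standalone sketch, not the code in `splitIntoClauses` nor a copy of `QueryParserBase#escape(String)`: the class name is hypothetical, and the reserved set is written out from the Lucene 4.x query syntax, in which '/' is also reserved.

```java
class ReservedCharEscaper {

    // Characters treated as reserved by the Lucene 4.x classic query syntax.
    private static final String RESERVED = "\\+-!():^[]\"{}~*?|&/";

    /** Prefixes every reserved character with a backslash. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 8);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (RESERVED.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("a/b|c"));
        System.out.println(escape("title:(x AND y)"));
    }
}
```

Keeping the reserved set in one place is a deliberate choice here: a retry path that uses a shorter list than the parser's own, as the description reports, can produce an "escaped" query that still fails to parse.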
[jira] [Updated] (SOLR-3580) In ExtendedDismax, lowercase 'not' operator is not being treated as an operator when 'lowercaseOperators' is enabled
[ https://issues.apache.org/jira/browse/SOLR-3580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Dodsworth updated SOLR-3580:

Attachment: SOLR-3580-proposal.patch

Adding the 'supportedLowercaseOperator' parameter I mentioned. Also cleaned up a few unused vars, assignments, etc. It requires more tests, but I'm interested to hear what people think.

In ExtendedDismax, lowercase 'not' operator is not being treated as an operator when 'lowercaseOperators' is enabled

Key: SOLR-3580
URL: https://issues.apache.org/jira/browse/SOLR-3580
Project: Solr
Issue Type: Bug
Components: query parsers
Affects Versions: 4.0
Reporter: Michael Dodsworth
Priority: Minor
Fix For: 4.0
Attachments: SOLR-3580-proposal.patch, SOLR-3580.patch
[jira] [Commented] (SOLR-3580) In ExtendedDismax, lowercase 'not' operator is not being treated as an operator when 'lowercaseOperators' is enabled
[ https://issues.apache.org/jira/browse/SOLR-3580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13403641#comment-13403641 ]

Michael Dodsworth commented on SOLR-3580:

One option (that sits somewhere between the 2 proposed solutions) may be to have a 'supportedLowercaseOperators' setting that takes a comma-separated list of supported operators. If no override is provided, the default behaviour would be to accept '[and,or]'.

{code:xml}
<str name="supportedLowercaseOperators">and,or,not</str>
{code}

Let me get a patch together so people can take a look.

In ExtendedDismax, lowercase 'not' operator is not being treated as an operator when 'lowercaseOperators' is enabled

Key: SOLR-3580
URL: https://issues.apache.org/jira/browse/SOLR-3580
Project: Solr
Issue Type: Bug
Components: query parsers
Affects Versions: 4.0
Reporter: Michael Dodsworth
Priority: Minor
Fix For: 4.0
Attachments: SOLR-3580.patch
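The proposal in this comment could behave roughly as sketched below. This is a hypothetical illustration rather than the contents of SOLR-3580-proposal.patch: the class and method names are invented, and the handling of mixed-case tokens is deliberately left out. It shows only the two points the comment makes: the setting is a comma-separated list, and an absent value falls back to 'and' and 'or'.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

class LowercaseOperators {

    /** Default when no 'supportedLowercaseOperators' value is supplied. */
    static final Set<String> DEFAULT =
        Collections.unmodifiableSet(new LinkedHashSet<>(Arrays.asList("and", "or")));

    /** Parses a comma-separated list such as "and,or,not"; null or blank means the default. */
    static Set<String> parseSupported(String param) {
        if (param == null || param.trim().isEmpty()) {
            return DEFAULT;
        }
        Set<String> ops = new LinkedHashSet<>();
        for (String op : param.split(",")) {
            String trimmed = op.trim().toLowerCase(Locale.ROOT);
            if (!trimmed.isEmpty()) {
                ops.add(trimmed);
            }
        }
        return ops;
    }

    /** Uppercase operators always count; lowercase ones only when enabled and listed. */
    static boolean isOperator(String token, boolean lowercaseOperators, Set<String> supported) {
        if (token.equals("AND") || token.equals("OR") || token.equals("NOT")) {
            return true;
        }
        return lowercaseOperators && supported.contains(token);
    }

    public static void main(String[] args) {
        Set<String> configured = parseSupported("and,or,not");
        System.out.println(isOperator("not", true, configured));           // listed and enabled
        System.out.println(isOperator("not", true, parseSupported(null))); // default set only
    }
}
```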
[jira] [Commented] (SOLR-3467) ExtendedDismax escaping is missing several reserved characters
[ https://issues.apache.org/jira/browse/SOLR-3467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402287#comment-13402287 ]

Michael Dodsworth commented on SOLR-3467:

Thank you, Jan. From what I can tell, '/' became a reserved character only in 4.0 - https://issues.apache.org/jira/browse/LUCENE-2604.

ExtendedDismax escaping is missing several reserved characters

Key: SOLR-3467
URL: https://issues.apache.org/jira/browse/SOLR-3467
Project: Solr
Issue Type: Bug
Components: query parsers
Affects Versions: 3.6
Reporter: Michael Dodsworth
Assignee: Jan Høydahl
Priority: Minor
Fix For: 4.0, 3.6.1, 5.0
Attachments: SOLR-3467-lucene_solr_3_6.patch, SOLR-3467.patch, SOLR-3467.patch, SOLR-3467.patch
[jira] [Commented] (SOLR-3580) In ExtendedDismax, lowercase 'not' operator is not being treated as an operator when 'lowercaseOperators' is enabled
[ https://issues.apache.org/jira/browse/SOLR-3580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402318#comment-13402318 ]

Michael Dodsworth commented on SOLR-3580:

Surely that's a more general hazard with supporting lowercase operators. It seems strange to give 'not' special treatment. There are likely examples where having 'and' or 'or' wrongly treated as an operator /is/ catastrophic, so the onus should be on the client to choose the correct 'lowercaseOperator' option for their use-case.

In ExtendedDismax, lowercase 'not' operator is not being treated as an operator when 'lowercaseOperators' is enabled

Key: SOLR-3580
URL: https://issues.apache.org/jira/browse/SOLR-3580
Project: Solr
Issue Type: Bug
Components: query parsers
Affects Versions: 4.0
Reporter: Michael Dodsworth
Priority: Minor
Fix For: 4.0
Attachments: SOLR-3580.patch
[jira] [Commented] (SOLR-3580) In ExtendedDismax, lowercase 'not' operator is not being treated as an operator when 'lowercaseOperators' is enabled
[ https://issues.apache.org/jira/browse/SOLR-3580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402350#comment-13402350 ] Michael Dodsworth commented on SOLR-3580: - Were we not allowing the user to explicitly *specify* that they want to support lowercase operators, I might agree. That setting should (at the very least) come with a clear health warning so that more people aren't caught out by this. In ExtendedDismax, lowercase 'not' operator is not being treated as an operator when 'lowercaseOperators' is enabled Key: SOLR-3580 URL: https://issues.apache.org/jira/browse/SOLR-3580 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 4.0 Reporter: Michael Dodsworth Priority: Minor Fix For: 4.0 Attachments: SOLR-3580.patch When lowercase operator support is enabled (for edismax), the lowercase 'not' operator is being wrongly treated as a literal term (and not as an operator).
[jira] [Created] (SOLR-3580) In ExtendedDismax, lowercase 'not' operator is not being treated as an operator when 'lowercaseOperators' is enabled
Michael Dodsworth created SOLR-3580: --- Summary: In ExtendedDismax, lowercase 'not' operator is not being treated as an operator when 'lowercaseOperators' is enabled Key: SOLR-3580 URL: https://issues.apache.org/jira/browse/SOLR-3580 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 4.0 Reporter: Michael Dodsworth Priority: Minor Fix For: 4.0 When lowercase operator support is enabled (for edismax), the lowercase 'not' operator is being wrongly treated as a literal term (and not as an operator).
[jira] [Updated] (SOLR-3580) In ExtendedDismax, lowercase 'not' operator is not being treated as an operator when 'lowercaseOperators' is enabled
[ https://issues.apache.org/jira/browse/SOLR-3580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Dodsworth updated SOLR-3580: Attachment: SOLR-3580.patch patch includes: - fix to edismax - replacement operator test (covers upper and lowercase 'AND', 'OR' and 'NOT' operators with 'lowercaseOperators' enabled/disabled) - small clear-up in the test (adding @Test annotations and removing redundant 'throws IOException's) In ExtendedDismax, lowercase 'not' operator is not being treated as an operator when 'lowercaseOperators' is enabled Key: SOLR-3580 URL: https://issues.apache.org/jira/browse/SOLR-3580 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 4.0 Reporter: Michael Dodsworth Priority: Minor Fix For: 4.0 Attachments: SOLR-3580.patch When lowercase operator support is enabled (for edismax), the lowercase 'not' operator is being wrongly treated as a literal term (and not as an operator).
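The behaviour SOLR-3580 asks for can be sketched roughly as follows. This is not the attached patch and does not use edismax's own classes: the class and method names are hypothetical, the input is assumed to be already split into whitespace-separated terms, and the positional rules (binary operators need a term on both sides, 'not' needs a term after it) are an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical illustration only; not part of Solr. */
final class LowercaseOperatorSketch {

    /**
     * Upper-cases lowercase 'and', 'or' and 'not' terms when lowercaseOperators is true,
     * so that a downstream parser would see them as operators instead of literal terms.
     */
    static List<String> normalize(List<String> terms, boolean lowercaseOperators) {
        List<String> out = new ArrayList<>(terms.size());
        for (int i = 0; i < terms.size(); i++) {
            String term = terms.get(i);
            boolean first = i == 0;
            boolean last = i == terms.size() - 1;
            if (lowercaseOperators && !last) {
                // Assumption: 'not' is a prefix operator, so it only needs a following term.
                if (term.equals("not")) {
                    out.add("NOT");
                    continue;
                }
                // Assumption: 'and' / 'or' are binary, so they also need a preceding term.
                if (!first && (term.equals("and") || term.equals("or"))) {
                    out.add(term.toUpperCase(Locale.ROOT));
                    continue;
                }
            }
            out.add(term);
        }
        return out;
    }
}
```

With the flag disabled the terms pass through untouched, which matches the point made in the comments above: the client opts in to this behaviour, including for 'not'.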
[jira] [Updated] (SOLR-3467) ExtendedDismax escaping is missing several reserved characters
[ https://issues.apache.org/jira/browse/SOLR-3467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Dodsworth updated SOLR-3467: Attachment: (was: SOLR-3467.patch) ExtendedDismax escaping is missing several reserved characters -- Key: SOLR-3467 URL: https://issues.apache.org/jira/browse/SOLR-3467 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 4.0 Reporter: Michael Dodsworth Priority: Minor Fix For: 4.0 Attachments: SOLR-3467.patch, SOLR-3467.patch When edismax is unable to parse the original user query, it retries using an escaped version of that query (where all reserved chars have been escaped). Currently, the escaping done in {{splitIntoClauses}} appears to be missing several chars from {{QueryParserBase#escape(String)}}, namely {{'\\', '|', '&', '/'}}
[jira] [Updated] (SOLR-3467) ExtendedDismax escaping is missing several reserved characters
[ https://issues.apache.org/jira/browse/SOLR-3467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Dodsworth updated SOLR-3467: Attachment: SOLR-3467.patch SOLR-3467.patch added test. Thanks for looking at the patch. ExtendedDismax escaping is missing several reserved characters -- Key: SOLR-3467 URL: https://issues.apache.org/jira/browse/SOLR-3467 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 4.0 Reporter: Michael Dodsworth Priority: Minor Fix For: 4.0 Attachments: SOLR-3467.patch, SOLR-3467.patch When edismax is unable to parse the original user query, it retries using an escaped version of that query (where all reserved chars have been escaped). Currently, the escaping done in {{splitIntoClauses}} appears to be missing several chars from {{QueryParserBase#escape(String)}}, namely {{'\\', '|', '&', '/'}}
[jira] [Commented] (SOLR-3467) ExtendedDismax escaping is missing several reserved characters
[ https://issues.apache.org/jira/browse/SOLR-3467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400895#comment-13400895 ] Michael Dodsworth commented on SOLR-3467: - All feedback/review comments welcome. ExtendedDismax escaping is missing several reserved characters -- Key: SOLR-3467 URL: https://issues.apache.org/jira/browse/SOLR-3467 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 4.0 Reporter: Michael Dodsworth Priority: Minor Fix For: 4.0 Attachments: SOLR-3467.patch When edismax is unable to parse the original user query, it retries using an escaped version of that query (where all reserved chars have been escaped). Currently, the escaping done in {{splitIntoClauses}} appears to be missing several chars from {{QueryParserBase#escape(String)}}, namely {{'\\', '|', '&', '/'}}
[jira] [Created] (SOLR-3467) ExtendedDismax escaping is missing several reserved characters
Michael Dodsworth created SOLR-3467: --- Summary: ExtendedDismax escaping is missing several reserved characters Key: SOLR-3467 URL: https://issues.apache.org/jira/browse/SOLR-3467 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 4.0 Reporter: Michael Dodsworth Priority: Minor Fix For: 4.0 When edismax is unable to parse the original user query, it retries using an escaped version of that query (where all reserved chars have been escaped). Currently, the escaping done in {{splitIntoClauses}} appears to be missing several chars from {{QueryParserBase#escape(String)}}, namely {{'\\', '|', '&', '/'}}
[jira] [Updated] (SOLR-3467) ExtendedDismax escaping is missing several reserved characters
[ https://issues.apache.org/jira/browse/SOLR-3467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Dodsworth updated SOLR-3467: Attachment: SOLR-3467.patch ExtendedDismax escaping is missing several reserved characters -- Key: SOLR-3467 URL: https://issues.apache.org/jira/browse/SOLR-3467 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 4.0 Reporter: Michael Dodsworth Priority: Minor Fix For: 4.0 Attachments: SOLR-3467.patch When edismax is unable to parse the original user query, it retries using an escaped version of that query (where all reserved chars have been escaped). Currently, the escaping done in {{splitIntoClauses}} appears to be missing several chars from {{QueryParserBase#escape(String)}}, namely {{'\\', '|', '&', '/'}}
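For readers following SOLR-3467, the escaping step it describes can be sketched like this. It is a minimal standalone illustration, not the attached patch and not edismax's {{splitIntoClauses}}: the class name is hypothetical, and the reserved-character set is an assumption intended to mirror {{QueryParserBase#escape(String)}} in Lucene 4.0, so it should be checked against the Lucene version in use.

```java
/** Hypothetical illustration only; not part of Solr or Lucene. */
final class ReservedCharEscaper {

    // Assumed reserved set, including the four characters the issue reports as missing:
    // backslash, '|', '&' and '/'.
    private static final String RESERVED = "\\+-!():^[]\"{}~*?|&/";

    /** Prefixes every reserved character with a backslash so it is read as a literal. */
    static String escape(String query) {
        StringBuilder sb = new StringBuilder(query.length() + 8);
        for (int i = 0; i < query.length(); i++) {
            char c = query.charAt(i);
            if (RESERVED.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }
}
```

Keeping the set in one place means the retry path and the general escape routine cannot drift apart, which appears to be how the four characters went missing in the first place.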
[jira] Commented: (LUCENE-1227) NGramTokenizer to handle more than 1024 chars
[ https://issues.apache.org/jira/browse/LUCENE-1227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12660924#action_12660924 ] Michael Dodsworth commented on LUCENE-1227: --- Any progress on getting this patch into a release? I can take a look if nobody else is. NGramTokenizer to handle more than 1024 chars - Key: LUCENE-1227 URL: https://issues.apache.org/jira/browse/LUCENE-1227 Project: Lucene - Java Issue Type: Improvement Components: contrib/* Reporter: Hiroaki Kawai Assignee: Grant Ingersoll Priority: Minor Attachments: LUCENE-1227.patch, NGramTokenizer.patch, NGramTokenizer.patch The current NGramTokenizer can't handle a character stream that is longer than 1024 characters. This is too short for non-whitespace-separated languages. I created a patch for this issue. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
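To make the LUCENE-1227 limitation concrete, here is a minimal sketch of n-gram generation that keeps reading until the stream is exhausted instead of stopping after one fixed-size buffer. It is not the attached patch and does not use Lucene's Tokenizer API: the class and method names are hypothetical, and buffering the whole input in memory is a simplification that a real streaming tokenizer would probably avoid.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not part of Lucene. */
final class NGramSketch {

    /** Reads the reader to the end; a single read(buf) call may return only part of the input. */
    static String readFully(Reader in) {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        try {
            int n;
            while ((n = in.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    /** Returns every substring whose length is between minGram and maxGram, shortest first. */
    static List<String> ngrams(Reader in, int minGram, int maxGram) {
        String text = readFully(in);
        List<String> out = new ArrayList<>();
        for (int size = minGram; size <= maxGram; size++) {
            for (int start = 0; start + size <= text.length(); start++) {
                out.add(text.substring(start, start + size));
            }
        }
        return out;
    }
}
```

For a 3000-character input and bigrams only, this yields 2999 grams, whereas an implementation that reads one 1024-character buffer would stop at 1023.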
[jira] Created: (LUCENE-1352) trailing escaped backslashes in quoted queries cause parse error
trailing escaped backslashes in quoted queries cause parse error Key: LUCENE-1352 URL: https://issues.apache.org/jira/browse/LUCENE-1352 Project: Lucene - Java Issue Type: Bug Components: QueryParser Affects Versions: 2.3.2 Environment: Ubuntu 7.04, Sun JVM 1.5.0_1 Reporter: Michael Dodsworth The QueryParser fails to parse queries that contain escaped backslashes followed by a closing double-quote, then an opening double-quote (as part of another term). For example, the query: tagOrig:"testing\\" title:"titleTest" will fail with the exception: org.apache.lucene.queryParser.ParseException: Cannot parse 'tagOrig:"testing\\" title:"titleTest"': Lexical error at line 1, column 38. Encountered: <EOF> after : at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:155) at org.apache.solr.search.LuceneQParser.parse(LuceneQParserPlugin.java:79) After digging around, I found that 'QueryParserTokenManager:jjMoveNfa_3' is generating 'testing\\\" title:' as the token following the opening quote. It should be generating 'testing\\'; it appears to see the first double-quote as being escaped by the preceding slashes.
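The tokenization LUCENE-1352 expects comes down to one rule: a double-quote is escaped only when it is preceded by an odd number of consecutive backslashes. The sketch below illustrates that rule in isolation. It is not a fix for the JavaCC-generated QueryParserTokenManager, and the class and method names are hypothetical.

```java
/** Hypothetical illustration only; not part of Lucene. */
final class QuoteEscapeSketch {

    /** True if the character at pos is preceded by an odd number of consecutive backslashes. */
    static boolean isEscaped(String s, int pos) {
        int backslashes = 0;
        for (int i = pos - 1; i >= 0 && s.charAt(i) == '\\'; i--) {
            backslashes++;
        }
        return (backslashes & 1) == 1;
    }

    /** Returns the index of the unescaped double-quote that closes the one at open, or -1. */
    static int closingQuote(String s, int open) {
        for (int i = open + 1; i < s.length(); i++) {
            if (s.charAt(i) == '"' && !isEscaped(s, i)) {
                return i;
            }
        }
        return -1;
    }
}
```

Applied to a quoted term that ends in an escaped backslash, the two backslashes pair up, so the quote right after them closes the phrase and the phrase content ends with the escaped backslash rather than running on into the next term.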
[jira] Commented: (SOLR-281) Search Components (plugins)
[ https://issues.apache.org/jira/browse/SOLR-281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12564739#action_12564739 ] Michael Dodsworth commented on SOLR-281: {quote} That would require instantiation with reflection I think. {quote} Reflection is already being used to create the QParserPlugins (SolrCore:1027 and AbstractPluginLoader:83) - I'm guessing the reason for the plugin is just to avoid creating instances through reflection on every parse (as you could keep hold of the QParser class and call newInstance). The second point is moot once you take away the need for createParser(...). It's really not that big a deal in the scheme of things. {quote} QParserPlugin is that interface essentially (except that it's a class instead of an interface). For library maintainers an abstract class is preferred over an interface for things that a user will extend... that way signature changes can be made in a backward compatible manner. {quote} As an aside, method signature changes are usually trivial to fix; personally, I find the pain of those fixes preferable to extending an abstract class unnecessarily. Are there any architectural reworking projects on the roadmap? I'm sure backward compatibility is a massive concern; perhaps with the more modular plugin design route Solr is going down, those concerns can be addressed. If there's a chance of being accepted, I would love to contribute a move towards using Spring.
Search Components (plugins) --- Key: SOLR-281 URL: https://issues.apache.org/jira/browse/SOLR-281 Project: Solr Issue Type: New Feature Components: search Affects Versions: 1.3 Reporter: Ryan McKinley Assignee: Ryan McKinley Fix For: 1.3 Attachments: SOLR-281-ComponentInit.patch, SOLR-281-ComponentInit.patch, SOLR-281-SearchComponents.patch, SOLR-281-SearchComponents.patch, SOLR-281-SearchComponents.patch, SOLR-281-SearchComponents.patch, SOLR-281-SearchComponents.patch, SOLR-281-SearchComponents.patch, SOLR-281-SearchComponents.patch, SOLR-281-SearchComponents.patch, SOLR-281-SearchComponents.patch, solr-281.patch, solr-281.patch, solr-281.patch A request handler with pluggable search components for things like: - standard - dismax - more-like-this - highlighting - field collapsing For more discussion, see: http://www.nabble.com/search-components-%28plugins%29-tf3898040.html#a11050274
[jira] Commented: (SOLR-281) Search Components (plugins)
[ https://issues.apache.org/jira/browse/SOLR-281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12564449#action_12564449 ] Michael Dodsworth commented on SOLR-281: This is great; decomposing the handler and allowing the components to be wired up in the config really helps development (and maintenance of those changes). For my purposes, I needed to make a change to the way the dismax query was being generated. Using the DisMaxQParserPlugin as a template, I created my own QParser and associated QParserPlugin; changed the relevant bits; added a queryParser... entry in solrconfig.xml; added the 'defType' parameter to the wanted SearchHandler configuration...and...all works well. Just a few comments: * I had to make the QParser parse() method public, as the new query parser may still need to use the existing query parsers (backup lucene parser, boost parser, function parser, etc). * The QParserPlugin class seems unnecessary: all it does is implement init() and add a createParser method. Why not just have the parser constructor take those arguments...or, if that can't be done, create an interface to allow the parser itself to implement both init() and createParser() (or create())? It then avoids having to create 2 classes (in the case of DisMax, in the same file...which is not pretty).
Search Components (plugins) --- Key: SOLR-281 URL: https://issues.apache.org/jira/browse/SOLR-281 Project: Solr Issue Type: New Feature Components: search Affects Versions: 1.3 Reporter: Ryan McKinley Assignee: Ryan McKinley Fix For: 1.3 Attachments: SOLR-281-ComponentInit.patch, SOLR-281-ComponentInit.patch, SOLR-281-SearchComponents.patch, SOLR-281-SearchComponents.patch, SOLR-281-SearchComponents.patch, SOLR-281-SearchComponents.patch, SOLR-281-SearchComponents.patch, SOLR-281-SearchComponents.patch, SOLR-281-SearchComponents.patch, SOLR-281-SearchComponents.patch, SOLR-281-SearchComponents.patch, solr-281.patch, solr-281.patch, solr-281.patch A request handler with pluggable search components for things like: - standard - dismax - more-like-this - highlighting - field collapsing For more discussion, see: http://www.nabble.com/search-components-%28plugins%29-tf3898040.html#a11050274
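The alternative raised in the SOLR-281 comments (one parser type that handles its own initialisation, created reflectively per request instead of through a separate plugin class) might look roughly like the sketch below. Every name here is hypothetical and none of it is Solr's actual QParser or QParserPlugin API. The constructor lookup is cached so that only the instantiation itself happens on each parse.

```java
import java.lang.reflect.Constructor;
import java.util.Locale;

/** Hypothetical single contract standing in for the parser + plugin pair. */
interface SketchParser {
    void init(String queryString);
    String parse();
}

/** Example implementation; it needs a no-arg constructor so it can be created reflectively. */
final class UpperCaseSketchParser implements SketchParser {
    private String queryString;

    UpperCaseSketchParser() {}

    @Override public void init(String queryString) { this.queryString = queryString; }

    @Override public String parse() { return queryString.toUpperCase(Locale.ROOT); }
}

/** Holds on to the constructor and creates a fresh parser for every request. */
final class ReflectiveParserFactory {
    private final Constructor<? extends SketchParser> constructor;

    ReflectiveParserFactory(Class<? extends SketchParser> type) {
        Constructor<? extends SketchParser> c;
        try {
            c = type.getDeclaredConstructor();
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(type + " needs a no-arg constructor", e);
        }
        c.setAccessible(true);
        this.constructor = c;
    }

    SketchParser create(String queryString) {
        try {
            SketchParser parser = constructor.newInstance();
            parser.init(queryString);
            return parser;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not instantiate parser", e);
        }
    }
}
```

The trade-off discussed in the thread still applies: an interface like this breaks implementers whenever a method is added, whereas an abstract base class can gain methods with default bodies.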