[jira] [Commented] (SOLR-2927) SolrIndexSearcher's register do not match close and SolrCore's closeSearcher

2014-11-04 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196333#comment-14196333
 ] 

Michael Dodsworth commented on SOLR-2927:
-

Thanks, [~shalinmangar]. Much appreciated.

 SolrIndexSearcher's register do not match close and SolrCore's closeSearcher
 

 Key: SOLR-2927
 URL: https://issues.apache.org/jira/browse/SOLR-2927
 Project: Solr
  Issue Type: Bug
  Components: search
Affects Versions: 4.0-ALPHA
 Environment: JDK1.6/CentOS
Reporter: tom liu
Assignee: Shalin Shekhar Mangar
 Fix For: 5.0, Trunk

 Attachments: SOLR-2927.patch, SOLR-2927.patch, mbean-leak-jira.png


 # SolrIndexSearcher's register method puts the searcher's name into 
 infoRegistry, but SolrCore's closeSearcher method removes the name of 
 currentSearcher from infoRegistry.
 # SolrIndexSearcher's register method puts each cache's name into 
 infoRegistry, but SolrIndexSearcher's close method does not remove it.
 So entries can be left behind in infoRegistry, which may leak memory.
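
The mismatch described above can be sketched in isolation. This is a minimal illustration, not Solr's code: `RegistrySketch`, its `Searcher` class, and the key names are hypothetical stand-ins, and the map only plays the role of infoRegistry. The point it shows is that close() should remove exactly the keys that register() added.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: a searcher whose close() mirrors its register(). */
public class RegistrySketch {
    // Stand-in for an info registry (name -> registered object).
    public static final Map<String, Object> infoRegistry = new ConcurrentHashMap<>();

    public static final class Searcher {
        private final String name;
        private final Map<String, Object> caches = new LinkedHashMap<>();

        public Searcher(String name, String... cacheNames) {
            this.name = name;
            for (String cacheName : cacheNames) {
                caches.put(cacheName, new Object()); // placeholder cache object
            }
        }

        /** Adds the searcher and each of its caches to the registry. */
        public void register() {
            infoRegistry.put(name, this);
            infoRegistry.putAll(caches);
        }

        /** Removes the same keys, but only if they still map to our objects. */
        public void close() {
            infoRegistry.remove(name, this);
            caches.forEach((key, value) -> infoRegistry.remove(key, value));
        }
    }

    public static void main(String[] args) {
        Searcher searcher = new Searcher("searcher-1", "filterCache", "queryResultCache");
        searcher.register();
        System.out.println("entries after register: " + infoRegistry.size());
        searcher.close();
        System.out.println("entries after close: " + infoRegistry.size());
    }
}
```

Using the two-argument remove(key, value) is a design choice in this sketch: it avoids deleting an entry that a newer searcher has since registered under the same name.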



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2927) SolrIndexSearcher's register do not match close and SolrCore's closeSearcher

2014-11-03 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14194864#comment-14194864
 ] 

Michael Dodsworth commented on SOLR-2927:
-

[~shalinmangar] any feedback on this?

 SolrIndexSearcher's register do not match close and SolrCore's closeSearcher
 

 Key: SOLR-2927
 URL: https://issues.apache.org/jira/browse/SOLR-2927
 Project: Solr
  Issue Type: Bug
  Components: search
Affects Versions: 4.0-ALPHA
 Environment: JDK1.6/CentOS
Reporter: tom liu
Assignee: Shalin Shekhar Mangar
 Fix For: 4.9, Trunk

 Attachments: SOLR-2927.patch, mbean-leak-jira.png





[jira] [Commented] (SOLR-6212) upgrade Saxon-HE to 9.5.1-5 and reinstate Morphline tests that were affected under java 8/9 with 9.5.1-4

2014-08-30 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116277#comment-14116277
 ] 

Michael Dodsworth commented on SOLR-6212:
-

Once morphlines is updated, we can upgrade to a version of guava that doesn't 
have the problematic annotation.

 upgrade Saxon-HE to 9.5.1-5 and reinstate Morphline tests that were affected 
 under java 8/9 with 9.5.1-4
 

 Key: SOLR-6212
 URL: https://issues.apache.org/jira/browse/SOLR-6212
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.7, 5.0
Reporter: Michael Dodsworth
Assignee: Mark Miller
Priority: Minor
 Attachments: SOLR-6212.patch


 From SOLR-1301:
 For posterity, there is a thread on the dev list where we are working 
 through an issue with Saxon on java 8 and ibm's j9. Wolfgang filed 
 https://saxonica.plan.io/issues/1944 upstream. (Saxon is pulled in via 
 cdk-morphlines-saxon).
 Due to this issue, several Morphline tests were made to be 'ignored' in java 
 8+. The Saxon issue has been fixed in 9.5.1-5, so we should upgrade and 
 reinstate those tests.






[jira] [Created] (SOLR-6455) Ambiguous reference to Base64 in ClientUtils causes the Java 8 build to fail

2014-08-30 Thread Michael Dodsworth (JIRA)
Michael Dodsworth created SOLR-6455:
---

 Summary: Ambiguous reference to Base64 in ClientUtils causes the 
Java 8 build to fail
 Key: SOLR-6455
 URL: https://issues.apache.org/jira/browse/SOLR-6455
 Project: Solr
  Issue Type: Bug
 Environment: Oracle JDK 1.8.0_20-b26
Mac OSX 10.9.4
Reporter: Michael Dodsworth
Priority: Blocker


Compilation fails with:

{code}
common.compile-core:
[mkdir] Created dir: /Users/mdodsworth/dev/lucene-solr/solr/build/solr-solrj/classes/java
[javac] Compiling 122 source files to /Users/mdodsworth/dev/lucene-solr/solr/build/solr-solrj/classes/java
[javac] warning: [options] bootstrap class path not set in conjunction with -source 1.6
[javac] /Users/mdodsworth/dev/lucene-solr/solr/solrj/src/java/org/apache/solr/client/solrj/util/ClientUtils.java:116: error: reference to Base64 is ambiguous
[javac]   v = Base64.byteArrayToBase64(bytes, 0,bytes.length);
[javac]   ^
[javac]   both class org.apache.solr.common.util.Base64 in org.apache.solr.common.util and class java.util.Base64 in java.util match
[javac] /Users/mdodsworth/dev/lucene-solr/solr/solrj/src/java/org/apache/solr/client/solrj/util/ClientUtils.java:119: error: reference to Base64 is ambiguous
[javac]   v = Base64.byteArrayToBase64(bytes.array(), bytes.position(),bytes.limit() - bytes.position());
[javac]   ^
[javac]   both class org.apache.solr.common.util.Base64 in org.apache.solr.common.util and class java.util.Base64 in java.util match
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.
[javac] 2 errors
[javac] 1 warning
{code}






[jira] [Updated] (SOLR-6455) Ambiguous reference to Base64 in ClientUtils causes the Java 8 build to fail

2014-08-30 Thread Michael Dodsworth (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Dodsworth updated SOLR-6455:


Attachment: SOLR-6455.patch

Using the fully-qualified class name to disambiguate. The alternative would be 
to avoid the wildcard import (or add an explicit import for the Solr version of 
Base64), but it seems likely that an (auto-)import reorganise done by someone 
using Java 7 would break it again.
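
The same kind of ambiguity can be reproduced with JDK classes alone. The sketch below is an illustration under an assumption: it uses java.util.Date and java.sql.Date as stand-ins for the two Base64 classes, and `AmbiguousImportSketch` is a hypothetical name. With both wildcard imports present, a bare `Date` would be rejected by javac as ambiguous, while the fully-qualified name compiles regardless of how the imports are later reorganised.

```java
import java.sql.*;   // brings java.sql.Date into scope on demand
import java.util.*;  // brings java.util.Date into scope on demand

/** Hypothetical sketch: resolving a wildcard-import clash with a fully-qualified name. */
public class AmbiguousImportSketch {
    public static String epochClassName() {
        // Writing "Date epoch = new Date(0L);" here would fail to compile with
        // "reference to Date is ambiguous", because both wildcard imports match.
        java.util.Date epoch = new java.util.Date(0L);
        return epoch.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(epochClassName());
    }
}
```

An explicit single-type import (for example `import java.util.Date;`) would also resolve the clash, since it takes precedence over on-demand imports; the fully-qualified form simply does not depend on the import list staying that way.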

 Ambiguous reference to Base64 in ClientUtils causes the Java 8 build to fail
 

 Key: SOLR-6455
 URL: https://issues.apache.org/jira/browse/SOLR-6455
 Project: Solr
  Issue Type: Bug
 Environment: Oracle JDK 1.8.0_20-b26
 Mac OSX 10.9.4
Reporter: Michael Dodsworth
Priority: Blocker
 Attachments: SOLR-6455.patch





[jira] [Updated] (SOLR-6455) Ambiguous reference to Base64 in ClientUtils causes the Java 8 build to fail

2014-08-30 Thread Michael Dodsworth (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Dodsworth updated SOLR-6455:


Priority: Major  (was: Blocker)

 Ambiguous reference to Base64 in ClientUtils causes the Java 8 build to fail
 

 Key: SOLR-6455
 URL: https://issues.apache.org/jira/browse/SOLR-6455
 Project: Solr
  Issue Type: Bug
 Environment: Oracle JDK 1.8.0_20-b26
 Mac OSX 10.9.4
Reporter: Michael Dodsworth
 Attachments: SOLR-6455.patch





[jira] [Commented] (SOLR-6455) Ambiguous reference to Base64 in ClientUtils causes the Java 8 build to fail

2014-08-30 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116510#comment-14116510
 ] 

Michael Dodsworth commented on SOLR-6455:
-

Downgraded from 'BLOCKER' as it looks like we're setting java.compat.version to 
1.6 (which I don't have installed).

 Ambiguous reference to Base64 in ClientUtils causes the Java 8 build to fail
 

 Key: SOLR-6455
 URL: https://issues.apache.org/jira/browse/SOLR-6455
 Project: Solr
  Issue Type: Bug
 Environment: Oracle JDK 1.8.0_20-b26
 Mac OSX 10.9.4
Reporter: Michael Dodsworth
 Attachments: SOLR-6455.patch





[jira] [Comment Edited] (SOLR-6455) Ambiguous reference to Base64 in ClientUtils causes the Java 8 build to fail

2014-08-30 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116515#comment-14116515
 ] 

Michael Dodsworth edited comment on SOLR-6455 at 8/30/14 6:37 PM:
--

bah. Apologies (github fork didn't inform me I already had a fork (so I was 
cloning an out-of-date version)).


 Ambiguous reference to Base64 in ClientUtils causes the Java 8 build to fail
 

 Key: SOLR-6455
 URL: https://issues.apache.org/jira/browse/SOLR-6455
 Project: Solr
  Issue Type: Bug
 Environment: Oracle JDK 1.8.0_20-b26
 Mac OSX 10.9.4
Reporter: Michael Dodsworth
 Attachments: SOLR-6455.patch





[jira] [Closed] (SOLR-6455) Ambiguous reference to Base64 in ClientUtils causes the Java 8 build to fail

2014-08-30 Thread Michael Dodsworth (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Dodsworth closed SOLR-6455.
---

Resolution: Invalid

 Ambiguous reference to Base64 in ClientUtils causes the Java 8 build to fail
 

 Key: SOLR-6455
 URL: https://issues.apache.org/jira/browse/SOLR-6455
 Project: Solr
  Issue Type: Bug
 Environment: Oracle JDK 1.8.0_20-b26
 Mac OSX 10.9.4
Reporter: Michael Dodsworth
 Attachments: SOLR-6455.patch





[jira] [Commented] (SOLR-6455) Ambiguous reference to Base64 in ClientUtils causes the Java 8 build to fail

2014-08-30 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116515#comment-14116515
 ] 

Michael Dodsworth commented on SOLR-6455:
-

bah. Apologies (github fork didn't inform me I already had a fork (so I was 
cloning an out-of-date version).

 Ambiguous reference to Base64 in ClientUtils causes the Java 8 build to fail
 

 Key: SOLR-6455
 URL: https://issues.apache.org/jira/browse/SOLR-6455
 Project: Solr
  Issue Type: Bug
 Environment: Oracle JDK 1.8.0_20-b26
 Mac OSX 10.9.4
Reporter: Michael Dodsworth
 Attachments: SOLR-6455.patch





[jira] [Commented] (SOLR-6062) Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query)

2014-08-13 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14095691#comment-14095691
 ] 

Michael Dodsworth commented on SOLR-6062:
-

Fantastic. Thank you, [~ehatcher].

 Phrase queries are created for each field supplied through edismax's pf, pf2 
 and pf3 parameters (rather them being combined in a single dismax query)
 -

 Key: SOLR-6062
 URL: https://issues.apache.org/jira/browse/SOLR-6062
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.0
Reporter: Michael Dodsworth
Assignee: Erik Hatcher
Priority: Minor
 Attachments: combined-phrased-dismax.patch


 SOLR-2058 subtly changed how phrase queries, created through the pf, pf2 and 
 pf3 parameters, are merged into the main user query.
 For the query: 'term1 term2' with pf2:[field1, field2, field3] we now get 
 (omitting the non phrase query section for clarity):
 {code:java}
 main query
 DisjunctionMaxQuery((field1:"term1 term2"^1.0)~0.1)
 DisjunctionMaxQuery((field2:"term1 term2"^1.0)~0.1)
 DisjunctionMaxQuery((field3:"term1 term2"^1.0)~0.1)
 {code}
 Prior to this change, we had:
 {code:java}
 main query
 DisjunctionMaxQuery((field1:"term1 term2"^1.0 | field2:"term1 term2"^1.0 | field3:"term1 term2"^1.0)~0.1)
 {code}
 The upshot is that if the phrase "term1 term2" appears in multiple fields, the 
 document gets a significantly larger boost than under the previous implementation.
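
The size of that difference can be estimated with a little arithmetic. The sketch below is a simplification under stated assumptions: it models a DisjunctionMaxQuery score as the best clause plus the tie-breaker (the ~0.1 in the output above) times the remaining clauses, assumes the enclosing query simply adds its optional clauses together, and ignores coord and query normalisation. `DismaxBoostSketch` and the per-field score of 1.0 are hypothetical.

```java
/** Hypothetical sketch: separate vs. combined dismax phrase clauses. */
public class DismaxBoostSketch {
    /** Dismax-style score: best clause plus tie times the sum of the others. */
    public static double disMax(double tie, double... clauseScores) {
        double max = 0.0;
        double sum = 0.0;
        for (double score : clauseScores) {
            max = Math.max(max, score);
            sum += score;
        }
        return max + tie * (sum - max);
    }

    public static void main(String[] args) {
        double tie = 0.1;
        double phraseScore = 1.0; // assumed score of the phrase match in each field

        // Current behavior: one single-clause dismax per field, summed by the main query.
        double separate = disMax(tie, phraseScore)
                + disMax(tie, phraseScore)
                + disMax(tie, phraseScore);

        // Previous behavior: one dismax over all three fields.
        double combined = disMax(tie, phraseScore, phraseScore, phraseScore);

        System.out.println("separate clauses: " + separate);
        System.out.println("combined clause:  " + combined);
    }
}
```

Under these assumptions a phrase matching in all three fields contributes roughly 3.0 with separate clauses versus roughly 1.2 with the combined clause, which is the boost difference the description refers to.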






[jira] [Commented] (SOLR-6062) Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query)

2014-08-12 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14094812#comment-14094812
 ] 

Michael Dodsworth commented on SOLR-6062:
-

[~jdyer], any feedback?

 Phrase queries are created for each field supplied through edismax's pf, pf2 
 and pf3 parameters (rather them being combined in a single dismax query)
 -

 Key: SOLR-6062
 URL: https://issues.apache.org/jira/browse/SOLR-6062
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.0
Reporter: Michael Dodsworth
Priority: Minor
 Attachments: combined-phrased-dismax.patch





[jira] [Commented] (SOLR-6212) upgrade Saxon-HE to 9.5.1-5 and reinstate Morphline tests that were affected under java 8/9 with 9.5.1-4

2014-08-12 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14094814#comment-14094814
 ] 

Michael Dodsworth commented on SOLR-6212:
-

[~markrmil...@gmail.com], any feedback?

 upgrade Saxon-HE to 9.5.1-5 and reinstate Morphline tests that were affected 
 under java 8/9 with 9.5.1-4
 

 Key: SOLR-6212
 URL: https://issues.apache.org/jira/browse/SOLR-6212
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.7, 5.0
Reporter: Michael Dodsworth
Assignee: Mark Miller
Priority: Minor
 Attachments: SOLR-6212.patch





[jira] [Commented] (SOLR-6062) Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query)

2014-08-12 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14094907#comment-14094907
 ] 

Michael Dodsworth commented on SOLR-6062:
-

Thanks for looking at this, [~ehatcher]. Any suggestions on folks to pull in?

 Phrase queries are created for each field supplied through edismax's pf, pf2 
 and pf3 parameters (rather them being combined in a single dismax query)
 -

 Key: SOLR-6062
 URL: https://issues.apache.org/jira/browse/SOLR-6062
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.0
Reporter: Michael Dodsworth
Priority: Minor
 Attachments: combined-phrased-dismax.patch





[jira] [Commented] (SOLR-6062) Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query)

2014-08-12 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14094935#comment-14094935
 ] 

Michael Dodsworth commented on SOLR-6062:
-

Not that I know of. The behavior SOLR-2058 wanted is supported (by supplying 
different slop values for the same field), as is the original behavior.

 Phrase queries are created for each field supplied through edismax's pf, pf2 
 and pf3 parameters (rather them being combined in a single dismax query)
 -

 Key: SOLR-6062
 URL: https://issues.apache.org/jira/browse/SOLR-6062
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.0
Reporter: Michael Dodsworth
Priority: Minor
 Attachments: combined-phrased-dismax.patch





[jira] [Commented] (SOLR-6062) Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query)

2014-08-05 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14086543#comment-14086543
 ] 

Michael Dodsworth commented on SOLR-6062:
-

Any feedback, [~ramayer]?

 Phrase queries are created for each field supplied through edismax's pf, pf2 
 and pf3 parameters (rather them being combined in a single dismax query)
 -

 Key: SOLR-6062
 URL: https://issues.apache.org/jira/browse/SOLR-6062
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.0
Reporter: Michael Dodsworth
Priority: Minor
 Attachments: combined-phrased-dismax.patch





[jira] [Comment Edited] (LUCENE-4730) SmartChineseAnalyzer got wrong matched offset

2014-06-29 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14047046#comment-14047046
 ] 

Michael Dodsworth edited comment on LUCENE-4730 at 6/29/14 3:39 PM:


This appears to be a symptom of LUCENE-4984 (fixed in 4.8).

The following test fails:

{code:java}
// note Version.LUCENE_4_7
assertAnalyzesTo(new SmartChineseAnalyzer(Version.LUCENE_4_7, true), "My China  ",
    new String[] {"my", "china"}, new int[] {0, 3}, new int[] {2, 8});
{code}

whereas this passes:

{code:java}
// note Version.LUCENE_4_8
assertAnalyzesTo(new SmartChineseAnalyzer(Version.LUCENE_4_8, true), "My China  ",
    new String[] {"my", "china"}, new int[] {0, 3}, new int[] {2, 8});
{code}

I'll add a test to verify this double-whitespace case but otherwise, this can 
be closed out.
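
For readers unfamiliar with token offsets, the expected values in the assertions above (start offsets 0 and 3, end offsets 2 and 8) are character positions in the original input. The sketch below uses plain string slicing rather than any Lucene API to show what those offsets select; `OffsetSketch` is a hypothetical name, and the input is assumed to be "My China" followed by two spaces, as in the issue description.

```java
/** Hypothetical sketch: what the expected start/end offsets point at in the input. */
public class OffsetSketch {
    /** Returns the slice of text covered by a token's [start, end) offsets. */
    public static String slice(String text, int start, int end) {
        return text.substring(start, end);
    }

    public static void main(String[] args) {
        String text = "My China  "; // ends with two spaces
        int[] startOffsets = {0, 3};
        int[] endOffsets = {2, 8};
        for (int i = 0; i < startOffsets.length; i++) {
            // Brackets make any stray whitespace in a slice visible.
            System.out.println("[" + slice(text, startOffsets[i], endOffsets[i]) + "]");
        }
    }
}
```

A highlighter wraps exactly these slices, so an analyzer that reports shifted offsets produces a highlight in the wrong place even when the token text itself is correct.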


 SmartChineseAnalyzer got wrong matched offset
 -

 Key: LUCENE-4730
 URL: https://issues.apache.org/jira/browse/LUCENE-4730
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/analysis
Affects Versions: 4.0, 4.1
 Environment: JDK1.7 Linux/Windows
Reporter: Jinsong Hu
Priority: Critical
 Attachments: LUCENE-4730.patch


 We found that SmartChineseAnalyzer produces wrong match offsets with the 
 following test code:
 public void testHighlight() throws Exception {
     String text = "My China  ";
     String queryText = "China";
     StringBuilder builder = new StringBuilder("<html>");
     Analyzer analyzer = new SmartChineseAnalyzer(Version.LUCENE_40);
     //Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_40);
     QueryParser parser = new QueryParser(Version.LUCENE_40, "text", analyzer);
     Query query = parser.parse(queryText);
     SimpleHTMLFormatter formatter = new SimpleHTMLFormatter(
         "<span style=\"background: yellow\">", "</span>");
     TokenStream tokens = analyzer.tokenStream("text", new StringReader(text));
     QueryScorer scorer = new QueryScorer(query, "text");
     Highlighter highlighter = new Highlighter(formatter, scorer);
     highlighter.setTextFragmenter(new SimpleSpanFragmenter(scorer));
     String result = highlighter.getBestFragments(tokens, text, 10, "...");
     if (result.length() < text.length()) {
         result = text;
     }
     builder.append("<body>");
     builder.append(result);
     builder.append("</body>");
     builder.append("</html>");
     System.out.println(builder.toString());
 }
 This method generates highlighted text, but the highlight position is 
 obviously wrong. If we remove one space from the text, that is, change 
 text from "My China  " (ends with two spaces) to "My China " (ends with one 
 space), the highlight is correct. If we change the analyzer from 
 SmartChineseAnalyzer to StandardAnalyzer, the highlight issue also 
 disappears.






[jira] [Commented] (SOLR-5109) Solr 4.4 will not deploy in Glassfish 4.x

2014-06-29 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14047162#comment-14047162
 ] 

Michael Dodsworth commented on SOLR-5109:
-

ah, fun. Those tests don't run under java 8 due to a Saxon-HE issue (which is 
actually fixed in 9.5.1-5).
The guava issue has been fixed in kite-sdk:master - 
https://github.com/kite-sdk/kite/commit/0ab2795872e4e5721f477d79e5049371a17ab8db.
 We'll have to wait for the next drop of kite-sdk before guava can be upgraded.

I'll create a separate issue for updating Saxon-HE and reinstating the affected 
tests. Apologies, and thanks for looking at this, Shalin.

 Solr 4.4 will not deploy in Glassfish 4.x
 -

 Key: SOLR-5109
 URL: https://issues.apache.org/jira/browse/SOLR-5109
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.4
 Environment: Glassfish 4.x
Reporter: jamon camisso
Priority: Blocker
  Labels: guava
 Attachments: LUCENE-5109.patch, guava-15.0-SNAPSHOT.jar


 The bundled Guava 14.0.1 JAR blocks deploying Solr 4.4 in Glassfish 4.x.
 This failure is a known issue with upstream Guava and is described here:
 https://code.google.com/p/guava-libraries/issues/detail?id=1433
 Building Guava guava-15.0-SNAPSHOT.jar from master and bundling it in Solr 
 allows for a successful deployment.
 Until the Guava developers release version 15, using their HEAD or even an RC 
 tag seems like the only way to resolve this.
 This is frustrating since it was proposed that Guava be removed as a 
 dependency before Solr 4.0 was released and yet it remains and blocks 
 upgrading: https://issues.apache.org/jira/browse/SOLR-3601






[jira] [Created] (SOLR-6212) upgrade Saxon-HE to 9.5.1-5 and reinstate Morphline tests that were affected under java 8/9 with 9.5.1-4

2014-06-29 Thread Michael Dodsworth (JIRA)
Michael Dodsworth created SOLR-6212:
---

 Summary: upgrade Saxon-HE to 9.5.1-5 and reinstate Morphline tests 
that were affected under java 8/9 with 9.5.1-4
 Key: SOLR-6212
 URL: https://issues.apache.org/jira/browse/SOLR-6212
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.7, 5.0
Reporter: Michael Dodsworth
Priority: Minor


From SOLR-1301:

For posterity, there is a thread on the dev list where we are working through 
an issue with Saxon on java 8 and ibm's j9. Wolfgang filed 
https://saxonica.plan.io/issues/1944 upstream. (Saxon is pulled in via 
cdk-morphlines-saxon).

Due to this issue, several Morphline tests were made to be 'ignored' in java 
8+. The Saxon issue has been fixed in 9.5.1-5, so we should upgrade and 
reinstate those tests.






[jira] [Updated] (SOLR-6212) upgrade Saxon-HE to 9.5.1-5 and reinstate Morphline tests that were affected under java 8/9 with 9.5.1-4

2014-06-29 Thread Michael Dodsworth (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Dodsworth updated SOLR-6212:


Attachment: SOLR-6212.patch

upgrading Saxon-HE (to 9.5.1-5) and morphlines (to 0.14.1)

 upgrade Saxon-HE to 9.5.1-5 and reinstate Morphline tests that were affected 
 under java 8/9 with 9.5.1-4
 

 Key: SOLR-6212
 URL: https://issues.apache.org/jira/browse/SOLR-6212
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.7, 5.0
Reporter: Michael Dodsworth
Assignee: Mark Miller
Priority: Minor
 Attachments: SOLR-6212.patch


 From SOLR-1301:
 For posterity, there is a thread on the dev list where we are working 
 through an issue with Saxon on java 8 and ibm's j9. Wolfgang filed 
 https://saxonica.plan.io/issues/1944 upstream. (Saxon is pulled in via 
 cdk-morphlines-saxon).
 Due to this issue, several Morphline tests were made to be 'ignored' in java 
 8+. The Saxon issue has been fixed in 9.5.1-5, so we should upgrade and 
 reinstate those tests.






[jira] [Updated] (SOLR-5109) Solr 4.4 will not deploy in Glassfish 4.x

2014-06-28 Thread Michael Dodsworth (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Dodsworth updated SOLR-5109:


Attachment: LUCENE-5109.patch

Here's a patch for branch_4x to upgrade Guava to 17.0 (in which the 
problematic annotation has been removed).

 Solr 4.4 will not deploy in Glassfish 4.x
 -

 Key: SOLR-5109
 URL: https://issues.apache.org/jira/browse/SOLR-5109
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.4
 Environment: Glassfish 4.x
Reporter: jamon camisso
Priority: Blocker
  Labels: guava
 Attachments: LUCENE-5109.patch, guava-15.0-SNAPSHOT.jar


 The bundled Guava 14.0.1 JAR blocks deploying Solr 4.4 in Glassfish 4.x.
 This failure is a known issue with upstream Guava and is described here:
 https://code.google.com/p/guava-libraries/issues/detail?id=1433
 Building Guava guava-15.0-SNAPSHOT.jar from master and bundling it in Solr 
 allows for a successful deployment.
 Until the Guava developers release version 15, using their HEAD or even an RC 
 tag seems like the only way to resolve this.
 This is frustrating since it was proposed that Guava be removed as a 
 dependency before Solr 4.0 was released and yet it remains and blocks 
 upgrading: https://issues.apache.org/jira/browse/SOLR-3601






[jira] [Updated] (LUCENE-4787) The QueryScorer.getMaxWeight method is not found.

2014-06-28 Thread Michael Dodsworth (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Dodsworth updated LUCENE-4787:
--

Attachment: LUCENE-4787.patch

 The QueryScorer.getMaxWeight method is not found.
 -

 Key: LUCENE-4787
 URL: https://issues.apache.org/jira/browse/LUCENE-4787
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/highlighter
Affects Versions: 4.1
Reporter: Hao Zhong
Priority: Critical
 Attachments: LUCENE-4787.patch


 The following API documents refer to the QueryScorer.getMaxWeight method:
 http://lucene.apache.org/core/4_1_0/highlighter/org/apache/lucene/search/highlight/package-summary.html
 The QueryScorer.getMaxWeight method is useful when passed to the 
 GradientFormatter constructor to define the top score which is associated 
 with the top color.
 http://lucene.apache.org/core/4_1_0/highlighter/org/apache/lucene/search/highlight/GradientFormatter.html
 See QueryScorer.getMaxWeight which can be used to calibrate scoring scale
 However, the QueryScorer class does not declare a getMaxWeight method in 
 lucene 4.1, according to its document:
 http://lucene.apache.org/core/4_1_0/highlighter/org/apache/lucene/search/highlight/QueryScorer.html
 Instead, the class declares a getMaxTermWeight method. Is that the correct 
 method in the preceding two documents? If it is, please revise the two 
 documents. 






[jira] [Commented] (LUCENE-4730) SmartChineseAnalyzer got wrong matched offset

2014-06-28 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14047046#comment-14047046
 ] 

Michael Dodsworth commented on LUCENE-4730:
---

This appears to be a symptom of LUCENE-4984 (fixed in 4.8).

The following test fails:

{code:java}
// note Version.LUCENE_4_7
assertAnalyzesTo(new SmartChineseAnalyzer(Version.LUCENE_4_7, true), "My China  ",
    new String[] {"my", "china"}, new int[] {0, 3}, new int[] {2, 8});
{code}

whereas this passes:

{code:java}
// note Version.LUCENE_4_8
assertAnalyzesTo(new SmartChineseAnalyzer(Version.LUCENE_4_8, true), "My China  ",
    new String[] {"my", "china"}, new int[] {0, 3}, new int[] {2, 8});
{code}

I'll add a test to verify this double-whitespace case but otherwise, this can 
be closed out.

 SmartChineseAnalyzer got wrong matched offset
 -

 Key: LUCENE-4730
 URL: https://issues.apache.org/jira/browse/LUCENE-4730
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/analysis
Affects Versions: 4.0, 4.1
 Environment: JDK1.7 Linux/Windows
Reporter: Jinsong Hu
Priority: Critical

 We found that SmartChineseAnalyzer produces wrong match offsets with the 
 following test code:
 public void testHighlight() throws Exception {
     String text = "My China  ";
     String queryText = "China";
     StringBuilder builder = new StringBuilder("<html>");
     Analyzer analyzer = new SmartChineseAnalyzer(Version.LUCENE_40);
     //Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_40);
     QueryParser parser = new QueryParser(Version.LUCENE_40, "text", analyzer);
     Query query = parser.parse(queryText);
     SimpleHTMLFormatter formatter = new SimpleHTMLFormatter(
         "<span style=\"background: yellow\">", "</span>");
     TokenStream tokens = analyzer.tokenStream("text", new StringReader(text));
     QueryScorer scorer = new QueryScorer(query, "text");
     Highlighter highlighter = new Highlighter(formatter, scorer);
     highlighter.setTextFragmenter(new SimpleSpanFragmenter(scorer));
     String result = highlighter.getBestFragments(tokens, text, 10, "...");
     if (result.length() < text.length()) {
         result = text;
     }
     builder.append("<body>");
     builder.append(result);
     builder.append("</body>");
     builder.append("</html>");
     System.out.println(builder.toString());
 }
 This method generates highlighted text, but the highlight position is 
 obviously wrong. If we remove one space from the text, that is, change 
 text from "My China  " (ends with two spaces) to "My China " (ends with one 
 space), the highlight is correct. If we change the analyzer from 
 SmartChineseAnalyzer to StandardAnalyzer, the highlight issue also 
 disappears.






[jira] [Updated] (LUCENE-4730) SmartChineseAnalyzer got wrong matched offset

2014-06-28 Thread Michael Dodsworth (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Dodsworth updated LUCENE-4730:
--

Attachment: LUCENE-4730.patch

 SmartChineseAnalyzer got wrong matched offset
 -

 Key: LUCENE-4730
 URL: https://issues.apache.org/jira/browse/LUCENE-4730
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/analysis
Affects Versions: 4.0, 4.1
 Environment: JDK1.7 Linux/Windows
Reporter: Jinsong Hu
Priority: Critical
 Attachments: LUCENE-4730.patch


 We found that SmartChineseAnalyzer produces wrong match offsets with the 
 following test code:
 public void testHighlight() throws Exception {
     String text = "My China  ";
     String queryText = "China";
     StringBuilder builder = new StringBuilder("<html>");
     Analyzer analyzer = new SmartChineseAnalyzer(Version.LUCENE_40);
     //Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_40);
     QueryParser parser = new QueryParser(Version.LUCENE_40, "text", analyzer);
     Query query = parser.parse(queryText);
     SimpleHTMLFormatter formatter = new SimpleHTMLFormatter(
         "<span style=\"background: yellow\">", "</span>");
     TokenStream tokens = analyzer.tokenStream("text", new StringReader(text));
     QueryScorer scorer = new QueryScorer(query, "text");
     Highlighter highlighter = new Highlighter(formatter, scorer);
     highlighter.setTextFragmenter(new SimpleSpanFragmenter(scorer));
     String result = highlighter.getBestFragments(tokens, text, 10, "...");
     if (result.length() < text.length()) {
         result = text;
     }
     builder.append("<body>");
     builder.append(result);
     builder.append("</body>");
     builder.append("</html>");
     System.out.println(builder.toString());
 }
 This method generates highlighted text, but the highlight position is 
 obviously wrong. If we remove one space from the text, that is, change 
 text from "My China  " (ends with two spaces) to "My China " (ends with one 
 space), the highlight is correct. If we change the analyzer from 
 SmartChineseAnalyzer to StandardAnalyzer, the highlight issue also 
 disappears.






[jira] [Commented] (LUCENE-4787) The QueryScorer.getMaxWeight method is not found.

2014-06-28 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14047050#comment-14047050
 ] 

Michael Dodsworth commented on LUCENE-4787:
---

doc fixed in the attached patch.

 The QueryScorer.getMaxWeight method is not found.
 -

 Key: LUCENE-4787
 URL: https://issues.apache.org/jira/browse/LUCENE-4787
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/highlighter
Affects Versions: 4.1
Reporter: Hao Zhong
Priority: Critical
 Attachments: LUCENE-4787.patch


 The following API documents refer to the QueryScorer.getMaxWeight method:
 http://lucene.apache.org/core/4_1_0/highlighter/org/apache/lucene/search/highlight/package-summary.html
 The QueryScorer.getMaxWeight method is useful when passed to the 
 GradientFormatter constructor to define the top score which is associated 
 with the top color.
 http://lucene.apache.org/core/4_1_0/highlighter/org/apache/lucene/search/highlight/GradientFormatter.html
 See QueryScorer.getMaxWeight which can be used to calibrate scoring scale
 However, the QueryScorer class does not declare a getMaxWeight method in 
 lucene 4.1, according to its document:
 http://lucene.apache.org/core/4_1_0/highlighter/org/apache/lucene/search/highlight/QueryScorer.html
 Instead, the class declares a getMaxTermWeight method. Is that the correct 
 method in the preceding two documents? If it is, please revise the two 
 documents. 






[jira] [Commented] (SOLR-6062) Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query)

2014-06-17 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034070#comment-14034070
 ] 

Michael Dodsworth commented on SOLR-6062:
-

[~ramayer] does that output look correct to you?

 Phrase queries are created for each field supplied through edismax's pf, pf2 
 and pf3 parameters (rather them being combined in a single dismax query)
 -

 Key: SOLR-6062
 URL: https://issues.apache.org/jira/browse/SOLR-6062
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.0
Reporter: Michael Dodsworth
Priority: Minor
 Attachments: combined-phrased-dismax.patch


 https://issues.apache.org/jira/browse/SOLR-2058 subtly changed how phrase 
 queries, created through the pf, pf2 and pf3 parameters, are merged into the 
 main user query.
 For the query 'term1 term2' with pf2:[field1, field2, field3] we now get 
 (omitting the non-phrase query section for clarity):
 {code:java}
 main query
 DisjunctionMaxQuery((field1:"term1 term2"^1.0)~0.1)
 DisjunctionMaxQuery((field2:"term1 term2"^1.0)~0.1)
 DisjunctionMaxQuery((field3:"term1 term2"^1.0)~0.1)
 {code}
 Prior to this change, we had:
 {code:java}
 main query 
 DisjunctionMaxQuery((field1:"term1 term2"^1.0 | field2:"term1 term2"^1.0 | 
 field3:"term1 term2"^1.0)~0.1)
 {code}
 The upshot is that if the phrase query "term1 term2" appears in multiple 
 fields, it will get a significant boost over the previous implementation.
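
 To make the size of that boost concrete, here is a rough arithmetic sketch. It assumes the usual DisjunctionMaxQuery scoring (best sub-score plus the tie-breaker times the sum of the remaining sub-scores) and that separate optional clauses simply add up, ignoring coord and query normalization. The per-field scores and the class name PhraseBoostSketch are invented for illustration.

```java
import java.util.Arrays;

/** Illustrative arithmetic only; the per-field phrase scores are made-up numbers. */
final class PhraseBoostSketch {

    /** One dismax over all fields: best score plus tie * (sum of the other scores). */
    static double combinedDismax(double[] fieldScores, double tie) {
        double max = Arrays.stream(fieldScores).max().orElse(0.0);
        double sum = Arrays.stream(fieldScores).sum();
        return max + tie * (sum - max);
    }

    /** One single-field dismax per field, summed as separate optional clauses. */
    static double separateClauses(double[] fieldScores) {
        return Arrays.stream(fieldScores).sum();
    }

    public static void main(String[] args) {
        double[] scores = {2.0, 1.5, 1.0}; // hypothetical phrase scores for field1..field3
        System.out.println("combined dismax:  " + combinedDismax(scores, 0.1));
        System.out.println("separate clauses: " + separateClauses(scores));
    }
}
```

 With these numbers the separate-clause form contributes roughly twice as much as the combined form, and the gap grows with the number of fields the phrase matches in.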






[jira] [Commented] (SOLR-6062) Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query)

2014-05-29 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14012659#comment-14012659
 ] 

Michael Dodsworth commented on SOLR-6062:
-

Thanks for looking at this, [~ramayer],

Here's an example that shows both the grouping within a particular pf? query 
(where the supplied fields have the same slop) and the splitting out/layering 
of queries when different slops are used for the same field(s). Hold on to your 
hats...

{q,   ,
 qf, phrase_sw phrase1_sw,
 pf, phrase_sw~1^10 phrase_sw~2^20 phrase_sw^30,
 pf2, phrase_sw~2^22 phrase_sw^33 phrase1_sw~2^44 phrase1_sw~4^55,
 pf3, phrase_sw~2^222 phrase_sw^333 phrase1_sw~2^444 phrase1_sw~4^555}

# pf -- phrase_sw with 3 different slop values results in 3 independent dismax 
queries
DisjunctionMaxQuery((phrase_sw:  ~1^10.0)) 
DisjunctionMaxQuery((phrase_sw:  ~2^20.0)) 
DisjunctionMaxQuery((phrase_sw:  ^30.0)) 

# pf2 -- phrase_sw and phrase1_sw were both supplied with a slop of 2, so those 
queries are grouped
(
  DisjunctionMaxQuery((phrase_sw: ~2^22.0 | phrase1_sw: 
~2^44.0)) 
  DisjunctionMaxQuery((phrase_sw: ~2^22.0 | phrase1_sw: 
~2^44.0))
) 

(
  DisjunctionMaxQuery((phrase_sw: ^33.0)) 
  DisjunctionMaxQuery((phrase_sw: ^33.0))
)

(
  DisjunctionMaxQuery((phrase1_sw: ~4^55.0)) 
  DisjunctionMaxQuery((phrase1_sw: ~4^55.0))
)

# pf3
DisjunctionMaxQuery((phrase_sw:  ~2^222.0 | phrase1_sw:  
~2^444.0)) 
DisjunctionMaxQuery((phrase_sw:  ^333.0)) 
DisjunctionMaxQuery((phrase1_sw:  ~4^555.0)))
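
The grouping shown above can be sketched in a few lines. This is not the code from the attached patch; it is a hypothetical helper (SlopGroupingSketch.groupBySlop) that parses a pf-style spec and buckets the fields by slop, assuming a slop of 0 when none is given. Each bucket would then become one dismax query.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: groups "field~slop^boost" entries by their slop value. */
final class SlopGroupingSketch {

    static Map<Integer, List<String>> groupBySlop(String spec) {
        // LinkedHashMap keeps the groups in order of first appearance.
        Map<Integer, List<String>> groups = new LinkedHashMap<>();
        for (String entry : spec.trim().split("\\s+")) {
            if (entry.isEmpty()) {
                continue;
            }
            String field = entry;
            String boost = "";
            int caret = field.indexOf('^');
            if (caret >= 0) {
                boost = field.substring(caret);
                field = field.substring(0, caret);
            }
            int slop = 0; // assumption: no "~n" suffix means slop 0
            int tilde = field.indexOf('~');
            if (tilde >= 0) {
                slop = Integer.parseInt(field.substring(tilde + 1));
                field = field.substring(0, tilde);
            }
            groups.computeIfAbsent(slop, k -> new ArrayList<>()).add(field + boost);
        }
        return groups;
    }

    public static void main(String[] args) {
        String pf2 = "phrase_sw~2^22 phrase_sw^33 phrase1_sw~2^44 phrase1_sw~4^55";
        System.out.println(groupBySlop(pf2));
    }
}
```

For the pf2 value above, the two slop-2 entries land in the same bucket, while the no-slop and slop-4 entries each get their own, mirroring the three pf2 groups in the output.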

 Phrase queries are created for each field supplied through edismax's pf, pf2 
 and pf3 parameters (rather them being combined in a single dismax query)
 -

 Key: SOLR-6062
 URL: https://issues.apache.org/jira/browse/SOLR-6062
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.0
Reporter: Michael Dodsworth
Priority: Minor
 Attachments: combined-phrased-dismax.patch


 https://issues.apache.org/jira/browse/SOLR-2058 subtly changed how phrase 
 queries, created through the pf, pf2 and pf3 parameters, are merged into the 
 main user query.
 For the query 'term1 term2' with pf2:[field1, field2, field3] we now get 
 (omitting the non-phrase query section for clarity):
 {code:java}
 main query
 DisjunctionMaxQuery((field1:"term1 term2"^1.0)~0.1)
 DisjunctionMaxQuery((field2:"term1 term2"^1.0)~0.1)
 DisjunctionMaxQuery((field3:"term1 term2"^1.0)~0.1)
 {code}
 Prior to this change, we had:
 {code:java}
 main query 
 DisjunctionMaxQuery((field1:"term1 term2"^1.0 | field2:"term1 term2"^1.0 | 
 field3:"term1 term2"^1.0)~0.1)
 {code}
 The upshot is that if the phrase query "term1 term2" appears in multiple 
 fields, it will get a significant boost over the previous implementation.






[jira] [Comment Edited] (SOLR-6062) Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query)

2014-05-29 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14012659#comment-14012659
 ] 

Michael Dodsworth edited comment on SOLR-6062 at 5/29/14 6:25 PM:
--

Thanks for looking at this, [~ramayer],

Here's an example that shows both the grouping within a particular pf? query 
(where the supplied fields have the same slop) and the splitting out/layering 
of queries when different slops are used for the same field(s). Hold on to your 
hats...

{code}
{q,   ,
 qf, phrase_sw phrase1_sw,
 pf, phrase_sw~1^10 phrase_sw~2^20 phrase_sw^30,
 pf2, phrase_sw~2^22 phrase_sw^33 phrase1_sw~2^44 phrase1_sw~4^55,
 pf3, phrase_sw~2^222 phrase_sw^333 phrase1_sw~2^444 phrase1_sw~4^555}

# pf -- phrase_sw with 3 different slop values results in 3 independent dismax 
queries
DisjunctionMaxQuery((phrase_sw:  ~1^10.0)) 
DisjunctionMaxQuery((phrase_sw:  ~2^20.0)) 
DisjunctionMaxQuery((phrase_sw:  ^30.0)) 

# pf2 -- phrase_sw and phrase1_sw were both supplied with a slop of 2, so those 
queries are grouped
(
  DisjunctionMaxQuery((phrase_sw: ~2^22.0 | phrase1_sw: 
~2^44.0)) 
  DisjunctionMaxQuery((phrase_sw: ~2^22.0 | phrase1_sw: 
~2^44.0))
) 

(
  DisjunctionMaxQuery((phrase_sw: ^33.0)) 
  DisjunctionMaxQuery((phrase_sw: ^33.0))
)

(
  DisjunctionMaxQuery((phrase1_sw: ~4^55.0)) 
  DisjunctionMaxQuery((phrase1_sw: ~4^55.0))
)

# pf3
DisjunctionMaxQuery((phrase_sw:  ~2^222.0 | phrase1_sw:  
~2^444.0)) 
DisjunctionMaxQuery((phrase_sw:  ^333.0)) 
DisjunctionMaxQuery((phrase1_sw:  ~4^555.0)))

{code}


was (Author: mdodswo...@salesforce.com):
Thanks for looking at this, [~ramayer],

Here's an example that shows both the grouping within a particular pf? query 
(where the supplied fields have the same slop) and the splitting out/layering 
of queries when different slops are used for the same field(s). Hold on to your 
hats...

{q,   ,
 qf, phrase_sw phrase1_sw,
 pf, phrase_sw~1^10 phrase_sw~2^20 phrase_sw^30,
 pf2, phrase_sw~2^22 phrase_sw^33 phrase1_sw~2^44 phrase1_sw~4^55,
 pf3, phrase_sw~2^222 phrase_sw^333 phrase1_sw~2^444 phrase1_sw~4^555}

# pf -- phrase_sw with 3 different slop values results in 3 independent dismax 
queries
DisjunctionMaxQuery((phrase_sw:  ~1^10.0)) 
DisjunctionMaxQuery((phrase_sw:  ~2^20.0)) 
DisjunctionMaxQuery((phrase_sw:  ^30.0)) 

# pf2 -- phrase_sw and phrase1_sw were both supplied with a slop of 2, so those 
queries are grouped
(
  DisjunctionMaxQuery((phrase_sw: ~2^22.0 | phrase1_sw: 
~2^44.0)) 
  DisjunctionMaxQuery((phrase_sw: ~2^22.0 | phrase1_sw: 
~2^44.0))
) 

(
  DisjunctionMaxQuery((phrase_sw: ^33.0)) 
  DisjunctionMaxQuery((phrase_sw: ^33.0))
)

(
  DisjunctionMaxQuery((phrase1_sw: ~4^55.0)) 
  DisjunctionMaxQuery((phrase1_sw: ~4^55.0))
)

# pf3
DisjunctionMaxQuery((phrase_sw:  ~2^222.0 | phrase1_sw:  
~2^444.0)) 
DisjunctionMaxQuery((phrase_sw:  ^333.0)) 
DisjunctionMaxQuery((phrase1_sw:  ~4^555.0)))

 Phrase queries are created for each field supplied through edismax's pf, pf2 
 and pf3 parameters (rather them being combined in a single dismax query)
 -

 Key: SOLR-6062
 URL: https://issues.apache.org/jira/browse/SOLR-6062
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.0
Reporter: Michael Dodsworth
Priority: Minor
 Attachments: combined-phrased-dismax.patch


 https://issues.apache.org/jira/browse/SOLR-2058 subtly changed how phrase 
 queries, created through the pf, pf2 and pf3 parameters, are merged into the 
 main user query.
 For the query 'term1 term2' with pf2:[field1, field2, field3] we now get 
 (omitting the non-phrase query section for clarity):
 {code:java}
 main query
 DisjunctionMaxQuery((field1:"term1 term2"^1.0)~0.1)
 DisjunctionMaxQuery((field2:"term1 term2"^1.0)~0.1)
 DisjunctionMaxQuery((field3:"term1 term2"^1.0)~0.1)
 {code}
 Prior to this change, we had:
 {code:java}
 main query 
 DisjunctionMaxQuery((field1:"term1 term2"^1.0 | field2:"term1 term2"^1.0 | 
 field3:"term1 term2"^1.0)~0.1)
 {code}
 The upshot is that if the phrase query "term1 term2" appears in multiple 
 fields, it will get a significant boost over the previous implementation.




[jira] [Comment Edited] (SOLR-6062) Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query)

2014-05-29 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14012659#comment-14012659
 ] 

Michael Dodsworth edited comment on SOLR-6062 at 5/29/14 6:25 PM:
--

Thanks for looking at this, [~ramayer],

Here's an example (post fix) that shows both the grouping within a particular 
pf? query (where the supplied fields have the same slop) and the splitting 
out/layering of queries when different slops are used for the same field(s). 
Hold on to your hats...

{code}
{q,   ,
 qf, phrase_sw phrase1_sw,
 pf, phrase_sw~1^10 phrase_sw~2^20 phrase_sw^30,
 pf2, phrase_sw~2^22 phrase_sw^33 phrase1_sw~2^44 phrase1_sw~4^55,
 pf3, phrase_sw~2^222 phrase_sw^333 phrase1_sw~2^444 phrase1_sw~4^555}

# pf -- phrase_sw with 3 different slop values results in 3 independent dismax 
queries
DisjunctionMaxQuery((phrase_sw:  ~1^10.0)) 
DisjunctionMaxQuery((phrase_sw:  ~2^20.0)) 
DisjunctionMaxQuery((phrase_sw:  ^30.0)) 

# pf2 -- phrase_sw and phrase1_sw were both supplied with a slop of 2, so those 
queries are grouped
(
  DisjunctionMaxQuery((phrase_sw: ~2^22.0 | phrase1_sw: 
~2^44.0)) 
  DisjunctionMaxQuery((phrase_sw: ~2^22.0 | phrase1_sw: 
~2^44.0))
) 

(
  DisjunctionMaxQuery((phrase_sw: ^33.0)) 
  DisjunctionMaxQuery((phrase_sw: ^33.0))
)

(
  DisjunctionMaxQuery((phrase1_sw: ~4^55.0)) 
  DisjunctionMaxQuery((phrase1_sw: ~4^55.0))
)

# pf3
DisjunctionMaxQuery((phrase_sw:  ~2^222.0 | phrase1_sw:  
~2^444.0)) 
DisjunctionMaxQuery((phrase_sw:  ^333.0)) 
DisjunctionMaxQuery((phrase1_sw:  ~4^555.0)))

{code}


was (Author: mdodswo...@salesforce.com):
Thanks for looking at this, [~ramayer],

Here's an example that shows both the grouping within a particular pf? query 
(where the supplied fields have the same slop) and the splitting out/layering 
of queries when different slops are used for the same field(s). Hold on to your 
hats...

{code}
{q,   ,
 qf, phrase_sw phrase1_sw,
 pf, phrase_sw~1^10 phrase_sw~2^20 phrase_sw^30,
 pf2, phrase_sw~2^22 phrase_sw^33 phrase1_sw~2^44 phrase1_sw~4^55,
 pf3, phrase_sw~2^222 phrase_sw^333 phrase1_sw~2^444 phrase1_sw~4^555}

# pf -- phrase_sw with 3 different slop values results in 3 independent dismax 
queries
DisjunctionMaxQuery((phrase_sw:  ~1^10.0)) 
DisjunctionMaxQuery((phrase_sw:  ~2^20.0)) 
DisjunctionMaxQuery((phrase_sw:  ^30.0)) 

# pf2 -- phrase_sw and phrase1_sw were both supplied with a slop of 2, so those 
queries are grouped
(
  DisjunctionMaxQuery((phrase_sw: ~2^22.0 | phrase1_sw: 
~2^44.0)) 
  DisjunctionMaxQuery((phrase_sw: ~2^22.0 | phrase1_sw: 
~2^44.0))
) 

(
  DisjunctionMaxQuery((phrase_sw: ^33.0)) 
  DisjunctionMaxQuery((phrase_sw: ^33.0))
)

(
  DisjunctionMaxQuery((phrase1_sw: ~4^55.0)) 
  DisjunctionMaxQuery((phrase1_sw: ~4^55.0))
)

# pf3
DisjunctionMaxQuery((phrase_sw:  ~2^222.0 | phrase1_sw:  
~2^444.0)) 
DisjunctionMaxQuery((phrase_sw:  ^333.0)) 
DisjunctionMaxQuery((phrase1_sw:  ~4^555.0)))

{code}

 Phrase queries are created for each field supplied through edismax's pf, pf2 
 and pf3 parameters (rather them being combined in a single dismax query)
 -

 Key: SOLR-6062
 URL: https://issues.apache.org/jira/browse/SOLR-6062
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.0
Reporter: Michael Dodsworth
Priority: Minor
 Attachments: combined-phrased-dismax.patch


 https://issues.apache.org/jira/browse/SOLR-2058 subtly changed how phrase 
 queries, created through the pf, pf2 and pf3 parameters, are merged into the 
 main user query.
 For the query 'term1 term2' with pf2:[field1, field2, field3] we now get 
 (omitting the non-phrase query section for clarity):
 {code:java}
 main query
 DisjunctionMaxQuery((field1:"term1 term2"^1.0)~0.1)
 DisjunctionMaxQuery((field2:"term1 term2"^1.0)~0.1)
 DisjunctionMaxQuery((field3:"term1 term2"^1.0)~0.1)
 {code}
 Prior to this change, we had:
 {code:java}
 main query 
 DisjunctionMaxQuery((field1:"term1 term2"^1.0 | field2:"term1 term2"^1.0 | 
 field3:"term1 term2"^1.0)~0.1)
 {code}
 The upshot is that if the phrase query "term1 term2" appears in multiple 
 fields, it will get a significant boost over the previous implementation.




[jira] [Commented] (SOLR-6062) Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query)

2014-05-27 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010368#comment-14010368
 ] 

Michael Dodsworth commented on SOLR-6062:
-

adding [~jdyer] [~janhoy], as you were involved in 
https://issues.apache.org/jira/browse/SOLR-2058







[jira] [Commented] (SOLR-6062) Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query)

2014-05-16 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999275#comment-13999275
 ] 

Michael Dodsworth commented on SOLR-6062:
-

all comments and feedback welcome.







[jira] [Commented] (SOLR-6062) a phrase query is created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query)

2014-05-14 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13996085#comment-13996085
 ] 

Michael Dodsworth commented on SOLR-6062:
-

As was mentioned on this issue, the behavioral change was not desirable.







[jira] [Updated] (SOLR-6062) Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query)

2014-05-13 Thread Michael Dodsworth (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Dodsworth updated SOLR-6062:


Attachment: combined-phrased-dismax.patch

Rather than sending each FieldParam through addShingledPhraseQueries 
individually (which results in a dismax query per field), we're now grouping 
the phrase fields by their wordGram count and sending each *group* through 
addShingledPhraseQueries.

One slight complication is that the original/linked issue allowed the *same* 
field to be passed through a pf parameter with differing slop values. The 
intent being that those scores would be combined, rather than the max being 
used across those fields. In order to continue support for that feature, we're 
also grouping FieldParams by their associated slop values (passing each group 
through independently).

I've added a test for the multi-field case. If people are happy with the 
approach, I can combine the wordGram and slop value grouping into a single pass.
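The grouping described above can be sketched roughly as follows. This is not the code from the attached patch: the PhraseField class, its members and the "wordGrams/slop" string key are hypothetical stand-ins chosen for illustration. The idea is only that fields sharing both a word-gram size and a slop value end up in the same group, so that each group can become one dismax query.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustration only; PhraseField is a hypothetical stand-in, not Solr's own class. */
final class PhraseFieldGroupingSketch {

    /** One parsed pf/pf2/pf3 entry: field name, shingle size and slop. */
    static final class PhraseField {
        final String field;
        final int wordGrams;
        final int slop;

        PhraseField(String field, int wordGrams, int slop) {
            this.field = field;
            this.wordGrams = wordGrams;
            this.slop = slop;
        }
    }

    /** Groups fields by (wordGrams, slop) in a single pass, preserving input order. */
    static Map<String, List<PhraseField>> group(List<PhraseField> fields) {
        Map<String, List<PhraseField>> groups = new LinkedHashMap<>();
        for (PhraseField f : fields) {
            String key = f.wordGrams + "/" + f.slop; // hypothetical key format
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(f);
        }
        return groups;
    }
}
```

With this shape, the same field listed twice with different slop values lands in two different groups, so its scores would still be added together rather than reduced to a max.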









[jira] [Commented] (SOLR-6062) Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query)

2014-05-13 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13996547#comment-13996547
 ] 

Michael Dodsworth commented on SOLR-6062:
-

[~ndushay] [~ramayer]







[jira] [Updated] (SOLR-6062) Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query)

2014-05-13 Thread Michael Dodsworth (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Dodsworth updated SOLR-6062:


Summary: Phrase queries are created for each field supplied through 
edismax's pf, pf2 and pf3 parameters (rather them being combined in a single 
dismax query)  (was: a phrase query is created for each field supplied through 
edismax's pf, pf2 and pf3 parameters (rather them being combined in a single 
dismax query))







[jira] [Created] (SOLR-6062) a phrase query is created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather them being combined in a single dismax query)

2014-05-13 Thread Michael Dodsworth (JIRA)
Michael Dodsworth created SOLR-6062:
---

 Summary: a phrase query is created for each field supplied through 
edismax's pf, pf2 and pf3 parameters (rather them being combined in a single 
dismax query)
 Key: SOLR-6062
 URL: https://issues.apache.org/jira/browse/SOLR-6062
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.0
Reporter: Michael Dodsworth
Priority: Minor


https://issues.apache.org/jira/browse/SOLR-2058 subtly changed how phrase 
queries, created through the pf, pf2 and pf3 parameters, are merged into the 
main user query.

For the query: 'term1 term2' with pf2:[field1, field2, field3] we now get 
(omitting the non phrase query section for clarity):

{code:java}
main query
DisjunctionMaxQuery((field1:"term1 term2"^1.0)~0.1)
DisjunctionMaxQuery((field2:"term1 term2"^1.0)~0.1)
DisjunctionMaxQuery((field3:"term1 term2"^1.0)~0.1)
{code}

Prior to this change, we had:

{code:java}
main query 
DisjunctionMaxQuery((field1:"term1 term2"^1.0 | field2:"term1 term2"^1.0 | 
field3:"term1 term2"^1.0)~0.1)
{code}

The upshot is that if the phrase "term1 term2" appears in multiple fields, it 
now gets a significantly larger boost than under the previous implementation.
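To make the size of that boost concrete, here is a small arithmetic sketch. It is not Solr or Lucene code. It assumes the usual dismax rule (best sub-score plus the tie-breaker times the remaining sub-scores) and assumes the enclosing query simply adds its clause scores, ignoring coord and normalization. The per-field scores are made-up inputs.

```java
/** Illustrative arithmetic only; this is not Solr or Lucene code. */
final class PhraseBoostSketch {

    /** Dismax-style score: the best field score plus tie * (the remaining scores). */
    static double disMax(double tie, double... fieldScores) {
        double max = 0.0;
        double sum = 0.0;
        for (double s : fieldScores) {
            max = Math.max(max, s);
            sum += s;
        }
        return max + tie * (sum - max);
    }

    /** One single-field dismax clause per field; the enclosing query adds them up. */
    static double perFieldClauses(double tie, double... fieldScores) {
        double total = 0.0;
        for (double s : fieldScores) {
            total += disMax(tie, s);
        }
        return total;
    }

    public static void main(String[] args) {
        double tie = 0.1; // the ~0.1 shown in the query output
        System.out.println("single dismax:     " + disMax(tie, 1.0, 1.0, 1.0));
        System.out.println("per-field clauses: " + perFieldClauses(tie, 1.0, 1.0, 1.0));
    }
}
```

Under these assumptions, a phrase that scores 1.0 in each of three fields contributes about 1.2 when the fields share one dismax query, and 3.0 when each field gets its own.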






[jira] [Commented] (SOLR-2058) Adds optional phrase slop to edismax pf2, pf3 and pf parameters with field~slop^boost syntax

2012-10-09 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13472796#comment-13472796
 ] 

Michael Dodsworth commented on SOLR-2058:
-

Can anyone comment on whether this change was intentional or accidental (and 
unwanted)?

 Adds optional phrase slop to edismax pf2, pf3 and pf parameters with 
 field~slop^boost syntax
 

 Key: SOLR-2058
 URL: https://issues.apache.org/jira/browse/SOLR-2058
 Project: Solr
  Issue Type: Improvement
  Components: query parsers
 Environment: n/a
Reporter: Ron Mayer
Assignee: James Dyer
Priority: Minor
 Fix For: 4.0-ALPHA

 Attachments: edismax_pf_with_slop_v2.1.patch, 
 edismax_pf_with_slop_v2.patch, pf2_with_slop.patch, 
 SOLR-2058-and-3351-not-finished.patch, SOLR-2058.patch


 http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3c4c659119.2010...@0ape.com%3E
 {quote}
 From  Ron Mayer r...@0ape.com
 ... my results might  be even better if I had a couple different pf2s with 
 different ps's  at the same time.   In particular.   One with ps=0 to put a 
 high boost on ones the have  the right ordering of words.  For example 
 insuring that [the query]:
   red hat black jacket
  boosts only documents with red hats and not black hats.   And another 
 pf2 with a more modest boost with ps=5 or so to handle the query above also 
 boosting docs with 
   red baseball hat.
 {quote}
 [http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3caanlktimd+v3g6d_mnhp+jykkd+dej8fvmvf_1lqoi...@mail.gmail.com%3E]
 {quote}
 From  Yonik Seeley yo...@lucidimagination.com
 Perhaps fold it into the pf/pf2 syntax?
 pf=text^2    // current syntax... makes phrases with a boost of 2
 pf=text~1^2  // proposed syntax... makes phrases with a slop of 1 and a boost of 2
 That actually seems pretty natural given the lucene query syntax - an
 actual boosted sloppy phrase query already looks like
 {{text:"foo bar"~1^2}}
 -Yonik
 {quote}
 [http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3calpine.deb.1.10.1008161300510.6...@radix.cryptio.net%3E]
 {quote}
 From  Chris Hostetter hossman_luc...@fucit.org
 Big +1 to this idea ... the existing ps param can stick arround as the 
 default for any field that doesn't specify it's own slop in the pf/pf2/pf3 
 fields using the ~ syntax.
 -Hoss
 {quote}
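A minimal sketch of how a field~slop^boost entry could be split into its parts, with the existing ps value used as the fallback slop as suggested in the quoted thread. This is not edismax's actual parsing code: the class name, the default boost of 1.0 and the lack of input validation are assumptions made for illustration.

```java
/** Illustration only; not the parser used by edismax. */
final class PhraseFieldSpec {
    final String field;
    final int slop;
    final float boost;

    PhraseFieldSpec(String field, int slop, float boost) {
        this.field = field;
        this.slop = slop;
        this.boost = boost;
    }

    /**
     * Parses "field", "field^boost", "field~slop" or "field~slop^boost".
     * defaultSlop plays the role of the ps parameter when no ~slop is given.
     */
    static PhraseFieldSpec parse(String spec, int defaultSlop) {
        String rest = spec.trim();
        float boost = 1.0f; // assumed default boost
        int caret = rest.lastIndexOf('^');
        if (caret >= 0) {
            boost = Float.parseFloat(rest.substring(caret + 1));
            rest = rest.substring(0, caret);
        }
        int slop = defaultSlop;
        int tilde = rest.lastIndexOf('~');
        if (tilde >= 0) {
            slop = Integer.parseInt(rest.substring(tilde + 1));
            rest = rest.substring(0, tilde);
        }
        return new PhraseFieldSpec(rest, slop, boost);
    }
}
```

For example, parse("text~1^2", 0) yields the field text with slop 1 and boost 2, while parse("text^2", 3) keeps the default slop of 3.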

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2058) Adds optional phrase slop to edismax pf2, pf3 and pf parameters with field~slop^boost syntax

2012-09-25 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13463009#comment-13463009
 ] 

Michael Dodsworth commented on SOLR-2058:
-

It looks like this change also altered the way phrase queries are merged into 
the main query.

For the query: 'term1 term2' with pf2:[field1, field2, field3] we now get 
(omitting the non phrase query section for clarity):
{code}
main query
DisjunctionMaxQuery((field1:"term1 term2"^1.0)~0.1)
DisjunctionMaxQuery((field2:"term1 term2"^1.0)~0.1)
DisjunctionMaxQuery((field3:"term1 term2"^1.0)~0.1)
{code}

Prior to this change, we got:

{code}
main query
DisjunctionMaxQuery((field1:"term1 term2"^1.0 | field2:"term1 term2"^1.0 | 
field3:"term1 term2"^1.0)~0.1)
{code}

The upshot is that if the phrase "term1 term2" appears in multiple fields, it 
now gets a significantly larger boost than under the previous implementation. 
The presence of the dismax queries makes me think this behavioral change was 
not intentional; if that's the case, let me know and I'll get a fix together.
Thanks.





[jira] [Commented] (SOLR-3725) package-local-src-tgz target is pulling in non-source jars, dist/** and package/**

2012-08-14 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434466#comment-13434466
 ] 

Michael Dodsworth commented on SOLR-3725:
-

Excellent. Thanks, Robert.

 package-local-src-tgz target is pulling in non-source jars, dist/** and 
 package/**
 --

 Key: SOLR-3725
 URL: https://issues.apache.org/jira/browse/SOLR-3725
 Project: Solr
  Issue Type: Improvement
  Components: Build
Affects Versions: 4.0-ALPHA
Reporter: Michael Dodsworth
Priority: Minor
 Fix For: 5.0, 4.0

 Attachments: SOLR-3725.patch


 package-local-src-tgz generates a 141M archive which contains a bunch of 
 non-source jars:
 {code}
 tar tfz apache-solr-4.0-SNAPSHOT-src.tgz  | grep -E '(war|jar)$' | wc -l
 134
 {code}
 It looks like we're expecting dist/** and package/** to be excluded:
 {code:xml}
 <tarfileset dir="." prefix="${fullnamever}/solr"
             excludes="build ${package.dir}/** ${dist}/**
  example/webapps/*.war example/exampledocs/post.jar
  lib/README.committers.txt **/data/ **/logs/*
  **/*.sh **/bin/ scripts/
  .idea/ **/*.iml **/pom.xml" />
 {code}
 The issue is that package.dir and dist refer to absolute paths; excludes 
 assumes relative paths.
 It's also pulling in all the contrib/**/lib/ and example/lib/ jars.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-3725) package-local-src-tgz target is pulling in non-source jars, dist/** and package/**

2012-08-09 Thread Michael Dodsworth (JIRA)
Michael Dodsworth created SOLR-3725:
---

 Summary: package-local-src-tgz target is pulling in non-source 
jars, dist/** and package/**
 Key: SOLR-3725
 URL: https://issues.apache.org/jira/browse/SOLR-3725
 Project: Solr
  Issue Type: Improvement
  Components: Build
Affects Versions: 4.0-ALPHA
Reporter: Michael Dodsworth
Priority: Minor
 Fix For: 4.1


package-local-src-tgz generates a 141M archive which contains a bunch of 
non-source jars:

{code}
tar tfz apache-solr-4.0-SNAPSHOT-src.tgz  | grep -E '(war|jar)$' | wc -l
134
{code}

It looks like we're expecting dist/** and package/** to be excluded:

{code:xml}
<tarfileset dir="." prefix="${fullnamever}/solr"
            excludes="build ${package.dir}/** ${dist}/**
 example/webapps/*.war example/exampledocs/post.jar
 lib/README.committers.txt **/data/ **/logs/*
 **/*.sh **/bin/ scripts/
 .idea/ **/*.iml **/pom.xml" />
{code}
{code}

The issue is that package.dir and dist refer to absolute paths; excludes 
assumes relative paths.

It's also pulling in all the contrib/**/lib/ and example/lib/ jars.
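The absolute-versus-relative mismatch can be illustrated outside of Ant. The sketch below uses java.nio glob matching, which only approximates Ant's pattern rules, and the /work/solr paths are made up. It assumes Unix-style paths. The point is just that an exclude built from an absolute directory never matches entries that are compared as paths relative to the fileset's dir, while the relativized pattern does.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

/** Illustration only; java.nio globs approximate, but are not, Ant patterns. */
final class ExcludePatternSketch {

    /** True if the exclude pattern matches a path given relative to the base dir. */
    static boolean excluded(String pattern, String relativePath) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + pattern);
        return matcher.matches(Paths.get(relativePath));
    }

    /** Turns an absolute directory into a base-relative "dir/**" exclude pattern. */
    static String relativePattern(String baseDir, String absoluteDir) {
        Path relative = Paths.get(baseDir).relativize(Paths.get(absoluteDir));
        return relative.toString() + "/**";
    }

    public static void main(String[] args) {
        String entry = "dist/apache-solr.jar"; // hypothetical archive entry
        System.out.println(excluded("/work/solr/dist/**", entry));
        System.out.println(excluded(relativePattern("/work/solr", "/work/solr/dist"), entry));
    }
}
```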




[jira] [Commented] (SOLR-3725) package-local-src-tgz target is pulling in non-source jars, dist/** and package/**

2012-08-09 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13432136#comment-13432136
 ] 

Michael Dodsworth commented on SOLR-3725:
-

it's also including everything from solr/build and solr/lib/





[jira] [Updated] (SOLR-3725) package-local-src-tgz target is pulling in non-source jars, dist/** and package/**

2012-08-09 Thread Michael Dodsworth (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Dodsworth updated SOLR-3725:


Attachment: SOLR-3725.patch

generated archive is now 31M





[jira] [Commented] (SOLR-3580) In ExtendedDismax, lowercase 'not' operator is not being treated as an operator when 'lowercaseOperators' is enabled

2012-07-31 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13425847#comment-13425847
 ] 

Michael Dodsworth commented on SOLR-3580:
-

Does this seem like a reasonable direction to everyone?

 In ExtendedDismax, lowercase 'not' operator is not being treated as an 
 operator when 'lowercaseOperators' is enabled
 

 Key: SOLR-3580
 URL: https://issues.apache.org/jira/browse/SOLR-3580
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.0-ALPHA
Reporter: Michael Dodsworth
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-3580-proposal.patch, SOLR-3580.patch


 When lowercase operator support is enabled (for edismax), the lowercase 'not' 
 operator is being wrongly treated as a literal term (and not as an operator).




[jira] [Commented] (SOLR-3580) In ExtendedDismax, lowercase 'not' operator is not being treated as an operator when 'lowercaseOperators' is enabled

2012-07-31 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13425932#comment-13425932
 ] 

Michael Dodsworth commented on SOLR-3580:
-

Thanks for the feedback, Jack/Yonik.

1 - support for mixed-case operators is as before: they are interpreted as 
operators. Having said that, there appears to be a subtle bug with the 'mm' 
toggling behaviour. The operator counting (used to determine whether 'mm' needs 
to be disabled) only accepts strict uppercase and lowercase, whereas the query 
rebuild accepts mixed-case. I can also fix that up and add a test.

2 - the 'supportedLowercaseOperators' parameter would be in addition to 
'lowercaseOperators', rather than replacing it. If 'lowercaseOperators' is 
true, we look for a 'supportedLowercaseOperators' value. If no value is 
provided, we use the default (and, or), which means we have backwards 
compatibility.
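The fallback described in point 2 could look roughly like this. It is a sketch, not the code in the proposal patch: the class and method names are hypothetical, and the comma- or whitespace-separated format for 'supportedLowercaseOperators' is an assumption. Mixed-case handling from point 1 is deliberately left out.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

/** Illustration only; names and the value format are assumptions. */
final class LowercaseOperatorSketch {

    /** Default set when 'lowercaseOperators' is true and nothing else is supplied. */
    static final Set<String> DEFAULT_LOWERCASE_OPERATORS =
        Collections.unmodifiableSet(new LinkedHashSet<String>(Arrays.asList("and", "or")));

    /** Resolves which lowercase words should be treated as operators. */
    static Set<String> resolve(boolean lowercaseOperators, String supportedLowercaseOperators) {
        if (!lowercaseOperators) {
            return Collections.emptySet();
        }
        if (supportedLowercaseOperators == null || supportedLowercaseOperators.trim().isEmpty()) {
            return DEFAULT_LOWERCASE_OPERATORS; // backwards-compatible default
        }
        Set<String> ops = new LinkedHashSet<String>();
        for (String op : supportedLowercaseOperators.split("[,\\s]+")) {
            if (!op.isEmpty()) {
                ops.add(op.toLowerCase(Locale.ROOT));
            }
        }
        return ops;
    }

    /** Uppercase operators always count; lowercase ones only if configured. */
    static boolean isOperator(String token, Set<String> lowercaseOps) {
        return token.equals("AND") || token.equals("OR") || token.equals("NOT")
            || lowercaseOps.contains(token);
    }
}
```

With no 'supportedLowercaseOperators' value, a lowercase 'not' stays a literal term, which matches the existing behaviour; passing "and,or,not" would opt in to treating it as an operator.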

Yonik - yeah, Jan's proposal is absolutely the most flexible. I guess my 
concerns were:
  - that it might snowball into wanting to have an external, stopword-esque file 
for per-language operator support (minor concern)
  - that we'd lose some backwards compatibility, as currently mixed-case 
operators are supported (although the default set could be expanded to 
accommodate this, if needed)
  - the interaction between the 'lowercaseOperators' parameter and 'valid*' 
might get a little funky. For example, if we simply ignore 'lowercaseOperators' 
when a 'valid*' parameter is present, there is no potential for confusion BUT 
toggling lowercase operator support per query then becomes a headache (as the 
upstream client needs to pass through the supported uppercase operators). If we 
allow interaction between 'lowercaseOperators' and 'valid*', which parameter 
takes priority? To allow toggling per-query, lowercaseOperators *should* take 
priority. Perhaps a good dollop of documentation would be enough here.

Let me extend the patch to switch over to Jan's proposal so people can take a 
look.
Cheers,





[jira] [Commented] (SOLR-3467) ExtendedDismax escaping is missing several reserved characters

2012-07-02 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13405147#comment-13405147
 ] 

Michael Dodsworth commented on SOLR-3467:
-

much appreciated, Jan. Thank you.

 ExtendedDismax escaping is missing several reserved characters
 --

 Key: SOLR-3467
 URL: https://issues.apache.org/jira/browse/SOLR-3467
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 3.6
Reporter: Michael Dodsworth
Assignee: Jan Høydahl
Priority: Minor
 Fix For: 4.0, 5.0

 Attachments: SOLR-3467-lucene_solr_3_6.patch, SOLR-3467.patch, 
 SOLR-3467.patch, SOLR-3467.patch, SOLR-3467.patch


 When edismax is unable to parse the original user query, it retries using an 
 escaped version of that query (where all reserved chars have been escaped).
 Currently, the escaping done in {{splitIntoClauses}} appears to be missing 
 several chars from {{QueryParserBase#escape(String)}}, namely {{'\\', '|', 
 '&', '/'}}




[jira] [Updated] (SOLR-3580) In ExtendedDismax, lowercase 'not' operator is not being treated as an operator when 'lowercaseOperators' is enabled

2012-07-02 Thread Michael Dodsworth (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Dodsworth updated SOLR-3580:


Attachment: SOLR-3580-proposal.patch

adding the 'supportedLowercaseOperator' parameter I mentioned.

Also cleared up a few unused vars, assignments, etc.

Requires more tests but I'm interested to hear what people think.

 In ExtendedDismax, lowercase 'not' operator is not being treated as an 
 operator when 'lowercaseOperators' is enabled
 

 Key: SOLR-3580
 URL: https://issues.apache.org/jira/browse/SOLR-3580
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.0
Reporter: Michael Dodsworth
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-3580-proposal.patch, SOLR-3580.patch


 When lowercase operator support is enabled (for edismax), the lowercase 'not' 
 operator is being wrongly treated as a literal term (and not as an operator).




[jira] [Commented] (SOLR-3580) In ExtendedDismax, lowercase 'not' operator is not being treated as an operator when 'lowercaseOperators' is enabled

2012-06-28 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403641#comment-13403641
 ] 

Michael Dodsworth commented on SOLR-3580:
-

one option (that sits somewhere between the 2 proposed solutions) may be to 
have a 'supportedLowercaseOperators' setting that takes a comma-separated list 
of supported operators. If no override is provided, the default behaviour would 
be to accept '[and,or]'.

{code:xml} 
<str name="supportedLowercaseOperators">and,or,not</str>
{code}

Let me get a patch together so people can take a look.
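
To make the idea concrete, here is a rough sketch of how such a setting could be read and applied. It is not taken from the patch: the class and method names are made up for illustration, and it only handles whitespace-separated tokens (no quoted phrases or field prefixes).

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Illustrative sketch only; not the edismax implementation.
final class LowercaseOperatorSketch {

    // Parses the comma-separated setting. With no override, the assumed
    // default is [and, or], as suggested in the comment above.
    static Set<String> parseSupported(String csv) {
        String value = (csv == null || csv.trim().isEmpty()) ? "and,or" : csv;
        Set<String> ops = new LinkedHashSet<>();
        for (String part : value.split(",")) {
            String op = part.trim().toLowerCase(Locale.ROOT);
            if (!op.isEmpty()) {
                ops.add(op);
            }
        }
        return ops;
    }

    // Upper-cases every token that is listed as a supported lowercase operator;
    // all other tokens are passed through unchanged as literal terms.
    static String normalize(String query, Set<String> supported) {
        StringBuilder out = new StringBuilder();
        for (String token : query.trim().split("\\s+")) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(supported.contains(token) ? token.toUpperCase(Locale.ROOT) : token);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String q = "cats and dogs not fish";
        System.out.println(normalize(q, parseSupported(null)));         // default set
        System.out.println(normalize(q, parseSupported("and,or,not"))); // with 'not' enabled
    }
}
```

With the default set, 'not' stays a literal term (the behaviour reported in this issue); listing it explicitly turns it into an operator.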

 In ExtendedDismax, lowercase 'not' operator is not being treated as an 
 operator when 'lowercaseOperators' is enabled
 

 Key: SOLR-3580
 URL: https://issues.apache.org/jira/browse/SOLR-3580
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.0
Reporter: Michael Dodsworth
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-3580.patch


 When lowercase operator support is enabled (for edismax), the lowercase 'not' 
 operator is being wrongly treated as a literal term (and not as an operator).




[jira] [Commented] (SOLR-3467) ExtendedDismax escaping is missing several reserved characters

2012-06-27 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402287#comment-13402287
 ] 

Michael Dodsworth commented on SOLR-3467:
-

Thank you, Jan.

From what I can tell, '/' has only been a reserved character since 4.0 - 
https://issues.apache.org/jira/browse/LUCENE-2604.

 ExtendedDismax escaping is missing several reserved characters
 --

 Key: SOLR-3467
 URL: https://issues.apache.org/jira/browse/SOLR-3467
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 3.6
Reporter: Michael Dodsworth
Assignee: Jan Høydahl
Priority: Minor
 Fix For: 4.0, 3.6.1, 5.0

 Attachments: SOLR-3467-lucene_solr_3_6.patch, SOLR-3467.patch, 
 SOLR-3467.patch, SOLR-3467.patch


 When edismax is unable to parse the original user query, it retries using an 
 escaped version of that query (where all reserved chars have been escaped).
 Currently, the escaping done in {{splitIntoClauses}} appears to be missing 
 several chars from {{QueryParserBase#escape(String)}}, namely {{'\\', '|', 
 '&', '/'}}




[jira] [Commented] (SOLR-3580) In ExtendedDismax, lowercase 'not' operator is not being treated as an operator when 'lowercaseOperators' is enabled

2012-06-27 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402318#comment-13402318
 ] 

Michael Dodsworth commented on SOLR-3580:
-

surely that's a more general hazard with supporting lowercase operators. It 
seems strange to give 'not' special treatment. There are likely examples 
where having 'and' or 'or' wrongly treated as an operator /is/ catastrophic, 
therefore the onus should be on the client to choose the correct 
'lowercaseOperators' option for their use-case.


 In ExtendedDismax, lowercase 'not' operator is not being treated as an 
 operator when 'lowercaseOperators' is enabled
 

 Key: SOLR-3580
 URL: https://issues.apache.org/jira/browse/SOLR-3580
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.0
Reporter: Michael Dodsworth
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-3580.patch


 When lowercase operator support is enabled (for edismax), the lowercase 'not' 
 operator is being wrongly treated as a literal term (and not as an operator).




[jira] [Commented] (SOLR-3580) In ExtendedDismax, lowercase 'not' operator is not being treated as an operator when 'lowercaseOperators' is enabled

2012-06-27 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402350#comment-13402350
 ] 

Michael Dodsworth commented on SOLR-3580:
-

were we not allowing the user to explicitly *specify* that they want to support 
lowercase operators, I might agree.

That setting should (at the very least) come with a clear health warning so 
that more people aren't caught out by this.

 In ExtendedDismax, lowercase 'not' operator is not being treated as an 
 operator when 'lowercaseOperators' is enabled
 

 Key: SOLR-3580
 URL: https://issues.apache.org/jira/browse/SOLR-3580
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.0
Reporter: Michael Dodsworth
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-3580.patch


 When lowercase operator support is enabled (for edismax), the lowercase 'not' 
 operator is being wrongly treated as a literal term (and not as an operator).




[jira] [Created] (SOLR-3580) In ExtendedDismax, lowercase 'not' operator is not being treated as an operator when 'lowercaseOperators' is enabled

2012-06-26 Thread Michael Dodsworth (JIRA)
Michael Dodsworth created SOLR-3580:
---

 Summary: In ExtendedDismax, lowercase 'not' operator is not being 
treated as an operator when 'lowercaseOperators' is enabled
 Key: SOLR-3580
 URL: https://issues.apache.org/jira/browse/SOLR-3580
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.0
Reporter: Michael Dodsworth
Priority: Minor
 Fix For: 4.0


When lowercase operator support is enabled (for edismax), the lowercase 'not' 
operator is being wrongly treated as a literal term (and not as an operator).




[jira] [Updated] (SOLR-3580) In ExtendedDismax, lowercase 'not' operator is not being treated as an operator when 'lowercaseOperators' is enabled

2012-06-26 Thread Michael Dodsworth (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Dodsworth updated SOLR-3580:


Attachment: SOLR-3580.patch

patch includes:
  - fix to edismax
  - replacement operator test (covers upper and lowercase 'AND', 'OR' and 'NOT' 
operators with 'lowercaseOperators' enabled/disabled)
  - small clean-up in the test (adding @Test annotations and removing redundant 
'throws IOException's)

 In ExtendedDismax, lowercase 'not' operator is not being treated as an 
 operator when 'lowercaseOperators' is enabled
 

 Key: SOLR-3580
 URL: https://issues.apache.org/jira/browse/SOLR-3580
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.0
Reporter: Michael Dodsworth
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-3580.patch


 When lowercase operator support is enabled (for edismax), the lowercase 'not' 
 operator is being wrongly treated as a literal term (and not as an operator).




[jira] [Updated] (SOLR-3467) ExtendedDismax escaping is missing several reserved characters

2012-06-26 Thread Michael Dodsworth (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Dodsworth updated SOLR-3467:


Attachment: (was: SOLR-3467.patch)

 ExtendedDismax escaping is missing several reserved characters
 --

 Key: SOLR-3467
 URL: https://issues.apache.org/jira/browse/SOLR-3467
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.0
Reporter: Michael Dodsworth
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-3467.patch, SOLR-3467.patch


 When edismax is unable to parse the original user query, it retries using an 
 escaped version of that query (where all reserved chars have been escaped).
 Currently, the escaping done in {{splitIntoClauses}} appears to be missing 
 several chars from {{QueryParserBase#escape(String)}}, namely {{'\\', '|', 
 '&', '/'}}




[jira] [Updated] (SOLR-3467) ExtendedDismax escaping is missing several reserved characters

2012-06-26 Thread Michael Dodsworth (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Dodsworth updated SOLR-3467:


Attachment: SOLR-3467.patch
SOLR-3467.patch

added test.

Thanks for looking at the patch.

 ExtendedDismax escaping is missing several reserved characters
 --

 Key: SOLR-3467
 URL: https://issues.apache.org/jira/browse/SOLR-3467
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.0
Reporter: Michael Dodsworth
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-3467.patch, SOLR-3467.patch


 When edismax is unable to parse the original user query, it retries using an 
 escaped version of that query (where all reserved chars have been escaped).
 Currently, the escaping done in {{splitIntoClauses}} appears to be missing 
 several chars from {{QueryParserBase#escape(String)}}, namely {{'\\', '|', 
 '&', '/'}}




[jira] [Commented] (SOLR-3467) ExtendedDismax escaping is missing several reserved characters

2012-06-25 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13400895#comment-13400895
 ] 

Michael Dodsworth commented on SOLR-3467:
-

all feedback/review comments welcome

 ExtendedDismax escaping is missing several reserved characters
 --

 Key: SOLR-3467
 URL: https://issues.apache.org/jira/browse/SOLR-3467
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.0
Reporter: Michael Dodsworth
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-3467.patch


 When edismax is unable to parse the original user query, it retries using an 
 escaped version of that query (where all reserved chars have been escaped).
 Currently, the escaping done in {{splitIntoClauses}} appears to be missing 
 several chars from {{QueryParserBase#escape(String)}}, namely {{'\\', '|', 
 '&', '/'}}




[jira] [Created] (SOLR-3467) ExtendedDismax escaping is missing several reserved characters

2012-05-17 Thread Michael Dodsworth (JIRA)
Michael Dodsworth created SOLR-3467:
---

 Summary: ExtendedDismax escaping is missing several reserved 
characters
 Key: SOLR-3467
 URL: https://issues.apache.org/jira/browse/SOLR-3467
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.0
Reporter: Michael Dodsworth
Priority: Minor
 Fix For: 4.0


When edismax is unable to parse the original user query, it retries using an 
escaped version of that query (where all reserved chars have been escaped).

Currently, the escaping done in {{splitIntoClauses}} appears to be missing 
several chars from {{QueryParserBase#escape(String)}}, namely {{'\\', '|', '&', 
'/'}}
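
As a rough illustration of the escaping this retry relies on, here is a minimal standalone sketch. It is not the edismax or Lucene source: the class name is made up, and the reserved set is an assumption modelled on the characters that {{QueryParserBase#escape(String)}} is generally understood to cover, including the ones reported missing above.

```java
// Illustrative sketch only; not the code from splitIntoClauses or QueryParserBase.
final class ReservedCharEscaper {

    // Assumed reserved set: \ + - ! ( ) : ^ [ ] " { } ~ * ? | & /
    private static final String RESERVED = "\\+-!():^[]\"{}~*?|&/";

    // Prefixes every reserved character with a backslash so the parser
    // treats it as literal text rather than query syntax.
    static String escape(String query) {
        StringBuilder sb = new StringBuilder(query.length() + 8);
        for (int i = 0; i < query.length(); i++) {
            char c = query.charAt(i);
            if (RESERVED.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("a/b|c&d"));
    }
}
```

If any character is left out of the set, the "escaped" retry can still fail to parse, which is the problem described here.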




[jira] [Updated] (SOLR-3467) ExtendedDismax escaping is missing several reserved characters

2012-05-17 Thread Michael Dodsworth (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Dodsworth updated SOLR-3467:


Attachment: SOLR-3467.patch

 ExtendedDismax escaping is missing several reserved characters
 --

 Key: SOLR-3467
 URL: https://issues.apache.org/jira/browse/SOLR-3467
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.0
Reporter: Michael Dodsworth
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-3467.patch


 When edismax is unable to parse the original user query, it retries using an 
 escaped version of that query (where all reserved chars have been escaped).
 Currently, the escaping done in {{splitIntoClauses}} appears to be missing 
 several chars from {{QueryParserBase#escape(String)}}, namely {{'\\', '|', 
 '&', '/'}}




[jira] Commented: (LUCENE-1227) NGramTokenizer to handle more than 1024 chars

2009-01-05 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12660924#action_12660924
 ] 

Michael Dodsworth commented on LUCENE-1227:
---

Any progress on getting this patch into a release? I can take a look if nobody 
else is. 

 NGramTokenizer to handle more than 1024 chars
 -

 Key: LUCENE-1227
 URL: https://issues.apache.org/jira/browse/LUCENE-1227
 Project: Lucene - Java
  Issue Type: Improvement
  Components: contrib/*
Reporter: Hiroaki Kawai
Assignee: Grant Ingersoll
Priority: Minor
 Attachments: LUCENE-1227.patch, NGramTokenizer.patch, 
 NGramTokenizer.patch


 The current NGramTokenizer can't handle a character stream that is longer than 
 1024 chars. This is too short for non-whitespace-separated languages.
 I created a patch for this issue.
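
 For illustration, one way to produce n-grams from input of any length is to read in fixed-size chunks and keep a sliding window across chunk boundaries. The sketch below is not the attached patch or Lucene's NGramTokenizer; the class name is made up and it emits a single gram size rather than a min/max range.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; not Lucene's NGramTokenizer.
final class StreamingNGramSketch {

    // Emits every n-gram of length n, reading the input in 1024-char chunks.
    // The sliding window carries over between chunks, so input longer than
    // one buffer is handled the same way as short input.
    static List<String> ngrams(Reader in, int n) throws IOException {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        List<String> out = new ArrayList<>();
        StringBuilder window = new StringBuilder();
        char[] buf = new char[1024];
        int read;
        while ((read = in.read(buf)) != -1) {
            for (int i = 0; i < read; i++) {
                window.append(buf[i]);
                if (window.length() > n) {
                    window.deleteCharAt(0);
                }
                if (window.length() == n) {
                    out.add(window.toString());
                }
            }
        }
        return out;
    }

    // Convenience overload for in-memory text.
    static List<String> ngrams(String text, int n) {
        try {
            return ngrams(new StringReader(text), n);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(ngrams("abcd", 2));
    }
}
```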

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Created: (LUCENE-1352) trailing escaped backslashes in quoted queries cause parse error

2008-08-07 Thread Michael Dodsworth (JIRA)
trailing escaped backslashes in quoted queries cause parse error


 Key: LUCENE-1352
 URL: https://issues.apache.org/jira/browse/LUCENE-1352
 Project: Lucene - Java
  Issue Type: Bug
  Components: QueryParser
Affects Versions: 2.3.2
 Environment: Ubuntu 7.04, Sun JVM 1.5.0_1
Reporter: Michael Dodsworth


The QueryParser fails to parse queries that contain escaped backslashes 
followed by a closing double-quote, then an opening double-quote (as part of 
another term).

For example, the query:

tagOrig:"testing\\" title:"titleTest"

will fail with the exception:

org.apache.lucene.queryParser.ParseException: Cannot parse 'tagOrig:"testing\\" 
title:"titleTest"': Lexical error at line 1, column 38.  Encountered: <EOF> 
after : 
    at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:155)
    at org.apache.solr.search.LuceneQParser.parse(LuceneQParserPlugin.java:79)

After digging around, I found that 'QueryParserTokenManager:jjMoveNfa_3' is 
generating - 'testing\\\ title:' as the token following the opening quote. It 
should be generating 'testing\\'; it appears to see the first double-quote as 
being escaped by the preceding slashes. 
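
To illustrate the expected behaviour, here is a small sketch of a scanner that consumes a backslash together with the character it escapes, so that a pair of backslashes never escapes the quote that follows. It is not the generated QueryParserTokenManager code; the class name is made up, and the sample query assumes the double-quoted terms described in the prose above.

```java
// Illustrative sketch only; not the JavaCC-generated token manager.
final class QuotedTermScanner {

    // Returns the index of the unescaped closing quote for the quoted term
    // that opens at index 'open', or -1 if the term is never closed.
    static int closingQuote(String s, int open) {
        for (int i = open + 1; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\') {
                i++; // skip the escaped character; "\\" is consumed as one pair
            } else if (c == '"') {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String query = "tagOrig:\"testing\\\\\" title:\"titleTest\"";
        int open = query.indexOf('"');
        int close = closingQuote(query, open);
        System.out.println(query.substring(open + 1, close));
    }
}
```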




[jira] Commented: (SOLR-281) Search Components (plugins)

2008-02-01 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12564739#action_12564739
 ] 

Michael Dodsworth commented on SOLR-281:


{quote} 
That would require instantiation with reflection I think. 
{quote} 

Reflection is already being used to create the QParserPlugins (SolrCore:1027 
and AbstractPluginLoader:83) - I'm guessing the reason for the plugin is just 
to avoid creating instances through reflection on every parse (as you could 
keep hold of the QParser class and call newInstance). The second point is moot, 
once you take away the need for createParser(...). 

It's really not that big-a-deal, in the scheme of things. 
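
For what it's worth, the two options being compared look roughly like this. The types are hypothetical stand-ins, not Solr's QParser or QParserPlugin, and the sketch ignores init arguments entirely.

```java
import java.util.function.Supplier;

// Hypothetical types for illustration only; none of these are Solr classes.
final class ParserCreationSketch {

    interface Parser {
        String parse(String query);
    }

    public static final class EchoParser implements Parser {
        public EchoParser() {}

        @Override
        public String parse(String query) {
            return "parsed:" + query;
        }
    }

    // Option 1: hold on to the Class and instantiate reflectively for every parse.
    static Parser viaReflection(Class<? extends Parser> type) {
        try {
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot instantiate " + type.getName(), e);
        }
    }

    // Option 2: build a factory once; each parse then uses a plain constructor call.
    static Parser viaFactory(Supplier<? extends Parser> factory) {
        return factory.get();
    }

    public static void main(String[] args) {
        System.out.println(viaReflection(EchoParser.class).parse("title:solr"));
        System.out.println(viaFactory(EchoParser::new).parse("title:solr"));
    }
}
```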

{quote} 
QParserPlugin is that interface essentially (except that it's a class instead 
of an interface). For library maintainers an abstract class is preferred over 
an interface for things that a user will extend... that way signature changes 
can be made in a backward compatible manner. 
{quote} 

As an aside, method signature changes are usually trivial to fix; personally, 
I find the pain of those fixes preferable to extending an abstract class 
unnecessarily. 
Are there any architectural reworking projects on the roadmap? I'm sure 
backward compatibility is a massive concern; perhaps with the more modular 
plugin design route Solr is going down, those concerns can be addressed. If 
there's a chance of being accepted, I would love to contribute a move towards 
using Spring. 



 Search Components (plugins)
 ---

 Key: SOLR-281
 URL: https://issues.apache.org/jira/browse/SOLR-281
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.3
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-281-ComponentInit.patch, 
 SOLR-281-ComponentInit.patch, SOLR-281-SearchComponents.patch, 
 SOLR-281-SearchComponents.patch, SOLR-281-SearchComponents.patch, 
 SOLR-281-SearchComponents.patch, SOLR-281-SearchComponents.patch, 
 SOLR-281-SearchComponents.patch, SOLR-281-SearchComponents.patch, 
 SOLR-281-SearchComponents.patch, SOLR-281-SearchComponents.patch, 
 solr-281.patch, solr-281.patch, solr-281.patch


 A request handler with pluggable search components for things like:
   - standard
   - dismax
   - more-like-this
   - highlighting
   - field collapsing 
 For more discussion, see:
 http://www.nabble.com/search-components-%28plugins%29-tf3898040.html#a11050274




[jira] Commented: (SOLR-281) Search Components (plugins)

2008-01-31 Thread Michael Dodsworth (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12564449#action_12564449
 ] 

Michael Dodsworth commented on SOLR-281:


This is great; decomposing the handler and allowing the components to be wired 
up in the config really helps development (and maintenance of those changes). 

For my purposes, I needed to make a change to the way the dismax query was 
being generated. Using the DisMaxQParserPlugin as a template, I created my own 
QParser and associated QParserPlugin; changed the relevant bits; added a 
<queryParser...> entry in solrconfig.xml; added the 'defType' parameter to the 
wanted SearchHandler configuration...and...all works well. 
  
Just a few comments: 

* I had to make the QParser parse() method public, as the new query parser may 
still need to use the existing query parsers (backup lucene parser, boost 
parser, function parser, etc). 
* The QParserPlugin class seems unnecessary: all it does is implement init() 
and add a createParser method. Why not just have the parser constructor take 
those arguments...or, if that can't be done, create an interface that allows the 
parser itself to implement both init() and createParser() (or create())? That 
avoids having to create 2 classes (in the case of DisMax, in the same 
file...which is not pretty).
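
A rough sketch of the single-interface alternative from the second point, purely to show the shape of the idea. The interface and class names are hypothetical and the argument types are simplified; this is not how Solr's QParser or QParserPlugin are declared.

```java
import java.util.Map;

// Hypothetical sketch; not part of Solr.
final class SelfCreatingParserSketch {

    // One interface covering both roles: configuration and per-query creation.
    interface ConfigurableParser {
        void init(Map<String, String> args);

        ConfigurableParser create(String queryString);

        String parse();
    }

    static final class PrefixParser implements ConfigurableParser {
        private String prefix = "";
        private final String query;

        // The configured prototype instance carries no query.
        PrefixParser() {
            this(null);
        }

        private PrefixParser(String query) {
            this.query = query;
        }

        @Override
        public void init(Map<String, String> args) {
            prefix = args.getOrDefault("prefix", "");
        }

        // Creates a per-query instance that inherits the prototype's configuration.
        @Override
        public ConfigurableParser create(String queryString) {
            PrefixParser parser = new PrefixParser(queryString);
            parser.prefix = this.prefix;
            return parser;
        }

        @Override
        public String parse() {
            if (query == null) {
                throw new IllegalStateException("prototype instance has no query");
            }
            return prefix + query;
        }
    }
}
```

The trade-off is that one class now plays both prototype and per-query roles, which is the price of not having a separate plugin class.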

 Search Components (plugins)
 ---

 Key: SOLR-281
 URL: https://issues.apache.org/jira/browse/SOLR-281
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.3
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-281-ComponentInit.patch, 
 SOLR-281-ComponentInit.patch, SOLR-281-SearchComponents.patch, 
 SOLR-281-SearchComponents.patch, SOLR-281-SearchComponents.patch, 
 SOLR-281-SearchComponents.patch, SOLR-281-SearchComponents.patch, 
 SOLR-281-SearchComponents.patch, SOLR-281-SearchComponents.patch, 
 SOLR-281-SearchComponents.patch, SOLR-281-SearchComponents.patch, 
 solr-281.patch, solr-281.patch, solr-281.patch


 A request handler with pluggable search components for things like:
   - standard
   - dismax
   - more-like-this
   - highlighting
   - field collapsing 
 For more discussion, see:
 http://www.nabble.com/search-components-%28plugins%29-tf3898040.html#a11050274
