[jira] [Created] (LUCENE-4580) Facet DrillDown should return a Filter not Query

2012-11-30 Thread Shai Erera (JIRA)
Shai Erera created LUCENE-4580:
--

 Summary: Facet DrillDown should return a Filter not Query
 Key: LUCENE-4580
 URL: https://issues.apache.org/jira/browse/LUCENE-4580
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Shai Erera
Priority: Minor


DrillDown is a helper class the user can use to convert a selected facet value 
into a Query for performing drill-down, i.e. narrowing the results. The API has 
several static methods that create e.g. a Term or a Query.

Rather than creating a Query, I think it would make more sense to create a 
Filter. In most cases, the clicked facets should not affect the scoring of 
documents. Even if it turns out that it must return a Query (which I doubt), we 
should at least modify the implementation to return a ConstantScoreQuery.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-1487) Add expungeDelete to SolrJ's SolrServer.commit(..)

2012-11-30 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507839#comment-13507839
 ] 

Shawn Heisey commented on SOLR-1487:


As long as we're talking about enhancing SolrJ, expungeDeletesPctAllowed 
(SOLR-2725) should be exposed too.

> Add  expungeDelete to SolrJ's SolrServer.commit(..)
> ---
>
> Key: SOLR-1487
> URL: https://issues.apache.org/jira/browse/SOLR-1487
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java
>Affects Versions: 1.3
> Environment: N/A
>Reporter: Jibo John
> Attachments: expunge-patch.txt
>
>
> Add expungeDelete to SolrJ's SolrServer.commit(..).
> Currently, this can be done only through the update handler ( curl update -F 
> stream.body=' ' )
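
The update-handler route the description alludes to is just a commit request 
with the expungeDeletes flag set ("commit" and "expungeDeletes" are real update 
handler parameters; the base URL below is a placeholder). A minimal, stdlib-only 
sketch of the request SolrJ would need to send:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ExpungeCommit {
    // Build the query string for a commit that also expunges deletes.
    static String commitUrl(String baseUrl, boolean expungeDeletes) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("commit", "true");
        if (expungeDeletes) {
            params.put("expungeDeletes", "true");
        }
        StringBuilder sb = new StringBuilder(baseUrl).append("/update?");
        boolean first = true;
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (!first) sb.append('&');
            sb.append(e.getKey()).append('=').append(e.getValue());
            first = false;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints http://localhost:8983/solr/update?commit=true&expungeDeletes=true
        System.out.println(commitUrl("http://localhost:8983/solr", true));
    }
}
```

Exposing this in SolrJ would amount to adding the same parameter to the request 
that commit(..) already builds.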




[jira] [Commented] (SOLR-2725) TieredMergePolicy and expungeDeletes behaviour

2012-11-30 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507838#comment-13507838
 ] 

Shawn Heisey commented on SOLR-2725:


It occurred to me that, in addition to allowing solrconfig.xml to set this 
value, it should also be available via SolrJ, which brings up SOLR-1487.


> TieredMergePolicy and expungeDeletes behaviour
> --
>
> Key: SOLR-2725
> URL: https://issues.apache.org/jira/browse/SOLR-2725
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 3.3, 3.4, 3.5, 3.6
>Reporter: Martijn van Groningen
>
> While executing a commit with expungeDeletes, I noticed there were still a 
> lot of segments left: ~30 segments with deletes remained after the commit 
> finished.
> After looking in the SolrIndexConfig class, I noticed that 
> TieredMergePolicy#setExpungeDeletesPctAllowed isn't invoked.
> I think the following statement in the SolrIndexConfig#buildMergePolicy 
> method will purge all deletes:
> {code}
> tieredMergePolicy.setExpungeDeletesPctAllowed(0);
> {code} 
> This also reflects the behavior of Solr 3.1 / 3.2.
> After some discussion on IRC, it seems that always setting 
> expungeDeletesPctAllowed to zero isn't best for performance:
> http://colabti.org/irclogger/irclogger_log/lucene-dev?date=2011-08-20#l120
> I think we should add an option to solrconfig.xml that allows users to set 
> it to whatever value is best for them:
> {code:xml}
> 0
> {code}
> Also, having an expungeDeletesPctAllowed per commit command would be great:
> {code:xml}
> <commit expungeDeletesPctAllowed="0"/>
> {code}




[JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.8.0-ea-b58) - Build # 2989 - Failure!

2012-11-30 Thread Policeman Jenkins Server
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Linux/2989/
Java: 64bit/jdk1.8.0-ea-b58 -XX:+UseParallelGC

All tests passed

Build Log:
[...truncated 9246 lines...]
[junit4:junit4] ERROR: JVM J0 ended with an exception, command line: 
/mnt/ssd/jenkins/tools/java/64bit/jdk1.8.0-ea-b58/jre/bin/java 
-XX:+UseParallelGC -XX:+HeapDumpOnOutOfMemoryError 
-XX:HeapDumpPath=/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/heapdumps 
-Dtests.prefix=tests -Dtests.seed=75D30FC657780547 -Xmx512M -Dtests.iters= 
-Dtests.verbose=false -Dtests.infostream=false 
-Dtests.lockdir=/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build 
-Dtests.codec=random -Dtests.postingsformat=random -Dtests.locale=random 
-Dtests.timezone=random -Dtests.directory=random 
-Dtests.linedocsfile=europarl.lines.txt.gz -Dtests.luceneMatchVersion=5.0 
-Dtests.cleanthreads=perClass 
-Djava.util.logging.config.file=/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/testlogging.properties
 -Dtests.nightly=false -Dtests.weekly=false -Dtests.slow=true 
-Dtests.asserts.gracious=false -Dtests.multiplier=3 -DtempDir=. 
-Djava.io.tmpdir=. 
-Dtests.sandbox.dir=/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/build/solr-core
 
-Dclover.db.dir=/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build/clover/db
 -Djava.security.manager=org.apache.lucene.util.TestSecurityManager 
-Djava.security.policy=/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/tools/junit4/tests.policy
 -Dlucene.version=5.0-SNAPSHOT -Djetty.testMode=1 -Djetty.insecurerandom=1 
-Dsolr.directoryFactory=org.apache.solr.core.MockDirectoryFactory 
-Djava.awt.headless=true -Dfile.encoding=UTF-8 -classpath 
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/build/solr-core/classes/test:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/build/solr-test-framework/classes/java:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/build/solr-core/test-files:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build/test-framework/classes/java:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build/codecs/classes/java:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/build/solr-solrj/classes/java:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/build/solr-core/classes/java:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build/analysis/common/lucene-analyzers-common-5.0-SNAPSHOT.jar:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build/analysis/kuromoji/lucene-analyzers-kuromoji-5.0-SNAPSHOT.jar:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build/analysis/phonetic/lucene-analyzers-phonetic-5.0-SNAPSHOT.jar:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build/highlighter/lucene-highlighter-5.0-SNAPSHOT.jar:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build/memory/lucene-memory-5.0-SNAPSHOT.jar:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build/misc/lucene-misc-5.0-SNAPSHOT.jar:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build/spatial/lucene-spatial-5.0-SNAPSHOT.jar:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build/suggest/lucene-suggest-5.0-SNAPSHOT.jar:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build/grouping/lucene-grouping-5.0-SNAPSHOT.jar:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build/queries/lucene-queries-5.0-SNAPSHOT.jar:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build/queryparser/lucene-queryparser-5.0-SNAPSHOT.jar:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/core/lib/commons-cli-1.2.jar:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/core/lib/commons-codec-1.7.jar:/mnt/
ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/core/lib/commons-fileupload-1.2.1.jar:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/core/lib/commons-lang-2.6.jar:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/core/lib/easymock-2.2.jar:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/core/lib/guava-13.0.1.jar:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/core/lib/javax.servlet-api-3.0.1.jar:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/core/lib/metrics-core-2.1.2.jar:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/core/lib/spatial4j-0.3.jar:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/solrj/lib/commons-io-2.1.jar:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/solrj/lib/httpclient-4.1.3.jar:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/solrj/lib/httpcore-4.1.4.jar:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/solrj/lib/httpmime-4.1.3.jar:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/solrj/lib/jcl-over-slf4j-1.6.4.jar:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/solrj/lib/log4j-over-slf4j-1.6.4.jar:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/solrj/lib/slf4j-api-1.6.4.jar:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/solrj/lib/slf4j-jdk14-1.6

[jira] [Updated] (SOLR-4133) Cannot "set" field to null with partial updates when using the standard RequestWriter.

2012-11-30 Thread Will Butler (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Will Butler updated SOLR-4133:
--

Description: 
I would like to "unset" a field using partial updates like so:
\\
\\
{code}
doc.setField(field, singletonMap("set", null));
{code}

When I attempt to add this document using the standard XML-based RequestWriter, 
this update is ignored. It works properly when using the BinaryRequestWriter.

  was:
I would like to "unset" a field using partial updates like so:

{{doc.setField(field, singletonMap("set", null));}}

When I attempt to add this document using the standard XML-based RequestWriter, 
this update is ignored. It works properly when using the BinaryRequestWriter.


> Cannot "set" field to null with partial updates when using the standard 
> RequestWriter.
> --
>
> Key: SOLR-4133
> URL: https://issues.apache.org/jira/browse/SOLR-4133
> Project: Solr
>  Issue Type: Bug
>  Components: clients - java, update
>Affects Versions: 4.0
>Reporter: Will Butler
>Priority: Minor
>
> I would like to "unset" a field using partial updates like so:
> \\
> \\
> {code}
> doc.setField(field, singletonMap("set", null));
> {code}
> When I attempt to add this document using the standard XML-based 
> RequestWriter, this update is ignored. It works properly when using the 
> BinaryRequestWriter.
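
The suspected failure mode is easy to see with plain collections: 
Collections.singletonMap happily holds a null value, so the "set"-to-null 
operation is representable, yet any serializer that skips null map values 
silently drops it. The skipNulls method below is a hypothetical stand-in for 
the suspected XML-writer behavior, not Solr code:

```java
import java.util.Collections;
import java.util.Map;

public class NullSetDemo {
    // Hypothetical serializer that skips null-valued entries, mimicking the
    // suspected XML RequestWriter behavior (illustration only, not Solr code).
    static String skipNulls(Map<String, Object> op) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Object> e : op.entrySet()) {
            if (e.getValue() != null) {
                sb.append(e.getKey()).append('=').append(e.getValue());
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The "unset" operation is representable as a null-valued map...
        Map<String, Object> unset = Collections.<String, Object>singletonMap("set", null);
        System.out.println(unset.containsKey("set"));   // true
        // ...but a null-skipping writer emits nothing, so the unset is lost.
        System.out.println(skipNulls(unset).isEmpty()); // true
    }
}
```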




[jira] [Updated] (SOLR-4134) Cannot "set" multiple values into multivalued field with partial updates when using the standard RequestWriter.

2012-11-30 Thread Will Butler (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Will Butler updated SOLR-4134:
--

Description: 
I would like to "set" multiple values into a field using partial updates like 
so:
\\
\\
{code}
List<String> values = new ArrayList<String>();
values.add("one");
values.add("two");
values.add("three");
doc.setField(field, singletonMap("set", values));
{code}

When using the standard XML-based RequestWriter, you end up with a single value 
that looks like [one, two, three], because of the toString() calls on lines 130 
and 132 of ClientUtils. It works properly when using the BinaryRequestWriter.

  was:
I would like to "set" multiple values into a field using partial updates like 
so:
\\
{code}
List<String> values = new ArrayList<String>();
values.add("one");
values.add("two");
values.add("three");
doc.setField(field, singletonMap("set", values));
{code}

When using the standard XML-based RequestWriter, you end up with a single value 
that looks like [one, two, three], because of the toString() calls on lines 130 
and 132 of ClientUtils. It works properly when using the BinaryRequestWriter.


> Cannot "set" multiple values into multivalued field with partial updates when 
> using the standard RequestWriter.
> ---
>
> Key: SOLR-4134
> URL: https://issues.apache.org/jira/browse/SOLR-4134
> Project: Solr
>  Issue Type: Bug
>  Components: clients - java, update
>Affects Versions: 4.0
>Reporter: Will Butler
>Priority: Minor
>
> I would like to "set" multiple values into a field using partial updates like 
> so:
> \\
> \\
> {code}
> List<String> values = new ArrayList<String>();
> values.add("one");
> values.add("two");
> values.add("three");
> doc.setField(field, singletonMap("set", values));
> {code}
> When using the standard XML-based RequestWriter, you end up with a single 
> value that looks like [one, two, three], because of the toString() calls on 
> lines 130 and 132 of ClientUtils. It works properly when using the 
> BinaryRequestWriter.
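
The flattening described in this issue is just List.toString() at work: 
serializing the operation's value via toString() collapses the list into the 
single string "[one, two, three]". A stdlib-only demonstration of the effect 
(serializeViaToString is an illustration, not the actual ClientUtils code):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class ToStringFlattening {
    // Serialize the "set" operation's value via toString(), the way the
    // XML path is described as doing (illustration only, not ClientUtils).
    static String serializeViaToString(Map<String, Object> op) {
        return String.valueOf(op.get("set"));
    }

    public static void main(String[] args) {
        List<String> values = new ArrayList<String>();
        values.add("one");
        values.add("two");
        values.add("three");
        Map<String, Object> op =
            Collections.<String, Object>singletonMap("set", values);

        // toString() collapses the three values into one string:
        System.out.println(serializeViaToString(op));  // [one, two, three]
    }
}
```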




[jira] [Updated] (SOLR-4134) Cannot "set" multiple values into multivalued field with partial updates when using the standard RequestWriter.

2012-11-30 Thread Will Butler (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Will Butler updated SOLR-4134:
--

Description: 
I would like to "set" multiple values into a field using partial updates like 
so:
\\
{code}
List<String> values = new ArrayList<String>();
values.add("one");
values.add("two");
values.add("three");
doc.setField(field, singletonMap("set", values));
{code}

When using the standard XML-based RequestWriter, you end up with a single value 
that looks like [one, two, three], because of the toString() calls on lines 130 
and 132 of ClientUtils. It works properly when using the BinaryRequestWriter.

  was:
I would like to "set" multiple values into a field using partial updates like 
so:


{code}
List<String> values = new ArrayList<String>();
values.add("one");
values.add("two");
values.add("three");
doc.setField(field, singletonMap("set", values));
{code}

When using the standard XML-based RequestWriter, you end up with a single value 
that looks like [one, two, three], because of the toString() calls on lines 130 
and 132 of ClientUtils. It works properly when using the BinaryRequestWriter.


> Cannot "set" multiple values into multivalued field with partial updates when 
> using the standard RequestWriter.
> ---
>
> Key: SOLR-4134
> URL: https://issues.apache.org/jira/browse/SOLR-4134
> Project: Solr
>  Issue Type: Bug
>  Components: clients - java, update
>Affects Versions: 4.0
>Reporter: Will Butler
>Priority: Minor
>
> I would like to "set" multiple values into a field using partial updates like 
> so:
> \\
> {code}
> List<String> values = new ArrayList<String>();
> values.add("one");
> values.add("two");
> values.add("three");
> doc.setField(field, singletonMap("set", values));
> {code}
> When using the standard XML-based RequestWriter, you end up with a single 
> value that looks like [one, two, three], because of the toString() calls on 
> lines 130 and 132 of ClientUtils. It works properly when using the 
> BinaryRequestWriter.




[jira] [Updated] (SOLR-4134) Cannot "set" multiple values into multivalued field with partial updates when using the standard RequestWriter.

2012-11-30 Thread Will Butler (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Will Butler updated SOLR-4134:
--

Description: 
I would like to "set" multiple values into a field using partial updates like 
so:


{code}
List<String> values = new ArrayList<String>();
values.add("one");
values.add("two");
values.add("three");
doc.setField(field, singletonMap("set", values));
{code}

When using the standard XML-based RequestWriter, you end up with a single value 
that looks like [one, two, three], because of the toString() calls on lines 130 
and 132 of ClientUtils. It works properly when using the BinaryRequestWriter.

  was:
I would like to "set" multiple values into a field using partial updates like 
so:

{code}
List<String> values = new ArrayList<String>();
values.add("one");
values.add("two");
values.add("three");
doc.setField(field, singletonMap("set", values));
{code}

When using the standard XML-based RequestWriter, you end up with a single value 
that looks like [one, two, three], because of the toString() calls on lines 130 
and 132 of ClientUtils. It works properly when using the BinaryRequestWriter.


> Cannot "set" multiple values into multivalued field with partial updates when 
> using the standard RequestWriter.
> ---
>
> Key: SOLR-4134
> URL: https://issues.apache.org/jira/browse/SOLR-4134
> Project: Solr
>  Issue Type: Bug
>  Components: clients - java, update
>Affects Versions: 4.0
>Reporter: Will Butler
>Priority: Minor
>
> I would like to "set" multiple values into a field using partial updates like 
> so:
> {code}
> List<String> values = new ArrayList<String>();
> values.add("one");
> values.add("two");
> values.add("three");
> doc.setField(field, singletonMap("set", values));
> {code}
> When using the standard XML-based RequestWriter, you end up with a single 
> value that looks like [one, two, three], because of the toString() calls on 
> lines 130 and 132 of ClientUtils. It works properly when using the 
> BinaryRequestWriter.




[jira] [Updated] (SOLR-4134) Cannot "set" multiple values into multivalued field with partial updates when using the standard RequestWriter.

2012-11-30 Thread Will Butler (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Will Butler updated SOLR-4134:
--

Description: 
I would like to "set" multiple values into a field using partial updates like 
so:

{code}
List<String> values = new ArrayList<String>();
values.add("one");
values.add("two");
values.add("three");
doc.setField(field, singletonMap("set", values));
{code}

When using the standard XML-based RequestWriter, you end up with a single value 
that looks like [one, two, three], because of the toString() calls on lines 130 
and 132 of ClientUtils. It works properly when using the BinaryRequestWriter.

  was:
I would like to "set" multiple values into a field using partial updates like 
so:

{{
List<String> values = new ArrayList<String>();
values.add("one");
values.add("two");
values.add("three");
doc.setField(field, singletonMap("set", values));
}}

When using the standard XML-based RequestWriter, you end up with a single value 
that looks like [one, two, three], because of the toString() calls on lines 130 
and 132 of ClientUtils. It works properly when using the BinaryRequestWriter.


> Cannot "set" multiple values into multivalued field with partial updates when 
> using the standard RequestWriter.
> ---
>
> Key: SOLR-4134
> URL: https://issues.apache.org/jira/browse/SOLR-4134
> Project: Solr
>  Issue Type: Bug
>  Components: clients - java, update
>Affects Versions: 4.0
>Reporter: Will Butler
>Priority: Minor
>
> I would like to "set" multiple values into a field using partial updates like 
> so:
> {code}
> List<String> values = new ArrayList<String>();
> values.add("one");
> values.add("two");
> values.add("three");
> doc.setField(field, singletonMap("set", values));
> {code}
> When using the standard XML-based RequestWriter, you end up with a single 
> value that looks like [one, two, three], because of the toString() calls on 
> lines 130 and 132 of ClientUtils. It works properly when using the 
> BinaryRequestWriter.




[jira] [Created] (SOLR-4134) Cannot "set" multiple values into multivalued field with partial updates when using the standard RequestWriter.

2012-11-30 Thread Will Butler (JIRA)
Will Butler created SOLR-4134:
-

 Summary: Cannot "set" multiple values into multivalued field with 
partial updates when using the standard RequestWriter.
 Key: SOLR-4134
 URL: https://issues.apache.org/jira/browse/SOLR-4134
 Project: Solr
  Issue Type: Bug
  Components: clients - java, update
Affects Versions: 4.0
Reporter: Will Butler
Priority: Minor


I would like to "set" multiple values into a field using partial updates like 
so:

{{
List<String> values = new ArrayList<String>();
values.add("one");
values.add("two");
values.add("three");
doc.setField(field, singletonMap("set", values));
}}

When using the standard XML-based RequestWriter, you end up with a single value 
that looks like [one, two, three], because of the toString() calls on lines 130 
and 132 of ClientUtils. It works properly when using the BinaryRequestWriter.




[jira] [Updated] (SOLR-4133) Cannot "set" field to null with partial updates when using the standard RequestWriter.

2012-11-30 Thread Will Butler (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Will Butler updated SOLR-4133:
--

Summary: Cannot "set" field to null with partial updates when using the 
standard RequestWriter.  (was: Solrj: Cannot "set" field to null with partial 
updates when using standard RequestWriter.)

> Cannot "set" field to null with partial updates when using the standard 
> RequestWriter.
> --
>
> Key: SOLR-4133
> URL: https://issues.apache.org/jira/browse/SOLR-4133
> Project: Solr
>  Issue Type: Bug
>  Components: clients - java, update
>Affects Versions: 4.0
>Reporter: Will Butler
>Priority: Minor
>
> I would like to "unset" a field using partial updates like so:
> {{doc.setField(field, singletonMap("set", null));}}
> When I attempt to add this document using the standard XML-based 
> RequestWriter, this update is ignored. It works properly when using the 
> BinaryRequestWriter.




[jira] [Updated] (SOLR-4133) Solrj: Cannot "set" field to null with partial updates when using standard RequestWriter.

2012-11-30 Thread Will Butler (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Will Butler updated SOLR-4133:
--

Component/s: update

> Solrj: Cannot "set" field to null with partial updates when using standard 
> RequestWriter.
> -
>
> Key: SOLR-4133
> URL: https://issues.apache.org/jira/browse/SOLR-4133
> Project: Solr
>  Issue Type: Bug
>  Components: clients - java, update
>Affects Versions: 4.0
>Reporter: Will Butler
>Priority: Minor
>
> I would like to "unset" a field using partial updates like so:
> {{doc.setField(field, singletonMap("set", null));}}
> When I attempt to add this document using the standard XML-based 
> RequestWriter, this update is ignored. It works properly when using the 
> BinaryRequestWriter.




[jira] [Updated] (LUCENE-4579) MultiTerm disjunction query implementation

2012-11-30 Thread John Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Wang updated LUCENE-4579:
--

Attachment: multitermdisjunctionquery.diff

> MultiTerm disjunction query implementation
> --
>
> Key: LUCENE-4579
> URL: https://issues.apache.org/jira/browse/LUCENE-4579
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Affects Versions: 4.0
>Reporter: John Wang
>  Labels: patch
> Attachments: multitermdisjunctionquery.diff
>
>
> This is a MultiTermQuery implementation for supporting disjunction of terms 
> in the same field.
> This should behave the same as a BooleanQuery disjunction over the same field.
> This is related to: LUCENE-




[jira] [Created] (LUCENE-4579) MultiTerm disjunction query implementation

2012-11-30 Thread John Wang (JIRA)
John Wang created LUCENE-4579:
-

 Summary: MultiTerm disjunction query implementation
 Key: LUCENE-4579
 URL: https://issues.apache.org/jira/browse/LUCENE-4579
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Affects Versions: 4.0
Reporter: John Wang


This is a MultiTermQuery implementation for supporting disjunction of terms in 
the same field.

This should behave the same as a BooleanQuery disjunction over the same field.

This is related to: LUCENE-




[jira] [Updated] (SOLR-4133) Solrj: Cannot "set" field to null with partial updates when using standard RequestWriter.

2012-11-30 Thread Will Butler (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Will Butler updated SOLR-4133:
--

Description: 
I would like to "unset" a field using partial updates like so:

{{doc.setField(field, singletonMap("set", null));}}

When I attempt to add this document using the standard XML-based RequestWriter, 
this update is ignored. It works properly when using the BinaryRequestWriter.

  was:
I would like to "unset" a field using partial updates like so:

doc.setField(field, singletonMap("set", null));

When I attempt to add this document using the standard XML-based RequestWriter, 
this update is ignored. It works properly when using the BinaryRequestWriter.


> Solrj: Cannot "set" field to null with partial updates when using standard 
> RequestWriter.
> -
>
> Key: SOLR-4133
> URL: https://issues.apache.org/jira/browse/SOLR-4133
> Project: Solr
>  Issue Type: Bug
>  Components: clients - java
>Affects Versions: 4.0
>Reporter: Will Butler
>Priority: Minor
>
> I would like to "unset" a field using partial updates like so:
> {{doc.setField(field, singletonMap("set", null));}}
> When I attempt to add this document using the standard XML-based 
> RequestWriter, this update is ignored. It works properly when using the 
> BinaryRequestWriter.




[jira] [Updated] (SOLR-4133) Solrj: Cannot "set" field to null with partial updates when using standard RequestWriter.

2012-11-30 Thread William Butler (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

William Butler updated SOLR-4133:
-

Summary: Solrj: Cannot "set" field to null with partial updates when using 
standard RequestWriter.  (was: SolrJ - Cannot "set" field to null with partial 
updates when using standard RequestWriter.)

> Solrj: Cannot "set" field to null with partial updates when using standard 
> RequestWriter.
> -
>
> Key: SOLR-4133
> URL: https://issues.apache.org/jira/browse/SOLR-4133
> Project: Solr
>  Issue Type: Bug
>  Components: clients - java
>Affects Versions: 4.0
>Reporter: William Butler
>Priority: Minor
>
> I would like to "unset" a field using partial updates like so:
> doc.setField(field, singletonMap("set", null));
> When I attempt to add this document using the standard XML-based 
> RequestWriter, this update is ignored. It works properly when using the 
> BinaryRequestWriter.




[jira] [Created] (SOLR-4133) SolrJ - Cannot "set" field to null with partial updates when using standard RequestWriter.

2012-11-30 Thread William Butler (JIRA)
William Butler created SOLR-4133:


 Summary: SolrJ - Cannot "set" field to null with partial updates 
when using standard RequestWriter.
 Key: SOLR-4133
 URL: https://issues.apache.org/jira/browse/SOLR-4133
 Project: Solr
  Issue Type: Bug
  Components: clients - java
Affects Versions: 4.0
Reporter: William Butler
Priority: Minor


I would like to "unset" a field using partial updates like so:

doc.setField(field, singletonMap("set", null));

When I attempt to add this document using the standard XML-based RequestWriter, 
this update is ignored. It works properly when using the BinaryRequestWriter.




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-30 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507759#comment-13507759
 ] 

Mark Miller commented on SOLR-4114:
---

I started working on patching this into recent stuff, and it's more of a pain 
than I thought. I must have missed some piece as I tried to merge it up and the 
test is failing. Giving up for tonight.

> Collection API: Allow multiple shards from one collection on the same Solr 
> server
> -
>
> Key: SOLR-4114
> URL: https://issues.apache.org/jira/browse/SOLR-4114
> Project: Solr
>  Issue Type: New Feature
>  Components: multicore, SolrCloud
>Affects Versions: 4.0
> Environment: Solr 4.0.0 release
>Reporter: Per Steffensen
>Assignee: Per Steffensen
>  Labels: collection-api, multicore, shard, shard-allocation
> Attachments: SOLR-4114.patch, SOLR-4114.patch
>
>
> We should support running multiple shards from one collection on the same 
> Solr server - e.g. run a collection with 8 shards on a 4 Solr server cluster 
> (each Solr server running 2 shards).
> Performance tests on our side have shown that this is a good idea, and it is 
> also a good idea for easy elasticity later on - it is much easier to move an 
> entire existing shard from one Solr server to another one that just joined 
> the cluster than it is to split an existing shard among the Solr servers that 
> used to run it and the new one.
> See dev mailing list discussion "Multiple shards for one collection on the 
> same Solr server"




Re: Source Control

2012-11-30 Thread Mark Miller
Thinking out loud...

We are currently notified of pull requests - if we started using git,
I would have two upstream repos - the apache repo and the github
mirror.

Rather than use the one click pull github offers (I'd never use that
anyway), I'd just add the guy as an upstream and merge in his change.

Then I would commit and be done. He can make all the branches he wants
and share them with whoever he wants.

It's easy to pull changes from anywhere with git. I don't use any of
the other github features, other than it's a central repo, and it
provides a nice little way to request a pull.

We have the pull requests anyway - if we document that was how you
should contribute, we'd have a lot more of them.

The problem is, I don't trust committing a patch I pull from git into
svn. I still have to double-check that renames or any refactoring are
done right for svn, I have to worry about any binary files, and I have
to run all the tests again. That keeps me from doing it anymore.

For me that is the win - taking that out. I don't see a lot of other
github benefits. The main benefit is that it provides nice hosting of
code for any potential contributor that we can easily pull from. I see
the benefits simply in a distributed version control system - and git
is clearly the most popular one.

Even now, as I check out a revision in svn to apply an old patch, I'm
amazed at how long I have to wait when it would be done in no time
with git.

- Mark

On Tue, Nov 6, 2012 at 9:15 AM, Jack Krupansky  wrote:
> One issue is how to use git and github. One can certainly use it as if it
> were svn, but that misses a lot of the power of git, particularly the
> collaborative tools on github.
>
> For example, one approach is to create a branch for every Jira ticket and
> then instead of posting raw "patches" on the Jira ticket, create git "pull
> requests" from the branch, which make it easy to comment on individual file
> changes, right down to comments on individual lines of code. Changes can be
> "committed" and pushed to the branch as work continues and new pull requests
> generated. Eventually, pull requests can then be easily merged into the
> master, as desired. Users can selectively include pull requests as they see
> fit as well.
>
> But... can all of us, even non-committers do that? Or would the better
> features of github be available only to committers? I don't know enough
> about github to know whether you can have one class of user able to create
> branches or comment on them but not merge into master or tagged branches
> such as releases.
>
> -- Jack Krupansky
>
>
> -Original Message- From: Mark Miller
> Sent: Friday, October 26, 2012 7:02 PM
>
> To: dev@lucene.apache.org
> Subject: Source Control
>
> So, it's not everyone's favorite tool, but it sure seems to be the most
> popular tool.
>
> What are people's thoughts about moving to git?
>
> Distributed version control is where it's at :)
>
> I know some prefer mercurial, but git and github clearly are taking over the
> world.
>
> Also, the cmd line for git is a little eccentric - I use a GUI client called
> SmartGit. Some very clever Germans make it.
>
> A few Apache projects are already using git.
>
> I'd like to hear what people feel about this idea.
>
> - Mark



-- 
- Mark




[jira] [Commented] (SOLR-3583) Percentiles for facets, pivot facets, and distributed pivot facets

2012-11-30 Thread Chris Russell (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507718#comment-13507718
 ] 

Chris Russell commented on SOLR-3583:
-

Think I may have introduced a performance issue re:faceting, unit tests taking 
a lot longer to run.  Am investigating.

> Percentiles for facets, pivot facets, and distributed pivot facets
> --
>
> Key: SOLR-3583
> URL: https://issues.apache.org/jira/browse/SOLR-3583
> Project: Solr
>  Issue Type: Improvement
>Reporter: Chris Russell
>Priority: Minor
>  Labels: newbie, patch
> Fix For: 4.1
>
> Attachments: SOLR-3583.patch, SOLR-3583.patch
>
>
> Built on top of SOLR-2894, this patch adds percentiles and averages to 
> facets, pivot facets, and distributed pivot facets by making use of range 
> facet internals.  




[jira] [Commented] (SOLR-4074) We should default ramBufferSizeMB to around 100

2012-11-30 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507717#comment-13507717
 ] 

Commit Tag Bot commented on SOLR-4074:
--

[branch_4x commit] Mark Robert Miller
http://svn.apache.org/viewvc?view=revision&revision=1415874

SOLR-4074: Raise default ramBufferSizeMB to 100 from 32.



> We should default ramBufferSizeMB to around 100
> ---
>
> Key: SOLR-4074
> URL: https://issues.apache.org/jira/browse/SOLR-4074
> Project: Solr
>  Issue Type: Improvement
>Reporter: Mark Miller
>Assignee: Mark Miller
>Priority: Minor
> Fix For: 4.1, 5.0
>
>
> You get diminishing returns after 100, but gains up to 100. In this day and 
> age of RAM, it seems better to default to 100 and have essentially the top 
> indexing speed the buffer can give us.




[jira] [Resolved] (SOLR-4074) We should default ramBufferSizeMB to around 100

2012-11-30 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller resolved SOLR-4074.
---

Resolution: Fixed

I've also changed the example config so that the commented out default setting 
matches and I've added a changes entry.
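
For context, the setting in question lives under indexConfig in solrconfig.xml. A sketch of the relevant fragment (with 100 now the built-in default, the element can stay commented out unless a different value is wanted):

```xml
<!-- Sketch of the relevant solrconfig.xml setting; with the new default of
     100 MB this element only needs to be uncommented to override it. -->
<indexConfig>
  <ramBufferSizeMB>100</ramBufferSizeMB>
</indexConfig>
```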





[jira] [Commented] (SOLR-4074) We should default ramBufferSizeMB to around 100

2012-11-30 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507711#comment-13507711
 ] 

Commit Tag Bot commented on SOLR-4074:
--

[trunk commit] Mark Robert Miller
http://svn.apache.org/viewvc?view=revision&revision=1415873

SOLR-4074: Raise default ramBufferSizeMB to 100 from 32.







[jira] [Commented] (SOLR-3990) index size unavailable in gui/mbeans unless replication handler configured

2012-11-30 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507708#comment-13507708
 ] 

Shawn Heisey commented on SOLR-3990:


Is the system info handler accessed via the /admin/system URL?  If so, it 
doesn't seem to have the index size.

I would prefer to have it accessible via /admin/mbeans?stats=true so that I 
don't have to make more than one SolrJ call per core to gather stats, but you 
should do whatever makes the most sense for Solr, not for me.  I'll adapt to it 
either way.


> index size unavailable in gui/mbeans unless replication handler configured
> --
>
> Key: SOLR-3990
> URL: https://issues.apache.org/jira/browse/SOLR-3990
> Project: Solr
>  Issue Type: Improvement
>  Components: web gui
>Affects Versions: 4.0
>Reporter: Shawn Heisey
>Priority: Minor
> Fix For: 4.1, 5.0
>
>
> Unless you configure the replication handler, the on-disk size of each core's 
> index seems to be unavailable in the gui or from the mbeans handler.  If you 
> are not doing replication, you should still be able to get the size of each 
> index without configuring things that won't be used.
> Also, I would like to get the size of the index in a consistent unit of 
> measurement, probably MB.  I understand the desire to give people a human 
> readable unit next to a number that's not enormous, but it's difficult to do 
> programmatic comparisons between values such as 787.33 MB and 23.56 GB.  That 
> may mean that the number needs to be available twice, one format to be shown 
> in the admin GUI and both formats available from the mbeans handler, for 
> scripting.
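
The two-formats idea above is straightforward to support side by side; a minimal sketch of the kind of helper the stats output could use, where the class and method names are hypothetical:

```java
import java.util.Locale;

// Sketch: expose index size both as a raw, programmatically comparable number
// (megabytes) and as a human-readable string. Names here are hypothetical,
// not actual Solr API.
public class IndexSizeFormat {

    // Raw megabytes: easy to compare across cores in a script.
    public static double toMegabytes(long bytes) {
        return bytes / (1024.0 * 1024.0);
    }

    // Human-readable: pick the largest unit that keeps the number small.
    public static String humanReadable(long bytes) {
        String[] units = {"bytes", "KB", "MB", "GB", "TB"};
        double size = bytes;
        int unit = 0;
        while (size >= 1024 && unit < units.length - 1) {
            size /= 1024;
            unit++;
        }
        return String.format(Locale.ROOT, "%.2f %s", size, units[unit]);
    }

    public static void main(String[] args) {
        long bytes = 825549312L;                  // ~787 MB index
        System.out.println(toMegabytes(bytes));   // raw MB for scripting
        System.out.println(humanReadable(bytes)); // "787.31 MB"
    }
}
```

Emitting both values lets the admin GUI show the friendly form while scripts compare the raw one, which is exactly the duplication the description asks for.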




Re: composition of different queries based scores

2012-11-30 Thread Jack Krupansky
No, the wildcard or fuzzy query is part of the term and the boost is a separate 
modifier:

  (hello*^0.5 OR hello~^0.5)

Or...

  (hello* OR hello~)^0.5


-- Jack Krupansky

From: sri krishna 
Sent: Friday, November 30, 2012 12:21 AM
To: dev@lucene.apache.org 
Subject: Re: composition of different queries based scores

For boosting the term in the same example, is the following valid?

(hello^0.5* OR hello^0.5~)




On Tue, Nov 27, 2012 at 11:22 PM, Jack Krupansky  
wrote:

  The fuzzy option will be ignored here – you cannot combine fuzzy and wild on 
the same term, although you could do an OR of the two:

  (hello* OR hello~)

  -- Jack Krupansky

  From: sri krishna 
  Sent: Tuesday, November 27, 2012 11:08 AM
  To: dev@lucene.apache.org 
  Subject: composition of different queries based scores

  For a search string hello*~, how is the scoring calculated?

  The formula given in the url 
http://lucene.apache.org/core/old_versioned_docs/versions/3_0_1/api/core/org/apache/lucene/search/Similarity.html
 does not take the edit-distance and prefix-term factors into account.

  Does Lucene add up the scores obtained from each type of query included, i.e. 
for the above query actual score = default scoring + 1/(edit distance) + prefix 
match score? If so, there is no normalization between scores; otherwise, what 
approach does Lucene follow, from separating the query operators (~ for edit 
distance, * for prefix, etc.) to the actual scoring? 







[jira] [Commented] (SOLR-4115) WordBreakSpellChecker throws ArrayIndexOutOfBoundsException for random query string

2012-11-30 Thread James Dyer (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507707#comment-13507707
 ] 

James Dyer commented on SOLR-4115:
--

ahhh...I see.  Makes perfect sense.  Thanks, Steven for the explanation.

> WordBreakSpellChecker throws ArrayIndexOutOfBoundsException for random query 
> string
> ---
>
> Key: SOLR-4115
> URL: https://issues.apache.org/jira/browse/SOLR-4115
> Project: Solr
>  Issue Type: Bug
>  Components: spellchecker
>Affects Versions: 4.0
> Environment: java version "1.6.0_37"
> Java(TM) SE Runtime Environment (build 1.6.0_37-b06)
> Java HotSpot(TM) 64-Bit Server VM (build 20.12-b01, mixed mode)
>Reporter: Andreas Hubold
>
> The following SolrJ test code causes an ArrayIndexOutOfBoundsException in the 
> WordBreakSpellChecker. I tested this with the Solr 4.0.0 example webapp 
> started with {{java -jar start.jar}}.
> {code:java}
>   @Test
>   public void testWordbreakSpellchecker() throws Exception {
> SolrQuery q = new SolrQuery("\uD864\uDC79");
> q.setRequestHandler("/browse");
> q.setParam("spellcheck.dictionary", "wordbreak");
> HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr");
> server.query(q, SolrRequest.METHOD.POST);
>   }
> {code}
> {noformat}
> INFO: [collection1] webapp=/solr path=/browse 
> params={spellcheck.dictionary=wordbreak&qt=/browse&wt=javabin&q=?&version=2} 
> hits=0 status=500 QTime=11 
> Nov 28, 2012 11:23:01 AM org.apache.solr.common.SolrException log
> SEVERE: null:java.lang.ArrayIndexOutOfBoundsException: 1
>   at org.apache.lucene.util.UnicodeUtil.UTF8toUTF16(UnicodeUtil.java:599)
>   at org.apache.lucene.util.BytesRef.utf8ToString(BytesRef.java:165)
>   at org.apache.lucene.index.Term.text(Term.java:72)
>   at 
> org.apache.lucene.search.spell.WordBreakSpellChecker.generateSuggestWord(WordBreakSpellChecker.java:350)
>   at 
> org.apache.lucene.search.spell.WordBreakSpellChecker.generateBreakUpSuggestions(WordBreakSpellChecker.java:283)
>   at 
> org.apache.lucene.search.spell.WordBreakSpellChecker.suggestWordBreaks(WordBreakSpellChecker.java:122)
>   at 
> org.apache.solr.spelling.WordBreakSolrSpellChecker.getSuggestions(WordBreakSolrSpellChecker.java:229)
>   at 
> org.apache.solr.handler.component.SpellCheckComponent.process(SpellCheckComponent.java:172)
>   at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:206)
>   at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1699)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:455)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:276)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1337)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:484)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1065)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:413)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:999)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
>   at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:250)
>   at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:149)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
>   at org.eclipse.jetty.server.Server.handle(Server.java:351)
>   at 
> org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:454)
>   at 
> org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:47)
>   at 
> org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:900)
>   at 
> org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:954)
>   at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:857)
>   at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
>   at 
> org.ec

[jira] [Updated] (SOLR-3862) add "remove" as update option for atomically removing a value from a multivalued field

2012-11-30 Thread Jim Musil (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Musil updated SOLR-3862:


Attachment: SOLR-3862.patch

first stab

> add "remove" as update option for atomically removing a value from a 
> multivalued field
> --
>
> Key: SOLR-3862
> URL: https://issues.apache.org/jira/browse/SOLR-3862
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 4.0-BETA
>Reporter: Jim Musil
> Attachments: SOLR-3862.patch
>
>
> Currently you can atomically "add" a value to a multivalued field. It would 
> be useful to be able to "remove" a value from a multivalued field. 
> When you "set" a multivalued field to null, it destroys all values.




[jira] [Commented] (SOLR-3862) add "remove" as update option for atomically removing a value from a multivalued field

2012-11-30 Thread Jim Musil (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507703#comment-13507703
 ] 

Jim Musil commented on SOLR-3862:
-

So, I needed to hack this together to suit my needs. I'm not sure if anyone 
else would find this useful, but I've added a "removeAll" option for atomic 
updates. It just uses String.removeAll() on each value, so supplying regex as 
the value will work. I've never submitted a patch before, so please forgive me 
if I've done this wrong.
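
The behavior described (dropping multivalued entries that match a supplied regex) can be sketched with plain JDK calls; this is an illustration of the idea under discussion, not the patch itself, and the names are made up:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Sketch only: what a regex-driven "removeAll" over a multivalued field's
// values might do. Solr's field/update internals are not reproduced here.
public class RemoveAllSketch {

    // Return a copy of the values with every entry fully matching the regex removed.
    public static List<String> removeMatching(List<String> values, String regex) {
        List<String> result = new ArrayList<>(values);
        for (Iterator<String> it = result.iterator(); it.hasNext(); ) {
            if (it.next().matches(regex)) {
                it.remove(); // drop values the pattern fully matches
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> tags = Arrays.asList("red", "blue", "blue-2", "green");
        System.out.println(removeMatching(tags, "blue.*")); // [red, green]
    }
}
```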





[jira] [Comment Edited] (SOLR-3583) Percentiles for facets, pivot facets, and distributed pivot facets

2012-11-30 Thread Chris Russell (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507673#comment-13507673
 ] 

Chris Russell edited comment on SOLR-3583 at 11/30/12 9:47 PM:
---

Updated to trunk 1404975
Disentangled from SOLR-2894.  This patch no longer includes that patch.
You must first apply the 12th Nov 2012 version of SOLR-2894 which I updated to 
apply to the same version of trunk before applying this patch.

Based on some changes that I had to work around while updating to trunk, I feel 
that this will not work properly with facet.missing=true.  I am working on 
correcting this. (Pivot facets changed somewhat significantly in the interim.)

  was (Author: selah):
Updated to trunk 1404975
You must first apply the 12th Nov 2012 version of SOLR-2894 which I updated to 
apply to the same version of trunk.
Based on some changes that I had to work around while updating to trunk, I feel 
that this will not work properly with facet.missing=true.  I am working on 
correcting this.
  




[jira] [Comment Edited] (SOLR-3583) Percentiles for facets, pivot facets, and distributed pivot facets

2012-11-30 Thread Chris Russell (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507673#comment-13507673
 ] 

Chris Russell edited comment on SOLR-3583 at 11/30/12 9:46 PM:
---

Updated to trunk 1404975
You must first apply the 12th Nov 2012 version of SOLR-2894 which I updated to 
apply to the same version of trunk.
Based on some changes that I had to work around while updating to trunk, I feel 
that this will not work properly with facet.missing=true.  I am working on 
correcting this.

  was (Author: selah):
Updated to trunk 1404975
You must first apply the 12th Nov 2012 version of SOLR-2894 which I updated to 
apply to the same version of trunk.
  




[jira] [Comment Edited] (SOLR-3583) Percentiles for facets, pivot facets, and distributed pivot facets

2012-11-30 Thread Chris Russell (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507673#comment-13507673
 ] 

Chris Russell edited comment on SOLR-3583 at 11/30/12 9:44 PM:
---

Updated to trunk 1404975
You must first apply the 12th Nov 2012 version of SOLR-2894 which I updated to 
apply to the same version of trunk.

  was (Author: selah):
Updated to trunk 1404975
  




[jira] [Updated] (SOLR-3583) Percentiles for facets, pivot facets, and distributed pivot facets

2012-11-30 Thread Chris Russell (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Russell updated SOLR-3583:


Description: Built on top of SOLR-2894, this patch adds percentiles and 
averages to facets, pivot facets, and distributed pivot facets by making use of 
range facet internals.  (was: Built on top of SOLR-2894 (includes Apr 25th 
version) this patch adds percentiles and averages to facets, pivot facets, and 
distributed pivot facets by making use of range facet internals.  )





[jira] [Updated] (SOLR-3583) Percentiles for facets, pivot facets, and distributed pivot facets

2012-11-30 Thread Chris Russell (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Russell updated SOLR-3583:


Attachment: SOLR-3583.patch

Updated to trunk 1404975

> Percentiles for facets, pivot facets, and distributed pivot facets
> --
>
> Key: SOLR-3583
> URL: https://issues.apache.org/jira/browse/SOLR-3583
> Project: Solr
>  Issue Type: Improvement
>Reporter: Chris Russell
>Priority: Minor
>  Labels: newbie, patch
> Fix For: 4.1
>
> Attachments: SOLR-3583.patch, SOLR-3583.patch
>
>
> Built on top of SOLR-2894 (includes Apr 25th version) this patch adds 
> percentiles and averages to facets, pivot facets, and distributed pivot 
> facets by making use of range facet internals.  




[jira] [Commented] (SOLR-4115) WordBreakSpellChecker throws ArrayIndexOutOfBoundsException for random query string

2012-11-30 Thread Steven Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507672#comment-13507672
 ] 

Steven Rowe commented on SOLR-4115:
---

Looks to me like a fundamental bug in WordBreakSpellChecker.

I agree with James that there's invalid UTF-8 here, but it's not 
{{\uD864\uDC79}}, which is a valid UTF-16 sequence representing a single 
character (codepoint: {{U+29079}}, UTF-8: {{F0 A9 81 B9}} - this is a CJK 
ideograph above the BMP).

The wordbreak suggester is breaking up multibyte UTF-8 characters at 
non-character boundaries.

As a method on TestWordBreakSpellChecker, this fails for me with the same stack 
trace in Lucene:

{code:java}
public void testBreakingCharAboveBMP() throws Exception {
  IndexReader ir = null;
  try {
ir = DirectoryReader.open(dir);
WordBreakSpellChecker wbsp = new WordBreakSpellChecker();
Term term = new Term("numbers", "\uD864\uDC79");
wbsp.setMaxChanges(1);
wbsp.setMinBreakWordLength(1);
wbsp.setMinSuggestionFrequency(1);
SuggestWord[][] sw = wbsp.suggestWordBreaks(term, 5, ir, 
SuggestMode.SUGGEST_WHEN_NOT_IN_INDEX, 
BreakSuggestionSortMethod.NUM_CHANGES_THEN_MAX_FREQUENCY);
Assert.assertEquals("sw.length", 0, sw.length);
  } catch(Exception e) {
throw e;
  } finally {
try { ir.close(); } catch(Exception e1) { }
  }
}
{code}

{{UnicodeUtil.UTF8toUTF16()}} assumes you're sending it a valid UTF-8 sequence, 
and so it croaks when WordBreakSpellChecker sends it the first byte in the 
UTF-8 representation of {{\uD864\uDC79}}: {{F0}}, a non-valid UTF-8 sequence 
without three following bytes.
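
The encoding facts above are easy to verify with the JDK alone; a standalone snippet, independent of the spellchecker code:

```java
import java.nio.charset.StandardCharsets;

// Standalone check of the encoding facts above: "\uD864\uDC79" is a single
// supplementary code point (U+29079), and its UTF-8 form is the 4-byte
// sequence F0 A9 81 B9 - breaking it between bytes yields invalid UTF-8.
public class SurrogateCheck {
    public static void main(String[] args) {
        String s = "\uD864\uDC79";
        System.out.println(s.length());                      // 2 UTF-16 code units
        System.out.println(s.codePointCount(0, s.length())); // but only 1 code point
        System.out.printf("U+%X%n", s.codePointAt(0));       // U+29079
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            System.out.printf("%02X ", b);                   // F0 A9 81 B9
        }
        System.out.println();
    }
}
```

So any word-break position that splits between the F0 lead byte and its three continuation bytes produces exactly the kind of invalid sequence that makes `UnicodeUtil.UTF8toUTF16()` throw.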

> WordBreakSpellChecker throws ArrayIndexOutOfBoundsException for random query 
> string
> ---
>
> Key: SOLR-4115
> URL: https://issues.apache.org/jira/browse/SOLR-4115
> Project: Solr
>  Issue Type: Bug
>  Components: spellchecker
>Affects Versions: 4.0
> Environment: java version "1.6.0_37"
> Java(TM) SE Runtime Environment (build 1.6.0_37-b06)
> Java HotSpot(TM) 64-Bit Server VM (build 20.12-b01, mixed mode)
>Reporter: Andreas Hubold
>
> The following SolrJ test code causes an ArrayIndexOutOfBoundsException in the 
> WordBreakSpellChecker. I tested this with the Solr 4.0.0 example webapp 
> started with {{java -jar start.jar}}.
> {code:java}
>   @Test
>   public void testWordbreakSpellchecker() throws Exception {
> SolrQuery q = new SolrQuery("\uD864\uDC79");
> q.setRequestHandler("/browse");
> q.setParam("spellcheck.dictionary", "wordbreak");
> HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr");
> server.query(q, SolrRequest.METHOD.POST);
>   }
> {code}
> {noformat}
> INFO: [collection1] webapp=/solr path=/browse 
> params={spellcheck.dictionary=wordbreak&qt=/browse&wt=javabin&q=?&version=2} 
> hits=0 status=500 QTime=11 
> Nov 28, 2012 11:23:01 AM org.apache.solr.common.SolrException log
> SEVERE: null:java.lang.ArrayIndexOutOfBoundsException: 1
>   at org.apache.lucene.util.UnicodeUtil.UTF8toUTF16(UnicodeUtil.java:599)
>   at org.apache.lucene.util.BytesRef.utf8ToString(BytesRef.java:165)
>   at org.apache.lucene.index.Term.text(Term.java:72)
>   at 
> org.apache.lucene.search.spell.WordBreakSpellChecker.generateSuggestWord(WordBreakSpellChecker.java:350)
>   at 
> org.apache.lucene.search.spell.WordBreakSpellChecker.generateBreakUpSuggestions(WordBreakSpellChecker.java:283)
>   at 
> org.apache.lucene.search.spell.WordBreakSpellChecker.suggestWordBreaks(WordBreakSpellChecker.java:122)
>   at 
> org.apache.solr.spelling.WordBreakSolrSpellChecker.getSuggestions(WordBreakSolrSpellChecker.java:229)
>   at 
> org.apache.solr.handler.component.SpellCheckComponent.process(SpellCheckComponent.java:172)
>   at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:206)
>   at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1699)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:455)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:276)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1337)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:484)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHa

[jira] [Commented] (SOLR-3990) index size unavailable in gui/mbeans unless replication handler configured

2012-11-30 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507665#comment-13507665
 ] 

Mark Miller commented on SOLR-3990:
---

bq.  That may mean that the number needs to be available twice, one format to 
be shown in the admin GUI and both formats available from the mbeans handler, 
for scripting.

I think the system info handler does exactly that - I remember working on it at 
some point.
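
A minimal, hypothetical sketch of the two-format idea (raw byte counts for 
scripts, a derived human-readable string for the GUI); the sizes and the unit 
logic below are illustrative, not Solr's actual code:

```java
import java.util.Locale;

public class IndexSizeFormat {
  private static final String[] UNITS = {"bytes", "KB", "MB", "GB", "TB"};

  // Human-readable form for display in the admin GUI.
  static String humanReadable(long bytes) {
    double v = bytes;
    int u = 0;
    while (v >= 1024 && u < UNITS.length - 1) { v /= 1024; u++; }
    return String.format(Locale.ROOT, "%.2f %s", v, UNITS[u]);
  }

  public static void main(String[] args) {
    long core1 = 825_556_000L;     // hypothetical index size of core 1
    long core2 = 25_297_000_000L;  // hypothetical index size of core 2
    // Scripts compare the raw longs; the GUI shows the pretty strings.
    System.out.println(humanReadable(core1) + " vs " + humanReadable(core2));
    System.out.println("core2 larger: " + (core2 > core1));
  }
}
```

Comparing the raw longs avoids ever having to parse "787.31 MB" against 
"23.56 GB" in a script.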

> index size unavailable in gui/mbeans unless replication handler configured
> --
>
> Key: SOLR-3990
> URL: https://issues.apache.org/jira/browse/SOLR-3990
> Project: Solr
>  Issue Type: Improvement
>  Components: web gui
>Affects Versions: 4.0
>Reporter: Shawn Heisey
>Priority: Minor
> Fix For: 4.1, 5.0
>
>
> Unless you configure the replication handler, the on-disk size of each core's 
> index seems to be unavailable in the gui or from the mbeans handler.  If you 
> are not doing replication, you should still be able to get the size of each 
> index without configuring things that won't be used.
> Also, I would like to get the size of the index in a consistent unit of 
> measurement, probably MB.  I understand the desire to give people a human 
> readable unit next to a number that's not enormous, but it's difficult to do 
> programmatic comparisons between values such as 787.33 MB and 23.56 GB.  That 
> may mean that the number needs to be available twice, one format to be shown 
> in the admin GUI and both formats available from the mbeans handler, for 
> scripting.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-3990) index size unavailable in gui/mbeans unless replication handler configured

2012-11-30 Thread Shawn Heisey (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Heisey updated SOLR-3990:
---

Fix Version/s: 5.0

> index size unavailable in gui/mbeans unless replication handler configured
> --
>
> Key: SOLR-3990
> URL: https://issues.apache.org/jira/browse/SOLR-3990
> Project: Solr
>  Issue Type: Improvement
>  Components: web gui
>Affects Versions: 4.0
>Reporter: Shawn Heisey
>Priority: Minor
> Fix For: 4.1, 5.0
>
>
> Unless you configure the replication handler, the on-disk size of each core's 
> index seems to be unavailable in the gui or from the mbeans handler.  If you 
> are not doing replication, you should still be able to get the size of each 
> index without configuring things that won't be used.
> Also, I would like to get the size of the index in a consistent unit of 
> measurement, probably MB.  I understand the desire to give people a human 
> readable unit next to a number that's not enormous, but it's difficult to do 
> programmatic comparisons between values such as 787.33 MB and 23.56 GB.  That 
> may mean that the number needs to be available twice, one format to be shown 
> in the admin GUI and both formats available from the mbeans handler, for 
> scripting.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-2890) omitTermFreqAndPositions and omitNorms don't work properly when used on fieldTypes

2012-11-30 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-2890.


   Resolution: Fixed
Fix Version/s: 5.0

> omitTermFreqAndPositions and omitNorms don't work properly when used on 
> fieldTypes
> --
>
> Key: SOLR-2890
> URL: https://issues.apache.org/jira/browse/SOLR-2890
> Project: Solr
>  Issue Type: Bug
>  Components: Schema and Analysis
>Affects Versions: 3.4
>Reporter: David Smiley
>Assignee: Hoss Man
> Fix For: 4.1, 5.0
>
> Attachments: SOLR-2890.patch
>
>
> Setting omitTermFreqAndPositions="true" doesn't work when I put it on a 
> fieldType definition for my text field.  It did work when I put it on the 
> field definition.  I think this option and probably all options should be 
> settable at the fieldType level.  I did some investigation and found that the 
> value of this option was being reset on line 54 of TextField.
> FYI I am trying to put this on a field type for use by the SpellCheck 
> component which has no use for term frequencies and positions from the source 
> field.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2890) omitTermFreqAndPositions and omitNorms don't work properly when used on fieldTypes

2012-11-30 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507630#comment-13507630
 ] 

Commit Tag Bot commented on SOLR-2890:
--

[branch_4x commit] Chris M. Hostetter
http://svn.apache.org/viewvc?view=revision&revision=1415837

SOLR-2890: Fixed a bug that prevented omitNorms and omitTermFreqAndPositions 
options from being respected in some <fieldType/> declarations (merge r1415817)



> omitTermFreqAndPositions and omitNorms don't work properly when used on 
> fieldTypes
> --
>
> Key: SOLR-2890
> URL: https://issues.apache.org/jira/browse/SOLR-2890
> Project: Solr
>  Issue Type: Bug
>  Components: Schema and Analysis
>Affects Versions: 3.4
>Reporter: David Smiley
>Assignee: Hoss Man
> Fix For: 4.1
>
> Attachments: SOLR-2890.patch
>
>
> Setting omitTermFreqAndPositions="true" doesn't work when I put it on a 
> fieldType definition for my text field.  It did work when I put it on the 
> field definition.  I think this option and probably all options should be 
> settable at the fieldType level.  I did some investigation and found that the 
> value of this option was being reset on line 54 of TextField.
> FYI I am trying to put this on a field type for use by the SpellCheck 
> component which has no use for term frequencies and positions from the source 
> field.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4132) Special log category for announcements, startup messages

2012-11-30 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507622#comment-13507622
 ] 

Shawn Heisey commented on SOLR-4132:


Messages from the new category could be at slf4j's trace level so that when 
logging all categories at INFO or DEBUG, you won't see duplicate messages.

> Special log category for announcements, startup messages
> 
>
> Key: SOLR-4132
> URL: https://issues.apache.org/jira/browse/SOLR-4132
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 4.0
>Reporter: Shawn Heisey
> Fix For: 4.1, 5.0
>
>
> When logging at WARN (log4j) or WARNING (jul), Solr startup logs nothing or 
> next to nothing.  I would like to be able to include some informational 
> messages in the log so that I can see that something's happening, and that 
> Solr is ready for use.  It would probably be wrong to set the level of such 
> things to WARN, because they aren't warnings.
> An alternate plan would be to create a special log category, which could be 
> set to INFO in log4j.properties or logging.properties.  David Smiley 
> suggested org.apache.solr.announcement.
> At a minimum, the new category should have messages for initial startup, core 
> creation, all initialization finished, and shutdown.  There may be other 
> things that should be included too.  It may even be a good idea to have 
> subcategories.  Some initial ideas I've thought of:
> org.apache.solr.announcement.startup
> org.apache.solr.announcement.shutdown
> org.apache.solr.announcement.init
> org.apache.solr.announcement.misc

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-4132) Special log category for announcements, startup messages

2012-11-30 Thread Shawn Heisey (JIRA)
Shawn Heisey created SOLR-4132:
--

 Summary: Special log category for announcements, startup messages
 Key: SOLR-4132
 URL: https://issues.apache.org/jira/browse/SOLR-4132
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.0
Reporter: Shawn Heisey
 Fix For: 4.1, 5.0


When logging at WARN (log4j) or WARNING (jul), Solr startup logs nothing or 
next to nothing.  I would like to be able to include some informational 
messages in the log so that I can see that something's happening, and that Solr 
is ready for use.  It would probably be wrong to set the level of such things 
to WARN, because they aren't warnings.

An alternate plan would be to create a special log category, which could be set 
to INFO in log4j.properties or logging.properties.  David Smiley suggested 
org.apache.solr.announcement.

At a minimum, the new category should have messages for initial startup, core 
creation, all initialization finished, and shutdown.  There may be other things 
that should be included too.  It may even be a good idea to have subcategories. 
 Some initial ideas I've thought of:

org.apache.solr.announcement.startup
org.apache.solr.announcement.shutdown
org.apache.solr.announcement.init
org.apache.solr.announcement.misc
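
Assuming the proposed category names above, a log4j.properties fragment that 
keeps the root logger at WARN while letting announcements through at INFO 
might look like this (a sketch, not a shipped configuration):

```properties
# Root stays quiet; only genuine warnings from Solr at large.
log4j.rootLogger=WARN, file

# Raise only the announcement categories so startup, core creation,
# init-finished, and shutdown messages still appear in the log.
log4j.logger.org.apache.solr.announcement=INFO
```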


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2890) omitTermFreqAndPositions and omitNorms don't work properly when used on fieldTypes

2012-11-30 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507618#comment-13507618
 ] 

Commit Tag Bot commented on SOLR-2890:
--

[trunk commit] Chris M. Hostetter
http://svn.apache.org/viewvc?view=revision&revision=1415817

SOLR-2890: Fixed a bug that prevented omitNorms and omitTermFreqAndPositions 
options from being respected in some <fieldType/> declarations



> omitTermFreqAndPositions and omitNorms don't work properly when used on 
> fieldTypes
> --
>
> Key: SOLR-2890
> URL: https://issues.apache.org/jira/browse/SOLR-2890
> Project: Solr
>  Issue Type: Bug
>  Components: Schema and Analysis
>Affects Versions: 3.4
>Reporter: David Smiley
>Assignee: Hoss Man
> Fix For: 4.1
>
> Attachments: SOLR-2890.patch
>
>
> Setting omitTermFreqAndPositions="true" doesn't work when I put it on a 
> fieldType definition for my text field.  It did work when I put it on the 
> field definition.  I think this option and probably all options should be 
> settable at the fieldType level.  I did some investigation and found that the 
> value of this option was being reset on line 54 of TextField.
> FYI I am trying to put this on a field type for use by the SpellCheck 
> component which has no use for term frequencies and positions from the source 
> field.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3918) Change the way -excl-slf4j targets work

2012-11-30 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507613#comment-13507613
 ] 

Shawn Heisey commented on SOLR-3918:


Discussion on SOLR-4129 touched on this issue.  I think my patch for this issue 
needs to be committed.  My patch does two things:  1) adds a new target so you 
can exclude slf4j but still get everything else that "dist" gives you now.  2) 
Removes the slf4j api jar from the war in addition to the other slf4j jars.  
Restating my reasoning for number 2:

When you choose to exclude slf4j and use your own binding, you're already going 
to have to go track down jars from an external source and add them to your 
install.  The path of least resistance will lead you to download a newer 
version of slf4j (1.7.2 right now) than is found in Solr (1.6.4 right now).  
Because of the api jar sitting in the war, this won't work.  If a bug or 
performance problem is found that affects Solr, it's a fair amount of manual 
work to get operational with a new slf4j version.  My patch eliminates that 
problem, with the additional requirement that you include the api jar yourself.


> Change the way -excl-slf4j targets work
> ---
>
> Key: SOLR-3918
> URL: https://issues.apache.org/jira/browse/SOLR-3918
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 3.6.1, 4.0-BETA
>Reporter: Shawn Heisey
>Priority: Trivial
> Fix For: 3.6.2, 4.1, 5.0
>
> Attachments: SOLR-3918.patch, SOLR-3918.patch
>
>
> If you want to create an entire dist target but leave out slf4j bindings, you 
> must currently use this:
> ant dist-solrj, dist-core, dist-test-framework, dist-contrib 
> dist-war-excl-slf4j
> It would be better to have a single target.  Attaching a patch against 
> branch_4x for this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4129) Solr UI doesn't support log4j

2012-11-30 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507602#comment-13507602
 ] 

Shawn Heisey commented on SOLR-4129:


bq. Using WARN for an informational message just seems wrong to me, albeit I've 
seen it in some projects at work. Simply setting only one class to INFO and 
others higher (e.g. WARN) wouldn't work because this would probably go in 
SolrCore which has plenty of other INFO messages to say. Too much IMO. I think 
the right solution is to log an announcement message to a special name like 
"org.apache.solr.announcement". Yeah, file an issue for that.

You're right, such logs aren't warnings.  I'll file the new issue.

bq.  I believe there's even an ant target to pre-package the .war file 
appropriately.

There's a dist-war-excl-slf4j target, but there's no easy way to get both the 
special war AND the rest of what 'dist' does.  I filed SOLR-3918 initially to 
just add a dist-excl-slf4j target, then updated that issue to also remove 
slf4j-api-1.6.4.jar.  I'll be updating that issue with my full rationale for 
why I think it should be committed.


> Solr UI doesn't support log4j 
> --
>
> Key: SOLR-4129
> URL: https://issues.apache.org/jira/browse/SOLR-4129
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0
>Reporter: Raintung Li
>  Labels: log
> Fix For: 4.1, 5.0
>
> Attachments: patch-4129.txt
>
>
> Many projects use log4j. Solr itself uses the slf4j logging framework, which 
> is designed to integrate easily with log4j.
> Solr uses log4j-over-slf4j.jar to handle the log4j case.
> This jar has some issues:
> a. It ultimately delegates to slf4j to print the log message (for Solr that 
> means JDK14 logging).
> b. It does not implement all log4j functionality, e.g. Logger.setLevel().
> c. JDK14 logging lacks some features, e.g. thread info, daily rolling.
> Some dependent projects already use log4j, and their customers still want to 
> use it. JDK14 logging differs from log4j in many ways; at the least, the 
> configuration files can't be reused.
> The bad thing is that log4j-over-slf4j.jar conflicts with log4j: to use 
> Solr, the other projects have to remove log4j.
> I think Solr shouldn't use log4j-over-slf4j.jar, and should instead reuse 
> log4j if the customer wants to use it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4129) Solr UI doesn't support log4j

2012-11-30 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507575#comment-13507575
 ] 

Mark Miller commented on SOLR-4129:
---

bq. There was discussion of moving the default solr log implementation to Log4J

+1

> Solr UI doesn't support log4j 
> --
>
> Key: SOLR-4129
> URL: https://issues.apache.org/jira/browse/SOLR-4129
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0
>Reporter: Raintung Li
>  Labels: log
> Fix For: 4.1, 5.0
>
> Attachments: patch-4129.txt
>
>
> Many projects use log4j. Solr itself uses the slf4j logging framework, which 
> is designed to integrate easily with log4j.
> Solr uses log4j-over-slf4j.jar to handle the log4j case.
> This jar has some issues:
> a. It ultimately delegates to slf4j to print the log message (for Solr that 
> means JDK14 logging).
> b. It does not implement all log4j functionality, e.g. Logger.setLevel().
> c. JDK14 logging lacks some features, e.g. thread info, daily rolling.
> Some dependent projects already use log4j, and their customers still want to 
> use it. JDK14 logging differs from log4j in many ways; at the least, the 
> configuration files can't be reused.
> The bad thing is that log4j-over-slf4j.jar conflicts with log4j: to use 
> Solr, the other projects have to remove log4j.
> I think Solr shouldn't use log4j-over-slf4j.jar, and should instead reuse 
> log4j if the customer wants to use it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4129) Solr UI doesn't support log4j

2012-11-30 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507568#comment-13507568
 ] 

Ryan McKinley commented on SOLR-4129:
-

See also SOLR-3356

FYI, the LogWatcher stuff initially supported Log4J out-of-the-box, but given 
the classpath conflicts with testing, it was removed.

There was discussion of moving the default solr log implementation to Log4J... 
but that did not resolve.



> Solr UI doesn't support log4j 
> --
>
> Key: SOLR-4129
> URL: https://issues.apache.org/jira/browse/SOLR-4129
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0
>Reporter: Raintung Li
>  Labels: log
> Fix For: 4.1, 5.0
>
> Attachments: patch-4129.txt
>
>
> Many projects use log4j. Solr itself uses the slf4j logging framework, which 
> is designed to integrate easily with log4j.
> Solr uses log4j-over-slf4j.jar to handle the log4j case.
> This jar has some issues:
> a. It ultimately delegates to slf4j to print the log message (for Solr that 
> means JDK14 logging).
> b. It does not implement all log4j functionality, e.g. Logger.setLevel().
> c. JDK14 logging lacks some features, e.g. thread info, daily rolling.
> Some dependent projects already use log4j, and their customers still want to 
> use it. JDK14 logging differs from log4j in many ways; at the least, the 
> configuration files can't be reused.
> The bad thing is that log4j-over-slf4j.jar conflicts with log4j: to use 
> Solr, the other projects have to remove log4j.
> I think Solr shouldn't use log4j-over-slf4j.jar, and should instead reuse 
> log4j if the customer wants to use it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-4131) Nested dismax query showing "dismax" as spellcheck result

2012-11-30 Thread James Dyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Dyer resolved SOLR-4131.
--

Resolution: Invalid

SpellingQueryConverter by design does not support local params.  Using 
spellcheck.q is an acceptable workaround.
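
For illustration, a JDK-only sketch of the workaround (plain string handling, 
not SolrJ): keep the nested {!dismax} clause in q, and pass only the raw user 
terms in spellcheck.q so the query converter never sees local-params syntax. 
The parameter values below are made up.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class SpellcheckQParam {
  public static void main(String[] args) throws Exception {
    // The main query keeps the nested {!dismax ...} clause ...
    String q = "(+(mfPartNumber_ntk:(TESTSTRING))) OR "
        + "(_query_:\"{!dismax qf='name^5'} TESTSTRING\")";
    // ... while spellcheck.q carries only the raw user terms.
    String spellcheckQ = "TESTSTRING";

    String url = "/select"
        + "?q=" + URLEncoder.encode(q, StandardCharsets.UTF_8.name())
        + "&spellcheck=true"
        + "&spellcheck.q="
        + URLEncoder.encode(spellcheckQ, StandardCharsets.UTF_8.name());
    System.out.println(url);
  }
}
```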

> Nested dismax query showing "dismax" as spellcheck result
> -
>
> Key: SOLR-4131
> URL: https://issues.apache.org/jira/browse/SOLR-4131
> Project: Solr
>  Issue Type: Bug
>Reporter: Parham Mofidi
> Fix For: 1.4
>
>
> I'm performing the following query against our Solr database:
> {code}
> spellcheck.count=5&facet=true&facet.limit=200&hl=true&hl.fl=name&hl.fl=shortDescription&version=1&start=0&facet.sort=index&facet.field=parentCatgroup_id_search&facet.field=mfName_ntk_cs&fq=catalog_id:"10001"&fq=storeent_id:("10001")&fq=published:1&fq=-(catenttype_id_ntk_cs:ItemBean+AND+parentCatentry_id:[*+TO+*])&timeAllowed=15000&hl.simple.post=&rows=24&debugQuery=false&facet.query=price_CAD:({*+100}+100)&facet.query=price_CAD:({100+200}+200)&facet.query=price_CAD:({200+300}+300)&facet.query=price_CAD:({300+400}+400)&facet.query=price_CAD:({400+500}+500)&facet.query=price_CAD:({500+*})&q=(%2B(mfPartNumber_ntk:(TESTSTRING)+partNumber_ntk:(TESTSTRING)))+OR+(_query_:"{!dismax+ps%3D2+mm%3D'2<-25%25'+qf%3D'longDescription+name^5+mfName^10+xf_CategoryPath^10'+pf%3D'longDescription^10+name^30+xf_CategoryPath^30'}+TESTSTRING+")&spellcheck.collate=false&hl.simple.pre=&facet.mincount=1&spellcheck=true&hl.requireFieldMatch=true
> {code}
> which results in the following returned XML:
> {code}
> This XML file does not appear to have any style information associated with 
> it. The document tree is shown below.
> 
> 
> 0
> 59
> 
> 5
> true
> 200
> true
> 
> name
> shortDescription
> 
> 1
> 0
> index
> 
> parentCatgroup_id_search
> mfName_ntk_cs
> 
> 
> catalog_id:"10001"
> storeent_id:("10001")
> published:1
> 
> -(catenttype_id_ntk_cs:ItemBean AND parentCatentry_id:[* TO *])
> 
> 
> 15000
> 
> 24
> false
> 
> price_CAD:({* 100} 100)
> price_CAD:({100 200} 200)
> price_CAD:({200 300} 300)
> price_CAD:({300 400} 400)
> price_CAD:({400 500} 500)
> price_CAD:({500 *})
> 
> 
> (+(mfPartNumber_ntk:(TESTSTRING) partNumber_ntk:(TESTSTRING))) OR 
> (_query_:"{!dismax ps=2 mm='2<-25%' qf='longDescription name^5 mfName^10 
> xf_CategoryPath^10' pf='longDescription^10 name^30 xf_CategoryPath^30'} 
> TESTSTRING ")
> 
> 
> false
> 1
> true
> true
> 
> 
> 
> 
> 
> 0
> 0
> 0
> 0
> 0
> 0
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 5
> 21
> 31
> 
> test taking
> testing
> tests eolang
> tests quick
> tests grade
> 
> 
> 
> 5
> 49
> 59
> 
> test taking
> testing
> tests eolang
> tests quick
> tests grade
> 
> 
> 
> 5
> 78
> 84
> 
> digimax
> minimax
> big max
> heisman
> mid max
> 
> 
> 
> 3
> 102
> 104
> 
> qlf
> qnf
> qlfd
> 
> 
> 
> 5
> 106
> 121
> 
> job descriptions
> description
> prescription
> descriptions
> skin prescription
> 
> 
> 
> 5
> 129
> 135
> 
> 3 name
> ornament
> enamel
> username
> surnames
> 
> 
> 
> 5
> 139
> 154
> 
> category
> category 5e
> category 6
> category 5
> category 6a
> 
> 
> 
> 5
> 163
> 178
> 
> job descriptions
> description
> prescription
> descriptions
> skin prescription
> 
> 
> 
> 5
> 190
> 205
> 
> category
> category 5e
> category 6
> category 5
> category 6a
> 
> 
> 
> 5
> 211
> 221
> 
> test taking
> testing
> tests eolang
> tests quick
> tests grade
> 
> 
> 
> 
> 
> {code}
> Parts of the nested dismax query are being used as spellcheck keywords, and 
> we don't quite understand why.  Is there something that we're doing wrong, 
> which would cause the nested query to show up as spellcheck keywords?
> Thanks.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



RE: Reenable Solr tests on Jenkins

2012-11-30 Thread Uwe Schindler
Ah,

 

I left the Solr tests enabled for Clover and Badapples runs, as those don’t 
send emails. We should now be back at correct Solr code coverage.

 

Uwe

 

-

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen

  http://www.thetaphi.de

eMail: u...@thetaphi.de

 

From: Uwe Schindler [mailto:u...@thetaphi.de] 
Sent: Friday, November 30, 2012 7:36 PM
To: dev@lucene.apache.org
Subject: RE: Reenable Solr tests on Jenkins

 

I implemented that for now and disabled all solr tests on Apache Jenkins until 
we found a good solution to *only* disable tests using SolrJettyRunner or 
whatever is confused by blackhole.

 

-

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen

http://www.thetaphi.de  

eMail: u...@thetaphi.de

 

From: Robert Muir [mailto:rcm...@gmail.com] 
Sent: Friday, November 30, 2012 5:13 PM
To: dev@lucene.apache.org
Subject: Re: Reenable Solr tests on Jenkins

 

 

On Fri, Nov 30, 2012 at 11:11 AM, Mark Miller  wrote:

 

Just be aggressive and assume FreeBSD is black holed?

- Mark

 


Or just have a -D we provide, with our apache jenkins server setting it to 
true. 



RE: Reenable Solr tests on Jenkins

2012-11-30 Thread Uwe Schindler
I implemented that for now and disabled all solr tests on Apache Jenkins until 
we found a good solution to *only* disable tests using SolrJettyRunner or 
whatever is confused by blackhole.

 

-

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen

  http://www.thetaphi.de

eMail: u...@thetaphi.de

 

From: Robert Muir [mailto:rcm...@gmail.com] 
Sent: Friday, November 30, 2012 5:13 PM
To: dev@lucene.apache.org
Subject: Re: Reenable Solr tests on Jenkins

 

 

On Fri, Nov 30, 2012 at 11:11 AM, Mark Miller  wrote:

 

Just be aggressive and assume FreeBSD is black holed?

- Mark

 


Or just have a -D we provide, with our apache jenkins server setting it to 
true. 



[jira] [Commented] (SOLR-4131) Nested dismax query showing "dismax" as spellcheck result

2012-11-30 Thread Parham Mofidi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507531#comment-13507531
 ] 

Parham Mofidi commented on SOLR-4131:
-

Yes it helped.  Thanks James!


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4115) WordBreakSpellChecker throws ArrayIndexOutOfBoundsException for random query string

2012-11-30 Thread James Dyer (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507530#comment-13507530
 ] 

James Dyer commented on SOLR-4115:
--

Correct me if I'm wrong, but Lucene is only able to take valid UTF-8 as input, 
right?  So oal.util.UnicodeUtil.UTF8toUTF16 doesn't like \uD864\uDC79 because 
it's invalid UTF-8.  

See 
http://lucene.apache.org/core/4_0_0/core/org/apache/lucene/util/UnicodeUtil.html#UTF8toUTF16%28org.apache.lucene.util.BytesRef,%20org.apache.lucene.util.CharsRef%29
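For reference, the JDK's own UTF-8 encoder makes the paired-vs-unpaired surrogate distinction easy to see. This standalone sketch uses only java.nio; it does not touch Lucene's UnicodeUtil and is not tied to the actual bug:

```java
import java.nio.charset.StandardCharsets;

public class SurrogateCheck {
    public static void main(String[] args) {
        // "\uD864\uDC79" is a well-formed surrogate pair (code point U+29079),
        // so it encodes to a normal 4-byte UTF-8 sequence.
        byte[] paired = "\uD864\uDC79".getBytes(StandardCharsets.UTF_8);
        System.out.println(paired.length);  // 4

        // A lone (unpaired) surrogate cannot be represented in UTF-8; the JDK
        // encoder substitutes '?' instead of emitting an invalid sequence.
        byte[] lone = "\uD864".getBytes(StandardCharsets.UTF_8);
        System.out.println(lone.length);    // 1
    }
}
```

Whether the failing input reached UTF8toUTF16 as a valid byte sequence is exactly the kind of thing this distinction helps check.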

> WordBreakSpellChecker throws ArrayIndexOutOfBoundsException for random query 
> string
> ---
>
> Key: SOLR-4115
> URL: https://issues.apache.org/jira/browse/SOLR-4115
> Project: Solr
>  Issue Type: Bug
>  Components: spellchecker
>Affects Versions: 4.0
> Environment: java version "1.6.0_37"
> Java(TM) SE Runtime Environment (build 1.6.0_37-b06)
> Java HotSpot(TM) 64-Bit Server VM (build 20.12-b01, mixed mode)
>Reporter: Andreas Hubold
>
> The following SolrJ test code causes an ArrayIndexOutOfBoundsException in the 
> WordBreakSpellChecker. I tested this with the Solr 4.0.0 example webapp 
> started with {{java -jar start.jar}}.
> {code:java}
>   @Test
>   public void testWordbreakSpellchecker() throws Exception {
> SolrQuery q = new SolrQuery("\uD864\uDC79");
> q.setRequestHandler("/browse");
> q.setParam("spellcheck.dictionary", "wordbreak");
> HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr";);
> server.query(q, SolrRequest.METHOD.POST);
>   }
> {code}
> {noformat}
> INFO: [collection1] webapp=/solr path=/browse 
> params={spellcheck.dictionary=wordbreak&qt=/browse&wt=javabin&q=?&version=2} 
> hits=0 status=500 QTime=11 
> Nov 28, 2012 11:23:01 AM org.apache.solr.common.SolrException log
> SEVERE: null:java.lang.ArrayIndexOutOfBoundsException: 1
>   at org.apache.lucene.util.UnicodeUtil.UTF8toUTF16(UnicodeUtil.java:599)
>   at org.apache.lucene.util.BytesRef.utf8ToString(BytesRef.java:165)
>   at org.apache.lucene.index.Term.text(Term.java:72)
>   at 
> org.apache.lucene.search.spell.WordBreakSpellChecker.generateSuggestWord(WordBreakSpellChecker.java:350)
>   at 
> org.apache.lucene.search.spell.WordBreakSpellChecker.generateBreakUpSuggestions(WordBreakSpellChecker.java:283)
>   at 
> org.apache.lucene.search.spell.WordBreakSpellChecker.suggestWordBreaks(WordBreakSpellChecker.java:122)
>   at 
> org.apache.solr.spelling.WordBreakSolrSpellChecker.getSuggestions(WordBreakSolrSpellChecker.java:229)
>   at 
> org.apache.solr.handler.component.SpellCheckComponent.process(SpellCheckComponent.java:172)
>   at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:206)
>   at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1699)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:455)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:276)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1337)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:484)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1065)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:413)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:999)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
>   at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:250)
>   at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:149)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
>   at org.eclipse.jetty.server.Server.handle(Server.java:351)
>   at 
> org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:454)
>   at 
> org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:47)
>   at 
> org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:900)
>   at 
> org

[jira] [Commented] (SOLR-4129) Solr UI doesn't support log4j

2012-11-30 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507522#comment-13507522
 ] 

David Smiley commented on SOLR-4129:


I changed this issue to an "improvement" and changed the title to reflect it's 
a UI thing.  That is what is reflected in the patch.  The author of the 
description, on the other hand, appears to be unaware that 
log4j-over-slf4j-1.7.2.jar needs to be removed if you're going to log to Log4j. 
 I believe there's even an ant target to pre-package the .war file 
appropriately.

> Solr UI doesn't support log4j 
> --
>
> Key: SOLR-4129
> URL: https://issues.apache.org/jira/browse/SOLR-4129
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0
>Reporter: Raintung Li
>  Labels: log
> Fix For: 4.1, 5.0
>
> Attachments: patch-4129.txt
>
>
> Many projects use log4j. Solr actually uses the SLF4J logging framework, 
> which is designed to integrate easily with log4j. 
> Solr uses log4j-over-slf4j.jar to handle the log4j case.
> This jar has some issues:
> a. Log calls are ultimately routed through SLF4J to the bound logger (for 
> Solr that is JDK 1.4 logging).
> b. It does not implement every log4j function, e.g. Logger.setLevel().
> c. JDK 1.4 logging is missing some features, e.g. thread info and daily 
> rolling.
> Some dependent projects already use log4j, and customers still want to use 
> it. JDK 1.4 logging differs from log4j in many ways; at the very least, the 
> configuration files cannot be reused.
> The bad thing is that log4j-over-slf4j.jar conflicts with log4j: to use 
> Solr, other projects have to remove log4j.
> I think Solr shouldn't use log4j-over-slf4j.jar, and should let customers 
> keep using log4j if they want to.




[jira] [Updated] (SOLR-4129) Solr UI doesn't support log4j

2012-11-30 Thread David Smiley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated SOLR-4129:
---

Summary: Solr UI doesn't support log4j   (was: Solr doesn't support log4j )

> Solr UI doesn't support log4j 
> --
>
> Key: SOLR-4129
> URL: https://issues.apache.org/jira/browse/SOLR-4129
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0
>Reporter: Raintung Li
>  Labels: log
> Fix For: 4.1, 5.0
>
> Attachments: patch-4129.txt




[jira] [Updated] (SOLR-4129) Solr doesn't support log4j

2012-11-30 Thread David Smiley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated SOLR-4129:
---

Issue Type: Improvement  (was: Bug)

> Solr doesn't support log4j 
> ---
>
> Key: SOLR-4129
> URL: https://issues.apache.org/jira/browse/SOLR-4129
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0
>Reporter: Raintung Li
>  Labels: log
> Fix For: 4.1, 5.0
>
> Attachments: patch-4129.txt




[jira] [Commented] (SOLR-4129) Solr doesn't support log4j

2012-11-30 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507518#comment-13507518
 ] 

David Smiley commented on SOLR-4129:


bq. I would like to have short logs saying that each core has been started and 
that Solr is fully started. Would it be reasonable to file a jira asking for 
this to happen at WARN, or would I just have to figure out which class(es) to 
set to INFO in my log4j config?

Using WARN for an informational message just seems wrong to me, although I've 
seen it in some projects at work.  Simply setting only one class to INFO and 
others higher (e.g. WARN) wouldn't work, because this would probably go in 
SolrCore, which has plenty of other INFO messages to log.  Too much IMO.  I 
think the right solution is to log an announcement message to a special name 
like "org.apache.solr.announcement".  Yeah, file an issue for that.
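A minimal sketch of that idea with JDK logging; the category name here is the hypothetical one suggested above, not an existing Solr logger:

```java
import java.util.logging.Logger;

public class Announce {
    // A dedicated, well-known logger name lets deployments keep every other
    // category at WARN while leaving just this one at INFO.
    private static final Logger ANNOUNCE =
        Logger.getLogger("org.apache.solr.announcement");

    public static void main(String[] args) {
        ANNOUNCE.info("core collection1 is ready");
    }
}
```

With log4j (or any backend), a single config line for that one category then controls the announcement messages independently of the rest of Solr's logging.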

> Solr doesn't support log4j 
> ---
>
> Key: SOLR-4129
> URL: https://issues.apache.org/jira/browse/SOLR-4129
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0
>Reporter: Raintung Li
>  Labels: log
> Fix For: 4.1, 5.0
>
> Attachments: patch-4129.txt




[jira] [Commented] (SOLR-4131) Nested dismax query showing "dismax" as spellcheck result

2012-11-30 Thread James Dyer (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507516#comment-13507516
 ] 

James Dyer commented on SOLR-4131:
--

I think the issue is that the SpellingQueryConverter class is not designed to 
work with local params.  Your best bet is to build a space-delimited list of 
raw query terms that you want spellchecked and pass them as the "spellcheck.q" 
parameter.  Does this solve your problems?

Of course making the spellchecker more robust to handle anything you throw at 
it would be a worthy enhancement. 
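For illustration, here is how such a request could be assembled by hand. The spellcheck and spellcheck.q parameter names are standard Solr; the helper class itself is hypothetical and uses only the JDK (a real client would more likely set these via SolrJ's SolrQuery.set):

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class SpellcheckParams {
    // Build a query string that sends the full nested query as q, but hands
    // the spellchecker only the raw terms via spellcheck.q.
    static String buildQueryString(String fullQuery, String rawTerms) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", fullQuery);
        params.put("spellcheck", "true");
        params.put("spellcheck.q", rawTerms);  // space-delimited raw terms only
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) sb.append('&');
            sb.append(e.getKey()).append('=')
              .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String q = "_query_:\"{!dismax qf='name^5'} TESTSTRING\"";
        System.out.println(buildQueryString(q, "TESTSTRING"));
    }
}
```

Because spellcheck.q bypasses the query converter entirely, the local-params block never reaches the spellchecker.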

> Nested dismax query showing "dismax" as spellcheck result
> -
>
> Key: SOLR-4131
> URL: https://issues.apache.org/jira/browse/SOLR-4131
> Project: Solr
>  Issue Type: Bug
>Reporter: Parham Mofidi
> Fix For: 1.4
>
>
> I'm performing the following query against our Solr database:
> [...]
> Parts of the nested dismax query are being used as spellcheck keywords, and 
> we don't quite understand why.  Is there something that we're doing wrong, 
> which would cause the nested query to show up as spellcheck keywords?
> Thanks.




Re: IndexWriter.ensureOpen and ensureOpen(boolean)

2012-11-30 Thread Michael McCandless
On Fri, Nov 30, 2012 at 1:08 PM, Shai Erera  wrote:
>> Hmm prepareCommit() really should somehow pass true: this API is only
>> invoked by the app, not by IW internally during close.
>
>
> Currently, the app can call prepareCommit() or prepareCommit(data), the
> former calls ensureOpen(true), the latter ensureOpen(false).
>
> The problem is that the latter is called by internalCommit(). So when I
> changed prepareCommit(data) to call ensureOpen(true), all TestIndexWriter
> tests failed (the only ones I tried).

OK I see.

> So I think that in that issue, I should create an internalPrepareCommit and
> leave prepCommit() to call ensureOpen(), while internalPrep will call
> ensureOpen(false). And also change IW code to call internalCommit. How's
> that sound?

That sounds good!  Thanks.

Mike McCandless

http://blog.mikemccandless.com




[jira] [Commented] (SOLR-4129) Solr doesn't support log4j

2012-11-30 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507511#comment-13507511
 ] 

Shawn Heisey commented on SOLR-4129:


bq. Shawn, how could you have both log4j-over-slf4j and log4j itself on the 
classpath? That doesn't make sense and SLF4J will probably complain; if it 
doesn't it's still an error. If your intent is to use Log4j then remove 
log4j-over-slf4j.

I did this because I didn't know any better.  I'm a bit of a beginner in the 
Java universe.  I added the slf4j things included in the standard WAR except 
for the JUL binding, then added the log4j binding and log4j.  There are no 
complaints about it in my log.  It's probably not complaining because 
alphabetically, the real log4j jar comes before the slf4j one, so when the 
import is done, the right class gets loaded.

After removing log4j-over-slf4j, Solr still seems fine.  Both before and after 
the removal, logging at WARN, the only log entry I get is "Log watching is not 
yet implemented for log4j."

I would like to have short logs saying that each core has been started and that 
Solr is fully started.  Would it be reasonable to file a jira asking for this 
to happen at WARN, or would I just have to figure out which class(es) to set to 
INFO in my log4j config?


> Solr doesn't support log4j 
> ---
>
> Key: SOLR-4129
> URL: https://issues.apache.org/jira/browse/SOLR-4129
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0
>Reporter: Raintung Li
>  Labels: log
> Fix For: 4.1, 5.0
>
> Attachments: patch-4129.txt




Re: IndexWriter.ensureOpen and ensureOpen(boolean)

2012-11-30 Thread Shai Erera
>
> Hmm prepareCommit() really should somehow pass true: this API is only
> invoked by the app, not by IW internally during close.


Currently, the app can call prepareCommit() or prepareCommit(data), the
former calls ensureOpen(true), the latter ensureOpen(false).

The problem is that the latter is called by internalCommit(). So when I
changed prepareCommit(data) to call ensureOpen(true), all TestIndexWriter
tests failed (the only ones I tried).

So I think that in that issue, I should create an internalPrepareCommit and
leave prepCommit() to call ensureOpen(), while internalPrep will call
ensureOpen(false). And also change IW code to call internalCommit. How's
that sound?
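A toy model of that split, using the names from this thread (a sketch of the pattern only, not actual IndexWriter code):

```java
// Minimal model: public entry points use ensureOpen(true) and fail while the
// writer is closing; internal paths use ensureOpen(false) and keep working.
class ToyWriter {
    private volatile boolean closing = false;
    private volatile boolean closed = false;

    private void ensureOpen(boolean failIfClosing) {
        if (closed || (failIfClosing && closing)) {
            throw new IllegalStateException("this writer is closed");
        }
    }

    public void prepareCommit() {
        ensureOpen(true);            // app-facing: reject calls during close()
        internalPrepareCommit();
    }

    private void internalPrepareCommit() {
        ensureOpen(false);           // internal: still legal while closing
        // ... flush and write the pending commit here ...
    }

    public void close() {
        closing = true;
        internalPrepareCommit();     // fine: close() uses the internal path
        closed = true;
    }

    public static void main(String[] args) {
        ToyWriter w = new ToyWriter();
        w.prepareCommit();           // OK while open
        w.close();
        try {
            w.prepareCommit();       // rejected once closed
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```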

+1 for failIfClosing !

Shai

On Fri, Nov 30, 2012 at 3:29 PM, Michael McCandless <
luc...@mikemccandless.com> wrote:

> On Fri, Nov 30, 2012 at 7:13 AM, Shai Erera  wrote:
> > I see. So two questions:
> >
> > 1) Is it ok for prepareCommit() to call ensureOpen(false)? In
> LUCENE-4575 I
> > consolidate the two prepCommit() and this is the only way it would work
> ...
>
> Hmm prepareCommit() really should somehow pass true: this API is only
> invoked by the app, not by IW internally during close.  Not sure how
> we can fix the patch to get that back ...
>
> > 2) Could you perhaps clarify the use of the second argument in the
> > javadocs?
> > Maybe also rename it to something like "fail if closing"? The name
> > "includePendingClose" is vague (perhaps reconsider?)
>
> I agree that current name is no good!  failIfClosing seems good?
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>


[jira] [Created] (SOLR-4131) Nested dismax query showing "dismax" as spellcheck result

2012-11-30 Thread Parham Mofidi (JIRA)
Parham Mofidi created SOLR-4131:
---

 Summary: Nested dismax query showing "dismax" as spellcheck result
 Key: SOLR-4131
 URL: https://issues.apache.org/jira/browse/SOLR-4131
 Project: Solr
  Issue Type: Bug
Reporter: Parham Mofidi
 Fix For: 1.4


I'm performing the following query against our Solr database:

{code}
spellcheck.count=5&facet=true&facet.limit=200&hl=true&hl.fl=name&hl.fl=shortDescription&version=1&start=0&facet.sort=index&facet.field=parentCatgroup_id_search&facet.field=mfName_ntk_cs&fq=catalog_id:"10001"&fq=storeent_id:("10001")&fq=published:1&fq=-(catenttype_id_ntk_cs:ItemBean+AND+parentCatentry_id:[*+TO+*])&timeAllowed=15000&hl.simple.post=&rows=24&debugQuery=false&facet.query=price_CAD:({*+100}+100)&facet.query=price_CAD:({100+200}+200)&facet.query=price_CAD:({200+300}+300)&facet.query=price_CAD:({300+400}+400)&facet.query=price_CAD:({400+500}+500)&facet.query=price_CAD:({500+*})&q=(%2B(mfPartNumber_ntk:(TESTSTRING)+partNumber_ntk:(TESTSTRING)))+OR+(_query_:"{!dismax+ps%3D2+mm%3D'2<-25%25'+qf%3D'longDescription+name^5+mfName^10+xf_CategoryPath^10'+pf%3D'longDescription^10+name^30+xf_CategoryPath^30'}+TESTSTRING+")&spellcheck.collate=false&hl.simple.pre=&facet.mincount=1&spellcheck=true&hl.requireFieldMatch=true
{code}

which results in the following returned XML:

{code}
[response body; the original XML markup was lost in archiving. The response 
echoed the request parameters and the parsed query, then returned spellcheck 
suggestion blocks for substrings of the raw q parameter, including "digimax, 
minimax, big max, heisman, mid max" as suggestions for the word "dismax" 
taken from the nested query's local params.]
{code}

Parts of the nested dismax query are being used as spellcheck keywords, and we 
don't quite understand why.  Is there something that we're doing wrong, which 
would cause the nested query to show up as spellcheck keywords?

Thanks.




[jira] [Commented] (LUCENE-4574) FunctionQuery ValueSource value computed twice per document

2012-11-30 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507500#comment-13507500
 ] 

Michael McCandless commented on LUCENE-4574:


I love those asserts :)

I think Robert's patch is somewhat simpler, because we don't have to add 
getFieldComparators and the new boolean nonScoring/collectsScores to all the 
specialized collector impls?

Shouldn't we just always use ScoresTwiceCollector whenever we see 
RelevanceCollector?  (Ie, remove the check for trackMax/DocScores).  Then 
TestShardSearching.testSimple should pass?

Maybe just name it ScoreCachingCollector?  (It may call .score() more than 
twice!?  I can't tell).

> FunctionQuery ValueSource value computed twice per document
> ---
>
> Key: LUCENE-4574
> URL: https://issues.apache.org/jira/browse/LUCENE-4574
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 4.0, 4.1
>Reporter: David Smiley
>Assignee: David Smiley
> Attachments: LUCENE-4574.patch, LUCENE-4574.patch, LUCENE-4574.patch, 
> LUCENE-4574.patch, Test_for_LUCENE-4574.patch
>
>
> I was working on a custom ValueSource and did some basic profiling and 
> debugging to see if it was being used optimally.  To my surprise, the value 
> was being fetched twice per document in a row.  This computation isn't 
> exactly cheap to calculate so this is a big problem.  I was able to 
> work-around this problem trivially on my end by caching the last value with 
> corresponding docid in my FunctionValues implementation.
> Here is an excerpt of the code path to the first execution:
> {noformat}
> at 
> org.apache.lucene.queries.function.docvalues.DoubleDocValues.floatVal(DoubleDocValues.java:48)
> at 
> org.apache.lucene.queries.function.FunctionQuery$AllScorer.score(FunctionQuery.java:153)
> at 
> org.apache.lucene.search.TopFieldCollector$OneComparatorScoringMaxScoreCollector.collect(TopFieldCollector.java:291)
> at org.apache.lucene.search.Scorer.score(Scorer.java:62)
> at 
> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:588)
> at 
> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:280)
> {noformat}
> And here is the 2nd call:
> {noformat}
> at 
> org.apache.lucene.queries.function.docvalues.DoubleDocValues.floatVal(DoubleDocValues.java:48)
> at 
> org.apache.lucene.queries.function.FunctionQuery$AllScorer.score(FunctionQuery.java:153)
> at 
> org.apache.lucene.search.ScoreCachingWrappingScorer.score(ScoreCachingWrappingScorer.java:56)
> at 
> org.apache.lucene.search.FieldComparator$RelevanceComparator.copy(FieldComparator.java:951)
> at 
> org.apache.lucene.search.TopFieldCollector$OneComparatorScoringMaxScoreCollector.collect(TopFieldCollector.java:312)
> at org.apache.lucene.search.Scorer.score(Scorer.java:62)
> at 
> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:588)
> at 
> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:280)
> {noformat}
> The 2nd call appears to use some score caching mechanism, which is all well 
> and good, but that same mechanism wasn't used in the first call so there's no 
> cached value to retrieve.
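The work-around described in the report, remembering the last value keyed by docid, can be sketched as a standalone toy (the interface and names are invented for illustration; this is not the actual ScoreCachingWrappingScorer):

```java
// Cache the last (doc, score) pair so that calling score() twice for the same
// document only computes the underlying value once.
class CachingScorer {
    interface Scorer {
        int docID();
        float computeScore();
    }

    private final Scorer in;
    private int cachedDoc = -1;
    private float cachedScore;
    int computations = 0;  // instrumentation for this example only

    CachingScorer(Scorer in) {
        this.in = in;
    }

    float score() {
        int doc = in.docID();
        if (doc != cachedDoc) {            // first call for this doc: compute
            cachedScore = in.computeScore();
            cachedDoc = doc;
            computations++;
        }
        return cachedScore;                // repeats are served from the cache
    }

    public static void main(String[] args) {
        Scorer stub = new Scorer() {
            public int docID() { return 7; }
            public float computeScore() { return 1.5f; }
        };
        CachingScorer cs = new CachingScorer(stub);
        cs.score();
        cs.score();                        // no recomputation
        System.out.println(cs.computations);  // 1
    }
}
```

The crux of the issue is that the first score() call in the stack traces above bypasses any such wrapper, so the second call has no cached value to reuse.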




[jira] [Commented] (LUCENE-4574) FunctionQuery ValueSource value computed twice per document

2012-11-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507499#comment-13507499
 ] 

Robert Muir commented on LUCENE-4574:
-

{quote}
So why do you hate this very simple cache so much? 
{quote}

I want things fixed correctly; the way I see it, there is a lot of bogusness:
* When Solr is only sorting by score, it should call IS.search without a Sort 
to get the faster behavior. The relevance comparator documents that it's the 
slow way.
* It's especially bad that someone can ask for fillFields=true and 
trackDocScores=true when they have a relevance comparator.
* I'm not sure trackMaxScore=true is really useful at all except when relevance 
is the only sort, in which case you should be using IS.search without a sort 
anyway. If someone really needs this combination, I think it's OK to make them 
implement their own collector.
* I don't like wrapping the scorer with this cache in this relevance 
comparator. I feel like the comparator can probably do this in a cleaner way.
* I don't like all this caching just added on a whim everywhere. I see it here, 
I see BooleanScorer2 has a cache, I see block-join query has a cache, and I see 
PositiveScoresOnlyCollector has a cache. There are already caching ValueSources 
at the ValueSource level too: look at CachingDoubleValueSource in spatial. Some 
of these are senseless. If there is a real reason, it's not documented. We 
should fix the APIs and so on instead of just adding all this caching 
everywhere.
* I think calling score() twice is bogus, but we should be fixing this 
correctly instead of hacking something in to speed up a slow FunctionQuery.

So yeah, clearly adding caches everywhere isn't the right solution to this 
stuff. I feel like I'm drowning in caches, and bug reports like this one still 
exist.

We shouldn't rush anything in because of a particularly slow function query. 
Trust me, I think it's bogus that we call score() twice; but if something goes in 
rather quickly on this issue (e.g. more caching), then I prefer that it be more 
contained so it can easily be ripped out later, when the problem is ultimately 
solved correctly.



[jira] [Commented] (SOLR-4129) Solr doesn't support log4j

2012-11-30 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507496#comment-13507496
 ] 

David Smiley commented on SOLR-4129:


Shawn, how could you have both log4j-over-slf4j and log4j itself on the 
classpath?  That doesn't make sense, and SLF4J will probably complain; even if 
it doesn't, it's still an error.  If your intent is to use log4j, then remove 
log4j-over-slf4j.

> Solr doesn't support log4j 
> ---
>
> Key: SOLR-4129
> URL: https://issues.apache.org/jira/browse/SOLR-4129
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0
>Reporter: Raintung Li
>  Labels: log
> Fix For: 4.1, 5.0
>
> Attachments: patch-4129.txt
>
>
> Many projects use log4j. Solr actually uses the SLF4J logging framework, 
> which by design integrates easily with log4j.
> Solr ships log4j-over-slf4j.jar to handle the log4j case.
> This jar has some issues:
> a. Calls ultimately go through SLF4J to the bound logger (for Solr that is 
> JDK 1.4 logging).
> b. It does not implement the whole log4j API, e.g. Logger.setLevel().
> c. JDK 1.4 logging lacks some log4j features, e.g. thread info and daily 
> rolling.
> Some dependent projects already use log4j, and customers still want to use 
> it. JDK 1.4 logging differs from log4j in many ways; at the very least, the 
> configuration files can't be reused.
> The bad part is that log4j-over-slf4j.jar conflicts with log4j itself: to 
> use Solr, the other projects have to remove log4j.
> I think Solr shouldn't use log4j-over-slf4j.jar, and should keep using log4j 
> when the customer wants it.




[jira] [Commented] (SOLR-4129) Solr doesn't support log4j

2012-11-30 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507489#comment-13507489
 ] 

Shawn Heisey commented on SOLR-4129:


I've been trying to work out what is being said here and I'm finding it hard to 
follow.  I'm especially confused by the part about log4j-over-slf4j.

If this means that I will be able to change logging levels for log4j within the 
GUI, I'm all for it.  I'm less concerned about being able to view the log 
within the GUI, but that would be a nice addition.

I use log4j with Solr, and I've included log4j-over-slf4j.  To make it a lot 
easier to build that way, I patched solr/build.xml; that patch is attached to 
SOLR-3918.  I have the following jars in jetty's lib/ext:

jcl-over-slf4j-1.7.2.jar
log4j-1.2.17.jar
log4j-over-slf4j-1.7.2.jar
slf4j-api-1.7.2.jar
slf4j-log4j12-1.7.2.jar






[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

2012-11-30 Thread Amber Duque (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507488#comment-13507488
 ] 

Amber Duque commented on SOLR-2242:
---

I have a question on the SOLR-2242-solr40-3.patch.
I have applied this patch on top of the Solr 4.0 release 
(http://svn.apache.org/repos/asf/lucene/dev/tags/ - lucene_solr_4_0_0).
The patch builds fine, but several solr unit tests fail:

Tests with failures:
  - org.apache.solr.request.TestFaceting.testFacets
  - org.apache.solr.request.TestFaceting.testRegularBig
  - org.apache.solr.cloud.BasicDistributedZkTest.testDistribSearch
  - org.apache.solr.TestDistributedSearch.testDistribSearch
  - org.apache.solr.TestDistributedGrouping.testDistribSearch
  - org.apache.solr.request.SimpleFacetsTest (suite)
  - org.apache.solr.TestGroupingSearch.testRandomGrouping
  - org.apache.solr.TestGroupingSearch.testGroupingGroupedBasedFaceting
  - org.apache.solr.cloud.BasicDistributedZk2Test.testDistribSearch

Do the unit tests pass successfully for anyone (for this patch applied on top 
of the solr 4.0 release)?

Thanks!

> Get distinct count of names for a facet field
> -
>
> Key: SOLR-2242
> URL: https://issues.apache.org/jira/browse/SOLR-2242
> Project: Solr
>  Issue Type: New Feature
>  Components: Response Writers
>Affects Versions: 4.0-ALPHA
>Reporter: Bill Bell
>Priority: Minor
> Fix For: 4.1
>
> Attachments: SOLR-2242-3x_5_tests.patch, SOLR-2242-3x.patch, 
> SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, 
> SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, 
> SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR-2242-solr40-3.patch
>
>
> When returning facet.field=<field> you will get a list of matches for 
> distinct values. This is normal behavior. This patch tells you how many 
> distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> Parameters:
> facet.numTerms or f.<field>.facet.numTerms = true (default is false) - turn 
> on distinct counting of terms
> facet.field - the field to count the terms
> It creates a new section in the facet section...
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numTerms=true&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numTerms=false&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numTerms=true&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> [response XML markup stripped by the mail archive; the new numTerms section
> reports 14 for the sharded responses, and 14 with no sharding]
> {code} 
> Several people use this to get the group.field count (the # of groups).
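The semantics of `facet.numTerms` — reporting the number of distinct values rather than each value's count — can be modeled in plain Java. This is an illustration of the idea only, not the patch's implementation:

```java
import java.util.*;

public class NameDistinctDemo {
    /** Per-value facet counts, as facet.field with mincount=1 would return them. */
    static Map<String, Integer> facetCounts(List<String> fieldValues) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String v : fieldValues) counts.merge(v, 1, Integer::sum);
        return counts;
    }

    /** What numTerms adds: the count of distinct values (# of rows), not their frequencies. */
    static int numTerms(Map<String, Integer> counts) {
        return counts.size();
    }

    public static void main(String[] args) {
        List<String> prices = Arrays.asList("9.99", "19.99", "9.99", "4.50");
        Map<String, Integer> counts = facetCounts(prices);
        System.out.println(counts);           // {19.99=1, 4.50=1, 9.99=2}
        System.out.println(numTerms(counts)); // 3 distinct values
    }
}
```

This is also why the feature works as a cheap group count: the number of distinct values of the `group.field` equals the number of groups.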




[jira] [Commented] (LUCENE-4574) FunctionQuery ValueSource value computed twice per document

2012-11-30 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507476#comment-13507476
 ] 

David Smiley commented on LUCENE-4574:
--

Indeed this is complicated.  It's par for the course in this part of Lucene, I 
figure.

bq. Is there some way we could invert this (e.g. so its boolean collectsScores 
or something?).

I hear ya; it was getting late and when I originally named the variable it was 
appropriate since it was only for the nonScoring collectors.

RE ScoringTwiceCollector, I'll just trust you that this approach makes sense, as 
I'm not familiar with the distinguishing details among the half-dozen 
collectors to know that ScoringTwiceCollector can always be used in lieu of 
those.

I applied your patch.  Then I applied the part of my patch for 
RelevanceComparator to detect when a score for the same doc is fetched twice in 
a row.  I recognized a small bug there in which I forgot to re-initialize 
_lastDocId to -1 on setScorer(), so I fixed that trivially.

But TestShardSearching.testSimple() still fails.  So I took a closer look at the 
Collector calling it to see why.  OneComparatorNonScoringCollector.collect() 
opens with this:
{code:java}
public void collect(int doc) throws IOException {
  ++totalHits;
  if (queueFull) {
if ((reverseMul * comparator.compareBottom(doc)) <= 0) {
  // since docs are visited in doc Id order, if compare is 0, it means
  // this document is larger than anything else in the queue, and
  // therefore not competitive.
  return;
}

// This hit is competitive - replace bottom element in queue & adjustTop
comparator.copy(bottom.slot, doc);
...
{code} 

Notice that there is a call to comparator.compareBottom() and comparator.copy() 
here.  Both of these for RelevanceComparator fetch the score.

So maybe RelevanceComparator.setScorer still needs to wrap its scorer with the 
caching one for such cases.
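The score-caching wrapper under discussion (Lucene's ScoreCachingWrappingScorer, visible in the second stack trace of the issue) boils down to the following pattern, sketched here with a minimal stand-in `Scorer` interface rather than the real Lucene classes: compareBottom() and copy() can both call score() for the current doc, but only one real computation happens.

```java
public class ScoreCacheDemo {
    /** Minimal stand-in for Lucene's Scorer; not the real API. */
    interface Scorer {
        int docID();
        float score();
    }

    /** Caches score() per docID so two calls for the same doc cost one computation. */
    static class ScoreCachingScorer implements Scorer {
        private final Scorer in;
        private int curDoc = -1;
        private float curScore;
        int realScoreCalls = 0; // exposed so the demo can count delegate calls

        ScoreCachingScorer(Scorer in) { this.in = in; }

        @Override public int docID() { return in.docID(); }

        @Override public float score() {
            int doc = in.docID();
            if (doc != curDoc) {        // first request for this doc: compute and remember
                curScore = in.score();
                curDoc = doc;
                realScoreCalls++;
            }
            return curScore;            // subsequent requests are free
        }
    }

    public static void main(String[] args) {
        // A fake scorer positioned on one document with an "expensive" score.
        Scorer expensive = new Scorer() {
            public int docID() { return 42; }
            public float score() { return 1.5f; }
        };
        ScoreCachingScorer cached = new ScoreCachingScorer(expensive);
        // compareBottom() and copy() each ask for the score of the same doc:
        float s1 = cached.score();
        float s2 = cached.score();
        System.out.println(s1 == s2);              // true
        System.out.println(cached.realScoreCalls); // 1
    }
}
```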

So why do you hate this very simple cache so much?  Figuring out exactly when 
to cache and when not to has the adverse effect of further complicating 
already complicated code, and as a consequence it increases the risk that at 
some point, after future changes, the conditions become wrong and a query takes 
twice as long.  But this cache is so lightweight that it would probably be too 
hard to measure any appreciable cost of doing unnecessary caching.


Re: pro coding style

2012-11-30 Thread Dawid Weiss
> Caught on slowly... I had been using it before I became a Lucene

Yep, so did I, albeit in a slightly different flavor -- always starting
from a static seed and running a certain number of randomized
iterations of things, usually higher level. A kind of sanity checking, I
guess. I don't know why I hadn't thought of just picking a different
seed every time.
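The "static seed, many randomized iterations" style described above is easy to sketch without any framework (a hypothetical toy property, not the randomizedtesting package itself):

```java
import java.util.Random;

public class SeededRandomTestDemo {
    /** The toy property under test: reversing a string twice gives it back. */
    static String reverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    static void runRandomizedIterations(long seed, int iterations) {
        Random random = new Random(seed); // a static seed makes failures reproducible
        for (int i = 0; i < iterations; i++) {
            // Generate a random lowercase input of random length.
            int len = random.nextInt(20);
            StringBuilder sb = new StringBuilder();
            for (int k = 0; k < len; k++) sb.append((char) ('a' + random.nextInt(26)));
            String s = sb.toString();
            if (!reverse(reverse(s)).equals(s)) {
                // Reporting the seed lets anyone replay the exact failing run.
                throw new AssertionError("failed for seed=" + seed + " input=" + s);
            }
        }
    }

    public static void main(String[] args) {
        runRandomizedIterations(42L, 1000);
        System.out.println("1000 randomized iterations passed");
    }
}
```

Picking a fresh seed per run (and printing it on failure) is the one extra step that turns this sanity check into the seed-replayable randomized testing the thread is about.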

> But yeah, it's only become a religion here recently.

Come on, I don't think it's that bad :) We may differ in opinions on
certain things (like which tests to run and when) but I think everyone
shares the same overall goal of having well tested code.

Dawid




[jira] [Commented] (LUCENE-4499) Multi-word synonym filter (synonym expansion)

2012-11-30 Thread Roman Chyla (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507440#comment-13507440
 ] 

Roman Chyla commented on LUCENE-4499:
-

Hi Nolan, your case seems to confirm the need for some solution. You have decided 
to make a separate query parser; I have put the expanding logic into a query 
parser as well.

See this for the working example:
https://github.com/romanchyla/montysolr/blob/master/contrib/adsabs/src/test/org/apache/solr/analysis/TestAdsabsTypeFulltextParsing.java

And its config
https://github.com/romanchyla/montysolr/blob/master/contrib/examples/adsabs/solr/collection1/conf/schema.xml#L325

I see two added benefits (besides not needing a query parser plugin -- in our 
case, it must be plugged into our qparser):

 1. you can use the filter at index/query time inside a standard query parser
 2. special configuration for synonym expansion (for example, we have found it 
very useful to be able to search for multi-tokens case-insensitively but to 
recognize single tokens only case-sensitively; or to expand with multi-token 
synonyms only for multi-word originals and also output the original words, 
otherwise eating them (replacing them))

Nice blog post; I wish I could write as instructively :)

> Multi-word synonym filter (synonym expansion)
> -
>
> Key: LUCENE-4499
> URL: https://issues.apache.org/jira/browse/LUCENE-4499
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/other
>Affects Versions: 4.1, 5.0
>Reporter: Roman Chyla
>Priority: Minor
>  Labels: analysis, multi-word, synonyms
> Fix For: 5.0
>
> Attachments: LUCENE-4499.patch
>
>
> I apologize for bringing the multi-token synonym expansion up again. There is 
> an old, unresolved issue at LUCENE-1622 [1]
> While solving the problem for our needs [2], I discovered that the current 
> SolrSynonym parser (and the wonderful FTS) have almost everything to 
> satisfactorily handle both the query and index time synonym expansion. It 
> seems that people often need to use the synonym filter *slightly* differently 
> at indexing and query time.
> In our case, we must do different things during indexing and querying.
> Example sentence: Mirrors of the Hubble space telescope pointed at XA5
> This is what we need (comma marks position bump):
> indexing: mirrors,hubble|hubble space 
> telescope|hst,space,telescope,pointed,xa5|astroobject#5
> querying: +mirrors +(hubble space telescope | hst) +pointed 
> +(xa5|astroboject#5)
> This translated to following needs:
>   indexing time: 
> single-token synonyms => return only synonyms
> multi-token synonyms => return original tokens *AND* the synonyms
>   query time:
> single-token: return only synonyms (but preserve case)
> multi-token: return only synonyms
>  
> We need the original tokens for the proximity queries, if we indexed 'hubble 
> space telescope'
> as one token, we cannot search for 'hubble NEAR telescope'
> You may (not) be surprised, but Lucene already supports ALL of these 
> requirements. The patch is an attempt to state the problem differently. I am 
> not sure if it is the best option, however it works perfectly for our needs 
> and it seems it could work for general public too. Especially if the 
> SynonymFilterFactory had a preconfigured sets of SynonymMapBuilders - and 
> people would just choose what situation they use. Please look at the unittest.
> links:
> [1] https://issues.apache.org/jira/browse/LUCENE-1622
> [2] http://labs.adsabs.harvard.edu/trac/ads-invenio/ticket/158
> [3] seems to have similar request: 
> http://lucene.472066.n3.nabble.com/Proposal-Full-support-for-multi-word-synonyms-at-query-time-td4000522.html




Re: Reenable Solr tests on Jenkins

2012-11-30 Thread Robert Muir
On Fri, Nov 30, 2012 at 7:46 AM, Uwe Schindler  wrote:

> Can we re-enable Solr tests on Jenkins?
> As a workaround for the blackhole, I would add an unless="freebsd" -like
> check to the Solr's "test" target. We should maybe work on a solution to
> make the blackhole work correctly or make SolrJettyRunner detect the
> blackhole and enable the simple non-NIO connector.
>

I have a few questions about the blackhole, because I want to understand
what the real scope of this issue is.

1. First of all, there is the "normal" blackhole, where nothing comes
back from a closed TCP port. I don't think it is "special" at all;
actually lots of people have firewalls (on many operating systems), and
this is the kind of behavior you see if you have network outages and so on.
So it's scary if things don't work right in this situation.

2. Then there is the "crazy" blackhole (our Apache Jenkins server), which
applies this also to the localhost interface, 127.0.0.1... that's just over
the top.

Is the problem specific to testing, or also real code? And does it only
impact #2 (crazy blackholes), or is it a more general problem (#1)?


Re: Reenable Solr tests on Jenkins

2012-11-30 Thread Robert Muir
On Fri, Nov 30, 2012 at 11:11 AM, Mark Miller  wrote:

>
>
> Just be aggressive and assume FreeBSD is black holed?
>
> - Mark
>
>
>
Or just have a -D we provide, with our apache jenkins server setting it to
true.


Re: Reenable Solr tests on Jenkins

2012-11-30 Thread Mark Miller

On Nov 30, 2012, at 11:08 AM, Sami Siren  wrote:

> On Fri, Nov 30, 2012 at 2:46 PM, Uwe Schindler  wrote:
> Can we re-enable Solr tests on Jenkins?
> 
> +1
>  
> As a workaround for the blackhole, I would add an unless="freebsd" -like 
> check to the Solr's "test" target. We should maybe work on a solution to make 
> the blackhole work correctly or make SolrJettyRunner detect the blackhole and 
> enable the simple non-NIO connector.
> 
> Any ideas on how to detect the blackhole?

Just be aggressive and assume FreeBSD is black holed?

- Mark





Re: Reenable Solr tests on Jenkins

2012-11-30 Thread Sami Siren
On Fri, Nov 30, 2012 at 2:46 PM, Uwe Schindler  wrote:

> Can we re-enable Solr tests on Jenkins?
>

+1


> As a workaround for the blackhole, I would add an unless="freebsd" -like
> check to the Solr's "test" target. We should maybe work on a solution to
> make the blackhole work correctly or make SolrJettyRunner detect the
> blackhole and enable the simple non-NIO connector.
>

Any ideas on how to detect the blackhole?
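One conceivable probe (an illustration only, not anything that was committed): attempt a TCP connect to a port that should be closed, with a short timeout. On a normal host the connect fails fast with "connection refused"; on a blackholed interface the packets are silently dropped and the connect hangs until the timeout.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class BlackholeProbe {
    /**
     * Returns true if connecting to the port hangs until the timeout
     * (blackhole-like) rather than failing fast with "connection refused".
     */
    static boolean looksBlackholed(String host, int port, int timeoutMillis) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), timeoutMillis);
            return false; // something is actually listening
        } catch (SocketTimeoutException e) {
            return true;  // no RST came back: packets are being dropped
        } catch (IOException e) {
            return false; // fast ConnectException: normal closed-port behavior
        }
    }

    public static void main(String[] args) {
        // Port 1 is almost never in use; on a normal loopback this refuses fast.
        System.out.println(looksBlackholed("127.0.0.1", 1, 1000));
    }
}
```

The trade-off is the timeout itself: on a genuinely blackholed box every probe costs the full timeout, which is why simply assuming the FreeBSD box is blackholed (or passing a -D flag from Jenkins) may be the more pragmatic route.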

--
 Sami Siren


Re: Fullmetal Jenkins: Solr - Build # 356 - Failure!

2012-11-30 Thread Mark Miller
Weird fails that seem to be out of the blue…output shows some trouble with svn 
at the start…

I updated the machine and rebooted. Will see how it goes…

- Mark

On Nov 30, 2012, at 7:11 AM, nore...@fullmetaljenkins.org wrote:

> Solr - Build # 356 - Failure:
> 
> Check console output at http://fullmetaljenkins.org/job/Solr/356/ to view the 
> results.
> 
> 9 tests failed.
> FAILED:  
> junit.framework.TestSuite.org.apache.solr.cloud.BasicDistributedZkTest
> 
> Error Message:
> 2 threads leaked from SUITE scope at 
> org.apache.solr.cloud.BasicDistributedZkTest: 1) Thread[id=27, 
> name=TEST-BasicDistributedZkTest.testDistribSearch-seed#[5727B28679A071E]-EventThread,
>  state=WAITING, group=TGRP-BasicDistributedZkTest] at 
> sun.misc.Unsafe.park(Native Method) at 
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:186) at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
>  at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)   
>   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:491) 
>2) Thread[id=26, 
> name=TEST-BasicDistributedZkTest.testDistribSearch-seed#[5727B28679A071E]-SendThread(localhost:57177),
>  state=TIMED_WAITING, group=TGRP-BasicDistributedZkTest] at 
> java.lang.Thread.sleep(Native Method) at 
> org.apache.zookeeper.client.StaticHostProvider.next(StaticHostProvider.java:86)
>  at 
> org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:937)  
>at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:993)
> 
> Stack Trace:
> com.carrotsearch.randomizedtesting.ThreadLeakError: 2 threads leaked from 
> SUITE scope at org.apache.solr.cloud.BasicDistributedZkTest: 
>   1) Thread[id=27, 
> name=TEST-BasicDistributedZkTest.testDistribSearch-seed#[5727B28679A071E]-EventThread,
>  state=WAITING, group=TGRP-BasicDistributedZkTest]
>at sun.misc.Unsafe.park(Native Method)
>at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
>at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
>at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:491)
>   2) Thread[id=26, 
> name=TEST-BasicDistributedZkTest.testDistribSearch-seed#[5727B28679A071E]-SendThread(localhost:57177),
>  state=TIMED_WAITING, group=TGRP-BasicDistributedZkTest]
>at java.lang.Thread.sleep(Native Method)
>at 
> org.apache.zookeeper.client.StaticHostProvider.next(StaticHostProvider.java:86)
>at 
> org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:937)
>at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:993)
>   at __randomizedtesting.SeedInfo.seed([5727B28679A071E]:0)
> 
> 
> FAILED:  
> junit.framework.TestSuite.org.apache.solr.cloud.BasicDistributedZkTest
> 
> Error Message:
> There are still zombie threads that couldn't be terminated:1) 
> Thread[id=26, 
> name=TEST-BasicDistributedZkTest.testDistribSearch-seed#[5727B28679A071E]-SendThread(localhost:57177),
>  state=TIMED_WAITING, group=TGRP-BasicDistributedZkTest] at 
> java.lang.Thread.sleep(Native Method) at 
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:984)
> 
> Stack Trace:
> com.carrotsearch.randomizedtesting.ThreadLeakError: There are still zombie 
> threads that couldn't be terminated:
>   1) Thread[id=26, 
> name=TEST-BasicDistributedZkTest.testDistribSearch-seed#[5727B28679A071E]-SendThread(localhost:57177),
>  state=TIMED_WAITING, group=TGRP-BasicDistributedZkTest]
>at java.lang.Thread.sleep(Native Method)
>at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:984)
>   at __randomizedtesting.SeedInfo.seed([5727B28679A071E]:0)
> 
> 
> FAILED:  junit.framework.TestSuite.org.apache.solr.cloud.OverseerTest
> 
> Error Message:
> 2 threads leaked from SUITE scope at org.apache.solr.cloud.OverseerTest: 
> 1) Thread[id=23, 
> name=TEST-OverseerTest.testDoubleAssignment-seed#[5727B28679A071E]-EventThread,
>  state=WAITING, group=TGRP-OverseerTest] at 
> sun.misc.Unsafe.park(Native Method) at 
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:186) at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
>  at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)   
>   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:491) 
>2) Thread[id=22, 
> name=TEST-OverseerTest.testDoubleAssignment-seed#[5727B28679A071E]-SendThread(localhost:60250),
>  state=TIMED_WAITING, group=TGRP-OverseerTest] at 
> java.lang.Thread.sleep(Native Method) at 

Re: pro coding style

2012-11-30 Thread Yonik Seeley
On Fri, Nov 30, 2012 at 9:52 AM, Dawid Weiss
 wrote:
>> RandomizedTesting for the win!  Thanks a ton Dawid.
>
> I didn't invent this thing, I merely wrapped it up, cleaned up the
> rough edges and extracted it to a stand-alone package. Lucene/Solr
> contributors should be credited for introducing the concept. And
> there's also research literature dating waaay back, so I don't think
> the concept is entirely new -- it just never caught on.

Caught on slowly... I had been using it before I became a Lucene
committer in '05 and used it in Lucene/Solr for anything that had
enough complexity to warrant it.

https://issues.apache.org/jira/browse/LUCENE-395?focusedCommentId=12356746&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12356746

And one of my personal favorites, I think the first random indexing
test - TestStressIndexing2
https://issues.apache.org/jira/browse/LUCENE-1173?focusedCommentId=12567845&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12567845

But yeah, it's only become a religion here recently.
The support in the framework is certainly welcome!

-Yonik
http://lucidworks.com




[jira] [Updated] (LUCENE-4574) FunctionQuery ValueSource value computed twice per document

2012-11-30 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-4574:


Attachment: LUCENE-4574.patch

this is getting a little complicated.

here's an alternative patch (I just hacked it up quickly).

in the case where you have relevance comparator and also track 
scores/maxScores, we just return a "ScoresTwiceCollector".

I think this approach might be less dangerous, so we don't break common cases 
for a special case.

Something to think about: your test passes (I didn't even run other tests).


> FunctionQuery ValueSource value computed twice per document
> ---
>
> Key: LUCENE-4574
> URL: https://issues.apache.org/jira/browse/LUCENE-4574
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 4.0, 4.1
>Reporter: David Smiley
>Assignee: David Smiley
> Attachments: LUCENE-4574.patch, LUCENE-4574.patch, LUCENE-4574.patch, 
> LUCENE-4574.patch, Test_for_LUCENE-4574.patch
>
>
> I was working on a custom ValueSource and did some basic profiling and 
> debugging to see if it was being used optimally.  To my surprise, the value 
> was being fetched twice per document in a row.  This computation isn't 
> exactly cheap to calculate so this is a big problem.  I was able to 
> work-around this problem trivially on my end by caching the last value with 
> corresponding docid in my FunctionValues implementation.
> Here is an excerpt of the code path to the first execution:
> {noformat}
> at 
> org.apache.lucene.queries.function.docvalues.DoubleDocValues.floatVal(DoubleDocValues.java:48)
> at 
> org.apache.lucene.queries.function.FunctionQuery$AllScorer.score(FunctionQuery.java:153)
> at 
> org.apache.lucene.search.TopFieldCollector$OneComparatorScoringMaxScoreCollector.collect(TopFieldCollector.java:291)
> at org.apache.lucene.search.Scorer.score(Scorer.java:62)
> at 
> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:588)
> at 
> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:280)
> {noformat}
> And here is the 2nd call:
> {noformat}
> at 
> org.apache.lucene.queries.function.docvalues.DoubleDocValues.floatVal(DoubleDocValues.java:48)
> at 
> org.apache.lucene.queries.function.FunctionQuery$AllScorer.score(FunctionQuery.java:153)
> at 
> org.apache.lucene.search.ScoreCachingWrappingScorer.score(ScoreCachingWrappingScorer.java:56)
> at 
> org.apache.lucene.search.FieldComparator$RelevanceComparator.copy(FieldComparator.java:951)
> at 
> org.apache.lucene.search.TopFieldCollector$OneComparatorScoringMaxScoreCollector.collect(TopFieldCollector.java:312)
> at org.apache.lucene.search.Scorer.score(Scorer.java:62)
> at 
> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:588)
> at 
> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:280)
> {noformat}
> The 2nd call appears to use some score caching mechanism, which is all well 
> and good, but that same mechanism wasn't used in the first call so there's no 
> cached value to retrieve.
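The work-around described in the report (cache the last computed value keyed by docid, so the second call per document is free) can be sketched as follows. This is a self-contained illustration with made-up names, not Lucene's actual FunctionValues API:

```java
// Self-contained sketch of the caching work-around described above:
// remember the last docid and its value so a repeated call for the
// same document does not recompute. Names are illustrative only.
public class CachingValuesSketch {
    interface Expensive { double compute(int doc); }

    static class CachingValues {
        private final Expensive source;
        private int lastDoc = -1;
        private double lastValue;
        private int computations = 0; // counts real computations, for demonstration

        CachingValues(Expensive source) { this.source = source; }

        double doubleVal(int doc) {
            if (doc != lastDoc) {          // recompute only for a new docid
                lastValue = source.compute(doc);
                lastDoc = doc;
                computations++;
            }
            return lastValue;
        }

        int computations() { return computations; }
    }

    public static void main(String[] args) {
        CachingValues v = new CachingValues(doc -> doc * 2.0);
        v.doubleVal(5);
        v.doubleVal(5); // second call for the same doc hits the cache
        v.doubleVal(6);
        System.out.println(v.computations()); // prints 2
    }
}
```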

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: pro coding style

2012-11-30 Thread Adrien Grand
On Fri, Nov 30, 2012 at 3:48 PM, David Smiley (@MITRE.org) <
dsmi...@mitre.org> wrote:

> RandomizedTesting for the win!  Thanks a ton Dawid.
>

+1

-- 
Adrien


[jira] [Commented] (LUCENE-4574) FunctionQuery ValueSource value computed twice per document

2012-11-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507355#comment-13507355
 ] 

Robert Muir commented on LUCENE-4574:
-

one thing that made it hard for me to review:
{code}
boolean nonScoring;//nonScoring AND not out of order
{code}

Is there some way we could invert this (e.g. so it's a boolean collectsScores or
something)?

Can we simplify the out-of-orderness somehow, by fixing this CachingWrapper to
support out of order collectors? (Note I haven't thought about this at all, but I
feel like Uwe did some cool trick in constant-scorer along these lines...)

the (NOT AND NOT) is hard on the brain cells :)

> FunctionQuery ValueSource value computed twice per document
> ---
>
> Key: LUCENE-4574
> URL: https://issues.apache.org/jira/browse/LUCENE-4574
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 4.0, 4.1
>Reporter: David Smiley
>Assignee: David Smiley
> Attachments: LUCENE-4574.patch, LUCENE-4574.patch, LUCENE-4574.patch, 
> Test_for_LUCENE-4574.patch
>
>
> I was working on a custom ValueSource and did some basic profiling and 
> debugging to see if it was being used optimally.  To my surprise, the value 
> was being fetched twice per document in a row.  This computation isn't 
> exactly cheap to calculate so this is a big problem.  I was able to 
> work-around this problem trivially on my end by caching the last value with 
> corresponding docid in my FunctionValues implementation.
> Here is an excerpt of the code path to the first execution:
> {noformat}
> at 
> org.apache.lucene.queries.function.docvalues.DoubleDocValues.floatVal(DoubleDocValues.java:48)
> at 
> org.apache.lucene.queries.function.FunctionQuery$AllScorer.score(FunctionQuery.java:153)
> at 
> org.apache.lucene.search.TopFieldCollector$OneComparatorScoringMaxScoreCollector.collect(TopFieldCollector.java:291)
> at org.apache.lucene.search.Scorer.score(Scorer.java:62)
> at 
> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:588)
> at 
> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:280)
> {noformat}
> And here is the 2nd call:
> {noformat}
> at 
> org.apache.lucene.queries.function.docvalues.DoubleDocValues.floatVal(DoubleDocValues.java:48)
> at 
> org.apache.lucene.queries.function.FunctionQuery$AllScorer.score(FunctionQuery.java:153)
> at 
> org.apache.lucene.search.ScoreCachingWrappingScorer.score(ScoreCachingWrappingScorer.java:56)
> at 
> org.apache.lucene.search.FieldComparator$RelevanceComparator.copy(FieldComparator.java:951)
> at 
> org.apache.lucene.search.TopFieldCollector$OneComparatorScoringMaxScoreCollector.collect(TopFieldCollector.java:312)
> at org.apache.lucene.search.Scorer.score(Scorer.java:62)
> at 
> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:588)
> at 
> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:280)
> {noformat}
> The 2nd call appears to use some score caching mechanism, which is all well 
> and good, but that same mechanism wasn't used in the first call so there's no 
> cached value to retrieve.




[jira] [Commented] (SOLR-4129) Solr doesn't support log4j

2012-11-30 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507352#comment-13507352
 ] 

David Smiley commented on SOLR-4129:


+1 Patch looks good.

> Solr doesn't support log4j 
> ---
>
> Key: SOLR-4129
> URL: https://issues.apache.org/jira/browse/SOLR-4129
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0
>Reporter: Raintung Li
>  Labels: log
> Fix For: 4.1, 5.0
>
> Attachments: patch-4129.txt
>
>
> Many projects use log4j directly; Solr uses the slf4j logging framework, which 
> by design integrates easily with log4j.
> Solr uses log4j-over-slf4j.jar to handle the log4j case.
> This jar has some issues:
> a. It ultimately delegates to slf4j to print the log output (for Solr that is 
> JDK 1.4 java.util.logging).
> b. It does not implement all of the log4j API, e.g. Logger.setLevel().
> c. JDK 1.4 logging misses some features, e.g. per-thread info and daily rolling.
> Some dependent projects already use log4j, and their users still want to use 
> it. JDK 1.4 logging differs from log4j in many ways; at the very least the 
> configuration files cannot be reused.
> The bad thing is that log4j-over-slf4j.jar conflicts with log4j, so other 
> projects that use Solr have to remove log4j.
> I think Solr shouldn't use log4j-over-slf4j.jar, and should keep using log4j 
> if the customer wants it.




Re: pro coding style

2012-11-30 Thread Robert Muir
On Fri, Nov 30, 2012 at 9:10 AM, Mark Miller  wrote:

>
> On Nov 30, 2012, at 8:56 AM, Robert Muir  wrote:
>
> > but git by itself, is pretty unusable.
>
> Given the number of committers that eat some pain to use git when
> developing lucene/solr, and have no github or pull requests, I'm not sure
> that's a common thought :)
>
>
Sure, some people might disagree with me.
I'm more than willing to eat some pain if it makes contributions easier.

I just feel like a lot of what makes github successful is unfortunately
actually in github and not git.

Its like if your development team is screaming for linux machines. You have
to be careful how to interpret that. If you hand them a bunch of machines
with just linux kernels, they probably won't be productive. When they
scream for "linux" they want a userland with a shell, compiler, X-windows,
editor and so on too.


Re: pro coding style

2012-11-30 Thread Mark Miller

On Nov 30, 2012, at 8:56 AM, Robert Muir  wrote:

> but git by itself, is pretty unusable.

Given the number of committers that eat some pain to use git when developing 
lucene/solr, and have no github or pull requests, I'm not sure that's a common 
thought :)

- Mark



[jira] [Updated] (SOLR-4129) Solr doesn't support log4j

2012-11-30 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-4129:
--

Fix Version/s: 5.0
   4.1

> Solr doesn't support log4j 
> ---
>
> Key: SOLR-4129
> URL: https://issues.apache.org/jira/browse/SOLR-4129
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0
>Reporter: Raintung Li
>  Labels: log
> Fix For: 4.1, 5.0
>
> Attachments: patch-4129.txt
>
>
> Many projects use log4j directly; Solr uses the slf4j logging framework, which 
> by design integrates easily with log4j.
> Solr uses log4j-over-slf4j.jar to handle the log4j case.
> This jar has some issues:
> a. It ultimately delegates to slf4j to print the log output (for Solr that is 
> JDK 1.4 java.util.logging).
> b. It does not implement all of the log4j API, e.g. Logger.setLevel().
> c. JDK 1.4 logging misses some features, e.g. per-thread info and daily rolling.
> Some dependent projects already use log4j, and their users still want to use 
> it. JDK 1.4 logging differs from log4j in many ways; at the very least the 
> configuration files cannot be reused.
> The bad thing is that log4j-over-slf4j.jar conflicts with log4j, so other 
> projects that use Solr have to remove log4j.
> I think Solr shouldn't use log4j-over-slf4j.jar, and should keep using log4j 
> if the customer wants it.




Re: pro coding style

2012-11-30 Thread Robert Muir
On Fri, Nov 30, 2012 at 8:50 AM, Per Steffensen  wrote:

> Robert Muir skrev:
>
>  Is it really git? Because its my understanding pull requests aren't
>> actually a git thing but a github thing.
>>
>> The distinction is important.
>>
>
> Actually I'm not sure. I have never used git outside github, but at least
> part of it has to be git and not github (I think) - or else I couldn't
> imagine how you get the advantages you get. Remember that when using git
> you actually run a "repository" on every developer's local machine. When
> you commit, you commit only to your local "repository". You need to "push"
> in order to have it "upstreamed" (as they call it).
>
>
Right, I'm positive this (pull requests) is github :)

I just wanted make this point: when we have discussions about using git
instead of svn, I'm not sure it makes things easier on anyone, actually
probably worse and more complex.

It's the github workflow that contributors want (I would +1 some scheme that
supports this!), but git by itself is pretty unusable.

Github is like a nice front-end to this mess.


Re: pro coding style

2012-11-30 Thread Per Steffensen

Robert Muir skrev:
Is it really git? Because its my understanding pull requests aren't 
actually a git thing but a github thing.


The distinction is important.


Actually I'm not sure. I have never used git outside github, but at least 
part of it has to be git and not github (I think) - or else I couldn't 
imagine how you get the advantages you get. Remember that when using git 
you actually run a "repository" on every developer's local machine. When 
you commit, you commit only to your local "repository". You need to 
"push" in order to have it "upstreamed" (as they call it).





Re: IndexWriter.ensureOpen and ensureOpen(boolean)

2012-11-30 Thread Robert Muir
On Fri, Nov 30, 2012 at 8:29 AM, Michael McCandless <
luc...@mikemccandless.com> wrote:

>
> > 2) Could you perhaps clarify the use of the second argument in the
> javadocs?
> > Maybe also rename it to something like "fail if closing"? The name
> > "includePendingClose" is vague (perhaps reconsider?).
>
> I agree that current name is no good!  failIfClosing seems good?
>
>
+1. a boolean for ensureOpen is intimidating: fixing up the param
name/documentation here makes it easier to understand.

I had no idea what this boolean was doing from the current
code/explanation: but shai's explanation makes sense.


Re: IndexWriter.ensureOpen and ensureOpen(boolean)

2012-11-30 Thread Michael McCandless
On Fri, Nov 30, 2012 at 7:13 AM, Shai Erera  wrote:
> I see. So two questions:
>
> 1) Is it ok for prepareCommit() to call ensureOpen(false)? In LUCENE-4575 I
> consolidate the two prepCommit() and this is the only way it would work ...

Hmm prepareCommit() really should somehow pass true: this API is only
invoked by the app, not by IW internally during close.  Not sure how
we can fix the patch to get that back ...

> 2) Could you perhaps clarify the use of the second argument in the javadocs?
> Maybe also rename it to something like "fail if closing"? The name
> "includePendingClose" is vague (perhaps reconsider?).

I agree that current name is no good!  failIfClosing seems good?

Mike McCandless

http://blog.mikemccandless.com




Re: pro coding style

2012-11-30 Thread Dawid Weiss
> When you just do not think of randomized tests as a "replacement for
> boundary condition tests" etc

I never claimed they were; in fact, I always make it very explicit
that it's just another tool for yet another type of problems. I
typically write the tests for the conditions I can think of and put a
randomized test as an addition. And guess what typically fails first
;)

Dawid




[jira] [Commented] (LUCENE-4499) Multi-word synonym filter (synonym expansion)

2012-11-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507307#comment-13507307
 ] 

Robert Muir commented on LUCENE-4499:
-

Hi Nolan: 

There is a mistake in your blog post:

{quote}
Finally, and most seriously, the SynonymFilterFactory will simply not match 
multi-word synonyms in user queries if you do any kind of tokenization. This is 
because the tokenizer breaks up the input before the SynonymFilterFactory can 
transform it.
{quote}

This is not true. There is nothing wrong with SynonymFilter here. The bug is 
actually in the lucene queryparser (LUCENE-2605).


> Multi-word synonym filter (synonym expansion)
> -
>
> Key: LUCENE-4499
> URL: https://issues.apache.org/jira/browse/LUCENE-4499
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/other
>Affects Versions: 4.1, 5.0
>Reporter: Roman Chyla
>Priority: Minor
>  Labels: analysis, multi-word, synonyms
> Fix For: 5.0
>
> Attachments: LUCENE-4499.patch
>
>
> I apologize for bringing the multi-token synonym expansion up again. There is 
> an old, unresolved issue at LUCENE-1622 [1]
> While solving the problem for our needs [2], I discovered that the current 
> SolrSynonym parser (and the wonderful FTS) have almost everything to 
> satisfactorily handle both the query and index time synonym expansion. It 
> seems that people often need to use the synonym filter *slightly* differently 
> at indexing and query time.
> In our case, we must do different things during indexing and querying.
> Example sentence: Mirrors of the Hubble space telescope pointed at XA5
> This is what we need (comma marks position bump):
> indexing: mirrors,hubble|hubble space 
> telescope|hst,space,telescope,pointed,xa5|astroobject#5
> querying: +mirrors +(hubble space telescope | hst) +pointed 
> +(xa5|astroboject#5)
> This translated to following needs:
>   indexing time: 
> single-token synonyms => return only synonyms
> multi-token synonyms => return original tokens *AND* the synonyms
>   query time:
> single-token: return only synonyms (but preserve case)
> multi-token: return only synonyms
>  
> We need the original tokens for the proximity queries, if we indexed 'hubble 
> space telescope'
> as one token, we cannot search for 'hubble NEAR telescope'
> You may (not) be surprised, but Lucene already supports ALL of these 
> requirements. The patch is an attempt to state the problem differently. I am 
> not sure if it is the best option, however it works perfectly for our needs 
> and it seems it could work for general public too. Especially if the 
> SynonymFilterFactory had a preconfigured sets of SynonymMapBuilders - and 
> people would just choose what situation they use. Please look at the unittest.
> links:
> [1] https://issues.apache.org/jira/browse/LUCENE-1622
> [2] http://labs.adsabs.harvard.edu/trac/ads-invenio/ticket/158
> [3] seems to have similar request: 
> http://lucene.472066.n3.nabble.com/Proposal-Full-support-for-multi-word-synonyms-at-query-time-td4000522.html
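The two expansion policies the issue asks for (index time: original tokens AND the multi-token synonym; query time: the synonym only) can be illustrated with a self-contained sketch. The toy rule, class, and method names are made up for illustration; real position increments and Lucene's SynonymMap machinery are deliberately glossed over:

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration of the requested index-time vs. query-time policies
// for a multi-token synonym ("hubble space telescope" -> "hst").
// Not Lucene's actual API; names are made up.
public class SynonymPolicySketch {
    static final String[] PHRASE = {"hubble", "space", "telescope"};
    static final String SYNONYM = "hst";

    static List<String> expand(List<String> tokens, boolean indexTime) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            if (i + 2 < tokens.size()
                    && tokens.get(i).equals(PHRASE[0])
                    && tokens.get(i + 1).equals(PHRASE[1])
                    && tokens.get(i + 2).equals(PHRASE[2])) {
                if (indexTime) {
                    // index time: keep the original tokens AND add the synonym
                    out.add(tokens.get(i));
                    out.add(tokens.get(i + 1));
                    out.add(tokens.get(i + 2));
                }
                out.add(SYNONYM);  // query time: emit only the synonym
                i += 2;            // skip the rest of the matched phrase
            } else {
                out.add(tokens.get(i));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> in = List.of("mirrors", "hubble", "space", "telescope", "pointed");
        System.out.println(expand(in, true));
        System.out.println(expand(in, false));
    }
}
```

Keeping the original tokens at index time is what preserves proximity queries such as 'hubble NEAR telescope', as the issue notes.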




Re: pro coding style

2012-11-30 Thread Robert Muir
On Fri, Nov 30, 2012 at 6:15 AM, Per Steffensen  wrote:

>
>  * svn -> git (way better tools)

>>> I think we had this discussion already and it seems that lots of folks
>>> are positive, yet there is still some barrier infrasturcuture wise along
>>> the lines.
>>>
>>
>> dont blame infrastructure, other apache projects are using it.
>>
> Git is the way forward. It will also make comitting outside contributions
> easier (especially if the commit is to be performed after the branch has
> developed a lot since the pull-request was made). Merging among branches
> will also become easier. Why? Basically, since a pull request (request to
> merge) is a operation


Is it really git? Because its my understanding pull requests aren't
actually a git thing but a github thing.

The distinction is important.


Reenable Solr tests on Jenkins

2012-11-30 Thread Uwe Schindler
Can we re-enable Solr tests on Jenkins?
As a workaround for the blackhole, I would add an unless="freebsd"-like check 
to Solr's "test" target. We should maybe work on a solution to make the 
blackhole work correctly, or make SolrJettyRunner detect the blackhole and 
enable the simple non-NIO connector.

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de






Re: IndexWriter.ensureOpen and ensureOpen(boolean)

2012-11-30 Thread Shai Erera
I see. So two questions:

1) Is it ok for prepareCommit() to call ensureOpen(false)? In LUCENE-4575 I
consolidate the two prepCommit() and this is the only way it would work ...

2) Could you perhaps clarify the use of the second argument in the
javadocs? Maybe also rename it to something like "fail if closing"? The
name "includePendingClose" is vague (perhaps reconsider?).

Thanks anyway for clarifying this !

Shai

On Fri, Nov 30, 2012 at 1:57 PM, Michael McCandless <
luc...@mikemccandless.com> wrote:

> On Thu, Nov 29, 2012 at 3:31 PM, Shai Erera  wrote:
> >
> > Hi
> >
> > While working on LUCENE-4575 I noticed what I thought was an
> inconsistency between prepareCommit() and prepareCommit(commitData).
> > The former called ensureOpen(true) and the latter ensureOpen(false). At
> first I thought that this is a bug, so I fixed both to call
> ensureOpen(true),
> > especially now that I consolidate the two prepCommit() versions into
> one, but then all tests failed with AlreadyClosedException. How wonderful
> :).
> >
> > Getting deeper into the meaning of the two ensureOpen versions I realize
> that the boolean means something like "fail if IW has been closed, or is
> > in the process of closing". Some methods choose to not fail if IW is in
> the process of closing, while others do (mostly internal methods).
> >
> > My question is - why make the distinction? If IW is in the process of
> closing, why not always fail?
>
> Because IW will call some of these methods during close (eg it commits
> before closing)...
>
> If we fix close to not commit (there is an issue for this...) then we
> could likely remove this boolean.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
>


Re: pro coding style

2012-11-30 Thread Per Steffensen

Per Steffensen skrev:

Spot on! Good arguments.
When you just do not think of randomized tests as a "replacement for 
boundary condition tests" etc.
Thanks. Will consider randomized for my projects in the future - with 
limits :-)


Regards, Per Steffensen

Dawid Weiss skrev:

I see your point about "bringing up bugs nobody thought to cover manually", but 
it also has cons - e.g. violating the principle that tests should be (easily) 
repeatable (you will/can end up with tests that sometimes fail and sometimes 
succeed, and you have to dig out the random values of the tests that fail in 
order to be able to repeat/reconstruct the fail).



Randomized tests should be identical in their execution given the same
seed, it's the same principle as with regular tests but expands on
different code paths every time you execute with a different seed.
They are not a replacement for boundary condition tests, they're a
complementary thing that should allow picking things you haven't
thought of. Sure, in case of a failure you need to find the seed that
caused the problem but that doesn't seem like a lot of effort given
the potential profit.

If you want identical runs -- fix the initial seed.

If you have a non-deterministic test for a given fixed seed, it'd be
equally non-deterministic if no randomization was used, it's just a
flawed test (or inherently non-deterministic by nature so assertions
should be relaxed).

Dawid



  






Re: pro coding style

2012-11-30 Thread Per Steffensen
Spot on! Good arguments. Thanks. Will consider randomized for my 
projects in the future - with limits :-)


Regards, Per Steffensen

Dawid Weiss skrev:

I see your point about "bringing up bugs nobody thought to cover manually", but 
it also has cons - e.g. violating the principle that tests should be (easily) 
repeatable (you will/can end up with tests that sometimes fail and sometimes 
succeed, and you have to dig out the random values of the tests that fail in 
order to be able to repeat/reconstruct the fail).



Randomized tests should be identical in their execution given the same
seed, it's the same principle as with regular tests but expands on
different code paths every time you execute with a different seed.
They are not a replacement for boundary condition tests, they're a
complementary thing that should allow picking things you haven't
thought of. Sure, in case of a failure you need to find the seed that
caused the problem but that doesn't seem like a lot of effort given
the potential profit.

If you want identical runs -- fix the initial seed.

If you have a non-deterministic test for a given fixed seed, it'd be
equally non-deterministic if no randomization was used, it's just a
flawed test (or inherently non-deterministic by nature so assertions
should be relaxed).

Dawid



  




Re: IndexWriter.ensureOpen and ensureOpen(boolean)

2012-11-30 Thread Michael McCandless
On Thu, Nov 29, 2012 at 3:31 PM, Shai Erera  wrote:
>
> Hi
>
> While working on LUCENE-4575 I noticed what I thought was an inconsistency 
> between prepareCommit() and prepareCommit(commitData).
> The former called ensureOpen(true) and the latter ensureOpen(false). At first 
> I thought that this is a bug, so I fixed both to call ensureOpen(true),
> especially now that I consolidate the two prepCommit() versions into one, but 
> then all tests failed with AlreadyClosedException. How wonderful :).
>
> Getting deeper into the meaning of the two ensureOpen versions I realize that 
> the boolean means something like "fail if IW has been closed, or is
> in the process of closing". Some methods choose to not fail if IW is in the 
> process of closing, while others do (mostly internal methods).
>
> My question is - why make the distinction? If IW is in the process of 
> closing, why not always fail?

Because IW will call some of these methods during close (eg it commits
before closing)...

If we fix close to not commit (there is an issue for this...) then we
could likely remove this boolean.
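For illustration, the semantics being discussed might be sketched like this. This is a guess at the intent based on the thread, not IndexWriter's actual code; class and field names are made up:

```java
// Sketch of the ensureOpen(boolean) semantics discussed above: with
// failIfClosing=true the check also rejects a writer that is in the
// *process* of closing, which is why internal calls made during close()
// must pass false. Illustrative only, not Lucene's real implementation.
public class EnsureOpenSketch {
    private volatile boolean closed = false;   // close() has completed
    private volatile boolean closing = false;  // close() has started

    void ensureOpen(boolean failIfClosing) {
        if (closed || (failIfClosing && closing)) {
            throw new IllegalStateException("this writer is closed (or closing)");
        }
    }

    public static void main(String[] args) {
        EnsureOpenSketch w = new EnsureOpenSketch();
        w.closing = true;        // close() has started but not finished
        w.ensureOpen(false);     // internal call during close: allowed
        boolean rejected = false;
        try {
            w.ensureOpen(true);  // app-level call: rejected while closing
        } catch (IllegalStateException e) {
            rejected = true;
        }
        System.out.println(rejected); // prints true
    }
}
```

As Mike notes above, if close() stopped committing, the internal-call case would disappear and the boolean could likely go away entirely.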

Mike McCandless

http://blog.mikemccandless.com




Re: pro coding style

2012-11-30 Thread Dawid Weiss
> I see your point about "bringing up bugs nobody thought to cover manually", 
> but it also has cons - e.g. violating the principle that tests should be 
> (easily) repeatable (you will/can end up with tests that sometimes fail and 
> sometimes succeed, and you have to dig out the random values of the tests 
> that fail in order to be able to repeat/reconstruct the fail)

Randomized tests should be identical in their execution given the same
seed, it's the same principle as with regular tests but expands on
different code paths every time you execute with a different seed.
They are not a replacement for boundary condition tests, they're a
complementary thing that should allow picking things you haven't
thought of. Sure, in case of a failure you need to find the seed that
caused the problem but that doesn't seem like a lot of effort given
the potential profit.

If you want identical runs -- fix the initial seed.

If you have a non-deterministic test for a given fixed seed, it'd be
equally non-deterministic if no randomization was used, it's just a
flawed test (or inherently non-deterministic by nature so assertions
should be relaxed).
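The fixed-seed reproducibility argument can be shown with a tiny self-contained example (names are made up; the real infrastructure here is the randomizedtesting framework, which manages seeds for you):

```java
import java.util.Random;

// Illustration of seed-based reproducibility: a "randomized test" driven
// by an explicit seed generates the identical input sequence on every
// run, so a failure can be replayed simply by fixing the seed.
public class SeededRandomSketch {
    // Hypothetical test: reversing a string twice must be a no-op,
    // checked on 100 inputs generated from the given seed.
    static boolean runWithSeed(long seed) {
        Random random = new Random(seed);
        for (int i = 0; i < 100; i++) {
            int len = random.nextInt(20);
            StringBuilder sb = new StringBuilder();
            for (int j = 0; j < len; j++) {
                sb.append((char) ('a' + random.nextInt(26)));
            }
            String s = sb.toString();
            String twice = new StringBuilder(
                new StringBuilder(s).reverse().toString()).reverse().toString();
            if (!twice.equals(s)) {
                return false; // a failure would be replayable via the seed
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Same seed -> same inputs -> same verdict, run after run.
        System.out.println(runWithSeed(42L) == runWithSeed(42L));
    }
}
```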

Dawid




Re: pro coding style

2012-11-30 Thread Per Steffensen

Everything below is my humble opinion and input - I DON'T MEAN TO OFFEND ANYONE.

Radim Kolar wrote:



what you should do:
* stuff i do
I like people with confidence, but it is a balance :-) Every decent 
developer in the world believes that he is the best in the world. Chances 
are that he is not. Be humble.


+
* ant -> maven
Maven is a step forward, but it is still crap. I believe the original 
creator of Ant has apologized in public for basing it on XML. Maven is 
also based on XML, besides being way too complex in its infrastructure - 
goals, phases, environments, strange plugins with executions mapping to 
phases etc. XML is good for static data/config stuff, but a build process 
is not static data/config - it is a process. Go Gradle!
I don't have either; if I decide to go with SOLR instead of EC, I will 
fork it. It will save me a lot of time.
We are basically maintaining our own version of Solr at my organization, 
because it is so hard to get contributions in - SOLR-3173, SOLR-3178, 
SOLR-3382, SOLR-3428, SOLR-3383 etc. - and lately SOLR-4114 and 
SOLR-4120. It is really hard keeping up with the latest versions of 
Apache Solr, because it is a huge job to merge new stuff into our Solr. 
We are considering taking the consequence and forking our own public (to 
let others benefit and contribute) "variant" of Solr.


I understand that no committers are really assigned to focus on 
committing other people's stuff, but it is a shame. I would really, 
really not like Solr to end up in a situation where many organizations 
run their own little fork. Instead we should all collaborate on 
improving "the one and only Solr"! Maybe we should try to find a sponsor 
to pay for a full-time Solr committer with the main focus on verifying 
and committing contributions from the "outside".

* svn -> git (way better tools)
I think we had this discussion already and it seems that lots of 
folks are positive, yet there is still some barrier infrasturcuture 
wise along the lines.


Don't blame infrastructure, other apache projects are using it.
Git is the way forward. It will also make committing outside 
contributions easier (especially if the commit is to be performed after 
the branch has developed a lot since the pull-request was made). Merging 
among branches will also become easier. Why? Basically, since a pull 
request (request to merge) is an operation handled/known by git, it 
allows git to maintain more information about where merged code fits 
into the code-base considering revisions etc. That information can be 
used to ease future or late merges.



* split code into small manageable maven modules
see above - we have a fully functional maven build but ant is our 
primary build.

I don't see pom.xml in your source tree.
Have a look at the templates in dev-tools/maven. Do an "ant 
-Dversion=$VERSION get-maven-poms" to get your maven stuff generated in 
the folder "maven-build". The Maven build does not work 100% out of the 
box (at least on the lucene_solr_4_0 branch), but it is very close.



* use github to track patches wait why is github good for patches?
you can track patch revisions and apply/browse/comment on it easily. Also 
it's way easier to upload it and do a pull request than to attach it to a 
ticket in jira.

See comments under "git" above

Besides that I have some additional input, now that we are talking

Basically that code is a mess. Not blaming anyone in particular. It's 
probably to some extent the nature of open source. If someone honestly 
believes that the code-base is beautiful, they should find something else 
to do. Some of the major problems are

* Bad "separation of concerns"
** Very long classes/methods dealing with a lot of different concerns
*** Example: DistributedUpdateProcessor - dealing with 
cloud/standalone modes, phases, optimistic locking, calculating values for 
document fields (for add/inc/set requests), routing etc. This should all 
be separated into different classes, each dealing with a different 
concern
** Code dealing with a particular concern is spread all over the code 
base - this makes it very hard to "change strategy" for that concern
*** Example: An obvious "separate concern" is routing (the decision 
about which shard under a collection a particular document belongs to 
(where it should be indexed and found) and where a particular request 
needs to go - leaders, replicas, all shards under the collection?). This 
concern is dealt with in a lot of places - DistributedUpdateProcessor, 
CloudSolrServer, RealTimeGetComponent, SearchHandler etc.
** In my patch for TLT-3178 I have made a "separate concern" called 
UpdateSemantics. It deals with decisions about how updates should be 
performed, depending on which update semantics you have chosen (classic, 
consistency or classic-consistency-hybrid). This UpdateSemantics class is 
used from the actual updating component, DirectUpdateHandler2, instead of 
having a lot of if-else-if-else statements in DirectUpdateHandler2 itself

* Copied code
** A lot of code is clearly just copied
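The "UpdateSemantics as a separate concern" idea above can be sketched as a 
plain strategy interface. All class and method names below are invented for 
illustration (the real patch is attached to the JIRA issue); the point is how 
the if-else-if-else chains collapse into one pluggable decision point:

```java
// Hypothetical sketch: update-semantics decisions extracted into their own
// concern instead of if/else chains inside the update handler.
interface UpdateSemantics {
    /** Should this add first check the stored document version? */
    boolean requiresVersionCheck(long requestVersion);
}

class ClassicSemantics implements UpdateSemantics {
    public boolean requiresVersionCheck(long v) { return false; }
}

class ConsistencySemantics implements UpdateSemantics {
    public boolean requiresVersionCheck(long v) { return true; }
}

class HybridSemantics implements UpdateSemantics {
    // Only check when the client actually supplied a version.
    public boolean requiresVersionCheck(long v) { return v > 0; }
}

class UpdateHandlerSketch {
    private final UpdateSemantics semantics;

    UpdateHandlerSketch(UpdateSemantics semantics) { this.semantics = semantics; }

    // The handler asks the strategy object instead of branching on a
    // semantics flag.
    String add(String doc, long requestVersion) {
        if (semantics.requiresVersionCheck(requestVersion)) {
            return "versioned-add:" + doc; // would consult the stored version here
        }
        return "plain-add:" + doc;
    }

    public static void main(String[] args) {
        UpdateHandlerSketch handler = new UpdateHandlerSketch(new HybridSemantics());
        System.out.println(handler.add("doc1", 0));  // prints plain-add:doc1
        System.out.println(handler.add("doc1", 42)); // prints versioned-add:doc1
    }
}
```

Swapping HybridSemantics for ClassicSemantics changes the behaviour without 
touching the handler, which is the point of separating the concern.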

[jira] [Updated] (SOLR-4130) eDismax: Terms are skipped for phrase boost when using parentheses

2012-11-30 Thread Leonhard Maylein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonhard Maylein updated SOLR-4130:
---

Summary: eDismax: Terms are skipped for phrase boost when using parentheses 
 (was: eDismax: Terms are skipped for phrase boost when using parenthese)

> eDismax: Terms are skipped for phrase boost when using parentheses
> --
>
> Key: SOLR-4130
> URL: https://issues.apache.org/jira/browse/SOLR-4130
> Project: Solr
>  Issue Type: Bug
>  Components: query parsers
>Affects Versions: 4.0
>Reporter: Leonhard Maylein
>
> I've tried the following combination with the eDismax handler
> in SOLR 4.0.0:
> q: +sw:(a b) +ti:(c d)
> qf: freitext exttext^0.5
> pf: freitext^6 exttext^3
> The result is:
> +sw:(a b) +ti:(c d)
> +sw:(a b) +ti:(c d)
> (((sw:a sw:b) +(ti:c ti:d)) 
> DisjunctionMaxQuery((freitext:"b d"^6.0)) DisjunctionMaxQuery((exttext:"b 
> d"^3.0)))/no_coord
> All terms are (equally) qualified by a field (field sw for the terms a and b, 
> field ti for the terms c and d).
> Why does the eDismax handler use only the terms b and d to build the phrase 
> boost query?
> It appears that some terms have been skipped for phrase boost.
> Moreover, in my opinion, fielded terms should not be used in phrase boost 
> except for the specified field.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-4130) eDismax: Terms are skipped for phrase boost when using parenthese

2012-11-30 Thread Leonhard Maylein (JIRA)
Leonhard Maylein created SOLR-4130:
--

 Summary: eDismax: Terms are skipped for phrase boost when using 
parenthese
 Key: SOLR-4130
 URL: https://issues.apache.org/jira/browse/SOLR-4130
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.0
Reporter: Leonhard Maylein


I've tried the following combination with the eDismax handler
in SOLR 4.0.0:

q: +sw:(a b) +ti:(c d)
qf: freitext exttext^0.5
pf: freitext^6 exttext^3

The result is:

+sw:(a b) +ti:(c d)

+sw:(a b) +ti:(c d)

(((sw:a sw:b) +(ti:c ti:d)) 
DisjunctionMaxQuery((freitext:"b d"^6.0)) DisjunctionMaxQuery((exttext:"b 
d"^3.0)))/no_coord

All terms are (equally) qualified by a field (field sw for the terms a and b, 
field ti for the terms c and d).
Why does the eDismax handler use only the terms b and d to build the phrase 
boost query?
It appears that some terms have been skipped for phrase boost.

Moreover, in my opinion, fielded terms should not be used in phrase boost 
except for the specified field.





[jira] [Commented] (LUCENE-4499) Multi-word synonym filter (synonym expansion)

2012-11-30 Thread Nolan Lawson (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507243#comment-13507243
 ] 

Nolan Lawson commented on LUCENE-4499:
--

I had similar problems with multi-word synonyms, and I largely resolved them by 
moving all the synonym expansion logic into the query parser.  I have a 
description of it on my blog [1] and some open-source code on GitHub [2].

In terms of changes to Solr/Lucene, all my code adds is an extension to the 
EDismaxQueryParserPlugin, so it's easy to install.

And at the very least, with my code you would be able to query "hubble NEAR 
telescope" as in your example, and it would work because the index tokens have 
not been changed at all.  "HST" would also work.

It's not exactly like your solution, but I think mine could be an interesting 
addition to Solr, and could help folks out who have run into similar issues 
with multi-word synonyms.

Links:
[1] http://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/
[2] https://github.com/healthonnet/hon-lucene-synonyms#readme

> Multi-word synonym filter (synonym expansion)
> -
>
> Key: LUCENE-4499
> URL: https://issues.apache.org/jira/browse/LUCENE-4499
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/other
>Affects Versions: 4.1, 5.0
>Reporter: Roman Chyla
>Priority: Minor
>  Labels: analysis, multi-word, synonyms
> Fix For: 5.0
>
> Attachments: LUCENE-4499.patch
>
>
> I apologize for bringing the multi-token synonym expansion up again. There is 
> an old, unresolved issue at LUCENE-1622 [1]
> While solving the problem for our needs [2], I discovered that the current 
> SolrSynonym parser (and the wonderful FST) have almost everything needed to 
> satisfactorily handle both query- and index-time synonym expansion. It 
> seems that people often need to use the synonym filter *slightly* differently 
> at indexing and query time.
> In our case, we must do different things during indexing and querying.
> Example sentence: Mirrors of the Hubble space telescope pointed at XA5
> This is what we need (comma marks position bump):
> indexing: mirrors,hubble|hubble space 
> telescope|hst,space,telescope,pointed,xa5|astroobject#5
> querying: +mirrors +(hubble space telescope | hst) +pointed 
> +(xa5|astroobject#5)
> This translates to the following needs:
>   indexing time: 
> single-token synonyms => return only synonyms
> multi-token synonyms => return original tokens *AND* the synonyms
>   query time:
> single-token: return only synonyms (but preserve case)
> multi-token: return only synonyms
>  
> We need the original tokens for the proximity queries: if we indexed 'hubble 
> space telescope' as one token, we could not search for 'hubble NEAR 
> telescope'.
> You may (not) be surprised, but Lucene already supports ALL of these 
> requirements. The patch is an attempt to state the problem differently. I am 
> not sure if it is the best option, however it works perfectly for our needs 
> and it seems it could work for the general public too, especially if the 
> SynonymFilterFactory had preconfigured sets of SynonymMap builders and 
> people could just choose which situation applies. Please look at the unit test.
> links:
> [1] https://issues.apache.org/jira/browse/LUCENE-1622
> [2] http://labs.adsabs.harvard.edu/trac/ads-invenio/ticket/158
> [3] seems to have similar request: 
> http://lucene.472066.n3.nabble.com/Proposal-Full-support-for-multi-word-synonyms-at-query-time-td4000522.html
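The index-time/query-time asymmetry described above can be illustrated with a 
toy query-time expander. This is not the patch's code: the synonym table and 
all names below are invented, and a real implementation would work on token 
streams (FST-backed SynonymMap) rather than strings. The key property it shows 
is that the index keeps the original tokens, so 'hubble NEAR telescope' still 
works, while the query is rewritten into an OR-group of equivalent phrases:

```java
import java.util.*;

// Toy illustration of query-time multi-word synonym expansion.
class QuerySynonymSketch {
    // invented synonym table: phrase -> group of equivalent phrases
    static final Map<String, List<String>> SYNONYMS = new HashMap<>();
    static {
        List<String> hst = Arrays.asList("hubble space telescope", "hst");
        SYNONYMS.put("hubble space telescope", hst);
        SYNONYMS.put("hst", hst);
    }

    static String expand(String query) {
        String[] toks = query.toLowerCase(Locale.ROOT).split("\\s+");
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < toks.length) {
            // greedy longest match (up to 3 words) against the synonym table
            int matched = 0;
            for (int len = Math.min(3, toks.length - i); len >= 1; len--) {
                String phrase = String.join(" ", Arrays.copyOfRange(toks, i, i + len));
                if (SYNONYMS.containsKey(phrase)) { matched = len; break; }
            }
            if (out.length() > 0) out.append(' ');
            if (matched > 0) {
                String phrase = String.join(" ", Arrays.copyOfRange(toks, i, i + matched));
                StringJoiner group = new StringJoiner(" | ", "+(", ")");
                for (String s : SYNONYMS.get(phrase)) {
                    group.add(s.contains(" ") ? "\"" + s + "\"" : s);
                }
                out.append(group);
                i += matched;
            } else {
                out.append('+').append(toks[i++]);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(expand("mirrors hubble space telescope pointed"));
        // prints: +mirrors +("hubble space telescope" | hst) +pointed
    }
}
```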




[jira] [Updated] (SOLR-4129) Solr doesn't support log4j

2012-11-30 Thread Raintung Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raintung Li updated SOLR-4129:
--

Attachment: patch-4129.txt

This patch finishes the UI for the log4j case, which will be shown in the Admin 
page. Some things still need to be done: the Maven build needs the log4j jar, 
and the slf4j loading logic needs to be checked for the different component 
loggers to find a way to set it from outside.



> Solr doesn't support log4j 
> ---
>
> Key: SOLR-4129
> URL: https://issues.apache.org/jira/browse/SOLR-4129
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0
>Reporter: Raintung Li
>  Labels: log
> Attachments: patch-4129.txt
>
>
> Many projects use log4j. Solr uses the slf4j logging framework, which is 
> designed to integrate easily with log4j.
> Solr uses log4j-over-slf4j.jar to handle the log4j case.
> This jar has some issues:
> a. It ultimately invokes slf4j to print the log message (for Solr this is 
> JDK14 logging).
> b. It does not implement every log4j function, e.g. Logger.setLevel().
> c. JDK14 logging lacks some features, e.g. thread info and daily rolling.
> Some dependent projects already use log4j, and their customers still want to 
> use it. JDK14 logging differs from log4j in many ways; at least the 
> configuration file can't be reused.
> The bad thing is that log4j-over-slf4j.jar conflicts with log4j. To use Solr, 
> the other projects have to remove log4j.
> I think Solr shouldn't use log4j-over-slf4j.jar, and should instead reuse 
> log4j if the customer wants to use it.
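To make the configuration-reuse point concrete, here is the kind of existing 
log4j.properties a dependent project may already have. The file contents are an 
invented example, not taken from the issue; the daily-rolling file appender in 
particular has no direct equivalent in JDK14 (java.util.logging) configuration:

```properties
# Invented example of a typical existing log4j 1.x setup.
log4j.rootLogger=INFO, solrlog
log4j.appender.solrlog=org.apache.log4j.DailyRollingFileAppender
log4j.appender.solrlog.File=logs/solr.log
log4j.appender.solrlog.DatePattern='.'yyyy-MM-dd
log4j.appender.solrlog.layout=org.apache.log4j.PatternLayout
log4j.appender.solrlog.layout.ConversionPattern=%d %p [%t] %c - %m%n
```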




[jira] [Updated] (SOLR-4129) Solr doesn't support log4j

2012-11-30 Thread Raintung Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raintung Li updated SOLR-4129:
--

Description: 
Many projects use log4j. Solr uses the slf4j logging framework, which is 
designed to integrate easily with log4j.

Solr uses log4j-over-slf4j.jar to handle the log4j case.
This jar has some issues:
a. It ultimately invokes slf4j to print the log message (for Solr this is 
JDK14 logging).
b. It does not implement every log4j function, e.g. Logger.setLevel().
c. JDK14 logging lacks some features, e.g. thread info and daily rolling.

Some dependent projects already use log4j, and their customers still want to 
use it. JDK14 logging differs from log4j in many ways; at least the 
configuration file can't be reused.

The bad thing is that log4j-over-slf4j.jar conflicts with log4j. To use Solr, 
the other projects have to remove log4j.

I think Solr shouldn't use log4j-over-slf4j.jar, and should instead reuse 
log4j if the customer wants to use it.


  was:
For many project use the log4j, actually solr use slf logger framework, slf can 
easy to integrate with log4j. 
But solr use log4j-over-slf.jar to handle log4j case.
This jar has some issues.
a. Actually last invoke slf to print the logger (for solr it is JDK14.logging).
b. Not implement all log4j function. ex. Logger.setLevel() 
c. JDK14 log miss some function, ex. thread.info, day rolling 

Some dependence project had been used log4j that the customer still want to use 
it, that  exist the configuration file. JDK14 log has many different with Log4j.

The bad thing is log4j-over-slf.jar conflict with log4j. The other project need 
remove it log4j.

We shouldn't use log4j-over-slf.jar, still reuse log4j if customer want to use 
it.



> Solr doesn't support log4j 
> ---
>
> Key: SOLR-4129
> URL: https://issues.apache.org/jira/browse/SOLR-4129
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0
>Reporter: Raintung Li
>  Labels: log
>
> Many projects use log4j. Solr uses the slf4j logging framework, which is 
> designed to integrate easily with log4j.
> Solr uses log4j-over-slf4j.jar to handle the log4j case.
> This jar has some issues:
> a. It ultimately invokes slf4j to print the log message (for Solr this is 
> JDK14 logging).
> b. It does not implement every log4j function, e.g. Logger.setLevel().
> c. JDK14 logging lacks some features, e.g. thread info and daily rolling.
> Some dependent projects already use log4j, and their customers still want to 
> use it. JDK14 logging differs from log4j in many ways; at least the 
> configuration file can't be reused.
> The bad thing is that log4j-over-slf4j.jar conflicts with log4j. To use Solr, 
> the other projects have to remove log4j.
> I think Solr shouldn't use log4j-over-slf4j.jar, and should instead reuse 
> log4j if the customer wants to use it.




[jira] [Created] (SOLR-4129) Solr doesn't support log4j

2012-11-30 Thread Raintung Li (JIRA)
Raintung Li created SOLR-4129:
-

 Summary: Solr doesn't support log4j 
 Key: SOLR-4129
 URL: https://issues.apache.org/jira/browse/SOLR-4129
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.0, 4.0-BETA, 4.0-ALPHA
Reporter: Raintung Li


For many project use the log4j, actually solr use slf logger framework, slf can 
easy to integrate with log4j. 
But solr use log4j-over-slf.jar to handle log4j case.
This jar has some issues.
a. Actually last invoke slf to print the logger (for solr it is JDK14.logging).
b. Not implement all log4j function. ex. Logger.setLevel() 
c. JDK14 log miss some function, ex. thread.info, day rolling 

Some dependence project had been used log4j that the customer still want to use 
it, that  exist the configuration file. JDK14 log has many different with Log4j.

The bad thing is log4j-over-slf.jar conflict with log4j. The other project need 
remove it log4j.

We shouldn't use log4j-over-slf.jar, still reuse log4j if customer want to use 
it.




