Re: [VOTE] Release PyLucene 4.3.0-1

2013-05-12 Thread Barry Wark
Hi all,

I'm new to the pylucene-dev list, so please forgive me if I'm stepping out
of line in the voting process.

We're successfully using JCC 2.15 to generate a wrapper for our Java API. JCC
2.16 from SVN HEAD produces the following error (output log attached) when
using the new --use_full_names option. Python 2.7, OS X 10.8:

build/_ovation/com/__init__.cpp:15:14: error: use of undeclared identifier
  'getJavaModule'
module = getJavaModule(module, "", "com");
 ^
build/_ovation/com/__init__.cpp:22:14: error: use of undeclared identifier
  'getJavaModule'
module = getJavaModule(module, "", "com");
 ^
2 errors generated.
error: command 'clang' failed with exit status 1


The generated _ovation/__init__.cpp and _ovation/com/__init__.cpp are also
attached.

Cheers,
Barry


On Mon, May 6, 2013 at 8:27 PM, Andi Vajda <va...@apache.org> wrote:


 It looks like the time has finally come for a PyLucene 4.x release !

 The PyLucene 4.3.0-1 release tracking the recent release of Apache Lucene
 4.3.0 is ready.

 A release candidate is available from:
 http://people.apache.org/~vajda/staging_area/

 A list of changes in this release can be seen at:
 http://svn.apache.org/repos/asf/lucene/pylucene/branches/pylucene_4_3/CHANGES

 PyLucene 4.3.0 is built with JCC 2.16 included in these release artifacts:
 http://svn.apache.org/repos/asf/lucene/pylucene/trunk/jcc/CHANGES

 A list of Lucene Java changes can be seen at:
 http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_4_3_0/lucene/CHANGES.txt

 Please vote to release these artifacts as PyLucene 4.3.0-1.

 Thanks !

 Andi..

 ps: the KEYS file for PyLucene release signing is at:
 http://svn.apache.org/repos/asf/lucene/pylucene/dist/KEYS
 http://people.apache.org/~vajda/staging_area/KEYS

 pps: here is my +1

setup args = {'ext_modules': [setuptools.extension.Extension instance at 
0x10996b908], 'name': 'ovation', 'package_data': {'ovation': 
['joda-time-1.6.2.jar', 'cloud-file-cache-2.0-SNAPSHOT.jar', 
'ovation-api-2.0-SNAPSHOT.jar', 'ovation-core-2.0-SNAPSHOT.jar', 
'ovation-couch-2.0-SNAPSHOT.jar', 'ovation-logging-2.0-SNAPSHOT.jar', 
'ovation-query-2.0-SNAPSHOT.jar', 'ovation-test-utils-2.0-SNAPSHOT.jar', 
'aopalliance-1.0.jar', 'c3p0-0.9.1.2.jar', 'cal10n-api-0.7.4.jar', 
'clj-time-0.4.4.jar', 'jackson-annotations-2.1.1.jar', 
'jackson-core-2.1.2.jar', 'jackson-databind-2.1.2.jar', 
'jackson-datatype-joda-2.1.2.jar', 'gson-2.2.jar', 'guava-13.0.1.jar', 
'guice-3.0.jar', 'guice-assistedinject-3.0.jar', 'protobuf-java-2.4.1.jar', 
'h2-1.3.170.jar', 'java-xmlbuilder-0.4.jar', 'je-4.0.92.jar', 'jconsole.jar', 
'jconsole.jar', 'jna-3.0.9.jar', 'commons-codec-1.7.jar', 
'commons-httpclient-3.1.jar', 'commons-io-2.4.jar', 'commons-lang-2.6.jar', 
'commons-logging-1.1.1.jar', 'netcdf-4.3.16.jar', 'udunits-4.3.16.jar', 
'jsr250-api-1.0.jar', 'javax.inject-1.jar', 'jsr311-api-1.1.1.jar', 
'korma-0.3.0-RC5.jar', 'lobos-1.0.0-beta1.jar', 'log4j-1.2.17.jar', 
'jcip-annotations-1.0.jar', 'ehcache-core-2.6.2.jar', 'rocoto-6.1.jar', 
'commons-compress-1.0.jar', 'commons-exec-1.1.jar', 'httpclient-4.2.1.jar', 
'httpclient-cache-4.1.2.jar', 'httpcore-4.2.1.jar', 'log4j-api-2.0-beta5.jar', 
'log4j-core-2.0-beta5.jar', 'log4j-slf4j-impl-2.0-beta5.jar', 
'bcprov-jdk16-1.46.jar', 'clojure-1.5.0.jar', 'java.jdbc-0.2.2.jar', 
'tools.macro-0.1.1.jar', 'jackson-core-asl-1.9.7.jar', 
'jackson-mapper-asl-1.9.7.jar', 'org.ektorp-1.3.0.jar', 
'jclouds-allblobstore-1.5.7.jar', 'jclouds-blobstore-1.5.7.jar', 
'jclouds-core-1.5.7.jar', 'atmos-1.5.7.jar', 'cloudfiles-1.5.7.jar', 
'filesystem-1.5.7.jar', 'openstack-keystone-1.5.7.jar', 's3-1.5.7.jar', 
'swift-1.5.7.jar', 'walrus-1.5.7.jar', 'aws-common-1.5.7.jar', 
'azure-common-1.5.7.jar', 'openstack-common-1.5.7.jar', 'aws-s3-1.5.7.jar', 
'azureblob-1.5.7.jar', 'cloudfiles-uk-1.5.7.jar', 'cloudfiles-us-1.5.7.jar', 
'cloudonestorage-1.5.7.jar', 'eucalyptus-partnercloud-s3-1.5.7.jar', 
'hpcloud-objectstorage-1.5.7.jar', 'ninefold-storage-1.5.7.jar', 
'synaptic-storage-1.5.7.jar', 'jdom-1.1.jar', 'quartz-2.1.1.jar', 
'jcl-over-slf4j-1.6.4.jar', 'slf4j-api-1.7.5.jar', 'slf4j-ext-1.7.2.jar', 
'osx-keychain-java-1.0.jar']}, 'version': '2.0-SNAPSHOT', 'zip_safe': False, 
'script_args': ['build_ext', 'bdist_egg'], 'packages': ['ovation'], 
'package_dir': {'ovation': 'build/ovation'}}
running build_ext
building 'ovation._ovation' extension
clang -fno-strict-aliasing -fno-common -dynamic -g -Os -pipe -fno-common 
-fno-strict-aliasing -fwrapv -mno-fused-madd -DENABLE_DTRACE -DMACOSX -DNDEBUG 
-Wall -Wstrict-prototypes 

Re: [VOTE] Release PyLucene 4.3.0-1

2013-05-12 Thread Andi Vajda

On May 12, 2013, at 21:04, Barry Wark <ba...@physion.us> wrote:

 Hi all,
 
 I'm new to the pylucene-dev list, so please forgive me if I'm stepping out of 
 line in the voting process. 
 
 We're successfully using JCC 2.15 to generate a wrapper for our Java API. JCC 
 2.16 from SVN HEAD produces the following error (output log attached) when 
 using the new --use_full_names option. Python 2.7, OS X 10.8:
 
 build/_ovation/com/__init__.cpp:15:14: error: use of undeclared identifier
   'getJavaModule'
 module = getJavaModule(module, "", "com");
  ^
 build/_ovation/com/__init__.cpp:22:14: error: use of undeclared identifier
   'getJavaModule'
 module = getJavaModule(module, "", "com");

You might be mixing in headers from an old version here...

Andi..

  ^
 2 errors generated.
 error: command 'clang' failed with exit status 1
 
 
 The generated _ovation/__init__.cpp and _ovation/com/__init__.cpp are also 
 attached.
 
 Cheers,
 Barry
 
 
 On Mon, May 6, 2013 at 8:27 PM, Andi Vajda <va...@apache.org> wrote:
 
 It looks like the time has finally come for a PyLucene 4.x release !
 
 The PyLucene 4.3.0-1 release tracking the recent release of Apache Lucene 
 4.3.0 is ready.
 
 A release candidate is available from:
 http://people.apache.org/~vajda/staging_area/
 
 A list of changes in this release can be seen at:
 http://svn.apache.org/repos/asf/lucene/pylucene/branches/pylucene_4_3/CHANGES
 
 PyLucene 4.3.0 is built with JCC 2.16 included in these release artifacts:
 http://svn.apache.org/repos/asf/lucene/pylucene/trunk/jcc/CHANGES
 
 A list of Lucene Java changes can be seen at:
 http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_4_3_0/lucene/CHANGES.txt
 
 Please vote to release these artifacts as PyLucene 4.3.0-1.
 
 Thanks !
 
 Andi..
 
 ps: the KEYS file for PyLucene release signing is at:
 http://svn.apache.org/repos/asf/lucene/pylucene/dist/KEYS
 http://people.apache.org/~vajda/staging_area/KEYS
 
 pps: here is my +1
 
 jcc_2.16_osx_10.8_py2.7.out.txt
 __init__.cpp
 __init__.cpp


[JENKINS] Lucene-Solr-4.x-Windows (64bit/jdk1.6.0_45) - Build # 2780 - Still Failing!

2013-05-12 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Windows/2780/
Java: 64bit/jdk1.6.0_45 -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC

5 tests failed.
FAILED:  
org.apache.solr.client.solrj.embedded.TestEmbeddedSolrServer.testGetCoreContainer

Error Message:


Stack Trace:
org.apache.solr.common.SolrException: 
at 
__randomizedtesting.SeedInfo.seed([B89FE01415CDFF59:758CD45BA33C97E7]:0)
at org.apache.solr.core.CoreContainer.load(CoreContainer.java:262)
at org.apache.solr.core.CoreContainer.load(CoreContainer.java:219)
at org.apache.solr.core.CoreContainer.<init>(CoreContainer.java:149)
at 
org.apache.solr.client.solrj.embedded.AbstractEmbeddedSolrServerTestCase.setUp(AbstractEmbeddedSolrServerTestCase.java:64)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:771)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: The filename, directory name, or volume label 
syntax is incorrect
at java.io.WinNTFileSystem.canonicalize0(Native Method)
at java.io.Win32FileSystem.canonicalize(Win32FileSystem.java:396)
at java.io.File.getCanonicalPath(File.java:559)
at 

[jira] [Commented] (LUCENE-4992) ArrayOutOfBoundsException in BooleanScorer2

2013-05-12 Thread John Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13655478#comment-13655478
 ] 

John Wang commented on LUCENE-4992:
---

Makes sense, agreed.

 ArrayOutOfBoundsException in BooleanScorer2
 ---

 Key: LUCENE-4992
 URL: https://issues.apache.org/jira/browse/LUCENE-4992
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/search
Affects Versions: 4.1
Reporter: John Wang
 Attachments: LUCENE-4992.patch, patch.diff


 Seeing the following exception in BooleanScorer2 in our production system:
 Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 
 2147483647
   at 
 org.apache.lucene.search.BooleanScorer2.score(BooleanScorer2.java:312)
   at 
 org.apache.lucene.queries.CustomScoreQuery$CustomScorer.score(CustomScoreQuery.java:324)
   at 
 org.apache.lucene.search.DisjunctionMaxScorer.score(DisjunctionMaxScorer.java:84)
   at 
 org.apache.lucene.search.TopScoreDocCollector$InOrderTopScoreDocCollector.collect(TopScoreDocCollector.java:47)
   at org.apache.lucene.search.Scorer.score(Scorer.java:64)
   at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:605)
   at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:482)
   at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:438)
   at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:281)
   at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:269)
   




[JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.7.0_21) - Build # 5615 - Still Failing!

2013-05-12 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/5615/
Java: 64bit/jdk1.7.0_21 -XX:+UseCompressedOops -XX:+UseG1GC

1 tests failed.
REGRESSION:  org.apache.solr.request.TestRemoteStreaming.testQtUpdateFails

Error Message:
IOException occured when talking to server at: 
https://127.0.0.1:50584/solr/collection1

Stack Trace:
org.apache.solr.client.solrj.SolrServerException: IOException occured when 
talking to server at: https://127.0.0.1:50584/solr/collection1
at 
__randomizedtesting.SeedInfo.seed([6289A0DF930E9789:368BB43A5F3E12D3]:0)
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:435)
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
at 
org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
at org.apache.solr.client.solrj.SolrServer.commit(SolrServer.java:168)
at org.apache.solr.client.solrj.SolrServer.commit(SolrServer.java:146)
at 
org.apache.solr.request.TestRemoteStreaming.doBefore(TestRemoteStreaming.java:60)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:771)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 

[jira] [Commented] (LUCENE-4583) StraightBytesDocValuesField fails if bytes > 32k

2013-05-12 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13655522#comment-13655522
 ] 

Robert Muir commented on LUCENE-4583:
-

{quote}
Are you also against just fixing the limit in the core code
(IndexWriter/BinaryDocValuesWriter) and leaving the limit enforced in
the existing DVFormats (my patch)?

I thought that was a good compromise ...

This way at least users can still build their own / use DVFormats that
don't have the limit.
{quote}

I'm worried about a few things:
* I think the limit is ok, because in my eyes it's the limit of a single term. I 
feel that anyone arguing for increasing the limit only has abuse cases (not use 
cases) in mind. I'm worried about making dv more complicated for no good 
reason. 
* I'm worried about opening up the possibility of bugs and index corruption 
(e.g. clearly MULTIPLE people on this issue don't understand why you cannot just 
remove IndexWriter's limit without causing corruption).
* I'm really worried about the precedent: once these abuse-case fans have their 
way and increase this limit, they will next argue that we should do the same 
for SORTED, maybe SORTED_SET, maybe even inverted terms. They will make 
arguments that it's the same as binary, just with sorting, and why should 
sorting bring in additional limits. I can easily see this all spinning out of 
control.
* I think that most people hitting the limit are abusing docvalues as stored 
fields, so the limit is providing a really useful thing today, actually, and 
telling them they are doing something wrong.

The only argument I have *for* removing the limit is that by expanding BINARY's 
possible abuse cases (in my opinion, that's pretty much all it's useful for), we 
might prevent additional complexity from being added elsewhere to DV in the 
long-term.

 StraightBytesDocValuesField fails if bytes > 32k
 

 Key: LUCENE-4583
 URL: https://issues.apache.org/jira/browse/LUCENE-4583
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Affects Versions: 4.0, 4.1, 5.0
Reporter: David Smiley
Priority: Critical
 Fix For: 4.4

 Attachments: LUCENE-4583.patch, LUCENE-4583.patch, LUCENE-4583.patch


 I didn't observe any limitations on the size of a bytes based DocValues field 
 value in the docs.  It appears that the limit is 32k, although I didn't get 
 any friendly error telling me that was the limit.  32k is kind of small IMO; 
 I suspect this limit is unintended and as such is a bug. The following 
 test fails:
 {code:java}
   public void testBigDocValue() throws IOException {
 Directory dir = newDirectory();
 IndexWriter writer = new IndexWriter(dir, writerConfig(false));
 Document doc = new Document();
 BytesRef bytes = new BytesRef((4+4)*4097);//4096 works
 bytes.length = bytes.bytes.length;//byte data doesn't matter
 doc.add(new StraightBytesDocValuesField("dvField", bytes));
 writer.addDocument(doc);
 writer.commit();
 writer.close();
 DirectoryReader reader = DirectoryReader.open(dir);
 DocValues docValues = MultiDocValues.getDocValues(reader, "dvField");
 //FAILS IF BYTES IS BIG!
 docValues.getSource().getBytes(0, bytes);
 reader.close();
 dir.close();
   }
 {code}




[jira] [Commented] (LUCENE-4583) StraightBytesDocValuesField fails if bytes > 32k

2013-05-12 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13655525#comment-13655525
 ] 

Jack Krupansky commented on LUCENE-4583:


bq. abusing docvalues as stored fields

Great point. I have to admit that I still don't have a 100% handle on the use 
case(s) for docvalues vs. stored fields, even though I've asked on the list. I 
mean, sometimes the chatter seems to suggest that dv is the successor to stored 
values. Hmmm... in that case, I should be able to store the full text of a 24 
MB PDF file in a dv. Now, I know that isn't true.

Maybe we just need to start with some common use cases, based on size: tiny (16 
bytes or less), small (256 or 1024 bytes or less), medium (up to 32K), and 
large (upwards of 1MB, and larger.) It sounds like large implies stored field.

A related concern is dv or stored fields that need a bias towards being in 
memory and in the heap, vs. a bias towards being off heap. Maybe the size 
category is the hint: tiny and small bias towards on-heap, medium and certainly 
large bias towards off-heap. If people are only going towards DV because they 
think they get off-heap, then maybe we need to reconsider the model of what DV 
vs. stored is really all about. But then that leads back to DV somehow morphing 
out of column-stride fields.


 StraightBytesDocValuesField fails if bytes > 32k
 

 Key: LUCENE-4583
 URL: https://issues.apache.org/jira/browse/LUCENE-4583
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Affects Versions: 4.0, 4.1, 5.0
Reporter: David Smiley
Priority: Critical
 Fix For: 4.4

 Attachments: LUCENE-4583.patch, LUCENE-4583.patch, LUCENE-4583.patch


 I didn't observe any limitations on the size of a bytes based DocValues field 
 value in the docs.  It appears that the limit is 32k, although I didn't get 
 any friendly error telling me that was the limit.  32k is kind of small IMO; 
 I suspect this limit is unintended and as such is a bug. The following 
 test fails:
 {code:java}
   public void testBigDocValue() throws IOException {
 Directory dir = newDirectory();
 IndexWriter writer = new IndexWriter(dir, writerConfig(false));
 Document doc = new Document();
 BytesRef bytes = new BytesRef((4+4)*4097);//4096 works
 bytes.length = bytes.bytes.length;//byte data doesn't matter
 doc.add(new StraightBytesDocValuesField("dvField", bytes));
 writer.addDocument(doc);
 writer.commit();
 writer.close();
 DirectoryReader reader = DirectoryReader.open(dir);
 DocValues docValues = MultiDocValues.getDocValues(reader, "dvField");
 //FAILS IF BYTES IS BIG!
 docValues.getSource().getBytes(0, bytes);
 reader.close();
 dir.close();
   }
 {code}




[jira] [Commented] (LUCENE-4583) StraightBytesDocValuesField fails if bytes > 32k

2013-05-12 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13655526#comment-13655526
 ] 

Michael McCandless commented on LUCENE-4583:


{quote}
I'm worried about a few things:
I think the limit is ok, because in my eyes it's the limit of a single term. I 
feel that anyone arguing for increasing the limit only has abuse cases (not use 
cases) in mind. I'm worried about making dv more complicated for no good reason.
{quote}

I guess I see DV binary as more like a stored field, just stored
column stride for faster access.  Faceting (and I guess spatial)
encode many things inside one DV binary field.

bq. I'm worried about opening up the possibility of bugs and index corruption 
(e.g. clearly MULTIPLE people on this issue don't understand why you cannot just 
remove IndexWriter's limit without causing corruption).

I agree this is a concern and we need to take it slow, add good
test coverage.

{quote}
I'm really worried about the precedent: once these abuse-case-fans have their 
way and increase this limit, they will next argue that we should do the same 
for SORTED, maybe SORTED_SET, maybe even inverted terms. They will make 
arguments that it's the same as binary, just with sorting, and why should 
sorting bring in additional limits. I can easily see this all spinning out of 
control.
I think that most people hitting the limit are abusing docvalues as stored 
fields, so the limit is providing a really useful thing today actually, and 
telling them they are doing something wrong.
{quote}

I don't think we should change the limit for sorted/set or terms: I
think we should raise the limit ONLY for BINARY, and declare that DV
BINARY is for these abuse cases.  So if you really, really want
sorted set with a higher limit then you will have to encode it yourself
into DV BINARY.
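
For illustration, a minimal sketch (not from any patch on this issue) of what "encode it yourself into DV BINARY" could look like against the Lucene 4.x API; the class, the packInts/addTo/readFrom helpers, and the big-endian layout are all made up for the example:

{code:java}
import java.io.IOException;

import org.apache.lucene.document.BinaryDocValuesField;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.BinaryDocValues;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.MultiDocValues;
import org.apache.lucene.util.BytesRef;

public class BinaryDvPacking {

  // Pack ints as fixed-width 4-byte big-endian values into one BytesRef.
  static BytesRef packInts(int... values) {
    byte[] buf = new byte[4 * values.length];
    for (int i = 0; i < values.length; i++) {
      int v = values[i];
      buf[4 * i]     = (byte) (v >>> 24);
      buf[4 * i + 1] = (byte) (v >>> 16);
      buf[4 * i + 2] = (byte) (v >>> 8);
      buf[4 * i + 3] = (byte) v;
    }
    return new BytesRef(buf);
  }

  // Index-time: one BINARY DV field carries the whole packed payload.
  static void addTo(Document doc, String field, int... values) {
    doc.add(new BinaryDocValuesField(field, packInts(values)));
  }

  // Search-time: decode the payload back out of the BINARY DV field.
  static int[] readFrom(DirectoryReader reader, String field, int docID) throws IOException {
    BinaryDocValues dv = MultiDocValues.getBinaryValues(reader, field); // null if field absent
    BytesRef scratch = new BytesRef();
    dv.get(docID, scratch);
    int[] out = new int[scratch.length / 4];
    for (int i = 0; i < out.length; i++) {
      int o = scratch.offset + 4 * i;
      out[i] = ((scratch.bytes[o] & 0xFF) << 24)
             | ((scratch.bytes[o + 1] & 0xFF) << 16)
             | ((scratch.bytes[o + 2] & 0xFF) << 8)
             |  (scratch.bytes[o + 3] & 0xFF);
    }
    return out;
  }
}
{code}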

{quote}
The only argument I have for removing the limit is that by expanding BINARY's 
possible abuse cases (in my opinion, that's pretty much all it's useful for), we 
might prevent additional complexity from being added elsewhere to DV in the 
long-term.
{quote}

+1


 StraightBytesDocValuesField fails if bytes > 32k
 

 Key: LUCENE-4583
 URL: https://issues.apache.org/jira/browse/LUCENE-4583
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Affects Versions: 4.0, 4.1, 5.0
Reporter: David Smiley
Priority: Critical
 Fix For: 4.4

 Attachments: LUCENE-4583.patch, LUCENE-4583.patch, LUCENE-4583.patch


 I didn't observe any limitations on the size of a bytes based DocValues field 
 value in the docs.  It appears that the limit is 32k, although I didn't get 
 any friendly error telling me that was the limit.  32k is kind of small IMO; 
 I suspect this limit is unintended and as such is a bug. The following 
 test fails:
 {code:java}
   public void testBigDocValue() throws IOException {
 Directory dir = newDirectory();
 IndexWriter writer = new IndexWriter(dir, writerConfig(false));
 Document doc = new Document();
 BytesRef bytes = new BytesRef((4+4)*4097);//4096 works
 bytes.length = bytes.bytes.length;//byte data doesn't matter
 doc.add(new StraightBytesDocValuesField("dvField", bytes));
 writer.addDocument(doc);
 writer.commit();
 writer.close();
 DirectoryReader reader = DirectoryReader.open(dir);
 DocValues docValues = MultiDocValues.getDocValues(reader, "dvField");
 //FAILS IF BYTES IS BIG!
 docValues.getSource().getBytes(0, bytes);
 reader.close();
 dir.close();
   }
 {code}




[jira] [Commented] (LUCENE-4583) StraightBytesDocValuesField fails if bytes > 32k

2013-05-12 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13655529#comment-13655529
 ] 

Michael McCandless commented on LUCENE-4583:


bq. I have to admit that I still don't have a 100% handle on the use case(s) 
for docvalues vs. stored fields, even though I've asked on the list. I mean, 
sometimes the chatter seems to suggest that dv is the successor to stored 
values. Hmmm... in that case, I should be able to store the full text of a 24 
MB PDF file in a dv. Now, I know that isn't true.

The big difference is that DV fields are stored column stride, so you
can decide on a field-by-field basis whether it will be in RAM, on disk,
etc., and you get faster access if you know you just need to work with
one or two fields.

Vs. stored fields, where all fields for one document are stored
together.

Each has different tradeoffs so it's really up to the app to decide
which is best... if you know you need 12 fields loaded for each
document you are presenting on the current page, stored fields is
probably best.

But if you need one field to use as a scoring factor (e.g. maybe you are
boosting by recency) then column-stride is better.
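
As a rough sketch of that recency case (not code from this issue; the "timestamp" field name, the class, and the decay formula are just placeholders), reading the factor column-stride via the 4.x DV API might look like:

{code:java}
import java.io.IOException;

import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.MultiDocValues;
import org.apache.lucene.index.NumericDocValues;

public class RecencyBoost {
  private final NumericDocValues timestamps; // column-stride: one field, all docs

  public RecencyBoost(DirectoryReader reader) throws IOException {
    // May be null if no document has the field; a real impl would check.
    timestamps = MultiDocValues.getNumericValues(reader, "timestamp");
  }

  // Multiply the raw score by a simple reciprocal decay on document age.
  public float boost(int docID, float rawScore, long nowMillis) {
    long ageMillis = nowMillis - timestamps.get(docID);
    double ageDays = ageMillis / 86400000.0;
    return (float) (rawScore / (1.0 + Math.max(0.0, ageDays)));
  }
}
{code}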


 StraightBytesDocValuesField fails if bytes > 32k
 

 Key: LUCENE-4583
 URL: https://issues.apache.org/jira/browse/LUCENE-4583
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Affects Versions: 4.0, 4.1, 5.0
Reporter: David Smiley
Priority: Critical
 Fix For: 4.4

 Attachments: LUCENE-4583.patch, LUCENE-4583.patch, LUCENE-4583.patch


 I didn't observe any limitations on the size of a bytes based DocValues field 
 value in the docs.  It appears that the limit is 32k, although I didn't get 
 any friendly error telling me that was the limit.  32k is kind of small IMO; 
 I suspect this limit is unintended and as such is a bug. The following 
 test fails:
 {code:java}
   public void testBigDocValue() throws IOException {
 Directory dir = newDirectory();
 IndexWriter writer = new IndexWriter(dir, writerConfig(false));
 Document doc = new Document();
 BytesRef bytes = new BytesRef((4+4)*4097);//4096 works
 bytes.length = bytes.bytes.length;//byte data doesn't matter
 doc.add(new StraightBytesDocValuesField("dvField", bytes));
 writer.addDocument(doc);
 writer.commit();
 writer.close();
 DirectoryReader reader = DirectoryReader.open(dir);
 DocValues docValues = MultiDocValues.getDocValues(reader, "dvField");
 //FAILS IF BYTES IS BIG!
 docValues.getSource().getBytes(0, bytes);
 reader.close();
 dir.close();
   }
 {code}




[jira] [Updated] (LUCENE-4975) Add Replication module to Lucene

2013-05-12 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-4975:
---

Attachment: LUCENE-4975.patch

New patch changes how handlers work:

* Beasting found a seed which uncovered a major problem with their current 
operation. They were trying to be too honest with the index and e.g. 
revert/delete upon any exception that occurred.

* Thanks to MDW, Mike and I decided to keep the handlers simple -- if a handler 
successfully copies + syncs the revision files, then this is considered the 
new revision.

* Cleaning up the index is now done not through IndexWriter, but rather by 
deleting all files not referenced by the last commit.
** That cleanup is best-effort only ... if it fails, it just logs this 
information and does not act on it. Cleanup can happen later too.

* That means that if you have a really nasty crazy IO system (like MDW 
sometimes acts), the Replicator is not the one that's going to care about it. 
The app will hit those weird errors in other places too, e.g. when it tries to 
refresh SearcherManager or perform search.
** These errors are not caused by the Replicator or bad handler operation. I.e. 
if the handler successfully called fsync(), yet the IO system decides to fail 
later ... that's really not the handler's problem.

* Therefore the handlers are now simpler, don't use IW (and the crazy need to 
rollback()), and once files are successfully copied + sync'd, no more 
exceptions can be thrown by the handler (the callback may still fail, but 
that's ok).

* I also removed the timeout behavior the test employed -- now that 
ReplicationClient has isUpdateThreadAlive(), assertHandlerRevision loops as 
long as the client is alive. If there's a serious bug, test-framework will 
terminate the test after 2 hours ...

* ReplicationClient.startUpdateThread is nicer -- allows starting the thread if 
updateThread != null, but !isAlive.

Now beasting this patch.
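
To make the simplified contract concrete, here is a schematic sketch -- these are not the patch's actual classes or method names, just an illustration of "copy + sync the revision files, treat that as the new revision, then best-effort cleanup of files not referenced by the last commit":

{code:java}
import java.io.IOException;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

import org.apache.lucene.index.SegmentInfos;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.IOContext;

public class SimpleRevisionCopier {

  public void copyRevision(Directory source, Directory target, List<String> files)
      throws IOException {
    for (String file : files) {
      source.copy(target, file, file, IOContext.DEFAULT);
    }
    target.sync(files);        // after this point the revision is considered "in"
    cleanupBestEffort(target); // failures here are only logged, never rethrown
  }

  private void cleanupBestEffort(Directory dir) {
    try {
      SegmentInfos infos = new SegmentInfos();
      infos.read(dir); // the last commit
      Set<String> referenced = new HashSet<String>(infos.files(dir, true));
      for (String file : dir.listAll()) {
        if (!referenced.contains(file)) {
          try {
            dir.deleteFile(file);
          } catch (IOException ignored) {
            // best effort: cleanup can happen on a later revision
          }
        }
      }
    } catch (IOException ignored) {
      // best effort only, as described above
    }
  }
}
{code}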

 Add Replication module to Lucene
 

 Key: LUCENE-4975
 URL: https://issues.apache.org/jira/browse/LUCENE-4975
 Project: Lucene - Core
  Issue Type: New Feature
Reporter: Shai Erera
Assignee: Shai Erera
 Attachments: LUCENE-4975.patch, LUCENE-4975.patch, LUCENE-4975.patch, 
 LUCENE-4975.patch, LUCENE-4975.patch, LUCENE-4975.patch, LUCENE-4975.patch, 
 LUCENE-4975.patch, LUCENE-4975.patch, LUCENE-4975.patch, LUCENE-4975.patch, 
 LUCENE-4975.patch


 I wrote a replication module which I think will be useful to Lucene users who 
 want to replicate their indexes for e.g. high availability, taking hot 
 backups, etc.
 I will upload a patch soon where I'll describe in general how it works.




[JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.7.0_21) - Build # 5619 - Failure!

2013-05-12 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/5619/
Java: 64bit/jdk1.7.0_21 -XX:+UseCompressedOops -XX:+UseG1GC

1 tests failed.
REGRESSION:  
org.apache.lucene.search.grouping.AllGroupHeadsCollectorTest.testBasic

Error Message:
testBasic(org.apache.lucene.search.grouping.AllGroupHeadsCollectorTest): Insane 
FieldCache usage(s) found expected:<0> but was:<2>

Stack Trace:
java.lang.AssertionError: 
testBasic(org.apache.lucene.search.grouping.AllGroupHeadsCollectorTest): Insane 
FieldCache usage(s) found expected:<0> but was:<2>
at 
__randomizedtesting.SeedInfo.seed([CA214642676B81D0:61DB5B57B8B707FE]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:472)
at 
org.apache.lucene.util.LuceneTestCase.assertSaneFieldCaches(LuceneTestCase.java:592)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:55)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at java.lang.Thread.run(Thread.java:722)




Build Log:
[...truncated 6970 lines...]
[junit4:junit4] Suite: 
org.apache.lucene.search.grouping.AllGroupHeadsCollectorTest
[junit4:junit4]   2 *** BEGIN 
testBasic(org.apache.lucene.search.grouping.AllGroupHeadsCollectorTest): Insane 
FieldCache usage(s) ***
[junit4:junit4]   2 VALUEMISMATCH: Multiple distinct value objects for 
java.lang.Object@2e63d475+id
[junit4:junit4]   2'java.lang.Object@2e63d475'='id',class 
org.apache.lucene.index.SortedDocValues,0.5=org.apache.lucene.search.FieldCacheImpl$SortedDocValuesImpl#1434779092
 (size =~ 232 bytes)
[junit4:junit4]   2
'java.lang.Object@2e63d475'='id',int,org.apache.lucene.search.FieldCache.DEFAULT_INT_PARSER=org.apache.lucene.search.FieldCacheImpl$IntsFromArray#1338614742
 (size =~ 56 bytes)
[junit4:junit4]   2
'java.lang.Object@2e63d475'='id',int,null=org.apache.lucene.search.FieldCacheImpl$IntsFromArray#1338614742
 (size =~ 56 bytes)
[junit4:junit4]   2 
[junit4:junit4]   2 VALUEMISMATCH: Multiple distinct value objects for 
java.lang.Object@8aa35b+id
[junit4:junit4]   2

[jira] [Commented] (LUCENE-3422) IndeIndexWriter.optimize() throws FileNotFoundException and IOException

2013-05-12 Thread l0co (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13655545#comment-13655545
 ] 

l0co commented on LUCENE-3422:
--

I have similar problems in 3.1.0, like the ones here and in LUCENE-1638. I 
believe there's a bug in Lucene's synchronization/multithreading execution 
model during merges.

I use Hibernate Search with exclusive index usage enabled, which means that 
the same IndexWriter is used constantly per index dir and only closed for 
specific reasons. On the other hand, the IndexWriter uses 
ConcurrentMergeScheduler by default. I occasionally get a 
FileNotFoundException (FNFE) telling me that some segment files are not found.

When I look into the implementation, I can see in merge():
{code}
void merge(MergePolicy.OneMerge merge) {
  mergeInit(merge);
  mergeMiddle(merge);
  mergeSuccess(merge);
  // is the problem here (?)
  mergeFinish(merge);
}
{code}

The merge() method is not synchronized at all. I believe that in the first three 
lines (before the "is the problem here (?)" line) the merge is started and 
scheduled, and can do some work (like removing segment files). However, the 
segmentInfos update is only done in the mergeFinish() method (where it waits 
for all merges and then updates the segmentInfos). If, at the "is the problem 
here (?)" line, another thread invokes doFlush() (which is synchronized), we 
have the IndexWriter in a partially or completely done merge (some files are 
removed), but it still has outdated segmentInfos, which will only be updated 
later, in mergeFinish().

This might be a wrong interpretation because I didn't thoroughly review the 
code, but the error is there. It might occasionally produce FNFE on doFlush() 
because of outdated segmentInfos relative to the current directory state in 
the filesystem (this has been verified).
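
Schematically, the suspected interleaving looks like this toy model (not 
Lucene's actual classes -- two sets stand in for the directory contents and 
the in-memory segmentInfos):

{code:java}
import java.io.FileNotFoundException;
import java.util.HashSet;
import java.util.Set;

public class StaleInfosRace {
  private final Set<String> directoryFiles = new HashSet<String>(); // models the index dir
  private final Set<String> segmentInfos = new HashSet<String>();   // models segmentInfos

  // Unsynchronized, like merge(): directory contents change before the
  // in-memory infos are updated in mergeFinish().
  void merge() {
    directoryFiles.remove("_old.cfs");  // mergeMiddle: old segment files removed
    // <-- window: a concurrent doFlush() still sees "_old.cfs" in segmentInfos
    synchronized (this) {
      segmentInfos.remove("_old.cfs");  // mergeFinish: infos updated too late
    }
  }

  synchronized void doFlush() throws FileNotFoundException {
    for (String file : segmentInfos) {
      if (!directoryFiles.contains(file)) {
        throw new FileNotFoundException(file); // the FNFE described above
      }
    }
  }
}
{code}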

 IndeIndexWriter.optimize() throws FileNotFoundException and IOException
 ---

 Key: LUCENE-3422
 URL: https://issues.apache.org/jira/browse/LUCENE-3422
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Elizabeth Nisha

 I am using lucene 3.0.2 search APIs for my application. 
 Indexed data is about 350MB and time taken for indexing is 25 hrs. Search 
 indexing and Optimization runs in two different threads. Optimization runs 
 for every 1 hour and it doesn't run while indexing is going on and vice 
 versa. When optimization is going on using IndexWriter.optimize(), 
 FileNotFoundException and IOException are seen in my log and the index file 
 is getting corrupted, log says
 1. java.io.IOException: No sub-file with id _5r8.fdt found 
 [The file name in this message changes over time (_5r8.fdt, _6fa.fdt, 
 _6uh.fdt, ..., _emv.fdt) ]
 2. java.io.FileNotFoundException: 
 /local/groups/necim/index_5.3/index/_bdx.cfs (No such file or directory)  
 3. java.io.FileNotFoundException: 
 /local/groups/necim/index_5.3/index/_hkq.cfs (No such file or directory)
   Stack trace: java.io.IOException: background merge hit exception: 
 _hkp:c100->_hkp _hkq:c100->_hkp _hkr:c100->_hkr _hks:c100->_hkr _hxb:c5500 
 _hx5:c1000 _hxc:c19884 into _hxd [optimize] [mergeDocStores]
at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2359)
at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2298)
at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2268)
at com.telelogic.cs.search.SearchIndex.doOptimize(SearchIndex.java:130)
at 
 com.telelogic.cs.search.SearchIndexerThread$1.run(SearchIndexerThread.java:337)
at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
 Caused by: java.io.FileNotFoundException: 
 /local/groups/necim/index_5.3/index/_hkq.cfs (No such file or directory)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
at 
 org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput$Descriptor.<init>(SimpleFSDirectory.java:76)
at 
 org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.<init>(SimpleFSDirectory.java:97)
at 
 org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.<init>(NIOFSDirectory.java:87)
at 
 org.apache.lucene.store.NIOFSDirectory.openInput(NIOFSDirectory.java:67)
at 
 org.apache.lucene.index.CompoundFileReader.<init>(CompoundFileReader.java:67)
at 
 org.apache.lucene.index.SegmentReader$CoreReaders.<init>(SegmentReader.java:114)
at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:590)
at 
 org.apache.lucene.index.IndexWriter$ReaderPool.get(IndexWriter.java:616)
at 
 org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4309)

SegmentInfos static infoStream (and logging in general)

2013-05-12 Thread Shai Erera
Hi

Over at LUCENE-4975 (Replicator) I use InfoStream.getDefault() for logging
messages. That uses the static InfoStream, which means it affects all
replicator instances in the same JVM.

I remember that one of the reasons for not using Logger is its
static nature, i.e. once you turn on logging for e.g. o.a.l.index, it
affects all Lucene instances in the same JVM.

Before I went and made it an instance member of e.g. ReplicationClient,
ReplicationHandler etc., I checked and noticed that SegmentInfos has *only*
a static infoStream, which is in fact a PrintStream, not even an InfoStream.

Why doesn't SIS use InfoStream, and why is it static? Is it an error, or
was there a reason behind it?

Do we have any policy around logging / info-streaming? Having the
replicator output log messages proved invaluable while debugging the
client. Should I just make all classes that have interesting log messages
InfoStream-aware by taking an InfoStream instance as a member? Is using
InfoStream.getDefault() considered an error? If so, maybe we should add it
to the forbidden API check?
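
To make the instance-based option concrete, here's a minimal sketch of what I
mean (ReplicationClientSketch is just a stand-in name for any class with
interesting log messages; only InfoStream.isEnabled/message are the real APIs):

import org.apache.lucene.util.InfoStream;

public class ReplicationClientSketch {
  private static final String COMPONENT = "ReplicationThread";
  private final InfoStream infoStream;

  public ReplicationClientSketch(InfoStream infoStream) {
    this.infoStream = infoStream; // per-instance, not JVM-global
  }

  void doUpdate() {
    if (infoStream.isEnabled(COMPONENT)) {
      infoStream.message(COMPONENT, "checking for updates...");
    }
    // ... actual update logic ...
  }
}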

Shai


[JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.7.0) - Build # 462 - Failure!

2013-05-12 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/462/
Java: 64bit/jdk1.7.0 -XX:-UseCompressedOops -XX:+UseG1GC

All tests passed

Build Log:
[...truncated 9178 lines...]
[junit4:junit4] ERROR: JVM J0 ended with an exception, command line: 
/Library/Java/JavaVirtualMachines/jdk1.7.0_21.jdk/Contents/Home/jre/bin/java 
-XX:-UseCompressedOops -XX:+UseG1GC -XX:+HeapDumpOnOutOfMemoryError 
-XX:HeapDumpPath=/Users/jenkins/jenkins-slave/workspace/Lucene-Solr-trunk-MacOSX/heapdumps
 -Dtests.prefix=tests -Dtests.seed=2D36861A5DEDFC8C -Xmx512M -Dtests.iters= 
-Dtests.verbose=false -Dtests.infostream=false -Dtests.codec=random 
-Dtests.postingsformat=random -Dtests.docvaluesformat=random 
-Dtests.locale=random -Dtests.timezone=random -Dtests.directory=random 
-Dtests.linedocsfile=europarl.lines.txt.gz -Dtests.luceneMatchVersion=5.0 
-Dtests.cleanthreads=perClass 
-Djava.util.logging.config.file=/Users/jenkins/jenkins-slave/workspace/Lucene-Solr-trunk-MacOSX/lucene/tools/junit4/logging.properties
 -Dtests.nightly=false -Dtests.weekly=false -Dtests.slow=true 
-Dtests.asserts.gracious=false -Dtests.multiplier=1 -DtempDir=. 
-Djava.io.tmpdir=. 
-Djunit4.tempDir=/Users/jenkins/jenkins-slave/workspace/Lucene-Solr-trunk-MacOSX/solr/build/solr-core/test/temp
 
-Dclover.db.dir=/Users/jenkins/jenkins-slave/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/clover/db
 -Djava.security.manager=org.apache.lucene.util.TestSecurityManager 
-Djava.security.policy=/Users/jenkins/jenkins-slave/workspace/Lucene-Solr-trunk-MacOSX/lucene/tools/junit4/tests.policy
 -Dlucene.version=5.0-SNAPSHOT -Djetty.testMode=1 -Djetty.insecurerandom=1 
-Dsolr.directoryFactory=org.apache.solr.core.MockDirectoryFactory 
-Djava.awt.headless=true -Dfile.encoding=US-ASCII -classpath 

[jira] [Updated] (SOLR-3038) Solrj should use javabin wireformat by default with updaterequests

2013-05-12 Thread Shawn Heisey (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Heisey updated SOLR-3038:
---

Attachment: SOLR-3038-abstract-writer.patch

The extremely simple fix for this is to change one line in HttpSolrServer so 
that it creates its request writer as a BinaryRequestWriter.

The attached patch (missing CHANGES.txt) goes further.  In addition to the 
simple one-line fix, it turns RequestWriter into an abstract class and creates 
XMLRequestWriter as an implementation.

I'm running tests now; if there are failures I will adjust it.  There could be 
code out there that will break with this change, so I'd like to know if this is 
a bad idea for 4.x.
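
For reference, a client can already opt in today with one call; the one-line 
default change described above would make this explicit setRequestWriter call 
unnecessary (the URL below is just a placeholder):

{code:java}
import org.apache.solr.client.solrj.impl.BinaryRequestWriter;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class JavabinClient {
  public static HttpSolrServer create() {
    HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
    // Use the javabin wire format for updates instead of the XML default.
    server.setRequestWriter(new BinaryRequestWriter());
    return server;
  }
}
{code}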

 Solrj should use javabin wireformat by default with updaterequests
 --

 Key: SOLR-3038
 URL: https://issues.apache.org/jira/browse/SOLR-3038
 Project: Solr
  Issue Type: Improvement
  Components: clients - java
Affects Versions: 4.0-ALPHA
Reporter: Sami Siren
Priority: Minor
 Attachments: SOLR-3038-abstract-writer.patch


 The javabin wire format is faster than xml when feeding Solr - it should 
 become the default. 




[JENKINS-MAVEN] Lucene-Solr-Maven-4.x #324: POMs out of sync

2013-05-12 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Maven-4.x/324/

2 tests failed.
REGRESSION:  org.apache.solr.cloud.ShardSplitTest.testDistribSearch

Error Message:
Wrong doc count on shard1_1 expected:<142> but was:<141>

Stack Trace:
java.lang.AssertionError: Wrong doc count on shard1_1 expected:<142> but 
was:<141>
at 
__randomizedtesting.SeedInfo.seed([20D3A2F3FDB0555E:A1352CEB8AEF3562]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:472)
at 
org.apache.solr.cloud.ShardSplitTest.checkDocCountsAndShardStates(ShardSplitTest.java:167)
at org.apache.solr.cloud.ShardSplitTest.doTest(ShardSplitTest.java:142)


REGRESSION:  org.apache.solr.cloud.SyncSliceTest.testDistribSearch

Error Message:
shard1 is not consistent.  Got 305 from 
http://127.0.0.1:63214/collection1 lastClient and got 5 from 
http://127.0.0.1:45553/collection1

Stack Trace:
java.lang.AssertionError: shard1 is not consistent.  Got 305 from 
http://127.0.0.1:63214/collection1 lastClient and got 5 from 
http://127.0.0.1:45553/collection1
at 
__randomizedtesting.SeedInfo.seed([7D44EC4B504B9610:FCA262532714F62C]:0)
at org.junit.Assert.fail(Assert.java:93)
at 
org.apache.solr.cloud.AbstractFullDistribZkTestBase.checkShardConsistency(AbstractFullDistribZkTestBase.java:963)
at org.apache.solr.cloud.SyncSliceTest.doTest(SyncSliceTest.java:238)




Build Log:
[...truncated 23594 lines...]




Re: Make tests.failfast default to true

2013-05-12 Thread Dawid Weiss
 I think that in general the code that initializes ignoreAfterMaxFailures 
 should take tests.iters into account?
 I.e. what's the point of running a test with maxfailures=5 or failfast=true 
 without iters?

I looked at the code. The thing is, one of these options (failfast?) is,
I think, a legacy thing; maxfailures takes precedence over this
property. When you look at LTC this is indeed the case:

if (failFast) {
  if (maxFailures == Integer.MAX_VALUE) {
    maxFailures = 1;
  } else {
    Logger.getLogger(LuceneTestCase.class.getSimpleName()).warning(
        "Property '" + SYSPROP_MAXFAILURES + "'=" + maxFailures +
        ", 'failfast' is" +
        " ignored.");
  }
}

So if failfast is specified it effectively means the same as
specifying maxFailures=1; otherwise a warning is printed about the
incompatible combination of options. I think we should keep this
behavior -- people may be used to it.

The implementation of maxfailures is JVM-local and indeed is meant
primarily to early-exit (ignore) any tests after the first N failures.
This includes the scenario with multiple test repetitions but was
designed for the general case, not this particular use case. The
default behavior in ANT's JUnit (and in maven too) is to execute all
tests, not to stop on the first error. Again, I think we should keep
this behavior consistent. If you wish to alter it locally on a
persistent basis then how about if you create your own setup in:

<property file="${user.home}/lucene.build.properties"/>

As for tests failing when failfast is set to true -- this is indeed a
bug in tests (they shouldn't be sensitive to this but they are). I
filed this issue to fix this:

https://issues.apache.org/jira/browse/LUCENE-4997

Dawid




[jira] [Created] (LUCENE-4997) TestReproduceMessage fails when tests.failfast is set to true.

2013-05-12 Thread Dawid Weiss (JIRA)
Dawid Weiss created LUCENE-4997:
---

 Summary: TestReproduceMessage fails when tests.failfast is set to 
true.
 Key: LUCENE-4997
 URL: https://issues.apache.org/jira/browse/LUCENE-4997
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Minor
 Fix For: 5.0, 4.4







Re: Make tests.failfast default to true

2013-05-12 Thread Shai Erera
Thanks Dawid. It's not critical to change this setting and I don't want to
override it by default when running tests, only if I use tests.iters.
While we could make LTC default maxFailures to 1 if tests.iters > 1, I
don't think that it's worth it. I'll just type -Dtests.failfast=true.

Anyway, since then I was able to run luceneutil/repeatLuceneTest.py which
does the combination of tests.iters and tests.dups (which is what I wished
-Dtests.dups, or -Dtests.beast would do). So far it's been working great.


Shai


On Sun, May 12, 2013 at 9:15 PM, Dawid Weiss
dawid.we...@cs.put.poznan.plwrote:

  I think that in general the code that initializes
 ignoreAfterMaxFailures should take tests.iters into account?
  I.e. what's the point of running a test with maxfailures=5 or
 failfast=true without iters?

 I looked at the code. The thing is, one of these options (failfast?) is,
 I think, a legacy thing; maxfailures takes precedence over this
 property. When you look at LTC this is indeed the case:

 if (failFast) {
   if (maxFailures == Integer.MAX_VALUE) {
     maxFailures = 1;
   } else {
     Logger.getLogger(LuceneTestCase.class.getSimpleName()).warning(
         "Property '" + SYSPROP_MAXFAILURES + "'=" + maxFailures +
         ", 'failfast' is" +
         " ignored.");
   }
 }

 So if failfast is specified it effectively means the same as
 specifying maxFailures=1; otherwise a warning is printed about the
 incompatible combination of options. I think we should keep this
 behavior -- people may be used to it.

 The implementation of maxfailures is JVM-local and indeed is meant
 primarily to early-exit (ignore) any tests after the first N failures.
 This includes the scenario with multiple test repetitions but was
 designed for the general case, not this particular use case. The
 default behavior in ANT's JUnit (and in maven too) is to execute all
 tests, not to stop on the first error. Again, I think we should keep
 this behavior consistent. If you wish to alter it locally on a
 persistent basis then how about if you create your own setup in:

 <property file="${user.home}/lucene.build.properties"/>

 As for tests failing when failfast is set to true -- this is indeed a
 bug in tests (they shouldn't be sensitive to this but they are). I
 filed this issue to fix this:

 https://issues.apache.org/jira/browse/LUCENE-4997

 Dawid





[JENKINS] Lucene-Solr-trunk-Windows (64bit/jdk1.7.0_21) - Build # 2813 - Still Failing!

2013-05-12 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Windows/2813/
Java: 64bit/jdk1.7.0_21 -XX:-UseCompressedOops -XX:+UseSerialGC

5 tests failed.
FAILED:  
org.apache.solr.client.solrj.embedded.TestEmbeddedSolrServer.testGetCoreContainer

Error Message:


Stack Trace:
org.apache.solr.common.SolrException: 
at 
__randomizedtesting.SeedInfo.seed([6F790073B2D4A0DB:A26A343C0425C865]:0)
at org.apache.solr.core.CoreContainer.load(CoreContainer.java:262)
at org.apache.solr.core.CoreContainer.load(CoreContainer.java:219)
at org.apache.solr.core.CoreContainer.init(CoreContainer.java:149)
at 
org.apache.solr.client.solrj.embedded.AbstractEmbeddedSolrServerTestCase.setUp(AbstractEmbeddedSolrServerTestCase.java:64)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:771)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at java.lang.Thread.run(Thread.java:722)
Caused by: java.io.IOException: The filename, directory name, or volume label 
syntax is incorrect
at java.io.WinNTFileSystem.canonicalize0(Native Method)
at java.io.Win32FileSystem.canonicalize(Win32FileSystem.java:414)
at java.io.File.getCanonicalPath(File.java:589)
at 

Re: Make tests.failfast default to true

2013-05-12 Thread Dawid Weiss
 While we could make LTC default maxFailures to 1 if tests.iters > 1, I
 don't think that it's worth it. I'll just type -Dtests.failfast=true.

The default isn't without a reason -- when you run with tests.iters
and a fixed (method) seed and get multiple failures (but fewer than
the number of runs), this means your test is seed-independent (it
depends on factors other than just the seed). Yours is just one
scenario out of many :)

 Anyway, since then I was able to run luceneutil/repeatLuceneTest.py which
 does the combination of tests.iters and tests.dups (which is what I wished
 -Dtests.dups, or -Dtests.beast would do). So far it's been working great.

Great to hear that. I will fix that issue you came across anyway --
this is a bug so thanks for reporting.

Dawid

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Make tests.failfast default to true

2013-05-12 Thread Shai Erera
Thanks Dawid!

Shai


On Sun, May 12, 2013 at 10:03 PM, Dawid Weiss
dawid.we...@cs.put.poznan.pl wrote:

  While we could make LTC default maxFailures to 1 if tests.iters > 1, I
  don't think that it's worth it. I'll just type -Dtests.failfast=true.

 The default isn't without a reason -- when you run with tests.iters
 and a fixed (method) seed and get multiple failures (but fewer than
 the number of runs), this means your test is seed-independent (it
 depends on factors other than just the seed). Yours is just one
 scenario out of many :)

  Anyway, since then I was able to run luceneutil/repeatLuceneTest.py which
  does the combination of tests.iters and tests.dups (which is what I
 wished
  -Dtests.dups, or -Dtests.beast would do). So far it's been working great.

 Great to hear that. I will fix that issue you came across anyway --
 this is a bug so thanks for reporting.

 Dawid

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




[jira] [Updated] (SOLR-3038) Solrj should use javabin wireformat by default with updaterequests

2013-05-12 Thread Shawn Heisey (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Heisey updated SOLR-3038:
---

Attachment: SOLR-3038-abstract-writer.patch

Earlier patch was incomplete - forgot an svn add.

 Solrj should use javabin wireformat by default with updaterequests
 --

 Key: SOLR-3038
 URL: https://issues.apache.org/jira/browse/SOLR-3038
 Project: Solr
  Issue Type: Improvement
  Components: clients - java
Affects Versions: 4.0-ALPHA
Reporter: Sami Siren
Priority: Minor
 Attachments: SOLR-3038-abstract-writer.patch, 
 SOLR-3038-abstract-writer.patch


 The javabin wire format is faster than xml when feeding Solr - it should 
 become the default. 
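
For reference, opting into javabin from SolrJ is already a one-liner on the client side; a minimal sketch (the server URL and document values are made up):

{code:java}
import org.apache.solr.client.solrj.impl.BinaryRequestWriter;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class JavabinUpdateExample {
  public static void main(String[] args) throws Exception {
    // Hypothetical server URL; HttpSolrServer is the stock SolrJ 4.x client.
    HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr");
    // Until javabin becomes the default, updates go out as XML unless a
    // BinaryRequestWriter is installed explicitly:
    server.setRequestWriter(new BinaryRequestWriter());

    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "1");
    server.add(doc);
    server.commit();
  }
}
{code}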

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [jira] [Commented] (LUCENE-4995) Remove the strong reference of CompressingStoredFieldsReader on the decompression buffer

2013-05-12 Thread Savia Beson
I do not think the real problem gets solved by fixing it here; this is just one 
particular instance of the problem and it will pop up somewhere else. If you 
have a lot of threads, or very high thread churn, gc()/OOM is going to 
hurt with or without this patch, just slightly later. It is hard to balance 
between memory needs and pressure at gc(), and we have no easy knobs to control 
it at the application level. 

Would a globally bounded object pool, per index cache, be an option, with new 
as the fallback? It is another sync, but that should be acceptable and probably 
useful in other places too. With smart defaults it would be completely transparent 
for the end user, but it would let these decisions become the user's 
responsibility (a zero-sized pool == always create the object… )
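
To make the idea concrete, here is a minimal sketch of such a globally bounded pool with new as the fallback (the class name and queue choice are mine, not a patch):

{code:java}
import java.util.concurrent.ArrayBlockingQueue;

// Hypothetical bounded pool: never blocks and never holds more than
// `capacity` buffers, so a zero-sized pool degenerates to always allocating.
final class DecompressionBufferPool {
  private final ArrayBlockingQueue<byte[]> pool; // the single sync point
  private final int bufferSize;

  DecompressionBufferPool(int capacity, int bufferSize) {
    this.pool = capacity > 0 ? new ArrayBlockingQueue<byte[]>(capacity) : null;
    this.bufferSize = bufferSize;
  }

  byte[] acquire() {
    byte[] buffer = (pool == null) ? null : pool.poll();
    return buffer != null ? buffer : new byte[bufferSize]; // fallback: plain new
  }

  void release(byte[] buffer) {
    if (pool != null && buffer.length == bufferSize) {
      pool.offer(buffer); // silently dropped if the pool is already full
    }
  }
}
{code}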
 


On May 11, 2013, at 10:25 PM, Adrien Grand (JIRA) j...@apache.org wrote:

 
[ 
 https://issues.apache.org/jira/browse/LUCENE-4995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13655364#comment-13655364
  ] 
 
 Adrien Grand commented on LUCENE-4995:
 --
 
 bq. Are we sure this is the right thing to do?
 
 I have no idea at all. On the one hand, it looks reasonable to me to have a 
 reusable per-thread buffer to handle decompression, but on the other hand, it 
 makes me unhappy that its size is unbounded: if an index has a few 1MB 
 documents on S segments and T threads, the JVM will have to reserve S*T*1MB of 
 heap just to be able to handle decompression.
 
 Remove the strong reference of CompressingStoredFieldsReader on the 
 decompression buffer
 
 
Key: LUCENE-4995
URL: https://issues.apache.org/jira/browse/LUCENE-4995
Project: Lucene - Core
 Issue Type: Bug
   Reporter: Adrien Grand
   Assignee: Adrien Grand
Attachments: LUCENE-4995.patch
 
 
 CompressingStoredFieldsReader has a strong reference on the buffer it uses 
 for decompression. Although it makes the reader able to reuse this buffer, 
 this can trigger high memory usage in case some documents are very large. 
 Creating this buffer on demand would help give memory back to the JVM.
 
 --
 This message is automatically generated by JIRA.
 If you think it was sent incorrectly, please contact your JIRA administrators
 For more information on JIRA, see: http://www.atlassian.com/software/jira
 
 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org
 


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-4997) Internal test framework's tests are sensitive to previous test failures and tests.failfast.

2013-05-12 Thread Dawid Weiss (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss updated LUCENE-4997:


Summary: Internal test framework's tests are sensitive to previous test 
failures and tests.failfast.  (was: TestReproduceMessage fails when 
tests.failfast is set to true.)

 Internal test framework's tests are sensitive to previous test failures and 
 tests.failfast.
 ---

 Key: LUCENE-4997
 URL: https://issues.apache.org/jira/browse/LUCENE-4997
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Minor
 Fix For: 5.0, 4.4




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4997) Internal test framework's tests are sensitive to previous test failures and tests.failfast.

2013-05-12 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13655621#comment-13655621
 ] 

Commit Tag Bot commented on LUCENE-4997:


[trunk commit] dweiss
http://svn.apache.org/viewvc?view=revision&revision=1481634

LUCENE-4997: Internal test framework's tests are sensitive to previous test 
failures and tests.failfast.

 Internal test framework's tests are sensitive to previous test failures and 
 tests.failfast.
 ---

 Key: LUCENE-4997
 URL: https://issues.apache.org/jira/browse/LUCENE-4997
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Minor
 Fix For: 5.0, 4.4




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4997) Internal test framework's tests are sensitive to previous test failures and tests.failfast.

2013-05-12 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13655623#comment-13655623
 ] 

Commit Tag Bot commented on LUCENE-4997:


[branch_4x commit] dweiss
http://svn.apache.org/viewvc?view=revision&revision=1481636

LUCENE-4997: Internal test framework's tests are sensitive to previous test 
failures and tests.failfast.

 Internal test framework's tests are sensitive to previous test failures and 
 tests.failfast.
 ---

 Key: LUCENE-4997
 URL: https://issues.apache.org/jira/browse/LUCENE-4997
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Minor
 Fix For: 5.0, 4.4




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-4997) Internal test framework's tests are sensitive to previous test failures and tests.failfast.

2013-05-12 Thread Dawid Weiss (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss resolved LUCENE-4997.
-

Resolution: Fixed

 Internal test framework's tests are sensitive to previous test failures and 
 tests.failfast.
 ---

 Key: LUCENE-4997
 URL: https://issues.apache.org/jira/browse/LUCENE-4997
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Minor
 Fix For: 5.0, 4.4




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3422) IndeIndexWriter.optimize() throws FileNotFoundException and IOException

2013-05-12 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13655626#comment-13655626
 ] 

Michael McCandless commented on LUCENE-3422:


An ongoing merge won't remove files from other segments (eg a flushed segment) 
so I don't think that alone can lead to FNFE.

Can you give more details about how you're hitting FNFEs?

 IndeIndexWriter.optimize() throws FileNotFoundException and IOException
 ---

 Key: LUCENE-3422
 URL: https://issues.apache.org/jira/browse/LUCENE-3422
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Elizabeth Nisha

 I am using lucene 3.0.2 search APIs for my application. 
 Indexed data is about 350MB and time taken for indexing is 25 hrs. Search 
 indexing and Optimization runs in two different threads. Optimization runs 
 for every 1 hour and it doesn't run while indexing is going on and vice 
 versa. When optimization is going on using IndexWriter.optimize(), 
 FileNotFoundException and IOException are seen in my log and the index file 
 is getting corrupted, log says
 1. java.io.IOException: No sub-file with id _5r8.fdt found 
 [The file name in this message changes over time (_5r8.fdt, _6fa.fdt, 
 _6uh.fdt, ..., _emv.fdt) ]
 2. java.io.FileNotFoundException: 
 /local/groups/necim/index_5.3/index/_bdx.cfs (No such file or directory)  
 3. java.io.FileNotFoundException: 
 /local/groups/necim/index_5.3/index/_hkq.cfs (No such file or directory)
   Stack trace: java.io.IOException: background merge hit exception: 
 _hkp:c100-_hkp _hkq:c100-_hkp _hkr:c100-_hkr _hks:c100-_hkr _hxb:c5500 
 _hx5:c1000 _hxc:c198
 84 into _hxd [optimize] [mergeDocStores]
at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2359)
at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2298)
at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2268)
at com.telelogic.cs.search.SearchIndex.doOptimize(SearchIndex.java:130)
at 
 com.telelogic.cs.search.SearchIndexerThread$1.run(SearchIndexerThread.java:337)
at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
 Caused by: java.io.FileNotFoundException: 
 /local/groups/necim/index_5.3/index/_hkq.cfs (No such file or directory)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.init(RandomAccessFile.java:212)
at 
 org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput$Descriptor.init(SimpleFSDirectory.java:76)
at 
 org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.init(SimpleFSDirectory.java:97)
at 
 org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.init(NIOFSDirectory.java:87)
at 
 org.apache.lucene.store.NIOFSDirectory.openInput(NIOFSDirectory.java:67)
at 
 org.apache.lucene.index.CompoundFileReader.init(CompoundFileReader.java:67)
at 
 org.apache.lucene.index.SegmentReader$CoreReaders.init(SegmentReader.java:114)
at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:590)
at 
 org.apache.lucene.index.IndexWriter$ReaderPool.get(IndexWriter.java:616)
at 
 org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4309)
at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3965)
at 
 org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:231)
at 
 org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:288)
  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-4901) Unit test TestIndexWriterOnJRECrash does not support IBM Java

2013-05-12 Thread Dawid Weiss (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss updated LUCENE-4901:


Attachment: LUCENE-4901.patch

I've removed the vendor-check assumption entirely and call Runtime.halt() 
instead of messing with Unsafe / zero-pointer dereference.

Take a look; if there are no objections I'll commit it.
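
For context, a minimal illustration of why Runtime.halt() is the portable choice here (this is not the attached patch): it terminates the JVM immediately on any vendor's JRE, skipping shutdown hooks and finalizers, which is exactly the abrupt death the test wants to simulate.

{code:java}
public class HaltDemo {
  public static void main(String[] args) {
    Runtime.getRuntime().addShutdownHook(new Thread() {
      @Override
      public void run() {
        System.out.println("never printed"); // halt() skips shutdown hooks
      }
    });
    // Dies immediately with the given exit status, portably across JRE
    // vendors, unlike dereferencing a zero pointer through sun.misc.Unsafe.
    Runtime.getRuntime().halt(42);
  }
}
{code}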

 Unit test TestIndexWriterOnJRECrash does not support IBM Java 
 --

 Key: LUCENE-4901
 URL: https://issues.apache.org/jira/browse/LUCENE-4901
 Project: Lucene - Core
  Issue Type: Improvement
  Components: general/test
Affects Versions: 4.2
 Environment: Red Hat EL 6.3
 IBM Java 1.6.0
 ANT 1.9.0
Reporter: Rodrigo Trujillo
Assignee: Dawid Weiss
Priority: Minor
 Fix For: 4.2

 Attachments: LUCENE-4901.patch, test-IBM-java-vendor.patch


 I successfully compiled Lucene 4.2 with IBM.
 Then ran unit tests with the nightly option set to true
 The test case TestIndexWriterOnJRECrash was skipped returning IBM 
 Corporation JRE not supported:
 [junit4:junit4] Suite: org.apache.lucene.index.TestIndexWriterOnJRECrash
 [junit4:junit4] IGNOR/A 0.28s | TestIndexWriterOnJRECrash.testNRTThreads
 [junit4:junit4] Assumption #1: IBM Corporation JRE not supported.
 [junit4:junit4] Completed in 0.68s, 1 test, 1 skipped

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-4901) TestIndexWriterOnJRECrash should work on any JRE vendor via Runtime.halt()

2013-05-12 Thread Dawid Weiss (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss updated LUCENE-4901:


Summary: TestIndexWriterOnJRECrash should work on any JRE vendor via 
Runtime.halt()  (was: Unit test TestIndexWriterOnJRECrash does not support IBM 
Java )

 TestIndexWriterOnJRECrash should work on any JRE vendor via Runtime.halt()
 --

 Key: LUCENE-4901
 URL: https://issues.apache.org/jira/browse/LUCENE-4901
 Project: Lucene - Core
  Issue Type: Improvement
  Components: general/test
 Environment: Red Hat EL 6.3
 IBM Java 1.6.0
 ANT 1.9.0
Reporter: Rodrigo Trujillo
Assignee: Dawid Weiss
Priority: Minor
 Fix For: 5.0, 4.4

 Attachments: LUCENE-4901.patch, test-IBM-java-vendor.patch


 I successfully compiled Lucene 4.2 with IBM.
 Then ran unit tests with the nightly option set to true
 The test case TestIndexWriterOnJRECrash was skipped returning IBM 
 Corporation JRE not supported:
 [junit4:junit4] Suite: org.apache.lucene.index.TestIndexWriterOnJRECrash
 [junit4:junit4] IGNOR/A 0.28s | TestIndexWriterOnJRECrash.testNRTThreads
 [junit4:junit4] Assumption #1: IBM Corporation JRE not supported.
 [junit4:junit4] Completed in 0.68s, 1 test, 1 skipped

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-4901) Unit test TestIndexWriterOnJRECrash does not support IBM Java

2013-05-12 Thread Dawid Weiss (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss updated LUCENE-4901:


Fix Version/s: (was: 4.2)
   4.4
   5.0

 Unit test TestIndexWriterOnJRECrash does not support IBM Java 
 --

 Key: LUCENE-4901
 URL: https://issues.apache.org/jira/browse/LUCENE-4901
 Project: Lucene - Core
  Issue Type: Improvement
  Components: general/test
Affects Versions: 4.2
 Environment: Red Hat EL 6.3
 IBM Java 1.6.0
 ANT 1.9.0
Reporter: Rodrigo Trujillo
Assignee: Dawid Weiss
Priority: Minor
 Fix For: 5.0, 4.4

 Attachments: LUCENE-4901.patch, test-IBM-java-vendor.patch


 I successfully compiled Lucene 4.2 with IBM.
 Then ran unit tests with the nightly option set to true
 The test case TestIndexWriterOnJRECrash was skipped returning IBM 
 Corporation JRE not supported:
 [junit4:junit4] Suite: org.apache.lucene.index.TestIndexWriterOnJRECrash
 [junit4:junit4] IGNOR/A 0.28s | TestIndexWriterOnJRECrash.testNRTThreads
 [junit4:junit4] Assumption #1: IBM Corporation JRE not supported.
 [junit4:junit4] Completed in 0.68s, 1 test, 1 skipped

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-4901) Unit test TestIndexWriterOnJRECrash does not support IBM Java

2013-05-12 Thread Dawid Weiss (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss updated LUCENE-4901:


Affects Version/s: (was: 4.2)

 Unit test TestIndexWriterOnJRECrash does not support IBM Java 
 --

 Key: LUCENE-4901
 URL: https://issues.apache.org/jira/browse/LUCENE-4901
 Project: Lucene - Core
  Issue Type: Improvement
  Components: general/test
 Environment: Red Hat EL 6.3
 IBM Java 1.6.0
 ANT 1.9.0
Reporter: Rodrigo Trujillo
Assignee: Dawid Weiss
Priority: Minor
 Fix For: 5.0, 4.4

 Attachments: LUCENE-4901.patch, test-IBM-java-vendor.patch


 I successfully compiled Lucene 4.2 with IBM.
 Then ran unit tests with the nightly option set to true
 The test case TestIndexWriterOnJRECrash was skipped returning IBM 
 Corporation JRE not supported:
 [junit4:junit4] Suite: org.apache.lucene.index.TestIndexWriterOnJRECrash
 [junit4:junit4] IGNOR/A 0.28s | TestIndexWriterOnJRECrash.testNRTThreads
 [junit4:junit4] Assumption #1: IBM Corporation JRE not supported.
 [junit4:junit4] Completed in 0.68s, 1 test, 1 skipped

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-4811) cleanup - commonsHttpSolrServer -> httpSolrServer

2013-05-12 Thread Shawn Heisey (JIRA)
Shawn Heisey created SOLR-4811:
--

 Summary: cleanup - commonsHttpSolrServer -> httpSolrServer
 Key: SOLR-4811
 URL: https://issues.apache.org/jira/browse/SOLR-4811
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.3
Reporter: Shawn Heisey
Assignee: Shawn Heisey
Priority: Minor
 Fix For: 5.0, 4.4


There is still a presence of commonsHttpSolrServer in variable names and 
comments.  This will clean that up.

The code changes are limited to test classes, but if it's committed, it could 
complicate life for others who are working on those tests.  What's the best 
practice for minimizing impact?


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4811) cleanup - commonsHttpSolrServer -> httpSolrServer

2013-05-12 Thread Shawn Heisey (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Heisey updated SOLR-4811:
---

Attachment: SOLR-4811.patch

Attached patch.

 cleanup - commonsHttpSolrServer -> httpSolrServer
 -

 Key: SOLR-4811
 URL: https://issues.apache.org/jira/browse/SOLR-4811
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.3
Reporter: Shawn Heisey
Assignee: Shawn Heisey
Priority: Minor
 Fix For: 5.0, 4.4

 Attachments: SOLR-4811.patch


 There is still a presence of commonsHttpSolrServer in variable names and 
 comments.  This will clean that up.
 The code changes are limited to test classes, but if it's committed, it could 
 complicate life for others who are working on those tests.  What's the best 
 practice for minimizing impact?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-4975) Add Replication module to Lucene

2013-05-12 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-4975:
---

Attachment: LUCENE-4975.patch

Patch adds instance InfoStream members instead of relying on the static 
default. Beasted 10K+ iterations for both IndexReplicationClientTest and 
IndexAndTaxonomyReplicationClientTest, all pass.

I think it's ready. I plan to commit it tomorrow.

 Add Replication module to Lucene
 

 Key: LUCENE-4975
 URL: https://issues.apache.org/jira/browse/LUCENE-4975
 Project: Lucene - Core
  Issue Type: New Feature
Reporter: Shai Erera
Assignee: Shai Erera
 Attachments: LUCENE-4975.patch, LUCENE-4975.patch, LUCENE-4975.patch, 
 LUCENE-4975.patch, LUCENE-4975.patch, LUCENE-4975.patch, LUCENE-4975.patch, 
 LUCENE-4975.patch, LUCENE-4975.patch, LUCENE-4975.patch, LUCENE-4975.patch, 
 LUCENE-4975.patch, LUCENE-4975.patch


 I wrote a replication module which I think will be useful to Lucene users who 
 want to replicate their indexes, e.g. for high availability, taking hot backups, 
 etc.
 I will upload a patch soon where I'll describe in general how it works.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-4258) Incremental Field Updates through Stacked Segments

2013-05-12 Thread Sivan Yogev (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sivan Yogev updated LUCENE-4258:


Attachment: LUCENE-4258.branch.4.patch

1. Removed some assertions which were based on assumptions that stacked 
segments break.
2. Added a mechanism to apply updates on top of already-applied ones; now all 
tests pass.
3. Did some house cleaning.
What's left? Improve the calculation of bytes used, make the merge policy 
updates-aware, add the option to collapse stacked segments for segments that 
cannot be merged, check performance...

 Incremental Field Updates through Stacked Segments
 --

 Key: LUCENE-4258
 URL: https://issues.apache.org/jira/browse/LUCENE-4258
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/index
Reporter: Sivan Yogev
 Fix For: 4.4

 Attachments: IncrementalFieldUpdates.odp, 
 LUCENE-4258-API-changes.patch, LUCENE-4258.branch.1.patch, 
 LUCENE-4258.branch.2.patch, LUCENE-4258.branch3.patch, 
 LUCENE-4258.branch.4.patch, LUCENE-4258.r1410593.patch, 
 LUCENE-4258.r1412262.patch, LUCENE-4258.r1416438.patch, 
 LUCENE-4258.r1416617.patch, LUCENE-4258.r1422495.patch, 
 LUCENE-4258.r1423010.patch

   Original Estimate: 2,520h
  Remaining Estimate: 2,520h

 Shai and I would like to start working on the proposal to Incremental Field 
 Updates outlined here (http://markmail.org/message/zhrdxxpfk6qvdaex).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4785) New MaxScoreQParserPlugin

2013-05-12 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13655641#comment-13655641
 ] 

Commit Tag Bot commented on SOLR-4785:
--

[trunk commit] janhoy
http://svn.apache.org/viewvc?view=revision&revision=1481651

SOLR-4785: New MaxScoreQParserPlugin

 New MaxScoreQParserPlugin
 -

 Key: SOLR-4785
 URL: https://issues.apache.org/jira/browse/SOLR-4785
 Project: Solr
  Issue Type: New Feature
  Components: query parsers
Reporter: Jan Høydahl
Assignee: Jan Høydahl
Priority: Minor
 Fix For: 5.0, 4.4

 Attachments: SOLR-4785.patch, SOLR-4785.patch


 A customer wants to contribute back this component.
 It is a QParser which behaves exactly like lucene parser (extends it), but 
 returns the Max score from the clauses, i.e. max(c1,c2,c3..) instead of the 
 default which is sum(c1,c2,c3...). It does this by wrapping all SHOULD 
 clauses in a DisjunctionMaxQuery with tie=1.0. Any MUST or PROHIBITED clauses 
 are passed through as-is. Non-boolean queries, e.g. NumericRange 
 falls-through to lucene parser.
 To use, add to solrconfig.xml:
 {code:xml}
   <queryParser name="maxscore" class="solr.MaxScoreQParserPlugin"/>
 {code}
 Then use it in a query
 {noformat}
 q=A AND B AND {!maxscore v=$max}&max=C OR (D AND E)
 {noformat}
 This will return the score of A+B+max(C,sum(D+E))
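
For readers curious how the SHOULD-wrapping works mechanically, a minimal sketch against the Lucene 4.x query API (the class name is made up and this is not the attached patch):

{code:java}
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.DisjunctionMaxQuery;
import org.apache.lucene.search.Query;

final class MaxScoreRewriteSketch {
  static Query rewrite(Query q) {
    if (!(q instanceof BooleanQuery)) {
      return q; // non-boolean queries fall through untouched
    }
    BooleanQuery in = (BooleanQuery) q;
    BooleanQuery out = new BooleanQuery(in.isCoordDisabled());
    DisjunctionMaxQuery shoulds = new DisjunctionMaxQuery(1.0f); // tie=1.0
    for (BooleanClause clause : in.clauses()) {
      if (clause.getOccur() == BooleanClause.Occur.SHOULD) {
        shoulds.add(clause.getQuery()); // scored as max(), not sum()
      } else {
        out.add(clause); // MUST / MUST_NOT pass through as-is
      }
    }
    if (shoulds.iterator().hasNext()) {
      out.add(shoulds, BooleanClause.Occur.SHOULD);
    }
    return out;
  }
}
{code}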

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4785) New MaxScoreQParserPlugin

2013-05-12 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13655646#comment-13655646
 ] 

Commit Tag Bot commented on SOLR-4785:
--

[branch_4x commit] janhoy
http://svn.apache.org/viewvc?view=revision&revision=1481656

SOLR-4785: New MaxScoreQParserPlugin (merge from trunk)

 New MaxScoreQParserPlugin
 -

 Key: SOLR-4785
 URL: https://issues.apache.org/jira/browse/SOLR-4785
 Project: Solr
  Issue Type: New Feature
  Components: query parsers
Reporter: Jan Høydahl
Assignee: Jan Høydahl
Priority: Minor
 Fix For: 5.0, 4.4

 Attachments: SOLR-4785.patch, SOLR-4785.patch


 A customer wants to contribute back this component.
 It is a QParser which behaves exactly like lucene parser (extends it), but 
 returns the Max score from the clauses, i.e. max(c1,c2,c3..) instead of the 
 default which is sum(c1,c2,c3...). It does this by wrapping all SHOULD 
 clauses in a DisjunctionMaxQuery with tie=1.0. Any MUST or PROHIBITED clauses 
 are passed through as-is. Non-boolean queries, e.g. NumericRange 
 falls-through to lucene parser.
 To use, add to solrconfig.xml:
 {code:xml}
   <queryParser name="maxscore" class="solr.MaxScoreQParserPlugin"/>
 {code}
 Then use it in a query
 {noformat}
 q=A AND B AND {!maxscore v=$max}&max=C OR (D AND E)
 {noformat}
 This will return the score of A+B+max(C,sum(D+E))

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-4785) New MaxScoreQParserPlugin

2013-05-12 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-4785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl resolved SOLR-4785.
---

Resolution: Fixed

 New MaxScoreQParserPlugin
 -

 Key: SOLR-4785
 URL: https://issues.apache.org/jira/browse/SOLR-4785
 Project: Solr
  Issue Type: New Feature
  Components: query parsers
Reporter: Jan Høydahl
Assignee: Jan Høydahl
Priority: Minor
 Fix For: 5.0, 4.4

 Attachments: SOLR-4785.patch, SOLR-4785.patch


 A customer wants to contribute back this component.
 It is a QParser which behaves exactly like lucene parser (extends it), but 
 returns the Max score from the clauses, i.e. max(c1,c2,c3..) instead of the 
 default which is sum(c1,c2,c3...). It does this by wrapping all SHOULD 
 clauses in a DisjunctionMaxQuery with tie=1.0. Any MUST or PROHIBITED clauses 
 are passed through as-is. Non-boolean queries, e.g. NumericRange 
 falls-through to lucene parser.
 To use, add to solrconfig.xml:
 {code:xml}
   <queryParser name="maxscore" class="solr.MaxScoreQParserPlugin"/>
 {code}
 Then use it in a query
 {noformat}
 q=A AND B AND {!maxscore v=$max}&max=C OR (D AND E)
 {noformat}
 This will return the score of A+B+max(C,sum(D+E))

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3422) IndeIndexWriter.optimize() throws FileNotFoundException and IOException

2013-05-12 Thread l0co (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13655651#comment-13655651
 ] 

l0co commented on LUCENE-3422:
--

Sure, this can happen for other reasons too (it might be a bug in Hibernate 
Search?). It works like this:

 1. I have an IndexWriter writing to the index in Hibernate Search exclusive mode 
(it creates a Workspace with a single IndexWriter which is not closed/re-created on 
each usage, but kept constantly open) with a RAM flush threshold of 2MB.
 2. The IndexWriter has the concurrent merge scheduler by default.
 3. I'm writing to the index using the application UI, and in another window I'm 
observing the index directory.
 4. After each write (entity save) a new bunch of XXX.* files is created (_13.*, 
_14.*, etc.)
 5. After some time these files disappear from the directory; for example, I 
have files 13,14,15,16,17,18,19 and after the merge (?) process I have only 
18,19 and the rest disappear.
 6. This happens during IndexWriter usage - when I save an entity to the 
database.
 7. Sometimes in this scenario I get an FNFE.
 8. I caught the error with a breakpoint and I see that during the FNFE the 
IndexWriter has segmentInfos corresponding to files that have already disappeared 
from the index directory during the current index writer usage (i.e. in the 
directory there are the 18,19 files but the segmentInfos shows all of 
13,14,15,16,17,18,19).
 9. So I suppose that when the writer was invoked, the merge thread had 
removed these files, but the other, concurrent write thread still saw them.
 10. This hasn't happened (so far) since I switched to the serial merge scheduler.

 IndeIndexWriter.optimize() throws FileNotFoundException and IOException
 ---

 Key: LUCENE-3422
 URL: https://issues.apache.org/jira/browse/LUCENE-3422
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Elizabeth Nisha

 I am using lucene 3.0.2 search APIs for my application. 
 Indexed data is about 350MB and time taken for indexing is 25 hrs. Search 
 indexing and Optimization runs in two different threads. Optimization runs 
 for every 1 hour and it doesn't run while indexing is going on and vice 
 versa. When optimization is going on using IndexWriter.optimize(), 
 FileNotFoundException and IOException are seen in my log and the index file 
 is getting corrupted, log says
 1. java.io.IOException: No sub-file with id _5r8.fdt found 
 [The file name in this message changes over time (_5r8.fdt, _6fa.fdt, 
 _6uh.fdt, ..., _emv.fdt) ]
 2. java.io.FileNotFoundException: 
 /local/groups/necim/index_5.3/index/_bdx.cfs (No such file or directory)  
 3. java.io.FileNotFoundException: 
 /local/groups/necim/index_5.3/index/_hkq.cfs (No such file or directory)
   Stack trace: java.io.IOException: background merge hit exception: 
 _hkp:c100-_hkp _hkq:c100-_hkp _hkr:c100-_hkr _hks:c100-_hkr _hxb:c5500 
 _hx5:c1000 _hxc:c198
 84 into _hxd [optimize] [mergeDocStores]
at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2359)
at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2298)
at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2268)
at com.telelogic.cs.search.SearchIndex.doOptimize(SearchIndex.java:130)
at 
 com.telelogic.cs.search.SearchIndexerThread$1.run(SearchIndexerThread.java:337)
at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
 Caused by: java.io.FileNotFoundException: 
 /local/groups/necim/index_5.3/index/_hkq.cfs (No such file or directory)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.init(RandomAccessFile.java:212)
at 
 org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput$Descriptor.init(SimpleFSDirectory.java:76)
at 
 org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.init(SimpleFSDirectory.java:97)
at 
 org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.init(NIOFSDirectory.java:87)
at 
 org.apache.lucene.store.NIOFSDirectory.openInput(NIOFSDirectory.java:67)
at 
 org.apache.lucene.index.CompoundFileReader.init(CompoundFileReader.java:67)
at 
 org.apache.lucene.index.SegmentReader$CoreReaders.init(SegmentReader.java:114)
at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:590)
at 
 org.apache.lucene.index.IndexWriter$ReaderPool.get(IndexWriter.java:616)
at 
 org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4309)
at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3965)
at 
 

[jira] [Comment Edited] (LUCENE-3422) IndeIndexWriter.optimize() throws FileNotFoundException and IOException

2013-05-12 Thread l0co (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13655651#comment-13655651
 ] 

l0co edited comment on LUCENE-3422 at 5/12/13 10:09 PM:


Sure, this can happen for other reasons too (it might be a bug in Hibernate 
Search?). It works like this:

 1. I have an IndexWriter writing to the index in Hibernate Search exclusive mode 
(it creates a Workspace with a single IndexWriter which is not closed/re-created on 
each usage, but kept constantly open) with a RAM flush threshold of 2MB.
 2. The IndexWriter has the concurrent merge scheduler by default.
 3. I'm writing to the index using the application UI, and in another window I'm 
observing the index directory.
 4. After each write (entity save) a new bunch of XXX.* files is created (_13.*, 
_14.*, etc.)
 5. After some time these files disappear from the directory; for example, I 
have files 13,14,15,16,17,18,19 and after the merge (?) process I have only 
18,19 and the rest disappear.
 6. The effect from (5) happens during IndexWriter usage - when I save an entity 
to the database again.
 7. Sometimes in this scenario I get an FNFE.
 8. I caught the error with a breakpoint and I see that during the FNFE the 
IndexWriter has segmentInfos corresponding to files that have already disappeared 
from the index directory during the current index writer usage (i.e. in the 
directory there are the 18,19 files but the segmentInfos shows all of 
13,14,15,16,17,18,19).
 9. So I suppose that when the writer was invoked, the merge thread had 
removed these files, but the other, concurrent write thread still saw them.
 10. This hasn't happened (so far) since I switched to the serial merge scheduler.

  was (Author: l0co):
Sure, this can happen for other reasons too (it might be a bug in Hibernate 
Search?). It works like this:

 1. I have an IndexWriter writing to the index in Hibernate Search exclusive mode 
(it creates a Workspace with a single IndexWriter which is not closed/re-created on 
each usage, but kept constantly open) with a RAM flush threshold of 2MB.
 2. The IndexWriter has the concurrent merge scheduler by default.
 3. I'm writing to the index using the application UI, and in another window I'm 
observing the index directory.
 4. After each write (entity save) a new bunch of XXX.* files is created (_13.*, 
_14.*, etc.)
 5. After some time these files disappear from the directory; for example, I 
have files 13,14,15,16,17,18,19 and after the merge (?) process I have only 
18,19 and the rest disappear.
 6. This happens during IndexWriter usage - when I save an entity to the 
database.
 7. Sometimes in this scenario I get an FNFE.
 8. I caught the error with a breakpoint and I see that during the FNFE the 
IndexWriter has segmentInfos corresponding to files that have already disappeared 
from the index directory during the current index writer usage (i.e. in the 
directory there are the 18,19 files but the segmentInfos shows all of 
13,14,15,16,17,18,19).
 9. So I suppose that when the writer was invoked, the merge thread had 
removed these files, but the other, concurrent write thread still saw them.
 10. This hasn't happened (so far) since I switched to the serial merge scheduler.
  
 IndeIndexWriter.optimize() throws FileNotFoundException and IOException
 ---

 Key: LUCENE-3422
 URL: https://issues.apache.org/jira/browse/LUCENE-3422
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Elizabeth Nisha

 I am using lucene 3.0.2 search APIs for my application. 
 Indexed data is about 350MB and time taken for indexing is 25 hrs. Search 
 indexing and Optimization runs in two different threads. Optimization runs 
 for every 1 hour and it doesn't run while indexing is going on and vice 
 versa. When optimization is going on using IndexWriter.optimize(), 
 FileNotFoundException and IOException are seen in my log and the index file 
 is getting corrupted, log says
 1. java.io.IOException: No sub-file with id _5r8.fdt found 
 [The file name in this message changes over time (_5r8.fdt, _6fa.fdt, 
 _6uh.fdt, ..., _emv.fdt) ]
 2. java.io.FileNotFoundException: 
 /local/groups/necim/index_5.3/index/_bdx.cfs (No such file or directory)  
 3. java.io.FileNotFoundException: 
 /local/groups/necim/index_5.3/index/_hkq.cfs (No such file or directory)
   Stack trace: java.io.IOException: background merge hit exception: 
 _hkp:c100-_hkp _hkq:c100-_hkp _hkr:c100-_hkr _hks:c100-_hkr _hxb:c5500 
 _hx5:c1000 _hxc:c198
 84 into _hxd [optimize] [mergeDocStores]
at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2359)
at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2298)
at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2268)
at 

[jira] [Commented] (SOLR-4785) New MaxScoreQParserPlugin

2013-05-12 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-4785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13655657#comment-13655657
 ] 

Jan Høydahl commented on SOLR-4785:
---

Also added the new parser to the list in http://wiki.apache.org/solr/QueryParser

 New MaxScoreQParserPlugin
 -

 Key: SOLR-4785
 URL: https://issues.apache.org/jira/browse/SOLR-4785
 Project: Solr
  Issue Type: New Feature
  Components: query parsers
Reporter: Jan Høydahl
Assignee: Jan Høydahl
Priority: Minor
 Fix For: 5.0, 4.4

 Attachments: SOLR-4785.patch, SOLR-4785.patch


 A customer wants to contribute back this component.
 It is a QParser which behaves exactly like lucene parser (extends it), but 
 returns the Max score from the clauses, i.e. max(c1,c2,c3..) instead of the 
 default which is sum(c1,c2,c3...). It does this by wrapping all SHOULD 
 clauses in a DisjunctionMaxQuery with tie=1.0. Any MUST or PROHIBITED clauses 
 are passed through as-is. Non-boolean queries, e.g. NumericRange 
 falls-through to lucene parser.
 To use, add to solrconfig.xml:
 {code:xml}
   <queryParser name="maxscore" class="solr.MaxScoreQParserPlugin"/>
 {code}
 Then use it in a query
 {noformat}
 q=A AND B AND {!maxscore v=$max}&max=C OR (D AND E)
 {noformat}
 This will return the score of A+B+max(C,sum(D+E))

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #848: POMs out of sync

2013-05-12 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/848/

1 tests failed.
FAILED:  
org.apache.solr.search.QueryEqualityTest.org.apache.solr.search.QueryEqualityTest

Error Message:
testParserCoverage was run w/o any other method explicitly testing qparser: 
maxscore

Stack Trace:
java.lang.AssertionError: testParserCoverage was run w/o any other method 
explicitly testing qparser: maxscore
at __randomizedtesting.SeedInfo.seed([3113BEB9952E4D5C]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.assertTrue(Assert.java:43)
at 
org.apache.solr.search.QueryEqualityTest.afterClassParserCoverageTest(QueryEqualityTest.java:61)




Build Log:
[...truncated 23978 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (LUCENE-4922) A SpatialPrefixTree based on the Hilbert Curve and variable grid sizes

2013-05-12 Thread John Berryman (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Berryman updated LUCENE-4922:
--

Attachment: HilbertConverter.zip

This is a Java example of converting from a set of x,y values (within specified 
bounds) to a HilbertOrdered value.

The strategy is to:

# Convert the x,y values to doubles between 0 and 1.
# Convert these doubles to large integers with max value 256^numBytes (user 
specifies numBytes). Note: there's probably a way to do these two last steps 
simultaneously and regain some precision. Note: numBytes must be <= 7 for now.
# Interleave the bits so that you get a z-order value (note: I did this the 
obvious way; in the code I've pointed to a website with a much more efficient 
method - so-called Morton numbers).
# Convert the z-ordered value to a Hilbert-ordered value.

How to do this last step really deserves a big whiteboard session - it's an 
inherently visual discussion. However, as a clue to what's happening in the 
code:

* There are only 4 shapes that compose the Hilbert curve. In the code I've 
called them D,U,n, and C because of the way they look. These are the states 
of a state machine.
* I convert from z to hilbert 2 bits at a time.
* On the first iteration I assume that I'm in the D state. In this simplistic 
case, I convert from 2 z-ordered bits to 2 hilbert-ordered bits based upon a 
lookup table that goes with the D state. I replace the z-ordered bits with the 
hilbert-ordered bits.
* I then check for which state I should go to next based upon a different 
lookup table that goes with the D state. It directs me to another state.
* I then get the next 2 bits from the byte array and repeat this method until 
I'm out of bits.

I've spot checked the input/output and it looks good (you'll see where I've 
done this in the code). No tests!

Also, this method could be slower than expected because I'm doing all 
operations on 2 bits at a time. As is, the method in the python code might even 
be faster because (correct me if I'm wrong) a double multiply can take place in 
one clock cycle.

That said, the methods here can be extended to operate at the processor word 
size.

What's more, I'm working with lookup tables. I suspect that you could use magic 
numbers and bitmasks and all that jazz and make something that took up very 
little space or time.

Also, I couldn't find them now, but there exist known efficient algorithms for 
doing this conversion. (I guess this weekend I just felt like making my 
own. :-/) I've even run across algorithms for higher-dimension Hilbert 
curves.
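
As a compact companion to the description above (this is not the attached HilbertConverter), here is the naive bit interleave plus the well-known rotation-based (x,y) to Hilbert-distance conversion that the 4-state lookup-table machine computes another way; n is assumed to be a power-of-two grid width:

{code:java}
public class HilbertSketch {
  // Naive z-order (Morton) interleave: bit i of x lands at bit 2i,
  // bit i of y at bit 2i+1.
  static long interleave(int x, int y) {
    long z = 0;
    for (int i = 0; i < 32; i++) {
      z |= ((long) ((x >>> i) & 1)) << (2 * i);
      z |= ((long) ((y >>> i) & 1)) << (2 * i + 1);
    }
    return z;
  }

  // Classic rotate-and-flip conversion of (x,y) in an n*n grid
  // (n a power of two) to the distance along the Hilbert curve.
  static long xy2d(int n, int x, int y) {
    long d = 0;
    for (int s = n / 2; s > 0; s /= 2) {
      int rx = ((x & s) > 0) ? 1 : 0;
      int ry = ((y & s) > 0) ? 1 : 0;
      d += (long) s * s * ((3 * rx) ^ ry);
      // Rotate the quadrant so the sub-curve is in canonical orientation.
      if (ry == 0) {
        if (rx == 1) {
          x = s - 1 - x;
          y = s - 1 - y;
        }
        int t = x; x = y; y = t;
      }
    }
    return d;
  }

  public static void main(String[] args) {
    // Spot check on a 2x2 grid; prints "0 1 2 3" for the visit order
    // (0,0), (0,1), (1,1), (1,0).
    System.out.println(xy2d(2, 0, 0) + " " + xy2d(2, 0, 1) + " "
        + xy2d(2, 1, 1) + " " + xy2d(2, 1, 0));
  }
}
{code}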


 A SpatialPrefixTree based on the Hilbert Curve and variable grid sizes
 --

 Key: LUCENE-4922
 URL: https://issues.apache.org/jira/browse/LUCENE-4922
 Project: Lucene - Core
  Issue Type: New Feature
  Components: modules/spatial
Reporter: David Smiley
Assignee: David Smiley
  Labels: gsoc2013, mentor, newdev
 Attachments: HilbertConverter.zip


 My wish-list for an ideal SpatialPrefixTree has these properties:
 * Hilbert Curve ordering
 * Variable grid size per level (ex: 256 at the top, 64 at the bottom, 16 for 
 all in-between)
 * Compact binary encoding (so-called Morton number)
 * Works for geodetic (i.e. lat & lon) and non-geodetic
 Some bonus wishes for use in geospatial:
 * Use an equal-area projection such that each cell has an equal area to all 
 others at the same level.
 * When advancing a grid level, if a cell's width is less than half its 
 height, then divide it as 4 vertically stacked instead of 2 by 2. The point 
 is to avoid super-skinny cells, which occur towards the poles and degrade 
 performance.
 All of this requires some basic performance benchmarks to measure the effects 
 of these characteristics.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4922) A SpatialPrefixTree based on the Hilbert Curve and variable grid sizes

2013-05-12 Thread John Berryman (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13655668#comment-13655668
 ] 

John Berryman commented on LUCENE-4922:
---

I just attached code called HilbertConverter. This is a Java example of 
converting from a set of x,y values (within specified bounds) to a 
HilbertOrdered value.

The strategy is to:

# Convert the x,y values to doubles between 0 and 1.
# Convert these doubles to large integers with max value 256^numBytes (user 
specifies numBytes). Note: there's probably a way to do these two last steps 
simultaneously and regain some precision. Note: numBytes must be <= 7 for now.
# Interleave the bits so that you get a z-order value (note: I did this the 
obvious way; in the code I've pointed to a website with a much more efficient 
method - so-called Morton numbers).
# Convert the z-ordered value to a Hilbert-ordered value.

How to do this last step really deserves a big whiteboard session - it's an 
inherently visual discussion. However, as a clue to what's happening in the 
code:

* There are only 4 shapes that compose the Hilbert curve. In the code I've 
called them D,U,n, and C because of the way they look. These are the states 
of a state machine.
* I convert from z to hilbert 2 bits at a time.
* On the first iteration I assume that I'm in the D state. In this simplistic 
case, I convert from 2 z-ordered bits to 2 hilbert-ordered bits based upon a 
lookup table that goes with the D state. I replace the z-ordered bits with the 
hilbert-ordered bits.
* I then check for which state I should go to next based upon a different 
lookup table that goes with the D state. It directs me to another state.
* I then get the next 2 bits from the byte array and repeat this method until 
I'm out of bits.

I've spot checked the input/output and it looks good (you'll see where I've 
done this in the code). No tests!

Also, this method could be slower than expected because I'm doing all 
operations on 2 bits at a time. As is, the method in the python code might even 
be faster because (correct me if I'm wrong) a double multiply can take place in 
one clock cycle.

That said, the methods here can be extended to operate at the processor word 
size.

What's more, I'm working with lookup tables. I suspect that you could use magic 
numbers and bitmasks and all that jazz and make something that took up very 
little space or time.

Also, I couldn't find them now, but there exist known efficient algorithms for 
doing this conversion. (I guess this weekend I just felt like making my 
own. :-/) I've even run across algorithms for higher-dimension Hilbert 
curves.


 A SpatialPrefixTree based on the Hilbert Curve and variable grid sizes
 --

 Key: LUCENE-4922
 URL: https://issues.apache.org/jira/browse/LUCENE-4922
 Project: Lucene - Core
  Issue Type: New Feature
  Components: modules/spatial
Reporter: David Smiley
Assignee: David Smiley
  Labels: gsoc2013, mentor, newdev
 Attachments: HilbertConverter.zip


 My wish-list for an ideal SpatialPrefixTree has these properties:
 * Hilbert Curve ordering
 * Variable grid size per level (ex: 256 at the top, 64 at the bottom, 16 for 
 all in-between)
 * Compact binary encoding (so-called Morton number)
 * Works for geodetic (i.e. lat & lon) and non-geodetic
 Some bonus wishes for use in geospatial:
 * Use an equal-area projection such that each cell has an equal area to all 
 others at the same level.
 * When advancing a grid level, if a cell's width is less than half its 
 height, then divide it as 4 vertically stacked instead of 2 by 2. The point 
 is to avoid super-skinny cells, which occur towards the poles and degrade 
 performance.
 All of this requires some basic performance benchmarks to measure the effects 
 of these characteristics.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.7.0_21) - Build # 5625 - Failure!

2013-05-12 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/5625/
Java: 64bit/jdk1.7.0_21 -XX:+UseCompressedOops -XX:+UseG1GC

1 tests failed.
FAILED:  junit.framework.TestSuite.org.apache.solr.search.QueryEqualityTest

Error Message:
testParserCoverage was run w/o any other method explicitly testing qparser: 
maxscore

Stack Trace:
java.lang.AssertionError: testParserCoverage was run w/o any other method 
explicitly testing qparser: maxscore
at __randomizedtesting.SeedInfo.seed([D11D2701DB8F04BC]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.assertTrue(Assert.java:43)
at 
org.apache.solr.search.QueryEqualityTest.afterClassParserCoverageTest(QueryEqualityTest.java:61)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:700)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at java.lang.Thread.run(Thread.java:722)




Build Log:
[...truncated 9109 lines...]
[junit4:junit4] Suite: org.apache.solr.search.QueryEqualityTest
[junit4:junit4]   1 INFO  - 2013-05-12 22:44:44.564; 
org.apache.solr.SolrTestCaseJ4; initCore
[junit4:junit4]   1 INFO  - 2013-05-12 22:44:44.565; 
org.apache.solr.core.SolrResourceLoader; new SolrResourceLoader for directory: 
'/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/build/solr-core/test-files/solr/collection1/'
[junit4:junit4]   1 INFO  - 2013-05-12 22:44:44.565; 
org.apache.solr.core.SolrResourceLoader; Adding 
'file:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/build/solr-core/test-files/solr/collection1/lib/README'
 to classloader
[junit4:junit4]   1 INFO  - 2013-05-12 22:44:44.566; 
org.apache.solr.core.SolrResourceLoader; Adding 
'file:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/build/solr-core/test-files/solr/collection1/lib/classes/'
 to classloader
[junit4:junit4]   1 INFO  - 2013-05-12 22:44:44.612; 
org.apache.solr.core.SolrConfig; Using Lucene MatchVersion: LUCENE_50
[junit4:junit4]   1 INFO  - 2013-05-12 22:44:44.654; 
org.apache.solr.core.SolrConfig; Loaded SolrConfig: solrconfig.xml
[junit4:junit4]   1 INFO  - 2013-05-12 22:44:44.654; 
org.apache.solr.schema.IndexSchema; Reading Solr Schema from schema15.xml
[junit4:junit4]   1 INFO  - 2013-05-12 22:44:44.658; 
org.apache.solr.schema.IndexSchema; [null] Schema name=test
[junit4:junit4]   1 INFO  - 2013-05-12 22:44:44.936; 
org.apache.solr.schema.IndexSchema; default search field in schema is text
[junit4:junit4]   1 INFO  - 2013-05-12 22:44:44.939; 
org.apache.solr.schema.IndexSchema; unique key field: id
[junit4:junit4]   1 INFO  - 2013-05-12 22:44:44.940; 
org.apache.solr.schema.FileExchangeRateProvider; Reloading exchange rates from 
file currency.xml
[junit4:junit4]   1 INFO  - 2013-05-12 22:44:44.942; 
org.apache.solr.schema.FileExchangeRateProvider; Reloading exchange rates from 
file currency.xml
[junit4:junit4]   1 INFO  - 2013-05-12 

[jira] [Updated] (SOLR-3038) Solrj should use javabin wireformat by default with updaterequests

2013-05-12 Thread Shawn Heisey (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Heisey updated SOLR-3038:
---

Attachment: SOLR-3038-abstract-writer.patch

Updated patch to fix some test failures.  I am seeing some test failures that 
do not seem related to my patch - they are things that I have been seeing from 
Jenkins.

I think BasicHttpSolrServerTest needs at least one more test that also tests 
XMLRequestWriter; I'll think about that.

SolrExampleStreamingTest was explicitly setting the XML writer.  My patch 
continues to do that, but I'm wondering if it needs an additional test for 
Binary.

This patch is absolutely what I think we need for trunk.  I'd like to do the 
same to 4x, but I'm worried that it isn't OK to break code where RequestWriter 
is explicitly used.  Code that uses BinaryRequestWriter would not need 
adjustment.  If that breakage sounds too dangerous, we could do the one-line 
fix for 4x.
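
For illustration, opting in or out explicitly would look something like this in 
SolrJ (a sketch against the 4.x HttpSolrServer API; the URL is a placeholder):

{code}
import org.apache.solr.client.solrj.impl.BinaryRequestWriter;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.RequestWriter;

public class WriterChoice {
    public static void main(String[] args) {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr");
        // Explicit javabin updates (what this patch would make the default):
        server.setRequestWriter(new BinaryRequestWriter());
        // Or pin the old XML behaviour explicitly:
        // server.setRequestWriter(new RequestWriter());
    }
}
{code}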

Abstract classes are a somewhat new area for me.  Is my implementation 
acceptable?

 Solrj should use javabin wireformat by default with updaterequests
 --

 Key: SOLR-3038
 URL: https://issues.apache.org/jira/browse/SOLR-3038
 Project: Solr
  Issue Type: Improvement
  Components: clients - java
Affects Versions: 4.0-ALPHA
Reporter: Sami Siren
Priority: Minor
 Attachments: SOLR-3038-abstract-writer.patch, 
 SOLR-3038-abstract-writer.patch, SOLR-3038-abstract-writer.patch


 The javabin wire format is faster than xml when feeding Solr - it should 
 become the default. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-3763) Make solr use lucene filters directly

2013-05-12 Thread Greg Bowyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Bowyer updated SOLR-3763:
--

Attachment: SOLR-3763-Make-solr-use-lucene-filters-directly.patch

Version that has a basic (but hopefully working) cache implementation.

PostFilters are still a bit of an unknown; since these are needed for spatial, 
I will look at how they can be supported.

 Make solr use lucene filters directly
 -

 Key: SOLR-3763
 URL: https://issues.apache.org/jira/browse/SOLR-3763
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.0, 4.1, 5.0
Reporter: Greg Bowyer
Assignee: Greg Bowyer
 Attachments: SOLR-3763-Make-solr-use-lucene-filters-directly.patch, 
 SOLR-3763-Make-solr-use-lucene-filters-directly.patch, 
 SOLR-3763-Make-solr-use-lucene-filters-directly.patch


 Presently solr uses bitsets, queries and collectors to implement the concept 
 of filters. This has proven to be very powerful, but does come at the cost of 
 introducing a large body of code into solr making it harder to optimise and 
 maintain.
 Another issue here is that filters currently cache sub-optimally given the 
 changes in lucene towards atomic readers.
 Rather than patch these issues, this is an attempt to rework the filters in 
 solr to leverage the Filter subsystem from lucene as much as possible.
 In good time the aim is to get this to do the following:
 ∘ Handle setting up filter implementations that are able to correctly cache 
 with reference to the AtomicReader that they are caching for rather than for 
 the entire index at large
 ∘ Get the post filters working; I am thinking that this can be done via 
 lucene's chained filter, with the "expensive" filters being put towards the 
 end of the chain - this has different semantics internally to the original 
 implementation but IMHO should have the same result for end users
 ∘ Learn how to create filters that are potentially more efficient, at present 
 solr basically runs a simple query that gathers a DocSet that relates to the 
 documents that we want filtered; it would be interesting to make use of 
 filter implementations that are in theory faster than query filters (for 
 instance there are filters that are able to query the FieldCache)
 ∘ Learn how to decompose filters so that a complex filter query can be cached 
 (potentially) as its constituent parts; for example the filter below 
 currently needs love, care and feeding to ensure that the filter cache is not 
 unduly stressed
 {code}
   'category:(100) OR category:(200) OR category:(300)'
 {code}
 Really there is no reason not to express this in a cached form as 
 {code}
 BooleanFilter(
 FilterClause(CachedFilter(TermFilter(Term(category, 100))), SHOULD),
 FilterClause(CachedFilter(TermFilter(Term(category, 200))), SHOULD),
 FilterClause(CachedFilter(TermFilter(Term(category, 300))), SHOULD)
   )
 {code}
 This would yield better cache usage I think as we can reuse docsets across 
 multiple queries, as well as avoid issues when filters are presented in 
 differing orders
 ∘ Instead of end users providing costing we might (and this is a big might, 
 FWIW) be able to create a sort of execution plan of filters, leveraging a 
 combination of what the index is able to tell us as well as sampling and 
 "educated guesswork"; in essence this is what some DBMS software, for example 
 postgresql, does - it has a genetic algorithm that attempts to solve the 
 travelling salesman problem - to great effect
 ∘ I am sure I will probably come up with other ambitious ideas to plug in 
 here. :S
 Patches obviously forthcoming but the bulk of the work can be followed here 
 https://github.com/GregBowyer/lucene-solr/commits/solr-uses-lucene-filters
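
 (As a sketch only - assembled from the stock Lucene 4.x filter classes, with my 
 own names, rather than anything in the attached patch:)
 {code}
 import org.apache.lucene.index.Term;
 import org.apache.lucene.queries.BooleanFilter;
 import org.apache.lucene.queries.FilterClause;
 import org.apache.lucene.queries.TermFilter;
 import org.apache.lucene.search.BooleanClause.Occur;
 import org.apache.lucene.search.CachingWrapperFilter;
 import org.apache.lucene.search.Filter;

 public class CategoryFilters {
     // Each term filter is cached independently, so the parts can be
     // reused across queries regardless of clause order.
     static Filter categoryFilter() {
         BooleanFilter filter = new BooleanFilter();
         for (String value : new String[] { "100", "200", "300" }) {
             Filter term = new TermFilter(new Term("category", value));
             filter.add(new FilterClause(new CachingWrapperFilter(term), Occur.SHOULD));
         }
         return filter;
     }
 }
 {code}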

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



ReleaseTodo update

2013-05-12 Thread Jan Høydahl
Hi,

I discovered that the doc redirect still redirects to 4_1_0 javadocs.
I changed .htaccess so it now points to 4_3_0 
https://svn.apache.org/repos/asf/lucene/cms/trunk/content/.htaccess

The Release TODO should mention updating this link - 
http://wiki.apache.org/lucene-java/ReleaseTodo

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-Tests-4.x-Java6 - Build # 1606 - Failure

2013-05-12 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-4.x-Java6/1606/

1 tests failed.
FAILED:  junit.framework.TestSuite.org.apache.solr.search.QueryEqualityTest

Error Message:
testParserCoverage was run w/o any other method explicitly testing qparser: 
maxscore

Stack Trace:
java.lang.AssertionError: testParserCoverage was run w/o any other method 
explicitly testing qparser: maxscore
at __randomizedtesting.SeedInfo.seed([D112667014F561AA]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.assertTrue(Assert.java:43)
at 
org.apache.solr.search.QueryEqualityTest.afterClassParserCoverageTest(QueryEqualityTest.java:61)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:700)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at java.lang.Thread.run(Thread.java:679)




Build Log:
[...truncated 8661 lines...]
[junit4:junit4] Suite: org.apache.solr.search.QueryEqualityTest
[junit4:junit4]   1 INFO  - 2013-05-12 23:21:36.420; 
org.apache.solr.SolrTestCaseJ4; initCore
[junit4:junit4]   1 INFO  - 2013-05-12 23:21:36.421; 
org.apache.solr.core.SolrResourceLoader; new SolrResourceLoader for directory: 
'/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java6/solr/build/solr-core/test-files/solr/collection1/'
[junit4:junit4]   1 INFO  - 2013-05-12 23:21:36.423; 
org.apache.solr.core.SolrResourceLoader; Adding 
'file:/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java6/solr/build/solr-core/test-files/solr/collection1/lib/README'
 to classloader
[junit4:junit4]   1 INFO  - 2013-05-12 23:21:36.423; 
org.apache.solr.core.SolrResourceLoader; Adding 
'file:/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java6/solr/build/solr-core/test-files/solr/collection1/lib/classes/'
 to classloader
[junit4:junit4]   1 INFO  - 2013-05-12 23:21:36.503; 
org.apache.solr.core.SolrConfig; Using Lucene MatchVersion: LUCENE_44
[junit4:junit4]   1 INFO  - 2013-05-12 23:21:36.582; 
org.apache.solr.core.SolrConfig; Loaded SolrConfig: solrconfig.xml
[junit4:junit4]   1 INFO  - 2013-05-12 23:21:36.583; 
org.apache.solr.schema.IndexSchema; Reading Solr Schema from schema15.xml
[junit4:junit4]   1 INFO  - 2013-05-12 23:21:36.590; 
org.apache.solr.schema.IndexSchema; [null] Schema name=test
[junit4:junit4]   1 INFO  - 2013-05-12 23:21:37.024; 
org.apache.solr.schema.IndexSchema; default search field in schema is text
[junit4:junit4]   1 INFO  - 2013-05-12 23:21:37.027; 
org.apache.solr.schema.IndexSchema; unique key field: id
[junit4:junit4]   1 INFO  - 2013-05-12 23:21:37.029; 
org.apache.solr.schema.FileExchangeRateProvider; Reloading exchange rates from 
file currency.xml
[junit4:junit4]   1 INFO  - 2013-05-12 23:21:37.032; 
org.apache.solr.schema.FileExchangeRateProvider; Reloading exchange rates from 
file currency.xml
[junit4:junit4]   1 INFO  - 2013-05-12 

[jira] [Updated] (SOLR-3763) Make solr use lucene filters directly

2013-05-12 Thread Greg Bowyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Bowyer updated SOLR-3763:
--

Fix Version/s: 5.0

 Make solr use lucene filters directly
 -

 Key: SOLR-3763
 URL: https://issues.apache.org/jira/browse/SOLR-3763
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.0, 4.1, 5.0
Reporter: Greg Bowyer
Assignee: Greg Bowyer
 Fix For: 5.0

 Attachments: SOLR-3763-Make-solr-use-lucene-filters-directly.patch, 
 SOLR-3763-Make-solr-use-lucene-filters-directly.patch, 
 SOLR-3763-Make-solr-use-lucene-filters-directly.patch


 Presently solr uses bitsets, queries and collectors to implement the concept 
 of filters. This has proven to be very powerful, but does come at the cost of 
 introducing a large body of code into solr making it harder to optimise and 
 maintain.
 Another issue here is that filters currently cache sub-optimally given the 
 changes in lucene towards atomic readers.
 Rather than patch these issues, this is an attempt to rework the filters in 
 solr to leverage the Filter subsystem from lucene as much as possible.
 In good time the aim is to get this to do the following:
 ∘ Handle setting up filter implementations that are able to correctly cache 
 with reference to the AtomicReader that they are caching for rather than for 
 the entire index at large
 ∘ Get the post filters working; I am thinking that this can be done via 
 lucene's chained filter, with the "expensive" filters being put towards the 
 end of the chain - this has different semantics internally to the original 
 implementation but IMHO should have the same result for end users
 ∘ Learn how to create filters that are potentially more efficient, at present 
 solr basically runs a simple query that gathers a DocSet that relates to the 
 documents that we want filtered; it would be interesting to make use of 
 filter implementations that are in theory faster than query filters (for 
 instance there are filters that are able to query the FieldCache)
 ∘ Learn how to decompose filters so that a complex filter query can be cached 
 (potentially) as its constituent parts; for example the filter below 
 currently needs love, care and feeding to ensure that the filter cache is not 
 unduly stressed
 {code}
   'category:(100) OR category:(200) OR category:(300)'
 {code}
 Really there is no reason not to express this in a cached form as 
 {code}
 BooleanFilter(
 FilterClause(CachedFilter(TermFilter(Term(category, 100))), SHOULD),
 FilterClause(CachedFilter(TermFilter(Term(category, 200))), SHOULD),
 FilterClause(CachedFilter(TermFilter(Term(category, 300))), SHOULD)
   )
 {code}
 This would yield better cache usage I think as we can reuse docsets across 
 multiple queries, as well as avoid issues when filters are presented in 
 differing orders
 ∘ Instead of end users providing costing we might (and this is a big might, 
 FWIW) be able to create a sort of execution plan of filters, leveraging a 
 combination of what the index is able to tell us as well as sampling and 
 "educated guesswork"; in essence this is what some DBMS software, for example 
 postgresql, does - it has a genetic algorithm that attempts to solve the 
 travelling salesman problem - to great effect
 ∘ I am sure I will probably come up with other ambitious ideas to plug in 
 here. :S
 Patches obviously forthcoming but the bulk of the work can be followed here 
 https://github.com/GregBowyer/lucene-solr/commits/solr-uses-lucene-filters

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.6.0_45) - Build # 5564 - Failure!

2013-05-12 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/5564/
Java: 32bit/jdk1.6.0_45 -client -XX:+UseParallelGC

1 tests failed.
FAILED:  junit.framework.TestSuite.org.apache.solr.search.QueryEqualityTest

Error Message:
testParserCoverage was run w/o any other method explicitly testing qparser: 
maxscore

Stack Trace:
java.lang.AssertionError: testParserCoverage was run w/o any other method 
explicitly testing qparser: maxscore
at __randomizedtesting.SeedInfo.seed([DD1C463717885084]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.assertTrue(Assert.java:43)
at 
org.apache.solr.search.QueryEqualityTest.afterClassParserCoverageTest(QueryEqualityTest.java:61)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:700)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at java.lang.Thread.run(Thread.java:662)




Build Log:
[...truncated 9002 lines...]
[junit4:junit4] Suite: org.apache.solr.search.QueryEqualityTest
[junit4:junit4]   1 INFO  - 2013-05-12 23:41:05.807; 
org.apache.solr.SolrTestCaseJ4; initCore
[junit4:junit4]   1 INFO  - 2013-05-12 23:41:05.807; 
org.apache.solr.core.SolrResourceLoader; new SolrResourceLoader for directory: 
'/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/build/solr-core/test-files/solr/collection1/'
[junit4:junit4]   1 INFO  - 2013-05-12 23:41:05.808; 
org.apache.solr.core.SolrResourceLoader; Adding 
'file:/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/build/solr-core/test-files/solr/collection1/lib/README'
 to classloader
[junit4:junit4]   1 INFO  - 2013-05-12 23:41:05.808; 
org.apache.solr.core.SolrResourceLoader; Adding 
'file:/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/build/solr-core/test-files/solr/collection1/lib/classes/'
 to classloader
[junit4:junit4]   1 INFO  - 2013-05-12 23:41:05.848; 
org.apache.solr.core.SolrConfig; Using Lucene MatchVersion: LUCENE_44
[junit4:junit4]   1 INFO  - 2013-05-12 23:41:05.901; 
org.apache.solr.core.SolrConfig; Loaded SolrConfig: solrconfig.xml
[junit4:junit4]   1 INFO  - 2013-05-12 23:41:05.902; 
org.apache.solr.schema.IndexSchema; Reading Solr Schema from schema15.xml
[junit4:junit4]   1 INFO  - 2013-05-12 23:41:05.906; 
org.apache.solr.schema.IndexSchema; [null] Schema name=test
[junit4:junit4]   1 INFO  - 2013-05-12 23:41:06.289; 
org.apache.solr.schema.IndexSchema; default search field in schema is text
[junit4:junit4]   1 INFO  - 2013-05-12 23:41:06.293; 
org.apache.solr.schema.IndexSchema; unique key field: id
[junit4:junit4]   1 INFO  - 2013-05-12 23:41:06.294; 
org.apache.solr.schema.FileExchangeRateProvider; Reloading exchange rates from 
file currency.xml
[junit4:junit4]   1 INFO  - 2013-05-12 23:41:06.297; 
org.apache.solr.schema.FileExchangeRateProvider; Reloading exchange rates from 
file currency.xml
[junit4:junit4]   1 INFO  - 2013-05-12 23:41:06.303; 

[JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.8.0-ea-b86) - Build # 5626 - Still Failing!

2013-05-12 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/5626/
Java: 64bit/jdk1.8.0-ea-b86 -XX:-UseCompressedOops -XX:+UseSerialGC

1 tests failed.
FAILED:  junit.framework.TestSuite.org.apache.solr.search.QueryEqualityTest

Error Message:
testParserCoverage was run w/o any other method explicitly testing qparser: 
maxscore

Stack Trace:
java.lang.AssertionError: testParserCoverage was run w/o any other method 
explicitly testing qparser: maxscore
at __randomizedtesting.SeedInfo.seed([298F356A4C536626]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.assertTrue(Assert.java:43)
at 
org.apache.solr.search.QueryEqualityTest.afterClassParserCoverageTest(QueryEqualityTest.java:61)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:490)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:700)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at java.lang.Thread.run(Thread.java:722)




Build Log:
[...truncated 9504 lines...]
[junit4:junit4] Suite: org.apache.solr.search.QueryEqualityTest
[junit4:junit4]   1 INFO  - 2013-05-13 06:17:55.634; 
org.apache.solr.SolrTestCaseJ4; initCore
[junit4:junit4]   1 INFO  - 2013-05-13 06:17:55.635; 
org.apache.solr.core.SolrResourceLoader; new SolrResourceLoader for directory: 
'/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/build/solr-core/test-files/solr/collection1/'
[junit4:junit4]   1 INFO  - 2013-05-13 06:17:55.635; 
org.apache.solr.core.SolrResourceLoader; Adding 
'file:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/build/solr-core/test-files/solr/collection1/lib/README'
 to classloader
[junit4:junit4]   1 INFO  - 2013-05-13 06:17:55.636; 
org.apache.solr.core.SolrResourceLoader; Adding 
'file:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/build/solr-core/test-files/solr/collection1/lib/classes/'
 to classloader
[junit4:junit4]   1 INFO  - 2013-05-13 06:17:55.668; 
org.apache.solr.core.SolrConfig; Using Lucene MatchVersion: LUCENE_50
[junit4:junit4]   1 INFO  - 2013-05-13 06:17:55.730; 
org.apache.solr.core.SolrConfig; Loaded SolrConfig: solrconfig.xml
[junit4:junit4]   1 INFO  - 2013-05-13 06:17:55.731; 
org.apache.solr.schema.IndexSchema; Reading Solr Schema from schema15.xml
[junit4:junit4]   1 INFO  - 2013-05-13 06:17:55.736; 
org.apache.solr.schema.IndexSchema; [null] Schema name=test
[junit4:junit4]   1 INFO  - 2013-05-13 06:17:56.070; 
org.apache.solr.schema.IndexSchema; default search field in schema is text
[junit4:junit4]   1 INFO  - 2013-05-13 06:17:56.072; 
org.apache.solr.schema.IndexSchema; unique key field: id
[junit4:junit4]   1 INFO  - 2013-05-13 06:17:56.074; 
org.apache.solr.schema.FileExchangeRateProvider; Reloading exchange rates from 
file currency.xml
[junit4:junit4]   1 INFO  - 2013-05-13 06:17:56.076; 
org.apache.solr.schema.FileExchangeRateProvider; Reloading exchange rates from 
file currency.xml
[junit4:junit4]   1 INFO  - 

[jira] [Reopened] (SOLR-4785) New MaxScoreQParserPlugin

2013-05-12 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller reopened SOLR-4785:
---


This one seems to be causing a test failure.

 New MaxScoreQParserPlugin
 -

 Key: SOLR-4785
 URL: https://issues.apache.org/jira/browse/SOLR-4785
 Project: Solr
  Issue Type: New Feature
  Components: query parsers
Reporter: Jan Høydahl
Assignee: Jan Høydahl
Priority: Minor
 Fix For: 5.0, 4.4

 Attachments: SOLR-4785.patch, SOLR-4785.patch


 A customer wants to contribute this component back.
 It is a QParser which behaves exactly like the lucene parser (extends it), but 
 returns the max score from the clauses, i.e. max(c1,c2,c3..) instead of the 
 default, which is sum(c1,c2,c3...). It does this by wrapping all SHOULD 
 clauses in a DisjunctionMaxQuery with tie=1.0. Any MUST or PROHIBITED clauses 
 are passed through as-is. Non-boolean queries, e.g. NumericRange, 
 fall through to the lucene parser.
 To use, add to solrconfig.xml:
 {code:xml}
   <queryParser name="maxscore" class="solr.MaxScoreQParserPlugin"/>
 {code}
 Then use it in a query:
 {noformat}
 q=A AND B AND {!maxscore v=$max}&max=C OR (D AND E)
 {noformat}
 This will return the score of A+B+max(C,sum(D+E)).
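
 (For reference, a hand-rolled sketch of that rewrite in plain Lucene terms - the 
 field and terms here are made up. Note that per the DisjunctionMaxQuery javadoc 
 a tiebreaker of 0.0 yields the pure max, while 1.0 degenerates to the sum.)
 {code}
 import org.apache.lucene.index.Term;
 import org.apache.lucene.search.BooleanClause.Occur;
 import org.apache.lucene.search.BooleanQuery;
 import org.apache.lucene.search.DisjunctionMaxQuery;
 import org.apache.lucene.search.TermQuery;

 public class MaxScoreSketch {
     static BooleanQuery rewrite() {
         // SHOULD clauses move under a DisjunctionMaxQuery so only the
         // best-scoring one contributes.
         DisjunctionMaxQuery max = new DisjunctionMaxQuery(0.0f);
         max.add(new TermQuery(new Term("f", "c1")));
         max.add(new TermQuery(new Term("f", "c2")));
         max.add(new TermQuery(new Term("f", "c3")));

         BooleanQuery top = new BooleanQuery();
         top.add(max, Occur.SHOULD);
         // MUST / PROHIBITED clauses pass through as-is.
         top.add(new TermQuery(new Term("f", "required")), Occur.MUST);
         return top;
     }
 }
 {code}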

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4234) Add support for binary files in ZooKeeper.

2013-05-12 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13655696#comment-13655696
 ] 

Commit Tag Bot commented on SOLR-4234:
--

[trunk commit] markrmiller
http://svn.apache.org/viewvc?view=revision&revision=1481675

SOLR-4234: Add support for binary files in ZooKeeper.

 Add support for binary files in ZooKeeper.
 --

 Key: SOLR-4234
 URL: https://issues.apache.org/jira/browse/SOLR-4234
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.0
Reporter: Eric Pugh
Assignee: Mark Miller
 Fix For: 4.4

 Attachments: binary_upload_download.patch, 
 fix_show_file_handler_with_binaries.patch, SOLR4234_binary_files.patch, 
 solr.png


 I was attempting to get the ShowFileHandler to show a .png file, and it was 
 failing.  But in non-ZK mode it worked just fine!   It took a while, but it 
 seems that we upload to zk as text, and download as text as well.  I've attached a 
 unit test that demonstrates the problem, and a fix.  You have to have a 
 binary file in the conf directory to make the test work; I put solr.png in 
 the collection1/conf/velocity directory.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4234) Add support for binary files in ZooKeeper.

2013-05-12 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13655697#comment-13655697
 ] 

Commit Tag Bot commented on SOLR-4234:
--

[branch_4x commit] markrmiller
http://svn.apache.org/viewvc?view=revision&revision=1481676

SOLR-4234: Add support for binary files in ZooKeeper.

 Add support for binary files in ZooKeeper.
 --

 Key: SOLR-4234
 URL: https://issues.apache.org/jira/browse/SOLR-4234
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.0
Reporter: Eric Pugh
Assignee: Mark Miller
 Fix For: 4.4

 Attachments: binary_upload_download.patch, 
 fix_show_file_handler_with_binaries.patch, SOLR4234_binary_files.patch, 
 solr.png


 I was attempting to get the ShowFileHandler to show a .png file, and it was 
 failing.  But in non-ZK mode it worked just fine!   It took a while, but it 
 seems that we upload to zk as text, and download as text as well.  I've attached a 
 unit test that demonstrates the problem, and a fix.  You have to have a 
 binary file in the conf directory to make the test work; I put solr.png in 
 the collection1/conf/velocity directory.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Assigned] (SOLR-4796) zkcli.sh should honor JAVA_HOME

2013-05-12 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller reassigned SOLR-4796:
-

Assignee: Mark Miller

 zkcli.sh should honor JAVA_HOME
 ---

 Key: SOLR-4796
 URL: https://issues.apache.org/jira/browse/SOLR-4796
 Project: Solr
  Issue Type: Bug
  Components: scripts and tools
Affects Versions: 4.2
Reporter: Roman Shaposhnik
Assignee: Mark Miller
 Fix For: 4.4

 Attachments: SOLR-4796.patch.txt


 On a system with GNU java installed, the fact that zkcli.sh doesn't honor 
 JAVA_HOME could lead to a hard-to-diagnose failure:
 {noformat}
 Exception in thread "main" java.lang.NoClassDefFoundError: 
 org.apache.solr.cloud.ZkCLI
at gnu.java.lang.MainThread.run(libgcj.so.7rh)
 Caused by: java.lang.ClassNotFoundException: org.apache.solr.cloud.ZkCLI not 
 found in gnu.gcj.runtime.SystemClassLoader{urls=[], 
 parent=gnu.gcj.runtime.ExtensionClassLoader{urls=[], parent=null}}
at java.net.URLClassLoader.findClass(libgcj.so.7rh)
at java.lang.ClassLoader.loadClass(libgcj.so.7rh)
at java.lang.ClassLoader.loadClass(libgcj.so.7rh)
at gnu.java.lang.MainThread.run(libgcj.so.7rh)
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4796) zkcli.sh should honor JAVA_HOME

2013-05-12 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-4796:
--

Fix Version/s: 5.0

 zkcli.sh should honor JAVA_HOME
 ---

 Key: SOLR-4796
 URL: https://issues.apache.org/jira/browse/SOLR-4796
 Project: Solr
  Issue Type: Bug
  Components: scripts and tools
Affects Versions: 4.2
Reporter: Roman Shaposhnik
Assignee: Mark Miller
 Fix For: 5.0, 4.4

 Attachments: SOLR-4796.patch.txt


 On a system with GNU java installed, the fact that zkcli.sh doesn't honor 
 JAVA_HOME could lead to a hard-to-diagnose failure:
 {noformat}
 Exception in thread "main" java.lang.NoClassDefFoundError: 
 org.apache.solr.cloud.ZkCLI
at gnu.java.lang.MainThread.run(libgcj.so.7rh)
 Caused by: java.lang.ClassNotFoundException: org.apache.solr.cloud.ZkCLI not 
 found in gnu.gcj.runtime.SystemClassLoader{urls=[], 
 parent=gnu.gcj.runtime.ExtensionClassLoader{urls=[], parent=null}}
at java.net.URLClassLoader.findClass(libgcj.so.7rh)
at java.lang.ClassLoader.loadClass(libgcj.so.7rh)
at java.lang.ClassLoader.loadClass(libgcj.so.7rh)
at gnu.java.lang.MainThread.run(libgcj.so.7rh)
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4796) zkcli.sh should honor JAVA_HOME

2013-05-12 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13655702#comment-13655702
 ] 

Mark Miller commented on SOLR-4796:
---

I've tested this - seems to work as expected with or without a JAVA_HOME env 
variable set. I'll commit shortly.

 zkcli.sh should honor JAVA_HOME
 ---

 Key: SOLR-4796
 URL: https://issues.apache.org/jira/browse/SOLR-4796
 Project: Solr
  Issue Type: Bug
  Components: scripts and tools
Affects Versions: 4.2
Reporter: Roman Shaposhnik
Assignee: Mark Miller
 Fix For: 5.0, 4.4

 Attachments: SOLR-4796.patch.txt


 On a system with GNU java installed, the fact that zkcli.sh doesn't honor 
 JAVA_HOME could lead to a hard-to-diagnose failure:
 {noformat}
 Exception in thread "main" java.lang.NoClassDefFoundError: 
 org.apache.solr.cloud.ZkCLI
at gnu.java.lang.MainThread.run(libgcj.so.7rh)
 Caused by: java.lang.ClassNotFoundException: org.apache.solr.cloud.ZkCLI not 
 found in gnu.gcj.runtime.SystemClassLoader{urls=[], 
 parent=gnu.gcj.runtime.ExtensionClassLoader{urls=[], parent=null}}
at java.net.URLClassLoader.findClass(libgcj.so.7rh)
at java.lang.ClassLoader.loadClass(libgcj.so.7rh)
at java.lang.ClassLoader.loadClass(libgcj.so.7rh)
at gnu.java.lang.MainThread.run(libgcj.so.7rh)
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.7.0_21) - Build # 5565 - Still Failing!

2013-05-12 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/5565/
Java: 32bit/jdk1.7.0_21 -client -XX:+UseParallelGC

1 tests failed.
FAILED:  junit.framework.TestSuite.org.apache.solr.search.QueryEqualityTest

Error Message:
testParserCoverage was run w/o any other method explicitly testing qparser: 
maxscore

Stack Trace:
java.lang.AssertionError: testParserCoverage was run w/o any other method 
explicitly testing qparser: maxscore
at __randomizedtesting.SeedInfo.seed([8D1918CCABFBC7B7]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.assertTrue(Assert.java:43)
at 
org.apache.solr.search.QueryEqualityTest.afterClassParserCoverageTest(QueryEqualityTest.java:61)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:700)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at java.lang.Thread.run(Thread.java:722)




Build Log:
[...truncated 9250 lines...]
[junit4:junit4] Suite: org.apache.solr.search.QueryEqualityTest
[junit4:junit4]   1 INFO  - 2013-05-13 00:47:28.972; 
org.apache.solr.SolrTestCaseJ4; initCore
[junit4:junit4]   1 INFO  - 2013-05-13 00:47:28.973; 
org.apache.solr.core.SolrResourceLoader; new SolrResourceLoader for directory: 
'/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/build/solr-core/test-files/solr/collection1/'
[junit4:junit4]   1 INFO  - 2013-05-13 00:47:28.974; 
org.apache.solr.core.SolrResourceLoader; Adding 
'file:/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/build/solr-core/test-files/solr/collection1/lib/README'
 to classloader
[junit4:junit4]   1 INFO  - 2013-05-13 00:47:28.974; 
org.apache.solr.core.SolrResourceLoader; Adding 
'file:/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/build/solr-core/test-files/solr/collection1/lib/classes/'
 to classloader
[junit4:junit4]   1 INFO  - 2013-05-13 00:47:29.009; 
org.apache.solr.core.SolrConfig; Using Lucene MatchVersion: LUCENE_44
[junit4:junit4]   1 INFO  - 2013-05-13 00:47:29.053; 
org.apache.solr.core.SolrConfig; Loaded SolrConfig: solrconfig.xml
[junit4:junit4]   1 INFO  - 2013-05-13 00:47:29.054; 
org.apache.solr.schema.IndexSchema; Reading Solr Schema from schema15.xml
[junit4:junit4]   1 INFO  - 2013-05-13 00:47:29.058; 
org.apache.solr.schema.IndexSchema; [null] Schema name=test
[junit4:junit4]   1 INFO  - 2013-05-13 00:47:29.382; 
org.apache.solr.schema.IndexSchema; default search field in schema is text
[junit4:junit4]   1 INFO  - 2013-05-13 00:47:29.385; 
org.apache.solr.schema.IndexSchema; unique key field: id
[junit4:junit4]   1 INFO  - 2013-05-13 00:47:29.386; 
org.apache.solr.schema.FileExchangeRateProvider; Reloading exchange rates from 
file currency.xml
[junit4:junit4]   1 INFO  - 2013-05-13 00:47:29.389; 
org.apache.solr.schema.FileExchangeRateProvider; Reloading exchange rates from 
file currency.xml
[junit4:junit4]   1 INFO  - 2013-05-13 00:47:29.395; 

[jira] [Updated] (SOLR-4785) New MaxScoreQParserPlugin

2013-05-12 Thread Greg Bowyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Bowyer updated SOLR-4785:
--

Attachment: SOLR-4785-Add-tests-for-maxscore-to-QueryEqualityTest.patch

This bit me while I was updating my filter patch (SOLR-3763).

I had a stab at putting some basic equality tests in place, but looking at the 
test case itself I wonder if QueryEqualityTest should be reworked with the 
full fury of randomised testing, as it seems to be, at best, only testing the 
happy cases.

 New MaxScoreQParserPlugin
 -

 Key: SOLR-4785
 URL: https://issues.apache.org/jira/browse/SOLR-4785
 Project: Solr
  Issue Type: New Feature
  Components: query parsers
Reporter: Jan Høydahl
Assignee: Jan Høydahl
Priority: Minor
 Fix For: 5.0, 4.4

 Attachments: 
 SOLR-4785-Add-tests-for-maxscore-to-QueryEqualityTest.patch, SOLR-4785.patch, 
 SOLR-4785.patch


 A customer wants to contribute this component back.
 It is a QParser which behaves exactly like the lucene parser (extends it), but 
 returns the max score from the clauses, i.e. max(c1,c2,c3..) instead of the 
 default, which is sum(c1,c2,c3...). It does this by wrapping all SHOULD 
 clauses in a DisjunctionMaxQuery with tie=1.0. Any MUST or PROHIBITED clauses 
 are passed through as-is. Non-boolean queries, e.g. NumericRange, 
 fall through to the lucene parser.
 To use, add to solrconfig.xml:
 {code:xml}
   <queryParser name="maxscore" class="solr.MaxScoreQParserPlugin"/>
 {code}
 Then use it in a query:
 {noformat}
 q=A AND B AND {!maxscore v=$max}&max=C OR (D AND E)
 {noformat}
 This will return the score of A+B+max(C,sum(D+E)).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-3763) Make solr use lucene filters directly

2013-05-12 Thread Greg Bowyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Bowyer updated SOLR-3763:
--

Attachment: SOLR-3763-Make-solr-use-lucene-filters-directly.patch

 Trunk moves really quickly these days (or I move slowly).

Updated patch to cope with recent trunk changes.

 Make solr use lucene filters directly
 -

 Key: SOLR-3763
 URL: https://issues.apache.org/jira/browse/SOLR-3763
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.0, 4.1, 5.0
Reporter: Greg Bowyer
Assignee: Greg Bowyer
 Fix For: 5.0

 Attachments: SOLR-3763-Make-solr-use-lucene-filters-directly.patch, 
 SOLR-3763-Make-solr-use-lucene-filters-directly.patch, 
 SOLR-3763-Make-solr-use-lucene-filters-directly.patch, 
 SOLR-3763-Make-solr-use-lucene-filters-directly.patch


 Presently solr uses bitsets, queries and collectors to implement the concept 
 of filters. This has proven to be very powerful, but does come at the cost of 
 introducing a large body of code into solr making it harder to optimise and 
 maintain.
 Another issue here is that filters currently cache sub-optimally given the 
 changes in lucene towards atomic readers.
 Rather than patch these issues, this is an attempt to rework the filters in 
 solr to leverage the Filter subsystem from lucene as much as possible.
 In good time the aim is to get this to do the following:
 ∘ Handle setting up filter implementations that are able to correctly cache 
 with reference to the AtomicReader that they are caching for rather than for 
 the entire index at large
 ∘ Get the post filters working; I am thinking that this can be done via 
 lucene's chained filter, with the "expensive" filters being put towards the 
 end of the chain - this has different semantics internally to the original 
 implementation but IMHO should have the same result for end users
 ∘ Learn how to create filters that are potentially more efficient, at present 
 solr basically runs a simple query that gathers a DocSet that relates to the 
 documents that we want filtered; it would be interesting to make use of 
 filter implementations that are in theory faster than query filters (for 
 instance there are filters that are able to query the FieldCache)
 ∘ Learn how to decompose filters so that a complex filter query can be cached 
 (potentially) as its constituent parts; for example the filter below 
 currently needs love, care and feeding to ensure that the filter cache is not 
 unduly stressed
 {code}
   'category:(100) OR category:(200) OR category:(300)'
 {code}
 Really there is no reason not to express this in a cached form as 
 {code}
 BooleanFilter(
 FilterClause(CachedFilter(TermFilter(Term(category, 100))), SHOULD),
 FilterClause(CachedFilter(TermFilter(Term(category, 200))), SHOULD),
 FilterClause(CachedFilter(TermFilter(Term(category, 300))), SHOULD)
   )
 {code}
 This would yield better cache usage I think as we can reuse docsets across 
 multiple queries, as well as avoid issues when filters are presented in 
 differing orders
 ∘ Instead of end users providing costing we might (and this is a big might, 
 FWIW) be able to create a sort of execution plan of filters, leveraging a 
 combination of what the index is able to tell us as well as sampling and 
 "educated guesswork"; in essence this is what some DBMS software, for example 
 postgresql, does - it has a genetic algorithm that attempts to solve the 
 travelling salesman problem - to great effect
 ∘ I am sure I will probably come up with other ambitious ideas to plug in 
 here. :S
 Patches obviously forthcoming but the bulk of the work can be followed here 
 https://github.com/GregBowyer/lucene-solr/commits/solr-uses-lucene-filters

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4690) Highlighting Doesn't works when boost is used along with query

2013-05-12 Thread Nguyen Manh Tien (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13655723#comment-13655723
 ] 

Nguyen Manh Tien commented on SOLR-4690:


Could you post your response?

 Highlighting Doesn't works when boost is used along with query
 --

 Key: SOLR-4690
 URL: https://issues.apache.org/jira/browse/SOLR-4690
 Project: Solr
  Issue Type: Bug
  Components: highlighter
Affects Versions: 4.1
 Environment: windows and unix both
Reporter: lukes shaw
Priority: Critical

 Hi everyone, recently I was trying to have the boost in the query and 
 highlighting on in parallel. But if I have the boost, highlighting doesn't 
 work; the moment I remove the boost, highlighting starts working again.
 Below is the request I am sending.
 http://localhost:8983/solr/collection1/select?q=%2B_query_%3A%22
 {!type%3Dedismax+qf%3D%27body^1.0+title^10.0%27+pf%3D%27body^2%27+ps%3D36+pf2%3D%27body^2%27+pf3%3D%27body^2%27+v%3D%27apple%27+mm%3D100}%22&group=true&group.field=content_group_id_k&group.ngroups=true&group.limit=3&fl=id%2Clanguage_k%2Clast_modified_date_dt%2Ctitle&rows=20&hl.snippets=1&hl.fragsize=200&hl.fl=body&hl.fl=title&hl=true&hl.q=%2B_query_%3A%22{!type%3Dedismax+qf%3D%27body^1.0+title^10.0%27+pf%3D%27body^2%27+ps%3D36+pf2%3D%27body^2%27+pf3%3D%27body^2%27+v%3D%27apple%27+mm%3D100}
 %22&debugQuery=true&wt=json&indent=true&hl.snippets=1&hl.fragsize=200&hl.fl=body&hl.fl=title&hl=true&boost=boost_weight
 OR
 http://localhost:8983/solr/collection1/select?q=%2B_query_%3A%22
 {!type%3Dedismax+qf%3D%27body^1.0+title^10.0%27+pf%3D%27body^2%27+ps%3D36+pf2%3D%27body^2%27+pf3%3D%27body^2%27+v%3D%27apple%27+mm%3D100}
 %22&group=true&group.field=content_group_id_k&group.ngroups=true&group.limit=3&fl=id%2Clanguage_k%2Clast_modified_date_dt%2Ctitle&rows=20&hl.snippets=1&hl.fragsize=200&hl.fl=body&hl.fl=title&hl=true&debugQuery=true&wt=json&indent=true&hl.snippets=1&hl.fragsize=200&hl.fl=body&hl.fl=title&hl=true&boost=boost_weight
 But if I do the above two without the boost, or use bf (additive) instead of 
 boost (multiplicative), things work but I don't get the multiplicative boost.
 I am using Solr 4.1.0.
 Any help with this is really appreciated.
 Regards,
 Lukes

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-4690) Highlighting Doesn't works when boost is used along with query

2013-05-12 Thread Nguyen Manh Tien (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13655723#comment-13655723
 ] 

Nguyen Manh Tien edited comment on SOLR-4690 at 5/13/13 2:48 AM:
-

Could you post the response of the above query?

  was (Author: tiennm):
Could you post your response?
  
 Highlighting Doesn't works when boost is used along with query
 --

 Key: SOLR-4690
 URL: https://issues.apache.org/jira/browse/SOLR-4690
 Project: Solr
  Issue Type: Bug
  Components: highlighter
Affects Versions: 4.1
 Environment: windows and unix both
Reporter: lukes shaw
Priority: Critical

 Hi everyone, recently I was trying to have the boost in the query and 
 highlighting on in parallel. But if I have the boost, highlighting doesn't 
 work; the moment I remove the boost, highlighting starts working again.
 Below is the request I am sending.
 http://localhost:8983/solr/collection1/select?q=%2B_query_%3A%22
 {!type%3Dedismax+qf%3D%27body^1.0+title^10.0%27+pf%3D%27body^2%27+ps%3D36+pf2%3D%27body^2%27+pf3%3D%27body^2%27+v%3D%27apple%27+mm%3D100}%22&group=true&group.field=content_group_id_k&group.ngroups=true&group.limit=3&fl=id%2Clanguage_k%2Clast_modified_date_dt%2Ctitle&rows=20&hl.snippets=1&hl.fragsize=200&hl.fl=body&hl.fl=title&hl=true&hl.q=%2B_query_%3A%22{!type%3Dedismax+qf%3D%27body^1.0+title^10.0%27+pf%3D%27body^2%27+ps%3D36+pf2%3D%27body^2%27+pf3%3D%27body^2%27+v%3D%27apple%27+mm%3D100}
 %22&debugQuery=true&wt=json&indent=true&hl.snippets=1&hl.fragsize=200&hl.fl=body&hl.fl=title&hl=true&boost=boost_weight
 OR
 http://localhost:8983/solr/collection1/select?q=%2B_query_%3A%22
 {!type%3Dedismax+qf%3D%27body^1.0+title^10.0%27+pf%3D%27body^2%27+ps%3D36+pf2%3D%27body^2%27+pf3%3D%27body^2%27+v%3D%27apple%27+mm%3D100}
 %22&group=true&group.field=content_group_id_k&group.ngroups=true&group.limit=3&fl=id%2Clanguage_k%2Clast_modified_date_dt%2Ctitle&rows=20&hl.snippets=1&hl.fragsize=200&hl.fl=body&hl.fl=title&hl=true&debugQuery=true&wt=json&indent=true&hl.snippets=1&hl.fragsize=200&hl.fl=body&hl.fl=title&hl=true&boost=boost_weight
 But if I do the above two without the boost, or use bf (additive) instead of 
 boost (multiplicative), things work but I don't get the multiplicative boost.
 I am using Solr 4.1.0.
 Any help with this is really appreciated.
 Regards,
 Lukes

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-1560) maxDocBytesToAnalyze should be required arg up front

2013-05-12 Thread Greg Bowyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Bowyer updated LUCENE-1560:


Labels: dead  (was: )

 maxDocBytesToAnalyze should be required arg up front
 

 Key: LUCENE-1560
 URL: https://issues.apache.org/jira/browse/LUCENE-1560
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/highlighter
Affects Versions: 2.4.1
Reporter: Michael McCandless
  Labels: dead
 Fix For: 4.4


 We recently changed IndexWriter to require you to specify MaxFieldLength
 up front, on creation, so that you are aware of this dangerous "loses
 stuff" setting. Too many developers had fallen into the trap of "how come
 my search can't find this document?"
 I think we should do the same with maxDocBytesToAnalyze in the
 highlighter.
 Spinoff from this thread:
 
 http://www.nabble.com/Lucene-Highlighting-and-Dynamic-Summaries-p22385887.html
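
For context, this is the optional setting being discussed, and it is easy to overlook today (a minimal sketch against the 2.4.x highlighter API as I recall it; query is assumed to be an existing Query, and 50 KB is an arbitrary figure):

{code}
import org.apache.lucene.search.highlight.Highlighter;
import org.apache.lucene.search.highlight.QueryScorer;

Highlighter highlighter = new Highlighter(new QueryScorer(query));
// Easy to forget: text beyond this limit is silently never analyzed or highlighted.
highlighter.setMaxDocBytesToAnalyze(50 * 1024);
{code}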

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-1743) MMapDirectory should only mmap large files, small files should be opened using SimpleFS/NIOFS

2013-05-12 Thread Greg Bowyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Bowyer updated LUCENE-1743:


Labels: dead  (was: )

 MMapDirectory should only mmap large files, small files should be opened 
 using SimpleFS/NIOFS
 -

 Key: LUCENE-1743
 URL: https://issues.apache.org/jira/browse/LUCENE-1743
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/store
Affects Versions: 2.9
Reporter: Uwe Schindler
Assignee: Uwe Schindler
  Labels: dead
 Fix For: 4.4


 This is a followup to LUCENE-1741:
 The javadocs state (in FileChannel#map): "For most operating systems, mapping a 
 file into memory is more expensive than reading or writing a few tens of 
 kilobytes of data via the usual read and write methods. From the standpoint 
 of performance it is generally only worth mapping relatively large files into 
 memory."
 MMapDirectory should get a user-configurable size parameter that is a lower 
 limit for mmapping files. All files with a size below the limit should be opened 
 using a conventional IndexInput from SimpleFS or NIO (another configuration 
 option for the fallback?).
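
As a rough illustration of the proposal, a hypothetical configuration could look like this (setMinMMapSize is an invented name for this sketch; no such setter exists yet):

{code}
import java.io.File;
import org.apache.lucene.store.MMapDirectory;

MMapDirectory dir = new MMapDirectory(new File("/path/to/index"));
// Hypothetical: files smaller than 16 MB would be served via SimpleFS/NIOFS instead.
dir.setMinMMapSize(16 * 1024 * 1024);
{code}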

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-4812) Edismax highlighting query doesn't work.

2013-05-12 Thread Nguyen Manh Tien (JIRA)
Nguyen Manh Tien created SOLR-4812:
--

 Summary: Edismax highlighting query doesn't work.
 Key: SOLR-4812
 URL: https://issues.apache.org/jira/browse/SOLR-4812
 Project: Solr
  Issue Type: Bug
 Environment: When hl.q is an edismax query, highlighting will ignore the 
query specified in hl.q
Reporter: Nguyen Manh Tien
Priority: Minor




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4812) Edismax highlighting query doesn't work.

2013-05-12 Thread Nguyen Manh Tien (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nguyen Manh Tien updated SOLR-4812:
---

Attachment: SOLR-4812.patch

Here is a patch

 Edismax highlighting query doesn't work.
 

 Key: SOLR-4812
 URL: https://issues.apache.org/jira/browse/SOLR-4812
 Project: Solr
  Issue Type: Bug
 Environment: When hl.q is an edismax query, highlighting will ignore 
 the query specified in hl.q
Reporter: Nguyen Manh Tien
Priority: Minor
 Attachments: SOLR-4812.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4812) Edismax highlighting query doesn't work.

2013-05-12 Thread Nguyen Manh Tien (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nguyen Manh Tien updated SOLR-4812:
---

Description: 
Example edismax highlighting query: hl.q={!edismax qf=title v=Software}
The getHighlightQuery function in edismax doesn't parse the highlight query, so 
it always returns null and hl.q is ignored.
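
A rough sketch of the kind of change this implies, inside ExtendedDismaxQParser (illustrative only, not necessarily the attached patch; the SyntaxError exception and the parsedUserQuery field are assumed from the Solr 4.x code):

{code}
@Override
public Query getHighlightQuery() throws SyntaxError {
  // Fall back to the parsed user query instead of returning null,
  // so hl.q={!edismax ...} is honored by the highlighter.
  return parsedUserQuery == null ? getQuery() : parsedUserQuery;
}
{code}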

 Edismax highlighting query doesn't work.
 

 Key: SOLR-4812
 URL: https://issues.apache.org/jira/browse/SOLR-4812
 Project: Solr
  Issue Type: Bug
 Environment: When hl.q is an edismax query, highlighting will ignore 
 the query specified in hl.q
Reporter: Nguyen Manh Tien
Priority: Minor
 Attachments: SOLR-4812.patch


 Example edismax highlighting query: hl.q={!edismax qf=title v=Software}
 The getHighlightQuery function in edismax doesn't parse the highlight query, so 
 it always returns null and hl.q is ignored.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-2713) TestPhraseQuery.testRandomPhrases takes minutes to run with SimpleText

2013-05-12 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13655732#comment-13655732
 ] 

Commit Tag Bot commented on LUCENE-2713:


[trunk commit] gbowyer
http://svn.apache.org/viewvc?view=revisionrevision=1481693

LUCENE-2713: Removed fixed test seed from TestPhraseQuery

 TestPhraseQuery.testRandomPhrases takes minutes to run with SimpleText
 --

 Key: LUCENE-2713
 URL: https://issues.apache.org/jira/browse/LUCENE-2713
 Project: Lucene - Core
  Issue Type: Bug
  Components: general/test
Affects Versions: 4.0-ALPHA
Reporter: Robert Muir
  Labels: dead
 Fix For: 4.4


 This test takes a few minutes to run if it gets the SimpleText codec.
 On Hudson, it took 15 minutes!
 I added an assumeFalse(simpleText) as a temporary workaround, but we should 
 see if there is something we can improve so we can remove this hack.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-2713) TestPhraseQuery.testRandomPhrases takes minutes to run with SimpleText

2013-05-12 Thread Greg Bowyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Bowyer resolved LUCENE-2713.
-

   Resolution: Fixed
Fix Version/s: (was: 4.4)
   5.0

I beat on this test case a few times, choosing all the codecs, and I could not 
reproduce the slowdown. I am thinking that both the thread-leak and performance 
issues have long since been fixed.

I am removing the fixed seed and closing this bug down, hopefully never to see 
it again.
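
For readers unfamiliar with the mechanism: one common way a Lucene test gets pinned to a single random dataset is a seed annotation from the randomizedtesting framework, so the removal looks roughly like this (the seed value below is a made-up placeholder):

{code}
// Before: every run replays the same (slow) random phrases.
@Seed("DEADBEEFCAFEBABE")
public class TestPhraseQuery extends LuceneTestCase { /* ... */ }

// After: the annotation is deleted and each run draws a fresh seed.
public class TestPhraseQuery extends LuceneTestCase { /* ... */ }
{code}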

 TestPhraseQuery.testRandomPhrases takes minutes to run with SimpleText
 --

 Key: LUCENE-2713
 URL: https://issues.apache.org/jira/browse/LUCENE-2713
 Project: Lucene - Core
  Issue Type: Bug
  Components: general/test
Affects Versions: 4.0-ALPHA
Reporter: Robert Muir
  Labels: dead
 Fix For: 5.0


 This test takes a few minutes to run if it gets the SimpleText codec.
 On Hudson, it took 15 minutes!
 I added an assumeFalse(simpleText) as a temporary workaround, but we should 
 see if there is something we can improve so we can remove this hack.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4812) Edismax highlighting query doesn't work.

2013-05-12 Thread Otis Gospodnetic (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Otis Gospodnetic updated SOLR-4812:
---

Description: 
When hl.q is an edismax query, highlighting will ignore the query specified in 
hl.q

Example edismax highlighting query: hl.q={!edismax qf=title v=Software}

The getHighlightQuery function in edismax doesn't parse the highlight query, so 
it always returns null and hl.q is ignored.

  was:
edismax highlighting query hl.q={!edismax qf=title v=Software}
function getHighlightQuery in edismax don't parse highlight query so it always 
return null so hl.q is ignored.


 Edismax highlighting query doesn't work.
 

 Key: SOLR-4812
 URL: https://issues.apache.org/jira/browse/SOLR-4812
 Project: Solr
  Issue Type: Bug
 Environment: When hl.q is an edismax query, highlighting will ignore 
 the query specified in hl.q
Reporter: Nguyen Manh Tien
Priority: Minor
 Attachments: SOLR-4812.patch


 When hl.q is an edismax query, highlighting will ignore the query specified in 
 hl.q
 Example edismax highlighting query: hl.q={!edismax qf=title v=Software}
 The getHighlightQuery function in edismax doesn't parse the highlight query, so 
 it always returns null and hl.q is ignored.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4812) Edismax highlighting query doesn't work.

2013-05-12 Thread Otis Gospodnetic (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Otis Gospodnetic updated SOLR-4812:
---

Affects Version/s: 4.2
   4.3

 Edismax highlighting query doesn't work.
 

 Key: SOLR-4812
 URL: https://issues.apache.org/jira/browse/SOLR-4812
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.2, 4.3
 Environment: When hl.q is an edismax query, highlighting will ignore 
 the query specified in hl.q
Reporter: Nguyen Manh Tien
Priority: Minor
 Attachments: SOLR-4812.patch


 When hl.q is an edismax query, highlighting will ignore the query specified in 
 hl.q
 Example edismax highlighting query: hl.q={!edismax qf=title v=Software}
 The getHighlightQuery function in edismax doesn't parse the highlight query, so 
 it always returns null and hl.q is ignored.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Closed] (LUCENE-1890) auto-warming from Apache Solr causes NULL Pointer

2013-05-12 Thread Greg Bowyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Bowyer closed LUCENE-1890.
---

Resolution: Cannot Reproduce
  Assignee: Greg Bowyer

I am going to be bold and make the assumption that, since spatial has been 
re-worked and Lucene has gone from 2.x to 4.x, this issue is no longer present.

 auto-warming from Apache Solr causes NULL Pointer
 -

 Key: LUCENE-1890
 URL: https://issues.apache.org/jira/browse/LUCENE-1890
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/spatial
Affects Versions: 2.4.1
 Environment: Linux
Reporter: Bill Bell
Assignee: Greg Bowyer
  Labels: dead
 Fix For: 4.4

 Attachments: localsolr.jar, lucene-spatial-2.9-dev.jar


 Sep 6, 2009 12:48:07 PM org.apache.solr.common.SolrException log
 SEVERE: Error during auto-warming of 
 key:org.apache.solr.search.QueryResultKey@b00371eb:java.lang.NullPointerException
 at 
 org.apache.lucene.spatial.tier.DistanceFieldComparatorSource$DistanceScoreDocLookupComparator.copy(DistanceFieldComparatorSource.java:101)
 at 
 org.apache.lucene.search.TopFieldCollector$MultiComparatorScoringMaxScoreCollector.collect(TopFieldCollector.java:554)
 at 
 org.apache.solr.search.DocSetDelegateCollector.collect(DocSetHitCollector.java:98)
 at 
 org.apache.lucene.search.IndexSearcher.doSearch(IndexSearcher.java:281)
 at 
 org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:253)
 at org.apache.lucene.search.Searcher.search(Searcher.java:171)
 at 
 org.apache.solr.search.SolrIndexSearcher.getDocListAndSetNC(SolrIndexSearcher.java:1088)
 at 
 org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:876)
 at 
 org.apache.solr.search.SolrIndexSearcher.access$000(SolrIndexSearcher.java:53)
 at 
 org.apache.solr.search.SolrIndexSearcher$3.regenerateItem(SolrIndexSearcher.java:328)
 at org.apache.solr.search.LRUCache.warm(LRUCache.java:194)
 at 
 org.apache.solr.search.SolrIndexSearcher.warm(SolrIndexSearcher.java:1468)
 at org.apache.solr.core.SolrCore$3.call(SolrCore.java:1142)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
 at java.lang.Thread.run(Thread.java:619)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-4813) Unavoidable IllegalArgumentException occurs when SynonymFilterFactory's setting has tokenizer's parameter.

2013-05-12 Thread Shingo Sasaki (JIRA)
Shingo Sasaki created SOLR-4813:
---

 Summary: Unavoidable IllegalArgumentException occurs when 
SynonymFilterFactory's setting has tokenizer's parameter.
 Key: SOLR-4813
 URL: https://issues.apache.org/jira/browse/SOLR-4813
 Project: Solr
  Issue Type: Bug
  Components: Schema and Analysis
Affects Versions: 4.3
Reporter: Shingo Sasaki
Priority: Critical


When I write the SynonymFilterFactory setting in schema.xml as follows ...

{code:xml}
<analyzer>
  <tokenizer class="solr.NGramTokenizerFactory" maxGramSize="2" minGramSize="2"/>
  <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
          ignoreCase="true" expand="true"
          tokenizerFactory="solr.NGramTokenizerFactory" maxGramSize="2" minGramSize="2"/>
</analyzer>
{code}

IllegalArgumentException ("Unknown parameters") occurs.

{noformat}
Caused by: java.lang.IllegalArgumentException: Unknown parameters: 
{maxGramSize=2, minGramSize=2}
at 
org.apache.lucene.analysis.synonym.FSTSynonymFilterFactory.init(FSTSynonymFilterFactory.java:71)
at 
org.apache.lucene.analysis.synonym.SynonymFilterFactory.init(SynonymFilterFactory.java:50)
... 28 more
{noformat}

However, the TokenizerFactory's params should be passed to the loadTokenizerFactory 
method in [FST|Slow]SynonymFilterFactory (ref. SOLR-2909).

I think the problem was caused by LUCENE-4877 (Fix analyzer factories to 
throw exception when arguments are invalid) and SOLR-3402 (Parse Version 
outside of Analysis Factories).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-2540) Document. add get(i) and addAll to make interacting with fieldables and documents easier/faster and more readable

2013-05-12 Thread Greg Bowyer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13655743#comment-13655743
 ] 

Greg Bowyer commented on LUCENE-2540:
-

Outside of batch-adding fields, it looks like this issue is somewhat dead, since 
we can now address the field(s) by name and have sensible iterators over them.

Anyone opposed to closing this?
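
For reference, the by-name access and iteration alluded to above (API as of Lucene 4.x; doc is assumed to be an existing Document):

{code}
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexableField;

// Document implements Iterable<IndexableField> in 4.x:
for (IndexableField f : doc) {
  System.out.println(f.name() + " = " + f.stringValue());
}
// And fields can be addressed by name directly:
IndexableField[] titles = doc.getFields("title");
{code}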

 Document. add get(i) and addAll to make interacting with fieldables and 
 documents easier/faster and more readable
 -

 Key: LUCENE-2540
 URL: https://issues.apache.org/jira/browse/LUCENE-2540
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/other
Affects Versions: 3.0.2
Reporter: Woody Anderson
  Labels: dead
 Fix For: 4.4

 Attachments: LUCENE-2540.patch


 Working with Document Fieldables is often a pain.
 Getting the ith involves chained method calls and is not very readable:
 {code}
 // nice
 doc.getFieldable(i);
 // not nice
 doc.getFields().get(i);
 {code}
 Also, when combining documents, or otherwise aggregating multiple fields into 
 a single document:
 {code}
 // nice
 doc.addAll(fieldables);
 // not nice: less readable and more error prone
 List<Fieldable> fields = ...;
 for (Fieldable field : fields) {
   result.add(field);
 }
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4813) Unavoidable IllegalArgumentException occurs when SynonymFilterFactory's setting has tokenizer's parameter.

2013-05-12 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13655749#comment-13655749
 ] 

Jack Krupansky commented on SOLR-4813:
--

The problem is not that you have a tokenizerFactory attribute, but that you are 
trying to use a tokenizer that has attributes. Solr is simply complaining that 
you have two attributes, maxGramSize and minGramSize, that are not defined for 
the SynonymFilterFactory. Yes, that is a new feature in Solr! If your code was 
working in a previous release, you were lucky - it would have been using the 
default min and max of 1 and 1.

The synonym tokenizerFactory attribute has no provision for passing attributes 
to the synonym tokenizer. For now, you'll have to create a custom ngram 
tokenizer factory with the desired settings, as sketched below.
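
A minimal sketch of that workaround, assuming the constructor-argument factory style introduced in Lucene 4.3 by LUCENE-4877 (the class name BigramTokenizerFactory is invented for illustration):

{code}
import java.util.HashMap;
import java.util.Map;
import org.apache.lucene.analysis.ngram.NGramTokenizerFactory;

// Hard-codes the gram sizes so the schema's tokenizerFactory attribute
// needs no extra (and currently un-passable) parameters:
public class BigramTokenizerFactory extends NGramTokenizerFactory {
  public BigramTokenizerFactory(Map<String, String> args) {
    super(withGramSizes(args));
  }

  private static Map<String, String> withGramSizes(Map<String, String> args) {
    Map<String, String> m = new HashMap<String, String>(args);
    m.put("minGramSize", "2");
    m.put("maxGramSize", "2");
    return m;
  }
}
{code}

The schema would then reference tokenizerFactory="com.example.BigramTokenizerFactory" with no extra attributes on the filter.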


 Unavoidable IllegalArgumentException occurs when SynonymFilterFactory's 
 setting has tokenizer's parameter.
 --

 Key: SOLR-4813
 URL: https://issues.apache.org/jira/browse/SOLR-4813
 Project: Solr
  Issue Type: Bug
  Components: Schema and Analysis
Affects Versions: 4.3
Reporter: Shingo Sasaki
Priority: Critical
  Labels: SynonymFilterFactory

 When I write the SynonymFilterFactory setting in schema.xml as follows ...
 {code:xml}
 <analyzer>
   <tokenizer class="solr.NGramTokenizerFactory" maxGramSize="2" minGramSize="2"/>
   <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
           ignoreCase="true" expand="true"
           tokenizerFactory="solr.NGramTokenizerFactory" maxGramSize="2" minGramSize="2"/>
 </analyzer>
 {code}
 IllegalArgumentException ("Unknown parameters") occurs.
 {noformat}
 Caused by: java.lang.IllegalArgumentException: Unknown parameters: 
 {maxGramSize=2, minGramSize=2}
   at 
 org.apache.lucene.analysis.synonym.FSTSynonymFilterFactory.init(FSTSynonymFilterFactory.java:71)
   at 
 org.apache.lucene.analysis.synonym.SynonymFilterFactory.init(SynonymFilterFactory.java:50)
   ... 28 more
 {noformat}
 However, the TokenizerFactory's params should be passed to the 
 loadTokenizerFactory method in [FST|Slow]SynonymFilterFactory (ref. SOLR-2909).
 I think the problem was caused by LUCENE-4877 (Fix analyzer factories to 
 throw exception when arguments are invalid) and SOLR-3402 (Parse Version 
 outside of Analysis Factories).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-3038) Solrj should use javabin wireformat by default with updaterequests

2013-05-12 Thread Shawn Heisey (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Heisey updated SOLR-3038:
---

Attachment: SOLR-3038-abstract-writer.patch

I took a closer look at the tests and fiddled things around so that no test 
coverage is lost. Another update to the patch is attached.

 Solrj should use javabin wireformat by default with updaterequests
 --

 Key: SOLR-3038
 URL: https://issues.apache.org/jira/browse/SOLR-3038
 Project: Solr
  Issue Type: Improvement
  Components: clients - java
Affects Versions: 4.0-ALPHA
Reporter: Sami Siren
Priority: Minor
 Attachments: SOLR-3038-abstract-writer.patch, 
 SOLR-3038-abstract-writer.patch, SOLR-3038-abstract-writer.patch, 
 SOLR-3038-abstract-writer.patch


 The javabin wire format is faster than XML when feeding Solr - it should 
 become the default. 
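
For comparison, this is how a SolrJ client opts into javabin updates explicitly today; the patch would make it the out-of-the-box behavior (sketch against the 4.x SolrJ API, with a placeholder URL):

{code}
import org.apache.solr.client.solrj.impl.BinaryRequestWriter;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
// Send update requests as javabin instead of the current XML default:
server.setRequestWriter(new BinaryRequestWriter());
{code}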

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4813) Unavoidable IllegalArgumentException occurs when SynonymFilterFactory's setting has tokenizer's parameter.

2013-05-12 Thread Shingo Sasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13655772#comment-13655772
 ] 

Shingo Sasaki commented on SOLR-4813:
-

Hi Jack, thank you for your comment.
Do you mean that backward compatibility was lost? 

In the samples on SOLR-319, the SynonymFilterFactory filter tag has the 
minGramSize and maxGramSize attributes, and the TokenizerFactory instance is 
initialized with those args in FSTSynonymFilterFactory.loadTokenizerFactory in 
Lucene 4.2.1.

 Unavoidable IllegalArgumentException occurs when SynonymFilterFactory's 
 setting has tokenizer's parameter.
 --

 Key: SOLR-4813
 URL: https://issues.apache.org/jira/browse/SOLR-4813
 Project: Solr
  Issue Type: Bug
  Components: Schema and Analysis
Affects Versions: 4.3
Reporter: Shingo Sasaki
Priority: Critical
  Labels: SynonymFilterFactory

 When I write the SynonymFilterFactory setting in schema.xml as follows ...
 {code:xml}
 <analyzer>
   <tokenizer class="solr.NGramTokenizerFactory" maxGramSize="2" minGramSize="2"/>
   <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
           ignoreCase="true" expand="true"
           tokenizerFactory="solr.NGramTokenizerFactory" maxGramSize="2" minGramSize="2"/>
 </analyzer>
 {code}
 IllegalArgumentException ("Unknown parameters") occurs.
 {noformat}
 Caused by: java.lang.IllegalArgumentException: Unknown parameters: 
 {maxGramSize=2, minGramSize=2}
   at 
 org.apache.lucene.analysis.synonym.FSTSynonymFilterFactory.init(FSTSynonymFilterFactory.java:71)
   at 
 org.apache.lucene.analysis.synonym.SynonymFilterFactory.init(SynonymFilterFactory.java:50)
   ... 28 more
 {noformat}
 However, the TokenizerFactory's params should be passed to the 
 loadTokenizerFactory method in [FST|Slow]SynonymFilterFactory (ref. SOLR-2909).
 I think the problem was caused by LUCENE-4877 (Fix analyzer factories to 
 throw exception when arguments are invalid) and SOLR-3402 (Parse Version 
 outside of Analysis Factories).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org