Re: quiet.ant in the build file?

2012-01-26 Thread Shai Erera
See Lucene/common-build.xml, line 540:

   <!-- This is very loud and obnoxious. abuse touch instead for a quiet mkdir -->
   <touch file="@{tempDir}/@{threadNum}/quiet.ant" verbose="false" mkdirs="true"/>

Shai

On Thu, Jan 26, 2012 at 10:27 AM, Dawid Weiss dawid.we...@gmail.com wrote:

 What is the 'quiet.ant' file used for (around tests, in the ANT build files)?
 I couldn't find any occurrence of it anywhere, either in the ANT codebase or
 in the Lucene/Solr codebase.

 Dawid

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




[JENKINS] Lucene-Solr-tests-only-trunk - Build # 12262 - Still Failing

2012-01-26 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/12262/

1 test failed.
FAILED:  
junit.framework.TestSuite.org.apache.solr.cloud.FullSolrCloudDistribCmdsTest

Error Message:
ERROR: SolrIndexSearcher opens=0 closes=-1

Stack Trace:
junit.framework.AssertionFailedError: ERROR: SolrIndexSearcher opens=0 closes=-1
at 
org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:147)
at 
org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:76)




Build Log (for compile errors):
[...truncated 8164 lines...]






Re: quiet.ant in the build file?

2012-01-26 Thread Dawid Weiss
Oh, I see now -- it's just there to create the parent folder; the file itself
is not used at all... I thought it was a trigger to make ant's logging system
quiet for test runs. Thanks Shai.

Dawid
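For readers following along: Ant's mkdir task always logs a "Created dir:" line, while touch with mkdirs="true" and verbose="false" creates any missing parent directories silently as a side effect. A minimal sketch of the idiom (the target name and path here are illustrative, not taken from common-build.xml):

```xml
<project name="quiet-mkdir-demo" default="prep">
  <target name="prep">
    <!-- Loud: <mkdir dir="..."/> prints "Created dir: ..." to the log. -->
    <!-- Quiet: touch creates the parent directories without logging. -->
    <touch file="${java.io.tmpdir}/demo/1/quiet.ant"
           verbose="false" mkdirs="true"/>
  </target>
</project>
```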

On Thu, Jan 26, 2012 at 9:32 AM, Shai Erera ser...@gmail.com wrote:
 See Lucene/common-build.xml, line 540:

    <!-- This is very loud and obnoxious. abuse touch instead for a quiet mkdir -->
    <touch file="@{tempDir}/@{threadNum}/quiet.ant" verbose="false" mkdirs="true"/>

 Shai

 On Thu, Jan 26, 2012 at 10:27 AM, Dawid Weiss dawid.we...@gmail.com wrote:

 What is the 'quiet.ant' file used for (around tests, in the ANT build files)?
 I couldn't find any occurrence of it anywhere, either in the ANT codebase or
 in the Lucene/Solr codebase.

 Dawid







[JENKINS] Solr-trunk - Build # 1745 - Failure

2012-01-26 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Solr-trunk/1745/

6 tests failed.
FAILED:  org.apache.solr.search.TestRecovery.testTruncatedLog

Error Message:
null

Stack Trace:
junit.framework.AssertionFailedError
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:165)
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57)
at org.apache.solr.update.TransactionLog.init(TransactionLog.java:159)
at org.apache.solr.update.TransactionLog.init(TransactionLog.java:132)
at org.apache.solr.update.UpdateLog.ensureLog(UpdateLog.java:567)
at org.apache.solr.update.UpdateLog.add(UpdateLog.java:237)
at 
org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:184)
at 
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:56)
at 
org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:53)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:313)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:410)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:217)
at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:134)
at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:78)
at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:59)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1515)
at 
org.apache.solr.servlet.DirectSolrConnection.request(DirectSolrConnection.java:131)
at org.apache.solr.util.TestHarness.update(TestHarness.java:253)
at 
org.apache.solr.util.TestHarness.checkUpdateStatus(TestHarness.java:301)
at org.apache.solr.util.TestHarness.validateUpdate(TestHarness.java:271)
at org.apache.solr.SolrTestCaseJ4.checkUpdateU(SolrTestCaseJ4.java:372)
at org.apache.solr.SolrTestCaseJ4.assertU(SolrTestCaseJ4.java:351)
at org.apache.solr.SolrTestCaseJ4.assertU(SolrTestCaseJ4.java:345)
at 
org.apache.solr.search.TestRecovery.testTruncatedLog(TestRecovery.java:603)
at 
org.apache.lucene.util.LuceneTestCase$3$1.evaluate(LuceneTestCase.java:529)


FAILED:  org.apache.solr.search.TestRecovery.testLogReplay

Error Message:
null

Stack Trace:
junit.framework.AssertionFailedError
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:165)
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57)
at org.apache.solr.update.TransactionLog.init(TransactionLog.java:159)
at org.apache.solr.update.TransactionLog.init(TransactionLog.java:132)
at org.apache.solr.update.UpdateLog.ensureLog(UpdateLog.java:567)
at org.apache.solr.update.UpdateLog.deleteByQuery(UpdateLog.java:288)
at 
org.apache.solr.update.DirectUpdateHandler2.deleteByQuery(DirectUpdateHandler2.java:283)
at 
org.apache.solr.update.processor.RunUpdateProcessor.processDelete(RunUpdateProcessorFactory.java:67)
at 
org.apache.solr.update.processor.UpdateRequestProcessor.processDelete(UpdateRequestProcessor.java:57)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalDelete(DistributedUpdateProcessor.java:318)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.processDeleteByQuery(DistributedUpdateProcessor.java:617)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.processDelete(DistributedUpdateProcessor.java:431)
at org.apache.solr.handler.XMLLoader.processDelete(XMLLoader.java:239)
at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:170)
at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:78)
at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:59)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1515)
at 
org.apache.solr.servlet.DirectSolrConnection.request(DirectSolrConnection.java:131)
at org.apache.solr.util.TestHarness.update(TestHarness.java:253)
at 
org.apache.solr.util.TestHarness.checkUpdateStatus(TestHarness.java:301)
at org.apache.solr.util.TestHarness.validateUpdate(TestHarness.java:271)
at org.apache.solr.SolrTestCaseJ4.checkUpdateU(SolrTestCaseJ4.java:372)
at org.apache.solr.SolrTestCaseJ4.assertU(SolrTestCaseJ4.java:351)
at org.apache.solr.SolrTestCaseJ4.assertU(SolrTestCaseJ4.java:345)
at 

[jira] [Updated] (SOLR-2202) Money FieldType

2012-01-26 Thread Updated

 [ 
https://issues.apache.org/jira/browse/SOLR-2202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-2202:
--

Description: 
Provides support for monetary values to Solr/Lucene with query-time currency 
conversion. The following features are supported:

- Point queries
- Range queries
- Sorting
- Currency parsing by either currency code or symbol.
- Symmetric & Asymmetric exchange rates. (Asymmetric exchange rates are useful 
if there are fees associated with exchanging the currency.)

At indexing time, money fields can be indexed in a native currency. For 
example, if a product on an e-commerce site is listed in Euros, indexing the 
price field as 1000,EUR will index it appropriately. By altering the 
currency.xml file, the sorting and querying against Solr can take into account 
fluctuations in currency exchange rates without having to re-index the 
documents.

The new money field type is a polyfield which indexes two fields, one which 
contains the amount of the value and another which contains the currency code 
or symbol. The currency metadata (names, symbols, codes, and exchange rates) 
are expected to be in an xml file which is pointed to by the field type 
declaration in the schema.xml.

The current patch is factored such that Money utility functions and 
configuration metadata lie in Lucene (see MoneyUtil and CurrencyConfig), while 
the MoneyType and MoneyValueSource lie in Solr. This was meant to mirror the 
work being done on the spatial field types.

This patch will be getting used to power the international search capabilities 
of the search engine at Etsy.

Also see WIKI page: http://wiki.apache.org/solr/MoneyFieldType
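As a hedged sketch of how such a polyfield might be wired up in schema.xml, based purely on the description above (the exact class name, attribute names, and file location are assumptions, not taken from the attached patches):

```xml
<!-- Hypothetical declaration: the field type points at an external
     XML file holding currency names, symbols, codes, and exchange rates. -->
<fieldType name="money" class="solr.MoneyType"
           currencyConfig="currency.xml"/>

<!-- Values would be indexed in a native currency, e.g. "1000,EUR". -->
<field name="price" type="money" indexed="true" stored="true"/>
```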


  was:
Attached please find patches to add support for monetary values to Solr/Lucene 
with query-time currency conversion. The following features are supported:

- Point queries (ex: price:4.00USD)
- Range queries (ex: price:[$5.00 TO $10.00])
- Sorting.
- Currency parsing by either currency code or symbol.
- Symmetric & Asymmetric exchange rates. (Asymmetric exchange rates are useful 
if there are fees associated with exchanging the currency.)

At indexing time, money fields can be indexed in a native currency. For 
example, if a product on an e-commerce site is listed in Euros, indexing the 
price field as 10.00EUR will index it appropriately. By altering the 
currency.xml file, the sorting and querying against Solr can take into account 
fluctuations in currency exchange rates without having to re-index the 
documents.

The new money field type is a polyfield which indexes two fields, one which 
contains the amount of the value and another which contains the currency code 
or symbol. The currency metadata (names, symbols, codes, and exchange rates) 
are expected to be in an xml file which is pointed to by the field type 
declaration in the schema.xml.

The current patch is factored such that Money utility functions and 
configuration metadata lie in Lucene (see MoneyUtil and CurrencyConfig), while 
the MoneyType and MoneyValueSource lie in Solr. This was meant to mirror the 
work being done on the spatial field types.

This patch has not yet been deployed to production but will be getting used to 
power the international search capabilities of the search engine at Etsy.





Updated description, as it was a bit outdated

 Money FieldType
 ---

 Key: SOLR-2202
 URL: https://issues.apache.org/jira/browse/SOLR-2202
 Project: Solr
  Issue Type: New Feature
  Components: Schema and Analysis
Affects Versions: 1.5
Reporter: Greg Fodor
Assignee: Jan Høydahl
 Fix For: 3.6, 4.0

 Attachments: SOLR-2022-solr-3.patch, SOLR-2202-lucene-1.patch, 
 SOLR-2202-solr-1.patch, SOLR-2202-solr-2.patch, SOLR-2202-solr-4.patch, 
 SOLR-2202-solr-5.patch, SOLR-2202-solr-6.patch, SOLR-2202-solr-7.patch, 
 SOLR-2202-solr-8.patch, SOLR-2202-solr-9.patch, SOLR-2202.patch, 
 SOLR-2202.patch, SOLR-2202.patch


 Provides support for monetary values to Solr/Lucene with query-time currency 
 conversion. The following features are supported:
 - Point queries
 - Range queries
 - Sorting
 - Currency parsing by either currency code or symbol.
 - Symmetric & Asymmetric exchange rates. (Asymmetric exchange rates are 
 useful if there are fees associated with exchanging the currency.)
 At indexing time, money fields can be indexed in a native currency. For 
 example, if a product on an e-commerce site is listed in Euros, indexing the 
 price field as 1000,EUR will index it appropriately. By altering the 
 currency.xml file, the sorting and querying against Solr can take into 
 account fluctuations in currency exchange rates without having to re-index 
 the documents.
 The new money field type is a polyfield which indexes two fields, one which 
 contains the amount of the value and another which contains 

[JENKINS] Lucene-Solr-tests-only-trunk - Build # 12263 - Still Failing

2012-01-26 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/12263/

1 test failed.
FAILED:  
junit.framework.TestSuite.org.apache.solr.cloud.FullSolrCloudDistribCmdsTest

Error Message:
ERROR: SolrIndexSearcher opens=0 closes=-3

Stack Trace:
junit.framework.AssertionFailedError: ERROR: SolrIndexSearcher opens=0 closes=-3
at 
org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:147)
at 
org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:76)




Build Log (for compile errors):
[...truncated 8244 lines...]






[JENKINS] Lucene-Solr-tests-only-trunk-java7 - Build # 1615 - Failure

2012-01-26 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk-java7/1615/

1 test failed.
REGRESSION:  org.apache.solr.cloud.FullSolrCloudTest.testDistribSearch

Error Message:
java.net.SocketTimeoutException: Read timed out

Stack Trace:
org.apache.solr.client.solrj.SolrServerException: 
java.net.SocketTimeoutException: Read timed out
at 
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:481)
at 
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:251)
at 
org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:104)
at 
org.apache.solr.cloud.FullSolrCloudTest.index_specific(FullSolrCloudTest.java:446)
at 
org.apache.solr.cloud.FullSolrCloudTest.brindDownShardIndexSomeDocsAndRecover(FullSolrCloudTest.java:673)
at 
org.apache.solr.cloud.FullSolrCloudTest.doTest(FullSolrCloudTest.java:498)
at 
org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:663)
at 
org.apache.lucene.util.LuceneTestCase$3$1.evaluate(LuceneTestCase.java:529)
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:165)
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:150)
at java.net.SocketInputStream.read(SocketInputStream.java:121)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
at 
org.apache.commons.httpclient.HttpParser.readRawLine(HttpParser.java:78)
at 
org.apache.commons.httpclient.HttpParser.readLine(HttpParser.java:106)
at 
org.apache.commons.httpclient.HttpConnection.readLine(HttpConnection.java:1116)
at 
org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionAdapter.readLine(MultiThreadedHttpConnectionManager.java:1413)
at 
org.apache.commons.httpclient.HttpMethodBase.readStatusLine(HttpMethodBase.java:1973)
at 
org.apache.commons.httpclient.HttpMethodBase.readResponse(HttpMethodBase.java:1735)
at 
org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1098)
at 
org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:398)
at 
org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
at 
org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
at 
org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
at 
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:425)




Build Log (for compile errors):
[...truncated 11325 lines...]






[jira] [Commented] (SOLR-1093) A RequestHandler to run multiple queries in a batch

2012-01-26 Thread Mikhail Khludnev (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193750#comment-13193750
 ] 

Mikhail Khludnev commented on SOLR-1093:


1. Is there a way to dispatch the separate queries on the web container's threads?
2. Otherwise it requires a separate thread pool, which makes operations support 
more complicated and less predictable. I suppose the web container admin wisely 
configures the number of threads and the JVM heap size; this feature would then 
unexpectedly blow up the number of threads, which can lead to failures. 

And even if item 1 is possible, there is a chance of saturating the web container 
thread pool with multi-queries, which will be blocked by their sub-queries; a 
saturated thread pool in turn blocks the sub-queries from making progress. 

I propose implementing this feature on the client side, in SolrJ. That also 
allows distributing load evenly across a cluster via 
http://wiki.apache.org/solr/LBHttpSolrServer underneath, instead of overloading 
a single node with such a multi-query.
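As a rough illustration of the client-side alternative described above, independent queries can be fanned out from a small executor owned by the client. The query method below is a stand-in for a real SolrJ call (something like server.query(new SolrQuery(q))), so this is a sketch of the dispatch pattern only:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class MultiQueryClient {

    // Stand-in for a real SolrJ request; it just echoes the query
    // string so the sketch is runnable without a Solr server.
    static String query(String q) {
        return "response(" + q + ")";
    }

    public static void main(String[] args) throws Exception {
        List<String> queries = Arrays.asList("a:b", "a:c");
        // The pool is owned by the client, not the web container,
        // so the server-side thread budget is left untouched.
        ExecutorService pool = Executors.newFixedThreadPool(queries.size());
        List<Future<String>> futures = new ArrayList<>();
        for (String q : queries) {
            futures.add(pool.submit(() -> query(q)));
        }
        // Collect the responses in submission order, as one batch.
        for (Future<String> f : futures) {
            System.out.println(f.get());
        }
        pool.shutdown();
    }
}
```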

 A RequestHandler to run multiple queries in a batch
 ---

 Key: SOLR-1093
 URL: https://issues.apache.org/jira/browse/SOLR-1093
 Project: Solr
  Issue Type: New Feature
  Components: search
Reporter: Noble Paul
Assignee: Simon Willnauer
 Fix For: 3.6, 4.0


 It is a common requirement that a single page needs to fire multiple 
 queries, and often these queries are independent of each other. A handler 
 which can take in multiple queries, run them in parallel, and send the 
 response back as one big chunk would be useful.
 Let us say the handler is MultiRequestHandler
 {code}
 <requestHandler name="/multi" class="solr.MultiRequestHandler"/>
 {code}
 h2.Query Syntax
 The request must specify the number of queries as count=n
 Each request parameter must be prefixed with a number which denotes the query 
 index. Optionally, it may also specify the handler name.
 Example:
 {code}
 /multi?count=2&1.handler=/select&1.q=a:b&2.handler=/select&2.q=a:c
 {code}
 The default handler can be '/select', so the equivalent is
 {code} 
 /multi?count=2&1.q=a:b&2.q=a:c
 {code}
 h2.The response
 The response will be a List<NamedList> where each NamedList will be a 
 response to a query. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira






[JENKINS] Lucene-Solr-tests-only-3.x - Build # 12280 - Failure

2012-01-26 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-3.x/12280/

1 test failed.
REGRESSION:  
org.apache.lucene.analysis.phonetic.TestBeiderMorseFilter.testRandom

Error Message:
Java heap space

Stack Trace:
java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2894)
at 
java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:117)
at 
java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:407)
at java.lang.StringBuilder.append(StringBuilder.java:136)
at org.apache.commons.codec.language.bm.Rule$Phoneme.join(Rule.java:130)
at 
org.apache.commons.codec.language.bm.PhoneticEngine$PhonemeBuilder.apply(PhoneticEngine.java:112)
at 
org.apache.commons.codec.language.bm.PhoneticEngine$RulesApplication.invoke(PhoneticEngine.java:211)
at 
org.apache.commons.codec.language.bm.PhoneticEngine.encode(PhoneticEngine.java:462)
at 
org.apache.commons.codec.language.bm.PhoneticEngine.encode(PhoneticEngine.java:375)
at 
org.apache.lucene.analysis.phonetic.BeiderMorseFilter.incrementToken(BeiderMorseFilter.java:89)
at 
org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:333)
at 
org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:295)
at 
org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:252)
at 
org.apache.lucene.analysis.phonetic.TestBeiderMorseFilter.testRandom(TestBeiderMorseFilter.java:91)
at 
org.apache.lucene.util.LuceneTestCase$2$1.evaluate(LuceneTestCase.java:432)
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:147)
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:50)




Build Log (for compile errors):
[...truncated 10890 lines...]






[JENKINS] Lucene-Solr-tests-only-3.x-java7 - Build # 1642 - Failure

2012-01-26 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-3.x-java7/1642/

1 test failed.
REGRESSION:  
org.apache.lucene.analysis.phonetic.TestBeiderMorseFilter.testRandom

Error Message:
java.lang.AssertionError: Some threads threw uncaught exceptions!

Stack Trace:
java.lang.RuntimeException: java.lang.AssertionError: Some threads threw 
uncaught exceptions!
at 
org.apache.lucene.util.LuceneTestCase.tearDown(LuceneTestCase.java:571)
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:147)
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:50)
at 
org.apache.lucene.util.LuceneTestCase.checkUncaughtExceptionsAfter(LuceneTestCase.java:599)
at 
org.apache.lucene.util.LuceneTestCase.tearDown(LuceneTestCase.java:543)




Build Log (for compile errors):
[...truncated 10920 lines...]






[JENKINS] Lucene-Solr-tests-only-trunk - Build # 12264 - Still Failing

2012-01-26 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/12264/

1 test failed.
REGRESSION:  org.apache.solr.cloud.FullSolrCloudTest.testDistribSearch

Error Message:
java.net.SocketTimeoutException: Read timed out

Stack Trace:
org.apache.solr.client.solrj.SolrServerException: 
java.net.SocketTimeoutException: Read timed out
at 
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:481)
at 
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:251)
at 
org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:104)
at 
org.apache.solr.cloud.FullSolrCloudTest.index_specific(FullSolrCloudTest.java:446)
at 
org.apache.solr.cloud.FullSolrCloudTest.brindDownShardIndexSomeDocsAndRecover(FullSolrCloudTest.java:673)
at 
org.apache.solr.cloud.FullSolrCloudTest.doTest(FullSolrCloudTest.java:498)
at 
org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:663)
at 
org.apache.lucene.util.LuceneTestCase$3$1.evaluate(LuceneTestCase.java:529)
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:165)
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:146)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
at 
org.apache.commons.httpclient.HttpParser.readRawLine(HttpParser.java:78)
at 
org.apache.commons.httpclient.HttpParser.readLine(HttpParser.java:106)
at 
org.apache.commons.httpclient.HttpConnection.readLine(HttpConnection.java:1116)
at 
org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionAdapter.readLine(MultiThreadedHttpConnectionManager.java:1413)
at 
org.apache.commons.httpclient.HttpMethodBase.readStatusLine(HttpMethodBase.java:1973)
at 
org.apache.commons.httpclient.HttpMethodBase.readResponse(HttpMethodBase.java:1735)
at 
org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1098)
at 
org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:398)
at 
org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
at 
org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
at 
org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
at 
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:425)




Build Log (for compile errors):
[...truncated 8321 lines...]






[JENKINS] Lucene-Solr-tests-only-trunk-java7 - Build # 1616 - Still Failing

2012-01-26 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk-java7/1616/

8 tests failed.
REGRESSION:  
org.apache.solr.cloud.FullSolrCloudDistribCmdsTest.testDistribSearch

Error Message:
null  java.lang.AssertionError  at 
org.apache.solr.update.TransactionLog.init(TransactionLog.java:159)  at 
org.apache.solr.update.TransactionLog.init(TransactionLog.java:132)  at 
org.apache.solr.update.UpdateLog.ensureLog(UpdateLog.java:567)  at 
org.apache.solr.update.UpdateLog.add(UpdateLog.java:237)  at 
org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:184)
  at 
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:56)
  at 
org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:53)
  at 
org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:313)
  at 
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:410)
  at 
org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:217)
  at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:134)  at 
org.apache.solr.handler.XMLLoader.load(XMLLoader.java:78)  at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:59)
  at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
  at org.apache.solr.core.SolrCore.execute(SolrCore.java:1515)  at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:339) 
 at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:234)
  at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
  at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)  
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)  at 
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)  at 
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)  at 
org.mortbay.jetty.Server.handle(Server.java:326)  at 
org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)  at 
org.mortbay.jetty.Htt  null  java.lang.AssertionError  at 
org.apache.solr.update.TransactionLog.init(TransactionLog.java:159)  at 
org.apache.solr.update.TransactionLog.init(TransactionLog.java:132)  at 
org.apache.solr.update.UpdateLog.ensureLog(UpdateLog.java:567)  at 
org.apache.solr.update.UpdateLog.add(UpdateLog.java:237)  at 
org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:184)
  at 
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:56)
  at 
org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:53)
  at 
org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:313)
  at 
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:410)
  at 
org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:217)
  at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:134)  at 
org.apache.solr.handler.XMLLoader.load(XMLLoader.java:78)  at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:59)
  at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
  at org.apache.solr.core.SolrCore.execute(SolrCore.java:1515)  at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:339) 
 at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:234)
  at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
  at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)  
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)  at 
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)  at 
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)  at 
org.mortbay.jetty.Server.handle(Server.java:326)  at 
org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)  at 
org.mortbay.jetty.Htt  request: 
http://localhost:12672/solr/collection1/update?wt=javabin&version=2

Stack Trace:
org.apache.solr.common.SolrException: null  java.lang.AssertionErrorat 
org.apache.solr.update.TransactionLog.init(TransactionLog.java:159)at 
org.apache.solr.update.TransactionLog.init(TransactionLog.java:132)at 
org.apache.solr.update.UpdateLog.ensureLog(UpdateLog.java:567)   at 
org.apache.solr.update.UpdateLog.add(UpdateLog.java:237) at 
org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:184)
at 

[jira] [Commented] (SOLR-2368) Improve extended dismax (edismax) parser

2012-01-26 Thread Okke Klein (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193772#comment-13193772
 ] 

Okke Klein commented on SOLR-2368:
--

Was the feature {quote}advanced stopword handling... stopwords are not required 
in the mandatory part of the query but are still used (if indexed) in the 
proximity boosting part. If a query consists of all stopwords (e.g. to be or 
not to be) then all will be required.{quote} from 
https://issues.apache.org/jira/browse/SOLR-1553 ever implemented? If not, can 
this feature be added?

 Improve extended dismax (edismax) parser
 

 Key: SOLR-2368
 URL: https://issues.apache.org/jira/browse/SOLR-2368
 Project: Solr
  Issue Type: Improvement
  Components: search
Reporter: Yonik Seeley
  Labels: QueryParser

 This is an umbrella issue to track further improvements for the eDismax parser.
 The goal is to be able to deprecate and remove the old dismax once edismax 
 satisfies all use cases of dismax.







[jira] [Commented] (LUCENE-3714) add suggester that uses shortest path/wFST instead of buckets

2012-01-26 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193804#comment-13193804
 ] 

Robert Muir commented on LUCENE-3714:
-

That's the exact-match case of the benchmark, where it benchmarks the entire word 
being typed in completely?

I didn't look at it because it's not very interesting (TSTLookup looks 
super-fast here too).

I think this is the interesting part:
{noformat}
[junit] -- prefixes: 2-4, num: 7, onlyMorePopular: true
[junit] JaspellLookup   queries: 50001, time[ms]: 363 [+- 16.75], ~qps: 138
[junit] TSTLookup   queries: 50001, time[ms]: 1112 [+- 22.57], ~qps: 45
[junit] FSTCompletionLookup queries: 50001, time[ms]: 140 [+- 3.16], ~qps: 
356
[junit] WFSTCompletionLookup queries: 50001, time[ms]: 242 [+- 4.01], ~qps: 
207
{noformat}

We could of course toggle 'num' etc. and see how this comes out. Anyway, I just 
wanted to make sure the performance was 'in the ballpark' and competitive with 
the other suggesters; I think for a lot of users that's all that matters, and 
whether it's 400 QPS or 200 QPS probably doesn't matter so much... so I didn't 
tweak any further.


 add suggester that uses shortest path/wFST instead of buckets
 -

 Key: LUCENE-3714
 URL: https://issues.apache.org/jira/browse/LUCENE-3714
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/spellchecker
Reporter: Robert Muir
 Attachments: LUCENE-3714.patch, LUCENE-3714.patch, LUCENE-3714.patch, 
 TestMe.java, out.png


 Currently the FST suggester (really an FSA) quantizes weights into buckets 
 (e.g. a single byte) and puts them in front of the word.
 This makes it fast, but you lose granularity in your suggestions.
 Lately the question was raised: if you build Lucene's FST with 
 PositiveIntOutputs, does it behave the same as a tropical-semiring wFST?
 In other words, after completing the word, we instead traverse min(output) at 
 each node to find the 'shortest path' to the best suggestion (the one with 
 the highest score).
 This means we wouldn't need to quantize weights at all, and it might make 
 some operations (e.g. adding fuzzy matching) a lot easier.
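The min(output) walk described in the issue can be sketched over a toy weighted trie (a hypothetical stand-in for Lucene's FST, not its actual API; lower cost = better suggestion). Here each node caches the cheapest cost of any word below it, playing the role of the shared outputs, so a greedy walk along the cheapest arc is exact:

```java
import java.util.*;

// Toy weighted trie; "cost" is lower-is-better (e.g. inverted popularity).
// Sketch of the shortest-path idea only -- not Lucene's FST/PositiveIntOutputs.
class ToyTrie {
  final Map<Character, ToyTrie> children = new TreeMap<>();
  int wordCost = Integer.MAX_VALUE;   // cost if a word ends at this node
  int minCost = Integer.MAX_VALUE;    // min cost of any word at/below this node

  void add(String word, int cost) {
    ToyTrie node = this;
    node.minCost = Math.min(node.minCost, cost);
    for (char c : word.toCharArray()) {
      node = node.children.computeIfAbsent(c, k -> new ToyTrie());
      node.minCost = Math.min(node.minCost, cost);
    }
    node.wordCost = Math.min(node.wordCost, cost);
  }

  // Greedy "shortest path": after the prefix, always follow the arc whose
  // subtree holds the globally cheapest word. Exact because minCost is exact.
  String best(String prefix) {
    ToyTrie node = this;
    for (char c : prefix.toCharArray()) {
      node = node.children.get(c);
      if (node == null) return null;      // prefix not in the trie
    }
    StringBuilder sb = new StringBuilder(prefix);
    while (true) {
      if (node.wordCost == node.minCost) return sb.toString(); // cheapest word ends here
      ToyTrie next = null;
      char nextCh = 0;
      for (Map.Entry<Character, ToyTrie> e : node.children.entrySet()) {
        if (next == null || e.getValue().minCost < next.minCost) {
          next = e.getValue();
          nextCh = e.getKey();
        }
      }
      sb.append(nextCh);
      node = next;
    }
  }

  public static void main(String[] args) {
    ToyTrie t = new ToyTrie();
    t.add("foo", 5); t.add("food", 1); t.add("foobar", 3);
    System.out.println(t.best("fo"));  // food (cost 1 beats foo=5, foobar=3)
  }
}
```

With quantized buckets, foo/food/foobar could all land in the same bucket and become indistinguishable; the exact-cost walk keeps them apart.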




[JENKINS] Lucene-Solr-tests-only-trunk - Build # 12265 - Still Failing

2012-01-26 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/12265/

2 tests failed.
FAILED:  
junit.framework.TestSuite.org.apache.solr.cloud.FullSolrCloudDistribCmdsTest

Error Message:
ERROR: SolrIndexSearcher opens=0 closes=-2

Stack Trace:
junit.framework.AssertionFailedError: ERROR: SolrIndexSearcher opens=0 closes=-2
at 
org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:147)
at 
org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:76)


FAILED:  org.apache.solr.cloud.FullSolrCloudTest.testDistribSearch

Error Message:
java.net.SocketTimeoutException: Read timed out

Stack Trace:
org.apache.solr.client.solrj.SolrServerException: 
java.net.SocketTimeoutException: Read timed out
at 
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:481)
at 
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:251)
at 
org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:104)
at 
org.apache.solr.cloud.FullSolrCloudTest.index_specific(FullSolrCloudTest.java:446)
at 
org.apache.solr.cloud.FullSolrCloudTest.brindDownShardIndexSomeDocsAndRecover(FullSolrCloudTest.java:673)
at 
org.apache.solr.cloud.FullSolrCloudTest.doTest(FullSolrCloudTest.java:498)
at 
org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:663)
at 
org.apache.lucene.util.LuceneTestCase$3$1.evaluate(LuceneTestCase.java:529)
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:165)
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:146)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
at 
org.apache.commons.httpclient.HttpParser.readRawLine(HttpParser.java:78)
at 
org.apache.commons.httpclient.HttpParser.readLine(HttpParser.java:106)
at 
org.apache.commons.httpclient.HttpConnection.readLine(HttpConnection.java:1116)
at 
org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionAdapter.readLine(MultiThreadedHttpConnectionManager.java:1413)
at 
org.apache.commons.httpclient.HttpMethodBase.readStatusLine(HttpMethodBase.java:1973)
at 
org.apache.commons.httpclient.HttpMethodBase.readResponse(HttpMethodBase.java:1735)
at 
org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1098)
at 
org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:398)
at 
org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
at 
org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
at 
org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
at 
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:425)




Build Log (for compile errors):
[...truncated 8932 lines...]






[jira] [Commented] (LUCENE-3714) add suggester that uses shortest path/wFST instead of buckets

2012-01-26 Thread Michael McCandless (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13193832#comment-13193832
 ] 

Michael McCandless commented on LUCENE-3714:


Awesome!  I haven't looked at the patch yet... but the comparison is also 
apples/oranges, right?

Because WFSTCompletionLookup does no quantizing (bucketing), ie, it returns the 
true topN suggestions, so it has to do more work to differentiate hits that 
FSTCompletionLookup considers equal.

I guess we could pre-quantize the weights into the same buckets that 
FSTCompletionLookup will use, when adding to WFSTCompletionLookup... then the 
results are comparable.

But I guess we don't need to do this since the results are good enough...
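The pre-quantizing Mike mentions could be sketched as a simple linear quantizer (a generic sketch under assumed semantics; the actual bucketing code in FSTCompletionLookup may differ):

```java
// Quantize an arbitrary weight into `buckets` levels (e.g. 256 for a single
// byte). Distinct weights that land in the same bucket become ties, which is
// exactly the granularity loss the exact-score wFST approach avoids.
public class Quantize {
  static int bucket(long weight, long minWeight, long maxWeight, int buckets) {
    if (maxWeight == minWeight) return 0;                  // degenerate range
    double norm = (double) (weight - minWeight) / (maxWeight - minWeight);
    return (int) Math.min(buckets - 1, Math.floor(norm * buckets));
  }

  public static void main(String[] args) {
    // With 8 buckets over [0, 1000], weights 10 and 100 collapse into bucket 0.
    System.out.println(bucket(10, 0, 1000, 8));   // 0
    System.out.println(bucket(100, 0, 1000, 8));  // 0
    System.out.println(bucket(999, 0, 1000, 8));  // 7
  }
}
```

Feeding `bucket(weight, ...)` instead of `weight` into the exact suggester would make the two benchmarks comparable, as suggested above.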

 add suggester that uses shortest path/wFST instead of buckets
 -

 Key: LUCENE-3714
 URL: https://issues.apache.org/jira/browse/LUCENE-3714
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/spellchecker
Reporter: Robert Muir
 Attachments: LUCENE-3714.patch, LUCENE-3714.patch, LUCENE-3714.patch, 
 TestMe.java, out.png


 Currently the FST suggester (really an FSA) quantizes weights into buckets 
 (e.g. a single byte) and puts them in front of the word.
 This makes it fast, but you lose granularity in your suggestions.
 Lately the question was raised: if you build Lucene's FST with 
 PositiveIntOutputs, does it behave the same as a tropical-semiring wFST?
 In other words, after completing the word, we instead traverse min(output) at 
 each node to find the 'shortest path' to the best suggestion (the one with 
 the highest score).
 This means we wouldn't need to quantize weights at all, and it might make 
 some operations (e.g. adding fuzzy matching) a lot easier.




[jira] [Commented] (LUCENE-3714) add suggester that uses shortest path/wFST instead of buckets

2012-01-26 Thread Dawid Weiss (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13193838#comment-13193838
 ] 

Dawid Weiss commented on LUCENE-3714:
-

No, no, Mike -- I'm all for going with exact, fine-grained scores. That way even 
if WFSTCompletionLookup is slower, it's still a real-life use case, and 
FSTCompletionLookup (even if faster) will have a (*) saying it's not a complete 
solution.

I like what Robert did (I still haven't looked at the patch), but I was just 
wondering why the hell there's such a difference for long prefixes. This is not 
a frequent case in real life, just curiosity.

 add suggester that uses shortest path/wFST instead of buckets
 -

 Key: LUCENE-3714
 URL: https://issues.apache.org/jira/browse/LUCENE-3714
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/spellchecker
Reporter: Robert Muir
 Attachments: LUCENE-3714.patch, LUCENE-3714.patch, LUCENE-3714.patch, 
 TestMe.java, out.png


 Currently the FST suggester (really an FSA) quantizes weights into buckets 
 (e.g. a single byte) and puts them in front of the word.
 This makes it fast, but you lose granularity in your suggestions.
 Lately the question was raised: if you build Lucene's FST with 
 PositiveIntOutputs, does it behave the same as a tropical-semiring wFST?
 In other words, after completing the word, we instead traverse min(output) at 
 each node to find the 'shortest path' to the best suggestion (the one with 
 the highest score).
 This means we wouldn't need to quantize weights at all, and it might make 
 some operations (e.g. adding fuzzy matching) a lot easier.




[jira] [Commented] (LUCENE-3714) add suggester that uses shortest path/wFST instead of buckets

2012-01-26 Thread Michael McCandless (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13193845#comment-13193845
 ] 

Michael McCandless commented on LUCENE-3714:


Sorry, I'm not proposing we commit pre-quantizing the scores... I'm
just saying we'd learn more from the perf test that way.  Ie, is WFST
slower because 1) it's doing precise scores, or 2) the topN algo is
slowish.


 add suggester that uses shortest path/wFST instead of buckets
 -

 Key: LUCENE-3714
 URL: https://issues.apache.org/jira/browse/LUCENE-3714
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/spellchecker
Reporter: Robert Muir
 Attachments: LUCENE-3714.patch, LUCENE-3714.patch, LUCENE-3714.patch, 
 TestMe.java, out.png


 Currently the FST suggester (really an FSA) quantizes weights into buckets 
 (e.g. a single byte) and puts them in front of the word.
 This makes it fast, but you lose granularity in your suggestions.
 Lately the question was raised: if you build Lucene's FST with 
 PositiveIntOutputs, does it behave the same as a tropical-semiring wFST?
 In other words, after completing the word, we instead traverse min(output) at 
 each node to find the 'shortest path' to the best suggestion (the one with 
 the highest score).
 This means we wouldn't need to quantize weights at all, and it might make 
 some operations (e.g. adding fuzzy matching) a lot easier.




RE: svn commit: r1236178 - /lucene/dev/trunk/solr/build.xml

2012-01-26 Thread Uwe Schindler
These directories are part of the SVN checkout; they should always be 
available! What's wrong here? Also be sure your script does not remove them!

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


 -Original Message-
 From: dwe...@apache.org [mailto:dwe...@apache.org]
 Sent: Thursday, January 26, 2012 2:57 PM
 To: comm...@lucene.apache.org
 Subject: svn commit: r1236178 - /lucene/dev/trunk/solr/build.xml
 
 Author: dweiss
 Date: Thu Jan 26 13:56:50 2012
 New Revision: 1236178
 
 URL: http://svn.apache.org/viewvc?rev=1236178view=rev
 Log:
 Fixed 'ant clean' failing if a directory didn't exist. Replaced with an 
 include
 pattern.
 
 Modified:
 lucene/dev/trunk/solr/build.xml
 
 Modified: lucene/dev/trunk/solr/build.xml
 URL:
 http://svn.apache.org/viewvc/lucene/dev/trunk/solr/build.xml?rev=1236178r
 1=1236177r2=1236178view=diff
 
 ==
  --- lucene/dev/trunk/solr/build.xml (original)
  +++ lucene/dev/trunk/solr/build.xml Thu Jan 26 13:56:50 2012
  @@ -184,9 +184,12 @@
       <delete dir="package" />
       <delete dir="example/solr/lib" />
       <delete includeemptydirs="true">
  -      <fileset dir="example" includes="**/data/**/*" excludes="**/.gitignore" />
  -      <fileset dir="example/webapps" includes="**/*" excludes="**/.gitignore" />
  -      <fileset dir="example/work" includes="**/*" excludes="**/.gitignore" />
  +      <fileset dir="example">
  +        <include name="example/**/data/**/*" />
  +        <include name="example/webapps/**/*" />
  +        <include name="example/work/**/*" />
  +        <exclude name="**/.gitignore" />
  +      </fileset>
       </delete>
       <echo message="TODO: fix tests to not write files to 'core/src/test-files/solr/data'!" />
       <delete dir="core/src/test-files/solr/data" />






[jira] [Commented] (LUCENE-3714) add suggester that uses shortest path/wFST instead of buckets

2012-01-26 Thread Dawid Weiss (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13193855#comment-13193855
 ] 

Dawid Weiss commented on LUCENE-3714:
-

Looked through the patch. Some comments:

+// nocommit: why does the benchmark test supply duplicates? what 
should we do in this case?

Ignore duplicates. I think the iterator is allowed to return duplicates; I'm 
not sure, but this may even happen when bucketing input -- the same entry with 
a different score (before bucketing) may end up identical after sorting.

+// nocommit: shouldn't we have an option to
+// match exactly even if its a crappy suggestion? we would just look
+// for arc.isFinal here etc...

Yes, this should definitely be an option because if it's an exact match then 
you'll probably want it on top of the suggestions list, no matter what.

You could also add a generator of the bad case that I attached inside 
TestMe.java -- it creates the case where following greedily doesn't yield the 
correct output (a priority queue is required).

I also checked the benchmark and yes, it uses exactMatchFirst promotion. This 
clarifies the speed difference for longer prefixes -- not enough results are 
collected in any of the buckets so no early termination occurs and _all_ 
buckets must be traversed.

I like WFSTCompletionLookup very much, clean and simple.
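The "requires a pq" point can be sketched as a best-first (Dijkstra-style) top-N walk over a toy weighted trie (a hypothetical structure, not the patch's FST code): a purely greedy walk can commit to one branch and miss cheaper words elsewhere, so partial paths sit in a priority queue keyed by a lower bound on their cost and completions pop in exact order.

```java
import java.util.*;

// Exact top-N completion via best-first search over a toy weighted trie.
// minCost is a true lower bound for each subtree, so finished words are
// emitted in exact cost order (cheapest first). Sketch only.
class WTrie {
  final Map<Character, WTrie> kids = new TreeMap<>();
  int wordCost = Integer.MAX_VALUE; // cost of the word ending here, if any
  int minCost = Integer.MAX_VALUE;  // lower bound for any word at/below here

  void add(String w, int cost) {
    WTrie n = this;
    n.minCost = Math.min(n.minCost, cost);
    for (char c : w.toCharArray()) {
      n = n.kids.computeIfAbsent(c, k -> new WTrie());
      n.minCost = Math.min(n.minCost, cost);
    }
    n.wordCost = Math.min(n.wordCost, cost);
  }

  private static final class Path {
    final int bound; final WTrie node; final String s; final boolean done;
    Path(int bound, WTrie node, String s, boolean done) {
      this.bound = bound; this.node = node; this.s = s; this.done = done;
    }
  }

  List<String> topN(String prefix, int n) {
    WTrie node = this;
    for (char c : prefix.toCharArray()) {
      node = node.kids.get(c);
      if (node == null) return Collections.emptyList();
    }
    PriorityQueue<Path> pq =
        new PriorityQueue<>(Comparator.comparingInt((Path p) -> p.bound));
    pq.add(new Path(node.minCost, node, prefix, false));
    List<String> out = new ArrayList<>();
    while (!pq.isEmpty() && out.size() < n) {
      Path p = pq.poll();
      if (p.done) { out.add(p.s); continue; }  // bound is exact: emit the word
      if (p.node.wordCost != Integer.MAX_VALUE)
        pq.add(new Path(p.node.wordCost, p.node, p.s, true));
      for (Map.Entry<Character, WTrie> kid : p.node.kids.entrySet())
        pq.add(new Path(kid.getValue().minCost, kid.getValue(),
                        p.s + kid.getKey(), false));
    }
    return out;
  }

  public static void main(String[] args) {
    WTrie t = new WTrie();
    t.add("car", 4); t.add("cat", 1); t.add("cart", 2);
    System.out.println(t.topN("ca", 3)); // [cat, cart, car]
  }
}
```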

 add suggester that uses shortest path/wFST instead of buckets
 -

 Key: LUCENE-3714
 URL: https://issues.apache.org/jira/browse/LUCENE-3714
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/spellchecker
Reporter: Robert Muir
 Attachments: LUCENE-3714.patch, LUCENE-3714.patch, LUCENE-3714.patch, 
 TestMe.java, out.png


 Currently the FST suggester (really an FSA) quantizes weights into buckets 
 (e.g. single byte) and puts them in front of the word.
 This makes it fast, but you lose granularity in your suggestions.
 Lately the question was raised, if you build lucene's FST with 
 positiveintoutputs, does it behave the same as a tropical semiring wFST?
 In other words, after completing the word, we instead traverse min(output) at 
 each node to find the 'shortest path' to the 
 best suggestion (with the highest score).
 This means we wouldnt need to quantize weights at all and it might make some 
 operations (e.g. adding fuzzy matching etc) a lot easier.




RE: svn commit: r1236178 - /lucene/dev/trunk/solr/build.xml

2012-01-26 Thread Uwe Schindler
Also: can this work?

  +      <fileset dir="example">
  +        <include name="example/**/data/**/*" />
  +        <include name="example/webapps/**/*" />
  +        <include name="example/work/**/*" />
  +        <exclude name="**/.gitignore" />
  +      </fileset>

This expects example/example/...
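If the intent is to keep `dir="example"` as the base, the include patterns would presumably need to be relative to it, something like this (an untested sketch, not the committed fix):

```xml
<delete includeemptydirs="true">
  <fileset dir="example">
    <include name="**/data/**/*" />
    <include name="webapps/**/*" />
    <include name="work/**/*" />
    <exclude name="**/.gitignore" />
  </fileset>
</delete>
```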

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


 -Original Message-
 From: Uwe Schindler [mailto:u...@thetaphi.de]
 Sent: Thursday, January 26, 2012 3:11 PM
 To: dev@lucene.apache.org
 Subject: RE: svn commit: r1236178 - /lucene/dev/trunk/solr/build.xml
 
 These directories are part of the SVN checkout, they should always be
 available! What's wrong here? Be also sure to not remove them by your script!
 
 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de
 
 
  -Original Message-
  From: dwe...@apache.org [mailto:dwe...@apache.org]
  Sent: Thursday, January 26, 2012 2:57 PM
  To: comm...@lucene.apache.org
  Subject: svn commit: r1236178 - /lucene/dev/trunk/solr/build.xml
 
  Author: dweiss
  Date: Thu Jan 26 13:56:50 2012
  New Revision: 1236178
 
  URL: http://svn.apache.org/viewvc?rev=1236178view=rev
  Log:
  Fixed 'ant clean' failing if a directory didn't exist. Replaced with
  an include pattern.
 
  Modified:
  lucene/dev/trunk/solr/build.xml
 
  Modified: lucene/dev/trunk/solr/build.xml
  URL:
  http://svn.apache.org/viewvc/lucene/dev/trunk/solr/build.xml?rev=12361
  78r
  1=1236177r2=1236178view=diff
 
 
  ==
  --- lucene/dev/trunk/solr/build.xml (original)
  +++ lucene/dev/trunk/solr/build.xml Thu Jan 26 13:56:50 2012
   @@ -184,9 +184,12 @@
        <delete dir="package" />
        <delete dir="example/solr/lib" />
        <delete includeemptydirs="true">
   -      <fileset dir="example" includes="**/data/**/*" excludes="**/.gitignore" />
   -      <fileset dir="example/webapps" includes="**/*" excludes="**/.gitignore" />
   -      <fileset dir="example/work" includes="**/*" excludes="**/.gitignore" />
   +      <fileset dir="example">
   +        <include name="example/**/data/**/*" />
   +        <include name="example/webapps/**/*" />
   +        <include name="example/work/**/*" />
   +        <exclude name="**/.gitignore" />
   +      </fileset>
        </delete>
        <echo message="TODO: fix tests to not write files to 'core/src/test-files/solr/data'!" />
        <delete dir="core/src/test-files/solr/data" />
 
 
 
 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
 commands, e-mail: dev-h...@lucene.apache.org





Re: svn commit: r1236178 - /lucene/dev/trunk/solr/build.xml

2012-01-26 Thread Dawid Weiss
Ah ok, I see now. I use git-svn so empty folders are removed for
me by default.

 This expects example/example/...

Right, should be without the prefix. I'll revert and see how it works
with svn first then.

Dawid

On Thu, Jan 26, 2012 at 3:11 PM, Uwe Schindler u...@thetaphi.de wrote:
 Also: can this work?

  +      <fileset dir="example">
  +        <include name="example/**/data/**/*" />
  +        <include name="example/webapps/**/*" />
  +        <include name="example/work/**/*" />
  +        <exclude name="**/.gitignore" />
  +      </fileset>

 This expects example/example/...

 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de


 -Original Message-
 From: Uwe Schindler [mailto:u...@thetaphi.de]
 Sent: Thursday, January 26, 2012 3:11 PM
 To: dev@lucene.apache.org
 Subject: RE: svn commit: r1236178 - /lucene/dev/trunk/solr/build.xml

 These directories are part of the SVN checkout, they should always be
 available! What's wrong here? Be also sure to not remove them by your script!

 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de


  -Original Message-
  From: dwe...@apache.org [mailto:dwe...@apache.org]
  Sent: Thursday, January 26, 2012 2:57 PM
  To: comm...@lucene.apache.org
  Subject: svn commit: r1236178 - /lucene/dev/trunk/solr/build.xml
 
  Author: dweiss
  Date: Thu Jan 26 13:56:50 2012
  New Revision: 1236178
 
  URL: http://svn.apache.org/viewvc?rev=1236178view=rev
  Log:
  Fixed 'ant clean' failing if a directory didn't exist. Replaced with
  an include pattern.
 
  Modified:
      lucene/dev/trunk/solr/build.xml
 
  Modified: lucene/dev/trunk/solr/build.xml
  URL:
  http://svn.apache.org/viewvc/lucene/dev/trunk/solr/build.xml?rev=12361
  78r
  1=1236177r2=1236178view=diff
 
 
  ==
  --- lucene/dev/trunk/solr/build.xml (original)
  +++ lucene/dev/trunk/solr/build.xml Thu Jan 26 13:56:50 2012
   @@ -184,9 +184,12 @@
        <delete dir="package" />
        <delete dir="example/solr/lib" />
        <delete includeemptydirs="true">
   -      <fileset dir="example" includes="**/data/**/*" excludes="**/.gitignore" />
   -      <fileset dir="example/webapps" includes="**/*" excludes="**/.gitignore" />
   -      <fileset dir="example/work" includes="**/*" excludes="**/.gitignore" />
   +      <fileset dir="example">
   +        <include name="example/**/data/**/*" />
   +        <include name="example/webapps/**/*" />
   +        <include name="example/work/**/*" />
   +        <exclude name="**/.gitignore" />
   +      </fileset>
        </delete>
        <echo message="TODO: fix tests to not write files to 'core/src/test-files/solr/data'!" />
        <delete dir="core/src/test-files/solr/data" />



 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
 commands, e-mail: dev-h...@lucene.apache.org


 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org





Re: svn commit: r1236178 - /lucene/dev/trunk/solr/build.xml

2012-01-26 Thread Dawid Weiss
Corrected for svn and working (doesn't delete anything, including the
empty webapps folder). Thanks Uwe!

http://ophelia.cs.put.poznan.pl/~dweiss/download/uwe.jpg

Dawid




Re: svn commit: r1236178 - /lucene/dev/trunk/solr/build.xml

2012-01-26 Thread Robert Muir
On Thu, Jan 26, 2012 at 9:27 AM, Dawid Weiss
dawid.we...@cs.put.poznan.pl wrote:
 Corrected for svn and working (doesn't delete anything, including the
 empty webapps folder). Thanks Uwe!

 http://ophelia.cs.put.poznan.pl/~dweiss/download/uwe.jpg


definitely on point :)


-- 
lucidimagination.com




[jira] [Commented] (LUCENE-3714) add suggester that uses shortest path/wFST instead of buckets

2012-01-26 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13193865#comment-13193865
 ] 

Robert Muir commented on LUCENE-3714:
-

Dawid, thanks for the comments: we can remove that first nocommit and then add 
an option for the second one...

I think as a step to move this forward we have to fix the output encoding to 
not be (Integer.MAX_VALUE - weight).

Seems like the best first step is to generify findMinPairs to <T> and to allow 
an arbitrary Comparator, so that we can muck around with the algebra? I looked 
at this and it seems possible...
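For context, the (Integer.MAX_VALUE - weight) encoding flips higher-is-better weights into lower-is-better costs, so that a min-cost/shortest-path search surfaces the most popular suggestions first. A toy round-trip (hypothetical helper names, just illustrating the encoding a generic Comparator would replace):

```java
// Shortest-path search minimizes cost, but suggester weights are
// higher-is-better, so a weight w is stored as the cost (MAX_VALUE - w).
public class WeightEncoding {
  static long toCost(long weight) { return Integer.MAX_VALUE - weight; }
  static long toWeight(long cost) { return Integer.MAX_VALUE - cost; }

  public static void main(String[] args) {
    long popular = 1000, rare = 3;
    // Lower cost == higher weight, so min-cost paths find popular suggestions.
    System.out.println(toCost(popular) < toCost(rare)); // true
    System.out.println(toWeight(toCost(popular)));      // 1000
  }
}
```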


 add suggester that uses shortest path/wFST instead of buckets
 -

 Key: LUCENE-3714
 URL: https://issues.apache.org/jira/browse/LUCENE-3714
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/spellchecker
Reporter: Robert Muir
 Attachments: LUCENE-3714.patch, LUCENE-3714.patch, LUCENE-3714.patch, 
 TestMe.java, out.png


 Currently the FST suggester (really an FSA) quantizes weights into buckets 
 (e.g. a single byte) and puts them in front of the word.
 This makes it fast, but you lose granularity in your suggestions.
 Lately the question was raised: if you build Lucene's FST with 
 PositiveIntOutputs, does it behave the same as a tropical-semiring wFST?
 In other words, after completing the word, we instead traverse min(output) at 
 each node to find the 'shortest path' to the best suggestion (the one with 
 the highest score).
 This means we wouldn't need to quantize weights at all, and it might make 
 some operations (e.g. adding fuzzy matching) a lot easier.




[jira] [Commented] (LUCENE-3714) add suggester that uses shortest path/wFST instead of buckets

2012-01-26 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13193867#comment-13193867
 ] 

Robert Muir commented on LUCENE-3714:
-

Also, I'm not sure I'm in love with findMinPairs. Maybe we should call it 
shortestPaths?

 add suggester that uses shortest path/wFST instead of buckets
 -

 Key: LUCENE-3714
 URL: https://issues.apache.org/jira/browse/LUCENE-3714
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/spellchecker
Reporter: Robert Muir
 Attachments: LUCENE-3714.patch, LUCENE-3714.patch, LUCENE-3714.patch, 
 TestMe.java, out.png


 Currently the FST suggester (really an FSA) quantizes weights into buckets 
 (e.g. a single byte) and puts them in front of the word.
 This makes it fast, but you lose granularity in your suggestions.
 Lately the question was raised: if you build Lucene's FST with 
 PositiveIntOutputs, does it behave the same as a tropical-semiring wFST?
 In other words, after completing the word, we instead traverse min(output) at 
 each node to find the 'shortest path' to the best suggestion (the one with 
 the highest score).
 This means we wouldn't need to quantize weights at all, and it might make 
 some operations (e.g. adding fuzzy matching) a lot easier.




RE: svn commit: r1236178 - /lucene/dev/trunk/solr/build.xml

2012-01-26 Thread Uwe Schindler
In my opinion, we should rather fix the SVN checkout to not include empty 
folders, by adding more .gitignore files (or whatever) as placeholders?

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


 -Original Message-
 From: dawid.we...@gmail.com [mailto:dawid.we...@gmail.com] On Behalf Of
 Dawid Weiss
 Sent: Thursday, January 26, 2012 3:28 PM
 To: dev@lucene.apache.org
 Subject: Re: svn commit: r1236178 - /lucene/dev/trunk/solr/build.xml
 
 Corrected for svn and working (doesn't delete anything, including the empty
 webapps folder). Thanks Uwe!
 
 http://ophelia.cs.put.poznan.pl/~dweiss/download/uwe.jpg
 
 Dawid
 
 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
 commands, e-mail: dev-h...@lucene.apache.org





RE: svn commit: r1236178 - /lucene/dev/trunk/solr/build.xml

2012-01-26 Thread Uwe Schindler
LOL, Uwe's new profile picture...

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


 -Original Message-
 From: Robert Muir [mailto:rcm...@gmail.com]
 Sent: Thursday, January 26, 2012 3:30 PM
 To: dev@lucene.apache.org
 Subject: Re: svn commit: r1236178 - /lucene/dev/trunk/solr/build.xml
 
 On Thu, Jan 26, 2012 at 9:27 AM, Dawid Weiss dawid.we...@cs.put.poznan.pl
 wrote:
  Corrected for svn and working (doesn't delete anything, including the
  empty webapps folder). Thanks Uwe!
 
  http://ophelia.cs.put.poznan.pl/~dweiss/download/uwe.jpg
 
 
 definitely on point :)
 
 
 --
 lucidimagination.com
 
 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
 commands, e-mail: dev-h...@lucene.apache.org





Re: svn commit: r1236178 - /lucene/dev/trunk/solr/build.xml

2012-01-26 Thread Dawid Weiss
 In my opinion, we should better fix the SVN checkout to not include empty 
 folders by adding more .gitignore files or whatever as placeholder ?

It would make it more explicit that a folder should be there. And
perhaps facilitate any migration should we move to a version control
system that doesn't store folder metadata in the future.

Dawid




[jira] [Commented] (SOLR-1093) A RequestHandler to run multiple queries in a batch

2012-01-26 Thread Noble Paul (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13193890#comment-13193890
 ] 

Noble Paul commented on SOLR-1093:
--

If you use distributed search, Solr uses its own thread pool.

If you implement it on the client side, Java clients can benefit.
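A client-side sketch of such batching with a plain executor (hypothetical Callables standing in for real SolrJ requests; not Solr's internal pool):

```java
import java.util.*;
import java.util.concurrent.*;

// Fire several independent "queries" in parallel from the client and collect
// the responses as one batch, in request order. The Callables here are
// placeholders for real SolrJ request objects.
public class ParallelQueries {
  public static List<String> runAll(List<Callable<String>> queries)
      throws InterruptedException, ExecutionException {
    ExecutorService pool =
        Executors.newFixedThreadPool(Math.min(queries.size(), 8));
    try {
      // invokeAll blocks until all tasks finish and preserves input order.
      List<Future<String>> futures = pool.invokeAll(queries);
      List<String> responses = new ArrayList<>();
      for (Future<String> f : futures) responses.add(f.get());
      return responses;
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    List<Callable<String>> queries = Arrays.asList(
        () -> "response for q=a:b",
        () -> "response for q=a:c");
    System.out.println(runAll(queries)); // [response for q=a:b, response for q=a:c]
  }
}
```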

 A RequestHandler to run multiple queries in a batch
 ---

 Key: SOLR-1093
 URL: https://issues.apache.org/jira/browse/SOLR-1093
 Project: Solr
  Issue Type: New Feature
  Components: search
Reporter: Noble Paul
Assignee: Simon Willnauer
 Fix For: 3.6, 4.0


 It is a common requirement that a single page requires firing multiple 
 queries, in cases where these queries are independent of each other. If there 
 is a handler which can take in multiple queries, run them in parallel and 
 send the response as one big chunk, it would be useful.
 Let us say the handler is MultiRequestHandler
 {code}
 <requestHandler name="/multi" class="solr.MultiRequestHandler"/>
 {code}
 h2.Query Syntax
 The request must specify the number of queries as count=n
 Each request parameter must be prefixed with a number which denotes the query 
 index. Optionally, it may also specify the handler name.
 example
 {code}
 /multi?count=2&1.handler=/select&1.q=a:b&2.handler=/select&2.q=a:c
 {code}
 default handler can be '/select' so the equivalent can be
 {code} 
 /multi?count=2&1.q=a:b&2.q=a:c
 {code}
 h2.The response
 The response will be a List<NamedList> where each NamedList will be a 
 response to a query. 




[JENKINS] Lucene-Solr-tests-only-trunk - Build # 12266 - Still Failing

2012-01-26 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/12266/

All tests passed

Build Log (for compile errors):
[...truncated 15296 lines...]






[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #371: POMs out of sync

2012-01-26 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/371/

No tests ran.

Build Log (for compile errors):
[...truncated 12195 lines...]






Javadoc warnings in Solr Core [was: RE: [JENKINS-MAVEN] Lucene-Solr-Maven-trunk #371: POMs out of sync]

2012-01-26 Thread Steven A Rowe
This failure was triggered by Javadoc warnings in Solr Core:

  [javadoc] Standard Doclet version 1.6.0
  [javadoc] Building tree for all the packages and classes...
  [javadoc] .../solr/core/src/java/org/apache/solr/cloud/AssignShard.java:44: 
warning - @return tag has no arguments.
  [javadoc] .../solr/core/src/java/org/apache/solr/cloud/AssignShard.java:44: 
warning - @param argument slices is not a parameter name.
  [javadoc] 
.../solr/core/src/java/org/apache/solr/cloud/LeaderElector.java:209: warning - 
@param argument SolrCore is not a parameter name.
  [javadoc] 
.../solr/core/src/java/org/apache/solr/cloud/LeaderElector.java:264: warning - 
@param argument shardId is not a parameter name.
  [javadoc] 
.../solr/core/src/java/org/apache/solr/cloud/LeaderElector.java:264: warning - 
@param argument collection is not a parameter name.
  [javadoc] .../solr/core/src/java/org/apache/solr/cloud/ZkController.java:448: 
warning - @return tag has no arguments.
  [javadoc] .../solr/core/src/java/org/apache/solr/cloud/ZkController.java:462: 
warning - @return tag has no arguments.
  [javadoc] .../solr/core/src/java/org/apache/solr/cloud/ZkController.java:147: 
warning - @param argument coreContainer is not a parameter name.
  [javadoc] .../solr/core/src/java/org/apache/solr/cloud/ZkController.java:147: 
warning - @param argument numShards is not a parameter name.
  [javadoc] .../solr/core/src/java/org/apache/solr/cloud/ZkController.java:448: 
warning - @param argument cloudDesc is not a parameter name.
  [javadoc] Generating 
.../solr/build/solr-core/docs/api/org/apache/solr/util/package-summary.html...
  [javadoc] Copying file 
.../solr/core/src/java/org/apache/solr/util/doc-files/min-should-match.html to 
directory .../solr/build/solr-core/docs/api/org/apache/solr/util/doc-files...
  [javadoc] Generating .../solr/build/solr-core/docs/api/serialized-form.html...
  [javadoc] Copying file .../lucene/src/tools/prettify/stylesheet+prettify.css 
to file .../solr/build/solr-core/docs/api/stylesheet+prettify.css...
  [javadoc] Building index for all the packages and classes...
  [javadoc] Building index for all classes...
  [javadoc] Generating .../solr/build/solr-core/docs/api/help-doc.html...
  [javadoc] 10 warnings

BUILD FAILED
.../solr/build.xml:506: The following error occurred while executing this line:
.../solr/common-build.xml:227: The following error occurred while executing 
this line:
.../lucene/common-build.xml:909: Javadocs warnings were found!


 -Original Message-
 From: Apache Jenkins Server [mailto:jenk...@builds.apache.org]
 Sent: Thursday, January 26, 2012 10:36 AM
 To: dev@lucene.apache.org
 Subject: [JENKINS-MAVEN] Lucene-Solr-Maven-trunk #371: POMs out of sync
 
 Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/371/
 
 No tests ran.
 
 Build Log (for compile errors):
 [...truncated 12195 lines...]
 
 
 



Re: Javadoc warnings in Solr Core [was: RE: [JENKINS-MAVEN] Lucene-Solr-Maven-trunk #371: POMs out of sync]

2012-01-26 Thread Mark Miller
Thanks Steve - making corrections.

On Jan 26, 2012, at 10:49 AM, Steven A Rowe wrote:

 This failure was triggered by Javadoc warnings in Solr Core:
 
  [javadoc] Standard Doclet version 1.6.0
  [javadoc] Building tree for all the packages and classes...
  [javadoc] .../solr/core/src/java/org/apache/solr/cloud/AssignShard.java:44: 
 warning - @return tag has no arguments.
  [javadoc] .../solr/core/src/java/org/apache/solr/cloud/AssignShard.java:44: 
 warning - @param argument slices is not a parameter name.
  [javadoc] 
 .../solr/core/src/java/org/apache/solr/cloud/LeaderElector.java:209: warning 
 - @param argument SolrCore is not a parameter name.
  [javadoc] 
 .../solr/core/src/java/org/apache/solr/cloud/LeaderElector.java:264: warning 
 - @param argument shardId is not a parameter name.
  [javadoc] 
 .../solr/core/src/java/org/apache/solr/cloud/LeaderElector.java:264: warning 
 - @param argument collection is not a parameter name.
  [javadoc] 
 .../solr/core/src/java/org/apache/solr/cloud/ZkController.java:448: warning - 
 @return tag has no arguments.
  [javadoc] 
 .../solr/core/src/java/org/apache/solr/cloud/ZkController.java:462: warning - 
 @return tag has no arguments.
  [javadoc] 
 .../solr/core/src/java/org/apache/solr/cloud/ZkController.java:147: warning - 
 @param argument coreContainer is not a parameter name.
  [javadoc] 
 .../solr/core/src/java/org/apache/solr/cloud/ZkController.java:147: warning - 
 @param argument numShards is not a parameter name.
  [javadoc] 
 .../solr/core/src/java/org/apache/solr/cloud/ZkController.java:448: warning - 
 @param argument cloudDesc is not a parameter name.
  [javadoc] Generating 
 .../solr/build/solr-core/docs/api/org/apache/solr/util/package-summary.html...
  [javadoc] Copying file 
 .../solr/core/src/java/org/apache/solr/util/doc-files/min-should-match.html 
 to directory 
 .../solr/build/solr-core/docs/api/org/apache/solr/util/doc-files...
  [javadoc] Generating 
 .../solr/build/solr-core/docs/api/serialized-form.html...
  [javadoc] Copying file .../lucene/src/tools/prettify/stylesheet+prettify.css 
 to file .../solr/build/solr-core/docs/api/stylesheet+prettify.css...
  [javadoc] Building index for all the packages and classes...
  [javadoc] Building index for all classes...
  [javadoc] Generating .../solr/build/solr-core/docs/api/help-doc.html...
  [javadoc] 10 warnings
 
 BUILD FAILED
 .../solr/build.xml:506: The following error occurred while executing this 
 line:
 .../solr/common-build.xml:227: The following error occurred while executing 
 this line:
 .../lucene/common-build.xml:909: Javadocs warnings were found!
 
 
 -Original Message-
 From: Apache Jenkins Server [mailto:jenk...@builds.apache.org]
 Sent: Thursday, January 26, 2012 10:36 AM
 To: dev@lucene.apache.org
 Subject: [JENKINS-MAVEN] Lucene-Solr-Maven-trunk #371: POMs out of sync
 
 Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/371/
 
 No tests ran.
 
 Build Log (for compile errors):
 [...truncated 12195 lines...]
 
 
 
 

- Mark Miller
lucidimagination.com





[jira] [Commented] (LUCENE-3720) OOM in TestBeiderMorseFilter.testRandom

2012-01-26 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193937#comment-13193937
 ] 

Robert Muir commented on LUCENE-3720:
-

I'm still working on a testcase for this.

I think the underlying commons-codec algorithm blows up on some inputs, 
especially those from randomHtmlishString.
Then, given multiple threads, it's easy for it to OOM.

So far the best I have is: (ant test -Dtestcase=TestBeiderMorseFilter 
-Dtestmethod=testOOM 
-Dtests.seed=320e23c1f5dbf9c5:-766b86e7dc5e81df:148a8a4955f89b5e 
-Dargs="-Dfile.encoding=UTF-8" -Dtests.heapsize=64M)

This blows up, so I just have to get it to log the string that blows up I 
think, and we have a start to a testcase.

{code}
  public void testOOM() throws Exception {
int numIter = 10;
int numTokens = 0;
for (int i = 0; i < numIter; i++) {
  String s = _TestUtil.randomHtmlishString(random, 30);
  TokenStream ts = analyzer.tokenStream("bogus", new StringReader(s));
  ts.reset();
  while (ts.incrementToken()) {
numTokens++;
  }
  ts.end();
  ts.close();
}
System.out.println(numTokens);
  }
{code}

 OOM in TestBeiderMorseFilter.testRandom
 ---

 Key: LUCENE-3720
 URL: https://issues.apache.org/jira/browse/LUCENE-3720
 Project: Lucene - Java
  Issue Type: Test
  Components: modules/analysis
Affects Versions: 3.6, 4.0
Reporter: Robert Muir

 This has been OOM'ing a lot... we should see why, its likely a real bug.
 ant test -Dtestcase=TestBeiderMorseFilter -Dtestmethod=testRandom 
 -Dtests.seed=2e18f456e714be89:310bba5e8404100d:-3bd11277c22f4591 
 -Dtests.multiplier=3 -Dargs=-Dfile.encoding=ISO8859-1







[jira] [Commented] (LUCENE-3714) add suggester that uses shortest path/wFST instead of buckets

2012-01-26 Thread Michael McCandless (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193950#comment-13193950
 ] 

Michael McCandless commented on LUCENE-3714:


{quote}
I think the (existing) benchmark is fair: it just measures how fast each 
suggester returns the top-N.
 Of course FSTSuggester buckets/early terminates and thats just a tradeoff it 
makes (having some impact on scores, possibly even positive).
{quote}

OK, I agree with that: it is a meaningful black-box comparison of two suggester 
impls.

bq. also I'm not sure I'm in love with findMinPairs. Maybe we should call it 
shortestPaths ?

+1

 add suggester that uses shortest path/wFST instead of buckets
 -

 Key: LUCENE-3714
 URL: https://issues.apache.org/jira/browse/LUCENE-3714
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/spellchecker
Reporter: Robert Muir
 Attachments: LUCENE-3714.patch, LUCENE-3714.patch, LUCENE-3714.patch, 
 TestMe.java, out.png


 Currently the FST suggester (really an FSA) quantizes weights into buckets 
 (e.g. single byte) and puts them in front of the word.
 This makes it fast, but you lose granularity in your suggestions.
 Lately the question was raised: if you build Lucene's FST with 
 PositiveIntOutputs, does it behave the same as a tropical semiring wFST?
 In other words, after completing the word, we instead traverse min(output) at 
 each node to find the 'shortest path' to the 
 best suggestion (with the highest score).
 This means we wouldn't need to quantize weights at all and it might make some 
 operations (e.g. adding fuzzy matching, etc.) a lot easier.







[jira] [Commented] (LUCENE-3720) OOM in TestBeiderMorseFilter.testRandom

2012-01-26 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193955#comment-13193955
 ] 

Robert Muir commented on LUCENE-3720:
-

Here's an initial test:
{noformat}
  public void testOOM2() throws Exception {
String test = 200697900'--#1913348150;/  bceaeef 
aadaabcf\aedfbff!--\'--?cae +
cfaaa?#!--/scriptlangfc;aadeaf?bdquocc =\abff\
//   afe   +
script!-- f(';cf aefbeef = \bfabadcf\ ebbfeedd = fccabeb ;
TokenStream ts = analyzer.tokenStream("bogus", new StringReader(test));
ts.reset();
while (ts.incrementToken()) {
  ;
}
ts.end();
ts.close();
  }
{noformat}

I'll see if I can make it blow up with a smaller string, and then port the test 
to just use commons-codec APIs (not Lucene ones).


 OOM in TestBeiderMorseFilter.testRandom
 ---

 Key: LUCENE-3720
 URL: https://issues.apache.org/jira/browse/LUCENE-3720
 Project: Lucene - Java
  Issue Type: Test
  Components: modules/analysis
Affects Versions: 3.6, 4.0
Reporter: Robert Muir

 This has been OOM'ing a lot... we should see why, its likely a real bug.
 ant test -Dtestcase=TestBeiderMorseFilter -Dtestmethod=testRandom 
 -Dtests.seed=2e18f456e714be89:310bba5e8404100d:-3bd11277c22f4591 
 -Dtests.multiplier=3 -Dargs=-Dfile.encoding=ISO8859-1







[JENKINS] Lucene-Solr-tests-only-trunk - Build # 12267 - Still Failing

2012-01-26 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/12267/

2 tests failed.
REGRESSION:  org.apache.solr.cloud.FullSolrCloudTest.testDistribSearch

Error Message:
java.net.SocketTimeoutException: Read timed out

Stack Trace:
org.apache.solr.client.solrj.SolrServerException: 
java.net.SocketTimeoutException: Read timed out
at 
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:481)
at 
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:251)
at 
org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:104)
at 
org.apache.solr.cloud.FullSolrCloudTest.index_specific(FullSolrCloudTest.java:446)
at 
org.apache.solr.cloud.FullSolrCloudTest.brindDownShardIndexSomeDocsAndRecover(FullSolrCloudTest.java:673)
at 
org.apache.solr.cloud.FullSolrCloudTest.doTest(FullSolrCloudTest.java:498)
at 
org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:663)
at 
org.apache.lucene.util.LuceneTestCase$3$1.evaluate(LuceneTestCase.java:529)
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:165)
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:146)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
at 
org.apache.commons.httpclient.HttpParser.readRawLine(HttpParser.java:78)
at 
org.apache.commons.httpclient.HttpParser.readLine(HttpParser.java:106)
at 
org.apache.commons.httpclient.HttpConnection.readLine(HttpConnection.java:1116)
at 
org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionAdapter.readLine(MultiThreadedHttpConnectionManager.java:1413)
at 
org.apache.commons.httpclient.HttpMethodBase.readStatusLine(HttpMethodBase.java:1973)
at 
org.apache.commons.httpclient.HttpMethodBase.readResponse(HttpMethodBase.java:1735)
at 
org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1098)
at 
org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:398)
at 
org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
at 
org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
at 
org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
at 
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:425)


FAILED:  junit.framework.TestSuite.org.apache.solr.cloud.FullSolrCloudTest

Error Message:
ERROR: SolrIndexSearcher opens=0 closes=1

Stack Trace:
junit.framework.AssertionFailedError: ERROR: SolrIndexSearcher opens=0 closes=1
at 
org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:147)
at 
org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:76)




Build Log (for compile errors):
[...truncated 8659 lines...]






[jira] [Created] (LUCENE-3725) Add optional packing to FST building

2012-01-26 Thread Michael McCandless (Created) (JIRA)
Add optional packing to FST building


 Key: LUCENE-3725
 URL: https://issues.apache.org/jira/browse/LUCENE-3725
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/FSTs
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 3.6, 4.0


The FSTs produced by Builder can be further shrunk if you are willing
to spend highish transient RAM to do so... our Builder today tries
hard not to use much RAM (and has options to tweak down the RAM usage,
in exchange for a somewhat larger FST), even when building immense FSTs.

But for apps that can afford highish transient RAM to get a smaller
net FST, I think we should offer packing.








[jira] [Updated] (LUCENE-3725) Add optional packing to FST building

2012-01-26 Thread Michael McCandless (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-3725:
---

Attachment: LUCENE-3725.patch

Initial patch; it has tons of nocommits, but I think it's basically
working correctly.

The packing is fairly simplistic now, but we can improve it with time
(I know Dawid has done all sorts of cool things!): it chooses the top
N nodes (sorted by incoming arc count) and saves them dereferenced so
that nodes w/ high in-count get a low address.  It also saves the
pointer as delta vs current position, if that would take fewer bytes.
The bytes are then in forward order.

The size savings varies by FST... eg, for the all-Wikipedia-terms FSA
(no outputs) it reduces byte size by 21%.  If I map to ords (FST) then
it's only 13% (I don't do anything to pack the outputs now, so the
bytes required for them are unchanged).

While the resulting FST is smaller, there is some hit to lookup (~8%
for the Wikipedia ord FST), because we have to deref some nodes.

I only turned packing on for one thing: the Kuromoji FST (shrank by
14%, 272 KB).
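The address assignment described above (nodes with high incoming-arc counts get low addresses) can be sketched like this; the node ids and counts are made up for illustration, and the code is a standalone sketch, not the actual patch:

```java
import java.util.*;

public class PackOrderSketch {
    public static void main(String[] args) {
        // Made-up node ids -> incoming arc counts.
        Map<Integer, Integer> inCount = new HashMap<>();
        inCount.put(7, 100); // heavily shared suffix node
        inCount.put(3, 2);
        inCount.put(9, 40);

        // Sort node ids by descending in-count: the most-referenced nodes
        // get the smallest addresses, so variable-length pointers to them
        // need the fewest bytes.
        List<Integer> order = new ArrayList<>(inCount.keySet());
        order.sort((a, b) -> inCount.get(b) - inCount.get(a));

        Map<Integer, Integer> newAddress = new TreeMap<>();
        for (int i = 0; i < order.size(); i++) {
            newAddress.put(order.get(i), i);
        }
        System.out.println(order);      // most-referenced node first
        System.out.println(newAddress); // node id -> packed address
    }
}
```

Saving a pointer as either an absolute low address or a delta from the current position, whichever is smaller, then follows naturally from this ordering.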


 Add optional packing to FST building
 

 Key: LUCENE-3725
 URL: https://issues.apache.org/jira/browse/LUCENE-3725
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/FSTs
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3725.patch


 The FSTs produced by Builder can be further shrunk if you are willing
 to spend highish transient RAM to do so... our Builder today tries
 hard not to use much RAM (and has options to tweak down the RAM usage,
 in exchange for a somewhat larger FST), even when building immense FSTs.
 But for apps that can afford highish transient RAM to get a smaller
 net FST, I think we should offer packing.







[jira] [Commented] (LUCENE-3720) OOM in TestBeiderMorseFilter.testRandom

2012-01-26 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194053#comment-13194053
 ] 

Robert Muir commented on LUCENE-3720:
-

For now I added a big red bold warning and disabled the test.

 OOM in TestBeiderMorseFilter.testRandom
 ---

 Key: LUCENE-3720
 URL: https://issues.apache.org/jira/browse/LUCENE-3720
 Project: Lucene - Java
  Issue Type: Test
  Components: modules/analysis
Affects Versions: 3.6, 4.0
Reporter: Robert Muir

 This has been OOM'ing a lot... we should see why, its likely a real bug.
 ant test -Dtestcase=TestBeiderMorseFilter -Dtestmethod=testRandom 
 -Dtests.seed=2e18f456e714be89:310bba5e8404100d:-3bd11277c22f4591 
 -Dtests.multiplier=3 -Dargs=-Dfile.encoding=ISO8859-1







Re: [JENKINS] Lucene-trunk - Build # 1810 - Failure

2012-01-26 Thread Robert Muir
I think it's possible MockRandom got some ridiculous settings? I think
the last time I saw this fail it was content=SimpleText

What is the free space assertion all about? Is it just ensuring that
MockDirectoryWrapper's disk full stuff actually works?
If that's all it is, maybe we should tone it back or remove it, and
separately test that in another test that always uses a certain codec?

On Thu, Jan 26, 2012 at 12:27 AM, Apache Jenkins Server
jenk...@builds.apache.org wrote:
 Build: https://builds.apache.org/job/Lucene-trunk/1810/

 1 tests failed.
 REGRESSION:  
 org.apache.lucene.index.TestIndexWriterOnDiskFull.testAddIndexOnDiskFull

 Error Message:
 max free Directory space required exceeded 1X the total input index sizes 
 during addIndexes(Directory[]) + forceMerge(1): max temp usage = 32229 bytes 
 vs limit=29274; starting disk usage = 3577 bytes; input index disk usage = 
 11060 bytes

 Stack Trace:
 junit.framework.AssertionFailedError: max free Directory space required 
 exceeded 1X the total input index sizes during addIndexes(Directory[]) + 
 forceMerge(1): max temp usage = 32229 bytes vs limit=29274; starting disk 
 usage = 3577 bytes; input index disk usage = 11060 bytes
        at 
 org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:165)
        at 
 org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57)
        at 
 org.apache.lucene.index.TestIndexWriterOnDiskFull.__CLR2_6_3gfbvu329wn(TestIndexWriterOnDiskFull.java:415)
        at 
 org.apache.lucene.index.TestIndexWriterOnDiskFull.testAddIndexOnDiskFull(TestIndexWriterOnDiskFull.java:149)
        at 
 org.apache.lucene.util.LuceneTestCase$3$1.evaluate(LuceneTestCase.java:529)




 Build Log (for compile errors):
 [...truncated 17750 lines...]







-- 
lucidimagination.com




Re: [JENKINS] Lucene-trunk - Build # 1810 - Failure

2012-01-26 Thread Michael McCandless
On Thu, Jan 26, 2012 at 1:35 PM, Robert Muir rcm...@gmail.com wrote:
 I think its possible MockRandom got some ridiculous settings? I think
 the last time i saw this fail it was content=SimpleText

 What is the free space assertion all about? is it just ensuring that
 MockDirectoryWrapper's disk full stuff actually works?
 If thats all it is maybe we should tone it back or remove it, and
 separately test that in another test that uses a certain codec always?

First, it tests that disk full at random times doesn't cause index corruption.

But then once it finds a disk-free size that completes successfully,
it runs that assert to confirm IndexWriter's peak transient disk space
required when adding indices is < X, where it computes X as a function
of the size of the indexes being added... since MockRandom can wildly
vary how much space a segment will take even when indexing the same
docs... it can easily cause the assert to trip...
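A hedged sketch of that final assertion follows; the limit formula and names here are illustrative, and the real computation lives in TestIndexWriterOnDiskFull:

```java
public class TransientDiskSketch {
    // Illustrative form of the check: peak transient disk usage during
    // addIndexes(Directory[]) + forceMerge(1) must stay under a budget
    // derived from the starting usage and the input index sizes.
    // (The multiplier x and the exact formula are assumptions here.)
    static void checkPeakUsage(long maxTempUsage, long startUsage,
                               long inputIndexUsage, double x) {
        long limit = startUsage + (long) (x * inputIndexUsage);
        if (maxTempUsage > limit) {
            throw new AssertionError("max temp usage = " + maxTempUsage
                + " bytes vs limit=" + limit);
        }
    }

    public static void main(String[] args) {
        // Byte counts borrowed from the Jenkins report above; with x = 2.0
        // this hypothetical run stays within budget.
        checkPeakUsage(12000, 3577, 11060, 2.0);
        System.out.println("ok");
    }
}
```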

Mike McCandless

http://blog.mikemccandless.com




[jira] [Commented] (LUCENE-3720) OOM in TestBeiderMorseFilter.testRandom

2012-01-26 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194071#comment-13194071
 ] 

Uwe Schindler commented on LUCENE-3720:
---

I think we should disable the whole TokenFilter for now until this is fixed. It 
was not yet released as far as I know. So I would suggest to temporarily svn rm 
it and revert this once this is fixed. This makes it disappear from the 3.6 
release, but it's not lost. We should keep this issue open until the root 
cause is fixed.

 OOM in TestBeiderMorseFilter.testRandom
 ---

 Key: LUCENE-3720
 URL: https://issues.apache.org/jira/browse/LUCENE-3720
 Project: Lucene - Java
  Issue Type: Test
  Components: modules/analysis
Affects Versions: 3.6, 4.0
Reporter: Robert Muir

 This has been OOM'ing a lot... we should see why, its likely a real bug.
 ant test -Dtestcase=TestBeiderMorseFilter -Dtestmethod=testRandom 
 -Dtests.seed=2e18f456e714be89:310bba5e8404100d:-3bd11277c22f4591 
 -Dtests.multiplier=3 -Dargs=-Dfile.encoding=ISO8859-1







[jira] [Commented] (LUCENE-3720) OOM in TestBeiderMorseFilter.testRandom

2012-01-26 Thread Michael McCandless (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194087#comment-13194087
 ] 

Michael McCandless commented on LUCENE-3720:


I don't think we need to remove the TokenFilter just because it has a known 
bug.  It could be this is rarely hit in practice / on more normal input.  All 
software has bugs...

 OOM in TestBeiderMorseFilter.testRandom
 ---

 Key: LUCENE-3720
 URL: https://issues.apache.org/jira/browse/LUCENE-3720
 Project: Lucene - Java
  Issue Type: Test
  Components: modules/analysis
Affects Versions: 3.6, 4.0
Reporter: Robert Muir

 This has been OOM'ing a lot... we should see why, its likely a real bug.
 ant test -Dtestcase=TestBeiderMorseFilter -Dtestmethod=testRandom 
 -Dtests.seed=2e18f456e714be89:310bba5e8404100d:-3bd11277c22f4591 
 -Dtests.multiplier=3 -Dargs=-Dfile.encoding=ISO8859-1







[JENKINS] Lucene-Solr-tests-only-trunk - Build # 12268 - Still Failing

2012-01-26 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/12268/

All tests passed

Build Log (for compile errors):
[...truncated 14916 lines...]






[jira] [Commented] (LUCENE-3720) OOM in TestBeiderMorseFilter.testRandom

2012-01-26 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194106#comment-13194106
 ] 

Robert Muir commented on LUCENE-3720:
-

I suspect (as noted on CODEC-132) that this is caused by certain punctuation. I 
can dig a little; it could be we apply a temporary 
workaround sanitising the offending character (and, long term/better, create a 
fix for CODEC-132).

 OOM in TestBeiderMorseFilter.testRandom
 ---

 Key: LUCENE-3720
 URL: https://issues.apache.org/jira/browse/LUCENE-3720
 Project: Lucene - Java
  Issue Type: Test
  Components: modules/analysis
Affects Versions: 3.6, 4.0
Reporter: Robert Muir

 This has been OOM'ing a lot... we should see why, its likely a real bug.
 ant test -Dtestcase=TestBeiderMorseFilter -Dtestmethod=testRandom 
 -Dtests.seed=2e18f456e714be89:310bba5e8404100d:-3bd11277c22f4591 
 -Dtests.multiplier=3 -Dargs=-Dfile.encoding=ISO8859-1







Re: [JENKINS] Lucene-trunk - Build # 1810 - Failure

2012-01-26 Thread Michael McCandless
I committed a fix -- this was a real (trunk-only) bug, where IW was
trying to remove files for the merged segments before closing the
merge readers, so the delete failed and transient disk usage was too
high!

Thank you crazy test,

Mike McCandless

http://blog.mikemccandless.com

On Thu, Jan 26, 2012 at 1:39 PM, Michael McCandless
luc...@mikemccandless.com wrote:
 On Thu, Jan 26, 2012 at 1:35 PM, Robert Muir rcm...@gmail.com wrote:
 I think its possible MockRandom got some ridiculous settings? I think
 the last time i saw this fail it was content=SimpleText

 What is the free space assertion all about? is it just ensuring that
 MockDirectoryWrapper's disk full stuff actually works?
 If thats all it is maybe we should tone it back or remove it, and
 separately test that in another test that uses a certain codec always?

 First, it tests that disk full at random times doesn't cause index corruption.

 But then once it finds a disk-free size that completes successfully,
 it runs that assert to confirm IndexWriter's peak transient disk space
 required when adding indices is < X, where it computes X as a function
 of the size of the indexes being added... since MockRandom can wildly
 vary how much space a segment will take even when indexing the same
 docs... it can easily cause the assert to trip...

 Mike McCandless

 http://blog.mikemccandless.com




[jira] [Commented] (SOLR-2268) Add support for Point in Polygon searches

2012-01-26 Thread Randy Jones (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194139#comment-13194139
 ] 

Randy Jones commented on SOLR-2268:
---

Another vote for this functionality. Any updates?

 Add support for Point in Polygon searches
 -

 Key: SOLR-2268
 URL: https://issues.apache.org/jira/browse/SOLR-2268
 Project: Solr
  Issue Type: New Feature
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
 Attachments: SOLR-2268.patch


 In spatial applications, it is common to ask whether a point is inside of a 
 polygon.  Solr could support two forms of this: 
 # A field contains a polygon and the user supplies a point.  If it does, the 
 doc is returned.  
 # A document contains a point and the user supplies a polygon.  If the point 
 is in the polygon, return the document
 With both of these case, it would be good to support the negative assertion, 
 too.
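For case #2 above (the document holds a point, the query supplies a polygon), the classic even-odd ray-casting test is a minimal non-geodetic sketch — generic textbook code, not the attached patch:

```java
// Standard even-odd ray-casting point-in-polygon test; non-geodetic sketch.
class PointInPolygon {
    // xs/ys list the polygon vertices in order; returns true if (px, py) is inside.
    static boolean contains(double[] xs, double[] ys, double px, double py) {
        boolean inside = false;
        for (int i = 0, j = xs.length - 1; i < xs.length; j = i++) {
            // Toggle for every polygon edge crossed by a horizontal ray
            // extending to the right of the point.
            if ((ys[i] > py) != (ys[j] > py)
                    && px < (xs[j] - xs[i]) * (py - ys[i]) / (ys[j] - ys[i]) + xs[i]) {
                inside = !inside;
            }
        }
        return inside;
    }
}
```

A real implementation additionally has to handle the dateline, the poles, and projection issues, which is what makes the geodetic case hard.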

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira






[JENKINS] Lucene-Solr-tests-only-trunk-java7 - Build # 1620 - Failure

2012-01-26 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk-java7/1620/

3 tests failed.
REGRESSION:  org.apache.solr.cloud.ChaosMonkeySafeLeaderTest.testDistribSearch

Error Message:
shard5 is not consistent, expected:45 and got:42

Stack Trace:
junit.framework.AssertionFailedError: shard5 is not consistent, expected:45 and 
got:42
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:165)
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57)
at 
org.apache.solr.cloud.FullSolrCloudTest.checkShardConsistency(FullSolrCloudTest.java:1045)
at 
org.apache.solr.cloud.ChaosMonkeySafeLeaderTest.doTest(ChaosMonkeySafeLeaderTest.java:114)
at 
org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:663)
at 
org.apache.lucene.util.LuceneTestCase$3$1.evaluate(LuceneTestCase.java:529)


FAILED:  
junit.framework.TestSuite.org.apache.solr.cloud.FullSolrCloudDistribCmdsTest

Error Message:
ERROR: SolrIndexSearcher opens=66 closes=65

Stack Trace:
junit.framework.AssertionFailedError: ERROR: SolrIndexSearcher opens=66 
closes=65
at 
org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:147)
at 
org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:76)


FAILED:  junit.framework.TestSuite.org.apache.solr.cloud.RecoveryZkTest

Error Message:
ERROR: SolrIndexSearcher opens=12 closes=13

Stack Trace:
junit.framework.AssertionFailedError: ERROR: SolrIndexSearcher opens=12 
closes=13
at 
org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:147)
at 
org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:76)




Build Log (for compile errors):
[...truncated 11244 lines...]






[JENKINS] Lucene-Solr-tests-only-trunk - Build # 12269 - Still Failing

2012-01-26 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/12269/

2 tests failed.
REGRESSION:  org.apache.solr.cloud.ChaosMonkeySafeLeaderTest.testDistribSearch

Error Message:
shard3 is not consistent, expected:15 and got:16

Stack Trace:
junit.framework.AssertionFailedError: shard3 is not consistent, expected:15 and 
got:16
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:165)
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57)
at 
org.apache.solr.cloud.FullSolrCloudTest.checkShardConsistency(FullSolrCloudTest.java:1045)
at 
org.apache.solr.cloud.ChaosMonkeySafeLeaderTest.doTest(ChaosMonkeySafeLeaderTest.java:114)
at 
org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:663)
at 
org.apache.lucene.util.LuceneTestCase$3$1.evaluate(LuceneTestCase.java:529)


REGRESSION:  
org.apache.solr.cloud.FullSolrCloudDistribCmdsTest.testDistribSearch

Error Message:
null  java.lang.AssertionError  at 
org.apache.solr.update.TransactionLog.init(TransactionLog.java:159)  at 
org.apache.solr.update.TransactionLog.init(TransactionLog.java:132)  at 
org.apache.solr.update.UpdateLog.ensureLog(UpdateLog.java:567)  at 
org.apache.solr.update.UpdateLog.add(UpdateLog.java:237)  at 
org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:184)
  at 
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:56)
  at 
org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:53)
  at 
org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:313)
  at 
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:410)
  at 
org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:217)
  at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:134)  at 
org.apache.solr.handler.XMLLoader.load(XMLLoader.java:78)  at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:59)
  at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
  at org.apache.solr.core.SolrCore.execute(SolrCore.java:1515)  at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:339) 
 at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:234)
  at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
  at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)  
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)  at 
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)  at 
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)  at 
org.mortbay.jetty.Server.handle(Server.java:326)  at 
org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)  at 
org.mortbay.jetty.Htt  null  java.lang.AssertionError  at 
org.apache.solr.update.TransactionLog.init(TransactionLog.java:159)  at 
org.apache.solr.update.TransactionLog.init(TransactionLog.java:132)  at 
org.apache.solr.update.UpdateLog.ensureLog(UpdateLog.java:567)  at 
org.apache.solr.update.UpdateLog.add(UpdateLog.java:237)  at 
org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:184)
  at 
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:56)
  at 
org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:53)
  at 
org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:313)
  at 
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:410)
  at 
org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:217)
  at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:134)  at 
org.apache.solr.handler.XMLLoader.load(XMLLoader.java:78)  at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:59)
  at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
  at org.apache.solr.core.SolrCore.execute(SolrCore.java:1515)  at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:339) 
 at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:234)
  at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
  at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)  
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)  at 
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)  at 

[jira] [Updated] (LUCENE-3714) add suggester that uses shortest path/wFST instead of buckets

2012-01-26 Thread Robert Muir (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-3714:


Attachment: LUCENE-3714.patch

updated patch nuking a couple nocommits: added the boolean exactMatchFirst 
(default=on like FSTSuggester), and removed the deduping nocommit.

So the main remaining things are:
* fix the encoding of the weights
* use the offline sort
* figure out what onlyMorePopular means for suggesters and what we should do 
about it

 add suggester that uses shortest path/wFST instead of buckets
 -

 Key: LUCENE-3714
 URL: https://issues.apache.org/jira/browse/LUCENE-3714
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/spellchecker
Reporter: Robert Muir
 Attachments: LUCENE-3714.patch, LUCENE-3714.patch, LUCENE-3714.patch, 
 LUCENE-3714.patch, TestMe.java, out.png


 Currently the FST suggester (really an FSA) quantizes weights into buckets 
 (e.g. single byte) and puts them in front of the word.
 This makes it fast, but you lose granularity in your suggestions.
 Lately the question was raised: if you build Lucene's FST with 
 PositiveIntOutputs, does it behave the same as a tropical-semiring wFST?
 In other words, after completing the word, we instead traverse min(output) at 
 each node to find the 'shortest path' to the 
 best suggestion (with the highest score).
 This means we wouldn't need to quantize weights at all and it might make some 
 operations (e.g. adding fuzzy matching etc) a lot easier.
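The min(output) traversal described above can be illustrated with a toy trie that stores, at each node, the minimum cost of any completion at or below it — a stand-in for the tropical-semiring outputs; this is a plain trie, not Lucene's FST:

```java
import java.util.Map;
import java.util.TreeMap;

// Toy "shortest path" suggester: follow the min-cost arc at every node.
class MinCostTrie {
    static class Node {
        final Map<Character, Node> children = new TreeMap<>();
        long minCost = Long.MAX_VALUE;  // best cost reachable at or below this node
        long termCost = Long.MAX_VALUE; // cost if a word ends exactly here
    }

    private final Node root = new Node();

    void add(String word, long cost) {
        Node n = root;
        n.minCost = Math.min(n.minCost, cost);
        for (char c : word.toCharArray()) {
            n = n.children.computeIfAbsent(c, k -> new Node());
            n.minCost = Math.min(n.minCost, cost);
        }
        n.termCost = Math.min(n.termCost, cost);
    }

    // Best (lowest-cost) completion of prefix, with no bucketing of weights.
    String best(String prefix) {
        Node n = root;
        for (char c : prefix.toCharArray()) {
            n = n.children.get(c);
            if (n == null) return null;
        }
        StringBuilder sb = new StringBuilder(prefix);
        while (n.termCost != n.minCost) {       // the best word ends deeper
            for (Map.Entry<Character, Node> e : n.children.entrySet()) {
                if (e.getValue().minCost == n.minCost) { // follow the min-output arc
                    sb.append(e.getKey());
                    n = e.getValue();
                    break;
                }
            }
        }
        return sb.toString();
    }
}
```

Because every node knows the best weight below it, the walk is greedy and exact — which is the granularity win over quantized buckets.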

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira






[jira] [Commented] (SOLR-2268) Add support for Point in Polygon searches

2012-01-26 Thread David Smiley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194179#comment-13194179
 ] 

David Smiley commented on SOLR-2268:


FWIW, this has been made top of my priority list for addition to LSP (which is 
Solr trunk only right now).  It does polygons already but they are not geodetic 
-- no dateline crossing, no pole wrapping, and the Mercator projection is assumed 
(flat earth).  My initial goal is simply adding dateline wrap, which will make 
it useful for the majority of contexts that use the Mercator projection and 
can't scroll/pan beyond a pole anyway (such as Google Maps, etc.).  I expect 
this to be done in a couple of weeks.  Adding pole wrap and great-circle-distance 
lines (not Mercator) will probably follow thereafter.
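The dateline-wrap case can be sketched in a few lines (illustrative only, not LSP code): a longitude range with west > east is taken to cross the ±180 dateline.

```java
// Sketch: a range with west > east wraps the dateline, e.g. west=170,
// east=-170 covers [170, 180] plus [-180, -170].
class DatelineSketch {
    static boolean lonInRange(double west, double east, double lon) {
        if (west <= east) {
            return west <= lon && lon <= east; // ordinary, non-wrapping range
        }
        return lon >= west || lon <= east;     // wrapped across the dateline
    }
}
```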

 Add support for Point in Polygon searches
 -

 Key: SOLR-2268
 URL: https://issues.apache.org/jira/browse/SOLR-2268
 Project: Solr
  Issue Type: New Feature
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
 Attachments: SOLR-2268.patch


 In spatial applications, it is common to ask whether a point is inside of a 
 polygon.  Solr could support two forms of this: 
 # A field contains a polygon and the user supplies a point.  If it does, the 
 doc is returned.  
 # A document contains a point and the user supplies a polygon.  If the point 
 is in the polygon, return the document
 With both of these case, it would be good to support the negative assertion, 
 too.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira






Welcome Tommaso Teofili as Lucene/Solr committer

2012-01-26 Thread Uwe Schindler
Hi,

I'm happy to announce that the Lucene/Solr PMC has voted in Tommaso Teofili
as our newest committer. He is also committer of the Apache UIMA project
(and various other Apache projects) and did a lot of work to make the Solr
UIMA plugin and various Analyzers working.
 
Tommaso, if you don't mind, could you introduce yourself with a brief bio as
has become our tradition? As soon as you have SVN commit rights, can you
also add yourself to the Lucene and Solr homepages?

Congratulations and welcome aboard!
Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de







Re: Welcome Tommaso Teofili as Lucene/Solr committer

2012-01-26 Thread Dawid Weiss
Congratulations Tommaso!

Dawid

On Thu, Jan 26, 2012 at 10:32 PM, Uwe Schindler u...@thetaphi.de wrote:
 Hi,

 I'm happy to announce that the Lucene/Solr PMC has voted in Tommaso Teofili
 as our newest committer. He is also committer of the Apache UIMA project
 (and various other Apache projects) and did a lot of work to make the Solr
 UIMA plugin and various Analyzers working.

 Tommaso, if you don't mind, could you introduce yourself with a brief bio as
 has become our tradition? As soon as you have SVN commit rights, can you
 also add yourself to the Lucene and Solr homepages?

 Congratulations and welcome aboard!
 Uwe

 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de




 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org





RE: Welcome Tommaso Teofili as Lucene/Solr committer

2012-01-26 Thread Steven A Rowe
Welcome Tommaso! - Steve

 -Original Message-
 From: Uwe Schindler [mailto:u...@thetaphi.de]
 Sent: Thursday, January 26, 2012 4:32 PM
 To: dev@lucene.apache.org
 Cc: 'Tommaso Teofili'
 Subject: Welcome Tommaso Teofili as Lucene/Solr committer
 
 Hi,
 
 I'm happy to announce that the Lucene/Solr PMC has voted in Tommaso
 Teofili
 as our newest committer. He is also committer of the Apache UIMA project
 (and various other Apache projects) and did a lot of work to make the Solr
 UIMA plugin and various Analyzers working.
 
 Tommaso, if you don't mind, could you introduce yourself with a brief bio
 as
 has become our tradition? As soon as you have SVN commit rights, can you
 also add yourself to the Lucene and Solr homepages?
 
 Congratulations and welcome aboard!
 Uwe
 
 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de
 
 
 
 
 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-tests-only-trunk-java7 - Build # 1621 - Still Failing

2012-01-26 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk-java7/1621/

3 tests failed.
FAILED:  
junit.framework.TestSuite.org.apache.solr.cloud.ChaosMonkeySafeLeaderTest

Error Message:
ERROR: SolrIndexSearcher opens=144 closes=136

Stack Trace:
junit.framework.AssertionFailedError: ERROR: SolrIndexSearcher opens=144 
closes=136
at 
org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:147)
at 
org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:76)


FAILED:  
junit.framework.TestSuite.org.apache.solr.cloud.FullSolrCloudDistribCmdsTest

Error Message:
ERROR: SolrIndexSearcher opens=66 closes=64

Stack Trace:
junit.framework.AssertionFailedError: ERROR: SolrIndexSearcher opens=66 
closes=64
at 
org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:147)
at 
org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:76)


FAILED:  junit.framework.TestSuite.org.apache.solr.cloud.RecoveryZkTest

Error Message:
ERROR: SolrIndexSearcher opens=13 closes=15

Stack Trace:
junit.framework.AssertionFailedError: ERROR: SolrIndexSearcher opens=13 
closes=15
at 
org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:147)
at 
org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:76)




Build Log (for compile errors):
[...truncated 10233 lines...]






Re: Welcome Tommaso Teofili as Lucene/Solr committer

2012-01-26 Thread Michael McCandless
Welcome Tommaso!

Mike McCandless

http://blog.mikemccandless.com

On Thu, Jan 26, 2012 at 4:32 PM, Uwe Schindler u...@thetaphi.de wrote:
 Hi,

 I'm happy to announce that the Lucene/Solr PMC has voted in Tommaso Teofili
 as our newest committer. He is also committer of the Apache UIMA project
 (and various other Apache projects) and did a lot of work to make the Solr
 UIMA plugin and various Analyzers working.

 Tommaso, if you don't mind, could you introduce yourself with a brief bio as
 has become our tradition? As soon as you have SVN commit rights, can you
 also add yourself to the Lucene and Solr homepages?

 Congratulations and welcome aboard!
 Uwe

 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de




 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org





Re: Welcome Tommaso Teofili as Lucene/Solr committer

2012-01-26 Thread Simon Willnauer
welcome! good to have you on board!

simon

On Thu, Jan 26, 2012 at 11:42 PM, Michael McCandless
luc...@mikemccandless.com wrote:
 Welcome Tommaso!

 Mike McCandless

 http://blog.mikemccandless.com

 On Thu, Jan 26, 2012 at 4:32 PM, Uwe Schindler u...@thetaphi.de wrote:
 Hi,

 I'm happy to announce that the Lucene/Solr PMC has voted in Tommaso Teofili
 as our newest committer. He is also committer of the Apache UIMA project
 (and various other Apache projects) and did a lot of work to make the Solr
 UIMA plugin and various Analyzers working.

 Tommaso, if you don't mind, could you introduce yourself with a brief bio as
 has become our tradition? As soon as you have SVN commit rights, can you
 also add yourself to the Lucene and Solr homepages?

 Congratulations and welcome aboard!
 Uwe

 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de




 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org


 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org





Re: Welcome Tommaso Teofili as Lucene/Solr committer

2012-01-26 Thread Mark Miller
Congrats and welcome!

On Jan 26, 2012, at 4:32 PM, Uwe Schindler wrote:

 Hi,
 
 I'm happy to announce that the Lucene/Solr PMC has voted in Tommaso Teofili
 as our newest committer. He is also committer of the Apache UIMA project
 (and various other Apache projects) and did a lot of work to make the Solr
 UIMA plugin and various Analyzers working.
 
 Tommaso, if you don't mind, could you introduce yourself with a brief bio as
 has become our tradition? As soon as you have SVN commit rights, can you
 also add yourself to the Lucene and Solr homepages?
 
 Congratulations and welcome aboard!
 Uwe
 
 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de
 
 
 
 
 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org
 

- Mark Miller
lucidimagination.com















Re: Welcome Tommaso Teofili as Lucene/Solr committer

2012-01-26 Thread Tommaso Teofili
Uwe, all

2012/1/26 Uwe Schindler u...@thetaphi.de

 Hi,

 I'm happy to announce that the Lucene/Solr PMC has voted in Tommaso Teofili
 as our newest committer.


this makes me very very happy, I'll do my best to help this already great
project and community grow and improve.


 He is also committer of the Apache UIMA project
 (and various other Apache projects) and did a lot of work to make the Solr
 UIMA plugin and various Analyzers working.

 Tommaso, if you don't mind, could you introduce yourself with a brief bio
 as
 has become our tradition?


Sure, I live in Rome, Italy, where I have a wonderful family (wife and son)
and where I work as a software engineer; my main interests and passions are
information retrieval, natural language processing and machine learning.
My story at the ASF started back in 2009, when I began to contribute to Apache
UIMA as I was using it for applying NLP algorithms to data extracted from
the web for my master's degree final work; now I am an Apache member
contributing to UIMA and other projects.
In the last two years I've worked on two quite big projects migrating
commercial closed-source search engines to Lucene/Solr backed systems,
hence I started getting my hands dirty with Lucene/Solr (the Solr-UIMA
plugin and the Lucene TypeTokenFilter are maybe my main contributions) and
I'm continuing to enjoy the ride.


 As soon as you have SVN commit rights, can you
 also add yourself to the Lucene and Solr homepages?


sure!



 Congratulations and welcome aboard!


Once again, thank you all for your trust!
Cheers,
Tommaso


 Uwe

 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de






Re: Welcome Tommaso Teofili as Lucene/Solr committer

2012-01-26 Thread Robert Muir
Welcome!

On Thu, Jan 26, 2012 at 4:32 PM, Uwe Schindler u...@thetaphi.de wrote:
 Hi,

 I'm happy to announce that the Lucene/Solr PMC has voted in Tommaso Teofili
 as our newest committer. He is also committer of the Apache UIMA project
 (and various other Apache projects) and did a lot of work to make the Solr
 UIMA plugin and various Analyzers working.

 Tommaso, if you don't mind, could you introduce yourself with a brief bio as
 has become our tradition? As soon as you have SVN commit rights, can you
 also add yourself to the Lucene and Solr homepages?

 Congratulations and welcome aboard!
 Uwe

 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de




 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




-- 
lucidimagination.com




Re: Welcome Tommaso Teofili as Lucene/Solr committer

2012-01-26 Thread Koji Sekiguchi

Welcome Tommaso!

(12/01/27 6:32), Uwe Schindler wrote:

Hi,

I'm happy to announce that the Lucene/Solr PMC has voted in Tommaso Teofili
as our newest committer. He is also committer of the Apache UIMA project
(and various other Apache projects) and did a lot of work to make the Solr
UIMA plugin and various Analyzers working.

Tommaso, if you don't mind, could you introduce yourself with a brief bio as
has become our tradition? As soon as you have SVN commit rights, can you
also add yourself to the Lucene and Solr homepages?

Congratulations and welcome aboard!
Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de




-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org





--
http://www.rondhuit.com/en/




[JENKINS] Lucene-Solr-tests-only-trunk-java7 - Build # 1622 - Still Failing

2012-01-26 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk-java7/1622/

3 tests failed.
FAILED:  
junit.framework.TestSuite.org.apache.solr.cloud.ChaosMonkeySafeLeaderTest

Error Message:
ERROR: SolrIndexSearcher opens=176 closes=162

Stack Trace:
junit.framework.AssertionFailedError: ERROR: SolrIndexSearcher opens=176 
closes=162
at 
org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:147)
at 
org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:76)


FAILED:  
junit.framework.TestSuite.org.apache.solr.cloud.FullSolrCloudDistribCmdsTest

Error Message:
ERROR: SolrIndexSearcher opens=66 closes=65

Stack Trace:
junit.framework.AssertionFailedError: ERROR: SolrIndexSearcher opens=66 
closes=65
at 
org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:147)
at 
org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:76)


FAILED:  junit.framework.TestSuite.org.apache.solr.cloud.RecoveryZkTest

Error Message:
ERROR: SolrIndexSearcher opens=13 closes=14

Stack Trace:
junit.framework.AssertionFailedError: ERROR: SolrIndexSearcher opens=13 
closes=14
at 
org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:147)
at 
org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:76)




Build Log (for compile errors):
[...truncated 67753 lines...]






Re: svn commit: r1236410 - in /lucene/dev/trunk/solr: core/src/java/org/apache/solr/update/TransactionLog.java core/src/java/org/apache/solr/update/UpdateLog.java testlogging.properties

2012-01-26 Thread Robert Muir
I noticed some Thai numerals in the test output: the transaction
log filename is built with the default-locale String.format,
but then parsed with Long.parseLong.

I can't think of any actual bugs/problems from this, except possible confusion
from localized filenames, but I think it would be safer to pass
Locale.ENGLISH to String.format...
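A minimal, self-contained illustration of the point — the "tlog." name pattern and the Thai locale tag below are assumptions for demonstration, not Solr's actual constants:

```java
import java.util.Locale;

// Default-locale formatting can emit non-ASCII digits; pinning the locale
// keeps filenames ASCII. Name pattern here is assumed, not Solr's real one.
class TlogNameDemo {
    // Locale-pinned formatting, as suggested for the transaction-log filename.
    static String asciiName(long id) {
        return String.format(Locale.ENGLISH, "tlog.%019d", id);
    }

    public static void main(String[] args) {
        // Under a CLDR-based JDK, forcing the Thai numbering system makes %d
        // emit Thai digits -- the kind of surprise seen in the test output:
        System.out.println(String.format(Locale.forLanguageTag("th-TH-u-nu-thai"), "%d", 5));
        System.out.println(asciiName(1234)); // tlog.0000000000000001234
        // Long.parseLong delegates to Character.digit, which accepts any Unicode
        // decimal digit, so even localized names may parse back -- the harm is
        // mainly confusing, non-portable filenames rather than an outright bug.
        System.out.println(Character.digit('\u0E55', 10)); // THAI DIGIT FIVE -> 5
    }
}
```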

On Thu, Jan 26, 2012 at 5:09 PM,  yo...@apache.org wrote:
 Author: yonik
 Date: Thu Jan 26 22:09:08 2012
 New Revision: 1236410

 URL: http://svn.apache.org/viewvc?rev=1236410&view=rev
 Log:
 tests: try to track down the tlog-already-exists issue

 Modified:
    
 lucene/dev/trunk/solr/core/src/java/org/apache/solr/update/TransactionLog.java
    lucene/dev/trunk/solr/core/src/java/org/apache/solr/update/UpdateLog.java
    lucene/dev/trunk/solr/testlogging.properties

 Modified: 
 lucene/dev/trunk/solr/core/src/java/org/apache/solr/update/TransactionLog.java
 URL: 
 http://svn.apache.org/viewvc/lucene/dev/trunk/solr/core/src/java/org/apache/solr/update/TransactionLog.java?rev=1236410&r1=1236409&r2=1236410&view=diff
 ==
 --- 
 lucene/dev/trunk/solr/core/src/java/org/apache/solr/update/TransactionLog.java
  (original)
 +++ 
 lucene/dev/trunk/solr/core/src/java/org/apache/solr/update/TransactionLog.java
  Thu Jan 26 22:09:08 2012
 @@ -55,6 +55,8 @@ import java.util.concurrent.atomic.Atomi
  */
  public class TransactionLog {
   public static Logger log = LoggerFactory.getLogger(TransactionLog.class);
 +  final boolean debug = log.isDebugEnabled();
 +  final boolean trace = log.isTraceEnabled();

   public final static String END_MESSAGE="SOLR_TLOG_END";

 @@ -71,7 +73,6 @@ public class TransactionLog {
   AtomicInteger refcount = new AtomicInteger(1);
   Map<String,Integer> globalStringMap = new HashMap<String, Integer>();
   List<String> globalStringList = new ArrayList<String>();
 -  final boolean debug = log.isDebugEnabled();

   long snapshot_size;
   int snapshot_numRecords;
 @@ -156,6 +157,9 @@ public class TransactionLog {
           addGlobalStrings(globalStrings);
         }
       } else {
 +        if (start > 0) {
 +          log.error("New transaction log already exists:" + tlogFile + " size=" + raf.length());
 +        }
          assert start==0;
          if (start > 0) {
            raf.setLength(0);
 @@ -543,8 +547,8 @@ public class TransactionLog {


       synchronized (TransactionLog.this) {
 -        if (debug) {
 -          log.debug("Reading log record.  pos="+pos+" currentSize="+fos.size());
 +        if (trace) {
 +          log.trace("Reading log record.  pos="+pos+" currentSize="+fos.size());
         }

          if (pos >= fos.size()) {

 Modified: 
 lucene/dev/trunk/solr/core/src/java/org/apache/solr/update/UpdateLog.java
 URL: 
 http://svn.apache.org/viewvc/lucene/dev/trunk/solr/core/src/java/org/apache/solr/update/UpdateLog.java?rev=1236410&r1=1236409&r2=1236410&view=diff
 ==
 --- lucene/dev/trunk/solr/core/src/java/org/apache/solr/update/UpdateLog.java 
 (original)
 +++ lucene/dev/trunk/solr/core/src/java/org/apache/solr/update/UpdateLog.java 
 Thu Jan 26 22:09:08 2012
 @@ -48,6 +48,7 @@ import java.util.concurrent.*;
  public class UpdateLog implements PluginInfoInitialized {
   public static Logger log = LoggerFactory.getLogger(UpdateLog.class);
   public boolean debug = log.isDebugEnabled();
 +  public boolean trace = log.isTraceEnabled();


   public enum SyncLevel { NONE, FLUSH, FSYNC }
 @@ -141,6 +142,9 @@ public class UpdateLog implements Plugin
     this.uhandler = uhandler;

     if (dataDir.equals(lastDataDir)) {
 +      if (debug) {
 +        log.debug("UpdateHandler init: tlogDir=" + tlogDir + ", next id=" + id + ", this is a reopen... nothing else to do.");
 +      }
       // on a normal reopen, we currently shouldn't have to do anything
       return;
     }
 @@ -150,6 +154,10 @@ public class UpdateLog implements Plugin
     tlogFiles = getLogList(tlogDir);
     id = getLastLogId() + 1;   // add 1 since we will create a new log for 
 the next update

 +    if (debug) {
 +      log.debug("UpdateHandler init: tlogDir=" + tlogDir + ", existing tlogs=" + Arrays.asList(tlogFiles) + ", next id=" + id);
 +    }
 +
     TransactionLog oldLog = null;
     for (String oldLogName : tlogFiles) {
       File f = new File(tlogDir, oldLogName);
 @@ -247,8 +255,8 @@ public class UpdateLog implements Plugin
         map.put(cmd.getIndexedId(), ptr);
       }

 -      if (debug) {
 -        log.debug("TLOG: added id " + cmd.getPrintableId() + " to " + tlog + " " + ptr + " map=" + System.identityHashCode(map));
 +      if (trace) {
 +        log.trace("TLOG: added id " + cmd.getPrintableId() + " to " + tlog + " " + ptr + " map=" + System.identityHashCode(map));
        }
     }
   }
 @@ -274,8 +282,8 @@ public class UpdateLog implements Plugin
         oldDeletes.put(br, ptr);
       }

 -      if (debug) {
 -        log.debug(TLOG: added delete for id  + 

[JENKINS] Lucene-Solr-tests-only-trunk - Build # 12271 - Still Failing

2012-01-26 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/12271/

3 tests failed.
REGRESSION:  org.apache.solr.cloud.ChaosMonkeySafeLeaderTest.testDistribSearch

Error Message:
shard4 is not consistent, expected:50 and got:47

Stack Trace:
junit.framework.AssertionFailedError: shard4 is not consistent, expected:50 and 
got:47
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:165)
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57)
at 
org.apache.solr.cloud.FullSolrCloudTest.checkShardConsistency(FullSolrCloudTest.java:1050)
at 
org.apache.solr.cloud.ChaosMonkeySafeLeaderTest.doTest(ChaosMonkeySafeLeaderTest.java:114)
at 
org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:663)
at 
org.apache.lucene.util.LuceneTestCase$3$1.evaluate(LuceneTestCase.java:529)


FAILED:  junit.framework.TestSuite.org.apache.solr.cloud.LeaderElectionTest

Error Message:
ERROR: SolrIndexSearcher opens=0 closes=1

Stack Trace:
junit.framework.AssertionFailedError: ERROR: SolrIndexSearcher opens=0 closes=1
at 
org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:147)
at 
org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:76)


FAILED:  
junit.framework.TestSuite.org.apache.solr.cloud.FullSolrCloudDistribCmdsTest

Error Message:
ERROR: SolrIndexSearcher opens=66 closes=65

Stack Trace:
junit.framework.AssertionFailedError: ERROR: SolrIndexSearcher opens=66 
closes=65
at 
org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:147)
at 
org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:76)




Build Log (for compile errors):
[...truncated 48903 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-1052) Deprecate/Remove indexDefaults in favor of mainIndex in solrconfig.xml

2012-01-26 Thread Updated

 [ 
https://issues.apache.org/jira/browse/SOLR-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-1052:
--

Fix Version/s: 4.0
   3.6
   Labels: solrconfig.xml  (was: )

Suggestion:

In 3.6, deprecate indexDefaults, removing from example solrconfig, announcing 
its death in CHANGES.TXT
In 4.0, remove every trace of indexDefaults

 Deprecate/Remove indexDefaults in favor of mainIndex in solrconfig.xml
 --

 Key: SOLR-1052
 URL: https://issues.apache.org/jira/browse/SOLR-1052
 Project: Solr
  Issue Type: Improvement
Reporter: Grant Ingersoll
Priority: Minor
  Labels: solrconfig.xml
 Fix For: 3.6, 4.0


 Given that we now handle multiple cores via the solr.xml and the discussion 
 around indexDefaults and mainIndex at 
 http://www.lucidimagination.com/search/p:solr?q=mainIndex+vs.+indexDefaults
 We should deprecate/remove the use of indexDefaults and just rely on 
 mainIndex, as it doesn't seem to serve any purpose and is confusing to 
 explain.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira






[jira] [Updated] (LUCENE-3672) IndexCommit.equals() bug

2012-01-26 Thread Michael McCandless (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-3672:
---

Attachment: LUCENE-3672.patch

Patch.

I removed Directory.fileModified, and
IndexCommit.getVersion/getTimestamp.  I changed Solr to take its own
timestamp (System.currentTimeMillis) and store into the commit's
userData, and fixed the places that needed it to look it up from
there.  I also throw an exception if ever IndexCommit.compareTo is
passed an IndexCommit with a different Directory.
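The userData approach described above can be sketched in isolation (a hand-written illustration, not taken from the patch; the "commitTimeMSec" key name and the elided IndexWriter call are assumptions):

```java
import java.util.HashMap;
import java.util.Map;

public class CommitUserDataSketch {
    public static void main(String[] args) {
        // Instead of relying on Directory.fileModified, take our own
        // timestamp and record it in the commit's userData map ...
        Map<String, String> userData = new HashMap<>();
        userData.put("commitTimeMSec", String.valueOf(System.currentTimeMillis()));
        // ... which would then be handed to IndexWriter when committing.

        // Code that previously called IndexCommit.getTimestamp() instead
        // looks the value up from the commit's userData:
        long commitTime = Long.parseLong(userData.get("commitTimeMSec"));
        System.out.println(commitTime > 0); // true
    }
}
```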


 IndexCommit.equals() bug
 

 Key: LUCENE-3672
 URL: https://issues.apache.org/jira/browse/LUCENE-3672
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Affects Versions: 4.0
Reporter: Andrzej Bialecki 
 Attachments: LUCENE-3672.patch


 IndexCommit.equals() checks for equality of Directories and versions, but it 
 doesn't check IMHO the more important generation numbers. It looks like 
 commits are really identified by a combination of directory and segments_XXX, 
 which means the generation number, because that's what the 
 DirectoryReader.open() checks for.
 This bug leads to an unexpected behavior when the only change to be committed 
 is in userData - we get two commits then that are declared equal, they have 
 the same version but they have different generation numbers. I have no idea 
 how this situation is treated in a few dozen references to 
 IndexCommit.equals() across Lucene...
 On the surface the fix is trivial - either add the gen number to equals(), or 
 use gen number instead of version. However, it's puzzling why these two would 
 ever get out of sync??? and if they are always supposed to be in sync then 
 maybe we don't need both of them at all, maybe just generation or version is 
 sufficient?




[jira] [Assigned] (LUCENE-3672) IndexCommit.equals() bug

2012-01-26 Thread Michael McCandless (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless reassigned LUCENE-3672:
--

Assignee: Michael McCandless

 IndexCommit.equals() bug
 

 Key: LUCENE-3672
 URL: https://issues.apache.org/jira/browse/LUCENE-3672
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Affects Versions: 4.0
Reporter: Andrzej Bialecki 
Assignee: Michael McCandless
 Attachments: LUCENE-3672.patch






[jira] [Commented] (SOLR-1052) Deprecate/Remove indexDefaults in favor of mainIndex in solrconfig.xml

2012-01-26 Thread Chris Male (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13194331#comment-13194331
 ] 

Chris Male commented on SOLR-1052:
--

Why not deprecate in both code and configurations in 3.6? That way it doesn't 
just disappear from users in 3.6.

 Deprecate/Remove indexDefaults in favor of mainIndex in solrconfig.xml
 --

 Key: SOLR-1052
 URL: https://issues.apache.org/jira/browse/SOLR-1052
 Project: Solr
  Issue Type: Improvement
Reporter: Grant Ingersoll
Priority: Minor
  Labels: solrconfig.xml
 Fix For: 3.6, 4.0






Re: svn commit: r1236410 - in /lucene/dev/trunk/solr: core/src/java/org/apache/solr/update/TransactionLog.java core/src/java/org/apache/solr/update/UpdateLog.java testlogging.properties

2012-01-26 Thread Yonik Seeley
On Thu, Jan 26, 2012 at 7:25 PM, Robert Muir rcm...@gmail.com wrote:
 I noticed on test output seeing some thai numerals that the
 transaction log filename is using the default locale String.format,
 but then parsed with Long.parseLong

 I can't think of any bugs/problems in this, except possible confusion
 of localized filenames, but I think it would be safer to pass
 Locale.ENGLISH to String.format...

Thanks! I'll give it a shot and cross my fingers...
could be a localization bug with that JVM or something.

-Yonik
http://www.lucidimagination.com
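The pitfall under discussion can be shown standalone (a sketch, not the TransactionLog code; the `tlog.%019d` pattern and the id value are illustrative assumptions). With the default locale, String.format may emit localized digits (e.g. Thai numerals) that Long.parseLong cannot read back, so the locale is pinned:

```java
import java.util.Locale;

public class TLogNameLocale {
    public static void main(String[] args) {
        long id = 42; // illustrative id
        // Pinning the locale keeps the generated filename made of ASCII
        // digits, so it round-trips through Long.parseLong():
        String name = String.format(Locale.ENGLISH, "tlog.%019d", id);
        long parsed = Long.parseLong(name.substring("tlog.".length()));
        System.out.println(name);          // tlog.0000000000000000042
        System.out.println(parsed == id);  // true
    }
}
```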




Re: Welcome Tommaso Teofili as Lucene/Solr committer

2012-01-26 Thread Erick Erickson
Glad to have you aboard!

Erick

On Thu, Jan 26, 2012 at 3:42 PM, Koji Sekiguchi k...@r.email.ne.jp wrote:
 Welcome Tommaso!


 (12/01/27 6:32), Uwe Schindler wrote:

 Hi,

 I'm happy to announce that the Lucene/Solr PMC has voted in Tommaso
 Teofili
 as our newest committer. He is also committer of the Apache UIMA project
 (and various other Apache projects) and did a lot of work to make the Solr
  UIMA plugin and various Analyzers work.

 Tommaso, if you don't mind, could you introduce yourself with a brief bio
 as
 has become our tradition? As soon as you have SVN commit rights, can you
 also add yourself to the Lucene and Solr homepages?

 Congratulations and welcome aboard!
 Uwe

 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de




 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




 --
 http://www.rondhuit.com/en/


 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org





[jira] [Commented] (SOLR-2358) Distributing Indexing

2012-01-26 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13194360#comment-13194360
 ] 

Robert Muir commented on SOLR-2358:
---

Should another issue be opened for the tests?

Do the failures reproduce if you ssh into the hudson machine itself and test 
from there?
I've found this useful before when things are hard to reproduce.

Do any tests rely upon *not* being able to connect to a tcp/udp port (even 
localhost)? 
Our hudson machine has an interesting network configuration: it blackholes 
connections
to closed ports, so any tests that rely upon this will just hang (for a very 
long time!) 
unless you do some tricks.  This is actually great for testing (imo), because 
it 
simulates how a real outage can behave: but is likely different from anyone's 
local machine.

 Distributing Indexing
 -

 Key: SOLR-2358
 URL: https://issues.apache.org/jira/browse/SOLR-2358
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud, update
Reporter: William Mayor
Priority: Minor
 Fix For: 4.0

 Attachments: 2shard4server.jpg, SOLR-2358.patch, SOLR-2358.patch, 
 apache-solr-noggit-r1211150.jar, zookeeper-3.3.4.jar


 The indexing side of SolrCloud - the goal of this issue is to provide 
 durable, fault tolerant indexing to an elastic cluster of Solr instances.




Re: Welcome Tommaso Teofili as Lucene/Solr committer

2012-01-26 Thread Jan Høydahl
You are very much welcome on board, Tommaso! You've done great work so far.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

On 26. jan. 2012, at 22:32, Uwe Schindler wrote:

 Hi,
 
 I'm happy to announce that the Lucene/Solr PMC has voted in Tommaso Teofili
 as our newest committer. He is also committer of the Apache UIMA project
 (and various other Apache projects) and did a lot of work to make the Solr
  UIMA plugin and various Analyzers work.
 
 Tommaso, if you don't mind, could you introduce yourself with a brief bio as
 has become our tradition? As soon as you have SVN commit rights, can you
 also add yourself to the Lucene and Solr homepages?
 
 Congratulations and welcome aboard!
 Uwe
 
 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de
 
 
 
 
 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org
 





[jira] [Assigned] (SOLR-1052) Deprecate/Remove indexDefaults in favor of mainIndex in solrconfig.xml

2012-01-26 Thread Assigned

 [ 
https://issues.apache.org/jira/browse/SOLR-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl reassigned SOLR-1052:
-

Assignee: Jan Høydahl

 Deprecate/Remove indexDefaults in favor of mainIndex in solrconfig.xml
 --

 Key: SOLR-1052
 URL: https://issues.apache.org/jira/browse/SOLR-1052
 Project: Solr
  Issue Type: Improvement
Reporter: Grant Ingersoll
Assignee: Jan Høydahl
Priority: Minor
  Labels: solrconfig.xml
 Fix For: 3.6, 4.0

 Attachments: SOLR-1052-3x.patch






[jira] [Updated] (SOLR-1052) Deprecate/Remove indexDefaults in favor of mainIndex in solrconfig.xml

2012-01-26 Thread Updated

 [ 
https://issues.apache.org/jira/browse/SOLR-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-1052:
--

Attachment: SOLR-1052-3x.patch

Attempt at 3.6 deprecation. I merged the configs into mainIndex, added 
missing comments and commented out some settings which were identical to the 
defaults anyway. I also changed the default for ramBufferSizeMB from 16 to 32 
in java code, to be able to comment out this setting in XML as well.

If indexDefaults is found in solrconfig.xml a deprecation warning is printed.

I also removed an attempt to parse an old Solr 1.x(?) syntax for 
mergeScheduler, mergePolicy and luceneAutoCommit.

What do you think?
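For illustration, the consolidation described might look like this (a hand-written sketch, not taken from the patch; the individual settings shown are just examples):

```xml
<!-- Before: settings split between the deprecated defaults section and mainIndex -->
<indexDefaults>
  <ramBufferSizeMB>32</ramBufferSizeMB>
  <mergeFactor>10</mergeFactor>
</indexDefaults>
<mainIndex>
  <unlockOnStartup>false</unlockOnStartup>
</mainIndex>

<!-- After: indexDefaults is gone; everything lives in mainIndex -->
<mainIndex>
  <ramBufferSizeMB>32</ramBufferSizeMB>
  <mergeFactor>10</mergeFactor>
  <unlockOnStartup>false</unlockOnStartup>
</mainIndex>
```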

 Deprecate/Remove indexDefaults in favor of mainIndex in solrconfig.xml
 --

 Key: SOLR-1052
 URL: https://issues.apache.org/jira/browse/SOLR-1052
 Project: Solr
  Issue Type: Improvement
Reporter: Grant Ingersoll
Priority: Minor
  Labels: solrconfig.xml
 Fix For: 3.6, 4.0

 Attachments: SOLR-1052-3x.patch






[jira] [Commented] (SOLR-1052) Deprecate/Remove indexDefaults in favor of mainIndex in solrconfig.xml

2012-01-26 Thread Chris Male (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13194384#comment-13194384
 ] 

Chris Male commented on SOLR-1052:
--

Why not leave the indexDefaults in the config and say they're deprecated (as 
I said before)?

 Deprecate/Remove indexDefaults in favor of mainIndex in solrconfig.xml
 --

 Key: SOLR-1052
 URL: https://issues.apache.org/jira/browse/SOLR-1052
 Project: Solr
  Issue Type: Improvement
Reporter: Grant Ingersoll
Assignee: Jan Høydahl
Priority: Minor
  Labels: solrconfig.xml
 Fix For: 3.6, 4.0

 Attachments: SOLR-1052-3x.patch






[JENKINS] Lucene-Solr-tests-only-trunk - Build # 12272 - Still Failing

2012-01-26 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/12272/

5 tests failed.
FAILED:  
junit.framework.TestSuite.org.apache.solr.cloud.ChaosMonkeySafeLeaderTest

Error Message:
ERROR: SolrIndexSearcher opens=116 closes=108

Stack Trace:
junit.framework.AssertionFailedError: ERROR: SolrIndexSearcher opens=116 
closes=108
at 
org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:151)
at 
org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:76)


FAILED:  junit.framework.TestSuite.org.apache.solr.cloud.RecoveryZkTest

Error Message:
ERROR: SolrIndexSearcher opens=12 closes=11

Stack Trace:
junit.framework.AssertionFailedError: ERROR: SolrIndexSearcher opens=12 
closes=11
at 
org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:151)
at 
org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:76)


FAILED:  junit.framework.TestSuite.org.apache.solr.search.SpatialFilterTest

Error Message:
ERROR: SolrIndexSearcher opens=6 closes=7

Stack Trace:
junit.framework.AssertionFailedError: ERROR: SolrIndexSearcher opens=6 closes=7
at 
org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:151)
at 
org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:76)


FAILED:  junit.framework.TestSuite.org.apache.solr.cloud.LeaderElectionTest

Error Message:
ERROR: SolrIndexSearcher opens=0 closes=6

Stack Trace:
junit.framework.AssertionFailedError: ERROR: SolrIndexSearcher opens=0 closes=6
at 
org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:151)
at 
org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:76)


FAILED:  
junit.framework.TestSuite.org.apache.solr.cloud.FullSolrCloudDistribCmdsTest

Error Message:
ERROR: SolrIndexSearcher opens=66 closes=65

Stack Trace:
junit.framework.AssertionFailedError: ERROR: SolrIndexSearcher opens=66 
closes=65
at 
org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:151)
at 
org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:76)




Build Log (for compile errors):
[...truncated 43998 lines...]






[jira] [Commented] (SOLR-3056) Introduce Japanese field type in schema.xml

2012-01-26 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13194403#comment-13194403
 ] 

Robert Muir commented on SOLR-3056:
---

As a first step, let's adjust the analyzer defaults so that the Lucene analyzer 
supports search mode by default.
I have a few questions about this mode I want to throw out there, so I'll 
create a new issue.

 Introduce Japanese field type in schema.xml
 ---

 Key: SOLR-3056
 URL: https://issues.apache.org/jira/browse/SOLR-3056
 Project: Solr
  Issue Type: New Feature
  Components: Schema and Analysis
Affects Versions: 3.6, 4.0
Reporter: Christian Moen

 Kuromoji (LUCENE-3305) is now on both trunk and branch_3x (thanks again 
 Robert, Uwe and Simon). It would be very good to get a default field type 
 defined for Japanese in {{schema.xml}} so we get good Japanese out-of-the-box 
 support in Solr.
 I've been playing with the below configuration today, which I think is a 
 reasonable starting point for Japanese.  There's a lot to be said about various 
 considerations necessary when searching Japanese, but perhaps a wiki page is 
 more suitable to cover the wider topic?
 In order to make the below {{text_ja}} field type work, Kuromoji itself and 
 its analyzers need to be seen by the Solr classloader.  However, these are 
 currently in contrib and I'm wondering if we should consider moving them to 
 core to make them directly available.  If there are concerns with additional 
 memory usage, etc. for non-Japanese users, we can make sure resources are 
 loaded lazily and only when needed in factory-land.
 Any thoughts?
 {code:xml}
 <!-- Text field type is suitable for Japanese text using morphological analysis
      NOTE: Please copy files
        contrib/analysis-extras/lucene-libs/lucene-kuromoji-x.y.z.jar
        dist/apache-solr-analysis-extras-x.y.z.jar
      to your Solr lib directory (i.e. example/solr/lib) before starting Solr.
      (x.y.z refers to a version number)
      If you would like to optimize for precision, default operator AND with
        <solrQueryParser defaultOperator="AND"/>
      below (this file).  Use OR if you would like to optimize for recall (default).
 -->
 <fieldType name="text_ja" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="false">
   <analyzer>
     <!-- Kuromoji Japanese morphological analyzer/tokenizer
          Use search-mode to get a noun-decompounding effect useful for search.
          Example:
            関西国際空港 (Kansai International Airport) becomes 関西 (Kansai) 国際 (International) 空港 (airport),
            so we get a match for 空港 (airport) as we would expect from a good search engine.
          Valid values for mode are:
            normal: default segmentation
            search: segmentation useful for search (extra compound splitting)
            extended: search mode with unigramming of unknown words (experimental)
          NOTE: Search mode improves segmentation for search at the expense of part-of-speech accuracy
     -->
     <tokenizer class="solr.KuromojiTokenizerFactory" mode="search"/>
     <!-- Reduces inflected verbs and adjectives to their base/dictionary forms (辞書形) -->
     <filter class="solr.KuromojiBaseFormFilterFactory"/>
     <!-- Optionally remove tokens with certain parts-of-speech
     <filter class="solr.KuromojiPartOfSpeechStopFilterFactory" tags="stopTags.txt" enablePositionIncrements="true"/> -->
     <!-- Normalizes full-width romaji to half-width and half-width kana to full-width (Unicode NFKC subset) -->
     <filter class="solr.CJKWidthFilterFactory"/>
     <!-- Lower-cases romaji characters -->
     <filter class="solr.LowerCaseFilterFactory"/>
   </analyzer>
 </fieldType>
 {code}




[jira] [Created] (LUCENE-3726) Default KuromojiAnalyzer to use search mode

2012-01-26 Thread Robert Muir (Created) (JIRA)
Default KuromojiAnalyzer to use search mode
---

 Key: LUCENE-3726
 URL: https://issues.apache.org/jira/browse/LUCENE-3726
 Project: Lucene - Java
  Issue Type: Improvement
Affects Versions: 3.6, 4.0
Reporter: Robert Muir


Kuromoji supports an option to segment text in a way more suitable for search,
by preventing long compound nouns as indexing terms.

In general 'how you segment' can be important depending on the application 
(see http://nlp.stanford.edu/pubs/acl-wmt08-cws.pdf for some studies on this in 
Chinese)

The current algorithm punishes the cost based on some parameters 
(SEARCH_MODE_PENALTY, SEARCH_MODE_LENGTH, etc)
for long runs of kanji.

Some questions (these can be separate future issues if any useful ideas come 
out):
* should these parameters continue to be static-final, or configurable?
* should POS also play a role in the algorithm (can/should we refine exactly 
what we decompound)?
* is the Tokenizer the best place to do this, or should we do it in a 
tokenfilter? or both?
  with a tokenfilter, one idea would be to also preserve the original indexing 
term, overlapping it: e.g. ABCD -> AB, CD, ABCD(posInc=0)
  from my understanding this tends to help with noun compounds in other 
languages, because IDF of the original term boosts 'exact' compound matches.
  but does a tokenfilter provide the segmenter enough 'context' to do this 
properly?

Either way, I think as a start we should turn on what we have by default: its 
likely a very easy win.





[jira] [Commented] (SOLR-3056) Introduce Japanese field type in schema.xml

2012-01-26 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13194413#comment-13194413
 ] 

Robert Muir commented on SOLR-3056:
---

I opened LUCENE-3726 for the search mode discussion.

 Introduce Japanese field type in schema.xml
 ---

 Key: SOLR-3056
 URL: https://issues.apache.org/jira/browse/SOLR-3056
 Project: Solr
  Issue Type: New Feature
  Components: Schema and Analysis
Affects Versions: 3.6, 4.0
Reporter: Christian Moen





[jira] [Created] (SOLR-3062) Solr4 Join query with fq not correctly filtering results

2012-01-26 Thread Mike Hugo (Created) (JIRA)
Solr4 Join query with fq not correctly filtering results


 Key: SOLR-3062
 URL: https://issues.apache.org/jira/browse/SOLR-3062
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Mike Hugo
 Attachments: join_filter_query_problem.patch

filter queries are not properly filtering results when using a join query in 
solr4

To replicate, use the attached patch file which contains a test method that 
will fail (but should pass).

- OR -

run the solr example:

cd solr
ant example
java -jar start.jar
cd exampledocs
java -jar post.jar *.xml

Then try a few of the sample queries on the wiki page 
http://wiki.apache.org/solr/Join.  In particular, this illustrates the issue:

Find all manufacturer docs named belkin, then join them against (product) 
docs and filter that list to only products with a price less than 12 dollars
http://localhost:8983/solr/select?q={!join+from=id+to=manu_id_s}compName_s:Belkinfq=price:%5B%2A+TO+12%5D

When you run that query, you will get two results, one with a price of 19.95 
and another with a price of 11.5. Because of the filter query, I'm only 
expecting to see one result - the one with a price of 11.5.

I was able to track this down to a change in revision #1188624.  Prior to that 
revision (i.e. 1188613 and before) the test method will pass and the example 
will work as expected.



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-3062) Solr4 Join query with fq not correctly filtering results

2012-01-26 Thread Mike Hugo (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Hugo updated SOLR-3062:


Attachment: join_filter_query_problem.patch

 Solr4 Join query with fq not correctly filtering results
 

 Key: SOLR-3062
 URL: https://issues.apache.org/jira/browse/SOLR-3062
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Mike Hugo
 Attachments: join_filter_query_problem.patch


 filter queries are not properly filtering results when using a join query in 
 solr4
 To replicate, use the attached patch file which contains a test method that 
 will fail (but should pass).
 - OR -
 run the solr example:
 cd solr
 ant example
 java -jar start.jar
 cd exampledocs
 java -jar post.jar *.xml
 Then try a few of the sample queries on the wiki page 
 http://wiki.apache.org/solr/Join.  In particular, this one illustrates the 
 issue:
 Find all manufacturer docs named belkin, then join them against (product) 
 docs and filter that list to only products with a price less than 12 dollars
 http://localhost:8983/solr/select?q={!join+from=id+to=manu_id_s}compName_s:Belkin&fq=price:%5B%2A+TO+12%5D
 When you run that query, you will get two results, one with a price of 19.95 
 and another with a price of 11.50.  Because of the filter query, I'm only 
 expecting to see one result - the one with a price of 11.50.
 I was able to track this down to a change in revision #1188624.  Prior to 
 that revision (i.e. 1188613 and before) the test method will pass and the 
 example will work as expected.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-3062) Solr4 Join query with fq not correctly filtering results

2012-01-26 Thread Mike Hugo (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Hugo updated SOLR-3062:


Description: 
filter queries are not properly filtering results when using a join query in 
solr4

To replicate, use the attached patch file which contains a test method that 
will fail (but should pass).

OR

run the solr example:

cd solr
ant example
java -jar start.jar
cd exampledocs
java -jar post.jar *.xml

Then try a few of the sample queries on the wiki page 
http://wiki.apache.org/solr/Join.  In particular, this one illustrates the issue:

Find all manufacturer docs named belkin, then join them against (product) 
docs and filter that list to only products with a price less than 12 dollars
http://localhost:8983/solr/select?q={!join+from=id+to=manu_id_s}compName_s:Belkin&fq=price:%5B%2A+TO+12%5D

When you run that query, you will get two results, one with a price of 19.95 
and another with a price of 11.50.  Because of the filter query, I'm only 
expecting to see one result - the one with a price of 11.50.

I was able to track this down to a change in revision #1188624 
(http://svn.apache.org/viewvc?view=revision&revision=1188624 : LUCENE-1536: 
Filters can now be applied down-low, if their DocIdSet implements a new bits() 
method, returning all documents in a random access way).  Prior to that 
revision (i.e. 1188613 and before) the test method will pass and the example 
will work as expected.



  was:
filter queries are not properly filtering results when using a join query in 
solr4

To replicate, use the attached patch file which contains a test method that 
will fail (but should pass).

- OR -

run the solr example:

cd solr
ant example
java -jar start.jar
cd exampledocs
java -jar post.jar *.xml

Then try a few of the sample queries on the wiki page 
http://wiki.apache.org/solr/Join.  In particular, this one illustrates the issue:

Find all manufacturer docs named belkin, then join them against (product) 
docs and filter that list to only products with a price less than 12 dollars
http://localhost:8983/solr/select?q={!join+from=id+to=manu_id_s}compName_s:Belkin&fq=price:%5B%2A+TO+12%5D

When you run that query, you will get two results, one with a price of 19.95 
and another with a price of 11.50.  Because of the filter query, I'm only 
expecting to see one result - the one with a price of 11.50.

I was able to track this down to a change in revision #1188624.  Prior to that 
revision (i.e. 1188613 and before) the test method will pass and the example 
will work as expected.




 Solr4 Join query with fq not correctly filtering results
 

 Key: SOLR-3062
 URL: https://issues.apache.org/jira/browse/SOLR-3062
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Mike Hugo
 Attachments: join_filter_query_problem.patch


 filter queries are not properly filtering results when using a join query in 
 solr4
 To replicate, use the attached patch file which contains a test method that 
 will fail (but should pass).
 OR
 run the solr example:
 cd solr
 ant example
 java -jar start.jar
 cd exampledocs
 java -jar post.jar *.xml
 Then try a few of the sample queries on the wiki page 
 http://wiki.apache.org/solr/Join.  In particular, this one illustrates the 
 issue:
 Find all manufacturer docs named belkin, then join them against (product) 
 docs and filter that list to only products with a price less than 12 dollars
 http://localhost:8983/solr/select?q={!join+from=id+to=manu_id_s}compName_s:Belkin&fq=price:%5B%2A+TO+12%5D
 When you run that query, you will get two results, one with a price of 19.95 
 and another with a price of 11.50.  Because of the filter query, I'm only 
 expecting to see one result - the one with a price of 11.50.
 I was able to track this down to a change in revision #1188624 
 (http://svn.apache.org/viewvc?view=revision&revision=1188624 : LUCENE-1536: 
 Filters can now be applied down-low, if their DocIdSet implements a new 
 bits() method, returning all documents in a random access way).  Prior to 
 that revision (i.e. 1188613 and before) the test method will pass and the 
 example will work as expected.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Welcome Tommaso Teofili as Lucene/Solr committer

2012-01-26 Thread Shai Erera
Welcome !

Shai

On Fri, Jan 27, 2012 at 12:42 AM, Michael McCandless 
luc...@mikemccandless.com wrote:

 Welcome Tommaso!


[jira] [Commented] (LUCENE-3726) Default KuromojiAnalyzer to use search mode

2012-01-26 Thread Christian Moen (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13194509#comment-13194509
 ] 

Christian Moen commented on LUCENE-3726:


These are very interesting questions, Robert.  Please find my comments below.

{quote}
should these parameters continue to be static-final, or configurable?
{quote}

It's perhaps possible to make these configurable, but I think we'd be exposing 
configuration that is most likely to confuse most users rather than help them.

The values currently used have been found through some analysis and 
experimentation, and they can probably be improved both in terms of tuning and 
with added heuristics -- in particular for katakana compounds (more below).

However, changing and improving this requires quite detailed analysis and 
testing.  I think the major case for exposing these parameters is as a means 
for easily tuning them rather than their being generally useful to users.
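For context, the general shape of the heuristic under discussion -- penalizing candidate tokens that are long runs of kanji so the Viterbi search prefers shorter segmentations -- can be sketched roughly as follows. The constants and the method name are hypothetical stand-ins, not Kuromoji's actual tuned values:

```java
// Rough sketch of a search-mode style cost penalty for long kanji runs.
// KANJI_PENALTY_LENGTH and KANJI_PENALTY are hypothetical stand-ins,
// not Kuromoji's actual tuned values.
public class SearchModePenaltySketch {

    static final int KANJI_PENALTY_LENGTH = 2;  // assumed length threshold
    static final int KANJI_PENALTY = 3000;      // assumed per-char penalty

    static boolean isKanji(char c) {
        return Character.UnicodeBlock.of(c)
            == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS;
    }

    // Extra Viterbi cost for a candidate token: pure kanji runs longer
    // than the threshold are penalized per extra character, which biases
    // the search toward shorter segmentations.
    static int penalty(String surface) {
        for (int i = 0; i < surface.length(); i++) {
            if (!isKanji(surface.charAt(i))) {
                return 0;  // only pure kanji runs are penalized in this sketch
            }
        }
        int extra = surface.length() - KANJI_PENALTY_LENGTH;
        return extra > 0 ? extra * KANJI_PENALTY : 0;
    }

    public static void main(String[] args) {
        System.out.println(penalty("関西国際空港"));  // long run: penalized
        System.out.println(penalty("空港"));          // short run: no penalty
    }
}
```

Under this sketch a six-kanji candidate like 関西国際空港 picks up a large penalty, so the decomposed path 関西 / 国際 / 空港 wins during the Viterbi search.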

{quote}
should POS also play a role in the algorithm (can/should we refine exactly what 
we decompound)?
{quote}

Very good question and an interesting idea.

In the case of long kanji words such as 関西国際空港 (Kansai International Airport), 
which is a known noun, we can possibly use POS info as a hint for applying the 
Viterbi penalty.  In the case of unknown kanji, Kuromoji unigrams them.  
(関西国際空港 becomes 関西  国際  空港 (Kansai International Airport) using search mode.)

Katakana compounds such as シニアソフトウェアエンジニア (senior software engineer) becomes 
one token without search mode, but when search mode is used, we get three 
tokens シニア  ソフトウェア  エンジニア as you would expect.  It's also the case that 
シニアソフトウェアエンジニア is an unknown word, but its constituents become known and get 
the correct POS after search mode. 

In general, unknown words get a noun-POS (名詞) so the idea of using POS here 
should be fine.

There are some problems with the katakana decompounding in search mode.  For 
example, コニカミノルタホールディングス (Konica Minolta Holdings) becomes コニカ  ミノルタ  ホール  
ディングス  (Konica Minolta horu dings), where we get the additional token ホール (which also 
means hall in Japanese).

To sum up, I think we can potentially use the noun-POS as a hint when doing the 
decompounding in search mode; I'm not sure how much we will benefit from 
it, but I like the idea.  I think we'll benefit most from an improved heuristic 
for non-kanji to improve katakana decompounding.

Let me have a tinker and see how I can improve this.

{quote}
is the Tokenizer the best place to do this, or should we do it in a 
tokenfilter? or both?
{quote}

Interesting idea and good point regarding IDF.

In order to do the decompounding, we'll need access to the lattice and add 
entries to it before we run the Viterbi.  If we do normal segmentation first 
and then run a decompounding filter, I think we'll need to run the Viterbi 
twice in order to get the desired results.  (Optimizations are possible, though.)

I'm thinking a possibility could be to expose possible decompounds as part of 
Kuromoji's Token interface.  We can potentially have something like

{code:title=Token.java}

/**
 * Returns a list of possible decompounds for this token found by a heuristic
 * 
 * @return a list of candidate decompounds, or null if none are found
 */
List<Token> getDecompounds() {
  // ...
}
{code}

In the case of シニアソフトウェアエンジニア, the current token would have surface form 
シニアソフトウェアエンジニア, but with tokens シニア, ソフトウェア and エンジニア accessible using 
{{getDecompounds()}}.
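As a usage sketch, a consumer of such an interface could emit the compound surface form followed by its constituents, much like a synonym-style expansion at the same position. The Token class below is a minimal stand-in for illustration only, not Kuromoji's actual Token class:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal stand-in for the proposed Token interface, for illustration
// only -- not Kuromoji's actual Token class.
public class DecompoundSketch {

    static class Token {
        final String surface;
        final List<Token> decompounds;
        Token(String surface, List<Token> decompounds) {
            this.surface = surface;
            this.decompounds = decompounds;
        }
        // Candidate decompounds found by the heuristic, or null if none.
        List<Token> getDecompounds() { return decompounds; }
    }

    // Emit each token's surface form, followed by its decompound
    // constituents (a synonym-style expansion of the compound).
    static List<String> expand(List<Token> tokens) {
        List<String> out = new ArrayList<>();
        for (Token t : tokens) {
            out.add(t.surface);
            if (t.getDecompounds() != null) {
                for (Token d : t.getDecompounds()) {
                    out.add(d.surface);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Token compound = new Token("シニアソフトウェアエンジニア", List.of(
            new Token("シニア", null),
            new Token("ソフトウェア", null),
            new Token("エンジニア", null)));
        System.out.println(expand(List.of(compound)));
    }
}
```

Keeping both the compound and its parts would preserve the original term for IDF purposes while still letting queries match the constituents.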

As a general notice, I should point out that how well the heuristic performs 
depends on the dictionary/statistical model used (i.e. IPADIC), and we might 
want to make different heuristics for each of the models we support, as needed.

 Default KuromojiAnalyzer to use search mode
 ---

 Key: LUCENE-3726
 URL: https://issues.apache.org/jira/browse/LUCENE-3726
 Project: Lucene - Java
  Issue Type: Improvement
Affects Versions: 3.6, 4.0
Reporter: Robert Muir

 Kuromoji supports an option to segment text in a way more suitable for search,
 by preventing long compound nouns as indexing terms.
 In general 'how you segment' can be important depending on the application 
 (see http://nlp.stanford.edu/pubs/acl-wmt08-cws.pdf for some studies on this 
 in Chinese)
 The current algorithm punishes the cost based on some parameters 
 (SEARCH_MODE_PENALTY, SEARCH_MODE_LENGTH, etc)
 for long runs of kanji.
 Some questions (these can be separate future issues if any useful ideas come 
 out):
 * should these parameters continue to be static-final, or configurable?
 * should POS also play a role in the algorithm (can/should we refine exactly 
 what we decompound)?
 * is the Tokenizer the best place to do this, or should we do it in a 
 tokenfilter? or both?
   with a tokenfilter, one idea would be to also 

[JENKINS] Lucene-Solr-tests-only-trunk - Build # 12273 - Still Failing

2012-01-26 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/12273/

4 tests failed.
FAILED:  junit.framework.TestSuite.org.apache.solr.search.TestFiltering

Error Message:
ERROR: SolrIndexSearcher opens=57 closes=58

Stack Trace:
junit.framework.AssertionFailedError: ERROR: SolrIndexSearcher opens=57 
closes=58
at 
org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:151)
at 
org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:76)


FAILED:  
junit.framework.TestSuite.org.apache.solr.cloud.ChaosMonkeySafeLeaderTest

Error Message:
ERROR: SolrIndexSearcher opens=201 closes=192

Stack Trace:
junit.framework.AssertionFailedError: ERROR: SolrIndexSearcher opens=201 
closes=192
at 
org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:151)
at 
org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:76)


FAILED:  junit.framework.TestSuite.org.apache.solr.cloud.RecoveryZkTest

Error Message:
ERROR: SolrIndexSearcher opens=13 closes=12

Stack Trace:
junit.framework.AssertionFailedError: ERROR: SolrIndexSearcher opens=13 
closes=12
at 
org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:151)
at 
org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:76)


FAILED:  
junit.framework.TestSuite.org.apache.solr.cloud.FullSolrCloudDistribCmdsTest

Error Message:
ERROR: SolrIndexSearcher opens=66 closes=67

Stack Trace:
junit.framework.AssertionFailedError: ERROR: SolrIndexSearcher opens=66 
closes=67
at 
org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:151)
at 
org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:76)




Build Log (for compile errors):
[...truncated 41583 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-tests-only-trunk-java7 - Build # 1624 - Still Failing

2012-01-26 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk-java7/1624/

5 tests failed.
FAILED:  
junit.framework.TestSuite.org.apache.solr.cloud.FullSolrCloudDistribCmdsTest

Error Message:
ERROR: SolrIndexSearcher opens=66 closes=65

Stack Trace:
junit.framework.AssertionFailedError: ERROR: SolrIndexSearcher opens=66 
closes=65
at 
org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:151)
at 
org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:76)


FAILED:  junit.framework.TestSuite.org.apache.solr.cloud.OverseerTest

Error Message:
ERROR: SolrIndexSearcher opens=0 closes=1

Stack Trace:
junit.framework.AssertionFailedError: ERROR: SolrIndexSearcher opens=0 closes=1
at 
org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:151)
at 
org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:76)


FAILED:  junit.framework.TestSuite.org.apache.solr.cloud.RecoveryZkTest

Error Message:
ERROR: SolrIndexSearcher opens=13 closes=12

Stack Trace:
junit.framework.AssertionFailedError: ERROR: SolrIndexSearcher opens=13 
closes=12
at 
org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:151)
at 
org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:76)


FAILED:  junit.framework.TestSuite.org.apache.solr.request.TestFaceting

Error Message:
ERROR: SolrIndexSearcher opens=9 closes=10

Stack Trace:
junit.framework.AssertionFailedError: ERROR: SolrIndexSearcher opens=9 closes=10
at 
org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:151)
at 
org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:76)


FAILED:  
junit.framework.TestSuite.org.apache.solr.cloud.ChaosMonkeySafeLeaderTest

Error Message:
ERROR: SolrIndexSearcher opens=126 closes=121

Stack Trace:
junit.framework.AssertionFailedError: ERROR: SolrIndexSearcher opens=126 
closes=121
at 
org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:151)
at 
org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:76)




Build Log (for compile errors):
[...truncated 59115 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org