Build failed in Jenkins: the 4547 machine gun #154

2013-01-25 Thread Charlie Cron
See http://fortyounce.servebeer.com/job/the%204547%20machine%20gun/154/

--
[...truncated 1011 lines...]
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.index.TestCheckIndex
[junit4:junit4] Completed on J3 in 0.03s, 3 tests
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.search.TestRegexpQuery
[junit4:junit4] Completed on J1 in 0.05s, 7 tests
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.search.spans.TestNearSpansOrdered
[junit4:junit4] Completed on J0 in 0.22s, 10 tests
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.index.TestDocCount
[junit4:junit4] Completed on J3 in 0.05s, 1 test
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.index.TestSumDocFreq
[junit4:junit4] Completed on J1 in 0.23s, 1 test
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.TestSearchForDuplicates
[junit4:junit4] Completed on J0 in 0.15s, 1 test
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.index.TestPerSegmentDeletes
[junit4:junit4] Completed on J3 in 0.09s, 1 test
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.index.TestIndexWriterConfig
[junit4:junit4] Completed on J1 in 0.04s, 9 tests
[junit4:junit4] 
[junit4:junit4] Suite: 
org.apache.lucene.util.junitcompat.TestBeforeAfterOverrides
[junit4:junit4] Completed on J0 in 0.03s, 2 tests
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.search.TestFilteredSearch
[junit4:junit4] Completed on J3 in 0.07s, 1 test
[junit4:junit4] 
[junit4:junit4] Suite: 
org.apache.lucene.util.junitcompat.TestSetupTeardownChaining
[junit4:junit4] Completed on J1 in 0.02s, 2 tests
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.search.TestCachingWrapperFilter
[junit4:junit4] Completed on J0 in 0.05s, 5 tests
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.search.TestDocIdSet
[junit4:junit4] Completed on J3 in 0.06s, 3 tests
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.search.TestFieldValueFilter
[junit4:junit4] Completed on J1 in 0.02s, 2 tests
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.store.TestFileSwitchDirectory
[junit4:junit4] Completed on J0 in 0.04s, 4 tests
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.search.TestBooleanScorer
[junit4:junit4] Completed on J3 in 0.02s, 3 tests
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.search.TestConstantScoreQuery
[junit4:junit4] Completed on J1 in 0.02s, 3 tests
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.util.TestRecyclingByteBlockAllocator
[junit4:junit4] Completed on J0 in 0.01s, 3 tests
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.store.TestDirectory
[junit4:junit4] IGNOR/A 0.03s J3 | TestDirectory.testThreadSafety
[junit4:junit4] Assumption #1: 'nightly' test group is disabled (@Nightly)
[junit4:junit4] Completed on J3 in 0.06s, 8 tests, 1 skipped
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.util.TestCharsRef
[junit4:junit4] Completed on J1 in 0.03s, 8 tests
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.util.junitcompat.TestCodecReported
[junit4:junit4] Completed on J0 in 0.01s, 1 test
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.search.TestBooleanOr
[junit4:junit4] Completed on J2 in 2.55s, 6 tests
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.index.TestParallelTermEnum
[junit4:junit4] Completed on J3 in 0.07s, 1 test
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.search.TestElevationComparator
[junit4:junit4] Completed on J1 in 0.02s, 1 test
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.search.TestExplanations
[junit4:junit4] Completed on J0 in 0.01s, 1 test
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.search.TestFieldCacheTermsFilter
[junit4:junit4] Completed on J2 in 0.02s, 1 test
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.search.TestMatchAllDocsQuery
[junit4:junit4] Completed on J3 in 0.01s, 2 tests
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.search.TestNot
[junit4:junit4] Completed on J1 in 0.02s, 1 test
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.search.TestSimilarity
[junit4:junit4] Completed on J0 in 0.11s, 1 test
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.search.TestSimilarityProvider
[junit4:junit4] Completed on J2 in 0.01s, 1 test
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.search.TestAutomatonQueryUnicode
[junit4:junit4] Completed on J3 in 0.07s, 1 test
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.util.TestAttributeSource
[junit4:junit4] Completed on J1 in 0.02s, 5 tests
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.search.TestTopScoreDocCollector
[junit4:junit4] Completed on J0 in 0.02s, 1 test
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.search.spans.TestSpanFirstQuery
[junit4:junit4] Completed on J2 in 0.02s, 1 test
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.util.TestBytesRef
[junit4:junit4] 

Jenkins build is back to normal : the 4547 machine gun #155

2013-01-25 Thread Charlie Cron
See http://fortyounce.servebeer.com/job/the%204547%20machine%20gun/155/


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-3177) Excluding tagged filter in StatsComponent

2013-01-25 Thread Nikolai Luthman (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Luthman updated SOLR-3177:
--

Attachment: statsfilterexclude.patch

I've made a patch for this, based on the code from the FacetComponent. The 
patch is for 3.6.1.

Might need some cleanup to get it into the latest version.

Apply by changing to solr/core/src/java/org/apache/solr/handler/component/ and 
running:
patch < statsfilterexclude.patch

 Excluding tagged filter in StatsComponent
 -

 Key: SOLR-3177
 URL: https://issues.apache.org/jira/browse/SOLR-3177
 Project: Solr
  Issue Type: Improvement
  Components: SearchComponents - other
Affects Versions: 3.5, 3.6, 4.0-ALPHA, 4.1
Reporter: Mathias H.
Priority: Minor
  Labels: localparams, stats, statscomponent
 Attachments: statsfilterexclude.patch


 It would be useful to exclude the effects of some fq params from the set of 
 documents used to compute stats -- similar to 
 how you can exclude tagged filters when generating facet counts... 
 https://wiki.apache.org/solr/SimpleFacetParameters#Tagging_and_excluding_Filters
 So that it's possible to do something like this... 
 http://localhost:8983/solr/select?fq={!tag=priceFilter}price:[1 TO 20]&q=*:*&stats=true&stats.field={!ex=priceFilter}price 
 If you want to create a price slider this is very useful, because then you can 
 filter the price ([1 TO 20]) and nevertheless get the lower and upper bound of 
 the unfiltered price (min=0, max=100):
 {noformat}
 |-[---]--|
 $0 $1 $20 $100
 {noformat}
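The example request above, whose `&` separators were lost in archiving, can be assembled as follows. This is an illustrative sketch only: the host, port, and field names are the ones from the example, and no SolrJ client is required.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Builds the tagged-filter / stats-exclusion request from the example:
// the fq is tagged "priceFilter", and stats.field excludes that tag so
// stats are computed over the set of documents without that filter.
public class StatsExcludeQuery {

    static String enc(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    static String buildUrl() {
        return "http://localhost:8983/solr/select"
            + "?fq=" + enc("{!tag=priceFilter}price:[1 TO 20]")
            + "&q=" + enc("*:*")
            + "&stats=true"
            + "&stats.field=" + enc("{!ex=priceFilter}price");
    }

    public static void main(String[] args) {
        System.out.println(buildUrl());
    }
}
```

In a real client one would normally let a Solr client library handle the parameter encoding; the point here is only the `{!tag=...}` / `{!ex=...}` local-params pairing.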

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (SOLR-4356) SOLR 4.1 Out Of Memory error After commit of a few thousand Solr Docs

2013-01-25 Thread Harish Verma (JIRA)
Harish Verma created SOLR-4356:
--

 Summary: SOLR 4.1 Out Of Memory error After commit of a few 
thousand Solr Docs
 Key: SOLR-4356
 URL: https://issues.apache.org/jira/browse/SOLR-4356
 Project: Solr
  Issue Type: Improvement
  Components: clients - java, Schema and Analysis, Tests
Affects Versions: 4.1
 Environment: OS = Ubuntu 12.04
Sun JAVA 7
Max Java Heap Space = 2GB
Apache Tomcat 7
Hardware = {Intel core i3, 2GB RAM}
Average no of fields in a Solr Doc = 100

Reporter: Harish Verma
 Fix For: 4.1.1


we are testing solr 4.1 running inside tomcat 7 and java 7 with the following 
options:

JAVA_OPTS=-Xms256m -Xmx2048m -XX:MaxPermSize=1024m -XX:+UseConcMarkSweepGC 
-XX:+CMSIncrementalMode -XX:+ParallelRefProcEnabled 
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/home/ubuntu/OOM_HeapDump

our source code looks like the following:
/* START */
int noOfSolrDocumentsInBatch = 0;
for (int i = 0; i < 5000; i++) {
    SolrInputDocument solrInputDocument = getNextSolrInputDocument();
    server.add(solrInputDocument);
    noOfSolrDocumentsInBatch += 1;
    if (noOfSolrDocumentsInBatch == 10) {
        server.commit();
        noOfSolrDocumentsInBatch = 0;
    }
}
/* END */

the method getNextSolrInputDocument() generates a solr document with 100 
fields (average). Around 50 of the fields are of text_general type.
Some of the text_general fields consist of approx 1000 words; the rest consist 
of a few words. Out of the total fields there are around 35-40 multivalued 
fields (not of type text_general).
We are indexing all the fields but storing only 8 fields. Out of these 8 
fields, two are string type, five are long and one is boolean. So our index 
size is only 394 MB. But the RAM occupied at the time of OOM is around 2.5 GB. 
Why is the memory usage so high even though the index size is small?
What is being stored in memory? Our understanding is that after every commit 
documents are flushed to disk, so nothing should remain in RAM after a commit.

We are using the following settings:

server.commit() is called with waitForSearcher=true and waitForFlush=true
solrConfig.xml has the following properties set:
directoryFactory = solr.MMapDirectoryFactory
maxWarmingSearchers = 1
text_general data type is being used as supplied in the schema.xml with the 
solr setup.
maxIndexingThreads = 8 (default)
<autoCommit><maxTime>15000</maxTime><openSearcher>false</openSearcher></autoCommit>
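The autoCommit setting above lost its XML angle brackets in archiving; as a sketch, this is the same fragment as it would appear in solrconfig.xml (values taken from the report, standard Solr 4.x syntax):

```xml
<!-- Hard-commit pending documents every 15 seconds without opening a new
     searcher; openSearcher=false keeps the commit cheap while still
     flushing the in-memory indexing buffer to disk. -->
<autoCommit>
  <maxTime>15000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
```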

We get a Java heap Out Of Memory Error after committing around 3990 solr 
documents. Some snapshots of the memory dump from the profiler are attached.

Can somebody please suggest what we should do to minimize/optimize the memory 
consumption in our case, with reasons?
Also, what would be optimal values (and why) for the following parameters 
of solrConfig.xml:
useColdSearcher - true/false?
maxWarmingSearchers - number
spellcheck - on/off?
omitNorms = true/false?
omitTermFreqAndPositions?
mergeFactor? (we are using the default value 10)
java garbage collection tuning parameters?




[jira] [Updated] (SOLR-4352) Velocity-base pagination should support/preserve sorting

2013-01-25 Thread Erik Hatcher (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Hatcher updated SOLR-4352:
---

Attachment: SOLR-4352-erik.patch

Eric - how about this patch?   It allows the sort parameter(s) to stick around 
on facet selections as well, not just pagination.

 Velocity-base pagination should support/preserve sorting
 

 Key: SOLR-4352
 URL: https://issues.apache.org/jira/browse/SOLR-4352
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Eric Spiegelberg
Assignee: Erik Hatcher
 Attachments: SOLR-4352-erik.patch, SOLR-4352.patch


 When performing /browse, the Velocity generated UI does not support sorting 
 in the generated pagination links.
 The link_to_previous_page and link_to_next_page macros found within 
 [apache-solr-4.0.0]/example/solr/collection1/conf/velocity/VM_global_library.vm
  should be modified to maintain/preserve an existing sort parameter.




[jira] [Created] (SOLR-4357) Default field in query syntax documentation has confusing error

2013-01-25 Thread Hayden Muhl (JIRA)
Hayden Muhl created SOLR-4357:
-

 Summary: Default field in query syntax documentation has confusing 
error
 Key: SOLR-4357
 URL: https://issues.apache.org/jira/browse/SOLR-4357
 Project: Solr
  Issue Type: Bug
  Components: documentation
Affects Versions: 4.0
Reporter: Hayden Muhl
Priority: Trivial
 Fix For: 4.0.1


The explanation of default search fields uses two different queries that are 
supposed to be semantically the same, but the query text changes between the 
two examples.




[jira] [Updated] (SOLR-4357) Default field in query syntax documentation has confusing error

2013-01-25 Thread Hayden Muhl (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hayden Muhl updated SOLR-4357:
--

Attachment: SOLR-4357.patch

Small fix for documentation.

 Default field in query syntax documentation has confusing error
 ---

 Key: SOLR-4357
 URL: https://issues.apache.org/jira/browse/SOLR-4357
 Project: Solr
  Issue Type: Bug
  Components: documentation
Affects Versions: 4.0
Reporter: Hayden Muhl
Priority: Trivial
  Labels: documentation
 Fix For: 4.0.1

 Attachments: SOLR-4357.patch

   Original Estimate: 5m
  Remaining Estimate: 5m

 The explanation of default search fields uses two different queries that are 
 supposed to be semantically the same, but the query text changes between the 
 two examples.




[jira] [Moved] (LUCENE-4718) Default field in query syntax documentation has confusing error

2013-01-25 Thread Erik Hatcher (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Hatcher moved SOLR-4357 to LUCENE-4718:


  Component/s: (was: documentation)
   core/queryparser
Fix Version/s: (was: 4.0.1)
   4.0.1
Lucene Fields: New,Patch Available
Affects Version/s: (was: 4.0)
   4.0
  Key: LUCENE-4718  (was: SOLR-4357)
  Project: Lucene - Core  (was: Solr)

 Default field in query syntax documentation has confusing error
 ---

 Key: LUCENE-4718
 URL: https://issues.apache.org/jira/browse/LUCENE-4718
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/queryparser
Affects Versions: 4.0
Reporter: Hayden Muhl
Priority: Trivial
  Labels: documentation
 Fix For: 4.0.1

 Attachments: SOLR-4357.patch

   Original Estimate: 5m
  Remaining Estimate: 5m

 The explanation of default search fields uses two different queries that are 
 supposed to be semantically the same, but the query text changes between the 
 two examples.




[jira] [Issue Comment Deleted] (SOLR-4356) SOLR 4.1 Out Of Memory error After commit of a few thousand Solr Docs

2013-01-25 Thread Harish Verma (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Verma updated SOLR-4356:
---

Comment: was deleted

(was: screenshots of Memory Dump)

 SOLR 4.1 Out Of Memory error After commit of a few thousand Solr Docs
 -

 Key: SOLR-4356
 URL: https://issues.apache.org/jira/browse/SOLR-4356
 Project: Solr
  Issue Type: Improvement
  Components: clients - java, Schema and Analysis, Tests
Affects Versions: 4.1
 Environment: OS = Ubuntu 12.04
 Sun JAVA 7
 Max Java Heap Space = 2GB
 Apache Tomcat 7
 Hardware = {Intel core i3, 2GB RAM}
 Average no of fields in a Solr Doc = 100
Reporter: Harish Verma
  Labels: performance, test
 Fix For: 4.1.1

 Attachments: memorydump1.png, memorydump2.png

   Original Estimate: 168h
  Remaining Estimate: 168h

 we are testing solr 4.1 running inside tomcat 7 and java 7 with the following 
 options:
 JAVA_OPTS=-Xms256m -Xmx2048m -XX:MaxPermSize=1024m -XX:+UseConcMarkSweepGC 
 -XX:+CMSIncrementalMode -XX:+ParallelRefProcEnabled 
 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/home/ubuntu/OOM_HeapDump
 our source code looks like the following:
 /* START */
 int noOfSolrDocumentsInBatch = 0;
 for (int i = 0; i < 5000; i++) {
     SolrInputDocument solrInputDocument = getNextSolrInputDocument();
     server.add(solrInputDocument);
     noOfSolrDocumentsInBatch += 1;
     if (noOfSolrDocumentsInBatch == 10) {
         server.commit();
         noOfSolrDocumentsInBatch = 0;
     }
 }
 /* END */
 the method getNextSolrInputDocument() generates a solr document with 100 
 fields (average). Around 50 of the fields are of text_general type.
 Some of the text_general fields consist of approx 1000 words; the rest 
 consist of a few words. Out of the total fields there are around 35-40 
 multivalued fields (not of type text_general).
 We are indexing all the fields but storing only 8 fields. Out of these 8 
 fields, two are string type, five are long and one is boolean. So our index 
 size is only 394 MB. But the RAM occupied at the time of OOM is around 
 2.5 GB. Why is the memory usage so high even though the index size is small?
 What is being stored in memory? Our understanding is that after every 
 commit documents are flushed to disk, so nothing should remain in RAM 
 after a commit.
 We are using the following settings:
 server.commit() is called with waitForSearcher=true and waitForFlush=true
 solrConfig.xml has the following properties set:
 directoryFactory = solr.MMapDirectoryFactory
 maxWarmingSearchers = 1
 text_general data type is being used as supplied in the schema.xml with the 
 solr setup.
 maxIndexingThreads = 8 (default)
 <autoCommit><maxTime>15000</maxTime><openSearcher>false</openSearcher></autoCommit>
 We get a Java heap Out Of Memory Error after committing around 3990 solr 
 documents. Some snapshots of the memory dump from the profiler are attached.
 Can somebody please suggest what we should do to minimize/optimize the memory 
 consumption in our case, with reasons?
 Also, what would be optimal values (and why) for the following 
 parameters of solrConfig.xml:
 useColdSearcher - true/false?
 maxWarmingSearchers - number
 spellcheck - on/off?
 omitNorms = true/false?
 omitTermFreqAndPositions?
 mergeFactor? (we are using the default value 10)
 java garbage collection tuning parameters?




[jira] [Updated] (SOLR-4356) SOLR 4.1 Out Of Memory error After commit of a few thousand Solr Docs

2013-01-25 Thread Harish Verma (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Verma updated SOLR-4356:
---

Attachment: memorydump2.png
memorydump1.png

screenshots of Memory Dump

 SOLR 4.1 Out Of Memory error After commit of a few thousand Solr Docs
 -

 Key: SOLR-4356
 URL: https://issues.apache.org/jira/browse/SOLR-4356
 Project: Solr
  Issue Type: Improvement
  Components: clients - java, Schema and Analysis, Tests
Affects Versions: 4.1
 Environment: OS = Ubuntu 12.04
 Sun JAVA 7
 Max Java Heap Space = 2GB
 Apache Tomcat 7
 Hardware = {Intel core i3, 2GB RAM}
 Average no of fields in a Solr Doc = 100
Reporter: Harish Verma
  Labels: performance, test
 Fix For: 4.1.1

 Attachments: memorydump1.png, memorydump2.png

   Original Estimate: 168h
  Remaining Estimate: 168h

 we are testing solr 4.1 running inside tomcat 7 and java 7 with the following 
 options:
 JAVA_OPTS=-Xms256m -Xmx2048m -XX:MaxPermSize=1024m -XX:+UseConcMarkSweepGC 
 -XX:+CMSIncrementalMode -XX:+ParallelRefProcEnabled 
 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/home/ubuntu/OOM_HeapDump
 our source code looks like the following:
 /* START */
 int noOfSolrDocumentsInBatch = 0;
 for (int i = 0; i < 5000; i++) {
     SolrInputDocument solrInputDocument = getNextSolrInputDocument();
     server.add(solrInputDocument);
     noOfSolrDocumentsInBatch += 1;
     if (noOfSolrDocumentsInBatch == 10) {
         server.commit();
         noOfSolrDocumentsInBatch = 0;
     }
 }
 /* END */
 the method getNextSolrInputDocument() generates a solr document with 100 
 fields (average). Around 50 of the fields are of text_general type.
 Some of the text_general fields consist of approx 1000 words; the rest 
 consist of a few words. Out of the total fields there are around 35-40 
 multivalued fields (not of type text_general).
 We are indexing all the fields but storing only 8 fields. Out of these 8 
 fields, two are string type, five are long and one is boolean. So our index 
 size is only 394 MB. But the RAM occupied at the time of OOM is around 
 2.5 GB. Why is the memory usage so high even though the index size is small?
 What is being stored in memory? Our understanding is that after every 
 commit documents are flushed to disk, so nothing should remain in RAM 
 after a commit.
 We are using the following settings:
 server.commit() is called with waitForSearcher=true and waitForFlush=true
 solrConfig.xml has the following properties set:
 directoryFactory = solr.MMapDirectoryFactory
 maxWarmingSearchers = 1
 text_general data type is being used as supplied in the schema.xml with the 
 solr setup.
 maxIndexingThreads = 8 (default)
 <autoCommit><maxTime>15000</maxTime><openSearcher>false</openSearcher></autoCommit>
 We get a Java heap Out Of Memory Error after committing around 3990 solr 
 documents. Some snapshots of the memory dump from the profiler are attached.
 Can somebody please suggest what we should do to minimize/optimize the memory 
 consumption in our case, with reasons?
 Also, what would be optimal values (and why) for the following 
 parameters of solrConfig.xml:
 useColdSearcher - true/false?
 maxWarmingSearchers - number
 spellcheck - on/off?
 omitNorms = true/false?
 omitTermFreqAndPositions?
 mergeFactor? (we are using the default value 10)
 java garbage collection tuning parameters?




Re: Build failed in Jenkins: the 4547 machine gun #53

2013-01-25 Thread Adrien Grand
On Fri, Jan 25, 2013 at 2:30 AM, Robert Muir rcm...@gmail.com wrote:
 I think the bug is when it's direct and the last block has the
 optimized bitsPerValue=0 case.

Right. This was a test bug due to the fact that the reader moves the
file pointer in the for (i = 0; i < valueCount; ++i) { assert blah }
loop. I committed a fix.
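The pitfall Adrien describes can be sketched generically: a verification loop that reads from the stream advances the shared position, so any later read resumes from wherever the loop stopped. This is not the actual packed-ints reader from the branch; DataInputStream stands in purely for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

public class FilePointerPitfall {
    // Reads valueCount bytes, checking each one. Every read advances the
    // shared stream position, so each value must be read exactly once.
    static int consume(DataInputStream in, int valueCount) throws IOException {
        for (int i = 0; i < valueCount; ++i) {
            int v = in.read();   // advances the "file pointer"
            if (v != i) throw new AssertionError("unexpected value " + v);
        }
        return in.read();        // position after the loop: EOF here
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {0, 1, 2, 3, 4};
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        // After the verification loop the stream is exhausted, so a test that
        // re-reads the same region afterwards would silently see EOF (-1).
        System.out.println(consume(in, data.length)); // prints -1
    }
}
```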

Thanks for running tests on this branch!

-- 
Adrien




Build failed in Jenkins: the 4547 machine gun #206

2013-01-25 Thread Charlie Cron
See http://fortyounce.servebeer.com/job/the%204547%20machine%20gun/206/

--
[...truncated 966 lines...]
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.index.TestPayloads
[junit4:junit4] Completed on J1 in 0.22s, 7 tests
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.search.TestFuzzyQuery
[junit4:junit4] Completed on J2 in 0.06s, 6 tests
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.index.TestOmitPositions
[junit4:junit4] Completed on J0 in 0.32s, 4 tests
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.util.TestRollingBuffer
[junit4:junit4] Completed on J3 in 0.06s, 1 test
[junit4:junit4] 
[junit4:junit4] Suite: 
org.apache.lucene.util.junitcompat.TestSystemPropertiesInvariantRule
[junit4:junit4] Completed on J2 in 0.07s, 5 tests
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.index.TestSizeBoundedForceMerge
[junit4:junit4] Completed on J0 in 0.06s, 11 tests
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.search.TestWildcardRandom
[junit4:junit4] Completed on J3 in 0.03s, 1 test
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.search.TestRegexpQuery
[junit4:junit4] Completed on J2 in 0.04s, 7 tests
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.search.spans.TestNearSpansOrdered
[junit4:junit4] Completed on J0 in 0.25s, 10 tests
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.util.automaton.TestSpecialOperations
[junit4:junit4] Completed on J3 in 0.23s, 2 tests
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.index.TestDocCount
[junit4:junit4] Completed on J2 in 0.03s, 1 test
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.index.TestSumDocFreq
[junit4:junit4] Completed on J0 in 0.11s, 1 test
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.index.TestPerSegmentDeletes
[junit4:junit4] Completed on J2 in 0.08s, 1 test
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.util.TestSmallFloat
[junit4:junit4] Completed on J0 in 0.05s, 2 tests
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.index.TestParallelReaderEmptyIndex
[junit4:junit4] Completed on J2 in 0.08s, 2 tests
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.index.TestIndexWriterConfig
[junit4:junit4] Completed on J0 in 0.04s, 9 tests
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.TestSearchForDuplicates
[junit4:junit4] Completed on J3 in 0.61s, 1 test
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.util.TestSetOnce
[junit4:junit4] Completed on J2 in 0.02s, 4 tests
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.search.TestFilteredSearch
[junit4:junit4] Completed on J0 in 0.02s, 1 test
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.index.TestNoDeletionPolicy
[junit4:junit4] Completed on J3 in 0.18s, 4 tests
[junit4:junit4] 
[junit4:junit4] Suite: 
org.apache.lucene.util.junitcompat.TestSetupTeardownChaining
[junit4:junit4] Completed on J2 in 0.02s, 2 tests
[junit4:junit4] 
[junit4:junit4] Suite: 
org.apache.lucene.util.junitcompat.TestSameRandomnessLocalePassedOrNot
[junit4:junit4] Completed on J0 in 0.02s, 1 test
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.search.TestBooleanOr
[junit4:junit4] Completed on J1 in 1.95s, 6 tests
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.search.TestSubScorerFreqs
[junit4:junit4] Completed on J3 in 0.02s, 3 tests
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.search.TestDateFilter
[junit4:junit4] Completed on J2 in 0.02s, 2 tests
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.util.junitcompat.TestSeedFromUncaught
[junit4:junit4] Completed on J0 in 0.03s, 1 test
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.search.spans.TestSpansAdvanced
[junit4:junit4] Completed on J1 in 0.10s, 1 test
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.search.TestBooleanScorer
[junit4:junit4] Completed on J3 in 0.08s, 3 tests
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.search.TestPhrasePrefixQuery
[junit4:junit4] Completed on J2 in 0.01s, 1 test
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.util.TestRecyclingByteBlockAllocator
[junit4:junit4] Completed on J0 in 0.03s, 3 tests
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.document.TestDateTools
[junit4:junit4] Completed on J1 in 0.04s, 5 tests
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.util.junitcompat.TestJUnitRuleOrder
[junit4:junit4] Completed on J3 in 0.03s, 1 test
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.util.junitcompat.TestCodecReported
[junit4:junit4] Completed on J2 in 0.02s, 1 test
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.index.TestReaderClosed
[junit4:junit4] Completed on J0 in 0.13s, 2 tests
[junit4:junit4] 
[junit4:junit4] Suite: org.apache.lucene.search.TestMatchAllDocsQuery
[junit4:junit4] Completed on J1 in 0.22s, 2 tests
[junit4:junit4] 
[junit4:junit4] Suite: 

Jenkins build is back to normal : the 4547 machine gun #207

2013-01-25 Thread Charlie Cron
See http://fortyounce.servebeer.com/job/the%204547%20machine%20gun/207/





[jira] [Commented] (LUCENE-4716) Add OR support to DrillDown

2013-01-25 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562648#comment-13562648
 ] 

Commit Tag Bot commented on LUCENE-4716:


[trunk commit] Shai Erera
http://svn.apache.org/viewvc?view=revision&revision=1438485

LUCENE-4716: Add OR support to DrillDown


 Add OR support to DrillDown
 ---

 Key: LUCENE-4716
 URL: https://issues.apache.org/jira/browse/LUCENE-4716
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Shai Erera
Assignee: Shai Erera
 Attachments: LUCENE-4716.patch


 DrillDown provides helper methods to wrap a baseQuery with drill-down 
 categories. All the categories are AND'ed, and it has been asked on the user 
 list for OR support. While users can construct their own BooleanQuery, it 
 would be useful if DrillDown helped them doing that. I think that a simple 
 Occur additional parameter to DrillDown.query will help to some extent.




[jira] [Resolved] (LUCENE-4716) Add OR support to DrillDown

2013-01-25 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera resolved LUCENE-4716.


   Resolution: Fixed
Fix Version/s: 5.0
   4.2
Lucene Fields: New,Patch Available  (was: New)

Committed to trunk and 4x

 Add OR support to DrillDown
 ---

 Key: LUCENE-4716
 URL: https://issues.apache.org/jira/browse/LUCENE-4716
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Shai Erera
Assignee: Shai Erera
 Fix For: 4.2, 5.0

 Attachments: LUCENE-4716.patch


 DrillDown provides helper methods to wrap a baseQuery with drill-down 
 categories. All the categories are AND'ed, and it has been asked on the user 
 list for OR support. While users can construct their own BooleanQuery, it 
 would be useful if DrillDown helped them doing that. I think that a simple 
 Occur additional parameter to DrillDown.query will help to some extent.




[jira] [Commented] (LUCENE-4716) Add OR support to DrillDown

2013-01-25 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562661#comment-13562661
 ] 

Commit Tag Bot commented on LUCENE-4716:


[branch_4x commit] Shai Erera
http://svn.apache.org/viewvc?view=revision&revision=1438491

LUCENE-4716: Add OR support to DrillDown


 Add OR support to DrillDown
 ---

 Key: LUCENE-4716
 URL: https://issues.apache.org/jira/browse/LUCENE-4716
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Shai Erera
Assignee: Shai Erera
 Fix For: 4.2, 5.0

 Attachments: LUCENE-4716.patch


 DrillDown provides helper methods to wrap a baseQuery with drill-down 
 categories. All the categories are AND'ed, and it has been asked on the user 
 list for OR support. While users can construct their own BooleanQuery, it 
 would be useful if DrillDown helped them doing that. I think that a simple 
 Occur additional parameter to DrillDown.query will help to some extent.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Build failed in Jenkins: the 4547 machine gun #206

2013-01-25 Thread Michael McCandless
I'll dig.

Mike McCandless

http://blog.mikemccandless.com


On Fri, Jan 25, 2013 at 7:46 AM, Charlie Cron hudsonsevilt...@gmail.com wrote:
 See http://fortyounce.servebeer.com/job/the%204547%20machine%20gun/206/

 --
 [...truncated 966 lines...]
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.index.TestPayloads
 [junit4:junit4] Completed on J1 in 0.22s, 7 tests
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.search.TestFuzzyQuery
 [junit4:junit4] Completed on J2 in 0.06s, 6 tests
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.index.TestOmitPositions
 [junit4:junit4] Completed on J0 in 0.32s, 4 tests
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.util.TestRollingBuffer
 [junit4:junit4] Completed on J3 in 0.06s, 1 test
 [junit4:junit4]
 [junit4:junit4] Suite: 
 org.apache.lucene.util.junitcompat.TestSystemPropertiesInvariantRule
 [junit4:junit4] Completed on J2 in 0.07s, 5 tests
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.index.TestSizeBoundedForceMerge
 [junit4:junit4] Completed on J0 in 0.06s, 11 tests
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.search.TestWildcardRandom
 [junit4:junit4] Completed on J3 in 0.03s, 1 test
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.search.TestRegexpQuery
 [junit4:junit4] Completed on J2 in 0.04s, 7 tests
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.search.spans.TestNearSpansOrdered
 [junit4:junit4] Completed on J0 in 0.25s, 10 tests
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.util.automaton.TestSpecialOperations
 [junit4:junit4] Completed on J3 in 0.23s, 2 tests
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.index.TestDocCount
 [junit4:junit4] Completed on J2 in 0.03s, 1 test
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.index.TestSumDocFreq
 [junit4:junit4] Completed on J0 in 0.11s, 1 test
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.index.TestPerSegmentDeletes
 [junit4:junit4] Completed on J2 in 0.08s, 1 test
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.util.TestSmallFloat
 [junit4:junit4] Completed on J0 in 0.05s, 2 tests
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.index.TestParallelReaderEmptyIndex
 [junit4:junit4] Completed on J2 in 0.08s, 2 tests
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.index.TestIndexWriterConfig
 [junit4:junit4] Completed on J0 in 0.04s, 9 tests
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.TestSearchForDuplicates
 [junit4:junit4] Completed on J3 in 0.61s, 1 test
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.util.TestSetOnce
 [junit4:junit4] Completed on J2 in 0.02s, 4 tests
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.search.TestFilteredSearch
 [junit4:junit4] Completed on J0 in 0.02s, 1 test
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.index.TestNoDeletionPolicy
 [junit4:junit4] Completed on J3 in 0.18s, 4 tests
 [junit4:junit4]
 [junit4:junit4] Suite: 
 org.apache.lucene.util.junitcompat.TestSetupTeardownChaining
 [junit4:junit4] Completed on J2 in 0.02s, 2 tests
 [junit4:junit4]
 [junit4:junit4] Suite: 
 org.apache.lucene.util.junitcompat.TestSameRandomnessLocalePassedOrNot
 [junit4:junit4] Completed on J0 in 0.02s, 1 test
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.search.TestBooleanOr
 [junit4:junit4] Completed on J1 in 1.95s, 6 tests
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.search.TestSubScorerFreqs
 [junit4:junit4] Completed on J3 in 0.02s, 3 tests
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.search.TestDateFilter
 [junit4:junit4] Completed on J2 in 0.02s, 2 tests
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.util.junitcompat.TestSeedFromUncaught
 [junit4:junit4] Completed on J0 in 0.03s, 1 test
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.search.spans.TestSpansAdvanced
 [junit4:junit4] Completed on J1 in 0.10s, 1 test
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.search.TestBooleanScorer
 [junit4:junit4] Completed on J3 in 0.08s, 3 tests
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.search.TestPhrasePrefixQuery
 [junit4:junit4] Completed on J2 in 0.01s, 1 test
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.util.TestRecyclingByteBlockAllocator
 [junit4:junit4] Completed on J0 in 0.03s, 3 tests
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.document.TestDateTools
 [junit4:junit4] Completed on J1 in 0.04s, 5 tests
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.util.junitcompat.TestJUnitRuleOrder
 [junit4:junit4] Completed on J3 in 0.03s, 1 test
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.util.junitcompat.TestCodecReported
 [junit4:junit4] Completed on J2 in 0.02s, 1 test
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.index.TestReaderClosed
 [junit4:junit4] Completed on J0 in 0.13s, 2 

[jira] [Commented] (SOLR-4354) Replication should perform full copy if slave's generation higher than master's

2013-01-25 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562690#comment-13562690
 ] 

Shalin Shekhar Mangar commented on SOLR-4354:
-

Amit, I don't completely understand the problem.

bq. Slave now tries to pull from master B (has higher index version than slave 
but lower generation)

Say, slave has generation G and version V and master(B) has a higher version 
V+1 but lower generation G-1. The code right now says:
{code}
boolean isFullCopyNeeded = IndexDeletionPolicyWrapper
    .getCommitTimestamp(commit) >= latestVersion
    || commit.getGeneration() >= latestGeneration || forceReplication;
{code}

Since master's generation is lower than slave, a full copy will be forced here. 
Further, your patch has:
{code}
-  || commit.getGeneration() >= latestGeneration || forceReplication;
+  || commit.getGeneration() >= latestGeneration || 
(commit.getGeneration() > latestGeneration) || forceReplication;
{code}

I don't see how that changes anything. The second condition on generation is 
redundant. Did I miss something?
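
The redundancy can be checked mechanically; assuming the reconstructed `>=`/`>` operators above, the two conditions are equivalent for every pair of generations (names here are illustrative, not the actual Solr code):

```java
// Demonstrates that adding (gen > latest) to (gen >= latest) never
// changes the outcome of the full-copy decision.
public class GenerationCheck {
    // Shape of the current condition: full copy when the slave's
    // commit generation is >= the master's latest generation.
    static boolean original(long commitGen, long latestGen) {
        return commitGen >= latestGen;
    }

    // The patched condition adds (commitGen > latestGen), which is
    // already implied by >= and therefore redundant.
    static boolean patched(long commitGen, long latestGen) {
        return commitGen >= latestGen || commitGen > latestGen;
    }

    public static void main(String[] args) {
        for (long g = 0; g < 5; g++) {
            for (long l = 0; l < 5; l++) {
                if (original(g, l) != patched(g, l)) {
                    throw new AssertionError("differ at " + g + "," + l);
                }
            }
        }
        System.out.println("conditions identical");
    }
}
```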

 Replication should perform full copy if slave's generation higher than 
 master's
 ---

 Key: SOLR-4354
 URL: https://issues.apache.org/jira/browse/SOLR-4354
 Project: Solr
  Issue Type: Bug
  Components: replication (java)
Affects Versions: 4.1
Reporter: Amit Nithian
 Fix For: 4.2

 Attachments: SOLR-4354.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 We have dual masters each incrementally indexing from our MySQL database and 
 sit behind a virtual hostname in our load balancer. As such, it's possible 
 that the generation numbers between the masters for a given index are not in 
 sync. Slaves are configured to replicate from this virtual host (and pin 
 based on source/dest IP hash) so we can add and remove masters as necessary 
 (great for maintenance). 
 For the most part this works but we've seen the following happen:
 * Slave has been pulling from master A
 * Master A goes down for maintenance, so the slave now pulls from master B 
 (which for some reason has a lower generation number than master A).
 * Slave now tries to pull from master B (has higher index version than slave 
 but lower generation).
 * Slave downloads index files, moves them to the index/ directory but these 
 files are deleted during the doCommit() phase (looks like older generation 
 data is deleted).
 * Index remains as-is and no change.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4708) Make LZ4 hash tables reusable

2013-01-25 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562693#comment-13562693
 ] 

Commit Tag Bot commented on LUCENE-4708:


[trunk commit] Adrien Grand
http://svn.apache.org/viewvc?view=revision&revision=1438519

LUCENE-4708: Reuse LZ4 hash tables across calls.



 Make LZ4 hash tables reusable
 -

 Key: LUCENE-4708
 URL: https://issues.apache.org/jira/browse/LUCENE-4708
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor
 Attachments: LUCENE-4708.patch


 Currently LZ4 compressors instantiate their own hash table for every byte 
 sequence they need to compress. These can be large (256KB for LZ4 HC) so we 
 should try to reuse them across calls.
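
The idea can be sketched roughly as follows (a simplified illustration, not the actual Lucene LZ4 code; class and field names are made up):

```java
import java.util.Arrays;

// Sketch of reusing a scratch hash table across compress() calls
// instead of allocating a fresh one for every byte sequence.
public class ReusableCompressor {
    private static final int HASH_TABLE_SIZE = 1 << 16;
    int[] hashTable; // lazily allocated once, then reused

    public int compress(byte[] input) {
        if (hashTable == null) {
            hashTable = new int[HASH_TABLE_SIZE]; // allocate on first use
        } else {
            Arrays.fill(hashTable, 0); // clearing is cheaper than reallocating
        }
        // Real LZ4 match-finding would use hashTable here; we return a
        // stand-in value so the sketch stays runnable.
        return input.length;
    }

    public static void main(String[] args) {
        ReusableCompressor c = new ReusableCompressor();
        c.compress(new byte[10]);
        int[] first = c.hashTable;
        c.compress(new byte[20]);
        // The same array instance is reused on the second call.
        if (first != c.hashTable) throw new AssertionError("table was reallocated");
        System.out.println("hash table reused");
    }
}
```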

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4708) Make LZ4 hash tables reusable

2013-01-25 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562705#comment-13562705
 ] 

Commit Tag Bot commented on LUCENE-4708:


[branch_4x commit] Adrien Grand
http://svn.apache.org/viewvc?view=revision&revision=1438524

LUCENE-4708: Reuse LZ4 hash tables across calls (merged from r1438519).



 Make LZ4 hash tables reusable
 -

 Key: LUCENE-4708
 URL: https://issues.apache.org/jira/browse/LUCENE-4708
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor
 Attachments: LUCENE-4708.patch


 Currently LZ4 compressors instantiate their own hash table for every byte 
 sequence they need to compress. These can be large (256KB for LZ4 HC) so we 
 should try to reuse them across calls.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-4708) Make LZ4 hash tables reusable

2013-01-25 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand resolved LUCENE-4708.
--

Resolution: Fixed

 Make LZ4 hash tables reusable
 -

 Key: LUCENE-4708
 URL: https://issues.apache.org/jira/browse/LUCENE-4708
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor
 Attachments: LUCENE-4708.patch


 Currently LZ4 compressors instantiate their own hash table for every byte 
 sequence they need to compress. These can be large (256KB for LZ4 HC) so we 
 should try to reuse them across calls.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Fixing query-time multi-word synonym issue

2013-01-25 Thread Jack Krupansky

Here's an example query with q.op=AND:

   causes of heart attack

And I have this synonym definition:

   heart attack, myocardial infarction

So, what is the alleged query parser fix so that the query is treated as:

   causes of (heart attack OR myocardial infarction)

The core problem with the synonym filter is that it mashes all the terms of 
a multi-term synonym to be at the same position so that the order (attack 
after heart and infarction after myocardial) is lost.


What is needed is a synonym filter with a notion of path so the term 
sequences for each of the synonym alternatives is available for the query 
parser to generate the OR alternative queries.
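
The loss of ordering can be shown with a toy model of the flattened token graph (illustrative only, not the actual SynonymFilter data structures):

```java
import java.util.*;

// Toy model of what stacking multi-term synonyms at the same positions
// does to "heart attack" with synonym "myocardial infarction": once the
// alternatives share positions, the per-alternative paths are gone.
public class SynonymFlattening {
    static List<String> phrases() {
        Map<Integer, List<String>> byPosition = new TreeMap<>();
        byPosition.put(0, Arrays.asList("heart", "myocardial"));
        byPosition.put(1, Arrays.asList("attack", "infarction"));

        // Enumerate every phrase the flattened graph admits.
        List<String> phrases = new ArrayList<>();
        for (String a : byPosition.get(0))
            for (String b : byPosition.get(1))
                phrases.add(a + " " + b);
        return phrases;
    }

    public static void main(String[] args) {
        // Cross-term phrases like "heart infarction" are indistinguishable
        // from the two intended alternatives.
        System.out.println(phrases());
    }
}
```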


Granted, the query parser ALSO needs to present the full sequence of terms 
to the analyzer as one string causes of heart attack, but that alone 
doesn't address the synonym filter misbehavior.


-- Jack Krupansky

-Original Message- 
From: Robert Muir

Sent: Friday, January 25, 2013 3:46 AM
To: dev@lucene.apache.org
Subject: Re: Fixing query-time multi-word synonym issue

On Fri, Jan 25, 2013 at 12:48 AM, Jack Krupansky
j...@basetechnology.com wrote:

Otis, this is precisely why nothing will get done any time soon on the
multi-term synonym issue - there isn't even common agreement that there is 
a

problem, let alone common agreement on the specifics of the problem, let
alone common agreement on a solution.


I think you are the only one arguing the bug is a SynonymFilter problem.



Even though technically the Solr Query Parser is now separate from the
Lucene Query Parser, the synonym filter is still strictly Lucene. 
Addressing

the multi-term synonym feature requires enhancement to the synonym filter,


dude you have a bug in X, you fix the bug in X: you don't go hack around it 
in Y.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org 



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-4358) SolrJ, by preventing multi-part post, loses key information about file name that Tika needs

2013-01-25 Thread Karl Wright (JIRA)
Karl Wright created SOLR-4358:
-

 Summary: SolrJ, by preventing multi-part post, loses key 
information about file name that Tika needs
 Key: SOLR-4358
 URL: https://issues.apache.org/jira/browse/SOLR-4358
 Project: Solr
  Issue Type: Bug
  Components: clients - java
Affects Versions: 4.0
Reporter: Karl Wright


SolrJ accepts a ContentStream, which has a name field.  Within 
HttpSolrServer.java, if SolrJ makes the decision to use multipart posts, this 
filename is transmitted as part of the form boundary information.  However, if 
SolrJ chooses not to use multipart post, the filename information is lost.

This information is used by SolrCell (Tika) to make decisions about content 
extraction, so it is very important that it makes it into Solr in one way or 
another.  Either SolrJ should set appropriate equivalent headers to send the 
filename automatically, or it should force multipart posts when this 
information is present.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4356) SOLR 4.1 Out Of Memory error After commit of a few thousand Solr Docs

2013-01-25 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562734#comment-13562734
 ] 

Mark Miller commented on SOLR-4356:
---

Please send questions like this to the user list - then open a JIRA issue if 
you determine the issue is a bug.

 SOLR 4.1 Out Of Memory error After commit of a few thousand Solr Docs
 -

 Key: SOLR-4356
 URL: https://issues.apache.org/jira/browse/SOLR-4356
 Project: Solr
  Issue Type: Improvement
  Components: clients - java, Schema and Analysis, Tests
Affects Versions: 4.1
 Environment: OS = Ubuntu 12.04
 Sun JAVA 7
 Max Java Heap Space = 2GB
 Apache Tomcat 7
 Hardware = {Intel core i3, 2GB RAM}
 Average no of fields in a Solr Doc = 100
Reporter: Harish Verma
  Labels: performance, test
 Fix For: 4.1.1

 Attachments: memorydump1.png, memorydump2.png

   Original Estimate: 168h
  Remaining Estimate: 168h

 we are testing solr 4.1 running inside tomcat 7 and java 7 with the following 
 options
 JAVA_OPTS=-Xms256m -Xmx2048m -XX:MaxPermSize=1024m -XX:+UseConcMarkSweepGC 
 -XX:+CMSIncrementalMode -XX:+ParallelRefProcEnabled 
 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/home/ubuntu/OOM_HeapDump
 our source code looks like following:
 / START */
 int noOfSolrDocumentsInBatch = 0;
 for(int i=0 ; i < 5000 ; i++) {
 SolrInputDocument solrInputDocument = getNextSolrInputDocument();
 server.add(solrInputDocument);
 noOfSolrDocumentsInBatch += 1;
 if(noOfSolrDocumentsInBatch == 10) {
 server.commit();
 noOfSolrDocumentsInBatch = 0;
 }
 }
 / END */
 the method getNextSolrInputDocument() generates a solr document with 100 
 fields (average). Around 50 of the fields are of text_general type.
 Some of the text_general fields consist of approx 1000 words, the rest consist 
 of a few words. Out of the total fields there are around 35-40 multivalued fields 
 (not of type text_general).
 We are indexing all the fields but storing only 8 fields. Out of these 8 
 fields two are string type, five are long and one is boolean. So our index 
 size is only 394 MB. But the RAM occupied at time of OOM is around 2.5 GB. 
 Why the memory is so high even though the index size is small?
 What is being stored in the memory? Our understanding is that after every 
 commit documents are flushed to the disk. So nothing should remain in RAM 
 after commit.
 We are using the following settings:
 server.commit() set waitForSearcher=true and waitForFlush=true
 solrConfig.xml has following properties set:
 directoryFactory = solr.MMapDirectoryFactory
 maxWarmingSearchers = 1
 text_general data type is being used as supplied in the schema.xml with the 
 solr setup.
 maxIndexingThreads = 8(default)
 <autoCommit><maxTime>15000</maxTime><openSearcher>false</openSearcher></autoCommit>
 We get Java heap Out Of Memory Error after committing around 3990 solr 
 documents. Some of the snapshots of memory dump from profiler are attached.
 can somebody please suggest what should we do to minimize/optimize the memory 
 consumption in our case with the reasons?
 also suggest what should be optimal values and reason for following 
 parameters of solrConfig.xml 
 useColdSearcher - true/false?
 maxwarmingsearchers- number
 spellcheck-on/off?
 omitNorms=true/false?
 omitTermFreqAndPositions?
 mergefactor? we are using default value 10
 java garbage collection tuning parameters ?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Fixing query-time multi-word synonym issue

2013-01-25 Thread Robert Muir
On Fri, Jan 25, 2013 at 9:19 AM, Jack Krupansky j...@basetechnology.com wrote:
 Here's an example query with q.op=AND:

causes of heart attack

 And I have this synonym definition:

heart attack, myocardial infarction

 So, what is the alleged query parser fix so that the query is treated as:

causes of (heart attack OR myocardial infarction)


That's actually inefficient and stupid to do. If you make a parser that
doesn't split on whitespace, you can just tell it to fold at index and
query time just like stemming. No OR necessary.

But I think you are trying to get off topic, again the real problem
affecting 99%+ users is that the lucene queryparser splits on
whitespace.

If this is fixed, then lots of things (not just synonyms, but other
basic shit that is broken today) starts working too:
https://issues.apache.org/jira/browse/LUCENE-2605

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4642) TokenizerFactory should provide a create method with a given AttributeSource

2013-01-25 Thread Renaud Delbru (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562740#comment-13562740
 ] 

Renaud Delbru commented on LUCENE-4642:
---

Hi, 

are there still some open questions on this issue that block the patch from 
being committed? 

 TokenizerFactory should provide a create method with a given AttributeSource
 

 Key: LUCENE-4642
 URL: https://issues.apache.org/jira/browse/LUCENE-4642
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/analysis
Affects Versions: 4.1
Reporter: Renaud Delbru
Assignee: Steve Rowe
  Labels: analysis, attribute, tokenizer
 Fix For: 4.2, 5.0

 Attachments: LUCENE-4642.patch, LUCENE-4642.patch


 All tokenizer implementations have a constructor that takes a given 
 AttributeSource as parameter (LUCENE-1826). However, the TokenizerFactory 
 does not provide an API to create tokenizers with a given AttributeSource.
 Side note: There are still a lot of tokenizers that do not provide 
 constructors that take AttributeSource and AttributeFactory.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4043) Add ability to get success/failure responses from Collections API.

2013-01-25 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562741#comment-13562741
 ] 

Mark Miller commented on SOLR-4043:
---

I'm going to commit this in a moment so it can start baking for 4.2.

 Add ability to get success/failure responses from Collections API.
 --

 Key: SOLR-4043
 URL: https://issues.apache.org/jira/browse/SOLR-4043
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0
 Environment: Solr cloud cluster
Reporter: Raintung Li
Assignee: Mark Miller
 Fix For: 4.2, 5.0

 Attachments: patch-4043.txt, SOLR-4043_brach4.x.txt, SOLR-4043.patch


 The create/delete/reload collection operations are asynchronous, so the client 
 can't get the actual result; it can only confirm that the request has been 
 saved into the OverseerCollectionQueue. The client gets a response immediately, 
 without waiting for the outcome of the operation (create/delete/reload 
 collection), whether or not it succeeded. 
 The easy solution is for the client to wait until the asynchronous process 
 completes: the create/delete/reload collection thread saves the response into 
 the OverseerCollectionQueue, then notifies the client to fetch the response. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4328) Simultaneous multiple connections to Solr example often fail with various IOExceptions

2013-01-25 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562743#comment-13562743
 ] 

Karl Wright commented on SOLR-4328:
---

The Solr connector in the ManifoldCF project worked around this problem for the 
moment by doing two things:

(1) Detecting the broken pipe error and interpreting that as meaning that a 
fixed number of retries are required;
(2) Turning on stale connection checks in HttpClient.

This is workable but not ideal.  The fact that Solr forcibly closes connections 
means that connection pooling on the client side is essentially a futile 
effort, and thus there are significant performance losses going to be 
associated with this behavior.  It is therefore in everyone's interest, I 
believe, to get Solr to stop doing what it is doing.

If I get any time this weekend I will try and propose a patch.
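
The "fixed number of retries on broken pipe" workaround could look roughly like this (an illustrative sketch; the exception matching and retry count are assumptions, not the actual ManifoldCF code):

```java
import java.io.IOException;
import java.io.UncheckedIOException;

// Sketch of retrying a request a fixed number of times when the
// failure looks like a broken pipe during connection establishment.
public class RetryOnBrokenPipe {
    static final int MAX_RETRIES = 3;

    interface Request { void send() throws IOException; }

    // Returns the number of attempts it took to succeed.
    static int sendWithRetry(Request r) {
        for (int attempt = 1; ; attempt++) {
            try {
                r.send();
                return attempt;
            } catch (IOException e) {
                boolean brokenPipe = e.getMessage() != null
                        && e.getMessage().contains("Broken pipe");
                if (!brokenPipe || attempt >= MAX_RETRIES) {
                    throw new UncheckedIOException(e); // give up
                }
                // Otherwise retry: the connection dropped mid-handshake.
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Fails twice with a broken pipe, then succeeds on the third try.
        int attempts = sendWithRetry(() -> {
            if (calls[0]++ < 2) throw new IOException("Broken pipe");
        });
        System.out.println("succeeded after " + attempts + " attempts");
    }
}
```

As the message notes, this is workable but does not recover the performance lost when the server keeps closing pooled connections.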


 Simultaneous multiple connections to Solr example often fail with various 
 IOExceptions
 --

 Key: SOLR-4328
 URL: https://issues.apache.org/jira/browse/SOLR-4328
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0, 3.6.2
 Environment: ManifoldCF, Solr connector, SolrJ, and Solr 4.0 or 3.6 
 on Mac OSX or Ubuntu, all localhost connections
Reporter: Karl Wright

 In ManifoldCF, we've been seeing problems with SolrJ connections throwing 
 java.net.SocketException's.  See CONNECTORS-616 for details as to exactly 
 what varieties of this exception are thrown, but broken pipe is the most 
 common.  This occurs on multiple Unix variants as stated.  (We also 
 occasionally see exceptions on Windows, but they are much less frequent and 
 are different variants than on Unix.)
 The exceptions seem to occur during the time an initial connection is getting 
 established, and seems to occur randomly when multiple connections are 
 getting established all at the same time.  Wire logging shows that only the 
 first few headers are sent before the connection is broken.  Solr itself does 
 not log any error.  A retry is usually sufficient to have the transaction succeed.
 The Solr Connector in ManifoldCF has recently been upgraded to rely on SolrJ, 
 which could be a complicating factor.  However, I have repeatedly audited 
 both the Solr Connection code and the SolrJ code for best practices, and 
 while I found a couple of problems, nothing seems to be of the sort that 
 could cause a broken pipe.  For that to happen, the socket must be closed 
 either on the client end or on the server end, and there appears to be no 
 mechanism for that happening on the client end, since multiple threads would 
 have to be working with the same socket for that to be a possibility.
 It is also true that in ManifoldCF we disable the automatic retries that are 
 normally enabled for HttpComponents HttpClient.  These automatic retries 
 likely mask this problem should it be occurring in other situations.
 Places where there could potentially be a bug, in order of likelihood:
 (1) Jetty.  Nobody I am aware of has seen this on Tomcat yet.  But I also 
 don't know if anyone has tried it.
 (2) Solr servlet.  If it is possible for a servlet implementation to cause 
 the connection to drop without any exception being generated, this would be 
 something that should be researched.
 (3) HttpComponents/HttpClient.  If there is a client-side issue, it would 
 have to be because an httpclient instance was closing sockets from other 
 instances.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4642) TokenizerFactory should provide a create method with a given AttributeSource

2013-01-25 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562745#comment-13562745
 ] 

Robert Muir commented on LUCENE-4642:
-

I raised a lot of questions. I think they are valid concerns.

 TokenizerFactory should provide a create method with a given AttributeSource
 

 Key: LUCENE-4642
 URL: https://issues.apache.org/jira/browse/LUCENE-4642
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/analysis
Affects Versions: 4.1
Reporter: Renaud Delbru
Assignee: Steve Rowe
  Labels: analysis, attribute, tokenizer
 Fix For: 4.2, 5.0

 Attachments: LUCENE-4642.patch, LUCENE-4642.patch


 All tokenizer implementations have a constructor that takes a given 
 AttributeSource as parameter (LUCENE-1826). However, the TokenizerFactory 
 does not provide an API to create tokenizers with a given AttributeSource.
 Side note: There are still a lot of tokenizers that do not provide 
 constructors that take AttributeSource and AttributeFactory.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4043) Add ability to get success/failure responses from Collections API.

2013-01-25 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562759#comment-13562759
 ] 

Commit Tag Bot commented on SOLR-4043:
--

[trunk commit] Mark Robert Miller
http://svn.apache.org/viewvc?view=revision&revision=1438550

SOLR-4043: Add ability to get success/failure responses from Collections API.


 Add ability to get success/failure responses from Collections API.
 --

 Key: SOLR-4043
 URL: https://issues.apache.org/jira/browse/SOLR-4043
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0
 Environment: Solr cloud cluster
Reporter: Raintung Li
Assignee: Mark Miller
 Fix For: 4.2, 5.0

 Attachments: patch-4043.txt, SOLR-4043_brach4.x.txt, SOLR-4043.patch


 The create/delete/reload collection operations are asynchronous, so the client 
 can't get the actual result; it can only confirm that the request has been 
 saved into the OverseerCollectionQueue. The client gets a response immediately, 
 without waiting for the outcome of the operation (create/delete/reload 
 collection), whether or not it succeeded. 
 The easy solution is for the client to wait until the asynchronous process 
 completes: the create/delete/reload collection thread saves the response into 
 the OverseerCollectionQueue, then notifies the client to fetch the response. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4352) Velocity-base pagination should support/preserve sorting

2013-01-25 Thread Eric Spiegelberg (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562760#comment-13562760
 ] 

Eric Spiegelberg commented on SOLR-4352:


This patch is specifically for maintaining the sort parameter(s) for pagination 
-- the Velocity template that generates the pagination links was modified. Very 
similar code for extracting and maintaining the sort parameter(s) could be 
applied to facet selections separately.

 Velocity-base pagination should support/preserve sorting
 

 Key: SOLR-4352
 URL: https://issues.apache.org/jira/browse/SOLR-4352
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Eric Spiegelberg
Assignee: Erik Hatcher
 Attachments: SOLR-4352-erik.patch, SOLR-4352.patch


 When performing /browse, the Velocity generated UI does not support sorting 
 in the generated pagination links.
 The link_to_previous_page and link_to_next_page macros found within 
 [apache-solr-4.0.0]/example/solr/collection1/conf/velocity/VM_global_library.vm
  should be modified to maintain/preserve an existing sort parameter.




[jira] [Created] (LUCENE-4719) Payloads per position broken

2013-01-25 Thread JIRA
André  created LUCENE-4719:
--

 Summary: Payloads per position broken
 Key: LUCENE-4719
 URL: https://issues.apache.org/jira/browse/LUCENE-4719
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Affects Versions: 4.1
Reporter: André 


In 4.0 it worked. Since 4.1 getPayload() returns the same ByteRef instance for 
every position of the same term. Additionally payloads stored on the term 
vector (correct) may differ from payloads stored in the postings (wrong).




[jira] [Commented] (LUCENE-4642) TokenizerFactory should provide a create method with a given AttributeSource

2013-01-25 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562765#comment-13562765
 ] 

Steve Rowe commented on LUCENE-4642:


Renaud, have you looked at 
[TeeSinkTokenFilter|http://lucene.apache.org/core/4_1_0/analyzers-common/org/apache/lucene/analysis/sinks/TeeSinkTokenFilter.html]?
  Sounds to me like a good fit for the use case you mentioned.

 TokenizerFactory should provide a create method with a given AttributeSource
 

 Key: LUCENE-4642
 URL: https://issues.apache.org/jira/browse/LUCENE-4642
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/analysis
Affects Versions: 4.1
Reporter: Renaud Delbru
Assignee: Steve Rowe
  Labels: analysis, attribute, tokenizer
 Fix For: 4.2, 5.0

 Attachments: LUCENE-4642.patch, LUCENE-4642.patch


 All tokenizer implementations have a constructor that takes a given 
 AttributeSource as parameter (LUCENE-1826). However, the TokenizerFactory 
 does not provide an API to create tokenizers with a given AttributeSource.
 Side note: There are still a lot of tokenizers that do not provide 
 constructors that take AttributeSource and AttributeFactory.




[jira] [Commented] (SOLR-4352) Velocity-base pagination should support/preserve sorting

2013-01-25 Thread Erik Hatcher (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562769#comment-13562769
 ] 

Erik Hatcher commented on SOLR-4352:


Eric - my patch covers both facet and pagination links.  Any reason not to keep 
sort on facet links too?   Thoughts on my patch for your needs?

 Velocity-base pagination should support/preserve sorting
 

 Key: SOLR-4352
 URL: https://issues.apache.org/jira/browse/SOLR-4352
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Eric Spiegelberg
Assignee: Erik Hatcher
 Attachments: SOLR-4352-erik.patch, SOLR-4352.patch


 When performing /browse, the Velocity generated UI does not support sorting 
 in the generated pagination links.
 The link_to_previous_page and link_to_next_page macros found within 
 [apache-solr-4.0.0]/example/solr/collection1/conf/velocity/VM_global_library.vm
  should be modified to maintain/preserve an existing sort parameter.




[jira] [Updated] (LUCENE-4719) Payloads per position broken

2013-01-25 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/LUCENE-4719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

André  updated LUCENE-4719:
---

Fix Version/s: 4.1.1

 Payloads per position broken
 

 Key: LUCENE-4719
 URL: https://issues.apache.org/jira/browse/LUCENE-4719
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Affects Versions: 4.1
Reporter: André 
 Fix For: 4.1.1


 In 4.0 it worked. Since 4.1 getPayload() returns the same ByteRef instance 
 for every position of the same term. Additionally payloads stored on the term 
 vector (correct) may differ from payloads stored in the postings (wrong).




[jira] [Commented] (SOLR-4352) Velocity-base pagination should support/preserve sorting

2013-01-25 Thread Erik Hatcher (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562771#comment-13562771
 ] 

Erik Hatcher commented on SOLR-4352:


Eric, your original patch doesn't account for multiple sort parameters, nor 
does it URL-encode the sort values.  Both multiple sorts and URL encoding 
are handled in my patch.

 Velocity-base pagination should support/preserve sorting
 

 Key: SOLR-4352
 URL: https://issues.apache.org/jira/browse/SOLR-4352
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Eric Spiegelberg
Assignee: Erik Hatcher
 Attachments: SOLR-4352-erik.patch, SOLR-4352.patch


 When performing /browse, the Velocity generated UI does not support sorting 
 in the generated pagination links.
 The link_to_previous_page and link_to_next_page macros found within 
 [apache-solr-4.0.0]/example/solr/collection1/conf/velocity/VM_global_library.vm
  should be modified to maintain/preserve an existing sort parameter.




[jira] [Updated] (SOLR-4328) Simultaneous multiple connections to Solr example often fail with various IOExceptions

2013-01-25 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated SOLR-4328:
--

Description: 
In ManifoldCF, we've been seeing problems with SolrJ connections throwing 
java.net.SocketException's.  See CONNECTORS-616 for details as to exactly what 
varieties of this exception are thrown, but broken pipe is the most common.  
This occurs on multiple Unix variants as stated.  (We also occasionally see 
exceptions on Windows, but they are much less frequent and are different 
variants than on Unix.)

The exceptions seem to occur during the time an initial connection is getting 
established, and seems to occur randomly when multiple connections are getting 
established all at the same time.  Wire logging shows that only the first few 
headers are sent before the connection is broken.  Solr itself does not log any 
error.  A retry is usually sufficient to have the transaction succeed.

The Solr Connector in ManifoldCF has recently been upgraded to rely on SolrJ, 
which could be a complicating factor.  However, I have repeatedly audited both 
the Solr Connection code and the SolrJ code for best practices, and while I 
found a couple of problems, nothing seems to be of the sort that could cause a 
broken pipe.  For that to happen, the socket must be closed either on the 
client end or on the server end, and there appears to be no mechanism for that 
happening on the client end, since multiple threads would have to be working 
with the same socket for that to be a possibility.

It is also true that in ManifoldCF we disable the automatic retries that are 
normally enabled for HttpComponents HttpClient.  These automatic retries likely 
mask this problem should it be occurring in other situations.

Places where there could potentially be a bug, in order of likelihood:

(1) Jetty.  Nobody I am aware of has seen this on Tomcat yet.  But I also don't 
know if anyone has tried it.
(2) Solr servlet.  If it is possible for a servlet implementation to cause the 
connection to drop without any exception being generated, this would be 
something that should be researched.
(3) HttpComponents/HttpClient.  If there is a client-side issue, it would have 
to be because an httpclient instance was closing sockets from other instances.
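The report above notes that a manual retry usually succeeds, and that ManifoldCF disables HttpClient's automatic retries, which would otherwise mask the broken-pipe failures. That retry behavior can be sketched in plain Java; the class and helper names are hypothetical, not ManifoldCF or HttpClient code:

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Sketch: retry a request a fixed number of times on transient I/O failure.
// With automatic retries disabled, a broken pipe surfaces directly to the
// caller; a single manual retry is usually enough for the request to succeed.
public class RetryOnce {
    static <T> T withRetry(Callable<T> request, int attempts) throws Exception {
        IOException last = null;
        for (int i = 0; i < attempts; i++) {
            try {
                return request.call();
            } catch (IOException e) {
                last = e; // e.g. java.net.SocketException: Broken pipe
            }
        }
        throw last; // all attempts failed
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        // Simulated request: fails once with a SocketException, then succeeds.
        String result = withRetry(new Callable<String>() {
            public String call() throws IOException {
                if (calls[0]++ == 0) throw new java.net.SocketException("Broken pipe");
                return "ok";
            }
        }, 2);
        System.out.println(result); // ok
    }
}
```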


  was:
In ManifoldCF, we've been seeing problems with SolrJ connections throwing 
java.net.SocketException's.  See CONNECTORS-616 for details as to exactly what 
varieties of this exception are thrown, but broken pipe is the most common.  
This occurs on multiple Unix variants as stated.  (We also occasionally see 
exceptions on Windows, but they are much less frequent and are different 
variants than on Unix.)

The exceptions seem to occur during the time an initial connection is getting 
established, and seems to occur randomly when multiple connections are getting 
established all at the same time.  Wire logging shows that only the first few 
headers are sent before the connection is broken.  Solr itself does log any 
error.  A retry is usually sufficient to have the transaction succeed.

The Solr Connector in ManifoldCF has recently been upgraded to rely on SolrJ, 
which could be a complicating factor.  However, I have repeatedly audited both 
the Solr Connection code and the SolrJ code for best practices, and while I 
found a couple of problems, nothing seems to be of the sort that could cause a 
broken pipe.  For that to happen, the socket must be closed either on the 
client end or on the server end, and there appears to be no mechanism for that 
happening on the client end, since multiple threads would have to be working 
with the same socket for that to be a possibility.

It is also true that in ManifoldCF we disable the automatic retries that are 
normally enabled for HttpComponents HttpClient.  These automatic retries likely 
mask this problem should it be occurring in other situations.

Places where there could potentially be a bug, in order of likelihood:

(1) Jetty.  Nobody I am aware of has seen this on Tomcat yet.  But I also don't 
know if anyone has tried it.
(2) Solr servlet.  If it is possible for a servlet implementation to cause the 
connection to drop without any exception being generated, this would be 
something that should be researched.
(3) HttpComponents/HttpClient.  If there is a client-side issue, it would have 
to be because an httpclient instance was closing sockets from other instances.



 Simultaneous multiple connections to Solr example often fail with various 
 IOExceptions
 --

 Key: SOLR-4328
 URL: https://issues.apache.org/jira/browse/SOLR-4328
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0, 3.6.2
 Environment: ManifoldCF, Solr connector, SolrJ, and Solr 4.0 or 3.6 
 on Mac OSX or Ubuntu, 

[jira] [Commented] (SOLR-4043) Add ability to get success/failure responses from Collections API.

2013-01-25 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562780#comment-13562780
 ] 

Commit Tag Bot commented on SOLR-4043:
--

[branch_4x commit] Mark Robert Miller
http://svn.apache.org/viewvc?view=revision&revision=1438555

SOLR-4043: Add ability to get success/failure responses from Collections API.


 Add ability to get success/failure responses from Collections API.
 --

 Key: SOLR-4043
 URL: https://issues.apache.org/jira/browse/SOLR-4043
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0
 Environment: Solr cloud cluster
Reporter: Raintung Li
Assignee: Mark Miller
 Fix For: 4.2, 5.0

 Attachments: patch-4043.txt, SOLR-4043_brach4.x.txt, SOLR-4043.patch


 The create/delete/reload collection operations are asynchronous: the client 
 cannot get a proper response, only confirmation that the request has been 
 saved into the OverseerCollectionQueue. The client receives a response 
 immediately, without waiting for the result of the operation 
 (create/delete/reload collection), whether or not it succeeded. 
 An easy solution is for the client to wait until the asynchronous operation 
 completes: the create/delete/reload collection thread saves the response into 
 the OverseerCollectionQueue, then notifies the client to fetch it. 




[jira] [Commented] (SOLR-4352) Velocity-base pagination should support/preserve sorting

2013-01-25 Thread Eric Spiegelberg (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562787#comment-13562787
 ] 

Eric Spiegelberg commented on SOLR-4352:


After comparing the two patches, I see that my patch covers a narrower slice of 
functionality and does not account for the additional use cases that yours 
does. Yours is the way to go.

 Velocity-base pagination should support/preserve sorting
 

 Key: SOLR-4352
 URL: https://issues.apache.org/jira/browse/SOLR-4352
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Eric Spiegelberg
Assignee: Erik Hatcher
 Attachments: SOLR-4352-erik.patch, SOLR-4352.patch


 When performing /browse, the Velocity generated UI does not support sorting 
 in the generated pagination links.
 The link_to_previous_page and link_to_next_page macros found within 
 [apache-solr-4.0.0]/example/solr/collection1/conf/velocity/VM_global_library.vm
  should be modified to maintain/preserve an existing sort parameter.




[jira] [Resolved] (LUCENE-4719) Payloads per position broken

2013-01-25 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-4719.
-

Resolution: Not A Problem
  Assignee: Robert Muir

Read the javadocs: it's OK that it returns the same instance.

The instance is not *yours*, and it will refer to different bytes, or bytes with 
different content (all at the discretion of the implementation).
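The reuse contract Robert describes can be illustrated with a plain-Java sketch of a "reused return object" (the names and the one-byte payload are invented for the example; no Lucene classes are used). A consumer that stores the returned reference sees only the last value; a consumer that copies gets each value:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch: the producer hands back the same container on every call,
// refilled with new bytes -- analogous to getPayload() returning a
// reused instance per position.
public class ReusedRef {
    static final byte[] buf = new byte[1];       // the single reused instance

    static byte[] nextPayload(int position) {
        buf[0] = (byte) position;                // refill the same object
        return buf;
    }

    public static void main(String[] args) {
        List<byte[]> wrong = new ArrayList<byte[]>(); // stores the shared ref
        List<byte[]> right = new ArrayList<byte[]>(); // defensive copies
        for (int pos = 1; pos <= 3; pos++) {
            byte[] p = nextPayload(pos);
            wrong.add(p);                             // all three are the same array
            right.add(Arrays.copyOf(p, p.length));    // copy before the next call
        }
        System.out.println(wrong.get(0)[0] + " " + right.get(0)[0]); // 3 1
    }
}
```

The first element of `wrong` has been overwritten by the last refill, while the copied list preserves each position's payload.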

 Payloads per position broken
 

 Key: LUCENE-4719
 URL: https://issues.apache.org/jira/browse/LUCENE-4719
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Affects Versions: 4.1
Reporter: André 
Assignee: Robert Muir
 Fix For: 4.1.1


 In 4.0 it worked. Since 4.1 getPayload() returns the same ByteRef instance 
 for every position of the same term. Additionally payloads stored on the term 
 vector (correct) may differ from payloads stored in the postings (wrong).




[jira] [Commented] (SOLR-3177) Excluding tagged filter in StatsComponent

2013-01-25 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562798#comment-13562798
 ] 

Jan Høydahl commented on SOLR-3177:
---

Thanks for the patch, Nikolai.

You'll have a greater chance of having the patch committed if you bring it even 
closer to finalization. The patch should ideally be against TRUNK, 
alternatively against branch_4x. Also, please review this page 
http://wiki.apache.org/solr/HowToContribute#Generating_a_patch and try to 
follow the guidelines as closely as possible.

Most important, I think, is that the patch only includes needed changes, so that 
it is really easy to review what has changed. If you can write JUnit tests, 
that's perfect; if not, that's OK too :)

 Excluding tagged filter in StatsComponent
 -

 Key: SOLR-3177
 URL: https://issues.apache.org/jira/browse/SOLR-3177
 Project: Solr
  Issue Type: Improvement
  Components: SearchComponents - other
Affects Versions: 3.5, 3.6, 4.0-ALPHA, 4.1
Reporter: Mathias H.
Priority: Minor
  Labels: localparams, stats, statscomponent
 Attachments: statsfilterexclude.patch


 It would be useful to exclude the effects of some fq params from the set of 
 documents used to compute stats -- similar to 
 how you can exclude tagged filters when generating facet counts... 
 https://wiki.apache.org/solr/SimpleFacetParameters#Tagging_and_excluding_Filters
 So that it's possible to do something like this... 
 http://localhost:8983/solr/select?fq={!tag=priceFilter}price:[1 TO 
 20]&q=*:*&stats=true&stats.field={!ex=priceFilter}price 
 If you want to create a price slider this is very useful because then you can 
 filter the price ([1 TO 20]) and nevertheless get the lower and upper bound of 
 the unfiltered price (min=0, max=100):
 {noformat}
 |-[---]--|
 $0 $1 $20$100
 {noformat}




[jira] [Created] (SOLR-4359) The RecentUpdates#update method should treat a problem reading the next record the same as a problem parsing the record - log the exception and break.

2013-01-25 Thread Mark Miller (JIRA)
Mark Miller created SOLR-4359:
-

 Summary: The RecentUpdates#update method should treat a problem 
reading the next record the same as a problem parsing the record - log the 
exception and break.
 Key: SOLR-4359
 URL: https://issues.apache.org/jira/browse/SOLR-4359
 Project: Solr
  Issue Type: Bug
Reporter: Mark Miller
Assignee: Mark Miller
 Fix For: 4.2, 5.0







[jira] [Resolved] (SOLR-4043) Add ability to get success/failure responses from Collections API.

2013-01-25 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller resolved SOLR-4043.
---

Resolution: Fixed

Thanks Raintung! Let's open new JIRAs for anything further needed on this.

 Add ability to get success/failure responses from Collections API.
 --

 Key: SOLR-4043
 URL: https://issues.apache.org/jira/browse/SOLR-4043
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0
 Environment: Solr cloud cluster
Reporter: Raintung Li
Assignee: Mark Miller
 Fix For: 4.2, 5.0

 Attachments: patch-4043.txt, SOLR-4043_brach4.x.txt, SOLR-4043.patch


 The create/delete/reload collection operations are asynchronous: the client 
 cannot get a proper response, only confirmation that the request has been 
 saved into the OverseerCollectionQueue. The client receives a response 
 immediately, without waiting for the result of the operation 
 (create/delete/reload collection), whether or not it succeeded. 
 An easy solution is for the client to wait until the asynchronous operation 
 completes: the create/delete/reload collection thread saves the response into 
 the OverseerCollectionQueue, then notifies the client to fetch it. 




[jira] [Reopened] (LUCENE-4719) Payloads per position broken

2013-01-25 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/LUCENE-4719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

André  reopened LUCENE-4719:



Yes, you are right, but the value is also the same.

 Payloads per position broken
 

 Key: LUCENE-4719
 URL: https://issues.apache.org/jira/browse/LUCENE-4719
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Affects Versions: 4.1
Reporter: André 
Assignee: Robert Muir
 Fix For: 4.1.1


 In 4.0 it worked. Since 4.1 getPayload() returns the same ByteRef instance 
 for every position of the same term. Additionally payloads stored on the term 
 vector (correct) may differ from payloads stored in the postings (wrong).




[jira] [Comment Edited] (LUCENE-4719) Payloads per position broken

2013-01-25 Thread JIRA

[ 
https://issues.apache.org/jira/browse/LUCENE-4719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562801#comment-13562801
 ] 

André  edited comment on LUCENE-4719 at 1/25/13 4:04 PM:
-

Yes, you are right, but the value is also the same. You can easily test it.

  was (Author: antiheld):
Yes, you are right, but the value is also the same.
  
 Payloads per position broken
 

 Key: LUCENE-4719
 URL: https://issues.apache.org/jira/browse/LUCENE-4719
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Affects Versions: 4.1
Reporter: André 
Assignee: Robert Muir
 Fix For: 4.1.1


 In 4.0 it worked. Since 4.1 getPayload() returns the same ByteRef instance 
 for every position of the same term. Additionally payloads stored on the term 
 vector (correct) may differ from payloads stored in the postings (wrong).




[jira] [Updated] (LUCENE-4719) Payloads per position broken

2013-01-25 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/LUCENE-4719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

André  updated LUCENE-4719:
---

Description: In 4.0 it worked. Since 4.1 getPayload() returns the same 
value for every position of the same term. Additionally payloads stored on the 
term vector (correct) may differ from payloads stored in the postings (wrong).  
(was: In 4.0 it worked. Since 4.1 getPayload() returns the same ByteRef 
instance for every position of the same term. Additionally payloads stored on 
the term vector (correct) may differ form payloads stored in the postings 
(wrong).)

 Payloads per position broken
 

 Key: LUCENE-4719
 URL: https://issues.apache.org/jira/browse/LUCENE-4719
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Affects Versions: 4.1
Reporter: André 
Assignee: Robert Muir
 Fix For: 4.1.1


 In 4.0 it worked. Since 4.1 getPayload() returns the same value for every 
 position of the same term. Additionally payloads stored on the term vector 
 (correct) may differ from payloads stored in the postings (wrong).




java.lang.NumberFormatException Using PhraseQuery with Lucene 4.0.0

2013-01-25 Thread JimAld
Hi,

The below code is throwing the exception:
java.lang.NumberFormatException: For input string: "01.SZ" at
java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)

when the TopDocs docs = indexSearcher.search(phraseQuery, null, 10, sort);
line is called. This only happens when the searchPattern contains a space
character. No other info is available in the exception.

The 01.SZ value is the first value in my index...
the index is a RAMDirectory...

Anyone have any ideas? Many Thanks.

Code:
BooleanQuery.setMaxClauseCount(clauseCount);
searchPattern = QueryParser.escape(searchPattern);
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_40);
IndexReader reader = IndexReader.open(index);
IndexSearcher indexSearcher = new IndexSearcher(reader);

PhraseQuery phraseQuery = new PhraseQuery();
Term term = new Term(fieldName, searchPattern);
phraseQuery.add(term);
phraseQuery.setSlop(0);

Sort sort = new Sort(new SortField(fieldName,SortField.Type.SCORE));
TopDocs docs = indexSearcher.search(phraseQuery, null, 10, sort);




--
View this message in context: 
http://lucene.472066.n3.nabble.com/java-lang-NumberFormatException-Using-PhraseQuery-with-Lucene-4-0-0-tp4036273.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.




Re: Fixing query-time multi-word synonym issue

2013-01-25 Thread Jack Krupansky
One clarification from my previous comment: one requirement is to prevent 
false matches for instances of "heart infarction" and "myocardial attack" - 
the current synonym filter does not preserve the path or term ordering 
within the multi-term phrases, even if the query parser does present the 
full term sequence as a single input string.


Yes, the position information is preserved, but there is no path attribute 
to be able to tell that "heart" was before "attack" as opposed to before 
"infarction".


-- Jack Krupansky

-Original Message- 
From: Robert Muir

Sent: Friday, January 25, 2013 9:47 AM
To: dev@lucene.apache.org
Subject: Re: Fixing query-time multi-word synonym issue

On Fri, Jan 25, 2013 at 9:19 AM, Jack Krupansky j...@basetechnology.com 
wrote:

Here's an example query with q.op=AND:

   causes of heart attack

And I have this synonym definition:

   heart attack, myocardial infarction

So, what is the alleged query parser fix so that the query is treated as:

   causes of (heart attack OR myocardial infarction)



That's actually inefficient and stupid to do. If you make a parser that
doesn't split on whitespace, you can just tell it to fold at index and
query time, just like stemming. No OR necessary.

But I think you are trying to get off topic; again, the real problem
affecting 99%+ of users is that the Lucene queryparser splits on
whitespace.

If this is fixed, then lots of things (not just synonyms, but other
basic shit that is broken today) starts working too:
https://issues.apache.org/jira/browse/LUCENE-2605







[JENKINS] Lucene-Solr-trunk-Linux (32bit/jrockit-jdk1.6.0_33-R28.2.4-4.1.0) - Build # 3970 - Failure!

2013-01-25 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/3970/
Java: 32bit/jrockit-jdk1.6.0_33-R28.2.4-4.1.0 -XnoOpt

All tests passed

Build Log:
[...truncated 29961 lines...]
BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:305: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/extra-targets.xml:120: The 
following files are missing svn:eol-style (or binary svn:mime-type):
* solr/core/src/java/org/apache/solr/cloud/OverseerSolrResponse.java

Total time: 56 minutes 6 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 32bit/jrockit-jdk1.6.0_33-R28.2.4-4.1.0 -XnoOpt
Email was triggered for: Failure
Sending email for trigger: Failure




[jira] [Created] (SOLR-4360) TermVectorAccessor return terms that do not match with current document

2013-01-25 Thread Francois-Xavier Bonnet (JIRA)
Francois-Xavier Bonnet created SOLR-4360:


 Summary: TermVectorAccessor return terms that do not match with 
current document
 Key: SOLR-4360
 URL: https://issues.apache.org/jira/browse/SOLR-4360
 Project: Solr
  Issue Type: Bug
Affects Versions: 3.6.2
Reporter: Francois-Xavier Bonnet


For each term, TermVectorAccessor looks in the indexReader and calls 
termPositions.skipTo(documentNumber), but this method returns the first 
document with an id greater than or equal to documentNumber.
As a result you get some extra terms that do not really belong to 
documentNumber.
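The skipTo contract described above can be sketched in plain Java over a hypothetical sorted posting list (the names and data are invented for the example; this is not the Lucene API). The bug is attributing the term to the target document without checking that skipTo actually landed on it:

```java
// Sketch: skipTo advances to the first document id >= the target, so the
// caller must still compare the returned id against the target before
// counting the term as belonging to the target document.
public class SkipToCheck {
    // Hypothetical sorted posting list of doc ids for one term.
    static int skipTo(int[] postings, int target) {
        for (int doc : postings) {
            if (doc >= target) return doc;
        }
        return Integer.MAX_VALUE; // posting list exhausted
    }

    public static void main(String[] args) {
        int[] postings = {2, 5, 9};
        int doc = skipTo(postings, 4);            // lands on 5, not 4
        boolean matches = (doc == 4);             // the missing equality check
        System.out.println(doc + " " + matches);  // 5 false
    }
}
```

Without the equality check, document 4 would incorrectly be credited with a term that only occurs in document 5.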




[jira] [Updated] (SOLR-4360) TermVectorAccessor return terms that do not match with current document

2013-01-25 Thread Francois-Xavier Bonnet (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francois-Xavier Bonnet updated SOLR-4360:
-

Attachment: SOLR-4360.txt

Here is a patch

 TermVectorAccessor return terms that do not match with current document
 ---

 Key: SOLR-4360
 URL: https://issues.apache.org/jira/browse/SOLR-4360
 Project: Solr
  Issue Type: Bug
Affects Versions: 3.6.2
Reporter: Francois-Xavier Bonnet
 Attachments: SOLR-4360.txt


 For each term, TermVectorAccessor looks in the indexReader and calls 
 termPositions.skipTo(documentNumber), but this method returns the first 
 document with an id greater than or equal to documentNumber.
 As a result you get some extra terms that do not really match 
 documentNumber.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4325) DIH DateFormatEvaluator seems to have problems with DST changes - test disabled

2013-01-25 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562848#comment-13562848
 ] 

Commit Tag Bot commented on SOLR-4325:
--

[trunk commit] James Dyer
http://svn.apache.org/viewvc?view=revision&revision=1438597

SOLR-4325: fix TestBuiltInEvaluators


 DIH DateFormatEvaluator seems to have problems with DST changes - test 
 disabled
 

 Key: SOLR-4325
 URL: https://issues.apache.org/jira/browse/SOLR-4325
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 4.0, 4.1
Reporter: Uwe Schindler
Assignee: James Dyer
 Fix For: 4.2, 5.0

 Attachments: SOLR-4325.patch


Yesterday was the DST change in Fiji (the clock went one hour backwards, as 
summer time ended and winter time started). This caused 
org.apache.solr.handler.dataimport.TestBuiltInEvaluators.testDateFormatEvaluator
 to fail. The reason is simple: NOW-2DAYS is evaluated without taking the time 
zone into account (it subtracts 48 hours), but to be correct and go 2 
DAYS back in local wall-clock time, it must subtract only 47 hours. If this 
is not intended (we want to go 48 hours back, not 47), the test needs a fix. 
Otherwise the date evaluator must take the time zone into account when 
subtracting days (e.g., use a correctly localized Calendar instance and its 
add() method 
([http://docs.oracle.com/javase/6/docs/api/java/util/Calendar.html#add(int, 
int)]).
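The difference Uwe describes can be demonstrated with a synthetic time zone built from java.util.SimpleTimeZone (the zone rule below is invented for the demo and is not Fiji's real rule): subtracting a raw 48 * 3600 * 1000 ms shifts the wall-clock hour when the interval crosses a DST transition, whereas Calendar.add(Calendar.DAY_OF_MONTH, -2) keeps the local wall-clock time.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.SimpleTimeZone;

class DstDemo {
    // Synthetic zone: base UTC+12, with one hour of DST that ends on
    // Jan 20 at 03:00. (Invented rule for the demo, not real tz data.)
    static SimpleTimeZone zone() {
        SimpleTimeZone tz = new SimpleTimeZone(12 * 3600 * 1000, "Demo/Zone");
        tz.setStartRule(Calendar.OCTOBER, 21, 2 * 3600 * 1000);
        tz.setEndRule(Calendar.JANUARY, 20, 3 * 3600 * 1000);
        return tz;
    }

    // Wall-clock hour after going back two days with Calendar.add():
    // the DAY_OF_MONTH add keeps HOUR_OF_DAY invariant across DST.
    static int hourViaCalendarAdd() {
        Calendar c = new GregorianCalendar(zone());
        c.clear();
        c.set(2013, Calendar.JANUARY, 21, 12, 0, 0); // after the transition
        c.add(Calendar.DAY_OF_MONTH, -2);            // DST-aware
        return c.get(Calendar.HOUR_OF_DAY);
    }

    // Wall-clock hour after subtracting a raw 48 hours of milliseconds:
    // the interval crosses the transition, so the local hour drifts.
    static int hourViaRawMillis() {
        Calendar c = new GregorianCalendar(zone());
        c.clear();
        c.set(2013, Calendar.JANUARY, 21, 12, 0, 0);
        c.setTimeInMillis(c.getTimeInMillis() - 48L * 3600 * 1000);
        return c.get(Calendar.HOUR_OF_DAY);
    }
}
```

Starting from Jan 21 12:00 in the synthetic zone (whose DST ends Jan 20), Calendar.add lands on Jan 19 12:00, while the raw subtraction lands on Jan 19 13:00.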

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-4325) DIH DateFormatEvaluator seems to have problems with DST changes - test disabled

2013-01-25 Thread James Dyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Dyer resolved SOLR-4325.
--

Resolution: Fixed

 DIH DateFormatEvaluator seems to have problems with DST changes - test 
 disabled
 

 Key: SOLR-4325
 URL: https://issues.apache.org/jira/browse/SOLR-4325
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 4.0, 4.1
Reporter: Uwe Schindler
Assignee: James Dyer
 Fix For: 4.2, 5.0

 Attachments: SOLR-4325.patch


 Yesterday was the DST change in Fiji (the clock went one hour backwards, as 
 summer time ended and winter time started). This caused 
 org.apache.solr.handler.dataimport.TestBuiltInEvaluators.testDateFormatEvaluator
  to fail. The reason is simple: NOW-2DAYS is evaluated without taking the time 
 zone into account (it subtracts 48 hours), but to be correct and go 2 
 DAYS back in local wall-clock time, it must subtract only 47 hours. If this 
 is not intended (we want to go 48 hours back, not 47), the test needs a fix. 
 Otherwise the date evaluator must take the time zone into account when 
 subtracting days (e.g., use a correctly localized Calendar instance and its 
 add() method 
 ([http://docs.oracle.com/javase/6/docs/api/java/util/Calendar.html#add(int, 
 int)]).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.6.0_38) - Build # 3948 - Failure!

2013-01-25 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/3948/
Java: 32bit/jdk1.6.0_38 -server -XX:+UseSerialGC

All tests passed

Build Log:
[...truncated 29807 lines...]
BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:305: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/extra-targets.xml:120: The 
following files are missing svn:eol-style (or binary svn:mime-type):
* solr/core/src/java/org/apache/solr/cloud/OverseerSolrResponse.java

Total time: 40 minutes 5 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 32bit/jdk1.6.0_38 -server -XX:+UseSerialGC
Email was triggered for: Failure
Sending email for trigger: Failure



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-4642) TokenizerFactory should provide a create method with a given AttributeSource

2013-01-25 Thread Renaud Delbru (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562853#comment-13562853
 ] 

Renaud Delbru commented on LUCENE-4642:
---

@steve:

{quote}
have you looked at TeeSinkTokenFilter
{quote}

Yes, and from my current understanding, it is similar to our current 
implementation. The problem with this approach is that the exchange of 
attributes is performed using the AttributeSource.State API with 
AttributeSource#captureState and AttributeSource#restoreState, which copies the 
values of all attribute implementations that the state contains. This is 
very inefficient, as it has to copy arrays and other objects (e.g., char term 
arrays) for every single token.

@robert:

Concerning the problem of UOEs, Steve's new patch reduces the number of 
UOEs to just one, which is much more reasonable than my first approach. I have 
looked at the current state of the Lucene trunk, and there are already a lot of 
UOEs in many places. So I would suggest that this problem may not be a 
blocking one (but I might be wrong).

Concerning the problem of constructor explosion, maybe we can find a consensus. 
Your proposal to remove Tokenizer(AttributeSource) cannot work for us, as we 
need it to share the same AttributeSource across multiple streams. However, as 
I proposed, removing Tokenizer(AttributeFactory) could work, as it could be 
emulated by using Tokenizer(AttributeSource).
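Renaud's cost argument can be sketched with toy stand-ins (TermAttr and Pipeline below are invented for this sketch and are not Lucene's AttributeSource API): a captureState-style exchange clones the term buffer once per token, while sharing one attribute instance between producer and consumer clones nothing.

```java
// Toy stand-in for a term attribute holding a reusable char buffer.
// (Invented for illustration; not Lucene's CharTermAttribute.)
class TermAttr {
    char[] buffer = new char[16];
    int length;

    void set(String s) {
        if (s.length() > buffer.length) buffer = new char[s.length()];
        s.getChars(0, s.length(), buffer, 0);
        length = s.length();
    }

    String term() { return new String(buffer, 0, length); }
}

class Pipeline {
    static int copies = 0; // counts per-token buffer clones

    // captureState-style exchange: deep-copies the buffer for every token.
    static String[] viaStateCopy(TermAttr source, String[] tokens) {
        String[] out = new String[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            source.set(tokens[i]);
            char[] captured = source.buffer.clone(); // the per-token copy
            copies++;
            out[i] = new String(captured, 0, source.length);
        }
        return out;
    }

    // Shared-AttributeSource-style exchange: the consumer reads the same
    // attribute instance the producer writes into; no buffer is cloned.
    static String[] viaSharedAttr(TermAttr shared, String[] tokens) {
        String[] out = new String[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            shared.set(tokens[i]);
            out[i] = shared.term(); // reads the shared state directly
        }
        return out;
    }
}
```

Both paths produce the same token strings, but only the first clones the char buffer on every token, which is the overhead the comment objects to.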



 TokenizerFactory should provide a create method with a given AttributeSource
 

 Key: LUCENE-4642
 URL: https://issues.apache.org/jira/browse/LUCENE-4642
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/analysis
Affects Versions: 4.1
Reporter: Renaud Delbru
Assignee: Steve Rowe
  Labels: analysis, attribute, tokenizer
 Fix For: 4.2, 5.0

 Attachments: LUCENE-4642.patch, LUCENE-4642.patch


 All tokenizer implementations have a constructor that takes a given 
 AttributeSource as parameter (LUCENE-1826). However, the TokenizerFactory 
 does not provide an API to create tokenizers with a given AttributeSource.
 Side note: There are still a lot of tokenizers that do not provide 
 constructors that take AttributeSource and AttributeFactory.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4325) DIH DateFormatEvaluator seems to have problems with DST changes - test disabled

2013-01-25 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562854#comment-13562854
 ] 

Commit Tag Bot commented on SOLR-4325:
--

[branch_4x commit] James Dyer
http://svn.apache.org/viewvc?view=revision&revision=1438598

SOLR-4325: fix TestBuiltInEvaluators


 DIH DateFormatEvaluator seems to have problems with DST changes - test 
 disabled
 

 Key: SOLR-4325
 URL: https://issues.apache.org/jira/browse/SOLR-4325
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 4.0, 4.1
Reporter: Uwe Schindler
Assignee: James Dyer
 Fix For: 4.2, 5.0

 Attachments: SOLR-4325.patch


 Yesterday was the DST change in Fiji (the clock went one hour backwards, as 
 summer time ended and winter time started). This caused 
 org.apache.solr.handler.dataimport.TestBuiltInEvaluators.testDateFormatEvaluator
  to fail. The reason is simple: NOW-2DAYS is evaluated without taking the time 
 zone into account (it subtracts 48 hours), but to be correct and go 2 
 DAYS back in local wall-clock time, it must subtract only 47 hours. If this 
 is not intended (we want to go 48 hours back, not 47), the test needs a fix. 
 Otherwise the date evaluator must take the time zone into account when 
 subtracting days (e.g., use a correctly localized Calendar instance and its 
 add() method 
 ([http://docs.oracle.com/javase/6/docs/api/java/util/Calendar.html#add(int, 
 int)]).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Closed] (SOLR-4354) Replication should perform full copy if slave's generation higher than master's

2013-01-25 Thread Amit Nithian (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Nithian closed SOLR-4354.
--

Resolution: Invalid

My apologies, that was embarrassing. I was looking at the 4.0 code that we use 
and not the 4.1 code, which has this fixed. I blindly copied my code to trunk 
without doing a proper code refresh (so much for late-night working).

Again, please accept my apologies.

 Replication should perform full copy if slave's generation higher than 
 master's
 ---

 Key: SOLR-4354
 URL: https://issues.apache.org/jira/browse/SOLR-4354
 Project: Solr
  Issue Type: Bug
  Components: replication (java)
Affects Versions: 4.1
Reporter: Amit Nithian
 Fix For: 4.2

 Attachments: SOLR-4354.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 We have dual masters each incrementally indexing from our MySQL database and 
 sit behind a virtual hostname in our load balancer. As such, it's possible 
 that the generation numbers between the masters for a given index are not in 
 sync. Slaves are configured to replicate from this virtual host (and pin 
 based on source/dest IP hash) so we can add and remove masters as necessary 
 (great for maintenance). 
 For the most part this works but we've seen the following happen:
 * Slave has been pulling from master A
 * Master A goes down for maint and now will pull from master B (which has a 
 lower generation number for some reason than master A).
 * Slave now tries to pull from master B (has higher index version than slave 
 but lower generation).
 * Slave downloads index files, moves them to the index/ directory but these 
 files are deleted during the doCommit() phase (looks like older generation 
 data is deleted).
 * Index remains as-is and no change.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Reopened] (SOLR-4354) Replication should perform full copy if slave's generation higher than master's

2013-01-25 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller reopened SOLR-4354:
---

  Assignee: Mark Miller

 Replication should perform full copy if slave's generation higher than 
 master's
 ---

 Key: SOLR-4354
 URL: https://issues.apache.org/jira/browse/SOLR-4354
 Project: Solr
  Issue Type: Bug
  Components: replication (java)
Affects Versions: 4.1
Reporter: Amit Nithian
Assignee: Mark Miller
 Fix For: 4.2

 Attachments: SOLR-4354.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 We have dual masters each incrementally indexing from our MySQL database and 
 sit behind a virtual hostname in our load balancer. As such, it's possible 
 that the generation numbers between the masters for a given index are not in 
 sync. Slaves are configured to replicate from this virtual host (and pin 
 based on source/dest IP hash) so we can add and remove masters as necessary 
 (great for maintenance). 
 For the most part this works but we've seen the following happen:
 * Slave has been pulling from master A
 * Master A goes down for maint and now will pull from master B (which has a 
 lower generation number for some reason than master A).
 * Slave now tries to pull from master B (has higher index version than slave 
 but lower generation).
 * Slave downloads index files, moves them to the index/ directory but these 
 files are deleted during the doCommit() phase (looks like older generation 
 data is deleted).
 * Index remains as-is and no change.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-4354) Replication should perform full copy if slave's generation higher than master's

2013-01-25 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller resolved SOLR-4354.
---

Resolution: Duplicate

 Replication should perform full copy if slave's generation higher than 
 master's
 ---

 Key: SOLR-4354
 URL: https://issues.apache.org/jira/browse/SOLR-4354
 Project: Solr
  Issue Type: Bug
  Components: replication (java)
Affects Versions: 4.1
Reporter: Amit Nithian
Assignee: Mark Miller
 Fix For: 4.2

 Attachments: SOLR-4354.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 We have dual masters each incrementally indexing from our MySQL database and 
 sit behind a virtual hostname in our load balancer. As such, it's possible 
 that the generation numbers between the masters for a given index are not in 
 sync. Slaves are configured to replicate from this virtual host (and pin 
 based on source/dest IP hash) so we can add and remove masters as necessary 
 (great for maintenance). 
 For the most part this works but we've seen the following happen:
 * Slave has been pulling from master A
 * Master A goes down for maint and now will pull from master B (which has a 
 lower generation number for some reason than master A).
 * Slave now tries to pull from master B (has higher index version than slave 
 but lower generation).
 * Slave downloads index files, moves them to the index/ directory but these 
 files are deleted during the doCommit() phase (looks like older generation 
 data is deleted).
 * Index remains as-is and no change.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4354) Replication should perform full copy if slave's generation higher than master's

2013-01-25 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562868#comment-13562868
 ] 

Mark Miller commented on SOLR-4354:
---

No worries Amit - looks like this was a dupe of SOLR-4303.

 Replication should perform full copy if slave's generation higher than 
 master's
 ---

 Key: SOLR-4354
 URL: https://issues.apache.org/jira/browse/SOLR-4354
 Project: Solr
  Issue Type: Bug
  Components: replication (java)
Affects Versions: 4.1
Reporter: Amit Nithian
Assignee: Mark Miller
 Fix For: 4.2

 Attachments: SOLR-4354.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 We have dual masters each incrementally indexing from our MySQL database and 
 sit behind a virtual hostname in our load balancer. As such, it's possible 
 that the generation numbers between the masters for a given index are not in 
 sync. Slaves are configured to replicate from this virtual host (and pin 
 based on source/dest IP hash) so we can add and remove masters as necessary 
 (great for maintenance). 
 For the most part this works but we've seen the following happen:
 * Slave has been pulling from master A
 * Master A goes down for maint and now will pull from master B (which has a 
 lower generation number for some reason than master A).
 * Slave now tries to pull from master B (has higher index version than slave 
 but lower generation).
 * Slave downloads index files, moves them to the index/ directory but these 
 files are deleted during the doCommit() phase (looks like older generation 
 data is deleted).
 * Index remains as-is and no change.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Moved] (LUCENE-4720) TermVectorAccessor return terms that do not match with current document

2013-01-25 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man moved SOLR-4360 to LUCENE-4720:


Lucene Fields: New
Affects Version/s: (was: 3.6.2)
   3.6.2
  Key: LUCENE-4720  (was: SOLR-4360)
  Project: Lucene - Core  (was: Solr)

 TermVectorAccessor return terms that do not match with current document
 ---

 Key: LUCENE-4720
 URL: https://issues.apache.org/jira/browse/LUCENE-4720
 Project: Lucene - Core
  Issue Type: Bug
Affects Versions: 3.6.2
Reporter: Francois-Xavier Bonnet
 Attachments: SOLR-4360.txt


 For each term, TermVectorAccessor looks in the indexReader and calls 
 termPositions.skipTo(documentNumber), but this method returns the first 
 document with an id greater than or equal to documentNumber.
 As a result you get some extra terms that do not really match 
 documentNumber.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.6.0) - Build # 134 - Failure!

2013-01-25 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/134/
Java: 64bit/jdk1.6.0 -XX:+UseSerialGC

All tests passed

Build Log:
[...truncated 29905 lines...]
BUILD FAILED
/Users/jenkins/jenkins-slave/workspace/Lucene-Solr-trunk-MacOSX/build.xml:305: 
The following error occurred while executing this line:
/Users/jenkins/jenkins-slave/workspace/Lucene-Solr-trunk-MacOSX/extra-targets.xml:120:
 The following files are missing svn:eol-style (or binary svn:mime-type):
* solr/core/src/java/org/apache/solr/cloud/OverseerSolrResponse.java

Total time: 84 minutes 3 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 64bit/jdk1.6.0 -XX:+UseSerialGC
Email was triggered for: Failure
Sending email for trigger: Failure



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.6.0_38) - Build # 3971 - Still Failing!

2013-01-25 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/3971/
Java: 32bit/jdk1.6.0_38 -client -XX:+UseParallelGC

All tests passed

Build Log:
[...truncated 29914 lines...]
BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:305: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/extra-targets.xml:120: The 
following files are missing svn:eol-style (or binary svn:mime-type):
* solr/core/src/java/org/apache/solr/cloud/OverseerSolrResponse.java

Total time: 36 minutes 32 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 32bit/jdk1.6.0_38 -client -XX:+UseParallelGC
Email was triggered for: Failure
Sending email for trigger: Failure



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[JENKINS-MAVEN] Lucene-Solr-Maven-4.x #226: POMs out of sync

2013-01-25 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Maven-4.x/226/

2 tests failed.
FAILED:  
org.apache.solr.cloud.RecoveryZkTest.org.apache.solr.cloud.RecoveryZkTest

Error Message:
Resource in scope SUITE failed to close. Resource was registered from thread 
Thread[id=3494, name=coreLoadExecutor-1996-thread-1, state=RUNNABLE, 
group=TGRP-RecoveryZkTest], registration stack trace below.

Stack Trace:
com.carrotsearch.randomizedtesting.ResourceDisposalError: Resource in scope 
SUITE failed to close. Resource was registered from thread Thread[id=3494, 
name=coreLoadExecutor-1996-thread-1, state=RUNNABLE, 
group=TGRP-RecoveryZkTest], registration stack trace below.
at java.lang.Thread.getStackTrace(Thread.java:1495)
at 
com.carrotsearch.randomizedtesting.RandomizedContext.closeAtEnd(RandomizedContext.java:150)
at 
org.apache.lucene.util.LuceneTestCase.closeAfterSuite(LuceneTestCase.java:517)
at 
org.apache.lucene.util.LuceneTestCase.wrapDirectory(LuceneTestCase.java:977)
at 
org.apache.lucene.util.LuceneTestCase.newDirectory(LuceneTestCase.java:875)
at 
org.apache.lucene.util.LuceneTestCase.newDirectory(LuceneTestCase.java:867)
at 
org.apache.solr.core.MockDirectoryFactory.create(MockDirectoryFactory.java:33)
at 
org.apache.solr.core.CachingDirectoryFactory.get(CachingDirectoryFactory.java:267)
at 
org.apache.solr.core.CachingDirectoryFactory.get(CachingDirectoryFactory.java:223)
at org.apache.solr.core.SolrCore.getNewIndexDir(SolrCore.java:241)
at org.apache.solr.core.SolrCore.initIndex(SolrCore.java:446)
at org.apache.solr.core.SolrCore.init(SolrCore.java:718)
at org.apache.solr.core.SolrCore.init(SolrCore.java:607)
at 
org.apache.solr.core.CoreContainer.createFromZk(CoreContainer.java:949)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1031)
at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)
at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:624)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:679)
Caused by: java.lang.AssertionError: Directory not closed: 
BaseDirectoryWrapper(org.apache.lucene.store.RAMDirectory@302dd679 
lockFactory=org.apache.lucene.store.NativeFSLockFactory@1d3aaf8a)
at org.junit.Assert.fail(Assert.java:93)
at 
org.apache.lucene.util.CloseableDirectory.close(CloseableDirectory.java:47)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$2$1.apply(RandomizedRunner.java:602)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$2$1.apply(RandomizedRunner.java:599)
at 
com.carrotsearch.randomizedtesting.RandomizedContext.closeResources(RandomizedContext.java:167)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$2.afterAlways(RandomizedRunner.java:615)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:43)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
... 1 more


FAILED:  
org.apache.solr.cloud.RecoveryZkTest.org.apache.solr.cloud.RecoveryZkTest

Error Message:
Resource in scope SUITE failed to close. Resource was registered from thread 
Thread[id=3524, name=RecoveryThread, state=RUNNABLE, 
group=TGRP-RecoveryZkTest], registration stack trace below.

Stack Trace:
com.carrotsearch.randomizedtesting.ResourceDisposalError: Resource in scope 
SUITE failed to close. Resource was registered from thread Thread[id=3524, 
name=RecoveryThread, state=RUNNABLE, group=TGRP-RecoveryZkTest], registration 
stack trace below.
at java.lang.Thread.getStackTrace(Thread.java:1495)
at 
com.carrotsearch.randomizedtesting.RandomizedContext.closeAtEnd(RandomizedContext.java:150)
at 
org.apache.lucene.util.LuceneTestCase.closeAfterSuite(LuceneTestCase.java:517)
at 
org.apache.lucene.util.LuceneTestCase.wrapDirectory(LuceneTestCase.java:983)
at 
org.apache.lucene.util.LuceneTestCase.newDirectory(LuceneTestCase.java:875)
at 
org.apache.lucene.util.LuceneTestCase.newDirectory(LuceneTestCase.java:867)
at 
org.apache.solr.core.MockDirectoryFactory.create(MockDirectoryFactory.java:33)
at 
org.apache.solr.core.CachingDirectoryFactory.get(CachingDirectoryFactory.java:267)
at 

[jira] [Created] (SOLR-4361) DIH request parameters with dots throws UnsupportedOperationException

2013-01-25 Thread James Dyer (JIRA)
James Dyer created SOLR-4361:


 Summary: DIH request parameters with dots throws 
UnsupportedOperationException
 Key: SOLR-4361
 URL: https://issues.apache.org/jira/browse/SOLR-4361
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 4.1
Reporter: James Dyer
Assignee: James Dyer
Priority: Minor
 Fix For: 4.2, 5.0


If the user puts placeholders for request parameters and these contain dots, 
DIH fails. The current workaround is either to use no dots or to use the 4.0 
DIH jar.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4361) DIH request parameters with dots throws UnsupportedOperationException

2013-01-25 Thread James Dyer (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562900#comment-13562900
 ] 

James Dyer commented on SOLR-4361:
--

Example from user list:

I've just tried to upgrade from 4.0 to 4.1 and I have the following
exception when reindexing my data:

Caused by: java.lang.UnsupportedOperationException
    at java.util.Collections$UnmodifiableMap.put(Collections.java:1283)
    at org.apache.solr.handler.dataimport.VariableResolver.currentLevelMap(VariableResolver.java:204)
    at org.apache.solr.handler.dataimport.VariableResolver.resolve(VariableResolver.java:94)
    at org.apache.solr.handler.dataimport.VariableResolver.replaceTokens(VariableResolver.java:144)
    at org.apache.solr.handler.dataimport.ContextImpl.replaceTokens(ContextImpl.java:254)
    at org.apache.solr.handler.dataimport.JdbcDataSource.resolveVariables(JdbcDataSource.java:203)
    at org.apache.solr.handler.dataimport.JdbcDataSource.createConnectionFactory(JdbcDataSource.java:101)
    at org.apache.solr.handler.dataimport.JdbcDataSource.init(JdbcDataSource.java:62)
    at org.apache.solr.handler.dataimport.DataImporter.getDataSourceInstance(DataImporter.java:394)

It seems to be related to the use of placeholders in data-config.xml:

<dataConfig>
  <dataSource type="JdbcDataSource"
              name="bceDS"
              driver="${dataimporter.request.solr.bceDS.driver}"
              url="${dataimporter.request.solr.bceDS.url}"
              user="${dataimporter.request.solr.bceDS.user}"
              password="${dataimporter.request.solr.bceDS.password}"
              batchSize="-1"/>
</dataConfig>

solrconfig.xml:

<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>

    <!-- dataSource parameters for data-config.xml -->
    <str name="solr.bceDS.driver">...</str>
    <str name="solr.bceDS.url">...</str>
    <str name="solr.bceDS.user">...</str>
    <str name="solr.bceDS.password">...</str>
  </lst>
</requestHandler>
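The first frame of the stack trace is generic Java behavior that can be shown in isolation (the class and values below are illustrative, not the actual DIH fix): Collections.unmodifiableMap returns a view whose put() always throws UnsupportedOperationException, and copying the entries into a fresh HashMap is the usual way to obtain a writable map.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class UoeDemo {
    // Returns true if put() on the given map throws
    // UnsupportedOperationException, as in the DIH stack trace above.
    static boolean putThrowsUoe(Map<String, String> map) {
        try {
            map.put("solr.bceDS.driver", "some.Driver"); // value is illustrative
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    // A read-only view, as request parameters are often exposed.
    static Map<String, String> readOnlyParams() {
        Map<String, String> m = new HashMap<>();
        m.put("config", "data-config.xml");
        return Collections.unmodifiableMap(m);
    }

    // Defensive copy: writable regardless of how the source was wrapped.
    static Map<String, String> writableCopy(Map<String, String> src) {
        return new HashMap<>(src);
    }
}
```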


 DIH request parameters with dots throws UnsupportedOperationException
 -

 Key: SOLR-4361
 URL: https://issues.apache.org/jira/browse/SOLR-4361
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 4.1
Reporter: James Dyer
Assignee: James Dyer
Priority: Minor
 Fix For: 4.2, 5.0


 If the user puts placeholders for request parameters and these contain dots, 
 DIH fails.  Current workaround is to either use no dots or use the 4.0 DIH 
 jar.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4225) Term info page under schema browser shows incorrect count of terms

2013-01-25 Thread Stefan Matheis (steffkes) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Matheis (steffkes) updated SOLR-4225:


Attachment: schema-browser_histogram.png
SOLR-4225.patch

Attached screenshot shows how the new histograms will look, using data 
from Shawn (the two on top) as well as exampledocs (the two at the bottom).

Thoughts on this?

 Term info page under schema browser shows incorrect count of terms 
 ---

 Key: SOLR-4225
 URL: https://issues.apache.org/jira/browse/SOLR-4225
 Project: Solr
  Issue Type: Bug
  Components: web gui
Affects Versions: 4.0
 Environment: chrome (version: Version 22.0.1229.94 m) on a windows 
 2003 machine
Reporter: Shreejay
Assignee: Stefan Matheis (steffkes)
Priority: Minor
 Attachments: luke-terms-elyograg.txt, schema-browser_histogram.png, 
 schemabrowser-termcount-problem.png, SOLR-4225.patch, TermInfo.png


 The box sizes on the term info page (under Schema Browser), overlaps, due to 
 which the number of terms shown look incorrect. Screenshot attached 
 (TermInfo.png).




[jira] [Comment Edited] (SOLR-4225) Term info page under schema browser shows incorrect count of terms

2013-01-25 Thread Stefan Matheis (steffkes) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562902#comment-13562902
 ] 

Stefan Matheis (steffkes) edited comment on SOLR-4225 at 1/25/13 6:47 PM:
--

Attached screenshot (schema-browser_histogram.png) shows how the new histograms 
will look, using data from Shawn (the two on top) as well as exampledocs 
(the two at the bottom)

Thoughts on this?

  was (Author: steffkes):
Attached screenshot shows how the new histograms will look, using data 
from Shawn (the two on top) as well as exampledocs (the two at the bottom)

Thoughts on this?
  
 Term info page under schema browser shows incorrect count of terms 
 ---

 Key: SOLR-4225
 URL: https://issues.apache.org/jira/browse/SOLR-4225
 Project: Solr
  Issue Type: Bug
  Components: web gui
Affects Versions: 4.0
 Environment: chrome (version: Version 22.0.1229.94 m) on a windows 
 2003 machine
Reporter: Shreejay
Assignee: Stefan Matheis (steffkes)
Priority: Minor
 Attachments: luke-terms-elyograg.txt, schema-browser_histogram.png, 
 schemabrowser-termcount-problem.png, SOLR-4225.patch, TermInfo.png


 The box sizes on the term info page (under Schema Browser) overlap, which 
 makes the number of terms shown look incorrect. Screenshot attached 
 (TermInfo.png).




[jira] [Commented] (SOLR-4359) The RecentUpdates#update method should treat a problem reading the next record the same as a problem parsing the record - log the exception and break.

2013-01-25 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562927#comment-13562927
 ] 

Commit Tag Bot commented on SOLR-4359:
--

[trunk commit] Mark Robert Miller
http://svn.apache.org/viewvc?view=revision&revision=1438655

SOLR-4359: The RecentUpdates#update method should treat a problem reading the 
next record the same as a problem parsing the record - log the exception and  
break.


 The RecentUpdates#update method should treat a problem reading the next 
 record the same as a problem parsing the record - log the exception and break.
 --

 Key: SOLR-4359
 URL: https://issues.apache.org/jira/browse/SOLR-4359
 Project: Solr
  Issue Type: Bug
Reporter: Mark Miller
Assignee: Mark Miller
 Fix For: 4.2, 5.0







[jira] [Commented] (SOLR-4359) The RecentUpdates#update method should treat a problem reading the next record the same as a problem parsing the record - log the exception and break.

2013-01-25 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562934#comment-13562934
 ] 

Commit Tag Bot commented on SOLR-4359:
--

[branch_4x commit] Mark Robert Miller
http://svn.apache.org/viewvc?view=revision&revision=1438656

SOLR-4359: The RecentUpdates#update method should treat a problem reading the 
next record the same as a problem parsing the record - log the exception and  
break.


 The RecentUpdates#update method should treat a problem reading the next 
 record the same as a problem parsing the record - log the exception and break.
 --

 Key: SOLR-4359
 URL: https://issues.apache.org/jira/browse/SOLR-4359
 Project: Solr
  Issue Type: Bug
Reporter: Mark Miller
Assignee: Mark Miller
 Fix For: 4.2, 5.0







[jira] [Resolved] (SOLR-4359) The RecentUpdates#update method should treat a problem reading the next record the same as a problem parsing the record - log the exception and break.

2013-01-25 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller resolved SOLR-4359.
---

Resolution: Fixed

 The RecentUpdates#update method should treat a problem reading the next 
 record the same as a problem parsing the record - log the exception and break.
 --

 Key: SOLR-4359
 URL: https://issues.apache.org/jira/browse/SOLR-4359
 Project: Solr
  Issue Type: Bug
Reporter: Mark Miller
Assignee: Mark Miller
 Fix For: 4.2, 5.0







[jira] [Commented] (SOLR-4361) DIH request parameters with dots throws UnsupportedOperationException

2013-01-25 Thread James Dyer (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562947#comment-13562947
 ] 

James Dyer commented on SOLR-4361:
--

Also, this workaround was mentioned.  It should be protected with a unit test 
so it doesn't get broken, and added to the wiki if not already documented:

I do something similar, but without the placeholders in db-data-config.xml. You 
can define the entire datasource in solrconfig.xml, then leave out that element 
entirely in db-data-config.xml. It seems really odd, but that is how the code 
works.

This is working for me in 4.1, so it might be a workaround for you.

It looks like this:

  <requestHandler name="/dataimport" 
class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
      <str name="config">db-data-config.xml</str>
      <lst name="datasource">
        <str name="defType">JdbcDataSource</str>
        <str name="driver">com.mysql.jdbc.Driver</str>
        <str name="url">jdbc:mysql://${textbooks.dbhost:nohost}/</str>
        <str name="user">${textbooks.dbuser:y}</str>
        <str name="password">${textbooks.dbpass:zz}</str>
        <str name="batchSize">-1</str>
        <str name="readOnly">true</str>
        <str name="onError">skip</str>
        <str name="netTimeoutForStreamingResults">600</str>
        <str name="zeroDateTimeBehavior">convertToNull</str>
      </lst>
    </lst>
  </requestHandler>
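For readers unfamiliar with the ${name:default} placeholder syntax used in the
config above, here is a minimal, self-contained sketch of how dotted property
names can be resolved. This is a hypothetical helper for illustration only, not
DIH's actual resolver (whose handling of dots is exactly what this issue fixes):

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: resolve ${name:default} placeholders where names may contain dots.
class PlaceholderDemo {
    // group 1 = property name (dots allowed), group 2 = optional default
    private static final Pattern PLACEHOLDER =
        Pattern.compile("\\$\\{([^:}]+)(?::([^}]*))?\\}");

    static String resolve(String template, Map<String, String> props) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String name = m.group(1);        // e.g. "textbooks.dbhost"
            String fallback = m.group(2);    // text after ':', or null
            String value = props.getOrDefault(name, fallback == null ? "" : fallback);
            m.appendReplacement(sb, Matcher.quoteReplacement(value));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> props = Map.of("textbooks.dbhost", "db1.example.com");
        // prints jdbc:mysql://db1.example.com/
        System.out.println(resolve("jdbc:mysql://${textbooks.dbhost:nohost}/", props));
        // missing property falls back to the default: prints y
        System.out.println(resolve("${textbooks.dbuser:y}", Map.of()));
    }
}
```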


 DIH request parameters with dots throws UnsupportedOperationException
 -

 Key: SOLR-4361
 URL: https://issues.apache.org/jira/browse/SOLR-4361
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 4.1
Reporter: James Dyer
Assignee: James Dyer
Priority: Minor
 Fix For: 4.2, 5.0


 If the user uses placeholders for request parameters and these contain dots, 
 DIH fails.  The current workaround is either to avoid dots or to use the 4.0 
 DIH jar.




[jira] [Commented] (SOLR-4225) Term info page under schema browser shows incorrect count of terms

2013-01-25 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562954#comment-13562954
 ] 

Hoss Man commented on SOLR-4225:


+1 ... nice.

Why are the numbers formatted as {{8'388'608}} instead of {{8,388,608}} or the 
more SI-recommended {{8 388 608}}?  is {{\'}} a locale-based convention i'm 
not aware of? 

 Term info page under schema browser shows incorrect count of terms 
 ---

 Key: SOLR-4225
 URL: https://issues.apache.org/jira/browse/SOLR-4225
 Project: Solr
  Issue Type: Bug
  Components: web gui
Affects Versions: 4.0
 Environment: chrome (version: Version 22.0.1229.94 m) on a windows 
 2003 machine
Reporter: Shreejay
Assignee: Stefan Matheis (steffkes)
Priority: Minor
 Attachments: luke-terms-elyograg.txt, schema-browser_histogram.png, 
 schemabrowser-termcount-problem.png, SOLR-4225.patch, TermInfo.png


 The box sizes on the term info page (under Schema Browser) overlap, which 
 makes the number of terms shown look incorrect. Screenshot attached 
 (TermInfo.png).




[jira] [Commented] (SOLR-4225) Term info page under schema browser shows incorrect count of terms

2013-01-25 Thread Stefan Matheis (steffkes) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562963#comment-13562963
 ] 

Stefan Matheis (steffkes) commented on SOLR-4225:
-

bq. Why are the numbers formatted as {{8'388'608}} instead of {{8,388,608}} or 
the more SI-recommended {{8 388 608}}? is {{\'}} a locale-based convention i'm 
not aware of?

Uhm, that's a good question oO That's the same formatting rule i used for the 
DIH-Interface, grabbed a short Javascript-Snippet from StackOverflow which 
included this apostrophe. 

If we change the formatting-character, i'd like to use the , instead of the 
whitespace - because the whitespace only works well if you use mono-space 
formatting (as you did in your comment), otherwise the space between the digits 
is so small, that it does not really help while scanning the whole number.
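For what it's worth, the apostrophe need not come from locale data at all. The
admin UI uses JavaScript, but the idea can be shown in plain Java (a sketch only,
not the UI's actual snippet): force the grouping separator explicitly so the
output is deterministic regardless of the JVM's locale.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Sketch: digit grouping with an explicitly chosen separator character.
// The apostrophe is the Swiss (de-CH) grouping convention mentioned in the thread.
class GroupingDemo {
    static String group(long n, char sep) {
        DecimalFormatSymbols sym = new DecimalFormatSymbols(Locale.ROOT);
        sym.setGroupingSeparator(sep);          // ' or , or a space
        return new DecimalFormat("#,###", sym).format(n);
    }

    public static void main(String[] args) {
        System.out.println(group(8388608, '\''));  // 8'388'608
        System.out.println(group(8388608, ','));   // 8,388,608
    }
}
```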

 Term info page under schema browser shows incorrect count of terms 
 ---

 Key: SOLR-4225
 URL: https://issues.apache.org/jira/browse/SOLR-4225
 Project: Solr
  Issue Type: Bug
  Components: web gui
Affects Versions: 4.0
 Environment: chrome (version: Version 22.0.1229.94 m) on a windows 
 2003 machine
Reporter: Shreejay
Assignee: Stefan Matheis (steffkes)
Priority: Minor
 Attachments: luke-terms-elyograg.txt, schema-browser_histogram.png, 
 schemabrowser-termcount-problem.png, SOLR-4225.patch, TermInfo.png


 The box sizes on the term info page (under Schema Browser) overlap, which 
 makes the number of terms shown look incorrect. Screenshot attached 
 (TermInfo.png).




[jira] [Comment Edited] (SOLR-4225) Term info page under schema browser shows incorrect count of terms

2013-01-25 Thread Stefan Matheis (steffkes) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562963#comment-13562963
 ] 

Stefan Matheis (steffkes) edited comment on SOLR-4225 at 1/25/13 8:04 PM:
--

bq. Why are the numbers formatted as {{8'388'608}} instead of {{8,388,608}} or 
the more SI-recommended {{8 388 608}}? is {{\'}} a locale-based convention i'm 
not aware of?

Uhm, that's a good question oO That's the same formatting rule i used for the 
DIH-Interface, grabbed a short Javascript-Snippet from StackOverflow which 
included this apostrophe. 

If we change the formatting-character, i'd like to use the , instead of the 
whitespace - because the whitespace only works well if you use mono-space 
formatting (as you did in your comment), otherwise the space between the digits 
is so small, that it does not really help while scanning the whole number.

# edit

hmm, maybe .. looks like it's the Swiss formatting rule, but i didn't realize 
that :D

  was (Author: steffkes):
bq. Why are the numbers formatted as {{8'388'608}} instead of {{8,388,608}} 
or the more SI-recommended {{8 388 608}}? is {{\'}} a locale-based convention 
i'm not aware of?

Uhm, that's a good question oO That's the same formatting rule i used for the 
DIH-Interface, grabbed a short Javascript-Snippet from StackOverflow which 
included this apostrophe. 

If we change the formatting-character, i'd like to use the , instead of the 
whitespace - because the whitespace only works well if you use mono-space 
formatting (as you did in your comment), otherwise the space between the digits 
is so small, that it does not really help while scanning the whole number.
  
 Term info page under schema browser shows incorrect count of terms 
 ---

 Key: SOLR-4225
 URL: https://issues.apache.org/jira/browse/SOLR-4225
 Project: Solr
  Issue Type: Bug
  Components: web gui
Affects Versions: 4.0
 Environment: chrome (version: Version 22.0.1229.94 m) on a windows 
 2003 machine
Reporter: Shreejay
Assignee: Stefan Matheis (steffkes)
Priority: Minor
 Attachments: luke-terms-elyograg.txt, schema-browser_histogram.png, 
 schemabrowser-termcount-problem.png, SOLR-4225.patch, TermInfo.png


 The box sizes on the term info page (under Schema Browser) overlap, which 
 makes the number of terms shown look incorrect. Screenshot attached 
 (TermInfo.png).




[jira] [Updated] (SOLR-4225) Term info page under schema browser shows incorrect count of terms

2013-01-25 Thread Stefan Matheis (steffkes) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Matheis (steffkes) updated SOLR-4225:


Attachment: schema-browser_histogram.png

Instead of maybe .. have a look at your own .. the two samples on top are 
updated .. the first using whitespace and the second using comma as separator - 
let me know which one :)

 Term info page under schema browser shows incorrect count of terms 
 ---

 Key: SOLR-4225
 URL: https://issues.apache.org/jira/browse/SOLR-4225
 Project: Solr
  Issue Type: Bug
  Components: web gui
Affects Versions: 4.0
 Environment: chrome (version: Version 22.0.1229.94 m) on a windows 
 2003 machine
Reporter: Shreejay
Assignee: Stefan Matheis (steffkes)
Priority: Minor
 Attachments: luke-terms-elyograg.txt, schema-browser_histogram.png, 
 schema-browser_histogram.png, schemabrowser-termcount-problem.png, 
 SOLR-4225.patch, TermInfo.png


 The box sizes on the term info page (under Schema Browser) overlap, which 
 makes the number of terms shown look incorrect. Screenshot attached 
 (TermInfo.png).




[jira] [Commented] (SOLR-4325) DIH DateFormatEvaluator seems to have problems with DST changes - test disabled

2013-01-25 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563012#comment-13563012
 ] 

Uwe Schindler commented on SOLR-4325:
-

Thanks James!

 DIH DateFormatEvaluator seems to have problems with DST changes - test 
 disabled
 

 Key: SOLR-4325
 URL: https://issues.apache.org/jira/browse/SOLR-4325
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 4.0, 4.1
Reporter: Uwe Schindler
Assignee: James Dyer
 Fix For: 4.2, 5.0

 Attachments: SOLR-4325.patch


 Yesterday was the DST change in Fiji (the clock went one hour backwards, as 
 summer time ended and winter time started). This caused 
 org.apache.solr.handler.dataimport.TestBuiltInEvaluators.testDateFormatEvaluator
  to fail. The reason is simple: NOW-2DAYS is evaluated without taking the time 
 zone into account (it subtracts 48 hours), but to be correct and go 2 
 DAYS back in local wall clock time, it must subtract only 47 hours. If this 
 is not intended (we want to go 48 hours back, not 47), the test needs a fix. 
 Otherwise the date evaluator must take the timezone into account when 
 subtracting days (e.g., use a correctly localized Calendar instance and use 
 the add() method 
 ([http://docs.oracle.com/javase/6/docs/api/java/util/Calendar.html#add(int, 
 int)]).
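The two interpretations contrasted in the description differ by exactly one hour
across a fall-back transition. A self-contained sketch of the difference, using
the 2013-11-03 America/New_York transition (rather than Fiji's) purely so the
numbers are deterministic:

```java
import java.util.Calendar;
import java.util.TimeZone;

// Sketch: "minus 48 hours of milliseconds" vs. "minus 2 wall-clock days"
// straddling the 2013-11-03 fall-back in America/New_York.
class DstDemo {
    static final TimeZone TZ = TimeZone.getTimeZone("America/New_York");

    static Calendar noonNov4() {
        Calendar cal = Calendar.getInstance(TZ);
        cal.clear();
        cal.set(2013, Calendar.NOVEMBER, 4, 12, 0, 0);  // Mon 12:00 EST
        return cal;
    }

    static int hourAfterRawSubtract() {
        Calendar raw = Calendar.getInstance(TZ);
        raw.setTimeInMillis(noonNov4().getTimeInMillis() - 48L * 60 * 60 * 1000);
        return raw.get(Calendar.HOUR_OF_DAY);  // 13: the clock fell back in between
    }

    static int hourAfterCalendarAdd() {
        Calendar cal = noonNov4();
        cal.add(Calendar.DAY_OF_MONTH, -2);    // DST-aware field arithmetic
        return cal.get(Calendar.HOUR_OF_DAY);  // still 12
    }

    public static void main(String[] args) {
        System.out.println(hourAfterRawSubtract());  // 13
        System.out.println(hourAfterCalendarAdd());  // 12
    }
}
```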




[jira] [Commented] (LUCENE-4695) Add utility class for getting live values for a given field during NRT indexing

2013-01-25 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563061#comment-13563061
 ] 

Commit Tag Bot commented on LUCENE-4695:


[trunk commit] Michael McCandless
http://svn.apache.org/viewvc?view=revision&revision=1438721

LUCENE-4695: add LiveFieldValues, to get current (live/real-time) values for 
fields indexed after the last NRT reopen


 Add utility class for getting live values for a given field during NRT 
 indexing
 ---

 Key: LUCENE-4695
 URL: https://issues.apache.org/jira/browse/LUCENE-4695
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.2, 5.0

 Attachments: LUCENE-4695.patch, LUCENE-4695.patch


 This is a simple utility/wrapper class, that holds the field
 values for recently indexed documents until the NRT reader has
 refreshed, and exposes a get API to get the last indexed value per
 id.
 For example one could use this to look up the version field for a
 given id, even when that id was just indexed and not yet visible in
 the NRT reader.
 The implementation is fairly simple: it just watches the gen coming
 out of NRTManager and updates/prunes accordingly.
 The class is abstract: you must subclass it and impl the lookupFromSearcher
 method...
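Independent of Lucene's actual LiveFieldValues API, the pattern the description
outlines can be sketched as follows. lookupFromSearcher is the abstract hook the
description mentions; everything else here is illustrative, not the real class:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: hold id -> value for recently indexed docs until the searcher has
// refreshed past them, then drop them and let lookups fall through.
abstract class LiveValuesSketch<V> {
    private final Map<String, V> old = new ConcurrentHashMap<>();
    private final Map<String, V> current = new ConcurrentHashMap<>();

    void add(String id, V value) { current.put(id, value); }

    // A refresh has started: values indexed so far will be visible once the
    // new searcher is live, but must stay readable until then.
    void beforeRefresh() { old.putAll(current); current.clear(); }

    // The refresh completed: the new searcher now covers everything in "old".
    void afterRefresh() { old.clear(); }

    V get(String id) {
        V v = current.get(id);
        if (v == null) v = old.get(id);
        if (v == null) v = lookupFromSearcher(id);  // the abstract hook
        return v;
    }

    protected abstract V lookupFromSearcher(String id);
}
```

The real implementation also has to watch the generation coming out of
NRTManager, which this sketch glosses over.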




[jira] [Created] (SOLR-4362) edismax, phrase query with slop, pf parameter

2013-01-25 Thread Ahmet Arslan (JIRA)
Ahmet Arslan created SOLR-4362:
--

 Summary: edismax, phrase query with slop, pf parameter
 Key: SOLR-4362
 URL: https://issues.apache.org/jira/browse/SOLR-4362
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.1
Reporter: Ahmet Arslan


When a sloppy phrase query (plus an additional term) is used with edismax, the 
slop value is searched against fields that are supplied with the pf parameter.

Example: with the url q="phrase query"~10 term&qf=text&pf=text, a document 
having 10 as a term in its text field is boosted.




[jira] [Resolved] (LUCENE-4695) Add utility class for getting live values for a given field during NRT indexing

2013-01-25 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-4695.


Resolution: Fixed

 Add utility class for getting live values for a given field during NRT 
 indexing
 ---

 Key: LUCENE-4695
 URL: https://issues.apache.org/jira/browse/LUCENE-4695
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.2, 5.0

 Attachments: LUCENE-4695.patch, LUCENE-4695.patch


 This is a simple utility/wrapper class, that holds the field
 values for recently indexed documents until the NRT reader has
 refreshed, and exposes a get API to get the last indexed value per
 id.
 For example one could use this to look up the version field for a
 given id, even when that id was just indexed and not yet visible in
 the NRT reader.
 The implementation is fairly simple: it just watches the gen coming
 out of NRTManager and updates/prunes accordingly.
 The class is abstract: you must subclass it and impl the lookupFromSearcher
 method...




Re: Fixing query-time multi-word synonym issue

2013-01-25 Thread Michael McCandless
PositionLengthAttribute is sufficient to express the true graph, but
SynonymFilter has not been fully fixed to properly set it.
Specifically, it cannot create new positions, which is what's
necessary if you expand when applying synonyms (e.g., dns - domain
name service).

It is better to do the reverse: map the multi-word phrase down to a
single token, at indexing time (domain name service - dns): you get
accurate scoring (exact docFreq for how many docs have either dns or
domain name service) and faster search performance, and PosLenAtt is
properly set, and you work around the fact that the index cannot index 
the position length att (since you never create alternate paths in
the token graph).  The downside is you must re-index if you change
your synonyms.
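Setting Lucene's SynonymFilter aside, the index-time contraction described above
(domain name service -> dns) amounts to a greedy longest-match replacement over
the token stream. A plain-Java sketch, not Lucene's implementation:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Sketch: contract multi-word synonym phrases to a single token at index time.
class ContractDemo {
    static List<String> contract(List<String> tokens,
                                 Map<List<String>, String> phrases) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < tokens.size()) {
            int matched = 0;
            String replacement = null;
            for (Map.Entry<List<String>, String> e : phrases.entrySet()) {
                List<String> phrase = e.getKey();
                int end = i + phrase.size();
                // Prefer the longest phrase starting at position i.
                if (end <= tokens.size() && phrase.size() > matched
                        && tokens.subList(i, end).equals(phrase)) {
                    matched = phrase.size();
                    replacement = e.getValue();
                }
            }
            if (matched > 0) { out.add(replacement); i += matched; }
            else { out.add(tokens.get(i)); i++; }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("configure", "domain", "name", "service", "records");
        Map<List<String>, String> syn = Map.of(Arrays.asList("domain", "name", "service"), "dns");
        System.out.println(contract(tokens, syn));  // [configure, dns, records]
    }
}
```

Running the same contraction at query time keeps index and query terms in sync,
which is the point of Mike's suggestion; the trade-off, as he notes, is re-indexing
whenever the synonym set changes.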

However: once we fix QueryParser to stop splitting on whitespace (it's
really ridiculous that it does so: it causes so many problems), and
fix SynFilter to create positions, it is in theory possible to take
the resulting graph (if you expand when applying synonyms) and
enumerate the correct query (something like MultiPhraseQuery, or OR
of them, or something; maybe we'll need WordGraphQuery), and get the
correct results.

Mike McCandless

http://blog.mikemccandless.com

On Fri, Jan 25, 2013 at 11:27 AM, Jack Krupansky
j...@basetechnology.com wrote:
 One clarification from my previous comment: One requirement is to prevent
 false matches for instances of heart infarction and myocardial attack -
 the current synonym filter does not preserve the path or term ordering
 within the multi-term phrases. Even if the query parser does present the
 full term sequence as a single input string.

 Yes, the position information is preserved, but there is no path attribute
 to be able to tell that heart was before attack as opposed to before
 infarction.


 -- Jack Krupansky

 -Original Message- From: Robert Muir
 Sent: Friday, January 25, 2013 9:47 AM

 To: dev@lucene.apache.org
 Subject: Re: Fixing query-time multi-word synonym issue

 On Fri, Jan 25, 2013 at 9:19 AM, Jack Krupansky j...@basetechnology.com
 wrote:

 Here's an example query with q.op=AND:

causes of heart attack

 And I have this synonym definition:

heart attack, myocardial infarction

 So, what is the alleged query parser fix so that the query is treated as:

causes of (heart attack OR myocardial infarction)


 That's actually inefficient and stupid to do. If you make a parser that
 doesn't split on whitespace, you can just tell it to fold at index and
 query time, just like stemming. No OR necessary.

 But I think you are trying to get off topic, again the real problem
 affecting 99%+ users is that the lucene queryparser splits on
 whitespace.

 If this is fixed, then lots of things (not just synonyms, but other
 basic shit that is broken today) starts working too:
 https://issues.apache.org/jira/browse/LUCENE-2605






[jira] [Commented] (LUCENE-4695) Add utility class for getting live values for a given field during NRT indexing

2013-01-25 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563071#comment-13563071
 ] 

Commit Tag Bot commented on LUCENE-4695:


[branch_4x commit] Michael McCandless
http://svn.apache.org/viewvc?view=revision&revision=1438731

LUCENE-4695: add LiveFieldValues, to get current (live/real-time) values for 
fields indexed after the last NRT reopen


 Add utility class for getting live values for a given field during NRT 
 indexing
 ---

 Key: LUCENE-4695
 URL: https://issues.apache.org/jira/browse/LUCENE-4695
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.2, 5.0

 Attachments: LUCENE-4695.patch, LUCENE-4695.patch


 This is a simple utility/wrapper class, that holds the field
 values for recently indexed documents until the NRT reader has
 refreshed, and exposes a get API to get the last indexed value per
 id.
 For example one could use this to look up the version field for a
 given id, even when that id was just indexed and not yet visible in
 the NRT reader.
 The implementation is fairly simple: it just watches the gen coming
 out of NRTManager and updates/prunes accordingly.
 The class is abstract: you must subclass it and impl the lookupFromSearcher
 method...




Re: java.lang.NumberFormatException Using PhraseQuery with Lucene 4.0.0

2013-01-25 Thread Michael McCandless
Can you provide the full stack trace?

Mike McCandless

http://blog.mikemccandless.com

On Fri, Jan 25, 2013 at 11:13 AM, JimAld jim.alder...@db.com wrote:
 Hi,

 The below code is throwing the exception:
 java.lang.NumberFormatException: For input string: "01.SZ" at
 java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)

 when the TopDocs docs = indexSearcher.search(phraseQuery, null, 10, sort);
 line is called. This only happens when the searchPattern contains a space
 character. No other info is available in the exception.

 The 01.SZ value is the first value in my index...
 the index is a RAMDirectory...

 Anyone have any ideas? Many Thanks.

 Code:
 BooleanQuery.setMaxClauseCount(clauseCount);
 searchPattern = QueryParser.escape(searchPattern);
 Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_40);
 IndexReader reader = IndexReader.open(index);
 IndexSearcher indexSearcher = new IndexSearcher(reader);

 PhraseQuery phraseQuery = new PhraseQuery();
 Term term = new Term(fieldName, searchPattern);
 phraseQuery.add(term);
 phraseQuery.setSlop(0);

 Sort sort = new Sort(new SortField(fieldName,SortField.Type.SCORE));
 TopDocs docs = indexSearcher.search(phraseQuery, null, 10, sort);




 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/java-lang-NumberFormatException-Using-PhraseQuery-with-Lucene-4-0-0-tp4036273.html
 Sent from the Lucene - Java Developer mailing list archive at Nabble.com.






[jira] [Commented] (SOLR-4362) edismax, phrase query with slop, pf parameter

2013-01-25 Thread Ahmet Arslan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563075#comment-13563075
 ] 

Ahmet Arslan commented on SOLR-4362:


http://search-lucene.com/m/RwfwXkbfc

 edismax, phrase query with slop, pf parameter
 -

 Key: SOLR-4362
 URL: https://issues.apache.org/jira/browse/SOLR-4362
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.1
Reporter: Ahmet Arslan
  Labels: edismax, pf

 When a sloppy phrase query (plus an additional term) is used with edismax, the 
 slop value is searched against fields that are supplied with the pf parameter.
 Example: with the url q="phrase query"~10 term&qf=text&pf=text, a document 
 having 10 as a term in its text field is boosted.




[jira] [Updated] (SOLR-4362) edismax, phrase query with slop, pf parameter

2013-01-25 Thread Ahmet Arslan (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Arslan updated SOLR-4362:
---

Attachment: SOLR-4362.patch

A failing test case that demonstrates the problem.

 edismax, phrase query with slop, pf parameter
 -

 Key: SOLR-4362
 URL: https://issues.apache.org/jira/browse/SOLR-4362
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.1
Reporter: Ahmet Arslan
  Labels: edismax, pf
 Attachments: SOLR-4362.patch


 When a sloppy phrase query (plus an additional term) is used with edismax, the 
 slop value is searched against fields that are supplied with the pf parameter.
 Example: with the url q="phrase query"~10 term&qf=text&pf=text, a document 
 having 10 as a term in its text field is boosted.




[jira] [Created] (LUCENE-4721) WordDelimiterFilter ignores payloads

2013-01-25 Thread Scott Smerchek (JIRA)
Scott Smerchek created LUCENE-4721:
--

 Summary: WordDelimiterFilter ignores payloads 
 Key: LUCENE-4721
 URL: https://issues.apache.org/jira/browse/LUCENE-4721
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/analysis
Affects Versions: 4.1
Reporter: Scott Smerchek
 Attachments: LUCENE-4721.patch

When generating new tokens, the WordDelimiterFilter does not carry forward 
payloads. It appears that this issue was fixed long ago in 1.4 
(https://issues.apache.org/jira/browse/SOLR-532); however, it is an issue 
yet again.
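As a rough illustration of what "carrying payloads forward" means here, the sketch below uses an invented `Token` record and `split` helper (not the real WordDelimiterFilter API): each sub-token generated from a delimited parent inherits the parent's payload instead of dropping it.

```java
import java.util.ArrayList;
import java.util.List;

public class PayloadCarrySketch {
    // a toy token: text plus an opaque payload
    public record Token(String text, byte[] payload) {}

    // split on '-' delimiters; every generated sub-token inherits the
    // parent token's payload instead of dropping it
    public static List<Token> split(Token parent) {
        List<Token> out = new ArrayList<>();
        for (String part : parent.text().split("-")) {
            out.add(new Token(part, parent.payload()));
        }
        return out;
    }
}
```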




[jira] [Updated] (LUCENE-4721) WordDelimiterFilter ignores payloads

2013-01-25 Thread Scott Smerchek (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Smerchek updated LUCENE-4721:
---

Attachment: LUCENE-4721.patch

 WordDelimiterFilter ignores payloads 
 -

 Key: LUCENE-4721
 URL: https://issues.apache.org/jira/browse/LUCENE-4721
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/analysis
Affects Versions: 4.1
Reporter: Scott Smerchek
 Attachments: LUCENE-4721.patch


 When generating new tokens, the WordDelimiterFilter does not carry forward 
 payloads. It appears that this issue was fixed long ago in 1.4 
 (https://issues.apache.org/jira/browse/SOLR-532); however, it is an issue 
 yet again.




[jira] [Created] (LUCENE-4722) Can we move SortField.Type.SCORE/DOC to singleton SortField instances instead...?

2013-01-25 Thread Michael McCandless (JIRA)
Michael McCandless created LUCENE-4722:
--

 Summary: Can we move SortField.Type.SCORE/DOC to singleton 
SortField instances instead...?
 Key: LUCENE-4722
 URL: https://issues.apache.org/jira/browse/LUCENE-4722
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Michael McCandless
 Fix For: 4.2, 5.0


It's ... weird that you can do e.g. new SortField("myfield", 
SortField.Type.SCORE).

We already have dedicated SortField.FIELD_SCORE and FIELD_DOC ... so I think 
apps should use those and never make a new SortField for them?
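For illustration only, here is a self-contained sketch of what the proposal could look like (the class and `create` factory are invented here, not Lucene's actual API): SCORE and DOC requests get funneled to shared singletons instead of minting new instances.

```java
public final class SortFieldSketch {
    public enum Type { SCORE, DOC, STRING }

    // the shared singletons apps would use directly
    public static final SortFieldSketch FIELD_SCORE = new SortFieldSketch(null, Type.SCORE);
    public static final SortFieldSketch FIELD_DOC   = new SortFieldSketch(null, Type.DOC);

    private final String field;
    private final Type type;

    private SortFieldSketch(String field, Type type) {
        this.field = field;
        this.type = type;
    }

    // factory funnels SCORE/DOC to the singletons instead of new instances
    public static SortFieldSketch create(String field, Type type) {
        switch (type) {
            case SCORE: return FIELD_SCORE;
            case DOC:   return FIELD_DOC;
            default:    return new SortFieldSketch(field, type);
        }
    }

    public Type getType() { return type; }
}
```

The design point is that identity comparison (`==`) against the singletons then becomes safe throughout the codebase.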




[jira] [Commented] (LUCENE-4642) TokenizerFactory should provide a create method with a given AttributeSource

2013-01-25 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563108#comment-13563108
 ] 

Robert Muir commented on LUCENE-4642:
-

My problem, I guess, with AttributeSource/AttributeFactory is that they invade 
every single custom tokenizer: the API is not good.

I realize it's useful for expert users to be able to plug in their own, but why 
in the world must *every* tokenizer have ctor explosion (minimum 3) to support 
this? 

And I guess I was secretly hoping we could remove Tokenizer(AttributeSource) if 
we fixed the Solr hack. :)

Again, my main problem is not about what you want to do; it's instead related to 
the existing APIs (Tokenizer.java) and where we are heading if we perpetuate 
this to the analysis factories (TokenizerFactory) too.


 TokenizerFactory should provide a create method with a given AttributeSource
 

 Key: LUCENE-4642
 URL: https://issues.apache.org/jira/browse/LUCENE-4642
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/analysis
Affects Versions: 4.1
Reporter: Renaud Delbru
Assignee: Steve Rowe
  Labels: analysis, attribute, tokenizer
 Fix For: 4.2, 5.0

 Attachments: LUCENE-4642.patch, LUCENE-4642.patch


 All tokenizer implementations have a constructor that takes a given 
 AttributeSource as a parameter (LUCENE-1826). However, the TokenizerFactory 
 does not provide an API to create tokenizers with a given AttributeSource.
 Side note: there are still a lot of tokenizers that do not provide 
 constructors that take an AttributeSource or AttributeFactory.




[jira] [Commented] (SOLR-4225) Term info page under schema browser shows incorrect count of terms

2013-01-25 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563129#comment-13563129
 ] 

Hoss Man commented on SOLR-4225:


Well, 16 years of education at US schools has biased me in favor of using the comma 
as the thousands separator -- but I appreciate that people smarter than me claim 
whitespace separation is less confusing when communicating with people from 
other cultures that have diff conventions. (although I appreciate your point 
about it mainly being useful in fixed-width fonts)

Truthfully, I don't really care what we use: I was just surprised by the 
apostrophes since that's not a convention I'd ever seen before in any locale.
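For context on the separator conventions under discussion, a small JDK-only sketch shows how `java.text.NumberFormat` groups digits per locale; the apostrophe style is, for example, the Swiss (de-CH) convention (the class name here is invented for the demo):

```java
import java.text.NumberFormat;
import java.util.Locale;

public class GroupingDemo {
    // format an integer with the locale's grouping separator
    public static String group(long n, Locale locale) {
        return NumberFormat.getIntegerInstance(locale).format(n);
    }

    public static void main(String[] args) {
        System.out.println(group(1234567L, Locale.US));              // comma grouping
        System.out.println(group(1234567L, Locale.GERMANY));         // dot grouping
        System.out.println(group(1234567L, new Locale("de", "CH"))); // apostrophe-style grouping
    }
}
```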

 Term info page under schema browser shows incorrect count of terms 
 ---

 Key: SOLR-4225
 URL: https://issues.apache.org/jira/browse/SOLR-4225
 Project: Solr
  Issue Type: Bug
  Components: web gui
Affects Versions: 4.0
 Environment: chrome (version: Version 22.0.1229.94 m) on a windows 
 2003 machine
Reporter: Shreejay
Assignee: Stefan Matheis (steffkes)
Priority: Minor
 Attachments: luke-terms-elyograg.txt, schema-browser_histogram.png, 
 schema-browser_histogram.png, schemabrowser-termcount-problem.png, 
 SOLR-4225.patch, TermInfo.png


 The boxes on the term info page (under Schema Browser) overlap, which makes 
 the number of terms shown look incorrect. Screenshot attached 
 (TermInfo.png).




Re: java.lang.NumberFormatException Using PhraseQuery with Lucene 4.0.0

2013-01-25 Thread JimAld
Sure, here it is:

java.lang.NumberFormatException: For input string: "01.SZ"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
    at java.lang.Integer.parseInt(Integer.java:458)
    at java.lang.Byte.parseByte(Byte.java:151)
    at java.lang.Byte.parseByte(Byte.java:108)
    at org.apache.lucene.search.FieldCache$1.parseByte(FieldCache.java:130)
    at org.apache.lucene.search.FieldCacheImpl$ByteCache.createValue(FieldCacheImpl.java:366)
    at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:248)
    at org.apache.lucene.search.FieldCacheImpl.getBytes(FieldCacheImpl.java:329)
    at org.apache.lucene.search.FieldComparator$ByteComparator.setNextReader(FieldComparator.java:271)
    at org.apache.lucene.search.TopFieldCollector$OneComparatorNonScoringCollector.setNextReader(TopFieldCollector.java:97)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:585)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:555)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:507)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:484)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:309)
    at com.db.gef.locates.index.impl.LuceneLocatesSearchIndex.getMatchingIndexedObjectPhrases(LuceneLocatesSearchIndex.java:361)
    at com.db.gef.locates.cache.impl.services.CasheServiceImpl.lookupSecurity(CasheServiceImpl.java:304)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:304)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:182)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:149)
    at org.springframework.remoting.support.RemoteInvocationTraceInterceptor.invoke(RemoteInvocationTraceInterceptor.java:70)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
    at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
    at $Proxy62.lookupSecurity(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at com.caucho.hessian.server.HessianSkeleton.invoke(HessianSkeleton.java:157)
    at org.springframework.remoting.caucho.Hessian2SkeletonInvoker.invoke(Hessian2SkeletonInvoker.java:67)
    at org.springframework.remoting.caucho.HessianServiceExporter.handleRequest(HessianServiceExporter.java:147)
    at org.springframework.web.servlet.mvc.HttpRequestHandlerAdapter.handle(HttpRequestHandlerAdapter.java:49)
    at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:859)
    at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:793)
    at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:476)
    at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:441)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at weblogic.servlet.internal.StubSecurityHelper$ServletServiceAction.run(StubSecurityHelper.java:227)
    at weblogic.servlet.internal.StubSecurityHelper.invokeServlet(StubSecurityHelper.java:125)
    at weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:292)
    at weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:175)
    at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.run(WebAppServletContext.java:3498)
    at weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:321)
    at weblogic.security.service.SecurityManager.runAs(Unknown Source)
    at weblogic.servlet.internal.WebAppServletContext.securedExecute(WebAppServletContext.java:2180)
    at weblogic.servlet.internal.WebAppServletContext.execute(WebAppServletContext.java:2086)
    at weblogic.servlet.internal.ServletRequestImpl.run(ServletRequestImpl.java:1406)
    at weblogic.work.ExecuteThread.execute(ExecuteThread.java:201)
    at 

Re: java.lang.NumberFormatException Using PhraseQuery with Lucene 4.0.0

2013-01-25 Thread JimAld
Also, I made a mistake in my original post: the sort constructor used is
actually of type STRING, as follows:

Sort sort = new Sort(new SortField(fieldName, SortField.Type.STRING));

All the rest of the code is correct.

Thanks.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/java-lang-NumberFormatException-Using-PhraseQuery-with-Lucene-4-0-0-tp4036273p4036383.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.




[jira] [Commented] (SOLR-4362) edismax, phrase query with slop, pf parameter

2013-01-25 Thread Ahmet Arslan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563152#comment-13563152
 ] 

Ahmet Arslan commented on SOLR-4362:


org.apache.solr.search.ExtendedDismaxQParser#splitIntoClauses("phrase 
query"~10 term) returns 3 clauses:
{noformat}
 field = null rawField = null isPhrase = true val = phrase query raw = "phrase query" 
 field = null rawField = null isPhrase = false val = \~10 raw = ~10 
 field = null rawField = null isPhrase = false val = term raw = term
{noformat}

And mainUserQuery becomes: "phrase query" \~10 term 
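To illustrate the splitting behavior in question, here is a minimal, self-contained sketch (the regex splitter is a hypothetical stand-in, not Solr's actual splitIntoClauses implementation): a quote-aware split keeps the ~10 slop suffix attached to its phrase, whereas a plain whitespace split detaches it as a separate clause.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ClauseSplitSketch {
    // a quoted phrase (with an optional ~slop suffix) stays one clause;
    // everything else splits on whitespace
    private static final Pattern CLAUSE = Pattern.compile("\"[^\"]*\"(~\\d+)?|\\S+");

    public static List<String> split(String q) {
        List<String> out = new ArrayList<>();
        Matcher m = CLAUSE.matcher(q);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // keeps the slop attached to the phrase instead of making it a clause
        System.out.println(split("\"phrase query\"~10 term"));
    }
}
```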

 edismax, phrase query with slop, pf parameter
 -

 Key: SOLR-4362
 URL: https://issues.apache.org/jira/browse/SOLR-4362
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.1
Reporter: Ahmet Arslan
  Labels: edismax, pf
 Attachments: SOLR-4362.patch


 When a sloppy phrase query (plus an additional term) is used with edismax, the 
 slop value is searched against the fields supplied with the pf parameter.
 Example: With the URL q="phrase query"~10 term&qf=text&pf=text, a document 
 having the term 10 in its text field is boosted.




Re: Fixing query-time multi-word synonym issue

2013-01-25 Thread Jack Krupansky

Thanks for that insight, Mike!

I don't think anybody is in disagreement with the need for the query parser 
to present the full, white-space delimited pseudo-term sequence to analysis 
in one step. My proposal from last September recognized that.


Is there a decent writeup on PositionLengthAttribute? I mean, the Javadoc 
says "The positionLength determines how many positions this token spans", 
which doesn't sound very relevant to multi-term synonyms that span multiple 
positions.


-- Jack Krupansky

-Original Message- 
From: Michael McCandless

Sent: Friday, January 25, 2013 4:41 PM
To: dev@lucene.apache.org
Subject: Re: Fixing query-time multi-word synonym issue

PositionLengthAttribute is sufficient to express the true graph, but
SynonymFilter has not been fully fixed to properly set it.
Specifically, it cannot create new positions, which is what's
necessary if you expand when applying synonyms (e.g., dns ->
domain name service).
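A rough sketch of the graph a fully fixed expansion would have to produce, in posInc/posLen terms (the `Tok` record is invented for illustration; real Lucene uses PositionIncrementAttribute and PositionLengthAttribute): "dns" and "domain" share a start position, and "dns" spans all three positions of the expanded phrase, which requires creating two new positions.

```java
import java.util.List;

public class PosLenSketch {
    // posInc: gap from the previous token's position; posLen: positions spanned
    public record Tok(String text, int posInc, int posLen) {}

    // the graph an expansion of "dns" -> "domain name service" would need:
    // "dns" starts where "domain" starts and spans three positions
    public static List<Tok> graph() {
        return List.of(
                new Tok("dns", 1, 3),
                new Tok("domain", 0, 1),
                new Tok("name", 1, 1),
                new Tok("service", 1, 1));
    }
}
```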

It is better to do the reverse: map the multi-word phrase down to a
single token, at indexing time (domain name service -> dns): you get
accurate scoring (exact docFreq for how many docs have either dns or
domain name service) and faster search performance, and PosLenAtt is
properly set, and you work around the fact that the index cannot index
the position length att (since you never create alternate paths in
the token graph).  The downside is you must re-index if you change
your synonyms.
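The index-time contraction described above can be sketched, very roughly, as a greedy phrase-to-token rewrite over a token list (this is an illustration only, not SynonymFilter's actual implementation):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SynonymContractionSketch {
    // greedily replace every occurrence of a multi-word phrase with a single token
    public static List<String> contract(List<String> tokens, List<String> phrase, String single) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < tokens.size()) {
            if (i + phrase.size() <= tokens.size()
                    && tokens.subList(i, i + phrase.size()).equals(phrase)) {
                out.add(single);        // e.g. "domain name service" -> "dns"
                i += phrase.size();
            } else {
                out.add(tokens.get(i++));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("my", "domain", "name", "service", "works");
        // contracts to [my, dns, works]
        System.out.println(contract(tokens, Arrays.asList("domain", "name", "service"), "dns"));
    }
}
```

Applying the same contraction at query time keeps index and query terms aligned, which is why no alternate graph paths are needed.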

However: once we fix QueryParser to stop splitting on whitespace (it's
really ridiculous that it does so: it causes so many problems), and
fix SynFilter to create positions, it is in theory possible to take
the resulting graph (if you expand when applying synonyms) and
enumerate the correct query (something like MultiPhraseQuery, or OR
of them, or something; maybe we'll need WordGraphQuery), and get the
correct results.

Mike McCandless

http://blog.mikemccandless.com

On Fri, Jan 25, 2013 at 11:27 AM, Jack Krupansky
j...@basetechnology.com wrote:

One clarification from my previous comment: one requirement is to prevent
false matches for instances of "heart infarction" and "myocardial attack" --
the current synonym filter does not preserve the path or term ordering
within the multi-term phrases, even if the query parser does present the
full term sequence as a single input string.

Yes, the position information is preserved, but there is no path attribute
to be able to tell that "heart" was before "attack" as opposed to before
"infarction".


-- Jack Krupansky

-Original Message- From: Robert Muir
Sent: Friday, January 25, 2013 9:47 AM

To: dev@lucene.apache.org
Subject: Re: Fixing query-time multi-word synonym issue

On Fri, Jan 25, 2013 at 9:19 AM, Jack Krupansky j...@basetechnology.com
wrote:


Here's an example query with q.op=AND:

   causes of heart attack

And I have this synonym definition:

   heart attack, myocardial infarction

So, what is the alleged query parser fix so that the query is treated as:

   causes of ("heart attack" OR "myocardial infarction")



That's actually inefficient and stupid to do. If you make a parser that
doesn't split on whitespace, you can just tell it to fold at index and
query time, just like stemming. No OR necessary.

But I think you are trying to get off topic; again, the real problem
affecting 99%+ of users is that the Lucene queryparser splits on
whitespace.

If this is fixed, then lots of things (not just synonyms, but other
basic shit that is broken today) start working too:
https://issues.apache.org/jira/browse/LUCENE-2605







[jira] [Commented] (LUCENE-1822) FastVectorHighlighter: SimpleFragListBuilder hard-coded 6 char margin is too naive

2013-01-25 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563305#comment-13563305
 ] 

Commit Tag Bot commented on LUCENE-1822:


[trunk commit] Koji Sekiguchi
http://svn.apache.org/viewvc?view=revision&revision=1438822

LUCENE-1822: add a note in Changes in runtime behavior


 FastVectorHighlighter: SimpleFragListBuilder hard-coded 6 char margin is too 
 naive
 --

 Key: LUCENE-1822
 URL: https://issues.apache.org/jira/browse/LUCENE-1822
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/highlighter
Affects Versions: 2.9
 Environment: any
Reporter: Alex Vigdor
Assignee: Koji Sekiguchi
Priority: Minor
 Fix For: 4.1, 5.0

 Attachments: LUCENE-1822.patch, LUCENE-1822.patch, LUCENE-1822.patch, 
 LUCENE-1822-tests.patch


 The new FastVectorHighlighter performs extremely well, however I've found in 
 testing that the window of text chosen per fragment is often very poor, as it 
 is hard coded in SimpleFragListBuilder to always select starting 6 characters 
 to the left of the first phrase match in a fragment.  When selecting long 
 fragments, this often means that there is barely any context before the 
 highlighted word, and lots after; even worse, when highlighting a phrase at 
 the end of a short text the beginning is cut off, even though the entire 
 phrase would fit in the specified fragCharSize.  For example, highlighting 
 "Punishment" in "Crime and Punishment" returns "e and <b>Punishment</b>" no 
 matter what fragCharSize is specified.  I am going to attach a patch that 
 improves the text window selection by recalculating the starting margin once 
 all phrases in the fragment have been identified - this way if a single word 
 is matched in a fragment, it will appear in the middle of the highlight, 
 instead of 6 characters from the beginning.  This way one can also guarantee 
 that the entirety of short texts are represented in a fragment by specifying 
 a large enough fragCharSize.
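The recentering idea in that description can be sketched as a small window calculation (a hypothetical helper, not the actual SimpleFragListBuilder code): center the fragCharSize window on the match and clamp it to the text bounds, rather than always starting 6 characters left of the match.

```java
public class FragmentWindowSketch {
    // choose a fragment window of fragCharSize chars centered on the match,
    // clamped to the text bounds, instead of a fixed 6-char left margin
    public static int[] window(int matchStart, int matchEnd, int textLen, int fragCharSize) {
        int matchLen = matchEnd - matchStart;
        int margin = Math.max(0, (fragCharSize - matchLen) / 2);
        int start = Math.max(0, matchStart - margin);
        int end = Math.min(textLen, start + fragCharSize);
        start = Math.max(0, end - fragCharSize); // pull back if we ran into the end of text
        return new int[] {start, end};
    }
}
```

With this scheme, highlighting a phrase at the very end of a short text pulls the window back so the whole text fits in the fragment.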




[jira] [Commented] (LUCENE-1822) FastVectorHighlighter: SimpleFragListBuilder hard-coded 6 char margin is too naive

2013-01-25 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563308#comment-13563308
 ] 

Koji Sekiguchi commented on LUCENE-1822:


I committed the above note to trunk, branch_4x and lucene_solr_4_1.

 FastVectorHighlighter: SimpleFragListBuilder hard-coded 6 char margin is too 
 naive
 --

 Key: LUCENE-1822
 URL: https://issues.apache.org/jira/browse/LUCENE-1822
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/highlighter
Affects Versions: 2.9
 Environment: any
Reporter: Alex Vigdor
Assignee: Koji Sekiguchi
Priority: Minor
 Fix For: 4.1, 5.0

 Attachments: LUCENE-1822.patch, LUCENE-1822.patch, LUCENE-1822.patch, 
 LUCENE-1822-tests.patch


 The new FastVectorHighlighter performs extremely well, however I've found in 
 testing that the window of text chosen per fragment is often very poor, as it 
 is hard coded in SimpleFragListBuilder to always select starting 6 characters 
 to the left of the first phrase match in a fragment.  When selecting long 
 fragments, this often means that there is barely any context before the 
 highlighted word, and lots after; even worse, when highlighting a phrase at 
 the end of a short text the beginning is cut off, even though the entire 
 phrase would fit in the specified fragCharSize.  For example, highlighting 
 "Punishment" in "Crime and Punishment" returns "e and <b>Punishment</b>" no 
 matter what fragCharSize is specified.  I am going to attach a patch that 
 improves the text window selection by recalculating the starting margin once 
 all phrases in the fragment have been identified - this way if a single word 
 is matched in a fragment, it will appear in the middle of the highlight, 
 instead of 6 characters from the beginning.  This way one can also guarantee 
 that the entirety of short texts are represented in a fragment by specifying 
 a large enough fragCharSize.




[jira] [Commented] (LUCENE-1822) FastVectorHighlighter: SimpleFragListBuilder hard-coded 6 char margin is too naive

2013-01-25 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563312#comment-13563312
 ] 

Commit Tag Bot commented on LUCENE-1822:


[branch_4x commit] Koji Sekiguchi
http://svn.apache.org/viewvc?view=revision&revision=1438824

LUCENE-1822: add a note in Changes in runtime behavior


 FastVectorHighlighter: SimpleFragListBuilder hard-coded 6 char margin is too 
 naive
 --

 Key: LUCENE-1822
 URL: https://issues.apache.org/jira/browse/LUCENE-1822
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/highlighter
Affects Versions: 2.9
 Environment: any
Reporter: Alex Vigdor
Assignee: Koji Sekiguchi
Priority: Minor
 Fix For: 4.1, 5.0

 Attachments: LUCENE-1822.patch, LUCENE-1822.patch, LUCENE-1822.patch, 
 LUCENE-1822-tests.patch


 The new FastVectorHighlighter performs extremely well, however I've found in 
 testing that the window of text chosen per fragment is often very poor, as it 
 is hard coded in SimpleFragListBuilder to always select starting 6 characters 
 to the left of the first phrase match in a fragment.  When selecting long 
 fragments, this often means that there is barely any context before the 
 highlighted word, and lots after; even worse, when highlighting a phrase at 
 the end of a short text the beginning is cut off, even though the entire 
 phrase would fit in the specified fragCharSize.  For example, highlighting 
 "Punishment" in "Crime and Punishment" returns "e and <b>Punishment</b>" no 
 matter what fragCharSize is specified.  I am going to attach a patch that 
 improves the text window selection by recalculating the starting margin once 
 all phrases in the fragment have been identified - this way if a single word 
 is matched in a fragment, it will appear in the middle of the highlight, 
 instead of 6 characters from the beginning.  This way one can also guarantee 
 that the entirety of short texts are represented in a fragment by specifying 
 a large enough fragCharSize.




RE: java.lang.NumberFormatException Using PhraseQuery with Lucene 4.0.0

2013-01-25 Thread Uwe Schindler
Hi,

this has nothing to do with PhraseQuery. The stack trace shows that your code 
seems to have passed SortField.BYTE, so maybe you have a logic error somewhere? 
PhraseQuery by itself does not use the FieldCache; only the result 
collector uses the cache, and it is independent of the query.
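The diagnosis can be reproduced with the JDK alone: per the trace, the BYTE field cache parser ultimately calls `Byte.parseByte`, which rejects any non-numeric term such as "01.SZ" (the class and helper below are invented for the demo):

```java
public class ByteSortDemo {
    // mimics the failing call in the trace: the default BYTE parser
    // ends up in Byte.parseByte, which rejects non-numeric terms
    public static boolean canParseAsByte(String term) {
        try {
            Byte.parseByte(term);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canParseAsByte("42"));    // a numeric term parses as a byte
        System.out.println(canParseAsByte("01.SZ")); // the term from the trace does not
    }
}
```

This is why sorting a field of string terms requires SortField.Type.STRING, as JimAld later corrected.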

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


 -Original Message-
 From: JimAld [mailto:jim.alder...@db.com]
 Sent: Saturday, January 26, 2013 12:01 AM
 To: dev@lucene.apache.org
 Subject: Re: java.lang.NumberFormatException Using PhraseQuery with
 Lucene 4.0.0
 
 Sure, here it is:
 
 java.lang.NumberFormatException: For input string: "01.SZ"
     at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
     at java.lang.Integer.parseInt(Integer.java:458)
     at java.lang.Byte.parseByte(Byte.java:151)
     at java.lang.Byte.parseByte(Byte.java:108)
     at org.apache.lucene.search.FieldCache$1.parseByte(FieldCache.java:130)
     at org.apache.lucene.search.FieldCacheImpl$ByteCache.createValue(FieldCacheImpl.java:366)
     at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:248)
     at org.apache.lucene.search.FieldCacheImpl.getBytes(FieldCacheImpl.java:329)
     at org.apache.lucene.search.FieldComparator$ByteComparator.setNextReader(FieldComparator.java:271)
     at org.apache.lucene.search.TopFieldCollector$OneComparatorNonScoringCollector.setNextReader(TopFieldCollector.java:97)
     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:585)
     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:555)
     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:507)
     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:484)
     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:309)
     at com.db.gef.locates.index.impl.LuceneLocatesSearchIndex.getMatchingIndexedObjectPhrases(LuceneLocatesSearchIndex.java:361)
     at com.db.gef.locates.cache.impl.services.CasheServiceImpl.lookupSecurity(CasheServiceImpl.java:304)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
     at java.lang.reflect.Method.invoke(Method.java:597)
     at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:304)
     at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:182)
     at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:149)
     at org.springframework.remoting.support.RemoteInvocationTraceInterceptor.invoke(RemoteInvocationTraceInterceptor.java:70)
     at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
     at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
     at $Proxy62.lookupSecurity(Unknown Source)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
     at java.lang.reflect.Method.invoke(Method.java:597)
     at com.caucho.hessian.server.HessianSkeleton.invoke(HessianSkeleton.java:157)
     at org.springframework.remoting.caucho.Hessian2SkeletonInvoker.invoke(Hessian2SkeletonInvoker.java:67)
     at org.springframework.remoting.caucho.HessianServiceExporter.handleRequest(HessianServiceExporter.java:147)
     at org.springframework.web.servlet.mvc.HttpRequestHandlerAdapter.handle(HttpRequestHandlerAdapter.java:49)
     at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:859)
     at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:793)
     at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:476)
     at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:441)
     at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
     at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
     at weblogic.servlet.internal.StubSecurityHelper$ServletServiceAction.run(StubSecurityHelper.java:227)
     at weblogic.servlet.internal.StubSecurityHelper.invokeServlet(StubSecurityHelper.java:125)
     at weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:292)
     at