Re: set PYTHONPATH programmatically from Java?

2011-11-14 Thread Roman Chyla
hi,

So after reading
http://docs.python.org/c-api/init.html#PySys_SetArgvEx and the source
code for _PythonVM_init, I figured it out.

I have to do:

PythonVM.start("/dvt/workspace/montysolr/src/python/montysolr");

sys.path then contains the parent folder (above montysolr), and
I can then set more things by loading some bootstrap module.
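
A minimal sketch of that approach in Java, assuming the JCC-generated
org.apache.jcc.PythonVM class and a hypothetical "montysolr.bootstrap" module
under src/python whose class adds further entries to sys.path:

import org.apache.jcc.PythonVM;  // assumed package of the JCC PythonVM wrapper

public class StartMonty {
    public static void main(String[] args) {
        // Start the embedded interpreter with the program path; as noted
        // above, the parent folder of "montysolr" then ends up on sys.path.
        PythonVM.start("/dvt/workspace/montysolr/src/python/montysolr");
        PythonVM vm = PythonVM.get();

        // Hypothetical bootstrap module/class that inserts additional
        // directories into sys.path before the real work starts.
        vm.instantiate("montysolr.bootstrap", "Bootstrap");
    }
}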

but something like
http://docs.python.org/c-api/veryhigh.html#PyRun_SimpleString would be
much more flexible. Is it something that could be added? I can prepare
a patch (as it seems really trivial my knowledge might be sufficient
for this :))

roman

On Mon, Nov 14, 2011 at 1:12 PM, Roman Chyla roman.ch...@gmail.com wrote:
 On Mon, Nov 14, 2011 at 4:25 AM, Andi Vajda va...@apache.org wrote:

 On Sun, 13 Nov 2011, Roman Chyla wrote:

 I am using JCC to run Python inside Java. For unittest, I'd like to
 set PYTHONPATH environment variable programmatically. I can change env
 vars inside Java (using

 http://stackoverflow.com/questions/318239/how-do-i-set-environment-variables-from-java)
 and System.getenv("PYTHONPATH") shows correct values

 However, I am still getting ImportError: no module named

 If I set PYTHONPATH before starting unittest, it works fine

 Is what I would like to do possible?

 Why mess with the environment instead of setting sys.path directly?

 That would be great, but I don't know how. I am doing roughly this:

 PythonVM.start(programName)
 vm = PythonVM.get()
 vm.instantiate(moduleName, className);

 I tried also:
 PythonVM.start(programName, new String[]{"-c", "import
 sys;sys.path.insert(0, '/dvt/workspace/montysolr/src/python')"});

 it is failing on vm.instantiate when Python cannot find the module


 Alternatively, if JCC could execute/eval a Python string, I could set
 sys.argv that way.

 I'm not sure what you mean here but JCC's Java PythonVM.init() method takes
 an array of strings that is fed into sys.argv. See _PythonVM_Init() sources
 in jcc.cpp for details.

 sorry, i meant sys.path, not sys.argv

 roman


 Andi..




[jira] [Updated] (LUCENE-3269) Speed up Top-K sampling tests

2011-11-14 Thread Shai Erera (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3269:
---

Attachment: LUCENE-3269.patch

Patch introduces the following:
* HashMap<Integer, SearchTaxoDirPair> which is initialized in beforeClass and 
maps a partition size to a pair of Directories.
* initIndex first checks the map for the partition size, and creates the 
indexes only if no matching pair is found.

The sampling tests do not benefit from this directly, as they only run one test 
method; however, if they run in the same JVM, they will reuse the already 
created indexes.

Patch is against 3x and seems trivial, so I intend to commit this later today 
or tomorrow if there are no objections.
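
For illustration, a rough sketch of the caching pattern the patch introduces; 
SearchTaxoDirPair and initIndex are named in the description above, but the exact 
fields and types here are assumptions rather than the attached patch:

{code:java}
import java.util.HashMap;

import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

public abstract class FacetTestBaseSketch {

  /** Pair of search/taxonomy directories, as described above (illustrative fields). */
  static class SearchTaxoDirPair {
    Directory searchDir = new RAMDirectory();
    Directory taxoDir = new RAMDirectory();
  }

  /** Populated lazily; maps a partition size to its already-built pair of Directories. */
  private static final HashMap<Integer, SearchTaxoDirPair> dirsPerPartitionSize =
      new HashMap<Integer, SearchTaxoDirPair>();

  /** Builds the indexes only if no pair exists yet for this partition size. */
  protected SearchTaxoDirPair initIndex(int partitionSize) throws Exception {
    SearchTaxoDirPair pair = dirsPerPartitionSize.get(partitionSize);
    if (pair == null) {
      pair = new SearchTaxoDirPair();
      populateIndex(pair, partitionSize);  // hypothetical helper that writes the content
      dirsPerPartitionSize.put(partitionSize, pair);
    }
    return pair;
  }

  protected abstract void populateIndex(SearchTaxoDirPair pair, int partitionSize)
      throws Exception;
}
{code}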

 Speed up Top-K sampling tests
 -

 Key: LUCENE-3269
 URL: https://issues.apache.org/jira/browse/LUCENE-3269
 Project: Lucene - Java
  Issue Type: Test
  Components: modules/facet
Reporter: Robert Muir
 Fix For: 3.5, 4.0

 Attachments: LUCENE-3269.patch, LUCENE-3269.patch, LUCENE-3269.patch, 
 LUCENE-3269.patch


 Speed up the top-k sampling tests (but make sure they are still thorough on 
 nightly runs etc.).
 Usually we would do this with atLeast(), but these tests are somewhat 
 tricky, so maybe a different approach is needed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-2571) Indexing performance tests with realtime branch

2011-11-14 Thread Simon Willnauer (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer resolved LUCENE-2571.
-

Resolution: Fixed

 Indexing performance tests with realtime branch
 ---

 Key: LUCENE-2571
 URL: https://issues.apache.org/jira/browse/LUCENE-2571
 Project: Lucene - Java
  Issue Type: Task
  Components: core/index
Reporter: Michael Busch
Assignee: Simon Willnauer
Priority: Minor
 Fix For: Realtime Branch

 Attachments: wikimedium.realtime.Standard.nd10M_dps.png, 
 wikimedium.realtime.Standard.nd10M_dps_addDocuments.png, 
 wikimedium.realtime.Standard.nd10M_dps_addDocuments_flush.png, 
 wikimedium.trunk.Standard.nd10M_dps.png, 
 wikimedium.trunk.Standard.nd10M_dps_BalancedSegmentMergePolicy.png, 
 wikimedium.trunk.Standard.nd10M_dps_addDocuments.png


 We should run indexing performance tests with the DWPT changes and compare to 
 trunk.
 We need to test both single-threaded and multi-threaded performance.
 NOTE:  flush by RAM isn't implemented just yet, so either we wait with the 
 tests or flush by doc count.




[jira] [Commented] (LUCENE-3562) Stop storing TermsEnum in CloseableThreadLocal inside Terms instance

2011-11-14 Thread Simon Willnauer (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149531#comment-13149531
 ] 

Simon Willnauer commented on LUCENE-3562:
-

Mike, I think you should commit this - the patch looks good to me.

 Stop storing TermsEnum in CloseableThreadLocal inside Terms instance
 

 Key: LUCENE-3562
 URL: https://issues.apache.org/jira/browse/LUCENE-3562
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.0

 Attachments: LUCENE-3562.patch


 We have sugar methods in Terms.java (docFreq, totalTermFreq, docs,
 docsAndPositions) that use a saved thread-private TermsEnum to do the
 lookups.
 But on apps that send many threads through Lucene, and/or have many
 segments, this can add up to a lot of RAM, especially if the codecs
 impl holds onto stuff.
 Also, Terms has a close method (closes the CloseableThreadLocal) which
 must be called, but we fail to do so in some places.
 These saved enums are the cause of the recent OOME in TestNRTManager
 (TestNRTManager.testNRTManager -seed
 2aa27e1aec20c4a2:-4a5a5ecf46837d0e:-7c4f651f1f0b75d7 -mult 3
 -nightly).
 Really, sharing these enums is a holdover from before Lucene queries
 would share state (i.e., save the TermState from the first pass, and use
 it later to pull enums, get docFreq, etc.).  It's not helpful anymore,
 and it can use gobs of RAM, so I'd like to remove it.
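
For context, a condensed sketch of the kind of thread-private caching being removed 
here (names and signatures are illustrative, not the actual Terms.java code):

{code:java}
import java.io.IOException;

import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.util.CloseableThreadLocal;

// Illustrative only: each thread lazily caches its own TermsEnum; that per-thread,
// per-segment state is what adds up to RAM, and close() must never be forgotten.
abstract class CachedTermsEnumSketch {

  private final CloseableThreadLocal<TermsEnum> threadEnums =
      new CloseableThreadLocal<TermsEnum>();

  /** Creates a fresh (potentially expensive) enum, like Terms.iterator(). */
  protected abstract TermsEnum newTermsEnum() throws IOException;

  /** Returns this thread's cached enum, creating it on first use. */
  protected TermsEnum getThreadTermsEnum() throws IOException {
    TermsEnum termsEnum = threadEnums.get();
    if (termsEnum == null) {
      termsEnum = newTermsEnum();
      threadEnums.set(termsEnum);  // held until close(), one per thread per instance
    }
    return termsEnum;
  }

  /** Must be called, otherwise the thread-local enums (and codec state) leak. */
  public void close() {
    threadEnums.close();
  }
}
{code}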




[jira] [Resolved] (LUCENE-1271) ClassCastException when using ParallelMultiSearcher.search(Query query, Filter filter, int n, Sort sort)

2011-11-14 Thread Simon Willnauer (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer resolved LUCENE-1271.
-

Resolution: Won't Fix

ParallelMultiSearcher is deprecated; use IndexSearcher instead.

 ClassCastException when using ParallelMultiSearcher.search(Query query, 
 Filter filter, int n, Sort sort)
 

 Key: LUCENE-1271
 URL: https://issues.apache.org/jira/browse/LUCENE-1271
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/search
Affects Versions: 2.3, 2.3.1
 Environment: MS Windows XP (SP 2), JDK 1.5.0 Update 12
Reporter: Kai Burjack
Priority: Minor
 Fix For: 4.0


 Stacktrace-Output in Console:
 Exception in thread "MultiSearcher thread #1" java.lang.ClassCastException: 
 org.apache.lucene.search.ScoreDoc
   at 
 org.apache.lucene.search.FieldDocSortedHitQueue.lessThan(FieldDocSortedHitQueue.java:105)
   at org.apache.lucene.util.PriorityQueue.upHeap(PriorityQueue.java:139)
   at org.apache.lucene.util.PriorityQueue.put(PriorityQueue.java:53)
   at 
 org.apache.lucene.util.PriorityQueue.insertWithOverflow(PriorityQueue.java:78)
   at org.apache.lucene.util.PriorityQueue.insert(PriorityQueue.java:63)
   at 
 org.apache.lucene.search.MultiSearcherThread.run(ParallelMultiSearcher.java:272)
 Exception in thread "MultiSearcher thread #2" java.lang.ClassCastException: 
 org.apache.lucene.search.ScoreDoc
   at 
 org.apache.lucene.search.FieldDocSortedHitQueue.lessThan(FieldDocSortedHitQueue.java:105)
   at org.apache.lucene.util.PriorityQueue.upHeap(PriorityQueue.java:139)
   at org.apache.lucene.util.PriorityQueue.put(PriorityQueue.java:53)
   at 
 org.apache.lucene.util.PriorityQueue.insertWithOverflow(PriorityQueue.java:78)
   at org.apache.lucene.util.PriorityQueue.insert(PriorityQueue.java:63)
   at 
 org.apache.lucene.search.MultiSearcherThread.run(ParallelMultiSearcher.java:272)
 Stack-Trace in resulting exception while performing the JUnit-Test:
 java.lang.ClassCastException: org.apache.lucene.search.ScoreDoc
   at 
 org.apache.lucene.search.FieldDocSortedHitQueue.lessThan(FieldDocSortedHitQueue.java:105)
   at org.apache.lucene.util.PriorityQueue.downHeap(PriorityQueue.java:155)
   at org.apache.lucene.util.PriorityQueue.pop(PriorityQueue.java:106)
   at 
 org.apache.lucene.search.ParallelMultiSearcher.search(ParallelMultiSearcher.java:146)
   at org.apache.lucene.search.Searcher.search(Searcher.java:78)
   at class calling the Searcher.search(Query query, Filter filter, int 
 n, Sort sort) method with filter:null and sort:null
   
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
   at java.lang.reflect.Method.invoke(Unknown Source)
   at junit.framework.TestCase.runTest(TestCase.java:154)
   at junit.framework.TestCase.runBare(TestCase.java:127)
   at junit.framework.TestResult$1.protect(TestResult.java:106)
   at junit.framework.TestResult.runProtected(TestResult.java:124)
   at junit.framework.TestResult.run(TestResult.java:109)
   at junit.framework.TestCase.run(TestCase.java:118)
   at junit.framework.TestSuite.runTest(TestSuite.java:208)
   at junit.framework.TestSuite.run(TestSuite.java:203)
   at 
 org.eclipse.jdt.internal.junit.runner.junit3.JUnit3TestReference.run(JUnit3TestReference.java:130)
   at 
 org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
   at 
 org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:460)
   at 
 org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:673)
   at 
 org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:386)
   at 
 org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:196)




[jira] [Resolved] (LUCENE-3428) trunk tests hang/deadlock TestIndexWriterWithThreads

2011-11-14 Thread Simon Willnauer (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer resolved LUCENE-3428.
-

   Resolution: Fixed
Fix Version/s: 4.0
Lucene Fields: New,Patch Available  (was: New)

fixed

 trunk tests hang/deadlock TestIndexWriterWithThreads
 

 Key: LUCENE-3428
 URL: https://issues.apache.org/jira/browse/LUCENE-3428
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Robert Muir
Assignee: Simon Willnauer
 Fix For: 4.0

 Attachments: LUCENE-3428.patch


 Trunk tests have been hanging often lately in Hudson; this time I was careful 
 to kill and get a good stack trace:




[jira] [Updated] (LUCENE-3425) NRT Caching Dir to allow for exact memory usage, better buffer allocation and global cross indices control

2011-11-14 Thread Simon Willnauer (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-3425:


Affects Version/s: 4.0
   3.4
Fix Version/s: 4.0
   3.5

 NRT Caching Dir to allow for exact memory usage, better buffer allocation and 
 global cross indices control
 

 Key: LUCENE-3425
 URL: https://issues.apache.org/jira/browse/LUCENE-3425
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Affects Versions: 3.4, 4.0
Reporter: Shay Banon
 Fix For: 3.5, 4.0


 A discussion on IRC raised several improvements that can be made to NRT 
 caching dir. Some of the problems it currently has are:
 1. Not explicitly controlling the memory usage, which can result in overusing 
 memory (for example, large new segments being committed because refreshing is 
 too far behind).
 2. Heap fragmentation because of constant allocation of (probably promoted to 
 old gen) byte buffers.
 3. Not being able to control the memory usage across indices for multi index 
 usage within a single JVM.
 A suggested solution (which still needs to be ironed out) is to have a 
 BufferAllocator that controls allocation of byte[] and allows returning unused 
 byte[] to it (a rough sketch follows this description). It will have a cap on 
 the amount of memory it allows to be allocated.
 The NRT caching dir will use the allocator, which can either be provided (for 
 usage across several indices) or created internally. The caching dir will 
 also create a wrapped IndexOutput that will flush to the main dir if the 
 allocator can no longer provide byte[] (exhausted).
 When a file is flushed from the cache to the main directory, it will return 
 all the currently allocated byte[] to the BufferAllocator to be reused by 
 other files.
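
Purely as an illustration of the proposal, a minimal sketch of what such an 
allocator could look like (a hypothetical class, not an existing Lucene API):

{code:java}
import java.util.ArrayDeque;

/** Hypothetical BufferAllocator: hands out fixed-size byte[] blocks under a cap
 *  and takes unused blocks back for reuse across files and indices. */
public final class BufferAllocator {
  private final int blockSize;
  private final long maxBytes;
  private long allocatedBytes = 0;
  private final ArrayDeque<byte[]> freeBlocks = new ArrayDeque<byte[]>();

  public BufferAllocator(int blockSize, long maxBytes) {
    this.blockSize = blockSize;
    this.maxBytes = maxBytes;
  }

  /** Returns a block, or null when exhausted (the caller then flushes to the main dir). */
  public synchronized byte[] allocate() {
    byte[] block = freeBlocks.poll();
    if (block != null) {
      return block;                        // reuse a returned block, no new allocation
    }
    if (allocatedBytes + blockSize > maxBytes) {
      return null;                         // cap reached: signal the caller to spill
    }
    allocatedBytes += blockSize;
    return new byte[blockSize];
  }

  /** Gives a no-longer-needed block back so other files/indices can reuse it. */
  public synchronized void release(byte[] block) {
    if (block != null && block.length == blockSize) {
      freeBlocks.addLast(block);
    }
  }
}
{code}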




[jira] [Commented] (LUCENE-3453) remove IndexDocValuesField

2011-11-14 Thread Simon Willnauer (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149536#comment-13149536
 ] 

Simon Willnauer commented on LUCENE-3453:
-

Hey Chris, what is the status here?

 remove IndexDocValuesField
 --

 Key: LUCENE-3453
 URL: https://issues.apache.org/jira/browse/LUCENE-3453
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Robert Muir
Assignee: Chris Male
 Fix For: 4.0


 It's confusing how we present CSF functionality to the user; it's actually not 
 a field but an attribute of a field, like STORED or INDEXED.
 Otherwise, it's really hard to think about CSF because there is a mismatch 
 between the APIs and the index format.




[jira] [Updated] (LUCENE-3453) remove IndexDocValuesField

2011-11-14 Thread Simon Willnauer (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-3453:


Fix Version/s: 4.0

 remove IndexDocValuesField
 --

 Key: LUCENE-3453
 URL: https://issues.apache.org/jira/browse/LUCENE-3453
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Robert Muir
Assignee: Chris Male
 Fix For: 4.0


 It's confusing how we present CSF functionality to the user; it's actually not 
 a field but an attribute of a field, like STORED or INDEXED.
 Otherwise, it's really hard to think about CSF because there is a mismatch 
 between the APIs and the index format.




[jira] [Commented] (SOLR-2382) DIH Cache Improvements

2011-11-14 Thread Noble Paul (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149540#comment-13149540
 ] 

Noble Paul commented on SOLR-2382:
--

Committed svn revision 121659. Thanks, James.

 DIH Cache Improvements
 --

 Key: SOLR-2382
 URL: https://issues.apache.org/jira/browse/SOLR-2382
 Project: Solr
  Issue Type: New Feature
  Components: contrib - DataImportHandler
Reporter: James Dyer
Priority: Minor
 Attachments: SOLR-2382-dihwriter.patch, SOLR-2382-dihwriter.patch, 
 SOLR-2382-dihwriter.patch, SOLR-2382-entities.patch, 
 SOLR-2382-entities.patch, SOLR-2382-entities.patch, SOLR-2382-entities.patch, 
 SOLR-2382-entities.patch, SOLR-2382-entities.patch, SOLR-2382-entities.patch, 
 SOLR-2382-entities.patch, SOLR-2382-properties.patch, 
 SOLR-2382-properties.patch, SOLR-2382-solrwriter-verbose-fix.patch, 
 SOLR-2382-solrwriter.patch, SOLR-2382-solrwriter.patch, 
 SOLR-2382-solrwriter.patch, SOLR-2382.patch, SOLR-2382.patch, 
 SOLR-2382.patch, SOLR-2382.patch, SOLR-2382.patch, SOLR-2382.patch, 
 SOLR-2382.patch, SOLR-2382.patch


 Functionality:
  1. Provide a pluggable caching framework for DIH so that users can choose a 
 cache implementation that best suits their data and application.
  
  2. Provide a means to temporarily cache a child Entity's data without 
 needing to create a special cached implementation of the Entity Processor 
 (such as CachedSqlEntityProcessor).
  
  3. Provide a means to write the final (root entity) DIH output to a cache 
 rather than to Solr.  Then provide a way for a subsequent DIH call to use the 
 cache as an Entity input.  Also provide the ability to do delta updates on 
 such persistent caches.
  
  4. Provide the ability to partition data across multiple caches that can 
 then be fed back into DIH and indexed either to varying Solr Shards, or to 
 the same Core in parallel.
 Use Cases:
  1. We needed a flexible & scalable way to temporarily cache child-entity 
 data prior to joining to parent entities.
   - Using SqlEntityProcessor with Child Entities can cause an n+1 select 
 problem.
   - CachedSqlEntityProcessor only supports an in-memory HashMap as a Caching 
 mechanism and does not scale.
   - There is no way to cache non-SQL inputs (ex: flat files, xml, etc).
  
  2. We needed the ability to gather data from long-running entities by a 
 process that runs separate from our main indexing process.
   
  3. We wanted the ability to do a delta import of only the entities that 
 changed.
   - Lucene/Solr requires entire documents to be re-indexed, even if only a 
 few fields changed.
   - Our data comes from 50+ complex sql queries and/or flat files.
   - We do not want to incur overhead re-gathering all of this data if only 1 
 entity's data changed.
   - Persistent DIH caches solve this problem.
   
  4. We want the ability to index several documents in parallel (using 1.4.1, 
 which did not have the threads parameter).
  
  5. In the future, we may need to use Shards, creating a need to easily 
 partition our source data into Shards.
 Implementation Details:
  1. De-couple EntityProcessorBase from caching.  
   - Created a new interface, DIHCache, & two implementations (a rough sketch of such an interface follows this list):
 - SortedMapBackedCache - An in-memory cache, used as default with 
 CachedSqlEntityProcessor (now deprecated).
 - BerkleyBackedCache - A disk-backed cache, dependent on bdb-je, tested 
 with je-4.1.6.jar
- NOTE: the existing Lucene Contrib db project uses je-3.3.93.jar.  
 I believe this may be incompatible due to generics usage.
- NOTE: I did not modify the ant script to automatically get this jar, 
 so to use or evaluate this patch, download bdb-je from 
 http://www.oracle.com/technetwork/database/berkeleydb/downloads/index.html 
  
  2. Allow Entity Processors to take a cacheImpl parameter to cause the 
 entity data to be cached (see EntityProcessorBase & DIHCacheProperties).
  
  3. Partially De-couple SolrWriter from DocBuilder
   - Created a new interface, DIHWriter, & two implementations:
- SolrWriter (refactored)
- DIHCacheWriter (allows DIH to write ultimately to a Cache).

  4. Create a new Entity Processor, DIHCacheProcessor, which reads a 
 persistent Cache as DIH Entity Input.
  
  5. Support a partition parameter with both DIHCacheWriter and 
 DIHCacheProcessor to allow for easy partitioning of source entity data.
  
  6. Change the semantics of entity.destroy()
   - Previously, it was being called on each iteration of 
 DocBuilder.buildDocument().
   - Now it does one-time cleanup tasks (like closing or deleting a 
 disk-backed cache) once the entity processor is completed.
   - The only out-of-the-box entity processor that previously implemented 
 destroy() was LineEntityProcessor, so this is not a very invasive change.
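
As a rough illustration of item 1 above, a sketch of what a pluggable DIH cache 
interface could look like; the method set is an assumption based on this 
description, not the actual interface from the attached patches:

{code:java}
import java.util.Iterator;
import java.util.Map;

/** Hypothetical sketch of a pluggable DIH cache (the real DIHCache may differ). */
public interface DIHCacheSketch {

  /** Open or create the underlying store (in-memory map, Berkeley DB, etc.). */
  void open(Map<String, Object> initProps);

  /** Add one entity row, keyed by the configured cache key. */
  void add(Map<String, Object> row);

  /** Iterate rows matching a key, so a parent entity can join against cached children. */
  Iterator<Map<String, Object>> lookup(Object key);

  /** One-time cleanup, e.g. closing or deleting a disk-backed cache (cf. item 6). */
  void close();
}
{code}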
 

[jira] [Commented] (LUCENE-3396) Make TokenStream Reuse Mandatory for Analyzers

2011-11-14 Thread Simon Willnauer (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149541#comment-13149541
 ] 

Simon Willnauer commented on LUCENE-3396:
-

Chris, this seems to be done, no? Can you close it?

 Make TokenStream Reuse Mandatory for Analyzers
 --

 Key: LUCENE-3396
 URL: https://issues.apache.org/jira/browse/LUCENE-3396
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/analysis
Reporter: Chris Male
 Attachments: LUCENE-3396-forgotten.patch, LUCENE-3396-rab.patch, 
 LUCENE-3396-rab.patch, LUCENE-3396-rab.patch, LUCENE-3396-rab.patch, 
 LUCENE-3396-rab.patch, LUCENE-3396-rab.patch, LUCENE-3396-rab.patch, 
 LUCENE-3396-remaining-analyzers.patch, LUCENE-3396-remaining-merging.patch


 In LUCENE-2309 it became clear that we'd benefit a lot from Analyzer having 
 to return reusable TokenStreams.  This is a big chunk of work, but it's time 
 to bite the bullet (a small sketch of the target API follows the plan below).
 I plan to attack this in the following way:
 - Collapse the logic of ReusableAnalyzerBase into Analyzer
 - Add a ReuseStrategy abstraction to Analyzer which controls whether the 
 TokenStreamComponents are reused globally (as they are today) or per-field.
 - Convert all Analyzers over to using TokenStreamComponents.  I've already 
 seen that some of the TokenStreams created in tests need some work to be 
 reusable (even if they aren't reused).
 - Remove Analyzer.reusableTokenStream and convert everything over to using 
 .tokenStream (which will now be returning reusable TokenStreams).
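
For illustration, a small sketch of an Analyzer written against the reuse API 
described above; class and package names follow the eventual 4.0 layout and may 
differ from trunk at the time:

{code:java}
import java.io.Reader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.LowerCaseFilter;
import org.apache.lucene.analysis.core.WhitespaceTokenizer;
import org.apache.lucene.util.Version;

// The analyzer only declares how to build its components once;
// Analyzer itself caches and reuses them according to its ReuseStrategy.
public final class SimpleReusableAnalyzer extends Analyzer {
  @Override
  protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
    Tokenizer source = new WhitespaceTokenizer(Version.LUCENE_40, reader);
    TokenStream result = new LowerCaseFilter(Version.LUCENE_40, source);
    return new TokenStreamComponents(source, result);
  }
}
{code}

Consumers then call tokenStream(field, reader) and get the reused component chain 
back instead of a newly constructed one per document.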




[jira] [Updated] (LUCENE-2949) FastVectorHighlighter FieldTermStack could likely benefit from using TermVectorMapper

2011-11-14 Thread Koji Sekiguchi (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated LUCENE-2949:
---

Assignee: (was: Koji Sekiguchi)

 FastVectorHighlighter FieldTermStack could likely benefit from using 
 TermVectorMapper
 -

 Key: LUCENE-2949
 URL: https://issues.apache.org/jira/browse/LUCENE-2949
 Project: Lucene - Java
  Issue Type: Improvement
Affects Versions: 3.0.3, 4.0
Reporter: Grant Ingersoll
Priority: Minor
  Labels: FastVectorHighlighter, Highlighter
 Fix For: 3.5, 4.0

 Attachments: LUCENE-2949.patch


 Based on my reading of the FieldTermStack constructor that loads the vector 
 from disk, we could probably save a bunch of time and memory by using the 
 TermVectorMapper callback mechanism instead of materializing the full array 
 of terms into memory and then throwing most of them out.
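
As a sketch of the suggested callback approach (using the 3.x TermVectorMapper API; 
the filtering logic is illustrative, not a patch):

{code:java}
import java.util.HashSet;
import java.util.Set;

import org.apache.lucene.index.TermVectorMapper;
import org.apache.lucene.index.TermVectorOffsetInfo;

// Only terms the highlighter cares about are kept as they stream by, instead of
// materializing the full term array first and discarding most of it afterwards.
public class FieldTermStackMapperSketch extends TermVectorMapper {

  private final Set<String> termSet;                     // terms extracted from the query
  private final Set<String> kept = new HashSet<String>();

  public FieldTermStackMapperSketch(Set<String> termSet) {
    this.termSet = termSet;
  }

  @Override
  public void setExpectations(String field, int numTerms,
                              boolean storeOffsets, boolean storePositions) {
    // internal structures could be pre-sized here using numTerms
  }

  @Override
  public void map(String term, int frequency,
                  TermVectorOffsetInfo[] offsets, int[] positions) {
    if (termSet.contains(term)) {
      kept.add(term);  // a real impl would push TermInfo entries onto the stack instead
    }
  }

  public Set<String> getKeptTerms() {
    return kept;
  }
}
{code}

The reading side would then call something like 
IndexReader.getTermFreqVector(docId, field, mapper) rather than fetching the whole 
vector up front (again assuming the 3.x API).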




[jira] [Commented] (LUCENE-2949) FastVectorHighlighter FieldTermStack could likely benefit from using TermVectorMapper

2011-11-14 Thread Koji Sekiguchi (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149542#comment-13149542
 ] 

Koji Sekiguchi commented on LUCENE-2949:


Cool, I like the idea! But I don't have much time to try it now, so I'll 
unassign myself.

 FastVectorHighlighter FieldTermStack could likely benefit from using 
 TermVectorMapper
 -

 Key: LUCENE-2949
 URL: https://issues.apache.org/jira/browse/LUCENE-2949
 Project: Lucene - Java
  Issue Type: Improvement
Affects Versions: 3.0.3, 4.0
Reporter: Grant Ingersoll
Assignee: Koji Sekiguchi
Priority: Minor
  Labels: FastVectorHighlighter, Highlighter
 Fix For: 3.5, 4.0

 Attachments: LUCENE-2949.patch


 Based on my reading of the FieldTermStack constructor that loads the vector 
 from disk, we could probably save a bunch of time and memory by using the 
 TermVectorMapper callback mechanism instead of materializing the full array 
 of terms into memory and then throwing most of them out.




[jira] [Commented] (LUCENE-3496) Support grouping by IndexDocValues

2011-11-14 Thread Simon Willnauer (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149543#comment-13149543
 ] 

Simon Willnauer commented on LUCENE-3496:
-

Martijn, the last patch looks OK to me. You should go ahead and commit this...

 Support grouping by IndexDocValues
 --

 Key: LUCENE-3496
 URL: https://issues.apache.org/jira/browse/LUCENE-3496
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/grouping
Reporter: Martijn van Groningen
Assignee: Martijn van Groningen
 Fix For: 4.0

 Attachments: LUCENE-3496.patch, LUCENE-3496.patch, LUCENE-3496.patch, 
 LUCENE-3496.patch, LUCENE-3496.patch, LUCENE-3496.patch, LUCENE-3496.patch


 Although IDV is not yet finalized (more particularly the SortedSource), I think 
 we can already discuss / investigate implementing grouping by IDV.




[jira] [Commented] (LUCENE-3509) Add settings to IWC to optimize IDV indices for CPU or RAM respectively

2011-11-14 Thread Simon Willnauer (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149545#comment-13149545
 ] 

Simon Willnauer commented on LUCENE-3509:
-

bq. I think fasterButMoreRam is fine, since it is a dv codec parameter now.
+1 go ahead

 Add settings to IWC to optimize IDV indices for CPU or RAM respectively
 --

 Key: LUCENE-3509
 URL: https://issues.apache.org/jira/browse/LUCENE-3509
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Affects Versions: 4.0
Reporter: Simon Willnauer
Priority: Minor
 Fix For: 4.0

 Attachments: LUCENE-3509.patch, LUCENE-3509.patch


 Spinoff from LUCENE-3496 - we are seeing much better performance if the required 
 bits for PackedInts are rounded up to 8/16/32/64. We should add this option 
 to IWC and default to rounding up, i.e. more RAM & faster lookups.
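
A minimal sketch of the rounding being described (illustrative helper, not the 
actual PackedInts code):

{code:java}
// Bump the required bits-per-value up to the nearest whole 8/16/32/64 width,
// trading some RAM for faster, aligned lookups.
public final class PackedIntsRounding {
  public static int roundUpBitsPerValue(int bitsRequired) {
    if (bitsRequired <= 8)  return 8;
    if (bitsRequired <= 16) return 16;
    if (bitsRequired <= 32) return 32;
    return 64;
  }
}
{code}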




[jira] [Closed] (LUCENE-3379) jre crashes in ArrayUtil mergeSort

2011-11-14 Thread Simon Willnauer (Closed) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer closed LUCENE-3379.
---

Resolution: Not A Problem

Closing this - nobody seems to have hit this again.

 jre crashes in ArrayUtil mergeSort
 --

 Key: LUCENE-3379
 URL: https://issues.apache.org/jira/browse/LUCENE-3379
 Project: Lucene - Java
  Issue Type: Bug
 Environment: 1.6.0_24
Reporter: Robert Muir
 Attachments: hs_err_pid25327.log, hs_err_pid4624.log


 While running the analyzers tests, I got a JRE crash with 1.6.0_24 in
 {noformat}
 Current CompileTask:
 C2: 54  org.apache.lucene.util.SorterTemplate.merge(I)V (151 bytes)
 {noformat}
 {noformat}
[junit] #
[junit] # A fatal error has been detected by the Java Runtime Environment:
[junit] #
[junit] #  SIGSEGV (0xb) at pc=0x7f768cc2f0ec, pid=4624, 
 tid=140147041961728
[junit] #
[junit] # JRE version: 6.0_24-b07
[junit] # Java VM: Java HotSpot(TM) 64-Bit Server VM (19.1-b02 mixed mode 
 linux-amd64 compressed oops)
[junit] # Problematic frame:
[junit] # V  [libjvm.so+0x3eb0ec]
[junit] #
[junit] # An error report file with more information is saved as:
[junit] # 
 /home/rmuir/workspace/lucene-trunk/modules/analysis/build/common/test/8/hs_err_pid4624.log
[junit] #
[junit] # If you would like to submit a bug report, please visit:
[junit] #   http://java.sun.com/webapps/bugreport/crash.jsp
[junit] #
 {noformat}




[jira] [Commented] (LUCENE-3270) additional tests enhancements to faceting module

2011-11-14 Thread Shai Erera (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149553#comment-13149553
 ] 

Shai Erera commented on LUCENE-3270:


I searched for "static final" under facet/src/test and scanned all the results 
- nothing there seems worth randomizing. Also, I thought about 
RandomTaxonomyWriter, and I'm not sure it's worth the effort, since I'm afraid 
randomization will affect the strict behavior required by TW and we'll just 
chase ourselves.

Perhaps we should just close this issue and handle things on a per case basis 
when we encounter them?

 additional tests enhancements to faceting module
 

 Key: LUCENE-3270
 URL: https://issues.apache.org/jira/browse/LUCENE-3270
 Project: Lucene - Java
  Issue Type: Test
  Components: modules/facet
Reporter: Robert Muir

 Some ideas from LUCENE-3264:
 * make a RandomTaxonomyWriter
 * look at any hardcoded constants like #docs etc and see if we can in general 
 add randomization.
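
As an illustration of the second idea, the atLeast() helper mentioned in 
LUCENE-3269 applied to a hardcoded doc count (class and test names here are 
hypothetical):

{code:java}
import org.apache.lucene.util.LuceneTestCase;
import org.junit.Test;

public class FacetRandomizationSketch extends LuceneTestCase {
  @Test
  public void testWithRandomizedDocCount() throws Exception {
    // previously e.g.: final int numDocs = 100;
    final int numDocs = atLeast(100);  // >= 100, scaled up on nightly/multiplier runs
    for (int i = 0; i < numDocs; i++) {
      // index one document per iteration (elided)
    }
  }
}
{code}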




[jira] [Commented] (LUCENE-3237) FSDirectory.fsync() may not work properly

2011-11-14 Thread Simon Willnauer (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149555#comment-13149555
 ] 

Simon Willnauer commented on LUCENE-3237:
-

Shai, I think we should close this. We can still reopen it if we run into issues.

 FSDirectory.fsync() may not work properly
 -

 Key: LUCENE-3237
 URL: https://issues.apache.org/jira/browse/LUCENE-3237
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/store
Reporter: Shai Erera
 Fix For: 3.5, 4.0


 Spinoff from LUCENE-3230. FSDirectory.fsync() opens a new RAF, sync()s its 
 FileDescriptor and closes the RAF. It is not clear that this syncs whatever was 
 written to the file by other FileDescriptors. It would be better to do 
 this operation on the actual RAF/FileOS which wrote the data. We can add 
 sync() to IndexOutput, and FSIndexOutput will do that.
 Directory-wise, we should stop syncing on file names, and instead sync on the 
 IOs that performed the write operations.
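
For reference, a condensed illustration of the fsync-by-name pattern being 
questioned (paraphrased, not the exact FSDirectory code):

{code:java}
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public final class SyncByName {
  /** Reopens the file by name and syncs that descriptor only. */
  public static void fsync(File dir, String name) throws IOException {
    RandomAccessFile raf = new RandomAccessFile(new File(dir, name), "rw");
    try {
      raf.getFD().sync();  // syncs this descriptor; unclear w.r.t. other writers' descriptors
    } finally {
      raf.close();
    }
  }
}
{code}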




[jira] [Resolved] (LUCENE-3237) FSDirectory.fsync() may not work properly

2011-11-14 Thread Shai Erera (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera resolved LUCENE-3237.


   Resolution: Won't Fix
Fix Version/s: (was: 3.5)
   (was: 4.0)

Closing. If we ever see that this actually is a problem, we can reopen.

 FSDirectory.fsync() may not work properly
 -

 Key: LUCENE-3237
 URL: https://issues.apache.org/jira/browse/LUCENE-3237
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/store
Reporter: Shai Erera

 Spinoff from LUCENE-3230. FSDirectory.fsync() opens a new RAF, sync()s its 
 FileDescriptor and closes the RAF. It is not clear that this syncs whatever was 
 written to the file by other FileDescriptors. It would be better to do 
 this operation on the actual RAF/FileOS which wrote the data. We can add 
 sync() to IndexOutput, and FSIndexOutput will do that.
 Directory-wise, we should stop syncing on file names, and instead sync on the 
 IOs that performed the write operations.




[jira] [Resolved] (LUCENE-3235) TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug

2011-11-14 Thread Simon Willnauer (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer resolved LUCENE-3235.
-

Resolution: Won't Fix

We moved to 1.6 on trunk; it seems we can't do much about it on 3.x - folks should 
run their stuff on 1.6 JVMs or newer.

 TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug
 

 Key: LUCENE-3235
 URL: https://issues.apache.org/jira/browse/LUCENE-3235
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Michael McCandless

 Not sure what's going on yet... but under Java 1.6 it seems not to hang, while 
 under Java 1.5 it hangs fairly easily, on Linux.  Java is 1.5.0_22.
 I suspect this is relevant: 
 http://stackoverflow.com/questions/3292577/is-it-possible-for-concurrenthashmap-to-deadlock
  which refers to this JVM bug 
 http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6865591 which then refers 
 to this one http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6822370
 It looks like that last bug was fixed in Java 1.6 but not 1.5.




[jira] [Resolved] (LUCENE-3176) TestNRTThreads test failure

2011-11-14 Thread Simon Willnauer (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer resolved LUCENE-3176.
-

Resolution: Fixed

this was a temp file issue - fixed

 TestNRTThreads test failure
 ---

 Key: LUCENE-3176
 URL: https://issues.apache.org/jira/browse/LUCENE-3176
 Project: Lucene - Java
  Issue Type: Bug
 Environment: trunk
Reporter: Robert Muir
Assignee: Michael McCandless

 Hit a failure in TestNRTThreads while running tests over and over:




[jira] [Commented] (LUCENE-3235) TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug

2011-11-14 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149562#comment-13149562
 ] 

Robert Muir commented on LUCENE-3235:
-

Wait, this statement makes no sense.

If 1.5 is no longer supported, then 1.5 should no longer be supported, and we 
should be free to use 1.6 code everywhere.


 TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug
 

 Key: LUCENE-3235
 URL: https://issues.apache.org/jira/browse/LUCENE-3235
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Michael McCandless

 Not sure what's going on yet... but under Java 1.6 it seems not to hang, while 
 under Java 1.5 it hangs fairly easily, on Linux.  Java is 1.5.0_22.
 I suspect this is relevant: 
 http://stackoverflow.com/questions/3292577/is-it-possible-for-concurrenthashmap-to-deadlock
  which refers to this JVM bug 
 http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6865591 which then refers 
 to this one http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6822370
 It looks like that last bug was fixed in Java 1.6 but not 1.5.




[jira] [Reopened] (LUCENE-3235) TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug

2011-11-14 Thread Robert Muir (Reopened) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir reopened LUCENE-3235:
-


 TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug
 

 Key: LUCENE-3235
 URL: https://issues.apache.org/jira/browse/LUCENE-3235
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Michael McCandless

 Not sure what's going on yet... but under Java 1.6 it seems not to hang, while 
 under Java 1.5 it hangs fairly easily, on Linux.  Java is 1.5.0_22.
 I suspect this is relevant: 
 http://stackoverflow.com/questions/3292577/is-it-possible-for-concurrenthashmap-to-deadlock
  which refers to this JVM bug 
 http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6865591 which then refers 
 to this one http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6822370
 It looks like that last bug was fixed in Java 1.6 but not 1.5.




[jira] [Commented] (LUCENE-3089) CachingTokenFilter can cause close() to be called twice.

2011-11-14 Thread Simon Willnauer (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149564#comment-13149564
 ] 

Simon Willnauer commented on LUCENE-3089:
-

Robert, since TokenStream implements Closeable, we should be able to call close() as 
often as we want to. We should actually check that we do that in our tests to 
make sure nothing fails. 

 CachingTokenFilter can cause close() to be called twice.
 

 Key: LUCENE-3089
 URL: https://issues.apache.org/jira/browse/LUCENE-3089
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Robert Muir

 In LUCENE-3064, we added some state and checks to MockTokenizer to validate 
 that consumers
 are properly using the tokenstream workflow (described here: 
 http://lucene.apache.org/java/3_0_3/api/core/org/apache/lucene/analysis/TokenStream.html)
 One problem I noticed in 
 TestTermVectorsWriter.testEndOffsetPositionWithCachingTokenFilter is that 
 providing a CachingTokenFilter directly will result
 in close() being called twice on the underlying tokenstream... this seems 
 wrong.
 Some ideas to fix this could be:
 # CachingTokenFilter overrides close() and we document that you must close 
 the underlying stream yourself. I think this is what the queryparser does 
 anyway.
 # CachingTokenFilter does something tricky to ensure it only closes the 
 underlying stream once.
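
A minimal sketch of idea #2 above: a filter that guards against closing the wrapped 
stream more than once (hypothetical class, not a committed fix):

{code:java}
import java.io.IOException;

import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;

public final class CloseOnceFilter extends TokenFilter {

  private boolean closed = false;

  public CloseOnceFilter(TokenStream input) {
    super(input);
  }

  @Override
  public boolean incrementToken() throws IOException {
    return input.incrementToken();  // pass tokens through unchanged
  }

  @Override
  public void close() throws IOException {
    if (!closed) {       // swallow repeated close() calls
      closed = true;
      super.close();     // closes the wrapped stream exactly once
    }
  }
}
{code}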




[jira] [Commented] (LUCENE-3235) TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug

2011-11-14 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149565#comment-13149565
 ] 

Uwe Schindler commented on LUCENE-3235:
---

I agree with Robert. This issue still exists in 3.x, as we officially 
support Java 5.

 TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug
 

 Key: LUCENE-3235
 URL: https://issues.apache.org/jira/browse/LUCENE-3235
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Michael McCandless

 Not sure what's going on yet... but under Java 1.6 it seems not to hang, while 
 under Java 1.5 it hangs fairly easily, on Linux.  Java is 1.5.0_22.
 I suspect this is relevant: 
 http://stackoverflow.com/questions/3292577/is-it-possible-for-concurrenthashmap-to-deadlock
  which refers to this JVM bug 
 http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6865591 which then refers 
 to this one http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6822370
 It looks like that last bug was fixed in Java 1.6 but not 1.5.




[jira] [Commented] (LUCENE-3089) CachingTokenFilter can cause close() to be called twice.

2011-11-14 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149566#comment-13149566
 ] 

Uwe Schindler commented on LUCENE-3089:
---

Yes, the java.io.Closeable interface requires the underlying implementation to 
ignore additional close calls. But we should still fix our code to actually 
call it only once.

 CachingTokenFilter can cause close() to be called twice.
 

 Key: LUCENE-3089
 URL: https://issues.apache.org/jira/browse/LUCENE-3089
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Robert Muir

 In LUCENE-3064, we added some state and checks to MockTokenizer to validate 
 that consumers
 are properly using the tokenstream workflow (described here: 
 http://lucene.apache.org/java/3_0_3/api/core/org/apache/lucene/analysis/TokenStream.html)
 One problem I noticed in 
 TestTermVectorsWriter.testEndOffsetPositionWithCachingTokenFilter is that 
 providing a CachingTokenFilter directly will result
 in close() being called twice on the underlying tokenstream... this seems 
 wrong.
 Some ideas to fix this could be:
 # CachingTokenFilter overrides close() and we document that you must close 
 the underlying stream yourself. I think this is what the queryparser does 
 anyway.
 # CachingTokenFilter does something tricky to ensure it only closes the 
 underlying stream once.




[jira] [Commented] (LUCENE-3089) CachingTokenFilter can cause close() to be called twice.

2011-11-14 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149568#comment-13149568
 ] 

Robert Muir commented on LUCENE-3089:
-

Hmm, I'm not sure I like that... perhaps it's not appropriate to implement 
Closeable.

Lots of people seem to have problems with the analysis workflow, and I think 
this adds confusion.

 CachingTokenFilter can cause close() to be called twice.
 

 Key: LUCENE-3089
 URL: https://issues.apache.org/jira/browse/LUCENE-3089
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Robert Muir

 In LUCENE-3064, we added some state and checks to MockTokenizer to validate 
 that consumers
 are properly using the tokenstream workflow (described here: 
 http://lucene.apache.org/java/3_0_3/api/core/org/apache/lucene/analysis/TokenStream.html)
 One problem I noticed in 
 TestTermVectorsWriter.testEndOffsetPositionWithCachingTokenFilter is that 
 providing a CachingTokenFilter directly will result
 in close() being called twice on the underlying tokenstream... this seems 
 wrong.
 Some ideas to fix this could be:
 # CachingTokenFilter overrides close() and we document that you must close 
 the underlying stream yourself. I think this is what the queryparser does 
 anyway.
 # CachingTokenFilter does something tricky to ensure it only closes the 
 underlying stream once.




[jira] [Resolved] (LUCENE-3397) Cleanup Test TokenStreams so they are reusable

2011-11-14 Thread Chris Male (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Male resolved LUCENE-3397.


Resolution: Fixed

All TokenStreams are now reusable.

 Cleanup Test TokenStreams so they are reusable
 --

 Key: LUCENE-3397
 URL: https://issues.apache.org/jira/browse/LUCENE-3397
 Project: Lucene - Java
  Issue Type: Sub-task
  Components: modules/analysis
Reporter: Chris Male
Assignee: Chris Male
 Fix For: 4.0

 Attachments: LUCENE-3397-highlighter.patch, LUCENE-3397-more.patch, 
 LUCENE-3397.patch, LUCENE-3397.patch


 Many TokenStreams created in tests are not reusable.  Some do some really 
 messy things which prevent their reuse, so we may have to change the tests 
 themselves.
 We'll target backporting this to 3.x.




[jira] [Commented] (LUCENE-3089) CachingTokenFilter can cause close() to be called twice.

2011-11-14 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149571#comment-13149571
 ] 

Uwe Schindler commented on LUCENE-3089:
---

I disagree, removing the Closeable interface makes it stupid to use in Java 7 
(try-with-resources).

 CachingTokenFilter can cause close() to be called twice.
 

 Key: LUCENE-3089
 URL: https://issues.apache.org/jira/browse/LUCENE-3089
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Robert Muir

 In LUCENE-3064, we added some state and checks to MockTokenizer to validate 
 that consumers
 are properly using the tokenstream workflow (described here: 
 http://lucene.apache.org/java/3_0_3/api/core/org/apache/lucene/analysis/TokenStream.html)
 One problem I noticed in 
 TestTermVectorsWriter.testEndOffsetPositionWithCachingTokenFilter is that 
 providing a CachingTokenFilter directly will result
 in close() being called twice on the underlying tokenstream... this seems 
 wrong.
 Some ideas to fix this could be:
 # CachingTokenFilter overrides close() and we document that you must close 
 the underlying stream yourself. I think this is what the queryparser does 
 anyway.
 # CachingTokenFilter does something tricky to ensure it only closes the 
 underlying stream once.




[jira] [Resolved] (LUCENE-3396) Make TokenStream Reuse Mandatory for Analyzers

2011-11-14 Thread Chris Male (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Male resolved LUCENE-3396.


   Resolution: Fixed
Fix Version/s: 4.0
 Assignee: Chris Male

TokenStream reuse is now mandatory

 Make TokenStream Reuse Mandatory for Analyzers
 --

 Key: LUCENE-3396
 URL: https://issues.apache.org/jira/browse/LUCENE-3396
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/analysis
Reporter: Chris Male
Assignee: Chris Male
 Fix For: 4.0

 Attachments: LUCENE-3396-forgotten.patch, LUCENE-3396-rab.patch, 
 LUCENE-3396-rab.patch, LUCENE-3396-rab.patch, LUCENE-3396-rab.patch, 
 LUCENE-3396-rab.patch, LUCENE-3396-rab.patch, LUCENE-3396-rab.patch, 
 LUCENE-3396-remaining-analyzers.patch, LUCENE-3396-remaining-merging.patch


 In LUCENE-2309 it became clear that we'd benefit a lot from Analyzer having 
 to return reusable TokenStreams.  This is a big chunk of work, but it's time 
 to bite the bullet.
 I plan to attack this in the following way:
 - Collapse the logic of ReusableAnalyzerBase into Analyzer
 - Add a ReuseStrategy abstraction to Analyzer which controls whether the 
 TokenStreamComponents are reused globally (as they are today) or per-field.
 - Convert all Analyzers over to using TokenStreamComponents.  I've already 
 seen that some of the TokenStreams created in tests need some work to be 
 reusable (even if they aren't reused).
 - Remove Analyzer.reusableTokenStream and convert everything over to using 
 .tokenStream (which will now be returning reusable TokenStreams).




[jira] [Commented] (LUCENE-3089) CachingTokenFilter can cause close() to be called twice.

2011-11-14 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149570#comment-13149570
 ] 

Robert Muir commented on LUCENE-3089:
-

{quote}
Yes, the java.io.Closeable interface requires the underlying implementation to 
ignore additional close calls. 
{quote}

Just because java.io.Closeable exists doesn't mean we must use it everywhere: 
if these semantics are inappropriate, we can simply define .close() ourselves.

 CachingTokenFilter can cause close() to be called twice.
 

 Key: LUCENE-3089
 URL: https://issues.apache.org/jira/browse/LUCENE-3089
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Robert Muir

 In LUCENE-3064, we added some state and checks to MockTokenizer to validate 
 that consumers
 are properly using the tokenstream workflow (described here: 
 http://lucene.apache.org/java/3_0_3/api/core/org/apache/lucene/analysis/TokenStream.html)
 One problem I noticed in 
 TestTermVectorsWriter.testEndOffsetPositionWithCachingTokenFilter is that 
 providing a CachingTokenFilter directly will result
 in close() being called twice on the underlying tokenstream... this seems 
 wrong.
 Some ideas to fix this could be:
 # CachingTokenFilter overrides close() and we document that you must close 
 the underlying stream yourself. I think this is what the queryparser does 
 anyway.
 # CachingTokenFilter does something tricky to ensure it only closes the 
 underlying stream once.
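
For illustration, option 2 could look roughly like this (a sketch only; the class 
name and the pass-through filter are made up here, not the committed fix):

{noformat}
import java.io.IOException;
import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;

// Sketch of option 2: a filter that guards against closing its input twice.
public final class CloseOnceFilter extends TokenFilter {
  private boolean closed = false;

  public CloseOnceFilter(TokenStream input) {
    super(input);
  }

  @Override
  public boolean incrementToken() throws IOException {
    return input.incrementToken();   // pass tokens through unchanged
  }

  @Override
  public void close() throws IOException {
    if (!closed) {        // ignore any second close() call
      closed = true;
      super.close();      // closes the wrapped stream exactly once
    }
  }
}
{noformat}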

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3089) CachingTokenFilter can cause close() to be called twice.

2011-11-14 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149572#comment-13149572
 ] 

Robert Muir commented on LUCENE-3089:
-

I think java 7 close-with-resources is stupid too.

 CachingTokenFilter can cause close() to be called twice.
 

 Key: LUCENE-3089
 URL: https://issues.apache.org/jira/browse/LUCENE-3089
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Robert Muir

 In LUCENE-3064, we added some state and checks to MockTokenizer to validate 
 that consumers
 are properly using the tokenstream workflow (described here: 
 http://lucene.apache.org/java/3_0_3/api/core/org/apache/lucene/analysis/TokenStream.html)
 One problem I noticed in 
 TestTermVectorsWriter.testEndOffsetPositionWithCachingTokenFilter is that 
 providing a CachingTokenFilter directly will result
 in close() being called twice on the underlying tokenstream... this seems 
 wrong.
 Some ideas to fix this could be:
 # CachingTokenFilter overrides close() and we document that you must close 
 the underlying stream yourself. I think this is what the queryparser does 
 anyway.
 # CachingTokenFilter does something tricky to ensure it only closes the 
 underlying stream once.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3089) CachingTokenFilter can cause close() to be called twice.

2011-11-14 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149573#comment-13149573
 ] 

Uwe Schindler commented on LUCENE-3089:
---

Why? For TokenStreams close-with-resources is great.
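
For what it's worth, a minimal sketch of try-with-resources over a TokenStream 
(assuming Java 7 and the usual consumer workflow; the analyzer, field name and 
helper class are placeholders):

{noformat}
import java.io.IOException;
import java.io.StringReader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class PrintTokens {
  // Prints the terms 'analyzer' produces for 'text'; close() runs exactly once.
  static void printTokens(Analyzer analyzer, String text) throws IOException {
    try (TokenStream ts = analyzer.tokenStream("body", new StringReader(text))) {
      CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
      ts.reset();                      // consumer workflow: reset before incrementToken
      while (ts.incrementToken()) {
        System.out.println(term.toString());
      }
      ts.end();                        // finish the stream before it is closed
    }                                  // try-with-resources calls ts.close() here
  }
}
{noformat}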

 CachingTokenFilter can cause close() to be called twice.
 

 Key: LUCENE-3089
 URL: https://issues.apache.org/jira/browse/LUCENE-3089
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Robert Muir

 In LUCENE-3064, we added some state and checks to MockTokenizer to validate 
 that consumers
 are properly using the tokenstream workflow (described here: 
 http://lucene.apache.org/java/3_0_3/api/core/org/apache/lucene/analysis/TokenStream.html)
 One problem I noticed in 
 TestTermVectorsWriter.testEndOffsetPositionWithCachingTokenFilter is that 
 providing a CachingTokenFilter directly will result
 in close() being called twice on the underlying tokenstream... this seems 
 wrong.
 Some ideas to fix this could be:
 # CachingTokenFilter overrides close() and we document that you must close 
 the underlying stream yourself. I think this is what the queryparser does 
 anyway.
 # CachingTokenFilter does something tricky to ensure it only closes the 
 underlying stream once.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3089) CachingTokenFilter can cause close() to be called twice.

2011-11-14 Thread Simon Willnauer (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149575#comment-13149575
 ] 

Simon Willnauer commented on LUCENE-3089:
-

bq. Why? For TokenStreams close-with-resources is great.
+1

 CachingTokenFilter can cause close() to be called twice.
 

 Key: LUCENE-3089
 URL: https://issues.apache.org/jira/browse/LUCENE-3089
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Robert Muir

 In LUCENE-3064, we added some state and checks to MockTokenizer to validate 
 that consumers
 are properly using the tokenstream workflow (described here: 
 http://lucene.apache.org/java/3_0_3/api/core/org/apache/lucene/analysis/TokenStream.html)
 One problem I noticed in 
 TestTermVectorsWriter.testEndOffsetPositionWithCachingTokenFilter is that 
 providing a CachingTokenFilter directly will result
 in close() being called twice on the underlying tokenstream... this seems 
 wrong.
 Some ideas to fix this could be:
 # CachingTokenFilter overrides close() and we document that you must close 
 the underlying stream yourself. I think this is what the queryparser does 
 anyway.
 # CachingTokenFilter does something tricky to ensure it only closes the 
 underlying stream once.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Issue Comment Edited] (LUCENE-3089) CachingTokenFilter can cause close() to be called twice.

2011-11-14 Thread Uwe Schindler (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149571#comment-13149571
 ] 

Uwe Schindler edited comment on LUCENE-3089 at 11/14/11 11:11 AM:
--

I disagree, removing the Closeable interface makes it stupid to use in Java 7 
(try-with-resources).

  was (Author: thetaphi):
I disagree, removing the Closeable interface makes it stupid to use in Java 
7 (close-with-resources).
  
 CachingTokenFilter can cause close() to be called twice.
 

 Key: LUCENE-3089
 URL: https://issues.apache.org/jira/browse/LUCENE-3089
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Robert Muir

 In LUCENE-3064, we added some state and checks to MockTokenizer to validate 
 that consumers
 are properly using the tokenstream workflow (described here: 
 http://lucene.apache.org/java/3_0_3/api/core/org/apache/lucene/analysis/TokenStream.html)
 One problem I noticed in 
 TestTermVectorsWriter.testEndOffsetPositionWithCachingTokenFilter is that 
 providing a CachingTokenFilter directly will result
 in close() being called twice on the underlying tokenstream... this seems 
 wrong.
 Some ideas to fix this could be:
 # CachingTokenFilter overrides close() and we document that you must close 
 the underlying stream yourself. I think this is what the queryparser does 
 anyway.
 # CachingTokenFilter does something tricky to ensure it only closes the 
 underlying stream once.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Issue Comment Edited] (LUCENE-3089) CachingTokenFilter can cause close() to be called twice.

2011-11-14 Thread Uwe Schindler (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149573#comment-13149573
 ] 

Uwe Schindler edited comment on LUCENE-3089 at 11/14/11 11:11 AM:
--

Why? For TokenStreams try-with-resources is great.

  was (Author: thetaphi):
Why? For TokenStreams close-with-resources is great.
  
 CachingTokenFilter can cause close() to be called twice.
 

 Key: LUCENE-3089
 URL: https://issues.apache.org/jira/browse/LUCENE-3089
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Robert Muir

 In LUCENE-3064, we added some state and checks to MockTokenizer to validate 
 that consumers
 are properly using the tokenstream workflow (described here: 
 http://lucene.apache.org/java/3_0_3/api/core/org/apache/lucene/analysis/TokenStream.html)
 One problem I noticed in 
 TestTermVectorsWriter.testEndOffsetPositionWithCachingTokenFilter is that 
 providing a CachingTokenFilter directly will result
 in close() being called twice on the underlying tokenstream... this seems 
 wrong.
 Some ideas to fix this could be:
 # CachingTokenFilter overrides close() and we document that you must close 
 the underlying stream yourself. I think this is what the queryparser does 
 anyway.
 # CachingTokenFilter does something tricky to ensure it only closes the 
 underlying stream once.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-3235) TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug

2011-11-14 Thread Simon Willnauer (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-3235:


Affects Version/s: 3.0
   3.1
   3.2
   3.3
   3.4
Fix Version/s: 3.5

 TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug
 

 Key: LUCENE-3235
 URL: https://issues.apache.org/jira/browse/LUCENE-3235
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 3.0, 3.1, 3.2, 3.3, 3.4
Reporter: Michael McCandless
 Fix For: 3.5


 Not sure what's going on yet... but under Java 1.6 it seems not to hang, but 
 under Java 1.5 it hangs fairly easily, on Linux.  Java is 1.5.0_22.
 I suspect this is relevant: 
 http://stackoverflow.com/questions/3292577/is-it-possible-for-concurrenthashmap-to-deadlock
  which refers to this JVM bug 
 http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6865591 which then refers 
 to this one http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6822370
 It looks like that last bug was fixed in Java 1.6 but not 1.5.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3235) TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug

2011-11-14 Thread Simon Willnauer (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149577#comment-13149577
 ] 

Simon Willnauer commented on LUCENE-3235:
-

well then we should fix it - I will mark it as 3.5

 TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug
 

 Key: LUCENE-3235
 URL: https://issues.apache.org/jira/browse/LUCENE-3235
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 3.0, 3.1, 3.2, 3.3, 3.4
Reporter: Michael McCandless
 Fix For: 3.5


 Not sure what's going on yet... but under Java 1.6 it seems not to hang, but 
 under Java 1.5 it hangs fairly easily, on Linux.  Java is 1.5.0_22.
 I suspect this is relevant: 
 http://stackoverflow.com/questions/3292577/is-it-possible-for-concurrenthashmap-to-deadlock
  which refers to this JVM bug 
 http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6865591 which then refers 
 to this one http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6822370
 It looks like that last bug was fixed in Java 1.6 but not 1.5.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-tests-only-trunk - Build # 11324 - Failure

2011-11-14 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/11324/

1 tests failed.
REGRESSION:  org.apache.lucene.search.TestSort.testReverseSort

Error Message:
expected:[CEGIA] but was:[ACEGI]

Stack Trace:
at org.apache.lucene.search.TestSort.assertMatches(TestSort.java:1234)
at org.apache.lucene.search.TestSort.assertMatches(TestSort.java:1215)
at org.apache.lucene.search.TestSort.testReverseSort(TestSort.java:758)
at 
org.apache.lucene.util.LuceneTestCase$3$1.evaluate(LuceneTestCase.java:523)
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:149)
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:51)




Build Log (for compile errors):
[...truncated 1331 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3089) CachingTokenFilter can cause close() to be called twice.

2011-11-14 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149580#comment-13149580
 ] 

Robert Muir commented on LUCENE-3089:
-

I just don't think it should be blanket policy without thinking things through.

For example: lots of code you see on the internet opens a new IndexReader for 
every search and closes it.

Should we seriously encourage this?! If someone genuinely needs to do this, that's 
an expert case and they can use try + finally and close it themselves.

So for example, there I think it makes sense for IndexReader to not support 
AutoCloseable, and separately to remove the stupid IndexSearcher(Directory) so that 
IndexSearcher only takes IndexReader, so it's *always* a thin wrapper like we claim 
it is (which is an outright lie today). Then IndexSearcher would implement 
[Auto]Closeable since it's cheap.
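
To illustrate the pattern being argued for (a sketch only, against the current API; 
nothing here is from a committed patch): the application owns the IndexReader and 
closes it exactly once, while IndexSearcher stays a cheap wrapper:

{noformat}
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;

public class ReaderOwnership {
  static void searchMany(Directory dir) throws Exception {
    IndexReader reader = IndexReader.open(dir);           // heavyweight: open once
    try {
      IndexSearcher searcher = new IndexSearcher(reader);  // cheap wrapper, nothing to close
      TopDocs hits = searcher.search(new TermQuery(new Term("body", "lucene")), 10);
      // ... reuse 'reader' and 'searcher' for many more queries ...
    } finally {
      reader.close();                                      // the owner closes it exactly once
    }
  }
}
{noformat}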
 

 CachingTokenFilter can cause close() to be called twice.
 

 Key: LUCENE-3089
 URL: https://issues.apache.org/jira/browse/LUCENE-3089
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Robert Muir

 In LUCENE-3064, we added some state and checks to MockTokenizer to validate 
 that consumers
 are properly using the tokenstream workflow (described here: 
 http://lucene.apache.org/java/3_0_3/api/core/org/apache/lucene/analysis/TokenStream.html)
 One problem I noticed in 
 TestTermVectorsWriter.testEndOffsetPositionWithCachingTokenFilter is that 
 providing a CachingTokenFilter directly will result
 in close() being called twice on the underlying tokenstream... this seems 
 wrong.
 Some ideas to fix this could be:
 # CachingTokenFilter overrides close() and we document that you must close 
 the underlying stream yourself. I think this is what the queryparser does 
 anyway.
 # CachingTokenFilter does something tricky to ensure it only closes the 
 underlying stream once.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3235) TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug

2011-11-14 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149581#comment-13149581
 ] 

Uwe Schindler commented on LUCENE-3235:
---

An easy fix would be to use Collections.synchronizedMap(new HashMap()) in the 
ctor to initialize cache1 and cache2 (if Java 5 is detected). If people are 
using Java 5 they just don't get the best performance.
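
Roughly like this, purely as a sketch (the version check shown is an assumption; 
the real patch may detect Java 5 differently, e.g. via Lucene's Constants):

{noformat}
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CacheMapFactory {
  // On Java 5, fall back to a synchronized HashMap to avoid the ConcurrentHashMap
  // deadlock (JDK bug 6822370); on Java 6+ keep the lock-free ConcurrentHashMap.
  static <K, V> Map<K, V> newCacheMap(int initialSize) {
    if (System.getProperty("java.version").startsWith("1.5")) {
      return Collections.synchronizedMap(new HashMap<K, V>(initialSize));
    }
    return new ConcurrentHashMap<K, V>(initialSize);
  }
  // DoubleBarrelLRUCache's ctor would then do:
  //   cache1 = newCacheMap(size); cache2 = newCacheMap(size);
}
{noformat}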

 TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug
 

 Key: LUCENE-3235
 URL: https://issues.apache.org/jira/browse/LUCENE-3235
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 3.0, 3.1, 3.2, 3.3, 3.4
Reporter: Michael McCandless
 Fix For: 3.5


 Not sure what's going on yet... but under Java 1.6 it seems not to hang, but 
 under Java 1.5 it hangs fairly easily, on Linux.  Java is 1.5.0_22.
 I suspect this is relevant: 
 http://stackoverflow.com/questions/3292577/is-it-possible-for-concurrenthashmap-to-deadlock
  which refers to this JVM bug 
 http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6865591 which then refers 
 to this one http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6822370
 It looks like that last bug was fixed in Java 1.6 but not 1.5.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [JENKINS] Lucene-Solr-tests-only-trunk - Build # 11324 - Failure

2011-11-14 Thread Michael McCandless
I'll dig...

Mike McCandless

http://blog.mikemccandless.com

On Mon, Nov 14, 2011 at 6:28 AM, Apache Jenkins Server
jenk...@builds.apache.org wrote:
 Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/11324/

 1 tests failed.
 REGRESSION:  org.apache.lucene.search.TestSort.testReverseSort

 Error Message:
 expected:[CEGIA] but was:[ACEGI]

 Stack Trace:
        at org.apache.lucene.search.TestSort.assertMatches(TestSort.java:1234)
        at org.apache.lucene.search.TestSort.assertMatches(TestSort.java:1215)
        at org.apache.lucene.search.TestSort.testReverseSort(TestSort.java:758)
        at 
 org.apache.lucene.util.LuceneTestCase$3$1.evaluate(LuceneTestCase.java:523)
        at 
 org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:149)
        at 
 org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:51)




 Build Log (for compile errors):
 [...truncated 1331 lines...]



 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3269) Speed up Top-K sampling tests

2011-11-14 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149583#comment-13149583
 ] 

Robert Muir commented on LUCENE-3269:
-

Hi Shai: a couple of suggestions.

With the current patch we will never close these directories, so we lose some 
test coverage like the CheckIndex at the end...
I think these tests caught a serious JRE bug in this CheckIndex, so I'd like to 
keep it.

Additionally, I think we have a problem if we randomly get an FSDirectory, 
especially on Windows.

So how about we build up a RAMDirectory and cache it? When the top-K tests start up 
they could do something like this:

{noformat}
   Directory dir = newDirectory(random, getCachedDir());
   ...
   dir.close();
{noformat}

where getCachedDir is the accessor for the cache (if it doesn't exist, it builds 
it, and it's always a RAMDirectory).
(LuceneTestCase already has newDirectory(random, Directory) that copies from an 
existing directory)
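
A rough sketch of what that could look like (the class, field and buildIndex 
helper names are made up for illustration, not from any patch):

{noformat}
import java.io.IOException;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.LuceneTestCase;

public abstract class CachedIndexTestCase extends LuceneTestCase {
  private static Directory cachedDir;          // the cached content, always a RAMDirectory

  static synchronized Directory getCachedDir() throws IOException {
    if (cachedDir == null) {
      RAMDirectory ram = new RAMDirectory();
      buildIndex(ram);                         // build the fixture content once
      cachedDir = ram;
    }
    return cachedDir;
  }

  static void buildIndex(Directory dir) throws IOException {
    // ... IndexWriter work to create the sampling fixture goes here (omitted) ...
  }

  // Each test copies the cache into a fresh random Directory and closes that copy,
  // so the CheckIndex-on-close coverage is kept:
  //   Directory dir = newDirectory(random, getCachedDir());
  //   ...
  //   dir.close();
}
{noformat}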


 Speed up Top-K sampling tests
 -

 Key: LUCENE-3269
 URL: https://issues.apache.org/jira/browse/LUCENE-3269
 Project: Lucene - Java
  Issue Type: Test
  Components: modules/facet
Reporter: Robert Muir
 Fix For: 3.5, 4.0

 Attachments: LUCENE-3269.patch, LUCENE-3269.patch, LUCENE-3269.patch, 
 LUCENE-3269.patch


 speed up the top-k sampling tests (but make sure they are thorough on nightly 
 etc still)
 usually we would do this with use of atLeast(), but these tests are somewhat 
 tricky,
 so maybe a different approach is needed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3235) TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug

2011-11-14 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149584#comment-13149584
 ] 

Robert Muir commented on LUCENE-3235:
-

I like Uwe's idea: not-the-best-performance is far preferable to a 
hang/deadlock!

 TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug
 

 Key: LUCENE-3235
 URL: https://issues.apache.org/jira/browse/LUCENE-3235
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 3.0, 3.1, 3.2, 3.3, 3.4
Reporter: Michael McCandless
 Fix For: 3.5


 Not sure what's going on yet... but under Java 1.6 it seems not to hang, but 
 under Java 1.5 it hangs fairly easily, on Linux.  Java is 1.5.0_22.
 I suspect this is relevant: 
 http://stackoverflow.com/questions/3292577/is-it-possible-for-concurrenthashmap-to-deadlock
  which refers to this JVM bug 
 http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6865591 which then refers 
 to this one http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6822370
 It looks like that last bug was fixed in Java 1.6 but not 1.5.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3235) TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug

2011-11-14 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149585#comment-13149585
 ] 

Uwe Schindler commented on LUCENE-3235:
---

I am currently preparing a patch.

 TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug
 

 Key: LUCENE-3235
 URL: https://issues.apache.org/jira/browse/LUCENE-3235
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 3.0, 3.1, 3.2, 3.3, 3.4
Reporter: Michael McCandless
 Fix For: 3.5


 Not sure what's going on yet... but under Java 1.6 it seems not to hang, but 
 under Java 1.5 it hangs fairly easily, on Linux.  Java is 1.5.0_22.
 I suspect this is relevant: 
 http://stackoverflow.com/questions/3292577/is-it-possible-for-concurrenthashmap-to-deadlock
  which refers to this JVM bug 
 http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6865591 which then refers 
 to this one http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6822370
 It looks like that last bug was fixed in Java 1.6 but not 1.5.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3089) CachingTokenFilter can cause close() to be called twice.

2011-11-14 Thread Michael McCandless (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149586#comment-13149586
 ] 

Michael McCandless commented on LUCENE-3089:


{quote}
So for example, there I think it makes sense for IndexReader to not support 
AutoCloseable, and separately
to remove the stupid IndexSearcher(Directory) so that IndexSearcher only takes 
IndexReader, so it's always
a thin wrapper like we claim it is (which is an outright lie today). 
{quote}

+1

We should deprecate/remove the IS ctor that takes a Directory.  It's trappy.

 CachingTokenFilter can cause close() to be called twice.
 

 Key: LUCENE-3089
 URL: https://issues.apache.org/jira/browse/LUCENE-3089
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Robert Muir

 In LUCENE-3064, we added some state and checks to MockTokenizer to validate 
 that consumers
 are properly using the tokenstream workflow (described here: 
 http://lucene.apache.org/java/3_0_3/api/core/org/apache/lucene/analysis/TokenStream.html)
 One problem I noticed in 
 TestTermVectorsWriter.testEndOffsetPositionWithCachingTokenFilter is that 
 providing a CachingTokenFilter directly will result
 in close() being called twice on the underlying tokenstream... this seems 
 wrong.
 Some ideas to fix this could be:
 # CachingTokenFilter overrides close() and we document that you must close 
 the underlying stream yourself. I think this is what the queryparser does 
 anyway.
 # CachingTokenFilter does something tricky to ensure it only closes the 
 underlying stream once.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-3235) TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug

2011-11-14 Thread Uwe Schindler (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-3235:
--

Attachment: LUCENE-3235.patch

Patch.

We should forward port the deprecation/removal of useless Constants.

 TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug
 

 Key: LUCENE-3235
 URL: https://issues.apache.org/jira/browse/LUCENE-3235
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 3.0, 3.1, 3.2, 3.3, 3.4
Reporter: Michael McCandless
 Fix For: 3.5

 Attachments: LUCENE-3235.patch


 Not sure what's going on yet... but under Java 1.6 it seems not to hang, but 
 under Java 1.5 it hangs fairly easily, on Linux.  Java is 1.5.0_22.
 I suspect this is relevant: 
 http://stackoverflow.com/questions/3292577/is-it-possible-for-concurrenthashmap-to-deadlock
  which refers to this JVM bug 
 http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6865591 which then refers 
 to this one http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6822370
 It looks like that last bug was fixed in Java 1.6 but not 1.5.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3269) Speed up Top-K sampling tests

2011-11-14 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149594#comment-13149594
 ] 

Robert Muir commented on LUCENE-3269:
-

Sorry Shai, I got myself confused and thought you were trying to cache 
across tests...
this patch is good in the case where a test has multiple methods...!

 Speed up Top-K sampling tests
 -

 Key: LUCENE-3269
 URL: https://issues.apache.org/jira/browse/LUCENE-3269
 Project: Lucene - Java
  Issue Type: Test
  Components: modules/facet
Reporter: Robert Muir
 Fix For: 3.5, 4.0

 Attachments: LUCENE-3269.patch, LUCENE-3269.patch, LUCENE-3269.patch, 
 LUCENE-3269.patch


 speed up the top-k sampling tests (but make sure they are thorough on nightly 
 etc still)
 usually we would do this with use of atLeast(), but these tests are somewhat 
 tricky,
 so maybe a different approach is needed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3089) CachingTokenFilter can cause close() to be called twice.

2011-11-14 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149595#comment-13149595
 ] 

Uwe Schindler commented on LUCENE-3089:
---

Then we should also rename close() to something else: 
closeThisIfYouAreReallySure() - implementing Closeable would then already be out of 
scope. Adding a close() method to a class leads you to take care of 
closing it after use. Also, everybody expects what the Closeable interface 
defines: you can call it multiple times.

For TokenStreams that's fine, as close() is just a cleanup and is not even 
required if you don't have a Tokenizer with a Reader.

 CachingTokenFilter can cause close() to be called twice.
 

 Key: LUCENE-3089
 URL: https://issues.apache.org/jira/browse/LUCENE-3089
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Robert Muir

 In LUCENE-3064, we added some state and checks to MockTokenizer to validate 
 that consumers
 are properly using the tokenstream workflow (described here: 
 http://lucene.apache.org/java/3_0_3/api/core/org/apache/lucene/analysis/TokenStream.html)
 One problem I noticed in 
 TestTermVectorsWriter.testEndOffsetPositionWithCachingTokenFilter is that 
 providing a CachingTokenFilter directly will result
 in close() being called twice on the underlying tokenstream... this seems 
 wrong.
 Some ideas to fix this could be:
 # CachingTokenFilter overrides close() and we document that you must close 
 the underlying stream yourself. I think this is what the queryparser does 
 anyway.
 # CachingTokenFilter does something tricky to ensure it only closes the 
 underlying stream once.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: set PYTHONPATH programatically from Java?

2011-11-14 Thread Roman Chyla
On Mon, Nov 14, 2011 at 4:25 AM, Andi Vajda va...@apache.org wrote:

 On Sun, 13 Nov 2011, Roman Chyla wrote:

 I am using JCC to run Python inside Java. For unittest, I'd like to
 set PYTHONPATH environment variable programmatically. I can change env
 vars inside Java (using

 http://stackoverflow.com/questions/318239/how-do-i-set-environment-variables-from-java)
 and System.getenv(PYTHONPATH) shows correct values

 However, I am still getting ImportError: no module named

 If I set PYTHONPATH before starting unittest, it works fine

 Is it possible what I would like to do?

 Why mess with the environment instead of setting sys.path directly instead ?

That would be great, but I don't know how. I am doing roughly this:

PythonVM.start(programName)
vm = PythonVM.get()
vm.instantiate(moduleName, className);

I tried also:
PythonVM.start(programName, new String[]{-c, import
sys;sys.path.insert(0, \'/dvt/workspace/montysolr/src/python\'});

it is failing on vm.instantiate when Python cannot find the module


 Alternatively, if JCC could execute/eval python string, I could set
 sys.argv that way

 I'm not sure what you mean here but JCC's Java PythonVM.init() method takes
 an array of strings that is fed into sys.argv. See _PythonVM_Init() sources
 in jcc.cpp for details.

sorry, i meant sys.path, not sys.argv

roman


 Andi..



[jira] [Commented] (LUCENE-3269) Speed up Top-K sampling tests

2011-11-14 Thread Shai Erera (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149597#comment-13149597
 ] 

Shai Erera commented on LUCENE-3269:


Right. Caching across tests is very tricky since they can run in 
different JVMs anyway (with parallel testing), and so we'd gain nothing. And the tests 
are not really slow - the sampling tests run in 12 seconds on my laptop ... not a 
big deal.

I'll commit shortly.

 Speed up Top-K sampling tests
 -

 Key: LUCENE-3269
 URL: https://issues.apache.org/jira/browse/LUCENE-3269
 Project: Lucene - Java
  Issue Type: Test
  Components: modules/facet
Reporter: Robert Muir
 Fix For: 3.5, 4.0

 Attachments: LUCENE-3269.patch, LUCENE-3269.patch, LUCENE-3269.patch, 
 LUCENE-3269.patch


 speed up the top-k sampling tests (but make sure they are thorough on nightly 
 etc still)
 usually we would do this with use of atLeast(), but these tests are somewhat 
 tricky,
 so maybe a different approach is needed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-3269) Speed up Top-K sampling tests

2011-11-14 Thread Shai Erera (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera resolved LUCENE-3269.


   Resolution: Fixed
 Assignee: Shai Erera
Lucene Fields: New,Patch Available  (was: New)

Committed revisions 1201677 (3x) and 1201678 (trunk).

Thanks Robert !

 Speed up Top-K sampling tests
 -

 Key: LUCENE-3269
 URL: https://issues.apache.org/jira/browse/LUCENE-3269
 Project: Lucene - Java
  Issue Type: Test
  Components: modules/facet
Reporter: Robert Muir
Assignee: Shai Erera
 Fix For: 3.5, 4.0

 Attachments: LUCENE-3269.patch, LUCENE-3269.patch, LUCENE-3269.patch, 
 LUCENE-3269.patch


 speed up the top-k sampling tests (but make sure they are thorough on nightly 
 etc still)
 usually we would do this with use of atLeast(), but these tests are somewhat 
 tricky,
 so maybe a different approach is needed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3269) Speed up Top-K sampling tests

2011-11-14 Thread Shai Erera (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149600#comment-13149600
 ] 

Shai Erera commented on LUCENE-3269:


I see what got you confused (it was me, not you):

{quote}
however, if they will run in the same JVM, then they will reuse the already 
created indexes
{quote}

what I wrote is wrong (I got myself confused!) -- whatever you do in 
beforeClass affects only that test case, not all the ones that will run in the 
same JVM. Perhaps JUnit needs to invent two more concepts, @StartJVM and @EndJVM, for 
this to happen :)
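
In other words, a bare-bones JUnit 4 sketch (names made up):

{noformat}
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.LuceneTestCase;
import org.junit.AfterClass;
import org.junit.BeforeClass;

public class SamplingIndexTest extends LuceneTestCase {
  private static Directory sharedDir;    // shared by the test methods of this class only

  // Runs once before this class's test methods, not once per JVM.
  @BeforeClass
  public static void buildSharedIndex() throws Exception {
    sharedDir = new RAMDirectory();
  }

  @AfterClass
  public static void closeSharedIndex() throws Exception {
    sharedDir.close();
    sharedDir = null;
  }

  // Test methods using sharedDir go here; other test classes running later in
  // the same JVM never see this state.
}
{noformat}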

 Speed up Top-K sampling tests
 -

 Key: LUCENE-3269
 URL: https://issues.apache.org/jira/browse/LUCENE-3269
 Project: Lucene - Java
  Issue Type: Test
  Components: modules/facet
Reporter: Robert Muir
Assignee: Shai Erera
 Fix For: 3.5, 4.0

 Attachments: LUCENE-3269.patch, LUCENE-3269.patch, LUCENE-3269.patch, 
 LUCENE-3269.patch


 speed up the top-k sampling tests (but make sure they are thorough on nightly 
 etc still)
 usually we would do this with use of atLeast(), but these tests are somewhat 
 tricky,
 so maybe a different approach is needed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Issue Comment Edited] (LUCENE-3269) Speed up Top-K sampling tests

2011-11-14 Thread Shai Erera (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149600#comment-13149600
 ] 

Shai Erera edited comment on LUCENE-3269 at 11/14/11 12:20 PM:
---

I see what got you confused (it was me, not you):

{quote}
however, if they will run in the same JVM, then they will reuse the already 
created indexes
{quote}

what I wrote is wrong (I got myself confused!) -- whatever you do in 
beforeClass affects only that test case, not all the ones that will run in the 
same JVM. Perhaps JUnit needs to invent two more concepts, @StartJVM and @EndJVM, for 
this to happen :)

  was (Author: shaie):
I see what got you confused (it was me, not you):

{quote}
however, if they will run in the same JVM, then they will reuse the already 
created indexes
{quote}

what I wrote is wrong (I got myself confused (!) -- whatever you do in 
beforeClass affects only that testcase, not all the ones that will run in the 
JVM. Perhaps JUnit need to invent two more concepts @StartJVM and @EndJVM, for 
this to happen :)
  
 Speed up Top-K sampling tests
 -

 Key: LUCENE-3269
 URL: https://issues.apache.org/jira/browse/LUCENE-3269
 Project: Lucene - Java
  Issue Type: Test
  Components: modules/facet
Reporter: Robert Muir
Assignee: Shai Erera
 Fix For: 3.5, 4.0

 Attachments: LUCENE-3269.patch, LUCENE-3269.patch, LUCENE-3269.patch, 
 LUCENE-3269.patch


 speed up the top-k sampling tests (but make sure they are thorough on nightly 
 etc still)
 usually we would do this with use of atLeast(), but these tests are somewhat 
 tricky,
 so maybe a different approach is needed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-3097) Post grouping faceting

2011-11-14 Thread Martijn van Groningen (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen resolved LUCENE-3097.
---

   Resolution: Fixed
Lucene Fields: Patch Available  (was: New)

The support for real grouped faceting (matrix counts) needs to be added to Solr 
or the faceting module.
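
For reference, "matrix counts" means counting distinct group values per facet value 
rather than counting documents. A rough illustrative sketch (the Offer class and its 
fields are placeholders, nothing from a patch), matching the hotel/airport example 
in the issue description below:

{noformat}
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class MatrixCounts {
  static class Offer {                   // placeholder for one matching document
    String hotel;                        // the group-by field
    String airport;                      // the facet field
  }

  // For each facet value, count distinct groups instead of documents.
  static Map<String, Integer> airportFacet(List<Offer> matchingDocs) {
    Map<String, Set<String>> groupsPerAirport = new HashMap<String, Set<String>>();
    for (Offer doc : matchingDocs) {
      Set<String> groups = groupsPerAirport.get(doc.airport);
      if (groups == null) {
        groups = new HashSet<String>();
        groupsPerAirport.put(doc.airport, groups);
      }
      groups.add(doc.hotel);
    }
    Map<String, Integer> counts = new HashMap<String, Integer>();
    for (Map.Entry<String, Set<String>> e : groupsPerAirport.entrySet()) {
      counts.put(e.getKey(), e.getValue().size());  // e.g. AMS -> 2, DUS -> 1
    }
    return counts;
  }
}
{noformat}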

 Post grouping faceting
 --

 Key: LUCENE-3097
 URL: https://issues.apache.org/jira/browse/LUCENE-3097
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/grouping
Reporter: Martijn van Groningen
Assignee: Martijn van Groningen
Priority: Minor
 Fix For: 4.0, 3.4

 Attachments: LUCENE-3097.patch, LUCENE-3097.patch, LUCENE-3097.patch, 
 LUCENE-3097.patch, LUCENE-3097.patch, LUCENE-30971.patch


 This issue focuses on implementing post-grouping faceting.
 * How to handle multivalued fields. What field value to show with the facet.
 * Where the facet counts should be based on
 ** Facet counts can be based on the normal documents. Ungrouped counts. 
 ** Facet counts can be based on the groups. Grouped counts.
 ** Facet counts can be based on the combination of group value and facet 
 value. Matrix counts.   
 And probably more implementation options.
 The first two methods are implemented in the SOLR-236 patch. For the first 
 option it calculates a DocSet based on the individual documents from the 
 query result. For the second option it calculates a DocSet for all the most 
 relevant documents of a group. Once the DocSet is computed the FacetComponent 
 and StatsComponent use one of the DocSets to create facets and statistics.  
 This last one is a bit more complex. I think it is best explained with an 
 example. Let's say we search on travel offers:
 |||hotel||departure_airport||duration||
 |Hotel a|AMS|5
 |Hotel a|DUS|10
 |Hotel b|AMS|5
 |Hotel b|AMS|10
 If we group by hotel and have a facet on airport, most end users expect 
 (in my experience, of course) the following airport facet:
 AMS: 2
 DUS: 1
 The above result can't be achieved by the first two methods. You either get 
 counts AMS:3 and DUS:1 or 1 for both airports.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-3097) Post grouping faceting

2011-11-14 Thread Martijn van Groningen (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen updated LUCENE-3097:
--

Fix Version/s: (was: 3.5)
   3.4

 Post grouping faceting
 --

 Key: LUCENE-3097
 URL: https://issues.apache.org/jira/browse/LUCENE-3097
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/grouping
Reporter: Martijn van Groningen
Assignee: Martijn van Groningen
Priority: Minor
 Fix For: 3.4, 4.0

 Attachments: LUCENE-3097.patch, LUCENE-3097.patch, LUCENE-3097.patch, 
 LUCENE-3097.patch, LUCENE-3097.patch, LUCENE-30971.patch


 This issue focuses on implementing post-grouping faceting.
 * How to handle multivalued fields. What field value to show with the facet.
 * Where the facet counts should be based on
 ** Facet counts can be based on the normal documents. Ungrouped counts. 
 ** Facet counts can be based on the groups. Grouped counts.
 ** Facet counts can be based on the combination of group value and facet 
 value. Matrix counts.   
 And probably more implementation options.
 The first two methods are implemented in the SOLR-236 patch. For the first 
 option it calculates a DocSet based on the individual documents from the 
 query result. For the second option it calculates a DocSet for all the most 
 relevant documents of a group. Once the DocSet is computed the FacetComponent 
 and StatsComponent use one of the DocSets to create facets and statistics.  
 This last one is a bit more complex. I think it is best explained with an 
 example. Let's say we search on travel offers:
 |||hotel||departure_airport||duration||
 |Hotel a|AMS|5
 |Hotel a|DUS|10
 |Hotel b|AMS|5
 |Hotel b|AMS|10
 If we group by hotel and have a facet on airport, most end users expect 
 (in my experience, of course) the following airport facet:
 AMS: 2
 DUS: 1
 The above result can't be achieved by the first two methods. You either get 
 counts AMS:3 and DUS:1 or 1 for both airports.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-2898) Support grouped faceting

2011-11-14 Thread Martijn van Groningen (Created) (JIRA)
Support grouped faceting


 Key: SOLR-2898
 URL: https://issues.apache.org/jira/browse/SOLR-2898
 Project: Solr
  Issue Type: New Feature
Reporter: Martijn van Groningen


Support grouped faceting. As described in LUCENE-3097 (matrix counts).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-tests-only-trunk - Build # 11325 - Still Failing

2011-11-14 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/11325/

All tests passed

Build Log (for compile errors):
[...truncated 14647 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-3571) nuke IndexSearcher(directory)

2011-11-14 Thread Robert Muir (Created) (JIRA)
nuke IndexSearcher(directory)
-

 Key: LUCENE-3571
 URL: https://issues.apache.org/jira/browse/LUCENE-3571
 Project: Lucene - Java
  Issue Type: Task
Reporter: Robert Muir
 Fix For: 4.0


IndexSearcher is supposed to be a cheap wrapper around a reader,
but sometimes it is, sometimes it isn't.

I think it's a confusing tangling of a heavyweight and a lightweight
object that it sometimes 'houses' a reader and must close it in that case.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-3571) nuke IndexSearcher(directory)

2011-11-14 Thread Robert Muir (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-3571:


Attachment: LUCENE-3571.patch

 nuke IndexSearcher(directory)
 -

 Key: LUCENE-3571
 URL: https://issues.apache.org/jira/browse/LUCENE-3571
 Project: Lucene - Java
  Issue Type: Task
Reporter: Robert Muir
 Fix For: 4.0

 Attachments: LUCENE-3571.patch


 IndexSearcher is supposed to be a cheap wrapper around a reader,
 but sometimes it is, sometimes it isn't.
 I think it's a confusing tangling of a heavyweight and a lightweight 
 object that it sometimes 'houses' a reader and must close it in that case.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-3572) MultiIndexDocValues pretends it can merge sorted sources

2011-11-14 Thread Michael McCandless (Created) (JIRA)
MultiIndexDocValues pretends it can merge sorted sources


 Key: LUCENE-3572
 URL: https://issues.apache.org/jira/browse/LUCENE-3572
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Michael McCandless
 Fix For: 4.0


Nightly build hit this failure:

{noformat}
ant test-core -Dtestcase=TestSort -Dtestmethod=testReverseSort 
-Dtests.seed=791b126576b0cfab:-48895c7243ecc5d0:743c683d1c9f7768 
-Dtests.multiplier=3 -Dargs=-Dfile.encoding=ISO8859-1

[junit] Testcase: testReverseSort(org.apache.lucene.search.TestSort):   
Caused an ERROR
[junit] expected:[CEGIA] but was:[ACEGI]
[junit] at 
org.apache.lucene.search.TestSort.assertMatches(TestSort.java:1248)
[junit] at 
org.apache.lucene.search.TestSort.assertMatches(TestSort.java:1216)
[junit] at 
org.apache.lucene.search.TestSort.testReverseSort(TestSort.java:759)
[junit] at 
org.apache.lucene.util.LuceneTestCase$3$1.evaluate(LuceneTestCase.java:523)
[junit] at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:149)
[junit] at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:51)
{noformat}

It's happening in the test for reverse-sort of a string field with DocValues, 
when the test had gotten SlowMultiReaderWrapper.

I committed a fix to the test to avoid testing this case, but we need a better 
fix to the underlying bug.

MultiIndexDocValues cannot merge sorted sources (I think?), yet somehow it's 
pretending it can (in the above test, the three subs had BYTES_FIXED_SORTED 
type, and the TypePromoter happily claims to merge these to BYTES_FIXED_SORTED); 
I think MultiIndexDocValues should return null for the sorted source in this 
case?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [JENKINS] Lucene-Solr-tests-only-trunk - Build # 11324 - Failure

2011-11-14 Thread Michael McCandless
OK I committed a fix to the test, but also opened LUCENE-3572 to get
to the root cause...

Mike McCandless

http://blog.mikemccandless.com

On Mon, Nov 14, 2011 at 6:39 AM, Michael McCandless
luc...@mikemccandless.com wrote:
 I'll dig...

 Mike McCandless

 http://blog.mikemccandless.com

 On Mon, Nov 14, 2011 at 6:28 AM, Apache Jenkins Server
 jenk...@builds.apache.org wrote:
 Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/11324/

 1 tests failed.
 REGRESSION:  org.apache.lucene.search.TestSort.testReverseSort

 Error Message:
 expected:[CEGIA] but was:[ACEGI]

 Stack Trace:
        at org.apache.lucene.search.TestSort.assertMatches(TestSort.java:1234)
        at org.apache.lucene.search.TestSort.assertMatches(TestSort.java:1215)
        at 
 org.apache.lucene.search.TestSort.testReverseSort(TestSort.java:758)
        at 
 org.apache.lucene.util.LuceneTestCase$3$1.evaluate(LuceneTestCase.java:523)
        at 
 org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:149)
        at 
 org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:51)




 Build Log (for compile errors):
 [...truncated 1331 lines...]



 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-tests-only-trunk-java7 - Build # 940 - Failure

2011-11-14 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk-java7/940/

1 tests failed.
REGRESSION:  org.apache.solr.update.AutoCommitTest.testMaxDocs

Error Message:
should find one query failed XPath: //result[@numFound=1]  xml response was: 
<?xml version="1.0" encoding="UTF-8"?> <response> <lst 
name="responseHeader"><int name="status">0</int><int 
name="QTime">1</int></lst><result name="response" numFound="0" 
start="0"></result> </response>   request was: 
start=0&q=id:14&qt=standard&rows=20&version=2.2

Stack Trace:
junit.framework.AssertionFailedError: should find one query failed XPath: 
//result[@numFound=1]
 xml response was: <?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int 
name="QTime">1</int></lst><result name="response" numFound="0" 
start="0"></result>
</response>

 request was: start=0&q=id:14&qt=standard&rows=20&version=2.2
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:149)
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:51)
 xml response was: ?xml version=1.0 encoding=UTF-8?
response
lst name=responseHeaderint name=status0/intint 
name=QTime1/int/lstresult name=response numFound=0 
start=0/result
/response

 request was: start=0q=id:14qt=standardrows=20version=2.2
at 
org.apache.solr.util.AbstractSolrTestCase.assertQ(AbstractSolrTestCase.java:260)
at 
org.apache.solr.update.AutoCommitTest.testMaxDocs(AutoCommitTest.java:181)
at 
org.apache.lucene.util.LuceneTestCase$3$1.evaluate(LuceneTestCase.java:523)




Build Log (for compile errors):
[...truncated 10996 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-1726) Deep Paging and Large Results Improvements

2011-11-14 Thread Manojkumar Rangasamy Kannadasan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149629#comment-13149629
 ] 

Manojkumar Rangasamy Kannadasan commented on SOLR-1726:
---

Hi,
I am working on adding a new type of query for this issue (SOLR-1726) by including the 
lastpageScore and lastDoc in the query, as stated by Grant. Can anyone please 
point me to the place in the code where I can add a new mapping rule from this 
query to a new function in SolrIndexSearcher?
Kindly reply.

 Deep Paging and Large Results Improvements
 --

 Key: SOLR-1726
 URL: https://issues.apache.org/jira/browse/SOLR-1726
 Project: Solr
  Issue Type: Improvement
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Minor
 Fix For: 3.5, 4.0


 There are possibly ways to improve collections of deep paging by passing 
 Solr/Lucene more information about the last page of results seen, thereby 
 saving priority queue operations.   See LUCENE-2215.
 There may also be better options for retrieving large numbers of rows at a 
 time that are worth exploring.  LUCENE-2127.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-tests-only-trunk - Build # 11326 - Still Failing

2011-11-14 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/11326/

1 tests failed.
REGRESSION:  org.apache.solr.update.AutoCommitTest.testMaxDocs

Error Message:
should find one query failed XPath: //result[@numFound=1]  xml response was: 
<?xml version="1.0" encoding="UTF-8"?> <response> <lst 
name="responseHeader"><int name="status">0</int><int 
name="QTime">3</int></lst><result name="response" numFound="0" 
start="0"></result> </response>   request was: 
start=0&q=id:14&qt=standard&rows=20&version=2.2

Stack Trace:
junit.framework.AssertionFailedError: should find one query failed XPath: 
//result[@numFound=1]
 xml response was: <?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int 
name="QTime">3</int></lst><result name="response" numFound="0" 
start="0"></result>
</response>

 request was: start=0&q=id:14&qt=standard&rows=20&version=2.2
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:149)
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:51)
 xml response was: <?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int 
name="QTime">3</int></lst><result name="response" numFound="0" 
start="0"></result>
</response>

 request was: start=0&q=id:14&qt=standard&rows=20&version=2.2
at 
org.apache.solr.util.AbstractSolrTestCase.assertQ(AbstractSolrTestCase.java:260)
at 
org.apache.solr.update.AutoCommitTest.testMaxDocs(AutoCommitTest.java:181)
at 
org.apache.lucene.util.LuceneTestCase$3$1.evaluate(LuceneTestCase.java:523)




Build Log (for compile errors):
[...truncated 7847 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3235) TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug

2011-11-14 Thread Simon Willnauer (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149631#comment-13149631
 ] 

Simon Willnauer commented on LUCENE-3235:
-

bq. An easy fix would be to use Collections.synchronizedMap(new HashMap()) in 
the ctor to initialize cache1 and cache2 (if Java 5 is detected)? If people 
are using Java 5 they get not the best performance.

I like that too...
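
For reference, a rough sketch of that workaround (illustrative only, not the actual DoubleBarrelLRUCache code; the isJava5 flag would come from a constant like the ones being added in LUCENE-3574):

{code}
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CacheMapFactory {
  /**
   * Sketch only: on Java 5, where the ConcurrentHashMap JVM bug can hang,
   * fall back to a synchronized HashMap; otherwise keep ConcurrentHashMap.
   */
  public static <K, V> Map<K, V> newCacheMap(boolean isJava5) {
    if (isJava5) {
      return Collections.synchronizedMap(new HashMap<K, V>());
    }
    return new ConcurrentHashMap<K, V>();
  }
}
{code}

The only cost is coarser locking on Java 5; Java 6 users keep the current behavior.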

 TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug
 

 Key: LUCENE-3235
 URL: https://issues.apache.org/jira/browse/LUCENE-3235
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 3.0, 3.1, 3.2, 3.3, 3.4
Reporter: Michael McCandless
 Fix For: 3.5

 Attachments: LUCENE-3235.patch


 Not sure what's going on yet... but under Java 1.6 it seems not to hang, but 
 under Java 1.5 it hangs fairly easily, on Linux.  Java is 1.5.0_22.
 I suspect this is relevant: 
 http://stackoverflow.com/questions/3292577/is-it-possible-for-concurrenthashmap-to-deadlock
  which refers to this JVM bug 
 http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6865591 which then refers 
 to this one http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6822370
 It looks like that last bug was fixed in Java 1.6 but not 1.5.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Case-sensitive search problem

2011-11-14 Thread jayanta sahoo
Hi,
Whenever I am searching with the words OfficeJet or officejet or
Officejet or oFiiIcejET, I am getting different results for each
search. I am not able to understand why this is happening.
   I want to solve this problem in such a way that the search becomes case
insensitive and I get the same result for any combination of capital and
small letters.
Please let me know how I can solve this problem.

-- 
Jayanta Sahoo


[jira] [Commented] (LUCENE-3571) nuke IndexSearcher(directory)

2011-11-14 Thread Simon Willnauer (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149634#comment-13149634
 ] 

Simon Willnauer commented on LUCENE-3571:
-

+1

 nuke IndexSearcher(directory)
 -

 Key: LUCENE-3571
 URL: https://issues.apache.org/jira/browse/LUCENE-3571
 Project: Lucene - Java
  Issue Type: Task
Reporter: Robert Muir
 Fix For: 4.0

 Attachments: LUCENE-3571.patch


 IndexSearcher is supposed to be a cheap wrapper around a reader,
 but sometimes it is, sometimes it isn't.
 I think its confusing tangling of a heavyweight and lightweight
 object that it sometimes 'houses' a reader and must close it in that case.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Issue Comment Edited] (LUCENE-3571) nuke IndexSearcher(directory)

2011-11-14 Thread Simon Willnauer (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149634#comment-13149634
 ] 

Simon Willnauer edited comment on LUCENE-3571 at 11/14/11 2:12 PM:
---

+1 - actually I think we should deprecate this ctor in 3.x - nobody should use 
that really

  was (Author: simonw):
+1
  
 nuke IndexSearcher(directory)
 -

 Key: LUCENE-3571
 URL: https://issues.apache.org/jira/browse/LUCENE-3571
 Project: Lucene - Java
  Issue Type: Task
Reporter: Robert Muir
 Fix For: 4.0

 Attachments: LUCENE-3571.patch


 IndexSearcher is supposed to be a cheap wrapper around a reader,
 but sometimes it is, sometimes it isn't.
 I think its confusing tangling of a heavyweight and lightweight
 object that it sometimes 'houses' a reader and must close it in that case.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Query mapping for Issue 1726

2011-11-14 Thread Manojkumar Rangasamy Kannadasan
Hi,

I would like to add a new type of query for issue 1726 by including
the lastpageScore and lastDoc in the query. Can anyone please let me know
where in the code I can insert a new mapping rule for this query to a
new function in SolrIndexSearcher?
Kindly reply.

Thanks & Regards,
Manoj Kumar.R.K
Graduate Student, MS Computer Science
University at Buffalo
Buffalo, New York
(413) 461-8938|www.rkmanojkumar.co.nr


[jira] [Commented] (LUCENE-3305) Kuromoji code donation - a new Japanese morphological analyzer

2011-11-14 Thread Christian Moen (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149642#comment-13149642
 ] 

Christian Moen commented on LUCENE-3305:


Thanks a lot, Simon!

Robert, I agree completely with your comments.  The Unicode normalization is 
only done at dictionary build time.  Simon has turned it on by default -- its 
previous default was off.  Perhaps it makes sense to have it on in Lucene's 
case...

Simon, the TokenizerRunner class doesn't seem to be included in the patch, 
which might be fine.  It's not strictly necessary for Lucene, but I think it's 
useful to keep it there so the analyzer can easily be run from the command 
line.  The DebugTokenizer and GraphvizFormatter are there already, which aren't 
strictly necessary either but are sometimes quite useful, so I think we should 
add the TokenizerRunner as well -- at least for now.

Tests didn't pass in my case, but I'll look more into this soon.  My tomorrow 
is very busy, but I'll have time for this on Wednesday.


 Kuromoji code donation - a new Japanese morphological analyzer
 --

 Key: LUCENE-3305
 URL: https://issues.apache.org/jira/browse/LUCENE-3305
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/analysis
Reporter: Christian Moen
Assignee: Simon Willnauer
 Fix For: 4.0

 Attachments: Kuromoji short overview .pdf, LUCENE-3305.patch, 
 ip-clearance-Kuromoji.xml, ip-clearance-Kuromoji.xml, 
 kuromoji-0.7.6-asf.tar.gz, kuromoji-0.7.6.tar.gz, 
 kuromoji-solr-0.5.3-asf.tar.gz, kuromoji-solr-0.5.3.tar.gz


 Atilika Inc. (アティリカ株式会社) would like to donate the Kuromoji Japanese 
 morphological analyzer to the Apache Software Foundation in the hope that it 
 will be useful to Lucene and Solr users in Japan and elsewhere.
 The project was started in 2010 since we couldn't find any high-quality, 
 actively maintained and easy-to-use Java-based Japanese morphological 
 analyzers, and these become many of our design goals for Kuromoji.
 Kuromoji also has a segmentation mode that is particularly useful for search, 
 which we hope will interest Lucene and Solr users.  Compound-nouns, such as 
 関西国際空港 (Kansai International Airport) and 日本経済新聞 (Nikkei Newspaper), are 
 segmented as one token with most analyzers.  As a result, a search for 空港 
 (airport) or 新聞 (newspaper) will not give you a hit for these words.  Kuromoji 
 can segment these words into 関西 国際 空港 and 日本 経済 新聞, which is generally what 
 you would want for search and you'll get a hit.
 We also wanted to make sure the technology has a license that makes it 
 compatible with other Apache Software Foundation software to maximize its 
 usefulness.  Kuromoji has an Apache License 2.0 and all code is currently 
 owned by Atilika Inc.  The software has been developed by my good friend and 
 ex-colleague Masaru Hasegawa and myself.
 Kuromoji uses the so-called IPADIC for its dictionary/statistical model and 
 its license terms are described in NOTICE.txt.
 I'll upload code distributions and their corresponding hashes and I'd very 
 much like to start the code grant process.  I'm also happy to provide patches 
 to integrate Kuromoji into the codebase, if you prefer that.
 Please advise on how you'd like me to proceed with this.  Thank you.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-3571) nuke IndexSearcher(directory)

2011-11-14 Thread Robert Muir (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-3571:


Fix Version/s: 3.5

setting fix version 3.x for the @deprecated

 nuke IndexSearcher(directory)
 -

 Key: LUCENE-3571
 URL: https://issues.apache.org/jira/browse/LUCENE-3571
 Project: Lucene - Java
  Issue Type: Task
Reporter: Robert Muir
 Fix For: 3.5, 4.0

 Attachments: LUCENE-3571.patch


 IndexSearcher is supposed to be a cheap wrapper around a reader,
 but sometimes it is, sometimes it isn't.
 I think its confusing tangling of a heavyweight and lightweight
 object that it sometimes 'houses' a reader and must close it in that case.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-3573) TaxonomyReader.refresh() is broken, replace its logic with reopen(), following IR.reopen pattern

2011-11-14 Thread Doron Cohen (Created) (JIRA)
TaxonomyReader.refresh() is broken, replace its logic with reopen(), following 
IR.reopen pattern


 Key: LUCENE-3573
 URL: https://issues.apache.org/jira/browse/LUCENE-3573
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/facet
Reporter: Doron Cohen
Assignee: Doron Cohen
Priority: Minor


When recreating the taxonomy index, TR's assumption that categories are only 
added does not hold anymore.
As a result, calling TR.refresh() will be incorrect at best, but usually throw an 
AIOOBE.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3571) nuke IndexSearcher(directory)

2011-11-14 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149649#comment-13149649
 ] 

Uwe Schindler commented on LUCENE-3571:
---

+1

 nuke IndexSearcher(directory)
 -

 Key: LUCENE-3571
 URL: https://issues.apache.org/jira/browse/LUCENE-3571
 Project: Lucene - Java
  Issue Type: Task
Reporter: Robert Muir
 Fix For: 3.5, 4.0

 Attachments: LUCENE-3571.patch


 IndexSearcher is supposed to be a cheap wrapper around a reader,
 but sometimes it is, sometimes it isn't.
 I think its confusing tangling of a heavyweight and lightweight
 object that it sometimes 'houses' a reader and must close it in that case.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-3573) TaxonomyReader.refresh() is broken, replace its logic with reopen(), following IR.reopen pattern

2011-11-14 Thread Doron Cohen (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doron Cohen updated LUCENE-3573:


Attachment: LUCENE-3573.patch

Attached patch for trunk adds two tests:
* one of them is opening a new TR and passes
* the other is refreshing the TR and fails.

 TaxonomyReader.refresh() is broken, replace its logic with reopen(), 
 following IR.reopen pattern
 

 Key: LUCENE-3573
 URL: https://issues.apache.org/jira/browse/LUCENE-3573
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/facet
Reporter: Doron Cohen
Assignee: Doron Cohen
Priority: Minor
 Attachments: LUCENE-3573.patch


 When recreating the taxonomy index, TR's assumption that categories are only 
 added does not hold anymore.
 As a result, calling TR.refresh() will be incorrect at best, but usually throw 
 an AIOOBE.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Case-sensitive search problem

2011-11-14 Thread Erick Erickson
If you're using the example schema, your problem is
probably WordDelimiterFilterFactory, which splits
the input into separate tokens if the case changes.

See admin/analysis for a great way to see what
your analysis chain does at every step. Click the
verbose mode...
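
For example, here's a standalone Lucene sketch (not your Solr schema) showing
that a lowercase filter maps those variants to the same token; it assumes a
3.x-style analysis API (Version.LUCENE_34 here is just the version I picked):

import java.io.StringReader;

import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.WhitespaceTokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.util.Version;

public class LowercaseDemo {
  public static void main(String[] args) throws Exception {
    for (String input : new String[] {"OfficeJet", "officejet", "Officejet"}) {
      // WhitespaceTokenizer keeps the word intact; LowerCaseFilter normalizes case.
      TokenStream ts = new LowerCaseFilter(Version.LUCENE_34,
          new WhitespaceTokenizer(Version.LUCENE_34, new StringReader(input)));
      CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
      ts.reset();
      while (ts.incrementToken()) {
        System.out.println(input + " -> " + term.toString()); // each prints "officejet"
      }
      ts.end();
      ts.close();
    }
  }
}

In Solr the equivalent is making sure LowerCaseFilterFactory is in both the
index and query analyzer chains for that field, and checking whether
WordDelimiterFilterFactory is also splitting on case changes.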

Best
Erick

On Mon, Nov 14, 2011 at 8:22 AM, jayanta sahoo jsahoo1...@gmail.com wrote:
 Hi,
 Whenever I am searching with the words OfficeJet or officejet or
 Officejet or oFiiIcejET, I am getting different results for each
 search. I am not able to understand why this is happening.
    I want to solve this problem in such a way that the search becomes case
 insensitive and I get the same result for any combination of capital and
 small letters.
 Please let me know how I can solve this problem.
 --
 Jayanta Sahoo




-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #296: POMs out of sync

2011-11-14 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/296/

No tests ran.

Build Log (for compile errors):
[...truncated 16206 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3573) TaxonomyReader.refresh() is broken, replace its logic with reopen(), following IR.reopen pattern

2011-11-14 Thread Shai Erera (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149661#comment-13149661
 ] 

Shai Erera commented on LUCENE-3573:


+1. I think that we should nuke refresh() and adopt the IR approach, even 
though I don't like the 'maybe' and 'if', might as well make the API 
consistent. So instead of refresh() we'll have a static TR.openIfChanged that 
either returns null (no changes, or the taxonomy wasn't recreated) or a new 
instance in case it was recreated.

Note that unlike IndexReader, if the taxonomy index wasn't recreated, 
openIfChanged will modify the internal state of TR. That's ok since the 
taxonomy index was built for it: existing TR instances (that weren't refreshed) 
won't be affected as they won't know about the new categories (and taxonomy 
index doesn't support deletes) and the caller can use the same TR instance in 
that case.

Whatever we end up doing, we should remove refresh(). Even though we're not 
committed to back-compat yet (it's all experimental), I think it is dangerous 
if we'll simply modify refresh() behavior, because users may not be aware of 
the change. So a new method is a must.

Besides that, the test looks good. Was there any reason to add it to 
TestTaxonomyCombined?
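
To make the proposed contract concrete, caller code would look roughly like
this (openIfChanged does not exist yet -- it is only the API being proposed
here, and the names are placeholders):

{code}
// Caller-side sketch of the proposed static openIfChanged():
TaxonomyReader refreshIfNeeded(TaxonomyReader taxoReader) throws IOException {
  TaxonomyReader newReader = TaxonomyReader.openIfChanged(taxoReader);
  if (newReader == null) {
    return taxoReader;   // null: no changes, or the taxonomy wasn't recreated
  }
  taxoReader.close();    // the taxonomy was recreated, switch to the new instance
  return newReader;
}
{code}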

 TaxonomyReader.refresh() is broken, replace its logic with reopen(), 
 following IR.reopen pattern
 

 Key: LUCENE-3573
 URL: https://issues.apache.org/jira/browse/LUCENE-3573
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/facet
Reporter: Doron Cohen
Assignee: Doron Cohen
Priority: Minor
 Attachments: LUCENE-3573.patch


 When recreating the taxonomy index, TR's assumption that categories are only 
 added does not hold anymore.
 As a result, calling TR.refresh() will be incorrect at best, but usually throw 
 an AIOOBE.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-3574) Add some more constants for newer Java versions to Constants.class, remove outdated ones.

2011-11-14 Thread Uwe Schindler (Created) (JIRA)
Add some more constants for newer Java versions to Constants.class, remove 
outdated ones.
-

 Key: LUCENE-3574
 URL: https://issues.apache.org/jira/browse/LUCENE-3574
 Project: Lucene - Java
  Issue Type: New Feature
  Components: core/other
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 3.5, 4.0


Preparation for LUCENE-3235:
This adds constants to quickly detect Java6 and Java7 to Constants.java. It 
also deprecates and removes the outdated historical Java versions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-3574) Add some more constants for newer Java versions to Constants.class, remove outdated ones.

2011-11-14 Thread Uwe Schindler (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-3574:
--

Attachment: LUCENE-3574-3x.patch

Patch for Lucene 3.x

will remove deprecations in trunk and make JRE_IS_MINIMUM_JRE6 = true (+ 
deprecate it there)

 Add some more constants for newer Java versions to Constants.class, remove 
 outdated ones.
 -

 Key: LUCENE-3574
 URL: https://issues.apache.org/jira/browse/LUCENE-3574
 Project: Lucene - Java
  Issue Type: New Feature
  Components: core/other
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 3.5, 4.0

 Attachments: LUCENE-3574-3x.patch


 Preparation for LUCENE-3235:
 This adds constants to quickly detect Java6 and Java7 to Constants.java. It 
 also deprecates and removes the outdated historical Java versions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3574) Add some more constants for newer Java versions to Constants.class, remove outdated ones.

2011-11-14 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149665#comment-13149665
 ] 

Uwe Schindler commented on LUCENE-3574:
---

Committed 3.x revision: 1201739

 Add some more constants for newer Java versions to Constants.class, remove 
 outdated ones.
 -

 Key: LUCENE-3574
 URL: https://issues.apache.org/jira/browse/LUCENE-3574
 Project: Lucene - Java
  Issue Type: New Feature
  Components: core/other
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 3.5, 4.0

 Attachments: LUCENE-3574-3x.patch


 Preparation for LUCENE-3235:
 This adds constants to quickly detect Java6 and Java7 to Constants.java. It 
 also deprecates and removes the outdated historical Java versions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3574) Add some more constants for newer Java versions to Constants.class, remove outdated ones.

2011-11-14 Thread Shai Erera (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149667#comment-13149667
 ] 

Shai Erera commented on LUCENE-3574:


One typo: nsme -> name

Also, not sure if it's worth it, but perhaps instead of constants like 
MINIMUM_JAVA_X we can have a class JavaVersion that follows the same logic we 
have in Version and can compare itself to other JavaVersions? Then we can have 
constants for JAVA_6 = new JavaVersion(6) and similar for JAVA_7, and another 
CURRENT_JAVA_VER that is initialized with the code you wrote. And you can then 
compare CURRENT to JAVA_6/7?

Just an idea.
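
Roughly something like this (just a sketch, all names made up here):

{code}
// Sketch only -- not proposing this exact code.
public final class JavaVersion implements Comparable<JavaVersion> {
  public static final JavaVersion JAVA_6 = new JavaVersion(6);
  public static final JavaVersion JAVA_7 = new JavaVersion(7);

  private final int major;

  private JavaVersion(int major) {
    this.major = major;
  }

  public boolean atLeast(JavaVersion other) {
    return this.major >= other.major;
  }

  public int compareTo(JavaVersion other) {
    return this.major - other.major;
  }
}
{code}

A CURRENT_JAVA_VER constant would then be initialized once from the detection code.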

 Add some more constants for newer Java versions to Constants.class, remove 
 outdated ones.
 -

 Key: LUCENE-3574
 URL: https://issues.apache.org/jira/browse/LUCENE-3574
 Project: Lucene - Java
  Issue Type: New Feature
  Components: core/other
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 3.5, 4.0

 Attachments: LUCENE-3574-3x.patch


 Preparation for LUCENE-3235:
 This adds constants to quickly detect Java6 and Java7 to Constants.java. It 
 also deprecates and removes the outdated historical Java versions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-3574) Add some more constants for newer Java versions to Constants.class, remove outdated ones.

2011-11-14 Thread Uwe Schindler (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler resolved LUCENE-3574.
---

Resolution: Fixed

Committed trunk revision: 1201741

 Add some more constants for newer Java versions to Constants.class, remove 
 outdated ones.
 -

 Key: LUCENE-3574
 URL: https://issues.apache.org/jira/browse/LUCENE-3574
 Project: Lucene - Java
  Issue Type: New Feature
  Components: core/other
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 3.5, 4.0

 Attachments: LUCENE-3574-3x.patch


 Preparation for LUCENE-3235:
 This adds constants to quickly detect Java6 and Java7 to Constants.java. It 
 also deprecates and removes the outdated historical Java versions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3574) Add some more constants for newer Java versions to Constants.class, remove outdated ones.

2011-11-14 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149671#comment-13149671
 ] 

Robert Muir commented on LUCENE-3574:
-

{quote}
Also, not sure if it's worth it, but perhaps instead of constants like 
MINIMUM_JAVA_X we can have a class JavaVersion that follows the same logic we 
have in Version
{quote}

I think the problem here would be that say we release 3.5 in a week.

Then two years later Java 8 comes out... we can't know today how to detect it. 
So all we can do is say that we are 'at least' java 7 because we have XYZ.

 Add some more constants for newer Java versions to Constants.class, remove 
 outdated ones.
 -

 Key: LUCENE-3574
 URL: https://issues.apache.org/jira/browse/LUCENE-3574
 Project: Lucene - Java
  Issue Type: New Feature
  Components: core/other
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 3.5, 4.0

 Attachments: LUCENE-3574-3x.patch


 Preparation for LUCENE-3235:
 This adds constants to quickly detect Java6 and Java7 to Constants.java. It 
 also deprecates and removes the outdated historical Java versions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-3235) TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug

2011-11-14 Thread Uwe Schindler (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-3235:
--

Attachment: LUCENE-3235.patch

Updated patch after LUCENE-3574 was committed. I also added a 
System.out.println to the test (VERBOSE only).

 TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug
 

 Key: LUCENE-3235
 URL: https://issues.apache.org/jira/browse/LUCENE-3235
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 3.0, 3.1, 3.2, 3.3, 3.4
Reporter: Michael McCandless
 Fix For: 3.5

 Attachments: LUCENE-3235.patch, LUCENE-3235.patch


 Not sure what's going on yet... but under Java 1.6 it seems not to hang, but 
 under Java 1.5 it hangs fairly easily, on Linux.  Java is 1.5.0_22.
 I suspect this is relevant: 
 http://stackoverflow.com/questions/3292577/is-it-possible-for-concurrenthashmap-to-deadlock
  which refers to this JVM bug 
 http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6865591 which then refers 
 to this one http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6822370
 It looks like that last bug was fixed in Java 1.6 but not 1.5.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-tests-only-trunk - Build # 11327 - Still Failing

2011-11-14 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/11327/

All tests passed

Build Log (for compile errors):
[...truncated 14675 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3574) Add some more constants for newer Java versions to Constants.class, remove outdated ones.

2011-11-14 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149677#comment-13149677
 ] 

Uwe Schindler commented on LUCENE-3574:
---

bq. One typo: nsme -> name

nsme -> NoSuchMethodException
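
i.e. the detection probes for a feature and catches that exception -- a sketch
along these lines (not necessarily the exact Constants.java code; the field
name just follows the style of the new constants):

{code}
// Collections.emptyIterator() was added in Java 7, so the probe fails with a
// NoSuchMethodException ("nsme") on older JREs.
public final class Java7Probe {
  public static final boolean JRE_IS_MINIMUM_JAVA7;
  static {
    boolean v7 = true;
    try {
      java.util.Collections.class.getMethod("emptyIterator");
    } catch (NoSuchMethodException nsme) {
      v7 = false;
    }
    JRE_IS_MINIMUM_JAVA7 = v7;
  }
}
{code}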

 Add some more constants for newer Java versions to Constants.class, remove 
 outdated ones.
 -

 Key: LUCENE-3574
 URL: https://issues.apache.org/jira/browse/LUCENE-3574
 Project: Lucene - Java
  Issue Type: New Feature
  Components: core/other
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 3.5, 4.0

 Attachments: LUCENE-3574-3x.patch


 Preparation for LUCENE-3235:
 This adds constants to quickly detect Java6 and Java7 to Constants.java. It 
 also deprecates and removes the outdated historical Java versions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-2898) Support grouped faceting

2011-11-14 Thread Martijn van Groningen (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen updated SOLR-2898:


Attachment: SOLR-2898.patch

Attached initial patch that supports rudimentary grouped field facets for 
single valued and non-tokenized string fields. Grouped facets aren't yet 
implemented for query, range and pivot facets. 

This patch is compatible with trunk. To use it for all field facets use 
group.facet=true or specify it per field. See test in patch for more details.

I just hacked some code in the SimpleFacets class. To support it for all types 
of facets will require a lot of changes in many places in this class. Currently 
I don't see another way...
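
For example, a request would look like this (the field names below are just placeholders):

{noformat}
q=*:*&group=true&group.field=author&facet=true&facet.field=category&group.facet=true
{noformat}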

 Support grouped faceting
 

 Key: SOLR-2898
 URL: https://issues.apache.org/jira/browse/SOLR-2898
 Project: Solr
  Issue Type: New Feature
Reporter: Martijn van Groningen
 Attachments: SOLR-2898.patch


 Support grouped faceting. As described in LUCENE-3097 (matrix counts).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3574) Add some more constants for newer Java versions to Constants.class, remove outdated ones.

2011-11-14 Thread Shai Erera (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149679#comment-13149679
 ] 

Shai Erera commented on LUCENE-3574:


Exactly (I think that's what I meant) -- we detect the Java version as best we 
can and store it in a constant JAVA_VERSION. It can be compared to JAVA_6/7 
thru an atLeast() API, like JAVA_VERSION.atLeast(JAVA_7).

The code in 3.5 will only know to detect up to Java 7, while the code in 5.2 
will know to detect Java 8.

Wouldn't that work?

 Add some more constants for newer Java versions to Constants.class, remove 
 outdated ones.
 -

 Key: LUCENE-3574
 URL: https://issues.apache.org/jira/browse/LUCENE-3574
 Project: Lucene - Java
  Issue Type: New Feature
  Components: core/other
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 3.5, 4.0

 Attachments: LUCENE-3574-3x.patch


 Preparation for LUCENE-3235:
 This adds constants to quickly detect Java6 and Java7 to Constants.java. It 
 also deprecates and removes the outdated historical Java versions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3574) Add some more constants for newer Java versions to Constants.class, remove outdated ones.

2011-11-14 Thread Shai Erera (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149682#comment-13149682
 ] 

Shai Erera commented on LUCENE-3574:


bq. nsme -> NoSuchMethodException

ah, ok :).

 Add some more constants for newer Java versions to Constants.class, remove 
 outdated ones.
 -

 Key: LUCENE-3574
 URL: https://issues.apache.org/jira/browse/LUCENE-3574
 Project: Lucene - Java
  Issue Type: New Feature
  Components: core/other
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 3.5, 4.0

 Attachments: LUCENE-3574-3x.patch


 Preparation for LUCENE-3235:
 This adds constants to quickly detect Java6 and Java7 to Constants.java. It 
 also deprecates and removes the outdated historical Java versions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3573) TaxonomyReader.refresh() is broken, replace its logic with reopen(), following IR.reopen pattern

2011-11-14 Thread Doron Cohen (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149684#comment-13149684
 ] 

Doron Cohen commented on LUCENE-3573:
-

I agree about keeping the same notions as IR. 

bq. returns null (no changes, or the taxonomy wasn't recreated) 

In fact I was thinking of a different contract.

So we have two approaches here for the returned value:

* Option A:
## *new TR* - if the taxonomy was recreated.
## *null* - if the taxonomy was either not modified or just grew.

* Option B:
## *new TR* - if the taxonomy was modified (either recreated or just grew)
## *null* - if the taxonomy was not modified.

Option A is simpler to implement, but I think it has two drawbacks:
* it is confusingly different from that of IR
* the fact that the TR was refreshed is hidden from the caller.

Option B is a bit more involved to implement:
* would need to copy arrays' data from old TR to new one in case the taxonomy 
only grew

I started to implement option B but am now rethinking this...

bq. Was there any reason to add it to TestTaxonomyCombined?

Good point, should probably move this to TestDirectoryTaxonomyReader.

 TaxonomyReader.refresh() is broken, replace its logic with reopen(), 
 following IR.reopen pattern
 

 Key: LUCENE-3573
 URL: https://issues.apache.org/jira/browse/LUCENE-3573
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/facet
Reporter: Doron Cohen
Assignee: Doron Cohen
Priority: Minor
 Attachments: LUCENE-3573.patch


 When recreating the taxonomy index, TR's assumption that categories are only 
 added does not hold anymore.
 As a result, calling TR.refresh() will be incorrect at best, but usually throw 
 an AIOOBE.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3573) TaxonomyReader.refresh() is broken, replace its logic with reopen(), following IR.reopen pattern

2011-11-14 Thread Doron Cohen (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149687#comment-13149687
 ] 

Doron Cohen commented on LUCENE-3573:
-

One more thing 
- In approach B, the fact that the taxonomy just grew simply allows an 
optimization (read only the new ordinals), and so it is not a part of the API 
logic, and the only logic is - was the taxonomy modified or not. 
- In approach A, this fact is part of the API logic. 

 TaxonomyReader.refresh() is broken, replace its logic with reopen(), 
 following IR.reopen pattern
 

 Key: LUCENE-3573
 URL: https://issues.apache.org/jira/browse/LUCENE-3573
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/facet
Reporter: Doron Cohen
Assignee: Doron Cohen
Priority: Minor
 Attachments: LUCENE-3573.patch


 When recreating the taxonomy index, TR's assumption that categories are only 
 added does not hold anymore.
 As a result, calling TR.refresh() will be incorrect at best, but usually throw 
 an AIOOBE.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3496) Support grouping by IndexDocValues

2011-11-14 Thread Martijn van Groningen (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149689#comment-13149689
 ] 

Martijn van Groningen commented on LUCENE-3496:
---

I was planning on doing this. I'm almost ready to commit it. I'm only a bit 
stuck on documents that don't have a value for a group field.

The random grouping tests also add documents with a null value for the group 
field and an empty string for the group field. This works fine with the term 
based implementations, but not the DV based implementations (random tests fail). 
Should we not use null as group value if the dv based implementations are used 
during the test?

 Support grouping by IndexDocValues
 --

 Key: LUCENE-3496
 URL: https://issues.apache.org/jira/browse/LUCENE-3496
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/grouping
Reporter: Martijn van Groningen
Assignee: Martijn van Groningen
 Fix For: 4.0

 Attachments: LUCENE-3496.patch, LUCENE-3496.patch, LUCENE-3496.patch, 
 LUCENE-3496.patch, LUCENE-3496.patch, LUCENE-3496.patch, LUCENE-3496.patch


 Although IDV is not yet finalized (More particular the SortedSource). I think 
 we already can discuss / investigate implementing grouping by IDV.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3574) Add some more constants for newer Java versions to Constants.class, remove outdated ones.

2011-11-14 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149697#comment-13149697
 ] 

Uwe Schindler commented on LUCENE-3574:
---

One example where it might be bad: if it's an enum, you can also do 
if (JAVA_VERSION == JAVA_7), so the enum constants are not named after the fact 
they represent.

I think that's all too much logic for something simple. For one major version we 
will have mostly 2 or 3 constants. In trunk we currently only have Java7 and a 
deprecated one which is always true. New constants are only added on request, 
when we want to test for features/bugs.

 Add some more constants for newer Java versions to Constants.class, remove 
 outdated ones.
 -

 Key: LUCENE-3574
 URL: https://issues.apache.org/jira/browse/LUCENE-3574
 Project: Lucene - Java
  Issue Type: New Feature
  Components: core/other
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 3.5, 4.0

 Attachments: LUCENE-3574-3x.patch


 Preparation for LUCENE-3235:
 This adds constants to quickly detect Java6 and Java7 to Constants.java. It 
 also deprecates and removes the outdated historical Java versions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3235) TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug

2011-11-14 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149701#comment-13149701
 ] 

Uwe Schindler commented on LUCENE-3235:
---

I wait until tomorrow before I commit this safe-but-slow fix.

 TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug
 

 Key: LUCENE-3235
 URL: https://issues.apache.org/jira/browse/LUCENE-3235
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 3.0, 3.1, 3.2, 3.3, 3.4
Reporter: Michael McCandless
 Fix For: 3.5

 Attachments: LUCENE-3235.patch, LUCENE-3235.patch


 Not sure what's going on yet... but under Java 1.6 it seems not to hang, but 
 under Java 1.5 it hangs fairly easily, on Linux.  Java is 1.5.0_22.
 I suspect this is relevant: 
 http://stackoverflow.com/questions/3292577/is-it-possible-for-concurrenthashmap-to-deadlock
  which refers to this JVM bug 
 http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6865591 which then refers 
 to this one http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6822370
 It looks like that last bug was fixed in Java 1.6 but not 1.5.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-tests-only-trunk - Build # 11328 - Still Failing

2011-11-14 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/11328/

No tests ran.

Build Log (for compile errors):
[...truncated 1312 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2382) DIH Cache Improvements

2011-11-14 Thread Steven Rowe (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149711#comment-13149711
 ] 

Steven Rowe commented on SOLR-2382:
---

Hi Noble,

In {{DIHCache.java}}, you used the javadoc tag {{@solr.experimental}}, but 
there is no support in the build system for this tag, so it causes javadoc 
warnings, which fail the build, e.g.: 
[https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/11327/consoleText] 
(scroll down to the bottom to see the warning):

{noformat}
[javadoc] [...]/DIHCache.java:14: warning - @solr.experimental is an unknown 
tag.
{noformat}

Would you mind if I switch {{@solr.experimental}} to {{@lucene.experimental}}?
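
i.e. something like this (a minimal sketch, not the real DIHCache interface; only the tag changes):

{code}
import java.util.Iterator;
import java.util.Map;

/**
 * Sketch only, showing the supported tag; the actual interface body is unchanged.
 *
 * @lucene.experimental
 */
public interface DIHCacheSketch {
  Iterator<Map<String, Object>> iterator();
}
{code}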

 DIH Cache Improvements
 --

 Key: SOLR-2382
 URL: https://issues.apache.org/jira/browse/SOLR-2382
 Project: Solr
  Issue Type: New Feature
  Components: contrib - DataImportHandler
Reporter: James Dyer
Priority: Minor
 Attachments: SOLR-2382-dihwriter.patch, SOLR-2382-dihwriter.patch, 
 SOLR-2382-dihwriter.patch, SOLR-2382-entities.patch, 
 SOLR-2382-entities.patch, SOLR-2382-entities.patch, SOLR-2382-entities.patch, 
 SOLR-2382-entities.patch, SOLR-2382-entities.patch, SOLR-2382-entities.patch, 
 SOLR-2382-entities.patch, SOLR-2382-properties.patch, 
 SOLR-2382-properties.patch, SOLR-2382-solrwriter-verbose-fix.patch, 
 SOLR-2382-solrwriter.patch, SOLR-2382-solrwriter.patch, 
 SOLR-2382-solrwriter.patch, SOLR-2382.patch, SOLR-2382.patch, 
 SOLR-2382.patch, SOLR-2382.patch, SOLR-2382.patch, SOLR-2382.patch, 
 SOLR-2382.patch, SOLR-2382.patch


 Functionality:
  1. Provide a pluggable caching framework for DIH so that users can choose a 
 cache implementation that best suits their data and application.
  
  2. Provide a means to temporarily cache a child Entity's data without 
 needing to create a special cached implementation of the Entity Processor 
 (such as CachedSqlEntityProcessor).
  
  3. Provide a means to write the final (root entity) DIH output to a cache 
 rather than to Solr.  Then provide a way for a subsequent DIH call to use the 
 cache as an Entity input.  Also provide the ability to do delta updates on 
 such persistent caches.
  
  4. Provide the ability to partition data across multiple caches that can 
 then be fed back into DIH and indexed either to varying Solr Shards, or to 
 the same Core in parallel.
 Use Cases:
  1. We needed a flexible & scalable way to temporarily cache child-entity 
 data prior to joining to parent entities.
   - Using SqlEntityProcessor with Child Entities can cause an n+1 select 
 problem.
   - CachedSqlEntityProcessor only supports an in-memory HashMap as a Caching 
 mechanism and does not scale.
   - There is no way to cache non-SQL inputs (ex: flat files, xml, etc).
  
  2. We needed the ability to gather data from long-running entities by a 
 process that runs separate from our main indexing process.
   
  3. We wanted the ability to do a delta import of only the entities that 
 changed.
   - Lucene/Solr requires entire documents to be re-indexed, even if only a 
 few fields changed.
   - Our data comes from 50+ complex sql queries and/or flat files.
   - We do not want to incur overhead re-gathering all of this data if only 1 
 entity's data changed.
   - Persistent DIH caches solve this problem.
   
  4. We want the ability to index several documents in parallel (using 1.4.1, 
 which did not have the threads parameter).
  
  5. In the future, we may need to use Shards, creating a need to easily 
 partition our source data into Shards.
 Implementation Details:
  1. De-couple EntityProcessorBase from caching.  
   - Created a new interface, DIHCache & two implementations:  
 - SortedMapBackedCache - An in-memory cache, used as default with 
 CachedSqlEntityProcessor (now deprecated).
 - BerkleyBackedCache - A disk-backed cache, dependent on bdb-je, tested 
 with je-4.1.6.jar
- NOTE: the existing Lucene Contrib db project uses je-3.3.93.jar.  
 I believe this may be incompatible due to Generic Usage.
- NOTE: I did not modify the ant script to automatically get this jar, 
 so to use or evaluate this patch, download bdb-je from 
 http://www.oracle.com/technetwork/database/berkeleydb/downloads/index.html 
  
  2. Allow Entity Processors to take a cacheImpl parameter to cause the 
 entity data to be cached (see EntityProcessorBase & DIHCacheProperties).
  
  3. Partially De-couple SolrWriter from DocBuilder
   - Created a new interface DIHWriter, & two implementations:
- SolrWriter (refactored)
- DIHCacheWriter (allows DIH to write ultimately to a Cache).

  4. Create a new Entity Processor, DIHCacheProcessor, which reads a 
 persistent Cache as DIH Entity Input.
  
  5. Support a partition parameter with both DIHCacheWriter and 
 DIHCacheProcessor to 

[jira] [Commented] (SOLR-1726) Deep Paging and Large Results Improvements

2011-11-14 Thread Grant Ingersoll (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149723#comment-13149723
 ] 

Grant Ingersoll commented on SOLR-1726:
---

Hi Manoj,

This shouldn't require a new query since it should work with all queries, but 
instead new parameters that get passed in alongside the query (see earlier 
comments that lay out what the parameter names are.)  You might start by 
looking at how something like the rows parameter or the start parameter are 
handled and passed through down to the SolrIndexSearcher.
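
For instance, a sketch of reading such parameters alongside rows/start before
handing them down to SolrIndexSearcher (the "pageScore"/"pageDoc" names are
only placeholders, not committed parameter names):

{code}
import org.apache.solr.common.params.CommonParams;
import org.apache.solr.common.params.SolrParams;
import org.apache.solr.request.SolrQueryRequest;

// Sketch only: read the deep-paging hints the same way rows/start are read.
public class DeepPagingParamsSketch {
  public static void readParams(SolrQueryRequest req) {
    SolrParams params = req.getParams();
    int start = params.getInt(CommonParams.START, 0);
    int rows = params.getInt(CommonParams.ROWS, 10);
    float lastPageScore = params.getFloat("pageScore", Float.NaN); // placeholder name
    int lastPageDoc = params.getInt("pageDoc", -1);                // placeholder name
    // ...these would then be wired into the call down to SolrIndexSearcher...
  }
}
{code}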

 Deep Paging and Large Results Improvements
 --

 Key: SOLR-1726
 URL: https://issues.apache.org/jira/browse/SOLR-1726
 Project: Solr
  Issue Type: Improvement
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Minor
 Fix For: 3.5, 4.0


 There are possibly ways to improve collections of deep paging by passing 
 Solr/Lucene more information about the last page of results seen, thereby 
 saving priority queue operations.   See LUCENE-2215.
 There may also be better options for retrieving large numbers of rows at a 
 time that are worth exploring.  LUCENE-2127.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3235) TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug

2011-11-14 Thread Michael McCandless (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149728#comment-13149728
 ] 

Michael McCandless commented on LUCENE-3235:


+1 for the safe-but-slow Java 5 only workaround

 TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug
 

 Key: LUCENE-3235
 URL: https://issues.apache.org/jira/browse/LUCENE-3235
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 3.0, 3.1, 3.2, 3.3, 3.4
Reporter: Michael McCandless
 Fix For: 3.5

 Attachments: LUCENE-3235.patch, LUCENE-3235.patch


 Not sure what's going on yet... but under Java 1.6 it seems not to hang, but 
 under Java 1.5 it hangs fairly easily, on Linux.  Java is 1.5.0_22.
 I suspect this is relevant: 
 http://stackoverflow.com/questions/3292577/is-it-possible-for-concurrenthashmap-to-deadlock
  which refers to this JVM bug 
 http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6865591 which then refers 
 to this one http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6822370
 It looks like that last bug was fixed in Java 1.6 but not 1.5.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2382) DIH Cache Improvements

2011-11-14 Thread Noble Paul (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149731#comment-13149731
 ] 

Noble Paul commented on SOLR-2382:
--

please go ahead

 DIH Cache Improvements
 --

 Key: SOLR-2382
 URL: https://issues.apache.org/jira/browse/SOLR-2382
 Project: Solr
  Issue Type: New Feature
  Components: contrib - DataImportHandler
Reporter: James Dyer
Priority: Minor
 Attachments: SOLR-2382-dihwriter.patch, SOLR-2382-dihwriter.patch, 
 SOLR-2382-dihwriter.patch, SOLR-2382-entities.patch, 
 SOLR-2382-entities.patch, SOLR-2382-entities.patch, SOLR-2382-entities.patch, 
 SOLR-2382-entities.patch, SOLR-2382-entities.patch, SOLR-2382-entities.patch, 
 SOLR-2382-entities.patch, SOLR-2382-properties.patch, 
 SOLR-2382-properties.patch, SOLR-2382-solrwriter-verbose-fix.patch, 
 SOLR-2382-solrwriter.patch, SOLR-2382-solrwriter.patch, 
 SOLR-2382-solrwriter.patch, SOLR-2382.patch, SOLR-2382.patch, 
 SOLR-2382.patch, SOLR-2382.patch, SOLR-2382.patch, SOLR-2382.patch, 
 SOLR-2382.patch, SOLR-2382.patch


 Functionality:
  1. Provide a pluggable caching framework for DIH so that users can choose a 
 cache implementation that best suits their data and application.
  
  2. Provide a means to temporarily cache a child Entity's data without 
 needing to create a special cached implementation of the Entity Processor 
 (such as CachedSqlEntityProcessor).
  
  3. Provide a means to write the final (root entity) DIH output to a cache 
 rather than to Solr.  Then provide a way for a subsequent DIH call to use the 
 cache as an Entity input.  Also provide the ability to do delta updates on 
 such persistent caches.
  
  4. Provide the ability to partition data across multiple caches that can 
 then be fed back into DIH and indexed either to varying Solr Shards, or to 
 the same Core in parallel.
 Use Cases:
  1. We needed a flexible & scalable way to temporarily cache child-entity 
 data prior to joining to parent entities.
   - Using SqlEntityProcessor with Child Entities can cause an n+1 select 
 problem.
   - CachedSqlEntityProcessor only supports an in-memory HashMap as a Caching 
 mechanism and does not scale.
   - There is no way to cache non-SQL inputs (ex: flat files, xml, etc).
  
  2. We needed the ability to gather data from long-running entities by a 
 process that runs separate from our main indexing process.
   
  3. We wanted the ability to do a delta import of only the entities that 
 changed.
   - Lucene/Solr requires entire documents to be re-indexed, even if only a 
 few fields changed.
   - Our data comes from 50+ complex sql queries and/or flat files.
   - We do not want to incur overhead re-gathering all of this data if only 1 
 entity's data changed.
   - Persistent DIH caches solve this problem.
   
  4. We want the ability to index several documents in parallel (using 1.4.1, 
 which did not have the threads parameter).
  
  5. In the future, we may need to use Shards, creating a need to easily 
 partition our source data into Shards.
 Implementation Details:
  1. De-couple EntityProcessorBase from caching.  
   - Created a new interface, DIHCache & two implementations:  
 - SortedMapBackedCache - An in-memory cache, used as default with 
 CachedSqlEntityProcessor (now deprecated).
 - BerkleyBackedCache - A disk-backed cache, dependent on bdb-je, tested 
 with je-4.1.6.jar
- NOTE: the existing Lucene Contrib db project uses je-3.3.93.jar.  
 I believe this may be incompatible due to Generic Usage.
- NOTE: I did not modify the ant script to automatically get this jar, 
 so to use or evaluate this patch, download bdb-je from 
 http://www.oracle.com/technetwork/database/berkeleydb/downloads/index.html 
  
  2. Allow Entity Processors to take a cacheImpl parameter to cause the 
 entity data to be cached (see EntityProcessorBase & DIHCacheProperties).
  
  3. Partially De-couple SolrWriter from DocBuilder
   - Created a new interface DIHWriter, & two implementations:
- SolrWriter (refactored)
- DIHCacheWriter (allows DIH to write ultimately to a Cache).

  4. Create a new Entity Processor, DIHCacheProcessor, which reads a 
 persistent Cache as DIH Entity Input.
  
  5. Support a partition parameter with both DIHCacheWriter and 
 DIHCacheProcessor to allow for easy partitioning of source entity data.
  
  6. Change the semantics of entity.destroy()
   - Previously, it was being called on each iteration of 
 DocBuilder.buildDocument().
   - Now it is does one-time cleanup tasks (like closing or deleting a 
 disk-backed cache) once the entity processor is completed.
   - The only out-of-the-box entity processor that previously implemented 
  destroy() was LineEntityProcessor, so this is not a very invasive change.
 General Notes:
 We are near 
