[jira] [Created] (LUCENE-3983) HTMLCharacterEntities.jflex uses String.toUpperCase without Locale
Key: LUCENE-3983
URL: https://issues.apache.org/jira/browse/LUCENE-3983
Project: Lucene - Java
Issue Type: Bug
Reporter: Uwe Schindler
Assignee: Steven Rowe

Is this expected?

{code:java}
    "xi", "\u03BE", "yacute", "\u00FD", "yen", "\u00A5", "yuml", "\u00FF", "zeta", "\u03B6",
    "zwj", "\u200D", "zwnj", "\u200C"
  };
  for (int i = 0 ; i < entities.length ; i += 2) {
    Character value = entities[i + 1].charAt(0);
    entityValues.put(entities[i], value);
    if (upperCaseVariantsAccepted.contains(entities[i])) {
      entityValues.put(entities[i].toUpperCase(), value);
    }
  }
{code}

In my opinion, this should look like:

{code:java}
    "xi", "\u03BE", "yacute", "\u00FD", "yen", "\u00A5", "yuml", "\u00FF", "zeta", "\u03B6",
    "zwj", "\u200D", "zwnj", "\u200C"
  };
  for (int i = 0 ; i < entities.length ; i += 2) {
    Character value = entities[i + 1].charAt(0);
    entityValues.put(entities[i], value);
    if (upperCaseVariantsAccepted.contains(entities[i])) {
      entityValues.put(entities[i].toUpperCase(Locale.ENGLISH), value);
    }
  }
{code}

(Otherwise, in the Turkish locale, the entities containing "i" (like "xi" -> '\u03BE') will not be detected correctly.)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
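To make the Turkish-locale concern in LUCENE-3983 concrete, here is a small standalone sketch. The class and method names are made up for illustration and are not part of the Lucene sources. It only shows how String.toUpperCase behaves with and without an explicit locale.

```java
import java.util.Locale;

final class TurkishUpperCaseDemo {

    /** Upper-cases an entity name independently of the JVM default locale. */
    static String upperCaseEntity(String name) {
        return name.toUpperCase(Locale.ENGLISH);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");

        // Under Turkish casing rules, 'i' maps to dotted capital I (U+0130), not 'I'.
        String viaTurkish = "xi".toUpperCase(turkish);
        System.out.println("Turkish rules give XI: " + viaTurkish.equals("XI"));

        // With an explicit English locale the result does not depend on the default locale.
        System.out.println("English rules give XI: " + upperCaseEntity("xi").equals("XI"));
    }
}
```

A lookup table keyed by the Turkish-cased string would never match the ASCII entity name "XI", which is the failure mode the issue describes.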
[jira] [Created] (LUCENE-3962) Fix incorrect/missing CHANGES.txt entries
Key: LUCENE-3962
URL: https://issues.apache.org/jira/browse/LUCENE-3962
Project: Lucene - Java
Issue Type: Bug
Components: general/build
Reporter: Uwe Schindler
Assignee: Uwe Schindler
Priority: Blocker
Fix For: 3.6, 4.0

While reviewing the release artifacts I found several issues with the CHANGES.txt files in Lucene and Solr. Attached is an easy patch:
- we no longer JARJAR commons-csv
- Apache Ivy changes were missing in both CHANGES files
- Restructuring of the build system by Steven was not mentioned by Solr. This is important as it affects people working with the Solr source code.
[jira] [Created] (LUCENE-3949) Fix license headers in all Java files to not be in Javadocs /** format
Key: LUCENE-3949
URL: https://issues.apache.org/jira/browse/LUCENE-3949
Project: Lucene - Java
Issue Type: Task
Reporter: Uwe Schindler
Fix For: 4.0

Our current license headers in all .java files are (for a reason I don't know) in Javadocs format. This means that when a class has no Javadocs, the license header is used as its Javadocs. I reviewed lots of other Apache projects; most of them use the correct /* header, but some (including Lucene+Solr) use the Javadocs one. We should change this.
[jira] [Created] (LUCENE-3937) Workaround the XERCES-J bug in Benchmark
Key: LUCENE-3937
URL: https://issues.apache.org/jira/browse/LUCENE-3937
Project: Lucene - Java
Issue Type: Bug
Reporter: Uwe Schindler

In benchmark we have a patched version of XERCES, which is hard to compile from source. Looking at the patched code and at the source of EnwikiContentSource, a workaround is to simply provide the XML parser with a Reader instead of an InputStream, so the broken code is not triggered. This assumes that the XML file is always UTF-8. If not, it will no longer work (because the XML parser cannot switch encoding if it only has a Reader).
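A rough sketch of the Reader-based workaround from LUCENE-3937, using only the JAXP/SAX classes that ship with the JDK. This is not the EnwikiContentSource code; the class name and the element-counting handler are invented for the example. It relies on the same assumption the issue states: the input is always UTF-8.

```java
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class ReaderBasedXmlParsing {

    /** Parses the stream as UTF-8 XML and returns the number of elements seen. */
    static int countElements(InputStream in) {
        // Decode the bytes ourselves, so the parser only ever sees characters
        // and its own byte-decoding path is not exercised.
        Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8);
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                count[0]++;
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(new InputSource(reader), handler);
        } catch (Exception e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
        return count[0];
    }
}
```

The trade-off is the one named in the issue: because the parser receives characters rather than bytes, it cannot act on an encoding declaration in the file.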
[jira] [Created] (LUCENE-3926) Improve Javadocs of RAMDirectory to document its limitations
Key: LUCENE-3926
URL: https://issues.apache.org/jira/browse/LUCENE-3926
Project: Lucene - Java
Issue Type: Sub-task
Affects Versions: 3.5, 4.0
Reporter: Uwe Schindler
Assignee: Uwe Schindler
Fix For: 3.6, 4.0
Attachments: LUCENE-3659.patch

Spinoff from several dev@lao issues:
- [http://mail-archives.apache.org/mod_mbox/lucene-dev/201112.mbox/%3C001001ccbf1c%2471845830%24548d0890%24%40thetaphi.de%3E]
- issue LUCENE-3653

The use cases for RAMDirectory are very limited. To prevent users from using it for e.g. loading a 50 Gigabyte index from a file on disk, we should improve the javadocs.
[jira] [Created] (LUCENE-3924) Optimize buffer size handling in RAMDirectory to make it more GC friendly
Key: LUCENE-3924
URL: https://issues.apache.org/jira/browse/LUCENE-3924
Project: Lucene - Java
Issue Type: Improvement
Components: core/store
Affects Versions: 4.0
Reporter: Uwe Schindler
Assignee: Uwe Schindler
Fix For: 4.0

RAMDirectory currently uses a fixed buffer size of 1024 bytes to allocate memory. This is very wasteful for large indexes. Improvements may be:
- per-file buffer sizes based on IOContext and maximum segment size
- allocate only one buffer for files that are copied from another directory
- dynamically increase buffer size when files grow (makes seek() complicated)
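For a feel of the numbers behind LUCENE-3924, here is a back-of-the-envelope sketch. It is plain arithmetic and not RAMDirectory code; the class and method names are made up. It counts how many separately allocated buffers a file of a given length needs at a given buffer size.

```java
final class BufferCountEstimate {

    /** Number of buffers of bufferSize bytes needed to hold fileLength bytes (rounded up). */
    static long buffersNeeded(long fileLength, int bufferSize) {
        return (fileLength + bufferSize - 1) / bufferSize;
    }

    public static void main(String[] args) {
        long oneGiB = 1L << 30;
        // Fixed 1024-byte buffers: one small array per KiB of index data.
        System.out.println(buffersNeeded(oneGiB, 1024));
        // A larger per-file buffer size cuts the number of objects the GC has to track.
        System.out.println(buffersNeeded(oneGiB, 1 << 20));
    }
}
```

With 1024-byte buffers a 1 GiB file needs 1,048,576 arrays; with 1 MiB buffers it needs 1,024.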
[jira] [Created] (LUCENE-3886) MemoryIndex memory estimation in toString inconsistent with getMemorySize()
Key: LUCENE-3886
URL: https://issues.apache.org/jira/browse/LUCENE-3886
Project: Lucene - Java
Issue Type: Bug
Reporter: Uwe Schindler

After LUCENE-3867 was committed, there are some more minor problems with MemoryIndex's estimates. This patch will fix those and also add verbose test output of RAM needed for MemoryIndex vs. RAMDirectory. Interestingly, the RAMDirectory always takes (according to estimates, so even with buffer overheads) only 2/3 of the MemoryIndex (excluding IndexReaders).
[jira] [Created] (LUCENE-3866) Make CompositeReader.getSequentialSubReaders() and the corresponding IndexReaderContext methods return unmodifiable List
Key: LUCENE-3866
URL: https://issues.apache.org/jira/browse/LUCENE-3866
Project: Lucene - Java
Issue Type: Improvement
Reporter: Uwe Schindler
Assignee: Uwe Schindler
Fix For: 4.0

Since Lucene 2.9 we have the method getSequentialSubReaders() returning IndexReader[]. Based on hard-to-debug errors in users' code, Robert and I realized that returning an array from a public API is an anti-pattern. If the array is intended to be modifiable (like byte[] in terms,...), it is fine to use arrays in public APIs, but not if the array must be protected from modification.

As IndexReaders are 100% unmodifiable in trunk code (no deletions,...), the only possibility to corrupt the reader is by modifying the array returned by getSequentialSubReaders(). We should prevent this.

The same theoretically applies to FieldCache, too - but the party that is afraid of performance problems is too big to fight against that :(

For getSequentialSubReaders there is no performance problem at all. The binary search of reader-ids inside BaseCompositeReader can still be done with the internal protected array, but public APIs should expose only an unmodifiable List. The same applies to leaves() and children() in IndexReaderContext.

This change to List would also allow making CompositeReader and CompositeReaderContext Iterable, so some loops would look nice.
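A minimal sketch of the pattern proposed in LUCENE-3866, with String standing in for the reader type. CompositeSketch is a made-up class, not the BaseCompositeReader implementation. The internal array stays available for fast lookups, while callers only get a read-only List view.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class CompositeSketch {
    // Internal array: still usable for binary search and other hot-path lookups.
    private final String[] subReaders;
    // Read-only view over the same array; created once, so the getter allocates nothing.
    private final List<String> subReadersView;

    CompositeSketch(String... subReaders) {
        this.subReaders = subReaders.clone();
        this.subReadersView = Collections.unmodifiableList(Arrays.asList(this.subReaders));
    }

    /** Callers can iterate and index into the list, but every mutator throws. */
    List<String> getSequentialSubReaders() {
        return subReadersView;
    }
}
```

Calling set(), add() or remove() on the returned list fails with UnsupportedOperationException. Code that used to write into the returned array by accident now fails fast instead of silently corrupting the reader.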
[jira] [Created] (SOLR-3213) Upgrade to commons-csv once it is released
Key: SOLR-3213
URL: https://issues.apache.org/jira/browse/SOLR-3213
Project: Solr
Issue Type: Task
Components: Build
Reporter: Uwe Schindler

Since SOLR-3159 we have a jarjar'ed apache-solr-commons-csv-SNAPSHOT.jar file in the lib folder. Once version 1.0 of commons-csv is officially released, we should upgrade to that version, remove maven publishing and change the import statements in java files to the official package name.
[jira] [Created] (LUCENE-3852) Rename BaseMultiReader class to BaseCompositeReader and make public
Key: LUCENE-3852
URL: https://issues.apache.org/jira/browse/LUCENE-3852
Project: Lucene - Java
Issue Type: Improvement
Components: core/index
Affects Versions: 4.0
Reporter: Uwe Schindler
Assignee: Uwe Schindler
Fix For: 4.0

Currently the abstract DirectoryReader, MultiReader and ParallelCompositeReader extend a package-private class. Users who want to implement a composite reader should be able to subclass this pkg-private class, as it implements lots of abstract methods that are useful for their own implementations. In fact, MultiReader is a shallow subclass only implementing correct closing&refCounting. By making it public after the rename, the generics problems in the JavaDocs (type parameter R is not correctly displayed) are solved, too.
[jira] [Created] (LUCENE-3850) Fix rawtypes warnings for Java 7 compiler
Key: LUCENE-3850
URL: https://issues.apache.org/jira/browse/LUCENE-3850
Project: Lucene - Java
Issue Type: Improvement
Affects Versions: 3.5, 4.0
Reporter: Uwe Schindler
Assignee: Uwe Schindler
Fix For: 3.6, 4.0

Java 7 changed the warnings a little bit:
- Java 6 only knew the "unchecked" warning type, applying to all types of generics violations, like missing generics (raw types)
- Java 7 still knows "unchecked" but only emits the warning if the call is really unchecked. Declaring variables/fields or constructing instances without type parameters now emits a "rawtypes" warning.

The changes above cause the Java 7 compiler to emit lots of "rawtypes" warnings where Java 6 is silent. The easy fix is to suppress both warning types, @SuppressWarnings({"unchecked","rawtypes"}), in all those places. Changes are easy to do, will provide a patch later!
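As an illustration of the kind of place LUCENE-3850 has in mind, here is a sketch built around generic array creation. The class is invented for this example and is not taken from the Lucene sources. Java has no type-safe way to create such an array, so a raw type is unavoidable, and the annotation covers both warning categories.

```java
import java.util.ArrayList;
import java.util.List;

final class RawTypesDemo {

    /**
     * Creates an array of typed lists. "new List[size]" uses a raw type
     * (reported under -Xlint:rawtypes), and assigning it to List<String>[]
     * is an unchecked conversion (reported under -Xlint:unchecked).
     */
    @SuppressWarnings({"unchecked", "rawtypes"})
    static List<String>[] newBuckets(int size) {
        List<String>[] buckets = new List[size];
        for (int i = 0; i < size; i++) {
            buckets[i] = new ArrayList<String>();
        }
        return buckets;
    }
}
```

Keeping the annotation on the smallest possible scope (one method or one local declaration) avoids hiding unrelated warnings.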
[jira] [Created] (LUCENE-3844) Deprecate Token class and remove in trunk
Key: LUCENE-3844
URL: https://issues.apache.org/jira/browse/LUCENE-3844
Project: Lucene - Java
Issue Type: Task
Reporter: Uwe Schindler

After issues like LUCENE-3843, which introduce new attributes, we should remove the Token class in trunk, as it leads to code that ignores those new attributes (like PositionLengthAttribute, ScriptAttribute, KeywordAttribute,...). If you want a holder for all Attributes of a TokenStream, use TS.cloneAttributes().
[jira] [Created] (LUCENE-3823) Add a field-filtering FilterAtomicReader to 4.0 so ParallelReaders can be better tested (in LTC.maybeWrapReader)
Key: LUCENE-3823
URL: https://issues.apache.org/jira/browse/LUCENE-3823
Project: Lucene - Java
Issue Type: Improvement
Components: core/index, general/test
Affects Versions: 4.0
Reporter: Uwe Schindler
Assignee: Uwe Schindler
Fix For: 4.0

In addition to the filters in contrib/misc for horizontally filtering (by doc-id) AtomicReader, it would be good to have the same vertically (by field). For now I will add this implementation to test-framework, as it cannot stay in contrib/misc, because LTC will need it for maybeWrapReader. LTC will use this FilterAtomicReader to construct a ParallelAtomicReader out of two (or maybe more) FieldFilterAtomicReaders.
[jira] [Created] (LUCENE-3822) Inner classes of FilterAtomicReader (trunk) / FilterIndexReader (3.x) do not override all methods to be filtered
Key: LUCENE-3822
URL: https://issues.apache.org/jira/browse/LUCENE-3822
Project: Lucene - Java
Issue Type: Bug
Affects Versions: 3.5, 4.0
Reporter: Uwe Schindler
Assignee: Uwe Schindler
Fix For: 3.6, 4.0

This issue adds missing checks in the FilterReader test to also check overridden methods in the enum implementations (inner classes), similar to the checks added by Shai Erera.
[jira] [Created] (LUCENE-3800) Readers wrapping other readers don't prevent usage if any of their subreaders was closed
Key: LUCENE-3800
URL: https://issues.apache.org/jira/browse/LUCENE-3800
Project: Lucene - Java
Issue Type: Bug
Components: core/index
Affects Versions: 4.0
Reporter: Uwe Schindler
Assignee: Uwe Schindler
Fix For: 4.0

In a recent trunk test run we got this problem: org.apache.lucene.index.TestReaderClosed.test fails because the inner reader is closed but the outer readers wrapping it are still open.

I fixed the issue partially for SlowCompositeReaderWrapper and ParallelAtomicReader but it failed again. The cool thing with this test is the following: The test opens a DirectoryReader and then creates a searcher, closes the reader and executes a search. This is not an issue if the closed reader is the one the search is running on. This test uses LTC.newSearcher(wrap=true), which randomly wraps the passed Reader with SlowComposite or ParallelReader - or with both!!! If you then close the original inner reader, the close is not detected when executing the search. This can cause SIGSEGV when MMAP is used.

The problem (in Slow* and Parallel*) is that both have their own Fields instances that are kept alive until the reader itself is closed. If the child reader is closed, the wrapping reader does not know and still uses its own Fields instance that delegates to the inner readers. At this step no more ensureOpen checks are done, causing the failures.

The first fix done in Slow and Parallel was to call ensureOpen() on the subReader, too, when requesting fields(). This works fine until you wrap two times: ParallelAtomicReader(SlowCompositeReaderWrapper(StandardDirectoryReader(segments_1:3:nrt _0(4.0):C42)))

One solution would be to make ensureOpen also check all subreaders, but that would do the volatile checks way too often (where n is the total number of subreaders and m is the number of hierarchical instances, this is n^m) - we cannot do this. Currently we only have n*m, which is fine.

The proposal to solve this (closing subreaders under the hood of parent readers) is to use the readerClosedListeners. Whenever a composite or slow reader wraps other readers, it registers itself as interested in readerClosed events. When a subreader is then forcefully closed (e.g. by a programming error or this crazy test), we automatically close the parents, too.

We should also fix this in 3.x, if we have similar problems there (needs investigation).
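The listener-based proposal in LUCENE-3800 can be sketched with a toy reader hierarchy. SketchReader, WrappingSketchReader and ClosedListener are hypothetical names for this illustration and do not mirror Lucene's real IndexReader API. Closing an inner reader propagates to every wrapper, so each wrapper's own ensureOpen() check is enough.

```java
import java.util.ArrayList;
import java.util.List;

class SketchReader {
    /** Hypothetical callback, fired once when a reader is closed. */
    interface ClosedListener {
        void onClose(SketchReader reader);
    }

    private final List<ClosedListener> listeners = new ArrayList<>();
    private volatile boolean closed;

    synchronized void addClosedListener(ClosedListener listener) {
        listeners.add(listener);
    }

    boolean isClosed() {
        return closed;
    }

    /** One volatile read on this reader only; no walk over the sub-readers. */
    final void ensureOpen() {
        if (closed) {
            throw new IllegalStateException("this reader is closed");
        }
    }

    void close() {
        List<ClosedListener> toNotify;
        synchronized (this) {
            if (closed) {
                return; // already closed, listeners were notified before
            }
            closed = true;
            toNotify = new ArrayList<>(listeners);
        }
        for (ClosedListener listener : toNotify) {
            listener.onClose(this);
        }
    }
}

final class WrappingSketchReader extends SketchReader {
    WrappingSketchReader(SketchReader inner) {
        // When the wrapped reader is closed behind our back, close ourselves too.
        inner.addClosedListener(reader -> close());
    }
}
```

With two levels of wrapping, closing the innermost reader closes the middle wrapper, which in turn notifies and closes the outer one. The cost is paid once at close time instead of on every ensureOpen() call.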
[jira] [Created] (LUCENE-3771) Rename some remaining tests for new IndexReader class hierarchy
Key: LUCENE-3771
URL: https://issues.apache.org/jira/browse/LUCENE-3771
Project: Lucene - Java
Issue Type: Sub-task
Components: general/test
Reporter: Uwe Schindler
Assignee: Uwe Schindler
Fix For: 4.0
[jira] [Created] (LUCENE-3770) Rename FilterIndexReader to FilterAtomicReader
Key: LUCENE-3770
URL: https://issues.apache.org/jira/browse/LUCENE-3770
Project: Lucene - Java
Issue Type: Sub-task
Reporter: Uwe Schindler
Assignee: Uwe Schindler
[jira] [Created] (LUCENE-3764) Remove oal.util.MapBackedSet (Java 6 offers Collections.newSetFromMap())
Key: LUCENE-3764
URL: https://issues.apache.org/jira/browse/LUCENE-3764
Project: Lucene - Java
Issue Type: Task
Reporter: Uwe Schindler
Assignee: Uwe Schindler

Easy search and replace job. In 3.x we still need the class, as Java 5 does not have Collections.newSetFromMap().
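For reference, a short sketch of the JDK replacement named in LUCENE-3764. The helper class is made up. Collections.newSetFromMap is the standard Java 6 API, and the choice of ConcurrentHashMap as the backing map is only an example.

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class SetFromMapDemo {

    /** A Set that inherits the concurrency behaviour of the backing map. */
    static <T> Set<T> newConcurrentSet() {
        // The backing map must be empty and must not be accessed directly afterwards.
        return Collections.newSetFromMap(new ConcurrentHashMap<T, Boolean>());
    }

    public static void main(String[] args) {
        Set<String> seen = newConcurrentSet();
        seen.add("segments_1");
        seen.add("segments_1"); // duplicate, ignored like in any Set
        System.out.println(seen.size());
    }
}
```

Any Map implementation can be used the same way, for example a WeakHashMap or IdentityHashMap when those semantics are wanted.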
[jira] [Created] (LUCENE-3757) Change AtomicReaderContext.leaves() to return itself as only leaf to simplify code and remove an otherwise unneeded ReaderUtil method
Key: LUCENE-3757
URL: https://issues.apache.org/jira/browse/LUCENE-3757
Project: Lucene - Java
Issue Type: Improvement
Reporter: Uwe Schindler

The documentation of IndexReaderContext.leaves() states that it returns (for convenience) all leaf nodes if the context is top-level (directly obtained from IndexReader), and otherwise returns null. This is not correct for AtomicReaderContext, where it always returns null. To make it consistent, the convenience method should simply return itself as the only leaf for atomic contexts. This makes the utility method ReaderUtil.leaves() obsolete and simplifies code.
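The idea in LUCENE-3757 boils down to a very small pattern, sketched here with hypothetical stand-in classes. ContextSketch and LeafContextSketch are not the real IndexReaderContext and AtomicReaderContext, and a List return type is assumed purely to keep the example short.

```java
import java.util.Collections;
import java.util.List;

abstract class ContextSketch {
    /** All leaf contexts below this context. */
    abstract List<LeafContextSketch> leaves();
}

final class LeafContextSketch extends ContextSketch {
    // An atomic context is its own single leaf; the list is immutable and created once.
    private final List<LeafContextSketch> self = Collections.singletonList(this);

    @Override
    List<LeafContextSketch> leaves() {
        return self;
    }
}
```

Callers can then loop over context.leaves() without a null check or a separate utility method, whether the context is composite or atomic.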
[jira] [Created] (LUCENE-3736) ParallelReader is now atomic, add convenience methods to wrap CompositeReaders in either "slow atomic" or "fast composite" way
Key: LUCENE-3736
URL: https://issues.apache.org/jira/browse/LUCENE-3736
Project: Lucene - Java
Issue Type: Sub-task
Components: core/index
Reporter: Uwe Schindler
Assignee: Uwe Schindler
Fix For: 4.0

ParallelReader is now atomic. We should add a sugar wrapper method to allow synchronized composite readers (with same segment sizes) to be aligned with MultiReaders or wrapped by Slow:
- one ParallelReader with Slow-wrapped parallel readers; they only need the same maxDoc() (and deletions)
- a MultiReader containing all sub-ParallelReaders. This needs CompositeReaders with the same docStarts[]
[jira] [Created] (LUCENE-3735) Fix PayloadProcessorProvider to no longer use Directory for lookup, instead AtomicReader
Key: LUCENE-3735
URL: https://issues.apache.org/jira/browse/LUCENE-3735
Project: Lucene - Java
Issue Type: Sub-task
Components: core/index
Reporter: Uwe Schindler
Assignee: Uwe Schindler
Fix For: 4.0

The PayloadProcessorProvider has a broken API; this should be fixed. The current patch mimics the old behaviour, but not 100%.

The PayloadProcessorProvider API should return a PayloadProcessor based on the AtomicReader instance that gets merged. As AtomicReaders no longer know the directory they reside in (they could be e.g. FilterIndexReaders, MemoryIndexes,...), a selection by Directory is no longer possible.

The current code in Lucene trunk mimics the old behavior by doing an instanceof SegmentReader check and then asking for a DirProvider. If something else is merged in, payload processing is not supported. This should be changed. The old API could be kept backwards compatible by moving the instanceof check into a "convenience class" DirPayloadProcessorProvider, extending PayloadProcessorProvider.
[jira] [Created] (LUCENE-3734) Allow customizing/subclassing of DirectoryReader
Key: LUCENE-3734
URL: https://issues.apache.org/jira/browse/LUCENE-3734
Project: Lucene - Java
Issue Type: Sub-task
Components: core/index
Reporter: Uwe Schindler
Assignee: Uwe Schindler
Fix For: 4.0

DirectoryReader is final and has only static factory methods. It is not possible to subclass it in any way.

The problem is mainly Solr, as Solr accesses directory(), IndexCommits,... and therefore cannot work on abstract IndexReader anymore. This should be changed, e.g. by handling reopening in the IRFactory, also versions, commits,... Currently it's not possible to implement any other IRFactory that returns something else.
[jira] [Created] (LUCENE-3733) Remaining TODOs of LUCENE-2858: Finalize AtomicReader/CompositeReader API
Key: LUCENE-3733
URL: https://issues.apache.org/jira/browse/LUCENE-3733
Project: Lucene - Java
Issue Type: Task
Reporter: Uwe Schindler
Fix For: 4.0

This issue will handle the remaining issues in the commit last night (LUCENE-2858). A new branch will be created and several problems handled in sub-tasks.
[jira] [Created] (LUCENE-3716) Discussion topic: Move all Commit/Version&Reopen stuff from abstract IR to DirectoryReader
Key: LUCENE-3716
URL: https://issues.apache.org/jira/browse/LUCENE-3716
Project: Lucene - Java
Issue Type: Sub-task
Affects Versions: 4.0
Reporter: Uwe Schindler
Assignee: Uwe Schindler

When implementing the parent issue, I noticed a lot of other stuff in IndexReader that is only implemented in DirectoryReader/SegmentReader and is not really related to IndexReader at all:
- getVersion (maybe also isCurrent) only affects DirectoryReaders; because of the commit-stuff there is no easy way for e.g. MultiReader to implement this
- reopen/openIfChanged cannot be implemented easily by most AtomicIndexReaders, but CompositeIndexReader is also the wrong place to define those methods

In the parent issue, I already let IndexReader.open() return DirectoryReader and I made this class public. We should move the whole stuff (including IR.open) to DirectoryReader. Reopening outside DirectoryReader is not really needed. Some people may think it should stay abstract (there are ways for other readers to implement it, but for sure it's not specific to IRs in general). In that case I would declare an interface that DirectoryReader implements. Code like SearcherManager/Solr could then do an instanceof check on the IR instance and find out if it's worth reopening/version checking.
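The interface alternative mentioned at the end of LUCENE-3716 might look roughly like the sketch below. All type names here (ReopenableSketch, PlainReaderSketch, DirectoryReaderSketch, ReopenCheck) are hypothetical and exist only for this example. The issue does not say what such an interface would be called or which methods it would carry.

```java
/** Hypothetical capability interface for readers that track a commit/version. */
interface ReopenableSketch {
    long getVersion();

    boolean isCurrent();
}

/** Stand-in for a general reader type that knows nothing about versions. */
class PlainReaderSketch {
}

/** Stand-in for a directory-based reader that does support version checks. */
final class DirectoryReaderSketch extends PlainReaderSketch implements ReopenableSketch {
    private final long version;
    private final long latestCommittedVersion;

    DirectoryReaderSketch(long version, long latestCommittedVersion) {
        this.version = version;
        this.latestCommittedVersion = latestCommittedVersion;
    }

    @Override
    public long getVersion() {
        return version;
    }

    @Override
    public boolean isCurrent() {
        return version == latestCommittedVersion;
    }
}

final class ReopenCheck {
    /** Manager-style code: only readers exposing the capability are candidates for reopening. */
    static boolean worthReopening(PlainReaderSketch reader) {
        return reader instanceof ReopenableSketch && !((ReopenableSketch) reader).isCurrent();
    }
}
```

The general reader type stays free of version and reopen methods, and callers that care about them opt in through a single instanceof check.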
[jira] [Created] (LUCENE-3712) Remove unused (and untested) methods from ReaderUtil that are also veeeeery ineffective
Key: LUCENE-3712
URL: https://issues.apache.org/jira/browse/LUCENE-3712
Project: Lucene - Java
Issue Type: Task
Components: core/other
Reporter: Uwe Schindler
Assignee: Uwe Schindler

ReaderUtil contains two methods that are nowhere used and not even tested. Additionally, those are implemented with useless List->array copying and an ineffective docStart calculation for a later binary search, instead of directly returning the reader while scanning - and I am not sure if they really work as expected. As ReaderUtil is @lucene.internal, we should remove them in 3.x and trunk. Alternatively, the useless array copy / docStarts handling should be removed and tests added:

{code:java}
public static IndexReader subReader(int doc, IndexReader reader)
public static IndexReader subReader(IndexReader reader, int subIndex)
{code}
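To illustrate the "directly returning the reader while scanning" alternative that LUCENE-3712 alludes to, here is a sketch that works on plain maxDoc values instead of real readers. SubReaderLookup is a made-up class and does not reproduce the ReaderUtil code.

```java
final class SubReaderLookup {

    /**
     * Returns the index of the sub-reader that contains the given top-level doc id.
     * The running docStart is computed while scanning, so no docStarts array
     * is built and no separate binary search is needed.
     */
    static int subIndex(int doc, int[] maxDocs) {
        if (doc < 0) {
            throw new IllegalArgumentException("doc must be >= 0: " + doc);
        }
        int docStart = 0;
        for (int i = 0; i < maxDocs.length; i++) {
            int docEnd = docStart + maxDocs[i];
            if (doc < docEnd) {
                return i; // found while scanning, nothing was copied
            }
            docStart = docEnd;
        }
        throw new IllegalArgumentException("doc out of range: " + doc);
    }
}
```

For a one-off lookup the linear scan does no more work than building the docStarts array would. A precomputed array plus binary search only pays off when many lookups hit the same reader.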
[jira] [Created] (LUCENE-3659) Improve Javadocs of RAMDirectory to document its limitations
Improve Javadocs of RAMDirectory to document its limitations Key: LUCENE-3659 URL: https://issues.apache.org/jira/browse/LUCENE-3659 Project: Lucene - Java Issue Type: Task Affects Versions: 3.5, 4.0 Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 3.6, 4.0 Spinoff from several dev@lao issues: - [http://mail-archives.apache.org/mod_mbox/lucene-dev/201112.mbox/browser] - issue LUCENE-3653 The use cases for RAMDirectory are very limited and to prevent users from using it for e.g. loading a 50 Gigabyte index from a file on disk, we should improve the javadocs.
[jira] [Created] (LUCENE-3656) IndexReader's add/removeCloseListener should not use ConcurrentHashMap, just a synchronized set
IndexReader's add/removeCloseListener should not use ConcurrentHashMap, just a synchronized set --- Key: LUCENE-3656 URL: https://issues.apache.org/jira/browse/LUCENE-3656 Project: Lucene - Java Issue Type: Bug Components: core/index Affects Versions: 3.5, 4.0 Reporter: Uwe Schindler Assignee: Uwe Schindler Priority: Minor The use case for ConcurrentHashMap is when many threads are reading and few are writing to the structure. Here this is just funny: the only reader is close(). Here you can just use a synchronized HashSet. The complexity of CHM is making this just a joke :-)
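A synchronized set for the listeners could be sketched as below. The names CloseListener and ListenerDemoReader are hypothetical and only model the listener bookkeeping; the sketch also uses a LinkedHashSet rather than a plain HashSet so that notification order is predictable, which is an assumption of this example and not something the issue asks for.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical listener type, standing in for the reader's close listener.
interface CloseListener {
    void onClose(String readerName);
}

// Hypothetical reader-like class; only the listener handling is modelled.
final class ListenerDemoReader {
    private final String name;

    // Writes (add/remove) are rare and the only reader is close(),
    // so a simple synchronized set is enough.
    private final Set<CloseListener> listeners =
        Collections.synchronizedSet(new LinkedHashSet<CloseListener>());

    ListenerDemoReader(String name) {
        this.name = name;
    }

    void addCloseListener(CloseListener listener) {
        listeners.add(listener);
    }

    void removeCloseListener(CloseListener listener) {
        listeners.remove(listener);
    }

    void close() {
        // Iterating a synchronized collection requires holding its lock manually.
        synchronized (listeners) {
            for (CloseListener listener : listeners) {
                listener.onClose(name);
            }
        }
    }
}
```

The one thing to remember with Collections.synchronizedSet is the explicit synchronized block around iteration; single add/remove calls are already guarded by the wrapper.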
[jira] [Created] (LUCENE-3652) Move org.apache.lucene.messages to QueryParser module in Lucene trunk (maybe also in 3.x)
Move org.apache.lucene.messages to QueryParser module in Lucene trunk (maybe also in 3.x) - Key: LUCENE-3652 URL: https://issues.apache.org/jira/browse/LUCENE-3652 Project: Lucene - Java Issue Type: Task Reporter: Uwe Schindler The package org.apache.lucene.messages was introduced by the flexible QueryParser but is not used by any code in core. It should move to this module / this contrib (maybe even in 3.x).
[jira] [Created] (LUCENE-3643) Improve FilteredQuery to shortcut on wrapped MatchAllDocsQuery, null Query or null Filter
Improve FilteredQuery to shortcut on wrapped MatchAllDocsQuery, null Query or null Filter - Key: LUCENE-3643 URL: https://issues.apache.org/jira/browse/LUCENE-3643 Project: Lucene - Java Issue Type: Improvement Components: core/search Affects Versions: 4.0 Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 4.0 Since the rewrite of Lucene trunk to delegate all Filter logic to FilteredQuery, by simply wrapping in IndexSearcher.wrapFilter(), we can do more short circuits and improve query execution. A common use case is to pass MatchAllDocsQuery as query to IndexSearcher together with a filter. For the underlying hit collection this is stupid and slow, as MatchAllDocsQuery simply increments the docID and checks acceptDocs. If the filter is sparse, this is a big waste. This patch changes FilteredQuery.rewrite() to short circuit and return ConstantScoreQuery if the query is null or MatchAllDocs. The same happens for filter==null; in this case FilteredQuery rewrites itself to the inner query with modified boost.
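The two short circuits described above can be illustrated with a toy model. This is only a sketch: the Toy* classes are hypothetical stand-ins and not Lucene's API, and multiplying the outer and inner boosts is an assumption about what "modified boost" means.

```java
// Toy model only: these classes are hypothetical stand-ins, not Lucene's API.
abstract class ToyQuery {
    final float boost;

    ToyQuery(float boost) { this.boost = boost; }

    abstract ToyQuery withBoost(float newBoost);
}

final class ToyMatchAll extends ToyQuery {
    ToyMatchAll(float boost) { super(boost); }
    ToyQuery withBoost(float b) { return new ToyMatchAll(b); }
}

final class ToyTerm extends ToyQuery {
    final String term;
    ToyTerm(String term, float boost) { super(boost); this.term = term; }
    ToyQuery withBoost(float b) { return new ToyTerm(term, b); }
}

final class ToyFilter {
    final String name;
    ToyFilter(String name) { this.name = name; }
}

// Iterates the filter's documents directly and gives each the same score.
final class ToyConstantScore extends ToyQuery {
    final ToyFilter filter;
    ToyConstantScore(ToyFilter filter, float boost) { super(boost); this.filter = filter; }
    ToyQuery withBoost(float b) { return new ToyConstantScore(filter, b); }
}

final class ToyFiltered extends ToyQuery {
    final ToyQuery query;   // may be null
    final ToyFilter filter; // may be null

    ToyFiltered(ToyQuery query, ToyFilter filter, float boost) {
        super(boost);
        this.query = query;
        this.filter = filter;
    }

    ToyQuery withBoost(float b) { return new ToyFiltered(query, filter, b); }

    ToyQuery rewrite() {
        if (filter != null && (query == null || query instanceof ToyMatchAll)) {
            // Only the filter restricts the result, so drive iteration from it.
            float inner = (query == null) ? 1.0f : query.boost;
            return new ToyConstantScore(filter, boost * inner);
        }
        if (filter == null && query != null) {
            // Nothing to filter: fall back to the inner query, combining boosts
            // (assumption: boosts multiply).
            return query.withBoost(boost * query.boost);
        }
        return this; // no shortcut applies
    }
}
```

The point of the first branch is that a sparse filter is then iterated on its own, instead of stepping through every docID of a match-all query and testing each one.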
[jira] [Created] (LUCENE-3641) MultiReader does not propagate readerFinishedListeners to clones/reopened readers
MultiReader does not propagate readerFinishedListeners to clones/reopened readers - Key: LUCENE-3641 URL: https://issues.apache.org/jira/browse/LUCENE-3641 Project: Lucene - Java Issue Type: Bug Components: core/index Affects Versions: 3.5 Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 3.6, 4.0 While working on refactoring MultiReader/DirectoryReader in trunk, I found out that MultiReader does not correctly pass readerFinishedListeners to its clones and reopened readers.
[jira] [Created] (LUCENE-3633) Remove code duplication in MultiReader/DirectoryReader, make everything inside final
Remove code duplication in MultiReader/DirectoryReader, make everything inside final Key: LUCENE-3633 URL: https://issues.apache.org/jira/browse/LUCENE-3633 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 4.0 Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 4.0 After making IndexReader readOnly (LUCENE-3606) there is no need to have completely different DirectoryReader and MultiReader; the current code has heavy code duplication and violates finalness patterns. There are only a few differences in reopen and things like isCurrent/getDirectory/... This issue will clean this up by introducing a hidden package-private base class for both and only handling reopen and incRef/decRef differently. DirectoryReader is now final and all fields in BaseMultiReader, MultiReader and DirectoryReader are final now. DirectoryReader now has only static factories, no public ctor anymore.
[jira] [Created] (LUCENE-3632) Fully support doOpenIfChanged(boolean readOnly)/clone(boolean readOnly) in MultiReader and ParallelReader
Fully support doOpenIfChanged(boolean readOnly)/clone(boolean readOnly) in MultiReader and ParallelReader - Key: LUCENE-3632 URL: https://issues.apache.org/jira/browse/LUCENE-3632 Project: Lucene - Java Issue Type: Improvement Reporter: Uwe Schindler Assignee: Uwe Schindler Followup from LUCENE-3630: doOpenIfChanged is behaving incorrectly if you pass a boolean to openIfChanged/clone. A partial fix is in LUCENE-3630, but it's not complete. This issue fully supports doOpenIfChanged/clone by conditionally passing the boolean down to the subreaders.
[jira] [Created] (LUCENE-3631) Remove write access from SegmentReader and possibly move to separate class or IndexWriter/BufferedDeletes/...
Remove write access from SegmentReader and possibly move to separate class or IndexWriter/BufferedDeletes/... - Key: LUCENE-3631 URL: https://issues.apache.org/jira/browse/LUCENE-3631 Project: Lucene - Java Issue Type: Task Components: core/index Affects Versions: 4.0 Reporter: Uwe Schindler After LUCENE-3606 is finished, there are some TODOs: SegmentReader still contains (package-private) all delete logic including crazy copyOnWrite for validDocs Bits. It would be good if SegmentReader itself could be read-only like all other IndexReaders. There are two possibilities to do this: # the simple one: Subclass SegmentReader and make a RWSegmentReader that is only used by IndexWriter/BufferedDeletes/... DirectoryReader will only use the read-only SegmentReader. This would move all TODOs to a separate class. Its reopen/clone method would always create a RO-SegmentReader (for NRT). # Remove all write and commit stuff from SegmentReader completely and move it to IndexWriter's readerPool (it must be in readerPool as deletions need a non-changing view on an index snapshot). Unfortunately the code is so complicated and I have no real experience in those internals of IndexWriter, so I did not want to do it with LUCENE-3606; I just separated the code in SegmentReader and marked it with TODO. Maybe Mike McCandless can help :-)
[jira] [Created] (LUCENE-3630) MultiReader and ParallelReader accidently override doOpenIfChanged(boolean readOnly) with doOpenIfChanged(boolean doClone)
MultiReader and ParallelReader accidently override doOpenIfChanged(boolean readOnly) with doOpenIfChanged(boolean doClone) -- Key: LUCENE-3630 URL: https://issues.apache.org/jira/browse/LUCENE-3630 Project: Lucene - Java Issue Type: Bug Components: core/index Affects Versions: 3.5 Reporter: Uwe Schindler Fix For: 3.6 I found this while adding deprecations for RW access in LUCENE-3606: the base class defines doOpenIfChanged(boolean readOnly), but MultiReader and ParallelReader "override" this method with a signature doOpenIfChanged(doClone) and a missing @Override. This makes consumers calling IR.openIfChanged(boolean readOnly) do the wrong thing. Instead they should get UOE like for the other unimplemented doOpenIfChanged methods in MR and PR. The easy fix is to rename and hide this internal "reopen" method, like DirectoryReader,...
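The underlying Java behaviour is that a method with the same name and parameter types overrides the base method regardless of the parameter name or the presence of @Override. The following sketch reproduces the shape of the problem; BaseReader, AccidentalReader and FixedReader are hypothetical classes, and the String return values exist only to make the behaviour observable.

```java
// Hypothetical classes that mirror the shape of the problem; not Lucene code.
class BaseReader {
    // Public entry point: callers pass a readOnly flag.
    final String openIfChanged(boolean readOnly) {
        return doOpenIfChanged(readOnly);
    }

    // Readers that do not support the flag are expected to leave this alone.
    protected String doOpenIfChanged(boolean readOnly) {
        throw new UnsupportedOperationException("readOnly variant not supported");
    }
}

class AccidentalReader extends BaseReader {
    // Meant as an internal helper taking a "doClone" flag, but the signature
    // doOpenIfChanged(boolean) is identical, so it overrides the base method.
    // Without @Override nothing draws attention to that.
    protected String doOpenIfChanged(boolean doClone) {
        return doClone ? "cloned" : "reopened";
    }
}

class FixedReader extends BaseReader {
    // Renamed and hidden, so the base method is no longer overridden.
    private String doReopen(boolean doClone) {
        return doClone ? "cloned" : "reopened";
    }

    String cloneReader() {
        return doReopen(true);
    }
}
```

In the accidental variant, a caller asking for a read-only reopen silently gets a clone; in the fixed variant the same call fails fast with UnsupportedOperationException.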
[jira] [Created] (LUCENE-3626) Make PKIndexSplitter and MultiPassIndexSplitter work per segment
Make PKIndexSplitter and MultiPassIndexSplitter work per segment Key: LUCENE-3626 URL: https://issues.apache.org/jira/browse/LUCENE-3626 Project: Lucene - Java Issue Type: Improvement Components: modules/other Affects Versions: 4.0 Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 4.0 Spinoff from LUCENE-3624: DocValues merger throws exception on IW.addIndexes(SlowMultiReaderWrapper) as string-index like docvalues cannot provide asSortedSource.
[jira] [Created] (LUCENE-3614) Add a JUL/SLF4J example InfoStream implementation so IndexWriter can log to JUL/SLF4J
Add a JUL/SLF4J example InfoStream implementation so IndexWriter can log to JUL/SLF4J - Key: LUCENE-3614 URL: https://issues.apache.org/jira/browse/LUCENE-3614 Project: Lucene - Java Issue Type: Improvement Affects Versions: 4.0 Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 4.0 Followup to LUCENE-3598: Hoss suggested adding a default JUL/SLF4J implementation to contrib/misc (that can also be used by SOLR to log IndexWriter verbose messages to its logging framework).
[jira] [Created] (LUCENE-3606) Make IndexReader really read-only in Lucene 4.0
Make IndexReader really read-only in Lucene 4.0 --- Key: LUCENE-3606 URL: https://issues.apache.org/jira/browse/LUCENE-3606 Project: Lucene - Java Issue Type: Task Components: core/index Affects Versions: 4.0 Reporter: Uwe Schindler As we change the API completely in Lucene 4.0, we are also free to remove read-write access and commits from IndexReader. This code is so hairy and buggy (as investigated by Robert and Mike today) when you work on SegmentReader level but forget to flush in the DirectoryReader, so it's better to really make IndexReaders read-only. Currently with IndexReader you can do things like: - delete/undelete Documents -> Can be done with IndexWriter, too (using deleteByQuery) - change norms -> this is a bad idea in general, but when we remove norms altogether and replace them by DocValues this is obsolete already. Changing DocValues should also be done using IndexWriter in trunk (once it is ready)
[jira] [Created] (LUCENE-3598) Improve InfoStream class in trunk to be more consistent with logging-frameworks like slf4j/log4j/commons-logging
Improve InfoStream class in trunk to be more consistent with logging-frameworks like slf4j/log4j/commons-logging Key: LUCENE-3598 URL: https://issues.apache.org/jira/browse/LUCENE-3598 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 4.0 Reporter: Uwe Schindler Followup on a [thread by Shai Erera on java-dev@lao|http://lucene.472066.n3.nabble.com/IndexWriter-infoStream-is-final-td3537485.html]: I already discussed this with Robert; there is one thing missing. Currently the IW only checks if the infoStream!=null and then passes the message to the method, and that *may* ignore it. For your requirement it is the case that this is enabled or disabled dynamically. Unfortunately if the construction of the message is heavy, then this wastes resources. I would like to add another method to this class: abstract boolean isMessageEnabled() that can also be implemented. I would then replace all null checks in IW by this method. The default config in IW would be changed to use a NoOutputInfoStream that returns false here and ignores the message. A simple logger wrapper for e.g. log4j / slf4j then could look like (ignoring component, could be enabled):
{code:java}
Logger log = YourLoggingFramework.getLogger(IndexWriter.class);

public void message(String component, String message) {
  log.debug(component + ": " + message);
}

public boolean isMessageEnabled(String component) {
  return log.isDebugEnabled();
}
{code}
Using this you could enable/disable logging live by e.g. the log4j management console of your app server by enabling/disabling IndexWriter.class logging. The changes are really simple: - PrintStreamInfoStream returns true, always; maybe make it dynamically enable/disable to allow Shai's request - infoStream.getDefault() is never null and can never be set to null. Instead the default is a singleton NoOutputInfoStream that returns false from isMessageEnabled(). - All null checks on infoStream should be replaced by infoStream.isMessageEnabled(component); this is possible as it is always != null. There are no slowdowns by this - it's like Collections.emptyList() instead of stupid null checks.
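The proposal above (a never-null info stream, a no-output singleton, and an isMessageEnabled guard) can be sketched with java.util.logging as follows. All class names (SketchInfoStream, NoOutputSketchInfoStream, JulSketchInfoStream, SketchWriter) are hypothetical and only follow the shape described in the issue; mapping messages to Level.FINE is an assumption.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical abstract class following the shape proposed in the issue.
abstract class SketchInfoStream {
    abstract void message(String component, String message);

    abstract boolean isMessageEnabled(String component);
}

// Null-object default: never enabled, silently drops messages.
final class NoOutputSketchInfoStream extends SketchInfoStream {
    static final NoOutputSketchInfoStream INSTANCE = new NoOutputSketchInfoStream();

    private NoOutputSketchInfoStream() {}

    void message(String component, String message) {
        // intentionally ignored
    }

    boolean isMessageEnabled(String component) {
        return false;
    }
}

// Adapter to java.util.logging; Level.FINE is an assumed mapping for "debug".
final class JulSketchInfoStream extends SketchInfoStream {
    private final Logger log;

    JulSketchInfoStream(Logger log) {
        this.log = log;
    }

    void message(String component, String message) {
        log.log(Level.FINE, component + ": " + message);
    }

    boolean isMessageEnabled(String component) {
        return log.isLoggable(Level.FINE);
    }
}

// Hypothetical writer showing the guard that replaces the null check.
final class SketchWriter {
    private final SketchInfoStream infoStream;
    int expensiveCalls = 0;

    SketchWriter(SketchInfoStream infoStream) {
        this.infoStream = infoStream;
    }

    private String expensiveState() {
        expensiveCalls++; // counts how often the costly message is built
        return "pending segments";
    }

    void flush() {
        // The message is only built when somebody will actually read it.
        if (infoStream.isMessageEnabled("IW")) {
            infoStream.message("IW", "flush: " + expensiveState());
        }
    }
}
```

Because the logger level can be changed at runtime, the guard gives the dynamic enable/disable behaviour requested in the thread without the writer ever holding a null reference.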
[jira] [Created] (LUCENE-3595) Refactor FieldCacheRangeFilter.FieldCacheDocIdSet to be separate class and fix the dangerous matchDoc() throws AIOOBE requirement
Refactor FieldCacheRangeFilter.FieldCacheDocIdSet to be separate class and fix the dangerous matchDoc() throws AIOOBE requirement - Key: LUCENE-3595 URL: https://issues.apache.org/jira/browse/LUCENE-3595 Project: Lucene - Java Issue Type: Task Reporter: Uwe Schindler Assignee: Uwe Schindler Followup from LUCENE-3593: The FieldCacheRangeFilter.FieldCacheDocIdSet class has a strange requirement on the abstract matchDoc(): It should throw AIOOBE if the docId is > maxDoc. This check should be done by the caller, as especially on trunk, e.g. FieldCacheTermsFilter does not seem to always throw this exception correctly (getOrd() is a method and not an array in the TermsIndex cache). Also in 3.x the Filter does not correctly respect deletions when a FieldCache based on a reopened reader is used. This issue will refactor this, fix the bugs and move the docId check up to the iterator.
[jira] [Created] (LUCENE-3594) Backport FieldCacheTermsFilter code duplication removal to 3.x
Backport FieldCacheTermsFilter code duplication removal to 3.x -- Key: LUCENE-3594 URL: https://issues.apache.org/jira/browse/LUCENE-3594 Project: Lucene - Java Issue Type: Improvement Affects Versions: 3.5 Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 3.6 In trunk I already cleaned up FieldCacheTermsFilter to not duplicate code of FieldCacheRangeFilter. This issue simply backports this.
[jira] [Created] (LUCENE-3588) Try harder to prevent SIGSEGV on cloned MMapIndexInputs
Try harder to prevent SIGSEGV on cloned MMapIndexInputs --- Key: LUCENE-3588 URL: https://issues.apache.org/jira/browse/LUCENE-3588 Project: Lucene - Java Issue Type: Improvement Components: core/store Affects Versions: 3.4, 3.5 Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 3.6, 4.0 We are unmapping mmapped byte buffers, which is disallowed by the JDK because it has the risk of SIGSEGV when you access the mapped byte buffer after unmapping. We currently prevent this for the main IndexInput by setting its buffer to null, so we NPE if somebody tries to access the underlying buffer. I recently also fixed the stupid curBuf (LUCENE-3200) by setting it to null. The big problem is cloned IndexInputs, which are generally not closed. Those still contain references to the unmapped ByteBuffer, which easily leads to SIGSEGV. The patch from Mike in LUCENE-3439 prevents most of this in Lucene 3.5, but it's still not 100% safe (as it uses non-volatiles). This patch will fix the remaining issues by also setting the buffers of clones to null when the original is closed. The trick is to record weak references of all clones created and close them together with the original. This uses a ConcurrentHashMap keyed by the weak references as store, with the logic borrowed from WeakHashMap to clean up the GCed references (using ReferenceQueue). If we respin 3.5, we should maybe also get this in.
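The clone-tracking trick can be sketched as below. This is a simplified illustration, not the patch: SketchInput is a hypothetical class, a byte[] stands in for the mapped buffer, and a plain list guarded by a lock replaces the ConcurrentHashMap plus ReferenceQueue cleanup mentioned in the issue, so stale weak references are only dropped when the original is closed.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a mapped input; a byte[] plays the mapped buffer.
final class SketchInput {
    // volatile so that a close on one thread is visible to readers on others
    private volatile byte[] buffer;
    private final boolean isClone;

    // Shared between the original and all of its clones; also used as the lock.
    private final List<WeakReference<SketchInput>> clones;

    SketchInput(byte[] data) {
        this(data, false, new ArrayList<WeakReference<SketchInput>>());
    }

    private SketchInput(byte[] data, boolean isClone,
                        List<WeakReference<SketchInput>> clones) {
        this.buffer = data;
        this.isClone = isClone;
        this.clones = clones;
    }

    SketchInput cloneInput() {
        synchronized (clones) {
            byte[] b = buffer;
            if (b == null) {
                throw new IllegalStateException("already closed");
            }
            SketchInput clone = new SketchInput(b, true, clones);
            // Weak reference: tracking a clone must not keep it alive.
            clones.add(new WeakReference<SketchInput>(clone));
            return clone;
        }
    }

    byte readByte(int pos) {
        byte[] b = buffer;
        if (b == null) {
            // Fails with a Java exception instead of touching unmapped memory.
            throw new IllegalStateException("already closed");
        }
        return b[pos];
    }

    void close() {
        synchronized (clones) {
            buffer = null;
            if (isClone) {
                return; // a clone only invalidates itself
            }
            // Closing the original invalidates every clone still reachable.
            for (WeakReference<SketchInput> ref : clones) {
                SketchInput clone = ref.get();
                if (clone != null) {
                    clone.buffer = null;
                }
            }
            clones.clear();
        }
    }
}
```

After the original is closed, any read on a surviving clone fails with an ordinary exception, which is the behaviour the issue wants in place of a JVM crash.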
[jira] [Created] (LUCENE-3583) benchmark tests always fail on windows because directory cannot be removed
benchmark tests always fail on windows because directory cannot be removed -- Key: LUCENE-3583 URL: https://issues.apache.org/jira/browse/LUCENE-3583 Project: Lucene - Java Issue Type: Bug Reporter: Uwe Schindler This seems to be a recently introduced bug. I have no idea what's wrong. Attached is a log file; it reproduces every time.
[jira] [Created] (LUCENE-3574) Add some more constants for newer Java versions to Constants.class, remove outdated ones.
Add some more constants for newer Java versions to Constants.class, remove outdated ones. - Key: LUCENE-3574 URL: https://issues.apache.org/jira/browse/LUCENE-3574 Project: Lucene - Java Issue Type: New Feature Components: core/other Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 3.5, 4.0 Preparation for LUCENE-3235: This adds constants to quickly detect Java6 and Java7 to Constants.java. It also deprecates and removes the outdated historical Java versions.
[jira] [Created] (LUCENE-3561) Maven xxx-src.jar files do not contain resources
Maven xxx-src.jar files do not contain resources Key: LUCENE-3561 URL: https://issues.apache.org/jira/browse/LUCENE-3561 Project: Lucene - Java Issue Type: Bug Components: general/build Affects Versions: 3.4, 4.0 Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 3.5, 4.0 When building src.jar files for Maven deploy, they miss resources, so analyzers-common-src.jar is useless. The attached patch will fix this. The only downside is: The globmapper hack does not work with , so I used the new ANT 1.7.1 attribute erroronmissingdir="no" on the s I also updated BUILD.txt, which was missing even Java 1.6 in trunk!
[jira] [Created] (LUCENE-3540) In 3.x branch (starting with 3.4) the IndexFormatTooOldException was backported, but the error message was not modified for 3.x
In 3.x branch (starting with 3.4) the IndexFormatTooOldException was backported, but the error message was not modified for 3.x --- Key: LUCENE-3540 URL: https://issues.apache.org/jira/browse/LUCENE-3540 Project: Lucene - Java Issue Type: Bug Affects Versions: 3.4 Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 3.5 In 3.x branch (starting with 3.4) the IndexFormatTooOldException was backported, but the error message was not modified for 3.x: bq. This version of Lucene only supports indexes created with release 3.0 and later. In 3.x it must be: bq. This version of Lucene only supports indexes created with release 1.9 and later. Indexes before 1.9 will throw this exception on reading SegmentInfos (LUCENE-3255).
[jira] [Created] (LUCENE-3537) Add note about Java 7u1 and 6u29 to Lucene/Solr sites
Add note about Java 7u1 and 6u29 to Lucene/Solr sites - Key: LUCENE-3537 URL: https://issues.apache.org/jira/browse/LUCENE-3537 Project: Lucene - Java Issue Type: Task Components: general/website Reporter: Uwe Schindler Oracle confirmed that the bugs leading to index corruption and SIGSEGV are fixed in Java 7u1 and 6u29. We should post a message to the news sections revising the previous WARNING (LUCENE-3349). I prepared something, please comment before I commit: {quote} 26 October 2011 - Java 7u1 fixes index corruption and crash bugs in Apache Lucene Core and Apache Solr Oracle released Java 7u1 (http://www.oracle.com/technetwork/java/javase/7u1-relnotes-507962.html) on October 19. According to the release notes and tests done by the Lucene committers, all bugs reported on July 28 are fixed in this release, so code using Porter stemmer no longer crashes with SIGSEGV. We no longer saw any index corruption, so it is safe to use Java 7u1 with Lucene Core and Solr. On the same day, Oracle released Java 6u29 (http://www.oracle.com/technetwork/java/javase/6u29-relnotes-507960.html), fixing the same problems occurring with Java 6 if the JVM switches -XX:+AggressiveOpts or -XX:+OptimizeStringConcat were used. Of course, you should not use experimental JVM options like -XX:+AggressiveOpts in production environments! We recommend that everybody upgrade to this latest version 6u29. In case you upgrade to Java 7, remember that you may have to reindex, as the Unicode version shipped with Java 7 changed and tokenization behaves differently (e.g. lowercasing). For more information, read JRE_VERSION_MIGRATION.txt in your distribution package! {quote} I plan to commit this later this afternoon.
[jira] [Created] (LUCENE-3534) Backport FilteredQuery/IndexSearcher changes to 3.x: Remove filter logic from IndexSearcher and delegate to FilteredQuery
Backport FilteredQuery/IndexSearcher changes to 3.x: Remove filter logic from IndexSearcher and delegate to FilteredQuery - Key: LUCENE-3534 URL: https://issues.apache.org/jira/browse/LUCENE-3534 Project: Lucene - Java Issue Type: Improvement Reporter: Uwe Schindler Spinoff from LUCENE-1536: We simplified the code in IndexSearcher to no longer do the filtering there, instead wrap all Query with FilteredQuery, if a non-null filter is given. The conjunction code would then only exist in FilteredQuery which makes it easier to maintain. Currently both implementations differ in 3.x, in trunk we used the more optimized IndexSearcher variant with addition of a simplified in-order conjunction code. This issue will backport those changes (without random access bits).
[jira] [Created] (LUCENE-3533) Nuke SpanFilters and CachingSpanFilter (maybe move to sandbox)
Nuke SpanFilters and CachingSpanFilter (maybe move to sandbox) -- Key: LUCENE-3533 URL: https://issues.apache.org/jira/browse/LUCENE-3533 Project: Lucene - Java Issue Type: Task Reporter: Uwe Schindler Assignee: Uwe Schindler SpanFilters are inefficient and OOM easily (they don't scale at all: they create large Lists of Objects for every match, and filtering deleted docs is also a pain). After some talks with Grant at Eurocon, and given that caching of them is still broken in 3.x (but fixed on trunk), I assume nobody uses them, so let's nuke them. They are also in the wrong package, so the standard statement applies: "Die, SpanFilters, die!"
[jira] [Created] (LUCENE-3532) Improve Weight.scorer() API to enforce consistent topScorer/outOfOrder parameters across segments
Improve Weight.scorer() API to enforce consistent topScorer/outOfOrder parameters across segments - Key: LUCENE-3532 URL: https://issues.apache.org/jira/browse/LUCENE-3532 Project: Lucene - Java Issue Type: Improvement Components: core/search Affects Versions: 4.0 Reporter: Uwe Schindler Spinoff from LUCENE-1536: In the past, when filters were applied, all scorers were forced to score in order. With random access DocIdSets, this is no longer needed. Some Weights (BooleanWeight) unfortunately return different scorers for in-order/out-of-order, leading to incompatible scores between segments. For now we enforce in-order execution of scorers for FilteredQuery (as we do in 3.x), but once we fix BooleanWeight or have some other good way to produce compatible scores, we can reenable random access. Maybe we should nuke BooleanScorer2... - Robert and Mike have some ideas how to do that :)
[jira] [Created] (LUCENE-3531) Improve CachingWrapperFilter to optionally also cache acceptDocs, if identical to liveDocs
Improve CachingWrapperFilter to optionally also cache acceptDocs, if identical to liveDocs -- Key: LUCENE-3531 URL: https://issues.apache.org/jira/browse/LUCENE-3531 Project: Lucene - Java Issue Type: Improvement Components: core/search Affects Versions: 4.0 Reporter: Uwe Schindler Spinoff from LUCENE-1536: That issue removed the different cache modes completely and always applies the acceptDocs using BitsFilteredDocIdSet.wrap(); the cache only contains the raw DocIdSet without any deletions/acceptDocs. For IndexReaders that are seldom reopened, this might not be as performant as it could be. If acceptDocs==IR.liveDocs, the DocIdSet could also be cached with liveDocs applied.
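A minimal sketch of the idea above, assuming a simplified model rather than Lucene's actual classes: java.util.BitSet stands in for both DocIdSet and the liveDocs/acceptDocs Bits, a String stands in for the per-segment reader key, and LiveDocsCacheSketch, putRaw and getDocIdSet are hypothetical names invented for illustration. It shows the raw result being cached once, and a second cache entry with liveDocs already applied being reused only when acceptDocs is the very same object as liveDocs.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the proposal; this is NOT Lucene's
// CachingWrapperFilter. BitSet stands in for DocIdSet and for Bits.
final class LiveDocsCacheSketch {

    // Cached result with liveDocs applied, plus the liveDocs it was built from.
    private static final class Entry {
        final BitSet liveDocs;
        final BitSet result;

        Entry(BitSet liveDocs, BitSet result) {
            this.liveDocs = liveDocs;
            this.result = result;
        }
    }

    // Raw filter result per segment, with no deletions applied.
    private final Map<String, BitSet> rawCache = new HashMap<>();
    // Optional second cache: raw result already intersected with liveDocs.
    private final Map<String, Entry> liveCache = new HashMap<>();
    // Counts how often an intersection was actually computed.
    int intersections = 0;

    void putRaw(String segment, BitSet raw) {
        rawCache.put(segment, (BitSet) raw.clone());
    }

    // Callers must treat the returned BitSet as read-only, since it may be shared.
    BitSet getDocIdSet(String segment, BitSet liveDocs, BitSet acceptDocs) {
        BitSet raw = rawCache.get(segment);
        if (raw == null) {
            throw new IllegalStateException("no raw result cached for " + segment);
        }
        // Identity check, mirroring the "acceptDocs==IR.liveDocs" condition.
        if (acceptDocs == liveDocs) {
            Entry entry = liveCache.get(segment);
            // Rebuild if nothing is cached or liveDocs changed (e.g. after a reopen).
            if (entry == null || entry.liveDocs != liveDocs) {
                entry = new Entry(liveDocs, intersect(raw, liveDocs));
                liveCache.put(segment, entry);
            }
            return entry.result;
        }
        // Any other acceptDocs: apply it to the raw set on every call.
        return intersect(raw, acceptDocs);
    }

    private BitSet intersect(BitSet raw, BitSet accept) {
        intersections++;
        BitSet result = (BitSet) raw.clone();
        if (accept != null) { // null means "accept everything"
            result.and(accept);
        }
        return result;
    }
}
```

In this model, repeated calls with acceptDocs identical to liveDocs pay for the intersection once per liveDocs instance, which is where a seldom-reopened reader would benefit; every other acceptDocs still goes through the per-call path.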
[jira] [Created] (LUCENE-3530) Remove deprecated methods in CompoundTokenFilters
Remove deprecated methods in CompoundTokenFilters - Key: LUCENE-3530 URL: https://issues.apache.org/jira/browse/LUCENE-3530 Project: Lucene - Java Issue Type: Sub-task Affects Versions: 4.0 Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 4.0
[jira] [Created] (LUCENE-3489) Refactor test classes that use assumeFalse(codec != SimpleText, Memory) to use new annotation and move the expensive methods to separate classes
Refactor test classes that use assumeFalse(codec != SimpleText, Memory) to use new annotation and move the expensive methods to separate classes Key: LUCENE-3489 URL: https://issues.apache.org/jira/browse/LUCENE-3489 Project: Lucene - Java Issue Type: Test Reporter: Uwe Schindler Follow-up for LUCENE-3463. TODO: - Move test-methods that need the new @UseNoMemoryExpensiveCodec annotation to separate classes - Eliminate the assumeFalse-calls that check the current codec and disable the test if SimpleText or Memory is used
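The refactoring in the TODO might look roughly like the following sketch. It does not use the Lucene test framework: AvoidExpensiveCodecs is a locally declared stand-in for the class-level annotation, and CodecChooserSketch, chooseCodec and the two test class names are hypothetical. The only point it illustrates is that once the expensive methods live in their own annotated class, the codec choice can be made from the class annotation, so no per-method assumeFalse check is needed.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Arrays;
import java.util.List;

// Stand-in for a class-level marker annotation (hypothetical, declared here
// only so the sketch is self-contained).
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface AvoidExpensiveCodecs {}

final class CodecChooserSketch {
    // The two codecs named in the issue as too expensive for some tests.
    static final List<String> EXPENSIVE = Arrays.asList("SimpleText", "Memory");

    // Returns the first candidate the test class may run with. A real test
    // framework would pick randomly; iteration order keeps this deterministic.
    static String chooseCodec(Class<?> testClass, List<String> candidates) {
        boolean avoid = testClass.isAnnotationPresent(AvoidExpensiveCodecs.class);
        for (String codec : candidates) {
            if (!avoid || !EXPENSIVE.contains(codec)) {
                return codec;
            }
        }
        throw new IllegalArgumentException("no usable codec in " + candidates);
    }
}

// Expensive test methods are moved into their own annotated class ...
@AvoidExpensiveCodecs
final class ExpensiveIndexingTests {}

// ... while the remaining cheap tests stay unannotated and may use any codec.
final class CheapIndexingTests {}
```

With this split, the annotated class never runs with SimpleText or Memory, and the unannotated class is no longer silently skipped when one of those codecs is chosen.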