[jira] [Commented] (LUCENE-5260) Make older Suggesters more accepting of TermFreqPayloadIterator
[ https://issues.apache.org/jira/browse/LUCENE-5260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790075#comment-13790075 ] Areek Zillur commented on LUCENE-5260: -- Hey Michael, I was thinking about how to nicely replace TermFreqIterator. - I was thinking of having some kind of wrapper for TermFreqPayloadIterator that would nullify the payload field for the current TermFreqIterator consumers, and a way for the wrapper to signal early on to the consumers that they don't need to deal with the payload at all. - Also, it seems like there are a lot of implementations of TermFreqIterator (e.g. BufferedTermFreqIteratorWrapper, SortedTermFreqIteratorWrapper); I will make sure all these implementations work with TermFreqPayloadIterator and its new wrapper (for mimicking TermFreqIterator). Any thoughts? I will try to come up with a rough patch soon. Make older Suggesters more accepting of TermFreqPayloadIterator --- Key: LUCENE-5260 URL: https://issues.apache.org/jira/browse/LUCENE-5260 Project: Lucene - Core Issue Type: Improvement Components: core/search Reporter: Areek Zillur As discussed in https://issues.apache.org/jira/browse/LUCENE-5251, it would be nice to make the older suggesters accept TermFreqPayloadIterator and throw an exception if a payload is found (if it cannot be used). This will also allow us to nuke most of the other interfaces for BytesRefIterator. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
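A rough sketch of the wrapper idea described above, using stand-in types rather than the real Lucene interfaces (the actual TermFreqPayloadIterator API differs, e.g. it works with BytesRef; all names here are illustrative, not the committed API):

```java
// Stand-in for Lucene's TermFreqPayloadIterator; the real interface uses
// BytesRef and differs in detail -- this only illustrates the shape.
interface TermFreqPayloadIterator {
    String next();        // next term, or null when exhausted
    long weight();        // weight of the current term
    byte[] payload();     // payload of the current term, may be null
}

// Wrapper that mimics a payload-free TermFreqIterator: it nullifies the
// payload and signals early (hasPayloads() == false) so consumers never
// need to deal with payloads at all.
class PayloadNullifyingWrapper implements TermFreqPayloadIterator {
    private final TermFreqPayloadIterator delegate;

    PayloadNullifyingWrapper(TermFreqPayloadIterator delegate) {
        this.delegate = delegate;
    }

    boolean hasPayloads() { return false; }  // early signal to consumers

    @Override public String next()    { return delegate.next(); }
    @Override public long weight()    { return delegate.weight(); }
    @Override public byte[] payload() { return null; }  // payload nullified
}
```

Existing TermFreqIterator-style consumers could then be ported to take the payload-capable interface while checking hasPayloads() once up front instead of per entry.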
[jira] [Commented] (LUCENE-5251) New Dictionary Implementation for Suggester consumption
[ https://issues.apache.org/jira/browse/LUCENE-5251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790079#comment-13790079 ] Areek Zillur commented on LUCENE-5251: -- Thanks for committing the patch, Michael! New Dictionary Implementation for Suggester consumption --- Key: LUCENE-5251 URL: https://issues.apache.org/jira/browse/LUCENE-5251 Project: Lucene - Core Issue Type: New Feature Components: core/search Reporter: Areek Zillur Fix For: 5.0, 4.6 Attachments: LUCENE-5251.patch, LUCENE-5251.patch, LUCENE-5251.patch, LUCENE-5251.patch With the vast array of new suggesters, it would be nice to have a dictionary implementation that could feed the suggesters terms, weights and (optionally) payloads from the Lucene index. The idea of this dictionary implementation is to grab stored documents from the index and use user-configured fields for terms, weights and payloads. Use-case: if you have a document with three fields - product_id - product_name - product_popularity_score - then using this implementation would enable you to have a suggester for product_name using the weight of product_popularity_score, and return you the payload of product_id, with which you can do further processing (for example, construct a URL).
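The use-case above can be sketched roughly like this; the documents are modelled as plain maps here instead of a real Lucene index, and all names are illustrative assumptions, not the committed API (real code would iterate stored documents from an IndexReader):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative sketch: pull (term, weight, payload) triples out of stored
// documents, using user-configured field names for each role.
class StoredFieldDictionarySketch {
    static final class Entry {
        final String term; final long weight; final String payload;
        Entry(String term, long weight, String payload) {
            this.term = term; this.weight = weight; this.payload = payload;
        }
    }

    static List<Entry> extract(List<Map<String, String>> docs,
                               String termField, String weightField, String payloadField) {
        List<Entry> out = new ArrayList<>();
        for (Map<String, String> doc : docs) {
            String term = doc.get(termField);
            if (term == null) continue;  // skip docs without the term field
            String w = doc.get(weightField);
            long weight = (w == null) ? 0L : Long.parseLong(w);
            String payload = (payloadField == null) ? null : doc.get(payloadField);
            out.add(new Entry(term, weight, payload));
        }
        return out;
    }
}
```

With the example fields, extract(docs, "product_name", "product_popularity_score", "product_id") yields one suggestion entry per product, whose payload (the product id) can then be used for further processing such as constructing a URL.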
[jira] [Updated] (SOLR-5320) Multi level compositeId router
[ https://issues.apache.org/jira/browse/SOLR-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anshum Gupta updated SOLR-5320: --- Remaining Estimate: 336h Original Estimate: 336h Multi level compositeId router -- Key: SOLR-5320 URL: https://issues.apache.org/jira/browse/SOLR-5320 Project: Solr Issue Type: New Feature Components: SolrCloud Reporter: Anshum Gupta Original Estimate: 336h Remaining Estimate: 336h This would enable multi-level routing, as compared to the 2-level routing available as of now. On the usage bit, here's an example: Document Id: myapp!dummyuser!doc myapp!dummyuser! can be used as the shard key for searching content for dummyuser. myapp! can be used for searching across all users of myapp. I am looking at either a 3 (or 4) level routing. The 32-bit hash would then comprise 8x4 components from each part (in the case of 4 levels).
[jira] [Created] (SOLR-5320) Multi level compositeId router
Anshum Gupta created SOLR-5320: -- Summary: Multi level compositeId router Key: SOLR-5320 URL: https://issues.apache.org/jira/browse/SOLR-5320 Project: Solr Issue Type: New Feature Components: SolrCloud Reporter: Anshum Gupta This would enable multi-level routing, as compared to the 2-level routing available as of now. On the usage bit, here's an example: Document Id: myapp!dummyuser!doc myapp!dummyuser! can be used as the shard key for searching content for dummyuser. myapp! can be used for searching across all users of myapp. I am looking at either a 3 (or 4) level routing. The 32-bit hash would then comprise 8x4 components from each part (in the case of 4 levels).
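The 8x4 idea (four parts, 8 bits of hash from each) might look roughly like this; how Solr's CompositeIdRouter actually allocates hash bits is more involved, so treat this purely as an illustration of the prefix property that makes shard-key routing possible:

```java
// Illustrative 4-level composite hash: pack the top 8 bits of each part's
// hash into one 32-bit value.  Ids with the same leading parts then share
// a hash prefix, which is what makes shard-key routing like "myapp!" or
// "myapp!dummyuser!" possible.
class MultiLevelHashSketch {
    static int hash4(String id) {
        String[] parts = id.split("!", 4);  // e.g. "myapp!dummyuser!doc"
        int result = 0;
        for (int i = 0; i < 4; i++) {
            int h = (i < parts.length) ? parts[i].hashCode() : 0;
            result = (result << 8) | ((h >>> 24) & 0xFF);  // top 8 bits per level
        }
        return result;
    }
}
```

Because each level contributes a fixed bit range, a query routed with the shard key myapp!dummyuser! only needs to match on the top 16 bits, and myapp! on the top 8.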
[jira] [Comment Edited] (LUCENE-5267) java.lang.ArrayIndexOutOfBoundsException on reading data
[ https://issues.apache.org/jira/browse/LUCENE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13789958#comment-13789958 ] Littlestar edited comment on LUCENE-5267 at 10/9/13 6:54 AM: -
{noformat}
public static int decompress(DataInput compressed, int decompressedLen, byte[] dest, int dOff) throws IOException {
  final int destEnd = dest.length;
  do {
    ...
    // copying a multiple of 8 bytes can make decompression from 5% to 10% faster
    final int fastLen = (matchLen + 7) & 0xFFF8;
    if (matchDec < matchLen || dOff + fastLen > destEnd) {
      // overlap -> naive incremental copy
      for (int ref = dOff - matchDec, end = dOff + matchLen; dOff < end; ++ref, ++dOff) {
        dest[dOff] = dest[ref];
      }
    } else {
      // no overlap -> arraycopy
      try {
        System.arraycopy(dest, dOff - matchDec, dest, dOff, fastLen);
      } catch (Throwable e) {
        System.out.println("dest.length=" + dest.length + ",dOff=" + dOff + ",matchDec=" + matchDec + ",matchLen=" + matchLen + ",fastLen=" + fastLen);
      }
      dOff += matchLen;
    }
  } while (dOff < decompressedLen);
  return dOff;
}
{noformat}
was (Author: cnstar9988):
{noformat}
public static int decompress(DataInput compressed, int decompressedLen, byte[] dest, int dOff) throws IOException {
  final int destEnd = dest.length;
  do {
    ...
    // copying a multiple of 8 bytes can make decompression from 5% to 10% faster
    final int fastLen = (matchLen + 7) & 0xFFF8;
    if (matchDec < matchLen || dOff + fastLen > destEnd) {
      // overlap -> naive incremental copy
      for (int ref = dOff - matchDec, end = dOff + matchLen; dOff < end; ++ref, ++dOff) {
        dest[dOff] = dest[ref];
      }
    } else {
      // no overlap -> arraycopy
      // System.out.println("dest.length=" + dest.length + ",dOff=" + dOff + ",matchDec=" + matchDec + ",fastLen=" + fastLen);
      System.arraycopy(dest, dOff - matchDec, dest, dOff, fastLen); // here throws java.lang.ArrayIndexOutOfBoundsException
      dOff += matchLen;
    }
  } while (dOff < decompressedLen);
  return dOff;
}
{noformat}
java.lang.ArrayIndexOutOfBoundsException on reading data Key: LUCENE-5267 URL: https://issues.apache.org/jira/browse/LUCENE-5267 Project: Lucene - Core Issue Type: Bug Components: core/index Affects Versions: 4.4 Reporter: Littlestar Labels: LZ4 java.lang.ArrayIndexOutOfBoundsException at org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:132) at org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:135) at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:336) at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:133) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.SlowCompositeReaderWrapper.document(SlowCompositeReaderWrapper.java:212) at org.apache.lucene.index.FilterAtomicReader.document(FilterAtomicReader.java:365) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.IndexReader.document(IndexReader.java:447) at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:204)
[jira] [Commented] (SOLR-5294) Pluggable Dictionary Implementation for Suggester
[ https://issues.apache.org/jira/browse/SOLR-5294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790105#comment-13790105 ] Areek Zillur commented on SOLR-5294: Thanks for reviewing this, Robert! {quote} Should we think about fixing the spellchecker stuff too (which seems to have totally separate implementations like FileBased and so on) to just change the dictionary {quote} This is an interesting point! After looking through the AbstractLuceneSpellChecker and all its implementations, it seems like it would be better to refactor those out too. I feel like that should be considered for the dictionaryImpl setting to work as expected. {quote} I am not sure if we want to keep spell and suggest entangled? {quote} It does make sense to untangle them, but I think that by itself is a bigger issue (I will open up an issue about that and will be happy to work on it). {quote} Should we name the DictionaryFactoryBase something better (SuggestDictionary? SpellingDictionary?) {quote} Given the situation, it seems like the dictionary plugin will be shared among both suggest and spelling; maybe call it DictionaryFactory? {quote} Maybe we can simplify the base plugin class to suit more use cases, like remove the setCore() and just check if it implements the CoreAware interface? {quote} That sounds good to me. {quote} I think it would be ideal if we could eliminate the additional hierarchy of FileBased* and IndexBased*: couldn't the FileBased impl just take its filename in via a parameter in params, and IndexBased take its fieldname in params the same way, and we push up create(IndexSearcher) to the base plugin class (the file-based just wouldn't use the indexsearcher argument). {quote} The reason for having the hierarchy was to separate out the two major types of dictionaries (index- and file-based). I can change that, but at the cost of reduced enforcement. I will upload another patch, incorporating your feedback!
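The flattened hierarchy being discussed could look roughly like this, with stand-in types instead of the real Solr/Lucene classes (all names here are assumptions, not the final API): create(searcher) lives on the shared base factory, and the file-based implementation simply ignores the searcher argument.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-ins for the real types (org.apache.lucene.search.spell.Dictionary,
// org.apache.lucene.search.IndexSearcher) so the sketch is self-contained.
interface Dictionary { String describe(); }
class IndexSearcher { }

// Shared base factory: configuration comes in via params, and create()
// always receives a searcher, which implementations may ignore.
abstract class DictionaryFactory {
    protected Map<String, String> params = new HashMap<>();
    void init(Map<String, String> params) { this.params = params; }
    abstract Dictionary create(IndexSearcher searcher);
}

// File-based: takes its filename from params and ignores the searcher.
class FileDictionaryFactory extends DictionaryFactory {
    @Override Dictionary create(IndexSearcher searcher) {
        final String location = params.get("sourceLocation");
        return () -> "file:" + location;
    }
}

// Index-based: takes its field name from params and would use the searcher.
class LuceneDictionaryFactory extends DictionaryFactory {
    @Override Dictionary create(IndexSearcher searcher) {
        final String field = params.get("field");
        return () -> "index:" + field;
    }
}
```

The trade-off mentioned in the comment is visible here: a single base class means the file-based factory accepts a searcher it never uses, so the type system no longer enforces the index/file split.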
Pluggable Dictionary Implementation for Suggester - Key: SOLR-5294 URL: https://issues.apache.org/jira/browse/SOLR-5294 Project: Solr Issue Type: Improvement Components: SearchComponents - other Reporter: Areek Zillur Attachments: SOLR-5294.patch, SOLR-5294.patch It would be nice to have the option of plugging in Dictionary implementations for the suggester to consume, like the lookup implementation setting that allows users to specify which Lucene suggesters to use. This would allow easy addition of new dictionary implementations that the Lucene suggesters can consume. New dictionary implementations like https://issues.apache.org/jira/browse/LUCENE-5251 could be easily added. I believe this would give the users more control over what they want their Lucene suggesters to consume. For the implementation, the user can add a new setting in the SpellCheckComponent in solrconfig.xml. The new setting would be a string identifying the class path of the dictionary implementation to be used (very similar to the existing lookupImpl). This setting would be used to call the relevant DictionaryFactory.
A sample solrconfig file would look as follows (note the new dictionaryImpl setting):
{code}
<searchComponent class="solr.SpellCheckComponent" name="fuzzy_suggest_analyzing_with_lucene_dict">
  <lst name="spellchecker">
    <str name="name">fuzzy_suggest_analyzing_with_lucene_dict</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.FuzzyLookupFactory</str>
    <!-- new setting -->
    <str name="dictionaryImpl">org.apache.solr.spelling.suggest.LuceneDictionaryFactory</str>
    <str name="storeDir">fuzzy_suggest_analyzing</str>
    <str name="buildOnCommit">false</str>
    <!-- Suggester properties -->
    <bool name="exactMatchFirst">true</bool>
    <str name="suggestAnalyzerFieldType">text</str>
    <bool name="preserveSep">false</bool>
    <str name="fields">text</str>
  </lst>
</searchComponent>
{code}
[jira] [Assigned] (LUCENE-5267) java.lang.ArrayIndexOutOfBoundsException on reading data
[ https://issues.apache.org/jira/browse/LUCENE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand reassigned LUCENE-5267: Assignee: Adrien Grand java.lang.ArrayIndexOutOfBoundsException on reading data Key: LUCENE-5267 URL: https://issues.apache.org/jira/browse/LUCENE-5267 Project: Lucene - Core Issue Type: Bug Components: core/index Affects Versions: 4.4 Reporter: Littlestar Assignee: Adrien Grand Labels: LZ4 java.lang.ArrayIndexOutOfBoundsException at org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:132) at org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:135) at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:336) at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:133) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.SlowCompositeReaderWrapper.document(SlowCompositeReaderWrapper.java:212) at org.apache.lucene.index.FilterAtomicReader.document(FilterAtomicReader.java:365) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.IndexReader.document(IndexReader.java:447) at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:204)
[jira] [Commented] (LUCENE-5267) java.lang.ArrayIndexOutOfBoundsException on reading data
[ https://issues.apache.org/jira/browse/LUCENE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790108#comment-13790108 ] Adrien Grand commented on LUCENE-5267: -- Thanks for the report. Can you check if there are disk-related issues in your system logs and share the .fdx and .fdt files of the broken segment? java.lang.ArrayIndexOutOfBoundsException on reading data Key: LUCENE-5267 URL: https://issues.apache.org/jira/browse/LUCENE-5267 Project: Lucene - Core Issue Type: Bug Components: core/index Affects Versions: 4.4 Reporter: Littlestar Labels: LZ4 java.lang.ArrayIndexOutOfBoundsException at org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:132) at org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:135) at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:336) at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:133) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.SlowCompositeReaderWrapper.document(SlowCompositeReaderWrapper.java:212) at org.apache.lucene.index.FilterAtomicReader.document(FilterAtomicReader.java:365) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.IndexReader.document(IndexReader.java:447) at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:204)
[jira] [Commented] (LUCENE-5267) java.lang.ArrayIndexOutOfBoundsException on reading data
[ https://issues.apache.org/jira/browse/LUCENE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790110#comment-13790110 ] Adrien Grand commented on LUCENE-5267: -- Can you also confirm that you are using Lucene42StoredFieldsFormat in your hybaseStd42x codec (and not, e.g., a customized CompressingStoredFieldsFormat)? java.lang.ArrayIndexOutOfBoundsException on reading data Key: LUCENE-5267 URL: https://issues.apache.org/jira/browse/LUCENE-5267 Project: Lucene - Core Issue Type: Bug Components: core/index Affects Versions: 4.4 Reporter: Littlestar Assignee: Adrien Grand Labels: LZ4 java.lang.ArrayIndexOutOfBoundsException at org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:132) at org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:135) at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:336) at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:133) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.SlowCompositeReaderWrapper.document(SlowCompositeReaderWrapper.java:212) at org.apache.lucene.index.FilterAtomicReader.document(FilterAtomicReader.java:365) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.IndexReader.document(IndexReader.java:447) at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:204)
[jira] [Commented] (LUCENE-5267) java.lang.ArrayIndexOutOfBoundsException on reading data
[ https://issues.apache.org/jira/browse/LUCENE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790117#comment-13790117 ] Littlestar commented on LUCENE-5267: dOff - matchDec < 0, so it throws java.lang.ArrayIndexOutOfBoundsException:
{noformat}
dest.length=33288,dOff=3184,matchDec=34510,matchLen=15,fastLen=16
dest.length=33288,dOff=3213,matchDec=34724,matchLen=9,fastLen=16
dest.length=33288,dOff=3229,matchDec=45058,matchLen=12,fastLen=16
dest.length=33288,dOff=3255,matchDec=20482,matchLen=9,fastLen=16
dest.length=33288,dOff=3275,matchDec=26122,matchLen=12,fastLen=16
dest.length=33288,dOff=3570,matchDec=35228,matchLen=6,fastLen=8
{noformat}
java.lang.ArrayIndexOutOfBoundsException on reading data Key: LUCENE-5267 URL: https://issues.apache.org/jira/browse/LUCENE-5267 Project: Lucene - Core Issue Type: Bug Components: core/index Affects Versions: 4.4 Reporter: Littlestar Assignee: Adrien Grand Labels: LZ4 java.lang.ArrayIndexOutOfBoundsException at org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:132) at org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:135) at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:336) at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:133) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.SlowCompositeReaderWrapper.document(SlowCompositeReaderWrapper.java:212) at org.apache.lucene.index.FilterAtomicReader.document(FilterAtomicReader.java:365) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.IndexReader.document(IndexReader.java:447) at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:204)
[jira] [Commented] (LUCENE-5267) java.lang.ArrayIndexOutOfBoundsException on reading data
[ https://issues.apache.org/jira/browse/LUCENE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790131#comment-13790131 ] Littlestar commented on LUCENE-5267:
{noformat}
// Lucene42Codec + LZ4
public final class Hybase42StandardCodec extends FilterCodec {
  public Hybase42StandardCodec() {
    super("hybaseStd42x", new Lucene42Codec());
  }
}
{noformat}
bq. disk-related issues in your system logs and share the .fdx and .fdt files of the broken segment
Too big (5G). java.lang.ArrayIndexOutOfBoundsException on reading data Key: LUCENE-5267 URL: https://issues.apache.org/jira/browse/LUCENE-5267 Project: Lucene - Core Issue Type: Bug Components: core/index Affects Versions: 4.4 Reporter: Littlestar Assignee: Adrien Grand Labels: LZ4 java.lang.ArrayIndexOutOfBoundsException at org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:132) at org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:135) at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:336) at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:133) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.SlowCompositeReaderWrapper.document(SlowCompositeReaderWrapper.java:212) at org.apache.lucene.index.FilterAtomicReader.document(FilterAtomicReader.java:365) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.IndexReader.document(IndexReader.java:447) at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:204)
[jira] [Commented] (LUCENE-5267) java.lang.ArrayIndexOutOfBoundsException on reading data
[ https://issues.apache.org/jira/browse/LUCENE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790134#comment-13790134 ] Littlestar commented on LUCENE-5267: When the ArrayIndexOutOfBoundsException is omitted (swallowed by the try/catch above), CheckIndex reports:
{noformat}
ERROR [Invalid vLong detected (negative values disallowed)]
java.lang.RuntimeException: Invalid vLong detected (negative values disallowed)
    at org.apache.lucene.store.ByteArrayDataInput.readVLong(ByteArrayDataInput.java:152)
    at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:342)
    at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:133)
    at org.apache.lucene.index.IndexReader.document(IndexReader.java:436)
    at org.apache.lucene.index.CheckIndex.testStoredFields(CheckIndex.java:1268)
    at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:626)
    at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:1903)
test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]
test: docvalues...........OK [12 docvalues fields; 7 BINARY; 3 NUMERIC; 2 SORTED; 0 SORTED_SET]
{noformat}
java.lang.ArrayIndexOutOfBoundsException on reading data Key: LUCENE-5267 URL: https://issues.apache.org/jira/browse/LUCENE-5267 Project: Lucene - Core Issue Type: Bug Components: core/index Affects Versions: 4.4 Reporter: Littlestar Assignee: Adrien Grand Labels: LZ4 java.lang.ArrayIndexOutOfBoundsException at org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:132) at org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:135) at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:336) at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:133) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.SlowCompositeReaderWrapper.document(SlowCompositeReaderWrapper.java:212) at org.apache.lucene.index.FilterAtomicReader.document(FilterAtomicReader.java:365) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.IndexReader.document(IndexReader.java:447) at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:204)
Need help regarding Boolean queries with queryparser
In our search application, queries like test usage are not returning correct results, but a query like test AND usage works fine. We are using queryparser with the standard analyzer. Could someone please help me?
RE: Need help regarding Boolean queries with queryparser
Hi, you have to write your own query parser. Look e.g. at the flexible query parser module, which can be customized. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de/ eMail: u...@thetaphi.de From: Devi pulaparti [mailto:pvkd...@gmail.com] Sent: Wednesday, October 09, 2013 9:50 AM To: dev@lucene.apache.org Subject: Need help regarding Boolean queries with queryparser In our search application, queries like test usage are not returning correct results, but a query like test AND usage works fine. We are using queryparser with the standard analyzer. Could someone please help me?
Re: Need help regarding Boolean queries with queryparser
Hi Uwe, thanks a lot for the quick reply. I am very new to Lucene. Could you please shed some light on the capabilities of queryparser? Why do we need a flexible query parser module for the symbol to work? Doesn't queryparser handle this? On Wed, Oct 9, 2013 at 1:24 PM, Uwe Schindler u...@thetaphi.de wrote: Hi, you have to write your own query parser. Look e.g. at the flexible query parser module, which can be customized. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de From: Devi pulaparti [mailto:pvkd...@gmail.com] Sent: Wednesday, October 09, 2013 9:50 AM To: dev@lucene.apache.org Subject: Need help regarding Boolean queries with queryparser In our search application, queries like test usage are not returning correct results, but a query like test AND usage works fine. We are using queryparser with the standard analyzer. Could someone please help me?
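For context on the observed difference: Lucene's classic QueryParser defaults to OR between bare terms, so test usage matches documents containing either term, while test AND usage requires both (the default can be changed via QueryParser.setDefaultOperator). A self-contained toy model of the two semantics, deliberately not using Lucene itself:

```java
import java.util.List;
import java.util.Set;

// Toy model of the two boolean semantics: with OR (the query parser's
// default operator) a document matching any query term qualifies; with AND
// every term must match.  This only illustrates the matching behaviour,
// not how Lucene parses or scores queries.
class DefaultOperatorDemo {
    static boolean matches(Set<String> docTerms, List<String> queryTerms, boolean andSemantics) {
        if (andSemantics) {
            return docTerms.containsAll(queryTerms);   // like "test AND usage"
        }
        for (String t : queryTerms) {                  // like "test usage" with default OR
            if (docTerms.contains(t)) return true;
        }
        return false;
    }
}
```

Under OR semantics a document containing only "test" still matches the two-term query, which is why the bare query can look like it returns incorrect results compared to the explicit AND form.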
[jira] [Resolved] (LUCENE-5267) java.lang.ArrayIndexOutOfBoundsException on reading data
[ https://issues.apache.org/jira/browse/LUCENE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand resolved LUCENE-5267. -- Resolution: Not A Problem bq. dOff - matchDec < 0, so it throws java.lang.ArrayIndexOutOfBoundsException bq. dest.length=33288,dOff=3184,matchDec=34510,matchLen=15,fastLen=16 Indeed, all the lines you pasted make no sense, since matchDec should be lower than dOff. To me this really looks like your index got corrupted somehow. It could be a single corrupt byte that makes LZ4 read a length on 2 bytes instead of 1, and this shift makes LZ4 try to decompress bytes that make no sense at all, explaining why the matchDecs are all higher than dOff. There are likely only a few chunks that are broken, so if you want to try to get back as many documents as possible from the corrupt segment, the following piece of code may help: https://gist.github.com/jpountz/6461246 java.lang.ArrayIndexOutOfBoundsException on reading data Key: LUCENE-5267 URL: https://issues.apache.org/jira/browse/LUCENE-5267 Project: Lucene - Core Issue Type: Bug Components: core/index Affects Versions: 4.4 Reporter: Littlestar Assignee: Adrien Grand Labels: LZ4 java.lang.ArrayIndexOutOfBoundsException at org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:132) at org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:135) at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:336) at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:133) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.SlowCompositeReaderWrapper.document(SlowCompositeReaderWrapper.java:212) at org.apache.lucene.index.FilterAtomicReader.document(FilterAtomicReader.java:365) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.IndexReader.document(IndexReader.java:447) at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:204)
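The corruption signature described in this resolution can be stated as a simple invariant: inside an LZ4 match, the back-reference distance must be positive and no larger than the current write offset. A hedged sketch of that check (not the actual LZ4.java code):

```java
// Invariant for an LZ4 match copy into dest at offset dOff: the copy reads
// from dOff - matchDec, so matchDec must satisfy 0 < matchDec <= dOff or
// the read would start before the beginning of the buffer -- exactly the
// condition violated by the debug values reported in this thread.
class Lz4MatchCheck {
    static boolean isValidMatch(int dOff, int matchDec) {
        return matchDec > 0 && matchDec <= dOff;
    }
}
```

Every (dOff, matchDec) pair in the pasted debug output fails this check, which is why the corruption is attributed to the compressed stream rather than to the decompressor.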
[jira] [Commented] (SOLR-5319) Collection ZK nodes do not reflect the correct router chosen
[ https://issues.apache.org/jira/browse/SOLR-5319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790153#comment-13790153 ] Shalin Shekhar Mangar commented on SOLR-5319: - The doc router stored in the collection zk node is not used anywhere. We should just remove that code. Collection ZK nodes do not reflect the correct router chosen Key: SOLR-5319 URL: https://issues.apache.org/jira/browse/SOLR-5319 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.4, 4.5, 5.0 Reporter: Jessica Cheng Assignee: Shalin Shekhar Mangar Labels: solrcloud, zookeeper In ZkController.createCollectionZkNode, the doc router is determined by this code snippet:
{noformat}
if (collectionProps.get(DocCollection.DOC_ROUTER) == null) {
  Object numShards = collectionProps.get(ZkStateReader.NUM_SHARDS_PROP);
  if (numShards == null) {
    numShards = System.getProperty(ZkStateReader.NUM_SHARDS_PROP);
  }
  if (numShards == null) {
    collectionProps.put(DocCollection.DOC_ROUTER, ImplicitDocRouter.NAME);
  } else {
    collectionProps.put(DocCollection.DOC_ROUTER, DocRouter.DEFAULT_NAME);
  }
}
{noformat}
Since OverseerCollectionProcessor never passes on any params prefixed with "collection." other than collection.configName in its create core commands, collectionProps.get(DocCollection.DOC_ROUTER) will never be non-null. Thus, it needs to figure out whether the router is implicit or compositeId based on whether numShards is passed in. However, collectionProps.get(ZkStateReader.NUM_SHARDS_PROP) will also always be null for the same reason collectionProps.get(DocCollection.DOC_ROUTER) is null, and it isn't explicitly set in the code above, so the only way for numShards not to be null is if it's passed in as a system property.
As an example, here's a cluster state that's created with the compositeId router, but the collection ZK node says it's implicit. In clusterstate.json:
{noformat}
"example":{
  "shards":{
    "shard1":{
      "range":"8000-7fff",
      "state":"active",
      "replicas":{
        "core_node1":{
          "state":"active",
          "core":"example_shard1_replica1",
          "node_name":"localhost:8983_solr",
          "base_url":"http://localhost:8983/solr",
          "leader":"true"}}}},
  "router":"compositeId"}
{noformat}
In /collections/example data:
{noformat}
{"configName":"myconf", "router":"implicit"}
{noformat}
I'm not sure if the collection ZK node router info is actually used anywhere, so it may not matter, but it's confusing. I think the best fix is for OverseerCollectionProcessor to pass on params prefixed with "collection." to the core creation requests. Otherwise, ZkController.createCollectionZkNode can explicitly set numShards in collectionProps from cd.getNumShards() too.
[jira] [Commented] (LUCENE-3069) Lucene should have an entirely memory resident term dictionary
[ https://issues.apache.org/jira/browse/LUCENE-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790156#comment-13790156 ] ASF subversion and git services commented on LUCENE-3069: - Commit 1530520 from [~billy] in branch 'dev/trunk' [ https://svn.apache.org/r1530520 ] LUCENE-3069: add CHANGES, move new postingsformats to oal.codecs Lucene should have an entirely memory resident term dictionary -- Key: LUCENE-3069 URL: https://issues.apache.org/jira/browse/LUCENE-3069 Project: Lucene - Core Issue Type: Improvement Components: core/index, core/search Affects Versions: 4.0-ALPHA Reporter: Simon Willnauer Assignee: Han Jiang Labels: gsoc2013 Fix For: 4.6 Attachments: df-ttf-estimate.txt, example.png, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch The FST-based TermDictionary has been a great improvement, yet it still uses a delta codec file for scanning to terms. Some environments have enough memory available to keep the entire FST-based term dict in memory. We should add a TermDictionary implementation that encodes all needed information for each term into the FST (custom fst.Output) and builds an FST from the entire term, not just the delta.
[jira] [Commented] (SOLR-5319) Collection ZK nodes do not reflect the correct router chosen
[ https://issues.apache.org/jira/browse/SOLR-5319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790157#comment-13790157 ] ASF subversion and git services commented on SOLR-5319: --- Commit 1530521 from sha...@apache.org in branch 'dev/trunk' [ https://svn.apache.org/r1530521 ] SOLR-5319: Remove unused and incorrect router name from Collection ZK nodes

Collection ZK nodes do not reflect the correct router chosen
Key: SOLR-5319 URL: https://issues.apache.org/jira/browse/SOLR-5319 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.4, 4.5, 5.0 Reporter: Jessica Cheng Assignee: Shalin Shekhar Mangar Labels: solrcloud, zookeeper Fix For: 5.0, 4.6

In ZkController.createCollectionZkNode, the doc router is determined by this code snippet:
{code}
if (collectionProps.get(DocCollection.DOC_ROUTER) == null) {
  Object numShards = collectionProps.get(ZkStateReader.NUM_SHARDS_PROP);
  if (numShards == null) {
    numShards = System.getProperty(ZkStateReader.NUM_SHARDS_PROP);
  }
  if (numShards == null) {
    collectionProps.put(DocCollection.DOC_ROUTER, ImplicitDocRouter.NAME);
  } else {
    collectionProps.put(DocCollection.DOC_ROUTER, DocRouter.DEFAULT_NAME);
  }
}
{code}
Since OverseerCollectionProcessor never passes on any params prefixed with "collection." other than collection.configName in its create-core commands, collectionProps.get(DocCollection.DOC_ROUTER) will never be non-null. The code therefore has to infer whether the router is implicit or compositeId from whether numShards is passed in. However, collectionProps.get(ZkStateReader.NUM_SHARDS_PROP) will also always be null for the same reason, and it isn't explicitly set in the code above, so the only way for numShards to be non-null is if it's passed in as a system property.
As an example, here's a cluster state that's created with the compositeId router, but the collection ZK node says it's implicit. In clusterstate.json:
{code}
"example":{
  "shards":{
    "shard1":{
      "range":"8000-7fff",
      "state":"active",
      "replicas":{
        "core_node1":{
          "state":"active",
          "core":"example_shard1_replica1",
          "node_name":"localhost:8983_solr",
          "base_url":"http://localhost:8983/solr",
          "leader":"true"}}}},
  "router":"compositeId"}
{code}
In /collections/example, the data is:
{code}
{"configName":"myconf", "router":"implicit"}
{code}
I'm not sure whether the collection ZK node's router info is actually used anywhere, so it may not matter, but it's confusing. I think the best fix is for OverseerCollectionProcessor to pass on params prefixed with "collection." to the core creation requests. Alternatively, ZkController.createCollectionZkNode can explicitly set numShards in collectionProps from cd.getNumShards().
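The defaulting logic quoted in the issue description can be reduced to a few lines. Below is a minimal, self-contained sketch (the class name is hypothetical, and the literal "implicit"/"compositeId" strings stand in for ImplicitDocRouter.NAME and DocRouter.DEFAULT_NAME; this is not the actual ZkController code) showing why, when neither a router nor a numShards param is forwarded, the ZK node always ends up marked implicit:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the router-defaulting logic in
// ZkController.createCollectionZkNode.
public class RouterDefaultDemo {
    static final String DOC_ROUTER = "router";        // stand-in for DocCollection.DOC_ROUTER
    static final String NUM_SHARDS_PROP = "numShards"; // stand-in for ZkStateReader.NUM_SHARDS_PROP

    // Returns the router name the snippet would store in collectionProps.
    static String chooseRouter(Map<String, Object> collectionProps) {
        if (collectionProps.get(DOC_ROUTER) == null) {
            Object numShards = collectionProps.get(NUM_SHARDS_PROP);
            if (numShards == null) {
                numShards = System.getProperty(NUM_SHARDS_PROP);
            }
            if (numShards == null) {
                collectionProps.put(DOC_ROUTER, "implicit");    // ImplicitDocRouter.NAME
            } else {
                collectionProps.put(DOC_ROUTER, "compositeId"); // DocRouter.DEFAULT_NAME
            }
        }
        return (String) collectionProps.get(DOC_ROUTER);
    }

    public static void main(String[] args) {
        // Neither router nor numShards arrive via the forwarded params, so the
        // node is always marked implicit unless the system property is set.
        Map<String, Object> props = new HashMap<>();
        System.out.println(chooseRouter(props));
    }
}
```

Passing numShards through (as a collection-prefixed param or via cd.getNumShards()) would flip the result to compositeId, which is the gist of the proposed fix.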
[jira] [Commented] (SOLR-5319) Collection ZK nodes do not reflect the correct router chosen
[ https://issues.apache.org/jira/browse/SOLR-5319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790158#comment-13790158 ] ASF subversion and git services commented on SOLR-5319: --- Commit 1530523 from sha...@apache.org in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1530523 ] SOLR-5319: Remove unused and incorrect router name from Collection ZK nodes
[jira] [Resolved] (SOLR-5319) Collection ZK nodes do not reflect the correct router chosen
[ https://issues.apache.org/jira/browse/SOLR-5319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar resolved SOLR-5319. - Resolution: Fixed Fix Version/s: 4.6, 5.0
[jira] [Updated] (SOLR-5318) create command don't take into account the transient core property
[ https://issues.apache.org/jira/browse/SOLR-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] olivier soyez updated SOLR-5318: Description: The create core admin command doesn't take into account the transient core property when the core is registered (so the core will never be closed by the transient core cache).

To reproduce: set transientCacheSize=2 and start with no cores. Create 3 cores:
{code}
curl "http://ip:port/solr/admin/cores?action=CREATE&name=coreX&instanceDir=path_to_instanceDir_coreX&dataDir=path_to_dataDir_coreX&loadOnStartup=false&transient=true"
{code}
Look at the status: http://ip:port/solr/admin/cores?action=STATUS
All cores are still loaded. One core should not be loaded (closed by the transient cache).

was: The create core admin command doesn't take into account the transient core property when the core is registered (so the core will never be closed by the transient core cache).

create command don't take into account the transient core property
Key: SOLR-5318 URL: https://issues.apache.org/jira/browse/SOLR-5318 Project: Solr Issue Type: Bug Components: multicore Affects Versions: 4.6 Reporter: olivier soyez Assignee: Erick Erickson Priority: Minor Attachments: SOLR-5318.patch
[jira] [Updated] (SOLR-5318) create command don't take into account the transient core property
[ https://issues.apache.org/jira/browse/SOLR-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] olivier soyez updated SOLR-5318: Description: The create core admin command doesn't take into account the transient core property when the core is registered (so the core will never be closed by the transient core cache).

To reproduce: set transientCacheSize=2 and start with no cores. Create 3 cores:
{code}
curl "http://ip:port/solr/admin/cores?action=CREATE&name=coreX&instanceDir=path_to_instanceDir_coreX&dataDir=path_to_dataDir_coreX&loadOnStartup=false&transient=true"
{code}
Look at the status: http://ip:port/solr/admin/cores?action=STATUS
All cores are still loaded. One core should not be loaded (closed by the transient cache).

create command don't take into account the transient core property
Key: SOLR-5318 URL: https://issues.apache.org/jira/browse/SOLR-5318 Project: Solr Issue Type: Bug Components: multicore Affects Versions: 4.6 Reporter: olivier soyez Assignee: Erick Erickson Priority: Minor Attachments: SOLR-5318.patch
[jira] [Commented] (SOLR-5318) create command don't take into account the transient core property
[ https://issues.apache.org/jira/browse/SOLR-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790169#comment-13790169 ] olivier soyez commented on SOLR-5318: - We are using Solr 4.2.1 in production, but I also tested Solr 4.4 and the svn branch_4x: same issue. I have completed the description and the steps to reproduce the issue. Not correlated with SOLR-4862.
[jira] [Updated] (SOLR-5318) create command don't take into account the transient core property
[ https://issues.apache.org/jira/browse/SOLR-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] olivier soyez updated SOLR-5318: Affects Version/s: 4.4

create command don't take into account the transient core property
Key: SOLR-5318 URL: https://issues.apache.org/jira/browse/SOLR-5318 Project: Solr Issue Type: Bug Components: multicore Affects Versions: 4.4, 4.6 Reporter: olivier soyez Assignee: Erick Erickson Priority: Minor Attachments: SOLR-5318.patch
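The eviction the reporter expects can be sketched with a plain access-ordered LinkedHashMap. This is only an illustration of LRU behavior at transientCacheSize=2 (the class and method names are hypothetical), not Solr's actual transient core cache implementation:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the transient core cache: an access-ordered LRU
// map that would "close" the eldest core once more than transientCacheSize
// cores are open.
public class TransientCacheDemo {
    static LinkedHashMap<String, String> newCache(int transientCacheSize) {
        // accessOrder=true makes iteration order least-recently-used first.
        return new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                // In Solr this is where the evicted core would be closed.
                return size() > transientCacheSize;
            }
        };
    }

    public static void main(String[] args) {
        LinkedHashMap<String, String> cache = newCache(2);
        cache.put("core1", "open");
        cache.put("core2", "open");
        cache.put("core3", "open"); // evicts core1, the least recently used
        System.out.println(cache.keySet()); // prints [core2, core3]
    }
}
```

With cores created via the CREATE command and the transient flag ignored at registration, no core ever enters such a cache, which is why all three stay loaded in the reproduction above.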
[jira] [Created] (SOLR-5321) Overseer.updateState tries to use router name from message but none is sent
Shalin Shekhar Mangar created SOLR-5321: --- Summary: Overseer.updateState tries to use router name from message but none is sent Key: SOLR-5321 URL: https://issues.apache.org/jira/browse/SOLR-5321 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.5 Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar Priority: Minor Fix For: 5.0, 4.6

The Overseer.updateSlice method has the following code:
{code}
String router = message.getStr(OverseerCollectionProcessor.ROUTER, DocRouter.DEFAULT_NAME);
List<String> shardNames = new ArrayList<String>();

// collection does not yet exist, create placeholders if num shards is specified
boolean collectionExists = state.getCollections().contains(collection);
if (!collectionExists && numShards != null) {
  if (ImplicitDocRouter.NAME.equals(router)) {
    getShardNames(shardNames, message.getStr("shards", null));
    numShards = shardNames.size();
  } else {
    getShardNames(numShards, shardNames);
  }
  state = createCollection(state, collection, shardNames, message);
}
{code}
Here it tries to read the router name from the message. Even if we ignore that the key used to look up the router is wrong here, the router name is never sent in a state message. Considering that we don't even support creating a collection with the implicit router from the command line, we should stop expecting the parameter.
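A minimal stand-in for the lookup shows why the implicit branch is unreachable from state messages: with no router entry in the message, the default name always wins. The getStr helper here is a hypothetical mirror of a get-with-default lookup, not the actual ZkNodeProps code:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the router lookup in Overseer.updateSlice: the state message
// never carries a router entry, so the lookup always falls back to the
// default router name.
public class OverseerRouterDemo {
    // Hypothetical mirror of a getStr(key, default)-style accessor.
    static String getStr(Map<String, Object> message, String key, String def) {
        Object v = message.get(key);
        return v == null ? def : v.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> stateMessage = new HashMap<>(); // no "router" key is ever sent
        String router = getStr(stateMessage, "router", "compositeId"); // DocRouter.DEFAULT_NAME
        System.out.println(router); // prints "compositeId"
    }
}
```

Since the default always wins, ImplicitDocRouter.NAME.equals(router) is always false here, and the implicit-router branch is dead code, which is what the fix removes.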
Re: svn commit: r1530537 - in /lucene/dev/trunk/lucene: common-build.xml ivy-settings.xml
Thanks for updating this! I think we should merge this back to branch 4.x too, so that the source code tar.gz works from China for our next release?

2013/10/9 h...@apache.org:
Author: han
Date: Wed Oct 9 08:56:15 2013
New Revision: 1530537
URL: http://svn.apache.org/r1530537
Log: update broken links for maven mirror

Modified: lucene/dev/trunk/lucene/common-build.xml lucene/dev/trunk/lucene/ivy-settings.xml

Modified: lucene/dev/trunk/lucene/common-build.xml
URL: http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/common-build.xml?rev=1530537&r1=1530536&r2=1530537&view=diff
--- lucene/dev/trunk/lucene/common-build.xml (original)
+++ lucene/dev/trunk/lucene/common-build.xml Wed Oct 9 08:56:15 2013
@@ -360,7 +360,7 @@
   <property name="ivy_install_path" location="${user.home}/.ant/lib" />
   <property name="ivy_bootstrap_url1" value="http://repo1.maven.org/maven2"/>
   <!-- you might need to tweak this from china so it works -->
-  <property name="ivy_bootstrap_url2" value="http://mirror.netcologne.de/maven2"/>
+  <property name="ivy_bootstrap_url2" value="http://uk.maven.org/maven2"/>
   <property name="ivy_checksum_sha1" value="c5ebf1c253ad4959a29f4acfe696ee48cdd9f473"/>
   <target name="ivy-availability-check" unless="ivy.available">

Modified: lucene/dev/trunk/lucene/ivy-settings.xml
URL: http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/ivy-settings.xml?rev=1530537&r1=1530536&r2=1530537&view=diff
--- lucene/dev/trunk/lucene/ivy-settings.xml (original)
+++ lucene/dev/trunk/lucene/ivy-settings.xml Wed Oct 9 08:56:15 2013
@@ -35,7 +35,7 @@
   <ibiblio name="maven.restlet.org" root="http://maven.restlet.org" m2compatible="true" />
   <!-- you might need to tweak this from china so it works -->
-  <ibiblio name="working-chinese-mirror" root="http://mirror.netcologne.de/maven2" m2compatible="true" />
+  <ibiblio name="working-chinese-mirror" root="http://uk.maven.org/maven2" m2compatible="true" />
   <!-- temporary to try Clover 3.2.0 snapshots, see https://issues.apache.org/jira/browse/LUCENE-5243, https://jira.atlassian.com/browse/CLOV-1368 -->
   <ibiblio name="atlassian-clover-snapshots" root="https://maven.atlassian.com/content/repositories/atlassian-public-snapshot" m2compatible="true" />
Re: svn commit: r1530537 - in /lucene/dev/trunk/lucene: common-build.xml ivy-settings.xml
oh, yes, I'll do that!

On Wed, Oct 9, 2013 at 5:17 PM, Robert Muir rcm...@gmail.com wrote:
Thanks for updating this! I think we should merge this back to branch 4.x too? This way the source code tar.gz is working from China for our next release?
[jira] [Commented] (LUCENE-5264) CommonTermsQuery ignores minMustMatch if only high freq terms are present.
[ https://issues.apache.org/jira/browse/LUCENE-5264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790203#comment-13790203 ] Adrien Grand commented on LUCENE-5264: -- +1 CommonTermsQuery ignores minMustMatch if only high freq terms are present. -- Key: LUCENE-5264 URL: https://issues.apache.org/jira/browse/LUCENE-5264 Project: Lucene - Core Issue Type: Bug Components: modules/other Affects Versions: 5.0, 4.5 Reporter: Simon Willnauer Assignee: Simon Willnauer Priority: Minor Fix For: 5.0, 4.6 Attachments: LUCENE-5264.patch if we only have high freq terms we move to a pure conjunction and ignore the min must match entirely if it is 0. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5310) Add a collection admin command to remove a replica
[ https://issues.apache.org/jira/browse/SOLR-5310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-5310: - Description: The only way a replica can be removed is by unloading the core. There is no way to remove a replica that is down, so the clusterstate will have unreferenced nodes if a few nodes go down over time. We need a cluster admin command to clean that up, e.g.:
{code}
/admin/collections?action=DELETEREPLICA&collection=coll1&shard=shard1&replica=core_node3
{code}
The system would first see if the replica is active. If yes, a core UNLOAD command is fired, which takes care of deleting the replica from the clusterstate as well. If the state is inactive, then the core or node may be down; in that case the entry is removed from the cluster state.

was: The only way a replica can be removed is by unloading the core. There is no way to remove a replica that is down, so the clusterstate will have unreferenced nodes if a few nodes go down over time. We need a cluster admin command to clean that up, e.g.:
{code}
/admin/collections?action=REMOVEREPLICA&collection=coll1&shard=shard1&replica=core_node3
{code}

Add a collection admin command to remove a replica
Key: SOLR-5310 URL: https://issues.apache.org/jira/browse/SOLR-5310 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Noble Paul Assignee: Noble Paul Original Estimate: 72h Remaining Estimate: 72h
[jira] [Resolved] (SOLR-5321) Overseer.updateState tries to use router name from message but none is sent
[ https://issues.apache.org/jira/browse/SOLR-5321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar resolved SOLR-5321. - Resolution: Fixed
[jira] [Commented] (SOLR-5321) Overseer.updateState tries to use router name from message but none is sent
[ https://issues.apache.org/jira/browse/SOLR-5321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790223#comment-13790223 ] ASF subversion and git services commented on SOLR-5321: --- Commit 1530555 from sha...@apache.org in branch 'dev/trunk' [ https://svn.apache.org/r1530555 ] SOLR-5321: Remove unnecessary code in Overseer.updateState method which tries to use router name from message where none is ever sent
[jira] [Commented] (SOLR-5321) Overseer.updateState tries to use router name from message but none is sent
[ https://issues.apache.org/jira/browse/SOLR-5321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790225#comment-13790225 ] ASF subversion and git services commented on SOLR-5321: --- Commit 1530556 from sha...@apache.org in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1530556 ] SOLR-5321: Remove unnecessary code in Overseer.updateState method which tries to use router name from message where none is ever sent
[jira] [Created] (SOLR-5322) Permisions didn't check when call discoverUnder
Said Chavkin created SOLR-5322: -- Summary: Permissions aren't checked when calling discoverUnder Key: SOLR-5322 URL: https://issues.apache.org/jira/browse/SOLR-5322 Project: Solr Issue Type: Bug Components: contrib - Clustering Affects Versions: 4.5 Environment: Centos 6.4 tomcat6 Reporter: Said Chavkin Hello. When the solr/home directory contains a subdirectory that Solr does not have permission to read, Solr fails to start with this exception: 2108 [main] INFO org.apache.solr.core.CoresLocator - Looking for core definitions underneath /var/lib/solr 2109 [main] ERROR org.apache.solr.servlet.SolrDispatchFilter - Could not start Solr. Check solr/home property and the logs 2138 [main] ERROR org.apache.solr.core.SolrCore - null:java.lang.NullPointerException at org.apache.solr.core.CorePropertiesLocator.discoverUnder(CorePropertiesLocator.java:121) at org.apache.solr.core.CorePropertiesLocator.discoverUnder(CorePropertiesLocator.java:130) at org.apache.solr.core.CorePropertiesLocator.discover(CorePropertiesLocator.java:113) at org.apache.solr.core.CoreContainer.load(CoreContainer.java:226) at org.apache.solr.servlet.SolrDispatchFilter.createCoreContainer(SolrDispatchFilter.java:177) at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:127) at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:295) at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:422) at org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:115) at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3838) at org.apache.catalina.core.StandardContext.start(StandardContext.java:4488) at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:791) at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:771) at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:526) at
org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:637) at org.apache.catalina.startup.HostConfig.deployDescriptors(HostConfig.java:563) at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:498) at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1277) at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:321) at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:119) at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1053) at org.apache.catalina.core.StandardHost.start(StandardHost.java:722) at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1045) at org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443) at org.apache.catalina.core.StandardService.start(StandardService.java:516) at org.apache.catalina.core.StandardServer.start(StandardServer.java:710) at org.apache.catalina.startup.Catalina.start(Catalina.java:593) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:289) at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:414) 2138 [main] INFO org.apache.solr.servlet.SolrDispatchFilter - SolrDispatchFilter.init() done For example: the Solr home is located at /var/lib/solr. /var/lib/solr is a separate file system, so it contains a lost+found directory. As a result, Solr can't start. Yours faithfully.
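A plausible cause (an assumption, not verified against the Solr sources): File.listFiles() returns null, rather than an empty array, when a directory cannot be read, and iterating over that null result would throw exactly the NullPointerException seen at CorePropertiesLocator.discoverUnder. A minimal defensive sketch, with hypothetical class and method names:

```java
import java.io.File;

// Hypothetical helper illustrating the failure mode: File.listFiles()
// returns null (not an empty array) when a directory is unreadable or is
// not a directory at all, and iterating over null throws an NPE.
public class SafeDiscovery {
    // Returns the entries of dir, skipping it with a warning instead of
    // crashing when the JVM lacks permission to read it.
    public static File[] listEntriesSafely(File dir) {
        File[] entries = dir.listFiles();
        if (entries == null) {
            System.err.println("WARN: cannot read directory (permissions?): " + dir);
            return new File[0]; // skip instead of NullPointerException
        }
        return entries;
    }

    public static void main(String[] args) {
        // A path that does not exist behaves like an unreadable one:
        // listFiles() returns null, and the guard turns that into an empty array.
        File bogus = new File("/nonexistent-path-for-demo-12345");
        System.out.println(listEntriesSafely(bogus).length); // prints 0
    }
}
```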
[jira] [Commented] (LUCENE-5267) java.lang.ArrayIndexOutOfBoundsException on reading data
[ https://issues.apache.org/jira/browse/LUCENE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790236#comment-13790236 ] Littlestar commented on LUCENE-5267: Thanks, most of the records were recovered. But why did the index get corrupted? Maybe the compressor or the writer has a bug... java.lang.ArrayIndexOutOfBoundsException on reading data Key: LUCENE-5267 URL: https://issues.apache.org/jira/browse/LUCENE-5267 Project: Lucene - Core Issue Type: Bug Components: core/index Affects Versions: 4.4 Reporter: Littlestar Assignee: Adrien Grand Labels: LZ4 java.lang.ArrayIndexOutOfBoundsException at org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:132) at org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:135) at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:336) at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:133) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.SlowCompositeReaderWrapper.document(SlowCompositeReaderWrapper.java:212) at org.apache.lucene.index.FilterAtomicReader.document(FilterAtomicReader.java:365) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.IndexReader.document(IndexReader.java:447) at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:204)
[jira] [Assigned] (LUCENE-5236) Use broadword bit selection in EliasFanoDecoder
[ https://issues.apache.org/jira/browse/LUCENE-5236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand reassigned LUCENE-5236: Assignee: Adrien Grand Use broadword bit selection in EliasFanoDecoder --- Key: LUCENE-5236 URL: https://issues.apache.org/jira/browse/LUCENE-5236 Project: Lucene - Core Issue Type: Improvement Reporter: Paul Elschot Assignee: Adrien Grand Priority: Minor Attachments: LUCENE-5236.patch, LUCENE-5236.patch, TestDocIdSetBenchmark.java Try and speed up decoding
[jira] [Commented] (LUCENE-5267) java.lang.ArrayIndexOutOfBoundsException on reading data
[ https://issues.apache.org/jira/browse/LUCENE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790248#comment-13790248 ] Adrien Grand commented on LUCENE-5267: -- Good question. I've had this issue myself once, and the dmesg of the system was full of disk-related errors, so something really bad probably happened with the disk. I am actually thinking of adding some basic checksumming to the future stored fields format (4 bytes per chunk, which wouldn't hurt the compression ratio much) in order to be able to easily distinguish index corruptions from bugs in the stored fields format (and especially the compression layer).
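The per-chunk checksum idea can be sketched with a plain CRC32 (this is an illustration of the concept only, not Lucene's actual stored fields format; all names here are made up):

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Illustrative sketch: append a 4-byte CRC32 to each stored-fields chunk so
// that on-disk corruption can be told apart from decompressor bugs at read
// time. Chunk layout: payload bytes followed by 4 checksum bytes.
public class ChunkChecksum {
    static int crc(byte[] data) {
        CRC32 c = new CRC32();
        c.update(data, 0, data.length);
        return (int) c.getValue(); // CRC32 fits in exactly 4 bytes
    }

    public static byte[] writeChunk(byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(payload.length + 4);
        buf.put(payload);
        buf.putInt(crc(payload));
        return buf.array();
    }

    // Returns true when the stored checksum matches the payload.
    public static boolean verifyChunk(byte[] chunk) {
        ByteBuffer buf = ByteBuffer.wrap(chunk);
        byte[] payload = new byte[chunk.length - 4];
        buf.get(payload);
        return buf.getInt() == crc(payload);
    }

    public static void main(String[] args) {
        byte[] chunk = writeChunk("some stored fields".getBytes());
        System.out.println(verifyChunk(chunk)); // true
        chunk[3] ^= 0x7f;                       // simulate a flipped bit on disk
        System.out.println(verifyChunk(chunk)); // false
    }
}
```

A mismatch here would point at the disk; a clean checksum over undecompressable bytes would point at the compression layer.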
[jira] [Updated] (LUCENE-5261) add simple API to build queries from analysis chain
[ https://issues.apache.org/jira/browse/LUCENE-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-5261: Attachment: LUCENE-5261.patch Simplified patch: * I removed get/set defaultOperator and slop, restoring these to the QPs (so fewer changes there, including no API impact). * I removed the operator enum completely and just use Occur for that. * Instead, createFieldQuery just takes Occur and slop as parameters. * Added javadocs. On the direct-use side, I just added createBooleanQuery(String,String,Occur) and createPhraseQuery(String,String,int). I think this is much more intuitive; these parameters are really per-query anyway and shouldn't be getters/setters on this class. (That's just brain damage from our crazy QP.) I think this is ready. add simple API to build queries from analysis chain --- Key: LUCENE-5261 URL: https://issues.apache.org/jira/browse/LUCENE-5261 Project: Lucene - Core Issue Type: New Feature Reporter: Robert Muir Attachments: LUCENE-5261.patch, LUCENE-5261.patch, LUCENE-5261.patch Currently this is pretty crazy stuff. Additionally, it is duplicated in three or four places in our codebase (I noticed it while doing LUCENE-5259). We can solve that duplication, make it easy to simply create queries from an analyzer (it's been asked about on the user list), and make it easier to build new query parsers.
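To make the proposed createBooleanQuery(field, text, occur) shape concrete, here is a deliberately toy model: it runs the text through a stand-in "analysis chain" (whitespace split plus lowercasing) and combines one clause per token under a single occur flag. This mimics the API shape only; it is not the real Lucene QueryBuilder:

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration of building a boolean query from an analysis chain.
// The "analyzer" here is just whitespace tokenization plus lowercasing,
// and the "query" is a printable string rather than a Lucene Query.
public class ToyQueryBuilder {
    public enum Occur { MUST, SHOULD }

    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) tokens.add(t.toLowerCase());
        }
        return tokens;
    }

    // Builds "field:tok1 field:tok2 ..." with a '+' prefix for MUST clauses.
    public static String createBooleanQuery(String field, String text, Occur occur) {
        List<String> clauses = new ArrayList<>();
        for (String token : analyze(text)) {
            clauses.add((occur == Occur.MUST ? "+" : "") + field + ":" + token);
        }
        return String.join(" ", clauses);
    }

    public static void main(String[] args) {
        System.out.println(createBooleanQuery("body", "Fast Fuzzy Search", Occur.MUST));
        // +body:fast +body:fuzzy +body:search
    }
}
```

The point of the patch is that this per-token plumbing lives in one place instead of being reimplemented by every query parser.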
[jira] [Commented] (SOLR-2548) Multithreaded faceting
[ https://issues.apache.org/jira/browse/SOLR-2548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790257#comment-13790257 ] Erick Erickson commented on SOLR-2548: -- 1. no. Could be extended too, I think, if you have the energy. 2. no 3. yes 4. all Multithreaded faceting -- Key: SOLR-2548 URL: https://issues.apache.org/jira/browse/SOLR-2548 Project: Solr Issue Type: Improvement Components: search Affects Versions: 3.1 Reporter: Janne Majaranta Assignee: Erick Erickson Priority: Minor Labels: facet Fix For: 4.5, 5.0 Attachments: SOLR-2548_4.2.1.patch, SOLR-2548_for_31x.patch, SOLR-2548_multithreaded_faceting,_dsmiley.patch, SOLR-2548_multithreaded_faceting,_dsmiley.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch Add multithreading support for faceting.
[jira] [Resolved] (SOLR-5322) Permissions aren't checked when calling discoverUnder
[ https://issues.apache.org/jira/browse/SOLR-5322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson resolved SOLR-5322. -- Resolution: Invalid Please raise this kind of issue on the user's list before raising a JIRA, to see whether it's really a bug in Solr or a configuration issue. You can reopen this if you think it's something Solr should manage. What would you have Solr do? If it's not being run as a process that has permissions to a necessary directory, what can it do _but_ fail on startup? You as the sysadmin are responsible for permissions.
[jira] [Created] (SOLR-5323) Solr requires -Dsolr.clustering.enabled=false when pointing at example config
John Berryman created SOLR-5323: --- Summary: Solr requires -Dsolr.clustering.enabled=false when pointing at example config Key: SOLR-5323 URL: https://issues.apache.org/jira/browse/SOLR-5323 Project: Solr Issue Type: Bug Components: contrib - Clustering Affects Versions: 4.5 Environment: vanilla mac Reporter: John Berryman Fix For: 5.0, 4.6 My typical use of Solr is something like this: cd SOLR_HOME/example cp -r solr /myProjectDir/solr_home java -jar -Dsolr.solr.home=/myProjectDir/solr_home start.jar But in Solr 4.5.0 this fails to start. I get an error: org.apache.solr.common.SolrException: Error loading class 'solr.clustering.ClusteringComponent' The reason is that solr.clustering.enabled now defaults to true. I don't know why this might be the case. You can get around it with java -jar -Dsolr.solr.home=/myProjectDir/solr_home -Dsolr.clustering.enabled=false start.jar SOLR-4708 is where this became an issue.
[jira] [Updated] (SOLR-5323) Solr requires -Dsolr.clustering.enabled=false when pointing at example config
[ https://issues.apache.org/jira/browse/SOLR-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Hatcher updated SOLR-5323: --- Description: My typical use of Solr is something like this: {code} cd SOLR_HOME/example cp -r solr /myProjectDir/solr_home java -jar -Dsolr.solr.home=/myProjectDir/solr_home start.jar {code} But in Solr 4.5.0 this fails to start. I get an error: {code} org.apache.solr.common.SolrException: Error loading class 'solr.clustering.ClusteringComponent' {code} The reason is that solr.clustering.enabled now defaults to true. I don't know why this might be the case. You can get around it with {code} java -jar -Dsolr.solr.home=/myProjectDir/solr_home -Dsolr.clustering.enabled=false start.jar {code} SOLR-4708 is where this became an issue. was: my typical use of Solr is something like this: cd SOLR_HOME/example cp -r solr /myProjectDir/solr_home java -jar -Dsolr.solr.home=/myProjectDir/solr_home start.jar But in solr 4.5.0 this fails to start successfully. I get an error: org.apache.solr.common.SolrException: Error loading class 'solr.clustering.ClusteringComponent' The reason is because solr.clustering.enabled defaults to true now. I don't know why this might be the case. you can get around it with java -jar -Dsolr.solr.home=/myProjectDir/solr_home -Dsolr.clustering.enabled=false start.jar SOLR-4708 is when this became an issue.
[jira] [Commented] (SOLR-5323) Solr requires -Dsolr.clustering.enabled=false when pointing at example config
[ https://issues.apache.org/jira/browse/SOLR-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790362#comment-13790362 ] Erik Hatcher commented on SOLR-5323: I think we should have the lib elements in solrconfig.xml be something like this: {code} <lib dir="${solr.install.dir}/contrib/clustering/lib/" regex=".*\.jar" /> {code} where solr.install.dir is a property defined automatically by Solr at startup that holds the root of the Solr installation. I've done this manually by adjusting the configuration in this exact scenario (copying the example configuration, changing all the lib directives in this way, and defining solr.install.dir on the command line), but Solr should be able to do this better.
[jira] [Updated] (LUCENE-5266) Optimization of the direct PackedInts readers
[ https://issues.apache.org/jira/browse/LUCENE-5266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-5266: Attachment: LUCENE-5266.patch Here is a patch from playing around this morning. I'm afraid of specialization here, but this one should help at relatively low bpv, I think, by using readShort. Optimization of the direct PackedInts readers - Key: LUCENE-5266 URL: https://issues.apache.org/jira/browse/LUCENE-5266 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Attachments: LUCENE-5266.patch Given that the initial focus for PackedInts readers was more on in-memory readers (for storing stuff like the mapping from old to new doc IDs at merging time), I never spent time trying to optimize the direct readers, although it could be beneficial now that they are used for disk-based doc values.
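For context on what a "direct" reader has to do, here is an illustrative bit-packing sketch (this is not Lucene's PackedInts code): reading an arbitrary bitsPerValue-wide value means shifting and masking across byte boundaries on every call, which is the per-value cost that specializations such as a readShort fast path for low bpv try to cut down:

```java
// Illustrative LSB-first bit packing, NOT Lucene's PackedInts implementation:
// it shows the byte-at-a-time shifting a direct (on-disk) reader must do for
// an arbitrary bitsPerValue.
public class PackedSketch {
    // Write 'value' (bitsPerValue bits) at logical index 'index'.
    public static void set(byte[] packed, int bitsPerValue, int index, long value) {
        long bitPos = (long) index * bitsPerValue;
        int remaining = bitsPerValue;
        while (remaining > 0) {
            int byteIndex = (int) (bitPos >>> 3);
            int bitOffset = (int) (bitPos & 7);
            int take = Math.min(8 - bitOffset, remaining); // bits that fit in this byte
            long bits = (value >>> (bitsPerValue - remaining)) & ((1L << take) - 1);
            packed[byteIndex] |= (byte) (bits << bitOffset);
            bitPos += take;
            remaining -= take;
        }
    }

    // Read the value back; each call may touch two bytes per 8 bits read.
    public static long get(byte[] packed, int bitsPerValue, int index) {
        long bitPos = (long) index * bitsPerValue;
        int remaining = bitsPerValue;
        long value = 0;
        while (remaining > 0) {
            int byteIndex = (int) (bitPos >>> 3);
            int bitOffset = (int) (bitPos & 7);
            int take = Math.min(8 - bitOffset, remaining);
            long bits = ((packed[byteIndex] & 0xFFL) >>> bitOffset) & ((1L << take) - 1);
            value |= bits << (bitsPerValue - remaining);
            bitPos += take;
            remaining -= take;
        }
        return value;
    }

    public static void main(String[] args) {
        int bpv = 5, n = 32;
        byte[] packed = new byte[(n * bpv + 7) / 8]; // 20 bytes for 32 5-bit values
        for (int i = 0; i < n; i++) set(packed, bpv, i, i);
        for (int i = 0; i < n; i++) {
            if (get(packed, bpv, i) != i) throw new AssertionError("round-trip failed at " + i);
        }
        System.out.println("32 values round-tripped at 5 bits per value");
    }
}
```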
[jira] [Created] (SOLR-5324) Make sub shard replica recovery and shard state switch asynchronous
Shalin Shekhar Mangar created SOLR-5324: --- Summary: Make sub shard replica recovery and shard state switch asynchronous Key: SOLR-5324 URL: https://issues.apache.org/jira/browse/SOLR-5324 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar Fix For: 5.0, 4.6 Currently the shard split command waits for all replicas of all sub shards to recover and then switches the state of parent to inactive and sub-shards to active. The problem is that shard split (ab)uses the CoreAdmin WaitForState action to ask the sub shard leader to wait until the replica states are active. This action is prone to timeout. We should make the shard state switching asynchronous. Once all replicas of all sub-shards are 'active', the shard states should be switched automatically.
[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790411#comment-13790411 ] Mark Miller commented on SOLR-1301: --- I have a new patch I'm cleaning up that tackles some of the packaging: * Split out solr-morphlines-core and solr-morphlines-cell into their own modules. * Updated to trunk, and the new modules are now using the new dependency version tracking system. * Fixed an issue in the code around the TokenStream contract being violated - the latest code detected this and failed a test - end and close are now called. * Updated to use Morphlines from CDK 0.8. * Set up the main class in the solr-mr jar manifest. * I enabled an ignored test which exposed a few bugs because of the required solr.xml in Solr 5.0 - I addressed those bugs. * Added a missing metrics health-check dependency that somehow popped up. * I played around with naming the solr-mr artifact MapReduceIndexTool.jar, but the system really wants us to follow the artifact rules and have something like solr-solr-mr-5.0.jar. Anything else has some random issues, such as with javadoc, and if your name does not start with solr-, it will be changed to start with lucene-. I'm not yet sure if it's worth the trouble to expand the system or use a different name, so for now it's still just using the default jar name based on the contrib module name (solr-mr). Besides the naming issue, there are a couple other things to button up: * How we are going to set up the classpath - script, in the manifest, leave it up to the user and doc, etc. * All dependencies are currently in solr-morphlines-core - this was a simple way to split out the modules since solr-mr and solr-morphlines-cell depend on solr-morphlines-core. Finally, we will probably need some help from [~steve_rowe] to get the Maven build set up correctly. I spent a bunch of time trying to use asm to work around the hacked test policy issue.
There are multiple problems I ran into. One is that another module uses asm 4.1, but Hadoop brings in asm 3.1 - if you are doing some asm coding, this can cause compile issues with your ide (at least eclipse). It also ends up being really hard to get an injection in the right place because of how the yarn code is structured. After spending a bunch of time trying to get this to work, I'm backing out and considering other options. Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce. - Key: SOLR-1301 URL: https://issues.apache.org/jira/browse/SOLR-1301 Project: Solr Issue Type: New Feature Reporter: Andrzej Bialecki Assignee: Mark Miller Fix For: 4.6 Attachments: commons-logging-1.0.4.jar, commons-logging-api-1.0.4.jar, hadoop-0.19.1-core.jar, hadoop-0.20.1-core.jar, hadoop-core-0.20.2-cdh3u3.jar, hadoop.patch, log4j-1.2.15.jar, README.txt, SOLR-1301-hadoop-0-20.patch, SOLR-1301-hadoop-0-20.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SolrRecordWriter.java This patch contains a contrib module that provides distributed indexing (using Hadoop) to Solr EmbeddedSolrServer. The idea behind this module is twofold: * provide an API that is familiar to Hadoop developers, i.e. that of OutputFormat * avoid unnecessary export and (de)serialization of data maintained on HDFS. SolrOutputFormat consumes data produced by reduce tasks directly, without storing it in intermediate files. Furthermore, by using an EmbeddedSolrServer, the indexing task is split into as many parts as there are reducers, and the data to be indexed is not sent over the network. Design -- Key/value pairs produced by reduce tasks are passed to SolrOutputFormat, which in turn uses SolrRecordWriter to write this data. 
SolrRecordWriter instantiates an EmbeddedSolrServer, and it also instantiates an implementation of SolrDocumentConverter, which is responsible for turning Hadoop (key, value) into a SolrInputDocument. This data is then added to a batch, which is periodically submitted to EmbeddedSolrServer. When reduce task completes, and the OutputFormat is closed, SolrRecordWriter calls commit() and optimize() on the EmbeddedSolrServer. The API provides facilities to specify an arbitrary existing solr.home directory, from which the conf/ and lib/ files will be taken. This process results in the creation of as many partial Solr home directories as there were reduce tasks. The output shards are placed in the output directory on the default
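The batching flow described above (buffer documents, submit periodically, flush and commit on close) can be sketched like this, with a stub standing in for EmbeddedSolrServer; the class and method names here are illustrative, not the contrib's real ones:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the SolrRecordWriter batching pattern: documents are buffered
// and submitted in batches, and close() flushes the remainder and commits.
// DocSink is a stub for the role EmbeddedSolrServer plays in the design.
public class BatchingWriterSketch {
    public interface DocSink {
        void add(List<String> batch);   // index a batch of documents
        void commit();                  // make them searchable
    }

    private final DocSink sink;
    private final int batchSize;
    private final List<String> buffer = new ArrayList<>();

    public BatchingWriterSketch(DocSink sink, int batchSize) {
        this.sink = sink;
        this.batchSize = batchSize;
    }

    public void write(String doc) {     // called once per reduce output pair
        buffer.add(doc);
        if (buffer.size() >= batchSize) flush();
    }

    private void flush() {
        if (!buffer.isEmpty()) {
            sink.add(new ArrayList<>(buffer));
            buffer.clear();
        }
    }

    public void close() {               // OutputFormat close: flush + commit
        flush();
        sink.commit();
    }

    public static void main(String[] args) {
        DocSink stdout = new DocSink() {
            public void add(List<String> batch) { System.out.println("add batch of " + batch.size()); }
            public void commit() { System.out.println("commit"); }
        };
        BatchingWriterSketch w = new BatchingWriterSketch(stdout, 100);
        for (int i = 0; i < 250; i++) w.write("doc" + i);
        w.close(); // add batch of 100, add batch of 100, add batch of 50, commit
    }
}
```

Because each reducer owns its own writer (and embedded server), no document crosses the network during indexing, which is the point of the design.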
[jira] [Commented] (LUCENE-5264) CommonTermsQuery ignores minMustMatch if only high freq terms are present.
[ https://issues.apache.org/jira/browse/LUCENE-5264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790413#comment-13790413 ] ASF subversion and git services commented on LUCENE-5264: - Commit 1530651 from [~simonw] in branch 'dev/trunk' [ https://svn.apache.org/r1530651 ] LUCENE-5264: CommonTermsQuery ignores minMustMatch if only high freq terms are present CommonTermsQuery ignores minMustMatch if only high freq terms are present. -- Key: LUCENE-5264 URL: https://issues.apache.org/jira/browse/LUCENE-5264 Project: Lucene - Core Issue Type: Bug Components: modules/other Affects Versions: 5.0, 4.5 Reporter: Simon Willnauer Assignee: Simon Willnauer Priority: Minor Fix For: 5.0, 4.6 Attachments: LUCENE-5264.patch if we only have high freq terms we move to a pure conjunction and ignore the min must match entirely if it is 0.
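A toy model of the reported behavior (illustrative names and a string "query plan" output; this is not Lucene's API): when no low-frequency terms survive, the query degenerated into a pure conjunction, whereas honoring minMustMatch means emitting optional clauses with a minimum-should-match constraint:

```java
import java.util.List;

// Toy model of the CommonTermsQuery high-freq-only case. plan() returns a
// readable description: either a pure conjunction "AND(t1,t2)" (only valid
// when no minMustMatch is set) or a disjunction with a minimum-should-match
// constraint "OR(t1,t2)~mm".
public class HighFreqOnlySketch {
    public static String plan(List<String> highFreq, float minMustMatch) {
        String terms = String.join(",", highFreq);
        if (minMustMatch <= 0f) {
            return "AND(" + terms + ")";           // conjunction is fine here
        }
        int mm = Math.round(minMustMatch >= 1f
                ? minMustMatch                     // absolute clause count
                : minMustMatch * highFreq.size()); // ratio of the clauses
        return "OR(" + terms + ")~" + mm;          // honor minShouldMatch
    }
}
```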
[jira] [Commented] (LUCENE-5264) CommonTermsQuery ignores minMustMatch if only high freq terms are present.
[ https://issues.apache.org/jira/browse/LUCENE-5264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790421#comment-13790421 ] ASF subversion and git services commented on LUCENE-5264: - Commit 1530657 from [~simonw] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1530657 ] LUCENE-5264: CommonTermsQuery ignores minMustMatch if only high freq terms are present
[jira] [Commented] (SOLR-2548) Multithreaded faceting
[ https://issues.apache.org/jira/browse/SOLR-2548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790425#comment-13790425 ] Markus Jelsma commented on SOLR-2548: - I'm having a hard time measuring performance differences with and without facet.threads. On my development machine there are no differences on warmed indexes; both measure around 1 ms. They're also almost identical after a stop/start of Jetty with no warm-up queries, around 40 ms; after that, fast again. We're faceting on four fields this time, and there are also four threads.
[jira] [Updated] (SOLR-5324) Make sub shard replica recovery and shard state switch asynchronous
[ https://issues.apache.org/jira/browse/SOLR-5324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar updated SOLR-5324: Attachment: SOLR-5324.patch Changes: # A new shard state, 'recovery', is added. # After all sub-shard replicas have been created, the sub-shard state is set to 'recovery'. If the replication factor is 1, then the sub-shards are set to 'active'. The splitshard API returns at this point. # The state change events in the overseer are used to track when all replicas of all sub-shards become 'active'. Once that happens, the parent shard is set to 'inactive' and the sub-shards are set to 'active'. # To facilitate the above, a slice property called 'parent' is introduced, which is removed once the slice becomes 'active'. # If the split is retried, then we use the 'deleteshard' API to completely remove the sub-shards before starting the splitting process.
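The state flow in these changes can be sketched as a small model (illustrative only; not SolrCloud's actual Overseer code): sub-shards carry a 'parent' property while in 'recovery', and the switch runs only once every replica of every sub-shard is active:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of the asynchronous shard-state switch: on each replica
// state-change event, check whether all replicas of all sub-shards of a
// parent are active; if so, flip parent -> INACTIVE and subs -> ACTIVE,
// dropping the 'parent' property as described in the patch notes.
public class ShardSwitchSketch {
    public enum State { ACTIVE, RECOVERY, INACTIVE }

    public static class Slice {
        public State state;
        public String parent;                      // null once the slice is active
        public List<State> replicas = new ArrayList<>();
        public Slice(State state, String parent) { this.state = state; this.parent = parent; }
    }

    // Called on state-change events; returns true when the switch ran.
    public static boolean maybeSwitch(Map<String, Slice> slices, String parentName) {
        List<Slice> subs = new ArrayList<>();
        for (Slice s : slices.values()) {
            if (parentName.equals(s.parent)) subs.add(s);
        }
        if (subs.isEmpty()) return false;
        for (Slice sub : subs) {
            for (State r : sub.replicas) {
                if (r != State.ACTIVE) return false;  // some replica still recovering
            }
        }
        slices.get(parentName).state = State.INACTIVE;
        for (Slice sub : subs) {
            sub.state = State.ACTIVE;
            sub.parent = null;                        // 'parent' property removed
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Slice> slices = new HashMap<>();
        slices.put("shard1", new Slice(State.ACTIVE, null));
        Slice sub = new Slice(State.RECOVERY, "shard1");
        sub.replicas.add(State.ACTIVE);
        slices.put("shard1_0", sub);
        System.out.println(maybeSwitch(slices, "shard1")); // true: only replica is active
    }
}
```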
[jira] [Commented] (SOLR-2548) Multithreaded faceting
[ https://issues.apache.org/jira/browse/SOLR-2548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790435#comment-13790435 ] Markus Jelsma commented on SOLR-2548: - Alright, I took another index and faceted on many more fields, and now I see a small improvement after startup of about 12%. It is not much; perhaps this machine is too fast in this case. Multithreaded faceting -- Key: SOLR-2548 URL: https://issues.apache.org/jira/browse/SOLR-2548 Project: Solr Issue Type: Improvement Components: search Affects Versions: 3.1 Reporter: Janne Majaranta Assignee: Erick Erickson Priority: Minor Labels: facet Fix For: 4.5, 5.0 Attachments: SOLR-2548_4.2.1.patch, SOLR-2548_for_31x.patch, SOLR-2548_multithreaded_faceting,_dsmiley.patch, SOLR-2548_multithreaded_faceting,_dsmiley.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch Add multithreading support for faceting. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-5264) CommonTermsQuery ignores minMustMatch if only high freq terms are present.
[ https://issues.apache.org/jira/browse/LUCENE-5264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer resolved LUCENE-5264. - Resolution: Fixed Lucene Fields: New,Patch Available (was: New) CommonTermsQuery ignores minMustMatch if only high freq terms are present. -- Key: LUCENE-5264 URL: https://issues.apache.org/jira/browse/LUCENE-5264 Project: Lucene - Core Issue Type: Bug Components: modules/other Affects Versions: 5.0, 4.5 Reporter: Simon Willnauer Assignee: Simon Willnauer Priority: Minor Fix For: 5.0, 4.6 Attachments: LUCENE-5264.patch if we only have high freq terms we move to a pure conjunction and ignore the min must match entirely if it is 0. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5323) Solr requires -Dsolr.clustering.enabled=false when pointing at example config
[ https://issues.apache.org/jira/browse/SOLR-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790452#comment-13790452 ] Erik Hatcher commented on SOLR-5323: This isn't specific to the clustering component, except that it gets loaded non-lazily. See these comments: https://issues.apache.org/jira/browse/SOLR-4708?focusedCommentId=13630567page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13630567 Solr requires -Dsolr.clustering.enabled=false when pointing at example config - Key: SOLR-5323 URL: https://issues.apache.org/jira/browse/SOLR-5323 Project: Solr Issue Type: Bug Components: contrib - Clustering Affects Versions: 4.5 Environment: vanilla mac Reporter: John Berryman Fix For: 5.0, 4.6 my typical use of Solr is something like this: {code} cd SOLR_HOME/example cp -r solr /myProjectDir/solr_home java -jar -Dsolr.solr.home=/myProjectDir/solr_home start.jar {code} But in solr 4.5.0 this fails to start successfully. I get an error: {code} org.apache.solr.common.SolrException: Error loading class 'solr.clustering.ClusteringComponent' {code} The reason is because solr.clustering.enabled defaults to true now. I don't know why this might be the case. you can get around it with {code} java -jar -Dsolr.solr.home=/myProjectDir/solr_home -Dsolr.clustering.enabled=false start.jar {code} SOLR-4708 is when this became an issue. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5255) Make DocumentsWriter reference final in IW
[ https://issues.apache.org/jira/browse/LUCENE-5255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790519#comment-13790519 ] ASF subversion and git services commented on LUCENE-5255: - Commit 1530679 from [~simonw] in branch 'dev/trunk' [ https://svn.apache.org/r1530679 ] LUCENE-5255: Make DocumentsWriter reference final in IW Make DocumentsWriter reference final in IW -- Key: LUCENE-5255 URL: https://issues.apache.org/jira/browse/LUCENE-5255 Project: Lucene - Core Issue Type: Improvement Components: core/index Affects Versions: 5.0, 4.6 Reporter: Simon Willnauer Fix For: 5.0, 4.6 Attachments: LUCENE-5255.patch the DocumentWriter ref is nulled on close which seems unnecessary altogether. We can just make it final instead. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5255) Make DocumentsWriter reference final in IW
[ https://issues.apache.org/jira/browse/LUCENE-5255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790529#comment-13790529 ] ASF subversion and git services commented on LUCENE-5255: - Commit 1530685 from [~simonw] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1530685 ] LUCENE-5255: Make DocumentsWriter reference final in IW Make DocumentsWriter reference final in IW -- Key: LUCENE-5255 URL: https://issues.apache.org/jira/browse/LUCENE-5255 Project: Lucene - Core Issue Type: Improvement Components: core/index Affects Versions: 5.0, 4.6 Reporter: Simon Willnauer Fix For: 5.0, 4.6 Attachments: LUCENE-5255.patch the DocumentWriter ref is nulled on close which seems unnecessary altogether. We can just make it final instead. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2548) Multithreaded faceting
[ https://issues.apache.org/jira/browse/SOLR-2548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790544#comment-13790544 ] David Smiley commented on SOLR-2548: Multithreaded faceting is useful when your CPU core count is much greater than the number of Solr cores you have, and you have a ton of data and need to facet on multiple fields. You could theoretically get similar results by sharding more but you should limit sharding based on disk IO capabilities (especially when there's so much it won't get in RAM), which isn't necessary one-for-one with the CPU count. Multithreaded faceting -- Key: SOLR-2548 URL: https://issues.apache.org/jira/browse/SOLR-2548 Project: Solr Issue Type: Improvement Components: search Affects Versions: 3.1 Reporter: Janne Majaranta Assignee: Erick Erickson Priority: Minor Labels: facet Fix For: 4.5, 5.0 Attachments: SOLR-2548_4.2.1.patch, SOLR-2548_for_31x.patch, SOLR-2548_multithreaded_faceting,_dsmiley.patch, SOLR-2548_multithreaded_faceting,_dsmiley.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch Add multithreading support for faceting. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
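The idea in the comment above — one facet task per field, run concurrently when CPU cores outnumber Solr cores — can be sketched in plain Java. This is a toy illustration of the concurrency shape only, not Solr's actual faceting code; the document/field representation here is made up for the example:

```java
import java.util.*;
import java.util.concurrent.*;

public class ParallelFacets {
    // Count value frequencies for one field across all docs (a "facet").
    static Map<String, Integer> facet(List<Map<String, String>> docs, String field) {
        Map<String, Integer> counts = new HashMap<>();
        for (Map<String, String> doc : docs) {
            String v = doc.get(field);
            if (v != null) counts.merge(v, 1, Integer::sum);
        }
        return counts;
    }

    // Facet several fields concurrently, one task per field,
    // analogous to what a facet.threads-style option enables.
    static Map<String, Map<String, Integer>> facetAll(List<Map<String, String>> docs,
                                                      List<String> fields, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Map<String, Future<Map<String, Integer>>> futures = new LinkedHashMap<>();
            for (String f : fields) {
                futures.put(f, pool.submit(() -> facet(docs, f)));
            }
            Map<String, Map<String, Integer>> result = new LinkedHashMap<>();
            for (Map.Entry<String, Future<Map<String, Integer>>> e : futures.entrySet()) {
                result.put(e.getKey(), e.getValue().get());
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Map<String, String>> docs = List.of(
            Map.of("color", "red", "size", "L"),
            Map.of("color", "red", "size", "M"),
            Map.of("color", "blue"));
        System.out.println(facetAll(docs, List.of("color", "size"), 2));
    }
}
```

As the comment notes, this only pays off when faceting work per request is large relative to the available cores; with small warmed indexes the thread-handoff overhead can erase the gain.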
[jira] [Commented] (LUCENE-5261) add simple API to build queries from analysis chain
[ https://issues.apache.org/jira/browse/LUCENE-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790547#comment-13790547 ] ASF subversion and git services commented on LUCENE-5261: - Commit 1530693 from [~rcmuir] in branch 'dev/trunk' [ https://svn.apache.org/r1530693 ] LUCENE-5261: add simple API to build queries from analysis chain add simple API to build queries from analysis chain --- Key: LUCENE-5261 URL: https://issues.apache.org/jira/browse/LUCENE-5261 Project: Lucene - Core Issue Type: New Feature Reporter: Robert Muir Attachments: LUCENE-5261.patch, LUCENE-5261.patch, LUCENE-5261.patch Currently this is pretty crazy stuff. Additionally its duplicated in like 3 or 4 places in our codebase (i noticed it doing LUCENE-5259) We can solve that duplication, and make it easy to simply create queries from an analyzer (its been asked on the user list), as well as make it easier to build new queryparsers. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5323) Solr requires -Dsolr.clustering.enabled=false when pointing at example config
[ https://issues.apache.org/jira/browse/SOLR-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790551#comment-13790551 ] Yonik Seeley commented on SOLR-5323: Hmmm, I agree this is a bug. My comment in SOLR-4708 was +1, provided that everything (except clustering) still works if you copy example somewhere else. Solr requires -Dsolr.clustering.enabled=false when pointing at example config - Key: SOLR-5323 URL: https://issues.apache.org/jira/browse/SOLR-5323 Project: Solr Issue Type: Bug Components: contrib - Clustering Affects Versions: 4.5 Environment: vanilla mac Reporter: John Berryman Fix For: 5.0, 4.6 my typical use of Solr is something like this: {code} cd SOLR_HOME/example cp -r solr /myProjectDir/solr_home java -jar -Dsolr.solr.home=/myProjectDir/solr_home start.jar {code} But in solr 4.5.0 this fails to start successfully. I get an error: {code} org.apache.solr.common.SolrException: Error loading class 'solr.clustering.ClusteringComponent' {code} The reason is because solr.clustering.enabled defaults to true now. I don't know why this might be the case. you can get around it with {code} java -jar -Dsolr.solr.home=/myProjectDir/solr_home -Dsolr.clustering.enabled=false start.jar {code} SOLR-4708 is when this became an issue. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5323) Solr requires -Dsolr.clustering.enabled=false when pointing at example config
[ https://issues.apache.org/jira/browse/SOLR-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790568#comment-13790568 ] Erik Hatcher commented on SOLR-5323: bq. My comment in SOLR-4708 was +1, provided that everything (except clustering) still works if you copy example somewhere else. And that's the reason I didn't commit it before. I thought somehow Dawid had worked some magic to alleviate this issue when he took it on. We should perhaps have lazy loaded SearchComponents too? Solr requires -Dsolr.clustering.enabled=false when pointing at example config - Key: SOLR-5323 URL: https://issues.apache.org/jira/browse/SOLR-5323 Project: Solr Issue Type: Bug Components: contrib - Clustering Affects Versions: 4.5 Environment: vanilla mac Reporter: John Berryman Fix For: 5.0, 4.6 my typical use of Solr is something like this: {code} cd SOLR_HOME/example cp -r solr /myProjectDir/solr_home java -jar -Dsolr.solr.home=/myProjectDir/solr_home start.jar {code} But in solr 4.5.0 this fails to start successfully. I get an error: {code} org.apache.solr.common.SolrException: Error loading class 'solr.clustering.ClusteringComponent' {code} The reason is because solr.clustering.enabled defaults to true now. I don't know why this might be the case. you can get around it with {code} java -jar -Dsolr.solr.home=/myProjectDir/solr_home -Dsolr.clustering.enabled=false start.jar {code} SOLR-4708 is when this became an issue. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5324) Make sub shard replica recovery and shard state switch asynchronous
[ https://issues.apache.org/jira/browse/SOLR-5324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar updated SOLR-5324: Attachment: SOLR-5324.patch # On unsuccessful replica recovery, the sub-shard state was incorrectly being set active # The split by route field test should wait for the right collection to recover Make sub shard replica recovery and shard state switch asynchronous --- Key: SOLR-5324 URL: https://issues.apache.org/jira/browse/SOLR-5324 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar Fix For: 5.0, 4.6 Attachments: SOLR-5324.patch, SOLR-5324.patch Currently the shard split command waits for all replicas of all sub shards to recover and then switches the state of parent to inactive and sub-shards to active. The problem is that shard split (ab)uses the CoreAdmin WaitForState action to ask the sub shard leader to wait until the replica states are active. This action is prone to timeout. We should make the shard state switching asynchronous. Once all replicas of all sub-shards are 'active', the shard states should be switched automatically. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5325) zk connection loss causes overseer leader loss
Christine Poerschke created SOLR-5325: - Summary: zk connection loss causes overseer leader loss Key: SOLR-5325 URL: https://issues.apache.org/jira/browse/SOLR-5325 Project: Solr Issue Type: Bug Reporter: Christine Poerschke -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5325) zk connection loss causes overseer leader loss
[ https://issues.apache.org/jira/browse/SOLR-5325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christine Poerschke updated SOLR-5325: -- Description: The problem we saw was that when the solr overseer leader experienced temporary zk connectivity problems it stopped processing overseer queue events. This first happened when quorum within the external zk ensemble was lost due to too many zookeepers being stopped (similar to SOLR-5199). The second time it happened when there was a sufficient number of zookeepers but they were holding zookeeper leadership elections and thus refused connections (the elections were taking several seconds, we were using the default zookeeper.cnxTimeout=5s value and it was hit for one ensemble member). Affects Version/s: 4.3 4.4 zk connection loss causes overseer leader loss -- Key: SOLR-5325 URL: https://issues.apache.org/jira/browse/SOLR-5325 Project: Solr Issue Type: Bug Affects Versions: 4.3, 4.4 Reporter: Christine Poerschke The problem we saw was that when the solr overseer leader experienced temporary zk connectivity problems it stopped processing overseer queue events. This first happened when quorum within the external zk ensemble was lost due to too many zookeepers being stopped (similar to SOLR-5199). The second time it happened when there was a sufficient number of zookeepers but they were holding zookeeper leadership elections and thus refused connections (the elections were taking several seconds, we were using the default zookeeper.cnxTimeout=5s value and it was hit for one ensemble member). -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
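As a hedged illustration of the election-timeout point above: zookeeper.cnxTimeout is the JVM system property governing how long quorum members wait when opening leader-election connections (default 5 s, matching the report). One common way to raise it on each ensemble member is via SERVER_JVMFLAGS in the ZooKeeper environment script — the mechanism and the 20000 ms value here are illustrative assumptions, not a recommendation from the issue:

```
# conf/zookeeper-env.sh (or however JVM flags reach your ZooKeeper start script)
# 20000 ms is an illustrative value only
SERVER_JVMFLAGS="-Dzookeeper.cnxTimeout=20000"
```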
[jira] [Commented] (LUCENE-5261) add simple API to build queries from analysis chain
[ https://issues.apache.org/jira/browse/LUCENE-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790580#comment-13790580 ] ASF subversion and git services commented on LUCENE-5261: - Commit 1530701 from [~rcmuir] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1530701 ] LUCENE-5261: add simple API to build queries from analysis chain add simple API to build queries from analysis chain --- Key: LUCENE-5261 URL: https://issues.apache.org/jira/browse/LUCENE-5261 Project: Lucene - Core Issue Type: New Feature Reporter: Robert Muir Attachments: LUCENE-5261.patch, LUCENE-5261.patch, LUCENE-5261.patch Currently this is pretty crazy stuff. Additionally its duplicated in like 3 or 4 places in our codebase (i noticed it doing LUCENE-5259) We can solve that duplication, and make it easy to simply create queries from an analyzer (its been asked on the user list), as well as make it easier to build new queryparsers. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-5261) add simple API to build queries from analysis chain
[ https://issues.apache.org/jira/browse/LUCENE-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-5261. - Resolution: Fixed Fix Version/s: 4.6 5.0 add simple API to build queries from analysis chain --- Key: LUCENE-5261 URL: https://issues.apache.org/jira/browse/LUCENE-5261 Project: Lucene - Core Issue Type: New Feature Reporter: Robert Muir Fix For: 5.0, 4.6 Attachments: LUCENE-5261.patch, LUCENE-5261.patch, LUCENE-5261.patch Currently this is pretty crazy stuff. Additionally its duplicated in like 3 or 4 places in our codebase (i noticed it doing LUCENE-5259) We can solve that duplication, and make it easy to simply create queries from an analyzer (its been asked on the user list), as well as make it easier to build new queryparsers. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5307) Solr 4.5 collection api ignores collection.configName when used in cloud
[ https://issues.apache.org/jira/browse/SOLR-5307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790579#comment-13790579 ] Mark Miller commented on SOLR-5307: --- Ouch - this sounds like a pretty bad bug. Solr 4.5 collection api ignores collection.configName when used in cloud Key: SOLR-5307 URL: https://issues.apache.org/jira/browse/SOLR-5307 Project: Solr Issue Type: Bug Affects Versions: 4.5 Reporter: Nathan Neulinger Labels: cloud, collection-api, zookeeper This worked properly in 4.4, but on 4.5, specifying collection.configName when creating a collection doesn't work - it gets the default regardless of what has been uploaded into zk. Explicitly linking config name to collection ahead of time with zkcli.sh is a workaround I'm using for the moment, but that did not appear to be necessary with 4.4 unless I was doing something wrong and not realizing it. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5325) zk connection loss causes overseer leader loss
[ https://issues.apache.org/jira/browse/SOLR-5325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christine Poerschke updated SOLR-5325: -- Attachment: SOLR-5325.patch Attaching an Overseer.java patch for Solr 4.4.0; OverseerCollectionProcessor.java could be changed in a similar way. zk connection loss causes overseer leader loss -- Key: SOLR-5325 URL: https://issues.apache.org/jira/browse/SOLR-5325 Project: Solr Issue Type: Bug Affects Versions: 4.3, 4.4 Reporter: Christine Poerschke Attachments: SOLR-5325.patch The problem we saw was that when the solr overseer leader experienced temporary zk connectivity problems it stopped processing overseer queue events. This first happened when quorum within the external zk ensemble was lost due to too many zookeepers being stopped (similar to SOLR-5199). The second time it happened when there was a sufficient number of zookeepers but they were holding zookeeper leadership elections and thus refused connections (the elections were taking several seconds, we were using the default zookeeper.cnxTimeout=5s value and it was hit for one ensemble member). -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5213) collections?action=SPLITSHARD parent vs. sub-shards numDocs
[ https://issues.apache.org/jira/browse/SOLR-5213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790590#comment-13790590 ] Christine Poerschke commented on SOLR-5213: --- Two occurrences of lost documents were seen. The one with the majority of documents lost was tracked down to operational error (shardX files were copied to be shardY files); a second loss was of only a few dozen documents, and for that one we never figured out whether it was operational or something else. Other shard splits since then were fine, i.e. no losses. collections?action=SPLITSHARD parent vs. sub-shards numDocs --- Key: SOLR-5213 URL: https://issues.apache.org/jira/browse/SOLR-5213 Project: Solr Issue Type: Improvement Components: update Affects Versions: 4.4 Reporter: Christine Poerschke Assignee: Shalin Shekhar Mangar Attachments: SOLR-5213.patch The problem we saw was that splitting a shard took a long time and at the end of it the sub-shards contained fewer documents than the original shard. The root cause was eventually tracked down to the disappearing documents not falling into the hash ranges of the sub-shards. Could SolrIndexSplitter split report per-segment numDocs for parent and sub-shards, with at least a warning logged for any discrepancies (documents falling into none of the sub-shards or documents falling into several sub-shards)? Additionally, could a case be made for erroring out when discrepancies are detected, i.e. not proceeding with the shard split? Either to always error, or to have a verifyNumDocs=false/true optional parameter for the SPLITSHARD action. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
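The verification proposed in SOLR-5213 — flag any document whose hash falls into zero or several sub-shard ranges — can be sketched as a standalone audit. This is an illustrative sketch only, not SolrIndexSplitter's actual code; the Range class merely imitates the shape of Solr's hash ranges, and plain int hashes stand in for the real router hashes:

```java
import java.util.*;

public class SplitCheck {
    // Inclusive hash range, imitating the shape of a sub-shard's range.
    static class Range {
        final int min, max;
        Range(int min, int max) { this.min = min; this.max = max; }
        boolean includes(int hash) { return hash >= min && hash <= max; }
    }

    // For each document hash, count how many sub-shard ranges claim it.
    // Anything other than exactly one owner means a document would be
    // lost (0 owners) or duplicated (2+ owners) by the split.
    static Map<Integer, Integer> audit(int[] hashes, List<Range> subRanges) {
        Map<Integer, Integer> suspicious = new LinkedHashMap<>();
        for (int h : hashes) {
            int owners = 0;
            for (Range r : subRanges) {
                if (r.includes(h)) owners++;
            }
            if (owners != 1) suspicious.put(h, owners);
        }
        return suspicious;
    }

    public static void main(String[] args) {
        List<Range> subs = List.of(new Range(0, 49), new Range(50, 99));
        // 10 and 75 each land in exactly one sub-range; 120 in none.
        System.out.println(audit(new int[]{10, 75, 120}, subs)); // {120=0}
    }
}
```

An empty audit result is what a verifyNumDocs-style check would require before declaring the split safe; a non-empty one is exactly the warning-or-abort case the issue asks about.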
[jira] [Commented] (SOLR-5213) collections?action=SPLITSHARD parent vs. sub-shards numDocs
[ https://issues.apache.org/jira/browse/SOLR-5213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790608#comment-13790608 ] Shalin Shekhar Mangar commented on SOLR-5213: - I'm seeing similar problems as well on the ShardSplitTest sporadically. I've opened SOLR-5309 to track it. I'll review and commit your patch shortly. collections?action=SPLITSHARD parent vs. sub-shards numDocs --- Key: SOLR-5213 URL: https://issues.apache.org/jira/browse/SOLR-5213 Project: Solr Issue Type: Improvement Components: update Affects Versions: 4.4 Reporter: Christine Poerschke Assignee: Shalin Shekhar Mangar Attachments: SOLR-5213.patch The problem we saw was that splitting a shard took a long time and at the end of it the sub-shards contained fewer documents than the original shard. The root cause was eventually tracked down to the disappearing documents not falling into the hash ranges of the sub-shards. Could SolrIndexSplitter split report per-segment numDocs for parent and sub-shards, with at least a warning logged for any discrepancies (documents falling into none of the sub-shards or documents falling into several sub-shards)? Additionally, could a case be made for erroring out when discrepancies are detected i.e. not proceeding with the shard split? Either to always error or to have an verifyNumDocs=false/true optional parameter for the SPLITSHARD action. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5307) Solr 4.5 collection api ignores collection.configName when used in cloud
[ https://issues.apache.org/jira/browse/SOLR-5307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790612#comment-13790612 ] Shalin Shekhar Mangar commented on SOLR-5307: - bq. Ouch - this sounds like a pretty bad bug. Yeah, SOLR-5317 too. Solr 4.5 collection api ignores collection.configName when used in cloud Key: SOLR-5307 URL: https://issues.apache.org/jira/browse/SOLR-5307 Project: Solr Issue Type: Bug Affects Versions: 4.5 Reporter: Nathan Neulinger Labels: cloud, collection-api, zookeeper This worked properly in 4.4, but on 4.5, specifying collection.configName when creating a collection doesn't work - it gets the default regardless of what has been uploaded into zk. Explicitly linking config name to collection ahead of time with zkcli.sh is a workaround I'm using for the moment, but that did not appear to be necessary with 4.4 unless I was doing something wrong and not realizing it. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (SOLR-5307) Solr 4.5 collection api ignores collection.configName when used in cloud
[ https://issues.apache.org/jira/browse/SOLR-5307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller reassigned SOLR-5307: - Assignee: Mark Miller Solr 4.5 collection api ignores collection.configName when used in cloud Key: SOLR-5307 URL: https://issues.apache.org/jira/browse/SOLR-5307 Project: Solr Issue Type: Bug Affects Versions: 4.5 Reporter: Nathan Neulinger Assignee: Mark Miller Labels: cloud, collection-api, zookeeper This worked properly in 4.4, but on 4.5, specifying collection.configName when creating a collection doesn't work - it gets the default regardless of what has been uploaded into zk. Explicitly linking config name to collection ahead of time with zkcli.sh is a workaround I'm using for the moment, but that did not appear to be necessary with 4.4 unless I was doing something wrong and not realizing it. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (SOLR-5306) can not create collection when have over one config
[ https://issues.apache.org/jira/browse/SOLR-5306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller reassigned SOLR-5306: - Assignee: Mark Miller can not create collection when have over one config --- Key: SOLR-5306 URL: https://issues.apache.org/jira/browse/SOLR-5306 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 4.5 Environment: win7 jdk 7 Reporter: Liang Tianyu Assignee: Mark Miller Priority: Critical I have uploaded two configs to ZooKeeper: patent and applicant. I cannot create a collection: http://localhost:8080/solr/admin/collections?action=CREATEname=patent_main_1numShards=1collection.configName=patent. It shows errors: patent_main_1_shard1_replica1: org.apache.solr.common.cloud.ZooKeeperException:org.apache.solr.common.cloud.ZooKeeperException: Could not find configName for collection patent_main_1 found:[applicant, patent]. In Solr 4.4 I can create it successfully. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5213) collections?action=SPLITSHARD parent vs. sub-shards numDocs
[ https://issues.apache.org/jira/browse/SOLR-5213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790657#comment-13790657 ] Christine Poerschke commented on SOLR-5213: --- A variation of the patch i uploaded here would be to 'rescue' (and id+hash log) any documents that would have been lost otherwise e.g. always put them in the first sub-shard, they don't belong there but at least that way they are not lost and could be analysed and dealt with later on. collections?action=SPLITSHARD parent vs. sub-shards numDocs --- Key: SOLR-5213 URL: https://issues.apache.org/jira/browse/SOLR-5213 Project: Solr Issue Type: Improvement Components: update Affects Versions: 4.4 Reporter: Christine Poerschke Assignee: Shalin Shekhar Mangar Attachments: SOLR-5213.patch The problem we saw was that splitting a shard took a long time and at the end of it the sub-shards contained fewer documents than the original shard. The root cause was eventually tracked down to the disappearing documents not falling into the hash ranges of the sub-shards. Could SolrIndexSplitter split report per-segment numDocs for parent and sub-shards, with at least a warning logged for any discrepancies (documents falling into none of the sub-shards or documents falling into several sub-shards)? Additionally, could a case be made for erroring out when discrepancies are detected i.e. not proceeding with the shard split? Either to always error or to have an verifyNumDocs=false/true optional parameter for the SPLITSHARD action. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5326) admin/collections?action=SPLITSHARD support for multiple shards
Christine Poerschke created SOLR-5326: - Summary: admin/collections?action=SPLITSHARD support for multiple shards Key: SOLR-5326 URL: https://issues.apache.org/jira/browse/SOLR-5326 Project: Solr Issue Type: New Feature Affects Versions: 4.4 Reporter: Christine Poerschke The problem we saw was that splitting one shard took 'a long time' (around 4 hours); with 'many' (8 at the time) shards to split, and the solr overseer serialising action=SPLITSHARD requests, a full collection split would have taken 'a very long time'. Separately, having shard splitting distribute replica2, replica3, etc. of each shard randomly across machines was not desirable, and as in SOLR-5004, splitting into 'n' rather than '2' sub-shards was useful. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5263) Deletes may be silently lost if an IOException is hit and later not hit (e.g., disk fills up and then frees up)
[ https://issues.apache.org/jira/browse/LUCENE-5263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790677#comment-13790677 ] ASF subversion and git services commented on LUCENE-5263: - Commit 1530741 from [~mikemccand] in branch 'dev/trunk' [ https://svn.apache.org/r1530741 ] LUCENE-5263: remove extra deleter.checkpoint Deletes may be silently lost if an IOException is hit and later not hit (e.g., disk fills up and then frees up) --- Key: LUCENE-5263 URL: https://issues.apache.org/jira/browse/LUCENE-5263 Project: Lucene - Core Issue Type: Bug Components: core/index Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 4.6, 5.0 Attachments: LUCENE-5263.patch, LUCENE-5263.patch This case is tricky to handle, yet I think realistic: disk fills up temporarily, causes an exception in writeLiveDocs, and then the app keeps using the IW instance. Meanwhile disk later frees up again, IW is closed successfully. In certain cases, we can silently lose deletes in this case. I had already committed TestIndexWriterDeletes.testNoLostDeletesOnDiskFull, and Jenkins seems happy with it so far, but when I added fangs to the test (cutover to RandomIndexWriter from IndexWriter, allow IOE during getReader, add randomness to when exc is thrown, etc.), it uncovered some real/nasty bugs: * ReaderPool.dropAll was suppressing any exception it hit, because {code}if (priorE != null){code} should instead be {code}if (priorE == null){code} * After a merge, we have to write deletes before committing the segment, because an exception when writing deletes means we need to abort the merge * Several places that were directly calling deleter.checkpoint must also increment the changeCount else on close IW thinks there are no changes and doesn't write a new segments file. * closeInternal was dropping pooled readers after writing the segments file, which would lose deletes still buffered due to a previous exc. 
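The first bug listed above — dropAll testing {{if (priorE != null)}} where it should test {{if (priorE == null)}} — is an instance of the standard keep-the-first-exception pattern. A minimal self-contained sketch of that pattern (not Lucene's actual ReaderPool code; the class and method names here are illustrative):

```java
import java.io.Closeable;
import java.io.IOException;

public class CloseAll {
    // Close every resource, keeping the FIRST exception as primary and
    // attaching later ones as suppressed. The dropAll bug was the inverted
    // null check: testing priorE != null here silently discarded the first
    // (and usually only) failure.
    static void closeAll(Closeable... resources) throws IOException {
        IOException priorE = null;
        for (Closeable c : resources) {
            try {
                c.close();
            } catch (IOException e) {
                if (priorE == null) {        // correct: record only the first
                    priorE = e;
                } else {
                    priorE.addSuppressed(e); // later failures ride along
                }
            }
        }
        if (priorE != null) {
            throw priorE;                    // nothing thrown if all closed cleanly
        }
    }
}
```

With the inverted condition, the common single-failure case assigns nothing (priorE stays null), so the method returns normally and the caller never learns an exception was hit — matching the silent-suppression symptom described in the issue.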
-- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-5268) Cutover more postings formats to the inverted pull API
Michael McCandless created LUCENE-5268: -- Summary: Cutover more postings formats to the inverted pull API Key: LUCENE-5268 URL: https://issues.apache.org/jira/browse/LUCENE-5268 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 5.0 In LUCENE-5123, we added a new, more flexible, pull API for writing postings. This API allows the postings format to iterate the fields/terms/postings more than once, and mirrors the API for writing doc values. But that was just the first step (only SimpleText was cutover to the new API). I want to cutover more components, so we can (finally) e.g. play with different encodings depending on the term's postings, such as using a bitset for high freq DOCS_ONLY terms (LUCENE-5052).
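As a rough illustration of why the pull API matters (this sketch is not Lucene's FieldsConsumer API; all names here are made up): a pull-style codec can make one pass over a term's postings to measure density and a second pass to encode, e.g. choosing a bitset for high-freq DOCS_ONLY terms.

```java
import java.util.BitSet;
import java.util.List;

// Illustrative sketch (not Lucene's codec API): with a pull API the codec can
// iterate a term's postings twice -- first to measure density, then to encode.
// Very dense DOCS_ONLY terms can be stored as a bitset (one bit per doc);
// sparse terms as delta-coded ints.
class PostingsEncodingSketch {
    // Pass 1: decide the encoding from the term's document frequency.
    public static String chooseEncoding(List<Integer> docIds, int maxDoc) {
        return docIds.size() * 2 > maxDoc ? "bitset" : "deltas";
    }

    // Pass 2 (bitset case): encode the postings as one bit per document.
    public static BitSet toBitSet(List<Integer> docIds, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (int doc : docIds) {
            bits.set(doc);
        }
        return bits;
    }
}
```

A push-only API hands the codec each posting exactly once, so this measure-then-encode split is impossible unless the codec buffers everything itself.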
[jira] [Updated] (LUCENE-5268) Cutover more postings formats to the inverted pull API
[ https://issues.apache.org/jira/browse/LUCENE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-5268: --- Attachment: LUCENE-5268.patch Patch with these changes: * Cutover BlockTreeTermsWriter, BlockTermsWriter, FST/OrdTermsWriter from PushFieldsConsumer to FieldsConsumer * Changed PostingsBaseWriter to a pull API, with a single method to write the current term's postings, and then added a new PushPostingsBaseWriter that has the push API. * Cutover some formats to new PostingsBaseWriter; pulsing and bloom were nice cleanups. For the rest I just switched them to PushPostingsBaseWriter. * Only two PushFieldsConsumers remain: MemoryPF and RAMOnlyPF (test-framework); I'm tempted to just cut those over and then remove PushFieldsConsumer here. Still a few nocommits but I think it's close ... Cutover more postings formats to the inverted pull API Key: LUCENE-5268 URL: https://issues.apache.org/jira/browse/LUCENE-5268 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 5.0 Attachments: LUCENE-5268.patch In LUCENE-5123, we added a new, more flexible, pull API for writing postings. This API allows the postings format to iterate the fields/terms/postings more than once, and mirrors the API for writing doc values. But that was just the first step (only SimpleText was cutover to the new API). I want to cutover more components, so we can (finally) e.g. play with different encodings depending on the term's postings, such as using a bitset for high freq DOCS_ONLY terms (LUCENE-5052). -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5248) Improve the data structure used in ReaderAndLiveDocs to hold the updates
[ https://issues.apache.org/jira/browse/LUCENE-5248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-5248: --- Attachment: LUCENE-5248.patch Patch replaces MapNumericFieldUpdates with PackedNumericFieldUpdates which holds the docs/values data in PagedMutable and PagedGrowableWriter respectively. It also holds a FixedBitSet the size of maxDoc to mark which documents have a numeric value (e.g. for unsetting a value from a document). Improve the data structure used in ReaderAndLiveDocs to hold the updates Key: LUCENE-5248 URL: https://issues.apache.org/jira/browse/LUCENE-5248 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Shai Erera Assignee: Shai Erera Attachments: LUCENE-5248.patch, LUCENE-5248.patch, LUCENE-5248.patch, LUCENE-5248.patch Currently ReaderAndLiveDocs holds the updates in two structures: +Map<String,Map<Integer,Long>>+ Holds a mapping from each field, to all docs that were updated and their values. This structure is updated when applyDeletes is called, and needs to satisfy several requirements: # Un-ordered writes: if a field f is updated by two terms, termA and termB, in that order, and termA affects doc=100 and termB doc=2, then the updates are applied in that order, meaning we cannot rely on updates coming in order. # Same document may be updated multiple times, either by same term (e.g. several calls to IW.updateNDV) or by different terms. Last update wins. # Sequential read: when writing the updates to the Directory (fieldsConsumer), we iterate on the docs in-order and for each one check if it's updated and if not, pull its value from the current DV. # A single update may affect several million documents, therefore it needs to be efficient w.r.t. memory consumption. +Map<Integer,Map<String,Long>>+ Holds a mapping from a document, to all the fields that it was updated in and the updated value for each field.
This is used by IW.commitMergedDeletes to apply the updates that came in while the segment was merging. The requirements this structure needs to satisfy are: # Access in doc order: this is how commitMergedDeletes works. # One-pass: we visit a document once (currently) and so if we can, it's better if we know all the fields in which it was updated. The updates are applied to the merged ReaderAndLiveDocs (where they are stored in the first structure mentioned above). Comments with proposals will follow next.
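The docs/values-plus-FixedBitSet layout from the patch note can be sketched in a few lines (illustrative only; this is NOT the actual PackedNumericFieldUpdates, which pages its storage precisely because a flat array per field is too memory-hungry):

```java
import java.util.BitSet;

// Simplified sketch of the layout described above (NOT the actual
// PackedNumericFieldUpdates): per-field numeric values indexed by docID, plus
// a bitset marking which docs really have an updated value. Writes may arrive
// out of doc order and may hit the same doc more than once; last write wins.
class NumericUpdatesSketch {
    private final long[] values;   // the real patch uses PagedGrowableWriter here
    private final BitSet hasValue; // analogous to the FixedBitSet in the patch

    NumericUpdatesSketch(int maxDoc) {
        values = new long[maxDoc];
        hasValue = new BitSet(maxDoc);
    }

    // Requirements 1 and 2: un-ordered writes, last update wins.
    public void update(int doc, long value) {
        values[doc] = value;
        hasValue.set(doc);
    }

    // Requirement 3: sequential read -- for each doc in order, use the updated
    // value if present, otherwise fall back to the current doc-values value.
    public long valueOrDefault(int doc, long fallback) {
        return hasValue.get(doc) ? values[doc] : fallback;
    }
}
```

A flat `long[maxDoc]` fails requirement 4 (memory efficiency when an update touches millions of docs), which is why the patch moves to PagedMutable/PagedGrowableWriter.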
[jira] [Updated] (SOLR-5027) Result Set Collapse and Expand Plugins
[ https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-5027: - Attachment: SOLR-5027.patch Added support for the QueryElevationComponent and test case. Result Set Collapse and Expand Plugins -- Key: SOLR-5027 URL: https://issues.apache.org/jira/browse/SOLR-5027 Project: Solr Issue Type: New Feature Components: search Affects Versions: 5.0 Reporter: Joel Bernstein Priority: Minor Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch This ticket introduces two new Solr plugins, the *CollapsingQParserPlugin* and the *ExpandComponent*. The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. Collapse based on the highest scoring document: {code} fq={!collapse field=field_name} {code} Collapse based on the min value of a numeric field: {code} fq={!collapse field=field_name min=field_name} {code} Collapse based on the max value of a numeric field: {code} fq={!collapse field=field_name max=field_name} {code} Collapse with a null policy: {code} fq={!collapse field=field_name nullPolicy=null_policy} {code} There are three null policies: ignore : removes docs with a null value in the collapse field (default). expand : treats each doc with a null value in the collapse field as a separate group. collapse : collapses all docs with a null value into a single group using either highest score, or min/max. The *ExpandComponent* is a search component that takes the collapsed docList and expands the groups for a single page based on parameters provided. Initial syntax: expand=true - Turns on the expand component. expand.field=field - Expands results for this field expand.limit=5 - Limits the documents for each expanded group. expand.sort=sort spec - The sort spec for the expanded documents. Default is score. expand.rows=500 - The max number of expanded results to bring back. Default is 500.
*Note:* Recent patches don't contain the expand component. The July 16 patch does. This will be brought back in when the collapse is finished, or possibly moved to its own ticket.
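The collapse semantics above are easy to state in plain code. A toy illustration (NOT Solr's implementation; the `Doc` shape here is made up): keep exactly one document per collapse-field value, chosen by the max of a numeric field.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy illustration of collapse-by-max (NOT Solr code; the Doc shape is made
// up): keep exactly one document per collapse-field value, choosing the doc
// with the highest value of a numeric field.
class CollapseSketch {
    static class Doc {
        public final String group; // the collapse field value
        public final long value;   // the numeric field used for max
        Doc(String group, long value) { this.group = group; this.value = value; }
    }

    public static Map<String, Doc> collapseByMax(List<Doc> docs) {
        Map<String, Doc> best = new HashMap<>();
        for (Doc d : docs) {
            Doc cur = best.get(d.group);
            if (cur == null || d.value > cur.value) {
                best.put(d.group, d); // this doc wins its group so far
            }
        }
        return best;
    }
}
```

Collapse-by-min is the same with the comparison flipped; the nullPolicy options decide what happens when the group value is null (drop the doc, keep it as its own group, or collapse all nulls together).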
[jira] [Resolved] (SOLR-5307) Solr 4.5 collection api ignores collection.configName when used in cloud
[ https://issues.apache.org/jira/browse/SOLR-5307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved SOLR-5307. --- Resolution: Duplicate Solr 4.5 collection api ignores collection.configName when used in cloud Key: SOLR-5307 URL: https://issues.apache.org/jira/browse/SOLR-5307 Project: Solr Issue Type: Bug Affects Versions: 4.5 Reporter: Nathan Neulinger Assignee: Mark Miller Labels: cloud, collection-api, zookeeper This worked properly in 4.4, but on 4.5, specifying collection.configName when creating a collection doesn't work - it gets the default regardless of what has been uploaded into zk. Explicitly linking config name to collection ahead of time with zkcli.sh is a workaround I'm using for the moment, but that did not appear to be necessary with 4.4 unless I was doing something wrong and not realizing it. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5306) can not create collection when have over one config
[ https://issues.apache.org/jira/browse/SOLR-5306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-5306: -- Attachment: SOLR-5306.patch can not create collection when have over one config --- Key: SOLR-5306 URL: https://issues.apache.org/jira/browse/SOLR-5306 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 4.5 Environment: win7 jdk 7 Reporter: Liang Tianyu Assignee: Mark Miller Priority: Critical Attachments: SOLR-5306.patch I have uploaded two configs to ZooKeeper: patent and applicant. I cannot create a collection with http://localhost:8080/solr/admin/collections?action=CREATE&name=patent_main_1&numShards=1&collection.configName=patent. It shows errors: patent_main_1_shard1_replica1: org.apache.solr.common.cloud.ZooKeeperException:org.apache.solr.common.cloud.ZooKeeperException: Could not find configName for collection patent_main_1 found:[applicant, patent]. In Solr 4.4 I can create it successfully.
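For reference, the CREATE call in this report is an ordinary HTTP GET with `&`-separated parameters; a small helper sketch (hypothetical, not part of Solr or SolrJ) that assembles such a query string:

```java
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper (not part of Solr/SolrJ): assemble a collections-API URL
// with properly '&'-separated parameters. Host and names follow the
// reporter's example.
class CreateCollectionUrl {
    public static String build(String base, Map<String, String> params) {
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            query.add(e.getKey() + "=" + e.getValue()); // values assumed URL-safe
        }
        return base + "?" + query;
    }
}
```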
[jira] [Updated] (SOLR-5306) can not create collection when have over one config
[ https://issues.apache.org/jira/browse/SOLR-5306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-5306: -- Fix Version/s: 5.0 4.6 4.5.1 can not create collection when have over one config --- Key: SOLR-5306 URL: https://issues.apache.org/jira/browse/SOLR-5306 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 4.5 Environment: win7 jdk 7 Reporter: Liang Tianyu Assignee: Mark Miller Priority: Critical Fix For: 4.5.1, 4.6, 5.0 Attachments: SOLR-5306.patch I have uploaded two configs to ZooKeeper: patent and applicant. I cannot create a collection with http://localhost:8080/solr/admin/collections?action=CREATE&name=patent_main_1&numShards=1&collection.configName=patent. It shows errors: patent_main_1_shard1_replica1: org.apache.solr.common.cloud.ZooKeeperException:org.apache.solr.common.cloud.ZooKeeperException: Could not find configName for collection patent_main_1 found:[applicant, patent]. In Solr 4.4 I can create it successfully.
[jira] [Updated] (SOLR-5317) CoreAdmin API is not persisting data properly
[ https://issues.apache.org/jira/browse/SOLR-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-5317: -- Fix Version/s: 5.0 4.6 4.5.1 CoreAdmin API is not persisting data properly - Key: SOLR-5317 URL: https://issues.apache.org/jira/browse/SOLR-5317 Project: Solr Issue Type: Bug Reporter: Yago Riveiro Priority: Critical Fix For: 4.5.1, 4.6, 5.0 There is a regression between 4.4 and 4.5 in the CoreAdmin API: the command does not save its result to solr.xml at the time it is executed. The full process is described here: https://gist.github.com/yriveiro/6883208
[jira] [Updated] (SOLR-5027) Field Collapsing PostFilter
[ https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-5027: - Summary: Field Collapsing PostFilter (was: Result Set Collapse and Expand Plugins) Field Collapsing PostFilter --- Key: SOLR-5027 URL: https://issues.apache.org/jira/browse/SOLR-5027 Project: Solr Issue Type: New Feature Components: search Affects Versions: 5.0 Reporter: Joel Bernstein Priority: Minor Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch This ticket introduces two new Solr plugins, the *CollapsingQParserPlugin* and the *ExpandComponent*. The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. Collapse based on the highest scoring document: {code} fq=(!collapse field=field_name} {code} Collapse based on the min value of a numeric field: {code} fq={!collapse field=field_name min=field_name} {code} Collapse based on the max value of a numeric field: {code} fq={!collapse field=field_name max=field_name} {code} Collapse with a null policy: {code} fq={!collapse field=field_name nullPolicy=null_policy} {code} There are three null policies: ignore : removes docs with a null value in the collapse field (default). expand : treats each doc with a null value in the collapse field as a separate group. collapse : collapses all docs with a null value into a single group using either highest score, or min/max. The *ExpandComponent* is a search component that takes the collapsed docList and expands the groups for a single page based on parameters provided. Initial syntax: expand=true - Turns on the expand component. expand.field=field - Expands results for this field expand.limit=5 - Limits the documents for each expanded group. expand.sort=sort spec - The sort spec for the expanded documents. Default is score. expand.rows=500 - The max number of expanded results to bring back. Default is 500. 
*Note:* Recent patches don't contain the expand component. The July 16 patch does. This will be brought back in when the collapse is finished, or possibly moved to its own ticket.
[jira] [Updated] (SOLR-5027) Field Collapsing PostFilter
[ https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-5027: - Description: This ticket introduces the *CollapsingQParserPlugin* The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. This is a high performance alternative to standard Solr field collapsing (with *ngroups*) when the number of distinct groups in the result set is high. For example in one performance test, a search with 10 million full results and 1 million collapsed groups: Standard grouping with ngroups : 17 seconds. CollapsingQParserPlugin: 300 milli-seconds. Sample syntax: Collapse based on the highest scoring document: {code} fq=(!collapse field=field_name} {code} Collapse based on the min value of a numeric field: {code} fq={!collapse field=field_name min=field_name} {code} Collapse based on the max value of a numeric field: {code} fq={!collapse field=field_name max=field_name} {code} Collapse with a null policy: {code} fq={!collapse field=field_name nullPolicy=null_policy} {code} There are three null policies: ignore : removes docs with a null value in the collapse field (default). expand : treats each doc with a null value in the collapse field as a separate group. collapse : collapses all docs with a null value into a single group using either highest score, or min/max. *Note:* The July 16 patch also includes and ExpandComponent that expands the collapsed groups for the current search result page. This functionality will moved to it's own ticket. was: This ticket introduces two new Solr plugins, the *CollapsingQParserPlugin* and the *ExpandComponent*. The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. 
Collapse based on the highest scoring document: {code} fq=(!collapse field=field_name} {code} Collapse based on the min value of a numeric field: {code} fq={!collapse field=field_name min=field_name} {code} Collapse based on the max value of a numeric field: {code} fq={!collapse field=field_name max=field_name} {code} Collapse with a null policy: {code} fq={!collapse field=field_name nullPolicy=null_policy} {code} There are three null policies: ignore : removes docs with a null value in the collapse field (default). expand : treats each doc with a null value in the collapse field as a separate group. collapse : collapses all docs with a null value into a single group using either highest score, or min/max. The *ExpandComponent* is a search component that takes the collapsed docList and expands the groups for a single page based on parameters provided. Initial syntax: expand=true - Turns on the expand component. expand.field=field - Expands results for this field expand.limit=5 - Limits the documents for each expanded group. expand.sort=sort spec - The sort spec for the expanded documents. Default is score. expand.rows=500 - The max number of expanded results to bring back. Default is 500. *Note:* Recent patches don't contain the expand component. The July 16 patch does. This will be brought back in when the collapse is finished, or possible moved to it's own ticket. Field Collapsing PostFilter --- Key: SOLR-5027 URL: https://issues.apache.org/jira/browse/SOLR-5027 Project: Solr Issue Type: New Feature Components: search Affects Versions: 5.0 Reporter: Joel Bernstein Priority: Minor Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch This ticket introduces the *CollapsingQParserPlugin* The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. 
This is a high performance alternative to standard Solr field collapsing (with *ngroups*) when the number of distinct groups in the result set is high. For example in one performance test, a search with 10 million full results and 1 million collapsed groups: Standard grouping with ngroups : 17 seconds. CollapsingQParserPlugin: 300 milli-seconds. Sample syntax: Collapse based on the highest scoring document: {code} fq=(!collapse field=field_name} {code} Collapse based on the min value of a numeric field: {code} fq={!collapse field=field_name min=field_name} {code} Collapse based on the max value of a numeric field: {code} fq={!collapse field=field_name max=field_name} {code} Collapse with a null policy: {code} fq={!collapse field=field_name nullPolicy=null_policy} {code} There are three null policies: ignore : removes docs with a null value in the collapse field (default). expand : treats each doc with a null value in the collapse field as a separate group. collapse : collapses all docs with a null value into a single group using either highest score, or min/max. *Note:* The July 16 patch also includes
[jira] [Assigned] (SOLR-5027) Field Collapsing PostFilter
[ https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein reassigned SOLR-5027: Assignee: Joel Bernstein Field Collapsing PostFilter --- Key: SOLR-5027 URL: https://issues.apache.org/jira/browse/SOLR-5027 Project: Solr Issue Type: New Feature Components: search Affects Versions: 5.0 Reporter: Joel Bernstein Assignee: Joel Bernstein Priority: Minor Fix For: 4.6, 5.0 Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch This ticket introduces the *CollapsingQParserPlugin*. The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. This is a high performance alternative to standard Solr field collapsing (with *ngroups*) when the number of distinct groups in the result set is high. For example in one performance test, a search with 10 million full results and 1 million collapsed groups: Standard grouping with ngroups : 17 seconds. CollapsingQParserPlugin: 300 milliseconds. Sample syntax: Collapse based on the highest scoring document: {code} fq={!collapse field=field_name} {code} Collapse based on the min value of a numeric field: {code} fq={!collapse field=field_name min=field_name} {code} Collapse based on the max value of a numeric field: {code} fq={!collapse field=field_name max=field_name} {code} Collapse with a null policy: {code} fq={!collapse field=field_name nullPolicy=null_policy} {code} There are three null policies: ignore : removes docs with a null value in the collapse field (default). expand : treats each doc with a null value in the collapse field as a separate group. collapse : collapses all docs with a null value into a single group using either highest score, or min/max. *Note:* The July 16 patch also includes an ExpandComponent that expands the collapsed groups for the current search result page. This functionality will be moved to its own ticket.
[jira] [Updated] (SOLR-5027) Field Collapsing PostFilter
[ https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-5027: - Fix Version/s: 5.0 4.6 Field Collapsing PostFilter --- Key: SOLR-5027 URL: https://issues.apache.org/jira/browse/SOLR-5027 Project: Solr Issue Type: New Feature Components: search Affects Versions: 5.0 Reporter: Joel Bernstein Assignee: Joel Bernstein Priority: Minor Fix For: 4.6, 5.0 Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch This ticket introduces the *CollapsingQParserPlugin*. The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. This is a high performance alternative to standard Solr field collapsing (with *ngroups*) when the number of distinct groups in the result set is high. For example in one performance test, a search with 10 million full results and 1 million collapsed groups: Standard grouping with ngroups : 17 seconds. CollapsingQParserPlugin: 300 milliseconds. Sample syntax: Collapse based on the highest scoring document: {code} fq={!collapse field=field_name} {code} Collapse based on the min value of a numeric field: {code} fq={!collapse field=field_name min=field_name} {code} Collapse based on the max value of a numeric field: {code} fq={!collapse field=field_name max=field_name} {code} Collapse with a null policy: {code} fq={!collapse field=field_name nullPolicy=null_policy} {code} There are three null policies: ignore : removes docs with a null value in the collapse field (default). expand : treats each doc with a null value in the collapse field as a separate group. collapse : collapses all docs with a null value into a single group using either highest score, or min/max. *Note:* The July 16 patch also includes an ExpandComponent that expands the collapsed groups for the current search result page. This functionality will be moved to its own ticket.
[jira] [Updated] (SOLR-5027) Field Collapsing PostFilter
[ https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-5027: - Description: This ticket introduces the *CollapsingQParserPlugin* The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. This is a high performance alternative to standard Solr field collapsing (with *ngroups*) when the number of distinct groups in the result set is high. For example in one performance test, a search with 10 million full results and 1 million collapsed groups: Standard grouping with ngroups : 17 seconds. CollapsingQParserPlugin: 300 milli-seconds. Sample syntax: Collapse based on the highest scoring document: {code} fq=(!collapse field=field_name} {code} Collapse based on the min value of a numeric field: {code} fq={!collapse field=field_name min=field_name} {code} Collapse based on the max value of a numeric field: {code} fq={!collapse field=field_name max=field_name} {code} Collapse with a null policy: {code} fq={!collapse field=field_name nullPolicy=null_policy} {code} There are three null policies: ignore : removes docs with a null value in the collapse field (default). expand : treats each doc with a null value in the collapse field as a separate group. collapse : collapses all docs with a null value into a single group using either highest score, or min/max. The CollapsingQParserPlugin also fully supports the QueryElevationComponent *Note:* The July 16 patch also includes and ExpandComponent that expands the collapsed groups for the current search result page. This functionality will moved to it's own ticket. was: This ticket introduces the *CollapsingQParserPlugin* The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. This is a high performance alternative to standard Solr field collapsing (with *ngroups*) when the number of distinct groups in the result set is high. 
For example in one performance test, a search with 10 million full results and 1 million collapsed groups: Standard grouping with ngroups : 17 seconds. CollapsingQParserPlugin: 300 milli-seconds. Sample syntax: Collapse based on the highest scoring document: {code} fq=(!collapse field=field_name} {code} Collapse based on the min value of a numeric field: {code} fq={!collapse field=field_name min=field_name} {code} Collapse based on the max value of a numeric field: {code} fq={!collapse field=field_name max=field_name} {code} Collapse with a null policy: {code} fq={!collapse field=field_name nullPolicy=null_policy} {code} There are three null policies: ignore : removes docs with a null value in the collapse field (default). expand : treats each doc with a null value in the collapse field as a separate group. collapse : collapses all docs with a null value into a single group using either highest score, or min/max. *Note:* The July 16 patch also includes and ExpandComponent that expands the collapsed groups for the current search result page. This functionality will moved to it's own ticket. Field Collapsing PostFilter --- Key: SOLR-5027 URL: https://issues.apache.org/jira/browse/SOLR-5027 Project: Solr Issue Type: New Feature Components: search Affects Versions: 5.0 Reporter: Joel Bernstein Assignee: Joel Bernstein Priority: Minor Fix For: 4.6, 5.0 Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch This ticket introduces the *CollapsingQParserPlugin* The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. This is a high performance alternative to standard Solr field collapsing (with *ngroups*) when the number of distinct groups in the result set is high. For example in one performance test, a search with 10 million full results and 1 million collapsed groups: Standard grouping with ngroups : 17 seconds. CollapsingQParserPlugin: 300 milli-seconds. 
Sample syntax: Collapse based on the highest scoring document: {code} fq=(!collapse field=field_name} {code} Collapse based on the min value of a numeric field: {code} fq={!collapse field=field_name min=field_name} {code} Collapse based on the max value of a numeric field: {code} fq={!collapse field=field_name max=field_name} {code} Collapse with a null policy: {code} fq={!collapse field=field_name nullPolicy=null_policy} {code} There are three null policies: ignore : removes docs with a null value in the collapse field (default). expand : treats each doc with a null value in the collapse field as a separate group. collapse : collapses all docs with a null value into a single group using either highest score, or min/max. The CollapsingQParserPlugin also fully supports the QueryElevationComponent *Note:*
[jira] [Updated] (SOLR-5027) Field Collapsing PostFilter
[ https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-5027: - Description: This ticket introduces the *CollapsingQParserPlugin*. The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. It is a high-performance alternative to standard Solr field collapsing (with *ngroups*) when the number of distinct groups in the result set is high. For example, in one performance test of a search with 10 million full results and 1 million collapsed groups: standard grouping with ngroups took 17 seconds; the CollapsingQParserPlugin took 300 milliseconds. Sample syntax: Collapse based on the highest scoring document: {code} fq={!collapse field=field_name} {code} Collapse based on the min value of a numeric field: {code} fq={!collapse field=field_name min=field_name} {code} Collapse based on the max value of a numeric field: {code} fq={!collapse field=field_name max=field_name} {code} Collapse with a null policy: {code} fq={!collapse field=field_name nullPolicy=null_policy} {code} There are three null policies: ignore: removes docs with a null value in the collapse field (default); expand: treats each doc with a null value in the collapse field as a separate group; collapse: collapses all docs with a null value into a single group using either highest score or min/max. The CollapsingQParserPlugin also fully supports the QueryElevationComponent. *Note:* The July 16 patch also includes an ExpandComponent that expands the collapsed groups for the current search result page. This functionality will be moved to its own ticket.
Field Collapsing PostFilter --- Key: SOLR-5027 URL: https://issues.apache.org/jira/browse/SOLR-5027 Project: Solr Issue Type: New Feature Components: search Affects Versions: 5.0 Reporter: Joel Bernstein Assignee: Joel Bernstein Priority: Minor Fix For: 4.6, 5.0 Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch
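The default collapse behavior described above (keep a single document per distinct value of the collapse field, chosen by highest score, with the nullPolicy=ignore default dropping null-valued docs) can be sketched in plain Java. This is an illustrative sketch only, not the actual PostFilter implementation; the class and field names are invented:

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified illustration of score-based field collapsing:
 *  keep only the highest-scoring doc per collapse-field value. */
public class CollapseSketch {
    public static class Doc {
        final int id;
        final String groupValue; // value of the collapse field (may be null)
        final float score;
        public Doc(int id, String groupValue, float score) {
            this.id = id; this.groupValue = groupValue; this.score = score;
        }
    }

    /** nullPolicy=ignore (the default): docs with a null collapse value are dropped. */
    public static Map<String, Doc> collapseByMaxScore(Iterable<Doc> docs) {
        Map<String, Doc> best = new HashMap<>();
        for (Doc d : docs) {
            if (d.groupValue == null) continue; // "ignore" null policy
            Doc cur = best.get(d.groupValue);
            if (cur == null || d.score > cur.score) {
                best.put(d.groupValue, d); // new best doc for this group
            }
        }
        return best;
    }
}
```

The reason this can beat grouping with ngroups is visible even in the sketch: one hash lookup per document replaces maintaining per-group result lists.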
[jira] [Commented] (SOLR-5306) can not create collection when have over one config
[ https://issues.apache.org/jira/browse/SOLR-5306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790763#comment-13790763 ] ASF subversion and git services commented on SOLR-5306: --- Commit 1530772 from [~markrmil...@gmail.com] in branch 'dev/trunk' [ https://svn.apache.org/r1530772 ] SOLR-5306: Extra collection creation parameters like collection.configName are not being respected. can not create collection when have over one config --- Key: SOLR-5306 URL: https://issues.apache.org/jira/browse/SOLR-5306 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 4.5 Environment: win7 jdk 7 Reporter: Liang Tianyu Assignee: Mark Miller Priority: Critical Fix For: 4.5.1, 4.6, 5.0 Attachments: SOLR-5306.patch I have uploaded two configs to ZooKeeper: patent and applicant. I cannot create a collection: http://localhost:8080/solr/admin/collections?action=CREATE&name=patent_main_1&numShards=1&collection.configName=patent. It shows the error: patent_main_1_shard1_replica1: org.apache.solr.common.cloud.ZooKeeperException:org.apache.solr.common.cloud.ZooKeeperException: Could not find configName for collection patent_main_1 found:[applicant, patent]. In Solr 4.4 I can create it successfully. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5306) can not create collection when have over one config
[ https://issues.apache.org/jira/browse/SOLR-5306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790765#comment-13790765 ] ASF subversion and git services commented on SOLR-5306: --- Commit 1530773 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1530773 ] SOLR-5306: Extra collection creation parameters like collection.configName are not being respected. can not create collection when have over one config --- Key: SOLR-5306 URL: https://issues.apache.org/jira/browse/SOLR-5306 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 4.5 Environment: win7 jdk 7 Reporter: Liang Tianyu Assignee: Mark Miller Priority: Critical Fix For: 4.5.1, 4.6, 5.0 Attachments: SOLR-5306.patch I have uploaded two configs to ZooKeeper: patent and applicant. I cannot create a collection: http://localhost:8080/solr/admin/collections?action=CREATE&name=patent_main_1&numShards=1&collection.configName=patent. It shows the error: patent_main_1_shard1_replica1: org.apache.solr.common.cloud.ZooKeeperException:org.apache.solr.common.cloud.ZooKeeperException: Could not find configName for collection patent_main_1 found:[applicant, patent]. In Solr 4.4 I can create it successfully. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5323) Solr requires -Dsolr.clustering.enabled=false when pointing at example config
[ https://issues.apache.org/jira/browse/SOLR-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790788#comment-13790788 ] Dawid Weiss commented on SOLR-5323: --- I can't remember, but I think the problem was that it wasn't possible to define install-dir-relative directories for the lib element. I'll take a look. Solr requires -Dsolr.clustering.enabled=false when pointing at example config - Key: SOLR-5323 URL: https://issues.apache.org/jira/browse/SOLR-5323 Project: Solr Issue Type: Bug Components: contrib - Clustering Affects Versions: 4.5 Environment: vanilla mac Reporter: John Berryman Fix For: 4.6, 5.0 My typical use of Solr is something like this: {code} cd SOLR_HOME/example cp -r solr /myProjectDir/solr_home java -jar -Dsolr.solr.home=/myProjectDir/solr_home start.jar {code} But in Solr 4.5.0 this fails to start successfully. I get an error: {code} org.apache.solr.common.SolrException: Error loading class 'solr.clustering.ClusteringComponent' {code} The reason is that solr.clustering.enabled now defaults to true. I don't know why this might be the case. You can work around it with: {code} java -jar -Dsolr.solr.home=/myProjectDir/solr_home -Dsolr.clustering.enabled=false start.jar {code} SOLR-4708 is when this became an issue. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-NightlyTests-4.x - Build # 400 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-4.x/400/ 1 tests failed. REGRESSION: org.apache.lucene.analysis.core.TestRandomChains.testRandomChains Error Message: first posInc must be 0 Stack Trace: java.lang.IllegalStateException: first posInc must be 0 at __randomizedtesting.SeedInfo.seed([D025BEA04DE60E8F:EDC497C10AF4134F]:0) at org.apache.lucene.analysis.ValidatingTokenFilter.incrementToken(ValidatingTokenFilter.java:89) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkAnalysisConsistency(BaseTokenStreamTestCase.java:694) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:605) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:506) at org.apache.lucene.analysis.core.TestRandomChains.testRandomChains(TestRandomChains.java:925) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358) at java.lang.Thread.run(Thread.java:679) Build Log: [...truncated 4359 lines...] [junit4] Suite: org.apache.lucene.analysis.core.TestRandomChains [junit4] 2 TEST FAIL: useCharFilter=false text='\ucd6f\u8537\uab05d\uf3cd qkt \u0136'
[jira] [Commented] (LUCENE-5248) Improve the data structure used in ReaderAndLiveDocs to hold the updates
[ https://issues.apache.org/jira/browse/LUCENE-5248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790834#comment-13790834 ] Robert Muir commented on LUCENE-5248: - Hi Shai: should UpdatesIterator implement DISI? It seems like it might be a good fit. {code} +private final FixedBitSet docsWithField; +private PagedMutable docs; +private PagedGrowableWriter values; {code} When we have multiple related structures like this, maybe we can add a comment as to what each is? Something like: {code} // bit per docid: set if the value is real // TODO: is bitset(maxdoc) really needed since usually its sparse? why not an openbitset parallel with docs? private final FixedBitSet docsWithField; // holds a list of documents. // TODO: do these really need to be absolute-encoded? private PagedMutable docs; // holds a list of values, parallel with docs private PagedGrowableWriter values; {code} {code} + docsWithField = new FixedBitSet(maxDoc); + docsWithField.clear(0, maxDoc); {code} The clear should be unnecessary! {code} +public void add(int doc, Long value) { + assert value != null; + if (size == Integer.MAX_VALUE) { +throw new IllegalStateException("cannot support more than Integer.MAX_VALUE doc/value entries"); + } {code} Is this really a limitation? {code} +@Override +protected int compare(int i, int j) { + return (int) (docs.get(i) - docs.get(j)); +} {code} Can we just use Long.compare? This subtraction may be safe... but it would smell better. 
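The subtraction-vs-Long.compare point above can be made concrete with a small standalone demo (illustrative only, not part of the patch): with small non-negative doc ids the subtraction happens to be safe, but in general the difference of two longs can overflow, and casting it to int can even report two distinct values as equal.

```java
public class CompareDemo {
    public static void main(String[] args) {
        long a = Long.MAX_VALUE, b = -1;
        // a - b overflows to Long.MIN_VALUE; its low 32 bits are all zero,
        // so the int cast claims the two values are "equal".
        int bySubtraction = (int) (a - b);
        // Long.compare never overflows and reports the correct ordering.
        int byCompare = Long.compare(a, b);
        System.out.println(bySubtraction + " " + byCompare);
    }
}
```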
Improve the data structure used in ReaderAndLiveDocs to hold the updates Key: LUCENE-5248 URL: https://issues.apache.org/jira/browse/LUCENE-5248 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Shai Erera Assignee: Shai Erera Attachments: LUCENE-5248.patch, LUCENE-5248.patch, LUCENE-5248.patch, LUCENE-5248.patch Currently ReaderAndLiveDocs holds the updates in two structures: +Map<String,Map<Integer,Long>>+ Holds a mapping from each field to all docs that were updated and their values. This structure is updated when applyDeletes is called, and needs to satisfy several requirements: # Un-ordered writes: if a field f is updated by two terms, termA and termB, in that order, and termA affects doc=100 and termB doc=2, then the updates are applied in that order, meaning we cannot rely on updates coming in order. # Same document may be updated multiple times, either by the same term (e.g. several calls to IW.updateNDV) or by different terms. Last update wins. # Sequential read: when writing the updates to the Directory (fieldsConsumer), we iterate over the docs in order and for each one check if it's updated; if not, we pull its value from the current DV. # A single update may affect several million documents, and therefore needs to be efficient w.r.t. memory consumption. +Map<Integer,Map<String,Long>>+ Holds a mapping from a document to all the fields it was updated in and the updated value for each field. This is used by IW.commitMergedDeletes to apply the updates that came in while the segment was merging. The requirements this structure needs to satisfy are: # Access in doc order: this is how commitMergedDeletes works. # One-pass: we visit a document once (currently), so if we can, it's better if we know all the fields in which it was updated. The updates are applied to the merged ReaderAndLiveDocs (where they are stored in the first structure mentioned above). Comments with proposals will follow next. 
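The requirements above (un-ordered writes, last-update-wins, and sequential read) can be sketched with plain parallel arrays. This is a simplified illustration of the idea, not the actual patch (which uses packed structures like PagedMutable/PagedGrowableWriter for memory efficiency); the class name is invented:

```java
import java.util.Arrays;

/** Sketch: doc ids and values kept in parallel arrays; sorted by doc id
 *  before sequential consumption; for duplicate doc ids the later write wins. */
public class NumericUpdatesSketch {
    private long[] docs = new long[8];   // doc ids, parallel with values
    private long[] values = new long[8]; // updated values
    private int size;

    /** Records an update; writes may arrive in any doc order. */
    public void add(int doc, long value) {
        if (size == docs.length) {
            docs = Arrays.copyOf(docs, size * 2);
            values = Arrays.copyOf(values, size * 2);
        }
        docs[size] = doc;
        values[size] = value;
        size++;
    }

    /** Sorts by doc id, stably: ties keep insertion order, so on duplicate
     *  doc ids the last update ends up last and wins. */
    public void sortByDoc() {
        // pack (doc id, insertion index) into one long so a plain sort is stable
        long[] packed = new long[size];
        for (int i = 0; i < size; i++) {
            packed[i] = (docs[i] << 32) | i;
        }
        Arrays.sort(packed);
        long[] newDocs = new long[size], newValues = new long[size];
        for (int i = 0; i < size; i++) {
            int src = (int) packed[i];        // low 32 bits: original position
            newDocs[i] = packed[i] >>> 32;    // high 32 bits: doc id
            newValues[i] = values[src];
        }
        docs = newDocs;
        values = newValues;
    }

    /** Returns the effective value for doc, or null if it was never updated. */
    public Long get(int doc) {
        Long result = null;
        for (int i = 0; i < size; i++) {
            if (docs[i] == doc) result = values[i]; // last write wins
        }
        return result;
    }
}
```

After sortByDoc() a consumer can walk the arrays front to back, satisfying the "sequential read" requirement without ever needing the updates to have arrived in doc order.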
-- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [JENKINS] Lucene-Solr-NightlyTests-4.x - Build # 400 - Still Failing
I will investigate. looks like fun. On Wed, Oct 9, 2013 at 4:18 PM, Apache Jenkins Server jenk...@builds.apache.org wrote: Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-4.x/400/ 1 tests failed. REGRESSION: org.apache.lucene.analysis.core.TestRandomChains.testRandomChains Error Message: first posInc must be 0
[JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.7.0) - Build # 900 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/900/ Java: 64bit/jdk1.7.0 -XX:+UseCompressedOops -XX:+UseParallelGC All tests passed Build Log: [...truncated 10176 lines...] [junit4] ERROR: JVM J0 ended with an exception, command line: /Library/Java/JavaVirtualMachines/jdk1.7.0_40.jdk/Contents/Home/jre/bin/java -XX:+UseCompressedOops -XX:+UseParallelGC -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/heapdumps -Dtests.prefix=tests -Dtests.seed=EE974D36626AC16B -Xmx512M -Dtests.iters= -Dtests.verbose=false -Dtests.infostream=false -Dtests.codec=random -Dtests.postingsformat=random -Dtests.docvaluesformat=random -Dtests.locale=random -Dtests.timezone=random -Dtests.directory=random -Dtests.linedocsfile=europarl.lines.txt.gz -Dtests.luceneMatchVersion=5.0 -Dtests.cleanthreads=perClass -Djava.util.logging.config.file=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/tools/junit4/logging.properties -Dtests.nightly=false -Dtests.weekly=false -Dtests.slow=true -Dtests.asserts.gracious=false -Dtests.multiplier=1 -DtempDir=. -Djava.io.tmpdir=. -Djunit4.tempDir=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/build/solr-core/test/temp -Dclover.db.dir=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/clover/db -Djava.security.manager=org.apache.lucene.util.TestSecurityManager -Djava.security.policy=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/tools/junit4/tests.policy -Dlucene.version=5.0-SNAPSHOT -Djetty.testMode=1 -Djetty.insecurerandom=1 -Dsolr.directoryFactory=org.apache.solr.core.MockDirectoryFactory -Djava.awt.headless=true -Dtests.disableHdfs=true -Dfile.encoding=ISO-8859-1 -classpath
Re: [JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.7.0) - Build # 900 - Failure!
malloc/free bug. On Wed, Oct 9, 2013 at 4:47 PM, Policeman Jenkins Server jenk...@thetaphi.de wrote: Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/900/ Java: 64bit/jdk1.7.0 -XX:+UseCompressedOops -XX:+UseParallelGC All tests passed Build Log: [...truncated 10176 lines...] [junit4] ERROR: JVM J0 ended with an exception
[jira] [Created] (LUCENE-5269) TestRandomChains failure
Robert Muir created LUCENE-5269: --- Summary: TestRandomChains failure Key: LUCENE-5269 URL: https://issues.apache.org/jira/browse/LUCENE-5269 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir One of EdgeNGramTokenizer, ShingleFilter, NGramTokenFilter is buggy, or possibly only the combination of them conspiring together. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5269) TestRandomChains failure
[ https://issues.apache.org/jira/browse/LUCENE-5269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-5269: Attachment: LUCENE-5269_test.patch Here's a test. For whatever reason the exact text in Jenkins wouldn't reproduce with checkAnalysisConsistency with the exact configuration. However, the random seed reproduces in Jenkins easily. I suspect maybe something isn't being reset and the linedocs file is triggering it? If I blast random data at the configuration it fails the same way. I then removed various harmless filters and so on until I was left with these three and it was still failing... TestRandomChains failure Key: LUCENE-5269 URL: https://issues.apache.org/jira/browse/LUCENE-5269 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Attachments: LUCENE-5269_test.patch One of EdgeNGramTokenizer, ShingleFilter, NGramTokenFilter is buggy, or possibly only the combination of them conspiring together. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-5270) add Terms.hasFreqs
Michael McCandless created LUCENE-5270: -- Summary: add Terms.hasFreqs Key: LUCENE-5270 URL: https://issues.apache.org/jira/browse/LUCENE-5270 Project: Lucene - Core Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 4.6, 5.0 While working on LUCENE-5268, I realized we have hasPositions/Offsets/Payloads methods in Terms but not hasFreqs ... -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (SOLR-5317) CoreAdmin API is not persisting data properly
[ https://issues.apache.org/jira/browse/SOLR-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller reassigned SOLR-5317: - Assignee: Mark Miller CoreAdmin API is not persisting data properly - Key: SOLR-5317 URL: https://issues.apache.org/jira/browse/SOLR-5317 Project: Solr Issue Type: Bug Reporter: Yago Riveiro Assignee: Mark Miller Priority: Critical Fix For: 4.5.1, 4.6, 5.0 There is a regression between 4.4 and 4.5 with the CoreAdmin API, the command doesn't save the result on solr.xml at time that is executed. The full process is describe here: https://gist.github.com/yriveiro/6883208 -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org