Re: backward incompatibility with MockTokenFilter
Hey John, this class is used for testing only. It's part of the testing framework, and I don't think we can provide migration suggestions or backward compatibility for that package. If you rely on the functionality, I suggest you fork the code into your code base or move to an official alternative in the analysis jars.

simon

On Fri, Aug 16, 2013 at 7:06 AM, John Wang john.w...@gmail.com wrote:

Hi folks: In release 4.3.1, MockTokenFilter had an API to turn position increments on/off: set/getEnablePositionIncrements(). In release 4.4.0 that was removed, and the default behavior in 4.4.0 assumes it is true. But I don't see this change documented, nor a migration suggestion. Please advise. Thanks -John

- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5168) ByteSliceReader assert trips with 32-bit oracle 1.7.0_25 + G1GC
[ https://issues.apache.org/jira/browse/LUCENE-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741968#comment-13741968 ]

Dawid Weiss commented on LUCENE-5168:
-------------------------------------

I can reproduce the issue in a different scenario too (core tests), so it's quite definitely a compiler bug lurking somewhere.

{code}
[junit4] ERROR 0.00s | TestSimpleExplanations (suite)
[junit4] Throwable #1: java.lang.AssertionError
[junit4]    at __randomizedtesting.SeedInfo.seed([8C5A2DB2970990FA]:0)
[junit4]    at org.apache.lucene.index.FreqProxTermsWriterPerField.flush(FreqProxTermsWriterPerField.java:457)
[junit4]    at org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:85)
[junit4]    at org.apache.lucene.index.TermsHash.flush(TermsHash.java:116)
[junit4]    at org.apache.lucene.index.DocInverter.flush(DocInverter.java:53)
[junit4]    at org.apache.lucene.index.DocFieldProcessor.flush(DocFieldProcessor.java:81)
[junit4]    at org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:501)
[junit4]    at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:478)
[junit4]    at org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:615)
[junit4]    at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:365)
[junit4]    at org.apache.lucene.index.RandomIndexWriter.getReader(RandomIndexWriter.java:307)
[junit4]    at org.apache.lucene.index.RandomIndexWriter.getReader(RandomIndexWriter.java:249)
[junit4]    at org.apache.lucene.search.TestExplanations.beforeClassTestExplanations(TestExplanations.java:82)
[junit4]    at java.lang.Thread.run(Thread.java:724)
[junit4] Throwable #2: java.lang.NullPointerException
[junit4]    at __randomizedtesting.SeedInfo.seed([8C5A2DB2970990FA]:0)
[junit4]    at org.apache.lucene.search.TestExplanations.afterClassTestExplanations(TestExplanations.java:63)
[junit4]    at java.lang.Thread.run(Thread.java:724)
[junit4] Completed in 0.06s, 0 tests, 1 failure, 1 error
FAILURES!
{code}

Five failures out of a hundred full runs of Lucene's core tests. So it's not a frequent thing, but it does happen. Java 1.8 b102, 32-bit (Windows).

ByteSliceReader assert trips with 32-bit oracle 1.7.0_25 + G1GC
---------------------------------------------------------------

Key: LUCENE-5168
URL: https://issues.apache.org/jira/browse/LUCENE-5168
Project: Lucene - Core
Issue Type: Bug
Reporter: Robert Muir
Attachments: java8-windows-4x-3075-console.txt

This assertion trips (sometimes from different tests) if you run the highlighting tests on branch_4x with r1512807. It reproduces about half the time, always only with 32-bit + G1GC (other combinations do not seem to trip it; I didn't try looping or anything, really, though).

{noformat}
rmuir@beast:~/workspace/branch_4x$ svn up -r 1512807
rmuir@beast:~/workspace/branch_4x$ ant clean
rmuir@beast:~/workspace/branch_4x$ rm -rf .caches  # this is important, otherwise master seed does not work!
rmuir@beast:~/workspace/branch_4x/lucene/highlighter$ ant test -Dtests.jvms=2 -Dtests.seed=EBBFA6F4E80A7365 -Dargs="-server -XX:+UseG1GC"
{noformat}

Originally showed up like this:

{noformat}
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6874/
Java: 32bit/jdk1.7.0_25 -server -XX:+UseG1GC

1 tests failed.
REGRESSION: org.apache.lucene.search.postingshighlight.TestPostingsHighlighter.testUserFailedToIndexOffsets

Error Message:

Stack Trace:
java.lang.AssertionError
    at __randomizedtesting.SeedInfo.seed([EBBFA6F4E80A7365:1FBF811885F2D611]:0)
    at org.apache.lucene.index.ByteSliceReader.readByte(ByteSliceReader.java:73)
    at org.apache.lucene.store.DataInput.readVInt(DataInput.java:108)
    at org.apache.lucene.index.FreqProxTermsWriterPerField.flush(FreqProxTermsWriterPerField.java:453)
    at org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:85)
    at org.apache.lucene.index.TermsHash.flush(TermsHash.java:116)
    at org.apache.lucene.index.DocInverter.flush(DocInverter.java:53)
    at org.apache.lucene.index.DocFieldProcessor.flush(DocFieldProcessor.java:81)
    at org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:501)
{noformat}

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (SOLR-3280) too many / sometimes stale CLOSE_WAIT connections from SnapPuller during / after replication
[ https://issues.apache.org/jira/browse/SOLR-3280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741972#comment-13741972 ]

Bernd Fehling commented on SOLR-3280:
-------------------------------------

After going from Solr 3.6 to 4.2.1 I haven't seen this anymore. There was quite a bit of rework done in SnapPuller due to multicore. Which version are you using?

too many / sometimes stale CLOSE_WAIT connections from SnapPuller during / after replication
--------------------------------------------------------------------------------------------

Key: SOLR-3280
URL: https://issues.apache.org/jira/browse/SOLR-3280
Project: Solr
Issue Type: Bug
Affects Versions: 3.5, 3.6, 4.0-ALPHA
Reporter: Bernd Fehling
Assignee: Robert Muir
Priority: Minor
Attachments: SOLR-3280.patch

There are sometimes too many, and also stale, CLOSE_WAIT connections left over on the SLAVE server during/after replication. Normally GC should clean these up, but that is not always the case. Also, if a CLOSE_WAIT is hanging, the new replication won't load. The dirty workaround so far is to fake a TCP connection as root to that connection and close it. After that the new replication will load, the old index and searcher are released, and the system returns to normal operation.

Background: SnapPuller uses Apache HttpClient 3.x with MultiThreadedHttpConnectionManager. The manager holds a connection in CLOSE_WAIT after its use for further requests. This is done by calling releaseConnection. But if a connection is stuck it is not available any more, and a new connection from the pool is used.

Solution: after calling releaseConnection, clean up with closeIdleConnections(0).
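The proposed fix can be sketched against the commons-httpclient 3.x API that SnapPuller used at the time. This is an illustrative sketch only (the class, method, and URL are placeholders, not the actual SnapPuller code), and it needs the commons-httpclient jar on the classpath:

```
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.MultiThreadedHttpConnectionManager;
import org.apache.commons.httpclient.methods.GetMethod;

public class ReplicationFetchSketch {
  public static void fetch(String url) throws Exception {
    MultiThreadedHttpConnectionManager mgr = new MultiThreadedHttpConnectionManager();
    HttpClient client = new HttpClient(mgr);
    GetMethod get = new GetMethod(url);
    try {
      client.executeMethod(get);
      // ... read get.getResponseBodyAsStream() ...
    } finally {
      // Returning the connection to the pool can leave it lingering in CLOSE_WAIT;
      // reaping idle connections with a zero timeout closes it immediately.
      get.releaseConnection();
      mgr.closeIdleConnections(0);
    }
  }
}
```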
[jira] [Updated] (LUCENE-5168) ByteSliceReader assert trips with 32-bit oracle 1.7.0_25 + G1GC
[ https://issues.apache.org/jira/browse/LUCENE-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dawid Weiss updated LUCENE-5168:
--------------------------------

Attachment: log.0100
            log.0086
            log.0078
            log.0042
            log.0025

Failed logs from 1.8b102 runs.

ByteSliceReader assert trips with 32-bit oracle 1.7.0_25 + G1GC
---------------------------------------------------------------

Key: LUCENE-5168
URL: https://issues.apache.org/jira/browse/LUCENE-5168
Project: Lucene - Core
Issue Type: Bug
Reporter: Robert Muir
Attachments: java8-windows-4x-3075-console.txt, log.0025, log.0042, log.0078, log.0086, log.0100
[jira] [Commented] (SOLR-5164) Can not create a collection via collections API (cloud mode)
[ https://issues.apache.org/jira/browse/SOLR-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742014#comment-13742014 ]

Alan Woodward commented on SOLR-5164:
-------------------------------------

Related: SOLR-5099. I think we need an explicit test for creating collections via the API, though. It's a bit scary that this bug can occur without the test suite complaining about it. I'm busy for the next couple of days, but will have some time next week if nobody else gets there first.

Can not create a collection via collections API (cloud mode)
------------------------------------------------------------

Key: SOLR-5164
URL: https://issues.apache.org/jira/browse/SOLR-5164
Project: Solr
Issue Type: Bug
Affects Versions: 4.5, 5.0
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Blocker
Attachments: SOLR-5164.patch

When you try to create a collection in SolrCloud, the instanceDir that gets created has an extra "solr" in it, which messes up the pathing for all the lib directives in solrconfig.xml, as they're all relative.
[jira] [Commented] (LUCENE-5168) ByteSliceReader assert trips with 32-bit oracle 1.7.0_25 + G1GC
[ https://issues.apache.org/jira/browse/LUCENE-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742015#comment-13742015 ]

Dawid Weiss commented on LUCENE-5168:
-------------------------------------

This issue also affects 1.7.0_21-b11 (32 bit).

ByteSliceReader assert trips with 32-bit oracle 1.7.0_25 + G1GC
---------------------------------------------------------------

Key: LUCENE-5168
URL: https://issues.apache.org/jira/browse/LUCENE-5168
Project: Lucene - Core
Issue Type: Bug
Reporter: Robert Muir
Attachments: java8-windows-4x-3075-console.txt, log.0025, log.0042, log.0078, log.0086, log.0100
[jira] [Commented] (SOLR-5152) EdgeNGramFilterFactory deletes token
[ https://issues.apache.org/jira/browse/SOLR-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742051#comment-13742051 ]

Christoph Lingg commented on SOLR-5152:
---------------------------------------

How about a property, as in _WhitespaceTokenizerFactory_: preserveOriginal=1

EdgeNGramFilterFactory deletes token
------------------------------------

Key: SOLR-5152
URL: https://issues.apache.org/jira/browse/SOLR-5152
Project: Solr
Issue Type: Improvement
Affects Versions: 4.4
Reporter: Christoph Lingg

I am using EdgeNGramFilterFactory in my schema.xml:

{code:xml}
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- ... -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="10" side="front"/>
  </analyzer>
</fieldType>
{code}

Some tokens in my index consist of only one character, let's say {{R}}. minGramSize is set to 2 and is bigger than the length of the token. I expected the NGramFilter to leave {{R}} unchanged, but in fact it is deleting the token. For my use case this interpretation is undesirable, and probably for most use cases too!?
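A sketch of how the suggested property might look on the filter (note: preserveOriginal is a hypothetical attribute here; EdgeNGramFilterFactory in 4.4 does not support it):

```
<!-- Hypothetical: preserveOriginal="1" would emit the original token
     (e.g. "R") even when it is shorter than minGramSize. -->
<filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="10"
        side="front" preserveOriginal="1"/>
```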
[jira] [Commented] (LUCENE-5178) doc values should allow configurable defaults
[ https://issues.apache.org/jira/browse/LUCENE-5178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742072#comment-13742072 ]

ASF subversion and git services commented on LUCENE-5178:
---------------------------------------------------------

Commit 1514642 from [~rcmuir] in branch 'dev/branches/lucene5178' [ https://svn.apache.org/r1514642 ]

LUCENE-5178: add 'missing' support to docvalues (simpletext only)

doc values should allow configurable defaults
---------------------------------------------

Key: LUCENE-5178
URL: https://issues.apache.org/jira/browse/LUCENE-5178
Project: Lucene - Core
Issue Type: Improvement
Reporter: Yonik Seeley

DocValues should somehow allow a configurable default per-field. Possible implementations include setting it on the field in the document, or registration of an IndexWriter callback. If we don't make the default configurable, then another option is to have DocValues fields keep track of whether a value was indexed for that document or not.
[jira] [Commented] (LUCENE-5178) doc values should allow configurable defaults
[ https://issues.apache.org/jira/browse/LUCENE-5178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742073#comment-13742073 ]

Robert Muir commented on LUCENE-5178:
-------------------------------------

I created a patch with getDocsWithField (with the current FieldCache.getDocsWithField passing through to it) for docvalues, so you know if a value is missing. It also means that, e.g., SortedDocValues returns a -1 ord for missing, like the FieldCache, so it's completely consistent there with the FC.

Currently SimpleText is the only codec implementing it: the others return MatchAllBits (and that's how the backcompat will work, because they never had missing values before).

All tests are passing, but I want to think about strategies for the efficient codecs (Memory/Disk) before doing anything. One other thing I like is that, done this way, the codec has the chance to represent missing values more efficiently than if users do it themselves on top.

doc values should allow configurable defaults
---------------------------------------------

Key: LUCENE-5178
URL: https://issues.apache.org/jira/browse/LUCENE-5178
Project: Lucene - Core
Issue Type: Improvement
Reporter: Yonik Seeley
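For illustration, a consumer of such a "missing" API might look roughly like this. This is a sketch against the Lucene 4.x FieldCache entry points mentioned above; the field names are made up, and it needs the Lucene jars to compile:

```
import org.apache.lucene.index.AtomicReader;
import org.apache.lucene.index.SortedDocValues;
import org.apache.lucene.search.FieldCache;
import org.apache.lucene.util.Bits;

public class MissingValuesSketch {
  static void inspect(AtomicReader reader) throws Exception {
    // Which documents actually have an indexed value for the field?
    Bits docsWithField = FieldCache.DEFAULT.getDocsWithField(reader, "price");
    // Sorted doc values: a -1 ord signals "missing", consistent with the FieldCache.
    SortedDocValues dv = FieldCache.DEFAULT.getTermsIndex(reader, "category");
    for (int doc = 0; doc < reader.maxDoc(); doc++) {
      boolean hasPrice = docsWithField.get(doc);
      boolean hasCategory = dv.getOrd(doc) != -1;
      // ... treat missing values however the application needs ...
    }
  }
}
```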
[jira] [Commented] (SOLR-5164) Can not create a collection via collections API (cloud mode)
[ https://issues.apache.org/jira/browse/SOLR-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742098#comment-13742098 ]

Erick Erickson commented on SOLR-5164:
--------------------------------------

Blast, I wish I'd paid more attention to SOLR-5099; it'd have saved me some time. Sigh...

[~romseygeek] I looked, and there are some collection creation tests, but I didn't dig enough to understand completely why the second "solr" in the path didn't trip this condition. What we didn't seem to have was a way to restart from scratch. And in the case of SOLR-5099, core creation does succeed; it's the restart that's the problem. FWIW.

Can not create a collection via collections API (cloud mode)
------------------------------------------------------------

Key: SOLR-5164
URL: https://issues.apache.org/jira/browse/SOLR-5164
Project: Solr
Issue Type: Bug
Affects Versions: 4.5, 5.0
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Blocker
Attachments: SOLR-5164.patch
[jira] [Resolved] (SOLR-5099) The core.properties not created during collection creation
[ https://issues.apache.org/jira/browse/SOLR-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erick Erickson resolved SOLR-5099.
----------------------------------

Resolution: Fixed
Fix Version/s: 5.0
               4.5
Assignee: Erick Erickson (was: Alan Woodward)

Herb: I stumbled across this as well. I sure wish I'd paid more attention to this JIRA before; you'd have saved me a couple of hours of head-scratching. Nice sleuthing, you nailed the problem. Anyway, I'll check in the fixes for SOLR-5164 this morning and this will be fixed.

The core.properties not created during collection creation
----------------------------------------------------------

Key: SOLR-5099
URL: https://issues.apache.org/jira/browse/SOLR-5099
Project: Solr
Issue Type: Bug
Components: SolrCloud
Affects Versions: 4.5, 5.0
Reporter: Herb Jiang
Assignee: Erick Erickson
Priority: Critical
Fix For: 4.5, 5.0
Attachments: CorePropertiesLocator.java.patch

When using the new solr.xml structure, the core auto-discovery mechanism tries to find core.properties. But I found that core.properties cannot be created when I dynamically create a collection. The root issue is that CorePropertiesLocator tries to create the properties file before the instanceDir is created. Collection creation completes and looks fine at runtime, but it causes issues: cores are not auto-discovered after server restart.
Re: MoreLikeThis (MLT) - AND operator between the fields
I don't know enough about MLT to have an opinion one way or the other. But it's perfectly fine to open a JIRA and attach your patch, see: http://wiki.apache.org/solr/HowToContribute

Best,
Erick

On Thu, Aug 15, 2013 at 12:13 PM, Kranti Parisa kranti.par...@gmail.com wrote:

I was looking at the code and found that it is hard-coded to Occur.SHOULD in MoreLikeThisQuery. I customized the code to pass a new parameter, *mlt.operator*=AND/OR, and based on that it computes the MLT documents. The default operator is set to OR. I also want to have an *mlt.sort* option, so I will be trying that as well. Do you guys think we should make this part of the MLT feature? Please share your ideas. I can submit this change.

Thanks & Regards,
Kranti K Parisa
http://www.linkedin.com/in/krantiparisa

On Thu, Aug 15, 2013 at 12:05 AM, Kranti Parisa kranti.par...@gmail.com wrote:

Hi, It seems that when we pass multiple field names with the mlt.fl parameter, they are ORed to find the MLT documents. Is there a way to specify the AND operator? That is, if mlt.fl=language,year, we should return the MLT documents whose language AND year field values are the same as the main query's result document.

http://localhost:8180/solr/mltCore/mlt?q=id:1&wt=json&mlt=true&mlt.fl=language,year&fl=*,score&mlt.mindf=0&mlt.mintf=0&mlt.match.include=false

The above query should return those documents whose field values (language, year) exactly match document id:1. Is this possible through any config or param? If not, I think it's worth having as a feature, because we don't know the values of those fields to apply as an FQ.

Thanks & Regards,
Kranti K Parisa
http://www.linkedin.com/in/krantiparisa
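With the mlt.operator parameter proposed in the thread above (a hypothetical parameter at this point, not an existing MLT option), the request might become:

```
http://localhost:8180/solr/mltCore/mlt?q=id:1&wt=json&mlt=true
    &mlt.fl=language,year&mlt.operator=AND
    &fl=*,score&mlt.mindf=0&mlt.mintf=0&mlt.match.include=false
```

With AND semantics, only documents matching both the language and the year of document id:1 would come back as MLT results.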
[jira] [Created] (SOLR-5167) Ability to use AnalyzingInfixSuggester in Solr
Varun Thacker created SOLR-5167:
-----------------------------------

Summary: Ability to use AnalyzingInfixSuggester in Solr
Key: SOLR-5167
URL: https://issues.apache.org/jira/browse/SOLR-5167
Project: Solr
Issue Type: New Feature
Components: SearchComponents - other
Reporter: Varun Thacker
Priority: Minor
Fix For: 4.5, 5.0

We should be able to use AnalyzingInfixSuggester in Solr by defining it in solrconfig.xml.
[jira] [Commented] (SOLR-5149) Query facet to respect mincount
[ https://issues.apache.org/jira/browse/SOLR-5149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742123#comment-13742123 ]

Markus Jelsma commented on SOLR-5149:
-------------------------------------

The use cases mostly come down to saving space when we have a large number of facet queries to return. Also, if our different clients toggle mincount with one setting but also have facet queries, we need additional code to maintain the behaviour. This is not a problem, only inconvenient. Yes, facet.query.mincount sounds fine.

Query facet to respect mincount
-------------------------------

Key: SOLR-5149
URL: https://issues.apache.org/jira/browse/SOLR-5149
Project: Solr
Issue Type: Bug
Components: SearchComponents - other
Affects Versions: 4.4
Reporter: Markus Jelsma
Priority: Minor
Fix For: 4.5, 5.0
Attachments: SOLR-5149-trunk.patch
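With the facet.query.mincount parameter discussed above (hypothetical; this is the agreed-on name, not an existing Solr 4.4 parameter), a request might look like the following, with facet queries whose count falls below the threshold omitted from the response. The host, core, and field names are illustrative:

```
http://localhost:8983/solr/collection1/select?q=*:*&rows=0&facet=true
    &facet.query=price:[0 TO 10]
    &facet.query=price:[10 TO 100]
    &facet.query.mincount=1
```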
[jira] [Updated] (SOLR-5149) Query facet to respect mincount
[ https://issues.apache.org/jira/browse/SOLR-5149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Jelsma updated SOLR-5149:
--------------------------------

Attachment: SOLR-5149-trunk.patch

Patch for trunk now introduces facet.query.mincount. There's no support for facet.zeros in this patch.

Query facet to respect mincount
-------------------------------

Key: SOLR-5149
URL: https://issues.apache.org/jira/browse/SOLR-5149
Project: Solr
Issue Type: Bug
Components: SearchComponents - other
Affects Versions: 4.4
Reporter: Markus Jelsma
Priority: Minor
Fix For: 4.5, 5.0
Attachments: SOLR-5149-trunk.patch, SOLR-5149-trunk.patch
[jira] [Commented] (SOLR-5164) Can not create a collection via collections API (cloud mode)
[ https://issues.apache.org/jira/browse/SOLR-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742134#comment-13742134 ]

ASF subversion and git services commented on SOLR-5164:
-------------------------------------------------------

Commit 1514666 from [~erickoerickson] in branch 'dev/trunk' [ https://svn.apache.org/r1514666 ]

SOLR-5164: Can not create a collection via collections API (cloud mode). Fixes SOLR-5099 too.

Can not create a collection via collections API (cloud mode)
------------------------------------------------------------

Key: SOLR-5164
URL: https://issues.apache.org/jira/browse/SOLR-5164
Project: Solr
Issue Type: Bug
Affects Versions: 4.5, 5.0
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Blocker
Attachments: SOLR-5164.patch
[jira] [Commented] (SOLR-5099) The core.properties not created during collection creation
[ https://issues.apache.org/jira/browse/SOLR-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742135#comment-13742135 ]

ASF subversion and git services commented on SOLR-5099:
-------------------------------------------------------

Commit 1514666 from [~erickoerickson] in branch 'dev/trunk' [ https://svn.apache.org/r1514666 ]

SOLR-5164: Can not create a collection via collections API (cloud mode). Fixes SOLR-5099 too.

The core.properties not created during collection creation
----------------------------------------------------------

Key: SOLR-5099
URL: https://issues.apache.org/jira/browse/SOLR-5099
Project: Solr
Issue Type: Bug
Components: SolrCloud
Affects Versions: 4.5, 5.0
Reporter: Herb Jiang
Assignee: Erick Erickson
Priority: Critical
Fix For: 4.5, 5.0
Attachments: CorePropertiesLocator.java.patch
[jira] [Commented] (SOLR-5167) Ability to use AnalyzingInfixSuggester in Solr
[ https://issues.apache.org/jira/browse/SOLR-5167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742139#comment-13742139 ]

Varun Thacker commented on SOLR-5167:
-------------------------------------

We could define it like:

{noformat}
<searchComponent class="solr.SpellCheckComponent" name="suggest">
  <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.AnalyzingInfixSuggester</str>
    <str name="field">name</str> <!-- the indexed field to derive suggestions from -->
    <str name="buildOnCommit">true</str>
    <str name="storeDir">suggester</str>
    <str name="suggestAnalyzerFieldType">text_general</str>
    <str name="minPrefixChars">4</str>
  </lst>
</searchComponent>
{noformat}

Ability to use AnalyzingInfixSuggester in Solr
----------------------------------------------

Key: SOLR-5167
URL: https://issues.apache.org/jira/browse/SOLR-5167
Project: Solr
Issue Type: New Feature
Components: SearchComponents - other
Reporter: Varun Thacker
Priority: Minor
Fix For: 4.5, 5.0

We should be able to use AnalyzingInfixSuggester in Solr by defining it in solrconfig.xml.
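To actually serve suggestions, a request handler wired to that component would also be needed. A minimal sketch along the usual SpellCheckComponent lines; the handler name and defaults here are illustrative, not part of the proposal:

```
<requestHandler name="/suggest" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">suggest</str>  <!-- matches the spellchecker name -->
    <str name="spellcheck.count">10</str>
  </lst>
  <arr name="components">
    <str>suggest</str>  <!-- the searchComponent name -->
  </arr>
</requestHandler>
```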
[jira] [Commented] (LUCENE-4583) StraightBytesDocValuesField fails if bytes > 32k
[ https://issues.apache.org/jira/browse/LUCENE-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742145#comment-13742145 ]

ASF subversion and git services commented on LUCENE-4583:
---------------------------------------------------------

Commit 1514669 from [~mikemccand] in branch 'dev/trunk' [ https://svn.apache.org/r1514669 ]

LUCENE-4583: IndexWriter no longer places a limit on length of DV binary fields (individual codecs still have their limits, including the default codec)

StraightBytesDocValuesField fails if bytes > 32k
------------------------------------------------

Key: LUCENE-4583
URL: https://issues.apache.org/jira/browse/LUCENE-4583
Project: Lucene - Core
Issue Type: Bug
Components: core/index
Affects Versions: 4.0, 4.1, 5.0
Reporter: David Smiley
Assignee: Michael McCandless
Priority: Critical
Fix For: 5.0, 4.5
Attachments: LUCENE-4583.patch, LUCENE-4583.patch, LUCENE-4583.patch, LUCENE-4583.patch, LUCENE-4583.patch, LUCENE-4583.patch, LUCENE-4583.patch, LUCENE-4583.patch

I didn't observe any limitations on the size of a bytes-based DocValues field value in the docs. It appears that the limit is 32k, although I didn't get any friendly error telling me that was the limit. 32k is kind of small IMO; I suspect this limit is unintended and as such is a bug. The following test fails:

{code:java}
public void testBigDocValue() throws IOException {
  Directory dir = newDirectory();
  IndexWriter writer = new IndexWriter(dir, writerConfig(false));
  Document doc = new Document();
  BytesRef bytes = new BytesRef((4 + 4) * 4097); // 4096 works
  bytes.length = bytes.bytes.length; // byte data doesn't matter
  doc.add(new StraightBytesDocValuesField("dvField", bytes));
  writer.addDocument(doc);
  writer.commit();
  writer.close();
  DirectoryReader reader = DirectoryReader.open(dir);
  DocValues docValues = MultiDocValues.getDocValues(reader, "dvField");
  // FAILS IF BYTES IS BIG!
  docValues.getSource().getBytes(0, bytes);
  reader.close();
  dir.close();
}
{code}
[jira] [Updated] (SOLR-5167) Ability to use AnalyzingInfixSuggester in Solr
[ https://issues.apache.org/jira/browse/SOLR-5167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Thacker updated SOLR-5167: Attachment: SOLR-5167.patch I have a few doubts about this implementation. 1. AnalyzingInfixSuggester.store() and AnalyzingInfixSuggester.load() return true instead of false. Not sure if this is right? 2. Suggester.reload() throws a FileNotFoundException since no file actually gets written. Any suggestions on what the right approach for this would be?

Ability to use AnalyzingInfixSuggester in Solr -- Key: SOLR-5167 URL: https://issues.apache.org/jira/browse/SOLR-5167 Project: Solr Issue Type: New Feature Components: SearchComponents - other Reporter: Varun Thacker Priority: Minor Fix For: 4.5, 5.0 Attachments: SOLR-5167.patch We should be able to use AnalyzingInfixSuggester in Solr by defining it in solrconfig.xml
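For context, suggesters in 4.x are declared through solrconfig.xml, so the wiring for this feature might look roughly like the sketch below. Everything here (the lookupImpl value, the parameter names, the field choices) is an illustrative assumption about how the attached patch could expose AnalyzingInfixSuggester, not the committed configuration:

```xml
<!-- Hypothetical sketch only: declaring an AnalyzingInfixSuggester-backed
     suggester. Class and parameter names are assumptions, not the final API. -->
<searchComponent name="suggest" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">infix</str>
    <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
    <str name="field">title</str>
    <str name="suggestAnalyzerFieldType">text_general</str>
  </lst>
</searchComponent>
```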
[jira] [Updated] (LUCENE-5178) doc values should expose missing values (or allow configurable defaults)
[ https://issues.apache.org/jira/browse/LUCENE-5178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley updated LUCENE-5178: - Summary: doc values should expose missing values (or allow configurable defaults) (was: doc values should allow configurable defaults) doc values should expose missing values (or allow configurable defaults) Key: LUCENE-5178 URL: https://issues.apache.org/jira/browse/LUCENE-5178 Project: Lucene - Core Issue Type: Improvement Reporter: Yonik Seeley DocValues should somehow allow a configurable default per-field. Possible implementations include setting it on the field in the document or registration of an IndexWriter callback. If we don't make the default configurable, then another option is to have DocValues fields keep track of whether a value was indexed for that document or not.
[jira] [Commented] (SOLR-5099) The core.properties not created during collection creation
[ https://issues.apache.org/jira/browse/SOLR-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742163#comment-13742163 ] ASF subversion and git services commented on SOLR-5099: --- Commit 1514684 from [~erickoerickson] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1514684 ] SOLR-5164, Can not create a collection via collections API (cloud mode). Fixes SOLR-5099 too

The core.properties not created during collection creation -- Key: SOLR-5099 URL: https://issues.apache.org/jira/browse/SOLR-5099 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.5, 5.0 Reporter: Herb Jiang Assignee: Erick Erickson Priority: Critical Fix For: 4.5, 5.0 Attachments: CorePropertiesLocator.java.patch

When using the new solr.xml structure, the core auto-discovery mechanism tries to find core.properties. But I found that core.properties cannot be created when I dynamically create a collection. The root issue is that CorePropertiesLocator tries to create the properties file before the instanceDir is created. The collection creation process completes and looks fine at runtime, but it causes issues later (cores are not auto-discovered after server restart).
[jira] [Commented] (SOLR-5164) Can not create a collection via collections API (cloud mode)
[ https://issues.apache.org/jira/browse/SOLR-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742162#comment-13742162 ] ASF subversion and git services commented on SOLR-5164: --- Commit 1514684 from [~erickoerickson] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1514684 ] SOLR-5164, Can not create a collection via collections API (cloud mode). Fixes SOLR-5099 too

Can not create a collection via collections API (cloud mode) Key: SOLR-5164 URL: https://issues.apache.org/jira/browse/SOLR-5164 Project: Solr Issue Type: Bug Affects Versions: 4.5, 5.0 Reporter: Erick Erickson Assignee: Erick Erickson Priority: Blocker Attachments: SOLR-5164.patch

When you try to create a collection in SolrCloud, the instanceDir that gets created has an extra 'solr' segment in it, which messes up the pathing for all the lib directives in solrconfig.xml, as they're all relative.
[jira] [Resolved] (SOLR-5164) Can not create a collection via collections API (cloud mode)
[ https://issues.apache.org/jira/browse/SOLR-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson resolved SOLR-5164. -- Resolution: Fixed Fix Version/s: 5.0 4.5

Can not create a collection via collections API (cloud mode) Key: SOLR-5164 URL: https://issues.apache.org/jira/browse/SOLR-5164 Project: Solr Issue Type: Bug Affects Versions: 4.5, 5.0 Reporter: Erick Erickson Assignee: Erick Erickson Priority: Blocker Fix For: 4.5, 5.0 Attachments: SOLR-5164.patch

When you try to create a collection in SolrCloud, the instanceDir that gets created has an extra 'solr' segment in it, which messes up the pathing for all the lib directives in solrconfig.xml, as they're all relative.
[jira] [Commented] (LUCENE-5178) doc values should expose missing values (or allow configurable defaults)
[ https://issues.apache.org/jira/browse/LUCENE-5178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742182#comment-13742182 ] Yonik Seeley commented on LUCENE-5178: -- Yes, I think tracking/exposing missing values is the best option, especially for numerics, where you can use the full range and still tell if there was a value or not.

doc values should expose missing values (or allow configurable defaults) Key: LUCENE-5178 URL: https://issues.apache.org/jira/browse/LUCENE-5178 Project: Lucene - Core Issue Type: Improvement Reporter: Yonik Seeley DocValues should somehow allow a configurable default per-field. Possible implementations include setting it on the field in the document or registration of an IndexWriter callback. If we don't make the default configurable, then another option is to have DocValues fields keep track of whether a value was indexed for that document or not.
[jira] [Commented] (LUCENE-5178) doc values should expose missing values (or allow configurable defaults)
[ https://issues.apache.org/jira/browse/LUCENE-5178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742186#comment-13742186 ] Robert Muir commented on LUCENE-5178: - OK. I can remove the Solr defaultValue check here too: I have to fix the tests to test sort missing first/last, facet missing, etc. anyway (currently the DV tests avoid that).

doc values should expose missing values (or allow configurable defaults) Key: LUCENE-5178 URL: https://issues.apache.org/jira/browse/LUCENE-5178 Project: Lucene - Core Issue Type: Improvement Reporter: Yonik Seeley DocValues should somehow allow a configurable default per-field. Possible implementations include setting it on the field in the document or registration of an IndexWriter callback. If we don't make the default configurable, then another option is to have DocValues fields keep track of whether a value was indexed for that document or not.
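The tradeoff being debated here (expose "missing" vs. bake in a configurable default) can be sketched outside Lucene in a few lines. The class and method names below are illustrative, not Lucene API; the point is that a parallel "docs with field" structure lets callers distinguish a missing value from an indexed zero:

```java
// Illustrative sketch only: track which docs actually had a value indexed,
// so "missing" is distinguishable from an indexed 0. Names are made up.
public class MissingAwareValues {
    private final long[] values;      // one slot per docID
    private final boolean[] hasValue; // parallel "docs with field" bits

    public MissingAwareValues(long[] values, boolean[] hasValue) {
        this.values = values;
        this.hasValue = hasValue;
    }

    /** True if a value was indexed for this document. */
    public boolean exists(int docID) {
        return hasValue[docID];
    }

    /** The indexed value, or the caller-chosen default for missing docs. */
    public long get(int docID, long missingDefault) {
        return hasValue[docID] ? values[docID] : missingDefault;
    }
}
```

With this shape, sort-missing-first/last and "facet missing" can be layered on top without reserving a sentinel value, which is why exposing missing works even when numerics use the full long range.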
[jira] [Updated] (SOLR-4718) Allow solr.xml to be stored in zookeeper
[ https://issues.apache.org/jira/browse/SOLR-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson updated SOLR-4718: - Attachment: SOLR-4718.patch Alan's patch with some modifications and with the new test cases.

Allow solr.xml to be stored in zookeeper Key: SOLR-4718 URL: https://issues.apache.org/jira/browse/SOLR-4718 Project: Solr Issue Type: Improvement Components: Schema and Analysis Affects Versions: 4.3, 5.0 Reporter: Erick Erickson Assignee: Erick Erickson Attachments: SOLR-4718-alternative.patch, SOLR-4718.patch, SOLR-4718.patch, SOLR-4718.patch, SOLR-4718.patch, SOLR-4718.patch, SOLR-4718.patch

So the near-final piece of this puzzle is to make solr.xml be storable in ZooKeeper. Code-wise in terms of Solr, this doesn't look very difficult; I'm working on it now. More interesting is how to get the configuration into ZK in the first place: enhancements to ZkCli? Or bootstrap-conf? Other? I'm punting on that for this patch. Second level is how to tell Solr to get the file from ZK. Some possibilities:

1. A system prop, -DzkSolrXmlPath=blah where blah is the path _on zk_ where the file is. Would require -DzkHost or -DzkRun as well.
pros - simple, I can wrap my head around it. - easy to script
cons - can't run multiple JVMs pointing to different files. Is this really a problem?

2. New solr.xml element. Something like:
{code:xml}
<solr>
  <solrcloud>
    <str name="zkHost">zkurl</str>
    <str name="zkSolrXmlPath">whatever</str>
  </solrcloud>
</solr>
{code}
Really, this form would hinge on the presence or absence of zkSolrXmlPath. If present, go up and look for the indicated solr.xml file on ZK. Any properties in the ZK version would overwrite anything in the local copy. NOTE: I'm really not very interested in supporting this as an option for old-style solr.xml unless it's _really_ easy. For instance, what if the local solr.xml is new-style and the one in ZK is old-style? Or vice-versa? Since old-style is going away, this doesn't seem like it's worth the effort. 
pros - No new mechanisms
cons - once again requires that there be a solr.xml file on each client. Admittedly, for installations that didn't care much about multiple JVMs, it could be a stock file that didn't change...

For now, I'm going to just manually push solr.xml to ZK, then read it based on a sysprop. That'll get the structure in place while we debate. Not going to check this in until there's some consensus though.
[jira] [Commented] (LUCENE-5168) ByteSliceReader assert trips with 32-bit oracle 1.7.0_25 + G1GC
[ https://issues.apache.org/jira/browse/LUCENE-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742233#comment-13742233 ] Robert Muir commented on LUCENE-5168: - Out of curiosity, were those failures also with G1GC?

ByteSliceReader assert trips with 32-bit oracle 1.7.0_25 + G1GC --- Key: LUCENE-5168 URL: https://issues.apache.org/jira/browse/LUCENE-5168 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Attachments: java8-windows-4x-3075-console.txt, log.0025, log.0042, log.0078, log.0086, log.0100

This assertion trips (sometimes from different tests) if you run the highlighting tests on branch_4x with r1512807. It reproduces about half the time, always only with 32bit + G1GC (other combinations do not seem to trip it; I didn't try looping or anything really though).
{noformat}
rmuir@beast:~/workspace/branch_4x$ svn up -r 1512807
rmuir@beast:~/workspace/branch_4x$ ant clean
rmuir@beast:~/workspace/branch_4x$ rm -rf .caches #this is important, otherwise master seed does not work!
rmuir@beast:~/workspace/branch_4x/lucene/highlighter$ ant test -Dtests.jvms=2 -Dtests.seed=EBBFA6F4E80A7365 -Dargs=-server -XX:+UseG1GC
{noformat}
Originally showed up like this:
{noformat}
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6874/
Java: 32bit/jdk1.7.0_25 -server -XX:+UseG1GC
1 tests failed. 
REGRESSION: org.apache.lucene.search.postingshighlight.TestPostingsHighlighter.testUserFailedToIndexOffsets
Error Message:
Stack Trace:
java.lang.AssertionError
at __randomizedtesting.SeedInfo.seed([EBBFA6F4E80A7365:1FBF811885F2D611]:0)
at org.apache.lucene.index.ByteSliceReader.readByte(ByteSliceReader.java:73)
at org.apache.lucene.store.DataInput.readVInt(DataInput.java:108)
at org.apache.lucene.index.FreqProxTermsWriterPerField.flush(FreqProxTermsWriterPerField.java:453)
at org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:85)
at org.apache.lucene.index.TermsHash.flush(TermsHash.java:116)
at org.apache.lucene.index.DocInverter.flush(DocInverter.java:53)
at org.apache.lucene.index.DocFieldProcessor.flush(DocFieldProcessor.java:81)
at org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:501)
{noformat}
[jira] [Updated] (SOLR-5156) Provide a way to move the contents of a file to ZooKeeper with ZkCLI
[ https://issues.apache.org/jira/browse/SOLR-5156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson updated SOLR-5156: - Attachment: SOLR-5156.patch I'll commit this shortly.

Provide a way to move the contents of a file to ZooKeeper with ZkCLI Key: SOLR-5156 URL: https://issues.apache.org/jira/browse/SOLR-5156 Project: Solr Issue Type: Improvement Reporter: Erick Erickson Assignee: Erick Erickson Attachments: SOLR-5156.patch, SOLR-5156.patch Spinoff from SOLR-4718. We don't have any good way of putting solr.xml up in ZooKeeper in the first place. So while we can fake getting the file up there, we need a way consistent with ZkCLI.
[jira] [Created] (LUCENE-5179) Refactoring on PostingsWriterBase for delta-encoding
Han Jiang created LUCENE-5179: - Summary: Refactoring on PostingsWriterBase for delta-encoding Key: LUCENE-5179 URL: https://issues.apache.org/jira/browse/LUCENE-5179 Project: Lucene - Core Issue Type: Improvement Reporter: Han Jiang Assignee: Han Jiang Fix For: 5.0, 4.5

A further step from LUCENE-5029. The short story is, the previous API change brings two problems: * it somewhat breaks backward compatibility: although we can still read the old format, we can no longer reproduce it; * the pulsing codec has problems with it. And the long story... With the change, the current PostingsBase API will be like this: * the term dict tells the PBF we start a new term (via startTerm()); * the PBF adds docs, positions and other postings data; * the term dict tells the PBF all the data for the current term is completed (via finishTerm()), then the PBF returns the metadata for the current term (as long[] and byte[]); * the term dict might buffer all the metadata in an ArrayList; when all the terms are collected, it then decides how that metadata will be located on disk. So after the API change, the PBF no longer has that annoying 'flushTermBlock', and instead the term dict maintains the term/metadata list. However, for each term we'll now write the long[] blob before the byte[], so the index format is not consistent with pre-4.5. Like in Lucene41, the metadata could be written as longA,bytesA,longB, but now we have to write it as longA,longB,bytesA. Another problem is, the pulsing codec cannot tell the wrapped PBF how the metadata is delta-encoded; after all, PulsingPostingsWriter is only a PBF. For example, if we have terms=[a, a1, a2, b, b1, b2] and itemsInBlock=2, theoretically we'll finally have three blocks in BTTR: [a b] [a1 a2] [b1 b2]. With this approach, the metadata of term b is delta-encoded based on the metadata of a, but when the term dict tells the PBF to finishTerm(b), it might naively do the delta encode based on term a2. 
So I think maybe we can introduce a method 'encodeTerm(long[], DataOutput out, FieldInfo, TermState, boolean absolute)', so that during metadata flush we can control how the current term is written. And the term dict will buffer TermState, which implicitly holds metadata like we do on the PBReader side. For example, if we want to reproduce the old Lucene41 format, we can simply set longsSize==0; then the PBF writes the old format (longA,bytesA,longB) to DataOutput, and the compatibility issue is solved. For the pulsing codec, it will also be able to tell the lower level how to encode metadata.
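The effect of that proposed 'absolute' flag can be sketched with a toy encoder. The class below is illustrative only (it is not the actual PostingsWriterBase API) and models a single metadata long, a file pointer, to show why the term dictionary, not the PBF, must pick the delta base:

```java
// Toy model of the proposed encodeTerm(..., boolean absolute) contract:
// the term dictionary decides whether metadata is written absolutely
// (at the start of a block) or as a delta against the previous term.
public class MetadataEncoder {
    private long lastFP; // file pointer written for the previous term

    /** Returns the value to write for this term's file-pointer metadata. */
    public long encode(long fp, boolean absolute) {
        long out = absolute ? fp : fp - lastFP;
        lastFP = fp; // becomes the base for the next delta
        return out;
    }
}
```

With blocks [a b] [a1 a2], the term dict would call encode(fpOfB, false) right after encoding a, or encode(fpOfB, true) if b opens a fresh block, instead of the wrapped writer silently delta-encoding b against a2.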
[jira] [Commented] (LUCENE-5168) ByteSliceReader assert trips with 32-bit oracle 1.7.0_25 + G1GC
[ https://issues.apache.org/jira/browse/LUCENE-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742243#comment-13742243 ] Dawid Weiss commented on LUCENE-5168: - Yes. This just has to be complex though because it's not just GC-related. Disabling escape analysis also makes the tests pass, so does removing inlining. I managed to find a reproducible scenario under 1.8 (fastdebug), which is great because now I can dump the assembly. It's still terribly large... Anyway, the blame still seems to point to readVInt :) Really, not joking. I added a sysout in
{code}
final int code = freq.readVInt();
{code}
This is consistent when the test passes, but when it fails you get a difference:
{code}
// normal run
code::0 true
code::4 true
code::2 true

// error run
code::0 true
code::3 true
code::4 true
{code}

ByteSliceReader assert trips with 32-bit oracle 1.7.0_25 + G1GC --- Key: LUCENE-5168 URL: https://issues.apache.org/jira/browse/LUCENE-5168 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Attachments: java8-windows-4x-3075-console.txt, log.0025, log.0042, log.0078, log.0086, log.0100

This assertion trips (sometimes from different tests) if you run the highlighting tests on branch_4x with r1512807. It reproduces about half the time, always only with 32bit + G1GC (other combinations do not seem to trip it; I didn't try looping or anything really though).
{noformat}
rmuir@beast:~/workspace/branch_4x$ svn up -r 1512807
rmuir@beast:~/workspace/branch_4x$ ant clean
rmuir@beast:~/workspace/branch_4x$ rm -rf .caches #this is important, otherwise master seed does not work!
rmuir@beast:~/workspace/branch_4x/lucene/highlighter$ ant test -Dtests.jvms=2 -Dtests.seed=EBBFA6F4E80A7365 -Dargs=-server -XX:+UseG1GC
{noformat}
Originally showed up like this:
{noformat}
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6874/
Java: 32bit/jdk1.7.0_25 -server -XX:+UseG1GC
1 tests failed. 
REGRESSION: org.apache.lucene.search.postingshighlight.TestPostingsHighlighter.testUserFailedToIndexOffsets
Error Message:
Stack Trace:
java.lang.AssertionError
at __randomizedtesting.SeedInfo.seed([EBBFA6F4E80A7365:1FBF811885F2D611]:0)
at org.apache.lucene.index.ByteSliceReader.readByte(ByteSliceReader.java:73)
at org.apache.lucene.store.DataInput.readVInt(DataInput.java:108)
at org.apache.lucene.index.FreqProxTermsWriterPerField.flush(FreqProxTermsWriterPerField.java:453)
at org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:85)
at org.apache.lucene.index.TermsHash.flush(TermsHash.java:116)
at org.apache.lucene.index.DocInverter.flush(DocInverter.java:53)
at org.apache.lucene.index.DocFieldProcessor.flush(DocFieldProcessor.java:81)
at org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:501)
{noformat}
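For readers following the readVInt() trail in these traces: Lucene's VInt is the standard 7-bits-per-byte varint, with the high bit of each byte marking "more bytes follow". The from-scratch sketch below illustrates the format the failing readVInt() decodes; it is a standalone illustration, not the actual DataInput/DataOutput code:

```java
// Standalone illustration of VInt coding: 7 data bits per byte, high bit set
// on every byte except the last. Not the real Lucene DataOutput/DataInput.
public class VInt {
    /** Encodes value into buf at off; returns the offset past the last byte. */
    public static int write(byte[] buf, int off, int value) {
        while ((value & ~0x7F) != 0) {
            buf[off++] = (byte) ((value & 0x7F) | 0x80); // more bytes follow
            value >>>= 7;
        }
        buf[off++] = (byte) value; // final byte, high bit clear
        return off;
    }

    /** Decodes the VInt starting at off. */
    public static int read(byte[] buf, int off) {
        byte b = buf[off++];
        int value = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = buf[off++];
            value |= (b & 0x7F) << shift;
        }
        return value;
    }
}
```

A miscompiled read of this loop would surface exactly as a decoded value disagreeing with what was written, consistent with the code::3 vs. code::4 mismatch in the log above.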
[jira] [Reopened] (SOLR-5164) Can not create a collection via collections API (cloud mode)
[ https://issues.apache.org/jira/browse/SOLR-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller reopened SOLR-5164: --- We should add a test case to the collections API that catches this. Also, did this affect 4.4? The Affects versions seem to indicate not? If that's the case, there should be no separate changes entry.

Can not create a collection via collections API (cloud mode) Key: SOLR-5164 URL: https://issues.apache.org/jira/browse/SOLR-5164 Project: Solr Issue Type: Bug Affects Versions: 4.5, 5.0 Reporter: Erick Erickson Assignee: Erick Erickson Priority: Blocker Fix For: 4.5, 5.0 Attachments: SOLR-5164.patch

When you try to create a collection in SolrCloud, the instanceDir that gets created has an extra 'solr' segment in it, which messes up the pathing for all the lib directives in solrconfig.xml, as they're all relative.
[jira] [Reopened] (SOLR-5099) The core.properties not created during collection creation
[ https://issues.apache.org/jira/browse/SOLR-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller reopened SOLR-5099: --- We need a test for this as well - I'm happy to do it if no one else does, but let's not resolve these types of bugs until we have tests for them.

The core.properties not created during collection creation -- Key: SOLR-5099 URL: https://issues.apache.org/jira/browse/SOLR-5099 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.5, 5.0 Reporter: Herb Jiang Assignee: Erick Erickson Priority: Critical Fix For: 4.5, 5.0 Attachments: CorePropertiesLocator.java.patch

When using the new solr.xml structure, the core auto-discovery mechanism tries to find core.properties. But I found that core.properties cannot be created when I dynamically create a collection. The root issue is that CorePropertiesLocator tries to create the properties file before the instanceDir is created. The collection creation process completes and looks fine at runtime, but it causes issues later (cores are not auto-discovered after server restart).
[jira] [Commented] (SOLR-5164) Can not create a collection via collections API (cloud mode)
[ https://issues.apache.org/jira/browse/SOLR-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742259#comment-13742259 ] Erick Erickson commented on SOLR-5164: -- Yeah, we should have a test, but this has been a pretty big rathole for me already and I didn't see a simple way to create a test; see my comment earlier. No, it didn't affect 4.4, so I'll take the entry out of CHANGES.txt in the next JIRA I fix (should be this morning sometime).

Can not create a collection via collections API (cloud mode) Key: SOLR-5164 URL: https://issues.apache.org/jira/browse/SOLR-5164 Project: Solr Issue Type: Bug Affects Versions: 4.5, 5.0 Reporter: Erick Erickson Assignee: Erick Erickson Priority: Blocker Fix For: 4.5, 5.0 Attachments: SOLR-5164.patch

When you try to create a collection in SolrCloud, the instanceDir that gets created has an extra 'solr' segment in it, which messes up the pathing for all the lib directives in solrconfig.xml, as they're all relative.
[jira] [Commented] (SOLR-5164) Can not create a collection via collections API (cloud mode)
[ https://issues.apache.org/jira/browse/SOLR-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742258#comment-13742258 ] Mark Miller commented on SOLR-5164: --- I've reopened SOLR-5099 as well - tests for these bugs are as important as the fixes.

Can not create a collection via collections API (cloud mode) Key: SOLR-5164 URL: https://issues.apache.org/jira/browse/SOLR-5164 Project: Solr Issue Type: Bug Affects Versions: 4.5, 5.0 Reporter: Erick Erickson Assignee: Erick Erickson Priority: Blocker Fix For: 4.5, 5.0 Attachments: SOLR-5164.patch

When you try to create a collection in SolrCloud, the instanceDir that gets created has an extra 'solr' segment in it, which messes up the pathing for all the lib directives in solrconfig.xml, as they're all relative.
[jira] [Updated] (SOLR-5156) Provide a way to move the contents of a file to ZooKeeper with ZkCLI
[ https://issues.apache.org/jira/browse/SOLR-5156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson updated SOLR-5156: - Attachment: SOLR-5156.patch Final patch with the bogus nocommit removed; passes precommit checks.

Provide a way to move the contents of a file to ZooKeeper with ZkCLI Key: SOLR-5156 URL: https://issues.apache.org/jira/browse/SOLR-5156 Project: Solr Issue Type: Improvement Reporter: Erick Erickson Assignee: Erick Erickson Attachments: SOLR-5156.patch, SOLR-5156.patch, SOLR-5156.patch Spinoff from SOLR-4718. We don't have any good way of putting solr.xml up in ZooKeeper in the first place. So while we can fake getting the file up there, we need a way consistent with ZkCLI.
[jira] [Commented] (SOLR-5164) Can not create a collection via collections API (cloud mode)
[ https://issues.apache.org/jira/browse/SOLR-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742261#comment-13742261 ] Mark Miller commented on SOLR-5164: ---
bq. but this has been a pretty big rathole for me already and I didn't see a simple way to create a test
That's fine, but please don't resolve the issue then. Bug fixes for really ugly issues like these absolutely need tests to make sure they don't keep coming back. We have seen that type of thing a lot recently - we fix something like this and it just breaks a couple of months later in a new refactoring. You don't have to write the tests, but you might ask for some advice or help from someone else on it before resolving the issue. I'm happy to help make sure these problems have tests.

Can not create a collection via collections API (cloud mode) Key: SOLR-5164 URL: https://issues.apache.org/jira/browse/SOLR-5164 Project: Solr Issue Type: Bug Affects Versions: 4.5, 5.0 Reporter: Erick Erickson Assignee: Erick Erickson Priority: Blocker Fix For: 4.5, 5.0 Attachments: SOLR-5164.patch

When you try to create a collection in SolrCloud, the instanceDir that gets created has an extra 'solr' segment in it, which messes up the pathing for all the lib directives in solrconfig.xml, as they're all relative.
[jira] [Updated] (SOLR-4718) Allow solr.xml to be stored in zookeeper
[ https://issues.apache.org/jira/browse/SOLR-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson updated SOLR-4718: - Attachment: SOLR-4718.patch Final patch, with CHANGES.txt entry.

Allow solr.xml to be stored in zookeeper Key: SOLR-4718 URL: https://issues.apache.org/jira/browse/SOLR-4718 Project: Solr Issue Type: Improvement Components: Schema and Analysis Affects Versions: 4.3, 5.0 Reporter: Erick Erickson Assignee: Erick Erickson Attachments: SOLR-4718-alternative.patch, SOLR-4718.patch, SOLR-4718.patch, SOLR-4718.patch, SOLR-4718.patch, SOLR-4718.patch, SOLR-4718.patch, SOLR-4718.patch

So the near-final piece of this puzzle is to make solr.xml be storable in ZooKeeper. Code-wise in terms of Solr, this doesn't look very difficult; I'm working on it now. More interesting is how to get the configuration into ZK in the first place: enhancements to ZkCli? Or bootstrap-conf? Other? I'm punting on that for this patch. Second level is how to tell Solr to get the file from ZK. Some possibilities:

1. A system prop, -DzkSolrXmlPath=blah where blah is the path _on zk_ where the file is. Would require -DzkHost or -DzkRun as well.
pros - simple, I can wrap my head around it. - easy to script
cons - can't run multiple JVMs pointing to different files. Is this really a problem?

2. New solr.xml element. Something like:
{code:xml}
<solr>
  <solrcloud>
    <str name="zkHost">zkurl</str>
    <str name="zkSolrXmlPath">whatever</str>
  </solrcloud>
</solr>
{code}
Really, this form would hinge on the presence or absence of zkSolrXmlPath. If present, go up and look for the indicated solr.xml file on ZK. Any properties in the ZK version would overwrite anything in the local copy. NOTE: I'm really not very interested in supporting this as an option for old-style solr.xml unless it's _really_ easy. For instance, what if the local solr.xml is new-style and the one in ZK is old-style? Or vice-versa? Since old-style is going away, this doesn't seem like it's worth the effort. 
pros - No new mechanisms cons - once again requires that there be a solr.xml file on each client. Admittedly for installations that didn't care much about multiple JVMs, it could be a stock file that didn't change... For now, I'm going to just manually push solr.xml to ZK, then read it based on a sysprop. That'll get the structure in place while we debate. Not going to check this in until there's some consensus though. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
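Option 1 above can be contrasted with the local-file fallback in a small sketch. Everything here is hypothetical (the property names -DzkSolrXmlPath, -DzkHost, and -DzkRun come straight from the discussion; the helper and its return strings are invented for illustration):

```java
// Hypothetical sketch of option 1 from the discussion: if zkSolrXmlPath
// is set, solr.xml is fetched from that path in ZooKeeper (which requires
// zkHost or zkRun); otherwise the local file is used. Names and behavior
// are illustrative, not Solr's actual implementation.
class SolrXmlLocator {
    static String solrXmlSource(String zkSolrXmlPath, String zkHost) {
        if (zkSolrXmlPath != null) {
            if (zkHost == null) {
                throw new IllegalStateException("-DzkSolrXmlPath requires -DzkHost or -DzkRun");
            }
            return "zk:" + zkSolrXmlPath;   // read the node from ZooKeeper
        }
        return "file:solr.xml";             // fall back to the local file
    }
}
```

The "can't run multiple JVMs pointing to different files" con follows directly: a system property is per-JVM, so two cores in one JVM cannot disagree about the path.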
[JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.7.0) - Build # 737 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/737/ Java: 64bit/jdk1.7.0 -XX:-UseCompressedOops -XX:+UseConcMarkSweepGC All tests passed Build Log: [...truncated 9920 lines...] [junit4] ERROR: JVM J0 ended with an exception, command line: /Library/Java/JavaVirtualMachines/jdk1.7.0_25.jdk/Contents/Home/jre/bin/java -XX:-UseCompressedOops -XX:+UseConcMarkSweepGC -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/Users/jenkins/jenkins-slave/workspace/Lucene-Solr-trunk-MacOSX/heapdumps -Dtests.prefix=tests -Dtests.seed=DBD6FE1DD046F358 -Xmx512M -Dtests.iters= -Dtests.verbose=false -Dtests.infostream=false -Dtests.codec=random -Dtests.postingsformat=random -Dtests.docvaluesformat=random -Dtests.locale=random -Dtests.timezone=random -Dtests.directory=random -Dtests.linedocsfile=europarl.lines.txt.gz -Dtests.luceneMatchVersion=5.0 -Dtests.cleanthreads=perClass -Djava.util.logging.config.file=/Users/jenkins/jenkins-slave/workspace/Lucene-Solr-trunk-MacOSX/lucene/tools/junit4/logging.properties -Dtests.nightly=false -Dtests.weekly=false -Dtests.slow=true -Dtests.asserts.gracious=false -Dtests.multiplier=1 -DtempDir=. -Djava.io.tmpdir=. -Djunit4.tempDir=/Users/jenkins/jenkins-slave/workspace/Lucene-Solr-trunk-MacOSX/solr/build/solr-core/test/temp -Dclover.db.dir=/Users/jenkins/jenkins-slave/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/clover/db -Djava.security.manager=org.apache.lucene.util.TestSecurityManager -Djava.security.policy=/Users/jenkins/jenkins-slave/workspace/Lucene-Solr-trunk-MacOSX/lucene/tools/junit4/tests.policy -Dlucene.version=5.0-SNAPSHOT -Djetty.testMode=1 -Djetty.insecurerandom=1 -Dsolr.directoryFactory=org.apache.solr.core.MockDirectoryFactory -Djava.awt.headless=true -Dtests.disableHdfs=true -Dfile.encoding=UTF-8 -classpath
[jira] [Updated] (SOLR-5164) Creating collections via the Collections API does not work with lib include directives.
[ https://issues.apache.org/jira/browse/SOLR-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-5164: -- Component/s: SolrCloud Priority: Critical (was: Blocker) Summary: Creating collections via the Collections API does not work with lib include directives. (was: Can not create a collection via collections API (cloud mode)) Creating collections via the Collections API does not work with lib include directives. --- Key: SOLR-5164 URL: https://issues.apache.org/jira/browse/SOLR-5164 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.5, 5.0 Reporter: Erick Erickson Assignee: Erick Erickson Priority: Critical Fix For: 4.5, 5.0 Attachments: SOLR-5164.patch When you try to create a collection in SolrCloud, the instanceDir that gets created has an extra solr in it which messes up the pathing for all the lib directives in solrconfig.xml as they're all relative. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
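Why one stray path segment breaks every lib directive: relative dir attributes in solrconfig.xml resolve against the core's instance directory, so an extra solr/ component shifts each reference one level too deep. A hypothetical illustration (the paths are invented, not from the issue):

```java
import java.nio.file.*;

// Relative lib directives resolve against the instance directory, so an
// unexpected extra "solr" segment makes every one of them point to the
// wrong place. Paths here are invented for illustration only.
class LibPathDemo {
    static Path resolveLib(Path instanceDir, String libDir) {
        return instanceDir.resolve(libDir).normalize();
    }
}
```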
[jira] [Updated] (SOLR-5164) Creating collections via the Collections API fails due to core being created in the wrong directory
[ https://issues.apache.org/jira/browse/SOLR-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson updated SOLR-5164: - Summary: Creating collections via the Collections API fails due to core being created in the wrong directory (was: Creating collections via the Collections API does not work with lib include directives.) Creating collections via the Collections API fails due to core being created in the wrong directory --- Key: SOLR-5164 URL: https://issues.apache.org/jira/browse/SOLR-5164 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.5, 5.0 Reporter: Erick Erickson Assignee: Erick Erickson Priority: Critical Fix For: 4.5, 5.0 Attachments: SOLR-5164.patch When you try to create a collection in SolrCloud, the instanceDir that gets created has an extra solr in it which messes up the pathing for all the lib directives in solrconfig.xml as they're all relative. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5164) Creating collections via the Collections API fails due to core being created in the wrong directory
[ https://issues.apache.org/jira/browse/SOLR-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742360#comment-13742360 ] Erick Erickson commented on SOLR-5164: -- Well, the code is fixed, how about raising another JIRA instead? Creating collections via the Collections API fails due to core being created in the wrong directory --- Key: SOLR-5164 URL: https://issues.apache.org/jira/browse/SOLR-5164 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.5, 5.0 Reporter: Erick Erickson Assignee: Erick Erickson Priority: Critical Fix For: 4.5, 5.0 Attachments: SOLR-5164.patch When you try to create a collection in SolrCloud, the instanceDir that gets created has an extra solr in it which messes up the pathing for all the lib directives in solrconfig.xml as they're all relative. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5156) Provide a way to move the contents of a file to ZooKeeper with ZkCLI
[ https://issues.apache.org/jira/browse/SOLR-5156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742364#comment-13742364 ] ASF subversion and git services commented on SOLR-5156: --- Commit 1514776 from [~erickoerickson] in branch 'dev/trunk' [ https://svn.apache.org/r1514776 ] SOLR-5156 Provide a way to move the contents of a file to ZooKeeper with ZkCLI Provide a way to move the contents of a file to ZooKeeper with ZkCLI Key: SOLR-5156 URL: https://issues.apache.org/jira/browse/SOLR-5156 Project: Solr Issue Type: Improvement Reporter: Erick Erickson Assignee: Erick Erickson Attachments: SOLR-5156.patch, SOLR-5156.patch, SOLR-5156.patch Spinoff from SOLR-4718. We don't have any good way of putting solr.xml up in Zookeeper in the first place. So while we can fake getting the file up there we need a way consistent with ZkCLI -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
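At its core, the requested "move a file's contents to ZooKeeper" operation reads the local bytes and writes them to a ZK node path. A hedged sketch of that shape (the Uploader interface is a stand-in for a real ZooKeeper client; all names here are invented, not ZkCLI's actual API):

```java
import java.io.*;
import java.nio.file.*;

// Hypothetical sketch of what a "put file" operation boils down to:
// read the local file's bytes and write them to a node path. Uploader
// stands in for a real ZooKeeper client; names are invented.
class PutFileSketch {
    interface Uploader { void put(String zkPath, byte[] data) throws IOException; }

    static void putFile(Path localFile, String zkPath, Uploader up) throws IOException {
        up.put(zkPath, Files.readAllBytes(localFile)); // one node holds one file's bytes
    }
}
```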
[jira] [Assigned] (SOLR-5099) The core.properties not created during collection creation
[ https://issues.apache.org/jira/browse/SOLR-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller reassigned SOLR-5099: - Assignee: Mark Miller (was: Erick Erickson) The core.properties not created during collection creation -- Key: SOLR-5099 URL: https://issues.apache.org/jira/browse/SOLR-5099 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.5, 5.0 Reporter: Herb Jiang Assignee: Mark Miller Priority: Critical Fix For: 4.5, 5.0 Attachments: CorePropertiesLocator.java.patch When using the new solr.xml structure, the core auto-discovery mechanism tries to find core.properties. But I found that core.properties cannot be created when I dynamically create a collection. The root issue is that CorePropertiesLocator tries to create the properties file before the instanceDir is created. Collection creation completes and looks fine at runtime, but it causes issues later (cores are not auto-discovered after a server restart). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
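The fix direction the report implies — make sure the instance directory exists before writing core.properties — can be sketched as follows. This is a hypothetical helper, not the actual CorePropertiesLocator code:

```java
import java.io.*;
import java.nio.file.*;
import java.util.*;

// The failure mode above: writing core.properties into an instanceDir
// that does not exist yet fails, so auto-discovery finds nothing after
// a restart. Creating the directory first is the missing step.
// Hypothetical helper for illustration, not Solr's actual code.
class CorePropsSketch {
    static void persist(Path instanceDir, Properties props) throws IOException {
        Files.createDirectories(instanceDir);  // ensure the directory exists first
        try (OutputStream out = Files.newOutputStream(instanceDir.resolve("core.properties"))) {
            props.store(out, "written at collection-creation time");
        }
    }
}
```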
[jira] [Assigned] (SOLR-5164) Creating collections via the Collections API fails due to core being created in the wrong directory
[ https://issues.apache.org/jira/browse/SOLR-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller reassigned SOLR-5164: - Assignee: Mark Miller (was: Erick Erickson) Creating collections via the Collections API fails due to core being created in the wrong directory --- Key: SOLR-5164 URL: https://issues.apache.org/jira/browse/SOLR-5164 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.5, 5.0 Reporter: Erick Erickson Assignee: Mark Miller Priority: Critical Fix For: 4.5, 5.0 Attachments: SOLR-5164.patch When you try to create a collection in SolrCloud, the instanceDir that gets created has an extra solr in it which messes up the pathing for all the lib directives in solrconfig.xml as they're all relative. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5164) Creating collections via the Collections API fails due to core being created in the wrong directory
[ https://issues.apache.org/jira/browse/SOLR-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742388#comment-13742388 ] Mark Miller commented on SOLR-5164: --- I don't consider this fixed without a test. The two issues are critical and somewhat complicated issues. I'm going to write the tests - without them, we only have your word they are fixed today and a random guess they will still be fixed tomorrow or the next day. These two issues are much too critical to not consider a test part of the issue. I'll finish the issues. Creating collections via the Collections API fails due to core being created in the wrong directory --- Key: SOLR-5164 URL: https://issues.apache.org/jira/browse/SOLR-5164 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.5, 5.0 Reporter: Erick Erickson Assignee: Erick Erickson Priority: Critical Fix For: 4.5, 5.0 Attachments: SOLR-5164.patch When you try to create a collection in SolrCloud, the instanceDir that gets created has an extra solr in it which messes up the pathing for all the lib directives in solrconfig.xml as they're all relative. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5179) Refactoring on PostingsWriterBase for delta-encoding
[ https://issues.apache.org/jira/browse/LUCENE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Han Jiang updated LUCENE-5179: -- Attachment: LUCENE-5179.patch Patch for branch3069; tests pass for all 'temp' postings formats. Refactoring on PostingsWriterBase for delta-encoding Key: LUCENE-5179 URL: https://issues.apache.org/jira/browse/LUCENE-5179 Project: Lucene - Core Issue Type: Improvement Reporter: Han Jiang Assignee: Han Jiang Fix For: 5.0, 4.5 Attachments: LUCENE-5179.patch A further step from LUCENE-5029. The short story is, the previous API change brings two problems: * it somewhat breaks backward compatibility: although we can still read the old format, we can no longer reproduce it; * the pulsing codec has a problem with it. And the long story... With the change, the current PostingsBase API works like this: * term dict tells PBF we start a new term (via startTerm()); * PBF adds docs, positions and other postings data; * term dict tells PBF all the data for the current term is completed (via finishTerm()), then PBF returns the metadata for the current term (as long[] and byte[]); * term dict might buffer all the metadata in an ArrayList; when all the terms are collected, it then decides how that metadata will be located on disk. So after the API change, PBF no longer has that annoying 'flushTermBlock'; instead the term dict maintains the term/metadata list. However, for each term we'll now write the long[] blob before the byte[], so the index format is not consistent with pre-4.5: in Lucene41, the metadata could be written as longA,bytesA,longB, but now we have to write it as longA,longB,bytesA. Another problem is that the pulsing codec cannot tell the wrapped PBF how the metadata is delta-encoded; after all, PulsingPostingsWriter is only a PBF. For example, suppose we have terms=[a, a1, a2, b, b1, b2] and itemsInBlock=2, so theoretically we'll finally have three blocks in BTTR: [a b] [a1 a2] [b1 b2]. With this approach, the metadata of term b should be delta-encoded based on the metadata of a, 
but when the term dict tells PBF to finishTerm(b), it might naively do the delta encoding based on term a2. So I think maybe we can introduce a method 'encodeTerm(long[], DataOutput out, FieldInfo, TermState, boolean absolute)', so that during metadata flush we can control how the current term is written. And the term dict will buffer TermState, which implicitly holds metadata like we do on the PBReader side. For example, if we want to reproduce the old Lucene41 format, we can simply set longsSize==0; then PBF writes the old format (longA,bytesA,longB) to DataOutput, and the compatibility issue is solved. The pulsing codec will also be able to tell the lower level how to encode metadata. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
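The proposed 'absolute' flag can be illustrated with a toy delta encoder. This is a hedged sketch of the contract only, not the Lucene API (the real method also takes a DataOutput, FieldInfo, and TermState): the term dict passes absolute=true for the first term of each block, and absolute=false for later terms so they are encoded against the previous term in that block rather than the previous term finished overall.

```java
// Toy illustration of the proposed encodeTerm(..., boolean absolute)
// contract: absolute=true writes the metadata longs as-is (start of a
// block), absolute=false writes deltas against the previous base.
// Hypothetical sketch; the real signature and plumbing differ.
class DeltaEncodeSketch {
    static long[] encode(long[] longs, long[] lastLongs, boolean absolute) {
        long[] out = new long[longs.length];
        for (int i = 0; i < longs.length; i++) {
            out[i] = absolute ? longs[i] : longs[i] - lastLongs[i];
        }
        System.arraycopy(longs, 0, lastLongs, 0, longs.length); // current term becomes the new base
        return out;
    }
}
```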
[jira] [Commented] (SOLR-5099) The core.properties not created during collection creation
[ https://issues.apache.org/jira/browse/SOLR-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742403#comment-13742403 ] Erick Erickson commented on SOLR-5099: -- FWIW, a separate test case would be fine here, but note that the actual fix is part of SOLR-5164. I didn't see Herb's patch until after I'd found the problem as part of SOLR-5164. The core.properties not created during collection creation -- Key: SOLR-5099 URL: https://issues.apache.org/jira/browse/SOLR-5099 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.5, 5.0 Reporter: Herb Jiang Assignee: Mark Miller Priority: Critical Fix For: 4.5, 5.0 Attachments: CorePropertiesLocator.java.patch When using the new solr.xml structure, the core auto-discovery mechanism tries to find core.properties. But I found that core.properties cannot be created when I dynamically create a collection. The root issue is that CorePropertiesLocator tries to create the properties file before the instanceDir is created. Collection creation completes and looks fine at runtime, but it causes issues later (cores are not auto-discovered after a server restart). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3936) QueryElevationComponent: Wrong order when result grouping is activated
[ https://issues.apache.org/jira/browse/SOLR-3936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742405#comment-13742405 ] ASF subversion and git services commented on SOLR-3936: --- Commit 1514795 from hoss...@apache.org in branch 'dev/trunk' [ https://svn.apache.org/r1514795 ] SOLR-3936: Fixed QueryElevationComponent sorting when used with Grouping QueryElevationComponent: Wrong order when result grouping is activated -- Key: SOLR-3936 URL: https://issues.apache.org/jira/browse/SOLR-3936 Project: Solr Issue Type: Bug Components: SearchComponents - other Affects Versions: 4.0 Reporter: Michael Berger Assignee: Hoss Man Attachments: SOLR-3936.patch, SOLR-3936.patch When I use elevation together with grouping I got not the expected result order. I tried it with the standard solr example: http://localhost:8983/solr/elevate?enableElevation=truefl=score%2C[elevated]%2Cid%2CnameforceElevation=truegroup.field=manugroup=onindent=onq=ipodwt=json but the results ignored the elevation: { responseHeader:{ status:0, QTime:2, params:{ enableElevation:true, fl:score,[elevated],id,name, indent:on, q:ipod, forceElevation:true, group.field:manu, group:on, wt:json}}, grouped:{ manu:{ matches:2, groups:[{ groupValue:belkin, doclist:{numFound:1,start:0,maxScore:0.7698604,docs:[ { id:F8V7067-APL-KIT, name:Belkin Mobile Power Cord for iPod w/ Dock, score:0.7698604, [elevated]:false}] }}, { groupValue:inc, doclist:{numFound:1,start:0,maxScore:0.28869766,docs:[ { id:MA147LL/A, name:Apple 60 GB iPod with Video Playback Black, score:0.28869766, [elevated]:true}] }}]}}} the elevate.xml defines the following rules : query text=ipod doc id=MA147LL/A / !-- put the actual ipod at the top -- doc id=IW-02 exclude=true / !-- exclude this cable -- /query /elevate -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
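The elevate.xml snippet at the end of the report above lost its markup in the plain-text mail. Reconstructed from the surviving text, the intended rules read approximately:

```xml
<elevate>
  <query text="ipod">
    <doc id="MA147LL/A" />            <!-- put the actual ipod at the top -->
    <doc id="IW-02" exclude="true" /> <!-- exclude this cable -->
  </query>
</elevate>
```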
[jira] [Commented] (SOLR-4718) Allow solr.xml to be stored in zookeeper
[ https://issues.apache.org/jira/browse/SOLR-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742407#comment-13742407 ] ASF subversion and git services commented on SOLR-4718: --- Commit 1514800 from [~erickoerickson] in branch 'dev/trunk' [ https://svn.apache.org/r1514800 ] SOLR-4718 Allow solr.xml to be stored in ZooKeeper Allow solr.xml to be stored in zookeeper Key: SOLR-4718 URL: https://issues.apache.org/jira/browse/SOLR-4718 Project: Solr Issue Type: Improvement Components: Schema and Analysis Affects Versions: 4.3, 5.0 Reporter: Erick Erickson Assignee: Erick Erickson Attachments: SOLR-4718-alternative.patch, SOLR-4718.patch, SOLR-4718.patch, SOLR-4718.patch, SOLR-4718.patch, SOLR-4718.patch, SOLR-4718.patch, SOLR-4718.patch So the near-final piece of this puzzle is to make solr.xml storable in Zookeeper. Code-wise in terms of Solr, this doesn't look very difficult; I'm working on it now. More interesting is how to get the configuration into ZK in the first place: enhancements to ZkCli? Or bootstrap-conf? Other? I'm punting on that for this patch. The second level is how to tell Solr to get the file from ZK. Some possibilities: 1. A system prop, -DzkSolrXmlPath=blah where blah is the path _on zk_ where the file is. Would require -DzkHost or -DzkRun as well. pros - simple, I can wrap my head around it. - easy to script cons - can't run multiple JVMs pointing to different files. Is this really a problem? 2. A new solr.xml element. Something like: <solr> <solrcloud> <str name="zkHost">zkurl</str> <str name="zkSolrXmlPath">whatever</str> </solrcloud> </solr> Really, this form would hinge on the presence or absence of zkSolrXmlPath. If present, go up and look for the indicated solr.xml file on ZK. Any properties in the ZK version would overwrite anything in the local copy. NOTE: I'm really not very interested in supporting this as an option for old-style solr.xml unless it's _really_ easy. 
For instance, what if the local solr.xml is new-style and the one in ZK is old-style? Or vice-versa? Since old-style is going away, this doesn't seem like it's worth the effort. pros - No new mechanisms cons - once again requires that there be a solr.xml file on each client. Admittedly for installations that didn't care much about multiple JVMs, it could be a stock file that didn't change... For now, I'm going to just manually push solr.xml to ZK, then read it based on a sysprop. That'll get the structure in place while we debate. Not going to check this in until there's some consensus though. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5150) HdfsIndexInput may not fully read requested bytes.
[ https://issues.apache.org/jira/browse/SOLR-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742432#comment-13742432 ] Mark Miller commented on SOLR-5150: --- I've held off on committing this because some performance tests indicate the upstream blur patch may have been more performant for merging/flushing while the current patch is *much* more performant for queries. We might be able to use one or the other based on the IOContext. I'm waiting until I can get some more results and testing done though - I've seen lots of random deadlock situations in some of my testing with the upstream blue fix (synchronization around two calls). HdfsIndexInput may not fully read requested bytes. -- Key: SOLR-5150 URL: https://issues.apache.org/jira/browse/SOLR-5150 Project: Solr Issue Type: Bug Affects Versions: 4.4 Reporter: Mark Miller Assignee: Mark Miller Fix For: 4.5, 5.0 Attachments: SOLR-5150.patch Patrick Hunt noticed that our HdfsDirectory code was a bit behind Blur here - the read call we are using may not read all of the requested bytes - it returns the number of bytes actually written - which we ignore. Blur moved to using a seek and then readFully call - synchronizing across the two calls to deal with clones. We have seen that really kills performance, and using the readFully call that lets you pass the position rather than first doing a seek, performs much better and does not require the synchronization. I also noticed that the seekInternal impl should not seek but be a no op since we are seeking on the read. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
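Both the bug and the fix described above fit in a few lines. This is a hedged sketch, not the actual HdfsDirectory code: the DataSource interface stands in for an HDFS-style positional read. A plain read may legally return fewer bytes than requested (the return value the original code ignored), so a positional readFully loops until the buffer is full; and because the position travels with each call, there is no seek to synchronize across clones.

```java
import java.io.*;

// A short read is legal: read(...) may return fewer bytes than asked
// for. Looping with a *positional* read fills the buffer without any
// seek, so clones need no synchronization across two calls.
// DataSource is a stand-in for illustration, not the HDFS API.
class ReadFullySketch {
    interface DataSource { int read(long pos, byte[] b, int off, int len) throws IOException; }

    static void readFully(DataSource in, long pos, byte[] buf, int off, int len) throws IOException {
        while (len > 0) {
            int n = in.read(pos, buf, off, len);  // may be a short read
            if (n < 0) throw new EOFException("EOF with " + len + " bytes still wanted");
            pos += n; off += n; len -= n;
        }
    }
}
```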
[jira] [Commented] (SOLR-5150) HdfsIndexInput may not fully read requested bytes.
[ https://issues.apache.org/jira/browse/SOLR-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742433#comment-13742433 ] Mark Miller commented on SOLR-5150: --- @phunt was on vacation, but is now back and may have some thoughts on this issue as well. HdfsIndexInput may not fully read requested bytes. -- Key: SOLR-5150 URL: https://issues.apache.org/jira/browse/SOLR-5150 Project: Solr Issue Type: Bug Affects Versions: 4.4 Reporter: Mark Miller Assignee: Mark Miller Fix For: 4.5, 5.0 Attachments: SOLR-5150.patch Patrick Hunt noticed that our HdfsDirectory code was a bit behind Blur here - the read call we are using may not read all of the requested bytes - it returns the number of bytes actually written - which we ignore. Blur moved to using a seek and then readFully call - synchronizing across the two calls to deal with clones. We have seen that really kills performance, and using the readFully call that lets you pass the position rather than first doing a seek, performs much better and does not require the synchronization. I also noticed that the seekInternal impl should not seek but be a no op since we are seeking on the read. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-5150) HdfsIndexInput may not fully read requested bytes.
[ https://issues.apache.org/jira/browse/SOLR-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742433#comment-13742433 ] Mark Miller edited comment on SOLR-5150 at 8/16/13 5:37 PM: [~phunt] was on vacation, but is now back and may have some thoughts on this issue as well. was (Author: markrmil...@gmail.com): @phunt was on vacation, but is now back and may have some thoughts on this issue as well. HdfsIndexInput may not fully read requested bytes. -- Key: SOLR-5150 URL: https://issues.apache.org/jira/browse/SOLR-5150 Project: Solr Issue Type: Bug Affects Versions: 4.4 Reporter: Mark Miller Assignee: Mark Miller Fix For: 4.5, 5.0 Attachments: SOLR-5150.patch Patrick Hunt noticed that our HdfsDirectory code was a bit behind Blur here - the read call we are using may not read all of the requested bytes - it returns the number of bytes actually written - which we ignore. Blur moved to using a seek and then readFully call - synchronizing across the two calls to deal with clones. We have seen that really kills performance, and using the readFully call that lets you pass the position rather than first doing a seek, performs much better and does not require the synchronization. I also noticed that the seekInternal impl should not seek but be a no op since we are seeking on the read. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-5150) HdfsIndexInput may not fully read requested bytes.
[ https://issues.apache.org/jira/browse/SOLR-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742432#comment-13742432 ] Mark Miller edited comment on SOLR-5150 at 8/16/13 5:38 PM: I've held off on committing this because some performance tests indicate the upstream blur patch may have been more performant for merging/flushing while the current patch is *much* more performant for queries. We might be able to use one or the other based on the IOContext. I'm waiting until I can get some more results and testing done though - I've seen lots of random deadlock situations in some of my testing with the upstream blur fix (synchronization around two calls). was (Author: markrmil...@gmail.com): I've held off on committing this because some performance tests indicate the upstream blur patch may have been more performant for merging/flushing while the current patch is *much* more performant for queries. We might be able to use one or the other based on the IOContext. I'm waiting until I can get some more results and testing done though - I've seen lots of random deadlock situations in some of my testing with the upstream blue fix (synchronization around two calls). HdfsIndexInput may not fully read requested bytes. -- Key: SOLR-5150 URL: https://issues.apache.org/jira/browse/SOLR-5150 Project: Solr Issue Type: Bug Affects Versions: 4.4 Reporter: Mark Miller Assignee: Mark Miller Fix For: 4.5, 5.0 Attachments: SOLR-5150.patch Patrick Hunt noticed that our HdfsDirectory code was a bit behind Blur here - the read call we are using may not read all of the requested bytes - it returns the number of bytes actually written - which we ignore. Blur moved to using a seek and then readFully call - synchronizing across the two calls to deal with clones. 
We have seen that really kills performance, and using the readFully call that lets you pass the position rather than first doing a seek, performs much better and does not require the synchronization. I also noticed that the seekInternal impl should not seek but be a no op since we are seeking on the read. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5159) Manifest includes non-parsed maven variables
[ https://issues.apache.org/jira/browse/SOLR-5159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742434#comment-13742434 ] ASF subversion and git services commented on SOLR-5159: --- Commit 1514813 from [~steve_rowe] in branch 'dev/trunk' [ https://svn.apache.org/r1514813 ] SOLR-5159: Manifest includes non-parsed maven variables Manifest includes non-parsed maven variables Key: SOLR-5159 URL: https://issues.apache.org/jira/browse/SOLR-5159 Project: Solr Issue Type: Bug Components: Build Affects Versions: 4.4, 4.5, 5.0 Environment: Apache Maven 3.0.5 Reporter: Artem Karpenko Assignee: Steve Rowe Priority: Minor Labels: maven-bundle-plugin, maven3, Attachments: SOLR-5159.patch When building Lucene/Solr with Apache Maven 3, all MANIFEST.MF files included into JAR artifacts contain non-parsed POM variables: namely, there are entries like Specification-Version: 5.0.0.$\{now.version} In the end, Solr displays these values on admin page in Versions section. This is caused by unresolved bug in maven-bundle-plugin (FELIX-3392). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
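The symptom described above is easy to state as a check: a manifest value that still contains an unexpanded Maven placeholder such as ${now.version}. This helper is hypothetical and purely illustrative; the real fix lives in the build, working around the maven-bundle-plugin bug (FELIX-3392):

```java
// Minimal check for the symptom: a manifest value that still contains
// an unexpanded ${...} Maven placeholder. Hypothetical helper for
// illustration only; the actual fix is in the build configuration.
class ManifestCheck {
    static boolean hasUnresolvedPlaceholder(String value) {
        return value != null && value.contains("${");
    }
}
```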
[jira] [Commented] (SOLR-5159) Manifest includes non-parsed maven variables
[ https://issues.apache.org/jira/browse/SOLR-5159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742435#comment-13742435 ] ASF subversion and git services commented on SOLR-5159: --- Commit 1514814 from [~steve_rowe] in branch 'dev/trunk' [ https://svn.apache.org/r1514814 ] SOLR-5159: fix typo in CHANGES entry Manifest includes non-parsed maven variables Key: SOLR-5159 URL: https://issues.apache.org/jira/browse/SOLR-5159 Project: Solr Issue Type: Bug Components: Build Affects Versions: 4.4, 4.5, 5.0 Environment: Apache Maven 3.0.5 Reporter: Artem Karpenko Assignee: Steve Rowe Priority: Minor Labels: maven-bundle-plugin, maven3, Attachments: SOLR-5159.patch When building Lucene/Solr with Apache Maven 3, all MANIFEST.MF files included into JAR artifacts contain non-parsed POM variables: namely, there are entries like Specification-Version: 5.0.0.$\{now.version} In the end, Solr displays these values on admin page in Versions section. This is caused by unresolved bug in maven-bundle-plugin (FELIX-3392). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5159) Manifest includes non-parsed maven variables
[ https://issues.apache.org/jira/browse/SOLR-5159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742439#comment-13742439 ] ASF subversion and git services commented on SOLR-5159: --- Commit 1514816 from [~steve_rowe] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1514816 ] SOLR-5159: Manifest includes non-parsed maven variables (merged trunk r1514813 and r1514814)
[jira] [Commented] (SOLR-5150) HdfsIndexInput may not fully read requested bytes.
[ https://issues.apache.org/jira/browse/SOLR-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742441#comment-13742441 ] Mark Miller commented on SOLR-5150: --- To describe that more fully: not deadlock - just really long pauses - no cpu or hard drive usage by either hdfs processes or solr for a *long* time - threads seemed to be hanging out in socket waits of some kind. That is how I first saw the slowdown with the Blur fix - I was running one of the HdfsDirectory tests on my mac and it took 10 min instead of 14 seconds. On linux, the test was still fast. Some other perf tests around querying took a nose dive on linux as well though. Meanwhile, some tests involving indexing sped up. The current patch sped that test back up on my mac and fixed the query perf test. We might be able to get the best of both worlds, or the synchronized version might not be worth it. HdfsIndexInput may not fully read requested bytes. -- Key: SOLR-5150 URL: https://issues.apache.org/jira/browse/SOLR-5150 Project: Solr Issue Type: Bug Affects Versions: 4.4 Reporter: Mark Miller Assignee: Mark Miller Fix For: 4.5, 5.0 Attachments: SOLR-5150.patch Patrick Hunt noticed that our HdfsDirectory code was a bit behind Blur here - the read call we are using may not read all of the requested bytes - it returns the number of bytes actually written - which we ignore. Blur moved to using a seek and then readFully call - synchronizing across the two calls to deal with clones. We have seen that this really kills performance, and using the readFully call that lets you pass the position rather than first doing a seek performs much better and does not require the synchronization. I also noticed that the seekInternal impl should not seek but be a no-op since we are seeking on the read. -- This message is automatically generated by JIRA.
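The partial-read pitfall described in SOLR-5150 (trusting a single read() call to fill the buffer) can be sketched as follows. This is a minimal, hypothetical illustration, not the actual Solr code: PositionalReader stands in for an HDFS-style positional-read API that may return fewer bytes than requested, and readFully shows the fix of looping until the range is filled.

```java
import java.io.EOFException;
import java.io.IOException;

// Hypothetical stand-in for a positional-read API: returns up to `length`
// bytes per call, like an HDFS stream's read(pos, buf, off, len).
interface PositionalReader {
    int read(long position, byte[] buf, int offset, int length) throws IOException;
}

public class ReadFullyDemo {
    // The fix: never trust one read() call; loop until the range is filled.
    static void readFully(PositionalReader in, long pos, byte[] buf, int off, int len)
            throws IOException {
        while (len > 0) {
            int n = in.read(pos, buf, off, len);
            if (n < 0) throw new EOFException("hit EOF with " + len + " bytes left");
            pos += n; off += n; len -= n;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[100];
        for (int i = 0; i < data.length; i++) data[i] = (byte) i;
        // Simulated short-reading stream: returns at most 7 bytes per call,
        // which is exactly the behavior the original code failed to handle.
        PositionalReader shortReader = (position, buf, offset, length) -> {
            if (position >= data.length) return -1;
            int n = Math.min(7, Math.min(length, data.length - (int) position));
            System.arraycopy(data, (int) position, buf, offset, n);
            return n;
        };
        byte[] out = new byte[50];
        readFully(shortReader, 25, out, 0, out.length);
        // Every requested byte arrives despite the short reads.
        for (int i = 0; i < out.length; i++) {
            if (out[i] != (byte) (25 + i)) throw new AssertionError("byte " + i);
        }
        System.out.println("ok");
    }
}
```

Ignoring the return value would silently leave the tail of the buffer stale, which matches the corruption-style assertion failures reported above.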
[jira] [Commented] (SOLR-5156) Provide a way to move the contents of a file to ZooKeeper with ZkCLI
[ https://issues.apache.org/jira/browse/SOLR-5156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742446#comment-13742446 ] ASF subversion and git services commented on SOLR-5156: --- Commit 1514821 from [~erickoerickson] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1514821 ] SOLR-5156 Provide a way to move the contents of a file to ZooKeeper with ZkCLI Provide a way to move the contents of a file to ZooKeeper with ZkCLI Key: SOLR-5156 URL: https://issues.apache.org/jira/browse/SOLR-5156 Project: Solr Issue Type: Improvement Reporter: Erick Erickson Assignee: Erick Erickson Attachments: SOLR-5156.patch, SOLR-5156.patch, SOLR-5156.patch Spinoff from SOLR-4718. We don't have any good way of putting solr.xml up in Zookeeper in the first place. So while we can fake getting the file up there we need a way consistent with ZkCLI -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-5156) Provide a way to move the contents of a file to ZooKeeper with ZkCLI
[ https://issues.apache.org/jira/browse/SOLR-5156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson resolved SOLR-5156. -- Resolution: Fixed Fix Version/s: 5.0, 4.5
[jira] [Created] (SOLR-5168) BJQParserTest reproducible failures
Hoss Man created SOLR-5168: -- Summary: BJQParserTest reproducible failures Key: SOLR-5168 URL: https://issues.apache.org/jira/browse/SOLR-5168 Project: Solr Issue Type: Bug Reporter: Hoss Man Assignee: Yonik Seeley two recent Jenkins builds have uncovered some test seeds that cause failures in multiple test methods in BJQParserTest. These seeds reproduce reliably (as of trunk r1514815) ... {noformat} ant test -Dtestcase=BJQParserTest -Dtests.seed=7A613F321CE87F5B -Dtests.multiplier=3 -Dtests.slow=true ant test -Dtestcase=BJQParserTest -Dtests.seed=1DC8055F837E437E -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-5159) Manifest includes non-parsed maven variables
[ https://issues.apache.org/jira/browse/SOLR-5159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe resolved SOLR-5159. -- Resolution: Fixed Fix Version/s: 5.0, 4.5 bq. I want to verify Maven2 locally, and I also want to compare all manifest entries with the Ant-produced ones - the solr entries were changed recently, and I want to keep them in sync. Maven 2.2.1 works fine. I compared the Ant-built and Maven-built manifests, and the Maven-built ones of course have lots of Bnd-produced entries not in the Ant-built ones. There are two other differences:
# The Maven-built manifest contains Implementation-Vendor-Id (with Maven coordinate groupId as the value: org.apache.lucene or org.apache.solr). I think this is fine to leave in, and maybe the Ant-built manifests should get it too?
# The Maven-built manifests have the old style {{Specification-Version}}, including a timestamp, e.g. {{5.0.0.2013.08.16.12.36.14}}, where the Ant-built manifests just have the version, e.g. {{5.0-SNAPSHOT}}. The latter is actually syntactically incorrect, since the value should only contain digits and periods. I've left it as the old style in the Maven version, since it's not a syntax error, and since the Maven versions will only ever be produced by end-users - all snapshot and release Maven artifacts are produced by Ant.
I've committed to trunk and branch_4x. Thanks Artem!
[jira] [Commented] (SOLR-5150) HdfsIndexInput may not fully read requested bytes.
[ https://issues.apache.org/jira/browse/SOLR-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742465#comment-13742465 ] Uwe Schindler commented on SOLR-5150: - Hi Mark, I think your version should be preferred in both cases. The Apache Blur upstream version looks like SimpleFSIndexInput (which has synchronization on the RandomAccessFile). The difference here is that reading from a real file has no network involved (at least not for local filesystems), so the time spent in the locked code block is shorter. Still, SimpleFSDir is bad for queries. When merging, everything works single-threaded per file, so you would see no difference between the two approaches. If the positional readFully approach were slower, that would clearly be a bug in Hdfs. Another alternative would be: when cloning a file, also clone the underlying Hdfs connection. With RandomAccessFile we cannot do this in the JDK (we have no dup() for file descriptors), but if Hdfs supports some dup()-like approach with delete-on-last-close semantics (the file could already be deleted when you dup the file descriptor) you could create two different connections, one per thread. The downside: Lucene never closes clones - one reason why I gave up on implementing a Windows-optimized directory that would clone the underlying file descriptor: the clone would never close the dup :(
[jira] [Comment Edited] (SOLR-5150) HdfsIndexInput may not fully read requested bytes.
[ https://issues.apache.org/jira/browse/SOLR-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742465#comment-13742465 ] Uwe Schindler edited comment on SOLR-5150 at 8/16/13 6:00 PM: -- Hi Mark, I think your version should be preferred in both cases. The Apache Blur upstream version looks like SimpleFSIndexInput (which has synchronization on the RandomAccessFile). The difference here is that reading from a real file has no network involved (at least not for local filesystems), so the time spent in the locked code block is shorter. Still, SimpleFSDir is bad for queries. When merging, everything works single-threaded per file, so you would see no difference between the two approaches. If the positional readFully approach were slower, that would clearly be a bug in Hdfs. Another alternative would be: when cloning a file, also clone the underlying Hdfs connection. With RandomAccessFile we cannot do this in the JDK (we have no dup() for file descriptors), but if Hdfs supports some dup()-like approach with delete-on-last-close semantics (the file could already be deleted when you dup the file descriptor) you could create two different connections, one per thread. The downside: Lucene never closes clones - one reason why I gave up on implementing a Windows-optimized directory that would clone the underlying file descriptor: the clone would never close the dup :(
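Uwe's contrast between a shared seek cursor (which forces synchronization across clones) and positional reads (which touch no shared state) can be illustrated with a small self-contained sketch. SharedStream is a hypothetical stand-in for one HDFS stream shared by IndexInput clones; it is not the real Solr or HDFS API.

```java
import java.util.concurrent.*;

public class PositionalReadDemo {
    // Hypothetical stand-in for one stream shared by IndexInput clones.
    static class SharedStream {
        private final byte[] data;
        private long cursor; // shared seek position: the reason the
                             // seek+readFully variant must hold a lock
        SharedStream(byte[] data) { this.data = data; }

        // Strategy A (Blur upstream): stateful seek + read; the lock must
        // cover both calls or clones would clobber each other's cursor.
        synchronized void seekAndRead(long pos, byte[] buf) {
            cursor = pos;
            System.arraycopy(data, (int) cursor, buf, 0, buf.length);
        }

        // Strategy B (the patch): positional read; no shared state is
        // mutated, so concurrent clones need no lock at all.
        void pread(long pos, byte[] buf) {
            System.arraycopy(data, (int) pos, buf, 0, buf.length);
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] data = new byte[1024];
        for (int i = 0; i < data.length; i++) data[i] = (byte) i;
        SharedStream stream = new SharedStream(data);

        // Many "clones" issuing positional reads concurrently, lock-free,
        // each verifying it got the bytes belonging to its own offset.
        ExecutorService pool = Executors.newFixedThreadPool(8);
        CompletionService<Boolean> cs = new ExecutorCompletionService<>(pool);
        for (int t = 0; t < 64; t++) {
            final int pos = (t * 13) % 900;
            cs.submit(() -> {
                byte[] buf = new byte[16];
                stream.pread(pos, buf);
                for (int i = 0; i < buf.length; i++)
                    if (buf[i] != (byte) (pos + i)) return false;
                return true;
            });
        }
        boolean ok = true;
        for (int t = 0; t < 64; t++) ok &= cs.take().get();
        pool.shutdown();
        System.out.println(ok ? "ok" : "corrupted");
    }
}
```

Under strategy A the same workload would serialize on the stream's monitor, which is the query-performance cliff discussed in this thread.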
[jira] [Commented] (SOLR-5168) BJQParserTest reproducible failures
[ https://issues.apache.org/jira/browse/SOLR-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742467#comment-13742467 ] Hoss Man commented on SOLR-5168: One of those seeds (1DC8055F837E437E) causes MockRandomMergePolicy to be used -- but a cursory review of the test (and my cursory understanding of the block join queries) doesn't suggest any reason why that should cause a problem for this test -- the only time a commit *might* happen in the test is at the end of an entire block. The other seed (7A613F321CE87F5B) just uses LogDocMergePolicy, so even if my cursory understandings above are incorrect, there really seems to be a bug when this seed is used.
[jira] [Commented] (SOLR-3280) to many / sometimes stale CLOSE_WAIT connections from SnapPuller during / after replication
[ https://issues.apache.org/jira/browse/SOLR-3280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742478#comment-13742478 ] David Fu commented on SOLR-3280: I am still on 3.4 now. I noticed that Solr 4 pretty much reimplemented the SnapPuller and am thinking about upgrading to v4. Just out of curiosity, what are some issues you faced in the process of upgrading from 3.6 to 4.2.1? to many / sometimes stale CLOSE_WAIT connections from SnapPuller during / after replication --- Key: SOLR-3280 URL: https://issues.apache.org/jira/browse/SOLR-3280 Project: Solr Issue Type: Bug Affects Versions: 3.5, 3.6, 4.0-ALPHA Reporter: Bernd Fehling Assignee: Robert Muir Priority: Minor Attachments: SOLR-3280.patch There are sometimes too many and also stale CLOSE_WAIT connections left over on the SLAVE server during/after replication. Normally GC should clean this up, but this is not always the case. Also, if a CLOSE_WAIT is hanging then the new replication won't load. A dirty workaround so far is to fake a TCP connection as root to that connection and close it. After that the new replication will load, the old index and searcher are released, and the system returns to normal operation. Background: The SnapPuller uses Apache httpclient 3.x with the MultiThreadedHttpConnectionManager. The manager holds a connection in CLOSE_WAIT after its use for further requests. This is done by calling releaseConnection. But if a connection is stuck it is not available any more and a new connection from the pool is used. Solution: After calling releaseConnection, clean up with closeIdleConnections(0). -- This message is automatically generated by JIRA.
[jira] [Commented] (SOLR-5159) Manifest includes non-parsed maven variables
[ https://issues.apache.org/jira/browse/SOLR-5159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742480#comment-13742480 ] Artem Karpenko commented on SOLR-5159: -- Great, thank you Steve, I was glad to help.
[jira] [Commented] (SOLR-3936) QueryElevationComponent: Wrong order when result grouping is activated
[ https://issues.apache.org/jira/browse/SOLR-3936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742500#comment-13742500 ] ASF subversion and git services commented on SOLR-3936: --- Commit 1514836 from hoss...@apache.org in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1514836 ] SOLR-3936: Fixed QueryElevationComponent sorting when used with Grouping (merge r1514795) QueryElevationComponent: Wrong order when result grouping is activated -- Key: SOLR-3936 URL: https://issues.apache.org/jira/browse/SOLR-3936 Project: Solr Issue Type: Bug Components: SearchComponents - other Affects Versions: 4.0 Reporter: Michael Berger Assignee: Hoss Man Attachments: SOLR-3936.patch, SOLR-3936.patch When I use elevation together with grouping, I do not get the expected result order. I tried it with the standard Solr example: http://localhost:8983/solr/elevate?enableElevation=true&fl=score%2C[elevated]%2Cid%2Cname&forceElevation=true&group.field=manu&group=on&indent=on&q=ipod&wt=json but the results ignored the elevation: { responseHeader:{ status:0, QTime:2, params:{ enableElevation:true, fl:score,[elevated],id,name, indent:on, q:ipod, forceElevation:true, group.field:manu, group:on, wt:json}}, grouped:{ manu:{ matches:2, groups:[{ groupValue:belkin, doclist:{numFound:1,start:0,maxScore:0.7698604,docs:[ { id:F8V7067-APL-KIT, name:Belkin Mobile Power Cord for iPod w/ Dock, score:0.7698604, [elevated]:false}] }}, { groupValue:inc, doclist:{numFound:1,start:0,maxScore:0.28869766,docs:[ { id:MA147LL/A, name:Apple 60 GB iPod with Video Playback Black, score:0.28869766, [elevated]:true}] }}]}}} the elevate.xml defines the following rules:
<elevate>
  <query text="ipod">
    <doc id="MA147LL/A" /> <!-- put the actual ipod at the top -->
    <doc id="IW-02" exclude="true" /> <!-- exclude this cable -->
  </query>
</elevate>
-- This message is automatically generated by JIRA.
[jira] [Resolved] (SOLR-5135) Deleting a collection should be extra aggressive in the face of failures.
[ https://issues.apache.org/jira/browse/SOLR-5135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved SOLR-5135. --- Resolution: Fixed Deleting a collection should be extra aggressive in the face of failures. - Key: SOLR-5135 URL: https://issues.apache.org/jira/browse/SOLR-5135 Project: Solr Issue Type: Bug Components: SolrCloud Reporter: Mark Miller Assignee: Mark Miller Fix For: 4.5, 5.0 Attachments: SOLR-5135.patch Until Zk is the source of truth for the cluster, zk and local node states can get out of whack in certain situations - as a result, sometimes you cannot clean out all of the remnants of a collection to recreate it. For example, if the collection is listed in zk under /collections, but is not in clusterstate.json, you cannot remove or create the collection again due to a early exception in the collection removal chain. I think we should probably still return the error - but also delete as much as we can. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-3936) QueryElevationComponent: Wrong order when result grouping is activated
[ https://issues.apache.org/jira/browse/SOLR-3936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-3936. Resolution: Fixed Fix Version/s: 5.0, 4.5 Thanks again Michael!
[jira] [Resolved] (SOLR-4718) Allow solr.xml to be stored in zookeeper
[ https://issues.apache.org/jira/browse/SOLR-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson resolved SOLR-4718. -- Resolution: Fixed Fix Version/s: 5.0, 4.5 Allow solr.xml to be stored in zookeeper Key: SOLR-4718 URL: https://issues.apache.org/jira/browse/SOLR-4718 Project: Solr Issue Type: Improvement Components: Schema and Analysis Affects Versions: 4.3, 5.0 Reporter: Erick Erickson Assignee: Erick Erickson Fix For: 4.5, 5.0 Attachments: SOLR-4718-alternative.patch, SOLR-4718.patch, SOLR-4718.patch, SOLR-4718.patch, SOLR-4718.patch, SOLR-4718.patch, SOLR-4718.patch, SOLR-4718.patch So the near-final piece of this puzzle is to make solr.xml be storable in Zookeeper. Code-wise in terms of Solr, this doesn't look very difficult, I'm working on it now. More interesting is how to get the configuration into ZK in the first place, enhancements to ZkCli? Or bootstrap-conf? Other? I'm punting on that for this patch. Second level is how to tell Solr to get the file from ZK. Some possibilities:
1. A system prop, -DzkSolrXmlPath=blah where blah is the path _on zk_ where the file is. Would require -DzkHost or -DzkRun as well.
pros:
- simple, I can wrap my head around it.
- easy to script
cons:
- can't run multiple JVMs pointing to different files. Is this really a problem?
2. New solr.xml element. Something like:
<solr>
  <solrcloud>
    <str name="zkHost">zkurl</str>
    <str name="zkSolrXmlPath">whatever</str>
  </solrcloud>
</solr>
Really, this form would hinge on the presence or absence of zkSolrXmlPath. If present, go up and look for the indicated solr.xml file on ZK. Any properties in the ZK version would overwrite anything in the local copy. NOTE: I'm really not very interested in supporting this as an option for old-style solr.xml unless it's _really_ easy. For instance, what if the local solr.xml is new-style and the one in ZK is old-style? Or vice-versa? Since old-style is going away, this doesn't seem like it's worth the effort.
pros:
- No new mechanisms
cons:
- once again requires that there be a solr.xml file on each client. Admittedly for installations that didn't care much about multiple JVMs, it could be a stock file that didn't change...
For now, I'm going to just manually push solr.xml to ZK, then read it based on a sysprop. That'll get the structure in place while we debate. Not going to check this in until there's some consensus though. -- This message is automatically generated by JIRA.
[jira] [Commented] (SOLR-4718) Allow solr.xml to be stored in zookeeper
[ https://issues.apache.org/jira/browse/SOLR-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742512#comment-13742512 ] ASF subversion and git services commented on SOLR-4718: --- Commit 1514843 from [~erickoerickson] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1514843 ] SOLR-4718 Allow solr.xml to be stored in ZooKeeper Allow solr.xml to be stored in zookeeper Key: SOLR-4718 URL: https://issues.apache.org/jira/browse/SOLR-4718 Project: Solr Issue Type: Improvement Components: Schema and Analysis Affects Versions: 4.3, 5.0 Reporter: Erick Erickson Assignee: Erick Erickson Attachments: SOLR-4718-alternative.patch, SOLR-4718.patch, SOLR-4718.patch, SOLR-4718.patch, SOLR-4718.patch, SOLR-4718.patch, SOLR-4718.patch, SOLR-4718.patch So the near-final piece of this puzzle is to make solr.xml be storable in Zookeeper. Code-wise in terms of Solr, this doesn't look very difficult, I'm working on it now. More interesting is how to get the configuration into ZK in the first place, enhancements to ZkCli? Or boostrap-conf? Other? I'm punting on that for this patch. Second level is how to tell Solr to get the file from ZK. Some possibilities: 1 A system prop, -DzkSolrXmlPath=blah where blah is the path _on zk_ where the file is. Would require -DzkHost or -DzkRun as well. pros - simple, I can wrap my head around it. - easy to script cons - can't run multiple JVMs pointing to different files. Is this really a problem? 2 New solr.xml element. Something like: solr solrcloud str name=zkHostzkurl/str str name=zkSolrXmlPathwhatever/str /solrcloud solr Really, this form would hinge on the presence or absence of zkSolrXmlPath. If present, go up and look for the indicated solr.xml file on ZK. Any properties in the ZK version would overwrite anything in the local copy. NOTE: I'm really not very interested in supporting this as an option for old-style solr.xml unless it's _really_ easy. 
For instance, what if the local solr.xml is new-style and the one in ZK is old-style? Or vice-versa? Since old-style is going away, this doesn't seem like it's worth the effort. pros - No new mechanisms cons - once again requires that there be a solr.xml file on each client. Admittedly for installations that didn't care much about multiple JVMs, it could be a stock file that didn't change... For now, I'm going to just manually push solr.xml to ZK, then read it based on a sysprop. That'll get the structure in place while we debate. Not going to check this in until there's some consensus though. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
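Option 1 above amounts to a small piece of dispatch logic: if -DzkSolrXmlPath is set, solr.xml is fetched from ZooKeeper (and -DzkHost or -DzkRun must also be set), otherwise it is read from local disk as today. A minimal standalone sketch of that decision follows; the class and method names are illustrative, not Solr's actual code, and it only describes the chosen source rather than doing any real ZK I/O:

```java
// Sketch of option 1 from SOLR-4718: a -DzkSolrXmlPath system property
// selects where solr.xml is loaded from. Hypothetical names throughout.
public class SolrXmlSourceSketch {
    static String describeSource() {
        String zkPath = System.getProperty("zkSolrXmlPath");
        if (zkPath != null) {
            // zkSolrXmlPath is meaningless without a ZK ensemble to talk to.
            String zkHost = System.getProperty("zkHost");
            if (zkHost == null) {
                throw new IllegalStateException("-DzkSolrXmlPath requires -DzkHost or -DzkRun");
            }
            return "zookeeper:" + zkHost + zkPath;
        }
        return "local:solr.xml"; // fall back to the on-disk file
    }

    public static void main(String[] args) {
        System.out.println(describeSource()); // local unless the sysprops are set
        System.setProperty("zkHost", "localhost:2181");
        System.setProperty("zkSolrXmlPath", "/solr.xml");
        System.out.println(describeSource()); // now resolved against ZooKeeper
    }
}
```

This also makes the "can't run multiple JVMs pointing to different files" con concrete: the choice is per-JVM because it hangs off system properties.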
[jira] [Commented] (LUCENE-5171) AnalyzingSuggester and FuzzySuggester should be able to share same FST
[ https://issues.apache.org/jira/browse/LUCENE-5171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742513#comment-13742513 ] Michael McCandless commented on LUCENE-5171: If you use FuzzySuggester with maxEdits=0, does it work? Or, maybe we should simply merge these two suggesters into AnalyzingSuggester and default maxEdits to 0? AnalyzingSuggester and FuzzySuggester should be able to share same FST -- Key: LUCENE-5171 URL: https://issues.apache.org/jira/browse/LUCENE-5171 Project: Lucene - Core Issue Type: Improvement Components: modules/other Affects Versions: 4.4, 4.3.1 Reporter: Anna Björk Nikulásdóttir Priority: Minor In my code I use both suggesters for the same FST. I use AnalyzingSuggester#store() to create the FST and later on AnalyzingSuggester#load() and FuzzySuggester#load() to use it. This approach works very well, but it unnecessarily creates 2 FST instances, resulting in 2x memory consumption. It seems that for the time being both suggesters use the same FST format. The following trivial method in AnalyzingSuggester provides the possibility to share the same FST among different instances of AnalyzingSuggester. It has been tested in the above scenario:
{code:java}
public boolean shareFstFrom(AnalyzingSuggester instance) {
  if (instance.fst == null) {
    return false;
  }
  this.fst = instance.fst;
  this.maxAnalyzedPathsForOneInput = instance.maxAnalyzedPathsForOneInput;
  this.hasPayloads = instance.hasPayloads;
  return true;
}
{code}
One could use it like this:
{code:java}
analyzingSugg = new AnalyzingSuggester(...);
fuzzySugg = new FuzzySuggester(...);
analyzingSugg.load(someInputStream);
fuzzySugg.shareFstFrom(analyzingSugg);
{code}
-- This message is automatically generated by JIRA. 
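The shareFstFrom() proposal above is reference sharing, not copying: after the call, both suggesters point at the same in-memory FST, so only one copy is ever resident. A minimal standalone sketch of that idea follows, using stand-in classes rather than Lucene's actual AnalyzingSuggester/FuzzySuggester:

```java
// Sketch of the sharing idea from LUCENE-5171: two suggester-like objects
// hold a reference to the same loaded structure instead of each loading
// its own copy. The classes here are illustrative stand-ins, not Lucene code.
public class SharedFstSketch {
    static class Suggester {
        Object fst;                       // stands in for the loaded FST
        int maxAnalyzedPathsForOneInput;
        boolean hasPayloads;

        // Mirrors the shareFstFrom() proposal: copy references, not data.
        boolean shareFstFrom(Suggester other) {
            if (other.fst == null) {
                return false;             // nothing loaded yet; nothing to share
            }
            this.fst = other.fst;
            this.maxAnalyzedPathsForOneInput = other.maxAnalyzedPathsForOneInput;
            this.hasPayloads = other.hasPayloads;
            return true;
        }
    }

    public static void main(String[] args) {
        Suggester analyzing = new Suggester();
        analyzing.fst = new Object();     // pretend load() already happened
        Suggester fuzzy = new Suggester();
        System.out.println(fuzzy.shareFstFrom(analyzing)); // sharing succeeds
        System.out.println(fuzzy.fst == analyzing.fst);    // one instance, not two
    }
}
```

The safety of this hinges on the shared structure being effectively immutable after load, which is why the issue notes both suggesters currently use the same FST format.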
[jira] [Commented] (LUCENE-4583) StraightBytesDocValuesField fails if bytes > 32k
[ https://issues.apache.org/jira/browse/LUCENE-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742521#comment-13742521 ] ASF subversion and git services commented on LUCENE-4583: - Commit 1514848 from [~mikemccand] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1514848 ] LUCENE-4583: IndexWriter no longer places a limit on length of DV binary fields (individual codecs still have their limits, including the default codec) StraightBytesDocValuesField fails if bytes > 32k Key: LUCENE-4583 URL: https://issues.apache.org/jira/browse/LUCENE-4583 Project: Lucene - Core Issue Type: Bug Components: core/index Affects Versions: 4.0, 4.1, 5.0 Reporter: David Smiley Assignee: Michael McCandless Priority: Critical Fix For: 5.0, 4.5 Attachments: LUCENE-4583.patch, LUCENE-4583.patch, LUCENE-4583.patch, LUCENE-4583.patch, LUCENE-4583.patch, LUCENE-4583.patch, LUCENE-4583.patch, LUCENE-4583.patch I didn't observe any limitations on the size of a bytes based DocValues field value in the docs. It appears that the limit is 32k, although I didn't get any friendly error telling me that was the limit. 32k is kind of small IMO; I suspect this limit is unintended and as such is a bug. The following test fails:
{code:java}
public void testBigDocValue() throws IOException {
  Directory dir = newDirectory();
  IndexWriter writer = new IndexWriter(dir, writerConfig(false));
  Document doc = new Document();
  BytesRef bytes = new BytesRef((4+4)*4097); // 4096 works
  bytes.length = bytes.bytes.length; // byte data doesn't matter
  doc.add(new StraightBytesDocValuesField("dvField", bytes));
  writer.addDocument(doc);
  writer.commit();
  writer.close();

  DirectoryReader reader = DirectoryReader.open(dir);
  DocValues docValues = MultiDocValues.getDocValues(reader, "dvField");
  // FAILS IF BYTES IS BIG!
  docValues.getSource().getBytes(0, bytes);
  reader.close();
  dir.close();
}
{code}
-- This message is automatically generated by JIRA. 
[jira] [Resolved] (LUCENE-4583) StraightBytesDocValuesField fails if bytes > 32k
[ https://issues.apache.org/jira/browse/LUCENE-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-4583. Resolution: Fixed
[jira] [Updated] (SOLR-5164) In some cases, creating collections via the Collections API fails due to core being created in the wrong directory
[ https://issues.apache.org/jira/browse/SOLR-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-5164: -- Summary: In some cases, creating collections via the Collections API fails due to core being created in the wrong directory (was: Creating collections via the Collections API fails due to core being created in the wrong directory) In some cases, creating collections via the Collections API fails due to core being created in the wrong directory Key: SOLR-5164 URL: https://issues.apache.org/jira/browse/SOLR-5164 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.5, 5.0 Reporter: Erick Erickson Assignee: Mark Miller Priority: Critical Fix For: 4.5, 5.0 Attachments: SOLR-5164.patch When you try to create a collection in SolrCloud, the instanceDir that gets created has an extra solr in it, which messes up the pathing for all the lib directives in solrconfig.xml as they're all relative.
[jira] [Created] (SOLR-5169) Provide a way to query for zookeeper quorum state and other cloud-related info
Shawn Heisey created SOLR-5169: -- Summary: Provide a way to query for zookeeper quorum state and other cloud-related info Key: SOLR-5169 URL: https://issues.apache.org/jira/browse/SOLR-5169 Project: Solr Issue Type: Improvement Components: SolrCloud Affects Versions: 4.4 Reporter: Shawn Heisey Priority: Minor Fix For: 4.5, 5.0 There should be a way, either through an existing admin handler or a new one, to get an up-to-the-moment zookeeper status. There may be other status information related to SolrCloud that could be included as well.
[jira] [Commented] (SOLR-5099) The core.properties not created during collection creation
[ https://issues.apache.org/jira/browse/SOLR-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742534#comment-13742534 ] ASF subversion and git services commented on SOLR-5099: --- Commit 1514857 from [~markrmil...@gmail.com] in branch 'dev/trunk' [ https://svn.apache.org/r1514857 ] SOLR-5164: add relative solr.home testing to some tests, explicitly check for expected instanceDir handling with relative solr.home SOLR-5099: explicity check for proper solrcore.properties creation Speed up some tests by setting leaderVoteWait to 0 The core.properties not created during collection creation -- Key: SOLR-5099 URL: https://issues.apache.org/jira/browse/SOLR-5099 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.5, 5.0 Reporter: Herb Jiang Assignee: Mark Miller Priority: Critical Fix For: 4.5, 5.0 Attachments: CorePropertiesLocator.java.patch When using the new solr.xml structure, the core auto-discovery mechanism tries to find core.properties. But I found that core.properties cannot be created when I dynamically create a collection. The root issue is that CorePropertiesLocator tries to create the properties file before the instanceDir is created. The collection creation process completes and looks fine at runtime, but it causes issues later (cores are not auto-discovered after server restart).
[jira] [Commented] (SOLR-5164) In some cases, creating collections via the Collections API fails due to core being created in the wrong directory
[ https://issues.apache.org/jira/browse/SOLR-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742533#comment-13742533 ] ASF subversion and git services commented on SOLR-5164: --- Commit 1514857 from [~markrmil...@gmail.com] in branch 'dev/trunk' [ https://svn.apache.org/r1514857 ] SOLR-5164: add relative solr.home testing to some tests, explicitly check for expected instanceDir handling with relative solr.home SOLR-5099: explicity check for proper solrcore.properties creation Speed up some tests by setting leaderVoteWait to 0
[jira] [Commented] (SOLR-5164) In some cases, creating collections via the Collections API fails due to core being created in the wrong directory
[ https://issues.apache.org/jira/browse/SOLR-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742537#comment-13742537 ] ASF subversion and git services commented on SOLR-5164: --- Commit 1514858 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1514858 ] SOLR-5164: add relative solr.home testing to some tests, explicitly check for expected instanceDir handling with relative solr.home SOLR-5099: explicity check for proper solrcore.properties creation Speed up some tests by setting leaderVoteWait to 0
[jira] [Commented] (SOLR-5099) The core.properties not created during collection creation
[ https://issues.apache.org/jira/browse/SOLR-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742538#comment-13742538 ] ASF subversion and git services commented on SOLR-5099: --- Commit 1514858 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1514858 ] SOLR-5164: add relative solr.home testing to some tests, explicitly check for expected instanceDir handling with relative solr.home SOLR-5099: explicity check for proper solrcore.properties creation Speed up some tests by setting leaderVoteWait to 0
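The root cause reported in SOLR-5099 — the properties file being written before the core's instanceDir exists — points at a general fix pattern: ensure the directory tree is created before writing into it. A minimal java.nio.file sketch of that pattern follows; the names are illustrative, not Solr's actual CorePropertiesLocator code:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Sketch of the fix pattern behind SOLR-5099: writing core.properties must
// first ensure the instanceDir exists, otherwise the write fails and cores
// are not auto-discovered on restart. Hypothetical names, not Solr code.
public class CorePropertiesSketch {
    static void writeCoreProperties(Path instanceDir, Properties props) throws IOException {
        Files.createDirectories(instanceDir);            // the step that was missing
        Path file = instanceDir.resolve("core.properties");
        try (OutputStream out = Files.newOutputStream(file)) {
            props.store(out, null);
        }
    }

    public static void main(String[] args) throws IOException {
        // instanceDir deliberately does not exist yet, mimicking the bug scenario.
        Path instanceDir = Files.createTempDirectory("solr").resolve("collection1_shard1_replica1");
        Properties props = new Properties();
        props.setProperty("name", "collection1_shard1_replica1");
        writeCoreProperties(instanceDir, props);
        System.out.println(Files.exists(instanceDir.resolve("core.properties"))); // true
    }
}
```

Without the Files.createDirectories call, Files.newOutputStream throws NoSuchFileException for the missing parent, which mirrors the silent-at-runtime, broken-on-restart behavior the issue describes.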
[JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.7.0_25) - Build # 7040 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/7040/ Java: 32bit/jdk1.7.0_25 -server -XX:+UseSerialGC 3 tests failed. FAILED: org.apache.solr.client.solrj.impl.CloudSolrServerTest.testDistribSearch Error Message: java.util.concurrent.TimeoutException: Could not connect to ZooKeeper 127.0.0.1:59345 within 3 ms Stack Trace: java.lang.RuntimeException: java.util.concurrent.TimeoutException: Could not connect to ZooKeeper 127.0.0.1:59345 within 3 ms at __randomizedtesting.SeedInfo.seed([19D57DD7E754F289:9833F3CF900B92B5]:0) at org.apache.solr.common.cloud.SolrZkClient.init(SolrZkClient.java:130) at org.apache.solr.common.cloud.SolrZkClient.init(SolrZkClient.java:93) at org.apache.solr.common.cloud.SolrZkClient.init(SolrZkClient.java:84) at org.apache.solr.cloud.AbstractZkTestCase.buildZooKeeper(AbstractZkTestCase.java:89) at org.apache.solr.cloud.AbstractZkTestCase.buildZooKeeper(AbstractZkTestCase.java:83) at org.apache.solr.cloud.AbstractDistribZkTestBase.setUp(AbstractDistribZkTestBase.java:70) at org.apache.solr.cloud.AbstractFullDistribZkTestBase.setUp(AbstractFullDistribZkTestBase.java:193) at org.apache.solr.client.solrj.impl.CloudSolrServerTest.setUp(CloudSolrServerTest.java:71) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:771) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at
[jira] [Commented] (LUCENE-5179) Refactoring on PostingsWriterBase for delta-encoding
[ https://issues.apache.org/jira/browse/LUCENE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742561#comment-13742561 ] Michael McCandless commented on LUCENE-5179: So, the idea with this patch is to go back to letting the PBF encode the metadata for the term? Just, one term at a time, not the whole block that we have on trunk today. And the reason for this is back-compat? Ie, so that in test-framework we can have writers for the old formats? One thing that this change precludes is having the terms dict use different encodings than simple delta vInt to encode the long[] metadata, e.g. Simple9/16 or something? But that's OK ... we can explore those later. It's sort of frustrating to have to compromise the design just for back-compat ... e.g. we could instead cheat a bit, and have the writers write the newer format. It's easy to make the readers read either format right? But ... I don't understand how this change helps Pulsing, or rather why Pulsing would have trouble w/ the API we have today? Refactoring on PostingsWriterBase for delta-encoding Key: LUCENE-5179 URL: https://issues.apache.org/jira/browse/LUCENE-5179 Project: Lucene - Core Issue Type: Improvement Reporter: Han Jiang Assignee: Han Jiang Fix For: 5.0, 4.5 Attachments: LUCENE-5179.patch A further step from LUCENE-5029. The short story is, previous API change brings two problems: * it somewhat breaks backward compatibility: although we can still read old format, we can no longer reproduce it; * pulsing codec have problem with it. And long story... With the change, current PostingsBase API will be like this: * term dict tells PBF we start a new term (via startTerm()); * PBF adds docs, positions and other postings data; * term dict tells PBF all the data for current term is completed (via finishTerm()), then PBF returns the metadata for current term (as long[] and byte[]); * term dict might buffer all the metadata in an ArrayList. 
When all the terms are collected, it then decides how that metadata will be laid out on disk. So after the API change, PBF no longer has that annoying 'flushTermBlock'; instead the term dict maintains the term/metadata list. However, for each term we'll now write the long[] blob before the byte[], so the index format is not consistent with pre-4.5: in Lucene41 the metadata can be written as longA,bytesA,longB, but now we have to write longA,longB,bytesA. Another problem is that the pulsing codec cannot tell the wrapped PBF how the metadata is delta-encoded; after all, PulsingPostingsWriter is only a PBF. For example, if we have terms=[a, a1, a2, b, b1, b2] and itemsInBlock=2, theoretically we'll finally have three blocks in BTTR: [a b] [a1 a2] [b1 b2]. With this approach, the metadata of term b should be delta-encoded based on the metadata of a, but when the term dict tells PBF to finishTerm(b), it might naively delta-encode based on term a2. So I think maybe we can introduce a method 'encodeTerm(long[], DataOutput out, FieldInfo, TermState, boolean absolute)', so that during metadata flush we can control how the current term is written. The term dict will buffer TermState, which implicitly holds metadata like we do on the PBReader side. For example, if we want to reproduce the old Lucene41 format, we can simply set longsSize==0, then PBF writes the old format (longA,bytesA,longB) to DataOutput, and the compatibility issue is solved. For the pulsing codec, it will also be able to tell the lower level how to encode metadata.
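The proposed encodeTerm(long[], DataOutput, FieldInfo, TermState, boolean absolute) hands the writer an explicit choice between absolute and delta encoding per term. A standalone sketch (not Lucene code) of what that boolean buys follows; for simplicity it collects the encoded values in a list instead of writing to a DataOutput:

```java
import java.util.ArrayList;
import java.util.List;

// Standalone sketch of the delta-vs-absolute metadata encoding discussed
// above: a term's long[] metadata is written either as raw values
// ("absolute", e.g. for the first term of a block) or as deltas against the
// previously encoded term, which is what the proposed encodeTerm(...,
// boolean absolute) flag would let the term dict control.
public class DeltaEncodeSketch {
    private final long[] last; // metadata of the previously encoded term

    DeltaEncodeSketch(int longsSize) {
        this.last = new long[longsSize];
    }

    /** Encodes one term's metadata, mirroring the proposed encodeTerm contract. */
    List<Long> encodeTerm(long[] longs, boolean absolute) {
        List<Long> out = new ArrayList<>();
        for (int i = 0; i < longs.length; i++) {
            out.add(absolute ? longs[i] : longs[i] - last[i]);
            last[i] = longs[i]; // remember for the next delta base
        }
        return out;
    }

    public static void main(String[] args) {
        DeltaEncodeSketch enc = new DeltaEncodeSketch(2);
        // The block-initial term is written absolute...
        System.out.println(enc.encodeTerm(new long[] {100, 7}, true));  // [100, 7]
        // ...later terms in the same block are small deltas against it.
        System.out.println(enc.encodeTerm(new long[] {140, 9}, false)); // [40, 2]
    }
}
```

The point of the flag in the blocking example above is exactly the base-term mix-up described: with terms laid out as [a b] [a1 a2] [b1 b2], the term dict can call encodeTerm for b with absolute=false relative to a, instead of the writer blindly delta-encoding against the last finishTerm'd term a2.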
Re: [JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.7.0_25) - Build # 7040 - Still Failing!
I'll look to see if this was somehow me soon - all tests passed locally. Mark Sent from my iPhone On Aug 16, 2013, at 3:54 PM, Policeman Jenkins Server jenk...@thetaphi.de wrote: Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/7040/ Java: 32bit/jdk1.7.0_25 -server -XX:+UseSerialGC 3 tests failed. FAILED: org.apache.solr.client.solrj.impl.CloudSolrServerTest.testDistribSearch Error Message: java.util.concurrent.TimeoutException: Could not connect to ZooKeeper 127.0.0.1:59345 within 3 ms [...]
[jira] [Commented] (SOLR-5168) BJQParserTest reproducible failures
[ https://issues.apache.org/jira/browse/SOLR-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742569#comment-13742569 ] Mikhail Khludnev commented on SOLR-5168: I wonder, how it could work (it seems I wrote it myself - my fault). https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/test/org/apache/solr/search/join/BJQParserTest.java#L56 test doesn't use block add, but adds docs one by one, hence a block can be broken by commit {code} public static void createIndex() ... assertU(add(doc(idDoc))); {code} BJQParserTest reproducible failures --- Key: SOLR-5168 URL: https://issues.apache.org/jira/browse/SOLR-5168 Project: Solr Issue Type: Bug Reporter: Hoss Man Assignee: Yonik Seeley two recent Jenkins builds have uncovered some test seeds that cause failures in multiple test methods in BJQParserTest. These seeds reproduce reliably (as of trunk r1514815) ... {noformat} ant test -Dtestcase=BJQParserTest -Dtests.seed=7A613F321CE87F5B -Dtests.multiplier=3 -Dtests.slow=true ant test -Dtestcase=BJQParserTest -Dtests.seed=1DC8055F837E437E -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true {noformat}
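The failure mode Mikhail describes — documents added one at a time, with a commit landing mid-block — can be modeled without Solr: index-time block join requires a parent's children to sit contiguously, immediately before the parent, in a single added block. A toy sketch of that invariant follows (stand-in classes, not Solr or Lucene code):

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the block-join invariant behind the BJQParserTest failures:
// a parent's children must land contiguously before the parent in one
// add-block. Adding docs one at a time lets a commit slip in between,
// splitting the block across "segments". Illustrative only, not Solr code.
public class BlockAddSketch {
    static List<List<String>> index = new ArrayList<>(); // flushed segments
    static List<String> pending = new ArrayList<>();     // uncommitted docs

    static void add(String doc) { pending.add(doc); }
    static void addBlock(List<String> docs) { pending.addAll(docs); commit(); }
    static void commit() {
        if (!pending.isEmpty()) { index.add(new ArrayList<>(pending)); pending.clear(); }
    }

    // Block join only works if children + parent sit together in one segment.
    static boolean blockIntact(String parent, List<String> children) {
        for (List<String> segment : index) {
            int p = segment.indexOf(parent);
            if (p >= children.size() && segment.subList(p - children.size(), p).equals(children)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        add("child1"); commit();          // a commit sneaks in mid-block...
        add("child2"); add("parent"); commit();
        System.out.println(blockIntact("parent", List.of("child1", "child2"))); // block broken
        index.clear();
        addBlock(List.of("child1", "child2", "parent"));
        System.out.println(blockIntact("parent", List.of("child1", "child2"))); // block intact
    }
}
```

This is why a test that calls add() per document is sensitive to randomized commits, while one that adds the whole parent/child block in a single request is not.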
[jira] [Commented] (LUCENE-5179) Refactoring on PostingsWriterBase for delta-encoding
[ https://issues.apache.org/jira/browse/LUCENE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742575#comment-13742575 ] Robert Muir commented on LUCENE-5179: - Is it for real back compat or for impersonation?
Refactoring on PostingsWriterBase for delta-encoding Key: LUCENE-5179 URL: https://issues.apache.org/jira/browse/LUCENE-5179 Project: Lucene - Core Issue Type: Improvement Reporter: Han Jiang Assignee: Han Jiang Fix For: 5.0, 4.5 Attachments: LUCENE-5179.patch
A further step from LUCENE-5029. The short story is, the previous API change brings two problems:
* it somewhat breaks backward compatibility: although we can still read the old format, we can no longer reproduce it;
* the pulsing codec has a problem with it.
And the long story... With the change, the current PostingsBase API works like this:
* the term dict tells the PBF we start a new term (via startTerm());
* the PBF adds docs, positions and other postings data;
* the term dict tells the PBF all the data for the current term is complete (via finishTerm()), and the PBF returns the metadata for the current term (as long[] and byte[]);
* the term dict might buffer all the metadata in an ArrayList; when all the terms are collected, it then decides how that metadata will be laid out on disk.
So after the API change, the PBF no longer has that annoying 'flushTermBlock'; instead the term dict maintains the term/metadata list. However, for each term we now write the long[] blob before the byte[], so the index format is not consistent with pre-4.5: in Lucene41 the metadata could be written as longA,bytesA,longB, but now we have to write longA,longB,bytesA. Another problem is that the pulsing codec cannot tell the wrapped PBF how the metadata is delta-encoded; after all, PulsingPostingsWriter is only a PBF.
For example, suppose we have terms=[a, a1, a2, b, b1, b2] and itemsInBlock=2, so in theory we'll finally have three blocks in BTTR: [a b] [a1 a2] [b1 b2]. With this approach, the metadata of term b is delta-encoded based on the metadata of a, but when the term dict tells the PBF to finishTerm(b), it might naively delta-encode based on term a2. So I think maybe we can introduce a method 'encodeTerm(long[], DataOutput out, FieldInfo, TermState, boolean absolute)', so that during metadata flush we can control how the current term is written. The term dict will buffer TermState, which implicitly holds metadata like we do on the PBReader side. For example, if we want to reproduce the old Lucene41 format, we can simply set longsSize==0, then the PBF writes the old format (longA,bytesA,longB) to the DataOutput, and the compatibility issue is solved. For the pulsing codec, it will also be able to tell the lower level how to encode metadata.
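The proposed 'absolute' flag can be illustrated with a small self-contained sketch (EncodeTermDemo is a hypothetical stand-in, not actual Lucene code): when absolute is true the term's long[] metadata is written as-is; when false, only deltas against a caller-chosen previous term are written, so the term dict rather than the postings writer controls the reference point.

```java
import java.util.Arrays;

// Hedged sketch of the delta-vs-absolute metadata encoding discussed above.
// EncodeTermDemo and encode() are hypothetical names, not Lucene API.
public class EncodeTermDemo {
    // Encode longs either absolutely or as deltas against 'last',
    // mirroring the proposed encodeTerm(..., boolean absolute) contract.
    static long[] encode(long[] longs, long[] last, boolean absolute) {
        long[] out = new long[longs.length];
        for (int i = 0; i < longs.length; i++) {
            out[i] = absolute ? longs[i] : longs[i] - last[i];
        }
        return out;
    }

    public static void main(String[] args) {
        long[] termA = {100, 7};   // e.g. file pointer, skip offset
        long[] termB = {160, 9};
        // First term of a block: the caller asks for absolute encoding.
        System.out.println(Arrays.toString(encode(termA, new long[2], true)));
        // Subsequent term: delta against the block's first term (a, not a2),
        // because the term dict - not the PBF - picks the reference term.
        System.out.println(Arrays.toString(encode(termB, termA, false)));
    }
}
```

The point of passing the flag per call is that the term dict can reset to absolute encoding at each block boundary, which a wrapper like PulsingPostingsWriter could not otherwise know about.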
[jira] [Commented] (SOLR-5125) Distributed MoreLikeThis fails with NullPointerException, shard query gives EarlyTerminatingCollectorException
[ https://issues.apache.org/jira/browse/SOLR-5125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742600#comment-13742600 ] Shawn Heisey commented on SOLR-5125: Does anyone have any ideas here? The same thing happens with a 4x snapshot: 4.5-SNAPSHOT 1514424 - ncindex - 2013-08-15 12:56:50
Distributed MoreLikeThis fails with NullPointerException, shard query gives EarlyTerminatingCollectorException -- Key: SOLR-5125 URL: https://issues.apache.org/jira/browse/SOLR-5125 Project: Solr Issue Type: Bug Components: MoreLikeThis Affects Versions: 4.4 Reporter: Shawn Heisey Fix For: 4.5, 5.0
A distributed MoreLikeThis query that works perfectly on 4.2.1 is failing on 4.4.0. The original query returns a NullPointerException. The Solr log shows that the shard queries are throwing EarlyTerminatingCollectorException. Full details to follow in the comments.
[jira] [Commented] (LUCENE-5179) Refactoring on PostingsWriterBase for delta-encoding
[ https://issues.apache.org/jira/browse/LUCENE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742605#comment-13742605 ] Michael McCandless commented on LUCENE-5179: I believe it's for impersonation. Real back-compat (reader can read the old index format using the new APIs) should work fine, I think?
[jira] [Commented] (SOLR-5168) BJQParserTest reproducible failures
[ https://issues.apache.org/jira/browse/SOLR-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742607#comment-13742607 ] Yonik Seeley commented on SOLR-5168: bq. test doesn't use block add
Yeah, I thought that was on purpose, to test the query separately from any block indexing. The simplest fix would be to disable the random IW stuff for this test (it would always work if the buffering in IW is enough that the docs are flushed to a single segment). Optimizing after the fact, in conjunction with the log merge policy, would also work.
[jira] [Commented] (SOLR-5099) The core.properties not created during collection creation
[ https://issues.apache.org/jira/browse/SOLR-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742615#comment-13742615 ] Mark Miller commented on SOLR-5099: --- For bug fixes to unreleased issues that a non-committer contributes toward, we should add credit to the issue that caused the bug. If it's minor in comparison to the original issue, we tend to create sub-entries in Changes - see some previous examples in Changes. I'll make an update here.
The core.properties not created during collection creation -- Key: SOLR-5099 URL: https://issues.apache.org/jira/browse/SOLR-5099 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.5, 5.0 Reporter: Herb Jiang Assignee: Mark Miller Priority: Critical Fix For: 4.5, 5.0 Attachments: CorePropertiesLocator.java.patch
When using the new solr.xml structure, the core auto-discovery mechanism tries to find core.properties, but I found that core.properties is not created when I dynamically create a collection. The root issue is that CorePropertiesLocator tries to create the properties file before the instanceDir is created. The collection creation process completes and looks fine at runtime, but it causes issues later (cores are not auto-discovered after server restart).
[jira] [Created] (SOLR-5170) Spatial multi-value distance sort via DocValues
David Smiley created SOLR-5170: -- Summary: Spatial multi-value distance sort via DocValues Key: SOLR-5170 URL: https://issues.apache.org/jira/browse/SOLR-5170 Project: Solr Issue Type: New Feature Components: spatial Reporter: David Smiley Assignee: David Smiley
The attached patch implements spatial multi-value distance sorting. In other words, a document can have more than one point per field, and using a provided function query, it will return the distance to the closest point. The data goes into binary DocValues, and as such it's pretty friendly to realtime search requirements, and it only uses 8 bytes per point.
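The "8 bytes per point" figure suggests each lat/lon pair is stored as two 32-bit floats packed into one long; the patch's actual on-disk layout isn't shown in this thread, so the following pure-Java sketch only illustrates that size math under that assumption (PackedPoint is a hypothetical name):

```java
// Hedged sketch: packing a lat/lon point into 8 bytes as two 32-bit floats.
// This illustrates the "8 bytes per point" size claim; the actual layout in
// the SOLR-5170 patch may differ. PackedPoint is a hypothetical class.
public class PackedPoint {
    static long pack(float lat, float lon) {
        // High 32 bits hold the latitude bits, low 32 bits the longitude bits.
        return ((long) Float.floatToIntBits(lat) << 32)
                | (Float.floatToIntBits(lon) & 0xFFFFFFFFL);
    }

    static float lat(long packed) { return Float.intBitsToFloat((int) (packed >>> 32)); }

    static float lon(long packed) { return Float.intBitsToFloat((int) packed); }

    public static void main(String[] args) {
        long p = pack(42.5f, -71.1f);
        // One long = 8 bytes per point; the floats round-trip exactly.
        System.out.println(Long.BYTES + " bytes: " + lat(p) + "," + lon(p));
    }
}
```

A binary DocValues field could then hold a concatenation of such longs per document, which is why adding points stays cheap for near-realtime use.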
[jira] [Updated] (SOLR-5168) BJQParserTest reproducible failures
[ https://issues.apache.org/jira/browse/SOLR-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Khludnev updated SOLR-5168: --- Attachment: BJQTest.patch
First patch. It solves most of the tests, but testGrandChildren() still fails on a broken block.
[jira] [Updated] (SOLR-5170) Spatial multi-value distance sort via DocValues
[ https://issues.apache.org/jira/browse/SOLR-5170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated SOLR-5170: --- Attachment: SOLR-5170_spatial_multi-value_sort_via_docvalues.patch
*The first patch is not committable*.
* The biggest reason is an awkward hack to work around the fact that a Solr FieldType can't aggregate multiple values into a single BinaryDocValuesField. So I've got this UpdateRequestProcessor that works in concert with the field. SOLR-4329
* Secondly, it needs more tests. It has been working in quasi-production for many months, though.
* And thirdly, I'd prefer to see this mechanism integrated into the Lucene spatial framework somehow.
If you want to know how to use it, look at the tests. I'm providing this because I got permission to open-source it and people want this capability. Once SOLR-4329 is addressed, I'll work on this code more to make it commit-worthy.
[jira] [Commented] (SOLR-5170) Spatial multi-value distance sort via DocValues
[ https://issues.apache.org/jira/browse/SOLR-5170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742654#comment-13742654 ] Robert Muir commented on SOLR-5170: --- Why use BINARY vs SORTED_SET? That has a much easier fit in Solr, to boot; it's designed for multiple values...
Luceneutil high variability between runs
Hello, I'm trying to benchmark a change to BM25Similarity (LUCENE-5175) using luceneutil. I'm running this on a lightly loaded machine with a load average (top) of about 0.01 when the benchmark is not running. I made the following changes:
1) localrun.py: changed Competition(debug=True) to Competition(debug=False)
2) made the following changes to localconstants.py per Robert Muir's suggestion: JAVA_COMMAND = 'java -server -Xms4g -Xmx4g' SEARCH_NUM_THREADS = 1
3) for the BM25 tests set SIMILARITY_DEFAULT='BM25Similarity'
4) for the BM25 tests uncommented the following line from searchBench.py: #verifyScores = False
Attached is output from iter 19 of several runs. The first 4 runs show consistently that the modified version is somewhere between 6% and 8% slower on the tasks with the highest difference between trunk and patch. However, if you look at the baseline TaskQPS for HighTerm, for example, run 3 is about 55 and run 1 is about 88. So the difference for this task between different runs of the bench program is very much higher than the difference between trunk and modified/patch within a run. Is this to be expected? Is there a reason I should believe the differences shown within a run reflect the true differences? Seeing this variability, I then switched DEFAULT_SIMILARITY back to DefaultSimilarity. In this case trunk and my_modified should be exercising exactly the same code, since the only changes in the patch are the addition of a test case for BM25Similarity and a change to BM25Similarity. In this case the modified version varies from -6.2% difference from the base to +4.4% difference from the base for LowTerm. Comparing QPS for the base case for HighTerm between different runs, we can see it varies from about 21 for run 1 to 76 for run 3. Is this kind of variation between runs of the benchmark to be expected? Any suggestions about where to look to reduce the variation between runs?
Tom
BM25Similarity runs where my_modified_version is LUCENE-
tail -33 BM25SimRun1 | head -5
Report after iter 19:
Task          QPS baseline (StdDev)   QPS my_modified_version (StdDev)   Pct diff
HighTerm       87.91 (13.2%)           81.02 (8.5%)    -7.8% ( -26% - 16%)
MedTerm       111.81 (13.2%)          103.11 (8.4%)    -7.8% ( -25% - 15%)
LowTerm       411.44 (17.7%)          382.47 (14.5%)   -7.0% ( -33% - 30%)
[tburtonw@alamo runs]$ tail -33 BM25SimRun2 | head -5
Report after iter 19:
HighTerm       62.15 (6.4%)            58.10 (7.1%)    -6.5% ( -18% - 7%)
MedTerm       139.11 (4.5%)           130.22 (7.5%)    -6.4% ( -17% - 5%)
LowTerm       391.93 (10.5%)          373.71 (13.1%)   -4.6% ( -25% - 21%)
[tburtonw@alamo runs]$ tail -33 BM25SimRun3 | head -5
Report after iter 19:
HighTerm       54.85 (6.5%)            50.18 (1.6%)    -8.5% ( -15% - 0%)
MedTerm       146.04 (8.6%)           137.31 (4.7%)    -6.0% ( -17% - 8%)
OrNotHighLow   45.85 (11.1%)           43.37 (10.6%)   -5.4% ( -24% - 18%)
[tburtonw@alamo runs]$ tail -33 BM25SimRun4 | head -5
Report after iter 19:
OrNotHighMed   49.40 (8.7%)            45.37 (8.8%)    -8.2% ( -23% - 10%)
OrNotHighLow   65.48 (8.7%)            60.19 (9.0%)    -8.1% ( -23% - 10%)
OrNotHighHigh  37.06 (8.2%)            34.18 (8.2%)    -7.8% ( -22% - 9%)
== Default similarity, which is not modified by the BM25 patch
DefaultSimRun1
LowTerm       398.97 (17.9%)          398.94 (18.1%)   -0.0% ( -30% - 43%)
HighTerm       21.13 (12.1%)           21.45 (12.2%)    1.5% ( -20% - 29%)
DefaultSimRun2
LowTerm       406.93 (17.1%)          381.51 (15.8%)   -6.2% ( -33% - 32%)
HighTerm       59.21 (2.5%)            59.70 (3.5%)     0.8% ( -5% - 7%)
DefaultSimRun3
LowTerm       431.59 (18.5%)          450.55 (16.8%)    4.4% ( -26% - 48%)
HighTerm       76.45 (2.0%)            76.45 (1.7%)     0.0% ( -3% - 3%)
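The numbers above already quantify the problem. A quick pure-Java check (figures copied from the runs; RunVariance is just an illustrative name) compares the between-run spread of the baseline HighTerm QPS with the ~7% within-run difference being measured:

```java
// Hedged sketch: compare the between-run spread of baseline HighTerm QPS
// (87.91, 62.15, 54.85 from BM25SimRun1-3 above) with the ~7% within-run
// diff the benchmark reports. RunVariance is a hypothetical helper.
public class RunVariance {
    public static void main(String[] args) {
        double[] baselineHighTerm = {87.91, 62.15, 54.85};
        double mean = 0, min = baselineHighTerm[0], max = baselineHighTerm[0];
        for (double q : baselineHighTerm) {
            mean += q;
            min = Math.min(min, q);
            max = Math.max(max, q);
        }
        mean /= baselineHighTerm.length;
        // Spread of the baseline across runs, as a percentage of the mean.
        double spreadPct = 100.0 * (max - min) / mean;
        // The between-run spread is several times the ~7% within-run diff,
        // which is why only the within-run comparison (same seed, same data,
        // interleaved measurement) is normally treated as meaningful.
        System.out.printf("mean=%.2f spread=%.0f%%%n", mean, spreadPct);
    }
}
```

Here the between-run spread works out to roughly half the mean, dwarfing the 6-8% within-run differences, which is consistent with the advice in the reply below to trust only the within-run comparison.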
[jira] [Commented] (LUCENE-5179) Refactoring on PostingsWriterBase for delta-encoding
[ https://issues.apache.org/jira/browse/LUCENE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742665#comment-13742665 ] Robert Muir commented on LUCENE-5179: - We have had imperfect impersonation before (for example PreFlexRWFieldInfosReader). But the idea was to exercise to the best extent possible: e.g. if somehow we can make a Reader in the RW package (impersonator) that subclasses the real reader and overrides the term metadata piece, at least we are still testing the postings lists and term bytes and so on, and the real reader in lucene/core still gets some basic tests from TestBackwardsCompatibility.
Re: Luceneutil high variability between runs
I think the raw values don't matter so much because there is some randomization involved, and the same random seed is used... Your DefaultSimilarity runs look pretty stable: between 0.0% and 1.5% variation, which is about as good as it gets for HighTerm. LowTerm, I am guessing, is always noisy because those queries are so fast. A few of these measures at least are; I know IntNRQ is, particularly :)
On Fri, Aug 16, 2013 at 6:20 PM, Tom Burton-West tburt...@umich.edu wrote: ...
[jira] [Commented] (SOLR-5164) In some cases, creating collections via the Collections API fails due to core being created in the wrong directory
[ https://issues.apache.org/jira/browse/SOLR-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742719#comment-13742719 ] Mark Miller commented on SOLR-5164: --- I added some important new testing - we were not really testing with a relative solr.home at all; now it randomly uses one. I also added explicit testing to make sure the instance dir for Collections-API-created cores is correct.
In some cases, creating collections via the Collections API fails due to core being created in the wrong directory Key: SOLR-5164 URL: https://issues.apache.org/jira/browse/SOLR-5164 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.5, 5.0 Reporter: Erick Erickson Assignee: Mark Miller Priority: Critical Fix For: 4.5, 5.0 Attachments: SOLR-5164.patch
When you try to create a collection in SolrCloud, the instanceDir that gets created has an extra 'solr' in it, which messes up the pathing for all the lib directives in solrconfig.xml, as they're all relative.
[jira] [Commented] (SOLR-5099) The core.properties not created during collection creation
[ https://issues.apache.org/jira/browse/SOLR-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742720#comment-13742720 ] Mark Miller commented on SOLR-5099: --- I added an explicit test to make sure the core.properties file is created.
[jira] [Resolved] (SOLR-5164) In some cases, creating collections via the Collections API fails due to core being created in the wrong directory
[ https://issues.apache.org/jira/browse/SOLR-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved SOLR-5164. --- Resolution: Fixed