[jira] [Commented] (SOLR-4718) Allow solr.xml to be stored in zookeeper
[ https://issues.apache.org/jira/browse/SOLR-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13736216#comment-13736216 ] Noble Paul commented on SOLR-4718:
--
I'm wondering why solr.xml is an XML file at all. From what I can see, why can't we just make it a properties file?

Allow solr.xml to be stored in zookeeper

Key: SOLR-4718
URL: https://issues.apache.org/jira/browse/SOLR-4718
Project: Solr
Issue Type: Improvement
Components: Schema and Analysis
Affects Versions: 4.3, 5.0
Reporter: Erick Erickson
Assignee: Erick Erickson
Attachments: SOLR-4718.patch

So the near-final piece of this puzzle is to make solr.xml storable in ZooKeeper. Code-wise, in terms of Solr, this doesn't look very difficult; I'm working on it now. More interesting is how to get the configuration into ZK in the first place: enhancements to ZkCli? Or bootstrap-conf? Other? I'm punting on that for this patch.

The second level is how to tell Solr to get the file from ZK. Some possibilities:

1> A system prop, -DzkSolrXmlPath=blah, where blah is the path _on zk_ where the file is. Would require -DzkHost or -DzkRun as well.
pros:
- simple, I can wrap my head around it
- easy to script
cons:
- can't run multiple JVMs pointing to different files. Is this really a problem?

2> A new solr.xml element. Something like:
{code}
<solr>
  <solrcloud>
    <str name="zkHost">zkurl</str>
    <str name="zkSolrXmlPath">whatever</str>
  </solrcloud>
</solr>
{code}
Really, this form would hinge on the presence or absence of zkSolrXmlPath. If present, go up and look for the indicated solr.xml file on ZK. Any properties in the ZK version would overwrite anything in the local copy.

NOTE: I'm really not very interested in supporting this as an option for old-style solr.xml unless it's _really_ easy. For instance, what if the local solr.xml is new-style and the one in ZK is old-style? Or vice versa? Since old-style is going away, this doesn't seem like it's worth the effort.

pros:
- no new mechanisms
cons:
- once again requires that there be a solr.xml file on each client. Admittedly, for installations that didn't care much about multiple JVMs, it could be a stock file that didn't change...

For now, I'm going to just manually push solr.xml to ZK, then read it based on a sysprop. That'll get the structure in place while we debate. Not going to check this in until there's some consensus, though.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
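Option 1> above can be sketched as a tiny resolver: given the proposed -DzkSolrXmlPath (plus -DzkHost or -DzkRun) values, decide whether solr.xml comes from ZK or from local disk. This is an illustrative sketch only; the method name and the "zk:"/"file:" result encoding are assumptions for the example, not Solr's actual API.

```java
// Sketch of option 1>: decide where solr.xml comes from based on system props.
// The prop semantics mirror the proposal (-DzkSolrXmlPath, -DzkHost, -DzkRun);
// the "zk:"/"file:" result prefixes are an illustrative convention, not Solr API.
class SolrXmlLocator {

    /** Returns "zk:" + path when a ZK path and connection are configured, else "file:solr.xml". */
    static String resolve(String zkHost, String zkRun, String zkSolrXmlPath) {
        if (zkSolrXmlPath != null) {
            // -DzkSolrXmlPath only makes sense together with -DzkHost or -DzkRun
            if (zkHost == null && zkRun == null) {
                throw new IllegalArgumentException(
                    "zkSolrXmlPath requires zkHost or zkRun to be set");
            }
            return "zk:" + zkSolrXmlPath;
        }
        return "file:solr.xml"; // fall back to the local file
    }
}
```

In this scheme, the "can't run multiple JVMs pointing to different files" con shows up directly: the resolution is per-JVM, driven by that JVM's sysprops.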
[jira] [Commented] (LUCENE-4906) PostingsHighlighter's PassageFormatter should allow for rendering to arbitrary objects
[ https://issues.apache.org/jira/browse/LUCENE-4906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13736240#comment-13736240 ] Luca Cavanna commented on LUCENE-4906:
--
Hi Mike,
I definitely agree that the highlighting API should be simple, and the postings highlighter is probably the only one that's really easy to use. On the other hand, I think it's good to make explicit that if you use a Formatter<YourObject>, YourObject is what you're going to get back from the highlighter. People using the string version wouldn't notice the change, while advanced users would extend the base class and get type safety too, which in my opinion makes it clearer and easier. Using Object feels to me a little old-fashioned and bogus, but again, that's probably just me :) I do trust your experience, though. If you think the Object version is better, that's fine with me. What I care about is that this improvement gets committed soon, since it's a really useful one ;) Thanks a lot for sharing your ideas.

PostingsHighlighter's PassageFormatter should allow for rendering to arbitrary objects

Key: LUCENE-4906
URL: https://issues.apache.org/jira/browse/LUCENE-4906
Project: Lucene - Core
Issue Type: Improvement
Reporter: Michael McCandless
Attachments: LUCENE-4906.patch, LUCENE-4906.patch

For example, in a server, I may want to render the highlight result to a JsonObject to send back to the front-end. Today, since we render to String, I have to render to a JSON string and then re-parse it into a JsonObject, which is inefficient... Or, if (Rob's idea) we make a query that's like MoreLikeThis but pulls terms from snippets instead, so you get proximity-influenced salient/expanded terms, then perhaps that renders to just an array of tokens or fragments or something from each snippet.
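The generics question in the comment can be made concrete with a small stand-in hierarchy. These types are simplified stand-ins invented for the example, not Lucene's actual PassageFormatter API: a generic base class whose format() returns T, plus a default String implementation, so existing string users are unaffected while advanced users get type safety.

```java
import java.util.List;

// Simplified stand-in for the PassageFormatter generics discussion;
// the types and method signature here are illustrative, not Lucene's real API.
abstract class Formatter<T> {
    /** Render the matched terms of one snippet to an arbitrary result type T. */
    abstract T format(List<String> snippetTerms);
}

// The default String-based formatter: users of the string version see no change.
class StringFormatter extends Formatter<String> {
    @Override
    String format(List<String> snippetTerms) {
        StringBuilder sb = new StringBuilder();
        for (String term : snippetTerms) {
            sb.append("<b>").append(term).append("</b> "); // simple bold markup
        }
        return sb.toString().trim();
    }
}
```

A server could then declare a hypothetical Formatter<JsonObject> and get a JsonObject back directly, avoiding the render-to-string-and-reparse round trip the issue describes.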
[jira] [Commented] (LUCENE-4906) PostingsHighlighter's PassageFormatter should allow for rendering to arbitrary objects
[ https://issues.apache.org/jira/browse/LUCENE-4906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13736242#comment-13736242 ] Luca Cavanna commented on LUCENE-4906:
--
One more thing: re-reading Robert's previous comments, I also find interesting his idea about changing the API to return a proper object instead of the Map<String, String[]>, or the String[] for the simplest methods. I wonder if it's worth addressing this as well in this issue, or if the current API is clear enough in your opinion. Any thoughts?

PostingsHighlighter's PassageFormatter should allow for rendering to arbitrary objects
Key: LUCENE-4906
URL: https://issues.apache.org/jira/browse/LUCENE-4906
[jira] [Updated] (LUCENE-5164) Remove the OOM catching in SimpleFSDirectory and NIOFSDirectory
[ https://issues.apache.org/jira/browse/LUCENE-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-5164:
--
Attachment: LUCENE-5164.patch

New patch:
- The NIOFS read loop was further cleaned up and simplified by using the ByteBuffer tracking.
- The setter/getter in FSDirectory are now no-ops (deprecated).
- Every implementation has its own chunk size, which fits the underlying IO layer. For RandomAccessFile this is 8192 bytes.

I decided not to put the chunking into Buffered*, as it is still separate and would complicate the Buffered* code even more.

Remove the OOM catching in SimpleFSDirectory and NIOFSDirectory

Key: LUCENE-5164
URL: https://issues.apache.org/jira/browse/LUCENE-5164
Project: Lucene - Core
Issue Type: Bug
Reporter: Uwe Schindler
Assignee: Uwe Schindler
Attachments: LUCENE-5164.patch, LUCENE-5164.patch

Followup from LUCENE-5161: In former times we added the OOM catching in NIOFSDir and SimpleFSDir because nobody understood why an OOM could happen on FileChannel.read() or SimpleFSDir's read. By reading the Java code it's easy to understand: it allocates direct buffers with the same size as the requested length to read. Now that chunking is reduced to a few kilobytes, we can no longer get spurious OOMs; in fact, we might hide a *real* OOM! So we should remove it.

I am also not sure we should make the chunk size configurable in FSDirectory at all! It makes no sense to me (it was in fact only added so people that hit the OOM could fine-tune). In my opinion we should remove the setter in trunk and keep it deprecated in 4.x. The buffer size in trunk is then equal to the defaults from LUCENE-5161.
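The chunking idea behind the patch, in a self-contained sketch: instead of asking the IO layer for the full requested length in one call (which, for NIO, allocates a direct buffer of the same size), read through a small fixed-size buffer in a loop. This is a generic illustration against a plain ReadableByteChannel, assuming the 8192-byte chunk size the issue settles on; it is not the actual NIOFSDirectory code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;

// Sketch of the chunking idea: a small, reused buffer bounds how much any
// single channel read can request, so the NIO layer never allocates a
// direct buffer as large as the whole requested length.
class ChunkedReader {
    static final int CHUNK_SIZE = 8192; // the default chosen in LUCENE-5164

    /** Reads exactly len bytes from ch into dest, at most CHUNK_SIZE bytes per read. */
    static void readFully(ReadableByteChannel ch, byte[] dest, int len) {
        ByteBuffer buf = ByteBuffer.allocate(CHUNK_SIZE); // reused across iterations
        int pos = 0;
        try {
            while (pos < len) {
                buf.clear();
                buf.limit(Math.min(CHUNK_SIZE, len - pos)); // never ask for more than one chunk
                int n = ch.read(buf);
                if (n < 0) {
                    throw new IOException("unexpected EOF at " + pos);
                }
                buf.flip();
                buf.get(dest, pos, n);
                pos += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With the read bounded by CHUNK_SIZE, an OutOfMemoryError from such a loop would indicate a real heap problem rather than an oversized transient direct buffer, which is why catching OOM here is no longer justified.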
[jira] [Updated] (LUCENE-5164) Remove the OOM catching in SimpleFSDirectory and NIOFSDirectory
[ https://issues.apache.org/jira/browse/LUCENE-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-5164:
--
Attachment: LUCENE-5164.patch

Improved the test in TestDirectory to ensure that chunking works correctly. This is now ready.

Remove the OOM catching in SimpleFSDirectory and NIOFSDirectory
Key: LUCENE-5164
URL: https://issues.apache.org/jira/browse/LUCENE-5164
[jira] [Commented] (LUCENE-5101) make it easier to plugin different bitset implementations to CachingWrapperFilter
[ https://issues.apache.org/jira/browse/LUCENE-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13736266#comment-13736266 ] Paul Elschot commented on LUCENE-5101:
--
Patch looks good to me, too. Hopefully we'll get some early feedback about performance.

make it easier to plugin different bitset implementations to CachingWrapperFilter

Key: LUCENE-5101
URL: https://issues.apache.org/jira/browse/LUCENE-5101
Project: Lucene - Core
Issue Type: Improvement
Reporter: Robert Muir
Attachments: DocIdSetBenchmark.java, LUCENE-5101.patch, LUCENE-5101.patch

Currently this is possible, but it's not so friendly:
{code}
protected DocIdSet docIdSetToCache(DocIdSet docIdSet, AtomicReader reader) throws IOException {
  if (docIdSet == null) {
    // this is better than returning null, as the nonnull result can be cached
    return EMPTY_DOCIDSET;
  } else if (docIdSet.isCacheable()) {
    return docIdSet;
  } else {
    final DocIdSetIterator it = docIdSet.iterator();
    // null is allowed to be returned by iterator(),
    // in this case we wrap with the sentinel set,
    // which is cacheable.
    if (it == null) {
      return EMPTY_DOCIDSET;
    } else {
      /* INTERESTING PART */
      final FixedBitSet bits = new FixedBitSet(reader.maxDoc());
      bits.or(it);
      return bits;
      /* END INTERESTING PART */
    }
  }
}
{code}
Is there any value to having all this other logic in the protected API? It seems like something that's not useful for a subclass... Maybe this stuff can become final, and the INTERESTING PART calls a simpler method, something like:
{code}
protected DocIdSet cacheImpl(DocIdSetIterator iterator, AtomicReader reader) throws IOException {
  final FixedBitSet bits = new FixedBitSet(reader.maxDoc());
  bits.or(iterator);
  return bits;
}
{code}
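The refactor proposed in the issue separates the defensive plumbing from the one interesting decision: which bitset implementation to cache into. The shape can be shown with a simplified stand-in using java.util.BitSet and a plain int iterator instead of Lucene's DocIdSet types (all class and method names here are invented for the illustration): the template method handles the null/sentinel cases once, and subclasses override only the cache representation.

```java
import java.util.BitSet;
import java.util.PrimitiveIterator;

// Simplified stand-in for the proposed split (not Lucene's actual types):
// the template method handles the null/sentinel plumbing once, and
// subclasses override only cacheImpl() to pick the cached representation.
class CachingFilter {
    static final BitSet EMPTY = new BitSet(); // sentinel: cacheable "no docs" result

    /** Plumbing that would become final: never returns null, delegates the interesting part. */
    BitSet docIdSetToCache(PrimitiveIterator.OfInt it, int maxDoc) {
        if (it == null) {
            return EMPTY; // better than null: the non-null result can be cached
        }
        return cacheImpl(it, maxDoc);
    }

    /** The "INTERESTING PART": subclasses may swap in another bitset implementation. */
    BitSet cacheImpl(PrimitiveIterator.OfInt it, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        while (it.hasNext()) {
            bits.set(it.nextInt());
        }
        return bits;
    }
}
```

A subclass wanting, say, a compressed set would override only cacheImpl, never touching the null handling, which is exactly the friendliness the issue is after.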
[jira] [Updated] (LUCENE-5164) Remove the OOM catching in SimpleFSDirectory and NIOFSDirectory
[ https://issues.apache.org/jira/browse/LUCENE-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-5164:
--
Attachment: LUCENE-5164.patch

Explicitly pass the buffer size as CHUNK_SIZE to BufferedIndexOutput for FSDirectory.

Remove the OOM catching in SimpleFSDirectory and NIOFSDirectory
Key: LUCENE-5164
URL: https://issues.apache.org/jira/browse/LUCENE-5164
[jira] [Commented] (SOLR-4718) Allow solr.xml to be stored in zookeeper
[ https://issues.apache.org/jira/browse/SOLR-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13736270#comment-13736270 ] Erick Erickson commented on SOLR-4718:
--
I tried that once. It gets a little ugly with entries like solr.cloud=...; the XML does divide things nicely into sections. Having a solr.xml also allows a clean specification of a local properties file: there's no confusion between solr.xml and solr.properties. That could be handled by convention, since we really haven't decided what the local properties file should be (something like solr.properties and solrlocal.properties). But personally, I don't want to go through the hassle of changing away from solr.xml. I agree that functionally we should be able to get by with a properties file, but the fact that it's XML is built into the code in a lot of places, and untangling the XML-ish nature is more time-consuming (at least it was the last time I tried it, then reverted) than valuable, I think.

Allow solr.xml to be stored in zookeeper
Key: SOLR-4718
URL: https://issues.apache.org/jira/browse/SOLR-4718
[jira] [Commented] (SOLR-4718) Allow solr.xml to be stored in zookeeper
[ https://issues.apache.org/jira/browse/SOLR-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13736275#comment-13736275 ] Mark Miller commented on SOLR-4718:
--
The idea has come up before - there was a preference for the better hierarchy support of XML vs. properties, as well as consistency with the SolrCore configuration. It has little to do with putting it in ZooKeeper, though - we want the same format as if it's on disk.

Allow solr.xml to be stored in zookeeper
Key: SOLR-4718
URL: https://issues.apache.org/jira/browse/SOLR-4718
[jira] [Updated] (LUCENE-5164) Remove the OOM catching in SimpleFSDirectory and NIOFSDirectory
[ https://issues.apache.org/jira/browse/LUCENE-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-5164:
--
Attachment: LUCENE-5164.patch

New patch again, this time with better reuse of NIOFS' ByteBuffer!

Remove the OOM catching in SimpleFSDirectory and NIOFSDirectory
Key: LUCENE-5164
URL: https://issues.apache.org/jira/browse/LUCENE-5164
[jira] [Commented] (LUCENE-5164) Remove the OOM catching in SimpleFSDirectory and NIOFSDirectory
[ https://issues.apache.org/jira/browse/LUCENE-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13736288#comment-13736288 ] ASF subversion and git services commented on LUCENE-5164:
--
Commit 1512937 from [~thetaphi] in branch 'dev/trunk' [ https://svn.apache.org/r1512937 ]
LUCENE-5164: Fix default chunk sizes in FSDirectory to not be unnecessarily large (now 8192 bytes); also use chunking when writing to index files. FSDirectory#setReadChunkSize() is now deprecated and will be removed in Lucene 5.0.

Remove the OOM catching in SimpleFSDirectory and NIOFSDirectory
Key: LUCENE-5164
URL: https://issues.apache.org/jira/browse/LUCENE-5164
[jira] [Commented] (SOLR-4956) make maxBufferedAddsPerServer configurable
[ https://issues.apache.org/jira/browse/SOLR-4956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13736291#comment-13736291 ] Erick Erickson commented on SOLR-4956:
--
So it sounds like we have competing needs here. On the one hand, we have several anecdotal statements that upping the buffer size had a significant impact on throughput. On the other, just upping the buffer size has the potential for Bad Outcomes. So it seems we have three options:

1> Make it configurable, with a warning that changing it may lead to Bad Stuff.
2> Leave it as-is and forget about it.
3> Do the harder thing: see if we can figure out why changing the batch size makes such a difference, and fix the underlying cause (if there is one).

I'm totally unfamiliar with the code, but the 20,000 ft. smell is that there's something about the intra-node routing code that's very inefficient, and making the buffers bigger is masking that. On the surface, just sending the packets around doesn't seem like it should spike the CPU that much... But like I said, I haven't looked at the code at all.

make maxBufferedAddsPerServer configurable

Key: SOLR-4956
URL: https://issues.apache.org/jira/browse/SOLR-4956
Project: Solr
Issue Type: Improvement
Affects Versions: 4.3, 5.0
Reporter: Erick Erickson

Anecdotal user's-list evidence indicates that in high-throughput situations, the default of 10 docs/batch for inter-shard batching can generate significant CPU load. See the thread titled "Sharding and Replication" on June 19th; the gist is below. I haven't poked around, but it's a little surprising on the surface that Asif is seeing this kind of difference, so I'm wondering if this change indicates some other underlying issue. Regardless, this seems like it would be good to investigate.

Here's the gist of Asif's experience from the thread:

It's a completely practical problem - we are exploring Solr to build a real-time analytics/data solution for a system handling about 1000 qps. We have various metrics that are stored as different collections on the cloud, which means a very high volume of writes. The cloud also needs to support about 300-400 qps.

We initially tested with a single Solr node on a 16-core / 24 GB box for a single metric. We saw that writes were not an issue at all - Solr was handling them extremely well. We were also able to achieve about 200 qps from a single node.

When we set up the cloud (an ensemble on 6 boxes), we saw very high CPU usage on the replicas. Up to 10 cores were getting used for writes on the replicas. Hence my concern with respect to batch updates for the replicas. BTW, I altered maxBufferedAddsPerServer to 1000 - and now CPU usage is very similar to the single-node installation.
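The batching under discussion can be sketched as a tiny per-server buffer that flushes every maxBufferedAddsPerServer docs: a larger limit means fewer, bigger inter-node requests. The class and counters below are illustrative only, not Solr's actual distributed-update code.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of per-server add buffering (not Solr's actual code):
// docs accumulate until the buffer limit is hit, then go out as one batch.
class AddBuffer {
    private final int maxBufferedAddsPerServer;
    private final List<String> buffer = new ArrayList<>();
    int flushes;  // number of batched requests "sent"
    int docsSent; // total docs sent so far

    AddBuffer(int maxBufferedAddsPerServer) {
        this.maxBufferedAddsPerServer = maxBufferedAddsPerServer;
    }

    void add(String doc) {
        buffer.add(doc);
        if (buffer.size() >= maxBufferedAddsPerServer) {
            flush(); // limit reached: send one batched request
        }
    }

    void flush() {
        if (buffer.isEmpty()) return;
        flushes++;                // one request per batch, regardless of batch size
        docsSent += buffer.size();
        buffer.clear();
    }
}
```

With the default of 10, a thousand adds cost 100 inter-node requests; raising the limit to 1000 collapses them into one. That matches the reported CPU relief, though, as the comment above notes, it may just be masking a routing inefficiency rather than fixing it.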
[jira] [Commented] (SOLR-3076) Solr(Cloud) should support block joins
[ https://issues.apache.org/jira/browse/SOLR-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13736292#comment-13736292 ] Yonik Seeley commented on SOLR-3076:
--
bq. why PARENT:true should ignore deletions?

In an earlier iteration, it needed to... but now I think it's just desirable (as opposed to required) because it's more efficient (less backtracking over deleted docs) and more resilient to accidental error conditions (like when someone deletes a parent doc but not its children).

bq. I propose to revise the idea of rewindable docIDset iterator

See LUCENE-5092; it looks like something like that has been rejected. As far as maintenance goes, the current stuff makes some things easier to tweak. I already did so for the child parser, to make it fit better with how we put together full queries. Anyway, the important parts are the public interfaces (the XML doc, and the \{!parent} and \{!child} parsers and their semantics). If we're happy with those, I think we should commit at this point - this issue has been open for far too long!

Solr(Cloud) should support block joins

Key: SOLR-3076
URL: https://issues.apache.org/jira/browse/SOLR-3076
Project: Solr
Issue Type: New Feature
Reporter: Grant Ingersoll
Assignee: Yonik Seeley
Fix For: 4.5, 5.0
Attachments: 27M-singlesegment-histogram.png, 27M-singlesegment.png, bjq-vs-filters-backward-disi.patch, bjq-vs-filters-illegal-state.patch, child-bjqparser.patch, dih-3076.patch, dih-config.xml, parent-bjq-qparser.patch, parent-bjq-qparser.patch, Screen Shot 2012-07-17 at 1.12.11 AM.png, SOLR-3076-childDocs.patch, SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, SOLR-7036-childDocs-solr-fork-trunk-patched, solrconf-bjq-erschema-snippet.xml, solrconfig.xml.patch, tochild-bjq-filtered-search-fix.patch

Lucene has the ability to do block joins; we should add it to Solr.
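To make the public interfaces mentioned above concrete, here is a sketch of the kind of nested XML document and block-join query involved. The field names (type_s, color_s) and IDs are invented for the example and are not from this issue's patches:

```xml
<!-- A parent document with nested child documents; field names are illustrative. -->
<add>
  <doc>
    <field name="id">product-1</field>
    <field name="type_s">parent</field>
    <doc>
      <field name="id">sku-1</field>
      <field name="color_s">red</field>
    </doc>
    <doc>
      <field name="id">sku-2</field>
      <field name="color_s">blue</field>
    </doc>
  </doc>
</add>
```

Such a block could then be queried with something like q={!parent which="type_s:parent"}color_s:red to return parent docs whose children match, with {!child} going the other direction.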
[jira] [Updated] (LUCENE-5164) Remove the OOM catching in SimpleFSDirectory and NIOFSDirectory
[ https://issues.apache.org/jira/browse/LUCENE-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-5164:
--
Attachment: LUCENE-5164-4x.patch

Patch for 4.x (the merging was complicated because of many changes - Java 7).

Remove the OOM catching in SimpleFSDirectory and NIOFSDirectory
Key: LUCENE-5164
URL: https://issues.apache.org/jira/browse/LUCENE-5164
[jira] [Commented] (LUCENE-5164) Remove the OOM catching in SimpleFSDirectory and NIOFSDirectory
[ https://issues.apache.org/jira/browse/LUCENE-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13736294#comment-13736294 ]
ASF subversion and git services commented on LUCENE-5164:
-
Commit 1512949 from [~thetaphi] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1512949 ]
Merged revision(s) 1512937 from lucene/dev/trunk: LUCENE-5164: Fix default chunk sizes in FSDirectory to not be unnecessarily large (now 8192 bytes); also use chunking when writing to index files. FSDirectory#setReadChunkSize() is now deprecated and will be removed in Lucene 5.0
[jira] [Commented] (LUCENE-5164) Remove the OOM catching in SimpleFSDirectory and NIOFSDirectory
[ https://issues.apache.org/jira/browse/LUCENE-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13736295#comment-13736295 ]
ASF subversion and git services commented on LUCENE-5164:
-
Commit 1512951 from [~thetaphi] in branch 'dev/trunk' [ https://svn.apache.org/r1512951 ]
LUCENE-5164: Remove deprecated stuff in trunk.
[jira] [Resolved] (LUCENE-5164) Remove the OOM catching in SimpleFSDirectory and NIOFSDirectory
[ https://issues.apache.org/jira/browse/LUCENE-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Uwe Schindler resolved LUCENE-5164.
---
Resolution: Fixed

Thanks Robert and Grant for the fruitful discussions!
[jira] [Commented] (SOLR-4956) make maxBufferedAddsPerServer configurable
[ https://issues.apache.org/jira/browse/SOLR-4956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13736298#comment-13736298 ]
Yonik Seeley commented on SOLR-4956:
Buffering can also slow indexing speeds... Say you up the buffer to 100 docs and then send in a batch of 50. All 50 docs will be indexed locally, and only then will all 50 be sent to the replica (where we have to wait for all 50 docs to be indexed again).

make maxBufferedAddsPerServer configurable
--
Key: SOLR-4956
URL: https://issues.apache.org/jira/browse/SOLR-4956
Project: Solr
Issue Type: Improvement
Affects Versions: 4.3, 5.0
Reporter: Erick Erickson

Anecdotal users-list evidence indicates that in high-throughput situations, the default of 10 docs/batch for inter-shard batching can generate significant CPU load. See the thread titled "Sharding and Replication" on June 19th, but the gist is below. I haven't poked around, but it's a little surprising on the surface that Asif is seeing this kind of difference, so I'm wondering if this change indicates some other underlying issue. Regardless, this seems worth investigating. Here's the gist of Asif's experience from the thread:

It's a completely practical problem - we are exploring Solr to build a real-time analytics/data solution for a system handling about 1000 qps. We have various metrics that are stored as different collections on the cloud, which means a very high volume of writes. The cloud also needs to support about 300-400 qps. We initially tested with a single Solr node on a 16 core / 24 GB box for a single metric. We saw that writes were not an issue at all - Solr was handling it extremely well. We were also able to achieve about 200 qps from a single node. When we set up the cloud (an ensemble on 6 boxes), we saw very high CPU usage on the replicas. Up to 10 cores were getting used for writes on the replicas. Hence my concern with respect to batch updates for the replicas. BTW, I altered maxBufferedAddsPerServer to 1000 - and now CPU usage is very similar to the single-node installation.
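The trade-off being debated is easier to see with a toy model of the forwarding buffer. The class below is purely illustrative (`ReplicaBuffer` is a hypothetical name, not Solr's implementation): docs accumulate locally and nothing reaches the replica until a full batch has built up, which is why a small batch size means many forwarding calls (CPU overhead) while a large one delays when replicas see any work (Yonik's latency point).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Illustrative sketch of the per-replica buffering that
// maxBufferedAddsPerServer controls. Not Solr's actual class.
public class ReplicaBuffer<T> {
    private final int maxBuffered;
    private final Consumer<List<T>> forward; // stand-in for a batch request to the replica
    private final List<T> pending = new ArrayList<>();

    public ReplicaBuffer(int maxBuffered, Consumer<List<T>> forward) {
        this.maxBuffered = maxBuffered;
        this.forward = forward;
    }

    public void add(T doc) {
        pending.add(doc);
        // Only a full buffer triggers a forward; until then replicas see nothing.
        if (pending.size() >= maxBuffered) {
            flush();
        }
    }

    public void flush() {
        if (!pending.isEmpty()) {
            forward.accept(new ArrayList<>(pending));
            pending.clear();
        }
    }
}
```

Under this model, raising the limit from 10 to 1000 turns a hundred forwards per thousand docs into one, which is consistent with the CPU drop Asif reports, at the cost of replicas idling until each big batch arrives.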
[jira] [Updated] (SOLR-5134) Have HdfsIndexOutput extend BufferedIndexOutput
[ https://issues.apache.org/jira/browse/SOLR-5134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mark Miller updated SOLR-5134:
--
Attachment: SOLR-5134.patch

Thanks Uwe - new patch attached.

Have HdfsIndexOutput extend BufferedIndexOutput
---
Key: SOLR-5134
URL: https://issues.apache.org/jira/browse/SOLR-5134
Project: Solr
Issue Type: Improvement
Affects Versions: 4.4
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
Fix For: 4.5, 5.0
Attachments: SOLR-5134.patch, SOLR-5134.patch

Upstream Blur has moved HdfsIndexOutput to use BufferedIndexOutput, and the simple FS IndexOutput does as well - it seems we should do the same.
[jira] [Commented] (SOLR-4956) make maxBufferedAddsPerServer configurable
[ https://issues.apache.org/jira/browse/SOLR-4956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13736301#comment-13736301 ]
Erick Erickson commented on SOLR-4956:
--
Hmmm, sounds like we need more details. I wonder if the situations where buffering up more docs helps are also situations in which there is only a leader? I guess the thing that's puzzling me is that the high CPU rates being reported are related to internal buffering sizes.
[jira] [Updated] (LUCENE-5164) Remove the OOM catching in SimpleFSDirectory and NIOFSDirectory
[ https://issues.apache.org/jira/browse/LUCENE-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Uwe Schindler updated LUCENE-5164:
--
Fix Version/s: 4.5
               5.0
[jira] [Created] (SOLR-5135) Deleting a collection should be extra aggressive in the face of failures.
Mark Miller created SOLR-5135:
-
Summary: Deleting a collection should be extra aggressive in the face of failures.
Key: SOLR-5135
URL: https://issues.apache.org/jira/browse/SOLR-5135
Project: Solr
Issue Type: Bug
Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
Fix For: 4.5, 5.0

Until ZK is the source of truth for the cluster, ZK and local node states can get out of whack in certain situations - as a result, sometimes you cannot clean out all of the remnants of a collection to recreate it. For example, if the collection is listed in ZK under /collections but is not in clusterstate.json, you cannot remove or create the collection again due to an early exception in the collection-removal chain. I think we should probably still return the error - but also delete as much as we can.
Re: [JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.7.0_25) - Build # 6874 - Failure!
I cannot reproduce this.

On Sat, Aug 10, 2013 at 1:44 PM, Policeman Jenkins Server jenk...@thetaphi.de wrote:

Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6874/
Java: 32bit/jdk1.7.0_25 -server -XX:+UseG1GC

1 tests failed.
REGRESSION: org.apache.lucene.search.postingshighlight.TestPostingsHighlighter.testUserFailedToIndexOffsets

Error Message:

Stack Trace:
java.lang.AssertionError
    at __randomizedtesting.SeedInfo.seed([EBBFA6F4E80A7365:1FBF811885F2D611]:0)
    at org.apache.lucene.index.ByteSliceReader.readByte(ByteSliceReader.java:73)
    at org.apache.lucene.store.DataInput.readVInt(DataInput.java:108)
    at org.apache.lucene.index.FreqProxTermsWriterPerField.flush(FreqProxTermsWriterPerField.java:453)
    at org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:85)
    at org.apache.lucene.index.TermsHash.flush(TermsHash.java:116)
    at org.apache.lucene.index.DocInverter.flush(DocInverter.java:53)
    at org.apache.lucene.index.DocFieldProcessor.flush(DocFieldProcessor.java:81)
    at org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:501)
    at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:478)
    at org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:615)
    at org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2760)
    at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2909)
    at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2884)
    at org.apache.lucene.index.RandomIndexWriter.getReader(RandomIndexWriter.java:312)
    at org.apache.lucene.index.RandomIndexWriter.getReader(RandomIndexWriter.java:249)
    at org.apache.lucene.search.postingshighlight.TestPostingsHighlighter.testUserFailedToIndexOffsets(TestPostingsHighlighter.java:295)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
    at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
    at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
    at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
    at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
    at
[jira] [Commented] (SOLR-5135) Deleting a collection should be extra aggressive in the face of failures.
[ https://issues.apache.org/jira/browse/SOLR-5135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13736374#comment-13736374 ]
Mark Miller commented on SOLR-5135:
---
bq. or create the collection again due

Seems you can actually create it again - we check existence against clusterstate.json rather than the /collections node - but we should still remove the remnants.
[jira] [Commented] (SOLR-3076) Solr(Cloud) should support block joins
[ https://issues.apache.org/jira/browse/SOLR-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13736386#comment-13736386 ]
Mikhail Khludnev commented on SOLR-3076:

bq. like when someone deletes a parent doc but not its children

I thought so too. However, there is an argument provided by one of my colleagues and the brightest engineers ever (Nina G): such a courtesy works until a merge happens, and after merge/expunge-deletes it's a pain. So, besides being inconsistent, I doubt it would even pass the randomized tests.

bq. See LUCENE-5092, it looks like something like that has been rejected.

That approach has performance implications, but I propose nothing more than API massaging without any real implementation changes/extensions: let BJQ work with something which is either CachingWrapperFilter or BitDocSet.getTopFilter().

bq. If we're happy with those, I think we should commit at this point - this issue has been open for far too long)

I got your point. It makes sense. We just need to raise a follow-up issue - unify BJQs across Lucene and Solr - and ideally address it before the next release. Otherwise, it's just a way to upset a user: if someone is happy with BJQ in Lucene, it should be clear that with this parser he gets a different BJQ. As an alternative intermediate measure, don't you think it's more honest to store CachingWrapperFilter in Solr's filterCache (via an ugly hack, for sure), then follow up and address it soon?

Thanks
[jira] [Commented] (LUCENE-5092) join: don't expect all filters to be FixedBitSet instances
[ https://issues.apache.org/jira/browse/LUCENE-5092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13736388#comment-13736388 ]
Mikhail Khludnev commented on LUCENE-5092:
--
bq. In my opinion the order of child / parent documents should be reversed, so the search for the parent (or child, don't know) could go forward only.

[~thetaphi] in this case, after I advance()/nextDoc() a child scorer, how can the parent scorer reach the parent doc which is before the matched child?

join: don't expect all filters to be FixedBitSet instances
--
Key: LUCENE-5092
URL: https://issues.apache.org/jira/browse/LUCENE-5092
Project: Lucene - Core
Issue Type: Improvement
Components: modules/join
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor
Attachments: LUCENE-5092.patch

The join module throws exceptions when the parents filter isn't a FixedBitSet. The reason is that the join module relies on prevSetBit to find the first child document given a parent ID. As suggested by Uwe and Paul Elschot on LUCENE-5081, we could fix it by exposing methods in the iterators to iterate backwards. When the join module gets an iterator which isn't able to iterate backwards, it would just need to dump its content into another DocIdSet that supports backward iteration, FixedBitSet for example.
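The backward-iteration requirement can be illustrated with `java.util.BitSet`, whose `previousSetBit` plays the role of FixedBitSet's `prevSetBit` here. In a block, children precede their parent, so the parent's first child is the doc right after the previous parent bit. The helper below is hypothetical, not the join module's code:

```java
import java.util.BitSet;

// Sketch of why block join needs backward iteration over the parents filter.
// Bits mark parent docIDs; each parent's children occupy the docIDs strictly
// between the previous parent bit and the parent itself.
public class BlockBounds {
    // Returns the first child docID of the parent at docID 'parent'.
    public static int firstChild(BitSet parents, int parent) {
        // Look backwards for the preceding parent; -1 if this is the first block.
        int prevParent = parent == 0 ? -1 : parents.previousSetBit(parent - 1);
        return prevParent + 1;
    }
}
```

A forward-only DocIdSetIterator cannot answer this query without rescanning from the start, which is why the issue proposes either backward-capable iterators or dumping into a FixedBitSet.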
[jira] [Comment Edited] (SOLR-3076) Solr(Cloud) should support block joins
[ https://issues.apache.org/jira/browse/SOLR-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13736386#comment-13736386 ]
Mikhail Khludnev edited comment on SOLR-3076 at 8/11/13 7:22 PM:
-
bq. like when someone deletes a parent doc but not its children

I thought so too. However, there is an argument provided by one of my colleagues and the brightest engineers ever (Nina G) - such a courtesy works until a merge happens, and after merge/expunge-deletes it's a pain. So, besides being inconsistent, I doubt it would even pass the randomized tests.

bq. See LUCENE-5092, it looks like something like that has been rejected.

That approach has performance implications, but I propose nothing more than API massaging without any real implementation changes/extensions: let BJQ work with something which is either CachingWrapperFilter or BitDocSet.getTopFilter().

bq. If we're happy with those, I think we should commit at this point -

I got your point. It makes sense. We just need to raise a follow-up issue - unify BJQs across Lucene and Solr - and ideally address it before the next release. Otherwise, it's just a way to upset a user: if someone is happy with BJQ in Lucene, it should be clear that with this parser he gets a different BJQ. As an alternative intermediate measure, don't you think it's more honest to store CachingWrapperFilter in Solr's filterCache (via an ugly hack, for sure), then follow up and address it soon?

bq. this issue has been open for far too long)

but who really cares?

Thanks
[jira] [Commented] (SOLR-3076) Solr(Cloud) should support block joins
[ https://issues.apache.org/jira/browse/SOLR-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13736401#comment-13736401 ]
Robert Muir commented on SOLR-3076:
---
Why take working per-segment code and make it slower/top-level?
[jira] [Commented] (SOLR-3076) Solr(Cloud) should support block joins
[ https://issues.apache.org/jira/browse/SOLR-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13736406#comment-13736406 ]
Yonik Seeley commented on SOLR-3076:
Per-segment caches aren't the focus of this issue (although we should add a generic per-segment cache that can be sized/managed, in a different issue).
[jira] [Commented] (SOLR-3076) Solr(Cloud) should support block joins
[ https://issues.apache.org/jira/browse/SOLR-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13736411#comment-13736411 ] Robert Muir commented on SOLR-3076:

The previous patches were per-segment. There is no reason for it to be top-level!
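[Editor's note] The per-segment vs. top-level disagreement above comes down to what happens to cached filters when the index reopens. The sketch below is not Solr/Lucene code (those caches are implemented in Java); it is a minimal illustration, with invented class and segment names, of why a per-segment cache keyed by (query, segment) survives a reopen while a top-level cache keyed over the whole index must recompute everything:

```python
class TopLevelCache:
    """Caches one result per query over the whole index view; any change
    to the segment list invalidates every entry."""
    def __init__(self):
        self.entries = {}  # (query, segment-tuple) -> result

    def get(self, query, segments, compute):
        key = (query, tuple(segments))  # the whole-index view is the key
        if key not in self.entries:
            self.entries[key] = compute(segments)
        return self.entries[key]


class PerSegmentCache:
    """Caches one result per (query, segment); after a reopen, unchanged
    segments are reused and only new segments are computed."""
    def __init__(self):
        self.entries = {}
        self.computes = 0  # count of per-segment computations, for illustration

    def get(self, query, segments, compute):
        results = []
        for seg in segments:
            key = (query, seg)
            if key not in self.entries:
                self.entries[key] = compute([seg])
                self.computes += 1
            results.append(self.entries[key])
        return results


# A reopen adds segment "_2"; the per-segment cache recomputes only that one.
compute = lambda segs: ["bits:" + s for s in segs]
cache = PerSegmentCache()
cache.get("type:parent", ["_0", "_1"], compute)        # computes _0 and _1
cache.get("type:parent", ["_0", "_1", "_2"], compute)  # computes only _2
print(cache.computes)  # 3, not 5: _0 and _1 were reused
```

A top-level cache given the same two calls would store two independent whole-index entries, which is the regression Robert is objecting to.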
[jira] [Updated] (SOLR-5135) Deleting a collection should be extra aggressive in the face of failures.
[ https://issues.apache.org/jira/browse/SOLR-5135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-5135: -- Attachment: SOLR-5135.patch

Attached patch adds an attempt to remove the /collections zk node in a finally after trying to remove all of the cores.

Deleting a collection should be extra aggressive in the face of failures. - Key: SOLR-5135 URL: https://issues.apache.org/jira/browse/SOLR-5135 Project: Solr Issue Type: Bug Components: SolrCloud Reporter: Mark Miller Assignee: Mark Miller Fix For: 4.5, 5.0 Attachments: SOLR-5135.patch

Until ZK is the source of truth for the cluster, ZK and local node states can get out of whack in certain situations; as a result, sometimes you cannot clean out all of the remnants of a collection in order to recreate it. For example, if the collection is listed in ZK under /collections but is not in clusterstate.json, you cannot remove or create the collection again due to an early exception in the collection removal chain. I think we should probably still return the error, but also delete as much as we can.
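[Editor's note] The pattern the patch describes (try every core removal, clean the ZK node in a finally, but still surface the error) can be sketched as follows. This is not the actual patch; the helper names (unload_core, zk_delete) are invented stand-ins for the real SolrCloud and ZooKeeper calls:

```python
def delete_collection(cores, unload_core, zk_delete, collection):
    """Best-effort collection delete: attempt every core removal, always
    clean up the /collections zk node, and still report failures."""
    errors = []
    try:
        for core in cores:
            try:
                unload_core(core)
            except Exception as e:
                # keep going: delete as much as we can
                errors.append((core, e))
    finally:
        # runs even if core removal failed part-way through, so the
        # collection can be recreated later
        zk_delete("/collections/" + collection)
    if errors:
        # "still return the error - but also delete as much as we can"
        raise RuntimeError("failed to unload cores: %s"
                           % [core for core, _ in errors])


# Happy-path demo with trivial stand-in callbacks:
calls = []
delete_collection(["core1", "core2"], calls.append, calls.append, "test")
print(calls)  # ['core1', 'core2', '/collections/test']
```

The key point is the finally: a single failing core no longer leaves a stale /collections node behind that blocks recreating the collection.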
[jira] [Commented] (SOLR-3076) Solr(Cloud) should support block joins
[ https://issues.apache.org/jira/browse/SOLR-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13736417#comment-13736417 ] Michael McCandless commented on SOLR-3076:

I don't understand all the design constraints here, but I really don't like the internal fork (full copy) of the ToParent/ChildBlockJoinQuery sources. Why is this necessary? Is it to cut over to the top-level filter cache? We should not fork our sources if we can help it ...
Re: [JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.7.0_25) - Build # 6874 - Failure!
I tried even harder (with exact JVM, tests.jvms, master seed, svn rev, and same JVM args):

At revision 1512807
rmuir@beast:~/workspace/branch_4x/lucene/highlighter$ export JAVA_HOME=/home/rmuir/Downloads/32bit7/jdk1.7.0_25
rmuir@beast:~/workspace/branch_4x/lucene/highlighter$ ant test -Dtests.jvms=2 -Dtests.seed=EBBFA6F4E80A7365 -Dargs=-server -XX:+UseG1GC
BUILD SUCCESSFUL
Total time: 18 seconds

On Sun, Aug 11, 2013 at 2:17 PM, Robert Muir rcm...@gmail.com wrote:
I cannot reproduce this.

On Sat, Aug 10, 2013 at 1:44 PM, Policeman Jenkins Server jenk...@thetaphi.de wrote:
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6874/
Java: 32bit/jdk1.7.0_25 -server -XX:+UseG1GC

1 tests failed.
REGRESSION: org.apache.lucene.search.postingshighlight.TestPostingsHighlighter.testUserFailedToIndexOffsets

Error Message:

Stack Trace:
java.lang.AssertionError
at __randomizedtesting.SeedInfo.seed([EBBFA6F4E80A7365:1FBF811885F2D611]:0)
at org.apache.lucene.index.ByteSliceReader.readByte(ByteSliceReader.java:73)
at org.apache.lucene.store.DataInput.readVInt(DataInput.java:108)
at org.apache.lucene.index.FreqProxTermsWriterPerField.flush(FreqProxTermsWriterPerField.java:453)
at org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:85)
at org.apache.lucene.index.TermsHash.flush(TermsHash.java:116)
at org.apache.lucene.index.DocInverter.flush(DocInverter.java:53)
at org.apache.lucene.index.DocFieldProcessor.flush(DocFieldProcessor.java:81)
at org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:501)
at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:478)
at org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:615)
at org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2760)
at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2909)
at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2884)
at org.apache.lucene.index.RandomIndexWriter.getReader(RandomIndexWriter.java:312)
at org.apache.lucene.index.RandomIndexWriter.getReader(RandomIndexWriter.java:249)
at org.apache.lucene.search.postingshighlight.TestPostingsHighlighter.testUserFailedToIndexOffsets(TestPostingsHighlighter.java:295)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
at
[jira] [Commented] (SOLR-3076) Solr(Cloud) should support block joins
[ https://issues.apache.org/jira/browse/SOLR-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13736504#comment-13736504 ] Yonik Seeley commented on SOLR-3076:

The important parts of this issue are:
- serialization formats (XML, javabin, etc.)
- join semantics
- join syntax, i.e. {!child ...} and {!parent ...}
- common public Solr Java APIs (SolrInputDocument, UpdateHandler/UpdateProcessor)
- correctness

Other things are implementation details that can be improved over time. We should be aware of things we *don't* want to support long term... this is why I removed the external/custom cache dependency (in addition to the usability implications).

As far as per-segment goes, some of the previous patches had issues (such as caching SolrCache in QParser instances), double-caching (the filter used by the join would be cached separately from the same filter used in all other contexts), the custom caches defined in solrconfig.xml, not to mention my general dislike for weak references. Since per-segment filter caching is an orthogonal issue (and it would be best to be able to specify this on a per-filter basis), I decided it was best to leave per-segment filters for a different issue and create queries that work well with the way Solr currently does its filter caching and request construction.

Additionally, how to deal with the going-backwards problem / expecting all filters to be FixedBitSet (which Solr doesn't use) is still up in the air: LUCENE-5092. There's no reason to wait for that to get hashed out before giving Solr users block child/parent join functionality. Those details of the Java APIs just don't matter to Solr users. The query classes in question are package-private classes that Solr users do not see - truly an implementation detail. Changing them in the future (as long as the behavior is equivalent) would not even warrant mention in release notes (unless performance had been improved).
Can there still be implementation improvements? Absolutely! But I'm personally out of time on this issue, and I feel comfortable supporting the public APIs we've come up with for some time to come. Since no one seems to have issues with any of the important parts like the public APIs, I plan on committing this shortly. Additional improvements/optimizations can come from follow-on patches.
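[Editor's note] For readers unfamiliar with the block-join contract behind {!parent} and {!child}: each block is indexed with the child documents contiguously before their parent, and a "parent filter" bitset marks which docids are parents. The sketch below illustrates that contract only; the docid layout and function names are invented for illustration, and Solr's real implementation lives in the (package-private) Java query classes discussed above:

```python
def to_parent(child_matches, parent_bits):
    """{!parent}-style join: map each matching child docid to its enclosing
    parent, i.e. the next docid with the parent bit set."""
    parents = set()
    for doc in child_matches:
        p = doc
        while not parent_bits[p]:
            p += 1  # children precede their parent within a block
        parents.add(p)
    return sorted(parents)


def to_child(parent_matches, parent_bits):
    """{!child}-style join: return every child in the block of each
    matching parent (the run of non-parent docids just below it)."""
    children = []
    for p in parent_matches:
        c = p - 1
        block = []
        while c >= 0 and not parent_bits[c]:
            block.append(c)
            c -= 1
        children.extend(reversed(block))
    return children


# Layout: docs 0,1 are children of parent 2; docs 3,4 are children of parent 5.
parent_bits = [False, False, True, False, False, True]
print(to_parent([0, 4], parent_bits))  # [2, 5]
print(to_child([5], parent_bits))      # [3, 4]
```

This contiguity assumption is also why the merge concern raised below (SOLR-3076, comment 13736508) matters: anything that relies on deleted-but-not-yet-merged-away docs keeping blocks intact breaks once a merge or expunge-deletes actually removes them.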
[jira] [Commented] (SOLR-3076) Solr(Cloud) should support block joins
[ https://issues.apache.org/jira/browse/SOLR-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13736508#comment-13736508 ] Yonik Seeley commented on SOLR-3076:

bq. However, there is an argument provided by one of my colleagues and the brightest engineer ever (Nina G) - such courtesy works until merge happens, and after merge/expunge deletes it's a pain.

Ah, right you (and Nina G) are! The inconsistency here (working until a merge) is worse than any performance difference. I'll change it.