[jira] [Comment Edited] (SOLR-4816) Add document routing to CloudSolrServer
[ https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13781854#comment-13781854 ]

Shalin Shekhar Mangar edited comment on SOLR-4816 at 9/30/13 2:25 PM:

bq. You would get an exception that you could trace to one of the servers. The exception itself would have to have the info needed to determine what document failed due to optimistic locking.

[~joel.bernstein] - I'm a little confused about how to track failures with CloudSolrServer for update requests. Firstly, the RouteException class is private to CloudSolrServer and cannot be used at all. Secondly, since the responseFutures are per URL, won't two update requests on the same server overwrite the entries?

was (Author: shalinmangar):
bq. You would get an exception that you could trace to one of the servers. The exception itself would have to have the info needed to determine what document failed due to optimistic locking.

[~joel.bernstein] - I'm a little confused about how to tracking failures with CloudSolrServer on update requests. Firstly, the RouteException class is private to CloudSolrServer and cannot be used at all. Secondly, since the responseFutures are per URL, won't two update requests on the same server overwrite the entries?
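The second question in the comment above can be made concrete with a small plain-Java sketch. It does not use SolrJ: the class `FutureMapSketch`, its `submit` helper and the example URL are hypothetical, and it assumes (as the question does) that pending responses live in a map keyed by server URL that is shared between requests. Whether CloudSolrServer actually shares such a map is not established here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Future;

// Hypothetical illustration only; this is not the CloudSolrServer source.
final class FutureMapSketch {
    // Assumed layout: one pending response per server URL.
    private final Map<String, Future<String>> responseFutures = new HashMap<>();

    // Registers a pending response and returns the entry it displaced, or null.
    Future<String> submit(String url, String payload) {
        Future<String> pending = CompletableFuture.completedFuture("ok:" + payload);
        return responseFutures.put(url, pending);
    }

    // Number of responses that can still be looked up afterwards.
    int tracked() {
        return responseFutures.size();
    }
}
```

With this layout, a second `submit` for the same URL returns the first future and leaves a single tracked entry, so the first response could no longer be found through the map. Keying by a per-request identifier would be one way around that.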
Add document routing to CloudSolrServer
---------------------------------------

Key: SOLR-4816
URL: https://issues.apache.org/jira/browse/SOLR-4816
Project: Solr
Issue Type: Improvement
Components: SolrCloud
Affects Versions: 4.3
Reporter: Joel Bernstein
Assignee: Mark Miller
Priority: Minor
Fix For: 4.5, 5.0
Attachments: RequestTask-removal.patch, SOLR-4816.patch (multiple revisions), SOLR-4816-sriesenberg.patch

This issue adds the following enhancements to CloudSolrServer's update logic:

1) Document routing: Updates are routed directly to the correct shard leader, eliminating document routing at the server.

2) Optional parallel update execution: Updates for each shard are executed in a separate thread so parallel indexing can occur across the cluster.

These enhancements should allow for near linear scalability on indexing throughput.

Usage:

    CloudSolrServer cloudClient = new CloudSolrServer(zkAddress);
    cloudClient.setParallelUpdates(true);

    SolrInputDocument doc1 = new SolrInputDocument();
    doc1.addField("id", "0");
    doc1.addField("a_t", "hello1");

    SolrInputDocument doc2 = new SolrInputDocument();
    doc2.addField("id", "2");
    doc2.addField("a_t", "hello2");

    UpdateRequest request = new UpdateRequest();
    request.add(doc1);
    request.add(doc2);
    request.setAction(AbstractUpdateRequest.ACTION.OPTIMIZE, false, false);

    // Returns a backwards compatible condensed response.
    NamedList response = cloudClient.request(request);

    // To get a more detailed response, down cast to RouteResponse:
    CloudSolrServer.RouteResponse rr = (CloudSolrServer.RouteResponse) response;

--
This message was sent by Atlassian JIRA (v6.1#6144)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
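The two enhancements in the issue description (client-side routing and per-shard parallel execution) can be sketched in plain Java without SolrJ. Everything named here is hypothetical: `RoutingSketch`, `shardFor`, `groupByShard` and `sendParallel` are illustration only, the hash is a stand-in and not the function Solr's routers use, and the "send" step just counts documents where a real client would issue an HTTP request to the shard leader.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Hypothetical sketch: group document ids by shard, then send each group in parallel.
final class RoutingSketch {
    // Stand-in router; Solr's real routers hash ids differently.
    static int shardFor(String docId, int numShards) {
        return Math.floorMod(docId.hashCode(), numShards);
    }

    // Buckets ids so that each shard receives a single batch.
    static Map<Integer, List<String>> groupByShard(List<String> docIds, int numShards) {
        Map<Integer, List<String>> batches = new HashMap<>();
        for (String id : docIds) {
            batches.computeIfAbsent(shardFor(id, numShards), k -> new ArrayList<>()).add(id);
        }
        return batches;
    }

    // Runs one task per shard batch and waits for all of them.
    // Returns the number of documents "sent" to each shard.
    static Map<Integer, Integer> sendParallel(Map<Integer, List<String>> batches) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, batches.size()));
        try {
            Map<Integer, CompletableFuture<Integer>> pending = new HashMap<>();
            for (Map.Entry<Integer, List<String>> entry : batches.entrySet()) {
                List<String> batch = entry.getValue();
                Supplier<Integer> task = batch::size; // placeholder for a request to the shard leader
                pending.put(entry.getKey(), CompletableFuture.supplyAsync(task, pool));
            }
            Map<Integer, Integer> sent = new HashMap<>();
            pending.forEach((shard, future) -> sent.put(shard, future.join()));
            return sent;
        } finally {
            pool.shutdown();
        }
    }
}
```

Because each batch goes to exactly one shard, no server has to forward documents, and the batches are independent, which is why they can run concurrently.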
[jira] [Comment Edited] (SOLR-4816) Add document routing to CloudSolrServer
[ https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13761797#comment-13761797 ]

Joel Bernstein edited comment on SOLR-4816 at 9/9/13 12:32 PM:

Awesome! Looks like javabin transport is part of this as well. My earlier tests showed this provided a large performance increase. Also looks like you cleaned up the UpdateRequestExt, which is good. Hope to have a chance today to apply the patch and test things out.

was (Author: joel.bernstein):
Awesome! Looks like javabin transport is part of this well. My earlier tests showed this provided a large performance increase. Also looks like you cleaned up the UpdateRequestExt, which is good. Hope to have a chance today to apply the patch and test things out.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Comment Edited] (SOLR-4816) Add document routing to CloudSolrServer
[ https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13692733#comment-13692733 ]

Joel Bernstein edited comment on SOLR-4816 at 6/25/13 4:34 AM:

* Added support for aliases
* Added routing for delete-by-id requests
* Changed logic so all router implementations except implicit are acceptable for routing.

The test error occurs because CloudSolrServer clears the documents from the request after it performs the updates. It does this because it then executes the non-routable requests, such as commit. So this error will only occur if you try to issue the same update request again after sending it through CloudSolrServer. I'll fix this tomorrow by using a different approach to executing the non-routables.

was (Author: joel.bernstein):
* Added support for aliases
* Added routing for delete by id's
* Changed logic so all routers impl's but implicit are acceptable for routing.

The test error occurs because the CloudSolrServer clears the documents after it performs the updates. It does this because it then executes the non-routable requests such as commit. So this error will only occur if you try issue the same update request again following a request with CloudSolrServer. I'll fix this tomorrow by using a different approach to executing the non-routables.
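The reuse problem described in this comment can be reproduced in miniature. The sketch below is plain Java with hypothetical stand-ins (`ReuseSketch`, `Request`, `sendAndClear`, `sendFromCopy`), not SolrJ classes. The copy-based variant is only one possible alternative and is an assumption; the comment does not say which approach the fix will take.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; not SolrJ classes.
final class ReuseSketch {
    static final class Request {
        final List<String> docs = new ArrayList<>();
    }

    // Mirrors the described behaviour: the documents are cleared once sent,
    // so the caller's request is empty if it is issued a second time.
    static int sendAndClear(Request request) {
        int sent = request.docs.size();
        request.docs.clear(); // mutates the caller's request
        return sent;
    }

    // Assumed alternative: route from a copy and leave the request untouched.
    static int sendFromCopy(Request request) {
        List<String> routable = new ArrayList<>(request.docs);
        return routable.size();
    }
}
```

Sending the same request twice through `sendAndClear` delivers the documents only the first time, while `sendFromCopy` delivers them on every call.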
[jira] [Comment Edited] (SOLR-4816) Add document routing to CloudSolrServer
[ https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13680393#comment-13680393 ]

Joel Bernstein edited comment on SOLR-4816 at 6/11/13 2:07 PM:

Mark, there were a couple of changes I still wanted to make to this ticket:

1) Add a switch to turn threading on/off.
2) Add a thread pool rather than spawning new threads for each request.
3) Add a few more tests.

But before I dive in I wanted to be sure we're on the same page. Does this design satisfy the back compat issue for the response and exceptions? Are there other show stoppers in this design/implementation that need to be addressed?

was (Author: joel.bernstein):
Mark, there were a couple of changes I still wanted to make to this ticket: 1) Add a switch to turn on/off threading. 2) Add a thread pool rather then spawning new threads each request. 3) Add a few more tests. But before I dive I wanted to be sure we're on the same page. Does this design satisfy the back compat issue for the response and exceptions? Are there other show stoppers in this design/implementation that need to be addressed?
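The first two planned changes (a threading switch and a reusable thread pool) might look roughly like the following plain-Java sketch. `PoolSketch` and all of its members are hypothetical, and the per-shard work is reduced to recording which thread ran it. A real client would send a request there.

```java
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch of a switchable, pooled update path; not the patch itself.
final class PoolSketch {
    private final ExecutorService pool;           // created once, reused for every request
    private final boolean parallelUpdates;        // the on/off switch
    private final Set<String> workerNames = ConcurrentHashMap.newKeySet();

    PoolSketch(int poolSize, boolean parallelUpdates) {
        this.pool = Executors.newFixedThreadPool(poolSize);
        this.parallelUpdates = parallelUpdates;
    }

    // Runs one unit of work per shard, on the shared pool or inline on the caller's thread.
    void update(int shardCount) {
        if (!parallelUpdates) {
            for (int i = 0; i < shardCount; i++) {
                sendToShard();
            }
            return;
        }
        CompletableFuture<?>[] pending = new CompletableFuture<?>[shardCount];
        for (int i = 0; i < shardCount; i++) {
            pending[i] = CompletableFuture.runAsync(this::sendToShard, pool);
        }
        CompletableFuture.allOf(pending).join();
    }

    // Placeholder for the per-shard request; records the executing thread.
    private void sendToShard() {
        workerNames.add(Thread.currentThread().getName());
    }

    int distinctWorkers() {
        return workerNames.size();
    }

    void close() {
        pool.shutdown();
    }
}
```

With a pool of two threads, any number of `update` calls is served by at most two worker threads, whereas spawning a thread per shard per request would create a new thread each time. With the switch off, all work stays on the calling thread, which is the easier mode to debug.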
Add document routing to CloudSolrServer
---------------------------------------

Key: SOLR-4816
URL: https://issues.apache.org/jira/browse/SOLR-4816
Project: Solr
Issue Type: Improvement
Components: SolrCloud
Affects Versions: 4.3
Reporter: Joel Bernstein
Assignee: Mark Miller
Priority: Minor
Fix For: 5.0, 4.4
Attachments: SOLR-4816.patch (multiple revisions), SOLR-4816-sriesenberg.patch

This issue adds the following enhancements to CloudSolrServer's update logic:

1) Document routing: Updates are routed directly to the correct shard leader, eliminating document routing at the server.

2) Parallel update execution: Updates for each shard are executed in a separate thread so parallel indexing can occur across the cluster.

3) Javabin transport: Update requests are sent via javabin transport.

These enhancements should allow for near linear scalability on indexing throughput.

Usage:

    CloudSolrServer cloudClient = new CloudSolrServer(zkAddress);

    SolrInputDocument doc1 = new SolrInputDocument();
    doc1.addField("id", "0");
    doc1.addField("a_t", "hello1");

    SolrInputDocument doc2 = new SolrInputDocument();
    doc2.addField("id", "2");
    doc2.addField("a_t", "hello2");

    UpdateRequest request = new UpdateRequest();
    request.add(doc1);
    request.add(doc2);
    request.setAction(AbstractUpdateRequest.ACTION.OPTIMIZE, false, false);

    // Returns a backwards compatible condensed response.
    NamedList response = cloudClient.request(request);

    // To get a more detailed response, down cast to RouteResponse:
    CloudSolrServer.RouteResponse rr = (CloudSolrServer.RouteResponse) response;
    NamedList responses = rr.getRouteResponse();
[jira] [Comment Edited] (SOLR-4816) Add document routing to CloudSolrServer
[ https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13680453#comment-13680453 ]

Shawn Heisey edited comment on SOLR-4816 at 6/11/13 4:38 PM:

The work I've been doing (slowly) on SOLR-4715 overlaps with this issue. I've been thinking that I need to divide it into bite-size tasks for my own purposes, but that could possibly help here too. If I concentrate first on giving LBHttpSolrServer an easy way to set the writer to binary, that would reduce the complexity of this patch quite a bit. Thoughts?

was (Author: elyograg):
The work I've been doing (slowly) on SOLR-4715 overlaps with this issue. I've been thinking that I need to divide it into bite-size tasks for my own purposes, but that could possibly help here too. If I concentrate first on giving CloudSolrServer a way to set the writer to binary, that would reduce the complexity of this patch quite a bit. Thoughts?
[jira] [Comment Edited] (SOLR-4816) Add document routing to CloudSolrServer
[ https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13670431#comment-13670431 ]

Joel Bernstein edited comment on SOLR-4816 at 5/30/13 4:01 PM:

Latest patch is a version of CloudSolrServer that:

1) Does document routing
2) Sends requests to each shard in a separate thread
3) Uses javabin transport
4) Is backwards compatible with both the response and exception.

It does this all by default, no switches needed.

This is accomplished by returning a response or throwing an exception that condenses the info from each shard into a single response or exception.

To get the full info for the response or exception you can down cast to either RouteResponse or RouteException, which gives you a detailed breakdown from each of the shards.

Will update the ticket name and description accordingly.

was (Author: joel.bernstein):
Latest patch is a version of CloudSolrServer that: 1)Does document routing 2)Sends requests to each shard in a separate thread 3) Uses javabin transport 4) Is backwards compatible with both the response and exception. It does this all by default, no switches needed. I accomplish this by returning a response or throwing an exception that condenses the info from each shard into a single response or exception. To get the full info for the response or exception you can down cast to either RouteReponse or RouteException which gives you a detailed breakdown from each of the shards. Will update the ticket name and description accordingly.
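The backwards-compatibility idea in this comment (return the familiar type, let interested callers down cast for per-shard detail) can be sketched as follows. These classes are hypothetical stand-ins written for illustration. They are not the SolrJ `RouteResponse`, and the aggregation rule (first non-zero status, slowest shard's elapsed time) is an assumption and is not taken from the patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration; not the SolrJ classes of similar name.
final class CondenseSketch {
    // The plain response shape that existing callers already understand.
    static class Response {
        final int status;
        final long elapsedMillis;

        Response(int status, long elapsedMillis) {
            this.status = status;
            this.elapsedMillis = elapsedMillis;
        }
    }

    // Condensed response that additionally keeps the per-shard breakdown.
    static final class RouteResponse extends Response {
        final Map<String, Response> perShard;

        RouteResponse(int status, long elapsedMillis, Map<String, Response> perShard) {
            super(status, elapsedMillis);
            this.perShard = new LinkedHashMap<>(perShard);
        }
    }

    // Assumed rule: the first non-zero status wins, elapsed time is that of the slowest shard.
    static Response condense(Map<String, Response> perShard) {
        int status = 0;
        long elapsed = 0;
        for (Response shardResponse : perShard.values()) {
            if (status == 0 && shardResponse.status != 0) {
                status = shardResponse.status;
            }
            elapsed = Math.max(elapsed, shardResponse.elapsedMillis);
        }
        return new RouteResponse(status, elapsed, perShard);
    }
}
```

Callers written against `Response` keep working unchanged, and callers that need to know which shard did what can test for `RouteResponse` and read `perShard`. The same pattern applies to a condensed exception with a detailed subclass.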
[jira] [Comment Edited] (SOLR-4816) Add document routing to CloudSolrServer
[ https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13670456#comment-13670456 ]

Shawn Heisey edited comment on SOLR-4816 at 5/30/13 4:17 PM:

bq. I still think these extras will need to be off by default until 5.

+1. Even in version 5, it should still be possible to turn them off. Advanced features (threading in particular) have a tendency to cause subtle bugs, and it's difficult to know if they are bugs in the underlying code or bugs in the advanced feature. Being able to turn them off will greatly help with debugging.

IMHO, most tests that use CloudSolrServer should randomly turn things like threading on or off, change the writer and parser, etc. Which reminds me, I need to file an issue and work on a patch for Cloud/LBHttpSolrServer implementations that includes many of the getters/setters from HttpSolrServer.

was (Author: elyograg):
bq. I still think these extras will need to be off by default until 5.

+1. Even in version 5, it should still be possible to turn them off. Advanced features (threading in particular) have a tendency to cause subtle bugs, and it's difficult to know if they are bugs in the underlying code or bugs in the advanced feature. Being able to turn them off will greatly help with debugging.

IMHO, most tests that use CloudSolrServer should randomly turn things like threading on or off, change the writer and parser, etc. Which reminds me, I need to file an issue and work on a patch for {Cloud,LBHttp}SolrServer implementations that includes many of the getters/setters from HttpSolrServer.