[ https://issues.apache.org/jira/browse/SOLR-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13872573#comment-13872573 ]
Anshum Gupta edited comment on SOLR-5477 at 1/16/14 7:31 AM:
-------------------------------------------------------------

I have a few questions regarding my approach for making the CoreAdmin calls async:

Approach #1:
* CoreAdmin requests get submitted to zk.
* Each core watches its zk node for submitted tasks. The request object is the data in the node (when submitted).
* On completion, the core deletes the submitted task and puts a new node with the response and other metadata into zk.
* The Collection API watches the node when it submits a task and waits for it to complete.
* On completion of the Collection API call, delete all related core admin request nodes in zk that were generated.
* Cleaning up of request nodes in zk happens through an explicit API call.
* Having something along the following lines in zk would be helpful:
/tasks
./collections/collection1/task1
./cores/core1/collection1/task1/coretask1
./_
This would help us delete the entire group of tasks associated with a core/collection/core task/collection task.

Questions:
* This move would mean having a lot more clients talk to and write to zk. Does this approach make sense as far as the intended direction of SolrCloud is concerned?
* Any suggestions/concerns about the scalability of zk, given that multiple updates would be coming into zk?

Approach #2:
Continue accepting the request as we do right now, but:
# Get the call to return immediately.
# Use zk only to track/store the status (persistence). The request status calls still come to the core, and the core fetches the status from zk, instead of the client being intelligent and talking directly to zk.

This approach is certainly less intrusive, but it also doesn't come with the benefit of letting the client just watch a particular zk node for task state changes etc.
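As a rough illustration of the /tasks layout proposed under Approach #1, the sketch below builds the two kinds of paths and selects a whole group of tasks by path prefix. TaskPaths and its methods are hypothetical names, not Solr or ZooKeeper APIs, and the plain list of strings only stands in for the znodes; a real implementation would go through the ZooKeeper client.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: names are illustrative and not part of Solr.
final class TaskPaths {
    static final String ROOT = "/tasks";

    private TaskPaths() {}

    // Builds /tasks/collections/<collection>/<collectionTask>
    static String collectionTask(String collection, String collectionTask) {
        return ROOT + "/collections/" + collection + "/" + collectionTask;
    }

    // Builds /tasks/cores/<core>/<collection>/<collectionTask>/<coreTask>
    static String coreTask(String core, String collection,
                           String collectionTask, String coreTask) {
        return ROOT + "/cores/" + core + "/" + collection
                + "/" + collectionTask + "/" + coreTask;
    }

    // Returns every path that is the prefix itself or nested under it.
    // The trailing "/" check keeps "collection1" from matching "collection10".
    static List<String> group(List<String> allPaths, String prefix) {
        List<String> out = new ArrayList<>();
        for (String p : allPaths) {
            if (p.equals(prefix) || p.startsWith(prefix + "/")) {
                out.add(p);
            }
        }
        return out;
    }
}
```

With paths shaped like this, cleaning up after a collection task amounts to selecting everything under one prefix, for example TaskPaths.group(paths, "/tasks/cores/core1/collection1/task1"), and deleting those nodes.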

Approach #3 (not the best option; more of a fallback if zk has scalability issues with everyone writing/watching):
* Don't make CoreAdmin calls async, but instead introduce a tracking mode. Once the task is submitted [with async = "taskid"], track this request using an in-memory data structure. Even if the request times out, the client can go back and query the task status.

was (Author: anshumg):
I have a few questions regarding my approach for making the CoreAdmin calls async:

Approach #1:
* CoreAdmin requests get submitted to zk.
* Each core watches its zk node for submitted tasks. The request object is the data in the node (when submitted).
* On completion, the core deletes the submitted task and puts a new node with the response and other metadata into zk.
* The Collection API watches the node when it submits a task and waits for it to complete.
* On completion of the Collection API call, delete all related core admin request nodes in zk that were generated.
* Cleaning up of request nodes in zk happens through an explicit API call.
* Having something along the following lines in zk would be helpful:
/tasks
./collections/collection1/task1
./cores/core1/collection1/task1/coretask1
./_
This would help us delete the entire group of tasks associated with a core/collection/core task/collection task.

Questions:
* This move would mean having a lot more clients talk to and write to zk. Does this approach make sense as far as the intended direction of SolrCloud is concerned?
* Any suggestions/concerns about the scalability of zk, given that multiple updates would be coming into zk?

Approach #2 (not the best option; more of a fallback if zk has scalability issues with everyone writing/watching):
* Don't make CoreAdmin calls async, but instead introduce a tracking mode. Once the task is submitted [with async = "taskid"], track this request using an in-memory data structure. Even if the request times out, the client can go back and query the task status.
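
To make the tracking mode concrete, here is a minimal sketch, assuming a ConcurrentHashMap keyed by the client-supplied task id as the in-memory data structure. TaskTracker, its State enum, and both method names are hypothetical and do not exist in Solr; the task still runs synchronously on the calling thread, as the tracking mode describes, and only its state is recorded for later status queries.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical tracker: names are illustrative and not part of Solr.
final class TaskTracker {
    enum State { RUNNING, COMPLETED, FAILED, NOT_FOUND }

    // In-memory record of task id -> last known state.
    private final Map<String, State> states = new ConcurrentHashMap<>();

    // Runs the task on the caller's thread (the call itself is not async)
    // and records its state under the given task id.
    <T> T runTracked(String taskId, Supplier<T> task) {
        if (states.putIfAbsent(taskId, State.RUNNING) != null) {
            throw new IllegalArgumentException("task id already in use: " + taskId);
        }
        try {
            T result = task.get();
            states.put(taskId, State.COMPLETED);
            return result;
        } catch (RuntimeException e) {
            states.put(taskId, State.FAILED);
            throw e;
        }
    }

    // Status lookup a client could call after its original request timed out.
    State status(String taskId) {
        return states.getOrDefault(taskId, State.NOT_FOUND);
    }
}
```

Because the map lives only in memory, the recorded states are lost when the node restarts, which is the trade-off against persisting status in zk.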

> Async execution of OverseerCollectionProcessor tasks
> ----------------------------------------------------
>
>                 Key: SOLR-5477
>                 URL: https://issues.apache.org/jira/browse/SOLR-5477
>             Project: Solr
>          Issue Type: Sub-task
>          Components: SolrCloud
>            Reporter: Noble Paul
>            Assignee: Anshum Gupta
>         Attachments: SOLR-5477-CoreAdminStatus.patch, SOLR-5477.patch
>
>
> Typical collection admin commands are long running and it is very common to
> have the requests get timed out. It is more of a problem if the cluster is
> very large. Add an option to run these commands asynchronously.
> Add an extra param async=true for all collection commands.
> The task is written to ZK and the caller is returned a task id.
> A separate collection admin command will be added to poll the status of the
> task:
> command=status&id=7657668909
> If id is not passed, all running async tasks should be listed.
> A separate queue is created to store in-process tasks. After the tasks are
> completed the queue entry is removed. OverseerCollectionProcessor will perform
> these tasks in multiple threads.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org