[ 
https://issues.apache.org/jira/browse/SOLR-13942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17051369#comment-17051369
 ] 

Ishan Chattopadhyaya edited comment on SOLR-13942 at 3/4/20, 4:24 PM:
----------------------------------------------------------------------

Jason, I think you're missing some perspective here: that of experts who are 
brought in to resolve problems, and that of the operations engineers who run & 
maintain Solr. Here's mine. 

1. In the past 4+ years of consulting, I've encountered several clients who 
brought me on to help resolve a production issue. Often these engagements 
happen over conference calls, so I don't have SSH access to their instances 
(though sometimes I do). I'm at their mercy to browse, or capture screenshots 
of, the Solr admin UI in order to get a fair idea of the ZK data. Here's a 
recent example, as [~munendrasn] can testify: I had to help out with a 
situation where the overseer wasn't getting elected and the OVERSEERSTATUS API 
wasn't working. I needed a way for the client to quickly dump the entire ZK 
data and pass it to me for further analysis. (In that case, I made a 
recommendation without having access to the ZK data, and still saved the day.) 
Asking such clients, in the middle of a crisis, to install ZK clients or fight 
with our own ZK client is unreasonable, because policy restrictions on their 
side can make that process lengthy.

2. Most often, Solr is just one part of a very large distributed system 
comprising several components and microservices. Expecting dev-ops to install 
and maintain additional tools just for Solr is unreasonable. Also, since we 
treat ZK as an implementation detail of Solr, it is unreasonable to expect 
dev-ops to start setting up proxies etc. for ZK as a way of monitoring Solr. 
Solr should let expert users peek into the data that Solr puts into ZK. As 
you've rightly identified, the cost of maintaining additional tools is one 
factor; another is the complexity of monitoring those tools themselves. 
Imagine the situation where there's a crisis and an outage, and the nginx 
proxy isn't working (and no alerting was set up to warn that it had gone 
down). Having Solr let you peek into its own internal state data reduces the 
moving parts needed to debug Solr problems.

Hope this helps.

p.s.: [~erickerickson], [~dsmiley], [~markrmil...@gmail.com], some perspective 
from you would help us understand whether this issue solves a real problem 
(which Noble and I think it does). Your war stories will far outweigh whatever 
I've seen, I'm sure.



> /api/cluster/zk/* to fetch raw ZK data
> --------------------------------------
>
>                 Key: SOLR-13942
>                 URL: https://issues.apache.org/jira/browse/SOLR-13942
>             Project: Solr
>          Issue Type: New Feature
>          Components: v2 API
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>            Priority: Blocker
>             Fix For: 8.5
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Example: download the {{state.json}} of a collection
> {code}
> GET http://localhost:8983/api/cluster/zk/collections/gettingstarted/state.json
> {code}
> Get a list of all children under {{/live_nodes}}
> {code}
> GET http://localhost:8983/api/cluster/zk/live_nodes
> {code}
> If the requested path is a node with children, show the list of its child 
> nodes and their metadata.
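
For illustration, here is a minimal sketch of how a support engineer might 
consume this endpoint from outside the cluster. It assumes a leaf node (such 
as {{state.json}}) comes back as the raw znode data, as in the examples above; 
the listing format for parent nodes isn't specified here, so that response is 
just printed as-is.

{code}
import json
import urllib.request

# Assumed base URL of a local Solr node exposing the proposed endpoint;
# the path shape follows the examples above.
BASE = "http://localhost:8983/api/cluster/zk"

def fetch_znode(path):
    """GET a ZK path through Solr and return the raw response body as text."""
    with urllib.request.urlopen(BASE + path) as resp:
        return resp.read().decode("utf-8")

# Fetch a collection's state.json straight from Solr, no ZK client needed.
state = json.loads(fetch_znode("/collections/gettingstarted/state.json"))
print(list(state["gettingstarted"]["shards"].keys()))

# Listing a parent node such as /live_nodes should return its children and
# their metadata; the exact format isn't specified here, so print it raw.
print(fetch_znode("/live_nodes"))
{code}

A small script along these lines is essentially the "dump the ZK data and send 
it over" workflow described in the comment above, requiring nothing on the 
client's side beyond HTTP access to Solr.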


