[jira] [Commented] (SOLR-11522) Suggestions/recommendations to rebalance replicas

Hoss Man (JIRA) Mon, 22 Oct 2018 15:52:09 -0700


    [ 
https://issues.apache.org/jira/browse/SOLR-11522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16659780#comment-16659780
 ]


Hoss Man commented on SOLR-11522:
---------------------------------

{quote}I am mostly worried about the suitability of "get" declared on this 
interface, and I think the costly lookup is symptomatic of this misalignment. 
This interface should be limited to writing IMO.
{quote}
I'm having trouble understanding the purpose/use of these {{_get}} methods 
(they seem to be purely for use in tests???) but in general I'm with david that 
they seem very confusing and badly suited for this interface/name.

If these are purely "utility" methods intended for special case situation where 
tests want to do a "lookup" of a value in one of these "Map like" MapWriter 
instances, why do they need to be in the interface itself? Since the bulk of 
the work to do the "pretend we are writting out the map but really just look 
for one key" is in {{Utils.getVal}} – hidden away as an implementation detail 
behind {{Utils.getObjectByPath}} which already does a lot of instanceof checks 
– why does {{MapWriter._get}} need to exist at all? why can't the caller just 
use {{Utils.getObjectByPath}} directly?

If the answer is: "Because we want impls of {{MapWriter}} to be able to provide 
a more efficient impl." then why have such a terrible inefficient default impl 
at all?

At the very least, this method should have a more descriptive name and better 
javadocs (as should {{Utils.getObjectByPath}} that makes it clear what the 
performance tradeoffs are here.
----
frankly, looking at the actual uses of {{_get}} in the tests makes me question 
the entire "value add" of these methods – why aren't all the callers of 
{{_get}} just using {{toMap}} (or {{Utils.getObjectByPath}}) and then making 
multiple assertions about the resulting map? AFAICT the way {{Utils.getVal}} 
works means that in the case of a test that does a single assert on a single 
entry, converting hte entire object to a Map would be just as efficient as only 
"writing" that single entry, but in many cases tests are calling {{_get}} on 
several sub-elements in a row, which should be much faster if the test just 
dumped the whole map and then called {{get}} on they keys it wants to assert.

Specifically, isn't this existing test snippet...
{code:java}
      CoreAdminResponse status = CoreAdminRequest.getStatus(corename, 
coreclient);
      assertEquals(collectionName, status._get(asList("status", corename, 
"cloud", "collection"), null));
      assertNotNull(status._get(asList("status", corename, "cloud", "shard"), 
null));
      assertNotNull(status._get(asList("status", corename, "cloud", "replica"), 
null));
{code}
...at least 3 times slower then if the test just did something like...
{code:java}
      CoreAdminResponse status = CoreAdminRequest.getStatus(corename, 
coreclient);
      Map coreMap = Utils.getObjectByPath(status, false, asList("status", 
corename, "cloud");
      assertEquals(collectionName, coreMap.get("collection"));
      assertNotNull(coreMap.get("shard"));
      assertNotNull(coreMap.get("replica"));
{code}
...?

I don't see how the existing test code is any better/faster/more-readable then 
the 2nd (which seems like a much simpler approach, w/o the need to pollute the 
{{MapWriter}} API with a confusing default method)

> Suggestions/recommendations to rebalance replicas
> -------------------------------------------------
>
>                 Key: SOLR-11522
>                 URL: https://issues.apache.org/jira/browse/SOLR-11522
>             Project: Solr
>          Issue Type: Sub-task
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: AutoScaling
>            Reporter: Noble Paul
>            Priority: Major
>
> It is possible that a cluster is unbalanced even if it is not breaking any of 
> the policy rules. Some nodes may have very little load while some others may 
> be heavily loaded. So, it is possible to move replicas around so that the 
> load is more evenly distributed. This is going to be driven by preferences. 
> The way we arrive at these suggestions is going to be as follows
>  # Sort the nodes according to the given preferences
>  # Choose a replica from the most loaded node ({{source-node}}) 
>  # try adding them to the least loaded node ({{target-node}})
>  # See if it breaks any policy rules. If yes , try another {{target-node}} 
> (go to #3)
>  # If no policy rules are being broken, present this as a {{suggestion}} . 
> The suggestion contains the following information
>  #* The {{source-node}} and {{target-node}} names
>  #* The actual v2 command that can be run to effect the operation
>  # Go to step #1
>  # Do this until the a replicas can be moved in such a way that the {{target 
> node}} is more loaded than the {{source-node}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Commented] (SOLR-11522) Suggestions/recommendations to rebalance replicas

Reply via email to