npawar commented on PR #15050:
URL: https://github.com/apache/pinot/pull/15050#issuecomment-2667186163
> Updated summary:
>
> Adding a server:
>
> ```
> {
>   "serverInfo" : {
>     "numServersGettingNewSegments" : 1,
>     "numServers" : {
>       "valueBeforeRebalance" : 1,
>       "expectedValueAfterRebalance" : 2
>     },
>     "serverSegmentChangeInfo" : {
>       "Server_localhost_22004" : {
>         "serverStatus" : "ADDED",
>         "totalSegmentsAfterRebalance" : 6,
>         "totalSegmentsBeforeRebalance" : 0,
>         "segmentsAdded" : 6,
>         "segmentsDeleted" : 0,
>         "segmentsUnchanged" : 0,
>         "tagList" : [ "DefaultTenant_OFFLINE", "DefaultTenant_REALTIME" ]
>       },
>       "Server_localhost_22001" : {
>         "serverStatus" : "UNCHANGED",
>         "totalSegmentsAfterRebalance" : 6,
>         "totalSegmentsBeforeRebalance" : 12,
>         "segmentsAdded" : 0,
>         "segmentsDeleted" : 6,
>         "segmentsUnchanged" : 6,
>         "tagList" : [ "DefaultTenant_OFFLINE", "DefaultTenant_REALTIME" ]
>       }
>     }
>   },
>   "segmentInfo" : {
>     "totalSegmentsToBeMoved" : 6,
>     "estimatedAverageSegmentSizeInBytes" : 1690546,
>     "totalEstimatedDataToBeMovedInBytes" : 10143276,
>     "totalEstimatedTimeToMoveDataInSecs" : 0.09673381805419921,
>     "replicationFactor" : {
>       "valueBeforeRebalance" : 1,
>       "expectedValueAfterRebalance" : 1
>     },
>     "numSegmentsInSingleReplica" : {
>       "valueBeforeRebalance" : 12,
>       "expectedValueAfterRebalance" : 12
>     },
>     "numSegmentsAcrossAllReplicas" : {
>       "valueBeforeRebalance" : 12,
>       "expectedValueAfterRebalance" : 12
>     }
>   }
> }
> ```
>
> Removing a server:
>
> ```
> {
>   "serverInfo" : {
>     "numServersGettingNewSegments" : 1,
>     "numServers" : {
>       "valueBeforeRebalance" : 2,
>       "expectedValueAfterRebalance" : 1
>     },
>     "serverSegmentChangeInfo" : {
>       "Server_localhost_22004" : {
>         "serverStatus" : "REMOVED",
>         "totalSegmentsAfterRebalance" : 0,
>         "totalSegmentsBeforeRebalance" : 6,
>         "segmentsAdded" : 0,
>         "segmentsDeleted" : 6,
>         "segmentsUnchanged" : 0,
>         "tagList" : [ ]
>       },
>       "Server_localhost_22001" : {
>         "serverStatus" : "UNCHANGED",
>         "totalSegmentsAfterRebalance" : 12,
>         "totalSegmentsBeforeRebalance" : 6,
>         "segmentsAdded" : 6,
>         "segmentsDeleted" : 0,
>         "segmentsUnchanged" : 6,
>         "tagList" : [ "DefaultTenant_OFFLINE", "DefaultTenant_REALTIME" ]
>       }
>     }
>   },
>   "segmentInfo" : {
>     "totalSegmentsToBeMoved" : 6,
>     "estimatedAverageSegmentSizeInBytes" : 1690546,
>     "totalEstimatedDataToBeMovedInBytes" : 10143276,
>     "totalEstimatedTimeToMoveDataInSecs" : 0.09673381805419921,
>     "replicationFactor" : {
>       "valueBeforeRebalance" : 1,
>       "expectedValueAfterRebalance" : 1
>     },
>     "numSegmentsInSingleReplica" : {
>       "valueBeforeRebalance" : 12,
>       "expectedValueAfterRebalance" : 12
>     },
>     "numSegmentsAcrossAllReplicas" : {
>       "valueBeforeRebalance" : 12,
>       "expectedValueAfterRebalance" : 12
>     }
>   }
> }
> ```
>
> > A few comments on the summary. Feel free to just take those that make
> > sense for this first iteration:
> >
> > 1. Along with numServers, it might be useful to see the lists of existing
> > and new servers there, so the operator can confirm that their tagging /
> > untagging is effective (so, servers added / removed / unchanged?)
> > 2. How about showing what tenant tag we're operating with? So summarize
> > the tags from tenants, completed, tier, pools.
> > 3. What does num unique segments mean? Same with numTotalSegments: I
> > didn't follow what the existing/new value means in the context of
> > rebalance. Perhaps having a description field within the sections will
> > help.
> > 4. In the server-to-stats map for 22001, if it started with 12 and 6 are
> > moving, shouldn't totalNewSegments be 0?
> > 5. This payload will get pretty extensive over time. Wondering if we
> > should take time to do some more top-level categorization: segments-related
> > info, servers-related info, generic info.
>
> @npawar responses regarding the above are below:
>
> 1. Done, added a server status for each server.
> 2. I've added the tenant tags for each server. Is this what you meant? Or
> do you think just dumping the list of all tenant tags across servers is good
> enough? Or did you want finer categorization based on whether it's
> OFFLINE/CONSUMING/COMPLETED/TIER etc.? We can add more categorization as
> part of a future change.
> 3. By unique segments I meant the number of segments per replica. I've
> updated the name of the field in the new summary. Total segments was the
> total number of segments across all replicas. Existing was the value before
> rebalance, and new was the expected value after rebalance. I've updated the
> fields in the summary. Let me know if that makes more sense or if we still
> need a description field.
> 4. Here `totalNewSegments` was the total number of segments hosted after the
> rebalance completes. So 22001 was losing 6 of its existing 12 segments. I've
> updated the field to better explain this.
> 5. I've attempted a categorization into server vs. segments. Please take a
> look and let me know what you think.

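To make the renamed fields in points 3 and 4 concrete, here is a minimal sketch (not code from the PR) that checks how the counts in the "Adding a server" summary relate to each other. The relationships it tests are assumptions inferred from the sample values only, and the function names `server_counts_consistent` and `segment_info_consistent` are hypothetical.

```python
# Values copied from the "Adding a server" summary quoted above.
summary = {
    "serverInfo": {
        "serverSegmentChangeInfo": {
            "Server_localhost_22004": {
                "totalSegmentsAfterRebalance": 6,
                "totalSegmentsBeforeRebalance": 0,
                "segmentsAdded": 6,
                "segmentsDeleted": 0,
                "segmentsUnchanged": 0,
            },
            "Server_localhost_22001": {
                "totalSegmentsAfterRebalance": 6,
                "totalSegmentsBeforeRebalance": 12,
                "segmentsAdded": 0,
                "segmentsDeleted": 6,
                "segmentsUnchanged": 6,
            },
        }
    },
    "segmentInfo": {
        "totalSegmentsToBeMoved": 6,
        "estimatedAverageSegmentSizeInBytes": 1690546,
        "totalEstimatedDataToBeMovedInBytes": 10143276,
        "replicationFactor": {
            "valueBeforeRebalance": 1, "expectedValueAfterRebalance": 1},
        "numSegmentsInSingleReplica": {
            "valueBeforeRebalance": 12, "expectedValueAfterRebalance": 12},
        "numSegmentsAcrossAllReplicas": {
            "valueBeforeRebalance": 12, "expectedValueAfterRebalance": 12},
    },
}


def server_counts_consistent(info):
    """Assumed: before = unchanged + deleted, after = unchanged + added."""
    before = info["segmentsUnchanged"] + info["segmentsDeleted"]
    after = info["segmentsUnchanged"] + info["segmentsAdded"]
    return (info["totalSegmentsBeforeRebalance"] == before
            and info["totalSegmentsAfterRebalance"] == after)


def segment_info_consistent(seg):
    """Assumed: data moved = segments moved * average size, and
    segments across all replicas = segments per replica * replication."""
    moved_ok = (seg["totalEstimatedDataToBeMovedInBytes"]
                == seg["totalSegmentsToBeMoved"]
                * seg["estimatedAverageSegmentSizeInBytes"])
    replicas_ok = all(
        seg["numSegmentsAcrossAllReplicas"][key]
        == seg["numSegmentsInSingleReplica"][key] * seg["replicationFactor"][key]
        for key in ("valueBeforeRebalance", "expectedValueAfterRebalance"))
    return moved_ok and replicas_ok


servers = summary["serverInfo"]["serverSegmentChangeInfo"]
all_servers_ok = all(server_counts_consistent(s) for s in servers.values())
segments_ok = segment_info_consistent(summary["segmentInfo"])
```

For 22001 this reads as 12 segments before = 6 unchanged + 6 deleted, and 6 segments after = 6 unchanged + 0 added, which is the case point 4 asked about.
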
Points 3, 4, and 5 look great.
Point 2: sure, we can take this up later.
For 1, the tag in each server's section is helpful. What I was suggesting,
though, is that we provide the lists right at the top (essentially 4 lists
called serversGettingNewSegments, serverAdded, serverRemoved, serverUnchanged),
because once we get to clusters with 50-100 servers, parsing through 50-100
individual JSON blurbs, one per server, might get tricky.
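
As a rough illustration of the four top-level lists suggested here, the sketch below derives them from a `serverSegmentChangeInfo` map shaped like the one in the summary. It is not the PR's implementation: the helper name `summarize_servers` is hypothetical, and treating "getting new segments" as `segmentsAdded > 0` is an assumption that happens to match `numServersGettingNewSegments` in both sample payloads.

```python
def summarize_servers(server_segment_change_info):
    """Group server names into top-level lists by status.

    Assumption: a server is "getting new segments" when segmentsAdded > 0.
    """
    lists = {
        "serversGettingNewSegments": [],
        "serverAdded": [],
        "serverRemoved": [],
        "serverUnchanged": [],
    }
    # Map each serverStatus value from the summary to its target list.
    status_to_list = {
        "ADDED": "serverAdded",
        "REMOVED": "serverRemoved",
        "UNCHANGED": "serverUnchanged",
    }
    for name in sorted(server_segment_change_info):
        info = server_segment_change_info[name]
        if info["segmentsAdded"] > 0:
            lists["serversGettingNewSegments"].append(name)
        target = status_to_list.get(info["serverStatus"])
        if target is not None:
            lists[target].append(name)
    return lists


# Trimmed-down input taken from the "Adding a server" summary.
adding_a_server = {
    "Server_localhost_22004": {"serverStatus": "ADDED", "segmentsAdded": 6},
    "Server_localhost_22001": {"serverStatus": "UNCHANGED", "segmentsAdded": 0},
}
top_level_lists = summarize_servers(adding_a_server)
```

With lists like these at the top, an operator could confirm tagging changes without opening each per-server entry.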
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]