tibrewalpratik17 commented on issue #12400: URL: https://github.com/apache/pinot/issues/12400#issuecomment-2154558656
Hey @ankitsultana, I was looking into this. Based on your example:

```
Say we have two replicas of consuming segments: S0 and S1. Say the segments with the previous sequence id for this segment in the replicas are: P0 and P1. While a rebalance is going on, say P0 gets moved to the target server before P1, and between that time we had a record come to S0 which needed to be read from P0. If a segment commit happens before the consuming segments were moved, we will end up with S0, S1 having different data.
```

When we rebalance with the `includeConsuming` option set to true, [Pinot's official documentation](https://docs.pinot.apache.org/operators/operating-pinot/rebalance/rebalance-servers#rebalance-parameters) says:

```
CONSUMING segments are rebalanced only if this is set to true. Moving a CONSUMING segment involves dropping the data consumed so far on old server, and re-consuming on the new server.
```

In the case of partial-upsert, an entire partition will be moved to another node, not just a few segments. As you mentioned, `allSegmentsLoaded` will prevent consumption from starting on the new node until all the old segments for the partition are available, so that the data is re-consumed properly.

Moreover, since a NoDowntime rebalance proceeds at the replica level, we move to the next replica, using the same logic, only once the first replica is in a stable consuming state. If a segment commit happens during this process, it should not result in different data, because the `allSegmentsLoaded` condition will prevent any consumption inconsistencies.

But please let me know if there are any edge cases I might have missed. I haven't gone through the rebalance code in detail yet, so my understanding is based on documentation and theoretical knowledge.
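To make the gating argument concrete, here is a minimal Python sketch. It is not Pinot code: `PartitionReplica`, `try_consume`, and the segment names are hypothetical, and `all_segments_loaded` only stands in for the `allSegmentsLoaded` check mentioned above. The real implementation may behave differently, which is exactly the kind of edge case worth checking.

```python
from dataclasses import dataclass, field


@dataclass
class PartitionReplica:
    """Hypothetical model of one replica of a partial-upsert partition on the target server."""

    expected_segments: set                                # committed segments this replica must host
    loaded_segments: set = field(default_factory=set)     # segments available locally so far
    consumed_offsets: list = field(default_factory=list)  # stream offsets ingested so far

    def all_segments_loaded(self) -> bool:
        # Stand-in for the allSegmentsLoaded condition (assumed semantics):
        # true only when every expected committed segment is loaded.
        return self.expected_segments <= self.loaded_segments

    def load(self, segment: str) -> None:
        self.loaded_segments.add(segment)

    def try_consume(self, offset: int) -> bool:
        # Refuse to ingest while any committed segment is still missing,
        # so a record is never merged against incomplete upsert state.
        if not self.all_segments_loaded():
            return False
        self.consumed_offsets.append(offset)
        return True


# Hypothetical committed segments of the partition being moved.
replica = PartitionReplica(expected_segments={"seg_0", "seg_1"})

replica.load("seg_0")
blocked = replica.try_consume(100)   # seg_1 has not arrived yet, so the record is not ingested

replica.load("seg_1")
accepted = replica.try_consume(100)  # the same record is re-consumed once the gate opens
```

In this model the record at offset 100 is ingested only after both committed segments are present, which is the property the argument relies on. The sketch says nothing about what happens on the old server, or about a commit racing with the move.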