tibrewalpratik17 commented on issue #12400:
URL: https://github.com/apache/pinot/issues/12400#issuecomment-2162931989

   > In the case of partial-upsert, an entire partition will be moved to 
another node, not just a few segments. As you mentioned, allSegmentsLoaded will 
prevent the consumption from starting on the new node until all the old 
segments for the partition are available, ensuring the data is re-consumed 
properly.
   
   There is a scenario in partial-upserts where only some segments of a 
partition are moved during a rebalance while the consuming segment continues 
consumption. The consuming segment then produces inconsistent data because it 
is missing references to previous records. If that consuming segment commits 
its data, instead of being rebalanced and re-consuming the entire data on the 
new node, we would end up with different data across replicas.
   
   Thinking out loud, I can think of a few solutions to this:
   - Approach 1: Ensure that segments are not committed during an ongoing 
rebalance. Whenever a server reaches the threshold for committing a segment 
(the `endCriteriaReached` method), it should check for an ongoing rebalance for 
the table and make a `hold` call to prevent committing. The `hold` method will 
sleep the thread for a certain period and then check again if the rebalance has 
finished. Note: This approach requires server-to-controller communication (only 
for partial-upsert tables) to determine whether a rebalance is in progress every 
time the commit criteria are met. Ingestion stays at the same offset in that 
partition until the server is asked to change state.
   - Approach 2: Similar to Approach 1, but the controller handles the holding 
logic in the `SegmentCompletionManager#holdingConsumed` method. Until the 
rebalance is completed, the controller will ask the servers to hold off on 
committing. Note: It is unclear what happens if the consuming segment is 
relocated to a new instance and the controller then asks the winner to proceed 
with the commit job. Will it result in a no-op? In this approach too, ingestion 
stays at the same offset in that partition until the controller notifies the 
server to change state.
   - Approach 3: Pause consumption for partial-upsert tables (pausing 
consumption triggers a force-commit as well). Then, perform the rebalance 
followed by resuming consumption. This entire process should be orchestrated 
within the rebalance job.
   
   Approach 1 and Approach 2: These approaches only come into play when a 
commit happens at the partition level during an ongoing rebalance. This 
minimizes the likelihood of ingestion lag during or after the rebalance 
operation.
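   
   To make the hold loop in Approach 1 concrete, here is a minimal sketch. 
Everything in it is hypothetical: `CommitHoldSketch`, 
`holdUntilRebalanceDone`, the poll interval and the attempt cap are names I 
made up, and the `BooleanSupplier` stands in for whatever server-to-controller 
call would report an ongoing rebalance. It is not existing Pinot code.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

/** Hypothetical helper for illustration only; not part of Pinot. */
final class CommitHoldSketch {
  private CommitHoldSketch() {
  }

  /**
   * Polls rebalanceInProgress (an assumed stand-in for a controller call) and
   * sleeps between polls. Returns true as soon as no rebalance is reported,
   * or false if maxAttempts polls all reported a rebalance or the thread was
   * interrupted while sleeping.
   */
  static boolean holdUntilRebalanceDone(BooleanSupplier rebalanceInProgress,
      Duration pollInterval, int maxAttempts) {
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      if (!rebalanceInProgress.getAsBoolean()) {
        return true; // safe to proceed with the commit
      }
      try {
        Thread.sleep(pollInterval.toMillis());
      } catch (InterruptedException e) {
        // Restore the interrupt flag and give up holding.
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return false; // rebalance still running after the attempt cap
  }
}
```

   The attempt cap is my own addition so the thread cannot sleep forever; what 
the server should do when the cap is hit is an open question.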
   
   Approach 3: This approach will cause ingestion lag across all partitions 
every time a rebalance is triggered for a partial-upsert table, as it pauses 
and resumes ingestion during the rebalance process. Additionally, it requires 
the rebalance operation to be more intrusive, as it needs to orchestrate the 
entire flow of pausing consumption, ensuring the current consuming segments are 
committed, and then resuming consumption after the rebalance is complete.
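   
   The pause, rebalance, resume flow of Approach 3 could be orchestrated 
roughly as below. `TableOps` and its three methods are hypothetical 
placeholders for the real controller operations, and resuming in a `finally` 
block (so a failed rebalance does not leave ingestion paused) is an assumption 
on my part, not something decided in this thread.

```java
/** Hypothetical abstraction over table operations; these are not Pinot APIs. */
interface TableOps {
  /** Assumed to force-commit the current consuming segments before returning. */
  void pauseConsumption(String tableName);

  void rebalance(String tableName);

  void resumeConsumption(String tableName);
}

/** Hypothetical orchestration for illustration only; not part of Pinot. */
final class PausedRebalanceSketch {
  private PausedRebalanceSketch() {
  }

  static void rebalanceWithPause(TableOps ops, String tableName) {
    // 1. Pause first, so no consuming segment is mid-flight during the move.
    ops.pauseConsumption(tableName);
    try {
      // 2. Move segments while nothing is consuming.
      ops.rebalance(tableName);
    } finally {
      // 3. Resume even if the rebalance threw (assumed policy).
      ops.resumeConsumption(tableName);
    }
  }
}
```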
   
   IMO Approach 2 might be neater, as the controller becomes the sole brain 
that decides when to allow a commit and when not to.
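   
   As a rough sketch of that controller-side decision: the `CommitDecision` 
enum, the `ControllerHoldSketch` class and both boolean inputs are 
hypothetical simplifications, and this is not how `SegmentCompletionManager` 
is structured today. The real check would have to live inside the existing 
segment completion flow.

```java
/** Hypothetical, simplified stand-in for the controller's reply to a server. */
enum CommitDecision {
  HOLD,
  COMMIT
}

/** Hypothetical decision logic for illustration only; not part of Pinot. */
final class ControllerHoldSketch {
  private ControllerHoldSketch() {
  }

  /**
   * Asks the server to keep holding while a partial-upsert table is being
   * rebalanced; every other case is allowed to commit.
   */
  static CommitDecision decide(boolean partialUpsertTable, boolean rebalanceInProgress) {
    if (partialUpsertTable && rebalanceInProgress) {
      return CommitDecision.HOLD;
    }
    return CommitDecision.COMMIT;
  }
}
```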
   
   cc @Jackie-Jiang @ankitsultana what are your thoughts? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

