The hypothetical concern described is about potential data resurrection:
would you still use resumable bootstrap if you knew that data deleted
during those STW pauses was improperly resurrected?

On Wed, Aug 3, 2022 at 2:40 PM Bowen Song via dev wrote:

> I have benefited from the resumable bootstrap before, and I'm in favour of
> keeping the feature around.
> I've had streaming failures due to long STW GC pauses on some
> bootstrapping nodes, and I had to resume the bootstrap once or twice in
> order to get these nodes to finish joining the cluster. They have not
> experienced any more long STW GC pauses since they joined the cluster. I
> imagine I would spend a lot of time tuning the GC parameters in order to get
> these nodes to join if the resumable bootstrapping feature were removed.
> Also, I'm not concerned about race conditions involving repairs, because
> we don't run repairs while we are adding new nodes (to minimize the
> additional load on the cluster).
> On 03/08/2022 19:46, Josh McKenzie wrote:
> Context:
> From the .yaml comment on the param I was working on adding:
> In certain environments, operators may want to disable resumable bootstrap in 
> order to avoid potential correctness violations or data loss scenarios. 
> Largely this centers around nodes going down during bootstrap, tombstones 
> being written, and potential races with repair. By default we leave this on 
> as it's been enabled for quite some time, however the option to disable it is 
> more palatable now that we have zero copy streaming as that greatly 
> accelerates
> Given zero copy streaming in the system and the general unexplored
> correctness concerns of
>, specifically
> pointed out by Jeff here:
>  I've
> been chatting w/Paulo about this and we've both concluded we think the
> functionality should be made configurable, default off (?), deprecated in
> 4.2 and then completely removed next.
> - First: anyone have any concerns with the general arc of "remove
> resumable bootstrap and decommission"?
> - Second: Should we leave them enabled by default in 4.2 or disabled?
> - Third: Should we consider revisiting older branches with this
> functionality and making it toggle-able?
> ~Josh
