> On 23 Jun 2026, at 22:40, Heikki Linnakangas <[email protected]> wrote:
> I had another fresh look at datachecksum_state.c while rebasing my
> "Interrupts vs signals" patch set, and spotted a few minor things that I
> think should be cleaned up. See commit messages for details.
Thanks for the postcommit review, all the attached patches LGTM.
> Unless I'm missing something, the last patch fixes a bug, albeit a very
> theoretical one. The crux is that when the launcher exits, the worker might
> be left running; launcher_exit sends it SIGTERM but it might not exit
> instantly. If the launcher is restarted, and it launches a new worker while
> the old one is still running, the old launcher might set the worker's result
> field in shared memory, misleading the launcher to believe that the *new*
> worker succeeded.
>
> That'd race condition would be really hard to hit in practice - I didn't even
> try to write a test - but it'd be nice to fix it. The patch adds a unique ID
> to each worker invocation to distinguish the old and new worker if both are
> running at the same time, ensuring that the old worker doesn't mess with the
> new worker's state.
+ uint64 worker_invocation_counter;
...
+ invocation = ++DataChecksumState->worker_invocation_counter;
+ DataChecksumState->worker_invocation = invocation;
This could safely be a uint32 without risking overflow in practice, but the
memory savings are fairly slim. Perhaps we should add a comment stating why
there is no overflow protection to try and mitigate LLM/static analyzer
findings being reported?
> <0002-Clarify-StartDataChecksumsWorkerLauncher-function.patch>
+typedef enum DataChecksumsWorkerOperation
+{
+ ENABLE_DATACHECKSUMS,
+ DISABLE_DATACHECKSUMS,
+} DataChecksumsWorkerOperation;
Just as a note, this clearly looks like it could be made a boolean but it was
intentionally made to support a (still hypothetical) future
VERIFY_DATACHECKSUMS operation.
--
Daniel Gustafsson