Martin Ågren <[email protected]> writes:
> If two threads have one freshly initialized string buffer each and call
> strbuf_reset on them at roughly the same time, both threads will be
> writing a '\0' to strbuf_slopbuf. That is not a problem in practice
> since it doesn't matter in which order the writes happen. But
> ThreadSanitizer will consider this a race.
>
> When compiling with GIT_THREAD_SANITIZER, avoid writing to
> strbuf_slopbuf. Let's instead assert on the first byte of strbuf_slopbuf
> being '\0', since it ensures the promised invariant of "buf[len] ==
> '\0'". (Writing to strbuf_slopbuf is normally bad, but could become even
> more bad if we stop covering it up in strbuf_reset.)
>
> Signed-off-by: Martin Ågren <[email protected]>
> ---
> strbuf.h | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/strbuf.h b/strbuf.h
> index e705b94db..295654d39 100644
> --- a/strbuf.h
> +++ b/strbuf.h
> @@ -153,7 +153,19 @@ static inline void strbuf_setlen(struct strbuf *sb,
> size_t len)
> /**
> * Empty the buffer by setting the size of it to zero.
> */
> +#ifdef GIT_THREAD_SANITIZER
> +#define strbuf_reset(sb) \
> + do { \
> + struct strbuf *_sb = sb; \
> + _sb->len = 0; \
> + if (_sb->buf == strbuf_slopbuf) \
> + assert(!strbuf_slopbuf[0]); \
> + else \
> + _sb->buf[0] = '\0'; \
> + } while (0)
> +#else
> #define strbuf_reset(sb) strbuf_setlen(sb, 0)
> +#endif
>
>
> /**
The strbuf_slopbuf[] is a shared resource that is expected by
everybody to stay a holder of a NUL. Even though it is defined as
"char [1]", it in spirit ought to be considered const. And from
that point of view, your new definition that is conditionally used
only when sanitizer is in use _is_ the more correct one than the
current "we do not care if it is slopbuf, we are writing \0 so it
will be no-op anyway" code.
I wonder if we excessively call strbuf_reset() in the real code to
make your version unacceptably expensive? If not, I somehow feel
that using this version unconditionally may be a better approach.
What happens when a caller calls "strbuf_setlen(&sb, 0)" on a strbuf
that happens to have nothing and whose buffer still points at the
slopbuf (instead of calling _reset())? Shouldn't your patch fix
that function instead, i.e. something like the following without the
above? Is that make things noticeably and measurably too expensive?
Thanks.
strbuf.h | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/strbuf.h b/strbuf.h
index 2075384e0b..1a77fe146a 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -147,7 +147,10 @@ static inline void strbuf_setlen(struct strbuf *sb, size_t
len)
if (len > (sb->alloc ? sb->alloc - 1 : 0))
die("BUG: strbuf_setlen() beyond buffer");
sb->len = len;
- sb->buf[len] = '\0';
+ if (sb->buf != strbuf_slopbuf)
+ sb->buf[len] = '\0';
+ else
+ assert(!strbuf_slopbuf[0]);
}
/**