On Mon, 4 Mar 2024, Martin Storsjö wrote:
Hi,
On Mon, 4 Mar 2024, Mateusz Mikuła wrote:
rand is not random enough and may lead to clashing temporary directories
with multiple parallel link processes as it was observed on Rust's CI.
It can be reproduced with these commands (run them all without long
pauses):
for n in {1..15000}; do rm -f lib/libLLVMAVRAsmParser.a && \
  ar qc lib/libLLVMAVRAsmParser.a \
    lib/Target/AVR/AsmParser/CMakeFiles/LLVMAVRAsmParser.dir/AVRAsmParser.cpp.obj && \
  ranlib.exe lib/libLLVMAVRAsmParser.a; done &
for n in {1..15000}; do rm -f lib/libLLVMSparcCodeGen.a && \
  ar qc lib/libLLVMSparcCodeGen.a \
    lib/Target/Sparc/CMakeFiles/LLVMSparcCodeGen.dir/*.obj && \
  ranlib.exe lib/libLLVMSparcCodeGen.a; done
echo "done"
fg
Before the patch it will fail with an error: ranlib.exe: could not create
temporary file whilst writing archive: no more archived files.
Thanks, I've run into this issue occasionally when building LLVM on msys2 as
well, but I've failed to reproduce it when I've tried to look closer at it
(as I've missed the issue that one needs to build two archives at the same
time in order to trigger it).
If the issue is that the randomness clashes, shouldn't mkstemp, as part of its
contract, retry until it finds a non-conflicting combination? But, thinking
further, is the issue that two processes end up trying the same sequence of
pseudorandom file names, which all then clash, so that mkstemp returns an
error because it was unable to find a unique file name? I guess that's
plausible. In that case, I guess this patch is fine (with Liu Hao's
suggestion), as a way to reduce the risk of running into this.
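
To illustrate the "same sequence" hypothesis above, here is a minimal sketch, not the mingw-w64 code. It relies only on the C standard's guarantee that rand() repeats its sequence for a given seed (and behaves as if srand(1) had been called when a process never seeds it). The helper names, the six-character suffix length and the contents of the letters table are assumptions made for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Fill buf with n pseudorandom characters, mimicking the rand() % 62
   selection discussed in this thread. The table contents are an
   assumption made for this sketch. */
static void fill_suffix(char *buf, size_t n)
{
    static const char letters[] =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
    for (size_t j = 0; j < n; j++)
        buf[j] = letters[rand() % 62];
    buf[n] = '\0';
}

/* Return 1 if two runs seeded identically produce the same first
   `count` candidate suffixes, 0 otherwise. Each run stands in for one
   process that starts from the same rand() state. */
static int same_seed_sequences_match(unsigned seed, int count)
{
    char a[16][7], b[16][7];
    assert(count >= 0 && count <= 16);

    srand(seed);                              /* "process A" */
    for (int i = 0; i < count; i++)
        fill_suffix(a[i], 6);

    srand(seed);                              /* "process B", same seed */
    for (int i = 0; i < count; i++)
        fill_suffix(b[i], 6);

    for (int i = 0; i < count; i++)
        if (strcmp(a[i], b[i]) != 0)
            return 0;
    return 1;
}
```

Calling same_seed_sequences_match(1, 16) returns 1, since both runs walk the same list of candidate names. This only shows that the candidates coincide; it does not by itself explain why a retry loop would give up.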
Looking closer at our mkstemp implementation, we have this loop:
  /*
    Like OpenBSD, mkstemp() will try at least 2 ** 31 combinations before
    giving up.
   */
  for (i = 0; i >= 0; i++) {
    for(j = index; j < len; j++) {
      template_name[j] = letters[rand () % 62];
    }
    fd = _sopen(template_name,
                _O_RDWR | _O_CREAT | _O_EXCL | _O_BINARY,
                _SH_DENYNO, _S_IREAD | _S_IWRITE);
    if (fd != -1) return fd;
    if (fd == -1 && errno != EEXIST) return -1;
  }
This should retry an absolutely insane number of times, so as long as one
process finds a unique file name and stops iterating, the other parallel
process should also find a unique one soon after, one would expect.
So if this fails, it looks like something is fishy here; if we have this
clash, do we hit the "if (fd == -1 && errno != EEXIST) return -1;" case
directly on the first iteration?
(Separately, it looks like the loop relies on undefined behaviour, signed
wraparound, in order to exit the loop.)
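
For the signed wraparound point, one possible shape for a loop that terminates without overflowing a signed counter is sketched below. This is not a proposed patch: retry_create, try_create_fn and fake_create are hypothetical names, and the callback merely stands in for the _sopen call so that the sketch compiles anywhere.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback standing in for one creation attempt
   (the _sopen call in the real loop). Returns a descriptor, or -1
   with errno set. */
typedef int (*try_create_fn)(void *ctx);

/* Retry up to max_attempts times. The counter is unsigned and is
   compared against an explicit bound, so the loop never depends on
   signed overflow in order to exit. */
static int retry_create(try_create_fn try_create, void *ctx,
                        unsigned long max_attempts)
{
    assert(try_create != NULL);
    for (unsigned long i = 0; i < max_attempts; i++) {
        int fd = try_create(ctx);
        if (fd != -1)
            return fd;          /* success */
        if (errno != EEXIST)
            return -1;          /* a real error, not a name clash */
    }
    errno = EEXIST;             /* every candidate name clashed */
    return -1;
}

/* Test double: reports EEXIST until *remaining reaches 0, then
   "succeeds" with the made-up descriptor 3. */
static int fake_create(void *ctx)
{
    int *remaining = ctx;
    if (*remaining > 0) {
        (*remaining)--;
        errno = EEXIST;
        return -1;
    }
    return 3;
}
```

Passing 1UL << 31 as max_attempts would mirror the bound stated in the comment, as unsigned long is at least 32 bits wide.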
// Martin
_______________________________________________
Mingw-w64-public mailing list
Mingw-w64-public@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public