On 6/20/24 7:45 PM, Andi Kleen wrote:
> There are much faster, better-optimized modern hashes than MD5 for good collision detection, especially when the hash does not need to be cryptographically secure. Pick something from smhasher. Also, perhaps the checksum should be cached in the file? I assume it's cheap to compute while writing. It could be written at the tail of the file. Then it can be read by seeking to the end, and you save that step.
Good ideas, but with only minor benefits relative to the time spent in the LTRANS phase. The current focus was to create a simple, mostly self-contained implementation that reduces LTRANS recompilations. We can make these more pervasive improvements incrementally. Just to clarify, the hashes are computed only once and then stored in the cache.
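
For reference, the tail-of-file layout suggested above can be sketched as follows. This is only an illustration of the idea, not what the patch does: the function names and the 16-byte trailer size are hypothetical, and BLAKE2b from the Python standard library merely stands in for whichever fast non-cryptographic hash would be picked from smhasher.

```python
import hashlib
import os

DIGEST_SIZE = 16  # bytes; hypothetical trailer size chosen for this sketch


def write_with_trailer(path, chunks):
    """Write the chunks, hashing them as they go out, then append the digest."""
    h = hashlib.blake2b(digest_size=DIGEST_SIZE)  # stand-in hash, an assumption
    with open(path, "wb") as f:
        for chunk in chunks:
            h.update(chunk)   # the hash is computed while writing
            f.write(chunk)
        f.write(h.digest())   # the trailer sits at the tail of the file


def read_trailer(path):
    """Fetch the stored digest by seeking to the end; the payload is not read."""
    with open(path, "rb") as f:
        f.seek(-DIGEST_SIZE, os.SEEK_END)
        return f.read(DIGEST_SIZE)
```

A reader that only needs the hash calls `read_trailer` and never touches the payload. A file shorter than the trailer makes the seek fail with `OSError`, which a caller would have to treat as a damaged file.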
> The lockfiles scare me a bit. What happens when they get lost, e.g. due to a compiler crash? You may need some recovery for that. Perhaps it would be better to make the files self-checking, so that partial files can be detected when reading, and get rid of the locks.
It uses process-associated locks via fcntl, so if the compiler crashes, the locks will be released. If the compiler process crashes and leaves a partially written file, lto-wrapper deletes it in tool_cleanup. If a file is missing, the cache entry will be deleted.

Michal
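
The claim that fcntl locks go away with the process can be checked with a small standalone experiment. The sketch below is not code from the patch: the function names are hypothetical, and it assumes a POSIX system where Python's `fcntl.lockf` maps onto fcntl record locks. A child process takes an exclusive lock and is then killed with SIGKILL to simulate a crash, and the parent probes the lock before and after.

```python
import fcntl
import os
import signal


def try_lock(fd):
    """Try to take an exclusive fcntl record lock without blocking."""
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except OSError:  # EACCES or EAGAIN: another process holds the lock
        return False


def lock_released_after_crash(path):
    """Return (held_while_alive, free_after_crash) for a lock on path."""
    ready_r, ready_w = os.pipe()  # child -> parent: "lock taken"
    go_r, go_w = os.pipe()        # parent -> child: "crash now"
    pid = os.fork()
    if pid == 0:
        try:
            # Child: take the lock, report it, then die without unlocking.
            os.close(ready_r)
            os.close(go_w)
            fd = os.open(path, os.O_RDWR)
            fcntl.lockf(fd, fcntl.LOCK_EX)
            os.write(ready_w, b"1")
            os.read(go_r, 1)
            os.kill(os.getpid(), signal.SIGKILL)  # simulated compiler crash
        finally:
            os._exit(1)  # never fall through into the parent's code

    os.close(ready_w)
    os.close(go_r)
    os.read(ready_r, 1)                  # wait until the child holds the lock
    fd = os.open(path, os.O_RDWR)
    held_while_alive = not try_lock(fd)  # the probe fails while the child lives
    os.write(go_w, b"1")
    os.waitpid(pid, 0)                   # the child is gone, killed mid-hold
    free_after_crash = try_lock(fd)      # the kernel has dropped the child's lock
    os.close(fd)                         # closing also drops the parent's lock
    os.close(ready_r)
    os.close(go_w)
    return held_while_alive, free_after_crash
```

Calling `lock_released_after_crash` on an existing scratch file should report that the lock was held while the child was alive and free once it had been killed, with no explicit recovery step in between.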