On Sun, Oct 10, 2010 at 10:51:20PM +0200, Erik Cederstrand wrote:
> Hi hackers
> 
> As a followup to the "Timestamps in static libraries" thread which resulted 
> in a '-D' option to ar(1), I'd like to discuss if it is a worthy goal of the 
> Project to create deterministic builds. By that I mean for two make 
> build+install world+kernel+distribution runs, every contained file is bitwise 
> identical between the two runs.
> 
> Deterministic builds would be useful for me, since I'm creating binary diffs 
> against lots of FreeBSD builds, and smaller diffs are good. Also, I'd like to 
> detect which files have changed between two commits. I imagine it would also 
> be useful for things like IDS and freebsd-update.
> 
> Currently, this does not hold for static libraries (*.a), kernel modules 
> (*.ko / *.ko.symbols) and the following:
> 
> bthidd
> cc1
> cc1obj
> cc1plus
> clang
> clang++
> ctfconvert
> freebsd.cf
> freebsd.submit.cf
> kernel
> kernel.symbols
> libcrypto.so.6
> libufs.so.5
> loader
> pxeboot
> sendmail.cf
> submit.cf
> tblgen
> zfsloader
> 
> Most of the libraries can be brought to be identical by using ar -D. Some 
> record the absolute OBJDIR path to header files, though (libc.a for example).
> 
> I tried adding 'D' to ARFLAGS in share/mk/sys.mk, but that's only part of the 
> solution. ARFLAGS is overridden in hundreds of places in the source code, and 
> in some places ARFLAGS isn't even used (or AR for that matter). Is it 
> worthwhile to go through the whole tree, fixing up these calls to ar? A lot 
> of this is in contrib/ code.
> 
> Another option is to add a WITH_DETERMINISTIC_AR knob to the build to compile 
> ar with D as default behaviour. This would make the above changes 
> unnecessary, but is more intrusive.
> 
> A third option is that this is not a priority for the community, or directly 
> unwanted, and that I just post-process my builds myself.
> 
> I don't know what causes the checksum difference in .ko files - there is no 
> size difference, and no difference according to strings(1). A bsdiff on the 
> two is typically around 160B.
> 
> .ko.symbols have some unique identifiers or addresses internally.
> 
> kernel, loader, zfsloader and pxeboot have a build date recorded, kernel also 
> has absolute path to GENERIC. OK for the kernel, I think, although it would 
> be easier for me if this was just stored in a separate file since binary 
> diffs on large files are expensive.
> 
> clang, clang++ and tblgen store some absolute paths to .cpp files in the src 
> repo internally, plus unique identifiers.
> 
> freebsd.cf, freebsd.submit.cf, sendmail.cf and submit.cf record the absolute 
> OBJDIR path to sendmail.
> 
> What do you think?

My personal opinion is that the feature is nice to have. Unless the changes
needed to get this working are too large and, more importantly, unless the
maintenance cost of keeping it in good shape is too high, we would certainly
be better off with deterministic build results.

Also, deterministic builds require somebody to monitor the feature, either
manually or by setting up a bot that checks it automatically. Otherwise, I
suspect, it will degrade.
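
To make the bot idea concrete, here is a rough sketch of the core check it
could run. It assumes the two build+install runs were staged into separate
directories, and it only compares file contents by SHA-256. The function
names and the command-line form are made up for illustration; nothing like
this exists in the tree.

```python
import hashlib
import os
import sys


def tree_digests(root):
    """Map each regular file's path (relative to root) to its SHA-256 digest.

    Symlinks and special files are skipped; a real checker would probably
    want to compare link targets and modes as well.
    """
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            h = hashlib.sha256()
            with open(path, "rb") as f:
                # Read in 1 MiB chunks so large files (kernel, clang) fit.
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            digests[os.path.relpath(path, root)] = h.hexdigest()
    return digests


def compare_trees(root_a, root_b):
    """Return (only_in_a, only_in_b, changed) as sorted lists of paths."""
    a = tree_digests(root_a)
    b = tree_digests(root_b)
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    changed = sorted(p for p in set(a) & set(b) if a[p] != b[p])
    return only_a, only_b, changed


# Hypothetical invocation: python3 script.py /stage/run1 /stage/run2
if __name__ == "__main__" and len(sys.argv) == 3:
    only_a, only_b, changed = compare_trees(sys.argv[1], sys.argv[2])
    for label, paths in (("only in first", only_a),
                         ("only in second", only_b),
                         ("differs", changed)):
        for p in paths:
            print(f"{label}: {p}")
    # Non-zero exit status lets a periodic job flag a regression.
    sys.exit(1 if (only_a or only_b or changed) else 0)
```

Run periodically against two builds of the same revision, a non-empty
"differs" list would be the signal that determinism has regressed.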
