Re: [pacman-dev] [PATCH 3/4] makepkg: unify source file times for improved build reproducibility

2017-05-02 Thread Levente Polyak
On 05/02/2017 10:09 PM, Andrew Gregory wrote:
> On 05/02/17 at 09:30pm, Levente Polyak wrote:
>> On 04/21/2017 07:08 AM, Allan McRae wrote:
>>> On 21/04/17 13:36, Eli Schwartz wrote:
>>>> On 04/20/2017 11:01 PM, Allan McRae wrote:
>>>>> I am probably moving this to after source extraction/prepare() running,
>>>>> so it can be skipped with --noextract.
>>>>
>>>> But --noextract depends on your having at some point previously run
>>>> --nobuild, in order to pull updated sources, re-patch any patches, etc.
>>>> which I still don't want to do manually outside of makepkg... I don't
>>>> see why makepkg should start breaking things for me.
>>>>
>>>> How about instead we guard it with
>>>> BUILDENV+=(fix_everyone_elses_SOURCE_DATE_EPOCH_stuff)
>>>>
>>>> Or better yet, just file bugs against whatever upstream build
>>>> system/programming language/source code is determined to sneakily embed
>>>> source code modification times into generated files, and call it a day?
>>>>
>>>
>>> Adding list back - any further off-list replies will be completely ignored.
>>>
>>> The reproducible builds people will provide details, but it appears
>>> pyo/pyc do this.
>>>
>>> A
>>
>> Unfortunately, it won't be possible to get reproducible builds without
>> unifying the modification times of input files. The best-known offender
>> there is indeed python and the way it determines the state of the
>> produced artifacts compared to the source code. Make uses the same
>> information, comparing the modification time of the produced artifacts
>> against that of their sources.
>> I agree make's way is nicer in that detail, as that info is contained
>> outside of the content of the produced artifacts themselves; however,
>> there are external design decisions that we need to accept exist and
>> won't change.
>>
>> So as both use cases (reproducible and incremental builds) seem to be
>> valid and wanted features, let's be cooperative and see how we can make
>> both separately work to make all of us happy. :)
>>
>> As I don't really see how both can work at the same time: one way that
>> could do the job is an option like --incremental that would not do any
>> timestamp unification. This could be used for an incremental build of a
>> VCS package in a non-cleaned environment. I think that is not too much
>> of a hassle in case someone wants to build a VCS package incrementally
>> with previous object files?
> 
> Why is setting the modification timestamp necessary?  makepkg should
> be preserving the modification timestamps of files when it extracts
> them from archives.  So two builds using the same source tarballs
> should already have files with the same timestamps.  When is this not
> sufficient?
> 
> apg
> 


Reproducibility should not be exclusive to archives that happen to
preserve timestamps. There is increasing interest in builds from
repository checkouts. On top of that, prepare() modifies files, so their
timestamps should be adjusted too before the build starts.

cheers,
Levente


Re: [pacman-dev] [PATCH 3/4] makepkg: unify source file times for improved build reproducibility

2017-05-02 Thread Andrew Gregory
On 05/02/17 at 09:30pm, Levente Polyak wrote:
> On 04/21/2017 07:08 AM, Allan McRae wrote:
> > On 21/04/17 13:36, Eli Schwartz wrote:
> >> On 04/20/2017 11:01 PM, Allan McRae wrote:
> >>> I am probably moving this to after source extraction/prepare() running,
> >>> so it can be skipped with --noextract.
> >>
> >> But --noextract depends on your having at some point previously run
> >> --nobuild, in order to pull updated sources, re-patch any patches, etc.
> >> which I still don't want to do manually outside of makepkg... I don't
> >> see why makepkg should start breaking things for me.
> >>
> >> How about instead we guard it with
> >> BUILDENV+=(fix_everyone_elses_SOURCE_DATE_EPOCH_stuff)
> >>
> >> Or better yet, just file bugs against whatever upstream build
> >> system/programming language/source code is determined to sneakily embed
> >> source code modification times into generated files, and call it a day?
> >>
> > 
> > Adding list back - any further off-list replies will be completely ignored.
> > 
> > The reproducible builds people will provide details, but it appears
> > pyo/pyc do this.
> > 
> > A
> 
> Unfortunately, it won't be possible to get reproducible builds without
> unifying the modification times of input files. The best-known offender
> there is indeed python and the way it determines the state of the
> produced artifacts compared to the source code. Make uses the same
> information, comparing the modification time of the produced artifacts
> against that of their sources.
> I agree make's way is nicer in that detail, as that info is contained
> outside of the content of the produced artifacts themselves; however,
> there are external design decisions that we need to accept exist and
> won't change.
> 
> So as both use cases (reproducible and incremental builds) seem to be
> valid and wanted features, let's be cooperative and see how we can make
> both separately work to make all of us happy. :)
> 
> As I don't really see how both can work at the same time: one way that
> could do the job is an option like --incremental that would not do any
> timestamp unification. This could be used for an incremental build of a
> VCS package in a non-cleaned environment. I think that is not too much
> of a hassle in case someone wants to build a VCS package incrementally
> with previous object files?

Why is setting the modification timestamp necessary?  makepkg should
be preserving the modification timestamps of files when it extracts
them from archives.  So two builds using the same source tarballs
should already have files with the same timestamps.  When is this not
sufficient?
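
To make the question concrete, here is a minimal sketch of the tarball
case, assuming GNU tar, touch and stat. Plain tar stands in for whatever
extractor makepkg actually calls, and the paths and the timestamp are
made up for illustration.

```shell
#!/bin/bash
# Scratch area; every name below is hypothetical.
work=$(mktemp -d)
mkdir "$work/src"
echo 'int main(void) { return 0; }' > "$work/src/main.c"

# Give the file a fixed mtime before archiving (arbitrary value).
touch -d "@1400000000" "$work/src/main.c"
tar -C "$work" -cf "$work/src.tar" src

# Extract the same tarball into two independent build directories.
mkdir "$work/build1" "$work/build2"
tar -C "$work/build1" -xf "$work/src.tar"
tar -C "$work/build2" -xf "$work/src.tar"

# tar restores the archived mtime by default, so both copies agree.
m1=$(stat -c %Y "$work/build1/src/main.c")
m2=$(stat -c %Y "$work/build2/src/main.c")
echo "build1: $m1  build2: $m2"
```

This only covers sources that arrive as archives; it says nothing about
files created by a VCS checkout or changed later by prepare().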

apg


Re: [pacman-dev] [PATCH 3/4] makepkg: unify source file times for improved build reproducibility

2017-05-02 Thread Levente Polyak
On 04/21/2017 07:08 AM, Allan McRae wrote:
> On 21/04/17 13:36, Eli Schwartz wrote:
>> On 04/20/2017 11:01 PM, Allan McRae wrote:
>>> I am probably moving this to after source extraction/prepare() running,
>>> so it can be skipped with --noextract.
>>
>> But --noextract depends on your having at some point previously run
>> --nobuild, in order to pull updated sources, re-patch any patches, etc.
>> which I still don't want to do manually outside of makepkg... I don't
>> see why makepkg should start breaking things for me.
>>
>> How about instead we guard it with
>> BUILDENV+=(fix_everyone_elses_SOURCE_DATE_EPOCH_stuff)
>>
>> Or better yet, just file bugs against whatever upstream build
>> system/programming language/source code is determined to sneakily embed
>> source code modification times into generated files, and call it a day?
>>
> 
> Adding list back - any further off-list replies will be completely ignored.
> 
> The reproducible builds people will provide details, but it appears
> pyo/pyc do this.
> 
> A
> 

Unfortunately, it won't be possible to get reproducible builds without
unifying the modification times of input files. The best-known offender
there is indeed python and the way it determines the state of the
produced artifacts compared to the source code. Make uses the same
information, comparing the modification time of the produced artifacts
against that of their sources.
I agree make's way is nicer in that detail, as that info is contained
outside of the content of the produced artifacts themselves; however,
there are external design decisions that we need to accept exist and
won't change.

So as both use cases (reproducible and incremental builds) seem to be
valid and wanted features, let's be cooperative and see how we can make
both separately work to make all of us happy. :)

As I don't really see how both can work at the same time: one way that
could do the job is an option like --incremental that would not do any
timestamp unification. This could be used for an incremental build of a
VCS package in a non-cleaned environment. I think that is not too much
of a hassle in case someone wants to build a VCS package incrementally
with previous object files?
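
A rough sketch of what such a guard could look like, assuming GNU find,
touch and stat. The INCREMENTAL variable and the unify_source_times
helper are hypothetical names for illustration only; neither exists in
makepkg, and no option parsing is shown.

```shell
#!/bin/bash
# Hypothetical switch: 1 would correspond to the proposed --incremental.
INCREMENTAL=0

# Hypothetical helper, not part of makepkg.
unify_source_times() {
	local dir=$1
	if (( INCREMENTAL )); then
		# Leave timestamps alone so previous object files stay usable.
		return 0
	fi
	find "$dir" -exec touch -h -d "@${SOURCE_DATE_EPOCH}" {} +
}

# Demo on a scratch directory with arbitrary timestamps.
SOURCE_DATE_EPOCH=1492430582
srcdir=$(mktemp -d)
touch -d "@1400000000" "$srcdir/foo.c"

INCREMENTAL=1
unify_source_times "$srcdir"
kept=$(stat -c %Y "$srcdir/foo.c")

INCREMENTAL=0
unify_source_times "$srcdir"
unified=$(stat -c %Y "$srcdir/foo.c")

echo "incremental: $kept  default: $unified"
```

With the switch set, the file keeps its original time; without it, the
file takes the value of SOURCE_DATE_EPOCH.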

cheers,
Levente


Re: [pacman-dev] [PATCH 3/4] makepkg: unify source file times for improved build reproducibility

2017-04-20 Thread Allan McRae
On 21/04/17 13:36, Eli Schwartz wrote:
> On 04/20/2017 11:01 PM, Allan McRae wrote:
>> I am probably moving this to after source extraction/prepare() running,
>> so it can be skipped with --noextract.
> 
> But --noextract depends on your having at some point previously run
> --nobuild, in order to pull updated sources, re-patch any patches, etc.
> which I still don't want to do manually outside of makepkg... I don't
> see why makepkg should start breaking things for me.
> 
> How about instead we guard it with
> BUILDENV+=(fix_everyone_elses_SOURCE_DATE_EPOCH_stuff)
> 
> Or better yet, just file bugs against whatever upstream build
> system/programming language/source code is determined to sneakily embed
> source code modification times into generated files, and call it a day?
> 

Adding list back - any further off-list replies will be completely ignored.

The reproducible builds people will provide details, but it appears
pyo/pyc do this.

A


Re: [pacman-dev] [PATCH 3/4] makepkg: unify source file times for improved build reproducibility

2017-04-20 Thread Eli Schwartz
On 04/17/2017 08:03 AM, Allan McRae wrote:
> From: Levente Polyak 
[...]
>  run_build() {
> + # unify source times before building for reproducibility
> + find "$srcdir" -exec touch -h -d "@${SOURCE_DATE_EPOCH}" {} \;
> +
>   run_function_safe "build"
>  }

Just a general question on "why do we want this" (and the followup patch
to do this at the beginning of the package() step)...

It is one thing for makepkg to fiddle with its own internal logic to
respect SOURCE_DATE_EPOCH with regard to package metadata, installed
file modification times, etc. but as mentioned in the other thread, it
is not makepkg's job to ensure that, for example, python's compiled
bytecode respects SOURCE_DATE_EPOCH. Any build/generation process that
changes its own output based on the reported date of the source files,
is doing something wrong anyway.

Moreover, this breaks incremental builds by making the build system
think all files have been modified and must be recompiled.
Incremental builds are currently a perfectly valid use case for e.g.
*-git or other devel packages (assuming one is building for their own
computer, isn't worried about automagic/non-clean-chroot dependencies,
and is reasonably confident that the build system in question doesn't
fall on its face when asked to do incremental builds).
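
A minimal sketch of how unified times interfere with a tool that decides
by comparing mtimes, assuming GNU find and touch. The needs_rebuild
helper is a hypothetical stand-in for a make-style staleness check, and
the file names and times are made up. How a real build system reacts
depends on the tool and on the value of SOURCE_DATE_EPOCH (some may
rebuild everything instead); this only shows the plain comparison case.

```shell
#!/bin/bash
work=$(mktemp -d)
touch -d "@1400000100" "$work/foo.o"   # object from a previous build
touch -d "@1400000200" "$work/foo.c"   # source edited after that build

# Hypothetical make-style rule: stale when the source is newer.
needs_rebuild() { [[ $1 -nt $2 ]]; }

if needs_rebuild "$work/foo.c" "$work/foo.o"; then before=stale; else before=fresh; fi

# Same command shape as the patch, applied to the whole directory.
SOURCE_DATE_EPOCH=1492430582
find "$work" -exec touch -h -d "@${SOURCE_DATE_EPOCH}" {} +

# Source and object now share one mtime, so the ordering is gone.
if needs_rebuild "$work/foo.c" "$work/foo.o"; then after=stale; else after=fresh; fi

echo "before: $before  after: $after"
```

Either way, the information an incremental build relies on is no longer
there once every file carries the same timestamp.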

I very much do not want this to be accepted. :)

-- 
Eli Schwartz





Re: [pacman-dev] [PATCH 3/4] makepkg: unify source file times for improved build reproducibility

2017-04-17 Thread Dave Reisner
On Mon, Apr 17, 2017 at 10:03:02PM +1000, Allan McRae wrote:
> From: Levente Polyak 
> 
> Signed-off-by: Allan McRae 
> ---
>  scripts/makepkg.sh.in | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/scripts/makepkg.sh.in b/scripts/makepkg.sh.in
> index 7692ade5..df4d6a06 100644
> --- a/scripts/makepkg.sh.in
> +++ b/scripts/makepkg.sh.in
> @@ -475,6 +475,9 @@ run_prepare() {
>  }
>  
>  run_build() {
> + # unify source times before building for reproducibility
> + find "$srcdir" -exec touch -h -d "@${SOURCE_DATE_EPOCH}" {} \;
> +

I'd use the '{} +' form of find here to avoid excessive forking.
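
Roughly like this, as a sketch assuming GNU find, touch and stat. The
srcdir and SOURCE_DATE_EPOCH values are set by hand here only so the
snippet runs on its own; in makepkg they would already be defined.

```shell
#!/bin/bash
# Stand-in values for illustration only.
srcdir=$(mktemp -d)
SOURCE_DATE_EPOCH=1492430582
mkdir -p "$srcdir/pkg/sub"
touch "$srcdir/pkg/a.c" "$srcdir/pkg/sub/b.py"

# '{} +' hands many paths to each touch call instead of forking
# one touch per file as '{} \;' does.
find "$srcdir" -exec touch -h -d "@${SOURCE_DATE_EPOCH}" {} +

# Every entry should now report the same mtime.
mtimes=$(find "$srcdir" -exec stat -c %Y {} + | sort -u)
echo "$mtimes"
```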

>   run_function_safe "build"
>  }
>  
> -- 
> 2.12.0


[pacman-dev] [PATCH 3/4] makepkg: unify source file times for improved build reproducibility

2017-04-17 Thread Allan McRae
From: Levente Polyak 

Signed-off-by: Allan McRae 
---
 scripts/makepkg.sh.in | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/scripts/makepkg.sh.in b/scripts/makepkg.sh.in
index 7692ade5..df4d6a06 100644
--- a/scripts/makepkg.sh.in
+++ b/scripts/makepkg.sh.in
@@ -475,6 +475,9 @@ run_prepare() {
 }
 
 run_build() {
+   # unify source times before building for reproducibility
+   find "$srcdir" -exec touch -h -d "@${SOURCE_DATE_EPOCH}" {} \;
+
    run_function_safe "build"
 }
 
-- 
2.12.0