Bug#867982: libfile-stripnondeterminism-perl: [PATCH] Optimize load times of File::StripNondeterminism

2017-07-10 Thread Niels Thykier
Package: libfile-stripnondeterminism-perl
Version: 0.035-2
Severity: normal
Tags: upstream patch

Hi,

Thanks to a very minimal performance test case for dh from Adam
Borowski, I realised that dh_strip_nondeterminism accounts for ~4.5%
of the total runtime of a (mostly) no-op dh package build[1].  This
cost applies even to packages for which dh_strip_nondeterminism does
not normalize any files.

Attached are a few patches that reduce the startup time a bit (feel
free to squash them into a single commit).  In my simplified testing,
the startup time in the no-op case drops to ~0.075s (down from
~0.170s).  The impact should be minimal in the case where
dh_strip_nondeterminism actually needs to do something.

Thanks,
~Niels

[1] dpkg-buildpackage -us -uc -tc -b
From 5a1ffdd72f55b17b11869b10a85529a0048fec91 Mon Sep 17 00:00:00 2001
From: Niels Thykier 
Date: Mon, 10 Jul 2017 20:19:54 +
Subject: [PATCH 1/3] File::SND: Lazy load most handlers

This reduces the start up time of dh_strip_nondeterminism to ~0.10s
from ~0.17s in a "no-op" case.

Signed-off-by: Niels Thykier 
---
 lib/File/StripNondeterminism.pm | 73 -
 1 file changed, 42 insertions(+), 31 deletions(-)

diff --git a/lib/File/StripNondeterminism.pm b/lib/File/StripNondeterminism.pm
index c29d4df..c153b0e 100644
--- a/lib/File/StripNondeterminism.pm
+++ b/lib/File/StripNondeterminism.pm
@@ -22,16 +22,9 @@ use strict;
 use warnings;
 
 use POSIX qw(tzset);
-use File::StripNondeterminism::handlers::ar;
-use File::StripNondeterminism::handlers::cpio;
-use File::StripNondeterminism::handlers::gettext;
-use File::StripNondeterminism::handlers::gzip;
-use File::StripNondeterminism::handlers::jar;
 use File::StripNondeterminism::handlers::javadoc;
 use File::StripNondeterminism::handlers::pearregistry;
-use File::StripNondeterminism::handlers::png;
 use File::StripNondeterminism::handlers::javaproperties;
-use File::StripNondeterminism::handlers::zip;
 
 our($VERSION, $canonical_time, $clamp_time);
 
@@ -59,29 +52,29 @@ sub get_normalizer_for_file($) {
 
 	# ar
 	if (m/\.a$/ && _get_file_type($_) =~ m/ar archive/) {
-		return \&File::StripNondeterminism::handlers::ar::normalize;
+		return _handler('ar');
 	}
 	# cpio
 	if (m/\.cpio$/ && _get_file_type($_) =~ m/cpio archive/) {
-		return \&File::StripNondeterminism::handlers::cpio::normalize;
+		return _handler('cpio');
 	}
 	# gettext
 	if (m/\.g?mo$/ && _get_file_type($_) =~ m/GNU message catalog/) {
-		return \&File::StripNondeterminism::handlers::gettext::normalize;
+		return _handler('gettext');
 	}
 	# gzip
 	if (m/\.(gz|dz)$/ && _get_file_type($_) =~ m/gzip compressed data/) {
-		return \&File::StripNondeterminism::handlers::gzip::normalize;
+		return _handler('gzip');
 	}
 	# jar
 	if (m/\.(jar|war|hpi|apk)$/
 		&& _get_file_type($_) =~ m/(Java|Zip) archive data/) {
-		return \&File::StripNondeterminism::handlers::jar::normalize;
+		return _handler('jar');
 	}
 	# javadoc
 	if (m/\.html$/
 		&& File::StripNondeterminism::handlers::javadoc::is_javadoc_file($_)) {
-		return \&File::StripNondeterminism::handlers::javadoc::normalize;
+		return _handler('javadoc');
 	}
 	# pear registry
 	if (
@@ -89,11 +82,11 @@ sub get_normalizer_for_file($) {
 		&& File::StripNondeterminism::handlers::pearregistry::is_registry_file(
 			$_)
 	  ) {
-		return \&File::StripNondeterminism::handlers::pearregistry::normalize;
+		return _handler('pearregistry');
 	}
 	# PNG
 	if (m/\.png$/ && _get_file_type($_) =~ m/PNG image data/) {
-		return \&File::StripNondeterminism::handlers::png::normalize;
+		return _handler('png');
 	}
 	# pom.properties, version.properties
 	if (
@@ -101,32 +94,50 @@ sub get_normalizer_for_file($) {
 		&& File::StripNondeterminism::handlers::javaproperties::is_java_properties_file(
 			$_)
 	  ) {
-		return
-		  \&File::StripNondeterminism::handlers::javaproperties::normalize;
+		return _handler('javaproperties');
 	}
 	# zip
 	if (m/\.(zip|pk3|epub|whl|xpi|htb|zhfst|par)$/
 		&& _get_file_type($_) =~ m/Zip archive data|EPUB document/) {
-		return \&File::StripNondeterminism::handlers::zip::normalize;
+		return _handler('zip');
 	}
 	return undef;
 }
 
-our %typemap = (
-	ar => \&File::StripNondeterminism::handlers::ar::normalize,
-	cpio => \&File::StripNondeterminism::handlers::cpio::normalize,
-	gettext =>

Please review the draft for week 115's blog post

2017-07-10 Thread Ximin Luo
Hi all,

This week's blog post draft is available for review:

https://reproducible.alioth.debian.org/blog/drafts/115/

Feel free to commit fixes directly to drafts/115.mdwn in

https://anonscm.debian.org/git/reproducible/blog.git/

I'll wait at least 24 hours from the time of this email for any comments, and 
if everything is good then I will publish it soon after that.

X

-- 
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git

___
Reproducible-builds mailing list
Reproducible-builds@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/reproducible-builds


Re: Dealing with version.texi and mdate-sh

2017-07-10 Thread Ximin Luo
Eric Dorland:
> Hi folks,
> 
> I was trying to fix some unreproducibility issues with automake when the
> problem of version.texi came to my attention. I haven't seen it come up
> before, but let me know if I just couldn't find it.
> 
> According to
> https://www.gnu.org/software/automake/manual/html_node/Texinfo.html,
> if your .texi includes version.texi it will generate version.texi
> based on the output of mdate-sh on the .texi file (aka whatever the
> modification date of the file is). Obviously that's bad for build
> reproducibility. 
> 
> My thinking on how to fix this would be to add a flag to mdate-sh to
> use SOURCE_DATE_EPOCH if it's available and make automake use that
> flag when generating version.texi. Is there a better approach? Should
> the unpacked source package just have all of the files' modification
> dates set to SOURCE_DATE_EPOCH?
> 

Hi Eric, it's better to just directly use SOURCE_DATE_EPOCH if it's available 
without requiring an extra command-line flag, so package maintainers don't have 
to add these extra flags anywhere.
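
As a rough sketch of that behaviour (in Python, not the shell that mdate-sh 
is written in), a date helper can consult SOURCE_DATE_EPOCH first and fall 
back to the file's modification time only when the variable is unset. The 
function name document_date and the ISO output format are assumptions made 
for illustration; they are not what mdate-sh prints.

```python
import os
import time


def document_date(path, environ=None):
    """Return a UTC date string for `path`.

    Prefers SOURCE_DATE_EPOCH (seconds since the Unix epoch) when it is
    set; otherwise falls back to the file's modification time.
    """
    if environ is None:
        environ = os.environ
    epoch = environ.get("SOURCE_DATE_EPOCH")
    if epoch:
        timestamp = int(epoch)
    else:
        timestamp = os.stat(path).st_mtime
    return time.strftime("%Y-%m-%d", time.gmtime(timestamp))
```

Because the variable is read from the environment, a build that exports 
SOURCE_DATE_EPOCH gets the reproducible date without any extra flag.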

With an unpacked source package, what we generally do is reset file modtimes 
only if they are newer than SOURCE_DATE_EPOCH, and otherwise keep them as-is. 
We've been calling this "clamping". If you're creating tar files, you can use 
GNU tar's --clamp-mtime option (together with --mtime) rather than touching the 
files directly.
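
The clamping rule can be sketched as follows. This is a minimal Python 
illustration; the function name clamp_mtime is hypothetical and the code is 
not taken from any existing tool.

```python
import os


def clamp_mtime(path, source_date_epoch):
    """Lower the mtime of `path` to `source_date_epoch` if it is newer.

    Older timestamps are left untouched. Returns True if the file
    was changed.
    """
    st = os.stat(path)
    if st.st_mtime > source_date_epoch:
        # Keep the access time, replace only the modification time.
        os.utime(path, (st.st_atime, source_date_epoch))
        return True
    return False
```

Run over every file in an unpacked tree, this leaves files older than 
SOURCE_DATE_EPOCH alone and only pulls back the ones touched afterwards.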

X

-- 
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git
