[Python-modules-team] Bug#795976: sphinx: please make the build reproducible (timestamps, randomness)

2015-08-25 Thread Val Lorentz
Hi,

I can't reproduce the unreproducibility on my computer.

However, reading debian/rules, I think setting PYTHONHASHSEED=0 when
calling dh_sphinxdoc should work. Actually, modifying dh_sphinxdoc to
set this variable may be better (in case someone copy-pastes it to their
own package).
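
For anyone who wants to see the effect in isolation, here is a minimal sketch (not taken from the Sphinx sources; the set contents and the helper name set_order are made up for illustration). It runs a tiny child interpreter several times and records the iteration order of a set of strings under a given PYTHONHASHSEED:

```python
import os
import subprocess
import sys

# Child program: print the iteration order of a small set of strings.
CHILD = "print(list({'py', 'std', 'js', 'c', 'cpp', 'rst'}))"


def set_order(seed):
    """Run CHILD in a fresh interpreter with the given PYTHONHASHSEED."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run([sys.executable, "-c", CHILD], env=env,
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()


# With a fixed seed, every run reports the same order.
same_seed_runs = {set_order(0) for _ in range(3)}
print(len(same_seed_runs))

# With different seeds, the order usually (not always) changes.
print(len({set_order(seed) for seed in range(5)}))
```

A fixed seed only pins the order down; it does not make it meaningful, which is why sorting explicitly is the more robust fix where it is possible.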

If it does not work, try the attached package.


On 25/08/2015 14:13, Dmitry Shachnev wrote:
> Hi Val,
>
> Looks like my latest upload is still not reproducible: the debbindiff [1]
> reports differing contents of searchindex.js.
>
> Do you know if this can be fixed by yet another PYTHONHASHSEED=0 setting,
> or is this more complicated?
>
> [1]: https://reproducible.debian.net/dbd/unstable/amd64/sphinx_1.3.1-5.debbindiff.html
>
> --
> Dmitry Shachnev

diff -u -r sphinx-1.3.1.old/sphinx/search/__init__.py sphinx-1.3.1/sphinx/search/__init__.py
--- sphinx-1.3.1.old/sphinx/search/__init__.py	2015-08-25 12:52:11.119163557 +
+++ sphinx-1.3.1/sphinx/search/__init__.py	2015-08-25 13:11:28.671207670 +
@@ -259,9 +259,9 @@
         rv = {}
         otypes = self._objtypes
         onames = self._objnames
-        for domainname, domain in iteritems(self.env.domains):
+        for domainname, domain in sorted(iteritems(self.env.domains)):
             for fullname, dispname, type, docname, anchor, prio in \
-                    domain.get_objects():
+                    sorted(domain.get_objects()):
                 # XXX use dispname?
                 if docname not in fn2index:
                     continue
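
To show what the two sorted() calls buy us, here is a small self-contained sketch. FakeDomain and collect are hypothetical stand-ins for the real domain objects and for the loop in sphinx/search/__init__.py; only the iteration pattern is the same as in the patch.

```python
class FakeDomain:
    """Hypothetical stand-in for a Sphinx domain."""

    def __init__(self, objects):
        self._objects = objects

    def get_objects(self):
        # A set of tuples has no useful iteration order to rely on.
        return set(self._objects)


def collect(domains):
    """Walk domains and their objects in sorted order, as the patch does."""
    rows = []
    # Domain names are unique dict keys, so the (name, domain) tuples are
    # ordered by name alone and the domain objects are never compared.
    for domainname, domain in sorted(domains.items()):
        for fullname, dispname, type_, docname, anchor, prio in \
                sorted(domain.get_objects()):
            rows.append((domainname, fullname, docname))
    return rows


objs_py = [("os.path", "os.path", "module", "library", "module-os.path", 0),
           ("abc", "abc", "module", "library", "module-abc", 0)]
objs_std = [("index", "Index", "label", "genindex", "", -1)]

first = collect({"py": FakeDomain(objs_py), "std": FakeDomain(objs_std)})
second = collect({"std": FakeDomain(objs_std[::-1]),
                  "py": FakeDomain(objs_py[::-1])})
print(first == second)
```

Whatever order the domains and objects arrive in, the rows come out the same, so anything derived from them (such as searchindex.js) no longer depends on hash ordering.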



signature.asc
Description: OpenPGP digital signature
___
Python-modules-team mailing list
Python-modules-team@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/python-modules-team

[Python-modules-team] Bug#795976: sphinx: please make the build reproducible (timestamps, randomness)

2015-08-25 Thread Dmitry Shachnev
Hi Val,

Looks like my latest upload is still not reproducible: the debbindiff [1]
reports differing contents of searchindex.js.

Do you know if this can be fixed by yet another PYTHONHASHSEED=0 setting,
or is this more complicated?

[1]: https://reproducible.debian.net/dbd/unstable/amd64/sphinx_1.3.1-5.debbindiff.html

--
Dmitry Shachnev


[Python-modules-team] Bug#795976: sphinx: please make the build reproducible (timestamps, randomness)

2015-08-19 Thread Dmitry Shachnev
Hi Val,

On Tue, 18 Aug 2015 15:07:32 +0200, Val Lorentz wrote:
> The attached patch removes build timestamp from the output
> documentation, makes domains sorted in HTML documentation, and makes
> generated automata (and their pickle dump) deterministic. Once applied,
> sphinx (and packages using sphinx) can be built reproducibly in our
> current experimental framework.

Thanks a lot for the patch! This was actually in my TODO list, and you
helped me to make it a bit shorter :)

A couple of questions though:

* Will you forward the upstream part to upstream, or should I do that?
  It is as simple as a pull request to https://github.com/sphinx-doc/sphinx.

* The PYTHONHASHSEED setting in debian/rules was there on purpose
  (needed to make sure the tests pass with it, as there was a bug in earlier
  versions when they didn't). Is it possible to keep the old value?
  Maybe you can use something other than the builtin hash()?

  If it's impossible, then debian/rules should set PYTHONHASHSEED=random
  explicitly when running tests.
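
  To make the "something other than hash()" idea concrete, here is a
  minimal sketch. It is not what the patch does; stable_hash is a made-up
  helper, and SHA-1 is just one convenient digest that does not depend on
  PYTHONHASHSEED.

```python
import hashlib


def stable_hash(labels):
    """Hash a sequence of strings without involving the builtin hash()."""
    digest = hashlib.sha1()
    for label in labels:
        digest.update(label.encode("utf-8"))
        # Separator, so that ("ab", "c") and ("a", "bc") hash differently.
        digest.update(b"\0")
    # __hash__ must return an int; the first 8 bytes are plenty here.
    return int.from_bytes(digest.digest()[:8], "big")


print(stable_hash(["NAME", "NUMBER"]) == stable_hash(("NAME", "NUMBER")))
```

  A __hash__ method built on such a helper would return the same value in
  every interpreter run, whatever PYTHONHASHSEED is set to.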

--
Dmitry Shachnev


[Python-modules-team] Bug#795976: sphinx: please make the build reproducible (timestamps, randomness)

2015-08-18 Thread Val Lorentz
Source: sphinx
Version: 1.3.1-4
Severity: wishlist
Tags: patch
User: reproducible-bui...@lists.alioth.debian.org
Usertags: timestamps randomness
X-Debbugs-Cc: reproducible-bui...@lists.alioth.debian.org

Hi!

While working on the “reproducible builds” effort [1], we have noticed
that sphinx could not be built reproducibly.

The attached patch removes build timestamp from the output
documentation, makes domains sorted in HTML documentation, and makes
generated automata (and their pickle dump) deterministic. Once applied,
sphinx (and packages using sphinx) can be built reproducibly in our
current experimental framework.
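
As a rough illustration of the timestamp part (a sketch only: build_date and its parameters are made-up names, and this shows one way a generator could honour the SOURCE_DATE_EPOCH variable, not a copy of what the attached patch does):

```python
import os
import time


def build_date(fmt="%Y-%m-%d", environ=os.environ):
    """Format the build date, preferring SOURCE_DATE_EPOCH when it is set."""
    epoch = environ.get("SOURCE_DATE_EPOCH")
    if epoch is not None:
        # Fixed timestamp, read as UTC so the local timezone cannot leak in.
        return time.strftime(fmt, time.gmtime(int(epoch)))
    # No override: fall back to the current local time.
    return time.strftime(fmt)


print(build_date(environ={"SOURCE_DATE_EPOCH": "0"}))
```

With the variable exported from debian/rules, two builds of the same source version would then embed the same date.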

 [1]: https://wiki.debian.org/ReproducibleBuilds

Regards,
Val
diff -ru sphinx-1.3.1.old/debian/rules sphinx-1.3.1/debian/rules
--- sphinx-1.3.1.old/debian/rules	2015-08-17 17:41:44.55734 +
+++ sphinx-1.3.1/debian/rules	2015-08-18 12:26:01.040804815 +
@@ -3,11 +3,14 @@
 
 include /usr/share/python/python.mk
 
+export SOURCE_DATE_EPOCH = $(shell date -d "$$(dpkg-parsechangelog --count 1 -SDate)" +%s)
 export NO_PKG_MANGLE=1
 export PYTHONWARNINGS=d
-export PYTHONHASHSEED=random
 export http_proxy=http://127.0.0.1:9/
 
+# For deterministic pickling
+export PYTHONHASHSEED=0
+
 here = $(dir $(firstword $(MAKEFILE_LIST)))/..
 debian_version = $(word 2,$(shell cd $(here) && dpkg-parsechangelog | grep ^Version:))
 upstream_version = $(subst ~,,$(firstword $(subst -, ,$(debian_version))))
diff -ru sphinx-1.3.1.old/setup.py sphinx-1.3.1/setup.py
--- sphinx-1.3.1.old/setup.py	2015-08-17 17:41:44.55734 +
+++ sphinx-1.3.1/setup.py	2015-08-18 11:41:20.0 +
@@ -162,7 +162,7 @@
                     messages=jscatalog,
                     plural_expr=catalog.plural_expr,
                     locale=str(catalog.locale)
-                ), outfile)
+                ), outfile, sort_keys=True)
                 outfile.write(');')
             finally:
                 outfile.close()
diff -ru sphinx-1.3.1.old/sphinx/builders/html.py sphinx-1.3.1/sphinx/builders/html.py
--- sphinx-1.3.1.old/sphinx/builders/html.py	2015-08-17 17:41:44.56534 +
+++ sphinx-1.3.1/sphinx/builders/html.py	2015-08-17 19:46:48.0 +
@@ -824,7 +824,7 @@
                      u'# The remainder of this file is compressed using zlib.\n'
                      % (self.config.project, self.config.version)).encode('utf-8'))
             compressor = zlib.compressobj(9)
-            for domainname, domain in iteritems(self.env.domains):
+            for domainname, domain in sorted(self.env.domains.items()):
                 for name, dispname, type, docname, anchor, prio in \
                         sorted(domain.get_objects()):
                     if anchor.endswith(name):
diff -ru sphinx-1.3.1.old/sphinx/pycode/pgen2/pgen.py sphinx-1.3.1/sphinx/pycode/pgen2/pgen.py
--- sphinx-1.3.1.old/sphinx/pycode/pgen2/pgen.py	2015-08-17 17:41:44.56134 +
+++ sphinx-1.3.1/sphinx/pycode/pgen2/pgen.py	2015-08-18 12:06:30.0 +
@@ -4,6 +4,7 @@
 from __future__ import print_function
 
 from six import iteritems
+from collections import OrderedDict
 
 # Pgen imports
 
@@ -57,7 +58,7 @@
     def make_first(self, c, name):
         rawfirst = self.first[name]
         first = {}
-        for label in rawfirst:
+        for label in sorted(rawfirst):
             ilabel = self.make_label(c, label)
             ##assert ilabel not in first # X X X failed on <> ... !=
             first[ilabel] = 1
@@ -138,8 +139,8 @@
                 totalset[label] = 1
                 overlapcheck[label] = {label: 1}
         inverse = {}
-        for label, itsfirst in iteritems(overlapcheck):
-            for symbol in itsfirst:
+        for label, itsfirst in sorted(overlapcheck.items()):
+            for symbol in sorted(itsfirst):
                 if symbol in inverse:
                     raise ValueError("rule %s is ambiguous; %s is in the"
                                      " first sets of %s as well as %s" %
@@ -349,6 +350,9 @@
         assert isinstance(next, NFAState)
         self.arcs.append((label, next))
 
+    def __hash__(self):
+        return hash(tuple(x[0] for x in self.arcs))
+
 class DFAState(object):
 
     def __init__(self, nfaset, final):
@@ -357,7 +361,10 @@
         assert isinstance(final, NFAState)
         self.nfaset = nfaset
         self.isfinal = final in nfaset
-        self.arcs = {} # map from label to DFAState
+        self.arcs = OrderedDict() # map from label to DFAState
+
+    def __hash__(self):
+        return hash(tuple(self.arcs))
 
     def addarc(self, next, label):
         assert isinstance(label, str)
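
The serialisation side of the patch (sort_keys=True for the JSON catalog, ordered mappings for the pickled automata) can be sketched as follows. This is an illustration under assumptions, not code from Sphinx: canonical is a made-up helper and the token names are arbitrary sample data.

```python
import json
import pickle
from collections import OrderedDict


def canonical(mapping):
    """Return an OrderedDict holding the mapping's items in sorted key order."""
    return OrderedDict(sorted(mapping.items()))


# Same content, built in two different insertion orders.
a = {"NAME": 1, "NUMBER": 2, "STRING": 3}
b = {"STRING": 3, "NUMBER": 2, "NAME": 1}

# JSON: sort_keys=True makes the text independent of the key order.
json_equal = json.dumps(a, sort_keys=True) == json.dumps(b, sort_keys=True)

# pickle writes items in iteration order, so normalise the order first.
pickle_equal = (pickle.dumps(canonical(a), 2) ==
                pickle.dumps(canonical(b), 2))

print(json_equal, pickle_equal)
```

Without the normalisation step, the two pickles may differ byte for byte even though the mappings compare equal, which is exactly the kind of difference debbindiff reports.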

