Bug#1010957: status update? Re: Bug#1010957: man-db: unreproducible index.db: contents depend on directory read order

2022-10-03 Thread Holger Levsen
On Sun, Oct 02, 2022 at 04:00:58PM +0100, Colin Watson wrote:
> Control: tag -1 fixed-upstream
> Success!
>   https://gitlab.com/cjwatson/man-db/-/compare/5d2594d0a0...866c3571d3

awesome!

On Sun, Oct 02, 2022 at 05:56:19PM +0100, Colin Watson wrote:
> I thought I'd set SOURCE_DATE_EPOCH, but I'd failed to pass it through
> sudo.  After fixing that, I indeed get cmp-identical tarballs.

very nice! much cheers!


-- 
cheers,
Holger

 ⢀⣴⠾⠻⢶⣦⠀
 ⣾⠁⢠⠒⠀⣿⡁  holger@(debian|reproducible-builds|layer-acht).org
 ⢿⡄⠘⠷⠚⠋⠀  OpenPGP: B8BF54137B09D35CF026FE9D 091AB856069AAA1C
 ⠈⠳⣄

Plastic bottles: made to last forever, designed to throw away.




Bug#1010957: status update? Re: Bug#1010957: man-db: unreproducible index.db: contents depend on directory read order

2022-10-02 Thread Colin Watson
On Sun, Oct 02, 2022 at 05:50:07PM +0200, Johannes Schauer Marin Rodrigues wrote:
> Quoting Colin Watson (2022-10-02 17:00:58)
> > As well as more localized testing, I built a .deb with this and used
> > josch's instructions from the start of this bug to build mmdebstrap
> > tarballs via disorderfs, using
> > "--hook-dir=/usr/share/mmdebstrap/hooks/file-mirror-automount
> > --include=./man-db_2.10.3~20221002-1_amd64.deb" to inject the new .deb.
> > The two resulting tarballs had somewhat differing file lists (timestamps
> > etc.), but all the actual files in the tarballs were bitwise-identical.
> 
> Did you maybe forget the "export SOURCE_DATE_EPOCH=XXX" step? Just replace XXX
> with the output of `date +%s` but make sure that both mmdebstrap invocations
> see the same value for SOURCE_DATE_EPOCH and then there should be zero
> differences and a "cmp" should be sufficient to make sure that it works.

I thought I'd set SOURCE_DATE_EPOCH, but I'd failed to pass it through
sudo.  After fixing that, I indeed get cmp-identical tarballs.
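A minimal Python sketch of the pitfall described above: sudo normally resets the environment, so a variable exported in the calling shell does not reach the command unless it is passed explicitly. The helper name sudo_with_epoch and the command build-tarball are hypothetical, and the message does not say which mechanism was actually used for the fix; prefixing the command with "env VAR=value" is just one common way to do it.

```python
import shlex


def sudo_with_epoch(cmd, epoch):
    """Return an argv list that runs cmd under sudo with SOURCE_DATE_EPOCH set.

    The variable is handed to env(1) on the command line, so it is set for
    the command even when sudo resets the caller's environment.
    """
    return ["sudo", "env", f"SOURCE_DATE_EPOCH={int(epoch)}", *cmd]


# Hypothetical command; substitute the real build invocation.
argv = sudo_with_epoch(["build-tarball", "out.tar"], 1664668800)
command_line = shlex.join(argv)  # shell-quoted form, handy for logging
```

The same argv list can be handed to subprocess.run() once for each build, so that both invocations see the same value.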

-- 
Colin Watson (he/him)  [cjwat...@debian.org]



Bug#1010957: status update? Re: Bug#1010957: man-db: unreproducible index.db: contents depend on directory read order

2022-10-02 Thread Johannes Schauer Marin Rodrigues
Quoting Colin Watson (2022-10-02 17:00:58)
> Success!
> 
>   https://gitlab.com/cjwatson/man-db/-/compare/5d2594d0a0...866c3571d3

Thank you!! :D

> 
> As well as more localized testing, I built a .deb with this and used
> josch's instructions from the start of this bug to build mmdebstrap
> tarballs via disorderfs, using
> "--hook-dir=/usr/share/mmdebstrap/hooks/file-mirror-automount
> --include=./man-db_2.10.3~20221002-1_amd64.deb" to inject the new .deb.
> The two resulting tarballs had somewhat differing file lists (timestamps
> etc.), but all the actual files in the tarballs were bitwise-identical.

Did you maybe forget the "export SOURCE_DATE_EPOCH=XXX" step? Just replace XXX
with the output of `date +%s` but make sure that both mmdebstrap invocations
see the same value for SOURCE_DATE_EPOCH and then there should be zero
differences and a "cmp" should be sufficient to make sure that it works.
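The procedure described above (same SOURCE_DATE_EPOCH for both builds, then a byte-for-byte comparison) can be sketched in Python roughly as follows. The function name build_twice_and_compare and its make_cmd callback are hypothetical; the sketch only assumes that the build command writes its result to the path it is given.

```python
import filecmp
import os
import subprocess
import time


def build_twice_and_compare(make_cmd, outputs, epoch=None):
    """Run two builds with one shared SOURCE_DATE_EPOCH and compare the results.

    make_cmd(path) must return the argv list that writes a build to path;
    outputs is a pair of output paths. Returns True if both files are
    byte-for-byte identical, which is what "cmp" checks.
    """
    if epoch is None:
        epoch = int(time.time())  # same idea as `date +%s`, taken once
    env = dict(os.environ, SOURCE_DATE_EPOCH=str(epoch))
    for path in outputs:
        subprocess.run(make_cmd(path), env=env, check=True)
    # shallow=False forces a content comparison instead of trusting stat().
    return filecmp.cmp(outputs[0], outputs[1], shallow=False)
```

The epoch is read once and reused, because calling the clock separately for each build would give the two invocations different values.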

Thanks!

cheers, josch



Bug#1010957: status update? Re: Bug#1010957: man-db: unreproducible index.db: contents depend on directory read order

2022-10-02 Thread Colin Watson
Control: tag -1 fixed-upstream

Success!

  https://gitlab.com/cjwatson/man-db/-/compare/5d2594d0a0...866c3571d3

As well as more localized testing, I built a .deb with this and used
josch's instructions from the start of this bug to build mmdebstrap
tarballs via disorderfs, using
"--hook-dir=/usr/share/mmdebstrap/hooks/file-mirror-automount
--include=./man-db_2.10.3~20221002-1_amd64.deb" to inject the new .deb.
The two resulting tarballs had somewhat differing file lists (timestamps
etc.), but all the actual files in the tarballs were bitwise-identical.

Feel free to do any other testing you think might be useful.  There's a
bootstrapped source tarball attached as an artifact to the
"build-distcheck" CI job in GitLab that you can easily use to build a
snapshot .deb if you need one.

-- 
Colin Watson (he/him)  [cjwat...@debian.org]



Bug#1010957: status update? Re: Bug#1010957: man-db: unreproducible index.db: contents depend on directory read order

2022-09-26 Thread Holger Levsen
Hi Colin,

On Sun, Sep 25, 2022 at 11:18:19PM +0100, Colin Watson wrote:
> This weekend's work has been:
>   https://gitlab.com/cjwatson/man-db/-/compare/bb0f7086ba...5d2594d0a0

wow, impressive!

(and thank you for taking care of man-db for so many years now! :)

[...]
> I'll need a bit more concentrated hacking time here, but I'll continue
> to work on these; this has been a great opportunity to clean up some
> truly unpleasant bits of code.  Once I have the accessdb diff down to
> zero, we'll see whether there's any further instability in the on-disk
> GDBM representation, and also whether there are any other issues that
> don't show up in the set of pages I have installed.

sounds great! also thank you for keeping us updated here, i'm looking
forward to hearing more good news eventually! :)


-- 
cheers,
Holger

 ⢀⣴⠾⠻⢶⣦⠀
 ⣾⠁⢠⠒⠀⣿⡁  holger@(debian|reproducible-builds|layer-acht).org
 ⢿⡄⠘⠷⠚⠋⠀  OpenPGP: B8BF54137B09D35CF026FE9D 091AB856069AAA1C
 ⠈⠳⣄

I'm looking forward to Corona being a beer again and Donald a duck.




Bug#1010957: status update? Re: Bug#1010957: man-db: unreproducible index.db: contents depend on directory read order

2022-09-25 Thread Colin Watson
This weekend's work has been:

  https://gitlab.com/cjwatson/man-db/-/compare/bb0f7086ba...5d2594d0a0

A lot of this was code rearrangement that I needed to do before I could
make progress on the real issues, but if you look at the NEWS.md diff
you'll see a number of changes that relate to this bug.  With all of
that, there are 33 lines of diff of accessdb output remaining on my
system against the result of josch's patch, which come down to two
issues:

 * unstable choice of whatis target for pages with many entries in NAME,
   some but not all of which are represented as symlinks in the
   filesystem to a file name that is not itself in NAME (there are some
   examples of this in libbsd-dev and libmd-dev)
 * some difficulty deciding exactly what to do with cross-section links
   in some cases (inetd.conf(5) → inetd(8))

I'll need a bit more concentrated hacking time here, but I'll continue
to work on these; this has been a great opportunity to clean up some
truly unpleasant bits of code.  Once I have the accessdb diff down to
zero, we'll see whether there's any further instability in the on-disk
GDBM representation, and also whether there are any other issues that
don't show up in the set of pages I have installed.

-- 
Colin Watson (he/him)  [cjwat...@debian.org]



Bug#1010957: status update? Re: Bug#1010957: man-db: unreproducible index.db: contents depend on directory read order

2022-09-22 Thread Holger Levsen
Hi Colin,

On Thu, Sep 22, 2022 at 08:53:07PM +0100, Colin Watson wrote:
> Yeah, this has taken me a bit longer than expected, but I have in fact
> been making some progress.  josch's patch has been very useful in that
> it provides an easy way to see differences between unsorted and sorted
> traversal, and I've taken my goal as being to drive those differences to
> zero.  The only bit I've committed so far has been:
> 
>   https://gitlab.com/cjwatson/man-db/-/commit/bb0f7086ba4ce4503761737bf612088c03b6c495

cool, thanks for the update and all your man-db work!

> I'll update this bug as I make further progress.

great, thanks again! 


-- 
cheers,
Holger

 ⢀⣴⠾⠻⢶⣦⠀
 ⣾⠁⢠⠒⠀⣿⡁  holger@(debian|reproducible-builds|layer-acht).org
 ⢿⡄⠘⠷⠚⠋⠀  OpenPGP: B8BF54137B09D35CF026FE9D 091AB856069AAA1C
 ⠈⠳⣄

Imagine god created trillions of galaxies but freaks out because some dude
kisses another.




Bug#1010957: status update? Re: Bug#1010957: man-db: unreproducible index.db: contents depend on directory read order

2022-09-22 Thread Colin Watson
Control: tag -1 - patch

On Thu, Sep 22, 2022 at 03:48:30PM +0000, Holger Levsen wrote:
> Colin, what's the status of this bug? You said you were working on improving
> josch's patch in May 2022...?! :)

Yeah, this has taken me a bit longer than expected, but I have in fact
been making some progress.  josch's patch has been very useful in that
it provides an easy way to see differences between unsorted and sorted
traversal, and I've taken my goal as being to drive those differences to
zero.  The only bit I've committed so far has been:

  https://gitlab.com/cjwatson/man-db/-/commit/bb0f7086ba4ce4503761737bf612088c03b6c495
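To illustrate the difference between unsorted and sorted traversal mentioned above, here is a minimal Python sketch. It is not man-db's code (man-db is written in C) and the function names are hypothetical; it only shows why a directory scan needs an explicit sort before its order can be relied on.

```python
import os


def scan_unsorted(directory):
    """Return entry names in whatever order the filesystem yields them.

    This order is unspecified and can differ between filesystems or runs,
    which is exactly what disorderfs is designed to expose.
    """
    with os.scandir(directory) as entries:
        return [entry.name for entry in entries]


def scan_sorted(directory):
    """Return the same names in a stable order.

    Sorting on the raw bytes of each name keeps the result independent of
    the locale as well as of the directory read order.
    """
    return sorted(scan_unsorted(directory), key=os.fsencode)
```

Comparing the database built from one scan against the database built from the other is one way to see which decisions still depend on read order.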

I also have a few hundred lines of somewhat untidy patch that I'll
commit in a few pieces as soon as I'm certain of it; this is all
essentially about stabilizing the decisions about which database entries
win compared to which other entries, so that the end result doesn't
change depending on the scan order.  With that, I'm down to on the order
of 150 lines of diff of accessdb output against the result of josch's
patch, and I think there are only about one or two problems left.
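The idea of making the choice between competing database entries independent of scan order can be sketched in Python as below. The entry fields (name, section, path) and the preference rule are hypothetical and are not man-db's actual rules; the point is only that the winner is chosen by a total order over the entries rather than by which one was seen first.

```python
def pick_winner(current, candidate):
    """Choose between two entries for the same name using a total order.

    Because the result depends only on the two entries themselves, it is
    the same whichever of them was encountered first.
    """
    return min(current, candidate, key=lambda e: (e["section"], e["path"]))


def build_index(entries):
    """Build a name -> entry mapping whose content ignores input order."""
    index = {}
    for entry in entries:
        name = entry["name"]
        if name in index:
            index[name] = pick_winner(index[name], entry)
        else:
            index[name] = entry
    return index
```

A "first entry wins" or "last entry wins" rule would instead produce different databases for different scan orders.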

A lot of the remaining difficulties are due to somewhat impenetrable old
code which appeared to be trying to micro-optimize memory usage in a way
that I don't think makes sense nowadays, so I may take a bit of a
digression into reorganizing some of this.

I'll update this bug as I make further progress.

> Also, the bug is currently tagged 'patch', I guess it's appropriate to remove
> that tag?

Done.

-- 
Colin Watson (he/him)  [cjwat...@debian.org]



Bug#1010957: status update? Re: Bug#1010957: man-db: unreproducible index.db: contents depend on directory read order

2022-09-22 Thread Holger Levsen
hi!

Colin, what's the status of this bug? You said you were working on improving
josch's patch in May 2022...?! :)

Also, the bug is currently tagged 'patch', I guess it's appropriate to remove
that tag?

josch: btw you said you submitted other patches for missing freeing of memory,
have you updated those other patches?


-- 
cheers,
Holger

 ⢀⣴⠾⠻⢶⣦⠀
 ⣾⠁⢠⠒⠀⣿⡁  holger@(debian|reproducible-builds|layer-acht).org
 ⢿⡄⠘⠷⠚⠋⠀  OpenPGP: B8BF54137B09D35CF026FE9D 091AB856069AAA1C
 ⠈⠳⣄

We live in a world where teenagers get more and more desperate trying to
convince adults to behave like grown ups.

