Re: dh_auto_clean, autoconf, and building packages twice (was: Potential MBF: packages failing to build twice in a row)

2024-06-23 Thread Andrey Rakhmatullin
On Sun, Jun 23, 2024 at 02:06:04PM +0200, Magnus Holmgren wrote:
> I'm very late to the party, but after reading the entire thread, I'd like to
> discuss a more specific, but perhaps not uncommon, situation with regard to
> cleaning and building again:
>
> We have a [still fairly] typical upstream package using Autotools. The
> tarball includes some built files (e.g. documentation), i.e. make dist
> builds them (as long as the right tools are installed) and make distclean
> deletes them. dh_auto_clean by default runs make distclean if such a target
> exists. But that's not the case until ./configure has been run, because
> until then, there are no makefiles. So the first time you run debian/rules
> build, the shipped version of the files will be untouched and used in the
> final package, but when you then clean and build again, they will be deleted
> and rebuilt (or the second build fails because of missing build
> dependencies).
>
> Besides building a package twice potentially failing, this can lead to the
> second build being different from the first. So I'm thinking:
>
> 1. Do we generally want dh_auto_clean to run make distclean, deleting files
> that we otherwise wouldn't need to build?

My autotools knowledge is rusty so I don't know what else except docs is
included here, but docs is a perfect example of a thing that we normally
*want* to rebuild.



-- 
WBR, wRAR




dh_auto_clean, autoconf, and building packages twice (was: Potential MBF: packages failing to build twice in a row)

2024-06-23 Thread Magnus Holmgren
Hi,

I'm very late to the party, but after reading the entire thread, I'd like to 
discuss a more specific, but perhaps not uncommon, situation with regard to 
cleaning and building again:

We have a [still fairly] typical upstream package using Autotools. The tarball 
includes some built files (e.g. documentation), i.e. make dist builds them (as 
long as the right tools are installed) and make distclean deletes them. 
dh_auto_clean by default runs make distclean if such a target exists. But 
that's not the case until ./configure has been run, because until then, there 
are no makefiles. So the first time you run debian/rules build, the shipped 
version of the files will be untouched and used in the final package, but when 
you then clean and build again, they will be deleted and rebuilt (or the 
second build fails because of missing build dependencies).

Besides building a package twice potentially failing, this can lead to the 
second build being different from the first. So I'm thinking:

1. Do we generally want dh_auto_clean to run make distclean, deleting files 
that we otherwise wouldn't need to build? dh_autoreconf regenerating configure 
and Makefile.in is another thing, I think, because those files aren't part of 
the final binary package, and dh_autoreconf tracks which files are changed so 
dh_autoreconf_clean can remove them.

2. In my case, I think I want those files (manpages) to be rebuilt on the
very first build, because the configured ${sysconfdir} is injected into them.
But to get them rebuilt, I either need to run make distclean and then
./configure again, or explicitly delete those specific files; neither option
is particularly elegant. Perhaps I'm missing some better option?

3. But if we do want to build as much as possible from scratch, wouldn't it 
make some sense to run dh_auto_configure an extra time before dh_auto_clean, 
to ensure that the latter actually does something (which it otherwise 
shouldn't have to, when building freshly unpacked source)?
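
For the "explicitly delete those specific files" option in point 2, a minimal debian/rules sketch could look like the following. The manpage paths and the SHIPPED_MANPAGES variable are hypothetical, and the sketch assumes a debhelper recent enough to support execute_before_ hook targets, as well as upstream makefiles that rebuild the manpages when they are missing.

```make
#!/usr/bin/make -f

# Hypothetical paths; list the shipped manpages that embed ${sysconfdir}.
SHIPPED_MANPAGES = doc/foo.8 doc/foo.conf.5

%:
	dh $@

# Delete the pre-built copies before configuring, so that the first build
# regenerates them just like every later build would.
execute_before_dh_auto_configure:
	rm -f $(SHIPPED_MANPAGES)
```

The trade-off is that whatever tools generate the manpages then have to be listed in Build-Depends, since the shipped copies are no longer available as a fallback.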

-- 
Magnus Holmgren        holmg...@debian.org
Debian Developer 



Re: Potential MBF: packages failing to build twice in a row

2023-11-26 Thread julien . puydt
On Sunday, 26 November 2023 at 16:34 +0100, Matthijs Kooijman wrote:
> Hi,
> 
> I've also gotten a bunch of bug reports from this MBF. Some were easy to
> fix, but there is one subtype of this issue where I think the commonly
> given advice and policy currently contradict.
> 
> This concerns files that:
>  - are shipped in the upstream tarball
>  - are regenerated (with slightly different contents) during the build
> 
> These are essentially build products prebuilt by upstream.
> 
> 
> A commonly recommended approach to fix this, and also the only approach
> listed on the page linked by the bug report [1], is to add such files to
> the extend-diff-ignore dpkg-source option.
> 
> [1]: https://wiki.debian.org/qa.debian.org/FTBFS/DoubleBuild
> 
> AFAICS this causes dpkg-source to simply ignore changes in this file,
> preventing dpkg-source from raising an error due to such modifications.
> 
> However, the policy 4.9 says:
> 
>   clean (required)
>    This must undo any effects that the build and binary targets may
>    have had, except that it should leave alone any output files
>    created in the parent directory by a run of a binary target.
> 
> So this does not say "dpkg-source must still work after the build
> + clean", it says that *any* effects of the build must be undone, which
> is stronger.
> 
> So AFAICS the extend-diff-ignore fix does not comply with the policy in
> its current form. Also, it means that two subsequent builds will not
> start from an identical source tree, which *could* hurt reproducibility
> (though in practice these files will be regenerated, so the build
> *should* be identical anyway).
> 

The way I handled such cases in some of my packages was:
- in the build target, before actually building, check whether foo.orig
exists, and if it doesn't, copy foo to foo.orig;
- in the clean target, check whether foo.orig exists, and if it does, move
it back to foo.

That way, even if foo gets modified during the build, the clean target
puts it back the way it was.

Inelegant, annoying, but it does the trick...
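
A minimal debian/rules sketch of this backup-and-restore approach might look as follows. It assumes the dh sequencer with a debhelper recent enough to support the execute_before_/execute_after_ hook targets; foo is the placeholder file name used above, and the PRESERVE variable is a hypothetical name introduced only for this illustration.

```make
#!/usr/bin/make -f

# Hypothetical list: upstream-shipped files that the build modifies.
PRESERVE = foo

%:
	dh $@

# Before building, save a pristine copy unless one already exists
# (it already exists if a previous build was not cleaned).
execute_before_dh_auto_build:
	set -e; for f in $(PRESERVE); do \
		[ -e "$$f.orig" ] || cp -a "$$f" "$$f.orig"; \
	done

# On clean, move the pristine copy back if a build left one behind;
# do nothing on a freshly unpacked tree.
execute_after_dh_auto_clean:
	set -e; for f in $(PRESERVE); do \
		if [ -e "$$f.orig" ]; then mv -f "$$f.orig" "$$f"; fi; \
	done
```

Checking for the .orig copy in both directions is what keeps the targets safe to run repeatedly: a second build never overwrites the saved original, and a clean on an untouched tree is a no-op.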

Cheers,

JP



Potential MBF: packages failing to build twice in a row

2023-11-26 Thread Matthijs Kooijman
Hi,

I've also gotten a bunch of bug reports from this MBF. Some were easy to
fix, but there is one subtype of this issue where I think the commonly
given advice and policy currently contradict.

This concerns files that:
 - are shipped in the upstream tarball
 - are regenerated (with slightly different contents) during the build

These are essentially build products prebuilt by upstream.


A commonly recommended approach to fix this, and also the only approach
listed on the page linked by the bug report [1], is to add such files to
the extend-diff-ignore dpkg-source option.

[1]: https://wiki.debian.org/qa.debian.org/FTBFS/DoubleBuild

AFAICS this causes dpkg-source to simply ignore changes in this file,
preventing dpkg-source from raising an error due to such modifications.

However, the policy 4.9 says:

  clean (required)
   This must undo any effects that the build and binary targets may
   have had, except that it should leave alone any output files
   created in the parent directory by a run of a binary target.

So this does not say "dpkg-source must still work after the build
+ clean", it says that *any* effects of the build must be undone, which
is stronger.

So AFAICS the extend-diff-ignore fix does not comply with the policy in
its current form. Also, it means that two subsequent builds will not
start from an identical source tree, which *could* hurt reproducibility
(though in practice these files will be regenerated, so the build
*should* be identical anyway).


Another fix mentioned in this thread (and I also found it in the debmake
docs [2]), is to remove such files in the clean target. The net effect
is that if you do a build and then build a source package, these files
will be removed before calling dpkg-source and since dpkg-source does
not error out about deleted files (as it does for modified files),
dpkg-source is happy and the source package is created. However,
dpkg-source does print a warning, making me feel this is a hack rather
than a proper solution.

Although this approach *technically* complies with the policy for the
clean target, it still feels a bit like a hack: the clean target must
restore everything changed by the build target, but does not need to
restore the changes it made itself. On the other hand, assuming that clean
is called before every build, that does ensure that every build has the
same starting position, which I guess is what really counts.

[2]: https://www.debian.org/doc/manuals/debmake-doc/ch05.en.html#rules-clean
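
As a rough illustration of the remove-files-in-clean approach, a debian/rules fragment could look like this. The paths and the REGENERATED variable are hypothetical, and the sketch assumes the dh sequencer with support for execute_after_ hook targets.

```make
#!/usr/bin/make -f

# Hypothetical paths: upstream-shipped files that the build regenerates.
REGENERATED = doc/foo.html doc/foo.1

%:
	dh $@

# Remove the shipped copies after the regular clean, so that every build
# starts without them and dpkg-source only sees deletions.
execute_after_dh_auto_clean:
	rm -f $(REGENERATED)
```

For a plain list of files, listing the same paths in debian/clean should achieve the same effect through dh_clean without any custom target.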


In terms of short and long term actions to address this, should we (in
no particular order):

 - Add the remove-files-in-clean approach to the wikipage linked in the
   bug reports [1]?
 - Modify the policy to allow clean to not revert changes that are
   excluded by (extend-)diff-ignore?
 - Modify dpkg-source to not complain with a warning about deleted
   files (maybe still an informational message to prevent surprise)?
 - Modify dpkg-source to not complain about any modified files at all
   (blanket fix for this issue)? What is the point of this check anyway?

Note that other parts of the original thread have discussed modifying
the policy so that building source after build+clean (or even build after
build+clean) would no longer be required, but since that seems to be
a big and controversial change, I'm focusing this post on the current
situation of policy and recommended approaches instead.

Gr.

Matthijs




POT creation date should match last modification of the source (Re: Potential MBF: packages failing to build twice in a row)

2023-08-21 Thread Holger Levsen
On Wed, Aug 16, 2023 at 11:42:32AM +0800, Paul Wise wrote:
> On Tue, 2023-08-15 at 09:21 -0400, Boyuan Yang wrote:
> > --- ibus-array-0.2.2.orig/po/zh_TW.po
> > +++ ibus-array-0.2.2/po/zh_TW.po
> > @@ -6,7 +6,7 @@ msgid ""
> >  msgstr ""
> >  "Project-Id-Version: ibus-array 0.2.2\n"
> >  "Report-Msgid-Bugs-To: https://github.com/lexical/ibus-array/issues\n"
> > -"POT-Creation-Date: 2019-12-10 22:09+0800\n"
> > +"POT-Creation-Date: 2023-08-15 09:07-0400\n"
> >  "PO-Revision-Date: 2019-12-10 22:12+0800\n"
> >  "Last-Translator: Anthony Fok \n"
> >  "Language-Team: Chinese (traditional)\n"
> I've long been annoyed by this behaviour as an upstream developer on
> gettext based projects. I think the most correct upstream solution to
> this is that the gettext tools need to be made deterministic. Probably
> the POT creation date field should be removed and replaced with the
> date of the last source file modification?

seems legit.


-- 
cheers,
Holger

 ⢀⣴⠾⠻⢶⣦⠀
 ⣾⠁⢠⠒⠀⣿⡁  holger@(debian|reproducible-builds|layer-acht).org
 ⢿⡄⠘⠷⠚⠋⠀  OpenPGP: B8BF54137B09D35CF026FE9D 091AB856069AAA1C
 ⠈⠳⣄

“We are about to sacrifice our civilization for the opportunity of a very
 small number of people to continue to make enormous amounts of money. We are
 about to sacrifice the biosphere so that rich people in countries like mine
 can live in luxury. But it is the sufferings of the many which pay for the
 luxuries of the few.” ― Greta Thunberg




Re: Potential MBF: packages failing to build twice in a row

2023-08-18 Thread Lisandro Damián Nicanor Pérez Meyer
On Tuesday, 15 August 2023 13:47:56 -03, Sam Hartman wrote:
> > "Lucas" == Lucas Nussbaum  writes:
> 
> Lucas> But unless we go further than that and decide that we don't
> Lucas> care at all about 'source builds after successful builds',
> Lucas> the bugs (which were filed severity:minor) remain valid.
> 
> FWIW in terms of building toward a consensus.  I'd like to see policy
> updated to de-emphasize the importance of clean working.
> 
> I don't think you did anything wrong by filing these bugs at severity
> minor, but I also don't see the point in spending the time doing that.
> Fortunately that doesn't matter: Debian allows people to focus on what
> is important to them, and it's totally fine if you focus your time on
> something I think is kind of pointless.
> A maintainer can tag the bug as wontfix (or close and tag as wontfix) if
> they like.
> I got a couple of these bugs filed on packages I'm involved with.
> I'll fix them if it's easy, because while I don't really care, you care
> enough to have spent the time, and part of being in a community is
> respecting the work of others.

I agree with this.




Re: Potential MBF: packages failing to build twice in a row

2023-08-16 Thread Jeremy Stanley
On 2023-08-16 11:45:43 +0800 (+0800), Paul Wise wrote:
> On Sun, 2023-08-13 at 21:18 +, Jeremy Stanley wrote:
> 
> > Similarly, I got one for __pycache__/*.cpython-311.pyc file
> > overwrites... is that something dh_python should clean?
> 
> Probably just send upstream a change removing them?

Maybe I was unclear. It's not included in the source tarballs, the
Python interpreter generates it automatically at package build time
as a side effect of being invoked.

Yes, I could add bespoke clean rules to delete these autogenerated
files. It's the most straightforward approach, though it does mean a
package revision and bothering a sponsor, as I'm not a DD. But since
a vast number of other Python-based projects' packages are also
affected, it makes more sense to have debhelper/dh_python or similar
take care of excluding them automatically. That keeps packaging
configuration minimal and avoids anyone having to upload new packages
just to fix this (in my case it's the only actionable bug reported
against the package in some time).
-- 
Jeremy Stanley




Re: __pycache__ directories (Re: Potential MBF: packages failing to build twice in a row)

2023-08-16 Thread Luca Boccassi
On Wed, 16 Aug 2023 at 09:43, Konstantin Demin  wrote:
>
> I'd recommend adding the following variable to d/rules:
>
> ```make
> export PYTHONDONTWRITEBYTECODE=1
> ```
>
> Optionally add this too:
>
> ```make
> export PYTHONUNBUFFERED=1
> ```
>
> Extra thing:
>
> ```make
> define remove_pycache
>
> : # $(strip $(1)): remove Python cache
> find $(strip $(1))/ -name __pycache__ -type d -exec rm -rf {} +
> find $(strip $(1))/ -name '*.py[co]' -ls -delete
>
> endef
> ```
>
> and then use it like this:
>
> ```make
> $(call remove_pycache, $(CURDIR) )
> ```

Why should we do something like that by hand for every single python
source package? It doesn't make much sense. This is something that
needs to be handled in the tooling/infrastructure automatically, not
create busywork for hundreds of people.

Kind regards,
Luca Boccassi



Re: Potential MBF: packages failing to build twice in a row

2023-08-16 Thread Vincent Lefevre
On 2023-08-15 09:38:32 +0200, Lucas Nussbaum wrote:
> On 15/08/23 at 01:29 -0400, Michael Stone wrote:
> > we don't know, since the test was "regenerate source"--a thing very few
> > people care about--rather than "build twice" which is the thing people do
> > seem to care about. It seems likely that the difference is thousands of
> > packages.
> > 
> > I'm somewhat concerned we magically went from "should we do an MBF" to "I
> > just did an MBF" without any real consensus in the middle. This being so
> > painfully obvious that the MBF itself basically says there's no consensus.
> 
> I agree that the distinction between "fails to build source after
> successful build" and "fails to build binary packages after successful
> build" is useful. My initial test covered both, but I separated both
> issues later on to provide more specific bug reports, so the MBF only
> covered the first case. I also plan to do a MBF for "fails to build
> binary packages after successful build" (there are about 700 packages
> failing this).

Note that if the source has been modified, it may be possible that
the second build succeeds but is incorrect. I suppose that you need
to check that both builds are identical.

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)



Re: __pycache__ directories (Re: Potential MBF: packages failing to build twice in a row)

2023-08-16 Thread Konstantin Demin
I'd recommend adding the following variable to d/rules:

```make
export PYTHONDONTWRITEBYTECODE=1
```

Optionally add this too:

```make
export PYTHONUNBUFFERED=1
```

Extra thing:

```make
define remove_pycache

: # $(strip $(1)): remove Python cache
find $(strip $(1))/ -name __pycache__ -type d -exec rm -rf {} +
find $(strip $(1))/ -name '*.py[co]' -ls -delete

endef
```

and then use it like this:

```make
$(call remove_pycache, $(CURDIR) )
```
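
Put together, a complete debian/rules using these pieces might look like the sketch below. Hooking the call into execute_after_dh_auto_clean is an assumption, since the snippets above do not say where the call belongs; it also assumes the dh sequencer and a debhelper that supports execute_after_ hook targets.

```make
#!/usr/bin/make -f

# Stop the interpreter from writing __pycache__ during the build.
export PYTHONDONTWRITEBYTECODE=1

# Canned recipe from above: delete bytecode caches below directory $(1).
define remove_pycache

: # $(strip $(1)): remove Python cache
find $(strip $(1))/ -name __pycache__ -type d -exec rm -rf {} +
find $(strip $(1))/ -name '*.py[co]' -ls -delete

endef

%:
	dh $@

# Assumed placement: sweep the whole source tree after the regular clean.
execute_after_dh_auto_clean:
	$(call remove_pycache, $(CURDIR) )
```

With the environment variable exported, the cleanup should mostly matter for caches left behind by tools that ignore it or by earlier builds.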

On Wed, 16 Aug 2023 at 08:41, Michael Biebl wrote:
>
> On 16.08.23 at 06:02, Paul Wise wrote:
> > On Mon, 2023-08-14 at 22:09 +0200, Michael Biebl wrote:
> >
> >> I received a couple of bug reports against packages I (co) maintain
> >> regarding this issue and having a quick look, quite a few fail due to
> >> python scripts being run during the build and creating a __pycache__
> >> directory.
> >
> > I recommend asking upstream to delete those directories from their VCS.
> >
>
> Those directories were created *during* build.
> They are not shipped in the VCS.



-- 
SY,
Konstantin Demin



Re: Potential MBF: packages failing to build twice in a row

2023-08-15 Thread Ian Campbell
On Sun, 2023-08-13 at 22:59 +0200, Timo Röhling wrote:
> There's talk on #d-python if pybuild could/should deal with it; I'd
> give it a few days and see if that pans out.

https://salsa.debian.org/python-team/tools/dh-python/-/merge_requests/46
seems to be the one to watch.

Ian.



Re: __pycache__ directories (Re: Potential MBF: packages failing to build twice in a row)

2023-08-15 Thread Michael Biebl

On 16.08.23 at 06:02, Paul Wise wrote:
> On Mon, 2023-08-14 at 22:09 +0200, Michael Biebl wrote:
>
> > I received a couple of bug reports against packages I (co) maintain
> > regarding this issue and having a quick look, quite a few fail due to
> > python scripts being run during the build and creating a __pycache__
> > directory.
>
> I recommend asking upstream to delete those directories from their VCS.

Those directories were created *during* build.
They are not shipped in the VCS.




Re: __pycache__ directories (Re: Potential MBF: packages failing to build twice in a row)

2023-08-15 Thread Paul Wise
On Mon, 2023-08-14 at 22:09 +0200, Michael Biebl wrote:

> I received a couple of bug reports against packages I (co) maintain 
> regarding this issue and having a quick look, quite a few fail due to
> python scripts being run during the build and creating a __pycache__ 
> directory.

I recommend asking upstream to delete those directories from their VCS.

-- 
bye,
pabs

https://wiki.debian.org/PaulWise




Re: Potential MBF: packages failing to build twice in a row

2023-08-15 Thread Paul Wise
On Sun, 2023-08-13 at 22:28 +0200, Johannes Schauer Marin Rodrigues wrote:
> > dpkg-source: info: local changes detected, the modified files are:
> >   plakativ-0.5.1/plakativ.egg-info/SOURCES.txt
> 
> since this issue seems to be affecting a few more packages than plakativ, I
> wanted to ask here what the canonical way is to fix this issue?

Personally, I would switch from the current orig.tar source (PyPI?) to
using the upstream VCS, which presumably does not have those files,
since they are usually generated by setup.py or something else and
hopefully haven't been committed to the upstream VCS. If they have
been committed to the upstream VCS, send upstream a removal.

-- 
bye,
pabs

https://wiki.debian.org/PaulWise




Re: Potential MBF: packages failing to build twice in a row

2023-08-15 Thread Paul Wise
On Sun, 2023-08-13 at 21:18 +, Jeremy Stanley wrote:

> Similarly, I got one for __pycache__/*.cpython-311.pyc file
> overwrites... is that something dh_python should clean?

Probably just send upstream a change removing them?

-- 
bye,
pabs

https://wiki.debian.org/PaulWise




Re: Potential MBF: packages failing to build twice in a row

2023-08-15 Thread Paul Wise
On Tue, 2023-08-15 at 09:21 -0400, Boyuan Yang wrote:

> --- ibus-array-0.2.2.orig/po/zh_TW.po
> +++ ibus-array-0.2.2/po/zh_TW.po
> @@ -6,7 +6,7 @@ msgid ""
>  msgstr ""
>  "Project-Id-Version: ibus-array 0.2.2\n"
>  "Report-Msgid-Bugs-To: https://github.com/lexical/ibus-array/issues\n"
> -"POT-Creation-Date: 2019-12-10 22:09+0800\n"
> +"POT-Creation-Date: 2023-08-15 09:07-0400\n"
>  "PO-Revision-Date: 2019-12-10 22:12+0800\n"
>  "Last-Translator: Anthony Fok \n"
>  "Language-Team: Chinese (traditional)\n"

I've long been annoyed by this behaviour as an upstream developer on
gettext based projects. I think the most correct upstream solution to
this is that the gettext tools need to be made deterministic. Probably
the POT creation date field should be removed and replaced with the
date of the last source file modification?

-- 
bye,
pabs

https://wiki.debian.org/PaulWise




Re: Potential MBF: packages failing to build twice in a row

2023-08-15 Thread Simon McVittie
On Tue, 15 Aug 2023 at 09:21:22 -0400, Boyuan Yang wrote:
> I am looking for advice in handling these MBF reports against packages that do
> irreversible changes to the source files during build every time (such as
> updating timestamp). A broad category would be packages using Gettext + PO
> combination with /usr/share/gettext/po/Makefile.in.in involved / embedded,
> where .po file that contains translation is updated every time, causing
> dpkg-source to complain about the diff and quit when building twice in a row.

Translation files like these are traditionally a hybrid of source file
and derived file: they're source that is edited by hand, but they are
also updated programmatically during the build.

If you are not going to modify those files yourself (other than possibly
via patches in debian/patches/), I think the most reasonable thing to do
is to tell dpkg-source to ignore any local modifications and keep using
the version of them from the upstream source code, something like this:
https://salsa.debian.org/gnome-team/shell-extensions/gnome-shell-extension-caffeine/-/commit/7fbdf1c82d824978d6e2161e479cca817e38db6b

(in most packages they'll be closer to the top level, perhaps in a po/
directory, but GNOME Shell extensions have an unusual directory layout)

This is not, strictly speaking, Policy-compliant, but Policy is a tool
for making a high-quality distribution, not a stick to beat people with;
and doing a backup/restore workflow (like dh_autoreconf does) would be
a significant amount of work for no concrete benefit that I can see.

smcv



Re: Potential MBF: packages failing to build twice in a row

2023-08-15 Thread Sam Hartman
> "Lucas" == Lucas Nussbaum  writes:

Lucas> But unless we go further than that and decide that we don't
Lucas> care at all about 'source builds after successful builds',
Lucas> the bugs (which were filed severity:minor) remain valid.

FWIW, in terms of building toward a consensus: I'd like to see policy
updated to de-emphasize the importance of clean working.

I don't think you did anything wrong by filing these bugs at severity
minor, but I also don't see the point in spending the time doing that.
Fortunately that doesn't matter: Debian allows people to focus on what
is important to them, and it's totally fine if you focus your time on
something I think is kind of pointless.
A maintainer can tag the bug as wontfix (or close and tag as wontfix) if
they like.
I got a couple of these bugs filed on packages I'm involved with.
I'll fix them if it's easy, because while I don't really care, you care
enough to have spent the time, and part of being in a community is
respecting the work of others.



Re: Potential MBF: packages failing to build twice in a row

2023-08-15 Thread Marco d'Itri
On Aug 15, Jonas Smedegaard  wrote:

> The proper approach is IMO one of these:
Or else, if you know that they do not actually need to be rebuilt: just 
disable in the makefile the target which causes them to be rebuilt.
This is what I do in my packages.

-- 
ciao,
Marco




Re: Potential MBF: packages failing to build twice in a row

2023-08-15 Thread Jonas Smedegaard
Quoting Jonas Smedegaard (2023-08-15 17:26:13)
> Ah sorry, I indeed failed to read your scenario properly.
> 
> If PO files are not updated when POT files exist (which sounds odd to
> me) then perhaps simply touch those files to bring them into existence
> during build, to fool msgmerge?  Probably you will then need to touch
> them with a date earlier than the PO files for the trick to work.  And
> make sure the fake POT file are removed in clean, and also excluded from
> getting 

Re: Potential MBF: packages failing to build twice in a row

2023-08-15 Thread Jonas Smedegaard
Quoting Boyuan Yang (2023-08-15 16:28:19)
> Hi,
> 
> On Tuesday, 2023-08-15 at 16:16 +0200, Jonas Smedegaard wrote:
> > Quoting Boyuan Yang (2023-08-15 15:21:22)
> > > I am looking for advice in handling these MBF reports against packages
> > > that do irreversible changes to the source files during build every time
> > > (such as updating timestamp). A broad category would be packages using
> > > Gettext + PO combination with /usr/share/gettext/po/Makefile.in.in
> > > involved / embedded, where .po file that contains translation is updated
> > > every time, causing dpkg-source to complain about the diff and quit when
> > > building twice in a row.
> > > 
> > > Take https://tracker.debian.org/pkg/ibus-array as an example. The upstream
> > > project does not include .pot template file in the source code. The logic
> > > of Makefile.in.in is to call msgmerge to update po translation file with
> > > generated .pot file when .pot file is not present. This causes at least
> > > the following diff after build:
> > > 
> > > --- ibus-array-0.2.2.orig/po/zh_TW.po
> > > +++ ibus-array-0.2.2/po/zh_TW.po
> > > @@ -6,7 +6,7 @@ msgid ""
> > >  msgstr ""
> > >  "Project-Id-Version: ibus-array 0.2.2\n"
> > >  "Report-Msgid-Bugs-To: https://github.com/lexical/ibus-array/issues\n"
> > > -"POT-Creation-Date: 2019-12-10 22:09+0800\n"
> > > +"POT-Creation-Date: 2023-08-15 09:07-0400\n"
> > >  "PO-Revision-Date: 2019-12-10 22:12+0800\n"
> > >  "Last-Translator: Anthony Fok \n"
> > >  "Language-Team: Chinese (traditional)\n"
> > > 
> > > 
> > > ... which would raise error on building twice in a row:
> > > 
> > > dpkg-source: info: local changes detected, the modified files are:
> > >  ibus-array/po/zh_TW.po
> > > dpkg-source: info: Hint: make sure the version in debian/changelog matches the unpacked source tree
> > > dpkg-source: info: you can integrate the local changes with dpkg-source --commit
> > > dpkg-source: error: aborting due to unexpected upstream changes, see /tmp/ibus-array_0.2.2-2.diff.TnL2Yp
> > > dpkg-buildpackage: error: dpkg-source -b . subprocess returned exit status 2
> > > 
> > > 
> > > I am looking for advice on implementing an elegant solution. What I can
> > > think of now is to persuade upstream to embed a copy of the generated .pot
> > > template file in the source code, which does not sound reasonable.
> > > Meanwhile, since Makefile.in.in is somewhat widely used, this issue has
> > > likely already had an impact on packages using gettext to handle
> > > translation.
> > 
> > I see two options (aside from persuading upstream to not ship
> > autogenerated files in the first place, which is arguably the most
> > elegant but they might see it differently):
> > 
> > a) repackage upstream source to exclude autogenerated *.pot files
> > 
> > b) put aside autogenerated *.pot files during build
> > 
> > For a) the common mechanism is to declare an Files-Excluded: field in
> > the header section of debian/copyright, and declare repacksuffix=+ds in
> > debian/watch (or repacksuffix=+dfsg if DFSG-conflicting files also need
> > to be excluded).  More details on `man uscan`, e.g. in section
> > COPYRIGHT FILE EXAMPLES.
> > An example package using this method is node-expat.
> > 
> > For b) I would suggest to move aside (but only if not already done) in
> > debian/rules target execute_before_dh_auto_configure, and move it back
> > (forcefully - i.e. overwriting whatever might be there already - but
> > do nothing if no file was put aside) in debian/rules target
> > execute_after_dh_auto_clean.
> > An example package using this method is librdf-rdfa-parser-perl.
> 
> I believe your take was in the reversed direction -- the issue is not about
> upstream including auto-generated files, the issue is that upstream is **not**
> including auto-generated POT files.
> 
> * If auto-generated POT file is present, the PO file will not be updated using
> msgmerge(1). The source code is not modified during build.
> 
> * If auto-generated POT file is missing, it will be generated on-the-fly with
> a new timestamp, and the PO file (in the source code) will be updated with a
> new timestamp. The source code is thus modified, making dpkg-source(1)
> fail.

Ah sorry, I indeed failed to read your scenario properly.

If PO files are not updated when POT files exist (which sounds odd to
me) then perhaps simply touch the POT files to bring them into existence
during build, to fool msgmerge?  Probably you will then need to touch
them with a date earlier than the PO files for the trick to work.  And
make sure the fake POT files are removed in clean, and also excluded from
getting installed if needed.
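
A minimal sketch of that idea, untested against the real Makefile.in.in:
create an empty placeholder template dated well before the PO files, and
remove it again in clean. The function names, the fixed date and the
"ibus-array" domain in the usage line are assumptions made for this
example; the real template name would come from the package's po/Makevars.

```shell
#!/bin/sh
# Sketch only: helper names and the fixed timestamp are illustrative assumptions.

# Create an empty placeholder <domain>.pot in a po/ directory.
make_placeholder_pot() {
    pot="$1/$2.pot"
    # Leave a real template alone if upstream ever starts shipping one.
    if [ -e "$pot" ]; then
        return 0
    fi
    : > "$pot"
    # POSIX "touch -t" takes [[CC]YY]MMDDhhmm; 2000-01-01 00:00 is assumed
    # to be older than every PO file in the package.
    touch -t 200001010000 "$pot"
}

# Remove the placeholder again (meant for the clean step).
remove_placeholder_pot() {
    pot="$1/$2.pot"
    # Only delete a zero-byte file, so a real template is never removed.
    if [ -f "$pot" ] && [ ! -s "$pot" ]; then
        rm -f "$pot"
    fi
}
```

Usage would be something like `make_placeholder_pot po ibus-array` before
the build and `remove_placeholder_pot po ibus-array` in clean. Whether an
empty template is enough to keep msgmerge from running is exactly the part
that would need checking.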


 - Jonas

-- 
 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136  Website: http://dr.jones.dk/
 * Sponsorship: https://ko-fi.com/drjones

 [x] quote me freely  [ ] ask before reusing  [ ] keep private

signature.asc
Description: signature


Re: Potential MBF: packages failing to build twice in a row

2023-08-15 Thread Andreas Metzler
On 2023-08-15 Boyuan Yang  wrote:
[...]
> where .po file that contains translation is updated every time, causing
> dpkg-source to complain about the diff and quit when building twice in a row.

> Take https://tracker.debian.org/pkg/ibus-array as an example. The upstream
> project does not include .pot template file in the source code. The logic of
> Makefile.in.in is to call msgmerge to update po translation file with
> generated .pot file when .pot file is not present. This causes at least the
> following diff after build:

> --- ibus-array-0.2.2.orig/po/zh_TW.po
> +++ ibus-array-0.2.2/po/zh_TW.po
> @@ -6,7 +6,7 @@ msgid ""
>  msgstr ""
>  "Project-Id-Version: ibus-array 0.2.2\n"
>  "Report-Msgid-Bugs-To: https://github.com/lexical/ibus-array/issues\n"
> -"POT-Creation-Date: 2019-12-10 22:09+0800\n"
> +"POT-Creation-Date: 2023-08-15 09:07-0400\n"
[...]
> I am looking for the advice to implement an elegant solution. What I
> can think of now is to persuade upstream to embed a copy of generated
> .pot template file in source code, which does not sound reasonable.
> Meanwhile since Makefile.in.in is somehow widely used, this issue
> likely already had impact on packages using gettext to handle
> translation.

You could simply set

extend-diff-ignore="\.po$"

in debian/source/options (untested).
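
As a rough sketch of how that line could be added: in
debian/source/options, dpkg-source options are written in their long form
without the leading dashes, one per line. The helper name add_diff_ignore
is an assumption for this example, and the pattern is the one suggested
above, still untested against ibus-array.

```shell
#!/bin/sh
# Sketch only: add_diff_ignore is a hypothetical helper name.

# Append the ignore pattern to debian/source/options of the source tree
# given as $1, unless the exact line is already present.
add_diff_ignore() {
    options="$1/debian/source/options"
    line='extend-diff-ignore = "\.po$"'
    mkdir -p "$1/debian/source"
    # -F fixed string, -x whole line, -q quiet: grep must not treat the
    # pattern itself as a regular expression here.
    if ! grep -Fxq -- "$line" "$options" 2>/dev/null; then
        printf '%s\n' "$line" >> "$options"
    fi
}
```

Note that this hides every change to any .po file from dpkg-source, so a
deliberate translation fix made directly in the tree would be ignored too.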

cu Andreas
-- 
`What a good friend you are to him, Dr. Maturin. His other friends are
so grateful to you.'
`I sew his ears on from time to time, sure'



Re: Potential MBF: packages failing to build twice in a row

2023-08-15 Thread Boyuan Yang
Hi,

On Tue, 2023-08-15 at 16:16 +0200, Jonas Smedegaard wrote:
> Quoting Boyuan Yang (2023-08-15 15:21:22)
> > I am looking for advice in handling these MBF reports against packages
> > that do
> > irreversible changes to the source files during build every time (such as
> > updating timestamp). A broad category would be packages using Gettext + PO
> > combination with /usr/share/gettext/po/Makefile.in.in involved / embedded,
> > where .po file that contains translation is updated every time, causing
> > dpkg-source to complain about the diff and quit when building twice in a
> > row.
> > 
> > Take https://tracker.debian.org/pkg/ibus-array as an example. The upstream
> > project does not include .pot template file in the source code. The logic
> > of
> > Makefile.in.in is to call msgmerge to update po translation file with
> > generated .pot file when .pot file is not present. This causes at least
> > the
> > following diff after build:
> > 
> > --- ibus-array-0.2.2.orig/po/zh_TW.po
> > +++ ibus-array-0.2.2/po/zh_TW.po
> > @@ -6,7 +6,7 @@ msgid ""
> >  msgstr ""
> >  "Project-Id-Version: ibus-array 0.2.2\n"
> >  "Report-Msgid-Bugs-To: https://github.com/lexical/ibus-array/issues\n"
> > -"POT-Creation-Date: 2019-12-10 22:09+0800\n"
> > +"POT-Creation-Date: 2023-08-15 09:07-0400\n"
> >  "PO-Revision-Date: 2019-12-10 22:12+0800\n"
> >  "Last-Translator: Anthony Fok \n"
> >  "Language-Team: Chinese (traditional)\n"
> > 
> > 
> > ... which would raise error on building twice in a row:
> > 
> > dpkg-source: info: local changes detected, the modified files are:
> >  ibus-array/po/zh_TW.po
> > dpkg-source: info: Hint: make sure the version in debian/changelog matches
> > the
> > unpacked source tree
> > dpkg-source: info: you can integrate the local changes with dpkg-source --
> > commit
> > dpkg-source: error: aborting due to unexpected upstream changes, see
> > /tmp/ibus-array_0.2.2-2.diff.TnL2Yp
> > dpkg-buildpackage: error: dpkg-source -b . subprocess returned exit status
> > 2
> > 
> > 
> > I am looking for the advice to implement an elegant solution. What I can
> > think
> > of now is to persuade upstream to embed a copy of generated .pot template
> > file
> > in source code, which does not sound reasonable. Meanwhile since
> > Makefile.in.in is somehow widely used, this issue likely already had
> > impact on
> > packages using gettext to handle translation.
> 
> I see two options (aside from persuading upstream to not ship
> autogenerated files in the first place, which is arguably the most
> elegant but they might see it differently):
> 
> a) repackage upstream source to exclude autogenerated *.pot files
> 
> b) put aside autogenerated *.pot files during build
> 
> For a) the common mechanism is to declare an Files-Excluded: field in
> the header section of debian/copyright, and declare repacksuffix=+ds in
> debian/watch (or repacksuffix=+dfsg if DFSG-conflicting files also need
> to be excluded).  More details on `man uscan`, e.g. in section
> COPYRIGHT FILE EXAMPLES.
> An example package using this method is node-expat.
> 
> For b) I would suggest to move aside (but only if not already done) in
> debian/rules target execute_before_dh_auto_configure, and move it back
> (forcefully - i.e. overwriting whatever might be there already - but
> do nothing if no file was put aside) in debian/rules target
> execute_after_dh_auto_clean.
> An example package using this method is librdf-rdfa-parser-perl.

I believe your take was in the reversed direction -- the issue is not about
upstream including auto-generated files, the issue is that upstream is **not**
including auto-generated POT files.

* If auto-generated POT file is present, the PO file will not be updated using
msgmerge(1). The source code is not modified during build.

* If auto-generated POT file is missing, it will be generated on-the-fly with
a new timestamp, and the PO file (in the source code) will be updated with a
new timestamp. The source code is thus modified, making dpkg-source(1)
fail.
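
To make the two cases concrete, here is a small sketch that only checks the
precondition described above: a po/ directory that has translations but no
template. pot_missing is a hypothetical helper, not part of gettext or
debhelper, and it says nothing about what a given Makefile.in.in will
actually do.

```shell
#!/bin/sh
# Sketch only: pot_missing is a hypothetical helper name.

# Succeeds (exit status 0) if the directory $1 contains at least one
# *.po file and no *.pot file, i.e. the second case described above.
pot_missing() {
    po_dir=$1
    # If nothing matches, the glob stays literal and the -e test fails.
    set -- "$po_dir"/*.po
    [ -e "$1" ] || return 1
    set -- "$po_dir"/*.pot
    if [ -e "$1" ]; then
        return 1
    fi
    return 0
}
```

For example, `pot_missing po && echo "PO files may be rewritten during build"`
run from the top of an unpacked source tree.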

Thanks,
Boyuan Yang


signature.asc
Description: This is a digitally signed message part


Re: Potential MBF: packages failing to build twice in a row

2023-08-15 Thread Jonas Smedegaard
Quoting Boyuan Yang (2023-08-15 15:21:22)
> I am looking for advice in handling these MBF reports against packages that do
> irreversible changes to the source files during build every time (such as
> updating timestamp). A broad category would be packages using Gettext + PO
> combination with /usr/share/gettext/po/Makefile.in.in involved / embedded,
> where .po file that contains translation is updated every time, causing
> dpkg-source to complain about the diff and quit when building twice in a row.
> 
> Take https://tracker.debian.org/pkg/ibus-array as an example. The upstream
> project does not include .pot template file in the source code. The logic of
> Makefile.in.in is to call msgmerge to update po translation file with
> generated .pot file when .pot file is not present. This causes at least the
> following diff after build:
> 
> --- ibus-array-0.2.2.orig/po/zh_TW.po
> +++ ibus-array-0.2.2/po/zh_TW.po
> @@ -6,7 +6,7 @@ msgid ""
>  msgstr ""
>  "Project-Id-Version: ibus-array 0.2.2\n"
>  "Report-Msgid-Bugs-To: https://github.com/lexical/ibus-array/issues\n"
> -"POT-Creation-Date: 2019-12-10 22:09+0800\n"
> +"POT-Creation-Date: 2023-08-15 09:07-0400\n"
>  "PO-Revision-Date: 2019-12-10 22:12+0800\n"
>  "Last-Translator: Anthony Fok \n"
>  "Language-Team: Chinese (traditional)\n"
> 
> 
> ... which would raise error on building twice in a row:
> 
> dpkg-source: info: local changes detected, the modified files are:
>  ibus-array/po/zh_TW.po
> dpkg-source: info: Hint: make sure the version in debian/changelog matches the
> unpacked source tree
> dpkg-source: info: you can integrate the local changes with dpkg-source --
> commit
> dpkg-source: error: aborting due to unexpected upstream changes, see
> /tmp/ibus-array_0.2.2-2.diff.TnL2Yp
> dpkg-buildpackage: error: dpkg-source -b . subprocess returned exit status 2
> 
> 
> I am looking for the advice to implement an elegant solution. What I can think
> of now is to persuade upstream to embed a copy of generated .pot template file
> in source code, which does not sound reasonable. Meanwhile since
> Makefile.in.in is somehow widely used, this issue likely already had impact on
> packages using gettext to handle translation.

I see two options (aside from persuading upstream to not ship
autogenerated files in the first place, which is arguably the most
elegant but they might see it differently):

a) repackage upstream source to exclude autogenerated *.pot files

b) put aside autogenerated *.pot files during build

For a) the common mechanism is to declare an Files-Excluded: field in
the header section of debian/copyright, and declare repacksuffix=+ds in
debian/watch (or repacksuffix=+dfsg if DFSG-conflicting files also need
to be excluded).  More details on `man uscan`, e.g. in section
COPYRIGHT FILE EXAMPLES.
An example package using this method is node-expat.

For b) I would suggest to move aside (but only if not already done) in
debian/rules target execute_before_dh_auto_configure, and move it back
(forcefully - i.e. overwriting whatever might be there already - but
do nothing if no file was put aside) in debian/rules target
execute_after_dh_auto_clean.
An example package using this method is librdf-rdfa-parser-perl.
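
A minimal sketch of the commands those two debian/rules targets could run
for option b). The function names and the .orig-upstream suffix are
assumptions made for this example, not taken from
librdf-rdfa-parser-perl.

```shell
#!/bin/sh
# Sketch only: put_aside/restore_aside and the suffix are illustrative assumptions.

# Meant for execute_before_dh_auto_configure: save the shipped file once.
put_aside() {
    f=$1
    # Skip if a saved copy exists, so a second build cannot overwrite
    # the pristine file with a regenerated one.
    if [ -e "$f" ] && [ ! -e "$f.orig-upstream" ]; then
        mv "$f" "$f.orig-upstream"
    fi
}

# Meant for execute_after_dh_auto_clean: put the shipped file back.
restore_aside() {
    f=$1
    # Do nothing if nothing was put aside; otherwise overwrite whatever
    # the build left behind.
    if [ -e "$f.orig-upstream" ]; then
        mv -f "$f.orig-upstream" "$f"
    fi
}
```

If the build needs the file to stay in place (a PO file that gets rewritten,
say), copying with `cp -a` instead of `mv` in put_aside would be the
variation to try.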


Hope that helps.

-- 
 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136  Website: http://dr.jones.dk/
 * Sponsorship: https://ko-fi.com/drjones

 [x] quote me freely  [ ] ask before reusing  [ ] keep private

signature.asc
Description: signature


Re: Potential MBF: packages failing to build twice in a row

2023-08-15 Thread Boyuan Yang
Hi all,

(Sorry for losing all contexts -- I deleted all previous mails in the thread
sent to my mailbox, and could not find a way to re-download mailing list email
from https://lists.debian.org/debian-devel/ . It is possible in bug reports on
https://bugs.debian.org though. Now replying with empty reference.)

As stated in the policy:

  clean (required)

 This must undo any effects that the build and binary targets may
 have had, except that it should leave alone any output files
 created in the parent directory by a run of a binary target.

I am looking for advice in handling these MBF reports against packages that do
irreversible changes to the source files during build every time (such as
updating timestamp). A broad category would be packages using Gettext + PO
combination with /usr/share/gettext/po/Makefile.in.in involved / embedded,
where .po file that contains translation is updated every time, causing
dpkg-source to complain about the diff and quit when building twice in a row.

Take https://tracker.debian.org/pkg/ibus-array as an example. The upstream
project does not include .pot template file in the source code. The logic of
Makefile.in.in is to call msgmerge to update po translation file with
generated .pot file when .pot file is not present. This causes at least the
following diff after build:

--- ibus-array-0.2.2.orig/po/zh_TW.po
+++ ibus-array-0.2.2/po/zh_TW.po
@@ -6,7 +6,7 @@ msgid ""
 msgstr ""
 "Project-Id-Version: ibus-array 0.2.2\n"
 "Report-Msgid-Bugs-To: https://github.com/lexical/ibus-array/issues\n"
-"POT-Creation-Date: 2019-12-10 22:09+0800\n"
+"POT-Creation-Date: 2023-08-15 09:07-0400\n"
 "PO-Revision-Date: 2019-12-10 22:12+0800\n"
 "Last-Translator: Anthony Fok \n"
 "Language-Team: Chinese (traditional)\n"


... which would raise error on building twice in a row:

dpkg-source: info: local changes detected, the modified files are:
 ibus-array/po/zh_TW.po
dpkg-source: info: Hint: make sure the version in debian/changelog matches the
unpacked source tree
dpkg-source: info: you can integrate the local changes with dpkg-source --
commit
dpkg-source: error: aborting due to unexpected upstream changes, see
/tmp/ibus-array_0.2.2-2.diff.TnL2Yp
dpkg-buildpackage: error: dpkg-source -b . subprocess returned exit status 2


I am looking for the advice to implement an elegant solution. What I can think
of now is to persuade upstream to embed a copy of generated .pot template file
in source code, which does not sound reasonable. Meanwhile since
Makefile.in.in is somehow widely used, this issue likely already had impact on
packages using gettext to handle translation.

Thanks,
Boyuan Yang


signature.asc
Description: This is a digitally signed message part


Re: Potential MBF: packages failing to build twice in a row

2023-08-15 Thread Sebastiaan Couwenberg

How would one implement this in the common Salsa CI pipeline?

There is an old issue for this [0], but unless someone provides an MR 
it's unlikely to get implemented.


Occasional rebuild campaigns like this don't catch regressions in a 
timely manner. Having a build-twice job in the Salsa CI pipeline would 
help to catch regressions quickly, but it doesn't cover all packages. 
Better would be a QA service for all uploads to the archive similar to 
reproducible builds or piuparts to ensure non-policy compliant clean 
targets are caught quickly.
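
As a starting point for such a job, a rough sketch of the core step: run
the same binary build twice in one tree and report which pass failed. The
BUILD_CMD variable and the build_twice name are assumptions for this
example and not part of the Salsa CI pipeline; -us and -uc skip signing
and -b requests a binary-only build.

```shell
#!/bin/sh
# Sketch only: BUILD_CMD and build_twice are illustrative assumptions.
# BUILD_CMD can be overridden, e.g. with a stub for a dry run.
BUILD_CMD=${BUILD_CMD:-"dpkg-buildpackage -us -uc -b"}

# Build the source tree in $1 twice in a row without unpacking it again.
build_twice() {
    src_dir=$1
    for pass in first second; do
        # By default dpkg-buildpackage runs "debian/rules clean" before
        # building, so the second pass exercises the clean target.
        if ! (cd "$src_dir" && $BUILD_CMD); then
            echo "$pass build failed" >&2
            return 1
        fi
    done
    echo "both builds succeeded"
}
```

Swapping the second pass for a source build (`dpkg-buildpackage -us -uc -S`)
would test the stricter "source can be rebuilt after a binary build" case
instead.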


[0] https://salsa.debian.org/salsa-ci-team/pipeline/-/issues/193

Kind Regards,

Bas

--
 GPG Key ID: 4096R/6750F10AE88D4AF1
Fingerprint: 8182 DE41 7056 408D 6146  50D1 6750 F10A E88D 4AF1



Re: Potential MBF: packages failing to build twice in a row

2023-08-15 Thread Lucas Nussbaum
On 15/08/23 at 01:29 -0400, Michael Stone wrote:
> On Mon, Aug 14, 2023 at 09:40:52PM +0100, Wookey wrote:
> > Yes. You are right. I (and most of the others who expressed an
> > interest in having this working) mostly care about doing a binary
> > build repeatedly. But doesn't this amount to much the same thing?
> 
> no, not really. a lot of benign changes (like copying in new autoconf stuff)
> can happily be made multiple times, which doesn't affect building at all but
> causes busy work to undo.
> 
> > dpkg-source will moan if the source has changed and tell you about the
> > nice patch it has made. OK, it will let some things slide as just
> > warnings, so 'builds binary twice' is a somewhat less stringent target
> > than 'leaves exactly the original pristine source'. I would have to check
> > the details, but I'm not sure how much difference this makes in
> > practice?
> 
> we don't know, since the test was "regenerate source"--a thing very few
> people care about--rather than "build twice" which is the thing people do
> seem to care about. It seems likely that the difference is thousands of
> packages.
> 
> I'm somewhat concerned we magically went from "should we do an MBF" to "I
> just did an MBF" without any real consensus in the middle. This being so
> painfully obvious that the MBF itself basically says there's no consensus.

I agree that the distinction between "fails to build source after
successful build" and "fails to build binary packages after successful
build" is useful. My initial test covered both, but I separated both
issues later on to provide more specific bug reports, so the MBF only
covered the first case. I also plan to do a MBF for "fails to build
binary packages after successful build" (there are about 700 packages
failing this).

What policy says is even stricter (and inadequate?), because it says:

  clean (required)

 This must undo any effects that the build and binary targets may
 have had, except that it should leave alone any output files
 created in the parent directory by a run of a binary target.

I think that the consensus might be something like:
- after 'clean', it _must_ be possible to perform a successful binary build
  again
- after 'clean', it _should_ be possible to build source again (so
  recommended but not required)
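
For comparison, the policy wording is stricter than both points: clean
must undo every effect of the build. One rough way to see how far a package
is from that is to fingerprint the tree before the build and again after
clean. tree_snapshot below is a hypothetical helper for illustration; it
looks at regular files only and ignores permissions, symlinks and empty
directories.

```shell
#!/bin/sh
# Sketch only: tree_snapshot is a hypothetical helper name.

# Print one "checksum  ./path" line per regular file under $1, sorted by
# path so that two snapshots can be compared as plain strings.
tree_snapshot() {
    (cd "$1" && find . -type f -exec sha256sum {} + | LC_ALL=C sort -k 2)
}
```

Typical use would be `before=$(tree_snapshot .)` on the freshly unpacked
source, then build, run `debian/rules clean`, take `after=$(tree_snapshot .)`
and compare the two strings; any difference is a file that clean did not
restore or remove.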

But unless we go further than that and decide that we don't care at all
about 'source builds after successful builds', the bugs (which were
filed severity:minor) remain valid.

Looking at the number of bugs fixed since the MBF (200 closed, + 125
pending, out of 5658, in less than two days [1]), it also looks like many
people think this is worth fixing.

[1] UDD queries:
select count(*) from bugs where id in (select id from bugs_usertags where 
email='lu...@debian.org' and tag = 'ftbfs-source-after-build');

select count(*) from bugs where id in (select id from bugs_usertags where 
email='lu...@debian.org' and tag = 'ftbfs-source-after-build') and 
status='done';

select count(*) from bugs where id in (select id from bugs_usertags where 
email='lu...@debian.org' and tag = 'ftbfs-source-after-build') and id in 
(select id from bugs_tags where tag='pending') and status!='done';

Lucas



Testing archive-wide changes (Was: __pycache__ directories (Re: Potential MBF: packages failing to build twice in a row))

2023-08-15 Thread Lucas Nussbaum
On 14/08/23 at 22:09 +0200, Michael Biebl wrote:
> Could maybe dh_clean automatically clean up such __pycache__ directories or
> do we really expect that each individual package does such a clean up
> manually?
> Or is there maybe a way to avoid the creation of the __pycache__ directories
> altogether.

Hi,

As a reminder, if someone needs to test a change over a large number of
packages, I can run custom rebuilds (typically pulling specific
packages from experimental or an external repository).

See https://lists.debian.org/debian-devel/2020/10/msg00097.html

It involves some manual work on my side, so please ask only for things 
you really want to push to Debian, and when the number of affected 
packages exceeds what you can build in a couple of days locally.
(But I don't think I have ever declined any such request)

Lucas



Re: Potential MBF: packages failing to build twice in a row

2023-08-14 Thread Michael Stone

On Mon, Aug 14, 2023 at 09:40:52PM +0100, Wookey wrote:
> On 2023-08-14 10:19 -0400, Michael Stone wrote:
> > On Thu, Aug 10, 2023 at 02:38:17PM +0200, Lucas Nussbaum wrote:
> > > On 08/08/23 at 10:26 +0200, Helmut Grohne wrote:
> > > > Are we ready to call for consensus on dropping the requirement that
> > > > `debian/rules clean; dpkg-source -b` shall work or is anyone interested
> > > > in sending lots of patches for this?
> > >
> > > My reading of the discussion is that there's sufficient interest for
> > > ensuring that building-source-after-successful-binary-build works.
> >
> > my reading said that there was interest in making sure that binary builds
> > work repeatedly, and almost no interest in making sure that building source
> > from a rules/clean works. certainly not thousands of packages worth of busy
> > work level of interest.
>
> Yes. You are right. I (and most of the others who expressed an
> interest in having this working) mostly care about doing a binary
> build repeatedly. But doesn't this amount to much the same thing?

no, not really. a lot of benign changes (like copying in new autoconf
stuff) can happily be made multiple times, which doesn't affect building
at all but causes busy work to undo.

> dpkg-source will moan if the source has changed and tell you about the
> nice patch it has made. OK, it will let some things slide as just
> warnings, so 'builds binary twice' is a somewhat less stringent target
> than 'leaves exactly the original pristine source'. I would have to check
> the details, but I'm not sure how much difference this makes in
> practice?

we don't know, since the test was "regenerate source"--a thing very few
people care about--rather than "build twice" which is the thing people
do seem to care about. It seems likely that the difference is thousands
of packages.

I'm somewhat concerned we magically went from "should we do an MBF" to
"I just did an MBF" without any real consensus in the middle. This being
so painfully obvious that the MBF itself basically says there's no
consensus.



Re: Potential MBF: packages failing to build twice in a row

2023-08-14 Thread Wookey
On 2023-08-14 10:19 -0400, Michael Stone wrote:
> On Thu, Aug 10, 2023 at 02:38:17PM +0200, Lucas Nussbaum wrote:
> > On 08/08/23 at 10:26 +0200, Helmut Grohne wrote:
> > > Are we ready to call for consensus on dropping the requirement that
> > > `debian/rules clean; dpkg-source -b` shall work or is anyone interested
> > > in sending lots of patches for this?
> > 
> > My reading of the discussion is that there's sufficient interest for
> > ensuring that building-source-after-successful-binary-build works.
> 
> my reading said that there was interest in making sure that binary builds
> work repeatedly, and almost no interest in making sure that building source
> from a rules/clean works. certainly not thousands of packages worth of busy
> work level of interest.

Yes. You are right. I (and most of the others who expressed an
interest in having this working) mostly care about doing a binary
build repeatedly. But doesn't this amount to much the same thing?

dpkg-source will moan if the source has changed and tell you about the
nice patch it has made. OK, it will let some things slide as just
warnings, so 'builds binary twice' is a somewhat less stringent target
than 'leaves exactly the original pristine source'. I would have to check
the details, but I'm not sure how much difference this makes in
practice?

But yeah, I can live with the clean only cleaning well enough to do
correct binary builds (although I do think it should clean enough to
make correct sources too in general).

Wookey
-- 
Principal hats:  Debian, Wookware, ARM
http://wookware.org/


signature.asc
Description: PGP signature


Re: Potential MBF: packages failing to build twice in a row

2023-08-14 Thread Wookey
On 2023-08-13 22:48 +0200, Vincent Bernat wrote:
> On 2023-08-10 14:38, Lucas Nussbaum wrote:
> > 
> > My reading of the discussion is that there's sufficient interest for
> > ensuring that building-source-after-successful-binary-build works.
> 
> There is a bias asking d-devel@. You'll get people with enough time on their
> hands to care about this. Nobody ever complained about not being able to
> build twice in a row for years.

I may not have complained to you, but I certainly have been regularly
annoyed by this for many years, and have had to fix people's clean
targets before I could debug the actual problem I downloaded their
package to look at. Yes it's not the biggest problem in the world, but
it's a sizeable lump of grit in the system, at least for some of us.

> We are asking a lot of people to fix problems that don't really exist.

It does exist. It's broken far too often and it affects actual
developers using debian tools. It's supposed to work.

Wookey
-- 
Principal hats:  Debian, Wookware, ARM
http://wookware.org/


signature.asc
Description: PGP signature


__pycache__ directories (Re: Potential MBF: packages failing to build twice in a row)

2023-08-14 Thread Michael Biebl

Hi,

I received a couple of bug reports against packages I (co) maintain 
regarding this issue and having a quick look, quite a few fail due to
python scripts being run during the build and creating a __pycache__ 
directory.


Examples:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1048444
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1044727
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1045734

Could maybe dh_clean automatically clean up such __pycache__ directories 
or do we really expect that each individual package does such a clean up 
manually?
Or is there maybe a way to avoid the creation of the __pycache__ 
directories altogether.
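
Both directions can be sketched in a few lines, assuming the scripts are
run by CPython during the build. PYTHONDONTWRITEBYTECODE is the
interpreter's own environment variable; remove_pycache is a hypothetical
helper name and not something dh_clean provides.

```shell
#!/bin/sh
# Sketch only: remove_pycache is a hypothetical helper name.

# 1. Avoid creating the directories: Python does not write .pyc files
#    when this variable is set to a non-empty string.
export PYTHONDONTWRITEBYTECODE=1

# 2. Remove them afterwards, e.g. from the clean step.
remove_pycache() {
    # -prune keeps find from descending into a directory that is about
    # to be deleted.
    find "$1" -type d -name __pycache__ -prune -exec rm -rf {} +
}
```

The export line is also valid GNU make syntax, so it could sit at the top
of debian/rules as is; `remove_pycache .` would be the per-package cleanup
that a central fix would make unnecessary.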


Regards,
Michael




OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: Potential MBF: packages failing to build twice in a row

2023-08-14 Thread Steve Langasek
On Mon, Aug 14, 2023 at 08:28:15AM +0200, Johannes Schauer Marin Rodrigues 
wrote:

> Quoting John Goerzen (2023-08-13 23:32:03)
> > On Sat, Aug 05 2023, Lucas Nussbaum wrote:
> > > I wonder what we should do, because 5000+ failing packages is a lot...
> > Let's think about the level of trouble we cause trying to tackle something
> > that has clearly not bothered anyone for years.

> this is not the first time that it has been said in this thread that "this
> hasn't bothered anybody for years". I wanted to come out and say that it has
> bothered me. It just hasn't bothered me enough to investigate what the proper
> way to solve it is. It hasn't bothered me enough to bother other people with
> this issue.

Agreed.

-- 
Steve Langasek   Give me a lever long enough and a Free OS
Debian Developer   to set it on, and I can move the world.
Ubuntu Developer   https://www.debian.org/
slanga...@ubuntu.com vor...@debian.org


signature.asc
Description: PGP signature


Re: Potential MBF: packages failing to build twice in a row

2023-08-14 Thread Michael Stone

On Thu, Aug 10, 2023 at 02:38:17PM +0200, Lucas Nussbaum wrote:
> On 08/08/23 at 10:26 +0200, Helmut Grohne wrote:
> > Are we ready to call for consensus on dropping the requirement that
> > `debian/rules clean; dpkg-source -b` shall work or is anyone interested
> > in sending lots of patches for this?
>
> My reading of the discussion is that there's sufficient interest for
> ensuring that building-source-after-successful-binary-build works.

my reading said that there was interest in making sure that binary
builds work repeatedly, and almost no interest in making sure that
building source from a rules/clean works. certainly not thousands of
packages worth of busy work level of interest.




Re: Potential MBF: packages failing to build twice in a row

2023-08-14 Thread Tino Didriksen
On Sat, 5 Aug 2023 at 17:44, Vincent Bernat  wrote:

> On 2023-08-05 17:06, Lucas Nussbaum wrote:
> > Should we give up on requiring a 'clean' target that works? After all,
> > when 17% of packages are failing, it means that many maintainers don't
> > depend on it in their workflow.
>
> Yes, please, this does not make sense anymore to enforce such a rule
> when it is now easy to use "git clean -fxd" or to build in a chroot.
> Moreover, binary packages in the archive are now built by an official
> builder.
>

Agreed. This should result in a policy change. A "clean" target is entirely
superfluous these days, and has been for probably decades. It is easy, and
better in most measurable ways, to build from scratch every time. You only
need to reuse a build folder when debugging, and then "clean" isn't
relevant anyway.

-- Tino Didriksen


Re: Potential MBF: packages failing to build twice in a row

2023-08-14 Thread Lisandro Damián Nicanor Pérez Meyer
El lunes, 14 de agosto de 2023 03:28:15 -03 Johannes Schauer Marin Rodrigues 
escribió:
> Hi,
> 
> Quoting John Goerzen (2023-08-13 23:32:03)
> > On Sat, Aug 05 2023, Lucas Nussbaum wrote:
> > > I wonder what we should do, because 5000+ failing packages is a lot...
> > Let's think about the level of trouble we cause trying to tackle something
> > that has clearly not bothered anyone for years.
> 
> this is not the first time that it has been said in this thread that "this
> hasn't bothered anybody for years". I wanted to come out and say that it has
> bothered me. It just hasn't bothered me enough to investigate what the proper
> way to solve it is. It hasn't bothered me enough to bother other people with
> this issue. After all, I can just run "git clean -fdx" to "solve" the problem
> whenever it happens.

I respect your point of view, but I still think that doing that is actually 
better than patching things over to get the original stuff back.

That being said it's fine to disagree and let's see what the raw consensus is, 
but taking into account why the 5k+ packages fail. If many are due to stuff like 
the python's egg issue, well, clearly many things can be easily fixed. If the 
majority of issues are because maintainers do not care maybe it is really a 
signal of something that used to make sense and nowadays might not do as much 
as it used to. Making those bugs RC would mean forcing people to solve a 
feature that they are clearly not using. But do please file the bugs, and even 
better, send patches!


signature.asc
Description: This is a digitally signed message part.


Re: Potential MBF: packages failing to build twice in a row

2023-08-14 Thread Johannes Schauer Marin Rodrigues
Hi,

Quoting John Goerzen (2023-08-13 23:32:03)
> On Sat, Aug 05 2023, Lucas Nussbaum wrote:
> > I wonder what we should do, because 5000+ failing packages is a lot...
> Let's think about the level of trouble we cause trying to tackle something
> that has clearly not bothered anyone for years.

this is not the first time that it has been said in this thread that "this
hasn't bothered anybody for years". I wanted to come out and say that it has
bothered me. It just hasn't bothered me enough to investigate what the proper
way to solve it is. It hasn't bothered me enough to bother other people with
this issue. After all, I can just run "git clean -fdx" to "solve" the problem
whenever it happens.

Lucas filed bugs against three of my packages. Since I didn't know how to
properly solve this issue and since this thread exists I finally felt it was
time to ask my fellow DDs how to fix this and I'll be happy to upload packages
with this problem fixed.

Thank you Lucas for doing this work!

> Personally, I'm a volunteer.  I have X amount of time to devote to Debian.
> I work on Debian because I enjoy it.  It is satisfying!

I'm also just a volunteer doing this in my free time as my hobby because I find
it fun. I can totally understand how fixing this can be anything but fun for
others and I do not want to invalidate your opinion on this. But as this thread
has a number of people saying how they don't find this useful and how they
don't find fixing this fun, I wanted to write a message with my very different
view on the subject. I cannot tell you exactly why but my *feeling* is that I'm
very happy that Lucas filed bugs against three of my packages and I will
personally find it fun to upload a version that fixes them. I understand that
there are many who do not think like that and I understand that it's worth
discussing the practical benefit of it all. But I for one am happy that these
bugs were filed and I will have fun closing them with a fix.

Thanks!

cheers, josch



Re: Potential MBF: packages failing to build twice in a row

2023-08-13 Thread Vincent Lefevre
On 2023-08-13 16:32:03 -0500, John Goerzen wrote:
> - Upstream wants to ship things that may get modified during build.  Ie,
>   autoconf/automake replaces files they ship because they want it to
>   work "out of the box" in some fashion.  Another example is
>   documentation; upstream may ship built docs even though we rebuild it
>   for completeness.

Yes, the GNU Coding Standards even say:

  GNU distributions usually contain some files which are not
  source files—for example, Info files, and the output from
  Autoconf, Automake, Bison or Flex.

For mpfr4, the test is failing just because the makeinfo version
is now newer in Debian:

--- mpfr4-4.2.0.orig/doc/mpfr.info
+++ mpfr4-4.2.0/doc/mpfr.info
@@ -1,4 +1,4 @@
-This is mpfr.info, produced by makeinfo version 6.8 from mpfr.texi.
+This is mpfr.info, produced by makeinfo version 7.0.3 from mpfr.texi.
 This manual documents how to install and use the Multiple Precision
 Floating-Point Reliable Library, version 4.2.0.

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)



Re: Potential MBF: packages failing to build twice in a row

2023-08-13 Thread John Goerzen
On Sat, Aug 05 2023, Lucas Nussbaum wrote:

> I wonder what we should do, because 5000+ failing packages is a lot...

Let's think about the level of trouble we cause trying to tackle
something that has clearly not bothered anyone for years.

From the packaging side, there are many reasons that proper clean
targets can be difficult:

- Upstream wants to ship things that may get modified during build.  I.e.,
  autoconf/automake replaces files they ship because they want it to
  work "out of the box" in some fashion.  Another example is
  documentation; upstream may ship built docs even though we rebuild them
  for completeness.

- Upstream's clean target is insufficient.

- Our own build processes make modifications.  (eg, quilt; yes I know it
  is supposed to be cleaned up but this is not always perfect)

- It is difficult to prepare a proper clean target, because build
  artifacts may lack predictable names (eg, have the architecture
  embedded in them)

Now, what are the possible options for dealing with this?

- Invest a lot of time writing bespoke scripts to handle it

- Copy the upstream source to a temp directory and build from there.
  (Some packages already do this; if we are going to hack in this
  direction all over, why not just do it by default everywhere?)

- Repack the upstream source to exclude files we generate.  But this has
  a ton of downfalls, including breaking the trust chain from upstream
  to us.  It should be a last resort (eg, making the tarball DFSG-free).

- Just tell people to run "git reset --hard HEAD; git clean -xfd"

Personally, I'm a volunteer.  I have X amount of time to devote to
Debian.   I work on Debian because I enjoy it.  It is satisfying!

I maintain packages written in at least 6 different languages.  Some
packages are sleek and modern tools.  One I am about to take over traces
its codebase to 1981 and the upstream author has an explicit goal that
it still builds on operating systems that were discontinued in the
mid-80s.

I have an alias (not just for Debian) that does "git reset --hard HEAD;
git clean -xfd".

Spending hours to make a clean target that I have no need for, that I
already have an equivalent of, and that has no real purpose drains my
enthusiasm and steals time that I could otherwise be using doing more
high-value work like fixing bugs.

Let's focus our energies on things that matter more.

> Should we give up on requiring a 'clean' target that works? After all,
> when 17% of packages are failing, it means that many maintainers don't
> depend on it in their workflow.

Yes.

- John



Re: Potential MBF: packages failing to build twice in a row

2023-08-13 Thread Timo Röhling

* Jeremy Stanley  [2023-08-13 21:18]:

> Similarly, I got one for __pycache__/*.cpython-311.pyc file
> overwrites... is that something dh_python should clean?


As a matter of fact, I also have one like that; ironically, it is not
even a Python package, it just happens to run Sphinx for its
documentation. Thus, I'm seriously wondering if dh_clean should
start removing __pycache__ folders, as it already cleans up some
other random stuff like autom4te.cache folders, *.orig and *.rej
patch backups, and even DEADJOE files.
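
For illustration only, here is roughly what such a cleanup step would have to do. This is a Python sketch, not how dh_clean is implemented, and the function name remove_pycache is made up for the example:

```python
import os
import shutil


def remove_pycache(root):
    """Delete every __pycache__ directory below root; return the removed paths."""
    removed = []
    for dirpath, dirnames, _filenames in os.walk(root):
        if "__pycache__" in dirnames:
            target = os.path.join(dirpath, "__pycache__")
            shutil.rmtree(target)
            # Do not descend into the directory we just deleted.
            dirnames.remove("__pycache__")
            removed.append(target)
    return sorted(removed)
```

Calling remove_pycache(".") from the top of an unpacked source tree would drop the byte-code caches that a Sphinx or test run left behind, without touching anything else.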


Cheers
Timo

--
⢀⣴⠾⠻⢶⣦⠀   ╭╮
⣾⠁⢠⠒⠀⣿⡁   │ Timo Röhling   │
⢿⡄⠘⠷⠚⠋⠀   │ 9B03 EBB9 8300 DF97 C2B1  23BF CC8C 6BDD 1403 F4CA │
⠈⠳⣄   ╰╯




Re: Potential MBF: packages failing to build twice in a row

2023-08-13 Thread John Goerzen
On Sat, Aug 05 2023, Andrey Rakhmatullin wrote:

> On Sat, Aug 05, 2023 at 08:10:35PM +0300, Adrian Bunk wrote:
>> Debian maintainers with proper git workflows are already exporting all
>> their changes from git to debian/patches/ as one file - currently the
>> preferred form of modification of a Debian package has to be in salsa
>> and not in our archive when the changes cannot be represented as quilt
>> patches against tarballs.
> Is the gbp-pq workflow improper?


Improper?  I don't know.  But bad, yes.

If we ignore the decades of history -- yes I know we don't live in that
vacuum, but humor me here -- it is a weird process.  We (typically) get
upstream code in git, and (typically) maintain it in git in Debian.  git
already has features built in to do all this tracking, but we do this
really weird thing where we use git to store text files consisting of
diffs that, in that case, are more properly maintained in git anyhow.

I actually prefer the old source format to quilt, because I can just
maintain things in git the proper way with it, rather than have to do
all this weirdness.  But that's just me.

Not all of my packages have their upstream in git, but I maintain 100%
of them in git at the Debian level.

Here's the key.  By using a lot of nonstandard things, Debian is doing
two things:

1) Raising the barrier to newcomers to participate

2) Assuring that we will lag behind what others are doing

For #2, the reason is that however much we work on our bespoke tooling,
we cannot hope to match what the rest of the world combined is doing.

That doesn't mean we give up on .deb and adopt RPM or something.  But,
in an ideal world, would gbp-pq need to exist?  I don't think so.  A
world in which it doesn't should be our target.

- John



Re: Potential MBF: packages failing to build twice in a row

2023-08-13 Thread Jeremy Stanley
On 2023-08-13 22:59:53 +0200 (+0200), Timo Röhling wrote:
[...]
> There's talk on #d-python if pybuild could/should deal with it; I'd
> give it a few days and see if that pans out.
> 
> If not, extend-diff-ignore as Scott suggested or simply removing the
> egg-info folders with a *.egg-info/ line in d/clean should both
> work. (Personally, I'd use extend-diff-ignore if the egg-info is
> also shipped in the source tarball and d/clean if not)

Similarly, I got one for __pycache__/*.cpython-311.pyc file
overwrites... is that something dh_python should clean?
-- 
Jeremy Stanley




Re: Potential MBF: packages failing to build twice in a row

2023-08-13 Thread Timo Röhling

Hi,

* Johannes Schauer Marin Rodrigues  [2023-08-13 22:28]:

> since this issue seems to be affecting a few more packages than plakativ, I
> wanted to ask here what the canonical way is to fix this issue?

There's talk on #d-python if pybuild could/should deal with it; I'd
give it a few days and see if that pans out.

If not, extend-diff-ignore as Scott suggested or simply removing the
egg-info folders with a *.egg-info/ line in d/clean should both
work. (Personally, I'd use extend-diff-ignore if the egg-info is
also shipped in the source tarball and d/clean if not)
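
To get a feeling for what an extend-diff-ignore pattern would cover, the following Python sketch applies a regular expression to a list of relative paths. The pattern is an assumption modelled on values commonly seen in debian/source/options, not something taken from this thread, and dpkg-source itself uses Perl regular expressions, so please check dpkg-source(1) for the exact matching rules before relying on it:

```python
import re

# Assumed example value; adjust it to the package at hand.
EXTEND_DIFF_IGNORE = r"^[^/]+\.egg-info/"


def split_ignored(paths, pattern=EXTEND_DIFF_IGNORE):
    """Split relative paths into (ignored, still_reported) lists."""
    regex = re.compile(pattern)
    ignored = [p for p in paths if regex.search(p)]
    reported = [p for p in paths if not regex.search(p)]
    return ignored, reported
```

With this pattern only a top-level egg-info directory is ignored; an egg-info directory further down the tree would still be reported as a local change.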


Cheers
Timo

--
⢀⣴⠾⠻⢶⣦⠀   ╭╮
⣾⠁⢠⠒⠀⣿⡁   │ Timo Röhling   │
⢿⡄⠘⠷⠚⠋⠀   │ 9B03 EBB9 8300 DF97 C2B1  23BF CC8C 6BDD 1403 F4CA │
⠈⠳⣄   ╰╯




Re: Potential MBF: packages failing to build twice in a row

2023-08-13 Thread Vincent Bernat

On 2023-08-10 14:38, Lucas Nussbaum wrote:
> On 08/08/23 at 10:26 +0200, Helmut Grohne wrote:
> > Are we ready to call for consensus on dropping the requirement that
> > `debian/rules clean; dpkg-source -b` shall work or is anyone interested
> > in sending lots of patches for this?
>
> My reading of the discussion is that there's sufficient interest for
> ensuring that building-source-after-successful-binary-build works.


There is a bias in asking d-devel@. You'll get people with enough time on 
their hands to care about this. Nobody has complained about not being 
able to build twice in a row for years. We are asking a lot of people to 
fix problems that don't really exist.


I don't have time to deal with the number of bugs reported against my own 
packages (most of them with Python) and just having the social pressure 
to spend time on them makes me wonder if I really want to invest time at 
all. If these bugs become RC, so be it.




Re: Potential MBF: packages failing to build twice in a row

2023-08-13 Thread Scott Kitterman



On August 13, 2023 8:28:08 PM UTC, Johannes Schauer Marin Rodrigues wrote:
>Hi,
>
>Quoting Simon McVittie (2023-08-06 12:27:04)
>> On Sat, 05 Aug 2023 at 21:29:08 +0200, Andrey Rakhmatullin wrote:
>> > I expect all Python packages that ship
>> > $name.egg-info and don't remove it in clean and don't exclude it via
>> > extend-diff-ignore (all of which is unneeded busywork even if recommended)
>> > to behave the same.
>> 
>> Python packages that *don't* ship $name.egg-info in their upstream source,
>> don't remove it in clean and don't exclude it via extend-diff-ignore will
>> also fail Lucas' test if they are 3.0 (quilt) format (or presumably will
>> have unintended diff instead if they are format 1.0). That's the only reason
>> bmap-tools_3.6-2 was on Lucas' list, for example.
>
>I just had #1045290 filed against a Python package of mine:
>
>> dpkg-source: info: local changes detected, the modified files are:
>>  plakativ-0.5.1/plakativ.egg-info/SOURCES.txt
>
>since this issue seems to be affecting a few more packages than plakativ, I
>wanted to ask here what the canonical way is to fix this issue?

Generally something like this (although depending on the Python build tool, it 
may need some adjustment):

https://salsa.debian.org/python-team/packages/dkimpy/-/commit/bf39be7563b680bdc73a67a4db7dc1af05af

Scott K



Re: Potential MBF: packages failing to build twice in a row

2023-08-13 Thread Johannes Schauer Marin Rodrigues
Hi,

Quoting Simon McVittie (2023-08-06 12:27:04)
> On Sat, 05 Aug 2023 at 21:29:08 +0200, Andrey Rakhmatullin wrote:
> > I expect all Python packages that ship
> > $name.egg-info and don't remove it in clean and don't exclude it via
> > extend-diff-ignore (all of which is unneeded busywork even if recommended)
> > to behave the same.
> 
> Python packages that *don't* ship $name.egg-info in their upstream source,
> don't remove it in clean and don't exclude it via extend-diff-ignore will
> also fail Lucas' test if they are 3.0 (quilt) format (or presumably will
> have unintended diff instead if they are format 1.0). That's the only reason
> bmap-tools_3.6-2 was on Lucas' list, for example.

I just had #1045290 filed against a Python package of mine:

> dpkg-source: info: local changes detected, the modified files are:
>  plakativ-0.5.1/plakativ.egg-info/SOURCES.txt

since this issue seems to be affecting a few more packages than plakativ, I
wanted to ask here what the canonical way is to fix this issue?
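
In case it helps others triage similar reports, here is a small Python sketch that pulls the list of modified files out of such a log. It assumes the file names follow the "local changes detected" line as indented lines, as in the excerpt above; the function name modified_files is made up for this example:

```python
MARKER = "local changes detected, the modified files are:"


def modified_files(log_text):
    """Return the paths listed after dpkg-source's 'local changes detected' line."""
    files = []
    collecting = False
    for line in log_text.splitlines():
        if MARKER in line:
            collecting = True
            continue
        if collecting:
            # The file list is indented; the first other line ends it.
            if line.startswith((" ", "\t")) and line.strip():
                files.append(line.strip())
            else:
                collecting = False
    return files
```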

Thanks!

cheers, josch



Re: Potential MBF: packages failing to build twice in a row

2023-08-12 Thread Lisandro Damian Nicanor Perez Meyer
On Saturday, 12 August 2023 20:26:09 -03 Lisandro Damian Nicanor Perez Meyer wrote:
> On Saturday, 5 August 2023 12:06:27 -03 Lucas Nussbaum wrote:
> > Hi,
> > 
> > Debian Policy section 4.9 says:
> >   clean (required)
> >   
> >  This must undo any effects that the build and binary targets may
> >  have had, except that it should leave alone any output files
> >  created in the parent directory by a run of a binary target.
> 
> Now this is something I never understood why it would be needed and it seems
> that it might be necessary on other types of flows from the one I use.
> 
> Normally my repos only have the stuff in debian/ committed in them. If a
> package fails in a way I need to dig deeper I create a chroot, install the
> build dependencies, untar the source code and work. If I need to clean, git
> clean -xdff && untar source code again.
> 
> I could be easily missing something beneficial for my workflow there but, to
> be honest, I never really required debian/clean to work.
> 
> And yes, if someone wants to build my packages then they need to understand
> how to use git and untar the source code. If they don't, I point them to the
> docs (we do have them).
> 
> Also I have never seen building a package twice in a row on the archive
> happening... so no, at least in my use case, I do not see the value of
> debian/clean.
> 
> But again, I might be missing something here.

Scott Kitterman was kind enough to discuss the issue with me, and he asked me 
what I would do if I had to work with a Debian source from the archive. That 
was actually a pretty good question, as I have to do it from time to time:

apt-get source foo
cd foo-x.y.z
git init
git add -A debian/

And voilà, now I can work with **any** package from the archive in the way I 
like. If I need to clean up and start over:

git clean -xdff
tar -xf ../foo_x.y.z.orig.tar.gz --strip-components=1
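
For anyone who wants to script the "untar again" step, here is a rough Python equivalent of extracting an upstream tarball while dropping its top-level directory. It is only a sketch: extract_stripped is a made-up name, hard links inside the archive are not rewritten, and it should only be pointed at tarballs you trust:

```python
import tarfile


def extract_stripped(tar_path, dest):
    """Extract tar_path into dest, dropping the first path component of each member."""
    with tarfile.open(tar_path) as tar:
        members = []
        for member in tar.getmembers():
            parts = member.name.split("/", 1)
            if len(parts) < 2 or not parts[1]:
                continue  # skip the top-level directory entry itself
            member.name = parts[1]
            members.append(member)
        # Use the safer "data" extraction filter where this Python provides it.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        tar.extractall(dest, members=members, **kwargs)
```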

Someone might say "hey, but you need to clean and untar each time!". Well, to 
be honest, I prefer that to relying on an add-on prepared by someone who is 
probably not upstream, i.e., us maintainers. Let's be honest, this might be a 
flaky target unless you test it each and every time... a waste of resources.

Someone might say "I do not want git installed". Well, I hope you don't use 
gbp on your packages either.

Now I **do** see value if you are using something like gbp to keep the source 
code, as you might have modified files there. But that's a peril of your 
workflow, not mine.

I can also understand why that was a fair thing to have when stuff like git was 
not around, but nowadays...

And **again**, I might easily be missing something very important here, but so 
far I see no reason to do two builds of my packages when I can easily solve it 
with a couple of commands.

Note that I am not saying that we should remove the feature. If the package 
happens to work with it, all the better. If it doesn't and you provide a patch 
to make it work, awesome! Making it mandatory? No, so far I see no value in 
wasting my time on that.



Re: Potential MBF: packages failing to build twice in a row

2023-08-12 Thread Lisandro Damian Nicanor Perez Meyer
On Saturday, 5 August 2023 12:06:27 -03 Lucas Nussbaum wrote:
> Hi,
> 
> Debian Policy section 4.9 says:
>   clean (required)
>  This must undo any effects that the build and binary targets may
>  have had, except that it should leave alone any output files
>  created in the parent directory by a run of a binary target.

Now this is something I never understood why it would be needed and it seems 
that it might be necessary on other types of flows from the one I use.

Normally my repos only have the stuff in debian/ committed in them. If a 
package fails in a way I need to dig deeper I create a chroot, install the 
build dependencies, untar the source code and work. If I need to clean, git 
clean -xdff && untar source code again.

I could be easily missing something beneficial for my workflow there but, to be 
honest, I never really required debian/clean to work.

And yes, if someone wants to build my packages then they need to understand 
how to use git and untar the source code. If they don't, I point them to the 
docs (we do have them).

Also I have never seen building a package twice in a row on the archive 
happening... so no, at least in my use case, I do not see the value of
debian/clean.

But again, I might be missing something here.




Re: Potential MBF: packages failing to build twice in a row

2023-08-12 Thread Lucas Nussbaum
On 10/08/23 at 14:38 +0200, Lucas Nussbaum wrote:
> On 08/08/23 at 10:26 +0200, Helmut Grohne wrote:
> > Are we ready to call for consensus on dropping the requirement that
> > `debian/rules clean; dpkg-source -b` shall work or is anyone interested
> > in sending lots of patches for this?
> 
> My reading of the discussion is that there's sufficient interest for
> ensuring that building-source-after-successful-binary-build works.
> 
> Also, for most packages, fixes are trivial, and can be implemented as
> durable fixes (not requiring changes for each upstream release).
> 
> So my proposal would be to file bugs against affected packages, with
> severity:minor for now (even if it is a clear policy violation).
> The bugs would be properly usertagged to allow tracking, and point to a
> wiki page where we can share recipes for specific issues.
> 
> The rate at which packages are fixed could be useful input to determine
> if we can just live with that requirement, or if we need to change
> policy.
> 
> After some time, when enough bugs are fixed, the severity could be
> increased to release-critical. And to ensure that we don't regress again
> on this, this check could easily be added to archive rebuilds.

Hi,

I prepared this wiki page:
https://wiki.debian.org/qa.debian.org/FTBFS/SourceAfterBuild

And I plan to use the following bug template:
--
From: {{ fullname }} <{{ email }}>
To: sub...@bugs.debian.org
Subject: {{ package }}: Fails to build source after successful build

Source: {{ package }}
Version: {{ version }}
Severity: minor
Tags: trixie sid ftbfs
User: lu...@debian.org
Usertags: ftbfs-sab-{{ date_without_slashes }} ftbfs-source-after-build

Hi,

This package fails to build a source package after a successful build
(dpkg-buildpackage ; dpkg-buildpackage -S).

This is probably a clear violation of Debian Policy section 4.9 (clean target),
but this is filed as severity:minor for now, because a discussion on
debian-devel showed that we might want to revisit the requirement of a working
'clean' target.

More information about this class of issues, including common problems and
solutions, is available at
https://wiki.debian.org/qa.debian.org/FTBFS/SourceAfterBuild

Relevant part of the build log:
{% for line in extract %}> {{ line }}
{% endfor %}

The full build log is available from:
http://qa-logs.debian.net/{{ date }}/{{ filename }}

If you reassign this bug to another package, please mark it as 'affects'-ing
this package. See https://www.debian.org/Bugs/server-control#affects

If you fail to reproduce this, please provide a build log and diff it with mine
so that we can identify if something relevant changed in the meantime.
--

I will first focus on packages where 'dpkg-buildpackage -S' fails after
a successful build.

I might do the same work for packages where 'dpkg-buildpackage -b' fails
after a successful build.

Lucas



Re: Potential MBF: packages failing to build twice in a row

2023-08-10 Thread Guillem Jover
On Wed, 2023-08-09 at 22:10:51 +0200, Johannes Schauer Marin Rodrigues wrote:
> Quoting Guillem Jover (2023-08-09 20:55:17)
> > I think I've mentioned this before, but dpkg-source is supposed to be
> > generating reproducible source packages since around the time dpkg-deb
> > has been generating reproducible binary packages. If that's not the case
> > in some circumstance I'd consider that a bug worth fixing (or at least
> > pondering whether it makes sense to support :).

> I ran diffoscope on the differing debian.tar.xz files and got:
> 
>   --- ../hello_2.10-3.debian.tar.xz.bak
>   +++ ../hello_2.10-3.debian.tar.xz
>   │┄ Format-specific differences are supported for XZ compressed files 
> but no file-specific differences were detected; falling back to a binary 
> diff. file(1) reports: XZ compressed data, checksum CRC64
> 
> I suspect that different versions of xz produce differently compressed
> archives?

As Sven mentions this is probably due to the new parallel xz execution,
but…

> In any case, I have no time for a more thorough analysis right now but this
> was an example that makes my point: passing --source to sbuild might
> overwrite your existing source package with something different and thus
> it's not trivial anymore to assure that what you built really came from
> those exact same sources, as the source that the build was done from was not
> bit-by-bit identical to the source produced by sbuild --source. In fact the
> .buildinfo file produced by sbuild will reference the *new* dsc and that was
> not the dsc that the build was done from. So my point stands: avoid running
> sbuild with --source.

…the same constraints would apply to dpkg-source as they would when
building binary packages, as in, it can produce reproducible source as
long as the environment described in the .buildinfo file is the same
(or equivalent functionality wise).

So this works even ignoring the .buildinfo, as long as the source is
regenerated close enough in time to when it was first created.
Otherwise if it gets regenerated at any other future time then it
might fail to reproduce, yes, and trying to honor the .buildinfo while
using an up-to-date build environment would seem to be in conflict. So I
guess your recommendation makes sense if this was intended to be used
unconditionally (more so in circumstances where previously no source
would generally be created, say on binNMUs or similar).
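
To tell "only the compression differs" apart from "the content differs", one can compare the decompressed payloads instead of the compressed files. The following Python sketch does that with the standard lzma module; the function names are made up for this example and it is not part of dpkg or sbuild:

```python
import lzma


def same_payload(xz_a, xz_b):
    """True if two xz blobs decompress to identical bytes."""
    return lzma.decompress(xz_a) == lzma.decompress(xz_b)


def describe(xz_a, xz_b):
    """Classify two xz blobs: 'identical', 'same-content' or 'different'."""
    if xz_a == xz_b:
        return "identical"
    return "same-content" if same_payload(xz_a, xz_b) else "different"
```

A result of "same-content" for two debian.tar.xz files would point at the compressor (version, threading, settings) rather than at the packaged files.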

Thanks,
Guillem



Re: Potential MBF: packages failing to build twice in a row

2023-08-10 Thread Andrey Rakhmatullin
On Thu, Aug 10, 2023 at 02:22:30PM +0200, Lucas Nussbaum wrote:
> > It might be worth to consider changing your workflow a bit and work with
> > a git repository. It does not have to be a clone of the repository (if
> > any) where the package is maintained, you can start with a fresh import,
> > e.g. with "gbp import-dsc".
> > 
> > Then before building the package for the first time, commit or at least
> > stash your changes, and you can go easily back to a clean state with
> > "git reset --hard; git clean -fdqx".
> 
> While this works in practice (and I do that as well), I find it hard to
> explain to new contributors that hacking on random packages requires
> them to create a temporary git repository so that they can revert the
> package's source to a clean state.
They should be able to get an official git repo and hack on it instead.
Not now or in <10 years though, unless I guess you count dgit.



Re: Potential MBF: packages failing to build twice in a row

2023-08-10 Thread Lucas Nussbaum
On 08/08/23 at 10:26 +0200, Helmut Grohne wrote:
> Are we ready to call for consensus on dropping the requirement that
> `debian/rules clean; dpkg-source -b` shall work or is anyone interested
> in sending lots of patches for this?

My reading of the discussion is that there's sufficient interest for
ensuring that building-source-after-successful-binary-build works.

Also, for most packages, fixes are trivial, and can be implemented as
durable fixes (not requiring changes for each upstream release).

So my proposal would be to file bugs against affected packages, with
severity:minor for now (even if it is a clear policy violation).
The bugs would be properly usertagged to allow tracking, and point to a
wiki page where we can share recipes for specific issues.

The rate at which packages are fixed could be useful input to determine
if we can just live with that requirement, or if we need to change
policy.

After some time, when enough bugs are fixed, the severity could be
increased to release-critical. And to ensure that we don't regress again
on this, this check could easily be added to archive rebuilds.

Lucas



Re: Potential MBF: packages failing to build twice in a row

2023-08-10 Thread Lucas Nussbaum
Hi,

On 05/08/23 at 21:01 +0200, Sven Joachim wrote:
> On 2023-08-05 19:31 +0100, Wookey wrote:
> 
> > On 2023-08-05 17:06 +0200, Lucas Nussbaum wrote:
> >>
> >> I wonder what we should do, because 5000+ failing packages is a lot...
> >>
> >> Should we give up on requiring a 'clean' target that works? After all,
> >> when 17% of packages are failing, it means that many maintainers don't
> >> depend on it in their workflow.
> >
> > I still depend on this in my workflow, and it's very frustrating that
> > a large fraction of packages are broken in this way. I'd love it if we
> > had a bit of automation to tell people it's bust so they can fix
> > it. Sometimes it is hard because build systems mess up your source
> > tree, but a lot of the time it isn't. I have some sympathy for people
> > who would have to do a lot of work to fight a build system that
> > doesn't care about clean source trees if they don't care about them either.
> >
> > On the other hand it is a massive PITA when you build a package, and
> > something breaks, and you try to build it again and it won't because
> > the source tree has changed and the clean target doesn't actually clean.
> > This happens way too often these days.
> 
> It might be worth to consider changing your workflow a bit and work with
> a git repository. It does not have to be a clone of the repository (if
> any) where the package is maintained, you can start with a fresh import,
> e.g. with "gbp import-dsc".
> 
> Then before building the package for the first time, commit or at least
> stash your changes, and you can go easily back to a clean state with
> "git reset --hard; git clean -fdqx".

While this works in practice (and I do that as well), I find it hard to
explain to new contributors that hacking on random packages requires
them to create a temporary git repository so that they can revert the
package's source to a clean state.

We already have a large pile of tools and workflows that need to be
mastered when contributing to Debian. It would be better to avoid adding
an additional one...

Lucas



Re: Potential MBF: packages failing to build twice in a row

2023-08-10 Thread Timo Röhling

* Jonas Smedegaard  [2023-08-10 12:32]:
> Example: An organisation has examined the licensing of Chromium as installed
> on their Android and Linux systems, expressed as SPDX datasets with SHA1
> checksums for upstream tarballs.  They need to do a full analysis for
> each upstream release, but would prefer to only need a partial analysis
> for each Debian repackaging if possible.  If Debian included a SHA1
> which matched a SHA1 in their SPDX dataset then they benefit.  If the SHA1s
> for one reason or another don't match then it is not a sign of insecurity,
> only a more expensive process for them because they then need to analyze
> that repackaged tarball as unique instead of as a derivation of
> something known to them.


I agree that you describe a valid use-case and a good reason why
Debian maintainers should not repack source archives arbitrarily,
but it does not refute my point. A cryptographic hash is not a
signature, it merely represents a particular binary blob (such as a
source archive) and makes no claim about the authorship of that
blob. In fact, you can even compute it yourself if upstream refuses
to do so, and your described use-case would still work: You don't
need to know about authorship, you only need to know if some tarball
has content you have already seen and examined before.
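
Computing such a hash yourself takes only a few lines. Here is a Python sketch using the standard hashlib module; file_digest is a made-up name for this example, and the default of SHA-256 is my choice rather than anything the SPDX use-case prescribes:

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of the file at path, read in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Passing algorithm="sha1" would give a value comparable to the SHA1 checksums mentioned in the quoted example.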


Cheers
Timo



--
⢀⣴⠾⠻⢶⣦⠀   ╭╮
⣾⠁⢠⠒⠀⣿⡁   │ Timo Röhling   │
⢿⡄⠘⠷⠚⠋⠀   │ 9B03 EBB9 8300 DF97 C2B1  23BF CC8C 6BDD 1403 F4CA │
⠈⠳⣄   ╰╯




Re: Potential MBF: packages failing to build twice in a row

2023-08-10 Thread Jonas Smedegaard
Quoting Timo Röhling (2023-08-10 11:56:42)
> Hi,
> 
> * Helmut Grohne  [2023-08-10 06:43]:
> >When repacking, the upstream signature becomes useless and external
> >parties can no longer verify it at ease. Including that upstream
> >signature increases trust in the source shipped by Debian being
> >good.
> I don't think that problem is very relevant in practise.
> 
> On the one hand, the vast majority of upstreams I have encountered
> so far do not ship any signatures at all. Some upstreams do not even
> have an immutable release archive; Github (for example) generates
> TARs and ZIPs on the fly and changes the exact format from time to
> time.
> 
> On the other hand, those upstream developers who care enough to go
> the extra mile with a meaningful [1] cryptographic signature,
> probably also pay more attention to the actual files they ship,
> making it less likely to require repacks in the first place.
> 
> 
> Cheers
> Timo
> 
> 
> [1] A signature is only meaningful if the signing key is kept
> secure. If you upload a GPG private key to your favorite code
> hoster and have it sign releases automatically, you have a very
> convenient workflow that achieves nothing at all, because the
> integrity of the release still depends on the integrity of the
> hosting platform.

I disagree that hoster-signed releases are totally worthless.

Even if we in Debian consider (other) hosters not worthy of our trust,
downstreams of Debian may value some hosters differently and find value
in our tracking their offered signatures.

Example: An organisation has examined the licensing of Chromium as installed
on their Android and Linux systems, expressed as SPDX datasets with SHA1
checksums for upstream tarballs.  They need to do a full analysis for
each upstream release, but would prefer to only need a partial analysis
for each Debian repackaging if possible.  If Debian included a SHA1
which matched a SHA1 in their SPDX dataset then they benefit.  If the SHA1s
for one reason or another don't match then it is not a sign of insecurity,
only a more expensive process for them because they then need to analyze
that repackaged tarball as unique instead of as a derivation of
something known to them.

 - Jonas

-- 
 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136  Website: http://dr.jones.dk/
 * Sponsorship: https://ko-fi.com/drjones

 [x] quote me freely  [ ] ask before reusing  [ ] keep private



Re: Potential MBF: packages failing to build twice in a row

2023-08-10 Thread Timo Röhling

Hi,

* Helmut Grohne  [2023-08-10 06:43]:
> When repacking, the upstream signature becomes useless and external
> parties can no longer verify it at ease. Including that upstream
> signature increases trust in the source shipped by Debian being
> good.

I don't think that problem is very relevant in practice.

On the one hand, the vast majority of upstreams I have encountered
so far do not ship any signatures at all. Some upstreams do not even
have an immutable release archive; Github (for example) generates
TARs and ZIPs on the fly and changes the exact format from time to
time.

On the other hand, those upstream developers who care enough to go
the extra mile with a meaningful [1] cryptographic signature,
probably also pay more attention to the actual files they ship,
making it less likely to require repacks in the first place.


Cheers
Timo


[1] A signature is only meaningful if the signing key is kept
secure. If you upload a GPG private key to your favorite code
hoster and have it sign releases automatically, you have a very
convenient workflow that achieves nothing at all, because the
integrity of the release still depends on the integrity of the
hosting platform.

--
⢀⣴⠾⠻⢶⣦⠀   ╭╮
⣾⠁⢠⠒⠀⣿⡁   │ Timo Röhling   │
⢿⡄⠘⠷⠚⠋⠀   │ 9B03 EBB9 8300 DF97 C2B1  23BF CC8C 6BDD 1403 F4CA │
⠈⠳⣄   ╰╯




Re: Potential MBF: packages failing to build twice in a row

2023-08-10 Thread Helmut Grohne
Hi Wookey,

On Wed, Aug 09, 2023 at 02:30:43PM +0100, Wookey wrote:
> I have never tried Helmut's suggestion of removing this stuff in the
> clean target. It does seem to me that removing it from the tarball
> makes a lot more sense than cleaning it later.

I do see all the advantages of repacking that you and Simon presented.
We don't have to argue about them. Simon also pointed at a severe
limitation though: When repacking, the upstream signature becomes
useless and external parties can no longer verify it at ease. Including
that upstream signature increases trust in the source shipped by Debian
being good.

For cases where we repack anyway (e.g. for licensing reasons), we have
broad consensus that we should also delete generated files at the
repacking stage. I also see a shift here where we may recommend
repacking just for deleting unused files in the absence of an upstream
signature. The arguments are convincing to me.

Does anyone see a way to enable upstream signature verification with
repacked sources? This seems technically incompatible: In order to
verify the signature, we really have to ship the original tar and thus
get into the licensing mess. So the best we might do here is point at
the original tar and signature (hoping that it does not go away) and
providing a tool that verifies the signature and establishes that the
repacked source really corresponds to the verified tar. Is anyone aware
of such tooling?
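
For what it's worth, the second half of that idea (establishing that the
repacked source corresponds to the original tar) could be sketched in a
few lines of Python.  This is only an illustration, not an existing
tool: the function names are invented, the exclusion patterns are
matched with `fnmatch`, which only approximates how Files-Excluded
patterns are interpreted, and it assumes both tarballs unpack into a
single top-level directory.  Verifying the upstream signature on the
original tar would remain a separate step.

```python
import fnmatch
import hashlib
import tarfile


def member_digests(tar_path):
    """Map each regular file in the tarball to its SHA256 hex digest.

    The leading path component (e.g. "foo-1.0/") is dropped so that a
    renamed top-level directory does not count as a difference.
    """
    digests = {}
    with tarfile.open(tar_path) as archive:
        for member in archive:
            if not member.isfile():
                continue
            _, _, relative = member.name.partition("/")
            data = archive.extractfile(member).read()
            digests[relative] = hashlib.sha256(data).hexdigest()
    return digests


def check_repack(original_tar, repacked_tar, excluded_patterns):
    """Return a list of problems; an empty list means the repack matches."""
    original = member_digests(original_tar)
    repacked = member_digests(repacked_tar)
    problems = []
    # Everything kept in the repack must exist unchanged in the original.
    for name, digest in sorted(repacked.items()):
        if name not in original:
            problems.append(f"added: {name}")
        elif original[name] != digest:
            problems.append(f"modified: {name}")
    # Everything dropped must be covered by a declared exclusion.
    for name in sorted(set(original) - set(repacked)):
        if not any(fnmatch.fnmatch(name, p) for p in excluded_patterns):
            problems.append(f"removed but not excluded: {name}")
    return problems
```

The sketch only compares regular file contents; symlinks, permissions
and timestamps are ignored, so a real tool would have to decide which of
those matter.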

In the absence of such tooling, I continue to see clean-before-build as
a valid strategy for dealing with generated files and vendored sources.

Helmut



Re: Potential MBF: packages failing to build twice in a row

2023-08-09 Thread Sven Joachim
On 2023-08-09 22:10 +0200, Johannes Schauer Marin Rodrigues wrote:

> it has been a long time since I've analyzed this so things might've changed
> indeed since then. But what I remember is that, depending on the source
> package, running sbuild with --source would produce a different source package
> than was originally passed to sbuild. I tried running this on a few source
> packages to see if I can reproduce this problem today:
>
> sbuild --source --arch-all --arch-any -d unstable --no-run-lintian \
> --no-run-autopkgtest \
> --starting-build-commands='grep -E "^ [a-f0-9]{64} " *_*.dsc > before' \
> --finished-build-commands='grep -E "^ [a-f0-9]{64} " *_*.dsc | diff -u before -'
>
> Which prints for src:hello this:
>
>   --- before  2023-08-09 19:46:05.092628335 +
>   +++ -   2023-08-09 19:46:25.873292249 +
>   @@ -1,3 +1,3 @@
>     31e066137a962676e89f69d1b65382de95a7ef7d914b8cb956f41ea72e0f516b 725946 hello_2.10.orig.tar.gz
>     4ea69de913428a4034d30dcdcb34ab84f5c4a76acf9040f3091f0d3fac411b60 819 hello_2.10.orig.tar.gz.asc
>   - 60ee7a466808301fbaa7fea2490b5e7a6d86f598956fb3e79c71b3295dc1f249 12684 hello_2.10-3.debian.tar.xz
>   + 84b14a8c49f9bca8d6c7a5550fed71790e147576c8eb716b2afbd49df4d5a7a9 12692 hello_2.10-3.debian.tar.xz
>
>
> I ran diffoscope on the differing debian.tar.xz files and got:
>
>   --- ../hello_2.10-3.debian.tar.xz.bak
>   +++ ../hello_2.10-3.debian.tar.xz
>   │┄ Format-specific differences are supported for XZ compressed files but no file-specific differences were detected; falling back to a binary diff. file(1) reports: XZ compressed data, checksum CRC64
>
> I suspect that different versions of xz produce differently compressed
> archives?

Not really, actually different versions of dpkg-source produce them.
The xz manpage notes that the single-threaded and multi-threaded
compressors produce different output, and dpkg 1.21.14 switched from
single-threaded to multi-threaded compression.  The hello package was
uploaded to the archive before the dpkg 1.21.14 release.

The uploader can also change the compression level with the -z option,
after which you might not be able to reproduce their debian.tar.xz so
easily.
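
As a small illustration of the compression-level point, the following
Python sketch compresses the same payload with two xz presets.  It uses
Python's standard `lzma` module rather than dpkg-source, and the payload
is made up; the module does not expose the multi-threaded encoder, so
this shows only the effect of the level, not of threading.

```python
import lzma

# Arbitrary stand-in for the contents of a debian.tar.
payload = b"debian/rules clean\n" * 4096

# Same input, two different compression presets (comparable in spirit to
# passing a different -z level to dpkg-source).
fast = lzma.compress(payload, format=lzma.FORMAT_XZ, preset=0)
best = lzma.compress(payload, format=lzma.FORMAT_XZ, preset=9)

# Both streams decompress to the identical payload ...
same_content = lzma.decompress(fast) == lzma.decompress(best) == payload
# ... but the compressed bytes, and therefore any checksum over them, differ.
same_bytes = fast == best
```

So a differing checksum on a .tar.xz does not by itself mean that the
unpacked contents differ.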

Cheers,
   Sven



Re: Potential MBF: packages failing to build twice in a row

2023-08-09 Thread Johannes Schauer Marin Rodrigues
Hi,

Quoting Guillem Jover (2023-08-09 20:55:17)
> On Wed, 2023-08-09 at 19:55:41 +0200, Johannes Schauer Marin Rodrigues wrote:
> > I would only consider switching the default if at the same time, some checks
> > were done that made sure that the result is bit-by-bit identical to the
> > original.
> > 
> > The source package is the *input* to sbuild not its output. If sbuild builds
> > the source package it can happen that the resulting source package is not what
> > was given to sbuild to get built before.
> > 
> > So if the source package gets rebuilt and checked whether it is bit-by-bit
> > identical to what was given to sbuild before, then essentially we would've
> > enforced reproducible source packages. If I remember correctly, reproducible
> > source packages are something that the reproducible builds team discarded as a
> > concept many years ago.
> 
> I think I've mentioned this before, but dpkg-source is supposed to be
> generating reproducible source packages since around the time dpkg-deb
> has been generating reproducible binary packages. If that's not the case
> in some circumstance I'd consider that a bug worth fixing (or at least
> pondering whether it makes sense to support :).

it has been a long time since I've analyzed this so things might've changed
indeed since then. But what I remember is that, depending on the source
package, running sbuild with --source would produce a different source package
than was originally passed to sbuild. I tried running this on a few source
packages to see if I can reproduce this problem today:

sbuild --source --arch-all --arch-any -d unstable --no-run-lintian \
--no-run-autopkgtest \
--starting-build-commands='grep -E "^ [a-f0-9]{64} " *_*.dsc > before' \
--finished-build-commands='grep -E "^ [a-f0-9]{64} " *_*.dsc | diff -u before -'

Which prints for src:hello this:

--- before  2023-08-09 19:46:05.092628335 +
+++ -   2023-08-09 19:46:25.873292249 +
@@ -1,3 +1,3 @@
  31e066137a962676e89f69d1b65382de95a7ef7d914b8cb956f41ea72e0f516b 725946 hello_2.10.orig.tar.gz
  4ea69de913428a4034d30dcdcb34ab84f5c4a76acf9040f3091f0d3fac411b60 819 hello_2.10.orig.tar.gz.asc
- 60ee7a466808301fbaa7fea2490b5e7a6d86f598956fb3e79c71b3295dc1f249 12684 hello_2.10-3.debian.tar.xz
+ 84b14a8c49f9bca8d6c7a5550fed71790e147576c8eb716b2afbd49df4d5a7a9 12692 hello_2.10-3.debian.tar.xz


I ran diffoscope on the differing debian.tar.xz files and got:

--- ../hello_2.10-3.debian.tar.xz.bak
+++ ../hello_2.10-3.debian.tar.xz
│┄ Format-specific differences are supported for XZ compressed files but no file-specific differences were detected; falling back to a binary diff. file(1) reports: XZ compressed data, checksum CRC64

I suspect that different versions of xz produce differently compressed
archives?

In any case, I have no time for a more thorough analysis right now but this was
an example that makes my point: passing --source to sbuild might overwrite your
existing source package with something different and thus it's not trivial
anymore to assure that what you built really came from those exact same sources,
as the source that the build was done from was not bit-by-bit identical to the
source produced by sbuild --source. In fact the .buildinfo file produced by
sbuild will reference the *new* dsc and that was not the dsc that the build was
done from. So my point stands: avoid running sbuild with --source.
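
The grep/diff comparison from the sbuild hooks can also be expressed as
a short Python sketch, for anyone who wants to run the same check
outside of sbuild.  The function names are made up, and the regular
expression simply mirrors the grep pattern above; it does not parse the
.dsc format properly.

```python
import re

# Same pattern as the grep in the sbuild invocation: a line holding a
# SHA256 digest, a size and a file name.
CHECKSUM_LINE = re.compile(r"^ ([a-f0-9]{64}) (\d+) (\S+)$", re.MULTILINE)


def sha256_entries(dsc_text):
    """Return {file name: (sha256, size)} for the checksum lines in a .dsc."""
    return {
        name: (digest, int(size))
        for digest, size, name in CHECKSUM_LINE.findall(dsc_text)
    }


def changed_files(before_text, after_text):
    """Names whose digest or size differs, or which exist on one side only."""
    before = sha256_entries(before_text)
    after = sha256_entries(after_text)
    return sorted(
        name
        for name in set(before) | set(after)
        if before.get(name) != after.get(name)
    )
```

Fed with the two versions of the hello .dsc, this would be expected to
report only hello_2.10-3.debian.tar.xz, matching the diff output above.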

Thanks!

cheers, josch



Re: Potential MBF: packages failing to build twice in a row

2023-08-09 Thread Guillem Jover
Hi!

On Wed, 2023-08-09 at 19:55:41 +0200, Johannes Schauer Marin Rodrigues wrote:
> I would only consider switching the default if at the same time, some checks
> were done that made sure that the result is bit-by-bit identical to the
> original.
> 
> The source package is the *input* to sbuild not its output. If sbuild builds
> the source package it can happen that the resulting source package is not what
> was given to sbuild to get built before.
> 
> So if the source package gets rebuilt and checked whether it is bit-by-bit
> identical to what was given to sbuild before, then essentially we would've
> enforced reproducible source packages. If I remember correctly, reproducible
> source packages are something that the reproducible builds team discarded as a
> concept many years ago.

I think I've mentioned this before, but dpkg-source is supposed to be
generating reproducible source packages since around the time dpkg-deb
has been generating reproducible binary packages. If that's not the case
in some circumstance I'd consider that a bug worth fixing (or at least
pondering whether it makes sense to support :).

Thanks,
Guillem



Re: Potential MBF: packages failing to build twice in a row

2023-08-09 Thread Scott Kitterman



On August 9, 2023 5:55:41 PM UTC, Johannes Schauer Marin Rodrigues wrote:
>Hi,
>
>Quoting Stefano Rivera (2023-08-09 14:38:56)
>> Personally, I have my sbuild configured to build a source package after the
>> build, so that I can be sure that I don't regress my own packages' clean
>> target. It would be nice if this was a default feature in sbuild, for most
>> packages this is a very quick process.
>
>I would only consider switching the default if at the same time, some checks
>were done that made sure that the result is bit-by-bit identical to the
>original.
>
>The source package is the *input* to sbuild not its output. If sbuild builds
>the source package it can happen that the resulting source package is not what
>was given to sbuild to get built before.
>
>So if the source package gets rebuilt and checked whether it is bit-by-bit
>identical to what was given to sbuild before, then essentially we would've
>enforced reproducible source packages. If I remember correctly, reproducible
>source packages are something that the reproducible builds team discarded as a
>concept many years ago.
>
>So what should be the plan instead?
>
I think that's almost the right goal.

The binary package that is built from that source package should be identical 
to the one produced from the first build.

As an example, in Python packages it is not unusual to ignore the diff 
associated with rebuilding the upstream package metadata.  You get the same 
binary regardless, so as long as you can build the source package by ignoring 
that diff, it's good to go.

Scott K



Re: Potential MBF: packages failing to build twice in a row

2023-08-09 Thread Johannes Schauer Marin Rodrigues
Hi,

Quoting Stefano Rivera (2023-08-09 14:38:56)
> Personally, I have my sbuild configured to build a source package after the
> build, so that I can be sure that I don't regress my own packages' clean
> target. It would be nice if this was a default feature in sbuild, for most
> packages this is a very quick process.

I would only consider switching the default if at the same time, some checks
were done that made sure that the result is bit-by-bit identical to the
original.

The source package is the *input* to sbuild not its output. If sbuild builds
the source package it can happen that the resulting source package is not what
was given to sbuild to get built before.

So if the source package gets rebuilt and checked whether it is bit-by-bit
identical to what was given to sbuild before, then essentially we would've
enforced reproducible source packages. If I remember correctly, reproducible
source packages are something that the reproducible builds team discarded as a
concept many years ago.

So what should be the plan instead?

Thanks!

cheers, josch



Re: Potential MBF: packages failing to build twice in a row

2023-08-09 Thread Theodore Ts'o
On Tue, Aug 08, 2023 at 10:26:09AM +0200, Helmut Grohne wrote:
> As a minor data point, I also do not rely on `debian/rules clean` to
> work for reproducing the original source tree, because too many packages
> fail it.
> 
> Let me point out though that moving to git-based packaging is not the
> property that is relevant here. I expect that most developers use either
> sbuild or pbuilder for the majority of their builds. Both tools create a
> .dsc, copy that .dsc into a chroot, unpack, build, and dispose of it. So
> we effectively have at least three ways of cleaning source packages:
> 
> a) `debian/rules clean`
> b) Some VCS (and that's probably just git)
> c) Copy the source before build and dispose the entire copy

For what it's worth, my packages are managed using git, and sometimes
I'll use git-buildpackage (with sbuild as the backend), dgit (for
releases to unstable; but for some reason it mysteriously fails when
doing uploads to backports), as well as dpkg-buildpackage in the git
repository.

Because I *do* run dpkg-buildpackage for my test builds, I actually have
an incentive to make "./debian/rules clean" work correctly, because
running dpkg-buildpackage leaves modified files all over my
repository's working directory, and it's *useful* that "debian/rules
clean" gets my repository back to a clean state.  I could do "git
reset --hard", but sometimes I have locally modified files in the working
directory, and "git reset --hard" would blast all of that, whereas
"./debian/rules clean" does what I want.

Cheers,

- Ted



Re: Potential MBF: packages failing to build twice in a row

2023-08-09 Thread Wookey
On 2023-08-09 10:56 +0100, Simon McVittie wrote:
> On Tue, 08 Aug 2023 at 10:26:09 +0200, Helmut Grohne wrote:
> > With this you touch another purpose of `debian/rules clean`: Removing
> > generated files. Since we currently discourage repackaging and
> > `dpkg-source -b` is vaguely happy about deleted files, a common
> > technique for dealing with generated files is really shipping them in
> > the source tree and then deleting them via `debian/rules clean` while
> > relying on build tools (and our buildds do this) to clean before build.
> > From my point of view, this is the main purpose of the clean target at
> > this time.
> > 
> > Do others see this strategy of dealing with generated files as viable
> > and is it compatible with git-based workflows?
> 
> 
> The major alternative to this use of d/clean is to repack the tarball
> with the generated files completely excluded, for example via uscan with
> Files-Excluded in d/copyright.

I generally prefer to do this. The original source is smaller (often
dramatically so: today's example shrinks from 18MB to 2.3MB after
excluding some test images of uncertain provenance). I find a lot of
upstream sources shrink by a factor of at least 3 if you get rid of
the cruft.

I'm not sure why we discourage this as it does seem pointless to fill
up our archive with vendored copies we don't use/want or windows
binaries, docs that will be rebuilt anyway, or similar.

For stuff that is sufficiently extraneous I don't always put a +ds in
either, as what's left really is the source, not 'the source plus a
load of pointless cruft for other OSes'.

I understand why we like to distinguish between 'upstream exactly' and
'a modified version', but there is a common set of repackings which
really is just a better version of 'actually the source'. I think it's
good practice to use these as our 'upstream', and I'm not convinced of
the value of adding a '+' suffix in this case. Perhaps we could/should
adjust policy to say that this is in fact good practice?

> If your upstream has non-DFSG or ambiguously-licensed files in their
> source releases and you need to repack a +dfsg tarball to get rid of
> those files *anyway*, then there's no significant additional cost to
> excluding generated and irrelevant-to-Debian files while you're there.

Agreed.

> If not, then the decision to be made is whether the generated files are
> large enough or annoying enough, in combination with other factors (like
> whether there are vendored subprojects whose copyright/licensing would be
> a lot of work to track in d/copyright), to justify repacking a +ds tarball
> without them.

There is very little work in adding some exclusions to
debian/copyright. That's it, and then (at least for
uscan/uupdate/debuild/sbuild type workflows) everything 'just
works'. So it's easy to justify in terms of maintenance effort IMHO.

The only maintenance load comes if/when upstream move things around.

I have never tried Helmut's suggestion of removing this stuff in the
clean target. It does seem to me that removing it from the tarball
makes a lot more sense than cleaning it later.

Wookey
-- 
Principal hats:  Debian, Wookware, ARM
http://wookware.org/




Re: Potential MBF: packages failing to build twice in a row

2023-08-09 Thread Stefano Rivera
Hi Sven (2023.08.05_19:01:19_+)
> It might be worth to consider changing your workflow a bit and work with
> a git repository. It does not have to be a clone of the repository (if
> any) where the package is maintained, you can start with a fresh import,
> e.g. with "gbp import-dsc".
> 
> Then before building the package for the first time, commit or at least
> stash your changes, and you can go easily back to a clean state with
> "git reset --hard; git clean -fdqx".

I find myself in the same position as Wookey, here. I will work with the
git tree (if it exists) most of the time when I'm touching a package
I've never touched before. Unless the repository is unreasonably big, or
broken.

However, when it fails to build, it's inside my build environment, not a
git tree. It's nice to be able to troubleshoot the build in there,
rather than have to spin up a new build (and install build-deps etc.).
So I find myself having to fix the clean target to continue to work on
the package. This is annoying, it would be nice if packages cleaned
correctly.

Personally, I have my sbuild configured to build a source package after
the build, so that I can be sure that I don't regress my own packages'
clean target. It would be nice if this was a default feature in sbuild,
for most packages this is a very quick process.

Stefano

-- 
Stefano Rivera
  http://tumbleweed.org.za/
  +1 415 683 3272



Re: Potential MBF: packages failing to build twice in a row

2023-08-09 Thread Simon McVittie
On Tue, 08 Aug 2023 at 10:26:09 +0200, Helmut Grohne wrote:
> With this you touch another purpose of `debian/rules clean`: Removing
> generated files. Since we currently discourage repackaging and
> `dpkg-source -b` is vaguely happy about deleted files, a common
> technique for dealing with generated files is really shipping them in
> the source tree and then deleting them via `debian/rules clean` while
> relying on build tools (and our buildds do this) to clean before build.
> From my point of view, this is the main purpose of the clean target at
> this time.
> 
> Do others see this strategy of dealing with generated files as viable
> and is it compatible with git-based workflows?

It's compatible, but annoying, for a couple of reasons:

The generated files show up in the diff when looking at the changes
between two consecutive upstream releases, and need to be filtered out
or ignored based on out-of-band knowledge that actually they are going to
be deleted. This is especially annoying when they're numerous or large,
like pre-generated HTML documentation, or GLib's GResource-generated
files (which are a way to embed binary icons etc. into your executable
or library).

If you build the package with git-buildpackage or similar, there
isn't usually a need to delete them from your git working tree;
but if you build in-tree (cd mypackage, git checkout debian/latest,
debuild, debuild -T clean) or if you need to run debian/rules clean to
regenerate something that is generated-but-committed (like debian/control
from debian/control.in), then your working tree will have uncommitted
deletions which need to be committed or undone (`git checkout generated/`
or similar). If you upload with dgit, then not committing them is not an
option (because dgit requires the git tree to match the unpacked source
package 1:1), so you have to undo the deletions.

The major alternative to this use of d/clean is to repack the tarball
with the generated files completely excluded, for example via uscan with
Files-Excluded in d/copyright.

If your upstream has non-DFSG or ambiguously-licensed files in their
source releases and you need to repack a +dfsg tarball to get rid of
those files *anyway*, then there's no significant additional cost to
excluding generated and irrelevant-to-Debian files while you're there.

If not, then the decision to be made is whether the generated files are
large enough or annoying enough, in combination with other factors (like
whether there are vendored subprojects whose copyright/licensing would be
a lot of work to track in d/copyright), to justify repacking a +ds tarball
without them.

smcv



Re: Potential MBF: packages failing to build twice in a row

2023-08-09 Thread Paul Wise
On Sat, 2023-08-05 at 17:29 +0100, Simon McVittie wrote:

> Devref §6.8.8.2 also says that "it is common for Debian users who
> need to build software for non-Debian platforms to fetch the source
> from a Debian mirror rather than trying to locate a canonical upstream
> distribution point", but I'm not convinced that's true any more. Our
> volunteers' time is our most limited resource, so if we can use that
> time more efficiently by no longer catering to possibly-hypothetical
> users who are building our source code on non-Debian platforms, then
> that might be a worthwhile tradeoff.

For Guix and NixOS at least there are several references to the Debian
mirrors, so at least some distros rely on us for their source tarballs.
I expect that other source based distros do this too.

Personally, if I were to use a non-Debian platform, I would much prefer
getting tarballs from Debian, since they will be DFSG-checked at least,
which I would still want when dealing with non-Debian platforms.

-- 
bye,
pabs

https://wiki.debian.org/PaulWise




Re: Potential MBF: packages failing to build twice in a row

2023-08-09 Thread Paul Wise
On Sat, 2023-08-05 at 17:06 +0200, Lucas Nussbaum wrote:

> I wonder what we should do, because 5000+ failing packages is a lot...

Add a message about this on tracker.debian.org for affected packages?

> Should we give up on requiring a 'clean' target that works? After all,
> when 17% of packages are failing, it means that many maintainers don't
> depend on it in their workflow.

We should keep requiring this; having the clean process work properly
is very useful when you are doing repeat builds inside a chroot
or VM while debugging a problem, since not having to repeat the build
dependency install step is a useful time-saver.

This also surfaces bugs in upstream clean rules and where upstreams
have accidentally or deliberately shipped prebuilt files in tarballs.
It is important that Debian give back fixes for these to upstreams.

We should also never drop the requirement to run the clean process
before the build, because that would lead to some files not being
built from source in Debian where upstreams ship prebuilt files in
git/tarballs but the clean rules still also remove those files.

-- 
bye,
pabs

https://wiki.debian.org/PaulWise




Re: Potential MBF: packages failing to build twice in a row

2023-08-08 Thread Helmut Grohne
On Sat, Aug 05, 2023 at 05:29:34PM +0100, Simon McVittie wrote:
> I think it's somewhat inevitable that code paths that aren't frequently
> exercised don't work. If a majority of maintainers are doing all of
> their builds with git-buildpackage, or dgit --clean=git, or something
> basically equivalent to one of those, then `debian/rules clean` will
> never actually be run against a built tree. For teams with a strongly
> preferred workflow (like the Perl, Python and GNOME teams consistently
> using git-buildpackage), this seems particularly likely.

As a minor data point, I also do not rely on `debian/rules clean` to
work for reproducing the original source tree, because too many packages
fail it.

Let me point out though that moving to git-based packaging is not the
property that is relevant here. I expect that most developers use either
sbuild or pbuilder for the majority of their builds. Both tools create a
.dsc, copy that .dsc into a chroot, unpack, build, and dispose of it. So
we effectively have at least three ways of cleaning source packages:

a) `debian/rules clean`
b) Some VCS (and that's probably just git)
c) Copy the source before build and dispose the entire copy

That last approach may be annoying for large source packages, but it
works reliably for the entire archive.
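
A minimal sketch of approach c), written in Python purely for
illustration (sbuild and pbuilder do the real thing with a .dsc and a
chroot; the helper name here is invented):

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def disposable_copy(source_tree):
    """Yield a throwaway copy of `source_tree`; delete it afterwards."""
    with tempfile.TemporaryDirectory() as scratch:
        copy = Path(scratch) / "tree"
        # symlinks=True copies symlinks as symlinks instead of following them.
        shutil.copytree(source_tree, copy, symlinks=True)
        yield copy
    # Leaving the "with" block removes the scratch directory, including
    # anything the build generated in or next to the copied tree.
```

A build would then run with the yielded path as its working directory,
and any build products worth keeping have to be copied out before the
block ends.  The original tree is never touched, which is why this
works regardless of how broken the clean target is.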

> For me, the main purpose of `debian/rules clean` is being able to do
> incremental builds while debugging something - but if I want to do
> incremental builds, it's quite likely that I'll also be using
> `debuild -b -nc` to make the builds genuinely incremental (and then a fully
> clean build from first principles at the end, to verify that whatever issue
> I'm debugging is really fixed).

I see that the purpose of `debian/rules clean` is evolving and that we
should clarify which of the purposes we as a project consider important.
Given the state of discussion, I think we should drop the idea of using
it to construct a source package after build.

> One way to streamline dealing with these generated files would be
> to normalize repacking of upstream source releases to exclude them,
> and make it easier to have source packages that genuinely only contain
> what we consider to be source. At the moment, devref §6.8.8.2 strongly
> discourages repacking tarballs to exclude DFSG-but-unnecessary files
> (including generated files, as well as source/build files only needed on
> Windows or macOS or whatever[1]), and Lintian strongly encourages adding
> a +dfsg or +ds suffix to any repacked tarball, which makes it less
> straightforward to track upstream's versioning. Is it time for us to
> reconsider those recommendations?

With this you touch another purpose of `debian/rules clean`: Removing
generated files. Since we currently discourage repackaging and
`dpkg-source -b` is vaguely happy about deleted files, a common
technique for dealing with generated files is really shipping them in
the source tree and then deleting them via `debian/rules clean` while
relying on build tools (and our buildds do this) to clean before build.
From my point of view, this is the main purpose of the clean target at
this time.

Do others see this strategy of dealing with generated files as viable
and is it compatible with git-based workflows?

Are we ready to call for consensus on dropping the requirement that
`debian/rules clean; dpkg-source -b` shall work or is anyone interested
in sending lots of patches for this?

Helmut



Re: Potential MBF: packages failing to build twice in a row

2023-08-06 Thread Johannes Schauer Marin Rodrigues
Quoting Timo Röhling (2023-08-05 21:07:34)
> * Lucas Nussbaum  [2023-08-05 17:06]:
> >An example sbuild invocation to reproduce failures is:
> [omitted the command line equivalent of Tolstoy's War and Peace]
> 
> If we decide that this issue is important enough that people should
> care and mass bugs be filed, sbuild will need a more concise way to test
> this; something like pbuilder's --twice option.

if somebody feels strongly that a --twice option is needed for sbuild I'd be
happy to review and merge a MR against the sbuild packaging git on salsa:

https://salsa.debian.org/debian/sbuild/

Thanks!

cheers, josch



Re: Potential MBF: packages failing to build twice in a row

2023-08-06 Thread Simon McVittie
On Sat, 05 Aug 2023 at 21:29:08 +0200, Andrey Rakhmatullin wrote:
> I expect all Python packages that ship
> $name.egg-info and don't remove it in clean and don't exclude it via
> extend-diff-ignore (all of which is unneeded busywork even if recommended)
> to behave the same.

Python packages that *don't* ship $name.egg-info in their upstream source,
don't remove it in clean and don't exclude it via extend-diff-ignore will
also fail Lucas' test if they are 3.0 (quilt) format (or presumably will
have unintended diff instead if they are format 1.0). That's the only
reason bmap-tools_3.6-2 was on Lucas' list, for example.

smcv



Re: Potential MBF: packages failing to build twice in a row

2023-08-05 Thread Lucas Nussbaum
On 05/08/23 at 21:29 +0200, Andrey Rakhmatullin wrote:
> On Sat, Aug 05, 2023 at 07:20:19PM +0300, Adrian Bunk wrote:
> > What packages are failing, and why?
> > 
> > I would expect some debhelper machinery being responsible for most of 
> > these, e.g. perhaps some dh-whatever helper might be creating this 
> > issue for all 1k packages in some language ecosystem.
> I've checked one of my Python packages and it fails because the contents
> of $name.egg-info were modified. I expect all Python packages that ship
> $name.egg-info and don't remove it in clean and don't exclude it via
> extend-diff-ignore (all of which is unneeded busywork even if recommended)
> to behave the same.

Good point.

This seems to affect 1325 packages:
http://qa-logs.debian.net/2023/08/twice/python-egginfo.txt
http://qa-logs.debian.net/2023/08/twice/python-egginfo.txt.dd-list

Lucas



Re: Potential MBF: packages failing to build twice in a row

2023-08-05 Thread Adrian Bunk
On Sat, Aug 05, 2023 at 07:40:36PM +0200, Andrey Rakhmatullin wrote:
> On Sat, Aug 05, 2023 at 08:10:35PM +0300, Adrian Bunk wrote:
> > Debian maintainers with proper git workflows are already exporting all 
> > their changes from git to debian/patches/ as one file - currently the 
> > preferred form of modification of a Debian package has to be in salsa 
> > and not in our archive when the changes cannot be represented as quilt 
> > patches against tarballs.
> Is the gbp-pq workflow improper?

With "proper git workflow" I meant a workflow where the changes to the 
upstream sources are in topic branches that get rebased to new upstream 
versions and then merged.

"topic branch workflow" might have been better wording.

cu
Adrian



Re: Potential MBF: packages failing to build twice in a row

2023-08-05 Thread Adrian Bunk
On Sat, Aug 05, 2023 at 08:55:03PM +0200, Lucas Nussbaum wrote:
> On 05/08/23 at 19:20 +0300, Adrian Bunk wrote:
> > On Sat, Aug 05, 2023 at 05:06:27PM +0200, Lucas Nussbaum wrote:
> > >...
> > > Packages tested: 29883 (I filtered out those that take a very long time 
> > > to build)
> > > .. building OK all times: 24835 (83%)
> > > .. failing somehow: 5048 (17%)
> > >...
> > > I wonder what we should do, because 5000+ failing packages is a lot...
> > 
> > I doubt these are > 5k packages that need individual fixing.
> > 
> > What packages are failing, and why?
> 
> Did you see http://qa-logs.debian.net/2023/08/twice/ ?

Yes, after sending my email...

>...
> > > Should we give up on requiring a 'clean' target that works? After all,
> > > when 17% of packages are failing, it means that many maintainers don't
> > > depend on it in their workflow.
> > 
> > You are mixing two related but not identical topics.
> > 
> > Your subject talks about "failing to build twice in a row",
> > but the contents mostly talks about dpkg-source.
> > 
> > Based on my workflows I can say that building twice in a row, defined as
> >   dpkg-buildpackage -b --no-sign && dpkg-buildpackage -b --no-sign
> > works for > 99% of all packages in the archive.
> 
> That's true. However, if the 'clean' target doesn't work correctly,
> there are chances that the second build might not happen in the same
> conditions as the first one (for example because it will re-use
> left-overs from the first build).

Your test is not sufficient to ensure that the 'clean' target does work 
correctly: non-binary changes under debian/ might result in false negatives.

OTOH it is less of a problem for me if a package that does run autoconf 
during the build does not remove/restore the generated configure in the
'clean' target even though it might fail your test.

> Lucas

cu
Adrian



Re: Potential MBF: packages failing to build twice in a row

2023-08-05 Thread PICCA Frederic-Emmanuel
I second this idea; the salsa pipeline should check this as well.


- On 5 Aug 23, at 21:07, Timo Röhling roehl...@debian.org wrote:

> Hi Lucas,
> 
> * Lucas Nussbaum  [2023-08-05 17:06]:
>>An example sbuild invocation to reproduce failures is:
> [omitted the command line equivalent of Tolstoy's War and Peace]
> 
> If we decide that this issue is important enough that people should
> care and mass bugs be filed, sbuild will need a more concise way to
> test this; something like pbuilder's --twice option.
> 
> 
> 
> Cheers
> Timo
> 
> --
> ⢀⣴⠾⠻⢶⣦⠀   ╭╮
> ⣾⠁⢠⠒⠀⣿⡁   │ Timo Röhling   │
> ⢿⡄⠘⠷⠚⠋⠀   │ 9B03 EBB9 8300 DF97 C2B1  23BF CC8C 6BDD 1403 F4CA │
> ⠈⠳⣄   ╰╯



Re: Potential MBF: packages failing to build twice in a row

2023-08-05 Thread Andrey Rakhmatullin
On Sat, Aug 05, 2023 at 07:20:19PM +0300, Adrian Bunk wrote:
> What packages are failing, and why?
> 
> I would expect some debhelper machinery being responsible for most of 
> these, e.g. perhaps some dh-whatever helper might be creating this 
> issue for all 1k packages in some language ecosystem.
I've checked one of my Python packages and it fails because the contents
of $name.egg-info were modified. I expect all Python packages that ship
$name.egg-info and don't remove it in clean and don't exclude it via
extend-diff-ignore (all of which is unneeded busywork even if recommended)
to behave the same.
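
For reference, the extend-diff-ignore variant is a single line in
debian/source/options. Below is a minimal sketch that appends it,
assuming the egg-info directory sits at the top level of the source
tree. The function name ignore_egg_info is invented for this example,
and the regex would need adjusting for packages that keep the directory
elsewhere (e.g. under src/).

```shell
#!/bin/sh
# Sketch only: ignore_egg_info is a hypothetical helper name.
# Appends an extend-diff-ignore rule so that dpkg-source skips changes
# below a top-level *.egg-info/ directory.
# $1: unpacked source tree (defaults to the current directory)
ignore_egg_info() {
    tree=${1:-.}
    rule='extend-diff-ignore = "^[^/]+\.egg-info/"'
    mkdir -p "$tree/debian/source" || return 1
    # Only append the rule if that exact line is not already present.
    grep -qxF -- "$rule" "$tree/debian/source/options" 2>/dev/null ||
        printf '%s\n' "$rule" >> "$tree/debian/source/options"
}
```

Calling it twice on the same tree leaves a single rule in the file.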



Re: Potential MBF: packages failing to build twice in a row

2023-08-05 Thread Scott Kitterman



On August 5, 2023 7:07:34 PM UTC, "Timo Röhling"  wrote:
>Hi Lucas,
>
>* Lucas Nussbaum  [2023-08-05 17:06]:
>> An example sbuild invocation to reproduce failures is:
>[omitted the command line equivalent of Tolstoy's War and Peace]
>
>If we decide that this issue is important enough that people should
>care and mass bugs be filed, sbuild will need a more concise way to
>test this; something like pbuilder's --twice option.

For the affected packages I looked at, this was enough to reproduce:

dpkg-buildpackage -b
dpkg-buildpackage -S

Or debuild if you prefer.
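
As a rough sketch, the two steps can be wrapped in a small shell
function that reports which step failed. The function name
check_clean_after_build and the BUILDPACKAGE override variable are made
up for this example; only the two dpkg-buildpackage calls come from the
recipe above.

```shell
#!/bin/sh
# Sketch only: check_clean_after_build and BUILDPACKAGE are hypothetical
# names; the dpkg-buildpackage invocations are the ones quoted above.
# Exit status: 0 = both steps passed, 1 = binary build failed,
#              2 = source build failed, 3 = could not enter the tree.
check_clean_after_build() {
    # $1: unpacked source tree (defaults to the current directory)
    tree=${1:-.}
    # Allow substituting another command, e.g. debuild or a dry-run stub.
    cmd=${BUILDPACKAGE:-dpkg-buildpackage}
    (
        cd "$tree" || exit 3
        # Step 1: binary-only build, which leaves build artefacts behind.
        "$cmd" -b || exit 1
        # Step 2: source build; this is where a 'clean' target that does
        # not restore the tree tends to show up as a dpkg-source error.
        "$cmd" -S || exit 2
    )
}
```

A status of 2 then points at the clean target or dpkg-source rather
than at the build itself.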

Scott K



Re: Potential MBF: packages failing to build twice in a row

2023-08-05 Thread Timo Röhling

Hi Lucas,

* Lucas Nussbaum  [2023-08-05 17:06]:
> An example sbuild invocation to reproduce failures is:
[omitted the command line equivalent of Tolstoy's War and Peace]

If we decide that this issue is important enough that people should
care and mass bugs be filed, sbuild will need a more concise way to
test this; something like pbuilder's --twice option.



Cheers
Timo

--
⢀⣴⠾⠻⢶⣦⠀   ╭╮
⣾⠁⢠⠒⠀⣿⡁   │ Timo Röhling   │
⢿⡄⠘⠷⠚⠋⠀   │ 9B03 EBB9 8300 DF97 C2B1  23BF CC8C 6BDD 1403 F4CA │
⠈⠳⣄   ╰╯



Re: Potential MBF: packages failing to build twice in a row

2023-08-05 Thread Sven Joachim
On 2023-08-05 19:31 +0100, Wookey wrote:

> On 2023-08-05 17:06 +0200, Lucas Nussbaum wrote:
>>
>> I wonder what we should do, because 5000+ failing packages is a lot...
>>
>> Should we give up on requiring a 'clean' target that works? After all,
>> when 17% of packages are failing, it means that many maintainers don't
>> depend on it in their workflow.
>
> I still depend on this in my workflow, and it's very frustrating that
> a large fraction of packages are broken in this way. I'd love it if we
> had a bit of automation to tell people it's bust so they can fix
> it. Sometimes it is hard because build systems mess up your source
> tree, but a lot of the time it isn't. I have some sympathy for people
> who would have to do a lot of work to fight a build system that
> doesn't care about clean source trees if they don't care about them either.
>
> On the other hand it is a massive PITA when you build a package, and
> something breaks, and you try to build it again and it won't because
> the source tree has changed and the clean target doesn't actually clean.
> This happens way too often these days.

It might be worth changing your workflow a bit and working with
a git repository. It does not have to be a clone of the repository (if
any) where the package is maintained, you can start with a fresh import,
e.g. with "gbp import-dsc".

Then before building the package for the first time, commit or at least
stash your changes, and you can easily go back to a clean state with
"git reset --hard; git clean -fdqx".

> As you say it's clear that a lot of people are not doing things this
> way any more, but a clean target that works still has significant
> value for various sorts of automated builds, and debugging stuff.
> Perhaps an alternative to keeping the clean target working for people
> who don't care about maintaining it, is some metadata to say 'this
> package can only be built reliably from git/VCS - the old debian stuff is
> bust'. Better would be a new git-only dpkg format of some sort with a
> new set of expectations. But that's quite a big piece of work.
>
> Just to be clear I don't want any of that. I want the existing tooling
> and packaging to work the way policy says it should, at least until it
> is agreed that policy has to change.

You can want whatever you like, but wanting does not make anything
happen magically.

Cheers,
   Sven



Re: Potential MBF: packages failing to build twice in a row

2023-08-05 Thread Lucas Nussbaum
On 05/08/23 at 19:20 +0300, Adrian Bunk wrote:
> On Sat, Aug 05, 2023 at 05:06:27PM +0200, Lucas Nussbaum wrote:
> >...
> > Packages tested: 29883 (I filtered out those that take a very long time to 
> > build)
> > .. building OK all times: 24835 (83%)
> > .. failing somehow: 5048 (17%)
> >...
> > I wonder what we should do, because 5000+ failing packages is a lot...
> 
> I doubt these are > 5k packages that need individual fixing.
> 
> What packages are failing, and why?

Did you see http://qa-logs.debian.net/2023/08/twice/ ?

> I would expect some debhelper machinery being responsible for most of 
> these, e.g. perhaps some dh-whatever helper might be creating this 
> issue for all 1k packages in some language ecosystem.

Maybe, but I did not detect such a common root cause by randomly looking
at log files. Same when going through dd-lists: of course teams that
maintain many packages have many packages failing, but I did not
identify a team with a very large proportion of failing packages.

> > Should we give up on requiring a 'clean' target that works? After all,
> > when 17% of packages are failing, it means that many maintainers don't
> > depend on it in their workflow.
> 
> You are mixing two related but not identical topics.
> 
> Your subject talks about "failing to build twice in a row",
> but the content mostly talks about dpkg-source.
> 
> Based on my workflows I can say that building twice in a row, defined as
>   dpkg-buildpackage -b --no-sign && dpkg-buildpackage -b --no-sign
> works for > 99% of all packages in the archive.

That's true. However, if the 'clean' target doesn't work correctly,
there are chances that the second build might not happen in the same
conditions as the first one (for example because it will re-use
left-overs from the first build).

Lucas



Re: Potential MBF: packages failing to build twice in a row

2023-08-05 Thread Wookey
On 2023-08-05 17:06 +0200, Lucas Nussbaum wrote:
> 
> I wonder what we should do, because 5000+ failing packages is a lot...
> 
> Should we give up on requiring a 'clean' target that works? After all,
> when 17% of packages are failing, it means that many maintainers don't
> depend on it in their workflow.

I still depend on this in my workflow, and it's very frustrating that
a large fraction of packages are broken in this way. I'd love it if we
had a bit of automation to tell people it's bust so they can fix
it. Sometimes it is hard because build systems mess up your source
tree, but a lot of the time it isn't. I have some sympathy for people
who would have to do a lot of work to fight a build system that
doesn't care about clean source trees if they don't care about them either.

On the other hand it is a massive PITA when you build a package, and
something breaks, and you try to build it again and it won't because
the source tree has changed and the clean target doesn't actually clean. 
This happens way too often these days.

As you say it's clear that a lot of people are not doing things this
way any more, but a clean target that works still has significant
value for various sorts of automated builds, and debugging stuff.

Perhaps an alternative to keeping the clean target working for people
who don't care about maintaining it, is some metadata to say 'this
package can only be built reliably from git/VCS - the old debian stuff is
bust'. Better would be a new git-only dpkg format of some sort with a
new set of expectations. But that's quite a big piece of work.

Just to be clear I don't want any of that. I want the existing tooling
and packaging to work the way policy says it should, at least until it
is agreed that policy has to change.

Wookey
-- 
Principal hats:  Debian, Wookware, ARM
http://wookware.org/



Re: Potential MBF: packages failing to build twice in a row

2023-08-05 Thread Scott Kitterman



On August 5, 2023 5:40:36 PM UTC, Andrey Rakhmatullin  wrote:
>On Sat, Aug 05, 2023 at 08:10:35PM +0300, Adrian Bunk wrote:
>> Debian maintainers with proper git workflows are already exporting all 
>> their changes from git to debian/patches/ as one file - currently the 
>> preferred form of modification of a Debian package has to be in salsa 
>> and not in our archive when the changes cannot be represented as quilt 
>> patches against tarballs.
>Is the gbp-pq workflow improper?
>
Git-dpm too.  Apparently everything I do in git is improper.  Maybe I should 
give up on it then?

Scott K



Re: Potential MBF: packages failing to build twice in a row

2023-08-05 Thread Scott Kitterman
On Saturday, August 5, 2023 11:06:27 AM EDT Lucas Nussbaum wrote:
> Hi,
> 
> Debian Policy section 4.9 says:
>   clean (required)
>  This must undo any effects that the build and binary targets may
>  have had, except that it should leave alone any output files
>  created in the parent directory by a run of a binary target.
> 
> I looked at what happens when doing 'dpkg-buildpackage ;
> dpkg-buildpackage ; dpkg-buildpackage -S' over most source packages in
> sid. The results are the following:
> 
> Packages tested: 29883 (I filtered out those that take a very long time to
> build)
> .. building OK all times: 24835 (83%)
> .. failing somehow: 5048 (17%)
> .... failing during the first build: 238 (not relevant for this mail)
> .... failing because the 'clean' target fails: 52
> .... failing because dpkg-source fails: 4740
> ...... dpkg-source detects changes to binary files: 1595
> ...... dpkg-source detects unwanted binary files: 117
> ...... dpkg-source detects deletions: 101
> ...... dpkg-source detects other local changes: 2929
> .... failing for other reasons: 22
> 
> Logs, lists, and dd-lists are available at
> http://qa-logs.debian.net/2023/08/twice/
> 
> An example sbuild invocation to reproduce failures is:
> sbuild -n -A -s --force-orig-source --apt-update -d unstable -v \
>   --no-run-lintian \
>   --starting-build-commands="cd %SBUILD_PKGBUILD_DIR && runuser -u $(id -un) -- dpkg-buildpackage --sanitize-env -us -uc -rfakeroot" \
>   --finished-build-commands="cd %SBUILD_PKGBUILD_DIR && runuser -u $(id -un) -- dpkg-buildpackage --sanitize-env -us -uc -rfakeroot -S" \
>   ruby-highline
> 
> I wonder what we should do, because 5000+ failing packages is a lot...
> 
> Should we give up on requiring a 'clean' target that works? After all,
> when 17% of packages are failing, it means that many maintainers don't
> depend on it in their workflow.
> 
> Lucas

Thanks.  I think this is useful and we should fix issues like these.  For the 
packages I maintain/co-maintain on the list, I've pushed fixes to the packaging 
git, so they should all be fixed after the next upload.

In my case, I've gotten in the habit of using -nc when building source 
packages, but that's a crutch and it's better to fix the issues.  Thanks for 
the prompt.

Scott K



Re: Potential MBF: packages failing to build twice in a row

2023-08-05 Thread Andrey Rakhmatullin
On Sat, Aug 05, 2023 at 08:10:35PM +0300, Adrian Bunk wrote:
> Debian maintainers with proper git workflows are already exporting all 
> their changes from git to debian/patches/ as one file - currently the 
> preferred form of modification of a Debian package has to be in salsa 
> and not in our archive when the changes cannot be represented as quilt 
> patches against tarballs.
Is the gbp-pq workflow improper?



Re: Potential MBF: packages failing to build twice in a row

2023-08-05 Thread Adrian Bunk
On Sat, Aug 05, 2023 at 05:29:34PM +0100, Simon McVittie wrote:
>...
> One way to streamline dealing with these generated files would be
> to normalize repacking of upstream source releases to exclude them,
> and make it easier to have source packages that genuinely only contain
> what we consider to be source.

What do we actually consider to be source?

Debian maintainers with proper git workflows are already exporting all 
their changes from git to debian/patches/ as one file - currently the 
preferred form of modification of a Debian package has to be in salsa 
and not in our archive when the changes cannot be represented as quilt 
patches against tarballs.

> At the moment, devref §6.8.8.2 strongly
> discourages repacking tarballs to exclude DFSG-but-unnecessary files
> (including generated files, as well as source/build files only needed on
> Windows or macOS or whatever[1]), and Lintian strongly encourages adding
> a +dfsg or +ds suffix to any repacked tarball, which makes it less
> straightforward to track upstream's versioning. Is it time for us to
> reconsider those recommendations?
> 
> For many upstreams (for example Autotools-based projects, and any project
> like GTK that includes pre-generated documentation in source releases),
> we can get "more source-like" upstream source releases by repacking our
> own tarball based on upstream VCS tags than we would get by using their
> official source release artifacts. For other upstreams, Files-Excluded
> can be used to delete generated or unneeded files.
>...

The proper solution would be to stop pretending that we are still living 
in the last millennium and that tarballs would be the main form of sources.

Not using git trees as sources for many packages is preventing a lot of 
proper and easy tooling for many things, including here.

> smcv
>...

cu
Adrian



Re: Potential MBF: packages failing to build twice in a row

2023-08-05 Thread Adrian Bunk
On Sat, Aug 05, 2023 at 05:06:27PM +0200, Lucas Nussbaum wrote:
>...
> Packages tested: 29883 (I filtered out those that take a very long time to 
> build)
> .. building OK all times: 24835 (83%)
> .. failing somehow: 5048 (17%)
>...
> I wonder what we should do, because 5000+ failing packages is a lot...

I doubt these are > 5k packages that need individual fixing.

What packages are failing, and why?

I would expect some debhelper machinery being responsible for most of 
these, e.g. perhaps some dh-whatever helper might be creating this 
issue for all 1k packages in some language ecosystem.

> Should we give up on requiring a 'clean' target that works? After all,
> when 17% of packages are failing, it means that many maintainers don't
> depend on it in their workflow.

You are mixing two related but not identical topics.

Your subject talks about "failing to build twice in a row",
but the content mostly talks about dpkg-source.

Based on my workflows I can say that building twice in a row, defined as
  dpkg-buildpackage -b --no-sign && dpkg-buildpackage -b --no-sign
works for > 99% of all packages in the archive.

> Lucas

cu
Adrian



Re: Potential MBF: packages failing to build twice in a row

2023-08-05 Thread Simon McVittie
On Sat, 05 Aug 2023 at 17:06:27 +0200, Lucas Nussbaum wrote:
> Should we give up on requiring a 'clean' target that works? After all,
> when 17% of packages are failing, it means that many maintainers don't
> depend on it in their workflow.

I think it's somewhat inevitable that code paths that aren't frequently
exercised don't work. If a majority of maintainers are doing all of
their builds with git-buildpackage, or dgit --clean=git, or something
basically equivalent to one of those, then `debian/rules clean` will
never actually be run against a built tree. For teams with a strongly
preferred workflow (like the Perl, Python and GNOME teams consistently
using git-buildpackage), this seems particularly likely.

I think we need to think about what benefit this Policy requirement brings
us, and whether it's worth the cost, before treating it as important:
the higher we choose to make the cost of fixing this class of bug,
the higher the bar should be for treating it as a real bug at all,
or treating it as RC.

For me, the main purpose of `debian/rules clean` is being able to do
incremental builds while debugging something - but if I want to do
incremental builds, it's quite likely that I'll also be using
`debuild -b -nc` to make the builds genuinely incremental (and then a fully
clean build from first principles at the end, to verify that whatever issue
I'm debugging is really fixed).

Having looked at some of the packages with my name on in
all_failing.txt.dd-list, many of them are a simple matter of built
files being created by the build (either upstream or downstream) or by
running automated tests, but not deleted. Those are easily fixed (several
fixed in git already).

libsdl2 has generated files in its upstream source tarball which are
re-generated with different content during the build (mostly Autotools
files, but also include/SDL_config.h and include/SDL_revision.h), and
I'm confident that it's not the only one: many upstreams want to do
this for the convenience of people building their package on OSs whose
development tools are less tightly integrated or less scriptable than
ours (especially Windows). In many cases the most pragmatic way to deal
with that is to delete the file during clean and ignore the resulting
warnings from dpkg-source.

I certainly don't want to require maintainers to invent machinery
for saving and restoring upstream's versions of generated files like
dh-autoreconf does, because that seems like busy-work that just adds to
the complexity of our builds without making Debian any better.

One way to streamline dealing with these generated files would be
to normalize repacking of upstream source releases to exclude them,
and make it easier to have source packages that genuinely only contain
what we consider to be source. At the moment, devref §6.8.8.2 strongly
discourages repacking tarballs to exclude DFSG-but-unnecessary files
(including generated files, as well as source/build files only needed on
Windows or macOS or whatever[1]), and Lintian strongly encourages adding
a +dfsg or +ds suffix to any repacked tarball, which makes it less
straightforward to track upstream's versioning. Is it time for us to
reconsider those recommendations?

For many upstreams (for example Autotools-based projects, and any project
like GTK that includes pre-generated documentation in source releases),
we can get "more source-like" upstream source releases by repacking our
own tarball based on upstream VCS tags than we would get by using their
official source release artifacts. For other upstreams, Files-Excluded
can be used to delete generated or unneeded files.

A side benefit of normalizing repacking upstream source releases would
be that maintainers are no longer expected to check the diff between
those files in the old and new version, and no longer required to track
the copyright and licensing status of the files that get excluded, which
can significantly speed up the process of importing new upstream versions
in some cases.

The major disadvantage of repacking upstream source is that if upstream
makes signed releases or publishes checksums, our repacked source cannot
be validated against that information; but if we can easily re-download
upstream source, then it would be possible for any interested developer
to verify that the repacked tarball matches what upstream released, minus
the files listed in Files-Excluded.
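
A rough sketch of such a check, assuming GNU tar, that both tarballs
unpack into a single top-level directory, and that the excluded paths
are given literally (the real Files-Excluded field also accepts globs,
which this does not handle). verify_repack is a made-up helper, not an
existing tool, and checking upstream's signature or checksum on the
original tarball would be a separate step before this one.

```shell
#!/bin/sh
# Sketch only: verify_repack is a hypothetical helper, not an existing tool.
# Usage: verify_repack UPSTREAM_TARBALL REPACKED_TARBALL [EXCLUDED_PATH...]
# Exit status: 0 = trees match, 1 = trees differ, 2 = unpacking failed.
verify_repack() {
    upstream=$1
    repacked=$2
    shift 2
    work=$(mktemp -d) || return 2
    mkdir "$work/upstream" "$work/repacked"
    status=2
    # Unpack both tarballs without their top-level directory, so that a
    # renamed top-level directory does not count as a difference.
    if tar -xf "$upstream" -C "$work/upstream" --strip-components=1 &&
       tar -xf "$repacked" -C "$work/repacked" --strip-components=1
    then
        # Drop the excluded paths from the pristine upstream tree.
        for path in "$@"; do
            rm -rf -- "$work/upstream/$path"
        done
        # Whatever remains must be identical to the repacked tree.
        diff -r "$work/upstream" "$work/repacked" >/dev/null
        status=$?
    fi
    rm -rf "$work"
    return "$status"
}
```

For example, "verify_repack foo-1.0.tar.gz foo_1.0+ds.orig.tar.xz
configure" (hypothetical file names) succeeds only if the two trees
are identical apart from the removed configure script.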

Devref §6.8.8.2 also says that "it is common for Debian users who
need to build software for non-Debian platforms to fetch the source
from a Debian mirror rather than trying to locate a canonical upstream
distribution point", but I'm not convinced that's true any more. Our
volunteers' time is our most limited resource, so if we can use that
time more efficiently by no longer catering to possibly-hypothetical
users who are building our source code on non-Debian platforms, then
that might be a worthwhile tradeoff.

smcv

[1] I mentioned Windows/macOS, but devref actually talks 

Re: Potential MBF: packages failing to build twice in a row

2023-08-05 Thread Vincent Bernat

On 2023-08-05 17:06, Lucas Nussbaum wrote:
> Should we give up on requiring a 'clean' target that works? After all,
> when 17% of packages are failing, it means that many maintainers don't
> depend on it in their workflow.


Yes, please. It no longer makes sense to enforce such a rule 
when it is now easy to use "git clean -fxd" or to build in a chroot. 
Moreover, binary packages in the archive are now built by an official 
builder.




Potential MBF: packages failing to build twice in a row

2023-08-05 Thread Lucas Nussbaum
Hi,

Debian Policy section 4.9 says:
  clean (required)
 This must undo any effects that the build and binary targets may
 have had, except that it should leave alone any output files
 created in the parent directory by a run of a binary target.

I looked at what happens when doing 'dpkg-buildpackage ;
dpkg-buildpackage ; dpkg-buildpackage -S' over most source packages in
sid. The results are the following:

Packages tested: 29883 (I filtered out those that take a very long time to 
build)
.. building OK all times: 24835 (83%)
.. failing somehow: 5048 (17%)
.... failing during the first build: 238 (not relevant for this mail)
.... failing because the 'clean' target fails: 52
.... failing because dpkg-source fails: 4740
...... dpkg-source detects changes to binary files: 1595
...... dpkg-source detects unwanted binary files: 117
...... dpkg-source detects deletions: 101
...... dpkg-source detects other local changes: 2929
.... failing for other reasons: 22
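
As a quick cross-check of the percentages, here is a small sketch; the
helper name pct is made up for this example and only the counts come
from the list above.

```shell
#!/bin/sh
# Sketch only: pct is a hypothetical helper that prints PART as a
# percentage of TOTAL with one decimal place.
pct() {
    LC_ALL=C awk -v part="$1" -v total="$2" \
        'BEGIN { printf "%.1f\n", 100 * part / total }'
}

pct 24835 29883   # building OK all times: prints 83.1
pct 5048 29883    # failing somehow: prints 16.9
pct 4740 5048     # dpkg-source share of all failures: prints 93.9
```

The rounded values match the 83% and 17% given above. The sub-counts
add up to slightly more than their parent totals (5052 vs. 5048 and
4742 vs. 4740), presumably because a few packages fall into more than
one category.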

Logs, lists, and dd-lists are available at
http://qa-logs.debian.net/2023/08/twice/

An example sbuild invocation to reproduce failures is:
sbuild -n -A -s --force-orig-source --apt-update -d unstable -v \
  --no-run-lintian \
  --starting-build-commands="cd %SBUILD_PKGBUILD_DIR && runuser -u $(id -un) -- dpkg-buildpackage --sanitize-env -us -uc -rfakeroot" \
  --finished-build-commands="cd %SBUILD_PKGBUILD_DIR && runuser -u $(id -un) -- dpkg-buildpackage --sanitize-env -us -uc -rfakeroot -S" \
  ruby-highline

I wonder what we should do, because 5000+ failing packages is a lot...

Should we give up on requiring a 'clean' target that works? After all,
when 17% of packages are failing, it means that many maintainers don't
depend on it in their workflow.

Lucas