Ian Jackson:
Niels Thykier writes ("Bug#1079434: dgit: Can we avoid Debian specific rules for 
gitattributes?"):
When using `dgit` on a Debian package where there is a `.gitattributes`
file, I get the warning:

"""
dgit: .gitattributes not (fully) defused.  Recommended: dgit setup-new-tree.
"""

Did you read the section GITATTRIBUTES in dgit(7) ?


I did find it before filing this bug. However, I had to hop via `dgit setup-new-tree` -> `dgit setup-gitattributes` to find it, which I did not find ideal in terms of documentation.

As for the section itself, I understand the gist of it but it was a bit too abstract for my liking.

Concretely, I can have `git` enforce that I do not get Windows
newlines into my scripts just because the contributor wrote the patch on
a Windows machine in a less than ideal configured editor[2].

I can see how this is useful.  I think a better approach to this kind
of thing might be to use git's hook arrangements.


Whether it is `git`'s hooks, a CI job or a `pre-commit` hook, the problem is that it takes effort to write the code and to test that it works as intended. Compare with "echo '*.py eol=lf' > .gitattributes", which requires almost no investment.

For me, the trade-off in simplicity strongly favors `.gitattributes`. My goal (as an upstream developer) is to develop on the upstream project, not write and maintain CI/pre-commit hooks. So I can spend 5s on the `.gitattributes` and then go back to what I am here to solve (rather than spend an hour creating and debugging a CI pipeline or pre-commit hook, where the implementation might differ depending on which git hosting platform/CI platform I use).
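To make the one-liner concrete, here is a minimal sketch of what it buys. It assumes a throwaway repository created with `mktemp`, and the file name `script.py` is purely illustrative. A file written with CRLF endings is staged, and the staged blob is then checked for carriage returns.

```shell
# Throwaway repository; nothing here touches an existing checkout.
repo=$(mktemp -d)
git init -q "$repo"

# The one-line policy: Python files are stored with LF line endings.
echo '*.py eol=lf' > "$repo/.gitattributes"

# Simulate a contribution written with Windows (CRLF) line endings.
printf 'print("hi")\r\n' > "$repo/script.py"

# core.safecrlf=false only silences git's "CRLF will be replaced" warning.
git -C "$repo" -c core.safecrlf=false add .gitattributes script.py

# Count carriage returns in the staged blob (should be 0 after conversion).
cr_count=$(git -C "$repo" cat-file -p :script.py | tr -cd '\r' | wc -c | tr -d ' ')
echo "carriage returns in staged script.py: $cr_count"
```

The working-tree file still contains the CRLF; only the content that git records is normalized.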

In a standard git workflow, I can edit the `.gitattributes` as needed
and push it. From there on, branches based on that commit will now
respect the delta. This is a nice property for me as an (upstream)
developer.

I think maybe you have misunderstood what `dgit setup-new-tree` does.
It does not manipulate the git tree object, or your branch.

It manipulates the per-tree *configuration* (.git/info/attributes) to
arrange that the in-tree .gitattributes don't cause discrepancies
between your working tree and the git history.  (Such discrepancies can
cause `dgit push-source` to fail.)


As I understand git, `.git/info/attributes` is global across all branches, whereas `.gitattributes` depends on which branch (commit) you are on.

So are you saying this can "defuse" the `.gitattributes` without affecting how they work in general? Because that is not at all what I read out of `dgit(7)`.

What I read is that it will disable the "transforming" gitattributes (not sure what that is, but rewriting newlines does sound "transforming" to me). And if it disables a gitattribute in regular git, then there is a delta in the behavior of git for "upstream-only-developer" vs. "upstream-and-debian-developer", which is what I want to avoid.
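The interaction in question can be checked with `git check-attr`. The following is a minimal sketch in a throwaway repository. The `*.py -eol` override is a hand-written stand-in, not the line dgit itself writes, and `script.py` is an illustrative path that does not need to exist.

```shell
repo=$(mktemp -d)
git init -q "$repo"

# In-tree attribute, as an upstream project would ship it.
echo '*.py eol=lf' > "$repo/.gitattributes"
before=$(git -C "$repo" check-attr eol -- script.py)

# Per-clone override; git gives .git/info/attributes the highest precedence.
mkdir -p "$repo/.git/info"
echo '*.py -eol' >> "$repo/.git/info/attributes"
after=$(git -C "$repo" check-attr eol -- script.py)

echo "$before"   # in-tree setting in effect
echo "$after"    # override in effect
```

Since `.git/info/attributes` is not part of any commit, the override applies to this clone on every branch, while other clones of the same repository keep the in-tree behavior.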

When I developed this aspect of dgit, I was thinking of upstreams
who put all manner of surprising things in .gitattributes.  For
example, upstream Xen git has a .gitattributes file which encodes
version information in working tree files.

This isn't compatible with dgit's core invariant, which is that the
git tree object is precisely the same as the content of the source
package.  [...]


Side bar: Related to that (but unrelated to this bug), I opened https://salsa.debian.org/dgit-team/dgit/-/issues/7 to discuss build-time generation (or enrichment) of `debian/control`, which I suspect interacts with this invariant due to some other constraints.

I would like the dgit maintainer's feedback on that too, since I do not like the status quo very much and I hope we can find a solution that solves my goal that is also supported by `dgit` to the extent possible.

End side-bar.

I agree that this whole situation is not optimal.  To be honest,
I think the whole gitattributes system in git is a mistake.
(See also git submodules which are an even more badly broken thing[1]
that dgit doesn't support.)

But the situation is not as bad as I think you imagine.

If your gitattributes don't in fact transform files, in practice,
in a way that makes your source packages different from your git tree
object, then:

  * You can safely ignore the warning, since even un-defused,
    the attributes won't cause discrepancies that cause dgit to fail.


It would be even better if the user did not get a warning when the file will not cause problems. Though, given your remark below, newline transformations seem to be in scope for the warning.

  * You can suppress the warning by providing a defuse line
    that affects no files (see `dgit setup-gitattributes` in dgit(1))
    (Possibly there could be a nicer way to do this.)


I think I would need an example for this in the manpage. I am not even sure what the `dgit-defuse-attrs` macro does.
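As a rough illustration of how a git macro attribute behaves, here is a sketch using a hypothetical macro named `demo-defuse`. The name and the attribute list are assumptions made for this example; the actual contents of dgit's `dgit-defuse-attrs` macro, and whether dgit accepts an unapplied macro line as sufficient, are not verified here.

```shell
repo=$(mktemp -d)
git init -q "$repo"
echo '*.py eol=lf' > "$repo/.gitattributes"
mkdir -p "$repo/.git/info"

# Define a macro (hypothetical name) without applying it to any path.
# The in-tree attribute stays in effect.
echo '[attr]demo-defuse -text -eol' > "$repo/.git/info/attributes"
defined_only=$(git -C "$repo" check-attr eol -- script.py)

# Apply the macro to every path: eol is now unset for all files.
echo '* demo-defuse' >> "$repo/.git/info/attributes"
applied=$(git -C "$repo" check-attr eol -- script.py)

echo "$defined_only"
echo "$applied"
```

A macro line on its own only gives a name to a set of attribute settings; files are affected only once some pattern line refers to that name.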

  * Users who heed dgit's advice (or use `dgit clone`) will not
    experience lossage either.  (Assuming they don't also apply
    patches with Windows line endings, or something.)


So, I assume the problem is that `dgit` and `dpkg-source` see two different things here during `dgit push-source`, so the source package generated by `dpkg-source` does not need the attribute, whereas `dgit` in its commit does?

For a native package (where I am coming from but the smallest possible user-base for you) or the `debian/` part of a non-native package, having git normalize the newlines (the `eol` attribute) as requested before calling dpkg-source should work.

But then, the majority of the `.gitattributes` files that `dgit` will see are from the upstream part of a non-native package, and that is considerably more difficult to deal with, since the `.orig` tarball is actually the source of truth (and not git). For users of `gbp import-orig`, would this happen to be correct by virtue of the import (or will a `.gitattributes` not affect that case)?

I get that this only covers the `eol` attribute (and then also the `text` attribute due to their inter-relationship). But I think there is a lot to be said for supporting those without a warning if possible, as I think those are the most common ones.

Perhaps also a special-case for merge drivers (like `dpkg-mergechangelogs`), which should not affect the committed form.



Nevertheless, I think `dgit` should change its behavior here, since
we are making a Debian specific git workflow and it makes Debian
contributors that are also upstream developers a second class citizen.

I'm open to suggestions for how this could all be better.  But the
situation is certainly not straightforward.

Ian.

[1] See my blog post "Never use git submodules"
   https://diziet.dreamwidth.org/14666.html


I think my best bet is to cover the most common attributes and avoid warnings for those when possible. As a starting point, this would be a partial solution that, from a user perspective, would be better than the status quo, provided it can be done reliably. But this all boils down to whether it is feasible to deal with the `.orig` tarball, where `git` is not (or might not be) the source of truth.

It would make the warning less likely to occur for many projects, meaning fewer people in dual roles would need to worry about this problem.

Best regards,
Niels
