Hi Penny

Thanks for reaching out to Fedora.

Minor warning: different people do things differently, even within
Fedora. But we have guidelines:

https://docs.fedoraproject.org/en-US/packaging-guidelines/PatchUpstreamStatus/

> The Context: There is a common assumption in academic research that for any 
> given patch, the "original source context" (i.e., the full source code of the 
> file exactly as it existed when the patch was created) is easily accessible.

Do they define "easily"? I guess that's the starting point of your
research ...

git (or hg) patches include an oid (sha1) of the corresponding commit
and - consequently, theoretically - identify the whole history leading
up to that commit/patch.
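
To make the oid point concrete, here is a minimal sketch of pulling the
commit oid out of a patch mail. It assumes the patch was produced by
`git format-patch` from a SHA-1 repository (so the first header line
carries a 40-hex oid); the helper name `commit_oid` and the sample mail
are made up for illustration. The oid only helps if you can reach a
repo that contains that commit.

```python
import re

# First line of a mail produced by `git format-patch`:
#   From <40-hex commit oid> Mon Sep 17 00:00:00 2001
FROM_LINE = re.compile(r"^From ([0-9a-f]{40}) Mon Sep 17 00:00:00 2001$")


def commit_oid(patch_text):
    """Return the commit oid from a format-patch header, or None.

    Hypothetical helper: assumes SHA-1 oids (40 hex digits).
    """
    for line in patch_text.splitlines():
        match = FROM_LINE.match(line)
        if match:
            return match.group(1)
        if line.startswith("diff --git "):
            break  # reached the diff body without seeing a header
    return None


# Made-up sample mail, only for illustration.
sample = """From 1234567890abcdef1234567890abcdef12345678 Mon Sep 17 00:00:00 2001
From: A Developer <dev@example.org>
Subject: [PATCH] fix overflow

---
diff --git a/foo.c b/foo.c
"""

print(commit_oid(sample))
```

A bare `diff -u` style patch has no such header, so the function
returns None for it, which is exactly the "isolated patch" case.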

> The theory is that to correctly port a patch from Distro A to Distro B, one 
> must analyze the full semantic context of the original code to understand the 
> patch's intent.

Yes, ideally.
 
> The Question: While backporting a commit directly from a package's official 
> upstream git repository is straightforward

It is straightforward to know the context. It might not be
straightforward to analyze it.

In "released" Fedora branches, we typically apply bugfix patches only. As
a packager, one of the biggest challenges is backporting bugfix patches
from upstreams which commit them to their development branch only. After
an upstream release, some do heavy refactoring or other major changes
(which is OK there, of course) but do not carry a "release branch" for
bugfixes themselves.
So, while all the context is there, backporting is the challenge.

> I am specifically interested in "third-party patches"—those acquired from:
> Other distributions (e.g., porting a fix from Debian/Arch/OpenEuler to 
> Fedora).
> Mailing list attachments.
> Standalone security advisories (CVEs) where only a .diff or .patch file is 
> provided.
> 
> In these scenarios:
> 
> Availability: Is it realistic for you to hunt down the exact source 
> tree/commit where that third-party patch was originally created? Do you 
> typically possess the full "original source file" (pre-patch state) to see 
> the complete context?

All distros reference or ship the source one way or the other and then apply
the patch on top. So yes for pre-patch. A different issue is referencing
the pre-patch state against an upstream commit or release. Their
pre-patch state might differ from ours.
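
One way to check whether "their" pre-patch state equals "ours" is a
rough sketch like the following. It assumes the patch is a git-style
diff with `index <old>..<new>` lines and SHA-1 blob ids, and it only
looks at the first file in the patch; the function names are made up
for illustration. Git's blob id is the SHA-1 of a "blob <size>" header,
a NUL byte, and the file content.

```python
import hashlib
import re

# "index <old blob>..<new blob> [mode]" line of a git-style diff;
# the blob ids are usually abbreviated.
INDEX_LINE = re.compile(r"^index ([0-9a-f]{7,40})\.\.([0-9a-f]{7,40})")


def git_blob_oid(content: bytes) -> str:
    """SHA-1 object id that git assigns to a file with this content."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def preimage_matches(patch_text: str, our_file: bytes) -> bool:
    """True if the first index line's pre-image id is a prefix of our blob id."""
    for line in patch_text.splitlines():
        match = INDEX_LINE.match(line)
        if match:
            return git_blob_oid(our_file).startswith(match.group(1))
    raise ValueError("no git index line; patch was not produced by git")


# Made-up one-file patch header whose pre-image is a file containing "hello\n".
patch = "diff --git a/f b/f\nindex ce01362..0000000 100644\n"
print(preimage_matches(patch, b"hello\n"))
```

If this returns False, the patch was written against a file that
differs from the one in our tree, and some manual adaptation is likely.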

Mailing list: hopefully `git format-patch` output including oids and a
repo URL.

CVEs: this varies wildly, depending on the reporter.

> Importance: When you encounter a conflict or a semantic mismatch during 
> backporting, do you actually rely on finding that original source context? Or 
> do you mostly rely on reading the .patch content itself and your current 
> target codebase to resolve it manually?

Mostly the latter, at least for smaller patches. If that does not work
then the former. "Work" here depends on test cases both for the built
product as well as for the bug to fix.

> The Reality: Does the lack of "original context" (e.g., due to squashed 
> commits in other distros, or lack of metadata) make these patches 
> significantly harder to maintain, or is this generally a non-issue for 
> experienced packagers?

I try to take patches from upstreams, and when there are problems it's
typically due to their improper usage of git (squashed commits, bad
commit messages, lack of branch model).

I look at other distros only in case of actual packaging decisions (do
they build dynamic libraries, how do they deal with improper soname
usage by upstream or other upstream weirdness), both to have some common
ground and leverage.

In general I find it difficult to "navigate" other distros' package
source repos and even identify what's the current release branch, the
actual package source and the like. That's not a criticism, but it's
different enough to rather look upstream.

> I suspect that in many real-world cases, maintainers effectively work with 
> "isolated patches" without access to the full original upstream context.

I work with upstream source trees all the time and keep most of my
patches in branches on top of the commit which corresponds to the source
of the Fedora package. This helps with easy rebasing and such.
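
As a rough sketch of what "easy rebasing" can look like with such a
branch: the function below only builds the git command lines, it does
not run them. The tag and branch names are hypothetical, and the exact
commands are one possible way to do it, not a description of my actual
scripts.

```python
def rebase_plan(old_base, new_base, branch, outdir):
    """Return git commands (argv lists) to move a patch branch to a new base.

    Hypothetical helper; all arguments are plain ref names or paths.
    """
    return [
        # Replay the commits in old_base..branch on top of new_base.
        ["git", "rebase", "--onto", new_base, old_base, branch],
        # Export one numbered .patch file per commit left on the branch.
        ["git", "format-patch", "-o", outdir, f"{new_base}..{branch}"],
    ]


# Hypothetical names: upstream tags v1.2.3 / v1.3.0, patch branch "fedora".
plan = rebase_plan("v1.2.3", "v1.3.0", "fedora", "patches")
for argv in plan:
    print(" ".join(argv))
```

Each list can be passed to `subprocess.run(argv, check=True, cwd=repo)`
inside a checkout of the upstream repo; the rebase stops on conflicts,
which is where the actual backporting work starts.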

I also keep some of my Fedora packaging in a git tree like this:

- git repo with spec file, patches and the like
- upstream source as a git submodule "source(s)"

This corresponds to Fedora's package source layout (dist-git) where
"sources" would be a file instead (identifying source tarballs by sha512)
and allows me to follow upstream development easily.
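
For illustration, a minimal sketch of how an entry in such a "sources"
file can be generated. It assumes the BSD-style line format
"SHA512 (name) = hexdigest"; please compare against a real dist-git
"sources" file before relying on it. The tarball name and content below
are made up.

```python
import hashlib


def sources_line(filename: str, data: bytes) -> str:
    """Build one "sources" entry for a tarball (assumed BSD-style format)."""
    digest = hashlib.sha512(data).hexdigest()
    return f"SHA512 ({filename}) = {digest}"


# Made-up tarball name and content, only for illustration.
line = sources_line("foo-1.0.tar.gz", b"not a real tarball")
print(line)
```

The point of the digest is the same as with a submodule commit: the
packaging repo pins exactly one upstream source state.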

There's also a wider initiative to do it the other way round
(source-git, packit), that is, put packaging information into a subdir of
the upstream source, which I consider the wrong way round, obviously ;-)
 
> Any insights—whether it’s a detailed "war story," a brief explanation of your 
> workflow, or even a simple "yes/no"—would be incredibly valuable to me. I am 
> eager to learn from your practical experience and hope to spark a 
> constructive discussion on how we can better bridge the gap between academic 
> theory and packaging reality.

As for stories: Several times when I looked at other distros, I've found
"my" patch adaptations or packaging solutions. Not necessarily referenced,
but hey, I'm not even sure I'm doing this in every case.

As a general rule, I think we should try and work with upstream as much
as possible. That's the natural "meeting point".

Michael
-- 
_______________________________________________
devel mailing list -- [email protected]