Hi Penny,

Thanks for reaching out to Fedora.

Minor warning: different people do things differently, even within Fedora. But we have guidelines:

https://docs.fedoraproject.org/en-US/packaging-guidelines/PatchUpstreamStatus/

> The Context: There is a common assumption in academic research that for any
> given patch, the "original source context" (i.e., the full source code of the
> file exactly as it existed when the patch was created) is easily accessible.

Do they define "easily"? I guess that's the starting point of your research ...

git (or hg) patches include an oid (sha1) of the corresponding commit and, consequently and at least in theory, identify the whole history leading up to that commit/patch.

> The theory is that to correctly port a patch from Distro A to Distro B, one
> must analyze the full semantic context of the original code to understand the
> patch's intent.

Yes, ideally.

> The Question: While backporting a commit directly from a package's official
> upstream git repository is straightforward

It is straightforward to know the context. It might not be straightforward to analyze it.

In "released" Fedora branches, we typically apply bugfix patches only. As a packager, one of the biggest challenges is backporting bugfix patches from upstreams which commit them to their development branch only. After an upstream release, some do heavy refactoring or other major changes (which is OK there, of course) but do not carry a "release branch" for bugfixes themselves. So, while all the context is there, backporting is the challenge.

> I am specifically interested in "third-party patches", those acquired from:
> Other distributions (e.g., porting a fix from Debian/Arch/OpenEuler to
> Fedora).
> Mailing list attachments.
> Standalone security advisories (CVEs) where only a .diff or .patch file is
> provided.
>
> In these scenarios:
>
> Availability: Is it realistic for you to hunt down the exact source
> tree/commit where that third-party patch was originally created?
> Do you typically possess the full "original source file" (pre-patch state)
> to see the complete context?

All distros reference or ship the source one way or the other and then apply the patch on top. So yes for pre-patch. A different issue is referencing the pre-patch state against an upstream commit or release. Their pre-patch state might differ from ours.

Mailing list: hopefully `git format-patch` output, including oids and a repo URL.

CVEs: wildly differing, dependent on the reporter.

> Importance: When you encounter a conflict or a semantic mismatch during
> backporting, do you actually rely on finding that original source context? Or
> do you mostly rely on reading the .patch content itself and your current
> target codebase to resolve it manually?

Mostly the latter, at least for smaller patches. If that does not work, then the former. "Work" here depends on test cases, both for the built product and for the bug being fixed.

> The Reality: Does the lack of "original context" (e.g., due to squashed
> commits in other distros, or lack of metadata) make these patches
> significantly harder to maintain, or is this generally a non-issue for
> experienced packagers?

I try to take patches from upstreams, and when there are problems it's typically due to their improper usage of git (squashed commits, bad commit messages, lack of a branch model).

I look at other distros only in case of actual packaging decisions (do they build dynamic libraries, how do they deal with improper soname usage by upstream or other upstream weirdness), both to have some common ground and leverage. In general I find it difficult to "navigate" other distros' package source repos and even to identify what's the current release branch, the actual package source and the like. That's not a criticism, but it's different enough that I'd rather look upstream.

> I suspect that in many real-world cases, maintainers effectively work with
> "isolated patches" without access to the full original upstream context.

I work with upstream source trees all the time and keep most of my patches in branches on top of the commit which corresponds to the source of the Fedora package. This helps with easy rebasing and such.

I also keep some of my Fedora packaging in a git tree like this:

- git repo with spec file, patches and the like
- upstream source as a git submodule "source(s)"

This corresponds to Fedora's package source layout (dist-git), where "sources" would be a file instead (identifying source tarballs by sha512), and allows me to follow upstream development easily.

There's also a wider initiative to do it the other way round (source-git, packit), that is, to put packaging information into a subdir of the upstream source, which I consider the wrong way round, obviously ;-)

> Any insights, whether it’s a detailed "war story," a brief explanation of your
> workflow, or even a simple "yes/no", would be incredibly valuable to me. I am
> eager to learn from your practical experience and hope to spark a
> constructive discussion on how we can better bridge the gap between academic
> theory and packaging reality.

As for stories: Several times when I looked at other distros, I've found "my" patch adaptations or packaging solutions. Not necessarily referenced, but hey, I'm not even sure I'm doing this in every case.

As a general rule, I think we should try and work with upstream as much as possible. That's the natural "meeting point".

Michael
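
To illustrate the point about oids in patches: below is a minimal sketch, not anyone's actual tooling, of how one might pull the identifying information out of a standalone patch. It assumes the patch was produced by `git format-patch` in a sha1 repository, so it starts with a `From <commit oid> Mon Sep 17 00:00:00 2001` line and carries `index <pre>..<post>` lines with (possibly abbreviated) blob oids per file. The names `patch_provenance` and `SAMPLE` are made up for the example, and the path parsing is deliberately naive (no quoted paths, no renames).

```python
import re

# First line of every `git format-patch` mail (sha1 repositories assumed):
#   From <40-hex commit oid> Mon Sep 17 00:00:00 2001
COMMIT_RE = re.compile(r"^From ([0-9a-f]{40}) Mon Sep 17 00:00:00 2001$")
# Per-file header of a git diff.
DIFF_RE = re.compile(r"^diff --git a/(.+) b/(.+)$")
# Blob oids of the pre-image and post-image, optionally followed by the mode.
INDEX_RE = re.compile(r"^index ([0-9a-f]+)\.\.([0-9a-f]+)(?: \d+)?$")


def patch_provenance(patch_text):
    """Return (commit_oid_or_None, {path: (pre_blob, post_blob)})."""
    commit = None
    blobs = {}
    current = None  # path of the file whose "index" line we expect next
    for line in patch_text.splitlines():
        m = COMMIT_RE.match(line)
        if m and commit is None:
            commit = m.group(1)
            continue
        m = DIFF_RE.match(line)
        if m:
            current = m.group(2)
            continue
        m = INDEX_RE.match(line)
        if m and current is not None:
            blobs[current] = (m.group(1), m.group(2))
            current = None
    return commit, blobs


# Hypothetical patch, shortened; the oids are invented for illustration.
SAMPLE = """\
From 1234567890abcdef1234567890abcdef12345678 Mon Sep 17 00:00:00 2001
From: A Hacker <hacker@example.org>
Date: Mon, 1 Jan 2024 00:00:00 +0000
Subject: [PATCH] fix off-by-one

---
 src/foo.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/foo.c b/src/foo.c
index 0123abc..4567def 100644
--- a/src/foo.c
+++ b/src/foo.c
@@ -1 +1 @@
-old
+new
"""

if __name__ == "__main__":
    commit, blobs = patch_provenance(SAMPLE)
    print("commit:", commit)
    for path, (pre, post) in blobs.items():
        print(path, "pre-image blob:", pre, "post-image blob:", post)
```

If the commit oid is missing (a plain `.diff` from an advisory, say), the pre-image blob oid can still pin down the exact pre-patch file content, provided that blob exists in a clone you have; `git cat-file -p <blob>` would then print it. Whether it exists in your tree is exactly the "their pre-patch state might differ from ours" question.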
