Dragan Simic <[email protected]> writes:

> On 2025-09-28 22:08, Arsen Arsenović wrote:
>> Dragan Simic <[email protected]> writes:
>>> On 2025-09-28 19:42, Collin Funk wrote:
>>>> Arsen Arsenović <[email protected]> writes:
>>>>> Dragan Simic <[email protected]> writes:
>>>>>> On 2025-09-28 11:31, Arsen Arsenović wrote:
>>>>>>> IMO this is a good idea.  New contributors would likely find the
>>>>>>> workflow easier (even though I personally like a mail-based
>>>>>>> workflow a lot), we can incorporate automated testing, and
>>>>>>> Codeberg appears to be ideologically aligned with the GNU
>>>>>>> project goals.
>>>>>>
>>>>>> Would sending patches through a mailing list disappear as an
>>>>>> option after the migration to Codeberg?  I'd find that a huge
>>>>>> step back.
>>>>>
>>>>> There might be no need for that for coreutils (which is relatively
>>>>> simple to test - consistent and automated testing is the core
>>>>> benefit of forges IMO).
>>>>>
>>>>> That said, it'd be good to address whatever concerns you might
>>>>> have with such a switch to see if it is possible to build up
>>>>> improvements in workflow.
>>>>
>>>> Myself and I assume others, who may correct me if I am wrong, still
>>>> like mailing lists.  Therefore, I would be for keeping that as the
>>>> primary way to report bugs/send patches.
>>>
>>> I also prefer mailing lists.  They offer flexibility that no forge
>>> can even come close to, simply because on mailing lists nearly
>>> nothing has to be in some strictly predefined layout or conform to
>>> some strict form.
>>
>> This is also their downfall.  It directly leads to the contribution
>> process being inconsistent, incoherent, and hard to automate.
>
> For less complex projects, the more linear and more structured
> workflows that pull requests inevitably result in are actually
> beneficial.  For complex projects, they simply don't fit the bill.
This is better judged on a patch-by-patch basis.  Most patches aren't
complex enough to require such discussion.
Most discussions which /do/ branch that much are unproductive IME, but,
again, I agree that it'd be nice to see development in this direction.
>> Consider testing for a project like GCC. To test GCC properly, you
>> cannot just run a testsuite on your machine.
>
> You can't run that only, but you should before submitting any patches.
> That's the standard procedure for any kind of a workflow.
Yes? Sorry, I'm not sure what you're trying to point out here - of
course, one needs to run the tests they write.
>> Due to unfortunate circumstances, GCC requires a baseline of tests to
>> compare against, and this comparison is considered successful only if
>> the differences between the baseline and patched GCC are passing tests.
>> So, you need to also test the baseline. This process is easy to
>> automate but (relatively) difficult to explain to new contributors.
>
> Well, if they can't grasp that, I wonder how are they actually able
> to make high-quality changes to such a complex codebase? That sounds
> to me like having someone capable of running a marathon, but unable to
> cross the street without someone else's help.
It is certainly possible to explain (it is standard procedure right
now), but it is something that machines can do without bothering people.
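To make concrete what the machine would be doing - a minimal sketch in
Python, assuming DejaGnu-style .sum files as input (GCC ships real
tooling for this in contrib/, this is only the shape of the job):

    #!/usr/bin/env python3
    # Sketch of the baseline comparison described above: read
    # DejaGnu-style result lines ("PASS: ...", "FAIL: ...") from a
    # baseline .sum file and a patched .sum file, and report only the
    # tests that newly fail.  The two-file interface is illustrative.
    import sys

    def results(path):
        out = {}
        with open(path, errors="replace") as f:
            for line in f:
                status, _, name = line.partition(": ")
                if status in ("PASS", "FAIL", "XPASS", "XFAIL",
                              "UNSUPPORTED"):
                    out[name.strip()] = status
        return out

    baseline = results(sys.argv[1])   # e.g. baseline/gcc.sum
    patched = results(sys.argv[2])    # e.g. patched/gcc.sum

    regressions = sorted(name for name, status in patched.items()
                         if status == "FAIL"
                         and baseline.get(name) != "FAIL")
    for name in regressions:
        print("regressed:", name)
    sys.exit(1 if regressions else 0)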
I disagree that failing to do the above indicates inability to make
high-quality changes. What we tend to call "complex" projects are
usually large collections of uniformly distributed simplicity. The
kernel and toolchain are perfect examples of this IMO.
>> The previous part was possible to do on just your machine, but
>> "properly" testing GCC also requires testing on various weird platforms.
>> For this, we have the cfarm. Access to the cfarm is a privilege not
>> available to many contributors (though luckily I have it), awareness of
>> this issue is also not something everyone has, and access to the cfarm
>> is moot if you aren't knowledgeable in the strange targets GCC ought to
>> be tested for (e.g. AIX.. I've spent many hours digging around for AIX
>> documentation, and trying to figure out AIX executable formats and
>> library formats).  Such weird machines are not readily available
>> (another such example is MS Windows workstations - I haven't owned any in
>> years and I couldn't honestly say that I could operate one for purposes
>> of testing MinGW - as a real-world example, the Glorious Glasgow Haskell
>> Compiler project automatically tests GHC on Windows and when I
>> contributed to GHC I never had to worry about that decrepit platform).
>
> Well, that's why higher-level contributors and maintainers exist,
> to follow each contribution through and make sure it's fine.
It's a waste of their time to run tests that can be done automatically
on patches.
> Not every contributor has to instantly be aware of everything, and
> I'm afraid that most people can't process huge amounts of details
> quickly anyway, regardless of the workflow.
So, let's automate it ;)
That way the contributor doesn't need to be fully aware of details.
>>> For example, I'm unaware of any forge that offers fully threaded
>>> discussions about pull requests and issues, which are simply mandatory
>>> for any kind of complex discussion.
>> This, I agree, is unfortunate. But, for most changes, heavily-threaded
>> discussion doesn't tend to happen, whereas extensive testing is required
>> for all changes. It would be nice if the forges grew in this regard
>> (and I hope that the use of forges by, say, the GNU toolchain, could
>> lead to funds being redirected to enable work in that direction).
>
> Again, it depends on the complexity of the project, as I noted above.
>
>> Note that many forges (not sure about Forgejo specifically, however)
>> offer some form of threading for discussing individual parts of code
>> (e.g. commenting on a line starts a new, albeit linear, thread about
>> it).
>
> Having some form available doesn't mean it's automatically good enough.
If email is considered good, merely having some form of threading
probably suffices - the threading model in email is so trivial that I
doubt an alternative model of such simplicity exists.
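For reference, that model is just "each message names its parent via
In-Reply-To" - a sketch using only the Python standard library (the
file name is hypothetical; real threaders also consult References and
fall back to subjects):

    # Minimal sketch of email threading: every message points at its
    # parent via In-Reply-To, so a thread tree falls out of one pass.
    import mailbox
    from collections import defaultdict

    children = defaultdict(list)
    for msg in mailbox.mbox("thread.mbox"):   # hypothetical file name
        parent = (msg["In-Reply-To"] or "").strip() or None
        children[parent].append(msg)

    def show(parent=None, depth=0):
        for msg in children[parent]:
            print("  " * depth + (msg["Subject"] or "(no subject)"))
            show((msg["Message-ID"] or "").strip(), depth + 1)

    show()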
>>> On top of everything, forges pull the metadata, as what's contained in
>>> the discussions, into some kind of their own format that's either tied
>>> to the particular forge or isn't some simple data format that can be
>>> used with no need for tools that are more complex than a pager such as
>>> less(1). Mailing lists don't do any of that, and they allow everyone
>>> to have a full local copy of all the metadata, which can only be good
>>> in the long term. Metadata is gold.
>> Metadata is lacking in MLs, besides plain email metadata, though.
>
> I believe it was clear that I referred to the implicit metadata, i.e.
> the knowledge contained within the discussions, not to the technical
> metadata contained in email message headers.
>
>> I do agree with the rest of this paragraph. I can't concede that only
>> MLs can achieve this, however; it is easy to imagine (and possibly
>> implement) a way to download discussions from a forge. Such a download
>> would necessarily also be richer.
>
> It would still be some new format...
Perhaps, but this isn't especially problematic.
> ... that would require some new, more complex tools for viewing, ...
I don't see why that'd be the case.
> ... locking the contents to the tools, which in the long term isn't
> good.
Tools like mail processing software? I'm afraid that this requirement
isn't new. There's no reason a format as simple as mbox can't be used
for this purpose (FWIW, one could provide mbox export, even!).
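To illustrate how low the bar is, dumping such a hypothetical export
into pager-friendly form needs nothing beyond stock Python:

    import mailbox

    # Walk a (hypothetical) mbox export of a forge discussion; any
    # pager or mail client can handle the same file directly.
    for msg in mailbox.mbox("discussion-export.mbox"):
        print(msg["Date"], "|", msg["From"], "|", msg["Subject"])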
>>>> I don't think anyone will complain if the Codeberg interface leads
>>>> to more *quality* bug reports and patches, though. Some of us track
>>>> the pull requests and bugs on the GitHub mirror. But the closed
>>>> history there will show a lot of spam.
>>> I think people are focussing a bit too much on the need for having new
>>> contributors, who actually may or may not prefer the GitHub-style
>>> workflow. Assuming that by default all new contributors prefer that
>>> kind of workflow and find mailing lists a huge barrier to entry
>>> is simply a false assumption. See, I was a new contributor to quite
>>> a few projects, and I always preferred mailing lists.
>> Please don't project such intention on me. I have detailed other
>> significant reasons why a more structured approach is preferable.
>> Practice shows that ML-based workflows are hard to automate around, and
>> automation as described above is very necessary.
>
> Practice shows that it actually isn't hard, but it just requires that
> some work is put into it. Just have a look at the build farm provided
> by Intel for the Linux kernel, which builds and tests each submitted
> kernel patch automatically, on different machines.
I'm aware of it.  Similar ones exist for the toolchain, and I gave
SourceHut's ML-processing builders as another example.
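The happy path of such a builder is short - roughly the following
sketch, where the archive URL layout is lore.kernel.org's and "make
check" stands in for whatever the project's real test entry point is:

    # Rough shape of an ML-driven builder: fetch a series as an mbox
    # from a public-inbox archive, apply it to a scratch branch, and
    # run the tests.
    import gzip, subprocess, sys, urllib.request

    msgid = sys.argv[1]
    url = f"https://lore.kernel.org/all/{msgid}/t.mbox.gz"
    mbox = gzip.decompress(urllib.request.urlopen(url).read())

    subprocess.run(["git", "checkout", "-B", "ci-scratch",
                    "origin/master"], check=True)
    if subprocess.run(["git", "am"], input=mbox).returncode != 0:
        subprocess.run(["git", "am", "--abort"])
        sys.exit("series did not apply - and this is the fragile step")
    subprocess.run(["make", "check"], check=True)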
>> While this extreme form of the assumption ("all" rather than "many" or
>> even "most") is indeed false, the weaker form is not a false assumption.
>> Anecdotally, as someone who taught at a university I can tell you that
>> interacting with mailing software is enough to dissuade many students.
>
> Well, that's your opinion and your experience, which may also not be
> applicable to the broad audience. Not all students are enthusiasts,
> which I know very well first hand.
>
>> Heck, many students don't even realize what "mailing software" is or
>> that they are using it (when using webmail).
>
> That's their fault, not ours.
>
>> (at my university the situation was yet more depressing, to my awareness
>> only two faculty members besides myself used non-webmail mail clients,
>> specifically Alpine and Thunderbird)
>
> Again, that's their fault only, ...
> and shouldn't serve as a measuring stick for anything else.
I disagree.  It is entirely reasonable to look to the pool of people who
will eventually succeed us when deciding what to do.
>> This kind of workflow is not present in most enterprise environments
>> (the exceptions being, IME, mostly firms dealing primarily with free
>> software, and even then a specific subsection dealing with GNU and
>> Linux), where most engineers will be located most of the time, nor in
>> academic environments, where most students are (by definition)
>> present, ergo it is fair to assume most will have little interaction
>> with it.
>
> I believe you're fully aware this isn't an enterprise work environment,
> so such examples shouldn't provide some kind of steering.
Why not? The argument here is about how well-understood by a broad
audience these approaches are.
>> Many contributors doing cost-benefit analysis with this in mind for
>> minor changes will result in many minor but useful proposals being lost.
>> (note, however, that a web alternative also has some barrier to entry
>> assuming there isn't OpenID-esque ways to authenticate without creating
>> new accounts, which is also annoying, so, assuming no such way of
>> access, this could dissuade some contributors - I hope allowing
>> authentication using external services combined with the usefulness of
>> signing up for a generally-used forge like Codeberg or a heavily-used
>> forge like the Sourceware one is enough to overcome this cost)
>
> Please see my notes about GitGitGadget in another thread. That might
> be the best of both worlds approach, by providing different options to
> different people.
It isn't; IMO it solves the wrong equation.
If you believe that reliable CI systems can be built off of mailing
lists, then I propose that the email parsing solution in question is
used to generate patchsets on forges from emails, and to enable two-way
communication. *This* would be the best of both worlds (GGG is the dual
of this).
This keeps the presumably sufficient solution to the problem of poor
quality of input (specifically, the emails and email threads), and
allows integration into a more structured system.
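A sketch of the forge-facing half, with OWNER/REPO, the branch name,
and TOKEN as placeholders (the pull-request endpoint follows the
Gitea/Forgejo API that Codeberg exposes):

    # One direction of the proposed bridge: take a mail that git-am
    # accepts, push it as a branch, and open a pull request over the
    # Gitea/Forgejo REST API.  Placeholders, not a real deployment.
    import email, json, subprocess, sys, urllib.request

    raw = sys.stdin.buffer.read()
    subject = email.message_from_bytes(raw)["Subject"] or "(no subject)"
    branch = "from-ml/demo"

    subprocess.run(["git", "checkout", "-B", branch, "origin/master"],
                   check=True)
    subprocess.run(["git", "am"], input=raw, check=True)
    subprocess.run(["git", "push", "-f", "origin", branch], check=True)

    req = urllib.request.Request(
        "https://codeberg.org/api/v1/repos/OWNER/REPO/pulls",
        data=json.dumps({"title": subject,
                         "head": branch, "base": "master"}).encode(),
        headers={"Content-Type": "application/json",
                 "Authorization": "token TOKEN"},
        method="POST")
    urllib.request.urlopen(req)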
>> I'd love to show empirical data for this, but I am not aware of anyone
>> collecting such data.
>
> Well, statistics doesn't always provide the best source of information.
> It often overlooks the associated quality.
>
>>> Also, focusing too much on the new contributors and not thinking about
>>> the already existing ones at the same time may not be the best
>>> approach in the long run.
>> Naturally.
>> I'd love to see a magical way to make a non-fragile CI system for
>> mailing lists, but I am not aware of any such system so far. SourceHut
appears to have come closest, in my opinion, but it relies strictly on
>> emails generated by git format-patch. This is easy to get wrong
>> (unintentionally misconfiguring git send-email, or failing to configure
>> it and hence falling back to sending patches in one of many arbitrary
>> ways), it may even be impossible in some conditions (e.g. corporate
>> email servers munging outgoing emails).
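For illustration, even deciding whether a message still looks like
intact format-patch output comes down to heuristics like this sketch
(real systems need far more than this):

    # Crude heuristic for "does this still look like git-format-patch
    # output?" - the kind of guesswork ML-based CI is built on.
    import email, email.policy, sys

    msg = email.message_from_bytes(sys.stdin.buffer.read(),
                                   policy=email.policy.default)
    body = msg.get_body(preferencelist=("plain",))
    text = body.get_content() if body else ""
    if "diff --git " in text and "\n---\n" in text:
        print("plausibly intact patch")
    else:
        print("mangled or not a patch (HTML part, flowed text, ...)")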
>
> Please see my notes above about the kernel test farm. It's all doable,
> but it requires putting some work into it, instead of resorting to finding
> some ready-to-use solutions that often end up needing work as well.
>
>> I, too, like the convenience of git-send-email combined with a mailing
>> client I can use as efficiently as muscle memory, but the above points
>> outweigh the slight loss of convenience on my end.
>
> Well, it isn't just you who might be affected.
I didn't imply so, I was sharing an opinion.
>> (PS: we appear to be in the same region, that's fun!)
>
> Indeed. :)
--
Arsen Arsenović
