Re: [gentoo-dev] Killing herds, again
One fine day, Wed, 03.04.2019 at 19:35, Michał Górny wrote:
> My goal here is to make sure that we have clear and correct
> information about package maintainers. Most notably, if a package has
> no active maintainer, we really need to have 'up for grabs' issued and
> package marked as maintainer-needed, rather than hidden behind some
> project whose members may not even be aware of the fact that they're
> its maintainers.
>
>
> What do you think?

I agree with most of what was written in the original post, but
regarding this point I'd hate to see packages maintained by a project
that makes sense be thrown into a generic maintainer-needed.
We need a maintainer-needed project status, and to improve the tooling
to report it.
So similar to what Alec said, but I think it's fine to throw them into
the generic bucket when the maintaining project was indeed just
herd-like. But I don't think we should do that for projects where it
makes sense to have the packages grouped under a project. A recent
example was/is MATE; an older example is maybe enlightenment.

We could mark projects as maintainer-needed and have a script even add
comments in metadata.xml (and a matching script to remove those, once
the project status changes); have any maintainer-needed list generation
consider packages marked as maintained by such a maintainer-needed
project (with some potential grouping of them together in the list),
etc.
This could then also be used to signify huge staffing needs, even if
someone is sort-of trying to take care of them within that project.

But, as said, I agree with almost everything else mentioned in the
thread starter.

Mart
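
Editorial sketch of the list-generation idea in the message above: a
minimal Python example, assuming each package's metadata.xml lists
project maintainers as <maintainer type="project"> elements with an
<email> child. The function names, the e-mail addresses and the set of
maintainer-needed projects are hypothetical and only illustrate the
grouping; this is not an existing Gentoo tool.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def project_maintainers(metadata_xml: str) -> list[str]:
    """Return e-mail addresses of <maintainer type="project"> entries."""
    root = ET.fromstring(metadata_xml)
    return [
        (m.findtext("email") or "").strip()
        for m in root.findall("maintainer")
        if m.get("type") == "project"
    ]


def group_by_needy_project(
    packages: dict[str, str], needy: set[str]
) -> dict[str, list[str]]:
    """Group package names under each maintainer-needed project.

    `packages` maps a package name to the text of its metadata.xml;
    `needy` is the (assumed, externally maintained) set of project
    e-mail addresses flagged as maintainer-needed.
    """
    grouped: dict[str, list[str]] = defaultdict(list)
    for name, xml_text in packages.items():
        for email in project_maintainers(xml_text):
            if email in needy:
                grouped[email].append(name)
    # Sort names so the generated list is stable between runs.
    return {email: sorted(names) for email, names in grouped.items()}
```

Packages maintained only by people, or by projects not flagged as
maintainer-needed, simply do not appear in the result, so the output
could be appended to a regular maintainer-needed list as a separate
group per project.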
Re: [gentoo-dev] Killing herds, again
On Wed, 2019-04-03 at 18:52 -0400, Alec Warner wrote:
> On Wed, Apr 3, 2019 at 1:36 PM Michał Górny wrote:
> > Less structured projects often have problems tracking member activity.
> > More than once a project effectively died when all members became
> > inactive, yet effectively hid the fact that the relevant packages were
> > unmaintained and sometimes discouraged more timid developers from fixing
> > bugs.
> >
>
> I'm not sure I follow this logic.
>
> 1) We know who is in every project.
> 2) We know the state of every developer.
>
> We should be able to detect if:
>
> a) A project is empty.
> b) A project has no active developers.
>
> I don't see how this is markedly different from a package, assigned to no
> maintainer. Or a package assigned to a maintainer who is not active.

I'm afraid it's not that simple. Just because someone is active as
a Gentoo developer doesn't mean that person is active in this specific
project. There's no simple way to know that, and I don't think we
should try to figure out complex ways of doing that as they are capable
of causing more harm than good with false positives.

There is also a deeper problem to this. I believe that having one
personally listed as the maintainer (and especially as the sole
maintainer) makes that person feel more responsible about the package
and its state. If the developer in question doesn't want to maintain it
anymore, an 'up for grabs' announcement is quite likely.

On the other hand, I honestly doubt most of the projects regularly look
for unmaintained packages. If a project's developer loses interest in
a particular package, the developer can easily assume someone else will
take care of it now. Which sometimes happens and sometimes does not.

There is also another problem with assumptions I myself make as
an Undertaker. When a developer is inactive and package reassignment
happens, I usually don't remove that developer from all projects, and
instead just look for projects that have no other developers.
After all, the purpose is to make sure packages aren't left
unmaintained, and removing a developer from active projects doesn't
serve that goal. However, as you can see, this way I can easily end up
leaving a project consisting solely of inactive developers because
I lack the necessary context.

> So the solution here seems to be to fix the tools to detect this situation
> and make it clearer that the package has no active maintainer?

Better looks are always welcome, and I say thankya if you can deliver
them. However, as I said they won't solve the problem completely.

[...]
> > Firstly, they usually tend to include packages that none of the project
> > members is actually interested in maintaining. This also includes
> > packages added by other developers (let's shove it in here, it matches
> > their job description!) or packages leftover from other developers
> > (where the project was backup maintainer). This means having a lot of
> > packages that seem to have a maintainer but actually don't.
> >
>
> I have a lot of empathy for this point FWIW. Tooling can find empty /
> abandoned projects, but we cannot do things like clearly say "This package
> shouldn't be in this project"
> or "This package is not actually maintained by a project".
>
> One rule we might use here is that packages always need at least a single
> human maintainer, and the project is just an annotation that doesn't affect
> maintainer status.
> So e.g. if there are 8 competing cron implementations, "cron-team" can't
> maintain all 8, they have to find individual humans to vouch for each[0].

I don't think we want to impose this on all projects since -- as
I mentioned -- many of them make sense as co-maintaining arbitrary
packages. And then, I don't think we can really impose it on only some
of the projects.

> > Secondly, they frequently lack proper structure and handling of leaving
> > members. Therefore, whenever a member maintaining a specific set of
> > packages leaves, it is possible that the number of not-really-maintained
> > packages increases.
> >
> > Thirdly, they tend to degenerate and become defunct (much more than
> > projects that make sense). Then, the number of not-really-maintained
> > packages ends up being really high.
> >
> > My goal here is to make sure that we have clear and correct information
> > about package maintainers. Most notably, if a package has no active
> > maintainer, we really need to have 'up for grabs' issued and package
> > marked as maintainer-needed, rather than hidden behind some project
> > whose members may not even be aware of the fact that they're its
> > maintainers.
> >
> > What do you think?
> >
> >
> [0] This is itself a question the project needs to decide for itself; does
> every package need to be maintained actively? Some might answer no, and
> maybe running for months / years without a maintainer is OK for Gentoo. It's
> not an opinion I personally hold, but I suspect some community members do
> hold it.
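
Editorial sketch of the detection Alec describes in the quoted text
(projects that are empty, or that have no active developers): a minimal
Python example, assuming project membership and a set of active
developers are already available as plain data. The Project class, the
function name and all sample names are hypothetical. As the reply above
points out, a developer who is active in Gentoo is not necessarily
active in a given project, so anything this reports is at best a
candidate for manual review, and a project it does not report may still
be effectively unmaintained.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """Hypothetical record: a project name and its member nicknames."""
    name: str
    members: list[str] = field(default_factory=list)


def classify_projects(
    projects: list[Project], active_devs: set[str]
) -> tuple[list[str], list[str]]:
    """Return (empty, inactive) project names.

    empty    -- projects with no members at all (case a)
    inactive -- projects with members, none of whom is in
                `active_devs` (case b)
    """
    empty: list[str] = []
    inactive: list[str] = []
    for project in projects:
        if not project.members:
            empty.append(project.name)
        elif not any(m in active_devs for m in project.members):
            inactive.append(project.name)
    return empty, inactive
```

For example, with active_devs = {"alice"}, a project whose only member
is "bob" lands in the inactive list, while a project with members
"alice" and "bob" is not reported at all, even if alice never touches
its packages. That second case is exactly the false negative the reply
warns about.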
Re: [gentoo-dev] Killing herds, again
On Wed, Apr 3, 2019 at 1:36 PM Michał Górny wrote:

> Hello, everyone.
>
> Back in 2016, we killed the technical representation of herds. Some
> of them were disbanded completely, others merged with existing projects
> or converted into new projects. This solved some of the problems with
> maintainer declarations but it didn't solve the most important problem
> herds posed. Sadly, it seems that the spirit of herds survived along
> with those problems.
>
> Herds served as a method of grouping packages by a common topic,
> somewhat similar to (but usually broader than) categories. In their
> mature state, herds had either their specific maintainers, or were
> directly connected to projects (which in turn provided maintainers for
> the herds). Today, we still have many herds that masquerade either
> as complete projects, or semi-projects (i.e. project entries without
> explicit lead, policies or anything else).
>
>
> What's wrong with herds?
>
> The main problem with herds is that they represent an artificial
> relation between packages. The only common thing about them is topic,
> and there is no real reason why a group of people would maintain all
> packages regarding the same topic. In fact, it is absurd -- say, why
> would a single person maintain 10+ competing cron implementations?
> Surely, there is some common knowledge related to running cron,
> and it is entirely possible that a single person would use a few
> different cron implementations on different systems. But that doesn't
> justify creating an artificial project to maintain all cron
> implementations.
>
> Mapping this to reality, projects usually represent a few developers,
> each of them interested in a specific subset of packages maintained by
> the project. In some cases, this is explicitly noted as project member
> roles; in others, it is not stated clearly anywhere. In both cases,
> there is usually some group of packages that are assigned to
> the specific project but not maintained by any of the project members.
>
> Less structured projects often have problems tracking member activity.
> More than once a project effectively died when all members became
> inactive, yet effectively hid the fact that the relevant packages were
> unmaintained and sometimes discouraged more timid developers from fixing
> bugs.
>

I'm not sure I follow this logic.

1) We know who is in every project.
2) We know the state of every developer.

We should be able to detect if:

a) A project is empty.
b) A project has no active developers.

I don't see how this is markedly different from a package, assigned to no
maintainer. Or a package assigned to a maintainer who is not active.

So the solution here seems to be to fix the tools to detect this situation
and make it clearer that the package has no active maintainer?

(I tend to agree with your general thrust of the rest of the proposal, but
I think in general limiting how projects are used on a statutory basis
seems incorrect.)

>
>
> What kind of projects make sense?
>
> If we are to fight herd-like projects, I think it is important to
> consider a bit what kind of projects make sense, and which ones cause
> herd-like trouble.
>
> The two projects maintaining the largest number of packages in Gentoo
> are respectively the Perl project and the Python project. Strictly
> speaking, both could be considered herd-like -- after all, they maintain
> a lot of packages belonging to the same category. To some degree, this
> is true. However, I believe those make sense because:
>
> a. They maintain a central group of packages, eclasses, policies etc.
> related to writing ebuilds using the specific programming language,
> and help other developers with it. The existence of such a project is
> really useful.
>
> b. The packages maintained by them have many common properties,
> frequently come from common sources (CPAN, pypi) and that makes it
> possible for a large number of developers to actually maintain all
> of them.
>
> The Python project I know better, so I'll add something. It does not
> accept all Python packages (although some developers insist on adding us
> to them without asking), and especially not random programs written in
> the Python language. It specifically focuses on Python module packages,
> i.e. resources generally useful to Python programmers. This is what
> makes it different from a common herd project.
>
> The third biggest project in Gentoo is -- in my opinion -- a perfect
> example of a problematic herd-project. The games project maintains
> a total of 877 packages, and sad to say, many are in really bad shape.
> Even if we presumed all developers were active, this gives us 175
> packages per person, and I seriously doubt one person can actively
> maintain that many programs. Add to that the fact that many of them are
> proprietary and fetch-restricted, and only the people possessing a copy
> can maintain it, and you
[gentoo-dev] Killing herds, again
Hello, everyone.

Back in 2016, we killed the technical representation of herds. Some
of them were disbanded completely, others merged with existing projects
or converted into new projects. This solved some of the problems with
maintainer declarations but it didn't solve the most important problem
herds posed. Sadly, it seems that the spirit of herds survived along
with those problems.

Herds served as a method of grouping packages by a common topic,
somewhat similar to (but usually broader than) categories. In their
mature state, herds had either their specific maintainers, or were
directly connected to projects (which in turn provided maintainers for
the herds). Today, we still have many herds that masquerade either
as complete projects, or semi-projects (i.e. project entries without
explicit lead, policies or anything else).


What's wrong with herds?

The main problem with herds is that they represent an artificial
relation between packages. The only common thing about them is topic,
and there is no real reason why a group of people would maintain all
packages regarding the same topic. In fact, it is absurd -- say, why
would a single person maintain 10+ competing cron implementations?
Surely, there is some common knowledge related to running cron,
and it is entirely possible that a single person would use a few
different cron implementations on different systems. But that doesn't
justify creating an artificial project to maintain all cron
implementations.

Mapping this to reality, projects usually represent a few developers,
each of them interested in a specific subset of packages maintained by
the project. In some cases, this is explicitly noted as project member
roles; in others, it is not stated clearly anywhere. In both cases,
there is usually some group of packages that are assigned to
the specific project but not maintained by any of the project members.

Less structured projects often have problems tracking member activity.
More than once a project effectively died when all members became
inactive, yet effectively hid the fact that the relevant packages were
unmaintained and sometimes discouraged more timid developers from fixing
bugs.


What kind of projects make sense?

If we are to fight herd-like projects, I think it is important to
consider a bit what kind of projects make sense, and which ones cause
herd-like trouble.

The two projects maintaining the largest number of packages in Gentoo
are respectively the Perl project and the Python project. Strictly
speaking, both could be considered herd-like -- after all, they maintain
a lot of packages belonging to the same category. To some degree, this
is true. However, I believe those make sense because:

a. They maintain a central group of packages, eclasses, policies etc.
related to writing ebuilds using the specific programming language,
and help other developers with it. The existence of such a project is
really useful.

b. The packages maintained by them have many common properties,
frequently come from common sources (CPAN, pypi) and that makes it
possible for a large number of developers to actually maintain all
of them.

The Python project I know better, so I'll add something. It does not
accept all Python packages (although some developers insist on adding us
to them without asking), and especially not random programs written in
the Python language. It specifically focuses on Python module packages,
i.e. resources generally useful to Python programmers. This is what
makes it different from a common herd project.

The third biggest project in Gentoo is -- in my opinion -- a perfect
example of a problematic herd-project. The games project maintains
a total of 877 packages, and sad to say, many are in really bad shape.
Even if we presumed all developers were active, this gives us 175
packages per person, and I seriously doubt one person can actively
maintain that many programs. Add to that the fact that many of them are
proprietary and fetch-restricted, and only the people possessing a copy
can maintain it, and you see how blurry the package mapping is.

Let's look at the next projects on the list. Proxy-maint is very
specific as it proxies contributors; however, it is technically valid
since all project members can (and should) actively proxy for any
maintainers we have. Though I have to admit the number of maintained
packages simply overburdens us.

Haskell, Java, Ruby are other examples of projects focused on
programming languages. KDE and GNOME projects generally make sense
since packages maintained by those projects have many common features,
and the core set has common upstream and sometimes synced releases.
It is reasonable to assume members of those projects will maintain all,
or at least the majority of those packages.

The next project is Sound -- and in my experience, it involves a lot of
poorly maintained or unmaintained packages. Again, the problem