On Wed, Apr 3, 2019 at 1:36 PM Michał Górny <mgo...@gentoo.org> wrote:

> Hello, everyone.
>
> Back in 2016, we killed the technical representation of herds.  Some
> of them were disbanded completely, others merged with existing projects
> or converted into new projects.  This solved some of the problems with
> maintainer declarations but it didn't solve the most important problem
> herds posed.  Sadly, it seems that the spirit of herds survived along
> with those problems.
>
> Herds served as a method of grouping packages by a common topic,
> somewhat similar to categories (but usually broader).  In their
> mature state, herds had either their specific maintainers, or were
> directly connected to projects (which in turn provided maintainers for
> the herds).  Today, we still have many herds that are disguised either
> as complete projects, or semi-projects (i.e. project entries without
> explicit lead, policies or anything else).
>
>
> What's wrong with herds?
> ------------------------
> The main problem with herds is that they represent an artificial
> relation between packages.  The only common thing about them is topic,
> and there is no real reason why a group of people would maintain all
> packages regarding the same topic.  In fact, it is absurd -- say, why
> would a single person maintain 10+ competing cron implementations?
> Surely, there is some common knowledge related to running cron,
> and it is entirely possible that a single person would use a few
> different cron implementations on different systems.  But that doesn't
> justify creating an artificial project to maintain all cron
> implementations.
>
> Mapping this to reality, projects usually represent a few developers,
> each of them interested in a specific subset of packages maintained by
> the project.  In some cases, this is explicitly noted as project member
> roles; in others, it is not stated clearly anywhere.  In both cases,
> there is usually some group of packages that are assigned to
> the specific project but not maintained by any of the project members.
>
> Less structured projects often have problems tracking member activity.
> More than once a project effectively died when all members became
> inactive, yet effectively hid the fact that the relevant packages were
> unmaintained and sometimes discouraged more timid developers from fixing
> bugs.
>

I'm not sure I follow this logic.

1) We know who is in every project.
2) We know the state of every developer.

We should be able to detect if:

a) A project is empty.
b) A project has no active developers.

I don't see how this is markedly different from a package assigned to no
maintainer, or a package assigned to a maintainer who is not active.

So the solution here seems to be to fix the tools to detect this situation
and make it clearer that the package has no active maintainer?
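
A rough sketch of what such a check could look like, in Python. The
`Project` class, the developer names, and the set of active developers are
all made up for illustration; real tooling would have to pull this data
from wherever project membership and developer status are recorded.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Project:
    # Hypothetical record: a project name plus its member nicknames.
    name: str
    members: list[str] = field(default_factory=list)


def classify(project: Project, active_devs: set[str]) -> str:
    """Return 'empty', 'inactive', or 'ok' for one project."""
    if not project.members:
        return "empty"  # case a) no members at all
    if not any(member in active_devs for member in project.members):
        return "inactive"  # case b) members exist, none are active
    return "ok"


# Illustrative data only; these names do not describe real projects.
projects = [
    Project("cron", []),
    Project("sound", ["dev-a", "dev-b"]),
    Project("python", ["dev-c", "dev-d"]),
]
active_devs = {"dev-c"}

report = {p.name: classify(p, active_devs) for p in projects}
print(report)
```

Any package whose only maintainer is a project classified as "empty" or
"inactive" could then be flagged the same way as a package with no
maintainer.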

(I tend to agree with the general thrust of the rest of the proposal, but
limiting how projects are used on a statutory basis seems incorrect to
me.)


>
>
> What kind of projects make sense?
> ---------------------------------
> If we are to fight herd-like projects, I think it is important to
> consider a bit what kind of projects make sense, and what kind cause
> herd-like trouble.
>
> The two projects maintaining the largest number of packages in Gentoo
> are respectively the Perl project and the Python project.  Strictly
> speaking, both could be considered herd-like -- after all, they maintain
> a lot of packages belonging to the same category.  To some degree, this
> is true.  However, I believe those make sense because:
>
> a. They maintain a central group of packages, eclasses, policies etc.
> related to writing ebuilds using the specific programming language,
> and help other developers with it.  The existence of such a project is
> really useful.
>
> b. The packages maintained by them have many common properties,
> frequently come from common sources (CPAN, pypi) and that makes it
> possible for a large number of developers to actually maintain all
> of them.
>
> The Python project I know better, so I'll add something.  It does not
> accept all Python packages (although some developers insist on adding us
> to them without asking), and especially not random programs written in
> the Python language.  It specifically focuses on Python module packages,
> i.e. resources generally useful to Python programmers.  This is what
> makes it different from a common herd project.
>
> The third biggest project in Gentoo is -- in my opinion -- a perfect
> example of a problematic herd-project.  The games project maintains
> a total of 877 packages, and sad to say many are in really bad shape.
> Even if we presumed all developers were active, this gives us 175
> packages per person, and I seriously doubt one person can actively
> maintain that many programs.  Add to that the fact that many of them are
> proprietary and fetch-restricted, and only the people possessing a copy
> can maintain them, and you see how blurry the package mapping is.
>
> Let's look at the next projects on the list.  Proxy-maint is very
> specific as it proxies contributors; however, it is technically valid
> since all project members can (and should) actively proxy for any
> maintainers we have.  Though I have to admit the number of maintained
> packages simply overburdens us.
>
> Haskell, Java, Ruby are other examples of projects focused on
> programming languages.  KDE and GNOME projects generally make sense
> since packages maintained by those projects have many common features,
> and the core set has common upstream and sometimes synced releases.  It
> is reasonable to assume members of those projects will maintain all, or
> at least the majority of those packages.
>
> The next project is Sound -- and in my experience, it involves a lot of
> poorly maintained or unmaintained packages.  Again, the problem is that
> the packages maintained by the project have little in common -- why
> would any single person maintain a dozen audio players, converters,
> libraries, etc.?  Having multiple people in the project may increase
> the chance that they would happen to cover a larger set of competing
> packages but that's really more incidental than expected.
>
> This is basically how I'd summarize the difference between a valid
> project, and a herd-project.  A valid project maintains packages that
> have many common properties, where it really makes sense for
> an arbitrarily chosen project member to take care of an arbitrarily chosen
> package maintained by the project.  A herd-project maintains packages
> that have only common topic, and usually means that an arbitrarily
> chosen project member maintains only a small subset of all packages
> maintained by the project.
>
> Looking further through the list, projects that seem to make sense
> include ROS, Emacs, maybe base-system, SELinux, ML, X11 (after all, it
> maintains core Xorg and nobody sets them as 'backup' maintainers for
> random X11 programs), PHP, vim...
>
> Projects that are herd-like include science (possibly with all its
> flavors), netmon, video, desktop-misc (this is the very example of
> 'random programs'), graphics...
>
>
> What do I propose?
> ------------------
> I'd like to propose either disbanding herd-like projects entirely, or
> transforming them into more proper projects.  Not only those that are
> clearly dysfunctional but also those that incidentally happen to work
> (e.g. because they maintain a few packages, or because they represent
> a single developer with wide interest).
>
> More specifically, I'd like each of the affected projects to choose
> between:
>
> a. disbanding the project entirely and finding individual maintainers
> for all packages,
>
> b. reducing the packages maintained by the project to a well-defined
> 'core set' whose maintenance by a group of developers makes sense,
> and finding individual maintainers for the remaining packages,
>
> c. splitting one or more smaller projects with well-defined scope from
> the project, and doing a. or b. for the remaining packages.
>
> Let's take a few examples.  For a start, cron project.  Previously, it
> maintained a number of different cron implementations (most having their
> individual maintainers by now), a cronbase package and cron.eclass.
> In this context, option a. means disbanding the project entirely.  Some
> packages already have maintainers, others go maintainer-needed.
>
> Option b. would most likely involve leaving a cron project as a small
> entity to provide policies for consistent cron handling, and maintain
> the cronbase package and cron.eclass.  Different cron implementations
> would go to individual maintainers anyway.
>
> A similar example can be made for the PAM project that maintained
> pambase, Linux-PAM, pam.eclass and some PAM modules.  Here a. means
> giving all packages away, and b. means leaving a minimal project that
> maintains policies, pambase, Linux-PAM and the eclass.  The individual
> modules (except maybe for very common ones, if there are any) would find
> individual maintainers.
>
> A good example for the c. option is the recently revived VoIP project.
> Again, this is an example of herd-project that tries to maintain
> an arbitrary set of loosely related packages.  To some, it might make
> sense, especially since there are only a few VoIP packages left in Gentoo.
> Nevertheless, there is no reason why a single project member would
> maintain multiple competing VoIP stacks.
>
> Here, the c. option would mean creating project(s) for specific stacks
> of interest.  For example, if there was specific project-level interest
> for maintaining Asterisk packages, an Asterisk project would make more
> sense than generic 'VoIP'.
>
>
> Why, again?
> -----------
> As I said before, the main problem with herds is that they introduce
> an artificial and non-transparent relation between packages and package
> maintainers.
>

So, back to this goal (which, again, I think is laudable).


>
> Firstly, they usually tend to include packages that none of the project
> members is actually interested in maintaining.  This also includes
> packages added by other developers (let's shove it in here, it matches
> their job description!) or packages leftover from other developers
> (where the project was backup maintainer).  This means having a lot of
> packages that seem to have a maintainer but actually don't.
>

I have a lot of empathy for this point, FWIW. Tooling can find empty or
abandoned projects, but we cannot do things like clearly say "This package
shouldn't be in this project" or "This package is not actually maintained
by a project".

One rule we might use here is that packages always need at least a single
human maintainer, and the project is just an annotation that doesn't affect
maintainer status.
So e.g. if there are 8 competing cron implementations, "cron-team" can't
maintain all 8, they have to find individual humans to vouch for each[0].
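
As a sketch of how that rule could be checked mechanically, assuming a
package's metadata lists `<maintainer>` elements carrying a `type`
attribute of either "person" or "project" (the sample documents and email
addresses below are invented for illustration):

```python
import xml.etree.ElementTree as ET


def has_human_maintainer(metadata_xml: str) -> bool:
    """True if at least one maintainer entry is a person, not a project."""
    root = ET.fromstring(metadata_xml)
    return any(m.get("type") == "person" for m in root.iter("maintainer"))


# Hypothetical metadata: maintained only by a project.
PROJECT_ONLY = """<pkgmetadata>
  <maintainer type="project">
    <email>cron-team@example.org</email>
  </maintainer>
</pkgmetadata>"""

# Hypothetical metadata: a person vouches for it, project is an annotation.
WITH_PERSON = """<pkgmetadata>
  <maintainer type="person">
    <email>dev@example.org</email>
  </maintainer>
  <maintainer type="project">
    <email>cron-team@example.org</email>
  </maintainer>
</pkgmetadata>"""

print(has_human_maintainer(PROJECT_ONLY))
print(has_human_maintainer(WITH_PERSON))
```

Under this rule, the first package would be treated as lacking a
maintainer even though a project is listed.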


>
> Secondly, they frequently lack proper structure and handling of leaving
> members.  Therefore, whenever a member maintaining a specific set of
> packages leaves, it is possible that the number of not-really-maintained
> packages increases.
>
> Thirdly, they tend to degenerate and become defunct (much more than
> projects that make sense).  Then, the number of not-really-maintained
> packages ends up being really high.
>
> My goal here is to make sure that we have clear and correct information
> about package maintainers.  Most notably, if a package has no active
> maintainer, we really need to have 'up for grabs' issued and package
> marked as maintainer-needed, rather than hidden behind some project
> whose members may not even be aware of the fact that they're its
> maintainers.
>

>
> What do you think?
>
>
[0] This is itself a question the project needs to decide for itself: does
every package need to be actively maintained? Some might answer no, and
maybe running for months or years without a maintainer is OK for Gentoo.
It's not an opinion I personally hold, but I suspect some community members
do hold it. Herds / projects help Gentoo scale and enable 160 humans to
maintain 19,600 packages. Taking this away will likely reduce the number of
packages in the tree as maintainers scale down their stake in the tree.

-A


> --
> Best regards,
> Michał Górny
>
>
