Re: Community renewal and project obsolescence

2024-01-04 Thread Gunnar Wolf
Mo Zhou dijo [Thu, Dec 28, 2023 at 02:02:18PM -0500]:
> > Thanks for the code and the figure. Indeed, the trend is confirmed by
> > fitting a linear model count ~ year to the new members list. The
> > coefficient is -1.39 member/year, which is significantly different from
> > zero (F[1,22] = 11.8, p < 0.01). Even when we take out the data from
> > year 2001, which could be interpreted as an outlier, the trend is still
> > significant, with a drop of 0.98 member/year (F[1,21] = 8.48, p <
> > 0.01).
> 
> I thought about using some population-statistics models, so that we can get
> data on the DD birth rate and the DD retire/leave rate, as well as a
> prediction. But since the descendants of DDs are not naturally new DDs, the
> typical population models are not likely to work well. The birth of a DD
> is more like a mutation, sort of.
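
For anyone who wants to redo the linear fit quoted above (count ~ year) on
the real new-members data, here is a minimal sketch in plain Python. The
function name and the numbers in the example are made up for illustration
and are not the real counts; it computes the slope and the F statistic with
(1, n - 2) degrees of freedom, but not the p-value, which needs an F
distribution table or a statistics library.

```python
def fit_linear_trend(years, counts):
    """Ordinary least squares fit of counts ~ years.

    Returns (slope, intercept, f_stat), where f_stat is the F statistic
    with (1, n - 2) degrees of freedom for the null hypothesis slope == 0.
    """
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, counts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual (unexplained) and regression (explained) sums of squares.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(years, counts))
    ssr = sum((intercept + slope * x - mean_y) ** 2 for x in years)
    f_stat = ssr / (sse / (n - 2))
    return slope, intercept, f_stat


# Made-up numbers for illustration only, NOT the real new-member counts.
years = [2000, 2001, 2002, 2003, 2004]
counts = [10, 9, 6, 5, 2]
slope, intercept, f_stat = fit_linear_trend(years, counts)
print(f"slope = {slope:.2f} members/year, F[1,{len(years) - 2}] = {f_stat:.1f}")
```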

Five years ago, I got a paper published where we analyzed and made
some forecasts on the curated Web-of-Trust keyrings in Debian:

https://jisajournal.springeropen.com/articles/10.1186/s13174-018-0082-7

I did the first part of the article, but the part that better fits
what you are describing was done by my coauthor, Víctor González (who
understands statistics far better than I do).

Anyway, it does not answer the exact question you are posing either
--- there we studied the lifetime of keys, and left for later analysis
a way to link said keys to people, in order to map the life trajectory
of an individual in the project. But it might still be interesting or
useful for your analysis.

> Anyway, we do not need sophisticated math models to draw the conclusion that
> Debian is an aging community. And yet, we don't seem to have a good way to
> reshape the curve using Debian's funds. -- this is one of the key problems
> behind the data.

And I think this is hardly an unexpected outcome. There are many
social and technological patterns that define us as a 1990s project
that continues to live and thrive, but not necessarily with the best /
most up-to-date tooling.



Re: Lack of replies

2024-01-04 Thread Colin Watson
On Thu, Jan 04, 2024 at 04:52:58PM +0100, Daniel Gröber wrote:
> That's certainly not something I'd advocate for. I want us to minimize the
> PITA for the technically literate without sacrificing general usability.

To be honest, I think that address rewriting might actually be part of
improving usability of the BTS - but it would have to be done carefully
and with an eye to various other features that have evolved as
workarounds for the current situation.  It may be that some other
changes need to happen first.

The original design had a collection of addresses for each bug.  Messages
to any of them are filed in the system itself, and messages to some of
them are also sent on to others.  As seen on https://www.debian.org/Bugs/Developer:

 * n...@bugs.debian.org — such messages are also sent to the package
   maintainer and forwarded to debian-bugs-dist, but not to the
   submitter;
 * nnn-submit...@bugs.debian.org — these are also sent to the submitter
   and forwarded to debian-bugs-dist, but not to the package maintainer;
 * nnn-mainto...@bugs.debian.org — these are only sent to the package
   maintainer, not to the submitter or debian-bugs-dist;
 * nnn-qu...@bugs.debian.org — these are only filed in the bug tracking
   system (as are all the above), not sent to anyone else.
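
The routing rules in the list above can be written down as a small lookup
table.  This is only an illustrative sketch, not how debbugs implements it:
the enum names and the "bts-archive" label are made up here, and the sets
simply restate the four bullet points.

```python
from enum import Enum


class AddressKind(Enum):
    """Hypothetical labels for the four per-bug address variants."""
    PLAIN = "plain"            # the bare bug address
    SUBMITTER = "submitter"    # variant that also reaches the submitter
    MAINT_ONLY = "maint_only"  # variant that reaches only the maintainer
    QUIET = "quiet"            # variant that is filed and nothing else


# Who gets a copy, besides the bug log itself, for each kind of address.
FORWARDED_TO = {
    AddressKind.PLAIN: {"maintainer", "debian-bugs-dist"},
    AddressKind.SUBMITTER: {"submitter", "debian-bugs-dist"},
    AddressKind.MAINT_ONLY: {"maintainer"},
    AddressKind.QUIET: set(),
}


def recipients(kind):
    """Return every destination for a message sent to the given address kind.

    Every message is filed in the bug log ("bts-archive" is a made-up label
    for that), whatever else happens to it.
    """
    return {"bts-archive"} | FORWARDED_TO[kind]


for kind in AddressKind:
    print(kind.name, sorted(recipients(kind)))
```

Note that no variant reaches both the submitter and the maintainer, which is
the gap the rest of this message is about.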

This has been a mostly functional but slightly problematic design for as
long as I've been involved with Debian.  The classic problem is that
people email a followup to a bug to nnn@bugs and (without thinking much
about it) expect it to be seen by everyone who's interested in that bug.
But in fact this doesn't automatically go to the submitter, nor
necessarily to anyone else who's been involved in the discussion on the
bug.

In practice, if you're receiving the bug discussion by email, then you
can reply-all and it'll usually be more or less OK - but if the email
thread is at all non-linear then people may well end up being left out
by accident, people can easily forget to reply-all rather than just
reply, and it's not exactly obvious how to participate in this way if
you weren't already CCed.  ("bts --mbox show nnn" exists and is OK for
experts, but it's not exactly obvious and probably only works with some
MUAs.)

At some point debbugs gained a "subscribe" feature: you can email
nnn-subscribe@bugs to subscribe to a bug, or nnn-unsubscribe@bugs to
unsubscribe, with a confirmation message in each case.  So far so good,
though clunky.  But submitters aren't auto-subscribed, and you can't
really tell who's subscribed so you have no way to know in advance if
your message is going to reach the right people.  (The implementation is
also kind of weird: IIRC it's done via lists.debian.org, so even the BTS
itself doesn't really know who's going to get the message.)  As a
result, while this helped with certain use cases, it didn't really solve
the problem above.

What I always thought would be a better model would be for each bug to
be a "nosy list" (the term comes from roundup, I think).  That is, bugs
would have a list of addresses notified of changes, by default filing a
bug or sending a message to it would cause you to be added to the list,
and subscribing to or unsubscribing from any given bug would be easy.
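
A rough sketch of what that model could look like, purely as an
illustration: the class and method names are made up, nothing here reflects
actual debbugs code, and leaving the sender out of their own notification
is an extra assumption of this sketch.

```python
class Bug:
    """Toy model of a bug whose followers form a "nosy list"."""

    def __init__(self, number, submitter):
        self.number = number
        self.messages = []
        # Filing a bug puts the submitter on the nosy list by default.
        self.nosy = {submitter}

    def add_message(self, sender, body):
        """File a message and return the addresses that should be notified."""
        self.messages.append((sender, body))
        # Sending a message to the bug also adds the sender to the list.
        self.nosy.add(sender)
        # Assumption: senders do not need a copy of their own message.
        return sorted(self.nosy - {sender})

    def subscribe(self, address):
        self.nosy.add(address)

    def unsubscribe(self, address):
        self.nosy.discard(address)


bug = Bug(12345, "submitter@example.org")
bug.subscribe("watcher@example.org")
notified = bug.add_message("maintainer@example.org", "Can you retest?")
print(notified)
```

With this shape the set of recipients is known to the tracker itself, so a
sender can be told in advance who will see a followup.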

Now, such a change would certainly require some people to adjust a bit,
and it would be easier if the BTS had an optional authenticated web
interface to allow you to (un)subscribe more easily.  I'm not saying it
would be straightforward.  But if things worked this way, then I think
rewriting addresses to nnn@bugs would ultimately be less controversial -
it would be the most convenient default, as the address that's most
likely to reach everyone you probably want it to reach.

-- 
Colin Watson (he/him)  [cjwat...@debian.org]



Re: Lack of replies

2024-01-04 Thread Jeremy Stanley
On 2024-01-04 16:52:58 +0100 (+0100), Daniel Gröber wrote:
[...]
> Any good reason we cannot look at the MX domain (or in the worst case) ASN
> associated with mailserver IP to special case particularly offensive
> implementations such as this if looking at the DMARC policy works in the
> average case?
[...]

Unfortunately not. An example I know of is Red Hat's corporate
E-mail, they use a third-party filter (Mimecast) as their MX which
then forwards to their custom domain on Gmail.

As a mailing list admin in some popular open source communities,
I've become all too familiar with these challenges. It doesn't help
that the situation changes month-to-month, so what you decide today
may suddenly stop working any moment and you're back to square one.

The cynic in me says that the mass freemail providers slipped these
policy frameworks in as a sort of Trojan Horse, promising to fix
"the spam problem" when their real goal was poisoning the well we're
all drinking from and then setting themselves up as the only safe
supply of drinking water. They're all too happy to propose standards
when they want other operators to do something, but then willfully
ignore those same standards whenever it suits their purposes.
-- 
Jeremy Stanley



Re: Lack of replies

2024-01-04 Thread Daniel Gröber
Hi Jeremy,

On Thu, Jan 04, 2024 at 03:11:59PM +, Jeremy Stanley wrote:
> On 2024-01-04 15:54:28 +0100 (+0100), Daniel Gröber wrote:
> > could this rewrite scheme be applied only for recipients where it's
> > absolutely necessary?
> 
> Unfortunately no. It *used* to be a popular assumption that you could
> look at the published DKIM/DMARC policies [...] but [...] Gmail decided
> to treat messages from its own users more strictly than the policy it
> publishes for them in DNS. And since Gmail does custom domain hosting
> too, you can't simply limit the workaround to treating their well-known
> domain specially. Given its popularity (near ubiquity) as a freemail
> provider these days, 

Any good reason we cannot look at the MX domain (or in the worst case) ASN
associated with mailserver IP to special case particularly offensive
implementations such as this if looking at the DMARC policy works in the
average case?

> telling users they'll have to get an address somewhere else to interact
> with the BTS is unlikely to end well either.

That's certainly not something I'd advocate for. I want us to minimize the
PITA for the technically literate without sacrificing general usability.

--Daniel



Re: Lack of replies

2024-01-04 Thread Scott Kitterman



On January 4, 2024 3:15:29 PM UTC, Colin Watson wrote:
>On Thu, Jan 04, 2024 at 03:54:28PM +0100, Daniel Gröber wrote:
>> On Wed, Jan 03, 2024 at 05:10:43PM +, Scott Kitterman wrote:
>> > >At least people could be warned that because of the domain they send
>> > >from their mail might not get through.
>> > >
>> > My guess is that such a warning email (which is the only way we'd have to
>> > do it) would also cause a lot of complaints.  
>> 
>> > I think we [...] will need to have the BTS send all emails from
>> > bugs.debian.org role addresses and not use the sender's email in From
>> > anymore.
>> 
>> Just to make sure I understand the constraints: we can determine at sending
>> time whether a particular domain is going to cause trouble or not, right?
>> If so could this rewrite scheme be applied only for recipients where it's
>> absolutely necessary?
>
>I haven't looked into this in a lot of detail, but my concern would be
>that that would end up being flaky and confusing in practice.  I think
>people want the behaviour of the BTS to be easily predictable without
>having to get an advanced degree in MTA debugging first.
>
I agree that inconsistent behavior is concerning.  IETF mailing lists 
selectively rewrite From based on domain DMARC and it seems to be mostly okay. 
It may be that the typical IETF participant knows more about email than the 
typical BTS user.
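
To make "selectively rewrite From based on domain DMARC" concrete, here is
a minimal sketch of the decision step only.  It is not what the IETF lists
actually run; the function names are made up, and it works on a TXT record
string that something else has already fetched from _dmarc.<domain>.  As
noted elsewhere in this thread, the published policy is not always what the
receiving side enforces, so this check alone may not be sufficient.

```python
def parse_dmarc_policy(txt_record):
    """Return the p= value of a DMARC TXT record, or None if there is none."""
    tags = {}
    for part in txt_record.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip().lower()] = value.strip()
    # Anything that does not declare itself as DMARC is ignored.
    if tags.get("v") != "DMARC1":
        return None
    policy = tags.get("p")
    return policy.lower() if policy else None


def needs_from_rewrite(policy):
    """Rewrite From only when the sender's domain asks receivers to act on
    DMARC failures (quarantine or reject)."""
    return policy in ("quarantine", "reject")


# Example records; the domains and report addresses are placeholders.
for record in (
    "v=DMARC1; p=reject; rua=mailto:reports@example.org",
    "v=DMARC1; p=none",
    "v=spf1 -all",
):
    print(record, "->", needs_from_rewrite(parse_dmarc_policy(record)))
```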

Personally, I worry more about the added complexity and maintainability.  It's 
timely that this quote from Niklaus Wirth's 2008 paper, "A Brief History of 
Software Engineering", showed up in my Mastodon feed yesterday:

“A primary effort must be education toward a sense of quality. Programmers must 
become engaged crusaders against home-made complexity. The cancerous growth of 
complexity is not a thing to be admired; it must be fought wherever possible.”

I think it's definitely possible to avoid this complexity, so we probably 
should.

Scott K



Re: Lack of replies

2024-01-04 Thread Colin Watson
On Thu, Jan 04, 2024 at 03:54:28PM +0100, Daniel Gröber wrote:
> On Wed, Jan 03, 2024 at 05:10:43PM +, Scott Kitterman wrote:
> > >At least people could be warned that because of the domain they send
> > >from their mail might not get through.
> > >
> > My guess is that such a warning email (which is the only way we'd have to
> > do it) would also cause a lot of complaints.  
> 
> > I think we [...] will need to have the BTS send all emails from
> > bugs.debian.org role addresses and not use the sender's email in From
> > anymore.
> 
> Just to make sure I understand the constraints: we can determine at sending
> time whether a particular domain is going to cause trouble or not, right?
> If so could this rewrite scheme be applied only for recipients where it's
> absolutely necessary?

I haven't looked into this in a lot of detail, but my concern would be
that that would end up being flaky and confusing in practice.  I think
people want the behaviour of the BTS to be easily predictable without
having to get an advanced degree in MTA debugging first.

-- 
Colin Watson (he/him)  [cjwat...@debian.org]



Re: Lack of replies

2024-01-04 Thread Jeremy Stanley
On 2024-01-04 15:54:28 +0100 (+0100), Daniel Gröber wrote:
[...]
> Just to make sure I understand the constraints: we can determine
> at sending time whether a particular domain is going to cause
> trouble or not, right? If so could this rewrite scheme be applied
> only for recipients where it's absolutely necessary?
[...]

Unfortunately no. It *used* to be a popular assumption that you
could look at the published DKIM/DMARC policies for a given address
and determine whether you can safely put them in the From header,
but that changed recently when Gmail decided to treat messages from
its own users more strictly than the policy it publishes for them in
DNS. And since Gmail does custom domain hosting too, you can't
simply limit the workaround to treating their well-known domain
specially. Given its popularity (near ubiquity) as a freemail
provider these days, telling users they'll have to get an address
somewhere else to interact with the BTS is unlikely to end well
either.
-- 
Jeremy Stanley



Re: Lack of replies

2024-01-04 Thread Scott Kitterman



On January 4, 2024 2:54:28 PM UTC, "Daniel Gröber" wrote:
>Hi Scott,
>
>On Wed, Jan 03, 2024 at 05:10:43PM +, Scott Kitterman wrote:
>> >At least people could be warned that because of the domain they send
>> >from their mail might not get through.
>> >
>> My guess is that such a warning email (which is the only way we'd have to
>> do it) would also cause a lot of complaints.  
>
>> I think we [...] will need to have the BTS send all emails from
>> bugs.debian.org role addresses and not use the sender's email in From
>> anymore.
>
>Just to make sure I understand the constraints: we can determine at sending
>time whether a particular domain is going to cause trouble or not, right?
>If so could this rewrite scheme be applied only for recipients where it's
>absolutely necessary?
>
>That way DDs, who are likely to care more about their BTS email workflow
>than the average user, don't have to deal with the negative consequences of
>the address rewriting if they're already behind a polite mailserver.
>
>Further, if this discrimination is possible I wonder if it might not also be
>possible to accommodate the subset of BTS users who are behind broken mail
>providers but use sensible mail clients (mutt and such).
>
>Specifically I think when you embed a message/rfc822 part mutt allows me
>to autoview the message inline, see the (pretty) set of headers, and reply
>to this message instead of the "envelope".
>
>So when BTS sees a broken domain it could generate the usual message with
>address rewriting applied, but also attach in a multipart/alternative the
>untouched version for this set of users to use.
>
>Not sure that all works out, just a crazy idea,
>--Daniel

That's consistent with things I've seen proposed and implemented for mitigation 
of DMARC issues with mailing lists.  I don't know if the multipart solution has 
ever been implemented.  From/Mail From definitely have been.

If I were designing it, I would lean heavily on Colin Watson's experience with 
DMARC mitigation in Launchpad.  I've done some consulting work in this area, 
but I have never implemented it.

Scott K



Re: Lack of replies

2024-01-04 Thread Daniel Gröber
Hi Scott,

On Wed, Jan 03, 2024 at 05:10:43PM +, Scott Kitterman wrote:
> >At least people could be warned that because of the domain they send
> >from their mail might not get through.
> >
> My guess is that such a warning email (which is the only way we'd have to
> do it) would also cause a lot of complaints.  

> I think we [...] will need to have the BTS send all emails from
> bugs.debian.org role addresses and not use the sender's email in From
> anymore.

Just to make sure I understand the constraints: we can determine at sending
time whether a particular domain is going to cause trouble or not, right?
If so could this rewrite scheme be applied only for recipients where it's
absolutely necessary?

That way DDs, who are likely to care more about their BTS email workflow
than the average user, don't have to deal with the negative consequences of
the address rewriting if they're already behind a polite mailserver.

Further, if this discrimination is possible I wonder if it might not also be
possible to accommodate the subset of BTS users who are behind broken mail
providers but use sensible mail clients (mutt and such).

Specifically I think when you embed a message/rfc822 part mutt allows me
to autoview the message inline, see the (pretty) set of headers, and reply
to this message instead of the "envelope".

So when BTS sees a broken domain it could generate the usual message with
address rewriting applied, but also attach in a multipart/alternative the
untouched version for this set of users to use.
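
To show roughly what I mean, here is a sketch using Python's standard email
package.  It is not BTS code; the addresses, the display name and the
function name are placeholders, and for simplicity it attaches the
untouched message as a message/rfc822 part in a multipart/mixed container
rather than the multipart/alternative I mention above.

```python
from email.message import EmailMessage
from email.utils import formataddr


def wrap_with_original(original, role_address):
    """Return a From-rewritten copy of `original` that also carries the
    untouched message as a message/rfc822 attachment."""
    outer = EmailMessage()
    # Rewritten From: a role address instead of the real sender's address.
    outer["From"] = formataddr(("Submitter via BTS", role_address))
    outer["To"] = str(original["To"])
    outer["Subject"] = str(original["Subject"])
    # Assumes the original is a simple text/plain message.
    outer.set_content(original.get_content())
    # Attaching a message object yields a message/rfc822 part and turns
    # the outer message into multipart/mixed.
    outer.add_attachment(original)
    return outer


original = EmailMessage()
original["From"] = "Alice Example <alice@example.org>"
original["To"] = "12345@bugs.example.org"
original["Subject"] = "Re: hypothetical bug"
original.set_content("I can reproduce this.\n")

wrapped = wrap_with_original(original, "12345@bugs.example.org")
print(wrapped.get_content_type())
for part in wrapped.iter_attachments():
    print(part.get_content_type(), "from", part.get_content()["From"])
```

Whether a given client then lets you reply to the inner message rather than
the outer one is exactly the part I am unsure about.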

Not sure that all works out, just a crazy idea,
--Daniel

