On Fri, Jan 7, 2022 at 1:40 AM Alessandro Vesely <ves...@tana.it> wrote:

> On Thu 06/Jan/2022 20:02:48 +0100 Brandon Long wrote:
> > On Thu, Jan 6, 2022 at 5:55 AM Alessandro Vesely <ves...@tana.it> wrote:
> >> On Wed 05/Jan/2022 21:25:35 +0100 Brandon Long wrote:
> >>> On Wed, Jan 5, 2022 at 10:49 AM Alessandro Vesely <ves...@tana.it> wrote:
> >>>> On Wed 05/Jan/2022 00:32:57 +0100 Brandon Long wrote:
> >>>>
> >>>>> There is the dmarc address that Google itself uses,
> >>>>> dmarc-nore...@google.com, it
> >>>>> sometimes has the same rejections.  I haven't checked recently, but
> >>>>> wouldn't surprise me.  Indeed, the problem occurs at midnight in the
> >>>>> most popular time zones, more in the various US ones than in Europe.
> >>>>> Jittering your sends to not be on the hour is a good idea for everyone
> >>>>> sending dmarc reports.
> >>>>
> >>>> Actually there is a random sleep at the beginning.  The short delay
> >>>> it adds should be enough to avoid hardware problems, especially at
> >>>> major computer centers.
> >>>
> >>> I don't know the current state, but 2.5x the average load of the Gmail
> >>> receive pipeline on the hour, and 1.5x on the half hour, was not
> >>> uncommon, sustained over 5-10 minutes.
> >>> Peaks for individual accounts for individual things are much
> >>> higher, and backoffs take their time to work through as well.
> >>
> >> That's one point I'd be curious to understand.  If a server is on the
> >> ropes, not accepting connections or replying 421 to EHLO would be
> >> fair.  Delaying the whole incoming queue from that server would be an
> >> advantage, in that situation.  Checking the rate of a given recipient
> >> before delaying or rejecting must have a different reason; it's not
> >> because the hardware doesn't support higher volumes...
> >
> > There's a reverse proxy in front of our smtp servers which does
> > load balancing, and the connection itself can be retried on other
> > servers... but yes, sometimes the individual server rejects a
> > connection early for capacity issues, or at data content if the
> > message is large enough to put the server at memory risk
> > (obviously we try to reject earlier if they send a SIZE= or use
> > BDAT).  The smtp servers are also recipient agnostic and stateless,
> > so any server will do.
> > This isn't about the individual smtp server, it's about the
> > individual backend account and the shared resources that serve that
> > particular account.  Fair sharing is always complicated.
>
>
> Thanks for sharing, although that info doesn't clarify what principle
> underlies the tactic of delaying messages so that they don't
> accumulate in a short interval of time, even when there are no
> capacity issues.
>

In a multi-hop system, you need flow control in order to push back on new
requests... but adding a queue means that pushback-based flow control
would need to be based on the queue size, and with a world-wide
distributed queue, keeping track of the queue for each individual user
gets very expensive.  Rate-based flow control is an alternative; it also
requires distributed global state, but that system is already needed for
a variety of other use cases.
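
Conceptually, rate-based flow control here is just a token bucket keyed by
the account.  A minimal sketch in Python (purely illustrative -- the class,
names and numbers are made up, not our actual code, and in practice the
bucket state has to live in a distributed store rather than a local dict):

  import time

  class TokenBucket:
      """Illustrative per-account rate limiter (not actual Gmail code)."""

      def __init__(self, rate, burst):
          self.rate = rate            # tokens refilled per second
          self.burst = burst          # maximum bucket size
          self.tokens = burst
          self.updated = time.monotonic()

      def allow(self):
          now = time.monotonic()
          # Refill for the elapsed time, capped at the burst size.
          self.tokens = min(self.burst,
                            self.tokens + (now - self.updated) * self.rate)
          self.updated = now
          if self.tokens >= 1:
              self.tokens -= 1
              return True             # accept the message now
          return False                # defer it, e.g. with a 4xx

  # One bucket per recipient account: in-memory here, distributed in reality.
  buckets = {}

  def should_accept(account):
      bucket = buckets.setdefault(account, TokenBucket(rate=1.0, burst=20))
      return bucket.allow()

The hard part isn't the bucket itself, it's keeping that per-account state
cheap and consistent across a world-wide fleet.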

The downstream shared resources are also complicated and may change over
time (consider primary/secondary fail-over or cell failures), and a user
may be forwarding their mail to yet another account.

It's also much easier for customers to reason about an individual account
limit than about shared limits... and also easier to know whether the
system is failing to handle the limits.

Having circuit breakers is also a good practice.
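
By circuit breaker I mean roughly the pattern below (a toy Python version,
again illustrative and not our code): once a downstream dependency keeps
failing, stop sending it traffic and fail fast for a while so it has room
to recover.

  import time

  class CircuitBreaker:
      """Toy circuit breaker, illustrative only."""

      def __init__(self, max_failures=5, reset_after=30.0):
          self.max_failures = max_failures  # consecutive failures before opening
          self.reset_after = reset_after    # seconds to stay open
          self.failures = 0
          self.opened_at = None

      def call(self, fn, *args, **kwargs):
          if self.opened_at is not None:
              if time.monotonic() - self.opened_at < self.reset_after:
                  raise RuntimeError("circuit open; failing fast")
              self.opened_at = None         # half-open: let one probe through
          try:
              result = fn(*args, **kwargs)
          except Exception:
              self.failures += 1
              if self.failures >= self.max_failures:
                  self.opened_at = time.monotonic()
              raise
          self.failures = 0
          return result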

Brandon
_______________________________________________
mailop mailing list
mailop@mailop.org
https://list.mailop.org/listinfo/mailop
