Re: destination_rate_delay and connection_reuse_time_limit

2013-01-09 Thread Rafael Azevedo - IAGENTE
On 08/01/2013, at 17:16, Wietse Venema wie...@porcupine.org wrote:

 Reindl Harald:
 Big deal. Now I can block all mail for gmail.com by getting 100
 email messages into your queue
 
 how comes?
 how do you get gmail.com answer to any delivery from you with 4xx?
 
 He wants to temporarily suspend delivery when a site has 5 consecutive
 delivery errors, without distinguishing between SMTP protocol stages
 (such an "ignore protocol stage" switch could be added to Postfix).
 
 To implement a trivial DOS, I need 5 consecutive messages in his
 mail queue, plus a handful of accounts that don't accept mail.
 
 I have no idea where he got the 100 from - that number was not part
 of his original problem description.

Wietse, I never said anything about 5 errors; you suggested that for the 
cohort parameter.

When delivery starts failing because of an active block, it's impossible to 
deliver any email after that. So we might end up with something like 10k emails 
queued for the same destination (domain).

As I said, if this were parametrized, we could set 100, 500, or 1000 
errors, whatever fits the need.

 
   Wietse

Rafael

Re: destination_rate_delay and connection_reuse_time_limit

2013-01-09 Thread Rafael Azevedo - IAGENTE
 
 Barring a clean slow down signal, and a stable feedback mechanism,
 the only strategy is manually tuned rate delays, and spreading the
 load over multiple sending IPs (Postfix instances don't help if
 they share a single IP).

I have multiple instances of Postfix running on multiple IPs. The problem (not 
quite sure if it is a problem) is that we don't have a shared queue, so each 
Postfix instance (IP) has its own queue. Is there any way to share the queue 
among multiple Postfix (IP) instances? Does that make sense?

Rafael

Re: destination_rate_delay and connection_reuse_time_limit

2013-01-09 Thread Rafael Azevedo - IAGENTE
I agree with Reindl; I guess Witsie now has a better understanding of the 
problem here.

I'd see this as an additional feature, not a default configuration.

It would be even better if this could be parametrized on a per-transport basis.
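
Per-transport tuning of these knobs already works through Postfix's transport-prefix convention; a sketch, assuming a clone of the smtp transport named `slow` in master.cf, with `example.com` standing in for the difficult destination:

```
# master.cf: a second delivery agent, identical to smtp
slow      unix  -       -       n       -       -       smtp

# main.cf: route the difficult domain through it and tune it separately
transport_maps = hash:/etc/postfix/transport
slow_destination_concurrency_limit = 1
slow_destination_concurrency_failed_cohort_limit = 5
slow_destination_rate_delay = 1s

# /etc/postfix/transport:
#   example.com    slow:
```

After editing, run `postmap /etc/postfix/transport` and `postfix reload`.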

- Rafael

On 08/01/2013, at 19:02, Reindl Harald h.rei...@thelounge.net wrote:

 
 
 On 08.01.2013 21:40, Wietse Venema wrote:
 My conclusion is that Postfix can continue to provide basic policies
 that avoid worst-case failure modes, but the choice of the settings
 that control those policies is better left to the operator. If the
 receiver slams on the brakes, then Postfix can suspend deliveries,
 but the sender operator will have to adjust the sending rate.
 
 exactly this is the point
 
 thank you for your understanding and thoughts!
 



Re: destination_rate_delay and connection_reuse_time_limit

2013-01-09 Thread Rafael Azevedo - IAGENTE

 When faced with a destination that imposes tight rate limits you
 must pre-configure your MTA to always stay under the limits. Nothing
 good happens when the Postfix output rate under load exceeds the
 remote limit whether you throttle the queue repeatedly or not.

But many times we just don't know the other side's limits, and watching logs 
every day searching for delivery failures is, with all due respect, very 
painful.

 
 The best that one can hope for is for Postfix to dynamically apply
 a rate delay that is guaranteed to be slow enough to get under the
 limit, and then gradually reduce it.

That would be very nice.

- Rafael

Re: destination_rate_delay and connection_reuse_time_limit

2013-01-09 Thread Rafael Azevedo - IAGENTE

 That's not what happens when a destination is throttled, all mail
 there is deferred, and is retried some indefinite time later that
 is at least 5 minutes but perhaps a lot longer, and at great I/O
 cost, with exponential backoff for each message based on time in the
 queue, …

I totally disagree with you. There would be more I/O with Postfix trying to 
deliver when there's an active block. Messages are kept on disk/in memory for 
much longer, and Postfix keeps moving them from deferred to active and back to 
deferred again. Plus, don't forget that more messages keep arriving while the 
block is still active, leading Postfix into an **infinite** loop (until 
messages reach $maximal_queue_lifetime)

 
 To understand what one is asking for, one needs to understand the
 scheduler (qmgr) architecture. Otherwise, one is just babbling
 nonsense (no offense intended).

Where can I read more about this?

 
 I would posit that neither Reindl nor the OP, or that many others
 really understand what they are asking for. If they understood,
 they would stop asking for it.
 
 i would posit you do not understand the usecase
 
 How likely do you think that is? Of course I understand the use
 case, in fact better than the users who are asking for it.

Sorry Viktor, but I'm not sure about that. You keep suggesting getting 
whitelisted as if it were very easy. Believe me, there are a lot of companies 
that don't offer any support for that.

 
 and yes we do not care if a newsletter has reached every RCPT
 two hours later but we do care for reputation and not exceed
 rate limits of large ISP's
 
 Throttling the destination (which means moving all pending messages
 for the destination to deferred, where they age exponentially, while
 more mail builds up...) is not the answer to your problem.

But why move the whole active queue to deferred? Wouldn't it be better to just 
move it to the hold queue?

 
 1. Get whitelisted without limits, send at the arrival rate.
 2. Get whitelisted at above the arrival rate, set rate delay to
   avoid exceeding the rate.
 3. Don't waste time with unresponsive mailbox providers, tell their
   customers their mailbox provider is not supported.
 4. Snowshoe.

What's the meaning of Snowshoe?

- Rafael



Re: destination_rate_delay and connection_reuse_time_limit

2013-01-09 Thread Michael P. Demelbauer
On Wed, Jan 09, 2013 at 10:02:02AM -0200, Rafael Azevedo - IAGENTE wrote:
[ ... ]
  To understand what one is asking for, one needs to understand the
  scheduler (qmgr) architecture. Otherwise, one is just babbling
  nonsense (no offense intended).
 
 Where can I read more about this?

I think

http://www.postfix.org/SCHEDULER_README.html

was already mentioned in that thread?!

Beyond that, I don't know either.
-- 
Michael P. Demelbauer
Systemadministration
WSR
Arsenal, Objekt 20
1030 Wien
---
In Germany the spectator enjoys high-class leisure entertainment; here
  we have frozen fingers and sit with a cold backside on the steel stands.
   -- Helmut Kraft, when asked what would have to be done to make
   Austria's football attractive again.


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-09 Thread Viktor Dukhovni
On Wed, Jan 09, 2013 at 10:02:02AM -0200, Rafael Azevedo - IAGENTE wrote:

  That's not what happens when a destination is throttled, all mail
  there is deferred, and is retried some indefinite time later that
  is at least 5 minutes but perhaps a lot longer, and at great I/O
  cost, with expontial backoff for each message based on time in the
  queue, ?
 
 I totally disagree with you. It would have more I/O having postfix
 trying to deliver when there's an active block. Messages are kept
 in disk/memory for much longer time, and Postfix keeps putting it
 from deferred to active and then back to deferred again. Plus, you
 can't forget that there are more messages coming in the mean time
 while the block is still active, leading postfix to a **infinite**
 loop (until it reaches maximal_queue_lifetime)

The part you're missing is that when Postfix stops sending, the only
mechanism in place other than a rate delay is throttling the destination
queue, which moves every message (even those that have not been tried
yet) from active to deferred; the same happens to any message that
arrives while the queue is throttled.

The delivery rate to the destination will be a small number of
messages per $maximal_backoff_time; this is not terribly useful, with
the entire backlog shuffling between the active and deferred queues
without any useful work being done.

  To understand what one is asking for, one needs to understand the
  scheduler (qmgr) architecture. Otherwise, one is just babbling
  nonsense (no offense intended).
 
 Where can I read more about this?

SCHEDULER_README, the queue manager (qmgr) man page, and the source code.

  i would posit you do not understand the usecase
  
  How likely do you think that is? Of course I understand the use
  case, in fact better than the users who are asking for it.
 
 Sorry Viktor, but I'm not sure about that. You keep saying to
 get whitelist like if it would be very easy. Believe me, there are
 a lot of companies that don't have any support for that.

I listed all the options available to you, of which whitelisting
is always the best when possible. When that's not possible, you try
one of the others.

  Throttling the destination (which means moving all pending messages
  for the destinatin to deferred, where they age exponentially, while
  more mail builds up...) is not the answer to your problem.
 
 But why move all the active queue to deferred? Wouldn't it be
 better to just move it to hold queue?

This suspends delivery of a (multi-recipient) message for all
deferred destinations, not just the ones you want treated specially.

Messages on hold don't get retried without manual intervention, which
makes your delivery rate to the destination zero; not a good way
to get the mail out.

  1. Get whitelisted without limits, send at the arrival rate.
  2. Get whitelisted at above the arrival rate, set rate delay to
avoid exceeding the rate.
  3. Don't waste time with unresponsive mailbox providers, tell their
customers their mailbox provider is not supported.
  4. Snowshoe.
 
 What's the meaning of Snowshoe?

Spread the load over sufficiently many outbound systems that the
rate limits are not exceeded by any of them. A fifth option is
to outsource to others who've already done that.

-- 
Viktor.


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-09 Thread Rafael Azevedo - IAGENTE
I was watching my log files just now looking for deferred errors, and to my 
surprise, we got temporarily blocked by Yahoo on some SMTPs (IPs), as shown:

Jan  9 13:20:52 mxcluster yahoo/smtp[8593]: 6731A13A2D956: host 
mta5.am0.yahoodns.net[98.136.216.25] refused to talk to me: 421 4.7.0 [TS02] 
Messages from X.X.X.X temporarily deferred - 4.16.56.1; see 
http://postmaster.yahoo.com/errors/421-ts02.html

So guess what: I still have another 44k messages in the active queue (a lot of 
them probably to Yahoo), and Postfix is wasting its time and CPU trying to 
deliver to Yahoo while there's an active block.

Yahoo suggests retrying in a few hours, but we'll never get rid of the block 
if we keep trying while it is active.

This doesn't happen only to bulk senders. Many people use their hosting 
company to send a few hundred emails alongside many other users sending 
legitimate mail from their mail clients… Eventually, one user will compromise 
the whole infrastructure, and many people may have problems delivering their 
messages.

There's gotta be a solution for this.

- Rafael
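
Pending a built-in mechanism, the backlog can be parked manually: queue IDs for one destination can be extracted from `mailq` output and fed to `postsuper -h -` (hold), then released later with `postsuper -H`. A sketch; the `awk` parsing of mailq's paragraph-per-message format is a heuristic, not an official interface, and the sample text below stands in for real queue output:

```shell
# Each mailq record is a blank-line-separated paragraph whose first
# field is the queue ID (trailing '*' = active, '!' = already on hold).
sample='A1B2C3D4E5*     1234 Wed Jan  9 13:20:52  sender@example.com
                                         user@yahoo.com

F6G7H8I9J0      5678 Wed Jan  9 13:21:00  sender@example.com
                                         user@gmail.com'

# Select queue IDs whose recipients match the blocked domain,
# stripping the status flag characters.
printf '%s\n' "$sample" |
    awk 'BEGIN { RS = "" } /@yahoo\.com/ { print $1 }' |
    tr -d '*!'
# prints: A1B2C3D4E5

# On a live system (run as root), pipe the IDs straight into postsuper:
#   mailq | awk 'BEGIN{RS=""} /@yahoo\.com/ {print $1}' \
#         | tr -d '*!' | postsuper -h -
```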

Re: destination_rate_delay and connection_reuse_time_limit

2013-01-09 Thread John Peach
On Wed, 9 Jan 2013 13:29:06 -0200
Rafael Azevedo - IAGENTE raf...@iagente.com.br wrote:

 I was watching my log files now looking for deferred errors, and for
 my surprise, we got temporary blocked by Yahoo on some SMTPs (ips),
 as shown:
 
 Jan  9 13:20:52 mxcluster yahoo/smtp[8593]: 6731A13A2D956: host
 mta5.am0.yahoodns.net[98.136.216.25] refused to talk to me: 421 4.7.0
 [TS02] Messages from X.X.X.X temporarily deferred - 4.16.56.1; see
 http://postmaster.yahoo.com/errors/421-ts02.html
 
 So guess what, I still have another 44k messages on active queue (a
 lot of them are probably to yahoo) and postfix is wasting its time
 and cpu trying to deliver to Yahoo when there's an active block.
 
 Yahoo suggests to try delivering in few hours, but we'll never get
 rid from the block if we keep trying while the block is active.
 
 This doesn't happens only with bulk senders. Many people use their
 hosting company to send few hundreds emails together with many other
 users sending legitimate mails from their mail clients… Eventually,
 one user will compromise all infrastructure and many people may have
 problem delivering their messages.
 
 There's gotta be a solution for this.

There is - you need to register your mailserver(s) with yahoo.

 
 - Rafael


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-09 Thread Rafael Azevedo - IAGENTE

 There's gotta be a solution for this.
 
 There is - you need to register your mailserver(s) with yahoo

You mean Yahoo's Feedback Program (feedbackloop.yahoo.net) ?

- Rafael


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-09 Thread Viktor Dukhovni
On Wed, Jan 09, 2013 at 01:29:06PM -0200, Rafael Azevedo - IAGENTE wrote:

 I was watching my log files now looking for deferred errors, and
 for my surprise, we got temporary blocked by Yahoo on some SMTPs
 (ips), as shown:
 
 Jan  9 13:20:52 mxcluster yahoo/smtp[8593]: 6731A13A2D956: host 
 mta5.am0.yahoodns.net[98.136.216.25] refused to talk to me: 421 4.7.0 [TS02] 
 Messages from X.X.X.X temporarily deferred - 4.16.56.1; see 
 http://postmaster.yahoo.com/errors/421-ts02.html

Postfix already treats this as a "don't send" signal. Enough of these
back-to-back and transmission stops. This is a 421 during HELO,
not a 4XX during RCPT TO.

Yahoo's filters are NOT simple rate limits. They delay delivery when
their reputation system wants more time to assess the source. They
typically will permit delayed messages when they're retried, unless
of course they believe the source to be spamming, in which case they
may reject, or quarantine...

 So guess what, I still have another 44k messages on active queue
 (a lot of them are probably to yahoo) and postfix is wasting its
 time and cpu trying to deliver to Yahoo when there's an active
 block.

 Yahoo suggests to try delivering in few hours, but we'll never
 get rid from the block if we keep trying while the block is active.

This is false. Postfix does not keep trying under the above
conditions, and Yahoo does not rate-limit in the naive manner you
imagine.

 This doesn't happens only with bulk senders. Many people use
 their hosting company to send few hundreds emails together with
 many other users sending legitimate mails from their mail clients?
 Eventually, one user will compromise all infrastructure and many
 people may have problem delivering their messages.

This is rarely a problem, and when it is, any blocking is usually
transient, and one can request to be unblocked, at most providers. 

 There's gotta be a solution for this.

Yes, but not the one you're asking for. I think it is possible to
design and implement a useful dynamic rate-delay algorithm, but I am
not sure that spending the effort to optimize Postfix for unwhitelisted
bulk email is a good use of developer time.

-- 
Viktor.


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-09 Thread Wietse Venema
Wietse:
 My conclusion is that Postfix can continue to provide basic policies
 that avoid worst-case failure modes, but the choice of the settings
 that control those policies is better left to the operator. If the
 receiver slams on the brakes, then Postfix can suspend deliveries,
 but the sender operator will have to adjust the sending rate.

Rafael Azevedo - IAGENTE:
 I agree with Reindl, I guess Witsie is now better understanding
 the problem here.

Please take the effort to spell my name correctly.

When a site sends a small volume of mail, the existing Postfix
strategy is sufficient (skip a site after N connect/handshake errors,
don't treat a post-handshake error as a stay away signal).  The
email will eventually get through.

When a site sends a large volume of mail to a rate-limited destination,
we believe that a strategy based on a bursty send-suspend cycle
will perform worse than a strategy based on an uninterrupted flow.

Why does this difference matter?  Once the sending rate drops below the
rate at which mail enters the mail queue, all strategies become
equivalent to throwing away mail.

This is why bulk mailers should use a strategy based on an uninterrupted
flow, instead of relying on a bursty send-suspend cycle.

This is consistent with my conclusion cited above. The sole benefit
of adding the switch is that when it trips, the operator knows they
need a different sending strategy (reduce rates, snowshoe, whatever).

Wietse


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-09 Thread Rafael Azevedo - IAGENTE
John,

We've already done that.
We sign ALL messages with DKIM and are also subscribed to Yahoo's Feedback 
Loop Program.

Still, there are a few messages being blocked based on user complaints or 
unusual traffic from the IP xxx…

- Rafael

On 09/01/2013, at 13:45, John Peach post...@johnpeach.com wrote:

 On Wed, 9 Jan 2013 13:37:00 -0200
 Rafael Azevedo - IAGENTE raf...@iagente.com.br wrote:
 
 
 There's gotta be a solution for this.
 
 There is - you need to register your mailserver(s) with yahoo
 
 You mean Yahoo's Feedback Program (feedbackloop.yahoo.net) ?
 
 I forget exactly what needs doing, but you definitely need DKIM records
 and to register with their feedbackloop:
 
 http://help.yahoo.com/kb/index?page=content&y=PROD_MAIL_ML&locale=en_US&id=SLN3435&impressions=true
 
 - Rafael



Re: destination_rate_delay and connection_reuse_time_limit

2013-01-09 Thread Rafael Azevedo - IAGENTE

 I was watching my log files now looking for deferred errors, and
 for my surprise, we got temporary blocked by Yahoo on some SMTPs
 (ips), as shown:
 
 Jan  9 13:20:52 mxcluster yahoo/smtp[8593]: 6731A13A2D956: host 
 mta5.am0.yahoodns.net[98.136.216.25] refused to talk to me: 421 4.7.0 [TS02] 
 Messages from X.X.X.X temporarily deferred - 4.16.56.1; see 
 http://postmaster.yahoo.com/errors/421-ts02.html
 
 Postfix already treats this as a don't send signal. Enough of these
 back to back and transmission stops. This is a 421 during HELO,
 not a 4XX during RCPT TO.

So please, tell me what I am doing wrong, because my Postfix servers keep 
trying even after this failure. At this moment I have over 30k emails to Yahoo 
in the deferred queue with this same error.

 Yahoo's filters are NOT simple rate limits. They delay delivery when
 their reputation system wants more time to assess the source. They
 typically will permit delayed message when they're retried, unless
 of course they believe the source to be spamming, in which case they
 may reject, or quarantine…

I agree with that.

 So guess what, I still have another 44k messages on active queue
 (a lot of them are probably to yahoo) and postfix is wasting its
 time and cpu trying to deliver to Yahoo when there's an active
 block.
 
 Yahoo suggests to try delivering in few hours, but we'll never
 get rid from the block if we keep trying while the block is active.
 
 This is false. Postfix does not keep trying under the above
 conditions, and Yahoo does not rate-limit in the naive manner you
 imagine.

My Postfix does keep trying. Any idea why this is happening?

 
 This doesn't happens only with bulk senders. Many people use
 their hosting company to send few hundreds emails together with
 many other users sending legitimate mails from their mail clients?
 Eventually, one user will compromise all infrastructure and many
 people may have problem delivering their messages.
 
 This is rarely a problem, and when it is, any blocking is usually
 transient, and one can request to be unblocked, at most providers. 

"Most" in this case might not be enough.

 
 There's gotta be a solution for this.
 
 Yes, but not the one you're asking for. It is I think possible to
 design and implement a useful dynamic rate delay algorithm, I am
 not sure that spending the effort to optimize Postfix for unwhitelisted
 bulk email is a good use of developer effort.

I'm 100% sure that this doesn't happen only with bulk senders. Legitimate 
mail is also subject to being blocked because of bad emails.

Last week a customer's server got compromised: somebody uploaded a 
bulk-PHP-script that started sending thousands of emails in a very small time 
frame, blocking all legitimate email from that point on for a few hours.

- Rafael

Re: destination_rate_delay and connection_reuse_time_limit

2013-01-09 Thread Wietse Venema
Rafael Azevedo - IAGENTE:
 I was watching my log files now looking for deferred errors, and
 for my surprise, we got temporary blocked by Yahoo on some SMTPs
 (ips), as shown:

 Jan  9 13:20:52 mxcluster yahoo/smtp[8593]: 6731A13A2D956: host
 mta5.am0.yahoodns.net[98.136.216.25] refused to talk to me: 421
 4.7.0 [TS02] Messages from X.X.X.X temporarily deferred - 4.16.56.1;
 see http://postmaster.yahoo.com/errors/421-ts02.html

As required by RFC, the Postfix SMTP client will try another MX
host when it receives a 4xx greeting.

Postfix limits the number of MX hosts to try.

When all greetings fail with 4xx or whatever then Postfix will
suspend deliveries.

Therefore, you are talking out of your exhaust pipe when you
claim that Postfix keeps trying.

Wietse


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-09 Thread Rafael Azevedo - IAGENTE

 Rafael Azevedo - IAGENTE:
 I was watching my log files now looking for deferred errors, and
 for my surprise, we got temporary blocked by Yahoo on some SMTPs
 (ips), as shown:
 
 Jan  9 13:20:52 mxcluster yahoo/smtp[8593]: 6731A13A2D956: host
 mta5.am0.yahoodns.net[98.136.216.25] refused to talk to me: 421
 4.7.0 [TS02] Messages from X.X.X.X temporarily deferred - 4.16.56.1;
 see http://postmaster.yahoo.com/errors/421-ts02.html
 
 As required by RFC the Postfix SMTP client will try another MX
 host when it receives a 4xx greeting.
 
 Postfix limits the number of MX hosts to try.
 
 When all greetings fail with 4xx or whatever then Postfix will
 suspend deliveries.

I have no idea what I'm doing wrong; this really doesn't happen on my 
servers.

In my case, Postfix keeps trying to deliver even the messages in the active 
queue to the same destination, not to mention the deferred queue, which never 
stops sending after getting this error. I mean, Postfix picks the next message 
(regardless of destination) and tries again and again.

 
 Therefore, you are talking out of your exhaust pipe when you
 claim that Postfix keeps trying.

Sorry, my English is not that good; I didn't understand what you mean.

- Rafael

Re: destination_rate_delay and connection_reuse_time_limit

2013-01-09 Thread Rafael Azevedo - IAGENTE
Now Yahoo is giving another response:

said: 451 Message temporarily deferred - [160] (in reply to end of DATA command)

See, this is very hard to solve. I'm really trying to better understand the 
problem in order to find the best solution. I'd like to thank you in advance 
for the help; it's very much appreciated.

- Rafael



Re: destination_rate_delay and connection_reuse_time_limit

2013-01-09 Thread Wietse Venema
Rafael Azevedo - IAGENTE:
  Why does this difference matter?  Once the sending rate drops below the
  rate at which mail enters the mail queue, all strategies become
  equivalent to throwing away mail.
 
 I'm trying to understand what you said but it doesn't make any sense to me.

When you can send N messages per day, and your queue receives M > N
messages per day, then you will throw away M-N messages per day.

If you achieve the sending rate by driving at full speed into the
wall and waiting for 6 hours, then your N will be smaller, and
you will throw away more mail.
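
The arithmetic can be made concrete with made-up numbers (M and N below are illustrative, not from the thread): at one message per second an uninterrupted sender moves 86400 messages/day, while a sender that burns long suspend periods moves only a fraction of that, and everything above N is eventually thrown away.

```shell
M=100000                   # messages entering the queue per day (example)
N_steady=86400             # 1 msg/s, uninterrupted flow
N_bursty=$(( 86400 / 4 ))  # same pace, but sending only ~6 h of every 24
echo "steady flow discards $(( M - N_steady )) msgs/day"
echo "burst-suspend discards $(( M - N_bursty )) msgs/day"
# prints: steady flow discards 13600 msgs/day
#         burst-suspend discards 78400 msgs/day
```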

Wietse


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-09 Thread Wietse Venema
Rafael Azevedo - IAGENTE:
  When all greetings fail with 4xx or whatever then Postfix will
  suspend deliveries.
 
 I have no idea about what I'm doing wrong, this really doesn't
 happen in my servers.

No it doesn't. Postfix logs "delivery temporarily suspended" and
skips Yahoo until the dead-host timer expires.

Wietse


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-08 Thread Rafael Azevedo - IAGENTE
Hi Viktor,

I've added this into my main.cf:

slow_destination_concurrency_failed_cohort_limit = 5

But I noticed that even after a failure, postfix keeps trying to deliver to the 
destination.

Question: how can I stop Postfix from trying to deliver emails after a few 
failures? 

I mean, if it is trying to deliver to xyz.com and it fails 5 times, should 
postfix keep trying to deliver or is there any way that we can stop delivering 
for some time?

I thought this could be done using 
_destination_concurrency_failed_cohort_limit. Am I doing something wrong?

After these adjustments, I'm still having trouble delivering emails to this 
specific destination. 

This is the error I get:
said: 450 4.7.1 You've exceeded your sending limit to this domain. (in reply to 
end of DATA command))

I'm really trying to slow down the delivery speed in order to respect the 
destination's policies; I just can't figure out how to fix this issue. 
I've also sent more than 20 emails to the network's administrators, and they 
just won't answer. Reading around on the internet, I found that a lot of 
people have the same problem with this specific provider. 

We send about 50k emails/day to 20k domains hosted on this provider that are 
being blocked.

Any help would be very appreciated.

Thanks in advance.

Att.
--
Rafael Azevedo | IAGENTE
Fone: 51 3086.0262
MSN: raf...@hotmail.com
Visite: www.iagente.com.br

On 07/01/2013, at 15:57, Viktor Dukhovni postfix-us...@dukhovni.org wrote:

 slow_destination_concurrency_failed_cohort_limit



Re: destination_rate_delay and connection_reuse_time_limit

2013-01-08 Thread Wietse Venema
Rafael Azevedo - IAGENTE:
[ Charset ISO-8859-1 unsupported, converting... ]
 Hi Viktor,
 
 I've added this into my main.cf:
 
 slow_destination_concurrency_failed_cohort_limit = 5

This stops deliveries after 5 COHORT failures.

 I mean, if it is trying to deliver to xyz.com and it fails 5 times,

Yes, but you configured Postfix to stop after 5 COHORT failures.

For more information about COHORT failures and other parameters:

http://www.postfix.org/SCHEDULER_README.html

Wietse


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-08 Thread Wietse Venema
Wietse Venema:
 Rafael Azevedo - IAGENTE:
  I've added this into my main.cf:
  
  slow_destination_concurrency_failed_cohort_limit = 5
 
 This stops deliveries after 5 COHORT failures.
 
  I mean, if it is trying to deliver to xyz.com and it fails 5 times,
 
 Yes, but you configured Postfix to stop after 5 COHORT failures.
 
 For more information about COHORT failures and other parameters:
 
 http://www.postfix.org/SCHEDULER_README.html

In short, Postfix ONLY adjusts concurrency after connect/handshake
failure, NEVER EVER for a 4XX reply to RCPT TO.

Wietse


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-08 Thread Wietse Venema
Rafael Azevedo - IAGENTE:
 Hi Witsie,
 
 Is there anyway we can adjust Postfix to stop delivering after a
 4XX reply?

Postfix will stop delivering after TCP or SMTP handshake failure.
Postfix WILL NOT stop delivering due to 4xx reply AFTER the SMTP
protocol handshake.

Postfix is not a tool to work around receiver policy restrictions.
If you want to send more than a few email messages, then it is your
responsibility to make the necessary arrangements with receivers.

Over and out.

Wietse




Re: destination_rate_delay and connection_reuse_time_limit

2013-01-08 Thread Viktor Dukhovni
On Tue, Jan 08, 2013 at 10:47:08AM -0200, Rafael Azevedo - IAGENTE wrote:

 I've added this into my main.cf:
 
 slow_destination_concurrency_failed_cohort_limit = 5

This is fine; since you set the concurrency limit to 1, it is
intended to avoid shutting down deliveries after a single connection
failure. As Wietse points out, this does not stop deliveries when
individual recipients are rejected; that is not evidence of the
site being down.

 Question: how can I stop postfix from trying to deliver emails
 after few failures?

It is not possible to automatically throttle deliveries based on 4XX
replies to RCPT TO. This is not a useful signal that Postfix is
sending too fast, nor is there any good way to dynamically
determine the correct rate.

Sites that impose indiscriminate rate controls (assuming you're sending
legitimate email, not spam) are breaking the email infrastructure.
Sadly, the work-around is to snowshoe: deploy more servers to split the
load over a larger number of IP addresses.

 I mean, if it is trying to deliver to xyz.com and it fails 5
 times, should postfix keep trying to deliver or is there any way
 that we can stop delivering for some time?

Only if xyz.com is down, not if it is merely tempfailing RCPT TO.

 This is the error I get:
 said: 450 4.7.1 You've exceeded your sending limit to this domain.
 (in reply to end of DATA command))

Since presumably at this point your connection rate is not high
(connections are being re-used), it seems that they're imposing
a message rate cap as well as a connection rate cap.

Send them less email.

 I'm really trying to slow down the delivery speed in order to
 respect the destination's policies. I just can't figure out how to
 fix this issue.

 We send about 50k emails/day to 20k domains hosted on this provider
 that are being blocked.

The output rate cannot on average fall below the input rate. The
input rate is approximately 1/sec (there are 86400 seconds in a
day). Thus the slowest you can send is with a rate delay of 1s.
If that's not slow enough, you're out of luck, and have to buy
more servers (possibly deploying them on separate networks).

The suggestion to turn off rate delays was based on an assumption
that they told you to avoid connecting too often and wanted all
the mail over a single connection (you wanted connection re-use),
but connection re-use works best when there is no rate delay. 
A rate delay of 1s is still compatible with connection re-use,
and is the largest you can specify and still send more than
43.2k messages a day.
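
As a concrete starting point, the combination described above might look like this in main.cf (the values are illustrative, tuned for roughly one message per second with connection re-use; adjust to the destination's actual limits):

```
smtp_destination_concurrency_limit = 1
smtp_destination_rate_delay = 1s
# keep re-using open connections rather than reconnecting per message
smtp_connection_cache_on_demand = yes
smtp_connection_reuse_time_limit = 300s
```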

It may be simplest to outsource your traffic to an existing large
bulk email operator.

-- 
Viktor.


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-08 Thread Rafael Azevedo - IAGENTE
Thank you Witsie.

We have a huge mail volume; that's why I'm trying to figure out a better way 
to deal with it.

Many providers have their own restrictions. We work in compliance with most 
of them, but there are a few that just won't help at all, so it's easy to tell 
me to make the necessary arrangements when they don't even have a support or 
abuse department to get involved.

So since the problem is in my hands, I must find a way to deal with it. 
Trying to slow down the delivery speed is one way to get through.

I truly believe that Postfix is the best MTA ever, but you might agree with me 
that when the receiver starts blocking the sender, it's worthless to keep 
trying to deliver. 
The safest way is to stop delivering to those servers and try again later. 

I just can't believe that Postfix doesn't have a way to deal with this. It 
would make Postfix much more efficient in delivery terms.

Anyway, thanks for your time and all the help; it is very much appreciated.

Any help here would also be appreciated.

Thanks in advance.

Att.
--
Rafael Azevedo | IAGENTE
Fone: 51 3086.0262
MSN: raf...@hotmail.com
Visite: www.iagente.com.br

On 08/01/2013, at 12:09, Wietse Venema wie...@porcupine.org wrote:

 Rafael Azevedo - IAGENTE:
 Hi Witsie,
 
 Is there anyway we can adjust Postfix to stop delivering after a
 4XX reply?
 
 Postfix will stop delivering after TCP or SMTP handshake failure.
 Postfix WILL NOT stop delivering due to 4xx reply AFTER the SMTP
 protocol handshake.
 
 Postfix is not a tool to work around receiver policy restrictions.
 If you want to send more than a few email messages, then it is your
 responsibility to make the necessary arrangements with receivers.
 
 Over and out.
 
   Wietse
 
 



Re: destination_rate_delay and connection_reuse_time_limit

2013-01-08 Thread Wietse Venema
Rafael Azevedo - IAGENTE:
 I truly believe that postfix is the best MTA ever, but you might
 agree with me that when the receiver starts blocking the sender,
 it's pointless to keep trying to deliver.

1) Postfix will back off when the TCP or SMTP handshake fails. This
is a clear signal that a site is unavailable.

2) Postfix will not back off after [54]XX in the middle of a session.
IN THE GENERAL CASE this does not mean that the receiver is blocking
the sender, and backing off would be the wrong strategy.

Wietse


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-08 Thread Rafael Azevedo - IAGENTE
But Wietse, would you agree with me that error 4XX is (in general cases) a 
temporary error?

Why keep trying when we have a clear signal of a temporary error?

Also, if we had a temporary error control (number of deferred messages by 
recipient), it would be easy to identify when postfix should stop trying at 
least for a while.

Att.
--
Rafael Azevedo | IAGENTE
Fone: 51 3086.0262
MSN: raf...@hotmail.com
Visite: www.iagente.com.br

On 08/01/2013, at 13:34, Wietse Venema wie...@porcupine.org wrote:

 Rafael Azevedo - IAGENTE:
 I truly believe that postfix is the best MTA ever, but you might
 agree with me that when the receiver starts blocking the sender,
 it's pointless to keep trying to deliver.
 
 1) Postfix will back off when the TCP or SMTP handshake fails. This
 is a clear signal that a site is unavailable.
 
 2) Postfix will not back off after [54]XX in the middle of a session.
 IN THE GENERAL CASE this does not mean that the receiver is blocking
 the sender, and backing off would be the wrong strategy.
 
   Wietse



Re: destination_rate_delay and connection_reuse_time_limit

2013-01-08 Thread Viktor Dukhovni
On Tue, Jan 08, 2013 at 01:59:14PM -0200, Rafael Azevedo - IAGENTE wrote:

 But Wietse, would you agree with me that error 4XX is (in general
 cases) a temporary error?

It is a temporary error for *that* recipient. It is not a global
indication that the site is temporarily unreachable. Nor is there
any indication how long one should wait, nor that waiting will make
things any better.

Generally, delaying delivery *increases* congestion, since more mail
arrives in the mean-time, and once delivery resumes the volume is
even higher.

 Why keep trying when we have a clear signal of a temporary error?

Postfix does not keep trying, it defers the message in question
and moves on to the next one. Your mental model of email queue
management is too naive.

This is a very difficult problem, and there is no simple answer.

 Also, if we had a temporary error control (number of deferred
 messages by recipient), it would be easy to identify when postfix
 should stop trying at least for a while.

Given an arrival rate of ~50k msgs/day, you need to send at least
1 msg/sec to avoid growing an infinitely large queue. This is basic
arithmetic. Going slower does not work; your queue grows without
bound.
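
Viktor's arithmetic can be checked directly; a quick sketch using the
round figures from this thread:

```python
# Queue stability: the output rate must exceed the input rate, so the
# average spacing between deliveries must stay below 86400/50000 seconds.
msgs_per_day = 50_000
seconds_per_day = 86_400

input_rate = msgs_per_day / seconds_per_day    # messages arriving per second
max_spacing = seconds_per_day / msgs_per_day   # max seconds between sends

print(f"{input_rate:.2f} msg/sec in, at most {max_spacing:.2f}s between sends")
```

Any rate delay larger than about 1.7 seconds per message makes the queue
grow without bound at this volume.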

Let the recipients know that if they want to continue to receive
your email they should choose a new provider that is willing to
work with legitimate senders to resolve mail delivery issues.  Then
stop sending them email.

-- 
Viktor.


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-08 Thread Wietse Venema
Rafael Azevedo - IAGENTE:
 Why keep trying when we have a clear signal of a temporary error?

As Viktor noted, Postfix does not keep trying the SAME delivery.

Instead, Postfix tries to deliver a DIFFERENT message. It would be
incorrect IN THE GENERAL CASE to postpone ALL deliveries to a site
just because FIVE recipients were unavailable.

Wietse


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-08 Thread Rafael Azevedo - IAGENTE
On 08/01/2013, at 14:21, Wietse Venema wie...@porcupine.org wrote:

 Rafael Azevedo - IAGENTE:
 Why keep trying when we have a clear signal of a temporary error?
 
  As Viktor noted, Postfix does not keep trying the SAME delivery.

Yes, you're right, and I know that. But it keeps trying other recipients in 
the same domain.

Let's just say Yahoo starts blocking because of unusual traffic. Yahoo will keep 
telling me to try again later. In the meantime, new emails to be sent 
arrive and we never get the chance to get unblocked.

So postfix gets the next Yahoo recipient and tries to deliver without considering 
that Yahoo does not want us to keep trying for a while.

This is just an example. We don't have a problem delivering to Yahoo, but to 
smaller providers.

 
 Instead, Postfix tries to deliver a DIFFERENT message. It would be
 incorrect IN THE GENERAL CASE to postpone ALL deliveries to a site
 just because FIVE recipients were unavailable.

That's why it would be interesting to have a way to configure that. Let's say we 
have 100 deferred messages in sequence. Why keep trying? This way we lose time 
and processing, and have no way to improve our reputation, since we don't stop 
bugging them after they tell us to stop for a while.

 
   Wietse

Anyway, it doesn't seem to be possible to do this.

Thanks guys.

Rafael.

Re: destination_rate_delay and connection_reuse_time_limit

2013-01-08 Thread Rafael Azevedo - IAGENTE

Att.
--
Rafael Azevedo | IAGENTE
Fone: 51 3086.0262
MSN: raf...@hotmail.com
Visite: www.iagente.com.br

On 08/01/2013, at 14:07, Viktor Dukhovni postfix-us...@dukhovni.org wrote:

 On Tue, Jan 08, 2013 at 01:59:14PM -0200, Rafael Azevedo - IAGENTE wrote:
 
 But Wietse, would you agree with me that error 4XX is (in general
 cases) a temporary error?
 
 It is a temporary error for *that* recipient. It is not a global
 indication that the site is temporarily unreachable. Nor is there
 any indication how long one should wait, nor that waiting will make
 things any better.

Yes, you're right. There is no indication of how long we should wait; that's why 
it would be very nice to have a parameter to determine that (just like 
maximal_queue_lifetime).

 
 Generally, delaying delivery *increases* congestion, since more mail
 arrives in the mean-time, and once delivery resumes the volume is
 even higher.

That's exactly the problem. We have what I call an mxcluster, which is a box 
with hundreds of Postfix instances running, splitting the traffic between them. 
It helps, but it's not solving the major problem.

 
 Why keep trying when we have a clear signal of a temporary error?
 
 Postfix does not keep trying, it defers the message in question
 and moves on to the next one. Your mental model of email queue
 management is too naive.
 
 This is a very difficult problem, and there is no simple answer.

Yes, it tries the next message. But what about when it is to the same domain and 
also happens to get deferred?

 
 Also, if we had a temporary error control (number of deferred
 messages by recipient), it would be easy to identify when postfix
 should stop trying at least for a while.
 
 Given an arrival rate of ~50k msgs/day, you need to send at least
 1 msg/sec to avoid growing an infinitely large queue. This is basic
 arithmetic. Going slower does not work; your queue grows without
 bound.

That's why we have multiple instances of postfix running, to split the traffic 
among them.

 Let the recipients know that if they want to continue to receive
 your email they should choose a new provider that is willing to
 work with legitimate senders to resolve mail delivery issues.  Then
 stop sending them email.

Yes and no. Some SMTP servers get higher volumes of mail, but the entire 
traffic is not centralized in only one server.

 
 -- 
   Viktor.


Rafael

Re: destination_rate_delay and connection_reuse_time_limit

2013-01-08 Thread Mark Goodge

On 08/01/2013 16:38, Rafael Azevedo - IAGENTE wrote:

On 08/01/2013, at 14:21, Wietse Venema wie...@porcupine.org wrote:


Rafael Azevedo - IAGENTE:

Why keep trying when we have a clear signal of a temporary
error?


As Viktor noted, Postfix does not keep trying the SAME delivery.


Yes, you're right, and I know that. But it keeps trying other
recipients in the same domain.


Which is absolutely the correct behaviour.

One of the most common reasons for a temporary delivery failure is a 
full mailbox. Or, where the remote server is acting as a 
store-and-forward, a temporary inability to verify the validity of the 
destination address.


I'd be very annoyed if I didn't get an email I was expecting because 
someone else on my system had forgotten to empty their mailbox, or 
because another customer of my upstream server had an outage and wasn't 
able to verify recipients.


Mark
--
Please take a short survey about the Leveson Report: http://meyu.eu/ak


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-08 Thread Wietse Venema
Rafael Azevedo - IAGENTE:
  Instead, Postfix tries to deliver a DIFFERENT message. It would be
  incorrect IN THE GENERAL CASE to postpone ALL deliveries to a site
  just because FIVE recipients were unavailable.
 
 Thats why it would be interesting to have a way to configure that.

Configurable, perhaps. But it would be a mistake to make this the
default strategy.

That would make Postfix vulnerable to a trivial denial of service
attack where one bad recipient can block all mail for all other
recipients at that same site.

Imagine if I could block all mail for gmail.com in this manner.

If I understand correctly, your proposal is to treat all 4xx and
5xx delivery errors the same as a failure to connect error.

Wietse


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-08 Thread Reindl Harald


On 08.01.2013 17:44, Mark Goodge wrote:
 On 08/01/2013 16:38, Rafael Azevedo - IAGENTE wrote:
 On 08/01/2013, at 14:21, Wietse Venema wie...@porcupine.org wrote:

 Rafael Azevedo - IAGENTE:
 Why keep trying when we have a clear signal of a temporary
 error?

 As Viktor noted, Postfix does not keep trying the SAME delivery.

 Yes, you're right, and I know that. But it keeps trying other
 recipients in the same domain.
 
 Which is absolutely the correct behaviour.
 
 One of the most common reasons for a temporary delivery failure is a full 
 mailbox. Or, where the remote server is
 acting as a store-and-forward, a temporary inability to verify the validity 
 of the destination address.
 
 I'd be very annoyed if I didn't get an email I was expecting because someone 
 else on my system had forgotten to
 empty their mailbox, or because another customer of my upstream server had an 
 outage and wasn't able to verify
 recipients.

Yes, that is all right for any normal mail.

But if you send out a newsletter you likely have a lot of
users at big ISPs; Telekom Austria even rejects temporarily
for whitelisted senders.

Since every smart admin splits newsletter relays from the
normal business mail, a configuration option would help
while not hurting your case.



signature.asc
Description: OpenPGP digital signature


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-08 Thread Reindl Harald


On 08.01.2013 17:48, Wietse Venema wrote:
 Rafael Azevedo - IAGENTE:
 Instead, Postfix tries to deliver a DIFFERENT message. It would be
 incorrect IN THE GENERAL CASE to postpone ALL deliveries to a site
 just because FIVE recipients were unavailable.

 Thats why it would be interesting to have a way to configure that.
 
 Configurable, perhaps. But it would be a mistake to make this the
 default strategy.
 
 That would make Postfix vulnerable to a trivial denial of service
 attack where one bad recipient can block all mail for all other
 recipients at that same site.
 
 Imagine if I could block all mail for gmail.com in this manner.
 
 If I understand correctly, your proposal is to treat all 4xx and
 5xx delivery errors the same as a failure to connect error.

As I understand it, his proposal is: if delivery to a configurable
number of recipients at the same destination server fails, delay
the following deliveries to the same destination for x minutes,
instead of triggering another 100, 200, 300 4xx errors because the
destination does not like mail from your IP for some time.



signature.asc
Description: OpenPGP digital signature


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-08 Thread Rafael Azevedo - IAGENTE
 
 One of the most common reasons for a temporary delivery failure is a full 
 mailbox. Or, where the remote server is acting as a store-and-forward, a 
 temporary inability to verify the validity of the destination address.

I don't agree with that. Connection timeout is the most common reason for 
temporary failure (in my case).

 
 I'd be very annoyed if I didn't get an email I was expecting because someone 
 else on my system had forgotten to empty their mailbox, or because another 
 customer of my upstream server had an outage and wasn't able to verify 
 recipients.

Mark, I don't think that postfix should stop sending to that domain forever, or 
that it should send the email back to the sender. I just think that postfix 
could have a way to hold the mail queue for a specific time based on specific 
and consecutive errors. Let's say, for example, 100 errors in sequence to the 
same destination domain. Why keep trying if we're unable to deliver to that 
domain at the time?

 
 Mark
 -- 
 Please take a short survey about the Leveson Report: http://meyu.eu/ak

Rafael

Re: destination_rate_delay and connection_reuse_time_limit

2013-01-08 Thread Rafael Azevedo - IAGENTE


 Configurable, perhaps. But it would be a mistake to make this the
 default strategy.
 
 That would make Postfix vulnerable to a trivial denial of service
 attack where one bad recipient can block all mail for all other
 recipients at that same site.

Not if it could be parametrized. As I said, what if we get 100 errors in 
sequence? Keeping trying to deliver another 10k emails, knowing that you're not 
allowed to send email at this time, is more like a DoS attack. We're consuming 
the server's resources when we shouldn't connect to them at all.

 
 Imagine if I could block all mail for gmail.com in this manner.
 
 If I understand correctly, your proposal is to treat all 4xx and
 5xx delivery errors the same as a failure to connect error.

No, that's not what I meant. What I said is that it would be nice to have a way 
to configure specific errors to put the queue on hold for those destinations 
to which we're unable to deliver at the time.

 
   Wietse


Rafael

Re: destination_rate_delay and connection_reuse_time_limit

2013-01-08 Thread Rafael Azevedo - IAGENTE
Yes Reindl, you got the point. I just want to wait for a while before retrying 
to send email to the same destination.


 On 08.01.2013 17:48, Wietse Venema wrote:
 Rafael Azevedo - IAGENTE:
 Instead, Postfix tries to deliver a DIFFERENT message. It would be
 incorrect IN THE GENERAL CASE to postpone ALL deliveries to a site
 just because FIVE recipients were unavailable.
 
 Thats why it would be interesting to have a way to configure that.
 
 Configurable, perhaps. But it would be a mistake to make this the
 default strategy.
 
 That would make Postfix vulnerable to a trivial denial of service
 attack where one bad recipient can block all mail for all other
 recipients at that same site.
 
 Imagine if I could block all mail for gmail.com in this manner.
 
 If I understand correctly, your proposal is to treat all 4xx and
 5xx delivery errors the same as a failure to connect error.
 
 As I understand it, his proposal is: if delivery to a configurable
 number of recipients at the same destination server fails, delay
 the following deliveries to the same destination for x minutes,
 instead of triggering another 100, 200, 300 4xx errors because the
 destination does not like mail from your IP for some time.
 



Re: destination_rate_delay and connection_reuse_time_limit

2013-01-08 Thread Scott Lambert
On Tue, Jan 08, 2013 at 03:04:37PM -0200, Rafael Azevedo - IAGENTE wrote:
  Configurable, perhaps. But it would be a mistake to make this the
  default strategy.
  
  That would make Postfix vulnerable to a trivial denial of service
  attack where one bad recipient can block all mail for all other
  recipients at that same site.

 Not if it could be parametrized. As I said, what if we get 100 errors
 in sequence? Keep trying to deliver another 10k emails knowing that
 you're not allowed to send email at this time is more like a DoS
 attack. We're consuming server's resource when we shouldn't connect to
 them at all.

  
  Imagine if I could block all mail for gmail.com in this manner.
  
  If I understand correctly, your proposal is to treat all 4xx and
  5xx delivery errors the same as a failure to connect error.

 No, that's not what I meant. What I said is that it would be nice to have
 a way to configure specific errors to put the queue on hold for those
 destinations to which we're unable to deliver at the time.

Could you not just watch your logs and count temporary errors for
each destination?  The script could then reconfigure your mailertable
to point that domain to a hold transport (or even another box which
is configured to send messages very slowly).  After some amount of
time passes, change back to the normal SMTP transport.
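
A minimal sketch of the counting step Scott describes; the log-line
pattern, the threshold, and the reset-on-success rule are all
assumptions for illustration, not a canonical Postfix log parser:

```python
import re
from collections import defaultdict

# Count consecutive "status=deferred" entries per destination domain and
# report domains that cross a threshold; a successful delivery resets the
# count. A cron job could then switch those domains in the transport map.
RCPT_RE = re.compile(r'to=<[^@>]+@([^>]+)>')
THRESHOLD = 100  # consecutive deferrals before the domain would be held

def find_hot_domains(log_lines):
    streak = defaultdict(int)
    hot = set()
    for line in log_lines:
        m = RCPT_RE.search(line)
        if not m:
            continue
        domain = m.group(1)
        if 'status=deferred' in line:
            streak[domain] += 1
            if streak[domain] >= THRESHOLD:
                hot.add(domain)
        elif 'status=sent' in line:
            streak[domain] = 0
    return hot
```

The map switch itself would then be the usual rebuild with postmap, as
Scott suggests.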

I've never needed to do any such thing.  But, I believe that would
be possible without depending on changes to Postfix, which may not
happen.

-- 
Scott LambertKC5MLE   Unix SysAdmin
lamb...@lambertfam.org


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-08 Thread Wietse Venema
Rafael Azevedo - IAGENTE:
 
 
  Configurable, perhaps. But it would be a mistake to make this the
  default strategy.
  
  That would make Postfix vulnerable to a trivial denial of service
  attack where one bad recipient can block all mail for all other
  recipients at that same site.
 
 Not if it could be parametrized. As I said, what if we get 100
 errors in sequence?

Big deal. Now I can block all mail for gmail.com by getting 100
email messages into your queue.

I could add an option to treat this in the same manner as failure
to connect errors (i.e. temporarily skip all further delivery to
this site). However, this must not be the default strategy, because
this would hurt the vast majority of Postfix sites, which are not
bulk email senders.

Currently, Postfix error processing distinguishes between (hard
versus soft) errors, and between errors (during versus after) the
initial protocol handshake.  I don't have time to develop more
detailed error processing strategies, especially not since this is
of no benefit to the majority of the installed base.

Wietse


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-08 Thread Reindl Harald


On 08.01.2013 19:08, Wietse Venema wrote:
 Rafael Azevedo - IAGENTE:


 Configurable, perhaps. But it would be a mistake to make this the
 default strategy.

 That would make Postfix vulnerable to a trivial denial of service
 attack where one bad recipient can block all mail for all other
 recipients at that same site.

 Not if it could be parametrized. As I said, what if we get 100
 errors in sequence?
 
 Big deal. Now I can block all mail for gmail.com by getting 100
 email messages into your queue

How come?
How do you get gmail.com to answer any delivery from you with a 4xx?

If it is because of the 100 messages, you are already done, because
they still reject your messages. And if the other side has some
greylisting-like rule that rejects messages from you as long as there
have not been at least 5 minutes without another connection, it is even
easier to block you with the current behavior: by sending one message
per minute after the threshold is reached the first time, you will
never get under the 5 minutes.



signature.asc
Description: OpenPGP digital signature


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-08 Thread Viktor Dukhovni
On Tue, Jan 08, 2013 at 01:08:21PM -0500, Wietse Venema wrote:

 I could add an option to treat this in the same manner as failure
 to connect errors (i.e. temporarily skip all further delivery to
 this site). However, this must not be the default strategy, because
 this would hurt the vast majority of Postfix sites, which are not
 bulk email senders.

Such a feedback mechanism is a sure-fire recipe for congestive
collapse:

- A brief spike in traffic above the steady input rate causes the
  message rate to trigger rate limits at the remote destination.

- Postfix throttles to the destination for many minutes.

- A lot of additional mail arrives while the destination is throttled.

- When the queue is unthrottled, the message rate will immediately
  spike above the remote limit. And the queue is throttled again.

- Lather, rinse, repeat

The *only* way to deal with rate limits, is to avoid traffic spikes,
which is only possible if you also avoid traffic troughs, and send
at a *steady* rate that is *below* the remote limit, but above the
input rate.

When the steady-state input rate is above the remote limit, no
scheduler strategy can avoid congestive collapse.

For remote sites that enforce indiscriminate rate limits (for all
senders, not just those who have a history of reported spam), the
only strategy is to:

- Send below the rate that triggers the rate-limit

- Buy more machines, and hope that rate limits are per
  sending IP, not per sending domain (snowshoe).

- In some cases, the rate limits are reputation dependent,
  and it takes time to build a good reputation. In that case
  one needs to ramp traffic to the domain over time, by generating
  less mail initially, and building volume over time. This is done
  outside of the MTA, in the mail generating engine.

Thinking that delaying sending is a good idea is dead wrong.  The
optimal strategy is to always send as fast as possible and no
faster!

-- 
Viktor.


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-08 Thread Wietse Venema
Reindl Harald:
  Big deal. Now I can block all mail for gmail.com by getting 100
  email messages into your queue
 
 How come?
 How do you get gmail.com to answer any delivery from you with a 4xx?

He wants to temporarily suspend delivery when a site has 5 consecutive
delivery errors, without distinguishing between SMTP protocol stages
(such an "ignore protocol stage" switch could be added to Postfix).

To implement a trivial DOS, I need 5 consecutive messages in his
mail queue, plus a handful accounts that don't accept mail.

I have no idea where he got the 100 from - that number was not part
of his original problem description.

Wietse


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-08 Thread Reindl Harald


On 08.01.2013 20:16, Wietse Venema wrote:
 Reindl Harald:
 Big deal. Now I can block all mail for gmail.com by getting 100
 email messages into your queue

 How come?
 How do you get gmail.com to answer any delivery from you with a 4xx?
 
 He wants to temporarily suspend delivery when a site has 5 consecutive
 delivery errors without distinguishing between SMTP protocol stages
 (such an ignore protocol stage switch could be added to Postfix).
 
 To implement a trivial DOS, I need 5 consecutive messages in his
 mail queue, plus a handful accounts that don't accept mail.
 
 I have no idea where he got the 100 from - that number was not part
 of his original problem description

the 100 was brought into the game by yourself :-)

However, it would make sense to delay the next try to, for example,
@aon.at after 10, 20, 30 4xx errors while sending out a newsletter,
and re-start trying to deliver these messages 5, 10, 15 minutes later.

On the other hand: if you do not see sense here and are not willing
to spend time on this idea, postfix is your baby and still one of
the greatest pieces of software I was allowed to use in many years,
and we will not die if you reject the feature wish.




signature.asc
Description: OpenPGP digital signature


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-08 Thread Wietse Venema
Viktor Dukhovni:
 On Tue, Jan 08, 2013 at 01:08:21PM -0500, Wietse Venema wrote:
 
  I could add an option to treat this in the same manner as failure
  to connect errors (i.e. temporarily skip all further delivery to
  this site). However, this must not be the default strategy, because
  this would hurt the vast majority of Postfix sites, which are not
  bulk email senders.
 
 Such a feedback mechanism is a sure-fire recipe for congestive
 collapse:

That depends on their average mail input rate. As long as they can
push out the mail from one input burst before the next input burst
happens, then it may be OK that the output flow stutters sometimes.

Wietse


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-08 Thread Viktor Dukhovni
On Tue, Jan 08, 2013 at 02:39:17PM -0500, Wietse Venema wrote:

 Viktor Dukhovni:
  On Tue, Jan 08, 2013 at 01:08:21PM -0500, Wietse Venema wrote:
  
   I could add an option to treat this in the same manner as failure
   to connect errors (i.e. temporarily skip all further delivery to
   this site). However, this must not be the default strategy, because
   this would hurt the vast majority of Postfix sites, which are not
   bulk email senders.
  
  Such a feedback mechanism is a sure-fire recipe for congestive
  collapse:
 
 That depends on their average mail input rate. As long as they can
 push out the mail from one input burst before the next input burst
 happens, then it may be OK that the output flow stutters sometimes.

This is most unlikely. The sample size before the remote side clamps
down is likely small, so the effective throughput per throttle
interval will be very low.

If Postfix backs off initially for 5 minutes, it will fully drain
the active queue to deferred, then get a handful of messages
through, then backoff for 10 minutes (doubling each time up to the
maximal_backoff_time). This won't push out 50k messages/day.

The optimal strategy is to send each message as quickly as possible,
but not faster than the remote rate limit, i.e. tune the rate delay.
Perhaps we need to measure the rate delay in tenths of seconds for
a bit more flexibility.

One can imagine adding a feedback mechanism to the rate delay (with
fractional positive/negative feedback), but getting a stable
algorithm out of this is far from easy.
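
For illustration only, such a feedback loop might resemble an AIMD
(additive-increase/multiplicative-decrease) controller applied to a
per-destination rate delay, here with tenth-of-a-second granularity;
whether this stays stable against a real receiver's policy is exactly
the open question:

```python
# Hypothetical AIMD controller for a per-destination rate delay.
# Nothing like this exists in Postfix; it only illustrates the
# "fractional positive/negative feedback" idea.
class RateDelayController:
    def __init__(self, delay=1.0, floor=0.1, ceiling=30.0):
        self.delay = delay        # seconds between deliveries
        self.floor = floor        # fastest pace we will ever try
        self.ceiling = ceiling    # slowest pace before giving up

    def on_success(self):
        # speed up gently: subtract a tenth of a second per success
        self.delay = max(self.floor, self.delay - 0.1)

    def on_tempfail(self):
        # back off sharply: double the delay on a 4xx response
        self.delay = min(self.ceiling, self.delay * 2)
```

The hard part is proving that the oscillations damp out instead of
ringing between the floor and the ceiling.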

Throttling the active queue is not an answer. With rate limits, one
wants to slow down, not stop, but throttling is not slowing down.

Barring a clean slow down signal, and a stable feedback mechanism,
the only strategy is manually tuned rate delays, and spreading the
load over multiple sending IPs (Postfix instances don't help if
they share a single IP).

-- 
Viktor.


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-08 Thread Reindl Harald


On 08.01.2013 20:51, Viktor Dukhovni wrote:
 On Tue, Jan 08, 2013 at 02:39:17PM -0500, Wietse Venema wrote:
 
 Viktor Dukhovni:
 On Tue, Jan 08, 2013 at 01:08:21PM -0500, Wietse Venema wrote:

 I could add an option to treat this in the same manner as failure
 to connect errors (i.e. temporarily skip all further delivery to
 this site). However, this must not be the default strategy, because
 this would hurt the vast majority of Postfix sites, which are not
 bulk email senders.

 Such a feedback mechanism is a sure-fire recipe for congestive
 collapse:

 That depends on their average mail input rate. As long as they can
 push out the mail from one input burst before the next input burst
 happens, then it may be OK that the output flow stutters sometimes.
 
 This is most unlikely. The sample size before the remote side clamps
 down is likely small, so the effective throughput per throttle
 interval will be very low.
 
 If Postfix backs off initially for 5 minutes, it will fully drain
 the active queue to deferred, then get a handful of messages
 through, then backoff for 10 minutes (doubling each time up to the
 maximal_backoff_time). This won't push out 50k messages/day

you missed the PER DESTINATION:

* not initially
* not globally
* after a CONFIGURABLE number of temporary errors to the same destination

On a dedicated MTA for newsletters it would even improve, in many
cases, the number of messages per day and the whole reputation of
the IP address you are sending from.

On a NORMAL mailserver with human senders it would make no sense.

That's why such a thing must not be the default, but it would be nice to have.



signature.asc
Description: OpenPGP digital signature


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-08 Thread Wietse Venema
Viktor Dukhovni:
 On Tue, Jan 08, 2013 at 02:39:17PM -0500, Wietse Venema wrote:
  Viktor Dukhovni:
   On Tue, Jan 08, 2013 at 01:08:21PM -0500, Wietse Venema wrote:
   
I could add an option to treat this in the same manner as failure
to connect errors (i.e. temporarily skip all further delivery to
this site). However, this must not be the default strategy, because
this would hurt the far majority of Postfix sites which is not a
bulk email sender.
   
   Such a feedback mechanism is a sure-fire recipe for congestive
   collapse:

Given that sites will tempfail a delivery attempt to signal a stay
away condition, it makes sense to provide an option for bulkmailers
to treat those responses accordingly.  That does not mean that this
option will be sufficient to solve all mail delivery problems. Some
parts will require human intervention.

The stay away condition is similar to failure to connect, except
that one might want to use different timers. With failure to
connect the timer choice depends more on sender preferences, while
with failure to deliver the timer choice would depend more on the
destination.

Coming to the issue of receivers' limits: when receivers slam hard
on the brakes, I don't see how Postfix could automatically tune the
delivery rate (as a fictitious example, suppose that a transgression
results in a 30-minute penalty during which no mail will be accepted
from the client IP address).

My conclusion is that Postfix can continue to provide basic policies
that avoid worst-case failure modes, but the choice of the settings
that control those policies is better left to the operator. If the
receiver slams on the brakes, then Postfix can suspend deliveries,
but the sender operator will have to adjust the sending rate.

Wietse


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-08 Thread Reindl Harald


On 08.01.2013 21:40, Wietse Venema wrote:
 My conclusion is that Postfix can continue to provide basic policies
 that avoid worst-case failure modes, but the choice of the settings
 that control those policies is better left to the operator. If the
 receiver slams on the brakes, then Postfix can suspend deliveries,
 but the sender operator will have to adjust the sending rate.

exactly this is the point

thank you for your understanding and thoughts!



signature.asc
Description: OpenPGP digital signature


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-08 Thread Viktor Dukhovni
On Tue, Jan 08, 2013 at 10:02:31PM +0100, Reindl Harald wrote:

 On 08.01.2013 21:40, Wietse Venema wrote:
  My conclusion is that Postfix can continue to provide basic policies
  that avoid worst-case failure modes, but the choice of the settings
  that control those policies is better left to the operator. If the
  receiver slams on the brakes, then Postfix can suspend deliveries,
  but the sender operator will have to adjust the sending rate.
 
 exactly this is the point
 
 thank you for your understanding and thoughts!

Suspending delivery and punting all messages from the active queue
for the designated nexthop is not a winning strategy. In this state
mail delivery to the destination is in most cases unlikely to
recover without manual intervention.

I would posit that neither Reindl nor the OP, nor many others,
really understand what they are asking for. If they understood,
they would stop asking for it.

When faced with a destination that imposes tight rate limits you
must pre-configure your MTA to always stay under the limits. Nothing
good happens when the Postfix output rate under load exceeds the
remote limit whether you throttle the queue repeatedly or not.

The best that one can hope for is for Postfix to dynamically apply
a rate delay that is guaranteed to be slow enough to get under the
limit, and then gradually reduce it.

Throttling the destination (moving all active mail to deferred)
is a pre-programmed MTA outage, I'd not want to operate any system
that behaves that way, and neither should you, whether you know
it or not.

-- 
Viktor.


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-08 Thread Reindl Harald


Am 09.01.2013 02:57, schrieb Viktor Dukhovni:
 On Tue, Jan 08, 2013 at 10:02:31PM +0100, Reindl Harald wrote:
 
 Am 08.01.2013 21:40, schrieb Wietse Venema:
 My conclusion is that Postfix can continue to provide basic policies
 that avoid worst-case failure modes, but the choice of the settings
 that control those policies is better left to the operator. If the
 receiver slams on the brakes, then Postfix can suspend deliveries,
 but the sender operator will have to adjust the sending rate.

 exactly this is the point

 thank you for your understanding and thoughts!
 
 Suspending delivery and punting all messages from the active queue
 for the designated nexthop is not a winning strategy. In this state
 mail delivery to the destination is in most cases unlikely to
 recover without manual intervention.

it is in the use case of a DEDICATED newsletter relay
why should it not recover?

the request was after 20 temp fails to the same destination
retry the next deliveries to THIS destination FIVE MINUTES later

 I would posit that neither Reindl nor the OP, nor many others,
 really understand what they are asking for. If they understood,
 they would stop asking for it.

i would posit you do not understand the use case

 When faced with a destination that imposes tight rate limits you
 must pre-configure your MTA to always stay under the limits. Nothing
 good happens when the Postfix output rate under load exceeds the
 remote limit whether you throttle the queue repeatedly or not

smtp_destination_recipient_limit  = 15
smtp_initial_destination_concurrency  = 2
smtp_destination_concurrency_limit= 2
smtp_destination_concurrency_failed_cohort_limit  = 5
smtp_destination_rate_delay   = 1

so what more should one do?
the sending machine is whitelisted at the ISP
but the whitelisting does not affect rate-limits

and yes we do not care if a newsletter reaches every RCPT
two hours later, but we do care about reputation and not exceeding
the rate limits of large ISPs





Re: destination_rate_delay and connection_reuse_time_limit

2013-01-08 Thread Viktor Dukhovni
On Wed, Jan 09, 2013 at 03:06:58AM +0100, Reindl Harald wrote:

  Suspending delivery and punting all messages from the active queue
  for the designated nexthop is not a winning strategy. In this state
  mail delivery to the destination is in most cases unlikely to
  recover without manual intervention.
 
 it is in the usecase of a DEDICATED newsletter relay
 why should it not recover?
 
 the request was after 20 temp fails to the same destination
 retry the next deliveries to THIS destination FIVE MINUTES later

That's not what happens when a destination is throttled, all mail
there is deferred, and is retried some indefinite time later that
is at least 5 minutes but perhaps a lot longer, and at great I/O
cost, with exponential backoff for each message based on time in the
queue, ...

To understand what one is asking for, one needs to understand the
scheduler (qmgr) architecture. Otherwise, one is just babbling
nonsense (no offense intended).

  I would posit that neither Reindl nor the OP, nor many others,
  really understand what they are asking for. If they understood,
  they would stop asking for it.
 
 i would posit you do not understand the usecase

How likely do you think that is? Of course I understand the use
case, in fact better than the users who are asking for it.

 and yes we do not care if a newsletter reaches every RCPT
 two hours later, but we do care about reputation and not exceeding
 the rate limits of large ISPs

Throttling the destination (which means moving all pending messages
for the destination to deferred, where they age exponentially, while
more mail builds up...) is not the answer to your problem.

1. Get whitelisted without limits, send at the arrival rate.
2. Get whitelisted at above the arrival rate, set rate delay to
   avoid exceeding the rate.
3. Don't waste time with unresponsive mailbox providers, tell their
   customers their mailbox provider is not supported.
4. Snowshoe.

Pick the first one that is viable for you.

-- 
Viktor.


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-08 Thread Reindl Harald


Am 09.01.2013 03:17, schrieb Viktor Dukhovni:
 the request was after 20 temp fails to the same destination
 retry the next deliveries to THIS destination FIVE MINUTES later
 
 That's not what happens when a destination is throttled, all mail
 there is deferred, and is retried some indefinite time later that
 is at least 5 minutes but perhaps a lot longer, and at great I/O
 cost, with exponential backoff for each message based on time in the
 queue, ...
 
 To understand what one is asking for, one needs to understand the
 scheduler (qmgr) architecture. Otherwise, one is just babbling
 nonsense (no offense intended).

and the request was whether the behavior can be controlled in
the future, not what the behavior currently is

 Throttling the destination (which means moving all pending messages
 for the destination to deferred, where they age exponentially, while
 more mail builds up...) is not the answer to your problem.

sorry, but you really do NOT understand the use case

"while more mail builds up"
NO, there is NO MORE MAIL building up

* DEDICATED NEWSLETTER MACHINE
* means a large amount of mail one or two times a week

 1. Get whitelisted without limits, send at the arrival rate

no option

 2. Get whitelisted at above the arrival rate, set rate delay to
 avoid exceeding the rate

you missed

smtp_destination_recipient_limit= 15
smtp_initial_destination_concurrency= 2
smtp_destination_concurrency_limit  = 2
smtp_destination_concurrency_failed_cohort_limit= 10
smtp_destination_rate_delay = 1

 3. Don't waste time with unresponsive mailbox providers, tell their
 customers their mailbox provider is not supported.

reality check: you propose to tell my customers that
they should tell their customers anything because the
mailadmin would like to get rid of the permanent
"try again later" messages in his maillog

this will not happen in the real world





destination_rate_delay and connection_reuse_time_limit

2013-01-07 Thread Rafael Azevedo - IAGENTE
Guys,

I've identified a misbehavior in Postfix.

I do use destination_rate_delay for a specific transport queue, and I found out 
that the connection cache is not working when I have 
transport_destination_rate_delay > 1s.

If I change destination_rate_delay to higher than 1s, the connection cache 
won't work. Changing it back to 1s, everything works perfectly.

The problem is that we're having a hard time delivering email to a specific 
destination; that's why I have a specific transport, so I can manage its queue. 
But I'm still getting the same error: "You've reached the sending limit 
to this domain."

So this is what I have:
specificTransport_destination_concurrency_limit = 4
specificTransport_destination_rate_delay = 1s
specificTransport_connection_cache_reuse_limit = 100
specificTransport_bounce_queue_lifetime = 6h
specificTransport_maximal_queue_lifetime = 12h
specificTransport_connection_cache_time_limit = 30s
specificTransport_connection_reuse_time_limit = 600s
specificTransport_connection_cache_destinations = static:all

What I really need to do is send only one email every 5 seconds, keeping the 
connection open between one email and the next. I can't have more than 20 
open connections in a 10-minute timeframe, or I'll get blocked.
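As a rough sanity check of these constraints (illustrative arithmetic only, not Postfix configuration):

```python
# One message every 5 seconds, inside the receiver's 10-minute window.
WINDOW = 10 * 60   # window length, in seconds
DELAY = 5          # desired spacing between deliveries, in seconds

messages_per_window = WINDOW // DELAY
# With connection caching, a single long-lived connection can carry
# all of these deliveries, staying far below the 20-connection cap.
print(messages_per_window)   # 120
```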

Is there any way I can do this? 

Can anybody help please?

Thanks.

Att.
--
Rafael Azevedo | IAGENTE
Fone: 51 3086.0262
MSN: raf...@hotmail.com
Visite: www.iagente.com.br



Re: destination_rate_delay and connection_reuse_time_limit

2013-01-07 Thread Wietse Venema
Rafael Azevedo - IAGENTE:
 I do use destination_rate_delay for specific transport queue, and
 I found out that connection cache is not working when I have
 transport_destination_rate_delay > 1s.

The default time limit is 2s, and it is enforced in multiple places.
You have found only one.

As Postfix documentation says, you must not increase these time
limits without permission from the receiver.  Connection caching
is a performance tool, not a tool to circumvent receiver policies.

Wietse


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-07 Thread Rafael Azevedo - IAGENTE
Hi Wietse,

I don't really get it. I'm also sure Postfix has a way to solve this issue.

This is what I'm trying to do:

- I need to have only one process to this transport's queue.
- This queue must respect the destination's policy, so I can't have more than 
20 open connections in a 10-minute timeframe. That's why I want to use the 
connection cache.

According to my configuration, I'm having only one process for this transport, 
also limiting the sending rate, holding down the delivery process, waiting 1 
second after each sent message before sending another one.

And since this transport handles only specific domains, I really don't have to 
worry about receiver policies, because they told me to send as much as I can 
using the same connection, avoiding opening one connection per message.

What do you recommend I do? Can you help me out with tuning this?

Thanks.

Att.
--
Rafael Azevedo | IAGENTE
Fone: 51 3086.0262
MSN: raf...@hotmail.com
Visite: www.iagente.com.br

Em 07/01/2013, às 11:28, Wietse Venema wie...@porcupine.org escreveu:

 Rafael Azevedo - IAGENTE:
 I do use destination_rate_delay for specific transport queue, and
 I found out that connection cache is not working when I have
 transport_destination_rate_delay > 1s.
 
 The default time limit is 2s, and it is enforced in multiple places.
 You have found only one.
 
 As Postfix documentation says, you must not increase these time
 limits without permission from the receiver.  Connection caching
 is a performance tool, not a tool to circumvent receiver policies.
 
   Wietse



Re: destination_rate_delay and connection_reuse_time_limit

2013-01-07 Thread Wietse Venema
Rafael Azevedo - IAGENTE:
 Hi Wietse,
 
 I don't really get it. I'm also sure Postfix has a way to solve this issue.

I told you that there are two parameters that enforce the time limit. 

Wietse


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-07 Thread Rafael Azevedo - IAGENTE
Could you please refresh my mind?

Thanks.

Att.
--
Rafael Azevedo | IAGENTE
Fone: 51 3086.0262
MSN: raf...@hotmail.com
Visite: www.iagente.com.br

Em 07/01/2013, às 12:17, Wietse Venema wie...@porcupine.org escreveu:

 Rafael Azevedo - IAGENTE:
 Hi Wietse,
 
  I don't really get it. I'm also sure Postfix has a way to solve this issue.
 
 I told you that there are two parameters that enforce the time limit. 
 
   Wietse



Re: destination_rate_delay and connection_reuse_time_limit

2013-01-07 Thread Viktor Dukhovni
On Mon, Jan 07, 2013 at 11:34:45AM -0200, Rafael Azevedo - IAGENTE wrote:

 This is what I'm trying to do:
 
 - I need to have only one process to this transport's queue.

mumble_destination_concurrency_limit = 1

 - This queue must respect the destination's policy, so I can't
 have more than 20 opened connections in 10 minutes timeframe. Thats
 why I want to use connection cache.

The connection cache is used automatically when there is a backlog
of mail to the destination. You are defeating the connection cache
by enforcing a rate limit of 1, which rate limits deliveries, not
connections. DO NOT set a rate limit.

 According to my configuration, I'm having only one process for
 this transport, also limiting the sending time, holding down delivery
 process, waiting 1 second for each sent message before sending
 another one.

Instead of setting a process limit of 1, you can just specify an
explicit nexthop for the domains whose concurrency you want to
aggregate:

example.com mumble:example.com
example.net mumble:example.com
example.edu mumble:example.com
...

This should help the queue manager schedule deliveries to these domains,
as it will combine the queues for all the domains that use the
transport into a single queue (while using the MX records for
a suitably chosen single domain).

 And since this transport handles only specific domains, I really
 don't have to worry about receiver policies, because they told me
 to send as much as I can using the same connection, avoiding opening
 one connection per message.

Don't enable rate delays. Do specify a common nexthop for all domains
that share the transport. Don't mess with the connection cache timers.

-- 
Viktor.


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-07 Thread Rafael Azevedo - IAGENTE
Hi Viktor, thanks for helping.

I've done something very similar.

I created different named transports for specific domains and have all domains 
I need a special treatment to use this named transport.

So since I'm using Postfix + MySQL, I have a transport table with all domains 
and destination transport. It's quite the same thing you're proposing.

Yet, I'm still having the same problem. It's worthless to have all the domains 
I want on a specific transport without controlling the throughput.

I don't need just a transport queue for this; I need to control the throughput 
for all domains that are in this transport. That's why I'm working with the 
connection cache timers. I've also noticed a huge delivery improvement for 
Hotmail and Yahoo since I started messing around with this.

So in real life, I have about 10,000 domains that are hosted at the same 
hosting company. This company has rigid control over their resources. Then I 
took all domains we had traffic to in the last 3 months, looked up their 
hosting company, had everything mapped, and put them all on this named 
transport. Now all I need is to control the delivery for these specific 
destinations.

Basically this is what I need to be careful about: not sending more than 1,000 
messages in 10 minutes and not having more than 20 open connections in the same 
time frame.
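For the numbers quoted here, the implied minimum spacing between messages is easy to work out (illustrative arithmetic only):

```python
# At most 1,000 messages per 10-minute window.
WINDOW = 10 * 60       # seconds
MAX_MESSAGES = 1000

min_spacing = WINDOW / MAX_MESSAGES
# 0.6 seconds per message, so a 1-second rate delay (600 messages per
# window) already satisfies the message cap; the 20-connection cap is
# the tighter constraint, and connection reuse addresses that one.
print(min_spacing)   # 0.6
```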

Is there anything else I can do to have a better control of my throughput?

Any help would be very appreciated.

Thanks in advance.

Att.
--
Rafael Azevedo | IAGENTE
Fone: 51 3086.0262
MSN: raf...@hotmail.com
Visite: www.iagente.com.br

Em 07/01/2013, às 14:25, Viktor Dukhovni postfix-us...@dukhovni.org escreveu:

 On Mon, Jan 07, 2013 at 11:34:45AM -0200, Rafael Azevedo - IAGENTE wrote:
 
 This is what I'm trying to do:
 
 - I need to have only one process to this transport's queue.
 
   mumble_destination_concurrency_limit = 1
 
 - This queue must respect the destination's policy, so I can't
 have more than 20 opened connections in 10 minutes timeframe. Thats
 why I want to use connection cache.
 
 The connection cache is used automatically when there is a backlog
 of mail to the destination. You are defeating the connection cache
 by enforcing a rate limit of 1, which rate limits deliveries, not
 connections. DO NOT set a rate limit.
 
 According to my configuration, I'm having only one process for
 this transport, also limiting the sending time, holding down delivery
 process, waiting 1 second for each sent message before sending
 another one.
 
 Instead of setting a process limit of 1, you can just specify an
 explicit nexthop for the domains whose concurrency you want to
 aggregate:
 
   example.com mumble:example.com
   example.net mumble:example.com
   example.edu mumble:example.com
   ...
 
 This should help the queue manager schedule deliveries to these domains,
 as it will combine the queues for all the domains that use the
 transport into a single queue (while using the MX records for
 a suitably chosen single domain).
 
 And since this transport handles only specific domains, I really
 don't have to worry about receiver policies, because they told me
 to send as much as I can using the same connection, avoiding opening
 one connection per message.
 
 Don't enable rate delays. Do specify a common nexthop for all domains
 that share the transport. Don't mess with the connection cache timers.
 
 -- 
   Viktor.



Re: destination_rate_delay and connection_reuse_time_limit

2013-01-07 Thread Viktor Dukhovni
On Mon, Jan 07, 2013 at 02:37:03PM -0200, Rafael Azevedo - IAGENTE wrote:

 I've done something very similar.

If you want help, please take some time to read and follow the
advice you receive completely and accurately. "Similar" is another
way of saying "incorrect".

 I created different named transports for specific domains and
 have all domains I need a special treatment to use this named
 transport.

To achieve a total concurrency limit across multiple destination
domains, you must specify a common nexthop, not just a common
transport.

 So since I'm using Postfix + MySQL, I have a transport table with
 all domains and destination transport. Its quite the same thing
 you're proposing.

No, it is not, since it leaves out the common nexthop which
consolidates the queues for all the domains.

 Yet, I'm still with the same problem.

Do take the time to follow advice completely and accurately.

 So in the real life, I have about 10.000 domains that are hosted in
 the same hosting company. This company has a rigid control of their
 resources.

Your best bet is to get whitelisted by the receiving system for a higher
throughput limit.

If your average input message rate for these domains falls below the
current cap, and you're just trying to smooth out the spikes, the
advice I gave is correct, if you're willing to listen.

 Is there anything else I can do to have a better control of my throughput?

Understand that Postfix queues are per transport/nexthop, not merely
per transport. To schedule mail via a specific provider as a single
stream (queue), specify an explicit nexthop for all domains that
transit that provider. Since you're already using an explicit
transport, it is easy to append the appropriate nexthop.
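Since the poster mentioned a Postfix + MySQL transport table, a sketch of a mysql_table(5) map that returns transport and nexthop together might look like this (table, column, and credential names are hypothetical, not from the thread):

```
# /etc/postfix/mysql-transport.cf -- hypothetical names throughout.
# The key point: return "transport:nexthop", not just the transport,
# so that all domains behind one provider share a single queue.
user     = postfix
password = secret
hosts    = localhost
dbname   = mail
query    = SELECT CONCAT(transport, ':', nexthop)
    FROM transport_map WHERE domain = '%s'
```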

 Any help would be very appreciated.

Ideally, you will not dismiss help when it is given.

-- 
Viktor.


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-07 Thread Rafael Azevedo - IAGENTE
Hi Viktor,

Thanks once again for helping me on this.

Please understand that I'm very open and thankful for any help. I'm also 
trying to understand what you meant.

Getting whitelisted is always the best solution, but believe me, there are some 
providers that just don't answer any email; they just won't help us to even 
work in compliance with their rules. That's why I'm asking for help here.

Sometimes you guys speak in a very advanced language and it may be hard for 
some people to understand what you mean. Worse than that is when we try 
to explain our problem and we're not clear enough. So I tried to better explain 
myself, and then you came up with another solution.

Anyway, I'll search how to use this next hoop feature and see if it fixes the 
issue. Although I'm still having to respect the amount of messages per time 
frame, the question persists: how can I slow down delivery to these 
destinations without opening too many connections to them? Having them all in 
only one transport/nexthoop will not fix the problem if I don't control the 
throughput, right?

Sorry for the questions, I'm really trying to understand the solution here.

Thanks once again.

Att.
--
Rafael Azevedo | IAGENTE
Fone: 51 3086.0262
MSN: raf...@hotmail.com
Visite: www.iagente.com.br

Em 07/01/2013, às 14:47, Viktor Dukhovni postfix-us...@dukhovni.org escreveu:

 On Mon, Jan 07, 2013 at 02:37:03PM -0200, Rafael Azevedo - IAGENTE wrote:
 
 I've done something very similar.
 
 If you want help, please take some time to read and follow the
 advice you receive completely and accurately. Similar is another
 way of saying incorrect.
 
 I created different named transports for specific domains and
 have all domains I need a special treatment to use this named
 transport.
 
 To achieve a total concurrency limit across multiple destination
 domains, you must specify a common nexthop, not just a common
 transport.
 
 So since I'm using Postfix + MySQL, I have a transport table with
 all domains and destination transport. Its quite the same thing
 you're proposing.
 
 No, it is not, since it leaves out the common nexthop which
 consolidates the queues for all the domains.
 
 Yet, I'm still with the same problem.
 
 Do take the time to follow advice completely and accurately.
 
 So in the real life, I have about 10.000 domains that are hosted in
 the same hosting company. This company has a rigid control of their
 resources.
 
 Your best bet is to get whitelisted by the receiving system for a higher
 throughput limit.
 
 If your average input message rate for these domains falls below the
 current cap, and you're just trying to smooth out the spikes, the
 advice I gave is correct, if you're willing to listen.
 
 Is there anything else I can do to have a better control of my throughput?
 
 Understand that Postfix queues are per transport/nexthop, not merely
 per transport. To schedule mail via a specific provider as a single
 stream (queue), specify an explicit nexthop for all domains that
 transit that provider. Since you're already using an explicit
 transport, it is easy to append the appropriate nexthop.
 
 Any help would be very appreciated.
 
 Ideally, you will not dismiss help when it is given.
 
 -- 
   Viktor.



Re: destination_rate_delay and connection_reuse_time_limit

2013-01-07 Thread Rafael Azevedo - IAGENTE
Hi Viktor,

I was reading the documentation and found out something very interesting.

If I use mumble_destination_concurrency_limit = 1, the destination is a 
recipient not a domain.

Since I'm trying to control the throughput per destination domain, it is 
necessary to use destination_concurrency_limit > 1; in this case I believe 
that mumble_destination_concurrency_limit = 2 would be good for this issue.

default_destination_concurrency_limit (default: 20)
The default maximal number of parallel deliveries to the same destination. This 
is the default limit for delivery via the lmtp(8), pipe(8), smtp(8) and 
virtual(8) delivery agents. With per-destination recipient limit > 1, a 
destination is a domain, otherwise it is a recipient.

Is this correct?

Thanks in advance.

Att.
--
Rafael Azevedo | IAGENTE
Fone: 51 3086.0262
MSN: raf...@hotmail.com
Visite: www.iagente.com.br

Em 07/01/2013, às 14:25, Viktor Dukhovni postfix-us...@dukhovni.org escreveu:

 On Mon, Jan 07, 2013 at 11:34:45AM -0200, Rafael Azevedo - IAGENTE wrote:
 
 This is what I'm trying to do:
 
 - I need to have only one process to this transport's queue.
 
   mumble_destination_concurrency_limit = 1
 
 - This queue must respect the destination's policy, so I can't
 have more than 20 opened connections in 10 minutes timeframe. Thats
 why I want to use connection cache.
 
 The connection cache is used automatically when there is a backlog
 of mail to the destination. You are defeating the connection cache
 by enforcing a rate limit of 1, which rate limits deliveries, not
 connections. DO NOT set a rate limit.
 
 According to my configuration, I'm having only one process for
 this transport, also limiting the sending time, holding down delivery
 process, waiting 1 second for each sent message before sending
 another one.
 
 Instead of setting a process limit of 1, you can just specify an
 explicit nexthop for the domains whose concurrency you want to
 aggregate:
 
   example.com mumble:example.com
   example.net mumble:example.com
   example.edu mumble:example.com
   ...
 
 This should help the queue manager schedule deliveries to these domains,
 as it will combine the queues for all the domains that use the
 transport into a single queue (while using the MX records for
 a suitably chosen single domain).
 
 And since this transport handles only specific domains, I really
 don't have to worry about receiver policies, because they told me
 to send as much as I can using the same connection, avoiding opening
 one connection per message.
 
 Don't enable rate delays. Do specify a common nexthop for all domains
 that share the transport. Don't mess with the connection cache timers.
 
 -- 
   Viktor.



Re: destination_rate_delay and connection_reuse_time_limit

2013-01-07 Thread Rafael Azevedo - IAGENTE
Hi Viktor,

Thanks for the help.

I believe I've activated the next hop feature in my transport table.

If I understood it right, all I had to do is tell Postfix that these domains 
belong to my named transport, specifying the domain. 

So this is how it is now:
criticaldomain.tld  slow:criticaldomain.tld
domain.tld          slow:criticaldomain.tld

Is it right?

Thanks once again.

Att.
--
Rafael Azevedo | IAGENTE
Fone: 51 3086.0262
MSN: raf...@hotmail.com
Visite: www.iagente.com.br

Em 07/01/2013, às 14:47, Viktor Dukhovni postfix-us...@dukhovni.org escreveu:

 On Mon, Jan 07, 2013 at 02:37:03PM -0200, Rafael Azevedo - IAGENTE wrote:
 
 I've done something very similar.
 
 If you want help, please take some time to read and follow the
 advice you receive completely and accurately. Similar is another
 way of saying incorrect.
 
 I created different named transports for specific domains and
 have all domains I need a special treatment to use this named
 transport.
 
 To achieve a total concurrency limit across multiple destination
 domains, you must specify a common nexthop, not just a common
 transport.
 
 So since I'm using Postfix + MySQL, I have a transport table with
 all domains and destination transport. Its quite the same thing
 you're proposing.
 
 No, it is not, since it leaves out the common nexthop which
 consolidates the queues for all the domains.
 
 Yet, I'm still with the same problem.
 
 Do take the time to follow advice completely and accurately.
 
 So in the real life, I have about 10.000 domains that are hosted in
 the same hosting company. This company has a rigid control of their
 resources.
 
 Your best bet is to get whitelisted by the receiving system for a higher
 throughput limit.
 
 If your average input message rate for these domains falls below the
 current cap, and you're just trying to smooth out the spikes, the
 advice I gave is correct, if you're willing to listen.
 
 Is there anything else I can do to have a better control of my throughput?
 
 Understand that Postfix queues are per transport/nexthop, not merely
 per transport. To schedule mail via a specific provider as a single
 stream (queue), specify an explicit nexthop for all domains that
 transit that provider. Since you're already using an explicit
 transport, it is easy to append the appropriate nexthop.
 
 Any help would be very appreciated.
 
 Ideally, you will not dismiss help when it is given.
 
 -- 
   Viktor.



Re: destination_rate_delay and connection_reuse_time_limit

2013-01-07 Thread Viktor Dukhovni
On Mon, Jan 07, 2013 at 03:06:42PM -0200, Rafael Azevedo - IAGENTE wrote:

 Anyway, I'll search how to use this next hoop feature and see

The term is "nexthop"; this specifies the next system or systems
to which the message will be forwarded en route to its destination
mailbox. With SMTP the nexthop is a domain (subject to MX lookups)
or [gateway]  (not subject to MX lookups).

The syntax of transport entries for each delivery agent is specified
in the man page for that delivery agent, see the SMTP DESTINATION SYNTAX
section of:

http://www.postfix.org/smtp.8.html

 Although I'm still having to respect the amount of messages per
 time frame, the question persists: how can I slow down delivery
 to these destinations without opening too many connections to them?

You can reduce the connection rate by caching connections, which
works when you consolidate all the domains that use that provider
to a single transport/nexthop.

You can only reduce the message delivery rate by sending less mail.
To reduce the peak message delivery rate, you need to insert
artificial delays between message deliveries, but this defeats
connection reuse. You can't have both if the limits are sufficiently
aggressive. You should probably ignore the message rate limit.

By capping both message rates and connection rates the receiving
system is hostile to legitimate bulk email. If the hosted
users actually want your email, ask them to talk to the provider
on your behalf.

Otherwise, you can spread the load over multiple servers each
of which falls under the rate limits (snow-shoe).

 Having them all in one only transport/nexthop will not fix the
 problem if I don't control the throughput, right?

This will cause connection reuse, which combined with a destination
concurrency limit of 1, will minimize the number of connections
made.
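A back-of-the-envelope estimate (assumptions mine, based on the reuse time limit the original poster configured earlier in the thread) of why this keeps the connection count low:

```python
# With destination concurrency 1 and connection reuse, the connection
# count per window is driven by connection lifetime, not message count.
WINDOW = 10 * 60           # receiver's 10-minute window, seconds
REUSE_TIME_LIMIT = 600     # the OP's connection_reuse_time_limit, seconds

# A single delivery stream replaces its connection only when the reuse
# time limit expires, so it needs roughly:
connections_per_window = max(1, WINDOW // REUSE_TIME_LIMIT)
print(connections_per_window)   # 1 -- far under the 20-connection cap
```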

-- 
Viktor.


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-07 Thread Viktor Dukhovni
On Mon, Jan 07, 2013 at 03:29:53PM -0200, Rafael Azevedo - IAGENTE wrote:

 I believe I've activated the next hop feature in my transport table.
 
 If I understood it right, all I had to do is tell postfix that
 these domains belongs to my named transport specifying the domain.
 
 So this is how it is now:
 criticaldomain.tld  slow:criticaldomain.tld
 domain.tld          slow:criticaldomain.tld
 
 Is it right?

Correct. Together with:

slow_destination_concurrency_limit = 1
slow_destination_concurrency_failed_cohort_limit = 5

and without any:

slow_destination_rate_delay

which is equivalent to:

slow_destination_rate_delay = 0s

also don't change the defaults:

smtp_connection_cache_on_demand = yes
smtp_connection_cache_time_limit = 2s
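Pulled together, the recommendations in this message amount to something like the following sketch (the slow service name is from the thread; the provider nexthop is a placeholder):

```
# master.cf: a dedicated clone of the smtp delivery agent
#   slow      unix  -       -       n       -       -       smtp
#
# main.cf:
slow_destination_concurrency_limit = 1
slow_destination_concurrency_failed_cohort_limit = 5
# no slow_destination_rate_delay (leave the 0s default), and leave the
# connection-cache defaults alone:
#   smtp_connection_cache_on_demand = yes
#   smtp_connection_cache_time_limit = 2s
#
# transport map: a common transport AND a common nexthop
#   example.com    slow:provider.example
#   example.net    slow:provider.example
```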

-- 
Viktor.


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-07 Thread Viktor Dukhovni
On Mon, Jan 07, 2013 at 03:19:39PM -0200, Rafael Azevedo - IAGENTE wrote:

 If I use mumble_destination_concurrency_limit = 1, the destination
 is a recipient not a domain.

This is wrong. The setting in question is the recipient_limit, not
the concurrency limit.

 default_destination_concurrency_limit (default: 20)
 The default maximal number of parallel deliveries to the same destination. 
 This is the default limit for delivery via the lmtp(8), pipe(8), smtp(8) and 
 virtual(8) delivery agents. With per-destination recipient limit > 1, a 
 destination is a domain, otherwise it is a recipient.
 
 Is this correct?

It says when the recipient limit > 1.

-- 
Viktor.


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-07 Thread Rafael Azevedo - IAGENTE
Thank you so much Viktor, now I fully understand what you said.

Cheers.

Att.
--
Rafael Azevedo | IAGENTE
Fone: 51 3086.0262
MSN: raf...@hotmail.com
Visite: www.iagente.com.br

Em 07/01/2013, às 15:57, Viktor Dukhovni postfix-us...@dukhovni.org escreveu:

 On Mon, Jan 07, 2013 at 03:29:53PM -0200, Rafael Azevedo - IAGENTE wrote:
 
 I believe I've activated the next hop feature in my transport table.
 
 If I understood it right, all I had to do is tell postfix that
 these domains belongs to my named transport specifying the domain.
 
 So this is how it is now:
  criticaldomain.tld  slow:criticaldomain.tld
  domain.tld          slow:criticaldomain.tld
 
 Is it right?
 
 Correct. Together with:
 
   slow_destination_concurrency_limit = 1
   slow_destination_concurrency_failed_cohort_limit = 5
 
 and without any:
 
   slow_destination_rate_delay
 
 which is equivalent to:
 
   slow_destination_rate_delay = 0s
 
 also don't change the defaults:
 
   smtp_connection_cache_on_demand = yes
   smtp_connection_cache_time_limit = 2s
 
 -- 
   Viktor.
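Collecting the advice from this exchange, the whole setup amounts to something like the following sketch (the slow transport name and domain names follow the thread; the master.cf fields shown are the usual defaults):

```
# master.cf: a dedicated clone of the smtp(8) delivery agent
slow      unix  -       -       n       -       -       smtp

# main.cf: serialize deliveries per domain, suspend after 5
# consecutive failures, and leave slow_destination_rate_delay
# at its 0s default
slow_destination_concurrency_limit = 1
slow_destination_concurrency_failed_cohort_limit = 5

# /etc/postfix/transport: both domains share one nexthop, so they
# share the same concurrency and connection-reuse bookkeeping
criticaldomain.tld    slow:criticaldomain.tld
domain.tld            slow:criticaldomain.tld
```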



Re: destination_rate_delay and connection_reuse_time_limit

2013-01-07 Thread Rafael Azevedo - IAGENTE
Hi Viktor,

I've done exactly what you said and noticed that the connection cache is not 
being used anymore.

I ran a script in a loop sending email to a few recipients, and the cache seems 
not to be working (after commenting out slow_destination_rate_delay).

Changing slow_destination_rate_delay to 1s enables Postfix's connection cache again.

Can you give me a tip?

Thanks once again.

Att.
--
Rafael Azevedo | IAGENTE
Fone: 51 3086.0262
MSN: raf...@hotmail.com
Visite: www.iagente.com.br

Em 07/01/2013, às 15:57, Viktor Dukhovni postfix-us...@dukhovni.org escreveu:

 On Mon, Jan 07, 2013 at 03:29:53PM -0200, Rafael Azevedo - IAGENTE wrote:
 
 I believe I've activated the next hop feature in my transport table.
 
 If I understood it right, all I had to do is tell postfix that
 these domains belongs to my named transport specifying the domain.
 
 So this is how it is now:
 criticaldomain.tld    slow:criticaldomain.tld
 domain.tld            slow:criticaldomain.tld
 
 Is it right?
 
 Correct. Together with:
 
   slow_destination_concurrency_limit = 1
   slow_destination_concurrency_failed_cohort_limit = 5
 
 and without any:
 
   slow_destination_rate_delay
 
 which is equivalent to:
 
   slow_destination_rate_delay = 0s
 
 also don't change the defaults:
 
   smtp_connection_cache_on_demand = yes
   smtp_connection_cache_time_limit = 2s
 
 -- 
   Viktor.



Re: destination_rate_delay and connection_reuse_time_limit

2013-01-07 Thread Viktor Dukhovni
On Mon, Jan 07, 2013 at 04:24:20PM -0200, Rafael Azevedo - IAGENTE wrote:

 I've done exactly what you said and noticed that the connection
 cache is not being used anymore.

You have enabled cache-on-demand behaviour. This happens when the active
queue contains a backlog of messages to the destination. If your
input rate is sufficiently low, messages leave as quickly as they
arrive and connections are not cached.

 I ran a script in a loop sending email to a few recipients, and
 the cache seems not to be working (after commenting out
 slow_destination_rate_delay).

This does not generate mail fast enough: it is delivered as
fast as it arrives, with no backlog.

 Changing slow_destination_rate_delay to 1s enables Postfix's
 connection cache again.

You can set the rate delay to 1s (but not more), provided 1 msg/sec
is above your long-term average message rate to the destination.
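As a rough sanity check on that constraint: with slow_destination_rate_delay = 1s and a concurrency of 1, a nexthop absorbs at most one message per delay interval, so the throughput ceiling works out as follows (plain arithmetic, not Postfix code; 1s is the value from this thread):

```python
# Throughput ceiling implied by a per-destination rate delay.
rate_delay_s = 1                      # slow_destination_rate_delay, seconds
max_per_hour = 3600 // rate_delay_s   # one delivery per delay interval
max_per_day = 86400 // rate_delay_s

print(max_per_hour, max_per_day)      # prints: 3600 86400
```

If your long-term average to the destination exceeds roughly 86400 messages per day, a 1s rate delay guarantees an ever-growing backlog.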

If you just want to always cache, in master.cf change the slow
entry to add the option:

  master.cf:
slow unix ... smtp
  -o smtp_connection_cache_destinations=$slow_connection_cache_destinations

and then in main.cf add:

  main.cf:
# Perhaps safer:
# slow_connection_cache_destinations = example.com
slow_connection_cache_destinations = static:all

Or instead of 'static:all' just the nexthop you use in the transport
table for the slow domains in question, that way other nexthops that
use the slow transport can still use demand caching. You can of course
also use a table with appropriate keys:

  main.cf:
indexed = ${default_database_type}:${config_directory}/
slow_connection_cache_destinations = ${indexed}cache-nexthops

  cache-nexthops:
example.com whatever

Don't forget postfix reload.
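For the indexed-table variant above, activation would look something like this (assuming hash as the default database type and /etc/postfix as the config directory):

```
postmap hash:/etc/postfix/cache-nexthops
postfix reload
```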

-- 
Viktor.


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-07 Thread Wietse Venema
Viktor Dukhovni:
 On Mon, Jan 07, 2013 at 04:24:20PM -0200, Rafael Azevedo - IAGENTE wrote:
 
  I've done exactly what you said and noticed that the connection
  cache is not being used anymore.
 
 You have enabled cache-on-demand behaviour. This happens when the active
 queue contains a backlog of messages to the destination. If your
 input rate is sufficiently low, messages leave as quickly as they
 arrive and connections are not cached.

Connection cache time limits are controlled by two parameters: one
in the delivery agent, and one in the scache daemon. It's the second
parameter that he is missing all the time.

Wietse


Re: destination_rate_delay and connection_reuse_time_limit

2013-01-07 Thread Viktor Dukhovni
On Mon, Jan 07, 2013 at 04:02:36PM -0500, Wietse Venema wrote:

  On Mon, Jan 07, 2013 at 04:24:20PM -0200, Rafael Azevedo - IAGENTE wrote:
  
   I've done exactly what you said and noticed that the connection
   cache is not being used anymore.
  
  You have enabled cache-on-demand behaviour. This happens when the active
  queue contains a backlog of messages to the destination. If your
  input rate is sufficiently low, messages leave as quickly as they
  arrive and connections are not cached.
 
 Connection cache time limits are controlled by two parameters: one
 in the delivery agent, and one in the scache daemon. It's the second
 parameter that he is missing all the time.

Yes, but he should NOT change it. It was a sound piece of defensive
programming on your part to discourage abusive configurations.
Postfix should not cache idle connections to remote sites unnecessarily
long, and more than 1-2 seconds is unnecessarily long!

Thus I am not inclined to discuss the safety-net control.

-- 
Viktor.