Re: SSL_accept error from unknown[10.5.2.1]: lost connection
Thank you for the insight. It helped solve the issue.

Kind regards,
Wolfgang Rauchholz
+34 627 994 977
https://www.linkedin.com/in/wolfgangrauchholz/

On Tue, Feb 7, 2023 at 6:51 PM Wietse Venema wrote:
> Wolfgang Paul Rauchholz:
> > Hello, I run postfix (postfix-3.5.8-4.el8.x86_64) on my Rocky Linux 8.7
> > home server.
> > I set up postfix and dovecot as a first step and it seems to be working,
> > meaning I can send and receive mail (I sent and returned mail from a
> > gmail account).
> > But I find these error messages in /var/log/maillog and, after
> > researching and making changes, cannot fix them.
> > I searched the web and there are many different cases discussed, but...
> >
> > Feb 5 03:50:12 home postfix/smtps/smtpd[402300]: SSL_accept error from unknown[10.5.2.1]: lost connection
> > Feb 5 03:50:12 home postfix/smtps/smtpd[402300]: lost connection after CONNECT from unknown[10.5.2.1]
> > Feb 5 03:50:12 home postfix/smtps/smtpd[402300]: disconnect from unknown[10.5.2.1] commands=0/0
>
> This could be a TLS wrappermode mismatch.
>
> Port 587 (submission) should not use TLS wrappermode.
> Port 465 (smtps) should use TLS wrappermode.
> Port 25 (smtp) should not use TLS wrappermode.
>
> Either the client or the server got this wrong.
>
> Wietse
Re: SSL_accept error from unknown[10.5.2.1]: lost connection
Wolfgang Paul Rauchholz:
> Hello, I run postfix (postfix-3.5.8-4.el8.x86_64) on my Rocky Linux 8.7
> home server.
> I set up postfix and dovecot as a first step and it seems to be working,
> meaning I can send and receive mail (I sent and returned mail from a
> gmail account).
> But I find these error messages in /var/log/maillog and, after
> researching and making changes, cannot fix them.
> I searched the web and there are many different cases discussed, but...
>
> Feb 5 03:50:12 home postfix/smtps/smtpd[402300]: SSL_accept error from unknown[10.5.2.1]: lost connection
> Feb 5 03:50:12 home postfix/smtps/smtpd[402300]: lost connection after CONNECT from unknown[10.5.2.1]
> Feb 5 03:50:12 home postfix/smtps/smtpd[402300]: disconnect from unknown[10.5.2.1] commands=0/0

This could be a TLS wrappermode mismatch.

Port 587 (submission) should not use TLS wrappermode.
Port 465 (smtps) should use TLS wrappermode.
Port 25 (smtp) should not use TLS wrappermode.

Either the client or the server got this wrong.

Wietse
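The port conventions in this reply can be checked mechanically against a master.cf file. The following is only a minimal sketch: the function names are hypothetical, it handles the simple "-o name=value" form only (not the "-o { name = value }" form), and it assumes main.cf leaves smtpd_tls_wrappermode at its default of "no".

```python
# Conventions quoted in the reply above: only port 465 (smtps) uses wrappermode.
EXPECTED_WRAPPERMODE = {
    "smtp": False, "25": False,
    "submission": False, "587": False,
    "smtps": True, "465": True,
}


def _options(tokens):
    """Collect '-o name=value' pairs from whitespace-split tokens."""
    opts = {}
    it = iter(tokens)
    for tok in it:
        if tok == "-o":
            name, _, value = next(it, "").partition("=")
            if name:
                opts[name] = value
    return opts


def parse_master_cf(text):
    """Return {service_name: options} for the inet services in master.cf text."""
    services = {}
    current = None
    for raw in text.splitlines():
        if not raw.strip() or raw.lstrip().startswith("#"):
            continue
        if raw[0].isspace():
            # Indented lines continue the previous service entry.
            if current is not None:
                current.update(_options(raw.split()))
            continue
        fields = raw.split()
        if len(fields) >= 8 and fields[1] == "inet":
            current = services.setdefault(fields[0], {})
            current.update(_options(fields[8:]))
        else:
            current = None
    return services


def wrappermode_mismatches(text, default="no"):
    """List services whose wrappermode setting contradicts the port convention."""
    mismatches = []
    for name, opts in parse_master_cf(text).items():
        port = name.rsplit(":", 1)[-1]  # also handles "127.0.0.1:465" style names
        expected = EXPECTED_WRAPPERMODE.get(port)
        if expected is None:
            continue
        actual = opts.get("smtpd_tls_wrappermode", default) == "yes"
        if actual != expected:
            mismatches.append(name)
    return mismatches
```

A server-side check like this cannot tell whether the client is the side that got it wrong; it only rules out one of the two possibilities named above.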
Re: SSL_accept error from unknown[10.5.2.1]: lost connection
On Tue, Feb 07, 2023 at 05:59:52PM +0100, Wolfgang Paul Rauchholz wrote:

> Feb 5 03:50:12 home postfix/smtps/smtpd[402300]:
>   SSL_accept error from unknown[10.5.2.1]: lost connection
> Feb 5 03:50:12 home postfix/smtps/smtpd[402300]:
>   lost connection after CONNECT from unknown[10.5.2.1]
> Feb 5 03:50:12 home postfix/smtps/smtpd[402300]:
>   disconnect from unknown[10.5.2.1] commands=0/0

Something (was the address actually 10.5.2.1, or did you replace it for "privacy"?) connected to the port 465 implicit TLS submission service, and probably disconnected without even initiating an SSL handshake.

Are any of your authorised devices or users having problems sending email? If not, you don't have a problem, except perhaps that the connection is coming from a private IP address in 10.0.0.0/8, which may be a configuration issue on your firewall: external IPs should not be changed in transit. Of course this could also be a source that is internal to your network.

> I do have letsencrypt certificates that seem to be ok (domain
> wo-lar.com) I added this to the /main.cf config file.

These are likely irrelevant.

> Where do I need to start looking? Thanks for your insights.

Is there an actual problem? Unless you're concerned about unexpected connection attempts from a seemingly internal IP, there's nothing to worry about. Port scans and TLS scans are a fact of life on the internet. Some (like my DANE survey[1]) are even for the public good, rather than malicious.

--
    Viktor.

[1] https://stats.dnssec-tools.org/
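Whether a logged client address falls in private address space, as 10.5.2.1 does, can be checked with Python's standard ipaddress module. This is a small sketch; the helper names and the regular expression are assumptions made for illustration, not part of Postfix.

```python
import ipaddress
import re

# Matches the "from name[address]" part of a Postfix smtpd log line.
CLIENT_RE = re.compile(r"from \S+?\[([0-9A-Fa-f:.]+)\]")


def client_address(log_line):
    """Return the client address found in a log line, or None if absent."""
    match = CLIENT_RE.search(log_line)
    return ipaddress.ip_address(match.group(1)) if match else None


def is_internal(log_line):
    """True when the logged client address is in private address space."""
    addr = client_address(log_line)
    return addr is not None and addr.is_private
```

A private source address on an Internet-facing service suggests either an internal client or a gateway that rewrites source addresses, which are the two cases mentioned above.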
SSL_accept error from unknown[10.5.2.1]: lost connection
Hello,

I run postfix (postfix-3.5.8-4.el8.x86_64) on my Rocky Linux 8.7 home server.
I set up postfix and dovecot as a first step and it seems to be working, meaning I can send and receive mail (I sent and returned mail from a gmail account).
But I find these error messages in /var/log/maillog and, after researching and making changes, cannot fix them.
I searched the web and there are many different cases discussed, but...

Feb 5 03:50:12 home postfix/smtps/smtpd[402300]: SSL_accept error from unknown[10.5.2.1]: lost connection
Feb 5 03:50:12 home postfix/smtps/smtpd[402300]: lost connection after CONNECT from unknown[10.5.2.1]
Feb 5 03:50:12 home postfix/smtps/smtpd[402300]: disconnect from unknown[10.5.2.1] commands=0/0

I do have letsencrypt certificates that seem to be ok (domain wo-lar.com). I added this to the /main.cf config file:

    smtp_tls_security_level = may
    meta_directory = /etc/postfix
    shlib_directory = /usr/lib64/postfix

    # Log TLS connections
    smtpd_tls_loglevel = 1
    smtp_tls_loglevel = 1

    # Force TLSv1.3 or TLSv1.2
    smtpd_tls_mandatory_protocols = !SSLv2, !SSLv3, !TLSv1, !TLSv1.1
    smtpd_tls_protocols = !SSLv2, !SSLv3, !TLSv1, !TLSv1.1
    smtp_tls_mandatory_protocols = !SSLv2, !SSLv3, !TLSv1, !TLSv1.1
    smtp_tls_protocols = !SSLv2, !SSLv3, !TLSv1, !TLSv1.1

    # Postfix to deliver emails to local message store via the dovecot LMTP server
    mailbox_transport = lmtp:unix:private/dovecot-lmtp
    smtputf8_enable = no

Where do I need to start looking? Thanks for your insights.

Wolfgang Rauchholz
+34 627 994 977
https://www.linkedin.com/in/wolfgangrauchholz/
Re: TLS encryption fails: lost connection after STARTTLS from unknown[10.5.2.1]
On Thu, Jan 12, 2023 at 03:51:35PM +0100, Wolfgang Paul Rauchholz wrote:

> I am trying to find an error for the lost connection error. I tried
> several different sources but don't seem to make any progress.

What do you know about the SMTP client system?

> Thank you for pointing me into the right direction.
>
> Jan 12 14:01:02 home postfix/submission/smtpd[7046]: connect from unknown[10.5.2.1]
> Jan 12 14:01:02 home postfix/submission/smtpd[7046]:
>   Anonymous TLS connection established from unknown[10.5.2.1]:
>   TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)

The TLS handshake was completed successfully. The problem may not be TLS-related.

> Jan 12 14:01:02 home postfix/submission/smtpd[7046]:
>   lost connection after STARTTLS from unknown[10.5.2.1]
> Jan 12 14:01:02 home postfix/submission/smtpd[7046]:
>   disconnect from unknown[10.5.2.1] ehlo=1 starttls=1 commands=2

The submission client disconnects without sending any SMTP commands after the TLS handshake, not even EHLO to probe for SASL support or QUIT to disconnect "politely". Perhaps it did not like your certificate (unexpected hostname: DNS-ID SAN?...).

If you don't know who the client is, and they're not complaining, just ignore this. If the client user is known/complaining, get more technical details from the user: what error does the client software report?

--
    Viktor.
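The per-session summary on the "disconnect from" line is what shows that the client stopped right after the handshake. A short sketch of reading those counters follows; the function names are hypothetical, and it assumes the "name=accepted" and "name=accepted/total" forms seen in the logs quoted in these threads.

```python
import re

# One counter per command, e.g. "ehlo=1" or "commands=0/0" (accepted/total).
STAT_RE = re.compile(r"(\w+)=(\d+)(?:/(\d+))?")


def session_stats(log_line):
    """Return {command: (accepted, total)} from a 'disconnect from' line."""
    _, _, tail = log_line.partition("disconnect from ")
    stats = {}
    for name, accepted, total in STAT_RE.findall(tail):
        stats[name] = (int(accepted), int(total) if total else int(accepted))
    return stats


def stopped_after_starttls(log_line):
    """True when the session saw STARTTLS and nothing beyond EHLO/STARTTLS."""
    seen = set(session_stats(log_line)) - {"commands"}
    return "starttls" in seen and seen <= {"ehlo", "starttls"}
```

For the session quoted above this reports only the pre-TLS EHLO and STARTTLS, which matches the reading that the client went away without sending anything inside the encrypted channel.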
TLS encryption fails: lost connection after STARTTLS from unknown[10.5.2.1]
Hello.

I am trying to find an error for the lost connection error. I tried several different sources but don't seem to make any progress.
Thank you for pointing me into the right direction.

Jan 12 14:01:02 home postfix/submission/smtpd[7046]: connect from unknown[10.5.2.1]
Jan 12 14:01:02 home postfix/submission/smtpd[7046]: discarding EHLO keywords: CHUNKING
Jan 12 14:01:02 home postfix/submission/smtpd[7046]: Anonymous TLS connection established from unknown[10.5.2.1]: TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)
Jan 12 14:01:02 home postfix/submission/smtpd[7046]: lost connection after STARTTLS from unknown[10.5.2.1]
Jan 12 14:01:02 home postfix/submission/smtpd[7046]: disconnect from unknown[10.5.2.1] ehlo=1 starttls=1 commands=2

Some related config info: I run my server with letsencrypt certificates.

- certbot certonly -a apache --agree-tos --staple-ocsp --email -d

- smtpd_tls_cert_file, smtpd_tls_key_file and ssl_cert,

- Dovecot: ssl_key and ssl_cert are set

    service auth {
      unix_listener /var/spool/postfix/private/auth {
        mode = 0600
        user = postfix
        group = postfix
      }
    }

- submission inet n - y - - smtpd
    -o syslog_name=postfix/submission
    -o smtpd_tls_security_level=encrypt
    -o smtpd_tls_wrappermode=no
    -o smtpd_sasl_auth_enable=yes
    -o smtpd_relay_restrictions=permit_sasl_authenticated,reject
    -o smtpd_recipient_restrictions=permit_mynetworks,permit_sasl_authenticated,reject
    -o smtpd_sasl_type=dovecot
    -o smtpd_sasl_path=private/auth

- smtps inet n - y - - smtpd
    -o syslog_name=postfix/smtps
    -o smtpd_tls_wrappermode=yes
    -o smtpd_sasl_auth_enable=yes
    -o smtpd_relay_restrictions=permit_sasl_authenticated,reject
    -o smtpd_recipient_restrictions=permit_mynetworks,permit_sasl_authenticated,reject
    -o smtpd_sasl_type=dovecot
    -o smtpd_sasl_path=private/auth

- smtpd_tls_mandatory_protocols = !SSLv2, !SSLv3, !TLSv1, !TLSv1.1
  smtpd_tls_protocols = !SSLv2, !SSLv3, !TLSv1, !TLSv1.1
  smtp_tls_mandatory_protocols = !SSLv2, !SSLv3, !TLSv1, !TLSv1.1
  smtp_tls_protocols = !SSLv2, !SSLv3, !TLSv1, !TLSv1.1

- [root@home home.wo-lar.com]# ss -lnpt | grep master
  LISTEN 0 100 0.0.0.0:25  0.0.0.0:* users:(("master",pid=6985,fd=16))
  LISTEN 0 100 0.0.0.0:587 0.0.0.0:* users:(("master",pid=6985,fd=20))
  LISTEN 0 100 0.0.0.0:465 0.0.0.0:* users:(("master",pid=6985,fd=23))

- LISTEN 0 100 0.0.0.0:993 0.0.0.0:* users:(("dovecot",pid=7000,fd=45))
  LISTEN 0 100 0.0.0.0:995 0.0.0.0:* users:(("dovecot",pid=7000,fd=26))
  LISTEN 0 100 0.0.0.0:110 0.0.0.0:* users:(("dovecot",pid=7000,fd=24))
  LISTEN 0 100 0.0.0.0:143 0.0.0.0:* users:(("dovecot",pid=7000,fd=43))
  LISTEN 0 100 [::]:993 [::]:* users:(("dovecot",pid=7000,fd=46))
  LISTEN 0 100 [::]:995 [::]:* users:(("dovecot",pid=7000,fd=27))
  LISTEN 0 100 [::]:110 [::]:* users:(("dovecot",pid=7000,fd=25))
  LISTEN 0 100 [::]:143 [::]:* users:(("dovecot",pid=7000,fd=44))

Thanks for helping

Wolfgang Rauchholz
+34 627 994 977
https://www.linkedin.com/in/wolfgangrauchholz/
Re: lost connection after STARTTLS
On 12 Jun 2020, at 01:11, Fourhundred Thecat <400the...@gmx.ch> wrote:

> But, on the other hand, who is still sending plaintext these days?

Nearly everyone using STARTTLS? Someone who fails STARTTLS may then use SMTPS.

> And why can't legitimate client use reasonable ciphers?

Define legitimate clients.

I don't see clients failing to connect securely (because I do not allow any client to connect insecurely), nor do I get complaints about not being able to connect with certain clients.

Most of your log trawling seems like a waste of time to me. Configure postscreen and be done with it.

> I think my settings are not so strict. I believe am using
> recommendations from this mailing list:
>
> smtpd_tls_ciphers = medium
> smtpd_tls_protocols = !SSLv2, !SSLv3
> smtpd_tls_mandatory_protocols = !SSLv2, !SSLv3

Those are fine, but there are far more settings than that. The really important ones are in master.cf for submission and smtps.

--
"Are you pondering what I'm pondering?"
"I think so, Brain NARF, but don't camels spit a lot?"
Re: lost connection after STARTTLS
On Fri, 2020-06-12 at 09:11 +0200, Fourhundred Thecat wrote:
> > On 2020-06-12 08:57, Jeroen Geilman wrote:
> > - too many errors after .* from .*
> > - warning: non-SMTP command from .*
> >
> > While these do indicate badly-behaved clients, there is no reason
> > to assume evil intent.

The senior citizen who inadvertently drove faster than the allowed speed limit had no evil intent, but they still got a speeding ticket.

> > - reject: RCPT from .* Recipient address rejected: User unknown in
> > local recipient table; .*'
> >
> > This rejection is per-recipient; blocking this *client* because
> > they mis-typed a single address means you /will/ reject valid email
> > later on.
>
> OK, I see. But I am blocking for 1 hour only anyway.

That is very reasonable. A civil society does not send all citizens who have driven a bit above the limit to prison. For mild offenders the sentence is a small ticket. For more serious and repeat offenders there are temporary license suspensions, and only for the worst offenders are there life suspensions and sometimes jail.

Intent is still not part of the equation, and this is by design: proving intent (subjective!) is extremely difficult. In civil societies we do not want to make the mistake of jailing innocent people, which is why, to find intent (and thus criminal guilt), the standard of proof is "beyond reasonable doubt" (i.e. 100%), much higher than the standard to find strict liability (responsibility based on objective facts only, which is the case of the misconfigured client). Strict liability is easier to find at trial than criminal guilt, but carries much lighter sentences.

Eventually, your one-hour blocked client will learn from the rejection, or will be rightfully excluded from the network.

On the other hand, tolerating these badly-behaved clients is a slippery slope. If drivers get away with 10 km/h above the limit, next week they will try 15 km/h and next month 20 km/h, until there is no limit at all. But what was the purpose of the limit in the first place?

Sadly, after the twitterization of language, we are also witnessing the twitterization of decision-making. Centuries of wisdom are lost to the analphabetism of incompetents-in-chief who condense accusation, trial, verdict and sentencing into one tweet.

> And why can't legitimate client use reasonable ciphers?

This is exactly the question to answer. Reasonable depends on context and evolves over time. Maybe 50 years ago it was reasonable to tolerate drinking and driving. Today we know better. Maybe 40 years ago encryption was difficult to implement, and even nowadays there may be reasons not to. Some banks are sending alerts unencrypted, for scalability reasons.

> I think my settings are not so strict. I believe am using
> recommendations from this mailing list:
>
> smtpd_tls_ciphers = medium
> smtpd_tls_protocols = !SSLv2, !SSLv3
> smtpd_tls_mandatory_protocols = !SSLv2, !SSLv3

That depends on the sensitivity of the transmitted/received information and your overall protection strategy. If the information is encrypted with PGP or S/MIME, the choice of TLS becomes much less critical.

The test I recommend: if you are not comfortable putting the information on the back of a postcard, use PGP, and make sure that you are comfortable putting the PGP ciphertext on the back of a postcard. With that out of the way, the choice of encryption is much less critical.

Have a look at https://ssl-tools.net/mailservers

Generally, weak encryption is better than no encryption.

!SSLv2 is very sensible because SSLv2 is so bad that it can be used to attack RSA keys and sites with the same name even if they are on entirely different servers.
https://drownattack.com/

If SSLv2 is used, an attacker could use a couple of seemingly innocent encrypted packets to/from the SMTP server to gain access to the private key and attack anything else that is protected by that key. What were those GET requests again?

SSLv3 is "only" weak when used with SMTP. The POODLE vulnerability is specific to HTTP. It may or may not have been applied against SMTP.
https://en.wikipedia.org/wiki/POODLE

You will have to balance your choice for compatibility with your correspondents, and there is no absolute right or wrong answer. Sometimes no encryption is better than bad encryption.

HTH

--
Yuval Levy, JD, MBA, CFA
Ontario-licensed lawyer
Re: lost connection after STARTTLS
> On 2020-06-12 08:57, Jeroen Geilman wrote:
>> - too many errors after .* from .*
>> - warning: non-SMTP command from .*
>
> While these do indicate badly-behaved clients, there is no reason to
> assume evil intent.

Who would send a non-SMTP command to a mail server? I usually see commands such as GET /

>> - reject: RCPT from .* Recipient address rejected: User unknown in
>> local recipient table; .*'
>
> This rejection is per-recipient; blocking this *client* because they
> mis-typed a single address means you /will/ reject valid email later on.

OK, I see. But I am blocking for 1 hour only anyway.

>> - lost connection after STARTTLS
>
> What if the client could not match the server version or ciphers, and
> has to disconnect to try plain SMTP again? There is no down-step after
> STARTTLS.

OK, I see.

But, on the other hand, who is still sending plaintext these days?

And why can't a legitimate client use reasonable ciphers?

I think my settings are not so strict. I believe I am using recommendations from this mailing list:

    smtpd_tls_ciphers = medium
    smtpd_tls_protocols = !SSLv2, !SSLv3
    smtpd_tls_mandatory_protocols = !SSLv2, !SSLv3
lost connection after STARTTLS
Hello,

I am parsing mail logs and banning offending IP addresses. Mostly I match patterns such as:

    too many errors after .* from .*
    warning: non-SMTP command from .*
    reject: RCPT from .* Recipient address rejected: User unknown in local recipient table; .*'

I think it is safe to block IPs based on the above examples. These errors clearly indicate evil intent.

I also see many errors such as:

    lost connection after STARTTLS

Is it safe to block on this message as well, or can this happen to a legitimate client?

In other words, in what situation would a legitimate client generate "lost connection after STARTTLS"?

thanks,
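The replies in this thread point out that a rejected recipient or a dropped connection after STARTTLS can come from a legitimate client. One way to reflect that in a log-based blocker is to act on such events only when they repeat. This is a rough sketch rather than a recommendation: which pattern belongs in which list is a policy choice, and the threshold, list names and function name are all hypothetical.

```python
import re
from collections import Counter

# Matches the "from name[address]" part of a Postfix smtpd log line.
CLIENT_RE = re.compile(r"from \S+?\[([0-9A-Fa-f:.]+)\]")

# Patterns the original poster already blocks on a single hit.
HARD = [re.compile(p) for p in (
    r"too many errors after \S+ from ",
    r"warning: non-SMTP command from ",
)]

# Patterns that a legitimate client can trigger; counted, not blocked at once.
SOFT = [re.compile(p) for p in (
    r"Recipient address rejected: User unknown",
    r"lost connection after STARTTLS from ",
)]


def addresses_to_ban(lines, soft_threshold=5):
    """Return the client addresses that cross the (hypothetical) policy above."""
    soft_hits = Counter()
    banned = set()
    for line in lines:
        match = CLIENT_RE.search(line)
        if not match:
            continue
        address = match.group(1)
        if any(p.search(line) for p in HARD):
            banned.add(address)
        elif any(p.search(line) for p in SOFT):
            soft_hits[address] += 1
            if soft_hits[address] >= soft_threshold:
                banned.add(address)
    return banned
```

A single "lost connection after STARTTLS" then costs the client nothing, while a client that produces it again and again within the scanned window is still caught.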
Re: lost connection after HELO
On Thu, 28 May 2020 09:59:56 +0200 Matus UHLAR - fantomas wrote:

> On 28.05.20 09:36, Enrico Morelli wrote:
> >I've an UPS that should send me email in case of problems. The email
> >do not arrive because in the log I see "lost connection after HELO".
> >
> >I added debug_peer_list to my main.cf to debug the ups connection. Is
> >there a way to solve the problem?
>
> >May 28 09:13:15 genio postfix/smtpd[31295]: < ups-ced.domain.net[192.168.145.19]: EHLO
> >May 28 09:13:15 genio postfix/smtpd[31295]: > ups-ced.domain.net[192.168.145.19]: 501 Syntax: EHLO hostname
> >May 28 09:13:15 genio postfix/smtpd[31295]: watchdog_pat: 0x558d6b58d9f0
> >May 28 09:13:15 genio postfix/smtpd[31295]: < ups-ced.domain.net[192.168.145.19]: HELO
> >May 28 09:13:15 genio postfix/smtpd[31295]: > ups-ced.domain.net[192.168.145.19]: 501 Syntax: HELO hostname
> >May 28 09:13:15 genio postfix/smtpd[31295]: watchdog_pat: 0x558d6b58d9f0
> >May 28 09:13:15 genio postfix/smtpd[31295]: smtp_get: EOF
>
> looks like your UPS does not provide hostname in EHLO/HELO message,
> which postfix doesn't accept.
> If you can't set up a hostname on your UPS, you'll have to accept such
> invalid helo, perhaps as described on:
> https://www.claudiokuenzler.com/blog/664/force-postfix-allow-empty-null-helo-ehlo-smtp-commands
>
> it would be better only to accept such helo from IP of the UPS, if
> possible

Thank you, the filter works. Now the mail is blocked by amavis, but this is another problem. My UPS is very strange :-))

--
-----------------------------------------------------------
  Enrico Morelli
  System Administrator | Programmer | Web Developer

  CERM - Polo Scientifico
  via Sacconi, 6 - 50019 Sesto Fiorentino (FI) - ITALY
Re: lost connection after HELO
On 28.05.20 09:36, Enrico Morelli wrote:
>I've an UPS that should send me email in case of problems. The email
>do not arrive because in the log I see "lost connection after HELO".
>
>I added debug_peer_list to my main.cf to debug the ups connection. Is
>there a way to solve the problem?

>May 28 09:13:15 genio postfix/smtpd[31295]: < ups-ced.domain.net[192.168.145.19]: EHLO
>May 28 09:13:15 genio postfix/smtpd[31295]: > ups-ced.domain.net[192.168.145.19]: 501 Syntax: EHLO hostname
>May 28 09:13:15 genio postfix/smtpd[31295]: watchdog_pat: 0x558d6b58d9f0
>May 28 09:13:15 genio postfix/smtpd[31295]: < ups-ced.domain.net[192.168.145.19]: HELO
>May 28 09:13:15 genio postfix/smtpd[31295]: > ups-ced.domain.net[192.168.145.19]: 501 Syntax: HELO hostname
>May 28 09:13:15 genio postfix/smtpd[31295]: watchdog_pat: 0x558d6b58d9f0
>May 28 09:13:15 genio postfix/smtpd[31295]: smtp_get: EOF

It looks like your UPS does not provide a hostname in the EHLO/HELO message, which postfix doesn't accept.
If you can't set up a hostname on your UPS, you'll have to accept such an invalid helo, perhaps as described on:
https://www.claudiokuenzler.com/blog/664/force-postfix-allow-empty-null-helo-ehlo-smtp-commands

It would be better to accept such a helo only from the IP of the UPS, if possible.

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
On the other hand, you have different fingers.
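The diagnosis above rests on two debug-log lines where the client sends EHLO and HELO with no argument. Spotting that in a long debug_peer_list trace can be scripted; this is a small sketch, and the regular expression and function name are illustrative assumptions about the "< client[address]: command" form shown in the quoted log.

```python
import re

# "< host[addr]: EHLO" with nothing after the verb, as in the quoted debug log.
EMPTY_GREETING_RE = re.compile(r"< (\S+?)\[([^\]]+)\]: (EHLO|HELO)\s*$")


def empty_greeting(log_line):
    """Return (host, address, verb) when a client greets without a hostname."""
    match = EMPTY_GREETING_RE.search(log_line)
    return match.groups() if match else None
```

Run over the trace from the UPS, this flags both the EHLO and the HELO attempt, each of which Postfix answered with "501 Syntax".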
lost connection after HELO
Dear,

I've an UPS that should send me email in case of problems. The email do not arrive because in the log I see "lost connection after HELO".

I added debug_peer_list to my main.cf to debug the ups connection. Is there a way to solve the problem?

May 28 09:13:15 genio postfix/postscreen[31258]: CONNECT from [192.168.145.19]:51917 to [192.168.146.39]:25
May 28 09:13:15 genio postfix/postscreen[31258]: WHITELISTED [192.168.145.19]:51917
May 28 09:13:15 genio postfix/smtpd[31295]: connect from ups-ced.domain.net[192.168.145.19]
May 28 09:13:15 genio postfix/smtpd[31295]: smtp_stream_setup: maxtime=300 enable_deadline=0
May 28 09:13:15 genio postfix/smtpd[31295]: match_hostname: smtpd_client_event_limit_exceptions: ups-ced.domain.net ~? 127.0.0.0/8
May 28 09:13:15 genio postfix/smtpd[31295]: match_hostaddr: smtpd_client_event_limit_exceptions: 192.168.145.19 ~? 127.0.0.0/8
May 28 09:13:15 genio postfix/smtpd[31295]: match_hostname: smtpd_client_event_limit_exceptions: ups-ced.domain.net ~? [:::127.0.0.0]/104
May 28 09:13:15 genio postfix/smtpd[31295]: match_hostname: smtpd_client_event_limit_exceptions: ups-ced.domain.net ~? [::1]/128
May 28 09:13:15 genio postfix/smtpd[31295]: match_hostaddr: smtpd_client_event_limit_exceptions: 192.168.145.19 ~? [::1]/128
May 28 09:13:15 genio postfix/smtpd[31295]: match_hostname: smtpd_client_event_limit_exceptions: ups-ced.domain.net ~? 192.168.145.0/24
May 28 09:13:15 genio postfix/smtpd[31295]: match_hostaddr: smtpd_client_event_limit_exceptions: 192.168.145.19 ~? 192.168.145.0/24
May 28 09:13:15 genio postfix/smtpd[31295]: > ups-ced.domain.net[192.168.145.19]: 220 genio.domain.net ESMTP Postfix (Debian/GNU)
May 28 09:13:15 genio postfix/smtpd[31295]: xsasl_dovecot_server_create: SASL service=smtp, realm=(null)
May 28 09:13:15 genio postfix/smtpd[31295]: name_mask: noanonymous
May 28 09:13:15 genio postfix/smtpd[31295]: xsasl_dovecot_server_connect: Connecting
May 28 09:13:15 genio postfix/smtpd[31295]: xsasl_dovecot_server_connect: auth reply: VERSION?1?2
May 28 09:13:15 genio postfix/smtpd[31295]: xsasl_dovecot_server_connect: auth reply: MECH?PLAIN?plaintext
May 28 09:13:15 genio postfix/smtpd[31295]: name_mask: plaintext
May 28 09:13:15 genio postfix/smtpd[31295]: xsasl_dovecot_server_connect: auth reply: MECH?DIGEST-MD5?dictionary?active?mutual-auth
May 28 09:13:15 genio postfix/smtpd[31295]: name_mask: dictionary
May 28 09:13:15 genio postfix/smtpd[31295]: name_mask: active
May 28 09:13:15 genio postfix/smtpd[31295]: name_mask: mutual-auth
May 28 09:13:15 genio postfix/smtpd[31295]: xsasl_dovecot_server_connect: auth reply MECH?CRAM-MD5?dictionary?active
May 28 09:13:15 genio postfix/smtpd[31295]: name_mask: dictionary
May 28 09:13:15 genio postfix/smtpd[31295]: name_mask: active
May 28 09:13:15 genio postfix/smtpd[31295]: xsasl_dovecot_server_connect: auth reply: MECH?LOGIN?plaintext
May 28 09:13:15 genio postfix/smtpd[31295]: name_mask: plaintext
May 28 09:13:15 genio postfix/smtpd[31295]: xsasl_dovecot_server_connect: auth reply: SPID?22734
May 28 09:13:15 genio postfix/smtpd[31295]: xsasl_dovecot_server_connect: auth reply: COOKIE?30c2eef14595100fc106f0be90d07d16
May 28 09:13:15 genio postfix/smtpd[31295]: xsasl_dovecot_server_connect: auth reply: DONE
May 28 09:13:15 genio postfix/smtpd[31295]: xsasl_dovecot_server_mech_filter: keep mechanism: PLAIN
May 28 09:13:15 genio postfix/smtpd[31295]: xsasl_dovecot_server_mech_filter: keep mechanism: DIGEST-MD5
May 28 09:13:15 genio postfix/smtpd[31295]: xsasl_dovecot_server_mech_filter: keep mechanism: CRAM-MD5
May 28 09:13:15 genio postfix/smtpd[31295]: xsasl_dovecot_server_mech_filter: keep mechanism: LOGIN
May 28 09:13:15 genio postfix/smtpd[31295]: < ups-ced.domain.net[192.168.145.19]: EHLO
May 28 09:13:15 genio postfix/smtpd[31295]: > ups-ced.domain.net[192.168.145.19]: 501 Syntax: EHLO hostname
May 28 09:13:15 genio postfix/smtpd[31295]: watchdog_pat: 0x558d6b58d9f0
May 28 09:13:15 genio postfix/smtpd[31295]: < ups-ced.domain.net[192.168.145.19]: HELO
May 28 09:13:15 genio postfix/smtpd[31295]: > ups-ced.domain.net[192.168.145.19]: 501 Syntax: HELO hostname
May 28 09:13:15 genio postfix/smtpd[31295]: watchdog_pat: 0x558d6b58d9f0
May 28 09:13:15 genio postfix/smtpd[31295]: smtp_get: EOF
May 28 09:13:15 genio postfix/smtpd[31295]: match_hostname: smtpd_client_event_limit_exceptions: ups-ced.domain.net ~? 127.0.0.0/8
May 28 09:13:15 genio postfix/smtpd[31295]: match_hostaddr: smtpd_client_event_limit_exceptions: 192.168.145.19 ~? 127.0.0.0/8
May 28 09:13:15 genio postfix/smtpd[31295]: match_hostaddr: smtpd_client_event_limit_exceptions: 192.168.145.19 ~? [:::127.0.0.0]/104
May 28 09:13:15 genio postfix/smtpd[31295]: match_hostname: smtpd_client_event_limit_exceptions: ups-ced.domain.net ~? [::1]/128
May 28 09:13:15 genio postfix/smtpd[31295]: match_hostaddr: smtpd_client_event_limit_exceptions: 192.168.145.19 ~? [::1]/128
May 28 09:13:15
Re: lost connection after STARTTLS from localhost[127.0.0.1]
On Mon, Dec 17, 2018 at 01:28:56AM -0700, wp.rauchholz wrote:

> I am trying to get a webmail client up and running. It works fine w/o
> security settings. But when I try to implement STARTTLS on port 587 I lose
> connection to localhost as described in Subject.

Note that the "lost connection to localhost" is an issue when sending email, while TLS on port 587 is inbound email, only tangentially related to the reported problem. STARTTLS on ports 25 and 587 is working just fine for your domain.

> smtp-amavis unix - - n - 2 smtp
>     -o smtp_data_done_timeout=1200
>     -o smtp_send_xforward_command=yes
>     -o disable_dns_lookups=yes

Here you may want to also add:

    -o smtp_tls_security_level=none

> postconf -n
> content_filter = smtp-amavis:[127.0.0.1]:10024

Your amavis content filter is on localhost, and may not handle STARTTLS correctly.

> smtp_tls_security_level = may

But you try TLS if offered. You can also disable TLS in the port 10025 post-filter service:

> 127.0.0.1:10025 inet n - n - - smtpd
>     -o content_filter=
>     -o local_recipient_maps=
>     -o relay_recipient_maps=
>     -o smtpd_restriction_classes=
>     -o smtpd_client_restrictions=
>     -o smtpd_helo_restrictions=
>     -o smtpd_sender_restrictions=
>     -o smtpd_recipient_restrictions=permit_mynetworks,reject
>     -o mynetworks=127.0.0.0/8
>     -o strict_rfc821_envelopes=yes
>     -o smtpd_error_sleep_time=0
>     -o smtpd_soft_error_limit=1001
>     -o smtpd_hard_error_limit=1000

by adding:

    -o smtpd_tls_security_level=none

There's no need for TLS on the loopback interface except in the unlikely case that you're authenticating to an LMTP server with client certificates, or the loopback SMTP service is actually a TCP tunnel to a remote destination.

--
    Viktor.
Re: lost connection after STARTTLS from localhost[127.0.0.1]
Wolfgang Paul Rauchholz wrote on 2018-12-17 13:06:
> Unfortunately not. I am using roundcubemail, there is nothing in the log
> files.

    $config['default_host'] = 'ssl://localhost';
    $config['default_port'] = 993;
    $config['imap_conn_options'] = array(
        'ssl' => array(
            'verify_peer' => false,
            'verify_peer_name' => false,
        ),
    );
    $config['smtp_server'] = 'ssl://localhost';
    $config['smtp_port'] = 465;
    $config['smtp_conn_options'] = array(
        'ssl' => array(
            'verify_peer' => false,
            'verify_peer_name' => false,
        ),
    );
    $config['smtp_user'] = '%u';
    $config['smtp_pass'] = '%p';
    $config['smtp_helo_host'] = 'localhost.example.org';
    $config['smtp_log'] = false;

works for me :=)
Re: lost connection after STARTTLS from localhost[127.0.0.1]
Unfortunately not. I am using roundcubemail, there is nothing in the log files.

Wolfgang

On Mon, Dec 17, 2018 at 12:48 PM Wietse Venema wrote:
> wp.rauchholz:
> > Good day.
> >
> > I am trying to get a webmail client up and running. It works fine w/o
> > security settings. But when I try to implement STARTTLS on port 587 I
> > lose connection to localhost as described in Subject.
>
> Does the webmail client provide any clues about why it is hanging up?
>
> Wietse

--
Wolfgang Rauchholz
Re: lost connection after STARTTLS from localhost[127.0.0.1]
wp.rauchholz:
> Good day.
>
> I am trying to get a webmail client up and running. It works fine w/o
> security settings. But when I try to implement STARTTLS on port 587 I lose
> connection to localhost as described in Subject.

Does the webmail client provide any clues about why it is hanging up?

Wietse
Re: lost connection after data
Selcuk Yazar:
> postfix/smtpd[6055]: lost connection after DATA (3865 bytes) from
> mx2.iparadigms.com[199.47.85.44]

Possible cause:

- Broken WSCALE (window scaling).
  https://en.wikipedia.org/wiki/TCP_window_scale_option

Less likely, because the failure happened after 3865 bytes:

- Broken MTU (maximum transmission unit).
  https://en.wikipedia.org/wiki/Maximum_transmission_unit

Or something else :-) Only a network sniffer recording can tell.

Wietse
lost connection after data
Hi,

Our users try to get e-mail from turnitin.com, but we get an error like the one below. It seems to be the same IP address with different MX records. How can we resolve this?

Thanks in advance.

postfix/policy-spf[10024]: : Policy action=PREPEND Received-SPF: pass (turnitin.com: 199.47.85.44 is authorized to use 'nore...@turnitin.com' in 'mfrom' identity (mechanism 'ip4:199.47.80.0/21' matched)) receiver=mail.trakya.edu.tr; identity=mailfrom; envelope-from="nore...@turnitin.com"; helo=mx2.turnitin.com; client-ip=199.47.85.44

host=199.47.85.44, helo=mx2.turnitin.com

postfix/smtpd[6055]: lost connection after DATA (3865 bytes) from mx2.iparadigms.com[199.47.85.44]

dig mx2.turnitin.com

;; ANSWER SECTION:
mx2.turnitin.com.    900    IN A  199.47.85.44

;; AUTHORITY SECTION:
turnitin.com.        14182  IN NS ns-1415.awsdns-48.org.
turnitin.com.        14182  IN NS ns-58.awsdns-07.com.
turnitin.com.        14182  IN NS ns-1607.awsdns-08.co.uk.
turnitin.com.        14182  IN NS ns-721.awsdns-26.net.

dig mx2.iparadigms.com

;; ANSWER SECTION:
mx2.iparadigms.com.  900    IN A  199.47.85.44

;; AUTHORITY SECTION:
iparadigms.com.      153342 IN NS ns-1419.awsdns-49.org.
iparadigms.com.      153342 IN NS ns-1801.awsdns-33.co.uk.
iparadigms.com.      153342 IN NS ns-644.awsdns-16.net.
iparadigms.com.      153342 IN NS ns-323.awsdns-40.com.

--
Selçuk YAZAR
http://www.selcukyazar.blogspot.com
Re: lost connection while sending end of data
Christos Chatzaras:
> I use dovecot lmtp, dovecot quota plugin and postfix.
>
> When I send e-mail to 2 recipients (or more) at the same time and if one of
> them is over quota (or under quota and the message I send is bigger than his
> free space) mailq shows:
>
> -Queue ID- --Size-- Arrival Time -Sender/Recipient---
> 20B03336F2226099 Thu Apr 19 18:02:47 supp...@example.com
> (lost connection with server25.example.org[private/dovecot-lmtp] while
> sending end of data -- message may be sent more than once)
> us...@example.com
> us...@example.com
>
> E-mails sent from the same domain on same server so it's a local delivery.
>
> If I send the e-mail to the over quota user ( only him on To: ) then I get a
> bounce that says that user is over quota which is the correct behaviour.
>
> By changing the postfix main.cf setting from the default:
>
> default_destination_recipient_limit = 50
>
> to:
>
> default_destination_recipient_limit = 1
>
> it solves the issue.
>
> Is this a known bug?

Please see the instructions in the mailing list welcome message.

Wietse
lost connection while sending end of data
I use dovecot lmtp, dovecot quota plugin and postfix. When I send e-mail to 2 recipients (or more) at the same time and if one of them is over quota (or under quota and the message I send is bigger than his free space) mailq shows: -Queue ID- --Size-- Arrival Time -Sender/Recipient--- 20B03336F2226099 Thu Apr 19 18:02:47 supp...@example.com (lost connection with server25.example.org[private/dovecot-lmtp] while sending end of data -- message may be sent more than once) us...@example.com us...@example.com E-mails sent from the same domain on same server so it's a local delivery. If I send the e-mail to the over quota user ( only him on To: ) then I get a bounce that says that user is over quota which is the correct behaviour. By changing the postfix main.cf setting from the default: default_destination_recipient_limit = 50 to: default_destination_recipient_limit = 1 it solves the issue. Is this a known bug?
Re: Postfix lost connection after EHLO from neon.domain.com
On 2018-02-08 (22:43 MST), motty cruz wrote: > > match_hostname: smtpd_client_event_limit_exceptions: neon.domain.com ~? > 189.45.22.55 Please post the output of postconf -n. What do you have smtpd_client_event_limit_exceptions set to, and why? Also, I don't believe for a second that domain.com is connecting to you. Please do not make up domains for your logs. Use example.com, example.net, example.org or, if you must, domain.tld or something like that (I like using .tld myself, but best practice is to use example.com/net/org). -- Living is easy with eyes closed, misunderstanding all you see
Re: Postfix lost connection after EHLO from neon.domain.com
Dr. Wietse, Thank you very much for taking the time to reply to my email. I enabled TLS on Postfix with a certificate from letsencrypt.com as a temporary solution. This solved the problem; we're now able to receive emails from that specific client. Your support on this matter is appreciated! Thanks, Motty On 2/9/2018 11:45 AM, Wietse Venema wrote: Bastian Blank: On Thu, Feb 08, 2018 at 09:43:51PM -0800, motty cruz wrote: I am trying to figure out why my Postfix disconnect after EHLO command. A customer is trying to email me something but Postfix disconnect: ( on the customer side this is the bounced message "Remote Server returned '< spring1.mydomain.com #5.0.0 smtp; 554 Security violation. Email Session ID:" ) This is no Postfix messages. According to Google this is some MtM device. You need to find out why *THEIR* firewall is refusing to deliver mail. Wietse
Re: Postfix lost connection after EHLO from neon.domain.com
Bastian Blank: > On Thu, Feb 08, 2018 at 09:43:51PM -0800, motty cruz wrote: > > I am trying to figure out why my Postfix disconnect after EHLO command. A > > customer is trying to email me something but Postfix disconnect: ( on the > > customer side this is the bounced message "Remote Server returned '< > > spring1.mydomain.com #5.0.0 smtp; 554 Security violation. Email Session > > ID:" ) > > This is no Postfix messages. According to Google this is some MtM > device. You need to find out why *THEIR* firewall is refusing to deliver mail. Wietse
Re: Postfix lost connection after EHLO from neon.domain.com
On 09-02-18 18:35, Motty Cruz wrote: If you have any other ideas please share, I appreciate your help! You could try getting a packet trace on your end. It might show you in more detail what is going on. Worst case you learn nothing new. regards, Erik
Re: Postfix lost connection after EHLO from neon.domain.com
Thanks Bill, The customer is from a fairly large company and they're able to send email to other clients. They will not cooperate to help me troubleshoot this issue. I am working from the assumption that the problem is on my side. We were getting emails from that client up to a few weeks ago. Nothing has changed on my side. I have configured Postfix to handle TLS, but I'm not sure whether it will fix the error they're having. This issue is tormenting me! I'm not sure what else to try to prove it is on their side. If you have any other ideas please share, I appreciate your help! Thanks for your support! On 2/9/2018 9:25 AM, Bill Cole wrote: One more thing... On 9 Feb 2018, at 9:09, Motty Cruz wrote: Isn't because my smtp server does not support TLS? Yes, it could be. Their broken firewall may be set to require TLS support. Which is not in itself a bad thing. The only thing broken about this IF it's because they require TLS is the way they are disconnecting. Not supporting TLS for incoming email is not a rational choice in the modern world.
Re: Postfix lost connection after EHLO from neon.domain.com
One more thing... On 9 Feb 2018, at 9:09, Motty Cruz wrote: Isn't because my smtp server does not support TLS? Yes, it could be. Their broken firewall may be set to require TLS support. Which is not in itself a bad thing. The only thing broken about this IF it's because they require TLS is the way they are disconnecting. Not supporting TLS for incoming email is not a rational choice in the modern world.
Re: Postfix lost connection after EHLO from neon.domain.com
On 9 Feb 2018, at 9:09, Motty Cruz wrote: Hello Bastian, you're right " ( on the customer side this is the bounced message "Remote Server returned '< spring1.mydomain.com #5.0.0 smtp; 554 Security violation. Email Session ID:" ) This is a message generated by a dysfunctional and misguided option in a firewall. The sender is having their SMTP session hijacked by that firewall and mishandled because the firewall manufacturer doesn't understand SMTP adequately to function without breaking connections carelessly and for no good reason. Isn't because my smtp server does not support TLS? or do you have any idea how to solve this problem? is driving me to the cliff. The sender needs to fix their firewall.
Re: Postfix lost connection after EHLO from neon.domain.com
Hello Bastian, you're right: ( on the customer side this is the bounced message "Remote Server returned '< spring1.mydomain.com #5.0.0 smtp; 554 Security violation. Email Session ID:" ) is the response of the remote server (the SMTP server of the person submitting the email), but the log below is from my spam filter:
Feb 8 09:46:04 spring1 postfix/smtpd[47824]: > neon.domain.com[189.45.22.55]: 250 SMTPUTF8
Feb 8 09:46:04 spring1 postfix/smtpd[47824]: match_hostname: smtpd_client_event_limit_exceptions: neon.domain.com ~? 189.45.22.55
Feb 8 09:46:04 spring1 postfix/smtpd[47824]: lost connection after EHLO from neon.domain.com[189.45.22.55]
Isn't it because my SMTP server does not support TLS? Or do you have any idea how to solve this problem? It is driving me to the cliff. _Motty On 2/8/2018 10:18 PM, Bastian Blank wrote: On Thu, Feb 08, 2018 at 09:43:51PM -0800, motty cruz wrote: I am trying to figure out why my Postfix disconnect after EHLO command. A customer is trying to email me something but Postfix disconnect: ( on the customer side this is the bounced message "Remote Server returned '< spring1.mydomain.com #5.0.0 smtp; 554 Security violation. Email Session ID:" ) This is no Postfix messages. According to Google this is some MtM device. Feb 8 09:46:03 spring1 postfix/smtpd[47824]: connect from neon.domain.com Verbose logging is not needed, it just drowns you. Feb 8 09:46:04 spring1 postfix/smtpd[47824]: lost connection after EHLO from neon.domain.com[189.45.22.55] You really know someone owning domain.com? Bastian
Re: Postfix lost connection after EHLO from neon.domain.com
On Thu, Feb 08, 2018 at 09:43:51PM -0800, motty cruz wrote: > I am trying to figure out why my Postfix disconnect after EHLO command. A > customer is trying to email me something but Postfix disconnect: ( on the > customer side this is the bounced message "Remote Server returned '< > spring1.mydomain.com #5.0.0 smtp; 554 Security violation. Email Session > ID:" ) This is no Postfix messages. According to Google this is some MtM device. > Feb 8 09:46:03 spring1 postfix/smtpd[47824]: connect from neon.domain.com Verbose logging is not needed, it just drowns you. > Feb 8 09:46:04 spring1 postfix/smtpd[47824]: lost connection after EHLO > from neon.domain.com[189.45.22.55] You really know someone owning domain.com? Bastian -- Peace was the way. -- Kirk, "The City on the Edge of Forever", stardate unknown
Postfix lost connection after EHLO from neon.domain.com
Hello, I am trying to figure out why my Postfix disconnects after the EHLO command. A customer is trying to email me something but Postfix disconnects: ( on the customer side this is the bounced message "Remote Server returned '< spring1.mydomain.com #5.0.0 smtp; 554 Security violation. Email Session ID:" ) Your help is appreciated!
Feb 8 09:46:03 spring1 postfix/smtpd[47824]: connect from neon.domain.com [189.45.22.55]
Feb 8 09:46:03 spring1 postfix/smtpd[47824]: match_hostname: smtpd_client_event_limit_exceptions: neon.domain.com ~? 189.45.22.55
Feb 8 09:46:03 spring1 postfix/smtpd[47824]: > neon.domain.com[189.45.22.55]: 220 spring1.mydomain
Feb 8 09:46:04 spring1 postfix/smtpd[47824]: < neon.domain.com[189.45.22.55]: EHLO neon.domain.com
Feb 8 09:46:04 spring1 postfix/smtpd[47824]: match_list_match: neon.domain.com: no match
Feb 8 09:46:04 spring1 postfix/smtpd[47824]: > neon.domain.com[189.45.22.55]: 250-spring1.mydomain
Feb 8 09:46:04 spring1 postfix/smtpd[47824]: > neon.domain.com[189.45.22.55]: 250-PIPELINING
Feb 8 09:46:04 spring1 postfix/smtpd[47824]: > neon.domain.com[189.45.22.55]: 250-SIZE 2048
Feb 8 09:46:04 spring1 postfix/smtpd[47824]: > neon.domain.com[189.45.22.55]: 250-VRFY
Feb 8 09:46:04 spring1 postfix/smtpd[47824]: > neon.domain.com[189.45.22.55]: 250-ETRN
Feb 8 09:46:04 spring1 postfix/smtpd[47824]: > neon.domain.com[189.45.22.55]: 250-ENHANCEDSTATUSCODES
Feb 8 09:46:04 spring1 postfix/smtpd[47824]: > neon.domain.com[189.45.22.55]: 250-8BITMIME
Feb 8 09:46:04 spring1 postfix/smtpd[47824]: > neon.domain.com[189.45.22.55]: 250-DSN
Feb 8 09:46:04 spring1 postfix/smtpd[47824]: > neon.domain.com[189.45.22.55]: 250 SMTPUTF8
Feb 8 09:46:04 spring1 postfix/smtpd[47824]: match_hostname: smtpd_client_event_limit_exceptions: neon.domain.com ~? 189.45.22.55
Feb 8 09:46:04 spring1 postfix/smtpd[47824]: lost connection after EHLO from neon.domain.com[189.45.22.55]
Feb 8 09:46:04 spring1 postfix/smtpd[47824]: disconnect from neon.domain.com[189.45.22.55] ehlo=1 commands=1
Re: smtpd ... SSL_accept error from ... lost connection
> On Dec 11, 2016, at 3:25 AM, Dominic Raferd <domi...@timedicer.co.uk> wrote: > > In general my postfix mail server is working well, it is receiving > emails with optional STARTTLS. But I am occasionally seeing an error > message like this in the log: > > 2016-12-11 00:32:19 dl1 postfix/smtpd[13665]: SSL_accept error from > unknown[14.215.156.100]: lost connection > > The connection giving rise to the error is never from one of our > machines/users. Should I be worried about it? Does it indicate some > bad configuration on my side? No PTR record, SOA in China. Unless you have delayed correspondence from that province: 215.14.in-addr.arpa.39413 IN SOA soa. dns.guangzhou.gd.cn on the Internet stuff happens. Nothing to see, move along... -- Viktor.
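For readers wondering where the zone name in the reply above comes from: "unknown" in the log means Postfix found no usable host name for the client address, and the PTR query name is built by reversing the address octets under in-addr.arpa. A minimal Python sketch, using only the standard library's ipaddress module (the helper name ptr_name is made up for illustration):

```python
import ipaddress


def ptr_name(ip: str) -> str:
    """Return the reverse-DNS (PTR) query name for an IPv4 or IPv6 address."""
    return ipaddress.ip_address(ip).reverse_pointer


if __name__ == "__main__":
    # The client address from the log line quoted in this thread.
    print(ptr_name("14.215.156.100"))
```

The resulting name sits under the 215.14.in-addr.arpa zone whose SOA record is quoted above. Actually resolving the name needs a DNS query, which this sketch deliberately leaves out.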
Re: smtpd ... SSL_accept error from ... lost connection
On 11 December 2016 at 09:12, John Fawcett <j...@voipsupport.it> wrote: > On 12/11/2016 10:00 AM, Dominic Raferd wrote: >> On 11 December 2016 at 08:43, John Fawcett <j...@voipsupport.it> wrote: >>> On 12/11/2016 09:25 AM, Dominic Raferd wrote: >>>> In general my postfix mail server is working well, it is receiving >>>> emails with optional STARTTLS. But I am occasionally seeing an error >>>> message like this in the log: >>>> >>>> 2016-12-11 00:32:19 dl1 postfix/smtpd[13665]: SSL_accept error from >>>> unknown[14.215.156.100]: lost connection >>>> >>>> The connection giving rise to the error is never from one of our >>>> machines/users. Should I be worried about it? Does it indicate some >>>> bad configuration on my side? >>>> >>>> Dominic >>> Dominic >>> >>> it would help if you posted your configuration. >>> I suspect that you have the smtps service configured in master.cf. If >>> anyone is using it, it should be only your own users, so errors from >>> unrecognised ips will not be a problem and are probably not for any >>> legitimate reason. If you don't need the smtps service, you should >>> consider commenting it out completely in master.cf. >>> John >>> >> Thanks John for your quick reply. I don't have any smtps configured in >> master.cf, I only have smtp port (25) open and I allow opportunistic >> TLS (which I require before authentication [for which I use dovecot]) >> i.e. STARTTLS. So any senders can use TLS if they want. I guess that I >> should just ignore these errors from unknown ips as they don't >> indicate a security problem on my side? > > If you are able to receive encrypted email in general then I would > > ignore them unless there is any other sign of a problem > > (like users saying they cannot connect or people saying they are > > not receiving email). > > John > Thanks John, I have now filtered my error-message-checking cron job so that when these are 'from unknown' they will be ignored and I can stop worrying about them.
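The kind of filter described in this message could look roughly like the Python sketch below. It is not the poster's actual cron job, which was not shown; the function name worth_reporting and the exact pattern are assumptions based on the log line quoted in the thread.

```python
import re
import sys

# Matches lines such as:
#   postfix/smtpd[13665]: SSL_accept error from unknown[14.215.156.100]: lost connection
# The optional "\w+/" part also covers service names like postfix/smtps/smtpd.
IGNORABLE = re.compile(
    r"postfix/(?:\w+/)?smtpd\[\d+\]: "
    r"SSL_accept error from unknown\[[^\]]+\]: lost connection"
)


def worth_reporting(line: str) -> bool:
    """Return False for SSL_accept 'lost connection' errors from unknown clients."""
    return IGNORABLE.search(line) is None


if __name__ == "__main__":
    # Pass through every log line on stdin except the ignorable ones.
    for log_line in sys.stdin:
        if worth_reporting(log_line):
            sys.stdout.write(log_line)
```

Only clients logged as "unknown" are dropped; the same error from a client with a resolvable name is still passed through, in line with the advice in this thread.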
Re: smtpd ... SSL_accept error from ... lost connection
On 12/11/2016 10:00 AM, Dominic Raferd wrote: > On 11 December 2016 at 08:43, John Fawcett <j...@voipsupport.it> wrote: >> On 12/11/2016 09:25 AM, Dominic Raferd wrote: >>> In general my postfix mail server is working well, it is receiving >>> emails with optional STARTTLS. But I am occasionally seeing an error >>> message like this in the log: >>> >>> 2016-12-11 00:32:19 dl1 postfix/smtpd[13665]: SSL_accept error from >>> unknown[14.215.156.100]: lost connection >>> >>> The connection giving rise to the error is never from one of our >>> machines/users. Should I be worried about it? Does it indicate some >>> bad configuration on my side? >>> >>> Dominic >> Dominic >> >> it would help if you posted your configuration. >> I suspect that you have the smtps service configured in master.cf. If >> anyone is using it, it should be only your own users, so errors from >> unrecognised ips will not be a problem and are probably not for any >> legitimate reason. If you don't need the smtps service, you should >> consider commenting it out completely in master.cf. >> John >> > Thanks John for your quick reply. I don't have any smtps configured in > master.cf, I only have smtp port (25) open and I allow opportunistic > TLS (which I require before authentication [for which I use dovecot]) > i.e. STARTTLS. So any senders can use TLS if they want. I guess that I > should just ignore these errors from unknown ips as they don't > indicate a security problem on my side? If you are able to receive encrypted email in general then I would ignore them unless there is any other sign of a problem (like users saying they cannot connect or people saying they are not receiving email). John
Re: smtpd ... SSL_accept error from ... lost connection
On 12/11/2016 09:43 AM, John Fawcett wrote: > On 12/11/2016 09:25 AM, Dominic Raferd wrote: >> In general my postfix mail server is working well, it is receiving >> emails with optional STARTTLS. But I am occasionally seeing an error >> message like this in the log: >> >> 2016-12-11 00:32:19 dl1 postfix/smtpd[13665]: SSL_accept error from >> unknown[14.215.156.100]: lost connection >> >> The connection giving rise to the error is never from one of our >> machines/users. Should I be worried about it? Does it indicate some >> bad configuration on my side? >> >> Dominic > Dominic > > it would help if you posted your configuration. > I suspect that you have the smtps service configured in master.cf. If > anyone is using it, it should be only your own users, so errors from > unrecognised ips will not be a problem and are probably not for any > legitimate reason. If you don't need the smtps service, you should > consider commenting it out completely in master.cf. > John > I just did a quick check. I see these errors on STARTTLS in both smtpd and submission, so maybe they are not linked to smtps. Nevertheless they can probably be ignored, since if they are not your own users, the only other legitimate sources would be email servers transmitting email for your users and those are very unlikely to be "unknown" as in (unknown[14.215.156.100]:) which means they don't have proper reverse dns set up. John
Re: smtpd ... SSL_accept error from ... lost connection
On 11 December 2016 at 08:43, John Fawcett <j...@voipsupport.it> wrote: > On 12/11/2016 09:25 AM, Dominic Raferd wrote: >> In general my postfix mail server is working well, it is receiving >> emails with optional STARTTLS. But I am occasionally seeing an error >> message like this in the log: >> >> 2016-12-11 00:32:19 dl1 postfix/smtpd[13665]: SSL_accept error from >> unknown[14.215.156.100]: lost connection >> >> The connection giving rise to the error is never from one of our >> machines/users. Should I be worried about it? Does it indicate some >> bad configuration on my side? >> >> Dominic > > Dominic > > it would help if you posted your configuration. > I suspect that you have the smtps service configured in master.cf. If > anyone is using it, it should be only your own users, so errors from > unrecognised ips will not be a problem and are probably not for any > legitimate reason. If you don't need the smtps service, you should > consider commenting it out completely in master.cf. > John > Thanks John for your quick reply. I don't have any smtps configured in master.cf, I only have smtp port (25) open and I allow opportunistic TLS (which I require before authentication [for which I use dovecot]) i.e. STARTTLS. So any senders can use TLS if they want. I guess that I should just ignore these errors from unknown ips as they don't indicate a security problem on my side?
Re: smtpd ... SSL_accept error from ... lost connection
On 12/11/2016 09:25 AM, Dominic Raferd wrote: > In general my postfix mail server is working well, it is receiving > emails with optional STARTTLS. But I am occasionally seeing an error > message like this in the log: > > 2016-12-11 00:32:19 dl1 postfix/smtpd[13665]: SSL_accept error from > unknown[14.215.156.100]: lost connection > > The connection giving rise to the error is never from one of our > machines/users. Should I be worried about it? Does it indicate some > bad configuration on my side? > > Dominic Dominic it would help if you posted your configuration. I suspect that you have the smtps service configured in master.cf. If anyone is using it, it should be only your own users, so errors from unrecognised ips will not be a problem and are probably not for any legitimate reason. If you don't need the smtps service, you should consider commenting it out completely in master.cf. John
smtpd ... SSL_accept error from ... lost connection
In general my postfix mail server is working well, it is receiving emails with optional STARTTLS. But I am occasionally seeing an error message like this in the log: 2016-12-11 00:32:19 dl1 postfix/smtpd[13665]: SSL_accept error from unknown[14.215.156.100]: lost connection The connection giving rise to the error is never from one of our machines/users. Should I be worried about it? Does it indicate some bad configuration on my side? Dominic
RE: thousands of "lost connection after AUTH"
They are after usernames and passwords. Once they have a valid pair, they will use your server as a relay. This happened on one of my servers as well: one of my users used the same e-mail address and password on Facebook and LinkedIn as on the server. :-/ An attempt was then made to send about 60,000 mails through my server. What I did was limit the use of SASL AUTH with my firewall to connections from within my country, using an xtables geo block. Port 25 does not allow SASL; only port 587 does, and that port is limited to my country. And I told my user never to use the server's username/password anywhere else. Greetz, Louis > -Original message- > From: thomas.keller8...@gmail.com [mailto:owner-postfix-us...@postfix.org] > On behalf of Thomas Keller > Sent: Friday, 24 June 2016 9:50 > To: Postfix users > Subject: thousands of "lost connection after AUTH" > > This is not a real problem, but I am curious to understand what is > happening here. > > I am running a small postfix server for personal use. One thing that I > observe over and over again is thousands of "lost connection after AUTH" > connections, such as these: > > 08:23:19 postfix/smtpd[4925]: connect from unknown [155.133.38.30] > 08:23:19 postfix/smtpd[4925]: lost connection after AUTH from unknown > [155.133.38.30] > 08:23:19 postfix/smtpd[4925]: disconnect from unknown [155.133.38.30] > > now, these are not causing much trouble for me (other than flooding my > logs), and I know I can tweak the anvil rate limits (I am using these > below and since these "lost connection after auth" happen every 1 - 2 > minutes, they are not caught by my anvil filter.): > > anvil_rate_time_unit= 60s > smtpd_client_connection_rate_limit = 10 > smtpd_client_message_rate_limit = 10 > smtpd_client_new_tls_session_rate_limit = 10 > > I am curious to know, who are these agents connecting to my server, and > what are they trying to achieve ? > > AFAICT, they don't even attempt to send spam, or use me as relay. What > do they want? >
thousands of "lost connection after AUTH"
This is not a real problem, but I am curious to understand what is happening here. I am running a small postfix server for personal use. One thing that I observe over and over again is thousands of "lost connection after AUTH" connections, such as these: 08:23:19 postfix/smtpd[4925]: connect from unknown [155.133.38.30] 08:23:19 postfix/smtpd[4925]: lost connection after AUTH from unknown [155.133.38.30] 08:23:19 postfix/smtpd[4925]: disconnect from unknown [155.133.38.30] now, these are not causing much trouble for me (other than flooding my logs), and I know I can tweak the anvil rate limits (I am using these below and since these "lost connection after auth" happen every 1 - 2 minutes, they are not caught by my anvil filter.): anvil_rate_time_unit= 60s smtpd_client_connection_rate_limit = 10 smtpd_client_message_rate_limit = 10 smtpd_client_new_tls_session_rate_limit = 10 I am curious to know, who are these agents connecting to my server, and what are they trying to achieve ? AFAICT, they don't even attempt to send spam, or use me as relay. What do they want?
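Because these probes arrive only every minute or two, a per-minute anvil limit never sees them; counting them per client over the whole log gives a better picture of who is probing. A minimal Python sketch, assuming log lines shaped like the ones quoted above (the function name auth_probe_counts is made up for illustration):

```python
import re
import sys
from collections import Counter

# Captures the client IP from lines such as:
#   postfix/smtpd[4925]: lost connection after AUTH from unknown [155.133.38.30]
# The optional whitespace allows for both "unknown[ip]" and "unknown [ip]".
AUTH_LOST = re.compile(r"lost connection after AUTH from \S+\s*\[([^\]]+)\]")


def auth_probe_counts(lines):
    """Count 'lost connection after AUTH' events per client IP."""
    counts = Counter()
    for line in lines:
        match = AUTH_LOST.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Print the ten busiest client addresses found in the log on stdin.
    for ip, hits in auth_probe_counts(sys.stdin).most_common(10):
        print(f"{hits:6d}  {ip}")
```

Addresses that show up here hundreds of times a day while staying under the per-minute limit are the slow password-guessing clients described in the reply to this message.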
Re: lost connection with [mail server] while performing the EHLO handshake after TLS established
I have the explanation -- I should've looked into the tcpdump output more closely. Viktor Dukhovni wrote the following on 05.11.2014 16:30: On Wed, Nov 05, 2014 at 01:27:49PM +0100, Tobias Reckhard wrote: It looks as though mail01.i-sec.tuv.com dropped the connection, though I see no indication of the reason. Strangely, though, in a tcpdump I recorded it appears that our customer's system is sending a [RST, ACK] packet directly after sending TLSv1 Application Data, which very probably is its EHLO. You may have read the wrong direction for the Application Data. The SMTP client speaks first after [STARTTLS]. It is not apparent in the postfix logs and my old version of Wireshark interprets it as an Ignored Unknown Record in the Secure Sockets Layer, but the remote server (mail01.i-sec.tuv.com) said 454 TLS not available due to a temporary reason in its final message. postfix responds to that by sending 64 bytes of gibberish TLSv1 Application Data and then tears down the connection with a RST. But the problem was obviously on the other end. The customer has also reported that the other end had a problem on their mail server which they have since fixed, allowing the mail in question as well as a few others that had queued up to be delivered. Thanks for your assistance, Viktor, I appreciate it. 
In case you're interested, these are plain text Wireshark exports of the last two messages from the server (omitting the Frame and Ethernet details): Internet Protocol Version 4, Src: 193.24.224.9 (193.24.224.9), Dst: 192.168.21.65 (192.168.21.65) Transmission Control Protocol, Src Port: smtp (25), Dst Port: 37055 (37055), Seq: 4416, Ack: 582, Len: 59 Secure Sockets Layer TLSv1 Record Layer: Change Cipher Spec Protocol: Change Cipher Spec Content Type: Change Cipher Spec (20) Version: TLS 1.0 (0x0301) Length: 1 Change Cipher Spec Message TLSv1 Record Layer: Handshake Protocol: Encrypted Handshake Message Content Type: Handshake (22) Version: TLS 1.0 (0x0301) Length: 48 Handshake Protocol: Encrypted Handshake Message Internet Protocol Version 4, Src: 193.24.224.9 (193.24.224.9), Dst: 192.168.21.65 (192.168.21.65) Transmission Control Protocol, Src Port: smtp (25), Dst Port: 37055 (37055), Seq: 4475, Ack: 582, Len: 86 Secure Sockets Layer TLSv1 Record Layer: Encrypted Alert Content Type: Alert (21) Version: TLS 1.0 (0x0301) Length: 32 Alert Message: Encrypted Alert Ignored Unknown Record The Ignored Unknown Record reads: 454 TLS not available due to a temporary reason Cheers, Tobias
lost connection with [mail server] while performing the EHLO handshake after TLS established
Hello I'm experiencing the above problem on a customer's system while trying to send mail to the domain i-sec.tuv.com -- I've replaced the HELO/EHLO of our customer with mail.customer. The logs say: Nov 5 12:36:45 pxmail1 postfix/smtp[8378]: mail01.i-sec.tuv.com[193.24.224.9]:25: 220 mail01.i-sec.tuv.com ESMTP Nov 5 12:36:45 pxmail1 postfix/smtp[8378]: mail01.i-sec.tuv.com[193.24.224.9]:25: EHLO mail.customer Nov 5 12:36:45 pxmail1 postfix/smtp[8378]: mail01.i-sec.tuv.com[193.24.224.9]:25: 250-mail01.i-sec.tuv.com Nov 5 12:36:45 pxmail1 postfix/smtp[8378]: mail01.i-sec.tuv.com[193.24.224.9]:25: 250-8BITMIME Nov 5 12:36:45 pxmail1 postfix/smtp[8378]: mail01.i-sec.tuv.com[193.24.224.9]:25: 250-SIZE 104857600 Nov 5 12:36:45 pxmail1 postfix/smtp[8378]: mail01.i-sec.tuv.com[193.24.224.9]:25: 250 STARTTLS Nov 5 12:36:45 pxmail1 postfix/smtp[8378]: server features: 0x101b size 104857600 Nov 5 12:36:45 pxmail1 postfix/smtp[8378]: mail01.i-sec.tuv.com[193.24.224.9]:25: STARTTLS Nov 5 12:36:45 pxmail1 postfix/smtp[8378]: mail01.i-sec.tuv.com[193.24.224.9]:25: 220 Go ahead with TLS Nov 5 12:36:45 pxmail1 postfix/smtp[8378]: setting up TLS connection to mail01.i-sec.tuv.com[193.24.224.9]:25 Nov 5 12:36:45 pxmail1 postfix/smtp[8378]: mail01.i-sec.tuv.com[193.24.224.9]:25: TLS cipher list ALL:+RC4:@STRENGTH Nov 5 12:36:45 pxmail1 postfix/smtp[8378]: looking for session smtp:193.24.224.9:25:mail01.i-sec.tuv.comp=1c=ALL:+RC4:@STRENGTH in smtp cache [...] Nov 5 12:36:45 pxmail1 postfix/smtp[8378]: Trusted TLS connection established to mail01.i-sec.tuv.com[193.24.224.9]:25: TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits) Nov 5 12:36:45 pxmail1 postfix/smtp[8378]: mail01.i-sec.tuv.com[193.24.224.9]:25: EHLO mail.customer Nov 5 12:36:45 pxmail1 postfix/smtp[8378]: smtp_get: EOF Nov 5 12:36:45 pxmail1 postfix/smtp[8378]: connect to subsystem private/defer [...] 
Nov 5 12:36:45 pxmail1 postfix/smtp[8378]: send attr action = delayed Nov 5 12:36:45 pxmail1 postfix/smtp[8378]: send attr reason = lost connection with mail01.i-sec.tuv.com[193.24.224.9] while performing the EHLO handshake It looks as though mail01.i-sec.tuv.com dropped the connection, though I see no indication of the reason. Strangely, though, in a tcpdump I recorded it appears that our customer's system is sending a [RST, ACK] packet directly after sending TLSv1 Application Data, which very probably is its EHLO. Any ideas? Cheers, Tobias
Re: lost connection with [mail server] while performing the EHLO handshake after TLS established
On Wed, Nov 05, 2014 at 01:27:49PM +0100, Tobias Reckhard wrote: Nov 5 12:36:45 pxmail1 postfix/smtp[8378]: Trusted TLS connection established to mail01.i-sec.tuv.com[193.24.224.9]:25: TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits) Nov 5 12:36:45 pxmail1 postfix/smtp[8378]: mail01.i-sec.tuv.com[193.24.224.9]:25: EHLO mail.customer Nov 5 12:36:45 pxmail1 postfix/smtp[8378]: smtp_get: EOF It looks as though mail01.i-sec.tuv.com dropped the connection, though I see no indication of the reason. Strangely, though, in a tcpdump I recorded it appears that our customer's system is sending a [RST, ACK] packet directly after sending TLSv1 Application Data, which very probably is its EHLO. You may have read the wrong direction for the Application Data. The SMTP client speaks first after EHLO. $ posttls-finger -dsha256 [mail01.i-sec.tuv.com] posttls-finger: Connected to mail01.i-sec.tuv.com[193.24.224.9]:25 posttls-finger: 220 mail01.i-sec.tuv.com ESMTP posttls-finger: EHLO amnesiac.local posttls-finger: 250-mail01.i-sec.tuv.com posttls-finger: 250-8BITMIME posttls-finger: 250-SIZE 104857600 posttls-finger: 250 STARTTLS posttls-finger: STARTTLS posttls-finger: 220 Go ahead with TLS ... posttls-finger: Untrusted TLS connection established to mail01.i-sec.tuv.com[193.24.224.9]:25: unknown with cipher DHE-RSA-AES256-SHA (256/256 bits) posttls-finger: EHLO amnesiac.local posttls-finger: 250-mail01.i-sec.tuv.com posttls-finger: 250-8BITMIME posttls-finger: 250-SIZE 104857600 posttls-finger: 250-AUTH PLAIN LOGIN posttls-finger: 250 AUTH=PLAIN LOGIN posttls-finger: QUIT posttls-finger: 221 mail01.i-sec.tuv.com If the direction is correct, and the server was sending application data, it would be logged as the response to the post-handshake EHLO. If building posttls-finger from Postfix 2.11 source is a pain, you might find swaks handy (swaks does a lot more, but does not support DANE, and does not exercise Postfix TLS library client features). -- Viktor.
Re: lost connection with [mail server] while performing the EHLO handshake after TLS established
On Wed, Nov 05, 2014 at 03:30:06PM +, Viktor Dukhovni wrote:

> > recorded it appears that our customer's system is sending a [RST, ACK] packet directly after sending TLSv1 Application Data, which very probably is its EHLO.
>
> You may have read the wrong direction for the Application Data. The SMTP client speaks first after EHLO.

Oops, sorry, that's after STARTTLS of course.

-- Viktor.
Lost connection
I am having trouble sending email to a specific server. I got the following error:

lost connection with mx.example.org[xx.xx.xx.xxx] while receiving the initial server greeting

The operator says it's my issue, yet I have no problems with any other servers.

My postconf -n is as follows:

body_checks = regexp:/usr/local/etc/postfix/body_check
bounce_size_limit = 5
command_directory = /usr/local/sbin
config_directory = /usr/local/etc/postfix
content_filter = smtp-amavis:[127.0.0.1]:10024
daemon_directory = /usr/local/libexec/postfix
daemon_timeout = 36000s
data_directory = /var/db/postfix
debugger_command = PATH=/bin:/usr/bin:/usr/local/bin:/usr/X11R6/bin xxgdb $daemon_directory/$process_name $process_id sleep 5
delay_warning_time = 2h
disable_vrfy_command = yes
header_checks = regexp:/usr/local/etc/postfix/header_checks
home_mailbox = Maildir/
html_directory = /usr/local/share/doc/postfix
inet_protocols = ipv4
mail_owner = postfix
mail_spool_directory = /var/mail/vmail
mailq_path = /usr/local/bin/mailq
manpage_directory = /usr/local/man
maps_rbl_domains = bl.spamcop.net
mydestination = localhost.$mydomain, localhost
mydomain = theoceanwindow-bv.com
mynetworks = removed
myorigin = $mydomain
newaliases_path = /usr/local/bin/newaliases
queue_directory = /var/spool/postfix
readme_directory = /usr/local/share/doc/postfix
recipient_bcc_maps = hash:/usr/local/etc/postfix/recipient_bcc
relay_recipient_maps = hash:/usr/local/etc/postfix/relay_recipients
sample_directory = /usr/local/etc/postfix
sendmail_path = /usr/local/sbin/sendmail
setgid_group = maildrop
smtp_tls_note_starttls_offer = yes
smtpd_banner = $myhostname ESMTP
smtpd_delay_reject = yes
smtpd_helo_required = yes
smtpd_helo_restrictions = permit_sasl_authenticated,check_helo_access hash:/usr/local/etc/postfix/helo_access,reject_invalid_hostname,permit
smtpd_recipient_restrictions = permit_mynetworks, check_sender_access pcre://usr/local/etc/postfix/sender_access pcre://usr/local/etc/postfix/sender_access reject_rhsbl_sender fresh.spameatingmonkey.net, reject_unauth_destination, check_client_access hash:/usr/local/etc/postfix/rbl_override, reject_rbl_client zen.spamhaus.org, reject_rbl_client bl.spam, reject_rbl_client bl.spameatingmonkey.net, reject_rhsbl_client fresh.spameatingmonkey.net, reject_rhsbl_sender urired.spameatingmonkey.net,permit_sasl_authenticated
smtpd_relay_restrictions = permit_mynetworks permit_sasl_authenticated
smtpd_sasl_auth_enable = yes
smtpd_sasl_path = private/auth
smtpd_sasl_type = dovecot
smtpd_sender_restrictions = permit_sasl_authenticated, reject_unknown_sender_domain, reject_unauthenticated_sender_login_mismatch, check_sender_access pcre://usr/local/etc/postfix/sender_access pcre://usr/local/etc/postfix/sender_access permit_mynetworks
smtpd_tls_CAfile = /usr/local/etc/keys/root.crt
smtpd_tls_cert_file = /usr/local/etc/keys/server.cert
smtpd_tls_key_file = /usr/local/etc/keys/private.key
smtpd_tls_loglevel = 3
smtpd_tls_received_header = yes
smtpd_tls_security_level = may
smtpd_tls_session_cache_timeout = 3600s
tls_random_source = dev:/dev/urandom
unknown_local_recipient_reject_code = 550
virtual_alias_maps = hash:/usr/local/etc/postfix/virtual
virtual_gid_maps = static:1000
virtual_mailbox_base = /var/mail/vmail
virtual_mailbox_domains = hash:/usr/local/etc/postfix/virtual_domains
virtual_mailbox_maps = hash:/usr/local/etc/postfix/virtual_mailbox
virtual_minimum_uid = 100
virtual_uid_maps = static:1003
Re: Lost connection
On 18.10.2014 at 15:36, jason hirsh wrote:

> I am having trouble sending email to a specific server. I got the following error:
> lost connection with mx.example.org[xx.xx.xx.xxx] while receiving the initial server greeting
> The operator says it's my issue, yet I have no problems with any other servers.

That is nonsense, because he can't know that. No other complaints doesn't mean it doesn't happen here and there without anyone taking notice, nor does that message alone mean anything at all.

"lost connection" is just the messenger reporting that, well, the connection was lost, and that is completely outside Postfix's scope. The only relevant questions are how often it happens and whether a later retry works, which is the expected behavior.

Look below: 8 lost connections to one Hotmail server. So what? 12 successful deliveries; the first was retried seconds later to a different MX. The reason could have been our link, their link, or one of the routers between the servers at that moment. Who cares?

So did your mail *really* fail, or are you just panicking because you see "lost connection" for no reason?

cat maillog | grep 65.54.188.72 | grep status=sent | wc -l
12

Oct 12 18:00:39 mail postfix/smtp[17729]: 3jG73B02HDz2d: to=***, relay=mx1.hotmail.com[65.55.37.104]:25, delay=146, delays=22/121/1.5/0.84, dsn=2.0.0, status=sent (250 col004-mc3f30lsnpkz0001a...@col004-mc3f30.hotmail.com Queued mail for delivery)
Oct 12 18:00:39 mail postfix/smtp[17729]: 3jG73B02HDz2d: to=***, relay=mx1.hotmail.com[65.55.37.104]:25, delay=146, delays=22/121/1.5/0.84, dsn=2.0.0, status=sent (250 col004-mc3f30lsnpkz0001a...@col004-mc3f30.hotmail.com Queued mail for delivery)

Oct 12 18:00:36 mail postfix/smtp[17729]: 3jG73B02HDz2d: lost connection with mx2.hotmail.com[65.54.188.72] while receiving the initial server greeting
Oct 12 18:08:00 mail postfix/smtp[17730]: 3jG74F4kj8z3H: lost connection with mx2.hotmail.com[65.54.188.72] while receiving the initial server greeting
Oct 12 18:14:22 mail postfix/smtp[17729]: 3jG74q691qz3W: lost connection with mx3.hotmail.com[65.54.188.72] while receiving the initial server greeting
Oct 12 18:23:39 mail postfix/smtp[17731]: 3jG76B1BPpz3y: lost connection with mx1.hotmail.com[65.54.188.72] while receiving the initial server greeting
Oct 12 18:26:48 mail postfix/smtp[17728]: 3jG76L4Dkjz3s: lost connection with mx4.hotmail.com[65.54.188.72] while receiving the initial server greeting
Oct 12 18:28:03 mail postfix/smtp[17729]: 3jG77G3z6fz45: lost connection with mx1.hotmail.com[65.54.188.72] while receiving the initial server greeting
Oct 14 07:55:09 mail postfix/smtp[29058]: 3jH5ZM0zRgz23: lost connection with mx2.hotmail.com[65.54.188.72] while receiving the initial server greeting
Oct 15 15:07:12 mail postfix/smtp[23519]: 3jHv691gnWz2W: lost connection with mx4.hotmail.com[65.54.188.72] while receiving the initial server greeting
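The comparison made above (how many "lost connection" events versus how many successful deliveries) can also be tallied per peer address with a short script. This is only a sketch: the two regular expressions are assumptions based on the log lines quoted in this thread, and the function name tally is made up for illustration; other Postfix versions or syslog formats may need different patterns.

```python
import re
from collections import Counter

# Peer in lines such as:
#   "... lost connection with mx2.hotmail.com[65.54.188.72] while ..."
LOST_RE = re.compile(r"lost connection with (\S+?)\[([0-9a-fA-F.:]+)\]")
# Peer in delivery lines such as:
#   "... relay=mx1.hotmail.com[65.55.37.104]:25, ... status=sent (...)"
SENT_RE = re.compile(r"relay=(\S+?)\[([0-9a-fA-F.:]+)\](?::\d+)?,.*status=sent")


def tally(lines):
    """Return (lost, sent) Counters keyed by peer IP address."""
    lost, sent = Counter(), Counter()
    for line in lines:
        match = LOST_RE.search(line)
        if match:
            lost[match.group(2)] += 1
            continue
        match = SENT_RE.search(line)
        if match:
            sent[match.group(2)] += 1
    return lost, sent
```

To use it, pass an open log file, for example tally(open("/var/log/maillog")) (the path is an assumption and varies by system), then compare lost[ip] with sent[ip] for the address in question.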
Re: Lost connection
I have about 8 of these over two days; all continue to be retried. I am trying to be proactive and was just checking whether there was something I may have hosed on my side, although this is the only server I have been having problems with. I really think the issue is with that server, but was just looking for expert input.

On Oct 18, 2014, at 9:54 AM, li...@rhsoft.net wrote:

> [...]
> so did your mail *really* fail or do you just have panic because you see lost connection for no reason?
Re: Lost connection
On 18.10.2014 at 16:01, jason hirsh wrote:

> I have about 8 of these over two days; all continue to be retried. I am trying to be proactive and was just checking whether there was something I may have hosed on my side, although this is the only server I have been having problems with. I really think the issue is with that server, but was just looking for expert input.

Try net.ipv4.tcp_window_scaling = 0 in sysctl.conf and run sysctl -p; maybe they have some crap device in front of their server!

https://www.google.at/#q=smtp+tcp+window+scaling+problems
Re: Lost connection
jason hirsh:
> I am having trouble sending email to a specific server. I got the following error:
> lost connection with mx.example.org[xx.xx.xx.xxx] while receiving the initial server greeting
> The operator says its my issue yet i have no problems with any other servers

Try:

$ telnet xx.xx.xx.xxx 25

and report what happens. If the host replies with 220 servername, then send:

EHLO your.client.name

and report what happens.

Wietse
Re: Lost connection
On 10/18/2014 07:01 AM, jason hirsh wrote:

> I have about 8 of these over two days; all continue to be retried. I am trying to be proactive and was just checking whether there was something I may have hosed on my side, although this is the only server I have been having problems with. I really think the issue is with that server, but was just looking for expert input.

What happens when you use TELNET to connect to the remote server on port 25? Perform the test repeatedly from your mail server -- that will give you a clue. For example, if it takes a long time for TELNET to connect, then you know the remote server may be overloaded.

I discovered, with a TACACS daemon with a too-short backlog, that the IP stack in the server will complete the three-way handshake -- then find it can't pass the connection to the LISTEN socket because the backlog is full. After the various TCP wait times expire, the stack sends RST and forgets the connection.

What makes me think this is the scenario is the clause "while receiving the initial server greeting" in the log message. The remote SMTP server may never have had the connection passed on to it because of backlog overflow on the remote host.

One reason the remote operator doesn't see a problem is that s/he never knows this is happening (no logging of the event) and no other mail server operator has complained to him about this. I know I would ignore the message when I see it.

Are you getting bounce messages on mail being sent to this server? Are you seeing this happen again and again on the same queued message, or is it happening with different queued mail?
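The repeated TELNET test suggested above can be scripted so that connect times and greetings are recorded for each attempt. The following is a rough sketch using only the Python standard library; the function name time_connects and its defaults are invented for this illustration and are not part of Postfix or of anything posted in this thread.

```python
import socket
import time


def time_connects(host, port=25, attempts=5, timeout=10.0):
    """Open `attempts` TCP connections one after another.

    Returns a list of (connect_seconds, greeting) tuples. A failed attempt
    (refused, reset, or timed out) is recorded as (None, error_text).
    """
    results = []
    for _ in range(attempts):
        start = time.monotonic()
        try:
            with socket.create_connection((host, port), timeout=timeout) as sock:
                elapsed = time.monotonic() - start
                sock.settimeout(timeout)
                # First chunk sent by the server, normally the "220 ..." greeting.
                greeting = sock.recv(512).decode("ascii", "replace").strip()
                results.append((elapsed, greeting))
        except OSError as exc:
            results.append((None, str(exc)))
    return results
```

Slow connects, resets, or an empty greeting on some attempts but not others would point toward the overload or backlog scenario described above rather than toward the sending server's configuration. Run it from the machine that runs Postfix, so the test uses the same source address.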
Re: Lost connection
I get this:

Trying 66.96.142.51...
Connected to 51.142.96.66.static.eigbox.net.
Escape character is '^]'.
220 bosimpinc11 bizsmtp ESMTP server ready

I am getting the original error message on all mail I send, and only to that server.

On Oct 18, 2014, at 10:21 AM, Stephen Satchell l...@satchell.net wrote:

> What happens when you use TELNET to connect to the remote server on port 25?
> [...]
> Are you getting bounce messages on mail being sent to this server? Are you seeing this happen again and again on the same queued message, or it is happening with different queued mail?
Re: Lost connection
jason hirsh:
> I get this
> Trying 66.96.142.51...
> Connected to 51.142.96.66.static.eigbox.net.
> Escape character is '^]'.
> 220 bosimpinc11 bizsmtp ESMTP server ready

Then try step 2 in my reply:

EHLO your.server.name

and report what happens. I see the following:

% telnet 66.96.142.51 25
Trying 66.96.142.51...
Connected to 51.142.96.66.static.eigbox.net.
Escape character is '^]'.
220 bosimpinc11 bizsmtp ESMTP server ready
EHLO your.server.name
250-bosimpinc11 hello [70.104.130.26], pleased to meet you
250-HELP
250-SIZE 3000
250-8BITMIME
250-STARTTLS
250 OK
quit
221 bosimpinc11 bizsmtp closing connection
Connection closed by foreign host.

Wietse
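For anyone who prefers a script to typing into telnet, the same two steps (read the greeting, send EHLO) can be done with Python's standard smtplib module. This is a sketch, not something posted in the thread; the function name ehlo_probe is hypothetical, and the EHLO name you pass should be your own server's name.

```python
import smtplib


def ehlo_probe(host, helo_name, port=25, timeout=30):
    """Connect, read the greeting, send EHLO, then QUIT.

    Returns (ehlo_reply_code, sorted list of advertised ESMTP keywords in
    lower case). Raises an smtplib.SMTPException or OSError subclass if the
    connection is lost or no valid greeting arrives.
    """
    with smtplib.SMTP(host, port, timeout=timeout) as smtp:
        code, _message = smtp.ehlo(helo_name)
        return code, sorted(smtp.esmtp_features)
```

A call such as ehlo_probe("66.96.142.51", "your.server.name") against the session shown above would be expected to return code 250 with keywords including "starttls". Note that smtplib treats every word that starts a continuation line as a keyword, so the final "250 OK" line shows up as "ok".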
Re: Lost connection
On Oct 18, 2014, at 10:30 AM, jason hirsh hir...@att.net wrote:

> I am getting the original; error message on all mail i end only tot that server

Correcting typos (I should use my glasses): I get the error message quoted at the beginning of this thread on all traffic I send to this mail server from mine.
Re: Lost connection
Oops, missed step 2. I get this:

Trying 66.96.142.51...
Connected to 51.142.96.66.static.eigbox.net.
Escape character is '^]'.
220 bosimpinc11 bizsmtp ESMTP server ready
EHLO mail.kasdivi.com
250-bosimpinc11 hello [209.160.65.133], pleased to meet you
250-HELP
250-SIZE 3000
250-8BITMIME
250-STARTTLS
250 OK

On Oct 18, 2014, at 10:38 AM, Wietse Venema wie...@porcupine.org wrote:

> Then try step 2 in my reply:
>
> EHLO your.server.name
>
> and report what happens.
Re: Lost connection
jason hirsh:
> oops missed step 2 I get this
> Trying 66.96.142.51...
> Connected to 51.142.96.66.static.eigbox.net.
> Escape character is '^]'.
> 220 bosimpinc11 bizsmtp ESMTP server ready
> EHLO mail.kasdivi.com
> 250-bosimpinc11 hello [209.160.65.133], pleased to meet you
> 250-HELP
> 250-SIZE 3000
> 250-8BITMIME
> 250-STARTTLS
> 250 OK

That is interesting. These are the same commands that Postfix would send.

Are you making the telnet connection from the machine that runs Postfix? (mail.kasdivi.com is 209.160.65.133, so it looks like you are)

Do you have any -o name=value settings in master.cf? If running Postfix 2.11, try

$ postconf -P '*/inet/*'

Otherwise, you need to look at master.cf yourself.

Wietse
Re: Lost connection
On 10/18/2014 07:01 AM, jason hirsh wrote:

> I have about 8 of these over two days; all continue to be retried. I am trying to be proactive and was just checking whether there was something I may have hosed on my side

I forgot to ask: what do your DNS entries look like for your mail server? Specifically:

A record
PTR record
MX record(s)

Back when I was working for a web hosting company, I insisted that any incoming mail come from a launch point with best-practices DNS. My policy filter would return a polite error message and close the connection; this guy might just be rude. My tests:

1. The IP address has a PTR record with an FQDN that looks to be statically assigned. This was developed over time, as I learned the patterns. I also had a whitelist of REGEXP patterns. Multiple returns were a no-no.
2. A look-up on the FQDN returns an A record with the same IP address. Multiple IP addresses can be returned, but one of them must match the IP address of the incoming connection.
3. A look-up of the domain name (tried several variations) returns MX record(s). I allowed for those large groups who split incoming mail from outgoing mail in a server farm -- the idea here is that I didn't accept mail from any endpoint that may not have a postmaster associated with it.
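Tests 1 and 2 above together amount to a forward-confirmed reverse DNS check: look up the PTR name for the connecting address, then check that the name resolves back to that address. A minimal sketch follows, assuming IPv4 and the Python standard library resolver calls; the function name fcrdns_ok is hypothetical, and the "looks statically assigned" heuristics, the whitelist, and the MX test (3) are not covered.

```python
import socket


def fcrdns_ok(ip, reverse=socket.gethostbyaddr, forward=socket.gethostbyname_ex):
    """Forward-confirmed reverse DNS check for an IPv4 address.

    The PTR name of `ip` must resolve to a list of addresses that contains
    `ip`. The two resolver functions are parameters so that the check can be
    exercised without network access.
    """
    try:
        ptr_name, _aliases, _addrs = reverse(ip)
        _name, _aliases, addresses = forward(ptr_name)
    except OSError:
        # No PTR record, or the PTR name does not resolve.
        return False
    return ip in addresses
```

Note that this only checks that the PTR name maps back to the address. It does not require the PTR name to equal the name the server announces in EHLO, which is a stricter comparison that some receiving sites also make.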
Re: Lost connection
On Oct 18, 2014, at 10:54 AM, Wietse Venema wie...@porcupine.org wrote:

> jason hirsh:
> > oops missed step 2 I get this
> > Trying 66.96.142.51...
> > Connected to 51.142.96.66.static.eigbox.net.
> > Escape character is '^]'.
> > 220 bosimpinc11 bizsmtp ESMTP server ready
> > EHLO mail.kasdivi.com
> > 250-bosimpinc11 hello [209.160.65.133], pleased to meet you
> > 250-HELP
> > 250-SIZE 3000
> > 250-8BITMIME
> > 250-STARTTLS
> > 250 OK
>
> That is interesting. These are the same commands that Postfix would send.
>
> Are you making the telnet connection from the machine that runs Postfix? (mail.kasdivi.com is 209.160.65.133, so it looks like you are)

Yes. I am running Postfix version 2.11-20131001.

> Do you have any -o name=value settings in master.cf?

Not that I can see.

> If running Postfix 2.11, try
>
> $ postconf -P '*/inet/*'

I get: postconf: illegal option -- P

> Otherwise, you need to look at master.cf yourself.
>
> Wietse
Re: Lost connection
jason hirsh:
> On Oct 18, 2014, at 10:54 AM, Wietse Venema wie...@porcupine.org wrote:
> > jason hirsh:
> > > oops missed step 2 I get this
> > > [...]
> > > 250-STARTTLS
> > > 250 OK
> >
> > That is interesting. These are the same commands that Postfix would send.

Does the lost connection happen because Postfix makes multiple connections at the same time? Try this example:

/etc/postfix/main.cf:
    transport_maps = hash:/etc/postfix/transport
    slow_destination_concurrency_limit = 1
    slow_initial_destination_concurrency = 1
    slow_destination_concurrency_failed_cohort_limit = 10

/etc/postfix/transport:
    example.com    slow:

/etc/postfix/master.cf:
    # service type  private unpriv  chroot  wakeup  maxproc command
    slow      unix  -       -       n       -       -       smtp
        -o smtp_connection_cache_on_demand=no

For example.com, specify the problem domain name (not hostname or IP address).

Wietse
Re: Lost connection
On Oct 18, 2014, at 11:04 AM, Richard lists-post...@listmail.innovate.net wrote:

> > Date: Saturday, October 18, 2014 10:45:09 -0400
> > From: jason hirsh hir...@att.net
> > To: Postfix users postfix-users@postfix.org
> > Subject: Re: Lost connection
> >
> > oops missed step 2 I get this
> > Trying 66.96.142.51...
> > Connected to 51.142.96.66.static.eigbox.net.
> > Escape character is '^]'.
> > 220 bosimpinc11 bizsmtp ESMTP server ready
> > EHLO mail.kasdivi.com
> > 250-bosimpinc11 hello [209.160.65.133], pleased to meet you
> > 250-HELP
> > 250-SIZE 3000
> > 250-8BITMIME
> > 250-STARTTLS
> > 250 OK
>
> If mail.kasdivi.com is your mail server's true name, then the forward and reverse lookups don't match. There is an rDNS record for 209.160.65.133, but it isn't mail.kasdivi.com. Some mail servers will refuse mail unless the forward and reverse match. What they tell you may vary. Unless/until these match, I think that the onus falls to your side.

In my master record I do show this PTR for the domain in question. I have 7 domains on this server; tuna.theoceanwindow-bv.com is the server.

mail.kasdivi.com. 38400 IN A 209.160.65.133
133.65.160.209.in-addr.arpa. 86188 IN PTR tuna.theoceanwindow-bv.com.

> - Richard
Re: Lost connection
I made these changes, and my mail log indicates that the delays in delivery have been reduced, with no loss of connection. I presume that the test mail has been delivered; at least it was delivered to the other mail server.

On Oct 18, 2014, at 11:11 AM, Wietse Venema wie...@porcupine.org wrote:

> Does the lost connection happen because Postfix makes multiple connections at the same time? Try this example:
> [...]
> for example.com specify the problem domain name (not hostname or IP address).
Re: Lost connection
On Sat, Oct 18, 2014 at 11:01:29AM -0400, jason hirsh wrote:

> > Are you making the telnet connection from the machine that runs Postfix? (mail.kasdivi.com is 209.160.65.133, so it looks like you are)
>
> Yes I am running Postfix version 2.11-20131001

Postfix 2.11.0 was released in January 2014, and 2.11.2 since. You should no longer be using 2.11 snapshots.

-- Viktor.
3x lost connection while performing the EHLO handshake - Connection refused - clear text delivery ok - automated reply ESMTPS ok
I have difficulty with MessageLabs MTAs; below is one example that I don't understand. I don't have the strace debug log now. Regardless of the low/medium/high cipherlist (medium is in use; low and high are inactive and irrelevant), the MessageLabs problems prevail. I use 2 certs. Assistance is much appreciated.

18 00:21:35 postfix/smtp[23811]: initializing the client-side TLS engine
18 00:21:45 postfix/smtp[23811]: setting up TLS connection to cluster3vk.eu.messagelabs.com[85.158.137.83]:25
18 00:21:45 postfix/smtp[23811]: cluster3vk.eu.messagelabs.com[85.158.137.83]:25: TLS cipher list aRSA:-aRSA:aECDSA:-aECDSA:kRSA:-kRSA:kEDH:-kEDH:kEECDH:-kEECDH:AESGCM:-AESGCM:AESGCM:AES:CAMELLIA:3DES:RC4:!aNULL:!eNULL:!EXPORT:!MD5:!DES:!SRP:!DSS:!SEED:!ADH:!AECDH:!kECDH:!PSK:!LOW
18 00:21:45 postfix/smtp[23811]: looking for session smtpanz.comcluster3vk.eu.messagelabs.com85.158.137.83A776F43E9992EF9CB772130D7D4807F401B17A1E92E4917BD547B1EBB22F584D in smtp cache
18 00:21:45 postfix/smtp[23811]: SSL_connect:before/connect initialization
18 00:21:45 postfix/smtp[23811]: SSL_connect:SSLv2/v3 write client hello A
18 00:21:45 postfix/smtp[23811]: SSL_connect:SSLv3 read server hello A
18 00:21:46 postfix/smtp[23811]: cluster3vk.eu.messagelabs.com[85.158.137.83]:25: depth=2 verify=0 subject=/C=US/O=VeriSign, Inc./OU=VeriSign Trust Network/OU=(c) 2006 VeriSign, Inc. - For authorized use only/CN=VeriSign Class 3 Public Primary Certification Authority - G5
18 00:21:46 postfix/smtp[23811]: cluster3vk.eu.messagelabs.com[85.158.137.83]:25: depth=2 verify=1 subject=/C=US/O=VeriSign, Inc./OU=VeriSign Trust Network/OU=(c) 2006 VeriSign, Inc. - For authorized use only/CN=VeriSign Class 3 Public Primary Certification Authority - G5
18 00:21:46 postfix/smtp[23811]: cluster3vk.eu.messagelabs.com[85.158.137.83]:25: depth=1 verify=1 subject=/C=US/O=VeriSign, Inc./OU=VeriSign Trust Network/OU=Terms of use at https://www.verisign.com/rpa (c)10/CN=VeriSign Class 3 International Server CA - G3
18 00:21:46 postfix/smtp[23811]: cluster3vk.eu.messagelabs.com[85.158.137.83]:25: depth=0 verify=1 subject=/C=US/ST=California/L=Mountain View/O=Symantec Corporation/OU=Symantec.cloud/CN=mail140.messagelabs.com
18 00:21:46 postfix/smtp[23811]: SSL_connect:SSLv3 read server certificate A
18 00:21:46 postfix/smtp[23811]: SSL_connect:SSLv3 read server key exchange A
18 00:21:46 postfix/smtp[23811]: SSL_connect:SSLv3 read server certificate request A
18 00:21:46 postfix/smtp[23811]: SSL_connect:SSLv3 read server done A
18 00:21:46 postfix/smtp[23811]: SSL_connect:SSLv3 write client certificate A
18 00:21:46 postfix/smtp[23811]: SSL_connect:SSLv3 write client key exchange A
18 00:21:46 postfix/smtp[23811]: SSL_connect:SSLv3 write certificate verify A
18 00:21:46 postfix/smtp[23811]: SSL_connect:SSLv3 write change cipher spec A
18 00:21:46 postfix/smtp[23811]: SSL_connect:SSLv3 write finished A
18 00:21:46 postfix/smtp[23811]: SSL_connect:SSLv3 flush data
18 00:21:47 postfix/smtp[23811]: SSL_connect:SSLv3 read server session ticket A
18 00:21:47 postfix/smtp[23811]: SSL_connect:SSLv3 read finished A
18 00:21:47 postfix/smtp[23811]: save session smtpanz.comcluster3vk.eu.messagelabs.com85.158.137.83A776F43E9992EF9CB772130D7D4807F401B17A1E92E4917BD547B1EBB22F584D to smtp cache
18 00:21:47 postfix/smtp[23811]: cluster3vk.eu.messagelabs.com[85.158.137.83]:25: subject_CN=mail140.messagelabs.com, issuer_CN=VeriSign Class 3 International Server CA - G3, fingerprint=0D:A5:2A:0E:C0:99:04:DC:98:4C:57:E3:C8:C0:05:72, pkey_fingerprint=D8:66:56:75:94:50:CA:38:3E:AF:22:78:93:77:27:9F
18 00:21:47 postfix/smtp[23811]: Untrusted TLS connection established to cluster3vk.eu.messagelabs.com[85.158.137.83]:25: TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)
18 00:21:47 postfix/smtp[23811]: 7C07C800084: lost connection with cluster3vk.eu.messagelabs.com[85.158.137.83] while performing the EHLO handshake
18 00:22:20 postfix/smtp[23811]: setting up TLS connection to cluster3vk.eu.messagelabs.com[85.158.139.3]:25
18 00:22:20 postfix/smtp[23811]: cluster3vk.eu.messagelabs.com[85.158.139.3]:25: TLS cipher list aRSA:-aRSA:aECDSA:-aECDSA:kRSA:-kRSA:kEDH:-kEDH:kEECDH:-kEECDH:AESGCM:-AESGCM:AESGCM:AES:CAMELLIA:3DES:RC4:!aNULL:!eNULL:!EXPORT:!MD5:!DES:!SRP:!DSS:!SEED:!ADH:!AECDH:!kECDH:!PSK:!LOW
18 00:22:20 postfix/smtp[23811]: looking for session smtpanz.comcluster3vk.eu.messagelabs.com85.158.139.3D2903596AFC817BDF98DA58A8B9D0D2D81DF2FB786103A3AD6AF5E00A4A0CF61 in smtp cache
18 00:22:20 postfix/smtp[23811]: SSL_connect:before/connect initialization
18 00:22:20 postfix/smtp[23811]: SSL_connect:SSLv2/v3 write client hello A
18 00:22:21 postfix/smtp[23811]: SSL_connect:SSLv3 read server hello A
18 00:22:21 postfix/smtp[23811]: cluster3vk.eu.messagelabs.com[85.158.139.3]:25: depth=2 verify=0 subject=/C=US/O=VeriSign, Inc./OU=VeriSign Trust Network/OU=(c) 2006 VeriSign, Inc. - For authorized use only/CN=VeriSign Class 3
Re: 3x lost connection while performing the EHLO handshake - Connection refused - clear text delivery ok - automated reply ESMTPS ok
On Fri, Sep 19, 2014 at 01:40:34AM +1000, shm...@riseup.net wrote:

> I have difficulty with messagelabs MTA's below is 1 example
> i don't understand the strace debug log i don't have it now

Disable verbose TLS logging, it is not required. A log level of 1 is enough.

> 18 00:21:35 postfix/smtp[23811]: initializing the client-side TLS engine

Was anything else done in the 12 seconds between these two messages? Perhaps the verbose logging is making your system too slow? Is logging configured to be synchronous?

> 18 00:21:45 postfix/smtp[23811]: setting up TLS connection to cluster3vk.eu.messagelabs.com[85.158.137.83]:25
> [...]
> 18 00:21:47 postfix/smtp[23811]: Untrusted TLS connection established to cluster3vk.eu.messagelabs.com[85.158.137.83]:25: TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)
> 18 00:21:47 postfix/smtp[23811]: 7C07C800084: lost connection with cluster3vk.eu.messagelabs.com[85.158.137.83] while performing the EHLO handshake

The other end hung up. If no TLS errors are reported, perhaps your client took too long, or they are rate limiting your server by selectively dropping connections.

> 18 00:22:20 postfix/smtp[23811]: setting up TLS connection to cluster3vk.eu.messagelabs.com[85.158.139.3]:25
> 18 00:22:21 postfix/smtp[23811]: SSL_connect:SSLv3 read server certificate request A
> 18 00:22:21 postfix/smtp[23811]: SSL_connect:SSLv3 read server done A
> 18 00:22:21 postfix/smtp[23811]: SSL_connect:SSLv3 write client certificate A

Why have you configured a client certificate? Generally, you should not. It may work better if you don't.

> 18 00:22:22 postfix/smtp[23811]: SSL_connect error to cluster3vk.eu.messagelabs.com[85.158.139.3]:25: -1
> 18 00:22:22 postfix/smtp[23811]: warning: TLS library problem: error:1408F10B:SSL routines:SSL3_GET_RECORD:wrong version number:s3_pkt.c:338:

This requires a PCAP capture file to see what the server sent.

> 18 00:22:22 postfix/smtp[23811]: 7C07C800084: to=, relay=none, delay=54, delays=6.6/0.03/47/0, dsn=4.4.1, status=deferred (connect to cluster3vk.eu.messagelabs.com[85.158.139.3]:25: Connection refused)

They definitely have connection rate limiters in place. Nothing other than your IP reputation and OpenSSL library version number matters here.

Disable verbose TLS logging, disable client certs:

    smtp_tls_cert_file =
    smtp_tls_key_file =
    smtp_tls_eccert_file =
    smtp_tls_eckey_file =

make sure logging is not synchronous (syslog.conf) and post a PCAP file of a failed session (perhaps one of the "wrong version" ones). Because the message content is not sent, and in any case you're negotiating TLS, the PCAP file only discloses your IP address and SMTP client HELO name.

-- 
Viktor.
Re: 3x lost connection while performing the EHLO handshake - Connection refused - clear text delivery ok - automated reply ESMTPS ok
thank you sir,

Viktor Dukhovni wrote:
> On Fri, Sep 19, 2014 at 01:40:34AM +1000, shm...@riseup.net wrote:
>> I have difficulty with messagelabs MTA's below is 1 example
>> i don't understand the strace debug log i don't have it now
>
> Disable verbose TLS logging, it is not required. A log level of 1 is enough.

done

>> 18 00:21:35 postfix/smtp[23811]: initializing the client-side TLS engine
>
> Was anything else done in the 12 seconds between these two messages? Perhaps the verbose logging is making your system too slow? Is logging configured to be synchronous?

only my mail client disconnect:

18 00:21:37 postfix/smtpd[23799]: disconnect from [...]

the default in rsyslog.conf on debian jessie was:

mail.info -/var/log/mail.info
mail.warn -/var/log/mail.warn
mail.err /var/log/mail.err

i updated .err with "-"

however, i see what you mean: each time i send emails i wait about 30s for completion. i see the same postgrey log taking up 30s for MTA-MTA and MUA-MTA. however, in general, aside from messagelabs, i don't have any issues (that i'm currently aware of) receiving email from MTAs to my MTA, even with the 30s delay

18 16:21:10 postfix/smtpd[5031]: initializing the server-side TLS engine
18 16:21:10 postfix/tlsmgr[5033]: open smtpd TLS cache btree:/var/lib/postfix/smtpd_scache
18 16:21:10 postfix/tlsmgr[5033]: open smtp TLS cache btree:/var/lib/postfix/smtp_scache
18 16:21:10 postfix/tlsmgr[5033]: tlsmgr_cache_run_event: start TLS smtpd session cache cleanup
18 16:21:10 postfix/tlsmgr[5033]: tlsmgr_cache_run_event: start TLS smtp session cache cleanup
18 16:21:14 postfix/smtpd[5031]: connect from [...]
18 16:21:45 postfix/smtpd[5031]: warning: milter inet:127.0.0.1:10023: can't read SMFIC_OPTNEG reply packet header: Connection timed out
18 16:21:45 postfix/smtpd[5031]: warning: milter inet:127.0.0.1:10023: read error in initial handshake
18 16:21:46 postfix/smtpd[5031]: setting up TLS connection from [...]
18 16:21:46 postfix/smtpd[5031]: [...]: TLS cipher list aRSA:-aRSA:aECDSA:-aECDSA:kRSA:-kRSA:kEDH:-kEDH:kEECD$
18 16:21:46 postfix/smtpd[5031]: SSL_accept:before/accept initialization
18 16:21:49 postfix/smtpd[5031]: Anonymous TLS connection established from [...]: TLSv1.2 with cipher ECDHE-EC$
18 16:21:55 postfix/smtpd[5031]: : client=[...], sasl_method=PLAIN, sasl_username=
18 16:21:57 postfix/cleanup[5039]: : message-id=

>> 18 00:21:45 postfix/smtp[23811]: setting up TLS connection to cluster3vk.eu.messagelabs.com[85.158.137.83]:25
>> [...]
>> 18 00:21:47 postfix/smtp[23811]: Untrusted TLS connection established to cluster3vk.eu.messagelabs.com[85.158.137.83]:25: TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)
>> 18 00:21:47 postfix/smtp[23811]: 7C07C800084: lost connection with cluster3vk.eu.messagelabs.com[85.158.137.83] while performing the EHLO handshake
>
> The other end hung up. If no TLS errors are reported, perhaps your client took too long, or they are rate limiting your server by selectively dropping connections.
>
>> 18 00:22:20 postfix/smtp[23811]: setting up TLS connection to cluster3vk.eu.messagelabs.com[85.158.139.3]:25
>> 18 00:22:21 postfix/smtp[23811]: SSL_connect:SSLv3 read server certificate request A
>> 18 00:22:21 postfix/smtp[23811]: SSL_connect:SSLv3 read server done A
>> 18 00:22:21 postfix/smtp[23811]: SSL_connect:SSLv3 write client certificate A
>
> Why have you configured a client certificate? Generally, you should not. It may work better if you don't.

ok, done

>> 18 00:22:22 postfix/smtp[23811]: SSL_connect error to cluster3vk.eu.messagelabs.com[85.158.139.3]:25: -1
>> 18 00:22:22 postfix/smtp[23811]: warning: TLS library problem: error:1408F10B:SSL routines:SSL3_GET_RECORD:wrong version number:s3_pkt.c:338:
>
> This requires a PCAP capture file to see what the server sent.

may take a while...

>> 18 00:22:22 postfix/smtp[23811]: 7C07C800084: to=, relay=none, delay=54, delays=6.6/0.03/47/0, dsn=4.4.1, status=deferred (connect to cluster3vk.eu.messagelabs.com[85.158.139.3]:25: Connection refused)
>
> They definitely have connection rate limiters in place. Nothing other than your IP reputation and OpenSSL library version number matters here.
>
> Disable verbose TLS logging, disable client certs:
>
>     smtp_tls_cert_file =
>     smtp_tls_key_file =
>     smtp_tls_eccert_file =
>     smtp_tls_eckey_file =
>
> make sure logging is not synchronous (syslog.conf) and post a PCAP file of a failed session (perhaps one of the "wrong version" ones). Because the message content is not sent, and in any case you're negotiating TLS, the PCAP file only discloses your IP address and SMTP client HELO name.

all done minus PCAP

see how we go in the meantime...
Re: Possible reasons for lost connection after DATA
Hi Viktor,

On 11.09.2014 at 16:04, Viktor Dukhovni wrote:
> Your PCAP files should demonstrate repeated retransmission of data, are the ACKs you're sending confirming receipt of packets that are sent repeatedly? In that case your ACKs are getting lost? Is there a sequence number gap in the data received from the server? In that case the remote server's data is getting lost. Does the capture confirm that window scaling is not in use? ...

I've now collected a big enough data sample (I wanted to get more data from different hosts and such). What I am seeing is this:

- Window scaling is turned off.
- There are two cases:
  1. I simply stop receiving data after a few packets, with no retransmissions either (up to that point, everything seems normal: no sequence number gaps, no missing ACKs). After 20 minutes or so of complete silence, I receive a RST and the connection is killed. That is the less common case. It looks as if the connection is suddenly completely dead and the remote host just keeps sending RSTs until one of them happens to get through after all.
  2. The connection starts out OK (first few packets all OK), then I start receiving packets out of order or not at all. I then get a lot of retransmissions of packets that I didn't ACK because I never received them in the first place (that's the majority of cases). This goes on until the remote host is fed up and kills the connection.

So it looks like this is an external problem that I can't fix by changing my config...

Regards, Sean
Re: Possible reasons for lost connection after DATA
Hi Wietse,

On 11.09.2014 at 17:10, Wietse Venema wrote:
> That increases my suspicion of a data-dependent error - some marginal cable/switch/router, perhaps some middle box with a memory bit error that requires a power cycle to clear the problem. If the problem is caused by crosstalk defect, then only physical replacement will solve it.

If it is a hardware error on my server (NIC, cable, anything on the physical layer...), wouldn't I be seeing CRC errors or something? Shouldn't I be seeing something, e.g. in the ifconfig error counters or the like?

If it's somewhere else in the data center (router/switch), then there's really no way for me to know. I could contact support, but that request would be much too vague to be of any use ("You've probably got some broken hardware somewhere...") and they probably wouldn't care anyway unless more customers were affected.

If it's a middle box somewhere along the way, that's even worse. Even more different people potentially involved...

> Try power cycling.

Did that, no change. So, the plan now is to just sit it out, keeping an eye on things...

Regards, Sean
Re: Possible reasons for lost connection after DATA
Hi Hannes,

On 11.09.2014 at 20:48, Hannes Erven wrote:
> I remember a possibly similar situation back in 2008... the culprit was a not-fully-up-to-date Cisco ASA firewall that corrupted TCP SACK fields and hence had the remote site send RST. Anyways on our end the connection seemed to starve, just as you describe it. We detected that by comparing tcpdumps from both affected ends. Of course we had been lucky enough to have that happen with a business partner with competent IT people who we got a hold of, spotted the problem and also temporarily switched the feature off on their side to prove that this actually is the problem. A firmware upgrade on my client's firewall then fixed the issue. With a server hosted somewhere and incoming connections from big clusters, you might not be as lucky as that...

Yup. Looks like I'll just have to sit it out. This is just a small, private, low-traffic server, it's not like anyone at Amazon cares that I have problems. ;) And even if they did, I have neither the know-how nor the time and resources to do anything useful to fix it.

I'll just keep my eyes open to see if it gets any worse and recommend my users have their Amazon and newsletter stuff sent to other email addresses. The advantage of being small is that that really is a feasible option. :)

Regards, Sean
Re: Possible reasons for lost connection after DATA
Hi Mark,

On 11.09.2014 at 22:59, L. Mark Stone wrote:
> Any chance there is a UTM device in the email stream?

Possible, but I wouldn't know. This is a rented rootserver in some data center. I don't know their topology, and they probably wouldn't tell me even if I asked.

> We see lots of these errors when our SonicWalls do an RBL lookup, don't like the data in the email stream etc. The SonicWalls then just drop the connection and Postfix logs the drop.

I'll contact support and ask, that won't hurt.

Regards, Sean
Re: Possible reasons for lost connection after DATA
On Fri, Sep 12, 2014 at 10:36:51AM +0200, Sean Durkin wrote:

> If it's a middle box somewhere along the way, that's even worse. Even more different people potentially involved...

I would rent a backup MX server (deploy identical anti-spam policies, and lists of valid recipients, ...) at a different site, which relays mail to your primary. If no problems are observed at that server, make it the primary (cut over) and stop paying for the original server.

Let the vendor know that there is a network problem at the present location. You can refer them to this thread.

http://archives.neohapsis.com/archives/postfix/2014-09/thread.html#118

It should be in their interest to fix the problem. They can deploy network sniffers upstream from your machine and work with you to find out what's really happening.

Unless the problem is an NSL mandated network sniffer upstream of your machine. :-)

-- 
Viktor.
Re: Possible reasons for lost connection after DATA
Hello Wietse,

On 10.09.2014 at 21:52, Wietse Venema wrote:
> Slow performance is typical for TCP window scaling problems. Have you tried to turn it off in your kernel?

Yes, Viktor suggested that also and I tried it. It does not make a difference, the problem persists.

Regards, Sean
Re: Possible reasons for lost connection after DATA
Sean Durkin:
> Hello Wietse,
>
> On 10.09.2014 at 21:52, Wietse Venema wrote:
>> Slow performance is typical for TCP window scaling problems. Have you tried to turn it off in your kernel?
>
> Yes, Viktor suggested that also and I tried it. It does not make a difference, the problem persists.

What is the distribution of DATA sizes before failure? In your example I see numbers around 3kB, 9kB, 12kB.

Some failures are triggered by packet content, and may be resolved only by replacing hardware that operates marginally. Does the problem go away when you

- Replace the server (either the network card or the whole box)
- Replace the cable that connects the server to the network switch
- Replace the network switch that the server is plugged into
- Replace the cable that connects the switch to the router
- Replace the router
- And so on...

If you think this is a stupid idea, then you haven't been around long enough.

Wietse
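The question about the distribution of DATA sizes can be answered straight from the mail log. Below is a minimal sketch, assuming the log line format shown elsewhere in this thread ("lost connection after DATA (N bytes) from host[ip]"); the helper names and the "last two DNS labels" grouping rule are illustrative assumptions, not anything Postfix provides.

```python
import re
from collections import Counter

# Matches the smtpd message quoted in this thread.
LOST_RE = re.compile(
    r"lost connection after DATA \((\d+) bytes\) from (\S+)\[([0-9a-fA-F.:]+)\]"
)


def domain_of(hostname, labels=2):
    """Group hosts of one sending cluster by their last DNS labels (a rough heuristic)."""
    return ".".join(hostname.split(".")[-labels:])


def size_distribution(lines):
    """Count failures per (sending domain, bytes received before the connection was lost)."""
    counts = Counter()
    for line in lines:
        match = LOST_RE.search(line)
        if match:
            size, host, _ip = match.groups()
            counts[(domain_of(host), int(size))] += 1
    return counts
```

Feeding it the lines of the mail log and printing `counts.most_common()` shows at a glance whether each sender always fails at the same byte count, which would point towards a data-dependent fault rather than a random timeout.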
Re: Possible reasons for lost connection after DATA
Hi Viktor,

On 10.09.2014 at 23:03, Viktor Dukhovni wrote:
> This trace has an insane level of debugging turned on, to the point that syslogd is overwhelmed and is losing messages.
>
> PLEASE DISABLE ALL VERBOSE logging. NO -v options in master.cf, NO debug_peer_list, ...

Yes, sorry, I cranked up the debug level, since normal logging looks like this:

Sep 11 09:43:31 mail postfix/smtpd[25170]: connect from mail18-21.srv2.de[193.169.181.21]
Sep 11 09:43:31 mail postfix/smtpd[25170]: 2C076C4026A: client=mail18-21.srv2.de[193.169.181.21]
Sep 11 09:46:33 mail postfix/smtpd[25170]: lost connection after DATA (33290 bytes) from mail18-21.srv2.de[193.169.181.21]
Sep 11 09:46:33 mail postfix/smtpd[25170]: disconnect from mail18-21.srv2.de[193.169.181.21]
...
Sep 11 10:10:59 mail postfix/smtpd[25537]: connect from quattuorocto.psi.cust-cluster.com[195.140.187.48]
Sep 11 10:10:59 mail postfix/smtpd[25537]: 8736FC40A7D: client=quattuorocto.psi.cust-cluster.com[195.140.187.48]
Sep 11 10:36:44 mail postfix/smtpd[25537]: lost connection after DATA (36809 bytes) from quattuorocto.psi.cust-cluster.com[195.140.187.48]
Sep 11 10:36:44 mail postfix/smtpd[25537]: disconnect from quattuorocto.psi.cust-cluster.com[195.140.187.48]
..
Sep 11 10:38:48 mail postfix/smtpd[25913]: connect from smtp-out-127-108.amazon.com[176.32.127.108]
Sep 11 10:38:49 mail postfix/smtpd[25913]: 2558DC40458: client=smtp-out-127-108.amazon.com[176.32.127.108]
Sep 11 10:41:01 mail postfix/smtpd[25913]: lost connection after DATA (17511 bytes) from smtp-out-127-108.amazon.com[176.32.127.108]
Sep 11 10:41:01 mail postfix/smtpd[25913]: disconnect from smtp-out-127-108.amazon.com[176.32.127.108]

I didn't think that info alone was particularly useful...

> Please make sure that the /dev/log syslog socket is a dgram not a stream socket, and that mail logging is not synchronous.

Logging is not synchronous, and the socket is a datagram socket (it has been set up that way all along). No change, still the same problem, see above.

Meanwhile, I've managed to record a tcpdump of such a failed session. What exactly am I looking for there? I don't see anything out of the ordinary, except increasing delays between received packets from the external host, until the host sends a RST. It seems I simply do not receive any packets. The ones I get are immediately ACK'd, but then it takes seconds, and later minutes, until the next one arrives, until finally the remote host gives up and terminates the connection.

I'll try to get more dumps for comparison, including some from hosts that have no problems delivering.

There's no packet filtering or rate limiting on port 25, at least not on my machine. The hosting provider might have something there, I'd have to ask them...

Regards, Sean
Re: Possible reasons for lost connection after DATA
On Thu, Sep 11, 2014 at 02:36:51PM +0200, Sean Durkin wrote:

>> PLEASE DISABLE ALL VERBOSE logging. NO -v options in master.cf, NO debug_peer_list,
>
> Yes, sorry, I cranked up the debug level, since normal logging looks like this:
>
> Sep 11 09:43:31 mail postfix/smtpd[25170]: connect from mail18-21.srv2.de[193.169.181.21]
> Sep 11 09:43:31 mail postfix/smtpd[25170]: 2C076C4026A: client=mail18-21.srv2.de[193.169.181.21]
> Sep 11 09:46:33 mail postfix/smtpd[25170]: lost connection after DATA (33290 bytes) from mail18-21.srv2.de[193.169.181.21]
> Sep 11 09:46:33 mail postfix/smtpd[25170]: disconnect from mail18-21.srv2.de[193.169.181.21]

That's sufficient. It shows you're likely not using TLS here, and the time between message start and connection loss. The number of samples is rather small now. I would expect the session duration for each sending host to be essentially constant over multiple deliveries (equal to the remote machine's TCP timeout).

Possibilities include a broken network interface somewhere or a bad cable that corrupts IP or TCP packet headers given specific input patterns. If the problem is with the message payload, you could try enabling inbound TLS, perhaps these sending servers support it. I don't recall whether you already have TLS. If the problem is not with the payload, then TLS won't make any difference (some hosts will still fail even after TLS).

> I didn't think that info alone was particularly useful...

It is sufficient, and the verbose logs just add noise.

> Meanwhile, I've managed to record a tcpdump of such a failed session. What exactly am I looking for there?

Retransmissions from the sender of data with the same sequence number... Post tcpdump output (without packet content is fine), containing packets from just a single failed session.

> There's no packet filtering or rate limiting on port 25, at least not on my machine. The hosting provider might have something there, I'd have to ask them...

They probably have middle boxes, which might be the cause of the problem.

-- 
Viktor.
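To check whether the session duration per sending host really is roughly constant, the connect and "lost connection after DATA" lines can be paired per smtpd process. This is a minimal sketch under stated assumptions: the syslog line layout is the one quoted in this thread, the year is not in the log and is passed in (2014 here), month names are parsed with the English/C locale, and the function name is made up.

```python
import re
from datetime import datetime

# "Sep 11 09:43:31 mail postfix/smtpd[25170]: <text>"
LINE_RE = re.compile(
    r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ postfix/smtpd\[(\d+)\]: (.*)$"
)


def session_durations(lines, year=2014):
    """Return [(client, seconds)] from connect until the connection was lost after DATA."""
    started = {}  # smtpd pid -> (connect time, client)
    results = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue
        stamp, pid, text = match.groups()
        # syslog timestamps carry no year, so an assumed one is prepended.
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        if text.startswith("connect from "):
            started[pid] = (when, text[len("connect from "):])
        elif text.startswith("lost connection after DATA") and pid in started:
            began, client = started.pop(pid)
            results.append((client, (when - began).total_seconds()))
    return results
```

If the durations for one sender cluster around a single value across many attempts, that supports the reading that the remote TCP stack is simply timing out.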
Re: Possible reasons for lost connection after DATA
Hi Wietse,

On 11.09.2014 at 13:49, Wietse Venema wrote:
> What is the distribution of DATA sizes before failure? In your example I see numbers around 3kB, 9kB, 12kB.

At the moment, I see these sizes:

- always exactly 17511 bytes from smtp-out-127-*.amazon.com (today, seems to be only 3 different hosts trying)
- always exactly 49116 bytes from *.psi.cust-cluster.com (I've seen about 60 different hosts from there today)
- always exactly 33290 bytes from mail18-*.srv2.de (about a dozen different hosts)

It seems those are always the same 3 messages being re-tried constantly (when I look at them in the incoming queue folder, it's the same recipient and sender and the same message-ID, as far as I can tell). I have problems only with messages from these clusters, everything else seems unaffected (at least I haven't seen any "lost connection" messages from any other hosts as far as my logfiles go back).

Yesterday I had an additional message with exactly 17441 bytes on every try before failure from the Amazon cluster. That one was finally delivered completely early this morning, and has since disappeared from the cycle.

FWIW, I have received a handful of messages from the Amazon cluster that did not have any delays/problems yesterday and today, one of them even from one of the problematic hosts that can't deliver the other message.

> Some failures are triggered by packet content, and may be resolved only by replacing hardware that operates marginally. Does the problem go away when you
>
> - Replace the server (either the network card or the whole box)
> - Replace the cable that connects the server to the network switch
> - Replace the network switch that the server is plugged into
> - Replace the cable that connects the switch to the router
> - Replace the router
> - And so on...
>
> If you think this is a stupid idea, then you haven't been around long enough.

By no means do I think that's stupid. :) I'm only doing this server stuff for fun in my spare time, but my real job is in microelectronics and hardware, so I've had my share of mysterious and seemingly unexplainable stuff (ISI, crosstalk, low-frequency jitter, ground bounce, ESD-induced phenomena, you know the drill...).

Problem is that this box is a rented root server in a data center somewhere, so I don't have access to the hardware to try any of that. I can contact support, but they of course charge you for everything they do, and as long as I haven't ruled out that the reason is just some stupid configuration mistake on my part (or a routing/filtering issue at my hosting provider, or Amazon, or...), I don't want to start replacing hardware, obviously...

Regards, Sean
Re: Possible reasons for lost connection after DATA
On Thu, Sep 11, 2014 at 03:25:57PM +0200, Sean Durkin wrote:

> I can contact support, but they of course charge you for everything they do, and as long as I haven't ruled out that the reason is just some stupid configuration mistake on my part (or a routing/filtering issue at my hosting provider, or Amazon, or...), I don't want to start replacing hardware, obviously...

The Postfix configuration has no impact on the TCP layer, beyond optionally specifying the TCP window size. Since it is the TCP layer that fails, the problem is not related to the Postfix configuration.

Your PCAP files should demonstrate repeated retransmission of data, are the ACKs you're sending confirming receipt of packets that are sent repeatedly? In that case your ACKs are getting lost? Is there a sequence number gap in the data received from the server? In that case the remote server's data is getting lost. Does the capture confirm that window scaling is not in use? ...

-- 
Viktor.
Re: Possible reasons for lost connection after DATA
Sean Durkin:
> Hi Wietse,
>
> On 11.09.2014 at 13:49, Wietse Venema wrote:
>> What is the distribution of DATA sizes before failure? In your example I see numbers around 3kB, 9kB, 12kB.
>
> At the moment, I see these sizes:
>
> - always exactly 17511 bytes from smtp-out-127-*.amazon.com (today, seems to be only 3 different hosts trying)
> - always exactly 49116 bytes from *.psi.cust-cluster.com (I've seen about 60 different hosts from there today)
> - always exactly 33290 bytes from mail18-*.srv2.de (about a dozen different hosts)
>
> It seems those are always the same 3 messages being re-tried constantly (when I look at them in the incoming queue folder, it's the same recipient and sender and the same message-ID, as far as I can tell). I have problems only with messages from these clusters, everything else seems unaffected (at least I haven't seen any "lost connection" messages from any other hosts as far as my logfiles go back).
>
> Yesterday I had an additional message with exactly 17441 bytes on every try before failure from the Amazon cluster. That one was finally delivered completely early this morning, and has since disappeared from the cycle.

That increases my suspicion of a data-dependent error - some marginal cable/switch/router, perhaps some middle box with a memory bit error that requires a power cycle to clear the problem. If the problem is caused by crosstalk defect, then only physical replacement will solve it.

> Problem is that this box is a rented root server in a data center somewhere, so I don't have access to the hardware to try any of that. I can contact support, but they of course charge you for everything they do, and as long as I haven't ruled out that the reason is just some stupid configuration mistake on my part (or a routing/filtering issue at my hosting provider, or Amazon, or...), I don't want to start replacing hardware, obviously...

Try power cycling.

Wietse
Re: Possible reasons for lost connection after DATA
Hi Sean,

> Meanwhile, I've managed to record a tcpdump of such a failed session. What exactly am I looking for there?

I remember a possibly similar situation back in 2008... the culprit was a not-fully-up-to-date Cisco ASA firewall that corrupted TCP SACK fields and hence had the remote site send RST. Anyways on our end the connection seemed to starve, just as you describe it.

We detected that by comparing tcpdumps from both affected ends. Of course we had been lucky enough to have that happen with a business partner with competent IT people who we got a hold of, spotted the problem and also temporarily switched the feature off on their side to prove that this actually is the problem. A firmware upgrade on my client's firewall then fixed the issue.

With a server hosted somewhere and incoming connections from big clusters, you might not be as lucky as that...

best regards,
-hannes
Re: Possible reasons for lost connection after DATA
Sean Durkin:
> Meanwhile, I've managed to record a tcpdump of such a failed session. What exactly am I looking for there?

- The receiving host's window announcement in the TCP handshake and in subsequent ACKs.
- Whether there is a gap in the sender packet sequence numbers as seen by the receiving host. Such a gap means that a particular packet is being dropped.

Just to bore you with a few examples of bad middleboxes:

- Shortly after the first Postfix release there was a problem with traffic corruption due to a buggy middlebox (a Packeteer traffic shaper). The error had a very distinct signature.
- For many years, there were problems with CISCO PIX firewalls that inspected SMTP traffic but failed to properly handle the case that CRLF.CRLF happened to fall on a packet boundary.
- http://www.arschkrebs.de/postfix/postfix_cisco_pix_bugs.shtml has other examples where CISCO PIX/ASA firewalls will mis-handle SMTP traffic in various ways.

In your case, you may have to collaborate with someone who is willing to send large amounts of random email; hopefully some messages will trigger the bug, and then the sender and receiver can compare tcpdump recordings.

Wietse
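The sequence-number check described above can be done by eye in the tcpdump output, or roughly automated. The following is a minimal sketch, assuming the data segments from the sender have already been extracted as (relative sequence number, payload length) pairs in arrival order; the function name and labels are illustrative, and sequence-number wrap-around is ignored.

```python
def classify_segments(segments):
    """Label each (seq, length) data segment seen by the receiver.

    "in-order"              starts exactly where the previous data ended
    "gap"                   starts beyond the expected byte: something was dropped
    "retransmitted-or-late" starts before the expected byte: a retransmission
                            or a reordered packet filling an earlier hole
    """
    events = []
    next_expected = None  # first byte not yet covered by any segment seen so far
    for seq, length in segments:
        if next_expected is None or seq == next_expected:
            label = "in-order"
        elif seq < next_expected:
            label = "retransmitted-or-late"
        else:
            label = "gap"
        events.append((seq, length, label))
        end = seq + length
        if next_expected is None or end > next_expected:
            next_expected = end
    return events
```

Many "gap" entries mean the sender's data is being lost on the way in; repeated "retransmitted-or-late" entries for bytes that were already ACKed suggest the ACKs are being lost on the way out.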
Re: Possible reasons for lost connection after DATA
Any chance there is a UTM device in the email stream? We see lots of these errors when our SonicWalls do an RBL lookup, don't like the data in the email stream etc. The SonicWalls then just drop the connection and Postfix logs the drop. Hope that helps, Mark
Possible reasons for lost connection after DATA
Hello,

some of my users were complaining about losing incoming mail, namely Amazon shipping notifications, newsletters and such things that they were absolutely sure were sent out, but never reached their inbox.

After doing some digging, increasing log verbosity and such, I found a lot of this:

[... snip ...]
Sep 10 00:06:37 mail postfix/smtpd[23095]: lost connection after DATA (17511 bytes) from smtp-out-127-108.amazon.com[176.32.127.108]
Sep 10 00:06:48 mail postfix/smtpd[23111]: lost connection after DATA (22788 bytes) from mail18-92.srv2.de[193.169.181.92]
Sep 10 00:13:35 mail postfix/smtpd[23348]: lost connection after DATA (17441 bytes) from smtp-out-127-108.amazon.com[176.32.127.108]
Sep 10 00:27:49 mail postfix/smtpd[23454]: lost connection after DATA (22788 bytes) from mail18-97.srv2.de[193.169.181.97]
Sep 10 00:31:03 mail postfix/smtpd[23103]: lost connection after DATA (49116 bytes) from quinqueunus.psi.cust-cluster.com[195.140.187.51]
Sep 10 00:48:46 mail postfix/smtpd[23890]: lost connection after DATA (22788 bytes) from mail18-98.srv2.de[193.169.181.98]
Sep 10 00:51:53 mail postfix/smtpd[23564]: lost connection after DATA (49116 bytes) from quattuorocto.psi.cust-cluster.com[195.140.187.48]
Sep 10 01:09:16 mail postfix/smtpd[24565]: lost connection after DATA (17511 bytes) from smtp-out-127-108.amazon.com[176.32.127.108]
Sep 10 01:09:44 mail postfix/smtpd[24290]: lost connection after DATA (22788 bytes) from mail18-99.srv2.de[193.169.181.99]
Sep 10 01:16:14 mail postfix/smtpd[24674]: lost connection after DATA (17441 bytes) from smtp-out-127-107.amazon.com[176.32.127.107]
Sep 10 01:30:44 mail postfix/smtpd[24782]: lost connection after DATA (22788 bytes) from mail18-100.srv2.de[193.169.181.100]
Sep 10 01:51:42 mail postfix/smtpd[25198]: lost connection after DATA (22788 bytes) from mail18-105.srv2.de[193.169.181.105]
Sep 10 01:54:29 mail postfix/smtpd[24966]: lost connection after DATA (49116 bytes) from quattuorocto.psi.cust-cluster.com[195.140.187.48]
Sep 10 02:12:21 mail postfix/smtpd[25784]: lost connection after DATA (17511 bytes) from smtp-out-127-107.amazon.com[176.32.127.107]
Sep 10 02:12:42 mail postfix/smtpd[25656]: lost connection after DATA (22788 bytes) from mail18-106.srv2.de[193.169.181.106]
Sep 10 02:20:11 mail postfix/smtpd[25892]: lost connection after DATA (17441 bytes) from smtp-out-127-108.amazon.com[176.32.127.108]
Sep 10 02:33:41 mail postfix/smtpd[26077]: lost connection after DATA (22788 bytes) from mail18-107.srv2.de[193.169.181.107]
Sep 10 02:54:22 mail postfix/smtpd[26178]: lost connection after DATA (49116 bytes) from quinquenulla.psi.cust-cluster.com[195.140.187.50]
Sep 10 02:54:40 mail postfix/smtpd[26490]: lost connection after DATA (22788 bytes) from mail18-108.srv2.de[193.169.181.108]
Sep 10 03:14:16 mail postfix/smtpd[26585]: lost connection after DATA (49116 bytes) from quinquenulla.psi.cust-cluster.com[195.140.187.50]
Sep 10 03:15:39 mail postfix/smtpd[26905]: lost connection after DATA (22788 bytes) from mail18-113.srv2.de[193.169.181.113]
Sep 10 03:15:52 mail postfix/smtpd[27091]: lost connection after DATA (17511 bytes) from smtp-out-127-106.amazon.com[176.32.127.106]
Sep 10 03:23:15 mail postfix/smtpd[27214]: lost connection after DATA (17441 bytes) from smtp-out-127-107.amazon.com[176.32.127.107]
[... snip ...]

So to me that looks as if either the external SMTP server closes its connection before it is done with the entire message (the transferred size does not match the size passed through MAIL FROM: SIZE=XYZ), or the connection times out.

I can see in the logs and looking at the queue directories that these messages are put in the incoming queue by cleanup, are then found but skipped by qmgr (probably since they are not finished); they lurk in the incoming queue for a while and disappear about the time the lost connection message is put in the logs. So this points to a timeout.

The weird thing is that the data sizes are always the same for the same message-id being delivered, even if it is delivered via different servers from a cluster. If it were a timeout, network problem or such, I'd expect a more or less random value for the received data size, not always exactly the same.

This seems to be a new problem (meaning I only recently became aware of it; I don't know when it started, but I do know everything was working fine for years before).

This does not seem to be a problem of the specific hosts above; in between, I've been getting messages from the same hosts successfully. It seems that some messages go through on the first try, while other messages from the same hosts always lose the connection and are never delivered completely, ultimately resulting in the external host giving up; the message is then lost, and the user never knows about it.

The first question is: Can I rule out it's my fault? I don't have traffic shaping or ICMP blocking running on that host, which maybe could cause something like
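
[Editor's sketch, not part of the original message.] The observation that the byte count repeats across different cluster hosts can be checked mechanically. The following is a minimal Python sketch, assuming the log line layout quoted above; the function name `group_by_size` and the three-line sample are illustrative only.

```python
import re
from collections import defaultdict

# Matches smtpd log lines of the form quoted in the message, e.g.
# "... lost connection after DATA (17511 bytes) from host[ip]"
LOST_AFTER_DATA = re.compile(
    r"lost connection after DATA \((\d+) bytes\) from (\S+)\[([^\]]+)\]"
)

def group_by_size(lines):
    """Map each received byte count to the set of client IPs that dropped at that size."""
    groups = defaultdict(set)
    for line in lines:
        match = LOST_AFTER_DATA.search(line)
        if match:
            size, _host, ip = match.groups()
            groups[int(size)].add(ip)
    return dict(groups)

# Three lines taken from the log excerpt above.
sample = [
    "Sep 10 00:06:37 mail postfix/smtpd[23095]: lost connection after DATA (17511 bytes) from smtp-out-127-108.amazon.com[176.32.127.108]",
    "Sep 10 00:06:48 mail postfix/smtpd[23111]: lost connection after DATA (22788 bytes) from mail18-92.srv2.de[193.169.181.92]",
    "Sep 10 02:12:21 mail postfix/smtpd[25784]: lost connection after DATA (17511 bytes) from smtp-out-127-107.amazon.com[176.32.127.107]",
]

for size, ips in sorted(group_by_size(sample).items()):
    print(size, sorted(ips))
```

A byte count that maps to several different client addresses suggests the failure depends on the message content or size rather than on one particular sending host.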
Re: Possible reasons for lost connection after DATA
On 10.09.2014 at 09:56, Sean Durkin wrote:
> The first question is: Can I rule out it's my fault?

Have you changed anything in the last days/months (upgrades, updates, software, hardware)?

Please send your postfix config, and search the list archive for "lost connection after DATA".

Best Regards
Robert Schetterer

--
[*] sys4 AG
http://sys4.de, +49 (89) 30 90 46 64
Franziskanerstraße 15, 81669 München
Registered office: München, District Court München: HRB 199263
Executive board: Patrick Ben Koetter, Marc Schiffbauer
Chairman of the supervisory board: Florian Kirstein
Re: Possible reasons for lost connection after DATA
On Wed, Sep 10, 2014 at 09:56:48AM +0200, Sean Durkin wrote:
> Some of my users were complaining about losing incoming mail, namely Amazon shipping notifications, newsletters and such things that they were absolutely sure were sent out, but never reached their inbox. After doing some digging, increasing log verbosity and such, I found a lot of this:

Have you tried disabling TCP window scaling? It might be confusing some middle-box (firewall, NAT device, ...) on the path between the remote systems and your MTA.

> [... snip ...]
> Sep 10 00:06:37 mail postfix/smtpd[23095]: lost connection after DATA (17511 bytes) from smtp-out-127-108.amazon.com[176.32.127.108]

Post the hostname/IP address of the receiving system. Capture and examine a tcpdump recording of mail from one of the problem senders. Any sign of retransmission by the sender?

For at least one such session, post all related messages from the postfix/smtpd[pid] that occur between "connect from" and "disconnect from".

--
Viktor.
Re: Possible reasons for lost connection after DATA
Hi Robert,

On 10.09.2014 at 10:11, Robert Schetterer wrote:
> On 10.09.2014 at 09:56, Sean Durkin wrote:
> > The first question is: Can I rule out it's my fault?
>
> Have you changed anything in the last days/months (upgrades, updates, software, hardware)?

Hardware is unchanged. The Ubuntu postfix package was upgraded in August (2.9.6-1~12.04.2), but this problem seems to have started before that, looking at older logs. Apart from that, I don't see any updates directly related to the mail system in the past half year. There are of course other system/security updates, but how should I know which of these might possibly be responsible?

I haven't changed the basic Postfix configuration lately. I did add OpenDKIM a few months back, but I removed that a few days ago to rule it out as the cause. I also removed Spamassassin, any RBLs and Postgrey, which I normally have running there; that does not seem to make a difference. So I'm now back to a very basic Postfix conf, but the problem persists.

> Please send your postfix config,

Anonymized postfinger output is attached below.

> and search the list archive for "lost connection after DATA".

I did that, but I couldn't find anything that really applies in my case... most problems there are either related to DATA size 0 or to weird MTU issues. Mostly this seems to happen for connections from spam bots or misconfigured clients, and people tell you that you should just ignore it, but that doesn't really apply here.

I've tried getting a TCP dump of such an SMTP session, but since most of the interesting mail is coming from server clusters and the external hosts trying to deliver mail keep changing, I'm still waiting to catch a good one...
Regards, Sean Here, as promised, postfinger-output: --System Parameters-- mail_version = 2.9.6 hostname = mail uname = Linux mail 3.2.0-65-virtual #99-Ubuntu SMP Fri Jul 4 21:23:03 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux --Packaging information-- looks like this postfix comes from deb package: postfix-2.9.6-1~12.04.2 --main.cf non-default parameters-- alias_maps = $alias_database append_dot_mydomain = no biff = no broken_sasl_auth_clients = yes debug_peer_list = amazon.com, srv2.de, psi.cust-cluster.com, outbound.protection.outlook.com delay_warning_time = 4h disable_vrfy_command = yes html_directory = /usr/share/doc/postfix/html mailbox_size_limit = 0 mailbox_transport = lmtp:unix:/var/run/cyrus/socket/lmtp message_size_limit = 262144000 mydestination = localhost, localhost.$mydomain, $mydomain, mail.$mydomain, mysql:/etc/postfix/mysql-mydestination.cf myhostname = my.host.name mynetworks = 127.0.0.0/8, ip.add.re.ss myorigin = /etc/mailname proxy_interfaces = ip.add.re.ss recipient_delimiter = + sender_canonical_maps = mysql:/etc/postfix/mysql-canonical.cf smtp_destination_concurrency_limit = 1 smtp_destination_rate_delay = 1s smtpd_helo_required = yes smtpd_recipient_restrictions = permit_mynetworks, permit_sasl_authenticated, reject_unauth_pipelining, reject_unauth_destination, reject_unlisted_recipient reject_invalid_hostname, reject_non_fqdn_hostname, reject_non_fqdn_sender, reject_non_fqdn_recipient, reject_unknown_sender_domain, reject_unknown_recipient_domain, smtpd_sasl_auth_enable = yes smtpd_sender_restrictions = permit_sasl_authenticated, permit_mynetworks, reject_unauth_destination, reject_non_fqdn_sender, reject_unknown_sender_domain reject_unknown_recipient_domain, reject_unauth_pipelining smtpd_tls_auth_only = yes smtpd_tls_CAfile = /etc/postfix/ssl/ca.pem smtpd_tls_cert_file = /etc/postfix/ssl/my_cert.crt smtpd_tls_dh1024_param_file = /etc/postfix/ssl/dh_2048.pem smtpd_tls_dh512_param_file = /etc/postfix/ssl/dh_512.pem smtpd_tls_key_file = 
/etc/postfix/ssl/my_key.key smtpd_tls_protocols = !SSLv2 smtpd_tls_received_header = yes smtpd_tls_session_cache_database = btree:/var/lib/postfix/smtpd_scache smtpd_use_tls = yes smtp_tls_security_level = may smtp_tls_session_cache_database = btree:/var/lib/postfix/smtp_scache strict_rfc821_envelopes = yes tls_preempt_cipherlist = yes virtual_alias_maps = mysql:/etc/postfix/mysql-virtual.cf --master.cf-- smtp inet n - y -- smtpd submission inet n - y - - smtpd -o smtpd_etrn_restrictions=reject -o smtpd_enforce_tls=yes -o smtpd_sasl_auth_enable=yes -o smtpd_client_restrictions=permit_sasl_authenticated,reject smtpsinet n - y - - smtpd -o smtpd_etrn_restrictions=reject -o smtpd_tls_wrappermode=yes -o smtpd_sasl_auth_enable=yes pickupfifo n - - 60 1 pickup cleanup unix n - - - 0 cleanup qmgr fifo n - n 100 1 qmgr tlsmgrunix - - - 1000? 1 tlsmgr rewrite unix - - - - - trivial-rewrite bounceunix - - - - 0 bounce defer unix
Re: Possible reasons for lost connection after DATA
Hi Viktor,

On 10.09.2014 at 16:19, Viktor Dukhovni wrote:
> Have you tried disabling TCP window scaling? It might be confusing some middle-box (firewall, NAT device, ...) on path between the remote systems and your MTA.

I would not have thought of that... I've tried that now, but it does not seem to help.

> Post the hostname/IP address of the receiving system.

mail.tuxroot.de

> Capture and examine a tcpdump recording of mail from one of the problem senders. Any sign of retransmission by the sender?

I'm trying to get a good dump and will post results once I get one. Not that easy since the external hosts keep changing all the time. All mail affected comes from mass mailers that use server clusters, so I keep getting those messages from lots of different remote hosts. I'm waiting for it to happen from one of the hosts I've seen before.

Retransmission is tried numerous times, but for every retransmission the lost connection message is the same (identical number of bytes), as far as I can tell. That's one thing that puzzles me... So e.g. a message is delivered twice and each time the connection is lost after exactly 17441 bytes, even if it's different remote hosts trying; that's kind of odd.

> For at least one such session, post all related messages from the postfix/smtpd[pid] that occur between connect from and disconnect from.

Here's one: http://pastebin.com/twb3Z8Eg

And this seems to be the same message being redelivered later, from a different host, with the same result (connection lost after exactly 17441 bytes): http://pastebin.com/Qihbjz3w

What I do notice there is that in fact the connection seems to be *very* slow. In the above example, the whole process takes several minutes. I don't have any throughput or network speed issues with other hosts, though. I've tried sending mail from Gmail, Yahoo, my workplace, my former university, GMX, whatever; everything goes through on the first attempt each and every time, and quickly.

But it seems it is always slow for a few hosts.

Regards,
Sean
Re: Possible reasons for lost connection after DATA
Sean Durkin:
> Hi Viktor,
>
> On 10.09.2014 at 16:19, Viktor Dukhovni wrote:
> > Have you tried disabling TCP window scaling? It might be confusing some middle-box (firewall, NAT device, ...) on path between the remote systems and your MTA.
>
> I would not have thought of that... I've tried that now, but it does not seem to help.
>
> > Post the hostname/IP address of the receiving system.
>
> mail.tuxroot.de
>
> > Capture and examine a tcpdump recording of mail from one of the problem senders. Any sign of retransmission by the sender?
>
> I'm trying to get a good dump and will post results once I get one. Not that easy since the external hosts keep changing all the time. All mail affected comes from mass mailers that use server clusters, so I keep getting those messages from lots of different remote hosts. I'm waiting for it to happen from one of the hosts I've seen before.
>
> Retransmission is tried numerous times, but for every retransmission the lost connection message is the same (identical number of bytes), as far as I can tell. That's one thing that puzzles me... So e.g. a message is delivered twice and each time the connection is lost after exactly 17441 bytes, even if it's different remote hosts trying; that's kind of odd.

No, it means the same problem is happening. Same error, same symptom.

> What I do notice there is that in fact the connection seems to be *very* slow. In the above example, the whole process takes several minutes. I don't have any throughput or network speed issues with other hosts, though. I've tried sending mail from Gmail, Yahoo,

Slow performance is typical for TCP window scaling problems. Have you tried to turn it off in your kernel?

    # sysctl -w net.ipv4.tcp_window_scaling=0

To make it permanent:

    # echo 'net.ipv4.tcp_window_scaling = 0' >> /etc/sysctl.conf

Wietse
Re: Possible reasons for lost connection after DATA
On Wed, Sep 10, 2014 at 09:19:58PM +0200, Sean Durkin wrote: For at least one such session, post all related messages from the postfix/smtpd[pid] that occur between connect from and disconnect from. Here's one: http://pastebin.com/twb3Z8Eg This trace has an insane level of debugging turned on, to the point that syslogd is overwhelmed and is losing messages. PLEASE DISABLE ALL VERBOSE logging. NO -v options in master.cf, NO debug_peer_list, ... Please make sure that the /dev/log syslog socket is a dgram not a stream socket, and that mail logging is not synchronous. Then if the problem persists, report just normal Postfix logging, not the flood of noise from verbose logging. -- Viktor.
Re: Connection stats (was: Re: Why lost connection after RCPT when we reject?)
In response to Noel's followup, here is a proposal that can make Postfix troubleshooting / anomaly detection easier. This would reveal information that is currently available only by turning on verbose logging.

Proposal:

The Postfix SMTP server maintains two counters for each known command: one counter for the total number of times the command was issued during an SMTP session, and one counter for the number of normal completions (a 2XX reply status). These counters are reset before the server accepts the next SMTP connection. Perhaps there should also be a counter for unknown commands.

Upon disconnect, the Postfix SMTP server logs statistics for each command that has a non-zero counter. The syntax is:

    command-name=normal-completions/total

Example: a normal session with ESMTP handshake, one mail delivery transaction with one recipient, and closed with quit:

    ehlo=1/1 mail=1/1 rcpt=1/1 data=1/1 quit=1/1

An abnormal session that drops after a rejected recipient:

    helo=1/1 mail=1/1 rcpt=0/1

A normal ESMTP session with vrfy:

    ehlo=1/1 vrfy=1/1 quit=1/1

An abnormal session that drops after 10 rejected AUTH commands:

    ehlo=1/1 auth=0/10

The logging shows only counters for commands that were actually issued.

To save space we could replace n/n (two identical numbers) with just n. I don't know if this would actually simplify parsing.

As the examples show, this is really a small amount of text, so there is no reason to increase logging overhead by using a separate record. Since the stats would be logged at the end of a session, they can be logged in the disconnect record.

Wietse
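
[Editor's sketch, not part of the original message.] The two-counter bookkeeping in the proposal can be illustrated in a few lines of Python. This is not Postfix code; the class name `SessionStats` and its methods are hypothetical, and "normal completion" is taken literally as a 2XX reply, as the proposal words it.

```python
class SessionStats:
    """Per-session command counters: normal completions and total, per command."""

    def __init__(self):
        # command name -> [normal_completions, total]; dicts keep insertion order
        self.counts = {}

    def record(self, command, reply_code):
        """Count one command; it is a normal completion if the reply is 2XX."""
        entry = self.counts.setdefault(command.lower(), [0, 0])
        entry[1] += 1
        if 200 <= reply_code < 300:
            entry[0] += 1

    def format(self):
        """Render as 'command-name=normal-completions/total' for issued commands only."""
        return " ".join(
            f"{cmd}={ok}/{total}" for cmd, (ok, total) in self.counts.items()
        )

# The "abnormal session that drops after a rejected recipient" example:
stats = SessionStats()
stats.record("HELO", 250)
stats.record("MAIL", 250)
stats.record("RCPT", 550)
print(stats.format())
```

Because only issued commands get an entry, the output naturally omits zero counters, which matches the rule that only commands with a non-zero counter are logged.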
Re: Connection stats (was: Re: Why lost connection after RCPT when we reject?)
Wietse Venema:
> Since the stats would be logged at the end of a session, they can be logged in the disconnect record.

Hello Wietse,

the proposal sounds good. Such information could be helpful. Do you think it should be logged always or only while debugging? I usually run

    postconf -e debug_peer_list=$buggy_client

when searching for anomalies, and would expect such details only in that context.

Andreas
Re: Connection stats (was: Re: Why lost connection after RCPT when we reject?)
A normal ESMTP session with vrfy: ehlo=1/1 vrfy=1/1 quit=1/1 An abnormal session that drops after 10 rejected AUTH commands: ehlo=1/1 auth=0/10 The logging shows only counters for commands that were actually issued. To save space we could replace n/n (two identical numbers) with just n. I don't know if this would actually simplify parsing. On second consideration, the main benefit is that anomalies become easier to recognize. This is best demonstrated with a few examples: - normal ESMTP session with vrfy: ehlo=1 vrfy=1 quit=1 - abnormal session that drops after 10 rejected AUTH commands: ehlo=1 auth=0/10 Note that the / appears only when there is an anomaly. Here, the number of good auth commands (0) differs from the total number of auth commands (10). In a logfile analyzer, anomalies would match 'disconnect.*=\d+/\d+' (perl or pcre syntax). I think that we have a winner. Wietse
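
[Editor's sketch, not part of the original message.] The anomaly pattern quoted above works unchanged with Python's `re` module. The sample log lines below are hypothetical; they assume the compact counters are appended to the disconnect record as proposed in this thread.

```python
import re

# Pattern quoted in the message: a "/" appears only when good != total.
ANOMALY = re.compile(r"disconnect.*=\d+/\d+")

def is_anomalous(line):
    """Return True if a disconnect record contains at least one n/m counter."""
    return ANOMALY.search(line) is not None

# Hypothetical disconnect records in the compact format discussed above.
normal = "postfix/smtpd[123]: disconnect from test.example.com[192.0.2.100] ehlo=1 vrfy=1 quit=1"
abnormal = "postfix/smtpd[124]: disconnect from test.example.com[192.0.2.100] ehlo=1 auth=0/10"

for line in (normal, abnormal):
    print(is_anomalous(line), line)
```

Note that the "/" in the `postfix/smtpd` prefix does not cause a false match, since the pattern requires an "=" followed by digits on both sides of the slash, after the word "disconnect".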
Re: Why lost connection after RCPT when we reject?
On Fri, 11 Jul 2014 16:52:12 -0500 Noel Jones njo...@megan.vbhcs.org wrote: But there's really only one scenario. The only time postfix logs that message is when the connection is lost after RCPT. This is always caused by either A) a poorly written mail engine that improperly drops the connection, or B) a network problem. But 'A' has subsets. I want to ask the question Who connected, confirmed a valid address and disconnected without sending mail? Is that an unreasonable question without needing to do stateful log analysis? It's not that I am a stranger to that sort of log analysis but the Postfix engine already has that information. All I am saying is that it would be nice if the lost connection message (or a separate message) made note of the status at the time of disconnection. Actually, a separate log entry makes sense because I want to know that information whether the connection was dropped properly or not. In other words, after a disconnect of any sort I want to know if the sender sent an invalid address, a valid one which it followed up with DATA or a valid one that it did not follow up. Unfortunately, it's impossible to tell the difference from your end. All postfix knows is the connection was lost unexpectedly, and it would be improper to not log it. I understand that. In fact, I understand more about what gets logged now from this discussion which I thank you and others for. You're focusing on what happens before the lost connection. That's a job for log analysis tools. Which I was trying to avoid mainly because I analyze the logs every five minutes to see who to block. By the end of the day that gets very CPU intensive. I was hoping for a simple grep|sed solution. Maybe I need a single process that runs all day doing a tail -f on the log file. Hmm. I wonder if I can play games with syslogd so that mail logs go to maillog as well as a socket that I can read. I'll have to play with that. 
Of course, the spamware writers could easily fix this little artifact by sending QUIT after their payload is rejected rather than just dropping the connection. They already know this. Apparently (for now) they would rather save a few milliseconds and move on to the next target. This is what I am worried about. Right now I am just counting dropped connections but that's not a long term solution. -- D'Arcy J.M. Cain System Administrator, Vex.Net http://www.Vex.Net/ IM:da...@vex.net VoIP: sip:da...@vex.net
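
[Editor's sketch, not part of the original message.] The "count dropped connections per client and act above a threshold" approach can be sketched in Python as below. The log line layout follows the examples quoted elsewhere in this thread; the function name and the threshold value are illustrative assumptions.

```python
import re
from collections import Counter

# Matches e.g. "lost connection after RCPT from test.example.com[192.0.2.100]"
LOST_AFTER_RCPT = re.compile(r"lost connection after RCPT from \S+\[([^\]]+)\]")

def clients_over_threshold(lines, threshold):
    """Return {client_ip: count} for clients that dropped after RCPT at least `threshold` times."""
    counts = Counter()
    for line in lines:
        match = LOST_AFTER_RCPT.search(line)
        if match:
            counts[match.group(1)] += 1
    return {ip: n for ip, n in counts.items() if n >= threshold}

# Hypothetical log excerpt using documentation addresses.
sample = [
    "postfix/smtpd[101]: lost connection after RCPT from test.example.com[192.0.2.100]",
    "postfix/smtpd[102]: lost connection after RCPT from test.example.com[192.0.2.100]",
    "postfix/smtpd[103]: lost connection after RCPT from other.example.net[198.51.100.7]",
    "postfix/smtpd[104]: lost connection after DATA (17441 bytes) from other.example.net[198.51.100.7]",
]

print(clients_over_threshold(sample, threshold=2))
```

`lines` can be any iterable of strings, so the same function works on an open log file; it makes a single pass and keeps only one counter per client, which keeps a five-minute cron run cheap.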
Re: Why lost connection after RCPT when we reject?
On 7/11/2014 5:06 PM, Wietse Venema wrote: I suppose the recipient count could be added to the lost connection message. That might be modestly useful to the general user base. Maybe something like: postfix/smtpd[nnn]: lost connection after RCPT from test.example.com[192.0.2.100], nrcpt=N But that's just an idea, not a fully thought-out proposal. Feel free to submit a patch. I wonder, does that include rejected recipients? What about recipients in earlier transactions within the same SMTP session? Whatever we log would need to be easy to explain. Wietse My first thought was a simple number of valid recipients within this session before it disconnected, similar to the nrcpt counter in the cleanup log entry, or the recipient count in the policy service. This seems dirt simple to explain, which is always good. One could use this simple display to look for non-zero events worthy of investigation. Zero count shows a host that was already rejected for some reason and can be ignored. proposed log: postfix/smtpd[nnn]: lost connection after RCPT from test.example.com[192.0.2.100], nrcpt=N Probably more useful to help identify abuse would be a counter of valid/total RCPT commands within a session that drops. nrcpt=N/T where N is valid recipients, T is total RCPT commands. I think valid/total is easier to explain than valid/rejected, and makes a pretty fraction display. proposed log: postfix/smtpd[nnn]: lost connection after RCPT from test.example.com[192.0.2.100], nrcpt=N/T -- Noel Jones
Re: Why lost connection after RCPT when we reject?
On 12 Jul 2014, at 9:19, D'Arcy J.M. Cain wrote:
> I want to ask the question "Who connected, confirmed a valid address and disconnected without sending mail?" Is that an unreasonable question without needing to do stateful log analysis? It's not that I am a stranger to that sort of log analysis but the Postfix engine already has that information. All I am saying is that it would be nice if the lost connection message (or a separate message) made note of the status at the time of disconnection. Actually, a separate log entry makes sense because I want to know that information whether the connection was dropped properly or not. In other words, after a disconnect of any sort I want to know if the sender sent an invalid address, a valid one which it followed up with DATA or a valid one that it did not follow up.

A formally well-behaved address verifier is most obvious in Postfix syslog messages by obscurity. These are all the syslog messages generated by a manual SMTP session testing 6 addresses and disconnecting properly:

Sat Jul 12 18:54:45 toaster postfix/postscreen[65414] Info: CONNECT from [127.0.0.1]:64826 to [127.0.0.1]:25
Sat Jul 12 18:54:45 toaster postfix/postscreen[65414] Info: WHITELISTED [127.0.0.1]:64826
Sat Jul 12 18:54:45 toaster postfix/smtpd[65416] Info: connect from localhost[127.0.0.1]
Sat Jul 12 18:55:06 toaster postfix/smtpd[65416] Info: 3h9mff597Zz1XtTxB: client=localhost[127.0.0.1]
Sat Jul 12 18:56:18 toaster postfix/smtpd[65416] Info: disconnect from localhost[127.0.0.1]

IOW: Postfix logs nothing about what a client does unless it results in a failure reply or a queued message.

Absent a log message from Postfix itself, you could get a message out of a milter (e.g. MIMEDefang) for each RCPT.
Re: Why lost connection after RCPT when we reject?
Noel Jones: Probably more useful to help identify abuse would be a counter of valid/total RCPT commands within a session that drops. nrcpt=N/T where N is valid recipients, T is total RCPT commands. I think valid/total is easier to explain than valid/rejected, and makes a pretty fraction display. proposed log: postfix/smtpd[nnn]: lost connection after RCPT from test.example.com[192.0.2.100], nrcpt=N/T [I am making an exception to respond on-list to known people.] Interesting idea, but why not log these numbers with the disconnect event? This is logged for all SMTP sessions, whether or not the client terminates a session with the QUIT command. And more counters might be of interest: the distribution of accepted/total number of {helo/mail/rcpt/data/dot/other} commands would give the demographics of an SMTP session. If a client hangs up after sending MAIL FROM and that command was or was not accepted, then that is a clue that would otherwise only be available with verbose logging. Wietse
Connection stats (was: Re: Why lost connection after RCPT when we reject?)
On 7/12/2014 7:09 PM, Wietse Venema wrote: Noel Jones: Probably more useful to help identify abuse would be a counter of valid/total RCPT commands within a session that drops. nrcpt=N/T where N is valid recipients, T is total RCPT commands. I think valid/total is easier to explain than valid/rejected, and makes a pretty fraction display. proposed log: postfix/smtpd[nnn]: lost connection after RCPT from test.example.com[192.0.2.100], nrcpt=N/T [I am making an exception to respond on-list to known people.] Interesting idea, but why not log these numbers with the disconnect event? This is logged for all SMTP sessions, whether or not the client terminates a session with the QUIT command. Yes, that had occurred to me, but then you would still have to correlate the stats on the disconnect line with a premature lost connection earlier in the log. At least for now, the lost connection is a nice flag for possible abuse, and normal disconnects are less interesting. My goal is something a simple grep command can identify for further investigation. And more counters might be of interest: the distribution of accepted/total number of {helo/mail/rcpt/data/dot/other} commands would give the demographics of an SMTP session. If a client hangs up after sending MAIL FROM and that command was or was not accepted, then that is a clue that would otherwise only be available with verbose logging. I was trying to start with something easily implemented. Moving past that... A new connection stats line logged separately after the disconnect could include all that and more, would surely be used for things I haven't thought of, while still being fairly easy to explain. 
Sample log expanding on the earlier ideas, n=valid T=total S=seconds:

postfix/smtpd[nnn]: stats: test.example.com[192.0.2.100]:port, helo=n/T, auth=n/T, mail=n/T, rcpt=n/T, data=n/T, dot=n/T, quit=n/T, other=n/T, bytes=transmitted/received, duration=SSS.ss, TLS={none|anonymous|trusted|...}

I'm not sure how to indicate a lost connection in the sample above. Would including a quit=n/T be sufficient, 0/0 indicating a lost connection, 1/1 normal? Or would there need to be a separate end={normal|lost} indicator? Or maybe better for documenting, quit={yes|none}

Admittedly, I have no idea what it would take to add all that info. Not my intention to propose a 3-month project. But you asked, so I'll shoot for the moon.

-- Noel Jones
Why lost connection after RCPT when we reject?
There's a new trick in the spammer's bag of tricks. Companies like strikeiron and briteverify are springing up promising to verify email addresses so that senders can limit sending invalid emails to MTAs and thus wind up on their suspicious sender list. I can't think of a single legitimate use for this service. In order to find spammers or, as described above, their agents, I would like to find incomplete sessions. Unfortunately there is only one string to trigger on, the subject mentioned one. We get this message in at least three scenarios that I can see. One, someone sends email to an invalid address and we reject the balance of the session. Two, we reject the session because of an RBL. Three, someone is probing to find out if an address is valid. I don't really care about number two since I have already dealt with it but Postfix still sends out the log line for it even after it has already logged the reject. Is there any way to not have that logged or at least change the log message? Even number one is unnecessary since we already know that someone has attempted to send to an invalid user. It's really just the third case that we care about. If I do this I guess I can drop the User unknown in local recipient table test. I suppose there is a number four where the sender has a system issue and disconnects prematurely but this probably doesn't happen often enough to worry about especially if I only take note once the sender passes some reasonable threshold. -- D'Arcy J.M. Cain System Administrator, Vex.Net http://www.Vex.Net/ IM:da...@vex.net VoIP: sip:da...@vex.net
Re: Why lost connection after RCPT when we reject?
On 11.07.2014 at 21:02, D'Arcy J.M. Cain wrote:
> There's a new trick in the spammer's bag of tricks. Companies like strikeiron and briteverify are springing up promising to verify email addresses so that senders can limit sending invalid emails to MTAs and thus wind up on their suspicious sender list. I can't think of a single legitimate use for this service.
>
> In order to find spammers or, as described above, their agents, I would like to find incomplete sessions. Unfortunately there is only one string to trigger on, the subject mentioned one. We get this message in at least three scenarios that I can see. One, someone sends email to an invalid address and we reject the balance of the session. Two, we reject the session because of an RBL. Three, someone is probing to find out if an address is valid.
>
> I don't really care about number two since I have already dealt with it but Postfix still sends out the log line for it even after it has already logged the reject. Is there any way to not have that logged or at least change the log message? Even number one is unnecessary since we already know that someone has attempted to send to an invalid user. It's really just the third case that we care about. If I do this I guess I can drop the User unknown in local recipient table test.
>
> I suppose there is a number four where the sender has a system issue and disconnects prematurely but this probably doesn't happen often enough to worry about especially if I only take note once the sender passes some reasonable threshold

You did not provide any log, but "lost connection after RCPT" means the client did not quit the SMTP session properly, and so the client is broken:

* client connects
* client sends SMTP commands
* postfix answers with the REJECT status
* client blindly closes the connection
Re: Why lost connection after RCPT when we reject?
On Fri, 11 Jul 2014 21:06:59 +0200 li...@rhsoft.net li...@rhsoft.net wrote: this message in at least three scenarios that I can see. One, someone sends email to an invalid address and we reject the balance of the session. Two, we reject the session because of an RBL. Three, someone is probing to find out if an address is valid. I you did not provide any log but lost connection after RCPT means the client did not quit the smtp session properly and so the client is broken Are you sure that you read my message? That's only one of the three scenarios that generates that log. * client connects * client send SMTP commands * postfix answers with the REJECT status * client blindly closes the connection That's number one above. The problem is that Postfix logs the same message for scenario number three which is the one I want to isolate. Actually, number three can also look like number one when they try an invalid address so grepping for the lost connection log line would be fine if I could ignore number two. -- D'Arcy J.M. Cain System Administrator, Vex.Net http://www.Vex.Net/ IM:da...@vex.net VoIP: sip:da...@vex.net
Re: Why lost connection after RCPT when we reject?
On 11.07.2014 at 22:16, D'Arcy J.M. Cain wrote:
> On Fri, 11 Jul 2014 21:06:59 +0200 li...@rhsoft.net li...@rhsoft.net wrote:
> > > this message in at least three scenarios that I can see. One, someone sends email to an invalid address and we reject the balance of the session. Two, we reject the session because of an RBL. Three, someone is probing to find out if an address is valid. I
> >
> > you did not provide any log but lost connection after RCPT means the client did not quit the smtp session properly and so the client is broken
>
> Are you sure that you read my message? That's only one of the three scenarios that generates that log.
>
> > * client connects
> > * client send SMTP commands
> > * postfix answers with the REJECT status
> > * client blindly closes the connection
>
> That's number one above. The problem is that Postfix logs the same message for scenario number three which is the one I want to isolate. Actually, number three can also look like number one when they try an invalid address so grepping for the lost connection log line would be fine if I could ignore number two

No, you did not understand what I told you. The client is expected to close the SMTP session cleanly after receiving the reject answer from the server, and not just close the connection without saying bye. In other words:

* the 4 steps above are what is happening
* the first 3 steps are expected
* step 4 should be a QUIT from the client before it closes

Look at the last post of that thread: http://postfix.1071664.n5.nabble.com/lost-connection-after-RCPT-td903.html
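
[Editor's sketch, not part of the original message.] The difference between step 4 as observed and step 4 as expected can be shown from the client side. The Python sketch below assumes a session object with `ehlo()`, `mail()`, `rcpt()` and `quit()` methods, which is the interface `smtplib.SMTP` provides; `RecordingSession` is a hypothetical stand-in so the example runs without a network, and the addresses are placeholders.

```python
def try_recipient(session, sender, recipient):
    """Issue EHLO, MAIL FROM and RCPT TO, then always end the session with QUIT.

    Returns the RCPT reply code. The `finally` clause is the point: even
    when the server rejects the recipient, the client says QUIT instead of
    just closing the socket, so the server has no lost connection to log.
    """
    try:
        session.ehlo()
        session.mail(sender)
        code, _reply = session.rcpt(recipient)
        return code
    finally:
        session.quit()

class RecordingSession:
    """Hypothetical stand-in for smtplib.SMTP that rejects every recipient."""

    def __init__(self):
        self.commands = []

    def ehlo(self):
        self.commands.append("EHLO")
        return (250, b"ok")

    def mail(self, sender):
        self.commands.append("MAIL")
        return (250, b"ok")

    def rcpt(self, recipient):
        self.commands.append("RCPT")
        return (550, b"User unknown")

    def quit(self):
        self.commands.append("QUIT")
        return (221, b"bye")

session = RecordingSession()
code = try_recipient(session, "sender@example.com", "nobody@example.net")
print(code, session.commands)
```

With a real server one would pass an `smtplib.SMTP` instance connected to the target host instead of the stand-in; the call sequence stays the same.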
Re: Why lost connection after RCPT when we reject?
On 7/11/2014 3:16 PM, D'Arcy J.M. Cain wrote: On Fri, 11 Jul 2014 21:06:59 +0200 li...@rhsoft.net li...@rhsoft.net wrote: this message in at least three scenarios that I can see. One, someone sends email to an invalid address and we reject the balance of the session. Two, we reject the session because of an RBL. Three, someone is probing to find out if an address is valid. I you did not provide any log but lost connection after RCPT means the client did not quit the smtp session properly and so the client is broken Are you sure that you read my message? That's only one of the three scenarios that generates that log. But there's really only one scenario. The only time postfix logs that message is when the connection is lost after RCPT. This is always caused by either A) a poorly written mail engine that improperly drops the connection, or B) a network problem. Unfortunately, it's impossible to tell the difference from your end. All postfix knows is the connection was lost unexpectedly, and it would be improper to not log it. You're focusing on what happens before the lost connection. That's a job for log analysis tools. I suppose the recipient count could be added to the lost connection message. That might be modestly useful to the general user base. Maybe something like: postfix/smtpd[nnn]: lost connection after RCPT from test.example.com[192.0.2.100], nrcpt=N But that's just an idea, not a fully thought-out proposal. Feel free to submit a patch. Of course, the spamware writers could easily fix this little artifact by sending QUIT after their payload is rejected rather than just dropping the connection. They already know this. Apparently (for now) they would rather save a few milliseconds and move on to the next target. -- Noel Jones
Re: Why lost connection after RCPT when we reject?
Noel Jones:
> On 7/11/2014 3:16 PM, D'Arcy J.M. Cain wrote:
> > On Fri, 11 Jul 2014 21:06:59 +0200 li...@rhsoft.net wrote:
> > > > this message in at least three scenarios that I can see. One, someone
> > > > sends email to an invalid address and we reject the balance of the
> > > > session. Two, we reject the session because of an RBL. Three, someone
> > > > is probing to find out if an address is valid. I
> > >
> > > you did not provide any log but lost connection after RCPT means the
> > > client did not quit the smtp session properly and so the client is broken
> >
> > Are you sure that you read my message? That's only one of the three
> > scenarios that generates that log.
>
> But there's really only one scenario. The only time postfix logs
> that message is when the connection is lost after RCPT. This is
> always caused by either A) a poorly written mail engine that
> improperly drops the connection, or B) a network problem.
> Unfortunately, it's impossible to tell the difference from your end.
> All postfix knows is the connection was lost unexpectedly, and it
> would be improper to not log it.
>
> You're focusing on what happens before the lost connection. That's
> a job for log analysis tools.
>
> I suppose the recipient count could be added to the lost connection
> message. That might be modestly useful to the general user base.
> Maybe something like:
>
> postfix/smtpd[nnn]: lost connection after RCPT from test.example.com[192.0.2.100], nrcpt=N
>
> But that's just an idea, not a fully thought-out proposal. Feel
> free to submit a patch.

I wonder, does that include rejected recipients? What about recipients in earlier transactions within the same SMTP session? Whatever we log would need to be easy to explain.

Wietse

> Of course, the spamware writers could easily fix this little
> artifact by sending QUIT after their payload is rejected rather than
> just dropping the connection. They already know this. Apparently
> (for now) they would rather save a few milliseconds and move on to
> the next target.
>
> -- Noel Jones
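The two questions about what a recipient count would mean can be made concrete with a small bookkeeping sketch. This is not Postfix code; the class and field names are hypothetical, and it only shows that "nrcpt=N" has several plausible definitions that give different numbers for the same session.

```python
from dataclasses import dataclass


@dataclass
class RcptCounters:
    """Hypothetical per-session bookkeeping for RCPT TO commands."""
    accepted_txn: int = 0      # accepted in the current MAIL transaction
    rejected_txn: int = 0      # rejected in the current MAIL transaction
    accepted_session: int = 0  # accepted since the client connected
    rejected_session: int = 0  # rejected since the client connected

    def mail_from(self):
        # A new MAIL FROM starts a new transaction; only the
        # per-transaction counters reset.
        self.accepted_txn = 0
        self.rejected_txn = 0

    def rcpt(self, accepted):
        if accepted:
            self.accepted_txn += 1
            self.accepted_session += 1
        else:
            self.rejected_txn += 1
            self.rejected_session += 1

    def candidates(self):
        """Each value is one possible meaning of 'nrcpt'."""
        return {
            "accepted, this transaction": self.accepted_txn,
            "all, this transaction": self.accepted_txn + self.rejected_txn,
            "accepted, whole session": self.accepted_session,
            "all, whole session": self.accepted_session + self.rejected_session,
        }
```

For a session with one accepted and one rejected recipient in a first transaction, followed by a second transaction with one rejected recipient, the four definitions disagree, so whichever one is logged would have to be documented.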