I have been using PF for a long time and I really appreciate it. Guys,
you are doing a very good job.
I'm successfully using PF 2.0-RC3, both on Alix (embedded) and
installed on PCs, with IPsec VPN, OVPN, CARP for failover, WiFi, WAN
load balancing on 2 different ADSL lines, etc. Everything is working
really well.
But a few days ago I ran into a problem that I can neither understand
nor resolve. I upgraded a pair of PF installations on PCs (configured
for failover with CARP, 5 NICs) from release 1.2.3 to 2.0-RC3.
With version 1.2.3 and all the previous updates, everything had been
working fine.
After the upgrade to 2.0-RC3 I had just one problem, but because of
it I had to revert to 1.2.3.
Here is the problem.
After the upgrade to version 2.0-RC3 every protocol was filtered
by PF as expected, but the SMTP traffic from the e-mail provider's MTA
(postfix) to the internal mail relay server behaved strangely. On
the internal mail relay I saw the connection established from the
provider's MTA, but at the moment of receiving the mail body the
connection hung and was reset at timeout. Only small e-mails, sent as
a test by the provider, were delivered successfully.
After reverting to 1.2.3, everything works fine again.
Inspecting the traffic through a mirror port on the switch
(and verifying it by sniffing directly on PF)
shows the different behaviours reported below.
Here is the data captured with 2.0-RC3 for an attempt by the
MTA to send an e-mail message to the internal mail relay.
226970 684.515289 ProviderMtaIp -> MyMailRelayIp TCP 57715 >
smtp [SYN] Seq=0 Win=5840 Len=0 MSS=1460 TSV=68980421 TSER=0
WS=7
226971 684.515768 MyMailRelayIp -> ProviderMtaIp TCP smtp >
57715 [SYN, ACK] Seq=0 Ack=1 Win=64240 Len=0 MSS=1460 WS=0 TSV=0
TSER=0
226973 684.526527 ProviderMtaIp -> MyMailRelayIp TCP 57715 >
smtp [ACK] Seq=1 Ack=1 Win=5888 Len=0 TSV=68980427 TSER=0
226977 684.529562 MyMailRelayIp -> ProviderMtaIp SMTP S: 220
mail.mycompany.com ESMTP Service (Lotus Domino Release 8.5.1FP2)
ready at Wed, 27 Jul 2011 12:52:04 +0200
226978 684.537048 ProviderMtaIp -> MyMailRelayIp TCP 57715 >
smtp [ACK] Seq=1 Ack=110 Win=5888 Len=0 TSV=68980443 TSER=625882
226979 684.537070 ProviderMtaIp -> MyMailRelayIp SMTP C: EHLO
fedora.provider.org
226980 684.537868 MyMailRelayIp -> ProviderMtaIp SMTP S:
250-mail.mycompany.com Hello fedora.provider.org
([ProviderMtaIp]), pleased to meet you | 250-TLS | 250-ETRN |
250-STARTTLS | 250-DSN | 250-SIZE 18432000 | 250 PIPELINING
226992 684.551654 ProviderMtaIp -> MyMailRelayIp SMTP C: MAIL
FROM:<user@domain> SIZE=86045 | RCPT TO:<user@domain> | DATA
226996 684.552697 MyMailRelayIp -> ProviderMtaIp SMTP S: 250
user@domain Sender OK | 250 user@domain Recipient OK | 354 Enter
message, end with "." on a line by itself
227503 686.321903 MyMailRelayIp -> ProviderMtaIp SMTP [TCP
Retransmission] S: 250 user@domain Sender OK | 250 user@domain
Recipient OK | 354 Enter message, end with "." on a line by
itself
227505 686.329892 ProviderMtaIp -> MyMailRelayIp TCP [TCP
Previous segment lost] 57715 > smtp [ACK] Seq=3001 Ack=404
Win=8064 Len=0 TSV=68982235 TSER=625901 SLE=274 SRE=404
343904 1013.873824 MyMailRelayIp -> ProviderMtaIp TCP smtp >
57715 [FIN, ACK] Seq=404 Ack=105 Win=64136 Len=0 TSV=629175
TSER=68980454
343909 1013.883338 ProviderMtaIp -> MyMailRelayIp TCP 57715 >
smtp [RST] Seq=105 Win=0 Len=0
As far as I can see, the traffic between the provider's MTA and the mail
relay starts and initially proceeds normally, but packet 226996 appears
to get lost. It is then retransmitted (227503) and acknowledged by
ProviderMtaIp, but with a greater Seq number (Seq=3001, while the relay
has only acknowledged up to 105). It looks like the mail data packets
have been lost.
Then, after about 5 minutes, the connection times out, the mail
relay sends a FIN and ProviderMtaIp resets the connection.
PF's logs show nothing about dropped packets related to this
connection.
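To put a number on that gap, here is a minimal Python sketch that uses
only values visible in the 2.0-RC3 capture above. The variable names are
illustrative, and the 1448-byte segment size is an assumption (MSS 1460
minus 12 bytes for the TCP timestamp option); it is not read from the
capture.

```python
# Values copied from the 2.0-RC3 capture (relative sequence numbers).
relay_last_ack = 105      # Ack in the relay's FIN,ACK (packet 343904)
provider_next_seq = 3001  # Seq of the provider's ACK (packet 227505)

# Bytes the provider has already sent but the relay never acknowledged.
missing_bytes = provider_next_seq - relay_last_ack

# ASSUMPTION: a full-size segment carries MSS (1460) minus 12 bytes of
# TCP timestamp option. This size does not appear in the capture itself.
assumed_segment_payload = 1460 - 12

# How many assumed full-size segments the gap corresponds to.
segments, remainder = divmod(missing_bytes, assumed_segment_payload)
print(missing_bytes, segments, remainder)
```

Under that assumption the gap of 2896 bytes would correspond to exactly
two full-size segments that never reached the mail relay.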
Here is what happens after reverting to 1.2.3 (everything works fine).
...
19377 46.958958 ProviderMtaIp -> MyMailRelayIp SMTP C: MAIL
FROM:<user@domain> SIZE=56892 | RCPT TO:<user@domain> | DATA
19378 46.960259 MyMailRelayIp -> ProviderMtaIp SMTP S: 250
user@domain Sender OK | 250 user@domain Recipient OK | 354 Enter
message, end with "." on a line by itself
19386 46.971715 ProviderMtaIp -> MyMailRelayIp SMTP C: DATA
fragment, 1248 bytes
19387 46.974048 ProviderMtaIp -> MyMailRelayIp SMTP C: DATA
fragment, 1248 bytes
19388 46.974082 ProviderMtaIp -> MyMailRelayIp SMTP C: DATA
fragment, 1248 bytes
19389 46.974425 MyMailRelayIp -> ProviderMtaIp TCP smtp > 33359
[ACK] Seq=420 Ack=2617 Win=64240 Len=0 TSV=706364 TSER=77029773
19393 46.987139 ProviderMtaIp -> MyMailRelayIp SMTP C: DATA
fragment, 1248 bytes
19394 46.987663 MyMailRelayIp -> ProviderMtaIp TCP smtp > 33359
[ACK] Seq=420 Ack=5113 Win=63248 Len=0 TSV=706365 TSER=77029773
19395 46.987686 MyMailRelayIp -> ProviderMtaIp TCP [TCP Dup ACK
19394#1] smtp > 33359 [ACK] Seq=420 Ack=5113 Win=63248 Len=0
TSV=706365 TSER=77029773
19396 46.989640 ProviderMtaIp -> MyMailRelayIp SMTP C: DATA
fragment, 1248 bytes
19397 46.989661 ProviderMtaIp -> MyMailRelayIp SMTP C: DATA
fragment, 1248 bytes
19398 46.990342 MyMailRelayIp -> ProviderMtaIp TCP smtp > 33359
[ACK] Seq=420 Ack=7609 Win=64240 Len=0 TSV=706365 TSER=77029787
19407 46.999000 ProviderMtaIp -> MyMailRelayIp SMTP C: DATA
fragment, 1248 bytes
19408 46.999026 ProviderMtaIp -> MyMailRelayIp SMTP C: DATA
fragment, 1248 bytes
...
19492 47.067918 ProviderMtaIp -> MyMailRelayIp SMTP C: DATA
fragment, 1248 bytes
19493 47.068921 ProviderMtaIp -> MyMailRelayIp SMTP C: DATA
fragment, 1248 bytes
19494 47.069291 MyMailRelayIp -> ProviderMtaIp TCP smtp > 33359
[ACK] Seq=420 Ack=54809 Win=64240 Len=0 TSV=706365 TSER=77029856
19495 47.070644 ProviderMtaIp -> MyMailRelayIp SMTP C: DATA
fragment, 1248 bytes
19507 47.078352 ProviderMtaIp -> MyMailRelayIp IMF from: "user"
<user@domain>, subject xxx Masked Subject xxx, (text/plain)
(text/html)
19508 47.078846 MyMailRelayIp -> ProviderMtaIp TCP smtp > 33359
[ACK] Seq=420 Ack=57023 Win=63530 Len=0 TSV=706365 TSER=77029856
19509 47.078867 MyMailRelayIp -> ProviderMtaIp TCP [TCP Dup ACK
19508#1] smtp > 33359 [ACK] Seq=420 Ack=57023 Win=63530 Len=0
TSV=706365 TSER=77029856
19517 47.084957 MyMailRelayIp -> ProviderMtaIp SMTP S: 250
Message accepted for delivery
19518 47.085306 MyMailRelayIp -> ProviderMtaIp SMTP S: 221
mail.mycompany.com SMTP Service closing transmission channel
19519 47.085405 MyMailRelayIp -> ProviderMtaIp TCP smtp > 33359
[FIN, ACK] Seq=519 Ack=57023 Win=63530 Len=0 TSV=706365
TSER=77029856
19527 47.096111 ProviderMtaIp -> MyMailRelayIp TCP 33359 > smtp
[FIN, ACK] Seq=57023 Ack=519 Win=8064 Len=0 TSV=77029898
TSER=706365
19528 47.096609 MyMailRelayIp -> ProviderMtaIp TCP smtp > 33359
[ACK] Seq=520 Ack=57024 Win=63530 Len=0 TSV=706366 TSER=77029898
19529 47.098002 ProviderMtaIp -> MyMailRelayIp TCP 33359 > smtp
[ACK] Seq=57024 Ack=520 Win=8064 Len=0 TSV=77029900 TSER=706365
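As a quick cross-check of the 1.2.3 capture above, the following Python
sketch compares the Ack values of three consecutive ACKs from the mail
relay with the 1248-byte DATA fragments reported in the same capture.
It only uses numbers shown above; the variable names are illustrative.

```python
# Ack values from the relay's ACKs in the 1.2.3 capture
# (packets 19389, 19394 and 19398).
acks = [2617, 5113, 7609]

# Payload size reported for each "DATA fragment" line in the capture.
fragment_size = 1248

# Bytes newly acknowledged by each successive ACK.
deltas = [later - earlier for earlier, later in zip(acks, acks[1:])]

# Number of whole fragments covered by each advance, plus any remainder.
fragments_per_ack = [divmod(delta, fragment_size) for delta in deltas]
print(deltas, fragments_per_ack)
```

Each ACK advances by 2496 bytes, that is, exactly two 1248-byte
fragments, so in this capture the data segments arrive and are
acknowledged in order.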
I've also tried changing the MTU value, with no effect.
The mail relay is Lotus Domino Release 8.5.1FP2 and the MTA runs
Fedora, kernel 2.6.18-1, with postfix 2.2.8-1.2.
During the tests the provider also tried Debian, kernel 2.6.26-2, with
postfix 2.5.5-1.1.
The provider's MTA is on the Internet (WAN side of the PF), while the
mail relay is in one of the DMZs of the PF, with a public IP and no NAT.
Both the WAN and the DMZ run over CARP for fault tolerance.
The provider has been delivering e-mail to all its other customers
with no problems, and asserts that all its servers are strictly
compliant with the RFCs.
The router connecting to the Internet is set up with MTU=1476.
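For reference, here is a small Python sketch of how TCP segment sizes
compare with that router MTU. It is only a sanity check of the
arithmetic, not a diagnosis. The header sizes assume IPv4 and TCP
without options other than timestamps, the helper names are
hypothetical, and the 1448-byte payload is an assumed full-size segment
for the advertised MSS of 1460; only the 1248-byte payload is taken from
the 1.2.3 capture.

```python
IP_HEADER = 20          # IPv4 header without options
TCP_HEADER = 20         # TCP header without options
TCP_TIMESTAMP_OPT = 12  # timestamp option including padding
ROUTER_MTU = 1476       # MTU configured on the Internet router

def ip_packet_size(payload):
    """Total IPv4 packet size for a TCP segment with the given payload."""
    return IP_HEADER + TCP_HEADER + TCP_TIMESTAMP_OPT + payload

def fits(payload, mtu=ROUTER_MTU):
    """True if the packet can cross the link without fragmentation."""
    return ip_packet_size(payload) <= mtu

# 1248: fragment size seen in the 1.2.3 capture.
# 1448: ASSUMED full-size segment (MSS 1460 minus 12 bytes of options).
for payload in (1248, 1448):
    print(payload, ip_packet_size(payload), fits(payload))

# Largest payload that still fits within the router MTU.
largest_payload = ROUTER_MTU - IP_HEADER - TCP_HEADER - TCP_TIMESTAMP_OPT
print(largest_payload)
```

A 1248-byte payload gives a 1300-byte packet, which fits. A 1448-byte
payload would give a 1500-byte packet, which exceeds 1476 and would have
to be fragmented or, with the DF bit set, dropped with an ICMP
"fragmentation needed" reply. Whether such packets are actually involved
here is not visible in the captures.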
Does anyone have an idea of what is going on, or has anyone already
seen similar behaviour?
Any suggestion will be appreciated.
Thank you in advance.
Odette Nsaka