I have been using PF for a long time and I really appreciate it. Guys,
you are doing a very good job.

I'm successfully running PF 2.0-RC3, both on Alix (embedded) boards and
on PCs, with IPsec VPN, OpenVPN, CARP for failover, WiFi, WAN load
balancing across 2 different ADSL lines, etc. Everything is working
really well.

But a few days ago I ran into a problem that I can neither understand
nor resolve: I upgraded a pair of PF boxes installed on PCs (configured
for failover with CARP, 5 NICs) from release 1.2.3 to 2.0-RC3.

With version 1.2.3 and all the previous updates, everything had been
working fine.

After the upgrade to 2.0-RC3 I had just one problem, but it was serious
enough that I had to revert to 1.2.3.

Here is the problem.

After the upgrade to version 2.0-RC3, every protocol was filtered by PF
as expected, but the SMTP traffic from the e-mail provider's MTA
(postfix) to the internal mail relay server behaved strangely. On the
internal mail relay I saw the connection established from the provider's
MTA, but at the point where the mail body should have been received the
connection hung and was reset at timeout. Only small e-mails, sent as a
test by the provider, were delivered successfully.

After reverting to 1.2.3, everything works fine again.

Inspecting the traffic through a mirror port on the switch (and
verifying it by sniffing directly on PF) shows the different behaviours
reported below.

Here is the data captured with 2.0-RC3 during an attempt by the MTA to
send an e-mail message to the internal mail relay.


        226970 684.515289 ProviderMtaIp -> MyMailRelayIp TCP 57715 >
        smtp [SYN] Seq=0 Win=5840 Len=0 MSS=1460 TSV=68980421 TSER=0
        WS=7
        226971 684.515768 MyMailRelayIp -> ProviderMtaIp TCP smtp >
        57715 [SYN, ACK] Seq=0 Ack=1 Win=64240 Len=0 MSS=1460 WS=0 TSV=0
        TSER=0
        226973 684.526527 ProviderMtaIp -> MyMailRelayIp TCP 57715 >
        smtp [ACK] Seq=1 Ack=1 Win=5888 Len=0 TSV=68980427 TSER=0
        226977 684.529562 MyMailRelayIp -> ProviderMtaIp SMTP S: 220
        mail.mycompany.com ESMTP Service (Lotus Domino Release 8.5.1FP2)
        ready at Wed, 27 Jul 2011 12:52:04 +0200
        226978 684.537048 ProviderMtaIp -> MyMailRelayIp TCP 57715 >
        smtp [ACK] Seq=1 Ack=110 Win=5888 Len=0 TSV=68980443 TSER=625882
        226979 684.537070 ProviderMtaIp -> MyMailRelayIp SMTP C: EHLO
        fedora.provider.org
        226980 684.537868 MyMailRelayIp -> ProviderMtaIp SMTP S:
        250-mail.mycompany.com Hello fedora.provider.org
        ([ProviderMtaIp]), pleased to meet you | 250-TLS | 250-ETRN |
        250-STARTTLS | 250-DSN | 250-SIZE 18432000 | 250 PIPELINING
        226992 684.551654 ProviderMtaIp -> MyMailRelayIp SMTP C: MAIL
        FROM:<user@domain> SIZE=86045 | RCPT TO:<user@domain> | DATA
        226996 684.552697 MyMailRelayIp -> ProviderMtaIp SMTP S: 250
        user@domain Sender OK | 250 user@domain Recipient OK | 354 Enter
        message, end with "." on a line by itself
        227503 686.321903 MyMailRelayIp -> ProviderMtaIp SMTP [TCP
        Retransmission] S: 250 user@domain Sender OK | 250 user@domain
        Recipient OK | 354 Enter message, end with "." on a line by
        itself
        227505 686.329892 ProviderMtaIp -> MyMailRelayIp TCP [TCP
        Previous segment lost] 57715 > smtp [ACK] Seq=3001 Ack=404
        Win=8064 Len=0 TSV=68982235 TSER=625901 SLE=274 SRE=404
        343904 1013.873824 MyMailRelayIp -> ProviderMtaIp TCP smtp >
        57715 [FIN, ACK] Seq=404 Ack=105 Win=64136 Len=0 TSV=629175
        TSER=68980454
        343909 1013.883338 ProviderMtaIp -> MyMailRelayIp TCP 57715 >
        smtp [RST] Seq=105 Win=0 Len=0


As far as I can see, the traffic between the provider's MTA and the mail
relay starts and initially proceeds normally, but packet 226996 appears
to get lost; it is then retransmitted (227503) and acknowledged by
ProviderMtaIp, but with a greater Seq number (3001, while the relay has
only acknowledged up to 105). It looks like the mail data packets have
been lost. Then, after about 5 minutes, the connection times out, the
mail relay sends a FIN and ProviderMtaIp resets the connection.
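
As a rough cross-check of the sequence numbers in the 2.0-RC3 capture,
the following Python sketch computes how many bytes the provider's MTA
had already sent that the mail relay never acknowledged, and how long
the connection sat idle before the FIN. The 12-byte timestamp option
overhead is an assumption (the usual padded size when TCP timestamps
are in use, as the TSV/TSER fields suggest), and the names are only
illustrative.

```python
# Values copied from the 2.0-RC3 capture above.
CLIENT_SEQ_IN_ACK = 3001      # Seq in packet 227505 (ProviderMtaIp -> MyMailRelayIp)
RELAY_ACK_AT_CLOSE = 105      # Ack in the relay's [FIN, ACK], packet 343904
NEGOTIATED_MSS = 1460         # MSS announced in both SYN packets
TIMESTAMP_OPTION_BYTES = 12   # ASSUMPTION: timestamp option padded to 12 bytes

T_LAST_RELAY_REPLY = 684.552697   # packet 226996 (354 Enter message)
T_RELAY_FIN = 1013.873824         # packet 343904


def unacknowledged_bytes(client_seq: int, relay_ack: int) -> int:
    """Bytes the sender counts as sent that the receiver never acknowledged."""
    return client_seq - relay_ack


def payload_per_full_segment(mss: int, option_bytes: int) -> int:
    """TCP payload of a full-sized segment once per-segment options are deducted."""
    return mss - option_bytes


gap = unacknowledged_bytes(CLIENT_SEQ_IN_ACK, RELAY_ACK_AT_CLOSE)
segment_payload = payload_per_full_segment(NEGOTIATED_MSS, TIMESTAMP_OPTION_BYTES)
full_segments, remainder = divmod(gap, segment_payload)
stall_seconds = T_RELAY_FIN - T_LAST_RELAY_REPLY

print(f"unacknowledged bytes: {gap}")
print(f"full segments: {full_segments}, remainder: {remainder}")
print(f"idle time before FIN: {stall_seconds:.1f} s")
```

If the assumption holds, the gap corresponds to exactly two full-sized
segments, which would be consistent with the first data packets after
DATA never reaching the relay.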

In PF's logs there is nothing about dropped packets related to this
connection.



Here is what happens after reverting to 1.2.3 (everything works fine).

        ...
        19377  46.958958 ProviderMtaIp -> MyMailRelayIp SMTP C: MAIL
        FROM:<user@domain> SIZE=56892 | RCPT TO:<user@domain> | DATA
        19378  46.960259 MyMailRelayIp -> ProviderMtaIp SMTP S: 250
        user@domain Sender OK | 250 user@domain Recipient OK | 354 Enter
        message, end with "." on a line by itself
        19386  46.971715 ProviderMtaIp -> MyMailRelayIp SMTP C: DATA
        fragment, 1248 bytes
        19387  46.974048 ProviderMtaIp -> MyMailRelayIp SMTP C: DATA
        fragment, 1248 bytes
        19388  46.974082 ProviderMtaIp -> MyMailRelayIp SMTP C: DATA
        fragment, 1248 bytes
        19389  46.974425 MyMailRelayIp -> ProviderMtaIp TCP smtp > 33359
        [ACK] Seq=420 Ack=2617 Win=64240 Len=0 TSV=706364 TSER=77029773
        19393  46.987139 ProviderMtaIp -> MyMailRelayIp SMTP C: DATA
        fragment, 1248 bytes
        19394  46.987663 MyMailRelayIp -> ProviderMtaIp TCP smtp > 33359
        [ACK] Seq=420 Ack=5113 Win=63248 Len=0 TSV=706365 TSER=77029773
        19395  46.987686 MyMailRelayIp -> ProviderMtaIp TCP [TCP Dup ACK
        19394#1] smtp > 33359 [ACK] Seq=420 Ack=5113 Win=63248 Len=0
        TSV=706365 TSER=77029773
        19396  46.989640 ProviderMtaIp -> MyMailRelayIp SMTP C: DATA
        fragment, 1248 bytes
        19397  46.989661 ProviderMtaIp -> MyMailRelayIp SMTP C: DATA
        fragment, 1248 bytes
        19398  46.990342 MyMailRelayIp -> ProviderMtaIp TCP smtp > 33359
        [ACK] Seq=420 Ack=7609 Win=64240 Len=0 TSV=706365 TSER=77029787
        19407  46.999000 ProviderMtaIp -> MyMailRelayIp SMTP C: DATA
        fragment, 1248 bytes
        19408  46.999026 ProviderMtaIp -> MyMailRelayIp SMTP C: DATA
        fragment, 1248 bytes
        ...
        19492  47.067918 ProviderMtaIp -> MyMailRelayIp SMTP C: DATA
        fragment, 1248 bytes
        19493  47.068921 ProviderMtaIp -> MyMailRelayIp SMTP C: DATA
        fragment, 1248 bytes
        19494  47.069291 MyMailRelayIp -> ProviderMtaIp TCP smtp > 33359
        [ACK] Seq=420 Ack=54809 Win=64240 Len=0 TSV=706365 TSER=77029856
        19495  47.070644 ProviderMtaIp -> MyMailRelayIp SMTP C: DATA
        fragment, 1248 bytes
        19507  47.078352 ProviderMtaIp -> MyMailRelayIp IMF from: "user"
        <user@domain>, subject xxx Masked Subject xxx,  (text/plain)
        (text/html)
        19508  47.078846 MyMailRelayIp -> ProviderMtaIp TCP smtp > 33359
        [ACK] Seq=420 Ack=57023 Win=63530 Len=0 TSV=706365 TSER=77029856
        19509  47.078867 MyMailRelayIp -> ProviderMtaIp TCP [TCP Dup ACK
        19508#1] smtp > 33359 [ACK] Seq=420 Ack=57023 Win=63530 Len=0
        TSV=706365 TSER=77029856
        19517  47.084957 MyMailRelayIp -> ProviderMtaIp SMTP S: 250
        Message accepted for delivery
        19518  47.085306 MyMailRelayIp -> ProviderMtaIp SMTP S: 221
        mail.mycompany.com SMTP Service closing transmission channel
        19519  47.085405 MyMailRelayIp -> ProviderMtaIp TCP smtp > 33359
        [FIN, ACK] Seq=519 Ack=57023 Win=63530 Len=0 TSV=706365
        TSER=77029856
        19527  47.096111 ProviderMtaIp -> MyMailRelayIp TCP 33359 > smtp
        [FIN, ACK] Seq=57023 Ack=519 Win=8064 Len=0 TSV=77029898
        TSER=706365
        19528  47.096609 MyMailRelayIp -> ProviderMtaIp TCP smtp > 33359
        [ACK] Seq=520 Ack=57024 Win=63530 Len=0 TSV=706366 TSER=77029898
        19529  47.098002 ProviderMtaIp -> MyMailRelayIp TCP 33359 > smtp
        [ACK] Seq=57024 Ack=520 Win=8064 Len=0 TSV=77029900 TSER=706365




I also tried changing the MTU value, with no effect.

The mail relay is Lotus Domino Release 8.5.1FP2 and the MTA is
Fedora, kernel 2.6.18-1, running postfix 2.2.8-1.2.
During the tests the provider also tried Debian, kernel 2.6.26-2,
running postfix 2.5.5-1.1.

The provider's MTA is on the Internet (WAN side of the PF), while the
mail relay is in one of the DMZs of the PF, with a public IP and no NAT.
Both WAN and DMZ run over CARP for fault tolerance.

The provider has been delivering e-mails to all its other customers
with no problem, and asserts that all its servers are strictly compliant
with the RFCs.
The router connecting to the Internet is set up with MTU=1476.
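
To put the MTU figure next to the captures, here is a small Python
sketch of the header arithmetic. It assumes IPv4 without options and a
12-byte TCP timestamp option on every data segment; neither is
confirmed by the captures beyond the TSV/TSER fields, and the helper
names are made up for illustration.

```python
IPV4_HEADER = 20       # ASSUMPTION: IPv4 header without options
TCP_HEADER = 20        # base TCP header
TCP_TIMESTAMPS = 12    # ASSUMPTION: timestamp option, padded, on data segments
OVERHEAD = IPV4_HEADER + TCP_HEADER + TCP_TIMESTAMPS

ROUTER_MTU = 1476          # from the router configuration mentioned above
NEGOTIATED_MSS = 1460      # MSS seen in the SYN packets of the 2.0-RC3 capture
FRAGMENT_SEEN_123 = 1248   # "DATA fragment, 1248 bytes" in the 1.2.3 capture


def ip_packet_size(tcp_payload: int) -> int:
    """Total IP packet size for a given TCP payload under the assumptions above."""
    return tcp_payload + OVERHEAD


def max_tcp_payload(mtu: int) -> int:
    """Largest TCP payload that still fits in one packet of the given MTU."""
    return mtu - OVERHEAD


def fits(tcp_payload: int, mtu: int) -> bool:
    """True if a segment with this payload needs no fragmentation at this MTU."""
    return ip_packet_size(tcp_payload) <= mtu


# Payload of a full-sized segment: the MSS minus the timestamp option.
full_payload = NEGOTIATED_MSS - TCP_TIMESTAMPS

print("max payload at router MTU:", max_tcp_payload(ROUTER_MTU))
for label, payload in (("1.2.3 fragments", FRAGMENT_SEEN_123),
                       ("full-sized segments", full_payload)):
    print(label, ip_packet_size(payload), "bytes, fits:", fits(payload, ROUTER_MTU))
```

Under these assumptions the 1248-byte fragments seen with 1.2.3 give
1300-byte packets, which fit within 1476, while full-sized segments for
the negotiated MSS of 1460 would give 1500-byte packets, which do not.
I cannot say from this alone whether that is the cause.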

Does anyone have an idea of what is going on, or has anyone already
seen similar behaviour?
Any suggestion will be appreciated.

Thank you in advance.

Odette Nsaka
   

