What I suspect is happening here is that the ARP entry for the "next hop"
being used expires, and the mechanism used to send packets out
doesn't properly poke the Solaris ARP code into doing the right thing.
Confirming that will take some thought, although I'm
sure dtrace could help on s10.
Thanks for looking into it, Darren.
Wouldn't a broken next hop ARP entry affect all remote connection
attempts? I can recreate situations whereby one remote host gets RST/ACK,
but another host (even in the same remote network) gets silence.
Or perhaps you're talking about the routing maps reported by
netstat -arn
which seem to correlate directly with whether return-rst works
(i.e. if the remote host is listed there, return-rst works; otherwise it doesn't).
I don't know whether this is a cause or an effect of the problem.
Does "ipfstat" show any "failures" next to "fastroute"?
Absolutely no fastroute errors:
> ipfstat
bad packets: in 0 out 0
IPv6 packets: in 0 out 0
input packets: blocked 159367 passed 1172163 nomatch 0 counted 0 short 2
output packets: blocked 2 passed 856405 nomatch 1 counted 0 short 0
input packets logged: blocked 158992 passed 3
output packets logged: blocked 0 passed 0
packets logged: input 0 output 0
log failures: input 293 output 0
fragment state(in): kept 0 lost 0 not fragmented 0
fragment state(out): kept 0 lost 0 not fragmented 0
packet state(in): kept 330766 lost 0
packet state(out): kept 368331 lost 3
ICMP replies: 1056 TCP RSTs sent: 131967
Invalid source(in): 0
Result cache hits(in): 36806 (out): 2
IN Pullups succeeded: 17046 failed: 0
OUT Pullups succeeded: 36124 failed: 0
Fastroute successes: 133020 failures: 0
TCP cksum fails(in): 0 (out): 0
IPF Ticks: 556568
Packet log flags set: (0)
none
Running ipfstat in quick succession, I see the count go up in step
with the log entries for blocked connections. If IPF
is really generating RST/ACK packets, this would give credence to the
theory that the kernel is swallowing them up somewhere.
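
For anyone who wants to repeat the "run ipfstat twice and compare" check
without eyeballing the numbers, here is a rough Python sketch. It only
parses text in the format pasted above; the function names are my own, the
second snapshot is made-up illustrative data, and other IPFilter versions
may label the counters differently.

```python
import re

# Counters to track. The labels are taken from the ipfstat output quoted
# in this thread; treat them as assumptions for any other IPFilter version.
PATTERNS = {
    "input_blocked": re.compile(r"input packets:\s*blocked\s+(\d+)"),
    "icmp_replies": re.compile(r"ICMP replies:\s*(\d+)"),
    "rst_sent": re.compile(r"TCP RSTs sent:\s*(\d+)"),
    "fastroute_ok": re.compile(r"Fastroute successes:\s*(\d+)"),
    "fastroute_fail": re.compile(r"Fastroute successes:\s*\d+\s+failures:\s*(\d+)"),
}


def parse_ipfstat(text):
    """Return the counters found in one ipfstat dump as a dict of ints."""
    counters = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            counters[name] = int(match.group(1))
    return counters


def delta(before, after):
    """Return how much each counter grew between two snapshots."""
    return {name: after[name] - before[name] for name in before if name in after}


# First snapshot: lines copied from the output in this message.
SNAPSHOT_A = """\
input packets: blocked 159367 passed 1172163 nomatch 0 counted 0 short 2
ICMP replies: 1056 TCP RSTs sent: 131967
Fastroute successes: 133020 failures: 0
"""

# Second snapshot: HYPOTHETICAL values, invented only to show the comparison.
SNAPSHOT_B = """\
input packets: blocked 159370 passed 1172200 nomatch 0 counted 0 short 2
ICMP replies: 1056 TCP RSTs sent: 131970
Fastroute successes: 133023 failures: 0
"""

if __name__ == "__main__":
    changes = delta(parse_ipfstat(SNAPSHOT_A), parse_ipfstat(SNAPSHOT_B))
    for name, growth in sorted(changes.items()):
        print(f"{name}: +{growth}")
```

If the RST and fastroute-success counters grow by the same amount as the
blocked count while the failure counter stays at zero, IPF believes it
handed the resets off successfully, which points further down the stack.
On a live box the two snapshots would come from actually running ipfstat
a few seconds apart instead of the strings above.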
Joseph Tam <[EMAIL PROTECTED]>