On 08/09/2018 01:38 AM, Vieri Di Paola via Shorewall-users wrote:
> Hi,
>
> I've encountered a weird issue.
>
> I have 3 ISP links (WAN) connected to a shorewall gateway, each on their own
> NIC.
>
> After about 24 hours working with apparently no issues, I start to get
> network issues on only one of the three.
>
> A simple test from the shorewall gateway shows the following packet loss when
> pinging from the NIC that's connected to the failing ISP:
>
> # shorewall reset ; ping -n -I enp9s6 8.8.8.8 ; shorewall dump > /home/vieri/swdump
> Shorewall Counters Reset
> PING 8.8.8.8 (8.8.8.8) from 192.168.101.2 enp9s6: 56(84) bytes of data.
> 64 bytes from 8.8.8.8: icmp_seq=12 ttl=120 time=11.1 ms
> 64 bytes from 8.8.8.8: icmp_seq=13 ttl=120 time=11.1 ms
> 64 bytes from 8.8.8.8: icmp_seq=14 ttl=120 time=11.1 ms
> 64 bytes from 8.8.8.8: icmp_seq=15 ttl=120 time=10.9 ms
> 64 bytes from 8.8.8.8: icmp_seq=16 ttl=120 time=11.0 ms
> 64 bytes from 8.8.8.8: icmp_seq=17 ttl=120 time=11.0 ms
> 64 bytes from 8.8.8.8: icmp_seq=18 ttl=120 time=11.2 ms
> 64 bytes from 8.8.8.8: icmp_seq=19 ttl=120 time=11.1 ms
> 64 bytes from 8.8.8.8: icmp_seq=20 ttl=120 time=11.3 ms
> 64 bytes from 8.8.8.8: icmp_seq=21 ttl=120 time=11.1 ms
> 64 bytes from 8.8.8.8: icmp_seq=22 ttl=120 time=11.1 ms
> 64 bytes from 8.8.8.8: icmp_seq=23 ttl=120 time=11.2 ms
> 64 bytes from 8.8.8.8: icmp_seq=24 ttl=120 time=11.2 ms
> 64 bytes from 8.8.8.8: icmp_seq=25 ttl=120 time=11.2 ms
> 64 bytes from 8.8.8.8: icmp_seq=26 ttl=120 time=11.3 ms
> 64 bytes from 8.8.8.8: icmp_seq=27 ttl=120 time=11.2 ms
> 64 bytes from 8.8.8.8: icmp_seq=28 ttl=120 time=11.2 ms
> 64 bytes from 8.8.8.8: icmp_seq=29 ttl=120 time=11.3 ms
> 64 bytes from 8.8.8.8: icmp_seq=30 ttl=120 time=11.4 ms
> 64 bytes from 8.8.8.8: icmp_seq=31 ttl=120 time=11.3 ms
> 64 bytes from 8.8.8.8: icmp_seq=32 ttl=120 time=11.2 ms
> 64 bytes from 8.8.8.8: icmp_seq=33 ttl=120 time=11.3 ms
> 64 bytes from 8.8.8.8: icmp_seq=34 ttl=120 time=11.5 ms
> 64 bytes from 8.8.8.8: icmp_seq=35 ttl=120 time=11.3 ms
> 64 bytes from 8.8.8.8: icmp_seq=36 ttl=120 time=11.3 ms
> 64 bytes from 8.8.8.8: icmp_seq=37 ttl=120 time=11.4 ms
> 64 bytes from 8.8.8.8: icmp_seq=38 ttl=120 time=11.3 ms
> 64 bytes from 8.8.8.8: icmp_seq=39 ttl=120 time=11.3 ms
> 64 bytes from 8.8.8.8: icmp_seq=40 ttl=120 time=11.8 ms
> 64 bytes from 8.8.8.8: icmp_seq=41 ttl=120 time=11.6 ms
> 64 bytes from 8.8.8.8: icmp_seq=42 ttl=120 time=11.4 ms
> ^C
> --- 8.8.8.8 ping statistics ---
> 42 packets transmitted, 31 received, 26% packet loss, time 41698ms
> rtt min/avg/max/mdev = 10.981/11.303/11.890/0.212 ms
>
> The same test on the other 2 ISP links is OK.
>
> Hence, if ISP3 is the failing link and ISP1, ISP2 are OK, I try to move some
> traffic from ISP3 to ISP2 like so in the mangle file:
>
> MARK(2):P ${HMAN_EXTRA_CORP_NETWORKS}
> (2: ISP2, 3: ISP3,
> HMAN_EXTRA_CORP_NETWORKS="192.168.210.0/23,192.168.212.0/24")
>
> Now, the same ping test from the NIC that's connected to ISP2 starts showing
> the same packet loss stats while the test on the NIC connected to ISP3 has 0%
> packet loss.
>
> Wherever I move the traffic with this line in the mangle file, I get ICMP
> packet loss, i.e., moving it back to MARK(3) (ISP3) shows packet loss again
> only on that link.
>
> The shorewall dump taken during the test above is here:
>
> https://drive.google.com/open?id=1a6RlQhi2w_JJF9ZuFt6aI9G-JAQbFC9n
>
> Finally, to top it all off, if I reboot the modem/router on the ISP3 link,
> all's well again (no packet loss whatsoever, no matter which rule I use in
> the mangle file). Until the next day...
>
> So, how can I go about this to determine what's causing this issue? My
> Internet Provider has already passed the buck and thinks that it's an issue
> with my shorewall gateway...
>
> Help appreciated.

I don't see anything in the dump that explains this behavior. I do, however,
notice this conntrack table entry:

icmp 1 29 src=192.168.101.2 dst=8.8.8.8 type=8 code=0 id=3380 packets=42 bytes=3528 src=8.8.8.8 dst=192.168.101.2 type=0 code=0 id=3380 packets=31 bytes=2604 mark=3 use=1

'mark=3' indicates that the flow is using the correct interface (enp9s6).

My suggestion for debugging this further is to use a packet sniffer to see
what is happening on the wire during the period of loss:

a) Are the echo-request packets being sent?
b) If not, is there unsuccessful ARPing occurring?

-Tom
-- 
Tom Eastep        \ Q: What do you get when you cross a mobster with
Shoreline,         \    an international standard?
Washington, USA     \ A: Someone who makes you an offer you can't
http://shorewall.org \   understand
                      \_______________________________________________
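
As a cross-check on the conntrack entry quoted in the reply, here is a minimal sketch that pulls the two per-direction packet counters out of that line and compares them with the ping summary. It assumes a POSIX shell and a grep that supports -o; the variable names (entry, sent, replies, lost, loss_pct) are only illustrative.

```shell
# Conntrack entry copied verbatim from the reply.
entry='icmp 1 29 src=192.168.101.2 dst=8.8.8.8 type=8 code=0 id=3380 packets=42 bytes=3528 src=8.8.8.8 dst=192.168.101.2 type=0 code=0 id=3380 packets=31 bytes=2604 mark=3 use=1'

# The first packets= field is the original direction (echo-request),
# the second is the reply direction (echo-reply).
set -- $(printf '%s\n' "$entry" | grep -o 'packets=[0-9]*' | cut -d= -f2)
sent=$1
replies=$2

lost=$((sent - replies))
loss_pct=$((lost * 100 / sent))   # integer division, rounds down

echo "sent=$sent replies=$replies lost=$lost loss=${loss_pct}%"
```

The counters agree with ping's own summary (42 transmitted, 31 received, 26% loss). That only shows netfilter saw all 42 echo-requests; whether they reached the wire is what a capture on enp9s6 during the loss period would answer, for instance with tcpdump (assuming it is installed) and a filter such as 'arp or icmp'.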
