Re: [vnet] [epair] epair interface stops working after some time

2018-03-30 Thread Reshad Patuck
Hi,

I have filed a bug for this issue and cc'd both of you in it.

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=227100

Best,

Reshad

On 29 March 2018 6:39:13 PM IST, Kristof Provost  wrote:
>On 29 Mar 2018, at 14:48, Reshad Patuck wrote:
>> pulling the 'net.link.epair.netisr_maxqlen' down does seem to make
>> this occur faster.
>>
>Good, I think my hypothesis about where the issue lies is correct then.
>You should be able to avoid (or at least reduce the frequency of) the
>issue by increasing the value on your system(s).
>
>> When I dropped it to 2 like Kristof did, a box which was not
>> previously exhibiting the problem began to show the same symptoms.
>> Bumping it back up to 2100 did not restore the functionality (I don't
>> know if it should).
>>
>It’s good to know this. It doesn’t surprise me that it doesn’t fix
>things.
>Something’s wrong in the code which handles an overflow of the netisr
>queue in the epair driver. Once that happens the IFF_DRV_OACTIVE flag
>gets set, and we keep enqueuing outside the netisr queue.
>Somehow we never end up back in epair_nh_drainedcpu(), so the flag
>never gets cleared and the driver never recovers.
>
>> I will create a PR for this later today with all the information I 
>> have gathered so that we can have it all in one place.
>>
>Thanks. Please cc me on it. I’ll see if I can figure out what the 
>problem is, but we might need someone smarter, so cc Bjoern too.
>
>Regards,
>Kristof
___
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"


Re: [vnet] [epair] epair interface stops working after some time

2018-03-29 Thread Kristof Provost

On 29 Mar 2018, at 14:48, Reshad Patuck wrote:
pulling the 'net.link.epair.netisr_maxqlen' down does seem to make 
this occur faster.

​

Good, I think my hypothesis about where the issue lies is correct then.
You should be able to avoid (or at least reduce the frequency of) the 
issue by increasing the value on your system(s).


When I dropped it to 2 like Kristof did, a box which was not previously 
exhibiting the problem began to show the same symptoms.
Bumping it back up to 2100 did not restore the functionality (I don't 
know if it should).

It’s good to know this. It doesn’t surprise me that it doesn’t fix 
things.
Something’s wrong in the code which handles an overflow of the netisr 
queue in the epair driver. Once that happens the IFF_DRV_OACTIVE flag 
gets set, and we keep enqueuing outside the netisr queue.
Somehow we never end up back in epair_nh_drainedcpu(), so the flag never 
gets cleared and the driver never recovers.
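
To make the suspected failure mode concrete, here is a toy model in Python. It is not the epair driver code: the names `ToyEpair`, `transmit`, `drain` and `ifq_maxlen` are hypothetical. All it borrows from the description above is a bounded netisr queue, a fallback queue, and an OACTIVE-style flag that only a drain callback clears. The `drain_callback_fires` switch is an assumption standing in for "we never end up back in epair_nh_drainedcpu()".

```python
from collections import deque


class ToyEpair:
    """Toy model only: a bounded 'netisr' queue plus a fallback queue."""

    def __init__(self, netisr_maxqlen, ifq_maxlen, drain_callback_fires=True):
        self.netisr = deque()
        self.ifq = deque()            # fallback queue used while oactive is set
        self.netisr_maxqlen = netisr_maxqlen
        self.ifq_maxlen = ifq_maxlen
        self.oactive = False          # stands in for IFF_DRV_OACTIVE
        self.drain_callback_fires = drain_callback_fires
        self.delivered = []

    def transmit(self, pkt):
        """Return True if the packet was queued, False if it was dropped."""
        if not self.oactive and len(self.netisr) < self.netisr_maxqlen:
            self.netisr.append(pkt)
            return True
        # The netisr queue overflowed, or we are already in fallback mode.
        self.oactive = True
        if len(self.ifq) < self.ifq_maxlen:
            self.ifq.append(pkt)
            return True
        return False                  # fallback queue full: packet is lost

    def drain(self):
        """Receive side: deliver queued packets, then maybe run the drain callback."""
        while self.netisr:
            self.delivered.append(self.netisr.popleft())
        if self.oactive and self.drain_callback_fires:
            # Healthy path: move the backlog across and clear the flag.
            while self.ifq:
                self.delivered.append(self.ifq.popleft())
            self.oactive = False


def run(drain_callback_fires):
    ep = ToyEpair(netisr_maxqlen=2, ifq_maxlen=3,
                  drain_callback_fires=drain_callback_fires)
    results = [ep.transmit(i) for i in range(4)]     # overflow the netisr queue
    ep.drain()
    results += [ep.transmit(i) for i in range(4, 10)]
    ep.drain()
    return ep, results


healthy, _ = run(drain_callback_fires=True)
stuck, stuck_results = run(drain_callback_fires=False)
print("healthy delivered:", len(healthy.delivered), "oactive:", healthy.oactive)
print("stuck delivered:  ", len(stuck.delivered), "oactive:", stuck.oactive)
```

In the stuck run of this model, raising `netisr_maxqlen` afterwards would change nothing, because `transmit` never looks at the netisr queue while the flag is set. That is at least consistent with the observation that going back to 2100 did not restore connectivity.
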


I will create a PR for this later today with all the information I 
have gathered so that we can have it all in one place.


Thanks. Please cc me on it. I’ll see if I can figure out what the 
problem is, but we might need someone smarter, so cc Bjoern too.


Regards,
Kristof



Re: [vnet] [epair] epair interface stops working after some time

2018-03-29 Thread Reshad Patuck
​
Hi,

pulling the 'net.link.epair.netisr_maxqlen' down does seem to make this occur 
faster.

When I dropped it to 2 like Kristof did, a box which was not previously 
exhibiting the problem began to show the same symptoms.
Bumping it back up to 2100 did not restore the functionality (I don't know if 
it should).

I will create a PR for this later today with all the information I have 
gathered so that we can have it all in one place.
Till then I still have access to a box which is naturally in this state.
Let me know if there is anything you would like me to check.
​
Thanks for the help,
​
Reshad

On 28 March 2018 12:32:44 AM IST, Kristof Provost  wrote:
>On 27 Mar 2018, at 20:59, Reshad Patuck wrote:
>> The current value of 'net.link.epair.netisr_maxqlen' is 2100, I will
>> make it 210.
>> Will this require a reboot? or can I just change the sysctl and reload
>> the epair module?
>>
>You shouldn’t need to reboot or reload the epair module. When I set it 
>to 2 on my box it pretty much immediately lost connectivity over the 
>epair interfaces.
>
>I’d expect you to get hit by the bug relatively quickly now, so be 
>aware of that.
>
>Regards,
>Kristof


Re: [vnet] [epair] epair interface stops working after some time

2018-03-27 Thread Reshad Patuck
Excellent, will give it a try on a box that I never have this problem on.

Will let you know if the symptoms are the same when I trigger it.

Best,

Reshad

On 28 March 2018 12:32:44 AM IST, Kristof Provost  wrote:
>On 27 Mar 2018, at 20:59, Reshad Patuck wrote:
>> The current value of 'net.link.epair.netisr_maxqlen' is 2100, I will
>> make it 210.
>> Will this require a reboot? or can I just change the sysctl and reload
>> the epair module?
>>
>You shouldn’t need to reboot or reload the epair module. When I set it 
>to 2 on my box it pretty much immediately lost connectivity over the 
>epair interfaces.
>
>I’d expect you to get hit by the bug relatively quickly now, so be 
>aware of that.
>
>Regards,
>Kristof


Re: [vnet] [epair] epair interface stops working after some time

2018-03-27 Thread Kristof Provost

On 27 Mar 2018, at 20:59, Reshad Patuck wrote:
The current value of 'net.link.epair.netisr_maxqlen' is 2100, I will 
make it 210.
Will this require a reboot? or can I just change the sysctl and reload 
the epair module?

​
You shouldn’t need to reboot or reload the epair module. When I set it 
to 2 on my box it pretty much immediately lost connectivity over the 
epair interfaces.


I’d expect you to get hit by the bug relatively quickly now, so be 
aware of that.


Regards,
Kristof


Re: [vnet] [epair] epair interface stops working after some time

2018-03-27 Thread Reshad Patuck
Hi,
​
@Kristof:
The current value of 'net.link.epair.netisr_maxqlen' is 2100, I will make it 
210.
Will this require a reboot? or can I just change the sysctl and reload the 
epair module?
​
@Bjoern:
here is the output to 'netstat -Q'
```
# netstat -Q
Configuration:
Setting                        Current        Limit
Thread count                         1            1
Default queue limit                256        10240
Dispatch policy                 direct          n/a
Threads bound to CPUs         disabled          n/a

Protocols:
Name   Proto QLimit Policy Dispatch Flags
ip         1    256   flow  default   ---
igmp       2    256 source  default   ---
rtsock     3    256 source  default   ---
arp        4    256 source  default   ---
ether      5    256 source   direct   ---
ip6        6    256   flow  default   ---
epair      8   2100    cpu  default   CD-

Workstreams:
WSID CPU   Name     Len WMark    Disp'd  HDisp'd   QDrops    Queued   Handled
   0   0   ip         0    30  11409267        0        0  13574317  24983409
   0   0   igmp       0     0         0        0        0         0         0
   0   0   rtsock     0     1         0        0        0        42        42
   0   0   arp        0     0  61109751        0        0         0  61109751
   0   0   ether      0     0 115098020        0        0         0 115098020
   0   0   ip6        0    10  36157577        0        0   4273274  40430846
   0   0   epair      0  2100         0        0   210972 303785724 303785724
```
​
I still have access to a machine in this state, but will need to reset it to a 
working state soon.
​
Please let me know if there is any information you would like me to get from 
this machine before I reset it.
​
Best,
​
Reshad

On 27 March 2018 8:18:29 PM IST, "Bjoern A. Zeeb" wrote:
>On 27 Mar 2018, at 14:40, Kristof Provost wrote:
>
>> (Re-cc freebsd-net, because this is useful information)
>>
>> On 27 Mar 2018, at 13:07, Reshad Patuck wrote:
>>> The epair crash occurred again today running the epair module code 
>>> with the added dtrace sdt providers.
>>> ​
>>> Running the same command as last time, 'dtrace -n ::epair\*:' returns
>>> the following:
>>> ```
>>> CPU     ID                    FUNCTION:NAME
>> …
>>>   0  66499   epair_transmit_locked:enqueued
>>> ```
>>
>>> Looks like it's filled up a queue somewhere and is dropping
>>> connections post that.
>>>
>>> The value of the 'error' is 55. I can see both the ifp and m structs
>>> but don't know what to look for in them.
>>>
>> That’s useful. Error 55 is ENOBUFS, which in IFQ_ENQUEUE() means 
>> we’re hitting _IF_QFULL().
>> There don’t seem to be counters for that drop though, so that makes 
>> it hard to diagnose without these extra probe points.
>> It also explains why you don’t really see any drop counters 
>> incrementing.
>>
>> The fact that this queue is full presumably means that the other side
>> is not reading packets off it any more.
>> That’s supposed to happen in epair_start_locked() (Look for the
>> IFQ_DEQUEUE() calls).
>>
>> It’s not at all clear to me how, but it looks like the receive side
>> is not doing its work.
>>
>> It looks like the IFQ code is already a fallback for when the netisr 
>> queue is full.
>> That code might be broken, or there might be a different issue that 
>> will just mean you’ll always end up in the same situation, 
>> regardless of queue size.
>>
>> It’s probably worth trying to play with 
>> ‘net.route.netisr_maxqlen’. I’d recommend *lowering* it, to see 
>> if the problem happens more frequently that way. If it does it’ll be 
>> helpful in reproducing and trying to fix this. If it doesn’t the
>> full queue is probably a consequence rather than a cause/trigger.
>> (Of course, once you’ve confirmed that lowering the netisr_maxqlen 
>> makes the problem more frequent go ahead and increase it.)
>
>netstat -Q  will be useful


Re: [vnet] [epair] epair interface stops working after some time

2018-03-27 Thread Kristof Provost



On 27 Mar 2018, at 16:48, Bjoern A. Zeeb wrote:


On 27 Mar 2018, at 14:40, Kristof Provost wrote:


(Re-cc freebsd-net, because this is useful information)

On 27 Mar 2018, at 13:07, Reshad Patuck wrote:
The epair crash occurred again today running the epair module code 
with the added dtrace sdt providers.

​
Running the same command as last time, 'dtrace -n ::epair\*:' 
returns the following:

```
CPU     ID                    FUNCTION:NAME

…

  0  66499   epair_transmit_locked:enqueued
```


Looks like it's filled up a queue somewhere and is dropping 
connections post that.

The value of the 'error' is 55. I can see both the ifp and m structs 
but don't know what to look for in them.


That’s useful. Error 55 is ENOBUFS, which in IFQ_ENQUEUE() means 
we’re hitting _IF_QFULL().
There don’t seem to be counters for that drop though, so that makes 
it hard to diagnose without these extra probe points.
It also explains why you don’t really see any drop counters 
incrementing.


The fact that this queue is full presumably means that the other side 
is not reading packets off it any more.
That’s supposed to happen in epair_start_locked() (Look for the 
IFQ_DEQUEUE() calls).


It’s not at all clear to me how, but it looks like the receive side 
is not doing its work.


It looks like the IFQ code is already a fallback for when the netisr 
queue is full.
That code might be broken, or there might be a different issue that 
will just mean you’ll always end up in the same situation, 
regardless of queue size.


It’s probably worth trying to play with 
‘net.route.netisr_maxqlen’. I’d recommend *lowering* it, to see 
if the problem happens more frequently that way. If it does it’ll 
be helpful in reproducing and trying to fix this. If it doesn’t the 
full queue is probably a consequence rather than a cause/trigger.
(Of course, once you’ve confirmed that lowering the netisr_maxqlen 
makes the problem more frequent go ahead and increase it.)


netstat -Q  will be useful


Reshad included that in his e-mail to me:

On the system with the bug 'netstat -Q' seems to have queue drops for 
epair.

```
# netstat -Q
Configuration:
Setting                        Current        Limit
Thread count                         1            1
Default queue limit                256        10240
Dispatch policy                 direct          n/a
Threads bound to CPUs         disabled          n/a

Protocols:
Name   Proto QLimit Policy Dispatch Flags
ip         1    256   flow  default   ---
igmp       2    256 source  default   ---
rtsock     3    256 source  default   ---
arp        4    256 source  default   ---
ether      5    256 source   direct   ---
ip6        6    256   flow  default   ---
epair      8   2100    cpu  default   CD-

Workstreams:
WSID CPU   Name     Len WMark    Disp'd  HDisp'd   QDrops    Queued   Handled
   0   0   ip         0    30  11150458        0        0  13092275  24242558
   0   0   igmp       0     0         0        0        0         0         0
   0   0   rtsock     0     1         0        0        0        42        42
   0   0   arp        0     0  56380919        0        0         0  56380919
   0   0   ether      0     0 108761357        0        0         0 108761357
   0   0   ip6        0    10  34999359        0        0   4091259  39090613
   0   0   epair      0  2100         0        0   210972 303785724 303785724
```
​
I also noticed that the values for 'epair' in the 'Workstreams' 
section including drops do not change, while all others increase after 
some time.
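
The "counters do not change" check can be automated by comparing two `netstat -Q` snapshots. The following is a minimal sketch, assuming the Workstreams rows have ten whitespace-separated columns as in the output above. The function names `parse_workstreams` and `stalled` are made up for illustration. The sample rows are taken from the two snapshots posted in this thread; the later `ip` row is my reading of the run-together columns in the other posting.

```python
def parse_workstreams(text):
    """Parse the 'Workstreams:' rows of `netstat -Q` into {(wsid, name): counters}."""
    fields = ("len", "wmark", "dispd", "hdispd", "qdrops", "queued", "handled")
    rows, in_section = {}, False
    for line in text.splitlines():
        if line.startswith("Workstreams:"):
            in_section = True
            continue
        parts = line.split()
        # Data rows have 10 columns and start with a numeric WSID.
        if in_section and len(parts) == 10 and parts[0].isdigit():
            rows[(int(parts[0]), parts[2])] = dict(zip(fields, map(int, parts[3:])))
    return rows


def stalled(before, after):
    """Workstreams that have recorded queue drops and whose counters are frozen."""
    return sorted(
        key for key, counters in before.items()
        if counters["qdrops"] > 0 and after.get(key) == counters
    )


# Rows from the two snapshots quoted in this thread (ip and epair only).
EARLIER = """Workstreams:
WSID CPU Name Len WMark Disp'd HDisp'd QDrops Queued Handled
0 0 ip 0 30 11150458 0 0 13092275 24242558
0 0 epair 0 2100 0 0 210972 303785724 303785724
"""
LATER = """Workstreams:
WSID CPU Name Len WMark Disp'd HDisp'd QDrops Queued Handled
0 0 ip 0 30 11409267 0 0 13574317 24983409
0 0 epair 0 2100 0 0 210972 303785724 303785724
"""

print(stalled(parse_workstreams(EARLIER), parse_workstreams(LATER)))
```

Requiring `qdrops > 0` keeps idle protocols such as igmp, whose counters are all zero, from being reported as stalled.
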


I think I’ve triggered this problem by setting 
net.link.epair.netisr_maxqlen to an absurdly low value (2 in my case).
It looks like there’s an issue with the handling of an overflow of 
the “hardware” queue, but I don’t really understand that code.


Regards,
Kristof


Re: [vnet] [epair] epair interface stops working after some time

2018-03-27 Thread Bjoern A. Zeeb

On 27 Mar 2018, at 14:40, Kristof Provost wrote:


(Re-cc freebsd-net, because this is useful information)

On 27 Mar 2018, at 13:07, Reshad Patuck wrote:
The epair crash occurred again today running the epair module code 
with the added dtrace sdt providers.

​
Running the same command as last time, 'dtrace -n ::epair\*:' returns 
the following:

```
CPU     ID                    FUNCTION:NAME

…

  0  66499   epair_transmit_locked:enqueued
```


Looks like it's filled up a queue somewhere and is dropping 
connections post that.

The value of the 'error' is 55. I can see both the ifp and m structs 
but don't know what to look for in them.


That’s useful. Error 55 is ENOBUFS, which in IFQ_ENQUEUE() means 
we’re hitting _IF_QFULL().
There don’t seem to be counters for that drop though, so that makes 
it hard to diagnose without these extra probe points.
It also explains why you don’t really see any drop counters 
incrementing.


The fact that this queue is full presumably means that the other side 
is not reading packets off it any more.
That’s supposed to happen in epair_start_locked() (Look for the 
IFQ_DEQUEUE() calls).


It’s not at all clear to me how, but it looks like the receive side 
is not doing its work.


It looks like the IFQ code is already a fallback for when the netisr 
queue is full.
That code might be broken, or there might be a different issue that 
will just mean you’ll always end up in the same situation, 
regardless of queue size.


It’s probably worth trying to play with 
‘net.route.netisr_maxqlen’. I’d recommend *lowering* it, to see 
if the problem happens more frequently that way. If it does it’ll be 
helpful in reproducing and trying to fix this. If it doesn’t the 
full queue is probably a consequence rather than a cause/trigger.
(Of course, once you’ve confirmed that lowering the netisr_maxqlen 
makes the problem more frequent go ahead and increase it.)


netstat -Q  will be useful


Re: [vnet] [epair] epair interface stops working after some time

2018-03-27 Thread Kristof Provost
On 27 Mar 2018, at 16:40, Kristof Provost wrote:
> It’s probably worth trying to play with ‘net.route.netisr_maxqlen’.
I probably mean ‘net.link.epair.netisr_maxqlen’ here.

Regards,
Kristof


Re: [vnet] [epair] epair interface stops working after some time

2018-03-27 Thread Kristof Provost

(Re-cc freebsd-net, because this is useful information)

On 27 Mar 2018, at 13:07, Reshad Patuck wrote:
The epair crash occurred again today running the epair module code 
with the added dtrace sdt providers.

​
Running the same command as last time, 'dtrace -n ::epair\*:' returns 
the following:

```
CPU     ID                    FUNCTION:NAME

…

  0  66499   epair_transmit_locked:enqueued
```


Looks like it's filled up a queue somewhere and is dropping connections 
post that.

The value of the 'error' is 55. I can see both the ifp and m structs 
but don't know what to look for in them.


That’s useful. Error 55 is ENOBUFS, which in IFQ_ENQUEUE() means 
we’re hitting _IF_QFULL().
There don’t seem to be counters for that drop though, so that makes it 
hard to diagnose without these extra probe points.
It also explains why you don’t really see any drop counters 
incrementing.
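
As an illustration of what "ENOBUFS because the queue is full" looks like from the sender's side, here is a small Python toy. It is a sketch, not the kernel macro: `ToyIfq` and its methods are invented names, and the `drops` attribute is a hypothetical counter added for illustration only. It makes no claim about what the kernel does or does not count. The constant 55 is FreeBSD's errno value; other platforms number ENOBUFS differently.

```python
from collections import deque

ENOBUFS = 55  # FreeBSD's number for "No buffer space available"


class ToyIfq:
    """Bounded FIFO that refuses packets once it is full (toy model only)."""

    def __init__(self, maxlen):
        self.q = deque()
        self.maxlen = maxlen
        self.drops = 0  # hypothetical counter, added here only for illustration

    def enqueue(self, pkt):
        if len(self.q) >= self.maxlen:  # the "queue is full" check
            self.drops += 1
            return ENOBUFS
        self.q.append(pkt)
        return 0

    def dequeue(self):
        return self.q.popleft() if self.q else None


ifq = ToyIfq(maxlen=2)
errors = [ifq.enqueue(n) for n in range(4)]  # nobody dequeues
print(errors, "drops:", ifq.drops)
ifq.dequeue()                                # the reader takes one packet
retry = ifq.enqueue(99)                      # there is room again
print(retry)
```

As long as nothing calls `dequeue`, every further `enqueue` fails with the same error, which is the pattern the probe output suggests.
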


The fact that this queue is full presumably means that the other side is 
not reading packets off it any more.
That’s supposed to happen in epair_start_locked() (Look for the 
IFQ_DEQUEUE() calls).


It’s not at all clear to me how, but it looks like the receive side is 
not doing its work.


It looks like the IFQ code is already a fallback for when the netisr 
queue is full.
That code might be broken, or there might be a different issue that will 
just mean you’ll always end up in the same situation, regardless of 
queue size.


It’s probably worth trying to play with 
‘net.route.netisr_maxqlen’. I’d recommend *lowering* it, to see if 
the problem happens more frequently that way. If it does it’ll be 
helpful in reproducing and trying to fix this. If it doesn’t the full 
queue is probably a consequence rather than a cause/trigger.
(Of course, once you’ve confirmed that lowering the netisr_maxqlen 
makes the problem more frequent go ahead and increase it.)


Regards,
Kristof


Re: [vnet] [epair] epair interface stops working after some time

2018-01-14 Thread Reshad Patuck
Hi,

I attempted to unload the pf module, but this did not cause any changes.

I am not creating/destroying any VNET jails at the time the epairs stop functioning.
Multiple VNET jails are started when I start the box, but there is no further 
activity after that (no starts or stops of vnet jails, no creation or deletion 
of epair interfaces, no pf start, stop or reload).

I have been monitoring output from the following:
- netstat -ss
- netstat -m
- vmstat -z
- vmstat -m

I will add 'netstat -i' to my battery of monitoring commands.

So far the only pattern I can see out of the ordinary is in the 'vmstat -m' 
output for epair.
There the size seems to keep growing, and at some point the memory-use and 
high-use grow too.
The epair interface seems to stop working when the memory-use and high-use grow.
I have also noticed that these parameters stay almost constant on other boxes.

Here is a link (http://dpaste.com/3WB6AD4.txt) to the csv file containing the 
'vmstat -m' output for 'epair' over time.
I noticed the epair begin to fail at timestamp 2018-01-09T07:56Z, but this test 
runs every 5 minutes so the failure could have started up to 5 minutes before 
this timestamp.
NOTE: I have used --libxo on the vmstat to get json output, it seems to have 
lost the trailing 'K' in the memory-use column.
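
A minimal sketch of how such periodic samples could be checked for the point where memory-use leaves its baseline. The layout of the linked CSV is not shown here, so the input is assumed to be a list of `(timestamp, memory_use)` pairs; the function names and the sample values are invented for illustration and are not data from the affected box. `parse_kib` accepts the value with or without the trailing 'K' mentioned in the note above.

```python
import statistics


def parse_kib(value):
    """Accept '12K' (plain vmstat -m) or '12' / 12 (suffix missing, as with --libxo)."""
    text = str(value).strip()
    return int(text[:-1]) if text.endswith("K") else int(text)


def first_departure(samples, baseline_n=3, tolerance_kib=1):
    """Return the timestamp of the first sample above the baseline, or None.

    The baseline is the median of the first `baseline_n` samples.
    """
    baseline = statistics.median(parse_kib(v) for _, v in samples[:baseline_n])
    for timestamp, value in samples[baseline_n:]:
        if parse_kib(value) > baseline + tolerance_kib:
            return timestamp
    return None


# Invented sample values, one per monitoring run.
samples = [
    ("t0", "4K"), ("t1", "4"), ("t2", 4),   # steady state
    ("t3", "4"), ("t4", "9"), ("t5", "17K"),  # memory-use starts to grow at t4
]
print(first_departure(samples))
```

With a 5-minute sampling interval, the returned timestamp only bounds the start of the growth to within one interval.
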

I will update things here if I find anything else in the logs.

Please let me know if there is anything else I should look at, or if there is 
any other output you would like.

Best regards,

Reshad

On Thursday 11 January 2018 2:20:06 AM IST Kristof Provost wrote:
> On 5 Jan 2018, at 20:54, Reshad Patuck wrote:
> > I have done the following on both servers to test what happens:
> > - Created a new epair interface epair3a and epair3b
> > - upped both interfaces
> > - given epair3a IP address 10.20.30.40/24 (I don't have this subnet
> > anywhere in my network)
> > - attempted to ping 10.20.30.50
> > - checked for any packets on epair3b
> > On the server where epairs are working, I can see ARP packets for
> > 10.20.30.50, but on the server where epairs are not working I can't see
> > any packets on epair3b.
> > I can however see the ARP packets on epair3a on both servers.
> >
> So epair3a was not added to the bridge and epair3b was not added to a 
> jail?
> That’s interesting, because it should mean the problem is not with the 
> bridge or jail.
> As it affects ARP packets it also shouldn’t be a pf problem.
> It might be worth unloading the pf module, just to re-confirm, but I 
> wouldn’t expect it to make a difference.
> 
> > Please let me know if there is anything I can do to debug this issue
> > or if you need any other information.
> >
> Are you creating/destroying vnet jails at any point? Is there a 
> correlation with that and the start of the epair issues?
> 
> Are there any errors in `netstat -s` or `netstat -i epair3a` ?
> 
> Regards,
> Kristof





Re: [vnet] [epair] epair interface stops working after some time

2018-01-10 Thread Kristof Provost

On 5 Jan 2018, at 20:54, Reshad Patuck wrote:

I have done the following on both servers to test what happens:
- Created a new epair interface epair3a and epair3b
- upped both interfaces
- given epair3a IP address 10.20.30.40/24 (I don't have this subnet
anywhere in my network)
- attempted to ping 10.20.30.50
- checked for any packets on epair3b
On the server where epairs are working, I can see ARP packets for
10.20.30.50, but on the server where epairs are not working I can't see 
any packets on epair3b.
I can however see the ARP packets on epair3a on both servers.

So epair3a was not added to the bridge and epair3b was not added to a 
jail?
That’s interesting, because it should mean the problem is not with the 
bridge or jail.

As it affects ARP packets it also shouldn’t be a pf problem.
It might be worth unloading the pf module, just to re-confirm, but I 
wouldn’t expect it to make a difference.


Please let me know if there is anything I can do to debug this issue 
or if you need any other information.

Are you creating/destroying vnet jails at any point? Is there a 
correlation with that and the start of the epair issues?


Are there any errors in `netstat -s` or `netstat -i epair3a` ?

Regards,
Kristof


[vnet][epair] epair interface stops working after some time

2018-01-05 Thread Reshad Patuck
Hey,

I am having a strange issue with one of my servers.

I have a couple of VNET jails (FreeBSD 12, r321619) set up using if_bridge and
epairs.
Each VNET jail (and the host too) has a pf firewall limiting inbound
traffic.
Everything works as intended for some time (1-5 days), services inside the
jail work and the jail can connect out to the rest of the network.
After some time of working fine I suddenly find that the jails stop
receiving traffic and can not send traffic out.
Essentially the traffic on one end of the epair does not come out the other.

I have linked to a diagram with my network setup for the jails.
Essentially the same setup is running on another identical server at
another location and has been running for at least two weeks without any
issues.

The symptoms are as follows:
- I can connect to the server via ssh (on igb0 at IP 192.168.1.50).
- All connections from outside the jails work fine (from 192.168.1.50 to
external IPs).
- I can not connect to any services running inside the jails from either
outside or inside the server.
- I can not connect out from the jails (jexec in to the jails and then
attempt to connect out).
- When I attempt to connect out from one of the jails:
  - I see arp traffic (via tcpdump) on the epair inside the jail (epair0b)
  - I can't see the same arp traffic (via tcpdump) on the epair outside
    the jail (epair0a)
  - 'arp -a' inside the jails shows incomplete arps for any external IP I
    try to reach.
- When I tcpdump on igb0, bridge0 or epair0a I see
broadcast/multicast/general network traffic.
- When I tcpdump on epair0b I see no traffic at all.
I have done the following on both servers to test what happens:
- Created a new epair interface epair3a and epair3b
- upped both interfaces
- given epair3a IP address 10.20.30.40/24 (I don't have this subnet
anywhere in my network)
- attempted to ping 10.20.30.50
- checked for any packets on epair3b
On the server where epairs are working, I can see ARP packets for
10.20.30.50, but on the server where epairs are not working I can't see any
packets on epair3b.
I can however see the ARP packets on epair3a on both servers.
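
For anyone wanting to repeat this standalone test, here is a minimal Python sketch that only prints the commands (a dry run). The exact invocations used are not given above, so the `ifconfig`, `ping` and `tcpdump` arguments below are an assumption about how the steps might be carried out on FreeBSD; the helper name `epair_selftest_commands` is made up.

```python
import shlex


def epair_selftest_commands(unit=3, addr="10.20.30.40/24", target="10.20.30.50"):
    """Build the argv lists for the standalone epair test described above."""
    a, b = f"epair{unit}a", f"epair{unit}b"
    return [
        ["ifconfig", f"epair{unit}", "create"],   # creates both the a and b ends
        ["ifconfig", a, "up"],
        ["ifconfig", b, "up"],
        ["ifconfig", a, "inet", addr],
        ["ping", "-c", "3", target],              # triggers ARP requests on the a end
        ["tcpdump", "-n", "-i", b, "arp"],        # run in a second terminal
    ]


commands = epair_selftest_commands()
for argv in commands:
    print(shlex.join(argv))
```

The `tcpdump` on the b end has to be running while the `ping` is sent, so in practice the last two commands are run side by side rather than in sequence.
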

This is the third time I have found this on the same server and the other
server is still going strong.
After rebooting the server this problem seems to go away temporarily, but
seems to manifest itself again after some time.

Any commands, ideas, thoughts on how to troubleshoot what is wrong here
will be much appreciated.

Please let me know if there is anything I can do to debug this issue or if
you need any other information.

Thanks and best regards,

Reshad

Link to network diagram: https://i.imgur.com/1XdRjt0.jpg