> Is it always the OOM errors followed by the Tx timeout?
Yes, I believe I have the dmesg from one of the earlier incidents. I can
clean that up and make it public if you want.
> Is it an actual serial connection or is it something like serial over
> LAN?
Serial over LAN
> Do you know if you h
Hi Alex,
Thanks for responding!
> Have you been seeing this as a
> reproducible issue or is this something that has only occurred once?
It has occurred 3 times so far. We usually stop the hosts we see the issue
on for a long while, which is probably why we don't see it that frequently. It
has occur
> There shouldn't be any need. Basically what you want to check for is
> to make sure those logs have the same pattern with OOM errors followed
> by the rcu_sched warning about detecting a CPU stall. If that is the
> case that is the most likely root cause for the Tx hangs that are
> being reported
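
A minimal sketch of the check described above: scan a saved dmesg dump and
report whether an OOM message appears before an rcu_sched stall warning. The
regular expressions are assumptions based on typical kernel message wording
(they are not taken from the logs in this thread), and the function names are
hypothetical, so adjust them to match what your kernel actually prints.

```python
import re

# Assumed message patterns; the exact wording varies between kernel versions
# and drivers, so treat these as a starting point rather than a definitive list.
PATTERNS = [
    ("oom", re.compile(r"invoked oom-killer|out of memory", re.IGNORECASE)),
    ("rcu_stall", re.compile(r"rcu_sched.*detected stall", re.IGNORECASE)),
    ("tx_timeout", re.compile(r"NETDEV WATCHDOG|Tx Unit Hang", re.IGNORECASE)),
]


def classify(line):
    """Return the event name for a dmesg line, or None if nothing matches."""
    for name, pattern in PATTERNS:
        if pattern.search(line):
            return name
    return None


def has_oom_then_stall(lines):
    """Return True if an OOM message is followed later by an rcu_sched stall."""
    seen_oom = False
    for line in lines:
        kind = classify(line)
        if kind == "oom":
            seen_oom = True
        elif kind == "rcu_stall" and seen_oom:
            return True
    return False
```

Assuming the log was saved with something like `dmesg -T > dmesg.txt`, calling
`has_oom_then_stall(open("dmesg.txt"))` on each incident's dump would show
whether they all share the same ordering.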
On Mon, Jul 30, 2018 at 4:43 PM, Àbéjídé Àyodélé
wrote:
>> Is it always the OOM errors followed by the Tx timeout?
>
> Yes, I believe I have the dmesg from one of the earlier incidents. I can
> clean that up and make it public if you want.
There shouldn't be any need. Basically what you want to c
On Mon, Jul 30, 2018 at 1:25 PM, Àbéjídé Àyodélé
wrote:
> Hi Alex,
>
> Thanks for responding!
>
>> Have you been seeing this as a
>> reproducible issue or is this something that has only occurred once?
>
> It has occurred 3 times so far. We usually stop the hosts we see the issue
> on for a long w
On Sat, Jul 28, 2018 at 5:44 PM, Àbéjídé Àyodélé
wrote:
> Hi friends,
>
> On one of our machines at work, we observed a sequence of events starting
> with an OOM in a secondary cgroup and ending with the bond interface being
> down for a period of up to 12 seconds. Below is some piece of dmesg ab
Hi friends,
On one of our machines at work, we observed a sequence of events starting
with an OOM in a secondary cgroup and ending with the bond interface being
down for a period of up to 12 seconds. Below is an excerpt of dmesg from when
the bond interface went down:
[Wed Jul 25 19:20:45 2018]