** Description changed:

- For the customer OpenStack deployment we deploy infra nodes on Dell R630
- servers. The servers have onboard Broadcom's NetXtreme II BCM57800 NIC
- (quad port: 2x1G ports, 2x10G ports). For each port in UP state, we
- observe 100% CPU load. So in total, we observe 4 CPUs with 100% load.
+ [Impact]

- perf report shows function bnx2x_ptp_task taking up much of the CPUs
- time: https://pastebin.canonical.com/p/kfrpd6Pwh5/
+ * The PTP feature in the bnx2x driver is implemented in such a way that
+ if the NIC firmware takes some time to perform the timestamping - which
+ is observed as a bad register read in bnx2x_ptp_task() - then the ptp
+ worker function reschedules itself indefinitely until the value read
+ from the register is meaningful. With that behavior, if a userspace
+ tool requests a badly configured RX filter from bnx2x (or if the NIC
+ firmware has any other timestamping issue), the function
+ bnx2x_ptp_task() is rescheduled forever and causes unbounded resource
+ consumption. This manifests as a kworker thread consuming 100% of CPU.
- Also, /var/log/syslog contains the following outputs every few seconds:
- [1738143.581721] bnx2x: [bnx2x_start_xmit:3855(eno4)]The device supports only a single outstanding packet to timestamp, this packet will not be timestamped
- [1738176.727642] bnx2x: [bnx2x_start_xmit:3855(eno1)]The device supports only a single outstanding packet to timestamp, this packet will not be timestamped
- [1738207.988310] bnx2x: [bnx2x_start_xmit:3855(eno3)]The device supports only a single outstanding packet to timestamp, this packet will not be timestamped
- [1738240.227333] bnx2x: [bnx2x_start_xmit:3855(eno2)]The device supports only a single outstanding packet to timestamp, this packet will not be timestamped
+ * The dmesg log shows the following message, which indicates that other
+ packets are being skipped by the timestamping routine because one packet
+ is stuck in the timestamping "pipeline":

- So, the problem seems to be in a "timestampped" TX packet; the driver
- for some reason (to be yet understood) get an unexpected value from a
- register and then, it that same function, reschedule itself to try again
- this register read, read gets a bad value again, and so on infinitely.
+ "bnx2x: [bnx2x_start_xmit:3862(eno4)]The device supports only a single
+ outstanding packet to timestamp, this packet will not be timestamped"

- This is showing in the system as the 100% CPU usage kthreads; the
- message "The device supports only a single outstanding packet to
- timestamp, this packet will not be timestamped" happens because the
- driver can only timestamp a single TX packet at a time, and given it's
- stuck trying, it cannot accept another packet in this "queue".
+ Also, by using ftrace, a user can see that the function
+ bnx2x_ptp_task() is being called very frequently, and by enabling the
+ bnx2x PTP debugging log (ethtool -s <iface> msglvl 16777216) it's
+ possible to observe the following message flooding the kernel log:

- The infinite loop appears to be:
+ "bnx2x: [bnx2x_ptp_task:15242(eno4)]There is no valid Tx timestamp yet"

- static void bnx2x_ptp_task(struct work_struct *work)
- {
-     struct bnx2x *bp = container_of(work, struct bnx2x, ptp_task);
-     int port = BP_PORT(bp);
-     u32 val_seq;
-     u64 timestamp, ns;
-     struct skb_shared_hwtstamps shhwtstamps;
-     /* Read Tx timestamp registers */
-     val_seq = REG_RD(bp, port ? NIG_REG_P1_TLLH_PTP_BUF_SEQID :
-                      NIG_REG_P0_TLLH_PTP_BUF_SEQID);
-     if (val_seq & 0x10000) {
-         [...]
-     } else {
-         DP(BNX2X_MSG_PTP, "There is no valid Tx timestamp yet\n");
-         /* Reschedule to keep checking for a valid timestamp value */
-         schedule_work(&bp->ptp_task);
-     }
+ * The patch proposed in this SRU request has been accepted upstream and
+ is currently (2019-07-03) available in David Miller's linux-net tree:
+ git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=3c91f25c2f72
+ Besides fixing the issue, it also adds an ethtool statistic that counts
+ PTP errors and reduces message flooding in case of errors.

- It appears that val_seq & 0x10000 is never true, so the task constantly
- reschedules itself immediately. Instrumenting the function shows that it
- is being called in excess of 100,000 times per second. The REG_RD call
- does appear to be expensive (as it's a register read from the device)
- and shows high in the perf report, but that by itself doesn't appear to
- be the root cause (i.e., it's not hanging forever in the REG_RD).
- The cause appears to be that the driver is not prepared to deal with the
- PTP request never being completed by the hardware. It's unclear why it
- isn't completing, but regardless, the driver should not loop forever
- here.
+ [Test case]
+
+ Reproducing the problem is not difficult; we've used chrony in Bionic to
+ trigger it. The steps are:
+
+ a) Install chrony on Bionic on a system with a working NIC managed by
+ bnx2x;
+
+ b) Edit the chrony configuration and add "hwtimestamp *" to the top of
+ its conf file;
+
+ c) Restart the chrony service.
+
+ Check dmesg for the "[...]single outstanding packet" message, and check
+ the overall CPU workload with a tool like "top" to observe a kthread
+ consuming 100% of CPU.
+
+
+ [Regression potential]
+
+ The patch scope is restricted to the bnx2x ptp handler, and it was
+ validated by the driver maintainer. If there's any possibility of
+ regressions, we believe the worst case would be an issue affecting
+ packet timestamping, without touching the regular xmit path of the
+ driver.
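
For illustration only: the [Impact] section describes a worker that re-reads a register forever when the validity bit never shows up. The user-space C sketch below shows one way to bound such polling and count the failure instead of retrying indefinitely. It is not the upstream patch. The 0x10000 mask is taken from the old description above; fake_nic, fake_read_seqid(), poll_tx_timestamp() and the retry budget of 10 are hypothetical names and values made up for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define PTP_SEQID_VALID  0x10000u /* validity bit tested in the old description */
#define PTP_MAX_ATTEMPTS 10u      /* hypothetical retry budget, an assumption */

/* Hypothetical stand-in for the NIC: the first `ready_after` reads
 * return a value without the validity bit set. */
struct fake_nic {
    unsigned reads;       /* number of register reads performed so far */
    unsigned ready_after; /* reads that report "not ready"; ~0u means never ready */
};

/* Simulated register read (stands in for a REG_RD() of the SEQID register). */
static uint32_t fake_read_seqid(struct fake_nic *nic)
{
    nic->reads++;
    return nic->reads > nic->ready_after ? (PTP_SEQID_VALID | 0x2a) : 0x2a;
}

/* Poll for a valid Tx timestamp at most PTP_MAX_ATTEMPTS times.
 * Returns true when the validity bit was seen; otherwise bumps the
 * error counter and gives up instead of rescheduling forever. */
static bool poll_tx_timestamp(struct fake_nic *nic, unsigned *ptp_errors)
{
    for (unsigned attempt = 0; attempt < PTP_MAX_ATTEMPTS; attempt++) {
        uint32_t val_seq = fake_read_seqid(nic);

        if (val_seq & PTP_SEQID_VALID)
            return true;
        /* A real driver would likely wait between reads; omitted here. */
    }
    (*ptp_errors)++;
    return false;
}
```

With a fake device that becomes ready after three failed reads, poll_tx_timestamp() returns true on the fourth read. With a device that never becomes ready, it returns false after ten reads and increments the error counter once, so the CPU cost of a stuck timestamp stays bounded.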
-- 
https://bugs.launchpad.net/bugs/1832082

Title:
  bnx2x driver causes 100% CPU load