Re: [PATCH v8 0/7] netdev: intel: Eliminate duplicate barriers on weakly-ordered archs
On Mon, Jul 30, 2018 at 12:20 PM, Sinan Kaya wrote: > +netdev, > > On 7/30/2018 9:45 AM, Alexander Duyck wrote: >> >> I haven't had a chance to work on this much myself. My understanding >> is that igb has had the barriers updated, but I don't think any of the >> other drivers have been worked over yet. > > > Unfortunately, I have recently changed jobs and I no longer have the > hardware to test my changes. I thought that you wanted to handle this > yourself. > > I haven't seen any follow ups. I wanted to double check. As I said so far igb has been the only one updated, and that was done by a third party: 73017f4e051c8 igb: Use dma_wmb() instead of wmb() before doorbell writes > I worked with several architecture leads on 4.17. All architectures > support the updated barrier semantics now. It is time to clean up the > network drivers. Thanks for that update. As I said for now igb has the barriers updated. The idea being that igb is the test vehicle for this so if we go a kernel version or so without triggering any issues then we can follow up with the other drivers. The other thing we have to keep in mind is that unlike many other NICs we have to also deal with emulations of our devices (e1000 and e1000e) that may rely on certain barriers being used to enforce things like SMP synchronization between CPUs, so we have to be careful as we roll this out. Thanks. - Alex
Re: [PATCH v8 0/7] netdev: intel: Eliminate duplicate barriers on weakly-ordered archs
+netdev, On 7/30/2018 9:45 AM, Alexander Duyck wrote: I haven't had a chance to work on this much myself. My understanding is that igb has had the barriers updated, but I don't think any of the other drivers have been worked over yet. Unfortunately, I have recently changed jobs and I no longer have the hardware to test my changes. I thought that you wanted to handle this yourself. I haven't seen any follow ups. I wanted to double check. I worked with several architecture leads on 4.17. All architectures support the updated barrier semantics now. It is time to clean up the network drivers. 92d7223a7423 alpha: io: reorder barriers to guarantee writeX() and iowriteX() ordering #2 a1cc7034e33d MIPS: io: Add barrier after register read in readX() f6b7aeee8f16 MIPS: io: Prevent compiler reordering writeX() a71e7c44ffb7 io: change writeX_relaxed() to remove barriers 8875c5543761 io: change readX_relaxed() to remove barriers cd0e00c10672 alpha: io: reorder barriers to guarantee writeX() and iowriteX() ordering 87fe2d543f81 io: change inX() to have their own IO barrier overrides a7851aa54c0c io: change outX() to have their own IO barrier overrides 755bd04aaf4b io: define stronger ordering for the default writeX() implementation 032d59e1cde9 io: define stronger ordering for the default readX() implementation 64e2c6738b4d io: define several IO & PIO barrier types for the asm-generic version
Re: [PATCH v8 0/7] netdev: intel: Eliminate duplicate barriers on weakly-ordered archs
On Tue, 2018-04-03 at 13:50 -0400, Sinan Kaya wrote: > > > What do you think about this version? Did I miss any SMP > > > barriers? > > > > I would say we should probably just drop the whole set and start > > over. > > If we don't need the wmb() we are going to need to go through and > > clean up all of these paths and consider the barriers when updating > > the layout of the code. > > > > For example I have been thinking about it and in the case of x86 we > > are probably better off not bothering with the wmb() and > > writel_relaxed() and instead switch over to the smp_wmb() and > > writel() > > since in the case of a strongly ordered system like x86 or sparc > > this > > ends up translating out to a couple of compile barriers. > > > > I will also need time to reevaluate the Rx paths since dropping the > > wmb() may have other effects which I need to verify. > > Sounds good, I'll let you work on it. > > @Jeff Kirsher: can you drop this version from your development branch > until > Alex posts the next version? Already on it, I will work with Alex on any possible future versions of these patches. signature.asc Description: This is a digitally signed message part
Re: [PATCH v8 0/7] netdev: intel: Eliminate duplicate barriers on weakly-ordered archs
On 4/3/2018 1:47 PM, Alexander Duyck wrote: > On Mon, Apr 2, 2018 at 7:59 PM, Sinan Kayawrote: >> Alex, >> >> On 4/2/2018 3:06 PM, Sinan Kaya wrote: >>> Code includes wmb() followed by writel() in multiple places. writel() >>> already has a barrier on some architectures like arm64. >>> >>> This ends up CPU observing two barriers back to back before executing the >>> register write. >>> >>> Since code already has an explicit barrier call, changing writel() to >>> writel_relaxed(). >>> >>> I did a regex search for wmb() followed by writel() in each drivers >>> directory. I scrubbed the ones I care about in this series. >>> >>> I considered "ease of change", "popular usage" and "performance critical >>> path" as the determining criteria for my filtering. >>> >>> We used relaxed API heavily on ARM for a long time but >>> it did not exist on other architectures. For this reason, relaxed >>> architectures have been paying double penalty in order to use the common >>> drivers. >>> >>> Now that relaxed API is present on all architectures, we can go and scrub >>> all drivers to see what needs to change and what can remain. >>> >>> We start with mostly used ones and hope to increase the coverage over time. >>> It will take a while to cover all drivers. >>> >>> Feel free to apply patches individually. >>> >>> Change since 7: >>> >>> * API clarification with Linus and several architecture folks regarding >>> writel() >>> >>> https://www.mail-archive.com/netdev@vger.kernel.org/msg225806.html >>> >>> * removed wmb() in front of writel() at several places. >>> * remove old IA64 comments regarding ordering. >>> >> >> What do you think about this version? Did I miss any SMP barriers? > > I would say we should probably just drop the whole set and start over. > If we don't need the wmb() we are going to need to go through and > clean up all of these paths and consider the barriers when updating > the layout of the code. > > For example I have been thinking about it and in the case of x86 we > are probably better off not bothering with the wmb() and > writel_relaxed() and instead switch over to the smp_wmb() and writel() > since in the case of a strongly ordered system like x86 or sparc this > ends up translating out to a couple of compile barriers. > > I will also need time to reevaluate the Rx paths since dropping the > wmb() may have other effects which I need to verify. Sounds good, I'll let you work on it. @Jeff Kirsher: can you drop this version from your development branch until Alex posts the next version? > > Thanks. > > - Alex > -- Sinan Kaya Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
Re: [PATCH v8 0/7] netdev: intel: Eliminate duplicate barriers on weakly-ordered archs
On Mon, Apr 2, 2018 at 7:59 PM, Sinan Kayawrote: > Alex, > > On 4/2/2018 3:06 PM, Sinan Kaya wrote: >> Code includes wmb() followed by writel() in multiple places. writel() >> already has a barrier on some architectures like arm64. >> >> This ends up CPU observing two barriers back to back before executing the >> register write. >> >> Since code already has an explicit barrier call, changing writel() to >> writel_relaxed(). >> >> I did a regex search for wmb() followed by writel() in each drivers >> directory. I scrubbed the ones I care about in this series. >> >> I considered "ease of change", "popular usage" and "performance critical >> path" as the determining criteria for my filtering. >> >> We used relaxed API heavily on ARM for a long time but >> it did not exist on other architectures. For this reason, relaxed >> architectures have been paying double penalty in order to use the common >> drivers. >> >> Now that relaxed API is present on all architectures, we can go and scrub >> all drivers to see what needs to change and what can remain. >> >> We start with mostly used ones and hope to increase the coverage over time. >> It will take a while to cover all drivers. >> >> Feel free to apply patches individually. >> >> Change since 7: >> >> * API clarification with Linus and several architecture folks regarding >> writel() >> >> https://www.mail-archive.com/netdev@vger.kernel.org/msg225806.html >> >> * removed wmb() in front of writel() at several places. >> * remove old IA64 comments regarding ordering. >> > > What do you think about this version? Did I miss any SMP barriers? I would say we should probably just drop the whole set and start over. If we don't need the wmb() we are going to need to go through and clean up all of these paths and consider the barriers when updating the layout of the code. For example I have been thinking about it and in the case of x86 we are probably better off not bothering with the wmb() and writel_relaxed() and instead switch over to the smp_wmb() and writel() since in the case of a strongly ordered system like x86 or sparc this ends up translating out to a couple of compile barriers. I will also need time to reevaluate the Rx paths since dropping the wmb() may have other effects which I need to verify. Thanks. - Alex
Re: [PATCH v8 0/7] netdev: intel: Eliminate duplicate barriers on weakly-ordered archs
Alex, On 4/2/2018 3:06 PM, Sinan Kaya wrote: > Code includes wmb() followed by writel() in multiple places. writel() > already has a barrier on some architectures like arm64. > > This ends up CPU observing two barriers back to back before executing the > register write. > > Since code already has an explicit barrier call, changing writel() to > writel_relaxed(). > > I did a regex search for wmb() followed by writel() in each drivers > directory. I scrubbed the ones I care about in this series. > > I considered "ease of change", "popular usage" and "performance critical > path" as the determining criteria for my filtering. > > We used relaxed API heavily on ARM for a long time but > it did not exist on other architectures. For this reason, relaxed > architectures have been paying double penalty in order to use the common > drivers. > > Now that relaxed API is present on all architectures, we can go and scrub > all drivers to see what needs to change and what can remain. > > We start with mostly used ones and hope to increase the coverage over time. > It will take a while to cover all drivers. > > Feel free to apply patches individually. > > Change since 7: > > * API clarification with Linus and several architecture folks regarding > writel() > > https://www.mail-archive.com/netdev@vger.kernel.org/msg225806.html > > * removed wmb() in front of writel() at several places. > * remove old IA64 comments regarding ordering. > What do you think about this version? Did I miss any SMP barriers? > Sinan Kaya (7): > i40e/i40evf: Eliminate duplicate barriers on weakly-ordered archs > ixgbe: eliminate duplicate barriers on weakly-ordered archs > igbvf: eliminate duplicate barriers on weakly-ordered archs > igb: eliminate duplicate barriers on weakly-ordered archs > fm10k: Eliminate duplicate barriers on weakly-ordered archs > ixgbevf: keep writel() closer to wmb() > ixgbevf: eliminate duplicate barriers on weakly-ordered archs > > drivers/net/ethernet/intel/fm10k/fm10k_main.c | 15 ++--- > drivers/net/ethernet/intel/i40e/i40e_txrx.c | 22 -- > drivers/net/ethernet/intel/i40evf/i40e_txrx.c | 8 +-- > drivers/net/ethernet/intel/igb/igb_main.c | 14 ++-- > drivers/net/ethernet/intel/igbvf/netdev.c | 16 +- > drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 23 ++- > drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 5 - > drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 27 > --- > 8 files changed, 30 insertions(+), 100 deletions(-) > Sinan -- Sinan Kaya Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
[PATCH v8 0/7] netdev: intel: Eliminate duplicate barriers on weakly-ordered archs
Code includes wmb() followed by writel() in multiple places. writel() already has a barrier on some architectures like arm64. This ends up CPU observing two barriers back to back before executing the register write. Since code already has an explicit barrier call, changing writel() to writel_relaxed(). I did a regex search for wmb() followed by writel() in each drivers directory. I scrubbed the ones I care about in this series. I considered "ease of change", "popular usage" and "performance critical path" as the determining criteria for my filtering. We used relaxed API heavily on ARM for a long time but it did not exist on other architectures. For this reason, relaxed architectures have been paying double penalty in order to use the common drivers. Now that relaxed API is present on all architectures, we can go and scrub all drivers to see what needs to change and what can remain. We start with mostly used ones and hope to increase the coverage over time. It will take a while to cover all drivers. Feel free to apply patches individually. Change since 7: * API clarification with Linus and several architecture folks regarding writel() https://www.mail-archive.com/netdev@vger.kernel.org/msg225806.html * removed wmb() in front of writel() at several places. * remove old IA64 comments regarding ordering. Sinan Kaya (7): i40e/i40evf: Eliminate duplicate barriers on weakly-ordered archs ixgbe: eliminate duplicate barriers on weakly-ordered archs igbvf: eliminate duplicate barriers on weakly-ordered archs igb: eliminate duplicate barriers on weakly-ordered archs fm10k: Eliminate duplicate barriers on weakly-ordered archs ixgbevf: keep writel() closer to wmb() ixgbevf: eliminate duplicate barriers on weakly-ordered archs drivers/net/ethernet/intel/fm10k/fm10k_main.c | 15 ++--- drivers/net/ethernet/intel/i40e/i40e_txrx.c | 22 -- drivers/net/ethernet/intel/i40evf/i40e_txrx.c | 8 +-- drivers/net/ethernet/intel/igb/igb_main.c | 14 ++-- drivers/net/ethernet/intel/igbvf/netdev.c | 16 +- drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 23 ++- drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 5 - drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 27 --- 8 files changed, 30 insertions(+), 100 deletions(-) -- 2.7.4