Hi Ben

On Sat, Mar 24, 2018 at 3:19 AM, Benjamin Herrenschmidt
<b...@kernel.crashing.org> wrote:
> On Fri, 2018-03-23 at 07:41 -0500, Jared Bents wrote:
>> Thank you for the advice. Looks like I get to try to rewrite the ath9k and
>> ath10k drivers to use dma_alloc_coherent() instead of kmemdup() and
>> dev_alloc_skb()
>
> Euh no... dev_alloc_skb() is the right thing to do for receive
> packets for a device driver.
>
> The arch should be able to map that for DMA, even if that includes
> bounce buffers via swiotlb.
>
> Cheers,
> Ben.

I have changed the kmemdup() usage in the ath10k driver to
dma_alloc_coherent(). While dev_alloc_skb() is the right thing to do for
receive packets, dma_map_single() fails for all of those buffers. So it
looks like I have to add the ifdef conditional from
arch/powerpc/platforms/85xx/corenet_generic.c to
struct sk_buff *__netdev_alloc_skb() in net/core/skbuff.c:

#if defined(CONFIG_FSL_PCI) && defined(CONFIG_ZONE_DMA32)
	gfp_mask |= GFP_DMA32;
#endif

On Sun, Mar 25, 2018 at 6:27 PM, Oliver <ooh...@gmail.com> wrote:
> On Fri, Mar 23, 2018 at 11:41 PM, Jared Bents
> <jared.be...@rockwellcollins.com> wrote:
>> Thank you for the advice. Looks like I get to try to rewrite the ath9k and
>> ath10k drivers to use dma_alloc_coherent() instead of kmemdup() and
>> dev_alloc_skb()
>
> I don't think you need to go that far. It looks like you might be able
> to fix the uses of kmemdup() and kzalloc() in
> ath10k_pci_hif_exchange_bmi_msg() and call it a day. Auditing the
> other uses of dma_map_single() to see if they're using kmalloc()
> memory might be a good idea too.
>
> Anyway this is probably something you're better off taking to the ath10k list.
>
> Thanks,
> Oliver
>

I'll take my update of kmemdup() to dma_alloc_coherent() to the ath10k
mailing list. However, even after updating to use dma_alloc_coherent() and
adding the conditional to __netdev_alloc_skb() for the rx skbs used in the
driver, I am still getting a transmit error. I'm struggling to track down
where in the kernel the skb being taken from a queue in
drivers/net/wireless/ath/ath10k/mac.c comes from. I will ask the ath10k
list about this as well. The skb being taken off the queue below is later
mapped with dma_map_single(), and that fails, but since I haven't figured
out where it comes from, I haven't been able to try to fix it.

void ath10k_offchan_tx_work(struct work_struct *work)
{
	struct ath10k *ar = container_of(work, struct ath10k, offchan_tx_work);
[...]
	for (;;) {
		skb = skb_dequeue(&ar->offchan_tx_queue);

Thank you for all the help,
Jared

>>
>> On Thu, Mar 22, 2018 at 8:19 PM, Oliver <ooh...@gmail.com> wrote:
>>>
>>> On Fri, Mar 23, 2018 at 1:37 AM, Jared Bents
>>> <jared.be...@rockwellcollins.com> wrote:
>>> > Thank you for the response but unfortunately, it looks like I already
>>> > have that and it is being used. To verify, I commented that out and
>>> > got the failure "dma_direct_alloc_coherent: No suitable zone for pfn
>>> > 0xe0000". Below is the code flow for function
>>> > ath10k_pci_hif_exchange_bmi_msg which is showing the first dma mapping
>>> > error.
>>> >
>>> > ath10k_pci_hif_exchange_bmi_msg -> dma_map_single ->
>>> > dma_map_single_attrs -> swiotlb_map_page -> dma_capable (returns
>>> > false)
>>> >
>>> >
>>> > dma_capable is what reports the failure in that flow.
>>> >
>>> > static inline bool dma_capable(struct device *dev, dma_addr_t addr,
>>> >                                size_t size)
>>> > {
>>> > #ifdef CONFIG_SWIOTLB
>>> >     struct dev_archdata *sd = &dev->archdata;
>>> >
>>> >     if (sd->max_direct_dma_addr && addr + size > sd->max_direct_dma_addr)
>>> >         return false;
>>> > #endif
>>> >
>>> >     if (!dev->dma_mask)
>>> >         return false;
>>> >
>>> >     return addr + size - 1 <= *dev->dma_mask;
>>> > }
>>> >
>>> > Getting the below values:
>>> > addr = 1ee376218
>>> > size = 4
>>> > sd->max_direct_dma_addr = e0000000, which I believe is the DMA window
>>> > size (e0000000)
>>> >
>>> > When executed, sd->max_direct_dma_addr (e0000000) is non-zero and
>>> > addr (1ee376218) + size (4) becomes 1ee37621c, which is
>>> > > sd->max_direct_dma_addr (e0000000)
>>> >
>>> >
>>> > So even though limit_zone_pfn(ZONE_DMA32, 1UL << (31 - PAGE_SHIFT)) is
>>> > being used in arch/powerpc/platforms/85xx/corenet_generic.c,
>>>
>>> > kmemdup(req, req_len, GFP_KERNEL) is returning an address that, when
>>> > sent to dma_map_single(), results in a bad map.
>>>
>>> You need to use (GFP_KERNEL | GFP_DMA32) to constrain the allocations
>>> to ZONE_DMA32.
>>> Without that the kmemdup() will allocate from any zone
>>> so you'll probably get an unmappable address.
>>>
>>> That said, the driver probably shouldn't be using kmemdup() here.
>>> DMA-API.txt pretty explicitly says that drivers should not assume that
>>> dma_map_single() will work with arbitrary memory. It should be using
>>> dma_alloc_coherent() or a dma pool here.
>>>
>>> > - Jared
>>> >
>>> > On Wed, Mar 21, 2018 at 11:54 PM, Oliver <ooh...@gmail.com> wrote:
>>> >> On Thu, Mar 22, 2018 at 8:00 AM, Jared Bents
>>> >> <jared.be...@rockwellcollins.com> wrote:
>>> >>> Hi all,
>>> >>>
>>> >>> Apologies for the amount of information but we've been debugging this
>>> >>> for a while and I wanted to get what we are seeing captured as much as
>>> >>> possible. We are using a T1042 processor with a total of 8 GB of DDR,
>>> >>> and our kernel version is fsl-sdk-v2.0-1703 (linux v4.1.35) as that is
>>> >>> the latest version supplied by NXP.
>>> >>>
>>> >>> A while ago we ported from 32 bit to 64 bit. Everything continued to
>>> >>> work except the ath10k module we have. So as a first step, we checked
>>> >>> to see if an ath9k module also failed to work and it was also no
>>> >>> longer working. The ath10k is working fine on a 32 bit system but
>>> >>> it's not working on a 64 bit system, as we are getting dma mapping
>>> >>> errors when trying to initialize the wifi modules.
>>> >>>
>>> >>> pci_bus 0002:01: bus scan returning with max=01
>>> >>> pci_bus 0002:01: busn_res: [bus 01] end is updated to 01
>>> >>> pci_bus 0002:00: bus scan returning with max=01
>>> >>> ath10k_pci 0000:01:00.0: unable to get target info from device
>>> >>> ath10k_pci 0000:01:00.0: could not get target info (-5)
>>> >>> ath10k_pci 0000:01:00.0: could not probe fw (-5)
>>> >>> ath10k_pci 0001:01:00.0: Direct firmware load for
>>> >>> ath10k/cal-pci-0001:01:00.0.bin failed with error -2
>>> >>>
>>> >>>
>>> >>> First, we tried the mainline kernel (v4.15) to see if that would
>>> >>> fix the issue; it did not. So I made a patch for the ath10k driver to
>>> >>> restrict to just GFP_DMA areas when allocating memory or creating
>>> >>> sk_buffs and have attached it. The ath10k wifi modules now initialize
>>> >>> correctly but when I try to connect them and send traffic, they get a
>>> >>> DMA mapping error from the sk_buff that the driver receives from
>>> >>> elsewhere in the kernel. So while the driver appears to be fixable
>>> >>> with the patch, the modules are still unusable because of the data
>>> >>> handed to the driver when ath10k_tx is called and it tries to dma map
>>> >>> the provided skb. Also, according to the ath10k mailing list, GFP_DMA
>>> >>> is not supposed to be used in general. The error below is the same
>>> >>> sort of dma mapping error that is seen when initializing the modules
>>> >>> without the patch to OR with GFP_DMA.
>>> >>>
>>> >>> ath10k_pci 0001:01:00.0: failed to transmit packet, dropping: -5
>>> >>>
>>> >>>
>>> >>> We asked on the ath10k mailing list if anyone else is having this
>>> >>> problem and no one else seems to have the issue but they are using
>>> >>> different architectures (ARM or X86). As a result, it does not seem to
>>> >>> be a driver issue to us but something within the PowerPC arch. So we
So we >>> >>> dug a little deeper to try to find what addresses being mapped are >>> >>> working and what address being mapped are not working. >>> >>> >>> >>> We found that when the virtual address of data pointer (a member of >>> >>> sk_buff) is above ~3.7 GB RAM address range then return address from >>> >>> dma_map_single API is failed to validate in dma_mapping_error >>> >>> function. >>> >>> >>> >>> We also noticed that in a 64bit machine sometimes ping is working and >>> >>> because of the virtual address is under ~3.7GAM RAM address range. So >>> >>> if we set mem=2048M in the bootargs, the ath10k module works >>> >>> perfectly, however this isn't a real solution since it cuts our >>> >>> available RAM from 8GB to 2GB. >>> >> >>> >> I think there's a known issue with the freescale PCIe root complex >>> >> where it can't DMA beyond the 4GB mark. There's a workaround in >>> >> the form of limit_zone_pfn() which you can use to put the lower 4GB >>> >> into >>> >> ZONE_DMA32 and allocate from there rather than ZONE_NORMAL. >>> >> For details of how to use it have a look at corenet_gen_setup_arch() in >>> >> arch/powerpc/platforms/85xx/corenet_generic.c >>> >> >>> >> Hope that helps, >>> >> Oliver >> >>