Re: [PATCH v6 2/2] PCI: Try best to allocate pref mmio 64bit above 4g
On Fri, Jan 10, 2014 at 1:41 AM, Guo Chao wrote:
> On Wed, Jan 08, 2014 at 03:34:54PM -0800, Yinghai Lu wrote:
> Just FYI, a Mellanox net card failed after exactly this patch.
>
> 3.13-rc7 + bjorn's series is OK. After this patch applied, Mellanox
> driver complains:
>
> |mlx4_core 0003:05:00.0: Multiple PFs not yet supported. Skipping PF.
> |mlx4_core: probe of 0003:05:00.0 failed with error -22
>
> This is caused by MMIO read from BAR 0 (64-bit non-prefetchable) returns
> non-zero value.
>
> Resource assignment, as far as we can see, works fine. The noticeable
> effect of this patch is putting ROM BAR under non-prefetchable. I try
> to revert this effect by adding MEM_64 to its ROM resource and it works
> again (system does not expose 4G above aperture yet). Not sure what's
> the root cause, looks like a driver/firmware/hardware defect.

Interesting.

Can you post boot log with "debug ignore_loglevel initcall_debug" and
with/without this patch?

Thanks

Yinghai
Re: [PATCH v6 2/2] PCI: Try best to allocate pref mmio 64bit above 4g
On Wed, Jan 08, 2014 at 03:34:54PM -0800, Yinghai Lu wrote:
> On Sun, Dec 22, 2013 at 5:14 PM, Yinghai Lu wrote:
> > On Sun, Dec 22, 2013 at 4:00 PM, Bjorn Helgaas wrote:
> >> On Thu, Dec 19, 2013 at 1:44 PM, Yinghai Lu wrote:
> >>
> >> Let me see if I can figure out what you're trying to do here. Please
> >> correct me if I'm wrong:
> >>
> >>> When one of children resources does not support MEM_64, MEM_64 for
> >>> bridge get reset, so pull down whole pref resource on the bridge under 4G.
> >>
> >> When we allocate space for a bridge's prefetchable window, we
> >> currently look at the devices behind the bridge and put the window
> >> below 4GB if any of those children has a 32-bit prefetchable BAR.
> >>
> >> This maximizes the use of prefetch, at the cost of using more 32-bit
> >> address space.
> >
> > yes. and we have problem when we have 8 sockets or 32 sockets system,
> > will have limit 32bit space.
> > but we have enough above 4G 64bit mmio for prefetchable.
> >
> >>
> >>> If the bridge support pref mem 64, will only allocate that with pref mem64 to
> >>> children that support it.
> >>> For children resources if they only support pref mem 32, will allocate them
> >>> from non pref mem instead.
> >>
> >> You are changing this so that we will always try to put a bridge's
> >> 64-bit prefetchable window above 4GB, regardless of what devices are
> >> behind the bridge. If a device behind the bridge has a 32-bit
> >> prefetchable BAR, we will place that BAR in the bridge's 32-bit
> >> non-prefetchable window.
> >
> > Yes. so we can keep IORESOURCE_MEM_64 in the flags for PREF.
> >
> >>
> >> This minimizes the use of the 32-bit address space, at the cost of not
> >> being able to use prefetch as much.
> >>
> >>> If the bridge only support 32bit pref mmio, will still have all children pref
> >>> mmio under that.
> >>
> >> Obviously, if a bridge has a prefetchable window that's only 32 bits,
> >> 64-bit prefetchable BARs behind the bridge will have to be in that
> >> 32-bit prefetchable window or the 32-bit non-prefetchable window. And
> >> if the bridge has no prefetchable window at all, every memory BAR
> >> behind the bridge will have to be in the 32-bit non-prefetchable
> >> window.
> >
> > Yes.
> >
> >>
> >> I'll look at the actual patch later; I just want to make sure I
> >> understand your intent first.
>
> Hi, Bjorn,
>
> Can you check and add this one to your pci/resource branch?
> With that we can close the loop for 64bit mmio resource allocation.
>

Just FYI, a Mellanox net card failed after exactly this patch.

3.13-rc7 + bjorn's series is OK. After this patch applied, Mellanox
driver complains:

|mlx4_core 0003:05:00.0: Multiple PFs not yet supported. Skipping PF.
|mlx4_core: probe of 0003:05:00.0 failed with error -22

This is caused by MMIO read from BAR 0 (64-bit non-prefetchable) returns
non-zero value.

Resource assignment, as far as we can see, works fine. The noticeable
effect of this patch is putting ROM BAR under non-prefetchable. I try
to revert this effect by adding MEM_64 to its ROM resource and it works
again (system does not expose 4G above aperture yet). Not sure what's
the root cause, looks like a driver/firmware/hardware defect.

Thanks
Guo Chao

> Thanks
>
> Yinghai
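To make the reported side effect easier to follow, here is a minimal sketch of the flag test that the report implies. It is not kernel code: the `RES_*` constants and `fits_pref64_window()` are hypothetical stand-ins with made-up values, and the sketch assumes (as the report suggests) that the ROM resource is flagged prefetchable but not 64-bit. It only models which bridge window a resource is eligible for, not where that window ends up in the address space.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins only; these values are NOT the kernel's IORESOURCE_* bits. */
#define RES_MEM      0x1u  /* memory resource */
#define RES_PREFETCH 0x2u  /* prefetchable */
#define RES_MEM_64   0x4u  /* 64-bit capable */

/*
 * Hypothetical helper: under the proposed policy, a memory resource is
 * steered to the bridge's 64-bit prefetchable window only if it is both
 * prefetchable and 64-bit capable. Anything else falls back to the
 * non-prefetchable window.
 */
bool fits_pref64_window(unsigned int flags)
{
    const unsigned int need = RES_MEM | RES_PREFETCH | RES_MEM_64;

    assert(flags & RES_MEM);  /* this sketch only covers memory resources */
    return (flags & need) == need;
}
```

With these assumptions, a ROM resource flagged `RES_MEM | RES_PREFETCH` does not qualify and lands in the non-prefetchable window, while the same resource with `RES_MEM_64` added does qualify, which matches the workaround described above. BAR 0 on the failing card (64-bit, non-prefetchable) does not qualify either way.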
Re: [PATCH v6 2/2] PCI: Try best to allocate pref mmio 64bit above 4g
On Sun, Dec 22, 2013 at 5:14 PM, Yinghai Lu wrote:
> On Sun, Dec 22, 2013 at 4:00 PM, Bjorn Helgaas wrote:
>> On Thu, Dec 19, 2013 at 1:44 PM, Yinghai Lu wrote:
>>
>> Let me see if I can figure out what you're trying to do here. Please
>> correct me if I'm wrong:
>>
>>> When one of children resources does not support MEM_64, MEM_64 for
>>> bridge get reset, so pull down whole pref resource on the bridge under 4G.
>>
>> When we allocate space for a bridge's prefetchable window, we
>> currently look at the devices behind the bridge and put the window
>> below 4GB if any of those children has a 32-bit prefetchable BAR.
>>
>> This maximizes the use of prefetch, at the cost of using more 32-bit
>> address space.
>
> yes. and we have problem when we have 8 sockets or 32 sockets system,
> will have limit 32bit space.
> but we have enough above 4G 64bit mmio for prefetchable.
>
>>
>>> If the bridge support pref mem 64, will only allocate that with pref mem64 to
>>> children that support it.
>>> For children resources if they only support pref mem 32, will allocate them
>>> from non pref mem instead.
>>
>> You are changing this so that we will always try to put a bridge's
>> 64-bit prefetchable window above 4GB, regardless of what devices are
>> behind the bridge. If a device behind the bridge has a 32-bit
>> prefetchable BAR, we will place that BAR in the bridge's 32-bit
>> non-prefetchable window.
>
> Yes. so we can keep IORESOURCE_MEM_64 in the flags for PREF.
>
>>
>> This minimizes the use of the 32-bit address space, at the cost of not
>> being able to use prefetch as much.
>>
>>> If the bridge only support 32bit pref mmio, will still have all children pref
>>> mmio under that.
>>
>> Obviously, if a bridge has a prefetchable window that's only 32 bits,
>> 64-bit prefetchable BARs behind the bridge will have to be in that
>> 32-bit prefetchable window or the 32-bit non-prefetchable window. And
>> if the bridge has no prefetchable window at all, every memory BAR
>> behind the bridge will have to be in the 32-bit non-prefetchable
>> window.
>
> Yes.
>
>>
>> I'll look at the actual patch later; I just want to make sure I
>> understand your intent first.

Hi, Bjorn,

Can you check and add this one to your pci/resource branch?
With that we can close the loop for 64bit mmio resource allocation.

Thanks

Yinghai
Re: [PATCH v6 2/2] PCI: Try best to allocate pref mmio 64bit above 4g
On Sun, Dec 22, 2013 at 4:00 PM, Bjorn Helgaas wrote:
> On Thu, Dec 19, 2013 at 1:44 PM, Yinghai Lu wrote:
>
> Let me see if I can figure out what you're trying to do here. Please
> correct me if I'm wrong:
>
>> When one of children resources does not support MEM_64, MEM_64 for
>> bridge get reset, so pull down whole pref resource on the bridge under 4G.
>
> When we allocate space for a bridge's prefetchable window, we
> currently look at the devices behind the bridge and put the window
> below 4GB if any of those children has a 32-bit prefetchable BAR.
>
> This maximizes the use of prefetch, at the cost of using more 32-bit
> address space.

yes. and we have problem when we have 8 sockets or 32 sockets system,
will have limit 32bit space.
but we have enough above 4G 64bit mmio for prefetchable.

>
>> If the bridge support pref mem 64, will only allocate that with pref mem64 to
>> children that support it.
>> For children resources if they only support pref mem 32, will allocate them
>> from non pref mem instead.
>
> You are changing this so that we will always try to put a bridge's
> 64-bit prefetchable window above 4GB, regardless of what devices are
> behind the bridge. If a device behind the bridge has a 32-bit
> prefetchable BAR, we will place that BAR in the bridge's 32-bit
> non-prefetchable window.

Yes. so we can keep IORESOURCE_MEM_64 in the flags for PREF.

>
> This minimizes the use of the 32-bit address space, at the cost of not
> being able to use prefetch as much.
>
>> If the bridge only support 32bit pref mmio, will still have all children pref
>> mmio under that.
>
> Obviously, if a bridge has a prefetchable window that's only 32 bits,
> 64-bit prefetchable BARs behind the bridge will have to be in that
> 32-bit prefetchable window or the 32-bit non-prefetchable window. And
> if the bridge has no prefetchable window at all, every memory BAR
> behind the bridge will have to be in the 32-bit non-prefetchable
> window.

Yes.

>
> I'll look at the actual patch later; I just want to make sure I
> understand your intent first.

Thanks

Yinghai
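As a reading aid for the exchange above, the current and proposed placement policies can be sketched as a small C model. This is a minimal sketch that follows Bjorn's summary, not the patch itself: every name here (`place_bar_old`, `place_bar_new`, the enums, `struct bar`) is hypothetical, and the model only says which bridge window a child BAR is steered to, not whether the allocation above 4GB actually succeeds.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the discussion; none of these names exist in the kernel. */
enum bridge_pref { BRIDGE_PREF_NONE, BRIDGE_PREF_32, BRIDGE_PREF_64 };
enum window { WIN_NONPREF_32, WIN_PREF_BELOW_4G, WIN_PREF_ABOVE_4G };

struct bar {
    bool prefetchable;
    bool is_64bit;
};

/*
 * Current behaviour as summarized in the thread: a single 32-bit
 * prefetchable child BAR pulls the whole prefetchable window below 4GB,
 * and every prefetchable BAR behind the bridge shares that window.
 */
enum window place_bar_old(enum bridge_pref bridge, struct bar b,
                          bool any_child_pref32)
{
    /* A 32-bit prefetchable BAR is itself such a child. */
    assert(!(b.prefetchable && !b.is_64bit) || any_child_pref32);

    if (!b.prefetchable || bridge == BRIDGE_PREF_NONE)
        return WIN_NONPREF_32;
    if (bridge == BRIDGE_PREF_32 || any_child_pref32)
        return WIN_PREF_BELOW_4G;
    return WIN_PREF_ABOVE_4G;
}

/*
 * Proposed behaviour: a 64-bit prefetchable window is kept for 64-bit
 * prefetchable BARs only (and placed above 4GB when possible); 32-bit
 * prefetchable BARs fall back to the 32-bit non-prefetchable window.
 */
enum window place_bar_new(enum bridge_pref bridge, struct bar b)
{
    if (!b.prefetchable || bridge == BRIDGE_PREF_NONE)
        return WIN_NONPREF_32;
    if (bridge == BRIDGE_PREF_32)
        return WIN_PREF_BELOW_4G;
    return b.is_64bit ? WIN_PREF_ABOVE_4G : WIN_NONPREF_32;
}
```

In this model the trade-off is visible directly: with a 64-bit capable bridge and one 32-bit prefetchable sibling, `place_bar_old()` keeps a 64-bit prefetchable BAR below 4GB, while `place_bar_new()` lets it go above 4GB at the cost of the 32-bit BAR losing prefetch.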
Re: [PATCH v6 2/2] PCI: Try best to allocate pref mmio 64bit above 4g
On Thu, Dec 19, 2013 at 1:44 PM, Yinghai Lu wrote:

Let me see if I can figure out what you're trying to do here. Please
correct me if I'm wrong:

> When one of children resources does not support MEM_64, MEM_64 for
> bridge get reset, so pull down whole pref resource on the bridge under 4G.

When we allocate space for a bridge's prefetchable window, we
currently look at the devices behind the bridge and put the window
below 4GB if any of those children has a 32-bit prefetchable BAR.

This maximizes the use of prefetch, at the cost of using more 32-bit
address space.

> If the bridge support pref mem 64, will only allocate that with pref mem64 to
> children that support it.
> For children resources if they only support pref mem 32, will allocate them
> from non pref mem instead.

You are changing this so that we will always try to put a bridge's
64-bit prefetchable window above 4GB, regardless of what devices are
behind the bridge. If a device behind the bridge has a 32-bit
prefetchable BAR, we will place that BAR in the bridge's 32-bit
non-prefetchable window.

This minimizes the use of the 32-bit address space, at the cost of not
being able to use prefetch as much.

> If the bridge only support 32bit pref mmio, will still have all children pref
> mmio under that.

Obviously, if a bridge has a prefetchable window that's only 32 bits,
64-bit prefetchable BARs behind the bridge will have to be in that
32-bit prefetchable window or the 32-bit non-prefetchable window. And
if the bridge has no prefetchable window at all, every memory BAR
behind the bridge will have to be in the 32-bit non-prefetchable
window.

I'll look at the actual patch later; I just want to make sure I
understand your intent first.

Bjorn

> -v2: Add release bridge res support with bridge mem res for pref_mem children res.
> -v3: refresh and make it can be applied early before for_each_dev_res patchset.
> -v4: fix non-pref mmio 64bit support found by Guo Chao.
>
> Signed-off-by: Yinghai Lu
> Tested-by: Guo Chao
> ---
>  drivers/pci/setup-bus.c | 138
>  drivers/pci/setup-res.c | 20 ++-
>  2 files changed, 111 insertions(+), 47 deletions(-)
>
> diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
> index 138bdd6..b29504f 100644
> --- a/drivers/pci/setup-bus.c
> +++ b/drivers/pci/setup-bus.c
> @@ -713,12 +713,11 @@ static void pci_bridge_check_ranges(struct pci_bus *bus)
>     bus resource of a given type. Note: we intentionally skip
>     the bus resources which have already been assigned (that is,
>     have non-NULL parent resource). */
> -static struct resource *find_free_bus_resource(struct pci_bus *bus, unsigned long type)
> +static struct resource *find_free_bus_resource(struct pci_bus *bus,
> +                        unsigned long type_mask, unsigned long type)
>  {
>         int i;
>         struct resource *r;
> -       unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
> -                                 IORESOURCE_PREFETCH;
>
>         pci_bus_for_each_resource(bus, r, i) {
>                 if (r == &ioport_resource || r == &iomem_resource)
> @@ -815,7 +814,8 @@ static void pbus_size_io(struct pci_bus *bus, resource_size_t min_size,
>                 resource_size_t add_size, struct list_head *realloc_head)
>  {
>         struct pci_dev *dev;
> -       struct resource *b_res = find_free_bus_resource(bus, IORESOURCE_IO);
> +       struct resource *b_res = find_free_bus_resource(bus, IORESOURCE_IO,
> +                                                       IORESOURCE_IO);
>         resource_size_t size = 0, size0 = 0, size1 = 0;
>         resource_size_t children_add_size = 0;
>         resource_size_t min_align, align;
> @@ -915,15 +915,17 @@ static inline resource_size_t calculate_mem_align(resource_size_t *aligns,
>   * guarantees that all child resources fit in this size.
>   */
>  static int pbus_size_mem(struct pci_bus *bus, unsigned long mask,
> -                        unsigned long type, resource_size_t min_size,
> -                        resource_size_t add_size,
> -                        struct list_head *realloc_head)
> +                        unsigned long type, unsigned long type2,
> +                        unsigned long type3,
> +                        resource_size_t min_size, resource_size_t add_size,
> +                        struct list_head *realloc_head)
>  {
>         struct pci_dev *dev;
>         resource_size_t min_align, align, size, size0, size1;
>         resource_size_t aligns[12];     /* Alignments from 1Mb to 2Gb */
>         int order, max_order;
> -       struct resource *b_res = find_free_bus_resource(bus, type);
> +       struct resource *b_res = find_free_bus_resource(bus,
> +                                       mask | IORESOURCE_PREFETCH, type);
>         unsigned int mem64_mask = 0;
>         resource_size_t children_add_size = 0;
>
> @@ -944,7 +946,9 @@ static int pbus_size_mem(struct pci_bus