Re: [PATCH] powerpc: process.c: fix Kconfig typo
Cyril Bur writes:
> On Wed, 2016-10-05 at 07:57 +0200, Valentin Rothberg wrote:
>> s/ALIVEC/ALTIVEC/
>
> Oops, nice catch.
>
>> Signed-off-by: Valentin Rothberg
>
> Reviewed-by: Cyril Bur

How did we not notice? Sounds like we need a new selftest.

Looks like this should have:

Fixes: dc16b553c949 ("powerpc: Always restore FPU/VEC/VSX if hardware transactional memory in use")

And I guess I need to start running checkkconfigsymbols.py on every commit.

cheers
[PATCH v3 2/2] PCI: Disable VF's memory space on updating IOV BAR in pci_update_resource()
pci_update_resource() might be called to update (shift) IOV BARs in the PPC
PowerNV specific pcibios_sriov_enable() when enabling the PF's SR-IOV
capability. At that point, the PF may already be functional if SR-IOV was
enabled through the sysfs entry "sriov_numvfs". The PF's memory decoding
(0x2 in PCI_COMMAND) shouldn't be disabled when updating its IOV BARs with
pci_update_resource(). Otherwise, we receive EEH errors caused by MMIO
accesses to the PF's memory BARs during the window when the PF's memory
decoding is disabled:

   sriov_numvfs_store
   pdev->driver->sriov_configure
   mlx5_core_sriov_configure
   pci_enable_sriov
   sriov_enable
   pcibios_sriov_enable
   pnv_pci_sriov_enable
   pnv_pci_vf_resource_shift
   pci_update_resource

This disables the VF's memory space instead of the PF's memory decoding
when 64-bit IOV BARs are updated in pci_update_resource().

Reported-by: Carol Soto
Suggested-by: Bjorn Helgaas
Signed-off-by: Gavin Shan
Tested-by: Carol Soto
---
 drivers/pci/setup-res.c | 28 ++++++++++++++++++----------
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c
index 66c4d8f..1456896 100644
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -29,10 +29,10 @@
 void pci_update_resource(struct pci_dev *dev, int resno)
 {
 	struct pci_bus_region region;
-	bool disable;
-	u16 cmd;
+	bool disable = false;
+	u16 cmd, bit;
 	u32 new, check, mask;
-	int reg;
+	int reg, cmd_reg;
 	enum pci_bar_type type;
 	struct resource *res = dev->resource + resno;
 
@@ -81,11 +81,23 @@ void pci_update_resource(struct pci_dev *dev, int resno)
 	 * disable decoding so that a half-updated BAR won't conflict
 	 * with another device.
 	 */
-	disable = (res->flags & IORESOURCE_MEM_64) && !dev->mmio_always_on;
+	if (res->flags & IORESOURCE_MEM_64) {
+		if (resno <= PCI_ROM_RESOURCE) {
+			disable = !dev->mmio_always_on;
+			cmd_reg = PCI_COMMAND;
+			bit = PCI_COMMAND_MEMORY;
+		} else {
+#ifdef CONFIG_PCI_IOV
+			disable = true;
+			cmd_reg = dev->sriov->pos + PCI_SRIOV_CTRL;
+			bit = PCI_SRIOV_CTRL_MSE;
+#endif
+		}
+	}
+
 	if (disable) {
-		pci_read_config_word(dev, PCI_COMMAND, &cmd);
-		pci_write_config_word(dev, PCI_COMMAND,
-				      cmd & ~PCI_COMMAND_MEMORY);
+		pci_read_config_word(dev, cmd_reg, &cmd);
+		pci_write_config_word(dev, cmd_reg, cmd & ~bit);
 	}
 
 	pci_write_config_dword(dev, reg, new);
@@ -107,7 +119,7 @@ void pci_update_resource(struct pci_dev *dev, int resno)
 	}
 
 	if (disable)
-		pci_write_config_word(dev, PCI_COMMAND, cmd);
+		pci_write_config_word(dev, cmd_reg, cmd);
 }
 
 int pci_claim_resource(struct pci_dev *dev, int resource)
-- 
2.1.0
[PATCH v3 0/2] Disable VF's memory space on updating IOV BARs
This moves pcibios_sriov_enable() to the point before VFs and VF BARs are
enabled on the PowerNV platform. Also, pci_update_resource() is used to
update IOV BARs on the PowerNV platform, and the PF might already be
functional when the function is called. We shouldn't disable the PF's
memory decoding at that point. Instead, the VF's memory space should be
disabled.

Changelog
=========
v3:
  * Disable VF's memory space when IOV BARs are updated in
    pcibios_sriov_enable().
v2:
  * Added one patch calling pcibios_sriov_enable() before the VFs and
    VF BARs are enabled.

Gavin Shan (2):
  PCI: Call pcibios_sriov_enable() before IOV BARs are enabled
  PCI: Disable VF's memory space on updating IOV BAR in
    pci_update_resource()

 drivers/pci/iov.c       | 14 +++---
 drivers/pci/setup-res.c | 28 ++++++++++++++++++----------
 2 files changed, 27 insertions(+), 15 deletions(-)

-- 
2.1.0
[PATCH v3 1/2] PCI: Call pcibios_sriov_enable() before IOV BARs are enabled
In the current implementation, pcibios_sriov_enable() is used by the PPC
PowerNV platform only. In the PowerNV specific pcibios_sriov_enable(), the
PF's IOV BARs might be updated (shifted) by pci_update_resource(). That
means the IOV BARs aren't ready to decode incoming memory addresses until
pcibios_sriov_enable() returns.

This calls pcibios_sriov_enable() earlier, before the IOV BARs are
enabled. As a result, the IOV BARs have been configured correctly by the
time they are enabled.

Signed-off-by: Gavin Shan
Tested-by: Carol Soto
---
 drivers/pci/iov.c | 14 +++----
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index e30f05c..d41ec29 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -306,13 +306,6 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
 		return rc;
 	}
 
-	pci_iov_set_numvfs(dev, nr_virtfn);
-	iov->ctrl |= PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE;
-	pci_cfg_access_lock(dev);
-	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
-	msleep(100);
-	pci_cfg_access_unlock(dev);
-
 	iov->initial_VFs = initial;
 	if (nr_virtfn < initial)
 		initial = nr_virtfn;
@@ -323,6 +316,13 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
 		goto err_pcibios;
 	}
 
+	pci_iov_set_numvfs(dev, nr_virtfn);
+	iov->ctrl |= PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE;
+	pci_cfg_access_lock(dev);
+	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
+	msleep(100);
+	pci_cfg_access_unlock(dev);
+
 	for (i = 0; i < initial; i++) {
 		rc = pci_iov_add_virtfn(dev, i, 0);
 		if (rc)
-- 
2.1.0
Re: [PATCH v2 2/2] PCI: Don't disable PF's memory decoding when enabling SRIOV
On Mon, Oct 24, 2016 at 10:51:13PM -0500, Bjorn Helgaas wrote:
>On Tue, Oct 25, 2016 at 12:47:28PM +1100, Gavin Shan wrote:
>> On Mon, Oct 24, 2016 at 09:03:16AM -0500, Bjorn Helgaas wrote:
>> >On Mon, Oct 24, 2016 at 10:28:02AM +1100, Gavin Shan wrote:
>> >> On Fri, Oct 21, 2016 at 11:50:34AM -0500, Bjorn Helgaas wrote:
>> >> >On Fri, Sep 30, 2016 at 09:47:50AM +1000, Gavin Shan wrote:

.../...

>
>That specific case (pci_enable_device() followed by
>pci_update_resource()) should *not* work. pci_enable_device() is
>normally called by a driver's .probe() method, and after we call a
>.probe() method, the PCI core shouldn't touch the device at all
>because there's no means of mutual exclusion between the driver and
>the PCI core.
>
>I think pci_update_resource() should only be called in situations
>where the caller already knows that nobody is using the device. For
>regular PCI BARs, that doesn't necessarily mean PCI_COMMAND_MEMORY is
>turned off, because firmware leaves PCI_COMMAND_MEMORY enabled for
>many devices, even though nobody is using them.
>
>Anyway, I think that's a project for another day. That's too much to
>tackle for the limited problem you're trying to solve.
>

Bjorn, it's all about discussion. Please take your time and reply when
you have bandwidth.

Well, some drivers break the order and expect the relaxed order to work.
One example is drivers/char/agp/efficeon-agp.c::agp_efficeon_probe().
I didn't check all usage cases.

I think it's hard for the user who calls pci_update_resource() to know
that nobody is using the device (limited to the memory BARs we're
concerned with). The memory write is usually a non-posted transaction and
it can be on its way to the target device when pci_update_resource() is
called. So it's unclear which transaction will complete first (disabling
memory decoding or the memory write). I guess it can happen even with the
mutual exclusion, especially on an SMP system. Yes, the situation is worse
without the synchronization.

.../...
>> >> Yeah, it would be the solution to have. If you agree, I will post an
>> updated version according to this: Clearing PCI_SRIOV_CTRL_MSE when
>> updating IOV BARs. The bit won't be touched if pdev->mmio_always_on
>> is true.
>
>I think you should ignore pdev->mmio_always_on for IOV BARs.
>mmio_always_on is basically a workaround for devices that either don't
>follow the spec or where we didn't completely understand the problem.
>I don't think there's any reason to set mmio_always_on for SR-IOV
>devices.
>

Agree, thanks for the comments again. I will post the updated version
shortly.

Thanks,
Gavin
Re: [PATCH v4 4/5] mm: make processing of movable_node arch-specific
On Wed, Oct 26, 2016 at 09:34:18AM +1100, Balbir Singh wrote:
>I still believe we need your changes, I was wondering if we've tested it
>against normal memory nodes and checked if any memblock allocations end
>up there. Michael showed me some memblock allocations on node 1 of a two
>node machine with movable_node

The movable_node option is x86-only. Both of those nodes contain normal
memory, so allocations on both are allowed.

>> Longer; if you use "movable_node", x86 can identify these nodes at
>> boot. They call memblock_mark_hotplug() while parsing the SRAT. Then,
>> when the zones are initialized, those markings are used to determine
>> ZONE_MOVABLE.
>>
>> We have no analog of this SRAT information, so our movable nodes can
>> only be created post boot, by hotplugging and explicitly onlining with
>> online_movable.
>
>Is this true for all of system memory as well or only for nodes
>hotplugged later?

As far as I know, power has nothing like the SRAT that tells us, at boot,
which memory is hotpluggable. So there is nothing to wire the movable_node
option up to.

Of course, any memory you hotplug afterwards is, by definition,
hotpluggable. So we can still create movable nodes that way.

-- 
Reza Arbab
[PATCH net-next v1] ibmveth: calculate correct gso_size and set gso_type
We recently encountered a bug where a few customers using ibmveth on the
same LPAR hit an issue where a TCP session hung when large receive was
enabled. Closer analysis revealed that the session was stuck because one
side was advertising a zero window repeatedly.

We narrowed this down to the fact that the ibmveth driver did not set
gso_size, which is translated by TCP into the MSS later up the stack. The
MSS is used to calculate the TCP window size, and as that was abnormally
large, it was calculating a zero window even though the socket's receive
buffer was completely empty.

We were able to reproduce this and worked with IBM to fix it. Thanks Tom
and Marcelo for all your help and review on this.

The patch fixes both our internal reproduction tests and our customers'
tests.

Signed-off-by: Jon Maxwell
---
 drivers/net/ethernet/ibm/ibmveth.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/drivers/net/ethernet/ibm/ibmveth.c b/drivers/net/ethernet/ibm/ibmveth.c
index 29c05d0..c51717e 100644
--- a/drivers/net/ethernet/ibm/ibmveth.c
+++ b/drivers/net/ethernet/ibm/ibmveth.c
@@ -1182,6 +1182,8 @@ static int ibmveth_poll(struct napi_struct *napi, int budget)
 	int frames_processed = 0;
 	unsigned long lpar_rc;
 	struct iphdr *iph;
+	bool large_packet = 0;
+	u16 hdr_len = ETH_HLEN + sizeof(struct tcphdr);
 
 restart_poll:
 	while (frames_processed < budget) {
@@ -1236,10 +1238,28 @@ static int ibmveth_poll(struct napi_struct *napi, int budget)
 					iph->check = 0;
 					iph->check = ip_fast_csum((unsigned char *)iph, iph->ihl);
 					adapter->rx_large_packets++;
+					large_packet = 1;
 				}
 			}
 		}
 
+		if (skb->len > netdev->mtu) {
+			iph = (struct iphdr *)skb->data;
+			if (be16_to_cpu(skb->protocol) == ETH_P_IP &&
+			    iph->protocol == IPPROTO_TCP) {
+				hdr_len += sizeof(struct iphdr);
+				skb_shinfo(skb)->gso_type = SKB_GSO_TCPV4;
+				skb_shinfo(skb)->gso_size = netdev->mtu - hdr_len;
+			} else if (be16_to_cpu(skb->protocol) == ETH_P_IPV6 &&
+				   iph->protocol == IPPROTO_TCP) {
+				hdr_len += sizeof(struct ipv6hdr);
+				skb_shinfo(skb)->gso_type = SKB_GSO_TCPV6;
+				skb_shinfo(skb)->gso_size = netdev->mtu - hdr_len;
+			}
+			if (!large_packet)
+				adapter->rx_large_packets++;
+		}
+
 		napi_gro_receive(napi, skb);	/* send it up */
 
 		netdev->stats.rx_packets++;
-- 
1.8.3.1
Re: [PATCH v4 4/5] mm: make processing of movable_node arch-specific
On 26/10/16 02:55, Reza Arbab wrote:
> On Tue, Oct 25, 2016 at 11:15:40PM +1100, Balbir Singh wrote:
>> After the ack, I realized there were some more checks needed, IOW
>> questions for you :)
>
> Hey! No takebacks!
>

I still believe we need your changes, I was wondering if we've tested it
against normal memory nodes and checked if any memblock allocations end up
there. Michael showed me some memblock allocations on node 1 of a two node
machine with movable_node. I'll double check at my end. See my question
below.

> The short answer is that neither of these is a concern.
>
> Longer; if you use "movable_node", x86 can identify these nodes at boot.
> They call memblock_mark_hotplug() while parsing the SRAT. Then, when the
> zones are initialized, those markings are used to determine ZONE_MOVABLE.
>
> We have no analog of this SRAT information, so our movable nodes can only
> be created post boot, by hotplugging and explicitly onlining with
> online_movable.
>

Is this true for all of system memory as well or only for nodes hotplugged
later?

Balbir Singh.
Re: [PATCH V6 7/8] powerpc: Check arch.vec earlier during boot for memory features
> On 09/21/2016 09:17 AM, Michael Bringmann wrote:
>> architecture.vec5 features: The boot-time memory management needs to
>> know the form of the "ibm,dynamic-memory-v2" property early during
>> scanning of the flattened device tree. This patch moves execution of
>> the function pseries_probe_fw_features() early enough to be before
>> the scanning of the memory properties in the device tree to allow
>> recognition of the supported properties.
>>
>> [V2: No change]
>> [V3: Updated after commit 3808a88985b4f5f5e947c364debce4441a380fb8.]
>> [V4: Update comments]
>> [V5: Resynchronize/resubmit]
>> [V6: Resync to v4.7 kernel code]
>>
>> Signed-off-by: Michael Bringmann
>> ---
>> diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
>> index 946e34f..2034edc 100644
>> --- a/arch/powerpc/kernel/prom.c
>> +++ b/arch/powerpc/kernel/prom.c
>> @@ -753,6 +753,9 @@ void __init early_init_devtree(void *params)
>> 	 */
>> 	of_scan_flat_dt(early_init_dt_scan_chosen_ppc, boot_command_line);
>>
>> +	/* Now try to figure out if we are running on LPAR and so on */
>> +	pseries_probe_fw_features();
>> +
>
> I'll have to defer to others on whether calling this earlier in boot
> is ok.

It is scanning the flattened device tree supplied by the BMC, though this
is not the first such call to do so. The relevant content of the device
tree should not change between the new, earlier call site and the former,
later one.

> I do notice that you do not remove the call later on, any reason?

Bug in the patch. Corrected in the next patch group submission.

> -Nathan

>> 	/* Scan memory nodes and rebuild MEMBLOCKs */
>> 	of_scan_flat_dt(early_init_dt_scan_root, NULL);
>> 	of_scan_flat_dt(early_init_dt_scan_memory_ppc, NULL);

-- 
Michael W. Bringmann
Linux Technology Center
IBM Corporation
Tie-Line 363-5196
External: (512) 286-5196
Cell: (512) 466-0650
m...@linux.vnet.ibm.com
[net-next PATCH 18/27] arch/powerpc: Add option to skip DMA sync as a part of mapping
This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC, which lets us avoid
invoking cache line invalidation if the driver will just handle it via a
sync_for_cpu or sync_for_device call.

Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Michael Ellerman
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Alexander Duyck
---
 arch/powerpc/kernel/dma.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/dma.c b/arch/powerpc/kernel/dma.c
index e64a601..6877e3f 100644
--- a/arch/powerpc/kernel/dma.c
+++ b/arch/powerpc/kernel/dma.c
@@ -203,6 +203,10 @@ static int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl,
 	for_each_sg(sgl, sg, nents, i) {
 		sg->dma_address = sg_phys(sg) + get_dma_offset(dev);
 		sg->dma_length = sg->length;
+
+		if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+			continue;
+
 		__dma_sync_page(sg_page(sg), sg->offset, sg->length, direction);
 	}
 
@@ -235,7 +239,10 @@ static inline dma_addr_t dma_direct_map_page(struct device *dev,
 					     unsigned long attrs)
 {
 	BUG_ON(dir == DMA_NONE);
-	__dma_sync_page(page, offset, size, dir);
+
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		__dma_sync_page(page, offset, size, dir);
+
 	return page_to_phys(page) + offset + get_dma_offset(dev);
 }
[PATCH V8 0/8] powerpc/devtree: Add support for 2 new DRC properties
Several properties in the DRC device tree format are replaced by more
compact representations, to allow, for example, for the encoding of vast
amounts of memory and/or reduced duplication of information in related
data structures.

"ibm,drc-info": This property, when present, replaces the following four
properties: "ibm,drc-indexes", "ibm,drc-names", "ibm,drc-types" and
"ibm,drc-power-domains". This property is defined for all dynamically
reconfigurable platform nodes. The "ibm,drc-info" elements are intended
to provide a more compact representation and reduce some search overhead.

"ibm,dynamic-memory-v2": This property replaces the "ibm,dynamic-memory"
node representation within the "ibm,dynamic-reconfiguration-memory"
property provided by the BMC. This element format is intended to provide
a more compact representation of memory, especially for systems with
massive amounts of RAM. To simplify portability, this property is
converted to the "ibm,dynamic-memory" property during system boot.

"ibm,architecture.vec": Bidirectional communication mechanism between the
host system and the front end processor indicating what features the host
system supports and what features the front end processor will actually
provide. In this case, we are indicating that the host system can support
the new device tree structures "ibm,drc-info" and "ibm,dynamic-memory-v2".

[V1: Initial presentation of PAPR 2.7 changes to device tree.]
[V2: Revise constant names. Fix some syntax errors. Improve comments.]
[V3: Revise tests for presence of new properties to always scan the
     device tree instead of depending upon the architecture vec, due to
     reboot issues.]
[V4: Rearrange some code changes in patches to better match application,
     and other code cleanup.]
[V5: Resynchronize patches.]
[V6: Resync to latest kernel commit code]
[V7: Correct mail threading]
[v8: Insert more useful variable names]

Signed-off-by: Michael Bringmann
---

Michael Bringmann (8):
  powerpc/firmware: Add definitions for new firmware features.
  powerpc/memory: Parse new memory property to register blocks.
  powerpc/memory: Parse new memory property to initialize structures.
  pseries/hotplug init: Convert new DRC memory property for hotplug runtime
  pseries/drc-info: Search new DRC properties for CPU indexes
  hotplug/drc-info: Add code to search new devtree properties
  powerpc: Check arch.vec earlier during boot for memory features
  powerpc: Enable support for new DRC devtree properties

 arch/powerpc/include/asm/firmware.h             |   5 -
 arch/powerpc/include/asm/prom.h                 |  38 -
 arch/powerpc/kernel/prom.c                      | 103 +++--
 arch/powerpc/kernel/prom_init.c                 |   3
 arch/powerpc/mm/numa.c                          | 168 ++--
 arch/powerpc/platforms/pseries/Makefile         |   4
 arch/powerpc/platforms/pseries/firmware.c       |   2
 arch/powerpc/platforms/pseries/hotplug-memory.c |  93 +++
 arch/powerpc/platforms/pseries/pseries_energy.c | 189 ---
 drivers/pci/hotplug/rpadlpar_core.c             |  13 +-
 drivers/pci/hotplug/rpaphp.h                    |   4
 drivers/pci/hotplug/rpaphp_core.c               | 109 ++---
 12 files changed, 628 insertions(+), 103 deletions(-)

-- 
Michael W. Bringmann
Linux Technology Center
IBM Corporation
m...@linux.vnet.ibm.com
Re: [PATCH net-next] ibmveth: calculate correct gso_size and set gso_type
>> +	u16 hdr_len = ETH_HLEN + sizeof(struct tcphdr);
>
> Compiler may optimize this, but maybe move hdr_len to [*] ?

There are other places in the stack where a u16 is used for the same
purpose, so I'd rather stick to that convention. I'll make the other
formatting changes you suggested and resubmit as v1.

Thanks

Jon

On Tue, Oct 25, 2016 at 9:31 PM, Marcelo Ricardo Leitner wrote:
> On Tue, Oct 25, 2016 at 04:13:41PM +1100, Jon Maxwell wrote:
>> We recently encountered a bug where a few customers using ibmveth on the
>> same LPAR hit an issue where a TCP session hung when large receive was
>> enabled. Closer analysis revealed that the session was stuck because the
>> one side was advertising a zero window repeatedly.
>>
>> We narrowed this down to the fact the ibmveth driver did not set gso_size
>> which is translated by TCP into the MSS later up the stack. The MSS is
>> used to calculate the TCP window size and as that was abnormally large,
>> it was calculating a zero window, even although the sockets receive buffer
>> was completely empty.
>>
>> We were able to reproduce this and worked with IBM to fix this. Thanks Tom
>> and Marcelo for all your help and review on this.
>>
>> The patch fixes both our internal reproduction tests and our customers tests.
>>
>> Signed-off-by: Jon Maxwell
>> ---
>>  drivers/net/ethernet/ibm/ibmveth.c | 19 +++++++++++++++++++
>>  1 file changed, 19 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/ibm/ibmveth.c b/drivers/net/ethernet/ibm/ibmveth.c
>> index 29c05d0..3028c33 100644
>> --- a/drivers/net/ethernet/ibm/ibmveth.c
>> +++ b/drivers/net/ethernet/ibm/ibmveth.c
>> @@ -1182,6 +1182,8 @@ static int ibmveth_poll(struct napi_struct *napi, int budget)
>>  	int frames_processed = 0;
>>  	unsigned long lpar_rc;
>>  	struct iphdr *iph;
>> +	bool large_packet = 0;
>> +	u16 hdr_len = ETH_HLEN + sizeof(struct tcphdr);
>
> Compiler may optimize this, but maybe move hdr_len to [*] ?
>
>>
>>  restart_poll:
>>  	while (frames_processed < budget) {
>> @@ -1236,10 +1238,27 @@ static int ibmveth_poll(struct napi_struct *napi, int budget)
>>  				iph->check = 0;
>>  				iph->check = ip_fast_csum((unsigned char *)iph, iph->ihl);
>>  				adapter->rx_large_packets++;
>> +				large_packet = 1;
>>  			}
>>  		}
>>  	}
>>
>> +	if (skb->len > netdev->mtu) {
>
> [*]
>
>> +		iph = (struct iphdr *)skb->data;
>> +		if (be16_to_cpu(skb->protocol) == ETH_P_IP && iph->protocol == IPPROTO_TCP) {
>
> The if line above is too long, should be broken in two.
>
>> +			hdr_len += sizeof(struct iphdr);
>> +			skb_shinfo(skb)->gso_type = SKB_GSO_TCPV4;
>> +			skb_shinfo(skb)->gso_size = netdev->mtu - hdr_len;
>> +		} else if (be16_to_cpu(skb->protocol) == ETH_P_IPV6 &&
>> +			iph->protocol == IPPROTO_TCP) {
>                         ^
> And this one should start 3 spaces later, right below be16_
>
> Marcelo
>
>> +			hdr_len += sizeof(struct ipv6hdr);
>> +			skb_shinfo(skb)->gso_type = SKB_GSO_TCPV6;
>> +			skb_shinfo(skb)->gso_size = netdev->mtu - hdr_len;
>> +		}
>> +		if (!large_packet)
>> +			adapter->rx_large_packets++;
>> +	}
>> +
>> 	napi_gro_receive(napi, skb);	/* send it up */
>>
>> 	netdev->stats.rx_packets++;
>> --
>> 1.8.3.1
[PATCHv2 7/7] mm: kill arch_mremap
This reverts commit 4abad2ca4a4d ("mm: new arch_remap() hook") and
commit 2ae416b142b6 ("mm: new mm hook framework"). It also keeps the same
functionality of mremapping the vDSO blob by introducing a
vm_special_mapping mremap op for powerpc.

Cc: Laurent Dufour
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Michael Ellerman
Cc: "Kirill A. Shutemov"
Cc: Andy Lutomirski
Cc: Oleg Nesterov
Cc: Andrew Morton
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux...@kvack.org
Signed-off-by: Dmitry Safonov
---
v2: use vdso64_pages only under CONFIG_PPC64

 arch/alpha/include/asm/Kbuild            |  1 -
 arch/arc/include/asm/Kbuild              |  1 -
 arch/arm/include/asm/Kbuild              |  1 -
 arch/arm64/include/asm/Kbuild            |  1 -
 arch/avr32/include/asm/Kbuild            |  1 -
 arch/blackfin/include/asm/Kbuild         |  1 -
 arch/c6x/include/asm/Kbuild              |  1 -
 arch/cris/include/asm/Kbuild             |  1 -
 arch/frv/include/asm/Kbuild              |  1 -
 arch/h8300/include/asm/Kbuild            |  1 -
 arch/hexagon/include/asm/Kbuild          |  1 -
 arch/ia64/include/asm/Kbuild             |  1 -
 arch/m32r/include/asm/Kbuild             |  1 -
 arch/m68k/include/asm/Kbuild             |  1 -
 arch/metag/include/asm/Kbuild            |  1 -
 arch/microblaze/include/asm/Kbuild       |  1 -
 arch/mips/include/asm/Kbuild             |  1 -
 arch/mn10300/include/asm/Kbuild          |  1 -
 arch/nios2/include/asm/Kbuild            |  1 -
 arch/openrisc/include/asm/Kbuild         |  1 -
 arch/parisc/include/asm/Kbuild           |  1 -
 arch/powerpc/include/asm/mm-arch-hooks.h | 28 -
 arch/powerpc/kernel/vdso.c               | 25 +
 arch/powerpc/kernel/vdso_common.c        |  1 +
 arch/s390/include/asm/Kbuild             |  1 -
 arch/score/include/asm/Kbuild            |  1 -
 arch/sh/include/asm/Kbuild               |  1 -
 arch/sparc/include/asm/Kbuild            |  1 -
 arch/tile/include/asm/Kbuild             |  1 -
 arch/um/include/asm/Kbuild               |  1 -
 arch/unicore32/include/asm/Kbuild        |  1 -
 arch/x86/include/asm/Kbuild              |  1 -
 arch/xtensa/include/asm/Kbuild           |  1 -
 include/asm-generic/mm-arch-hooks.h      | 16 -
 include/linux/mm-arch-hooks.h            | 25 -
 mm/mremap.c                              |  4
 36 files changed, 26 insertions(+), 103 deletions(-)
 delete mode 100644 arch/powerpc/include/asm/mm-arch-hooks.h
 delete mode 100644 include/asm-generic/mm-arch-hooks.h
 delete mode 100644 include/linux/mm-arch-hooks.h

diff --git a/arch/alpha/include/asm/Kbuild b/arch/alpha/include/asm/Kbuild
index bf8475ce85ee..0a5e0ec2842b 100644
--- a/arch/alpha/include/asm/Kbuild
+++ b/arch/alpha/include/asm/Kbuild
@@ -6,7 +6,6 @@ generic-y += exec.h
 generic-y += export.h
 generic-y += irq_work.h
 generic-y += mcs_spinlock.h
-generic-y += mm-arch-hooks.h
 generic-y += preempt.h
 generic-y += sections.h
 generic-y += trace_clock.h
diff --git a/arch/arc/include/asm/Kbuild b/arch/arc/include/asm/Kbuild
index c332604606dd..e6059a808463 100644
--- a/arch/arc/include/asm/Kbuild
+++ b/arch/arc/include/asm/Kbuild
@@ -22,7 +22,6 @@ generic-y += kvm_para.h
 generic-y += local.h
 generic-y += local64.h
 generic-y += mcs_spinlock.h
-generic-y += mm-arch-hooks.h
 generic-y += mman.h
 generic-y += msgbuf.h
 generic-y += msi.h
diff --git a/arch/arm/include/asm/Kbuild b/arch/arm/include/asm/Kbuild
index 0745538b26d3..44b717cb4a55 100644
--- a/arch/arm/include/asm/Kbuild
+++ b/arch/arm/include/asm/Kbuild
@@ -15,7 +15,6 @@ generic-y += irq_regs.h
 generic-y += kdebug.h
 generic-y += local.h
 generic-y += local64.h
-generic-y += mm-arch-hooks.h
 generic-y += msgbuf.h
 generic-y += msi.h
 generic-y += param.h
diff --git a/arch/arm64/include/asm/Kbuild b/arch/arm64/include/asm/Kbuild
index 44e1d7f10add..a42a1367aea4 100644
--- a/arch/arm64/include/asm/Kbuild
+++ b/arch/arm64/include/asm/Kbuild
@@ -20,7 +20,6 @@ generic-y += kvm_para.h
 generic-y += local.h
 generic-y += local64.h
 generic-y += mcs_spinlock.h
-generic-y += mm-arch-hooks.h
 generic-y += mman.h
 generic-y += msgbuf.h
 generic-y += msi.h
diff --git a/arch/avr32/include/asm/Kbuild b/arch/avr32/include/asm/Kbuild
index 241b9b9729d8..519810d0d5e1 100644
--- a/arch/avr32/include/asm/Kbuild
+++ b/arch/avr32/include/asm/Kbuild
@@ -12,7 +12,6 @@ generic-y += irq_work.h
 generic-y += local.h
 generic-y += local64.h
 generic-y += mcs_spinlock.h
-generic-y += mm-arch-hooks.h
 generic-y += param.h
 generic-y += percpu.h
 generic-y += preempt.h
diff --git a/arch/blackfin/include/asm/Kbuild b/arch/blackfin/include/asm/Kbuild
index 91d49c0a3118..c80181e4454f 100644
--- a/arch/blackfin/include/asm/Kbuild
+++
Re: [PATCH] powerpc: Use pr_warn instead of pr_warning
On 10/24/2016 09:00 PM, Joe Perches wrote:
> At some point, pr_warning will be removed so all logging messages use
> a consistent _warn style.
>
> Update arch/powerpc/

> arch/powerpc/platforms/ps3/device-init.c | 12 +---
> arch/powerpc/platforms/ps3/mm.c          |  4 ++--
> arch/powerpc/platforms/ps3/os-area.c     |  2 +-

PS3 parts look OK.

Acked-by: Geoff Levand
Re: [PATCH v4 4/5] mm: make processing of movable_node arch-specific
On Tue, Oct 25, 2016 at 11:15:40PM +1100, Balbir Singh wrote:
>After the ack, I realized there were some more checks needed, IOW
>questions for you :)

Hey! No takebacks!

The short answer is that neither of these is a concern.

Longer; if you use "movable_node", x86 can identify these nodes at boot.
They call memblock_mark_hotplug() while parsing the SRAT. Then, when the
zones are initialized, those markings are used to determine ZONE_MOVABLE.

We have no analog of this SRAT information, so our movable nodes can only
be created post boot, by hotplugging and explicitly onlining with
online_movable.

>1. Have you checked to see if our memblock allocations spill over to
>   probably hotpluggable nodes?

Since our nodes don't exist at boot, we don't have that short window
before the zones are drawn where the node has normal memory, and a kernel
allocation might occur within.

>2. Shouldn't we be marking nodes discovered as movable via
>   memblock_mark_hotplug()?

Again, this early boot marking mechanism only applies to movable_node.

-- 
Reza Arbab
[PATCH 4/7] powerpc/vdso: introduce init_vdso{32,64}_pagelist
Common code for allocation/initialization of the vDSO pagelist.

Impact: cleanup

Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Michael Ellerman
Cc: Andy Lutomirski
Cc: Oleg Nesterov
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux...@kvack.org
Signed-off-by: Dmitry Safonov
---
 arch/powerpc/kernel/vdso.c        | 27 ++-------------------------
 arch/powerpc/kernel/vdso_common.c | 22 ++++++++++++++++++++++
 2 files changed, 24 insertions(+), 25 deletions(-)

diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c
index 8010a0d82049..25d03d773c49 100644
--- a/arch/powerpc/kernel/vdso.c
+++ b/arch/powerpc/kernel/vdso.c
@@ -382,8 +382,6 @@ early_initcall(vdso_getcpu_init);
 
 static int __init vdso_init(void)
 {
-	int i;
-
 #ifdef CONFIG_PPC64
 	/*
 	 * Fill up the "systemcfg" stuff for backward compatibility
@@ -454,32 +452,11 @@ static int __init vdso_init(void)
 	}
 
 #ifdef CONFIG_VDSO32
-	/* Make sure pages are in the correct state */
-	vdso32_pagelist = kzalloc(sizeof(struct page *) * (vdso32_pages + 2),
-				  GFP_KERNEL);
-	BUG_ON(vdso32_pagelist == NULL);
-	for (i = 0; i < vdso32_pages; i++) {
-		struct page *pg = virt_to_page(vdso32_kbase + i*PAGE_SIZE);
-		ClearPageReserved(pg);
-		get_page(pg);
-		vdso32_pagelist[i] = pg;
-	}
-	vdso32_pagelist[i++] = virt_to_page(vdso_data);
-	vdso32_pagelist[i] = NULL;
+	init_vdso32_pagelist();
 #endif
 
 #ifdef CONFIG_PPC64
-	vdso64_pagelist = kzalloc(sizeof(struct page *) * (vdso64_pages + 2),
-				  GFP_KERNEL);
-	BUG_ON(vdso64_pagelist == NULL);
-	for (i = 0; i < vdso64_pages; i++) {
-		struct page *pg = virt_to_page(vdso64_kbase + i*PAGE_SIZE);
-		ClearPageReserved(pg);
-		get_page(pg);
-		vdso64_pagelist[i] = pg;
-	}
-	vdso64_pagelist[i++] = virt_to_page(vdso_data);
-	vdso64_pagelist[i] = NULL;
+	init_vdso64_pagelist();
 #endif /* CONFIG_PPC64 */
 
 	get_page(virt_to_page(vdso_data));
diff --git a/arch/powerpc/kernel/vdso_common.c b/arch/powerpc/kernel/vdso_common.c
index ac25d66134fb..c97c30606b3f 100644
--- a/arch/powerpc/kernel/vdso_common.c
+++ b/arch/powerpc/kernel/vdso_common.c
@@ -14,6 +14,7 @@
 #define VDSO_LBASE	CONCAT3(VDSO, BITS, _LBASE)
 #define vdso_kbase	CONCAT3(vdso, BITS, _kbase)
 #define vdso_pages	CONCAT3(vdso, BITS, _pages)
+#define vdso_pagelist	CONCAT3(vdso, BITS, _pagelist)
 
 #undef pr_fmt
 #define pr_fmt(fmt)	"vDSO" __stringify(BITS) ": " fmt
@@ -202,6 +203,25 @@ static __init int vdso_setup(struct lib_elfinfo *v)
 	return 0;
 }
 
+#define init_vdso_pagelist	CONCAT3(init_vdso, BITS, _pagelist)
+static __init void init_vdso_pagelist(void)
+{
+	int i;
+
+	/* Make sure pages are in the correct state */
+	vdso_pagelist = kzalloc(sizeof(struct page *) * (vdso_pages + 2),
+				GFP_KERNEL);
+	BUG_ON(vdso_pagelist == NULL);
+	for (i = 0; i < vdso_pages; i++) {
+		struct page *pg = virt_to_page(vdso_kbase + i*PAGE_SIZE);
+
+		ClearPageReserved(pg);
+		get_page(pg);
+		vdso_pagelist[i] = pg;
+	}
+	vdso_pagelist[i++] = virt_to_page(vdso_data);
+	vdso_pagelist[i] = NULL;
+}
+
 #undef find_section
 #undef find_symbol
@@ -211,10 +231,12 @@ static __init int vdso_setup(struct lib_elfinfo *v)
 #undef vdso_fixup_datapage
 #undef vdso_fixup_features
 #undef vdso_setup
+#undef init_vdso_pagelist
 
 #undef VDSO_LBASE
 #undef vdso_kbase
 #undef vdso_pages
+#undef vdso_pagelist
 #undef lib_elfinfo
 #undef BITS
 #undef _CONCAT3
-- 
2.10.0
[PATCH 3/7] powerpc/vdso: separate common code in vdso_common
Impact: cleanup I also switched usage of printk(KERN_*, ...) to pr_*(...) and used the pr_fmt() macro for the "vDSO{32,64}: " prefix. Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Michael Ellerman Cc: Andy Lutomirski Cc: Oleg Nesterov Cc: linuxppc-dev@lists.ozlabs.org Cc: linux...@kvack.org Signed-off-by: Dmitry Safonov --- arch/powerpc/kernel/vdso.c| 352 ++ arch/powerpc/kernel/vdso_common.c | 221 2 files changed, 234 insertions(+), 339 deletions(-) create mode 100644 arch/powerpc/kernel/vdso_common.c diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c index 278b9aa25a1c..8010a0d82049 100644 --- a/arch/powerpc/kernel/vdso.c +++ b/arch/powerpc/kernel/vdso.c @@ -51,13 +51,13 @@ #define VDSO_ALIGNMENT (1 << 16) static unsigned int vdso32_pages; -static void *vdso32_kbase; static struct page **vdso32_pagelist; unsigned long vdso32_sigtramp; unsigned long vdso32_rt_sigtramp; #ifdef CONFIG_VDSO32 extern char vdso32_start, vdso32_end; +static void *vdso32_kbase; #endif #ifdef CONFIG_PPC64 @@ -246,250 +246,16 @@ const char *arch_vma_name(struct vm_area_struct *vma) return NULL; } - - #ifdef CONFIG_VDSO32 -static void * __init find_section32(Elf32_Ehdr *ehdr, const char *secname, - unsigned long *size) -{ - Elf32_Shdr *sechdrs; - unsigned int i; - char *secnames; - - /* Grab section headers and strings so we can tell who is who */ - sechdrs = (void *)ehdr + ehdr->e_shoff; - secnames = (void *)ehdr + sechdrs[ehdr->e_shstrndx].sh_offset; - - /* Find the section they want */ - for (i = 1; i < ehdr->e_shnum; i++) { - if (strcmp(secnames+sechdrs[i].sh_name, secname) == 0) { - if (size) - *size = sechdrs[i].sh_size; - return (void *)ehdr + sechdrs[i].sh_offset; - } - } - *size = 0; - return NULL; -} - -static Elf32_Sym * __init find_symbol32(struct lib32_elfinfo *lib, - const char *symname) -{ - unsigned int i; - char name[MAX_SYMNAME], *c; - - for (i = 0; i < (lib->dynsymsize / sizeof(Elf32_Sym)); i++) { - if (lib->dynsym[i].st_name == 0) - continue; - strlcpy(name, 
lib->dynstr + lib->dynsym[i].st_name, - MAX_SYMNAME); - c = strchr(name, '@'); - if (c) - *c = 0; - if (strcmp(symname, name) == 0) - return &lib->dynsym[i]; - } - return NULL; -} - -/* Note that we assume the section is .text and the symbol is relative to - * the library base - */ -static unsigned long __init find_function32(struct lib32_elfinfo *lib, - const char *symname) -{ - Elf32_Sym *sym = find_symbol32(lib, symname); - - if (sym == NULL) { - printk(KERN_WARNING "vDSO32: function %s not found !\n", - symname); - return 0; - } - return sym->st_value - VDSO32_LBASE; -} - -static int __init vdso_do_func_patch32(struct lib32_elfinfo *v32, - const char *orig, const char *fix) -{ - Elf32_Sym *sym32_gen, *sym32_fix; - - sym32_gen = find_symbol32(v32, orig); - if (sym32_gen == NULL) { - printk(KERN_ERR "vDSO32: Can't find symbol %s !\n", orig); - return -1; - } - if (fix == NULL) { - sym32_gen->st_name = 0; - return 0; - } - sym32_fix = find_symbol32(v32, fix); - if (sym32_fix == NULL) { - printk(KERN_ERR "vDSO32: Can't find symbol %s !\n", fix); - return -1; - } - sym32_gen->st_value = sym32_fix->st_value; - sym32_gen->st_size = sym32_fix->st_size; - sym32_gen->st_info = sym32_fix->st_info; - sym32_gen->st_other = sym32_fix->st_other; - sym32_gen->st_shndx = sym32_fix->st_shndx; - - return 0; -} -#else /* !CONFIG_VDSO32 */ -static unsigned long __init find_function32(struct lib32_elfinfo *lib, - const char *symname) -{ - return 0; -} - -static int __init vdso_do_func_patch32(struct lib32_elfinfo *v32, - const char *orig, const char *fix) -{ - return 0; -} +#include "vdso_common.c" #endif /* CONFIG_VDSO32 */ - #ifdef CONFIG_PPC64 - -static void * __init find_section64(Elf64_Ehdr *ehdr, const char *secname, - unsigned long *size) -{ - Elf64_Shdr *sechdrs; - unsigned int i; - char *secnames; - - /* Grab section headers and strings so
[PATCH 7/7] mm: kill arch_mremap
This reverts commit 4abad2ca4a4d ("mm: new arch_remap() hook") and commit 2ae416b142b6 ("mm: new mm hook framework"). It keeps the same functionality, mremapping of the vDSO blob, by introducing a vm_special_mapping mremap op for powerpc. Cc: Laurent Dufour Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Michael Ellerman Cc: "Kirill A. Shutemov" Cc: Andy Lutomirski Cc: Oleg Nesterov Cc: Andrew Morton Cc: linuxppc-dev@lists.ozlabs.org Cc: linux...@kvack.org Signed-off-by: Dmitry Safonov --- arch/alpha/include/asm/Kbuild| 1 - arch/arc/include/asm/Kbuild | 1 - arch/arm/include/asm/Kbuild | 1 - arch/arm64/include/asm/Kbuild| 1 - arch/avr32/include/asm/Kbuild| 1 - arch/blackfin/include/asm/Kbuild | 1 - arch/c6x/include/asm/Kbuild | 1 - arch/cris/include/asm/Kbuild | 1 - arch/frv/include/asm/Kbuild | 1 - arch/h8300/include/asm/Kbuild| 1 - arch/hexagon/include/asm/Kbuild | 1 - arch/ia64/include/asm/Kbuild | 1 - arch/m32r/include/asm/Kbuild | 1 - arch/m68k/include/asm/Kbuild | 1 - arch/metag/include/asm/Kbuild| 1 - arch/microblaze/include/asm/Kbuild | 1 - arch/mips/include/asm/Kbuild | 1 - arch/mn10300/include/asm/Kbuild | 1 - arch/nios2/include/asm/Kbuild| 1 - arch/openrisc/include/asm/Kbuild | 1 - arch/parisc/include/asm/Kbuild | 1 - arch/powerpc/include/asm/mm-arch-hooks.h | 28 arch/powerpc/kernel/vdso.c | 19 +++ arch/powerpc/kernel/vdso_common.c| 1 + arch/s390/include/asm/Kbuild | 1 - arch/score/include/asm/Kbuild| 1 - arch/sh/include/asm/Kbuild | 1 - arch/sparc/include/asm/Kbuild| 1 - arch/tile/include/asm/Kbuild | 1 - arch/um/include/asm/Kbuild | 1 - arch/unicore32/include/asm/Kbuild| 1 - arch/x86/include/asm/Kbuild | 1 - arch/xtensa/include/asm/Kbuild | 1 - include/asm-generic/mm-arch-hooks.h | 16 - include/linux/mm-arch-hooks.h| 25 - mm/mremap.c | 4 36 files changed, 20 insertions(+), 103 deletions(-) delete mode 100644 arch/powerpc/include/asm/mm-arch-hooks.h delete mode 100644 include/asm-generic/mm-arch-hooks.h delete mode 100644 include/linux/mm-arch-hooks.h diff 
--git a/arch/alpha/include/asm/Kbuild b/arch/alpha/include/asm/Kbuild index bf8475ce85ee..0a5e0ec2842b 100644 --- a/arch/alpha/include/asm/Kbuild +++ b/arch/alpha/include/asm/Kbuild @@ -6,7 +6,6 @@ generic-y += exec.h generic-y += export.h generic-y += irq_work.h generic-y += mcs_spinlock.h -generic-y += mm-arch-hooks.h generic-y += preempt.h generic-y += sections.h generic-y += trace_clock.h diff --git a/arch/arc/include/asm/Kbuild b/arch/arc/include/asm/Kbuild index c332604606dd..e6059a808463 100644 --- a/arch/arc/include/asm/Kbuild +++ b/arch/arc/include/asm/Kbuild @@ -22,7 +22,6 @@ generic-y += kvm_para.h generic-y += local.h generic-y += local64.h generic-y += mcs_spinlock.h -generic-y += mm-arch-hooks.h generic-y += mman.h generic-y += msgbuf.h generic-y += msi.h diff --git a/arch/arm/include/asm/Kbuild b/arch/arm/include/asm/Kbuild index 0745538b26d3..44b717cb4a55 100644 --- a/arch/arm/include/asm/Kbuild +++ b/arch/arm/include/asm/Kbuild @@ -15,7 +15,6 @@ generic-y += irq_regs.h generic-y += kdebug.h generic-y += local.h generic-y += local64.h -generic-y += mm-arch-hooks.h generic-y += msgbuf.h generic-y += msi.h generic-y += param.h diff --git a/arch/arm64/include/asm/Kbuild b/arch/arm64/include/asm/Kbuild index 44e1d7f10add..a42a1367aea4 100644 --- a/arch/arm64/include/asm/Kbuild +++ b/arch/arm64/include/asm/Kbuild @@ -20,7 +20,6 @@ generic-y += kvm_para.h generic-y += local.h generic-y += local64.h generic-y += mcs_spinlock.h -generic-y += mm-arch-hooks.h generic-y += mman.h generic-y += msgbuf.h generic-y += msi.h diff --git a/arch/avr32/include/asm/Kbuild b/arch/avr32/include/asm/Kbuild index 241b9b9729d8..519810d0d5e1 100644 --- a/arch/avr32/include/asm/Kbuild +++ b/arch/avr32/include/asm/Kbuild @@ -12,7 +12,6 @@ generic-y += irq_work.h generic-y += local.h generic-y += local64.h generic-y += mcs_spinlock.h -generic-y += mm-arch-hooks.h generic-y += param.h generic-y += percpu.h generic-y += preempt.h diff --git a/arch/blackfin/include/asm/Kbuild 
b/arch/blackfin/include/asm/Kbuild index 91d49c0a3118..c80181e4454f 100644 --- a/arch/blackfin/include/asm/Kbuild +++ b/arch/blackfin/include/asm/Kbuild @@ -21,7 +21,6 @@
[PATCH 6/7] powerpc/vdso: switch from legacy_special_mapping_vmops
This will allow introducing the mremap hook (the next patch). Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Michael Ellerman Cc: Andy Lutomirski Cc: Oleg Nesterov Cc: linuxppc-dev@lists.ozlabs.org Cc: linux...@kvack.org Signed-off-by: Dmitry Safonov --- arch/powerpc/kernel/vdso.c| 19 +++ arch/powerpc/kernel/vdso_common.c | 8 ++-- 2 files changed, 17 insertions(+), 10 deletions(-) diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c index e68601ffc9ad..9ee3fd65c6e9 100644 --- a/arch/powerpc/kernel/vdso.c +++ b/arch/powerpc/kernel/vdso.c @@ -51,7 +51,7 @@ #define VDSO_ALIGNMENT (1 << 16) static unsigned int vdso32_pages; -static struct page **vdso32_pagelist; +static struct vm_special_mapping vdso32_mapping; unsigned long vdso32_sigtramp; unsigned long vdso32_rt_sigtramp; @@ -64,7 +64,7 @@ static void *vdso32_kbase; extern char vdso64_start, vdso64_end; static void *vdso64_kbase = &vdso64_start; static unsigned int vdso64_pages; -static struct page **vdso64_pagelist; +static struct vm_special_mapping vdso64_mapping; unsigned long vdso64_rt_sigtramp; #endif /* CONFIG_PPC64 */ @@ -143,10 +143,11 @@ struct lib64_elfinfo unsigned long text; }; -static int map_vdso(struct page **vdso_pagelist, unsigned long vdso_pages, +static int map_vdso(struct vm_special_mapping *vsm, unsigned long vdso_pages, unsigned long vdso_base) { struct mm_struct *mm = current->mm; + struct vm_area_struct *vma; int ret = 0; mm->context.vdso_base = 0; @@ -198,12 +199,14 @@ static int map_vdso(struct page **vdso_pagelist, unsigned long vdso_pages, * It's fine to use that for setting breakpoints in the vDSO code * pages though. 
*/ - ret = install_special_mapping(mm, vdso_base, vdso_pages << PAGE_SHIFT, + vma = _install_special_mapping(mm, vdso_base, vdso_pages << PAGE_SHIFT, VM_READ|VM_EXEC| VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC, -vdso_pagelist); - if (ret) +vsm); + if (IS_ERR(vma)) { + ret = PTR_ERR(vma); current->mm->context.vdso_base = 0; + } out_up_mmap_sem: up_write(&mm->mmap_sem); @@ -220,7 +223,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp) return 0; if (is_32bit_task()) - return map_vdso(vdso32_pagelist, vdso32_pages, VDSO32_MBASE); + return map_vdso(&vdso32_mapping, vdso32_pages, VDSO32_MBASE); #ifdef CONFIG_PPC64 else /* @@ -228,7 +231,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp) * allows get_unmapped_area to find an area near other mmaps * and most likely share a SLB entry. */ - return map_vdso(vdso64_pagelist, vdso64_pages, 0); + return map_vdso(&vdso64_mapping, vdso64_pages, 0); #endif WARN_ONCE(1, "task is not 32-bit on non PPC64 kernel"); return -1; diff --git a/arch/powerpc/kernel/vdso_common.c b/arch/powerpc/kernel/vdso_common.c index c97c30606b3f..047f6b8b230f 100644 --- a/arch/powerpc/kernel/vdso_common.c +++ b/arch/powerpc/kernel/vdso_common.c @@ -14,7 +14,7 @@ #define VDSO_LBASE CONCAT3(VDSO, BITS, _LBASE) #define vdso_kbase CONCAT3(vdso, BITS, _kbase) #define vdso_pages CONCAT3(vdso, BITS, _pages) -#define vdso_pagelist CONCAT3(vdso, BITS, _pagelist) +#define vdso_mapping CONCAT3(vdso, BITS, _mapping) #undef pr_fmt #define pr_fmt(fmt)"vDSO" __stringify(BITS) ": " fmt @@ -207,6 +207,7 @@ static __init int vdso_setup(struct lib_elfinfo *v) static __init void init_vdso_pagelist(void) { int i; + struct page **vdso_pagelist; /* Make sure pages are in the correct state */ vdso_pagelist = kzalloc(sizeof(struct page *) * (vdso_pages + 2), @@ -221,6 +222,9 @@ static __init void init_vdso_pagelist(void) } vdso_pagelist[i++] = virt_to_page(vdso_data); vdso_pagelist[i] = NULL; + + vdso_mapping.pages = vdso_pagelist; + 
vdso_mapping.name = "[vdso]"; } #undef find_section @@ -236,7 +240,7 @@ static __init void init_vdso_pagelist(void) #undef VDSO_LBASE #undef vdso_kbase #undef vdso_pages -#undef vdso_pagelist +#undef vdso_mapping #undef lib_elfinfo #undef BITS #undef _CONCAT3 -- 2.10.0
[PATCH 5/7] powerpc/vdso: split map_vdso from arch_setup_additional_pages
It'll be easier to introduce the vm_special_mapping struct in a smaller map_vdso() function (see the next patches). Impact: cleanup Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Michael Ellerman Cc: Andy Lutomirski Cc: Oleg Nesterov Cc: linuxppc-dev@lists.ozlabs.org Cc: linux...@kvack.org Signed-off-by: Dmitry Safonov --- arch/powerpc/kernel/vdso.c | 67 +- 1 file changed, 31 insertions(+), 36 deletions(-) diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c index 25d03d773c49..e68601ffc9ad 100644 --- a/arch/powerpc/kernel/vdso.c +++ b/arch/powerpc/kernel/vdso.c @@ -143,52 +143,23 @@ struct lib64_elfinfo unsigned long text; }; - -/* - * This is called from binfmt_elf, we create the special vma for the - * vDSO and insert it into the mm struct tree - */ -int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp) +static int map_vdso(struct page **vdso_pagelist, unsigned long vdso_pages, + unsigned long vdso_base) { struct mm_struct *mm = current->mm; - struct page **vdso_pagelist; - unsigned long vdso_pages; - unsigned long vdso_base; int ret = 0; - if (!vdso_ready) - return 0; - -#ifdef CONFIG_PPC64 - if (is_32bit_task()) { - vdso_pagelist = vdso32_pagelist; - vdso_pages = vdso32_pages; - vdso_base = VDSO32_MBASE; - } else { - vdso_pagelist = vdso64_pagelist; - vdso_pages = vdso64_pages; - /* -* On 64bit we don't have a preferred map address. This -* allows get_unmapped_area to find an area near other mmaps -* and most likely share a SLB entry. 
-*/ - vdso_base = 0; - } -#else - vdso_pagelist = vdso32_pagelist; - vdso_pages = vdso32_pages; - vdso_base = VDSO32_MBASE; -#endif - - current->mm->context.vdso_base = 0; + mm->context.vdso_base = 0; - /* vDSO has a problem and was disabled, just don't "enable" it for the + /* +* vDSO has a problem and was disabled, just don't "enable" it for the * process */ if (vdso_pages == 0) return 0; + /* Add a page to the vdso size for the data page */ - vdso_pages ++; + vdso_pages++; /* * pick a base address for the vDSO in process space. We try to put it @@ -239,6 +210,30 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp) return ret; } +/* + * This is called from binfmt_elf, we create the special vma for the + * vDSO and insert it into the mm struct tree + */ +int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp) +{ + if (!vdso_ready) + return 0; + + if (is_32bit_task()) + return map_vdso(vdso32_pagelist, vdso32_pages, VDSO32_MBASE); +#ifdef CONFIG_PPC64 + else + /* +* On 64bit we don't have a preferred map address. This +* allows get_unmapped_area to find an area near other mmaps +* and most likely share a SLB entry. +*/ + return map_vdso(vdso64_pagelist, vdso64_pages, 0); +#endif + WARN_ONCE(1, "task is not 32-bit on non PPC64 kernel"); + return -1; +} + const char *arch_vma_name(struct vm_area_struct *vma) { if (vma->vm_mm && vma->vm_start == vma->vm_mm->context.vdso_base) -- 2.10.0
[PATCH 2/7] powerpc/vdso: remove unused params in vdso_do_func_patch{32,64}
Impact: cleanup Cc: Benjamin HerrenschmidtCc: Paul Mackerras Cc: Michael Ellerman Cc: Andy Lutomirski Cc: Oleg Nesterov Cc: linuxppc-dev@lists.ozlabs.org Cc: linux...@kvack.org Signed-off-by: Dmitry Safonov --- arch/powerpc/kernel/vdso.c | 11 +++ 1 file changed, 3 insertions(+), 8 deletions(-) diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c index 4ffb82a2d9e9..278b9aa25a1c 100644 --- a/arch/powerpc/kernel/vdso.c +++ b/arch/powerpc/kernel/vdso.c @@ -309,7 +309,6 @@ static unsigned long __init find_function32(struct lib32_elfinfo *lib, } static int __init vdso_do_func_patch32(struct lib32_elfinfo *v32, - struct lib64_elfinfo *v64, const char *orig, const char *fix) { Elf32_Sym *sym32_gen, *sym32_fix; @@ -344,7 +343,6 @@ static unsigned long __init find_function32(struct lib32_elfinfo *lib, } static int __init vdso_do_func_patch32(struct lib32_elfinfo *v32, - struct lib64_elfinfo *v64, const char *orig, const char *fix) { return 0; @@ -419,8 +417,7 @@ static unsigned long __init find_function64(struct lib64_elfinfo *lib, #endif } -static int __init vdso_do_func_patch64(struct lib32_elfinfo *v32, - struct lib64_elfinfo *v64, +static int __init vdso_do_func_patch64(struct lib64_elfinfo *v64, const char *orig, const char *fix) { Elf64_Sym *sym64_gen, *sym64_fix; @@ -619,11 +616,9 @@ static __init int vdso_fixup_alt_funcs(struct lib32_elfinfo *v32, * It would be easy to do, but doesn't seem to be necessary, * patching the OPD symbol is enough. */ - vdso_do_func_patch32(v32, v64, patch->gen_name, -patch->fix_name); + vdso_do_func_patch32(v32, patch->gen_name, patch->fix_name); #ifdef CONFIG_PPC64 - vdso_do_func_patch64(v32, v64, patch->gen_name, -patch->fix_name); + vdso_do_func_patch64(v64, patch->gen_name, patch->fix_name); #endif /* CONFIG_PPC64 */ } -- 2.10.0
[PATCH 1/7] powerpc/vdso: unify return paths in setup_additional_pages
Impact: cleanup Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Michael Ellerman Cc: Andy Lutomirski Cc: Oleg Nesterov Cc: linuxppc-dev@lists.ozlabs.org Cc: linux...@kvack.org Signed-off-by: Dmitry Safonov --- arch/powerpc/kernel/vdso.c | 19 +++ 1 file changed, 7 insertions(+), 12 deletions(-) diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c index 4111d30badfa..4ffb82a2d9e9 100644 --- a/arch/powerpc/kernel/vdso.c +++ b/arch/powerpc/kernel/vdso.c @@ -154,7 +154,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp) struct page **vdso_pagelist; unsigned long vdso_pages; unsigned long vdso_base; - int rc; + int ret = 0; if (!vdso_ready) return 0; @@ -203,8 +203,8 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp) ((VDSO_ALIGNMENT - 1) & PAGE_MASK), 0, 0); if (IS_ERR_VALUE(vdso_base)) { - rc = vdso_base; - goto fail_mmapsem; + ret = vdso_base; + goto out_up_mmap_sem; } /* Add required alignment. */ @@ -227,21 +227,16 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp) * It's fine to use that for setting breakpoints in the vDSO code * pages though. */ - rc = install_special_mapping(mm, vdso_base, vdso_pages << PAGE_SHIFT, + ret = install_special_mapping(mm, vdso_base, vdso_pages << PAGE_SHIFT, VM_READ|VM_EXEC| VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC, vdso_pagelist); - if (rc) { + if (ret) current->mm->context.vdso_base = 0; - goto fail_mmapsem; - } - - up_write(&mm->mmap_sem); - return 0; - - fail_mmapsem: +out_up_mmap_sem: up_write(&mm->mmap_sem); - return rc; + return ret; } const char *arch_vma_name(struct vm_area_struct *vma) -- 2.10.0
[PATCH 0/7] powerpc/mm: refactor vDSO mapping code
Cleanup patches for vDSO on powerpc. Originally, I wanted to add vDSO remapping on arm/aarch64 and I decided to clean up that part on powerpc. I've added a hook for vm_ops for the vDSO, just like I did for x86. The other changes reduce excessive code duplication. No userspace-visible changes are expected. Tested on qemu with buildroot rootfs. Dmitry Safonov (7): powerpc/vdso: unify return paths in setup_additional_pages powerpc/vdso: remove unused params in vdso_do_func_patch{32,64} powerpc/vdso: separate common code in vdso_common powerpc/vdso: introduce init_vdso{32,64}_pagelist powerpc/vdso: split map_vdso from arch_setup_additional_pages powerpc/vdso: switch from legacy_special_mapping_vmops mm: kill arch_mremap arch/alpha/include/asm/Kbuild| 1 - arch/arc/include/asm/Kbuild | 1 - arch/arm/include/asm/Kbuild | 1 - arch/arm64/include/asm/Kbuild| 1 - arch/avr32/include/asm/Kbuild| 1 - arch/blackfin/include/asm/Kbuild | 1 - arch/c6x/include/asm/Kbuild | 1 - arch/cris/include/asm/Kbuild | 1 - arch/frv/include/asm/Kbuild | 1 - arch/h8300/include/asm/Kbuild| 1 - arch/hexagon/include/asm/Kbuild | 1 - arch/ia64/include/asm/Kbuild | 1 - arch/m32r/include/asm/Kbuild | 1 - arch/m68k/include/asm/Kbuild | 1 - arch/metag/include/asm/Kbuild| 1 - arch/microblaze/include/asm/Kbuild | 1 - arch/mips/include/asm/Kbuild | 1 - arch/mn10300/include/asm/Kbuild | 1 - arch/nios2/include/asm/Kbuild| 1 - arch/openrisc/include/asm/Kbuild | 1 - arch/parisc/include/asm/Kbuild | 1 - arch/powerpc/include/asm/mm-arch-hooks.h | 28 -- arch/powerpc/kernel/vdso.c | 492 +-- arch/powerpc/kernel/vdso_common.c| 248 arch/s390/include/asm/Kbuild | 1 - arch/score/include/asm/Kbuild| 1 - arch/sh/include/asm/Kbuild | 1 - arch/sparc/include/asm/Kbuild| 1 - arch/tile/include/asm/Kbuild | 1 - arch/um/include/asm/Kbuild | 1 - arch/unicore32/include/asm/Kbuild| 1 - arch/x86/include/asm/Kbuild | 1 - arch/xtensa/include/asm/Kbuild | 1 - include/asm-generic/mm-arch-hooks.h | 16 - include/linux/mm-arch-hooks.h| 25 -- 
mm/mremap.c | 4 - 36 files changed, 323 insertions(+), 520 deletions(-) delete mode 100644 arch/powerpc/include/asm/mm-arch-hooks.h create mode 100644 arch/powerpc/kernel/vdso_common.c delete mode 100644 include/asm-generic/mm-arch-hooks.h delete mode 100644 include/linux/mm-arch-hooks.h -- 2.10.0
Re: [Patch v5 04/12] irqchip: xilinx: Add support for parent intc
On 25/10/16 15:44, Sören Brinkmann wrote: > On Tue, 2016-10-25 at 12:49:33 +0200, Thomas Gleixner wrote: >> On Tue, 25 Oct 2016, Zubair Lutfullah Kakakhel wrote: >>> On 10/21/2016 10:48 AM, Marc Zyngier wrote: Shouldn't you return an error if irq is zero? >>> >>> I'll add the following for the error case >>> >>> pr_err("%s: Parent exists but interrupts property not defined\n" , >>> __func__); >> >> Please do not use this silly __func__ stuff. It's not giving any value to >> the printout. >> >> Set a proper prefix for your pr_* stuff, so the string is prefixed with >> 'irq-xilinx:' or whatever you think is appropriate. Then the string itself >> is good enough to find from which place this printk comes. > > Haven't looked at the real code, but is there probably a way to get a > struct device pointer and use dev_err? You wish. Interrupt controllers (and timers) are brought up way before the device model is available, hence no struct device. I've started untangling that mess a couple of times, and always ran out of available time (you start pulling the VFS, then the scheduler, the creation of the first thread, and then things lock up because you need to context switch and no timer is ready yet). I may try to spend some time on it again while travelling to LPC... M. -- Jazz is not dead. It just smells funny...
Re: [Patch v5 04/12] irqchip: xilinx: Add support for parent intc
On Tue, 2016-10-25 at 12:49:33 +0200, Thomas Gleixner wrote: > On Tue, 25 Oct 2016, Zubair Lutfullah Kakakhel wrote: > > On 10/21/2016 10:48 AM, Marc Zyngier wrote: > > > Shouldn't you return an error if irq is zero? > > > > > > > I'll add the following for the error case > > > > pr_err("%s: Parent exists but interrupts property not defined\n" , > > __func__); > > Please do not use this silly __func__ stuff. It's not giving any value to > the printout. > > Set a proper prefix for your pr_* stuff, so the string is prefixed with > 'irq-xilinx:' or whatever you think is appropriate. Then the string itself > is good enough to find from which place this printk comes. Haven't looked at the real code, but is there probably a way to get a struct device pointer and use dev_err? Sören
Re: [PATCH] powerpc/pseries: fix spelling mistake: "Attemping" -> "Attempting"
On 10/24/2016 05:02 PM, Colin King wrote: > From: Colin Ian King> > trivial fix to spelling mistake in pr_debug message > > Signed-off-by: Colin Ian King Reviewed-by: Nathan Fontenot > --- > arch/powerpc/platforms/pseries/hotplug-cpu.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c > b/arch/powerpc/platforms/pseries/hotplug-cpu.c > index a1b63e0..c8929cb 100644 > --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c > +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c > @@ -553,7 +553,7 @@ static ssize_t dlpar_cpu_remove(struct device_node *dn, > u32 drc_index) > { > int rc; > > - pr_debug("Attemping to remove CPU %s, drc index: %x\n", > + pr_debug("Attempting to remove CPU %s, drc index: %x\n", >dn->name, drc_index); > > rc = dlpar_offline_cpu(dn); >
Re: [PATCH v4 4/5] mm: make processing of movable_node arch-specific
On 11/10/16 23:26, Balbir Singh wrote: > > > On 07/10/16 05:36, Reza Arbab wrote: >> Currently, CONFIG_MOVABLE_NODE depends on X86_64. In preparation to >> enable it for other arches, we need to factor a detail which is unique >> to x86 out of the generic mm code. >> >> Specifically, as documented in kernel-parameters.txt, the use of >> "movable_node" should remain restricted to x86: >> >> movable_node[KNL,X86] Boot-time switch to enable the effects >> of CONFIG_MOVABLE_NODE=y. See mm/Kconfig for details. >> >> This option tells x86 to find movable nodes identified by the ACPI SRAT. >> On other arches, it would have no benefit, only the undesired side >> effect of setting bottom-up memblock allocation. >> >> Since #ifdef CONFIG_MOVABLE_NODE will no longer be enough to restrict >> this option to x86, move it to an arch-specific compilation unit >> instead. >> >> Signed-off-by: Reza Arbab> > Acked-by: Balbir Singh > After the ack, I realized there were some more checks needed, IOW questions for you :) 1. Have you checked to see if our memblock allocations spill over to probably hotpluggable nodes? 2. Shouldn't we be marking nodes discovered as movable via memblock_mark_hotplug()? Balbir Singh.
Re: [PATCH 2/2] powerpc/64: Fix race condition in setting lock bit in idle/wakeup code
Hi Paul, On Fri, Oct 21, 2016 at 08:04:17PM +1100, Paul Mackerras wrote: > This fixes a race condition where one thread that is entering or > leaving a power-saving state can inadvertently ignore the lock bit > that was set by another thread, and potentially also clear it. > The core_idle_lock_held function is called when the lock bit is > seen to be set. It polls the lock bit until it is clear, then > does a lwarx to load the word containing the lock bit and thread > idle bits so it can be updated. However, it is possible that the > value loaded with the lwarx has the lock bit set, even though an > immediately preceding lwz loaded a value with the lock bit clear. > If this happens then we go ahead and update the word despite the > lock bit being set, and when called from pnv_enter_arch207_idle_mode, > we will subsequently clear the lock bit. > > No identifiable misbehaviour has been attributed to this race. > > This fixes it by checking the lock bit in the value loaded by the > lwarx. If it is set then we just go back and keep on polling. > > Fixes: b32aadc1a8ed This fixes the code which has been around since 4.2 kernel. Should this be marked to stable as well ? > Signed-off-by: Paul Mackerras> --- > arch/powerpc/kernel/idle_book3s.S | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/arch/powerpc/kernel/idle_book3s.S > b/arch/powerpc/kernel/idle_book3s.S > index 0d8712a..72dac0b 100644 > --- a/arch/powerpc/kernel/idle_book3s.S > +++ b/arch/powerpc/kernel/idle_book3s.S > @@ -90,6 +90,7 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_ARCH_300) > * Threads will spin in HMT_LOW until the lock bit is cleared. > * r14 - pointer to core_idle_state > * r15 - used to load contents of core_idle_state > + * r9 - used as a temporary variable > */ > > core_idle_lock_held: > @@ -99,6 +100,8 @@ core_idle_lock_held: > bne 3b > HMT_MEDIUM > lwarx r15,0,r14 > + andi. r9,r15,PNV_CORE_IDLE_LOCK_BIT > + bne core_idle_lock_held > blr > > /* > -- > 2.7.4 > -- Thanks and Regards gautham.
Re: [Patch v5 04/12] irqchip: xilinx: Add support for parent intc
On Tue, 25 Oct 2016, Zubair Lutfullah Kakakhel wrote: > On 10/21/2016 10:48 AM, Marc Zyngier wrote: > > Shouldn't you return an error if irq is zero? > > > > I'll add the following for the error case > > pr_err("%s: Parent exists but interrupts property not defined\n" , > __func__); Please do not use this silly __func__ stuff. It's not giving any value to the printout. Set a proper prefix for your pr_* stuff, so the string is prefixed with 'irq-xilinx:' or whatever you think is appropriate. Then the string itself is good enough to find from which place this printk comes. Thanks, tglx
Re: [PATCH net-next] ibmveth: calculate correct gso_size and set gso_type
On Tue, Oct 25, 2016 at 04:13:41PM +1100, Jon Maxwell wrote: > We recently encountered a bug where a few customers using ibmveth on the > same LPAR hit an issue where a TCP session hung when large receive was > enabled. Closer analysis revealed that the session was stuck because the > one side was advertising a zero window repeatedly. > > We narrowed this down to the fact the ibmveth driver did not set gso_size > which is translated by TCP into the MSS later up the stack. The MSS is > used to calculate the TCP window size and as that was abnormally large, > it was calculating a zero window, even although the sockets receive buffer > was completely empty. > > We were able to reproduce this and worked with IBM to fix this. Thanks Tom > and Marcelo for all your help and review on this. > > The patch fixes both our internal reproduction tests and our customers tests. > > Signed-off-by: Jon Maxwell> --- > drivers/net/ethernet/ibm/ibmveth.c | 19 +++ > 1 file changed, 19 insertions(+) > > diff --git a/drivers/net/ethernet/ibm/ibmveth.c > b/drivers/net/ethernet/ibm/ibmveth.c > index 29c05d0..3028c33 100644 > --- a/drivers/net/ethernet/ibm/ibmveth.c > +++ b/drivers/net/ethernet/ibm/ibmveth.c > @@ -1182,6 +1182,8 @@ static int ibmveth_poll(struct napi_struct *napi, int > budget) > int frames_processed = 0; > unsigned long lpar_rc; > struct iphdr *iph; > + bool large_packet = 0; > + u16 hdr_len = ETH_HLEN + sizeof(struct tcphdr); Compiler may optmize this, but maybe move hdr_len to [*] ? 
> > restart_poll: > while (frames_processed < budget) { > @@ -1236,10 +1238,27 @@ static int ibmveth_poll(struct napi_struct *napi, int > budget) > iph->check = 0; > iph->check = > ip_fast_csum((unsigned char *)iph, iph->ihl); > adapter->rx_large_packets++; > + large_packet = 1; > } > } > } > > + if (skb->len > netdev->mtu) { [*] > + iph = (struct iphdr *)skb->data; > + if (be16_to_cpu(skb->protocol) == ETH_P_IP && > iph->protocol == IPPROTO_TCP) { The if line above is too long, should be broken in two. > + hdr_len += sizeof(struct iphdr); > + skb_shinfo(skb)->gso_type = > SKB_GSO_TCPV4; > + skb_shinfo(skb)->gso_size = netdev->mtu > - hdr_len; > + } else if (be16_to_cpu(skb->protocol) == > ETH_P_IPV6 && > + iph->protocol == IPPROTO_TCP) { ^ And this one should start 3 spaces later, right below be16_ Marcelo > + hdr_len += sizeof(struct ipv6hdr); > + skb_shinfo(skb)->gso_type = > SKB_GSO_TCPV6; > + skb_shinfo(skb)->gso_size = netdev->mtu > - hdr_len; > + } > + if (!large_packet) > + adapter->rx_large_packets++; > + } > + > napi_gro_receive(napi, skb);/* send it up */ > > netdev->stats.rx_packets++; > -- > 1.8.3.1 >
Re: [PATCH 1/2] powerpc/64: Re-fix race condition between going idle and entering guest
Hi Paul, [Added Shreyas's current e-mail address ] On Fri, Oct 21, 2016 at 08:03:05PM +1100, Paul Mackerras wrote: > Commit 8117ac6a6c2f ("powerpc/powernv: Switch off MMU before entering > nap/sleep/rvwinkle mode", 2014-12-10) fixed a race condition where one > thread entering a KVM guest could switch the MMU context to the guest > while another thread was still in host kernel context with the MMU on. > That commit moved the point where a thread entering a power-saving > mode set its kvm_hstate.hwthread_state field in its PACA to > KVM_HWTHREAD_IN_IDLE from a point where the MMU was on to after the > MMU had been switched off. That commit also added a comment > explaining that we have to switch to real mode before setting > hwthread_state to avoid this race. > > Nevertheless, commit 4eae2c9ae54a ("powerpc/powernv: Make > pnv_powersave_common more generic", 2016-07-08) subsequently moved > the setting of hwthread_state back to a point where the MMU is on, > thus reintroducing the race, despite the comment saying that this > should not be done being included in full in the context lines of > the patch that did it. > Sorry about missing that part. I am at fault, since I reviewed 4eae2c9ae54a patch. Will keep this in mind in the future. > This fixes the race again and adds a bigger and shoutier comment > explaining the potential race condition. 
>
> Cc: sta...@vger.kernel.org # v4.8
> Fixes: 4eae2c9ae54a
> Signed-off-by: Paul Mackerras
> ---
>  arch/powerpc/kernel/idle_book3s.S | 32 ++--
>  1 file changed, 26 insertions(+), 6 deletions(-)
>
> diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S
> index bd739fe..0d8712a 100644
> --- a/arch/powerpc/kernel/idle_book3s.S
> +++ b/arch/powerpc/kernel/idle_book3s.S
> @@ -163,12 +163,6 @@ _GLOBAL(pnv_powersave_common)
>  	std	r9,_MSR(r1)
>  	std	r1,PACAR1(r13)
>
> -#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
> -	/* Tell KVM we're entering idle */
> -	li	r4,KVM_HWTHREAD_IN_IDLE
> -	stb	r4,HSTATE_HWTHREAD_STATE(r13)
> -#endif
> -
>  	/*
>  	 * Go to real mode to do the nap, as required by the architecture.
>  	 * Also, we need to be in real mode before setting hwthread_state,
> @@ -185,6 +179,26 @@ _GLOBAL(pnv_powersave_common)
>
>  .globl pnv_enter_arch207_idle_mode
>  pnv_enter_arch207_idle_mode:
> +#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
> +	/* Tell KVM we're entering idle */
> +	li	r4,KVM_HWTHREAD_IN_IDLE
> +	/**/
> +	/* N O T E   W E L L ! ! !    N O T E   W E L L       */
> +	/* The following store to HSTATE_HWTHREAD_STATE(r13)  */
> +	/* MUST occur in real mode, i.e. with the MMU off,    */
> +	/* and the MMU must stay off until we clear this flag */
> +	/* and test HSTATE_HWTHREAD_REQ(r13) in the system    */
> +	/* reset interrupt vector in exceptions-64s.S.        */
> +	/* The reason is that another thread can switch the   */
> +	/* MMU to a guest context whenever this flag is set   */
> +	/* to KVM_HWTHREAD_IN_IDLE, and if the MMU was on,    */
> +	/* that would potentially cause this thread to start  */
> +	/* executing instructions from guest memory in        */
> +	/* hypervisor mode, leading to a host crash or data   */
> +	/* corruption, or worse.                              */
> +	/**/
> +	stb	r4,HSTATE_HWTHREAD_STATE(r13)
> +#endif
>  	stb	r3,PACA_THREAD_IDLE_STATE(r13)
>  	cmpwi	cr3,r3,PNV_THREAD_SLEEP
>  	bge	cr3,2f
> @@ -250,6 +264,12 @@ enter_winkle:
>   * r3 - requested stop state
>   */
>  power_enter_stop:
> +#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
> +	/* Tell KVM we're entering idle */
> +	li	r4,KVM_HWTHREAD_IN_IDLE
> +	/* DO THIS IN REAL MODE!  See comment above. */
> +	stb	r4,HSTATE_HWTHREAD_STATE(r13)
> +#endif
>  	/*
>  	 * Check if the requested state is a deep idle state.
>  	 */
> --
> 2.7.4
>
Re: [Patch v5 04/12] irqchip: xilinx: Add support for parent intc
Hi,

Thanks for the review. Some comments in-line.

On 10/21/2016 10:48 AM, Marc Zyngier wrote:
> On 17/10/16 17:52, Zubair Lutfullah Kakakhel wrote:
>> The MIPS based xilfpga platform has the following IRQ structure
>>
>> Peripherals --> xilinx_intcontroller -> mips_cpu_int controller
>>
>> Add support for the driver to chain the irq handler
>>
>> Signed-off-by: Zubair Lutfullah Kakakhel
>> ---
>> V4 -> V5
>> Rebased to v4.9-rc1
>> Missing curly braces
>>
>> V3 -> V4
>> Clean up if/else when a parent is found
>> Pass irqchip structure to handler as data
>>
>> V2 -> V3
>> Reused existing parent node instead of finding again.
>> Cleaned up handler based on review
>>
>> V1 -> V2
>> No change
>> ---
>>  drivers/irqchip/irq-xilinx-intc.c | 26 --
>>  1 file changed, 24 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/irqchip/irq-xilinx-intc.c b/drivers/irqchip/irq-xilinx-intc.c
>> index 45e5154..dbf8b0c 100644
>> --- a/drivers/irqchip/irq-xilinx-intc.c
>> +++ b/drivers/irqchip/irq-xilinx-intc.c
>> @@ -15,6 +15,7 @@
>>  #include
>>  #include
>>  #include
>> +#include
>>
>>  /* No one else should require these constants, so define them locally here. */
>>  #define ISR 0x00 /* Interrupt Status Register */
>> @@ -154,11 +155,23 @@ static int xintc_map(struct irq_domain *d, unsigned int irq, irq_hw_number_t hw)
>>  	.map = xintc_map,
>>  };
>>
>> +static void xil_intc_irq_handler(struct irq_desc *desc)
>> +{
>> +	u32 pending;
>> +
>> +	do {
>> +		pending = xintc_get_irq();
>> +		if (pending == -1U)
>> +			break;
>> +		generic_handle_irq(pending);
>> +	} while (true);
>
> This is missing the chained_irq_enter()/exit() calls, which will lead
> to races or lockups on the root irqchip.

I'll fix it up in the next series.

>> +}
>> +
>>  static int __init xilinx_intc_of_init(struct device_node *intc,
>>  				      struct device_node *parent)
>>  {
>>  	u32 nr_irq;
>> -	int ret;
>> +	int ret, irq;
>>  	struct xintc_irq_chip *irqc;
>>
>>  	if (xintc_irqc) {
>> @@ -221,7 +234,16 @@ static int __init xilinx_intc_of_init(struct device_node *intc,
>>  		goto err_alloc;
>>  	}
>>
>> -	irq_set_default_host(root_domain);
>> +	if (parent) {
>> +		irq = irq_of_parse_and_map(intc, 0);
>> +		if (irq)
>> +			irq_set_chained_handler_and_data(irq,
>> +							 xil_intc_irq_handler,
>> +							 irqc);
>> +
>
> Shouldn't you return an error if irq is zero?

I'll add the following for the error case:

	pr_err("%s: Parent exists but interrupts property not defined\n",
	       __func__);
	goto err_alloc;

Thanks,
ZubairLK

>> +	} else {
>> +		irq_set_default_host(root_domain);
>> +	}
>>
>>  	return 0;
>
> Thanks,
>
> 	M.
Re: [PATCH v4 3/5] powerpc/mm: allow memory hotplug into a memoryless node
Balbir Singh writes:
> FYI, these checks were temporary to begin with
>
> I found this in git history
>
> b226e462124522f2f23153daff31c311729dfa2f (powerpc: don't add memory to empty node/zone)

Nice, thanks for digging it up.

  commit b226e462124522f2f23153daff31c311729dfa2f
  Author:     Mike Kravetz
  AuthorDate: Fri Dec 16 14:30:35 2005 -0800

That is why maintainers don't like to merge "temporary" patches :)

cheers
[GIT PULL v2 0/5] cpu_relax: drop lowlatency, introduce yield
Peter,

here is v2 with some improved patch descriptions and some fixes. The
previous version has survived one day of linux-next and I only changed
small parts. So unless there is some other issue, feel free to pull
(or to apply the patches) to tip/locking.

The following changes since commit 07d9a380680d1c0eb51ef87ff2eab5c994949e69:

  Linux 4.9-rc2 (2016-10-23 17:10:14 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/borntraeger/linux.git tags/cpurelax

for you to fetch changes up to dcc37f9044436438360402714b7544a8e8779b07:

  processor.h: remove cpu_relax_lowlatency (2016-10-25 09:49:57 +0200)

cpu_relax: drop lowlatency, introduce yield

For spinning loops people often use barrier() or cpu_relax(). For most
architectures cpu_relax and barrier are the same, but on some
architectures cpu_relax can add some latency. For example, on power,
sparc64 and arc, cpu_relax can shift the CPU towards other hardware
threads in an SMT environment. On s390, cpu_relax does even more: it
uses a hypercall to the hypervisor to give up the timeslice. In
contrast to the SMT yielding, this can result in larger latencies. In
some places this latency is unwanted, so another variant,
"cpu_relax_lowlatency", was introduced. Before it is used in more and
more places, let's invert the logic and provide a cpu_relax_yield that
can be called in places where yielding is more important than latency.
By default this is the same as cpu_relax on all architectures.

So my proposal boils down to:
- lowest latency: use barrier() or mb() if necessary
- low latency: use cpu_relax() (e.g. might give up some CPU for the
  other _hardware_ threads)
- really give up the CPU: use cpu_relax_yield()

PS: In the long run I would also try to provide for s390 something
like cpu_relax_yield_to with a cpu number (or just add that to
cpu_relax_yield), since a yield_to is always better than a yield as
long as we know the waiter.
Christian Borntraeger (5):
      processor.h: introduce cpu_relax_yield
      stop_machine: yield CPU during stop machine
      s390: make cpu_relax a barrier again
      processor.h: Remove cpu_relax_lowlatency users
      processor.h: remove cpu_relax_lowlatency

 arch/alpha/include/asm/processor.h      | 2 +-
 arch/arc/include/asm/processor.h        | 4 ++--
 arch/arm/include/asm/processor.h        | 2 +-
 arch/arm64/include/asm/processor.h      | 2 +-
 arch/avr32/include/asm/processor.h      | 2 +-
 arch/blackfin/include/asm/processor.h   | 2 +-
 arch/c6x/include/asm/processor.h        | 2 +-
 arch/cris/include/asm/processor.h       | 2 +-
 arch/frv/include/asm/processor.h        | 2 +-
 arch/h8300/include/asm/processor.h      | 2 +-
 arch/hexagon/include/asm/processor.h    | 2 +-
 arch/ia64/include/asm/processor.h       | 2 +-
 arch/m32r/include/asm/processor.h       | 2 +-
 arch/m68k/include/asm/processor.h       | 2 +-
 arch/metag/include/asm/processor.h      | 2 +-
 arch/microblaze/include/asm/processor.h | 2 +-
 arch/mips/include/asm/processor.h       | 2 +-
 arch/mn10300/include/asm/processor.h    | 2 +-
 arch/nios2/include/asm/processor.h      | 2 +-
 arch/openrisc/include/asm/processor.h   | 2 +-
 arch/parisc/include/asm/processor.h     | 2 +-
 arch/powerpc/include/asm/processor.h    | 2 +-
 arch/s390/include/asm/processor.h       | 4 ++--
 arch/s390/kernel/processor.c            | 4 ++--
 arch/score/include/asm/processor.h      | 2 +-
 arch/sh/include/asm/processor.h         | 2 +-
 arch/sparc/include/asm/processor_32.h   | 2 +-
 arch/sparc/include/asm/processor_64.h   | 2 +-
 arch/tile/include/asm/processor.h       | 2 +-
 arch/unicore32/include/asm/processor.h  | 2 +-
 arch/x86/include/asm/processor.h        | 2 +-
 arch/x86/um/asm/processor.h             | 2 +-
 arch/xtensa/include/asm/processor.h     | 2 +-
 drivers/gpu/drm/i915/i915_gem_request.c | 2 +-
 drivers/vhost/net.c                     | 4 ++--
 kernel/locking/mcs_spinlock.h           | 4 ++--
 kernel/locking/mutex.c                  | 4 ++--
 kernel/locking/osq_lock.c               | 6 +++---
 kernel/locking/qrwlock.c                | 6 +++---
 kernel/locking/rwsem-xadd.c             | 4 ++--
 kernel/stop_machine.c                   | 2 +-
 lib/lockref.c                           | 2 +-
 42 files changed, 53 insertions(+), 53 deletions(-)
[GIT PULL v2 1/5] processor.h: introduce cpu_relax_yield
For spinning loops people often use barrier() or cpu_relax(). For most
architectures cpu_relax and barrier are the same, but on some
architectures cpu_relax can add some latency. For example, on power,
sparc64 and arc, cpu_relax can shift the CPU towards other hardware
threads in an SMT environment. On s390, cpu_relax does even more: it
uses a hypercall to the hypervisor to give up the timeslice. In
contrast to the SMT yielding, this can result in larger latencies. In
some places this latency is unwanted, so another variant,
"cpu_relax_lowlatency", was introduced. Before it is used in more and
more places, let's invert the logic and provide a cpu_relax_yield that
can be called in places where yielding is more important than latency.
By default this is the same as cpu_relax on all architectures.

Signed-off-by: Christian Borntraeger
---
 arch/alpha/include/asm/processor.h      | 1 +
 arch/arc/include/asm/processor.h        | 2 ++
 arch/arm/include/asm/processor.h        | 1 +
 arch/arm64/include/asm/processor.h      | 1 +
 arch/avr32/include/asm/processor.h      | 1 +
 arch/blackfin/include/asm/processor.h   | 1 +
 arch/c6x/include/asm/processor.h        | 1 +
 arch/cris/include/asm/processor.h       | 1 +
 arch/frv/include/asm/processor.h        | 1 +
 arch/h8300/include/asm/processor.h      | 1 +
 arch/hexagon/include/asm/processor.h    | 1 +
 arch/ia64/include/asm/processor.h       | 1 +
 arch/m32r/include/asm/processor.h       | 1 +
 arch/m68k/include/asm/processor.h       | 1 +
 arch/metag/include/asm/processor.h      | 1 +
 arch/microblaze/include/asm/processor.h | 1 +
 arch/mips/include/asm/processor.h       | 1 +
 arch/mn10300/include/asm/processor.h    | 1 +
 arch/nios2/include/asm/processor.h      | 1 +
 arch/openrisc/include/asm/processor.h   | 1 +
 arch/parisc/include/asm/processor.h     | 1 +
 arch/powerpc/include/asm/processor.h    | 1 +
 arch/s390/include/asm/processor.h       | 3 ++-
 arch/s390/kernel/processor.c            | 4 ++--
 arch/score/include/asm/processor.h      | 1 +
 arch/sh/include/asm/processor.h         | 1 +
 arch/sparc/include/asm/processor_32.h   | 1 +
 arch/sparc/include/asm/processor_64.h   | 1 +
 arch/tile/include/asm/processor.h       | 1 +
 arch/unicore32/include/asm/processor.h  | 1 +
 arch/x86/include/asm/processor.h        | 1 +
 arch/x86/um/asm/processor.h             | 1 +
 arch/xtensa/include/asm/processor.h     | 1 +
 33 files changed, 36 insertions(+), 3 deletions(-)

diff --git a/arch/alpha/include/asm/processor.h b/arch/alpha/include/asm/processor.h
index 43a7559..0556fda 100644
--- a/arch/alpha/include/asm/processor.h
+++ b/arch/alpha/include/asm/processor.h
@@ -58,6 +58,7 @@ unsigned long get_wchan(struct task_struct *p);
   ((tsk) == current ? rdusp() : task_thread_info(tsk)->pcb.usp)
 
 #define cpu_relax()	barrier()
+#define cpu_relax_yield() cpu_relax()
 #define cpu_relax_lowlatency() cpu_relax()
 
 #define ARCH_HAS_PREFETCH

diff --git a/arch/arc/include/asm/processor.h b/arch/arc/include/asm/processor.h
index 16b630f..6c158d5 100644
--- a/arch/arc/include/asm/processor.h
+++ b/arch/arc/include/asm/processor.h
@@ -60,6 +60,7 @@ struct task_struct;
 #ifndef CONFIG_EZNPS_MTM_EXT
 
 #define cpu_relax()	barrier()
+#define cpu_relax_yield() cpu_relax()
 #define cpu_relax_lowlatency() cpu_relax()
 
 #else
@@ -67,6 +68,7 @@ struct task_struct;
 #define cpu_relax() \
 	__asm__ __volatile__ (".word %0" : : "i"(CTOP_INST_SCHD_RW) : "memory")
 
+#define cpu_relax_yield() cpu_relax()
 #define cpu_relax_lowlatency() barrier()
 
 #endif

diff --git a/arch/arm/include/asm/processor.h b/arch/arm/include/asm/processor.h
index 8a1e8e9..db660e0 100644
--- a/arch/arm/include/asm/processor.h
+++ b/arch/arm/include/asm/processor.h
@@ -82,6 +82,7 @@ unsigned long get_wchan(struct task_struct *p);
 #define cpu_relax()	barrier()
 #endif
 
+#define cpu_relax_yield()	cpu_relax()
 #define cpu_relax_lowlatency()	cpu_relax()
 
 #define task_pt_regs(p) \

diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 60e3482..3f9b0e5 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -149,6 +149,7 @@ static inline void cpu_relax(void)
 	asm volatile("yield" ::: "memory");
 }
 
+#define cpu_relax_yield()	cpu_relax()
 #define cpu_relax_lowlatency()	cpu_relax()
 
 /* Thread switching */

diff --git a/arch/avr32/include/asm/processor.h b/arch/avr32/include/asm/processor.h
index 941593c..e412e8b 100644
--- a/arch/avr32/include/asm/processor.h
+++ b/arch/avr32/include/asm/processor.h
@@ -92,6 +92,7 @@ extern struct avr32_cpuinfo boot_cpu_data;
 #define TASK_UNMAPPED_BASE	(PAGE_ALIGN(TASK_SIZE / 3))
 
 #define cpu_relax()	barrier()
+#define cpu_relax_yield() cpu_relax()
 #define cpu_relax_lowlatency()	cpu_relax()
 
 #define cpu_sync_pipeline()
[GIT PULL v2 5/5] processor.h: remove cpu_relax_lowlatency
As there are no users left, we can remove cpu_relax_lowlatency.

Signed-off-by: Christian Borntraeger
---
 arch/alpha/include/asm/processor.h      | 1 -
 arch/arc/include/asm/processor.h        | 2 --
 arch/arm/include/asm/processor.h        | 1 -
 arch/arm64/include/asm/processor.h      | 1 -
 arch/avr32/include/asm/processor.h      | 1 -
 arch/blackfin/include/asm/processor.h   | 1 -
 arch/c6x/include/asm/processor.h        | 1 -
 arch/cris/include/asm/processor.h       | 1 -
 arch/frv/include/asm/processor.h        | 1 -
 arch/h8300/include/asm/processor.h      | 1 -
 arch/hexagon/include/asm/processor.h    | 1 -
 arch/ia64/include/asm/processor.h       | 1 -
 arch/m32r/include/asm/processor.h       | 1 -
 arch/m68k/include/asm/processor.h       | 1 -
 arch/metag/include/asm/processor.h      | 1 -
 arch/microblaze/include/asm/processor.h | 1 -
 arch/mips/include/asm/processor.h       | 1 -
 arch/mn10300/include/asm/processor.h    | 1 -
 arch/nios2/include/asm/processor.h      | 1 -
 arch/openrisc/include/asm/processor.h   | 1 -
 arch/parisc/include/asm/processor.h     | 1 -
 arch/powerpc/include/asm/processor.h    | 1 -
 arch/s390/include/asm/processor.h       | 1 -
 arch/score/include/asm/processor.h      | 1 -
 arch/sh/include/asm/processor.h         | 1 -
 arch/sparc/include/asm/processor_32.h   | 1 -
 arch/sparc/include/asm/processor_64.h   | 1 -
 arch/tile/include/asm/processor.h       | 1 -
 arch/unicore32/include/asm/processor.h  | 1 -
 arch/x86/include/asm/processor.h        | 1 -
 arch/x86/um/asm/processor.h             | 1 -
 arch/xtensa/include/asm/processor.h     | 1 -
 32 files changed, 33 deletions(-)

diff --git a/arch/alpha/include/asm/processor.h b/arch/alpha/include/asm/processor.h
index 0556fda..31e8dbe 100644
--- a/arch/alpha/include/asm/processor.h
+++ b/arch/alpha/include/asm/processor.h
@@ -59,7 +59,6 @@ unsigned long get_wchan(struct task_struct *p);
 
 #define cpu_relax()	barrier()
 #define cpu_relax_yield() cpu_relax()
-#define cpu_relax_lowlatency() cpu_relax()
 
 #define ARCH_HAS_PREFETCH
 #define ARCH_HAS_PREFETCHW

diff --git a/arch/arc/include/asm/processor.h b/arch/arc/include/asm/processor.h
index 6c158d5..d102a49 100644
--- a/arch/arc/include/asm/processor.h
+++ b/arch/arc/include/asm/processor.h
@@ -61,7 +61,6 @@ struct task_struct;
 
 #define cpu_relax()	barrier()
 #define cpu_relax_yield() cpu_relax()
-#define cpu_relax_lowlatency() cpu_relax()
 
 #else
 
@@ -69,7 +68,6 @@ struct task_struct;
 	__asm__ __volatile__ (".word %0" : : "i"(CTOP_INST_SCHD_RW) : "memory")
 
 #define cpu_relax_yield() cpu_relax()
-#define cpu_relax_lowlatency() barrier()
 
 #endif

diff --git a/arch/arm/include/asm/processor.h b/arch/arm/include/asm/processor.h
index db660e0..9e71c58b 100644
--- a/arch/arm/include/asm/processor.h
+++ b/arch/arm/include/asm/processor.h
@@ -83,7 +83,6 @@ unsigned long get_wchan(struct task_struct *p);
 #endif
 
 #define cpu_relax_yield()	cpu_relax()
-#define cpu_relax_lowlatency()	cpu_relax()
 
 #define task_pt_regs(p) \
 	((struct pt_regs *)(THREAD_START_SP + task_stack_page(p)) - 1)

diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 3f9b0e5..6132f64 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -150,7 +150,6 @@ static inline void cpu_relax(void)
 }
 
 #define cpu_relax_yield()	cpu_relax()
-#define cpu_relax_lowlatency()	cpu_relax()
 
 /* Thread switching */
 extern struct task_struct *cpu_switch_to(struct task_struct *prev,

diff --git a/arch/avr32/include/asm/processor.h b/arch/avr32/include/asm/processor.h
index e412e8b..ee62365 100644
--- a/arch/avr32/include/asm/processor.h
+++ b/arch/avr32/include/asm/processor.h
@@ -93,7 +93,6 @@ extern struct avr32_cpuinfo boot_cpu_data;
 
 #define cpu_relax()	barrier()
 #define cpu_relax_yield() cpu_relax()
-#define cpu_relax_lowlatency()	cpu_relax()
 
 #define cpu_sync_pipeline()	asm volatile("sub pc, -2" : : : "memory")
 
 struct cpu_context {

diff --git a/arch/blackfin/include/asm/processor.h b/arch/blackfin/include/asm/processor.h
index 8b8704a..57acfb1 100644
--- a/arch/blackfin/include/asm/processor.h
+++ b/arch/blackfin/include/asm/processor.h
@@ -93,7 +93,6 @@ unsigned long get_wchan(struct task_struct *p);
 
 #define cpu_relax()	smp_mb()
 #define cpu_relax_yield() cpu_relax()
-#define cpu_relax_lowlatency() cpu_relax()
 
 /* Get the Silicon Revision of the chip */
 static inline uint32_t __pure bfin_revid(void)

diff --git a/arch/c6x/include/asm/processor.h b/arch/c6x/include/asm/processor.h
index 914d730..1fd22e7 100644
--- a/arch/c6x/include/asm/processor.h
+++ b/arch/c6x/include/asm/processor.h
@@ -122,7 +122,6 @@ extern unsigned long get_wchan(struct task_struct *p);
 
 #define cpu_relax()	do { } while (0)
 #define cpu_relax_yield()
[GIT PULL v2 3/5] s390: make cpu_relax a barrier again
stop_machine seemed to be the only important place for yielding during
cpu_relax. This was fixed by using cpu_relax_yield. Therefore, we can
now redefine cpu_relax to be a barrier instead on s390, making s390
identical to all other architectures.

Signed-off-by: Christian Borntraeger
---
 arch/s390/include/asm/processor.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/s390/include/asm/processor.h b/arch/s390/include/asm/processor.h
index d05965b..5d262cf 100644
--- a/arch/s390/include/asm/processor.h
+++ b/arch/s390/include/asm/processor.h
@@ -236,7 +236,7 @@ static inline unsigned short stap(void)
  */
 void cpu_relax_yield(void);
 
-#define cpu_relax() cpu_relax_yield()
+#define cpu_relax() barrier()
 #define cpu_relax_lowlatency()  barrier()
 
 #define ECAG_CACHE_ATTRIBUTE	0
-- 
2.5.5
[GIT PULL v2 2/5] stop_machine: yield CPU during stop machine
Some time ago, commit 57f2ffe14fd125c2 ("s390: remove diag 44 calls
from cpu_relax()") stopped cpu_relax on s390 from yielding to the
hypervisor. As it turns out, this made stop_machine run really slow on
virtualized overcommitted systems. For example, the kprobes test during
bootup took several seconds instead of just running unnoticed with
large guests.

Therefore, the yielding was reintroduced with commit 4d92f50249eb
("s390: reintroduce diag 44 calls for cpu_relax()"), but in fact the
stop_machine code seems to be the only place where this yielding was
really necessary. This place is probably the most important one, as it
makes all but one guest CPU wait for one guest CPU.

As we now have cpu_relax_yield, we can use it in multi_cpu_stop. For
now let's only add it here. We can add it to other places later when
necessary.

Signed-off-by: Christian Borntraeger
---
 kernel/stop_machine.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index ec9ab2f..1eb8266 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -194,7 +194,7 @@ static int multi_cpu_stop(void *data)
 	/* Simple state machine */
 	do {
 		/* Chill out and ensure we re-read multi_stop_state. */
-		cpu_relax();
+		cpu_relax_yield();
 		if (msdata->state != curstate) {
 			curstate = msdata->state;
 			switch (curstate) {
-- 
2.5.5
[GIT PULL v2 4/5] processor.h: Remove cpu_relax_lowlatency users
With the s390 special case of a yielding cpu_relax implementation gone,
we can now remove all users of cpu_relax_lowlatency and replace them
with cpu_relax.

Signed-off-by: Christian Borntraeger
---
 drivers/gpu/drm/i915/i915_gem_request.c | 2 +-
 drivers/vhost/net.c                     | 4 ++--
 kernel/locking/mcs_spinlock.h           | 4 ++--
 kernel/locking/mutex.c                  | 4 ++--
 kernel/locking/osq_lock.c               | 6 +++---
 kernel/locking/qrwlock.c                | 6 +++---
 kernel/locking/rwsem-xadd.c             | 4 ++--
 lib/lockref.c                           | 2 +-
 8 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
index 8832f8e..383d134 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -723,7 +723,7 @@ bool __i915_spin_request(const struct drm_i915_gem_request *req,
 		if (busywait_stop(timeout_us, cpu))
 			break;
 
-		cpu_relax_lowlatency();
+		cpu_relax();
 	} while (!need_resched());
 
 	return false;

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 5dc128a..5dc3465 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -342,7 +342,7 @@ static int vhost_net_tx_get_vq_desc(struct vhost_net *net,
 		endtime = busy_clock() + vq->busyloop_timeout;
 		while (vhost_can_busy_poll(vq->dev, endtime) &&
 		       vhost_vq_avail_empty(vq->dev, vq))
-			cpu_relax_lowlatency();
+			cpu_relax();
 		preempt_enable();
 		r = vhost_get_vq_desc(vq, vq->iov, ARRAY_SIZE(vq->iov),
 				      out_num, in_num, NULL, NULL);
@@ -533,7 +533,7 @@ static int vhost_net_rx_peek_head_len(struct vhost_net *net, struct sock *sk)
 		while (vhost_can_busy_poll(&net->dev, endtime) &&
 		       !sk_has_rx_data(sk) &&
 		       vhost_vq_avail_empty(&net->dev, vq))
-			cpu_relax_lowlatency();
+			cpu_relax();
 		preempt_enable();

diff --git a/kernel/locking/mcs_spinlock.h b/kernel/locking/mcs_spinlock.h
index c835270..6a385aa 100644
--- a/kernel/locking/mcs_spinlock.h
+++ b/kernel/locking/mcs_spinlock.h
@@ -28,7 +28,7 @@ struct mcs_spinlock {
 #define arch_mcs_spin_lock_contended(l)					\
 do {									\
 	while (!(smp_load_acquire(l)))					\
-		cpu_relax_lowlatency();					\
+		cpu_relax();						\
 } while (0)
 #endif
@@ -108,7 +108,7 @@ void mcs_spin_unlock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
 			return;
 		/* Wait until the next pointer is set */
 		while (!(next = READ_ONCE(node->next)))
-			cpu_relax_lowlatency();
+			cpu_relax();
 	}
 
 	/* Pass lock to next waiter. */

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index a70b90d..4463405 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -241,7 +241,7 @@ bool mutex_spin_on_owner(struct mutex *lock, struct task_struct *owner)
 			break;
 		}
 
-		cpu_relax_lowlatency();
+		cpu_relax();
 	}
 	rcu_read_unlock();
 
@@ -377,7 +377,7 @@ static bool mutex_optimistic_spin(struct mutex *lock,
 		 * memory barriers as we'll eventually observe the right
 		 * values at the cost of a few extra spins.
		 */
-		cpu_relax_lowlatency();
+		cpu_relax();
 	}
 
 	osq_unlock(&lock->osq);

diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
index 05a3785..4ea2710 100644
--- a/kernel/locking/osq_lock.c
+++ b/kernel/locking/osq_lock.c
@@ -75,7 +75,7 @@ osq_wait_next(struct optimistic_spin_queue *lock,
 			break;
 		}
 
-		cpu_relax_lowlatency();
+		cpu_relax();
 	}
 
 	return next;
@@ -122,7 +122,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
 		if (need_resched())
 			goto unqueue;
 
-		cpu_relax_lowlatency();
+		cpu_relax();
 	}
 	return true;
@@ -148,7 +148,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
 		if (smp_load_acquire(&node->locked))
 			return true;
 
-		cpu_relax_lowlatency();
+		cpu_relax();
 
 		/*
 		 * Or we race against a concurrent unqueue()'s step-B, in which

diff --git a/kernel/locking/qrwlock.c
Re: [PATCH v3 02/16] scsi: don't use fc_bsg_job::request and fc_bsg_job::reply directly
On Fri, Oct 14, 2016 at 09:38:21AM +0200, Johannes Thumshirn wrote:
> On Thu, Oct 13, 2016 at 05:55:11PM +0200, Steffen Maier wrote:
> > Hm, still behaves for me like I reported for v2:
> > http://marc.info/?l=linux-scsi=147637177902937=2
>
> Hi Steffen,
>
> Can you please try the following on top of 2/16?
>
> diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
> index 4149dac..baebaab 100644
> --- a/drivers/scsi/scsi_transport_fc.c
> +++ b/drivers/scsi/scsi_transport_fc.c
> @@ -3786,6 +3786,12 @@ enum fc_dispatch_result {
>  	int cmdlen = sizeof(uint32_t);	/* start with length of msgcode */
>  	int ret;
> 
> +	/* check if we really have all the request data needed */
> +	if (job->request_len < cmdlen) {
> +		ret = -ENOMSG;
> +		goto fail_host_msg;
> +	}
> +
>  	/* Validate the host command */
>  	switch (bsg_request->msgcode) {
>  	case FC_BSG_HST_ADD_RPORT:
> @@ -3831,12 +3837,6 @@ enum fc_dispatch_result {
>  		goto fail_host_msg;
>  	}
> 
> -	/* check if we really have all the request data needed */
> -	if (job->request_len < cmdlen) {
> -		ret = -ENOMSG;
> -		goto fail_host_msg;
> -	}
> -
>  	ret = i->f->bsg_request(job);
>  	if (!ret)
>  		return FC_DISPATCH_UNLOCKED;
> @@ -3887,6 +3887,12 @@ enum fc_dispatch_result {
>  	int cmdlen = sizeof(uint32_t);	/* start with length of msgcode */
>  	int ret;
> 
> +	/* check if we really have all the request data needed */
> +	if (job->request_len < cmdlen) {
> +		ret = -ENOMSG;
> +		goto fail_rport_msg;
> +	}
> +
>  	/* Validate the rport command */
>  	switch (bsg_request->msgcode) {
>  	case FC_BSG_RPT_ELS:
>
> The rationale behind this is: in fc_req_to_bsgjob() we're assigning
> job->request as req->cmd, and job->request_len = req->cmd_len. But
> without checking job->request_len we don't know whether we're safe to
> touch job->request (a.k.a. bsg_request).

Hi Steffen,

Did you have a chance to test this? I hacked fcping to work with
non-FCoE and rports as well, and tested with FCoE and lpfc. No problems
seen from my side.
I've also pushed the series (with this change folded in) to my git tree
at [1] if this helps you in any way.

[1] https://git.kernel.org/cgit/linux/kernel/git/jth/linux.git/log/?h=scsi-bsg-rewrite-v4

Thanks a lot,
	Johannes

-- 
Johannes Thumshirn                                          Storage
jthumsh...@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850