Re: [PATCH v2] dmaengine: k3dma: use the correct HiSilicon copyright
On 2021/4/1 7:50 PM, Hao Fang wrote:
> s/Hisilicon/HiSilicon/g.
> It should use capital S, according to the official website.
>
> Signed-off-by: Hao Fang

Thanks for the patch.

Acked-by: Zhangfei Gao

> ---
> V2:
> -remove the terms of use link.
> ---
>  drivers/dma/k3dma.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/dma/k3dma.c b/drivers/dma/k3dma.c
> index d0b2e60..ecdaada9 100644
> --- a/drivers/dma/k3dma.c
> +++ b/drivers/dma/k3dma.c
> @@ -1,7 +1,7 @@
>  // SPDX-License-Identifier: GPL-2.0-only
>  /*
>   * Copyright (c) 2013 - 2015 Linaro Ltd.
> - * Copyright (c) 2013 Hisilicon Limited.
> + * Copyright (c) 2013 HiSilicon Limited.
>   */
>  #include
>  #include
> @@ -1039,6 +1039,6 @@ static struct platform_driver k3_pdma_driver = {
>  module_platform_driver(k3_pdma_driver);
>
> -MODULE_DESCRIPTION("Hisilicon k3 DMA Driver");
> +MODULE_DESCRIPTION("HiSilicon k3 DMA Driver");
>  MODULE_ALIAS("platform:k3dma");
>  MODULE_LICENSE("GPL v2");
Re: [PATCH] mmc: dw_mmc-k3: use the correct HiSilicon copyright
On Tue, Mar 30, 2021 at 2:46 PM Hao Fang wrote:
>
> s/Hisilicon/HiSilicon/g.
> It should use capital S, according to
> https://www.hisilicon.com/en/terms-of-use.
>
> Signed-off-by: Hao Fang

Thanks for the fix.

Acked-by: Zhangfei Gao

> ---
>  drivers/mmc/host/dw_mmc-k3.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/mmc/host/dw_mmc-k3.c b/drivers/mmc/host/dw_mmc-k3.c
> index 29d2494..0311a37 100644
> --- a/drivers/mmc/host/dw_mmc-k3.c
> +++ b/drivers/mmc/host/dw_mmc-k3.c
> @@ -1,7 +1,7 @@
>  // SPDX-License-Identifier: GPL-2.0-or-later
>  /*
>   * Copyright (c) 2013 Linaro Ltd.
> - * Copyright (c) 2013 Hisilicon Limited.
> + * Copyright (c) 2013 HiSilicon Limited.
>   */
>
>  #include
> --
> 2.8.1
>
Re: [PATCH v2 3/3] PCI: set dma-can-stall for HiSilicon chip
Hi, Krzysztof

On 2021/3/8 1:54 AM, Krzysztof Wilczyński wrote:
> Hi,
>
> [...]
>> Property dma-can-stall depends on patchset
>> https://lore.kernel.org/linux-iommu/20210108145217.2254447-1-jean-phili...@linaro.org/
> [...]
>
> If you plan to post another version of this patch to include the above
> link into the commit message or reference to the commit itself, as
> Jean-Philippe's series can already be included in the mainline (since it
> has been a while now from when this series was originally posted), then
> I have a favour to ask - would you also be able to capitalise the
> subject line (so that it's consistent) and change "chip" to "chips"
> since there are two you mention in the commit message.

Have sent another version with the changes, thanks for the reminder.
I was waiting for Jean's patchset,
https://lore.kernel.org/linux-iommu/20210302092644.2553014-1-jean-phili...@linaro.org/
Though the quirk patches can be applied and built directly on 5.12-rc1.

Thanks
[PATCH v3 3/3] PCI: Set dma-can-stall for HiSilicon chips
HiSilicon KunPeng920 and KunPeng930 have devices appear as PCI but are
actually on the AMBA bus. These fake PCI devices can support SVA via
SMMU stall feature, by setting dma-can-stall for ACPI platforms.

Property dma-can-stall depends on patchset
https://lore.kernel.org/linux-iommu/20210302092644.2553014-1-jean-phili...@linaro.org/

Signed-off-by: Zhangfei Gao
Signed-off-by: Jean-Philippe Brucker
Signed-off-by: Zhou Wang
---
 drivers/pci/quirks.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 873d27f..b866cdf 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -1827,10 +1827,23 @@ DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_HUAWEI, 0x1610, PCI_CLASS_BRIDGE_PCI
 
 static void quirk_huawei_pcie_sva(struct pci_dev *pdev)
 {
+	struct property_entry properties[] = {
+		PROPERTY_ENTRY_BOOL("dma-can-stall"),
+		{},
+	};
+
 	if (pdev->revision != 0x21 && pdev->revision != 0x30)
 		return;
 
 	pdev->pasid_no_tlp = 1;
+
+	/*
+	 * Set the dma-can-stall property on ACPI platforms. Device tree
+	 * can set it directly.
+	 */
+	if (!pdev->dev.of_node &&
+	    device_add_properties(&pdev->dev, properties))
+		pci_warn(pdev, "could not add stall property");
 }
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa250, quirk_huawei_pcie_sva);
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa251, quirk_huawei_pcie_sva);
-- 
2.9.5
[PATCH v3 2/3] PCI: Add a quirk to set pasid_no_tlp for HiSilicon chips
HiSilicon KunPeng920 and KunPeng930 have devices appear as PCI but are
actually on the AMBA bus. These fake PCI devices have PASID capability
though not supporting TLP.

Add a quirk to set pasid_no_tlp for these devices.

Signed-off-by: Zhangfei Gao
Signed-off-by: Jean-Philippe Brucker
Signed-off-by: Zhou Wang
---
 drivers/pci/quirks.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 653660e..873d27f 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -1825,6 +1825,20 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_E7525_MCH, quir
 DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_HUAWEI, 0x1610, PCI_CLASS_BRIDGE_PCI, 8, quirk_pcie_mch);
 
+static void quirk_huawei_pcie_sva(struct pci_dev *pdev)
+{
+	if (pdev->revision != 0x21 && pdev->revision != 0x30)
+		return;
+
+	pdev->pasid_no_tlp = 1;
+}
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa250, quirk_huawei_pcie_sva);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa251, quirk_huawei_pcie_sva);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa255, quirk_huawei_pcie_sva);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa256, quirk_huawei_pcie_sva);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa258, quirk_huawei_pcie_sva);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa259, quirk_huawei_pcie_sva);
+
 /*
  * It's possible for the MSI to get corrupted if SHPC and ACPI are used
  * together on certain PXH-based systems.
-- 
2.9.5
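The matching done by this quirk can be modelled in user space roughly as follows. This is only a sketch under stated assumptions: `struct sva_dev`, `sva_run_fixups()` and the explicit ID table are hypothetical stand-ins for `struct pci_dev` and the `DECLARE_PCI_FIXUP_FINAL()` machinery, and the vendor ID value 0x19e5 is assumed to be the one behind `PCI_VENDOR_ID_HUAWEI`. The device IDs and revisions are the ones listed in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value of PCI_VENDOR_ID_HUAWEI; not taken from the patch itself. */
#define SKETCH_VENDOR_ID_HUAWEI 0x19e5

/* Hypothetical, reduced stand-in for struct pci_dev. */
struct sva_dev {
	uint16_t vendor;
	uint16_t device;
	uint8_t revision;
	unsigned int pasid_no_tlp:1;
};

/* Device IDs registered with DECLARE_PCI_FIXUP_FINAL() in the patch. */
static const uint16_t sva_device_ids[] = {
	0xa250, 0xa251, 0xa255, 0xa256, 0xa258, 0xa259,
};

/* Same body as quirk_huawei_pcie_sva() in patch 2/3. */
static void sketch_quirk_sva(struct sva_dev *pdev)
{
	if (pdev->revision != 0x21 && pdev->revision != 0x30)
		return;

	pdev->pasid_no_tlp = 1;
}

/* Simplified dispatcher: run the quirk only on a vendor/device match. */
static void sva_run_fixups(struct sva_dev *pdev)
{
	size_t i;

	if (pdev->vendor != SKETCH_VENDOR_ID_HUAWEI)
		return;

	for (i = 0; i < sizeof(sva_device_ids) / sizeof(sva_device_ids[0]); i++) {
		if (pdev->device == sva_device_ids[i]) {
			sketch_quirk_sva(pdev);
			return;
		}
	}
}

/* Usage example: only a listed device with a listed revision gets the bit. */
void sva_quirk_demo(void)
{
	struct sva_dev dev = { SKETCH_VENDOR_ID_HUAWEI, 0xa250, 0x21, 0 };

	sva_run_fixups(&dev);
	assert(dev.pasid_no_tlp == 1);
}
```

The sketch also makes the maintenance concern raised elsewhere in the thread visible: every new device needs another entry in the ID table.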
[PATCH v3 1/3] PCI: PASID can be enabled without TLP prefix
A PASID-like feature is implemented on AMBA without using TLP prefixes
and these devices have PASID capability though not supporting TLP.

Adding a pasid_no_tlp bit for "PASID works without TLP prefixes" and
pci_enable_pasid() checks pasid_no_tlp as well as eetlp_prefix_path.

Suggested-by: Bjorn Helgaas
Signed-off-by: Zhangfei Gao
---
 drivers/pci/ats.c   | 2 +-
 include/linux/pci.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
index 793d381..88f981b 100644
--- a/drivers/pci/ats.c
+++ b/drivers/pci/ats.c
@@ -380,7 +380,7 @@ int pci_enable_pasid(struct pci_dev *pdev, int features)
 	if (WARN_ON(pdev->pasid_enabled))
 		return -EBUSY;
 
-	if (!pdev->eetlp_prefix_path)
+	if (!pdev->eetlp_prefix_path && !pdev->pasid_no_tlp)
 		return -EINVAL;
 
 	if (!pasid)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 86c799c..1daa943 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -388,6 +388,7 @@ struct pci_dev {
 					   supported from root to here */
 	u16		l1ss;		/* L1SS Capability pointer */
 #endif
+	unsigned int	pasid_no_tlp:1;		/* PASID works without TLP Prefix */
 	unsigned int	eetlp_prefix_path:1;	/* End-to-End TLP Prefix */
 
 	pci_channel_state_t error_state;	/* Current connectivity state */
-- 
2.9.5
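The effect of the changed condition can be illustrated with a small user-space model. It is a sketch only: `struct pasid_dev` and `sketch_enable_pasid()` are hypothetical names, and the model keeps just the two precondition checks visible in the hunk above, not the rest of pci_enable_pasid() (capability lookup, feature negotiation, register writes).

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the struct pci_dev bits touched by the patch. */
struct pasid_dev {
	unsigned int pasid_no_tlp:1;		/* PASID works without TLP Prefix */
	unsigned int eetlp_prefix_path:1;	/* End-to-End TLP Prefix */
	unsigned int pasid_enabled:1;
};

/* Models only the precondition checks shown in the diff. */
static int sketch_enable_pasid(struct pasid_dev *pdev)
{
	if (pdev->pasid_enabled)
		return -EBUSY;

	/* Either a working TLP prefix path or the new quirk bit is enough. */
	if (!pdev->eetlp_prefix_path && !pdev->pasid_no_tlp)
		return -EINVAL;

	pdev->pasid_enabled = 1;
	return 0;
}

/* Usage example: a device with neither bit set is rejected. */
void pasid_check_demo(void)
{
	struct pasid_dev plain = { 0, 0, 0 };
	struct pasid_dev quirked = { 1, 0, 0 };

	assert(sketch_enable_pasid(&plain) == -EINVAL);
	assert(sketch_enable_pasid(&quirked) == 0);
}
```

Keeping the quirk in its own bit, rather than setting eetlp_prefix_path, means other users of eetlp_prefix_path are not told that TLP prefixes work on these devices.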
[PATCH v3 0/3] PCI: Add a quirk to enable SVA for HiSilicon chip
HiSilicon KunPeng920 and KunPeng930 have devices appear as PCI but are
actually on the AMBA bus. These fake PCI devices have PASID capability
though not supporting TLP.

Add a quirk to set pasid_no_tlp and dma-can-stall for these devices.

v3:
Rebase to Linux 5.12-rc1
Change commit msg adding:
Property dma-can-stall depends on patchset
https://lore.kernel.org/linux-iommu/20210302092644.2553014-1-jean-phili...@linaro.org/
By the way, the patchset can be applied directly on 5.12-rc1 and builds
successfully even without the dependent patchset.

v2:
Add a new pci_dev bit: pasid_no_tlp, suggested by Bjorn
"Apparently these devices have a PASID capability. I think you should
add a new pci_dev bit that is specific to this idea of "PASID works
without TLP prefixes" and then change pci_enable_pasid() to look at that
bit as well as eetlp_prefix_path."
https://lore.kernel.org/linux-pci/20210112170230.GA1838341@bjorn-Precision-5520/

Zhangfei Gao (3):
  PCI: PASID can be enabled without TLP prefix
  PCI: Add a quirk to set pasid_no_tlp for HiSilicon chips
  PCI: Set dma-can-stall for HiSilicon chips

 drivers/pci/ats.c    | 2 +-
 drivers/pci/quirks.c | 27 +++
 include/linux/pci.h  | 1 +
 3 files changed, 29 insertions(+), 1 deletion(-)

-- 
2.9.5
[PATCH v2 2/3] PCI: Add a quirk to set pasid_no_tlp for HiSilicon chip
HiSilicon KunPeng920 and KunPeng930 have devices appear as PCI but are
actually on the AMBA bus. These fake PCI devices have PASID capability
though not supporting TLP.

Add a quirk to set pasid_no_tlp for these devices.

Signed-off-by: Zhangfei Gao
Signed-off-by: Jean-Philippe Brucker
Signed-off-by: Zhou Wang
---
 drivers/pci/quirks.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 653660e..873d27f 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -1825,6 +1825,20 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_E7525_MCH, quir
 DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_HUAWEI, 0x1610, PCI_CLASS_BRIDGE_PCI, 8, quirk_pcie_mch);
 
+static void quirk_huawei_pcie_sva(struct pci_dev *pdev)
+{
+	if (pdev->revision != 0x21 && pdev->revision != 0x30)
+		return;
+
+	pdev->pasid_no_tlp = 1;
+}
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa250, quirk_huawei_pcie_sva);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa251, quirk_huawei_pcie_sva);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa255, quirk_huawei_pcie_sva);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa256, quirk_huawei_pcie_sva);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa258, quirk_huawei_pcie_sva);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa259, quirk_huawei_pcie_sva);
+
 /*
  * It's possible for the MSI to get corrupted if SHPC and ACPI are used
  * together on certain PXH-based systems.
-- 
2.7.4
[PATCH v2 3/3] PCI: set dma-can-stall for HiSilicon chip
HiSilicon KunPeng920 and KunPeng930 have devices appear as PCI but are
actually on the AMBA bus. These fake PCI devices can support SVA via
SMMU stall feature, by setting dma-can-stall for ACPI platforms.

Signed-off-by: Zhangfei Gao
Signed-off-by: Jean-Philippe Brucker
Signed-off-by: Zhou Wang
---
Property dma-can-stall depends on patchset
https://lore.kernel.org/linux-iommu/20210108145217.2254447-1-jean-phili...@linaro.org/

 drivers/pci/quirks.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 873d27f..b866cdf 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -1827,10 +1827,23 @@ DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_HUAWEI, 0x1610, PCI_CLASS_BRIDGE_PCI
 
 static void quirk_huawei_pcie_sva(struct pci_dev *pdev)
 {
+	struct property_entry properties[] = {
+		PROPERTY_ENTRY_BOOL("dma-can-stall"),
+		{},
+	};
+
 	if (pdev->revision != 0x21 && pdev->revision != 0x30)
 		return;
 
 	pdev->pasid_no_tlp = 1;
+
+	/*
+	 * Set the dma-can-stall property on ACPI platforms. Device tree
+	 * can set it directly.
+	 */
+	if (!pdev->dev.of_node &&
+	    device_add_properties(&pdev->dev, properties))
+		pci_warn(pdev, "could not add stall property");
 }
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa250, quirk_huawei_pcie_sva);
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa251, quirk_huawei_pcie_sva);
-- 
2.7.4
[PATCH v2 1/3] PCI: PASID can be enabled without TLP prefix
A PASID-like feature is implemented on AMBA without using TLP prefixes
and these devices have PASID capability though not supporting TLP.

Adding a pasid_no_tlp bit for "PASID works without TLP prefixes" and
pci_enable_pasid() checks pasid_no_tlp as well as eetlp_prefix_path.

Suggested-by: Bjorn Helgaas
Signed-off-by: Zhangfei Gao
---
 drivers/pci/ats.c   | 2 +-
 include/linux/pci.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
index e36d601..b67b1b1 100644
--- a/drivers/pci/ats.c
+++ b/drivers/pci/ats.c
@@ -386,7 +386,7 @@ int pci_enable_pasid(struct pci_dev *pdev, int features)
 	if (WARN_ON(pdev->pasid_enabled))
 		return -EBUSY;
 
-	if (!pdev->eetlp_prefix_path)
+	if (!pdev->eetlp_prefix_path && !pdev->pasid_no_tlp)
 		return -EINVAL;
 
 	if (!pasid)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index f1f26f8..ac1c735 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -388,6 +388,7 @@ struct pci_dev {
 					   supported from root to here */
 	u16		l1ss;		/* L1SS Capability pointer */
 #endif
+	unsigned int	pasid_no_tlp:1;		/* PASID works without TLP Prefix */
 	unsigned int	eetlp_prefix_path:1;	/* End-to-End TLP Prefix */
 
 	pci_channel_state_t error_state;	/* Current connectivity state */
-- 
2.7.4
[PATCH v2 0/3] PCI: Add a quirk to enable SVA for HiSilicon chip
HiSilicon KunPeng920 and KunPeng930 have devices appear as PCI but are
actually on the AMBA bus. These fake PCI devices have PASID capability
though not supporting TLP.

Add a quirk to set pasid_no_tlp and dma-can-stall for these devices.

v2:
Add a new pci_dev bit: pasid_no_tlp, suggested by Bjorn
"Apparently these devices have a PASID capability. I think you should
add a new pci_dev bit that is specific to this idea of "PASID works
without TLP prefixes" and then change pci_enable_pasid() to look at that
bit as well as eetlp_prefix_path."
https://lore.kernel.org/linux-pci/20210112170230.GA1838341@bjorn-Precision-5520/

Zhangfei Gao (3):
  PCI: PASID can be enabled without TLP prefix
  PCI: Add a quirk to set pasid_no_tlp for HiSilicon chip
  PCI: set dma-can-stall for HiSilicon chip

 drivers/pci/ats.c    | 2 +-
 drivers/pci/quirks.c | 27 +++
 include/linux/pci.h  | 1 +
 3 files changed, 29 insertions(+), 1 deletion(-)

-- 
2.7.4
Re: [PATCH] PCI: Add a quirk to enable SVA for HiSilicon chip
Hi, Bjorn

Thanks for the suggestion.

On 2021/1/13 1:02 AM, Bjorn Helgaas wrote:
> On Tue, Jan 12, 2021 at 02:49:52PM +0800, Zhangfei Gao wrote:
>> HiSilicon KunPeng920 and KunPeng930 have devices appear as PCI but are
>> actually on the AMBA bus. These fake PCI devices can not support tlp
>> and have to enable SMMU stall mode to use the SVA feature.
>>
>> Add a quirk to set dma-can-stall property and enable tlp for these devices.
>
> s/tlp/TLP/
>
> I don't think "enable TLP" really captures what's going on here. You
> must be referring to the fact that you set pdev->eetlp_prefix_path.
>
> That is normally set by pci_configure_eetlp_prefix() if the Device
> Capabilities 2 register has the End-End TLP Prefix Supported bit set
> *and* all devices in the upstream path also have it set.
>
> The only place we currently test eetlp_prefix_path is in
> pci_enable_pasid(). In PCIe, PASID is implemented using the PASID TLP
> prefix, so we only enable PASID if TLP prefixes are supported.
>
> If I understand correctly, a PASID-like feature is implemented on AMBA
> without using TLP prefixes, and setting eetlp_prefix_path makes that
> work.

Yes, that's the requirement.

> I don't think you should do this by setting eetlp_prefix_path because
> TLP prefixes are used for other features, e.g., TPH. Setting
> eetlp_prefix_path implies these devices can also support things like
> TLP, and I don't think that's necessarily true.

Thanks for the reminder.

> Apparently these devices have a PASID capability. I think you should
> add a new pci_dev bit that is specific to this idea of "PASID works
> without TLP prefixes" and then change pci_enable_pasid() to look at
> that bit as well as eetlp_prefix_path.

That's great, this solution is much simpler.
We can set the bit before pci_enable_pasid.

> It seems like dma-can-stall is a separate thing from PASID? If so,
> this should be two separate patches.
>
> If they can be separated, I would probably make the PASID thing the
> first patch, and then the "dma-can-stall" can be on its own as a
> broken DT workaround (if that's what it is) and it's easier to remove
> that if it becomes obsolete.
>
>> Signed-off-by: Zhangfei Gao
>> Signed-off-by: Jean-Philippe Brucker
>> Signed-off-by: Zhou Wang
>> ---
>> Property dma-can-stall depends on patchset
>> https://lore.kernel.org/linux-iommu/20210108145217.2254447-1-jean-phili...@linaro.org/
>>
>>  drivers/pci/quirks.c | 25 +
>>  1 file changed, 25 insertions(+)
>>
>> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>> index 653660e..a27f327 100644
>> --- a/drivers/pci/quirks.c
>> +++ b/drivers/pci/quirks.c
>> @@ -1825,6 +1825,31 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_E7525_MCH, quir
>>  DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_HUAWEI, 0x1610, PCI_CLASS_BRIDGE_PCI, 8, quirk_pcie_mch);
>>
>> +static void quirk_huawei_pcie_sva(struct pci_dev *pdev)
>> +{
>> +	struct property_entry properties[] = {
>> +		PROPERTY_ENTRY_BOOL("dma-can-stall"),
>> +		{},
>> +	};
>> +
>> +	if ((pdev->revision != 0x21) && (pdev->revision != 0x30))
>> +		return;
>> +
>> +	pdev->eetlp_prefix_path = 1;
>> +
>> +	/* Device-tree can set the stall property */
>> +	if (!pdev->dev.of_node &&
>> +	    device_add_properties(&pdev->dev, properties))
>
> Does this mean "dma-can-stall" *can* be set via DT, and if it is, this
> quirk is not needed? So is this quirk basically a workaround for an
> old or broken DT?

The quirk is still needed for the UEFI case, since UEFI can not describe
the endpoints (peripheral devices).

>> +		pci_warn(pdev, "could not add stall property");
>> +}
>> +
>
> Remove this blank line to follow the style of the rest of the file.
>
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa250, quirk_huawei_pcie_sva);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa251, quirk_huawei_pcie_sva);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa255, quirk_huawei_pcie_sva);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa256, quirk_huawei_pcie_sva);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa258, quirk_huawei_pcie_sva);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa259, quirk_huawei_pcie_sva);
>> +
>>  /*
>>   * It's possible for the MSI to get corrupted if SHPC and ACPI are used
>>   * together on certain PXH-based systems.

How about changes like this

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 68f53f7..886ea26 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2466,6 +2466,9 @@ static int arm_smmu_enable_pasid(struct arm_smmu_master *master)
 	if (num_pasids <= 0)
 		return num_pasids;
 
+	if (master->stall_enabled)
+		pdev->pasid_no_tlp = 1;
+
 	ret = pci_enable_pasid(pdev, features);
 	if (ret) {
 		dev_err(&pdev->dev, "Failed to enable PASID\n");
@@ -2860,6 +2863,11 @@ static struct iommu
[PATCH] PCI: Add a quirk to enable SVA for HiSilicon chip
HiSilicon KunPeng920 and KunPeng930 have devices appear as PCI but are
actually on the AMBA bus. These fake PCI devices can not support tlp
and have to enable SMMU stall mode to use the SVA feature.

Add a quirk to set dma-can-stall property and enable tlp for these devices.

Signed-off-by: Zhangfei Gao
Signed-off-by: Jean-Philippe Brucker
Signed-off-by: Zhou Wang
---
Property dma-can-stall depends on patchset
https://lore.kernel.org/linux-iommu/20210108145217.2254447-1-jean-phili...@linaro.org/

 drivers/pci/quirks.c | 25 +
 1 file changed, 25 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 653660e..a27f327 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -1825,6 +1825,31 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_E7525_MCH, quir
 DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_HUAWEI, 0x1610, PCI_CLASS_BRIDGE_PCI, 8, quirk_pcie_mch);
 
+static void quirk_huawei_pcie_sva(struct pci_dev *pdev)
+{
+	struct property_entry properties[] = {
+		PROPERTY_ENTRY_BOOL("dma-can-stall"),
+		{},
+	};
+
+	if ((pdev->revision != 0x21) && (pdev->revision != 0x30))
+		return;
+
+	pdev->eetlp_prefix_path = 1;
+
+	/* Device-tree can set the stall property */
+	if (!pdev->dev.of_node &&
+	    device_add_properties(&pdev->dev, properties))
+		pci_warn(pdev, "could not add stall property");
+}
+
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa250, quirk_huawei_pcie_sva);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa251, quirk_huawei_pcie_sva);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa255, quirk_huawei_pcie_sva);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa256, quirk_huawei_pcie_sva);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa258, quirk_huawei_pcie_sva);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa259, quirk_huawei_pcie_sva);
+
 /*
  * It's possible for the MSI to get corrupted if SHPC and ACPI are used
  * together on certain PXH-based systems.
-- 
2.7.4
Re: [PATCH] uacce: Use kobj_to_dev() instead of container_of()
On 2020/8/20 10:16 AM, Tian Tao wrote:
> Use kobj_to_dev() instead of container_of()
>
> Signed-off-by: Tian Tao
> Reviewed-by: Zhou Wang

Acked-by: Zhangfei Gao

Thanks

> ---
>  drivers/misc/uacce/uacce.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c
> index a5b8dab..a9da7b1 100644
> --- a/drivers/misc/uacce/uacce.c
> +++ b/drivers/misc/uacce/uacce.c
> @@ -370,7 +370,7 @@ static struct attribute *uacce_dev_attrs[] = {
>  static umode_t uacce_dev_is_visible(struct kobject *kobj,
>  				    struct attribute *attr, int n)
>  {
> -	struct device *dev = container_of(kobj, struct device, kobj);
> +	struct device *dev = kobj_to_dev(kobj);
>  	struct uacce_device *uacce = to_uacce_device(dev);
>
>  	if (((attr == _attr_region_mmio_size.attr) &&
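For readers unfamiliar with the helper, the relationship between the two spellings can be sketched in user space as below. All names prefixed with `fake_` are hypothetical stand-ins, and the macro is a simplified version of the container_of() idea without the kernel's type checking; the point is only that kobj_to_dev() wraps the same pointer arithmetic behind a named helper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-ins for struct kobject and struct device. */
struct fake_kobject {
	const char *name;
};

struct fake_device {
	int id;
	struct fake_kobject kobj;	/* embedded, as in struct device */
};

/* Simplified container_of(): step back from a member to its container. */
#define fake_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Named wrapper, playing the role of kobj_to_dev(). */
static inline struct fake_device *fake_kobj_to_dev(struct fake_kobject *kobj)
{
	return fake_container_of(kobj, struct fake_device, kobj);
}

/* Usage example: both spellings recover the same enclosing object. */
void kobj_to_dev_demo(void)
{
	struct fake_device dev = { 3, { "uacce" } };

	assert(fake_kobj_to_dev(&dev.kobj) == &dev);
	assert(fake_container_of(&dev.kobj, struct fake_device, kobj) == &dev);
}
```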
Re: [PATCH] uacce: fix some coding styles
On 2020/7/30 2:13 PM, Kai Ye wrote:
> 1. delete some redundant code.
> 2. modify the module author information.
>
> Signed-off-by: Kai Ye

Thanks Kai

Acked-by: Zhangfei Gao

Thanks
Re: [PATCH] uacce: fix some coding styles
On 2020/7/20 3:18 PM, Kai Ye wrote:
> 1. add some parameter check.
> 2. delete some redundant code.
> 3. modify the module author information.
>
> Signed-off-by: Kai Ye
> Reviewed-by: Zhou Wang

Thanks Kai.

> ---
>  drivers/misc/uacce/uacce.c | 28 +---
>  1 file changed, 13 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c
> index 107028e..2e1af58 100644
> --- a/drivers/misc/uacce/uacce.c
> +++ b/drivers/misc/uacce/uacce.c
> @@ -63,8 +63,12 @@ static long uacce_fops_unl_ioctl(struct file *filep,
>  				 unsigned int cmd, unsigned long arg)
>  {
>  	struct uacce_queue *q = filep->private_data;
> -	struct uacce_device *uacce = q->uacce;
> +	struct uacce_device *uacce;
> +
> +	if (WARN_ON(!q))
> +		return -EINVAL;

WARN_ON should not be used in uacce; instead, the error can be printed
in the user space driver. Errors should not be printed in the kernel log,
as pasid can be used by unprivileged users.

And I think we do not need to check filep->private_data.
The fd is double checked in __fget_files.

Thanks
Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU
Hi, Joerg

On 2020/6/22 7:55 PM, Joerg Roedel wrote:
> On Thu, Jun 04, 2020 at 09:33:07PM +0800, Zhangfei Gao wrote:
>> +++ b/drivers/iommu/iommu.c
>> @@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode,
>>  	fwspec->iommu_fwnode = iommu_fwnode;
>>  	fwspec->ops = ops;
>>  	dev_iommu_fwspec_set(dev, fwspec);
>> +
>> +	if (dev_is_pci(dev))
>> +		pci_fixup_device(pci_fixup_final, to_pci_dev(dev));
>> +
>
> That's not going to fly, I don't think we should run the fixups twice,
> and they should not be run from IOMMU code. Is the only reason for this
> second pass that iommu_fwspec is not yet allocated when it runs the
> first time? I ask because it might be easier to just allocate the struct
> earlier then.

Thanks for looking at this.

Yes, that is the only reason for calling the fixup a second time, after
iommu_fwspec is allocated. The first final fixup runs very early, in
pci_bus_add_device.

If iommu_fwspec is allocated earlier, it may be done in pci_alloc_dev,
with the ops still assigned in iommu_fwspec_init.
I have tested this and it works. Not sure whether it is acceptable?

Alternatively, adding can_stall to struct pci_dev is simple but ugly too,
since PCI does not know about stall now.

Thanks
Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU
Hi, Bjorn

On 2020/6/16 7:52 AM, Bjorn Helgaas wrote:
On Sat, Jun 13, 2020 at 10:30:56PM +0800, Zhangfei Gao wrote:
On 2020/6/11 9:44 PM, Bjorn Helgaas wrote:

+++ b/drivers/iommu/iommu.c
@@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode,
 	fwspec->iommu_fwnode = iommu_fwnode;
 	fwspec->ops = ops;
 	dev_iommu_fwspec_set(dev, fwspec);
+
+	if (dev_is_pci(dev))
+		pci_fixup_device(pci_fixup_final, to_pci_dev(dev));
+

Then pci_fixup_final will be called twice, the first in pci_bus_add_device. Here in iommu_fwspec_init is the second time, specifically for iommu_fwspec. Will send this when 5.8-rc1 is open.

Wait, this whole fixup approach seems wrong to me. No matter how you do the fixup, it's still a fixup, which means it requires ongoing maintenance. Surely we don't want to have to add the Vendor/Device ID for every new AMBA device that comes along, do we?

Here the fake pci device has standard PCI cfg space, but the physical implementation is based on AMBA. They can provide the pasid feature. However,
1. does not support tlp since they are not real pci devices.
2. does not support pri, instead support stall (provided by smmu)
And stall is not a pci feature, so it is not described in struct pci_dev, but in struct iommu_fwspec. So we use this fixup to tell the pci system that the devices can support stall, and thereby support pasid.

This did not answer my question. Are you proposing that we update a quirk every time a new AMBA device is released? I don't think that would be a good model.

Yes, you are right, but we do not have any better idea yet. Currently we have three fake pci devices, which support stall and pasid. We have to let the pci system know the device can support pasid, because of the stall feature, though not support pri. Do you have any other ideas?

It sounds like the best way would be to allocate a PCI capability for it, so detection can be done through config space, at least in future devices, or possibly after a firmware update if the config space in your system is controlled by firmware somewhere. Once there is a proper mechanism to do this, using fixups to detect the early devices that don't use that should be uncontroversial. I have no idea what the process or timeline is to add new capabilities into the PCIe specification, or if this one would be acceptable to the PCI SIG at all.

That sounds like a possibility. The spec already defines a Vendor-Specific Extended Capability (PCIe r5.0, sec 7.9.5) that might be a candidate.

Will investigate this, thanks Bjorn

FWIW, there's also a Vendor-Specific Capability that can appear in the first 256 bytes of config space (the Vendor-Specific Extended Capability must appear in the "Extended Configuration Space" from 0x100-0xfff).

Unfortunately our silicon does not have either Vendor-Specific Capability or Vendor-Specific Extended Capability. Studied commit 8531e283bee66050734fb0e89d53e85fd5ce24a4. It looks like this method requires adding a member (like can_stall) to struct pci_dev, which looks difficult.

The problem is that we don't want to add device IDs every time a new chip comes out. Adding one or two device IDs for silicon that's already released is not a problem as long as you have a strategy for *future* devices so they don't require a quirk.

If detection cannot be done through PCI config space, the next best alternative is to pass auxiliary data through firmware. On DT based machines, you can list non-hotpluggable PCIe devices and add custom properties that could be read during device enumeration. I assume ACPI has something similar, but I have not done that.

Yes, thanks Arnd

ACPI has _DSM (ACPI v6.3, sec 9.1.1), which might be a candidate. I like this better than a PCI capability because the property you need to expose is not a PCI property.

_DSM may not be workable, since it works at runtime. We need the stall information in the init stage, neither too early (after allocation of iommu_fwspec) nor too late (before arm_smmu_add_device).

I'm not aware of a restriction on when _DSM can be evaluated. I'm looking at ACPI v6.3, sec 9.1.1. Are you seeing something different?

The DSM method seems to require a vendor specific guid, and the code would be vendor specific.

_DSM indeed requires a vendor-specific UUID, precisely *because* vendors are free to define their own functionality without requiring changes to the ACPI spec. From the spec (ACPI v6.3, sec 9.1.1):

New UUIDs may also be created by OEMs and IHVs for custom devices and other interface or device governing bodies (e.g. the PCI SIG), as long as the UUID is different from other published UUIDs.

Have studied the _DSM method; we met two issues compared with using a quirk.
1. Need to change the definition of either pci_host_bridge or pci_dev, like adding member can_stall, while the pci system does not know stall now.
a, pci devices do not have uuid: uuid need be described in dsdt, while pci devices are not
[PATCH] uacce: remove uacce_vma_fault
Fix NULL pointer error if removing uacce's parent module during app's
running. SIGBUS is already reported by do_page_fault, so uacce_vma_fault
is not needed. If providing vma_fault, vmf->page has to be filled as well,
required by __do_fault.

Reported-by: Jean-Philippe Brucker
Signed-off-by: Zhangfei Gao
---
 drivers/misc/uacce/uacce.c | 9 -
 1 file changed, 9 deletions(-)

diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c
index 107028e..aa91f69 100644
--- a/drivers/misc/uacce/uacce.c
+++ b/drivers/misc/uacce/uacce.c
@@ -179,14 +179,6 @@ static int uacce_fops_release(struct inode *inode, struct file *filep)
 	return 0;
 }
 
-static vm_fault_t uacce_vma_fault(struct vm_fault *vmf)
-{
-	if (vmf->flags & (FAULT_FLAG_MKWRITE | FAULT_FLAG_WRITE))
-		return VM_FAULT_SIGBUS;
-
-	return 0;
-}
-
 static void uacce_vma_close(struct vm_area_struct *vma)
 {
 	struct uacce_queue *q = vma->vm_private_data;
@@ -199,7 +191,6 @@ static void uacce_vma_close(struct vm_area_struct *vma)
 }
 
 static const struct vm_operations_struct uacce_vm_ops = {
-	.fault = uacce_vma_fault,
 	.close = uacce_vma_close,
 };
 
-- 
2.7.4
[PATCH v2] crypto: hisilicon - fix strncpy warning with strscpy
Use strscpy to fix the warning
warning: 'strncpy' specified bound 64 equals destination size

Reported-by: kernel test robot
Signed-off-by: Zhangfei Gao
---
v2: Use strscpy instead of strlcpy since better truncation handling
    suggested by Herbert
    Rebase to 5.8-rc1

 drivers/crypto/hisilicon/qm.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/crypto/hisilicon/qm.c b/drivers/crypto/hisilicon/qm.c
index 9bb263cec6c3..8ac293afa8ab 100644
--- a/drivers/crypto/hisilicon/qm.c
+++ b/drivers/crypto/hisilicon/qm.c
@@ -2179,8 +2179,12 @@ static int qm_alloc_uacce(struct hisi_qm *qm)
 		.flags = UACCE_DEV_SVA,
 		.ops = _qm_ops,
 	};
+	int ret;
 
-	strncpy(interface.name, pdev->driver->name, sizeof(interface.name));
+	ret = strscpy(interface.name, pdev->driver->name,
+		      sizeof(interface.name));
+	if (ret < 0)
+		return -ENAMETOOLONG;
 
 	uacce = uacce_alloc(&pdev->dev, &interface);
 	if (IS_ERR(uacce))
-- 
2.23.0
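The difference the patch relies on can be sketched in user space as below. This is not the kernel implementation: `sketch_strscpy()` and `copy_name()` are hypothetical names, and the function only mimics the documented strscpy() contract (always NUL-terminate when size is nonzero, return the copied length, or -E2BIG when the source did not fit). strncpy() with a bound equal to the destination size can leave the buffer unterminated, which is what the compiler warning is about.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * User-space sketch of the strscpy() contract: copy at most size - 1
 * bytes, always terminate, report truncation with -E2BIG.
 */
static long sketch_strscpy(char *dst, const char *src, size_t size)
{
	size_t i;

	if (size == 0)
		return -E2BIG;

	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';

	/* If the source has more bytes left, the copy was truncated. */
	return src[i] == '\0' ? (long)i : -E2BIG;
}

/* Mirrors the error mapping used in the patch: truncation is an error. */
static int copy_name(char *dst, size_t size, const char *name)
{
	if (sketch_strscpy(dst, name, size) < 0)
		return -ENAMETOOLONG;

	return 0;
}

/* Usage example: a name that does not fit is rejected but still terminated. */
void strscpy_demo(void)
{
	char buf[8];

	assert(copy_name(buf, sizeof(buf), "hisi_zip") == -ENAMETOOLONG);
	assert(strcmp(buf, "hisi_zi") == 0);
}
```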
Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU
On 2020/6/11 下午9:44, Bjorn Helgaas wrote: +++ b/drivers/iommu/iommu.c @@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode, fwspec->iommu_fwnode = iommu_fwnode; fwspec->ops = ops; dev_iommu_fwspec_set(dev, fwspec); + + if (dev_is_pci(dev)) + pci_fixup_device(pci_fixup_final, to_pci_dev(dev)); + Then pci_fixup_final will be called twice, the first in pci_bus_add_device. Here in iommu_fwspec_init is the second time, specifically for iommu_fwspec. Will send this when 5.8-rc1 is open. Wait, this whole fixup approach seems wrong to me. No matter how you do the fixup, it's still a fixup, which means it requires ongoing maintenance. Surely we don't want to have to add the Vendor/Device ID for every new AMBA device that comes along, do we? Here the fake pci device has standard PCI cfg space, but physical implementation is base on AMBA They can provide pasid feature. However, 1, does not support tlp since they are not real pci devices. 2. does not support pri, instead support stall (provided by smmu) And stall is not a pci feature, so it is not described in struct pci_dev, but in struct iommu_fwspec. So we use this fixup to tell pci system that the devices can support stall, and hereby support pasid. This did not answer my question. Are you proposing that we update a quirk every time a new AMBA device is released? I don't think that would be a good model. Yes, you are right, but we do not have any better idea yet. Currently we have three fake pci devices, which support stall and pasid. We have to let pci system know the device can support pasid, because of stall feature, though not support pri. Do you have any other ideas? It sounds like the best way would be to allocate a PCI capability for it, so detection can be done through config space, at least in future devices, or possibly after a firmware update if the config space in your system is controlled by firmware somewhere. 
Once there is a proper mechanism to do this, using fixups to detect the early devices that don't use that should be uncontroversial. I have no idea what the process or timeline is to add new capabilities into the PCIe specification, or if this one would be acceptable to the PCI SIG at all. That sounds like a possibility. The spec already defines a Vendor-Specific Extended Capability (PCIe r5.0, sec 7.9.5) that might be a candidate. Will investigate this, thanks Bjorn FWIW, there's also a Vendor-Specific Capability that can appear in the first 256 bytes of config space (the Vendor-Specific Extended Capability must appear in the "Extended Configuration Space" from 0x100-0xfff). Unfortunately our silicon does not have either Vendor-Specific Capability or Vendor-Specific Extended Capability. Studied commit 8531e283bee66050734fb0e89d53e85fd5ce24a4 Looks this method requires adding member (like can_stall) to struct pci_dev, looks difficult. If detection cannot be done through PCI config space, the next best alternative is to pass auxiliary data through firmware. On DT based machines, you can list non-hotpluggable PCIe devices and add custom properties that could be read during device enumeration. I assume ACPI has something similar, but I have not done that. Yes, thanks Arnd ACPI has _DSM (ACPI v6.3, sec 9.1.1), which might be a candidate. I like this better than a PCI capability because the property you need to expose is not a PCI property. _DSM may not workable, since it is working in runtime. We need stall information in init stage, neither too early (after allocation of iommu_fwspec) nor too late (before arm_smmu_add_device ). I'm not aware of a restriction on when _DSM can be evaluated. I'm looking at ACPI v6.3, sec 9.1.1. Are you seeing something different? DSM method seems requires vendor specific guid, and code would be vendor specific. Except adding uuid to some spec like pci_acpi_dsm_guid. 
obj = acpi_evaluate_dsm(ACPI_HANDLE(bus->bridge), &pci_acpi_dsm_guid, 1, IGNORE_PCI_BOOT_CONFIG_DSM, NULL); By the way, It would be a long time if we need modify either pcie spec or acpi spec. Can we use pci_fixup_device in iommu_fwspec_init first, it is relatively simple and meet the requirement of platform device using pasid, and they are already in product. Neither the PCI Vendor-Specific Capability nor the ACPI _DSM requires a spec change. Both can be completely vendor-defined. Adding vendor-specific code to common files looks a bit ugly. Thanks
Re: [RFC PATCH] PCI: Remove End-End TLP as PASID dependency
On 2020/6/12 1:41 AM, Bjorn Helgaas wrote:
> [+cc Sinan]
>
> On Wed, Jun 10, 2020 at 12:18:14PM +0800, Zhangfei Gao wrote:
>> Some platform devices appear as PCI and have PCI cfg space,
>> but are actually on the AMBA bus.
>> They can support PASID via smmu stall feature, but does not
>> support tlp since they are not real pci devices.
>> So remove tlp as a PASID dependency.
>
> When you iterate on this, pay attention to things like:
>
>   - Wrap paragraphs to 75 columns or so, so they fill the whole line
>     but don't overflow when "git show" adds 4 spaces.
>   - Leave a blank line between paragraphs.
>   - Capitalize consistently: "SMMU", "PCI", "TLP".
>   - Provide references to relevant spec sections, e.g., for the SMMU
>     stall feature.

OK, thanks Bjorn.

>> Signed-off-by: Zhangfei Gao
>> ---
>>  drivers/pci/ats.c | 3 ---
>>  1 file changed, 3 deletions(-)
>>
>> diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
>> index 390e92f..8e31278 100644
>> --- a/drivers/pci/ats.c
>> +++ b/drivers/pci/ats.c
>> @@ -344,9 +344,6 @@ int pci_enable_pasid(struct pci_dev *pdev, int features)
>>  	if (WARN_ON(pdev->pasid_enabled))
>>  		return -EBUSY;
>>
>> -	if (!pdev->eetlp_prefix_path)
>> -		return -EINVAL;
>
> No. This would mean we might enable PASID on actual PCIe devices when
> it is not safe to do so, as Jean-Philippe pointed out. You cannot
> break actual PCIe devices just to make your non-PCIe device work.
>
> These devices do not support PASID as defined in the PCIe spec. They
> might support something *like* PASID, and you might be able to make
> parts of the PCI core work with them, but you're going to have to deal
> with the parts that don't follow the PCIe spec on your own. That
> might be quirks, it might be some sort of AMBA adaptation shim, I
> don't know. But it's not the responsibility of the PCI core to adapt
> to them.

Understood. I will continue to use a quirk for this.

Thanks
Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU
On 2020/6/10 12:49 AM, Bjorn Helgaas wrote: On Tue, Jun 09, 2020 at 11:15:06AM +0200, Arnd Bergmann wrote: On Tue, Jun 9, 2020 at 6:02 AM Zhangfei Gao wrote: On 2020/6/9 12:41 AM, Bjorn Helgaas wrote: On Mon, Jun 08, 2020 at 10:54:15AM +0800, Zhangfei Gao wrote: On 2020/6/6 7:19 AM, Bjorn Helgaas wrote: +++ b/drivers/iommu/iommu.c @@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode, fwspec->iommu_fwnode = iommu_fwnode; fwspec->ops = ops; dev_iommu_fwspec_set(dev, fwspec); + + if (dev_is_pci(dev)) + pci_fixup_device(pci_fixup_final, to_pci_dev(dev)); + Then pci_fixup_final will be called twice, the first in pci_bus_add_device. Here in iommu_fwspec_init is the second time, specifically for iommu_fwspec. Will send this when 5.8-rc1 is open. Wait, this whole fixup approach seems wrong to me. No matter how you do the fixup, it's still a fixup, which means it requires ongoing maintenance. Surely we don't want to have to add the Vendor/Device ID for every new AMBA device that comes along, do we? Here the fake pci device has standard PCI cfg space, but physical implementation is base on AMBA They can provide pasid feature. However, 1, does not support tlp since they are not real pci devices. 2. does not support pri, instead support stall (provided by smmu) And stall is not a pci feature, so it is not described in struct pci_dev, but in struct iommu_fwspec. So we use this fixup to tell pci system that the devices can support stall, and hereby support pasid. This did not answer my question. Are you proposing that we update a quirk every time a new AMBA device is released? I don't think that would be a good model. Yes, you are right, but we do not have any better idea yet. Currently we have three fake pci devices, which support stall and pasid. We have to let pci system know the device can support pasid, because of stall feature, though not support pri. Do you have any other ideas?
It sounds like the best way would be to allocate a PCI capability for it, so detection can be done through config space, at least in future devices, or possibly after a firmware update if the config space in your system is controlled by firmware somewhere. Once there is a proper mechanism to do this, using fixups to detect the early devices that don't use that should be uncontroversial. I have no idea what the process or timeline is to add new capabilities into the PCIe specification, or if this one would be acceptable to the PCI SIG at all. That sounds like a possibility. The spec already defines a Vendor-Specific Extended Capability (PCIe r5.0, sec 7.9.5) that might be a candidate. Will investigate this, thanks Bjorn If detection cannot be done through PCI config space, the next best alternative is to pass auxiliary data through firmware. On DT based machines, you can list non-hotpluggable PCIe devices and add custom properties that could be read during device enumeration. I assume ACPI has something similar, but I have not done that. Yes, thanks Arnd ACPI has _DSM (ACPI v6.3, sec 9.1.1), which might be a candidate. I like this better than a PCI capability because the property you need to expose is not a PCI property. _DSM may not workable, since it is working in runtime. We need stall information in init stage, neither too early (after allocation of iommu_fwspec) nor too late (before arm_smmu_add_device ). By the way, It would be a long time if we need modify either pcie spec or acpi spec. Can we use pci_fixup_device in iommu_fwspec_init first, it is relatively simple and meet the requirement of platform device using pasid, and they are already in product. Thanks
Re: [RFC PATCH] PCI: Remove End-End TLP as PASID dependency
On 2020/6/10 3:46 PM, Jean-Philippe Brucker wrote:
> On Wed, Jun 10, 2020 at 12:18:14PM +0800, Zhangfei Gao wrote:
>> Some platform devices appear as PCI and have PCI cfg space,
>> but are actually on the AMBA bus.
>> They can support PASID via smmu stall feature, but does not
>> support tlp since they are not real pci devices.
>> So remove tlp as a PASID dependency.
>>
>> Signed-off-by: Zhangfei Gao
>> ---
>>  drivers/pci/ats.c | 3 ---
>>  1 file changed, 3 deletions(-)
>>
>> diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
>> index 390e92f..8e31278 100644
>> --- a/drivers/pci/ats.c
>> +++ b/drivers/pci/ats.c
>> @@ -344,9 +344,6 @@ int pci_enable_pasid(struct pci_dev *pdev, int features)
>>  	if (WARN_ON(pdev->pasid_enabled))
>>  		return -EBUSY;
>>
>> -	if (!pdev->eetlp_prefix_path)
>> -		return -EINVAL;
>> -
>
> This check is useful, and follows the PCI specification (4.0r1.0
> 2.2.10.2 End-End TLP Prefix Processing: "Software should ensure that
> TLPs containing End-End TLP Prefixes are not sent to components that
> do not support them.")

Thanks Jean,

> Why not set the eetlp_prefix_path bit from a PCI quirk? Unlike the
> stall problem from the other thread, this one looks like a simple
> design mistake that can be fixed easily in future iterations of the
> platform: just set the "End-End TLP Prefix Supported" bit in the
> Device Capability 2 Register of all bridges.

Yes, we can still set the eetlp_prefix_path bit from a PCI quirk.
We have also considered setting this bit in the Device Capability 2
Register in future silicon. But we hesitate because it would not
reflect the real function: according to the register the device would
support TLP prefixes, but in fact it does not.

Thanks
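Note: the interaction discussed above (the eetlp_prefix_path check stays in place, and a device-specific quirk sets the flag only for the affected devices) can be modelled outside the kernel. The following is a minimal userspace sketch under stated assumptions, not kernel code: struct demo_pci_dev, demo_enable_pasid() and demo_quirk_set_eetlp() are hypothetical names, and the model keeps only the three early checks visible in the diff context (pasid_enabled, eetlp_prefix_path, and the PASID capability offset).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, heavily reduced model of struct pci_dev. */
struct demo_pci_dev {
	bool eetlp_prefix_path;	/* End-End TLP Prefix usable on the path */
	bool pasid_enabled;
	int pasid_cap;		/* capability offset, 0 means "not present" */
};

/* Model of the early checks shown in the quoted diff context. */
static int demo_enable_pasid(struct demo_pci_dev *pdev)
{
	assert(pdev);

	if (pdev->pasid_enabled)
		return -EBUSY;

	/* The check the RFC patch wanted to remove; it is kept here. */
	if (!pdev->eetlp_prefix_path)
		return -EINVAL;

	if (!pdev->pasid_cap)
		return -EINVAL;

	pdev->pasid_enabled = true;
	return 0;
}

/* Hypothetical quirk: applied only to the devices that need it. */
static void demo_quirk_set_eetlp(struct demo_pci_dev *pdev)
{
	pdev->eetlp_prefix_path = true;
}
```

In this model, a device that never receives the quirk keeps failing with -EINVAL, so ordinary PCIe devices are unaffected, while a quirked device passes the check and can enable PASID once.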
[RFC PATCH] PCI: Remove End-End TLP as PASID dependency
Some platform devices appear as PCI and have PCI cfg space,
but are actually on the AMBA bus.
They can support PASID via smmu stall feature, but does not
support tlp since they are not real pci devices.
So remove tlp as a PASID dependency.

Signed-off-by: Zhangfei Gao
---
 drivers/pci/ats.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
index 390e92f..8e31278 100644
--- a/drivers/pci/ats.c
+++ b/drivers/pci/ats.c
@@ -344,9 +344,6 @@ int pci_enable_pasid(struct pci_dev *pdev, int features)
 	if (WARN_ON(pdev->pasid_enabled))
 		return -EBUSY;

-	if (!pdev->eetlp_prefix_path)
-		return -EINVAL;
-
 	if (!pasid)
 		return -EINVAL;

--
2.7.4
Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU
Hi, Bjorn On 2020/6/9 12:41 AM, Bjorn Helgaas wrote: On Mon, Jun 08, 2020 at 10:54:15AM +0800, Zhangfei Gao wrote: On 2020/6/6 7:19 AM, Bjorn Helgaas wrote: On Thu, Jun 04, 2020 at 09:33:07PM +0800, Zhangfei Gao wrote: On 2020/6/2 1:41 AM, Bjorn Helgaas wrote: On Thu, May 28, 2020 at 09:33:44AM +0200, Joerg Roedel wrote: On Wed, May 27, 2020 at 01:18:42PM -0500, Bjorn Helgaas wrote: Is this slowdown significant? We already iterate over every device when applying PCI_FIXUP_FINAL quirks, so if we used the existing PCI_FIXUP_FINAL, we wouldn't be adding a new loop. We would only be adding two more iterations to the loop in pci_do_fixups() that tries to match quirks against the current device. I doubt that would be a measurable slowdown. I don't know how significant it is, but I remember people complaining about adding new PCI quirks because it takes too long for them to run them all. That was in the discussion about the quirk disabling ATS on AMD Stoney systems. So it probably depends on how many PCI devices are in the system whether it causes any measureable slowdown. I found this [1] from Paul Menzel, which was a slowdown caused by quirk_usb_early_handoff(). I think the real problem is individual quirks that take a long time. The PCI_FIXUP_IOMMU things we're talking about should be fast, and of course, they're only run for matching devices anyway. So I'd rather keep them as PCI_FIXUP_FINAL than add a whole new phase. Thanks Bjorn for taking time for this. If so, it would be much simpler. +++ b/drivers/iommu/iommu.c @@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode, fwspec->iommu_fwnode = iommu_fwnode; fwspec->ops = ops; dev_iommu_fwspec_set(dev, fwspec); + + if (dev_is_pci(dev)) + pci_fixup_device(pci_fixup_final, to_pci_dev(dev)); + Then pci_fixup_final will be called twice, the first in pci_bus_add_device. Here in iommu_fwspec_init is the second time, specifically for iommu_fwspec. Will send this when 5.8-rc1 is open.
Wait, this whole fixup approach seems wrong to me. No matter how you do the fixup, it's still a fixup, which means it requires ongoing maintenance. Surely we don't want to have to add the Vendor/Device ID for every new AMBA device that comes along, do we? Here the fake pci device has standard PCI cfg space, but physical implementation is base on AMBA They can provide pasid feature. However, 1, does not support tlp since they are not real pci devices. 2. does not support pri, instead support stall (provided by smmu) And stall is not a pci feature, so it is not described in struct pci_dev, but in struct iommu_fwspec. So we use this fixup to tell pci system that the devices can support stall, and hereby support pasid. This did not answer my question. Are you proposing that we update a quirk every time a new AMBA device is released? I don't think that would be a good model. Yes, you are right, but we do not have any better idea yet. Currently we have three fake pci devices, which support stall and pasid. We have to let pci system know the device can support pasid, because of stall feature, though not support pri. Do you have any other ideas? Thanks
Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU
Hi, Bjorn On 2020/6/6 7:19 AM, Bjorn Helgaas wrote: On Thu, Jun 04, 2020 at 09:33:07PM +0800, Zhangfei Gao wrote: On 2020/6/2 1:41 AM, Bjorn Helgaas wrote: On Thu, May 28, 2020 at 09:33:44AM +0200, Joerg Roedel wrote: On Wed, May 27, 2020 at 01:18:42PM -0500, Bjorn Helgaas wrote: Is this slowdown significant? We already iterate over every device when applying PCI_FIXUP_FINAL quirks, so if we used the existing PCI_FIXUP_FINAL, we wouldn't be adding a new loop. We would only be adding two more iterations to the loop in pci_do_fixups() that tries to match quirks against the current device. I doubt that would be a measurable slowdown. I don't know how significant it is, but I remember people complaining about adding new PCI quirks because it takes too long for them to run them all. That was in the discussion about the quirk disabling ATS on AMD Stoney systems. So it probably depends on how many PCI devices are in the system whether it causes any measureable slowdown. I found this [1] from Paul Menzel, which was a slowdown caused by quirk_usb_early_handoff(). I think the real problem is individual quirks that take a long time. The PCI_FIXUP_IOMMU things we're talking about should be fast, and of course, they're only run for matching devices anyway. So I'd rather keep them as PCI_FIXUP_FINAL than add a whole new phase. Thanks Bjorn for taking time for this. If so, it would be much simpler. +++ b/drivers/iommu/iommu.c @@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode, fwspec->iommu_fwnode = iommu_fwnode; fwspec->ops = ops; dev_iommu_fwspec_set(dev, fwspec); + + if (dev_is_pci(dev)) + pci_fixup_device(pci_fixup_final, to_pci_dev(dev)); + Then pci_fixup_final will be called twice, the first in pci_bus_add_device. Here in iommu_fwspec_init is the second time, specifically for iommu_fwspec. Will send this when 5.8-rc1 is open. Wait, this whole fixup approach seems wrong to me.
No matter how you do the fixup, it's still a fixup, which means it requires ongoing maintenance. Surely we don't want to have to add the Vendor/Device ID for every new AMBA device that comes along, do we? Here the fake pci device has standard PCI cfg space, but physical implementation is base on AMBA They can provide pasid feature. However, 1, does not support tlp since they are not real pci devices. 2. does not support pri, instead support stall (provided by smmu) And stall is not a pci feature, so it is not described in struct pci_dev, but in struct iommu_fwspec. So we use this fixup to tell pci system that the devices can support stall, and hereby support pasid. Thanks
Re: [PATCH] crypto: hisilicon - fix strncpy warning with strlcpy
On 2020/6/5 11:49 PM, Eric Biggers wrote:
> On Fri, Jun 05, 2020 at 11:26:20PM +0800, Zhangfei Gao wrote:
>> On 2020/6/5 8:17 PM, Herbert Xu wrote:
>>> On Fri, Jun 05, 2020 at 05:34:32PM +0800, Zhangfei Gao wrote:
>>>> Will add a check after the copy.
>>>>
>>>> strlcpy(interface.name, pdev->driver->name, sizeof(interface.name));
>>>> if (strlen(pdev->driver->name) != strlen(interface.name))
>>>> 	return -EINVAL;
>>>
>>> You don't need to do strlen. The function strlcpy returns the
>>> length of the source string.
>>>
>>> Better yet use strscpy which will even return an error for you.
>>
>> Yes, good idea, we can use strscpy.
>>
>> +	int ret;
>>
>> -	strncpy(interface.name, pdev->driver->name, sizeof(interface.name));
>> +	ret = strscpy(interface.name, pdev->driver->name,
>> +		      sizeof(interface.name));
>> +	if (ret < 0)
>> +		return ret;
>
> You might want to use -ENAMETOOLONG instead of the strscpy return
> value of -E2BIG.

Yes, that makes sense, thanks Eric.
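Note: the copy-and-check pattern agreed on above can be tried outside the kernel. The following is a minimal userspace sketch, not the driver code. demo_strscpy() is a hypothetical stand-in that imitates the return convention of strscpy() as described in this thread (a negative -E2BIG on truncation, otherwise the number of bytes copied); demo_set_name() and DEMO_NAME_SIZE are illustrative names only.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define DEMO_NAME_SIZE 64	/* mirrors the 64-byte name from the thread */

/*
 * Hypothetical userspace stand-in for strscpy(): copy at most size - 1
 * bytes, always NUL-terminate when size > 0, and return the number of
 * bytes copied, or -E2BIG if the source did not fit.
 */
static long demo_strscpy(char *dst, const char *src, size_t size)
{
	size_t len;

	assert(dst && src);

	if (size == 0)
		return -E2BIG;

	len = strlen(src);
	if (len >= size) {
		/* Truncate, terminate, and report the truncation. */
		memcpy(dst, src, size - 1);
		dst[size - 1] = '\0';
		return -E2BIG;
	}

	memcpy(dst, src, len + 1);	/* includes the terminating NUL */
	return (long)len;
}

/* Hypothetical caller: reject over-long names instead of truncating. */
static int demo_set_name(char *name, size_t size, const char *driver_name)
{
	if (demo_strscpy(name, driver_name, size) < 0)
		return -ENAMETOOLONG;
	return 0;
}
```

With a 4-byte buffer, as in the UACCE_MAX_NAME_SIZE experiment earlier in the thread, the caller gets -ENAMETOOLONG instead of a name silently shortened to "his".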
Re: [PATCH] crypto: hisilicon - fix strncpy warning with strlcpy
On 2020/6/5 8:17 PM, Herbert Xu wrote:
> On Fri, Jun 05, 2020 at 05:34:32PM +0800, Zhangfei Gao wrote:
>> Will add a check after the copy.
>>
>> strlcpy(interface.name, pdev->driver->name, sizeof(interface.name));
>> if (strlen(pdev->driver->name) != strlen(interface.name))
>> 	return -EINVAL;
>
> You don't need to do strlen. The function strlcpy returns the
> length of the source string.
>
> Better yet use strscpy which will even return an error for you.

Yes, good idea, we can use strscpy.

+	int ret;

-	strncpy(interface.name, pdev->driver->name, sizeof(interface.name));
+	ret = strscpy(interface.name, pdev->driver->name,
+		      sizeof(interface.name));
+	if (ret < 0)
+		return ret;

Will resend later, thanks Herbert.
Re: [PATCH] crypto: hisilicon - fix strncpy warning with strlcpy
On 2020/6/4 2:50 PM, Herbert Xu wrote:
> On Thu, Jun 04, 2020 at 02:44:16PM +0800, Zhangfei Gao wrote:
>> I think it is fine.
>> 1. Currently the name size is 64, bigger enough.
>> Simply grep in driver name, 64 should be enough.
>> We can make it larger when there is a request.
>> 2. it does not matter what the name is, since it is just an interface.
>> cat /sys/class/uacce/hisi_zip-0/flags
>> cat /sys/class/uacce/his-0/flags
>> should be both fine to app only they can be distinguished.
>> 3. It maybe a hard restriction to fail just because of a long name.
>
> I think we should err on the side of caution. IOW, unless you
> know that you need it to succeed when it exceeds the limit, then
> you should just make it fail.

Thanks Herbert.
Will add a check after the copy.

strlcpy(interface.name, pdev->driver->name, sizeof(interface.name));
if (strlen(pdev->driver->name) != strlen(interface.name))
	return -EINVAL;

Will resend the fix after rc1 is open.

Thanks
Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU
On 2020/6/2 1:41 AM, Bjorn Helgaas wrote:
> On Thu, May 28, 2020 at 09:33:44AM +0200, Joerg Roedel wrote:
>> On Wed, May 27, 2020 at 01:18:42PM -0500, Bjorn Helgaas wrote:
>>> Is this slowdown significant? We already iterate over every device
>>> when applying PCI_FIXUP_FINAL quirks, so if we used the existing
>>> PCI_FIXUP_FINAL, we wouldn't be adding a new loop. We would only be
>>> adding two more iterations to the loop in pci_do_fixups() that tries
>>> to match quirks against the current device. I doubt that would be a
>>> measurable slowdown.
>>
>> I don't know how significant it is, but I remember people complaining
>> about adding new PCI quirks because it takes too long for them to run
>> them all. That was in the discussion about the quirk disabling ATS on
>> AMD Stoney systems.
>>
>> So it probably depends on how many PCI devices are in the system
>> whether it causes any measureable slowdown.
>
> I found this [1] from Paul Menzel, which was a slowdown caused by
> quirk_usb_early_handoff(). I think the real problem is individual
> quirks that take a long time.
>
> The PCI_FIXUP_IOMMU things we're talking about should be fast, and of
> course, they're only run for matching devices anyway. So I'd rather
> keep them as PCI_FIXUP_FINAL than add a whole new phase.

Thanks Bjorn for taking the time for this.
If so, it would be much simpler.

+++ b/drivers/iommu/iommu.c
@@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode,
 	fwspec->iommu_fwnode = iommu_fwnode;
 	fwspec->ops = ops;
 	dev_iommu_fwspec_set(dev, fwspec);
+
+	if (dev_is_pci(dev))
+		pci_fixup_device(pci_fixup_final, to_pci_dev(dev));
+

Then pci_fixup_final will be called twice: the first time in
pci_bus_add_device, and the second time here in iommu_fwspec_init,
specifically for iommu_fwspec.

Will send this when 5.8-rc1 is open.

Thanks
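Note: the cost argument above (a fixup pass is a linear scan over a table, and a hook runs only for a matching vendor/device pair) can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the pci_do_fixups() implementation: struct demo_dev, struct demo_fixup, demo_do_fixups() and DEMO_VENDOR are hypothetical names, and DEMO_VENDOR is a placeholder value rather than a real vendor ID.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_VENDOR 0x1234	/* placeholder, not a real vendor ID */

/* Hypothetical, reduced device model. */
struct demo_dev {
	unsigned short vendor;
	unsigned short device;
	int can_stall;
};

/* Hypothetical fixup table entry: IDs to match plus a hook. */
struct demo_fixup {
	unsigned short vendor;
	unsigned short device;
	void (*hook)(struct demo_dev *dev);
};

/* Cheap and idempotent hook, in the spirit of the quirk in this thread. */
static void demo_quirk_sva(struct demo_dev *dev)
{
	dev->can_stall = 1;
}

static const struct demo_fixup demo_final_fixups[] = {
	{ DEMO_VENDOR, 0xa250, demo_quirk_sva },
	{ DEMO_VENDOR, 0xa251, demo_quirk_sva },
};

/* Scan the table once; return how many hooks ran for this device. */
static int demo_do_fixups(struct demo_dev *dev,
			  const struct demo_fixup *fixups, size_t n)
{
	int ran = 0;

	assert(dev && fixups);

	for (size_t i = 0; i < n; i++) {
		if (fixups[i].vendor == dev->vendor &&
		    fixups[i].device == dev->device) {
			fixups[i].hook(dev);
			ran++;
		}
	}
	return ran;
}
```

In this model, running the pass a second time for the same device costs one more table scan and leaves the device state unchanged, because the hook only sets a flag. A hook that is slow or not idempotent would behave differently when the pass is repeated.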
Re: [PATCH] crypto: hisilicon - fix strncpy warning with strlcpy
On 2020/6/4 2:18 PM, Herbert Xu wrote:
> On Thu, Jun 04, 2020 at 02:10:37PM +0800, Zhangfei Gao wrote:
>>> Should this even allow truncation? Perhaps it'd be better to fail
>>> in case of an overrun?
>> I think we do not need consider overrun, since it at most copy size-1
>> bytes to dest.
>> From the manual: strlcpy()
>> This function is similar to strncpy(), but it copies at most size-1
>> bytes to dest, always adds a terminating null byte,
>> And simple tested with smaller SIZE of interface.name, only SIZE-1 is
>> copied, so it is safe.
>> -#define UACCE_MAX_NAME_SIZE 64
>> +#define UACCE_MAX_NAME_SIZE 4
>
> That's not what I meant. As it is if you do exceed the limit the
> name is silently truncated. Wouldn't it be better to fail the
> allocation instead?

I think it is fine.
1. The name size is currently 64, which is big enough.
   Judging from a simple grep of driver names, 64 should suffice.
   We can make it larger when there is a request.
2. It does not matter what the name is, since it is just an interface.
   cat /sys/class/uacce/hisi_zip-0/flags
   cat /sys/class/uacce/his-0/flags
   Both should be fine for the app, as long as they can be distinguished.
3. Failing just because of a long name may be too hard a restriction.

What do you think?

Thanks
Re: [PATCH] crypto: hisilicon - fix strncpy warning with strlcpy
On 2020/6/4 11:39 AM, Herbert Xu wrote:
> On Thu, Jun 04, 2020 at 11:32:04AM +0800, Zhangfei Gao wrote:
>> Use strlcpy to fix the warning
>> warning: 'strncpy' specified bound 64 equals destination size
>> [-Wstringop-truncation]
>>
>> Reported-by: kernel test robot
>> Signed-off-by: Zhangfei Gao
>> ---
>>  drivers/crypto/hisilicon/qm.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/crypto/hisilicon/qm.c b/drivers/crypto/hisilicon/qm.c
>> index f795fb5..224f3e2 100644
>> --- a/drivers/crypto/hisilicon/qm.c
>> +++ b/drivers/crypto/hisilicon/qm.c
>> @@ -1574,7 +1574,7 @@ static int qm_alloc_uacce(struct hisi_qm *qm)
>>  		.ops = &uacce_qm_ops,
>>  	};
>>
>> -	strncpy(interface.name, pdev->driver->name, sizeof(interface.name));
>> +	strlcpy(interface.name, pdev->driver->name, sizeof(interface.name));
>
> Should this even allow truncation? Perhaps it'd be better to fail
> in case of an overrun?

I think we do not need to consider overrun, since it copies at most
size-1 bytes to dest.

From the manual: strlcpy()
This function is similar to strncpy(), but it copies at most size-1
bytes to dest, always adds a terminating null byte,

I also did a simple test with a smaller size for interface.name: only
size-1 bytes are copied, so it is safe.
-#define UACCE_MAX_NAME_SIZE 64
+#define UACCE_MAX_NAME_SIZE 4

Thanks
[PATCH] crypto: hisilicon - fix strncpy warning with strlcpy
Use strlcpy to fix the warning
warning: 'strncpy' specified bound 64 equals destination size
[-Wstringop-truncation]

Reported-by: kernel test robot
Signed-off-by: Zhangfei Gao
---
 drivers/crypto/hisilicon/qm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/hisilicon/qm.c b/drivers/crypto/hisilicon/qm.c
index f795fb5..224f3e2 100644
--- a/drivers/crypto/hisilicon/qm.c
+++ b/drivers/crypto/hisilicon/qm.c
@@ -1574,7 +1574,7 @@ static int qm_alloc_uacce(struct hisi_qm *qm)
 		.ops = &uacce_qm_ops,
 	};

-	strncpy(interface.name, pdev->driver->name, sizeof(interface.name));
+	strlcpy(interface.name, pdev->driver->name, sizeof(interface.name));

 	uacce = uacce_alloc(&pdev->dev, &interface);
 	if (IS_ERR(uacce))
--
2.7.4
Re: [PATCH 2/2] iommu: calling pci_fixup_iommu in iommu_fwspec_init
On 2020/5/27 5:01 PM, Greg Kroah-Hartman wrote:
> On Tue, May 26, 2020 at 07:49:09PM +0800, Zhangfei Gao wrote:
>> Calling pci_fixup_iommu in iommu_fwspec_init, which alloc
>> iommu_fwnode. Some platform devices appear as PCI but are actually
>> on the AMBA bus, and they need fixup in drivers/pci/quirks.c
>> handling iommu_fwnode. So calling pci_fixup_iommu after
>> iommu_fwnode is allocated.
>>
>> Signed-off-by: Zhangfei Gao
>> ---
>>  drivers/iommu/iommu.c | 4
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>> index 7b37542..fb84c42 100644
>> --- a/drivers/iommu/iommu.c
>> +++ b/drivers/iommu/iommu.c
>> @@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode,
>>  	fwspec->iommu_fwnode = iommu_fwnode;
>>  	fwspec->ops = ops;
>>  	dev_iommu_fwspec_set(dev, fwspec);
>> +
>> +	if (dev_is_pci(dev))
>> +		pci_fixup_device(pci_fixup_iommu, to_pci_dev(dev));
>
> Why can't the caller do this as it "knows" it is a PCI device at that
> point in time, right?

The fixup is placed here because:
1. iommu_fwspec has been allocated at this point.
2. iommu_fwspec_init is called by both of_pci_iommu_init and
   iort_pci_iommu_init, so this covers both ACPI and DT.

Thanks
Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU
Hi, Bjorn

On 2020/5/28 2:18 AM, Bjorn Helgaas wrote:
> On Tue, May 26, 2020 at 07:49:07PM +0800, Zhangfei Gao wrote:
>> Some platform devices appear as PCI but are actually on the AMBA bus,
>> and they need fixup in drivers/pci/quirks.c handling iommu_fwnode.
>> Here introducing PCI_FIXUP_IOMMU, which is called after iommu_fwnode
>> is allocated, instead of reusing PCI_FIXUP_FINAL since it will slow
>> down iommu probing as all devices in fixup final list will be
>> reprocessed, suggested by Joerg, [1]
>
> Is this slowdown significant? We already iterate over every device
> when applying PCI_FIXUP_FINAL quirks, so if we used the existing
> PCI_FIXUP_FINAL, we wouldn't be adding a new loop. We would only be
> adding two more iterations to the loop in pci_do_fixups() that tries
> to match quirks against the current device. I doubt that would be a
> measurable slowdown.

I did not notice a difference when comparing fixup_iommu and
fixup_final via get_jiffies_64, since no other PCI fixup is registered
on our platform.
The plan here is to add pci_fixup_device in iommu_fwspec_init, so if
fixup_final is used the iteration will be done again here.

>> For example:
>> Hisilicon platform device need fixup in
>> drivers/pci/quirks.c handling fwspec->can_stall, which is introduced
>> in [2]
>>
>> +static void quirk_huawei_pcie_sva(struct pci_dev *pdev)
>> +{
>> +	struct iommu_fwspec *fwspec;
>> +
>> +	pdev->eetlp_prefix_path = 1;
>> +	fwspec = dev_iommu_fwspec_get(&pdev->dev);
>> +	if (fwspec)
>> +		fwspec->can_stall = 1;
>> +}
>> +
>> +DECLARE_PCI_FIXUP_IOMMU(PCI_VENDOR_ID_HUAWEI, 0xa250, quirk_huawei_pcie_sva);
>> +DECLARE_PCI_FIXUP_IOMMU(PCI_VENDOR_ID_HUAWEI, 0xa251, quirk_huawei_pcie_sva);
>>
>> [1] https://www.spinics.net/lists/iommu/msg44591.html
>> [2] https://www.spinics.net/lists/linux-pci/msg94559.html
>
> If you reference these in the commit logs, please use lore.kernel.org
> links instead of spinics.

Got it, thanks Bjorn.
Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU
On 2020/5/27 5:53 PM, Arnd Bergmann wrote:
> On Wed, May 27, 2020 at 11:00 AM Greg Kroah-Hartman wrote:
>> On Tue, May 26, 2020 at 07:49:07PM +0800, Zhangfei Gao wrote:
>>> Some platform devices appear as PCI but are actually on the AMBA bus,
>>
>> Why would these devices not just show up on the AMBA bus and use all
>> of that logic instead of being a PCI device and having to go through
>> odd fixes like this?
>
> There is a general move to having hardware be discoverable even with
> ARM processors. Having on-chip devices be discoverable using PCI
> config space is how x86 SoCs usually do it, and that is generally a
> good thing as it means we don't need to describe them in DT
>
> I guess as the hardware designers are still learning about it, this is
> not always done correctly. In general, we can also describe PCI
> devices on DT and do fixups during the probing there, but I suspect
> that won't work as easily using ACPI probing, so the fixup is keyed
> off the hardware ID, again as is common for x86 on-chip devices.

Yes, thanks Arnd :)

In order to use PASID, I/O page faults have to be supported, either by
the PCI PRI feature (for a PCI device) or by the stall mode of the SMMU
(for a platform device).
The purpose here is to let the system know that the platform device
supports SMMU stall mode and, as a result, PASID.
Since stall is not a PCI capability, we use a fixup here.

Thanks
Re: [PATCH 1/2] PCI: Introduce PCI_FIXUP_IOMMU
Hi, Christoph

On 2020/5/26 10:46 PM, Christoph Hellwig wrote:
> On Tue, May 26, 2020 at 07:49:08PM +0800, Zhangfei Gao wrote:
>> Some platform devices appear as PCI but are actually on the AMBA bus,
>> and they need fixup in drivers/pci/quirks.c handling iommu_fwnode.
>> Here introducing PCI_FIXUP_IOMMU, which is called after iommu_fwnode
>> is allocated, instead of reusing PCI_FIXUP_FINAL since it will slow
>> down iommu probing as all devices in fixup final list will be
>> reprocessed.
>
> Who is going to use this? I don't see a single user in the series.

We will add an IOMMU fixup in drivers/pci/quirks.c that handles
fwspec->can_stall, which is introduced in
https://www.spinics.net/lists/linux-pci/msg94559.html

Unfortunately, that patch did not make v5.8, so we have to wait.
We also want to check whether this is the right method to solve this
issue.

Thanks
Re: [PATCH 0/2] Let pci_fixup_final access iommu_fwnode
On 2020/5/25 9:43 PM, Joerg Roedel wrote:
> On Tue, May 12, 2020 at 12:08:29PM +0800, Zhangfei Gao wrote:
>> Some platform devices appear as PCI but are actually on the AMBA bus,
>> and they need fixup in drivers/pci/quirks.c handling iommu_fwnode.
>> So calling pci_fixup_final after iommu_fwnode is allocated.
>>
>> For example:
>> Hisilicon platform device need fixup in drivers/pci/quirks.c
>>
>> +static void quirk_huawei_pcie_sva(struct pci_dev *pdev)
>> +{
>> +	struct iommu_fwspec *fwspec;
>> +
>> +	pdev->eetlp_prefix_path = 1;
>> +	fwspec = dev_iommu_fwspec_get(&pdev->dev);
>> +	if (fwspec)
>> +		fwspec->can_stall = 1;
>> +}
>> +
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa250, quirk_huawei_pcie_sva);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa251, quirk_huawei_pcie_sva);
>
> I don't think it is a great idea to hook this into PCI_FIXUP_FINAL. The
> fixup list needs to be processed for every device, which will slow down
> probing.
>
> So either we introduce something like PCI_FIXUP_IOMMU, if this is
> entirely PCI specific. If it needs to be generic we need some fixup
> infrastructure in the IOMMU code itself.

Thanks Joerg for the good suggestion.
I am trying to introduce PCI_FIXUP_IOMMU in
https://lkml.org/lkml/2020/5/26/366

Thanks
[PATCH 2/2] iommu: calling pci_fixup_iommu in iommu_fwspec_init
Calling pci_fixup_iommu in iommu_fwspec_init, which alloc
iommu_fwnode. Some platform devices appear as PCI but are actually
on the AMBA bus, and they need fixup in drivers/pci/quirks.c
handling iommu_fwnode. So calling pci_fixup_iommu after
iommu_fwnode is allocated.

Signed-off-by: Zhangfei Gao
---
 drivers/iommu/iommu.c | 4
 1 file changed, 4 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 7b37542..fb84c42 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode,
 	fwspec->iommu_fwnode = iommu_fwnode;
 	fwspec->ops = ops;
 	dev_iommu_fwspec_set(dev, fwspec);
+
+	if (dev_is_pci(dev))
+		pci_fixup_device(pci_fixup_iommu, to_pci_dev(dev));
+
 	return 0;
 }
 EXPORT_SYMBOL_GPL(iommu_fwspec_init);
--
2.7.4
[PATCH 1/2] PCI: Introduce PCI_FIXUP_IOMMU
Some platform devices appear as PCI but are actually on the AMBA bus,
and they need fixup in drivers/pci/quirks.c handling iommu_fwnode.
Here introducing PCI_FIXUP_IOMMU, which is called after iommu_fwnode
is allocated, instead of reusing PCI_FIXUP_FINAL since it will slow
down iommu probing as all devices in fixup final list will be
reprocessed.

Suggested-by: Joerg Roedel
Signed-off-by: Zhangfei Gao
---
 drivers/pci/quirks.c              | 7 +++
 include/asm-generic/vmlinux.lds.h | 3 +++
 include/linux/pci.h               | 8
 3 files changed, 18 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index ca9ed57..b037034 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -83,6 +83,8 @@ extern struct pci_fixup __start_pci_fixups_header[];
 extern struct pci_fixup __end_pci_fixups_header[];
 extern struct pci_fixup __start_pci_fixups_final[];
 extern struct pci_fixup __end_pci_fixups_final[];
+extern struct pci_fixup __start_pci_fixups_iommu[];
+extern struct pci_fixup __end_pci_fixups_iommu[];
 extern struct pci_fixup __start_pci_fixups_enable[];
 extern struct pci_fixup __end_pci_fixups_enable[];
 extern struct pci_fixup __start_pci_fixups_resume[];
@@ -118,6 +120,11 @@ void pci_fixup_device(enum pci_fixup_pass pass, struct pci_dev *dev)
 		end = __end_pci_fixups_final;
 		break;

+	case pci_fixup_iommu:
+		start = __start_pci_fixups_iommu;
+		end = __end_pci_fixups_iommu;
+		break;
+
 	case pci_fixup_enable:
 		start = __start_pci_fixups_enable;
 		end = __end_pci_fixups_enable;
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 71e387a..3da32d8 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -411,6 +411,9 @@
 	__start_pci_fixups_final = .;	\
 	KEEP(*(.pci_fixup_final))	\
 	__end_pci_fixups_final = .;	\
+	__start_pci_fixups_iommu = .;	\
+	KEEP(*(.pci_fixup_iommu))	\
+	__end_pci_fixups_iommu = .;	\
 	__start_pci_fixups_enable = .;	\
 	KEEP(*(.pci_fixup_enable))	\
 	__end_pci_fixups_enable = .;	\
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 83ce1cd..0d5fbf8 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1892,6 +1892,7 @@ enum pci_fixup_pass {
 	pci_fixup_early,	/* Before probing BARs */
 	pci_fixup_header,	/* After reading configuration header */
 	pci_fixup_final,	/* Final phase of device fixups */
+	pci_fixup_iommu,	/* After iommu_fwspec_init */
 	pci_fixup_enable,	/* pci_enable_device() time */
 	pci_fixup_resume,	/* pci_device_resume() */
 	pci_fixup_suspend,	/* pci_device_suspend() */
@@ -1934,6 +1935,10 @@ enum pci_fixup_pass {
 					 class_shift, hook)	\
 	DECLARE_PCI_FIXUP_SECTION(.pci_fixup_final,	\
 		hook, vendor, device, class, class_shift, hook)
+#define DECLARE_PCI_FIXUP_CLASS_IOMMU(vendor, device, class,	\
+					 class_shift, hook)	\
+	DECLARE_PCI_FIXUP_SECTION(.pci_fixup_iommu,	\
+		hook, vendor, device, class, class_shift, hook)
 #define DECLARE_PCI_FIXUP_CLASS_ENABLE(vendor, device, class,	\
 					 class_shift, hook)	\
 	DECLARE_PCI_FIXUP_SECTION(.pci_fixup_enable,	\
@@ -1964,6 +1969,9 @@ enum pci_fixup_pass {
 #define DECLARE_PCI_FIXUP_FINAL(vendor, device, hook)	\
 	DECLARE_PCI_FIXUP_SECTION(.pci_fixup_final,	\
 		hook, vendor, device, PCI_ANY_ID, 0, hook)
+#define DECLARE_PCI_FIXUP_IOMMU(vendor, device, hook)	\
+	DECLARE_PCI_FIXUP_SECTION(.pci_fixup_iommu,	\
+		hook, vendor, device, PCI_ANY_ID, 0, hook)
 #define DECLARE_PCI_FIXUP_ENABLE(vendor, device, hook)	\
 	DECLARE_PCI_FIXUP_SECTION(.pci_fixup_enable,	\
 		hook, vendor, device, PCI_ANY_ID, 0, hook)
--
2.7.4
[PATCH 0/2] Introduce PCI_FIXUP_IOMMU
Some platform devices appear as PCI but are actually on the AMBA bus,
and they need a fixup in drivers/pci/quirks.c that handles iommu_fwnode.
This series introduces PCI_FIXUP_IOMMU, which is called after
iommu_fwnode is allocated, instead of reusing PCI_FIXUP_FINAL: reusing
the final pass would slow down iommu probing, since all devices in the
final fixup list would be reprocessed. Suggested by Joerg [1].

For example, the HiSilicon platform device needs a fixup in
drivers/pci/quirks.c to set fwspec->can_stall, which is introduced in [2]:

+static void quirk_huawei_pcie_sva(struct pci_dev *pdev)
+{
+	struct iommu_fwspec *fwspec;
+
+	pdev->eetlp_prefix_path = 1;
+	fwspec = dev_iommu_fwspec_get(&pdev->dev);
+	if (fwspec)
+		fwspec->can_stall = 1;
+}
+
+DECLARE_PCI_FIXUP_IOMMU(PCI_VENDOR_ID_HUAWEI, 0xa250, quirk_huawei_pcie_sva);
+DECLARE_PCI_FIXUP_IOMMU(PCI_VENDOR_ID_HUAWEI, 0xa251, quirk_huawei_pcie_sva);

[1] https://www.spinics.net/lists/iommu/msg44591.html
[2] https://www.spinics.net/lists/linux-pci/msg94559.html

Zhangfei Gao (2):
  PCI: Introduce PCI_FIXUP_IOMMU
  iommu: calling pci_fixup_iommu in iommu_fwspec_init

 drivers/iommu/iommu.c             | 4 ++++
 drivers/pci/quirks.c              | 7 +++++++
 include/asm-generic/vmlinux.lds.h | 3 +++
 include/linux/pci.h               | 8 ++++++++
 4 files changed, 22 insertions(+)

-- 
2.7.4
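The motivation for a separate pass can be illustrated with a small user-space model. This is only a sketch, not kernel code: the `model_*` names, the single entry in the final list and its IDs are hypothetical, and 0x19e5 is assumed to be the value behind PCI_VENDOR_ID_HUAWEI. Each pass owns its own table, just as each kernel pass owns its own linker section, so running the IOMMU pass never walks the final list, and the quirk only takes effect once the (modelled) fwspec exists.

```c
#include <stddef.h>
#include <stdint.h>

/* Passes modelled here; the kernel's enum pci_fixup_pass has more. */
enum model_pass { MODEL_FIXUP_FINAL, MODEL_FIXUP_IOMMU };

struct model_dev {
	uint16_t vendor;
	uint16_t device;
	int fwspec_allocated;	/* 1 once the modelled fwspec exists */
	int can_stall;
};

struct model_fixup {
	uint16_t vendor;
	uint16_t device;
	void (*hook)(struct model_dev *dev);
};

/* Mirrors the shape of quirk_huawei_pcie_sva: a no-op without a fwspec. */
static void model_quirk_sva(struct model_dev *dev)
{
	if (dev->fwspec_allocated)
		dev->can_stall = 1;
}

/* Placeholder quirk so the final list is not empty (hypothetical). */
static void model_quirk_other(struct model_dev *dev)
{
	(void)dev;
}

static const struct model_fixup final_fixups[] = {
	{ 0x1234, 0x0001, model_quirk_other },	/* hypothetical IDs */
};

static const struct model_fixup iommu_fixups[] = {
	{ 0x19e5, 0xa250, model_quirk_sva },	/* vendor ID is an assumption */
	{ 0x19e5, 0xa251, model_quirk_sva },
};

/* Select the table for the pass, then run every matching hook. */
static int model_fixup_device(enum model_pass pass, struct model_dev *dev)
{
	const struct model_fixup *f, *end;
	int ran = 0;

	switch (pass) {
	case MODEL_FIXUP_FINAL:
		f = final_fixups;
		end = final_fixups + sizeof(final_fixups) / sizeof(final_fixups[0]);
		break;
	case MODEL_FIXUP_IOMMU:
		f = iommu_fixups;
		end = iommu_fixups + sizeof(iommu_fixups) / sizeof(iommu_fixups[0]);
		break;
	default:
		return 0;
	}

	for (; f < end; f++) {
		if (f->vendor == dev->vendor && f->device == dev->device) {
			f->hook(dev);
			ran++;
		}
	}
	return ran;
}
```

Calling `model_fixup_device(MODEL_FIXUP_IOMMU, &dev)` before `fwspec_allocated` is set runs the hook but leaves `can_stall` at 0, which is the "too early" case the series is trying to avoid.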
Re: [PATCH 0/2] Let pci_fixup_final access iommu_fwnode
Hi, Joerg

On 2020/5/12 12:08 PM, Zhangfei Gao wrote:
> Some platform devices appear as PCI but are actually on the AMBA bus,
> and they need fixup in drivers/pci/quirks.c handling iommu_fwnode.
> So calling pci_fixup_final after iommu_fwnode is allocated.
>
> For example:
> Hisilicon platform device need fixup in drivers/pci/quirks.c
>
> +static void quirk_huawei_pcie_sva(struct pci_dev *pdev)
> +{
> +	struct iommu_fwspec *fwspec;
> +
> +	pdev->eetlp_prefix_path = 1;
> +	fwspec = dev_iommu_fwspec_get(&pdev->dev);
> +	if (fwspec)
> +		fwspec->can_stall = 1;
> +}
> +
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa250, quirk_huawei_pcie_sva);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa251, quirk_huawei_pcie_sva);
>
> Zhangfei Gao (2):
>   iommu/of: Let pci_fixup_final access iommu_fwnode
>   ACPI/IORT: Let pci_fixup_final access iommu_fwnode
>
>  drivers/acpi/arm64/iort.c | 1 +
>  drivers/iommu/of_iommu.c  | 1 +
>  2 files changed, 2 insertions(+)

Would you mind giving any suggestions?

We need to access fwspec->can_stall, which describes whether the platform
device (a fake PCIe device) supports the stall feature.
can_stall will be used in arm_smmu_add_device [1].
Stall is not a PCI feature, so there is no such member in struct pci_dev.

iommu_fwnode is allocated in iommu_fwspec_init, called from
of_pci_iommu_init or iort_pci_iommu_init.
The pci_fixup_device(pci_fixup_final, dev) call in pci_bus_add_device is
too early: iommu_fwnode is not allocated yet.
The pci_fixup_device(pci_fixup_enable, dev) call in do_pci_enable_device
is too late: it runs after arm_smmu_add_device.

So the idea here is to call pci_fixup_device(pci_fixup_final) after
of_pci_iommu_init and iort_pci_iommu_init, where iommu_fwnode is allocated.

[1] https://www.spinics.net/lists/linux-pci/msg94559.html

Thanks
[PATCH 0/2] Let pci_fixup_final access iommu_fwnode
Some platform devices appear as PCI but are actually on the AMBA bus,
and they need a fixup in drivers/pci/quirks.c that handles iommu_fwnode.
So call pci_fixup_final after iommu_fwnode is allocated.

For example, the HiSilicon platform device needs a fixup in
drivers/pci/quirks.c:

+static void quirk_huawei_pcie_sva(struct pci_dev *pdev)
+{
+	struct iommu_fwspec *fwspec;
+
+	pdev->eetlp_prefix_path = 1;
+	fwspec = dev_iommu_fwspec_get(&pdev->dev);
+	if (fwspec)
+		fwspec->can_stall = 1;
+}
+
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa250, quirk_huawei_pcie_sva);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa251, quirk_huawei_pcie_sva);

Zhangfei Gao (2):
  iommu/of: Let pci_fixup_final access iommu_fwnode
  ACPI/IORT: Let pci_fixup_final access iommu_fwnode

 drivers/acpi/arm64/iort.c | 1 +
 drivers/iommu/of_iommu.c  | 1 +
 2 files changed, 2 insertions(+)

-- 
2.7.4
[PATCH 1/2] iommu/of: Let pci_fixup_final access iommu_fwnode
Call pci_fixup_final after of_pci_iommu_init, which allocates
iommu_fwnode.

Some platform devices appear as PCI but are actually on the AMBA bus,
and they need a fixup in drivers/pci/quirks.c that handles iommu_fwnode.
So call pci_fixup_final after iommu_fwnode is allocated.

Signed-off-by: Zhangfei Gao
---
 drivers/iommu/of_iommu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index 20738aac..c1b58c4 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -188,6 +188,7 @@ const struct iommu_ops *of_iommu_configure(struct device *dev,
 		pci_request_acs();
 		err = pci_for_each_dma_alias(to_pci_dev(dev),
 					     of_pci_iommu_init, &info);
+		pci_fixup_device(pci_fixup_final, to_pci_dev(dev));
 	} else if (dev_is_fsl_mc(dev)) {
 		err = of_fsl_mc_iommu_init(to_fsl_mc_device(dev), master_np);
 	} else {
-- 
2.7.4
[PATCH 2/2] ACPI/IORT: Let pci_fixup_final access iommu_fwnode
Call pci_fixup_final after iommu_fwspec_init, which allocates
iommu_fwnode.

Some platform devices appear as PCI but are actually on the AMBA bus,
and they need a fixup in drivers/pci/quirks.c that handles iommu_fwnode.
So call pci_fixup_final after iommu_fwnode is allocated.

Signed-off-by: Zhangfei Gao
---
 drivers/acpi/arm64/iort.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index 7d04424..02e361d 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -1027,6 +1027,7 @@ const struct iommu_ops *iort_iommu_configure(struct device *dev)
 		info.node = node;
 		err = pci_for_each_dma_alias(to_pci_dev(dev),
 					     iort_pci_iommu_init, &info);
+		pci_fixup_device(pci_fixup_final, to_pci_dev(dev));
 
 		fwspec = dev_iommu_fwspec_get(dev);
 		if (fwspec && iort_pci_rc_supports_ats(node))
-- 
2.7.4
[PATCH v6 1/3] uacce: Add documents for uacce
From: Kenneth Lee

Uacce (Unified/User-space-access-intended Accelerator Framework) is
a kernel module that aims to provide Shared Virtual Addressing (SVA)
between the accelerator and the process.

This patch adds a document explaining how it works.

Signed-off-by: Kenneth Lee
Signed-off-by: Zaibo Xu
Signed-off-by: Zhou Wang
Signed-off-by: Zhangfei Gao
---
 Documentation/misc-devices/uacce.rst | 297 +++
 1 file changed, 297 insertions(+)
 create mode 100644 Documentation/misc-devices/uacce.rst

diff --git a/Documentation/misc-devices/uacce.rst b/Documentation/misc-devices/uacce.rst
new file mode 100644
index 0000000..05c1e09
--- /dev/null
+++ b/Documentation/misc-devices/uacce.rst
@@ -0,0 +1,297 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Introduction of Uacce
+=====================
+
+Uacce (Unified/User-space-access-intended Accelerator Framework) targets to
+provide Shared Virtual Addressing (SVA) between accelerators and processes.
+So accelerator can access any data structure of the main cpu.
+This differs from the data sharing between cpu and io device, which share
+data content rather than address.
+Because of the unified address, hardware and user space of process can
+share the same virtual address in the communication.
+Uacce takes the hardware accelerator as a heterogeneous processor, while
+IOMMU share the same CPU page tables and as a result the same translation
+from va to pa.
+
+	 __________________________       __________________________
+	|                          |     |                          |
+	|  User application (CPU)  |     |   Hardware Accelerator   |
+	|__________________________|     |__________________________|
+
+	             |                                 |
+	             | va                              | va
+	             V                                 V
+	         __________                        __________
+	        |          |                      |          |
+	        |   MMU    |                      |  IOMMU   |
+	        |__________|                      |__________|
+	             |                                 |
+	             |                                 |
+	             V pa                              V pa
+	         ____________________________________________
+	        |                                            |
+	        |                   Memory                   |
+	        |____________________________________________|
+
+
+
+Architecture
+============
+
+Uacce is the kernel module, taking charge of iommu and address sharing.
+The user drivers and libraries are called WarpDrive.
+
+A virtual concept, queue, is used for the communication. It provides a
+FIFO-like interface. And it maintains a unified address space between the
+application and all involved hardware.
+ + ___ +| | user API | | +| WarpDrive library | > | user driver | +|___| || + || + || + | queue fd | + || + || + v| + ___ _| +| | | | | mmap memory +| Other framework | | uacce | | r/w interface +| crypto/nic/others | |_| | +|___| | + | || + | register | register | + | || + | || + |_ __ | + | | | | | | + - | Device Driver | | IOMMU | | + |_| |__| | + || + |
[PATCH v6 0/3] Add uacce module for Accelerator
Uacce (Unified/User-space-access-intended Accelerator Framework) aims to
provide Shared Virtual Addressing (SVA) between accelerators and processes,
so that an accelerator can access any data structure of the main CPU.
This differs from data sharing between the CPU and an I/O device, which
share data content rather than addresses.
Because of the unified address space, the hardware and the user space of a
process can share the same virtual addresses when communicating.

Uacce is intended to be used with Jean Philippe Brucker's SVA
patchset [1], which enables I/O-side page faults and PASID support.
We keep verifying against Jean's sva/current [2].
We also keep verifying against Eric's SMMUv3 Nested Stage patch [3].

This series and the related zip & qm driver:
https://github.com/Linaro/linux-kernel-warpdrive/tree/5.4-rc1-uacce-v6

The library and user application:
https://github.com/Linaro/warpdrive/tree/wdprd-v1-upstream

References:
[1] http://jpbrucker.net/sva/
[2] http://www.linux-arm.org/git?p=linux-jpb.git;a=shortlog;h=refs/heads/sva/current
[3] https://github.com/eauger/linux/tree/v5.3.0-rc0-2stage-v9

Change History:
v6:
Change sysfs qfrs_size to a different file, as suggested by Jonathan.
Fix the crypto daily build issue; based on the crypto code base and 5.4-rc1.

v5:
Add an example patch using the uacce interface, as suggested by Greg:
0003-crypto-hisilicon-register-zip-engine-to-uacce.patch

v4:
Based on 5.4-rc1.
For other drivers integrating uacce: if uacce is not compiled,
uacce_register returns an error and uacce_unregister is empty.
Simplify the uacce flag: UACCE_DEV_SVA.
Address Greg's comments:
fix the state machine, remove potential syslog output triggered from
user space, etc.

v3:
As recommended by Greg, use struct uacce_device instead of struct uacce,
and use struct *cdev in struct uacce_device. As a result, the cdev can be
released by itself when its refcount drops to 0, so the two structures are
decoupled and each maintains itself.
Also add dev.release for put_device.

v2:
Address comments from Greg and Jonathan.
Modify the uacce_register interface.
Drop noiommu mode for now.

v1:
1. Rebase to 5.3-rc1
2. Build on the iommu interface
3. Verify with Jean's sva and Eric's nested mode iommu.
4. The user library has developed a lot: it supports zlib, openssl, etc.
5. Move to misc first

RFC3:
https://lkml.org/lkml/2018/11/12/1951

RFC2:
https://lwn.net/Articles/763990/

Background of why Uacce:
A Von Neumann processor is not good at general data manipulation.
It is designed for control-bound rather than data-bound applications.
The latter need fewer control-path facilities and more, or more specific,
ALUs. So more and more heterogeneous processors, such as
encryption/decryption accelerators, TPUs, or
EDGE (Explicit Data Graph Execution) processors, are being introduced
to gain better performance or power efficiency for particular applications.

There are generally two ways to make use of these heterogeneous processors.

The first is to make them co-processors, just like the FPU.
This is good for some applications but has its own cons:
It changes the ISA permanently.
You must save all state elements when the process is switched out,
but most data-bound processors have a huge set of state elements.
It makes the kernel scheduler more complex.

The second is the accelerator.
It is treated as an I/O device from the CPU's point of view
(though it need not be one physically). The process, running on the CPU,
holds a context of the accelerator and sends instructions to it as if
it were calling a function or a thread running on the FPU.
The context is bound to the processor itself,
so the state elements remain in the hardware context until
the context is released.

We believe this is the core feature of an "Accelerator" vs. a co-processor
or other heterogeneous processors.

The intention of Uacce is to provide the basic facility to back this
scenario. Its first step is to make sure the accelerator and the process
can share the same address space, so that the accelerator ISA can directly
address any data structure of the main CPU.
This differs from data sharing between the CPU and an I/O device,
which share data content rather than addresses.
So it differs from the other DMA libraries.

In the future, we may add more facilities to support linking an accelerator
library into the main application, or managing the accelerator context as
a special thread.
Either way, this can be a solid starting point for a new processor
to be used as an "accelerator", as this is the essential requirement.

Kenneth Lee (2):
  uacce: Add documents for uacce
  uacce: add uacce driver

Zhangfei Gao (1):
  crypto: hisilicon - register zip engine to uacce

 Documentation/ABI/testing/sysfs-driver-uacce | 65 ++
 Documentation/misc-devices/uacce.rst         | 297
 drivers/crypto/hisilicon/qm.c                | 254 ++-
 drivers/crypto/hisilicon/qm.h                | 13 +-
 drivers/crypto/hisilicon/zip/zip_main.c      | 39 +-
 drivers/misc/Kconfig                         | 1 +
 drivers/misc/Makefile
[PATCH v5 3/3] crypto: hisilicon - register zip engine to uacce
Register the qm with uacce as an example user; this will be resubmitted
after uacce is merged.

Signed-off-by: Zhangfei Gao
Signed-off-by: Zhou Wang
---
 drivers/crypto/hisilicon/qm.c           | 254 ++--
 drivers/crypto/hisilicon/qm.h           | 13 +-
 drivers/crypto/hisilicon/zip/zip_main.c | 39 ++---
 include/uapi/misc/uacce/qm.h            | 15 ++
 4 files changed, 285 insertions(+), 36 deletions(-)
 create mode 100644 include/uapi/misc/uacce/qm.h

diff --git a/drivers/crypto/hisilicon/qm.c b/drivers/crypto/hisilicon/qm.c
index f975c39..60067d8 100644
--- a/drivers/crypto/hisilicon/qm.c
+++ b/drivers/crypto/hisilicon/qm.c
@@ -9,6 +9,9 @@
 #include
 #include
 #include
+#include
+#include
+#include
 #include "qm.h"
 
 /* eq/aeq irq enable */
@@ -459,17 +462,22 @@ static void qm_cq_head_update(struct hisi_qp *qp)
 
 static void qm_poll_qp(struct hisi_qp *qp, struct hisi_qm *qm)
 {
-	struct qm_cqe *cqe = qp->cqe + qp->qp_status.cq_head;
-
-	if (qp->req_cb) {
-		while (QM_CQE_PHASE(cqe) == qp->qp_status.cqc_phase) {
-			dma_rmb();
-			qp->req_cb(qp, qp->sqe + qm->sqe_size * cqe->sq_head);
-			qm_cq_head_update(qp);
-			cqe = qp->cqe + qp->qp_status.cq_head;
-			qm_db(qm, qp->qp_id, QM_DOORBELL_CMD_CQ,
-			      qp->qp_status.cq_head, 0);
-			atomic_dec(&qp->qp_status.used);
+	struct qm_cqe *cqe;
+
+	if (qp->event_cb) {
+		qp->event_cb(qp);
+	} else {
+		cqe = qp->cqe + qp->qp_status.cq_head;
+
+		if (qp->req_cb) {
+			while (QM_CQE_PHASE(cqe) == qp->qp_status.cqc_phase) {
+				dma_rmb();
+				qp->req_cb(qp, qp->sqe + qm->sqe_size *
+					   cqe->sq_head);
+				qm_cq_head_update(qp);
+				cqe = qp->cqe + qp->qp_status.cq_head;
+				atomic_dec(&qp->qp_status.used);
+			}
 		}
 
 		/* set c_flag */
@@ -1391,6 +1399,221 @@ static void hisi_qm_cache_wb(struct hisi_qm *qm)
 	}
 }
 
+static void qm_qp_event_notifier(struct hisi_qp *qp)
+{
+	wake_up_interruptible(&qp->uacce_q->wait);
+}
+
+static int hisi_qm_get_available_instances(struct uacce_device *uacce)
+{
+	int i, ret;
+	struct hisi_qm *qm = uacce->priv;
+
+	read_lock(&qm->qps_lock);
+	for (i = 0, ret = 0; i < qm->qp_num; i++)
+		if (!qm->qp_array[i])
+			ret++;
+	read_unlock(&qm->qps_lock);
+
+	return ret;
+}
+
+static int hisi_qm_uacce_get_queue(struct uacce_device *uacce,
+				   unsigned long arg,
+				   struct uacce_queue *q)
+{
+	struct hisi_qm *qm = uacce->priv;
+	struct hisi_qp *qp;
+	u8 alg_type = 0;
+
+	qp = hisi_qm_create_qp(qm, alg_type);
+	if (IS_ERR(qp))
+		return PTR_ERR(qp);
+
+	q->priv = qp;
+	q->uacce = uacce;
+	qp->uacce_q = q;
+	qp->event_cb = qm_qp_event_notifier;
+	qp->pasid = arg;
+
+	return 0;
+}
+
+static void hisi_qm_uacce_put_queue(struct uacce_queue *q)
+{
+	struct hisi_qp *qp = q->priv;
+
+	/*
+	 * As put_queue is only called in uacce_mode=1, and only one queue can
+	 * be used in this mode. we flush all sqc cache back in put queue.
+	 */
+	hisi_qm_cache_wb(qp->qm);
+
+	/* need to stop hardware, but can not support in v1 */
+	hisi_qm_release_qp(qp);
+}
+
+/* map sq/cq/doorbell to user space */
+static int hisi_qm_uacce_mmap(struct uacce_queue *q,
+			      struct vm_area_struct *vma,
+			      struct uacce_qfile_region *qfr)
+{
+	struct hisi_qp *qp = q->priv;
+	struct hisi_qm *qm = qp->qm;
+	size_t sz = vma->vm_end - vma->vm_start;
+	struct pci_dev *pdev = qm->pdev;
+	struct device *dev = &pdev->dev;
+	unsigned long vm_pgoff;
+	int ret;
+
+	switch (qfr->type) {
+	case UACCE_QFRT_MMIO:
+		if (qm->ver == QM_HW_V2) {
+			if (sz > PAGE_SIZE * (QM_DOORBELL_PAGE_NR +
+			    QM_DOORBELL_SQ_CQ_BASE_V2 / PAGE_SIZE))
+				return -EINVAL;
+		} else {
+			if (sz > PAGE_SIZE * QM_DOORBELL_PAGE_NR)
+				return -EINVAL;
+		}
+
+		vma->vm_flags |= VM_IO;
+
+		return remap_pfn_range(vma, vma->vm_start,
+				       qm->phys_base >> PAGE_SHIFT,
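The polling loop in qm_poll_qp relies on a phase bit: an entry counts as new only while its phase matches the phase the consumer currently expects. A minimal single-threaded user-space model of that idea follows. It is a sketch under assumptions: all `model_*` names and the queue depth are hypothetical, and the wrap-and-flip behaviour of the head update is assumed, since the body of qm_cq_head_update is not part of this patch. Memory barriers such as dma_rmb() and queue overrun are not modelled.

```c
#include <stdint.h>
#include <string.h>

#define MODEL_Q_DEPTH 4u	/* hypothetical depth, kept small for the example */

struct model_cqe {
	uint16_t sq_head;
	uint8_t phase;
};

struct model_cq {
	struct model_cqe cqe[MODEL_Q_DEPTH];
	uint16_t head;		/* consumer index */
	uint8_t phase;		/* phase value that marks a fresh entry */
	uint16_t last_sq_head;	/* sq_head of the last consumed entry */
};

/* Stand-in for the hardware side that writes completions. */
struct model_hw {
	uint16_t tail;
	uint8_t phase;
};

static void model_init(struct model_cq *cq, struct model_hw *hw)
{
	memset(cq, 0, sizeof(*cq));	/* all entries start with phase 0 */
	cq->phase = 1;			/* so nothing looks fresh yet */
	hw->tail = 0;
	hw->phase = 1;
}

/* Producer: write one completion, flip the phase on wrap-around. */
static void model_hw_complete(struct model_cq *cq, struct model_hw *hw,
			      uint16_t sq_head)
{
	cq->cqe[hw->tail].sq_head = sq_head;
	cq->cqe[hw->tail].phase = hw->phase;
	if (++hw->tail == MODEL_Q_DEPTH) {
		hw->tail = 0;
		hw->phase = !hw->phase;
	}
}

/* Consumer: advance head, flip the expected phase on wrap (assumed). */
static void model_cq_head_update(struct model_cq *cq)
{
	if (cq->head == MODEL_Q_DEPTH - 1) {
		cq->head = 0;
		cq->phase = !cq->phase;
	} else {
		cq->head++;
	}
}

/* Consume every fresh entry; return how many were handled. */
static unsigned int model_poll(struct model_cq *cq)
{
	unsigned int handled = 0;

	while (cq->cqe[cq->head].phase == cq->phase) {
		cq->last_sq_head = cq->cqe[cq->head].sq_head;
		model_cq_head_update(cq);
		handled++;
	}
	return handled;
}
```

Because stale entries from the previous lap carry the opposite phase, the loop stops on its own without a separate count of valid entries, provided the producer never laps the consumer.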
[PATCH v5 2/3] uacce: add uacce driver
From: Kenneth Lee

Uacce (Unified/User-space-access-intended Accelerator Framework) aims to
provide Shared Virtual Addressing (SVA) between accelerators and processes,
so that an accelerator can access any data structure of the main CPU.
This differs from data sharing between the CPU and an I/O device, which
share data content rather than addresses.
Because of the unified address space, the hardware and the user space of a
process can share the same virtual addresses when communicating.

Uacce creates a chrdev for every registration; a queue is allocated to
the process when the chrdev is opened. The process can then access the
hardware resource by interacting with the queue file. By mmapping the queue
file space into user space, the process can put requests directly to the
hardware without a syscall into kernel space.

Signed-off-by: Kenneth Lee
Signed-off-by: Zaibo Xu
Signed-off-by: Zhou Wang
Signed-off-by: Zhangfei Gao
---
 Documentation/ABI/testing/sysfs-driver-uacce | 47 ++
 drivers/misc/Kconfig                         | 1 +
 drivers/misc/Makefile                        | 1 +
 drivers/misc/uacce/Kconfig                   | 13 +
 drivers/misc/uacce/Makefile                  | 2 +
 drivers/misc/uacce/uacce.c                   | 974 +++
 include/linux/uacce.h                        | 167 +
 include/uapi/misc/uacce/uacce.h              | 34 +
 8 files changed, 1239 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-driver-uacce
 create mode 100644 drivers/misc/uacce/Kconfig
 create mode 100644 drivers/misc/uacce/Makefile
 create mode 100644 drivers/misc/uacce/uacce.c
 create mode 100644 include/linux/uacce.h
 create mode 100644 include/uapi/misc/uacce/uacce.h

diff --git a/Documentation/ABI/testing/sysfs-driver-uacce b/Documentation/ABI/testing/sysfs-driver-uacce
new file mode 100644
index 0000000..b1a2c60
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-driver-uacce
@@ -0,0 +1,47 @@
+What:           /sys/class/uacce/hisi_zip-/id
+Date:           Oct 2019
+KernelVersion:  5.5
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Id of the device.
+
+What:           /sys/class/uacce/hisi_zip-/api
+Date:           Oct 2019
+KernelVersion:  5.5
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Api of the device, used by application to match the correct driver
+
+What:           /sys/class/uacce/hisi_zip-/flags
+Date:           Oct 2019
+KernelVersion:  5.5
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Attributes of the device, see UACCE_DEV_xxx flag defined in uacce.h
+
+What:           /sys/class/uacce/hisi_zip-/available_instances
+Date:           Oct 2019
+KernelVersion:  5.5
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Available instances left of the device
+
+What:           /sys/class/uacce/hisi_zip-/algorithms
+Date:           Oct 2019
+KernelVersion:  5.5
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Algorithms supported by this accelerator
+
+What:           /sys/class/uacce/hisi_zip-/qfrs_size
+Date:           Oct 2019
+KernelVersion:  5.5
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Page size of each queue file regions
+
+What:           /sys/class/uacce/hisi_zip-/numa_distance
+Date:           Oct 2019
+KernelVersion:  5.5
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Distance of device node to cpu node
+
+What:           /sys/class/uacce/hisi_zip-/node_id
+Date:           Oct 2019
+KernelVersion:  5.5
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Id of the numa node
diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index c55b637..929feb0 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -481,4 +481,5 @@ source "drivers/misc/cxl/Kconfig"
 source "drivers/misc/ocxl/Kconfig"
 source "drivers/misc/cardreader/Kconfig"
 source "drivers/misc/habanalabs/Kconfig"
+source "drivers/misc/uacce/Kconfig"
 endmenu
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index c1860d3..9abf292 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -56,4 +56,5 @@ obj-$(CONFIG_OCXL)		+= ocxl/
 obj-y				+= cardreader/
 obj-$(CONFIG_PVPANIC)		+= pvpanic.o
 obj-$(CONFIG_HABANA_AI)		+= habanalabs/
+obj-$(CONFIG_UACCE)		+= uacce/
 obj-$(CONFIG_XILINX_SDFEC)	+= xilinx_sdfec.o
diff --git a/drivers/misc/uacce/Kconfig b/drivers/misc/uacce/Kconfig
new file mode 100644
index 0000000..e854354
--- /dev/null
+++ b/drivers/misc/uacce/Kconfig
@@ -0,0 +1,13 @@
+config UACCE
+	tristate "Accelerator Framework for User Land"
+	depends on IOMMU_API
+	help
+	  UACCE provides interface for the user process to access the hardware
+	  without interaction with the kernel space in data path.
+
+	  The user-space interface is described in
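The sysfs entries listed in this patch can be read from user space like any other sysfs attribute. The helper below is a minimal sketch, not part of the series: the function names are hypothetical, the device directory name is supplied by the caller (the archive elides the instance suffix after "hisi_zip-"), and it assumes the attribute holds a single decimal integer, which seems plausible for entries such as available_instances but is not confirmed by the text.

```c
#include <stdio.h>

/* Build "/sys/class/uacce/<dev_name>/<attr>"; returns 0, or -1 if it does not fit. */
static int uacce_attr_path(char *buf, size_t len,
			   const char *dev_name, const char *attr)
{
	int n = snprintf(buf, len, "/sys/class/uacce/%s/%s", dev_name, attr);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Read one integer attribute (assumed decimal); returns 0 on success, -1 on failure. */
static int uacce_read_int_attr(const char *dev_name, const char *attr,
			       long *out)
{
	char path[256];
	FILE *f;
	int ok;

	if (uacce_attr_path(path, sizeof(path), dev_name, attr))
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;	/* device or attribute not present */

	ok = (fscanf(f, "%ld", out) == 1);
	fclose(f);
	return ok ? 0 : -1;
}
```

An application could call `uacce_read_int_attr(name, "available_instances", &n)` before opening the character device, treating a failure as "no such accelerator on this system".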
[PATCH v5 0/3] Add uacce module for Accelerator
Uacce (Unified/User-space-access-intended Accelerator Framework) aims to
provide Shared Virtual Addressing (SVA) between accelerators and processes,
so that an accelerator can access any data structure of the main CPU.
This differs from data sharing between the CPU and an I/O device, which
share data content rather than addresses.
Because of the unified address space, the hardware and the user space of a
process can share the same virtual addresses when communicating.

Uacce is intended to be used with Jean Philippe Brucker's SVA
patchset [1], which enables I/O-side page faults and PASID support.
We keep verifying against Jean's sva/current [2].
We also keep verifying against Eric's SMMUv3 Nested Stage patch [3].

This series and the related zip & qm driver:
https://github.com/Linaro/linux-kernel-warpdrive/tree/5.4-rc1-uacce-v5

The library and user application:
https://github.com/Linaro/warpdrive/tree/wdprd-v1-upstream

References:
[1] http://jpbrucker.net/sva/
[2] http://www.linux-arm.org/git?p=linux-jpb.git;a=shortlog;h=refs/heads/sva/current
[3] https://github.com/eauger/linux/tree/v5.3.0-rc0-2stage-v9

Change History:
v5:
Add an example patch using the uacce interface, as suggested by Greg:
0003-crypto-hisilicon-register-zip-engine-to-uacce.patch

v4:
Based on 5.4-rc1.
For other drivers integrating uacce: if uacce is not compiled,
uacce_register returns an error and uacce_unregister is empty.
Simplify the uacce flag: UACCE_DEV_SVA.
Address Greg's comments:
fix the state machine, remove potential syslog output triggered from
user space, etc.

v3:
As recommended by Greg, use struct uacce_device instead of struct uacce,
and use struct *cdev in struct uacce_device. As a result, the cdev can be
released by itself when its refcount drops to 0, so the two structures are
decoupled and each maintains itself.
Also add dev.release for put_device.

v2:
Address comments from Greg and Jonathan.
Modify the uacce_register interface.
Drop noiommu mode for now.

v1:
1. Rebase to 5.3-rc1
2. Build on the iommu interface
3. Verify with Jean's sva and Eric's nested mode iommu.
4. The user library has developed a lot: it supports zlib, openssl, etc.
5. Move to misc first

RFC3:
https://lkml.org/lkml/2018/11/12/1951

RFC2:
https://lwn.net/Articles/763990/

Background of why Uacce:
A Von Neumann processor is not good at general data manipulation.
It is designed for control-bound rather than data-bound applications.
The latter need fewer control-path facilities and more, or more specific,
ALUs. So more and more heterogeneous processors, such as
encryption/decryption accelerators, TPUs, or
EDGE (Explicit Data Graph Execution) processors, are being introduced
to gain better performance or power efficiency for particular applications.

There are generally two ways to make use of these heterogeneous processors.

The first is to make them co-processors, just like the FPU.
This is good for some applications but has its own cons:
It changes the ISA permanently.
You must save all state elements when the process is switched out,
but most data-bound processors have a huge set of state elements.
It makes the kernel scheduler more complex.

The second is the accelerator.
It is treated as an I/O device from the CPU's point of view
(though it need not be one physically). The process, running on the CPU,
holds a context of the accelerator and sends instructions to it as if
it were calling a function or a thread running on the FPU.
The context is bound to the processor itself,
so the state elements remain in the hardware context until
the context is released.

We believe this is the core feature of an "Accelerator" vs. a co-processor
or other heterogeneous processors.

The intention of Uacce is to provide the basic facility to back this
scenario. Its first step is to make sure the accelerator and the process
can share the same address space, so that the accelerator ISA can directly
address any data structure of the main CPU.
This differs from data sharing between the CPU and an I/O device,
which share data content rather than addresses.
So it differs from the other DMA libraries.

In the future, we may add more facilities to support linking an accelerator
library into the main application, or managing the accelerator context as
a special thread.
Either way, this can be a solid starting point for a new processor
to be used as an "accelerator", as this is the essential requirement.

Kenneth Lee (2):
  uacce: Add documents for uacce
  uacce: add uacce driver

Zhangfei Gao (1):
  crypto: hisilicon - register zip engine to uacce

 Documentation/ABI/testing/sysfs-driver-uacce | 47 ++
 Documentation/misc-devices/uacce.rst         | 297
 drivers/crypto/hisilicon/qm.c                | 259 ++-
 drivers/crypto/hisilicon/qm.h                | 13 +-
 drivers/crypto/hisilicon/zip/zip_main.c      | 39 +-
 drivers/misc/Kconfig                         | 1 +
 drivers/misc/Makefile                        | 1 +
 drivers/misc/uacce/Kconfig                   | 13 +
 drivers/misc/uacce/Makefile                  | 2 +
 dr
[PATCH v5 1/3] uacce: Add documents for uacce
From: Kenneth Lee

Uacce (Unified/User-space-access-intended Accelerator Framework) is
a kernel module that aims to provide Shared Virtual Addressing (SVA)
between the accelerator and the process.

This patch adds a document explaining how it works.

Signed-off-by: Kenneth Lee
Signed-off-by: Zaibo Xu
Signed-off-by: Zhou Wang
Signed-off-by: Zhangfei Gao
---
 Documentation/misc-devices/uacce.rst | 297 +++
 1 file changed, 297 insertions(+)
 create mode 100644 Documentation/misc-devices/uacce.rst

diff --git a/Documentation/misc-devices/uacce.rst b/Documentation/misc-devices/uacce.rst
new file mode 100644
index 0000000..1ddf4ff
--- /dev/null
+++ b/Documentation/misc-devices/uacce.rst
@@ -0,0 +1,297 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Introduction of Uacce
+=====================
+
+Uacce (Unified/User-space-access-intended Accelerator Framework) targets to
+provide Shared Virtual Addressing (SVA) between accelerators and processes.
+So accelerator can access any data structure of the main cpu.
+This differs from the data sharing between cpu and io device, which share
+data content rather than address.
+Because of the unified address, hardware and user space of process can
+share the same virtual address in the communication.
+Uacce takes the hardware accelerator as a heterogeneous processor, while
+IOMMU share the same CPU page tables and as a result the same translation
+from va to pa.
+
+	 __________________________       __________________________
+	|                          |     |                          |
+	|  User application (CPU)  |     |   Hardware Accelerator   |
+	|__________________________|     |__________________________|
+
+	             |                                 |
+	             | va                              | va
+	             V                                 V
+	         __________                        __________
+	        |          |                      |          |
+	        |   MMU    |                      |  IOMMU   |
+	        |__________|                      |__________|
+	             |                                 |
+	             |                                 |
+	             V pa                              V pa
+	         ____________________________________________
+	        |                                            |
+	        |                   Memory                   |
+	        |____________________________________________|
+
+
+
+Architecture
+============
+
+Uacce is the kernel module, taking charge of iommu and address sharing.
+The user drivers and libraries are called WarpDrive.
+
+A virtual concept, queue, is used for the communication. It provides a
+FIFO-like interface. And it maintains a unified address space between the
+application and all involved hardware.
+ + ___ +| | user API | | +| WarpDrive library | > | user driver | +|___| || + || + || + | queue fd | + || + || + v| + ___ _| +| | | | | mmap memory +| Other framework | | uacce | | r/w interface +| crypto/nic/others | |_| | +|___| | + | || + | register | register | + | || + | || + |_ __ | + | | | | | | + - | Device Driver | | IOMMU | | + |_| |__| | + || + |
[RESEND PATCH v4 2/2] uacce: add uacce driver
From: Kenneth Lee

Uacce (Unified/User-space-access-intended Accelerator Framework) aims to
provide Shared Virtual Addressing (SVA) between accelerators and processes,
so that an accelerator can access any data structure of the main CPU.
This differs from data sharing between a CPU and an I/O device, which share
data content rather than addresses. Because the address space is unified,
the hardware and the user space of a process can share the same virtual
addresses when communicating.

Uacce creates a chrdev for every registration. A queue is allocated to the
process when the chrdev is opened, and the process can then access the
hardware resource by interacting with the queue file. By mmapping the queue
file space into user space, the process can put requests to the hardware
directly, without a syscall into kernel space.

Signed-off-by: Kenneth Lee
Signed-off-by: Zaibo Xu
Signed-off-by: Zhou Wang
Signed-off-by: Zhangfei Gao
---
 Documentation/ABI/testing/sysfs-driver-uacce |   47 ++
 drivers/misc/Kconfig                         |    1 +
 drivers/misc/Makefile                        |    1 +
 drivers/misc/uacce/Kconfig                   |   13 +
 drivers/misc/uacce/Makefile                  |    2 +
 drivers/misc/uacce/uacce.c                   | 1013 ++
 include/linux/uacce.h                        |  167 +
 include/uapi/misc/uacce.h                    |   36 +
 8 files changed, 1280 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-driver-uacce
 create mode 100644 drivers/misc/uacce/Kconfig
 create mode 100644 drivers/misc/uacce/Makefile
 create mode 100644 drivers/misc/uacce/uacce.c
 create mode 100644 include/linux/uacce.h
 create mode 100644 include/uapi/misc/uacce.h

diff --git a/Documentation/ABI/testing/sysfs-driver-uacce b/Documentation/ABI/testing/sysfs-driver-uacce
new file mode 100644
index 000..b7ff6af2
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-driver-uacce
@@ -0,0 +1,47 @@
+What:           /sys/class/uacce/hisi_zip-/id
+Date:           Oct 2019
+KernelVersion:  5.4
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Id of the device.
+
+What:           /sys/class/uacce/hisi_zip-/api
+Date:           Oct 2019
+KernelVersion:  5.4
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Api of the device, used by application to match the correct driver
+
+What:           /sys/class/uacce/hisi_zip-/flags
+Date:           Oct 2019
+KernelVersion:  5.4
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Attributes of the device, see UACCE_DEV_xxx flag defined in uacce.h
+
+What:           /sys/class/uacce/hisi_zip-/available_instances
+Date:           Oct 2019
+KernelVersion:  5.4
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Available instances left of the device
+
+What:           /sys/class/uacce/hisi_zip-/algorithms
+Date:           Oct 2019
+KernelVersion:  5.4
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Algorithms supported by this accelerator
+
+What:           /sys/class/uacce/hisi_zip-/qfrs_offset
+Date:           Oct 2019
+KernelVersion:  5.4
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Page offsets of each queue file regions
+
+What:           /sys/class/uacce/hisi_zip-/numa_distance
+Date:           Oct 2019
+KernelVersion:  5.4
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Distance of device node to cpu node
+
+What:           /sys/class/uacce/hisi_zip-/node_id
+Date:           Oct 2019
+KernelVersion:  5.4
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Id of the numa node
diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index c55b637..929feb0 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -481,4 +481,5 @@ source "drivers/misc/cxl/Kconfig"
 source "drivers/misc/ocxl/Kconfig"
 source "drivers/misc/cardreader/Kconfig"
 source "drivers/misc/habanalabs/Kconfig"
+source "drivers/misc/uacce/Kconfig"
 endmenu
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index c1860d3..9abf292 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -56,4 +56,5 @@ obj-$(CONFIG_OCXL)            += ocxl/
 obj-y                         += cardreader/
 obj-$(CONFIG_PVPANIC)         += pvpanic.o
 obj-$(CONFIG_HABANA_AI)       += habanalabs/
+obj-$(CONFIG_UACCE)           += uacce/
 obj-$(CONFIG_XILINX_SDFEC)    += xilinx_sdfec.o
diff --git a/drivers/misc/uacce/Kconfig b/drivers/misc/uacce/Kconfig
new file mode 100644
index 000..e854354
--- /dev/null
+++ b/drivers/misc/uacce/Kconfig
@@ -0,0 +1,13 @@
+config UACCE
+        tristate "Accelerator Framework for User Land"
+        depends on IOMMU_API
+        help
+          UACCE provides interface for the user process to access the hardware
+          without interaction with the kernel space in data path.
+
+          The user-space interface is desc
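The open-then-mmap flow described in the commit message above can be sketched from user space roughly as follows. This is only an illustration using plain POSIX calls: `struct queue_map`, `queue_map_open()` and `queue_map_close()` are hypothetical helper names, the path and region offset are supplied by the caller, and any device-specific setup a real uacce queue needs beyond open() and mmap() is not shown in this excerpt and is left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical bookkeeping for one mapped queue region. */
struct queue_map {
	int fd;      /* open descriptor; opening the file is what hands out the queue */
	void *base;  /* start of the shared mapping */
	size_t len;  /* length of the mapping in bytes */
};

/*
 * Open `path` read/write and map `len` bytes at the page-aligned byte
 * offset `offset` as a shared mapping. Returns 0 on success, -1 on failure.
 */
static int queue_map_open(struct queue_map *q, const char *path,
			  size_t len, off_t offset)
{
	q->fd = open(path, O_RDWR);
	if (q->fd < 0)
		return -1;

	q->base = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED,
		       q->fd, offset);
	if (q->base == MAP_FAILED) {
		close(q->fd);
		return -1;
	}
	q->len = len;
	return 0;
}

/* Drop the mapping first, then the descriptor that owns the queue. */
static void queue_map_close(struct queue_map *q)
{
	munmap(q->base, q->len);
	close(q->fd);
}
```

Once the region is mapped, requests are written through `q.base` with ordinary stores, which is the "no syscall in the data path" property the commit message refers to.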
[RESEND PATCH v4 1/2] uacce: Add documents for uacce
From: Kenneth Lee

Uacce (Unified/User-space-access-intended Accelerator Framework) is a
kernel module that aims to provide Shared Virtual Addressing (SVA) between
accelerators and processes.

This patch adds a document explaining how it works.

Signed-off-by: Kenneth Lee
Signed-off-by: Zaibo Xu
Signed-off-by: Zhou Wang
Signed-off-by: Zhangfei Gao
---
 Documentation/misc-devices/uacce.rst | 297 +++
 1 file changed, 297 insertions(+)
 create mode 100644 Documentation/misc-devices/uacce.rst

diff --git a/Documentation/misc-devices/uacce.rst b/Documentation/misc-devices/uacce.rst
new file mode 100644
index 000..b3cf0d5
--- /dev/null
+++ b/Documentation/misc-devices/uacce.rst
@@ -0,0 +1,297 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Introduction of Uacce
+=====================
+
+Uacce (Unified/User-space-access-intended Accelerator Framework) targets to
+provide Shared Virtual Addressing (SVA) between accelerators and processes.
+So accelerator can access any data structure of the main cpu.
+This differs from the data sharing between cpu and io device, which share
+data content rather than address.
+Because of the unified address, hardware and user space of process can
+share the same virtual address in the communication.
+Uacce takes the hardware accelerator as a heterogeneous processor, while
+IOMMU share the same CPU page tables and as a result the same translation
+from va to pa.
+
+         __________________________       __________________________
+        |                          |     |                          |
+        |  User application (CPU)  |     |   Hardware Accelerator   |
+        |__________________________|     |__________________________|
+
+                     |                                 |
+                     | va                              | va
+                     V                                 V
+                 __________                        __________
+                |          |                      |          |
+                |   MMU    |                      |  IOMMU   |
+                |__________|                      |__________|
+                     |                                 |
+                     |                                 |
+                     V pa                              V pa
+                 ____________________________________________
+                |                                            |
+                |                   Memory                   |
+                |____________________________________________|
+
+
+
+Architecture
+
+
+Uacce is the kernel module, taking charge of iommu and address sharing.
+The user drivers and libraries are called WarpDrive.
+
+A virtual concept, queue, is used for the communication. It provides a
+FIFO-like interface. And it maintains a unified address space between the
+application and all involved hardware.
+
+         ___________________             ___________________________________
+        |                   | user API  |                                   |
+        | WarpDrive library | --------> |            user driver            |
+        |___________________|           |___________________________________|
+                                             |                          |
+                                             |                          |
+                                             |         queue fd         |
+                                             |                          |
+                                             |                          |
+                                             v                          |
+         ___________________             _________                      |
+        |                   |           |         |                     | mmap memory
+        | Other framework   |           |  uacce  |                     | r/w interface
+        | crypto/nic/others |           |_________|                     |
+        |___________________|                                           |
+                  |                          |                          |
+                  | register                 | register                 |
+                  |                          |                          |
+                  |                          |                          |
+                  |                  _________________    _________     |
+                  |                 |                 |  |         |    |
+                   ---------------- |  Device Driver  |  |  IOMMU  |    |
+                                    |_________________|  |_________|    |
+                                             |                          |
+                                             |
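The first diagram in the document above says that the MMU and the IOMMU walk the same page tables, so a given va resolves to the same pa for the CPU and for the accelerator. A toy model of that idea, assuming a single-level table and 4 KiB pages, might look like the following. All names here (`struct page_table`, `cpu_translate()`, `dev_translate()`) are invented for the illustration and have nothing to do with the real MMU or IOMMU code.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_MASK  ((UINT64_C(1) << DEMO_PAGE_SHIFT) - 1)
#define DEMO_NR_PAGES   16

/* One toy page table: virtual page number -> physical page number (0 = unmapped). */
struct page_table {
	uint64_t ppn[DEMO_NR_PAGES];
};

/* Walk the table; returns 0 and stores the physical address in *pa on success. */
static int translate(const struct page_table *pt, uint64_t va, uint64_t *pa)
{
	uint64_t vpn = va >> DEMO_PAGE_SHIFT;

	if (vpn >= DEMO_NR_PAGES || pt->ppn[vpn] == 0)
		return -1;
	*pa = (pt->ppn[vpn] << DEMO_PAGE_SHIFT) | (va & DEMO_PAGE_MASK);
	return 0;
}

/* The "MMU" side: the CPU resolves va through the process page table. */
static int cpu_translate(const struct page_table *pt, uint64_t va, uint64_t *pa)
{
	return translate(pt, va, pa);
}

/* The "IOMMU" side: the device resolves va through the very same table. */
static int dev_translate(const struct page_table *pt, uint64_t va, uint64_t *pa)
{
	return translate(pt, va, pa);
}
```

Because both sides consult one table, a pointer stored in a shared data structure means the same thing to the application and to the device, which is what distinguishes address sharing from copying data content.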
[RESEND PATCH v4 0/2] Add uacce module for Accelerator
Uacce (Unified/User-space-access-intended Accelerator Framework) aims to
provide Shared Virtual Addressing (SVA) between accelerators and processes,
so that an accelerator can access any data structure of the main CPU.
This differs from data sharing between a CPU and an I/O device, which share
data content rather than addresses. Because the address space is unified,
the hardware and the user space of a process can share the same virtual
addresses when communicating.

Uacce is intended to be used with Jean-Philippe Brucker's SVA patchset [1],
which enables I/O-side page faults and PASID support.
We keep verifying with Jean's sva/current [2].
We also keep verifying with Eric's SMMUv3 Nested Stage patch [3].

This series and the related zip & qm drivers:
https://github.com/Linaro/linux-kernel-warpdrive/tree/5.4-rc1-uacce-v4

The library and user application:
https://github.com/Linaro/warpdrive/tree/wdprd-v1-upstream

References:
[1] http://jpbrucker.net/sva/
[2] http://www.linux-arm.org/git?p=linux-jpb.git;a=shortlog;h=refs/heads/sva/current
[3] https://github.com/eauger/linux/tree/v5.3.0-rc0-2stage-v9

Change History:
v4:
Based on 5.4-rc1.
Considering other drivers integrating uacce: if uacce is not compiled,
uacce_register returns an error and uacce_unregister is empty.
Simplify the uacce flag: UACCE_DEV_SVA.
Address Greg's comments: fix the state machine, remove potential syslog
triggered from user space, etc.

v3:
Recommended by Greg: use struct uacce_device instead of struct uacce, and
use struct *cdev in struct uacce_device. As a result, cdev can be released
by itself when its refcount drops to 0, so the two structures are decoupled
and self-maintained. Also add dev.release for put_device.

v2:
Address comments from Greg and Jonathan
Modify interface uacce_register
Drop noiommu mode first

v1:
1. Rebase to 5.3-rc1
2. Build on iommu interface
3. Verifying with Jean's sva and Eric's nested mode iommu.
4. User library has developed a lot: support zlib, openssl etc.
5. Move to misc first

RFC3:
https://lkml.org/lkml/2018/11/12/1951

RFC2:
https://lwn.net/Articles/763990/

Background of why Uacce:
A Von Neumann processor is not good at general data manipulation. It is
designed for control-bound rather than data-bound applications. The latter
need less control-path facility and more, or more specific, ALUs. So more
and more heterogeneous processors, such as encryption/decryption
accelerators, TPUs, or EDGE (Explicit Data Graph Execution) processors, are
being introduced to gain better performance or power efficiency for
particular applications.

There are generally two ways to make use of these heterogeneous processors:

The first is to make them co-processors, just like an FPU. This is good for
some applications, but it has its own cons: it changes the ISA permanently,
and all state elements must be saved when the process is switched out. Most
data-bound processors have a huge set of state elements, so this makes the
kernel scheduler more complex.

The second is the accelerator. It is treated as an I/O device from the
CPU's point of view (though it need not be one physically). The process,
running on the CPU, holds a context of the accelerator and sends
instructions to it as if it were calling a function or a thread running on
the FPU. The context is bound to the processor itself, so the state
elements remain in the hardware context until the context is released.

We believe this is the core feature of an "accelerator" versus a
co-processor or other heterogeneous processors.

The intention of Uacce is to provide the basic facility to support this
scenario. Its first step is to make sure the accelerator and the process
can share the same address space, so that the accelerator ISA can directly
address any data structure of the main CPU. This differs from data sharing
between a CPU and an I/O device, which share data content rather than
addresses. So it is different from other DMA libraries.

In the future, we may add more facilities to support linking an accelerator
library to the main application, or managing the accelerator context as a
special thread. But either way, this can be a solid starting point for a
new processor to be used as an "accelerator", as this is the essential
requirement.

Kenneth Lee (2):
  uacce: Add documents for uacce
  uacce: add uacce driver

 Documentation/ABI/testing/sysfs-driver-uacce |   47 ++
 Documentation/misc-devices/uacce.rst         |  297
 drivers/misc/Kconfig                         |    1 +
 drivers/misc/Makefile                        |    1 +
 drivers/misc/uacce/Kconfig                   |   13 +
 drivers/misc/uacce/Makefile                  |    2 +
 drivers/misc/uacce/uacce.c                   | 1013 ++
 include/linux/uacce.h                        |  167 +
 include/uapi/misc/uacce.h                    |   36 +
 9 files changed, 1577 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-driver-uacce
 create mode 100644 Documentation/misc-devices/uacce.rst
 create mode 100644
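The v4 change note in this cover letter (registration returns an error and unregistration is empty when uacce is not compiled) describes a common header pattern. A generic, compilable sketch of it follows. Every name in it (`DEMO_FRAMEWORK_ENABLED`, `struct demo_accel`, `demo_framework_register()`, and so on) is hypothetical; it is not the uacce header, and the real `uacce_register` signature is not shown in this excerpt.

```c
#include <errno.h>

/* Hypothetical device handle; not the struct used by the uacce patches. */
struct demo_accel {
	const char *name;
	int registered;
};

#ifdef DEMO_FRAMEWORK_ENABLED
/* With the framework built, the real implementations live in its module. */
int demo_framework_register(struct demo_accel *acc);
void demo_framework_unregister(struct demo_accel *acc);
#else
/* Framework not built: registering reports an error, unregistering does nothing. */
static inline int demo_framework_register(struct demo_accel *acc)
{
	(void)acc;
	return -ENODEV;
}

static inline void demo_framework_unregister(struct demo_accel *acc)
{
	(void)acc;
}
#endif

/* A driver probe that treats the user-space framework as optional. */
static int demo_probe(struct demo_accel *acc)
{
	int ret = demo_framework_register(acc);

	/* Remember whether registration worked; the device stays usable either way. */
	acc->registered = (ret == 0);
	return 0;
}

static void demo_remove(struct demo_accel *acc)
{
	if (acc->registered)
		demo_framework_unregister(acc);
	acc->registered = 0;
}
```

The point of the stubs is that a driver integrating the framework needs no `#ifdef` of its own: it calls the same two functions in both configurations and only checks the return value.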
[PATCH v4 2/2] uacce: add uacce driver
From: Kenneth Lee

Uacce (Unified/User-space-access-intended Accelerator Framework) aims to
provide Shared Virtual Addressing (SVA) between accelerators and processes,
so that an accelerator can access any data structure of the main CPU.
This differs from data sharing between a CPU and an I/O device, which share
data content rather than addresses. Because the address space is unified,
the hardware and the user space of a process can share the same virtual
addresses when communicating.

Uacce creates a chrdev for every registration. A queue is allocated to the
process when the chrdev is opened, and the process can then access the
hardware resource by interacting with the queue file. By mmapping the queue
file space into user space, the process can put requests to the hardware
directly, without a syscall into kernel space.

Signed-off-by: Kenneth Lee
Signed-off-by: Zaibo Xu
Signed-off-by: Zhou Wang
Signed-off-by: Zhangfei Gao
---
 Documentation/ABI/testing/sysfs-driver-uacce |   47 ++
 drivers/misc/Kconfig                         |    1 +
 drivers/misc/Makefile                        |    1 +
 drivers/misc/uacce/Kconfig                   |   13 +
 drivers/misc/uacce/Makefile                  |    2 +
 drivers/misc/uacce/uacce.c                   | 1028 ++
 include/linux/uacce.h                        |  156
 include/uapi/misc/uacce.h                    |   40 +
 8 files changed, 1288 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-driver-uacce
 create mode 100644 drivers/misc/uacce/Kconfig
 create mode 100644 drivers/misc/uacce/Makefile
 create mode 100644 drivers/misc/uacce/uacce.c
 create mode 100644 include/linux/uacce.h
 create mode 100644 include/uapi/misc/uacce.h

diff --git a/Documentation/ABI/testing/sysfs-driver-uacce b/Documentation/ABI/testing/sysfs-driver-uacce
new file mode 100644
index 000..563f55c
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-driver-uacce
@@ -0,0 +1,47 @@
+What:           /sys/class/uacce/hisi_zip-/id
+Date:           Sep 2019
+KernelVersion:  5.4
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Id of the device.
+
+What:           /sys/class/uacce/hisi_zip-/api
+Date:           Sep 2019
+KernelVersion:  5.4
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Api of the device, used by application to match the correct driver
+
+What:           /sys/class/uacce/hisi_zip-/flags
+Date:           Sep 2019
+KernelVersion:  5.4
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Attributes of the device, see UACCE_DEV_xxx flag defined in uacce.h
+
+What:           /sys/class/uacce/hisi_zip-/available_instances
+Date:           Sep 2019
+KernelVersion:  5.4
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Available instances left of the device
+
+What:           /sys/class/uacce/hisi_zip-/algorithms
+Date:           Sep 2019
+KernelVersion:  5.4
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Algorithms supported by this accelerator
+
+What:           /sys/class/uacce/hisi_zip-/qfrs_offset
+Date:           Sep 2019
+KernelVersion:  5.4
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Page offsets of each queue file regions
+
+What:           /sys/class/uacce/hisi_zip-/numa_distance
+Date:           Sep 2019
+KernelVersion:  5.4
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Distance of device node to cpu node
+
+What:           /sys/class/uacce/hisi_zip-/node_id
+Date:           Sep 2019
+KernelVersion:  5.4
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Id of the numa node
diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index 1690035..94d363c 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -503,4 +503,5 @@ source "drivers/misc/cxl/Kconfig"
 source "drivers/misc/ocxl/Kconfig"
 source "drivers/misc/cardreader/Kconfig"
 source "drivers/misc/habanalabs/Kconfig"
+source "drivers/misc/uacce/Kconfig"
 endmenu
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index abd8ae2..93a131b 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -58,4 +58,5 @@ obj-$(CONFIG_OCXL)            += ocxl/
 obj-y                         += cardreader/
 obj-$(CONFIG_PVPANIC)         += pvpanic.o
 obj-$(CONFIG_HABANA_AI)       += habanalabs/
+obj-$(CONFIG_UACCE)           += uacce/
 obj-$(CONFIG_XILINX_SDFEC)    += xilinx_sdfec.o
diff --git a/drivers/misc/uacce/Kconfig b/drivers/misc/uacce/Kconfig
new file mode 100644
index 000..e854354
--- /dev/null
+++ b/drivers/misc/uacce/Kconfig
@@ -0,0 +1,13 @@
+config UACCE
+        tristate "Accelerator Framework for User Land"
+        depends on IOMMU_API
+        help
+          UACCE provides interface for the user process to access the hardware
+          without interaction with the kernel space in data path.
+
+          The user-space interface is described in
+
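The sysfs attributes listed in the ABI file above are ordinary text files, so an application could read one with standard C I/O before deciding which user driver to load. A small sketch, assuming each attribute holds a single line of text: `read_attr()` is a hypothetical helper, and the device directory is passed in by the caller because the instance part of the path is elided in this excerpt.

```c
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of the attribute file `dir`/`name` into `buf`
 * (at most len - 1 bytes) and strip the trailing newline.
 * Returns 0 on success, -1 on failure.
 */
static int read_attr(const char *dir, const char *name, char *buf, size_t len)
{
	char path[512];
	FILE *f;
	int n = snprintf(path, sizeof(path), "%s/%s", dir, name);

	/* Reject paths that did not fit into the buffer. */
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;

	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	/* sysfs values normally end in '\n'; cut the string there. */
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}
```

A caller would pass the device's directory under /sys/class/uacce/ as `dir` and an attribute name from the list above, such as "api" or "algorithms", as `name`.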
[PATCH v4 0/2] Add uacce module for Accelerator
Uacce (Unified/User-space-access-intended Accelerator Framework) aims to
provide Shared Virtual Addressing (SVA) between accelerators and processes,
so that an accelerator can access any data structure of the main CPU.
This differs from data sharing between a CPU and an I/O device, which share
data content rather than addresses. Because the address space is unified,
the hardware and the user space of a process can share the same virtual
addresses when communicating.

Uacce is intended to be used with Jean-Philippe Brucker's SVA patchset [1],
which enables I/O-side page faults and PASID support.
We keep verifying with Jean's sva/current [2].
We also keep verifying with Eric's SMMUv3 Nested Stage patch [3].

This series and the related zip & qm drivers:
https://github.com/Linaro/linux-kernel-warpdrive/tree/5.3-uacce-v4

The library and user application:
https://github.com/Linaro/warpdrive/tree/wdprd-v1-upstream

References:
[1] http://jpbrucker.net/sva/
[2] http://www.linux-arm.org/git?p=linux-jpb.git;a=shortlog;h=refs/heads/sva/current
[3] https://github.com/eauger/linux/tree/v5.3.0-rc0-2stage-v9

Change History:
v4:
Based on 5.3.
Address Greg's comments: fix the state machine, remove potential syslog
triggered from user space, etc.

v3:
Recommended by Greg: use struct uacce_device instead of struct uacce, and
use struct *cdev in struct uacce_device. As a result, cdev can be released
by itself when its refcount drops to 0, so the two structures are decoupled
and self-maintained. Also add dev.release for put_device.

v2:
Address comments from Greg and Jonathan
Modify interface uacce_register
Drop noiommu mode first

v1:
1. Rebase to 5.3-rc1
2. Build on iommu interface
3. Verifying with Jean's sva and Eric's nested mode iommu.
4. User library has developed a lot: support zlib, openssl etc.
5. Move to misc first

RFC3:
https://lkml.org/lkml/2018/11/12/1951

RFC2:
https://lwn.net/Articles/763990/

Background of why Uacce:
A Von Neumann processor is not good at general data manipulation. It is
designed for control-bound rather than data-bound applications. The latter
need less control-path facility and more, or more specific, ALUs. So more
and more heterogeneous processors, such as encryption/decryption
accelerators, TPUs, or EDGE (Explicit Data Graph Execution) processors, are
being introduced to gain better performance or power efficiency for
particular applications.

There are generally two ways to make use of these heterogeneous processors:

The first is to make them co-processors, just like an FPU. This is good for
some applications, but it has its own cons: it changes the ISA permanently,
and all state elements must be saved when the process is switched out. Most
data-bound processors have a huge set of state elements, so this makes the
kernel scheduler more complex.

The second is the accelerator. It is treated as an I/O device from the
CPU's point of view (though it need not be one physically). The process,
running on the CPU, holds a context of the accelerator and sends
instructions to it as if it were calling a function or a thread running on
the FPU. The context is bound to the processor itself, so the state
elements remain in the hardware context until the context is released.

We believe this is the core feature of an "accelerator" versus a
co-processor or other heterogeneous processors.

The intention of Uacce is to provide the basic facility to support this
scenario. Its first step is to make sure the accelerator and the process
can share the same address space, so that the accelerator ISA can directly
address any data structure of the main CPU. This differs from data sharing
between a CPU and an I/O device, which share data content rather than
addresses. So it is different from other DMA libraries.

In the future, we may add more facilities to support linking an accelerator
library to the main application, or managing the accelerator context as a
special thread. But either way, this can be a solid starting point for a
new processor to be used as an "accelerator", as this is the essential
requirement.

Kenneth Lee (2):
  uacce: Add documents for uacce
  uacce: add uacce driver

 Documentation/ABI/testing/sysfs-driver-uacce |   47 ++
 Documentation/misc-devices/uacce.rst         |  309
 drivers/misc/Kconfig                         |    1 +
 drivers/misc/Makefile                        |    1 +
 drivers/misc/uacce/Kconfig                   |   13 +
 drivers/misc/uacce/Makefile                  |    2 +
 drivers/misc/uacce/uacce.c                   | 1038 ++
 include/linux/uacce.h                        |  156
 include/uapi/misc/uacce.h                    |   40 +
 9 files changed, 1607 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-driver-uacce
 create mode 100644 Documentation/misc-devices/uacce.rst
 create mode 100644 drivers/misc/uacce/Kconfig
 create mode 100644 drivers/misc/uacce/Makefile
 create mode 100644 drivers/misc/uacce/uacce.c
 create mode 100644 include/linux/uacce.h
 create mode 100644
[PATCH v4 1/2] uacce: Add documents for uacce
From: Kenneth Lee

Uacce (Unified/User-space-access-intended Accelerator Framework) is a
kernel module that aims to provide Shared Virtual Addressing (SVA) between
accelerators and processes.

This patch adds a document explaining how it works.

Signed-off-by: Kenneth Lee
Signed-off-by: Zaibo Xu
Signed-off-by: Zhou Wang
Signed-off-by: Zhangfei Gao
---
 Documentation/misc-devices/uacce.rst | 308 +++
 1 file changed, 308 insertions(+)
 create mode 100644 Documentation/misc-devices/uacce.rst

diff --git a/Documentation/misc-devices/uacce.rst b/Documentation/misc-devices/uacce.rst
new file mode 100644
index 000..4fd356e
--- /dev/null
+++ b/Documentation/misc-devices/uacce.rst
@@ -0,0 +1,308 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Introduction of Uacce
+=====================
+
+Uacce (Unified/User-space-access-intended Accelerator Framework) targets to
+provide Shared Virtual Addressing (SVA) between accelerators and processes.
+So accelerator can access any data structure of the main cpu.
+This differs from the data sharing between cpu and io device, which share
+data content rather than address.
+Because of the unified address, hardware and user space of process can
+share the same virtual address in the communication.
+Uacce takes the hardware accelerator as a heterogeneous processor, while
+IOMMU share the same CPU page tables and as a result the same translation
+from va to pa.
+
+         __________________________       __________________________
+        |                          |     |                          |
+        |  User application (CPU)  |     |   Hardware Accelerator   |
+        |__________________________|     |__________________________|
+
+                     |                                 |
+                     | va                              | va
+                     V                                 V
+                 __________                        __________
+                |          |                      |          |
+                |   MMU    |                      |  IOMMU   |
+                |__________|                      |__________|
+                     |                                 |
+                     |                                 |
+                     V pa                              V pa
+                 ____________________________________________
+                |                                            |
+                |                   Memory                   |
+                |____________________________________________|
+
+
+
+Architecture
+
+
+Uacce is the kernel module, taking charge of iommu and address sharing.
+The user drivers and libraries are called WarpDrive.
+
+A virtual concept, queue, is used for the communication. It provides a
+FIFO-like interface. And it maintains a unified address space between the
+application and all involved hardware.
+
+         ___________________             ___________________________________
+        |                   | user API  |                                   |
+        | WarpDrive library | --------> |            user driver            |
+        |___________________|           |___________________________________|
+                                             |                          |
+                                             |                          |
+                                             |         queue fd         |
+                                             |                          |
+                                             |                          |
+                                             v                          |
+         ___________________             _________                      |
+        |                   |           |         |                     | mmap memory
+        | Other framework   |           |  uacce  |                     | r/w interface
+        | crypto/nic/others |           |_________|                     |
+        |___________________|                                           |
+                  |                          |                          |
+                  | register                 | register                 |
+                  |                          |                          |
+                  |                          |                          |
+                  |                  _________________    _________     |
+                  |                 |                 |  |         |    |
+                   ---------------- |  Device Driver  |  |  IOMMU  |    |
+                                    |_________________|  |_________|    |
+                                             |                          |
+                                             |
[PATCH v3 1/2] uacce: Add documents for uacce
From: Kenneth Lee

Uacce (Unified/User-space-access-intended Accelerator Framework) is a
kernel module that aims to provide Shared Virtual Addressing (SVA) between
accelerators and processes.

This patch adds a document explaining how it works.

Signed-off-by: Kenneth Lee
Signed-off-by: Zaibo Xu
Signed-off-by: Zhou Wang
Signed-off-by: Zhangfei Gao
---
 Documentation/misc-devices/uacce.rst | 309 +++
 1 file changed, 309 insertions(+)
 create mode 100644 Documentation/misc-devices/uacce.rst

diff --git a/Documentation/misc-devices/uacce.rst b/Documentation/misc-devices/uacce.rst
new file mode 100644
index 000..211f796
--- /dev/null
+++ b/Documentation/misc-devices/uacce.rst
@@ -0,0 +1,309 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Introduction of Uacce
+=====================
+
+Uacce (Unified/User-space-access-intended Accelerator Framework) targets to
+provide Shared Virtual Addressing (SVA) between accelerators and processes.
+So accelerator can access any data structure of the main cpu.
+This differs from the data sharing between cpu and io device, which share
+data content rather than address.
+Because of the unified address, hardware and user space of process can
+share the same virtual address in the communication.
+Uacce takes the hardware accelerator as a heterogeneous processor, while
+IOMMU share the same CPU page tables and as a result the same translation
+from va to pa.
+
+         __________________________       __________________________
+        |                          |     |                          |
+        |  User application (CPU)  |     |   Hardware Accelerator   |
+        |__________________________|     |__________________________|
+
+                     |                                 |
+                     | va                              | va
+                     V                                 V
+                 __________                        __________
+                |          |                      |          |
+                |   MMU    |                      |  IOMMU   |
+                |__________|                      |__________|
+                     |                                 |
+                     |                                 |
+                     V pa                              V pa
+                 ____________________________________________
+                |                                            |
+                |                   Memory                   |
+                |____________________________________________|
+
+
+
+Architecture
+
+
+Uacce is the kernel module, taking charge of iommu and address sharing.
+The user drivers and libraries are called WarpDrive.
+
+A virtual concept, queue, is used for the communication. It provides a
+FIFO-like interface. And it maintains a unified address space between the
+application and all involved hardware.
+
+         ___________________             ___________________________________
+        |                   | user API  |                                   |
+        | WarpDrive library | --------> |            user driver            |
+        |___________________|           |___________________________________|
+                                             |                          |
+                                             |                          |
+                                             |         queue fd         |
+                                             |                          |
+                                             |                          |
+                                             v                          |
+         ___________________             _________                      |
+        |                   |           |         |                     | mmap memory
+        | Other framework   |           |  uacce  |                     | r/w interface
+        | crypto/nic/others |           |_________|                     |
+        |___________________|                                           |
+                  |                          |                          |
+                  | register                 | register                 |
+                  |                          |                          |
+                  |                          |                          |
+                  |                  _________________    _________     |
+                  |                 |                 |  |         |    |
+                   ---------------- |  Device Driver  |  |  IOMMU  |    |
+                                    |_________________|  |_________|    |
+                                             |                          |
+                                             |
[PATCH v3 2/2] uacce: add uacce driver
From: Kenneth Lee

Uacce (Unified/User-space-access-intended Accelerator Framework) aims to
provide Shared Virtual Addressing (SVA) between accelerators and processes,
so that an accelerator can access any data structure of the main CPU.
This differs from data sharing between a CPU and an I/O device, which share
data content rather than addresses. Because the address space is unified,
the hardware and the user space of a process can share the same virtual
addresses when communicating.

Uacce creates a chrdev for every registration. A queue is allocated to the
process when the chrdev is opened, and the process can then access the
hardware resource by interacting with the queue file. By mmapping the queue
file space into user space, the process can put requests to the hardware
directly, without a syscall into kernel space.

Signed-off-by: Kenneth Lee
Signed-off-by: Zaibo Xu
Signed-off-by: Zhou Wang
Signed-off-by: Zhangfei Gao
---
 Documentation/ABI/testing/sysfs-driver-uacce |   47 ++
 drivers/misc/Kconfig                         |    1 +
 drivers/misc/Makefile                        |    1 +
 drivers/misc/uacce/Kconfig                   |   13 +
 drivers/misc/uacce/Makefile                  |    2 +
 drivers/misc/uacce/uacce.c                   | 1096 ++
 include/linux/uacce.h                        |  172
 include/uapi/misc/uacce.h                    |   39 +
 8 files changed, 1371 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-driver-uacce
 create mode 100644 drivers/misc/uacce/Kconfig
 create mode 100644 drivers/misc/uacce/Makefile
 create mode 100644 drivers/misc/uacce/uacce.c
 create mode 100644 include/linux/uacce.h
 create mode 100644 include/uapi/misc/uacce.h

diff --git a/Documentation/ABI/testing/sysfs-driver-uacce b/Documentation/ABI/testing/sysfs-driver-uacce
new file mode 100644
index 000..ee0a66e
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-driver-uacce
@@ -0,0 +1,47 @@
+What:           /sys/class/uacce/hisi_zip-/id
+Date:           Sep 2019
+KernelVersion:  5.3
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Id of the device.
+
+What:           /sys/class/uacce/hisi_zip-/api
+Date:           Sep 2019
+KernelVersion:  5.3
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Api of the device, used by application to match the correct driver
+
+What:           /sys/class/uacce/hisi_zip-/flags
+Date:           Sep 2019
+KernelVersion:  5.3
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Attributes of the device, see UACCE_DEV_xxx flag defined in uacce.h
+
+What:           /sys/class/uacce/hisi_zip-/available_instances
+Date:           Sep 2019
+KernelVersion:  5.3
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Available instances left of the device
+
+What:           /sys/class/uacce/hisi_zip-/algorithms
+Date:           Sep 2019
+KernelVersion:  5.3
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Algorithms supported by this accelerator
+
+What:           /sys/class/uacce/hisi_zip-/qfrs_offset
+Date:           Sep 2019
+KernelVersion:  5.3
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Page offsets of each queue file regions
+
+What:           /sys/class/uacce/hisi_zip-/numa_distance
+Date:           Sep 2019
+KernelVersion:  5.3
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Distance of device node to cpu node
+
+What:           /sys/class/uacce/hisi_zip-/node_id
+Date:           Sep 2019
+KernelVersion:  5.3
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Id of the numa node
diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index 6abfc8e..8073eb8 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -502,4 +502,5 @@ source "drivers/misc/cxl/Kconfig"
 source "drivers/misc/ocxl/Kconfig"
 source "drivers/misc/cardreader/Kconfig"
 source "drivers/misc/habanalabs/Kconfig"
+source "drivers/misc/uacce/Kconfig"
 endmenu
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index abd8ae2..93a131b 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -58,4 +58,5 @@ obj-$(CONFIG_OCXL)            += ocxl/
 obj-y                         += cardreader/
 obj-$(CONFIG_PVPANIC)         += pvpanic.o
 obj-$(CONFIG_HABANA_AI)       += habanalabs/
+obj-$(CONFIG_UACCE)           += uacce/
 obj-$(CONFIG_XILINX_SDFEC)    += xilinx_sdfec.o
diff --git a/drivers/misc/uacce/Kconfig b/drivers/misc/uacce/Kconfig
new file mode 100644
index 000..e854354
--- /dev/null
+++ b/drivers/misc/uacce/Kconfig
@@ -0,0 +1,13 @@
+config UACCE
+        tristate "Accelerator Framework for User Land"
+        depends on IOMMU_API
+        help
+          UACCE provides interface for the user process to access the hardware
+          without interaction with the kernel space in data path.
+
+          The user-space interface is described in
+
[PATCH v3 0/2] Add uacce module for Accelerator
Uacce (Unified/User-space-access-intended Accelerator Framework) aims to
provide Shared Virtual Addressing (SVA) between accelerators and processes,
so that an accelerator can access any data structure of the main CPU.
This differs from data sharing between a CPU and an I/O device, which share
data content rather than addresses. Because the address space is unified,
the hardware and the user space of a process can share the same virtual
addresses when communicating.

Uacce is intended to be used with Jean-Philippe Brucker's SVA patchset [1],
which enables I/O-side page faults and PASID support.
We keep verifying with Jean's sva/current [2].
We also keep verifying with Eric's SMMUv3 Nested Stage patch [3].

This series and the related zip & qm drivers:
https://github.com/Linaro/linux-kernel-warpdrive/tree/5.3-rc1-warpdrive-v3

The library and user application:
https://github.com/Linaro/warpdrive/tree/wdprd-v1-upstream

References:
[1] http://jpbrucker.net/sva/
[2] http://www.linux-arm.org/git?p=linux-jpb.git;a=shortlog;h=refs/heads/sva/current
[3] https://github.com/eauger/linux/tree/v5.3.0-rc0-2stage-v9

Change History:
v3:
Recommended by Greg: use struct uacce_device instead of struct uacce, and
use struct *cdev in struct uacce_device. As a result, cdev can be released
by itself when its refcount drops to 0, so the two structures are decoupled
and self-maintained. Also add dev.release for put_device.

v2:
Address comments from Greg and Jonathan
Modify interface uacce_register
Drop noiommu mode first

v1:
1. Rebase to 5.3-rc1
2. Build on iommu interface
3. Verifying with Jean's sva and Eric's nested mode iommu.
4. User library has developed a lot: support zlib, openssl etc.
5. Move to misc first

RFC3:
https://lkml.org/lkml/2018/11/12/1951

RFC2:
https://lwn.net/Articles/763990/

Background of why Uacce:
A Von Neumann processor is not good at general data manipulation. It is
designed for control-bound rather than data-bound applications. The latter
need less control-path facility and more, or more specific, ALUs. So more
and more heterogeneous processors, such as encryption/decryption
accelerators, TPUs, or EDGE (Explicit Data Graph Execution) processors, are
being introduced to gain better performance or power efficiency for
particular applications.

There are generally two ways to make use of these heterogeneous processors:

The first is to make them co-processors, just like an FPU. This is good for
some applications, but it has its own cons: it changes the ISA permanently,
and all state elements must be saved when the process is switched out. Most
data-bound processors have a huge set of state elements, so this makes the
kernel scheduler more complex.

The second is the accelerator. It is treated as an I/O device from the
CPU's point of view (though it need not be one physically). The process,
running on the CPU, holds a context of the accelerator and sends
instructions to it as if it were calling a function or a thread running on
the FPU. The context is bound to the processor itself, so the state
elements remain in the hardware context until the context is released.

We believe this is the core feature of an "accelerator" versus a
co-processor or other heterogeneous processors.

The intention of Uacce is to provide the basic facility to support this
scenario. Its first step is to make sure the accelerator and the process
can share the same address space, so that the accelerator ISA can directly
address any data structure of the main CPU. This differs from data sharing
between a CPU and an I/O device, which share data content rather than
addresses. So it is different from other DMA libraries.

In the future, we may add more facilities to support linking an accelerator
library to the main application, or managing the accelerator context as a
special thread. But either way, this can be a solid starting point for a
new processor to be used as an "accelerator", as this is the essential
requirement.

Kenneth Lee (2):
  uacce: Add documents for uacce
  uacce: add uacce driver

 Documentation/ABI/testing/sysfs-driver-uacce |   47 ++
 Documentation/misc-devices/uacce.rst         |  309
 drivers/misc/Kconfig                         |    1 +
 drivers/misc/Makefile                        |    1 +
 drivers/misc/uacce/Kconfig                   |   13 +
 drivers/misc/uacce/Makefile                  |    2 +
 drivers/misc/uacce/uacce.c                   | 1094 ++
 include/linux/uacce.h                        |  172
 include/uapi/misc/uacce.h                    |   39 +
 9 files changed, 1678 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-driver-uacce
 create mode 100644 Documentation/misc-devices/uacce.rst
 create mode 100644 drivers/misc/uacce/Kconfig
 create mode 100644 drivers/misc/uacce/Makefile
 create mode 100644 drivers/misc/uacce/uacce.c
 create mode 100644 include/linux/uacce.h
 create mode 100644 include/uapi/misc/uacce.h
--
2.7.4
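The v3 change note in this cover letter relies on a refcount plus a release callback, so that an object frees itself when the last reference is dropped. A minimal user-space analogue of that idea is sketched below. The names (`struct demo_object`, `demo_get()`, `demo_put()`) are hypothetical; this is not the kernel's kref, `get_device()` or `put_device()` implementation, and the counter is deliberately not atomic.

```c
#include <stddef.h>

struct demo_object {
	int refcount;                          /* not atomic: single-threaded sketch only */
	void (*release)(struct demo_object *); /* runs once, when the count reaches zero */
};

/* A new object starts with one reference held by its creator. */
static void demo_init(struct demo_object *obj,
		      void (*release)(struct demo_object *))
{
	obj->refcount = 1;
	obj->release = release;
}

static void demo_get(struct demo_object *obj)
{
	obj->refcount++;
}

/* Drop one reference; the last put triggers the release callback. */
static void demo_put(struct demo_object *obj)
{
	if (--obj->refcount == 0 && obj->release)
		obj->release(obj);
}

/* Example release callback that only records that it ran. */
static int demo_release_calls;

static void demo_count_release(struct demo_object *obj)
{
	(void)obj;
	demo_release_calls++;
}
```

Giving each structure its own count and release callback is what lets two objects be torn down independently, which is the decoupling the change note describes.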
[PATCH v2 1/2] uacce: Add documents for uacce
From: Kenneth Lee

Uacce (Unified/User-space-access-intended Accelerator Framework) is a
kernel module that aims to provide Shared Virtual Addressing (SVA) between
accelerators and processes.

This patch adds a document explaining how it works.

Signed-off-by: Kenneth Lee
Signed-off-by: Zaibo Xu
Signed-off-by: Zhou Wang
Signed-off-by: Zhangfei Gao
---
 Documentation/misc-devices/uacce.rst | 309 +++
 1 file changed, 309 insertions(+)
 create mode 100644 Documentation/misc-devices/uacce.rst

diff --git a/Documentation/misc-devices/uacce.rst b/Documentation/misc-devices/uacce.rst
new file mode 100644
index 000..a2cbd00
--- /dev/null
+++ b/Documentation/misc-devices/uacce.rst
@@ -0,0 +1,309 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Introduction of Uacce
+=====================
+
+Uacce (Unified/User-space-access-intended Accelerator Framework) targets to
+provide Shared Virtual Addressing (SVA) between accelerators and processes.
+So accelerator can access any data structure of the main cpu.
+This differs from the data sharing between cpu and io device, which share
+data content rather than address.
+Because of the unified address, hardware and user space of process can
+share the same virtual address in the communication.
+Uacce takes the hardware accelerator as a heterogeneous processor, while
+IOMMU share the same CPU page tables and as a result the same translation
+from va to pa.
+
+         __________________________       __________________________
+        |                          |     |                          |
+        |  User application (CPU)  |     |   Hardware Accelerator   |
+        |__________________________|     |__________________________|
+
+                     |                                 |
+                     | va                              | va
+                     V                                 V
+                 __________                        __________
+                |          |                      |          |
+                |   MMU    |                      |  IOMMU   |
+                |__________|                      |__________|
+                     |                                 |
+                     |                                 |
+                     V pa                              V pa
+                 ____________________________________________
+                |                                            |
+                |                   Memory                   |
+                |____________________________________________|
+
+
+
+Architecture
+
+
+Uacce is the kernel module, taking charge of iommu and address sharing.
+The user drivers and libraries are called WarpDrive.
+
+A virtual concept, queue, is used for the communication. It provides a
+FIFO-like interface. And it maintains a unified address space between the
+application and all involved hardware.
+
+         ___________________             ___________________________________
+        |                   | user API  |                                   |
+        | WarpDrive library | --------> |            user driver            |
+        |___________________|           |___________________________________|
+                                             |                          |
+                                             |                          |
+                                             |         queue fd         |
+                                             |                          |
+                                             |                          |
+                                             v                          |
+         ___________________             _________                      |
+        |                   |           |         |                     | mmap memory
+        | Other framework   |           |  uacce  |                     | r/w interface
+        | crypto/nic/others |           |_________|                     |
+        |___________________|                                           |
+                  |                          |                          |
+                  | register                 | register                 |
+                  |                          |                          |
+                  |                          |                          |
+                  |                  _________________    _________     |
+                  |                 |                 |  |         |    |
+                   ---------------- |  Device Driver  |  |  IOMMU  |    |
+                                    |_________________|  |_________|    |
+                                             |                          |
+                                             |
[PATCH v2 2/2] uacce: add uacce driver
From: Kenneth Lee

Uacce (Unified/User-space-access-intended Accelerator Framework) aims to
provide Shared Virtual Addressing (SVA) between accelerators and processes,
so that an accelerator can access any data structure of the main CPU.
This differs from the data sharing between a CPU and an IO device, which
share data content rather than addresses.
Because the address space is unified, the hardware and the user space of
the process can share the same virtual addresses when communicating.

Uacce creates a chrdev for every registration. A queue is allocated to
the process when the chrdev is opened, and the process can then access
the hardware resource by interacting with the queue file. By mmapping the
queue file space into user space, the process can put requests to the
hardware directly, without a syscall into kernel space.

Signed-off-by: Kenneth Lee
Signed-off-by: Zaibo Xu
Signed-off-by: Zhou Wang
Signed-off-by: Zhangfei Gao
---
 Documentation/ABI/testing/sysfs-driver-uacce |   47 ++
 drivers/misc/Kconfig                         |    1 +
 drivers/misc/Makefile                        |    1 +
 drivers/misc/uacce/Kconfig                   |   13 +
 drivers/misc/uacce/Makefile                  |    2 +
 drivers/misc/uacce/uacce.c                   | 1086 ++
 include/linux/uacce.h                        |  172
 include/uapi/misc/uacce.h                    |   39 +
 8 files changed, 1361 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-driver-uacce
 create mode 100644 drivers/misc/uacce/Kconfig
 create mode 100644 drivers/misc/uacce/Makefile
 create mode 100644 drivers/misc/uacce/uacce.c
 create mode 100644 include/linux/uacce.h
 create mode 100644 include/uapi/misc/uacce.h

diff --git a/Documentation/ABI/testing/sysfs-driver-uacce b/Documentation/ABI/testing/sysfs-driver-uacce
new file mode 100644
index 000..44e2f69
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-driver-uacce
@@ -0,0 +1,47 @@
+What:           /sys/class/uacce/hisi_zip-/id
+Date:           Aug 2019
+KernelVersion:  5.3
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Id of the device.
+
+What:           /sys/class/uacce/hisi_zip-/api
+Date:           Aug 2019
+KernelVersion:  5.3
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Api of the device, used by application to match the correct driver
+
+What:           /sys/class/uacce/hisi_zip-/flags
+Date:           Aug 2019
+KernelVersion:  5.3
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Attributes of the device, see UACCE_DEV_xxx flag defined in uacce.h
+
+What:           /sys/class/uacce/hisi_zip-/available_instances
+Date:           Aug 2019
+KernelVersion:  5.3
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Available instances left of the device
+
+What:           /sys/class/uacce/hisi_zip-/algorithms
+Date:           Aug 2019
+KernelVersion:  5.3
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Algorithms supported by this accelerator
+
+What:           /sys/class/uacce/hisi_zip-/qfrs_offset
+Date:           Aug 2019
+KernelVersion:  5.3
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Page offsets of each queue file regions
+
+What:           /sys/class/uacce/hisi_zip-/numa_distance
+Date:           Aug 2019
+KernelVersion:  5.3
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Distance of device node to cpu node
+
+What:           /sys/class/uacce/hisi_zip-/node_id
+Date:           Aug 2019
+KernelVersion:  5.3
+Contact:        linux-accelerat...@lists.ozlabs.org
+Description:    Id of the numa node
diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index 6abfc8e..8073eb8 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -502,4 +502,5 @@ source "drivers/misc/cxl/Kconfig"
 source "drivers/misc/ocxl/Kconfig"
 source "drivers/misc/cardreader/Kconfig"
 source "drivers/misc/habanalabs/Kconfig"
+source "drivers/misc/uacce/Kconfig"
 endmenu
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index abd8ae2..93a131b 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -58,4 +58,5 @@ obj-$(CONFIG_OCXL)		+= ocxl/
 obj-y				+= cardreader/
 obj-$(CONFIG_PVPANIC)		+= pvpanic.o
 obj-$(CONFIG_HABANA_AI)		+= habanalabs/
+obj-$(CONFIG_UACCE)		+= uacce/
 obj-$(CONFIG_XILINX_SDFEC)	+= xilinx_sdfec.o
diff --git a/drivers/misc/uacce/Kconfig b/drivers/misc/uacce/Kconfig
new file mode 100644
index 000..e854354
--- /dev/null
+++ b/drivers/misc/uacce/Kconfig
@@ -0,0 +1,13 @@
+config UACCE
+	tristate "Accelerator Framework for User Land"
+	depends on IOMMU_API
+	help
+	  UACCE provides interface for the user process to access the hardware
+	  without interaction with the kernel space in data path.
+
+	  The user-space interface is described in
+
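The open-then-mmap flow described in the commit message above can be sketched from the user-space side roughly as follows. This is only an illustration under stated assumptions: the device node path, the region length, and the use of page offset 0 are hypothetical, `struct queue_map`, `queue_open_and_map()` and `queue_close()` are made-up helper names, and any ioctl setup the real driver may require is left out.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical bookkeeping for one opened queue. */
struct queue_map {
	int fd;      /* queue file descriptor, -1 when not open */
	void *base;  /* start of the mapped region, NULL when unmapped */
	size_t len;  /* length of the mapped region in bytes */
};

/*
 * Open the character device (one open corresponds to one queue in the
 * description above) and map `len` bytes of the queue file, starting at
 * offset 0, into the caller's address space.
 * Returns 0 on success, -1 on failure with *out left in the closed state.
 */
int queue_open_and_map(const char *dev_path, size_t len,
		       struct queue_map *out)
{
	out->fd = -1;
	out->base = NULL;
	out->len = 0;

	int fd = open(dev_path, O_RDWR);
	if (fd < 0)
		return -1;

	void *base = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED,
			  fd, 0);
	if (base == MAP_FAILED) {
		close(fd);
		return -1;
	}

	out->fd = fd;
	out->base = base;
	out->len = len;
	return 0;
}

/* Unmap and close; safe to call on a queue_map that was never opened. */
void queue_close(struct queue_map *q)
{
	if (q->base != NULL)
		munmap(q->base, q->len);
	if (q->fd >= 0)
		close(q->fd);
	q->fd = -1;
	q->base = NULL;
	q->len = 0;
}
```

Once the mapping exists, requests would be written through `base` without a syscall per request, which is the data-path property the commit message is after.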
[PATCH v2 0/2] Add uacce module for Accelerator
Uacce (Unified/User-space-access-intended Accelerator Framework) aims to
provide Shared Virtual Addressing (SVA) between accelerators and processes,
so that an accelerator can access any data structure of the main CPU.
This differs from the data sharing between a CPU and an IO device, which
share data content rather than addresses.
Because the address space is unified, the hardware and the user space of
the process can share the same virtual addresses when communicating.

Uacce is intended to be used with Jean Philippe Brucker's SVA patchset [1],
which enables IO-side page faults and PASID support.
We keep verifying against Jean's sva/current [2].
We also keep verifying against Eric's SMMUv3 Nested Stage patch [3].

This series and the related zip & qm drivers:
https://github.com/Linaro/linux-kernel-warpdrive/tree/5.3-rc1-warpdrive-v2

The library and user application:
https://github.com/Linaro/warpdrive/tree/5.3-rc1-v2

References:
[1] http://jpbrucker.net/sva/
[2] http://www.linux-arm.org/git?p=linux-jpb.git;a=shortlog;h=refs/heads/sva/current
[3] https://github.com/eauger/linux/tree/v5.3.0-rc0-2stage-v9

Change History:
v2:
Address comments from Greg and Jonathan
Modify interface uacce_register
Drop noiommu mode first

v1:
1. Rebase to 5.3-rc1
2. Build on iommu interface
3. Verifying with Jean's sva and Eric's nested mode iommu.
4. User library has developed a lot: support zlib, openssl etc.
5. Move to misc first

RFC3:
https://lkml.org/lkml/2018/11/12/1951

RFC2:
https://lwn.net/Articles/763990/

Background of why Uacce:
A Von Neumann processor is not good at general data manipulation.
It is designed for control-bound rather than data-bound applications.
The latter need less control-path facility and more, or more specific, ALUs.
So more and more heterogeneous processors, such as
encryption/decryption accelerators, TPUs, or
EDGE (Explicit Data Graph Execution) processors, are being introduced
to gain better performance or power efficiency for particular applications.

There are generally two ways to make use of these heterogeneous processors:

The first is to make them co-processors, just like the FPU.
This is good for some applications, but it has its own cons:
It changes the ISA set permanently.
You must save all state elements when the process is switched out,
but most data-bound processors have a huge set of state elements.
It makes the kernel scheduler more complex.

The second is the accelerator.
It is treated as an IO device from the CPU's point of view
(although it need not be one physically). The process, running on the CPU,
holds a context of the accelerator and sends instructions to it as if
it were calling a function or a thread running on the FPU.
The context is bound to the processor itself,
so the state elements remain in the hardware context until
the context is released.

We believe this is the core feature of an "Accelerator" vs. a co-processor
or other heterogeneous processors.

The intention of Uacce is to provide the basic facility to support this
scenario. Its first step is to make sure the accelerator and the process
can share the same address space, so that the accelerator ISA can directly
address any data structure of the main CPU.
This differs from the data sharing between a CPU and an IO device,
which share data content rather than addresses.
That is what sets it apart from the other DMA libraries.

In the future, we may add more facilities to support linking an accelerator
library into the main application, or managing the accelerator context as
a special thread.
Either way, this can be a solid starting point for a new processor
to be used as an "accelerator", as this is the essential requirement.

Kenneth Lee (2):
  uacce: Add documents for uacce
  uacce: add uacce driver

 Documentation/ABI/testing/sysfs-driver-uacce |   47 ++
 Documentation/misc-devices/uacce.rst         |  335
 drivers/misc/Kconfig                         |    1 +
 drivers/misc/Makefile                        |    1 +
 drivers/misc/uacce/Kconfig                   |   13 +
 drivers/misc/uacce/Makefile                  |    2 +
 drivers/misc/uacce/uacce.c                   | 1086 ++
 include/linux/uacce.h                        |  172
 include/uapi/misc/uacce.h                    |   39 +
 9 files changed, 1696 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-driver-uacce
 create mode 100644 Documentation/misc-devices/uacce.rst
 create mode 100644 drivers/misc/uacce/Kconfig
 create mode 100644 drivers/misc/uacce/Makefile
 create mode 100644 drivers/misc/uacce/uacce.c
 create mode 100644 include/linux/uacce.h
 create mode 100644 include/uapi/misc/uacce.h

--
2.7.4
[PATCH 2/2] uacce: add uacce module
From: Kenneth Lee

Uacce is the kernel component that supports the WarpDrive accelerator
framework. It provides a register/unregister interface for device drivers
to expose their hardware resources to user space. Such a resource is
treated as a "queue" in WarpDrive.

Uacce creates a chrdev for every registration. A queue is allocated to
the process when the chrdev is opened, and the process can then access
the hardware resource by interacting with the queue file. By mmapping the
queue file space into user space, the process can put requests to the
hardware directly, without a syscall into kernel space.

Uacce also manages unified addresses between the hardware and the user
space of the process, so they can share the same virtual addresses when
communicating.

Signed-off-by: Kenneth Lee
Signed-off-by: Zaibo Xu
Signed-off-by: Zhou Wang
Signed-off-by: Zhangfei Gao
---
 drivers/misc/Kconfig        |    1 +
 drivers/misc/Makefile       |    1 +
 drivers/misc/uacce/Kconfig  |   13 +
 drivers/misc/uacce/Makefile |    2 +
 drivers/misc/uacce/uacce.c  | 1186 +++
 include/linux/uacce.h       |  109
 include/uapi/misc/uacce.h   |   44 ++
 7 files changed, 1356 insertions(+)
 create mode 100644 drivers/misc/uacce/Kconfig
 create mode 100644 drivers/misc/uacce/Makefile
 create mode 100644 drivers/misc/uacce/uacce.c
 create mode 100644 include/linux/uacce.h
 create mode 100644 include/uapi/misc/uacce.h

diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index 6abfc8e..8073eb8 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -502,4 +502,5 @@ source "drivers/misc/cxl/Kconfig"
 source "drivers/misc/ocxl/Kconfig"
 source "drivers/misc/cardreader/Kconfig"
 source "drivers/misc/habanalabs/Kconfig"
+source "drivers/misc/uacce/Kconfig"
 endmenu
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index abd8ae2..93a131b 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -58,4 +58,5 @@ obj-$(CONFIG_OCXL)		+= ocxl/
 obj-y				+= cardreader/
 obj-$(CONFIG_PVPANIC)		+= pvpanic.o
 obj-$(CONFIG_HABANA_AI)		+= habanalabs/
+obj-$(CONFIG_UACCE)		+= uacce/
 obj-$(CONFIG_XILINX_SDFEC)	+= xilinx_sdfec.o
diff --git a/drivers/misc/uacce/Kconfig b/drivers/misc/uacce/Kconfig
new file mode 100644
index 000..569669c
--- /dev/null
+++ b/drivers/misc/uacce/Kconfig
@@ -0,0 +1,13 @@
+config UACCE
+	tristate "Accelerator Framework for User Land"
+	depends on IOMMU_API
+	help
+	  UACCE provides interface for the user process to access the hardware
+	  without interaction with the kernel space in data path.
+
+	  The user-space interface is described in
+	  include/uapi/misc/uacce.h
+
+	  See Documentation/misc-devices/warpdrive.rst for more details.
+
+	  If you don't know what to do here, say N.
diff --git a/drivers/misc/uacce/Makefile b/drivers/misc/uacce/Makefile
new file mode 100644
index 000..5b4374e
--- /dev/null
+++ b/drivers/misc/uacce/Makefile
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0-or-later
+obj-$(CONFIG_UACCE) += uacce.o
diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c
new file mode 100644
index 000..43e0c9b
--- /dev/null
+++ b/drivers/misc/uacce/uacce.c
@@ -0,0 +1,1186 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+static struct class *uacce_class;
+static DEFINE_IDR(uacce_idr);
+static dev_t uacce_devt;
+static DEFINE_MUTEX(uacce_mutex); /* mutex to protect uacce */
+
+/* lock to protect all queues management */
+static DECLARE_RWSEM(uacce_qs_lock);
+#define uacce_qs_rlock() down_read(&uacce_qs_lock)
+#define uacce_qs_runlock() up_read(&uacce_qs_lock)
+#define uacce_qs_wlock() down_write(&uacce_qs_lock)
+#define uacce_qs_wunlock() up_write(&uacce_qs_lock)
+
+static const struct file_operations uacce_fops;
+
+/* match with enum uacce_qfrt */
+static const char *const qfrt_str[] = {
+	"mmio",
+	"dko",
+	"dus",
+	"ss",
+	"invalid"
+};
+
+static const char *uacce_qfrt_str(struct uacce_qfile_region *qfr)
+{
+	enum uacce_qfrt type = qfr->type;
+
+	if (type > UACCE_QFRT_INVALID)
+		type = UACCE_QFRT_INVALID;
+
+	return qfrt_str[type];
+}
+
+/**
+ * uacce_wake_up - Wake up the process who is waiting this queue
+ * @q the accelerator queue to wake up
+ */
+void uacce_wake_up(struct uacce_queue *q)
+{
+	dev_dbg(&q->uacce->dev, "wake up\n");
+	wake_up_interruptible(&q->wait);
+}
+EXPORT_SYMBOL_GPL(uacce_wake_up);
+
+static int uacce_queue_map_qfr(struct uacce_queue *q,
+			       struct uacce_qfile_region *qfr)
+{
+	struct device *dev = q->uacce->pdev;
+	struct iommu_domain *domain = iommu_
[PATCH 1/2] uacce: Add documents for WarpDrive/uacce
From: Kenneth Lee

WarpDrive is a general accelerator framework that lets a user application
access the hardware without going through the kernel in the data path.

The kernel component that gives drivers the facility to expose the user
interface is called uacce, short for
"Unified/User-space-access-intended Accelerator Framework".

This patch adds a document explaining how it works.

Signed-off-by: Kenneth Lee
Signed-off-by: Zaibo Xu
Signed-off-by: Zhou Wang
Signed-off-by: Zhangfei Gao
---
 Documentation/misc-devices/warpdrive.rst | 351 +++
 1 file changed, 351 insertions(+)
 create mode 100644 Documentation/misc-devices/warpdrive.rst

diff --git a/Documentation/misc-devices/warpdrive.rst b/Documentation/misc-devices/warpdrive.rst
new file mode 100644
index 000..14e5939
--- /dev/null
+++ b/Documentation/misc-devices/warpdrive.rst
@@ -0,0 +1,351 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Introduction of WarpDrive
+=========================
+
+*WarpDrive* is a general accelerator framework for the user application to
+communicate with the hardware without going through the kernel in data path.
+
+It can be used as a quick channel for accelerators, network adaptors or
+other hardware for application in user space.
+
+It may also make some exist solution simpler. E.g. you can reuse most of the
+*netdev* driver in kernel and just share some ring buffer to the user space
+driver for *DPDK* or *ODP*. Or you can combine the RSA accelerator
+with the *netdev* in the user space as a https reverse proxy, etc.
+
+*WarpDrive* takes the hardware accelerator as a heterogeneous processor which
+can share particular load from the CPU:
+
+         __________________________       __________________________
+        |                          |     |                          |
+        |  User application (CPU)  |     |   Hardware Accelerator   |
+        |__________________________|     |__________________________|
+
+                     |                                 |
+                     |                                 |
+                     V                                 V
+                 __________                        __________
+                |          |                      |          |
+                |   MMU    |                      |  IOMMU   |
+                |__________|                      |__________|
+                      \                                /
+                       \                              /
+                        \                            /
+                 _______________________________________
+                |                                       |
+                |                Memory                 |
+                |_______________________________________|
+
+
+
+Architecture
+============
+
+*WarpDrive* includes general user libraries, kernel management modules
+and drivers for the hardware. In kernel, the management module
+is called *uacce*, meaning "Unified/User-space-access-intended
+Accelerator Framework".
+
+A virtual concept, queue, is used for the communication. It provides a
+FIFO-like interface. And it maintains a unified address space between the
+application and all involved hardware.
+

[Architecture diagram garbled and cut off in the archive. Recoverable
labels: "WarpDrive library" connected by "user API" to "user driver";
"queue fd" leading down to "uacce"; "mmap memory r/w interface" on the
user driver side; "Other framework crypto/nic/others"; "register" links
from both down to a "Device ..." box. The message is truncated here.]
[PATCH 0/2] A General Accelerator Framework, WarpDrive
*WarpDrive* is a general accelerator framework that lets a user application
access the hardware without going through the kernel in the data path.

WarpDrive is the name for the whole framework. The component in the kernel
is called uacce, meaning "Unified/User-space-access-intended Accelerator
Framework". It makes use of the capability of the IOMMU to maintain a
unified virtual address space between the hardware and the process.

WarpDrive is intended to be used with Jean Philippe Brucker's SVA
patchset [1], which enables IO-side page faults and PASID support.
We keep verifying against Jean's sva/current [2].
We also keep verifying against Eric's SMMUv3 Nested Stage patch [3].

This series and the related zip & qm drivers, as well as a dummy driver
for qemu testing:
https://github.com/Linaro/linux-kernel-warpdrive/tree/5.3-rc1-warpdrive-v1
The zip driver has already been upstreamed;
uacce support for zip will be the next step.

The library and user application:
https://github.com/Linaro/warpdrive/tree/wdprd-v1-current

Change History:
v4 changed from V3
1. Rebase to 5.3-rc1
2. Build on iommu interface
3. Verifying with Jean's sva and Eric's nested mode iommu.
4. User library has developed a lot: support zlib, openssl etc.
5. Move to misc first

V3 changed from V2:
https://lkml.org/lkml/2018/11/12/1951
1. Build uacce on the original IOMMU interface. V2 was built on VFIO,
   but the VFIO approach of locking user memory in place does not work
   properly if the process forks a child, because the copy-on-write
   strategy makes the parent process lose its pages. This is not
   acceptable to accelerator users.
2. The kernel component is renamed from sdmdev to uacce accordingly.
3. The document is updated for the new design. The Static Shared
   Virtual Memory concept is introduced to replace the User Memory
   Sharing concept.
4. Rebase to the latest kernel (4.20.0-rc1)
5. As an RFC, this version is tested only with "test-to-pass" test cases
   and is not tested with Jean's SVA patch.

V2 changed from V1:
https://lwn.net/Articles/763990/
1. Change the kernel framework name from SPIMDEV (Share Parent IOMMU
   Mdev) to SDMDEV (Share Domain Mdev).
2. Allocate the hardware resource when a new mdev is created (it was
   previously allocated when the mdev was opened).
3. Unmap pages from the shared domain when the sdmdev iommu group is
   detached. (This procedure is necessary, but was missed in V1.)
4. Update the document accordingly.
5. Rebase to the latest kernel (4.19.0-rc1)

References:
[1] http://jpbrucker.net/sva/
[2] http://www.linux-arm.org/git?p=linux-jpb.git;a=shortlog;h=refs/heads/sva/current
[3] https://github.com/eauger/linux/tree/v5.3.0-rc0-2stage-v9

Kenneth Lee (2):
  uacce: Add documents for WarpDrive/uacce
  uacce: add uacce module

 Documentation/misc-devices/warpdrive.rst |  351 +
 drivers/misc/Kconfig                     |    1 +
 drivers/misc/Makefile                    |    1 +
 drivers/misc/uacce/Kconfig               |   13 +
 drivers/misc/uacce/Makefile              |    2 +
 drivers/misc/uacce/uacce.c               | 1186 ++
 include/linux/uacce.h                    |  109 +++
 include/uapi/misc/uacce.h                |   44 ++
 8 files changed, 1707 insertions(+)
 create mode 100644 Documentation/misc-devices/warpdrive.rst
 create mode 100644 drivers/misc/uacce/Kconfig
 create mode 100644 drivers/misc/uacce/Makefile
 create mode 100644 drivers/misc/uacce/uacce.c
 create mode 100644 include/linux/uacce.h
 create mode 100644 include/uapi/misc/uacce.h

--
2.7.4
Re: [PATCH 01/11] hisi_sas: add v2 hw support for ECC and AXI bus fatal error
On Wed, Nov 23, 2016 at 4:59 PM, John Garry <john.ga...@huawei.com> wrote:
> On 16/11/2016 01:47, Zhangfei Gao wrote:
>>
>> On Mon, Nov 7, 2016 at 8:48 PM, John Garry <john.ga...@huawei.com> wrote:
>>>
>>> From: Xiang Chen <chenxian...@hisilicon.com>

Reviewed-by: Zhangfei Gao <zhangfei@linaro.org>

>>>
>>> For ECC 1bit error, logic can recover it, so we only print
>>> a warning.
>>> For ECC multi-bit and AXI bus fatal error, we panic.
>>
>>
>> Is it possible to recover via resetting phy and device etc instead of
>> panic?
>>
>> Thanks
>>
>>
>
>
> Hi Zhangfei,
>
> We are actually now working on supporting controller reset for certain
> AXI/ECC errors, so that we will not need to panic.

Got it, thanks for the info.

Thanks
Re: [PATCH 05/11] hisi_sas: replace WARN_ON() with dev_warn() for internal abort
On Mon, Nov 7, 2016 at 8:48 PM, John Garry <john.ga...@huawei.com> wrote:
> From: Xiang Chen <chenxian...@hisilicon.com>
>
> Replace WARN_ON() with dev_warn() print when internal abort fails.
>
> Signed-off-by: Xiang Chen <chenxian...@hisilicon.com>
> Signed-off-by: John Garry <john.ga...@huawei.com>

Reviewed-by: Zhangfei Gao <zhangfei@linaro.org>

Sorry, I missed this one.
Re: [PATCH 06/11] hisi_sas: modify return value of hisi_sas_query_task()
On Mon, Nov 7, 2016 at 8:48 PM, John Garry <john.ga...@huawei.com> wrote:
> From: Xiang Chen <chenxian...@hisilicon.com>
>
> sas_scsi_find_task() only deals with return value
> TMF_RESP_FUNC_FAILED/TMF_RESP_FUNC_SUCC/TMF_RESP_FUNC_COMPLETE of
> query task. So for LLDD errors just return TMF_RESP_FUNC_FAILED.
>
> Signed-off-by: Xiang Chen <chenxian...@hisilicon.com>
> Signed-off-by: John Garry <john.ga...@huawei.com>

Reviewed-by: Zhangfei Gao <zhangfei@linaro.org>
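The rule in the quoted commit message (pass through the three response codes the caller understands, collapse everything else to "failed") might look roughly like the sketch below. It is not the code from the patch: `normalize_query_task_rc()` is a hypothetical helper, and the numeric values are assumed to mirror the libsas definitions.

```c
/* Assumed to mirror the libsas TMF response codes. */
enum {
	TMF_RESP_FUNC_COMPLETE = 0x00,
	TMF_RESP_FUNC_FAILED   = 0x05,
	TMF_RESP_FUNC_SUCC     = 0x08,
};

/*
 * Hypothetical helper: keep the three codes the caller handles and map
 * any other value (for example a negative errno from the low-level
 * driver) to TMF_RESP_FUNC_FAILED.
 */
int normalize_query_task_rc(int rc)
{
	switch (rc) {
	case TMF_RESP_FUNC_COMPLETE:
	case TMF_RESP_FUNC_FAILED:
	case TMF_RESP_FUNC_SUCC:
		return rc;
	default:
		return TMF_RESP_FUNC_FAILED;
	}
}
```

With this shape the caller never sees a value outside the three it knows how to act on.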
Re: [PATCH 09/11] hisi_sas: check SATA FIS when directly attaching SATA device
On Mon, Nov 7, 2016 at 8:48 PM, John Garry <john.ga...@huawei.com> wrote:
> From: Xiang Chen <chenxian...@hisilicon.com>
>
> Check ERR bit of status to decide whether there is something wrong with
> initial register-D2H FIS. If error exists, PHY reset the channel to
> restart OOB.
>
> Signed-off-by: Xiang Chen <chenxian...@hisilicon.com>
> Signed-off-by: John Garry <john.ga...@huawei.com>

Reviewed-by: Zhangfei Gao <zhangfei@linaro.org>
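As a rough illustration of the check described in the quoted commit message, the sketch below tests the ERR bit in the status byte of a Register Device-to-Host FIS. It is not the driver's code: `d2h_fis_has_error()` is a hypothetical helper, and it assumes the standard SATA layout in which the status byte is byte 2 of the FIS and ERR is bit 0 of the status.

```c
#include <stdbool.h>
#include <stdint.h>

#define D2H_FIS_STATUS_BYTE 2     /* status field offset within the FIS */
#define ATA_STATUS_ERR      0x01  /* ERR is bit 0 of the ATA status */

/*
 * Hypothetical helper: returns true when the initial D2H FIS reports an
 * error, in which case the caller would reset the PHY to restart OOB.
 */
bool d2h_fis_has_error(const uint8_t *fis)
{
	return (fis[D2H_FIS_STATUS_BYTE] & ATA_STATUS_ERR) != 0;
}
```

A caller would pass the first bytes of the received FIS and trigger the PHY reset only when the helper returns true.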
Re: [PATCH 11/11] hisi_sas: add PHY set linkrate support for v1 and v2 hw
On Mon, Nov 7, 2016 at 8:48 PM, John Garry <john.ga...@huawei.com> wrote:
> From: Xiang Chen <chenxian...@hisilicon.com>
>
> Add the function to set PHY min and max linkrate through
> sysfs interface.
>
> Signed-off-by: Xiang Chen <chenxian...@hisilicon.com>
> Signed-off-by: John Garry <john.ga...@huawei.com>

Reviewed-by: Zhangfei Gao <zhangfei@linaro.org>
Re: [PATCH 01/11] hisi_sas: add v2 hw support for ECC and AXI bus fatal error
On Mon, Nov 7, 2016 at 8:48 PM, John Garry wrote:
> From: Xiang Chen
>
> For ECC 1bit error, logic can recover it, so we only print
> a warning.
> For ECC multi-bit and AXI bus fatal error, we panic.

Is it possible to recover via resetting phy and device etc instead of
panic?

Thanks
Re: [PATCH 03/11] hisi_sas: only process broadcast change in phy_bcast_v2_hw()
On Mon, Nov 7, 2016 at 8:48 PM, John Garry <john.ga...@huawei.com> wrote:
> From: Xiang Chen <chenxian...@hisilicon.com>
>
> There are many BROADCAST primitives generated by the host.
> We are only interested in BROADCAST (CHANGE) primitives currently,
> so only process this.
>
> Signed-off-by: Xiang Chen <chenxian...@hisilicon.com>
> Signed-off-by: John Garry <john.ga...@huawei.com>

Reviewed-by: Zhangfei Gao <zhangfei@linaro.org>
Re: [PATCH 08/11] hisi_sas: modify some values in get_ata_protocol()
On Mon, Nov 7, 2016 at 8:48 PM, John Garry <john.ga...@huawei.com> wrote:
> From: Xiang Chen <chenxian...@hisilicon.com>
>
> Modify and add some SATA commands according to SATA protocol.
>
> Signed-off-by: Xiang Chen <chenxian...@hisilicon.com>
> Signed-off-by: John Garry <john.ga...@huawei.com>

Reviewed-by: Zhangfei Gao <zhangfei@linaro.org>
Re: [PATCH 07/11] hisi_sas: delete repeated configuration in free_device_v2_hw()
On Mon, Nov 7, 2016 at 8:48 PM, John Garry <john.ga...@huawei.com> wrote:
> From: Xiang Chen <chenxian...@hisilicon.com>
>
> Delete repeated configuration items for hisi_sas_device() when
> we free a device. These items are now only set in
> hisi_sas_dev_gone().
>
> Signed-off-by: Xiang Chen <chenxian...@hisilicon.com>
> Signed-off-by: John Garry <john.ga...@huawei.com>

Reviewed-by: Zhangfei Gao <zhangfei@linaro.org>
Re: [PATCH 10/11] hisi_sas: use atomic64_t for hisi_sas_device.running_req
On Mon, Nov 7, 2016 at 8:48 PM, John Garry <john.ga...@huawei.com> wrote:
> Sometimes the value of hisi_sas_device.running_req
> would go negative unless we have the check for
> running_req >= 0 before trying to decrement.
>
> This is because using running_req is not thread-safe.
>
> As such, the value for running_req may be actually incorrect,
> so use atomic64_t instead.
>
> Signed-off-by: John Garry <john.ga...@huawei.com>
> Reviewed-by: Xiang Chen <chenxian...@hisilicon.com>

Reviewed-by: Zhangfei Gao <zhangfei@linaro.org>
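The idea in the quoted commit message, replacing a plain counter with an atomic one so that concurrent increments and decrements cannot be lost, can be illustrated with a user-space analogue. This sketch uses C11 `<stdatomic.h>` rather than the kernel's `atomic64_t` API, and `struct device_stats`, `req_start()`, `req_done()` and `req_outstanding()` are hypothetical names that do not appear in the driver.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical per-device bookkeeping with an atomic in-flight counter. */
struct device_stats {
	_Atomic int64_t running_req;
};

/* A request was issued: the increment is a single atomic operation. */
void req_start(struct device_stats *d)
{
	atomic_fetch_add(&d->running_req, 1);
}

/* A request completed: the decrement cannot race with other updates. */
void req_done(struct device_stats *d)
{
	atomic_fetch_sub(&d->running_req, 1);
}

/* Snapshot of the number of requests currently in flight. */
int64_t req_outstanding(struct device_stats *d)
{
	return atomic_load(&d->running_req);
}
```

Because every update is one atomic read-modify-write, the counter stays balanced as long as each `req_start()` is paired with exactly one `req_done()`, so a `running_req >= 0` guard before decrementing is no longer needed to paper over lost updates.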
Re: [PATCH 04/11] hisi_sas: fix port form bug in hisi_sas_port_notify_formed()
On Mon, Nov 7, 2016 at 8:48 PM, John Garry <john.ga...@huawei.com> wrote:
> From: Xiang Chen <chenxian...@hisilicon.com>
>
> When we form a wideport, we should use hardware PHY port_id instead
> of sas_phy->id.
>
> Signed-off-by: Xiang Chen <chenxian...@hisilicon.com>
> Signed-off-by: John Garry <john.ga...@huawei.com>

Reviewed-by: Zhangfei Gao <zhangfei@linaro.org>
Re: [PATCH 02/11] hisi_sas: alloc queue id of slot according to device id
On Mon, Nov 7, 2016 at 8:48 PM, John Garry <john.ga...@huawei.com> wrote:
> From: Xiang Chen <chenxian...@hisilicon.com>
>
> Currently slots are allocated from queues in a round-robin fashion.
> This causes a problem for internal commands in device mode. For this
> mode, we should ensure that the internal abort command is the last
> command seen in the host for that device. We can only ensure this when
> we place the internal abort command after the preceding commands for
> device that in the same queue, as there is no order in which the host
> will select a queue to execute the next command.

Is there a performance penalty, since only one queue is used per device?

>
> This queue restriction makes supporting scsi mq more tricky in
> the future, but should not be a blocker.
>
> Note: Even though v1 hw does not support internal abort, the
> allocation method is chosen to be the same for consistency.
>
> Signed-off-by: Xiang Chen <chenxian...@hisilicon.com>
> Signed-off-by: John Garry <john.ga...@huawei.com>

Reviewed-by: Zhangfei Gao <zhangfei@linaro.org>
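One plausible way to tie a device to a single queue, as the quoted commit message describes, is a modulo mapping from the device id to the queue index. The sketch below is an assumption about the shape of such a mapping rather than the code from the patch, and `queue_for_device()` is a hypothetical name.

```c
#include <stdint.h>

/*
 * Hypothetical mapping: every command for a given device lands in the
 * same queue, so an internal abort queued last is also seen last.
 * The caller must pass queue_count > 0.
 */
uint32_t queue_for_device(uint32_t device_id, uint32_t queue_count)
{
	return device_id % queue_count;
}
```

The trade-off raised in the reply above follows directly from this shape: one device can never spread its commands over more than one queue, whereas round-robin allocation could.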
Re: [PATCH 0/5] hisi_sas: v2 hw SATA fixes
On Fri, Apr 8, 2016 at 5:23 PM, John Garry <john.ga...@huawei.com> wrote:
> This patchset introduces SATA support fixes for
> the HiSilicon v2 hw SAS controller.
>
> Fixes include:
> - attach issue for SATA disk attached through expander
> - intermittent issue for directly attaching multiple
>   SATA disks
> - add support for directly attaching SATA disk to phy
>   index 4+
> - ITCT config issue
>
> John Garry (5):
>   hisi_sas: use device linkrate in MCR for v2 hw
>   hisi_sas: fix v2 hw multiple SATA disk issue
>   hisi_sas: add v2 hw support for >4 SATA phys
>   hisi_sas: for v2 hw only set ITCT qw2 for SAS device
>   hisi_sas: update driver version to 1.4

For the series,
Reviewed-by: Zhangfei Gao <zhangfei@linaro.org>

Thanks
[PATCH v4 4/4] phy: add phy-hi6220-usb
Add usb phy controller for hi6220 platform

Signed-off-by: Zhangfei Gao
---
 drivers/phy/Kconfig          |   9 ++
 drivers/phy/Makefile         |   1 +
 drivers/phy/phy-hi6220-usb.c | 306 +++
 3 files changed, 316 insertions(+)
 create mode 100644 drivers/phy/phy-hi6220-usb.c

diff --git a/drivers/phy/Kconfig b/drivers/phy/Kconfig
index ccad880..40a1ef1 100644
--- a/drivers/phy/Kconfig
+++ b/drivers/phy/Kconfig
@@ -162,6 +162,15 @@ config PHY_HIX5HD2_SATA
 	help
 	  Support for SATA PHY on Hisilicon hix5hd2 Soc.
 
+config PHY_HI6220_USB
+	tristate "hi6220 USB PHY support"
+	select USB_PHY
+	select MFD_SYSCON
+	help
+	  Enable this to support the HISILICON HI6220 USB PHY.
+
+	  To compile this driver as a module, choose M here.
+
 config PHY_SUN4I_USB
 	tristate "Allwinner sunxi SoC USB PHY driver"
 	depends on ARCH_SUNXI && HAS_IOMEM && OF
diff --git a/drivers/phy/Makefile b/drivers/phy/Makefile
index aa74f96..ec43c2d 100644
--- a/drivers/phy/Makefile
+++ b/drivers/phy/Makefile
@@ -19,6 +19,7 @@ obj-$(CONFIG_TI_PIPE3)			+= phy-ti-pipe3.o
 obj-$(CONFIG_TWL4030_USB)		+= phy-twl4030-usb.o
 obj-$(CONFIG_PHY_EXYNOS5250_SATA)	+= phy-exynos5250-sata.o
 obj-$(CONFIG_PHY_HIX5HD2_SATA)		+= phy-hix5hd2-sata.o
+obj-$(CONFIG_PHY_HI6220_USB)		+= phy-hi6220-usb.o
 obj-$(CONFIG_PHY_SUN4I_USB)		+= phy-sun4i-usb.o
 obj-$(CONFIG_PHY_SAMSUNG_USB2)		+= phy-exynos-usb2.o
 phy-exynos-usb2-y			+= phy-samsung-usb2.o
diff --git a/drivers/phy/phy-hi6220-usb.c b/drivers/phy/phy-hi6220-usb.c
new file mode 100644
index 000..0d9f5ac
--- /dev/null
+++ b/drivers/phy/phy-hi6220-usb.c
@@ -0,0 +1,306 @@
+/*
+ * Copyright (c) 2015 Linaro Ltd.
+ * Copyright (c) 2015 Hisilicon Limited.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/clk.h>
+#include <linux/mfd/syscon.h>
+#include <linux/of_gpio.h>
+#include <linux/platform_device.h>
+#include <linux/regmap.h>
+#include <linux/regulator/consumer.h>
+#include <linux/usb/gadget.h>
+#include <linux/usb/otg.h>
+
+#define SC_PERIPH_CTRL4			0x00c
+
+#define CTRL4_PICO_SIDDQ		BIT(6)
+#define CTRL4_PICO_OGDISABLE		BIT(8)
+#define CTRL4_PICO_VBUSVLDEXT		BIT(10)
+#define CTRL4_PICO_VBUSVLDEXTSEL	BIT(11)
+#define CTRL4_OTG_PHY_SEL		BIT(21)
+
+#define SC_PERIPH_CTRL5			0x010
+
+#define CTRL5_USBOTG_RES_SEL		BIT(3)
+#define CTRL5_PICOPHY_ACAENB		BIT(4)
+#define CTRL5_PICOPHY_BC_MODE		BIT(5)
+#define CTRL5_PICOPHY_CHRGSEL		BIT(6)
+#define CTRL5_PICOPHY_VDATSRCEND	BIT(7)
+#define CTRL5_PICOPHY_VDATDETENB	BIT(8)
+#define CTRL5_PICOPHY_DCDENB		BIT(9)
+#define CTRL5_PICOPHY_IDDIG		BIT(10)
+
+#define SC_PERIPH_CTRL8			0x018
+#define SC_PERIPH_RSTEN0		0x300
+#define SC_PERIPH_RSTDIS0		0x304
+
+#define RST0_USBOTG_BUS			BIT(4)
+#define RST0_POR_PICOPHY		BIT(5)
+#define RST0_USBOTG			BIT(6)
+#define RST0_USBOTG_32K			BIT(7)
+
+#define EYE_PATTERN_PARA		0x7053348c
+
+struct hi6220_priv {
+	struct usb_phy phy;
+	struct delayed_work work;
+	struct regmap *reg;
+	struct clk *clk;
+	struct regulator *vcc;
+	struct device *dev;
+	int gpio_vbus;
+	int gpio_id;
+	enum usb_otg_state state;
+};
+
+static void hi6220_start_peripheral(struct hi6220_priv *priv, bool on)
+{
+	struct usb_otg *otg = priv->phy.otg;
+
+	if (!otg->gadget)
+		return;
+
+	if (on)
+		usb_gadget_connect(otg->gadget);
+	else
+		usb_gadget_disconnect(otg->gadget);
+}
+
+static void hi6220_detect_work(struct work_struct *work)
+{
+	struct hi6220_priv *priv =
+		container_of(work, struct hi6220_priv, work.work);
+	int gpio_id, gpio_vbus;
+	enum usb_otg_state state;
+
+	if (!gpio_is_valid(priv->gpio_id) || !gpio_is_valid(priv->gpio_vbus))
+		return;
+
+	gpio_id = gpio_get_value_cansleep(priv->gpio_id);
+	gpio_vbus = gpio_get_value_cansleep(priv->gpio_vbus);
+
+	if (gpio_vbus == 0) {
+		if (gpio_id == 1)
+			state = OTG_STATE_B_PERIPHERAL;
+		else
+			state = OTG_STATE_A_HOST;
+	} else {
+		state = OTG_STATE_A_HOST;
+	}
+
+	if (priv->state != state) {
+		hi6220_start_peripheral(priv, state == OTG_STATE_B_PERIPHERAL);
+		priv->state = state;
+	}
+}
+
+static irqreturn_t hiusb_gpio_intr(int irq, void *data)
+{
+	struct hi6220_priv
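As a reading aid, the role decision made in hi6220_detect_work() above can be restated as a small standalone function. This is a userspace sketch only: `enum otg_role` and `decide_role()` are names invented for the illustration, and the GPIO polarity simply mirrors the comparisons in the posted patch.

```c
/* Hypothetical names; a userspace restatement of the patch's decision logic. */
enum otg_role {
	ROLE_A_HOST,
	ROLE_B_PERIPHERAL,
};

/*
 * Mirrors the if/else chain in hi6220_detect_work(): peripheral mode is
 * chosen only when the vbus GPIO reads 0 and the id GPIO reads 1; every
 * other combination falls back to host mode.
 */
enum otg_role decide_role(int gpio_vbus, int gpio_id)
{
	if (gpio_vbus == 0 && gpio_id == 1)
		return ROLE_B_PERIPHERAL;

	return ROLE_A_HOST;
}
```

Written this way, it is easier to see that three of the four GPIO combinations select host mode, which may be useful when checking the logic against board wiring.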
[PATCH v4 3/4] usb: dwc2: platform: add hi6220 support
Signed-off-by: Zhangfei Gao
---
 drivers/usb/dwc2/platform.c | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/drivers/usb/dwc2/platform.c b/drivers/usb/dwc2/platform.c
index ae095f0..f7c67db 100644
--- a/drivers/usb/dwc2/platform.c
+++ b/drivers/usb/dwc2/platform.c
@@ -50,6 +50,35 @@
 
 static const char dwc2_driver_name[] = "dwc2";
 
+static const struct dwc2_core_params params_hi6220 = {
+	.otg_cap			= 2,	/* No HNP/SRP capable */
+	.otg_ver			= 0,	/* 1.3 */
+	.dma_enable			= 1,
+	.dma_desc_enable		= 0,
+	.speed				= 0,	/* High Speed */
+	.enable_dynamic_fifo		= 1,
+	.en_multiple_tx_fifo		= 1,
+	.host_rx_fifo_size		= 512,
+	.host_nperio_tx_fifo_size	= 512,
+	.host_perio_tx_fifo_size	= 512,
+	.max_transfer_size		= 65535,
+	.max_packet_count		= 511,
+	.host_channels			= 16,
+	.phy_type			= 1,	/* UTMI */
+	.phy_utmi_width			= 8,
+	.phy_ulpi_ddr			= 0,	/* Single */
+	.phy_ulpi_ext_vbus		= 0,
+	.i2c_enable			= 0,
+	.ulpi_fs_ls			= 0,
+	.host_support_fs_ls_low_power	= 0,
+	.host_ls_low_power_phy_clk	= 0,	/* 48 MHz */
+	.ts_dline			= 0,
+	.reload_ctl			= 0,
+	.ahbcfg				= GAHBCFG_HBSTLEN_INCR16 <<
+					  GAHBCFG_HBSTLEN_SHIFT,
+	.uframe_sched			= 0,
+};
+
 static const struct dwc2_core_params params_bcm2835 = {
 	.otg_cap			= 0,	/* HNP/SRP capable */
 	.otg_ver			= 0,	/* 1.3 */
@@ -129,6 +158,7 @@ static int dwc2_driver_remove(struct platform_device *dev)
 static const struct of_device_id dwc2_of_match_table[] = {
 	{ .compatible = "brcm,bcm2835-usb", .data = &params_bcm2835 },
+	{ .compatible = "hisilicon,hi6220-usb", .data = &params_hi6220 },
 	{ .compatible = "rockchip,rk3066-usb", .data = &params_rk3066 },
 	{ .compatible = "snps,dwc2", .data = NULL },
 	{ .compatible = "samsung,s3c6400-hsotg", .data = NULL},
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
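The match table in the patch above pairs each compatible string with a parameter block. The idea can be sketched in plain userspace C as below. Everything here is an assumption for illustration: `struct core_params`, `struct match_entry`, and `find_match()` are made-up names, and the kernel does the real lookup through its own OF matching helpers rather than a strcmp loop like this one.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-ins for the kernel structures. */
struct core_params {
	int otg_cap;
	int host_channels;
};

struct match_entry {
	const char *compatible;
	const struct core_params *data;	/* NULL means "use defaults" */
};

/* Two of the values from the posted params_hi6220 block. */
static const struct core_params params_hi6220 = {
	.otg_cap = 2,
	.host_channels = 16,
};

static const struct match_entry match_table[] = {
	{ .compatible = "hisilicon,hi6220-usb", .data = &params_hi6220 },
	{ .compatible = "snps,dwc2", .data = NULL },
};

/* Return the table entry for a compatible string, or NULL if none matches. */
const struct match_entry *find_match(const char *compatible)
{
	size_t i;

	for (i = 0; i < sizeof(match_table) / sizeof(match_table[0]); i++) {
		if (strcmp(match_table[i].compatible, compatible) == 0)
			return &match_table[i];
	}

	return NULL;
}
```

A matched entry with NULL data is distinct from no match at all, which is why the sketch returns the entry rather than the parameter pointer.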
[PATCH v4 1/4] Documentation: dt-bindings: add dt binding info for hi6220 dwc2
Add necessary dwc2 binding documentation for Hisilicon soc: hi6220

Signed-off-by: Zhangfei Gao
---
 Documentation/devicetree/bindings/usb/dwc2.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/usb/dwc2.txt b/Documentation/devicetree/bindings/usb/dwc2.txt
index fd132cb..2213682 100644
--- a/Documentation/devicetree/bindings/usb/dwc2.txt
+++ b/Documentation/devicetree/bindings/usb/dwc2.txt
@@ -4,6 +4,7 @@ Platform DesignWare HS OTG USB 2.0 controller
 Required properties:
 - compatible : One of:
   - brcm,bcm2835-usb: The DWC2 USB controller instance in the BCM2835 SoC.
+  - hisilicon,hi6220-usb: The DWC2 USB controller instance in the hi6220 SoC.
   - rockchip,rk3066-usb: The DWC2 USB controller instance in the rk3066 Soc;
   - "rockchip,rk3188-usb", "rockchip,rk3066-usb", "snps,dwc2": for rk3188 Soc;
   - "rockchip,rk3288-usb", "rockchip,rk3066-usb", "snps,dwc2": for rk3288 Soc;
-- 
1.9.1
[PATCH v4 0/4] add usb support for hi6220
v4:
Move drivers/usb/phy/phy-hi6220-usb.c to drivers/phy/phy-hi6220-usb.c, as requested by Balbi.
Modify dt bindings per comments from Mark and Sergei

v3:
fix typo and add -EPROBE_DEFER of regulator, pointed out by Peter

v2:
address comments from Sergei and Peter
add hi6220_phy_setup(false) code

v1:
hi6220 usb controller is inherited from dwc2
add phy accordingly
support otg gadget/host

Zhangfei Gao (4):
  Documentation: dt-bindings: add dt binding info for hi6220 dwc2
  Documentation: dt-bindings: add dt binding info for hi6220
  usb: dwc2: platform: add hi6220 support
  phy: add phy-hi6220-usb

 Documentation/devicetree/bindings/usb/dwc2.txt     |   1 +
 .../devicetree/bindings/usb/hi6220-usb.txt         |  49
 drivers/phy/Kconfig                                |   9 +
 drivers/phy/Makefile                               |   1 +
 drivers/phy/phy-hi6220-usb.c                       | 306 +
 drivers/usb/dwc2/platform.c                        |  30 ++
 6 files changed, 396 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/usb/hi6220-usb.txt
 create mode 100644 drivers/phy/phy-hi6220-usb.c

-- 
1.9.1
[PATCH v4 2/4] Documentation: dt-bindings: add dt binding info for hi6220
Signed-off-by: Zhangfei Gao
---
 .../devicetree/bindings/usb/hi6220-usb.txt | 49 ++
 1 file changed, 49 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/usb/hi6220-usb.txt

diff --git a/Documentation/devicetree/bindings/usb/hi6220-usb.txt b/Documentation/devicetree/bindings/usb/hi6220-usb.txt
new file mode 100644
index 000..b3a7b5a
--- /dev/null
+++ b/Documentation/devicetree/bindings/usb/hi6220-usb.txt
@@ -0,0 +1,49 @@
+Hisilicon hi6220 SoC USB controller
+-----------------------------------
+
+usb controller is inherited from dwc2, refer dwc2.txt
+-----------------------------------
+
+Required properties:
+- compatible: "hisilicon,hi6220-usb"
+Refer to dwc2.txt for dwc2 usb properties
+
+
+PHY:
+-----------------------------------
+
+Required properties:
+- compatible: "hisilicon,hi6220-usb-phy"
+- vcc-supply: phandle to the regulator that provides power to the PHY.
+- clocks: phandle and clock specifier of the PHY clock.
+- hisilicon,peripheral-syscon: phandle of syscon used to control peripheral.
+- hisilicon,vbus-gpios: gpio of detecting vbus.
+- hisilicon,id-gpios: gpio of detecting id.
+
+Example:
+
+	sys_ctrl: syscon@f703 {
+		compatible = "hisilicon,sysctrl", "syscon";
+		reg = <0x0 0xf703 0x0 0x1000>;
+	};
+
+	usb_phy: usb-phy {
+		compatible = "hisilicon,hi6220-usb-phy";
+		vcc-supply = <_5v_hub>;
+		hisilicon,vbus-gpios = < 6 0>;
+		hisilicon,id-gpios = < 5 0>;
+		hisilicon,peripheral-syscon = <&sys_ctrl>;
+		clocks = <_sys HI6220_USBOTG_HCLK>;
+	};
+
+	usb: usb@f72c {
+		compatible = "hisilicon,hi6220-usb";
+		reg = <0x0 0xf72c 0x0 0x4>;
+		phys = <&usb_phy>;
+		dr_mode = "otg";
+		g-use-dma;
+		g-rx-fifo-size = <512>;
+		g-np-tx-fifo-size = <128>;
+		g-tx-fifo-size = <128>;
+		interrupts = <0 77 0x4>;
+	};
-- 
1.9.1