Re: [PATCH v5 04/10] iommu/mediatek: Setting MISC_CTRL register
On Wed, 2020-07-01 at 16:58 +0200, Matthias Brugger wrote: > > On 30/06/2020 12:53, chao hao wrote: > > On Mon, 2020-06-29 at 11:28 +0200, Matthias Brugger wrote: > >> > >> On 29/06/2020 09:13, Chao Hao wrote: > >>> Add F_MMU_IN_ORDER_WR_EN and F_MMU_STANDARD_AXI_MODE_BIT definition > >>> in MISC_CTRL register. > >>> F_MMU_STANDARD_AXI_MODE_BIT: > >>> If we set F_MMU_STANDARD_AXI_MODE_BIT(bit[3][19] = 0, not follow > >>> standard AXI protocol), iommu will send urgent read command firstly > >>> compare with normal read command to improve performance. > >> > >> Can you please help me to understand the phrase. Sorry I'm not a AXI > >> specialist. > >> Does this mean that you will send a 'urgent read command' which is not > >> described > >> in the specifications instead of a normal read command? > > > > ok. > > iommu sends read command to next bus_node normally(we can name it to > > cmd1), when cmd1 isn't handled by next bus_node, iommu has a urgent read > > command is needed to be sent(we can name it to cmd2), iommu will send > > cmd2 and replace cmd1. So cmd2 is handled by next bus_node firstly and > > cmd2 will be handled secondly. > > But for standard AXI protocol, it will ignore the priority of read > > command and only be handled in order. So cmd2 is handled by next > > bus_node after cmd1 is done. > > > > Thanks. So I propose change this part of the commit message to something like: > F_MMU_STANDARD_AXI_MODE_BIT: > If we set F_MMU_STANDARD_AXI_MODE_EN_MASK (bit[3][19] = 0, not follow standard > AXI protocol), the iommu will priorize sending of urgent read command over a > normal read command. This improves the performance. > ok, thanks > >> > >>> F_MMU_IN_ORDER_WR_EN: > >>> If we set F_MMU_IN_ORDER_WR_EN(bit[1][17] = 0, out-of-order write), > >>> iommu > >>> will re-order write command and send more higher priority write command > >>> instead of sending write command in order. The feature be controlled > >>> by OUT_ORDER_EN macro definition. 
> > F_MMU_IN_ORDER_WR_EN: > If we set F_MMU_IN_ORDER_WR_EN_MASK (bit[1][17] = 0, out-of-order write), the > iommu will re-order write commands and send the write command with higher > priority. Otherwise the sending of write commands will be done in order. The > feature is controlled by OUT_ORDER_WR_EN platform data flag. > > > >>> > >>> Cc: Matthias Brugger > >>> Suggested-by: Yong Wu > >>> Signed-off-by: Chao Hao > >>> --- > >>> drivers/iommu/mtk_iommu.c | 12 +++- > >>> drivers/iommu/mtk_iommu.h | 1 + > >>> 2 files changed, 12 insertions(+), 1 deletion(-) > >>> > >>> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c > >>> index 8f81df6cbe51..67b46b5d83d9 100644 > >>> --- a/drivers/iommu/mtk_iommu.c > >>> +++ b/drivers/iommu/mtk_iommu.c > >>> @@ -42,6 +42,9 @@ > >>> #define F_INVLD_EN1 BIT(1) > >>> > >>> #define REG_MMU_MISC_CTRL0x048 > >>> +#define F_MMU_IN_ORDER_WR_EN (BIT(1) | BIT(17)) > >>> +#define F_MMU_STANDARD_AXI_MODE_BIT (BIT(3) | BIT(19)) > >> > >> Wouldn't it make more sense to name it F_MMU_STANDARD_AXI_MODE_EN? > > ok, you are right. > > 1'b1: follow standard axi protocol > > > > What about > F_MMU_IN_ORDER_WR_EN_MASK > F_MMU_STANDARD_AXI_MODE_EN_MASK > > Background is that we have to set/unset two bits to enable or disable the > feature, so it's a mask we have to apply to the register. > ok, thanks for your advice > Regards, > Matthias > > >> > >>> + > >>> #define REG_MMU_DCM_DIS 0x050 > >>> > >>> #define REG_MMU_CTRL_REG 0x110 > >>> @@ -574,10 +577,17 @@ static int mtk_iommu_hw_init(const struct > >>> mtk_iommu_data *data) > >>> } > >>> writel_relaxed(0, data->base + REG_MMU_DCM_DIS); > >>> > >>> + regval = readl_relaxed(data->base + REG_MMU_MISC_CTRL); > >> > >> We only need to read regval in the else branch. > > > > ok, I got it. thanks > > > >> > >>> if (MTK_IOMMU_HAS_FLAG(data->plat_data, RESET_AXI)) { > >>> /* The register is called STANDARD_AXI_MODE i
[PATCH v6 07/10] iommu/mediatek: Add REG_MMU_WR_LEN_CTRL register definition
Some platforms (e.g. mt6779) need to set the REG_MMU_WR_LEN_CTRL register
to improve performance. The WR_THROT_EN flag controls whether the register
is set.

With the default register value, the iommu sends commands to EMI without
restriction. As the number of outstanding commands grows, EMI performance
drops. Enabling the write throttling mechanism (bit[5][21] = 0) in the
MMU_WR_LEN_CTRL register makes the iommu stop sending commands to EMI once
more than ten commands (the default value) are waiting to be handled by
EMI, which preserves EMI performance.

Cc: Matthias Brugger
Signed-off-by: Chao Hao
---
 drivers/iommu/mtk_iommu.c | 11 +++
 drivers/iommu/mtk_iommu.h | 1 +
 2 files changed, 12 insertions(+)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 0d96dcd8612b..5c8e141668fc 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -46,6 +46,8 @@
 #define F_MMU_STANDARD_AXI_MODE_MASK (BIT(3) | BIT(19))
 #define REG_MMU_DCM_DIS 0x050
+#define REG_MMU_WR_LEN_CTRL 0x054
+#define F_MMU_WR_THROT_DIS_MASK (BIT(5) | BIT(21))
 #define REG_MMU_CTRL_REG 0x110
 #define F_MMU_TF_PROT_TO_PROGRAM_ADDR (2 << 4)
@@ -112,6 +114,7 @@
 #define RESET_AXI BIT(3)
 #define OUT_ORDER_WR_EN BIT(4)
 #define HAS_SUB_COMM BIT(5)
+#define WR_THROT_EN BIT(6)
 #define MTK_IOMMU_HAS_FLAG(pdata, _x) \
 		((((pdata)->flags) & (_x)) == (_x))
@@ -593,6 +596,12 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data)
 		writel_relaxed(regval, data->base + REG_MMU_VLD_PA_RNG);
 	}
 	writel_relaxed(0, data->base + REG_MMU_DCM_DIS);
+	if (MTK_IOMMU_HAS_FLAG(data->plat_data, WR_THROT_EN)) {
+		/* write command throttling mode */
+		regval = readl_relaxed(data->base + REG_MMU_WR_LEN_CTRL);
+		regval &= ~F_MMU_WR_THROT_DIS_MASK;
+		writel_relaxed(regval, data->base + REG_MMU_WR_LEN_CTRL);
+	}
 	if (MTK_IOMMU_HAS_FLAG(data->plat_data, RESET_AXI)) {
 		/* The register is called STANDARD_AXI_MODE in this case */
@@ -747,6 +756,7 @@ static int __maybe_unused mtk_iommu_suspend(struct device *dev)
 	struct mtk_iommu_suspend_reg *reg = &data->reg;
 	void __iomem *base = data->base;
+	reg->wr_len_ctrl = readl_relaxed(base + REG_MMU_WR_LEN_CTRL);
 	reg->misc_ctrl = readl_relaxed(base + REG_MMU_MISC_CTRL);
 	reg->dcm_dis = readl_relaxed(base + REG_MMU_DCM_DIS);
 	reg->ctrl_reg = readl_relaxed(base + REG_MMU_CTRL_REG);
@@ -771,6 +781,7 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev)
 		dev_err(data->dev, "Failed to enable clk(%d) in resume\n", ret);
 		return ret;
 	}
+	writel_relaxed(reg->wr_len_ctrl, base + REG_MMU_WR_LEN_CTRL);
 	writel_relaxed(reg->misc_ctrl, base + REG_MMU_MISC_CTRL);
 	writel_relaxed(reg->dcm_dis, base + REG_MMU_DCM_DIS);
 	writel_relaxed(reg->ctrl_reg, base + REG_MMU_CTRL_REG);
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index 46d0d47b22e1..31edd05e2eb1 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -31,6 +31,7 @@ struct mtk_iommu_suspend_reg {
 	u32 int_main_control;
 	u32 ivrp_paddr;
 	u32 vld_pa_rng;
+	u32 wr_len_ctrl;
 };
 enum mtk_iommu_plat {
-- 
2.18.0
[PATCH v6 00/10] MT6779 IOMMU SUPPORT
This patchset adds mt6779 iommu support.
mt6779 has two iommus, MM_IOMMU(M4U) and APU_IOMMU, which both use the ARM
Short-Descriptor translation format.

The mt6779's MM_IOMMU-SMI and APU_IOMMU HW diagram is as below; it is only
a brief diagram:

                          EMI
                           |
              ---------------------------------
              |                               |
          MM_IOMMU                        APU_IOMMU
              |                               |
          SMI_COMMON ---------             APU_BUS
              |              |                |
        SMI_LARB(0~11)       |                |
              |              |                |
              |              |         ---------------
              |              |         |      |      |
      Multimedia engine     CCU       VPU    MDLA   EMDA

All the connections are fixed in hardware; software cannot adjust them.

Compared with mt8183, the SMI_BUS_ID width has changed from 10 to 12 bits.
The SMI larb number is described in bit[11:7] and the port number in
bit[6:2]. In addition, some registers have changed in mt6779, so we need
to redefine and reuse them.

The patchset only uses MM_IOMMU, so we only add the MM_IOMMU basic
functions, such as the smi_larb port definitions, register definitions and
hardware initialization.

change notes:
v6:
 1. Fix build error for "PATCH v5 02/10".
 2. Use more precise definitions and commit messages.

v5:
 1. Split "iommu/mediatek: Add mt6779 IOMMU basic support(patch v4)" to
    three patches(from PATCH v5 08/10 to PATCH v5 10/10).
 2. Use macro definitions to replace bool values in the
    mtk_iommu_plat_data structure.
 http://lists.infradead.org/pipermail/linux-mediatek/2020-June/013586.html

v4:
 1. Rebase on v5.8-rc1.
 2. Fix coding style.
 3. Add F_MMU_IN_ORDER_WR_EN definition in MISC_CTRL to improve
    performance.
 https://lkml.org/lkml/2020/6/16/1741

v3:
 1. Rebase on v5.7-rc1.
 2. Remove unused port definitions, e.g. APU and CCU ports in
    mt6779-larb-port.h.
 3. Remove the "change single domain to multiple domain" part(from
    PATCH v2 09/19 to PATCH v2 19/19).
 4. Redesign the mt6779 basic part:
    (1) Add some register definitions and reuse them.
    (2) Redesign the smi larb bus ID to analyze IOMMU translation faults.
    (3) Only init MM_IOMMU and do not use APU_IOMMU.
 http://lists.infradead.org/pipermail/linux-mediatek/2020-May/029811.html

v2:
 1. Rebase on v5.5-rc1.
 2. Delete the M4U_PORT_UNKNOWN define because it is not used.
 3. Correct coding format.
 4. Rename offset=0x48 register.
 5. Split "iommu/mediatek: Add mt6779 IOMMU basic support(patch v1)" to
    several patches(patch v2).
 http://lists.infradead.org/pipermail/linux-mediatek/2020-January/026131.html

v1:
 http://lists.infradead.org/pipermail/linux-mediatek/2019-November/024567.html

Chao Hao (10):
  dt-bindings: mediatek: Add bindings for MT6779
  iommu/mediatek: Rename the register STANDARD_AXI_MODE(0x48) to MISC_CTRL
  iommu/mediatek: Use a u32 flags to describe different HW features
  iommu/mediatek: Setting MISC_CTRL register
  iommu/mediatek: Move inv_sel_reg into the plat_data
  iommu/mediatek: Add sub_comm id in translation fault
  iommu/mediatek: Add REG_MMU_WR_LEN_CTRL register definition
  iommu/mediatek: Extend protect pa alignment value
  iommu/mediatek: Modify MMU_CTRL register setting
  iommu/mediatek: Add mt6779 basic support

 .../bindings/iommu/mediatek,iommu.txt | 2 +
 drivers/iommu/mtk_iommu.c | 110 +++---
 drivers/iommu/mtk_iommu.h | 20 +-
 include/dt-bindings/memory/mt6779-larb-port.h | 206 ++
 4 files changed, 299 insertions(+), 39 deletions(-)

-- 
2.18.0
[PATCH v6 01/10] dt-bindings: mediatek: Add bindings for MT6779
This patch adds description for MT6779 IOMMU. MT6779 has two iommus, they are mm_iommu and apu_iommu which both use ARM Short-Descriptor translation format. In addition, mm_iommu and apu_iommu are two independent HW instance , we need to set them separately. The MT6779 IOMMU hardware diagram is as below, it is only a brief diagram about iommu, it don't focus on the part of smi_larb, so I don't describe the smi_larb detailedly. EMI | -- || MM_IOMMUAPU_IOMMU || SMI_COMMOM--- APU_BUS ||| SMI_LARB(0~11) || ||| || -- || | | | Multimedia engine CCU VPU MDLA EMDA All the connections are hardware fixed, software can not adjust it. Signed-off-by: Chao Hao Reviewed-by: Rob Herring --- .../bindings/iommu/mediatek,iommu.txt | 2 + include/dt-bindings/memory/mt6779-larb-port.h | 206 ++ 2 files changed, 208 insertions(+) create mode 100644 include/dt-bindings/memory/mt6779-larb-port.h diff --git a/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt b/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt index ce59a505f5a4..c1ccd8582eb2 100644 --- a/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt +++ b/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt @@ -58,6 +58,7 @@ Required properties: - compatible : must be one of the following string: "mediatek,mt2701-m4u" for mt2701 which uses generation one m4u HW. "mediatek,mt2712-m4u" for mt2712 which uses generation two m4u HW. + "mediatek,mt6779-m4u" for mt6779 which uses generation two m4u HW. "mediatek,mt7623-m4u", "mediatek,mt2701-m4u" for mt7623 which uses generation one m4u HW. "mediatek,mt8173-m4u" for mt8173 which uses generation two m4u HW. @@ -78,6 +79,7 @@ Required properties: Specifies the mtk_m4u_id as defined in dt-binding/memory/mt2701-larb-port.h for mt2701, mt7623 dt-binding/memory/mt2712-larb-port.h for mt2712, + dt-binding/memory/mt6779-larb-port.h for mt6779, dt-binding/memory/mt8173-larb-port.h for mt8173, and dt-binding/memory/mt8183-larb-port.h for mt8183. 
diff --git a/include/dt-bindings/memory/mt6779-larb-port.h b/include/dt-bindings/memory/mt6779-larb-port.h new file mode 100644 index ..2ad0899fbf2f --- /dev/null +++ b/include/dt-bindings/memory/mt6779-larb-port.h @@ -0,0 +1,206 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (c) 2019 MediaTek Inc. + * Author: Chao Hao + */ + +#ifndef _DTS_IOMMU_PORT_MT6779_H_ +#define _DTS_IOMMU_PORT_MT6779_H_ + +#define MTK_M4U_ID(larb, port) (((larb) << 5) | (port)) + +#define M4U_LARB0_ID0 +#define M4U_LARB1_ID1 +#define M4U_LARB2_ID2 +#define M4U_LARB3_ID3 +#define M4U_LARB4_ID4 +#define M4U_LARB5_ID5 +#define M4U_LARB6_ID6 +#define M4U_LARB7_ID7 +#define M4U_LARB8_ID8 +#define M4U_LARB9_ID9 +#define M4U_LARB10_ID 10 +#define M4U_LARB11_ID 11 + +/* larb0 */ +#define M4U_PORT_DISP_POSTMASK0 MTK_M4U_ID(M4U_LARB0_ID, 0) +#define M4U_PORT_DISP_OVL0_HDR MTK_M4U_ID(M4U_LARB0_ID, 1) +#define M4U_PORT_DISP_OVL1_HDR MTK_M4U_ID(M4U_LARB0_ID, 2) +#define M4U_PORT_DISP_OVL0 MTK_M4U_ID(M4U_LARB0_ID, 3) +#define M4U_PORT_DISP_OVL1 MTK_M4U_ID(M4U_LARB0_ID, 4) +#define M4U_PORT_DISP_PVRIC0MTK_M4U_ID(M4U_LARB0_ID, 5) +#define M4U_PORT_DISP_RDMA0 MTK_M4U_ID(M4U_LARB0_ID, 6) +#define M4U_PORT_DISP_WDMA0 MTK_M4U_ID(M4U_LARB0_ID, 7) +#define M4U_PORT_DISP_FAKE0 MTK_M4U_ID(M4U_LARB0_ID, 8) + +/* larb1 */ +#define M4U_PORT_DISP_OVL0_2L_HDR MTK_M4U_ID(M4U_LARB1_ID, 0) +#define M4U_PORT_DISP_OVL1_2L_HDR MTK_M4U_ID(M4U_LARB1_ID, 1) +#define M4U_PORT_DISP_OVL0_2L MTK_M4U_ID(M4U_LARB1_ID, 2) +#define M4U_PORT_DISP_OVL1_2L MTK_M4U_ID(M4U_LARB1_ID, 3) +#define M4U_PORT_DISP_RDMA1 MTK_M4U_ID(M4U_LARB1_ID, 4) +#define M4U_PORT_MDP_PVRIC0 MTK_M4U_ID(M4U_LARB1_ID, 5) +#define M4U_PORT_MDP_PVRIC1 MTK_M4U_ID(M4U_LARB1_ID, 6) +#define M4U_PORT_MDP_RDMA0 MTK_M4U_ID(M4U_LARB1_ID, 7) +#define M4U_PORT_MDP_
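As a side note on the port ID encoding used by this header: the sketch below packs a larb number and a port number with MTK_M4U_ID and unpacks them again with the MTK_M4U_TO_LARB / MTK_M4U_TO_PORT macros from drivers/iommu/mtk_iommu.c. It is a userspace illustration only; the helper m4u_id_split() is hypothetical and not part of the series.

```c
#include <assert.h>
#include <stdint.h>

/* Same encoding as the header: larb above bit 5, port in bits [4:0]. */
#define MTK_M4U_ID(larb, port)	(((larb) << 5) | (port))
/* Decoders as defined in drivers/iommu/mtk_iommu.c. */
#define MTK_M4U_TO_LARB(id)	(((id) >> 5) & 0xf)
#define MTK_M4U_TO_PORT(id)	((id) & 0x1f)

#define M4U_LARB1_ID		1
#define M4U_PORT_DISP_RDMA1	MTK_M4U_ID(M4U_LARB1_ID, 4)

/* Hypothetical helper (not in the patch): split an ID into larb and port. */
static void m4u_id_split(uint32_t id, uint32_t *larb, uint32_t *port)
{
	assert(larb && port);
	*larb = MTK_M4U_TO_LARB(id);
	*port = MTK_M4U_TO_PORT(id);
}
```

With this encoding each larb can address at most 32 ports, and the 4-bit larb mask covers the twelve larbs (0 to 11) listed in the header.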
[PATCH v6 02/10] iommu/mediatek: Rename the register STANDARD_AXI_MODE(0x48) to MISC_CTRL
For iommu offset=0x48 register, only the previous mt8173/mt8183 use the name STANDARD_AXI_MODE, all the latest SoC extend the register more feature by different bits, for example: axi_mode, in_order_en, coherent_en and so on. So rename REG_MMU_MISC_CTRL may be more proper. This patch only rename the register name, no functional change. Signed-off-by: Chao Hao Reviewed-by: Yong Wu Reviewed-by: Matthias Brugger --- drivers/iommu/mtk_iommu.c | 14 +++--- drivers/iommu/mtk_iommu.h | 5 - 2 files changed, 11 insertions(+), 8 deletions(-) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index 2be96f1cdbd2..88d3df5b91c2 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -41,7 +41,7 @@ #define F_INVLD_EN0BIT(0) #define F_INVLD_EN1BIT(1) -#define REG_MMU_STANDARD_AXI_MODE 0x048 +#define REG_MMU_MISC_CTRL 0x048 #define REG_MMU_DCM_DIS0x050 #define REG_MMU_CTRL_REG 0x110 @@ -573,8 +573,10 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data) } writel_relaxed(0, data->base + REG_MMU_DCM_DIS); - if (data->plat_data->reset_axi) - writel_relaxed(0, data->base + REG_MMU_STANDARD_AXI_MODE); + if (data->plat_data->reset_axi) { + /* The register is called STANDARD_AXI_MODE in this case */ + writel_relaxed(0, data->base + REG_MMU_MISC_CTRL); + } if (devm_request_irq(data->dev, data->irq, mtk_iommu_isr, 0, dev_name(data->dev), (void *)data)) { @@ -718,8 +720,7 @@ static int __maybe_unused mtk_iommu_suspend(struct device *dev) struct mtk_iommu_suspend_reg *reg = &data->reg; void __iomem *base = data->base; - reg->standard_axi_mode = readl_relaxed(base + - REG_MMU_STANDARD_AXI_MODE); + reg->misc_ctrl = readl_relaxed(base + REG_MMU_MISC_CTRL); reg->dcm_dis = readl_relaxed(base + REG_MMU_DCM_DIS); reg->ctrl_reg = readl_relaxed(base + REG_MMU_CTRL_REG); reg->int_control0 = readl_relaxed(base + REG_MMU_INT_CONTROL0); @@ -743,8 +744,7 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev) dev_err(data->dev, "Failed to enable 
clk(%d) in resume\n", ret); return ret; } - writel_relaxed(reg->standard_axi_mode, - base + REG_MMU_STANDARD_AXI_MODE); + writel_relaxed(reg->misc_ctrl, base + REG_MMU_MISC_CTRL); writel_relaxed(reg->dcm_dis, base + REG_MMU_DCM_DIS); writel_relaxed(reg->ctrl_reg, base + REG_MMU_CTRL_REG); writel_relaxed(reg->int_control0, base + REG_MMU_INT_CONTROL0); diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h index ea949a324e33..7212e6fcf982 100644 --- a/drivers/iommu/mtk_iommu.h +++ b/drivers/iommu/mtk_iommu.h @@ -18,7 +18,10 @@ #include struct mtk_iommu_suspend_reg { - u32 standard_axi_mode; + union { + u32 standard_axi_mode;/* v1 */ + u32 misc_ctrl;/* v2 */ + }; u32 dcm_dis; u32 ctrl_reg; u32 int_control0; -- 2.18.0
[PATCH v6 09/10] iommu/mediatek: Modify MMU_CTRL register setting
The MMU_CTRL register of MT8173 is different from other SoCs. The
in_order_wr_en is bit[9] which is zero by default.
Other SoCs have the victim_tlb_en feature mapped to bit[12]. This bit is
set to one by default. We need to preserve the bit when setting
F_MMU_TF_PROT_TO_PROGRAM_ADDR as otherwise the bit will be cleared and
IOMMU performance will drop.

Cc: Matthias Brugger
Cc: Yong Wu
Signed-off-by: Chao Hao
---
 drivers/iommu/mtk_iommu.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index e71003037ffa..a816030d00f1 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -555,11 +555,13 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data)
 		return ret;
 	}
-	if (data->plat_data->m4u_plat == M4U_MT8173)
+	if (data->plat_data->m4u_plat == M4U_MT8173) {
 		regval = F_MMU_PREFETCH_RT_REPLACE_MOD |
 			 F_MMU_TF_PROT_TO_PROGRAM_ADDR_MT8173;
-	else
-		regval = F_MMU_TF_PROT_TO_PROGRAM_ADDR;
+	} else {
+		regval = readl_relaxed(data->base + REG_MMU_CTRL_REG);
+		regval |= F_MMU_TF_PROT_TO_PROGRAM_ADDR;
+	}
 	writel_relaxed(regval, data->base + REG_MMU_CTRL_REG);
 	regval = F_L2_MULIT_HIT_EN |
-- 
2.18.0
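To illustrate why the read-modify-write matters here, the following userspace sketch compares the old and new ways of computing the MMU_CTRL value. Registers are modelled as plain uint32_t values; F_MMU_VICTIM_TLB_EN is a hypothetical name for bit[12], which the patch itself does not define.

```c
#include <stdint.h>

#define BIT(n)				(1u << (n))
#define F_MMU_TF_PROT_TO_PROGRAM_ADDR	(2 << 4)	/* from mtk_iommu.c */
#define F_MMU_VICTIM_TLB_EN		BIT(12)		/* hypothetical name for bit[12] */

/* Old behaviour: the constant overwrites the register, so bit[12] is lost. */
static uint32_t mmu_ctrl_overwrite(uint32_t hw_val)
{
	(void)hw_val;
	return F_MMU_TF_PROT_TO_PROGRAM_ADDR;
}

/* New behaviour: OR the field into the value read back from hardware. */
static uint32_t mmu_ctrl_preserve(uint32_t hw_val)
{
	return hw_val | F_MMU_TF_PROT_TO_PROGRAM_ADDR;
}
```

Starting from a reset value with only bit[12] set, the first function yields 0x20 and the second 0x1020, so only the second keeps the victim TLB enabled.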
[PATCH v6 03/10] iommu/mediatek: Use a u32 flags to describe different HW features
Given the fact that we are adding more and more plat_data bool values, it would make sense to use a u32 flags register and add the appropriate macro definitions to set and check for a flag present. No functional change. Cc: Yong Wu Suggested-by: Matthias Brugger Signed-off-by: Chao Hao Reviewed-by: Matthias Brugger --- drivers/iommu/mtk_iommu.c | 28 +--- drivers/iommu/mtk_iommu.h | 7 +-- 2 files changed, 18 insertions(+), 17 deletions(-) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index 88d3df5b91c2..40ca564d97af 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -100,6 +100,15 @@ #define MTK_M4U_TO_LARB(id)(((id) >> 5) & 0xf) #define MTK_M4U_TO_PORT(id)((id) & 0x1f) +#define HAS_4GB_MODE BIT(0) +/* HW will use the EMI clock if there isn't the "bclk". */ +#define HAS_BCLK BIT(1) +#define HAS_VLD_PA_RNG BIT(2) +#define RESET_AXI BIT(3) + +#define MTK_IOMMU_HAS_FLAG(pdata, _x) \ + pdata)->flags) & (_x)) == (_x)) + struct mtk_iommu_domain { struct io_pgtable_cfg cfg; struct io_pgtable_ops *iop; @@ -563,7 +572,8 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data) upper_32_bits(data->protect_base); writel_relaxed(regval, data->base + REG_MMU_IVRP_PADDR); - if (data->enable_4GB && data->plat_data->has_vld_pa_rng) { + if (data->enable_4GB && + MTK_IOMMU_HAS_FLAG(data->plat_data, HAS_VLD_PA_RNG)) { /* * If 4GB mode is enabled, the validate PA range is from * 0x1__ to 0x1__. here record bit[32:30]. 
@@ -573,7 +583,7 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data) } writel_relaxed(0, data->base + REG_MMU_DCM_DIS); - if (data->plat_data->reset_axi) { + if (MTK_IOMMU_HAS_FLAG(data->plat_data, RESET_AXI)) { /* The register is called STANDARD_AXI_MODE in this case */ writel_relaxed(0, data->base + REG_MMU_MISC_CTRL); } @@ -618,7 +628,7 @@ static int mtk_iommu_probe(struct platform_device *pdev) /* Whether the current dram is over 4GB */ data->enable_4GB = !!(max_pfn > (BIT_ULL(32) >> PAGE_SHIFT)); - if (!data->plat_data->has_4gb_mode) + if (!MTK_IOMMU_HAS_FLAG(data->plat_data, HAS_4GB_MODE)) data->enable_4GB = false; res = platform_get_resource(pdev, IORESOURCE_MEM, 0); @@ -631,7 +641,7 @@ static int mtk_iommu_probe(struct platform_device *pdev) if (data->irq < 0) return data->irq; - if (data->plat_data->has_bclk) { + if (MTK_IOMMU_HAS_FLAG(data->plat_data, HAS_BCLK)) { data->bclk = devm_clk_get(dev, "bclk"); if (IS_ERR(data->bclk)) return PTR_ERR(data->bclk); @@ -763,23 +773,19 @@ static const struct dev_pm_ops mtk_iommu_pm_ops = { static const struct mtk_iommu_plat_data mt2712_data = { .m4u_plat = M4U_MT2712, - .has_4gb_mode = true, - .has_bclk = true, - .has_vld_pa_rng = true, + .flags= HAS_4GB_MODE | HAS_BCLK | HAS_VLD_PA_RNG, .larbid_remap = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}, }; static const struct mtk_iommu_plat_data mt8173_data = { .m4u_plat = M4U_MT8173, - .has_4gb_mode = true, - .has_bclk = true, - .reset_axi= true, + .flags= HAS_4GB_MODE | HAS_BCLK | RESET_AXI, .larbid_remap = {0, 1, 2, 3, 4, 5}, /* Linear mapping. 
*/ }; static const struct mtk_iommu_plat_data mt8183_data = { .m4u_plat = M4U_MT8183, - .reset_axi= true, + .flags= RESET_AXI, .larbid_remap = {0, 4, 5, 6, 7, 2, 3, 1}, }; diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h index 7212e6fcf982..5225a9170aaa 100644 --- a/drivers/iommu/mtk_iommu.h +++ b/drivers/iommu/mtk_iommu.h @@ -39,12 +39,7 @@ enum mtk_iommu_plat { struct mtk_iommu_plat_data { enum mtk_iommu_plat m4u_plat; - boolhas_4gb_mode; - - /* HW will use the EMI clock if there isn't the "bclk". */ - boolhas_bclk; - boolhas_vld_pa_rng; - boolreset_axi; + u32 flags; unsigned char larbid_remap[MTK_LARB_NR_MAX]; }; -- 2.18.0
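For readers who want to try the flag check outside the kernel, here is a minimal userspace sketch of the same idea. The macro body is written with balanced parentheses, which is an assumption about the original since the archived text appears to have lost its leading characters; struct plat_data and needs_bclk() are reduced, hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT(n)		(1u << (n))
#define HAS_4GB_MODE	BIT(0)
#define HAS_BCLK	BIT(1)
#define HAS_VLD_PA_RNG	BIT(2)
#define RESET_AXI	BIT(3)

/* True only when every bit of _x is set in pdata->flags. */
#define MTK_IOMMU_HAS_FLAG(pdata, _x) \
	((((pdata)->flags) & (_x)) == (_x))

/* Reduced stand-in for struct mtk_iommu_plat_data (assumption). */
struct plat_data {
	uint32_t flags;
};

/* Same flag set as mt8173_data in the patch. */
static const struct plat_data mt8173_like = {
	.flags = HAS_4GB_MODE | HAS_BCLK | RESET_AXI,
};

/* Hypothetical helper: does this platform need the "bclk" clock? */
static bool needs_bclk(const struct plat_data *pdata)
{
	assert(pdata);
	return MTK_IOMMU_HAS_FLAG(pdata, HAS_BCLK);
}
```

Because the macro compares against (_x) rather than testing for non-zero, passing several flags ORed together checks that all of them are present, not just any one.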
[PATCH v6 05/10] iommu/mediatek: Move inv_sel_reg into the plat_data
For mt6779, MMU_INV_SEL register's offset is changed from 0x38 to 0x2c, so we can put inv_sel_reg in the plat_data to use it. In addition, we renamed it to REG_MMU_INV_SEL_GEN1 and use it before mt6779. Cc: Yong Wu Signed-off-by: Chao Hao Reviewed-by: Matthias Brugger --- drivers/iommu/mtk_iommu.c | 9 ++--- drivers/iommu/mtk_iommu.h | 1 + 2 files changed, 7 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index 219d7aa6f059..533b8f76f592 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -37,7 +37,7 @@ #define REG_MMU_INVLD_START_A 0x024 #define REG_MMU_INVLD_END_A0x028 -#define REG_MMU_INV_SEL0x038 +#define REG_MMU_INV_SEL_GEN1 0x038 #define F_INVLD_EN0BIT(0) #define F_INVLD_EN1BIT(1) @@ -178,7 +178,7 @@ static void mtk_iommu_tlb_flush_all(void *cookie) for_each_m4u(data) { writel_relaxed(F_INVLD_EN1 | F_INVLD_EN0, - data->base + REG_MMU_INV_SEL); + data->base + data->plat_data->inv_sel_reg); writel_relaxed(F_ALL_INVLD, data->base + REG_MMU_INVALIDATE); wmb(); /* Make sure the tlb flush all done */ } @@ -195,7 +195,7 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size, for_each_m4u(data) { spin_lock_irqsave(&data->tlb_lock, flags); writel_relaxed(F_INVLD_EN1 | F_INVLD_EN0, - data->base + REG_MMU_INV_SEL); + data->base + data->plat_data->inv_sel_reg); writel_relaxed(iova, data->base + REG_MMU_INVLD_START_A); writel_relaxed(iova + size - 1, @@ -784,18 +784,21 @@ static const struct dev_pm_ops mtk_iommu_pm_ops = { static const struct mtk_iommu_plat_data mt2712_data = { .m4u_plat = M4U_MT2712, .flags= HAS_4GB_MODE | HAS_BCLK | HAS_VLD_PA_RNG, + .inv_sel_reg = REG_MMU_INV_SEL_GEN1, .larbid_remap = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}, }; static const struct mtk_iommu_plat_data mt8173_data = { .m4u_plat = M4U_MT8173, .flags= HAS_4GB_MODE | HAS_BCLK | RESET_AXI, + .inv_sel_reg = REG_MMU_INV_SEL_GEN1, .larbid_remap = {0, 1, 2, 3, 4, 5}, /* Linear mapping. 
*/ }; static const struct mtk_iommu_plat_data mt8183_data = { .m4u_plat = M4U_MT8183, .flags= RESET_AXI, + .inv_sel_reg = REG_MMU_INV_SEL_GEN1, .larbid_remap = {0, 4, 5, 6, 7, 2, 3, 1}, }; diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h index 5225a9170aaa..cf53f5e80d22 100644 --- a/drivers/iommu/mtk_iommu.h +++ b/drivers/iommu/mtk_iommu.h @@ -40,6 +40,7 @@ enum mtk_iommu_plat { struct mtk_iommu_plat_data { enum mtk_iommu_plat m4u_plat; u32 flags; + u32 inv_sel_reg; unsigned char larbid_remap[MTK_LARB_NR_MAX]; }; -- 2.18.0
[PATCH v6 06/10] iommu/mediatek: Add sub_comm id in translation fault
The max larb number that a iommu HW support is 8(larb0~larb7 in the below diagram). If the larb's number is over 8, we use a sub_common for merging several larbs into one larb. At this case, we will extend larb_id: bit[11:9] means common-id; bit[8:7] means subcommon-id; From these two variables, we could get the real larb number when translation fault happen. The diagram is as below: EMI | IOMMU | - | | common1 common0 | | - | smi common | | | | | || 3'd03'd13'd23'd3 ... 3'd7 <-common_id(max is 8) | | | | || Larb0 Larb1 | Larb3 ... Larb7 | smi sub common | -- || | | 2'd0 2'd12'd22'd3 <-sub_common_id(max is 4) || | | Larb8Larb9 Larb10 Larb11 In this patch we extend larb_remap[] to larb_remap[8][4] for this. larb_remap[x][y]: x means common-id above, y means subcommon_id above. We can also distinguish if the M4U HW has sub_common by HAS_SUB_COMM macro. Cc: Matthias Brugger Signed-off-by: Chao Hao Reviewed-by: Yong Wu --- drivers/iommu/mtk_iommu.c | 21 ++--- drivers/iommu/mtk_iommu.h | 5 - 2 files changed, 18 insertions(+), 8 deletions(-) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index 533b8f76f592..0d96dcd8612b 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -91,6 +91,8 @@ #define REG_MMU1_INVLD_PA 0x148 #define REG_MMU0_INT_ID0x150 #define REG_MMU1_INT_ID0x154 +#define F_MMU_INT_ID_COMM_ID(a)(((a) >> 9) & 0x7) +#define F_MMU_INT_ID_SUB_COMM_ID(a)(((a) >> 7) & 0x3) #define F_MMU_INT_ID_LARB_ID(a)(((a) >> 7) & 0x7) #define F_MMU_INT_ID_PORT_ID(a)(((a) >> 2) & 0x1f) @@ -109,6 +111,7 @@ #define HAS_VLD_PA_RNG BIT(2) #define RESET_AXI BIT(3) #define OUT_ORDER_WR_ENBIT(4) +#define HAS_SUB_COMM BIT(5) #define MTK_IOMMU_HAS_FLAG(pdata, _x) \ pdata)->flags) & (_x)) == (_x)) @@ -239,7 +242,7 @@ static irqreturn_t mtk_iommu_isr(int irq, void *dev_id) struct mtk_iommu_data *data = dev_id; struct mtk_iommu_domain *dom = data->m4u_dom; u32 int_state, regval, fault_iova, fault_pa; - unsigned int fault_larb, fault_port; + unsigned 
int fault_larb, fault_port, sub_comm = 0; bool layer, write; /* Read error info from registers */ @@ -255,10 +258,14 @@ static irqreturn_t mtk_iommu_isr(int irq, void *dev_id) } layer = fault_iova & F_MMU_FAULT_VA_LAYER_BIT; write = fault_iova & F_MMU_FAULT_VA_WRITE_BIT; - fault_larb = F_MMU_INT_ID_LARB_ID(regval); fault_port = F_MMU_INT_ID_PORT_ID(regval); - - fault_larb = data->plat_data->larbid_remap[fault_larb]; + if (MTK_IOMMU_HAS_FLAG(data->plat_data, HAS_SUB_COMM)) { + fault_larb = F_MMU_INT_ID_COMM_ID(regval); + sub_comm = F_MMU_INT_ID_SUB_COMM_ID(regval); + } else { + fault_larb = F_MMU_INT_ID_LARB_ID(regval); + } + fault_larb = data->plat_data->larbid_remap[fault_larb][sub_comm]; if (report_iommu_fault(&dom->domain, data->dev, fault_iova, write ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ)) { @@ -785,21 +792,21 @@ static const struct mtk_iommu_plat_data mt2712_data = { .m4u_plat = M4U_MT2712, .flags= HAS_4GB_MODE | HAS_BCLK | HAS_VLD_PA_RNG, .inv_sel_reg = REG_MMU_INV_SEL_GEN1, - .larbid_remap = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}, + .larbid_remap = {{0}, {1}, {2}, {3}, {4}, {5}, {6}, {7}}, }; static const struct mtk_iommu_plat_data mt8173_data = { .m4u_plat = M4U_MT8173, .flags= HAS_4GB_MODE | HAS_BCLK | RESET_AXI, .inv_sel_reg = REG_MMU_INV_SEL_GEN1, - .larbid_remap = {0, 1, 2, 3, 4, 5}, /* Linear mapping. */ + .larbid_remap = {{0}, {1}, {2}, {3}, {4}, {5}}, /* Linear mapping. */ }; static const struct mtk_iommu_plat_data mt8183_data = { .m4u_plat = M4U_MT8183, .flags= RESET_AXI, .inv_sel_reg = REG_MMU_INV_SEL_GEN1, - .larbid_remap = {0, 4, 5, 6, 7, 2, 3, 1}, + .larbid_remap = {{0}, {4}, {5}, {6}, {7}, {2}, {3}, {1}}, }; static const str
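The larb lookup described in this patch can be sketched in userspace as follows. The field macros are the ones from the diff; the table is the mt6779 larbid_remap from patch 10/10 of this series; fault_larb_from_int_id() is a hypothetical helper that mirrors the ISR logic and takes the raw INT_ID register value as an argument.

```c
#include <stdint.h>

#define F_MMU_INT_ID_COMM_ID(a)		(((a) >> 9) & 0x7)
#define F_MMU_INT_ID_SUB_COMM_ID(a)	(((a) >> 7) & 0x3)
#define F_MMU_INT_ID_LARB_ID(a)		(((a) >> 7) & 0x7)
#define F_MMU_INT_ID_PORT_ID(a)		(((a) >> 2) & 0x1f)

/* larbid_remap[common_id][sub_common_id], values as in mt6779_data. */
static const unsigned char mt6779_larbid_remap[8][4] = {
	{0}, {1}, {2}, {3}, {5}, {7, 8}, {10}, {9},
};

/* Hypothetical helper: map an INT_ID register value to the real larb. */
static unsigned int fault_larb_from_int_id(uint32_t regval, int has_sub_comm)
{
	unsigned int comm, sub_comm = 0;

	if (has_sub_comm) {
		comm = F_MMU_INT_ID_COMM_ID(regval);
		sub_comm = F_MMU_INT_ID_SUB_COMM_ID(regval);
	} else {
		/* Without a sub common, bit[9:7] is the larb index directly. */
		comm = F_MMU_INT_ID_LARB_ID(regval);
	}
	return mt6779_larbid_remap[comm][sub_comm];
}
```

For example, an INT_ID with common-id 5 and subcommon-id 1 resolves to larb 8 in this table, while common-id 5 with subcommon-id 0 resolves to larb 7.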
[PATCH v6 10/10] iommu/mediatek: Add mt6779 basic support
1. Starting with mt6779, INVLDT_SEL moves to offset=0x2c, so we add the
   REG_MMU_INV_SEL_GEN2 definition and mt6779 uses it.
2. Add mt6779_data to support mm_iommu HW init.

Cc: Yong Wu
Signed-off-by: Chao Hao
Reviewed-by: Matthias Brugger
---
 drivers/iommu/mtk_iommu.c | 9 +
 drivers/iommu/mtk_iommu.h | 1 +
 2 files changed, 10 insertions(+)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index a816030d00f1..59e5a62a34db 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -37,6 +37,7 @@
 #define REG_MMU_INVLD_START_A 0x024
 #define REG_MMU_INVLD_END_A 0x028
+#define REG_MMU_INV_SEL_GEN2 0x02c
 #define REG_MMU_INV_SEL_GEN1 0x038
 #define F_INVLD_EN0 BIT(0)
 #define F_INVLD_EN1 BIT(1)
@@ -808,6 +809,13 @@ static const struct mtk_iommu_plat_data mt2712_data = {
 	.larbid_remap = {{0}, {1}, {2}, {3}, {4}, {5}, {6}, {7}},
 };
+static const struct mtk_iommu_plat_data mt6779_data = {
+	.m4u_plat = M4U_MT6779,
+	.flags = HAS_SUB_COMM | OUT_ORDER_WR_EN | WR_THROT_EN,
+	.inv_sel_reg = REG_MMU_INV_SEL_GEN2,
+	.larbid_remap = {{0}, {1}, {2}, {3}, {5}, {7, 8}, {10}, {9}},
+};
+
 static const struct mtk_iommu_plat_data mt8173_data = {
 	.m4u_plat = M4U_MT8173,
 	.flags = HAS_4GB_MODE | HAS_BCLK | RESET_AXI,
@@ -824,6 +832,7 @@ static const struct mtk_iommu_plat_data mt8183_data = {
 static const struct of_device_id mtk_iommu_of_ids[] = {
 	{ .compatible = "mediatek,mt2712-m4u", .data = &mt2712_data},
+	{ .compatible = "mediatek,mt6779-m4u", .data = &mt6779_data},
 	{ .compatible = "mediatek,mt8173-m4u", .data = &mt8173_data},
 	{ .compatible = "mediatek,mt8183-m4u", .data = &mt8183_data},
 	{}
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index 31edd05e2eb1..214898578026 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -37,6 +37,7 @@ struct mtk_iommu_suspend_reg {
 enum mtk_iommu_plat {
 	M4U_MT2701,
 	M4U_MT2712,
+	M4U_MT6779,
 	M4U_MT8173,
 	M4U_MT8183,
 };
-- 
2.18.0
[PATCH v6 08/10] iommu/mediatek: Extend protect pa alignment value
Starting with mt6779, the alignment of the memory protection PA needs to
be extended from 128 bytes to 256 bytes, which is the maximum amount of
data the iommu can send. Use a separate patch to modify it.

Signed-off-by: Chao Hao
Reviewed-by: Matthias Brugger
---
 drivers/iommu/mtk_iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 5c8e141668fc..e71003037ffa 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -98,7 +98,7 @@
 #define F_MMU_INT_ID_LARB_ID(a) (((a) >> 7) & 0x7)
 #define F_MMU_INT_ID_PORT_ID(a) (((a) >> 2) & 0x1f)
-#define MTK_PROTECT_PA_ALIGN 128
+#define MTK_PROTECT_PA_ALIGN 256
 /*
  * Get the local arbiter ID and the portid within the larb arbiter
-- 
2.18.0
[PATCH v6 04/10] iommu/mediatek: Setting MISC_CTRL register
Add F_MMU_IN_ORDER_WR_EN_MASK and F_MMU_STANDARD_AXI_MODE_EN_MASK
definitions in MISC_CTRL register.
F_MMU_STANDARD_AXI_MODE_EN_MASK:
If we set F_MMU_STANDARD_AXI_MODE_EN_MASK (bit[3][19] = 0, not follow
standard AXI protocol), the iommu will prioritize sending of urgent read
command over a normal read command. This improves the performance.
F_MMU_IN_ORDER_WR_EN_MASK:
If we set F_MMU_IN_ORDER_WR_EN_MASK (bit[1][17] = 0, out-of-order write),
the iommu will re-order write commands and send the write commands with
higher priority. Otherwise the sending of write commands will be done in
order. The feature is controlled by OUT_ORDER_WR_EN platform data flag.

Cc: Matthias Brugger
Suggested-by: Yong Wu
Signed-off-by: Chao Hao
---
 drivers/iommu/mtk_iommu.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 40ca564d97af..219d7aa6f059 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -42,6 +42,9 @@
 #define F_INVLD_EN1			BIT(1)
 
 #define REG_MMU_MISC_CTRL		0x048
+#define F_MMU_IN_ORDER_WR_EN_MASK	(BIT(1) | BIT(17))
+#define F_MMU_STANDARD_AXI_MODE_MASK	(BIT(3) | BIT(19))
+
 #define REG_MMU_DCM_DIS			0x050
 
 #define REG_MMU_CTRL_REG		0x110
@@ -105,6 +108,7 @@
 #define HAS_BCLK			BIT(1)
 #define HAS_VLD_PA_RNG			BIT(2)
 #define RESET_AXI			BIT(3)
+#define OUT_ORDER_WR_EN			BIT(4)
 
 #define MTK_IOMMU_HAS_FLAG(pdata, _x) \
 		((((pdata)->flags) & (_x)) == (_x))
@@ -585,8 +589,14 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data)
 
 	if (MTK_IOMMU_HAS_FLAG(data->plat_data, RESET_AXI)) {
 		/* The register is called STANDARD_AXI_MODE in this case */
-		writel_relaxed(0, data->base + REG_MMU_MISC_CTRL);
+		regval = 0;
+	} else {
+		regval = readl_relaxed(data->base + REG_MMU_MISC_CTRL);
+		regval &= ~F_MMU_STANDARD_AXI_MODE_MASK;
+		if (MTK_IOMMU_HAS_FLAG(data->plat_data, OUT_ORDER_WR_EN))
+			regval &= ~F_MMU_IN_ORDER_WR_EN_MASK;
 	}
+	writel_relaxed(regval, data->base + REG_MMU_MISC_CTRL);
 
 	if (devm_request_irq(data->dev, data->irq, mtk_iommu_isr, 0,
 			     dev_name(data->dev), (void *)data)) {
-- 
2.18.0
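For readers who want to check the resulting register values without hardware, here is a minimal user-space sketch of the MISC_CTRL computation described in the patch above. The two masks are copied from the patch; the helper name misc_ctrl_value() and its boolean parameters are assumptions made for illustration and are not part of the driver.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))

/* Masks as defined in the patch. */
#define F_MMU_IN_ORDER_WR_EN_MASK	(BIT(1) | BIT(17))
#define F_MMU_STANDARD_AXI_MODE_MASK	(BIT(3) | BIT(19))

/*
 * Hypothetical helper (not in the driver): compute the value that would
 * be written to MISC_CTRL.
 *   current      - value read back from the register
 *   reset_axi    - models the RESET_AXI platform flag
 *   out_order_wr - models the OUT_ORDER_WR_EN platform flag
 */
static inline uint32_t misc_ctrl_value(uint32_t current, bool reset_axi,
				       bool out_order_wr)
{
	uint32_t regval;

	/* RESET_AXI platforms write zero to the whole register. */
	if (reset_axi)
		return 0;

	/* Otherwise leave standard AXI mode and keep the other bits. */
	regval = current & ~F_MMU_STANDARD_AXI_MODE_MASK;

	/* Out-of-order write is opt-in per platform. */
	if (out_order_wr)
		regval &= ~F_MMU_IN_ORDER_WR_EN_MASK;

	return regval;
}
```

With every bit set in the input, this yields 0xfff7fff7 when only standard AXI mode is cleared and 0xfff5fff5 when in-order write is cleared as well.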
Re: [PATCH v5 04/10] iommu/mediatek: Setting MISC_CTRL register
On Mon, 2020-06-29 at 11:28 +0200, Matthias Brugger wrote: > > On 29/06/2020 09:13, Chao Hao wrote: > > Add F_MMU_IN_ORDER_WR_EN and F_MMU_STANDARD_AXI_MODE_BIT definition > > in MISC_CTRL register. > > F_MMU_STANDARD_AXI_MODE_BIT: > > If we set F_MMU_STANDARD_AXI_MODE_BIT(bit[3][19] = 0, not follow > > standard AXI protocol), iommu will send urgent read command firstly > > compare with normal read command to improve performance. > > Can you please help me to understand the phrase. Sorry I'm not a AXI > specialist. > Does this mean that you will send a 'urgent read command' which is not > described > in the specifications instead of a normal read command? ok. iommu sends read command to next bus_node normally(we can name it to cmd1), when cmd1 isn't handled by next bus_node, iommu has a urgent read command is needed to be sent(we can name it to cmd2), iommu will send cmd2 and replace cmd1. So cmd2 is handled by next bus_node firstly and cmd2 will be handled secondly. But for standard AXI protocol, it will ignore the priority of read command and only be handled in order. So cmd2 is handled by next bus_node after cmd1 is done. > > > F_MMU_IN_ORDER_WR_EN: > > If we set F_MMU_IN_ORDER_WR_EN(bit[1][17] = 0, out-of-order write), iommu > > will re-order write command and send more higher priority write command > > instead of sending write command in order. The feature be controlled > > by OUT_ORDER_EN macro definition. 
> > > > Cc: Matthias Brugger > > Suggested-by: Yong Wu > > Signed-off-by: Chao Hao > > --- > > drivers/iommu/mtk_iommu.c | 12 +++- > > drivers/iommu/mtk_iommu.h | 1 + > > 2 files changed, 12 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c > > index 8f81df6cbe51..67b46b5d83d9 100644 > > --- a/drivers/iommu/mtk_iommu.c > > +++ b/drivers/iommu/mtk_iommu.c > > @@ -42,6 +42,9 @@ > > #define F_INVLD_EN1BIT(1) > > > > #define REG_MMU_MISC_CTRL 0x048 > > +#define F_MMU_IN_ORDER_WR_EN (BIT(1) | BIT(17)) > > +#define F_MMU_STANDARD_AXI_MODE_BIT(BIT(3) | BIT(19)) > > Wouldn't it make more sense to name it F_MMU_STANDARD_AXI_MODE_EN? ok, you are right. 1'b1: follow standard axi protocol > > > + > > #define REG_MMU_DCM_DIS0x050 > > > > #define REG_MMU_CTRL_REG 0x110 > > @@ -574,10 +577,17 @@ static int mtk_iommu_hw_init(const struct > > mtk_iommu_data *data) > > } > > writel_relaxed(0, data->base + REG_MMU_DCM_DIS); > > > > + regval = readl_relaxed(data->base + REG_MMU_MISC_CTRL); > > We only need to read regval in the else branch. ok, I got it. 
thanks > > > if (MTK_IOMMU_HAS_FLAG(data->plat_data, RESET_AXI)) { > > /* The register is called STANDARD_AXI_MODE in this case */ > > - writel_relaxed(0, data->base + REG_MMU_MISC_CTRL); > > + regval = 0; > > + } else { > > + /* For mm_iommu, it can improve performance by the setting */ > > + regval &= ~F_MMU_STANDARD_AXI_MODE_BIT; > > + if (MTK_IOMMU_HAS_FLAG(data->plat_data, OUT_ORDER_EN)) > > + regval &= ~F_MMU_IN_ORDER_WR_EN; > > } > > + writel_relaxed(regval, data->base + REG_MMU_MISC_CTRL); > > > > if (devm_request_irq(data->dev, data->irq, mtk_iommu_isr, 0, > > dev_name(data->dev), (void *)data)) { > > diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h > > index 7cc39f729263..4b780b651ef4 100644 > > --- a/drivers/iommu/mtk_iommu.h > > +++ b/drivers/iommu/mtk_iommu.h > > @@ -22,6 +22,7 @@ > > #define HAS_BCLK BIT(1) > > #define HAS_VLD_PA_RNG BIT(2) > > #define RESET_AXI BIT(3) > > +#define OUT_ORDER_EN BIT(4) > > Maybe something like OUT_ORDER_WR_EN, to make clear that it's about the the > write path. > ok, thanks for your advice. > > > > #define MTK_IOMMU_HAS_FLAG(pdata, _x) \ > > pdata)->flags) & (_x)) == (_x)) > >
Re: [PATCH v5 07/10] iommu/mediatek: Add REG_MMU_WR_LEN register definition
On Mon, 2020-06-29 at 12:16 +0200, Matthias Brugger wrote: > > On 29/06/2020 09:13, Chao Hao wrote: > > Some platforms(ex: mt6779) need to improve performance by setting > > REG_MMU_WR_LEN register. And we can use WR_THROT_EN macro to control > > whether we need to set the register. If the register uses default value, > > iommu will send command to EMI without restriction, when the number of > > commands become more and more, it will drop the EMI performance. So when > > more than ten_commands(default value) don't be handled for EMI, iommu will > > stop send command to EMI for keeping EMI's performace by enabling write > > throttling mechanism(bit[5][21]=0) in MMU_WR_LEN_CTRL register. > > > > Cc: Matthias Brugger > > Signed-off-by: Chao Hao > > --- > > drivers/iommu/mtk_iommu.c | 10 ++ > > drivers/iommu/mtk_iommu.h | 2 ++ > > 2 files changed, 12 insertions(+) > > > > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c > > index ec1f86913739..92316c4175a9 100644 > > --- a/drivers/iommu/mtk_iommu.c > > +++ b/drivers/iommu/mtk_iommu.c > > @@ -46,6 +46,8 @@ > > #define F_MMU_STANDARD_AXI_MODE_BIT(BIT(3) | BIT(19)) > > > > #define REG_MMU_DCM_DIS0x050 > > +#define REG_MMU_WR_LEN 0x054 > > The register name is confusing. For me it seems to describe the length of a > write but it is used for controlling the write throttling. Is this the name > that's used in the datasheet? > Thanks for your review carefully, we can name it to REG_MMU_WR_LEN_CTRL > > +#define F_MMU_WR_THROT_DIS_BIT (BIT(5) | BIT(21)) > > There are two spaces between '|' and 'BIT(21)', should be one. > > Regarding the name of the define, what does the 'F_' statnds for? F_ is used to described some bits in register and doesn't have other meanings. The format is refer to other bits definition > Also I think > it should be called '_MASK' instead of '_BIT' as it defines a mask of bits. > Thanks for your advice. 
For F_MMU_WR_THROT_DIS_BIT: 1'b0: Enable write throttling mechanism 1'b1: Disable write throttling mechanism So I think we can name "F_MMU_WR_THROT_DIS BIT(5) | BIT(21)" directly, it maybe more clearer. > Regards, > Matthias > > > > > #define REG_MMU_CTRL_REG 0x110 > > #define F_MMU_TF_PROT_TO_PROGRAM_ADDR (2 << 4) > > @@ -582,6 +584,12 @@ static int mtk_iommu_hw_init(const struct > > mtk_iommu_data *data) > > writel_relaxed(regval, data->base + REG_MMU_VLD_PA_RNG); > > } > > writel_relaxed(0, data->base + REG_MMU_DCM_DIS); > > + if (MTK_IOMMU_HAS_FLAG(data->plat_data, WR_THROT_EN)) { > > + /* write command throttling mode */ > > + regval = readl_relaxed(data->base + REG_MMU_WR_LEN); > > + regval &= ~F_MMU_WR_THROT_DIS_BIT; > > + writel_relaxed(regval, data->base + REG_MMU_WR_LEN); > > + } > > > > regval = readl_relaxed(data->base + REG_MMU_MISC_CTRL); > > if (MTK_IOMMU_HAS_FLAG(data->plat_data, RESET_AXI)) { > > @@ -737,6 +745,7 @@ static int __maybe_unused mtk_iommu_suspend(struct > > device *dev) > > struct mtk_iommu_suspend_reg *reg = &data->reg; > > void __iomem *base = data->base; > > > > + reg->wr_len = readl_relaxed(base + REG_MMU_WR_LEN); > > reg->misc_ctrl = readl_relaxed(base + REG_MMU_MISC_CTRL); > > reg->dcm_dis = readl_relaxed(base + REG_MMU_DCM_DIS); > > reg->ctrl_reg = readl_relaxed(base + REG_MMU_CTRL_REG); > > @@ -761,6 +770,7 @@ static int __maybe_unused mtk_iommu_resume(struct > > device *dev) > > dev_err(data->dev, "Failed to enable clk(%d) in resume\n", ret); > > return ret; > > } > > + writel_relaxed(reg->wr_len, base + REG_MMU_WR_LEN); > > writel_relaxed(reg->misc_ctrl, base + REG_MMU_MISC_CTRL); > > writel_relaxed(reg->dcm_dis, base + REG_MMU_DCM_DIS); > > writel_relaxed(reg->ctrl_reg, base + REG_MMU_CTRL_REG); > > diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h > > index be6d32ee5bda..ce4f4e8f03aa 100644 > > --- a/drivers/iommu/mtk_iommu.h > > +++ b/drivers/iommu/mtk_iommu.h > > @@ -24,6 +24,7 @@ > > #define RESET_AXI 
BIT(3) > > #define OUT_ORDER_EN BIT(4) > > #define HAS_SUB_COMM BIT(5) > > +#define WR_THROT_ENBIT(6) > > > > #define MTK_IOMMU_HAS_FLAG(pdata, _x) \ > > pdata)->flags) & (_x)) == (_x)) > > @@ -36,6 +37,7 @@ struct mtk_iommu_suspend_reg { > > u32 int_main_control; > > u32 ivrp_paddr; > > u32 vld_pa_rng; > > + u32 wr_len; > > }; > > > > enum mtk_iommu_plat { > >
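The write throttling change discussed above is a plain read-modify-write on one register. A minimal user-space sketch, assuming the mask is eventually named with a _MASK suffix as the review suggests (that name and the helper wr_len_enable_throttle() are hypothetical):

```c
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))

/* Hypothetical name; the posted patch calls this F_MMU_WR_THROT_DIS_BIT. */
#define F_MMU_WR_THROT_DIS_MASK	(BIT(5) | BIT(21))

/*
 * Hypothetical helper: clearing bit[5] and bit[21] enables the write
 * throttling mechanism (1'b0 = enabled, 1'b1 = disabled). All other
 * bits of the register are preserved.
 */
static inline uint32_t wr_len_enable_throttle(uint32_t current)
{
	return current & ~F_MMU_WR_THROT_DIS_MASK;
}
```

Because the helper only clears bits, applying it twice gives the same value as applying it once, which is what the restore in the resume path relies on after the saved value is written back.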
Re: [PATCH v5 09/10] iommu/mediatek: Modify MMU_CTRL register setting
On Mon, 2020-06-29 at 12:28 +0200, Matthias Brugger wrote:
>
> On 29/06/2020 09:13, Chao Hao wrote:
> > MT8173 is different from other SoCs for MMU_CTRL register.
> > For mt8173, its bit9 is in_order_write_en and doesn't use its
> > default 1'b1.
> > For other SoCs, bit[12] represents victim_tlb_en feature and
> > victim_tlb is enabled by default (bit[12]=1), if we use
> > "regval = F_MMU_TF_PROT_TO_PROGRAM_ADDR", victim_tlb will be
> > disabled, it will drop iommu performance.
> > So we need to deal with the setting of MMU_CTRL separately
> > for mt8173 and others.
> >
>
> My proposal to rewrite the commit message:
>
> The MMU_CTRL register of MT8173 is different from other SoCs. The
> in_order_wr_en is bit[9] which is zero by default.
> Other SoCs have the victim_tlb_en feature mapped to bit[12]. This bit is
> set to one by default. We need to preserve the bit when setting
> F_MMU_TF_PROT_TO_PROGRAM_ADDR as otherwise the bit will be cleared and IOMMU
> performance will drop.

got it, thanks very much for your advice.

>
> > Suggested-by: Matthias Brugger
> > Suggested-by: Yong Wu
> > Signed-off-by: Chao Hao
> > ---
> >  drivers/iommu/mtk_iommu.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> > index 8299a3299090..e46e2deee3fd 100644
> > --- a/drivers/iommu/mtk_iommu.c
> > +++ b/drivers/iommu/mtk_iommu.c
> > @@ -543,11 +543,12 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data)
> >  		return ret;
> >  	}
> >
> > +	regval = readl_relaxed(data->base + REG_MMU_CTRL_REG);
>
> The read is only needed in the else branch.
>

ok, thanks

> >  	if (data->plat_data->m4u_plat == M4U_MT8173)
> >  		regval = F_MMU_PREFETCH_RT_REPLACE_MOD |
> >  			 F_MMU_TF_PROT_TO_PROGRAM_ADDR_MT8173;
> >  	else
> > -		regval = F_MMU_TF_PROT_TO_PROGRAM_ADDR;
> > +		regval |= F_MMU_TF_PROT_TO_PROGRAM_ADDR;
> >  	writel_relaxed(regval, data->base + REG_MMU_CTRL_REG);
> >
> >  	regval = F_L2_MULIT_HIT_EN |
> >
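To make the difference between "=" and "|=" in this patch concrete, here is a small user-space sketch. F_MMU_TF_PROT_TO_PROGRAM_ADDR uses the value from the driver (2 << 4); the name F_MMU_VICTIM_TLB_EN for bit[12] and both helper functions are assumptions for illustration only.

```c
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))

#define F_MMU_TF_PROT_TO_PROGRAM_ADDR	(2 << 4)
/* Hypothetical name for the victim_tlb_en bit described in the thread. */
#define F_MMU_VICTIM_TLB_EN		BIT(12)

/* Old behaviour: the register content is replaced, bit[12] is lost. */
static inline uint32_t mmu_ctrl_overwrite(uint32_t current)
{
	(void)current;	/* the previous value is ignored on purpose */
	return F_MMU_TF_PROT_TO_PROGRAM_ADDR;
}

/* New behaviour: the field is OR-ed in, so bit[12] keeps its value. */
static inline uint32_t mmu_ctrl_preserve(uint32_t current)
{
	return current | F_MMU_TF_PROT_TO_PROGRAM_ADDR;
}
```

Starting from a register value of 0x1000 (only bit[12] set, as the thread describes for the default), the first helper returns 0x20 and the second returns 0x1020.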
Re: [PATCH v5 06/10] iommu/mediatek: Add sub_comm id in translation fault
On Tue, 2020-06-30 at 18:55 +0800, Yong Wu wrote:
> On Mon, 2020-06-29 at 15:13 +0800, Chao Hao wrote:
> > The max larb number that a iommu HW support is 8(larb0~larb7 in the below
> > diagram).
> > If the larb's number is over 8, we use a sub_common for merging
> > several larbs into one larb. At this case, we will extend larb_id:
> > bit[11:9] means common-id;
> > bit[8:7] means subcommon-id;
> > From these two variables, we could get the real larb number when
> > translation fault happen.
> > The diagram is as below:
> >                  EMI
> >                   |
> >                 IOMMU
> >                   |
> >           -----------------
> >           |               |
> >        common1         common0
> >           |               |
> >           -----------------
> >                   |
> >              smi common
> >                   |
> >   ------------------------------------
> >   |       |       |       |     |    |
> >  3'd0    3'd1    3'd2    3'd3  ...  3'd7   <-common_id(max is 8)
> >   |       |       |       |     |    |
> > Larb0   Larb1     |     Larb3  ...  Larb7
> >                   |
> >             smi sub common
> >                   |
> >      --------------------------
> >      |        |       |       |
> >     2'd0     2'd1    2'd2    2'd3   <-sub_common_id(max is 4)
> >      |        |       |       |
> >    Larb8    Larb9   Larb10  Larb11
> >
> > In this patch we extend larb_remap[] to larb_remap[8][4] for this.
> > larb_remap[x][y]: x means common-id above, y means subcommon_id above.
> >
> > We can also distinguish if the M4U HW has sub_common by HAS_SUB_COMM
> > macro.
> >
> > Cc: Matthias Brugger
> > Signed-off-by: Chao Hao
> > Reviewed-by: Yong Wu
> > ---
> >  drivers/iommu/mtk_iommu.c  | 20 +++++++++++++-------
> >  drivers/iommu/mtk_iommu.h  |  3 ++-
> >  include/soc/mediatek/smi.h |  2 ++
> >  3 files changed, 17 insertions(+), 8 deletions(-)
>
> [snip]
>
> > @@ -48,7 +49,7 @@ struct mtk_iommu_plat_data {
> >  	enum mtk_iommu_plat m4u_plat;
> >  	u32 flags;
> >  	u32 inv_sel_reg;
> > -	unsigned char larbid_remap[MTK_LARB_NR_MAX];
> > +	unsigned char larbid_remap[MTK_LARB_COM_MAX][MTK_LARB_SUBCOM_MAX];
> >  };
> >
> >  struct mtk_iommu_domain;
> > diff --git a/include/soc/mediatek/smi.h b/include/soc/mediatek/smi.h
> > index 5a34b87d89e3..fa65a55468e2 100644
> > --- a/include/soc/mediatek/smi.h
> > +++ b/include/soc/mediatek/smi.h
> > @@ -12,6 +12,8 @@
> >  #ifdef CONFIG_MTK_SMI
> >
> >  #define MTK_LARB_NR_MAX		16
> > +#define MTK_LARB_COM_MAX	8
> > +#define MTK_LARB_SUBCOM_MAX	4
>
> Both are only used in mtk_iommu.h, and I don't think smi has plan to use
> them. thus we could move them into mtk_iommu.h
>

ok, got it. Thanks for your advice.

> >
> >  #define MTK_SMI_MMU_EN(port)	BIT(port)
> >
> >
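The larb lookup described in this commit message can be tried out in user space. The sketch below uses the field extractors posted in the v4 version of this patch and the remap table from the mt6779_data entry in the v4 7/7 patch; the helper names fault_larb() and fault_port() are assumptions for illustration, and the real ISR differs in detail.

```c
#include <stdbool.h>
#include <stdint.h>

/* Field extractors as posted in the v4 patch of this thread. */
#define F_MMU_INT_ID_COMM_ID(a)		(((a) >> 9) & 0x7)
#define F_MMU_INT_ID_SUB_COMM_ID(a)	(((a) >> 7) & 0x3)
#define F_MMU_INT_ID_LARB_ID(a)		(((a) >> 7) & 0x7)
#define F_MMU_INT_ID_PORT_ID(a)		(((a) >> 2) & 0x1f)

#define MTK_LARB_COM_MAX	8
#define MTK_LARB_SUBCOM_MAX	4

/* Remap table copied from the mt6779_data entry (v4 7/7). */
static const unsigned char larbid_remap[MTK_LARB_COM_MAX][MTK_LARB_SUBCOM_MAX] = {
	{0}, {1}, {2}, {3}, {5}, {7, 8}, {10}, {9},
};

/* Hypothetical helper: map the raw INT_ID value to the real larb number. */
static inline unsigned int fault_larb(uint32_t int_id, bool has_sub_comm)
{
	unsigned int comm, sub_comm = 0;

	if (has_sub_comm) {
		comm = F_MMU_INT_ID_COMM_ID(int_id);		/* bit[11:9] */
		sub_comm = F_MMU_INT_ID_SUB_COMM_ID(int_id);	/* bit[8:7] */
	} else {
		comm = F_MMU_INT_ID_LARB_ID(int_id);
	}
	return larbid_remap[comm][sub_comm];
}

/* Hypothetical helper: the port id sits in bit[6:2] in both layouts. */
static inline unsigned int fault_port(uint32_t int_id)
{
	return F_MMU_INT_ID_PORT_ID(int_id);
}
```

For example, an INT_ID of 0xa8c decodes to common-id 5, subcommon-id 1 and port 3, which the table maps to larb 8.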
Re: [PATCH v5 03/10] iommu/mediatek: Modify the usage of mtk_iommu_plat_data structure
On Tue, 2020-06-30 at 18:56 +0800, Yong Wu wrote: > Hi Chao, > > This is also ok for me. Only two format nitpick. > > On Mon, 2020-06-29 at 15:13 +0800, Chao Hao wrote: > > Given the fact that we are adding more and more plat_data bool values, > > it would make sense to use a u32 flags register and add the appropriate > > macro definitions to set and check for a flag present. > > No functional change. > > > > Suggested-by: Matthias Brugger > > Signed-off-by: Chao Hao > > --- > > [snip] > > > static const struct mtk_iommu_plat_data mt2712_data = { > > .m4u_plat = M4U_MT2712, > > - .has_4gb_mode = true, > > - .has_bclk = true, > > - .has_vld_pa_rng = true, > > + .flags= HAS_4GB_MODE | > > + HAS_BCLK | > > + HAS_VLD_PA_RNG, > > short enough. we can put it in one line? ok, I will try to put it in one line in next version, thanks > > > .larbid_remap = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}, > > }; > > > > static const struct mtk_iommu_plat_data mt8173_data = { > > .m4u_plat = M4U_MT8173, > > - .has_4gb_mode = true, > > - .has_bclk = true, > > - .reset_axi= true, > > + .flags= HAS_4GB_MODE | > > + HAS_BCLK | > > + RESET_AXI, > > .larbid_remap = {0, 1, 2, 3, 4, 5}, /* Linear mapping. */ > > }; > > > > static const struct mtk_iommu_plat_data mt8183_data = { > > .m4u_plat = M4U_MT8183, > > - .reset_axi= true, > > + .flags= RESET_AXI, > > .larbid_remap = {0, 4, 5, 6, 7, 2, 3, 1}, > > }; > > > > diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h > > index 1b6ea839b92c..7cc39f729263 100644 > > --- a/drivers/iommu/mtk_iommu.h > > +++ b/drivers/iommu/mtk_iommu.h > > @@ -17,6 +17,15 @@ > > #include > > #include > > > > +#define HAS_4GB_MODE BIT(0) > > +/* HW will use the EMI clock if there isn't the "bclk". 
*/ > > +#define HAS_BCLK BIT(1) > > +#define HAS_VLD_PA_RNG BIT(2) > > +#define RESET_AXI BIT(3) > > + > > +#define MTK_IOMMU_HAS_FLAG(pdata, _x) \ > > + pdata)->flags) & (_x)) == (_x)) > > If these definitions are not used in mtk_iommu_v1.c(also no this plan), > then we can put them in the mtk_iommu.c. > ok, mtk_iommu_v1.c doesn't use these definitions. I will move them to mtk_iommu.c in next version, thanks. > > BTW, the patch title "modify the usage of mtk_iommu_plat_data structure" > isn't so clear, we could write what the detailed modification is. > something like: > iommu/mediatek: Use a u32 flags to describe different HW features > got it , thanks for you advice. > > + > > struct mtk_iommu_suspend_reg { > > u32 misc_ctrl; > > u32 dcm_dis; > > @@ -36,12 +45,7 @@ enum mtk_iommu_plat { > > > > struct mtk_iommu_plat_data { > > enum mtk_iommu_plat m4u_plat; > > - boolhas_4gb_mode; > > - > > - /* HW will use the EMI clock if there isn't the "bclk". */ > > - boolhas_bclk; > > - boolhas_vld_pa_rng; > > - boolreset_axi; > > + u32 flags; > > unsigned char larbid_remap[MTK_LARB_NR_MAX]; > > }; > > > >
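A short user-space sketch of how the flag check from this patch behaves. The flag values and the MTK_IOMMU_HAS_FLAG() body follow the patch (with the parentheses balanced); struct plat_data is a cut-down stand-in for mtk_iommu_plat_data and plat_has() is a hypothetical wrapper added so the macro can be exercised as a function.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))

#define HAS_4GB_MODE	BIT(0)
#define HAS_BCLK	BIT(1)
#define HAS_VLD_PA_RNG	BIT(2)
#define RESET_AXI	BIT(3)

/* True only if every bit in _x is set in pdata->flags. */
#define MTK_IOMMU_HAS_FLAG(pdata, _x) \
		((((pdata)->flags) & (_x)) == (_x))

/* Stand-in for struct mtk_iommu_plat_data, reduced to the flags word. */
struct plat_data {
	uint32_t flags;
};

/* Hypothetical wrapper around the macro. */
static inline bool plat_has(const struct plat_data *pdata, uint32_t mask)
{
	return MTK_IOMMU_HAS_FLAG(pdata, mask);
}
```

Because the macro compares against (_x) instead of testing for a non-zero result, passing a combination such as HAS_BCLK | RESET_AXI only matches a platform that sets both flags. With flags set to HAS_4GB_MODE | HAS_BCLK | HAS_VLD_PA_RNG, as in mt2712_data, that combined check is false while a check for HAS_BCLK alone is true.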
Re: [PATCH v6 03/10] iommu/mediatek: Use a u32 flags to describe different HW features
Hi Matthias and Yingjoe, Thanks for your comments! On Mon, 2020-07-06 at 17:17 +0200, Matthias Brugger wrote: > > On 04/07/2020 03:16, Yingjoe Chen wrote: > > On Fri, 2020-07-03 at 12:41 +0800, Chao Hao wrote: > >> Given the fact that we are adding more and more plat_data bool values, > >> it would make sense to use a u32 flags register and add the appropriate > >> macro definitions to set and check for a flag present. > >> No functional change. > >> > >> Cc: Yong Wu > >> Suggested-by: Matthias Brugger > >> Signed-off-by: Chao Hao > >> Reviewed-by: Matthias Brugger > >> --- > >> drivers/iommu/mtk_iommu.c | 28 +--- > >> drivers/iommu/mtk_iommu.h | 7 +-- > >> 2 files changed, 18 insertions(+), 17 deletions(-) > >> > >> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c > >> index 88d3df5b91c2..40ca564d97af 100644 > >> --- a/drivers/iommu/mtk_iommu.c > >> +++ b/drivers/iommu/mtk_iommu.c > >> @@ -100,6 +100,15 @@ > >> #define MTK_M4U_TO_LARB(id) (((id) >> 5) & 0xf) > >> #define MTK_M4U_TO_PORT(id) ((id) & 0x1f) > >> > >> +#define HAS_4GB_MODE BIT(0) > >> +/* HW will use the EMI clock if there isn't the "bclk". */ > >> +#define HAS_BCLK BIT(1) > >> +#define HAS_VLD_PA_RNGBIT(2) > >> +#define RESET_AXI BIT(3) > >> + > >> +#define MTK_IOMMU_HAS_FLAG(pdata, _x) \ > >> + pdata)->flags) & (_x)) == (_x)) > >> + > >> struct mtk_iommu_domain { > >>struct io_pgtable_cfg cfg; > >>struct io_pgtable_ops *iop; > >> @@ -563,7 +572,8 @@ static int mtk_iommu_hw_init(const struct > >> mtk_iommu_data *data) > >> upper_32_bits(data->protect_base); > >>writel_relaxed(regval, data->base + REG_MMU_IVRP_PADDR); > >> > >> - if (data->enable_4GB && data->plat_data->has_vld_pa_rng) { > >> + if (data->enable_4GB && > >> + MTK_IOMMU_HAS_FLAG(data->plat_data, HAS_VLD_PA_RNG)) { > >>/* > >> * If 4GB mode is enabled, the validate PA range is from > >> * 0x1__ to 0x1__. here record bit[32:30]. 
> >> @@ -573,7 +583,7 @@ static int mtk_iommu_hw_init(const struct > >> mtk_iommu_data *data) > >>} > >>writel_relaxed(0, data->base + REG_MMU_DCM_DIS); > >> > >> - if (data->plat_data->reset_axi) { > >> + if (MTK_IOMMU_HAS_FLAG(data->plat_data, RESET_AXI)) { > >>/* The register is called STANDARD_AXI_MODE in this case */ > >>writel_relaxed(0, data->base + REG_MMU_MISC_CTRL); > >>} > >> @@ -618,7 +628,7 @@ static int mtk_iommu_probe(struct platform_device > >> *pdev) > >> > >>/* Whether the current dram is over 4GB */ > >>data->enable_4GB = !!(max_pfn > (BIT_ULL(32) >> PAGE_SHIFT)); > >> - if (!data->plat_data->has_4gb_mode) > >> + if (!MTK_IOMMU_HAS_FLAG(data->plat_data, HAS_4GB_MODE)) > >>data->enable_4GB = false; > >> > >>res = platform_get_resource(pdev, IORESOURCE_MEM, 0); > >> @@ -631,7 +641,7 @@ static int mtk_iommu_probe(struct platform_device > >> *pdev) > >>if (data->irq < 0) > >>return data->irq; > >> > >> - if (data->plat_data->has_bclk) { > >> + if (MTK_IOMMU_HAS_FLAG(data->plat_data, HAS_BCLK)) { > >>data->bclk = devm_clk_get(dev, "bclk"); > >>if (IS_ERR(data->bclk)) > >>return PTR_ERR(data->bclk); > >> @@ -763,23 +773,19 @@ static const struct dev_pm_ops mtk_iommu_pm_ops = { > >> > >> static const struct mtk_iommu_plat_data mt2712_data = { > >>.m4u_plat = M4U_MT2712, > >> - .has_4gb_mode = true, > >> - .has_bclk = true, > >> - .has_vld_pa_rng = true, > >> + .flags= HAS_4GB_MODE | HAS_BCLK | HAS_VLD_PA_RNG, > >>.larbid_remap = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}, > >> }; > >> > >> static const struct mtk_iommu_plat_data mt8173_data = { > >>.m4u_plat = M4U_MT8173, > >> - .has_4gb_mode = true, > >> - .has_bclk = tr
Re: [PATCH v4 5/7] iommu/mediatek: Add sub_comm id in translation fault
On Wed, 2020-06-17 at 19:11 +0800, Yong Wu wrote: > Hi Matthias, > > Thanks very much for your review. > > On Wed, 2020-06-17 at 11:17 +0200, Matthias Brugger wrote: > > > > On 17/06/2020 05:00, Chao Hao wrote: > > > The max larb number that a iommu HW support is 8(larb0~larb7 in the below > > > diagram). > > > If the larb's number is over 8, we use a sub_common for merging > > > several larbs into one larb. At this case, we will extend larb_id: > > > bit[11:9] means common-id; > > > bit[8:7] means subcommon-id; > > > From these two variable, we could get the real larb number when > > > translation fault happen. > > > The diagram is as below: > > >EMI > > > | > > > IOMMU > > > | > > >- > > > | | > > > common1 common0 > > > | | > > > - > > > | > > > smi common > > > | > > > > > > | | | | || > > > 3'd03'd13'd23'd3 ... 3'd7 <-common_id(max is 8) > > > | | | | || > > > Larb0 Larb1 | Larb3 ... Larb7 > > > | > > > smi sub common > > > | > > > -- > > > || | | > > > 2'd0 2'd12'd22'd3 <-sub_common_id(max is 4) > > > || | | > > >Larb8Larb9 Larb10 Larb11 > > > > > > In this patch we extern larb_remap[] to larb_remap[8][4] for this. > > > > extern -> extend > > > > > larb_remap[x][y]: x mean common-id above, y means subcommon_id above. > > > > mean -> means > > > > > > > > We can also distinguish if the M4U HW has sub_common by has_sub_comm > > > property. 
> > > > > > Signed-off-by: Chao Hao > > > Reviewed-by: Yong Wu > > > --- > > > drivers/iommu/mtk_iommu.c | 20 +--- > > > drivers/iommu/mtk_iommu.h | 3 ++- > > > 2 files changed, 15 insertions(+), 8 deletions(-) > > > > > > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c > > > index f23919feba4e..a687e8db0e51 100644 > > > --- a/drivers/iommu/mtk_iommu.c > > > +++ b/drivers/iommu/mtk_iommu.c > > > @@ -91,6 +91,8 @@ > > > #define REG_MMU1_INVLD_PA0x148 > > > #define REG_MMU0_INT_ID 0x150 > > > #define REG_MMU1_INT_ID 0x154 > > > +#define F_MMU_INT_ID_COMM_ID(a) (((a) >> 9) & 0x7) > > > +#define F_MMU_INT_ID_SUB_COMM_ID(a) (((a) >> 7) & 0x3) > > > #define F_MMU_INT_ID_LARB_ID(a) (((a) >> 7) & 0x7) > > > #define F_MMU_INT_ID_PORT_ID(a) (((a) >> 2) & 0x1f) > > > > > > @@ -229,7 +231,7 @@ static irqreturn_t mtk_iommu_isr(int irq, void > > > *dev_id) > > > struct mtk_iommu_data *data = dev_id; > > > struct mtk_iommu_domain *dom = data->m4u_dom; > > > u32 int_state, regval, fault_iova, fault_pa; > > > - unsigned int fault_larb, fault_port; > > > + unsigned int fault_larb, fault_port, sub_comm = 0; > > > bool layer, write; > > > > > > /* Read error info from registers */ > > > @@ -245,10 +247,14 @@ static irqreturn_t mtk_iommu_isr(int irq, void > > > *dev_id) > > > } > > > layer = fault_iova & F_MMU_FAULT_VA_LAYER_BIT; > > > write = fault_iova & F_MMU_FAULT_VA_WRITE_BIT; > > > - fault_larb = F_MMU_INT_ID_LARB_ID(regval); > > > fault_port = F_MMU_INT_ID_PORT_ID(regval); > > > - > > > - fault_larb = data->plat_data->larbid_remap[fault_larb]; > > > + if (data->plat_data->has_sub_comm) { > > > + fault_larb = F_MMU_INT_ID_COMM_ID(regval); > > > + sub_comm = F_MMU_INT_ID_SUB_COMM_ID(regval); > > > + } else { > > > + fault_larb = F_MMU_INT_ID_LARB_ID(regval); > > > + } > > > + fault_larb = data->plat_data->larbid_remap[fault_larb][sub_comm]; > > > > > > if (report_iommu_fault(&dom->domain, data->dev, fau
Re: [PATCH v4 3/7] iommu/mediatek: Set MISC_CTRL register
On Wed, 2020-06-17 at 11:34 +0200, Matthias Brugger wrote: > > On 17/06/2020 05:00, Chao Hao wrote: > > Add F_MMU_IN_ORDER_WR_EN definition in MISC_CTRL. > > In order to improve performance, we always disable STANDARD_AXI_MODE > > and IN_ORDER_WR_EN in MISC_CTRL. > > > > Change since v3: > > The changelog should go below the '---' as we don't want this in the git > history > once the patch get's accepted. > okok, thanks > > 1. Rename Disable STANDARD_AXI_MODE in MISC_CTRL to Set MISC_CTRL register > > 2. Add F_MMU_IN_DRDER_WR_EN definition in MISC_CTRL > >We need to disable in_order_write to improve performance > > > > Cc: Yong Wu > > Signed-off-by: Chao Hao > > --- > > drivers/iommu/mtk_iommu.c | 11 +++ > > drivers/iommu/mtk_iommu.h | 1 + > > 2 files changed, 12 insertions(+) > > > > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c > > index 88d3df5b91c2..239d2cdbbc9f 100644 > > --- a/drivers/iommu/mtk_iommu.c > > +++ b/drivers/iommu/mtk_iommu.c > > @@ -42,6 +42,9 @@ > > #define F_INVLD_EN1BIT(1) > > > > #define REG_MMU_MISC_CTRL 0x048 > > +#define F_MMU_IN_ORDER_WR_EN (BIT(1) | BIT(17)) > > +#define F_MMU_STANDARD_AXI_MODE_BIT(BIT(3) | BIT(19)) > > + > > #define REG_MMU_DCM_DIS0x050 > > > > #define REG_MMU_CTRL_REG 0x110 > > @@ -578,6 +581,14 @@ static int mtk_iommu_hw_init(const struct > > mtk_iommu_data *data) > > writel_relaxed(0, data->base + REG_MMU_MISC_CTRL); > > } > > > > + if (data->plat_data->has_misc_ctrl) { > > That's confusing. We renamed the register to misc_ctrl, but it's present in > all > SoCs. We should find a better name for this flag to describe what the hardware > supports. > ok, thanks for you advice, I will rename it in next version. 
ex:has_perf_req(has performance requirement) > Regards, > Matthias > > > + /* For mm_iommu, it can improve performance by the setting */ > > + regval = readl_relaxed(data->base + REG_MMU_MISC_CTRL); > > + regval &= ~F_MMU_STANDARD_AXI_MODE_BIT; > > + regval &= ~F_MMU_IN_ORDER_WR_EN; > > + writel_relaxed(regval, data->base + REG_MMU_MISC_CTRL); > > + } > > + > > if (devm_request_irq(data->dev, data->irq, mtk_iommu_isr, 0, > > dev_name(data->dev), (void *)data)) { > > writel_relaxed(0, data->base + REG_MMU_PT_BASE_ADDR); > > diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h > > index 1b6ea839b92c..d711ac630037 100644 > > --- a/drivers/iommu/mtk_iommu.h > > +++ b/drivers/iommu/mtk_iommu.h > > @@ -40,6 +40,7 @@ struct mtk_iommu_plat_data { > > > > /* HW will use the EMI clock if there isn't the "bclk". */ > > boolhas_bclk; > > + boolhas_misc_ctrl; > > boolhas_vld_pa_rng; > > boolreset_axi; > > unsigned char larbid_remap[MTK_LARB_NR_MAX]; > >
Re: [PATCH v4 7/7] iommu/mediatek: Add mt6779 basic support
On Wed, 2020-06-17 at 11:33 +0200, Matthias Brugger wrote: > > On 17/06/2020 05:00, Chao Hao wrote: > > 1. Start from mt6779, INVLDT_SEL move to offset=0x2c, so we add > >REG_MMU_INV_SEL_GEN2 definition and mt6779 uses it. > > 2. Change PROTECT_PA_ALIGN from 128 byte to 256 byte. > > 3. For REG_MMU_CTRL_REG register, we only need to change bit[2:0], > >others bits keep default value, ex: enable victim tlb. > > 4. Add mt6779_data to support mm_iommu HW init. > > > > Change since v3: > > 1. When setting MMU_CTRL_REG, we don't need to include mt8173. > > > > Cc: Yong Wu > > Signed-off-by: Chao Hao > > --- > > drivers/iommu/mtk_iommu.c | 20 ++-- > > drivers/iommu/mtk_iommu.h | 1 + > > 2 files changed, 19 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c > > index c706bca6487e..def2e996683f 100644 > > --- a/drivers/iommu/mtk_iommu.c > > +++ b/drivers/iommu/mtk_iommu.c > > @@ -37,6 +37,11 @@ > > #define REG_MMU_INVLD_START_A 0x024 > > #define REG_MMU_INVLD_END_A0x028 > > > > +/* In latest Coda, MMU_INV_SEL's offset is changed to 0x02c. > > + * So we named offset = 0x02c to "REG_MMU_INV_SEL_GEN2" > > + * and offset = 0x038 to "REG_MMU_INV_SEL_GEN1". > > + */ > > Please delete the comment, this should be understandable from the git history ok, thanks > > > +#define REG_MMU_INV_SEL_GEN2 0x02c > > #define REG_MMU_INV_SEL_GEN1 0x038 > > #define F_INVLD_EN0BIT(0) > > #define F_INVLD_EN1BIT(1) > > @@ -98,7 +103,7 @@ > > #define F_MMU_INT_ID_LARB_ID(a)(((a) >> 7) & 0x7) > > #define F_MMU_INT_ID_PORT_ID(a)(((a) >> 2) & 0x1f) > > > > -#define MTK_PROTECT_PA_ALIGN 128 > > +#define MTK_PROTECT_PA_ALIGN 256 > > Do we need 512 bytes for all gen2 IOMMUs? > I'm not sure if we should add this in plat_data or if we should just bump up > the > value for all SoCs. > In both cases this should be a separate patch. > From mt6779, MTK_PROTECT_PA_ALIGN is extend to 256 bytes and don't be changed for a long time from our HW designer comment. 
The legacy iommu also can use it, mabye it doesn't set it by platform. > > > > /* > > * Get the local arbiter ID and the portid within the larb arbiter > > @@ -543,11 +548,12 @@ static int mtk_iommu_hw_init(const struct > > mtk_iommu_data *data) > > return ret; > > } > > > > + regval = readl_relaxed(data->base + REG_MMU_CTRL_REG); > > if (data->plat_data->m4u_plat == M4U_MT8173) > > regval = F_MMU_PREFETCH_RT_REPLACE_MOD | > > F_MMU_TF_PROT_TO_PROGRAM_ADDR_MT8173; > > else > > - regval = F_MMU_TF_PROT_TO_PROGRAM_ADDR; > > + regval |= F_MMU_TF_PROT_TO_PROGRAM_ADDR; > > Why do we change this, is it that the bootloader for mt6779 set some values in > the register we have to keep? In this case I think we should update the regval > accordingly. For REG_MMU_CTRL_REG, bit[12] represents victim_tlb_en feature and victim_tlb is enable defaultly(bit[12]=1),but if we use "regval = F_MMU_TF_PROT_TO_PROGRAM_ADDR", victim_tlb will disable, it will drop iommu performace for mt6779 > > > writel_relaxed(regval, data->base + REG_MMU_CTRL_REG); > > > > regval = F_L2_MULIT_HIT_EN | > > @@ -797,6 +803,15 @@ static const struct mtk_iommu_plat_data mt2712_data = { > > .larbid_remap = {{0}, {1}, {2}, {3}, {4}, {5}, {6}, {7}}, > > }; > > > > +static const struct mtk_iommu_plat_data mt6779_data = { > > + .m4u_plat = M4U_MT6779, > > + .has_sub_comm = true, > > + .has_wr_len= true, > > + .has_misc_ctrl = true, > > + .inv_sel_reg = REG_MMU_INV_SEL_GEN2, > > + .larbid_remap = {{0}, {1}, {2}, {3}, {5}, {7, 8}, {10}, {9}}, > > +}; > > + > > static const struct mtk_iommu_plat_data mt8173_data = { > > .m4u_plat = M4U_MT8173, > > .has_4gb_mode = true, > > @@ -815,6 +830,7 @@ static const struct mtk_iommu_plat_data mt8183_data = { > > > > static const struct of_device_id mtk_iommu_of_ids[] = { > > { .compatible = "mediatek,mt2712-m4u", .data = &mt2712_data}, > > + { .compatible = "mediatek,mt6779-m4u", .data = &mt6779_data}, > > { .compatible = "mediatek,mt8173-m4u", .data = &mt8173_data}, > > { 
.compatible = "mediatek,mt8183-m4u", .data = &mt8183_data}, > > {} > > diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h > > index 9971cedd72ea..fb79e710c8d9 100644 > > --- a/drivers/iommu/mtk_iommu.h > > +++ b/drivers/iommu/mtk_iommu.h > > @@ -31,6 +31,7 @@ struct mtk_iommu_suspend_reg { > > enum mtk_iommu_plat { > > M4U_MT2701, > > M4U_MT2712, > > + M4U_MT6779, > > M4U_MT8173, > > M4U_MT8183, > > }; > >
Re: [PATCH v4 7/7] iommu/mediatek: Add mt6779 basic support
On Thu, 2020-06-18 at 18:00 +0200, Matthias Brugger wrote: > > On 18/06/2020 13:54, chao hao wrote: > > On Wed, 2020-06-17 at 11:33 +0200, Matthias Brugger wrote: > >> > >> On 17/06/2020 05:00, Chao Hao wrote: > >>> 1. Start from mt6779, INVLDT_SEL move to offset=0x2c, so we add > >>>REG_MMU_INV_SEL_GEN2 definition and mt6779 uses it. > >>> 2. Change PROTECT_PA_ALIGN from 128 byte to 256 byte. > >>> 3. For REG_MMU_CTRL_REG register, we only need to change bit[2:0], > >>>others bits keep default value, ex: enable victim tlb. > >>> 4. Add mt6779_data to support mm_iommu HW init. > >>> > >>> Change since v3: > >>> 1. When setting MMU_CTRL_REG, we don't need to include mt8173. > >>> > >>> Cc: Yong Wu > >>> Signed-off-by: Chao Hao > >>> --- > >>> drivers/iommu/mtk_iommu.c | 20 ++-- > >>> drivers/iommu/mtk_iommu.h | 1 + > >>> 2 files changed, 19 insertions(+), 2 deletions(-) > >>> > >>> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c > >>> index c706bca6487e..def2e996683f 100644 > >>> --- a/drivers/iommu/mtk_iommu.c > >>> +++ b/drivers/iommu/mtk_iommu.c > >>> @@ -37,6 +37,11 @@ > >>> #define REG_MMU_INVLD_START_A0x024 > >>> #define REG_MMU_INVLD_END_A 0x028 > >>> > >>> +/* In latest Coda, MMU_INV_SEL's offset is changed to 0x02c. > >>> + * So we named offset = 0x02c to "REG_MMU_INV_SEL_GEN2" > >>> + * and offset = 0x038 to "REG_MMU_INV_SEL_GEN1". > >>> + */ > >> > >> Please delete the comment, this should be understandable from the git > >> history > > > > ok, thanks > > > >> > >>> +#define REG_MMU_INV_SEL_GEN2 0x02c > >>> #define REG_MMU_INV_SEL_GEN1 0x038 > >>> #define F_INVLD_EN0 BIT(0) > >>> #define F_INVLD_EN1 BIT(1) > >>> @@ -98,7 +103,7 @@ > >>> #define F_MMU_INT_ID_LARB_ID(a) (((a) >> 7) & 0x7) > >>> #define F_MMU_INT_ID_PORT_ID(a) (((a) >> 2) & 0x1f) > >>> > >>> -#define MTK_PROTECT_PA_ALIGN 128 > >>> +#define MTK_PROTECT_PA_ALIGN 256 > >> > >> Do we need 512 bytes for all gen2 IOMMUs? 
> >> I'm not sure if we should add this in plat_data or if we should just bump > >> up the > >> value for all SoCs. > >> In both cases this should be a separate patch. > >> > > From mt6779, MTK_PROTECT_PA_ALIGN is extend to 256 bytes and don't be > > changed for a long time from our HW designer comment. The legacy iommu > > also can use it, mabye it doesn't set it by platform. > > > > Ok then just bump it to 256 in a new patch. Thanks for clarification. Ok, thanks > > > >>> > >>> /* > >>> * Get the local arbiter ID and the portid within the larb arbiter > >>> @@ -543,11 +548,12 @@ static int mtk_iommu_hw_init(const struct > >>> mtk_iommu_data *data) > >>> return ret; > >>> } > >>> > >>> + regval = readl_relaxed(data->base + REG_MMU_CTRL_REG); > >>> if (data->plat_data->m4u_plat == M4U_MT8173) > >>> regval = F_MMU_PREFETCH_RT_REPLACE_MOD | > >>>F_MMU_TF_PROT_TO_PROGRAM_ADDR_MT8173; > >>> else > >>> - regval = F_MMU_TF_PROT_TO_PROGRAM_ADDR; > >>> + regval |= F_MMU_TF_PROT_TO_PROGRAM_ADDR; > >> > >> Why do we change this, is it that the bootloader for mt6779 set some > >> values in > >> the register we have to keep? In this case I think we should update the > >> regval > >> accordingly. > > > > For REG_MMU_CTRL_REG, bit[12] represents victim_tlb_en feature and > > victim_tlb is enable defaultly(bit[12]=1),but if we use "regval = > > F_MMU_TF_PROT_TO_PROGRAM_ADDR", victim_tlb will disable, it will drop > > iommu performace for mt6779 > > > > Got it. Please put that in a separate patch then. > Ok, thanks > Regards, > Matthias > > > > >> > >>> writel_relaxed(regval, data->base + REG_MMU_CTRL_REG); > >>> > >>>
Re: [PATCH v4 6/7] iommu/mediatek: Add REG_MMU_WR_LEN definition preparing for mt6779
On Wed, 2020-06-17 at 11:22 +0200, Matthias Brugger wrote: > > On 17/06/2020 05:00, Chao Hao wrote: > > Some platforms(ex: mt6779) have a new register called by REG_MMU_WR_LEN > > to improve performance. > > This patch add this register definition. > > Please be more specific what this register is about. > OK. thanks. We can use "has_wr_len" flag to control whether we need to set the register. If the register uses default value, iommu will send command to EMI without restriction, when the number of commands become more and more, it will drop the EMI performance. So when more than ten_commands(default value) don't be handled for EMI, IOMMU will stop send command to EMI for keeping EMI's performace by enabling write throttling mechanism(bit[5][21]=0) in MMU_WR_LEN_CTRL register. I will write description above to commit message in next version > > > > Signed-off-by: Chao Hao > > --- > > drivers/iommu/mtk_iommu.c | 10 ++ > > drivers/iommu/mtk_iommu.h | 2 ++ > > 2 files changed, 12 insertions(+) > > > > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c > > index a687e8db0e51..c706bca6487e 100644 > > --- a/drivers/iommu/mtk_iommu.c > > +++ b/drivers/iommu/mtk_iommu.c > > @@ -46,6 +46,8 @@ > > #define F_MMU_STANDARD_AXI_MODE_BIT(BIT(3) | BIT(19)) > > > > #define REG_MMU_DCM_DIS0x050 > > +#define REG_MMU_WR_LEN 0x054 > > +#define F_MMU_WR_THROT_DIS_BIT (BIT(5) | BIT(21)) > > > > #define REG_MMU_CTRL_REG 0x110 > > #define F_MMU_TF_PROT_TO_PROGRAM_ADDR (2 << 4) > > @@ -581,6 +583,12 @@ static int mtk_iommu_hw_init(const struct > > mtk_iommu_data *data) > > writel_relaxed(regval, data->base + REG_MMU_VLD_PA_RNG); > > } > > writel_relaxed(0, data->base + REG_MMU_DCM_DIS); > > + if (data->plat_data->has_wr_len) { > > + /* write command throttling mode */ > > + regval = readl_relaxed(data->base + REG_MMU_WR_LEN); > > + regval &= ~F_MMU_WR_THROT_DIS_BIT; > > + writel_relaxed(regval, data->base + REG_MMU_WR_LEN); > > + } > > > > if (data->plat_data->reset_axi) { > 
> /* The register is called STANDARD_AXI_MODE in this case */ > > @@ -737,6 +745,7 @@ static int __maybe_unused mtk_iommu_suspend(struct > > device *dev) > > struct mtk_iommu_suspend_reg *reg = &data->reg; > > void __iomem *base = data->base; > > > > + reg->wr_len = readl_relaxed(base + REG_MMU_WR_LEN); > > Can we read/write the register without any side effect although hardware has > not > implemented it (!has_wr_len)? It doesn't have side effect. Becasue all the MTK platform have the register for iommu HW. If we need to have requirement for performance, we can set it by has_wr_len. But I'm Sorry, the name of flag(has_wr_len) is not exact, I will rename it in next version, ex: "wr_throt_en" > > > > reg->misc_ctrl = readl_relaxed(base + REG_MMU_MISC_CTRL); > > reg->dcm_dis = readl_relaxed(base + REG_MMU_DCM_DIS); > > reg->ctrl_reg = readl_relaxed(base + REG_MMU_CTRL_REG); > > @@ -761,6 +770,7 @@ static int __maybe_unused mtk_iommu_resume(struct > > device *dev) > > dev_err(data->dev, "Failed to enable clk(%d) in resume\n", ret); > > return ret; > > } > > + writel_relaxed(reg->wr_len, base + REG_MMU_WR_LEN); > > writel_relaxed(reg->misc_ctrl, base + REG_MMU_MISC_CTRL); > > writel_relaxed(reg->dcm_dis, base + REG_MMU_DCM_DIS); > > writel_relaxed(reg->ctrl_reg, base + REG_MMU_CTRL_REG); > > diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h > > index d51ff99c2c71..9971cedd72ea 100644 > > --- a/drivers/iommu/mtk_iommu.h > > +++ b/drivers/iommu/mtk_iommu.h > > @@ -25,6 +25,7 @@ struct mtk_iommu_suspend_reg { > > u32 int_main_control; > > u32 ivrp_paddr; > > u32 vld_pa_rng; > > + u32 wr_len; > > }; > > > > enum mtk_iommu_plat { > > @@ -43,6 +44,7 @@ struct mtk_iommu_plat_data { > > boolhas_misc_ctrl; > > boolhas_sub_comm; > > boolhas_vld_pa_rng; > > + boolhas_wr_len; > > Given the fact that we are adding more and more plat_data bool
[PATCH v3 2/7] iommu/mediatek: Rename the register STANDARD_AXI_MODE(0x48) to MISC_CTRL
For iommu offset=0x48 register, only the previous mt8173/mt8183 use the name STANDARD_AXI_MODE, all the latest SoC extend the register more feature by different bits, for example: axi_mode, in_order_en, coherent_en and so on. So rename REG_MMU_MISC_CTRL may be more proper. This patch only rename the register name, no functional change. Signed-off-by: Chao Hao --- drivers/iommu/mtk_iommu.c | 14 +++--- drivers/iommu/mtk_iommu.h | 2 +- 2 files changed, 8 insertions(+), 8 deletions(-) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index 5f4d6df59cf6..e7e7c7695ed1 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -41,7 +41,7 @@ #define F_INVLD_EN0BIT(0) #define F_INVLD_EN1BIT(1) -#define REG_MMU_STANDARD_AXI_MODE 0x048 +#define REG_MMU_MISC_CTRL 0x048 #define REG_MMU_DCM_DIS0x050 #define REG_MMU_CTRL_REG 0x110 @@ -585,8 +585,10 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data) } writel_relaxed(0, data->base + REG_MMU_DCM_DIS); - if (data->plat_data->reset_axi) - writel_relaxed(0, data->base + REG_MMU_STANDARD_AXI_MODE); + if (data->plat_data->reset_axi) { + /* The register is called STANDARD_AXI_MODE in this case */ + writel_relaxed(0, data->base + REG_MMU_MISC_CTRL); + } if (devm_request_irq(data->dev, data->irq, mtk_iommu_isr, 0, dev_name(data->dev), (void *)data)) { @@ -730,8 +732,7 @@ static int __maybe_unused mtk_iommu_suspend(struct device *dev) struct mtk_iommu_suspend_reg *reg = &data->reg; void __iomem *base = data->base; - reg->standard_axi_mode = readl_relaxed(base + - REG_MMU_STANDARD_AXI_MODE); + reg->misc_ctrl = readl_relaxed(base + REG_MMU_MISC_CTRL); reg->dcm_dis = readl_relaxed(base + REG_MMU_DCM_DIS); reg->ctrl_reg = readl_relaxed(base + REG_MMU_CTRL_REG); reg->int_control0 = readl_relaxed(base + REG_MMU_INT_CONTROL0); @@ -755,8 +756,7 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev) dev_err(data->dev, "Failed to enable clk(%d) in resume\n", ret); return ret; } - 
writel_relaxed(reg->standard_axi_mode, - base + REG_MMU_STANDARD_AXI_MODE); + writel_relaxed(reg->misc_ctrl, base + REG_MMU_MISC_CTRL); writel_relaxed(reg->dcm_dis, base + REG_MMU_DCM_DIS); writel_relaxed(reg->ctrl_reg, base + REG_MMU_CTRL_REG); writel_relaxed(reg->int_control0, base + REG_MMU_INT_CONTROL0); diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h index ea949a324e33..1b6ea839b92c 100644 --- a/drivers/iommu/mtk_iommu.h +++ b/drivers/iommu/mtk_iommu.h @@ -18,7 +18,7 @@ #include struct mtk_iommu_suspend_reg { - u32 standard_axi_mode; + u32 misc_ctrl; u32 dcm_dis; u32 ctrl_reg; u32 int_control0; -- 2.18.0
[PATCH v3 1/7] dt-bindings: mediatek: Add bindings for MT6779
This patch adds the description for the MT6779 IOMMU.

MT6779 has two iommus, mm_iommu and apu_iommu, which both use the ARM
Short-Descriptor translation format. In addition, mm_iommu and apu_iommu
are two independent HW instances, so we need to set them up separately.

The MT6779 IOMMU hardware diagram is shown below. It is only a brief
diagram of the iommu and does not focus on the smi_larb part, so the
smi_larb is not described in detail.

                      EMI
                       |
          ------------------------------------
          |                                  |
       MM_IOMMU                          APU_IOMMU
          |                                  |
      SMI_COMMON-----------              APU_BUS
          |               |                  |
     SMI_LARB(0~11)       |                  |
          |               |                  |
          |               |           --------------
          |               |           |     |      |
   Multimedia engine     CCU         VPU   MDLA   EDMA

All the connections are fixed in hardware; software cannot adjust them.

Change since v2:
1. Delete unused definitions, ex: M4U_LARB12_ID, M4U_LARB13_ID, CCU, VPU, MDLA, EDMA

Change since v1:
1. Delete the M4U_PORT_UNKNOWN define because it is not used.
2. Correct coding format: ex: /*larb3-VENC*/ --> /* larb3-VENC */

Signed-off-by: Chao Hao
Reviewed-by: Rob Herring
---
 .../bindings/iommu/mediatek,iommu.txt         |   2 +
 include/dt-bindings/memory/mt6779-larb-port.h | 206 ++
 2 files changed, 208 insertions(+)
 create mode 100644 include/dt-bindings/memory/mt6779-larb-port.h

diff --git a/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt b/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt
index ce59a505f5a4..c1ccd8582eb2 100644
--- a/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt
+++ b/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt
@@ -58,6 +58,7 @@ Required properties:
 - compatible : must be one of the following string:
 	"mediatek,mt2701-m4u" for mt2701 which uses generation one m4u HW.
 	"mediatek,mt2712-m4u" for mt2712 which uses generation two m4u HW.
+	"mediatek,mt6779-m4u" for mt6779 which uses generation two m4u HW.
 	"mediatek,mt7623-m4u", "mediatek,mt2701-m4u" for mt7623 which uses
 						     generation one m4u HW.
 	"mediatek,mt8173-m4u" for mt8173 which uses generation two m4u HW.
@@ -78,6 +79,7 @@ Required properties: Specifies the mtk_m4u_id as defined in dt-binding/memory/mt2701-larb-port.h for mt2701, mt7623 dt-binding/memory/mt2712-larb-port.h for mt2712, + dt-binding/memory/mt6779-larb-port.h for mt6779, dt-binding/memory/mt8173-larb-port.h for mt8173, and dt-binding/memory/mt8183-larb-port.h for mt8183. diff --git a/include/dt-bindings/memory/mt6779-larb-port.h b/include/dt-bindings/memory/mt6779-larb-port.h new file mode 100644 index ..2ad0899fbf2f --- /dev/null +++ b/include/dt-bindings/memory/mt6779-larb-port.h @@ -0,0 +1,206 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (c) 2019 MediaTek Inc. + * Author: Chao Hao + */ + +#ifndef _DTS_IOMMU_PORT_MT6779_H_ +#define _DTS_IOMMU_PORT_MT6779_H_ + +#define MTK_M4U_ID(larb, port) (((larb) << 5) | (port)) + +#define M4U_LARB0_ID0 +#define M4U_LARB1_ID1 +#define M4U_LARB2_ID2 +#define M4U_LARB3_ID3 +#define M4U_LARB4_ID4 +#define M4U_LARB5_ID5 +#define M4U_LARB6_ID6 +#define M4U_LARB7_ID7 +#define M4U_LARB8_ID8 +#define M4U_LARB9_ID9 +#define M4U_LARB10_ID 10 +#define M4U_LARB11_ID 11 + +/* larb0 */ +#define M4U_PORT_DISP_POSTMASK0 MTK_M4U_ID(M4U_LARB0_ID, 0) +#define M4U_PORT_DISP_OVL0_HDR MTK_M4U_ID(M4U_LARB0_ID, 1) +#define M4U_PORT_DISP_OVL1_HDR MTK_M4U_ID(M4U_LARB0_ID, 2) +#define M4U_PORT_DISP_OVL0 MTK_M4U_ID(M4U_LARB0_ID, 3) +#define M4U_PORT_DISP_OVL1 MTK_M4U_ID(M4U_LARB0_ID, 4) +#define M4U_PORT_DISP_PVRIC0MTK_M4U_ID(M4U_LARB0_ID, 5) +#define M4U_PORT_DISP_RDMA0 MTK_M4U_ID(M4U_LARB0_ID, 6) +#define M4U_PORT_DISP_WDMA0 MTK_M4U_ID(M4U_LARB0_ID, 7) +#define M4U_PORT_DISP_FAKE0 MTK_M4U_ID(M4U_LARB0_ID, 8) + +/* larb1 */ +#define M4U_PORT_DISP_OVL0_2L_HDR MTK_M4U_ID(M4U_LARB1_ID, 0) +#define M4U_PORT_DISP_OVL1_2L_HDR MTK_M4U_ID(M4U_LARB1_ID, 1) +#define M4U_PORT_DISP_OVL0_2L MTK_M4U_ID(M4U_LARB1_ID, 2) +#define M4U_PORT_DISP_OVL1_2L MTK_M4U_ID(M4U_LARB1_ID, 3) +#define M4U_PORT_DISP_RDMA1 MTK_M4U_ID(M4U_L
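The port IDs in this header pack a larb index and a port index into one integer through MTK_M4U_ID(). A minimal sketch of that encoding follows; the two inverse helpers (m4u_id_to_larb, m4u_id_to_port) are hypothetical names added for illustration and are not part of the posted header.

```c
#include <stdint.h>

/* From the binding header in the patch: the port lives in the low 5 bits,
 * the larb index sits above it. */
#define MTK_M4U_ID(larb, port)	(((larb) << 5) | (port))

/* Hypothetical inverse helpers, for illustration only. */
static inline uint32_t m4u_id_to_larb(uint32_t id)
{
	return id >> 5;		/* drop the 5 port bits */
}

static inline uint32_t m4u_id_to_port(uint32_t id)
{
	return id & 0x1f;	/* keep only the 5 port bits */
}
```

With this layout, M4U_PORT_DISP_WDMA0 (larb0, port 7) encodes to 7 and M4U_PORT_DISP_OVL1_2L (larb1, port 3) encodes to 35.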
[PATCH v3 6/7] iommu/mediatek: Add REG_MMU_WR_LEN definition preparing for mt6779
Some platforms(ex: mt6779) have a new register called by REG_MMU_WR_LEN to improve performance. This patch add this register definition. Signed-off-by: Chao Hao --- drivers/iommu/mtk_iommu.c | 10 ++ drivers/iommu/mtk_iommu.h | 2 ++ 2 files changed, 12 insertions(+) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index 3914c418d1b0..dc9ae944e712 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -45,6 +45,8 @@ #define F_MMU_STANDARD_AXI_MODE_BIT(BIT(3) | BIT(19)) #define REG_MMU_DCM_DIS0x050 +#define REG_MMU_WR_LEN 0x054 +#define F_MMU_WR_THROT_DIS_BIT (BIT(5) | BIT(21)) #define REG_MMU_CTRL_REG 0x110 #define F_MMU_TF_PROT_TO_PROGRAM_ADDR (2 << 4) @@ -592,6 +594,12 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data) writel_relaxed(regval, data->base + REG_MMU_VLD_PA_RNG); } writel_relaxed(0, data->base + REG_MMU_DCM_DIS); + if (data->plat_data->has_wr_len) { + /* write command throttling mode */ + regval = readl_relaxed(data->base + REG_MMU_WR_LEN); + regval &= ~F_MMU_WR_THROT_DIS_BIT; + writel_relaxed(regval, data->base + REG_MMU_WR_LEN); + } if (data->plat_data->has_misc_ctrl) { regval = readl_relaxed(data->base + REG_MMU_MISC_CTRL); @@ -744,6 +752,7 @@ static int __maybe_unused mtk_iommu_suspend(struct device *dev) struct mtk_iommu_suspend_reg *reg = &data->reg; void __iomem *base = data->base; + reg->wr_len = readl_relaxed(base + REG_MMU_WR_LEN); reg->misc_ctrl = readl_relaxed(base + REG_MMU_MISC_CTRL); reg->dcm_dis = readl_relaxed(base + REG_MMU_DCM_DIS); reg->ctrl_reg = readl_relaxed(base + REG_MMU_CTRL_REG); @@ -768,6 +777,7 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev) dev_err(data->dev, "Failed to enable clk(%d) in resume\n", ret); return ret; } + writel_relaxed(reg->wr_len, base + REG_MMU_WR_LEN); writel_relaxed(reg->misc_ctrl, base + REG_MMU_MISC_CTRL); writel_relaxed(reg->dcm_dis, base + REG_MMU_DCM_DIS); writel_relaxed(reg->ctrl_reg, base + REG_MMU_CTRL_REG); diff --git 
a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h index d51ff99c2c71..9971cedd72ea 100644 --- a/drivers/iommu/mtk_iommu.h +++ b/drivers/iommu/mtk_iommu.h @@ -25,6 +25,7 @@ struct mtk_iommu_suspend_reg { u32 int_main_control; u32 ivrp_paddr; u32 vld_pa_rng; + u32 wr_len; }; enum mtk_iommu_plat { @@ -43,6 +44,7 @@ struct mtk_iommu_plat_data { boolhas_misc_ctrl; boolhas_sub_comm; boolhas_vld_pa_rng; + boolhas_wr_len; boolreset_axi; u32 inv_sel_reg; unsigned char larbid_remap[8][4]; -- 2.18.0
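The patch does two things with REG_MMU_WR_LEN: it clears the throttle-disable bits at init, and it saves and restores the register across suspend/resume. The user-space sketch below models both steps against a fake register array. struct fake_iommu and the fake_* helpers are assumptions made for illustration; they stand in for the MMIO window and are not driver code.

```c
#include <stdint.h>

#define BIT(n)			(1u << (n))
#define REG_MMU_WR_LEN		0x054
#define F_MMU_WR_THROT_DIS_BIT	(BIT(5) | BIT(21))

/* Hypothetical stand-in for the MMIO window: one 32-bit slot per register. */
struct fake_iommu {
	uint32_t regs[0x200 / 4];
	uint32_t saved_wr_len;	/* plays the role of the new wr_len field */
};

static uint32_t fake_readl(const struct fake_iommu *m, uint32_t off)
{
	return m->regs[off / 4];
}

static void fake_writel(struct fake_iommu *m, uint32_t val, uint32_t off)
{
	m->regs[off / 4] = val;
}

/* Read-modify-write: clear only the throttle-disable bits, keep the rest. */
static void enable_wr_throttling(struct fake_iommu *m)
{
	uint32_t regval = fake_readl(m, REG_MMU_WR_LEN);

	regval &= ~F_MMU_WR_THROT_DIS_BIT;
	fake_writel(m, regval, REG_MMU_WR_LEN);
}

/* Save before power-down, write back after power-up. */
static void fake_suspend(struct fake_iommu *m)
{
	m->saved_wr_len = fake_readl(m, REG_MMU_WR_LEN);
}

static void fake_resume(struct fake_iommu *m)
{
	fake_writel(m, m->saved_wr_len, REG_MMU_WR_LEN);
}
```

The read-modify-write keeps every bit other than bit 5 and bit 21 at whatever value the hardware or bootloader left there.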
[PATCH v3 3/7] iommu/mediatek: Disable STANDARD_AXI_MODE in MISC_CTRL
In order to improve performance, we always disable STANDARD_AXI_MODE in
MISC_CTRL.

Signed-off-by: Chao Hao
---
 drivers/iommu/mtk_iommu.c | 8 +++-
 drivers/iommu/mtk_iommu.h | 1 +
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index e7e7c7695ed1..9ede327a418d 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -42,6 +42,8 @@
 #define F_INVLD_EN1				BIT(1)
 
 #define REG_MMU_MISC_CTRL			0x048
+#define F_MMU_STANDARD_AXI_MODE_BIT		(BIT(3) | BIT(19))
+
 #define REG_MMU_DCM_DIS				0x050
 
 #define REG_MMU_CTRL_REG			0x110
@@ -585,7 +587,11 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data)
 	}
 	writel_relaxed(0, data->base + REG_MMU_DCM_DIS);
 
-	if (data->plat_data->reset_axi) {
+	if (data->plat_data->has_misc_ctrl) {
+		regval = readl_relaxed(data->base + REG_MMU_MISC_CTRL);
+		regval &= ~F_MMU_STANDARD_AXI_MODE_BIT;
+		writel_relaxed(regval, data->base + REG_MMU_MISC_CTRL);
+	} else if (data->plat_data->reset_axi) {
 		/* The register is called STANDARD_AXI_MODE in this case */
 		writel_relaxed(0, data->base + REG_MMU_MISC_CTRL);
 	}
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index 1b6ea839b92c..d711ac630037 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -40,6 +40,7 @@ struct mtk_iommu_plat_data {
 
 	/* HW will use the EMI clock if there isn't the "bclk". */
 	bool                has_bclk;
+	bool                has_misc_ctrl;
 	bool                has_vld_pa_rng;
 	bool                reset_axi;
 	unsigned char       larbid_remap[MTK_LARB_NR_MAX];
-- 
2.18.0
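The hw_init change above picks between three behaviours for offset 0x48. As a rough sketch, the decision can be written as a pure function over the current register value. misc_ctrl_init_value() is a hypothetical helper invented for this illustration; the driver does the same thing inline with readl_relaxed()/writel_relaxed().

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n)				(1u << (n))
#define F_MMU_STANDARD_AXI_MODE_BIT	(BIT(3) | BIT(19))

/*
 * Hypothetical helper: returns the value the register would hold after init.
 *  - has_misc_ctrl: clear only bit 3 and bit 19, keep the other feature bits
 *  - reset_axi:     older SoCs, the whole register is written as 0
 *  - neither:       the register is left untouched
 */
static uint32_t misc_ctrl_init_value(uint32_t cur, bool has_misc_ctrl,
				     bool reset_axi)
{
	if (has_misc_ctrl)
		return cur & ~F_MMU_STANDARD_AXI_MODE_BIT;
	if (reset_axi)
		return 0;
	return cur;
}
```

The point of the first branch is that newer SoCs share this register with other features, so writing 0 to the whole register would also switch those off.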
[PATCH v3 00/07] MT6779 IOMMU SUPPORT
This patchset adds mt6779 iommu support.

mt6779 has two iommus, MM_IOMMU(M4U) and APU_IOMMU, which both use the ARM
Short-Descriptor translation format.

The mt6779 MM_IOMMU-SMI and APU_IOMMU HW diagram is shown below; it is only
a brief diagram:

                      EMI
                       |
          ------------------------------------
          |                                  |
       MM_IOMMU                          APU_IOMMU
          |                                  |
      SMI_COMMON-----------              APU_BUS
          |               |                  |
     SMI_LARB(0~11)       |                  |
          |               |                  |
          |               |           --------------
          |               |           |     |      |
   Multimedia engine     CCU         VPU   MDLA   EDMA

All the connections are fixed in hardware; software cannot adjust them.

Compared with mt8183, the SMI_BUS_ID width has changed from 10 to 12 bits.
The SMI larb number is described in bit[11:7] and the port number in
bit[6:2]. In addition, some registers have changed in mt6779, so we need to
redefine and reuse them.

The patchset only uses MM_IOMMU, so we only add the MM_IOMMU basic
functions, such as the smi_larb port definitions, register definitions and
hardware initialization.

change notes:
v3:
1. Rebase on v5.7-rc1.
2. Remove unused port definitions, ex: APU and CCU ports in mt6779-larb-port.h.
3. Remove the "change single domain to multiple domain" part (from PATCH v2 09/19 to PATCH v2 19/19).
4. Redesign the mt6779 basic part:
   (1) Add some register definitions and reuse them.
   (2) Redesign the smi larb bus ID to analyze IOMMU translation faults.
   (3) Only init MM_IOMMU and do not use APU_IOMMU.

v2:
1. Rebase on v5.5-rc1.
2. Delete the M4U_PORT_UNKNOWN define because it is not used.
3. Correct coding format.
4. Rename the offset=0x48 register.
5. Split "iommu/mediatek: Add mt6779 IOMMU basic support(patch v1)" into several patches(patch v2).
http://lists.infradead.org/pipermail/linux-mediatek/2020-January/026131.html v1: http://lists.infradead.org/pipermail/linux-mediatek/2019-November/024567.html Chao Hao (7): dt-bindings: mediatek: Add bindings for MT6779 iommu/mediatek: Rename the register STANDARD_AXI_MODE(0x48) to MISC_CTRL iommu/mediatek: Disable STANDARD_AXI_MODE in MISC_CTRL iommu/mediatek: Move inv_sel_reg into the plat_data iommu/mediatek: Add sub_comm id in translation fault iommu/mediatek: Add REG_MMU_WR_LEN definition preparing for mt6779 iommu/mediatek: Add mt6779 basic support .../bindings/iommu/mediatek,iommu.txt | 2 + drivers/iommu/mtk_iommu.c | 77 +-- drivers/iommu/mtk_iommu.h | 10 +- include/dt-bindings/memory/mt6779-larb-port.h | 206 ++ 4 files changed, 273 insertions(+), 22 deletions(-) -- 2.18.0
[PATCH v3 7/7] iommu/mediatek: Add mt6779 basic support
1. Start from mt6779, INVLDT_SEL move to offset=0x2c, so we add REG_MMU_INV_SEL_GEN2 definition and mt6779 uses it. 2. Change PROTECT_PA_ALIGN from 128 byte to 256 byte. 3. For REG_MMU_CTRL_REG register, we only need to change bit[2:0], others bits keep default value, ex: enable victim tlb. 4. Add mt6779_data to support mm_iommu HW init. Signed-off-by: Chao Hao --- drivers/iommu/mtk_iommu.c | 18 +++--- drivers/iommu/mtk_iommu.h | 1 + 2 files changed, 16 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index dc9ae944e712..34c4ffb77c73 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -37,6 +37,7 @@ #define REG_MMU_INVLD_START_A 0x024 #define REG_MMU_INVLD_END_A0x028 +#define REG_MMU_INV_SEL_GEN2 0x02c #define REG_MMU_INV_SEL_GEN1 0x038 #define F_INVLD_EN0BIT(0) #define F_INVLD_EN1BIT(1) @@ -97,7 +98,7 @@ #define F_MMU_INT_ID_LARB_ID(a)(((a) >> 7) & 0x7) #define F_MMU_INT_ID_PORT_ID(a)(((a) >> 2) & 0x1f) -#define MTK_PROTECT_PA_ALIGN 128 +#define MTK_PROTECT_PA_ALIGN 256 /* * Get the local arbiter ID and the portid within the larb arbiter @@ -554,11 +555,12 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data) return ret; } + regval = readl_relaxed(data->base + REG_MMU_CTRL_REG); if (data->plat_data->m4u_plat == M4U_MT8173) - regval = F_MMU_PREFETCH_RT_REPLACE_MOD | + regval |= F_MMU_PREFETCH_RT_REPLACE_MOD | F_MMU_TF_PROT_TO_PROGRAM_ADDR_MT8173; else - regval = F_MMU_TF_PROT_TO_PROGRAM_ADDR; + regval |= F_MMU_TF_PROT_TO_PROGRAM_ADDR; writel_relaxed(regval, data->base + REG_MMU_CTRL_REG); regval = F_L2_MULIT_HIT_EN | @@ -804,6 +806,15 @@ static const struct mtk_iommu_plat_data mt2712_data = { .larbid_remap = {{0}, {1}, {2}, {3}, {4}, {5}, {6}, {7}}, }; +static const struct mtk_iommu_plat_data mt6779_data = { + .m4u_plat = M4U_MT6779, + .larbid_remap = {{0}, {1}, {2}, {3}, {5}, {7, 8}, {10}, {9}}, + .has_sub_comm = true, + .has_wr_len = true, + .has_misc_ctrl = true, + .inv_sel_reg = 
REG_MMU_INV_SEL_GEN2, +}; + static const struct mtk_iommu_plat_data mt8173_data = { .m4u_plat = M4U_MT8173, .has_4gb_mode = true, @@ -822,6 +833,7 @@ static const struct mtk_iommu_plat_data mt8183_data = { static const struct of_device_id mtk_iommu_of_ids[] = { { .compatible = "mediatek,mt2712-m4u", .data = &mt2712_data}, + { .compatible = "mediatek,mt6779-m4u", .data = &mt6779_data}, { .compatible = "mediatek,mt8173-m4u", .data = &mt8173_data}, { .compatible = "mediatek,mt8183-m4u", .data = &mt8183_data}, {} diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h index 9971cedd72ea..fb79e710c8d9 100644 --- a/drivers/iommu/mtk_iommu.h +++ b/drivers/iommu/mtk_iommu.h @@ -31,6 +31,7 @@ struct mtk_iommu_suspend_reg { enum mtk_iommu_plat { M4U_MT2701, M4U_MT2712, + M4U_MT6779, M4U_MT8173, M4U_MT8183, }; -- 2.18.0
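The switch from "regval =" to "regval |=" for REG_MMU_CTRL_REG matters because, as explained in the v4 review of this patch, bit[12] of that register enables the victim TLB and is set by default. The sketch below contrasts the two styles on a plain integer. F_MMU_VICTIM_TLB_EN is a hypothetical name for bit[12] used only here, and both functions are illustrative helpers rather than driver code.

```c
#include <stdint.h>

#define BIT(n)				(1u << (n))
#define F_MMU_TF_PROT_TO_PROGRAM_ADDR	(2 << 4)
#define F_MMU_VICTIM_TLB_EN		BIT(12)	/* hypothetical name for bit[12] */

/* Old style: plain assignment, every default bit is lost. */
static uint32_t ctrl_reg_overwrite(uint32_t hw_default)
{
	(void)hw_default;
	return F_MMU_TF_PROT_TO_PROGRAM_ADDR;
}

/* New style: start from the value read back from hardware, then OR. */
static uint32_t ctrl_reg_preserve(uint32_t hw_default)
{
	return hw_default | F_MMU_TF_PROT_TO_PROGRAM_ADDR;
}
```

One caveat with a bare OR: it can only set bits. If a bootloader had left a different value in the same field, a mask-then-set sequence would be needed to guarantee the final field value.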
[PATCH v3 4/7] iommu/mediatek: Move inv_sel_reg into the plat_data
For mt6779, MMU_INVLDT_SEL register's offset is changed from 0x38 to 0x2c, so we can put inv_sel_reg in the plat_data to use it. In addition, we renamed it to REG_MMU_INV_SEL_GEN1 and use it before mt6779. Signed-off-by: Chao Hao --- drivers/iommu/mtk_iommu.c | 9 ++--- drivers/iommu/mtk_iommu.h | 1 + 2 files changed, 7 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index 9ede327a418d..d73de987f8be 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -37,7 +37,7 @@ #define REG_MMU_INVLD_START_A 0x024 #define REG_MMU_INVLD_END_A0x028 -#define REG_MMU_INV_SEL0x038 +#define REG_MMU_INV_SEL_GEN1 0x038 #define F_INVLD_EN0BIT(0) #define F_INVLD_EN1BIT(1) @@ -167,7 +167,7 @@ static void mtk_iommu_tlb_flush_all(void *cookie) for_each_m4u(data) { writel_relaxed(F_INVLD_EN1 | F_INVLD_EN0, - data->base + REG_MMU_INV_SEL); + data->base + data->plat_data->inv_sel_reg); writel_relaxed(F_ALL_INVLD, data->base + REG_MMU_INVALIDATE); wmb(); /* Make sure the tlb flush all done */ } @@ -184,7 +184,7 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size, for_each_m4u(data) { spin_lock_irqsave(&data->tlb_lock, flags); writel_relaxed(F_INVLD_EN1 | F_INVLD_EN0, - data->base + REG_MMU_INV_SEL); + data->base + data->plat_data->inv_sel_reg); writel_relaxed(iova, data->base + REG_MMU_INVLD_START_A); writel_relaxed(iova + size - 1, @@ -784,6 +784,7 @@ static const struct mtk_iommu_plat_data mt2712_data = { .has_4gb_mode = true, .has_bclk = true, .has_vld_pa_rng = true, + .inv_sel_reg = REG_MMU_INV_SEL_GEN1, .larbid_remap = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}, }; @@ -792,12 +793,14 @@ static const struct mtk_iommu_plat_data mt8173_data = { .has_4gb_mode = true, .has_bclk = true, .reset_axi= true, + .inv_sel_reg = REG_MMU_INV_SEL_GEN1, .larbid_remap = {0, 1, 2, 3, 4, 5}, /* Linear mapping. 
*/ }; static const struct mtk_iommu_plat_data mt8183_data = { .m4u_plat = M4U_MT8183, .reset_axi= true, + .inv_sel_reg = REG_MMU_INV_SEL_GEN1, .larbid_remap = {0, 4, 5, 6, 7, 2, 3, 1}, }; diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h index d711ac630037..afd7a2de5c1e 100644 --- a/drivers/iommu/mtk_iommu.h +++ b/drivers/iommu/mtk_iommu.h @@ -43,6 +43,7 @@ struct mtk_iommu_plat_data { boolhas_misc_ctrl; boolhas_vld_pa_rng; boolreset_axi; + u32 inv_sel_reg; unsigned char larbid_remap[MTK_LARB_NR_MAX]; }; -- 2.18.0
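Moving the register offset into plat_data lets one flush routine serve both register layouts. A minimal user-space sketch of that pattern follows, assuming a fake register array in place of the MMIO base; struct plat_data, struct fake_iommu and select_invalidate_ranges() are simplified stand-ins invented for illustration. REG_MMU_INV_SEL_GEN2 is the 0x02c offset introduced later in this series for mt6779.

```c
#include <stdint.h>

#define BIT(n)			(1u << (n))
#define REG_MMU_INV_SEL_GEN2	0x02c
#define REG_MMU_INV_SEL_GEN1	0x038
#define F_INVLD_EN0		BIT(0)
#define F_INVLD_EN1		BIT(1)

/* Simplified stand-in for mtk_iommu_plat_data: only the field used here. */
struct plat_data {
	uint32_t inv_sel_reg;
};

/* Hypothetical stand-in for the device: fake registers plus platform data. */
struct fake_iommu {
	uint32_t regs[0x200 / 4];
	const struct plat_data *plat;
};

/* The code path is identical for every SoC; only the offset differs. */
static void select_invalidate_ranges(struct fake_iommu *m)
{
	m->regs[m->plat->inv_sel_reg / 4] = F_INVLD_EN1 | F_INVLD_EN0;
}
```

The alternative would be an m4u_plat comparison at every call site, which grows with each new SoC; a data field keeps the flush paths free of platform checks.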
[PATCH v3 5/7] iommu/mediatek: Add sub_comm id in translation fault
The maximum number of larbs that one iommu HW supports is 8 (larb0~larb7 in
the diagram below).

If the number of larbs is over 8, we use a sub_common to merge several larbs
into one larb. In this case, we extend the larb_id:
bit[11:9] means common-id;
bit[8:7] means subcommon-id;
From these two values, we can get the real larb number when a translation
fault happens.

The diagram is as below:

                 EMI
                  |
                IOMMU
                  |
           -----------------
           |               |
        common1         common0
           |               |
           -----------------
                  |
             smi common
                  |
  ------------------------------------
  |       |       |       |     |    |
 3'd0    3'd1    3'd2    3'd3  ...  3'd7   <-common_id(max is 8)
  |       |       |       |     |    |
Larb0   Larb1     |     Larb3  ... Larb7
                  |
            smi sub common
                  |
     --------------------------
     |        |       |       |
    2'd0     2'd1    2'd2    2'd3   <-sub_common_id(max is 4)
     |        |       |       |
   Larb8    Larb9   Larb10  Larb11

In this patch we extend larbid_remap[] to larbid_remap[8][4] for this.
larbid_remap[x][y]: x means the common-id above, y means the subcommon_id
above.

We can also tell whether the M4U HW has a sub_common by the has_sub_comm
property.

Signed-off-by: Chao Hao
---
 drivers/iommu/mtk_iommu.c | 20 +---
 drivers/iommu/mtk_iommu.h | 3 ++-
 2 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index d73de987f8be..3914c418d1b0 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -90,6 +90,8 @@
 #define REG_MMU1_INVLD_PA			0x148
 #define REG_MMU0_INT_ID				0x150
 #define REG_MMU1_INT_ID				0x154
+#define F_MMU_INT_ID_COMM_ID(a)			(((a) >> 9) & 0x7)
+#define F_MMU_INT_ID_SUB_COMM_ID(a)		(((a) >> 7) & 0x3)
 #define F_MMU_INT_ID_LARB_ID(a)			(((a) >> 7) & 0x7)
 #define F_MMU_INT_ID_PORT_ID(a)			(((a) >> 2) & 0x1f)
 
@@ -228,7 +230,7 @@ static irqreturn_t mtk_iommu_isr(int irq, void *dev_id)
 	struct mtk_iommu_data *data = dev_id;
 	struct mtk_iommu_domain *dom = data->m4u_dom;
 	u32 int_state, regval, fault_iova, fault_pa;
-	unsigned int fault_larb, fault_port;
+	unsigned int fault_larb, fault_port, sub_comm = 0;
 	bool layer, write;
 
 	/* Read error info from registers */
@@ -244,10 +246,14 @@ static irqreturn_t mtk_iommu_isr(int irq, void *dev_id)
 	}
 	layer = fault_iova & F_MMU_FAULT_VA_LAYER_BIT;
 	write = fault_iova & F_MMU_FAULT_VA_WRITE_BIT;
-	fault_larb = F_MMU_INT_ID_LARB_ID(regval);
 	fault_port = F_MMU_INT_ID_PORT_ID(regval);
-
-	fault_larb = data->plat_data->larbid_remap[fault_larb];
+	if (data->plat_data->has_sub_comm) {
+		fault_larb = F_MMU_INT_ID_COMM_ID(regval);
+		sub_comm = F_MMU_INT_ID_SUB_COMM_ID(regval);
+	} else {
+		fault_larb = F_MMU_INT_ID_LARB_ID(regval);
+	}
+	fault_larb = data->plat_data->larbid_remap[fault_larb][sub_comm];
 
 	if (report_iommu_fault(&dom->domain, data->dev, fault_iova,
 			       write ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ)) {
@@ -785,7 +791,7 @@ static const struct mtk_iommu_plat_data mt2712_data = {
 	.has_bclk     = true,
 	.has_vld_pa_rng   = true,
 	.inv_sel_reg = REG_MMU_INV_SEL_GEN1,
-	.larbid_remap = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9},
+	.larbid_remap = {{0}, {1}, {2}, {3}, {4}, {5}, {6}, {7}},
 };
 
 static const struct mtk_iommu_plat_data mt8173_data = {
@@ -794,14 +800,14 @@ static const struct mtk_iommu_plat_data mt8173_data = {
 	.has_bclk     = true,
 	.reset_axi    = true,
 	.inv_sel_reg = REG_MMU_INV_SEL_GEN1,
-	.larbid_remap = {0, 1, 2, 3, 4, 5}, /* Linear mapping. */
+	.larbid_remap = {{0}, {1}, {2}, {3}, {4}, {5}}, /* Linear mapping. */
 };
 
 static const struct mtk_iommu_plat_data mt8183_data = {
 	.m4u_plat     = M4U_MT8183,
 	.reset_axi    = true,
 	.inv_sel_reg = REG_MMU_INV_SEL_GEN1,
-	.larbid_remap = {0, 4, 5, 6, 7, 2, 3, 1},
+	.larbid_remap = {{0}, {4}, {5}, {6}, {7}, {2}, {3}, {1}},
 };
 
 static const struct of_device_id mtk_iommu_of_ids[] = {
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index afd7a2de5c1e..d51ff99c2c71 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -41,10 +41,11 @@ struct mtk_iommu_plat_data {
 	/* HW will use the EMI clock if there isn't the "bclk". */
 	boolh
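The fault ID decoding described in this patch can be tried out in user space. The sketch below reuses the bit-field macros from the diff and the mt6779 and mt8183 remap tables that appear in this series; decode_fault_larb() is a hypothetical helper that mirrors the logic added to mtk_iommu_isr(), not the driver function itself.

```c
#include <stdbool.h>
#include <stdint.h>

#define F_MMU_INT_ID_COMM_ID(a)		(((a) >> 9) & 0x7)
#define F_MMU_INT_ID_SUB_COMM_ID(a)	(((a) >> 7) & 0x3)
#define F_MMU_INT_ID_LARB_ID(a)		(((a) >> 7) & 0x7)
#define F_MMU_INT_ID_PORT_ID(a)		(((a) >> 2) & 0x1f)

/* larbid_remap[common_id][sub_common_id], values as posted in this series. */
static const unsigned char mt6779_remap[8][4] = {
	{0}, {1}, {2}, {3}, {5}, {7, 8}, {10}, {9}
};
static const unsigned char mt8183_remap[8][4] = {
	{0}, {4}, {5}, {6}, {7}, {2}, {3}, {1}
};

/* Hypothetical helper mirroring the lookup added to the ISR. */
static unsigned int decode_fault_larb(uint32_t regval,
				      const unsigned char remap[8][4],
				      bool has_sub_comm)
{
	unsigned int idx, sub_comm = 0;

	if (has_sub_comm) {
		idx = F_MMU_INT_ID_COMM_ID(regval);
		sub_comm = F_MMU_INT_ID_SUB_COMM_ID(regval);
	} else {
		idx = F_MMU_INT_ID_LARB_ID(regval);
	}
	return remap[idx][sub_comm];
}
```

For example, an ID of 0xa8c has common-id 5, subcommon-id 1 and port 3; with the mt6779 table that resolves to larb 8. SoCs without a sub_common always index column 0, which is why their tables became single-element rows.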
Re: [PATCH 2/4] iommu/mediatek: Add iotlb_sync_range() support
On Wed, 2020-10-21 at 17:55 +0100, Robin Murphy wrote:
> On 2020-10-19 12:30, Chao Hao wrote:
> > MTK_IOMMU driver writes one page entry and does tlb flush at a time
> > currently. More optimal would be to aggregate the writes and flush
> > BUS buffer in the end.
>
> That's exactly what iommu_iotlb_gather_add_page() is meant to achieve.
> Rather than jumping straight into hacking up a new API to go round the
> back of the existing API design, it would be far better to ask the
> question of why that's not behaving as expected.

Thanks for your review!

iommu_iotlb_gather_add_page() is called from io_pgtable_tlb_add_page(), and
io_pgtable_tlb_add_page() is only called in the unmapping flow; the mapping
flow in the linux iommu driver does not use it. But mtk iommu needs to do
tlb sync in both mapping and unmapping, to avoid stale data remaining in
the iommu tlb.

In addition, we hope to do tlb sync only once, when all the pages have been
mapped. iommu_iotlb_gather_add_page() may do tlb sync more than once,
because one whole buffer consists of different page sizes (1MB/64K/4K).

Based on these considerations, we did not find a more appropriate way of
doing tlb sync for mtk iommu, so we added a new API.

>
> > For 50MB buffer mapping, if mtk_iommu driver use iotlb_sync_range()
> > instead of tlb_add_range() and tlb_flush_walk/leaf(), it can increase
> > 50% performance or more(depending on size of every page size) in
> > comparison to flushing after each page entry update. So we prefer to
> > use iotlb_sync_range() to replace iotlb_sync(), tlb_add_range() and
> > tlb_flush_walk/leaf() for MTK platforms.
>
> In the case of mapping, it sounds like what you actually want to do is
> hook up .iotlb_sync_map and generally make IO_PGTABLE_QUIRK_TLBI_ON_MAP
> cleverer, because the current implementation is as dumb as it could
> possibly be.

iotlb_sync_map() has only one parameter (iommu_domain), but an mtk
iommu_domain may include the whole iova space, so if mtk_iommu does tlb
sync based on the iommu_domain, it is in fact equivalent to a tlb flush
all.

The iommu driver will do tlb sync for every mapped page when mtk iommu sets
IO_PGTABLE_QUIRK_TLBI_ON_MAP (io_pgtable_tlb_flush_walk); as the commit
message mentions, this lowers mapping performance on mtk platforms.

> In fact if we simply passed an address range to
> .iotlb_sync_map, io-pgtable probably wouldn't need to be involved at all
> any more.

I know that adding a new API is probably not a good idea, but I found that
tlb sync can only be done after mapping each single page. If mtk_iommu
hopes to do tlb sync once after all the pages are mapped, could you give me
some advice? Thanks!

>
> Robin.
>
> > Signed-off-by: Chao Hao
> > ---
> > drivers/iommu/mtk_iommu.c | 6 ++
> > 1 file changed, 6 insertions(+)
> >
> > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> > index 785b228d39a6..d3400c15ff7b 100644
> > --- a/drivers/iommu/mtk_iommu.c
> > +++ b/drivers/iommu/mtk_iommu.c
> > @@ -224,6 +224,11 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned
> > long iova, size_t size,
> > }
> > }
> >
> > +static void __mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t
> > size)
> > +{
> > + mtk_iommu_tlb_flush_range_sync(iova, size, 0, NULL)
> > +}
> > +
> > static void mtk_iommu_tlb_flush_page_nosync(struct iommu_iotlb_gather
> > *gather,
> > unsigned long iova, size_t granule,
> > void *cookie)
> > @@ -536,6 +541,7 @@ static const struct iommu_ops mtk_iommu_ops = {
> > .map= mtk_iommu_map,
> > .unmap = mtk_iommu_unmap,
> > .flush_iotlb_all = mtk_iommu_flush_iotlb_all,
> > + .iotlb_sync_range = __mtk_iommu_tlb_flush_range_sync,
> > .iotlb_sync = mtk_iommu_iotlb_sync,
> > .iova_to_phys = mtk_iommu_iova_to_phys,
> > .probe_device = mtk_iommu_probe_device,
> >
Re: [PATCH 2/4] iommu/mediatek: Add iotlb_sync_range() support
On Fri, 2020-10-23 at 13:57 +0800, chao hao wrote: > On Wed, 2020-10-21 at 17:55 +0100, Robin Murphy wrote: > > On 2020-10-19 12:30, Chao Hao wrote: > > > MTK_IOMMU driver writes one page entry and does tlb flush at a time > > > currently. More optimal would be to aggregate the writes and flush > > > BUS buffer in the end. > > > > That's exactly what iommu_iotlb_gather_add_page() is meant to achieve. > > Rather than jumping straight into hacking up a new API to go round the > > back of the existing API design, it would be far better to ask the > > question of why that's not behaving as expected. > > Thanks for you review! > > iommu_iotlb_gather_add_page is put in io_pgtable_tlb_add_page(). > io_pgtable_tlb_add_page() only be called in > unmapping and mapping flow doesn't have it in linux iommu driver, but > mtk iommu needs to do tlb sync in mapping > and unmapping to avoid old data being in the iommu tlb. > > In addtion, we hope to do tlb sync once when all the pages mapping done. > iommu_iotlb_gather_add_page maybe do > tlb sync more than once. because one whole buffer consists of different > page size(1MB/64K/4K). > > Based on the previous considerations, don't find more appropriate the > way of tlb sync for mtk iommu, so we add a new API. > > > > > > For 50MB buffer mapping, if mtk_iommu driver use iotlb_sync_range() > > > instead of tlb_add_range() and tlb_flush_walk/leaf(), it can increase > > > 50% performance or more(depending on size of every page size) in > > > comparison to flushing after each page entry update. So we prefer to > > > use iotlb_sync_range() to replace iotlb_sync(), tlb_add_range() and > > > tlb_flush_walk/leaf() for MTK platforms. > > > > In the case of mapping, it sounds like what you actually want to do is > > hook up .iotlb_sync_map and generally make IO_PGTABLE_QUIRK_TLBI_ON_MAP > > cleverer, because the current implementation is as dumb as it could > > possibly be. 
>
> iotlb_sync_map only has one parameter(iommu_domain), but mtk
> iommu_domain maybe include the whole iova space, if mtk_iommu to do tlb
> sync based on iommu_domain, it is equivalent to do tlb flush all in
> fact.
> iommu driver will do tlb sync in every mapping page when mtk iommu sets
> IO_PGTABLE_QUIRK_TLBI_ON_MAP(io_pgtable_tlb_flush_walk),
> as is the commit message mentioned, it will drop mapping performance in
> mtk platform.
>
> >
> > In fact if we simply passed an address range to
> > .iotlb_sync_map, io-pgtable probably wouldn't need to be involved at all
> > any more.

Sorry, I forgot to reply to this question in my previous mail.
Do you mean we need to modify the input parameters of iotlb_sync_map()
(e.g. add start/end iova)?

>
> I know it is not a good idea probably by adding a new api, but I found
> out that tlb sync only to be done after mapping one page, so if
> mtk_iommu hope to do tlb sync once after all the pages map done, could
> you give me some advices? thanks!
>
> >
> > Robin.
> >
> > > Signed-off-by: Chao Hao
> > > ---
> > >  drivers/iommu/mtk_iommu.c | 6 ++++++
> > >  1 file changed, 6 insertions(+)
> > >
> > > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> > > index 785b228d39a6..d3400c15ff7b 100644
> > > --- a/drivers/iommu/mtk_iommu.c
> > > +++ b/drivers/iommu/mtk_iommu.c
> > > @@ -224,6 +224,11 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned
> > > long iova, size_t size,
> > >  	}
> > >  }
> > >
> > > +static void __mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t
> > > size)
> > > +{
> > > +	mtk_iommu_tlb_flush_range_sync(iova, size, 0, NULL);
> > > +}
> > > +
> > >  static void mtk_iommu_tlb_flush_page_nosync(struct iommu_iotlb_gather
> > > *gather,
> > >  					    unsigned long iova, size_t
> > > granule,
> > >  					    void *cookie)
> > > @@ -536,6 +541,7 @@ static const struct iommu_ops mtk_iommu_ops = {
> > >  	.map		= mtk_iommu_map,
> > >  	.unmap		= mtk_iommu_unmap,
> > >  	.flush_iotlb_all = mtk_iommu_flush_iotlb_all,
> > > +	.iotlb_sync_range = __mtk_iommu_tlb_flush_range_sync,
> > >  	.iotlb_sync	= mtk_iommu_iotlb_sync,
> > >  	.iova_to_phys	= mtk_iommu_iova_to_phys,
> > >  	.probe_device	= mtk_iommu_probe_device,
> > >
> >
[PATCH 2/4] iommu/mediatek: Add iotlb_sync_range() support
The MTK_IOMMU driver currently writes one page table entry and then does a
tlb flush, one page at a time. It would be more efficient to aggregate the
writes and flush the BUS buffer once at the end.
For a 50MB buffer mapping, if the mtk_iommu driver uses iotlb_sync_range()
instead of tlb_add_page() and tlb_flush_walk/leaf(), performance improves
by 50% or more (depending on the size of each page) compared with flushing
after each page entry update. So we prefer to use iotlb_sync_range() to
replace iotlb_sync(), tlb_add_page() and tlb_flush_walk/leaf() for MTK
platforms.

Signed-off-by: Chao Hao
---
 drivers/iommu/mtk_iommu.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 785b228d39a6..d3400c15ff7b 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -224,6 +224,11 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size,
 	}
 }
 
+static void __mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size)
+{
+	mtk_iommu_tlb_flush_range_sync(iova, size, 0, NULL);
+}
+
 static void mtk_iommu_tlb_flush_page_nosync(struct iommu_iotlb_gather *gather,
 					    unsigned long iova, size_t granule,
 					    void *cookie)
@@ -536,6 +541,7 @@ static const struct iommu_ops mtk_iommu_ops = {
 	.map		= mtk_iommu_map,
 	.unmap		= mtk_iommu_unmap,
 	.flush_iotlb_all = mtk_iommu_flush_iotlb_all,
+	.iotlb_sync_range = __mtk_iommu_tlb_flush_range_sync,
 	.iotlb_sync	= mtk_iommu_iotlb_sync,
 	.iova_to_phys	= mtk_iommu_iova_to_phys,
 	.probe_device	= mtk_iommu_probe_device,
-- 
2.18.0
[PATCH 0/4] MTK_IOMMU: Optimize mapping / unmapping performance
For MTK platforms, mtk_iommu uses iotlb_sync(), tlb_add_page() and
tlb_flush_walk/leaf() to do tlb sync when the iommu driver maps or unmaps
an iova range. But if the buffer is large, it may consist of many pages
(4K/8K/64K/1MB...), so the iommu driver may run tlb sync many times during
mapping, which degrades performance seriously.

To resolve the issue, we hope to add an iotlb_sync_range() callback to
iommu_ops, which takes the iova and size to do tlb sync on. MTK_IOMMU will
call iotlb_sync_range() once the whole mapping/unmapping is completed and
will drop iotlb_sync(), tlb_add_page() and tlb_flush_walk/leaf().
So this patchset replaces iotlb_sync(), tlb_add_page() and
tlb_flush_walk/leaf() with the iotlb_sync_range() callback.

Chao Hao (4):
  iommu: Introduce iotlb_sync_range callback
  iommu/mediatek: Add iotlb_sync_range() support
  iommu/mediatek: Remove unnecessary tlb sync
  iommu/mediatek: Adjust iotlb_sync_range

 drivers/iommu/dma-iommu.c |  9 +++++++++
 drivers/iommu/iommu.c     |  7 +++++++
 drivers/iommu/mtk_iommu.c | 36 ++++++++----------------------------
 include/linux/iommu.h     |  2 ++
 4 files changed, 26 insertions(+), 28 deletions(-)

-- 
2.18.0
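To make the cost argument in the cover letter concrete, here is a minimal
user-space sketch in C. It is not driver code: it assumes a hypothetical
tlb_model counter in place of the real hardware flush, and every name in it
is illustrative. It only compares "sync after every page" with "sync once
over the whole range".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: only counts how many TLB syncs would be issued. */
struct tlb_model {
	unsigned int sync_count;
};

/* Stand-in for one range invalidate plus wait-for-completion. */
static void model_tlb_sync(struct tlb_model *tlb, unsigned long iova,
			   size_t size)
{
	(void)iova;
	(void)size;
	tlb->sync_count++;
}

/* Per-page behaviour: one sync after each page table entry is written. */
static unsigned int map_sync_per_page(struct tlb_model *tlb,
				      unsigned long iova, size_t size,
				      size_t pgsize)
{
	size_t off;

	tlb->sync_count = 0;
	for (off = 0; off < size; off += pgsize)
		model_tlb_sync(tlb, iova + off, pgsize);
	return tlb->sync_count;
}

/* Range behaviour: write every entry first, then sync the range once. */
static unsigned int map_sync_range_once(struct tlb_model *tlb,
					unsigned long iova, size_t size,
					size_t pgsize)
{
	size_t off, entries = 0;

	tlb->sync_count = 0;
	for (off = 0; off < size; off += pgsize)
		entries++;	/* a page table entry is written, no sync */
	(void)entries;
	model_tlb_sync(tlb, iova, size);
	return tlb->sync_count;
}
```

For a 50MB buffer mapped with 4K pages, the per-page variant of this model
issues 12800 syncs while the range variant issues one. The real saving
depends on the page sizes actually used and on the hardware flush latency.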
[PATCH 3/4] iommu/mediatek: Remove unnecessary tlb sync
As described in "[PATCH 2/4]", we will use iotlb_sync_range() to replace
iotlb_sync(), tlb_add_page() and tlb_flush_walk/leaf() to improve
performance. So remove the implementations of iotlb_sync(), tlb_add_page()
and tlb_flush_walk/leaf().

Signed-off-by: Chao Hao
---
 drivers/iommu/mtk_iommu.c | 28 ++++------------------------
 1 file changed, 4 insertions(+), 24 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index d3400c15ff7b..bca1f53c0ab9 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -229,21 +229,15 @@ static void __mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size)
 	mtk_iommu_tlb_flush_range_sync(iova, size, 0, NULL);
 }
 
-static void mtk_iommu_tlb_flush_page_nosync(struct iommu_iotlb_gather *gather,
-					    unsigned long iova, size_t granule,
-					    void *cookie)
+static void mtk_iommu_tlb_flush_skip(unsigned long iova, size_t size,
+				     size_t granule, void *cookie)
 {
-	struct mtk_iommu_data *data = cookie;
-	struct iommu_domain *domain = &data->m4u_dom->domain;
-
-	iommu_iotlb_gather_add_page(domain, gather, iova, granule);
 }
 
 static const struct iommu_flush_ops mtk_iommu_flush_ops = {
 	.tlb_flush_all = mtk_iommu_tlb_flush_all,
-	.tlb_flush_walk = mtk_iommu_tlb_flush_range_sync,
-	.tlb_flush_leaf = mtk_iommu_tlb_flush_range_sync,
-	.tlb_add_page = mtk_iommu_tlb_flush_page_nosync,
+	.tlb_flush_walk = mtk_iommu_tlb_flush_skip,
+	.tlb_flush_leaf = mtk_iommu_tlb_flush_skip,
 };
 
 static irqreturn_t mtk_iommu_isr(int irq, void *dev_id)
@@ -443,19 +437,6 @@ static void mtk_iommu_flush_iotlb_all(struct iommu_domain *domain)
 	mtk_iommu_tlb_flush_all(mtk_iommu_get_m4u_data());
 }
 
-static void mtk_iommu_iotlb_sync(struct iommu_domain *domain,
-				 struct iommu_iotlb_gather *gather)
-{
-	struct mtk_iommu_data *data = mtk_iommu_get_m4u_data();
-	size_t length = gather->end - gather->start;
-
-	if (gather->start == ULONG_MAX)
-		return;
-
-	mtk_iommu_tlb_flush_range_sync(gather->start, length, gather->pgsize,
-				       data);
-}
-
 static phys_addr_t mtk_iommu_iova_to_phys(struct iommu_domain *domain,
 					  dma_addr_t iova)
 {
@@ -542,7 +523,6 @@ static const struct iommu_ops mtk_iommu_ops = {
 	.unmap		= mtk_iommu_unmap,
 	.flush_iotlb_all = mtk_iommu_flush_iotlb_all,
 	.iotlb_sync_range = __mtk_iommu_tlb_flush_range_sync,
-	.iotlb_sync	= mtk_iommu_iotlb_sync,
 	.iova_to_phys	= mtk_iommu_iova_to_phys,
 	.probe_device	= mtk_iommu_probe_device,
 	.release_device	= mtk_iommu_release_device,
-- 
2.18.0
[PATCH 1/4] iommu: Introduce iotlb_sync_range callback
Add an iotlb_sync_range callback so that a driver can specify the iova and
size to do tlb sync on. The iommu core will call iotlb_sync_range() after
the whole mapping/unmapping is completed; the iova and size passed to
iotlb_sync_range() are the start iova and the total buffer size
respectively. At the same time, iotlb_sync() and tlb_flush_walk/leaf() can
be skipped. So iotlb_sync_range() improves performance by reducing the time
spent on tlb sync.

Signed-off-by: Chao Hao
---
 drivers/iommu/dma-iommu.c | 9 +++++++++
 drivers/iommu/iommu.c     | 7 +++++++
 include/linux/iommu.h     | 2 ++
 3 files changed, 18 insertions(+)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 4959f5df21bd..e2e9114c4ae2 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -479,6 +479,7 @@ static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
 		size_t size, int prot, u64 dma_mask)
 {
 	struct iommu_domain *domain = iommu_get_dma_domain(dev);
+	const struct iommu_ops *ops = domain->ops;
 	struct iommu_dma_cookie *cookie = domain->iova_cookie;
 	struct iova_domain *iovad = &cookie->iovad;
 	size_t iova_off = iova_offset(iovad, phys);
@@ -497,6 +498,10 @@ static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
 		iommu_dma_free_iova(cookie, iova, size);
 		return DMA_MAPPING_ERROR;
 	}
+
+	if (ops->iotlb_sync_range)
+		ops->iotlb_sync_range(iova, size);
+
 	return iova + iova_off;
 }
 
@@ -1165,6 +1170,7 @@ void iommu_setup_dma_ops(struct device *dev, u64 dma_base, u64 size)
 static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
 		phys_addr_t msi_addr, struct iommu_domain *domain)
 {
+	const struct iommu_ops *ops = domain->ops;
 	struct iommu_dma_cookie *cookie = domain->iova_cookie;
 	struct iommu_dma_msi_page *msi_page;
 	dma_addr_t iova;
@@ -1187,6 +1193,9 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
 	if (iommu_map(domain, iova, msi_addr, size, prot))
 		goto out_free_iova;
 
+	if (ops->iotlb_sync_range)
+		ops->iotlb_sync_range(iova, size);
+
 	INIT_LIST_HEAD(&msi_page->list);
 	msi_page->phys = msi_addr;
 	msi_page->iova = iova;
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 609bd25bf154..e399a238d1e9 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2304,6 +2304,9 @@ static size_t __iommu_unmap(struct iommu_domain *domain,
 		unmapped += unmapped_page;
 	}
 
+	if (ops->iotlb_sync_range)
+		ops->iotlb_sync_range(iova, size);
+
 	trace_unmap(orig_iova, size, unmapped);
 	return unmapped;
 }
@@ -2334,6 +2337,7 @@ static size_t __iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
 			     struct scatterlist *sg, unsigned int nents, int prot,
 			     gfp_t gfp)
 {
+	const struct iommu_ops *ops = domain->ops;
 	size_t len = 0, mapped = 0;
 	phys_addr_t start;
 	unsigned int i = 0;
@@ -2364,6 +2368,9 @@ static size_t __iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
 		sg = sg_next(sg);
 	}
 
+	if (ops->iotlb_sync_range)
+		ops->iotlb_sync_range(iova, mapped);
+
 	return mapped;
 
 out_err:
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index fee209efb756..4be90324bd23 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -192,6 +192,7 @@ struct iommu_iotlb_gather {
  * @map: map a physically contiguous memory region to an iommu domain
  * @unmap: unmap a physically contiguous memory region from an iommu domain
  * @flush_iotlb_all: Synchronously flush all hardware TLBs for this domain
+ * @iotlb_sync_range: Sync specific iova and size mappings to the hardware
  * @iotlb_sync_map: Sync mappings created recently using @map to the hardware
  * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
  *            queue
@@ -244,6 +245,7 @@ struct iommu_ops {
 	size_t (*unmap)(struct iommu_domain *domain, unsigned long iova,
 		     size_t size, struct iommu_iotlb_gather *iotlb_gather);
 	void (*flush_iotlb_all)(struct iommu_domain *domain);
+	void (*iotlb_sync_range)(unsigned long iova, size_t size);
 	void (*iotlb_sync_map)(struct iommu_domain *domain);
 	void (*iotlb_sync)(struct iommu_domain *domain,
 			   struct iommu_iotlb_gather *iotlb_gather);
-- 
2.18.0
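The core-side pattern used by this patch (an optional hook in the ops table
that is called once after the whole operation, and skipped when the driver
does not provide it) can be sketched in user-space C as below. This is only
an illustration under stated assumptions: demo_iommu_ops, demo_sync_range
and demo_map_segments are hypothetical names, not kernel API, and the
"mapping" is reduced to summing segment lengths.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down ops table with only the optional hook. */
struct demo_iommu_ops {
	void (*iotlb_sync_range)(unsigned long iova, size_t size);
};

/* Records what the hook was asked to sync, for illustration only. */
static unsigned long demo_last_iova;
static size_t demo_last_size;
static unsigned int demo_sync_calls;

static void demo_sync_range(unsigned long iova, size_t size)
{
	demo_last_iova = iova;
	demo_last_size = size;
	demo_sync_calls++;
}

/*
 * "Maps" nents segments back to back starting at iova, then invokes the
 * hook once with the start iova and the total mapped size, but only if
 * the driver provided one.
 */
static size_t demo_map_segments(const struct demo_iommu_ops *ops,
				unsigned long iova, const size_t *seg_len,
				unsigned int nents)
{
	size_t mapped = 0;
	unsigned int i;

	for (i = 0; i < nents; i++)
		mapped += seg_len[i];	/* per-segment mapping would go here */

	if (ops->iotlb_sync_range)
		ops->iotlb_sync_range(iova, mapped);

	return mapped;
}
```

A driver that leaves the hook NULL sees no behaviour change in this model,
which is why the callback can be added without touching other drivers.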
[PATCH 4/4] iommu/mediatek: Adjust iotlb_sync_range
As the title says, this patch only adjusts the structure of
iotlb_sync_range(). No functional change.

Signed-off-by: Chao Hao
---
 drivers/iommu/mtk_iommu.c | 14 ++++----------
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index bca1f53c0ab9..66e5b9d3c575 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -191,10 +191,9 @@ static void mtk_iommu_tlb_flush_all(void *cookie)
 	}
 }
 
-static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size,
-					   size_t granule, void *cookie)
+static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size)
 {
-	struct mtk_iommu_data *data = cookie;
+	struct mtk_iommu_data *data;
 	unsigned long flags;
 	int ret;
 	u32 tmp;
@@ -216,7 +215,7 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size,
 		if (ret) {
 			dev_warn(data->dev,
 				 "Partial TLB flush timed out, falling back to full flush\n");
-			mtk_iommu_tlb_flush_all(cookie);
+			mtk_iommu_tlb_flush_all(data);
 		}
 		/* Clear the CPE status */
 		writel_relaxed(0, data->base + REG_MMU_CPE_DONE);
@@ -224,11 +223,6 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size,
 	}
 }
 
-static void __mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size)
-{
-	mtk_iommu_tlb_flush_range_sync(iova, size, 0, NULL);
-}
-
 static void mtk_iommu_tlb_flush_skip(unsigned long iova, size_t size,
 				     size_t granule, void *cookie)
 {
@@ -522,7 +516,7 @@ static const struct iommu_ops mtk_iommu_ops = {
 	.map		= mtk_iommu_map,
 	.unmap		= mtk_iommu_unmap,
 	.flush_iotlb_all = mtk_iommu_flush_iotlb_all,
-	.iotlb_sync_range = __mtk_iommu_tlb_flush_range_sync,
+	.iotlb_sync_range = mtk_iommu_tlb_flush_range_sync,
 	.iova_to_phys	= mtk_iommu_iova_to_phys,
 	.probe_device	= mtk_iommu_probe_device,
 	.release_device	= mtk_iommu_release_device,
-- 
2.18.0
Re: [PATCH 11/21] iommu/mediatek: Add power-domain operation
On Sat, 2020-07-11 at 14:48 +0800, Yong Wu wrote:
> In the previous SoC, the M4U HW is in the EMI power domain which is
> always on. the latest M4U is in the display power domain which may be
> turned on/off, thus we have to add pm_runtime interface for it.
>
> we should enable its power before M4U hw initial. and disable it after HW
> initialize.
>
> When the engine work, the engine always enable the power and clocks for
> smi-larb/smi-common, then the M4U's power will always be powered on
> automatically via the device link with smi-common.
>
> Note: we don't enable the M4U power in iommu_map/unmap for tlb flush.
> If its power already is on, of course it is ok. if the power is off,
> the main tlb will be reset while M4U power on, thus the tlb flush while
> m4u power off is unnecessary, just skip it.
>
> Signed-off-by: Yong Wu
> ---
>  drivers/iommu/mtk_iommu.c | 54 +++++++++++++++++++++++++++++++++++++++++++++++-------
>  1 file changed, 47 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 931fdd19c8f3..03a6d66f4bef 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -20,6 +20,7 @@
>  #include
>  #include
>  #include
> +#include
>  #include
>  #include
>  #include
> @@ -172,6 +173,19 @@ static struct mtk_iommu_domain *to_mtk_domain(struct
> iommu_domain *dom)
>  	return container_of(dom, struct mtk_iommu_domain, domain);
>  }
>
> +static int mtk_iommu_rpm_get(struct device *dev)
> +{
> +	if (pm_runtime_enabled(dev))
> +		return pm_runtime_get_sync(dev);
> +	return 0;
> +}
> +
> +static void mtk_iommu_rpm_put(struct device *dev)
> +{
> +	if (pm_runtime_enabled(dev))
> +		pm_runtime_put_autosuspend(dev);
> +}
> +
>  static void mtk_iommu_tlb_flush_all(void *cookie)
>  {
>  	struct mtk_iommu_data *data = cookie;
> @@ -193,6 +207,11 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long
> iova, size_t size,
>  	u32 tmp;
>
>  	for_each_m4u(data) {
> +		/* skip tlb flush when pm is not active */
> +		if (pm_runtime_enabled(data->dev) &&
> +		    !pm_runtime_active(data->dev))
> +			continue;
> +
>  		spin_lock_irqsave(&data->tlb_lock, flags);
>  		writel_relaxed(F_INVLD_EN1 | F_INVLD_EN0,
>  			       data->base + data->plat_data->inv_sel_reg);
> @@ -377,15 +396,20 @@ static int mtk_iommu_attach_device(struct iommu_domain
> *domain,
>  {
>  	struct mtk_iommu_data *data = dev_iommu_priv_get(dev);
>  	struct mtk_iommu_domain *dom = to_mtk_domain(domain);
> +	int ret;
>
>  	if (!data)
>  		return -ENODEV;
>
>  	/* Update the pgtable base address register of the M4U HW */
>  	if (!data->m4u_dom) {
> +		ret = mtk_iommu_rpm_get(dev);
> +		if (ret < 0)
> +			return ret;
>  		data->m4u_dom = dom;
>  		writel(dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK,
>  		       data->base + REG_MMU_PT_BASE_ADDR);
> +		mtk_iommu_rpm_put(dev);
>  	}
>
>  	mtk_iommu_config(data, dev, true);
> @@ -543,10 +567,14 @@ static int mtk_iommu_hw_init(const struct
> mtk_iommu_data *data)
>  	u32 regval;
>  	int ret;
>
> -	ret = clk_prepare_enable(data->bclk);
> -	if (ret) {
> -		dev_err(data->dev, "Failed to enable iommu bclk(%d)\n", ret);
> -		return ret;
> +	/* bclk will be enabled in pm callback in power-domain case. */
> +	if (!pm_runtime_enabled(data->dev)) {
> +		ret = clk_prepare_enable(data->bclk);
> +		if (ret) {
> +			dev_err(data->dev, "Failed to enable iommu bclk(%d)\n",
> +				ret);
> +			return ret;
> +		}
>  	}
>
>  	if (data->plat_data->m4u_plat == M4U_MT8173) {
> @@ -728,7 +756,15 @@ static int mtk_iommu_probe(struct platform_device *pdev)
>
>  	platform_set_drvdata(pdev, data);
>
> +	if (dev->pm_domain)
> +		pm_runtime_enable(dev);

Hi Yong,

If you put pm_runtime_enable() here, the device link with smi_common may
not be created, because the previous patch has this check:

	if (i || !pm_runtime_enabled(dev))
		continue;

Should pm_runtime_enable() be moved earlier?

Best regards,
Chao

> +
> +	ret = mtk_iommu_rpm_get(dev);
> +	if (ret < 0)
> +		return ret;
> +
>  	ret = mtk_iommu_hw_init(data);
> +	mtk_iommu_rpm_put(dev);
>  	if (ret)
>  		return ret;
>
> @@ -801,6 +837,10 @@ static int __maybe_unused mtk_iommu_resume(struct device
> *dev)
>  		dev_err(data->dev, "Failed to enable clk(%d) in resume\n", ret);
>  		return ret;
>  	}
> +
> +	/* Avoid first resume to affect the default value of registers below. */
> +	if (!m4u_dom)
> +		return 0;
>  	writel_relaxed(reg->wr_len_ctrl, base + REG_MMU_WR_LEN_CTRL);
>  	writel_relaxed(reg->misc_ctrl, ba
Re: [PATCH v3 3/7] iommu/mediatek: Disable STANDARD_AXI_MODE in MISC_CTRL
On Mon, 2020-05-25 at 14:14 +0800, Yong Wu wrote:
> On Sat, 2020-05-09 at 16:36 +0800, Chao Hao wrote:
> > In order to improve performance, we always disable STANDARD_AXI_MODE in
> > MISC_CTRL.
> >
> > Signed-off-by: Chao Hao
> > ---
> >  drivers/iommu/mtk_iommu.c | 8 +++++++-
> >  drivers/iommu/mtk_iommu.h | 1 +
> >  2 files changed, 8 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> > index e7e7c7695ed1..9ede327a418d 100644
> > --- a/drivers/iommu/mtk_iommu.c
> > +++ b/drivers/iommu/mtk_iommu.c
> > @@ -42,6 +42,8 @@
> >  #define F_INVLD_EN1				BIT(1)
> >
> >  #define REG_MMU_MISC_CTRL			0x048
> > +#define F_MMU_STANDARD_AXI_MODE_BIT		(BIT(3) | BIT(19))
> > +
> >  #define REG_MMU_DCM_DIS				0x050
> >
> >  #define REG_MMU_CTRL_REG			0x110
> > @@ -585,7 +587,11 @@ static int mtk_iommu_hw_init(const struct
> > mtk_iommu_data *data)
> >  	}
> >  	writel_relaxed(0, data->base + REG_MMU_DCM_DIS);
> >
> > -	if (data->plat_data->reset_axi) {
> > +	if (data->plat_data->has_misc_ctrl) {
> > +		regval = readl_relaxed(data->base + REG_MMU_MISC_CTRL);
> > +		regval &= ~F_MMU_STANDARD_AXI_MODE_BIT;
> > +		writel_relaxed(regval, data->base + REG_MMU_MISC_CTRL);
> > +	} else if (data->plat_data->reset_axi) {
> >  		/* The register is called STANDARD_AXI_MODE in this case */
> >  		writel_relaxed(0, data->base + REG_MMU_MISC_CTRL);
> >  	}
>
>
> 0x48 is either STANDARD_AXI_MODE or MISC_CTRL.
>
> Thus,
>
> if (data->plat_data->reset_axi) {
>    xxx
> } else { /* MISC_CTRL */
>    xxx
> }
>
> No need add "has_misc_ctrl".

Thanks for your comment.
Only mm_iommu/m4u needs to set the MISC_CTRL register; apu_iommu doesn't
need to set it. So I think we need has_misc_ctrl to distinguish the two.

>
> > diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
> > index 1b6ea839b92c..d711ac630037 100644
> > --- a/drivers/iommu/mtk_iommu.h
> > +++ b/drivers/iommu/mtk_iommu.h
> > @@ -40,6 +40,7 @@ struct mtk_iommu_plat_data {
> >
> >  	/* HW will use the EMI clock if there isn't the "bclk". */
> >  	bool                has_bclk;
> > +	bool                has_misc_ctrl;
> >  	bool                has_vld_pa_rng;
> >  	bool                reset_axi;
> >  	unsigned char       larbid_remap[MTK_LARB_NR_MAX];
> >
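The difference between the two branches discussed above can be shown with
plain integer arithmetic. The sketch below is user-space C with
hypothetical helper names (demo_misc_ctrl_clear_axi, demo_reset_axi); only
the BIT(3) | BIT(19) mask is taken from the patch. The has_misc_ctrl branch
clears just the two AXI mode bits and preserves everything else, while the
reset_axi branch writes the whole register as 0.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n)				(UINT32_C(1) << (n))
#define F_MMU_STANDARD_AXI_MODE_BIT	(BIT(3) | BIT(19))

/* has_misc_ctrl path: read-modify-write, clear only the AXI mode bits. */
static uint32_t demo_misc_ctrl_clear_axi(uint32_t regval)
{
	return regval & ~F_MMU_STANDARD_AXI_MODE_BIT;
}

/* reset_axi path: the register value is replaced with 0. */
static uint32_t demo_reset_axi(uint32_t regval)
{
	(void)regval;
	return 0;
}
```

On a register where other MISC_CTRL bits carry meaning, only the first
variant keeps them intact, which is the reason given for a separate flag.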
[PATCH v4 2/7] iommu/mediatek: Rename the register STANDARD_AXI_MODE(0x48) to MISC_CTRL
For the iommu register at offset 0x48, only the earlier mt8173/mt8183 use
the name STANDARD_AXI_MODE. All the latest SoCs extend the register with
more features in different bits, for example: axi_mode, in_order_en,
coherent_en and so on. So REG_MMU_MISC_CTRL is a more appropriate name.
This patch only renames the register, no functional change.

Signed-off-by: Chao Hao
Reviewed-by: Yong Wu
---
 drivers/iommu/mtk_iommu.c | 14 +++++++-------
 drivers/iommu/mtk_iommu.h |  2 +-
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 2be96f1cdbd2..88d3df5b91c2 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -41,7 +41,7 @@
 #define F_INVLD_EN0				BIT(0)
 #define F_INVLD_EN1				BIT(1)
 
-#define REG_MMU_STANDARD_AXI_MODE		0x048
+#define REG_MMU_MISC_CTRL			0x048
 #define REG_MMU_DCM_DIS				0x050
 
 #define REG_MMU_CTRL_REG			0x110
@@ -573,8 +573,10 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data)
 	}
 	writel_relaxed(0, data->base + REG_MMU_DCM_DIS);
 
-	if (data->plat_data->reset_axi)
-		writel_relaxed(0, data->base + REG_MMU_STANDARD_AXI_MODE);
+	if (data->plat_data->reset_axi) {
+		/* The register is called STANDARD_AXI_MODE in this case */
+		writel_relaxed(0, data->base + REG_MMU_MISC_CTRL);
+	}
 
 	if (devm_request_irq(data->dev, data->irq, mtk_iommu_isr, 0,
 			     dev_name(data->dev), (void *)data)) {
@@ -718,8 +720,7 @@ static int __maybe_unused mtk_iommu_suspend(struct device *dev)
 	struct mtk_iommu_suspend_reg *reg = &data->reg;
 	void __iomem *base = data->base;
 
-	reg->standard_axi_mode = readl_relaxed(base +
-					       REG_MMU_STANDARD_AXI_MODE);
+	reg->misc_ctrl = readl_relaxed(base + REG_MMU_MISC_CTRL);
 	reg->dcm_dis = readl_relaxed(base + REG_MMU_DCM_DIS);
 	reg->ctrl_reg = readl_relaxed(base + REG_MMU_CTRL_REG);
 	reg->int_control0 = readl_relaxed(base + REG_MMU_INT_CONTROL0);
@@ -743,8 +744,7 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev)
 		dev_err(data->dev, "Failed to enable clk(%d) in resume\n", ret);
 		return ret;
 	}
-	writel_relaxed(reg->standard_axi_mode,
-		       base + REG_MMU_STANDARD_AXI_MODE);
+	writel_relaxed(reg->misc_ctrl, base + REG_MMU_MISC_CTRL);
 	writel_relaxed(reg->dcm_dis, base + REG_MMU_DCM_DIS);
 	writel_relaxed(reg->ctrl_reg, base + REG_MMU_CTRL_REG);
 	writel_relaxed(reg->int_control0, base + REG_MMU_INT_CONTROL0);
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index ea949a324e33..1b6ea839b92c 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -18,7 +18,7 @@
 #include
 
 struct mtk_iommu_suspend_reg {
-	u32				standard_axi_mode;
+	u32				misc_ctrl;
 	u32				dcm_dis;
 	u32				ctrl_reg;
 	u32				int_control0;
-- 
2.18.0
[PATCH v4 1/7] dt-bindings: mediatek: Add bindings for MT6779
This patch adds a description for the MT6779 IOMMU.

MT6779 has two iommus, mm_iommu and apu_iommu, which both use the ARM
Short-Descriptor translation format.

In addition, mm_iommu and apu_iommu are two independent HW instances, so
we need to set them up separately.

The MT6779 IOMMU hardware diagram is shown below. It is only a brief
diagram of the iommu and does not focus on smi_larb, so smi_larb is not
described in detail.

                      EMI
                       |
          ------------------------------------
          |                                  |
       MM_IOMMU                          APU_IOMMU
          |                                  |
      SMI_COMMON-----------              APU_BUS
          |               |                  |
    SMI_LARB(0~11)        |                  |
          |               |                  |
          |               |           --------------
          |               |           |     |      |
   Multimedia engine     CCU         VPU   MDLA   EDMA

All the connections are fixed in hardware; software cannot adjust them.

Change since v2:
1. Delete unused definition, ex: M4U_LARB12_ID, M4U_LARB13_ID, CCU, VPU,
   MDLA, EDMA

Change since v1:
1. Delete the M4U_PORT_UNKNOWN define because it is not used.
2. Correct coding format: ex: /*larb3-VENC*/ --> /* larb3-VENC */

Signed-off-by: Chao Hao
Reviewed-by: Rob Herring
---
 .../bindings/iommu/mediatek,iommu.txt         |   2 +
 include/dt-bindings/memory/mt6779-larb-port.h | 206 ++
 2 files changed, 208 insertions(+)
 create mode 100644 include/dt-bindings/memory/mt6779-larb-port.h

diff --git a/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt b/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt
index ce59a505f5a4..c1ccd8582eb2 100644
--- a/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt
+++ b/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt
@@ -58,6 +58,7 @@ Required properties:
 - compatible : must be one of the following string:
 	"mediatek,mt2701-m4u" for mt2701 which uses generation one m4u HW.
 	"mediatek,mt2712-m4u" for mt2712 which uses generation two m4u HW.
+	"mediatek,mt6779-m4u" for mt6779 which uses generation two m4u HW.
 	"mediatek,mt7623-m4u", "mediatek,mt2701-m4u" for mt7623 which uses
 						     generation one m4u HW.
 	"mediatek,mt8173-m4u" for mt8173 which uses generation two m4u HW.
@@ -78,6 +79,7 @@ Required properties:
 	Specifies the mtk_m4u_id as defined in
 	dt-binding/memory/mt2701-larb-port.h for mt2701, mt7623
 	dt-binding/memory/mt2712-larb-port.h for mt2712,
+	dt-binding/memory/mt6779-larb-port.h for mt6779,
 	dt-binding/memory/mt8173-larb-port.h for mt8173, and
 	dt-binding/memory/mt8183-larb-port.h for mt8183.
 
diff --git a/include/dt-bindings/memory/mt6779-larb-port.h b/include/dt-bindings/memory/mt6779-larb-port.h
new file mode 100644
index ..2ad0899fbf2f
--- /dev/null
+++ b/include/dt-bindings/memory/mt6779-larb-port.h
@@ -0,0 +1,206 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2019 MediaTek Inc.
+ * Author: Chao Hao
+ */
+
+#ifndef _DTS_IOMMU_PORT_MT6779_H_
+#define _DTS_IOMMU_PORT_MT6779_H_
+
+#define MTK_M4U_ID(larb, port)		 (((larb) << 5) | (port))
+
+#define M4U_LARB0_ID			 0
+#define M4U_LARB1_ID			 1
+#define M4U_LARB2_ID			 2
+#define M4U_LARB3_ID			 3
+#define M4U_LARB4_ID			 4
+#define M4U_LARB5_ID			 5
+#define M4U_LARB6_ID			 6
+#define M4U_LARB7_ID			 7
+#define M4U_LARB8_ID			 8
+#define M4U_LARB9_ID			 9
+#define M4U_LARB10_ID			 10
+#define M4U_LARB11_ID			 11
+
+/* larb0 */
+#define M4U_PORT_DISP_POSTMASK0		 MTK_M4U_ID(M4U_LARB0_ID, 0)
+#define M4U_PORT_DISP_OVL0_HDR		 MTK_M4U_ID(M4U_LARB0_ID, 1)
+#define M4U_PORT_DISP_OVL1_HDR		 MTK_M4U_ID(M4U_LARB0_ID, 2)
+#define M4U_PORT_DISP_OVL0		 MTK_M4U_ID(M4U_LARB0_ID, 3)
+#define M4U_PORT_DISP_OVL1		 MTK_M4U_ID(M4U_LARB0_ID, 4)
+#define M4U_PORT_DISP_PVRIC0		 MTK_M4U_ID(M4U_LARB0_ID, 5)
+#define M4U_PORT_DISP_RDMA0		 MTK_M4U_ID(M4U_LARB0_ID, 6)
+#define M4U_PORT_DISP_WDMA0		 MTK_M4U_ID(M4U_LARB0_ID, 7)
+#define M4U_PORT_DISP_FAKE0		 MTK_M4U_ID(M4U_LARB0_ID, 8)
+
+/* larb1 */
+#define M4U_PORT_DISP_OVL0_2L_HDR	 MTK_M4U_ID(M4U_LARB1_ID, 0)
+#define M4U_PORT_DISP_OVL1_2L_HDR	 MTK_M4U_ID(M4U_LARB1_ID, 1)
+#define M4U_PORT_DISP_OVL0_2L		 MTK_M4U_ID(M4U_LARB1_ID, 2)
+#define M4U_PORT_DISP_OVL1_2L		 MTK_M4U_ID(M4U_LARB1_ID, 3)
+#define M4U_PORT_DISP_RDMA1		 MTK_M4U_ID(M4U_L
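The MTK_M4U_ID() macro in the header above packs a larb index and a port
index into one integer. A small user-space C sketch of that encoding
follows. The macro is copied from the patch; the two decode helpers
(demo_m4u_to_larb, demo_m4u_to_port) are hypothetical and assume the port
always fits in the low 5 bits.

```c
#include <assert.h>

#define MTK_M4U_ID(larb, port)	(((larb) << 5) | (port))

/* Hypothetical inverse helpers: larb in the upper bits, port in bits [4:0]. */
static unsigned int demo_m4u_to_larb(unsigned int id)
{
	return id >> 5;
}

static unsigned int demo_m4u_to_port(unsigned int id)
{
	return id & 0x1f;
}
```

For example, MTK_M4U_ID(1, 3) evaluates to 35, and decoding
MTK_M4U_ID(11, 8) with these helpers gives back larb 11 and port 8.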
[PATCH v4 6/7] iommu/mediatek: Add REG_MMU_WR_LEN definition preparing for mt6779
Some platforms (e.g. mt6779) have a new register called REG_MMU_WR_LEN to
improve performance. This patch adds the definition of this register.

Signed-off-by: Chao Hao
---
 drivers/iommu/mtk_iommu.c | 10 ++++++++++
 drivers/iommu/mtk_iommu.h |  2 ++
 2 files changed, 12 insertions(+)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index a687e8db0e51..c706bca6487e 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -46,6 +46,8 @@
 #define F_MMU_STANDARD_AXI_MODE_BIT		(BIT(3) | BIT(19))
 
 #define REG_MMU_DCM_DIS				0x050
+#define REG_MMU_WR_LEN				0x054
+#define F_MMU_WR_THROT_DIS_BIT			(BIT(5) | BIT(21))
 
 #define REG_MMU_CTRL_REG			0x110
 #define F_MMU_TF_PROT_TO_PROGRAM_ADDR		(2 << 4)
@@ -581,6 +583,12 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data)
 		writel_relaxed(regval, data->base + REG_MMU_VLD_PA_RNG);
 	}
 	writel_relaxed(0, data->base + REG_MMU_DCM_DIS);
+	if (data->plat_data->has_wr_len) {
+		/* write command throttling mode */
+		regval = readl_relaxed(data->base + REG_MMU_WR_LEN);
+		regval &= ~F_MMU_WR_THROT_DIS_BIT;
+		writel_relaxed(regval, data->base + REG_MMU_WR_LEN);
+	}
 
 	if (data->plat_data->reset_axi) {
 		/* The register is called STANDARD_AXI_MODE in this case */
@@ -737,6 +745,7 @@ static int __maybe_unused mtk_iommu_suspend(struct device *dev)
 	struct mtk_iommu_suspend_reg *reg = &data->reg;
 	void __iomem *base = data->base;
 
+	reg->wr_len = readl_relaxed(base + REG_MMU_WR_LEN);
 	reg->misc_ctrl = readl_relaxed(base + REG_MMU_MISC_CTRL);
 	reg->dcm_dis = readl_relaxed(base + REG_MMU_DCM_DIS);
 	reg->ctrl_reg = readl_relaxed(base + REG_MMU_CTRL_REG);
@@ -761,6 +770,7 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev)
 		dev_err(data->dev, "Failed to enable clk(%d) in resume\n", ret);
 		return ret;
 	}
+	writel_relaxed(reg->wr_len, base + REG_MMU_WR_LEN);
 	writel_relaxed(reg->misc_ctrl, base + REG_MMU_MISC_CTRL);
 	writel_relaxed(reg->dcm_dis, base + REG_MMU_DCM_DIS);
 	writel_relaxed(reg->ctrl_reg, base + REG_MMU_CTRL_REG);
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index d51ff99c2c71..9971cedd72ea 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -25,6 +25,7 @@ struct mtk_iommu_suspend_reg {
 	u32				int_main_control;
 	u32				ivrp_paddr;
 	u32				vld_pa_rng;
+	u32				wr_len;
 };
 
 enum mtk_iommu_plat {
@@ -43,6 +44,7 @@ struct mtk_iommu_plat_data {
 	bool                has_misc_ctrl;
 	bool                has_sub_comm;
 	bool                has_vld_pa_rng;
+	bool                has_wr_len;
 	bool                reset_axi;
 	u32                 inv_sel_reg;
 	unsigned char       larbid_remap[8][4];
-- 
2.18.0
[PATCH v4 7/7] iommu/mediatek: Add mt6779 basic support
1. Starting with mt6779, INVLDT_SEL moves to offset 0x2c, so we add a
   REG_MMU_INV_SEL_GEN2 definition and mt6779 uses it.
2. Change PROTECT_PA_ALIGN from 128 bytes to 256 bytes.
3. For the REG_MMU_CTRL_REG register, we only need to change bit[2:0];
   the other bits keep their default values, ex: enable victim tlb.
4. Add mt6779_data to support mm_iommu HW init.

Change since v3:
1. When setting MMU_CTRL_REG, we don't need to include mt8173.

Cc: Yong Wu
Signed-off-by: Chao Hao
---
 drivers/iommu/mtk_iommu.c | 20 ++++++++++++++++++--
 drivers/iommu/mtk_iommu.h |  1 +
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index c706bca6487e..def2e996683f 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -37,6 +37,11 @@
 #define REG_MMU_INVLD_START_A			0x024
 #define REG_MMU_INVLD_END_A			0x028
 
+/* In latest Coda, MMU_INV_SEL's offset is changed to 0x02c.
+ * So we named offset = 0x02c to "REG_MMU_INV_SEL_GEN2"
+ * and offset = 0x038 to "REG_MMU_INV_SEL_GEN1".
+ */
+#define REG_MMU_INV_SEL_GEN2			0x02c
 #define REG_MMU_INV_SEL_GEN1			0x038
 #define F_INVLD_EN0				BIT(0)
 #define F_INVLD_EN1				BIT(1)
@@ -98,7 +103,7 @@
 #define F_MMU_INT_ID_LARB_ID(a)			(((a) >> 7) & 0x7)
 #define F_MMU_INT_ID_PORT_ID(a)			(((a) >> 2) & 0x1f)
 
-#define MTK_PROTECT_PA_ALIGN			128
+#define MTK_PROTECT_PA_ALIGN			256
 
 /*
  * Get the local arbiter ID and the portid within the larb arbiter
@@ -543,11 +548,12 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data)
 		return ret;
 	}
 
+	regval = readl_relaxed(data->base + REG_MMU_CTRL_REG);
 	if (data->plat_data->m4u_plat == M4U_MT8173)
 		regval = F_MMU_PREFETCH_RT_REPLACE_MOD |
 			 F_MMU_TF_PROT_TO_PROGRAM_ADDR_MT8173;
 	else
-		regval = F_MMU_TF_PROT_TO_PROGRAM_ADDR;
+		regval |= F_MMU_TF_PROT_TO_PROGRAM_ADDR;
 	writel_relaxed(regval, data->base + REG_MMU_CTRL_REG);
 
 	regval = F_L2_MULIT_HIT_EN |
@@ -797,6 +803,15 @@ static const struct mtk_iommu_plat_data mt2712_data = {
 	.larbid_remap   = {{0}, {1}, {2}, {3}, {4}, {5}, {6}, {7}},
 };
 
+static const struct mtk_iommu_plat_data mt6779_data = {
+	.m4u_plat      = M4U_MT6779,
+	.has_sub_comm  = true,
+	.has_wr_len    = true,
+	.has_misc_ctrl = true,
+	.inv_sel_reg   = REG_MMU_INV_SEL_GEN2,
+	.larbid_remap  = {{0}, {1}, {2}, {3}, {5}, {7, 8}, {10}, {9}},
+};
+
 static const struct mtk_iommu_plat_data mt8173_data = {
 	.m4u_plat     = M4U_MT8173,
 	.has_4gb_mode = true,
@@ -815,6 +830,7 @@ static const struct mtk_iommu_plat_data mt8183_data = {
 
 static const struct of_device_id mtk_iommu_of_ids[] = {
 	{ .compatible = "mediatek,mt2712-m4u", .data = &mt2712_data},
+	{ .compatible = "mediatek,mt6779-m4u", .data = &mt6779_data},
 	{ .compatible = "mediatek,mt8173-m4u", .data = &mt8173_data},
 	{ .compatible = "mediatek,mt8183-m4u", .data = &mt8183_data},
 	{}
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index 9971cedd72ea..fb79e710c8d9 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -31,6 +31,7 @@ struct mtk_iommu_suspend_reg {
 enum mtk_iommu_plat {
 	M4U_MT2701,
 	M4U_MT2712,
+	M4U_MT6779,
 	M4U_MT8173,
 	M4U_MT8183,
 };
-- 
2.18.0
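Point 1 of the commit message (the invalidate-select register sits at a
different offset on newer hardware, so the offset is carried in per-SoC
platform data) can be illustrated with a small user-space C sketch. The
two offsets and the F_INVLD_EN0/EN1 bits come from the patch; the
demo_plat_data struct, the demo_gen1/demo_gen2 instances and the array
standing in for the MMIO block are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define REG_MMU_INV_SEL_GEN2	0x02c
#define REG_MMU_INV_SEL_GEN1	0x038
#define F_INVLD_EN0		(UINT32_C(1) << 0)
#define F_INVLD_EN1		(UINT32_C(1) << 1)

/* Hypothetical, cut-down platform data: only the field used here. */
struct demo_plat_data {
	uint32_t inv_sel_reg;
};

static const struct demo_plat_data demo_gen1 = {
	.inv_sel_reg = REG_MMU_INV_SEL_GEN1,
};

static const struct demo_plat_data demo_gen2 = {
	.inv_sel_reg = REG_MMU_INV_SEL_GEN2,
};

/*
 * regs models the register block as an array of 32-bit words and must
 * cover at least offset 0x038. The caller never needs to know which
 * generation it is talking to.
 */
static void demo_select_invalidate(uint32_t *regs,
				   const struct demo_plat_data *plat)
{
	regs[plat->inv_sel_reg / sizeof(uint32_t)] = F_INVLD_EN1 | F_INVLD_EN0;
}
```

Keeping the offset in data rather than in an if/else on the SoC type means
a further generation would only need one more table entry in this model.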
[PATCH v4 5/7] iommu/mediatek: Add sub_comm id in translation fault
The maximum number of larbs that one IOMMU HW supports is 8 (larb0~larb7
in the diagram below). If there are more than 8 larbs, a sub_common is
used to merge several larbs into one larb. In this case, larb_id is
extended: bit[11:9] is the common-id and bit[8:7] is the subcommon-id.
From these two fields we can get the real larb number when a
translation fault happens.
The diagram is as below:

                 EMI
                  |
                IOMMU
                  |
          -----------------
          |               |
       common1         common0
          |               |
          -----------------
                  |
             smi common
                  |
  ------------------------------------
  |       |       |       |     |    |
 3'd0    3'd1    3'd2    3'd3  ...  3'd7   <-common_id(max is 8)
  |       |       |       |     |    |
Larb0   Larb1     |     Larb3  ... Larb7
                  |
            smi sub common
                  |
     --------------------------
     |        |       |       |
    2'd0     2'd1    2'd2    2'd3   <-sub_common_id(max is 4)
     |        |       |       |
   Larb8    Larb9   Larb10  Larb11

In this patch we extend larb_remap[] to larb_remap[8][4] for this.
larb_remap[x][y]: x means the common-id above, y means the subcommon_id above.

We can also tell whether the M4U HW has a sub_common from the
has_sub_comm property.

Signed-off-by: Chao Hao
Reviewed-by: Yong Wu
---
 drivers/iommu/mtk_iommu.c | 20 +---
 drivers/iommu/mtk_iommu.h | 3 ++-
 2 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index f23919feba4e..a687e8db0e51 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -91,6 +91,8 @@
 #define REG_MMU1_INVLD_PA			0x148
 #define REG_MMU0_INT_ID				0x150
 #define REG_MMU1_INT_ID				0x154
+#define F_MMU_INT_ID_COMM_ID(a)			(((a) >> 9) & 0x7)
+#define F_MMU_INT_ID_SUB_COMM_ID(a)		(((a) >> 7) & 0x3)
 #define F_MMU_INT_ID_LARB_ID(a)			(((a) >> 7) & 0x7)
 #define F_MMU_INT_ID_PORT_ID(a)			(((a) >> 2) & 0x1f)
 
@@ -229,7 +231,7 @@ static irqreturn_t mtk_iommu_isr(int irq, void *dev_id)
 	struct mtk_iommu_data *data = dev_id;
 	struct mtk_iommu_domain *dom = data->m4u_dom;
 	u32 int_state, regval, fault_iova, fault_pa;
-	unsigned int fault_larb, fault_port;
+	unsigned int fault_larb, fault_port, sub_comm = 0;
 	bool layer, write;
 
 	/* Read error info from registers */
@@ -245,10 +247,14 @@ static irqreturn_t mtk_iommu_isr(int irq, void *dev_id)
 	}
 	layer = fault_iova & F_MMU_FAULT_VA_LAYER_BIT;
 	write = fault_iova & F_MMU_FAULT_VA_WRITE_BIT;
-	fault_larb = F_MMU_INT_ID_LARB_ID(regval);
 	fault_port = F_MMU_INT_ID_PORT_ID(regval);
-
-	fault_larb = data->plat_data->larbid_remap[fault_larb];
+	if (data->plat_data->has_sub_comm) {
+		fault_larb = F_MMU_INT_ID_COMM_ID(regval);
+		sub_comm = F_MMU_INT_ID_SUB_COMM_ID(regval);
+	} else {
+		fault_larb = F_MMU_INT_ID_LARB_ID(regval);
+	}
+	fault_larb = data->plat_data->larbid_remap[fault_larb][sub_comm];
 
 	if (report_iommu_fault(&dom->domain, data->dev, fault_iova,
 			       write ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ)) {
@@ -778,7 +784,7 @@ static const struct mtk_iommu_plat_data mt2712_data = {
 	.has_bclk       = true,
 	.has_vld_pa_rng = true,
 	.inv_sel_reg    = REG_MMU_INV_SEL_GEN1,
-	.larbid_remap   = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9},
+	.larbid_remap   = {{0}, {1}, {2}, {3}, {4}, {5}, {6}, {7}},
 };
 
 static const struct mtk_iommu_plat_data mt8173_data = {
@@ -787,14 +793,14 @@ static const struct mtk_iommu_plat_data mt8173_data = {
 	.has_bclk     = true,
 	.reset_axi    = true,
 	.inv_sel_reg  = REG_MMU_INV_SEL_GEN1,
-	.larbid_remap = {0, 1, 2, 3, 4, 5}, /* Linear mapping. */
+	.larbid_remap = {{0}, {1}, {2}, {3}, {4}, {5}}, /* Linear mapping. */
 };
 
 static const struct mtk_iommu_plat_data mt8183_data = {
 	.m4u_plat     = M4U_MT8183,
 	.reset_axi    = true,
 	.inv_sel_reg  = REG_MMU_INV_SEL_GEN1,
-	.larbid_remap = {0, 4, 5, 6, 7, 2, 3, 1},
+	.larbid_remap = {{0}, {4}, {5}, {6}, {7}, {2}, {3}, {1}},
 };
 
 static const struct of_device_id mtk_iommu_of_ids[] = {
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index afd7a2de5c1e..d51ff99c2c71 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -41,10 +41,11 @@ struct mtk_iommu_plat_data {
 	/* HW will use the EMI clock if there isn't the "bclk". */
 	bool
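For readers following the fault decoding in the patch above, here is a minimal standalone sketch of the two-level lookup. The four field macros are copied from the patch. Everything else is an assumption made for illustration: struct plat_data is a simplified stand-in for mtk_iommu_plat_data, COMM_NR_MAX and SUB_COMM_NR_MAX are made-up names for the [8][4] bounds, and the example_sub_comm table is only derived from the diagram (common 2 fanning out to Larb8~Larb11), not from any real SoC data.

```c
#include <assert.h>
#include <stdint.h>

/* Field extractors as defined in the patch. */
#define F_MMU_INT_ID_COMM_ID(a)      (((a) >> 9) & 0x7)
#define F_MMU_INT_ID_SUB_COMM_ID(a)  (((a) >> 7) & 0x3)
#define F_MMU_INT_ID_LARB_ID(a)      (((a) >> 7) & 0x7)
#define F_MMU_INT_ID_PORT_ID(a)      (((a) >> 2) & 0x1f)

/* Hypothetical names for the larb_remap[8][4] bounds. */
#define COMM_NR_MAX      8
#define SUB_COMM_NR_MAX  4

/* Simplified stand-in for struct mtk_iommu_plat_data. */
struct plat_data {
	int has_sub_comm;
	unsigned char larbid_remap[COMM_NR_MAX][SUB_COMM_NR_MAX];
};

/* Illustrative table only: common 2 leads to the sub common (Larb8~Larb11). */
static const struct plat_data example_sub_comm = {
	.has_sub_comm = 1,
	.larbid_remap = {{0}, {1}, {8, 9, 10, 11}, {3}, {4}, {5}, {6}, {7}},
};

/* Linear mapping without a sub common, like the converted older tables. */
static const struct plat_data example_linear = {
	.has_sub_comm = 0,
	.larbid_remap = {{0}, {1}, {2}, {3}, {4}, {5}, {6}, {7}},
};

/* Mirror of the ISR logic: pick the row, then the column, then remap. */
static unsigned int decode_fault_larb(const struct plat_data *p, uint32_t regval)
{
	unsigned int idx, sub_comm = 0;

	if (p->has_sub_comm) {
		idx = F_MMU_INT_ID_COMM_ID(regval);
		sub_comm = F_MMU_INT_ID_SUB_COMM_ID(regval);
	} else {
		idx = F_MMU_INT_ID_LARB_ID(regval);
	}
	assert(idx < COMM_NR_MAX && sub_comm < SUB_COMM_NR_MAX);
	return p->larbid_remap[idx][sub_comm];
}
```

Without a sub common, sub_comm stays 0, so the old one-dimensional tables only need each entry wrapped in braces, which is what the mt2712, mt8173 and mt8183 hunks do.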
[PATCH v4 4/7] iommu/mediatek: Move inv_sel_reg into the plat_data
For mt6779, the offset of the MMU_INV_SEL register changes from 0x38 to
0x2c, so put inv_sel_reg into the plat_data to select it per SoC.
In addition, rename the existing definition to REG_MMU_INV_SEL_GEN1; it
is used by the SoCs before mt6779.

Change since v3:
1. Fix coding style

Cc: Yong Wu
Signed-off-by: Chao Hao
---
 drivers/iommu/mtk_iommu.c | 19 +++
 drivers/iommu/mtk_iommu.h | 1 +
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 239d2cdbbc9f..f23919feba4e 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -37,7 +37,7 @@
 #define REG_MMU_INVLD_START_A			0x024
 #define REG_MMU_INVLD_END_A			0x028
 
-#define REG_MMU_INV_SEL				0x038
+#define REG_MMU_INV_SEL_GEN1			0x038
 #define F_INVLD_EN0				BIT(0)
 #define F_INVLD_EN1				BIT(1)
 
@@ -168,7 +168,7 @@ static void mtk_iommu_tlb_flush_all(void *cookie)
 
 	for_each_m4u(data) {
 		writel_relaxed(F_INVLD_EN1 | F_INVLD_EN0,
-			       data->base + REG_MMU_INV_SEL);
+			       data->base + data->plat_data->inv_sel_reg);
 		writel_relaxed(F_ALL_INVLD, data->base + REG_MMU_INVALIDATE);
 		wmb(); /* Make sure the tlb flush all done */
 	}
@@ -185,7 +185,7 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size,
 	for_each_m4u(data) {
 		spin_lock_irqsave(&data->tlb_lock, flags);
 		writel_relaxed(F_INVLD_EN1 | F_INVLD_EN0,
-			       data->base + REG_MMU_INV_SEL);
+			       data->base + data->plat_data->inv_sel_reg);
 
 		writel_relaxed(iova, data->base + REG_MMU_INVLD_START_A);
 		writel_relaxed(iova + size - 1,
@@ -773,11 +773,12 @@ static const struct dev_pm_ops mtk_iommu_pm_ops = {
 };
 
 static const struct mtk_iommu_plat_data mt2712_data = {
-	.m4u_plat = M4U_MT2712,
-	.has_4gb_mode = true,
-	.has_bclk = true,
-	.has_vld_pa_rng = true,
-	.larbid_remap = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9},
+	.m4u_plat       = M4U_MT2712,
+	.has_4gb_mode   = true,
+	.has_bclk       = true,
+	.has_vld_pa_rng = true,
+	.inv_sel_reg    = REG_MMU_INV_SEL_GEN1,
+	.larbid_remap   = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9},
 };
 
 static const struct mtk_iommu_plat_data mt8173_data = {
@@ -785,12 +786,14 @@ static const struct mtk_iommu_plat_data mt8173_data = {
 	.has_4gb_mode = true,
 	.has_bclk     = true,
 	.reset_axi    = true,
+	.inv_sel_reg  = REG_MMU_INV_SEL_GEN1,
 	.larbid_remap = {0, 1, 2, 3, 4, 5}, /* Linear mapping. */
 };
 
 static const struct mtk_iommu_plat_data mt8183_data = {
 	.m4u_plat     = M4U_MT8183,
 	.reset_axi    = true,
+	.inv_sel_reg  = REG_MMU_INV_SEL_GEN1,
 	.larbid_remap = {0, 4, 5, 6, 7, 2, 3, 1},
 };
 
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index d711ac630037..afd7a2de5c1e 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -43,6 +43,7 @@ struct mtk_iommu_plat_data {
 	bool				has_misc_ctrl;
 	bool				has_vld_pa_rng;
 	bool				reset_axi;
+	u32				inv_sel_reg;
 	unsigned char			larbid_remap[MTK_LARB_NR_MAX];
 };

-- 
2.18.0
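The idea of the patch above, keeping the register offset in per-SoC data instead of a compile-time constant, can be sketched outside the kernel roughly as follows. This is only a sketch under stated assumptions: a plain array stands in for the MMIO window, reg_write() is a made-up replacement for writel_relaxed(), and REG_MMU_INV_SEL_GEN2 is a hypothetical name for the 0x2c offset that the commit message mentions for mt6779.

```c
#include <assert.h>
#include <stdint.h>

#define REG_MMU_INV_SEL_GEN1  0x038
#define REG_MMU_INV_SEL_GEN2  0x02c  /* hypothetical name for the mt6779 offset */
#define F_INVLD_EN0           (1u << 0)
#define F_INVLD_EN1           (1u << 1)

#define REG_SPACE_WORDS       0x100  /* size of the fake register file, in words */

/* Simplified stand-in for struct mtk_iommu_plat_data. */
struct plat_data {
	uint32_t inv_sel_reg;  /* byte offset of the invalidate-select register */
};

/* Stand-in for writel_relaxed(): store val at a byte offset in a plain array. */
static void reg_write(uint32_t *base, uint32_t offset, uint32_t val)
{
	assert(offset % 4 == 0 && offset / 4 < REG_SPACE_WORDS);
	base[offset / 4] = val;
}

/* The caller no longer knows the offset; it comes from the platform data. */
static void select_invalidate(uint32_t *base, const struct plat_data *p)
{
	reg_write(base, p->inv_sel_reg, F_INVLD_EN1 | F_INVLD_EN0);
}
```

With this layout, supporting a SoC that moved the register only needs a new plat_data entry; the flush paths stay unchanged.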
[PATCH v4 00/07] MT6779 IOMMU SUPPORT
This patchset adds mt6779 iommu support.
mt6779 has two iommus, MM_IOMMU(M4U) and APU_IOMMU, which use the ARM
Short-Descriptor translation format.
The mt6779's MM_IOMMU-SMI and APU_IOMMU HW diagram is as below, it is only
a brief diagram:

                              EMI
                               |
            --------------------------------------
            |                                    |
        MM_IOMMU                             APU_IOMMU
            |                                    |
        SMI_COMMON-----------                 APU_BUS
            |               |                    |
      SMI_LARB(0~11)        |                    |
            |               |                    |
            |               |             --------------
            |               |             |     |      |
    Multimedia engine      CCU           VPU   MDLA   EMDA

All the connections are hardware fixed, software can not adjust them.
Compared with mt8183, the SMI_BUS_ID width has changed from 10 to 12. The
SMI Larb number is described in bit[11:7], the Port number is described in
bit[6:2]. In addition, some registers have changed in mt6779, so we need
to redefine and reuse them.

The patchset only uses MM_IOMMU, so we only add the MM_IOMMU basic
functions, such as smi_larb port definition, registers definition and
hardware initialization.

change notes:
v4:
 1. Rebase on v5.8-rc1.
 2. Fix coding style.
 3. Add F_MMU_IN_ORDER_WR_EN definition in MISC_CTRL to improve performance.

v3:
 1. Rebase on v5.7-rc1.
 2. Remove unused port definition, ex: APU and CCU port in mt6779-larb-port.h.
 3. Remove "change single domain to multiple domain" part (from PATCH v2 09/19 to PATCH v2 19/19).
 4. Redesign mt6779 basic part
    (1) Add some register definitions and reuse them.
    (2) Redesign smi larb bus ID to analyze IOMMU translation fault.
    (3) Only init MM_IOMMU and do not use APU_IOMMU.
 http://lists.infradead.org/pipermail/linux-mediatek/2020-May/029811.html

v2:
 1. Rebase on v5.5-rc1.
 2. Delete the M4U_PORT_UNKNOWN define because it is not used.
 3. Correct coding format.
 4. Rename offset=0x48 register.
 5. Split "iommu/mediatek: Add mt6779 IOMMU basic support(patch v1)" into several patches (patch v2).
 http://lists.infradead.org/pipermail/linux-mediatek/2020-January/026131.html

v1:
 http://lists.infradead.org/pipermail/linux-mediatek/2019-November/024567.html

Chao Hao (7):
  dt-bindings: mediatek: Add bindings for MT6779
  iommu/mediatek: Rename the register STANDARD_AXI_MODE(0x48) to MISC_CTRL
  iommu/mediatek: Set MISC_CTRL register
  iommu/mediatek: Move inv_sel_reg into the plat_data
  iommu/mediatek: Add sub_comm id in translation fault
  iommu/mediatek: Add REG_MMU_WR_LEN definition preparing for mt6779
  iommu/mediatek: Add mt6779 basic support

 .../bindings/iommu/mediatek,iommu.txt | 2 +
 drivers/iommu/mtk_iommu.c | 92 ++--
 drivers/iommu/mtk_iommu.h | 10 +-
 include/dt-bindings/memory/mt6779-larb-port.h | 206 ++
 4 files changed, 285 insertions(+), 25 deletions(-)

-- 
2.18.0
[PATCH v4 3/7] iommu/mediatek: Set MISC_CTRL register
Add F_MMU_IN_ORDER_WR_EN definition in MISC_CTRL.
In order to improve performance, we always disable STANDARD_AXI_MODE and
IN_ORDER_WR_EN in MISC_CTRL.

Change since v3:
1. Rename "Disable STANDARD_AXI_MODE in MISC_CTRL" to "Set MISC_CTRL register"
2. Add F_MMU_IN_ORDER_WR_EN definition in MISC_CTRL
   We need to disable in_order_write to improve performance

Cc: Yong Wu
Signed-off-by: Chao Hao
---
 drivers/iommu/mtk_iommu.c | 11 +++
 drivers/iommu/mtk_iommu.h | 1 +
 2 files changed, 12 insertions(+)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 88d3df5b91c2..239d2cdbbc9f 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -42,6 +42,9 @@
 #define F_INVLD_EN1				BIT(1)
 
 #define REG_MMU_MISC_CTRL			0x048
+#define F_MMU_IN_ORDER_WR_EN			(BIT(1) | BIT(17))
+#define F_MMU_STANDARD_AXI_MODE_BIT		(BIT(3) | BIT(19))
+
 #define REG_MMU_DCM_DIS				0x050
 
 #define REG_MMU_CTRL_REG			0x110
@@ -578,6 +581,14 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data)
 		writel_relaxed(0, data->base + REG_MMU_MISC_CTRL);
 	}
 
+	if (data->plat_data->has_misc_ctrl) {
+		/* For mm_iommu, it can improve performance by the setting */
+		regval = readl_relaxed(data->base + REG_MMU_MISC_CTRL);
+		regval &= ~F_MMU_STANDARD_AXI_MODE_BIT;
+		regval &= ~F_MMU_IN_ORDER_WR_EN;
+		writel_relaxed(regval, data->base + REG_MMU_MISC_CTRL);
+	}
+
 	if (devm_request_irq(data->dev, data->irq, mtk_iommu_isr, 0,
 			     dev_name(data->dev), (void *)data)) {
 		writel_relaxed(0, data->base + REG_MMU_PT_BASE_ADDR);
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index 1b6ea839b92c..d711ac630037 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -40,6 +40,7 @@ struct mtk_iommu_plat_data {
 
 	/* HW will use the EMI clock if there isn't the "bclk". */
 	bool				has_bclk;
+	bool				has_misc_ctrl;
 	bool				has_vld_pa_rng;
 	bool				reset_axi;
 	unsigned char			larbid_remap[MTK_LARB_NR_MAX];
-- 
2.18.0
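The read-modify-write in mtk_iommu_hw_init() above clears two bit pairs and leaves every other bit of MISC_CTRL untouched. A minimal sketch of just the value computation, with the MMIO access left out, might look like this. The two masks are taken from the patch; BIT() is redefined locally and misc_ctrl_perf_value() is a made-up helper name.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n)                       (1u << (n))
#define F_MMU_IN_ORDER_WR_EN         (BIT(1) | BIT(17))
#define F_MMU_STANDARD_AXI_MODE_BIT  (BIT(3) | BIT(19))

/* The two fields are independent, so the masks must not share any bit. */
static_assert((F_MMU_IN_ORDER_WR_EN & F_MMU_STANDARD_AXI_MODE_BIT) == 0,
	      "MISC_CTRL masks overlap");

/*
 * Hypothetical helper: given the value read from MISC_CTRL, return the value
 * to write back, with standard AXI mode and in-order write both disabled.
 */
static uint32_t misc_ctrl_perf_value(uint32_t regval)
{
	regval &= ~F_MMU_STANDARD_AXI_MODE_BIT;
	regval &= ~F_MMU_IN_ORDER_WR_EN;
	return regval;
}
```

Each mask covers one bit in the low half and the same bit 16 positions higher, so the combined cleared mask is 0x000a000a.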
Re: [PATCH v4 6/7] iommu/mediatek: Add REG_MMU_WR_LEN definition preparing for mt6779
On Sun, 2020-06-21 at 13:01 +0200, Matthias Brugger wrote:
>
> On 19/06/2020 12:56, chao hao wrote:
> > On Wed, 2020-06-17 at 11:22 +0200, Matthias Brugger wrote:
> >>
> >> On 17/06/2020 05:00, Chao Hao wrote:
> >>> Some platforms(ex: mt6779) have a new register called by REG_MMU_WR_LEN
> >>> to improve performance.
> >>> This patch add this register definition.
> >>
> >> Please be more specific what this register is about.
> >>
> > OK. thanks.
> > We can use "has_wr_len" flag to control whether we need to set the
> > register. If the register uses default value, iommu will send command to
> > EMI without restriction, when the number of commands become more and
> > more, it will drop the EMI performance. So when more than
> > ten_commands(default value) don't be handled for EMI, IOMMU will stop
> > send command to EMI for keeping EMI's performance by enabling write
> > throttling mechanism(bit[5][21]=0) in MMU_WR_LEN_CTRL register.
> >
> > I will write description above to commit message in next version
> >
> >>>
> >>> Signed-off-by: Chao Hao
> >>> ---
> >>> drivers/iommu/mtk_iommu.c | 10 ++
> >>> drivers/iommu/mtk_iommu.h | 2 ++
> >>> 2 files changed, 12 insertions(+)
> >>>
> >>> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> >>> index a687e8db0e51..c706bca6487e 100644
> >>> --- a/drivers/iommu/mtk_iommu.c
> >>> +++ b/drivers/iommu/mtk_iommu.c
> >>> @@ -46,6 +46,8 @@
> >>>  #define F_MMU_STANDARD_AXI_MODE_BIT		(BIT(3) | BIT(19))
> >>>
> >>>  #define REG_MMU_DCM_DIS			0x050
> >>> +#define REG_MMU_WR_LEN				0x054
> >>> +#define F_MMU_WR_THROT_DIS_BIT			(BIT(5) | BIT(21))
> >>>
> >>>  #define REG_MMU_CTRL_REG			0x110
> >>>  #define F_MMU_TF_PROT_TO_PROGRAM_ADDR		(2 << 4)
> >>> @@ -581,6 +583,12 @@ static int mtk_iommu_hw_init(const struct
> >>> mtk_iommu_data *data)
> >>>  		writel_relaxed(regval, data->base + REG_MMU_VLD_PA_RNG);
> >>>  	}
> >>>  	writel_relaxed(0, data->base + REG_MMU_DCM_DIS);
> >>> +	if (data->plat_data->has_wr_len) {
> >>> +		/* write command throttling mode */
> >>> +		regval = readl_relaxed(data->base + REG_MMU_WR_LEN);
> >>> +		regval &= ~F_MMU_WR_THROT_DIS_BIT;
> >>> +		writel_relaxed(regval, data->base + REG_MMU_WR_LEN);
> >>> +	}
> >>>
> >>>  	if (data->plat_data->reset_axi) {
> >>>  		/* The register is called STANDARD_AXI_MODE in this case */
> >>> @@ -737,6 +745,7 @@ static int __maybe_unused mtk_iommu_suspend(struct
> >>> device *dev)
> >>>  	struct mtk_iommu_suspend_reg *reg = &data->reg;
> >>>  	void __iomem *base = data->base;
> >>>
> >>> +	reg->wr_len = readl_relaxed(base + REG_MMU_WR_LEN);
> >>
> >> Can we read/write the register without any side effect although hardware
> >> has not
> >> implemented it (!has_wr_len)?
> >
> > It doesn't have side effect. Because all the MTK platform have the
> > register for iommu HW. If we need to have requirement for performance,
> > we can set it by has_wr_len.
> > But I'm Sorry, the name of flag(has_wr_len) is not exact, I will rename
> > it in next version, ex: "wr_throt_en"
> >
> >>
> >>
> >>>  	reg->misc_ctrl = readl_relaxed(base + REG_MMU_MISC_CTRL);
> >>>  	reg->dcm_dis = readl_relaxed(base + REG_MMU_DCM_DIS);
> >>>  	reg->ctrl_reg = readl_relaxed(base + REG_MMU_CTRL_REG);
> >>> @@ -761,6 +770,7 @@ static int __maybe_unused mtk_iommu_resume(struct
> >>> device *dev)
> >>>  		dev_err(data->dev, "Failed to enable clk(%d) in resume\n", ret);
> >>>  		return ret;
> >>>  	}
> >>> +	writel_relaxed(reg->wr_len, base + REG_MMU_WR_LEN);
> >>>  	writel_relaxed(reg->misc_ctrl, base + REG_MMU_MISC_CTRL);
> >>>  	writel_relaxed(reg->dcm_dis, base + REG_MMU_DCM_DIS);
> >>>  	writel_relaxed(reg->ctrl_reg, base + REG_MMU_CTRL_REG);
> >>> diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
> >>> index d51ff99c2c71..9971
Re: [PATCH v4 3/7] iommu/mediatek: Set MISC_CTRL register
On Sat, 2020-06-20 at 10:03 +0800, Yong Wu wrote:
> Hi Chao,
>
> On Thu, 2020-06-18 at 19:49 +0800, chao hao wrote:
> > On Wed, 2020-06-17 at 11:34 +0200, Matthias Brugger wrote:
> > [snip]
> >
> > >
> > > >  #define REG_MMU_MISC_CTRL			0x048
> > > > +#define F_MMU_IN_ORDER_WR_EN			(BIT(1) | BIT(17))
> > > > +#define F_MMU_STANDARD_AXI_MODE_BIT		(BIT(3) | BIT(19))
> > > > +
> > > >  #define REG_MMU_DCM_DIS			0x050
> > > >
> > > >  #define REG_MMU_CTRL_REG			0x110
> > > > @@ -578,6 +581,14 @@ static int mtk_iommu_hw_init(const struct
> > > > mtk_iommu_data *data)
> > > >  		writel_relaxed(0, data->base + REG_MMU_MISC_CTRL);
> > > >  	}
> > > >
> > > > +	if (data->plat_data->has_misc_ctrl) {
> > >
> > > That's confusing. We renamed the register to misc_ctrl, but it's present
> > > in all
> > > SoCs. We should find a better name for this flag to describe what the
> > > hardware
> > > supports.
> > >
> >
> > ok, thanks for your advice, I will rename it in next version.
> > ex:has_perf_req(has performance requirement)
> >
> >
> > > Regards,
> > > Matthias
> > >
> > > > +		/* For mm_iommu, it can improve performance by the
> > > > setting */
> > > > +		regval = readl_relaxed(data->base + REG_MMU_MISC_CTRL);
> > > > +		regval &= ~F_MMU_STANDARD_AXI_MODE_BIT;
> > > > +		regval &= ~F_MMU_IN_ORDER_WR_EN;
>
> Note: mt2712 also is MISC_CTRL register, but it don't use this
> in_order setting.
>
> As commented in v3. 0x48 is either STANDARD_AXI_MODE or MISC_CTRL
> register. No need two flags(reset_axi/has_xx) for it.
>
> something like:
>
> regval = readl_relaxed(data->base + REG_MMU_MISC_CTRL);
> if (reset_axi) {
>     regval = 0;
> } else { /* MISC_CTRL */
>     if (!apu[1])
>         regval &= ~F_MMU_STANDARD_AXI_MODE_BIT;
>     if (out_order_en)
>         regval &= ~F_MMU_IN_ORDER_WR_EN;
> }
> writel_relaxed(regval, data->base + REG_MMU_MISC_CTRL);
>
>
> [1] Your current patch doesn't support apu-iommu, thus, add it when
> necessary.

OK, the patchset doesn't need the "if (!apu[1])" check. I will fix it in
the next version.
Thanks

> > > > +		writel_relaxed(regval, data->base + REG_MMU_MISC_CTRL);
> > > > +	}
> > > > +
> > > >  	if (devm_request_irq(data->dev, data->irq, mtk_iommu_isr, 0,
> > > >  			     dev_name(data->dev), (void *)data)) {
> > > >  		writel_relaxed(0, data->base + REG_MMU_PT_BASE_ADDR);
> > > > diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
> > > > index 1b6ea839b92c..d711ac630037 100644
> > > > --- a/drivers/iommu/mtk_iommu.h
> > > > +++ b/drivers/iommu/mtk_iommu.h
> > > > @@ -40,6 +40,7 @@ struct mtk_iommu_plat_data {
> > > >
> > > >  	/* HW will use the EMI clock if there isn't the "bclk". */
> > > >  	bool				has_bclk;
> > > > +	bool				has_misc_ctrl;
> > > >  	bool				has_vld_pa_rng;
> > > >  	bool				reset_axi;
> > > >  	unsigned char			larbid_remap[MTK_LARB_NR_MAX];
> > > >
> > >
> >
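Yong Wu's suggestion above, with the apu check dropped as agreed in the reply, can be written as a small pure function to make the branch structure easier to review. This is only a sketch: misc_ctrl_value() is a made-up name, reset_axi and out_order_en are passed as plain booleans rather than read from plat_data, and the MMIO read and write are left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT(n)                       (1u << (n))
#define F_MMU_IN_ORDER_WR_EN         (BIT(1) | BIT(17))
#define F_MMU_STANDARD_AXI_MODE_BIT  (BIT(3) | BIT(19))

/* The two fields are independent, so the masks must not share any bit. */
static_assert((F_MMU_IN_ORDER_WR_EN & F_MMU_STANDARD_AXI_MODE_BIT) == 0,
	      "MISC_CTRL masks overlap");

/*
 * Hypothetical helper: compute the value to write to offset 0x48.
 * reset_axi:    the register is STANDARD_AXI_MODE and is simply cleared.
 * out_order_en: the register is MISC_CTRL and in-order write is also disabled.
 */
static uint32_t misc_ctrl_value(uint32_t regval, bool reset_axi, bool out_order_en)
{
	if (reset_axi)
		return 0;

	/* MISC_CTRL: always leave standard AXI mode. */
	regval &= ~F_MMU_STANDARD_AXI_MODE_BIT;
	if (out_order_en)
		regval &= ~F_MMU_IN_ORDER_WR_EN;
	return regval;
}
```

One code path then covers both register flavours, so a second has_xx flag next to reset_axi is not needed; a SoC such as mt2712, which has MISC_CTRL but does not use the in_order setting, would pass out_order_en = false.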