Re: [PATCH v2 4/4] arm64: dts: allwinner: h6: Use RSB for AXP805 PMIC connection

2021-01-04 Thread André Przywara
On 03/01/2021 10:00, Samuel Holland wrote:
> On boards where the only peripheral connected to PL0/PL1 is an X-Powers
> PMIC, configure the connection to use the RSB bus rather than the I2C
> bus. Compared to the I2C controller that shares the pins, the RSB
> controller allows a higher bus frequency, and it is more CPU-efficient.

But is it really necessary to change the DTs for those boards in this
way? It means those newer DTs now become incompatible with older
kernels, and I don't know if those reasons above really justify this.

I understand that we officially don't care about "newer DTs on older
kernels", but do we really need to break this deliberately, for no
pressing reasons?

Cheers,
Andre

P.S. I am fine with supporting RSB on H6, and even using it on new DTs,
just want to avoid breaking existing ones.

> Signed-off-by: Samuel Holland 
> ---
>  .../dts/allwinner/sun50i-h6-beelink-gs1.dts   | 38 +--
>  .../dts/allwinner/sun50i-h6-orangepi-3.dts| 14 +++
>  .../dts/allwinner/sun50i-h6-orangepi.dtsi | 22 +--
>  3 files changed, 37 insertions(+), 37 deletions(-)
> 
> diff --git a/arch/arm64/boot/dts/allwinner/sun50i-h6-beelink-gs1.dts 
> b/arch/arm64/boot/dts/allwinner/sun50i-h6-beelink-gs1.dts
> index 7c9dbde645b5..3452add30cc4 100644
> --- a/arch/arm64/boot/dts/allwinner/sun50i-h6-beelink-gs1.dts
> +++ b/arch/arm64/boot/dts/allwinner/sun50i-h6-beelink-gs1.dts
> @@ -150,12 +150,28 @@ &pio {
>   vcc-pg-supply = <&reg_aldo1>;
>  };
>  
> -&r_i2c {
> +&r_ir {
> + linux,rc-map-name = "rc-beelink-gs1";
> + status = "okay";
> +};
> +
> +&r_pio {
> + /*
> +  * FIXME: We can't add that supply for now since it would
> +  * create a circular dependency between pinctrl, the regulator
> +  * and the RSB Bus.
> +  *
> +  * vcc-pl-supply = <&reg_aldo1>;
> +  */
> + vcc-pm-supply = <&reg_aldo1>;
> +};
> +
> +&r_rsb {
>   status = "okay";
>  
> - axp805: pmic@36 {
> + axp805: pmic@745 {
>   compatible = "x-powers,axp805", "x-powers,axp806";
> - reg = <0x36>;
> + reg = <0x745>;
>   interrupt-parent = <&r_intc>;
>   interrupts = <0 IRQ_TYPE_LEVEL_LOW>;
>   interrupt-controller;
> @@ -273,22 +289,6 @@ sw {
>   };
>  };
>  
> -&r_ir {
> - linux,rc-map-name = "rc-beelink-gs1";
> - status = "okay";
> -};
> -
> -&r_pio {
> - /*
> -  * PL0 and PL1 are used for PMIC I2C
> -  * don't enable the pl-supply else
> -  * it will fail at boot
> -  *
> -  * vcc-pl-supply = <&reg_aldo1>;
> -  */
> - vcc-pm-supply = <&reg_aldo1>;
> -};
> -
>  &rtc {
>   clocks = <&ext_osc32k>;
>  };
> diff --git a/arch/arm64/boot/dts/allwinner/sun50i-h6-orangepi-3.dts 
> b/arch/arm64/boot/dts/allwinner/sun50i-h6-orangepi-3.dts
> index 15c9dd8c4479..16702293ac0b 100644
> --- a/arch/arm64/boot/dts/allwinner/sun50i-h6-orangepi-3.dts
> +++ b/arch/arm64/boot/dts/allwinner/sun50i-h6-orangepi-3.dts
> @@ -175,12 +175,16 @@ &pio {
>   vcc-pg-supply = <&reg_vcc_wifi_io>;
>  };
>  
> -&r_i2c {
> +&r_ir {
> + status = "okay";
> +};
> +
> +&r_rsb {
>   status = "okay";
>  
> - axp805: pmic@36 {
> + axp805: pmic@745 {
>   compatible = "x-powers,axp805", "x-powers,axp806";
> - reg = <0x36>;
> + reg = <0x745>;
>   interrupt-parent = <&r_intc>;
>   interrupts = <0 IRQ_TYPE_LEVEL_LOW>;
>   interrupt-controller;
> @@ -291,10 +295,6 @@ sw {
>   };
>  };
>  
> -&r_ir {
> - status = "okay";
> -};
> -
>  &rtc {
>   clocks = <&ext_osc32k>;
>  };
> diff --git a/arch/arm64/boot/dts/allwinner/sun50i-h6-orangepi.dtsi 
> b/arch/arm64/boot/dts/allwinner/sun50i-h6-orangepi.dtsi
> index ebc120a9232f..23e3cb2ffd8d 100644
> --- a/arch/arm64/boot/dts/allwinner/sun50i-h6-orangepi.dtsi
> +++ b/arch/arm64/boot/dts/allwinner/sun50i-h6-orangepi.dtsi
> @@ -112,12 +112,20 @@ &pio {
>   vcc-pg-supply = <&reg_aldo1>;
>  };
>  
> -&r_i2c {
> +&r_ir {
> + status = "okay";
> +};
> +
> +&r_pio {
> + vcc-pm-supply = <&reg_bldo3>;
> +};
> +
> +&r_rsb {
>   status = "okay";
>  
> - axp805: pmic@36 {
> + axp805: pmic@745 {
>   compatible = "x-powers,axp805", "x-powers,axp806";
> - reg = <0x36>;
> + reg = <0x745>;
>   interrupt-parent = <&r_intc>;
>   interrupts = <0 IRQ_TYPE_LEVEL_LOW>;
>   interrupt-controller;
> @@ -232,14 +240,6 @@ sw {
>   };
>  };
>  
> -&r_ir {
> - status = "okay";
> -};
> -
> -&r_pio {
> - vcc-pm-supply = <&reg_bldo3>;
> -};
> -
>  &rtc {
>   clocks = <&ext_osc32k>;
>  };
> 



Re: [PATCH 1/4] clk: sunxi-ng: h6-r: Add R_APB2_RSB clock and reset

2020-12-15 Thread André Przywara
On 15/12/2020 03:25, Samuel Holland wrote:
> On 12/14/20 8:57 AM, Maxime Ripard wrote:
>> Hi Samuel,
>>
>> On Sun, Dec 13, 2020 at 05:55:03PM -0600, Samuel Holland wrote:
>>> While no information about the H6 RSB controller is included in the
>>> datasheet or manual, the vendor BSP and power management blob both
>>> reference the RSB clock parent and register address. These values were
>>> verified by experimentation.
>>>
>>> Since this clock/reset are added late, the specifier is added at the end
>>> to maintain the existing DT binding. The code is kept in register order.
>>>
>>> Signed-off-by: Samuel Holland 
>>> ---
>>>  drivers/clk/sunxi-ng/ccu-sun50i-h6-r.c  | 5 +
>>>  drivers/clk/sunxi-ng/ccu-sun50i-h6-r.h  | 2 +-
>>>  include/dt-bindings/clock/sun50i-h6-r-ccu.h | 1 +
>>>  include/dt-bindings/reset/sun50i-h6-r-ccu.h | 1 +
>>>  4 files changed, 8 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/clk/sunxi-ng/ccu-sun50i-h6-r.c 
>>> b/drivers/clk/sunxi-ng/ccu-sun50i-h6-r.c
>>> index 50f8d1bc7046..56e351b513f3 100644
>>> --- a/drivers/clk/sunxi-ng/ccu-sun50i-h6-r.c
>>> +++ b/drivers/clk/sunxi-ng/ccu-sun50i-h6-r.c
>>> @@ -91,6 +91,8 @@ static SUNXI_CCU_GATE(r_apb2_uart_clk,"r-apb2-uart",  
>>> "r-apb2",
>>>   0x18c, BIT(0), 0);
>>>  static SUNXI_CCU_GATE(r_apb2_i2c_clk,  "r-apb2-i2c",   "r-apb2",
>>>   0x19c, BIT(0), 0);
>>> +static SUNXI_CCU_GATE(r_apb2_rsb_clk,  "r-apb2-rsb",   "r-apb2",
>>> + 0x1bc, BIT(0), 0);
>>>  static SUNXI_CCU_GATE(r_apb1_ir_clk,   "r-apb1-ir","r-apb1",
>>>   0x1cc, BIT(0), 0);
>>>  static SUNXI_CCU_GATE(r_apb1_w1_clk,   "r-apb1-w1","r-apb1",
>>> @@ -130,6 +132,7 @@ static struct ccu_common *sun50i_h6_r_ccu_clks[] = {
>>> &r_apb1_pwm_clk.common,
>>> &r_apb2_uart_clk.common,
>>> &r_apb2_i2c_clk.common,
>>> +   &r_apb2_rsb_clk.common,
>>> &r_apb1_ir_clk.common,
>>> &r_apb1_w1_clk.common,
>>> &ir_clk.common,
>>> @@ -147,6 +150,7 @@ static struct clk_hw_onecell_data sun50i_h6_r_hw_clks = 
>>> {
>>> [CLK_R_APB1_PWM]= &r_apb1_pwm_clk.common.hw,
>>> [CLK_R_APB2_UART]   = &r_apb2_uart_clk.common.hw,
>>> [CLK_R_APB2_I2C]= &r_apb2_i2c_clk.common.hw,
>>> +   [CLK_R_APB2_RSB]= &r_apb2_rsb_clk.common.hw,
>>> [CLK_R_APB1_IR] = &r_apb1_ir_clk.common.hw,
>>> [CLK_R_APB1_W1] = &r_apb1_w1_clk.common.hw,
>>> [CLK_IR]= &ir_clk.common.hw,
>>> @@ -161,6 +165,7 @@ static struct ccu_reset_map sun50i_h6_r_ccu_resets[] = {
>>> [RST_R_APB1_PWM]=  { 0x13c, BIT(16) },
>>> [RST_R_APB2_UART]   =  { 0x18c, BIT(16) },
>>> [RST_R_APB2_I2C]=  { 0x19c, BIT(16) },
>>> +   [RST_R_APB2_RSB]=  { 0x1bc, BIT(16) },
>>> [RST_R_APB1_IR] =  { 0x1cc, BIT(16) },
>>> [RST_R_APB1_W1] =  { 0x1ec, BIT(16) },
>>>  };
>>> diff --git a/drivers/clk/sunxi-ng/ccu-sun50i-h6-r.h 
>>> b/drivers/clk/sunxi-ng/ccu-sun50i-h6-r.h
>>> index 782117dc0b28..7e290b840803 100644
>>> --- a/drivers/clk/sunxi-ng/ccu-sun50i-h6-r.h
>>> +++ b/drivers/clk/sunxi-ng/ccu-sun50i-h6-r.h
>>> @@ -14,6 +14,6 @@
>>>  
>>>  #define CLK_R_APB2 3
>>>  
>>> -#define CLK_NUMBER (CLK_W1 + 1)
>>> +#define CLK_NUMBER (CLK_R_APB2_RSB + 1)
>>>  
>>>  #endif /* _CCU_SUN50I_H6_R_H */
>>> diff --git a/include/dt-bindings/clock/sun50i-h6-r-ccu.h 
>>> b/include/dt-bindings/clock/sun50i-h6-r-ccu.h
>>> index 76136132a13e..f46ec03848ca 100644
>>> --- a/include/dt-bindings/clock/sun50i-h6-r-ccu.h
>>> +++ b/include/dt-bindings/clock/sun50i-h6-r-ccu.h
>>> @@ -15,6 +15,7 @@
>>>  #define CLK_R_APB1_PWM 6
>>>  #define CLK_R_APB2_UART 7
>>>  #define CLK_R_APB2_I2C 8
>>> +#define CLK_R_APB2_RSB 13
>>>  #define CLK_R_APB1_IR  9
>>>  #define CLK_R_APB1_W1  10
>>>  
>>> diff --git a/include/dt-bindings/reset/sun50i-h6-r-ccu.h 
>>> b/include/dt-bindings/reset/sun50i-h6-r-ccu.h
>>> index 01c84dba49a4..6fe199a7969d 100644
>>> --- a/include/dt-bindings/reset/sun50i-h6-r-ccu.h
>>> +++ b/include/dt-bindings/reset/sun50i-h6-r-ccu.h
>>> @@ -11,6 +11,7 @@
>>>  #define RST_R_APB1_PWM 2
>>>  #define RST_R_APB2_UART 3
>>>  #define RST_R_APB2_I2C 4
>>> +#define RST_R_APB2_RSB 7
>>>  #define RST_R_APB1_IR  5
>>>  #define RST_R_APB1_W1  6
>>
>> I think for the clock and reset binding, we'll want to sort by number.
>> It's fairly easy to miss otherwise and if we end up adding another one
>> it wouldn't be far fetched to assume the same indices would be used

I agree here. Admittedly this whole approach is really fragile. I ended
up with some tiny tool to check for consecutive numbers, reporting any
outliers (some gaps are legit). Of course the "-r" versions of the CCU
are not really a big issue, but with the 100+ clocks in the main CCU

Re: [PATCH v2 14/21] phy: sun4i-usb: Rework "pmu_unk1" handling

2020-12-13 Thread André Przywara
On 13/12/2020 18:24, Icenowy Zheng wrote:
> On Fri, 2020-12-11 at 01:19 +, Andre Przywara wrote:
>> Newer SoCs (A100, H616) need to clear a different bit in our
>> "unknown"
>> PMU PHY register.
> 
> It looks like the unknown PHY register is the PHYCTL register of each
> individual PHY, and the bit that is cleared is called
> SUNXI_HCI_PHY_CTRL_SIDDQ in the BSP (similar to the USBC_PHY_CTL_SIDDQ
> bit we clear in the main PHYCTL).

Oh, right, somehow I failed to find this in the BSP; I guess I got
confused by that similar symbol. I will give it proper names.

Thanks!
Andre
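
For reference, the read-modify-write that the patch quoted in this message
generalises can be sketched in standalone C as below. This is only an
illustration: struct phy_cfg, its pmu_clrbits member and
pmu_apply_clrbits() are hypothetical stand-ins for the driver's
pmu_unk1_clrbits handling, and the register is modelled as a plain value
instead of an MMIO access via readl()/writel().

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n)	(UINT32_C(1) << (n))

/* Hypothetical per-SoC config, mirroring the pmu_unk1_clrbits idea. */
struct phy_cfg {
	uint32_t pmu_clrbits;	/* bits to clear in the PMU register, 0 = none */
};

/*
 * Pure read-modify-write helper: takes the value read from the register
 * and returns the value that would be written back.
 */
static uint32_t pmu_apply_clrbits(uint32_t regval, const struct phy_cfg *cfg)
{
	if (!cfg->pmu_clrbits)
		return regval;	/* this SoC needs no fixup */

	return regval & ~cfg->pmu_clrbits;
}
```

With a config carrying BIT(1), as the H3 entry does in the patch, this
matches the old "val & ~2" behaviour; a config with a zero mask leaves the
register untouched, which replaces the former "enable_pmu_unk1 = false".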

>>
>> Generalise the existing code by allowing configs to specify a bitmask
>> of bits to clear.
>>
>> Signed-off-by: Andre Przywara 
>> ---
>>  drivers/phy/allwinner/phy-sun4i-usb.c | 28 +++
>> 
>>  1 file changed, 11 insertions(+), 17 deletions(-)
>>
>> diff --git a/drivers/phy/allwinner/phy-sun4i-usb.c
>> b/drivers/phy/allwinner/phy-sun4i-usb.c
>> index 651d5e2a25ce..4ba0699e0bb4 100644
>> --- a/drivers/phy/allwinner/phy-sun4i-usb.c
>> +++ b/drivers/phy/allwinner/phy-sun4i-usb.c
>> @@ -115,9 +115,9 @@ struct sun4i_usb_phy_cfg {
>>  int hsic_index;
>>  enum sun4i_usb_phy_type type;
>>  u32 disc_thresh;
>> +u32 pmu_unk1_clrbits;
>>  u8 phyctl_offset;
>>  bool dedicated_clocks;
>> -bool enable_pmu_unk1;
>>  bool phy0_dual_route;
>>  int missing_phys;
>>  };
>> @@ -288,6 +288,12 @@ static int sun4i_usb_phy_init(struct phy *_phy)
>>  return ret;
>>  }
>>  
>> +if (phy->pmu && data->cfg->pmu_unk1_clrbits) {
>> +val = readl(phy->pmu + REG_PMU_UNK1);
>> +val &= ~data->cfg->pmu_unk1_clrbits;
>> +writel(val, phy->pmu + REG_PMU_UNK1);
>> +}
>> +
>>  if (data->cfg->type == sun8i_a83t_phy ||
>>  data->cfg->type == sun50i_h6_phy) {
>>  if (phy->index == 0) {
>> @@ -297,11 +303,6 @@ static int sun4i_usb_phy_init(struct phy *_phy)
>>  writel(val, data->base + data->cfg->phyctl_offset);
>>  }
>>  } else {
>> -if (phy->pmu && data->cfg->enable_pmu_unk1) {
>> -val = readl(phy->pmu + REG_PMU_UNK1);
>> -writel(val & ~2, phy->pmu + REG_PMU_UNK1);
>> -}
>> -
>>  /* Enable USB 45 Ohm resistor calibration */
>>  if (phy->index == 0)
>>  sun4i_usb_phy_write(phy, PHY_RES45_CAL_EN,
>> 0x01, 1);
>> @@ -867,7 +868,6 @@ static const struct sun4i_usb_phy_cfg
>> sun4i_a10_cfg = {
>>  .disc_thresh = 3,
>>  .phyctl_offset = REG_PHYCTL_A10,
>>  .dedicated_clocks = false,
>> -.enable_pmu_unk1 = false,
>>  };
>>  
>>  static const struct sun4i_usb_phy_cfg sun5i_a13_cfg = {
>> @@ -876,7 +876,6 @@ static const struct sun4i_usb_phy_cfg
>> sun5i_a13_cfg = {
>>  .disc_thresh = 2,
>>  .phyctl_offset = REG_PHYCTL_A10,
>>  .dedicated_clocks = false,
>> -.enable_pmu_unk1 = false,
>>  };
>>  
>>  static const struct sun4i_usb_phy_cfg sun6i_a31_cfg = {
>> @@ -885,7 +884,6 @@ static const struct sun4i_usb_phy_cfg
>> sun6i_a31_cfg = {
>>  .disc_thresh = 3,
>>  .phyctl_offset = REG_PHYCTL_A10,
>>  .dedicated_clocks = true,
>> -.enable_pmu_unk1 = false,
>>  };
>>  
>>  static const struct sun4i_usb_phy_cfg sun7i_a20_cfg = {
>> @@ -894,7 +892,6 @@ static const struct sun4i_usb_phy_cfg
>> sun7i_a20_cfg = {
>>  .disc_thresh = 2,
>>  .phyctl_offset = REG_PHYCTL_A10,
>>  .dedicated_clocks = false,
>> -.enable_pmu_unk1 = false,
>>  };
>>  
>>  static const struct sun4i_usb_phy_cfg sun8i_a23_cfg = {
>> @@ -903,7 +900,6 @@ static const struct sun4i_usb_phy_cfg
>> sun8i_a23_cfg = {
>>  .disc_thresh = 3,
>>  .phyctl_offset = REG_PHYCTL_A10,
>>  .dedicated_clocks = true,
>> -.enable_pmu_unk1 = false,
>>  };
>>  
>>  static const struct sun4i_usb_phy_cfg sun8i_a33_cfg = {
>> @@ -912,7 +908,6 @@ static const struct sun4i_usb_phy_cfg
>> sun8i_a33_cfg = {
>>  .disc_thresh = 3,
>>  .phyctl_offset = REG_PHYCTL_A33,
>>  .dedicated_clocks = true,
>> -.enable_pmu_unk1 = false,
>>  };
>>  
>>  static const struct sun4i_usb_phy_cfg sun8i_a83t_cfg = {
>> @@ -929,7 +924,7 @@ static const struct sun4i_usb_phy_cfg
>> sun8i_h3_cfg = {
>>  .disc_thresh = 3,
>>  .phyctl_offset = REG_PHYCTL_A33,
>>  .dedicated_clocks = true,
>> -.enable_pmu_unk1 = true,
>> +.pmu_unk1_clrbits = BIT(1),
>>  .phy0_dual_route = true,
>>  };
>>  
>> @@ -939,7 +934,7 @@ static const struct sun4i_usb_phy_cfg
>> sun8i_r40_cfg = {
>>  .disc_thresh = 3,
>>  .phyctl_offset = REG_PHYCTL_A33,
>>  .dedicated_clocks = true,
>> -.enable_pmu_unk1 = true,
>> +.pmu_unk1_clrbits = BIT(1),
>>  .phy0_dual_route = true,
>>  };
>>  
>> @@ -949,7 +944,7 @@ static const struct sun4i_usb_phy_cfg
>> sun8i_v3s_cfg = {
>>  .disc_thresh = 3,
>>  .phyctl_offset = REG_PHYCTL_A33,
>>  .dedicated_clocks = true,
>> -.enable_pmu_

Re: [PATCH v2 00/21] arm64: sunxi: Initial Allwinner H616 SoC support

2020-12-13 Thread André Przywara
On 13/12/2020 17:47, Icenowy Zheng wrote:

Hi Icenowy,

> On Fri, 2020-12-11 at 01:19 +, Andre Przywara wrote:
>> Hi,
>>
>> this is the quite expanded second version of the support series for
>> the
>> Allwinner H616 SoC.
>> Besides many fixes for the bugs discovered by the diligent reviewers
>> (many thanks for that!) this version adds some patches to support
>> some
>> slightly changed devices, like the second EMAC and the AXP not having
>> an interrupt.
>> Also I added quite some DT binding doc patches.
>> USB still does not work, but I include the patches anyway, hoping
>> that
>> someone can help spotting the issue.
>> For a more detailed changelog see below.
>>
>> Thanks!
>> Andre
>>
>> ==
>> This series gathers patches to support the Allwinner H616 SoC. This
>> is
>> a rather uninspired SoC (Quad-A53 with the usual peripherals), but
>> allows for some cheap development boards and TV boxes, and supports
>> up to 4GB of DRAM.
>>
>> Various DT binding patches are sprinkled throughout the series, to
>> add
>> the new compatible names right before they are used.
>> Patch 1/21 is the usual drive-by fix, discovered while staring at
>> the H6 clock code.
>> Patch 3 and 4 add pinctrl support, with the "-R" controller now being
>> crippled down to two I2C pins only. If we grow tired of repeating
>> this
>> exercise for every new SoC variant, I am happy to revive my more
>> versatile sunxi pinctrl driver effort from a few years back [1].
>> Patch 6 and 7 add clock support. For the -R clock this is shared with
>> the H6 code, as the clocks are identical, with the H616 just having
>> fewer of them. The main clocks are different enough to warrant a
>> separate
>> file.
>> Patch 08/21 is pulling a patch from Yangtao's A100 series, since we
>> need
>> the same fix for MMC support. This will probably be merged
>> separately,
>> I just include this here to provide a bootable solution.
>> Patch 10 teaches the AXP MFD driver to get along without having an
>> interrupt, as the H616 apparently does not have an NMI controller
>> anymore.
>> Patch 12 and 13 add some tweaks to the syscon and EMAC driver, to
>> deal
>> with the second EMAC clock used for the second Ethernet controller.
>> Patches 14 and 15 *try* to add USB support. The same approach works
>> with
>> the very similar U-Boot PHY driver, but for some reason still fail in
>> Linux. Maybe someone spots the issue and can help fixing it?
> 
> There is currently a check for PHY type A83T/H6. You may need to
> add H616 there.

You mean for setting PHY_CTL_VBUSVLDEXT and clearing PHY_CTL_SIDDQ? This
is guarded by .type == sun50i_h6_phy, which I also use for the H616.
So this should be covered.

> Or should we have a bool in the data struct to mark it? I think all
> chips that need this are 28nm.

Yeah, the usage of .type seems somewhat arbitrary, at the end of the day
it's just another quirk for a group of SoCs.

Thanks!
Andre
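
To make the two options concrete, here is a rough standalone C sketch
comparing the existing ".type" comparison with a dedicated flag in the
config struct. All names here (enum phy_type and its members, the
phyctl_siddq field, both helper functions) are invented for illustration
and are not the driver's identifiers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical PHY generations, standing in for the driver's type enum. */
enum phy_type { PHY_A10, PHY_A83T, PHY_H3, PHY_H6 };

struct phy_cfg {
	enum phy_type type;
	bool phyctl_siddq;	/* hypothetical quirk flag */
};

/* Current style: the quirk is implied by the PHY type. */
static bool needs_siddq_by_type(const struct phy_cfg *cfg)
{
	return cfg->type == PHY_A83T || cfg->type == PHY_H6;
}

/* Alternative: every config states the quirk explicitly. */
static bool needs_siddq_by_flag(const struct phy_cfg *cfg)
{
	return cfg->phyctl_siddq;
}
```

Both helpers give the same answer as long as every A83T/H6-type config
sets the flag; the flag variant merely stops tying the quirk to a
particular type value.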

> 
>>
>> The remaining patches add DT bindings, which just add the new
>> compatible
>> string along with an existing name as a fallback string.
>> Eventually we get the .dtsi for the SoC in patch 19, and the .dts for
>> the OrangePi Zero2 board[2] in the last patch.
>>
>> We have U-Boot and Trusted-Firmware support in a working state,
>> booting
>> via FEL and even TFTPing kernels work already [3][4].
>>
>> Many thanks to Jernej for his tremendous help on this, also for the
>> awesome input and help from the #linux-sunxi Freenode channel.
>>
>> The whole series can also be found here:
>> https://github.com/apritzel/linux/commits/h616-v2
>>
>> Happy reviewing!
>>
>> Cheers,
>> Andre
>>
>> [1] 
>> https://patchwork.ozlabs.org/project/linux-gpio/cover/20171113012523.2328-1-andre.przyw...@arm.com/
>> [2] https://linux-sunxi.org/Xunlong_Orange_Pi_Zero2
>> [3] https://github.com/jernejsk/u-boot/commits/h616-v1
>> [4] https://github.com/apritzel/arm-trusted-firmware/commits/h616-WIP
>>
>> Changelog v1 .. v2:
>> - pinctrl: adjust irq bank map to cover undocumented GPIO bank IRQs
>> - use differing h_i2s0 pin output names
>> - r-ccu: fix number of used clocks
>> - ccu: remove PLL-PERIPHy(4X)
>> - ccu: fix gpu1 divider range
>> - ccu: fix usb-phy3 parent
>> - ccu: add missing TV clocks
>> - ccu: rework to CLK_OF_DECLARE style
>> - ccu: enable output bit for PLL clocks
>> - ccu: renumber clocks
>> - .dtsi: drop sun50i-a64-system-control fallback
>> - .dtsi: drop unknown SRAM regions
>> - .dtsi: add more (undocumented) GPIO interrupts
>> - .dtsi: fix I2C3 pin names
>> - .dtsi: use a100-emmc fallback for MMC2
>> - .dtsi: add second EMAC controller
>> - .dtsi: use H3 MUSB controller fallback
>> - .dtsi: fix frame size for USB PHY PMU registers
>> - .dtsi: add USB0 PHY references
>> - .dtsi: fix IR controller clock source
>> - .dts: fix LED naming and swap pins
>> - .dts: use 5V supply parent for USB supply
>> - .dts: drop dummy IRQ for AXP
>> - .dts: enable 3V3 header pin power rail
>> - .dts: add SPI flash node
>> - .dts: make USB-C port per

Re: [linux-sunxi] [PATCH 5/8] clk: sunxi-ng: Add support for the Allwinner H616 CCU

2020-12-10 Thread André Przywara
On 10/12/2020 13:31, Icenowy Zheng wrote:
> On Wed, 2020-12-02 at 13:54 +, Andre Przywara wrote:
>> While the clocks are fairly similar to the H6, many differ in tiny
>> details, so a separate clock driver seems indicated.
>>
>> Derived from the H6 clock driver, and adjusted according to the
>> manual.
>>
>> Signed-off-by: Andre Przywara 
>> ---
>>  drivers/clk/sunxi-ng/Kconfig|7 +-
>>  drivers/clk/sunxi-ng/Makefile   |1 +
>>  drivers/clk/sunxi-ng/ccu-sun50i-h616.c  | 1134
>> +++
>>  drivers/clk/sunxi-ng/ccu-sun50i-h616.h  |   58 +
>>  include/dt-bindings/clock/sun50i-h616-ccu.h |  110 ++
>>  include/dt-bindings/reset/sun50i-h616-ccu.h |   67 ++
>>  6 files changed, 1376 insertions(+), 1 deletion(-)
>>  create mode 100644 drivers/clk/sunxi-ng/ccu-sun50i-h616.c
>>  create mode 100644 drivers/clk/sunxi-ng/ccu-sun50i-h616.h
>>  create mode 100644 include/dt-bindings/clock/sun50i-h616-ccu.h
>>  create mode 100644 include/dt-bindings/reset/sun50i-h616-ccu.h
>>
>> diff --git a/drivers/clk/sunxi-ng/Kconfig b/drivers/clk/sunxi-
>> ng/Kconfig
>> index ce5f5847d5d3..cd46d8853876 100644
>> --- a/drivers/clk/sunxi-ng/Kconfig
>> +++ b/drivers/clk/sunxi-ng/Kconfig
>> @@ -32,8 +32,13 @@ config SUN50I_H6_CCU
>>  default ARM64 && ARCH_SUNXI
>>  depends on (ARM64 && ARCH_SUNXI) || COMPILE_TEST
>>  
>> +config SUN50I_H616_CCU
>> +bool "Support for the Allwinner H616 CCU"
>> +default ARM64 && ARCH_SUNXI
>> +depends on (ARM64 && ARCH_SUNXI) || COMPILE_TEST
>> +
>>  config SUN50I_H6_R_CCU
>> -bool "Support for the Allwinner H6 PRCM CCU"
>> +bool "Support for the Allwinner H6 and H616 PRCM CCU"
>>  default ARM64 && ARCH_SUNXI
>>  depends on (ARM64 && ARCH_SUNXI) || COMPILE_TEST
>>  
>> diff --git a/drivers/clk/sunxi-ng/Makefile b/drivers/clk/sunxi-
>> ng/Makefile
>> index 3eb5cff40eac..96c324306d97 100644
>> --- a/drivers/clk/sunxi-ng/Makefile
>> +++ b/drivers/clk/sunxi-ng/Makefile
>> @@ -26,6 +26,7 @@ obj-$(CONFIG_SUN50I_A64_CCU)   += ccu-sun50i-
>> a64.o
>>  obj-$(CONFIG_SUN50I_A100_CCU)   += ccu-sun50i-a100.o
>>  obj-$(CONFIG_SUN50I_A100_R_CCU) += ccu-sun50i-a100-r.o
>>  obj-$(CONFIG_SUN50I_H6_CCU) += ccu-sun50i-h6.o
>> +obj-$(CONFIG_SUN50I_H616_CCU)   += ccu-sun50i-h616.o
>>  obj-$(CONFIG_SUN50I_H6_R_CCU)   += ccu-sun50i-h6-r.o
>>  obj-$(CONFIG_SUN4I_A10_CCU) += ccu-sun4i-a10.o
>>  obj-$(CONFIG_SUN5I_CCU) += ccu-sun5i.o
>> diff --git a/drivers/clk/sunxi-ng/ccu-sun50i-h616.c
>> b/drivers/clk/sunxi-ng/ccu-sun50i-h616.c
>> new file mode 100644
>> index ..3fbb258f0354
>> --- /dev/null
>> +++ b/drivers/clk/sunxi-ng/ccu-sun50i-h616.c
>> @@ -0,0 +1,1134 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Copyright (c) 2020 Arm Ltd.
>> + * Based on the H6 CCU driver, which is:
>> + *   Copyright (c) 2017 Icenowy Zheng 
>> + */
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +
>> +#include "ccu_common.h"
>> +#include "ccu_reset.h"
>> +
>> +#include "ccu_div.h"
>> +#include "ccu_gate.h"
>> +#include "ccu_mp.h"
>> +#include "ccu_mult.h"
>> +#include "ccu_nk.h"
>> +#include "ccu_nkm.h"
>> +#include "ccu_nkmp.h"
>> +#include "ccu_nm.h"
>> +
>> +#include "ccu-sun50i-h616.h"
>> +
>> +/*
>> + * The CPU PLL is actually NP clock, with P being /1, /2 or /4.
>> However
>> + * P should only be used for output frequencies lower than 288 MHz.
>> + *
>> + * For now we can just model it as a multiplier clock, and force P
>> to /1.
>> + *
>> + * The M factor is present in the register's description, but not in
>> the
>> + * frequency formula, and it's documented as "M is only used for
>> backdoor
>> + * testing", so it's not modelled and then force to 0.
>> + */
>> +#define SUN50I_H616_PLL_CPUX_REG0x000
>> +static struct ccu_mult pll_cpux_clk = {
>> +.enable = BIT(31),
>> +.lock   = BIT(28),
>> +.mult   = _SUNXI_CCU_MULT_MIN(8, 8, 12),
>> +.common = {
>> +.reg= 0x000,
>> +.hw.init= CLK_HW_INIT("pll-cpux", "osc24M",
>> +  &ccu_mult_ops,
>> +  CLK_SET_RATE_UNGATE),
>> +},
>> +};
>> +
>> +/* Some PLLs are input * N / div1 / P. Model them as NKMP with no K
>> */
>> +#define SUN50I_H616_PLL_DDR0_REG0x010
>> +static struct ccu_nkmp pll_ddr0_clk = {
>> +.enable = BIT(31),
>> +.lock   = BIT(28),
>> +.n  = _SUNXI_CCU_MULT_MIN(8, 8, 12),
>> +.m  = _SUNXI_CCU_DIV(1, 1), /* input divider */
>> +.p  = _SUNXI_CCU_DIV(0, 1), /* output divider */
>> +.common = {
>> +.reg= 0x010,
>> +.hw.init= CLK_HW_INIT("pll-ddr0", "osc24M",
>> +  &ccu_nkmp_ops,
>> +  CLK_SET_RATE_UNGATE),
>> +},
>> +};
>> +
>> +#define SUN50I_H616_PLL_DDR1_

Re: [linux-sunxi] [PATCH 5/8] clk: sunxi-ng: Add support for the Allwinner H616 CCU

2020-12-09 Thread André Przywara
On 09/12/2020 22:20, Jernej Škrabec wrote:
> On Wednesday, 09 December 2020 at 22:35:51 CET, André Przywara wrote:
>> On 09/12/2020 14:33, Clément Péron wrote:
>>
>> Hi,
>>
>>> I try to review this, and compare against the vendor Kernel.
>>>
>>> On Wed, 2 Dec 2020 at 14:54, Andre Przywara wrote:
>>>> While the clocks are fairly similar to the H6, many differ in tiny
>>>> details, so a separate clock driver seems indicated.
>>>>
>>>> Derived from the H6 clock driver, and adjusted according to the manual.
>>>>
>>>> Signed-off-by: Andre Przywara 
>>>> ---
>>>>
>>>>  drivers/clk/sunxi-ng/Kconfig|7 +-
>>>>  drivers/clk/sunxi-ng/Makefile   |1 +
>>>>  drivers/clk/sunxi-ng/ccu-sun50i-h616.c  | 1134 +++
>>>>  drivers/clk/sunxi-ng/ccu-sun50i-h616.h  |   58 +
>>>>  include/dt-bindings/clock/sun50i-h616-ccu.h |  110 ++
>>>>  include/dt-bindings/reset/sun50i-h616-ccu.h |   67 ++
>>>>  6 files changed, 1376 insertions(+), 1 deletion(-)
>>>>  create mode 100644 drivers/clk/sunxi-ng/ccu-sun50i-h616.c
>>>>  create mode 100644 drivers/clk/sunxi-ng/ccu-sun50i-h616.h
>>>>  create mode 100644 include/dt-bindings/clock/sun50i-h616-ccu.h
>>>>  create mode 100644 include/dt-bindings/reset/sun50i-h616-ccu.h
>>>>
>>>> diff --git a/drivers/clk/sunxi-ng/Kconfig b/drivers/clk/sunxi-ng/Kconfig
>>>> index ce5f5847d5d3..cd46d8853876 100644
>>>> --- a/drivers/clk/sunxi-ng/Kconfig
>>>> +++ b/drivers/clk/sunxi-ng/Kconfig
>>>> @@ -32,8 +32,13 @@ config SUN50I_H6_CCU
>>>>
>>>> default ARM64 && ARCH_SUNXI
>>>> depends on (ARM64 && ARCH_SUNXI) || COMPILE_TEST
>>>>
>>>> +config SUN50I_H616_CCU
>>>> +   bool "Support for the Allwinner H616 CCU"
>>>> +   default ARM64 && ARCH_SUNXI
>>>> +   depends on (ARM64 && ARCH_SUNXI) || COMPILE_TEST
>>>> +
>>>>
>>>>  config SUN50I_H6_R_CCU
>>>>
>>>> -   bool "Support for the Allwinner H6 PRCM CCU"
>>>> +   bool "Support for the Allwinner H6 and H616 PRCM CCU"
>>>>
>>>> default ARM64 && ARCH_SUNXI
>>>> depends on (ARM64 && ARCH_SUNXI) || COMPILE_TEST
>>>>
>>>> diff --git a/drivers/clk/sunxi-ng/Makefile
>>>> b/drivers/clk/sunxi-ng/Makefile
>>>> index 3eb5cff40eac..96c324306d97 100644
>>>> --- a/drivers/clk/sunxi-ng/Makefile
>>>> +++ b/drivers/clk/sunxi-ng/Makefile
>>>> @@ -26,6 +26,7 @@ obj-$(CONFIG_SUN50I_A64_CCU)  += ccu-sun50i-a64.o
>>>>
>>>>  obj-$(CONFIG_SUN50I_A100_CCU)  += ccu-sun50i-a100.o
>>>>  obj-$(CONFIG_SUN50I_A100_R_CCU)+= ccu-sun50i-a100-r.o
>>>>  obj-$(CONFIG_SUN50I_H6_CCU)+= ccu-sun50i-h6.o
>>>>
>>>> +obj-$(CONFIG_SUN50I_H616_CCU)  += ccu-sun50i-h616.o
>>>>
>>>>  obj-$(CONFIG_SUN50I_H6_R_CCU)  += ccu-sun50i-h6-r.o
>>>>  obj-$(CONFIG_SUN4I_A10_CCU)+= ccu-sun4i-a10.o
>>>>  obj-$(CONFIG_SUN5I_CCU)+= ccu-sun5i.o
>>>>
>>>> diff --git a/drivers/clk/sunxi-ng/ccu-sun50i-h616.c
>>>> b/drivers/clk/sunxi-ng/ccu-sun50i-h616.c new file mode 100644
>>>> index ..3fbb258f0354
>>>> --- /dev/null
>>>> +++ b/drivers/clk/sunxi-ng/ccu-sun50i-h616.c
>>>> @@ -0,0 +1,1134 @@
>>>> +// SPDX-License-Identifier: GPL-2.0
>>>> +/*
>>>> + * Copyright (c) 2020 Arm Ltd.
>>>> + * Based on the H6 CCU driver, which is:
>>>> + *   Copyright (c) 2017 Icenowy Zheng 
>>>> + */
>>>> +
>>>> +#include 
>>>> +#include 
>>>> +#include 
>>>> +#include 
>>>> +
>>>> +#include "ccu_common.h"
>>>> +#include "ccu_reset.h"
>>>> +
>>>> +#include "ccu_div.h"
>>>> +#include "ccu_gate.h"
>>>> +#include "ccu_mp.h"
>>>> +#include "ccu_mult.h"
>>>> +#include "ccu_nk.h"
>>>> +#include "ccu_nkm.h"
>>>> +#include "ccu_nkmp.h"
>>>> +#include "ccu_nm.h"
>>>> +
>>>> +#include "ccu

Re: [linux-sunxi] [PATCH 5/8] clk: sunxi-ng: Add support for the Allwinner H616 CCU

2020-12-09 Thread André Przywara
On 09/12/2020 14:33, Clément Péron wrote:

Hi,

> I try to review this, and compare against the vendor Kernel.
>
> On Wed, 2 Dec 2020 at 14:54, Andre Przywara wrote:
>>
>> While the clocks are fairly similar to the H6, many differ in tiny
>> details, so a separate clock driver seems indicated.
>>
>> Derived from the H6 clock driver, and adjusted according to the manual.
>>
>> Signed-off-by: Andre Przywara 
>> ---
>>  drivers/clk/sunxi-ng/Kconfig|7 +-
>>  drivers/clk/sunxi-ng/Makefile   |1 +
>>  drivers/clk/sunxi-ng/ccu-sun50i-h616.c  | 1134 +++
>>  drivers/clk/sunxi-ng/ccu-sun50i-h616.h  |   58 +
>>  include/dt-bindings/clock/sun50i-h616-ccu.h |  110 ++
>>  include/dt-bindings/reset/sun50i-h616-ccu.h |   67 ++
>>  6 files changed, 1376 insertions(+), 1 deletion(-)
>>  create mode 100644 drivers/clk/sunxi-ng/ccu-sun50i-h616.c
>>  create mode 100644 drivers/clk/sunxi-ng/ccu-sun50i-h616.h
>>  create mode 100644 include/dt-bindings/clock/sun50i-h616-ccu.h
>>  create mode 100644 include/dt-bindings/reset/sun50i-h616-ccu.h
>>
>> diff --git a/drivers/clk/sunxi-ng/Kconfig b/drivers/clk/sunxi-ng/Kconfig
>> index ce5f5847d5d3..cd46d8853876 100644
>> --- a/drivers/clk/sunxi-ng/Kconfig
>> +++ b/drivers/clk/sunxi-ng/Kconfig
>> @@ -32,8 +32,13 @@ config SUN50I_H6_CCU
>> default ARM64 && ARCH_SUNXI
>> depends on (ARM64 && ARCH_SUNXI) || COMPILE_TEST
>>
>> +config SUN50I_H616_CCU
>> +   bool "Support for the Allwinner H616 CCU"
>> +   default ARM64 && ARCH_SUNXI
>> +   depends on (ARM64 && ARCH_SUNXI) || COMPILE_TEST
>> +
>>  config SUN50I_H6_R_CCU
>> -   bool "Support for the Allwinner H6 PRCM CCU"
>> +   bool "Support for the Allwinner H6 and H616 PRCM CCU"
>> default ARM64 && ARCH_SUNXI
>> depends on (ARM64 && ARCH_SUNXI) || COMPILE_TEST
>>
>> diff --git a/drivers/clk/sunxi-ng/Makefile b/drivers/clk/sunxi-ng/Makefile
>> index 3eb5cff40eac..96c324306d97 100644
>> --- a/drivers/clk/sunxi-ng/Makefile
>> +++ b/drivers/clk/sunxi-ng/Makefile
>> @@ -26,6 +26,7 @@ obj-$(CONFIG_SUN50I_A64_CCU)  += ccu-sun50i-a64.o
>>  obj-$(CONFIG_SUN50I_A100_CCU)  += ccu-sun50i-a100.o
>>  obj-$(CONFIG_SUN50I_A100_R_CCU)+= ccu-sun50i-a100-r.o
>>  obj-$(CONFIG_SUN50I_H6_CCU)+= ccu-sun50i-h6.o
>> +obj-$(CONFIG_SUN50I_H616_CCU)  += ccu-sun50i-h616.o
>>  obj-$(CONFIG_SUN50I_H6_R_CCU)  += ccu-sun50i-h6-r.o
>>  obj-$(CONFIG_SUN4I_A10_CCU)+= ccu-sun4i-a10.o
>>  obj-$(CONFIG_SUN5I_CCU)+= ccu-sun5i.o
>> diff --git a/drivers/clk/sunxi-ng/ccu-sun50i-h616.c 
>> b/drivers/clk/sunxi-ng/ccu-sun50i-h616.c
>> new file mode 100644
>> index ..3fbb258f0354
>> --- /dev/null
>> +++ b/drivers/clk/sunxi-ng/ccu-sun50i-h616.c
>> @@ -0,0 +1,1134 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Copyright (c) 2020 Arm Ltd.
>> + * Based on the H6 CCU driver, which is:
>> + *   Copyright (c) 2017 Icenowy Zheng 
>> + */
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +
>> +#include "ccu_common.h"
>> +#include "ccu_reset.h"
>> +
>> +#include "ccu_div.h"
>> +#include "ccu_gate.h"
>> +#include "ccu_mp.h"
>> +#include "ccu_mult.h"
>> +#include "ccu_nk.h"
>> +#include "ccu_nkm.h"
>> +#include "ccu_nkmp.h"
>> +#include "ccu_nm.h"
>> +
>> +#include "ccu-sun50i-h616.h"
>> +
>> +/*
>> + * The CPU PLL is actually NP clock, with P being /1, /2 or /4. However
>> + * P should only be used for output frequencies lower than 288 MHz.
>> + *
>> + * For now we can just model it as a multiplier clock, and force P to /1.
>> + *
>> + * The M factor is present in the register's description, but not in the
>> + * frequency formula, and it's documented as "M is only used for backdoor
>> + * testing", so it's not modelled and then force to 0.
>> + */
>> +#define SUN50I_H616_PLL_CPUX_REG   0x000
>> +static struct ccu_mult pll_cpux_clk = {
>> +   .enable = BIT(31),
>> +   .lock   = BIT(28),
>> +   .mult   = _SUNXI_CCU_MULT_MIN(8, 8, 12),
>> +   .common = {
>> +   .reg= 0x000,
>> +   .hw.init= CLK_HW_INIT("pll-cpux", "osc24M",
>> + &ccu_mult_ops,
>> + CLK_SET_RATE_UNGATE),
>> +   },
>> +};
>> +
>> +/* Some PLLs are input * N / div1 / P. Model them as NKMP with no K */
>> +#define SUN50I_H616_PLL_DDR0_REG   0x010
>> +static struct ccu_nkmp pll_ddr0_clk = {
>> +   .enable = BIT(31),
>> +   .lock   = BIT(28),
>> +   .n  = _SUNXI_CCU_MULT_MIN(8, 8, 12),
>> +   .m  = _SUNXI_CCU_DIV(1, 1), /* input divider */
>> +   .p  = _SUNXI_CCU_DIV(0, 1), /* output divider */
>> +   .common = {
>> +   .reg= 0x010,
>> +   .hw.init= CLK_HW_INIT("pll-ddr0", "osc24M",
>> + &ccu_nkmp_ops,
>> +

Re: [linux-sunxi] Re: [PATCH 7/8] arm64: dts: allwinner: Add Allwinner H616 .dtsi file

2020-12-07 Thread André Przywara
On 03/12/2020 16:20, Chen-Yu Tsai wrote:

Hi,

> On Thu, Dec 3, 2020 at 11:45 PM André Przywara  wrote:
>>
>> On 03/12/2020 15:02, Chen-Yu Tsai wrote:
>>> On Thu, Dec 3, 2020 at 6:54 PM André Przywara wrote:
>>>>
>>>> On 03/12/2020 03:16, Samuel Holland wrote:
>>>>
>>>> Hi,
>>>>
>>>>> On 12/2/20 7:54 AM, Andre Przywara wrote:
>>>>> ...
>>>>>> +soc {
>>>>>> +compatible = "simple-bus";
>>>>>> +#address-cells = <1>;
>>>>>> +#size-cells = <1>;
>>>>>> +ranges = <0x0 0x0 0x0 0x4000>;
>>>>>> +
>>>>>> +syscon: syscon@300 {
>>>>>> +compatible = "allwinner,sun50i-h616-system-control",
>>>>>> + "allwinner,sun50i-a64-system-control";
>>>>>> +reg = <0x0300 0x1000>;
>>>>>> +#address-cells = <1>;
>>>>>> +#size-cells = <1>;
>>>>>> +ranges;
>>>>>> +
>>>>>> +sram_c: sram@28000 {
>>>>>> +compatible = "mmio-sram";
>>>>>> +reg = <0x00028000 0x3>;
>>>>>> +#address-cells = <1>;
>>>>>> +#size-cells = <1>;
>>>>>> +ranges = <0 0x00028000 0x3>;
>>>>>> +};
>>>>>> +
>>>>>> +sram_c1: sram@1a0 {
>>>>>> +compatible = "mmio-sram";
>>>>>> +reg = <0x01a0 0x20>;
>>>>>> +#address-cells = <1>;
>>>>>> +#size-cells = <1>;
>>>>>> +ranges = <0 0x01a0 0x20>;
>>>>>> +
>>>>>> +ve_sram: sram-section@0 {
>>>>>> +compatible = 
>>>>>> "allwinner,sun50i-h616-sram-c1",
>>>>>> + 
>>>>>> "allwinner,sun4i-a10-sram-c1";
>>>>>> +reg = <0x00 0x20>;
>>>>>> +};
>>>>>> +};
>>>>>> +};
>>>>>
>>>>> You mentioned that you could not find a SRAM A2. How were these SRAM 
>>>>> ranges
>>>>> verified? If you can load eGON.BT0 larger than 32 KiB, then presumably 
>>>>> NBROM
>>>>> uses SRAM C, and it is in the manual, but I see no mention of SRAM C1.
>>>>
>>>> The manual says that SRAM C *can* be used by "the system", at boot time,
>>>> as long as it's configured correctly. I couldn't find any details on how
>>>> to switch clock sources for SRAM C, and the manual stanza on this is
>>>> quite gibberish. I presume it's configured either by BROM or by reset
>>>> default this way. I think the idea is that the later users (VE, DE) take
>>>> ownership at some point (which means we can't run any firmware in there).
>>>> The BSP boot0 is 48KB already, so reaching into SRAM C, and the code
>>>> itself heavily uses SRAM C (found by hacking boot0 to drop to FEL and
>>>> inspecting the memory afterwards).
>>>>
>>>> For C1: I copied this name from the H6 .dtsi, the manual calls this
>>>> "VE-SRAM", in both manuals, and the description looks identical there
>>>> for both SoCs. I think this will be later used by the video engine, so I
>>>> kept it in. The large size made me suspicious, and from former
>>>> experiments it looks like being aliased to (parts of) SRAM C.
>>>
>>> I would just call it sram_ve or ve_sram. SRAM C1 would make more sense if
>>> it were part of SRAM C, not the other way around.
>>
>> But isn't that what we do? "sram_c1" is just the node name alias used
>> fo

Re: [linux-sunxi] [PATCH 2/8] pinctrl: sunxi: Add support for the Allwinner H616 pin controller

2020-12-06 Thread André Przywara
On 07/12/2020 01:07, André Przywara wrote:
> On 06/12/2020 16:01, Icenowy Zheng wrote:
> 
> Hi,
> 



>>>>>> + SUNXI_FUNCTION_IRQ_BANK(0x6, 4, 16)), /*
>>> PI_EINT16 */
>>>>>> +};
>>>>>> +static const unsigned int h616_irq_bank_map[] = { 2, 5, 6, 7, 8 };
>>>>>
>>>>> The BSP driver seems to have more than 5 IRQ Banks.
>>>>>
>>>>> static const unsigned sun50iw9p1_irq_bank_base[] = {
>>>>> SUNXI_PIO_BANK_BASE(PA_BASE, 0),
>>>>> SUNXI_PIO_BANK_BASE(PC_BASE, 1),
>>>>> SUNXI_PIO_BANK_BASE(PD_BASE, 2),
>>>>> SUNXI_PIO_BANK_BASE(PE_BASE, 3),
>>>>> SUNXI_PIO_BANK_BASE(PF_BASE, 4),
>>>>> SUNXI_PIO_BANK_BASE(PG_BASE, 5),
>>>>> SUNXI_PIO_BANK_BASE(PH_BASE, 6),
>>>>> SUNXI_PIO_BANK_BASE(PI_BASE, 7),
>>>>> };
>>>>>
>>>>> So maybe it should be something like this:
>>>>> static const unsigned int h616_irq_bank_map[] = { 0, 2, 3, 4, 5, 6,
>>> 7, 8 };
>>>>
>>>> While that's true, I don't see a need for IRQ bank on port A - this
>>> port is 
>>>> internal (not exposed on pins) and none of the functionality on that
>>> port 
>>>> needs IRQ.
>>>
>>> I agree here, since port A isn't even mentioned in the manual (neither
>>
>> I think if we ignore it we have the risk of DT binding issues
>> when we need to add it afterwards.
> 
> You have a point, but which interrupt shall I assign in the .dtsi?

Ah, of course they mention it in their -pinctrl.dtsi...

> And as Jernej mentioned, there is little sense in having those pins as
> interrupt sources, since we cannot use them as GPIOs in a useful way. We
> could bitbang I2C, but I don't see much sense in doing this.
> 
> And to be honest: that issue is a shortcoming of our binding. By moving
> this simple array into the DT we could avoid this problem entirely.
> 
>>> is PortD or PortE),
> 
> I had a look at PortD and PortE in the BSP: they describe LCD, LVDS and
> CSI, mostly, all interfaces which the chip does not support anymore.
> Even if the peripherals are still in, there is no use for having those
> signals internally. And there are surely no pads connected to them
> (there are simply no balls left on the package, according to the datasheet).
> 
> So my theory is that those peripherals are just left in because it was
> too much trouble to remove them (and it doesn't hurt having them in), or
> there is another package variant which exposes those pins.
> 
> So I would lean to not expose those ports (PD, PE) and their interrupts
> (for PA, PD, PE).
> 
> Opinions?

So with those numbers from their .dtsi I can offer to use the array as
Clément showed above, and adjust the indices in the pin arrays above.
Then have the IRQ numbers as shown in the BSP in our .dtsi.
But I would not have SUNXI_FUNCTION_IRQ_BANK statements for those
unknown ports, in fact no mentions of PortD and PortE at all.
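
To make the indexing concrete, here is a minimal standalone sketch of how such a map would be consumed. It is not the driver code: the struct and the helper hw_irq_bank() are made-up stand-ins for the irq_bank_map/irq_banks fields, and only the array contents come from the suggestion above.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-in for the relevant fields of sunxi_pinctrl_desc. */
struct pinctrl_desc_sketch {
	const unsigned int *irq_bank_map;	/* logical index -> hardware bank, or NULL */
	unsigned int irq_banks;			/* number of logical IRQ banks */
};

/* Map as suggested in the review: PA (0), then PC..PI (2..8); PB is skipped. */
static const unsigned int h616_irq_bank_map_sketch[] = { 0, 2, 3, 4, 5, 6, 7, 8 };

static const struct pinctrl_desc_sketch h616_desc_sketch = {
	.irq_bank_map = h616_irq_bank_map_sketch,
	/* Derived from the array, so the two cannot get out of sync. */
	.irq_banks = ARRAY_SIZE(h616_irq_bank_map_sketch),
};

/* Translate a logical IRQ bank index into the hardware bank number. */
static unsigned int hw_irq_bank(const struct pinctrl_desc_sketch *desc,
				unsigned int bank)
{
	assert(bank < desc->irq_banks);
	if (!desc->irq_bank_map)
		return bank;	/* identity mapping when no map is given */
	return desc->irq_bank_map[bank];
}
```

With this map PortC moves from logical index 0 to index 1, which is why the indices in the pin arrays would have to be adjusted.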

Does that sound acceptable?

Cheers,
Andre

> 
>> I would refrain from listing it here prematurely.
>>> Plus we actually don't know their interrupt numbers: the manual only
>>> mentions GPIOE on top of the already listed interrupts.
>>>
>>> The interrupts work by their index, so skipping ports is not an issue.
>>> I
>>> just tested the PIO interrupt on PortC, and it works.
>>>
>>> Cheers,
>>> Andre
>>>
>>>>>
>>>>>> +
>>>>>> +static const struct sunxi_pinctrl_desc h616_pinctrl_data = {
>>>>>> +   .pins = h616_pins,
>>>>>> +   .npins = ARRAY_SIZE(h616_pins),
>>>>>> +   .irq_banks = 5,
>>>>>
>>>>>  .irq_banks = ARRAY_SIZE(h616_irq_bank_map) would be better, no?
>>>>>
>>>>>> +   .irq_bank_map = h616_irq_bank_map,
>>>>>> +   .irq_read_needs_mux = true,
>>>>>> +   .io_bias_cfg_variant = BIAS_VOLTAGE_PIO_POW_MODE_SEL,
>>>>>> +};
>>>>>> +
>>>>>> +static int h616_pinctrl_probe(struct platform_device *pdev)
>>>>>> +{
>>>>>> +   return sunxi_pinctrl_init(pdev,
>>>>>> + &h616_pinctrl_data);
>>>>>> +}
>>>>>> +
>>>>>> +static const struct of_device_id h616_pinctrl_match[] = {
>>>>>> +   { .compatible = "allwinner,sun50i-h616-pinctrl", },
>>>>>
>>>>> This is a new compatible and should be documented.
>>>>>
>>>>> Regards,
>>>>> Clement
>>>>>
>>>>>> +   {}
>>>>>> +};
>>>>>> +
>>>>>> +static struct platform_driver h616_pinctrl_driver = {
>>>>>> +   .probe  = h616_pinctrl_probe,
>>>>>> +   .driver = {
>>>>>> +   .name   = "sun50i-h616-pinctrl",
>>>>>> +   .of_match_table = h616_pinctrl_match,
>>>>>> +   },
>>>>>> +};
>>>>>> +builtin_platform_driver(h616_pinctrl_driver);
>>>>>> --
>>>>>> 2.17.5
>>>>>>
>>>>>> --
>>
> 



Re: [RESEND PATCH 13/19] phy: sun4i-usb: add support for A100 USB PHY

2020-12-06 Thread André Przywara
On 10/11/2020 06:40, Frank Lee wrote:

Hi,


> From: Yangtao Li 
> 
> Add support for the A100's USB PHY, which has 2 PHYs.
> 
> Signed-off-by: Yangtao Li 
> ---
>  drivers/phy/allwinner/phy-sun4i-usb.c | 19 +++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/drivers/phy/allwinner/phy-sun4i-usb.c 
> b/drivers/phy/allwinner/phy-sun4i-usb.c
> index a6900495baa5..1a0e403131e7 100644
> --- a/drivers/phy/allwinner/phy-sun4i-usb.c
> +++ b/drivers/phy/allwinner/phy-sun4i-usb.c
> @@ -107,6 +107,7 @@ enum sun4i_usb_phy_type {
>   sun8i_r40_phy,
>   sun8i_v3s_phy,
>   sun50i_a64_phy,
> + sun50i_a100_phy,
>   sun50i_h6_phy,
>  };
>  
> @@ -289,7 +290,13 @@ static int sun4i_usb_phy_init(struct phy *_phy)
>   }
>  
>   if (data->cfg->type == sun8i_a83t_phy ||
> + data->cfg->type == sun50i_a100_phy ||
>   data->cfg->type == sun50i_h6_phy) {
> + if (phy->pmu && data->cfg->enable_pmu_unk1) {
> + val = readl(phy->pmu + REG_PMU_UNK1);
> + writel(val & ~BIT(3), phy->pmu + REG_PMU_UNK1);
> + }
> +

So having a closer look, this does not look right. We should not use
this very same variable (enable_pmu_unk1) for a different bit.

So what about changing "bool enable_pmu_unk1;" into "u32
pmu_phy_tune_mask;", and using this to mask bits in this PMU register,
regardless of the PHY type (above this "if" statement)? We just check it
for being 0 and skip the R/M/W sequence in that case.
Then the newer SoCs get .pmu_phy_tune_mask = BIT(1) in their config
below, and the A100 gets BIT(3). Older PHYs just omit this line altogether,
are initialised to 0, and are skipped.

That would look cleaner and might even be a bit more future-proof.
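
Roughly what I have in mind, as a standalone sketch rather than a patch: the struct is a trimmed-down, hypothetical stand-in for sun4i_usb_phy_cfg, and a plain variable simulates the PMU register, where the driver would use readl()/writel() on phy->pmu + REG_PMU_UNK1.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))

/* Hypothetical, trimmed-down stand-in for struct sun4i_usb_phy_cfg. */
struct phy_cfg_sketch {
	uint32_t pmu_phy_tune_mask;	/* bits to clear in the PMU register; 0 = skip */
};

static const struct phy_cfg_sketch h6_cfg_sketch   = { .pmu_phy_tune_mask = BIT(1) };
static const struct phy_cfg_sketch a100_cfg_sketch = { .pmu_phy_tune_mask = BIT(3) };
static const struct phy_cfg_sketch old_cfg_sketch  = { 0 };	/* older PHYs: left at 0 */

/*
 * Read-modify-write on a simulated PMU register: clear the masked bits,
 * or leave the register untouched when the mask is 0.
 */
static void pmu_phy_tune(const struct phy_cfg_sketch *cfg, uint32_t *pmu_reg)
{
	assert(cfg && pmu_reg);
	if (!cfg->pmu_phy_tune_mask)
		return;
	*pmu_reg &= ~cfg->pmu_phy_tune_mask;
}
```

The PHY type check then no longer decides which bit is touched; the per-SoC config does.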

Cheers,
Andre

>   if (phy->index == 0) {
>   val = readl(data->base + data->cfg->phyctl_offset);
>   val |= PHY_CTL_VBUSVLDEXT;
> @@ -339,6 +346,7 @@ static int sun4i_usb_phy_exit(struct phy *_phy)
>  
>   if (phy->index == 0) {
>   if (data->cfg->type == sun8i_a83t_phy ||
> + data->cfg->type == sun50i_a100_phy ||
>   data->cfg->type == sun50i_h6_phy) {
>   void __iomem *phyctl = data->base +
>   data->cfg->phyctl_offset;
> @@ -960,6 +968,16 @@ static const struct sun4i_usb_phy_cfg sun50i_a64_cfg = {
>   .phy0_dual_route = true,
>  };
>  
> +static const struct sun4i_usb_phy_cfg sun50i_a100_cfg = {
> + .num_phys = 2,
> + .type = sun50i_a100_phy,
> + .disc_thresh = 3,
> + .phyctl_offset = REG_PHYCTL_A33,
> + .dedicated_clocks = true,
> + .enable_pmu_unk1 = true,
> + .phy0_dual_route = true,
> +};
> +
>  static const struct sun4i_usb_phy_cfg sun50i_h6_cfg = {
>   .num_phys = 4,
>   .type = sun50i_h6_phy,
> @@ -983,6 +1001,7 @@ static const struct of_device_id 
> sun4i_usb_phy_of_match[] = {
>   { .compatible = "allwinner,sun8i-v3s-usb-phy", .data = &sun8i_v3s_cfg },
>   { .compatible = "allwinner,sun50i-a64-usb-phy",
> .data = &sun50i_a64_cfg},
> + { .compatible = "allwinner,sun50i-a100-usb-phy", .data = 
> &sun50i_a100_cfg },
>   { .compatible = "allwinner,sun50i-h6-usb-phy", .data = &sun50i_h6_cfg },
>   { },
>  };
> 



Re: [linux-sunxi] [PATCH 2/8] pinctrl: sunxi: Add support for the Allwinner H616 pin controller

2020-12-06 Thread André Przywara
On 06/12/2020 16:01, Icenowy Zheng wrote:

Hi,

> On 6 December 2020 at 10:52:17 PM GMT+08:00, "André Przywara" wrote:
>> On 06/12/2020 12:42, Jernej Škrabec wrote:
>>
>> Hi,
>>
>>> On Sunday, 06 December 2020 at 13:32:49 CET, Clément Péron wrote:
>>>> Hi Andre,
>>>>
>>>> On Wed, 2 Dec 2020 at 14:54, Andre Przywara 
>> wrote:
>>>>> Port A is used for an internal connection to some analogue
>> circuitry
>>>>> which looks like an AC200 IP (as in the H6), though this is not
>>>>> mentioned in the manual.
>>>>>
>>>>> Signed-off-by: Andre Przywara 
>>>>> ---
>>>>>
>>>>>  drivers/pinctrl/sunxi/Kconfig   |   5 +
>>>>>  drivers/pinctrl/sunxi/Makefile  |   1 +
>>>>>  drivers/pinctrl/sunxi/pinctrl-sun50i-h616.c | 549
>> 
>>>>>  3 files changed, 555 insertions(+)
>>>>>  create mode 100644 drivers/pinctrl/sunxi/pinctrl-sun50i-h616.c
>>>>>
>>>>> diff --git a/drivers/pinctrl/sunxi/Kconfig
>> b/drivers/pinctrl/sunxi/Kconfig
>>>>> index 593293584ecc..73e88ce71a48 100644
>>>>> --- a/drivers/pinctrl/sunxi/Kconfig
>>>>> +++ b/drivers/pinctrl/sunxi/Kconfig
>>>>> @@ -119,4 +119,9 @@ config PINCTRL_SUN50I_H6_R
>>>>>
>>>>> default ARM64 && ARCH_SUNXI
>>>>> select PINCTRL_SUNXI
>>>>>
>>>>> +config PINCTRL_SUN50I_H616
>>>>> +   bool "Support for the Allwinner H616 PIO"
>>>>> +   default ARM64 && ARCH_SUNXI
>>>>> +   select PINCTRL_SUNXI
>>>>> +
>>>>>
>>>>>  endif
>>>>>
>>>>> diff --git a/drivers/pinctrl/sunxi/Makefile
>>>>> b/drivers/pinctrl/sunxi/Makefile index 8b7ff0dc3bdf..5359327a3c8f
>> 100644
>>>>> --- a/drivers/pinctrl/sunxi/Makefile
>>>>> +++ b/drivers/pinctrl/sunxi/Makefile
>>>>> @@ -23,5 +23,6 @@ obj-$(CONFIG_PINCTRL_SUN8I_V3S)   +=
>>>>> pinctrl-sun8i-v3s.o> 
>>>>>  obj-$(CONFIG_PINCTRL_SUN50I_H5)+=
>> pinctrl-sun50i-h5.o
>>>>>  obj-$(CONFIG_PINCTRL_SUN50I_H6)+=
>> pinctrl-sun50i-h6.o
>>>>>  obj-$(CONFIG_PINCTRL_SUN50I_H6_R)  += pinctrl-sun50i-h6-r.o
>>>>>
>>>>> +obj-$(CONFIG_PINCTRL_SUN50I_H616)  += pinctrl-sun50i-h616.o
>>>>>
>>>>>  obj-$(CONFIG_PINCTRL_SUN9I_A80)+=
>> pinctrl-sun9i-a80.o
>>>>>  obj-$(CONFIG_PINCTRL_SUN9I_A80_R)  += pinctrl-sun9i-a80-r.o
>>>>>
>>>>> diff --git a/drivers/pinctrl/sunxi/pinctrl-sun50i-h616.c
>>>>> b/drivers/pinctrl/sunxi/pinctrl-sun50i-h616.c new file mode 100644
>>>>> index ..734f63eb08dd
>>>>> --- /dev/null
>>>>> +++ b/drivers/pinctrl/sunxi/pinctrl-sun50i-h616.c
>>>>> @@ -0,0 +1,549 @@
>>>>> +// SPDX-License-Identifier: GPL-2.0
>>>>> +/*
>>>>> + * Allwinner H616 SoC pinctrl driver.
>>>>> + *
>>>>> + * Copyright (C) 2020 Arm Ltd.
>>>>> + * based on the H6 pinctrl driver
>>>>> + *   Copyright (C) 2017 Icenowy Zheng 
>>>>> + */
>>>>> +
>>>>> +#include 
>>>>> +#include 
>>>>> +#include 
>>>>> +#include 
>>>>> +#include 
>>>>> +
>>>>> +#include "pinctrl-sunxi.h"
>>>>> +
>>>>> +static const struct sunxi_desc_pin h616_pins[] = {
>>>>> +   /* Internal connection to the AC200 part */
>>>>> +   SUNXI_PIN(SUNXI_PINCTRL_PIN(A, 0),
>>>>> +   SUNXI_FUNCTION(0x2, "emac1")),  /* ERXD1 */
>>>>> +   SUNXI_PIN(SUNXI_PINCTRL_PIN(A, 1),
>>>>> +   SUNXI_FUNCTION(0x2, "emac1")),  /* ERXD0 */
>>>>> +   SUNXI_PIN(SUNXI_PINCTRL_PIN(A, 2),
>>>>> +   SUNXI_FUNCTION(0x2, "emac1")),  /* ECRS_DV
>> */
>>>>> +   SUNXI_PIN(SUNXI_PINCTRL_PIN(A, 3),
>>>>> +   SUNXI_FUNCTION(0x2, "emac1")),  /* ERXERR
>> */
>>>>> +   SUNXI_PIN(SUNXI_PINCTRL_PIN(A, 4),
>>>>

Re: [linux-sunxi] [PATCH 2/8] pinctrl: sunxi: Add support for the Allwinner H616 pin controller

2020-12-06 Thread André Przywara
On 06/12/2020 12:42, Jernej Škrabec wrote:

Hi,

> On Sunday, 06 December 2020 at 13:32:49 CET, Clément Péron wrote:
>> Hi Andre,
>>
>> On Wed, 2 Dec 2020 at 14:54, Andre Przywara  wrote:
>>> Port A is used for an internal connection to some analogue circuitry
>>> which looks like an AC200 IP (as in the H6), though this is not
>>> mentioned in the manual.
>>>
>>> Signed-off-by: Andre Przywara 
>>> ---
>>>
>>>  drivers/pinctrl/sunxi/Kconfig   |   5 +
>>>  drivers/pinctrl/sunxi/Makefile  |   1 +
>>>  drivers/pinctrl/sunxi/pinctrl-sun50i-h616.c | 549 
>>>  3 files changed, 555 insertions(+)
>>>  create mode 100644 drivers/pinctrl/sunxi/pinctrl-sun50i-h616.c
>>>
>>> diff --git a/drivers/pinctrl/sunxi/Kconfig b/drivers/pinctrl/sunxi/Kconfig
>>> index 593293584ecc..73e88ce71a48 100644
>>> --- a/drivers/pinctrl/sunxi/Kconfig
>>> +++ b/drivers/pinctrl/sunxi/Kconfig
>>> @@ -119,4 +119,9 @@ config PINCTRL_SUN50I_H6_R
>>>
>>> default ARM64 && ARCH_SUNXI
>>> select PINCTRL_SUNXI
>>>
>>> +config PINCTRL_SUN50I_H616
>>> +   bool "Support for the Allwinner H616 PIO"
>>> +   default ARM64 && ARCH_SUNXI
>>> +   select PINCTRL_SUNXI
>>> +
>>>
>>>  endif
>>>
>>> diff --git a/drivers/pinctrl/sunxi/Makefile
>>> b/drivers/pinctrl/sunxi/Makefile index 8b7ff0dc3bdf..5359327a3c8f 100644
>>> --- a/drivers/pinctrl/sunxi/Makefile
>>> +++ b/drivers/pinctrl/sunxi/Makefile
>>> @@ -23,5 +23,6 @@ obj-$(CONFIG_PINCTRL_SUN8I_V3S)   +=
>>> pinctrl-sun8i-v3s.o> 
>>>  obj-$(CONFIG_PINCTRL_SUN50I_H5)+= pinctrl-sun50i-h5.o
>>>  obj-$(CONFIG_PINCTRL_SUN50I_H6)+= pinctrl-sun50i-h6.o
>>>  obj-$(CONFIG_PINCTRL_SUN50I_H6_R)  += pinctrl-sun50i-h6-r.o
>>>
>>> +obj-$(CONFIG_PINCTRL_SUN50I_H616)  += pinctrl-sun50i-h616.o
>>>
>>>  obj-$(CONFIG_PINCTRL_SUN9I_A80)+= pinctrl-sun9i-a80.o
>>>  obj-$(CONFIG_PINCTRL_SUN9I_A80_R)  += pinctrl-sun9i-a80-r.o
>>>
>>> diff --git a/drivers/pinctrl/sunxi/pinctrl-sun50i-h616.c
>>> b/drivers/pinctrl/sunxi/pinctrl-sun50i-h616.c new file mode 100644
>>> index ..734f63eb08dd
>>> --- /dev/null
>>> +++ b/drivers/pinctrl/sunxi/pinctrl-sun50i-h616.c
>>> @@ -0,0 +1,549 @@
>>> +// SPDX-License-Identifier: GPL-2.0
>>> +/*
>>> + * Allwinner H616 SoC pinctrl driver.
>>> + *
>>> + * Copyright (C) 2020 Arm Ltd.
>>> + * based on the H6 pinctrl driver
>>> + *   Copyright (C) 2017 Icenowy Zheng 
>>> + */
>>> +
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +
>>> +#include "pinctrl-sunxi.h"
>>> +
>>> +static const struct sunxi_desc_pin h616_pins[] = {
>>> +   /* Internal connection to the AC200 part */
>>> +   SUNXI_PIN(SUNXI_PINCTRL_PIN(A, 0),
>>> +   SUNXI_FUNCTION(0x2, "emac1")),  /* ERXD1 */
>>> +   SUNXI_PIN(SUNXI_PINCTRL_PIN(A, 1),
>>> +   SUNXI_FUNCTION(0x2, "emac1")),  /* ERXD0 */
>>> +   SUNXI_PIN(SUNXI_PINCTRL_PIN(A, 2),
>>> +   SUNXI_FUNCTION(0x2, "emac1")),  /* ECRS_DV */
>>> +   SUNXI_PIN(SUNXI_PINCTRL_PIN(A, 3),
>>> +   SUNXI_FUNCTION(0x2, "emac1")),  /* ERXERR */
>>> +   SUNXI_PIN(SUNXI_PINCTRL_PIN(A, 4),
>>> +   SUNXI_FUNCTION(0x2, "emac1")),  /* ETXD1 */
>>> +   SUNXI_PIN(SUNXI_PINCTRL_PIN(A, 5),
>>> +   SUNXI_FUNCTION(0x2, "emac1")),  /* ETXD0 */
>>> +   SUNXI_PIN(SUNXI_PINCTRL_PIN(A, 6),
>>> +   SUNXI_FUNCTION(0x2, "emac1")),  /* ETXCK */
>>> +   SUNXI_PIN(SUNXI_PINCTRL_PIN(A, 7),
>>> +   SUNXI_FUNCTION(0x2, "emac1")),  /* ETXEN */
>>> +   SUNXI_PIN(SUNXI_PINCTRL_PIN(A, 8),
>>> +   SUNXI_FUNCTION(0x2, "emac1")),  /* EMDC */
>>> +   SUNXI_PIN(SUNXI_PINCTRL_PIN(A, 9),
>>> +   SUNXI_FUNCTION(0x2, "emac1")),  /* EMDIO */
>>> +   SUNXI_PIN(SUNXI_PINCTRL_PIN(A, 10),
>>> +   SUNXI_FUNCTION(0x2, "i2c3")),   /* SCK */
>>> +   SUNXI_PIN(SUNXI_PINCTRL_PIN(A, 11),
>>> +   SUNXI_FUNCTION(0x2, "i2c3")),   /* SDA */
>>> +   SUNXI_PIN(SUNXI_PINCTRL_PIN(A, 12),
>>> +   SUNXI_FUNCTION(0x2, "pwm5")),
>>> +   /* Hole */
>>> +   SUNXI_PIN(SUNXI_PINCTRL_PIN(C, 0),
>>> + SUNXI_FUNCTION(0x0, "gpio_in"),
>>> + SUNXI_FUNCTION(0x1, "gpio_out"),
>>> + SUNXI_FUNCTION(0x2, "nand0"), /* WE */
>>> + SUNXI_FUNCTION(0x3, "mmc2"),  /* DS */
>>> + SUNXI_FUNCTION(0x4, "spi0"),  /* CLK */
>>> + SUNXI_FUNCTION_IRQ_BANK(0x6, 0, 0)),  /* PC_EINT0 */
>>> +   SUNXI_PIN(SUNXI_PINCTRL_PIN(C, 1),
>>> + SUNXI_FUNCTION(0x0, "gpio_in"),
>>> + SUNXI_FUNCTION(0x1, "gpio_out"),
>>> + SUNXI_FUNCTION(0x2, "nand0"), /* ALE */
>>> + SUNXI_FUNCTION(0x3, "mmc2"),  /* RST */

Re: [linux-sunxi] Re: [PATCH 7/8] arm64: dts: allwinner: Add Allwinner H616 .dtsi file

2020-12-03 Thread André Przywara
On 03/12/2020 15:02, Chen-Yu Tsai wrote:
> On Thu, Dec 3, 2020 at 6:54 PM André Przywara  wrote:
>>
>> On 03/12/2020 03:16, Samuel Holland wrote:
>>
>> Hi,
>>
>>> On 12/2/20 7:54 AM, Andre Przywara wrote:
>>> ...
>>>> +soc {
>>>> +compatible = "simple-bus";
>>>> +#address-cells = <1>;
>>>> +#size-cells = <1>;
>>>> +ranges = <0x0 0x0 0x0 0x4000>;
>>>> +
>>>> +syscon: syscon@300 {
>>>> +compatible = "allwinner,sun50i-h616-system-control",
>>>> + "allwinner,sun50i-a64-system-control";
>>>> +reg = <0x0300 0x1000>;
>>>> +#address-cells = <1>;
>>>> +#size-cells = <1>;
>>>> +ranges;
>>>> +
>>>> +sram_c: sram@28000 {
>>>> +compatible = "mmio-sram";
>>>> +reg = <0x00028000 0x3>;
>>>> +#address-cells = <1>;
>>>> +#size-cells = <1>;
>>>> +ranges = <0 0x00028000 0x3>;
>>>> +};
>>>> +
>>>> +sram_c1: sram@1a0 {
>>>> +compatible = "mmio-sram";
>>>> +reg = <0x01a0 0x20>;
>>>> +#address-cells = <1>;
>>>> +#size-cells = <1>;
>>>> +ranges = <0 0x01a0 0x20>;
>>>> +
>>>> +ve_sram: sram-section@0 {
>>>> +compatible = 
>>>> "allwinner,sun50i-h616-sram-c1",
>>>> + 
>>>> "allwinner,sun4i-a10-sram-c1";
>>>> +reg = <0x00 0x20>;
>>>> +};
>>>> +};
>>>> +};
>>>
>>> You mentioned that you could not find a SRAM A2. How were these SRAM ranges
>>> verified? If you can load eGON.BT0 larger than 32 KiB, then presumably NBROM
>>> uses SRAM C, and it is in the manual, but I see no mention of SRAM C1.
>>
>> The manual says that SRAM C *can* be used by "the system", at boot time,
>> as long as it's configured correctly. I couldn't find any details on how
>> to switch clock sources for SRAM C, and the manual stanza on this is
>> quite gibberish. I presume it's configured either by BROM or by reset
>> default this way. I think the idea is that the later users (VE, DE) take
>> ownership at some point (which means we can't run any firmware in there).
>> The BSP boot0 is 48KB already, so reaching into SRAM C, and the code
>> itself heavily uses SRAM C (found by hacking boot0 to drop to FEL and
>> inspecting the memory afterwards).
>>
>> For C1: I copied this name from the H6 .dtsi, the manual calls this
>> "VE-SRAM", in both manuals, and the description looks identical there
>> for both SoCs. I think this will be later used by the video engine, so I
>> kept it in. The large size made me suspicious, and from former
>> experiments it looks like being aliased to (parts of) SRAM C.
> 
> I would just call it sram_ve or ve_sram. SRAM C1 would make more sense if
> it were part of SRAM C, not the other way around.

But isn't that what we do? "sram_c1" is just the node name alias used
for the parent node. That is actually never referenced anywhere (in any
of the H6 .dts), so we can actually remove it, I guess.
The actual SRAM section is called ve_sram already.
And I can't change the compatible name, for the fallback, at least.

I can make the new compatible string read
"allwinner,sun50i-h616-ve-sram", if that helps, but that would mean
deviating from the H6 and other SoCs.

Cheers,
Andre


> 
> Also the sram-section node would make more sense if it were in sram_c, as
> that is the part that gets switched around, not the full region @ 1a0.
> 
> ChenYu
> 
>> Maybe some guys with more VE knowledge can shed some light on this?
>>
>> Cheers,
>> Andre
>>



Re: [PATCH 4/8] clk: sunxi-ng: Add support for the Allwinner H616 R-CCU

2020-12-03 Thread André Przywara
On 02/12/2020 14:31, Icenowy Zheng wrote:

Hi,

> On 2 December 2020 at 9:54:05 PM GMT+08:00, Andre Przywara wrote:
>> The clocks themselves are identical to the H6 R-CCU; it's just that the
>> H616 does not have all of them implemented (or connected).
> 
> For selective clocks, try to follow the practice of the V3(s) driver?

Not sure what you mean, isn't that what I do? Having a separate
sunxi_ccu_desc for each SoC and referencing separate structs? At least
that's what I see in ccu-sun8i-v3s.c.

What am I missing?

Cheers,
Andre

>>
>> Signed-off-by: Andre Przywara 
>> ---
>> drivers/clk/sunxi-ng/ccu-sun50i-h6-r.c | 47 +-
>> drivers/clk/sunxi-ng/ccu-sun50i-h6-r.h |  3 +-
>> 2 files changed, 48 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/clk/sunxi-ng/ccu-sun50i-h6-r.c
>> b/drivers/clk/sunxi-ng/ccu-sun50i-h6-r.c
>> index 50f8d1bc7046..119d1797f501 100644
>> --- a/drivers/clk/sunxi-ng/ccu-sun50i-h6-r.c
>> +++ b/drivers/clk/sunxi-ng/ccu-sun50i-h6-r.c
>> @@ -136,6 +136,15 @@ static struct ccu_common *sun50i_h6_r_ccu_clks[] =
>> {
>>  &w1_clk.common,
>> };
>>
>> +static struct ccu_common *sun50i_h616_r_ccu_clks[] = {
>> +&r_apb1_clk.common,
>> +&r_apb2_clk.common,
>> +&r_apb1_twd_clk.common,
>> +&r_apb2_i2c_clk.common,
>> +&r_apb1_ir_clk.common,
>> +&ir_clk.common,
>> +};
>> +
>> static struct clk_hw_onecell_data sun50i_h6_r_hw_clks = {
>>  .hws= {
>>  [CLK_AR100] = &ar100_clk.common.hw,
>> @@ -152,7 +161,20 @@ static struct clk_hw_onecell_data
>> sun50i_h6_r_hw_clks = {
>>  [CLK_IR]= &ir_clk.common.hw,
>>  [CLK_W1]= &w1_clk.common.hw,
>>  },
>> -.num= CLK_NUMBER,
>> +.num= CLK_NUMBER_H616,
>> +};
>> +
>> +static struct clk_hw_onecell_data sun50i_h616_r_hw_clks = {
>> +.hws= {
>> +[CLK_R_AHB] = &r_ahb_clk.hw,
>> +[CLK_R_APB1]= &r_apb1_clk.common.hw,
>> +[CLK_R_APB2]= &r_apb2_clk.common.hw,
>> +[CLK_R_APB1_TWD]= &r_apb1_twd_clk.common.hw,
>> +[CLK_R_APB2_I2C]= &r_apb2_i2c_clk.common.hw,
>> +[CLK_R_APB1_IR] = &r_apb1_ir_clk.common.hw,
>> +[CLK_IR]= &ir_clk.common.hw,
>> +},
>> +.num= CLK_NUMBER_H616,
>> };
>>
>> static struct ccu_reset_map sun50i_h6_r_ccu_resets[] = {
>> @@ -165,6 +187,12 @@ static struct ccu_reset_map
>> sun50i_h6_r_ccu_resets[] = {
>>  [RST_R_APB1_W1] =  { 0x1ec, BIT(16) },
>> };
>>
>> +static struct ccu_reset_map sun50i_h616_r_ccu_resets[] = {
>> +[RST_R_APB1_TWD]=  { 0x12c, BIT(16) },
>> +[RST_R_APB2_I2C]=  { 0x19c, BIT(16) },
>> +[RST_R_APB1_IR] =  { 0x1cc, BIT(16) },
>> +};
>> +
>> static const struct sunxi_ccu_desc sun50i_h6_r_ccu_desc = {
>>  .ccu_clks   = sun50i_h6_r_ccu_clks,
>>  .num_ccu_clks   = ARRAY_SIZE(sun50i_h6_r_ccu_clks),
>> @@ -175,6 +203,16 @@ static const struct sunxi_ccu_desc
>> sun50i_h6_r_ccu_desc = {
>>  .num_resets = ARRAY_SIZE(sun50i_h6_r_ccu_resets),
>> };
>>
>> +static const struct sunxi_ccu_desc sun50i_h616_r_ccu_desc = {
>> +.ccu_clks   = sun50i_h616_r_ccu_clks,
>> +.num_ccu_clks   = ARRAY_SIZE(sun50i_h616_r_ccu_clks),
>> +
>> +.hw_clks= &sun50i_h616_r_hw_clks,
>> +
>> +.resets = sun50i_h616_r_ccu_resets,
>> +.num_resets = ARRAY_SIZE(sun50i_h616_r_ccu_resets),
>> +};
>> +
>> static void __init sunxi_r_ccu_init(struct device_node *node,
>>  const struct sunxi_ccu_desc *desc)
>> {
>> @@ -195,3 +233,10 @@ static void __init sun50i_h6_r_ccu_setup(struct
>> device_node *node)
>> }
>> CLK_OF_DECLARE(sun50i_h6_r_ccu, "allwinner,sun50i-h6-r-ccu",
>> sun50i_h6_r_ccu_setup);
>> +
>> +static void __init sun50i_h616_r_ccu_setup(struct device_node *node)
>> +{
>> +sunxi_r_ccu_init(node, &sun50i_h616_r_ccu_desc);
>> +}
>> +CLK_OF_DECLARE(sun50i_h616_r_ccu, "allwinner,sun50i-h616-r-ccu",
>> +   sun50i_h616_r_ccu_setup);
>> diff --git a/drivers/clk/sunxi-ng/ccu-sun50i-h6-r.h
>> b/drivers/clk/sunxi-ng/ccu-sun50i-h6-r.h
>> index 782117dc0b28..128302696ca1 100644
>> --- a/drivers/clk/sunxi-ng/ccu-sun50i-h6-r.h
>> +++ b/drivers/clk/sunxi-ng/ccu-sun50i-h6-r.h
>> @@ -14,6 +14,7 @@
>>
>> #define CLK_R_APB2   3
>>
>> -#define CLK_NUMBER  (CLK_W1 + 1)
>> +#define CLK_NUMBER_H6   (CLK_W1 + 1)
>> +#define CLK_NUMBER_H616 (CLK_IR + 1)
>>
>> #endif /* _CCU_SUN50I_H6_R_H */



Re: [PATCH 4/8] clk: sunxi-ng: Add support for the Allwinner H616 R-CCU

2020-12-03 Thread André Przywara
On 02/12/2020 18:20, Jernej Škrabec wrote:

Hi,

> On Wednesday, 02 December 2020 at 14:54:05 CET, Andre Przywara wrote:
>> The clocks themselves are identical to the H6 R-CCU; it's just that the H616
>> does not have all of them implemented (or connected).
>>
>> Signed-off-by: Andre Przywara 
>> ---
>>  drivers/clk/sunxi-ng/ccu-sun50i-h6-r.c | 47 +-
>>  drivers/clk/sunxi-ng/ccu-sun50i-h6-r.h |  3 +-
>>  2 files changed, 48 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/clk/sunxi-ng/ccu-sun50i-h6-r.c 
>> b/drivers/clk/sunxi-ng/ccu-sun50i-h6-r.c
>> index 50f8d1bc7046..119d1797f501 100644
>> --- a/drivers/clk/sunxi-ng/ccu-sun50i-h6-r.c
>> +++ b/drivers/clk/sunxi-ng/ccu-sun50i-h6-r.c
>> @@ -136,6 +136,15 @@ static struct ccu_common *sun50i_h6_r_ccu_clks[] = {
>>  &w1_clk.common,
>>  };
>>  
>> +static struct ccu_common *sun50i_h616_r_ccu_clks[] = {
>> +&r_apb1_clk.common,
>> +&r_apb2_clk.common,
>> +&r_apb1_twd_clk.common,
>> +&r_apb2_i2c_clk.common,
>> +&r_apb1_ir_clk.common,
>> +&ir_clk.common,
>> +};
>> +
>>  static struct clk_hw_onecell_data sun50i_h6_r_hw_clks = {
>>  .hws= {
>>  [CLK_AR100] = &ar100_clk.common.hw,
>> @@ -152,7 +161,20 @@ static struct clk_hw_onecell_data sun50i_h6_r_hw_clks = 
>> {
>>  [CLK_IR]= &ir_clk.common.hw,
>>  [CLK_W1]= &w1_clk.common.hw,
>>  },
>> -.num= CLK_NUMBER,
>> +.num= CLK_NUMBER_H616,
> 
> Above macro should be CLK_NUMBER_H6.

Ouch, well spotted!

>> +};
>> +
>> +static struct clk_hw_onecell_data sun50i_h616_r_hw_clks = {
>> +.hws= {
>> +[CLK_R_AHB] = &r_ahb_clk.hw,
>> +[CLK_R_APB1]= &r_apb1_clk.common.hw,
>> +[CLK_R_APB2]= &r_apb2_clk.common.hw,
>> +[CLK_R_APB1_TWD]= &r_apb1_twd_clk.common.hw,
> 
> Do we know if TWD exists? I tested I2C and IR. What is your source for these 
> clocks?

I poked around in this area, writing all Fs into those clock registers
and checking which bits stuck. There are actually even more clocks still
around there, though not all of the H6 ones (no PWM, for instance).

For TWD: it's mentioned in the manual, the clock register is there and
can be written to. The normal watchdog registers do not appear to work
at 0x7020800 (they are RAZ/WI), but there is a counter at 0x7020820,
which increments as long as those TWD clock gates are open. It stops
when the clock bits are cleared.
So that tells me that this clock is there and is working.
I didn't have time yet to check the other (former) peripherals from that
area.
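
In case the method is unclear, here is a small simulation of that kind of probing. Everything in it is made up for illustration (struct sim_reg, sim_write(), sim_read(), probe_sticky_bits()); on the hardware this would be plain MMIO writes and reads, and it assumes that unimplemented bits ignore writes and read as zero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a register in which only some bits are implemented. */
struct sim_reg {
	uint32_t implemented;	/* bits backed by hardware */
	uint32_t value;		/* current contents */
};

static void sim_write(struct sim_reg *r, uint32_t v)
{
	r->value = v & r->implemented;	/* unimplemented bits ignore writes */
}

static uint32_t sim_read(const struct sim_reg *r)
{
	return r->value;	/* unimplemented bits read as zero */
}

/* Write all ones, read back which bits stuck, then restore the old value. */
static uint32_t probe_sticky_bits(struct sim_reg *r)
{
	uint32_t saved, stuck;

	assert(r);
	saved = sim_read(r);
	sim_write(r, UINT32_C(0xffffffff));
	stuck = sim_read(r);
	sim_write(r, saved);
	return stuck;
}
```

A register that returns 0 after this is most likely not implemented; any non-zero result shows which bits exist, though not what they do.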

Cheers,
Andre

> 
> Best regards,
> Jernej
> 
>> +[CLK_R_APB2_I2C]= &r_apb2_i2c_clk.common.hw,
>> +[CLK_R_APB1_IR] = &r_apb1_ir_clk.common.hw,
>> +[CLK_IR]= &ir_clk.common.hw,
>> +},
>> +.num= CLK_NUMBER_H616,
>>  };
>>  
>>  static struct ccu_reset_map sun50i_h6_r_ccu_resets[] = {
>> @@ -165,6 +187,12 @@ static struct ccu_reset_map sun50i_h6_r_ccu_resets[] = {
>>  [RST_R_APB1_W1] =  { 0x1ec, BIT(16) },
>>  };
>>  
>> +static struct ccu_reset_map sun50i_h616_r_ccu_resets[] = {
>> +[RST_R_APB1_TWD]=  { 0x12c, BIT(16) },
>> +[RST_R_APB2_I2C]=  { 0x19c, BIT(16) },
>> +[RST_R_APB1_IR] =  { 0x1cc, BIT(16) },
>> +};
>> +
>>  static const struct sunxi_ccu_desc sun50i_h6_r_ccu_desc = {
>>  .ccu_clks   = sun50i_h6_r_ccu_clks,
>>  .num_ccu_clks   = ARRAY_SIZE(sun50i_h6_r_ccu_clks),
>> @@ -175,6 +203,16 @@ static const struct sunxi_ccu_desc sun50i_h6_r_ccu_desc 
>> = {
>>  .num_resets = ARRAY_SIZE(sun50i_h6_r_ccu_resets),
>>  };
>>  
>> +static const struct sunxi_ccu_desc sun50i_h616_r_ccu_desc = {
>> +.ccu_clks   = sun50i_h616_r_ccu_clks,
>> +.num_ccu_clks   = ARRAY_SIZE(sun50i_h616_r_ccu_clks),
>> +
>> +.hw_clks= &sun50i_h616_r_hw_clks,
>> +
>> +.resets = sun50i_h616_r_ccu_resets,
>> +.num_resets = ARRAY_SIZE(sun50i_h616_r_ccu_resets),
>> +};
>> +
>>  static void __init sunxi_r_ccu_init(struct device_node *node,
>>  const struct sunxi_ccu_desc *desc)
>>  {
>> @@ -195,3 +233,10 @@ static void __init sun50i_h6_r_ccu_setup(struct 
>> device_node *node)
>>  }
>>  CLK_OF_DECLARE(sun50i_h6_r_ccu, "allwinner,sun50i-h6-r-ccu",
>> sun50i_h6_r_ccu_setup);
>> +
>> +static void __init sun50i_h616_r_ccu_setup(struct device_node *node)
>> +{
>> +sunxi_r_ccu_init(node, &sun50i_h616_r_ccu_desc);
>> +}
>> +CLK_OF_DECLARE(sun50i_h616_r_ccu, "allwinner,sun50i-h616-r-ccu",
>> +   sun50i_h616_r_ccu_setup);
>> diff --git a/drivers/clk/sunxi-ng/ccu-sun50i-h6-r.h 
>> b/drivers/clk/sunxi-ng/ccu-sun50i-h6-r.h
>> index 782117dc0b28..128302696ca1 100644
>> --- a/drivers/clk/sunxi-ng/ccu-sun50i-h6-r.h
>> +++ b/drivers/clk/sunxi-ng/ccu-sun50i-h6-r.h
>> @@ -14,6 +14,7 @@
>>  
>>  #de

Re: [PATCH 7/8] arm64: dts: allwinner: Add Allwinner H616 .dtsi file

2020-12-03 Thread André Przywara
On 03/12/2020 03:16, Samuel Holland wrote:

Hi,

> On 12/2/20 7:54 AM, Andre Przywara wrote:
> ...
>> +soc {
>> +compatible = "simple-bus";
>> +#address-cells = <1>;
>> +#size-cells = <1>;
>> +ranges = <0x0 0x0 0x0 0x4000>;
>> +
>> +syscon: syscon@300 {
>> +compatible = "allwinner,sun50i-h616-system-control",
>> + "allwinner,sun50i-a64-system-control";
>> +reg = <0x0300 0x1000>;
>> +#address-cells = <1>;
>> +#size-cells = <1>;
>> +ranges;
>> +
>> +sram_c: sram@28000 {
>> +compatible = "mmio-sram";
>> +reg = <0x00028000 0x3>;
>> +#address-cells = <1>;
>> +#size-cells = <1>;
>> +ranges = <0 0x00028000 0x3>;
>> +};
>> +
>> +sram_c1: sram@1a0 {
>> +compatible = "mmio-sram";
>> +reg = <0x01a0 0x20>;
>> +#address-cells = <1>;
>> +#size-cells = <1>;
>> +ranges = <0 0x01a0 0x20>;
>> +
>> +ve_sram: sram-section@0 {
>> +compatible = 
>> "allwinner,sun50i-h616-sram-c1",
>> + 
>> "allwinner,sun4i-a10-sram-c1";
>> +reg = <0x00 0x20>;
>> +};
>> +};
>> +};
> 
> You mentioned that you could not find a SRAM A2. How were these SRAM ranges
> verified? If you can load eGON.BT0 larger than 32 KiB, then presumably NBROM
> uses SRAM C, and it is in the manual, but I see no mention of SRAM C1.

The manual says that SRAM C *can* be used by "the system", at boot time,
as long as it's configured correctly. I couldn't find any details on how
to switch clock sources for SRAM C, and the manual stanza on this is
quite gibberish. I presume it's configured either by BROM or by reset
default this way. I think the idea is that the later users (VE, DE) take
ownership at some point (which means we can't run any firmware in there).
The BSP boot0 is 48KB already, so reaching into SRAM C, and the code
itself heavily uses SRAM C (found by hacking boot0 to drop to FEL and
inspecting the memory afterwards).
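
As a back-of-the-envelope check of the size argument, here is a tiny sketch. It assumes the image is loaded at 0x20000, i.e. 32 KiB below the sram@28000 node quoted above; that load address is an assumption for illustration and is not stated in this thread, and the helper name is made up.

```c
#include <assert.h>
#include <stdint.h>

#define KIB(n) ((uint32_t)(n) * 1024u)

/*
 * Number of bytes of an image loaded at load_base that land at or above
 * sram_c_base. Returns 0 when the image ends below SRAM C.
 */
static uint32_t bytes_in_sram_c(uint32_t load_base, uint32_t image_size,
				uint32_t sram_c_base)
{
	uint32_t end;

	assert(load_base <= sram_c_base);
	end = load_base + image_size;
	return end > sram_c_base ? end - sram_c_base : 0;
}
```

Under that assumption, bytes_in_sram_c(0x20000, KIB(48), 0x28000) gives 16 KiB, so the last third of a 48 KiB boot0 would sit in SRAM C.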

For C1: I copied this name from the H6 .dtsi. Both manuals call this
"VE-SRAM", and the description looks identical for both SoCs. I think
this will later be used by the video engine, so I kept it in. The large
size made me suspicious, and from former experiments it looks like it is
aliased to (parts of) SRAM C.

Maybe someone with more VE knowledge can shed some light on this?

Cheers,
Andre



Re: [PATCH 7/8] arm64: dts: allwinner: Add Allwinner H616 .dtsi file

2020-12-02 Thread André Przywara
On 02/12/2020 16:33, Jernej Škrabec wrote:

Hi,

> On Wednesday, 02 December 2020 at 14:54:08 CET, Andre Przywara wrote:
>> This (relatively) new SoC is similar to the H6, but drops the (broken)
>> PCIe support and the USB 3.0 controller. It also gets the management
>> controller removed, which in turn removes *some*, but not all of the
>> devices formerly dedicated to the ARISC (CPUS).
>> There does not seem to be an external interrupt controller anymore, so
>> no external interrupts through an NMI pin. The AXP driver needs to learn
>> living with that.
>>
>> Signed-off-by: Andre Przywara 
>> ---
>>  .../arm64/boot/dts/allwinner/sun50i-h616.dtsi | 704 ++
>>  1 file changed, 704 insertions(+)
>>  create mode 100644 arch/arm64/boot/dts/allwinner/sun50i-h616.dtsi
>>
>> diff --git a/arch/arm64/boot/dts/allwinner/sun50i-h616.dtsi 
>> b/arch/arm64/boot/dts/allwinner/sun50i-h616.dtsi
>> new file mode 100644
>> index ..dcffbfdcd26b
>> --- /dev/null
>> +++ b/arch/arm64/boot/dts/allwinner/sun50i-h616.dtsi
>> @@ -0,0 +1,704 @@
>> +// SPDX-License-Identifier: (GPL-2.0+ OR MIT)
>> +// Copyright (C) 2020 Arm Ltd.
>> +// based on the H6 dtsi, which is:
>> +//   Copyright (C) 2017 Icenowy Zheng 
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +
>> +/ {
>> +interrupt-parent = <&gic>;
>> +#address-cells = <2>;
>> +#size-cells = <2>;
>> +
>> +cpus {
>> +#address-cells = <1>;
>> +#size-cells = <0>;
>> +
>> +cpu0: cpu@0 {
>> +compatible = "arm,cortex-a53";
>> +device_type = "cpu";
>> +reg = <0>;
>> +enable-method = "psci";
>> +clocks = <&ccu CLK_CPUX>;
>> +};
>> +
>> +cpu1: cpu@1 {
>> +compatible = "arm,cortex-a53";
>> +device_type = "cpu";
>> +reg = <1>;
>> +enable-method = "psci";
>> +clocks = <&ccu CLK_CPUX>;
>> +};
>> +
>> +cpu2: cpu@2 {
>> +compatible = "arm,cortex-a53";
>> +device_type = "cpu";
>> +reg = <2>;
>> +enable-method = "psci";
>> +clocks = <&ccu CLK_CPUX>;
>> +};
>> +
>> +cpu3: cpu@3 {
>> +compatible = "arm,cortex-a53";
>> +device_type = "cpu";
>> +reg = <3>;
>> +enable-method = "psci";
>> +clocks = <&ccu CLK_CPUX>;
>> +};
>> +};
>> +
>> +reserved-memory {
>> +#address-cells = <2>;
>> +#size-cells = <2>;
>> +ranges;
>> +
>> +/* 512KiB reserved for ARM Trusted Firmware (BL31) */
>> +secmon_reserved: secmon@4000 {
>> +reg = <0x0 0x4000 0x0 0x8>;
>> +no-map;
>> +};
>> +};
>> +
>> +osc24M: osc24M_clk {
>> +#clock-cells = <0>;
>> +compatible = "fixed-clock";
>> +clock-frequency = <2400>;
>> +clock-output-names = "osc24M";
>> +};
>> +
>> +pmu {
>> +compatible = "arm,cortex-a53-pmu";
>> +interrupts = ,
>> + ,
>> + ,
>> + ;
>> +interrupt-affinity = <&cpu0>, <&cpu1>, <&cpu2>, <&cpu3>;
>> +};
>> +
>> +psci {
>> +compatible = "arm,psci-0.2";
>> +method = "smc";
>> +};
>> +
>> +timer {
>> +compatible = "arm,armv8-timer";
>> +arm,no-tick-in-suspend;
>> +interrupts = > +(GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_HIGH)>,
>> + > +(GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_HIGH)>,
>> + > +(GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_HIGH)>,
>> + > +(GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_HIGH)>;
>> +};
>> +
>> +soc {
>> +compatible = "simple-bus";
>> +#address-cells = <1>;
>> +#size-cells = <1>;
>> +ranges = <0x0 0x0 0x0 0x4000>;
>> +
>> +syscon: syscon@300 {
>> +compatible = "allwinner,sun50i-h616-system-control",
>> + "allwinner,sun50i-a64-system-control";
> 
> The H616 one is not compatible with the A64 one, because it has a second
> EMAC control register at offset 0x34, which no other supported SoC has.

But this means the H616 is a superset of the A64?
I changed the driver to extend the regmap to two registers for the H616,
is that all you need for the second emac?
How do we tell the EMAC driver which clock register to use? An extra
compatible? Or can we pass a cell to syscon? syscon = <&syscon 1>; ?
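
To picture the "syscon = <&syscon 1>" idea: the extra cell would be an
index that the EMAC driver turns into a register offset. This is only a
minimal sketch, assuming the first EMAC clock register sits at offset 0x30
(as on the A64) and the second one at 0x34 as described above.
emac_syscon_offset() and the macro names are hypothetical, not existing
kernel symbols.

```c
#include <stdint.h>

#define SYSCON_EMAC_REG_BASE 0x30u /* first EMAC clock register (assumed) */
#define SYSCON_EMAC_REG_STEP 0x04u /* one 32-bit register per EMAC */
#define H616_NUM_EMACS       2u

/* Returns the syscon offset for EMAC 'index', or -1 if out of range. */
static int emac_syscon_offset(uint32_t index)
{
	if (index >= H616_NUM_EMACS)
		return -1;

	return (int)(SYSCON_EMAC_REG_BASE + index * SYSCON_EMAC_REG_STEP);
}
```

An extra compatible string would carry the same information implicitly;
the cell variant keeps a single compatible and moves the choice into the
phandle argument.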

Re: [PATCH 5/8] clk: sunxi-ng: Add support for the Allwinner H616 CCU

2020-12-02 Thread André Przywara
On 02/12/2020 21:03, Jernej Škrabec wrote:
> On Wednesday, 02 December 2020 at 14:54:06 CET, Andre Przywara wrote:
>> While the clocks are fairly similar to the H6, many differ in tiny
>> details, so a separate clock driver seems indicated.
>>
>> Derived from the H6 clock driver, and adjusted according to the manual.
>>
>> Signed-off-by: Andre Przywara 
>> ---
>>  drivers/clk/sunxi-ng/Kconfig|7 +-
>>  drivers/clk/sunxi-ng/Makefile   |1 +
>>  drivers/clk/sunxi-ng/ccu-sun50i-h616.c  | 1134 +++
>>  drivers/clk/sunxi-ng/ccu-sun50i-h616.h  |   58 +
>>  include/dt-bindings/clock/sun50i-h616-ccu.h |  110 ++
>>  include/dt-bindings/reset/sun50i-h616-ccu.h |   67 ++
>>  6 files changed, 1376 insertions(+), 1 deletion(-)
>>  create mode 100644 drivers/clk/sunxi-ng/ccu-sun50i-h616.c
>>  create mode 100644 drivers/clk/sunxi-ng/ccu-sun50i-h616.h
>>  create mode 100644 include/dt-bindings/clock/sun50i-h616-ccu.h
>>  create mode 100644 include/dt-bindings/reset/sun50i-h616-ccu.h
>>
>> diff --git a/drivers/clk/sunxi-ng/Kconfig b/drivers/clk/sunxi-ng/Kconfig
>> index ce5f5847d5d3..cd46d8853876 100644
>> --- a/drivers/clk/sunxi-ng/Kconfig
>> +++ b/drivers/clk/sunxi-ng/Kconfig



>> +static int sun50i_h616_ccu_probe(struct platform_device *pdev)
>> +{
>> +struct resource *res;
>> +void __iomem *reg;
>> +u32 val;
>> +int i;
>> +
>> +res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>> +reg = devm_ioremap_resource(&pdev->dev, res);
>> +if (IS_ERR(reg))
>> +return PTR_ERR(reg);
>> +
>> +/* Enable the lock bits on all PLLs */
>> +for (i = 0; i < ARRAY_SIZE(pll_regs); i++) {
>> +val = readl(reg + pll_regs[i]);
>> +val |= BIT(29);
> 
> You should also add BIT(27) here. It enables PLL output (new functionality in 
> H616). Without this, only clocks which are children of PLL_CPUX and PLL_PERIPH0 
> would work (set up by U-Boot). I'm pretty sure that's not the intended usage, but 
> until we know better, it's OK IMO.

Ah right, the output enable bit. I was wondering if the A100 solution
would be better here: use bit 27 as the .enable value, and actually enable
(bit 31) all PLLs here.
Or we add another field, maybe a flag, to allow the kernel to decide
which bit to use. Clement suggested something like that on IRC.
But for now I can surely just set bit 27 here as well.
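
For reference, the two options amount to the following read-modify-write
variants. This is a sketch on plain values rather than on the MMIO
registers; the macro and function names are made up for illustration, and
the bit meanings (29 = lock enable, 27 = output enable, 31 = PLL enable)
are the ones discussed above.

```c
#include <stdint.h>

#define PLL_ENABLE        (1u << 31)
#define PLL_LOCK_ENABLE   (1u << 29)
#define PLL_OUTPUT_ENABLE (1u << 27)

/* Variant 1: set lock enable and output enable in probe. */
static uint32_t pll_probe_fixup(uint32_t val)
{
	return val | PLL_LOCK_ENABLE | PLL_OUTPUT_ENABLE;
}

/*
 * Variant 2 ("A100 solution"): enable the PLL itself and the lock bit in
 * probe, and leave bit 27 to the clock framework as the gate bit.
 */
static uint32_t pll_probe_fixup_a100_style(uint32_t val)
{
	return val | PLL_ENABLE | PLL_LOCK_ENABLE;
}
```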

>> +writel(val, reg + pll_regs[i]);
>> +}
>> +
>> +/*
>> + * Force the output divider of video PLLs to 0.
>> + *
>> + * See the comment before pll-video0 definition for the reason.
>> + */
>> +for (i = 0; i < ARRAY_SIZE(pll_video_regs); i++) {
>> +val = readl(reg + pll_video_regs[i]);
>> +val &= ~BIT(0);
>> +writel(val, reg + pll_video_regs[i]);
>> +}
>> +
>> +/*
>> + * Force OHCI 12M clock sources to 00 (12MHz divided from 48MHz)
>> + *
>> + * This clock mux is still mysterious, and the code just enforces
>> + * it to have a valid clock parent.
>> + */
>> +for (i = 0; i < ARRAY_SIZE(usb2_clk_regs); i++) {
>> +val = readl(reg + usb2_clk_regs[i]);
>> +val &= ~GENMASK(25, 24);
>> +writel (val, reg + usb2_clk_regs[i]);
>> +}
>> +
>> +/*
>> + * Force the post-divider of pll-audio to 12 and the output divider
>> + * of it to 2, so 24576000 and 22579200 rates can be set exactly.
>> + */
>> +val = readl(reg + SUN50I_H616_PLL_AUDIO_REG);
>> +val &= ~(GENMASK(21, 16) | BIT(0));
>> +writel(val | (11 << 16) | BIT(0), reg + SUN50I_H616_PLL_AUDIO_REG);
>> +
>> +/*
>> + * First clock parent (osc32K) is unusable for CEC. But since there
>> + * is no good way to force parent switch (both run with same frequency),
>> + * just set second clock parent here.
>> + */
>> +val = readl(reg + SUN50I_H616_HDMI_CEC_CLK_REG);
>> +val |= BIT(24);
>> +writel(val, reg + SUN50I_H616_HDMI_CEC_CLK_REG);
>> +
>> +return sunxi_ccu_probe(pdev->dev.of_node, reg, &sun50i_h616_ccu_desc);
>> +}
>> +
>> +static const struct of_device_id sun50i_h616_ccu_ids[] = {
>> +{ .compatible = "allwinner,sun50i-h616-ccu",
>> +.data = &sun50i_h616_ccu_desc },
>> +{ }
>> +};
>> +
>> +static struct platform_driver sun50i_h616_ccu_driver = {
>> +.probe  = sun50i_h616_ccu_probe,
>> +.driver = {
>> +.name   = "sun50i-h616-ccu",
>> +.of_match_table = sun50i_h616_ccu_ids,
>> +},
>> +};
>> +builtin_platform_driver(sun50i_h616_ccu_driver);
> 
> Please use CLK_OF_DECLARE() instead. That way clocks will be initialized 
> earlier and it will be actually possible to register both timer peripherals 
> (once DT nodes are added). If pdev or dev is ever needed, two stage 
> initialization can be made later.

Sure, will do.

Thanks for having a look!

Andre

> 
> Best regards,
> Jernej
> 
>> diff --git a/drivers/clk/sunxi-ng/ccu-sun50i-h616.h 
>> b

Re: [PATCH 8/8] arm64: dts: allwinner: Add OrangePi Zero 2 .dts

2020-12-02 Thread André Przywara
On 02/12/2020 15:57, Icenowy Zheng wrote:
> On Wednesday, 2020-12-02 at 13:54 +, Andre Przywara wrote:
>> The OrangePi Zero 2 is a development board with the new H616 SoC.
>>
>> It features the usual connectors used on those small boards, and
>> comes
>> with the AXP305, which seems to be compatible with the AXP805.
>>
>> For more details see: http://linux-sunxi.org/Xunlong_Orange_Pi_Zero2
>>
>> Signed-off-by: Andre Przywara 
>> ---
>>  arch/arm64/boot/dts/allwinner/Makefile|   1 +
>>  .../allwinner/sun50i-h616-orangepi-zero2.dts  | 228
>> ++
>>  2 files changed, 229 insertions(+)
>>  create mode 100644 arch/arm64/boot/dts/allwinner/sun50i-h616-
>> orangepi-zero2.dts
>>
>> diff --git a/arch/arm64/boot/dts/allwinner/Makefile
>> b/arch/arm64/boot/dts/allwinner/Makefile
>> index 211d1e9d4701..0cf8299b1ce7 100644
>> --- a/arch/arm64/boot/dts/allwinner/Makefile
>> +++ b/arch/arm64/boot/dts/allwinner/Makefile
>> @@ -35,3 +35,4 @@ dtb-$(CONFIG_ARCH_SUNXI) += sun50i-h6-orangepi-one-
>> plus.dtb
>>  dtb-$(CONFIG_ARCH_SUNXI) += sun50i-h6-pine-h64.dtb
>>  dtb-$(CONFIG_ARCH_SUNXI) += sun50i-h6-pine-h64-model-b.dtb
>>  dtb-$(CONFIG_ARCH_SUNXI) += sun50i-h6-tanix-tx6.dtb
>> +dtb-$(CONFIG_ARCH_SUNXI) += sun50i-h616-orangepi-zero2.dtb
>> diff --git a/arch/arm64/boot/dts/allwinner/sun50i-h616-orangepi-
>> zero2.dts b/arch/arm64/boot/dts/allwinner/sun50i-h616-orangepi-
>> zero2.dts
>> new file mode 100644
>> index ..814f5b4fec7c
>> --- /dev/null
>> +++ b/arch/arm64/boot/dts/allwinner/sun50i-h616-orangepi-zero2.dts
>> @@ -0,0 +1,228 @@
>> +// SPDX-License-Identifier: (GPL-2.0+ or MIT)
>> +/*
>> + * Copyright (C) 2020 Arm Ltd.
>> + */
>> +
>> +/dts-v1/;
>> +
>> +#include "sun50i-h616.dtsi"
>> +
>> +#include 
>> +#include 
>> +
>> +/ {
>> +model = "OrangePi Zero2";
>> +compatible = "xunlong,orangepi-zero2", "allwinner,sun50i-h616";
>> +
>> +aliases {
>> +ethernet0 = &emac0;
>> +serial0 = &uart0;
>> +};
>> +
>> +chosen {
>> +stdout-path = "serial0:115200n8";
>> +};
>> +
>> +leds {
>> +compatible = "gpio-leds";
>> +
>> +power {
>> +label = "orangepi:red:power";
>> +gpios = <&pio 2 13 GPIO_ACTIVE_HIGH>; /* PC13
>> */
>> +default-state = "on";
>> +};
>> +
>> +status {
>> +label = "orangepi:green:status";
>> +gpios = <&pio 2 12 GPIO_ACTIVE_HIGH>; /* PC12
>> */
>> +};
>> +};
>> +
>> +reg_vcc5v: vcc5v {
>> +/* board wide 5V supply directly from the USB-C socket
>> */
>> +compatible = "regulator-fixed";
>> +regulator-name = "vcc-5v";
>> +regulator-min-microvolt = <500>;
>> +regulator-max-microvolt = <500>;
>> +regulator-always-on;
>> +};
>> +
>> +reg_usb1_vbus: usb1-vbus {
>> +compatible = "regulator-fixed";
>> +regulator-name = "usb1-vbus";
>> +regulator-min-microvolt = <500>;
>> +regulator-max-microvolt = <500>;
>> +enable-active-high;
>> +gpio = <&pio 2 16 GPIO_ACTIVE_HIGH>; /* PC16 */
>> +status = "okay";
>> +};
>> +};
>> +
>> +&ehci0 {
>> +status = "okay";
>> +};
>> +
>> +&ehci1 {
>> +status = "okay";
>> +};
>> +
>> +/* USB 2 & 3 are on headers only. */
>> +
>> +&emac0 {
>> +pinctrl-names = "default";
>> +pinctrl-0 = <&ext_rgmii_pins>;
>> +phy-mode = "rgmii";
>> +phy-handle = <&ext_rgmii_phy>;
>> +phy-supply = <&reg_dcdce>;
>> +allwinner,rx-delay-ps = <3100>;
>> +allwinner,tx-delay-ps = <700>;
>> +status = "okay";
>> +};
>> +
>> +&mdio {
>> +ext_rgmii_phy: ethernet-phy@1 {
>> +compatible = "ethernet-phy-ieee802.3-c22";
>> +reg = <1>;
>> +};
>> +};
>> +
>> +&mmc0 {
>> +vmmc-supply = <&reg_dcdce>;
>> +cd-gpios = <&pio 5 6 GPIO_ACTIVE_LOW>;  /* PF6 */
>> +bus-width = <4>;
>> +status = "okay";
>> +};
>> +
>> +&ohci0 {
>> +status = "okay";
>> +};
>> +
>> +&ohci1 {
>> +status = "okay";
>> +};
>> +
>> +&r_i2c {
>> +status = "okay";
>> +
>> +axp305: pmic@36 {
>> +compatible = "x-powers,axp305", "x-powers,axp805",
>> + "x-powers,axp806";
>> +reg = <0x36>;
>> +
>> +/* dummy interrupt to appease the driver for now */
>> +interrupts = ;
>> +interrupt-controller;
>> +#interrupt-cells = <1>;
> 
> Is dummy interrupt future-proof?

No, it's just a placeholder. The whole interrupt controller story isn't
fully clear yet, the BSP DTS mentions one, but I didn't have time to
investigate this yet. There is no NMI pad anymore, but an NMI IRQ number
in this GIC table. The OPi Zero2 does not connect the AXP's IRQ pin to
anything.

I haven't checked the AXP driver yet, maybe it just needs to accept no
interrupts?
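
To make the "accept no interrupts" idea concrete, here is a heavily
simplified sketch of what such a probe step could look like. It does not
reflect the actual AXP driver code; struct pmic_dev and pmic_setup_irq()
are hypothetical stand-ins invented for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a PMIC driver's state. */
struct pmic_dev {
	int irq;          /* <= 0 means "no interrupt line described" */
	bool irq_enabled; /* set when interrupt handling was set up */
};

/* Hypothetical probe step: treat the interrupt as optional. */
static int pmic_setup_irq(struct pmic_dev *pmic)
{
	if (pmic == NULL)
		return -1;

	if (pmic->irq <= 0) {
		/* No IRQ wired up: carry on without interrupt support. */
		pmic->irq_enabled = false;
		return 0;
	}

	/* A real driver would register its IRQ chip or handler here. */
	pmic->irq_enabled = true;
	return 0;
}
```

With something along these lines the dummy interrupt property in the .dts
would no longer be needed for boards that leave the IRQ pin unconnected.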

>> +
>> +  

Re: [PATCH 7/8] arm64: dts: allwinner: Add Allwinner H616 .dtsi file

2020-12-02 Thread André Przywara
On 02/12/2020 16:03, Icenowy Zheng wrote:
> On Wednesday, 2020-12-02 at 13:54 +, Andre Przywara wrote:
>> This (relatively) new SoC is similar to the H6, but drops the
>> (broken)
>> PCIe support and the USB 3.0 controller. It also gets the management
>> controller removed, which in turn removes *some*, but not all of the
>> devices formerly dedicated to the ARISC (CPUS).
>> There does not seem to be an external interrupt controller anymore,
>> so
>> no external interrupts through an NMI pin. The AXP driver needs to
>> learn
>> living with that.
>>
>> Signed-off-by: Andre Przywara 
>> ---
>>  .../arm64/boot/dts/allwinner/sun50i-h616.dtsi | 704
>> ++
>>  1 file changed, 704 insertions(+)
>>  create mode 100644 arch/arm64/boot/dts/allwinner/sun50i-h616.dtsi
>>
>> diff --git a/arch/arm64/boot/dts/allwinner/sun50i-h616.dtsi
>> b/arch/arm64/boot/dts/allwinner/sun50i-h616.dtsi
>> new file mode 100644
>> index ..dcffbfdcd26b
>> --- /dev/null
>> +++ b/arch/arm64/boot/dts/allwinner/sun50i-h616.dtsi
>> @@ -0,0 +1,704 @@
>> +// SPDX-License-Identifier: (GPL-2.0+ OR MIT)
>> +// Copyright (C) 2020 Arm Ltd.
>> +// based on the H6 dtsi, which is:
>> +//   Copyright (C) 2017 Icenowy Zheng 
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +
>> +/ {
>> +interrupt-parent = <&gic>;
>> +#address-cells = <2>;
>> +#size-cells = <2>;
>> +
>> +cpus {
>> +#address-cells = <1>;
>> +#size-cells = <0>;
>> +
>> +cpu0: cpu@0 {
>> +compatible = "arm,cortex-a53";
>> +device_type = "cpu";
>> +reg = <0>;
>> +enable-method = "psci";
>> +clocks = <&ccu CLK_CPUX>;
>> +};
>> +
>> +cpu1: cpu@1 {
>> +compatible = "arm,cortex-a53";
>> +device_type = "cpu";
>> +reg = <1>;
>> +enable-method = "psci";
>> +clocks = <&ccu CLK_CPUX>;
>> +};
>> +
>> +cpu2: cpu@2 {
>> +compatible = "arm,cortex-a53";
>> +device_type = "cpu";
>> +reg = <2>;
>> +enable-method = "psci";
>> +clocks = <&ccu CLK_CPUX>;
>> +};
>> +
>> +cpu3: cpu@3 {
>> +compatible = "arm,cortex-a53";
>> +device_type = "cpu";
>> +reg = <3>;
>> +enable-method = "psci";
>> +clocks = <&ccu CLK_CPUX>;
>> +};
>> +};
>> +
>> +reserved-memory {
>> +#address-cells = <2>;
>> +#size-cells = <2>;
>> +ranges;
>> +
>> +/* 512KiB reserved for ARM Trusted Firmware (BL31) */
>> +secmon_reserved: secmon@4000 {
>> +reg = <0x0 0x4000 0x0 0x8>;
>> +no-map;
>> +};
> 
> Should this node be dynamically added by the firmware? This is only
> some effort taken by our community, not from Allwinner. (Although
> Allwinner reserves much more memory in their BSP.)
> 
> (In my opinion, it should be applied by ATF to U-Boot DT, and then U-
> Boot add it to Linux DT.)

Yes, that is indeed a good idea. I put this in here to get things going.
The TF-A part is rather simple (we have code for that already), I just
need to check whether U-Boot propagates this correctly.

> 
>> +};
>> +
>> +osc24M: osc24M_clk {
>> +#clock-cells = <0>;
>> +compatible = "fixed-clock";
>> +clock-frequency = <2400>;
>> +clock-output-names = "osc24M";
>> +};
>> +
>> +pmu {
>> +compatible = "arm,cortex-a53-pmu";
>> +interrupts = ,
>> + ,
>> + ,
>> + ;
>> +interrupt-affinity = <&cpu0>, <&cpu1>, <&cpu2>,
>> <&cpu3>;
>> +};
>> +
>> +psci {
>> +compatible = "arm,psci-0.2";
>> +method = "smc";
>> +};
>> +
>> +timer {
>> +compatible = "arm,armv8-timer";
>> +arm,no-tick-in-suspend;
>> +interrupts = > +(GIC_CPU_MASK_SIMPLE(4) |
>> IRQ_TYPE_LEVEL_HIGH)>,
>> + > +(GIC_CPU_MASK_SIMPLE(4) |
>> IRQ_TYPE_LEVEL_HIGH)>,
>> + > +(GIC_CPU_MASK_SIMPLE(4) |
>> IRQ_TYPE_LEVEL_HIGH)>,
>> + > +(GIC_CPU_MASK_SIMPLE(4) |
>> IRQ_TYPE_LEVEL_HIGH)>;
>> +};
>> +
>> +soc {
>> +compatible = "simple-bus";
>> +#address-cells = <1>;
>> +#size-cells = <1>;
>> +ranges = <0x0 0x0 0x0 0x4000>;
>> +
>> +syscon: syscon@300 {
>> +compatible = "allwinner,sun50i-h616-system-
>> control",
>> +   

Re: [RESEND PATCH 17/19] mmc: sunxi: add support for A100 mmc controller

2020-11-28 Thread André Przywara
On 28/11/2020 19:56, André Przywara wrote:
> On 10/11/2020 06:46, Frank Lee wrote:

Hi,

one more thing below ...

>> From: Yangtao Li 
>>
>> This patch adds support for the A100 MMC controller, which uses word
>> addresses for its internal DMA.
>>
>> Signed-off-by: Yangtao Li 
>> ---
>>  drivers/mmc/host/sunxi-mmc.c | 28 +---
>>  1 file changed, 25 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/mmc/host/sunxi-mmc.c b/drivers/mmc/host/sunxi-mmc.c
>> index fc62773602ec..1518b64112b7 100644
>> --- a/drivers/mmc/host/sunxi-mmc.c
>> +++ b/drivers/mmc/host/sunxi-mmc.c
>> @@ -244,6 +244,7 @@ struct sunxi_idma_des {
>>  
>>  struct sunxi_mmc_cfg {
>>  u32 idma_des_size_bits;
>> +u32 idma_des_shift;
>>  const struct sunxi_mmc_clk_delay *clk_delays;
>>  
>>  /* does the IP block support autocalibration? */
>> @@ -343,7 +344,7 @@ static int sunxi_mmc_init_host(struct sunxi_mmc_host 
>> *host)
>>  /* Enable CEATA support */
>>  mmc_writel(host, REG_FUNS, SDXC_CEATA_ON);
>>  /* Set DMA descriptor list base address */
>> -mmc_writel(host, REG_DLBA, host->sg_dma);
>> +mmc_writel(host, REG_DLBA, host->sg_dma >> host->cfg->idma_des_shift);
>>  
>>  rval = mmc_readl(host, REG_GCTRL);
>>  rval |= SDXC_INTERRUPT_ENABLE_BIT;
>> @@ -373,8 +374,10 @@ static void sunxi_mmc_init_idma_des(struct 
>> sunxi_mmc_host *host,
>>  
>>  next_desc += sizeof(struct sunxi_idma_des);
>>  pdes[i].buf_addr_ptr1 =
>> -cpu_to_le32(sg_dma_address(&data->sg[i]));
>> -pdes[i].buf_addr_ptr2 = cpu_to_le32((u32)next_desc);
>> +cpu_to_le32(sg_dma_address(&data->sg[i]) >>
>> +host->cfg->idma_des_shift);
>> +pdes[i].buf_addr_ptr2 = cpu_to_le32((u32)next_desc >>
>> +host->cfg->idma_des_shift);
> 
> I think you should cast after the shift, otherwise you lose the ability
> to run above 4 GB. This won't be a problem at the moment, since we still
> use the default 32-bit DMA mask, but might bite us later.
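
The difference between the two orderings can be shown in isolation. This
is a user-space sketch, not the driver code: demo_dma_addr_t is a stand-in
for the kernel's dma_addr_t and is assumed to be 64 bits wide, and both
helper names are made up for this example.

```c
#include <stdint.h>

typedef uint64_t demo_dma_addr_t; /* stand-in for dma_addr_t, assumed 64-bit */

/* Cast first: the upper address bits are dropped before shifting. */
static uint32_t desc_addr_cast_then_shift(demo_dma_addr_t addr,
					  unsigned int shift)
{
	return (uint32_t)addr >> shift;
}

/* Shift first: with shift = 2, addresses of up to 34 bits still fit. */
static uint32_t desc_addr_shift_then_cast(demo_dma_addr_t addr,
					  unsigned int shift)
{
	return (uint32_t)(addr >> shift);
}
```

Both helpers agree for addresses below 4 GB. For an address of exactly
4 GB and a shift of 2, the first one yields 0 while the second one yields
0x40000000, which is the word address the hardware would need.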
> 
> Otherwise this patch looks fine, and works on the H616 as well.
> 
> Cheers,
> Andre
> 
>>  }
>>  
>>  pdes[0].config |= cpu_to_le32(SDXC_IDMAC_DES0_FD);
>> @@ -1178,6 +1181,23 @@ static const struct sunxi_mmc_cfg sun50i_a64_emmc_cfg 
>> = {
>>  .needs_new_timings = true,
>>  };
>>  
>> +static const struct sunxi_mmc_cfg sun50i_a100_cfg = {
>> +.idma_des_size_bits = 16,
>> +.idma_des_shift = 2,
>> +.clk_delays = NULL,
>> +.can_calibrate = true,
>> +.mask_data0 = true,
>> +.needs_new_timings = true,
>> +};
>> +
>> +static const struct sunxi_mmc_cfg sun50i_a100_emmc_cfg = {
>> +.idma_des_size_bits = 13,
>> +.idma_des_shift = 2,

Is that actually true? I don't know about the A100, but the H616 manual
mentions that "SMHC2" deals with byte addresses, in contrast to the
other two. So MMC2 would be compatible with the a64_emmc_cfg?
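
To spell out what is at stake: the shift decides whether the controller is
handed a byte address or a 32-bit word address. The sketch below uses a
cut-down, hypothetical config struct (mmc_cfg_sketch) rather than the real
sunxi_mmc_cfg, and says nothing about which A100 controller actually needs
which setting.

```c
#include <stdint.h>

/* Cut-down, hypothetical version of the per-SoC MMC config. */
struct mmc_cfg_sketch {
	uint32_t idma_des_shift; /* 0: byte addresses, 2: 32-bit word addresses */
};

static const struct mmc_cfg_sketch word_addressed = { .idma_des_shift = 2 };
static const struct mmc_cfg_sketch byte_addressed = { .idma_des_shift = 0 };

/* Value that would be written to the descriptor list base address register. */
static uint32_t dlba_value(const struct mmc_cfg_sketch *cfg, uint64_t desc_addr)
{
	return (uint32_t)(desc_addr >> cfg->idma_des_shift);
}
```

If a byte-addressed controller were given the shifted value, it would
fetch its descriptors from a quarter of the intended address.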

Cheers,
Andre

>> +.clk_delays = NULL,
>> +.can_calibrate = true,
>> +.needs_new_timings = true,
>> +};
>> +
>>  static const struct of_device_id sunxi_mmc_of_match[] = {
>>  { .compatible = "allwinner,sun4i-a10-mmc", .data = &sun4i_a10_cfg },
>>  { .compatible = "allwinner,sun5i-a13-mmc", .data = &sun5i_a13_cfg },
>> @@ -1186,6 +1206,8 @@ static const struct of_device_id sunxi_mmc_of_match[] 
>> = {
>>  { .compatible = "allwinner,sun9i-a80-mmc", .data = &sun9i_a80_cfg },
>>  { .compatible = "allwinner,sun50i-a64-mmc", .data = &sun50i_a64_cfg },
>>  { .compatible = "allwinner,sun50i-a64-emmc", .data = 
>> &sun50i_a64_emmc_cfg },
>> +{ .compatible = "allwinner,sun50i-a100-mmc", .data = &sun50i_a100_cfg },
>> +{ .compatible = "allwinner,sun50i-a100-emmc", .data = 
>> &sun50i_a100_emmc_cfg },
>>  { /* sentinel */ }
>>  };
>>  MODULE_DEVICE_TABLE(of, sunxi_mmc_of_match);
>>
> 



Re: [RESEND PATCH 05/19] dmaengine: sun6i: Add support for A100 DMA

2020-11-28 Thread André Przywara
On 10/11/2020 06:28, Frank Lee wrote:

Hi,

> From: Yangtao Li 
> 
> The dma of a100 is similar to h6, with some minor changes to
> support greater addressing capabilities.

So apparently those changes are backwards compatible, right?
Why do we then need a new struct now, when it is actually identical to
the existing H6 one?

As this seems to work with the same settings as the H6, I think we
don't need any change in the driver at the moment; we can just use the H6
compatible as a fallback in the .dtsi.
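
The effect of such a fallback can be sketched as follows. This is a
simplified model of compatible matching (first supported string in the
node's list wins), not the kernel's actual matching code, and
match_compatible() is a made-up helper. The driver table deliberately has
no A100 entry, to model a driver without the new struct.

```c
#include <stddef.h>
#include <string.h>

/* Compatibles known to a driver that has no A100-specific entry. */
static const char *const driver_matches[] = {
	"allwinner,sun50i-a64-dma",
	"allwinner,sun50i-h6-dma",
};

/* Returns the first node compatible the driver supports, or NULL. */
static const char *match_compatible(const char *const *node_compat, size_t n)
{
	size_t num = sizeof(driver_matches) / sizeof(driver_matches[0]);

	for (size_t i = 0; i < n; i++)
		for (size_t j = 0; j < num; j++)
			if (strcmp(node_compat[i], driver_matches[j]) == 0)
				return node_compat[i];

	return NULL;
}
```

A node listing both the A100 and the H6 string binds via the H6 entry,
while a node listing only the A100 string finds no driver at all.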

Cheers,
Andre

P.S. I understand that Vinod already applied it, and it doesn't hurt to
have that in at the moment, if we fix the compatible usage.

> 
> Add support for it.
>
> Signed-off-by: Yangtao Li 
> ---
>  drivers/dma/sun6i-dma.c | 25 +
>  1 file changed, 25 insertions(+)
> 
> diff --git a/drivers/dma/sun6i-dma.c b/drivers/dma/sun6i-dma.c
> index f5f9c86c50bc..5cadd4d2b824 100644
> --- a/drivers/dma/sun6i-dma.c
> +++ b/drivers/dma/sun6i-dma.c
> @@ -1173,6 +1173,30 @@ static struct sun6i_dma_config sun50i_a64_dma_cfg = {
>BIT(DMA_SLAVE_BUSWIDTH_8_BYTES),
>  };
>  
> +/*
> + * TODO: Add support for more than 4g physical addressing.
> + *
> + * The A100 binding uses the number of dma channels from the
> + * device tree node.
> + */
> +static struct sun6i_dma_config sun50i_a100_dma_cfg = {
> + .clock_autogate_enable = sun6i_enable_clock_autogate_h3,
> + .set_burst_length = sun6i_set_burst_length_h3,
> + .set_drq  = sun6i_set_drq_h6,
> + .set_mode = sun6i_set_mode_h6,
> + .src_burst_lengths = BIT(1) | BIT(4) | BIT(8) | BIT(16),
> + .dst_burst_lengths = BIT(1) | BIT(4) | BIT(8) | BIT(16),
> + .src_addr_widths   = BIT(DMA_SLAVE_BUSWIDTH_1_BYTE) |
> +  BIT(DMA_SLAVE_BUSWIDTH_2_BYTES) |
> +  BIT(DMA_SLAVE_BUSWIDTH_4_BYTES) |
> +  BIT(DMA_SLAVE_BUSWIDTH_8_BYTES),
> + .dst_addr_widths   = BIT(DMA_SLAVE_BUSWIDTH_1_BYTE) |
> +  BIT(DMA_SLAVE_BUSWIDTH_2_BYTES) |
> +  BIT(DMA_SLAVE_BUSWIDTH_4_BYTES) |
> +  BIT(DMA_SLAVE_BUSWIDTH_8_BYTES),
> + .has_mbus_clk = true,
> +};
> +
>  /*
>   * The H6 binding uses the number of dma channels from the
>   * device tree node.
> @@ -1225,6 +1249,7 @@ static const struct of_device_id sun6i_dma_match[] = {
>   { .compatible = "allwinner,sun8i-h3-dma", .data = &sun8i_h3_dma_cfg },
>   { .compatible = "allwinner,sun8i-v3s-dma", .data = &sun8i_v3s_dma_cfg },
>   { .compatible = "allwinner,sun50i-a64-dma", .data = &sun50i_a64_dma_cfg 
> },
> + { .compatible = "allwinner,sun50i-a100-dma", .data = 
> &sun50i_a100_dma_cfg },
>   { .compatible = "allwinner,sun50i-h6-dma", .data = &sun50i_h6_dma_cfg },
>   { /* sentinel */ }
>  };
> 



Re: [RESEND PATCH 06/19] arm64: allwinner: a100: Add device node for DMA controller

2020-11-28 Thread André Przywara
On 10/11/2020 06:29, Frank Lee wrote:
> From: Yangtao Li 
> 
> The A100 SoC has a DMA controller that supports 8 DMA channels
> to and from various peripherals.
> 
> Add a device node for it.
> 
> Signed-off-by: Yangtao Li 
> ---
>  arch/arm64/boot/dts/allwinner/sun50i-a100.dtsi | 12 
>  1 file changed, 12 insertions(+)
> 
> diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a100.dtsi 
> b/arch/arm64/boot/dts/allwinner/sun50i-a100.dtsi
> index cc321c04f121..c34ed8045363 100644
> --- a/arch/arm64/boot/dts/allwinner/sun50i-a100.dtsi
> +++ b/arch/arm64/boot/dts/allwinner/sun50i-a100.dtsi
> @@ -101,6 +101,18 @@ ccu: clock@3001000 {
>   #reset-cells = <1>;
>   };
>  
> + dma: dma-controller@3002000 {
> + compatible = "allwinner,sun50i-a100-dma";

So as it appears to work with the exact same settings in the driver as
the H6, we should have that as a compatible fallback:
  compatible = "allwinner,sun50i-a100-dma", "allwinner,sun50i-h6-dma";

Cheers,
Andre

> + reg = <0x03002000 0x1000>;
> + interrupts = ;
> + clocks = <&ccu CLK_BUS_DMA>, <&ccu CLK_MBUS_DMA>;
> + clock-names = "bus", "mbus";
> + dma-channels = <8>;
> + dma-requests = <51>;
> + resets = <&ccu RST_BUS_DMA>;
> + #dma-cells = <1>;
> + };
> +
>   gic: interrupt-controller@3021000 {
>   compatible = "arm,gic-400";
>   reg = <0x03021000 0x1000>, <0x03022000 0x2000>,
> 



Re: [RESEND PATCH 11/19] arm64: dts: allwinner: a100: add watchdog node

2020-11-28 Thread André Przywara
On 10/11/2020 06:38, Frank Lee wrote:
> From: Yangtao Li 
> 
> Declare A100's watchdog in the device-tree.
> 
> Signed-off-by: Yangtao Li 

I have neither a manual nor hardware, but this node looks alright when
compared to the H6 one.

Reviewed-by: Andre Przywara 

Cheers,
Andre

> ---
>  arch/arm64/boot/dts/allwinner/sun50i-a100.dtsi | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a100.dtsi 
> b/arch/arm64/boot/dts/allwinner/sun50i-a100.dtsi
> index 01ff53b5a7a8..6aa3337ce0e9 100644
> --- a/arch/arm64/boot/dts/allwinner/sun50i-a100.dtsi
> +++ b/arch/arm64/boot/dts/allwinner/sun50i-a100.dtsi
> @@ -144,6 +144,14 @@ ths_calibration: calib@14 {
>   };
>   };
>  
> + watchdog@30090a0 {
> + compatible = "allwinner,sun50i-a100-wdt",
> +  "allwinner,sun6i-a31-wdt";
> + reg = <0x030090a0 0x20>;
> + interrupts = ;
> + clocks = <&dcxo24M>;
> + };
> +
>   pio: pinctrl@300b000 {
>   compatible = "allwinner,sun50i-a100-pinctrl";
>   reg = <0x0300b000 0x400>;
> 



Re: [RESEND PATCH 07/19] arm64: dts: allwinner: A100: Add PMU mode

2020-11-28 Thread André Przywara
On 10/11/2020 06:31, Frank Lee wrote:

Hi,

> From: Yangtao Li 
> 
> Add the Performance Monitoring Unit (PMU) device tree node to the A100
> .dtsi, which tells DT users which interrupts are triggered by PMU overflow
> events on each core.

Have you tested that the interrupts actually work? For the A64 they
were wrong in the manual, and we realised that only later.
"perf stat" works even without interrupts, but "perf record" requires
them, and will return empty-handed if they don't work.
Can you confirm this?

> 
> Signed-off-by: Yangtao Li 

Without being able to test or verify this, the nodes look correct, so:

Reviewed-by: Andre Przywara 

Cheers,
Andre

> ---
>  arch/arm64/boot/dts/allwinner/sun50i-a100.dtsi | 15 ---
>  1 file changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a100.dtsi 
> b/arch/arm64/boot/dts/allwinner/sun50i-a100.dtsi
> index c34ed8045363..01ff53b5a7a8 100644
> --- a/arch/arm64/boot/dts/allwinner/sun50i-a100.dtsi
> +++ b/arch/arm64/boot/dts/allwinner/sun50i-a100.dtsi
> @@ -25,21 +25,21 @@ cpu0: cpu@0 {
>   enable-method = "psci";
>   };
>  
> - cpu@1 {
> + cpu1: cpu@1 {
>   compatible = "arm,cortex-a53";
>   device_type = "cpu";
>   reg = <0x1>;
>   enable-method = "psci";
>   };
>  
> - cpu@2 {
> + cpu2: cpu@2 {
>   compatible = "arm,cortex-a53";
>   device_type = "cpu";
>   reg = <0x2>;
>   enable-method = "psci";
>   };
>  
> - cpu@3 {
> + cpu3: cpu@3 {
>   compatible = "arm,cortex-a53";
>   device_type = "cpu";
>   reg = <0x3>;
> @@ -47,6 +47,15 @@ cpu@3 {
>   };
>   };
>  
> + pmu {
> + compatible = "arm,cortex-a53-pmu";
> + interrupts = ,
> +  ,
> +  ,
> +  ;
> + interrupt-affinity = <&cpu0>, <&cpu1>, <&cpu2>, <&cpu3>;
> + };
> +
>   psci {
>   compatible = "arm,psci-1.0";
>   method = "smc";
> 



Re: [RESEND PATCH 12/19] dt-bindings: Add bindings for USB phy on Allwinner A100

2020-11-28 Thread André Przywara
On 11/11/2020 22:50, Rob Herring wrote:

Hi,

> On Tue, Nov 10, 2020 at 02:39:42PM +0800, Frank Lee wrote:
>> From: Yangtao Li 
>>
>> Add a device tree binding for the A100's USB PHY.

Not your fault, Yangtao, but why do we actually have a separate binding
document per SoC, when the differences between the PHYs are so minimal
that we get away with some flags in the compatible match, in one driver
file?

For a start this file is basically identical to the A64 one (apart from
the example), so can you just add the A100 compatible string to that
one, instead?

Cheers,
Andre

>>
>> Signed-off-by: Yangtao Li 
>> ---
>>  .../phy/allwinner,sun50i-a100-usb-phy.yaml| 105 ++
>>  1 file changed, 105 insertions(+)
>>  create mode 100644 
>> Documentation/devicetree/bindings/phy/allwinner,sun50i-a100-usb-phy.yaml
>>
>> diff --git 
>> a/Documentation/devicetree/bindings/phy/allwinner,sun50i-a100-usb-phy.yaml 
>> b/Documentation/devicetree/bindings/phy/allwinner,sun50i-a100-usb-phy.yaml
>> new file mode 100644
>> index ..cc9bbebe2bd7
>> --- /dev/null
>> +++ 
>> b/Documentation/devicetree/bindings/phy/allwinner,sun50i-a100-usb-phy.yaml
>> @@ -0,0 +1,105 @@
>> +# SPDX-License-Identifier: GPL-2.0
> 
> Dual license new bindings. checkpatch.pl will tell you which ones.
> 
>> +%YAML 1.2
>> +---
>> +$id: http://devicetree.org/schemas/phy/allwinner,sun50i-a100-usb-phy.yaml#
>> +$schema: http://devicetree.org/meta-schemas/core.yaml#
>> +
>> +title: Allwinner A100 USB PHY Device Tree Bindings
>> +
>> +maintainers:
>> +  - Yangtao Li 
>> +
>> +properties:
>> +  "#phy-cells":
>> +const: 1
>> +
>> +  compatible:
>> +const: allwinner,sun50i-a100-usb-phy
>> +
>> +  reg:
>> +items:
>> +  - description: PHY Control registers
>> +  - description: PHY PMU0 registers
>> +  - description: PHY PMU1 registers
>> +
>> +  reg-names:
>> +items:
>> +  - const: phy_ctrl
>> +  - const: pmu0
>> +  - const: pmu1
>> +
>> +  clocks:
>> +items:
>> +  - description: USB OTG PHY bus clock
>> +  - description: USB Host 0 PHY bus clock
>> +
>> +  clock-names:
>> +items:
>> +  - const: usb0_phy
>> +  - const: usb1_phy
>> +
>> +  resets:
>> +items:
>> +  - description: USB OTG reset
>> +  - description: USB Host 1 Controller reset
>> +
>> +  reset-names:
>> +items:
>> +  - const: usb0_reset
>> +  - const: usb1_reset
>> +
>> +  usb0_id_det-gpios:
>> +description: GPIO to the USB OTG ID pin
> 
> Needs 'maxItems: 1'
> 
>> +
>> +  usb0_vbus_det-gpios:
>> +description: GPIO to the USB OTG VBUS detect pin
>> +
>> +  usb0_vbus_power-supply:
>> +description: Power supply to detect the USB OTG VBUS
>> +
>> +  usb0_vbus-supply:
>> +description: Regulator controlling USB OTG VBUS
>> +
>> +  usb1_vbus-supply:
>> +description: Regulator controlling USB1 Host controller
> 
> Are ID and VBus actually connected to the phy h/w? Really, all this 
> should be in a USB connector node for which we have bindings.
> 
>> +
>> +required:
>> +  - "#phy-cells"
>> +  - compatible
>> +  - clocks
>> +  - clock-names
>> +  - reg
>> +  - reg-names
>> +  - resets
>> +  - reset-names
>> +
>> +additionalProperties: false
>> +
>> +examples:
>> +  - |
>> +#include 
>> +#include 
>> +#include 
>> +
>> +phy@5100400 {
>> +#phy-cells = <1>;
>> +compatible = "allwinner,sun50i-a100-usb-phy";
>> +reg = <0x05100400 0x14>,
>> +  <0x05101800 0x4>,
>> +  <0x05200800 0x4>;
>> +reg-names = "phy_ctrl",
>> +"pmu0",
>> +"pmu1";
>> +clocks = <&ccu CLK_USB_PHY0>,
>> + <&ccu CLK_USB_PHY1>;
>> +clock-names = "usb0_phy",
>> +  "usb1_phy";
>> +resets = <&ccu RST_USB_PHY0>,
>> + <&ccu RST_USB_PHY1>;
>> +reset-names = "usb0_reset",
>> +  "usb1_reset";
>> +usb0_id_det-gpios = <&pio 7 10 GPIO_ACTIVE_HIGH>; /* PH10 */
>> +usb0_vbus_power-supply = <&usb_power_supply>;
>> +usb0_vbus-supply = <&reg_drivevbus>;
>> +usb1_vbus-supply = <&reg_usb1_vbus>;
>> +};
>> -- 
>> 2.28.0
>>
> 
> ___
> linux-arm-kernel mailing list
> linux-arm-ker...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 



Re: [RESEND PATCH 18/19] arm64: allwinner: a100: Add MMC related nodes

2020-11-28 Thread André Przywara
On 10/11/2020 06:48, Frank Lee wrote:

Hi,

> From: Yangtao Li 
> 
> The A100 has 3 MMC controllers, one of them being especially targeted to
> eMMC. Let's add nodes on dts.
> 
> Signed-off-by: Yangtao Li 

I don't have a datasheet nor a device for testing, but at least I could
check the pins against the pinctrl driver, and compare the MMC nodes
against the H6. Apart from the interrupts they are the same, so:

Reviewed-by: Andre Przywara 

Cheers,
Andre

> ---
>  .../arm64/boot/dts/allwinner/sun50i-a100.dtsi | 71 +++
>  1 file changed, 71 insertions(+)
> 
> diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a100.dtsi 
> b/arch/arm64/boot/dts/allwinner/sun50i-a100.dtsi
> index c731bb9727c2..4adfc7d4854a 100644
> --- a/arch/arm64/boot/dts/allwinner/sun50i-a100.dtsi
> +++ b/arch/arm64/boot/dts/allwinner/sun50i-a100.dtsi
> @@ -169,12 +169,83 @@ pio: pinctrl@300b000 {
>   interrupt-controller;
>   #interrupt-cells = <3>;
>  
> + mmc0_pins: mmc0-pins {
> + pins = "PF0", "PF1", "PF2", "PF3",
> +"PF4", "PF5";
> + function = "mmc0";
> + drive-strength = <30>;
> + bias-pull-up;
> + };
> +
> + /omit-if-no-ref/
> + mmc1_pins: mmc1-pins {
> + pins = "PG0", "PG1", "PG2", "PG3",
> +"PG4", "PG5";
> + function = "mmc1";
> + drive-strength = <30>;
> + bias-pull-up;
> + };
> +
> + mmc2_pins: mmc2-pins {
> + pins = "PC0", "PC1", "PC5", "PC6",
> +"PC8", "PC9", "PC10", "PC11",
> +"PC13", "PC14", "PC15", "PC16";
> + function = "mmc2";
> + drive-strength = <30>;
> + bias-pull-up;
> + };
> +
>   uart0_pb_pins: uart0-pb-pins {
>   pins = "PB9", "PB10";
>   function = "uart0";
>   };
>   };
>  
> + mmc0: mmc@402 {
> + compatible = "allwinner,sun50i-a100-mmc";
> + reg = <0x0402 0x1000>;
> + clocks = <&ccu CLK_BUS_MMC0>, <&ccu CLK_MMC0>;
> + clock-names = "ahb", "mmc";
> + resets = <&ccu RST_BUS_MMC0>;
> + reset-names = "ahb";
> + interrupts = ;
> + pinctrl-names = "default";
> + pinctrl-0 = <&mmc0_pins>;
> + status = "disabled";
> + #address-cells = <1>;
> + #size-cells = <0>;
> + };
> +
> + mmc1: mmc@4021000 {
> + compatible = "allwinner,sun50i-a100-mmc";
> + reg = <0x04021000 0x1000>;
> + clocks = <&ccu CLK_BUS_MMC1>, <&ccu CLK_MMC1>;
> + clock-names = "ahb", "mmc";
> + resets = <&ccu RST_BUS_MMC1>;
> + reset-names = "ahb";
> + interrupts = ;
> + pinctrl-names = "default";
> + pinctrl-0 = <&mmc1_pins>;
> + status = "disabled";
> + #address-cells = <1>;
> + #size-cells = <0>;
> + };
> +
> + mmc2: mmc@4022000 {
> + compatible = "allwinner,sun50i-a100-emmc";
> + reg = <0x04022000 0x1000>;
> + clocks = <&ccu CLK_BUS_MMC2>, <&ccu CLK_MMC2>;
> + clock-names = "ahb", "mmc";
> + resets = <&ccu RST_BUS_MMC2>;
> + reset-names = "ahb";
> + interrupts = ;
> + pinctrl-names = "default";
> + pinctrl-0 = <&mmc2_pins>;
> + status = "disabled";
> + #address-cells = <1>;
> + #size-cells = <0>;
> + };
> +
>   uart0: serial@500 {
>   compatible = "snps,dw-apb-uart";
>   reg = <0x0500 0x400>;
> 



Re: [RESEND PATCH 17/19] mmc: sunxi: add support for A100 mmc controller

2020-11-28 Thread André Przywara
On 10/11/2020 06:46, Frank Lee wrote:

Hi,

> From: Yangtao Li 
> 
> This patch adds support for the A100 MMC controller, which uses word
> addresses for its internal DMA.
> 
> Signed-off-by: Yangtao Li 
> ---
>  drivers/mmc/host/sunxi-mmc.c | 28 +---
>  1 file changed, 25 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/mmc/host/sunxi-mmc.c b/drivers/mmc/host/sunxi-mmc.c
> index fc62773602ec..1518b64112b7 100644
> --- a/drivers/mmc/host/sunxi-mmc.c
> +++ b/drivers/mmc/host/sunxi-mmc.c
> @@ -244,6 +244,7 @@ struct sunxi_idma_des {
>  
>  struct sunxi_mmc_cfg {
>   u32 idma_des_size_bits;
> + u32 idma_des_shift;
>   const struct sunxi_mmc_clk_delay *clk_delays;
>  
>   /* does the IP block support autocalibration? */
> @@ -343,7 +344,7 @@ static int sunxi_mmc_init_host(struct sunxi_mmc_host 
> *host)
>   /* Enable CEATA support */
>   mmc_writel(host, REG_FUNS, SDXC_CEATA_ON);
>   /* Set DMA descriptor list base address */
> - mmc_writel(host, REG_DLBA, host->sg_dma);
> + mmc_writel(host, REG_DLBA, host->sg_dma >> host->cfg->idma_des_shift);
>  
>   rval = mmc_readl(host, REG_GCTRL);
>   rval |= SDXC_INTERRUPT_ENABLE_BIT;
> @@ -373,8 +374,10 @@ static void sunxi_mmc_init_idma_des(struct 
> sunxi_mmc_host *host,
>  
>   next_desc += sizeof(struct sunxi_idma_des);
>   pdes[i].buf_addr_ptr1 =
> - cpu_to_le32(sg_dma_address(&data->sg[i]));
> - pdes[i].buf_addr_ptr2 = cpu_to_le32((u32)next_desc);
> + cpu_to_le32(sg_dma_address(&data->sg[i]) >>
> + host->cfg->idma_des_shift);
> + pdes[i].buf_addr_ptr2 = cpu_to_le32((u32)next_desc >>
> + host->cfg->idma_des_shift);

I think you should cast after the shift, otherwise you lose the ability
to run above 4 GB. This won't be a problem at the moment, since we still
use the default 32-bit DMA mask, but might bite us later.
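
To illustrate the point about the order of cast and shift, here is a minimal user-space sketch. The two helper names are made up for this example and are not part of the driver; uint64_t stands in for a 64-bit dma_addr_t. With a shift of 2 (word addressing), truncating to 32 bits first drops everything at or above 4 GB, whereas shifting first keeps addresses up to 16 GB representable:

```c
#include <stdint.h>

/*
 * Hypothetical helper: truncate to 32 bits first, then shift.
 * This mirrors the "(u32)next_desc >> shift" form in the quoted hunk,
 * because the cast binds tighter than the shift operator.
 */
static uint32_t desc_addr_cast_then_shift(uint64_t dma_addr, unsigned int shift)
{
	return (uint32_t)dma_addr >> shift;
}

/*
 * Hypothetical helper: shift the full-width address first, then truncate.
 * Bits above bit 31 survive as long as they fit after the shift.
 */
static uint32_t desc_addr_shift_then_cast(uint64_t dma_addr, unsigned int shift)
{
	return (uint32_t)(dma_addr >> shift);
}
```

Below 4 GB both variants give the same result, which is why the difference does not show up with a 32-bit DMA mask.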

Otherwise this patch looks fine, and works on the H616 as well.

Cheers,
Andre

>   }
>  
>   pdes[0].config |= cpu_to_le32(SDXC_IDMAC_DES0_FD);
> @@ -1178,6 +1181,23 @@ static const struct sunxi_mmc_cfg sun50i_a64_emmc_cfg 
> = {
>   .needs_new_timings = true,
>  };
>  
> +static const struct sunxi_mmc_cfg sun50i_a100_cfg = {
> + .idma_des_size_bits = 16,
> + .idma_des_shift = 2,
> + .clk_delays = NULL,
> + .can_calibrate = true,
> + .mask_data0 = true,
> + .needs_new_timings = true,
> +};
> +
> +static const struct sunxi_mmc_cfg sun50i_a100_emmc_cfg = {
> + .idma_des_size_bits = 13,
> + .idma_des_shift = 2,
> + .clk_delays = NULL,
> + .can_calibrate = true,
> + .needs_new_timings = true,
> +};
> +
>  static const struct of_device_id sunxi_mmc_of_match[] = {
>   { .compatible = "allwinner,sun4i-a10-mmc", .data = &sun4i_a10_cfg },
>   { .compatible = "allwinner,sun5i-a13-mmc", .data = &sun5i_a13_cfg },
> @@ -1186,6 +1206,8 @@ static const struct of_device_id sunxi_mmc_of_match[] = 
> {
>   { .compatible = "allwinner,sun9i-a80-mmc", .data = &sun9i_a80_cfg },
>   { .compatible = "allwinner,sun50i-a64-mmc", .data = &sun50i_a64_cfg },
>   { .compatible = "allwinner,sun50i-a64-emmc", .data = 
> &sun50i_a64_emmc_cfg },
> + { .compatible = "allwinner,sun50i-a100-mmc", .data = &sun50i_a100_cfg },
> + { .compatible = "allwinner,sun50i-a100-emmc", .data = 
> &sun50i_a100_emmc_cfg },
>   { /* sentinel */ }
>  };
>  MODULE_DEVICE_TABLE(of, sunxi_mmc_of_match);
> 



Re: [RESEND PATCH 13/19] phy: sun4i-usb: add support for A100 USB PHY

2020-11-28 Thread André Przywara
On 10/11/2020 06:40, Frank Lee wrote:

Hi,

> From: Yangtao Li 
> 
> Add support for the A100's USB PHY block, which has 2 PHYs.
> 
> Signed-off-by: Yangtao Li 
> ---
>  drivers/phy/allwinner/phy-sun4i-usb.c | 19 +++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/drivers/phy/allwinner/phy-sun4i-usb.c 
> b/drivers/phy/allwinner/phy-sun4i-usb.c
> index a6900495baa5..1a0e403131e7 100644
> --- a/drivers/phy/allwinner/phy-sun4i-usb.c
> +++ b/drivers/phy/allwinner/phy-sun4i-usb.c
> @@ -107,6 +107,7 @@ enum sun4i_usb_phy_type {
>   sun8i_r40_phy,
>   sun8i_v3s_phy,
>   sun50i_a64_phy,
> + sun50i_a100_phy,

But with that patch fixing the H6 support you don't need a new name, do you?
Because below you just add the sun50i_a100_phy name next to every place
with a sun50i_h6_phy check.

>   sun50i_h6_phy,
>  };
>  
> @@ -289,7 +290,13 @@ static int sun4i_usb_phy_init(struct phy *_phy)
>   }
>  
>   if (data->cfg->type == sun8i_a83t_phy ||
> + data->cfg->type == sun50i_a100_phy ||
>   data->cfg->type == sun50i_h6_phy) {
> + if (phy->pmu && data->cfg->enable_pmu_unk1) {
> + val = readl(phy->pmu + REG_PMU_UNK1);
> + writel(val & ~BIT(3), phy->pmu + REG_PMU_UNK1);
> + }
> +
>   if (phy->index == 0) {
>   val = readl(data->base + data->cfg->phyctl_offset);
>   val |= PHY_CTL_VBUSVLDEXT;
> @@ -339,6 +346,7 @@ static int sun4i_usb_phy_exit(struct phy *_phy)
>  
>   if (phy->index == 0) {
>   if (data->cfg->type == sun8i_a83t_phy ||
> + data->cfg->type == sun50i_a100_phy ||
>   data->cfg->type == sun50i_h6_phy) {
>   void __iomem *phyctl = data->base +
>   data->cfg->phyctl_offset;
> @@ -960,6 +968,16 @@ static const struct sun4i_usb_phy_cfg sun50i_a64_cfg = {
>   .phy0_dual_route = true,
>  };
>  
> +static const struct sun4i_usb_phy_cfg sun50i_a100_cfg = {
> + .num_phys = 2,
> + .type = sun50i_a100_phy,

So you could just use the sun50i_h6_phy type here.

Cheers,
Andre

> + .disc_thresh = 3,
> + .phyctl_offset = REG_PHYCTL_A33,
> + .dedicated_clocks = true,
> + .enable_pmu_unk1 = true,
> + .phy0_dual_route = true,
> +};
> +
>  static const struct sun4i_usb_phy_cfg sun50i_h6_cfg = {
>   .num_phys = 4,
>   .type = sun50i_h6_phy,
> @@ -983,6 +1001,7 @@ static const struct of_device_id 
> sun4i_usb_phy_of_match[] = {
>   { .compatible = "allwinner,sun8i-v3s-usb-phy", .data = &sun8i_v3s_cfg },
>   { .compatible = "allwinner,sun50i-a64-usb-phy",
> .data = &sun50i_a64_cfg},
> + { .compatible = "allwinner,sun50i-a100-usb-phy", .data = 
> &sun50i_a100_cfg },
>   { .compatible = "allwinner,sun50i-h6-usb-phy", .data = &sun50i_h6_cfg },
>   { },
>  };
> 



Re: [PATCH v3 4/5] arm64: Add support for SMCCC TRNG entropy source

2020-11-20 Thread André Przywara
On 19/11/2020 13:41, Ard Biesheuvel wrote:

Hi,

> On Fri, 13 Nov 2020 at 19:24, Andre Przywara  wrote:
>>
>> The ARM architected TRNG firmware interface, described in ARM spec
>> DEN0098, defines an ARM SMCCC based interface to a true random number
>> generator, provided by firmware.
>> This can be discovered via the SMCCC >=v1.1 interface, and provides
>> up to 192 bits of entropy per call.
>>
>> Hook this SMC call into arm64's arch_get_random_*() implementation,
>> coming to the rescue when the CPU does not implement the ARM v8.5 RNG
>> system registers.
>>
>> For the detection, we piggy back on the PSCI/SMCCC discovery (which gives
>> us the conduit to use (hvc/smc)), then try to call the
>> ARM_SMCCC_TRNG_VERSION function, which returns -1 if this interface is
>> not implemented.
>>
>> Signed-off-by: Andre Przywara 
>> ---
>>  arch/arm64/include/asm/archrandom.h | 69 -
>>  1 file changed, 58 insertions(+), 11 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/archrandom.h 
>> b/arch/arm64/include/asm/archrandom.h
>> index abe07c21da8e..fe34bfd30caa 100644
>> --- a/arch/arm64/include/asm/archrandom.h
>> +++ b/arch/arm64/include/asm/archrandom.h
>> @@ -4,13 +4,24 @@
>>
>>  #ifdef CONFIG_ARCH_RANDOM
>>
>> +#include 
>>  #include 
>>  #include 
>>  #include 
>>
>> +#define ARM_SMCCC_TRNG_MIN_VERSION 0x1UL
>> +
>> +extern bool smccc_trng_available;
>> +
>>  static inline bool __init smccc_probe_trng(void)
>>  {
>> -   return false;
>> +   struct arm_smccc_res res;
>> +
>> +   arm_smccc_1_1_invoke(ARM_SMCCC_TRNG_VERSION, &res);
>> +   if ((s32)res.a0 < 0)
>> +   return false;
>> +
>> +   return res.a0 >= ARM_SMCCC_TRNG_MIN_VERSION;
>>  }
>>
>>  static inline bool __arm64_rndr(unsigned long *v)
>> @@ -43,26 +54,52 @@ static inline bool __must_check 
>> arch_get_random_int(unsigned int *v)
>>
>>  static inline bool __must_check arch_get_random_seed_long(unsigned long *v)
>>  {
>> +   struct arm_smccc_res res;
>> +
>> /*
>>  * Only support the generic interface after we have detected
>>  * the system wide capability, avoiding complexity with the
>>  * cpufeature code and with potential scheduling between CPUs
>>  * with and without the feature.
>>  */
>> -   if (!cpus_have_const_cap(ARM64_HAS_RNG))
>> -   return false;
>> +   if (cpus_have_const_cap(ARM64_HAS_RNG))
>> +   return __arm64_rndr(v);
>>
>> -   return __arm64_rndr(v);
>> -}
>> +   if (smccc_trng_available) {
>> +   arm_smccc_1_1_invoke(ARM_SMCCC_TRNG_RND64, 64, &res);
>> +   if ((int)res.a0 < 0)
>> +   return false;
>>
>> +   *v = res.a3;
>> +   return true;
>> +   }
>> +
>> +   return false;
>> +}
>>
> 
> I think we should be more rigorous here in how we map the concepts of
> random seeds and random numbers onto the various sources.
> 
> First of all, assuming my patch dropping the call to
> arch_get_random_seed_long() from add_interrupt_randomness() gets
> accepted, we should switch to RNDRRS here, and implement the non-seed
> variants using RNDR.

I agree (and have a patch ready), but that seems independent from this
series.

> However, this is still semantically inaccurate: RNDRRS does not return
> a random *seed*, it returns a number drawn from a freshly seeded
> pseudo-random sequence. This means that the TRNG interface, if
> implemented, is a better choice, and so we should try it first. Note
> that on platforms that don't implement both, only one of these will be
> available in the first place. But on platforms that *do* implement
> both, the firmware interface may actually be less wasteful in terms of
> resources: the TRNG interface returns every bit drawn from the
> underlying entropy source, whereas RNDRRS uses ~500 bits of entropy to
> reseed a DRBG that gets used only once to draw a single 64-bit number.

I am not sure I share your enthusiasm about the quality of the actual
TRNG firmware implementations, but we can go with this for now.
Maybe if we see bad implementations in the future, we can revisit this,
and have some tuneables? Or a command line option to ignore the SMCCC
interface? Or use the UUID mechanism for that?

> And the cost of the SMCCC call in terms of CPU time is charged to the
> caller, which is appropriate here.

This still leaves the problem that the core might be stuck in EL3 for an
unknown period of time, impeding our realtime efforts.
Do we have a ballpark figure for the number of cycles spent in EL3 that
would still be acceptable? That could serve as a guideline for firmware
implementations.

> Then, I don't think we should ever return false without even trying
> RNDRRS, if it is available, when the SMCCC invocation fails.

That's a good point.

> Something like this perhaps?
> 
> if (smccc_trng_available) {
>   arm_smccc_1_1_invoke(ARM_SMCCC_TRNG_RND64, 64, &res);
>   if ((int)res.a0 >= 0) {
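
The quoted suggestion breaks off at this point in the archive. The ordering argued for above (firmware TRNG first, RNDRRS as a fallback, and no early "return false" while one source is still untried) could be sketched roughly as follows. This is not Ard's code: struct entropy_source, get_seed_long() and the three stub readers are hypothetical stand-ins for the SMCCC call and the RNDRRS instruction, so that the control flow can be compiled and exercised in user space.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one entropy source (firmware call or CPU insn). */
struct entropy_source {
	bool available;			/* detected at boot */
	bool (*read)(uint64_t *v);	/* true on success */
};

/* Stub readers, for illustration only. */
static bool stub_trng_read(uint64_t *v)   { *v = 0x1111; return true; }
static bool stub_rndrrs_read(uint64_t *v) { *v = 0x2222; return true; }
static bool stub_fail_read(uint64_t *v)   { (void)v; return false; }

/*
 * Try the firmware TRNG first. If it is absent or the call fails,
 * fall through to RNDRRS instead of giving up immediately.
 */
static bool get_seed_long(const struct entropy_source *trng,
			  const struct entropy_source *rndrrs, uint64_t *v)
{
	if (trng->available && trng->read(v))
		return true;

	if (rndrrs->available && rndrrs->read(v))
		return true;

	return false;
}
```

Whether a failing firmware call should really be followed by RNDRRS is exactly the policy question under discussion; the sketch only shows what that choice looks like.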

Re: [PATCH v3 0/5] ARM: arm64: Add SMCCC TRNG entropy service

2020-11-13 Thread André Przywara
On 13/11/2020 23:05, Ard Biesheuvel wrote:
> On Fri, 13 Nov 2020 at 19:24, Andre Przywara  wrote:
>>
>> Hi,
>>
>> an update to v2 with some fixes and a few tweaks. Ard's patch [1] should
>> significantly reduce the frequency of arch_get_random_seed_long() calls,
>> not sure if that is enough to appease the concerns about the
>> potentially long latency of SMC calls. I also dropped the direct
>> arch_get_random() call in KVM for the same reason. An alternative could
>> be to just use the SMC in the _early() versions, but then we would lose
>> the SMCCC entropy source for the periodic reseeds. This could be mitigated
>> by using a hwrng driver [2] and rngd.
>> The only other non-minor change to v2 is the addition of using the SMCCC
>> call in the _early() variant. For a changelog see below.
>>
>> Sudeep: patch 1/5 is a prerequisite for all other patches, which
>> themselves could be considered separate and need to go via different trees.
>> If we could agree on that one now and get that merged, it would help the
>> handling of the other patches going forward.
>>
>> Cheers,
>> Andre
>> ==
>>
>> The ARM architected TRNG firmware interface, described in ARM spec
>> DEN0098[3], defines an ARM SMCCC based interface to a true random number
>> generator, provided by firmware.
>>
>> This series collects all the patches implementing this in various
>> places: as a user feeding into the ARCH_RANDOM pool, both for ARM and
>> arm64, and as a service provider for KVM guests.
>>
>> Patch 1 introduces the interface definition used by all three entities.
>> Patch 2 prepares the Arm SMCCC firmware driver to probe for the
>> interface. This patch is needed to avoid a later dependency on *two*
>> patches (there might be a better solution to this problem).
>>
>> Patch 3 implements the ARM part, patch 4 is the arm64 version.
>> The final patch 5 adds support to provide random numbers to KVM guests.
>>
>> This was tested on:
>> - QEMU -kernel (no SMCCC, regression test)
>> - Juno w/ prototype of the h/w Trusted RNG support
>> - mainline KVM (SMCCC, but no TRNG: regression test)
>> - ARM and arm64 KVM guests, using the KVM service in patch 5/5
>>
>> Based on v5.10-rc3, please let me know if I should rebase on something
>> else. A git repo is accessible at:
>> https://gitlab.arm.com/linux-arm/linux-ap/-/commits/smccc-trng/v3/
>>
>> Cheers,
>> Andre
>>
>> [1] 
>> http://lists.infradead.org/pipermail/linux-arm-kernel/2020-November/615446.html
>> [2] https://gitlab.arm.com/linux-arm/linux-ap/-/commit/87e3722f437
>> [3] https://developer.arm.com/documentation/den0098/latest/
>>
>> Changelog v2 ... v3:
>> - ARM: fix compilation with randconfig
>> - arm64: use SMCCC call also in arch_get_random_seed_long_early()
>> - KVM: comment on return value usage
>> - KVM: use more interesting UUID (enjoy, Marc!)
> 
> UUIDs are constructed using certain rules, so probably better to
> refrain from playing games with them here.

Hey, it's a valid variant 1 version 1 UUID (otherwise this would be no
real easter egg). uuidgen -t should be able to generate this one. I
found it too easy to use the random variant, but we can revert to that.
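
For reference, "variant 1 version 1" can be checked mechanically on the textual form: the version is the first hex digit of the third group, and the variant sits in the top two bits of the first digit of the fourth group. A small sketch, with helper names invented for this example (the UUID from the patch is not reproduced here):

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Value of one hex digit, or -1 if the character is not a hex digit. */
static int hex_val(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/* Check the canonical 8-4-4-4-12 textual layout. */
static bool uuid_well_formed(const char *s)
{
	size_t i;

	if (strlen(s) != 36)
		return false;

	for (i = 0; i < 36; i++) {
		if (i == 8 || i == 13 || i == 18 || i == 23) {
			if (s[i] != '-')
				return false;
		} else if (hex_val(s[i]) < 0) {
			return false;
		}
	}
	return true;
}

/* Version nibble (1 = time based, 4 = random), or -1 if malformed. */
static int uuid_version(const char *s)
{
	return uuid_well_formed(s) ? hex_val(s[14]) : -1;
}

/* True if the variant bits are 10x, i.e. "variant 1" (RFC 4122). */
static bool uuid_is_variant1(const char *s)
{
	return uuid_well_formed(s) && (hex_val(s[19]) & 0xc) == 0x8;
}
```

The strings used to try this out are arbitrary samples; any output of "uuidgen -t" should report version 1 and variant 1.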

> If Marc wants an easter egg, he will have to wait until Easter.

Don't mess with your maintainer ;-)

Cheers,
Andre

> 
>> - KVM: use bitmaps instead of open coded long arrays
>> - KVM: drop direct usage of arch_get_random() interface
>>
>> Changelog "v1" ... v2:
>> - trigger ARCH_RANDOM initialisation from the SMCCC firmware driver
>> - use a single bool in smccc.c to hold the initialisation state for arm64
>> - handle endianness correctly in the KVM provider
>>
>> Andre Przywara (2):
>>   firmware: smccc: Introduce SMCCC TRNG framework
>>   arm64: Add support for SMCCC TRNG entropy source
>>
>> Ard Biesheuvel (3):
>>   firmware: smccc: Add SMCCC TRNG function call IDs
>>   ARM: implement support for SMCCC TRNG entropy source
>>   KVM: arm64: implement the TRNG hypervisor call
>>
>>  arch/arm/Kconfig|  4 ++
>>  arch/arm/include/asm/archrandom.h   | 74 +
>>  arch/arm64/include/asm/archrandom.h | 79 +++
>>  arch/arm64/include/asm/kvm_host.h   |  2 +
>>  arch/arm64/kvm/Makefile |  2 +-
>>  arch/arm64/kvm/hypercalls.c |  6 ++
>>  arch/arm64/kvm/trng.c   | 85 +
>>  drivers/firmware/smccc/smccc.c  |  5 ++
>>  include/linux/arm-smccc.h   | 31 +++
>>  9 files changed, 277 insertions(+), 11 deletions(-)
>>  create mode 100644 arch/arm/include/asm/archrandom.h
>>  create mode 100644 arch/arm64/kvm/trng.c
>>
>> --
>> 2.17.1
>>



Re: [PATCH v2 4/5] arm64: Add support for SMCCC TRNG entropy source

2020-11-12 Thread André Przywara
On 05/11/2020 14:38, Mark Rutland wrote:

Hi,

> On Thu, Nov 05, 2020 at 02:29:49PM +, Mark Brown wrote:
>> On Thu, Nov 05, 2020 at 02:03:22PM +, Mark Rutland wrote:
>>> On Thu, Nov 05, 2020 at 01:41:42PM +, Mark Brown wrote:
>>
>>>> It isn't obvious to me why we don't fall through to trying the SMCCC
>>>> TRNG here if for some reason the v8.5-RNG didn't give us something.
>>>> Definitely an obscure possibility but still...
>>
>>> I think it's better to assume that if we have a HW RNG and it's not
>>> giving us entropy, it's not worthwhile trapping to the host, which might
>>> encounter the exact same issue.
>>
>> There's definitely a good argument for that, but OTOH it's possible the
>> SMCCC implementation is doing something else (it'd be an interesting
>> implementation decision but...).  That said I don't really mind, I think
>> my comment was more that if we're doing this the code should be explicit
>> about what the intent is since right now it isn't obvious.  Either a
>> comment or having an explicit "what method are we choosing" thing.
>>
>>> That said, I'm not sure it's great to plumb this under the
>>> arch_get_random*() interfaces, e.g. given this means that
>>> add_interrupt_randomness() will end up trapping to the host all the time
>>> when it calls arch_get_random_seed_long().
>>
>>> Is there an existing interface for "slow" runtime entropy that we can
>>> plumb this into instead?
>>
>> Yeah, I was wondering about this myself - it seems like a better fit for
>> hwrng rather than the arch interfaces but that's not used until
>> userspace comes up, the arch stuff is all expected to be quick.  I
>> suppose we could implement the SMCCC stuff for the early variants of the
>> API you added so it gets used for bootstrapping purposes and then we
>> rely on userspace keeping things topped up by fetching entropy through
>> hwrng or otherwise but that feels confused so I have a hard time getting
>> enthusiastic about it.
> 
> I'm perfectly happy for the early functions to call this, or for us to
> add some new firmware_get_random_*() functions that we can call
> early (and potentially at runtime, but less often than
> arch_get_random_*()).
> 
> I suspect the easy thing to do for now is plumb this into the existing
> early arch functions and hwrng.

So coming back to this: With Ard's patch to remove arch_get_random from
add_interrupt_randomness(), I see this called much less often: basically
once at early boot, then 16 longs every 5 minutes or so, from the
periodic crng reseed.
The only exception would be the KVM code now, so we are at the mercy of
a guest to not swamp us with seed requests. Alternatively we could
remove the direct arch_get_random call from the KVM code, relying on the
general kernel pool instead.

Is this new situation now good enough to keep the SMCCC calls in this
interface here?

I have the hwrng driver ready, which could coexist with the arch_random
implementation. But if the only purpose of /dev/hwrng is to let rngd
feed this entropy back into the kernel, it would be pointless.
I found the driver useful to debug and test the firmware implementation
and to assess the random number quality (by feeding the raw stream into
rngtest or dieharder), but that might not justify a merge.

Ard objected to the driver, I guess to keep things simple and
architectural.

So what is the plan here? Shall I post a v3 with or without the hwrng
driver? And do we keep the SMCCC arch_random implementation?

Cheers,
Andre


Re: [PATCH v8 00/22] perf arm-spe: Refactor decoding & dumping flow

2020-11-11 Thread André Przywara
On 11/11/2020 17:44, Arnaldo Carvalho de Melo wrote:
> On Wed, Nov 11, 2020 at 04:20:26PM +, André Przywara wrote:
>> On 11/11/2020 16:15, Arnaldo Carvalho de Melo wrote:
>>> On Wed, Nov 11, 2020 at 01:10:51PM -0300, Arnaldo Carvalho de Melo wrote:
>>>> On Wed, Nov 11, 2020 at 03:11:27PM +0800, Leo Yan wrote:
>>>>> This is patch set v8 for refactoring Arm SPE trace decoding and dumping.
>>>>>
>>>>> This version addresses Andre's comment to pass parameter '&buf_len' at
>>>>> the last call arm_spe_pkt_snprintf() in the function arm_spe_pkt_desc().
>>>>>
>>>>> This patch set is cleanly applied on the top of perf/core branch
>>>>> with commit 644bf4b0f7ac ("perf jevents: Add test for arch std events").
>>>>>
>>>>> I retested this patch set on Hisilicon D06 platform with commands
>>>>> "perf report -D" and "perf script", compared the decoding results
>>>>> between with this patch set and without this patch set, "diff" tool
>>>>> shows the result as expected.
>>>>
>>>> With the patches I applied I'm getting:
>>>>
>>>> util/arm-spe-decoder/arm-spe-pkt-decoder.c: In function 'arm_spe_pkt_desc':
>>>> util/arm-spe-decoder/arm-spe-pkt-decoder.c:410:3: error: left shift count
>>>> >= width of type [-Werror]
>>>>    case 1: ns = !!(packet->payload & NS_FLAG);
>>>>    ^
>>>> util/arm-spe-decoder/arm-spe-pkt-decoder.c:411:4: error: left shift count
>>>> >= width of type [-Werror]
>>>>     el = (packet->payload & EL_FLAG) >> 61;
>>>>     ^
>>>> util/arm-spe-decoder/arm-spe-pkt-decoder.c:411:4: error: left shift count
>>>> >= width of type [-Werror]
>>>> util/arm-spe-decoder/arm-spe-pkt-decoder.c:416:3: error: left shift count
>>>> >= width of type [-Werror]
>>>>    case 3: ns = !!(packet->payload & NS_FLAG);
>>>>    ^
>>>>   CC   /tmp/build/perf/util/arm-spe-decoder/arm-spe-decoder.o
>>>>
>>>>
>>>> On:
>>>>
>>>>   1611.70 android-ndk:r12b-arm  : FAIL
>>>> arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
>>>>   1711.32 android-ndk:r15c-arm  : FAIL
>>>> arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
>>>>
>>>> That were building ok before, builds still under way, perhaps it's just
>>>> on these old systems...
>>>
>>> [acme@five perf]$ git bisect good
>>> cc6fa07fb1458cca3741919774eb050976471000 is the first bad commit
>>> commit cc6fa07fb1458cca3741919774eb050976471000
>>> Author: Leo Yan 
>>> Date:   Wed Nov 11 15:11:28 2020 +0800
>>>
>>> perf arm-spe: Include bitops.h for BIT() macro
>>>
>>> Include header linux/bitops.h, directly use its BIT() macro and remove
>>> the self defined macros.
>>>
>>> Signed-off-by: Leo Yan 
>>> Reviewed-by: Andre Przywara 
>>> Link: https://lore.kernel.org/r/2020071149.815-2-leo@linaro.org
>>> Signed-off-by: Arnaldo Carvalho de Melo 
>>>
>>>  tools/perf/util/arm-spe-decoder/arm-spe-decoder.c | 5 +
>>>  tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c | 3 +--
>>>  2 files changed, 2 insertions(+), 6 deletions(-)
>>
>>
>> Ah, thanks! I think I mentioned the missing usage of BIT_ULL() in an
>> earlier review, and thought this was fixed. Possibly this gets fixed in
>> a later patch in this series, and is a temporary regression?
> 
> you mean this on that patch that ditches the local BIT() macro, right?
> 
> [acme@five perf]$ vim tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> [acme@five perf]$ git diff
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 46ddb53a645714bb..5f65a3a70c577207 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -12,8 +12,8 @@
> 
>  #include "arm-spe-pkt-decoder.h"
> 
> -#define NS_FLAGBIT(63)
> -#define EL_FLAG(BIT(62) | BIT(61))
> +#define NS_FLAGBIT_ULL(63)
> +#define EL_FLAG(BIT_ULL(62) | BIT_ULL(61))
> 
>  #define SPE_HEADER0_PAD0x0
>  #define SPE_HEADER0_END0x1

Yes, that basically happens in patch 10/22, so this will then
(trivially) clash when you rebase.

Thanks!
Andre.


> [acme@five perf]$
>  
>> How do you want to handle this? Shall Leo resend, amending this patch
>> (and merging 06 and 07 on the way ;-)?



Re: [PATCH v8 00/22] perf arm-spe: Refactor decoding & dumping flow

2020-11-11 Thread André Przywara
On 11/11/2020 16:15, Arnaldo Carvalho de Melo wrote:
> On Wed, Nov 11, 2020 at 01:10:51PM -0300, Arnaldo Carvalho de Melo wrote:
>> On Wed, Nov 11, 2020 at 03:11:27PM +0800, Leo Yan wrote:
>>> This is patch set v8 for refactoring Arm SPE trace decoding and dumping.
>>>
>>> This version addresses Andre's comment to pass parameter '&buf_len' at
>>> the last call arm_spe_pkt_snprintf() in the function arm_spe_pkt_desc().
>>>
>>> This patch set is cleanly applied on the top of perf/core branch
>>> with commit 644bf4b0f7ac ("perf jevents: Add test for arch std events").
>>>
>>> I retested this patch set on Hisilicon D06 platform with commands
>>> "perf report -D" and "perf script", compared the decoding results
>>> between with this patch set and without this patch set, "diff" tool
>>> shows the result as expected.
>>
>> With the patches I applied I'm getting:
>>
>> util/arm-spe-decoder/arm-spe-pkt-decoder.c: In function 'arm_spe_pkt_desc':
>> util/arm-spe-decoder/arm-spe-pkt-decoder.c:410:3: error: left shift count >= 
>> width of type [-Werror]
>>case 1: ns = !!(packet->payload & NS_FLAG);
>>^
>> util/arm-spe-decoder/arm-spe-pkt-decoder.c:411:4: error: left shift count >= 
>> width of type [-Werror]
>> el = (packet->payload & EL_FLAG) >> 61;
>> ^
>> util/arm-spe-decoder/arm-spe-pkt-decoder.c:411:4: error: left shift count >= 
>> width of type [-Werror]
>> util/arm-spe-decoder/arm-spe-pkt-decoder.c:416:3: error: left shift count >= 
>> width of type [-Werror]
>>case 3: ns = !!(packet->payload & NS_FLAG);
>>^
>>   CC   /tmp/build/perf/util/arm-spe-decoder/arm-spe-decoder.o
>>  
>>
>> On:
>>
>>   1611.70 android-ndk:r12b-arm  : FAIL arm-linux-androideabi-gcc 
>> (GCC) 4.9.x 20150123 (prerelease)
>>   1711.32 android-ndk:r15c-arm  : FAIL arm-linux-androideabi-gcc 
>> (GCC) 4.9.x 20150123 (prerelease)
>>
>> That were building ok before, builds still under way, perhaps it's just
>> on these old systems...
> 
> [acme@five perf]$ git bisect good
> cc6fa07fb1458cca3741919774eb050976471000 is the first bad commit
> commit cc6fa07fb1458cca3741919774eb050976471000
> Author: Leo Yan 
> Date:   Wed Nov 11 15:11:28 2020 +0800
> 
> perf arm-spe: Include bitops.h for BIT() macro
> 
> Include header linux/bitops.h, directly use its BIT() macro and remove
> the self defined macros.
> 
> Signed-off-by: Leo Yan 
> Reviewed-by: Andre Przywara 
> Link: https://lore.kernel.org/r/2020071149.815-2-leo@linaro.org
> Signed-off-by: Arnaldo Carvalho de Melo 
> 
>  tools/perf/util/arm-spe-decoder/arm-spe-decoder.c | 5 +
>  tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c | 3 +--
>  2 files changed, 2 insertions(+), 6 deletions(-)


Ah, thanks! I think I mentioned the missing usage of BIT_ULL() in an
earlier review, and thought this was fixed. Possibly this gets fixed in
a later patch in this series, and is a temporary regression?
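
For the record, the reason for the "left shift count >= width of type" error: on a 32-bit build unsigned long is only 32 bits wide, so a BIT() that shifts an unsigned long cannot express bits 61 to 63. A minimal user-space sketch of the 64-bit-safe form follows; BIT_ULL() is defined locally here as a stand-in for the kernel macro, and the two accessor functions are made up for illustration:

```c
#include <stdint.h>

/* Local stand-in for the kernel's BIT_ULL(): always a 64-bit shift. */
#define BIT_ULL(n)	(1ULL << (n))

#define NS_FLAG		BIT_ULL(63)
#define EL_FLAG		(BIT_ULL(62) | BIT_ULL(61))

/* Hypothetical accessor: non-secure flag, bit 63 of the payload. */
static int payload_ns(uint64_t payload)
{
	return !!(payload & NS_FLAG);
}

/* Hypothetical accessor: exception level, bits [62:61] of the payload. */
static int payload_el(uint64_t payload)
{
	return (int)((payload & EL_FLAG) >> 61);
}
```

Because the masks are unsigned long long regardless of the target, this compiles the same way on 32-bit and 64-bit hosts.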

How do you want to handle this? Shall Leo resend, amending this patch
(and merging 06 and 07 on the way ;-)?

Cheers,
Andre


Re: [PATCH v8 06/22] perf arm-spe: Refactor printing string to buffer

2020-11-11 Thread André Przywara
On 11/11/2020 15:35, Arnaldo Carvalho de Melo wrote:

Hi Arnaldo,

thanks for taking a look!

> On Wed, Nov 11, 2020 at 03:11:33PM +0800, Leo Yan wrote:
>> When outputting strings to the decoding buffer with snprintf(), the
>> SPE decoder needs to detect whether snprintf() returned an error and,
>> if so, bail out directly.  If snprintf() succeeds, the decoder needs to
>> advance the buffer pointer and reduce the buffer length so that it can
>> write the next string into the memory that follows.
>>
>> This logic is spread throughout arm_spe_pkt_desc(), so there is a lot
>> of duplicated code for detecting errors, incrementing the buffer
>> pointer and decrementing the buffer size.
>>
>> To avoid the duplicated code, this patch introduces a new helper
>> function, arm_spe_pkt_snprintf(), which wraps up this logic and is
>> used by the caller arm_spe_pkt_desc().
>>
>> This patch also makes 'blen' a local variable of the function, which
>> allows removing the unnecessary braces and improves readability.
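
The pattern described here (append, advance the pointer, shrink the remaining length, and make an error "sticky" so that callers only need to check once at the end) can be tried out in isolation. The following is a simplified user-space sketch and not the helper from the patch: buf_appendf() is a made-up name, it uses the plain "return value >= size" truncation test of vsnprintf(), and it reports truncation as -1, whereas the patch compares against (*blen - 1) and stores the positive return value.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: append formatted text at *buf_p, which has *blen
 * bytes left. Once *err is set, all further calls are no-ops that return
 * the stored error.
 */
static int buf_appendf(int *err, char **buf_p, size_t *blen,
		       const char *fmt, ...)
{
	va_list ap;
	int ret;

	/* Bail out if an earlier call already failed. */
	if (*err)
		return *err;

	va_start(ap, fmt);
	ret = vsnprintf(*buf_p, *blen, fmt, ap);
	va_end(ap);

	if (ret < 0) {
		/* Output error reported by vsnprintf(). */
		*err = ret;
	} else if ((size_t)ret >= *blen) {
		/* Output was truncated: record it and stop appending. */
		*err = -1;
		ret = -1;
	} else {
		/* Success: move past the text just written. */
		*buf_p += ret;
		*blen -= (size_t)ret;
	}

	return ret;
}
```

A caller then chains calls such as buf_appendf(&err, &p, &len, "EV") and buf_appendf(&err, &p, &len, " RETIRED") without checking each one, and looks at err once at the end.
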
>>
>> Suggested-by: Dave Martin 
>> Signed-off-by: Leo Yan 
>> Reviewed-by: Andre Przywara 
>> ---
>>  .../arm-spe-decoder/arm-spe-pkt-decoder.c | 260 +-
>>  1 file changed, 126 insertions(+), 134 deletions(-)
>>
>> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
>> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
>> index 04fd7fd7c15f..1970686f7020 100644
>> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
>> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
>> @@ -9,6 +9,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  
>>  #include "arm-spe-pkt-decoder.h"
>>  
>> @@ -258,192 +259,183 @@ int arm_spe_get_packet(const unsigned char *buf, 
>> size_t len,
>>  return ret;
>>  }
>>  
>> +static int arm_spe_pkt_snprintf(int *err, char **buf_p, size_t *blen,
>> +const char *fmt, ...)
>> +{
>> +va_list ap;
>> +int ret;
>> +
>> +/* Bail out if any error occurred */
>> +if (err && *err)
>> +return *err;
>> +
>> +va_start(ap, fmt);
>> +ret = vsnprintf(*buf_p, *blen, fmt, ap);
>> +va_end(ap);
>> +
>> +if (ret < 0) {
>> +if (err && !*err)
>> +*err = ret;
>> +
>> +/*
>> + * A return value of (*blen - 1) or more means that the
>> + * output was truncated and the buffer is overrun.
>> + */
>> +} else if (ret >= ((int)*blen - 1)) {
>> +(*buf_p)[*blen - 1] = '\0';
>> +
>> +/*
>> + * Set *err to 'ret' to avoid overflow if tries to
>> + * fill this buffer sequentially.
>> + */
>> +if (err && !*err)
>> +*err = ret;
>> +} else {
>> +*buf_p += ret;
>> +*blen -= ret;
>> +}
>> +
>> +return ret;
>> +}
>> +
>>  int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
>>   size_t buf_len)
>>  {
>>  int ret, ns, el, idx = packet->index;
>>  unsigned long long payload = packet->payload;
>>  const char *name = arm_spe_pkt_name(packet->type);
>> +size_t blen = buf_len;
>> +int err = 0;
>>  
>>  switch (packet->type) {
>>  case ARM_SPE_BAD:
>>  case ARM_SPE_PAD:
>>  case ARM_SPE_END:
>> -return snprintf(buf, buf_len, "%s", name);
>> -case ARM_SPE_EVENTS: {
>> -size_t blen = buf_len;
>> -
>> -ret = 0;
>> -ret = snprintf(buf, buf_len, "EV");
>> -buf += ret;
>> -blen -= ret;
>> -if (payload & 0x1) {
>> -ret = snprintf(buf, buf_len, " EXCEPTION-GEN");
>> -buf += ret;
>> -blen -= ret;
>> -}
>> -if (payload & 0x2) {
>> -ret = snprintf(buf, buf_len, " RETIRED");
>> -buf += ret;
>> -blen -= ret;
>> -}
>> -if (payload & 0x4) {
>> -ret = snprintf(buf, buf_len, " L1D-ACCESS");
>> -buf += ret;
>> -blen -= ret;
>> -}
>> -if (payload & 0x8) {
>> -ret = snprintf(buf, buf_len, " L1D-REFILL");
>> -buf += ret;
>> -blen -= ret;
>> -}
>> -if (payload & 0x10) {
>> -ret = snprintf(buf, buf_len, " TLB-ACCESS");
>> -buf += ret;
>> -blen -= ret;
>> -}
>> -if (payload & 0x20) {
>> -ret = snprintf(buf, buf_len, " TLB-REFILL");
>> -buf += ret;
>> -blen -= ret;
>> -}
>> -if (payload & 0x40) {
>> -ret = snprintf(buf, buf_len, " NOT-TAKEN");
>> -buf += ret;
>> -blen -= ret;
>> -  

Re: [PATCH] random: avoid arch_get_random_seed_long() when collecting IRQ randomness

2020-11-11 Thread André Przywara
On 11/11/2020 10:05, Ard Biesheuvel wrote:

Hi,

> On Wed, 11 Nov 2020 at 10:45, André Przywara  wrote:
>>
>> On 11/11/2020 08:19, Ard Biesheuvel wrote:
>>
>> Hi,
>>
>>> (+ Eric)
>>>
>>> On Thu, 5 Nov 2020 at 16:29, Ard Biesheuvel  wrote:
>>>>
>>>> When reseeding the CRNG periodically, arch_get_random_seed_long() is
>>>> called to obtain entropy from an architecture specific source if one
>>>> is implemented. In most cases, these are special instructions, but in
>>>> some cases, such as on ARM, we may want to back this using firmware
>>>> calls, which are considerably more expensive.
>>>>
>>>> Another call to arch_get_random_seed_long() exists in the CRNG driver,
>>>> in add_interrupt_randomness(), which collects entropy by capturing
>>>> inter-interrupt timing and relying on interrupt jitter to provide
>>>> random bits. This is done by keeping a per-CPU state, and mixing in
>>>> the IRQ number, the cycle counter and the return address every time an
>>>> interrupt is taken, and mixing this per-CPU state into the entropy pool
>>>> every 64 invocations, or at least once per second. The entropy that is
>>>> gathered this way is credited as 1 bit of entropy. Every time this
>>>> happens, arch_get_random_seed_long() is invoked, and the result is
>>>> mixed in as well, and also credited with 1 bit of entropy.
>>>>
>>>> This means that arch_get_random_seed_long() is called at least once
>>>> per second on every CPU, which seems excessive, and doesn't really
>>>> scale, especially in a virtualization scenario where CPUs may be
>>>> oversubscribed: in cases where arch_get_random_seed_long() is backed
>>>> by an instruction that actually goes back to a shared hardware entropy
>>>> source (such as RNDRRS on ARM), we will end up hitting it hundreds of
>>>> times per second.
>>
>> May I ask why this should be a particular problem? From what I gathered
>> on the web, it seems like most h/w RNGs have a capacity of multiple
>> MBit/s. Wikipedia [1] suggests that the x86 CPU instructions generate at
>> least 20 Mbit/s (worst case: AMD's 2500 cycles @ 800 MHz), and I
>> measured around 78 Mbit/s with the raw entropy source on my Juno
>> (possibly even limited by slow MMIO).
>> So it seems unlikely that a few kbit/s drain the hardware entropy source.
>>
>> If we consider this interface comparatively cheap, should we then not try
>> to plug the Arm firmware interface into this?
>>
> 
> I'm not sure I follow. Are you saying we should not wire up a
> comparatively expensive firmware interface to
> arch_get_random_seed_long() because we currently assume it is backed
> by something cheap?

Yes. I wanted to (ab)use this patch to clarify this. x86 and arm64 use
CPU instructions (so far), S390 copies from some buffer. PPC uses either
a CPU instruction or an MMIO access. All of these I would consider
comparatively cheap, especially when compared to a firmware call with
unknown costs. In fact the current Trusted Firmware implementation[1] is
not really terse; also, the generic SMC dispatcher calls a platform
defined routine, which could do anything.
So to also guide the implementation in TF-A, it would be good to
establish what arch_get_random is expected to be. The current
implementations and the fact that it lives in a header file suggest
that it's meant as a slim wrapper around something cheap.

> Because doing so would add significantly to the cost. Also note that a
> firmware interface would permit other ways of gathering entropy that
> are not necessarily backed by a dedicated high bandwidth noise source
> (and we already have examples of this)

Yes, agreed.
So I have a hwrng driver for the Arm SMCCC TRNG interface ready. I would
post this, but would like to know if we should drop the proposed
arch_get_random implementation [2][3] of this interface.

>> I am not against this patch, actually am considering this a nice
>> cleanup, to separate interrupt generated entropy from other sources.
>> Especially since we call arch_get_random_seed_long() under a spinlock here.
>> But I am curious about the expectations from arch_get_random in general.
>>
> 
> I think it is reasonable to clean this up a little bit. A random
> *seed* is not the same thing as a random number, and given that we
> expose both interfaces, it makes sense to permit the seed variant to
> be more costly, and only use it as intended (i.e., to seed a random
> number generator)

That's true, it seems we chickened out on the arm64 implementation
alread

Re: [PATCH v8 00/22] perf arm-spe: Refactor decoding & dumping flow

2020-11-11 Thread André Przywara
On 11/11/2020 07:11, Leo Yan wrote:

Hi Arnaldo, Ingo, Peter, (whoever feels responsible for taking this)

> This is patch set v8 for refactoring Arm SPE trace decoding and dumping.
I have reviewed every patch of this in anger, and am now fine with this
series. Given the bugs fixed, the improvements it brings in terms of
readability and maintainability, and the low risk of it breaking
things, I would be happy to see it merged.

Thanks,
Andre.

> This version addresses Andre's comment to pass parameter '&buf_len' at
> the last call arm_spe_pkt_snprintf() in the function arm_spe_pkt_desc().
> 
> This patch set is cleanly applied on the top of perf/core branch
> with commit 644bf4b0f7ac ("perf jevents: Add test for arch std events").
> 
> I retested this patch set on Hisilicon D06 platform with commands
> "perf report -D" and "perf script", compared the decoding results
> between with this patch set and without this patch set, "diff" tool
> shows the result as expected.
> 
> Changes from v7:
> - Changed to pass '&buf_len' for the last call arm_spe_pkt_snprintf() in
>   the patch 07/22 (Andre).
> 
> Changes from v6:
> - Removed the redundant comma from the string in the patch 21/22 "perf
>   arm_spe: Decode memory tagging properties" (Dave);
> - Refined the return value for arm_spe_pkt_desc(): returns 0 for
>   success, otherwise returns non zero for failures; handle error code at
>   the end of function arm_spe_pkt_desc(); this is accomplished in the
>   new patch 07/22 "perf arm-spe: Consolidate arm_spe_pkt_desc()'s
>   return value" (Dave).
> 
> Changes from v5:
> - Directly bail out arm_spe_pkt_snprintf() if any error occurred
>   (Andre).
> 
> Changes from v4:
> - Implemented a cumulative error for arm_spe_pkt_snprintf() and changed
>   to condense code for printing strings (Dave);
> - Changed to check payload bits [55:52] for parse kernel address
>   (Andre).
> 
> Changes from v3:
> - Refined arm_spe_payload_len() and removed macro SPE_HEADER_SZ()
>   (Andre);
> - Refined packet header index macros (Andre);
> - Added patch "perf arm_spe: Fixup top byte for data virtual address" to
>   fixup the data virtual address for 64KB pages and refined comments for
>   the fixup (Andre);
> - Added Andre's review tag (using "b4 am" command);
> - Changed the macros to SPE_PKT_IS_XXX() format to check operation types
>   (Andre).
> 
> 
> Andre Przywara (1):
>   perf arm_spe: Decode memory tagging properties
> 
> Leo Yan (20):
>   perf arm-spe: Include bitops.h for BIT() macro
>   perf arm-spe: Fix a typo in comment
>   perf arm-spe: Refactor payload size calculation
>   perf arm-spe: Refactor arm_spe_get_events()
>   perf arm-spe: Fix packet length handling
>   perf arm-spe: Refactor printing string to buffer
>   perf arm-spe: Consolidate arm_spe_pkt_desc()'s return value
>   perf arm-spe: Refactor packet header parsing
>   perf arm-spe: Add new function arm_spe_pkt_desc_addr()
>   perf arm-spe: Refactor address packet handling
>   perf arm_spe: Fixup top byte for data virtual address
>   perf arm-spe: Refactor context packet handling
>   perf arm-spe: Add new function arm_spe_pkt_desc_counter()
>   perf arm-spe: Refactor counter packet handling
>   perf arm-spe: Add new function arm_spe_pkt_desc_event()
>   perf arm-spe: Refactor event type handling
>   perf arm-spe: Remove size condition checking for events
>   perf arm-spe: Add new function arm_spe_pkt_desc_op_type()
>   perf arm-spe: Refactor operation packet handling
>   perf arm-spe: Add more sub classes for operation packet
> 
> Wei Li (1):
>   perf arm-spe: Add support for ARMv8.3-SPE
> 
>  .../util/arm-spe-decoder/arm-spe-decoder.c|  59 +-
>  .../util/arm-spe-decoder/arm-spe-decoder.h|  17 -
>  .../arm-spe-decoder/arm-spe-pkt-decoder.c | 601 ++
>  .../arm-spe-decoder/arm-spe-pkt-decoder.h | 122 +++-
>  tools/perf/util/arm-spe.c |   2 +-
>  5 files changed, 479 insertions(+), 322 deletions(-)
> 



Re: [PATCH] random: avoid arch_get_random_seed_long() when collecting IRQ randomness

2020-11-11 Thread André Przywara
On 11/11/2020 08:19, Ard Biesheuvel wrote:

Hi,

> (+ Eric)
> 
> On Thu, 5 Nov 2020 at 16:29, Ard Biesheuvel  wrote:
>>
>> When reseeding the CRNG periodically, arch_get_random_seed_long() is
>> called to obtain entropy from an architecture specific source if one
>> is implemented. In most cases, these are special instructions, but in
>> some cases, such as on ARM, we may want to back this using firmware
>> calls, which are considerably more expensive.
>>
>> Another call to arch_get_random_seed_long() exists in the CRNG driver,
>> in add_interrupt_randomness(), which collects entropy by capturing
>> inter-interrupt timing and relying on interrupt jitter to provide
>> random bits. This is done by keeping a per-CPU state, and mixing in
>> the IRQ number, the cycle counter and the return address every time an
>> interrupt is taken, and mixing this per-CPU state into the entropy pool
>> every 64 invocations, or at least once per second. The entropy that is
>> gathered this way is credited as 1 bit of entropy. Every time this
>> happens, arch_get_random_seed_long() is invoked, and the result is
>> mixed in as well, and also credited with 1 bit of entropy.
>>
>> This means that arch_get_random_seed_long() is called at least once
>> per second on every CPU, which seems excessive, and doesn't really
>> scale, especially in a virtualization scenario where CPUs may be
>> oversubscribed: in cases where arch_get_random_seed_long() is backed
>> by an instruction that actually goes back to a shared hardware entropy
>> source (such as RNDRRS on ARM), we will end up hitting it hundreds of
>> times per second.

May I ask why this should be a particular problem? From what I gathered
on the web, it seems like most h/w RNGs have a capacity of multiple
MBit/s. Wikipedia [1] suggests that the x86 CPU instructions generate at
least 20 Mbit/s (worst case: AMD's 2500 cycles @ 800 MHz), and I
measured around 78 Mbit/s with the raw entropy source on my Juno
(possibly even limited by slow MMIO).
So it seems unlikely that a few kbit/s drain the hardware entropy source.
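
As a rough cross-check of the numbers above, here is a minimal sketch in C.
It assumes that one RNG instruction delivers 64 bits per call (the width is
an assumption for illustration, not stated above); the helper names
rng_rate_bps() and rng_demand_bps() are made up for this example.

```c
#include <assert.h>

/*
 * Sustained output rate, in bits per second, of an RNG instruction that
 * delivers `bits_per_call` bits every `cycles_per_call` cycles on a CPU
 * clocked at `clock_hz`.
 */
static double rng_rate_bps(double bits_per_call, double cycles_per_call,
			   double clock_hz)
{
	assert(cycles_per_call > 0.0);
	return bits_per_call * clock_hz / cycles_per_call;
}

/* Demand, in bits per second, created by `calls_per_sec` calls. */
static double rng_demand_bps(double calls_per_sec, double bits_per_call)
{
	return calls_per_sec * bits_per_call;
}
```

With these, rng_rate_bps(64, 2500, 800e6) comes out at 20.48 Mbit/s, while
rng_demand_bps(300, 64), with 300 calls per second as an arbitrary stand-in
for "hundreds of times per second", is 19.2 kbit/s, roughly three orders of
magnitude lower.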

If we consider this interface comparatively cheap, should we then not try
to plug the Arm firmware interface into this?

I am not against this patch, actually am considering this a nice
cleanup, to separate interrupt generated entropy from other sources.
Especially since we call arch_get_random_seed_long() under a spinlock here.
But I am curious about the expectations from arch_get_random in general.

>> So let's drop the call to arch_get_random_seed_long() from
>> add_interrupt_randomness(), and instead, rely on crng_reseed() to call
>> the arch hook to get random seed material from the platform.

So I tested this and it works as expected: I see some calls on
initialisation, then a handful of calls every few seconds from the
periodic reseeding. The large number of calls every second are gone.

>>
>> Signed-off-by: Ard Biesheuvel 

Since the above questions are unrelated to this particular patch:

Reviewed-by: Andre Przywara 
Tested-by: Andre Przywara 

Cheers,
Andre

[1] https://en.wikipedia.org/wiki/RDRAND#Performance

>> ---
>>  drivers/char/random.c | 15 +--
>>  1 file changed, 1 insertion(+), 14 deletions(-)
>>
>> diff --git a/drivers/char/random.c b/drivers/char/random.c
>> index 2a41b21623ae..a9c393c1466d 100644
>> --- a/drivers/char/random.c
>> +++ b/drivers/char/random.c
>> @@ -1261,8 +1261,6 @@ void add_interrupt_randomness(int irq, int irq_flags)
>> cycles_t cycles = random_get_entropy();
>> __u32   c_high, j_high;
>> __u64   ip;
>> -   unsigned long   seed;
>> -   int credit = 0;
>>
>> if (cycles == 0)
>> cycles = get_reg(fast_pool, regs);
>> @@ -1298,23 +1296,12 @@ void add_interrupt_randomness(int irq, int irq_flags)
>>
>> fast_pool->last = now;
>> __mix_pool_bytes(r, &fast_pool->pool, sizeof(fast_pool->pool));
>> -
>> -   /*
>> -* If we have architectural seed generator, produce a seed and
>> -* add it to the pool.  For the sake of paranoia don't let the
>> -* architectural seed generator dominate the input from the
>> -* interrupt noise.
>> -*/
>> -   if (arch_get_random_seed_long(&seed)) {
>> -   __mix_pool_bytes(r, &seed, sizeof(seed));
>> -   credit = 1;
>> -   }
>> spin_unlock(&r->lock);
>>
>> fast_pool->count = 0;
>>
>> /* award one bit for the contents of the fast pool */
>> -   credit_entropy_bits(r, credit + 1);
>> +   credit_entropy_bits(r, 1);
>>  }
>>  EXPORT_SYMBOL_GPL(add_interrupt_randomness);
>>
>> --
>> 2.17.1
>>



Re: [PATCH v7 07/22] perf arm-spe: Consolidate arm_spe_pkt_desc()'s return value

2020-11-09 Thread André Przywara
On 06/11/2020 01:41, Leo Yan wrote:
> arm_spe_pkt_desc() returns the length of the consumed buffer in the
> success case; otherwise, it delivers the return value from
> arm_spe_pkt_snprintf(), and returns the last return value if there are
> multiple calls to arm_spe_pkt_snprintf().
> 
> arm_spe_pkt_snprintf() has the same return value semantics as
> vsnprintf(), and vsnprintf() might return a value equal to or bigger
> than the parameter 'size' to indicate truncation.  Because the
> return value is >= 0 when the string is truncated, this condition will
> be returned up the stack as "success".
> 
> This patch simplifies the return value of arm_spe_pkt_desc(): '0' means
> success and other values mean an error has occurred.  To realize this,
> it relies on arm_spe_pkt_snprintf()'s parameter 'err'; 'err' is a
> cumulative value, so its final value is returned whether the printing
> helper is called once or multiple times.
> 
> To unify the error value generation, this patch handles errors in a
> central place: rather than bailing out directly in the switch cases,
> it returns the error at the end of arm_spe_pkt_desc().

And the second-to-last hunk means it will basically always return 0?
Just checking, it's probably fine (and I don't want to review a v8 ;-)

> This patch changes the caller arm_spe_dump() to respect the updated
> return value semantics of arm_spe_pkt_desc().
> 
> Suggested-by: Dave Martin 
> Signed-off-by: Leo Yan 

One tiny thing below ...

> ---
>  .../arm-spe-decoder/arm-spe-pkt-decoder.c | 128 +-
>  tools/perf/util/arm-spe.c |   2 +-
>  2 files changed, 68 insertions(+), 62 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 1970686f7020..33baef0c2c0b 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -301,9 +301,10 @@ static int arm_spe_pkt_snprintf(int *err, char **buf_p, 
> size_t *blen,
>  int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
>size_t buf_len)
>  {
> - int ret, ns, el, idx = packet->index;
> + int ns, el, idx = packet->index;
>   unsigned long long payload = packet->payload;
>   const char *name = arm_spe_pkt_name(packet->type);
> + char *buf_orig = buf;
>   size_t blen = buf_len;
>   int err = 0;
>  
> @@ -311,82 +312,76 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, 
> char *buf,
>   case ARM_SPE_BAD:
>   case ARM_SPE_PAD:
>   case ARM_SPE_END:
> - return arm_spe_pkt_snprintf(&err, &buf, &blen, "%s", name);
> + arm_spe_pkt_snprintf(&err, &buf, &blen, "%s", name);
> + break;
>   case ARM_SPE_EVENTS:
> - ret = arm_spe_pkt_snprintf(&err, &buf, &blen, "EV");
> + arm_spe_pkt_snprintf(&err, &buf, &blen, "EV");
>  
>   if (payload & 0x1)
> - ret = arm_spe_pkt_snprintf(&err, &buf, &blen, " 
> EXCEPTION-GEN");
> + arm_spe_pkt_snprintf(&err, &buf, &blen, " 
> EXCEPTION-GEN");
>   if (payload & 0x2)
> - ret = arm_spe_pkt_snprintf(&err, &buf, &blen, " 
> RETIRED");
> + arm_spe_pkt_snprintf(&err, &buf, &blen, " RETIRED");
>   if (payload & 0x4)
> - ret = arm_spe_pkt_snprintf(&err, &buf, &blen, " 
> L1D-ACCESS");
> + arm_spe_pkt_snprintf(&err, &buf, &blen, " L1D-ACCESS");
>   if (payload & 0x8)
> - ret = arm_spe_pkt_snprintf(&err, &buf, &blen, " 
> L1D-REFILL");
> + arm_spe_pkt_snprintf(&err, &buf, &blen, " L1D-REFILL");
>   if (payload & 0x10)
> - ret = arm_spe_pkt_snprintf(&err, &buf, &blen, " 
> TLB-ACCESS");
> + arm_spe_pkt_snprintf(&err, &buf, &blen, " TLB-ACCESS");
>   if (payload & 0x20)
> - ret = arm_spe_pkt_snprintf(&err, &buf, &blen, " 
> TLB-REFILL");
> + arm_spe_pkt_snprintf(&err, &buf, &blen, " TLB-REFILL");
>   if (payload & 0x40)
> - ret = arm_spe_pkt_snprintf(&err, &buf, &blen, " 
> NOT-TAKEN");
> + arm_spe_pkt_snprintf(&err, &buf, &blen, " NOT-TAKEN");
>   if (payload & 0x80)
> - ret = arm_spe_pkt_snprintf(&err, &buf, &blen, " 
> MISPRED");
> + arm_spe_pkt_snprintf(&err, &buf, &blen, " MISPRED");
>   if (idx > 1) {
>   if (payload & 0x100)
> - ret = arm_spe_pkt_snprintf(&err, &buf, &blen, " 
> LLC-ACCESS");
> + arm_spe_pkt_snprintf(&err, &buf, &blen, " 
> LLC-ACCESS");
>   if (payload & 0x200)
> - ret = arm_spe_pkt_snprintf(&err, &buf, &blen, " 
> LLC-REFI

Re: [PATCH v7 06/22] perf arm-spe: Refactor printing string to buffer

2020-11-09 Thread André Przywara
On 06/11/2020 01:41, Leo Yan wrote:
> When outputting strings to the decoding buffer with snprintf(), the
> SPE decoder needs to detect whether snprintf() returned an error and,
> if so, bail out directly.  If snprintf() succeeds, it needs to update
> the buffer pointer and reduce the buffer length so that it can continue
> to output the next string into the following memory space.
> 
> This complex logic is spread throughout arm_spe_pkt_desc(), so there
> is a lot of duplicate code for detecting errors, incrementing the
> buffer pointer and decrementing the buffer size.
> 
> To avoid the duplicate code, this patch introduces a new helper function
> arm_spe_pkt_snprintf() which wraps up the complex logic, and which is
> used by the caller arm_spe_pkt_desc().
> 
> This patch also makes the variable 'blen' a local variable of the
> function, which allows removing the unnecessary braces and improves
> readability.
> 
> Suggested-by: Dave Martin 
> Signed-off-by: Leo Yan 

Well, I am not sure this is particularly easier to review ;-), but here
we go:

Checked - vs. + in an editor to verify the transformation.

I also put the new printf routine into some very simple test program,
and it seems to work as advertised: buffer overflows are detected, and
the string never gets bigger or loses the terminating 0.
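
A minimal sketch of what such a test program could look like (this is not
the program actually used). The helper is copied from the patch below; the
driver spe_snprintf_demo(), its strings and the buffer sizes are made up
for illustration.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Helper as introduced by the patch under review. */
static int arm_spe_pkt_snprintf(int *err, char **buf_p, size_t *blen,
				const char *fmt, ...)
{
	va_list ap;
	int ret;

	/* Bail out if any error occurred */
	if (err && *err)
		return *err;

	va_start(ap, fmt);
	ret = vsnprintf(*buf_p, *blen, fmt, ap);
	va_end(ap);

	if (ret < 0) {
		if (err && !*err)
			*err = ret;
	} else if (ret >= ((int)*blen - 1)) {
		/* Treated as truncation: terminate and record the error. */
		(*buf_p)[*blen - 1] = '\0';
		if (err && !*err)
			*err = ret;
	} else {
		*buf_p += ret;
		*blen -= ret;
	}

	return ret;
}

/*
 * Hypothetical driver: append a few event names to 'out' and return the
 * accumulated error, where 0 means that everything fitted.
 */
static int spe_snprintf_demo(char *out, size_t len)
{
	char *p = out;
	size_t blen = len;
	int err = 0;

	assert(len > 0);

	arm_spe_pkt_snprintf(&err, &p, &blen, "EV");
	arm_spe_pkt_snprintf(&err, &p, &blen, " RETIRED");
	arm_spe_pkt_snprintf(&err, &p, &blen, " L1D-ACCESS");
	/* Becomes a no-op once an earlier call has set 'err'. */
	arm_spe_pkt_snprintf(&err, &p, &blen, " TLB-ACCESS");

	/* The string must always stay terminated inside the buffer. */
	assert(strlen(out) < len);

	return err;
}
```

Calling spe_snprintf_demo() with a 16-byte buffer makes the third call hit
the truncation branch, so the result is non-zero and the buffer holds a
terminated, shortened string; with a 64-byte buffer all four strings fit
and the result is 0.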

So that looks alright to me:

Reviewed-by: Andre Przywara 

Cheers,
Andre

> ---
>  .../arm-spe-decoder/arm-spe-pkt-decoder.c | 260 +-
>  1 file changed, 126 insertions(+), 134 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 04fd7fd7c15f..1970686f7020 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -9,6 +9,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include "arm-spe-pkt-decoder.h"
>  
> @@ -258,192 +259,183 @@ int arm_spe_get_packet(const unsigned char *buf, 
> size_t len,
>   return ret;
>  }
>  
> +static int arm_spe_pkt_snprintf(int *err, char **buf_p, size_t *blen,
> + const char *fmt, ...)
> +{
> + va_list ap;
> + int ret;
> +
> + /* Bail out if any error occurred */
> + if (err && *err)
> + return *err;
> +
> + va_start(ap, fmt);
> + ret = vsnprintf(*buf_p, *blen, fmt, ap);
> + va_end(ap);
> +
> + if (ret < 0) {
> + if (err && !*err)
> + *err = ret;
> +
> + /*
> +  * A return value of (*blen - 1) or more means that the
> +  * output was truncated and the buffer is overrun.
> +  */
> + } else if (ret >= ((int)*blen - 1)) {
> + (*buf_p)[*blen - 1] = '\0';
> +
> + /*
> +  * Set *err to 'ret' to avoid overflow if tries to
> +  * fill this buffer sequentially.
> +  */
> + if (err && !*err)
> + *err = ret;
> + } else {
> + *buf_p += ret;
> + *blen -= ret;
> + }
> +
> + return ret;
> +}
> +
>  int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
>size_t buf_len)
>  {
>   int ret, ns, el, idx = packet->index;
>   unsigned long long payload = packet->payload;
>   const char *name = arm_spe_pkt_name(packet->type);
> + size_t blen = buf_len;
> + int err = 0;
>  
>   switch (packet->type) {
>   case ARM_SPE_BAD:
>   case ARM_SPE_PAD:
>   case ARM_SPE_END:
> - return snprintf(buf, buf_len, "%s", name);
> - case ARM_SPE_EVENTS: {
> - size_t blen = buf_len;
> -
> - ret = 0;
> - ret = snprintf(buf, buf_len, "EV");
> - buf += ret;
> - blen -= ret;
> - if (payload & 0x1) {
> - ret = snprintf(buf, buf_len, " EXCEPTION-GEN");
> - buf += ret;
> - blen -= ret;
> - }
> - if (payload & 0x2) {
> - ret = snprintf(buf, buf_len, " RETIRED");
> - buf += ret;
> - blen -= ret;
> - }
> - if (payload & 0x4) {
> - ret = snprintf(buf, buf_len, " L1D-ACCESS");
> - buf += ret;
> - blen -= ret;
> - }
> - if (payload & 0x8) {
> - ret = snprintf(buf, buf_len, " L1D-REFILL");
> - buf += ret;
> - blen -= ret;
> - }
> - if (payload & 0x10) {
> - ret = snprintf(buf, buf_len, " TLB-ACCESS");
> - buf += ret;
> - blen -= ret;
> - }
> - if (payload & 0x20) {
> - ret = snprintf(buf, buf_len, " TLB-REFILL");
> - buf += ret;
> - ble

Re: [PATCH v2 3/5] ARM: implement support for SMCCC TRNG entropy source

2020-11-05 Thread André Przywara
On 05/11/2020 17:15, kernel test robot wrote:
> Hi Andre,
> 
> I love your patch! Yet something to improve:
> 
> [auto build test ERROR on linus/master]
> [also build test ERROR on v5.10-rc2 next-20201105]
> [cannot apply to arm64/for-next/core kvmarm/next arm-perf/for-next/perf]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch]
> 
> url:
> https://github.com/0day-ci/linux/commits/Andre-Przywara/ARM-arm64-Add-SMCCC-TRNG-entropy-service/20201105-205934
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
> 4ef8451b332662d004df269d4cdeb7d9f31419b5
> config: arm-randconfig-s031-20201105 (attached as .config)
> compiler: arm-linux-gnueabi-gcc (GCC) 9.3.0
> reproduce:
> wget 
> https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
> ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # apt-get install sparse
> # sparse version: v0.6.3-76-gf680124b-dirty
> # 
> https://github.com/0day-ci/linux/commit/1f0c18ec0b7aa0a67d7cdea2b1beb5e7b38c5f4b
> git remote add linux-review https://github.com/0day-ci/linux
> git fetch --no-tags linux-review 
> Andre-Przywara/ARM-arm64-Add-SMCCC-TRNG-entropy-service/20201105-205934
> git checkout 1f0c18ec0b7aa0a67d7cdea2b1beb5e7b38c5f4b
> # save the attached .config to linux build tree
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross C=1 
> CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=arm 
> 
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot 
> 
> All errors (new ones prefixed by >>):
> 
>arm-linux-gnueabi-ld: drivers/char/random.o: in function 
> `arch_get_random_seed_long':
>>> arch/arm/include/asm/archrandom.h:48: undefined reference to 
>>> `arm_smccc_1_1_get_conduit'
>>> arm-linux-gnueabi-ld: arch/arm/include/asm/archrandom.h:48: undefined 
>>> reference to `smccc_trng_available'

Found it, we should use "depends on HAVE_ARM_SMCCC_DISCOVERY" instead of
HAVE_ARM_SMCCC, because only the former guarantees the build of smccc.c.
Will fix it in the next version (provided we keep this arch random part).

Thanks,
Andre

> 
> vim +48 arch/arm/include/asm/archrandom.h
> 
> 42
> 43static inline bool __must_check 
> arch_get_random_seed_long(unsigned long *v)
> 44{
> 45struct arm_smccc_res res;
> 46
> 47if (smccc_trng_available) {
>   > 48arm_smccc_1_1_invoke(ARM_SMCCC_TRNG_RND32, 8 * 
> sizeof(*v), &res);
> 49
> 50if (res.a0 != 0)
> 51return false;
> 52
> 53*v = res.a3;
> 54return true;
> 55}
> 56
> 57return false;
> 58}
> 59
> 
> ---
> 0-DAY CI Kernel Test Service, Intel Corporation
> https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org
> 



Re: [PATCH v2 4/5] arm64: Add support for SMCCC TRNG entropy source

2020-11-05 Thread André Przywara
On 05/11/2020 14:03, Mark Rutland wrote:
> On Thu, Nov 05, 2020 at 01:41:42PM +, Mark Brown wrote:
>> On Thu, Nov 05, 2020 at 12:56:55PM +, Andre Przywara wrote:
>>
>>>  static inline bool __must_check arch_get_random_seed_int(unsigned int *v)
>>>  {
>>> +   struct arm_smccc_res res;
>>> unsigned long val;
>>> -   bool ok = arch_get_random_seed_long(&val);
>>>  
>>> -   *v = val;
>>> -   return ok;
>>> +   if (cpus_have_const_cap(ARM64_HAS_RNG)) {
>>> +   if (arch_get_random_seed_long(&val)) {
>>> +   *v = val;
>>> +   return true;
>>> +   }
>>> +   return false;
>>> +   }
>>
>> It isn't obvious to me why we don't fall through to trying the SMCCC
>> TRNG here if for some reason the v8.5-RNG didn't give us something.
>> Definitely an obscure possibility but still...
> 
> I think it's better to assume that if we have a HW RNG and it's not
> giving us entropy, it's not worthwhile trapping to the host, which might
> encounter the exact same issue.
> 
> I'd rather we have one RNG source that we trust works, and use that
> exclusively.
> 
> That said, I'm not sure it's great to plumb this under the
> arch_get_random*() interfaces, e.g. given this means that
> add_interrupt_randomness() will end up trapping to the host all the time
> when it calls arch_get_random_seed_long().
> 
> Is there an existing interface for "slow" runtime entropy that we can
> plumb this into instead?

There is the framework implementing /dev/hwrng, and in fact I started
with a driver for that (have that in some working state).
But this is only available somewhat late in the game (after drivers get
initialised), and Ard mentioned that one advantage of the firmware i/f
is (somewhat) early availability. Now for SMCCC we need firmware tables
(for the conduit), so it's not too early either.

If too frequent firmware traps are a concern, we could always request
the maximum 192 bits, and store them. That would avoid 2/3 of the
current traps.
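
A rough sketch of that caching idea, purely illustrative: struct
trng_cache, trng_cache_get() and the fetch callback are made-up names, the
callback stands in for the real firmware call, and fake_trng_fetch() only
exists to exercise the logic. A real kernel version would also need
per-CPU storage or locking, which is left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TRNG_MAX_WORDS	3	/* 192 bits as three 64-bit words */

/* Hypothetical stand-in for the firmware call; fills all three words. */
typedef bool (*trng_fetch_fn)(uint64_t words[TRNG_MAX_WORDS]);

struct trng_cache {
	uint64_t words[TRNG_MAX_WORDS];
	unsigned int left;	/* unused words still in the cache */
	unsigned long traps;	/* number of firmware calls issued */
};

/* Hand out one 64-bit word, trapping to firmware only when empty. */
static bool trng_cache_get(struct trng_cache *c, trng_fetch_fn fetch,
			   uint64_t *v)
{
	assert(c && fetch && v);

	if (c->left == 0) {
		c->traps++;
		if (!fetch(c->words))
			return false;
		c->left = TRNG_MAX_WORDS;
	}

	*v = c->words[--c->left];
	c->words[c->left] = 0;	/* do not keep consumed entropy around */
	return true;
}

/* Test-only fake: returns counter values instead of real entropy. */
static bool fake_trng_fetch(uint64_t words[TRNG_MAX_WORDS])
{
	static uint64_t counter;
	int i;

	for (i = 0; i < TRNG_MAX_WORDS; i++)
		words[i] = ++counter;
	return true;
}
```

With a zero-initialised cache, nine calls to trng_cache_get() result in
three calls to the fetch routine instead of nine, which is where the
"2/3 of the traps" figure comes from.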

Cheers,
Andre



Re: [PATCH v6 06/21] perf arm-spe: Refactor printing string to buffer

2020-11-03 Thread André Przywara
On 03/11/2020 06:40, Leo Yan wrote:

Hi Dave, Leo,

> On Mon, Nov 02, 2020 at 05:06:53PM +, Dave Martin wrote:
>> On Fri, Oct 30, 2020 at 02:57:09AM +, Leo Yan wrote:
>>> When outputting strings to the decoding buffer with snprintf(), the
>>> SPE decoder needs to detect whether snprintf() returned an error and,
>>> if so, bail out directly.  If snprintf() succeeds, it needs to update
>>> the buffer pointer and reduce the buffer length so that it can continue
>>> to output the next string into the following memory space.
>>>
>>> This complex logic is spread throughout arm_spe_pkt_desc(), so there
>>> is a lot of duplicate code for detecting errors, incrementing the
>>> buffer pointer and decrementing the buffer size.
>>>
>>> To avoid the duplicate code, this patch introduces a new helper function
>>> arm_spe_pkt_snprintf() which wraps up the complex logic, and which is
>>> used by the caller arm_spe_pkt_desc(); if the helper is called multiple
>>> times in a flow, the error is cumulative and its final value is simply
>>> returned.
>>>
>>> This patch also makes the variable 'blen' a local variable of the
>>> function, which allows removing the unnecessary braces and improves
>>> readability.
>>>
>>> Suggested-by: Dave Martin 
>>> Signed-off-by: Leo Yan 
>>
>> This looks like a good refactoring now, but as pointed out by Andre this
>> patch is now rather hard to review, since it combines the refactoring
>> with other changes.
>>
>> If reposting this series, it would be good if this could be split into a
>> first patch that introduces arm_spe_pkt_snprintf() and just updates each
>> snprintf() call site to use it, but without moving other code around or
>> optimising anything, followed by one or more patches that clean up and
>> simplify arm_spe_pkt_desc().
> 
> I will respin the patch set and follow this approach.

Well, I am afraid this is not easily possible.

Dave: this patch is basically following the pattern turning this:
===
if (condition) {
ret = snprintf(buf, buf_len, "foo");
buf += ret;
blen -= ret;
}
...
if (ret < 0)
return ret;
blen -= ret;
return buf_len - blen;
===
into this:
---
if (condition)
arm_spe_pkt_snprintf(&err, &buf, &blen, "foo");
...
return err ?: (int)(buf_len - blen);
---

And "diff" is getting really ahead of itself here and tries to be super
clever, which leads to this hard to read patch.

But I don't think there is anything we can really do here, this is
already the minimal version. Leo adds the optimisations only later on,
in other patches.

Cheers,
Andre


> 
>> If the series is otherwise mature though, then this rework may be
>> overkill.
>>
>>> ---
>>>  .../arm-spe-decoder/arm-spe-pkt-decoder.c | 267 --
>>>  1 file changed, 117 insertions(+), 150 deletions(-)
>>>
>>> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
>>> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
>>> index 04fd7fd7c15f..1ecaf9805b79 100644
>>> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
>>> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
>>> @@ -9,6 +9,7 @@
>>>  #include 
>>>  #include 
>>>  #include 
>>> +#include 
>>>  
>>>  #include "arm-spe-pkt-decoder.h"
>>>  
>>> @@ -258,192 +259,158 @@ int arm_spe_get_packet(const unsigned char *buf, 
>>> size_t len,
>>> return ret;
>>>  }
>>>  
>>> +static int arm_spe_pkt_snprintf(int *err, char **buf_p, size_t *blen,
>>> +   const char *fmt, ...)
>>> +{
>>> +   va_list ap;
>>> +   int ret;
>>> +
>>> +   /* Bail out if any error occurred */
>>> +   if (err && *err)
>>> +   return *err;
>>> +
>>> +   va_start(ap, fmt);
>>> +   ret = vsnprintf(*buf_p, *blen, fmt, ap);
>>> +   va_end(ap);
>>> +
>>> +   if (ret < 0) {
>>> +   if (err && !*err)
>>> +   *err = ret;
>>
>> What happens on buffer overrun (i.e., ret >= *blen)?
>>
>> It looks to me like we'll advance buf_p too far, blen will wrap around,
>> and the string at *buf_p won't be null terminated.  Because the return
>> value is still >= 0, this condition will be returned up the stack as
>> "success".
> 
> Thanks for pointing this out.  I had not noticed the potential issue
> caused by the returned value (ret >= *blen); I checked the manual
> again, and it says:
> 
> "The functions snprintf() and vsnprintf() do not write more than size
> bytes (including the terminating null byte ('\0')).  If the output was
> truncated due to this limit, then the return value is the number of
> characters (excluding  the  terminating  null  byte) which would have
> been written to the final string if enough space had been available.
> Thus, a return value of size or more means that the output was
> truncated."
> 
>> Perhaps this can never happen given the actual buffer size

Re: [PATCH v6 06/21] perf arm-spe: Refactor printing string to buffer

2020-11-02 Thread André Przywara
On 30/10/2020 02:57, Leo Yan wrote:
> When outputs strings to the decoding buffer with function snprintf(),
> SPE decoder needs to detects if any error returns from snprintf() and if
> so needs to directly bail out.  If snprintf() returns success, it needs
> to update buffer pointer and reduce the buffer length so can continue to
> output the next string into the consequent memory space.
> 
> This complex logics are spreading in the function arm_spe_pkt_desc() so
> there has many duplicate codes for handling error detecting, increment
> buffer pointer and decrement buffer size.
> 
> To avoid the duplicate code, this patch introduces a new helper function
> arm_spe_pkt_snprintf() which is used to wrap up the complex logics, and
> it's used by the caller arm_spe_pkt_desc(); if printing buffer is called
> for multiple times in a flow, the error is a cumulative value and simply
> returns its final value.
> 
> This patch also moves the variable 'blen' as the function's local
> variable, this allows to remove the unnecessary braces and improve the
> readability.
> 
> Suggested-by: Dave Martin 
> Signed-off-by: Leo Yan 

Wow, that's a bit tricky to review in its current incarnation, but it
looks alright to me.
I have an optimisation idea below, but that's a nit and the patch as
is seems correct to me (checked twice, the second time in an editor,
matching the lines), so:

Reviewed-by: Andre Przywara 

...

> ---
>  .../arm-spe-decoder/arm-spe-pkt-decoder.c | 267 --
>  1 file changed, 117 insertions(+), 150 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 04fd7fd7c15f..1ecaf9805b79 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -9,6 +9,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include "arm-spe-pkt-decoder.h"
>  
> @@ -258,192 +259,158 @@ int arm_spe_get_packet(const unsigned char *buf, 
> size_t len,
>   return ret;
>  }
>  
> +static int arm_spe_pkt_snprintf(int *err, char **buf_p, size_t *blen,
> + const char *fmt, ...)
> +{
> + va_list ap;
> + int ret;
> +
> + /* Bail out if any error occurred */
> + if (err && *err)
> + return *err;
> +
> + va_start(ap, fmt);
> + ret = vsnprintf(*buf_p, *blen, fmt, ap);
> + va_end(ap);
> +
> + if (ret < 0) {
> + if (err && !*err)
> + *err = ret;
> + } else {
> + *buf_p += ret;
> + *blen -= ret;
> + }
> +
> + return ret;
> +}
> +
>  int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
>size_t buf_len)
>  {
> - int ret, ns, el, idx = packet->index;
> + int ns, el, idx = packet->index;
>   unsigned long long payload = packet->payload;
>   const char *name = arm_spe_pkt_name(packet->type);
> + size_t blen = buf_len;
> + int err = 0;
>  
>   switch (packet->type) {
>   case ARM_SPE_BAD:
>   case ARM_SPE_PAD:
>   case ARM_SPE_END:
> - return snprintf(buf, buf_len, "%s", name);
> - case ARM_SPE_EVENTS: {
> - size_t blen = buf_len;
> -
> - ret = 0;
> - ret = snprintf(buf, buf_len, "EV");
> - buf += ret;
> - blen -= ret;
> - if (payload & 0x1) {
> - ret = snprintf(buf, buf_len, " EXCEPTION-GEN");
> - buf += ret;
> - blen -= ret;
> - }
> - if (payload & 0x2) {
> - ret = snprintf(buf, buf_len, " RETIRED");
> - buf += ret;
> - blen -= ret;
> - }
> - if (payload & 0x4) {
> - ret = snprintf(buf, buf_len, " L1D-ACCESS");
> - buf += ret;
> - blen -= ret;
> - }
> - if (payload & 0x8) {
> - ret = snprintf(buf, buf_len, " L1D-REFILL");
> - buf += ret;
> - blen -= ret;
> - }
> - if (payload & 0x10) {
> - ret = snprintf(buf, buf_len, " TLB-ACCESS");
> - buf += ret;
> - blen -= ret;
> - }
> - if (payload & 0x20) {
> - ret = snprintf(buf, buf_len, " TLB-REFILL");
> - buf += ret;
> - blen -= ret;
> - }
> - if (payload & 0x40) {
> - ret = snprintf(buf, buf_len, " NOT-TAKEN");
> - buf += ret;
> - blen -= ret;
> - }
> - if (payload & 0x80) {
> - ret = snprintf(buf, buf_len, " MISPRED");
> - buf += ret;
> - blen -= ret;
> - }

Re: [PATCH v5 06/21] perf arm-spe: Refactor printing string to buffer

2020-10-29 Thread André Przywara
On 29/10/2020 10:51, Leo Yan wrote:
> Hi Andre,
> 
> On Thu, Oct 29, 2020 at 10:23:39AM +, André Przywara wrote:
> 
> [...]
> 
>>> +static int arm_spe_pkt_snprintf(int *err, char **buf_p, size_t *blen,
>>> +   const char *fmt, ...)
>>> +{
>>> +   va_list ap;
>>> +   int ret;
>>> +
>>> +   va_start(ap, fmt);
>>> +   ret = vsnprintf(*buf_p, *blen, fmt, ap);
>>> +   va_end(ap);
>>> +
>>> +   if (ret < 0) {
>>> +   if (err && !*err)
>>> +   *err = ret;
>>> +   } else {
>>> +   *buf_p += ret;
>>> +   *blen -= ret;
>>> +   }
>>> +
>>> +   return ret;
>>> +}
>>
>> So this now implements the old behaviour of ignoring previous errors, in
>> all cases, since we don't check for errors and bail out in the callers.
>>
>> If you simply check for validity of err and for it being 0 before
>> proceeding with the va_start() above, this should be fixed.
> 
> I think you are suggesting below code, could you take a look for it
> before I proceed to respin new patch?
> 
> static int arm_spe_pkt_snprintf(int *err, char **buf_p, size_t *blen,
>   const char *fmt, ...)
> {
>   va_list ap;
>   int ret;
> 
> /* Bail out if any error occurred */
> if (err && *err)
> return *err;
> 
>   va_start(ap, fmt);
>   ret = vsnprintf(*buf_p, *blen, fmt, ap);
>   va_end(ap);
> 
>   if (ret < 0) {
>   if (err && !*err)
>   *err = ret;
>   } else {
>   *buf_p += ret;
>   *blen -= ret;
>   }
> 
>   return ret;
> }

Yes, this is what I had in mind.

Cheers,
Andre


Re: [PATCH v5 06/21] perf arm-spe: Refactor printing string to buffer

2020-10-29 Thread André Przywara
On 29/10/2020 07:19, Leo Yan wrote:

Hi,

> When outputs strings to the decoding buffer with function snprintf(),
> SPE decoder needs to detects if any error returns from snprintf() and if
> so needs to directly bail out.  If snprintf() returns success, it needs
> to update buffer pointer and reduce the buffer length so can continue to
> output the next string into the consequent memory space.
> 
> This complex logics are spreading in the function arm_spe_pkt_desc() so
> there has many duplicate codes for handling error detecting, increment
> buffer pointer and decrement buffer size.
> 
> To avoid the duplicate code, this patch introduces a new helper function
> arm_spe_pkt_snprintf() which is used to wrap up the complex logics, and
> it's used by the caller arm_spe_pkt_desc(); if printing buffer is called
> for multiple times in a flow, the error is a cumulative value and simply
> returns its final value.
> 
> This patch also moves the variable 'blen' as the function's local
> variable, this allows to remove the unnecessary braces and improve the
> readability.
> 
> Suggested-by: Dave Martin 
> Signed-off-by: Leo Yan 
> Reviewed-by: Andre Przywara 
> ---
>  .../arm-spe-decoder/arm-spe-pkt-decoder.c | 263 --
>  1 file changed, 113 insertions(+), 150 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 04fd7fd7c15f..9147b88ae00c 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -9,6 +9,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include "arm-spe-pkt-decoder.h"
>  
> @@ -258,192 +259,154 @@ int arm_spe_get_packet(const unsigned char *buf, 
> size_t len,
>   return ret;
>  }
>  
> +static int arm_spe_pkt_snprintf(int *err, char **buf_p, size_t *blen,
> + const char *fmt, ...)
> +{
> + va_list ap;
> + int ret;
> +
> + va_start(ap, fmt);
> + ret = vsnprintf(*buf_p, *blen, fmt, ap);
> + va_end(ap);
> +
> + if (ret < 0) {
> + if (err && !*err)
> + *err = ret;
> + } else {
> + *buf_p += ret;
> + *blen -= ret;
> + }
> +
> + return ret;
> +}

So this now implements the old behaviour of ignoring previous errors, in
all cases, since we don't check for errors and bail out in the callers.

If you simply check for validity of err and for it being 0 before
proceeding with the va_start() above, this should be fixed.

Cheers,
Andre

> +
>  int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
>size_t buf_len)
>  {
> - int ret, ns, el, idx = packet->index;
> + int ns, el, idx = packet->index;
>   unsigned long long payload = packet->payload;
>   const char *name = arm_spe_pkt_name(packet->type);
> + size_t blen = buf_len;
> + int err = 0;
>  
>   switch (packet->type) {
>   case ARM_SPE_BAD:
>   case ARM_SPE_PAD:
>   case ARM_SPE_END:
> - return snprintf(buf, buf_len, "%s", name);
> - case ARM_SPE_EVENTS: {
> - size_t blen = buf_len;
> -
> - ret = 0;
> - ret = snprintf(buf, buf_len, "EV");
> - buf += ret;
> - blen -= ret;
> - if (payload & 0x1) {
> - ret = snprintf(buf, buf_len, " EXCEPTION-GEN");
> - buf += ret;
> - blen -= ret;
> - }
> - if (payload & 0x2) {
> - ret = snprintf(buf, buf_len, " RETIRED");
> - buf += ret;
> - blen -= ret;
> - }
> - if (payload & 0x4) {
> - ret = snprintf(buf, buf_len, " L1D-ACCESS");
> - buf += ret;
> - blen -= ret;
> - }
> - if (payload & 0x8) {
> - ret = snprintf(buf, buf_len, " L1D-REFILL");
> - buf += ret;
> - blen -= ret;
> - }
> - if (payload & 0x10) {
> - ret = snprintf(buf, buf_len, " TLB-ACCESS");
> - buf += ret;
> - blen -= ret;
> - }
> - if (payload & 0x20) {
> - ret = snprintf(buf, buf_len, " TLB-REFILL");
> - buf += ret;
> - blen -= ret;
> - }
> - if (payload & 0x40) {
> - ret = snprintf(buf, buf_len, " NOT-TAKEN");
> - buf += ret;
> - blen -= ret;
> - }
> - if (payload & 0x80) {
> - ret = snprintf(buf, buf_len, " MISPRED");
> - buf += ret;
> - blen -= ret;
> - }
> + return arm_spe_pkt_snprintf(&err, &buf, &blen, "%s", n

Re: [PATCH v4 09/21] perf arm-spe: Refactor address packet handling

2020-10-27 Thread André Przywara
On 27/10/2020 03:09, Leo Yan wrote:
> This patch is to refactor address packet handling, it defines macros for
> address packet's header and payload, these macros are used by decoder
> and the dump flow.
> 
> Signed-off-by: Leo Yan 

Thanks for the changes!

Reviewed-by: Andre Przywara 
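
As a reading aid for the diff below, here is a rough sketch of what the
new accessors boil down to. The bit positions are taken from the
definitions the patch removes (NS in bit 63, EL in bits [62:61], the
address in the lower seven bytes). The function names are hypothetical,
and the real macros in arm-spe-pkt-decoder.h may be written differently.

```c
#include <stdint.h>

/* Hypothetical equivalent of SPE_ADDR_PKT_GET_NS(): bit 63. */
static unsigned int addr_pkt_ns(uint64_t payload)
{
	return (unsigned int)((payload >> 63) & 0x1);
}

/* Hypothetical equivalent of SPE_ADDR_PKT_GET_EL(): bits [62:61]. */
static unsigned int addr_pkt_el(uint64_t payload)
{
	return (unsigned int)((payload >> 61) & 0x3);
}

/* Hypothetical equivalent of SPE_ADDR_PKT_ADDR_GET_BYTES_0_6(): bits [55:0]. */
static uint64_t addr_pkt_bytes_0_6(uint64_t payload)
{
	return payload & ((UINT64_C(1) << 56) - 1);
}
```

Working on the u64 value directly, instead of on a u8 pointer into it,
also keeps the result independent of host byte order.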

Cheers,
Andre

> ---
>  .../util/arm-spe-decoder/arm-spe-decoder.c| 29 ---
>  .../arm-spe-decoder/arm-spe-pkt-decoder.c | 26 +++---
>  .../arm-spe-decoder/arm-spe-pkt-decoder.h | 35 ---
>  3 files changed, 48 insertions(+), 42 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> index cc18a1e8c212..776b3e6628bb 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> @@ -24,36 +24,35 @@
>  
>  static u64 arm_spe_calc_ip(int index, u64 payload)
>  {
> - u8 *addr = (u8 *)&payload;
> - int ns, el;
> + u64 ns, el;
>  
>   /* Instruction virtual address or Branch target address */
>   if (index == SPE_ADDR_PKT_HDR_INDEX_INS ||
>   index == SPE_ADDR_PKT_HDR_INDEX_BRANCH) {
> - ns = addr[7] & SPE_ADDR_PKT_NS;
> - el = (addr[7] & SPE_ADDR_PKT_EL_MASK) >> SPE_ADDR_PKT_EL_OFFSET;
> + ns = SPE_ADDR_PKT_GET_NS(payload);
> + el = SPE_ADDR_PKT_GET_EL(payload);
> +
> + /* Clean highest byte */
> + payload = SPE_ADDR_PKT_ADDR_GET_BYTES_0_6(payload);
>  
>   /* Fill highest byte for EL1 or EL2 (VHE) mode */
>   if (ns && (el == SPE_ADDR_PKT_EL1 || el == SPE_ADDR_PKT_EL2))
> - addr[7] = 0xff;
> - /* Clean highest byte for other cases */
> - else
> - addr[7] = 0x0;
> + payload |= 0xffULL << SPE_ADDR_PKT_ADDR_BYTE7_SHIFT;
>  
>   /* Data access virtual address */
>   } else if (index == SPE_ADDR_PKT_HDR_INDEX_DATA_VIRT) {
>  
> + /* Clean tags */
> + payload = SPE_ADDR_PKT_ADDR_GET_BYTES_0_6(payload);
> +
>   /* Fill highest byte if bits [48..55] is 0xff */
> - if (addr[6] == 0xff)
> - addr[7] = 0xff;
> - /* Otherwise, cleanup tags */
> - else
> - addr[7] = 0x0;
> + if (SPE_ADDR_PKT_ADDR_GET_BYTE_6(payload) == 0xffULL)
> + payload |= 0xffULL << SPE_ADDR_PKT_ADDR_BYTE7_SHIFT;
>  
>   /* Data access physical address */
>   } else if (index == SPE_ADDR_PKT_HDR_INDEX_DATA_PHYS) {
> - /* Cleanup byte 7 */
> - addr[7] = 0x0;
> + /* Clean highest byte */
> + payload = SPE_ADDR_PKT_ADDR_GET_BYTES_0_6(payload);
>   } else {
>   pr_err("unsupported address packet index: 0x%x\n", index);
>   }
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index e372e85e1c14..1218a731638f 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -13,9 +13,6 @@
>  
>  #include "arm-spe-pkt-decoder.h"
>  
> -#define NS_FLAG  BIT(63)
> -#define EL_FLAG  (BIT(62) | BIT(61))
> -
>  #if __BYTE_ORDER == __BIG_ENDIAN
>  #define le16_to_cpu bswap_16
>  #define le32_to_cpu bswap_32
> @@ -167,10 +164,11 @@ static int arm_spe_get_addr(const unsigned char *buf, 
> size_t len,
>   const unsigned char ext_hdr, struct arm_spe_pkt 
> *packet)
>  {
>   packet->type = ARM_SPE_ADDRESS;
> +
>   if (ext_hdr)
> - packet->index = ((buf[0] & 0x3) << 3) | (buf[1] & 0x7);
> + packet->index = SPE_HDR_EXTENDED_INDEX(buf[0], buf[1]);
>   else
> - packet->index = buf[0] & 0x7;
> + packet->index = SPE_HDR_SHORT_INDEX(buf[0]);
>  
>   return arm_spe_get_payload(buf, len, ext_hdr, packet);
>  }
> @@ -274,20 +272,20 @@ static int arm_spe_pkt_desc_addr(const struct 
> arm_spe_pkt *packet,
>   u64 payload = packet->payload;
>  
>   switch (idx) {
> - case 0:
> - case 1:
> - ns = !!(packet->payload & NS_FLAG);
> - el = (packet->payload & EL_FLAG) >> 61;
> - payload &= ~(0xffULL << 56);
> + case SPE_ADDR_PKT_HDR_INDEX_INS:
> + case SPE_ADDR_PKT_HDR_INDEX_BRANCH:
> + ns = !!SPE_ADDR_PKT_GET_NS(payload);
> + el = SPE_ADDR_PKT_GET_EL(payload);
> + payload = SPE_ADDR_PKT_ADDR_GET_BYTES_0_6(payload);
>   return arm_spe_pkt_snprintf(&buf, &buf_len,
>   "%s 0x%llx el%d ns=%d",
>   (idx == 1) ? "TGT" : "PC", payload, el, ns);
> - case 2:
> + case SPE_ADDR_PKT_HDR_INDEX_DATA_VIRT:
>   return arm_spe_pkt_snprintf(&buf, &buf_len,
>   

Re: [PATCH v4 03/21] perf arm-spe: Refactor payload size calculation

2020-10-27 Thread André Przywara
On 27/10/2020 03:08, Leo Yan wrote:
> This patch defines macro to extract "sz" field from header, and renames
> the function payloadlen() to arm_spe_payload_len().
> 
> Signed-off-by: Leo Yan 

Reviewed-by: Andre Przywara 
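
For reference, the size decoding can be tried in isolation with something
like the sketch below. payload_len() is a hypothetical name, and it uses a
plain shift and mask where the patch uses GENMASK_ULL(5, 4).

```c
/*
 * Hypothetical standalone version of arm_spe_payload_len(): the "sz"
 * field sits in header bits [5:4] and encodes log2 of the payload size.
 */
static unsigned int payload_len(unsigned char hdr)
{
	return 1U << ((hdr >> 4) & 0x3);
}
```

Header bytes 0x00, 0x10, 0x20 and 0x30 give 1, 2, 4 and 8 bytes
respectively; all other header bits are ignored.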

Cheers,
Andre

> ---
>  .../util/arm-spe-decoder/arm-spe-pkt-decoder.c | 18 +-
>  1 file changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 7c7b5eb09fba..06b3eec4494e 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -69,22 +69,22 @@ const char *arm_spe_pkt_name(enum arm_spe_pkt_type type)
>   return arm_spe_packet_name[type];
>  }
>  
> -/* return ARM SPE payload size from its encoding,
> - * which is in bits 5:4 of the byte.
> - * 00 : byte
> - * 01 : halfword (2)
> - * 10 : word (4)
> - * 11 : doubleword (8)
> +/*
> + * Extracts the field "sz" from header bits and converts to bytes:
> + *   00 : byte (1)
> + *   01 : halfword (2)
> + *   10 : word (4)
> + *   11 : doubleword (8)
>   */
> -static int payloadlen(unsigned char byte)
> +static unsigned int arm_spe_payload_len(unsigned char hdr)
>  {
> - return 1 << ((byte & 0x30) >> 4);
> + return 1U << ((hdr & GENMASK_ULL(5, 4)) >> 4);
>  }
>  
>  static int arm_spe_get_payload(const unsigned char *buf, size_t len,
>  struct arm_spe_pkt *packet)
>  {
> - size_t payload_len = payloadlen(buf[0]);
> + size_t payload_len = arm_spe_payload_len(buf[0]);
>  
>   if (len < 1 + payload_len)
>   return ARM_SPE_NEED_MORE_BYTES;
> 



Re: [PATCH v4 13/21] perf arm-spe: Refactor counter packet handling

2020-10-27 Thread André Przywara
On 27/10/2020 03:09, Leo Yan wrote:
> This patch defines macros for counter packet header, and uses macros to
> replace hard code values in functions arm_spe_get_counter() and
> arm_spe_pkt_desc().
> 
> In the function arm_spe_get_counter(), adds a new line for more
> readable.
> 
> Signed-off-by: Leo Yan 

Reviewed-by: Andre Przywara 

Cheers,
Andre

> ---
>  tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c | 11 ++-
>  tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h |  5 +
>  2 files changed, 11 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 8f481c6ea054..4be649c26002 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -152,10 +152,11 @@ static int arm_spe_get_counter(const unsigned char 
> *buf, size_t len,
>  const unsigned char ext_hdr, struct arm_spe_pkt 
> *packet)
>  {
>   packet->type = ARM_SPE_COUNTER;
> +
>   if (ext_hdr)
> - packet->index = ((buf[0] & 0x3) << 3) | (buf[1] & 0x7);
> + packet->index = SPE_HDR_EXTENDED_INDEX(buf[0], buf[1]);
>   else
> - packet->index = buf[0] & 0x7;
> + packet->index = SPE_HDR_SHORT_INDEX(buf[0]);
>  
>   return arm_spe_get_payload(buf, len, ext_hdr, packet);
>  }
> @@ -307,17 +308,17 @@ static int arm_spe_pkt_desc_counter(const struct 
> arm_spe_pkt *packet,
>   return ret;
>  
>   switch (packet->index) {
> - case 0:
> + case SPE_CNT_PKT_HDR_INDEX_TOTAL_LAT:
>   ret = arm_spe_pkt_snprintf(&buf, &blen, "TOT");
>   if (ret < 0)
>   return ret;
>   break;
> - case 1:
> + case SPE_CNT_PKT_HDR_INDEX_ISSUE_LAT:
>   ret = arm_spe_pkt_snprintf(&buf, &blen, "ISSUE");
>   if (ret < 0)
>   return ret;
>   break;
> - case 2:
> + case SPE_CNT_PKT_HDR_INDEX_TRANS_LAT:
>   ret = arm_spe_pkt_snprintf(&buf, &blen, "XLAT");
>   if (ret < 0)
>   return ret;
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> index 9bc876bffd35..7d8e34e35f05 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> @@ -82,6 +82,11 @@ struct arm_spe_pkt {
>  /* Context packet header */
>  #define SPE_CTX_PKT_HDR_INDEX(h) ((h) & GENMASK_ULL(1, 0))
>  
> +/* Counter packet header */
> +#define SPE_CNT_PKT_HDR_INDEX_TOTAL_LAT  0x0
> +#define SPE_CNT_PKT_HDR_INDEX_ISSUE_LAT  0x1
> +#define SPE_CNT_PKT_HDR_INDEX_TRANS_LAT  0x2
> +
>  const char *arm_spe_pkt_name(enum arm_spe_pkt_type);
>  
>  int arm_spe_get_packet(const unsigned char *buf, size_t len,
> 



Re: [PATCH v4 21/21] perf arm-spe: Add support for ARMv8.3-SPE

2020-10-27 Thread André Przywara
On 27/10/2020 03:09, Leo Yan wrote:

Hi,

> From: Wei Li 
> 
> This patch is to support Armv8.3 extension for SPE, it adds alignment
> field in the Events packet and it supports the Scalable Vector Extension
> (SVE) for Operation packet and Events packet with two additions:
> 
>   - The vector length for SVE operations in the Operation Type packet;
>   - The incomplete predicate and empty predicate fields in the Events
> packet.
> 
> Signed-off-by: Wei Li 
> Signed-off-by: Leo Yan 

Looks correct, checked all bit patterns in the manual.

Reviewed-by: Andre Przywara 

Cheers,
Andre

> ---
>  .../arm-spe-decoder/arm-spe-pkt-decoder.c | 72 ++-
>  .../arm-spe-decoder/arm-spe-pkt-decoder.h | 16 +
>  2 files changed, 86 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 5195ec3b1ec4..40b12d6893f9 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -332,6 +332,21 @@ static int arm_spe_pkt_desc_event(const struct 
> arm_spe_pkt *packet,
>   if (ret < 0)
>   return ret;
>   }
> + if (payload & BIT(EV_ALIGNMENT)) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " ALIGNMENT");
> + if (ret < 0)
> + return ret;
> + }
> + if (payload & BIT(EV_PARTIAL_PREDICATE)) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " SVE-PARTIAL-PRED");
> + if (ret < 0)
> + return ret;
> + }
> + if (payload & BIT(EV_EMPTY_PREDICATE)) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " SVE-EMPTY-PRED");
> + if (ret < 0)
> + return ret;
> + }
>  
>   return buf_len - blen;
>  }
> @@ -345,8 +360,42 @@ static int arm_spe_pkt_desc_op_type(const struct 
> arm_spe_pkt *packet,
>  
>   switch (class) {
>   case SPE_OP_PKT_HDR_CLASS_OTHER:
> - return arm_spe_pkt_snprintf(&buf, &blen,
> - payload & SPE_OP_PKT_COND ? "COND-SELECT" : 
> "INSN-OTHER");
> + if (SPE_OP_PKT_IS_OTHER_SVE_OP(payload)) {
> +
> + ret = arm_spe_pkt_snprintf(&buf, &blen, "SVE-OTHER");
> + if (ret < 0)
> + return ret;
> +
> + /* SVE effective vector length */
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " EVLEN %d",
> +SPE_OP_PKG_SVE_EVL(payload));
> + if (ret < 0)
> + return ret;
> +
> + if (payload & SPE_OP_PKT_SVE_FP) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " FP");
> + if (ret < 0)
> + return ret;
> + }
> + if (payload & SPE_OP_PKT_SVE_PRED) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " 
> PRED");
> + if (ret < 0)
> + return ret;
> + }
> + } else {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, "OTHER");
> + if (ret < 0)
> + return ret;
> +
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " %s",
> + payload & SPE_OP_PKT_COND ?
> + "COND-SELECT" : "INSN-OTHER");
> + if (ret < 0)
> + return ret;
> + }
> +
> + return buf_len - blen;
> +
>   case SPE_OP_PKT_HDR_CLASS_LD_ST_ATOMIC:
>   ret = arm_spe_pkt_snprintf(&buf, &blen,
>  payload & SPE_OP_PKT_ST ? "ST" : 
> "LD");
> @@ -400,6 +449,25 @@ static int arm_spe_pkt_desc_op_type(const struct 
> arm_spe_pkt *packet,
>   break;
>   }
>  
> + if (SPE_OP_PKT_IS_LDST_SVE(payload)) {
> + /* SVE effective vector length */
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " EVLEN %d",
> +SPE_OP_PKG_SVE_EVL(payload));
> + if (ret < 0)
> + return ret;
> +
> + if (payload & SPE_OP_PKT_SVE_PRED) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " 
> PRED");
> + if (ret < 0)
> + return ret;
> + }
> + if (payload & SPE_OP_PKT_SVE_SG) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " SG");
> + if (ret < 0)
> + return ret;
> +

Re: [PATCH v4 10/21] perf arm_spe: Fixup top byte for data virtual address

2020-10-27 Thread André Przywara
On 27/10/2020 03:09, Leo Yan wrote:
> To establish a valid address from the address packet payload and finally
> the address value can be used for parsing data symbol in DSO, current
> code uses 0xff to replace the tag in the top byte of data virtual
> address.
> 
> So far the code only fixups top byte for the memory layouts with 4KB
> pages, it misses to support memory layouts with 64KB pages.
> 
> This patch adds the conditions for checking bits [55:48] are 0xf0 or
> 0xfd, if detects the patterns it will fill 0xff into the top byte of the
> address, also adds comment to explain the fixing up.
> 
> Signed-off-by: Leo Yan 
> ---
>  .../util/arm-spe-decoder/arm-spe-decoder.c| 24 ---
>  1 file changed, 21 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> index 776b3e6628bb..e135ac01d94a 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> @@ -24,7 +24,7 @@
>  
>  static u64 arm_spe_calc_ip(int index, u64 payload)
>  {
> - u64 ns, el;
> + u64 ns, el, val;
>  
>   /* Instruction virtual address or Branch target address */
>   if (index == SPE_ADDR_PKT_HDR_INDEX_INS ||
> @@ -45,8 +45,26 @@ static u64 arm_spe_calc_ip(int index, u64 payload)
>   /* Clean tags */
>   payload = SPE_ADDR_PKT_ADDR_GET_BYTES_0_6(payload);
>  
> - /* Fill highest byte if bits [48..55] is 0xff */
> - if (SPE_ADDR_PKT_ADDR_GET_BYTE_6(payload) == 0xffULL)
> + /*
> +  * Armv8 ARM (ARM DDI 0487F.c), chapter "D10.2.1 Address packet"
> +  * defines the data virtual address payload format, the top byte
> +  * (bits [63:56]) is assigned as top-byte tag; so we only can
> +  * retrieve address value from bits [55:0].
> +  *
> +  * According to Documentation/arm64/memory.rst, if detects the
> +  * specific pattern in bits [55:48] of payload which falls in
> +  * the kernel space, should fixup the top byte and this allows
> +  * perf tool to parse DSO symbol for data address correctly.
> +  *
> +  * For this reason, if detects the bits [55:48] is one of
> +  * following values, will fill 0xff into the top byte:
> +  *
> +  *   - 0xff (for most kernel memory regions);
> +  *   - 0xf0 (for kernel logical memory map with 64KB pages);
> +  *   - 0xfd (for kasan shadow region with 64KB pages).
> +  */
> + val = SPE_ADDR_PKT_ADDR_GET_BYTE_6(payload);
> + if (val == 0xffULL || val == 0xf0ULL || val == 0xfdULL)

But those values are just the beginning of the region used by the
kernel, aren't they? So the kernel logical map goes from 0xfff000.. to
0xfff7fff..., for instance.

But actually I wonder why we are so selective here? Wouldn't it just
suffice to look at bits [55:52] to be either 0 or F?
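
Something along these lines is what I have in mind. This is an untested
sketch with a made-up function name, and what to do with values other
than 0 or F (here: just leave the top byte cleared) is an assumption:

```c
#include <stdint.h>

/* Hypothetical helper, not from the patch. */
static uint64_t fixup_data_vaddr(uint64_t payload)
{
	/* Drop the top-byte tag, keep address bits [55:0]. */
	payload &= (UINT64_C(1) << 56) - 1;

	/* Bits [55:52] all ones: treat as kernel address, fill the top byte. */
	if (((payload >> 52) & 0xf) == 0xf)
		payload |= UINT64_C(0xff) << 56;

	/* Bits [55:52] zero (user space) or anything else: top byte stays 0. */
	return payload;
}
```

That would cover 0xff, 0xf0 and 0xfd in bits [55:48] without listing
them one by one, as well as every other address in those regions.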

Cheers,
Andre

>   payload |= 0xffULL << SPE_ADDR_PKT_ADDR_BYTE7_SHIFT;
>  
>   /* Data access physical address */
> 



Re: [PATCH v4 18/21] perf arm-spe: Refactor operation packet handling

2020-10-27 Thread André Przywara
On 27/10/2020 03:09, Leo Yan wrote:
> Defines macros for operation packet header and formats (support sub
> classes for 'other', 'branch', 'load and store', etc).  Uses these
> macros for operation packet decoding and dumping.
> 
> Signed-off-by: Leo Yan 

Looks good now, thanks!

Reviewed-by: Andre Przywara 

Cheers,
Andre

> ---
>  .../arm-spe-decoder/arm-spe-pkt-decoder.c | 32 ++-
>  .../arm-spe-decoder/arm-spe-pkt-decoder.h | 23 +
>  2 files changed, 40 insertions(+), 15 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 7c6a0caed976..e3b0d22743e8 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -144,7 +144,7 @@ static int arm_spe_get_op_type(const unsigned char *buf, 
> size_t len,
>  struct arm_spe_pkt *packet)
>  {
>   packet->type = ARM_SPE_OP_TYPE;
> - packet->index = buf[0] & 0x3;
> + packet->index = SPE_OP_PKT_HDR_CLASS(buf[0]);
>   return arm_spe_get_payload(buf, len, 0, packet);
>  }
>  
> @@ -339,37 +339,38 @@ static int arm_spe_pkt_desc_event(const struct 
> arm_spe_pkt *packet,
>  static int arm_spe_pkt_desc_op_type(const struct arm_spe_pkt *packet,
>   char *buf, size_t buf_len)
>  {
> - int ret, idx = packet->index;
> + int ret, class = packet->index;
>   unsigned long long payload = packet->payload;
>   size_t blen = buf_len;
>  
> - switch (idx) {
> - case 0:
> + switch (class) {
> + case SPE_OP_PKT_HDR_CLASS_OTHER:
>   return arm_spe_pkt_snprintf(&buf, &blen,
> - payload & 0x1 ? "COND-SELECT" : "INSN-OTHER");
> - case 1:
> + payload & SPE_OP_PKT_COND ? "COND-SELECT" : 
> "INSN-OTHER");
> + case SPE_OP_PKT_HDR_CLASS_LD_ST_ATOMIC:
>   ret = arm_spe_pkt_snprintf(&buf, &blen,
> -payload & 0x1 ? "ST" : "LD");
> +payload & SPE_OP_PKT_ST ? "ST" : 
> "LD");
>   if (ret < 0)
>   return ret;
>  
> - if (payload & 0x2) {
> - if (payload & 0x4) {
> + if (SPE_OP_PKT_IS_LDST_ATOMIC(payload)) {
> + if (payload & SPE_OP_PKT_AT) {
>   ret = arm_spe_pkt_snprintf(&buf, &blen, " AT");
>   if (ret < 0)
>   return ret;
>   }
> - if (payload & 0x8) {
> + if (payload & SPE_OP_PKT_EXCL) {
>   ret = arm_spe_pkt_snprintf(&buf, &blen, " 
> EXCL");
>   if (ret < 0)
>   return ret;
>   }
> - if (payload & 0x10) {
> + if (payload & SPE_OP_PKT_AR) {
>   ret = arm_spe_pkt_snprintf(&buf, &blen, " AR");
>   if (ret < 0)
>   return ret;
>   }
> - } else if (payload & 0x4) {
> + } else if (SPE_OP_PKT_LDST_SUBCLASS_GET(payload) ==
> + SPE_OP_PKT_LDST_SUBCLASS_SIMD_FP) {
>   ret = arm_spe_pkt_snprintf(&buf, &blen, " SIMD-FP");
>   if (ret < 0)
>   return ret;
> @@ -377,17 +378,18 @@ static int arm_spe_pkt_desc_op_type(const struct 
> arm_spe_pkt *packet,
>  
>   return buf_len - blen;
>  
> - case 2:
> + case SPE_OP_PKT_HDR_CLASS_BR_ERET:
>   ret = arm_spe_pkt_snprintf(&buf, &blen, "B");
>   if (ret < 0)
>   return ret;
>  
> - if (payload & 0x1) {
> + if (payload & SPE_OP_PKT_COND) {
>   ret = arm_spe_pkt_snprintf(&buf, &blen, " COND");
>   if (ret < 0)
>   return ret;
>   }
> - if (payload & 0x2) {
> +
> + if (SPE_OP_PKT_IS_INDIRECT_BRANCH(payload)) {
>   ret = arm_spe_pkt_snprintf(&buf, &blen, " IND");
>   if (ret < 0)
>   return ret;
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> index 42ed4e61ede2..7032fc141ad4 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> @@ -105,6 +105,29 @@ enum arm_spe_events {
>   EV_EMPTY_PREDICATE  = 18,
>  };
>  
> +/* Operation packet header */
> +#define SPE_OP_PKT_HDR_CLASS(h)  ((h) & GENMASK_ULL(1, 
> 0))
> +#define SPE_OP_PKT_HDR_

Re: [PATCH v3 20/20] perf arm-spe: Add support for ARMv8.3-SPE

2020-10-26 Thread André Przywara
On 22/10/2020 15:58, Leo Yan wrote:

Hi,

> From: Wei Li 
> 
> This patch is to support Armv8.3 extension for SPE, it adds alignment
> field in the Events packet and it supports the Scalable Vector Extension
> (SVE) for Operation packet and Events packet with two additions:
> 
>   - The vector length for SVE operations in the Operation Type packet;
>   - The incomplete predicate and empty predicate fields in the Events
> packet.
> 
> Signed-off-by: Wei Li 
> Signed-off-by: Leo Yan 
> ---
>  .../arm-spe-decoder/arm-spe-pkt-decoder.c | 74 ++-
>  .../arm-spe-decoder/arm-spe-pkt-decoder.h | 18 +
>  2 files changed, 90 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 74ac12cbec69..6da4cfbc9914 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -332,6 +332,21 @@ static int arm_spe_pkt_desc_event(const struct 
> arm_spe_pkt *packet,
>   if (ret < 0)
>   return ret;
>   }
> + if (payload & BIT(EV_ALIGNMENT)) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " ALIGNMENT");
> + if (ret < 0)
> + return ret;
> + }
> + if (payload & BIT(EV_PARTIAL_PREDICATE)) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " SVE-PARTIAL-PRED");
> + if (ret < 0)
> + return ret;
> + }
> + if (payload & BIT(EV_EMPTY_PREDICATE)) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " SVE-EMPTY-PRED");
> + if (ret < 0)
> + return ret;
> + }
>  
>   return buf_len - blen;
>  }
> @@ -345,8 +360,43 @@ static int arm_spe_pkt_desc_op_type(const struct 
> arm_spe_pkt *packet,
>  
>   switch (class) {
>   case SPE_OP_PKT_HDR_CLASS_OTHER:
> - return arm_spe_pkt_snprintf(&buf, &blen,
> - payload & SPE_OP_PKT_COND ? "COND-SELECT" : "INSN-OTHER");
> + if (SPE_OP_PKT_OTHER_SUBCLASS_SVE_OP_GET(payload) ==
> + SPE_OP_PKT_OTHER_SUBCLASS_SVE_OP) {

Same comment as in the other patch: can you combine those two into one
symbol?

> +
> + ret = arm_spe_pkt_snprintf(&buf, &blen, "SVE-OTHER");
> + if (ret < 0)
> + return ret;
> +
> + /* SVE effective vector length */
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " EVLEN %d",
> +SPE_OP_PKG_SVE_EVL(payload));
> + if (ret < 0)
> + return ret;
> +
> + if (payload & SPE_OP_PKT_SVE_FP) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " FP");
> + if (ret < 0)
> + return ret;
> + }
> + if (payload & SPE_OP_PKT_SVE_PRED) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " PRED");
> + if (ret < 0)
> + return ret;
> + }
> + } else {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, "OTHER");
> + if (ret < 0)
> + return ret;
> +
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " %s",
> + payload & SPE_OP_PKT_COND ?
> + "COND-SELECT" : "INSN-OTHER");
> + if (ret < 0)
> + return ret;
> + }
> +
> + return buf_len - blen;
> +
>   case SPE_OP_PKT_HDR_CLASS_LD_ST_ATOMIC:
>   ret = arm_spe_pkt_snprintf(&buf, &blen,
>  payload & SPE_OP_PKT_ST ? "ST" : "LD");
> @@ -401,6 +451,26 @@ static int arm_spe_pkt_desc_op_type(const struct 
> arm_spe_pkt *packet,
>   break;
>   }
>  
> + if (SPE_OP_PKT_LDST_SUBCLASS_SVE_GET(payload) ==
> + SPE_OP_PKT_LDST_SUBCLASS_SVE) {

Same here, could be combined into one symbol.

> + /* SVE effective vector length */
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " EVLEN %d",
> +SPE_OP_PKG_SVE_EVL(payload));
> + if (ret < 0)
> + return ret;
> +
> + if (payload & SPE_OP_PKT_SVE_PRED) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " PRED");
> + if (ret < 0)
> + return ret;
> + }
> + if (payload & SPE_OP_PKT_SVE_SG) {
> +  

Re: [PATCH v3 17/20] perf arm-spe: Refactor operation packet handling

2020-10-26 Thread André Przywara
On 22/10/2020 15:58, Leo Yan wrote:

Hi,

> Define macros for the operation packet header and formats (supporting
> sub classes for 'other', 'branch', 'load and store', etc.), and use
> these macros for operation packet decoding and dumping.
> 
> Signed-off-by: Leo Yan 
> ---
>  .../arm-spe-decoder/arm-spe-pkt-decoder.c | 34 +++
>  .../arm-spe-decoder/arm-spe-pkt-decoder.h | 26 ++
>  2 files changed, 45 insertions(+), 15 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 19d05d9734ab..59b538563d31 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -144,7 +144,7 @@ static int arm_spe_get_op_type(const unsigned char *buf, 
> size_t len,
>  struct arm_spe_pkt *packet)
>  {
>   packet->type = ARM_SPE_OP_TYPE;
> - packet->index = buf[0] & 0x3;
> + packet->index = SPE_OP_PKT_HDR_CLASS(buf[0]);
>   return arm_spe_get_payload(buf, len, 0, packet);
>  }
>  
> @@ -339,37 +339,39 @@ static int arm_spe_pkt_desc_event(const struct 
> arm_spe_pkt *packet,
>  static int arm_spe_pkt_desc_op_type(const struct arm_spe_pkt *packet,
>   char *buf, size_t buf_len)
>  {
> - int ret, idx = packet->index;
> + int ret, class = packet->index;
>   unsigned long long payload = packet->payload;
>   size_t blen = buf_len;
>  
> - switch (idx) {
> - case 0:
> + switch (class) {
> + case SPE_OP_PKT_HDR_CLASS_OTHER:
>   return arm_spe_pkt_snprintf(&buf, &blen,
> - payload & 0x1 ? "COND-SELECT" : "INSN-OTHER");
> - case 1:
> + payload & SPE_OP_PKT_COND ? "COND-SELECT" : "INSN-OTHER");
> + case SPE_OP_PKT_HDR_CLASS_LD_ST_ATOMIC:
>   ret = arm_spe_pkt_snprintf(&buf, &blen,
> -payload & 0x1 ? "ST" : "LD");
> +payload & SPE_OP_PKT_ST ? "ST" : "LD");
>   if (ret < 0)
>   return ret;
>  
> - if (payload & 0x2) {
> - if (payload & 0x4) {
> + if (SPE_OP_PKT_LDST_SUBCLASS_ATOMIC_GET(payload) ==
> + SPE_OP_PKT_LDST_SUBCLASS_ATOMIC) {

This is somewhat hard to read, and both symbols are used only once.
How about combining them in the header, so that you can write:
if (SPE_OP_PKT_IS_LDST_ATOMIC(payload)) {
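
A minimal sketch of what such a combined predicate could look like. It
assumes the atomic subclass is flagged by bit 1 of the payload, as in the
open-coded "payload & 0x2" test that this hunk replaces; the mask and value
names and the helper function are made up for illustration and are not
taken from the patch:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: the atomic load/store subclass is flagged by bit 1 of the
 * payload, matching the open-coded "payload & 0x2" test being replaced.
 * LDST_ATOMIC_MASK and LDST_ATOMIC_VALUE are illustrative names only.
 */
#define LDST_ATOMIC_MASK	0x2ULL
#define LDST_ATOMIC_VALUE	0x2ULL

/* A single predicate instead of a _GET() macro compared to a value macro. */
#define SPE_OP_PKT_IS_LDST_ATOMIC(v) \
	(((v) & LDST_ATOMIC_MASK) == LDST_ATOMIC_VALUE)

/* Hypothetical helper so the predicate can be exercised outside of perf. */
static inline bool op_is_ldst_atomic(uint64_t payload)
{
	return SPE_OP_PKT_IS_LDST_ATOMIC(payload);
}
```

The call site then reads as a single question about the payload, and the
mask/value pair stays private to the header.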

> + if (payload & SPE_OP_PKT_AT) {
>   ret = arm_spe_pkt_snprintf(&buf, &blen, " AT");
>   if (ret < 0)
>   return ret;
>   }
> - if (payload & 0x8) {
> + if (payload & SPE_OP_PKT_EXCL) {
>   ret = arm_spe_pkt_snprintf(&buf, &blen, " EXCL");
>   if (ret < 0)
>   return ret;
>   }
> - if (payload & 0x10) {
> + if (payload & SPE_OP_PKT_AR) {
>   ret = arm_spe_pkt_snprintf(&buf, &blen, " AR");
>   if (ret < 0)
>   return ret;
>   }
> - } else if (payload & 0x4) {
> + } else if (SPE_OP_PKT_LDST_SUBCLASS_GET(payload) ==
> + SPE_OP_PKT_LDST_SUBCLASS_SIMD_FP) {
>   ret = arm_spe_pkt_snprintf(&buf, &blen, " SIMD-FP");
>   if (ret < 0)
>   return ret;
> @@ -377,17 +379,19 @@ static int arm_spe_pkt_desc_op_type(const struct 
> arm_spe_pkt *packet,
>  
>   return buf_len - blen;
>  
> - case 2:
> + case SPE_OP_PKT_HDR_CLASS_BR_ERET:
>   ret = arm_spe_pkt_snprintf(&buf, &blen, "B");
>   if (ret < 0)
>   return ret;
>  
> - if (payload & 0x1) {
> + if (payload & SPE_OP_PKT_COND) {
>   ret = arm_spe_pkt_snprintf(&buf, &blen, " COND");
>   if (ret < 0)
>   return ret;
>   }
> - if (payload & 0x2) {
> +
> + if (SPE_OP_PKT_BRANCH_SUBCLASS_GET(payload) ==
> + SPE_OP_PKT_BRANCH_SUBCLASS_INDIRECT) {

Same here, it's the only user of both symbols, so maybe:
if (SPE_OP_PKT_IS_INDIRECT_BRANCH(payload)) {

Cheers,
Andre

>   ret = arm_spe_pkt_snprintf(&buf, &blen, " IND");
>   if (ret < 0)
>   return ret;
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h 
> b/tools/perf/util/arm-spe-dec

Re: [PATCH v3 13/20] perf arm-spe: Add new function arm_spe_pkt_desc_event()

2020-10-23 Thread André Przywara
On 22/10/2020 15:58, Leo Yan wrote:

Hi,

> This patch moves the event packet parsing out of arm_spe_pkt_desc()
> and into the new function arm_spe_pkt_desc_event().
> 
> Signed-off-by: Leo Yan 

diff -w says this is correct, so:

Reviewed-by: Andre Przywara 

Thanks!
Andre

> ---
>  .../arm-spe-decoder/arm-spe-pkt-decoder.c | 136 ++
>  1 file changed, 73 insertions(+), 63 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 6eebd30f3d78..8a6b50f32a52 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -266,6 +266,78 @@ static int arm_spe_pkt_snprintf(char **buf_p, size_t 
> *blen,
>   return ret;
>  }
>  
> +static int arm_spe_pkt_desc_event(const struct arm_spe_pkt *packet,
> +   char *buf, size_t buf_len)
> +{
> + u64 payload = packet->payload;
> + size_t blen = buf_len;
> + int ret;
> +
> + ret = arm_spe_pkt_snprintf(&buf, &blen, "EV");
> + if (ret < 0)
> + return ret;
> +
> + if (payload & 0x1) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " EXCEPTION-GEN");
> + if (ret < 0)
> + return ret;
> + }
> + if (payload & 0x2) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " RETIRED");
> + if (ret < 0)
> + return ret;
> + }
> + if (payload & 0x4) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " L1D-ACCESS");
> + if (ret < 0)
> + return ret;
> + }
> + if (payload & 0x8) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " L1D-REFILL");
> + if (ret < 0)
> + return ret;
> + }
> + if (payload & 0x10) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " TLB-ACCESS");
> + if (ret < 0)
> + return ret;
> + }
> + if (payload & 0x20) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " TLB-REFILL");
> + if (ret < 0)
> + return ret;
> + }
> + if (payload & 0x40) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " NOT-TAKEN");
> + if (ret < 0)
> + return ret;
> + }
> + if (payload & 0x80) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " MISPRED");
> + if (ret < 0)
> + return ret;
> + }
> + if (packet->index > 1) {
> + if (payload & 0x100) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-ACCESS");
> + if (ret < 0)
> + return ret;
> + }
> + if (payload & 0x200) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-REFILL");
> + if (ret < 0)
> + return ret;
> + }
> + if (payload & 0x400) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " REMOTE-ACCESS");
> + if (ret < 0)
> + return ret;
> + }
> + }
> +
> + return buf_len - blen;
> +}
> +
>  static int arm_spe_pkt_desc_addr(const struct arm_spe_pkt *packet,
>char *buf, size_t buf_len)
>  {
> @@ -344,69 +416,7 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, 
> char *buf,
>   case ARM_SPE_END:
>   return arm_spe_pkt_snprintf(&buf, &blen, "%s", name);
>   case ARM_SPE_EVENTS:
> - ret = arm_spe_pkt_snprintf(&buf, &blen, "EV");
> - if (ret < 0)
> - return ret;
> -
> - if (payload & 0x1) {
> - ret = arm_spe_pkt_snprintf(&buf, &blen, " EXCEPTION-GEN");
> - if (ret < 0)
> - return ret;
> - }
> - if (payload & 0x2) {
> - ret = arm_spe_pkt_snprintf(&buf, &blen, " RETIRED");
> - if (ret < 0)
> - return ret;
> - }
> - if (payload & 0x4) {
> - ret = arm_spe_pkt_snprintf(&buf, &blen, " L1D-ACCESS");
> - if (ret < 0)
> - return ret;
> - }
> - if (payload & 0x8) {
> - ret = arm_spe_pkt_snprintf(&buf, &blen, " L1D-REFILL");
> - if (ret < 0)
> - return ret;
> - }
> - if (payload & 0x10) {
> - ret = arm_spe_pkt_snprintf(&buf, &blen, " TLB-ACCESS");
> - if (ret < 0)
> - return ret;
> - }
> - if (payload & 0x20) {
> - ret = arm_spe_pkt_snprintf(&buf, &blen, " TLB-REFILL");
> -  

Re: [PATCH v3 19/20] perf arm_spe: Decode memory tagging properties

2020-10-23 Thread André Przywara
On 22/10/2020 15:58, Leo Yan wrote:

Hi,

> From: Andre Przywara 
> 
> When SPE records a physical address, it can additionally tag the event
> with information from the Memory Tagging architecture extension.
> 
> Decode the two additional fields in the SPE event payload.
> 
> [leoy: Refined patch to use predefined macros]
> 
> Signed-off-by: Andre Przywara 
> Signed-off-by: Leo Yan 
> ---
>  tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c | 6 +-
>  tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h | 2 ++
>  2 files changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index c1a3b0afd1de..74ac12cbec69 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -432,6 +432,7 @@ static int arm_spe_pkt_desc_addr(const struct arm_spe_pkt 
> *packet,
>char *buf, size_t buf_len)
>  {
>   int ns, el, idx = packet->index;
> + int ch, pat;
>   u64 payload = packet->payload;
>  
>   switch (idx) {
> @@ -448,9 +449,12 @@ static int arm_spe_pkt_desc_addr(const struct 
> arm_spe_pkt *packet,
>   "VA 0x%llx", payload);
>   case SPE_ADDR_PKT_HDR_INDEX_DATA_PHYS:
>   ns = !!SPE_ADDR_PKT_GET_NS(payload);
> + ch = !!SPE_ADDR_PKT_GET_CH(payload);
> + pat = SPE_ADDR_PKT_GET_PAT(payload);
>   payload = SPE_ADDR_PKT_ADDR_GET_BYTES_0_6(payload);
>   return arm_spe_pkt_snprintf(&buf, &buf_len,
> - "PA 0x%llx ns=%d", payload, ns);
> + "PA 0x%llx ns=%d ch=%d, pat=%x",
> + payload, ns, ch, pat);
>   default:
>   return 0;
>   }
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> index 31dbb8c0fde3..d69af0d618ea 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> @@ -75,6 +75,8 @@ struct arm_spe_pkt {
>  
>  #define SPE_ADDR_PKT_GET_NS(v)   (((v) & BIT(63)) >> 63)
>  #define SPE_ADDR_PKT_GET_EL(v)   (((v) & GENMASK_ULL(62, 61)) >> 61)
> +#define SPE_ADDR_PKT_GET_CH(v)   (((v) & BIT(62)) >> 62)

You need BIT_ULL() here to make this work on 32-bit systems.
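
To illustrate the point: BIT() is built on an unsigned long constant, which
is only 32 bits wide on 32-bit systems, so a shift by 62 cannot work there.
The sketch below uses local stand-in macros (MY_BIT_ULL and GET_CH are
illustrative names, not the kernel definitions) to show the 64-bit-safe
form of the CH extraction:

```c
#include <stdint.h>

/* Local stand-in for the kernel's BIT_ULL(); the MY_ prefix marks it as illustrative. */
#define MY_BIT_ULL(nr)	(1ULL << (nr))

/* CH lives in bit 62 of the payload, so the mask must be a 64-bit constant. */
#define GET_CH(v)	(((v) & MY_BIT_ULL(62)) >> 62)

/* Hypothetical helper mirroring the "ch = !!SPE_ADDR_PKT_GET_CH(payload)" line. */
static inline int addr_pkt_ch(uint64_t payload)
{
	return !!GET_CH(payload);
}
```

Because the constant is unsigned long long, the result is the same whether
the tool is built for a 32-bit or a 64-bit host.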

Cheers,
Andre

> +#define SPE_ADDR_PKT_GET_PAT(v)  (((v) & GENMASK_ULL(59, 56)) >> 56)
>  
>  #define SPE_ADDR_PKT_EL0 0
>  #define SPE_ADDR_PKT_EL1 1
> 



Re: [PATCH v3 18/20] perf arm-spe: Add more sub classes for operation packet

2020-10-23 Thread André Przywara
On 22/10/2020 15:58, Leo Yan wrote:

Hi,

> For the operation type packet payload with load/store class, the decoder
> does not yet support these sub classes:
> 
>   - A load/store targeting the general-purpose registers;
>   - A load/store targeting unspecified registers;
>   - The ARMv8.4 nested virtualisation extension can redirect system
> register accesses to a memory page controlled by the hypervisor.
> The SPE profiling feature in newer implementations can tag those
> memory accesses accordingly.
> 
> Add the bit pattern describing load/store sub classes, so that the perf
> tool can decode it properly.
> 
> Inspired by Andre Przywara; refined the commit log and code for a
> clearer description.
> 
> Co-developed-by: Andre Przywara 
> Signed-off-by: Leo Yan 

Reviewed-by: Andre Przywara 

Cheers,
Andre

> ---
>  .../arm-spe-decoder/arm-spe-pkt-decoder.c | 28 +--
>  1 file changed, 26 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 59b538563d31..c1a3b0afd1de 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -370,11 +370,35 @@ static int arm_spe_pkt_desc_op_type(const struct 
> arm_spe_pkt *packet,
>   if (ret < 0)
>   return ret;
>   }
> - } else if (SPE_OP_PKT_LDST_SUBCLASS_GET(payload) ==
> - SPE_OP_PKT_LDST_SUBCLASS_SIMD_FP) {
> + }
> +
> + switch (SPE_OP_PKT_LDST_SUBCLASS_GET(payload)) {
> + case SPE_OP_PKT_LDST_SUBCLASS_SIMD_FP:
>   ret = arm_spe_pkt_snprintf(&buf, &blen, " SIMD-FP");
>   if (ret < 0)
>   return ret;
> +
> + break;
> + case SPE_OP_PKT_LDST_SUBCLASS_GP_REG:
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " GP-REG");
> + if (ret < 0)
> + return ret;
> +
> + break;
> + case SPE_OP_PKT_LDST_SUBCLASS_UNSPEC_REG:
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " UNSPEC-REG");
> + if (ret < 0)
> + return ret;
> +
> + break;
> + case SPE_OP_PKT_LDST_SUBCLASS_NV_SYSREG:
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " NV-SYSREG");
> + if (ret < 0)
> + return ret;
> +
> + break;
> + default:
> + break;
>   }
>  
>   return buf_len - blen;
> 



Re: [PATCH v3 14/20] perf arm-spe: Refactor event type handling

2020-10-23 Thread André Przywara
On 22/10/2020 15:58, Leo Yan wrote:

Hi,

> Move the enums of event types to arm-spe-pkt-decoder.h, so that function
> arm_spe_pkt_desc() can use them for bitmasks.
> 
> Suggested-by: Andre Przywara 
> Signed-off-by: Leo Yan 

The move is fine, and I checked the bitmasks as well.

Reviewed-by: Andre Przywara 

Cheers,
Andre

> ---
>  .../util/arm-spe-decoder/arm-spe-decoder.h| 17 --
>  .../arm-spe-decoder/arm-spe-pkt-decoder.c | 22 +--
>  .../arm-spe-decoder/arm-spe-pkt-decoder.h | 18 +++
>  3 files changed, 29 insertions(+), 28 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h 
> b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
> index a5111a8d4360..24727b8ca7ff 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
> @@ -13,23 +13,6 @@
>  
>  #include "arm-spe-pkt-decoder.h"
>  
> -enum arm_spe_events {
> - EV_EXCEPTION_GEN= 0,
> - EV_RETIRED  = 1,
> - EV_L1D_ACCESS   = 2,
> - EV_L1D_REFILL   = 3,
> - EV_TLB_ACCESS   = 4,
> - EV_TLB_WALK = 5,
> - EV_NOT_TAKEN= 6,
> - EV_MISPRED  = 7,
> - EV_LLC_ACCESS   = 8,
> - EV_LLC_MISS = 9,
> - EV_REMOTE_ACCESS= 10,
> - EV_ALIGNMENT= 11,
> - EV_PARTIAL_PREDICATE= 17,
> - EV_EMPTY_PREDICATE  = 18,
> -};
> -
>  enum arm_spe_sample_type {
>   ARM_SPE_L1D_ACCESS  = 1 << 0,
>   ARM_SPE_L1D_MISS= 1 << 1,
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 8a6b50f32a52..58a1390b7a43 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -277,58 +277,58 @@ static int arm_spe_pkt_desc_event(const struct 
> arm_spe_pkt *packet,
>   if (ret < 0)
>   return ret;
>  
> - if (payload & 0x1) {
> + if (payload & BIT(EV_EXCEPTION_GEN)) {
>   ret = arm_spe_pkt_snprintf(&buf, &blen, " EXCEPTION-GEN");
>   if (ret < 0)
>   return ret;
>   }
> - if (payload & 0x2) {
> + if (payload & BIT(EV_RETIRED)) {
>   ret = arm_spe_pkt_snprintf(&buf, &blen, " RETIRED");
>   if (ret < 0)
>   return ret;
>   }
> - if (payload & 0x4) {
> + if (payload & BIT(EV_L1D_ACCESS)) {
>   ret = arm_spe_pkt_snprintf(&buf, &blen, " L1D-ACCESS");
>   if (ret < 0)
>   return ret;
>   }
> - if (payload & 0x8) {
> + if (payload & BIT(EV_L1D_REFILL)) {
>   ret = arm_spe_pkt_snprintf(&buf, &blen, " L1D-REFILL");
>   if (ret < 0)
>   return ret;
>   }
> - if (payload & 0x10) {
> + if (payload & BIT(EV_TLB_ACCESS)) {
>   ret = arm_spe_pkt_snprintf(&buf, &blen, " TLB-ACCESS");
>   if (ret < 0)
>   return ret;
>   }
> - if (payload & 0x20) {
> + if (payload & BIT(EV_TLB_WALK)) {
>   ret = arm_spe_pkt_snprintf(&buf, &blen, " TLB-REFILL");
>   if (ret < 0)
>   return ret;
>   }
> - if (payload & 0x40) {
> + if (payload & BIT(EV_NOT_TAKEN)) {
>   ret = arm_spe_pkt_snprintf(&buf, &blen, " NOT-TAKEN");
>   if (ret < 0)
>   return ret;
>   }
> - if (payload & 0x80) {
> + if (payload & BIT(EV_MISPRED)) {
>   ret = arm_spe_pkt_snprintf(&buf, &blen, " MISPRED");
>   if (ret < 0)
>   return ret;
>   }
>   if (packet->index > 1) {
> - if (payload & 0x100) {
> + if (payload & BIT(EV_LLC_ACCESS)) {
>   ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-ACCESS");
>   if (ret < 0)
>   return ret;
>   }
> - if (payload & 0x200) {
> + if (payload & BIT(EV_LLC_MISS)) {
>   ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-REFILL");
>   if (ret < 0)
>   return ret;
>   }
> - if (payload & 0x400) {
> + if (payload & BIT(EV_REMOTE_ACCESS)) {
>   ret = arm_spe_pkt_snprintf(&buf, &blen, " REMOTE-ACCESS");
>   if (ret < 0)
>   return ret;
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> index 8a291f451ef8..12c344454cf1 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> @@ -92,6 +92,24 @@ struct arm_spe_pkt {
>  #define SPE_CNT_PKT_HD

Re: [PATCH v3 12/20] perf arm-spe: Refactor counter packet handling

2020-10-23 Thread André Przywara
On 22/10/2020 15:58, Leo Yan wrote:

Hi,

> This patch defines macros for the counter packet header, and uses them to
> replace hard-coded values in functions arm_spe_get_counter() and
> arm_spe_pkt_desc().
> 
> In the function arm_spe_get_counter(), add a blank line for better
> readability.
> 
> Signed-off-by: Leo Yan 
> ---
>  tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c | 11 ++-
>  tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h |  8 
>  2 files changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 023bcc9be3cc..6eebd30f3d78 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -152,10 +152,11 @@ static int arm_spe_get_counter(const unsigned char 
> *buf, size_t len,
>  const unsigned char ext_hdr, struct arm_spe_pkt *packet)
>  {
>   packet->type = ARM_SPE_COUNTER;
> +
>   if (ext_hdr)
> - packet->index = ((buf[0] & 0x3) << 3) | (buf[1] & 0x7);
> + packet->index = SPE_CNT_PKT_HDR_EXTENDED_INDEX(buf[0], buf[1]);
>   else
> - packet->index = buf[0] & 0x7;
> + packet->index = SPE_CNT_PKT_HDR_SHORT_INDEX(buf[0]);
>  
>   return arm_spe_get_payload(buf, len, ext_hdr, packet);
>  }
> @@ -307,17 +308,17 @@ static int arm_spe_pkt_desc_counter(const struct 
> arm_spe_pkt *packet,
>   return ret;
>  
>   switch (packet->index) {
> - case 0:
> + case SPE_CNT_PKT_HDR_INDEX_TOTAL_LAT:
>   ret = arm_spe_pkt_snprintf(&buf, &blen, "TOT");
>   if (ret < 0)
>   return ret;
>   break;
> - case 1:
> + case SPE_CNT_PKT_HDR_INDEX_ISSUE_LAT:
>   ret = arm_spe_pkt_snprintf(&buf, &blen, "ISSUE");
>   if (ret < 0)
>   return ret;
>   break;
> - case 2:
> + case SPE_CNT_PKT_HDR_INDEX_TRANS_LAT:
>   ret = arm_spe_pkt_snprintf(&buf, &blen, "XLAT");
>   if (ret < 0)
>   return ret;
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> index 8808f2d0b6e4..8a291f451ef8 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> @@ -84,6 +84,14 @@ struct arm_spe_pkt {
>  /* Context packet header */
>  #define SPE_CTX_PKT_HDR_INDEX(h) ((h) & GENMASK_ULL(1, 0))
>  
> +/* Counter packet header */
> +#define SPE_CNT_PKT_HDR_SHORT_INDEX(h)   ((h) & GENMASK_ULL(2, 0))
> +#define SPE_CNT_PKT_HDR_EXTENDED_INDEX(h0, h1)   (((h0) & GENMASK_ULL(1, 0)) << 3 | \
> +  SPE_ADDR_PKT_HDR_SHORT_INDEX(h1))

I would still like to see this merged with the SPE_ADDR_PKT_HDR_*_INDEX
definition in patch 10/20.

Otherwise this patch is fine.
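
For illustration, a rough sketch of such a merged definition. Both the
address and the counter packet code replace the same open-coded
expressions ("buf[0] & 0x7" and "((buf[0] & 0x3) << 3) | (buf[1] & 0x7)"),
so one pair of macros could serve both. The SPE_HDR_* names and the helper
function are hypothetical, not taken from either patch:

```c
#include <stdint.h>

/* Hypothetical shared names; the patches define separate ADDR and CNT variants. */
#define SPE_HDR_SHORT_INDEX(h)		((h) & 0x7)
#define SPE_HDR_EXTENDED_INDEX(h0, h1)	\
	((((h0) & 0x3) << 3) | SPE_HDR_SHORT_INDEX(h1))

/* Hypothetical helper: pick the index from a short or an extended header. */
static inline int spe_hdr_index(const unsigned char *buf, int ext_hdr)
{
	if (ext_hdr)
		return SPE_HDR_EXTENDED_INDEX(buf[0], buf[1]);

	return SPE_HDR_SHORT_INDEX(buf[0]);
}
```

With something like this, arm_spe_get_addr() and arm_spe_get_counter()
would differ only in the packet type they assign.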

Cheers,
Andre

> +#define SPE_CNT_PKT_HDR_INDEX_TOTAL_LAT  0x0
> +#define SPE_CNT_PKT_HDR_INDEX_ISSUE_LAT  0x1
> +#define SPE_CNT_PKT_HDR_INDEX_TRANS_LAT  0x2
> +
>  const char *arm_spe_pkt_name(enum arm_spe_pkt_type);
>  
>  int arm_spe_get_packet(const unsigned char *buf, size_t len,
> 



Re: [PATCH v3 09/20] perf arm-spe: Refactor address packet handling

2020-10-23 Thread André Przywara
On 22/10/2020 15:58, Leo Yan wrote:

Hi Leo,

> This patch refactors address packet handling: it defines macros for the
> address packet's header and payload, and these macros are used by the
> decoder and the dump flow.
> 
> Signed-off-by: Leo Yan 
> ---
>  .../util/arm-spe-decoder/arm-spe-decoder.c| 29 
>  .../arm-spe-decoder/arm-spe-pkt-decoder.c | 26 +++---
>  .../arm-spe-decoder/arm-spe-pkt-decoder.h | 34 ---
>  3 files changed, 47 insertions(+), 42 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> index cc18a1e8c212..776b3e6628bb 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> @@ -24,36 +24,35 @@
>  
>  static u64 arm_spe_calc_ip(int index, u64 payload)
>  {
> - u8 *addr = (u8 *)&payload;
> - int ns, el;
> + u64 ns, el;
>  
>   /* Instruction virtual address or Branch target address */
>   if (index == SPE_ADDR_PKT_HDR_INDEX_INS ||
>   index == SPE_ADDR_PKT_HDR_INDEX_BRANCH) {
> - ns = addr[7] & SPE_ADDR_PKT_NS;
> - el = (addr[7] & SPE_ADDR_PKT_EL_MASK) >> SPE_ADDR_PKT_EL_OFFSET;
> + ns = SPE_ADDR_PKT_GET_NS(payload);
> + el = SPE_ADDR_PKT_GET_EL(payload);
> +
> + /* Clean highest byte */
> + payload = SPE_ADDR_PKT_ADDR_GET_BYTES_0_6(payload);
>  
>   /* Fill highest byte for EL1 or EL2 (VHE) mode */
>   if (ns && (el == SPE_ADDR_PKT_EL1 || el == SPE_ADDR_PKT_EL2))
> - addr[7] = 0xff;
> - /* Clean highest byte for other cases */
> - else
> - addr[7] = 0x0;
> + payload |= 0xffULL << SPE_ADDR_PKT_ADDR_BYTE7_SHIFT;
>  
>   /* Data access virtual address */
>   } else if (index == SPE_ADDR_PKT_HDR_INDEX_DATA_VIRT) {
>  
> + /* Clean tags */
> + payload = SPE_ADDR_PKT_ADDR_GET_BYTES_0_6(payload);
> +
>   /* Fill highest byte if bits [48..55] is 0xff */

Do you know where this comes from? If yes, can you replace the comment
with the reason, so that it says *why* and not *what* the code does?

> - if (addr[6] == 0xff)
> - addr[7] = 0xff;
> - /* Otherwise, cleanup tags */
> - else
> - addr[7] = 0x0;
> + if (SPE_ADDR_PKT_ADDR_GET_BYTE_6(payload) == 0xffULL)
> + payload |= 0xffULL << SPE_ADDR_PKT_ADDR_BYTE7_SHIFT;
>  
>   /* Data access physical address */
>   } else if (index == SPE_ADDR_PKT_HDR_INDEX_DATA_PHYS) {
> - /* Cleanup byte 7 */
> - addr[7] = 0x0;
> + /* Clean highest byte */
> + payload = SPE_ADDR_PKT_ADDR_GET_BYTES_0_6(payload);
>   } else {
>   pr_err("unsupported address packet index: 0x%x\n", index);
>   }
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 550cd7648c73..156f98d6b8b2 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -13,9 +13,6 @@
>  
>  #include "arm-spe-pkt-decoder.h"
>  
> -#define NS_FLAG  BIT(63)
> -#define EL_FLAG  (BIT(62) | BIT(61))
> -
>  #if __BYTE_ORDER == __BIG_ENDIAN
>  #define le16_to_cpu bswap_16
>  #define le32_to_cpu bswap_32
> @@ -167,10 +164,11 @@ static int arm_spe_get_addr(const unsigned char *buf, 
> size_t len,
>   const unsigned char ext_hdr, struct arm_spe_pkt *packet)
>  {
>   packet->type = ARM_SPE_ADDRESS;
> +
>   if (ext_hdr)
> - packet->index = ((buf[0] & 0x3) << 3) | (buf[1] & 0x7);
> + packet->index = SPE_ADDR_PKT_HDR_EXTENDED_INDEX(buf[0], buf[1]);
>   else
> - packet->index = buf[0] & 0x7;
> + packet->index = SPE_ADDR_PKT_HDR_SHORT_INDEX(buf[0]);
>  
>   return arm_spe_get_payload(buf, len, ext_hdr, packet);
>  }
> @@ -274,20 +272,20 @@ static int arm_spe_pkt_desc_addr(const struct 
> arm_spe_pkt *packet,
>   u64 payload = packet->payload;
>  
>   switch (idx) {
> - case 0:
> - case 1:
> - ns = !!(packet->payload & NS_FLAG);
> - el = (packet->payload & EL_FLAG) >> 61;
> - payload &= ~(0xffULL << 56);
> + case SPE_ADDR_PKT_HDR_INDEX_INS:
> + case SPE_ADDR_PKT_HDR_INDEX_BRANCH:
> + ns = !!SPE_ADDR_PKT_GET_NS(payload);
> + el = SPE_ADDR_PKT_GET_EL(payload);
> + payload = SPE_ADDR_PKT_ADDR_GET_BYTES_0_6(payload);
>   return arm_spe_pkt_snprintf(&buf, &buf_len,
>   "%s 0x%llx el%d ns=%d",
>   (idx == 1) ? "TGT" : "PC", payload, el, ns);
> - case 2:
> + case SPE_ADDR_P

Re: [PATCH v3 16/20] perf arm-spe: Add new function arm_spe_pkt_desc_op_type()

2020-10-23 Thread André Przywara
On 22/10/2020 15:58, Leo Yan wrote:

Hi,

> The operation type packet is complex and contains subclasses; the parsing
> flow causes deep indentation. For better readability, this patch
> introduces a new function arm_spe_pkt_desc_op_type() which is used for
> operation type parsing.
> 
> Signed-off-by: Leo Yan 

Compared '-' and '+' with diff -w, no changes.

Reviewed-by: Andre Przywara 

Thanks,
Andre

> ---
>  .../arm-spe-decoder/arm-spe-pkt-decoder.c | 122 ++
>  1 file changed, 66 insertions(+), 56 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 2cb01016..19d05d9734ab 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -336,6 +336,70 @@ static int arm_spe_pkt_desc_event(const struct 
> arm_spe_pkt *packet,
>   return buf_len - blen;
>  }
>  
> +static int arm_spe_pkt_desc_op_type(const struct arm_spe_pkt *packet,
> + char *buf, size_t buf_len)
> +{
> + int ret, idx = packet->index;
> + unsigned long long payload = packet->payload;
> + size_t blen = buf_len;
> +
> + switch (idx) {
> + case 0:
> + return arm_spe_pkt_snprintf(&buf, &blen,
> + payload & 0x1 ? "COND-SELECT" : "INSN-OTHER");
> + case 1:
> + ret = arm_spe_pkt_snprintf(&buf, &blen,
> +payload & 0x1 ? "ST" : "LD");
> + if (ret < 0)
> + return ret;
> +
> + if (payload & 0x2) {
> + if (payload & 0x4) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " AT");
> + if (ret < 0)
> + return ret;
> + }
> + if (payload & 0x8) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " EXCL");
> + if (ret < 0)
> + return ret;
> + }
> + if (payload & 0x10) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " AR");
> + if (ret < 0)
> + return ret;
> + }
> + } else if (payload & 0x4) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " SIMD-FP");
> + if (ret < 0)
> + return ret;
> + }
> +
> + return buf_len - blen;
> +
> + case 2:
> + ret = arm_spe_pkt_snprintf(&buf, &blen, "B");
> + if (ret < 0)
> + return ret;
> +
> + if (payload & 0x1) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " COND");
> + if (ret < 0)
> + return ret;
> + }
> + if (payload & 0x2) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " IND");
> + if (ret < 0)
> + return ret;
> + }
> +
> + return buf_len - blen;
> +
> + default:
> + return 0;
> + }
> +}
> +
>  static int arm_spe_pkt_desc_addr(const struct arm_spe_pkt *packet,
>char *buf, size_t buf_len)
>  {
> @@ -403,7 +467,7 @@ static int arm_spe_pkt_desc_counter(const struct 
> arm_spe_pkt *packet,
>  int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
>size_t buf_len)
>  {
> - int ret, idx = packet->index;
> + int idx = packet->index;
>   unsigned long long payload = packet->payload;
>   const char *name = arm_spe_pkt_name(packet->type);
>   size_t blen = buf_len;
> @@ -416,61 +480,7 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, 
> char *buf,
>   case ARM_SPE_EVENTS:
>   return arm_spe_pkt_desc_event(packet, buf, buf_len);
>   case ARM_SPE_OP_TYPE:
> - switch (idx) {
> - case 0:
> - return arm_spe_pkt_snprintf(&buf, &blen,
> - payload & 0x1 ? "COND-SELECT" : "INSN-OTHER");
> - case 1:
> - ret = arm_spe_pkt_snprintf(&buf, &blen,
> -payload & 0x1 ? "ST" : "LD");
> - if (ret < 0)
> - return ret;
> -
> - if (payload & 0x2) {
> - if (payload & 0x4) {
> - ret = arm_spe_pkt_snprintf(&buf, &blen, " AT");
> - if (ret < 0)
> - return ret;
> - }
> - if (payload & 0x8) {
> - 

Re: [PATCH v3 15/20] perf arm-spe: Remove size condition checking for events

2020-10-23 Thread André Przywara
On 22/10/2020 15:58, Leo Yan wrote:
> In the Armv8 ARM (ARM DDI 0487F.c), chapter "D10.2.6 Events packet"
> describes that each event bit is valid only with a specific payload size.
> For example, the Last Level cache access event bit is defined as:
> 
>   E[8], byte 1 bit [0], when SZ == 0b01 , when SZ == 0b10 ,
>or when SZ == 0b11
> 
> This requires the payload size to be at least 2 bytes: only when byte 1
> (counting from 0) is valid can E[8] (bit 0 in byte 1) be used for the LLC
> access event type.  For safety, the code first checks the payload size,
> and parses the event type only if the size requirement is met.
> 
> However, arm_spe_get_payload() already uses a cast, so any bytes beyond
> the valid size have been set to zero.
> 
> For this reason, there is no need to check the payload size again when
> parsing events, so this patch removes the payload size conditions.
> 
> Suggested-by: Andre Przywara 
> Signed-off-by: Leo Yan 

Thanks, that looks better now!
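
The argument in the commit message can be checked with a small sketch.
read_payload() below is a hypothetical stand-in for arm_spe_get_payload():
it assembles the bytes by hand instead of using casts, but it has the same
property that matters here, namely that bytes beyond the payload size read
as zero. EV_LLC_ACCESS is bit 8, as in enum arm_spe_events:

```c
#include <stddef.h>
#include <stdint.h>

#define EV_LLC_ACCESS	8	/* value from enum arm_spe_events */

/*
 * Hypothetical stand-in for arm_spe_get_payload(): assemble a little-endian
 * payload of up to 8 bytes into a zero-extended 64-bit value.
 */
static inline uint64_t read_payload(const unsigned char *buf, size_t size)
{
	uint64_t payload = 0;
	size_t i;

	for (i = 0; i < size && i < sizeof(payload); i++)
		payload |= (uint64_t)buf[i] << (8 * i);

	return payload;
}

/* No size check needed: with a 1-byte payload, bit 8 is always zero. */
static inline int has_llc_access(const unsigned char *buf, size_t size)
{
	return (read_payload(buf, size) >> EV_LLC_ACCESS) & 1;
}
```

With a one-byte payload the LLC bit can never be set, so testing it
unconditionally gives the same result as the removed size check.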

Reviewed-by: Andre Przywara 

Cheers,
Andre

> ---
>  .../util/arm-spe-decoder/arm-spe-decoder.c|  9 ++
>  .../arm-spe-decoder/arm-spe-pkt-decoder.c | 30 +--
>  2 files changed, 17 insertions(+), 22 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> index 776b3e6628bb..a5d7509d5daa 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> @@ -178,16 +178,13 @@ static int arm_spe_read_record(struct arm_spe_decoder 
> *decoder)
>   if (payload & BIT(EV_TLB_ACCESS))
>   decoder->record.type |= ARM_SPE_TLB_ACCESS;
>  
> - if ((idx == 2 || idx == 4 || idx == 8) &&
> - (payload & BIT(EV_LLC_MISS)))
> + if (payload & BIT(EV_LLC_MISS))
>   decoder->record.type |= ARM_SPE_LLC_MISS;
>  
> - if ((idx == 2 || idx == 4 || idx == 8) &&
> - (payload & BIT(EV_LLC_ACCESS)))
> + if (payload & BIT(EV_LLC_ACCESS))
>   decoder->record.type |= ARM_SPE_LLC_ACCESS;
>  
> - if ((idx == 2 || idx == 4 || idx == 8) &&
> - (payload & BIT(EV_REMOTE_ACCESS)))
> + if (payload & BIT(EV_REMOTE_ACCESS))
>   decoder->record.type |= ARM_SPE_REMOTE_ACCESS;
>  
>   if (payload & BIT(EV_MISPRED))
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 58a1390b7a43..2cb01016 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -317,22 +317,20 @@ static int arm_spe_pkt_desc_event(const struct 
> arm_spe_pkt *packet,
>   if (ret < 0)
>   return ret;
>   }
> - if (packet->index > 1) {
> - if (payload & BIT(EV_LLC_ACCESS)) {
> - ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-ACCESS");
> - if (ret < 0)
> - return ret;
> - }
> - if (payload & BIT(EV_LLC_MISS)) {
> - ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-REFILL");
> - if (ret < 0)
> - return ret;
> - }
> - if (payload & BIT(EV_REMOTE_ACCESS)) {
> - ret = arm_spe_pkt_snprintf(&buf, &blen, " 
> REMOTE-ACCESS");
> - if (ret < 0)
> - return ret;
> - }
> + if (payload & BIT(EV_LLC_ACCESS)) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-ACCESS");
> + if (ret < 0)
> + return ret;
> + }
> + if (payload & BIT(EV_LLC_MISS)) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-REFILL");
> + if (ret < 0)
> + return ret;
> + }
> + if (payload & BIT(EV_REMOTE_ACCESS)) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " REMOTE-ACCESS");
> + if (ret < 0)
> + return ret;
>   }
>  
>   return buf_len - blen;
> 
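
The argument in the commit message can be illustrated with a small standalone sketch. This is not the perf code: read_payload_le() and has_llc_miss() are hypothetical stand-ins, and only the EV_* bit positions are taken from enum arm_spe_events as quoted elsewhere in this thread. Assuming the payload is assembled into a zeroed 64-bit value, bits beyond the valid size are always clear, so testing them without a size check is harmless.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit positions as quoted from enum arm_spe_events (subset). */
enum { EV_L1D_ACCESS = 2, EV_LLC_ACCESS = 8, EV_LLC_MISS = 9, EV_REMOTE_ACCESS = 10 };

#define BIT_ULL(n) (1ULL << (n))

/*
 * Hypothetical stand-in for arm_spe_get_payload(): assemble a
 * little-endian payload of 'size' bytes (1, 2, 4 or 8) into a u64.
 * Bytes beyond 'size' are never read, so the upper bits stay zero.
 */
static uint64_t read_payload_le(const unsigned char *p, size_t size)
{
	uint64_t v = 0;
	size_t i;

	for (i = 0; i < size && i < sizeof(v); i++)
		v |= (uint64_t)p[i] << (8 * i);
	return v;
}

/* No size check needed: a 1-byte payload can never have bit 9 set. */
static int has_llc_miss(uint64_t payload)
{
	return !!(payload & BIT_ULL(EV_LLC_MISS));
}
```

With a 1-byte payload the second byte in the buffer is ignored, so has_llc_miss() returns 0 no matter what follows the payload in memory.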



Re: [PATCH v3 08/20] perf arm-spe: Add new function arm_spe_pkt_desc_addr()

2020-10-23 Thread André Przywara
On 22/10/2020 15:58, Leo Yan wrote:
> This patch moves out the address parsing code from arm_spe_pkt_desc()
> and uses the new introduced function arm_spe_pkt_desc_addr() to process
> address packet.
> 
> Signed-off-by: Leo Yan 

Can confirm the move is correct.

Reviewed-by: Andre Przywara 

Cheers,
Andre

> ---
>  .../arm-spe-decoder/arm-spe-pkt-decoder.c | 49 ---
>  1 file changed, 30 insertions(+), 19 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 6f2329990729..550cd7648c73 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -267,10 +267,38 @@ static int arm_spe_pkt_snprintf(char **buf_p, size_t 
> *blen,
>   return ret;
>  }
>  
> +static int arm_spe_pkt_desc_addr(const struct arm_spe_pkt *packet,
> +  char *buf, size_t buf_len)
> +{
> + int ns, el, idx = packet->index;
> + u64 payload = packet->payload;
> +
> + switch (idx) {
> + case 0:
> + case 1:
> + ns = !!(packet->payload & NS_FLAG);
> + el = (packet->payload & EL_FLAG) >> 61;
> + payload &= ~(0xffULL << 56);
> + return arm_spe_pkt_snprintf(&buf, &buf_len,
> + "%s 0x%llx el%d ns=%d",
> + (idx == 1) ? "TGT" : "PC", payload, el, ns);
> + case 2:
> + return arm_spe_pkt_snprintf(&buf, &buf_len,
> + "VA 0x%llx", payload);
> + case 3:
> + ns = !!(packet->payload & NS_FLAG);
> + payload &= ~(0xffULL << 56);
> + return arm_spe_pkt_snprintf(&buf, &buf_len,
> + "PA 0x%llx ns=%d", payload, ns);
> + default:
> + return 0;
> + }
> +}
> +
>  int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
>size_t buf_len)
>  {
> - int ret, ns, el, idx = packet->index;
> + int ret, idx = packet->index;
>   unsigned long long payload = packet->payload;
>   const char *name = arm_spe_pkt_name(packet->type);
>   size_t blen = buf_len;
> @@ -404,24 +432,7 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, 
> char *buf,
>   case ARM_SPE_TIMESTAMP:
>   return arm_spe_pkt_snprintf(&buf, &blen, "%s %lld", name, 
> payload);
>   case ARM_SPE_ADDRESS:
> - switch (idx) {
> - case 0:
> - case 1: ns = !!(packet->payload & NS_FLAG);
> - el = (packet->payload & EL_FLAG) >> 61;
> - payload &= ~(0xffULL << 56);
> - return arm_spe_pkt_snprintf(&buf, &blen,
> - "%s 0x%llx el%d ns=%d",
> - (idx == 1) ? "TGT" : "PC", payload, el, 
> ns);
> - case 2:
> - return arm_spe_pkt_snprintf(&buf, &blen,
> - "VA 0x%llx", payload);
> - case 3: ns = !!(packet->payload & NS_FLAG);
> - payload &= ~(0xffULL << 56);
> - return arm_spe_pkt_snprintf(&buf, &blen,
> - "PA 0x%llx ns=%d", payload, 
> ns);
> - default:
> - return 0;
> - }
> + return arm_spe_pkt_desc_addr(packet, buf, buf_len);
>   case ARM_SPE_CONTEXT:
>   return arm_spe_pkt_snprintf(&buf, &blen, "%s 0x%lx el%d",
>   name, (unsigned long)payload, idx + 
> 1);
> 
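
For readers following the address packet layout, here is a minimal sketch of the field split performed in the moved code. The NS_FLAG and EL_FLAG definitions and the top-byte mask mirror the quoted diff; struct addr_fields and split_addr_payload() are hypothetical names introduced only for this illustration.

```c
#include <stdint.h>

#define NS_FLAG (1ULL << 63)                  /* non-secure flag, bit 63 */
#define EL_FLAG ((1ULL << 62) | (1ULL << 61)) /* exception level, bits 62:61 */

/* Hypothetical container for the decoded fields. */
struct addr_fields {
	uint64_t addr; /* payload with the top byte cleared */
	int el;
	int ns;
};

static struct addr_fields split_addr_payload(uint64_t payload)
{
	struct addr_fields f;

	f.ns = !!(payload & NS_FLAG);
	f.el = (int)((payload & EL_FLAG) >> 61);
	f.addr = payload & ~(0xffULL << 56);
	return f;
}
```

This corresponds to the index 0 and 1 cases (PC and TGT); per the quoted code, the PA case only uses the ns bit, and the VA case prints the payload unmodified.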



Re: [PATCH v3 11/20] perf arm-spe: Add new function arm_spe_pkt_desc_counter()

2020-10-23 Thread André Przywara
On 22/10/2020 15:58, Leo Yan wrote:
> This patch moves out the counter packet parsing code from
> arm_spe_pkt_desc() to the new function arm_spe_pkt_desc_counter().
> 
> Signed-off-by: Leo Yan 

Confirmed by diffing the '-' lines against the '+' lines that this does not
introduce an actual change.

Reviewed-by: Andre Przywara 

Cheers,
Andre

> ---
>  .../arm-spe-decoder/arm-spe-pkt-decoder.c | 64 +++
>  1 file changed, 37 insertions(+), 27 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 1fc07c693640..023bcc9be3cc 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -293,6 +293,42 @@ static int arm_spe_pkt_desc_addr(const struct 
> arm_spe_pkt *packet,
>   }
>  }
>  
> +static int arm_spe_pkt_desc_counter(const struct arm_spe_pkt *packet,
> + char *buf, size_t buf_len)
> +{
> + u64 payload = packet->payload;
> + const char *name = arm_spe_pkt_name(packet->type);
> + size_t blen = buf_len;
> + int ret;
> +
> + ret = arm_spe_pkt_snprintf(&buf, &blen, "%s %d ", name,
> +(unsigned short)payload);
> + if (ret < 0)
> + return ret;
> +
> + switch (packet->index) {
> + case 0:
> + ret = arm_spe_pkt_snprintf(&buf, &blen, "TOT");
> + if (ret < 0)
> + return ret;
> + break;
> + case 1:
> + ret = arm_spe_pkt_snprintf(&buf, &blen, "ISSUE");
> + if (ret < 0)
> + return ret;
> + break;
> + case 2:
> + ret = arm_spe_pkt_snprintf(&buf, &blen, "XLAT");
> + if (ret < 0)
> + return ret;
> + break;
> + default:
> + break;
> + }
> +
> + return buf_len - blen;
> +}
> +
>  int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
>size_t buf_len)
>  {
> @@ -435,33 +471,7 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, 
> char *buf,
>   return arm_spe_pkt_snprintf(&buf, &blen, "%s 0x%lx el%d",
>   name, (unsigned long)payload, idx + 
> 1);
>   case ARM_SPE_COUNTER:
> - ret = arm_spe_pkt_snprintf(&buf, &blen, "%s %d ", name,
> -(unsigned short)payload);
> - if (ret < 0)
> - return ret;
> -
> - switch (idx) {
> - case 0:
> - ret = arm_spe_pkt_snprintf(&buf, &blen, "TOT");
> - if (ret < 0)
> - return ret;
> - break;
> - case 1:
> - ret = arm_spe_pkt_snprintf(&buf, &blen, "ISSUE");
> - if (ret < 0)
> - return ret;
> - break;
> - case 2:
> - ret = arm_spe_pkt_snprintf(&buf, &blen, "XLAT");
> - if (ret < 0)
> - return ret;
> - break;
> - default:
> - break;
> - }
> -
> - return buf_len - blen;
> -
> + return arm_spe_pkt_desc_counter(packet, buf, buf_len);
>   default:
>   break;
>   }
> 
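
The append-and-track pattern used by the new function can be sketched in isolation. This is an assumption-laden illustration, not the perf implementation: pkt_append() is a hypothetical helper in the spirit of arm_spe_pkt_snprintf(), the clamping on truncation is a choice made for this sketch, and the "COUNTER" literal stands in for whatever arm_spe_pkt_name() returns.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical append helper: format at *buf_p, then advance the
 * pointer and shrink the remaining length by what was written.
 */
static int pkt_append(char **buf_p, size_t *blen, const char *fmt, ...)
{
	va_list ap;
	int ret;

	va_start(ap, fmt);
	ret = vsnprintf(*buf_p, *blen, fmt, ap);
	va_end(ap);
	if (ret < 0)
		return ret;
	/* vsnprintf() reports the untruncated length; clamp to what fits. */
	if ((size_t)ret >= *blen)
		ret = *blen ? (int)(*blen - 1) : 0;
	*buf_p += ret;
	*blen -= ret;
	return ret;
}

/* Describe a counter packet; returns the number of characters written. */
static int desc_counter(unsigned int index, unsigned short value,
			char *buf, size_t buf_len)
{
	static const char * const names[] = { "TOT", "ISSUE", "XLAT" };
	size_t blen = buf_len;
	int ret;

	ret = pkt_append(&buf, &blen, "COUNTER %d ", value);
	if (ret < 0)
		return ret;
	if (index < 3) {
		ret = pkt_append(&buf, &blen, "%s", names[index]);
		if (ret < 0)
			return ret;
	}
	return (int)(buf_len - blen);
}
```

The name table replaces the three near-identical switch cases; whether that reads better than the switch in the patch is a matter of taste.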



Re: [PATCH v3 07/20] perf arm-spe: Refactor packet header parsing

2020-10-23 Thread André Przywara
On 22/10/2020 15:58, Leo Yan wrote:

Hi,

> The packet header parsing uses the hard coded values and it uses nested
> if-else statements.
> 
> To improve the readability, this patch refactors the macros for packet
> header format so it removes the hard coded values.  Furthermore, based
> on the new mask macros it reduces the nested if-else statements and
> changes to use the flat conditions checking, this is directive and can
> easily map to the descriptions in ARMv8-a architecture reference manual
> (ARM DDI 0487E.a), chapter 'D10.1.5 Statistical Profiling Extension
> protocol packet headers'.
> 
> Signed-off-by: Leo Yan 

Compared against the previous version, which I had checked already
against the manual. Thanks for the fixes!

Reviewed-by: Andre Przywara 

Thanks,
Andre


> ---
>  .../arm-spe-decoder/arm-spe-pkt-decoder.c | 92 +--
>  .../arm-spe-decoder/arm-spe-pkt-decoder.h | 20 
>  2 files changed, 61 insertions(+), 51 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index c7b6dc016f11..6f2329990729 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -16,28 +16,6 @@
>  #define NS_FLAG  BIT(63)
>  #define EL_FLAG  (BIT(62) | BIT(61))
>  
> -#define SPE_HEADER0_PAD  0x0
> -#define SPE_HEADER0_END  0x1
> -#define SPE_HEADER0_ADDRESS  0x30 /* address packet (short) */
> -#define SPE_HEADER0_ADDRESS_MASK 0x38
> -#define SPE_HEADER0_COUNTER  0x18 /* counter packet (short) */
> -#define SPE_HEADER0_COUNTER_MASK 0x38
> -#define SPE_HEADER0_TIMESTAMP0x71
> -#define SPE_HEADER0_TIMESTAMP0x71
> -#define SPE_HEADER0_EVENTS   0x2
> -#define SPE_HEADER0_EVENTS_MASK  0xf
> -#define SPE_HEADER0_SOURCE   0x3
> -#define SPE_HEADER0_SOURCE_MASK  0xf
> -#define SPE_HEADER0_CONTEXT  0x24
> -#define SPE_HEADER0_CONTEXT_MASK 0x3c
> -#define SPE_HEADER0_OP_TYPE  0x8
> -#define SPE_HEADER0_OP_TYPE_MASK 0x3c
> -#define SPE_HEADER1_ALIGNMENT0x0
> -#define SPE_HEADER1_ADDRESS  0xb0 /* address packet (extended) */
> -#define SPE_HEADER1_ADDRESS_MASK 0xf8
> -#define SPE_HEADER1_COUNTER  0x98 /* counter packet (extended) */
> -#define SPE_HEADER1_COUNTER_MASK 0xf8
> -
>  #if __BYTE_ORDER == __BIG_ENDIAN
>  #define le16_to_cpu bswap_16
>  #define le32_to_cpu bswap_32
> @@ -200,46 +178,58 @@ static int arm_spe_get_addr(const unsigned char *buf, 
> size_t len,
>  static int arm_spe_do_get_packet(const unsigned char *buf, size_t len,
>struct arm_spe_pkt *packet)
>  {
> - unsigned int byte;
> + unsigned int hdr;
> + unsigned char ext_hdr = 0;
>  
>   memset(packet, 0, sizeof(struct arm_spe_pkt));
>  
>   if (!len)
>   return ARM_SPE_NEED_MORE_BYTES;
>  
> - byte = buf[0];
> - if (byte == SPE_HEADER0_PAD)
> + hdr = buf[0];
> +
> + if (hdr == SPE_HEADER0_PAD)
>   return arm_spe_get_pad(packet);
> - else if (byte == SPE_HEADER0_END) /* no timestamp at end of record */
> +
> + if (hdr == SPE_HEADER0_END) /* no timestamp at end of record */
>   return arm_spe_get_end(packet);
> - else if (byte & 0xc0 /* 0y11xx */) {
> - if (byte & 0x80) {
> - if ((byte & SPE_HEADER0_ADDRESS_MASK) == 
> SPE_HEADER0_ADDRESS)
> - return arm_spe_get_addr(buf, len, 0, packet);
> - if ((byte & SPE_HEADER0_COUNTER_MASK) == 
> SPE_HEADER0_COUNTER)
> - return arm_spe_get_counter(buf, len, 0, packet);
> - } else
> - if (byte == SPE_HEADER0_TIMESTAMP)
> - return arm_spe_get_timestamp(buf, len, packet);
> - else if ((byte & SPE_HEADER0_EVENTS_MASK) == 
> SPE_HEADER0_EVENTS)
> - return arm_spe_get_events(buf, len, packet);
> - else if ((byte & SPE_HEADER0_SOURCE_MASK) == 
> SPE_HEADER0_SOURCE)
> - return arm_spe_get_data_source(buf, len, 
> packet);
> - else if ((byte & SPE_HEADER0_CONTEXT_MASK) == 
> SPE_HEADER0_CONTEXT)
> - return arm_spe_get_context(buf, len, packet);
> - else if ((byte & SPE_HEADER0_OP_TYPE_MASK) == 
> SPE_HEADER0_OP_TYPE)
> - return arm_spe_get_op_type(buf, len, packet);
> - } else if ((byte & 0xe0) == 0x20 /* 0y001x */) {
> - /* 16-bit header */
> - byte = buf[1];
> - if (byte == SPE_HEADER1_ALIGNMENT)
> +
> + if (hdr == SPE_HEADER0_TIMESTAMP)
> + return arm_spe_get_timestamp(buf, len, packet

Re: [PATCH v3 03/20] perf arm-spe: Refactor payload size calculation

2020-10-23 Thread André Przywara
On 22/10/2020 15:57, Leo Yan wrote:

Hi Leo,

> This patch defines macro to extract "sz" field from header, and renames
> the function payloadlen() to arm_spe_payload_len().
> 
> Signed-off-by: Leo Yan 
> ---
>  .../util/arm-spe-decoder/arm-spe-pkt-decoder.c | 18 +-
>  .../util/arm-spe-decoder/arm-spe-pkt-decoder.h |  3 +++
>  2 files changed, 12 insertions(+), 9 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 7c7b5eb09fba..4294c133a465 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -69,22 +69,22 @@ const char *arm_spe_pkt_name(enum arm_spe_pkt_type type)
>   return arm_spe_packet_name[type];
>  }
>  
> -/* return ARM SPE payload size from its encoding,
> - * which is in bits 5:4 of the byte.
> - * 00 : byte
> - * 01 : halfword (2)
> - * 10 : word (4)
> - * 11 : doubleword (8)
> +/*
> + * Extracts the field "sz" from header bits and converts to bytes:
> + *   00 : byte (1)
> + *   01 : halfword (2)
> + *   10 : word (4)
> + *   11 : doubleword (8)
>   */
> -static int payloadlen(unsigned char byte)
> +static unsigned int arm_spe_payload_len(unsigned char hdr)
>  {
> - return 1 << ((byte & 0x30) >> 4);
> + return 1 << SPE_HEADER_SZ(hdr);

I know, I know, I asked for this, but now looking again at it - and
after having seen the whole series:
This is now really trivial, and there are just two users? And
SPE_HEADER_SZ() is only used in here?

So either you just stuff the "1U << .." into the callers of
arm_spe_payload_len(), or indeed put all of this into one macro (as you
had originally).

Apologies for the back and forth, but I didn't realise how this would
eventually be used; I only saw the transition from function to
macro.

But please use 1U << .., signed shifts are treacherous.

>  }
>  
>  static int arm_spe_get_payload(const unsigned char *buf, size_t len,
>  struct arm_spe_pkt *packet)
>  {
> - size_t payload_len = payloadlen(buf[0]);
> + size_t payload_len = arm_spe_payload_len(buf[0]);
>  
>   if (len < 1 + payload_len)
>   return ARM_SPE_NEED_MORE_BYTES;
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> index 4c870521b8eb..e9ea8e3ead5d 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> @@ -9,6 +9,7 @@
>  
>  #include 
>  #include 
> +#include 
>  
>  #define ARM_SPE_PKT_DESC_MAX 256
>  
> @@ -36,6 +37,8 @@ struct arm_spe_pkt {
>   uint64_tpayload;
>  };
>  
> +#define SPE_HEADER_SZ(val)   ((val & GENMASK_ULL(5, 4)) >> 4)

If you keep this, please put parentheses around "val", so that an
expression passed as the argument is evaluated as a whole.

Cheers,
Andre

> +
>  #define SPE_ADDR_PKT_HDR_INDEX_INS   (0x0)
>  #define SPE_ADDR_PKT_HDR_INDEX_BRANCH(0x1)
>  #define SPE_ADDR_PKT_HDR_INDEX_DATA_VIRT (0x2)
> 
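
A minimal sketch of the two review points above (unsigned shift, parenthesised macro argument), folded into a single macro. The names SPE_HDR_SZ_SHIFT, SPE_HDR_SZ_MASK, SPE_HDR_GET_SZ() and SPE_PAYLOAD_LEN() are hypothetical and not taken from the patch; the "sz" field position (bits 5:4) and its encoding come from the comment in the quoted diff.

```c
/* "sz" lives in bits 5:4 of the header byte. */
#define SPE_HDR_SZ_SHIFT	4
#define SPE_HDR_SZ_MASK		0x3u

/* The argument is parenthesised so that expressions expand safely. */
#define SPE_HDR_GET_SZ(hdr)	(((hdr) >> SPE_HDR_SZ_SHIFT) & SPE_HDR_SZ_MASK)

/*
 * Payload length in bytes: 00 -> 1, 01 -> 2, 10 -> 4, 11 -> 8.
 * 1U keeps the shift in unsigned arithmetic.
 */
#define SPE_PAYLOAD_LEN(hdr)	(1U << SPE_HDR_GET_SZ(hdr))
```

A caller would then write something like "size_t payload_len = SPE_PAYLOAD_LEN(buf[0]);" instead of going through a one-line function.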



Re: [PATCH v3 04/20] perf arm-spe: Refactor arm_spe_get_events()

2020-10-23 Thread André Przywara
On 22/10/2020 15:58, Leo Yan wrote:
> In function arm_spe_get_events(), the event packet's 'index' is assigned
> as payload length, but the flow is not directive: it firstly gets the
> packet length from the return value of arm_spe_get_payload(), the value
> includes header length (1) and payload length:
> 
>   int ret = arm_spe_get_payload(buf, len, packet);
> 
> and then reduces header length from packet length, so finally get the
> payload length:
> 
>   packet->index = ret - 1;
> 
> To simplify the code, this patch directly assigns payload length to
> event packet's index; and at the end it calls arm_spe_get_payload() to
> return the payload value.
> 
> Signed-off-by: Leo Yan 

Reviewed-by: Andre Przywara 

Cheers,
Andre

> ---
>  tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 4294c133a465..f3bb8bf102aa 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -136,8 +136,6 @@ static int arm_spe_get_timestamp(const unsigned char 
> *buf, size_t len,
>  static int arm_spe_get_events(const unsigned char *buf, size_t len,
> struct arm_spe_pkt *packet)
>  {
> - int ret = arm_spe_get_payload(buf, len, packet);
> -
>   packet->type = ARM_SPE_EVENTS;
>  
>   /* we use index to identify Events with a less number of
> @@ -145,9 +143,9 @@ static int arm_spe_get_events(const unsigned char *buf, 
> size_t len,
>* LLC-REFILL, and REMOTE-ACCESS events are identified if
>* index > 1.
>*/
> - packet->index = ret - 1;
> + packet->index = arm_spe_payload_len(buf[0]);
>  
> - return ret;
> + return arm_spe_get_payload(buf, len, packet);
>  }
>  
>  static int arm_spe_get_data_source(const unsigned char *buf, size_t len,
> 



Re: [PATCH v2 14/14] perf arm-spe: Add support for ARMv8.3-SPE

2020-10-21 Thread André Przywara
On 21/10/2020 11:17, Leo Yan wrote:

Hi Leo,

> On Wed, Oct 21, 2020 at 10:26:07AM +0100, André Przywara wrote:
>> On 21/10/2020 06:10, Leo Yan wrote:
>>
>> Hi,
>>
>>> On Tue, Oct 20, 2020 at 10:54:44PM +0100, André Przywara wrote:
>>>> On 29/09/2020 14:39, Leo Yan wrote:
>>>>
>>>> Hi,
>>>>
>>>>> From: Wei Li
>>>>>
>>>>> This patch is to support Armv8.3 extension for SPE, it adds alignment
>>>>> field in the Events packet and it supports the Scalable Vector Extension
>>>>> (SVE) for Operation packet and Events packet with two additions:
>>>>>
>>>>>   - The vector length for SVE operations in the Operation Type packet;
>>>>>   - The incomplete predicate and empty predicate fields in the Events
>>>>> packet.
>>>>>
>>>>> Signed-off-by: Wei Li
>>>>> Signed-off-by: Leo Yan
>>>>> ---
>>>>>  .../arm-spe-decoder/arm-spe-pkt-decoder.c | 84 ++-
>>>>>  .../arm-spe-decoder/arm-spe-pkt-decoder.h |  6 ++
>>>>>  2 files changed, 87 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
>>>>> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
>>>>> index 05a4c74399d7..3ec381fddfcb 100644
>>>>> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
>>>>> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
>>>>> @@ -342,14 +342,73 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt
>>>>> *packet, char *buf,
>>>>>   return ret;
>>>>>   }
>>>>>   }
>>>>> + if (idx > 2) {
>>>>
>>>> As I mentioned in the other patch, I doubt this extra comparison is
>>>> useful. Does that protect us from anything?
>>>
>>> It's the same reason with Event packet which have explained for replying
>>> patch 10, the condition is to respect the SPE specification:
>>>
>>>   E[11], byte 1, bit [11], when SZ == 0b10 , or SZ == 0b11
>>>  Alignment.
>>>  ...
>>>  Otherwise this bit reads-as-zero.
>>>
>>> So we gives higher priority for checking payload size than the Event
>>> bit setting; if you have other thinking for this, please let me know.
>>
>> Ah, thanks for pointing this out. It looks like a bug in the manual
>> then, because I don't see why bit 11 should be any different from bits
>> [10:8] and bits [15:12] in this respect. And in the diagrams above you
>> clearly see bit 11 being shown even when SZ == 0b01.
>>
>> I will try to follow this up here.
> 
> Thanks for following up!

Just got the confirmation that this is indeed a bug in the manual. It
will be fixed, but since the ARM ARM isn't published on a daily basis, it
might take a while to trickle in.

Cheers,
Andre


> 
>>>>> + if (payload & SPE_EVT_PKT_ALIGNMENT) {
>>>>
>>>> Mmh, but this is bit 11, right?
>>>
>>> Yes.
>>>
>>>> So would need to go into the (idx > 1)
>>>> section (covering bits 8-15)? Another reason to ditch this comparison
>>>> above.
>>>
>>> As has explained in patch 10, idx is not the same thing with "sz"
>>> field; "idx" stands for payload length in bytes, so:
>>>
>>>   idx = 1 << sz
>>>
>>> The spec defines the sz is 2 or 3, thus idx is 4 or 8; so this is why
>>> here use the condition "(idx > 2)".
>>>
>>> I think here need to refine code for more explicit expression so can
>>> avoid confusion.  So I think it's better to condition such like:
>>>
>>>   if (payload_len >= 4) {
>>
>> Yes, that would be (or have been) more helpful, but as mentioned in the
>> other patch, I'd rather see those comparisons go entirely.
> 
> Agree.  Will remove comparisons in next version.
> 
> Thanks,
> Leo
> 
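
The relation between the "sz" field, the payload length ("idx" in the quoted code) and the event bits that a payload can carry at all can be written down as a short sketch. The helper names are hypothetical; the bit numbers for EV_ALIGNMENT and EV_PARTIAL_PREDICATE are the ones from enum arm_spe_events quoted in the patch 10 thread.

```c
/* Bit positions as quoted from enum arm_spe_events. */
enum { EV_ALIGNMENT = 11, EV_PARTIAL_PREDICATE = 17 };

/* Payload length in bytes for a given "sz" encoding: idx = 1 << sz. */
static unsigned int payload_len_from_sz(unsigned int sz)
{
	return 1U << (sz & 0x3);
}

/* Does event bit 'bit' fit into a payload with this "sz" encoding? */
static int event_bit_fits(unsigned int bit, unsigned int sz)
{
	return bit < 8 * payload_len_from_sz(sz);
}
```

Under these assumptions bit 11 already fits into a two-byte payload (sz == 0b01), which matches the observation that it belongs with bits 8-15. Bit 17 needs at least four bytes. Since a zero-extended payload reads as zero beyond its size, neither case needs an explicit length check before testing the bit.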



Re: [PATCH v2 14/14] perf arm-spe: Add support for ARMv8.3-SPE

2020-10-21 Thread André Przywara
On 21/10/2020 06:10, Leo Yan wrote:

Hi,

> On Tue, Oct 20, 2020 at 10:54:44PM +0100, André Przywara wrote:
>> On 29/09/2020 14:39, Leo Yan wrote:
>>
>> Hi,
>>
>>> From: Wei Li 
>>>
>>> This patch is to support Armv8.3 extension for SPE, it adds alignment
>>> field in the Events packet and it supports the Scalable Vector Extension
>>> (SVE) for Operation packet and Events packet with two additions:
>>>
>>>   - The vector length for SVE operations in the Operation Type packet;
>>>   - The incomplete predicate and empty predicate fields in the Events
>>> packet.
>>>
>>> Signed-off-by: Wei Li 
>>> Signed-off-by: Leo Yan 
>>> ---
>>>  .../arm-spe-decoder/arm-spe-pkt-decoder.c | 84 ++-
>>>  .../arm-spe-decoder/arm-spe-pkt-decoder.h |  6 ++
>>>  2 files changed, 87 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
>>> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
>>> index 05a4c74399d7..3ec381fddfcb 100644
>>> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
>>> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
>>> @@ -342,14 +342,73 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt 
>>> *packet, char *buf,
>>> return ret;
>>> }
>>> }
>>> +   if (idx > 2) {
>>
>> As I mentioned in the other patch, I doubt this extra comparison is
>> useful. Does that protect us from anything?
> 
> It's the same reason with Event packet which have explained for replying
> patch 10, the condition is to respect the SPE specification:
> 
>   E[11], byte 1, bit [11], when SZ == 0b10 , or SZ == 0b11
>  Alignment.
>  ...
>  Otherwise this bit reads-as-zero.
> 
> So we gives higher priority for checking payload size than the Event
> bit setting; if you have other thinking for this, please let me know.

Ah, thanks for pointing this out. It looks like a bug in the manual
then, because I don't see why bit 11 should be any different from bits
[10:8] and bits [15:12] in this respect. And in the diagrams above you
clearly see bit 11 being shown even when SZ == 0b01.

I will try to follow this up here.

>>> +   if (payload & SPE_EVT_PKT_ALIGNMENT) {
>>
>> Mmh, but this is bit 11, right?
> 
> Yes.
> 
>> So would need to go into the (idx > 1)
>> section (covering bits 8-15)? Another reason to ditch this comparison above.
> 
> As has explained in patch 10, idx is not the same thing with "sz"
> field; "idx" stands for payload length in bytes, so:
> 
>   idx = 1 << sz
> 
> The spec defines the sz is 2 or 3, thus idx is 4 or 8; so this is why
> here use the condition "(idx > 2)".
> 
> I think here need to refine code for more explicit expression so can
> avoid confusion.  So I think it's better to condition such like:
> 
>   if (payload_len >= 4) {

Yes, that would be (or have been) more helpful, but as mentioned in the
other patch, I'd rather see those comparisons go entirely.

Cheers,
Andre

> 
>  ...
> 
>   }
> 
>>> +   ret = snprintf(buf, buf_len, " ALIGNMENT");
>>> +   if (ret < 0)
>>> +   return ret;
>>> +   buf += ret;
>>> +   blen -= ret;
>>
>> Shouldn't we use the new arm_spe_pkt_snprintf() function here as well?
>> Or is there a reason that this doesn't work?
> 
> Good point.  Will change to use arm_spe_pkt_snprintf().
> 
>>> +   }
>>> +   if (payload & SPE_EVT_PKT_SVE_PARTIAL_PREDICATE) {
>>> +   ret = snprintf(buf, buf_len, " 
>>> SVE-PARTIAL-PRED");
>>> +   if (ret < 0)
>>> +   return ret;
>>> +   buf += ret;
>>> +   blen -= ret;
>>> +   }
>>> +   if (payload & SPE_EVT_PKT_SVE_EMPTY_PREDICATE) {
>>> +   ret = snprintf(buf, buf_len, " SVE-EMPTY-PRED");
>>> +   if (ret < 0)
>>> +   return ret;
>>> +   buf += ret;
>>> +   blen -= ret;
>>> +   }
>>> +   }
>>> +
>>> return buf_len - blen;
>>>  
>>> case ARM_SPE_OP_TYPE:
>>> switch (idx) {
>>> case SPE_OP_PKT_HDR_CLASS_OTHER:
>>> -   return arm_spe_pkt_snprintf(&buf, &blen,
>>> -   payload & 
>>> SPE_OP_PKT_OTHER_SUBCLASS_COND ?
>>> -   "COND-SELECT" : "INSN-OTHER");
>>> +   if ((payload & SPE_OP_PKT_OTHER_SVE_SUBCLASS_MASK) ==
>>> +   SPE_OP_PKT_OTHER_SUBCLASS_SVG_OP) {
>>> +
>>> +   ret = arm_spe_pkt_snprintf(&buf, &blen, 
>>> "SVE-OTHER");
>>> +   if (ret < 0)
>>> +

Re: [PATCH v2 10/14] perf arm-spe: Refactor event type handling

2020-10-21 Thread André Przywara
On 21/10/2020 05:54, Leo Yan wrote:

Hi Leo,

> On Tue, Oct 20, 2020 at 10:54:16PM +0100, André Przywara wrote:
>> On 29/09/2020 14:39, Leo Yan wrote:
>>
>> Hi,
>>
>>> Use macros instead of the enum values for event types, this is more
>>> directive and without bit shifting when parse packet.
>>>
>>> Signed-off-by: Leo Yan 
>>> ---
>>>  .../util/arm-spe-decoder/arm-spe-decoder.c| 16 +++---
>>>  .../util/arm-spe-decoder/arm-spe-decoder.h| 17 --
>>>  .../arm-spe-decoder/arm-spe-pkt-decoder.c | 22 +--
>>>  .../arm-spe-decoder/arm-spe-pkt-decoder.h | 16 ++
>>>  4 files changed, 35 insertions(+), 36 deletions(-)
>>>
>>> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c 
>>> b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
>>> index 9d3de163d47c..ac66e7f42a58 100644
>>> --- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
>>> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
>>> @@ -168,31 +168,31 @@ static int arm_spe_read_record(struct arm_spe_decoder 
>>> *decoder)
>>> case ARM_SPE_OP_TYPE:
>>> break;
>>> case ARM_SPE_EVENTS:
>>> -   if (payload & BIT(EV_L1D_REFILL))
>>> +   if (payload & SPE_EVT_PKT_L1D_REFILL)
>>
>> Not sure this (and the others below) are an improvement? I liked the
>> enum below, and reading BIT() here tells me that it's a bitmask.
> 
> Agreed.
> 
>>> decoder->record.type |= ARM_SPE_L1D_MISS;
>>>  
>>> -   if (payload & BIT(EV_L1D_ACCESS))
>>> +   if (payload & SPE_EVT_PKT_L1D_ACCESS)
>>> decoder->record.type |= ARM_SPE_L1D_ACCESS;
>>>  
>>> -   if (payload & BIT(EV_TLB_WALK))
>>> +   if (payload & SPE_EVT_PKT_TLB_WALK)
>>> decoder->record.type |= ARM_SPE_TLB_MISS;
>>>  
>>> -   if (payload & BIT(EV_TLB_ACCESS))
>>> +   if (payload & SPE_EVT_PKT_TLB_ACCESS)
>>> decoder->record.type |= ARM_SPE_TLB_ACCESS;
>>>  
>>> if ((idx == 2 || idx == 4 || idx == 8) &&
>>> -   (payload & BIT(EV_LLC_MISS)))
>>> +   (payload & SPE_EVT_PKT_LLC_MISS))
>>> decoder->record.type |= ARM_SPE_LLC_MISS;
>>>  
>>> if ((idx == 2 || idx == 4 || idx == 8) &&
>>> -   (payload & BIT(EV_LLC_ACCESS)))
>>> +   (payload & SPE_EVT_PKT_LLC_ACCESS))
>>> decoder->record.type |= ARM_SPE_LLC_ACCESS;
>>>  
>>> if ((idx == 2 || idx == 4 || idx == 8) &&
>>> -   (payload & BIT(EV_REMOTE_ACCESS)))
>>> +   (payload & SPE_EVT_PKT_REMOTE_ACCESS))
>>> decoder->record.type |= ARM_SPE_REMOTE_ACCESS;
>>>  
>>> -   if (payload & BIT(EV_MISPRED))
>>> +   if (payload & SPE_EVT_PKT_MISPREDICTED)
>>> decoder->record.type |= ARM_SPE_BRANCH_MISS;
>>>  
>>> break;
>>> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h 
>>> b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
>>> index a5111a8d4360..24727b8ca7ff 100644
>>> --- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
>>> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
>>> @@ -13,23 +13,6 @@
>>>  
>>>  #include "arm-spe-pkt-decoder.h"
>>>  
>>> -enum arm_spe_events {
>>> -   EV_EXCEPTION_GEN= 0,
>>> -   EV_RETIRED  = 1,
>>> -   EV_L1D_ACCESS   = 2,
>>> -   EV_L1D_REFILL   = 3,
>>> -   EV_TLB_ACCESS   = 4,
>>> -   EV_TLB_WALK = 5,
>>> -   EV_NOT_TAKEN= 6,
>>> -   EV_MISPRED  = 7,
>>> -   EV_LLC_ACCESS   = 8,
>>> -   EV_LLC_MISS = 9,
>>> -   EV_REMOTE_ACCESS= 10,
>>> -   EV_ALIGNMENT= 11,
>>> -   EV_PARTIAL_PREDICATE= 17,
>>> -   EV_EMPTY_PREDICATE  = 18,
>>> -};
>>
>> So what about keeping this, but moving it into the other header file?
> 
> Will do.  This is more directive, especially if we consider every bit
> indicates an event type :)
> 
>> coding-style.rst says: "Enums are preferred when defining several
>> related constants."
> 
> Thanks for pasting the coding style, it's useful.  I agree that using
> BIT() + enum is better form, will refine the patch for this.
> 
>>> -
>>>  enum arm_spe_sample_type {
>>> ARM_SPE_L1D_ACCESS  = 1 << 0,
>>> ARM_SPE_L1D_MISS= 1 << 1,
>>> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
>>> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
>>> index ed0f4c74dfc5..b8f343320abf 100644
>>> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
>>> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
>>> @@ -284,58 +284,58 @@ int arm_spe_pkt_desc(const s

Re: [PATCH v2 12/14] perf arm-spe: Add more sub classes for operation packet

2020-10-20 Thread André Przywara
On 29/09/2020 14:39, Leo Yan wrote:

Hi,

> For the operation type packet payload with load/store class, it misses
> to support these sub classes:
> 
>   - A load/store targeting the general-purpose registers;
>   - A load/store targeting unspecified registers;
>   - The ARMv8.4 nested virtualisation extension can redirect system
> register accesses to a memory page controlled by the hypervisor.
> The SPE profiling feature in newer implementations can tag those
> memory accesses accordingly.
> 
> Add the bit pattern describing load/store sub classes, so that the perf
> tool can decode it properly.
> 
> Inspired by Andre Przywara, refined the commit log and code for more
> clear description.
> 
> Co-developed-by: Andre Przywara 
> Signed-off-by: Leo Yan 
> ---
>  .../util/arm-spe-decoder/arm-spe-pkt-decoder.c| 15 +++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index a848c784f4cf..57a2d5494838 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -378,6 +378,21 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, 
> char *buf,
>   ret = arm_spe_pkt_snprintf(&buf, &blen, " 
> SIMD-FP");
>   if (ret < 0)
>   return ret;
> + } else if ((payload & SPE_OP_PKT_LDST_SUBCLASS_MASK) ==

These three and the one above use the same mask, should this go into a
switch case? Move this block to the end, then do:
switch (payload & SPE_OP_PKT_LDST_SUBCLASS_MASK) {
case SPE_OP_PKT_LDST_SUBCLASS_GP_REG:
...
case SPE_OP_PKT_LDST_SUBCLASS_UNSPEC_REG:
...
Maybe even assign just a string pointer inside, then have one snprintf.
Haven't checked if that *really* looks better, though.

Also those later checks are quite indented, shall those be moved to
helper functions? Again, just an idea.

Cheers,
Andre


> + SPE_OP_PKT_LDST_SUBCLASS_GP_REG) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " 
> GP-REG");
> + if (ret < 0)
> + return ret;
> + } else if ((payload & SPE_OP_PKT_LDST_SUBCLASS_MASK) ==
> + SPE_OP_PKT_LDST_SUBCLASS_UNSPEC_REG) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " 
> UNSPEC-REG");
> + if (ret < 0)
> + return ret;
> + } else if ((payload & SPE_OP_PKT_LDST_SUBCLASS_MASK) ==
> + SPE_OP_PKT_LDST_SUBCLASS_NV_SYSREG) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " 
> NV-SYSREG");
> + if (ret < 0)
> + return ret;
>   }
>  
>   return buf_len - blen;
> 
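
A rough sketch of the switch-based variant suggested above, with the subclass name picked inside the switch so that a single snprintf call remains in the caller. All LDST_* names and values here are placeholders for illustration only and were not checked against arm-spe-pkt-decoder.h; ldst_subclass_name() is a hypothetical helper.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder mask and values, for illustration only. */
#define LDST_SUBCLASS_MASK		0xfeu
#define LDST_SUBCLASS_GP_REG		0x00u
#define LDST_SUBCLASS_SIMD_FP		0x04u
#define LDST_SUBCLASS_UNSPEC_REG	0x10u
#define LDST_SUBCLASS_NV_SYSREG		0x30u

/* Return the subclass suffix, or NULL if the pattern is not recognised. */
static const char *ldst_subclass_name(uint64_t payload)
{
	switch (payload & LDST_SUBCLASS_MASK) {
	case LDST_SUBCLASS_GP_REG:
		return " GP-REG";
	case LDST_SUBCLASS_SIMD_FP:
		return " SIMD-FP";
	case LDST_SUBCLASS_UNSPEC_REG:
		return " UNSPEC-REG";
	case LDST_SUBCLASS_NV_SYSREG:
		return " NV-SYSREG";
	default:
		return NULL;
	}
}
```

The caller would check for NULL and otherwise pass the returned string to one arm_spe_pkt_snprintf() call, which also removes most of the indentation mentioned above.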



Re: [PATCH v2 14/14] perf arm-spe: Add support for ARMv8.3-SPE

2020-10-20 Thread André Przywara
On 29/09/2020 14:39, Leo Yan wrote:

Hi,

> From: Wei Li 
> 
> This patch is to support Armv8.3 extension for SPE, it adds alignment
> field in the Events packet and it supports the Scalable Vector Extension
> (SVE) for Operation packet and Events packet with two additions:
> 
>   - The vector length for SVE operations in the Operation Type packet;
>   - The incomplete predicate and empty predicate fields in the Events
> packet.
> 
> Signed-off-by: Wei Li 
> Signed-off-by: Leo Yan 
> ---
>  .../arm-spe-decoder/arm-spe-pkt-decoder.c | 84 ++-
>  .../arm-spe-decoder/arm-spe-pkt-decoder.h |  6 ++
>  2 files changed, 87 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 05a4c74399d7..3ec381fddfcb 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -342,14 +342,73 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, 
> char *buf,
>   return ret;
>   }
>   }
> + if (idx > 2) {

As I mentioned in the other patch, I doubt this extra comparison is
useful. Does that protect us from anything?

> + if (payload & SPE_EVT_PKT_ALIGNMENT) {

Mmh, but this is bit 11, right? So it would need to go into the (idx > 1)
section (covering bits 8-15)? Another reason to ditch the comparison above.
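
To spell out the arithmetic: a rough sketch, assuming idx holds the
payload size in bytes (which is what patch 03 of this series assigns to
it). The helper name evt_bit_in_payload() is made up for illustration and
is not part of the patch.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: an Events payload of 'size' bytes carries
 * event bits 0 .. (8 * size - 1), so a given bit can only ever be
 * set if it lies below that limit.
 */
static bool evt_bit_in_payload(unsigned int size, unsigned int bit)
{
	return bit < 8 * size;
}
```

By that reasoning bit 11 (ALIGNMENT) is already reachable with a 2-byte
payload, while bits 17 and 18 (the SVE predicate events) need a 4-byte
one.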

> + ret = snprintf(buf, buf_len, " ALIGNMENT");
> + if (ret < 0)
> + return ret;
> + buf += ret;
> + blen -= ret;

Shouldn't we use the new arm_spe_pkt_snprintf() function here as well?
Or is there a reason that this doesn't work?

> + }
> + if (payload & SPE_EVT_PKT_SVE_PARTIAL_PREDICATE) {
> + ret = snprintf(buf, buf_len, " 
> SVE-PARTIAL-PRED");
> + if (ret < 0)
> + return ret;
> + buf += ret;
> + blen -= ret;
> + }
> + if (payload & SPE_EVT_PKT_SVE_EMPTY_PREDICATE) {
> + ret = snprintf(buf, buf_len, " SVE-EMPTY-PRED");
> + if (ret < 0)
> + return ret;
> + buf += ret;
> + blen -= ret;
> + }
> + }
> +
>   return buf_len - blen;
>  
>   case ARM_SPE_OP_TYPE:
>   switch (idx) {
>   case SPE_OP_PKT_HDR_CLASS_OTHER:
> - return arm_spe_pkt_snprintf(&buf, &blen,
> - payload & 
> SPE_OP_PKT_OTHER_SUBCLASS_COND ?
> - "COND-SELECT" : "INSN-OTHER");
> + if ((payload & SPE_OP_PKT_OTHER_SVE_SUBCLASS_MASK) ==
> + SPE_OP_PKT_OTHER_SUBCLASS_SVG_OP) {
> +
> + ret = arm_spe_pkt_snprintf(&buf, &blen, 
> "SVE-OTHER");
> + if (ret < 0)
> + return ret;
> +
> + /* Effective vector length: step is 32 bits */
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " EVLEN 
> %d",
> + 32 << ((payload & 
> SPE_OP_PKT_SVE_EVL_MASK) >>
> + SPE_OP_PKT_SVE_EVL_SHIFT));

Can you move this into a macro, and add a comment about how this works?
People might get confused over the "32 << something".
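
A possible shape for such a macro, as a sketch only. The mask and shift
names are taken from the patch, but their values here (EVL in payload
bits [6:4]) are my assumption, and SPE_OP_PKT_SVE_EVL() is a made-up
name.

```c
/*
 * Assumed layout: EVL lives in payload bits [6:4]. An encoded value n
 * stands for a vector length of 32 * 2^n bits, which is exactly what
 * "32 << n" computes: 0 -> 32, 1 -> 64, 2 -> 128, ... 6 -> 2048.
 */
#define SPE_OP_PKT_SVE_EVL_SHIFT	4
#define SPE_OP_PKT_SVE_EVL_MASK		(0x7ULL << SPE_OP_PKT_SVE_EVL_SHIFT)
#define SPE_OP_PKT_SVE_EVL(payload)	\
	(32 << (((payload) & SPE_OP_PKT_SVE_EVL_MASK) >> SPE_OP_PKT_SVE_EVL_SHIFT))
```

The caller would then shrink to something like
arm_spe_pkt_snprintf(&buf, &blen, " EVLEN %d", SPE_OP_PKT_SVE_EVL(payload)).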

Cheers,
Andre

> + if (ret < 0)
> + return ret;
> +
> + if (payload & SPE_OP_PKT_SVE_FP) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, 
> " FP");
> + if (ret < 0)
> + return ret;
> + }
> + if (payload & SPE_OP_PKT_SVE_PRED) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, 
> " PRED");
> + if (ret < 0)
> + return ret;
> + }
> + } else {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, 
> "OTHER");
> + if (ret < 0)
> + return ret;
> +
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " %s",
> +payload & 
> SPE_OP_PKT_OTHER_SUBCLASS_COND ?
> +

Re: [PATCH v2 10/14] perf arm-spe: Refactor event type handling

2020-10-20 Thread André Przywara
On 29/09/2020 14:39, Leo Yan wrote:

Hi,

> Use macros instead of the enum values for event types, this is more
> directive and without bit shifting when parse packet.
> 
> Signed-off-by: Leo Yan 
> ---
>  .../util/arm-spe-decoder/arm-spe-decoder.c| 16 +++---
>  .../util/arm-spe-decoder/arm-spe-decoder.h| 17 --
>  .../arm-spe-decoder/arm-spe-pkt-decoder.c | 22 +--
>  .../arm-spe-decoder/arm-spe-pkt-decoder.h | 16 ++
>  4 files changed, 35 insertions(+), 36 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> index 9d3de163d47c..ac66e7f42a58 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> @@ -168,31 +168,31 @@ static int arm_spe_read_record(struct arm_spe_decoder 
> *decoder)
>   case ARM_SPE_OP_TYPE:
>   break;
>   case ARM_SPE_EVENTS:
> - if (payload & BIT(EV_L1D_REFILL))
> + if (payload & SPE_EVT_PKT_L1D_REFILL)

I'm not sure this (and the others below) is an improvement. I liked the
enum below, and reading BIT() here tells me that it's a bitmask.

>   decoder->record.type |= ARM_SPE_L1D_MISS;
>  
> - if (payload & BIT(EV_L1D_ACCESS))
> + if (payload & SPE_EVT_PKT_L1D_ACCESS)
>   decoder->record.type |= ARM_SPE_L1D_ACCESS;
>  
> - if (payload & BIT(EV_TLB_WALK))
> + if (payload & SPE_EVT_PKT_TLB_WALK)
>   decoder->record.type |= ARM_SPE_TLB_MISS;
>  
> - if (payload & BIT(EV_TLB_ACCESS))
> + if (payload & SPE_EVT_PKT_TLB_ACCESS)
>   decoder->record.type |= ARM_SPE_TLB_ACCESS;
>  
>   if ((idx == 2 || idx == 4 || idx == 8) &&
> - (payload & BIT(EV_LLC_MISS)))
> + (payload & SPE_EVT_PKT_LLC_MISS))
>   decoder->record.type |= ARM_SPE_LLC_MISS;
>  
>   if ((idx == 2 || idx == 4 || idx == 8) &&
> - (payload & BIT(EV_LLC_ACCESS)))
> + (payload & SPE_EVT_PKT_LLC_ACCESS))
>   decoder->record.type |= ARM_SPE_LLC_ACCESS;
>  
>   if ((idx == 2 || idx == 4 || idx == 8) &&
> - (payload & BIT(EV_REMOTE_ACCESS)))
> + (payload & SPE_EVT_PKT_REMOTE_ACCESS))
>   decoder->record.type |= ARM_SPE_REMOTE_ACCESS;
>  
> - if (payload & BIT(EV_MISPRED))
> + if (payload & SPE_EVT_PKT_MISPREDICTED)
>   decoder->record.type |= ARM_SPE_BRANCH_MISS;
>  
>   break;
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h 
> b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
> index a5111a8d4360..24727b8ca7ff 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
> @@ -13,23 +13,6 @@
>  
>  #include "arm-spe-pkt-decoder.h"
>  
> -enum arm_spe_events {
> - EV_EXCEPTION_GEN= 0,
> - EV_RETIRED  = 1,
> - EV_L1D_ACCESS   = 2,
> - EV_L1D_REFILL   = 3,
> - EV_TLB_ACCESS   = 4,
> - EV_TLB_WALK = 5,
> - EV_NOT_TAKEN= 6,
> - EV_MISPRED  = 7,
> - EV_LLC_ACCESS   = 8,
> - EV_LLC_MISS = 9,
> - EV_REMOTE_ACCESS= 10,
> - EV_ALIGNMENT= 11,
> - EV_PARTIAL_PREDICATE= 17,
> - EV_EMPTY_PREDICATE  = 18,
> -};

So what about keeping this, but moving it into the other header file?
coding-style.rst says: "Enums are preferred when defining several
related constants."
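
Roughly what I mean, as a sketch: the enum keeps the bit positions, and
BIT() at the point of use makes the mask visible. BIT() is defined
locally here only as a stand-in for the kernel macro, the enum is a
subset of the one above, and spe_event_is_set() is a hypothetical helper.

```c
/* Local stand-in for the kernel's BIT() macro */
#define BIT(n)	(1ULL << (n))

/* Bit positions within the Events packet payload (subset) */
enum arm_spe_events {
	EV_EXCEPTION_GEN	= 0,
	EV_RETIRED		= 1,
	EV_L1D_ACCESS		= 2,
	EV_L1D_REFILL		= 3,
};

/* Hypothetical helper: test a single event bit in a payload */
static int spe_event_is_set(unsigned long long payload,
			    enum arm_spe_events ev)
{
	return !!(payload & BIT(ev));
}
```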

> -
>  enum arm_spe_sample_type {
>   ARM_SPE_L1D_ACCESS  = 1 << 0,
>   ARM_SPE_L1D_MISS= 1 << 1,
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index ed0f4c74dfc5..b8f343320abf 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -284,58 +284,58 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, 
> char *buf,
>   if (ret < 0)
>   return ret;
>  
> - if (payload & 0x1) {
> + if (payload & SPE_EVT_PKT_GEN_EXCEPTION) {

Having the bitmask here directly is indeed not very nice, and it is
error-prone. But I would rather see the above solution:
if (payload & BIT(EV_EXCEPTION_GEN)) {

>   ret = arm_spe_pkt_snprintf(&buf, &blen, " 
> EXCEPTION-GEN");
>   if 

Re: [PATCH v2 09/14] perf arm-spe: Refactor counter packet handling

2020-10-20 Thread André Przywara
On 29/09/2020 14:39, Leo Yan wrote:

Hi,

> This patch defines macros for counter packet header, and uses macro to
> replace hard code values for packet parsing.
> 
> Signed-off-by: Leo Yan 
> ---
>  .../util/arm-spe-decoder/arm-spe-pkt-decoder.c  | 17 ++---
>  .../util/arm-spe-decoder/arm-spe-pkt-decoder.h  |  9 +
>  2 files changed, 19 insertions(+), 7 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 00a2cd1af422..ed0f4c74dfc5 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -150,10 +150,13 @@ static int arm_spe_get_counter(const unsigned char 
> *buf, size_t len,
>  const unsigned char ext_hdr, struct arm_spe_pkt 
> *packet)
>  {
>   packet->type = ARM_SPE_COUNTER;
> - if (ext_hdr)
> - packet->index = ((buf[0] & 0x3) << 3) | (buf[1] & 0x7);
> - else
> - packet->index = buf[0] & 0x7;
> + if (ext_hdr) {
> + packet->index  = (buf[1] & SPE_CNT_PKT_HDR_INDEX_MASK);
> + packet->index |= ((buf[0] & SPE_CNT_PKT_HDR_EXT_INDEX_MASK)
> + << SPE_CNT_PKT_HDR_EXT_INDEX_SHIFT);
> + } else {
> + packet->index = buf[0] & SPE_CNT_PKT_HDR_INDEX_MASK;

That looks suspiciously similar to the extended header in the address
packet. Can you use the same name for that?
And, similar to the address packet, what about:
packet->index |= SPE_PKT_EXT_HEADER_INDEX(buf[0]);

(merging the mask and the shift in the macro definition)
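
As a sketch of that idea: the definitions below are assumptions derived
from the open-coded "(buf[0] & 0x3) << 3" and "buf[1] & 0x7" being
replaced. SPE_PKT_HDR_INDEX() and spe_pkt_index() are made-up names.

```c
/* Assumed definitions: mask and shift folded into one accessor each */
#define SPE_PKT_HDR_INDEX(h)		((h) & 0x7)
#define SPE_PKT_EXT_HEADER_INDEX(h)	(((h) & 0x3) << 3)

/* Hypothetical helper shared by the address and counter packets */
static unsigned int spe_pkt_index(const unsigned char *buf, int ext_hdr)
{
	if (ext_hdr)
		return SPE_PKT_EXT_HEADER_INDEX(buf[0]) |
		       SPE_PKT_HDR_INDEX(buf[1]);

	return SPE_PKT_HDR_INDEX(buf[0]);
}
```

That way both packet types would share one definition instead of
carrying two sets of identically valued macros.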

> + }
>  
>   return arm_spe_get_payload(buf, len, ext_hdr, packet);
>  }
> @@ -431,17 +434,17 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, 
> char *buf,
>   return ret;
>  
>   switch (idx) {
> - case 0:
> + case SPE_CNT_PKT_HDR_INDEX_TOTAL_LAT:
>   ret = arm_spe_pkt_snprintf(&buf, &blen, "TOT");
>   if (ret < 0)
>   return ret;
>   break;
> - case 1:
> + case SPE_CNT_PKT_HDR_INDEX_ISSUE_LAT:
>   ret = arm_spe_pkt_snprintf(&buf, &blen, "ISSUE");
>   if (ret < 0)
>   return ret;
>   break;
> - case 2:
> + case SPE_CNT_PKT_HDR_INDEX_TRANS_LAT:
>   ret = arm_spe_pkt_snprintf(&buf, &blen, "XLAT");
>   if (ret < 0)
>   return ret;
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> index 62db4ff91832..18667a63f5ba 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> @@ -89,6 +89,15 @@ struct arm_spe_pkt {
>  /* Context packet header */
>  #define SPE_CTX_PKT_HDR_INDEX_MASK   GENMASK_ULL(1, 0)
>  
> +/* Counter packet header */
> +#define SPE_CNT_PKT_HDR_INDEX_MASK   GENMASK_ULL(2, 0)
> +#define SPE_CNT_PKT_HDR_INDEX_TOTAL_LAT  (0x0)
> +#define SPE_CNT_PKT_HDR_INDEX_ISSUE_LAT  (0x1)
> +#define SPE_CNT_PKT_HDR_INDEX_TRANS_LAT  (0x2)

I think the Linux kernel coding style does not mention parentheses just
around numbers, so just 0x2 would suffice, for instance.
See section 12) in Documentation/process/coding-style.rst

Cheers,
Andre


> +
> +#define SPE_CNT_PKT_HDR_EXT_INDEX_MASK   GENMASK_ULL(1, 0)
> +#define SPE_CNT_PKT_HDR_EXT_INDEX_SHIFT  (3)
> +
>  const char *arm_spe_pkt_name(enum arm_spe_pkt_type);
>  
>  int arm_spe_get_packet(const unsigned char *buf, size_t len,
> 



Re: [PATCH v2 08/14] perf arm-spe: Refactor context packet handling

2020-10-20 Thread André Przywara
On 29/09/2020 14:39, Leo Yan wrote:
> Minor refactoring to use macro for index mask.
> 
> Signed-off-by: Leo Yan 

Reviewed-by: Andre Przywara 

Cheers,
Andre

> ---
>  tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c | 2 +-
>  tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h | 3 +++
>  2 files changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index b51a2207e4a0..00a2cd1af422 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -134,7 +134,7 @@ static int arm_spe_get_context(const unsigned char *buf, 
> size_t len,
>  struct arm_spe_pkt *packet)
>  {
>   packet->type = ARM_SPE_CONTEXT;
> - packet->index = buf[0] & 0x3;
> + packet->index = buf[0] & SPE_CTX_PKT_HDR_INDEX_MASK;
>   return arm_spe_get_payload(buf, len, 0, packet);
>  }
>  
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> index 88d2231c76da..62db4ff91832 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> @@ -86,6 +86,9 @@ struct arm_spe_pkt {
>  #define SPE_ADDR_PKT_INST_VA_EL2 (2)
>  #define SPE_ADDR_PKT_INST_VA_EL3 (3)
>  
> +/* Context packet header */
> +#define SPE_CTX_PKT_HDR_INDEX_MASK   GENMASK_ULL(1, 0)
> +
>  const char *arm_spe_pkt_name(enum arm_spe_pkt_type);
>  
>  int arm_spe_get_packet(const unsigned char *buf, size_t len,
> 



Re: [PATCH v2 07/14] perf arm-spe: Refactor address packet handling

2020-10-19 Thread André Przywara
On 29/09/2020 14:39, Leo Yan wrote:

Hi Leo,

> This patch is to refactor address packet handling, it defines macros for
> address packet's header and payload, these macros are used by decoder
> and the dump flow.

So I was thinking about these next few patches a bit. I understand that
it's common ground to not use numbers in code directly, but put names to
them (and there is good rationale for that).

However those long and complicated names don't make it really easier to
read, I think.

See below for an idea:

> Signed-off-by: Leo Yan 
> ---
>  .../util/arm-spe-decoder/arm-spe-decoder.c| 33 ++-
>  .../arm-spe-decoder/arm-spe-pkt-decoder.c | 25 +++---
>  .../arm-spe-decoder/arm-spe-pkt-decoder.h | 27 ++-
>  3 files changed, 49 insertions(+), 36 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> index cc18a1e8c212..9d3de163d47c 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> @@ -24,36 +24,37 @@
>  
>  static u64 arm_spe_calc_ip(int index, u64 payload)
>  {
> - u8 *addr = (u8 *)&payload;
> - int ns, el;
> + u64 ns, el;

This (and the "u64 vs. u8[]" changes below) looks like a nice cleanup.

>   /* Instruction virtual address or Branch target address */
>   if (index == SPE_ADDR_PKT_HDR_INDEX_INS ||
>   index == SPE_ADDR_PKT_HDR_INDEX_BRANCH) {
> - ns = addr[7] & SPE_ADDR_PKT_NS;
> - el = (addr[7] & SPE_ADDR_PKT_EL_MASK) >> SPE_ADDR_PKT_EL_OFFSET;
> + ns = payload & SPE_ADDR_PKT_INST_VA_NS;
> + el = (payload & SPE_ADDR_PKT_INST_VA_EL_MASK)
> + >> SPE_ADDR_PKT_INST_VA_EL_SHIFT;

So if I see this correctly, this _EL_SHIFT and _EL_MASK are only used
together, and only to read values, not to construct them.
So can you fuse them together in the header file below, like:
el = SPE_ADDR_PKT_INST_VA_GET_EL(payload);

That should help readability, I guess, while still keeping the actual
numbers in one place. _SHIFT and _MASK are useful when we use them to
both extract *and construct* values, but here we only parse the buffer.

Similar for other places where you just extract bits from a bitfield or
integer.
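
A minimal sketch of such accessors. The bit positions follow the
NS_FLAG (bit 63) and EL_FLAG (bits 62 and 61) defines that this patch
removes; SPE_ADDR_PKT_INST_VA_GET_NS() is a made-up companion name.

```c
/* NS is bit 63, EL is bits [62:61] of the address payload */
#define SPE_ADDR_PKT_INST_VA_GET_NS(v)	(((v) >> 63) & 0x1)
#define SPE_ADDR_PKT_INST_VA_GET_EL(v)	(((v) >> 61) & 0x3)
```

The call site then only needs "ns = SPE_ADDR_PKT_INST_VA_GET_NS(payload);"
and "el = SPE_ADDR_PKT_INST_VA_GET_EL(payload);", with no separate
_MASK and _SHIFT names.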

Cheers,
Andre


> +
> + /* Clean highest byte */
> + payload &= SPE_ADDR_PKT_ADDR_MASK;
>  
>   /* Fill highest byte for EL1 or EL2 (VHE) mode */
> - if (ns && (el == SPE_ADDR_PKT_EL1 || el == SPE_ADDR_PKT_EL2))
> - addr[7] = 0xff;
> - /* Clean highest byte for other cases */
> - else
> - addr[7] = 0x0;
> + if (ns && (el == SPE_ADDR_PKT_INST_VA_EL1 ||
> +el == SPE_ADDR_PKT_INST_VA_EL2))
> + payload |= 0xffULL << SPE_ADDR_PKT_ADDR_BYTE7_SHIFT;
>  
>   /* Data access virtual address */
>   } else if (index == SPE_ADDR_PKT_HDR_INDEX_DATA_VIRT) {
>  
> + /* Clean tags */
> + payload &= SPE_ADDR_PKT_ADDR_MASK;
> +
>   /* Fill highest byte if bits [48..55] is 0xff */
> - if (addr[6] == 0xff)
> - addr[7] = 0xff;
> - /* Otherwise, cleanup tags */
> - else
> - addr[7] = 0x0;
> + if ((payload >> 48) == 0xffULL)
> + payload |= 0xffULL << SPE_ADDR_PKT_ADDR_BYTE7_SHIFT;
>  
>   /* Data access physical address */
>   } else if (index == SPE_ADDR_PKT_HDR_INDEX_DATA_PHYS) {
> - /* Cleanup byte 7 */
> - addr[7] = 0x0;
> + /* Clean highest byte */
> + payload &= SPE_ADDR_PKT_ADDR_MASK;
>   } else {
>   pr_err("unsupported address packet index: 0x%x\n", index);
>   }
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index e738bd04f209..b51a2207e4a0 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -13,9 +13,6 @@
>  
>  #include "arm-spe-pkt-decoder.h"
>  
> -#define NS_FLAG  BIT(63)
> -#define EL_FLAG  (BIT(62) | BIT(61))
> -
>  #if __BYTE_ORDER == __BIG_ENDIAN
>  #define le16_to_cpu bswap_16
>  #define le32_to_cpu bswap_32
> @@ -166,9 +163,10 @@ static int arm_spe_get_addr(const unsigned char *buf, 
> size_t len,
>  {
>   packet->type = ARM_SPE_ADDRESS;
>   if (ext_hdr)
> - packet->index = ((buf[0] & 0x3) << 3) | (buf[1] & 0x7);
> + packet->index = (((buf[0] & SPE_ADDR_PKT_HDR_EXT_INDEX_MASK) << 
> 3) |
> +   (buf[1] & SPE_ADDR_PKT_HDR_INDEX_MASK));
>   else
> - packet->index = buf[0] & 0x7;
> + packet->index = buf[0] & SPE_ADDR_PKT_HDR_INDEX_MASK;
>  
>   retu

Re: [PATCH v2 06/14] perf arm-spe: Refactor packet header parsing

2020-10-08 Thread André Przywara
On 29/09/2020 14:39, Leo Yan wrote:

Hi Leo,

> The packet header parsing uses the hard coded values and it uses nested
> if-else statements.
> 
> To improve the readability, this patch refactors the macros for packet
> header format so it removes the hard coded values.  Furthermore, based
> on the new mask macros it reduces the nested if-else statements and
> changes to use the flat conditions checking, this is directive and can
> easily map to the descriptions in ARMv8-a architecture reference manual
> (ARM DDI 0487E.a), chapter 'D10.1.5 Statistical Profiling Extension
> protocol packet headers'.

Yeah, that's so much better, thank you!

I checked all the bits and comparisons against the ARM ARM.

Two minor things below ...

> 
> Signed-off-by: Leo Yan 
> ---
>  .../arm-spe-decoder/arm-spe-pkt-decoder.c | 92 +--
>  .../arm-spe-decoder/arm-spe-pkt-decoder.h | 21 +
>  2 files changed, 62 insertions(+), 51 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 96b717a19163..e738bd04f209 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -16,28 +16,6 @@
>  #define NS_FLAG  BIT(63)
>  #define EL_FLAG  (BIT(62) | BIT(61))
>  
> -#define SPE_HEADER0_PAD  0x0
> -#define SPE_HEADER0_END  0x1
> -#define SPE_HEADER0_ADDRESS  0x30 /* address packet (short) */
> -#define SPE_HEADER0_ADDRESS_MASK 0x38
> -#define SPE_HEADER0_COUNTER  0x18 /* counter packet (short) */
> -#define SPE_HEADER0_COUNTER_MASK 0x38
> -#define SPE_HEADER0_TIMESTAMP0x71
> -#define SPE_HEADER0_TIMESTAMP0x71
> -#define SPE_HEADER0_EVENTS   0x2
> -#define SPE_HEADER0_EVENTS_MASK  0xf
> -#define SPE_HEADER0_SOURCE   0x3
> -#define SPE_HEADER0_SOURCE_MASK  0xf
> -#define SPE_HEADER0_CONTEXT  0x24
> -#define SPE_HEADER0_CONTEXT_MASK 0x3c
> -#define SPE_HEADER0_OP_TYPE  0x8
> -#define SPE_HEADER0_OP_TYPE_MASK 0x3c
> -#define SPE_HEADER1_ALIGNMENT0x0
> -#define SPE_HEADER1_ADDRESS  0xb0 /* address packet (extended) */
> -#define SPE_HEADER1_ADDRESS_MASK 0xf8
> -#define SPE_HEADER1_COUNTER  0x98 /* counter packet (extended) */
> -#define SPE_HEADER1_COUNTER_MASK 0xf8
> -
>  #if __BYTE_ORDER == __BIG_ENDIAN
>  #define le16_to_cpu bswap_16
>  #define le32_to_cpu bswap_32
> @@ -198,46 +176,58 @@ static int arm_spe_get_addr(const unsigned char *buf, 
> size_t len,
>  static int arm_spe_do_get_packet(const unsigned char *buf, size_t len,
>struct arm_spe_pkt *packet)
>  {
> - unsigned int byte;
> + unsigned int hdr;
> + unsigned char ext_hdr = 0;
>  
>   memset(packet, 0, sizeof(struct arm_spe_pkt));
>  
>   if (!len)
>   return ARM_SPE_NEED_MORE_BYTES;
>  
> - byte = buf[0];
> - if (byte == SPE_HEADER0_PAD)
> + hdr = buf[0];
> +
> + if (hdr == SPE_HEADER0_PAD)
>   return arm_spe_get_pad(packet);
> - else if (byte == SPE_HEADER0_END) /* no timestamp at end of record */
> +
> + if (hdr == SPE_HEADER0_END) /* no timestamp at end of record */
>   return arm_spe_get_end(packet);
> - else if (byte & 0xc0 /* 0y11xx */) {
> - if (byte & 0x80) {
> - if ((byte & SPE_HEADER0_ADDRESS_MASK) == 
> SPE_HEADER0_ADDRESS)
> - return arm_spe_get_addr(buf, len, 0, packet);
> - if ((byte & SPE_HEADER0_COUNTER_MASK) == 
> SPE_HEADER0_COUNTER)
> - return arm_spe_get_counter(buf, len, 0, packet);
> - } else
> - if (byte == SPE_HEADER0_TIMESTAMP)
> - return arm_spe_get_timestamp(buf, len, packet);
> - else if ((byte & SPE_HEADER0_EVENTS_MASK) == 
> SPE_HEADER0_EVENTS)
> - return arm_spe_get_events(buf, len, packet);
> - else if ((byte & SPE_HEADER0_SOURCE_MASK) == 
> SPE_HEADER0_SOURCE)
> - return arm_spe_get_data_source(buf, len, 
> packet);
> - else if ((byte & SPE_HEADER0_CONTEXT_MASK) == 
> SPE_HEADER0_CONTEXT)
> - return arm_spe_get_context(buf, len, packet);
> - else if ((byte & SPE_HEADER0_OP_TYPE_MASK) == 
> SPE_HEADER0_OP_TYPE)
> - return arm_spe_get_op_type(buf, len, packet);
> - } else if ((byte & 0xe0) == 0x20 /* 0y001x */) {
> - /* 16-bit header */
> - byte = buf[1];
> - if (byte == SPE_HEADER1_ALIGNMENT)
> +
> + if (hdr == SPE_HEADER0_TIMESTAMP)
> + return arm_spe_get_timestamp(buf, len, packet);
> +
> + if ((h

Re: [PATCH v2 05/14] perf arm-spe: Refactor printing string to buffer

2020-10-08 Thread André Przywara
On 29/09/2020 14:39, Leo Yan wrote:

Hi,

> When outputs strings to the decoding buffer with function snprintf(),
> SPE decoder needs to detects if any error returns from snprintf() and if
> so needs to directly bail out.  If snprintf() returns success, it needs
> to update buffer pointer and reduce the buffer length so can continue to
> output the next string into the consequent memory space.
> 
> This complex logics are spreading in the function arm_spe_pkt_desc() so
> there has many duplicate codes for handling error detecting, increment
> buffer pointer and decrement buffer size.
> 
> To avoid the duplicate code, this patch introduces a new helper function
> arm_spe_pkt_snprintf() which is used to wrap up the complex logics, and
> the caller arm_spe_pkt_desc() will call it and simply check the returns
> value.
> 
> This patch also moves the variable 'blen' as the function's local
> variable, this allows to remove the unnecessary braces and improve the
> readability.

Ah, many thanks for tackling this, I wondered about those code
duplications and missing error handling already.

So I tried some sed pattern on the "old" part of the patch, to get to
the new format and spot any differences apart from mechanically changing
"snprintf(buf, buf_len," to "arm_spe_pkt_snprintf(&buf, &blen,".
See below.

> 
> Suggested-by: Dave Martin 
> Signed-off-by: Leo Yan 

Very nice cleanup! If you can address this one '"%s", name' comment
below, then:

Reviewed-by: Andre Przywara 


> ---
>  .../arm-spe-decoder/arm-spe-pkt-decoder.c | 247 ++
>  1 file changed, 135 insertions(+), 112 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 4f0aeb62e97b..96b717a19163 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -9,6 +9,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include "arm-spe-pkt-decoder.h"
>  
> @@ -256,192 +257,214 @@ int arm_spe_get_packet(const unsigned char *buf, 
> size_t len,
>   return ret;
>  }
>  
> +static int arm_spe_pkt_snprintf(char **buf_p, size_t *blen,
> + const char *fmt, ...)
> +{
> + va_list ap;
> + int ret;
> +
> + va_start(ap, fmt);
> + ret = vsnprintf(*buf_p, *blen, fmt, ap);
> + va_end(ap);
> +
> + if (ret < 0)
> + return ret;
> +
> + *buf_p += ret;
> + *blen -= ret;
> + return ret;
> +}
> +
>  int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
>size_t buf_len)
>  {
>   int ret, ns, el, idx = packet->index;
>   unsigned long long payload = packet->payload;
>   const char *name = arm_spe_pkt_name(packet->type);
> + size_t blen = buf_len;
>  
>   switch (packet->type) {
>   case ARM_SPE_BAD:
>   case ARM_SPE_PAD:
>   case ARM_SPE_END:
> - return snprintf(buf, buf_len, "%s", name);
> - case ARM_SPE_EVENTS: {
> - size_t blen = buf_len;
> -
> - ret = 0;
> - ret = snprintf(buf, buf_len, "EV");
> - buf += ret;
> - blen -= ret;
> + return arm_spe_pkt_snprintf(&buf, &blen, name);

Isn't it considered safer to use '..., "%s", name)', because that would
handle percent signs in "name" correctly?
Not really an issue in the current code, but good style, I believe.
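
To illustrate, a self-contained sketch: pkt_snprintf() mirrors the shape
of the helper in this patch, and print_name() is a hypothetical caller.
Passing the string as an argument to "%s" means a '%' inside it is
copied verbatim instead of being parsed as a conversion.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Same shape as the helper in the patch: print, then advance the cursor */
static int pkt_snprintf(char **buf_p, size_t *blen, const char *fmt, ...)
{
	va_list ap;
	int ret;

	va_start(ap, fmt);
	ret = vsnprintf(*buf_p, *blen, fmt, ap);
	va_end(ap);

	if (ret < 0)
		return ret;

	*buf_p += ret;
	*blen -= ret;
	return ret;
}

/* Hypothetical caller: 'name' is data, so it goes through "%s" */
static int print_name(char *buf, size_t len, const char *name)
{
	return pkt_snprintf(&buf, &len, "%s", name);
}
```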

> + case ARM_SPE_EVENTS:
> + ret = arm_spe_pkt_snprintf(&buf, &blen, "EV");
> + if (ret < 0)
> + return ret;
> +
>   if (payload & 0x1) {
> - ret = snprintf(buf, buf_len, " EXCEPTION-GEN");

So this was actually a bug before, right? Because "buf" no longer has
buf_len bytes available at this point, it should have been blen here
already. Which gets fixed with all your changes!

Thanks!
Andre

> - buf += ret;
> - blen -= ret;
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " 
> EXCEPTION-GEN");
> + if (ret < 0)
> + return ret;
>   }
>   if (payload & 0x2) {
> - ret = snprintf(buf, buf_len, " RETIRED");
> - buf += ret;
> - blen -= ret;
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " RETIRED");
> + if (ret < 0)
> + return ret;
>   }
>   if (payload & 0x4) {
> - ret = snprintf(buf, buf_len, " L1D-ACCESS");
> - buf += ret;
> - blen -= ret;
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " L1D-ACCESS");
> + if (ret < 0)
> + return ret;
>   }
>   if (payload & 0x8) {
> - ret = snprint

Re: [PATCH v2 04/14] perf arm-spe: Fix packet length handling

2020-10-08 Thread André Przywara
On 29/09/2020 14:39, Leo Yan wrote:
> When process address packet and counter packet, if the packet contains

   processing

> extended header, it misses to account the extra one byte for header
> length calculation, thus returns the wrong packet length.
> 
> To correct the packet length calculation, one possible fixing is simply
> to plus extra 1 for extended header, but will spread some duplicate code
> in the flows for processing address packet and counter packet.
> Alternatively, we can refine the function arm_spe_get_payload() to not
> only support short header and allow it to support extended header, and
> rely on it for the packet length calculation.
> 
> So this patch refactors function arm_spe_get_payload() with a new
> argument 'ext_hdr' for support extended header; the packet processing
> flows can invoke this function to unify the packet length calculation.
> 
> Signed-off-by: Leo Yan 

That looks alright to me.

Reviewed-by: Andre Przywara 

Cheers,
Andre

> ---
>  .../arm-spe-decoder/arm-spe-pkt-decoder.c | 34 +++
>  1 file changed, 12 insertions(+), 22 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 5a8696031e16..4f0aeb62e97b 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -80,14 +80,15 @@ const char *arm_spe_pkt_name(enum arm_spe_pkt_type type)
>   (1 << (((val) & SPE_HEADER_SZ_MASK) >> SPE_HEADER_SZ_SHIFT))
>  
>  static int arm_spe_get_payload(const unsigned char *buf, size_t len,
> +unsigned char ext_hdr,
>  struct arm_spe_pkt *packet)
>  {
> - size_t payload_len = PAYLOAD_LEN(buf[0]);
> + size_t payload_len = PAYLOAD_LEN(buf[ext_hdr]);
>  
> - if (len < 1 + payload_len)
> + if (len < 1 + ext_hdr + payload_len)
>   return ARM_SPE_NEED_MORE_BYTES;
>  
> - buf++;
> + buf += 1 + ext_hdr;
>  
>   switch (payload_len) {
>   case 1: packet->payload = *(uint8_t *)buf; break;
> @@ -97,7 +98,7 @@ static int arm_spe_get_payload(const unsigned char *buf, 
> size_t len,
>   default: return ARM_SPE_BAD_PACKET;
>   }
>  
> - return 1 + payload_len;
> + return 1 + ext_hdr + payload_len;
>  }
>  
>  static int arm_spe_get_pad(struct arm_spe_pkt *packet)
> @@ -128,7 +129,7 @@ static int arm_spe_get_timestamp(const unsigned char 
> *buf, size_t len,
>struct arm_spe_pkt *packet)
>  {
>   packet->type = ARM_SPE_TIMESTAMP;
> - return arm_spe_get_payload(buf, len, packet);
> + return arm_spe_get_payload(buf, len, 0, packet);
>  }
>  
>  static int arm_spe_get_events(const unsigned char *buf, size_t len,
> @@ -143,14 +144,14 @@ static int arm_spe_get_events(const unsigned char *buf, 
> size_t len,
>*/
>   packet->index = PAYLOAD_LEN(buf[0]);
>  
> - return arm_spe_get_payload(buf, len, packet);
> + return arm_spe_get_payload(buf, len, 0, packet);
>  }
>  
>  static int arm_spe_get_data_source(const unsigned char *buf, size_t len,
>  struct arm_spe_pkt *packet)
>  {
>   packet->type = ARM_SPE_DATA_SOURCE;
> - return arm_spe_get_payload(buf, len, packet);
> + return arm_spe_get_payload(buf, len, 0, packet);
>  }
>  
>  static int arm_spe_get_context(const unsigned char *buf, size_t len,
> @@ -158,8 +159,7 @@ static int arm_spe_get_context(const unsigned char *buf, 
> size_t len,
>  {
>   packet->type = ARM_SPE_CONTEXT;
>   packet->index = buf[0] & 0x3;
> -
> - return arm_spe_get_payload(buf, len, packet);
> + return arm_spe_get_payload(buf, len, 0, packet);
>  }
>  
>  static int arm_spe_get_op_type(const unsigned char *buf, size_t len,
> @@ -167,41 +167,31 @@ static int arm_spe_get_op_type(const unsigned char 
> *buf, size_t len,
>  {
>   packet->type = ARM_SPE_OP_TYPE;
>   packet->index = buf[0] & 0x3;
> - return arm_spe_get_payload(buf, len, packet);
> + return arm_spe_get_payload(buf, len, 0, packet);
>  }
>  
>  static int arm_spe_get_counter(const unsigned char *buf, size_t len,
>  const unsigned char ext_hdr, struct arm_spe_pkt 
> *packet)
>  {
> - if (len < 2)
> - return ARM_SPE_NEED_MORE_BYTES;
> -
>   packet->type = ARM_SPE_COUNTER;
>   if (ext_hdr)
>   packet->index = ((buf[0] & 0x3) << 3) | (buf[1] & 0x7);
>   else
>   packet->index = buf[0] & 0x7;
>  
> - packet->payload = le16_to_cpu(*(uint16_t *)(buf + 1));
> -
> - return 1 + ext_hdr + 2;
> + return arm_spe_get_payload(buf, len, ext_hdr, packet);
>  }
>  
>  static int arm_spe_get_addr(const unsigned char *buf, size_t len,
>   const unsigned char ext_hdr, struct arm_spe_pkt 
> *packet)
>  {
> - if (len < 8)
> - return ARM_SPE_NEED_MORE_BYTES;
> -
>   

Re: [PATCH v2 02/14] perf arm-spe: Fix a typo in comment

2020-10-08 Thread André Przywara
On 29/09/2020 14:39, Leo Yan wrote:
> Fix a typo: s/iff/if.
> 
> Signed-off-by: Leo Yan 

Reviewed-by: Andre Przywara 

Cheers,
Andre

> ---
>  tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 46ddb53a6457..7c7b5eb09fba 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -142,7 +142,7 @@ static int arm_spe_get_events(const unsigned char *buf, 
> size_t len,
>  
>   /* we use index to identify Events with a less number of
>* comparisons in arm_spe_pkt_desc(): E.g., the LLC-ACCESS,
> -  * LLC-REFILL, and REMOTE-ACCESS events are identified iff
> +  * LLC-REFILL, and REMOTE-ACCESS events are identified if
>* index > 1.
>*/
>   packet->index = ret - 1;
> 



Re: [PATCH v2 03/14] perf arm-spe: Refactor payload length calculation

2020-10-08 Thread André Przywara
On 29/09/2020 14:39, Leo Yan wrote:

Hi Leo,

> Defines macro for payload length calculation instead of static function.

What is the reason for that? I thought the kernel's direction is more
the other way: replacing macros with static functions ("Don't write CPP,
write C")? Ideally the compiler would generate the same code.

> Currently the event packet's 'index' is assigned as payload length, but
> the flow is not directive: it firstly gets the packet length (includes
> header length and payload length) and then reduces header length from
> packet length, so finally get the payload length; to simplify the code,
> this patch directly assigns payload length to event packet's index.
> 
> Signed-off-by: Leo Yan 
> ---
>  .../arm-spe-decoder/arm-spe-pkt-decoder.c | 26 ---
>  .../arm-spe-decoder/arm-spe-pkt-decoder.h |  4 +++
>  2 files changed, 15 insertions(+), 15 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 7c7b5eb09fba..5a8696031e16 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -69,22 +69,20 @@ const char *arm_spe_pkt_name(enum arm_spe_pkt_type type)
>   return arm_spe_packet_name[type];
>  }
>  
> -/* return ARM SPE payload size from its encoding,
> - * which is in bits 5:4 of the byte.
> - * 00 : byte
> - * 01 : halfword (2)
> - * 10 : word (4)
> - * 11 : doubleword (8)
> +/*
> + * Return ARM SPE payload size from header bits 5:4
> + *   00 : byte
> + *   01 : halfword (2)
> + *   10 : word (4)
> + *   11 : doubleword (8)
>   */
> -static int payloadlen(unsigned char byte)
> -{
> - return 1 << ((byte & 0x30) >> 4);
> -}
> +#define PAYLOAD_LEN(val) \
> + (1 << (((val) & SPE_HEADER_SZ_MASK) >> SPE_HEADER_SZ_SHIFT))

This change of the expression is good (although it should be 1U), but
please keep it a function. The return type should be unsigned, I guess.
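
Something along these lines, as a sketch. The function name is made up,
and the two defines use plain constants as stand-ins for the
GENMASK_ULL(5, 4) based definitions in the header.

```c
/* Stand-in values: size field in header bits [5:4] */
#define SPE_HEADER_SZ_SHIFT	4
#define SPE_HEADER_SZ_MASK	(0x3U << SPE_HEADER_SZ_SHIFT)

/*
 * Return the ARM SPE payload size in bytes from header bits 5:4
 *   00 : byte (1)
 *   01 : halfword (2)
 *   10 : word (4)
 *   11 : doubleword (8)
 */
static inline unsigned int arm_spe_payload_len(unsigned char hdr)
{
	return 1U << ((hdr & SPE_HEADER_SZ_MASK) >> SPE_HEADER_SZ_SHIFT);
}
```

This keeps the type checking on the argument and gives the result a
well-defined unsigned type, and I would expect the compiler to generate
the same code as for the macro.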

The rest looks fine.

Cheers,
Andre

>  
>  static int arm_spe_get_payload(const unsigned char *buf, size_t len,
>  struct arm_spe_pkt *packet)
>  {
> - size_t payload_len = payloadlen(buf[0]);
> + size_t payload_len = PAYLOAD_LEN(buf[0]);
>  
>   if (len < 1 + payload_len)
>   return ARM_SPE_NEED_MORE_BYTES;
> @@ -136,8 +134,6 @@ static int arm_spe_get_timestamp(const unsigned char 
> *buf, size_t len,
>  static int arm_spe_get_events(const unsigned char *buf, size_t len,
> struct arm_spe_pkt *packet)
>  {
> - int ret = arm_spe_get_payload(buf, len, packet);
> -
>   packet->type = ARM_SPE_EVENTS;
>  
>   /* we use index to identify Events with a less number of
> @@ -145,9 +141,9 @@ static int arm_spe_get_events(const unsigned char *buf, 
> size_t len,
>* LLC-REFILL, and REMOTE-ACCESS events are identified if
>* index > 1.
>*/
> - packet->index = ret - 1;
> + packet->index = PAYLOAD_LEN(buf[0]);
>  
> - return ret;
> + return arm_spe_get_payload(buf, len, packet);
>  }
>  
>  static int arm_spe_get_data_source(const unsigned char *buf, size_t len,
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> index 4c870521b8eb..f2d0af39a58c 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> @@ -9,6 +9,7 @@
>  
>  #include 
>  #include 
> +#include 
>  
>  #define ARM_SPE_PKT_DESC_MAX 256
>  
> @@ -36,6 +37,9 @@ struct arm_spe_pkt {
>   uint64_tpayload;
>  };
>  
> +#define SPE_HEADER_SZ_SHIFT  (4)
> +#define SPE_HEADER_SZ_MASK   GENMASK_ULL(5, 4)
> +
>  #define SPE_ADDR_PKT_HDR_INDEX_INS   (0x0)
>  #define SPE_ADDR_PKT_HDR_INDEX_BRANCH(0x1)
>  #define SPE_ADDR_PKT_HDR_INDEX_DATA_VIRT (0x2)
> 



Re: [PATCH v2 01/14] perf arm-spe: Include bitops.h for BIT() macro

2020-10-08 Thread André Przywara
On 29/09/2020 14:39, Leo Yan wrote:
> Include header linux/bitops.h, directly use its BIT() macro and remove
> the self defined macros.
> 
> Signed-off-by: Leo Yan 

Reviewed-by: Andre Przywara 

Thanks,
Andre

> ---
>  tools/perf/util/arm-spe-decoder/arm-spe-decoder.c | 5 +
>  tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c | 3 +--
>  2 files changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> index 93e063f22be5..cc18a1e8c212 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> @@ -12,6 +12,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  
> @@ -21,10 +22,6 @@
>  
>  #include "arm-spe-decoder.h"
>  
> -#ifndef BIT
> -#define BIT(n)   (1UL << (n))
> -#endif
> -
>  static u64 arm_spe_calc_ip(int index, u64 payload)
>  {
>   u8 *addr = (u8 *)&payload;
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index b94001b756c7..46ddb53a6457 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -8,11 +8,10 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include "arm-spe-pkt-decoder.h"
>  
> -#define BIT(n)   (1ULL << (n))
> -
>  #define NS_FLAG  BIT(63)
>  #define EL_FLAG  (BIT(62) | BIT(61))
>  
> 



Re: [PATCH 2/2] arm64: Add support for SMCCC TRNG firmware interface

2020-10-07 Thread André Przywara
On 07/10/2020 15:16, James Morse wrote:

Hi,

> On 06/10/2020 21:18, Andre Przywara wrote:
>> The ARM architected TRNG firmware interface, described in ARM spec
>> DEN0098[1], defines an ARM SMCCC based interface to a true random number
>> generator, provided by firmware.
>> This can be discovered via the SMCCC >=v1.1 interface, and provides
>> up to 192 bits of entropy per call.
>>
>> Hook this SMC call into arm64's arch_get_random_*() implementation,
>> coming to the rescue when the CPU does not implement the ARM v8.5 RNG
>> system registers.
>>
>> For the detection, we piggy back on the PSCI/SMCCC discovery (which gives
>> us the conduit to use: hvc or smc), then try to call the
>> ARM_SMCCC_TRNG_VERSION function, which returns -1 if this interface is
>> not implemented.
> 
>>  arch/arm64/include/asm/archrandom.h | 83 +
>>  1 file changed, 73 insertions(+), 10 deletions(-)
> 
>> diff --git a/arch/arm64/include/asm/archrandom.h 
>> b/arch/arm64/include/asm/archrandom.h
>> index ffb1a40d5475..b6c291c42a48 100644
>> --- a/arch/arm64/include/asm/archrandom.h
>> +++ b/arch/arm64/include/asm/archrandom.h
>> @@ -7,6 +7,13 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>> +
>> +static enum smc_trng_status {
>> +SMC_TRNG_UNKNOWN,
>> +SMC_TRNG_NOT_SUPPORTED,
>> +SMC_TRNG_SUPPORTED
>> +} smc_trng_status = SMC_TRNG_UNKNOWN;
> 
> Doesn't this static variable in a header file mean each file that includes 
> this has its
> own copy? Is that intentional?

Right, and it's not intentional. It doesn't really break, but since
random.h includes archrandom.h, we get an instance everywhere :-(

I wasn't too happy with this detection method to begin with (and also
not with stuffing everything into a header file), but wanted to
accommodate the early case, where PSCI hasn't been initialised yet, and
so we don't know the SMCCC conduit. A static key sounds better, but gets
a bit hairy with this scenario, I think.

Any ideas here?
I could copy Ard's solution and introduce random.c, if that makes more
sense.
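
Roughly, the random.c variant could look like the sketch below: the state
lives in exactly one translation unit, and the header would only carry a
prototype. Everything here is hypothetical, in particular
fake_trng_version_call() merely stands in for the real
ARM_SMCCC_TRNG_VERSION call, and the early case (conduit not yet known) is
deliberately not handled:

```c
#include <stdbool.h>

enum smc_trng_status {
	SMC_TRNG_UNKNOWN,
	SMC_TRNG_NOT_SUPPORTED,
	SMC_TRNG_SUPPORTED
};

/* Counts how often the (fake) firmware call was issued. */
static int probe_calls;

/* Hypothetical stand-in for the firmware call; negative means "no TRNG". */
static long fake_trng_version_call(void)
{
	probe_calls++;
	return 0x10000;
}

/* One single instance, as this would live in a .c file, not in a header. */
static enum smc_trng_status trng_status = SMC_TRNG_UNKNOWN;

/* Probe on first use, then return the cached result. */
bool smc_trng_available(void)
{
	if (trng_status == SMC_TRNG_UNKNOWN)
		trng_status = fake_trng_version_call() < 0 ?
			      SMC_TRNG_NOT_SUPPORTED : SMC_TRNG_SUPPORTED;

	return trng_status == SMC_TRNG_SUPPORTED;
}
```

With that, every includer of the header shares the same cached result
instead of each file probing on its own.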

Cheers,
Andre


Re: [PATCH 5/5] perf: arm_spe: Decode SVE events

2020-09-28 Thread André Przywara
On 28/09/2020 14:21, Dave Martin wrote:

Hi Dave,

> On Tue, Sep 22, 2020 at 11:12:25AM +0100, Andre Przywara wrote:
>> The Scalable Vector Extension (SVE) is an ARMv8 architecture extension
>> that introduces very long vector operations (up to 2048 bits).
> 
> (8192, in fact, though don't expect to see that on real hardware any
> time soon...  qemu and the Arm fast model can do it, though.)
> 
>> The SPE profiling feature can tag SVE instructions with additional
>> properties like predication or the effective vector length.
>>
>> Decode the new operation type bits in the SPE decoder to allow the perf
>> tool to correctly report about SVE instructions.
> 
> 
> I don't know anything about SPE, so just commenting on a few minor
> things that catch my eye here.

Many thanks for taking a look!
Please note that I actually missed a prior submission by Wei, so the
code changes here will end up in:
https://lore.kernel.org/patchwork/patch/1288413/

But your two points below magically apply to his patch as well, so

> 
>> Signed-off-by: Andre Przywara 
>> ---
>>  .../arm-spe-decoder/arm-spe-pkt-decoder.c | 48 ++-
>>  1 file changed, 47 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
>> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
>> index a033f34846a6..f0c369259554 100644
>> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
>> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
>> @@ -372,8 +372,35 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, 
>> char *buf,
>>  }
>>  case ARM_SPE_OP_TYPE:
>>  switch (idx) {
>> -case 0: return snprintf(buf, buf_len, "%s", payload & 0x1 ?
>> +case 0: {
>> +size_t blen = buf_len;
>> +
>> +if ((payload & 0x89) == 0x08) {
>> +ret = snprintf(buf, buf_len, "SVE");
>> +buf += ret;
>> +blen -= ret;
> 
> (Nit: can ret be < 0 ?  I've never been 100% clear on this myself for
> the s*printf() family -- if this assumption is widespread in perf tool
> already then I guess just go with the flow.)

Yeah, some parts of the code in here check for -1, actually, but doing
this on every call to snprintf would push this current code over the
edge - and I cowardly avoided a refactoring ;-)

Please note that this is perf userland, and also we are printing constant
strings here.
Although admittedly this starts to sound like an excuse now ...

> I wonder if this snprintf+increment+decrement sequence could be wrapped
> up as a helper, rather than having to be repeated all over the place.

Yes, I was hoping nobody would notice ;-)
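
Such a helper could look roughly like the sketch below. The name
buf_append() and its signature are made up for illustration, this is not
an existing perf function. It wraps vsnprintf(), passes a negative return
value on to the caller, and clamps the advance on truncation so that the
remaining length can never underflow:

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Append formatted text at *buf, which has *blen bytes left (including
 * room for the terminating NUL), then advance *buf and shrink *blen.
 * Returns the vsnprintf() result: negative on error, otherwise the
 * length the text would have had without truncation.
 */
static int buf_append(char **buf, size_t *blen, const char *fmt, ...)
{
	va_list ap;
	size_t used;
	int ret;

	va_start(ap, fmt);
	ret = vsnprintf(*buf, *blen, fmt, ap);
	va_end(ap);

	if (ret < 0)
		return ret;

	/* On truncation only blen - 1 characters were actually stored. */
	used = (size_t)ret;
	if (used >= *blen)
		used = *blen ? *blen - 1 : 0;

	*buf += used;
	*blen -= used;

	return ret;
}
```

Each of the three-line snprintf/buf/blen groups in the patch would then
shrink to a single call like buf_append(&buf, &blen, " PRED").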

>> +if (payload & 0x2)
>> +ret = snprintf(buf, buf_len, " FP");
>> +else
>> +ret = snprintf(buf, buf_len, " INT");
>> +buf += ret;
>> +blen -= ret;
>> +if (payload & 0x4) {
>> +ret = snprintf(buf, buf_len, " PRED");
>> +buf += ret;
>> +blen -= ret;
>> +}
>> +/* Bits [7..4] encode the vector length */
>> +ret = snprintf(buf, buf_len, " EVLEN%d",
>> +   32 << ((payload >> 4) & 0x7));
> 
> Isn't this just extracting 3 bits (0x7)? 

Ah, right, the comment is wrong. It's actually bits [6:4].

> And what unit are we aiming
> for here: is it the number of bytes per vector, or something else?  I'm
> confused by the fact that this will go up in steps of 32, which doesn't
> seem to match up to the architecture.

So this is how SPE encodes the effective vector length in its payload:
the format is described in section "D10.2.7 Operation Type packet" in a
(recent) ARMv8 ARM. I put the above statement in a C file and ran all
input values through it, it produced the exact *bit* length values as in
the spec.

Is there any particular pattern you are concerned about?
I admit this is somewhat hackish, I can do an extra function to put some
comments in there.
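
Something along these lines, as a minimal sketch: the function name is
made up, the expression is the one from the patch. It takes the 3-bit
field in payload bits [6:4] and turns it into a length in bits, doubling
for each step starting at 32:

```c
#include <stdint.h>

/*
 * Effective vector length in *bits*, as computed by the patch from the
 * 3-bit field in payload bits [6:4]: 32 << field, i.e. 32, 64, 128, ...
 * up to 4096 for a field value of 7.
 */
static inline unsigned int spe_sve_evlen_bits(uint64_t payload)
{
	return 32U << ((payload >> 4) & 0x7);
}
```

So the values are powers of two rather than multiples of 32; whether
every field value is actually architected is for the spec to say, the
sketch only shows what the expression computes.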

> 
> I notice that bit 7 has to be zero to get into this if() though.
> 
>> +buf += ret;
>> +blen -= ret;
>> +return buf_len - blen;
>> +}
>> +
>> +return snprintf(buf, buf_len, "%s", payload & 0x1 ?
>>  "COND-SELECT" : "INSN-OTHER");
>> +}
>>  case 1: {
>>  size_t blen = buf_len;
>>  
>> @@ -403,6 +430,25 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, 
>> char *buf,
>>  ret = snpri

Re: [PATCH 5/5] perf: arm_spe: Decode SVE events

2020-09-28 Thread André Przywara
On 27/09/2020 04:30, Leo Yan wrote:

Hi Leo,

> On Tue, Sep 22, 2020 at 11:12:25AM +0100, Andre Przywara wrote:
>> The Scalable Vector Extension (SVE) is an ARMv8 architecture extension
>> that introduces very long vector operations (up to 2048 bits).
>> The SPE profiling feature can tag SVE instructions with additional
>> properties like predication or the effective vector length.
>>
>> Decode the new operation type bits in the SPE decoder to allow the perf
>> tool to correctly report about SVE instructions.
>>
>> Signed-off-by: Andre Przywara 
>> ---
>>  .../arm-spe-decoder/arm-spe-pkt-decoder.c | 48 ++-
>>  1 file changed, 47 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
>> b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
>> index a033f34846a6..f0c369259554 100644
>> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
>> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
>> @@ -372,8 +372,35 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, 
>> char *buf,
>>  }
>>  case ARM_SPE_OP_TYPE:
>>  switch (idx) {
>> -case 0: return snprintf(buf, buf_len, "%s", payload & 0x1 ?
>> +case 0: {
>> +size_t blen = buf_len;
>> +
>> +if ((payload & 0x89) == 0x08) {
>> +ret = snprintf(buf, buf_len, "SVE");
>> +buf += ret;
>> +blen -= ret;
>> +if (payload & 0x2)
>> +ret = snprintf(buf, buf_len, " FP");
>> +else
>> +ret = snprintf(buf, buf_len, " INT");
>> +buf += ret;
>> +blen -= ret;
>> +if (payload & 0x4) {
>> +ret = snprintf(buf, buf_len, " PRED");
>> +buf += ret;
>> +blen -= ret;
>> +}
>> +/* Bits [7..4] encode the vector length */
>> +ret = snprintf(buf, buf_len, " EVLEN%d",
>> +   32 << ((payload >> 4) & 0x7));
>> +buf += ret;
>> +blen -= ret;
>> +return buf_len - blen;
>> +}
>> +
>> +return snprintf(buf, buf_len, "%s", payload & 0x1 ?
>>  "COND-SELECT" : "INSN-OTHER");
>> +}
>>  case 1: {
>>  size_t blen = buf_len;
>>  
>> @@ -403,6 +430,25 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, 
>> char *buf,
>>  ret = snprintf(buf, buf_len, " NV-SYSREG");
>>  buf += ret;
>>  blen -= ret;
>> +} else if ((payload & 0x0a) == 0x08) {
>> +ret = snprintf(buf, buf_len, " SVE");
>> +buf += ret;
>> +blen -= ret;
>> +if (payload & 0x4) {
>> +ret = snprintf(buf, buf_len, " PRED");
>> +buf += ret;
>> +blen -= ret;
>> +}
>> +if (payload & 0x80) {
>> +ret = snprintf(buf, buf_len, " SG");
>> +buf += ret;
>> +blen -= ret;
>> +}
>> +/* Bits [7..4] encode the vector length */
>> +ret = snprintf(buf, buf_len, " EVLEN%d",
>> +   32 << ((payload >> 4) & 0x7));
>> +buf += ret;
>> +blen -= ret;
> 
> The changes in this patch have been included in the patch [1].
> 
> So my summary for patches 02 ~ 05: except for patch 04, the other changes
> have been included in the patch set "perf arm-spe: Refactor decoding &
> dumping flow".

Ah, my sincere apologies, I totally missed Wei's and your series on this
(although I did some research on "prior art").

> I'd like to add your patch 04 into the patch set "perf arm-spe:
> Refactor decoding & dumping flow" and I will respin the patch set v2 on
> the latest perf/core branch and send out to review.
> 
> For patch 01, you could continue to try to land it in the kernel.
> (Maybe consolidate a bit with Wei?).
> 
> Do you think this is okay for you?

Yes, sounds like a plan. So Wei's original series is now fully
integrated into your 13-patch rework, right?

Is "[RESEND,v1,xx/13] ..." the latest revision of your series?
Do you plan on sendi

Re: [PATCH v2 1/6] dt-bindings: timers: sp-804: Convert to json-schema

2020-09-09 Thread André Przywara
On 08/09/2020 18:28, Rob Herring wrote:
> On Fri, 28 Aug 2020 15:20:13 +0100, Andre Przywara wrote:
>> This converts the DT binding documentation for the ARM SP-804 timer IP
>> over to json-schema.
>> Most properties are just carried over, the clocks property requirement
>> (either one or three clocks) is now formalised and enforced.
>> As the former binding didn't specify clock-names, and there is no
>> common name used by the existing DTs, I refrained from adding them in
>> detail (just allowing the property).
>> The requirement for the APB clock is enforced by the primecell binding
>> already.
>>
>> Signed-off-by: Andre Przywara 
>> ---
>>  .../devicetree/bindings/timer/arm,sp804.txt   | 29 --
>>  .../devicetree/bindings/timer/arm,sp804.yaml  | 93 +++
>>  2 files changed, 93 insertions(+), 29 deletions(-)
>>  delete mode 100644 Documentation/devicetree/bindings/timer/arm,sp804.txt
>>  create mode 100644 Documentation/devicetree/bindings/timer/arm,sp804.yaml
>>
> 
> Applied, thanks!
> 
> I dropped the primecell.yaml ref as it is redundant.

Interesting, because I explicitly added it to cover one property that
was only described in primecell.yaml. But I think this one node was
originally missing the actual primecell compatible string.

So I tested it now again and don't see any issues without the explicit
primecell.yaml reference anymore.

Thanks for taking it!

Cheers,
Andre.


Re: [PATCH 00/10] dt-bindings: Convert SP805 to Json-schema (and fix users)

2020-09-04 Thread André Przywara
On 04/09/2020 16:29, Florian Fainelli wrote:

Hi,

> On 9/4/2020 1:58 AM, Linus Walleij wrote:
>> On Fri, Aug 28, 2020 at 9:34 PM Florian Fainelli
>>  wrote:
>>> On 8/28/20 6:05 AM, Andre Przywara wrote:
>>
>>> What is the plan for merging this series? Should Rob pick up all changes
>>> or since those are non critical changes, should we just leave it to the
>>> SoC maintainers to pick up the changes in their tree?
>>
>> What about André just send a pull request to the ARM SoC maintainers
>> for the whole thing?
> 
> I already applied some of the patches, if we go that route please CC me
> so I can drop them from my local queue. Thanks

I would for sure drop these from any PR.

Rob, are you happy with the actual binding conversion? If you are
willing to take it as it is (Viresh has already acked), I could then
split off the DT fixes and either chase the maintainers or send ARM SoC
a PR. But this really depends on the binding being good.

Cheers,
Andre.


Re: [PATCH v2 3/6] ARM: dts: NSP: Fix SP804 compatible node

2020-09-03 Thread André Przywara
On 02/09/2020 00:04, Florian Fainelli wrote:

Hi Florian,

sorry, the mail got swamped in my inbox...

> On 8/28/2020 10:12 AM, Florian Fainelli wrote:
>> On 8/28/20 7:20 AM, Andre Przywara wrote:
>>> The DT binding for SP804 requires to have an "arm,primecell" compatible
>>> string.
>>> Add this string so that the Linux primecell bus driver picks the device
>>> up and activates the clock.
>>>
>>> Fixes: a0efb0d28b77 ("ARM: dts: NSP: Add SP804 Support to DT")
>>> Tested-by: Florian Fainelli 
>>> Signed-off-by: Andre Przywara 
>>
>> This looks fine, however there is a ccbtimer1 instance that you missed,
>> can you resubmit with it included?
>>
>> With that:
>>
>> Acked-by: Florian Fainelli 
> 
> Andre are you going to resubmit a patch with the second instance
> (ccbtimer1) fixed as well, or should I take care of that while applying
> the patch? Either way is fine, just let me know.

So I was waiting for more comments, but there was nothing so far that
justifies a new version. So would you mind fixing this while applying? I
must have indeed missed this instance while diffing before and after.

Many thanks!
Andre.


Re: [PATCH 00/10] dt-bindings: Convert SP805 to Json-schema (and fix users)

2020-09-01 Thread André Przywara
On 28/08/2020 22:32, Florian Fainelli wrote:

Hi,

Florian, thanks for queueing the Broadcom specific patches!

> On 8/28/20 2:28 PM, Rob Herring wrote:
>> On Fri, Aug 28, 2020 at 1:34 PM Florian Fainelli  
>> wrote:
>>>
>>> On 8/28/20 6:05 AM, Andre Przywara wrote:
>>>> This is an attempt to convert the SP805 watchdog DT binding to yaml.
>>>> This is done in the first patch, the remaining nine fix some DT users.
>>>>
>>>> I couldn't test any of those DT files on actual machines, but tried
>>>> to make the changes in a way that would be transparent to at least the
>>>> Linux driver. The only other SP805 DT user I could find is U-Boot, which
>>>> seems to only use a very minimal subset of the binding (just the first
>>>> clock).
>>>> I only tried to fix those DTs that were easily and reliably fixable.
>>>> AFAICT, a missing primecell compatible string, for instance, would
>>>> prevent the Linux driver from probing the device at all, so I didn't
>>>> dare to touch those DTs at all. Missing clocks are equally fatal.
>>>
>>> What is the plan for merging this series? Should Rob pick up all changes
>>> or since those are non critical changes, should we just leave it to the
>>> SoC maintainers to pick up the changes in their tree?
>>
>> I don't take .dts files. Either subarch maintainers can pick up
>> individual patches or send a PR to SoC maintainers.
> 
> OK, so we are fine, to say make sure this all lands in v5.10-rc1 at some
> point and the warnings should no longer exist by then?

So yes, I would be very grateful if subsystem maintainers take this at
their discretion.
For one thing, I didn't actually change anything in the binding, so most
things were already slightly wrong according to the .txt binding, just
nobody realised or cared. So those .dts files changes are actually
independent and justified even without patch 01/10.

Secondly, there are already so many warnings in many .dts files at the
moment, that (in the worst case) a few more - for a brief period of time
- do not really matter. But at the end it will improve the situation.

Rob, if you are fine with the actual binding, I would try to pursue the
remaining subsystem maintainers to get the .dts changes merged.

Thanks,
Andre.


Re: [PATCH v2 0/6] dt-bindings: Convert SP804 to Json-schema (and fix users)

2020-08-28 Thread André Przywara
On 28/08/2020 15:54, Linus Walleij wrote:

Hi,

> On Fri, Aug 28, 2020 at 4:20 PM Andre Przywara  wrote:
> 
>> This is the second attempt at converting the SP804 timer binding to yaml.
>> Compared to v1, I forbid additional properties, and included the primecell
>> binding. Also the clock-names property is now listed, although without
>> further requirements on the names. Changelog below.
> 
> The series:
> Acked-by: Linus Walleij 
> 
>> I couldn't test any of those DT files on actual machines, but tried
>> to make the changes in a way that would be transparent to at least the
>> Linux driver. The only other SP804 DT user I could find is FreeBSD,
>> but they seem to use a different binding (no clocks, but a
>> clock-frequency property).
> 
> That's annoying. I suppose FreeBSD just made that up and doesn't
> even have a binding document for it?

I couldn't find bindings at all in their git tree. I don't think they
treat this very formally, it seems to be more use-case driven.
Their SP804 driver does not know how to handle clock properties, so most
of the DTs (in sys/gnu/dts, so apparently copied from Linux) would not
work really well, because the driver assumes a hardcoded frequency of
1MHz by default.
There is only one DT (Annapurna Alpine with Cortex-A15) that provides
this clock-frequency property. The Linux DT does not mention the SP804
in there at all, interestingly.

> In an ideal world I suppose we should go and fix FreeBSD but I have
> no idea how easy or hard that is.

It seems to be messy, at least in this case, and I guess unifying DTs
means some work on drivers as well.
But AFAIK most of the more modern platforms copy the DTs (and thus
implicitly the bindings) from Linux, so there is probably much less
deviation for many more relevant boards.

Cheers,
Andre


Re: [PATCH 2/6] ARM: dts: arm: Fix SP804 users

2020-08-28 Thread André Przywara
On 28/08/2020 15:03, Linus Walleij wrote:

Hi,

> On Wed, Aug 26, 2020 at 8:38 PM Andre Przywara  wrote:
> 
>> The SP804 DT nodes for Realview, MPS2 and VExpress were not complying
>> with the binding: it requires either one or three clocks, but does not
>> allow exactly two clocks.
>>
>> Simply duplicate the first clock to satisfy the binding requirement.
>> For MPS2, we triple the clock, and add the clock-names property, as this
>> is required by the Linux primecell driver.
>> Try to make the clock-names more consistent on the way.
>>
>> Signed-off-by: Andre Przywara 
> 
> Acked-by: Linus Walleij 

Thanks!

> 
> This looks good to me, shall I simply apply this patch to my
> Versatile tree (I suppose Sudeep should ack it too) or are
> you sending it upstream to the soc tree?

If you want to take it (and Sudeep is OK with it), I am happy with that.
The DTs should work either way, so there is no dependency or anything.
One patch less to carry around for me ;-)
Just sent a v2 with your ACK, so please pick this one.

Thanks,
Andre


Re: [PATCH 3/6] ARM: dts: broadcom: Fix SP804 node

2020-08-28 Thread André Przywara
On 26/08/2020 21:55, Florian Fainelli wrote:
> On 8/26/20 11:59 AM, Florian Fainelli wrote:
>> On 8/26/20 11:53 AM, André Przywara wrote:
>>> On 26/08/2020 19:42, Florian Fainelli wrote:

Hi Florian,

>>>
>>> Hi,
>>>
>>>> On 8/26/20 11:38 AM, Andre Przywara wrote:
>>>>> The DT binding for SP804 requires to have an "arm,primecell" compatible
>>>>> string.
>>>>> Add this string so that the Linux primecell bus driver picks the device
>>>>> up and activates the clock.
>>>>>
>>>>> Signed-off-by: Andre Przywara 
>>>>
>>>> The commit subject should be:
>>>>
>>>> ARM: dts: NSP: Fix SP804 compatible node
>>>>
>>>> and we should probably have a Fixes tag that is:
>>>>
>>>> Fixes: a0efb0d28b77 ("ARM: dts: NSP: Add SP804 Support to DT")
>>>>
>>>> Could you please re-submit with those things corrected? Thanks
>>>
>>> Sure, will include that in a v2.
>>>
>>> Out of curiosity, do you have the hardware and can check the impact that
>>> has?
>>
>> I have the hardware and could run some tests if you would like.
>>
>>> Not sure we actually create the device without the primecell compatible?
>>> Or is the sp804 an exception here, compared to the other AMBA devices
>>> (SP805, PL011)?
>>
>> No idea, I have never used those timers personally, and I doubt that
>> anybody besides me within broadcom and hobbyists actually care about NSP
>> these days.
> 
> Seems to be working fine for me with your patch applied, it probes:
> 
> # dmesg | grep sp804
> [0.035363] clocksource: arm,sp804: mask: 0x max_cycles:
> 0x, max_idle_ns: 15290083572 ns
> 
> and it is usable:
> 
> # cat clocksource0/available_clocksource
> arm_global_timer arm,sp804
> 
> and appears to work:
> 
> # echo "arm,sp804" > clocksource0/current_clocksource
> [  105.108547] clocksource: Switched to clocksource arm,sp804
> 
> # date; sleep 5; date
> Thu Jan  1 00:01:51 UTC 1970
> Thu Jan  1 00:01:56 UTC 1970
> 
> Feel free to add Tested-by: Florian Fainelli  in
> your v2, thanks André!

Wow, thanks a lot for this test!

Sending out a v2 in a minute.

Cheers,
Andre


Re: [PATCH 3/6] ARM: dts: broadcom: Fix SP804 node

2020-08-26 Thread André Przywara
On 26/08/2020 19:42, Florian Fainelli wrote:

Hi,

> On 8/26/20 11:38 AM, Andre Przywara wrote:
>> The DT binding for SP804 requires to have an "arm,primecell" compatible
>> string.
>> Add this string so that the Linux primecell bus driver picks the device
>> up and activates the clock.
>>
>> Signed-off-by: Andre Przywara 
> 
> The commit subject should be:
> 
> ARM: dts: NSP: Fix SP804 compatible node
> 
> and we should probably have a Fixes tag that is:
> 
> Fixes: a0efb0d28b77 ("ARM: dts: NSP: Add SP804 Support to DT")
> 
> Could you please re-submit with those things corrected? Thanks

Sure, will include that in a v2.

Out of curiosity, do you have the hardware and can check the impact that
has?
Not sure we actually create the device without the primecell compatible?
Or is the sp804 an exception here, compared to the other AMBA devices
(SP805, PL011)?

Cheers,
Andre

>> ---
>>  arch/arm/boot/dts/bcm-nsp.dtsi | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/arm/boot/dts/bcm-nsp.dtsi b/arch/arm/boot/dts/bcm-nsp.dtsi
>> index 0346ea621f0f..1333ef8be0a2 100644
>> --- a/arch/arm/boot/dts/bcm-nsp.dtsi
>> +++ b/arch/arm/boot/dts/bcm-nsp.dtsi
>> @@ -368,7 +368,7 @@
>>  };
>>  
>>  ccbtimer0: timer@34000 {
>> -compatible = "arm,sp804";
>> +compatible = "arm,sp804", "arm,primecell";
>>  reg = <0x34000 0x1000>;
>>  interrupts = ,
>>   ;
>>
> 
> 



Re: [PATCH] MAINTAINERS, edac: Calxeda Highbank, handover maintenance to Andre Przywara

2020-08-24 Thread André Przywara
On 24/08/2020 13:49, Robert Richter wrote:
> I do not have hardware anymore, nor there is ongoing development. So
> handover maintenance to Andre who already maintains the last
> remainings of Calxeda.
> 
> Cc: Andre Przywara 
> Signed-off-by: Robert Richter 

Acked-by: Andre Przywara 

Thanks!
Andre

> ---
>  MAINTAINERS | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 1b7b0c1a24c8..6ed56e1a7d28 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -6148,7 +6148,7 @@ S:  Supported
>  F:   drivers/edac/bluefield_edac.c
>  
>  EDAC-CALXEDA
> -M:   Robert Richter 
> +M:   Andre Przywara 
>  L:   linux-e...@vger.kernel.org
>  S:   Maintained
>  F:   drivers/edac/highbank*
> 



Re: [PATCH v5 10/10] arm64: dts: actions: Add uSD support for Cubieboard7

2020-07-12 Thread André Przywara
On 12/07/2020 19:45, Amit Tomer wrote:

Hi,

> On Sun, Jul 12, 2020 at 11:00 PM Manivannan Sadhasivam
>  wrote:
>>
>> On Thu, Jul 02, 2020 at 08:22:56PM +0530, Amit Singh Tomar wrote:
>>> This commit adds uSD support for Cubieboard7 board based on Actions Semi
>>> S700 SoC. SD0 is connected to uSD slot. Since there is no PMIC support
>>> added yet, fixed regulator has been used as a regulator node.
>>>
>>> Signed-off-by: Amit Singh Tomar 
>>> ---
>>> Changes since v4:
>>>   * No change.
>>> Changes since v3:
>>> * No change.
>>> Changes since v2:
>>> * No change.
>>> Changes since v1:
>>> * No change.
>>> Changes since RFC:
>>> * No change.
>>> ---
>>>  arch/arm64/boot/dts/actions/s700-cubieboard7.dts | 41 
>>> 
>>>  arch/arm64/boot/dts/actions/s700.dtsi|  1 +
>>>  2 files changed, 42 insertions(+)
>>>
>>> diff --git a/arch/arm64/boot/dts/actions/s700-cubieboard7.dts 
>>> b/arch/arm64/boot/dts/actions/s700-cubieboard7.dts
>>> index 63e375cd9eb4..ec117eb12f3a 100644
>>> --- a/arch/arm64/boot/dts/actions/s700-cubieboard7.dts
>>> +++ b/arch/arm64/boot/dts/actions/s700-cubieboard7.dts
>>> @@ -13,6 +13,7 @@
>>>
>>>   aliases {
>>>   serial3 = &uart3;
>>> + mmc0 = &mmc0;
>>>   };
>>>
>>>   chosen {
>>> @@ -28,6 +29,23 @@
>>>   device_type = "memory";
>>>   reg = <0x1 0xe000 0x0 0x0>;
>>>   };
>>> +
>>> + /* Fixed regulator used in the absence of PMIC */
>>> + vcc_3v1: vcc-3v1 {
>>> + compatible = "regulator-fixed";
>>> + regulator-name = "fixed-3.1V";
>>> + regulator-min-microvolt = <310>;
>>> + regulator-max-microvolt = <310>;
>>> + };
>>
>> Is this regulator used somewhere?
> 
> This is something I copied from bubblegum dts as I wasn't sure what is right 
> way
> to include these regulators.

But this regulator is only used for the eMMC there, which we apparently
don't have on the Cubieboard 7?

> Also, the other day I tested it without having these regulators in, and
> it still seems to
> work.  So should these be removed?

If they are not even referenced in the .dts, then fixed regulators are
rather pointless. So yes, please remove this vcc-3v1 one.

What is the story with the other regulator? Is there a PMIC or a power
switch for the SD card? Or is the power supply actually hardwired?

Cheers,
Andre


Re: mainline/master bisection: baseline.dmesg.crit on qemu_arm-vexpress-a15

2020-07-03 Thread André Przywara
On 03/07/2020 06:38, kernelci.org bot wrote:

Hi Guillaume,

is this report legit? The situation didn't change from Monday, I just
repeated the test with mainline compared to my patch reverted.

What is the actual failure here? You pointed to:
<2>GIC CPU mask not found - kernel will fail to boot.
but I don't see any explicit line stating that as the culprit anywhere
in the logs. Actually the last line says:
00:24:07.04  Job finished correctly

And I see the GIC messages with and without this patch. As mentioned on
Monday, "-smp 2" should be added to the QEMU command line to fix that.

> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
> * This automated bisection report was sent to you on the basis  *
> * that you may be involved with the breaking commit it has  *
> * found.  No manual investigation has been done to verify it,   *
> * and the root cause of the problem may be somewhere else.  *
> *   *
> * If you do send a fix, please include this trailer:*
> *   Reported-by: "kernelci.org bot"   *
> *   *
> * Hope this helps!  *
> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
> 
> mainline/master bisection: baseline.dmesg.crit on qemu_arm-vexpress-a15
> 
> Summary:
>   Start:  7cc2a8ea1048 Merge tag 'block-5.8-2020-07-01' of 
> git://git.kernel.dk/linux-block
>   Plain log:  
> https://storage.kernelci.org/mainline/master/v5.8-rc3-37-g7cc2a8ea1048/arm/vexpress_defconfig/gcc-8/lab-cip/baseline-vexpress-v2p-ca15-tc1.txt
>   HTML log:   
> https://storage.kernelci.org/mainline/master/v5.8-rc3-37-g7cc2a8ea1048/arm/vexpress_defconfig/gcc-8/lab-cip/baseline-vexpress-v2p-ca15-tc1.html
>   Result: 38ac46002d1d arm: dts: vexpress: Move mcc node back into 
> motherboard node
> 
> Checks:
>   revert: PASS
>   verify: PASS

What does that mean? That reverting the patch made the test pass?
I did exactly that, and reverting made it worse, because poweroff
doesn't work (among other things).
So could this be a testing artifact? Because of the failing poweroff the
test timed out or something?

Many thanks,
Andre

> 
> Parameters:
>   Tree:   mainline
>   URL:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
>   Branch: master
>   Target: qemu_arm-vexpress-a15
>   CPU arch:   arm
>   Lab:lab-cip
>   Compiler:   gcc-8
>   Config: vexpress_defconfig
>   Test case:  baseline.dmesg.crit
> 
> Breaking commit found:
> 
> ---
> commit 38ac46002d1df5707566a73486452851341028d2
> Author: Andre Przywara 
> Date:   Wed Jun 3 17:22:37 2020 +0100
> 
> arm: dts: vexpress: Move mcc node back into motherboard node
> 
> Commit d9258898ad49 ("arm64: dts: arm: vexpress: Move fixed devices
> out of bus node") moved the "mcc" DT node into the root node, because
> it does not have any children using "reg" properties, so does violate
> some dtc checks about "simple-bus" nodes.
> 
> However this broke the vexpress config-bus code, which walks up the
> device tree to find the first node with an "arm,vexpress,site" property.
> This gave the wrong result (matching the root node instead of the
> motherboard node), so broke the clocks and some other devices for
> VExpress boards.
> 
> Move the whole node back into its original position. This re-introduces
> the dtc warning, but is conceptually the right thing to do. The dtc
> warning seems to be overzealous here, there are discussions on fixing or
> relaxing this check instead.
> 
> Link: 
> https://lore.kernel.org/r/20200603162237.16319-1-andre.przyw...@arm.com
> Fixes: d9258898ad49 ("arm64: dts: vexpress: Move fixed devices out of bus 
> node")
> Reported-and-tested-by: Guenter Roeck 
> Signed-off-by: Andre Przywara 
> Signed-off-by: Sudeep Holla 
> 
> diff --git a/arch/arm/boot/dts/vexpress-v2m-rs1.dtsi 
> b/arch/arm/boot/dts/vexpress-v2m-rs1.dtsi
> index e6308fb76183..a88ee5294d35 100644
> --- a/arch/arm/boot/dts/vexpress-v2m-rs1.dtsi
> +++ b/arch/arm/boot/dts/vexpress-v2m-rs1.dtsi
> @@ -100,79 +100,6 @@
>   };
>   };
>  
> - mcc {
> - compatible = "arm,vexpress,config-bus";
> - arm,vexpress,config-bridge = <&v2m_sysreg>;
> -
> - oscclk0 {
> - /* MCC static memory clock */
> - compatible = "arm,vexpress-osc";
> - arm,vexpress-sysreg,func = <1 0>;
> - freq-range = <2500 6000>;
> - #clock-cells = <0>;
> - clock-output-names = "v2m:oscclk0";
> - };
> -
> - v2m_oscclk1: oscclk1 {
> - /* CLCD clock */
> -
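
The lookup problem described in the quoted commit message can be illustrated with a minimal, self-contained sketch. This is not the vexpress config-bus code: the `sketch_node` type and `sketch_find_site_node()` helper are hypothetical and only model "walk up the tree until a node carries the `arm,vexpress,site` property".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for a device tree node. */
struct sketch_node {
	const char *name;
	const struct sketch_node *parent;
	const char *prop;	/* name of a single property, or NULL */
};

/*
 * Walk from 'node' towards the root and return the first node that
 * carries an "arm,vexpress,site" property, or NULL if there is none.
 */
static const struct sketch_node *
sketch_find_site_node(const struct sketch_node *node)
{
	for (; node; node = node->parent) {
		if (node->prop && strcmp(node->prop, "arm,vexpress,site") == 0)
			return node;
	}
	return NULL;
}
```

With "mcc" placed below the motherboard node, the walk stops at the motherboard node. With "mcc" placed directly below the root node, the walk never visits the motherboard node and stops at the root instead, which is the wrong match the commit message describes.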

Re: [PATCH v4 02/10] dmaengine: Actions: Add support for S700 DMA engine

2020-06-29 Thread André Przywara
On 29/06/2020 10:54, Vinod Koul wrote:

Hi Vinod,

> On 24-06-20, 10:35, André Przywara wrote:
>> On 24/06/2020 07:15, Vinod Koul wrote:
>>> On 09-06-20, 15:47, Amit Singh Tomar wrote:
>>>
>>>> @@ -372,6 +383,7 @@ static inline int owl_dma_cfg_lli(struct owl_dma_vchan 
>>>> *vchan,
>>>>    struct dma_slave_config *sconfig,
>>>>    bool is_cyclic)
>>>>  {
>>>> +struct owl_dma *od = to_owl_dma(vchan->vc.chan.device);
>>>>  u32 mode, ctrlb;
>>>>  
>>>>  mode = OWL_DMA_MODE_PW(0);
>>>> @@ -427,14 +439,26 @@ static inline int owl_dma_cfg_lli(struct 
>>>> owl_dma_vchan *vchan,
>>>>  lli->hw[OWL_DMADESC_DADDR] = dst;
>>>>  lli->hw[OWL_DMADESC_SRC_STRIDE] = 0;
>>>>  lli->hw[OWL_DMADESC_DST_STRIDE] = 0;
>>>> -/*
>>>> - * Word starts from offset 0xC is shared between frame length
>>>> - * (max frame length is 1MB) and frame count, where first 20
>>>> - * bits are for frame length and rest of 12 bits are for frame
>>>> - * count.
>>>> - */
>>>> -lli->hw[OWL_DMADESC_FLEN] = len | FCNT_VAL << 20;
>>>> -lli->hw[OWL_DMADESC_CTRLB] = ctrlb;
>>>> +
>>>> +if (od->devid == S700_DMA) {
>>>> +/* Max frame length is 1MB */
>>>> +lli->hw[OWL_DMADESC_FLEN] = len;
>>>> +/*
>>>> + * On S700, word starts from offset 0x1C is shared between
>>>> + * frame count and ctrlb, where first 12 bits are for frame
>>>> + * count and rest of 20 bits are for ctrlb.
>>>> + */
>>>> +lli->hw[OWL_DMADESC_CTRLB] = FCNT_VAL | ctrlb;
>>>> +} else {
>>>> +/*
>>>> + * On S900, word starts from offset 0xC is shared between
>>>> + * frame length (max frame length is 1MB) and frame count,
>>>> + * where first 20 bits are for frame length and rest of
>>>> + * 12 bits are for frame count.
>>>> + */
>>>> +lli->hw[OWL_DMADESC_FLEN] = len | FCNT_VAL << 20;
>>>> +lli->hw[OWL_DMADESC_CTRLB] = ctrlb;
>>>
>>> Unfortunately this wont scale, we will keep adding new conditions for
>>> newer SoC's! So rather than this why not encode max frame length in
>>> driver_data rather than S900_DMA/S700_DMA.. In future one can add values
>>> for newer SoC and not code above logic again.
>>
>> What newer SoCs? I don't think we should try to guess the future here.
> 
> In a patch for adding new SoC, quite ironical I would say!

S700 is not a new SoC, it's just this driver didn't support it yet. What
I meant is that I don't even know about the existence of upcoming SoCs
(Google seems clueless), not to speak of documentation to assess which
DMA controller they use.

>> We can always introduce further abstractions later, once we actually
>> *know* what we are looking at.
> 
> Rather if we know we are adding abstractions, why not add in a way that
> makes it scale better rather than rework again

I appreciate the effort, but this really is tapping around in the dark,
since we don't know which direction any new DMA controller is taking. It
might not even be similar.

>> Besides, I don't understand what you are after. The max frame length is
>> 1MB in both cases, it's just a matter of where to put FCNT_VAL, either
>> in FLEN or in CTRLB. And having an extra flag for that in driver data
>> sounds a bit over the top at the moment.
> 
> Maybe, maybe not. I would rather make it support N SoC when adding
> support for second one rather than keep adding everytime a new SoC is
> added...

Well, what do you suggest, specifically? At the moment we have two
*slightly* different DMA controllers, so we differentiate between the
two based on the model. Do you want to introduce an extra flag like
FRAME_CNT_IN_CTRLB? That seems to be a bit over the top here, since we
don't know if a future DMA controller is still compatible, or introduces
completely new differences.
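
For reference, a flag-based variant could look roughly like the sketch below. It is illustrative only and not code from the patch: the `sketch_soc_data` structure, the `frame_cnt_in_ctrlb` member and the value of `SKETCH_FCNT_VAL` are assumptions, and only the two assignments per branch mirror the quoted hunk.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value, for illustration only. */
#define SKETCH_FCNT_VAL 0x1u

/* Hypothetical per-SoC description, as it might live in driver data. */
struct sketch_soc_data {
	bool frame_cnt_in_ctrlb;	/* the FRAME_CNT_IN_CTRLB idea */
};

static const struct sketch_soc_data sketch_s900 = { .frame_cnt_in_ctrlb = false };
static const struct sketch_soc_data sketch_s700 = { .frame_cnt_in_ctrlb = true };

/* Fill the two descriptor words that differ between the controllers. */
static void sketch_fill_words(const struct sketch_soc_data *soc,
			      uint32_t len, uint32_t ctrlb,
			      uint32_t *flen_word, uint32_t *ctrlb_word)
{
	if (soc->frame_cnt_in_ctrlb) {
		/* S700 style: frame count shares a word with ctrlb. */
		*flen_word = len;
		*ctrlb_word = SKETCH_FCNT_VAL | ctrlb;
	} else {
		/* S900 style: frame count sits above the 20-bit length. */
		*flen_word = len | (SKETCH_FCNT_VAL << 20);
		*ctrlb_word = ctrlb;
	}
}
```

The branch is the same size as the model check in the patch. The flag only pays off if a later controller reuses exactly one of these two layouts.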

Cheers,
Andre


Re: [PATCH v4 02/10] dmaengine: Actions: Add support for S700 DMA engine

2020-06-24 Thread André Przywara
On 24/06/2020 07:15, Vinod Koul wrote:

Hi,

> On 09-06-20, 15:47, Amit Singh Tomar wrote:
> 
>> @@ -372,6 +383,7 @@ static inline int owl_dma_cfg_lli(struct owl_dma_vchan 
>> *vchan,
>>struct dma_slave_config *sconfig,
>>bool is_cyclic)
>>  {
>> +struct owl_dma *od = to_owl_dma(vchan->vc.chan.device);
>>  u32 mode, ctrlb;
>>  
>>  mode = OWL_DMA_MODE_PW(0);
>> @@ -427,14 +439,26 @@ static inline int owl_dma_cfg_lli(struct owl_dma_vchan 
>> *vchan,
>>  lli->hw[OWL_DMADESC_DADDR] = dst;
>>  lli->hw[OWL_DMADESC_SRC_STRIDE] = 0;
>>  lli->hw[OWL_DMADESC_DST_STRIDE] = 0;
>> -/*
>> - * Word starts from offset 0xC is shared between frame length
>> - * (max frame length is 1MB) and frame count, where first 20
>> - * bits are for frame length and rest of 12 bits are for frame
>> - * count.
>> - */
>> -lli->hw[OWL_DMADESC_FLEN] = len | FCNT_VAL << 20;
>> -lli->hw[OWL_DMADESC_CTRLB] = ctrlb;
>> +
>> +if (od->devid == S700_DMA) {
>> +/* Max frame length is 1MB */
>> +lli->hw[OWL_DMADESC_FLEN] = len;
>> +/*
>> + * On S700, word starts from offset 0x1C is shared between
>> + * frame count and ctrlb, where first 12 bits are for frame
>> + * count and rest of 20 bits are for ctrlb.
>> + */
>> +lli->hw[OWL_DMADESC_CTRLB] = FCNT_VAL | ctrlb;
>> +} else {
>> +/*
>> + * On S900, word starts from offset 0xC is shared between
>> + * frame length (max frame length is 1MB) and frame count,
>> + * where first 20 bits are for frame length and rest of
>> + * 12 bits are for frame count.
>> + */
>> +lli->hw[OWL_DMADESC_FLEN] = len | FCNT_VAL << 20;
>> +lli->hw[OWL_DMADESC_CTRLB] = ctrlb;
> 
> Unfortunately this wont scale, we will keep adding new conditions for
> newer SoC's! So rather than this why not encode max frame length in
> driver_data rather than S900_DMA/S700_DMA.. In future one can add values
> for newer SoC and not code above logic again.

What newer SoCs? I don't think we should try to guess the future here.
We can always introduce further abstractions later, once we actually
*know* what we are looking at.

Besides, I don't understand what you are after. The max frame length is
1MB in both cases, it's just a matter of where to put FCNT_VAL, either
in FLEN or in CTRLB. And having an extra flag for that in driver data
sounds a bit over the top at the moment.
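
To make the field layout concrete, here is a small sketch of the S900-style word as the quoted comment describes it: 20 bits of frame length with a 12-bit frame count above them. The helper names and the masking of `len` are assumptions made for illustration; the patch itself simply ORs the values together.

```c
#include <stdint.h>

#define SKETCH_FLEN_BITS 20u
#define SKETCH_FLEN_MASK ((1u << SKETCH_FLEN_BITS) - 1u)	/* 0xFFFFF */

/* Pack frame length (low 20 bits) and frame count (high 12 bits). */
static uint32_t sketch_s900_pack_flen(uint32_t len, uint32_t fcnt)
{
	return (len & SKETCH_FLEN_MASK) | (fcnt << SKETCH_FLEN_BITS);
}

/* Recover the frame length from a packed word. */
static uint32_t sketch_flen_len(uint32_t word)
{
	return word & SKETCH_FLEN_MASK;
}

/* Recover the frame count from a packed word. */
static uint32_t sketch_flen_fcnt(uint32_t word)
{
	return word >> SKETCH_FLEN_BITS;
}
```

On the S700 the same word would carry only the length, so the 20-bit limit on the frame length is identical on both controllers.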

Cheers,
Andre.

> 
>> +static const struct of_device_id owl_dma_match[] = {
>> +{ .compatible = "actions,s900-dma", .data = (void *)S900_DMA,},
>> +{ .compatible = "actions,s700-dma", .data = (void *)S700_DMA,},
> 
> Is the .compatible documented, Documentation patch should come before
> the driver use patch in a series
> 
>>  static int owl_dma_probe(struct platform_device *pdev)
>>  {
>>  struct device_node *np = pdev->dev.of_node;
>>  struct owl_dma *od;
>>  int ret, i, nr_channels, nr_requests;
>> +const struct of_device_id *of_id =
>> +of_match_device(owl_dma_match, &pdev->dev);
> 
> You care about driver_data rather than of_id, so using
> of_device_get_match_data() would be better..
> 
>>  od = devm_kzalloc(&pdev->dev, sizeof(*od), GFP_KERNEL);
>>  if (!od)
>> @@ -1083,6 +1116,8 @@ static int owl_dma_probe(struct platform_device *pdev)
>>  dev_info(&pdev->dev, "dma-channels %d, dma-requests %d\n",
>>   nr_channels, nr_requests);
>>  
>> +od->devid = (enum owl_dma_id)(uintptr_t)of_id->data;
> 
> Funny casts, I dont think you need uintptr_t!
> 



Re: linux-next: Signed-off-by missing for commit in the scmi tree

2020-05-18 Thread André Przywara
On 18/05/2020 12:08, Stephen Rothwell wrote:
> Commit
> 
>   17a37ff76e95 ("arm64: dts: juno: Use proper DT node name for USB")
> 
> is missing a Signed-off-by from its author.

Oh, sorry for that and thanks for spotting this!

Sudeep, many thanks for the quick fix and update!

Cheers,
Andre

> 
> Also, the commit message tags should be separated from the rest of the
> commit message by a blank line.
> 



Re: [PATCH V7 0/2] mailbox: arm: introduce smc triggered mailbox

2019-09-23 Thread André Przywara
On 23/09/2019 07:36, Peng Fan wrote:

Hi Peng,

thanks for the update!

> From: Peng Fan 
> 
> V7:
> Typo fix
> #mbox-cells changed to 0
> Add a new header file arm-smccc-mbox.h
> Use ARM_SMCCC_IS_64
> 
> Andre,
>   The function_id is still kept in arm_smccc_mbox_cmd, because arm,func-id
> property is optional, so clients could pass function_id to mbox driver.

Well, to be honest, this is the main thing I am opposing:

It should *not* be optional.

The controller driver DT node should *always* contain the function ID.
The reasons for that I explained in the other emails to Jassi:
We can't safely execute smc calls from the Linux kernel, unless we also
comply with the SMCCC standard. So we should not leave the choice of the
function ID to the mailbox client.
Also this much better separates the mailbox controller driver from the
client.
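
To sketch what I mean: the function ID is bound to the channel once, when the controller is set up from its DT node, and the client only ever hands over a payload. The code below is a self-contained illustration, not the driver: all `sketch_*` names are made up, and the register array merely stands in for the arguments of a real smc/hvc call. The bit positions are the ones the SMCCC defines (bit 31: fast call, bit 30: SMC64, bits 29:24: owning entity).

```c
#include <stdbool.h>
#include <stdint.h>

/* Function ID fields as laid out by the SMC Calling Convention. */
#define SKETCH_SMCCC_FAST_CALL		(1u << 31)
#define SKETCH_SMCCC_SMC64		(1u << 30)
#define SKETCH_SMCCC_OWNER_SHIFT	24
#define SKETCH_SMCCC_OWNER_MASK		0x3fu

/* Hypothetical per-channel state; the ID is fixed at setup time. */
struct sketch_mbox_chan {
	uint32_t function_id;
};

static bool sketch_is_fast_call(uint32_t id)
{
	return (id & SKETCH_SMCCC_FAST_CALL) != 0;
}

static bool sketch_is_smc64(uint32_t id)
{
	return (id & SKETCH_SMCCC_SMC64) != 0;
}

static uint32_t sketch_owner(uint32_t id)
{
	return (id >> SKETCH_SMCCC_OWNER_SHIFT) & SKETCH_SMCCC_OWNER_MASK;
}

/* Controller setup: take the ID from the firmware description only. */
static void sketch_chan_init(struct sketch_mbox_chan *chan, uint32_t dt_func_id)
{
	chan->function_id = dt_func_id;
}

/*
 * Send path: the client supplies a payload, never a function ID.
 * regs[0] and regs[1] model the first two call arguments.
 */
static void sketch_chan_send(const struct sketch_mbox_chan *chan,
			     uint64_t payload, uint64_t regs[2])
{
	regs[0] = chan->function_id;
	regs[1] = payload;
}
```

With this split there is nothing a client can pass that would change which firmware service gets called.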

So I think we should reach an agreement here.

Cheers,
Andre

> V6:
> Switch to per-channel a mbox controller
> Drop arm,num-chans, transports, method
> Add arm,hvc-mbox compatible
> Fix smc/hvc args, drop client id and use correct type.
> https://patchwork.kernel.org/cover/11146641/
> 
> V5:
> yaml fix
> https://patchwork.kernel.org/cover/7741/
> 
> V4:
> yaml fix for num-chans in patch 1/2.
> https://patchwork.kernel.org/cover/6521/
> 
> V3:
> Drop interrupt
> Introduce transports for mem/reg usage
> Add chan-id for mem usage
> Convert to yaml format
> https://patchwork.kernel.org/cover/11043541/
>  
> V2:
> This is a modified version from Andre Przywara's patch series
> https://lore.kernel.org/patchwork/cover/812997/.
> The modification are mostly:
> Introduce arm,num-chans
> Introduce arm_smccc_mbox_cmd
> txdone_poll and txdone_irq are both set to false
> arm,func-ids are kept, but as an optional property.
> Rewords SCPI to SCMI, because I am trying SCMI over SMC, not SCPI.
> Introduce interrupts notification.
> 
> [1] is a draft implementation of i.MX8MM SCMI ATF implementation that
> use smc as mailbox, power/clk is included, but only part of clk has been
> implemented to work with hardware, power domain only supports get name
> for now.
> 
> The traditional Linux mailbox mechanism uses some kind of dedicated hardware
> IP to signal a condition to some other processing unit, typically a dedicated
> management processor.
> This mailbox feature is used for instance by the SCMI protocol to signal a
> request for some action to be taken by the management processor.
> However some SoCs does not have a dedicated management core to provide
> those services. In order to service TEE and to avoid linux shutdown
> power and clock that used by TEE, need let firmware to handle power
> and clock, the firmware here is ARM Trusted Firmware that could also
> run SCMI service.
> 
> The existing SCMI implementation uses a rather flexible shared memory
> region to communicate commands and their parameters, it still requires a
> mailbox to actually trigger the action.
> 
> This patch series provides a Linux mailbox compatible service which uses
> smc calls to invoke firmware code, for instance taking care of SCMI requests.
> The actual requests are still communicated using the standard SCMI way of
> shared memory regions, but a dedicated mailbox hardware IP can be replaced via
> this new driver.
> 
> This simple driver uses the architected SMC calling convention to trigger
> firmware services, also allows for using "HVC" calls to call into hypervisors
> or firmware layers running in the EL2 exception level.
> 
> Patch 1 contains the device tree binding documentation, patch 2 introduces
> the actual mailbox driver.
> 
> Please note that this driver just provides a generic mailbox mechanism,
> It could support synchronous TX/RX, or synchronous TX with asynchronous
> RX. And while providing SCMI services was the reason for this exercise,
> this driver is in no way bound to this use case, but can be used generically
> where the OS wants to signal a mailbox condition to firmware or a
> hypervisor.
> Also the driver is in no way meant to replace any existing firmware
> interface, but actually to complement existing interfaces.
> 
> [1] https://github.com/MrVan/arm-trusted-firmware/tree/scmi
> 
> 
> 
> Peng Fan (2):
>   dt-bindings: mailbox: add binding doc for the ARM SMC/HVC mailbox
>   mailbox: introduce ARM SMC based mailbox
> 
>  .../devicetree/bindings/mailbox/arm-smc.yaml   |  95 
>  drivers/mailbox/Kconfig|   7 +
>  drivers/mailbox/Makefile   |   2 +
>  drivers/mailbox/arm-smc-mailbox.c  | 168 
> +
>  4 files changed, 272 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/mailbox/arm-smc.yaml
>  create mode 100644 drivers/mailbox/arm-smc-mailbox.c
> 


