Re: [PATCH 1/1] iommu/arm-smmu-v3: eliminate a potential memory corruption on Hi16xx soc

2018-10-16 Thread Leizhen (ThunderTown)



On 2018/10/15 20:46, Andrew Murray wrote:
> Hi Zhen,
> 
> On Mon, Oct 15, 2018 at 04:36:16PM +0800, Zhen Lei wrote:
>> ITS translation register map:
>> 0x0000-0x003C   Reserved
>> 0x0040          GITS_TRANSLATER
>> 0x0044-0xFFFC   Reserved
>>
>> The standard GITS_TRANSLATER register in ITS is only 4 bytes, but Hisilicon
>> expands the next 4 bytes to carry some IMPDEF information. That means, 8 bytes
>> data will be written to MSIAddress each time.
>>
>> MSIAddr: |4bytes |4bytes|
>>          |MSIData|IMPDEF|
>>
>> There is no problem for ITS, because the next 4 bytes space is reserved in ITS.
>> But it will overwrite the 4 bytes memory following "sync_count". It's very
>> luckly that the previous and the next neighbour of "sync_count" are both aligned
>> by 8 bytes, so no problem is met now.
> 
> My understanding is that MSI's are 32bit memory writes and as such the SMMU
> performs a 32bit write in response to the MSI. If so then what is different
> with the Hi16xx that causes a problem? Have you been able to adjust
> the layout of the arm_smmu_device struct to demonstrate this?

Normally, only the 32-bit MSI data is written into sync_count:
|    4 bytes   |    4 bytes   |
|  sync_count  | next 4 bytes |

But on Hi16xx, the ITS hardware also writes 32 bits of IMPDEF data into the
4 bytes that follow sync_count. If those 4 bytes belong to the next struct
member, its value will be overwritten.

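To make the overwrite concrete, here is a rough userspace model of it. This is
only a sketch: "struct demo_unpadded", "next_member" and the helper names are
made up for illustration and are not part of the driver, and the 8-byte write
is modelled with memcpy() rather than real hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the relevant part of struct arm_smmu_device. */
struct demo_unpadded {
	uint32_t sync_count;
	uint32_t next_member;	/* hypothetical neighbour of sync_count */
};

static_assert(sizeof(struct demo_unpadded) == 8, "no hidden padding expected");

/*
 * Model of the behaviour described above: 8 bytes (MSIData followed by
 * IMPDEF) land at the MSI address instead of 4.
 */
static void demo_msi_write_8bytes(unsigned char *msi_addr,
				  uint32_t msi_data, uint32_t impdef)
{
	memcpy(msi_addr, &msi_data, sizeof(msi_data));
	memcpy(msi_addr + 4, &impdef, sizeof(impdef));
}

/* Returns 1 if the neighbour was overwritten by the 8-byte write. */
static int demo_neighbour_clobbered(void)
{
	struct demo_unpadded d = { .sync_count = 0, .next_member = 0x12345678 };
	unsigned char *base = (unsigned char *)&d;

	demo_msi_write_8bytes(base + offsetof(struct demo_unpadded, sync_count),
			      1, 0xdeadbeef);
	return d.next_member != 0x12345678;
}
```

Calling demo_neighbour_clobbered() reports that "next_member" lost its value,
which is the corruption the patch is trying to rule out.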
> 
> Thanks,
> 
> Andrew Murray
> 
>>
>> It's good to explicitly add a workaround:
>> 1. Add gcc __attribute__((aligned(8))) to make sure that "sync_count" is always
>>    aligned by 8 bytes.
>> 2. Add a "u64" union member to make sure the 4 bytes padding is always exist.
>>
>> There is no functional change.
>>
>> Signed-off-by: Zhen Lei 
>> ---
>>  drivers/iommu/arm-smmu-v3.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>> index 5059d09..a07bc0d 100644
>> --- a/drivers/iommu/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm-smmu-v3.c
>> @@ -586,7 +586,10 @@ struct arm_smmu_device {
>>  
>>  struct arm_smmu_strtab_cfg  strtab_cfg;
>>  
>> +union {
>> +u64 padding; /* workaround for Hisilicon */
>>  u32 sync_count;
>> +} __attribute__((aligned(8)));
>>  
>>  /* IOMMU core code handle */
>>  struct iommu_device iommu;
>> -- 
>> 1.8.3
>>
>>
>> ___
>> iommu mailing list
>> io...@lists.linux-foundation.org
>> https://lists.linuxfoundation.org/mailman/listinfo/iommu
> 
> .
> 

-- 
Thanks!
BestRegards



Re: [PATCH 1/1] iommu/arm-smmu-v3: eliminate a potential memory corruption on Hi16xx soc

2018-10-16 Thread Robin Murphy

On 15/10/18 18:21, Will Deacon wrote:
> On Mon, Oct 15, 2018 at 04:36:16PM +0800, Zhen Lei wrote:
>> ITS translation register map:
>> 0x0000-0x003C   Reserved
>> 0x0040          GITS_TRANSLATER
>> 0x0044-0xFFFC   Reserved
>>
>> The standard GITS_TRANSLATER register in ITS is only 4 bytes, but Hisilicon
>> expands the next 4 bytes to carry some IMPDEF information. That means, 8 bytes
>> data will be written to MSIAddress each time.
>>
>> MSIAddr: |4bytes |4bytes|
>>          |MSIData|IMPDEF|
>>
>> There is no problem for ITS, because the next 4 bytes space is reserved in ITS.
>> But it will overwrite the 4 bytes memory following "sync_count". It's very
>> luckly that the previous and the next neighbour of "sync_count" are both aligned
>> by 8 bytes, so no problem is met now.
>>
>> It's good to explicitly add a workaround:
>> 1. Add gcc __attribute__((aligned(8))) to make sure that "sync_count" is always
>>    aligned by 8 bytes.
>> 2. Add a "u64" union member to make sure the 4 bytes padding is always exist.
>>
>> There is no functional change.
>>
>> Signed-off-by: Zhen Lei 
>> ---
>>  drivers/iommu/arm-smmu-v3.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>> index 5059d09..a07bc0d 100644
>> --- a/drivers/iommu/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm-smmu-v3.c
>> @@ -586,7 +586,10 @@ struct arm_smmu_device {
>>
>>  	struct arm_smmu_strtab_cfg	strtab_cfg;
>>
>> +	union {
>> +	u64 padding; /* workaround for Hisilicon */
>>  	u32 sync_count;
>> +	} __attribute__((aligned(8)));
>
> Won't this already be aligned by the ABI?
>
> Anyway, you'll need to swizzle things for big-endian, I suspect. Maybe you
> can do something clever like making sync_count an array of two elements
> and determining the offset based on the endianness. Or just keep it simple
> like we do for things like struct qrwlock and struct qspinlock and use
> #ifdefs.


I don't think so - the CPUs should only ever be making word accesses to 
the u32 member, while the SMMU expects to be writing little-endian data 
to an ITS, so AFAICS the data word will always be at the lower address 
either way.
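
As a rough illustration of the argument above, the sketch below stores the
8-byte write byte by byte in little-endian order, so the result does not
depend on the byte order of the CPU running it. Treating the write as a single
little-endian 64-bit value with MSIData in the low word is an assumption made
for this model, and all of the demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Store a 64-bit value as little-endian bytes, independent of host order. */
static void demo_store_le64(unsigned char *dst, uint64_t v)
{
	assert(dst);
	for (int i = 0; i < 8; i++)
		dst[i] = (unsigned char)(v >> (8 * i));
}

/* Read a little-endian 32-bit word back from memory. */
static uint32_t demo_load_le32(const unsigned char *src)
{
	return (uint32_t)src[0] |
	       (uint32_t)src[1] << 8 |
	       (uint32_t)src[2] << 16 |
	       (uint32_t)src[3] << 24;
}

/* Returns 1 if the MSI data word ends up in the lower 4 bytes. */
static int demo_data_word_is_at_lower_address(void)
{
	unsigned char window[8] = { 0 };
	uint32_t msi_data = 0x00000007;	/* arbitrary example values */
	uint32_t impdef = 0xa5a5a5a5;

	demo_store_le64(window, (uint64_t)impdef << 32 | msi_data);
	return demo_load_le32(window) == msi_data &&
	       demo_load_le32(window + 4) == impdef;
}
```

The model only covers where the two words land. It says nothing about the byte
order within the u32 as the CPU later reads it.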


Although now that it's come up, the pre-existing issue of whether the 
byte order *within* that u32 comes out correct after its round-trip 
through the SMMU is something I need to run away and hurriedly think 
about...


Robin.


Re: [PATCH 1/1] iommu/arm-smmu-v3: eliminate a potential memory corruption on Hi16xx soc

2018-10-16 Thread Leizhen (ThunderTown)



On 2018/10/15 19:17, John Garry wrote:
> On 15/10/2018 09:36, Zhen Lei wrote:
>> ITS translation register map:
>> 0x0000-0x003C   Reserved
>> 0x0040          GITS_TRANSLATER
>> 0x0044-0xFFFC   Reserved
>>
> 
> Can you add a better opening than the ITS translation register map?

OK

> 
>> The standard GITS_TRANSLATER register in ITS is only 4 bytes, but Hisilicon
>> expands the next 4 bytes to carry some IMPDEF information. That means, 8 bytes
>> data will be written to MSIAddress each time.
>>
>> MSIAddr: |4bytes |4bytes|
>>          |MSIData|IMPDEF|
>>
>> There is no problem for ITS, because the next 4 bytes space is reserved in ITS.
>> But it will overwrite the 4 bytes memory following "sync_count". It's very
> 
> I think arm_smmu_device.sync_count is better, or "sync_count member in the
> smmu driver control struct".

OK, I will use "struct" in v2.

+   struct {
u32 sync_count;
+   u32 padding;
+   } __attribute__((aligned(8)));
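
The layout this v2 form is meant to guarantee can be checked at compile time.
The following is a minimal userspace sketch, assuming GCC or Clang and C11.
"struct demo_smmu_v2" and its "before"/"after" members are hypothetical
stand-ins, not the real struct arm_smmu_device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the members around sync_count. */
struct demo_smmu_v2 {
	int before;			/* hypothetical preceding member */

	struct {
		uint32_t sync_count;
		uint32_t padding;	/* absorbs the extra 4 bytes */
	} __attribute__((aligned(8)));

	int after;			/* hypothetical following member */
};

/* The 8-byte window starts on an 8-byte boundary... */
static_assert(offsetof(struct demo_smmu_v2, sync_count) % 8 == 0,
	      "sync_count must be 8-byte aligned");
/* ...the padding sits directly after sync_count... */
static_assert(offsetof(struct demo_smmu_v2, padding) ==
	      offsetof(struct demo_smmu_v2, sync_count) + 4,
	      "padding must follow sync_count");
/* ...and no other member lives inside the window. */
static_assert(offsetof(struct demo_smmu_v2, after) >=
	      offsetof(struct demo_smmu_v2, sync_count) + 8,
	      "next member must start after the 8-byte window");

/* Runtime view of the same fact, in bytes from sync_count to 'after'. */
static size_t demo_v2_window_size(void)
{
	return offsetof(struct demo_smmu_v2, after) -
	       offsetof(struct demo_smmu_v2, sync_count);
}
```

If a later change moved another member into the 8 bytes starting at
sync_count, the build would fail instead of silently corrupting memory.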

> 
>> luckly that the previous and the next neighbour of "sync_count" are both aligned
> 
> /s/luckly/luckily or fortunately/

OK, thanks

> 
>> by 8 bytes, so no problem is met now.
>>
>> It's good to explicitly add a workaround:
>> 1. Add gcc __attribute__((aligned(8))) to make sure that "sync_count" is always
>>    aligned by 8 bytes.
>> 2. Add a "u64" union member to make sure the 4 bytes padding is always exist.
>>
>> There is no functional change.
>>
>> Signed-off-by: Zhen Lei 
>> ---
>>  drivers/iommu/arm-smmu-v3.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>> index 5059d09..a07bc0d 100644
>> --- a/drivers/iommu/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm-smmu-v3.c
>> @@ -586,7 +586,10 @@ struct arm_smmu_device {
>>
>>  struct arm_smmu_strtab_cfg strtab_cfg;
>>
>> +union {
>> +u64 padding; /* workaround for Hisilicon */
> 
> I think that a more detailed comment is required.

OK, I will try to describe it more clearly.

> 
>>  u32 sync_count;
> 
> Can you indent these 2 members? However - as discussed internally - this may
> have an endian issue, so it is better to declare a full 64b struct.

The indentation is inherited, to stay aligned with the other members.

There is no endian issue; I have tested it on both little-endian and big-endian.

$ gdb vmlinux
...
(gdb) p &((struct arm_smmu_device *)0)->sync_count
$1 = (u32 *) 0x4178
(gdb) p &((struct arm_smmu_device *)0)->tst1
$2 = (int *) 0x4170
(gdb) p &((struct arm_smmu_device *)0)->tst2
$3 = (int *) 0x4180

Test case:

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 5059d09..7c6f7ac 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -586,7 +586,14 @@ struct arm_smmu_device {
 
 	struct arm_smmu_strtab_cfg	strtab_cfg;
 
+	int tst1;
+
+	union {
+		u64 padding;
 	u32 sync_count;
+	} __attribute__((aligned(8)));
+
+	int tst2;
 
 	/* IOMMU core code handle */
 	struct iommu_device	iommu;
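
The same offsets can be checked outside the kernel with offsetof(). The sketch
below mirrors the test case in a small standalone struct. "struct demo_dev" is
hypothetical and much smaller than struct arm_smmu_device, so the absolute
offsets differ from the gdb output above and only the distances are comparable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature of the test case: tst1, the union, then tst2. */
struct demo_dev {
	int tst1;

	union {
		uint64_t padding;
		uint32_t sync_count;
	} __attribute__((aligned(8)));

	int tst2;
};

static_assert(offsetof(struct demo_dev, sync_count) % 8 == 0,
	      "sync_count must be 8-byte aligned");

/* Bytes between the start of sync_count and the next member. */
static size_t demo_gap_after_sync_count(void)
{
	return offsetof(struct demo_dev, tst2) -
	       offsetof(struct demo_dev, sync_count);
}

/* Offset of sync_count inside the union (union members start at 0). */
static size_t demo_sync_count_offset_in_union(void)
{
	return offsetof(struct demo_dev, sync_count) -
	       offsetof(struct demo_dev, padding);
}
```

A gap of 8 bytes matches the distance between 0x4178 and 0x4180 in the gdb
session, and sync_count starting at offset 0 of the union is what places it at
the lower address on both byte orders.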

> 
>> +} __attribute__((aligned(8)));
>>
>>  /* IOMMU core code handle */
>>  struct iommu_device iommu;
>>
> Thanks
> 
> 
> 
> 
> .
> 

-- 
Thanks!
BestRegards