As a quick hack: try moving "u32 interrupt_pending;" to the start of the 
structure...

Thanks… Dave

-----Original Message-----
From: Brian Brooks [mailto:brian.bro...@arm.com] 
Sent: Friday, September 22, 2017 12:33 PM
To: Dave Barach (dbarach) <dbar...@cisco.com>
Cc: Saxena, Nitin <nitin.sax...@cavium.com>; vpp-dev@lists.fd.io; Damjan Marion 
(damarion) <damar...@cisco.com>; Narayana, Prasad Athreya 
<prasadathreya.naray...@cavium.com>
Subject: Re: [vpp-dev] [v17.07.01]: vec_add2() causing crash on ARMv8

On 09/28 11:57:36, Dave Barach (dbarach) wrote:
> Dear Nitin,
> 
> First off: exactly which LDXR / STXR instruction variant pairs is generated? 
> I begin to wonder if __sync_lock_test_and_set(...) might not be doing you any 
> favors. Given that dq->interrupt_pending is a u32, I would have expected a 
> 4-byte instruction with (at worst) a 4-byte alignment requirement.

It's true that a LDXR of 4 bytes only requires 4 byte alignment (not 8).

For the TAS, objdump vhost-user.o shows

  ldxr   w0, [x1]
  stxr   w3, w2, [x1]
  cbnz   w3, ..

These instructions are operating on 4 byte data because of the use of a
'w' register instead of a 'x' register to hold the actual value.

Nitin, can you confirm you see the same generated code? If so, is
&dq->interrupt_pending 4 byte aligned?

> Are there any alignment restrictions on the 1-byte variants LDXRB / STXRB?
> 
> If not: since we use dq->interrupt_pending as a one-bit flag, declaring it as 
> a u8 - or casting &dq->interrupt_pending to (u8 *) in an arch-dependent 
> fashion - might make the pain go away.
> 
> Aligning every vector in the system will waste memory, and will not legislate 
> this class of problem out of existence. So, I wouldn't want to force 8-byte 
> alignment in the way you mention.
> 
> Anyhow, aligning the first vector element to an 8-byte boundary says little 
> about the layout of elements within each vector element, especially if the 
> structure is packed.
> 
> If dq->interrupt_pending needs to be aligned to a specific boundary without 
> fail, the only completely reliable method would be to pack and pad the 
> structure e.g. to a multiple of 8 octets and ensure that interrupt_pending 
> lands on the required boundary. Then use vec_add2_ha (...) to manipulate the 
> vector.
> 
> HTH... Dave
> 
> From: vpp-dev-boun...@lists.fd.io [mailto:vpp-dev-boun...@lists.fd.io] On 
> Behalf Of Saxena, Nitin
> Sent: Thursday, September 28, 2017 4:53 AM
> To: vpp-dev@lists.fd.io
> Cc: Narayana, Prasad Athreya <prasadathreya.naray...@cavium.com>
> Subject: [vpp-dev] [v17.07.01]: vec_add2() causing crash on ARMv8
> 
> 
> Hi All,
> 
> 
> 
> I got a crash with vpp v17.07.01 on ARMv8 Soc 
> @src/vnet/devices/virtio/vhost-user.c: Line no: 1852
> 
> 
> if (clib_smp_swap (&dq->interrupt_pending, 0) ||
>     (node->state == VLIB_NODE_STATE_POLLING)){
> }
> 
> While debugging it turns out that value of (&dq->interrupt_pending) was not 8 
> byte aligned hence causing SIGBUS error on ARMv8 SoC. Further debugging tells 
> that dq was added in vector using vec_add2 (src/vnet/devices/devices.c Line 
> no: 152)
> 
> vec_add2 (rt->devices_and_queues, dq, 1)
> 
> which uses 0 byte alignment. Changing vec_add2 to vec_add2_aligned() fixed 
> the problem. My question is can we completely define vec_add2() as
> 
> #define vec_add2(V,P,N)           vec_add2_ha(V,P,N,0,8) instead of #define 
> vec_add2(V,P,N)           vec_add2_ha(V,P,N,0,0)
> 
> This can be helpful for all architecture.
> 
> Thanks,
> Nitin
> 
> 
> 

> _______________________________________________
> vpp-dev mailing list
> vpp-dev@lists.fd.io
> https://lists.fd.io/mailman/listinfo/vpp-dev

_______________________________________________
vpp-dev mailing list
vpp-dev@lists.fd.io
https://lists.fd.io/mailman/listinfo/vpp-dev

Reply via email to