Re: [PATCH v11 1/3] xen/arm64: io: Emulate instructions (with invalid ISS) on MMIO region

2022-03-18 Thread Julien Grall

Hi Ayan,

On 17/03/2022 14:00, Ayan Kumar Halder wrote:

When an instruction is trapped in Xen due to translation fault, Xen
checks if the ISS is invalid (for data abort) or it is an instruction
abort. If so, Xen tries to resolve the translation fault using p2m page
tables. In case of data abort, Xen will try to map the mmio region to
the guest (ie tries to emulate the mmio region).

If the ISS is not valid and it is a data abort, then Xen tries to
decode the instruction. In case of ioreq, Xen  saves the decoding state,
rn and imm9 to vcpu_io. Whenever the vcpu handles the ioreq successfully,
it will read the decoding state to determine if the instruction decoded
was a ldr/str post indexing (ie INSTR_LDR_STR_POSTINDEXING). If so, it
uses these details to post increment rn.

In case of mmio handler, if the mmio operation was successful, then Xen
retrives the decoding state, rn and imm9. For state ==
INSTR_LDR_STR_POSTINDEXING, Xen will update rn.

If there is an error encountered while decoding/executing the instruction,
Xen will forward the abort to the guest.

Also, the logic to infer the type of instruction has been moved from
try_handle_mmio() to try_decode_instruction() which is called before.
try_handle_mmio() is solely responsible for handling the mmio operation.

Signed-off-by: Ayan Kumar Halder 


Reviewed-by: Julien Grall 

Cheers,

--
Julien Grall



Re: [PATCH v11 1/3] xen/arm64: io: Emulate instructions (with invalid ISS) on MMIO region

2022-03-17 Thread Stefano Stabellini
On Thu, 17 Mar 2022, Ayan Kumar Halder wrote:
> When an instruction is trapped in Xen due to translation fault, Xen
> checks if the ISS is invalid (for data abort) or it is an instruction
> abort. If so, Xen tries to resolve the translation fault using p2m page
> tables. In case of data abort, Xen will try to map the mmio region to
> the guest (ie tries to emulate the mmio region).
> 
> If the ISS is not valid and it is a data abort, then Xen tries to
> decode the instruction. In case of ioreq, Xen  saves the decoding state,
> rn and imm9 to vcpu_io. Whenever the vcpu handles the ioreq successfully,
> it will read the decoding state to determine if the instruction decoded
> was a ldr/str post indexing (ie INSTR_LDR_STR_POSTINDEXING). If so, it
> uses these details to post increment rn.
> 
> In case of mmio handler, if the mmio operation was successful, then Xen
> retrives the decoding state, rn and imm9. For state ==
> INSTR_LDR_STR_POSTINDEXING, Xen will update rn.
> 
> If there is an error encountered while decoding/executing the instruction,
> Xen will forward the abort to the guest.
> 
> Also, the logic to infer the type of instruction has been moved from
> try_handle_mmio() to try_decode_instruction() which is called before.
> try_handle_mmio() is solely responsible for handling the mmio operation.
> 
> Signed-off-by: Ayan Kumar Halder 

I managed to reproduce (on QEMU emulating a cortex-a15, see recent patch
series for automation) the original crash. And I also tested that this
version of the patch doesn't have the same issue. Now it works, so:

Tested-by: Stefano Stabellini 
Reviewed-by: Stefano Stabellini 


> ---
> 
> Changelog :-
> 
> v2..v5 - Mentioned in the cover letter.
> 
> v6 - 1. Mantained the decoding state of the instruction. This is used by the
> caller to either abort the guest or retry or ignore or perform read/write on
> the mmio region.
> 
> 2. try_decode() invokes decoding for both aarch64 and thumb state. (Previously
> it used to invoke decoding only for aarch64 state). Thus, it handles all the
> checking of the registers before invoking any decoding of instruction.
> try_decode_instruction_invalid_iss() has thus been removed.
> 
> 3. Introduced a new field('enum instr_decode_state state') inside
> 'struct instr_details'. This holds the decoding state of the instruction.
> This is later read by the post_increment_register() to determine if rn needs 
> to
> be incremented. Also, this is read by the callers of try_decode_instruction()
> to determine if the instruction was valid or ignored or to be retried or
> error or decoded successfully.
> 
> 4. Also stored 'instr_details' inside 'struct ioreq'. This enables
> arch_ioreq_complete_mmio() to invoke post_increment_register() without 
> decoding
> the instruction again.
> 
> 5. Check hsr.dabt.valid in do_trap_stage2_abort_guest(). If it is not valid,
> then decode the instruction. This ensures that try_handle_mmio() is invoked 
> only
> when the instruction is either valid or decoded successfully.
> 
> 6. Inside do_trap_stage2_abort_guest(), if hsr.dabt.valid is not set, then
> resolve the translation fault before trying to decode the instruction. If
> translation fault is resolved, then return to the guest to execute the 
> instruction
> again.
> 
> 
> v7 - 1. Moved the decoding instruction details ie instr_details from 'struct 
> ioreq'
> to 'struct vcpu_io'.
> 
> 2. The instruction is decoded only when we get a data abort.
> 
> 3. Replaced ASSERT_UNREACHABLE() with domain_crash(). The reason being asserts
> can be disabled in some builds. In this scenario when the guest's cpsr is in 
> an
> erroneous state, Xen should crash the guest.
> 
> 4. Introduced check_p2m() which invokes p2m_resolve_translation_fault() and
> try_map_mmio() to resolve translation fault by configuring the page tables. 
> This
> gets invoked first if ISS is invalid and it is an instruction abort. If it is
> a data abort and hsr.dabt.s1ptw is set or try_handle_mmio() returns 
> IO_UNHANDLED,
> then check_p2m() gets invoked again.
> 
> 
> v8 - 1. Removed the handling of data abort when info->dabt.cache is set. This 
> will
> be implemented in a subsequent patch. (Not as part of this series)
> 
> 2. When the data abort is due to access to stage 1 translation tables, Xen 
> will
> try to fix the mapping of the page table for the corresponding address. If 
> this
> returns an error, Xen will abort the guest. Else, it will ask the guest to 
> retry
> the instruction.
> 
> 3. Changed v->io.info.dabt_instr from pointer to variable. The reason being 
> that
> arch_ioreq_complete_mmio() is called from leave_hypervisor_to_guest().
> That is after do_trap_stage2_abort_guest()  has been invoked. So the original
> variable will be no longer valid.
> 
> 4. Some other style issues pointed out in v7.
> 
> 
> v9 - 1. Ensure that "Erratum 766422" is handled only when ISS is valid.
> 
> 2. Whenever Xen receives and instruction abort or data abort (with invalid 
> ISS),
> Xen sho

[PATCH v11 1/3] xen/arm64: io: Emulate instructions (with invalid ISS) on MMIO region

2022-03-17 Thread Ayan Kumar Halder
When an instruction is trapped in Xen due to translation fault, Xen
checks if the ISS is invalid (for data abort) or it is an instruction
abort. If so, Xen tries to resolve the translation fault using p2m page
tables. In case of data abort, Xen will try to map the mmio region to
the guest (ie tries to emulate the mmio region).

If the ISS is not valid and it is a data abort, then Xen tries to
decode the instruction. In case of ioreq, Xen  saves the decoding state,
rn and imm9 to vcpu_io. Whenever the vcpu handles the ioreq successfully,
it will read the decoding state to determine if the instruction decoded
was a ldr/str post indexing (ie INSTR_LDR_STR_POSTINDEXING). If so, it
uses these details to post increment rn.

In case of mmio handler, if the mmio operation was successful, then Xen
retrives the decoding state, rn and imm9. For state ==
INSTR_LDR_STR_POSTINDEXING, Xen will update rn.

If there is an error encountered while decoding/executing the instruction,
Xen will forward the abort to the guest.

Also, the logic to infer the type of instruction has been moved from
try_handle_mmio() to try_decode_instruction() which is called before.
try_handle_mmio() is solely responsible for handling the mmio operation.

Signed-off-by: Ayan Kumar Halder 
---

Changelog :-

v2..v5 - Mentioned in the cover letter.

v6 - 1. Mantained the decoding state of the instruction. This is used by the
caller to either abort the guest or retry or ignore or perform read/write on
the mmio region.

2. try_decode() invokes decoding for both aarch64 and thumb state. (Previously
it used to invoke decoding only for aarch64 state). Thus, it handles all the
checking of the registers before invoking any decoding of instruction.
try_decode_instruction_invalid_iss() has thus been removed.

3. Introduced a new field('enum instr_decode_state state') inside
'struct instr_details'. This holds the decoding state of the instruction.
This is later read by the post_increment_register() to determine if rn needs to
be incremented. Also, this is read by the callers of try_decode_instruction()
to determine if the instruction was valid or ignored or to be retried or
error or decoded successfully.

4. Also stored 'instr_details' inside 'struct ioreq'. This enables
arch_ioreq_complete_mmio() to invoke post_increment_register() without decoding
the instruction again.

5. Check hsr.dabt.valid in do_trap_stage2_abort_guest(). If it is not valid,
then decode the instruction. This ensures that try_handle_mmio() is invoked only
when the instruction is either valid or decoded successfully.

6. Inside do_trap_stage2_abort_guest(), if hsr.dabt.valid is not set, then
resolve the translation fault before trying to decode the instruction. If
translation fault is resolved, then return to the guest to execute the 
instruction
again.


v7 - 1. Moved the decoding instruction details ie instr_details from 'struct 
ioreq'
to 'struct vcpu_io'.

2. The instruction is decoded only when we get a data abort.

3. Replaced ASSERT_UNREACHABLE() with domain_crash(). The reason being asserts
can be disabled in some builds. In this scenario when the guest's cpsr is in an
erroneous state, Xen should crash the guest.

4. Introduced check_p2m() which invokes p2m_resolve_translation_fault() and
try_map_mmio() to resolve translation fault by configuring the page tables. This
gets invoked first if ISS is invalid and it is an instruction abort. If it is
a data abort and hsr.dabt.s1ptw is set or try_handle_mmio() returns 
IO_UNHANDLED,
then check_p2m() gets invoked again.


v8 - 1. Removed the handling of data abort when info->dabt.cache is set. This 
will
be implemented in a subsequent patch. (Not as part of this series)

2. When the data abort is due to access to stage 1 translation tables, Xen will
try to fix the mapping of the page table for the corresponding address. If this
returns an error, Xen will abort the guest. Else, it will ask the guest to retry
the instruction.

3. Changed v->io.info.dabt_instr from pointer to variable. The reason being that
arch_ioreq_complete_mmio() is called from leave_hypervisor_to_guest().
That is after do_trap_stage2_abort_guest()  has been invoked. So the original
variable will be no longer valid.

4. Some other style issues pointed out in v7.


v9 - 1. Ensure that "Erratum 766422" is handled only when ISS is valid.

2. Whenever Xen receives and instruction abort or data abort (with invalid ISS),
Xen should first try to resolve the p2m translation fault or see if it it needs
to map a MMIO region. If it succeeds, it should return to the guest to retry the
instruction.

3. Removed handling of "dabt.s1ptw == 1" aborts. This is addressed in patch3 as
it is an existing bug in codebase.

4. Various style issues pointed by Julien in v8.


v10 - 1. Set 'dabt.valid=1' when the instruction is fully decoded. This is
checked in try_handle_mmio() and try_fwd_ioserv().

2. Various other style issues pointed in v9.


v11 - 1. Renamed post_increment_register() to fin