On 10/05/2016 11:54 PM, Julien Grall wrote:
> 
> 
> On 05/10/2016 13:23, Tamas K Lengyel wrote:
>> Hi Julien,
>> It is expected that certain combinations of mem_access flags will put
>> the domain into unstable condition, resulting in a crash or a hang. As
>> Razvan mentioned, on x86 we can end up triggering EPT misconfiguration
>> with the wrong set of flags. The user of the API is expected to know
>> what they are doing in this regard; we don't do any enforcement or
>> sanity checking on the Xen side.
>>
>> As to the issue you describe, indeed that can happen. If the user marks
>> a pagetable area non-readable/non-writable, ARM reports the faulting
>> walk for an instruction fetch as an execute violation when it traps.
>> This hangs the VM in a continuous violation state, because the user
>> never requested execute violations on that gfn. There are other
>> situations where this can happen: on ARM there is no such thing as
>> execute-only memory, so any request for execute-only or
>> writable-executable memory leads to problems like this, i.e. an
>> instruction fetch violation when the user only requested
>> read violations. But again, the users are expected to know what they are
>> doing and perform their own sanity checks as appropriate.
> 
> I think the problem I described is neither the fault of the user
> nor a misconfiguration of the page table. Let me clarify.
> 
> The user can purposefully restrict access to the stage-1 page tables to
> detect when the OS is modifying them. As a side effect, this will also
> impact the page table walker.
> 
> A prefetch abort (i.e. an error that occurs while the processor is
> trying to fetch an instruction) can occur either during a stage-1 page
> table walk (e.g. the underlying memory of the stage-1 page table has
> been protected) or because the permission in the stage-2 entry has been
> restricted.
> 
> In the latter case, this will always be because the memory is not
> executable. The former, however, may happen when the page table walker
> (i.e. the MMU) is reading/writing the entry.
> 
> However, Xen on ARM today always assumes that a prefetch abort
> happened because it was not possible to execute the instruction.
> 
> I requested clarification about the flags because we need to fix this
> valid issue. From the usage on ARM and in the vm event app, it is not
> clear how those flags should be used.

I understand. FWIW, I find it better to send the most precise type of
event. In your case, if the application gets a read-violation event, it
can do something about it (for example, lift the restrictions on the
page). If it instead gets an execute-violation event, allowing execution
on that page would not solve the issue and would leave the guest in an
infinite loop, as you say. The problem here is that the application
never gets a chance to do the right thing, even if it wants to and is
capable of it.

So I'm all for properly differentiating between these two cases, unless
the ARM architecture reference manual disagrees or there's some reason
why this is unfeasible.
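
For illustration only, here is a minimal C sketch of that differentiation.
It is not Xen's actual code. It assumes the trap handler has already
established that the abort is a stage-2 permission fault taken on an
instruction fetch, and it uses the S1PTW bit (bit 7 of the syndrome's ISS
field) to tell a stage-1 table walk apart from the fetch itself. The names
ISS_S1PTW, enum violation and classify_prefetch_abort are hypothetical, and
treating every walk as a read is a simplification:

```c
#include <stdint.h>

/* ISS bit 7 of an Instruction Abort syndrome (S1PTW): the stage-2 fault
 * was taken on an access made for a stage-1 page table walk. */
#define ISS_S1PTW (UINT32_C(1) << 7)

/* Hypothetical event kinds, for illustration only. */
enum violation { VIOLATION_READ, VIOLATION_EXEC };

/* Classify a stage-2 permission fault taken on an instruction fetch. */
static enum violation classify_prefetch_abort(uint32_t iss)
{
    /* The MMU was reading the stage-1 table, not fetching code.
     * Assumption: hardware updates of table entries (which would be
     * writes) are ignored in this sketch. */
    if (iss & ISS_S1PTW)
        return VIOLATION_READ;

    /* Otherwise the fetch itself hit a non-executable stage-2 mapping. */
    return VIOLATION_EXEC;
}
```

With a split like this, an application that only asked for read violations
on a page-table gfn would receive a read event for the walk and could lift
the restriction, instead of the guest looping on an execute violation that
nobody requested.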


Thanks,
Razvan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
