Hi,
On 18/06/2022 18:43, Dmytro Semenets wrote:
пт, 17 июн. 2022 г. в 14:12, Julien Grall <[email protected]>:
Hi Julien,
Hi,
On 17/06/2022 10:10, Volodymyr Babchuk wrote:
Julien Grall <[email protected]> writes:
On 16/06/2022 19:40, Volodymyr Babchuk wrote:
Hi Julien,
Hi Volodymyr,
Julien Grall <[email protected]> writes:
Hi,
On 16/06/2022 14:55, [email protected] wrote:
From: Dmytro Semenets <[email protected]>
According to PSCI specification ARM TF can return DENIED on CPU OFF.
I am confused. The spec is talking about Trusted OS and not
firmware. The docummentation is also not specific to ARM Trusted
Firmware. So did you mean "Trusted OS"?
It should be "firmware", I believe.
Hmmm... I couldn't find a reference in the spec suggesting that
CPU_OFF could return DENIED because of the firmware. Do you have a
pointer to the spec?
Ah, looks like we are talking about different things. Indeed, CPU_OFF
can return DENIED only because of Trusted OS. But entity that *returns*
the error to a caller is a firmware.
Right, the interesting part is *why* DENIED is returned not *who*
returns it.
ARM TF returns DENIED *only* for the platform I have.
We have a dissonance between spec and xen implementation because
DENIED returned by
ARM TF or Thrusted OS or whatever is not a reason for panic.
I agree that's not a reason for panic. However, knowing the reason does
help to figure out the correct approach.
For instance, one could have suggest to migrate the trusted OS to
another pCPU. But this would not have worked for you because the DENIED
is not about that.
And we
have issues with this.
If machine_restart() behaviour is more or less correct (sometimes
reports about panic but restarts the machine)
Right...
but machine_halt() doesn't work at al
... this should also be the case here because machine_halt() could also
be called from cpu0. So I am a bit confused why you are saying it never
works.
Transit execution to CPU0 for my understanding is a workaround and
this approach will fix
machine_restart() function but will not fix machine_halt().
I would say it is a more specific case of what the spec suggests (see
below). But it should fix both machine_restart() and machine_halt()
because the last CPU running will be CPU0. So Xen would call SYSTEM_*
rather than CPU_OF. So I don't understand why you think it will fix one
but not the other.
In fact, the idea to always run the request from a given CPU is quite
similar to what the specification suggests (5.10.3 DEN0022D.b):
"
One way in which cores can be placed into a known state is to use calls
to CPU_OFF on all online cores
except for the last one, which instead uses SYSTEM_OFF. If a UP Trusted
OS is present, this method
only works if the core that calls SYSTEM_OFF is the one where the
Trusted OS is resident, as calls to
CPU_OFF on this core return a DENIED error. Any core can call SYSTEM_OFF.
"
For Xen, we would need to detect if the trusted OS is UP and where it is
running. Then we could always restart/halt from that CPU or CPU0.
Approach
you suggested (spinning all cpus) will work but
will save less energy.
I am not sure to understand what's the concern about the energy here.
From my understanding of the specification, SYSTEM_OFF will take care
of switching off the power for all the cores. So at worse, the CPUs will
spin for a few ms. This would like be more efficient than a call to PSCI
CPU off.
This is different compare just turning off one CPU (i.e. CPU hot-unplug)
because the CPU will end up to spin for a very long time. And this is
why I wasn't OK with conditionally avoiding the panic.
Cheers,
--
Julien Grall