On 28/08/18 15:40, Volodymyr Babchuk wrote:
Hi Julien,
On 28.08.18 17:05, Julien Grall wrote:
Hi Volodymyr,
On 27/08/18 17:50, Volodymyr Babchuk wrote:
On 27.08.18 18:29, Julien Grall wrote:
On 27/08/2018 15:15, Volodymyr Babchuk wrote:
On 24.08.18 19:58, Julien Grall wrote:
Signed-off-by: Julien Grall <julien.gr...@arm.com>
This is indeed increasing the size of the function, but with a
better performance when you perform an SMC with 1.1.
The main goal of SMCCC 1.1 is to limit the number of registers saved
when calling an SMC. So if you provide provide a wrapper using a
function, then you are better off sticking with SMCCC 1.0.
The idea of this code is to provide a faster call on the presence of
SMCCC 1.1 while providing a fallback for other case. The function
cpus_have_const_cap is implemented using an ALTERNATIVE (similar to
static key) that will make if close to a NOP. I am saying close
because this is not quite a static key as we don't have it in Xen.
Instead, you replace a memory load by a constant.
Ah, yes, true.
I checked how static keys are working. Seems, they use asm goto
feature. Now I think: can we optimize this more? Something like that:
#define arm_smccc_smc(...)
do {
asm goto (ALTERNATIVE(nop",
"b %l[smccc_1_1]",
ARM_SMCCC_1_1));
You would need to list the label in GotoLabels (see 6.45.2.7 [1]).
Yes, but as you said, we can use __LINE__ there.
arm_smccc_1_0_smc(__VA_ARGS__);
break;
smccc_1_1:
The label will get redefined if you have multiple SMC call in the same
function. We could possibly generate label based on the file/line.
arm_smccc_1_1_smc(__VA_ARGS__);
} while ( 0 )
This will save use additional conditional branch.
The code with this patch looks like:
mov x0, #0 mov x0, #1
cbz w0, label cbz w0, label
The mov + conditional branch is likely to be negligible over the cost
of switching to EL3. Overall, I am not really convinced that it would
be worth the cost of open-coding it. I would prefer if we keep the
call to cpus_have_const_cap() and see if it can be optimized.
Okay. I'm fine with this. I just saw one more way to optimize and wanted
to share it with you. Anyways:
Reviewed-by: Volodymyr Babchuk <volodymyr_babc...@epam.com>
Thank you!
I have looked at cpus_have_const_cap() and haven't found good way to
optimize it with the current infrastructure in Xen. Feel free to
suggest improvement.
Another thing: maybe it is worth to branch to 1.0 code and leave 1.1 in
a straight path of execution? This will save you one more instruction
for SMCCC 1.1 call.
I am not sure to understand your suggestion here. Could you expand?
However, why SMCCC 1.1 should be in the straight path of execution?
Cheers,
--
Julien Grall
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel