Re: [RFC v2 2/5] cputlb: Replace switches in load/store_helper with callback

2019-09-19 Thread Richard Henderson
On 9/19/19 7:53 AM, David Hildenbrand wrote:
> On 18.09.19 20:02, Richard Henderson wrote:
>> Add a function parameter to perform the actual load/store to ram.
>> With optimization, this results in identical code.
>>
>> Signed-off-by: Richard Henderson 
>> ---
>>  accel/tcg/cputlb.c | 159 +++--
>>  1 file changed, 83 insertions(+), 76 deletions(-)
> 
> I would have guessed the compiler propagates the constant and eliminates
> the switch completely for the variants. After all, we now have more LOC ...
> 
> I would have moved the actual read/write to a separate function
> containing the switch statement instead.

I tried that.

The compiler does fold away the constant and eliminate the switch for the code
as-is.  But it somehow fails to do so for op ^ MO_BSWAP, where op is constant.
 I don't know why.

These little inline functions were the third try and the only one for which all
of the indirection folded away.  Not so much here, but in the next patch which
passes in the reverse-endian function as well.


>> +/* Wrap the unaligned load helpers to that they have a common signature.  */
>> +static inline uint64_t wrap_ldub(const void *haddr)
>> +{
>> +return ldub_p(haddr);
> 
> I wonder if you should just add proper type cast to all of the <
> uint64_t accessors (e.g., here (uint8_t) ). Shouldn't hurt and makes
> people wonder less how the conversion from the int these accessors
> return to uint64_t will turn out.
> 
> But yeah, you're simply moving code here.

Perhaps.  I'm not really a fan of redundant casts.


r~



Re: [RFC v2 2/5] cputlb: Replace switches in load/store_helper with callback

2019-09-19 Thread David Hildenbrand
On 18.09.19 20:02, Richard Henderson wrote:
> Add a function parameter to perform the actual load/store to ram.
> With optimization, this results in identical code.
> 
> Signed-off-by: Richard Henderson 
> ---
>  accel/tcg/cputlb.c | 159 +++--
>  1 file changed, 83 insertions(+), 76 deletions(-)

I would have guessed the compiler propagates the constant and eliminates
the switch completely for the variants. After all, we now have more LOC ...

I would have moved the actual read/write to a separate function
containing the switch statement instead.

But I am pretty sure you know what you're doing here :)

> 
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index b87764..b4a63d3928 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -1280,11 +1280,38 @@ static void *atomic_mmu_lookup(CPUArchState *env, 
> target_ulong addr,
>  
>  typedef uint64_t FullLoadHelper(CPUArchState *env, target_ulong addr,
>  TCGMemOpIdx oi, uintptr_t retaddr);
> +typedef uint64_t LoadHelper(const void *);
> +
> +/* Wrap the unaligned load helpers to that they have a common signature.  */
> +static inline uint64_t wrap_ldub(const void *haddr)
> +{
> +return ldub_p(haddr);

I wonder if you should just add proper type cast to all of the <
uint64_t accessors (e.g., here (uint8_t) ). Shouldn't hurt and makes
people wonder less how the conversion from the int these accessors
return to uint64_t will turn out.

But yeah, you're simply moving code here.

> +}
> +
> +static inline uint64_t wrap_lduw_be(const void *haddr)
> +{
> +return lduw_be_p(haddr);
> +}
> +
> +static inline uint64_t wrap_lduw_le(const void *haddr)
> +{
> +return lduw_le_p(haddr);
> +}
> +
> +static inline uint64_t wrap_ldul_be(const void *haddr)
> +{
> +return (uint32_t)ldl_be_p(haddr);
> +}
> +
> +static inline uint64_t wrap_ldul_le(const void *haddr)
> +{
> +return (uint32_t)ldl_le_p(haddr);
> +}

Looks sane to me!


-- 

Thanks,

David / dhildenb