Re: [Qemu-devel] [PATCH 01/18] tcg: add support for 128bit vector type

Kirill Batuzov Mon, 23 Jan 2017 02:34:41 -0800

On Sat, 21 Jan 2017, Richard Henderson wrote:

> On 01/19/2017 08:54 AM, Kirill Batuzov wrote:
> > 
> > Wrappers issue emulation code instead of operation if it is not supported by
> > host.
> > 
> > tcg_gen_add_i32x4 looks like this:
> > 
> > if (TCG_TARGET_HAS_add_i32x4) {
> >     tcg_gen_op3_v128(INDEX_op_add_i32x4, args[0], args[1], args[2]);
> > } else {
> >     for (i = 0; i < 4; i++) {
> >         tcg_gen_ld_i32(...);
> >         tcg_gen_ld_i32(...);
> >         tcg_gen_add_i32(...);
> >         tcg_gen_st_i32(...);
> >     }
> > }
> 
> To me that begs the question of why you wouldn't issue 4 adds on 4 i32
> registers instead.
>


Because 4 adds on 4 i32 registers work good only when the size of
vector elements matches the size of scalar variables we use for
representation of a vector. add_i16x8 will not be that great if we use
4 i32 variables: each will need to be split into two values, processed
independently and merged back afterwards. And when we create variable we
do not know which operations will be performed on it.

Scalar variables lack primitives to work with them as vectors of shorter
values. This is one of the reasons I added v64 type instead of using i64
for 64-bit vector operations. And this is the reason I'm so opposed to
using them to represent vector types if vector registers are not
supported by host. Handling vector operations with element size that
does not match representation will be complicated, may require special
handling for different operations and will produce a lot of if-s in code.

The method I'm proposing can handle any operation regardless of
representation. This includes handling situation where host supports
vector registers but does not support required operation (for example 
SSE/AVX does not support multiplication of vectors of 8-bit values).

-- 
Kirill

Re: [Qemu-devel] [PATCH 01/18] tcg: add support for 128bit vector type

Reply via email to