Re: [Mesa-dev] [PATCH] radeonsi: correct si_vgt_param_key on big endian machines

2018-04-10 Thread Bas Vermeulen
On Mon, Apr 9, 2018 at 11:19 PM, Gert Wollny  wrote:

> Am Montag, den 09.04.2018, 14:03 -0400 schrieb Marek Olšák:
> > On Mon, Apr 9, 2018 at 10:51 AM, Bas Vermeulen 
> > wrote:
> Which solution is better depends on what is done more often: reading
> the index or writing to the bit fields.
>

The bitfields are read and written, and the index is mostly read. I found
four instances of the bitfields being written after which the index needs to
be updated.


> > > I am working on a new version of this patch. I have one version
> > > which does away with all the bitfields, and uses functions to
> > > update the index.
> This emulates the code the compiler would create, but it requires that
> for each bit field setters (and getters?) must be implemented.
>

Yes. I have a git branch with this change ready if that's what's
wanted/needed.


> > > Another approach would be to change the union to a struct, and use
> > > a function to get the index.
> This method has the advantage that only the access to the index needs
> new implementation.
>

I can prepare a patch for this as well.


> > > Yet another approach would be to keep the contents of the union and
> > > the index in one struct, and use a function to
> > > (re)calculate the index.
> I don't think that would make much sense.
>

It adds four lines to the code, all the key->u.xxx has it's u. removed.
But future implementation needs to remember to call that function if any of
the bitfields are changed. Which can be annoying.

There is another option: Check at configuration time whether the bit
> field layout is like the low or the high endian layout you already
> implemented, and instead of basing the selection of the struct layout
> on the big/low-endianess of the architecture, base it on this test.
>
> It would probably be prudent to test both layouts and then fail
> configuration if non of the two reflect the actual layout (at which
> point one would have to thing about how to implement all the bit
> shifting properly).


Or just keep the union dependent on endianness, and add an assert/check/test
to make sure that everything works as expected.


> > >
> > > Which would you prefer?
> > >
> >
> > I don't mind bitfields. They make the code nice and tiny. Shifts
> > would decrease readability.
> The problem is, that the layout of bitfields is compiler dependend.
>

Let me know what you guys want to have this included. I just want it fixed,
I don't really care on the form. :)

Bas Vermeulen
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radeonsi: correct si_vgt_param_key on big endian machines

2018-04-09 Thread Bas Vermeulen
I am working on a new version of this patch. I have one version which does
away with all the bitfields, and uses
functions to update the index.
Another approach would be to change the union to a struct, and use a
function to get the index.
Yet another approach would be to keep the contents of the union and the
index in one struct, and use a function to
(re)calculate the index.

Which would you prefer?

Bas Vermeulen

On Tue, Mar 20, 2018 at 6:33 PM, Gert Wollny  wrote:

> Am Dienstag, den 20.03.2018, 15:33 +0100 schrieb Nicolai Hähnle:
> > Nice, did you actually get it to work entirely on a big endian
> > machine?
> >
> > Bit fields aren't super portable, [...]
> Indeed, the order of the bits in a bit field is compiler implementation
> dependent. To make sure that changing the compiler doesn't change the
> behaviour of the code I'd suggest that instead of using a bit field the
> index should be created by explicitly shifting the bits into the right
> positions.
>
> Best,
> Gert
>
> > However, I
> > think we should use the PIPE_ARCH_LITTLE_ENDIAN define from
> > u_endian.h
> >
> > Cheers,
> > Nicolai
> >
> > On 20.03.2018 15:21, Bas Vermeulen wrote:
> > > Using mesa OpenCL failed on a big endian PowerPC machine because
> > > si_vgt_param_key is using bitfields and a 32 bit int for an
> > > index into an array.
> > >
> > > Fix si_vgt_param_key to work correctly on both little endian
> > > and big endian machines.
> > >
> > > Signed-off-by: Bas Vermeulen 
> > > ---
> > >   src/gallium/drivers/radeonsi/si_pipe.h | 13 +
> > >   1 file changed, 13 insertions(+)
> > >
> > > diff --git a/src/gallium/drivers/radeonsi/si_pipe.h
> > > b/src/gallium/drivers/radeonsi/si_pipe.h
> > > index 2053dcb9fc..32dbdf6e2c 100644
> > > --- a/src/gallium/drivers/radeonsi/si_pipe.h
> > > +++ b/src/gallium/drivers/radeonsi/si_pipe.h
> > > @@ -385,6 +385,7 @@ struct si_shader_ctx_state {
> > >*/
> > >   union si_vgt_param_key {
> > > struct {
> > > +#if (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
> > > unsigned prim:4;
> > > unsigned uses_instancing:1;
> > > unsigned
> > > multi_instances_smaller_than_primgroup:1;
> > > @@ -395,6 +396,18 @@ union si_vgt_param_key {
> > > unsigned tess_uses_prim_id:1;
> > > unsigned uses_gs:1;
> > > unsigned _pad:32 - SI_NUM_VGT_PARAM_KEY_BITS;
> > > +#else /* __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ */
> > > +   unsigned _pad:32 - SI_NUM_VGT_PARAM_KEY_BITS;
> > > +   unsigned uses_gs:1;
> > > +   unsigned tess_uses_prim_id:1;
> > > +   unsigned uses_tess:1;
> > > +   unsigned line_stipple_enabled:1;
> > > +   unsigned count_from_stream_output:1;
> > > +   unsigned primitive_restart:1;
> > > +   unsigned multi_instances_smaller_than_primgroup:1;
> > > +   unsigned uses_instancing:1;
> > > +   unsigned prim:4;
> > > +#endif
> > > } u;
> > > uint32_t index;
> > >   };
> > >
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] radeonsi: Patches to enable the use of OpenCL on a big endian system

2018-04-09 Thread Bas Vermeulen
Hi,

The two patches I sent an hour ago enable the use of OpenCL on a big endian
system.
I've tested this with a modified mesa 17.3.6, using kernel 4.1.35 and
4.16.0 on a Freescale
T2080rdb board (4-way e6500 PPC) running Debian unstable.

The patches in question are:
radeonsi: correct si_vgt_param_key on big endian machines
radeonsi: convert dispatch packet to little endian

The first patch adds a big endian version of the si_vgt_param_key union,
the second patch
modifies si_compute.c to convert the endianness of the data sent to the GPU
when it is
uploaded to VRAM instead of being sent through the rings.

Together, these patches allow me to execute OpenCL kernels correctly on the
T2080 mentioned above.

Bas Vermeulen
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] radeonsi: convert dispatch packet to little endian

2018-04-09 Thread Bas Vermeulen
The parameters for the compute engine are wrong when using
an E8860 on a big endian machine.
To fix this, convert the contents of struct dispatch_packet
to little endian.

This ensures that get_global_id(0) and similar functions
in the OpenCL code get the correct endian values, and
makes my simple OpenCL program work correctly.

Signed-off-by: Bas Vermeulen 
---
 src/gallium/drivers/radeonsi/si_compute.c | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_compute.c 
b/src/gallium/drivers/radeonsi/si_compute.c
index dfede47605..8ac5b262c4 100644
--- a/src/gallium/drivers/radeonsi/si_compute.c
+++ b/src/gallium/drivers/radeonsi/si_compute.c
@@ -564,18 +564,18 @@ static void si_setup_user_sgprs_co_v2(struct si_context 
*sctx,
/* Upload dispatch ptr */
memset(&dispatch, 0, sizeof(dispatch));
 
-   dispatch.workgroup_size_x = info->block[0];
-   dispatch.workgroup_size_y = info->block[1];
-   dispatch.workgroup_size_z = info->block[2];
+   dispatch.workgroup_size_x = util_cpu_to_le16(info->block[0]);
+   dispatch.workgroup_size_y = util_cpu_to_le16(info->block[1]);
+   dispatch.workgroup_size_z = util_cpu_to_le16(info->block[2]);
 
-   dispatch.grid_size_x = info->grid[0] * info->block[0];
-   dispatch.grid_size_y = info->grid[1] * info->block[1];
-   dispatch.grid_size_z = info->grid[2] * info->block[2];
+   dispatch.grid_size_x = util_cpu_to_le32(info->grid[0] * 
info->block[0]);
+   dispatch.grid_size_y = util_cpu_to_le32(info->grid[1] * 
info->block[1]);
+   dispatch.grid_size_z = util_cpu_to_le32(info->grid[2] * 
info->block[2]);
 
-   dispatch.private_segment_size = program->private_size;
-   dispatch.group_segment_size = program->local_size;
+   dispatch.private_segment_size = 
util_cpu_to_le32(program->private_size);
+   dispatch.group_segment_size = 
util_cpu_to_le32(program->local_size);
 
-   dispatch.kernarg_address = kernel_args_va;
+   dispatch.kernarg_address = util_cpu_to_le64(kernel_args_va);
 
u_upload_data(sctx->b.const_uploader, 0, sizeof(dispatch),
   256, &dispatch, &dispatch_offset,
@@ -652,9 +652,9 @@ static bool si_upload_compute_input(struct si_context *sctx,
 
if (!code_object) {
for (i = 0; i < 3; i++) {
-   kernel_args[i] = info->grid[i];
-   kernel_args[i + 3] = info->grid[i] * info->block[i];
-   kernel_args[i + 6] = info->block[i];
+   kernel_args[i] = util_cpu_to_le32(info->grid[i]);
+   kernel_args[i + 3] = util_cpu_to_le32(info->grid[i] * 
info->block[i]);
+   kernel_args[i + 6] = util_cpu_to_le32(info->block[i]);
}
}
 
-- 
2.14.1


-- 
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] radeonsi: correct si_vgt_param_key on big endian machines

2018-04-09 Thread Bas Vermeulen
Using mesa OpenCL failed on a big endian PowerPC machine because
si_vgt_param_key is using bitfields and a 32 bit int for an
index into an array.

Fix si_vgt_param_key to work correctly on both little endian
and big endian machines.

Signed-off-by: Bas Vermeulen 
---
 src/gallium/drivers/radeonsi/si_pipe.h | 13 +
 1 file changed, 13 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
b/src/gallium/drivers/radeonsi/si_pipe.h
index 9fb18a84d3..e3e5d5ac91 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -464,6 +464,7 @@ struct si_shader_ctx_state {
  */
 union si_vgt_param_key {
struct {
+#ifdef PIPE_ARCH_LITTLE_ENDIAN
unsigned prim:4;
unsigned uses_instancing:1;
unsigned multi_instances_smaller_than_primgroup:1;
@@ -474,6 +475,18 @@ union si_vgt_param_key {
unsigned tess_uses_prim_id:1;
unsigned uses_gs:1;
unsigned _pad:32 - SI_NUM_VGT_PARAM_KEY_BITS;
+#else /* PIPE_ARCH_BIG_ENDIAN */
+   unsigned _pad:32 - SI_NUM_VGT_PARAM_KEY_BITS;
+   unsigned uses_gs:1;
+   unsigned tess_uses_prim_id:1;
+   unsigned uses_tess:1;
+   unsigned line_stipple_enabled:1;
+   unsigned count_from_stream_output:1;
+   unsigned primitive_restart:1;
+   unsigned multi_instances_smaller_than_primgroup:1;
+   unsigned uses_instancing:1;
+   unsigned prim:4;
+#endif
} u;
uint32_t index;
 };
-- 
2.14.1


-- 
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] Mixed endianness with OpenCL

2018-03-23 Thread Bas Vermeulen
Hi,

I have the following situation:

A PowerPC (T2080) big endian CPU with an AMD E8860 (little endian) PCIe
graphics card.
I have modified the radeonsi gallium driver to allow execution on big
endian (there was a
union with bitfields and an uint32_t index where the index was out of range
because
the bitfields assumed little endian; patch is in the moderator queue).
When I use an OpenCL program that fills a 1024 buffer of floats with values
from 1..1024,
the floats are mangled when I retrieve the buffer.
Some values I can recover by byteswapping (as expected with mixed
endianness), others are
mangled somehow.

My question is, does the radeonsi gallium driver process result buffers
somewhere in the code?
Pointers are more than welcome. I'm guessing the values are
processed/mangled some way,
and I would love to make this work correctly.

Bas Vermeulen
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] radeonsi: correct si_vgt_param_key on big endian machines

2018-03-20 Thread Bas Vermeulen
Using mesa OpenCL failed on a big endian PowerPC machine because
si_vgt_param_key is using bitfields and a 32 bit int for an
index into an array.

Fix si_vgt_param_key to work correctly on both little endian
and big endian machines.

Signed-off-by: Bas Vermeulen 
---
 src/gallium/drivers/radeonsi/si_pipe.h | 13 +
 1 file changed, 13 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
b/src/gallium/drivers/radeonsi/si_pipe.h
index 2053dcb9fc..32dbdf6e2c 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -385,6 +385,7 @@ struct si_shader_ctx_state {
  */
 union si_vgt_param_key {
struct {
+#if (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
unsigned prim:4;
unsigned uses_instancing:1;
unsigned multi_instances_smaller_than_primgroup:1;
@@ -395,6 +396,18 @@ union si_vgt_param_key {
unsigned tess_uses_prim_id:1;
unsigned uses_gs:1;
unsigned _pad:32 - SI_NUM_VGT_PARAM_KEY_BITS;
+#else /* __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ */
+   unsigned _pad:32 - SI_NUM_VGT_PARAM_KEY_BITS;
+   unsigned uses_gs:1;
+   unsigned tess_uses_prim_id:1;
+   unsigned uses_tess:1;
+   unsigned line_stipple_enabled:1;
+   unsigned count_from_stream_output:1;
+   unsigned primitive_restart:1;
+   unsigned multi_instances_smaller_than_primgroup:1;
+   unsigned uses_instancing:1;
+   unsigned prim:4;
+#endif
} u;
uint32_t index;
 };
-- 
2.14.1


-- 
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radeonsi: correct si_vgt_param_key on big endian machines

2018-03-20 Thread Bas Vermeulen
I'm able to call clinfo without things crashing. Without this fix, clinfo
results in a signal 11 because key.index is byte swapped. With it,
I get the information I would expect. I'm working to test the OpenCL
currently.

I'll update the patch to use PIPE_ARCH_LITTLE_ENDIAN instead of my own #if.

Bas Vermeulen

On Tue, Mar 20, 2018 at 3:33 PM, Nicolai Hähnle 
wrote:

> Nice, did you actually get it to work entirely on a big endian machine?
>
> Bit fields aren't super portable, but this looks good enough. However, I
> think we should use the PIPE_ARCH_LITTLE_ENDIAN define from u_endian.h
>
> Cheers,
> Nicolai
>
>
> On 20.03.2018 15:21, Bas Vermeulen wrote:
>
>> Using mesa OpenCL failed on a big endian PowerPC machine because
>> si_vgt_param_key is using bitfields and a 32 bit int for an
>> index into an array.
>>
>> Fix si_vgt_param_key to work correctly on both little endian
>> and big endian machines.
>>
>> Signed-off-by: Bas Vermeulen 
>> ---
>>   src/gallium/drivers/radeonsi/si_pipe.h | 13 +
>>   1 file changed, 13 insertions(+)
>>
>> diff --git a/src/gallium/drivers/radeonsi/si_pipe.h
>> b/src/gallium/drivers/radeonsi/si_pipe.h
>> index 2053dcb9fc..32dbdf6e2c 100644
>> --- a/src/gallium/drivers/radeonsi/si_pipe.h
>> +++ b/src/gallium/drivers/radeonsi/si_pipe.h
>> @@ -385,6 +385,7 @@ struct si_shader_ctx_state {
>>*/
>>   union si_vgt_param_key {
>> struct {
>> +#if (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
>> unsigned prim:4;
>> unsigned uses_instancing:1;
>> unsigned multi_instances_smaller_than_primgroup:1;
>> @@ -395,6 +396,18 @@ union si_vgt_param_key {
>> unsigned tess_uses_prim_id:1;
>> unsigned uses_gs:1;
>> unsigned _pad:32 - SI_NUM_VGT_PARAM_KEY_BITS;
>> +#else /* __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ */
>> +   unsigned _pad:32 - SI_NUM_VGT_PARAM_KEY_BITS;
>> +   unsigned uses_gs:1;
>> +   unsigned tess_uses_prim_id:1;
>> +   unsigned uses_tess:1;
>> +   unsigned line_stipple_enabled:1;
>> +   unsigned count_from_stream_output:1;
>> +   unsigned primitive_restart:1;
>> +   unsigned multi_instances_smaller_than_primgroup:1;
>> +   unsigned uses_instancing:1;
>> +   unsigned prim:4;
>> +#endif
>> } u;
>> uint32_t index;
>>   };
>>
>>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev