Hi,
On 28.05.2010 00:04, Conn Clark wrote:
> On Thu, May 27, 2010 at 8:51 AM, Brian Paul wrote:
>
> This code could be written with a faster algorithm requiring just 13
> operations
>
> + pixel_number |= ((x >> 0) & 1) << 0; // pn[0] = x[0]
> + pixel_number |= ((
> Look up tables have some hidden penalties but I think it might be a
> win. Looks like we may have to benchmark the solutions against one
> another to really know which is best in real life.
For x86 and ppc the single assembler instruction is fastest. Can you wire
an R600 to anything else?
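
The excerpt does not show which lookup table was proposed. As a rough sketch only, one common table-driven form of an integer log2 uses a 256-entry byte table; the names log2_table, log2_table_init and log2_lut are hypothetical and do not come from the patch. The table costs 256 bytes of data cache, which is presumably one of the "hidden penalties" mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 256-entry table: log2_table[i] = floor(log2(i)), with
 * log2_table[0] set to 0 to mirror the loop version in the patch. */
static uint8_t log2_table[256];

static void log2_table_init(void)
{
    log2_table[0] = 0;
    log2_table[1] = 0;
    for (int i = 2; i < 256; ++i)
        log2_table[i] = (uint8_t)(1 + log2_table[i / 2]);
}

/* Table-driven floor(log2(v)) for a 32-bit value: find the highest
 * non-zero byte, then look that byte up in the table. */
static unsigned log2_lut(uint32_t v)
{
    uint32_t t;

    if ((t = v >> 24) != 0)
        return 24 + log2_table[t];
    if ((t = v >> 16) != 0)
        return 16 + log2_table[t];
    if ((t = v >> 8) != 0)
        return 8 + log2_table[t];
    return log2_table[v];
}
```

log2_table_init() has to run once before the first lookup. Whether this beats the loop or a single instruction would still need the benchmark suggested above.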
> Note if it is known that x and y are less than or equal to 7 it can be
> done in 11 operations.
And bsr is one instruction for x86, cntlzw for ppc
Alan
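
To illustrate the single-instruction point, here is a minimal sketch assuming GCC or Clang, whose __builtin_clz typically compiles to bsr on x86 and cntlzw on ppc. The loop is the one quoted in the thread with GLint replaced by int; log2_loop, log2_clz and log2_selfcheck are hypothetical names, not part of the patch or of Mesa. Only non-negative inputs are considered.

```c
#include <assert.h>
#include <limits.h>

/* Loop version, as quoted in the thread (GLint replaced by int here). */
static inline int log2_loop(int n)
{
    int log2 = 0;

    while (n >>= 1)
        ++log2;
    return log2;
}

/* Hypothetical variant: count leading zeros with a GCC/Clang builtin.
 * __builtin_clz(0) is undefined, so n <= 1 is handled explicitly to
 * match the loop version's result of 0. */
static inline int log2_clz(int n)
{
    if (n <= 1)
        return 0;
    return (int)(sizeof(unsigned int) * CHAR_BIT) - 1
           - __builtin_clz((unsigned int)n);
}

/* Small self-check: both versions agree for every positive input tried. */
static inline void log2_selfcheck(void)
{
    for (int n = 1; n < (1 << 20); ++n)
        assert(log2_loop(n) == log2_clz(n));
}
```

Other compilers would need their own intrinsic or a fallback to the loop, so a portable driver would likely wrap this behind a macro.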
On Thu, May 27, 2010 at 7:52 PM, Alan Cox wrote:
>> Look up tables have some hidden penalties but I think it might be a
>> win. Looks like we may have to benchmark the solutions against one
>> another to really know which is best in real life.
>
> For x86 and ppc the single assembler instruction i
On Thu, 27 May 2010 11:20:59 -0400
Alex Deucher wrote:
> On Thu, May 27, 2010 at 10:55 AM, Matt Turner wrote:
> >> +static inline GLint r600_log2(GLint n)
> >> +{
> >> +       GLint log2 = 0;
> >> +
> >> +       while (n >>= 1)
> >> +               ++log2;
> >> +       return log2;
> >> +}
> >
>
On Thu, May 27, 2010 at 4:01 PM, Frieder Ferlemann
wrote:
> Hi,
>
> On 28.05.2010 00:04, Conn Clark wrote:
>> On Thu, May 27, 2010 at 8:51 AM, Brian Paul wrote:
>>
>> This code could be written with a faster algorithm requiring just 13
>> operations
>>
>> + pixel_number |= ((x >
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/lis
On Thu, May 27, 2010 at 8:51 AM, Brian Paul wrote:
> Alex Deucher wrote:
>>
>> On Thu, May 27, 2010 at 10:55 AM, Matt Turner wrote:
+static inline GLint r600_log2(GLint n)
+{
+       GLint log2 = 0;
+
+       while (n >>= 1)
+               ++log2;
+
On Thu, May 27, 2010 at 10:55 AM, Matt Turner wrote:
>> +static inline GLint r600_log2(GLint n)
>> +{
>> +       GLint log2 = 0;
>> +
>> +       while (n >>= 1)
>> +               ++log2;
>> +       return log2;
>> +}
>
> Does mesa not provide something like this?
The only one I could find was a
> +static inline GLint r600_log2(GLint n)
> +{
> +       GLint log2 = 0;
> +
> +       while (n >>= 1)
> +               ++log2;
> +       return log2;
> +}
Does mesa not provide something like this?
Matt
Hi Alex,
not tested (admittedly I haven't compiled it),
and probably not really relevant but these
switch cases could be more compact:
+static inline GLint r600_2d_tile_helper(const struct radeon_renderbuffer * rrb,
+GLint x, GLint y, GLint is_depth, GLint is_stencil)
...
+
Alex Deucher wrote:
> On Thu, May 27, 2010 at 10:55 AM, Matt Turner wrote:
>>> +static inline GLint r600_log2(GLint n)
>>> +{
>>> +       GLint log2 = 0;
>>> +
>>> +       while (n >>= 1)
>>> +               ++log2;
>>> +       return log2;
>>> +}
>> Does mesa not provide something like this?
>
>
Alex Deucher wrote:
> On Thu, May 27, 2010 at 10:55 AM, Matt Turner wrote:
>>> +static inline GLint r600_log2(GLint n)
>>> +{
>>> +       GLint log2 = 0;
>>> +
>>> +       while (n >>= 1)
>>> +               ++log2;
>>> +       return log2;
>>> +}
>> Does mesa not provide something like this?
> The only one I could find was a gal
Requires tiling config ioctl support from the drm to use.
kms only.
Signed-off-by: Alex Deucher
---
 .../drivers/dri/radeon/radeon_common_context.c |    9 +-
 .../drivers/dri/radeon/radeon_common_context.h |    7 +
 src/mesa/drivers/dri/radeon/radeon_screen.h    |    7 +
src/mesa/dri