On Thursday, 4 June 2009 at 08:42:29, Michel Dänzer wrote:
> On Wed, 2009-06-03 at 18:20 +0200, Maciej Cencora wrote:
> > On Wednesday, 3 June 2009 at 11:21:40, Michel Dänzer wrote:
> > > On Tue, 2009-06-02 at 21:09 +0200, Nicolai Hähnle wrote:
> > > > On Tuesday 02 June 2009 20:18:17, Michel Dänzer wrote:
> > > > > On Mon, 2009-06-01 at 16:39 +0200, Maciej Cencora wrote:
> > > > > > On Monday, 1 June 2009 at 14:25:57, Maciej Cencora wrote:
> > > > > > > On Monday, 1 June 2009 at 12:44:20, Michel Dänzer wrote:
> > > > > > > > On Sat, 2009-05-30 at 21:00 +0200, Maciej Cencora wrote:
> > > > > > > > > Hi,
> > > > > > > > >
> > > > > > > > > this round of patches for r300 brings:
> > > > > > > > > 1) hardware accelerated support for 8bit and 16bit vertex
> > > > > > > > > attribute data formats,
> > > > > > > > > 2) support for 16bit vertex indices,
> > > > > > > > > 3) support for EXT_vertex_array_bgra extension,
> > > > > > > > > 4) T&L path cleanup. it's used when hardware TCL is
> > > > > > > > > enabled, but we have to fallback for software TCL -
> > > > > > > > > clipping is still done in hardware unlike in software TCL
> > > > > > > > > path.
> > > > > > > > >
> > > > > > > > > Those patches are unfinished, to be done:
> > > > > > > > > 1) unmap bo's after rendering is finished
> > > > > > > > > Currently the map/unmap functions are noop so it's working
> > > > > > > > > ok, but when we will add support for VBO's it won't work.
> > > > > > > > >
> > > > > > > > > 2) handle big endian machines correctly
> > > > > > > > > Is this really an issue?
> > > > > > > >
> > > > > > > > The driver certainly used to have special code for non-32-bit
> > > > > > > > elts on big endian. How can I test this on my PowerBook?
> > > > > > >
> > > > > > > Hmm, since vbo branch merge all elts are always 32bit. I've
> > > > > > > looked a little at the code before the merge and I couldn't
> > > > > > > find any big endian specific code for elt handling, but maybe I
> > > > > > > just haven't looked hard enough.
> > > > > > >
> > > > > > > mesa/progs/trivial/draw2arrays is using unsigned bytes as
> > > > > > > indexes, change it to unsigned shorts and check if the
> > > > > > > rendering is correct.
> > > > > >
> > > > > > I meant mesa/progs/trivial/drawrange
> > > > >
> > > > > Patch 4 indeed breaks that on my PowerBook. The patch below is a
> > > > > minimal fix. Not sure why the #if 0'd code to just swap short
> > > > > indices doesn't work...
> > > >
> > > > Just a stab in the dark, but maybe the single bytes within the shorts
> > > > also need to be swapped?
> > >
> > > No. After some rather confusing experiments, I figured it out:
> > > r300EmitElts() didn't copy the extra two bytes. The patch below now
> > > works with short and byte indices.
> > >
> > >
> > > commit a67a18d27a1e0e28cb62a346a4a995ca66ecf949
> > > Author: Michel Dänzer <daen...@vmware.com>
> > > Date: Wed Jun 3 11:18:24 2009 +0200
> > >
> > >     r300: Endianness fixes for recent vertex path changes.
> > >
> > > diff --git a/src/mesa/drivers/dri/r300/r300_draw.c b/src/mesa/drivers/dri/r300/r300_draw.c
> > > index 232ed10..142e934 100644
> > > --- a/src/mesa/drivers/dri/r300/r300_draw.c
> > > +++ b/src/mesa/drivers/dri/r300/r300_draw.c
> > > @@ -69,19 +69,42 @@ static void r300FixupIndexBuffer(GLcontext *ctx, const struct _mesa_index_buffer
> > >
> > >  	if (mesa_ind_buf->type == GL_UNSIGNED_BYTE) {
> >
> > You should add #if MESA_BIG_ENDIAN for this path too.
>
> I don't agree, I prefer sharing code whenever feasible.
>
> > >  		GLubyte *in = (GLubyte *)src_ptr;
> > > -		GLushort *out = _mesa_malloc(sizeof(GLushort) * mesa_ind_buf->count);
> > > +		GLuint *out = _mesa_malloc(sizeof(GLushort) *
> > > +			((mesa_ind_buf->count + 1) & ~1));
> > >  		int i;
> > >
> > > -		for (i = 0; i < mesa_ind_buf->count; ++i) {
> > > -			out[i] = (GLushort) in[i];
> > > +		for (i = 0; i + 1 < mesa_ind_buf->count; i += 2) {
> > > +			out[i / 2] = in[i] | in[i + 1] << 16;
> > > +		}
> > > +
> > > +		if (i < mesa_ind_buf->count) {
> > > +			out[i / 2] = in[i];
> > >  		}
> >
> > You should use separate indexes for in and out arrays
>
> The new patch below uses *out++, hope that's okay.
>
> > > @@ -157,7 +180,11 @@ static void r300TranslateAttrib(GLcontext *ctx, GLuint attr, int count, const st
> > >  	} else
> > >  		src_ptr = input->Ptr;
> > >
> > > -	if (input->Type == GL_DOUBLE || input->Type == GL_UNSIGNED_INT || input->Type == GL_INT || input->StrideB < 4){
> > > +	if (input->Type == GL_DOUBLE || input->Type == GL_UNSIGNED_INT || input->Type == GL_INT ||
> > > +#if MESA_BIG_ENDIAN
> > > +	    getTypeSize(input->Type) != 4 ||
> > > +#endif
> > > +	    input->StrideB < 4) {
> >
> > What's the reason for this change?
>
> Attributes smaller than 4 bytes have the same endianness problem, e.g.
> in trivial/drawrange the colours are wrong. Maybe there's a better way
> to handle this.
>
> > > --- a/src/mesa/drivers/dri/r300/r300_render.c
> > > +++ b/src/mesa/drivers/dri/r300/r300_render.c
> > > @@ -184,7 +184,7 @@ static void r300EmitElts(GLcontext * ctx, unsigned long n_elts)
> > >  		&rmesa->radeon.tcl.elt_dma_offset, n_elts * el_size, 4);
> > >  	radeon_bo_map(rmesa->radeon.tcl.elt_dma_bo, 1);
> > >  	out = rmesa->radeon.tcl.elt_dma_bo->ptr + rmesa->radeon.tcl.elt_dma_offset;
> > > -	memcpy(out, rmesa->ind_buf.ptr, n_elts * el_size);
> > > +	memcpy(out, rmesa->ind_buf.ptr, (n_elts * el_size + 3) & ~3);
> > >  	radeon_bo_unmap(rmesa->radeon.tcl.elt_dma_bo);
> > >  }
> >
> > You should probably round up the dma buffer size to 4 bytes too.
>
> Ah yes, for some reason I thought the last parameter to
> radeonAllocDmaRegion() did that.
>
>
> commit 1a933d1a5748715a325bd0b24427ef095d5389a7
> Author: Michel Dänzer <daen...@vmware.com>
> Date: Wed Jun 3 19:07:25 2009 +0200
>
>     r300: Endianness fixes for recent vertex path changes.
>
> diff --git a/src/mesa/drivers/dri/r300/r300_draw.c b/src/mesa/drivers/dri/r300/r300_draw.c
> index 232ed10..72e6807 100644
> --- a/src/mesa/drivers/dri/r300/r300_draw.c
> +++ b/src/mesa/drivers/dri/r300/r300_draw.c
> @@ -69,19 +69,44 @@ static void r300FixupIndexBuffer(GLcontext *ctx, const struct _mesa_index_buffer
>
>  	if (mesa_ind_buf->type == GL_UNSIGNED_BYTE) {
>  		GLubyte *in = (GLubyte *)src_ptr;
> -		GLushort *out = _mesa_malloc(sizeof(GLushort) * mesa_ind_buf->count);
> +		GLuint *out = _mesa_malloc(sizeof(GLushort) *
> +			((mesa_ind_buf->count + 1) & ~1));
>  		int i;
>
> -		for (i = 0; i < mesa_ind_buf->count; ++i) {
> -			out[i] = (GLushort) in[i];
> +		ind_buf->ptr = out;
> +
> +		for (i = 0; i + 1 < mesa_ind_buf->count; i += 2) {
> +			*out++ = in[i] | in[i + 1] << 16;
> +		}
> +
> +		if (i < mesa_ind_buf->count) {
> +			*out++ = in[i];
>  		}
>
> -		ind_buf->ptr = out;
>  		ind_buf->free_needed = GL_TRUE;
>  		ind_buf->is_32bit = GL_FALSE;
>  	} else if (mesa_ind_buf->type == GL_UNSIGNED_SHORT) {
> +#if MESA_BIG_ENDIAN
> +		GLushort *in = (GLushort *)src_ptr;
> +		GLuint *out = _mesa_malloc(sizeof(GLushort) *
> +			((mesa_ind_buf->count + 1) & ~1));
> +		int i;
> +
> +		ind_buf->ptr = out;
> +
> +		for (i = 0; i + 1 < mesa_ind_buf->count; i += 2) {
> +			*out++ = in[i] | in[i + 1] << 16;
> +		}
> +
> +		if (i < mesa_ind_buf->count) {
> +			*out++ = in[i];
> +		}
> +
> +		ind_buf->free_needed = GL_TRUE;
> +#else
>  		ind_buf->ptr = src_ptr;
>  		ind_buf->free_needed = GL_FALSE;
> +#endif
>  		ind_buf->is_32bit = GL_FALSE;
>  	} else {
>  		ind_buf->ptr = src_ptr;
> @@ -157,7 +182,11 @@ static void r300TranslateAttrib(GLcontext *ctx, GLuint attr, int count, const st
>  	} else
>  		src_ptr = input->Ptr;
>
> -	if (input->Type == GL_DOUBLE || input->Type == GL_UNSIGNED_INT || input->Type == GL_INT || input->StrideB < 4){
> +	if (input->Type == GL_DOUBLE || input->Type == GL_UNSIGNED_INT || input->Type == GL_INT ||
> +#if MESA_BIG_ENDIAN
> +	    getTypeSize(input->Type) != 4 ||
> +#endif
> +	    input->StrideB < 4) {
>  		if (RADEON_DEBUG & DEBUG_FALLBACKS) {
>  			fprintf(stderr, "%s: Converting vertex attributes, attribute data format %x,", __FUNCTION__, input->Type);
>  			fprintf(stderr, "stride %d, components %d\n", input->StrideB, input->Size);
> diff --git a/src/mesa/drivers/dri/r300/r300_render.c b/src/mesa/drivers/dri/r300/r300_render.c
> index adda924..dfbd79a 100644
> --- a/src/mesa/drivers/dri/r300/r300_render.c
> +++ b/src/mesa/drivers/dri/r300/r300_render.c
> @@ -176,15 +176,15 @@ static void r300EmitElts(GLcontext * ctx, unsigned long n_elts)
>  {
>  	r300ContextPtr rmesa = R300_CONTEXT(ctx);
>  	void *out;
> -	GLbyte el_size;
> +	GLuint size;
>
> -	el_size = rmesa->ind_buf.is_32bit ? 4 : 2;
> +	size = ((rmesa->ind_buf.is_32bit ? 4 : 2) * n_elts + 3) & ~3;
>
>  	radeonAllocDmaRegion(&rmesa->radeon, &rmesa->radeon.tcl.elt_dma_bo,
> -		&rmesa->radeon.tcl.elt_dma_offset, n_elts * el_size, 4);
> +		&rmesa->radeon.tcl.elt_dma_offset, size, 4);
>  	radeon_bo_map(rmesa->radeon.tcl.elt_dma_bo, 1);
>  	out = rmesa->radeon.tcl.elt_dma_bo->ptr + rmesa->radeon.tcl.elt_dma_offset;
> -	memcpy(out, rmesa->ind_buf.ptr, n_elts * el_size);
> +	memcpy(out, rmesa->ind_buf.ptr, size);
>  	radeon_bo_unmap(rmesa->radeon.tcl.elt_dma_bo);
>  }
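
For anyone following along, the packing and size rounding in the patch above can be sketched outside of Mesa roughly as follows. This is only an illustration, not driver code: pack_indices16() and round_up4() are made-up names, and it assumes the consumer reads the element buffer as host-order 32-bit words with the first index of each pair in the low 16 bits, which is what the `in[i] | in[i + 1] << 16` loop suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Round a byte count up to the next multiple of 4,
 * the same idea as (n_elts * el_size + 3) & ~3 in the patch. */
static size_t round_up4(size_t nbytes)
{
    return (nbytes + 3) & ~(size_t)3;
}

/* Hypothetical helper: pack 16-bit indices pairwise into 32-bit words.
 * The first index of each pair goes into the low half of the word
 * (an assumption about what the consumer expects). An odd trailing
 * index gets a zero upper half, so the buffer is always a whole
 * number of 32-bit words. Returns NULL on allocation failure;
 * the caller frees the result. */
static uint32_t *pack_indices16(const uint16_t *in, size_t count,
                                size_t *out_bytes)
{
    size_t words = (count + 1) / 2;
    uint32_t *out;
    uint32_t *dst;
    size_t i;

    assert(in != NULL || count == 0);
    assert(out_bytes != NULL);

    /* Allocate at least one word so malloc(0) is never relied upon. */
    out = malloc((words ? words : 1) * sizeof *out);
    if (out == NULL)
        return NULL;

    dst = out;
    for (i = 0; i + 1 < count; i += 2)
        *dst++ = (uint32_t)in[i] | ((uint32_t)in[i + 1] << 16);

    if (i < count)
        *dst++ = (uint32_t)in[i];

    *out_bytes = words * sizeof *out;
    return out;
}
```

With three indices the result is two words (8 bytes), which matches round_up4(3 * 2). That padding is the "extra two bytes" the original memcpy() of n_elts * el_size left out.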

Looks good to me. Thanks for this endianness work.

Maciej Cencora

_______________________________________________
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev