Re: [Mesa3d-dev] RFC: allow resource_copy_region between different (yet compatabile) formats
On Mon, 2010-09-06 at 16:31 -0700, Marek Olšák wrote:
> On Mon, Sep 6, 2010 at 9:57 PM, José Fonseca jfons...@vmware.com wrote:
>> On Mon, 2010-09-06 at 10:22 -0700, Marek Olšák wrote:
>>> On Mon, Sep 6, 2010 at 3:57 PM, José Fonseca jfons...@vmware.com wrote:
>>>> I'd like to know if there's any objection to changing the resource_copy_region semantics to allow copies between different yet compatible formats, where the definition of compatible formats is:
>>>>
>>>> formats for which copying the bytes from the source resource unmodified to the destination resource will achieve the same effect as a textured quad blitter
>>>>
>>>> There is a helper function util_is_format_compatible() to help make this decision, and these are the non-trivial conversions that this function currently recognizes (which was produced by u_format_compatible_test.c):
>>>>
>>>> b8g8r8a8_unorm -> b8g8r8x8_unorm
>>>
>>> This specific case (and others) might not work, because there are no 0/1 swizzles when blending pixels with the framebuffer, e.g. see this sequence of operations:
>>> - Blit from b8g8r8a8 to b8g8r8x8.
>>> - x8 now contains a8.
>>> - Bind b8g8r8x8 as a colorbuffer.
>>> - Use blending with the destination alpha channel.
>>> - The original a8 is read instead of 1 (x8) because of lack of swizzles.
>>
>> This is not correct. Or at least not my interpretation. The x in b8g8r8x8 means padding (potentially with uninitialized data). There is no implicit guarantee that it will contain 0xff or anything. When blending to b8g8r8x8, destination alpha is by definition 1.0. It is an implicit swizzle (see e.g., u_format.csv). If the hardware's fixed function blending doesn't understand bgrx formats natively, then the pipe driver should internally replace the destination alpha factor with one. It's really simple. See for
>
> The dst blending parameter is just a factor the real dst value is multiplied by (except for min/max). There is no way to multiply an arbitrary value by a constant and get 1.0. But you can force 0, of course. I don't think there is hardware which supports such flexible swizzling in the blender.

Let's assume your hardware doesn't understand bgrx rendertarget formats natively, and you program it with the bgra format instead. If so then you must do these replacements in rgb_src_factor and rgb_dst_factor:

  PIPE_BLENDFACTOR_DST_ALPHA -> PIPE_BLENDFACTOR_ONE;
  PIPE_BLENDFACTOR_INV_DST_ALPHA -> PIPE_BLENDFACTOR_ZERO;
  PIPE_BLENDFACTOR_SRC_ALPHA_SATURATE -> PIPE_BLENDFACTOR_ZERO;

This will ensure that what's written in the red, green, and blue components is consistent with a bgrx format (that is, destination alpha is always one -- incoming values are discarded).

In this scenario, how you program alpha_src_factor/alpha_dst_factor is irrelevant, because they will only affect what's written in the padding bits, which is just padding -- it can and should be treated as gibberish.

> If x8 is just padding as you say, the value of it should be undefined and every operation using the padding bits should be undefined too except for texture sampling. It's not like I have any other choice.

IMO, there is no such thing as an operation using the padding bits. It is more like the contents of the padding are undefined before and after any operation. And no operation should rely on it to have any particular value, by definition.

Alpha blending with a bgrx format should not (and need not) incorporate the padding bits in any computation. It may, however, write anything it feels like to the padding bits as a side effect.

Now we could certainly impose the restriction in gallium that dst alpha blendfactors will produce undefined results for bgrx (and perhaps this is what you're arguing for). Then the burden of doing the replacements above shifts to the statetracker. I think Keith favors that stance.

At any rate, going back to the original topic, I see no reason not to allow bgra -> bgrx resource_copy_region copies.
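The factor replacements listed in this message can be sketched as a small translation function. This is only an illustration under stated assumptions: the `toy_` enum and function names are hypothetical stand-ins for the PIPE_BLENDFACTOR_* values named above, and the code is not taken from any driver.

```c
#include <assert.h>

/* Hypothetical stand-ins for the PIPE_BLENDFACTOR_* values named in the
 * message; only the factors relevant to the example are listed. */
enum toy_blendfactor {
    TOY_BLENDFACTOR_ONE,
    TOY_BLENDFACTOR_ZERO,
    TOY_BLENDFACTOR_SRC_ALPHA,
    TOY_BLENDFACTOR_DST_ALPHA,
    TOY_BLENDFACTOR_INV_DST_ALPHA,
    TOY_BLENDFACTOR_SRC_ALPHA_SATURATE
};

/* Rewrite one RGB blend factor for a bgrx target that the hardware is
 * programmed to treat as bgra. Destination alpha (Ad) is 1.0 by
 * definition, so:
 *   DST_ALPHA          = Ad               -> ONE
 *   INV_DST_ALPHA      = 1 - Ad           -> ZERO
 *   SRC_ALPHA_SATURATE = min(As, 1 - Ad)  -> ZERO
 * Every other factor is passed through untouched. */
static enum toy_blendfactor
toy_fixup_rgb_factor_for_bgrx(enum toy_blendfactor factor)
{
    switch (factor) {
    case TOY_BLENDFACTOR_DST_ALPHA:
        return TOY_BLENDFACTOR_ONE;
    case TOY_BLENDFACTOR_INV_DST_ALPHA:
    case TOY_BLENDFACTOR_SRC_ALPHA_SATURATE:
        return TOY_BLENDFACTOR_ZERO;
    default:
        return factor;
    }
}

/* Usage: apply the fixup to both rgb_src_factor and rgb_dst_factor. */
static void
toy_fixup_example(void)
{
    assert(toy_fixup_rgb_factor_for_bgrx(TOY_BLENDFACTOR_DST_ALPHA) ==
           TOY_BLENDFACTOR_ONE);
    assert(toy_fixup_rgb_factor_for_bgrx(TOY_BLENDFACTOR_SRC_ALPHA) ==
           TOY_BLENDFACTOR_SRC_ALPHA);
}
```

Whether such a fixup belongs in the pipe driver or in the state tracker is exactly the open question in the thread; the sketch takes no side on that.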
Also, for the record, from the moment arbitrary swizzles were allowed in the texture sampler, bgrx formats became almost redundant. And I say almost because knowing that there is no alpha in the color buffer allows for certain optimizations (e.g., llvmpipe's swizzled layout separates the red, green, blue, and alpha channels into different 128bit words, and will not
Re: [Mesa3d-dev] RFC: allow resource_copy_region between different (yet compatabile) formats
On 06.09.2010 15:57, José Fonseca wrote:
> I'd like to know if there's any objection to changing the resource_copy_region semantics to allow copies between different yet compatible formats, where the definition of compatible formats is:
>
> formats for which copying the bytes from the source resource unmodified to the destination resource will achieve the same effect as a textured quad blitter
>
> There is a helper function util_is_format_compatible() to help make this decision, and these are the non-trivial conversions that this function currently recognizes (which was produced by u_format_compatible_test.c):
>
> b8g8r8a8_unorm -> b8g8r8x8_unorm
> a8r8g8b8_unorm -> x8r8g8b8_unorm
> b5g5r5a1_unorm -> b5g5r5x1_unorm
> b4g4r4a4_unorm -> b4g4r4x4_unorm
> l8_unorm -> r8_unorm
> i8_unorm -> l8_unorm
> i8_unorm -> a8_unorm
> i8_unorm -> r8_unorm
> l16_unorm -> r16_unorm
> z24_unorm_s8_uscaled -> z24x8_unorm
> s8_uscaled_z24_unorm -> x8z24_unorm
> r8g8b8a8_unorm -> r8g8b8x8_unorm
> a8b8g8r8_srgb -> x8b8g8r8_srgb
> b8g8r8a8_srgb -> b8g8r8x8_srgb
> a8r8g8b8_srgb -> x8r8g8b8_srgb
> a8b8g8r8_unorm -> x8b8g8r8_unorm
> r10g10b10a2_uscaled -> r10g10b10x2_uscaled
> r10sg10sb10sa2u_norm -> r10g10b10x2_snorm
>
> Note that format compatibility is not commutative.
>
> For software drivers this means that memcpy/util_copy_rect() will achieve the correct result.
>
> For hardware drivers this means that a VRAM-VRAM 2D blit engine will also achieve the correct result.
>
> So I'd expect no implementation change of resource_copy_region() for any driver AFAICT. But I'd like to be sure.
>
> Jose

José, this looks good to me. Note that the analogous function in d3d10, ResourceCopyRegion, only requires formats to be in the same typeless group (hence same number of bits for all components), which is certainly a broader set of compatible formats than what util_is_format_compatible() is outputting. As far as I can tell, no conversion is happening at all in d3d10, this is just like memcpy.
I think we might want to support that in the future as well, but for now extending this to the formats you listed certainly sounds ok. Roland -- This SF.net Dev2Dev email is sponsored by: Show off your parallel programming skills. Enter the Intel(R) Threading Challenge 2010. http://p.sf.net/sfu/intel-thread-sfd ___ Mesa3d-dev mailing list Mesa3d-dev@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
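The one-way nature of the compatibility relation in the RFC can be illustrated with a toy check. This is a minimal sketch, not the util_is_format_compatible() implementation: the `toy_format` struct and all `toy_` names are hypothetical, and the sketch only models the alpha-to-padding cases (it does not cover pairs such as l8_unorm -> r8_unorm, which depend on swizzles).

```c
#include <assert.h>
#include <stdbool.h>

/* Meaning of each stored channel; TOY_X marks padding bits. */
enum toy_chan { TOY_R, TOY_G, TOY_B, TOY_A, TOY_X };

/* Hypothetical, simplified format description: four stored channels in
 * memory order, each with a size in bits and a meaning. */
struct toy_format {
    unsigned bits[4];
    enum toy_chan chan[4];
};

static const struct toy_format toy_b8g8r8a8 = {
    {8, 8, 8, 8}, {TOY_B, TOY_G, TOY_R, TOY_A}
};
static const struct toy_format toy_b8g8r8x8 = {
    {8, 8, 8, 8}, {TOY_B, TOY_G, TOY_R, TOY_X}
};

/* A byte-for-byte copy from src to dst is acceptable when the channel
 * sizes match and every channel dst actually reads means the same thing
 * in src. Padding in dst may receive any bits, so it is skipped. */
static bool
toy_is_format_compatible(const struct toy_format *src,
                         const struct toy_format *dst)
{
    unsigned i;

    for (i = 0; i < 4; ++i) {
        if (src->bits[i] != dst->bits[i])
            return false;
        if (dst->chan[i] == TOY_X)
            continue;
        if (src->chan[i] != dst->chan[i])
            return false;
    }
    return true;
}

/* Usage: bgra -> bgrx is fine, the reverse is not. */
static void
toy_compat_example(void)
{
    assert(toy_is_format_compatible(&toy_b8g8r8a8, &toy_b8g8r8x8));
    assert(!toy_is_format_compatible(&toy_b8g8r8x8, &toy_b8g8r8a8));
}
```

The reverse direction fails because the destination would read padding bits as alpha, which is the asymmetry the RFC refers to when it says compatibility is not commutative.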
Re: [Mesa3d-dev] RFC: allow resource_copy_region between different (yet compatabile) formats
On Mon, Sep 6, 2010 at 3:57 PM, José Fonseca jfons...@vmware.com wrote:
> I'd like to know if there's any objection to changing the resource_copy_region semantics to allow copies between different yet compatible formats, where the definition of compatible formats is:

I was about to propose something like this.

How about a much more powerful change though, that would make any pair of non-blocked formats of the same bit depth compatible? This way you could copy z24s8 to r8g8b8a8, for instance.

In addition to this, how about explicitly allowing sampler views to use a compatible format, and add the ability for surfaces to use a compatible format too? (with a new parameter to get_tex_surface)

This would allow for instance to implement glBlitFramebuffer on stencil buffers by reinterpreting the buffer as r8g8b8a8, and allow the blitter module to copy depth/stencil buffers by simply treating them as color buffers.

The only issue is that some drivers might hold depth/stencil surfaces in compressed formats that cannot be interpreted as a color format, and not have any mechanism for keeping temporaries or doing conversions internally.

DirectX seems to have something like this with the _TYPELESS formats.
Re: [Mesa3d-dev] RFC: allow resource_copy_region between different (yet compatabile) formats
On Mon, 2010-09-06 at 08:11 -0700, Roland Scheidegger wrote:
> On 06.09.2010 15:57, José Fonseca wrote:
>> I'd like to know if there's any objection to changing the resource_copy_region semantics to allow copies between different yet compatible formats [...]
>
> José, this looks good to me. Note that the analogous function in d3d10, ResourceCopyRegion, only requires formats to be in the same typeless group (hence same number of bits for all components), which is certainly a broader set of compatible formats than what util_is_format_compatible() is outputting. As far as I can tell, no conversion is happening at all in d3d10, this is just like memcpy.
>
> I think we might want to support that in the future as well, but for now extending this to the formats you listed certainly sounds ok.

Yes, that makes sense. Thanks for the feedback, Roland.

Jose
Re: [Mesa3d-dev] RFC: allow resource_copy_region between different (yet compatabile) formats
On Mon, Sep 6, 2010 at 3:57 PM, José Fonseca jfons...@vmware.com wrote:
> I'd like to know if there's any objection to changing the resource_copy_region semantics to allow copies between different yet compatible formats, where the definition of compatible formats is:
>
> formats for which copying the bytes from the source resource unmodified to the destination resource will achieve the same effect as a textured quad blitter
>
> There is a helper function util_is_format_compatible() to help make this decision, and these are the non-trivial conversions that this function currently recognizes (which was produced by u_format_compatible_test.c):
>
> b8g8r8a8_unorm -> b8g8r8x8_unorm

This specific case (and others) might not work, because there are no 0/1 swizzles when blending pixels with the framebuffer, e.g. see this sequence of operations:
- Blit from b8g8r8a8 to b8g8r8x8.
- x8 now contains a8.
- Bind b8g8r8x8 as a colorbuffer.
- Use blending with the destination alpha channel.
- The original a8 is read instead of 1 (x8) because of lack of swizzles.

The blitter and other util functions just need to be extended to explicitly write 1 instead of copying the alpha channel. Something like this is already done in st/mesa, see the function compatible_src_dst_formats.

Marek

> [...]
Re: [Mesa3d-dev] RFC: allow resource_copy_region between different (yet compatabile) formats
How about dropping the idea that resource_copy_region must be just a memcpy and have the driver instruct the hardware 2D blitter to write 1s in the alpha channel if supported by hw or have u_blitter do this in the shader?

nv30/nv40 and apparently nv50 can do this in the 2D blitter, and all Radeons seem to use the 3D engine, which obviously can do it in the shader.

We may also want to allow actual conversion between arbitrary formats, since again u_blitter can do it trivially, and so can most/all hardware 2D engines.
Re: [Mesa3d-dev] RFC: allow resource_copy_region between different (yet compatabile) formats
On 06.09.2010 17:16, Luca Barbieri wrote:
> On Mon, Sep 6, 2010 at 3:57 PM, José Fonseca jfons...@vmware.com wrote:
>> I'd like to know if there's any objection to changing the resource_copy_region semantics to allow copies between different yet compatible formats, where the definition of compatible formats is:
>
> I was about to propose something like this. How about a much more powerful change though, that would make any pair of non-blocked formats of the same bit depth compatible? This way you could copy z24s8 to r8g8b8a8, for instance.

I am not sure this makes a lot of sense. There's no guarantee the bit layout of these is even remotely similar (and it likely won't be on any decent hardware). I think the dx10 restriction makes sense here.

> In addition to this, how about explicitly allowing sampler views to use a compatible format, and add the ability for surfaces to use a compatible format too? (with a new parameter to get_tex_surface)

Note that get_tex_surface is dead (in gallium-array-textures - not merged yet but it will happen eventually). Its replacement (for render targets or depth stencil), create_surface(), can already be supplied with a format parameter. Compatible formats though should ultimately end up as something similar to dx10.

> This would allow for instance to implement glBlitFramebuffer on stencil buffers by reinterpreting the buffer as r8g8b8a8, and allow the blitter module to copy depth/stencil buffers by simply treating them as color buffers. The only issue is that some drivers might hold depth/stencil surfaces in compressed formats that cannot be interpreted as a color format, and not have any mechanism for keeping temporaries or doing conversions internally.

I think that's a pretty big if. I could be wrong but I think operations like blitting stencil buffers are pretty rare anyway (afaik other apis don't allow things like that).

> DirectX seems to have something like this with the _TYPELESS formats.

Yes, and it precisely won't allow you to interpret s24_z8 as r8g8b8a8 or other wonky stuff. Only if all components have the same number of bits.

Roland
Re: [Mesa3d-dev] RFC: allow resource_copy_region between different (yet compatabile) formats
On Mon, 2010-09-06 at 10:22 -0700, Marek Olšák wrote:
> On Mon, Sep 6, 2010 at 3:57 PM, José Fonseca jfons...@vmware.com wrote:
>> I'd like to know if there's any objection to changing the resource_copy_region semantics to allow copies between different yet compatible formats, where the definition of compatible formats is:
>>
>> formats for which copying the bytes from the source resource unmodified to the destination resource will achieve the same effect as a textured quad blitter
>>
>> There is a helper function util_is_format_compatible() to help make this decision, and these are the non-trivial conversions that this function currently recognizes (which was produced by u_format_compatible_test.c):
>>
>> b8g8r8a8_unorm -> b8g8r8x8_unorm
>
> This specific case (and others) might not work, because there are no 0/1 swizzles when blending pixels with the framebuffer, e.g. see this sequence of operations:
> - Blit from b8g8r8a8 to b8g8r8x8.
> - x8 now contains a8.
> - Bind b8g8r8x8 as a colorbuffer.
> - Use blending with the destination alpha channel.
> - The original a8 is read instead of 1 (x8) because of lack of swizzles.

This is not correct. Or at least not my interpretation.

The x in b8g8r8x8 means padding (potentially with uninitialized data). There is no implicit guarantee that it will contain 0xff or anything. When blending to b8g8r8x8, destination alpha is by definition 1.0. It is an implicit swizzle (see e.g., u_format.csv).

If the hardware's fixed function blending doesn't understand bgrx formats natively, then the pipe driver should internally replace the destination alpha factor with one. It's really simple. See for example llvmpipe (which needs to do that because the swizzled tile format is always bgra, so it needs to ignore destination alpha when a bgrx surface is bound).

I'm not sure what OpenGL defines, but DirectX/DCT definitely prescribes/enforces this behavior.

> The blitter and other util functions just need to be extended to explicitly write 1 instead of copying the alpha channel. Something like this is already done in st/mesa, see the function compatible_src_dst_formats.

There is no alpha channel in b8g8r8x8 for anybody to write. The problem here is not what's written in the padding bits -- it is instead in making sure the padding bits are not interpreted as alpha.

If the hardware *really* works better with 0xff in the padding bits, then that needs to be enforced not only in surface copy, but in transfers (i.e., when the transfer is unmapped, the pipe driver would need to fill padding bits with 0xff for every pixel).

Jose
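For the case where hardware really does need 0xff in the padding bits, the per-pixel fill mentioned for transfer unmap could look roughly like the sketch below. It is an assumption-laden illustration: the function name is hypothetical, the buffer is assumed to be linear CPU-visible memory, and the padding is assumed to be the fourth byte of each 32-bit pixel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Force the padding byte of every pixel to 0xff.
 * Assumed layout: 4 bytes per pixel in memory order B, G, R, X, with
 * rows `stride` bytes apart. Colour bytes are left untouched. */
static void
toy_fill_x8_padding(uint8_t *pixels, unsigned width, unsigned height,
                    size_t stride)
{
    unsigned x, y;

    assert(stride >= (size_t)width * 4);

    for (y = 0; y < height; ++y) {
        uint8_t *row = pixels + (size_t)y * stride;
        for (x = 0; x < width; ++x)
            row[4 * x + 3] = 0xff;
    }
}

/* Usage: a 2x1 image whose padding bytes start out as garbage. */
static void
toy_fill_example(void)
{
    uint8_t img[8] = { 1, 2, 3, 0x00, 4, 5, 6, 0x7f };

    toy_fill_x8_padding(img, 2, 1, sizeof img);
    assert(img[3] == 0xff && img[7] == 0xff);
    assert(img[0] == 1 && img[4] == 4);
}
```

A pass like this over every unmapped transfer is the cost the message warns about, which is why treating the padding as undefined and fixing up the blend state is presented as the simpler route.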
Re: [Mesa3d-dev] RFC: allow resource_copy_region between different (yet compatabile) formats
>> This way you could copy z24s8 to r8g8b8a8, for instance.
>
> I am not sure this makes a lot of sense. There's no guarantee the bit layout of these is even remotely similar (and it likely won't be on any decent hardware). I think the dx10 restriction makes sense here.

Yes, it depends on the flexibility of the hardware and the driver. Due to depth textures, I think it is actually likely that you can easily treat depth as color.

The worst issue right now is that stencil cannot be accessed in a sensible way at all, which makes implementing glBlitFramebuffer of STENCIL_BIT with NEAREST and different rect sizes impossible. Some cards (r600+ at least) can write stencil in shaders, but on some you must reinterpret the surface. And resource_copy_region does not support stretching, so it can't be used.

Since not all cards can write stencil in shaders, one either needs to be able to bind depth/stencil as a color buffer, or extend resource_copy_region to support stretching with nearest filtering, or both (possibly in addition to having the option of using stencil export in shaders).

Other things would likely benefit, such as GL_NV_copy_depth_to_color.
Re: [Mesa3d-dev] RFC: allow resource_copy_region between different (yet compatabile) formats
On Mon, 2010-09-06 at 10:41 -0700, Luca Barbieri wrote:
> How about dropping the idea that resource_copy_region must be just a memcpy and have the driver instruct the hardware 2D blitter to write 1s in the alpha channel if supported by hw or have u_blitter do this in the shader?

It's really different functionality. You're asking for a cast, as in

  b = (type)a;

as in (int)1.0f = 1. Another thing is

  b = *(type *)&a;

as in *(int *)&1.0f = 0x3f800000. This is my understanding of resource_copy_region (previously known as surface_copy). And Roland provided a compelling argument for that.

Both of these functionalities are exposed by APIs, and neither is a superset of the other.

Jose
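The distinction between converting a value and reinterpreting its bits can be shown in a few lines of C. This is a small sketch with hypothetical helper names, assuming a 32-bit IEEE-754 `float`; it uses memcpy for the reinterpretation because dereferencing a cast pointer, as in the shorthand above, is not well defined in C.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Conversion: the value is preserved, the bits change. */
static int32_t
toy_convert(float a)
{
    return (int32_t)a;
}

/* Reinterpretation: the bits are preserved, the value changes.
 * This is the "just a memcpy" behaviour described for the copy. */
static uint32_t
toy_reinterpret(float a)
{
    uint32_t b;

    assert(sizeof b == sizeof a);
    memcpy(&b, &a, sizeof b);
    return b;
}

/* Usage: the two operations give very different answers for 1.0f. */
static void
toy_cast_example(void)
{
    assert(toy_convert(1.0f) == 1);
    assert(toy_reinterpret(1.0f) == 0x3f800000u);
}
```

A format-converting blit corresponds to `toy_convert`, while the copy semantics discussed in this thread correspond to `toy_reinterpret`.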
Re: [Mesa3d-dev] RFC: allow resource_copy_region between different (yet compatabile) formats
Yes, if x8 is interpreted as "writes can write arbitrary data, reads must return 1" (as you said), then this is not necessary in resource_copy_region even if A8 -> X8 becomes supported.

You are right that format conversions would probably be better added as a separate function (if at all), in addition to the reinterpret_cast mechanism you proposed to add.
Re: [Mesa3d-dev] RFC: allow resource_copy_region between different (yet compatabile) formats
On 06.09.2010 22:03, Luca Barbieri wrote:
>>> This way you could copy z24s8 to r8g8b8a8, for instance.
>>
>> I am not sure this makes a lot of sense. There's no guarantee the bit layout of these is even remotely similar (and it likely won't be on any decent hardware). I think the dx10 restriction makes sense here.
>
> Yes, it depends on the flexibility of the hardware and the driver. Due to depth textures, I think it is actually likely that you can easily treat depth as color.
>
> The worst issue right now is that stencil cannot be accessed in a sensible way at all, which makes implementing glBlitFramebuffer of STENCIL_BIT with NEAREST and different rect sizes impossible. Some cards (r600+ at least) can write stencil in shaders, but on some you must reinterpret the surface. And resource_copy_region does not support stretching, so it can't be used.
>
> Since not all cards can write stencil in shaders, one either needs to be able to bind depth/stencil as a color buffer, or extend resource_copy_region to support stretching with nearest filtering, or both (possibly in addition to having the option of using stencil export in shaders).

Yes, accessing stencil is a problem - other apis just disallow that... There are other problems with accessing stencil, like for instance WritePixels with a multisampled depth/stencil buffer (which you can't really map, hence cpu fallbacks don't even work). Plus you really don't want any cpu fallbacks anyway.

Using stencil export (ARB_shader_stencil_export) seems like a clean solution, but as you said not all cards support it. Plus you can't actually get the stencil values with texture sampling either, so this doesn't help that much (well you can't get them with GL though hardware may support it I guess).

When I said it won't work with decent hardware, I really meant it won't work due to compression. Now, it's quite possible this can be disabled on any chip, but you don't know that beforehand, hence you need to jump through hoops to get an uncompressed version of your compressed buffer later.

Do applications actually really ever use blitframebuffer with stencil bit (with different sizes, otherwise resource_copy_region could be used)? It just seems to me that casts to completely different formats (well still with same total bitwidth, but still) are very unclean, but I don't have any good solution for this - if no one ever uses this in practice a cpu fallback is just fine, but as said it won't work for multisampled buffers for instance either.

> Other things would likely benefit, such as GL_NV_copy_depth_to_color.

Roland
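For reference, the CPU fallback mentioned here (a stretched stencil blit with nearest filtering) is simple once the stencil values are available as a plain 8-bit plane. The sketch below is only an illustration under that assumption: the function name is hypothetical, source texels are picked at destination pixel centres, and it says nothing about how a driver would map or extract the stencil data, which is the hard part for compressed or multisampled buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stretch an 8-bit stencil plane with nearest filtering.
 * For destination pixel x, the source column is
 *   floor((x + 0.5) * src_w / dst_w)
 * computed in integers as ((2x + 1) * src_w) / (2 * dst_w);
 * rows are handled the same way. Strides are in bytes. */
static void
toy_stretch_stencil_nearest(const uint8_t *src, unsigned src_w,
                            unsigned src_h, size_t src_stride,
                            uint8_t *dst, unsigned dst_w,
                            unsigned dst_h, size_t dst_stride)
{
    unsigned x, y;

    assert(src_w && src_h && dst_w && dst_h);

    for (y = 0; y < dst_h; ++y) {
        size_t sy = (size_t)(((uint64_t)(2 * y + 1) * src_h) /
                             (2 * (uint64_t)dst_h));
        for (x = 0; x < dst_w; ++x) {
            size_t sx = (size_t)(((uint64_t)(2 * x + 1) * src_w) /
                                 (2 * (uint64_t)dst_w));
            dst[(size_t)y * dst_stride + x] = src[sy * src_stride + sx];
        }
    }
}

/* Usage: double the width of a 2x1 plane. */
static void
toy_stretch_example(void)
{
    const uint8_t src[2] = { 10, 20 };
    uint8_t dst[4] = { 0, 0, 0, 0 };

    toy_stretch_stencil_nearest(src, 2, 1, 2, dst, 4, 1, 4);
    assert(dst[0] == 10 && dst[1] == 10);
    assert(dst[2] == 20 && dst[3] == 20);
}
```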
Re: [Mesa3d-dev] RFC: allow resource_copy_region between different (yet compatabile) formats
> When I said it won't work with decent hardware, I really meant it won't work due to compression. Now, it's quite possible this can be disabled on any chip, but you don't know that beforehand, hence you need to jump through hoops to get an uncompressed version of your compressed buffer later.

Well, you can render to a compressed depth buffer and then bind it as a depth texture (routinely done for shadows), so there needs to be a way to get compressed data to the sampler either directly or via the driver automagically converting it with a blit beforehand.

Of course, this may not actually work for stencil too, or might not let you interpret depth as 8-bit color components, or perhaps not let you use it directly as a render target, but it seems possible, especially on modern flexible hardware and on older dumber hardware that lacks/doesn't force compression.

I haven't checked any hardware docs though, beyond the fact that nvfx currently doesn't support any compression and thus can just do it.
Re: [Mesa3d-dev] RFC: allow resource_copy_region between different (yet compatabile) formats
On Mon, Sep 6, 2010 at 9:57 PM, José Fonseca jfons...@vmware.com wrote:
> On Mon, 2010-09-06 at 10:22 -0700, Marek Olšák wrote:
>> On Mon, Sep 6, 2010 at 3:57 PM, José Fonseca jfons...@vmware.com wrote:
>>> I'd like to know if there's any objection to changing the resource_copy_region semantics to allow copies between different yet compatible formats, where the definition of compatible formats is:
>>>
>>> formats for which copying the bytes from the source resource unmodified to the destination resource will achieve the same effect as a textured quad blitter
>>>
>>> There is a helper function util_is_format_compatible() to help make this decision, and these are the non-trivial conversions that this function currently recognizes (which was produced by u_format_compatible_test.c):
>>>
>>> b8g8r8a8_unorm -> b8g8r8x8_unorm
>>
>> This specific case (and others) might not work, because there are no 0/1 swizzles when blending pixels with the framebuffer, e.g. see this sequence of operations:
>> - Blit from b8g8r8a8 to b8g8r8x8.
>> - x8 now contains a8.
>> - Bind b8g8r8x8 as a colorbuffer.
>> - Use blending with the destination alpha channel.
>> - The original a8 is read instead of 1 (x8) because of lack of swizzles.
>
> This is not correct. Or at least not my interpretation. The x in b8g8r8x8 means padding (potentially with uninitialized data). There is no implicit guarantee that it will contain 0xff or anything. When blending to b8g8r8x8, destination alpha is by definition 1.0. It is an implicit swizzle (see e.g., u_format.csv). If the hardware's fixed function blending doesn't understand bgrx formats natively, then the pipe driver should internally replace the destination alpha factor with one. It's really simple. See for

The dst blending parameter is just a factor the real dst value is multiplied by (except for min/max). There is no way to multiply an arbitrary value by a constant and get 1.0. But you can force 0, of course. I don't think there is hardware which supports such flexible swizzling in the blender.

If x8 is just padding as you say, the value of it should be undefined and every operation using the padding bits should be undefined too except for texture sampling. It's not like I have any other choice.

Marek
Re: [Mesa3d-dev] RFC: allow resource_copy_region between different (yet compatabile) formats
> The dst blending parameter is just a factor the real dst value is multiplied by (except for min/max). There is no way to multiply an arbitrary value by a constant and get 1.0. But you can force 0, of course. I don't think there is hardware which supports such flexible swizzling in the blender. If x8 is just padding as you say, the value of it should be undefined and every operation using the padding bits should be undefined too except for texture sampling. It's not like I have any other choice.

As far as I can tell, the only problem you have when blending with an X8 that holds random garbage but has read value 1 is if any of the blending factors is DST_ALPHA or INV_DST_ALPHA (or COLOR as an alpha factor), in which case you can solve the issue by replacing the offending factor with ONE or ZERO, as long as you have support for RGB/A separate blend functions (which Gallium currently assumes afaik).

You can also disable the alpha channel in the writemask to avoid unnecessary work.

On nv30/nv40, there is an actual render target format that instructs the card to read dst alpha as 1 (you can also choose whether to write 0 or 1).

Of course, one could argue that mesa/st should do the transformation instead of Gallium drivers where hardware lacks such support.

I suppose just not advertising X8 formats as render target formats could also work.
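A small numeric check makes the effect of the factor replacement concrete. The sketch below is a hedged illustration with hypothetical names: it evaluates the additive blend equation for one colour channel, once with factors derived from garbage in the padding byte and once with the factors replaced by ONE and ZERO. The values are chosen to be exactly representable in binary floating point so the comparisons are exact.

```c
#include <assert.h>

/* Additive blend of one colour channel:
 *   out = src * src_factor + dst * dst_factor */
static float
toy_blend_add(float src, float src_factor, float dst, float dst_factor)
{
    return src * src_factor + dst * dst_factor;
}

/* Usage: rgb_src_factor = DST_ALPHA, rgb_dst_factor = INV_DST_ALPHA,
 * rendering to a bgrx target that the hardware treats as bgra. */
static void
toy_blend_example(void)
{
    const float src = 0.75f, dst = 0.25f;
    const float garbage = 0.5f; /* whatever happens to sit in x8 */

    /* Factors read from the padding byte: the result depends on it. */
    float naive = toy_blend_add(src, garbage, dst, 1.0f - garbage);

    /* Factors replaced by ONE and ZERO: destination alpha is 1.0. */
    float fixed = toy_blend_add(src, 1.0f, dst, 0.0f);

    assert(naive == 0.5f);
    assert(fixed == 0.75f);
}
```

With the replaced factors the result no longer depends on the padding contents, which is the behaviour the message describes for X8 with "read value 1".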