On Thu, Apr 14, 2016 at 5:47 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote:
> On Thu, Apr 14, 2016 at 10:34 AM, Oded Gabbay <oded.gab...@gmail.com> wrote:
>> On Thu, Apr 14, 2016 at 5:01 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote:
>>> This feels like an incredibly confused interface to me...
>>
>> I agree it's not the prettiest design I've made, but considering that
>> big-endian is a class F citizen, it's the best I've got right now.
>> I'm open to suggestions, but please, guys, don't tell me "that's not
>> good, but we have no idea how to solve it otherwise". It's not
>> constructive.
>
> Definitely didn't want to do that. Just want to create something
> that'll allow handling of all the various cases.

I know you mean well, I just vented some frustration :( Sorry.
>>
>>> [...] you talk
>>> about CPU <-> GPU, but I think that's a misnomer. What happens if you
>>> use up too much VRAM and something gets evicted to GART? Does it get
>>> byteswapped to the CPU-endianness?
>>
>> Good point, I need to think about that. If that happens, it may
>> require a complementary fix in the kernel driver. However, it's not
>> mutually exclusive with this patch set.
>>
>>> Are you treating everything as
>>> 32-bit packed integers implicitly? What if you have, e.g., an R8 format
>>> (without the GBA bits)? Or even worse, an R16 format, where CPU bytes
>>> 0 1 2 3 would have to come out as 1 0 3 2?
>>
>> I don't think I have the problem you describe. I'm not treating
>> anything in an implicit manner. Each format gets its own handling.
>> Please remember that r600 AMD GPUs *do support* big-endian. So I
>> configure them to know the color format (R8, R16, L4A4, ...), I
>> configure the endian swap (8-in-16 swap, 8-in-32 swap, ...), and I can
>> also configure swizzling. Then the GPU handles all the swapping.
>> Therefore, if it supports a certain format, it knows how to swap it
>> correctly.
>
> Right. And NVIDIA has some of that too. And that's where a LOT of the
> confusion comes from. (At least for me.) Lacking docs doesn't help
> either =/
>
>>> This seems like it might be enough for GL 1.3, but this sort of
>>> interface has to account for everything, IMO.
>>
>> Well, for GL 3.3 with r600g, the piglit run on big-endian gets stuck
>> every time after ~300 tests, even without my patches, so it's broken
>> for sure. I just followed Marek's advice on how to solve it, and I
>> believe I've reached a point where it is worth merging.
>>
>>> I was also struggling with such issues when getting the nv30 driver to
>>> work on big-endian. This was compounded by the issue that NVIDIA
>>> GPUs have a kinda-sorta-but-not-really big-endian mode which swaps
>>> some things but not others. Basically you need to be able to say
>>> intelligently what "endian" the data is in where, and in order to do
>>> that, you need to know what format it's in.
>>
>> That's what I'm trying to do here.
>>
>>> And that's where we hit
>>> some fails, because all the transfer_map/unmap stuff is formatless,
>>> and so there's no way to know whether you should be swapping or not.
>>
>> I tried to find common ground for the different formats, according to
>> how mesa/gallium treats them, and I believe I managed it with some
>> degree of success. I'm not saying it's simple. There are different
>> paths in the code for the same action, and they behave differently.
>> E.g., in st_ReadPixels you can read either via a H/W blit or via
>> map+memcpy+unpack (the fallback path). So, for the same texture, it
>> depends which path you go through: in one path I need to configure LE,
>> and in the other path I need to configure BE.
>
> Wouldn't it make more sense to handle such issues in transfer_map
> (i.e. create a staging memory area and decode into it)? This assumes
> that the transfer_map() call has enough information to "do the right
> thing". I don't think it does today, but perhaps it could be taught?

It doesn't have all the info today, that's for sure. I imagine, though,
that we could add parameters to it.

> That way, everything that's in a pipe_resource is in some
> tightly-controlled format, and we specify the LE <-> BE parameters
> when converting between CPU-read/written and GPU-read/written data. I
> believe this is a better match for what's really happening, too. What
> do you think?
>
> -ilia
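To make the two swap modes concrete, here is a minimal CPU-side sketch
(illustrative C only - on r600 the hardware performs these swaps itself
once the format and endian state are programmed):

#include <stddef.h>
#include <stdint.h>

/* 8-in-16 swap: swap the bytes within each 16-bit unit.
 * For R16 data, CPU bytes 0 1 2 3 come out as 1 0 3 2. */
static void swap_8in16(uint8_t *data, size_t nbytes)
{
    size_t i;
    for (i = 0; i + 1 < nbytes; i += 2) {
        uint8_t tmp = data[i];
        data[i] = data[i + 1];
        data[i + 1] = tmp;
    }
}

/* 8-in-32 swap: reverse the four bytes of each 32-bit unit.
 * For packed 32-bit texels, CPU bytes 0 1 2 3 come out as 3 2 1 0. */
static void swap_8in32(uint8_t *data, size_t nbytes)
{
    size_t i;
    for (i = 0; i + 3 < nbytes; i += 4) {
        uint8_t b0 = data[i], b1 = data[i + 1];
        data[i]     = data[i + 3];
        data[i + 1] = data[i + 2];
        data[i + 2] = b1;
        data[i + 3] = b0;
    }
}

Which swap applies depends entirely on the format's component width,
which is exactly why a formatless transfer_map cannot pick one on its
own.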
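And a rough sketch of the staging idea, assuming transfer_map were
taught the format. To be clear, staging_map_swapped() and its
bits_per_component parameter are hypothetical, not existing Gallium API
(in a real implementation that information would come from the
resource's pipe_format); it reuses the swap helpers from the sketch
above:

#include <stdlib.h>
#include <string.h>

/* Hypothetical only - not an existing Gallium entry point. */
void *staging_map_swapped(const uint8_t *gpu_data, size_t nbytes,
                          unsigned bits_per_component)
{
    uint8_t *staging = malloc(nbytes);
    if (!staging)
        return NULL;

    memcpy(staging, gpu_data, nbytes);

    /* The swap is chosen purely from the component width - the
     * information a formatless transfer_map lacks today. */
    switch (bits_per_component) {
    case 16: swap_8in16(staging, nbytes); break;
    case 32: swap_8in32(staging, nbytes); break;
    default: break; /* 8-bit components need no swap */
    }
    return staging; /* the matching unmap would swap back and free */
}

The open question below is whether teaching transfer_map this per
format is really any less work than handling it per code path.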
Unless I'm missing something, I think that, at the end of the day, it
will come down to the same issues as in my solution - each code path and
each format is a different case. That's because you will still need to
"teach" transfer_map, per transfer and per format, what to do. So one
will need to go and debug every single code path there is in mesa for
drawing/copying/reading/textures/etc., like I did over the last 1.5
months. It's a great learning experience, but it won't give you anything
generic.

Again, for example, in st_ReadPixels, I imagine you will need to give
"different orders" to transfer_map for the two different scenarios -
H/W blit and fallback. So what's the gain here?

If I'm missing something, please tell me.

Oded