On 19/06/2024 20:34, Mike Blumenkrantz wrote:
> Terakan is not a Mesa driver, and Mesa has no obligation to cater to out-of-tree projects which use its internal API. For everything else, see above.

I don't think, however, that it can simply be dismissed as if it didn't exist when it's:

 • striving to become a part of Mesa among the "cool" drivers with broad extension support, like RADV, Anvil, Turnip, and now NVK;
 • actively developed nearly every day (albeit for around 2 hours per day on average, because it's a free-time project);
 • trying to explore horizons Mesa hasn't reached yet (submitting hardware commands directly on Windows).

As for R600g, it's one thing to drop the constraints imposed by some Direct3D 9 level GPUs that, for instance, don't even support integers in shaders (if that's even actually causing issues that significantly slow down development of everything else — broad hardware support is something I absolutely LOVE Mesa and the open source infrastructure in general for, and I think that's the case for many others too). But here we're talking about Direct3D 11 (or 10, but programmed largely the same way) class hardware, with OpenGL 4.5 already supported and 4.6 straightforward to implement.

This means that, with the exception of OpenCL-specific global addressing issues (though R9xx can possibly have a 4 GB "global memory" binding), the interface contract between Gallium's internals and R600g shouldn't differ that much from that of the more modern drivers — the _hardware_ architecture itself doesn't really warrant dropping active support in common code.

Incidents like one change suddenly breaking vertex strides are thus mainly a problem in how _the driver itself_ is written, and that's of course another story… While I can't say much about Gallium interactions specifically, I keep encountering more and more things that are unhandled or broken in how the driver actually works with the GPU, and there are many Piglit tests that fail. I can imagine the way R600g is integrated into Gallium isn't in a much better state.

So I think it may make sense (even though I definitely don't see any serious necessity) to **temporarily** place R600g in a more stable environment where regressions in it are less likely to happen, and then, once it's brought up to modern Mesa quality standards and becomes more friendly to the rest of Mesa, to **move it back** to the main branch (though that may run into a whole lot of interface version conflicts, who knows). Some of the things we can do to clean it up are:

 • Make the patterns of interaction with other Gallium subsystems more similar to those used by other drivers. Maybe use RadeonSI as the primary example because of their shared roots.
 • Fix some GPU configuration bugs — the ones I described in my previous message, as well as some other, smaller ones:
   • Emit all viewports and scissors at once without using the dirty mask, because the hardware requires that (already handled years ago in RadeonSI).
   • Fix gl_VertexID in indirect draws — the DRAW_INDIRECT packets write the base to SQ_VTX_BASE_VTX_LOC, which affects vertex fetch instructions but not the vertex ID input; instead, switch from SQ_VTX_FETCH_VERTEX_DATA to SQ_VTX_FETCH_NO_INDEX_OFFSET and COPY_DW the base to VGT_INDX_OFFSET.
   • Properly configure the export format of the pixel shader DB export vector (gl_FragDepth, gl_FragStencilRefARB, gl_SampleMask).
   • Investigate how queries currently behave if the command buffer was split in the middle of a query, and add the necessary stitching where needed.
 • Make Piglit squeal less. I remember trying to experiment with glDispatchComputeIndirect, only to find out that the test I wanted to run to verify my solution was broken for another reason. Oink oink.
 • If needed, remove the remaining references to TGSI enums, and also switch to the NIR transform feedback interface that, as far as I understand, is compatible with the Nine and D3D10 frontends (or maybe it's the other way around); either way, make that consistent.
 • Do some cleanup in common areas:
   • Register, packet and shader structures can be moved to JSON definitions similar to those used for GCN/RDNA, but with a clearer indication of the architecture revisions they apply to (without splitting into r600d.h and evergreend.h). I've already stumbled upon a typo in that probably hand-written S_/G_/C_ #define soup that once caused weird Vulkan CTS failures, specifically in C_028780_BLEND_CONTROL_ENABLE in evergreend.h, and who knows what other surprises may be there. Some fields there are apparently just for the wrong architecture revisions (though maybe actually present but undocumented, I don't know, given the [RESERVED] situation with the documentation for anisotropic filtering and maybe the non-1D/2D_THIN tiling modes, for example, and the fact that we have the reference for the 3D registers, but not for compute).
   • A lot of format information can be shared between vertex fetch, texture fetch, and color/storage attachments. I'm currently finishing some common format code for Terakan that may be adopted by R600g.
   • Carefully make sure virtual memory is properly supported in all places on R9xx (using virtual addresses and not emitting relocation NOPs, which are harmless but wasteful — moreover, this part deserves a common function that would make it easier to port R600g to other platforms, such as by making it write D3DKMTRender patch locations on Windows).
 • Unify R6xx/R7xx and R8xx/R9xx code wherever possible. There's r600_state.c, which is over 100 KB, and evergreen_state.c, which is even bigger, but in many places it's just the same code, merely including r600d.h in one file and evergreend.h in the other — and how much technical debt we already have in the R6xx/R7xx code is an interesting question. To me, there doesn't seem to be any necessity to abandon R6xx/R7xx support completely at this point, considering that the programming differences from R8xx/R9xx are pretty minor — at least as long as someone occasionally runs tests on the older generations.

Maybe that will involve some small-scale changes, or maybe it will end up being more like a rewrite, but it's totally possible that this point is a new beginning for R600g rather than an ending — especially with Gert Wollny's compiler, and with me revisiting every aspect of the interface of those GPUs. At some point we may even start exposing R600-specific functionality, such as D3DFMT_D24FS8 in Gallium Nine on R6xx/R7xx.

However, I don't like the whole idea of moving drivers away from the main branch, because that affects not only development but also users of Mesa. It'd be necessary to ensure that Linux distribution maintainers are well-notified of the new branch, but even then that may still cause issues. What if the amber2 drivers end up in a separate package in a distribution? That could mean that after some `apt-get dist-upgrade`, users suddenly lose GPU acceleration on their systems for an unobvious reason. And we definitely shouldn't underestimate the number of users of that old hardware outside Linux developer circles — especially TeraScale (I think Firefox regularly gets issue reports from Nvidia Rankine/Curie users?). I occasionally see people on Reddit and other platforms discussing the status of Terakan, and I'd expect that the people who talk about some software are just a small fraction of those who use it at all. And sometimes weird things just happen, like Bringus Studios bringing a Xi3 Piston up out of semi-vaporware nowhere…

Regarding CI, I can't promise anything right now, but I think that's not an unsolvable issue. Overall, just one machine with a Trinity APU, an R6xx/R7xx card, and an R8xx card (one of them preferably being an RV670, RV770, or Cypress/Hemlock, to be able to test co-issuing of float64 instructions with a transcendental one when that's implemented) should likely cover most of our regression testing needs — at least for Gallium interaction, most definitely.

Terakan development will surely continue being based on the main branch, partly because the original reason behind the split suggestion mostly doesn't apply to it. I need recent Vulkan headers and all the WSI improvements at the very least — and there are areas where Terakan itself may contribute something new to the common Vulkan runtime code. I already have some WSI-demanded binary-over-timeline sync type enhancements on my branch, and if my Windows experiments go forward, there will likely be a lot that can be added to the common code, such as WDDM 1 synchronization primitives (even though WDDM 2's timeline semaphores, aka monitored fences, are more important to modern drivers, there's no WDDM 2 on Windows versions older than 10), as well as paths for zero-copy presentation (primarily for WDDM 1 level configurations — for example, via sharing images with Direct3D 10/11, or with OpenGL to take advantage of the "exclusive borderless" driver hack, or maybe even via D3DKMTPresent where possible).

On 20/06/2024 20:30, Adam Jackson wrote:
> We're using compute shaders internally in more and more ways, for example, maybe being able to assume them would be a win.

I'd imagine that compute shader usage in common Gallium code is optional, and depending on the hardware, compute shaders can even be the less optimal approach to things like image copying/resolving (where specialized copy hardware is available), whether for performance or for format support. For instance, early (or maybe actually all, I don't know for sure yet) AMD R8xx hardware hangs with linear storage images, according to one comment in R800AddrLib, which is why a quad with a color target may be preferable for copying — and that hardware also has fast resolves inside its color buffer hardware, as well as a DMA engine.

— Triang3l
