Re: [Mesa-dev] [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal
On Tue, Apr 20, 2021, 09:25 Marek Olšák wrote: > Daniel, imagine hardware that can only do what Windows does: future fences > signalled by userspace whenever userspace wants, and no kernel queues like > we have today. > Hmm, that sounds kinda like what we're trying to do for Libre-SOC's gpu which is basically where the cpu (exactly the same cores as the gpu) runs a user-space software renderer with extra instructions to make it go fast, so the kernel only gets involved for futex-wait or for video scan-out. This causes problems when figuring out how to interact with dma-fences for interoperability... Jacob Lifshay ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Rust drivers in Mesa
On Tue, Oct 13, 2020, 23:52 Thomas Zimmermann wrote: > Hi > > On Tue, 13 Oct 2020 13:01:58 -0700 Eric Anholt wrote: > > > On Tue, Oct 13, 2020 at 12:08 AM Thomas Zimmermann > > wrote: > > > > > > Hi > > > > > > On Fri, 02 Oct 2020 08:04:43 -0700 "Dylan Baker" > > > wrote: > > > > > > > I have serious concerns about cargo and crate usage. Cargo is > basically > > > > npm for rust, and shares all of the bad design decisions of npm, > > > > including linking multiple versions of the same library together and > > > > ballooning dependency lists that are fetched directly from the > > > > internet. This is both a security problem and directly in conflict > with > > > > meson's design of one and only one version of a project. And while > > > > rust prevents certain kinds of bugs, it doesn't prevent design bugs > or > > > > malicious code. As a meson developer, the rust community has been > > > > incredibly hard to work with and basically hostile to every request > > > > we've made. "cargo is how you build rust" is essentially the answer > > > > we've gotten from them at every turn. And if you're not going to use > > > > cargo, is rust really a win? The standard library is rather minimal > > > > "because just pull in 1000 crates". The distro people can correct me > if > > > > I'm wrong, but when librsvg went to rust it was a nightmare, several > > > > distros went a long time without updates because of cargo. > > > > > > I can't say much about meson, but using Rust has broken the binaries of > > > several packages on i586 for us, which consequently affects Gnome and > KDE. > > > [1][2] Rust uses SSE2 instructions on platforms that don't have them. > > > There's a proposed workaround, but it's not yet clear if that's > feasible > > > in practice. > > > > > > Best regards > > > Thomas > > > > > > [1] https://bugzilla.opensuse.org/show_bug.cgi?id=1162283 > > > [2] https://bugzilla.opensuse.org/show_bug.cgi?id=1077870 > > > > From the first bug: > > > > >Not entirely sure what to do about this. i586 is unsupported by Rust > (tier > > >2) and as such the package is built for i686 > > > > This really sounds like your distro is just building with the wrong > > rust target for packages targeting an earlier processor. > > Every other language/compiler combination appears to get this right. And > even > i686 does not require SSE2. As I said before, there might be a > workaround. > Rust has co-opted i586-unknown-linux-gnu to mean x86-32 without SSE and i686-unknown-linux-gnu to mean x86-32 with SSE2 (technically it implements it by using the pentium4 target cpu, so it may require more than just SSE2). Rust just doesn't provide official binaries for i586-unknown-linux-gnu -- that does not imply that rustc and cargo won't work fine if they are recompiled for i586-unknown-linux-gnu. The only major caveat I'd expect is floats having slightly different semantics due to x87, which may cause some of rustc's tests to fail. Jacob ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Rust drivers in Mesa
On Sun, Oct 4, 2020, 10:13 Jacob Lifshay wrote: > On Sun, Oct 4, 2020, 08:19 Alyssa Rosenzweig < > alyssa.rosenzw...@collabora.com> wrote: > >> Cc'd. >> >> On Sun, Oct 04, 2020 at 03:17:28PM +0300, Michael Shigorin wrote: >> > Hello, >> > regarding this proposal: >> > http://lists.freedesktop.org/archives/mesa-dev/2020-October/224639.html >> > >> > Alyssa, Rust is not "naturally fit for graphics driver >> > development" since it's not as universally available as >> > C/C++ in terms of mature compilers; you're basically >> > trying to put a cart before a horse, which will just >> > put more pressure on both Rust developers team *and* >> > on those of us working on non-x86 arches capable of >> > driving modern GPUs with Mesa. >> > >> > For one, I'm porting ALT Linux onto e2k platform >> > (there's only an early non-optimizing version of >> > Rust port there by now), and we're maintaining repos >> > for aarch64, armv7hf, ppc64el, mipsel, and riscv64 either >> > -- none of which seem to be described as better than >> > "experimental" at http://doc.rust-lang.org/core/arch page >> > in terms of Rust support. >> > Those are the ISA-specific intrinsics, which the vast majority of rust code doesn't need. If you're looking for the official support list, see: https://doc.rust-lang.org/nightly/rustc/platform-support.html You can also see the list of official binaries here: https://forge.rust-lang.org/infra/other-installation-methods.html Jacob > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Rust drivers in Mesa
On Sun, Oct 4, 2020, 08:19 Alyssa Rosenzweig < alyssa.rosenzw...@collabora.com> wrote: > Cc'd. > > On Sun, Oct 04, 2020 at 03:17:28PM +0300, Michael Shigorin wrote: > > Hello, > > regarding this proposal: > > http://lists.freedesktop.org/archives/mesa-dev/2020-October/224639.html > > > > Alyssa, Rust is not "naturally fit for graphics driver > > development" since it's not as universally available as > > C/C++ in terms of mature compilers; you're basically > > trying to put a cart before a horse, which will just > > put more pressure on both Rust developers team *and* > > on those of us working on non-x86 arches capable of > > driving modern GPUs with Mesa. > > > > For one, I'm porting ALT Linux onto e2k platform > > (there's only an early non-optimizing version of > > Rust port there by now), and we're maintaining repos > > for aarch64, armv7hf, ppc64el, mipsel, and riscv64 either > > -- none of which seem to be described as better than > > "experimental" at http://doc.rust-lang.org/core/arch page > > in terms of Rust support. > AArch64 with Linux is moving to tier 1 support with official support from ARM: https://github.com/rust-lang/rfcs/pull/2959 In my experience on armv7hf and powerpc64le with Linux and on aarch64 on Android (running rustc in termux), I've never had issues with Rust. Jacob Lifshay ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Rust drivers in Mesa
On Fri, Oct 2, 2020, 10:53 Jason Ekstrand wrote: > > 2. Rust's enums look awesome but are only mostly awesome: > a. Pattern matching on them can lead to some pretty deep > indentation which is a bit annoying. > b. There's no good way to have multiple cases handled by the same > code like you can with a C switch; you have to either repeat it or > break it out into a generic helper. > You can use | (or-patterns) in match, which could help with 2.b: https://doc.rust-lang.org/reference/expressions/match-expr.html example:

struct MyStruct(String);

enum MyEnum {
    A(String),
    B { intv: i32, strv: String },
    C(MyStruct),
}

fn print_it(v: MyEnum) {
    use MyEnum::*;
    match v {
        // one arm handles both the B and C cases, binding the same `s` in each pattern
        B { strv: s, .. } | C(MyStruct(s)) => println!("B or C: the string is {}", s),
        A(_) => {}
    }
}

Jacob Lifshay ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [Libre-soc-dev] Loading Vulkan Driver
On Sun, Aug 23, 2020, 18:38 Dave Airlie wrote: > On Mon, 24 Aug 2020 at 10:12, Jacob Lifshay > wrote: > > no, that is the existing LLVM backend from AMD's opengl/opencl drivers. > amdvlk came later. > > Those are the same codebase, amdvlk just uses a fork of llvm, but the > differences are only minor changes for impedance mismatch and release > timing, they never diverge more than necessary. > yeah, what I had meant is that the llvm amdgpu backend was not originally created for amdvlk, since amdvlk didn't exist then. Jacob ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [Libre-soc-dev] Loading Vulkan Driver
On Sun, Aug 23, 2020, 17:08 Luke Kenneth Casson Leighton wrote: > > > On Monday, August 24, 2020, Dave Airlie wrote: > >> >> amdgpu is completely scalar, > > > it is?? waah! that's new information to me. does it even squash vec2/3/4, > predication and swizzle? > yes, iirc. > > what about the upstream amdgpu LLVM-IR? that still preserves vector > intrinsics, right? > > i'm assuming that AMDVLK preserves vector intrinsics? > > AMDVLK's associated LLVM port was what ended up upstream, is that right? > no, that is the existing LLVM backend from AMD's opengl/opencl drivers. amdvlk came later. Jacob ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [Libre-soc-dev] Loading Vulkan Driver
On Sun, Aug 23, 2020, 16:31 Dave Airlie wrote: > It's hard to know then what you can expect to leverage from Mesa for > that sort of architecture, the SPIRV->NIR translation, and then you > probably want some sort of driver specific NIR->LLVM translation, > amdgpu is completely scalar, llvmpipe is manually vectorised, swr does > something like you might want, afaik it does a scalar translation and > then magic execution (but I haven't dug into it except to fix > interfaces). > Kazan isn't built on Mesa or NIR (though it was originally intended to integrate with Mesa). Luke decided that Libre-SOC should also have a similar Mesa-based driver and applied to NLNet for funding for it; Vivek is currently starting to implement that Mesa driver. > > I think you will hit problems with vectorisation, because it's always > been a problem for every effort in this area, but if the CPU design is > such that everything can be vectorised and you never hit a scalar > path, and you work out how texture derivatives work early, it might be > something prototype-able. > My current plan for screen-space derivatives in Kazan is to have the fragment shaders vectorized in multiples of 4 pixels (IIRC there's an nvidia vulkan extension that basically requires that), so adjacent fragment shader invocations can be subtracted as needed to calculate derivatives (see the sketch after this message). > > But none of the current mesa drivers are even close to this > architecture, llvmpipe or swr are probably a bit closer than anything, > so it's likely you'll be doing most of this out-of-tree anyways. My > other feeling is it sounds like over architecting, and reaching for > the stars here, where it might be practical to bring vallium/llvmpipe > up on the architecture first then work out how to do the big new > thing, or bring up this architecture on an x86 chip and see if it works > at all. > I would expect it to be quite easy to get llvmpipe to run on Libre-SOC's processor, since it is PowerPC-compatible; it just won't have very good performance due to llvmpipe's architecture. The plan is to get Kazan working with vectorization on x86, then change the backend out to unmodified PowerPC (LLVM may scalarize the instructions here), then add the new vector ISA to the PowerPC backend, then add custom instructions as-needed to improve performance. Jacob ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
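A minimal sketch, in Rust, of the quad-based derivative idea from the message above: adjacent invocations within a 2x2 pixel quad are subtracted to get coarse derivatives. The Quad type and its [top-left, top-right, bottom-left, bottom-right] layout are made up for illustration; this is not Kazan's actual code.

// Illustrative only: values for one 2x2 quad of fragment shader invocations,
// laid out as [top-left, top-right, bottom-left, bottom-right].
#[derive(Copy, Clone)]
struct Quad([f32; 4]);

impl Quad {
    /// Coarse derivative in x: subtract horizontally adjacent invocations.
    fn ddx(self) -> Quad {
        let d = self.0[1] - self.0[0];
        Quad([d; 4])
    }
    /// Coarse derivative in y: subtract vertically adjacent invocations.
    fn ddy(self) -> Quad {
        let d = self.0[2] - self.0[0];
        Quad([d; 4])
    }
}

fn main() {
    // A value that grows by 0.5 per pixel in x and 2.0 per pixel in y.
    let v = Quad([1.0, 1.5, 3.0, 3.5]);
    assert_eq!(v.ddx().0[0], 0.5);
    assert_eq!(v.ddy().0[0], 2.0);
    println!("ddx = {}, ddy = {}", v.ddx().0[0], v.ddy().0[0]);
}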
Re: [Mesa-dev] [Libre-soc-dev] Loading Vulkan Driver
On Sun, Aug 23, 2020, 15:55 Dave Airlie wrote: > What is your work submission model then, Vulkan is designed around > having work submitted to a secondary processor from a control > processor. Do you intend for the device to be used only as a > coprocessor, or as a CPU for normal tasks that can then use the > features in their threads, because none of the API are designed around > that model. They have a running thread of execution that queues stuff > up to a separate execution domain. > It's intended to be a combination, where CPU threads schedule work and render threads dequeue and run the work, probably using something like Rayon: https://github.com/rayon-rs/rayon When a CPU thread waits in the Vulkan driver, it can possibly be used as a render thread instead of blocking on a futex, avoiding needing excessive numbers of Linux kernel-level threads. The CPU and render threads run on the same cores, as scheduled by Linux. > What represents a vulkan queue, The rayon task queues. > what will be recorded into vulkan command buffers, A command sequence encoded as bytecode, a list of Rust enums, or something similar. > what will execute those command buffers. The render threads will dequeue the command buffers, run through all the commands in them, and schedule the appropriate rendering tasks to rayon's task execution mechanism (a rough sketch follows this message). > These are the questions you need to ask yourself and answer before writing any code. Yup, did that 2 years ago, though I don't remember if I explicitly wrote them down before. Jacob ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
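A rough sketch, in Rust, of the submission model described in the message above: submission pushes recorded commands onto a queue, and a render thread drains them and fans the work out to rayon. The Command and Queue types are invented for illustration and are not the driver's real API; it only assumes the rayon crate is available.

use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

// Stand-in for the "list of Rust enums" a command buffer might record.
enum Command {
    Draw { vertex_count: u32 },
    Dispatch { x: u32, y: u32, z: u32 },
}

struct Queue {
    pending: Mutex<VecDeque<Vec<Command>>>,
}

impl Queue {
    // Roughly what a vkQueueSubmit-like entry point would do.
    fn submit(&self, command_buffer: Vec<Command>) {
        self.pending.lock().unwrap().push_back(command_buffer);
    }

    // A render thread pulls one command buffer and fans its work out to rayon.
    fn run_one(&self) {
        let next = self.pending.lock().unwrap().pop_front();
        if let Some(cmds) = next {
            for cmd in cmds {
                match cmd {
                    Command::Draw { vertex_count } => {
                        // split the draw into chunks so the work-stealing scheduler
                        // can balance them across cores
                        rayon::scope(|s| {
                            for chunk_start in (0..vertex_count).step_by(64) {
                                s.spawn(move |_| {
                                    let _ = chunk_start; // shade this chunk of vertices here
                                });
                            }
                        });
                    }
                    Command::Dispatch { .. } => {
                        // compute dispatches would be fanned out similarly
                    }
                }
            }
        }
    }
}

fn main() {
    let queue = Arc::new(Queue { pending: Mutex::new(VecDeque::new()) });
    queue.submit(vec![Command::Draw { vertex_count: 256 }]);
    queue.run_one();
}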
Re: [Mesa-dev] [Libre-soc-dev] Loading Vulkan Driver
On Sun, Aug 23, 2020, 12:49 Dave Airlie wrote: > The big thing doing what Jacob did before, and from memory where he > went wrong despite advice to the contrary is he skipped the > "vectorises it" stage, which is highly important for vector CPU > execution, as opposed to scalar GPU. > IIRC, my plan for Kazan all along was/is to vectorize and multi-thread it; I just didn't get that far during the GSoC portion due to running out of time, so, in order to get it to do something more impressive than parse SPIR-V, I kludged a rasterizer onto it during the last few weeks of GSoC. Originally, my plan was to write a whole-function vectorization pass as an LLVM pass. The new plan is to have the vectorization pass be done in a new IR (shader-compiler-ir) before translating to LLVM IR, since LLVM IR isn't very good at retaining the structured IR needed to do full-function vectorization. That also has the benefit of supporting non-LLVM backends, such as cranelift. For the rewritten version in Rust that I'm still working on (when not working on the hardware design side), I'm not going to skip ahead, and will actually get it to work properly. Jacob ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [Libre-soc-dev] Loading Vulkan Driver
On Thu, Aug 20, 2020, 22:28 vivek pandya wrote: > Thanks Jason for your time to reply. I understand the error but I am not > much familiar with entry points generation code. Currently I just make it > compile (I just want to develop a broken pipeline quickly that just returns > from the entry point) . > I will study the code. Is there any document to read about that? I want to > understand how loaders and icd interact. > IIRC the docs are here: https://github.com/KhronosGroup/Vulkan-Loader/blob/master/loader/LoaderAndLayerInterface.md Jacob Lifshay > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Plumbing explicit synchronization through the Linux ecosystem
On Tue, Mar 17, 2020 at 11:35 PM Jason Ekstrand wrote: > > On Wed, Mar 18, 2020 at 12:20 AM Jacob Lifshay > wrote: > > > > The main issue with doing everything immediately is that a lot of the > > function calls that games expect to take a very short time (e.g. > > vkQueueSubmit) would instead take a much longer time, potentially > > causing problems. > > Do you have any evidence that it will cause problems? What I said > above is what swiftshader is doing and they're running real apps and > I've not heard of it causing any problems. It's also worth noting > that you would only really have to stall at sync_file export. You can > async as much as you want internally. Ok, seems worth trying out. > > One idea for a safe userspace-backed sync_file is to have a step > > counter that counts down until the sync_file is ready, where if > > userspace doesn't tell it to count any steps in a certain amount of > > time, then the sync_file switches to the error state. This way, it > > will error shortly after a process deadlocks for some reason, while > > still having the finite-time guarantee. > > > > When the sync_file is created, the step counter would be set to the > > number of jobs that the fence is waiting on. > > > > It can also be set to pause the timeout to wait until another > > sync_file signals, to handle cases where a sync_file is waiting on a > > userspace process that is waiting on another sync_file. > > > > The main issue is that the kernel would have to make sure that the > > sync_file graph doesn't have loops, maybe by erroring all sync_files > > that it finds in the loop. > > > > Does that sound like a good idea? > > Honestly, I don't think you'll ever be able to sell that to the kernel > community. All of the deadlock detection would add massive complexity > to the already non-trivial dma_fence infrastructure and for what > benefit? So that a software rasterizer can try to pretend to be more > like a GPU? You're going to have some very serious perf numbers > and/or other proof of necessity if you want to convince the kernel > people to accept that level of complexity/risk. "I designed my > software to work this way" isn't going to convince anyone of anything > especially when literally every other software rasterizer I'm aware of > is immediate and they work just fine. After some further research, it turns out that it will work to have all the sync_files that a sync_file needs to depend on specified at creation, which forces the dependence graph to be a DAG since you can't depend on a sync_file that isn't yet created, so loops are impossible by design. Since kernel deadlock detection isn't actually required, just timeouts for the case of halted userspace, does this seem feasible? I'd guess that it'd require maybe 200-300 lines of code in a self-contained driver similar to the sync_file debugging driver mentioned previously but with the additional timeout code for safety. Jacob ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
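A toy userspace model, in Rust, of the "dependencies fixed at creation" argument from the message above: because a fence can only reference fences that already exist when it is created, following its dependencies can never loop back, so the graph is a DAG by construction. All names are hypothetical; this models no actual kernel interface.

use std::rc::Rc;

// A fence's dependency list is fixed at creation and can only point at
// already-existing fences, so no cycle can ever be formed.
struct Fence {
    deps: Vec<Rc<Fence>>,
}

fn create_fence(deps: Vec<Rc<Fence>>) -> Rc<Fence> {
    Rc::new(Fence { deps })
}

fn depth(f: &Fence) -> usize {
    // Always terminates, because there are no cycles to follow.
    1 + f.deps.iter().map(|d| depth(d)).max().unwrap_or(0)
}

fn main() {
    let a = create_fence(vec![]);
    let b = create_fence(vec![a.clone()]);
    let c = create_fence(vec![a, b]);
    println!("dependency depth of c = {}", depth(&c)); // prints 3
}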
Re: [Mesa-dev] Plumbing explicit synchronization through the Linux ecosystem
On Tue, Mar 17, 2020 at 7:08 PM Jason Ekstrand wrote: > > On Tue, Mar 17, 2020 at 7:16 PM Jacob Lifshay > wrote: > > > > On Tue, Mar 17, 2020 at 11:14 AM Lucas Stach wrote: > > > > > > Am Dienstag, den 17.03.2020, 10:59 -0700 schrieb Jacob Lifshay: > > > > I think I found a userspace-accessible way to create sync_files and > > > > dma_fences that would fulfill the requirements: > > > > https://github.com/torvalds/linux/blob/master/drivers/dma-buf/sw_sync.c > > > > > > > > I'm just not sure if that's a good interface to use, since it appears > > > > to be designed only for debugging. Will have to check for additional > > > > requirements of signalling an error when the process that created the > > > > fence is killed. > > It is expressly only for debugging and testing. Exposing such an API > to userspace would break the finite time guarantees that are relied > upon to keep sync_file a secure API. Ok, I was figuring that was probably the case. > > > Something like that can certainly be lifted for general use if it makes > > > sense. But then with a software renderer I don't really see how fences > > > help you at all. With a software renderer you know exactly when the > > > frame is finished and you can just defer pushing it over to the next > > > pipeline element until that time. You won't gain any parallelism by > > > using fences as the CPU is busy doing the rendering and will not run > > > other stuff concurrently, right? > > > > There definitely may be other hardware and/or processes that can > > process some stuff concurrently with the main application, such as the > > compositor and/or video encoding processes (for video capture). > > Additionally, from what I understand, sync_file is the standard way to > > export and import explicit synchronization between processes and > > between drivers on Linux, so it seems like a good idea to support it > > from an interoperability standpoint even if it turns out that there > > aren't any scheduling/timing benefits. > > There are different ways that one can handle interoperability, > however. One way is to try and make the software rasterizer look as > much like a GPU as possible: lots of threads to make things as > asynchronous as possible, "real" implementations of semaphores and > fences, etc. This is basically the route I've picked, though rather than making lots of native threads, I'm planning on having just one thread per core and having a work-stealing scheduler (inspired by Rust's rayon crate) schedule all the individual render/compute jobs, because that allows creating many more jobs, enabling finer load balancing. > Another is to let a SW rasterizer be a SW rasterizer: do > everything immediately, thread only so you can exercise all the CPU > cores, and minimally implement semaphores and fences well enough to > maintain compatibility. If you take the first approach, then we have > to solve all these problems with letting userspace create unsignaled > sync_files which it will signal later and figure out how to make it > safe. If you take the second approach, you'll only ever have to > return already signaled sync_files and there's no problem with the > sync_file finite time guarantees. The main issue with doing everything immediately is that a lot of the function calls that games expect to take a very short time (e.g. vkQueueSubmit) would instead take a much longer time, potentially causing problems.
One idea for a safe userspace-backed sync_file is to have a step counter that counts down until the sync_file is ready, where if userspace doesn't tell it to count any steps in a certain amount of time, then the sync_file switches to the error state. This way, it will error shortly after a process deadlocks for some reason, while still having the finite-time guarantee. When the sync_file is created, the step counter would be set to the number of jobs that the fence is waiting on. It can also be set to pause the timeout to wait until another sync_file signals, to handle cases where a sync_file is waiting on a userspace process that is waiting on another sync_file. The main issue is that the kernel would have to make sure that the sync_file graph doesn't have loops, maybe by erroring all sync_files that it finds in the loop. Does that sound like a good idea? Jacob ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
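A toy sketch, in Rust, of the step-counter idea from the message above: the fence signals once the expected number of steps has been counted, and moves to an error state if userspace stops reporting progress before a timeout. All names are made up for illustration; this is a userspace model, not kernel code.

use std::time::{Duration, Instant};

enum State { Pending, Signaled, Error }

struct StepFence {
    remaining_steps: u32,
    deadline: Instant,
    timeout: Duration,
    state: State,
}

impl StepFence {
    // Created with the number of jobs the fence waits on.
    fn new(steps: u32, timeout: Duration) -> Self {
        StepFence {
            remaining_steps: steps,
            deadline: Instant::now() + timeout,
            timeout,
            state: if steps == 0 { State::Signaled } else { State::Pending },
        }
    }
    // Called by userspace whenever one of the jobs the fence waits on finishes.
    fn count_step(&mut self) {
        if let State::Pending = self.state {
            self.remaining_steps -= 1;
            self.deadline = Instant::now() + self.timeout; // progress resets the timer
            if self.remaining_steps == 0 {
                self.state = State::Signaled;
            }
        }
    }
    // Called periodically (e.g. from a timer): no progress in time => error state,
    // which preserves the finite-time guarantee even if the process hangs.
    fn poll(&mut self) {
        if let State::Pending = self.state {
            if Instant::now() > self.deadline {
                self.state = State::Error;
            }
        }
    }
}

fn main() {
    let mut f = StepFence::new(2, Duration::from_millis(100));
    f.count_step();
    f.count_step();
    f.poll();
    assert!(matches!(f.state, State::Signaled));
}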
Re: [Mesa-dev] Plumbing explicit synchronization through the Linux ecosystem
On Tue, Mar 17, 2020 at 11:14 AM Lucas Stach wrote: > > Am Dienstag, den 17.03.2020, 10:59 -0700 schrieb Jacob Lifshay: > > I think I found a userspace-accessible way to create sync_files and > > dma_fences that would fulfill the requirements: > > https://github.com/torvalds/linux/blob/master/drivers/dma-buf/sw_sync.c > > > > I'm just not sure if that's a good interface to use, since it appears > > to be designed only for debugging. Will have to check for additional > > requirements of signalling an error when the process that created the > > fence is killed. > > Something like that can certainly be lifted for general use if it makes > sense. But then with a software renderer I don't really see how fences > help you at all. With a software renderer you know exactly when the > frame is finished and you can just defer pushing it over to the next > pipeline element until that time. You won't gain any parallelism by > using fences as the CPU is busy doing the rendering and will not run > other stuff concurrently, right? There definitely may be other hardware and/or processes that can process some stuff concurrently with the main application, such as the compositor and/or video encoding processes (for video capture). Additionally, from what I understand, sync_file is the standard way to export and import explicit synchronization between processes and between drivers on Linux, so it seems like a good idea to support it from an interoperability standpoint even if it turns out that there aren't any scheduling/timing benefits. Jacob ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Plumbing explicit synchronization through the Linux ecosystem
On Tue, Mar 17, 2020 at 10:21 AM Lucas Stach wrote: > > Am Dienstag, den 17.03.2020, 10:12 -0700 schrieb Jacob Lifshay: > > One related issue with explicit sync using sync_file is that combined > > CPUs/GPUs (the CPU cores *are* the GPU cores) that do all the > > rendering in userspace (like llvmpipe but for Vulkan and with extra > > instructions for GPU tasks) but need to synchronize with other > > drivers/processes is that there should be some way to create an > > explicit fence/semaphore from userspace and later signal it. This > > seems to conflict with the requirement for a sync_file to complete in > > finite time, since the user process could be stopped or killed. > > > > Any ideas? > > Finite just means "not infinite". If you stop the process that's doing > part of the pipeline processing you block the pipeline, you get to keep > the pieces in that case. Seems reasonable. > That's one of the issues with implicit sync > that explicit may solve: a single client taking way too much time to > render something can block the whole pipeline up until the display > flip. With explicit sync the compositor can just decide to use the last > client buffer if the latest buffer isn't ready by some deadline. > > With regard to the process getting killed: whatever you sync primitive > is, you need to make sure to signal the fence (possibly with an error > condition set) when you are not going to make progress anymore. So > whatever your means to creating the sync_fd from your software renderer > is, it needs to signal any outstanding fences on the sync_fd when the > fd is closed. I think I found a userspace-accessible way to create sync_files and dma_fences that would fulfill the requirements: https://github.com/torvalds/linux/blob/master/drivers/dma-buf/sw_sync.c I'm just not sure if that's a good interface to use, since it appears to be designed only for debugging. Will have to check for additional requirements of signalling an error when the process that created the fence is killed. Jacob > > Regards, > Lucas > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Plumbing explicit synchronization through the Linux ecosystem
One related issue with explicit sync using sync_file: combined CPUs/GPUs (the CPU cores *are* the GPU cores) that do all the rendering in userspace (like llvmpipe, but for Vulkan and with extra instructions for GPU tasks), yet still need to synchronize with other drivers/processes, need some way to create an explicit fence/semaphore from userspace and signal it later. This seems to conflict with the requirement for a sync_file to complete in finite time, since the user process could be stopped or killed. Any ideas? Jacob Lifshay ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services
One idea for Marge-bot (don't know if you already do this): Rust-lang has their bot (bors) automatically group together a few merge requests into a single merge commit, which it then tests; then, when the tests pass, it merges. This could help reduce CI runs to once a day (or some other rate). If the tests fail, then it could automatically deduce which one failed by recursive subdivision or similar (a rough sketch follows this message). There's also a mechanism to adjust priority and grouping behavior when the defaults aren't sufficient. Jacob ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
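A rough sketch, in Rust, of the recursive-subdivision idea from the message above, under the simplifying assumptions that CI results are deterministic and a single merge request in the batch is at fault; the function names are made up for illustration.

// Given a batch of merge requests whose combined CI run failed, recursively
// split the batch to find a culprit with O(log n) extra CI runs.
fn find_culprit(batch: &[u32], ci_passes: &dyn Fn(&[u32]) -> bool) -> Option<u32> {
    match batch {
        [] => None,
        [only] => Some(*only),
        _ => {
            let (left, right) = batch.split_at(batch.len() / 2);
            if !ci_passes(left) {
                find_culprit(left, ci_passes)
            } else {
                // the left half is clean, so the failure must involve the right half
                find_culprit(right, ci_passes)
            }
        }
    }
}

fn main() {
    let bad_mr = 1234;
    // stand-in for an actual CI run: fails whenever the bad MR is included
    let ci = |mrs: &[u32]| !mrs.contains(&bad_mr);
    let batch = [1230, 1231, 1232, 1233, 1234, 1235];
    assert_eq!(find_culprit(&batch, &ci), Some(bad_mr));
    println!("culprit: {:?}", find_culprit(&batch, &ci));
}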
Re: [Mesa-dev] [RFC PATCH] Add GL_MESA_ieee_fp_alu_mode specification draft
See also: http://bugs.libre-riscv.org/show_bug.cgi?id=188 It might be worthwhile to consider a Vulkan extension to support this as a translation target for DX9 as well as other old HW? Jacob Lifshay ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] NLNet Funded development of a software/hardware MESA driver for the Libre GPGPU
On Mon, Jan 13, 2020 at 9:39 AM Jason Ekstrand wrote: > > On Mon, Jan 13, 2020 at 11:27 AM Luke Kenneth Casson Leighton > wrote: >> jason i'd be interested to hear your thoughts on what jacob wrote, does it >> alleviate your concerns, (we're not designing hardware specifically around >> vec2/3/4, it simply has that capability). > > > Not at all. If you just want a SW renderer that runs on RISC-V, feel free to > write one. If you want to vectorize in hardware and actually get serious > performance out of it, I highly doubt his plan will work. That said, I > wasn't planning to work on it so none of this is my problem so you're welcome > to take or leave anything I say. :-) So, since it may not have been clearly explained before, the GPU we're building has masked vectorization like most other GPUs; it's just that it additionally supports the masked vectors' elements being 1-to-4-element subvectors (see the sketch after this message). If it turns out that using subvectors makes the GPU slower, we can add a scalarization pass before the SIMT to vector translation, converting everything to use more conventional operations. Jacob ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
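An illustrative model, in Rust, of what "masked vectors whose elements are 1-to-4-element subvectors" means, next to the scalarized fallback mentioned above. The functions are made up for illustration and say nothing about the actual hardware or ISA encoding.

// Each lane holds a small subvector (here a vec3), and a per-lane mask
// enables or disables the whole subvector.
fn masked_vec3_add(a: &[[f32; 3]], b: &[[f32; 3]], mask: &[bool], out: &mut [[f32; 3]]) {
    for lane in 0..out.len() {
        if mask[lane] {
            for c in 0..3 {
                out[lane][c] = a[lane][c] + b[lane][c];
            }
        }
    }
}

// The scalarization fallback: the same work expressed with one scalar element
// per lane, using three times as many lanes instead of subvectors.
fn masked_scalar_add(a: &[f32], b: &[f32], mask: &[bool], out: &mut [f32]) {
    for lane in 0..out.len() {
        if mask[lane] {
            out[lane] = a[lane] + b[lane];
        }
    }
}

fn main() {
    let a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]];
    let b = [[0.5; 3]; 2];
    let mask = [true, false];
    let mut out = [[0.0; 3]; 2];
    masked_vec3_add(&a, &b, &mask, &mut out);
    println!("{:?}", out); // second lane stays zero because its mask bit is off
    let _ = masked_scalar_add; // shown only for comparison
}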
Re: [Mesa-dev] NLNet Funded development of a software/hardware MESA driver for the Libre GPGPU
On Thu, Jan 9, 2020 at 3:56 AM Luke Kenneth Casson Leighton wrote: > > On 1/9/20, Jason Ekstrand wrote: > >> 2. as a flexible Vector Processor, soft-programmable, then over time if > >> the industry moves to dropping vec4, so can we. > >> > > > > That's very nice. My primary reason for sending the first e-mail was that > > SwiftShader vs. Mesa is a pretty big decision that's hard to reverse after > > someone has poured several months into working on a driver and the argument > > you gave in favor of Mesa was that it supports vec4. > > not quite :) i garbled it (jacob spent some time explaining it, a few > months back, so it's 3rd hand if you know what i mean). what i can > recall of what he said was: it's something to do with the data types, > particularly predication, being maintained as part of SPIR-V (and > NIR), which, if you drop that information, you have to use > auto-vectorisation and other rather awful tricks to get it back when > you get to the assembly level. > > jacob perhaps you could clarify, here? So the major issue with the approach AMDGPU took, where the SIMT to predicated vector translation is done by the LLVM backend, is that LLVM doesn't really maintain a reducible CFG, which is needed to correctly vectorize the code without devolving to a switch-in-a-loop. This kinda-sorta works for AMDGPU because the backend can specifically tell the optimization passes to try to maintain a reducible CFG. However, that won't work for Libre-RISCV's GPU because we don't have a separate GPU ISA (it's just RISC-V or Power, we're still deciding), so the backends don't tell the optimization passes that they need to maintain a reducible CFG. Additionally, the AMDGPU vectorization is done as part of the translation from LLVM IR to MIR, which makes it very hard to adapt to a different ISA. Because of all of those issues, I decided that it would be better to vectorize before translating to LLVM IR, since that way, the CFG reducibility can be easily maintained. This also gives the benefit that it's much easier to substitute a different backend compiler such as gccjit or cranelift, since all of the required SIMT-specific transformations are already completed before the code goes to the backend. Both NIR and the IR I'm currently implementing in Kazan (the non-Mesa Vulkan driver for libre-riscv) maintain a reducible CFG throughout the optimization process. In fact, the IR I'm implementing can't express non-reducible CFGs since it's built as a tree of loops and code blocks where control transfer operations can only continue a loop or exit a loop or block. Switches work by having a nested set of blocks, and the switch instruction picks which block to break out of (a rough sketch of that shape follows this message). Hopefully, that all made sense. :) Jacob Lifshay ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
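A hypothetical sketch, in Rust, of the shape of IR described in the message above: a tree of blocks and loops where a branch can only break out of or continue an enclosing node, so an irreducible CFG simply can't be expressed. These are not the actual shader-compiler-ir definitions, just the structural idea.

type NodeId = u32;

enum Node {
    /// Straight-line code followed by nested children.
    Block { id: NodeId, body: Vec<Node> },
    /// A loop whose body is re-entered by `ContinueLoop`.
    Loop { id: NodeId, body: Vec<Node> },
    /// Break out of the named enclosing block or loop.
    BreakBlock { target: NodeId },
    /// Jump back to the start of the named enclosing loop.
    ContinueLoop { target: NodeId },
    /// A switch picks which of several enclosing nested blocks to break out of.
    Switch { cases: Vec<(i64, NodeId)>, default: NodeId },
    /// Placeholder for ordinary instructions.
    Instruction,
}

fn main() {
    // Roughly: loop { ...; break; } -- the branch may only target an enclosing node.
    let ir = Node::Loop {
        id: 0,
        body: vec![Node::Instruction, Node::BreakBlock { target: 0 }],
    };
    let _ = ir;
}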
Re: [Mesa-dev] [PATCH] Adding support for EXT_sRGB for Opengl ES
Any progress on adding EXT_sRGB support to Mesa? Jacob Lifshay ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] Vulkan-CPU has been renamed to Kazan
The new name means "Volcano" in Japanese. For those who don't remember, Kazan is a work-in-progress software-rendering Vulkan implementation. I moved the source code to https://github.com/kazan-3d/kazan. Additionally, I registered the domain name kazan-3d.org, which currently redirects to the source code on GitHub. I renamed the project because the name vulkan-cpu infringes the Vulkan trademark. The source code will still be available at the old URL (https://github.com/programmerjake/vulkan-cpu) to avoid breaking any links; however, I probably won't keep the old repository up-to-date. Jacob Lifshay ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] Vulkan on CPU Google Summer of Code project
GSOC 2017 is effectively over for me. Here's the link to the GSOC landing page that I will be submitting later: https://github.com/programmerjake/vulkan-cpu/blob/gsoc-2017/docs/gsoc-2017-landing-page.md I've only completed part of the Vulkan on CPU project:
- I've completely implemented generating the SPIR-V parser from Khronos's JSON grammar description files.
- I've completely implemented using LLVM ORC as the JIT compiler back-end. I implemented an ORC layer that integrates with the old JITEventListener interface, allowing me to use LLVM's debugger integration.
- I've developed this project on GNU/Linux, so it has complete support.
- I've partially implemented translating from SPIR-V to LLVM IR.
- I've partially implemented generating a graphics pipeline by compiling several shaders together.
- I've partially implemented image support: currently only VK_FORMAT_B8G8R8A8_UNORM.
- I have a temporary rasterizer implementation: I've implemented a scanline rasterizer just to get stuff to draw; I'm planning on replacing it with the originally-planned tiled binning rasterizer.
- I have less than 5 functions to implement before it will work on Win32, mostly filesystem support.
- I've not yet started implementing the actual Vulkan ICD interface.
- I've not yet started implementing the whole-function vectorization pass.
- I've not yet started implementing support for other platforms; however, it should compile and run without problems on most Unix systems.
Some interesting things I learned:
- Vulkan doesn't actually specify using the top-left fill rule. (I couldn't find it in the rasterization section of the Vulkan specification.)
- SPIR-V is surprisingly complicated to parse for an IR that's designed to be simple to parse. OpSwitch requires you to determine the bit-width of the value being switched on before you can parse the list of cases (a simplified illustration follows this message). See https://bugs.freedesktop.org/show_bug.cgi?id=101560
I got delayed by implementing a SPIR-V parser myself. In hindsight, the amount of time needed to run the SPIR-V to LLVM IR translator is dwarfed by running LLVM's optimizations, so I could have just used the SPIR-V parser that Khronos has already written, as it's fast enough. Jacob Lifshay ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
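A simplified illustration, in Rust, of the OpSwitch parsing wrinkle mentioned above: each case literal is one 32-bit word normally but two words for a wider selector type, so the parser must already know the selector's bit width before it can split the remaining operand words into (literal, label) pairs. The function is made up for illustration and glosses over the rest of the instruction layout.

// Split OpSwitch case operands into (literal, target-label) pairs.
// The literal's width comes from the selector's type, which has to be
// looked up from earlier in the module before this can run.
fn parse_switch_cases(operand_words: &[u32], selector_bits: u32) -> Vec<(u64, u32)> {
    let literal_words = if selector_bits > 32 { 2 } else { 1 };
    operand_words
        .chunks_exact(literal_words + 1)
        .map(|case| {
            let literal = if literal_words == 2 {
                (case[0] as u64) | ((case[1] as u64) << 32) // low word first
            } else {
                case[0] as u64
            };
            let target_label = case[literal_words];
            (literal, target_label)
        })
        .collect()
}

fn main() {
    // The same operand words decode differently depending on the selector width,
    // which is why the selector's type must be known first.
    let words = [7, 0, 100, 9, 0, 200];
    println!("{:?}", parse_switch_cases(&words, 64)); // [(7, 100), (9, 200)]
    println!("{:?}", parse_switch_cases(&words, 32)); // [(7, 0), (100, 9), (0, 200)]
}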
Re: [Mesa-dev] software implementation of vulkan for gsoc/evoc
> [...] standard C runtime's > sqrt/log2/exp2/sin/cos then it would be dead slow. > I am planning on using c++ templates to help with a lot of the texture sampler code generation -- clang can convert it to llvm ir and then I can inline it into the appropriate places. I think that all of the non-compressed image formats should be pretty easy to handle that way, as they are all pretty similar (bits packed into a long word or members of a struct). I can implement interpolation on top of the functions to load and unpack the image elements from memory (see the sketch after this message). I'd estimate that, excluding the compressed texture formats, I'd need less than 10k lines and maybe a week or two to implement it all. (Glad I don't have to implement that in C.) I am planning on compiling fdlibm with clang into llvm ir, then running my vectorization algorithm on all the functions. LLVM has a spot where you can tell it that you have optimized vectorized math intrinsics; I could add them there, or implement another lowering pass to convert the intrinsics to function calls, which can then be inlined. Hopefully, that will save most of the work needed to implement vectorized math functions. Also, llvm is already pretty good at converting vectorized sqrt intrinsics to vector sqrt instructions, which x86 sse/avx and (i think) arm neon already have. > > > Anyway, I hope this helps. Best of luck. > Thanks, Jacob Lifshay ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
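The message above describes doing this with C++ templates; here is the analogous load-unpack-interpolate layering sketched in Rust instead, with VK_FORMAT_B8G8R8A8_UNORM (the format mentioned earlier in the thread) as the example. The trait and function names are made up for illustration.

// Each format provides an unpack step; interpolation is written once on top of it.
trait Format {
    fn unpack(texel: u32) -> [f32; 4]; // returns (R, G, B, A)
}

struct B8G8R8A8Unorm;

impl Format for B8G8R8A8Unorm {
    fn unpack(texel: u32) -> [f32; 4] {
        let unorm = |byte: u32| (byte & 0xff) as f32 / 255.0;
        // bytes from lowest address: B, G, R, A
        [unorm(texel >> 16), unorm(texel >> 8), unorm(texel), unorm(texel >> 24)]
    }
}

// Bilinear interpolation written generically over any Format's unpack.
fn bilinear<F: Format>(t00: u32, t10: u32, t01: u32, t11: u32, fx: f32, fy: f32) -> [f32; 4] {
    let lerp = |a: f32, b: f32, t: f32| a + (b - a) * t;
    let (c00, c10, c01, c11) = (F::unpack(t00), F::unpack(t10), F::unpack(t01), F::unpack(t11));
    let mut out = [0.0; 4];
    for i in 0..4 {
        out[i] = lerp(lerp(c00[i], c10[i], fx), lerp(c01[i], c11[i], fx), fy);
    }
    out
}

fn main() {
    // opaque red and opaque blue texels, sampled halfway between them in x
    let red = 0xff_ff_00_00u32;  // A=0xff, R=0xff, G=0, B=0
    let blue = 0xff_00_00_ffu32; // A=0xff, R=0, G=0, B=0xff
    let c = bilinear::<B8G8R8A8Unorm>(red, blue, red, blue, 0.5, 0.0);
    println!("{:?}", c); // roughly [0.5, 0.0, 0.5, 1.0] as (R, G, B, A)
}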
[Mesa-dev] [PATCH] vulkan/wsi: Improve the DRI3 error message
This commit improves the message by telling them that they could probably enable DRI3. More importantly, it includes a little heuristic to check to see if we're running on AMD or NVIDIA's proprietary X11 drivers and, if we are, doesn't emit the warning. This way, users with both a discrete card and Intel graphics don't get the warning when they're just running on the discrete card. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99715 Co-authored-by: Jason Ekstrand Reviewed-by: Kai Wasserbäch Tested-by: Rene Lindsay --- src/vulkan/wsi/wsi_common_x11.c | 51 + 1 file changed, 41 insertions(+), 10 deletions(-) diff --git a/src/vulkan/wsi/wsi_common_x11.c b/src/vulkan/wsi/wsi_common_x11.c index 64ba921..323209c 100644 --- a/src/vulkan/wsi/wsi_common_x11.c +++ b/src/vulkan/wsi/wsi_common_x11.c @@ -49,6 +49,7 @@ struct wsi_x11_connection { bool has_dri3; bool has_present; + bool is_proprietary_x11; }; struct wsi_x11 { @@ -63,8 +64,8 @@ static struct wsi_x11_connection * wsi_x11_connection_create(const VkAllocationCallbacks *alloc, xcb_connection_t *conn) { - xcb_query_extension_cookie_t dri3_cookie, pres_cookie; - xcb_query_extension_reply_t *dri3_reply, *pres_reply; + xcb_query_extension_cookie_t dri3_cookie, pres_cookie, amd_cookie, nv_cookie; + xcb_query_extension_reply_t *dri3_reply, *pres_reply, *amd_reply, *nv_reply; struct wsi_x11_connection *wsi_conn = vk_alloc(alloc, sizeof(*wsi_conn), 8, @@ -75,20 +76,43 @@ wsi_x11_connection_create(const VkAllocationCallbacks *alloc, dri3_cookie = xcb_query_extension(conn, 4, "DRI3"); pres_cookie = xcb_query_extension(conn, 7, "PRESENT"); + /* We try to be nice to users and emit a warning if they try to use a +* Vulkan application on a system without DRI3 enabled. However, this ends +* up spewing the warning when a user has, for example, both Intel +* integrated graphics and a discrete card with proprietary drivers and are +* running on the discrete card with the proprietary DDX. In this case, we +* really don't want to print the warning because it just confuses users. +* As a heuristic to detect this case, we check for a couple of proprietary +* X11 extensions. 
+*/ + amd_cookie = xcb_query_extension(conn, 11, "ATIFGLRXDRI"); + nv_cookie = xcb_query_extension(conn, 10, "NV-CONTROL"); + dri3_reply = xcb_query_extension_reply(conn, dri3_cookie, NULL); pres_reply = xcb_query_extension_reply(conn, pres_cookie, NULL); - if (dri3_reply == NULL || pres_reply == NULL) { + amd_reply = xcb_query_extension_reply(conn, amd_cookie, NULL); + nv_reply = xcb_query_extension_reply(conn, nv_cookie, NULL); + if (!dri3_reply || !pres_reply) { free(dri3_reply); free(pres_reply); + free(amd_reply); + free(nv_reply); vk_free(alloc, wsi_conn); return NULL; } wsi_conn->has_dri3 = dri3_reply->present != 0; wsi_conn->has_present = pres_reply->present != 0; + wsi_conn->is_proprietary_x11 = false; + if (amd_reply && amd_reply->present) + wsi_conn->is_proprietary_x11 = true; + if (nv_reply && nv_reply->present) + wsi_conn->is_proprietary_x11 = true; free(dri3_reply); free(pres_reply); + free(amd_reply); + free(nv_reply); return wsi_conn; } @@ -100,6 +124,18 @@ wsi_x11_connection_destroy(const VkAllocationCallbacks *alloc, vk_free(alloc, conn); } +static bool +wsi_x11_check_for_dri3(struct wsi_x11_connection *wsi_conn) +{ + if (wsi_conn->has_dri3) +return true; + if (!wsi_conn->is_proprietary_x11) { +fprintf(stderr, "vulkan: No DRI3 support detected - required for presentation\n" +"Note: you can probably enable DRI3 in your Xorg config\n"); + } + return false; +} + static struct wsi_x11_connection * wsi_x11_get_connection(struct wsi_device *wsi_dev, const VkAllocationCallbacks *alloc, @@ -264,11 +300,8 @@ VkBool32 wsi_get_physical_device_xcb_presentation_support( if (!wsi_conn) return false; - if (!wsi_conn->has_dri3) { - fprintf(stderr, "vulkan: No DRI3 support detected - required for presentation\n"); - fprintf(stderr, "Note: Buggy applications may crash, if they do please report to vendor\n"); + if (!wsi_x11_check_for_dri3(wsi_conn)) return false; - } unsigned visual_depth; if (!connection_get_visualtype(connection, visual_id, &visual_depth)) @@ -313,9 +346,7 @@ x11_surface_get_support(VkIcdSurfaceBase *icd_surface, if (!wsi_conn) return VK_ERROR_OUT_OF_HOST_MEMORY; - if (!wsi_conn->has_dri3) { - fprintf(stderr, "vulkan: No DRI3 support detected - required for presentation\n"); - fprintf(stderr, "Note: Buggy applications may crash, if they do please report to vendor\n"); + if (!wsi_x11_check_for_dri3(wsi_conn)) { *pSupported = false; return VK_SUCCESS; } -- 2.7.4
Re: [Mesa-dev] [PATCH] vulkan/wsi: Improve the DRI3 error message
Just to double check, is there anything else I need to do to have this patch committed? Jacob Lifshay On Feb 19, 2017 02:08, "Kai Wasserbäch" wrote: > Jason Ekstrand wrote on 19.02.2017 06:01: > > On Feb 18, 2017 12:37 PM, "Kai Wasserbäch" > > wrote: > > > > Hey Jacob, > > sorry for not spotting this the first time, but I have an additional > > comment. > > Please see below. > > > > Jacob Lifshay wrote on 18.02.2017 18:48:> This commit improves the > message > > by > > telling them that they could probably > >> enable DRI3. More importantly, it includes a little heuristic to check > >> to see if we're running on AMD or NVIDIA's proprietary X11 drivers and, > >> if we are, doesn't emit the warning. This way, users with both a > discrete > >> card and Intel graphics don't get the warning when they're just running > >> on the discrete card. > >> > >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99715 > >> Co-authored-by: Jason Ekstrand > >> --- > >> src/vulkan/wsi/wsi_common_x11.c | 47 ++ > > ++- > >> 1 file changed, 37 insertions(+), 10 deletions(-) > >> > >> diff --git a/src/vulkan/wsi/wsi_common_x11.c > b/src/vulkan/wsi/wsi_common_ > > x11.c > >> index 64ba921..b3a017a 100644 > >> --- a/src/vulkan/wsi/wsi_common_x11.c > >> +++ b/src/vulkan/wsi/wsi_common_x11.c > >> @@ -49,6 +49,7 @@ > >> struct wsi_x11_connection { > >> bool has_dri3; > >> bool has_present; > >> + bool is_proprietary_x11; > >> }; > >> > >> struct wsi_x11 { > >> @@ -63,8 +64,8 @@ static struct wsi_x11_connection * > >> wsi_x11_connection_create(const VkAllocationCallbacks *alloc, > >>xcb_connection_t *conn) > >> { > >> - xcb_query_extension_cookie_t dri3_cookie, pres_cookie; > >> - xcb_query_extension_reply_t *dri3_reply, *pres_reply; > >> + xcb_query_extension_cookie_t dri3_cookie, pres_cookie, amd_cookie, > > nv_cookie; > >> + xcb_query_extension_reply_t *dri3_reply, *pres_reply, *amd_reply, > > *nv_reply; > >> > >> struct wsi_x11_connection *wsi_conn = > >>vk_alloc(alloc, sizeof(*wsi_conn), 8, > >> @@ -75,20 +76,39 @@ wsi_x11_connection_create(const > VkAllocationCallbacks > > *alloc, > >> dri3_cookie = xcb_query_extension(conn, 4, "DRI3"); > >> pres_cookie = xcb_query_extension(conn, 7, "PRESENT"); > >> > >> + /* We try to be nice to users and emit a warning if they try to use > a > >> +* Vulkan application on a system without DRI3 enabled. However, > > this ends > >> +* up spewing the warning when a user has, for example, both Intel > >> +* integrated graphics and a discrete card with proprietary driers > > and are > >> +* running on the discrete card with the proprietary DDX. In this > > case, we > >> +* really don't want to print the warning because it just confuses > > users. > >> +* As a heuristic to detect this case, we check for a couple of > > proprietary > >> +* X11 extensions. > >> +*/ > >> + amd_cookie = xcb_query_extension(conn, 11, "ATIFGLRXDRI"); > >> + nv_cookie = xcb_query_extension(conn, 10, "NV-CONTROL"); > >> + > >> dri3_reply = xcb_query_extension_reply(conn, dri3_cookie, NULL); > >> pres_reply = xcb_query_extension_reply(conn, pres_cookie, NULL); > >> - if (dri3_reply == NULL || pres_reply == NULL) { > >> + amd_reply = xcb_query_extension_reply(conn, amd_cookie, NULL); > >> + nv_reply = xcb_query_extension_reply(conn, nv_cookie, NULL); > >> + if (!dri3_reply || !pres_reply || !amd_reply || !nv_reply) { > > > > I don't feel wsi_x11_connection_create should fail if there's no > amd_reply > > or > > nv_reply. 
That should just lead to unconditionally warning, in case > there's > > no > > DRI3 support. > > > > > > Of there is no reply then we either lost our connection to the X server > or > > ran out of memory. Either of those seem like a valid excuse to fail. > The > > chances of successfully connecting to X to create a swapchain at that > point > > is pretty close to zero. > > Fair enough. > > > With that fixed, this patch is > > Reviewed-by: Kai Wasserbäch >
[Mesa-dev] [PATCH] vulkan/wsi: Improve the DRI3 error message
This commit improves the message by telling them that they could probably enable DRI3. More importantly, it includes a little heuristic to check to see if we're running on AMD or NVIDIA's proprietary X11 drivers and, if we are, doesn't emit the warning. This way, users with both a discrete card and Intel graphics don't get the warning when they're just running on the discrete card. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99715 Co-authored-by: Jason Ekstrand Reviewed-by: Kai Wasserbäch --- src/vulkan/wsi/wsi_common_x11.c | 51 + 1 file changed, 41 insertions(+), 10 deletions(-) diff --git a/src/vulkan/wsi/wsi_common_x11.c b/src/vulkan/wsi/wsi_common_x11.c index 64ba921..a6fd094 100644 --- a/src/vulkan/wsi/wsi_common_x11.c +++ b/src/vulkan/wsi/wsi_common_x11.c @@ -49,6 +49,7 @@ struct wsi_x11_connection { bool has_dri3; bool has_present; + bool is_proprietary_x11; }; struct wsi_x11 { @@ -63,8 +64,8 @@ static struct wsi_x11_connection * wsi_x11_connection_create(const VkAllocationCallbacks *alloc, xcb_connection_t *conn) { - xcb_query_extension_cookie_t dri3_cookie, pres_cookie; - xcb_query_extension_reply_t *dri3_reply, *pres_reply; + xcb_query_extension_cookie_t dri3_cookie, pres_cookie, amd_cookie, nv_cookie; + xcb_query_extension_reply_t *dri3_reply, *pres_reply, *amd_reply, *nv_reply; struct wsi_x11_connection *wsi_conn = vk_alloc(alloc, sizeof(*wsi_conn), 8, @@ -75,20 +76,43 @@ wsi_x11_connection_create(const VkAllocationCallbacks *alloc, dri3_cookie = xcb_query_extension(conn, 4, "DRI3"); pres_cookie = xcb_query_extension(conn, 7, "PRESENT"); + /* We try to be nice to users and emit a warning if they try to use a +* Vulkan application on a system without DRI3 enabled. However, this ends +* up spewing the warning when a user has, for example, both Intel +* integrated graphics and a discrete card with proprietary drivers and are +* running on the discrete card with the proprietary DDX. In this case, we +* really don't want to print the warning because it just confuses users. +* As a heuristic to detect this case, we check for a couple of proprietary +* X11 extensions. 
+*/ + amd_cookie = xcb_query_extension(conn, 11, "ATIFGLRXDRI"); + nv_cookie = xcb_query_extension(conn, 10, "NV-CONTROL"); + dri3_reply = xcb_query_extension_reply(conn, dri3_cookie, NULL); pres_reply = xcb_query_extension_reply(conn, pres_cookie, NULL); - if (dri3_reply == NULL || pres_reply == NULL) { + amd_reply = xcb_query_extension_reply(conn, amd_cookie, NULL); + nv_reply = xcb_query_extension_reply(conn, nv_cookie, NULL); + if (!dri3_reply || !pres_reply) { free(dri3_reply); free(pres_reply); + free(amd_reply); + free(nv_reply); vk_free(alloc, wsi_conn); return NULL; } wsi_conn->has_dri3 = dri3_reply->present != 0; wsi_conn->has_present = pres_reply->present != 0; + wsi_conn->is_proprietary_x11 = false; + if (amd_reply && amd_reply->present) + wsi_conn->is_proprietary_x11 = true; + if (nv_reply && nv_reply->present) + wsi_conn->is_proprietary_x11 = true; free(dri3_reply); free(pres_reply); + free(amd_reply); + free(nv_reply); return wsi_conn; } @@ -100,6 +124,18 @@ wsi_x11_connection_destroy(const VkAllocationCallbacks *alloc, vk_free(alloc, conn); } +static bool +wsi_x11_check_for_dri3(struct wsi_x11_connection *wsi_conn) +{ + if (wsi_conn->has_dri3) +return true; + if (!wsi_conn->is_proprietary_x11) { +fprintf(stderr, "vulkan: No DRI3 support detected - required for presentation\n"); +"Note: you can probably enable DRI3 in your Xorg config\n"); + } + return false; +} + static struct wsi_x11_connection * wsi_x11_get_connection(struct wsi_device *wsi_dev, const VkAllocationCallbacks *alloc, @@ -264,11 +300,8 @@ VkBool32 wsi_get_physical_device_xcb_presentation_support( if (!wsi_conn) return false; - if (!wsi_conn->has_dri3) { - fprintf(stderr, "vulkan: No DRI3 support detected - required for presentation\n"); - fprintf(stderr, "Note: Buggy applications may crash, if they do please report to vendor\n"); + if (!wsi_x11_check_for_dri3(wsi_conn)) return false; - } unsigned visual_depth; if (!connection_get_visualtype(connection, visual_id, &visual_depth)) @@ -313,9 +346,7 @@ x11_surface_get_support(VkIcdSurfaceBase *icd_surface, if (!wsi_conn) return VK_ERROR_OUT_OF_HOST_MEMORY; - if (!wsi_conn->has_dri3) { - fprintf(stderr, "vulkan: No DRI3 support detected - required for presentation\n"); - fprintf(stderr, "Note: Buggy applications may crash, if they do please report to vendor\n"); + if (!wsi_x11_check_for_dri3(wsi_conn)) { *pSupported = false; return VK_SUCCESS; } -- 2.7.4 _
[Mesa-dev] [PATCH] vulkan/wsi: Improve the DRI3 error message
This commit improves the message by telling them that they could probably enable DRI3. More importantly, it includes a little heuristic to check to see if we're running on AMD or NVIDIA's proprietary X11 drivers and, if we are, doesn't emit the warning. This way, users with both a discrete card and Intel graphics don't get the warning when they're just running on the discrete card. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99715 Co-authored-by: Jason Ekstrand Reviewed-by: Kai Wasserbäch --- src/vulkan/wsi/wsi_common_x11.c | 51 + 1 file changed, 41 insertions(+), 10 deletions(-) diff --git a/src/vulkan/wsi/wsi_common_x11.c b/src/vulkan/wsi/wsi_common_x11.c index 64ba921..e4906f1 100644 --- a/src/vulkan/wsi/wsi_common_x11.c +++ b/src/vulkan/wsi/wsi_common_x11.c @@ -49,6 +49,7 @@ struct wsi_x11_connection { bool has_dri3; bool has_present; + bool is_proprietary_x11; }; struct wsi_x11 { @@ -63,8 +64,8 @@ static struct wsi_x11_connection * wsi_x11_connection_create(const VkAllocationCallbacks *alloc, xcb_connection_t *conn) { - xcb_query_extension_cookie_t dri3_cookie, pres_cookie; - xcb_query_extension_reply_t *dri3_reply, *pres_reply; + xcb_query_extension_cookie_t dri3_cookie, pres_cookie, amd_cookie, nv_cookie; + xcb_query_extension_reply_t *dri3_reply, *pres_reply, *amd_reply, *nv_reply; struct wsi_x11_connection *wsi_conn = vk_alloc(alloc, sizeof(*wsi_conn), 8, @@ -75,20 +76,43 @@ wsi_x11_connection_create(const VkAllocationCallbacks *alloc, dri3_cookie = xcb_query_extension(conn, 4, "DRI3"); pres_cookie = xcb_query_extension(conn, 7, "PRESENT"); + /* We try to be nice to users and emit a warning if they try to use a +* Vulkan application on a system without DRI3 enabled. However, this ends +* up spewing the warning when a user has, for example, both Intel +* integrated graphics and a discrete card with proprietary driers and are +* running on the discrete card with the proprietary DDX. In this case, we +* really don't want to print the warning because it just confuses users. +* As a heuristic to detect this case, we check for a couple of proprietary +* X11 extensions. 
+*/ + amd_cookie = xcb_query_extension(conn, 11, "ATIFGLRXDRI"); + nv_cookie = xcb_query_extension(conn, 10, "NV-CONTROL"); + dri3_reply = xcb_query_extension_reply(conn, dri3_cookie, NULL); pres_reply = xcb_query_extension_reply(conn, pres_cookie, NULL); - if (dri3_reply == NULL || pres_reply == NULL) { + amd_reply = xcb_query_extension_reply(conn, amd_cookie, NULL); + nv_reply = xcb_query_extension_reply(conn, nv_cookie, NULL); + if (!dri3_reply || !pres_reply) { free(dri3_reply); free(pres_reply); + free(amd_reply); + free(nv_reply); vk_free(alloc, wsi_conn); return NULL; } wsi_conn->has_dri3 = dri3_reply->present != 0; wsi_conn->has_present = pres_reply->present != 0; + wsi_conn->is_proprietary_x11 = false; + if (amd_reply && amd_reply->present) + wsi_conn->is_proprietary_x11 = true; + if (nv_reply && nv_reply->present) + wsi_conn->is_proprietary_x11 = true; free(dri3_reply); free(pres_reply); + free(amd_reply); + free(nv_reply); return wsi_conn; } @@ -100,6 +124,18 @@ wsi_x11_connection_destroy(const VkAllocationCallbacks *alloc, vk_free(alloc, conn); } +static bool +wsi_x11_check_for_dri3(struct wsi_x11_connection *wsi_conn) +{ + if (wsi_conn->has_dri3) +return true; + if (!wsi_conn->is_proprietary_x11) { +fprintf(stderr, "vulkan: No DRI3 support detected - required for presentation\n"); +"Note: you can probably enable DRI3 in your Xorg config\n"); + } + return false; +} + static struct wsi_x11_connection * wsi_x11_get_connection(struct wsi_device *wsi_dev, const VkAllocationCallbacks *alloc, @@ -264,11 +300,8 @@ VkBool32 wsi_get_physical_device_xcb_presentation_support( if (!wsi_conn) return false; - if (!wsi_conn->has_dri3) { - fprintf(stderr, "vulkan: No DRI3 support detected - required for presentation\n"); - fprintf(stderr, "Note: Buggy applications may crash, if they do please report to vendor\n"); + if (!wsi_x11_check_for_dri3(wsi_conn)) return false; - } unsigned visual_depth; if (!connection_get_visualtype(connection, visual_id, &visual_depth)) @@ -313,9 +346,7 @@ x11_surface_get_support(VkIcdSurfaceBase *icd_surface, if (!wsi_conn) return VK_ERROR_OUT_OF_HOST_MEMORY; - if (!wsi_conn->has_dri3) { - fprintf(stderr, "vulkan: No DRI3 support detected - required for presentation\n"); - fprintf(stderr, "Note: Buggy applications may crash, if they do please report to vendor\n"); + if (!wsi_x11_check_for_dri3(wsi_conn)) { *pSupported = false; return VK_SUCCESS; } -- 2.7.4 __
[Mesa-dev] [PATCH] vulkan/wsi: Improve the DRI3 error message
This commit improves the message by telling them that they could probably enable DRI3. More importantly, it includes a little heuristic to check to see if we're running on AMD or NVIDIA's proprietary X11 drivers and, if we are, doesn't emit the warning. This way, users with both a discrete card and Intel graphics don't get the warning when they're just running on the discrete card. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99715 Co-authored-by: Jason Ekstrand --- src/vulkan/wsi/wsi_common_x11.c | 47 - 1 file changed, 37 insertions(+), 10 deletions(-) diff --git a/src/vulkan/wsi/wsi_common_x11.c b/src/vulkan/wsi/wsi_common_x11.c index 64ba921..b3a017a 100644 --- a/src/vulkan/wsi/wsi_common_x11.c +++ b/src/vulkan/wsi/wsi_common_x11.c @@ -49,6 +49,7 @@ struct wsi_x11_connection { bool has_dri3; bool has_present; + bool is_proprietary_x11; }; struct wsi_x11 { @@ -63,8 +64,8 @@ static struct wsi_x11_connection * wsi_x11_connection_create(const VkAllocationCallbacks *alloc, xcb_connection_t *conn) { - xcb_query_extension_cookie_t dri3_cookie, pres_cookie; - xcb_query_extension_reply_t *dri3_reply, *pres_reply; + xcb_query_extension_cookie_t dri3_cookie, pres_cookie, amd_cookie, nv_cookie; + xcb_query_extension_reply_t *dri3_reply, *pres_reply, *amd_reply, *nv_reply; struct wsi_x11_connection *wsi_conn = vk_alloc(alloc, sizeof(*wsi_conn), 8, @@ -75,20 +76,39 @@ wsi_x11_connection_create(const VkAllocationCallbacks *alloc, dri3_cookie = xcb_query_extension(conn, 4, "DRI3"); pres_cookie = xcb_query_extension(conn, 7, "PRESENT"); + /* We try to be nice to users and emit a warning if they try to use a +* Vulkan application on a system without DRI3 enabled. However, this ends +* up spewing the warning when a user has, for example, both Intel +* integrated graphics and a discrete card with proprietary driers and are +* running on the discrete card with the proprietary DDX. In this case, we +* really don't want to print the warning because it just confuses users. +* As a heuristic to detect this case, we check for a couple of proprietary +* X11 extensions. 
+*/ + amd_cookie = xcb_query_extension(conn, 11, "ATIFGLRXDRI"); + nv_cookie = xcb_query_extension(conn, 10, "NV-CONTROL"); + dri3_reply = xcb_query_extension_reply(conn, dri3_cookie, NULL); pres_reply = xcb_query_extension_reply(conn, pres_cookie, NULL); - if (dri3_reply == NULL || pres_reply == NULL) { + amd_reply = xcb_query_extension_reply(conn, amd_cookie, NULL); + nv_reply = xcb_query_extension_reply(conn, nv_cookie, NULL); + if (!dri3_reply || !pres_reply || !amd_reply || !nv_reply) { free(dri3_reply); free(pres_reply); + free(amd_reply); + free(nv_reply); vk_free(alloc, wsi_conn); return NULL; } wsi_conn->has_dri3 = dri3_reply->present != 0; wsi_conn->has_present = pres_reply->present != 0; + wsi_conn->is_proprietary_x11 = amd_reply->present || nv_reply->present; free(dri3_reply); free(pres_reply); + free(amd_reply); + free(nv_reply); return wsi_conn; } @@ -100,6 +120,18 @@ wsi_x11_connection_destroy(const VkAllocationCallbacks *alloc, vk_free(alloc, conn); } +static bool +wsi_x11_check_for_dri3(struct wsi_x11_connection *wsi_conn) +{ + if (wsi_conn->has_dri3) +return true; + if (!wsi_conn->is_proprietary_x11) { +fprintf(stderr, "vulkan: No DRI3 support detected - required for presentation\n" +"Note: you can probably enable DRI3 in your Xorg config\n"); + } + return false; +} + static struct wsi_x11_connection * wsi_x11_get_connection(struct wsi_device *wsi_dev, const VkAllocationCallbacks *alloc, @@ -264,11 +296,8 @@ VkBool32 wsi_get_physical_device_xcb_presentation_support( if (!wsi_conn) return false; - if (!wsi_conn->has_dri3) { - fprintf(stderr, "vulkan: No DRI3 support detected - required for presentation\n"); - fprintf(stderr, "Note: Buggy applications may crash, if they do please report to vendor\n"); + if (!wsi_x11_check_for_dri3(wsi_conn)) return false; - } unsigned visual_depth; if (!connection_get_visualtype(connection, visual_id, &visual_depth)) @@ -313,9 +342,7 @@ x11_surface_get_support(VkIcdSurfaceBase *icd_surface, if (!wsi_conn) return VK_ERROR_OUT_OF_HOST_MEMORY; - if (!wsi_conn->has_dri3) { - fprintf(stderr, "vulkan: No DRI3 support detected - required for presentation\n"); - fprintf(stderr, "Note: Buggy applications may crash, if they do please report to vendor\n"); + if (!wsi_x11_check_for_dri3(wsi_conn)) { *pSupported = false; return VK_SUCCESS; } -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] vulkan/wsi: Improve the DRI3 error message
This commit improves the message by telling users that they can probably enable DRI3 and giving a URL to an Ask Ubuntu question showing how to do that. More importantly, it includes a little heuristic to check to see if we're running on AMD or NVIDIA's proprietary X11 drivers and, if we are, doesn't emit the warning. This way, users with both a discrete card and Intel graphics don't get the warning when they're just running on the discrete card. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99715 Co-authored-by: Jason Ekstrand --- src/vulkan/wsi/wsi_common_x11.c | 55 + 1 file changed, 45 insertions(+), 10 deletions(-) diff --git a/src/vulkan/wsi/wsi_common_x11.c b/src/vulkan/wsi/wsi_common_x11.c index 64ba921..ac7a972 100644 --- a/src/vulkan/wsi/wsi_common_x11.c +++ b/src/vulkan/wsi/wsi_common_x11.c @@ -48,7 +48,9 @@ struct wsi_x11_connection { bool has_dri3; + bool has_dri2; bool has_present; + bool is_proprietary_x11; }; struct wsi_x11 { @@ -63,8 +65,8 @@ static struct wsi_x11_connection * wsi_x11_connection_create(const VkAllocationCallbacks *alloc, xcb_connection_t *conn) { - xcb_query_extension_cookie_t dri3_cookie, pres_cookie; - xcb_query_extension_reply_t *dri3_reply, *pres_reply; + xcb_query_extension_cookie_t dri3_cookie, dri2_cookie, pres_cookie, amd_cookie, nv_cookie; + xcb_query_extension_reply_t *dri3_reply, *dri2_reply, *pres_reply, *amd_reply, *nv_reply; struct wsi_x11_connection *wsi_conn = vk_alloc(alloc, sizeof(*wsi_conn), 8, @@ -73,22 +75,46 @@ wsi_x11_connection_create(const VkAllocationCallbacks *alloc, return NULL; dri3_cookie = xcb_query_extension(conn, 4, "DRI3"); + dri2_cookie = xcb_query_extension(conn, 4, "DRI2"); pres_cookie = xcb_query_extension(conn, 7, "PRESENT"); + /* We try to be nice to users and emit a warning if they try to use a +* Vulkan application on a system without DRI3 enabled. However, this ends +* up spewing the warning when a user has, for example, both Intel +* integrated graphics and a discrete card with proprietary drivers and is +* running on the discrete card with the proprietary DDX. In this case, we +* really don't want to print the warning because it just confuses users. +* As a heuristic to detect this case, we check for a couple of proprietary +* X11 extensions.
+*/ + amd_cookie = xcb_query_extension(conn, 11, "ATIFGLRXDRI"); + nv_cookie = xcb_query_extension(conn, 10, "NV-CONTROL"); + dri3_reply = xcb_query_extension_reply(conn, dri3_cookie, NULL); + dri2_reply = xcb_query_extension_reply(conn, dri2_cookie, NULL); pres_reply = xcb_query_extension_reply(conn, pres_cookie, NULL); - if (dri3_reply == NULL || pres_reply == NULL) { + amd_reply = xcb_query_extension_reply(conn, amd_cookie, NULL); + nv_reply = xcb_query_extension_reply(conn, nv_cookie, NULL); + if (!dri3_reply || !dri2_reply || !pres_reply || !amd_reply || !nv_reply) { free(dri3_reply); + free(dri2_reply); free(pres_reply); + free(amd_reply); + free(nv_reply); vk_free(alloc, wsi_conn); return NULL; } wsi_conn->has_dri3 = dri3_reply->present != 0; + wsi_conn->has_dri2 = dri2_reply->present != 0; wsi_conn->has_present = pres_reply->present != 0; + wsi_conn->is_proprietary_x11 = amd_reply->present || nv_reply->present; free(dri3_reply); + free(dri2_reply); free(pres_reply); + free(amd_reply); + free(nv_reply); return wsi_conn; } @@ -100,6 +126,20 @@ wsi_x11_connection_destroy(const VkAllocationCallbacks *alloc, vk_free(alloc, conn); } +static bool +wsi_x11_check_for_dri3(struct wsi_x11_connection *wsi_conn) +{ + if (wsi_conn->has_dri3) +return true; + if (!wsi_conn->is_proprietary_x11) { +fprintf(stderr, "vulkan: No DRI3 support detected - required for presentation\n"); +if (wsi_conn->has_dri2) + fprintf(stderr, "Note: DRI2 support detected, you can probably enable DRI3 in your Xorg config;\n" + " see http://askubuntu.com/questions/817226/how-to-enable-dri3-on-ubuntu-16-04\n"); + } + return false; +} + static struct wsi_x11_connection * wsi_x11_get_connection(struct wsi_device *wsi_dev, const VkAllocationCallbacks *alloc, @@ -264,11 +304,8 @@ VkBool32 wsi_get_physical_device_xcb_presentation_support( if (!wsi_conn) return false; - if (!wsi_conn->has_dri3) { - fprintf(stderr, "vulkan: No DRI3 support detected - required for presentation\n"); - fprintf(stderr, "Note: Buggy applications may crash, if they do please report to vendor\n"); + if (!wsi_x11_check_for_dri3(wsi_conn)) return false; - } unsigned visual_depth; if (!connection_get_visualtype(connection, visual_id, &visual_depth)) @@ -313,9 +350,7 @@ x11_surface_get_support(VkIcdSurfaceBase *icd_surface, if (!wsi_conn)
Re: [Mesa-dev] software implementation of vulkan for gsoc/evoc
On Feb 14, 2017 12:18 AM, "Nicolai Hähnle" wrote: On 13.02.2017 17:54, Jacob Lifshay wrote: > the algorithm i was going to use would get the union of the sets of live > variables at the barriers (union over barriers), create an array of > structs that holds them all, then for each barrier, insert the code to > store all live variables, then end the for loop over tid_in_workgroup, > then run the memory barrier, then start another for loop over > tid_in_workgroup, then load all live variables. > Okay, sounds reasonable in theory. There are some issues, like: how do you actually determine live variables? If you're working off TGSI like llvmpipe does today, you'd need to write your own analysis for that, but in a structured control flow graph like TGSI has, that shouldn't be too difficult. I was planning on using the spir-v to llvm translator and never using tgsi. I could implement the pass using llvm coroutines, however, I'd need to have several additional passes to convert the output; it might not optimize all the way because we would have the switch on the suspend point index still left. Also, according to the docs from llvm trunk, llvm doesn't support reducing the space required by using the minimum size needed to store all the live variables at the suspend point with the largest space requirements, instead, it allocates separate space for each variable at each suspend point: http://llvm.org/docs/Coroutines.html#areas-requiring-attention I'd still recommend you to at least seriously read through the LLVM coroutine stuff. Cheers, Nicolai Jacob Lifshay > > On Feb 13, 2017 08:45, "Nicolai Hähnle" <mailto:nhaeh...@gmail.com>> wrote: > > [ re-adding mesa-dev on the assumption that it got dropped by accident > ] > > On 13.02.2017 17:27, Jacob Lifshay wrote: > > I would start a thread for each cpu, then have each > thread run the > compute shader a number of times instead of having a > thread per > shader > invocation. > > > This will not work. > > Please, read again what the barrier() instruction does: When > the > barrier() call is reached, _all_ threads within the > workgroup are > supposed to be run until they reach that barrier() call. > > > to clarify, I had meant that each os thread would run the > sections of > the shader between the barriers for all the shaders in a work > group, > then, when it finished the work group, it would go to the next work > group assigned to the os thread. > > so, if our shader is: > a = b + tid; > barrier(); > d = e + f; > > and our simd width is 4, our work-group size is 128, and we have > 16 os > threads, then it will run for each os thread: > for(workgroup = os_thread_index; workgroup < workgroup_count; > workgroup++) > { > for(tid_in_workgroup = 0; tid_in_workgroup < 128; > tid_in_workgroup += 4) > { > ivec4 tid = ivec4(0, 1, 2, 3) + ivec4(tid_in_workgroup + > workgroup * 128); > a[tid_in_workgroup / 4] = ivec_add(b[tid_in_workgroup / > 4], tid); > } > memory_fence(); // if needed > for(tid_in_workgroup = 0; tid_in_workgroup < 128; > tid_in_workgroup += 4) > { > d[tid_in_workgroup / 4] = vec_add(e[tid_in_workgroup / 4], > f[tid_in_workgroup / 4]); > } > } > // after this, we run the next rendering or compute job > > > Okay good, that's the right concept. > > Actually doing that is not at all straightforward though: consider > that the barrier() might occur inside a loop in the shader. > > So if you implemented that within the framework of llvmpipe, you'd > make a lot of people very happy: it would allow finally adding > compute shader support to llvmpipe. 
Mind you, that in itself would > already be a pretty decent-sized project for GSoC! > > Cheers, > Nicolai > > Jacob Lifshay ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
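A rough C-level sketch of the store/reload transform described in the message above, with the earlier "a = b + tid; barrier(); d = e + f" example adjusted so that one value ('a') is actually live across the barrier. Everything here is illustrative: the names are made up, and the real pass would operate on llvm ir rather than C source:

struct barrier_live_vars { int a; };  /* union of variables live at any barrier */

static void
run_workgroup(int workgroup_index, int workgroup_size,
              const int *b, const int *e, int *d,
              struct barrier_live_vars *spill /* one entry per invocation */)
{
   /* first half of the shader, up to the barrier */
   for (int tid_in_workgroup = 0; tid_in_workgroup < workgroup_size; tid_in_workgroup++) {
      int tid = workgroup_index * workgroup_size + tid_in_workgroup;
      int a = b[tid] + tid;
      spill[tid_in_workgroup].a = a;  /* store everything live across the barrier */
   }
   /* barrier(): every invocation in the workgroup has reached this point */
   for (int tid_in_workgroup = 0; tid_in_workgroup < workgroup_size; tid_in_workgroup++) {
      int tid = workgroup_index * workgroup_size + tid_in_workgroup;
      int a = spill[tid_in_workgroup].a;  /* reload live variables */
      d[tid] = e[tid] + a;                /* second half of the shader */
   }
}

Because the struct holds only the union of what is live at the barriers, it avoids the per-suspend-point duplication that the llvm coroutine lowering is documented to have, which is the space concern raised above.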
Re: [Mesa-dev] software implementation of vulkan for gsoc/evoc
the algorithm i was going to use would get the union of the sets of live variables at the barriers (union over barriers), create an array of structs that holds them all, then for each barrier, insert the code to store all live variables, then end the for loop over tid_in_workgroup, then run the memory barrier, then start another for loop over tid_in_workgroup, then load all live variables. Jacob Lifshay On Feb 13, 2017 08:45, "Nicolai Hähnle" wrote: > [ re-adding mesa-dev on the assumption that it got dropped by accident ] > > On 13.02.2017 17:27, Jacob Lifshay wrote: > >> I would start a thread for each cpu, then have each thread run the >> compute shader a number of times instead of having a thread per >> shader >> invocation. >> >> >> This will not work. >> >> Please, read again what the barrier() instruction does: When the >> barrier() call is reached, _all_ threads within the workgroup are >> supposed to be run until they reach that barrier() call. >> >> >> to clarify, I had meant that each os thread would run the sections of >> the shader between the barriers for all the shaders in a work group, >> then, when it finished the work group, it would go to the next work >> group assigned to the os thread. >> >> so, if our shader is: >> a = b + tid; >> barrier(); >> d = e + f; >> >> and our simd width is 4, our work-group size is 128, and we have 16 os >> threads, then it will run for each os thread: >> for(workgroup = os_thread_index; workgroup < workgroup_count; workgroup++) >> { >> for(tid_in_workgroup = 0; tid_in_workgroup < 128; tid_in_workgroup += >> 4) >> { >> ivec4 tid = ivec4(0, 1, 2, 3) + ivec4(tid_in_workgroup + >> workgroup * 128); >> a[tid_in_workgroup / 4] = ivec_add(b[tid_in_workgroup / 4], tid); >> } >> memory_fence(); // if needed >> for(tid_in_workgroup = 0; tid_in_workgroup < 128; tid_in_workgroup += >> 4) >> { >> d[tid_in_workgroup / 4] = vec_add(e[tid_in_workgroup / 4], >> f[tid_in_workgroup / 4]); >> } >> } >> // after this, we run the next rendering or compute job >> > > Okay good, that's the right concept. > > Actually doing that is not at all straightforward though: consider that > the barrier() might occur inside a loop in the shader. > > So if you implemented that within the framework of llvmpipe, you'd make a > lot of people very happy: it would allow finally adding compute shader > support to llvmpipe. Mind you, that in itself would already be a pretty > decent-sized project for GSoC! > > Cheers, > Nicolai > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] software implementation of vulkan for gsoc/evoc
forgot to add mesa-dev when I sent (again). -- Forwarded message -- From: "Jacob Lifshay" Date: Feb 13, 2017 8:27 AM Subject: Re: [Mesa-dev] software implementation of vulkan for gsoc/evoc To: "Nicolai Hähnle" Cc: > > On Feb 13, 2017 7:54 AM, "Nicolai Hähnle" wrote: > > On 13.02.2017 03:17, Jacob Lifshay wrote: > >> On Feb 12, 2017 5:34 PM, "Dave Airlie" > <mailto:airl...@gmail.com>> wrote: >> >> > I'm assuming that control barriers in Vulkan are identical to >> barriers >> > across a work-group in opencl. I was going to have a work-group be >> a single >> > OS thread, with the different work-items mapped to SIMD lanes. If >> we need to >> > have additional scheduling, I have written a javascript compiler >> that >> > supports generator functions, so I mostly know how to write a llvm >> pass to >> > implement that. I was planning on writing the shader compiler >> using llvm, >> > using the whole-function-vectorization pass I will write, and >> using the >> > pre-existing spir-v to llvm translation layer. I would also write >> some llvm >> > passes to translate from texture reads and stuff to basic vector >> ops. >> >> Well the problem is number of work-groups that gets launched could be >> quite high, and this can cause a large overhead in number of host >> threads >> that have to be launched. There was some discussion on this in >> mesa-dev >> archives back when I added softpipe compute shaders. >> >> >> I would start a thread for each cpu, then have each thread run the >> compute shader a number of times instead of having a thread per shader >> invocation. >> > > This will not work. > > Please, read again what the barrier() instruction does: When the barrier() > call is reached, _all_ threads within the workgroup are supposed to be run > until they reach that barrier() call. > > > to clarify, I had meant that each os thread would run the sections of the > shader between the barriers for all the shaders in a work group, then, when > it finished the work group, it would go to the next work group assigned to > the os thread. > > so, if our shader is: > a = b + tid; > barrier(); > d = e + f; > > and our simd width is 4, our work-group size is 128, and we have 16 os > threads, then it will run for each os thread: > for(workgroup = os_thread_index; workgroup < workgroup_count; workgroup++) > { > for(tid_in_workgroup = 0; tid_in_workgroup < 128; tid_in_workgroup += > 4) > { > ivec4 tid = ivec4(0, 1, 2, 3) + ivec4(tid_in_workgroup + workgroup > * 128); > a[tid_in_workgroup / 4] = ivec_add(b[tid_in_workgroup / 4], tid); > } > memory_fence(); // if needed > for(tid_in_workgroup = 0; tid_in_workgroup < 128; tid_in_workgroup += > 4) > { > d[tid_in_workgroup / 4] = vec_add(e[tid_in_workgroup / 4], > f[tid_in_workgroup / 4]); > } > } > // after this, we run the next rendering or compute job > > >> > I have a prototype rasterizer, however I haven't implemented >> binning for >> > triangles yet or implemented interpolation. currently, it can handle >> > triangles in 3D homogeneous and calculate edge equations. >> > https://github.com/programmerjake/tiled-renderer >> <https://github.com/programmerjake/tiled-renderer> >> > A previous 3d renderer that doesn't implement any vectorization >> and has >> > opengl 1.x level functionality: >> > https://github.com/programmerjake/lib3d/blob/master/softrender.cpp >> <https://github.com/programmerjake/lib3d/blob/master/softrender.cpp> >> >> Well I think we already have a completely fine rasterizer and binning >> and whatever >> else in the llvmpipe code base. 
I'd much rather any Mesa based >> project doesn't >> throw all of that away, there is no reason the same swrast backend >> couldn't >> be abstracted to be used for both GL and Vulkan and introducing >> another >> just because it's interesting isn't a great fit for long term project >> maintenance.. >> >> If there are improvements to llvmpipe that need to be made, then that >> is something >> to possibly consider, but I'm not sure why a swrast vulkan needs a >> from scratch >> raster implemented. For a project that is so
Re: [Mesa-dev] software implementation of vulkan for gsoc/evoc
forgot to add mesa-dev when I sent. -- Forwarded message -- From: "Jacob Lifshay" Date: Feb 12, 2017 6:16 PM Subject: Re: [Mesa-dev] software implementation of vulkan for gsoc/evoc To: "Dave Airlie" Cc: On Feb 12, 2017 5:34 PM, "Dave Airlie" wrote: > I'm assuming that control barriers in Vulkan are identical to barriers > across a work-group in opencl. I was going to have a work-group be a single > OS thread, with the different work-items mapped to SIMD lanes. If we need to > have additional scheduling, I have written a javascript compiler that > supports generator functions, so I mostly know how to write a llvm pass to > implement that. I was planning on writing the shader compiler using llvm, > using the whole-function-vectorization pass I will write, and using the > pre-existing spir-v to llvm translation layer. I would also write some llvm > passes to translate from texture reads and stuff to basic vector ops. Well the problem is number of work-groups that gets launched could be quite high, and this can cause a large overhead in number of host threads that have to be launched. There was some discussion on this in mesa-dev archives back when I added softpipe compute shaders. I would start a thread for each cpu, then have each thread run the compute shader a number of times instead of having a thread per shader invocation. > I have a prototype rasterizer, however I haven't implemented binning for > triangles yet or implemented interpolation. currently, it can handle > triangles in 3D homogeneous and calculate edge equations. > https://github.com/programmerjake/tiled-renderer > A previous 3d renderer that doesn't implement any vectorization and has > opengl 1.x level functionality: > https://github.com/programmerjake/lib3d/blob/master/softrender.cpp Well I think we already have a completely fine rasterizer and binning and whatever else in the llvmpipe code base. I'd much rather any Mesa based project doesn't throw all of that away, there is no reason the same swrast backend couldn't be abstracted to be used for both GL and Vulkan and introducing another just because it's interesting isn't a great fit for long term project maintenance.. If there are improvements to llvmpipe that need to be made, then that is something to possibly consider, but I'm not sure why a swrast vulkan needs a from scratch raster implemented. For a project that is so large in scope, I'd think reusing that code would be of some use. Since most of the fun stuff is all the texture sampling etc. I actually think implementing the rasterization algorithm is the best part. I wanted the rasterization algorithm to be included in the shaders, eg. triangle setup and binning would be tacked on to the end of the vertex shader and parameter interpolation and early z tests would be tacked on to the beginning of the fragment shader and blending on to the end. That way, llvm could do more specialization and instruction scheduling than is possible in llvmpipe now. so the tile rendering function would essentially be: for(i = 0; i < triangle_count; i+= vector_width) jit_functions[i](tile_x, tile_y, &triangle_setup_results[i]); as opposed to the current llvmpipe code where there is a large amount of fixed code that isn't optimized with the shaders. > The scope that I intended to complete is the bare minimum to be vulkan > conformant (i.e. 
no tessellation and no geometry shaders), so implementing a > loadable ICD for linux and windows that implements a single queue, vertex, > fragment, and compute shaders, implementing events, semaphores, and fences, > implementing images with the minimum requirements, supporting a f32 depth > buffer or a f24 with 8bit stencil, and supporting a yet-to-be-determined > compressed format. For the image optimal layouts, I will probably use the > same chunked layout I use in > https://github.com/programmerjake/tiled-renderer/blob/master2/image.h#L59 , > where I have a linear array of chunks where each chunk has a linear array of > texels. If you think that's too big, we could leave out all of the image > formats except the two depth-stencil formats, the 8-bit and 32-bit integer > and 32-bit float formats. > Seems like a quite large scope, possibly a bit big for a GSoC though, esp one that intends to not use any existing Mesa code. most of the vulkan functions have a simple implementation when we don't need to worry about building stuff for a gpu and synchronization (because we have only one queue), and llvm implements most of the rest of the needed functionality. If we leave out most of the image formats, that would probably cut the amount of code by a third. Dave. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] software implementation of vulkan for gsoc/evoc
I'm assuming that control barriers in Vulkan are identical to barriers across a work-group in opencl. I was going to have a work-group be a single OS thread, with the different work-items mapped to SIMD lanes. If we need to have additional scheduling, I have written a javascript compiler that supports generator functions, so I mostly know how to write a llvm pass to implement that. I was planning on writing the shader compiler using llvm, using the whole-function-vectorization pass I will write, and using the pre-existing spir-v to llvm translation layer. I would also write some llvm passes to translate from texture reads and stuff to basic vector ops. I have a prototype rasterizer, however I haven't implemented binning for triangles yet or implemented interpolation. currently, it can handle triangles in 3D homogeneous and calculate edge equations. https://github.com/programmerjake/tiled-renderer A previous 3d renderer that doesn't implement any vectorization and has opengl 1.x level functionality: https://github.com/programmerjake/lib3d/blob/master/softrender.cpp The scope that I intended to complete is the bare minimum to be vulkan conformant (i.e. no tessellation and no geometry shaders), so implementing a loadable ICD for linux and windows that implements a single queue, vertex, fragment, and compute shaders, implementing events, semaphores, and fences, implementing images with the minimum requirements, supporting a f32 depth buffer or a f24 with 8bit stencil, and supporting a yet-to-be-determined compressed format. For the image optimal layouts, I will probably use the same chunked layout I use in https://github.com/programmerjake/tiled-renderer/blob/master2/image.h#L59 , where I have a linear array of chunks where each chunk has a linear array of texels. If you think that's too big, we could leave out all of the image formats except the two depth-stencil formats, the 8-bit and 32-bit integer and 32-bit float formats. As mentioned by Roland Mainz, I plan to implement it so all state is stored in the VkDevice structure or structures created from VKDevice, so there are no global variables that prevent the library from being completely reentrant. I might have global variables for something like detecting cpu features, but that will be protected by a mutex. Jacob Lifshay On Sun, Feb 12, 2017 at 3:14 PM Dave Airlie wrote: > On 11 February 2017 at 09:03, Jacob Lifshay > wrote: > > I would like to write a software implementation of Vulkan for inclusion > in > > mesa3d. I wanted to use a tiled renderer coupled with llvm and either > write > > or use a whole-function-vectorization pass. Would anyone be willing to > > mentor me for this project? I would probably only need help getting it > > committed, and would be able to do the rest with minimal help. > > So I started writing a vulkan->gallium swrast layer > > https://cgit.freedesktop.org/~airlied/mesa/log/?h=not-a-vulkan-swrast > > with the intention of using it to prove a vulkan swrast driver on top > of llvmpipe eventually. > > This was because I was being too lazy to just rewrite llvmpipe as a > vulkan driver, > and it seemed easier to just write the layer to investigate. The thing > about vulkan is it > already very based around the idea of command streams and parallel > building/execution, > so having the gallium/vulkan layer record a CPU command stream and execute > that > isn't going to be a large an overhead as doing something similiar with > hw drivers. 
> > I got it working with softpipe after adding a bunch of features to > softpipe, however to > get it going with llvmpipe, there would need to be a lot of work on > improving llvmpipe. > > Vulkan really wants images and compute shaders (i.e. it requires > them), and so far we haven't got > image and compute shader support for llvmpipe. There are a few threads > previously on this, > but the main problem with compute shader is getting efficient barriers > working, which needs > some kind of threading model, maybe llvm's coroutine support is useful > for this we won't know > until we try I suppose. > > I'd probably be happy to mentor on the project, but you'd want to > define the scope of it pretty > well, as there is a lot of work to get the non-graphics pieces even if > you are just ripping stuff > out of llvmpipe. > > Dave. > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
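A minimal sketch of the chunked image layout mentioned in the message above (a linear array of chunks, each chunk a linear array of texels). The chunk size, names, and 32-bit texel assumption are illustrative and not taken from the linked repository:

#include <stddef.h>
#include <stdint.h>

#define CHUNK_W 8
#define CHUNK_H 8

struct chunked_image {
   uint32_t width, height;   /* size in texels */
   uint32_t chunks_per_row;  /* (width + CHUNK_W - 1) / CHUNK_W */
   uint32_t *texels;         /* all chunks back to back, each chunk row-major */
};

/* index of texel (x, y) in img->texels */
static inline size_t
chunked_texel_index(const struct chunked_image *img, uint32_t x, uint32_t y)
{
   size_t chunk = (size_t)(y / CHUNK_H) * img->chunks_per_row + x / CHUNK_W;
   size_t within_chunk = (size_t)(y % CHUNK_H) * CHUNK_W + x % CHUNK_W;
   return chunk * (CHUNK_W * CHUNK_H) + within_chunk;
}

The point of such a layout is locality: a tile of the render target touches a handful of contiguous chunks instead of striding across whole rows of a linear image.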
Re: [Mesa-dev] software implementation of vulkan for gsoc/evoc
by tiled renderer, I meant that I would split the render target into small pieces, then, for each triangle, decide which pieces contain the triangle and add that triangle to per-piece render lists. Afterwards, I'd use the constructed render lists and render all the parts of triangles in a piece, then go to the next piece. Obviously, I'd use multiple threads that are all rendering their separate pieces simultaneously. I'm not sure if you'd be able to use the whole-function-vectorization pass with gallium3d; you'd need to translate the shader to llvm ir and back. The whole-function-vectorization pass would still output scalar code for statically uniform values, since llvm (as of 3.9.1) doesn't have a pass to devectorize vectors where all elements are identical. Jacob Lifshay On Feb 11, 2017 11:11, "Roland Scheidegger" wrote: Am 11.02.2017 um 00:03 schrieb Jacob Lifshay: > I would like to write a software implementation of Vulkan for inclusion > in mesa3d. I wanted to use a tiled renderer coupled with llvm and either > write or use a whole-function-vectorization pass. Would anyone be > willing to mentor me for this project? I would probably only need help > getting it committed, and would be able to do the rest with minimal help. > Jacob Lifshay This sounds like a potentially interesting project, though I don't have much of an idea if it's feasible as gsoc. By "using a tiled renderer" do you mean you want to "borrow" that, presumably from either llvmpipe or openswr? The whole-function-vectorization idea for shader execution looks reasonable to me, just not sure if it will deliver good results. I guess it would be nice if that could sort of be used as a replacement for the current gallivm cpu shader execution implementation (used by both llvmpipe and openswr). Roland ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
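A rough sketch of the binning step described at the top of the message above: take each triangle's screen-space bounding box, clamp it to the tile grid, and append the triangle's index to the render list of every tile the box overlaps. The tile size, types, and names are illustrative, and growth of the per-tile lists is omitted:

#include <stdint.h>

#define TILE_SIZE 64

struct tile_render_list {
   uint32_t *tri_indices;
   uint32_t count, capacity;
};

static void
bin_triangle(struct tile_render_list *tiles, int tiles_x, int tiles_y,
             uint32_t tri_index,
             float min_x, float min_y, float max_x, float max_y)
{
   int tx0 = (int)min_x / TILE_SIZE, ty0 = (int)min_y / TILE_SIZE;
   int tx1 = (int)max_x / TILE_SIZE, ty1 = (int)max_y / TILE_SIZE;
   if (tx0 < 0) tx0 = 0;
   if (ty0 < 0) ty0 = 0;
   if (tx1 >= tiles_x) tx1 = tiles_x - 1;
   if (ty1 >= tiles_y) ty1 = tiles_y - 1;
   for (int ty = ty0; ty <= ty1; ty++) {
      for (int tx = tx0; tx <= tx1; tx++) {
         struct tile_render_list *list = &tiles[ty * tiles_x + tx];
         if (list->count < list->capacity)  /* reallocation omitted for brevity */
            list->tri_indices[list->count++] = tri_index;
      }
   }
}

Once every triangle has been binned, each worker thread can take a tile, walk that tile's list, and rasterize it independently, which is what makes the per-piece multithreading straightforward.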
[Mesa-dev] [PATCH] removed report to vendor message when dri3 is not detected
fixes bug 99715 Signed-off-by: Jacob Lifshay --- src/vulkan/wsi/wsi_common_x11.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/src/vulkan/wsi/wsi_common_x11.c b/src/vulkan/wsi/wsi_common_x11.c index 64ba921..e092066 100644 --- a/src/vulkan/wsi/wsi_common_x11.c +++ b/src/vulkan/wsi/wsi_common_x11.c @@ -266,7 +266,6 @@ VkBool32 wsi_get_physical_device_xcb_presentation_support( if (!wsi_conn->has_dri3) { fprintf(stderr, "vulkan: No DRI3 support detected - required for presentation\n"); - fprintf(stderr, "Note: Buggy applications may crash, if they do please report to vendor\n"); return false; } @@ -315,7 +314,6 @@ x11_surface_get_support(VkIcdSurfaceBase *icd_surface, if (!wsi_conn->has_dri3) { fprintf(stderr, "vulkan: No DRI3 support detected - required for presentation\n"); - fprintf(stderr, "Note: Buggy applications may crash, if they do please report to vendor\n"); *pSupported = false; return VK_SUCCESS; } -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] software implementation of vulkan for gsoc/evoc
I think vulkan is supposed to be reentrant already. Jacob Lifshay On Feb 10, 2017 3:38 PM, "Roland Mainz" wrote: > On Sat, Feb 11, 2017 at 12:03 AM, Jacob Lifshay > wrote: > > I would like to write a software implementation of Vulkan for inclusion > in > > mesa3d. I wanted to use a tiled renderer coupled with llvm and either > write > > or use a whole-function-vectorization pass. Would anyone be willing to > > mentor me for this project? I would probably only need help getting it > > committed, and would be able to do the rest with minimal help. > > Please do me a favour and implement the renderer in an reentrant way, > i.e. no global variables (e.g. put all variables which are "global" in > a "handle" struct which is then passed around, e.g. like libpam was > implemented). This helps a lot with later multithreading and helps > with debugging the code. > > > > Bye, > Roland > > -- > __ . . __ > (o.\ \/ /.o) roland.ma...@nrubsig.org > \__\/\/__/ MPEG specialist, C&&JAVA&&Sun&&Unix programmer > /O /==\ O\ TEL +49 641 3992797 > (;O/ \/ \O;) > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
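A minimal sketch of the handle-struct style being asked for above, where everything that would otherwise be a global lives in a per-device context that is passed to every entry point. The struct members and names are purely illustrative:

#include <stdlib.h>

struct swrast_device {
   int thread_count;
   int has_avx2;  /* detected CPU features, cached per device */
   /* ... queues, pipeline caches, allocators, ... */
};

struct swrast_device *
swrast_device_create(int thread_count)
{
   struct swrast_device *dev = calloc(1, sizeof(*dev));
   if (!dev)
      return NULL;
   dev->thread_count = thread_count;
   /* later entry points take 'dev' instead of touching globals */
   return dev;
}

void
swrast_device_destroy(struct swrast_device *dev)
{
   free(dev);
}

Vulkan's API shape already pushes in this direction, since nearly everything hangs off a VkDevice; the main process-wide state that tends to remain is one-time CPU feature detection, which can sit behind a mutex, as Jacob notes elsewhere in the thread.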
[Mesa-dev] software implementation of vulkan for gsoc/evoc
I would like to write a software implementation of Vulkan for inclusion in mesa3d. I wanted to use a tiled renderer coupled with llvm and either write or use a whole-function-vectorization pass. Would anyone be willing to mentor me for this project? I would probably only need help getting it committed, and would be able to do the rest with minimal help. Jacob Lifshay ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev