----- Original Message -----
From: Dave Airlie <[EMAIL PROTECTED]>
To: Ian Romanick <[EMAIL PROTECTED]>
Cc: DRI <dri-devel@lists.sourceforge.net>
Sent: Monday, May 19, 2008 4:38:02 AM
Subject: Re: TTM vs GEM discussion questions

> > All the good that's done us and our users. After more than *5 years* of
> > various memory manager efforts we can't support basic OpenGL 1.0 (yes,
> > 1.0) functionality in a performant manner (i.e., glCopyTexImage and
> > friends). We have to get over this "it has to be perfect or it will
> > never get in" crap. Our 3D drivers are entirely irrelevant at this point.
>
> Except on Intel hardware, whose relevance may or may not be relevant.
> These can't do copyteximage with the in-kernel DRM.
>
> > To say that "userspace APIs cannot die once released" is not a relevant
> > counterpoint. We're not talking about a userspace API for general
> > application use. This isn't futexes, sysfs, or anything that
> > applications will directly depend upon. This is an interface between a
> > kernel portion of a driver and a usermode portion of a driver. If we
> > can't be allowed to change or deprecate those interfaces, we have no hope.
> >
> > Note that the closed source guys don't have this artificial handicap.
>
> Ian, fine, you can take this up with Linus and Andrew Morton; I'm not
> making this up just to stop you from putting 50 unsupportable memory
> managers in the kernel. If you define any interface to userspace from the
> kernel (ioctls, syscalls), you cannot just make it go away. The rule is
> simple: if you install a distro with a kernel 2.6.x.distro, and it has
> Mesa 7.0 drivers on it, upgrading the kernel to kernel 2.6.x+n without
> touching userspace should never break userspace. If we can't follow this
> rule, we can't put our code into Linus's kernel. So don't argue about it,
> deal with it; this isn't going to change.
>
> And yes, I've heard this crap about the closed source guys, but we can't
> follow their route and be distributed by vendors. How many vendors ship
> the closed drivers?
>
> > This is also a completely orthogonal issue to maintaining any particular
> > driver. Drivers are removed from the kernel just the same as they are
> > removed from X.org. Assume we upstreamed either TTM or GEM "today."
> > Clearly that memory manager would continue to exist as long as some
> > other driver continued to depend on it. I don't see how this is
> > different from cfb or any of the other interfaces within the X server
> > that we've gutted recently.
>
> Drivers and pieces of the kernel aren't removed like you think.
> I think we
> nuked gamma (it didn't have a working userspace anymore) and ffb (it
> sucked and couldn't be fixed). Someone is bound to bring up OSS->ALSA,
> but that doesn't count, as ALSA had an OSS emulation layer so userspace
> apps didn't just stop working. Removing chunks of X is vastly different
> from removing an exposed kernel userspace interface. Please talk to any
> IBM kernel person and clarify how this stuff works. (Maybe benh could
> chime in...??)
>
> > If you want to remove a piece of infrastructure, you have three choices.
> > If nothing uses it, you gut it. If something uses it, you either fix
> > that "something" to use different infrastructure (which puts you in the
> > "nothing uses it" state) or you leave things as they are. In spite of
> > all the fussing the kernel guys do in this respect, the kernel isn't
> > different in this respect from any other large, complex piece of
> > infrastructure.
>
> So you are going to go around and fix the userspaces on machines that are
> already deployed? How? E.g. Andrew Morton has a Fedora Core 1 install on
> a laptop booting 2.6.x-mm kernels; when 3D stops working on that laptop,
> we get to hear about it. So yes, you can redesign and move around the
> kernel internals as much as you like, but you damn well better expose the
> old interface and keep it working.
>
> > managers or that we may want to have N memory managers now that will be
> > gutted later. It seems that the real problem is that the memory
> > managers have been exposed as a generic, directly usable, device
> > independent piece of infrastructure. Maybe the right answer is to punt
> > on the entire concept of a general memory manager. At best we'll have
> > some shared, optional-use infrastructure, and all of the interfaces that
> > anything in userspace can ever see are driver dependent. That limits
> > the exposure of the interfaces and lets us solve today's problems today.
> >
> > As is trivially apparent, we don't know what the "best" (for whatever
> > definition of best we choose) answer is for a memory manager interface.
> > We're probably not going to know that answer in the near future. To
> > not let our users have anything until we can give them the best thing is
> > an incredible disservice to them, and it makes us look silly (at best).
>
> Well, the thing is I can't believe we don't know enough to do this in
> some way generically, but maybe the TTM vs GEM thing proves it's not
> possible.

I don't think there's anything particularly wrong with the GEM interface --
I just need to know that the implementation can be fixed so that performance
doesn't suck as hard as it does in the current one, and that people's
political views on basic operations like mapping buffers don't get in the
way of writing a decent driver.

We've run a few benchmarks against i915 drivers in all their permutations,
and to summarize, the results look like:

- for GPU-bound apps, there are small differences, perhaps up to 10%. I'm
  really not concerned about these (yet).

- for CPU-bound apps, the overheads introduced by Intel's approach to
  buffer handling impose a significant penalty, in the region of 50-100%.

I think the latter is the significant result -- none of these experiments in
memory management significantly change the command stream the hardware has
to operate on, so what we're varying is essentially the CPU behaviour needed
to achieve that command stream. And it is in CPU usage where GEM (and
Keith/Eric's now-abandoned TTM driver) significantly disappoint.

Or to put it another way, GEM and master/TTM seem to burn huge amounts of
CPU just running the memory manager. This isn't true for master/no-ttm or
for i915tex using userspace sub-allocation, where the CPU penalty for
getting decent memory management seems to be minimal relative to the
non-ttm baseline.
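Since much of what follows hinges on what "userspace sub-allocation" buys,
here is a minimal, hypothetical sketch of the idea: ask the kernel memory
manager for one large buffer up front, then carve small per-draw
allocations out of it in userspace with a bump pointer, so no kernel call
is paid per allocation. All names (`sub_allocator`, `sub_alloc`, etc.) are
illustrative; this is not the actual i915tex code.

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch of userspace sub-allocation: small allocations are carved out
 * of one large kernel-allocated buffer with a bump pointer, so the
 * expensive kernel transition is paid once per big buffer, not once per
 * object. */

struct sub_allocator {
    uint8_t *base;    /* start of the large kernel-allocated buffer */
    size_t   size;    /* total size of that buffer */
    size_t   offset;  /* next free byte (bump pointer) */
};

void sub_init(struct sub_allocator *sa, void *base, size_t size)
{
    sa->base   = base;
    sa->size   = size;
    sa->offset = 0;
}

/* Hand out a sub-range of the big buffer, or NULL when it is exhausted.
 * 'align' must be a power of two. A real driver would fence the whole
 * buffer once per batch rather than once per allocation. */
void *sub_alloc(struct sub_allocator *sa, size_t bytes, size_t align)
{
    size_t start = (sa->offset + align - 1) & ~(align - 1);
    if (start + bytes > sa->size)
        return NULL;
    sa->offset = start + bytes;
    return sa->base + start;
}

/* Resetting is safe once the GPU has retired everything in the buffer. */
void sub_reset(struct sub_allocator *sa)
{
    sa->offset = 0;
}
```

The point is not the allocator itself -- it is that the per-object cost is
a few userspace instructions instead of an ioctl, which is consistent with
the CPU numbers reported below.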
If there's a political desire not to use userspace sub-allocation, then
whatever kernel-based approach you want to investigate should nonetheless
make some effort to hit reasonable performance goals -- and neither of the
two current kernel-allocation-based approaches is at all impressive.

Keith

==============================================================

And on an i945G, dual-core Pentium D 3GHz, 2MB cache, 800MHz FSB,
single-channel RAM:

Openarena timedemo at 640x480:
------------------------------

master w/o TTM:  840 frames, 17.1 seconds: 49.0 fps
                 12.24s user 1.02s system 63% cpu 20.880 total
master with TTM: 840 frames, 15.8 seconds: 53.1 fps
                 13.51s user 5.15s system 95% cpu 19.571 total
i915tex_branch:  840 frames, 13.8 seconds: 61.0 fps
                 12.54s user 2.34s system 85% cpu 17.506 total
gem:             840 frames, 15.9 seconds: 52.8 fps
                 11.96s user 4.44s system 83% cpu 19.695 total

KW: It's less obvious here than in some of the tests below, but the pattern
is still clear -- compared to master/no-ttm, i915tex gets about the same
ratio of fps to CPU usage, whereas both master/ttm and gem are significantly
worse, burning much more CPU per fps, with a large chunk of the extra CPU
being spent in the kernel.

The particularly worrying thing about GEM is that it isn't hitting *either*
100% CPU *or* maximum framerates from the hardware -- that's really not
very good, as it implies hardware is being left idle unnecessarily.

glxgears:

A: ~1029 fps, 20.63user  2.88system 1:00.00elapsed 39%CPU (master, no ttm)
B: ~1072 fps, 23.97user 18.06system 1:00.00elapsed 70%CPU (master, ttm)
C: ~1128 fps, 22.38user  5.21system 1:00.00elapsed 45%CPU (i915tex, new)
D: ~1167 fps, 23.14user  9.07system 1:00.00elapsed 53%CPU (i915tex, old)
F: ~1112 fps, 24.70user 21.95system 1:00.00elapsed 77%CPU (gem)

KW: The high CPU overhead imposed by GEM and (non-suballocating) master/TTM
should be pretty clear here. master/TTM burns 30% of the CPU just running
the memory manager!!
GEM gets slightly higher framerates but uses even more CPU than master/TTM.

fgl_glxgears -fbo:

A: n/a
B: ~244 fps, 7.03user 5.30system 1:00.01elapsed 20%CPU (master, ttm)
C: ~255 fps, 6.24user 1.71system 1:00.00elapsed 13%CPU (i915tex, new)
D: ~260 fps, 6.60user 2.44system 1:00.00elapsed 15%CPU (i915tex, old)
F: ~258 fps, 7.56user 6.44system 1:00.00elapsed 23%CPU (gem)

KW: GEM and master/ttm burn more CPU to build/submit the same command
streams.

openarena 1280x1024:

A: 840 frames, 44.5 seconds: 18.9 fps (master, no ttm)
B: 840 frames, 40.8 seconds: 20.6 fps (master, ttm)
C: 840 frames, 40.4 seconds: 20.8 fps (i915tex, new)
D: 840 frames, 37.9 seconds: 22.2 fps (i915tex, old)
F: 840 frames, 40.3 seconds: 20.8 fps (gem)

KW: No CPU measurements were taken here, but this is almost certainly GPU
bound. A lot of similar numbers; I don't believe the deltas have anything
in particular to do with memory management interface choices...

ipers:

A: ~285000 poly/sec (master, no ttm)
B: ~217000 poly/sec (master, ttm)
C: ~298000 poly/sec (i915tex, new)
D: ~227000 poly/sec (i915tex, old)
F: ~125000 poly/sec (gem, GPU lockup on first attempt)

KW: No CPU measurements in this run, but all are almost certainly 100%
pinned on the CPU.

- i915tex (in particular i915tex, new) shows performance similar to classic
  -- i.e. low CPU overhead for this memory manager.
- GEM is significantly worse even than master/ttm -- hopefully this is a
  bug rather than a necessary characteristic of the interface.

texdown:

A: total texels=393216000.000000  time=3.004000 (master, no ttm)
B: total texels=434110464.000000  time=3.000000 (master, ttm)
C: (i915tex new -- whoops, crashes)
D: total texels=1111490560.000000 time=3.002000 (i915tex old)
F: total texels=279969792.000000  time=3.004000 (gem)

Note the huge (3x-4x) performance lead of i915tex, despite the embarrassing
crash in the newer version. I suspect this is unrelated to command handling,
and probably somebody has disabled or regressed some aspect of the texture
upload path...
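As a quick check on that lead, the texdown totals convert to texels/second
as total/time; doing the division puts i915tex (old) at roughly 2.6x
master/ttm, 2.8x master/no-ttm, and 4.0x gem. A tiny helper (the function
names here are just for this note, not anything in the drivers):

```c
/* Convert the texdown figures quoted above into texels/second and
 * compare i915tex (old) against the other drivers. The constants are
 * copied from the results in this mail. */

double texels_per_sec(double total_texels, double seconds)
{
    return total_texels / seconds;
}

/* Speed of i915tex (old) relative to another result. */
double i915tex_speedup_vs(double other_texels, double other_secs)
{
    return texels_per_sec(1111490560.0, 3.002) /   /* i915tex, old */
           texels_per_sec(other_texels, other_secs);
}
```

So the "3x-4x" above is on the generous side for the ttm cases, but the
gap to gem really is about fourfold.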
NOTE: The reason that i915tex does so well relative to master/no-ttm is
that we can upload directly to "VRAM"... master/no-ttm treats VRAM as a
cache and always keeps a second copy of the texture safe in main memory,
hence performance isn't great for texture uploads on master/no-ttm.

Here's what we're seeing on an i915: 3GHz Celeron, 256kB cache,
dual-channel RAM, ReportDamage disabled, DRM master (no gem results on
this machine ...):

Test                   i915tex_branch            i915 master, TTM          i915 master, classic
-----------------------------------------------------------------------------------------------
gears                  1033 fps, 70.1% CPU       726 fps, 100% CPU         955 fps, 56% CPU
openarena              47.1 fps, 17.9u, 2.7s     31.5 fps, 21.1u, 8.7s     39 fps, 17.9u, 1.3s
texdown                1327 MB/s                 551 MB/s                  572 MB/s
texdown, subimage      1014 MB/s                 134 MB/s                  148 MB/s
ipers, no help screen  255 000 tri/s, 100% CPU   139 000 tri/s, 100% CPU   241 000 tri/s, 100% CPU

I would summarize the results like this:

- master/no-ttm has a basically "free" memory manager in terms of CPU
  overhead.
- master/ttm and GEM gain a proper memory manager but introduce a huge CPU
  overhead and a consequent performance regression.
- i915tex makes use of userspace sub-allocation to resolve that regression
  and achieve efficiency comparable to master/no-ttm.
- a separate regression seems to have killed texture upload performance on
  master/ttm relative to its ancestor i915tex.

Keith

-------------------------------------------------------------------------
_______________________________________________
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel