Re: EXA performance problem
On 28.11.2011 07:43, Maarten Maathuis wrote:

> No, damage is an extension. It is called by EXA, and it is probably
> adding all your rectangles to a damage region used to determine how much
> data is actually valid (needed for RAM<->VRAM migrations, for example).
>
> One thing that just comes to mind: if you are rendering a million
> rectangles, how many of those do you actually see on your screen?

Most of them are only 1x1 pixel wide, and lots of rectangles share the same pixel.

I assume that one can optimize the application. But I did not write it and do not know how it works. I only know that it was able to show vector pictures consisting of millions of rectangles (VLSI design data) within seconds when run on XFree86. With Xorg it takes minutes.

I only see the problem because we recently upgraded our X11 thin clients to better hardware, and they turned out to be much slower than the older ones. My quest to find the problem has now led me to the damage extension.

At first I thought it was a network problem. But Xorg was also slow on my notebook when the program was started locally. The contrast is striking: the old XFree86 thin clients were able to draw all the rectangles sent over a 100 Mbit Ethernet network in seconds, while my more powerful Xorg server needs minutes even though the software runs on the same machine.

However, I was able to improve the runtime of the first operation in damagePolyRectangle. The runtime of my benchmark went down from 90 seconds to 64 seconds.

Now one has to look at

    (*pGC->ops->PolyRectangle)(pDrawable, pGC, nRects, pRects);

Thanks
Christoph
___
xorg@lists.freedesktop.org: X.Org support
Archives: http://lists.freedesktop.org/archives/xorg
Info: http://lists.freedesktop.org/mailman/listinfo/xorg
Your subscription address: arch...@mail-archive.com
Re: EXA performance problem
On 28.11.2011 10:35, Christoph Bartoschek wrote:

> Now one has to look at
>
>     (*pGC->ops->PolyRectangle)(pDrawable, pGC, nRects, pRects);

Here is what I see so far:

- damagePolyRectangle is called for 2044 rectangles.
- The damage region is computed; it consists of about 1000 rectangles each time.
- miPolyRectangle is called.
- The function iterates over all rectangles and calls exaPolylines for each of them, because most have only a width and height of 0.
- exaPolylines calls ExaCheckPolylines.

So ExaCheckPolylines is called for each rectangle. I have added timers to this function to see what costs time:

    void
    ExaCheckPolylines (DrawablePtr pDrawable, GCPtr pGC,
                       int mode, int npt, DDXPointPtr ppt)
    {
        EXA_PRE_FALLBACK_GC(pGC);
        EXA_FALLBACK(("to %p (%c), width %d, mode %d, count %d\n",
                      pDrawable, exaDrawableLocation(pDrawable),
                      pGC->lineWidth, mode, npt));

        exaPrepareAccess (pDrawable, EXA_PREPARE_DEST);       // Step1: 55 s
        exaPrepareAccessGC (pGC);                             // Step2: 2.4 s
        pGC->ops->Polylines (pDrawable, pGC, mode, npt, ppt); // Step3: 2.4 s
        exaFinishAccessGC (pGC);                              // Step4: 2.2 s
        exaFinishAccess (pDrawable, EXA_PREPARE_DEST);        // Step5: 2.2 s
        EXA_POST_FALLBACK_GC(pGC);
    }

exaPrepareAccess needs most of the time. Is that expected?

Inside it there are some region operations on the damage region in exaCopyDirty. As said before, the damage region contains about 1000 rectangles. So about 2000 times we run several operations on a region of about 1000 rectangles. I think this explains the runtime.

Isn't it somehow possible to batch the rectangle drawing such that the region operations are not necessary for each rectangle?

Isn't it possible to expand the damage region such that it contains fewer rectangles?

Is this still the correct list, or should I ask the EXA questions elsewhere?

Christoph
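On the last technical question: one way to make the damage region contain fewer rectangles would be to coarsen it to its bounding box before running region operations on it. The sketch below only illustrates that idea under assumptions. `ToyBox`, `toy_bounding_box` and `toy_area` are hypothetical names, not EXA or pixman API, and the x1/y1/x2/y2 layout with an exclusive lower-right corner is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical box type (assumption): x2/y2 are exclusive, so a 1x1
 * pixel box at (3,4) is {3, 4, 4, 5}. */
typedef struct {
    int x1, y1, x2, y2;
} ToyBox;

/* Return the smallest single box that covers all n input boxes. */
ToyBox toy_bounding_box(const ToyBox *boxes, size_t n)
{
    assert(boxes != NULL && n >= 1);

    ToyBox ext = boxes[0];
    for (size_t i = 1; i < n; i++) {
        if (boxes[i].x1 < ext.x1) ext.x1 = boxes[i].x1;
        if (boxes[i].y1 < ext.y1) ext.y1 = boxes[i].y1;
        if (boxes[i].x2 > ext.x2) ext.x2 = boxes[i].x2;
        if (boxes[i].y2 > ext.y2) ext.y2 = boxes[i].y2;
    }
    return ext;
}

/* Area in pixels, to judge how much extra area the coarsening marks dirty. */
long toy_area(ToyBox b)
{
    return (long)(b.x2 - b.x1) * (long)(b.y2 - b.y1);
}
```

Collapsing about 1000 boxes into one would make each per-rectangle region operation cheap, at the price of treating every pixel inside the bounding box as dirty, which can mean more data to migrate. Whether that trade-off pays off in exaCopyDirty is not measured here.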
EXA performance problem
Hi,

I still have a huge performance problem with Xorg. One application that painted 2 million rectangles on the screen within a second or so with XFree86 needs about a minute with Xorg.

Most of the time is spent in libpixman. I've added some debug statements and see that pixman_raster_op is called about 7.2 million times during my test case.

I do not think that pixman itself is the problem. It is just called too often by EXA.

Is there anything I can do about this? Is there a better list where I can ask? Or do you know a person who might be interested in solving such a problem?

Christoph
Re: EXA performance problem
On 27.11.2011 16:13, Maarten Maathuis wrote:

> On Sun, Nov 27, 2011 at 3:55 PM, Christoph Bartoschek
> <bartosc...@or.uni-bonn.de> wrote:
>> Hi,
>>
>> I still have a huge performance problem with Xorg. One application that
>> painted 2 million rectangles on the screen within a second or so with
>> XFree86 needs about a minute with Xorg.
>>
>> Most of the time is spent in libpixman. I've added some debug statements
>> and see that pixman_raster_op is called about 7.2 million times during
>> my test case.
>>
>> I do not think that pixman itself is the problem. It is just called too
>> often by EXA.
>>
>> Is there anything I can do about this? Is there a better list where I
>> can ask? Or do you know a person who might be interested in solving
>> such a problem?
>>
>> Christoph
>
> As far as I know it basically boils down to this: rendering rectangles
> is done in a software library, as you observed. If your pixmap happens
> to be outside normal RAM, then a lot of reads will kill performance.
> These days the aim should be to use as little core rendering as
> possible. A modern toolkit or a rendering library like cairo should
> handle this far better.

How can I check whether the pixmap is outside normal RAM?

To me it does not look as if pixman is used for rendering the image. It looks as if EXA is managing the region that needs updates with pixman routines. But I could be wrong here.

Christoph
Re: EXA performance problem
I have new information. I am no longer sure whether it is a problem with EXA.

I have a test case that currently takes 90 seconds to draw all rectangles. I see that two functions in damage.c are mainly used:

    damagePolyRectangle
    damagePolyFillRectangle

For each given rectangle, the first function calls up to four times

    damageDamageBox (pDrawable, box, pGC->subWindowMode);

which adds the box to a region. The function then calls damageRegionAppend. In total this part takes 30 seconds of my test case. I think the code has quadratic behaviour here because it adds rectangle by rectangle instead of first collecting them in a region and then calling damageRegionAppend once. I think removing the quadratic behaviour can reduce the runtime significantly.

About 60 seconds are spent in the calls

    (*pGC->ops->PolyRectangle)(pDrawable, pGC, nRects, pRects);
    (*pGC->ops->PolyFillRect)(pDrawable, pGC, nRects, pRects);

However, I do not yet know why they are so slow.

Is damage.c part of EXA?

Christoph
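The suspected quadratic behaviour can be illustrated with a toy cost model. This is not X server code: `ToyRegion`, `toy_append`, `toy_cost_one_by_one` and `toy_cost_batched` are hypothetical names, and the assumption that each append has to visit every box already stored in the region is a simplification, not something read out of damage.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a damage region: it only tracks how many
 * boxes it holds and how many box visits the appends have cost. */
typedef struct {
    size_t num_boxes;  /* boxes currently stored in the region */
    size_t work;       /* total boxes visited by all appends so far */
} ToyRegion;

/* Assumed cost model: one append visits every stored box plus the new ones. */
void toy_append(ToyRegion *r, size_t new_boxes)
{
    assert(r != NULL);
    r->work += r->num_boxes + new_boxes;
    r->num_boxes += new_boxes;
}

/* Rectangle by rectangle: one append per box. */
size_t toy_cost_one_by_one(size_t n)
{
    ToyRegion r = {0, 0};
    for (size_t i = 0; i < n; i++)
        toy_append(&r, 1);
    return r.work;
}

/* Batched: collect all boxes first, then append once. */
size_t toy_cost_batched(size_t n)
{
    ToyRegion r = {0, 0};
    toy_append(&r, n);
    return r.work;
}
```

Under this model, n boxes cost n(n+1)/2 visits when appended one at a time and n visits when appended as one batch. Real region code merges and coalesces boxes, so the actual numbers will differ; the sketch only shows why the per-rectangle pattern can grow quadratically.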
Disabling EXA
Hi,

how can I disable EXA and use XAA? I am on openSUSE and added the following section to /etc/X11/xorg.conf.d/50-device.conf:

    Section "Device"
        Option "AccelMethod" "xaa"
        Identifier "Default Device"
    EndSection

Xorg reads the file, because its log file says:

    [21.481] (==) RADEON(0): Depth 24, (--) framebuffer bpp 32
    [21.481] (II) RADEON(0): Pixel depth = 24 bits stored in 4 bytes (32 bpp pixmaps)
    [21.481] (==) RADEON(0): Default visual is TrueColor
    [21.481] (**) RADEON(0): Option "AccelMethod" "xaa"
    [21.481] (==) RADEON(0): RGB weight 888
    [21.481] (II) RADEON(0): Using 8 bits per RGB (8 bit DAC)
    [21.481] (--) RADEON(0): Chipset: ATI Radeon Mobility X300 (M22) 5460 (PCIE) (ChipID = 0x5460)
    [21.481] (II) RADEON(0): PCIE card detected

But EXA is still used:

    [21.481] (II) Loading sub module "exa"
    [21.481] (II) LoadModule: "exa"
    [21.482] (II) Loading /usr/lib/xorg/modules/libexa.so
    [21.488] (II) Module exa: vendor="X.Org Foundation"
    [21.488]    compiled for 1.10.4, module version = 2.5.0
    ...
    [21.575] (II) RADEON(0): EXA: Driver will allow EXA pixmaps in VRAM
    ...
    [21.635] (II) RADEON(0): Setting EXA maxPitchBytes
    [21.635] (II) EXA(0): Driver allocated offscreen pixmaps
    [21.635] (II) EXA(0): Driver registered support for the following operations:
    [21.635] (II) Solid
    [21.635] (II) Copy
    [21.635] (II) Composite (RENDER acceleration)
    [21.635] (II) UploadToScreen
    [21.635] (II) DownloadFromScreen

How can I disable EXA and use XAA? I suspect EXA is responsible for a huge performance regression.

Thanks
Christoph
X.org drawing rectangles extremely slow.
Hi,

I have an application that draws rectangles (approx 50). Most of them are so small that only a pixel or nothing is visible. After we upgraded our thin clients we saw a huge performance regression in the application.

On our old thin clients, with XFree86 identifying itself as X.org 6.8.1, the application needs 1.5 seconds to draw all rectangles.

With the new thin client it takes 75 seconds. The X.org version is 1.5.2.

On my notebook with X.org 1.9.3 from openSUSE 11.4 it still takes 7.4 seconds. This is about 5x slower than on the old thin client.

Does anyone have an idea what could be wrong with the newer X.org servers?

I've created a perf profile of the X.org server on my notebook:

    # Events: 33K cycles
    #
    # Overhead  Command  Shared Object           Symbol
    #
        46.83%  Xorg     libpixman-1.so.0.20.0   [.] fbbc
         5.39%  Xorg     libexa.so               [.] 3127
         5.38%  Xorg     radeon_drv.so           [.] c4928
         5.32%  Xorg     Xorg                    [.] 2a343
         4.34%  Xorg     libc-2.11.3.so          [.] _int_malloc
         3.14%  Xorg     libc-2.11.3.so          [.] __GI_memmove
         2.78%  Xorg     [radeon]                [k] r100_cs_packet_parse
         2.38%  Xorg     [radeon]                [k] r100_cs_parse_packet0
         2.18%  Xorg     libc-2.11.3.so          [.] _int_free
         2.14%  Xorg     Xorg                    [.] RegionValidate
         1.87%  Xorg     libdrm_radeon.so.1.0.0  [.] 287d
         1.81%  Xorg     libc-2.11.3.so          [.] __cfree
         1.71%  Xorg     libc-2.11.3.so          [.] __malloc
         1.40%  Xorg     [radeon]                [k] r300_cs_parse
         1.02%  Xorg     libpixman-1.so.0.20.0   [.] pixman_region_union
         0.95%  Xorg     libpixman-1.so.0.20.0   [.] pixman_region_intersect
         0.71%  Xorg     Xorg                    [.] RegionFromRects
         0.66%  Xorg     [kernel.kallsyms]       [k] read_hpet
         0.53%  Xorg     libc-2.11.3.so          [.] __i686.get_pc_thunk.bx

Thanks
Christoph