Re: EXA performance problem

2011-11-28 Thread Christoph Bartoschek

On 28.11.2011 07:43, Maarten Maathuis wrote:
 ___

xorg@lists.freedesktop.org: X.Org support
Archives: http://lists.freedesktop.org/archives/xorg
Info: http://lists.freedesktop.org/mailman/listinfo/xorg
Your subscription address: madman2...@gmail.com



No, damage is an extension. It is called by EXA, and it's probably adding
all your rectangles to a damage region used to determine how much data
is actually valid (needed for migrations between RAM and VRAM, for example).

One thing that just comes to mind, if you are rendering a million
rectangles, how many of those do you actually see on your screen?


Most of them are only 1x1 pixel in size, and lots of rectangles cover the
same pixel. I assume that one can optimize the application, but I did
not write it and do not know how it works. I only know that it was able
to show vector pictures consisting of millions of rectangles (VLSI design
data) within seconds when run on XFree86. With Xorg it takes minutes.
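
If the application could be changed, one cheap client-side optimization would be to drop exact duplicates before the rectangles are sent to the server. The following is a minimal sketch, not taken from the application: Rect is a hypothetical stand-in for Xlib's XRectangle, and dedupe_rects is an illustrative helper. It assumes all rectangles are drawn with the same GC and a plain copy raster operation, so that drawing order and repeated drawing do not change the result.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for Xlib's XRectangle, so the sketch builds
 * without the X headers. */
typedef struct {
    short x, y;
    unsigned short width, height;
} Rect;

/* Total order over rectangles: by y, then x, then width, then height. */
static int rect_cmp(const void *pa, const void *pb)
{
    const Rect *a = pa, *b = pb;
    if (a->y != b->y) return a->y < b->y ? -1 : 1;
    if (a->x != b->x) return a->x < b->x ? -1 : 1;
    if (a->width != b->width) return a->width < b->width ? -1 : 1;
    if (a->height != b->height) return a->height < b->height ? -1 : 1;
    return 0;
}

/* Sort the rectangles in place and drop exact duplicates.
 * Returns the number of rectangles that remain at the front of the array. */
size_t dedupe_rects(Rect *rects, size_t n)
{
    if (n == 0)
        return 0;
    qsort(rects, n, sizeof *rects, rect_cmp);
    size_t out = 1;
    for (size_t i = 1; i < n; i++) {
        /* Keep a rectangle only if it differs from the last one kept. */
        if (rect_cmp(&rects[out - 1], &rects[i]) != 0)
            rects[out++] = rects[i];
    }
    return out;
}
```

The shortened array could then be handed to a single XDrawRectangles call. With an XOR-style raster function this shortcut would not be valid, because drawing the same rectangle twice changes the result.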


I only see the problem because we recently upgraded our X11 thin clients 
to better hardware. But they turned out to be much slower than the older 
ones.


My quest to find the problem has led me to the damage extension now. 
First I thought it was a network problem. But Xorg was also slow on my 
notebook when the program was started locally.


The contrast is striking: the old XFree86 thin clients were able to draw
all the rectangles that were sent over a 100 Mbit Ethernet network in
seconds, while my more powerful Xorg server needs minutes even though the
software runs on the same machine.



However, I was able to improve the runtime of the first operation in 
damagePolyRectangle. The runtime of my benchmark went down from 90 
seconds to 64 seconds.


Now one has to look at
(*pGC->ops->PolyRectangle)(pDrawable, pGC, nRects, pRects);


Thanks
Christoph


Re: EXA performance problem

2011-11-28 Thread Christoph Bartoschek

On 28.11.2011 10:35, Christoph Bartoschek wrote:


Now one has to look at
(*pGC->ops->PolyRectangle)(pDrawable, pGC, nRects, pRects);


Here is what I see so far:

- damagePolyRectangle is called for 2044 rectangles.

- the damage region is computed; it consists of about 1000 rectangles
each time.


- miPolyRectangle is called.

- the function iterates over all rectangles and calls exaPolylines for 
each of them because most have only a width and height of 0


- exaPolylines calls ExaCheckPolylines.


We see that ExaCheckPolylines is called for each rectangle. I have added
timers to this function to see what costs time:



void
ExaCheckPolylines (DrawablePtr pDrawable, GCPtr pGC,
                   int mode, int npt, DDXPointPtr ppt)
{
  EXA_PRE_FALLBACK_GC(pGC);
  EXA_FALLBACK(("to %p (%c), width %d, mode %d, count %d\n",
                pDrawable, exaDrawableLocation(pDrawable),
                pGC->lineWidth, mode, npt));

  exaPrepareAccess (pDrawable, EXA_PREPARE_DEST);         // Step1: 55 s
  exaPrepareAccessGC (pGC);                               // Step2: 2.4 s
  pGC->ops->Polylines (pDrawable, pGC, mode, npt, ppt);   // Step3: 2.4 s
  exaFinishAccessGC (pGC);                                // Step4: 2.2 s
  exaFinishAccess (pDrawable, EXA_PREPARE_DEST);          // Step5: 2.2 s
  EXA_POST_FALLBACK_GC(pGC);
}
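
The timers themselves are not part of the listing. A minimal sketch of how such per-step accumulators could look, assuming CPU time from the standard clock() function is precise enough when summed over many calls. StepTimer, step_timer_add, step_timer_seconds and TIMED are hypothetical names, not part of the X server:

```c
#include <time.h>

/* Hypothetical accumulator for one step of the function. */
typedef struct {
    clock_t ticks;        /* accumulated CPU ticks spent in this step */
    unsigned long calls;  /* number of measured intervals */
} StepTimer;

/* Add one measured interval [begin, end] to the accumulator. */
void step_timer_add(StepTimer *t, clock_t begin, clock_t end)
{
    t->ticks += end - begin;
    t->calls++;
}

/* Accumulated time of this step in seconds. */
double step_timer_seconds(const StepTimer *t)
{
    return (double)t->ticks / CLOCKS_PER_SEC;
}

/* Run a statement and charge its CPU time to `timer`.
 * Example inside ExaCheckPolylines (step1 would be a static StepTimer):
 *   TIMED(step1, exaPrepareAccess(pDrawable, EXA_PREPARE_DEST));
 */
#define TIMED(timer, stmt)                                   \
    do {                                                     \
        clock_t timed_begin_ = clock();                      \
        stmt;                                                \
        step_timer_add(&(timer), timed_begin_, clock());     \
    } while (0)
```

Note that clock() reports processor time, not wall-clock time, so time spent waiting on the hardware would not show up in these numbers.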


We see that exaPrepareAccess needs most of the time. Is that expected?

Inside we see that there are some region operations on the damage region
in exaCopyDirty. As said before, the damage region contains about 1000
rectangles. So we have about 2000 calls, each performing several operations
on a region of 1000 rectangles.


I think this explains the runtime.

Isn't it somehow possible to batch the rectangle drawing such that the
region operations are not necessary for each rectangle?
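
The batching idea can be illustrated with a small model. This is only a sketch of the call pattern, not EXA code: AccessHooks is a hypothetical interface whose three callbacks stand in for exaPrepareAccess, the Polylines call and exaFinishAccess, and the counting hooks exist only to make the difference visible.

```c
#include <stddef.h>

/* Hypothetical hooks; the real functions take drawables and GCs,
 * not an opaque context pointer. */
typedef struct {
    void (*prepare)(void *ctx);
    void (*draw)(void *ctx, size_t index);
    void (*finish)(void *ctx);
} AccessHooks;

/* Pattern described in the thread: one prepare/finish pair per rectangle. */
void draw_each(const AccessHooks *h, void *ctx, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        h->prepare(ctx);
        h->draw(ctx, i);
        h->finish(ctx);
    }
}

/* Batched pattern: one prepare/finish pair around the whole run. */
void draw_batched(const AccessHooks *h, void *ctx, size_t n)
{
    if (n == 0)
        return;
    h->prepare(ctx);
    for (size_t i = 0; i < n; i++)
        h->draw(ctx, i);
    h->finish(ctx);
}

/* Counting implementation of the hooks, for illustration only. */
typedef struct {
    size_t prepares, draws, finishes;
} CallCounts;

static void count_prepare(void *ctx) { ((CallCounts *)ctx)->prepares++; }
static void count_draw(void *ctx, size_t index) { (void)index; ((CallCounts *)ctx)->draws++; }
static void count_finish(void *ctx) { ((CallCounts *)ctx)->finishes++; }

const AccessHooks counting_hooks = { count_prepare, count_draw, count_finish };
```

With 2044 rectangles, draw_each runs the expensive prepare step 2044 times while draw_batched runs it once. Whether EXA could safely restructure its fallback path this way is an open question here.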


Isn't it possible to expand the damage region such that it contains fewer
rectangles?
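
One way to get fewer rectangles would be to collapse the region to its bounding box. A minimal sketch under stated assumptions: Box is a hypothetical stand-in for the server's box type (x1/y1 inclusive, x2/y2 exclusive), and boxes_extents and box_area are illustrative helpers. pixman offers pixman_region_extents for real regions; this sketch does not use it so that it stays self-contained.

```c
#include <stddef.h>

/* Hypothetical box type: x1/y1 inclusive, x2/y2 exclusive. */
typedef struct {
    int x1, y1, x2, y2;
} Box;

/* Smallest single box that covers all n boxes; n must be at least 1. */
Box boxes_extents(const Box *boxes, size_t n)
{
    Box e = boxes[0];
    for (size_t i = 1; i < n; i++) {
        if (boxes[i].x1 < e.x1) e.x1 = boxes[i].x1;
        if (boxes[i].y1 < e.y1) e.y1 = boxes[i].y1;
        if (boxes[i].x2 > e.x2) e.x2 = boxes[i].x2;
        if (boxes[i].y2 > e.y2) e.y2 = boxes[i].y2;
    }
    return e;
}

/* Number of pixels covered by a box. */
long box_area(Box b)
{
    return (long)(b.x2 - b.x1) * (long)(b.y2 - b.y1);
}
```

The trade-off is that the bounding box over-reports damage: three scattered 1x1 boxes can turn into dozens of pixels, so region operations get cheaper while any copy driven by the region gets larger.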


Is this still the correct list, or should I ask the EXA questions elsewhere?

Christoph


EXA performance problem

2011-11-27 Thread Christoph Bartoschek

Hi,

I still have a huge performance problem with Xorg. One application that
painted 2 million rectangles on the screen within a second or so with
XFree86 needs about a minute with Xorg.


Most of the time is spent in libpixman. I've added some debug statements
and see that pixman_raster_op is called about 7.2 million times during my
testcase.


I do not think that pixman itself is the problem. It is just used too 
often by EXA.


Is there anything I can do about this? Is there a better list where I 
can ask? Or do you know a person that might be interested in solving 
such a problem?


Christoph


Re: EXA performance problem

2011-11-27 Thread Christoph Bartoschek

On 27.11.2011 16:13, Maarten Maathuis wrote:

On Sun, Nov 27, 2011 at 3:55 PM, Christoph Bartoschek
bartosc...@or.uni-bonn.de  wrote:

Hi,

I still have a huge performance problem with Xorg. One application that
painted 2 million rectangles on the screen within a second or so with XFree86
needs about a minute with Xorg.

Most of the time is spent in libpixman. I've added some debug statements and
see that pixman_raster_op is called about 7.2 million times during my testcase.

I do not think that pixman itself is the problem. It is just used too often
by EXA.

Is there anything I can do about this? Is there a better list where I can
ask? Or do you know a person that might be interested in solving such a
problem?

Christoph



As far as I know, it basically boils down to this: rendering rectangles
is done in a software library, as you observed. If your pixmap happens
to be outside normal RAM, then a lot of reads will kill performance.
These days the aim should be to use as little core rendering as
possible. A modern toolkit or a rendering library like cairo should
handle this far better.



How can I check whether the pixmap is outside normal RAM? To me it
does not look as if pixman is used for rendering the image. It looks as
if EXA is managing the region that needs updates with pixman routines.
But I could be wrong here.


Christoph


Re: EXA performance problem

2011-11-27 Thread Christoph Bartoschek
I have new information. I am no longer sure whether it is a problem with 
EXA.


I have a testcase that currently takes 90 seconds to draw all 
rectangles. I see that in damage.c two functions are mainly used:


damagePolyRectangle
damagePolyFillRectangle

The first function calls for each given rectangle up to four times 
damageDamageBox (pDrawable, box, pGC->subWindowMode);

which adds the box to a region. The function then calls damageRegionAppend.

This part takes 30 seconds of my testcase in total. I think the code has
quadratic behaviour here because it adds rectangle by rectangle instead
of first adding them all to a region and then calling damageRegionAppend
once. I think removing the quadratic behaviour can reduce the runtime
significantly.
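
To see why per-rectangle appends could add up, here is a rough cost model. It rests on assumptions that are not verified against damage.c or pixman: appending box number k walks the k-1 boxes already in the region, while a batched build sorts all boxes once in roughly n*log2(n) steps. Both function names are hypothetical.

```c
/* Smallest e with 2^e >= n (0 for n <= 1); capped to avoid overflow. */
static unsigned ceil_log2(unsigned long long n)
{
    unsigned bits = 0;
    unsigned long long p = 1;
    while (p < n && bits < 63) {
        p <<= 1;
        bits++;
    }
    return bits;
}

/* Assumed model: the k-th append visits the k-1 boxes already stored,
 * so n appends visit 0 + 1 + ... + (n-1) = n*(n-1)/2 boxes. */
unsigned long long incremental_visits(unsigned long long n)
{
    return n * (n - 1) / 2;
}

/* Assumed model: collect all boxes, then sort and merge them once,
 * for about n * ceil(log2(n)) steps (at least one step per box). */
unsigned long long batched_visits(unsigned long long n)
{
    unsigned steps = ceil_log2(n);
    return n * (steps > 0 ? steps : 1);
}
```

Under this model, a few thousand boxes per call already put the incremental variant in the tens of millions of box visits, while the batched variant stays around a hundred thousand. The real numbers depend on how the region code merges boxes.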


About 60 seconds are spent in the calls

(*pGC->ops->PolyRectangle)(pDrawable, pGC, nRects, pRects);
(*pGC->ops->PolyFillRect)(pDrawable, pGC, nRects, pRects);

However I do not yet know why they are so slow.

Is damage.c part of EXA?

Christoph


Disabling EXA

2011-11-26 Thread Christoph Bartoschek

Hi,

how can I disable EXA and use XAA? I am on opensuse and added the 
following section to /etc/X11/xorg.conf.d/50-device.conf:


Section "Device"
  Option "AccelMethod" "xaa"
  Identifier "Default Device"
EndSection

Xorg reads the file because it says in its logfile:

[21.481] (==) RADEON(0): Depth 24, (--) framebuffer bpp 32
[21.481] (II) RADEON(0): Pixel depth = 24 bits stored in 4 bytes (32 bpp pixmaps)
[21.481] (==) RADEON(0): Default visual is TrueColor
[21.481] (**) RADEON(0): Option "AccelMethod" "xaa"
[21.481] (==) RADEON(0): RGB weight 888
[21.481] (II) RADEON(0): Using 8 bits per RGB (8 bit DAC)
[21.481] (--) RADEON(0): Chipset: ATI Radeon Mobility X300 (M22) 5460 (PCIE) (ChipID = 0x5460)
[21.481] (II) RADEON(0): PCIE card detected

But EXA is still used:

[21.481] (II) Loading sub module "exa"
[21.481] (II) LoadModule: "exa"
[21.482] (II) Loading /usr/lib/xorg/modules/libexa.so
[21.488] (II) Module exa: vendor="X.Org Foundation"
[21.488]    compiled for 1.10.4, module version = 2.5.0
...
[21.575] (II) RADEON(0): EXA: Driver will allow EXA pixmaps in VRAM
...
[21.635] (II) RADEON(0): Setting EXA maxPitchBytes
[21.635] (II) EXA(0): Driver allocated offscreen pixmaps
[21.635] (II) EXA(0): Driver registered support for the following operations:
[21.635] (II) Solid
[21.635] (II) Copy
[21.635] (II) Composite (RENDER acceleration)
[21.635] (II) UploadToScreen
[21.635] (II) DownloadFromScreen


How can I disable EXA and use XAA? I suspect EXA to be responsible for
a huge performance regression.


Thanks
Christoph


X.org drawing rectangles extremely slow.

2011-08-23 Thread Christoph Bartoschek

Hi,

I have an application that draws rectangles (approx 50). Most of 
them are so small that only a pixel or nothing is visible.


After we upgraded our thin clients we saw a huge performance regression 
in the application.


On our old thin clients with XFree86 identifying itself as X.org: 6.8.1
the application needs 1.5 seconds to draw all rectangles.


With the new thin client it takes 75 seconds. The X.org version is 1.5.2.

On my notebook with X.org 1.9.3 from opensuse 11.4 it still takes 7.4 
seconds. This is about 5x slower than on the old thin client.


Does anyone have an idea what could be wrong with the newer X.org servers?

I've created a perf profile of the X.org server on my notebook:

# Events: 33K cycles
#
# Overhead  Command  Shared Object           Symbol
# ........  .......  ......................  ......
#
    46.83%  Xorg     libpixman-1.so.0.20.0   [.] fbbc
     5.39%  Xorg     libexa.so               [.] 3127
     5.38%  Xorg     radeon_drv.so           [.] c4928
     5.32%  Xorg     Xorg                    [.] 2a343
     4.34%  Xorg     libc-2.11.3.so          [.] _int_malloc
     3.14%  Xorg     libc-2.11.3.so          [.] __GI_memmove
     2.78%  Xorg     [radeon]                [k] r100_cs_packet_parse
     2.38%  Xorg     [radeon]                [k] r100_cs_parse_packet0
     2.18%  Xorg     libc-2.11.3.so          [.] _int_free
     2.14%  Xorg     Xorg                    [.] RegionValidate
     1.87%  Xorg     libdrm_radeon.so.1.0.0  [.] 287d
     1.81%  Xorg     libc-2.11.3.so          [.] __cfree
     1.71%  Xorg     libc-2.11.3.so          [.] __malloc
     1.40%  Xorg     [radeon]                [k] r300_cs_parse
     1.02%  Xorg     libpixman-1.so.0.20.0   [.] pixman_region_union
     0.95%  Xorg     libpixman-1.so.0.20.0   [.] pixman_region_intersect
     0.71%  Xorg     Xorg                    [.] RegionFromRects
     0.66%  Xorg     [kernel.kallsyms]       [k] read_hpet
     0.53%  Xorg     libc-2.11.3.so          [.] __i686.get_pc_thunk.bx


Thanks
Christoph