Re: ProcPutImage calls exaDoMoveOutPixmap, 4x slowdown

2008-10-16 Thread Maarten Maathuis
On Thu, Oct 16, 2008 at 5:02 PM, Michel Dänzer
[EMAIL PROTECTED] wrote:
 On Wed, 2008-10-15 at 21:59 +0200, Maarten Maathuis wrote:
 On Wed, Oct 15, 2008 at 9:43 PM, Eric Anholt [EMAIL PROTECTED] wrote:
 
  Migrating out for a write-only operation is just broken, and is the
  thing that should be fixed there.

 There is no actual migration here, just superfluous syncing fixed by my
 patch.

What makes you so sure? The standard thing to do on a fallback is to migrate out.



 I'd like to add that if anything changes in this behaviour, then this
 shouldn't be done quietly, because some may depend on this (offscreen
 memory tiled and needing migration to have something linear available
 for example).

 Sounds like something that could be handled in UploadToScreen.

I'm assuming a case where UTS and DFS do the conversion, but direct
cpu access is a bad idea. Fortunately exa *never* does this currently,
prepare access always triggers a migration out.
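
To make "the conversion" concrete, here is a minimal C sketch of the kind
of tiled-to-linear copy a DFS-style hook could perform. The tile layout
(4x4 bytes, tiles stored in row-major order) and the names tiled_offset()
and detile() are assumptions made for this example only; real hardware
layouts differ and nothing here is taken from a driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { TILE_W = 4, TILE_H = 4 };   /* hypothetical tile size: bytes x rows */

/*
 * Byte offset of pixel (x, y) in a tiled 8bpp surface that is
 * `tiles_per_row` tiles wide. Tiles are stored one after another in
 * row-major order, and each tile is row-major inside.
 */
static size_t
tiled_offset(size_t x, size_t y, size_t tiles_per_row)
{
    size_t tile_index = (y / TILE_H) * tiles_per_row + (x / TILE_W);
    size_t within_tile = (y % TILE_H) * TILE_W + (x % TILE_W);

    return tile_index * (TILE_W * TILE_H) + within_tile;
}

/* Copy a tiled 8bpp surface into a linear buffer with the given pitch. */
static void
detile(uint8_t *linear, size_t linear_pitch,
       const uint8_t *tiled, size_t width, size_t height)
{
    assert(width % TILE_W == 0 && height % TILE_H == 0);
    assert(linear_pitch >= width);

    for (size_t y = 0; y < height; y++)
        for (size_t x = 0; x < width; x++)
            linear[y * linear_pitch + x] =
                tiled[tiled_offset(x, y, width / TILE_W)];
}
```

The point of the sketch is only that the linear view has to be produced
by an explicit copy; the CPU cannot simply be handed a pointer to the
tiled storage.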


 The current {Prepare,Finish}Access isn't completely
 suited for this conversion (exaPixmapIsOffscreen() isn't exported).

 FWIW, exporting exaPixmapIsOffscreen() might make sense anyway though.



 --
 Earthling Michel Dänzer   |  http://tungstengraphics.com
 Libre software enthusiast |  Debian, X and DRI developer


___
xorg mailing list
xorg@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/xorg


Re: ProcPutImage calls exaDoMoveOutPixmap, 4x slowdown

2008-10-16 Thread Adam Jackson
On Thu, 2008-10-16 at 08:02 -0700, Michel Dänzer wrote:
 On Wed, 2008-10-15 at 21:59 +0200, Maarten Maathuis wrote:
  On Wed, Oct 15, 2008 at 9:43 PM, Eric Anholt [EMAIL PROTECTED] wrote:
  
   Migrating out for a write-only operation is just broken, and is the
   thing that should be fixed there.
 
 There is no actual migration here, just superfluous syncing fixed by my
 patch.

The patch looks plausible.  1.5 branch candidate?

  The current {Prepare,Finish}Access isn't completely
  suited for this conversion (exaPixmapIsOffscreen() isn't exported).
 
 FWIW, exporting exaPixmapIsOffscreen() might make sense anyway though.

Yes please.

- ajax



ProcPutImage calls exaDoMoveOutPixmap, 4x slowdown

2008-10-15 Thread Clemens Eisserer
Hi Michel,

Thanks a lot for your investigation.

 Does the attached xserver patch help? Looks like we're syncing
 unnecessarily in the migration no-op case.
Yes, a lot. My benchmark went up from ~12fps to ~19fps and the
fallback is gone according to the profile.
I am still only at 50% of intel-2.1.1/xorg-server-1.3's throughput;
however, a lot of time is spent inside the intel driver. I guess it's
related to the refactoring to make it GEM ready.

Thanks again, Clemens


Re: ProcPutImage calls exaDoMoveOutPixmap, 4x slowdown

2008-10-15 Thread Eric Anholt
On Tue, 2008-10-14 at 16:49 +0200, Maarten Maathuis wrote:
 On Tue, Oct 14, 2008 at 4:02 PM, Clemens Eisserer [EMAIL PROTECTED] wrote:
  Hello,
 
  I've a use-case where the client uploads 32x32 A8 images to a
  256x256x8 pixmap which is later used as a mask in a composition
  operation.
  The test-case is able to render with 40fps on xserver-1.3/intel-2.1.1
  however with the latest GIT of both I only get ~10-15fps.
  Unfortunately I've not been able to create a stand-alone testcase
  which triggers this problem :-/
 
  Using sysprof I can see a lot of time is spent moving data around,
  very strange is that PutImage seems to cause a readback:
  ProcPutImage->ExaCheckPutImage->exaPrepareAccessReg->exaDoMigration->exaDoMoveOutPixmap->exaCopyDirty->exaWaitSync->I830EXASync
  In Composite I see the re-uploading again.
 
  Any idea why ProcPutImage could fall back (there's plenty of free vram)?
  Are there tools / settings which could help me to identify the problem?
 
  Thank you in advance, Clemens
 
 
 I think this is because intel does not provide an UploadToScreen hook
 (because it has no vram). It hasn't made (visible) effort to
 reintegrate UXA in EXA, because you can obviously be a bit smarter than
 what is currently being done. I've got an idea or two on how to
 improve this, but intel should be more than capable in dealing with
 this.

Migrating out for a write-only operation is just broken, and is the
thing that should be fixed there.
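
A toy C model of that point, for illustration: a CPU fallback only needs
old pixels read back when it reads the destination. Everything here is
hypothetical (the 64-tile validity bitmasks, struct toy_pixmap and
prepare_cpu_write() are not EXA names), and the write-only case assumes
the operation overwrites every pixel of the tiles it names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical toy model, not EXA code: a pixmap is split into 64 tiles
 * and each bit says whether that tile is current in a given copy.
 */
struct toy_pixmap {
    uint64_t valid_sys;    /* tiles current in system memory */
    uint64_t valid_gpu;    /* tiles current in offscreen memory */
    int tiles_read_back;   /* tiles copied GPU -> system so far */
};

static int
count_bits(uint64_t v)
{
    int n = 0;

    for (; v != 0; v &= v - 1)
        n++;
    return n;
}

/*
 * Get the system copy ready for a CPU fallback that writes `tiles`.
 * Only a fallback that also reads the destination needs the old
 * pixels; a pure write replaces them, so nothing is read back.
 */
static void
prepare_cpu_write(struct toy_pixmap *p, uint64_t tiles, bool write_only)
{
    if (!write_only) {
        uint64_t stale = tiles & ~p->valid_sys;

        assert((stale & ~p->valid_gpu) == 0);   /* some copy must be current */
        p->tiles_read_back += count_bits(stale);
        p->valid_sys |= stale;
    }

    /* After the CPU write, the system copy is the only current one. */
    p->valid_sys |= tiles;
    p->valid_gpu &= ~tiles;
}
```

In this model a PutImage-style write to one tile of a pixmap that lives
entirely offscreen leaves tiles_read_back at zero, while a read-modify-write
fallback over the same area would have to fetch the stale tiles first.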

-- 
Eric Anholt
[EMAIL PROTECTED] [EMAIL PROTECTED]





Re: ProcPutImage calls exaDoMoveOutPixmap, 4x slowdown

2008-10-15 Thread Maarten Maathuis
On Wed, Oct 15, 2008 at 9:43 PM, Eric Anholt [EMAIL PROTECTED] wrote:
 On Tue, 2008-10-14 at 16:49 +0200, Maarten Maathuis wrote:
 On Tue, Oct 14, 2008 at 4:02 PM, Clemens Eisserer [EMAIL PROTECTED] wrote:
  Hello,
 
  I've a use-case where the client uploads 32x32 A8 images to a
  256x256x8 pixmap which is later used as a mask in a composition
  operation.
  The test-case is able to render with 40fps on xserver-1.3/intel-2.1.1
  however with the latest GIT of both I only get ~10-15fps.
  Unfortunately I've not been able to create a stand-alone testcase
  which triggers this problem :-/
 
  Using sysprof I can see a lot of time is spent moving data around,
  very strange is that PutImage seems to cause a readback:
  ProcPutImage->ExaCheckPutImage->exaPrepareAccessReg->exaDoMigration->exaDoMoveOutPixmap->exaCopyDirty->exaWaitSync->I830EXASync
  In Composite I see the re-uploading again.
 
  Any idea why ProcPutImage could fall back (there's plenty of free vram)?
  Are there tools / settings which could help me to identify the problem?
 
  Thank you in advance, Clemens
 

 I think this is because intel does not provide an UploadToScreen hook
 (because it has no vram). It hasn't made (visible) effort to
 reintegrate UXA in EXA, because you can obviously be a bit smarter than
 what is currently being done. I've got an idea or two on how to
 improve this, but intel should be more than capable in dealing with
 this.

 Migrating out for a write-only operation is just broken, and is the
 thing that should be fixed there.

 --
 Eric Anholt
 [EMAIL PROTECTED] [EMAIL PROTECTED]




I'd like to add that if anything changes in this behaviour, then this
shouldn't be done quietly, because some may depend on this (offscreen
memory tiled and needing migration to have something linear available
for example). The current {Prepare,Finish}Access isn't completely
suited for this conversion (exaPixmapIsOffscreen() isn't exported).

Just my 3 cents.

Maarten.


ProcPutImage calls exaDoMoveOutPixmap, 4x slowdown

2008-10-14 Thread Clemens Eisserer
Hello,

I've a use-case where the client uploads 32x32 A8 images to a
256x256x8 pixmap which is later used as a mask in a composition
operation.
The test-case is able to render with 40fps on xserver-1.3/intel-2.1.1
however with the latest GIT of both I only get ~10-15fps.
Unfortunately I've not been able to create a stand-alone testcase
which triggers this problem :-/

Using sysprof I can see a lot of time is spent moving data around,
very strange is that PutImage seems to cause a readback:
ProcPutImage->ExaCheckPutImage->exaPrepareAccessReg->exaDoMigration->exaDoMoveOutPixmap->exaCopyDirty->exaWaitSync->I830EXASync
In Composite I see the re-uploading again.

Any idea why ProcPutImage could fall back (there's plenty of free vram)?
Are there tools / settings which could help me to identify the problem?

Thank you in advance, Clemens


Re: ProcPutImage calls exaDoMoveOutPixmap, 4x slowdown

2008-10-14 Thread Clemens Eisserer
Hi,

 There is of course a fallback system, which is pretty much a memcpy.
Ah, I guess that was that memcpy I always saw in moveIn / moveOut ;)
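
For readers wondering what "pretty much a memcpy" means in practice,
here is a minimal C sketch of a row-by-row box copy between two buffers
with different pitches. The struct box type and copy_box() are
hypothetical names for this example (not the X server's BoxRec or
exaMemcpyBox); the only assumption is that both buffers hold the same
pixmap, so the box has the same coordinates in each.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical box type for this sketch; not the X server's BoxRec. */
struct box {
    int x1, y1;   /* inclusive top-left corner */
    int x2, y2;   /* exclusive bottom-right corner */
};

/*
 * Copy the pixels inside `b` from one buffer to another, row by row.
 * Only the pitches (bytes per row) may differ between the buffers.
 */
static void
copy_box(uint8_t *dst, size_t dst_pitch,
         const uint8_t *src, size_t src_pitch,
         struct box b, size_t bytes_per_pixel)
{
    assert(b.x1 >= 0 && b.y1 >= 0 && b.x1 <= b.x2 && b.y1 <= b.y2);

    size_t row_bytes = (size_t)(b.x2 - b.x1) * bytes_per_pixel;
    size_t x_offset = (size_t)b.x1 * bytes_per_pixel;

    for (int y = b.y1; y < b.y2; y++)
        memcpy(dst + (size_t)y * dst_pitch + x_offset,
               src + (size_t)y * src_pitch + x_offset,
               row_bytes);
}
```

For the 32x32 A8 uploads discussed in this thread, each box would be 32
rows of 32 bytes, so the copy itself is cheap; the cost being chased
here comes from what happens around it.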

 intel has never had an UploadToScreen hook.
Ah interesting, because I saw 4x better performance with intel-2.1.1 /
xserver-1.3.
With this configuration the PutImage data was just memcpy'd to vram, but
now it seems to be a readback-put-upload cycle :-/
I'll try to find a small test-case and report a bug.

 I'm just mentioning uxa,
 because they did realize exa wasn't perfect for them (in its current
 form), they just haven't fixed exa yet to be a little smarter for
 non-vram cards.
Yes, I also really hope they merge it back soon.

Thanks again, Clemens


Re: ProcPutImage calls exaDoMoveOutPixmap, 4x slowdown

2008-10-14 Thread Maarten Maathuis
On Tue, Oct 14, 2008 at 7:35 PM, Clemens Eisserer [EMAIL PROTECTED] wrote:
 Hi,

 There is of course a fallback system, which is pretty much a memcpy.
 Ah, I guess that was that memcpy I always saw in moveIn / moveOut ;)

 intel has never had an UploadToScreen hook.
 Ah interesting, because I saw 4x better performance with intel-2.1.1 /
 xserver-1.3.
 With this configuration the PutImage data was just memcpy'd to vram, but
 now it seems to be a readback-put-upload cycle :-/
 I'll try to find a small test-case and report a bug.

 I'm just mentioning uxa,
 because they did realize exa wasn't perfect for them (in its current
 form), they just haven't fixed exa yet to be a little smarter for
 non-vram cards.
 Yes, I also really hope they merge it back soon.

 Thanks again, Clemens


2.1.1 probably used XAA as default, which didn't try to accelerate much.

Maarten.


Re: ProcPutImage calls exaDoMoveOutPixmap, 4x slowdown

2008-10-14 Thread Clemens Eisserer
Sorry for the email flood ...

 2.1.1 probably used XAA as default, which didn't try to accelerate much.
No, the results were with EXA enabled - although results with XAA are
again magnitudes better ;)

Thanks, Clemens


Re: ProcPutImage calls exaDoMoveOutPixmap, 4x slowdown

2008-10-14 Thread Clemens Eisserer
Hi,

 I think this is because intel does not provide an UploadToScreen hook
 (because it has no vram). It hasn't made (visible) effort to
 reintegrate UXA in EXA,
Btw. I was using EXA without GEM.
Has the UploadToScreen hook been removed when preparing the driver for
UXA and/or GEM?
One thing puzzles me: if the intel driver does not define
UploadToScreen, how can pixmaps end up in vram at all, or are there
other, slower paths which take care of this in that case?

Thanks a lot, Clemens


Re: ProcPutImage calls exaDoMoveOutPixmap, 4x slowdown

2008-10-14 Thread Michel Dänzer
On Tue, 2008-10-14 at 16:02 +0200, Clemens Eisserer wrote:
 
 I've a use-case where the client uploads 32x32 A8 images to a
 256x256x8 pixmap which is later used as a mask in a composition
 operation.
 The test-case is able to render with 40fps on xserver-1.3/intel-2.1.1
 however with the latest GIT of both I only get ~10-15fps.
 Unfortunately I've not been able to create a stand-alone testcase
 which triggers this problem :-/
 
 Using sysprof I can see a lot of time is spent moving data around,
 very strange is that PutImage seems to cause a readback:
 ProcPutImage->ExaCheckPutImage->exaPrepareAccessReg->exaDoMigration->exaDoMoveOutPixmap->exaCopyDirty->exaWaitSync->I830EXASync
 In Composite I see the re-uploading again.
 
 Any idea why ProcPutImage could fall back (there's plenty of free vram)?
 Are there tools / settings which could help me to identify the problem?

Does the attached xserver patch help? Looks like we're syncing
unnecessarily in the migration no-op case.


-- 
Earthling Michel Dänzer   |  http://tungstengraphics.com
Libre software enthusiast |  Debian, X and DRI developer
diff --git a/exa/exa_migration.c b/exa/exa_migration.c
index 56b6945..c68cd76 100644
--- a/exa/exa_migration.c
+++ b/exa/exa_migration.c
@@ -129,6 +131,7 @@ exaCopyDirty(ExaMigrationPtr migrate, RegionPtr pValidDst, RegionPtr pValidSrc,
 BoxPtr pBox;
 int nbox;
 Bool access_prepared = FALSE;
+Bool need_sync = FALSE;
 
 /* Damaged bits are valid in current copy but invalid in other one */
 if (exaPixmapIsOffscreen(pPixmap)) {
@@ -220,14 +253,15 @@ exaCopyDirty(ExaMigrationPtr migrate, RegionPtr pValidDst, RegionPtr pValidSrc,
 	exaMemcpyBox (pPixmap, pBox,
 			  fallback_src, fallback_srcpitch,
 			  fallback_dst, fallback_dstpitch);
-	}
+	} else
+	need_sync = TRUE;
 
 	pBox++;
 }
 
 if (access_prepared)
 	exaFinishAccess(&pPixmap->drawable, fallback_index);
-else
+else if (need_sync)
 	sync (pPixmap->drawable.pScreen);
 
 pExaPixmap->offscreen = save_offscreen;
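
The control flow the patch introduces can be tried out in isolation with
a small stand-alone C model. This is a sketch of the need_sync idea as
it reads from the diff, not the EXA function itself: struct copy_stats,
transfer_fn, the two stub hooks and copy_dirty_boxes() are hypothetical,
and the counters stand in for the real sync and access calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical counters for this sketch; not part of EXA. */
struct copy_stats {
    int syncs;             /* times we waited for the GPU */
    int cpu_copies;        /* boxes copied by the CPU fallback */
    int access_finishes;   /* times a prepared CPU access was finished */
};

/* Stand-in for an accelerated transfer hook; returns false on failure. */
typedef bool (*transfer_fn)(int box_index);

static bool transfer_ok(int box_index)   { (void)box_index; return true; }
static bool transfer_fail(int box_index) { (void)box_index; return false; }

static void
copy_dirty_boxes(int nbox, transfer_fn transfer, struct copy_stats *stats)
{
    bool access_prepared = false;
    bool need_sync = false;

    assert(nbox >= 0);

    for (int i = 0; i < nbox; i++) {
        if (transfer == NULL || !transfer(i)) {
            /* CPU fallback: access is prepared once, then the box is copied. */
            access_prepared = true;
            stats->cpu_copies++;
        } else {
            /* An accelerated transfer was issued, so a sync is owed. */
            need_sync = true;
        }
    }

    if (access_prepared)
        stats->access_finishes++;
    else if (need_sync)
        stats->syncs++;
    /* With no boxes at all (the migration no-op case) nothing happens. */
}
```

Calling copy_dirty_boxes(0, transfer_ok, &stats) leaves stats.syncs
untouched, whereas a plain "else sync" branch would have waited for the
GPU even though there was nothing to copy.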