Re: ProcPutImage calls exaDoMoveOutPixmap, 4x slowdown
On Thu, Oct 16, 2008 at 5:02 PM, Michel Dänzer [EMAIL PROTECTED] wrote:
> On Wed, 2008-10-15 at 21:59 +0200, Maarten Maathuis wrote:
> > On Wed, Oct 15, 2008 at 9:43 PM, Eric Anholt [EMAIL PROTECTED] wrote:
> > > Migrating out for a write-only operation is just broken, and is the
> > > thing that should be fixed there.
>
> There is no actual migration here, just superfluous syncing fixed by my
> patch.

What makes you so sure? The standard thing to do on a fallback is to migrate out.

> > I'd like to add that if anything changes in this behaviour, then this
> > shouldn't be done quietly, because some may depend on it (offscreen
> > memory being tiled and needing migration to have something linear
> > available, for example).
>
> Sounds like something that could be handled in UploadToScreen.

I'm assuming a case where UTS and DFS do the conversion, but direct CPU access is a bad idea. Fortunately exa *never* does this currently; prepare access always triggers a migration out.

> > The current {Prepare,Finish}Access isn't completely suited for this
> > conversion (exaPixmapIsOffscreen() isn't exported).
>
> FWIW, exporting exaPixmapIsOffscreen() might make sense anyway though.
>
> --
> Earthling Michel Dänzer | http://tungstengraphics.com
> Libre software enthusiast | Debian, X and DRI developer

___
xorg mailing list
xorg@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/xorg
Re: ProcPutImage calls exaDoMoveOutPixmap, 4x slowdown
On Thu, 2008-10-16 at 08:02 -0700, Michel Dänzer wrote:
> On Wed, 2008-10-15 at 21:59 +0200, Maarten Maathuis wrote:
> > On Wed, Oct 15, 2008 at 9:43 PM, Eric Anholt [EMAIL PROTECTED] wrote:
> > > Migrating out for a write-only operation is just broken, and is the
> > > thing that should be fixed there.
>
> There is no actual migration here, just superfluous syncing fixed by my
> patch.

The patch looks plausible. 1.5 branch candidate?

> > The current {Prepare,Finish}Access isn't completely suited for this
> > conversion (exaPixmapIsOffscreen() isn't exported).
>
> FWIW, exporting exaPixmapIsOffscreen() might make sense anyway though.

Yes please.

- ajax
ProcPutImage calls exaDoMoveOutPixmap, 4x slowdown
Hi Michel,

Thanks a lot for your investigation.

> Does the attached xserver patch help? Looks like we're syncing
> unnecessarily in the migration no-op case.

Yes, a lot. My benchmark went up from ~12fps to ~19fps and the fallback is gone according to the profile. I am still only at 50% of intel-2.1.1/xorg-server-1.3's throughput; however, a lot of time is spent inside the intel driver. I guess it's related to the refactoring to make it GEM ready.

Thanks again, Clemens
Re: ProcPutImage calls exaDoMoveOutPixmap, 4x slowdown
On Tue, 2008-10-14 at 16:49 +0200, Maarten Maathuis wrote:
> On Tue, Oct 14, 2008 at 4:02 PM, Clemens Eisserer [EMAIL PROTECTED] wrote:
> > Hello,
> >
> > I have a use case where the client uploads 32x32 A8 images to a
> > 256x256x8 pixmap which is later used as a mask in a composition
> > operation. The test case renders at 40fps on xserver-1.3/intel-2.1.1,
> > but with the latest git of both I only get ~10-15fps. Unfortunately I
> > have not been able to create a stand-alone test case which triggers
> > this problem :-/
> >
> > Using sysprof I can see a lot of time is spent moving data around.
> > What is very strange is that PutImage seems to cause a readback:
> >
> > ProcPutImage->ExaCheckPutImage->exaPrepareAccessReg->exaDoMigration->exaDoMoveOutPixmap->exaCopyDirty->exaWaitSync->I830EXASync
> >
> > In Composite I see the re-uploading again.
> >
> > Any idea why ProcPutImage could fall back (there's plenty of free
> > vram)? Are there tools / settings which could help me to identify the
> > problem?
> >
> > Thank you in advance, Clemens
>
> I think this is because intel does not provide an UploadToScreen hook
> (because it has no vram). It hasn't made (visible) effort to reintegrate
> UXA in EXA, because you can obviously be a bit smarter than what is
> currently being done. I've got an idea or two on how to improve this,
> but intel should be more than capable of dealing with this.

Migrating out for a write-only operation is just broken, and is the thing that should be fixed there.

--
Eric Anholt
[EMAIL PROTECTED] [EMAIL PROTECTED]
Re: ProcPutImage calls exaDoMoveOutPixmap, 4x slowdown
On Wed, Oct 15, 2008 at 9:43 PM, Eric Anholt [EMAIL PROTECTED] wrote:
> On Tue, 2008-10-14 at 16:49 +0200, Maarten Maathuis wrote:
> > On Tue, Oct 14, 2008 at 4:02 PM, Clemens Eisserer [EMAIL PROTECTED] wrote:
> > > Hello,
> > >
> > > I have a use case where the client uploads 32x32 A8 images to a
> > > 256x256x8 pixmap which is later used as a mask in a composition
> > > operation. The test case renders at 40fps on
> > > xserver-1.3/intel-2.1.1, but with the latest git of both I only get
> > > ~10-15fps. Unfortunately I have not been able to create a
> > > stand-alone test case which triggers this problem :-/
> > >
> > > Using sysprof I can see a lot of time is spent moving data around.
> > > What is very strange is that PutImage seems to cause a readback:
> > >
> > > ProcPutImage->ExaCheckPutImage->exaPrepareAccessReg->exaDoMigration->exaDoMoveOutPixmap->exaCopyDirty->exaWaitSync->I830EXASync
> > >
> > > In Composite I see the re-uploading again.
> > >
> > > Any idea why ProcPutImage could fall back (there's plenty of free
> > > vram)? Are there tools / settings which could help me to identify
> > > the problem?
> > >
> > > Thank you in advance, Clemens
> >
> > I think this is because intel does not provide an UploadToScreen hook
> > (because it has no vram). It hasn't made (visible) effort to
> > reintegrate UXA in EXA, because you can obviously be a bit smarter
> > than what is currently being done. I've got an idea or two on how to
> > improve this, but intel should be more than capable of dealing with
> > this.
>
> Migrating out for a write-only operation is just broken, and is the
> thing that should be fixed there.
>
> --
> Eric Anholt
> [EMAIL PROTECTED] [EMAIL PROTECTED]

I'd like to add that if anything changes in this behaviour, then this shouldn't be done quietly, because some may depend on it (offscreen memory being tiled and needing migration to have something linear available, for example). The current {Prepare,Finish}Access isn't completely suited for this conversion (exaPixmapIsOffscreen() isn't exported).

Just my 3 cents.

Maarten.
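To illustrate why tiled offscreen memory needs a conversion step before the CPU can treat it as linear, here is a minimal sketch. The tile size and layout are hypothetical (4x4 bytes, tiles stored one after another in row-major order) and do not describe any real Intel tiling format; `tiled_offset` and `detile_a8` are made-up names, not EXA or driver functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tile geometry, chosen only to keep the example small. */
enum { TILE_W = 4, TILE_H = 4 };

/* Byte offset of pixel (x, y) in a tiled 8-bit surface of the given width.
 * Assumed layout: tiles are stored contiguously, row-major by tile, and
 * pixels inside a tile are row-major as well. */
size_t tiled_offset(size_t x, size_t y, size_t width)
{
    size_t tiles_per_row = width / TILE_W;
    size_t tile_index = (y / TILE_H) * tiles_per_row + (x / TILE_W);

    assert(width % TILE_W == 0);
    return tile_index * (size_t)(TILE_W * TILE_H)
         + (y % TILE_H) * (size_t)TILE_W
         + (x % TILE_W);
}

/* Convert a tiled 8-bit surface into a linear one, pixel by pixel.
 * This is the kind of work a download hook would have to do so that
 * software rendering sees ordinary scanlines. */
void detile_a8(uint8_t *linear, const uint8_t *tiled,
               size_t width, size_t height)
{
    size_t x, y;

    assert(linear != NULL && tiled != NULL);
    assert(height % TILE_H == 0);

    for (y = 0; y < height; y++)
        for (x = 0; x < width; x++)
            linear[y * width + x] = tiled[tiled_offset(x, y, width)];
}
```

With a layout like this, neighbouring pixels on one scanline can be far apart in memory, which is why direct CPU access to the tiled copy is not a drop-in replacement for a linear one.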
ProcPutImage calls exaDoMoveOutPixmap, 4x slowdown
Hello,

I have a use case where the client uploads 32x32 A8 images to a 256x256x8 pixmap which is later used as a mask in a composition operation. The test case renders at 40fps on xserver-1.3/intel-2.1.1, but with the latest git of both I only get ~10-15fps. Unfortunately I have not been able to create a stand-alone test case which triggers this problem :-/

Using sysprof I can see a lot of time is spent moving data around. What is very strange is that PutImage seems to cause a readback:

ProcPutImage->ExaCheckPutImage->exaPrepareAccessReg->exaDoMigration->exaDoMoveOutPixmap->exaCopyDirty->exaWaitSync->I830EXASync

In Composite I see the re-uploading again.

Any idea why ProcPutImage could fall back (there's plenty of free vram)? Are there tools / settings which could help me to identify the problem?

Thank you in advance, Clemens
Re: ProcPutImage calls exaDoMoveOutPixmap, 4x slowdown
Hi,

> There is of course a fallback system, which is pretty much a memcpy.

Ah, I guess that was the memcpy I always saw in moveIn / moveOut ;)

> intel has never had an UploadToScreen hook.

Ah, interesting, because I saw 4x better performance with intel-2.1.1 / xserver-1.3. With that configuration the uploaded data was just memcpy'd to vram, but now it seems to be a readback-put-upload cycle :-/ I'll try to find a small test case and report a bug.

> I'm just mentioning uxa, because they did realize exa wasn't perfect for
> them (in its current form), they just haven't fixed exa yet to be a
> little more smart for non-vram cards.

Yes, I also really hope they merge it back soon.

Thanks again, Clemens
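The "pretty much a memcpy" fallback discussed in this message can be pictured with a small sketch for the use case from the original report: placing a 32x32 A8 image inside a 256x256 A8 mask. This is only an illustration under assumed names (`copy_box_a8` and `put_tile` are hypothetical helpers, not EXA functions), and it assumes both buffers are plain linear memory with a pitch given in bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { MASK_W = 256, MASK_H = 256, TILE = 32 };

/* Copy a w x h box of 8-bit pixels one scanline at a time.
 * Pitches are in bytes and may be larger than the box width. */
void copy_box_a8(uint8_t *dst, size_t dst_pitch,
                 const uint8_t *src, size_t src_pitch,
                 size_t w, size_t h)
{
    size_t row;

    assert(dst != NULL && src != NULL);
    assert(w <= dst_pitch && w <= src_pitch);

    for (row = 0; row < h; row++)
        memcpy(dst + row * dst_pitch, src + row * src_pitch, w);
}

/* Write one 32x32 A8 image into the 256x256 A8 mask at (x, y).
 * Only the destination is written; nothing is read back from it. */
void put_tile(uint8_t *mask, const uint8_t *tile, size_t x, size_t y)
{
    assert(x + TILE <= MASK_W && y + TILE <= MASK_H);
    copy_box_a8(mask + y * MASK_W + x, MASK_W, tile, TILE, TILE, TILE);
}
```

The point of the sketch is that such an upload touches 1 KiB of destination memory per image and is write-only, so a readback of the whole pixmap before it is pure overhead.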
Re: ProcPutImage calls exaDoMoveOutPixmap, 4x slowdown
On Tue, Oct 14, 2008 at 7:35 PM, Clemens Eisserer [EMAIL PROTECTED] wrote:
> Hi,
>
> > There is of course a fallback system, which is pretty much a memcpy.
>
> Ah, I guess that was the memcpy I always saw in moveIn / moveOut ;)
>
> > intel has never had an UploadToScreen hook.
>
> Ah, interesting, because I saw 4x better performance with intel-2.1.1 /
> xserver-1.3. With that configuration the uploaded data was just
> memcpy'd to vram, but now it seems to be a readback-put-upload cycle
> :-/ I'll try to find a small test case and report a bug.
>
> > I'm just mentioning uxa, because they did realize exa wasn't perfect
> > for them (in its current form), they just haven't fixed exa yet to be
> > a little more smart for non-vram cards.
>
> Yes, I also really hope they merge it back soon.
>
> Thanks again, Clemens

2.1.1 probably used XAA as default, which didn't try to accelerate much.

Maarten.
Re: ProcPutImage calls exaDoMoveOutPixmap, 4x slowdown
Sorry for the email flood ...

> 2.1.1 probably used XAA as default, which didn't try to accelerate much.

No, the results were with EXA enabled - although results with XAA are again magnitudes better ;)

Thanks, Clemens
Re: ProcPutImage calls exaDoMoveOutPixmap, 4x slowdown
Hi,

> I think this is because intel does not provide an UploadToScreen hook
> (because it has no vram). It hasn't made (visible) effort to reintegrate
> UXA in EXA,

Btw. I was using EXA without GEM. Was the UploadToScreen hook removed when preparing the driver for UXA and/or GEM?

One thing puzzles me: if the intel driver does not define UploadToScreen, how can pixmaps end up in vram at all? Or are there other, slower paths which take care of this in that case?

Thanks a lot, Clemens
Re: ProcPutImage calls exaDoMoveOutPixmap, 4x slowdown
On Tue, 2008-10-14 at 16:02 +0200, Clemens Eisserer wrote:
> I have a use case where the client uploads 32x32 A8 images to a
> 256x256x8 pixmap which is later used as a mask in a composition
> operation. The test case renders at 40fps on xserver-1.3/intel-2.1.1,
> but with the latest git of both I only get ~10-15fps. Unfortunately I
> have not been able to create a stand-alone test case which triggers this
> problem :-/
>
> Using sysprof I can see a lot of time is spent moving data around. What
> is very strange is that PutImage seems to cause a readback:
>
> ProcPutImage->ExaCheckPutImage->exaPrepareAccessReg->exaDoMigration->exaDoMoveOutPixmap->exaCopyDirty->exaWaitSync->I830EXASync
>
> In Composite I see the re-uploading again.
>
> Any idea why ProcPutImage could fall back (there's plenty of free
> vram)? Are there tools / settings which could help me to identify the
> problem?

Does the attached xserver patch help? Looks like we're syncing unnecessarily in the migration no-op case.

--
Earthling Michel Dänzer | http://tungstengraphics.com
Libre software enthusiast | Debian, X and DRI developer

diff --git a/exa/exa_migration.c b/exa/exa_migration.c
index 56b6945..c68cd76 100644
--- a/exa/exa_migration.c
+++ b/exa/exa_migration.c
@@ -129,6 +131,7 @@ exaCopyDirty(ExaMigrationPtr migrate, RegionPtr pValidDst, RegionPtr pValidSrc,
     BoxPtr pBox;
     int nbox;
     Bool access_prepared = FALSE;
+    Bool need_sync = FALSE;

     /* Damaged bits are valid in current copy but invalid in other one */
     if (exaPixmapIsOffscreen(pPixmap)) {
@@ -220,14 +253,15 @@ exaCopyDirty(ExaMigrationPtr migrate, RegionPtr pValidDst, RegionPtr pValidSrc,
             exaMemcpyBox (pPixmap, pBox,
                           fallback_src, fallback_srcpitch,
                           fallback_dst, fallback_dstpitch);
-        }
+        } else
+            need_sync = TRUE;

         pBox++;
     }

     if (access_prepared)
         exaFinishAccess(&pPixmap->drawable, fallback_index);
-    else
+    else if (need_sync)
         sync (pPixmap->drawable.pScreen);

     pExaPixmap->offscreen = save_offscreen;
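The effect of the patch can be sketched outside the server. The following is a minimal simulation, not EXA code: `box_sim`, `copy_stats` and `copy_dirty_sim` are hypothetical names, and the counters stand in for the real `exaFinishAccess` and sync calls. It assumes, as the diff suggests, that a wait for the GPU is only needed when at least one box went through the driver's accelerated transfer hook and no CPU access was prepared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one dirty box; not an EXA type. */
struct box_sim {
    bool hw_transfer_ok; /* true if the driver's transfer hook handled it */
};

/* Counters recording what the loop would have done. */
struct copy_stats {
    int memcpy_boxes;        /* boxes copied by the CPU fallback */
    int hw_boxes;            /* boxes copied by the accelerated hook */
    int finish_access_calls; /* stands in for exaFinishAccess() */
    int sync_calls;          /* stands in for the sync callback */
};

/* Mirror of the control flow after the patch: sync only if some box was
 * transferred by hardware; with no dirty boxes at all, do nothing. */
struct copy_stats copy_dirty_sim(const struct box_sim *boxes, size_t nbox)
{
    struct copy_stats st = {0, 0, 0, 0};
    bool access_prepared = false;
    bool need_sync = false;
    size_t i;

    assert(boxes != NULL || nbox == 0);

    for (i = 0; i < nbox; i++) {
        if (!boxes[i].hw_transfer_ok) {
            /* CPU fallback: access is prepared once, then memcpy. */
            access_prepared = true;
            st.memcpy_boxes++;
        } else {
            need_sync = true;
            st.hw_boxes++;
        }
    }

    if (access_prepared)
        st.finish_access_calls++;
    else if (need_sync)
        st.sync_calls++;

    return st;
}
```

Calling `copy_dirty_sim(NULL, 0)` models the migration no-op case: before the patch the plain `else` branch would still have waited for the GPU, while here both counters stay at zero.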