Jesse Barnes wrote:
> On Wednesday, March 19, 2008 3:14 pm Thomas Hellström wrote:
>>> IIRC Eric had the relocation costs down in the "negligible" range, but
>>> with the latest Mesa & DRM bits, applying relocations seems to be a big
>>> part of openarena profiles at least:
>>>
>>> samples  %        app name                  symbol name
>>> 27354    11.0340  libopenal.so.0.0.0        (no symbols)
>>> 26907    10.8537  ioquake3.x86_64           (no symbols)
>>> 25328    10.2167  i915                      i915_apply_reloc
>>> 10186     4.1088  i965_dri.so               search_cache
>>> 9411      3.7962  intel_drv.so              i830SetLVDSPanelPower
>>> 8920      3.5981  i915                      i915_flush_ttm
>>> 7538      3.0407  cgame.o_uaKVFT (deleted)  (no symbols)
>>> 6286      2.5356  libc-2.7.so               memcpy
>>> 5398      2.1774  vmlinux                   read_hpet
>>> 4768      1.9233  vmlinux                   clear_page_c
>>> 4037      1.6284  i965_dri.so               _mesa_UpdateTexEnvProgram
>>> 3824      1.5425  libpthread-2.7.so         pthread_mutex_lock
>>> 3655      1.4743  vmlinux                   mwait_idle_with_hints
>>> 3015      1.2162  vmlinux                   acpi_os_read_port
>>> 2915      1.1758  i965_dri.so               dri_ttm_bo_process_reloc
>>> 2830      1.1416  drm                       drm_ht_find_key
>>> 2629      1.0605  vmlinux                   acpi_idle_enter_bm
>>> 2563      1.0339  opreport                  (no symbols)
>>>
>>> I'm using the below profiling script to setup oprofile
>>> (i830SetLVDSPanelPower is still in there because profiling started right
>>> near the end of openarena's modesetting, which called dpms off/on).
>>>
>>> Thanks,
>>> Jesse
>>>
>>> opcontrol --reset
>>> openarena +exec anholt 2>&1 | egrep -e '[0-9]+ frames' &
>>> OPENARENA=$!
>>> sleep 10 # avoid openarena jit & mode setting
>>> opcontrol --start
>>> wait $OPENARENA
>>> opcontrol --dump
>>> opreport -t 1 -l
>>> opcontrol --stop
>>>
>> Jesse,
>> The post-reloc branch should not in any way alter the way relocations
>> are performed on the mesa master drivers, since they are still using
>> relocation type 0. Post-relocs only affect relocation type 1.
>>
>
> Ah ok...
>
>> So the performance degradation is probably caused by something else.
>> Could you narrow it down with a git-bisect?
>>
>
> I'm not even sure there was a performance degradation. At 1024x768 I'm
> seeing ~46 FPS with Eric's demo regardless of whether the PRESUMED_OFFSET
> stuff is enabled or not, which doesn't sound too unreasonable. I was just
> worried that the profile might be way different than what I was hearing
> from Eric, but that could easily have been due to differences in the bits
> we're testing or the fact that he was using sysprof and not oprofile.
>
> Jesse
>
I would have thought that the PRESUMED_OFFSET stuff would take care of
most relocations. However, when type 0 relocations _need_ to be performed,
there's a recent commit that evicts the relocatee first. For an app with
many performed relocations this would probably show up on a profile.

I think the evict is needed to force a clflush() on the relocatee, but it
should be more efficient to just clflush() the cache line of the value
just written, while keeping the relocatee bound...
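To make the suggestion concrete, here is a rough userspace sketch of the idea, not the actual i915 code. The struct, the function names, and the 64-byte line size are all assumptions made for illustration; in the kernel the flush would go through the kernel's own cache-flush helpers rather than the compiler intrinsic used here. It skips the write when the presumed offset turns out to be right, and otherwise patches the dword and flushes only the cache line that holds it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>          /* _mm_clflush, _mm_mfence */
#define FLUSH_LINE(p) do { _mm_clflush(p); _mm_mfence(); } while (0)
#else
#define FLUSH_LINE(p) ((void)(p))  /* no-op fallback so the sketch still builds */
#endif

#define CACHE_LINE 64u  /* assumed line size; real code should query the CPU */

/* Hypothetical relocation record, NOT the real DRM structure. */
struct reloc {
    size_t   offset;           /* byte offset of the dword to patch */
    uint32_t delta;            /* added to the target buffer's offset */
    uint32_t presumed_offset;  /* target offset userspace assumed */
};

/* Start address of the cache line that contains p. */
static inline uintptr_t cache_line_base(const void *p)
{
    return (uintptr_t)p & ~(uintptr_t)(CACHE_LINE - 1);
}

/*
 * Apply one relocation to a mapped relocatee buffer.
 * Returns 0 if the presumed offset was correct (nothing written),
 * 1 if the dword was rewritten and its cache line flushed.
 */
static int apply_reloc(uint8_t *relocatee, const struct reloc *r,
                       uint32_t target_offset)
{
    uint8_t *where = relocatee + r->offset;
    uint32_t val;

    /* An aligned dword never straddles two cache lines. */
    assert(((uintptr_t)where % sizeof(uint32_t)) == 0);

    if (r->presumed_offset == target_offset)
        return 0;  /* userspace already wrote the right value */

    val = target_offset + r->delta;
    memcpy(where, &val, sizeof val);

    /* Flush only the line just written; the buffer stays bound. */
    FLUSH_LINE((const void *)cache_line_base(where));
    return 1;
}
```

The point of the sketch is the cost model: a skipped relocation costs one compare, and a performed one costs a 4-byte store plus a single line flush, instead of evicting and rebinding the whole relocatee.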
/Thomas