On Wed, Nov 26, 2014 at 7:38 AM, Andy Lutomirski <luto at amacapital.net> wrote: > On Tue, Nov 25, 2014 at 10:42 PM, Michel Dänzer <michel at daenzer.net> > wrote: >> On 20.11.2014 09:58, Andy Lutomirski wrote: >>> >>> On Wed, Nov 19, 2014 at 4:07 PM, Andy Lutomirski <luto at amacapital.net> >>> wrote: >>>> >>>> On Tue, Nov 18, 2014 at 11:19 PM, Michel Dänzer <michel at daenzer.net> >>>> wrote: >>>>> >>>>> On 19.11.2014 09:21, Andy Lutomirski wrote: >>>>>> >>>>>> >>>>>> On Mon, Nov 17, 2014 at 1:51 AM, Michel Dänzer <michel at daenzer.net> >>>>>> wrote: >>>>>>> >>>>>>> >>>>>>> On 15.11.2014 07:21, Andy Lutomirski wrote: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On recent kernels (3.16 through 3.18-rc4, perhaps), doing anything >>>>>>>> graphics intensive seems to cause my system to become unusable for >>>>>>>> tens of seconds. Pointing Firefox at Google Maps is a big offender >>>>>>>> -- >>>>>>>> it can take several minutes for me to move my mouse far enough to >>>>>>>> close the tab and get my computer back. >>>>>>>> >>>>>>>> On bootup, I get this warning: >>>>>>>> [drm:btc_dpm_set_power_state] *ERROR* >>>>>>>> rv770_restrict_performance_levels_before_switch failed >>>>>>>> >>>>>>>> Setting radeon.dpm=0 seems to work around this problem at the cost of >>>>>>>> giving my rather slow graphics. >>>>>>>> >>>>>>>> Are there known issues here? >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> Can you bisect the kernel, or at least isolate which kernel version >>>>>>> first >>>>>>> introduced the problem? >>>>>> >>>>>> >>>>>> >>>>>> With whatever userspace I'm running, I'm seeing it 3.13, 3.14, 3.15, >>>>>> 3.16, and 3.18-rc4+. I haven't tried other versions. >>>>>> >>>>>> With radeon.dpm=0, I can still trigger short stalls (around one >>>>>> second), but I seem unable to trigger long stalls easily. (I say >>>>>> easily because, just as I was typing this email, my system stalled for >>>>>> about a minute.) >>>>> >>>>> >>>>> >>>>> I can only think of two things offhand that could cause such extremely >>>>> long >>>>> stalls: Swap thrashing or IRQ storms. >>>>> >>>>> With a setup where you can easily trigger long stalls, can you try >>>>> getting a >>>>> CPU profile for a stall with sysprof or perf? >>>>> >>>>> >>>> >>>> Got one with perf: >>>> >>>> 16.82% Xorg libc-2.18.so [.] >>>> __memcpy_sse2_unaligned >>>> 9.20% swapper [kernel.kallsyms] [k] >>>> intel_idle >>>> 1.00% Xorg [kernel.kallsyms] [k] >>>> evergreen_irq_set >>>> 0.83% firefox libxul.so [.] >>>> 0x0000000001d93281 >>>> 0.69% firefox libxul.so [.] >>>> 0x0000000001d932ad >>>> 0.62% firefox [kernel.kallsyms] [k] >>>> copy_user_generic_string >>>> 0.55% swapper [kernel.kallsyms] [k] >>>> evergreen_irq_ack >>>> 0.54% firefox libpthread-2.18.so [.] >>>> pthread_mutex_lock >>>> 0.52% firefox libpthread-2.18.so [.] >>>> pthread_mutex_unlock >>>> 0.45% Xorg [kernel.kallsyms] [k] >>>> drm_mm_insert_node_in_range_generic >>>> 0.41% Xorg [kernel.kallsyms] [k] >>>> lock_release >>>> 0.40% Xorg [kernel.kallsyms] [k] >>>> lock_acquire >>>> 0.35% firefox firefox [.] >>>> 0x000000000001245d >>>> 0.33% Xorg [kernel.kallsyms] [k] >>>> __module_address >>>> 0.31% firefox [kernel.kallsyms] [k] >>>> clear_page_c >>>> 0.29% Xorg [kernel.kallsyms] [k] >>>> copy_user_generic_string >>>> 0.28% firefox firefox [.] >>>> 0x0000000000013159 >>>> >>>> and: >>>> >>>> Samples: 11K of event 'irq:irq_handler_entry', Event count (approx.): >>>> 11802 >>>> 87.43% swapper [kernel.kallsyms] [k] >>>> handle_irq_event_percpu >>>> 7.52% firefox [kernel.kallsyms] [k] >>>> handle_irq_event_percpu >>>> 1.84% irq/36-ahci [kernel.kallsyms] [k] >>>> handle_irq_event_percpu >>>> 1.14% Xorg [kernel.kallsyms] [k] >>>> handle_irq_event_percpu >>>> 0.75% kworker/5:0 [kernel.kallsyms] [k] >>>> handle_irq_event_percpu >>>> 0.32% gnome-shell [kernel.kallsyms] [k] >>>> handle_irq_event_percpu >>>> 0.25% kworker/5:1H [kernel.kallsyms] [k] >>>> handle_irq_event_percpu >>>> 0.25% Media D~ode #10 [kernel.kallsyms] [k] >>>> handle_irq_event_percpu >>>> 0.19% ImageDe~er #330 [kernel.kallsyms] [k] >>>> handle_irq_event_percpu >>>> 0.07% pulseaudio [kernel.kallsyms] [k] >>>> handle_irq_event_percpu >>>> >>>> The cycles were with -e cycles:pp, so I think that iret would have >>>> shown up if there were enough IRQs to cause the problem. >>>> >>>> I'll build a kernel with latencytop. >>>> >>> >>> I just caught call_rwsem_down_write_failed for 5379 ms in khugepaged >>> (holy crap) and radeon_fence_default_wait for 489.2ms in Xorg. >>> >>> Turning off THP gets rid of the khugepaged thing. The 489.2ms is >>> radeon_fence_default_wait is amazingly reproducible -- I've seen that >>> exact number three times now. >> >> >> Sounds like the long stalls were THP, but the shorter ones might be radeon? >> >> Can you get some call graphs for the profile or from latencytop? Make sure >> at least the kernel is built with frame pointers (CONFIG_FRAME_POINTER=y), >> preferably also userspace (-fno-omit-frame-pointer). > > Will try next week. I'm out of town.
The relevant line from latencytop seems to be: 154 20441402 489139 radeon_fence_default_wait [radeon] fence_wait_timeout ttm_bo_wait [ttm] ttm_bo_move_accel_cleanup [ttm] radeon_move_blit.isra.12 [radeon] radeon_bo_move [radeon] ttm_bo_handle_move_mem [ttm] ttm_bo_evict [ttm] ttm_mem_evict_first [ttm] ttm_bo_mem_space [ttm] ttm_bo_validate [ttm] radeon_bo_fault_reserve_notify [radeon] --Andy > > --Andy > >> >> >> >> -- >> Earthling Michel Dänzer | http://www.amd.com >> Libre software enthusiast | Mesa and X developer > > > > -- > Andy Lutomirski > AMA Capital Management, LLC -- Andy Lutomirski AMA Capital Management, LLC