Re: [Intel-gfx] looks like a issue in do_switch() and mi_set_context() in i915_gem_context.c?
On Wed, Apr 01, 2015 at 08:01:56PM +0800, Zhi Wang wrote: Hi Chris: Thanks for the reply. :) I can understand that the backing storage is pinned at this time, as the reference counter of context object should not be zero. But for VMA, is there any chance that the vma will be unbinded from GGTT at this time by shrinker? I saw that i915_gem_object_ggtt_unpin() will decrease the VMA reference counter... In order for the shrinker to evict an active object, it must first wait upon it. (So the shrinker will only do so as a last gasp measure.) Once the vma is unbound, we know that the GPU will have switched contexts away from the vma (because the last request that we waited upon for the vma included the instructions to do the switch away) and so the pages are swappable. This obviously relies on the hardware being correct... As would waiting upon the CCID! -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] looks like a issue in do_switch() and mi_set_context() in i915_gem_context.c?
Hi Chris: I begin to understand that before the prev context object is unpinned, it's set to active by i915_vma_move_to_active, so the shrinker will wait for it. Thanks for the help. Every time I learned a lot from you. Thanks. :) -Original Message- From: Chris Wilson [mailto:ch...@chris-wilson.co.uk] Sent: Thursday, April 02, 2015 3:18 PM To: Wang, Zhi A Cc: intel-gfx@lists.freedesktop.org Subject: Re: [Intel-gfx] looks like a issue in do_switch() and mi_set_context() in i915_gem_context.c? On Wed, Apr 01, 2015 at 08:01:56PM +0800, Zhi Wang wrote: Hi Chris: Thanks for the reply. :) I can understand that the backing storage is pinned at this time, as the reference counter of context object should not be zero. But for VMA, is there any chance that the vma will be unbinded from GGTT at this time by shrinker? I saw that i915_gem_object_ggtt_unpin() will decrease the VMA reference counter... In order for the shrinker to evict an active object, it must first wait upon it. (So the shrinker will only do so as a last gasp measure.) Once the vma is unbound, we know that the GPU will have switched contexts away from the vma (because the last request that we waited upon for the vma included the instructions to do the switch away) and so the pages are swappable. This obviously relies on the hardware being correct... As would waiting upon the CCID! -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] looks like a issue in do_switch() and mi_set_context() in i915_gem_context.c?
Hi Experts and Gurus: I'm learning i915 and It looks like in i915_gem_context.c: mi_set_context(), the last intel_ring_advance(); is a lazy ring tail write. So I think the ring buffer contains MI_SET_CONTEXT will not be submitted at this time, but in the caller: do_switch() it will unpin the backing memory of the context GEM object. From this time the backing memory may be swapped out? Then in the execbuffer routines, the MI_SET_CONTEXT may save current HW state into an unexpected location I guess? do_switch() - mi_set_context() - intel_ring_begin() -emit MI_SET_CONTEXT (save current HW state to prev context and load engine state from next context, but it won't be submitted at this time) -intel_ring_advance() - unpin and unreference prev context(from this time the prev context can be swapped by shrinker I think) Then we come into the time of real submission: i915_gem_execbuffer_retire_commands() - __intel_ring_advance() (MI_SET_CONTEXT got submitted at this time, but the prev context in do_switch() may be swapped out and turned into invalid? If GPU save current engine state into prev context, the prev context should be corrupted? When it got switch back, there should be problem I think.) I think a solution should be: - Changing intel_ring_advance() to __intel_ring_davnce() in mi_set_context()- submit the MI_SET_CONTEXT ring buffer directly at this time. - Polling CCID register until the new engine state is load into HW. Then the prev context can be unpinned and unreferenced safely? I'm just a beginner, If you can confirm this is an issue or a potential problem, I can cook a patch. :) Thanks, Zhi. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] looks like a issue in do_switch() and mi_set_context() in i915_gem_context.c?
On Wed, Apr 01, 2015 at 03:52:17PM +, Wang, Zhi A wrote: Hi Experts and Gurus: I'm learning i915 and It looks like in i915_gem_context.c: mi_set_context(), the last intel_ring_advance(); is a lazy ring tail write. So I think the ring buffer contains MI_SET_CONTEXT will not be submitted at this time, but in the caller: do_switch() it will unpin the backing memory of the context GEM object. From this time the backing memory may be swapped out? Then in the execbuffer routines, the MI_SET_CONTEXT may save current HW state into an unexpected location I guess? do_switch() - mi_set_context() - intel_ring_begin() -emit MI_SET_CONTEXT (save current HW state to prev context and load engine state from next context, but it won't be submitted at this time) -intel_ring_advance() - unpin and unreference prev context(from this time the prev context can be swapped by shrinker I think) Then we come into the time of real submission: i915_gem_execbuffer_retire_commands() - __intel_ring_advance() (MI_SET_CONTEXT got submitted at this time, but the prev context in do_switch() may be swapped out and turned into invalid? If GPU save current engine state into prev context, the prev context should be corrupted? When it got switch back, there should be problem I think.) I think a solution should be: - Changing intel_ring_advance() to __intel_ring_davnce() in mi_set_context()- submit the MI_SET_CONTEXT ring buffer directly at this time. - Polling CCID register until the new engine state is load into HW. Then the prev context can be unpinned and unreferenced safely? I'm just a beginner, If you can confirm this is an issue or a potential problem, I can cook a patch. :) It's not a problem. The old context is pinned by the active reference until the MI_SET_CONTEXT switch is completed by hardware at an indefinite point in the future. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] looks like a issue in do_switch() and mi_set_context() in i915_gem_context.c?
Hi Chris: Thanks for the reply. :) I can understand that the backing storage is pinned at this time, as the reference counter of context object should not be zero. But for VMA, is there any chance that the vma will be unbinded from GGTT at this time by shrinker? I saw that i915_gem_object_ggtt_unpin() will decrease the VMA reference counter... Thanks, Zhi. On 04/01/15 23:58, Chris Wilson wrote: On Wed, Apr 01, 2015 at 03:52:17PM +, Wang, Zhi A wrote: Hi Experts and Gurus: I'm learning i915 and It looks like in i915_gem_context.c: mi_set_context(), the last intel_ring_advance(); is a lazy ring tail write. So I think the ring buffer contains MI_SET_CONTEXT will not be submitted at this time, but in the caller: do_switch() it will unpin the backing memory of the context GEM object. From this time the backing memory may be swapped out? Then in the execbuffer routines, the MI_SET_CONTEXT may save current HW state into an unexpected location I guess? do_switch() - mi_set_context() - intel_ring_begin() -emit MI_SET_CONTEXT (save current HW state to prev context and load engine state from next context, but it won't be submitted at this time) -intel_ring_advance() - unpin and unreference prev context(from this time the prev context can be swapped by shrinker I think) Then we come into the time of real submission: i915_gem_execbuffer_retire_commands() - __intel_ring_advance() (MI_SET_CONTEXT got submitted at this time, but the prev context in do_switch() may be swapped out and turned into invalid? If GPU save current engine state into prev context, the prev context should be corrupted? When it got switch back, there should be problem I think.) I think a solution should be: - Changing intel_ring_advance() to __intel_ring_davnce() in mi_set_context()- submit the MI_SET_CONTEXT ring buffer directly at this time. - Polling CCID register until the new engine state is load into HW. Then the prev context can be unpinned and unreferenced safely? I'm just a beginner, If you can confirm this is an issue or a potential problem, I can cook a patch. :) It's not a problem. The old context is pinned by the active reference until the MI_SET_CONTEXT switch is completed by hardware at an indefinite point in the future. -Chris ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx