Re: [Intel-gfx] looks like a issue in do_switch() and mi_set_context() in i915_gem_context.c?

2015-04-02 Thread Chris Wilson
On Wed, Apr 01, 2015 at 08:01:56PM +0800, Zhi Wang wrote:
 Hi Chris:
 Thanks for the reply. :) I can understand that the backing
 storage is pinned at this time, as the reference counter of context
 object should not be zero. But for VMA, is there any chance that the
 vma will be unbinded from GGTT at this time by shrinker? I saw that
 i915_gem_object_ggtt_unpin() will decrease the VMA reference
 counter...

In order for the shrinker to evict an active object, it must first wait
upon it. (So the shrinker will only do so as a last gasp measure.) Once
the vma is unbound, we know that the GPU will have switched contexts
away from the vma (because the last request that we waited upon for the
vma included the instructions to do the switch away) and so the pages are
swappable.

This obviously relies on the hardware being correct... As would waiting
upon the CCID!
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] looks like a issue in do_switch() and mi_set_context() in i915_gem_context.c?

2015-04-02 Thread Wang, Zhi A
Hi Chris:
I begin to understand that before the prev context object is unpinned, 
it's set to active by i915_vma_move_to_active, so the shrinker will wait for 
it. Thanks for the help.  Every time I learned a lot from you. Thanks. :)

-Original Message-
From: Chris Wilson [mailto:ch...@chris-wilson.co.uk] 
Sent: Thursday, April 02, 2015 3:18 PM
To: Wang, Zhi A
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [Intel-gfx] looks like a issue in do_switch() and mi_set_context() 
in i915_gem_context.c?

On Wed, Apr 01, 2015 at 08:01:56PM +0800, Zhi Wang wrote:
 Hi Chris:
 Thanks for the reply. :) I can understand that the backing storage 
 is pinned at this time, as the reference counter of context object 
 should not be zero. But for VMA, is there any chance that the vma will 
 be unbinded from GGTT at this time by shrinker? I saw that
 i915_gem_object_ggtt_unpin() will decrease the VMA reference 
 counter...

In order for the shrinker to evict an active object, it must first wait upon 
it. (So the shrinker will only do so as a last gasp measure.) Once the vma is 
unbound, we know that the GPU will have switched contexts away from the vma 
(because the last request that we waited upon for the vma included the 
instructions to do the switch away) and so the pages are swappable.

This obviously relies on the hardware being correct... As would waiting upon 
the CCID!
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] looks like a issue in do_switch() and mi_set_context() in i915_gem_context.c?

2015-04-01 Thread Wang, Zhi A
Hi Experts and Gurus:
   I'm learning i915 and It looks like in i915_gem_context.c: mi_set_context(), 
the last intel_ring_advance(); is a lazy ring tail write. So I think the ring 
buffer contains MI_SET_CONTEXT will not be submitted at this time, but in the 
caller: do_switch() it will unpin the backing memory of the context GEM object. 
From this time the backing memory may be swapped out? Then in the execbuffer 
routines, the MI_SET_CONTEXT may save current HW state into an unexpected 
location I guess?

do_switch()
- mi_set_context()
- intel_ring_begin()
 -emit MI_SET_CONTEXT (save current HW state to prev context and 
load engine state from next context, but it won't be submitted at this time)
 -intel_ring_advance()
- unpin and unreference prev context(from this time the prev context 
can be swapped by shrinker I think) 

Then we come into the time of real submission:

i915_gem_execbuffer_retire_commands()
- __intel_ring_advance() (MI_SET_CONTEXT got submitted at this time, but 
the prev context in do_switch() may be swapped out and turned into invalid? 
If GPU save current engine state into prev context, the prev context should 
be corrupted? When it got switch back, there should be problem I think.)

I think a solution should be:
- Changing intel_ring_advance() to __intel_ring_davnce() in mi_set_context()- 
submit the MI_SET_CONTEXT ring buffer directly at  this time.
- Polling CCID register until the new engine state is load into HW.

Then the prev context can be unpinned and unreferenced safely?

I'm just a beginner, If you can confirm this is an issue or a potential 
problem, I can cook a patch. :)

Thanks,
Zhi.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] looks like a issue in do_switch() and mi_set_context() in i915_gem_context.c?

2015-04-01 Thread Chris Wilson
On Wed, Apr 01, 2015 at 03:52:17PM +, Wang, Zhi A wrote:
 Hi Experts and Gurus:
I'm learning i915 and It looks like in i915_gem_context.c: 
 mi_set_context(), the last intel_ring_advance(); is a lazy ring tail write. 
 So I think the ring buffer contains MI_SET_CONTEXT will not be submitted at 
 this time, but in the caller: do_switch() it will unpin the backing memory of 
 the context GEM object. From this time the backing memory may be swapped out? 
 Then in the execbuffer routines, the MI_SET_CONTEXT may save current HW state 
 into an unexpected location I guess?
 
 do_switch()
 - mi_set_context()
 - intel_ring_begin()
  -emit MI_SET_CONTEXT (save current HW state to prev context and 
 load engine state from next context, but it won't be submitted at this time)
  -intel_ring_advance()
 - unpin and unreference prev context(from this time the prev context 
 can be swapped by shrinker I think) 
 
 Then we come into the time of real submission:
 
 i915_gem_execbuffer_retire_commands()
 - __intel_ring_advance() (MI_SET_CONTEXT got submitted at this time, but 
 the prev context in do_switch() may be swapped out and turned into invalid? 
 If GPU save current engine state into prev context, the prev context 
 should be corrupted? When it got switch back, there should be problem I 
 think.)
 
 I think a solution should be:
 - Changing intel_ring_advance() to __intel_ring_davnce() in 
 mi_set_context()- submit the MI_SET_CONTEXT ring buffer directly at  this 
 time.
 - Polling CCID register until the new engine state is load into HW.
 
 Then the prev context can be unpinned and unreferenced safely?
 
 I'm just a beginner, If you can confirm this is an issue or a potential 
 problem, I can cook a patch. :)

It's not a problem. The old context is pinned by the active reference
until the MI_SET_CONTEXT switch is completed by hardware at an
indefinite point in the future.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] looks like a issue in do_switch() and mi_set_context() in i915_gem_context.c?

2015-04-01 Thread Zhi Wang

Hi Chris:
Thanks for the reply. :) I can understand that the backing storage 
is pinned at this time, as the reference counter of context object 
should not be zero. But for VMA, is there any chance that the vma will 
be unbinded from GGTT at this time by shrinker? I saw that 
i915_gem_object_ggtt_unpin() will decrease the VMA reference counter...


Thanks,
Zhi.

On 04/01/15 23:58, Chris Wilson wrote:

On Wed, Apr 01, 2015 at 03:52:17PM +, Wang, Zhi A wrote:

Hi Experts and Gurus:
I'm learning i915 and It looks like in i915_gem_context.c: 
mi_set_context(), the last intel_ring_advance(); is a lazy ring tail write. So 
I think the ring buffer contains MI_SET_CONTEXT will not be submitted at this 
time, but in the caller: do_switch() it will unpin the backing memory of the 
context GEM object. From this time the backing memory may be swapped out? Then 
in the execbuffer routines, the MI_SET_CONTEXT may save current HW state into 
an unexpected location I guess?

do_switch()
 - mi_set_context()
 - intel_ring_begin()
  -emit MI_SET_CONTEXT (save current HW state to prev context and load engine 
state from next context, but it won't be submitted at this time)
  -intel_ring_advance()
 - unpin and unreference prev context(from this time the prev context 
can be swapped by shrinker I think)

Then we come into the time of real submission:

i915_gem_execbuffer_retire_commands()
 - __intel_ring_advance() (MI_SET_CONTEXT got submitted at this time, but the prev context 
in do_switch() may be swapped out and turned into invalid? If GPU save current engine state into 
prev context, the prev context should be corrupted? When it got switch back, there 
should be problem I think.)

I think a solution should be:
- Changing intel_ring_advance() to __intel_ring_davnce() in mi_set_context()- 
submit the MI_SET_CONTEXT ring buffer directly at  this time.
- Polling CCID register until the new engine state is load into HW.

Then the prev context can be unpinned and unreferenced safely?

I'm just a beginner, If you can confirm this is an issue or a potential 
problem, I can cook a patch. :)


It's not a problem. The old context is pinned by the active reference
until the MI_SET_CONTEXT switch is completed by hardware at an
indefinite point in the future.
-Chris


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx