On 11/22/2012 04:51 PM, Maarten Lankhorst wrote:
> On 21-11-12 14:27, Thomas Hellstrom wrote:
>> On 11/21/2012 02:12 PM, Maarten Lankhorst wrote:
>>> On 21-11-12 13:42, Thomas Hellstrom wrote:
>>>> On 11/21/2012 12:38 PM, Maarten Lankhorst wrote:
>>>>> Hey,
>>>>>
>>>>> On 20-11-12 16:08, Thomas Hellstrom wrote:
>>>>>> On 11/20/2012 02:13 PM, Maarten Lankhorst wrote:
>>>>>>> On 20-11-12 13:03, Thomas Hellstrom wrote:
>>>>>>>> On 11/20/2012 12:33 PM, Maarten Lankhorst wrote:
>>>>>>>>> On 20-11-12 08:48, Thomas Hellstrom wrote:
>>>>>>>>>> On 11/19/2012 04:33 PM, Maarten Lankhorst wrote:
>>>>>>>>>>> On 19-11-12 16:04, Thomas Hellstrom wrote:
>>>>>>>>>>>> On 11/19/2012 03:17 PM, Thomas Hellstrom wrote:
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> This patch looks mostly good, although I think 
>>>>>>>>>>>>> ttm_bo_cleanup_refs becomes overly complicated:
>>>>>>>>>>>>> Could this do, or am I missing something?
>>>>>>>>>>>>>
>>>>>>>>>>>> Actually, my version is bad, because ttm_bo_wait() is called with 
>>>>>>>>>>>> the lru lock held.
>>>>>>>>>>>>
>>>>>>>>>>>> /Thomas
>>>>>>>>>>> Oh digging through it made me remember why I had to release the 
>>>>>>>>>>> reservation early and
>>>>>>>>>>> had to allow move_notify to be called without reservation.
>>>>>>>>>>>
>>>>>>>>>>> Fortunately that only happens when move_notify is called with a
>>>>>>>>>>> NULL parameter, so you can still do a
>>>>>>>>>>> BUG_ON(mem != NULL && !ttm_bo_reserved(bo)); check in your
>>>>>>>>>>> move_notify handler.
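
For illustration, a driver-side move_notify hook following that suggestion
could look roughly like this; the foo_* names are hypothetical, and the
reservation helper is spelled as in the mail above, so it may differ from
what the tree actually provides:

static void foo_bo_move_notify(struct ttm_buffer_object *bo,
                               struct ttm_mem_reg *mem)
{
        /* A NULL new placement is the only case in which the bo may be
         * unreserved, so only insist on the reservation otherwise. */
        BUG_ON(mem != NULL && !ttm_bo_reserved(bo));

        if (mem == NULL) {
                /* The bo is being destroyed: tear down its GPU mappings. */
                foo_bo_unmap_vm(bo);
                return;
        }

        /* Normal move: update the GPU mappings for the new placement. */
        foo_bo_update_vm(bo, mem);
}
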
>>>>>>>>>>>
>>>>>>>>>>> 05/10 removed the loop and assumed no new fence could be attached 
>>>>>>>>>>> after the driver has
>>>>>>>>>>> declared the bo dead.
>>>>>>>>>>>
>>>>>>>>>>> However, at that point it may no longer hold a reservation to
>>>>>>>>>>> confirm this, which is why I moved the cleanup to be done in the
>>>>>>>>>>> release_list handler. It could still be done in ttm_bo_release,
>>>>>>>>>>> but we no longer have a reservation after we waited. Getting a
>>>>>>>>>>> reservation can fail if the bo is imported, for example.
>>>>>>>>>>>
>>>>>>>>>>> While it is true that in that case a new fence may be attached
>>>>>>>>>>> as well, that would be less harmful, since that operation
>>>>>>>>>>> wouldn't involve this device, so the ttm bo can still be removed.
>>>>>>>>>>> When that time comes I should probably fix up the WARN_ON(ret)
>>>>>>>>>>> in ttm_bo_cleanup_refs. :-)
>>>>>>>>>>>
>>>>>>>>>>> I did add a WARN_ON(!atomic_read(&bo->kref.refcount)); to
>>>>>>>>>>> ttm_bo_reserve and ttm_eu_reserve_buffers to be sure nothing is
>>>>>>>>>>> done on the device itself. If that is too paranoid, those
>>>>>>>>>>> WARN_ONs could be dropped; I'd prefer to leave them in for a
>>>>>>>>>>> kernel release or two. But according to the rules, that would be
>>>>>>>>>>> the only time you could attach a new fence and trigger the
>>>>>>>>>>> WARN_ON for now.
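
For reference, the sanity check described here boils down to something like
the following, called at the start of ttm_bo_reserve() and
ttm_eu_reserve_buffers() (sketch only, with a made-up helper name; the
actual patch may place it differently):

/* Warn if a reservation is attempted on a bo whose reference count has
 * already dropped to zero, i.e. after the final kref_put(); such a bo
 * must never be handed to the device again. */
static inline void ttm_bo_warn_if_dead(struct ttm_buffer_object *bo)
{
        WARN_ON(!atomic_read(&bo->kref.refcount));
}
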
>>>>>>>>>> Hmm, I'd appreciate it if you could group patches with functional
>>>>>>>>>> changes that depend on each other together, and explain "this is
>>>>>>>>>> done because ...", which makes it much easier to review (and to
>>>>>>>>>> follow the commit history in case something goes terribly wrong
>>>>>>>>>> and we need to revert).
>>>>>>>>>>
>>>>>>>>>> Meanwhile I'll take a look at the final ttm_bo.c and see if I can 
>>>>>>>>>> spot any culprits.
>>>>>>>>>>
>>>>>>>>>> In general, as long as a bo is on an LRU list, we must be able to
>>>>>>>>>> attach fences, because of accelerated eviction.
>>>>>>>>> I thought it was deliberately designed in such a way that it is
>>>>>>>>> kept on the lru list, but since it's also on the ddestroy list it
>>>>>>>>> won't start accelerated eviction: eviction branches into
>>>>>>>>> cleanup_refs early, and lru_lock still protects all the list
>>>>>>>>> entries.
>>>>>>>> I used bad wording. I meant that unbinding might be accelerated,
>>>>>>>> but we currently (quite inefficiently) do synchronized unbinding,
>>>>>>>> assuming that only the CPU can do that. When we start to support
>>>>>>>> unsynchronized moves, we need to be able to attach fences at least
>>>>>>>> at the last move_notify(bo, NULL).
>>>>>>> Would you need to wait in that case for fence_wait to complete
>>>>>>> before calling move_notify?
>>>>>>>
>>>>>>> If not, you would still only need to perform one wait, but you'd have
>>>>>>> to make sure move_notify only gets called by one thread before
>>>>>>> checking the fence pointer and performing a wait. At that point you
>>>>>>> still hold the lru_lock though, so it shouldn't be too hard to make
>>>>>>> something safe.
>>>>>> I think a driver that wants to implement asynchronous moves typically
>>>>>> doesn't want to wait before calling move_notify, but may wait in
>>>>>> move_notify or move. Typically (upcoming vmwgfx) it would invalidate
>>>>>> the buffer in move_notify(bo, NULL), attach a fence and then use the
>>>>>> normal delayed destroy to wait on that fence before destroying the
>>>>>> buffer.
>>>>>>
>>>>>> Otherwise, since binds / unbinds are handled in the GPU command
>>>>>> stream, there's never any need to wait for moves except when there's
>>>>>> a CPU access.
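
Roughly, the pattern described here would look like the following (sketch
only; the foo_* names are made up and this is not actual vmwgfx code):

static void foo_move_notify(struct ttm_buffer_object *bo,
                            struct ttm_mem_reg *new_mem)
{
        struct foo_fence *fence;

        /* Only the final move_notify(bo, NULL) is interesting here. */
        if (new_mem != NULL)
                return;

        /* Queue GPU commands that invalidate the buffer's bindings and
         * get back a fence covering that work. */
        fence = foo_emit_invalidate(bo);

        /* Attach it as the bo's sync object so the normal delayed-destroy
         * path waits for the unbind before the buffer is freed. */
        foo_bo_attach_fence(bo, fence);
        foo_fence_unref(&fence);
}
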
>>>>> Well, nouveau actually needs fence_wait to finish first, since vm changes 
>>>>> are out of band.
>>>>> But I guess it should be possible to attach it as work to the fence when 
>>>>> it's signaled, and I
>>>>> may want to do something like that already for performance reasons in a 
>>>>> different place,
>>>>> so I guess it doesn't matter.
>>>> Actions to be performed on fence signaling tend to be very CPU
>>>> consuming, I think due to the context switches involved. We had to
>>>> replace that in the old psb driver and batch things like TTM does
>>>> instead.
>>>>
>>>> Also remember that TTM fences are not required to signal in finite time 
>>>> unless fence_flush is called.
>>>>
>>>> I think nouveau doesn't use fence irqs to signal its fences.
>>>>
>>>>> Is calling move_notify(bo, NULL) legal and a noop the second time?
>>>> I see no fundamental reason why it shouldn't be OK, although we might need 
>>>> to patch drivers to cope with it.
>>>>
>>>>>     That would save a flag in the bo for checking whether it has
>>>>> already been called, although I suppose we could always define a
>>>>> TTM_BO_PRIV_FLAG_* for it otherwise.
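
If the flag route were taken, the guard would amount to something like this
(the flag and helper names below are made up for illustration; priv_flags
is the existing field in struct ttm_buffer_object):

/* Call the driver's move_notify(bo, NULL) at most once per bo, turning
 * any further calls into a noop. */
static void ttm_bo_destroy_notify_once(struct ttm_buffer_object *bo)
{
        if (test_and_set_bit(TTM_BO_PRIV_FLAG_DESTROY_NOTIFIED,
                             &bo->priv_flags))
                return;

        if (bo->bdev->driver->move_notify)
                bo->bdev->driver->move_notify(bo, NULL);
}
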
>>>>>
>>>>> move_notify might end up being called with the lru_lock held, but that 
>>>>> shouldn't be a problem.
>>>> I don't think that's a good idea. Drivers sleeping in move_notify will 
>>>> need to release the spinlock, and that means it's
>>>> better to release it before move_notify is called.
>>> Is the only sleeping being done on fences? In that case we might wish to
>>> split it up into two pieces for destruction: the piece that runs
>>> immediately, and a piece that runs after the new fence has signaled
>>> (current behavior).
>>>
>>> Nouveau needs the final move_notify unmap to be called after the object
>>> is idle, like it is now. It doesn't need to attach a new fence.
>> In that case it might be best to worry about asynchronous stuff later?
>> We will eventually implement it on the new vmwgfx hardware revision, but 
>> it's not ready yet.
>>
>> /Thomas
> OK, sounds good.
>
> In that case, what do you want me to change in the first 4 patches, apart
> from more verbose commit messages?
> - For 03/10 I got that I need to re-add the list_empty check after -EBUSY
> was returned in evict_mem_first.
>
> Also, PATCH 05/10 cleans up the spinning in ttm_bo_cleanup_refs, so I hope
> it's OK that it's a bit ugly in 04/10, as long as it doesn't result in any
> new bugs being introduced.
>
> ~Maarten
>
> PS: I did a plain rebase of my git tree to deal with the conflicts in 
> drm-next.
>

Maarten, it seems to me the patches serve the following purposes (not
necessarily in the correct order):

1) Change the fence lock locking order w.r.t. the LRU lock - should be a
trivial and very small change.
2) Change reservations done from lists to always be trylock reservations,
skipping already-reserved bos (see the sketch below).
3) Remove the lru lock around reservations.
4) Various optimizations / cleanups.

If you could reorganize this into 4 patch series like that, it would be much
easier to follow what happens and why, and it would make it much easier for
me to review. It seems to me that if patch series 1-3 focus on the intended
changes, and the intended changes only, they would be quite small?
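
To make point 2 a bit more concrete, the idea when walking an LRU list is
roughly the following (simplified sketch, not the final patch; the function
name is made up, and the reserve helper and its arguments follow the
current TTM API as far as I remember it):

static struct ttm_buffer_object *
ttm_lru_first_unreserved(struct ttm_bo_global *glob,
                         struct ttm_mem_type_manager *man)
{
        struct ttm_buffer_object *bo;

        spin_lock(&glob->lru_lock);
        list_for_each_entry(bo, &man->lru, lru) {
                /* Trylock the reservation; -EBUSY means someone else
                 * holds it, so just skip to the next bo on the LRU. */
                if (ttm_bo_reserve_locked(bo, false, true, false, 0) == 0) {
                        kref_get(&bo->list_kref);
                        spin_unlock(&glob->lru_lock);
                        return bo;
                }
        }
        spin_unlock(&glob->lru_lock);
        return NULL;
}

The caller would then evict or clean up the returned bo and drop both the
reservation and the list reference, much like the current eviction path
does.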

/Thomas



