On 25 February 2015 at 15:43, Maarten Lankhorst <maarten.lankho...@ubuntu.com> wrote: > Op 25-02-15 om 16:28 schreef Patrick Baggett: >> On Wed, Feb 25, 2015 at 9:07 AM, Maarten Lankhorst < >> maarten.lankho...@ubuntu.com> wrote: >> >>> Op 25-02-15 om 16:04 schreef Patrick Baggett: >>>> On Wed, Feb 25, 2015 at 8:59 AM, Maarten Lankhorst < >>>> maarten.lankho...@ubuntu.com> wrote: >>>> >>>>> Op 25-02-15 om 15:11 schreef Emil Velikov: >>>>>> On 24 February 2015 at 09:01, Maarten Lankhorst >>>>>> <maarten.lankho...@ubuntu.com> wrote: >>>>>>> Only add wrapped bo's and bo's that have been exported through flink >>> or >>>>> dma-buf. >>>>>>> This avoids a lock in the common case, and decreases traversal needed >>>>> for importing >>>>>>> a dma-buf or flink. >>>>>>> >>>>>>> Signed-off-by: Maarten Lankhorst <maarten.lankho...@ubuntu.com> >>>>>>> --- >>>>>>> nouveau/nouveau.c | 47 >>> +++++++++++++++++++++++------------------------ >>>>>>> 1 file changed, 23 insertions(+), 24 deletions(-) >>>>>>> >>>>>>> diff --git a/nouveau/nouveau.c b/nouveau/nouveau.c >>>>>>> index 1c723b9..d411523 100644 >>>>>>> --- a/nouveau/nouveau.c >>>>>>> +++ b/nouveau/nouveau.c >>>>>>> @@ -349,8 +349,8 @@ nouveau_bo_del(struct nouveau_bo *bo) >>>>>>> struct nouveau_bo_priv *nvbo = nouveau_bo(bo); >>>>>>> struct drm_gem_close req = { bo->handle }; >>>>>>> >>>>>>> - pthread_mutex_lock(&nvdev->lock); >>>>>>> - if (nvbo->name) { >>>>>>> + if (nvbo->head.next) { >>>>>>> + pthread_mutex_lock(&nvdev->lock); >>>>>>> if (atomic_read(&nvbo->refcnt) == 0) { >>>>>>> DRMLISTDEL(&nvbo->head); >>>>>>> /* >>>>>>> @@ -365,8 +365,6 @@ nouveau_bo_del(struct nouveau_bo *bo) >>>>>>> } >>>>>>> pthread_mutex_unlock(&nvdev->lock); >>>>>>> } else { >>>>>>> - DRMLISTDEL(&nvbo->head); >>>>>>> - pthread_mutex_unlock(&nvdev->lock); >>>>>>> drmIoctl(bo->device->fd, DRM_IOCTL_GEM_CLOSE, &req); >>>>>>> } >>>>>>> if (bo->map) >>>>>>> @@ -379,7 +377,6 @@ nouveau_bo_new(struct nouveau_device *dev, >>> uint32_t >>>>> flags, uint32_t align, >>>>>>> uint64_t size, union nouveau_bo_config *config, >>>>>>> struct nouveau_bo **pbo) >>>>>>> { >>>>>>> - struct nouveau_device_priv *nvdev = nouveau_device(dev); >>>>>>> struct nouveau_bo_priv *nvbo = calloc(1, sizeof(*nvbo)); >>>>>>> struct nouveau_bo *bo = &nvbo->base; >>>>>>> int ret; >>>>>>> @@ -397,10 +394,6 @@ nouveau_bo_new(struct nouveau_device *dev, >>>>> uint32_t flags, uint32_t align, >>>>>>> return ret; >>>>>>> } >>>>>>> >>>>>>> - pthread_mutex_lock(&nvdev->lock); >>>>>>> - DRMLISTADD(&nvbo->head, &nvdev->bo_list); >>>>>>> - pthread_mutex_unlock(&nvdev->lock); >>>>>>> - >>>>>>> *pbo = bo; >>>>>>> return 0; >>>>>>> } >>>>>>> @@ -457,6 +450,18 @@ nouveau_bo_wrap_locked(struct nouveau_device >>> *dev, >>>>> uint32_t handle, >>>>>>> return -ENOMEM; >>>>>>> } >>>>>>> >>>>>>> +static void >>>>>>> +nouveau_bo_make_global(struct nouveau_bo_priv *nvbo) >>>>>>> +{ >>>>>>> + if (!nvbo->head.next) { >>>>>>> + struct nouveau_device_priv *nvdev = >>>>> nouveau_device(nvbo->base.device); >>>>>>> + pthread_mutex_lock(&nvdev->lock); >>>>>>> + if (!nvbo->head.next) >>>> I guess the bo_make_global call is not particularly sensitive, so >>>>> removing's fine with me. >>>>> >>>> I would be worried about the duplicated check. It seems like a "smart" >>>> compiler could cache the value of "nvbo->head.next" (unless marked as >>>> volatile), rendering the second if() useless. If the field is marked >>>> volatile, then of course, this does not apply. >>>> >>> In that case I would be worried about a compiler that ignores the acquire >>> semantics of pthread_mutex_lock.. >>> >> This isn't about the acquire semantics here, because you're reading it on >> both sides of the lock. The compiler need not reorder the reads at all, >> merely just save the value of the first read. Unless the compiler knows >> that the value can change after the lock is acquired, it can simply cache >> the result. Consider the nearly identical transformation: >> >> const int condition = (!nvbo->head.next); >> if(condition){ >> pthread_mutex_lock(...); >> if(condition){ //redundant, already can assert it is true >> ... >> > So you're saying a compiler can optimize: > > - statement with memory access > - read memory barrier > - statement with memory access > > To drop the second statement? I would worry about your definition of memory > barrier then.. > Patrick thanks for elaborating my concern in a more complete manner.
Maarten, I would assume that the overhead of dropping the first if statement will be negligible wrt the time spent debugging when the compiler does the silly thing. Afaict even with a single if, your patch does reduce the time spend in libdrm_nouveau quite a bit. That said, it's up-to you and others to draw the final call. Cheers, Emil _______________________________________________ Nouveau mailing list Nouveau@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/nouveau