The compiler doesn't know that there's another thread. And it won't
start to assume that there might always be another thread because then
it could never optimize pointer derefs.

On Fri, Jun 26, 2015 at 11:55 AM, Marek Olšák <mar...@gmail.com> wrote:
> I assumed the atomic operation in another thread would act as a
> barrier in this case. Is that right?
>
> Marek
>
> On Fri, Jun 26, 2015 at 5:47 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote:
>> On Fri, Jun 26, 2015 at 11:33 AM, Marek Olšák <mar...@gmail.com> wrote:
>>> I expect the variable will be changed using an atomic operation by the
>>> CPU, or using a coherent store instruction by the GPU.
>>>
>>> If this is wrong and volatile is really required here, then
>>> p_atomic_read is wrong too. Should we fix it? For example:
>>>
>>> #define p_atomic_read(_v) (*(volatile int*)(_v))
>>>
>>> Then, os_wait_until_zero can use p_atomic_read.
>>
>> The problem is the lack of a memory barrier. A function call to
>> sched_yield implies a barrier (since the function could well have
>> changed the memory that var points to), and so it's all fine. But on a
>> non-PIPE_OS_UNIX os, this becomes
>>
>> while (*var);
>>
>> Which the compiler would be well within its right to rewrite to
>>
>> x = *var;
>> while (x);
>>
>> which wouldn't end well. You need a barrier that tells the compiler
>> that memory may be invalidated... I believe this maps to either mb()
>> or barrier() in the linux kernel, and they have some fancy assembly
>> syntax to tell the compiler "oops, memory may have changed". However
>> in this case, it may be easier to just mark the var volatile.
>>
>>>
>>> Marek
>>>
>>> On Fri, Jun 26, 2015 at 4:48 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote:
>>>> On Fri, Jun 26, 2015 at 7:05 AM, Marek Olšák <mar...@gmail.com> wrote:
>>>>> From: Marek Olšák <marek.ol...@amd.com>
>>>>>
>>>>> This will be used by radeon and amdgpu winsyses.
>>>>> Copied from the amdgpu winsys.
>>>>> ---
>>>>>  src/gallium/auxiliary/os/os_time.c | 36 
>>>>> +++++++++++++++++++++++++++++++++++-
>>>>>  src/gallium/auxiliary/os/os_time.h | 10 ++++++++++
>>>>>  2 files changed, 45 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/src/gallium/auxiliary/os/os_time.c 
>>>>> b/src/gallium/auxiliary/os/os_time.c
>>>>> index f7e4ca4..63b6879 100644
>>>>> --- a/src/gallium/auxiliary/os/os_time.c
>>>>> +++ b/src/gallium/auxiliary/os/os_time.c
>>>>> @@ -33,11 +33,12 @@
>>>>>   */
>>>>>
>>>>>
>>>>> -#include "pipe/p_config.h"
>>>>> +#include "pipe/p_defines.h"
>>>>>
>>>>>  #if defined(PIPE_OS_UNIX)
>>>>>  #  include <time.h> /* timeval */
>>>>>  #  include <sys/time.h> /* timeval */
>>>>> +#  include <sched.h> /* sched_yield */
>>>>>  #elif defined(PIPE_SUBSYSTEM_WINDOWS_USER)
>>>>>  #  include <windows.h>
>>>>>  #else
>>>>> @@ -92,3 +93,36 @@ os_time_sleep(int64_t usecs)
>>>>>  }
>>>>>
>>>>>  #endif
>>>>> +
>>>>> +
>>>>> +bool os_wait_until_zero(int *var, uint64_t timeout)
>>>>
>>>> Does this need to be volatile?
>>>>
>>>>> +{
>>>>> +   if (!*var)
>>>>> +      return true;
>>>>> +
>>>>> +   if (!timeout)
>>>>> +      return false;
>>>>> +
>>>>> +   if (timeout == PIPE_TIMEOUT_INFINITE) {
>>>>> +      while (*var) {
>>>>> +#if defined(PIPE_OS_UNIX)
>>>>> +         sched_yield();
>>>>> +#endif
>>>>> +      }
>>>>> +      return true;
>>>>> +   }
>>>>> +   else {
>>>>> +      int64_t start_time = os_time_get_nano();
>>>>> +      int64_t end_time = start_time + timeout;
>>>>> +
>>>>> +      while (*var) {
>>>>> +         if (os_time_timeout(start_time, end_time, os_time_get_nano()))
>>>>> +            return false;
>>>>> +
>>>>> +#if defined(PIPE_OS_UNIX)
>>>>> +         sched_yield();
>>>>> +#endif
>>>>> +      }
>>>>> +      return true;
>>>>> +   }
>>>>> +}
>>>>> diff --git a/src/gallium/auxiliary/os/os_time.h 
>>>>> b/src/gallium/auxiliary/os/os_time.h
>>>>> index 4fab03c..fdc8040 100644
>>>>> --- a/src/gallium/auxiliary/os/os_time.h
>>>>> +++ b/src/gallium/auxiliary/os/os_time.h
>>>>> @@ -94,6 +94,16 @@ os_time_timeout(int64_t start,
>>>>>  }
>>>>>
>>>>>
>>>>> +/**
>>>>> + * Wait until the variable at the given memory location is zero.
>>>>> + *
>>>>> + * \param var           variable
>>>>> + * \param timeout       timeout, can be anything from 0 (no wait) to
>>>>> + *                      PIPE_TIME_INFINITE (wait forever)
>>>>> + * \return     true if the variable is zero
>>>>> + */
>>>>> +bool os_wait_until_zero(int *var, uint64_t timeout);
>>>>> +
>>>>>  #ifdef __cplusplus
>>>>>  }
>>>>>  #endif
>>>>> --
>>>>> 2.1.0
>>>>>
>>>>> _______________________________________________
>>>>> mesa-dev mailing list
>>>>> mesa-dev@lists.freedesktop.org
>>>>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to