On Fri, Jun 26, 2015 at 12:40 PM, Marek Olšák <mar...@gmail.com> wrote:
> If p_atomic_read is fine, then this patch is fine too. So you're
> telling that this should work:
>
> while (p_atomic_read(var));

No, this shouldn't work. I don't believe that anyone ever claimed it
ought to. But perhaps we have different ideas of what "p_atomic_read"
should do. I see it as "atomically read a word from memory in a way
that you can never get partial results". The compiler can still
happily optimize this to

x = p_atomic_read(var) (assuming p_atomic_read isn't an out-of-line function)
while (x);

>
> I wouldn't be concerned about a memory barrier. This is only 1 int, so
> it should make its way into the shared cache eventually.

The barrier is for the compiler to know what's going on, not for any
hardware reasons. I *believe* this is the barrier() macro in the
kernel (mb() is an actual memory barrier issued on the CPU).

So something like

#define barrier() __asm__ __volatile__("": : :"memory")

while (*var) {
  barrier();
}

Should have the desired effect. [Or while (p_atomic_read(var)) ... whatever]

>
> Marek
>
>
> On Fri, Jun 26, 2015 at 6:25 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote:
>> p_atomic_read is fine as-is. The read is guaranteed to be atomic up to
>> a word size on x86 processors. I suspect other cpu's have similar
>> guarantees, and if not, then hopefully have other ops to perform the
>> read atomically.
>>
>> On Fri, Jun 26, 2015 at 12:18 PM, Marek Olšák <mar...@gmail.com> wrote:
>>> My question was how to fix p_atomic_read? Would the volatile read that
>>> I proposed work?
>>>
>>> Marek
>>>
>>>
>>> On Fri, Jun 26, 2015 at 5:59 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote:
>>>> The compiler doesn't know that there's another thread. And it won't
>>>> start to assume that there might always be another thread because then
>>>> it could never optimize pointer derefs.
>>>>
>>>> On Fri, Jun 26, 2015 at 11:55 AM, Marek Olšák <mar...@gmail.com> wrote:
>>>>> I assumed the atomic operation in another thread would act as a
>>>>> barrier in this case. Is that right?
>>>>>
>>>>> Marek
>>>>>
>>>>> On Fri, Jun 26, 2015 at 5:47 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote:
>>>>>> On Fri, Jun 26, 2015 at 11:33 AM, Marek Olšák <mar...@gmail.com> wrote:
>>>>>>> I expect the variable will be changed using an atomic operation by the
>>>>>>> CPU, or using a coherent store instruction by the GPU.
>>>>>>>
>>>>>>> If this is wrong and volatile is really required here, then
>>>>>>> p_atomic_read is wrong too. Should we fix it? For example:
>>>>>>>
>>>>>>> #define p_atomic_read(_v) (*(volatile int*)(_v))
>>>>>>>
>>>>>>> Then, os_wait_until_zero can use p_atomic_read.
>>>>>>
>>>>>> The problem is the lack of a memory barrier. A function call to
>>>>>> sched_yield implies a barrier (since the function could well have
>>>>>> changed the memory that var points to), and so it's all fine. But on a
>>>>>> non-PIPE_OS_UNIX os, this becomes
>>>>>>
>>>>>> while (*var);
>>>>>>
>>>>>> Which the compiler would be well within its right to rewrite to
>>>>>>
>>>>>> x = *var;
>>>>>> while (x);
>>>>>>
>>>>>> which wouldn't end well. You need a barrier that tells the compiler
>>>>>> that memory may be invalidated... I believe this maps to either mb()
>>>>>> or barrier() in the linux kernel, and they have some fancy assembly
>>>>>> syntax to tell the compiler "oops, memory may have changed". However
>>>>>> in this case, it may be easier to just mark the var volatile.
>>>>>>
>>>>>>>
>>>>>>> Marek
>>>>>>>
>>>>>>> On Fri, Jun 26, 2015 at 4:48 PM, Ilia Mirkin <imir...@alum.mit.edu> 
>>>>>>> wrote:
>>>>>>>> On Fri, Jun 26, 2015 at 7:05 AM, Marek Olšák <mar...@gmail.com> wrote:
>>>>>>>>> From: Marek Olšák <marek.ol...@amd.com>
>>>>>>>>>
>>>>>>>>> This will be used by radeon and amdgpu winsyses.
>>>>>>>>> Copied from the amdgpu winsys.
>>>>>>>>> ---
>>>>>>>>>  src/gallium/auxiliary/os/os_time.c | 36 
>>>>>>>>> +++++++++++++++++++++++++++++++++++-
>>>>>>>>>  src/gallium/auxiliary/os/os_time.h | 10 ++++++++++
>>>>>>>>>  2 files changed, 45 insertions(+), 1 deletion(-)
>>>>>>>>>
>>>>>>>>> diff --git a/src/gallium/auxiliary/os/os_time.c 
>>>>>>>>> b/src/gallium/auxiliary/os/os_time.c
>>>>>>>>> index f7e4ca4..63b6879 100644
>>>>>>>>> --- a/src/gallium/auxiliary/os/os_time.c
>>>>>>>>> +++ b/src/gallium/auxiliary/os/os_time.c
>>>>>>>>> @@ -33,11 +33,12 @@
>>>>>>>>>   */
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -#include "pipe/p_config.h"
>>>>>>>>> +#include "pipe/p_defines.h"
>>>>>>>>>
>>>>>>>>>  #if defined(PIPE_OS_UNIX)
>>>>>>>>>  #  include <time.h> /* timeval */
>>>>>>>>>  #  include <sys/time.h> /* timeval */
>>>>>>>>> +#  include <sched.h> /* sched_yield */
>>>>>>>>>  #elif defined(PIPE_SUBSYSTEM_WINDOWS_USER)
>>>>>>>>>  #  include <windows.h>
>>>>>>>>>  #else
>>>>>>>>> @@ -92,3 +93,36 @@ os_time_sleep(int64_t usecs)
>>>>>>>>>  }
>>>>>>>>>
>>>>>>>>>  #endif
>>>>>>>>> +
>>>>>>>>> +
>>>>>>>>> +bool os_wait_until_zero(int *var, uint64_t timeout)
>>>>>>>>
>>>>>>>> Does this need to be volatile?
>>>>>>>>
>>>>>>>>> +{
>>>>>>>>> +   if (!*var)
>>>>>>>>> +      return true;
>>>>>>>>> +
>>>>>>>>> +   if (!timeout)
>>>>>>>>> +      return false;
>>>>>>>>> +
>>>>>>>>> +   if (timeout == PIPE_TIMEOUT_INFINITE) {
>>>>>>>>> +      while (*var) {
>>>>>>>>> +#if defined(PIPE_OS_UNIX)
>>>>>>>>> +         sched_yield();
>>>>>>>>> +#endif
>>>>>>>>> +      }
>>>>>>>>> +      return true;
>>>>>>>>> +   }
>>>>>>>>> +   else {
>>>>>>>>> +      int64_t start_time = os_time_get_nano();
>>>>>>>>> +      int64_t end_time = start_time + timeout;
>>>>>>>>> +
>>>>>>>>> +      while (*var) {
>>>>>>>>> +         if (os_time_timeout(start_time, end_time, 
>>>>>>>>> os_time_get_nano()))
>>>>>>>>> +            return false;
>>>>>>>>> +
>>>>>>>>> +#if defined(PIPE_OS_UNIX)
>>>>>>>>> +         sched_yield();
>>>>>>>>> +#endif
>>>>>>>>> +      }
>>>>>>>>> +      return true;
>>>>>>>>> +   }
>>>>>>>>> +}
>>>>>>>>> diff --git a/src/gallium/auxiliary/os/os_time.h 
>>>>>>>>> b/src/gallium/auxiliary/os/os_time.h
>>>>>>>>> index 4fab03c..fdc8040 100644
>>>>>>>>> --- a/src/gallium/auxiliary/os/os_time.h
>>>>>>>>> +++ b/src/gallium/auxiliary/os/os_time.h
>>>>>>>>> @@ -94,6 +94,16 @@ os_time_timeout(int64_t start,
>>>>>>>>>  }
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> +/**
>>>>>>>>> + * Wait until the variable at the given memory location is zero.
>>>>>>>>> + *
>>>>>>>>> + * \param var           variable
>>>>>>>>> + * \param timeout       timeout, can be anything from 0 (no wait) to
>>>>>>>>> + *                      PIPE_TIME_INFINITE (wait forever)
>>>>>>>>> + * \return     true if the variable is zero
>>>>>>>>> + */
>>>>>>>>> +bool os_wait_until_zero(int *var, uint64_t timeout);
>>>>>>>>> +
>>>>>>>>>  #ifdef __cplusplus
>>>>>>>>>  }
>>>>>>>>>  #endif
>>>>>>>>> --
>>>>>>>>> 2.1.0
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> mesa-dev mailing list
>>>>>>>>> mesa-dev@lists.freedesktop.org
>>>>>>>>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to