On Mon, Jul 14, 2014 at 12:47 AM, Andres Freund <and...@2ndquadrant.com> wrote: > On 2014-07-08 11:51:14 +0530, Amit Kapila wrote: > > On Thu, Jul 3, 2014 at 10:56 AM, Amit Kapila <amit.kapil...@gmail.com> > > wrote:
> > Further review of patch: > > 1. > > /* > > * pg_atomic_test_and_set_flag - TAS() > > * > > * Acquire/read barrier semantics. > > */ > > STATIC_IF_INLINE_DECLARE bool > > pg_atomic_test_set_flag(volatile pg_atomic_flag *ptr); > > > > a. I think Acquire and read barrier semantics aren't equal. > > Right. It was mean as Acquire (thus including read barrier) semantics. Okay, I got confused by reading comments, may be saying the same explicitly in comments is better. > I guess I'll better update README.barrier to have definitions of all > barriers. I think updating about the reasons behind using specific barriers for atomic operations defined by PG can be useful for people who want to understand or use the atomic op API's. > > b. I think it adheres to Acquire semantics for platforms where > > that semantics > > are supported else it defaults to full memory ordering. > > Example : > > #define TAS(lock) (InterlockedCompareExchange(lock, 1, 0)) > > > > Is it right to declare that generic version of function adheres to > > Acquire semantics. > > Implementing stronger semantics than required should always be > fine. There's simply no sane way to work with the variety of platforms > we support in any other way. Agreed, may be we can have a note somewhere either in code or readme to mention about it, if you think that can be helpful. > > > > 2. > > bool > > pg_atomic_test_set_flag_impl(volatile pg_atomic_flag *ptr) > > { > > return TAS((slock_t *) &ptr->sema); > > #define TAS(lock) (InterlockedCompareExchange(lock, 1, 0)) > > Where's that from? Its there in s_lock.h. Refer below code: #ifdef WIN32_ONLY_COMPILER typedef LONG slock_t; #define HAS_TEST_AND_SET #define TAS(lock) (InterlockedCompareExchange(lock, 1, 0)) > I can't recall adding a line of code like that? It is not added by your patch but used in your patch. I am not sure if that is what excatly you want atomic API to define for WIN32, but I think it is picking this definition. I have one question here, if the value of sema is other than 1 or 0, then it won't set the flag and not sure what it will return (ex. if the value is negative), so is this implementation completely sane? > > a. In above usage (ptr->sema), sema itself is not declared as volatile, > > so would it be right to use it in this way for API > > (InterlockedCompareExchange) > > expecting volatile. > > Upgrading a pointer to volatile is defined, so I don't see the problem? Thats okay. However all other structure definitions use it as volatile, example for atomics-arch-amd64.h it is defined as follows: #define PG_HAVE_ATOMIC_FLAG_SUPPORT typedef struct pg_atomic_flag { volatile char value; } pg_atomic_flag; So is there any harm if we keep this definition in sync with others? > > 3. > > /* > > * pg_atomic_unlocked_test_flag - TAS() > > * > > * No barrier semantics. > > */ > > STATIC_IF_INLINE_DECLARE bool > > pg_atomic_unlocked_test_flag(volatile pg_atomic_flag *ptr); > > > > a. There is no setting in this, so using TAS in function comments > > seems to be wrong. > > Good point. > > > b. BTW, I am not able see this function in C11 specs. > > So? It's needed for a sensible implementation of spinlocks ontop of > atomics. Okay got the point, do you think it makes sense to have a common agreement/consensus w.r.t which API's are necessary as a first cut. For me as a reviewer, if the API is implemented as per specs or what it is desired to then we can have it, however I am not sure if there is general consensus or at least some people agreed to set of API's which are required for first cut, have I missed some conclusion/discussion about it. > > > 5. > > atomics-generic-gcc.h > > #ifndef PG_HAS_ATOMIC_UNLOCKED_TEST_FLAG > > #define PG_HAS_ATOMIC_UNLOCKED_TEST_FLAG > > static inline bool > > pg_atomic_unlocked_test_flag_impl(volatile pg_atomic_flag *ptr) > > { > > return ptr->value == 0; > > } > > #endif > > > > Don't we need to compare it with 1 instead of 0? > > Why? It returns true if the lock is free, makes sense imo? Other implementation in atomics-generic.h seems to implement it other way #define PG_HAS_ATOMIC_UNLOCKED_TEST_FLAG static inline bool pg_atomic_unlocked_test_flag_impl(volatile pg_atomic_flag *ptr) { return pg_atomic_read_u32_impl(ptr) == 1; } > Will add comments to atomics.h I think that will be helpful. > > 6. > > atomics-fallback.h > > pg_atomic_unlocked_test_flag_impl(volatile pg_atomic_flag *ptr) > > { > > /* > > * Can't do this in the semaphore based implementation - always return > > * true. Do this in the header so compilers can optimize the test away. > > */ > > return true; > > } > > > > Why we can't implement this in semaphore based implementation? > > Because semaphores don't provide the API for it? Okay, but it is not clear from comments why this test should always return true, basically I am not able to think why incase of missing API returning true is correct behaviour for this API? > > 7. > > /* > > * pg_atomic_clear_flag - release lock set by TAS() > > * > > * Release/write barrier semantics. > > */ > > STATIC_IF_INLINE_DECLARE void > > pg_atomic_clear_flag(volatile pg_atomic_flag *ptr); > > > > a. Are Release and write barriers equivalent? > > Same answer as above. Meant as "Release (thus including write barrier) semantics" > > > b. IIUC then current code doesn't have release semantics for unlock/clear. > > If you're referring to s_lock.h with 'current code', yes it should have: I was referring to next point (point number 8) which you have clarified below. > > > 8. > > #define PG_HAS_ATOMIC_CLEAR_FLAG > > static inline void > > pg_atomic_clear_flag_impl(volatile pg_atomic_flag *ptr) > > { > > /* XXX: release semantics suffice? */ > > pg_memory_barrier_impl(); > > pg_atomic_write_u32_impl(ptr, 0); > > } > > > > Are we sure that having memory barrier before clearing flag will > > achieve *Release* semantics; as per my understanding the definition > > of Release sematics is as below and above doesn't seem to follow the > > same. > > Yes, a memory barrier guarantees a release semantics. It guarantees > more, but that's not a problem. You are right. > > 9. > > static inline uint32 > > pg_atomic_exchange_u32_impl(volatile pg_atomic_uint32 *ptr, uint32 xchg_) > > { > > uint32 old; > > while (true) > > { > > old = pg_atomic_read_u32_impl(ptr); > > if (pg_atomic_compare_exchange_u32_impl(ptr, &old, xchg_)) > > break; > > } > > return old; > > } > > > > What is the reason to implement exchange as compare-and-exchange, at least > > for > > Windows I know corresponding function (InterlockedExchange) exists. > > I could not see any other implementation of same. > > I think this at least deserves comment explaining why we have done > > the implementation this way. > > Because Robert and Tom insist that we shouldn't add more operations > employing hardware features. I think that's ok for now, we can always > add more capabilities later. No issues, may be a comment indicating some form of reason would be good. > > 10. > > STATIC_IF_INLINE_DECLARE uint32 > > pg_atomic_fetch_add_u32(volatile pg_atomic_uint32 *ptr, int32 add_); > > STATIC_IF_INLINE_DECLARE uint32 > > pg_atomic_fetch_sub_u32(volatile pg_atomic_uint32 *ptr, int32 sub_); > > > > The function name and input value seems to be inconsistent. > > The function name contains *_u32*, however the input value is *int32*. > > A u32 is implemented by adding/subtracting a signed int. There's some > platforms why that's the only API provided by intrinsics and it's good > enough for pretty much everything. As such no issues, but just wondering is there any reason to prefer *_u32 instead of simple _32 or *32. > > > 12. > > I am not able to see API's pg_atomic_add_fetch_u32(), > > pg_atomic_sub_fetch_u32() > > in C11 atomic ops specs, do you need it for something specific? > > Yes, it's already used in the lwlocks code and it's provided by several > other atomics libraries. I don't really see a reason not to provide it, > especially as some platforms could implement it more efficiently. Okay, I find below implementation in your patch. Won't it add the value twice? +#if !defined(PG_HAS_ATOMIC_ADD_FETCH_U32) && defined(PG_HAS_ATOMIC_FETCH_ADD_U32) +#define PG_HAS_ATOMIC_ADD_FETCH_U32 +static inline uint32 +pg_atomic_add_fetch_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_) +{ + return pg_atomic_fetch_add_u32_impl(ptr, add_) + add_; +} +#endif > > 14. > > * pg_memory_barrier - prevent the CPU from reordering memory access > > * > > * A memory barrier must act as a compiler barrier, > > > > Is it ensured in all cases that pg_memory_barrier implies compiler barrier > > as well? > > Yes. > > > Example for msvc case, the specs doesn't seem to guarntee it. > > I think it actually indirectly guarantees it, but I'm on a plane and > can't check :) No problem let me know when you are comfortable, this is just for clarification. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com