First, a note that to my shame I discovered Linux interface for atomic
operations only yesterday.  And I must say that the proposed zatomic interface
looks very similar to what Linux has.  I guess that this is not surprising.
Linux defines these types:
typedef struct {
        int counter;
} atomic_t;

#ifdef CONFIG_64BIT
typedef struct {
        long counter;
} atomic64_t;
#endif

And atomic functions operate on these types.

I got an impression that Linux does not provide atomic64_t operations in 32-bit
kernels.  But I got a bit lost in the maze of indirections, so maybe someone
more familiar with Linux could clarify on this.

It is a little bit tempting to propose to switch OpenZFS to Linux atomics and
have other platforms provide compatibility shims for them.

The rest of the reply is inline below:

On 03/10/2019 20:38, Matthew Ahrens wrote:
> On Wed, Oct 2, 2019 at 7:51 AM Andriy Gapon <a...@freebsd.org
> <mailto:a...@freebsd.org>> wrote:
> 
> 
>     This is a work in progress report for my work on safer use of atomics in 
> ZFS
>     that was discussed in the August developer meeting.
> 
>     I took the approach suggested at the meeting of creating a new type that 
> is to
>     be used for all objects that should be modified atomically.
>     The type is trivial:
>     typedef struct zatomic64 {
>             volatile uint64_t val;
>     } zatomic64_t;
> 
>     I also made a decision upfront to use a wrapper type only for 64-bit 
> objects as
>     32-bit objects should be already safe on all supported platforms where a 
> native
>     integer size is either 32 or 64 bit.
> 
>     Then, as suggested, I wrote trivial wrappers for all 64-bit atomic 
> operations
>     that are used in ZFS code.  For example:
>       static inline void
>       zatomic_add_64(zatomic64_t *a, uint64_t val)
>       {
>               atomic_add_64(&a->val, val);
>       }
> 
>     A side note on naming.  I simply prepended original function names with 
> 'z'.
>     Although names like zatomic64_add() could be prettier and more aligned 
> with the
>     type name.
> 
>     After that I added three new functions:
>     - zatomic_load_64 -- safe (with respect to torn values) read of a 64-bit 
> atomic
>     - zatomic_store_64 -- safe write of a 64-bit atomic
>     - zatomic_init_64 -- unsafe write of a 64-bit atomic
> 
>     The last function just sets zatomic64_t.val and it is supposed to be used 
> when
>     it is known that there are no concurrent accesses.  It's pretty redundant 
> the
>     field can be set directly or the whole object can be initialized with x = 
> { 0 },
>     but I decided to add it for symmetry.
> 
>     The first two functions are implemented like this:
>       static inline uint64_t
>       zatomic_load_64(zatomic64_t *a)
>       {
>       #ifdef __LP64__
>               return (a->val);
>       #else
>               return (zatomic_add_64_nv(a, 0));
>       #endif
>       }
> 
>       static inline void
>       zatomic_store_64(zatomic64_t *a, uint64_t val)
>       {
>       #ifdef __LP64__
>               a->val = val;
>       #else
>               (void)zatomic_swap_64(a, 0);
>       #endif
>       }
> 
>     I am not sure if this was a right way to do it.
>     Maybe there should be a standard implementation only for 64-bit platforms 
> and
>     each 32-bit platform should do its own thing?
>     For example, there is more than one way to get atomic 64-bit loads on x86,
>     according to this:
>     
> https://stackoverflow.com/questions/48046591/how-do-i-atomically-move-a-64bit-value-in-x86-asm
> 
> 
> What you did above seems fine to me.

After getting my feet wet in FreeBSD multi-platform support for atomics, I am
not sure about this myself.
Some 32-bit platforms provide adequate 64-bit atomics including load and store
operations.  So, checking for just __LP64__ is too naive.
I guess that the check will have to be OS-specific
In fact, on FreeBSD if you have other 64-bit atomic operations, then you will
have load and store for sure.

>     Anyway, I started converting ZFS (/FreeBSD) code to use the new type.
>     Immediately, I got some problems and some questions.
> 
>     First, I am quite hesitant about what to do with kstat-s.
>     I wanted to confine the change to ZFS, but kstat is an external thing (at 
> least
>     on illumos).  For now I decided to leave the kstat-s alone.  Especially 
> as I was
>     not going to change any code that reads the kstat-s.
>     But this also means that some things like arc_c and arc_p, which are 
> aliases to
>     kstat value, remain potentially unsafe with respect to torn reads.
> 
> 
> Yeah, we would have to convert these kstats to have callbacks that do the 
> atomic
> reads.
>  
> 
> 
>     I have not converted atomics in the aggsum code yet.
>     I am actually a little bit confused about why as_lower_bound and 
> as_upper_bound
>     are manipulated with atomics.  All manipulations seems to be done under 
> as_lock.
>     Maybe I overlooked something...
> 
> 
> The comment in aggsum_flush_bucket() has the (perhaps incorrect) 
> justification:
> 
> /*
> * We use atomic instructions for this because we read the upper and
> * lower bounds without the lock, so we need stores to be atomic.
> */
> 
> atomic_add_64((volatile uint64_t *)&as->as_lower_bound, asb->asc_delta);
> 
> 
> So the assumption is that as long as we do an atomic store, we can read with a
> normal read.  The lock doesn't help because the readers don't hold the lock.

I missed the fact that there are unprotected reads.

The comment does not explain how exactly atomic stores help.
Are they to prevent torn writes?  Then, if those are possible, torn reads are
equally possible and they need to be prevented as well using atomic loads.


>     I converted refcount.h.
>     That had a bit of unexpected effect on rrwlock_t.
>     zfs_refcount_t is used for rr_anon_rcount and rr_linked_rcount fields.
>     The fields are always accessed rr_lock, so their atomicity is not actually
>     required.  I guess that zfs_refcount_t is used for the support of 
> references
>     with ownership.  
> 
> 
> That's right.
>  
> 
>     So, some bits of the rrwlock code access those fields through
>     the refcount interface, but other (conditional) pieces directly access 
> rc_count
>     as a simple integer.
>     So, there are a couple of options here.  The direct accesses can be 
> replaced
>     with refcount accesses, but then we get unnecessary atomic operations in 
> the
>     fast paths.  
> 
> 
> That would be true only on 32-bit platforms, right?  I don't think a small
> performance hit to 32-bit kernels is worth worrying about.

No, it's not only for 32-bit platforms.
This is about consistency vs performance (with equal correctness) on all 
platforms.

For option 1 we do this:
mutex_enter(rr_lock)
refcount_add(rr_anon_rcount) ==> zatomic_inc_64_nv(rc_count)
mutex_exit(rr_lock)
The atomic operation is redundant given the lock, but we respect encapsulation
of both zfs_refcount_t and zatomic64_t.
zatomic_inc_64_nv may be more expensive on i386, but it is still an atomic
operation on amd64.

For option 2 we do this:
mutex_enter(rr_lock)
rr_anon_rcount.rc_count.val++
mutex_exit(rr_lock)
This is fast and still correct, but we poke inside of zfs_refcount_t and
zatomic64_t.

>     Or the code can keep directly accessing internals of both
>     zfs_refcount_t and zatomic64_t.  Another option is to create a copy of
>     zfs_refcount_t that relies on external synchronization (so that it does 
> not use
>     atomics or any internal locks).  But that would be a bit of code 
> duplication.
> 
> 
> I'd prefer to keep the code simple, consistent, and easy to see its 
> correctness,
> even if it costs some performance on 32-bit.
>  
> 
> 
>     Yet another hard case is znode's z_size.
>     It's changed with atomic_cas_64() in a single place, in all other places 
> it's
>     used as a normal integer.  On the one hand, there can be a torn read 
> because of
>     that atomic operation.  On the other hand, it's quite messy to update all 
> uses
>     of z_size.
> 
>     So, the work is progressing, but I feel like limiting the scope of the 
> change.
>     Originally I planned to convert everything to zatomic in one go.
>     But now I feel that just a few things really need that.
>     Probably os_obj_next_percpu.
>     Likely zfs_refcount_t.
>     Anything else?
>     I am thinking that maybe a few things could be converted initially and 
> the rest
>     could be converted on as needed basis.
> 
>     I don't like having a mix of regular atomics and zatomics, but alas.
> 
>     Any suggestions, comments, observations, ideas, etc are very welcome!
> 
>     P.S.
>     Another thought is creeping in: maybe it's not worth bothering with ZFS on
>     32-bit platforms.
> 
> 
> Fine by me, although I think that opinions may vary on this one :-)  
> Especially
> as long as some of the most popular operating systems to run OpenZFS support
> 32-bit kernels (although it looks like FreeBSD and Ubuntu are in the process 
> of
> dropping support for i386?  But ZoL still supports much older releases of 
> Ubuntu
> and RHEL.).

I am actually not quite sure how FreeBSD/i386 situation is going to develop.
But we also have some other 32-bit platforms where we provide ZFS and there are
even users.  For example, ARMv7 is popular.  Some even tried 32-bit PowerPC, but
as I was informed ZFS was unreliable.

-- 
Andriy Gapon

------------------------------------------
openzfs: openzfs-developer
Permalink: 
https://openzfs.topicbox.com/groups/developer/T3ee8a81d5f09f2ec-Md9533ca9d4b6f27281c1b240
Delivery options: https://openzfs.topicbox.com/groups/developer/subscription

Reply via email to