Re: [Xen-devel] [PATCH v5 1/4] VT-d PI: track the number of vcpus on pi blocking list
>>> On 01.09.17 at 11:55,wrote: On 01.09.17 at 10:37, wrote: >> it seems add_sized() won't be a LOCKed instruction. >> #define build_add_sized(name, size, type, reg) \ >> static inline void name(volatile type *addr, type val) \ >> { \ >> asm volatile("add" size " %1,%0"\ >> : "=m" (*addr) \ >> : reg (val)); \ >> } > > Oh, you're right. But then I'd still like you to not add a new > user, as I don't see why it was introduced in the first place: > Independent of architecture it is equivalent to > > write_atomic(p, read_atomic(p) + c) > > and hence I'd like to get rid of it as misleading/redundant. Actually, on x86 it still is a bit better than the generic replacement, i.e. it would only be worthwhile dropping the custom ARM variant in favor of a generic one. Keep using it here then. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v5 1/4] VT-d PI: track the number of vcpus on pi blocking list
>>> On 01.09.17 at 10:37,wrote: > it seems add_sized() won't be a LOCKed instruction. > #define build_add_sized(name, size, type, reg) \ > static inline void name(volatile type *addr, type val) \ > { \ > asm volatile("add" size " %1,%0"\ > : "=m" (*addr) \ > : reg (val)); \ > } Oh, you're right. But then I'd still like you to not add a new user, as I don't see why it was introduced in the first place: Independent of architecture it is equivalent to write_atomic(p, read_atomic(p) + c) and hence I'd like to get rid of it as misleading/redundant. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v5 1/4] VT-d PI: track the number of vcpus on pi blocking list
On Fri, Sep 01, 2017 at 03:13:17AM -0600, Jan Beulich wrote: On 01.09.17 at 09:55,wrote: >> On Fri, Sep 01, 2017 at 02:24:08AM -0600, Jan Beulich wrote: >> On 01.09.17 at 03:39, wrote: After thinking it again, I want to define the counter as a unsigned int variable for the following reasion: 1. It is definite that the counter is closely related with list_add() and list_del(). If the list is protected by the lock, it is straightforward that the counter is also protected by the lock. 2. In patch 3, althought there are some lock-less readers, we will check the counter still meets our requirement with the lock held. Thus, I don't think there is a racing issue. >>> >>>I think that's fine, but then you still don't need LOCKed accesses >>>to the counter for updating it; write_atomic() will suffice afaict. >> >> A stupid question. >> Is it contradictory that you think the counter can be protected by >> the lock while suggesting using write_atomic() instead of LOCKed >> accesses? >> >> updating the counter is always accompanied by updating list and updating >> list should in locked region. I meaned things like: >> >> spin_lock() >> list_add() >> counter++ >> spin_unlock() >> >> However, I am afraid that not using LOCKed accesses but using >> write_atomic() means something like (separating updating the counter >> from updating the list I think is not good): >> >> spin_lock() >> list_add() >> spin_unlock() >> write_atomic() > >No, I mean > > spin_lock() > list_add() > write_atomic() > spin_unlock() > >whereas ... > >> And I think this version is: >> >> spin_lock() >> list_add() >> add_sized() >> spin_unlock() > >... this produces a needless LOCKed instruction redundant with being >inside the locked region). it seems add_sized() won't be a LOCKed instruction. #define build_add_sized(name, size, type, reg) \ static inline void name(volatile type *addr, type val) \ { \ asm volatile("add" size " %1,%0"\ : "=m" (*addr) \ : reg (val)); \ } Thanks Chao ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v5 1/4] VT-d PI: track the number of vcpus on pi blocking list
>>> On 01.09.17 at 09:55,wrote: > On Fri, Sep 01, 2017 at 02:24:08AM -0600, Jan Beulich wrote: > On 01.09.17 at 03:39, wrote: >>> After thinking it again, I want to define the counter as >>> a unsigned int variable for the following reasion: >>> 1. It is definite that the counter is closely related with >>> list_add() and list_del(). If the list is protected by the >>> lock, it is straightforward that the counter is also protected >>> by the lock. >>> 2. In patch 3, althought there are some lock-less readers, we >>> will check the counter still meets our requirement with the lock >>> held. Thus, I don't think there is a racing issue. >> >>I think that's fine, but then you still don't need LOCKed accesses >>to the counter for updating it; write_atomic() will suffice afaict. > > A stupid question. > Is it contradictory that you think the counter can be protected by > the lock while suggesting using write_atomic() instead of LOCKed > accesses? > > updating the counter is always accompanied by updating list and updating > list should in locked region. I meaned things like: > > spin_lock() > list_add() > counter++ > spin_unlock() > > However, I am afraid that not using LOCKed accesses but using > write_atomic() means something like (separating updating the counter > from updating the list I think is not good): > > spin_lock() > list_add() > spin_unlock() > write_atomic() No, I mean spin_lock() list_add() write_atomic() spin_unlock() whereas ... > And I think this version is: > > spin_lock() > list_add() > add_sized() > spin_unlock() ... this produces a needless LOCKed instruction redundant with being inside the locked region). Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v5 1/4] VT-d PI: track the number of vcpus on pi blocking list
On Fri, Sep 01, 2017 at 02:24:08AM -0600, Jan Beulich wrote: On 01.09.17 at 03:39,wrote: >> After thinking it again, I want to define the counter as >> a unsigned int variable for the following reasion: >> 1. It is definite that the counter is closely related with >> list_add() and list_del(). If the list is protected by the >> lock, it is straightforward that the counter is also protected >> by the lock. >> 2. In patch 3, althought there are some lock-less readers, we >> will check the counter still meets our requirement with the lock >> held. Thus, I don't think there is a racing issue. > >I think that's fine, but then you still don't need LOCKed accesses >to the counter for updating it; write_atomic() will suffice afaict. A stupid question. Is it contradictory that you think the counter can be protected by the lock while suggesting using write_atomic() instead of LOCKed accesses? updating the counter is always accompanied by updating list and updating list should in locked region. I meaned things like: spin_lock() list_add() counter++ spin_unlock() However, I am afraid that not using LOCKed accesses but using write_atomic() means something like (separating updating the counter from updating the list I think is not good): spin_lock() list_add() spin_unlock() write_atomic() And I think this version is: spin_lock() list_add() add_sized() spin_unlock() Thanks Chao ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v5 1/4] VT-d PI: track the number of vcpus on pi blocking list
>>> On 01.09.17 at 03:39,wrote: > After thinking it again, I want to define the counter as > a unsigned int variable for the following reasion: > 1. It is definite that the counter is closely related with > list_add() and list_del(). If the list is protected by the > lock, it is straightforward that the counter is also protected > by the lock. > 2. In patch 3, althought there are some lock-less readers, we > will check the counter still meets our requirement with the lock > held. Thus, I don't think there is a racing issue. I think that's fine, but then you still don't need LOCKed accesses to the counter for updating it; write_atomic() will suffice afaict. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v5 1/4] VT-d PI: track the number of vcpus on pi blocking list
On Thu, Aug 31, 2017 at 02:33:57AM -0600, Jan Beulich wrote: On 31.08.17 at 09:15,wrote: >> On Thu, Aug 31, 2017 at 01:42:53AM -0600, Jan Beulich wrote: >> On 31.08.17 at 00:57, wrote: On Wed, Aug 30, 2017 at 10:00:49AM -0600, Jan Beulich wrote: On 16.08.17 at 07:14, wrote: >> @@ -100,6 +101,24 @@ void vmx_pi_per_cpu_init(unsigned int cpu) >> spin_lock_init(_cpu(vmx_pi_blocking, cpu).lock); >> } >> >> +static void vmx_pi_add_vcpu(struct pi_blocking_vcpu *pbv, >> +struct vmx_pi_blocking_vcpu *vpbv) >> +{ >> +ASSERT(spin_is_locked(>lock)); > >You realize this is only a very weak check for a non-recursive lock? I just thought the lock should be held when adding one entry to the blocking list. Do you think we should remove this check or make it stricter? >>> >>>Well, the primary purpose of my comment was to make you aware >>>of the fact. If the weak check is good enough for you, then fine. >> >> To be honest, I don't know the difference between weak check and tight >> check. > >For non-recursive locks spin_is_locked() only tells you if _any_ >CPU in the system currently holds the lock. For recursive ones it >checks whether it's the local CPU that owns the lock. > >>>Removing the check would be a bad idea imo (but see also below); >>>tightening might be worthwhile, but might also go too far (depending >>>mainly on how clearly provable it is that all callers actually hold the >>>lock). >> >> IMO, the lock was introduced (not by me) to protect the blocking list. >> list_add() and list_del() should be performed with the lock held. So I >> think it is clear that all callers should hold the lock. > >Good. > >> +add_sized(>counter, 1); >> +ASSERT(read_atomic(>counter)); > >Why add_sized() and read_atomic() when you hold the lock? > In patch 3, frequent reading the counter is used to find a suitable vcpu and we can use add_sized() and read_atomic() to avoid acquiring the lock. In one word, the lock doesn't protect the counter. >>> >>>In that case it would be more natural to switch to the atomic >>>accesses there. Plus you still wouldn't need read_atomic() >>>here, with the lock held. Furthermore I would then wonder >>>whether it wasn't better to use atomic_t for the counter at >> >> Is there some basic guide on when it is better to use read_atomic() >> and add_sized() and when it is better to define a atomic variable >> directly? > >If an atomic_t variable fits your needs, I think it should always >be preferred. add_sized() was introduced for a case where an >atomic_t variable would not have been usable. Please also >consult older commits for understanding the background. > >>>that point. Also with a lock-less readers the requirement to >>>hold a lock here (rather than using suitable LOCKed accesses) >>>becomes questionable too. >> >> As I said above, I think the lock is used to protect the list. >> >> I think this patch has two parts: >> 1. Move all list operations to two inline functions. (with this, adding >> a counter is easier and don't need add code in several places.) >> >> 2. Add a counter. > >With it being left unclear whether the counter is meant to >also be protected by the lock: In the patch here you claim it >is, yet by later introducing lock-less readers you weaken >that model. Hence the request to bring things into a >consistent state right away, and ideally also into the final >state. > Hi, Jan. After thinking it again, I want to define the counter as a unsigned int variable for the following reasion: 1. It is definite that the counter is closely related with list_add() and list_del(). If the list is protected by the lock, it is straightforward that the counter is also protected by the lock. 2. In patch 3, althought there are some lock-less readers, we will check the counter still meets our requirement with the lock held. Thus, I don't think there is a racing issue. Thanks Chao ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v5 1/4] VT-d PI: track the number of vcpus on pi blocking list
On Thu, Aug 31, 2017 at 02:33:57AM -0600, Jan Beulich wrote: On 31.08.17 at 09:15,wrote: >> On Thu, Aug 31, 2017 at 01:42:53AM -0600, Jan Beulich wrote: >> On 31.08.17 at 00:57, wrote: On Wed, Aug 30, 2017 at 10:00:49AM -0600, Jan Beulich wrote: On 16.08.17 at 07:14, wrote: >> @@ -100,6 +101,24 @@ void vmx_pi_per_cpu_init(unsigned int cpu) >> spin_lock_init(_cpu(vmx_pi_blocking, cpu).lock); >> } >> >> +static void vmx_pi_add_vcpu(struct pi_blocking_vcpu *pbv, >> +struct vmx_pi_blocking_vcpu *vpbv) >> +{ >> +ASSERT(spin_is_locked(>lock)); > >You realize this is only a very weak check for a non-recursive lock? I just thought the lock should be held when adding one entry to the blocking list. Do you think we should remove this check or make it stricter? >>> >>>Well, the primary purpose of my comment was to make you aware >>>of the fact. If the weak check is good enough for you, then fine. >> >> To be honest, I don't know the difference between weak check and tight >> check. > >For non-recursive locks spin_is_locked() only tells you if _any_ >CPU in the system currently holds the lock. For recursive ones it >checks whether it's the local CPU that owns the lock. This remake is impressive to me. > >>>Removing the check would be a bad idea imo (but see also below); >>>tightening might be worthwhile, but might also go too far (depending >>>mainly on how clearly provable it is that all callers actually hold the >>>lock). >> >> IMO, the lock was introduced (not by me) to protect the blocking list. >> list_add() and list_del() should be performed with the lock held. So I >> think it is clear that all callers should hold the lock. > >Good. > >> +add_sized(>counter, 1); >> +ASSERT(read_atomic(>counter)); > >Why add_sized() and read_atomic() when you hold the lock? > In patch 3, frequent reading the counter is used to find a suitable vcpu and we can use add_sized() and read_atomic() to avoid acquiring the lock. In one word, the lock doesn't protect the counter. >>> >>>In that case it would be more natural to switch to the atomic >>>accesses there. Plus you still wouldn't need read_atomic() >>>here, with the lock held. Furthermore I would then wonder >>>whether it wasn't better to use atomic_t for the counter at >> >> Is there some basic guide on when it is better to use read_atomic() >> and add_sized() and when it is better to define a atomic variable >> directly? > >If an atomic_t variable fits your needs, I think it should always >be preferred. add_sized() was introduced for a case where an >atomic_t variable would not have been usable. Please also >consult older commits for understanding the background. Ok. I will. Thanks for your guide. > >>>that point. Also with a lock-less readers the requirement to >>>hold a lock here (rather than using suitable LOCKed accesses) >>>becomes questionable too. >> >> As I said above, I think the lock is used to protect the list. >> >> I think this patch has two parts: >> 1. Move all list operations to two inline functions. (with this, adding >> a counter is easier and don't need add code in several places.) >> >> 2. Add a counter. > >With it being left unclear whether the counter is meant to >also be protected by the lock: In the patch here you claim it >is, yet by later introducing lock-less readers you weaken >that model. Hence the request to bring things into a >consistent state right away, and ideally also into the final >state. Sure. I will clarify this and make things consistent. Thanks Chao ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v5 1/4] VT-d PI: track the number of vcpus on pi blocking list
>>> On 31.08.17 at 09:15,wrote: > On Thu, Aug 31, 2017 at 01:42:53AM -0600, Jan Beulich wrote: > On 31.08.17 at 00:57, wrote: >>> On Wed, Aug 30, 2017 at 10:00:49AM -0600, Jan Beulich wrote: >>> On 16.08.17 at 07:14, wrote: > @@ -100,6 +101,24 @@ void vmx_pi_per_cpu_init(unsigned int cpu) > spin_lock_init(_cpu(vmx_pi_blocking, cpu).lock); > } > > +static void vmx_pi_add_vcpu(struct pi_blocking_vcpu *pbv, > +struct vmx_pi_blocking_vcpu *vpbv) > +{ > +ASSERT(spin_is_locked(>lock)); You realize this is only a very weak check for a non-recursive lock? >>> >>> I just thought the lock should be held when adding one entry to the >>> blocking list. Do you think we should remove this check or make it >>> stricter? >> >>Well, the primary purpose of my comment was to make you aware >>of the fact. If the weak check is good enough for you, then fine. > > To be honest, I don't know the difference between weak check and tight > check. For non-recursive locks spin_is_locked() only tells you if _any_ CPU in the system currently holds the lock. For recursive ones it checks whether it's the local CPU that owns the lock. >>Removing the check would be a bad idea imo (but see also below); >>tightening might be worthwhile, but might also go too far (depending >>mainly on how clearly provable it is that all callers actually hold the >>lock). > > IMO, the lock was introduced (not by me) to protect the blocking list. > list_add() and list_del() should be performed with the lock held. So I > think it is clear that all callers should hold the lock. Good. > +add_sized(>counter, 1); > +ASSERT(read_atomic(>counter)); Why add_sized() and read_atomic() when you hold the lock? >>> >>> In patch 3, frequent reading the counter is used to find a suitable >>> vcpu and we can use add_sized() and read_atomic() to avoid acquiring the >>> lock. In one word, the lock doesn't protect the counter. >> >>In that case it would be more natural to switch to the atomic >>accesses there. Plus you still wouldn't need read_atomic() >>here, with the lock held. Furthermore I would then wonder >>whether it wasn't better to use atomic_t for the counter at > > Is there some basic guide on when it is better to use read_atomic() > and add_sized() and when it is better to define a atomic variable > directly? If an atomic_t variable fits your needs, I think it should always be preferred. add_sized() was introduced for a case where an atomic_t variable would not have been usable. Please also consult older commits for understanding the background. >>that point. Also with a lock-less readers the requirement to >>hold a lock here (rather than using suitable LOCKed accesses) >>becomes questionable too. > > As I said above, I think the lock is used to protect the list. > > I think this patch has two parts: > 1. Move all list operations to two inline functions. (with this, adding > a counter is easier and don't need add code in several places.) > > 2. Add a counter. With it being left unclear whether the counter is meant to also be protected by the lock: In the patch here you claim it is, yet by later introducing lock-less readers you weaken that model. Hence the request to bring things into a consistent state right away, and ideally also into the final state. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v5 1/4] VT-d PI: track the number of vcpus on pi blocking list
On Thu, Aug 31, 2017 at 01:42:53AM -0600, Jan Beulich wrote: On 31.08.17 at 00:57,wrote: >> On Wed, Aug 30, 2017 at 10:00:49AM -0600, Jan Beulich wrote: >> On 16.08.17 at 07:14, wrote: @@ -100,6 +101,24 @@ void vmx_pi_per_cpu_init(unsigned int cpu) spin_lock_init(_cpu(vmx_pi_blocking, cpu).lock); } +static void vmx_pi_add_vcpu(struct pi_blocking_vcpu *pbv, +struct vmx_pi_blocking_vcpu *vpbv) +{ +ASSERT(spin_is_locked(>lock)); >>> >>>You realize this is only a very weak check for a non-recursive lock? >> >> I just thought the lock should be held when adding one entry to the >> blocking list. Do you think we should remove this check or make it >> stricter? > >Well, the primary purpose of my comment was to make you aware >of the fact. If the weak check is good enough for you, then fine. To be honest, I don't know the difference between weak check and tight check. >Removing the check would be a bad idea imo (but see also below); >tightening might be worthwhile, but might also go too far (depending >mainly on how clearly provable it is that all callers actually hold the >lock). IMO, the lock was introduced (not by me) to protect the blocking list. list_add() and list_del() should be performed with the lock held. So I think it is clear that all callers should hold the lock. > +add_sized(>counter, 1); +ASSERT(read_atomic(>counter)); >>> >>>Why add_sized() and read_atomic() when you hold the lock? >>> >> >> In patch 3, frequent reading the counter is used to find a suitable >> vcpu and we can use add_sized() and read_atomic() to avoid acquiring the >> lock. In one word, the lock doesn't protect the counter. > >In that case it would be more natural to switch to the atomic >accesses there. Plus you still wouldn't need read_atomic() >here, with the lock held. Furthermore I would then wonder >whether it wasn't better to use atomic_t for the counter at Is there some basic guide on when it is better to use read_atomic() and add_sized() and when it is better to define a atomic variable directly? >that point. Also with a lock-less readers the requirement to >hold a lock here (rather than using suitable LOCKed accesses) >becomes questionable too. As I said above, I think the lock is used to protect the list. I think this patch has two parts: 1. Move all list operations to two inline functions. (with this, adding a counter is easier and don't need add code in several places.) 2. Add a counter. Thanks Chao ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v5 1/4] VT-d PI: track the number of vcpus on pi blocking list
>>> On 31.08.17 at 00:57,wrote: > On Wed, Aug 30, 2017 at 10:00:49AM -0600, Jan Beulich wrote: > On 16.08.17 at 07:14, wrote: >>> @@ -100,6 +101,24 @@ void vmx_pi_per_cpu_init(unsigned int cpu) >>> spin_lock_init(_cpu(vmx_pi_blocking, cpu).lock); >>> } >>> >>> +static void vmx_pi_add_vcpu(struct pi_blocking_vcpu *pbv, >>> +struct vmx_pi_blocking_vcpu *vpbv) >>> +{ >>> +ASSERT(spin_is_locked(>lock)); >> >>You realize this is only a very weak check for a non-recursive lock? > > I just thought the lock should be held when adding one entry to the > blocking list. Do you think we should remove this check or make it > stricter? Well, the primary purpose of my comment was to make you aware of the fact. If the weak check is good enough for you, then fine. Removing the check would be a bad idea imo (but see also below); tightening might be worthwhile, but might also go too far (depending mainly on how clearly provable it is that all callers actually hold the lock). >>> +add_sized(>counter, 1); >>> +ASSERT(read_atomic(>counter)); >> >>Why add_sized() and read_atomic() when you hold the lock? >> > > In patch 3, frequent reading the counter is used to find a suitable > vcpu and we can use add_sized() and read_atomic() to avoid acquiring the > lock. In one word, the lock doesn't protect the counter. In that case it would be more natural to switch to the atomic accesses there. Plus you still wouldn't need read_atomic() here, with the lock held. Furthermore I would then wonder whether it wasn't better to use atomic_t for the counter at that point. Also with a lock-less readers the requirement to hold a lock here (rather than using suitable LOCKed accesses) becomes questionable too. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v5 1/4] VT-d PI: track the number of vcpus on pi blocking list
On Wed, Aug 30, 2017 at 10:00:49AM -0600, Jan Beulich wrote: On 16.08.17 at 07:14,wrote: >> @@ -100,6 +101,24 @@ void vmx_pi_per_cpu_init(unsigned int cpu) >> spin_lock_init(_cpu(vmx_pi_blocking, cpu).lock); >> } >> >> +static void vmx_pi_add_vcpu(struct pi_blocking_vcpu *pbv, >> +struct vmx_pi_blocking_vcpu *vpbv) >> +{ >> +ASSERT(spin_is_locked(>lock)); > >You realize this is only a very weak check for a non-recursive lock? I just thought the lock should be held when adding one entry to the blocking list. Do you think we should remove this check or make it stricter? > >> +add_sized(>counter, 1); >> +ASSERT(read_atomic(>counter)); > >Why add_sized() and read_atomic() when you hold the lock? > In patch 3, frequent reading the counter is used to find a suitable vcpu and we can use add_sized() and read_atomic() to avoid acquiring the lock. In one word, the lock doesn't protect the counter. Thanks Chao ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v5 1/4] VT-d PI: track the number of vcpus on pi blocking list
>>> On 16.08.17 at 07:14,wrote: > @@ -100,6 +101,24 @@ void vmx_pi_per_cpu_init(unsigned int cpu) > spin_lock_init(_cpu(vmx_pi_blocking, cpu).lock); > } > > +static void vmx_pi_add_vcpu(struct pi_blocking_vcpu *pbv, > +struct vmx_pi_blocking_vcpu *vpbv) > +{ > +ASSERT(spin_is_locked(>lock)); You realize this is only a very weak check for a non-recursive lock? > +add_sized(>counter, 1); > +ASSERT(read_atomic(>counter)); Why add_sized() and read_atomic() when you hold the lock? Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel