<snip>

> 
> 15/06/2021 12:08, Ananyev, Konstantin:
> > > 15/06/2021 11:33, Ananyev, Konstantin:
> > > > > 14/06/2021 17:48, Jerin Jacob:
> > > > > > On Mon, Jun 14, 2021 at 8:29 PM Ananyev, Konstantin
> > > > > > <konstantin.anan...@intel.com> wrote:
> > > > > > > I had only a quick look at your approach so far.
> > > > > > > But from what I can read, in MT environment your suggestion
> > > > > > > will require extra synchronization for each read-write access to
> > > > > > > > such parray element (lock, rcu, ...).
> > > > > > > > I think what Bruce suggests will be much lighter, easier to
> > > > > > > > implement and less error prone.
> > > > > > > At least for rte_ethdevs[] and friends.
> > > > > >
> > > > > > +1
> > > > >
> > > > > Please could you have a deeper look and tell me why we need more
> > > > > > locks?
> > > > > > The element pointers don't change.
> > > > > > Only the array pointer changes at resize,
> > > >
> > > > Yes, array pointer changes at resize, and reader has to read that
> > > > value to access elements in the parray. Which means that we need
> > > > some sync between readers and updaters to avoid reader using stale
> > > > > pointer (ref-counter, rcu, etc.).
> > >
> > > No
> > > The old array is still there, so we don't need sync.
> > >
> > > > I.E. updater can free old array pointer *only* when it can
> > > > guarantee that there are no readers that still use it.
> > >
> > > No
> > > Reading an element is OK because the pointer to the element is not
> > > > changed.
> > > Getting the pointer to an element from the index is the only thing
> > > which is blocking the freeing of an array, and I see no reason why
> > > dereferencing an index would be longer than 2 consecutive resizes of
> > > the array.
> >
> > In general, your thread can be switched off the cpu at any moment.
> > And you don't know for sure when it will be scheduled back.
> >
> > >
> > > > > but the old one is still usable until the next resize.
> > > >
> > > > > Ok, but what is the guarantee that reader would *always* finish till
> > > > > next resize?
> > > > As an example of such race condition:
> > > >
> > > > /* global one */
> > > >         struct rte_parray pa;
> > > >
> > > > /* thread #1, tries to read elem from the array */
> > > >         ....
> > > >         int **x = pa->array;
> > >
> > > We should not save the array pointer.
> > > Each index must be dereferenced with the macro getting the current
> > > array pointer.
> > > So the interrupt is during dereference of a single index.
> >
> > You still need to read your pa->array somewhere (let say into a register).
> > Straight after that your thread can be interrupted.
> > > Then when it is scheduled back to the CPU that value (in a register) might
> > > be a stale one.
> >
> > >
> > > > /* thread # 1 get suspended for a while  at that point */
> > > >
> > > > /* meanwhile thread #2 does: */
> > > >         ....
> > > > >         /* causes first resize(), x still valid, points to
> > > > >            pa->old_array */
> > > >         rte_parray_alloc(&pa, ...);
> > > >         .....
> > > >         /* causes second resize(), x now points to freed memory */
> > > >         rte_parray_alloc(&pa, ...);
> > > >         ...
> > >
> > > 2 resizes is a very long time, it is at minimum 33 allocations!
> > >
> > > > /* at that point thread #1 resumes: */
> > > >
> > > >         /* contents of x[0] are undefined, 'p' could point anywhere,
> > > >              might cause segfault or silent memory corruption */
> > > >         int *p = x[0];
> > > >
> > > >
> > > > Yes probability of such situation is quite small.
> > > > But it is still possible.
> > >
> > > In device probing, I don't see how it is realistically possible:
> > > 33 device allocations during 1 device index being dereferenced.
> >
> > Yeh, it would work fine 1M times, but sometimes will crash.
> 
> Sometimes a thread will be interrupted during 33 device allocations?
> 
> > Which will make it even harder to reproduce, debug and fix.
> > I think that when introducing a new generic library into DPDK, we
> > should avoid making such assumptions.
> 
> I intend to make it internal-only (I should have named it eal_parray).
> 
> > > I agree it is tricky, but that's the whole point of finding tricks
> > > to keep fast code.
> >
> > It is not tricky, it is buggy 😊
> > You are introducing a race condition into the new core generic library by
> > design, and trying to convince people that it is *OK*.
> 
> Yes, because I am convinced myself.
> 
> > Sorry, but NACK from me till that issue will be addressed.
I agree that a synchronization mechanism is required to indicate when it is
safe to free the old array. An ACK from the readers is required before the old
array can be freed. We cannot rely on an "enough time has passed" argument.
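
To make the ACK idea concrete, here is a rough sketch in plain C11 atomics.
It is not the proposed rte_parray code and it is not the RCU library API:
the pa_* names, the fixed reader count and the token scheme are all
assumptions made for this illustration, and it assumes a single writer.
Each reader reports a quiescent state when it holds no array pointer, and
the writer frees an old array only once every reader has reported after
the resize.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PA_MAX_READERS 4 /* hypothetical upper bound on reader threads */

struct pa {
	_Atomic(void **) array;     /* current array, replaced on resize */
	size_t len;                 /* touched by the single writer only */
	atomic_uint_fast64_t token; /* incremented on every resize */
	atomic_uint_fast64_t seen[PA_MAX_READERS]; /* last token acked */
	unsigned int nb_readers;
};

static int pa_init(struct pa *pa, size_t len, unsigned int nb_readers)
{
	void **a;
	unsigned int i;

	if (nb_readers > PA_MAX_READERS)
		return -1;
	a = calloc(len, sizeof(*a));
	if (a == NULL)
		return -1;
	atomic_init(&pa->array, a);
	pa->len = len;
	atomic_init(&pa->token, 0);
	for (i = 0; i < PA_MAX_READERS; i++)
		atomic_init(&pa->seen[i], 0);
	pa->nb_readers = nb_readers;
	return 0;
}

/* Reader: fetch one element; the caller guarantees idx is allocated. */
static void *pa_get(struct pa *pa, size_t idx)
{
	void **a = atomic_load(&pa->array);

	return a[idx];
}

/* Reader: ACK, called only while holding no pointer to the array. */
static void pa_quiescent(struct pa *pa, unsigned int id)
{
	atomic_store(&pa->seen[id], atomic_load(&pa->token));
}

/*
 * Writer: publish a bigger array. The old array is handed back through
 * *old and must not be freed until pa_acked() is true for the returned
 * token. Returns 0 (and sets *old to NULL) if the allocation fails.
 */
static uint64_t pa_resize(struct pa *pa, size_t new_len, void ***old)
{
	void **cur = atomic_load(&pa->array);
	void **bigger;

	assert(new_len >= pa->len);
	*old = NULL;
	bigger = calloc(new_len, sizeof(*bigger));
	if (bigger == NULL)
		return 0;
	memcpy(bigger, cur, pa->len * sizeof(*cur));
	atomic_store(&pa->array, bigger);
	pa->len = new_len;
	*old = cur;
	return atomic_fetch_add(&pa->token, 1) + 1;
}

/* Writer: true once every reader has ACKed at or after this token. */
static bool pa_acked(struct pa *pa, uint64_t token)
{
	unsigned int i;

	for (i = 0; i < pa->nb_readers; i++)
		if (atomic_load(&pa->seen[i]) < token)
			return false;
	return true;
}
```

With something like this, a reader that is descheduled right after loading
the array pointer simply delays the free: pa_acked() stays false for as long
as that reader has not called pa_quiescent(), no matter how many allocations
happen in the meantime. The cost on the reader side is one load and one store
per ACK, not per element access.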

As others have mentioned, I think the key is the use case. Not all use cases 
require a dynamically resized array. Dynamically allocated array at init time 
would be enough.

If a dynamically resized array is required, using RCU (or any other mechanism) 
is necessary. I do not think these use cases should be characterized by the 
size of the memory/array in question (it might be a small chunk in a system 
with abundant memory, but might be a big chunk in a system with small amount of 
memory). The current RCU library provides good options to hide complexities 
from the application or allow the application to handle complexities if it 
wants.

> 
> It is not an issue, but a design.
> If you think that a thread can be interrupted during 33 device allocations then
> we should find another implementation, but I am quite sure it will be slower.
> 
