__rcu_process_callbacks() in Linux 2.6

James Huang Tue, 20 Nov 2007 19:27:43 -0800

In the latest Linux 2.6 RCU implementation, __rcu_process_callbacks() is coded 
as follows:



422 static void __rcu_process_callbacks(struct rcu_ctrlblk *rcp,
423                                        struct rcu_data *rdp)
424 {
425        if (rdp->curlist && !rcu_batch_before(rcp->completed, rdp->batch)) {
426                *rdp->donetail = rdp->curlist;
427                rdp->donetail = rdp->curtail;
428                rdp->curlist = NULL;
429                rdp->curtail = &rdp->curlist;
430        }
431 
432        if (rdp->nxtlist && !rdp->curlist) {
433                local_irq_disable();
434                rdp->curlist = rdp->nxtlist;
435                rdp->curtail = rdp->nxttail;
436                rdp->nxtlist = NULL;
437                rdp->nxttail = &rdp->nxtlist;
438                local_irq_enable();
439 
440                /*
441                  * start the next batch of callbacks
442                  */
443 
444                /* determine batch number */
445                rdp->batch = rcp->cur + 1;
446                /* see the comment and corresponding wmb() in
447                  * the rcu_start_batch()
448                  */
449                smp_rmb();
450 
451                if (!rcp->next_pending) {
452                        /* and start it/schedule start if it's a new batch */
453                        spin_lock(&rcp->lock);
454                        rcp->next_pending = 1;
455                        rcu_start_batch(rcp);
456                        spin_unlock(&rcp->lock);
457                }
458        }
459 
460        rcu_check_quiescent_state(rcp, rdp);
461        if (rdp->donelist)
462                rcu_do_batch(rdp);
463 }


The question is how does the update of rdp->batch at line 445 guarantee to 
observe the most updated value of rcp->cur??
The issue is that there is no memory barrier/locking before line 445.
So I think the following sequence of events in chronological order is possible:

Assume initially rcp->cur = 100, this current batch value is visible to every 
CPU, and batch 100 has completed.
 
 
CPU 0: 
--------- 
             call_rcu(): a callback inserted into rdp->nxtlist;                 
    
 
             timer interrupt 
                call rcu_pending(), return true  ( ! rdp->curlist && 
rdp->nxtlist)
                call rcu_check_callbacks() 
                     schedule per CPU rcu_tasklet
 
                __rcu_process_callbacks()
                           move callbacks from nxtlist to curlist;
                           rdp->batch = 101
                           lock rcp->lock
                           rcp->next_pending = 1
                           call rcu_start_batch()
                                 find the current batch has completed and next 
batch pending;
                                 rcp->next_pending = 0
                                 update rcp->cur to 101 and initialize 
rcp->cpumask;                  <----- time t1
                                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                           unlock rcp->lock
                              
CPU 1:
---------  
             timer interrupt 
                  call rcu_pending(), return true (asume observing rcp->cur = 
101 != rdp->quiescbatch) 
                  
                  call rcu_check_callbacks() 
                       schedule per CPU rcu_tasklet
 
                 __rcu_process_callbacks()
                       call rcu_check_quisecent_state()
                             find rdp->quiescbatch != rcp->cur
                             set rdp->qs_pending = 1
                             set rdp->passed_quiesc = 0
                             set rdp->quiescbatch = 101 (rcp->cur)
 
             Another timer interrupt
                 call rcu_pending(), return true (rdp->qs_pending == 1)  
                 call rcu_check_callbacks() 
                       (assume in user mode)                         <-- time 
t2  pass quiescent state
                       ~~~~~~~~~~~~~~~~~~                           
~~~~~~~~~~~~~~~~~~~~~~
                       rdp->passed_quiesc = 1
                       schedule per CPU rcu_tasklet
            
                 __rcu_process_callbacks()
                       call rcu_check_quisecent_state()
                             find rdp->qs_pending == 1 && rdp-> passed_quiesc 
== 1
                             set rdp->qs_pending = 0
                             lock rcp->lock
                             call cpu_quite()
                                    clear bit in the rcp->cpumask set up by CPU 
0 at time t1
                                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                             unlock rcp->lock

CPU 2:
---------  
             call_rcu(): a callback inserted into rdp->nxtlist;     <--- time t3
             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
             
             timer interrupt 
                 call rcu_pending(), return true ( ! rdp->curlist && 
rdp->nxtlist)
                 call rcu_check_callbacks() 
                      schedule per CPU rcu_tasklet
                                
                 calls __rcu_process_callbacks()
                     move callbacks from nxtlist to curlist; (including the 
callback inserted at time t3)
                     rdp->batch = rcp->cur + 1;                         <--- 
time t4
                     ~~~~~~~~~~~~~~~~~~~~

At time t4, CPU 2 might observe rcp->cpu as 100 and set rdp->batch to 101.
So CPU 2 essentially started its batch 101 (includes a callback inserted at 
time t3) after CPU 1 passed its quiescent state (at time t2) for batch 101.

Isn't this an issue???  Am I missing something?
Thanks for pointing me to the right direction.


-- James Huang
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

__rcu_process_callbacks() in Linux 2.6

Reply via email to