On Tue, 2012-10-23 at 17:15 -0700, Peter LaDow wrote:
> (Sorry for the subject change, but I wanted to try and pull in those
> who work on RT issues, and the subject didn't make that obvious.
> Please search for the same subject without the RT Linux trailing
> text.)
> 
> Well, more information.  Even with SMP enabled (and presumably the
> migrate_enable having calls to preempt_disable), we still got the
> lockup in iptables-restore.  I did more digging, and it looks like the
> complete stack trace includes a call from iptables-restore (through
> setsockopt with IPT_SO_SET_ADD_COUNTERS).  This seems to be a
> potential multiple writer case where the counters are updated through
> the syscall and the kernel is updating the counters as it filters
> packets.
> 
> I think there might be a race on the update to xt_recseq.sequence,
> since the RT patches remove the spinlock in seqlock_t.  Thus multiple
> writers can corrupt the sequence count. 



>  And I thought the SMP
> configuration would disable preemption when local_bh_disable() is
> called.  And indeed, looking at the disassembly, I see
> preempt_disable() (though optimized, goes to add_preempt_count() ) is
> being called.
> 
> Yet we still see the lockup in the get_counters() in iptables.  Which,
> it seems, is because of some sort of problem with the sequence.  It
> doesn't appear to be related to the preemption, and perhaps there is
> some other corruption of the sequence counter happening.  But the only
> places I see it modified is in xt_write_recseq_begin and
> xt_write_recseq_end, which is only in the netfilter code
> (ip6_tables.c, ip_tables.c, and arp_tables.c).  And every single call
> is preceeded by a call to local_bh_disable().
> 
> This problem is a huge one for us.  And so far I'm unable to track
> down how this is occurring.
> 
> Any other tips?  I presume this is the proper place for RT issues.

Hmm...

Could you try following patch ?

diff --git a/include/linux/netfilter/x_tables.h 
b/include/linux/netfilter/x_tables.h
index 8d674a7..053f9e7 100644
--- a/include/linux/netfilter/x_tables.h
+++ b/include/linux/netfilter/x_tables.h
@@ -471,30 +471,24 @@ DECLARE_PER_CPU(seqcount_t, xt_recseq);
  *
  * Begin packet processing : all readers must wait the end
  * 1) Must be called with preemption disabled
- * 2) softirqs must be disabled too (or we should use this_cpu_add())
  * Returns :
  *  1 if no recursion on this cpu
  *  0 if recursion detected
  */
 static inline unsigned int xt_write_recseq_begin(void)
 {
-       unsigned int addend;
-
+       unsigned int old, new;
+       seqcount_t *ptr = &__get_cpu_var(xt_recseq);
        /*
         * Low order bit of sequence is set if we already
         * called xt_write_recseq_begin().
         */
-       addend = (__this_cpu_read(xt_recseq.sequence) + 1) & 1;
-
-       /*
-        * This is kind of a write_seqcount_begin(), but addend is 0 or 1
-        * We dont check addend value to avoid a test and conditional jump,
-        * since addend is most likely 1
-        */
-       __this_cpu_add(xt_recseq.sequence, addend);
-       smp_wmb();
-
-       return addend;
+       old = ptr->sequence;
+       new = old | 1;
+       /* Note : cmpxchg() is a memory barrier, we dont need smp_wmb() */
+       if (old != new && cmpxchg(&ptr->sequence, old, new) == old)
+               return 1;
+       return 0;
 }
 
 /**


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to