On 19/03/15 18:30, Peter Zijlstra wrote:
On Thu, Mar 19, 2015 at 01:39:58PM -0400, Steven Rostedt wrote:
+void printk_nmi_backtrace_complete(void)
+{
+       struct nmi_seq_buf *s;
+       int len, cpu, i, last_i;
+
+       /*
+        * Now that all the NMIs have triggered, we can dump out their
+        * back traces safely to the console.
+        */
+       for_each_possible_cpu(cpu) {
+               s = &per_cpu(nmi_print_seq, cpu);
+               last_i = 0;
+
+               len = seq_buf_used(&s->seq);
+               if (!len)
+                       continue;
+
+               /* Print line by line. */
+               for (i = 0; i < len; i++) {
+                       if (s->buffer[i] == '\n') {
+                               print_seq_line(s, last_i, i);
+                               last_i = i + 1;
+                       }
+               }
+               /* Check if there was a partial line. */
+               if (last_i < len) {
+                       print_seq_line(s, last_i, len - 1);
+                       pr_cont("\n");
+               }
+
+               /* Wipe out the buffer ready for the next time around. */
+               seq_buf_clear(&s->seq);
+       }
+
+       clear_bit(0, &nmi_print_flag);
+       smp_mb__after_atomic();

Is this really necessary? What is the mb synchronizing?

[ Added Peter Zijlstra to confirm it's not needed ]

It surely looks suspect; and it lacks a comment, which is a clear sign
it's buggy.

Now if it tries to order the accesses to the seqbuf against the clearing
of the bit, one would have expected a _before_ barrier, not an _after_.
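
To illustrate the _before_ versus _after_ point, here is a minimal userspace sketch using C11 atomics. It is not the kernel code: flag, buf, dump_begin(), dump_record() and dump_complete() are hypothetical stand-ins for nmi_print_flag and a single seq buffer. The release store plays the role that smp_mb__before_atomic() followed by clear_bit() (or clear_bit_unlock()) would play in the kernel, assuming the goal is to order the buffer wipe ahead of the flag clearing.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for nmi_print_flag and one seq buffer. */
static atomic_int flag;		/* 1 while a backtrace dump is in flight */
static char buf[64];
static size_t buf_len;

/* Claim the flag, roughly like test_and_set_bit(); true if we now own it. */
static bool dump_begin(void)
{
	/* Acquire pairs with the release store in dump_complete(). */
	return atomic_exchange_explicit(&flag, 1, memory_order_acquire) == 0;
}

/* Append a message to the buffer; only valid while the flag is held. */
static void dump_record(const char *msg)
{
	size_t n = strlen(msg);

	assert(buf_len + n < sizeof(buf));
	memcpy(buf + buf_len, msg, n);
	buf_len += n;
}

/* Wipe the buffer, then let the next user in. */
static void dump_complete(void)
{
	buf_len = 0;
	buf[0] = '\0';

	/*
	 * Release ordering: the wipe above must be visible before anyone
	 * can observe flag == 0. The ordering is needed *before* the
	 * clearing; a barrier placed after the store would not order the
	 * buffer accesses against it.
	 */
	atomic_store_explicit(&flag, 0, memory_order_release);
}
```

A caller would invoke dump_begin(), let the handlers call dump_record(), and finish with dump_complete(); a second dump_begin() succeeds only after the flag has been released.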

It has nothing to do with the seqbuf: I added the seqbuf code myself, but the barrier was already in the code that I copied from.

In the mainline code today it looks like this, as part of the x86 code (note that the call to put_cpu() is still present in my patchset, but it lives in the arch-specific code rather than the generic code):

:                 /* Check if there was a partial line. */
:                 if (last_i < len) {
:                         print_seq_line(s, last_i, len - 1);
:                         pr_cont("\n");
:                 }
:         }
:
:         clear_bit(0, &backtrace_flag);
:         smp_mb__after_atomic();
:         put_cpu();
: }

The barrier was not intended to have anything to do with put_cpu() either, since it was added before put_cpu() arrived:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=554ec063982752e9a569ab9189eeffa3d96731b2

There's nothing in the commit message explaining the barrier, and I really can't see what it is for.


Daniel.