Hi, On Mon, Jul 25, 2016 at 11:22:24AM +0200, Vincent Brillault wrote: > While working on a kernel module that basically forks > kernel/printk/printk.c to create its own buffer, I believe I have found > a bug in how printk wraps around its buffer: In a very specific case, a > reader, trying to read what was supposed to be a zeroed header > indicating that the buffer wrapped, might read random data from another > message that has overridden it. The root of the problems seems to be > that the zeroed header placed at the end of the buffer is not protected > by the sequence number checks. > > [...] > > Due to scheduling/reader choice, the reader now stops. > A 20 character-long message arrives. The buffers removes messages from before > and wraps arround: > 88888888888888888883333333444444444555555555556666666677770........... > > And we now have: > log_first_seq = 3, log_next_seq = 9 > log_first_idx = 20, log_next_idx = 20 > > Let's continue filling the buffer, without wrapping: > 88888888888888888889999999999999999999aaaaaaaaaaaaabbbbbbbbbbbbbb..... > > We now have: > log_first_seq = 8, log_next_seq = 0xc > log_first_idx = 0, log_next_idx = 65 > > And now the reader restarts, it still have the same sequence number, '8' > As `seq < log_first_seq` is false, it will consider that this message is valid > and continue. Unfortunately, it's first action will be `log_from_idx(idx)`, > which will first try to read the header of the message at the offset `idx`. > This header, which was supposed to be zeroed due to the wrap-around, has now > been overritten and thus contain random junk, which might ultimately lead > to a kernel panic (wrong memory access from the reader). > > As I understand it, the underlying problem is that the zeroed-header is > not properly protected by the sequence number cheks.
Thanks for your very detailled write-up but I believe you are mistaken on the value of log_first_idx between these two last steps you describe. To detail even more: - we want to store bbb… and we have first_seq=7, first_idx=57, - first loop in log_make_free_space(), not enough space, so we move the next one, first_seq=8, first_idx= *59* (log_next check of msg->len is for record 7), - second loop in log_make_free_space(), logbuf_has_space uses first_idx=59 and will still return false, so we go to the next record once again and end up with first_seq=9, first_idx=23. So I think the situation you describe cannot happen, first_seq will always be incremented when we find the 0-filled record and wrap around, and so all the later checks of "*_seq < log_first_seq" will reset all the readers indexes. Did I miss anything? -- Ivan "Colona" Delalande Arista Networks