On Fri, Sep 4, 2015 at 9:34 PM, Peter Hurley <pe...@hurleysoftware.com> wrote: > Hi Dmitry, > > On 09/04/2015 03:09 PM, Dmitry Vyukov wrote: >> Race on buffer data happens in the following scenario: >> __tty_buffer_request_room does a plain write of tail->commit, >> no barriers were executed before that. >> At this point flush_to_ldisc reads this new value of commit, >> and reads buffer data, no barriers in between. >> The committed buffer data is not necessary visible to flush_to_ldisc. > > Please submit one patch for each "fix", because it is not possible > to review what you believe you're fixing. > > See below for an example. > >> Similar bug happens when tty_schedule_flip commits data. >> >> Another race happens in tty_buffer_flush. It uses plain reads >> to read tty_buffer.next, as the result it can free a buffer >> which has pending writes in __tty_buffer_request_room thread. >> For example, tty_buffer_flush calls tty_buffer_free which >> reads b->size, the size may not be visible to this thread. >> As the result a large buffer can hang in the freelist. >> >> Update commit with smp_store_release and read commit with >> smp_load_acquire, as it is commit that signals data readiness. >> This is orthogonal to the existing synchronization on tty_buffer.next, >> which is required to not dismiss a buffer with unconsumed data. >> >> The data race was found with KernelThreadSanitizer (KTSAN). >> >> Signed-off-by: Dmitry Vyukov <dvyu...@google.com> >> --- >> drivers/tty/tty_buffer.c | 38 ++++++++++++++++++++++++-------------- >> 1 file changed, 24 insertions(+), 14 deletions(-) >> >> diff --git a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c >> index 4cf263d..4fae5d1 100644 >> --- a/drivers/tty/tty_buffer.c >> +++ b/drivers/tty/tty_buffer.c >> @@ -89,7 +89,7 @@ void tty_buffer_unlock_exclusive(struct tty_port *port) >> struct tty_bufhead *buf = &port->buf; >> int restart; >> >> - restart = buf->head->commit != buf->head->read; >> + restart = READ_ONCE(buf->head->commit) != buf->head->read; >> >> atomic_dec(&buf->priority); >> mutex_unlock(&buf->lock); >> @@ -242,11 +242,14 @@ void tty_buffer_flush(struct tty_struct *tty, struct >> tty_ldisc *ld) >> atomic_inc(&buf->priority); >> >> mutex_lock(&buf->lock); >> - while ((next = buf->head->next) != NULL) { >> + /* paired with smp_store_release in __tty_buffer_request_room(); >> + * ensures there are no outstanding writes to buf->head when we free it >> + */ >> + while ((next = smp_load_acquire(&buf->head->next)) != NULL) { >> tty_buffer_free(port, buf->head); >> buf->head = next; >> } >> - buf->head->read = buf->head->commit; >> + buf->head->read = READ_ONCE(buf->head->commit); >> >> if (ld && ld->ops->flush_buffer) >> ld->ops->flush_buffer(tty); >> @@ -290,13 +293,15 @@ static int __tty_buffer_request_room(struct tty_port >> *port, size_t size, >> if (n != NULL) { >> n->flags = flags; >> buf->tail = n; >> - b->commit = b->used; >> - /* paired w/ barrier in flush_to_ldisc(); ensures the >> - * latest commit value can be read before the head is >> - * advanced to the next buffer >> + /* paired with smp_load_acquire in flush_to_ldisc(); >> + * ensures flush_to_ldisc() sees buffer data. >> */ >> - smp_wmb(); >> - b->next = n; >> + smp_store_release(&b->commit, b->used); >> + /* paired with smp_load_acquire in flush_to_ldisc(); >> + * ensures the latest commit value can be read before >> + * the head is advanced to the next buffer >> + */ >> + smp_store_release(&b->next, n); >> } else if (change) >> size = 0; >> else >> @@ -394,7 +399,10 @@ void tty_schedule_flip(struct tty_port *port) >> { >> struct tty_bufhead *buf = &port->buf; >> >> - buf->tail->commit = buf->tail->used; >> + /* paired with smp_load_acquire in flush_to_ldisc(); ensures the >> + * committed data is visible to flush_to_ldisc() >> + */ >> + smp_store_release(&buf->tail->commit, buf->tail->used); >> schedule_work(&buf->work); > > schedule_work() is an implied barrier for obvious reasons.
OK, I will split. To answer this particular question: you need release/write barrier _before_ the synchronizing store, not _after_. Once the store to commit happened, another thread can start reading buffer data, this thread has not yet executed schedule_work at this point. >> } >> EXPORT_SYMBOL(tty_schedule_flip); >> @@ -488,13 +496,15 @@ static void flush_to_ldisc(struct work_struct *work) >> if (atomic_read(&buf->priority)) >> break; >> >> - next = head->next; >> - /* paired w/ barrier in __tty_buffer_request_room(); >> + /* paired with smp_store_release in >> __tty_buffer_request_room(); >> * ensures commit value read is not stale if the head >> * is advancing to the next buffer >> */ >> - smp_rmb(); >> - count = head->commit - head->read; >> + next = smp_load_acquire(&head->next); >> + /* paired with smp_store_release in >> __tty_buffer_request_room(); >> + * ensures we see the committed buffer data >> + */ >> + count = smp_load_acquire(&head->commit) - head->read; >> if (!count) { >> if (next == NULL) { >> check_other_closed(tty); >> > -- Dmitry Vyukov, Software Engineer, dvyu...@google.com Google Germany GmbH, Dienerstraße 12, 80331, München Geschäftsführer: Graham Law, Christine Elizabeth Flores Registergericht und -nummer: Hamburg, HRB 86891 Sitz der Gesellschaft: Hamburg Diese E-Mail ist vertraulich. Wenn Sie nicht der richtige Adressat sind, leiten Sie diese bitte nicht weiter, informieren Sie den Absender und löschen Sie die E-Mail und alle Anhänge. Vielen Dank. This e-mail is confidential. If you are not the right addressee please do not forward it, please inform the sender, and please erase this e-mail including any attachments. Thanks. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/