On Thu, 25 Jun 2020 13:55:07 -0400 (EDT) Mathieu Desnoyers <mathieu.desnoy...@efficios.com> wrote
> > Here's the design of this solution: > > > > All this is per cpu, and only needs to worry about nested events (not > > parallel events). > > > > The players: > > > > write_tail: The index in the buffer where new events can be written to. > > It is incremented via local_add() to reserve space for a new event. > > > > before_stamp: A time stamp set by all events before reserving space. > > > > write_stamp: A time stamp updated by events after it has successfully > > reserved space. > > What are the types used for before_stamp and write_stamp ? If those > are 64-bit integers, how does sharing them between nested contexts > work on 32-bit architectures ? Well, write_stamp is updated via local64, which I believe handles this for us. I probably should make before_stamp handle it as well. > > > > > next_write: A copy of "write_tail" used to help with races. > > > > /* Save the current position of write */ > > [A] w = local_read(write_tail); > > barrier(); > > /* Read both before and write stamps before touching > > anything */ before = READ_ONCE(before_stamp); > > after = local_read(write_stamp); > > barrier(); > > > > /* > > * If before and after are the same, then this event is not > > * preempting a time update. If it is, then reserve space > > for adding > > You should use the term "interrupt" rather than "preempting", because > as you stated yourself, this algorithm only works with nested > interrupts, not with preemption. The two terms are basically interchangeable here, but for consistency, I will update it. Thanks. > > > * a full time stamp (this can turn into a time extend which > > is > > * just an extended time delta but fill up the extra space). > > */ > > if (after != before) > > abs = true; > > > > ts = clock(); > > > > /* Now update the before_stamp (everyone does this!) */ > > [B] WRITE_ONCE(before_stamp, ts); > > > > /* Read the current next_write and update it to what we want > > write > > * to be after we reserve space. */ > > next = READ_ONCE(next_write); > > WRITE_ONCE(next_write, w + len); > > > > /* Now reserve space on the buffer */ > > [C] write = local_add_return(len, write_tail); > > So the reservation is not "just" an add instruction, it's actually an > xadd on x86. Is that really faster than a cmpxchg ? I believe the answer is still yes. But I can run some benchmarks to make sure. > > > > > /* Set tail to be were this event's data is */ > > tail = write - len; > > > > if (w == tail) { > > > > /* Nothing preempted this between A and C */ > > [D] local_set(write_stamp, ts); > > barrier(); > > [E] save_before = READ_ONCE(before_stamp); > > > > if (!abs) { > > /* This did not preempt a time update */ > > delta = ts - a; > > What does "a" refer to ? What triggers its update ? Oops, When I first wrote this, I used "a" and "b" for "after" and "before" and had "after" and "before" be "after_stamp" and "before_stamp". I missed this update. Nice catch. > > > } else { > > /* slow path */ > > if (w == next) { > > If it's a slow path, why try to play games with a delta timestamp ? > Space should not be an issue for a slow path like this. It would be > simpler to just use the full timestamp here. Hmm, you may be right. Previous iterations of this code had a distinct difference here, but after restructuring it, I don't think that distinction is valid anymore. If anything, having this at least lets me validate that it's doing what I believe it should be doing (as I added a ton of trace_printk()s into this). > > > /* This event preempted the previous > > event > > * just after it reserved its > > buffer. > > You mean nesting after [C] but before [D]. Yes. I can add that for clarity, but perhaps I don't need that if I merge the two. > > > * It's timestamp should be > > "before_stamp" */ > > It's -> Its ;-) My email client messed up your formatting of the rest of the email, so I'll send a separate reply. -- Steve