On Tue, Oct 25, 2016 at 02:33:55PM +0200, Morten Brørup wrote:
> Comments at the end.
> 
> Med venlig hilsen / kind regards
> - Morten Brørup
> 
> > -----Original Message-----
> > From: Bruce Richardson [mailto:bruce.richardson at intel.com]
> > Sent: Tuesday, October 25, 2016 2:20 PM
> > To: Morten Brørup
> > Cc: Adrien Mazarguil; Wiles, Keith; dev at dpdk.org; Olivier Matz; Oleg
> > Kuporosov
> > Subject: Re: [dpdk-dev] mbuf changes
> > 
> > On Tue, Oct 25, 2016 at 02:16:29PM +0200, Morten Brørup wrote:
> > > Comments inline.
> > >
> > > > -----Original Message-----
> > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce
> > > > Richardson
> > > > Sent: Tuesday, October 25, 2016 1:14 PM
> > > > To: Adrien Mazarguil
> > > > Cc: Morten Brørup; Wiles, Keith; dev at dpdk.org; Olivier Matz; Oleg
> > > > Kuporosov
> > > > Subject: Re: [dpdk-dev] mbuf changes
> > > >
> > > > On Tue, Oct 25, 2016 at 01:04:44PM +0200, Adrien Mazarguil wrote:
> > > > > > On Tue, Oct 25, 2016 at 12:11:04PM +0200, Morten Brørup wrote:
> > > > > > Comments inline.
> > > > > >
> > > > > > Med venlig hilsen / kind regards
> > > > > > > - Morten Brørup
> > > > > >
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Adrien Mazarguil [mailto:adrien.mazarguil at 6wind.com]
> > > > > > > Sent: Tuesday, October 25, 2016 11:39 AM
> > > > > > > To: Bruce Richardson
> > > > > > > > Cc: Wiles, Keith; Morten Brørup; dev at dpdk.org; Olivier Matz;
> > > > > > > Oleg Kuporosov
> > > > > > > Subject: Re: [dpdk-dev] mbuf changes
> > > > > > >
> > > > > > > On Mon, Oct 24, 2016 at 05:25:38PM +0100, Bruce Richardson
> > wrote:
> > > > > > > > On Mon, Oct 24, 2016 at 04:11:33PM +0000, Wiles, Keith
> > wrote:
> > > > > > > [...]
> > > > > > > > > > > On Oct 24, 2016, at 10:49 AM, Morten Brørup
> > > > > > > <mb at smartsharesystems.com> wrote:
> > > > > > > [...]
> > >
> > > > > > > > One other point I'll mention is that we need to have a
> > > > > > > > discussion on how/where to add in a timestamp value into
> > the
> > > > > > > > mbuf. Personally, I think it can be in a union with the
> > > > sequence
> > > > > > > > number value, but I also suspect that 32-bits of a
> > timestamp
> > > > > > > > is not going to be enough for
> > > > > > > many.
> > > > > > > >
> > > > > > > > Thoughts?
> > > > > > >
> > > > > > > If we consider that timestamp representation should use
> > > > nanosecond
> > > > > > > granularity, a 32-bit value may likely wrap around too
> > quickly
> > > > > > > to be useful. We can also assume that applications requesting
> > > > > > > timestamps may care more about latency than throughput, Oleg
> > > > found
> > > > > > > that using the second cache line for this purpose had a
> > > > noticeable impact [1].
> > > > > > >
> > > > > > >  [1] http://dpdk.org/ml/archives/dev/2016-October/049237.html
> > > > > >
> > > > > > I agree with Oleg about the latency vs. throughput importance
> > > > > > for
> > > > such applications.
> > > > > >
> > > > > > If you need high resolution timestamps, consider them to be
> > > > generated by the NIC RX driver, possibly by the hardware itself
> > > > (http://w3new.napatech.com/features/time-precision/hardware-time-
> > > > stamp), so the timestamp belongs in the first cache line. And I am
> > > > proposing that it should have the highest possible accuracy, which
> > > > makes the value hardware dependent.
> > > > > >
> > > > > > Furthermore, I am arguing that we leave it up to the
> > application
> > > > > > to
> > > > keep track of the slowly moving bits (i.e. counting whole seconds,
> > > > hours and calendar date) out of band, so we don't use precious
> > space
> > > > in the mbuf. The application doesn't need the NIC RX driver's fast
> > > > path to capture which date (or even which second) a packet was
> > > > received. Yes, it adds complexity to the application, but we can't
> > > > set aside 64 bit for a generic timestamp. Or as a weird tradeoff:
> > > > Put the fast moving 32 bit in the first cache line and the slow
> > > > moving 32 bit in the second cache line, as a placeholder for the
> > application to fill out if needed.
> > > > Yes, it means that the application needs to check the time and
> > > > update its variable holding the slow moving time once every second
> > > > or so; but that should be doable without significant effort.
> > > > >
> > > > > That's a good point, however without a 64 bit value, elapsed time
> > > > > between two arbitrary mbufs cannot be measured reliably due to
> > not
> > > > > enough context, one way or another the low resolution value is
> > > > > also
> > > > needed.
> > > > >
> > > > > Obviously latency-sensitive applications are unlikely to perform
> > > > > lengthy buffering and require this but I'm not sure about all the
> > > > > possible use-cases. Considering many NICs expose 64 bit
> > timestamps,
> > > > > I suggest we do not truncate them.
> > > > >
> > > > > I'm not a fan of the weird tradeoff either, PMDs will be tempted
> > > > > to fill the extra 32 bits whenever they can and negate the
> > > > > performance improvement of the first cache line.
> > > >
> > > > I would tend to agree, and I don't really see any convenient way to
> > > > avoid putting in a 64-bit field for the timestamp in cache-line 0.
> > > > If we are ok with having this overlap/partially overlap with
> > > > sequence number, it will use up an extra 4B of storage in that
> > cacheline.
> > >
> > > I agree about the lack of convenience! And Adrien certainly has a
> > point about PMD temptations.
> > >
> > > However, I still don't think that a NIC's ability to date-stamp a
> > packet is sufficient reason to put a date-stamp in cache line 0 of the
> > mbuf. Storing only the fast moving 32 bit in cache line 0 seems like a
> > good compromise to me.
> > >
> > > Maybe you can find just one more byte, so it can hold 17 minutes with
> > > nanosecond resolution. (I'm joking!)
> > >
> > > Please don't sacrifice the sequence number for the seconds/hours/days
> > part of a timestamp. Maybe it could be configurable to use a 32 bit or 64
> > bit timestamp.
> > >
> > Do you see both timestamp and sequence numbers being used together? I
> > would have thought that apps would either use one or the other?
> > However, your suggestion is workable in any case, to allow the sequence
> > number to overlap just the high 32 bits of the timestamp, rather than
> > the low.
> 
> In our case, I can foresee sequence numbers used for packet processing and 
> timestamps for timing analysis (and possibly for packet capturing, when being 
> used). For timing analysis, we don't need long durations, e.g. 4 seconds with 
> 32 bit nanosecond resolution suffices. And for packet capturing we are 
> perfectly capable of adding the slowly moving 32 bit of the timestamp to our 
> output data stream without fetching it from the mbuf.
> 

For the 32-bit timestamp case, it might be useful to have a right-shift
value passed in to the ethdev driver. If we assume a NIC with nanosecond
resolution (or a TSC value with resolution of that order of magnitude),
then the app can choose to have 1 ns resolution with a wraparound of
about 4 seconds, or alternatively 4 ns resolution with a wraparound of
about 17 seconds, or even roughly microsecond resolution with a
wraparound of over an hour.
The cost is obviously just a shift op in the driver code per packet,
hopefully with multiple packets done at a time using vector operations.
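
To make the arithmetic concrete, here is a rough sketch of the idea, not
an existing ethdev API. The names ts32_from_ns(), ts32_wrap_ns() and
ts32_elapsed() are made up for illustration, and it assumes the NIC (or
TSC) gives us a 64-bit count of nanoseconds:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a real DPDK function): drop the `shift`
 * low-order bits of a 64-bit nanosecond counter and keep the next
 * 32 bits. This is the per-packet cost: one shift plus a truncation. */
static inline uint32_t ts32_from_ns(uint64_t ns, unsigned int shift)
{
    return (uint32_t)(ns >> shift);
}

/* Nanoseconds until the 32-bit value wraps for a given shift,
 * assuming one raw tick is 1 ns. Each tick of the shifted value is
 * 2^shift ns, so the wrap period is 2^(32 + shift) ns. */
static inline uint64_t ts32_wrap_ns(unsigned int shift)
{
    assert(shift < 32); /* keep 32 + shift below 64 */
    return 1ULL << (32 + shift);
}

/* Elapsed ticks between two 32-bit timestamps. Unsigned subtraction
 * is modulo 2^32, so this is correct across one wraparound as long as
 * the true interval is shorter than the wrap period. */
static inline uint32_t ts32_elapsed(uint32_t earlier, uint32_t later)
{
    return later - earlier;
}
```

With shift = 0 the wrap period is 2^32 ns (about 4.3 s), with shift = 2
it is 2^34 ns (about 17 s), and with shift = 10 each tick is 1024 ns
and the wrap period is 2^42 ns (about 73 minutes).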

/Bruce
