> > We have tested the effect of turbo mode on TSC and there is none. The
> TSC frequency remains at the nominal clock speed, no matter if the core is
> clocked down or up. So, I believe for PMD threads (where performance
> matters) TSC would be an adequate and efficient clock.
> 
> It's highly platform dependent and testing on a few systems doesn't
> guarantee anything.
> On the other hand, POSIX guarantees the monotonic characteristics of
> CLOCK_MONOTONIC.

TSC is also monotonic on a given core. Does CLOCK_MONOTONIC guarantee any 
better accuracy than TSC for PMD threads?

> > On PMDs I am a bit concerned about the overhead/latency introduced
> with the clock_gettime() system call, but I haven't done any measurements
> to check the actual impact. Have you?
> 
> Have you seen my incremental patches?
> There is no overhead, because we're just replacing 'time_msec' with
> 'time_usec'.
> No difference except converting timespec to usec instead of msec.

I did look at your incremental patches, and we will test their performance. I
was already concerned about the system call cost on master before them. Perhaps
I'm paranoid, but I would like to double-check by testing.
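
A rough sketch of how the per-call cost could be measured in isolation,
assuming Linux and a POSIX clock_gettime(). The helper names are made up for
this example and are not taken from either patch. On many x86-64 Linux setups
clock_gettime(CLOCK_MONOTONIC) is served from the vDSO without entering the
kernel, so the result depends heavily on the platform and clocksource.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Converts a timespec to nanoseconds. */
static uint64_t
timespec_to_nsec(const struct timespec *ts)
{
    return (uint64_t) ts->tv_sec * 1000000000ULL + (uint64_t) ts->tv_nsec;
}

/* Hypothetical helper: average cost in nanoseconds of one
 * clock_gettime(CLOCK_MONOTONIC) call, taken over 'iters' back-to-back
 * calls.  Returns 0.0 if 'iters' is zero. */
static double
clock_gettime_cost_nsec(unsigned int iters)
{
    struct timespec start, end, tmp;
    volatile uint64_t sink = 0;   /* Keeps the loop from being optimized out. */

    if (iters == 0) {
        return 0.0;
    }
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (unsigned int i = 0; i < iters; i++) {
        clock_gettime(CLOCK_MONOTONIC, &tmp);
        sink += (uint64_t) tmp.tv_nsec;
    }
    clock_gettime(CLOCK_MONOTONIC, &end);
    (void) sink;

    return (double) (timespec_to_nsec(&end) - timespec_to_nsec(&start))
           / iters;
}
```

Calling clock_gettime_cost_nsec(1000000) on the target host gives a first
estimate. It does not replace the end-to-end throughput tests, since it
ignores cache effects in the real PMD loop.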

> > If we go for CLOCK_MONOTONIC in microsecond resolution, we should
> make sure that the clock is read not more than once every iteration
> (and cache the us value as now in the pmd ctx struct as suggested in your
> other patch). But then for consistency also the XPS feature should use the
> PMD time in us resolution.
> 
> Again, please, look at my incremental patches.

As far as I could see, you did not, for example, consistently adapt
tx_port->last_used to microsecond resolution.
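
To illustrate what I mean by consistency, here is a minimal sketch, assuming
the clock is read once per iteration and every consumer compares against the
same cached microsecond value. All struct and function names below are
hypothetical and only loosely modelled on the pmd ctx and tx_port->last_used.
They are not the actual dpif-netdev definitions.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical per-thread context holding the cached time in usec. */
struct pmd_ctx_sketch {
    int64_t now_usec;
};

/* Hypothetical tx port whose last-use timestamp is kept in usec. */
struct tx_port_sketch {
    int64_t last_used_usec;
};

/* Converts a timespec to microseconds. */
static int64_t
timespec_to_usec(const struct timespec *ts)
{
    return (int64_t) ts->tv_sec * 1000000 + ts->tv_nsec / 1000;
}

/* Reads the clock once; meant to be called at the top of each iteration. */
static void
ctx_update_time(struct pmd_ctx_sketch *ctx)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    ctx->now_usec = timespec_to_usec(&ts);
}

/* Stamps the port with the cached time instead of re-reading the clock. */
static void
tx_port_mark_used(const struct pmd_ctx_sketch *ctx,
                  struct tx_port_sketch *port)
{
    port->last_used_usec = ctx->now_usec;
}

/* Returns true if the port has been idle for more than 'timeout_usec'.
 * Both operands are in usec, so no msec/usec mixing can occur. */
static bool
tx_port_idle_longer_than(const struct pmd_ctx_sketch *ctx,
                         const struct tx_port_sketch *port,
                         int64_t timeout_usec)
{
    return ctx->now_usec - port->last_used_usec > timeout_usec;
}
```

The point is only that the timestamp and every timeout compared against it
share one unit and one source.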

> > For non-PMD thread we could actually skip time-based output batching
> completely. The packet rates and the frequency of calls to
> dpif_netdev_run() in the main ovs-vswitchd thread are so low that time-
> based flushing doesn't seem to make much sense.

Have you considered this option? 

> >
> > Below you can find an alternative incremental patch on top of your RFC
> 4/4 that uses TSC on PMD. We will be comparing the two alternatives for
> performance both with non-PMD guests (iperf3) as well as PMD guests
> (DPDK testpmd).
> 
> In your version you need to move all the output_batching related code
> under #ifdef DPDK_NETDEV because it will break userspace networking if
> compiled without dpdk and output-max-latency != 0.

Not sure. Batching should implicitly be disabled because cycles_counter() and
cycles_per_microsecond() would both return zero. But I agree that would be a
fairly cryptic design. If we use TSC in PMDs, we should explicitly skip
time-based tx batching on the non-PMD thread.
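
An explicit check could look roughly like the sketch below. The struct, its
fields and both function names are assumptions made for this example. They
are not the helpers from my incremental patch, and the real code would take
these values from the existing pmd thread state.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-thread state for this sketch. */
struct thread_sketch {
    bool is_pmd;               /* True for PMD threads only. */
    uint64_t cycles_per_usec;  /* 0 if no cycle counter is available. */
};

/* Decides explicitly whether time-based tx batching applies, instead of
 * relying on a zero cycle count to disable it as a side effect. */
static bool
time_batching_enabled(const struct thread_sketch *t,
                      uint32_t max_latency_usec)
{
    return t->is_pmd && t->cycles_per_usec != 0 && max_latency_usec != 0;
}

/* Cycle count at which a batch started at 'start_cycles' must be flushed.
 * Only meaningful when time_batching_enabled() returns true. */
static uint64_t
flush_deadline_cycles(const struct thread_sketch *t, uint64_t start_cycles,
                      uint32_t max_latency_usec)
{
    return start_cycles + (uint64_t) max_latency_usec * t->cycles_per_usec;
}
```

With a check like this, the non-PMD thread and builds without a cycle counter
would flush immediately by design instead of by accident.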

Anyway, if the cost of the clock_gettime() system call proves insignificant and 
our performance tests comparing our TSC-based with your CLOCK_MONOTONIC-based 
implementation show equivalent results, we can go for your approach.

BR, Jan
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
