On Wed May  6 15:30:24 PDT 2015, charles.fors...@gmail.com wrote:

> On 6 May 2015 at 22:28, David du Colombier <0in...@gmail.com> wrote:
> 
> > Since the problem only happens when Fossil or vacfs are running
> > on the same machine as Venti, I suppose this is somewhat related
> > to how TCP behaves with the loopback.
> >
> 
> Interesting. That would explain the clock-like delays.
> Possibly it's nearly zero RTT in initial exchanges and then, when venti
> has to do some work, things time out. You'd think it would only lead to
> needless retransmissions, not increased latency, but perhaps some
> calculation doesn't work properly with tiny values, causing one side to
> back off incorrectly.

i don't think that's possible.

NOW is defined as MACHP(0)->ticks, so this is a pretty coarse timer that
can't go backwards on intel processors.  this limits the timer's resolution
to one tick, 1/HZ seconds; HZ on 9atom is 1000, and 100 on pretty much
anything else.  further limiting the resolution are the tcp retransmit
timers, which according to presotto are
        /* bounded twixt 0.3 and 64 seconds */
so i really doubt the retransmit timers are resending anything.  if someone
has a system that isn't working right, please post
/net/tcp/<connectionno>/^(local remote status)
i'd like to have a look.
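
to make the resolution point concrete, here's a rough, self-contained
sketch (assumed numbers, not the kernel source): with time counted in HZ
ticks per second, a loopback rtt shorter than one tick measures as zero,
and the retransmit timeout is clamped to the 0.3-64 second range above
anyway, so tiny rtt samples can't drive the timeout below 300ms.

        /* illustration only; HZ and the clamp restate the numbers above,
           the rest is made up for the example */
        #include <stdio.h>

        enum {
                HZ = 100,               /* 1000 on 9atom, 100 elsewhere */
                MSPERTICK = 1000/HZ,
                MINRTO = 300,           /* "bounded twixt 0.3 and 64 seconds" */
                MAXRTO = 64000
        };

        /* retransmit timeout derived from an rtt measured in ticks, clamped */
        static long
        clampedrto(unsigned long rttticks)
        {
                long ms;

                ms = rttticks * MSPERTICK;      /* sub-tick rtts quantize to 0 */
                if(ms < MINRTO)
                        return MINRTO;
                if(ms > MAXRTO)
                        return MAXRTO;
                return ms;
        }

        int
        main(void)
        {
                printf("resolution: %d ms/tick\n", MSPERTICK);
                printf("rto for a sub-tick rtt: %ld ms\n", clampedrto(0));
                printf("rto for a 1-tick rtt: %ld ms\n", clampedrto(1));
                return 0;
        }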

quoting steve stallion ...

> > Definitely interesting, and explains why I've never seen the regression (I
> > switched to a dedicated venti server a couple of years ago). Were these the
> > changes that erik submitted? ISTR him working on reno bits somewhere around
> > there...
>
> I don't think so. Someone else submitted a different set of tcp changes
> independently much earlier.

just for the record, the earlier changes were an incorrect partial
implementation of reno.  i implemented newreno from the specs, added
corrected window scaling, and removed the problem of window slamming.
we spent a month going over cases from 50µs to 100ms rtt latency and
showed that we got near the theoretical max for all those cases.  (big
thanks to bruce wong for putting up with early, buggy versions.)
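
for a sense of what "theoretical max" means across that rtt range: the
sustained rate is bounded by roughly window/rtt, so the window has to
cover the bandwidth-delay product.  the arithmetic below is only a
back-of-the-envelope sketch; the 1Gbit/s link rate is an assumed example,
not a figure from those tests.

        /* back-of-the-envelope: window needed vs. what an unscaled
           65535-byte window can sustain (assumed 1Gbit/s link) */
        #include <stdio.h>

        int
        main(void)
        {
                double rate = 125e6;    /* 1Gbit/s in bytes/s (assumed) */
                double rtt[] = { 50e-6, 1e-3, 100e-3 };
                int i;

                for(i = 0; i < 3; i++)
                        printf("rtt %8.0fus: need %8.0f bytes of window; "
                                "unscaled 64k window caps at %.2f MB/s\n",
                                rtt[i]*1e6, rate*rtt[i], 65535/rtt[i]/1e6);
                return 0;
        }

at 100ms the unscaled window alone caps throughput well under 1MB/s,
which is why the window scaling fix mattered.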

during the investigation of this i found that loopback *is* slow for
reasons i don't completely understand.  part of this was the terrible
scheduler.  as part of the gsoc work, we were able to make the nix
scheduler not howlingly terrible for 1-8 cpus.  this improvement depends
on the goodness of mcs locks.  i developed a version of this, but ended
up using charles' much cleaner version.  there remain big problems with
the tcp and ip stack.  it's really slow.  i can't get >400MB/s on
ethernet.  it seems that the 3-way interaction between tcp:tx, tcp:rx
and the user-space queues is the issue.  queue locking is very wasteful
as well.  i have some student code that addresses part of the latter
problem, but it smells to me like ip/tcp.c's direct calls between tx
and rx are the real issue.
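
for anyone unfamiliar with mcs locks, here's a minimal, generic sketch of
the idea in portable C11 atomics.  this is not charles' version or the
nix kernel code; it just shows why they scale: each waiter spins on its
own node instead of hammering a single shared word.

        /* textbook mcs queue lock, sketched with C11 atomics; not the
           plan 9 / nix implementation */
        #include <stdatomic.h>
        #include <stdbool.h>
        #include <stddef.h>

        typedef struct MCSNode MCSNode;
        struct MCSNode {
                _Atomic(MCSNode*) next;
                atomic_bool locked;
        };

        typedef struct {
                _Atomic(MCSNode*) tail;         /* NULL when the lock is free */
        } MCSLock;

        void
        mcslock(MCSLock *l, MCSNode *n)
        {
                MCSNode *prev;

                atomic_store(&n->next, (MCSNode*)NULL);
                atomic_store(&n->locked, true);
                prev = atomic_exchange(&l->tail, n);    /* join the queue */
                if(prev != NULL){
                        atomic_store(&prev->next, n);
                        while(atomic_load(&n->locked))
                                ;                       /* spin on our own node */
                }
        }

        void
        mcsunlock(MCSLock *l, MCSNode *n)
        {
                MCSNode *succ, *expect;

                succ = atomic_load(&n->next);
                if(succ == NULL){
                        expect = n;
                        /* no successor visible: try to mark the lock free */
                        if(atomic_compare_exchange_strong(&l->tail, &expect, (MCSNode*)NULL))
                                return;
                        /* a successor is arriving; wait for it to link in */
                        while((succ = atomic_load(&n->next)) == NULL)
                                ;
                }
                atomic_store(&succ->locked, false);     /* hand the lock over */
        }

each thread passes its own MCSNode (typically stack-allocated), and the
lock itself is just a tail pointer initialised to NULL, so contention is
spread across per-waiter cache lines.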

- erik
