On Wednesday 03 Jul 2013 02:09:22 Vladislav Sterzhanov wrote:
> Greetings once again. I suddenly realised (thanks, toad) that the last
> report was ages ago, so I will make up for it now.
> _______________________________________________________________
> Quick outline:
>   Proposals:
>     - Current Congestion Window Validation (RFC 2861) problems and
> solutions for them
>     - Cumulative acknowledgements
>     - Appropriate_Byte_Counting-oriented congestion window adjustments
> (RFC 3465)
>     - Fast retransmission and loss recovery (RFC 5681)
>     - Slow start and congestion avoidance (RFC 5681)
>   Discussion:
>     - Packet vs Byte size of congestion window and packets in flight
>     - Forward RTO-Recovery (RFC 5682)
> _______________________________________________________________
> 
> | Concerning CWV: Original RFC promotes two crucial ideas:
> 1) If we received an ack for a packet but did not utilize the full window
> when we actually sent it (we were application-limited rather than
> network-limited), we should not increase the window size, but instead
> shrink it, since we cannot be sure whether the link could sustain such a
> window. This shrink should be done immediately after sending the last
> packet that doesn't fill up the window.
> 2) If the application was idle for N RTO periods, the window should be
> shrunk once for each of those N periods; the previous window size can no
> longer be trusted as relevant.
> Two major problems here:
> The current implementation of CWV (PacketThrottle.notifyOfPacketAcknowledged())
> checks the size of the window *after* receiving an ack for a packet, when
> it should check the state of the window *when the packet was sent* - this
> could be solved simply by adding a bitflag to SentPacket.
> Since it only makes this check on receiving an ack, it does not handle the
> second principle: PacketThrottle simply is not notified about idle
> periods, but it should be.

We also implement the limit wrong - IIRC we limit it to the maximum size used, 
rather than "decay the congestion window each RTT to midway between the current 
cwnd value and the maximum value used."

ssthresh and cwnd stuff obviously needs changing too - but this relates to 
going back into slow-start, which we don't do at all at present. Basically we 
don't use their algorithm, we just do something similar IIRC.
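To make the "decay to midway" rule concrete, here is a minimal sketch of the RFC 2861 application-limited decay step, assuming a byte-valued cwnd. The class and method names are mine, not fred's; RFC 2861 also raises ssthresh to max(ssthresh, 3*cwnd/4) at the same time, which this sketch leaves out.

```java
// Sketch of the RFC 2861 application-limited cwnd decay, assuming a
// byte-valued congestion window. Names (CwvDecay, decayOnce) are
// illustrative, not from fred.
public final class CwvDecay {
    /**
     * Once per RTT while application-limited, move cwnd halfway
     * towards the maximum window actually used. (RFC 2861 also sets
     * ssthresh = max(ssthresh, 3*cwnd/4); omitted here.)
     */
    public static int decayOnce(int cwnd, int maxWindowUsed) {
        return (cwnd + maxWindowUsed) / 2;
    }

    public static void main(String[] args) {
        int cwnd = 100 * 1024;   // 100 KiB congestion window
        int used = 40 * 1024;    // but only 40 KiB were ever in flight
        for (int rtt = 1; rtt <= 3; rtt++) {
            cwnd = decayOnce(cwnd, used);
            System.out.println("after RTT " + rtt + ": cwnd=" + cwnd);
        }
    }
}
```

Note the decay converges towards the used window rather than collapsing to it at once, which is exactly what distinguishes it from our current "limit to the maximum size used".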

> 
> | Cumulative acks:
> The main principle is the same as in the previous report; just to
> formalize it a bit:
> - Make range acknowledgement the default one. We can leave a bitflag for
> individual acks and decide which will be more efficient for each packet.
> - Save the delta compression, but provide the possibility to ack wide
> ranges also - I see it as [[B|range1-begin][B|range1-end]
> [B|range2-begin][B|range2-end]], where B signifies the size of the next
> value (far / near).
> - Include already acked packets in the ack-range if it reduces the total
> cost.
> - Possibly include duplicate acks even if it increases the overall cost,
> depending on the per-link stats. (This one feels foggy - too much of a
> strange heuristic)
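To make the [[B|range-begin][B|range-end]] layout concrete, here is an illustrative encoder sketch. The far/near flag is modelled as a whole marker byte per value (0 = 1-byte delta from the previous value, 1 = 4-byte absolute); a real wire format would pack the flag into a single bit. All names are mine, not fred's.

```java
import java.io.ByteArrayOutputStream;
import java.util.List;

// Illustrative encoding of cumulative ack ranges, roughly following
// the [[B|range1-begin][B|range1-end][B|range2-begin][B|range2-end]]
// idea, where B signifies the size of the next value (far / near).
public final class AckRangeEncoder {
    /** Encode ranges as (begin, end) pairs, delta-compressed from base. */
    public static byte[] encode(int base, List<int[]> ranges) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int prev = base;
        for (int[] r : ranges) {
            prev = writeValue(out, prev, r[0]); // range begin
            prev = writeValue(out, prev, r[1]); // range end
        }
        return out.toByteArray();
    }

    // "Near": 1-byte delta from the previous value when it fits;
    // "far": 4-byte absolute value otherwise.
    private static int writeValue(ByteArrayOutputStream out, int prev, int value) {
        int delta = value - prev;
        if (delta >= 0 && delta <= 255) {
            out.write(0);            // near marker
            out.write(delta);
        } else {
            out.write(1);            // far marker
            out.write(value >>> 24);
            out.write(value >>> 16);
            out.write(value >>> 8);
            out.write(value);
        }
        return value;
    }

    public static void main(String[] args) {
        byte[] b = encode(1000, List.of(new int[]{1001, 1005},
                                        new int[]{1040, 1041}));
        System.out.println(b.length + " bytes"); // four near values: 8 bytes
    }
}
```

Including already-acked packets in a range (the third bullet) would fall out naturally here: widening a range by a few near deltas can be cheaper than starting a new far value.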
> 
> | ABC - the main message is that we should increase the window size not
> by some constant multiple every time we receive an ack, but instead by an
> amount that depends on how much new data the ack acknowledges (the
> behaviour varies depending on whether we are in the slow-start or the
> congestion avoidance phase). But we need to decide about the units that
> window size is measured in (see below) - otherwise there will not be much
> sense in it.

The RFC seems reasonable.  Currently we increase the window size for any newly 
acked packet; one incoming packet can thus increase the window size quite a 
bit. We can send small packets sometimes so this change may make sense, 
although we try to send big packets when possible. So right now we can get the 
"line rate bursts" it talks about when lots of packets are acked at once. We 
may need to limit that.
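For reference, a minimal sketch of the RFC 3465 increase rule, assuming a byte-valued cwnd; the L*SMSS cap is what limits the bursts mentioned above. Names and constants are illustrative, and the congestion-avoidance branch approximates the RFC's bytes_acked counter with a proportional per-ack credit.

```java
// Sketch of RFC 3465 Appropriate Byte Counting for a byte-valued cwnd.
// Class, method and constant names are illustrative, not fred's.
public final class AbcWindow {
    static final int SMSS = 1400; // assumed maximum packet payload size
    static final int L = 2;       // ABC limit: at most L*SMSS growth per ack

    /** Returns the new cwnd after an ack covering ackedBytes of new data. */
    public static int onAck(int cwnd, int ssthresh, int ackedBytes) {
        if (cwnd < ssthresh) {
            // Slow start: grow by the bytes actually acked, capped at
            // L*SMSS per ack to limit line-rate bursts.
            return cwnd + Math.min(ackedBytes, L * SMSS);
        }
        // Congestion avoidance: roughly one SMSS of growth per cwnd of
        // acked data, credited proportionally per ack (an approximation
        // of the RFC's bytes_acked counter).
        return cwnd + Math.max(1, SMSS * ackedBytes / cwnd);
    }

    public static void main(String[] args) {
        System.out.println("slow start: 2800 -> " + onAck(2800, 10000, 1400));
    }
}
```

The key difference from our current behaviour: a single incoming packet acking many small packets grows cwnd by the bytes covered (capped), not by a constant per packet acked.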

We need to be careful here, quite a bit of TCP doesn't apply directly, e.g. 
because of cumulative acks / not always having SACK.

Should we follow the TCP rules and send acks more often? Is there a really good 
reason to do so? Bear in mind that we like to delay things for as long as 
reasonable as our packets have fairly large overhead. Later on it talks about 
bursts only being a problem for devices with very small buffers, but this 
probably isn't the whole story...

"   Delayed ACKs [RFC1122,RFC2581] allow a TCP receiver to refrain from
   sending an ACK for each incoming segment.  However, a receiver SHOULD
   send an ACK for every second full-sized segment that arrives."
- Worth looking into the reasons for this?

"   Furthermore, a receiver MUST NOT withhold an ACK for more than 500
   ms."

This is fine, IIRC the current limit is less than that. (Maybe we could 
increase it?)

Speaking of which, our coalescing delay for most traffic is 100ms; TCP's is 
200ms.

Not sure whether this part applies to us:

"   Note that delayed ACKs should not be an issue during slow start-based
   loss recovery, as RFC 2581 recommends that receivers should not delay
   ACKs that cover out-of-order segments.  Therefore, as discussed
   above, ABC with L > 1*SMSS bytes is inappropriate for such slow start
   based loss recovery and MUST NOT be used."
> 
> | Fast retransmission. I was a bit confused by it: this freenet-wiki page
> <https://wiki.freenetproject.org/Transport_Layer> states that fred
> implements the fast retransmission logic, but I didn't manage to find
> any. Nothing in NPFCK, which is responsible for everything concerning
> message ids. Is it really unimplemented, or am I just looking in the
> wrong direction?
> | Fast recovery: also nothing for this part of TCP congestion control.
> Needs to be implemented.
> | Slow-start -- Congestion avoidance: as soon as we leave slow-start, we
> never return to it again in the context of a single link - we should, at
> least for the purposes of fast recovery.

Yes, we need this stuff, discussed elsewhere.

TCP detects packet loss quickly by repeated acks of the same number, we can 
detect it by getting an ack for a later packet number.
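A sketch of that adapted trigger: instead of TCP's three duplicate cumulative acks, count how many *later* sequence numbers get acked while an earlier one is still outstanding. All names are illustrative, not fred's.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a Fast Retransmit trigger adapted to per-packet acks:
// a packet is declared lost once THRESHOLD later sequence numbers
// have been acked while it is still outstanding.
public final class FastRetransmitDetector {
    static final int THRESHOLD = 3; // TCP's classic dupack threshold

    // outstanding seq -> number of later acks seen so far
    private final Map<Integer, Integer> laterAcks = new HashMap<>();

    /** Register a sent, not-yet-acked sequence number. */
    public void onSend(int seq) { laterAcks.put(seq, 0); }

    /**
     * Feed each incoming ack. Returns the sequence number that should
     * be fast-retransmitted, or -1 if none crossed the threshold.
     */
    public int onAck(int ackedSeq) {
        laterAcks.remove(ackedSeq);
        int toResend = -1;
        for (Map.Entry<Integer, Integer> e : laterAcks.entrySet()) {
            if (e.getKey() < ackedSeq) {
                int n = e.getValue() + 1;
                e.setValue(n);
                if (n == THRESHOLD && toResend == -1) toResend = e.getKey();
            }
        }
        return toResend;
    }
}
```

So after sending 1..5, acks for 2, 3 and 4 arriving while 1 is unacked would trigger a resend of 1, long before the RTO fires.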
> 
> So now for the points that I'd like to discuss:
> 
> | Packet vs Byte size: not sure about the current status of this
> discussion; the last description of it that I've seen was in the comments
> to NPFCK.countSentPackets() - if there is something more recent, please
> let me know. *IMHO*, changing it to bytes will give *much* more precise
> control over the link. In any case, from the point of view of the already
> implemented algos and the ones outlined here, there is a *very*
> significant difference between having N max-sized packets currently in
> flight and just N left-overs or ack-only packets, much shorter than
> maxPacketSize. Thoughts? Of course it will create some difficulties, e.g.
> what to consider a "full" window? I propose considering the window full
> if it utilizes > (window - window%maxPacketSize) bytes - quite natural.

Yes, discussed elsewhere.
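The proposed "full window" criterion is simple enough to state as code; a sketch, assuming a byte-valued window (names are mine):

```java
// Sketch of the proposed "full window" test for a byte-valued
// congestion window: the window counts as full once in-flight data
// exceeds the largest whole-packet multiple that fits in the window.
public final class WindowUtil {
    public static boolean isWindowFull(int bytesInFlight, int window,
                                       int maxPacketSize) {
        return bytesInFlight > window - (window % maxPacketSize);
    }

    public static void main(String[] args) {
        int window = 10_000, mps = 1_400; // 7 full packets fit (9800 bytes)
        System.out.println(isWindowFull(9_800, window, mps)); // false
        System.out.println(isWindowFull(9_801, window, mps)); // true
    }
}
```

i.e. with a 10000-byte window and 1400-byte packets, the window is full once more than 9800 bytes are in flight, since an eighth full packet could never fit anyway.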
> 
> | F-RTO algo: the logic is not trivial enough to outline here, but it
> allows detecting the cases where the retransmission timeout was spurious,
> i.e. we could have waited a while longer and the ack would have arrived.
> It could help especially over high-latency and high-bandwidth links. Has
> anyone looked through it - could it be useful for us?
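The core of basic F-RTO (RFC 5682, ignoring the SACK variant) fits in a tiny state machine: after the RTO retransmission, send *new* data on the first window-advancing ack; if the second ack also advances the window, the timeout was spurious and congestion state can be reverted, while any duplicate ack means the loss was real. A hedged sketch, names mine:

```java
// Sketch of the basic F-RTO spurious-timeout detector (RFC 5682),
// SACK-less variant only. Names are illustrative, not fred's.
public final class Frto {
    public enum Verdict { UNDECIDED, SPURIOUS, GENUINE }

    private int acksSeen = 0; // acks seen since the RTO retransmission

    /**
     * Feed each ack arriving after the RTO retransmission;
     * advancesWindow is true when the ack covers previously unacked
     * data. On UNDECIDED the sender should transmit new (not
     * retransmitted) data and wait for one more ack.
     */
    public Verdict onAck(boolean advancesWindow) {
        acksSeen++;
        if (!advancesWindow) return Verdict.GENUINE;   // dup ack: real loss
        if (acksSeen == 1) return Verdict.UNDECIDED;   // send new data, wait
        return Verdict.SPURIOUS;                       // second advancing ack
    }
}
```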
> 
> You are welcome to ask if something is not clear to you - I'm falling
> asleep, so apologies in advance.

Okay so I still need to look at:
- Reno / RFC 5681
- Veno
- Does Veno extend New Reno?
- Forward RTO Recovery: RFC 5682

Some general notes:
TCP/IP details: Notes relevant to our changes (GSoC project):
- PacketSender:
-- Fair queueing. Or weighted fair queueing, weighted round robin. Etc. Worth 
looking at existing algorithms. Priorities of messages; priorities of peers 
(e.g. prefer darknet).
--- TODO Compare with what I just sent to quadrocube. Can we get along fine 
without cross-peer priorities, just using deadlines for key things??
--- http://en.wikipedia.org/wiki/Bandwidth_management
- Much complexity caused by not usually having ECN, and not always having SACK.
- TCP Tahoe:
-- Receiving duplicate acks is the usual loss signal because we don't 
usually have ECN. The ack gets resent without the cumulative pointer 
advancing; this indicates the receiver got a later, out-of-order segment.
-- We never go back to slow start. We never update ssthresh.
-- How to compute RTTs accurately with the right sensitivity? What is the right 
sensitivity? One window? Karn's algorithm doesn't apply because we never resend 
a packet with the same seqnum.
-- Fast Retransmit: Currently we only ever resend after a timeout. Fast 
Retransmit would have us resend immediately after receiving 3 acks for later 
packets.
-- Fast Recovery: Should be straightforward.
- TCP Vegas, BIC, CUBIC: Use changes in RTT to identify load rather than lost 
packets. Can work well for very fast connections with high delay. Not widely 
deployed. (SEE VENO!)
- Difference between Reno and New Reno?
- TODO Look at Veno.
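On the RTT sensitivity question above: the standard TCP answer is the Jacobson/Karels estimator as specified in RFC 6298 (alpha = 1/8, beta = 1/4, RTO = SRTT + 4*RTTVAR). A sketch, names mine; and since we never resend a packet with the same seqnum, every sample is unambiguous, so Karn's exclusion rule is indeed unnecessary for us:

```java
// Sketch of the RFC 6298 / Jacobson-Karels RTT estimator.
// Names are illustrative, not fred's.
public final class RttEstimator {
    private double srtt = -1; // smoothed RTT, -1 = no sample yet
    private double rttvar;    // RTT variation estimate

    /** Feed one RTT measurement (ms); returns the updated RTO (ms). */
    public double sample(double r) {
        if (srtt < 0) {         // first measurement (RFC 6298 s2.2)
            srtt = r;
            rttvar = r / 2;
        } else {                // subsequent measurements (s2.3)
            rttvar = 0.75 * rttvar + 0.25 * Math.abs(srtt - r);
            srtt = 0.875 * srtt + 0.125 * r;
        }
        return srtt + 4 * rttvar; // RTO, before min/max clamping
    }
}
```

The 1/8 gain means the estimate reacts over a handful of windows rather than jumping per sample; whether that is the right sensitivity for our links is exactly the open question.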


_______________________________________________
Devl mailing list
[email protected]
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl