On 2015/09/18 13:36, Martin Pieuchot wrote:
> On 18/09/15(Fri) 15:55, David Gwynne wrote:
> > hashing bits of packet headers to tie connections to particular
> > physical interfaces within a trunk turns out to be fairly expensive.
> > in my very unscientific testing it is about 20% of the cost of udp
> > traffic generated with tcpbench -u.
> > 
> > we could tune or change the hash. eg, going from siphash 2 4 to
> > siphash 1 1 halves the overhead of hashing. however, it occurred
> > to me that sometimes we already know about connections. why not
> > reuse that info if it is available?
> 
> Why not, but I'd argue that's orthogonal to the fact that siphash
> 2 4 has a high cost. 
> 
> > this lets pf embed the state id into the mbuf as a "flow id" so
> > other subsystems can use it. eg, trunk can pull it out and use it.
> > 
> > this diff steals the pad field in mbuf packet headers and uses it
> > to embed a flow id. it makes pf fill it in, and trunk use it. this
> > avoids the cost of hashing in trunk altogether.
> > 
> > it could be used in other places too, eg, picking an upstream when
> > we're going multipath routing.
> 
> I've been through RFC 2992 again and indeed I believe we could use that.

as far as trunk(4) goes, 802.3-2000 section 43.2.1 says:

f) Frame ordering must be maintained for certain sequences of frame
exchanges between MAC Clients (known as conversations, see 1.4). The
Distributor ensures that all frames of a given conversation are passed
to a single port. For any given port, the Collector is required to pass
frames to the MAC Client in the order that they are received from that
port. The Collector is otherwise free to select frames received from the
aggregated ports in any order. Since there are no means for frames to be
mis-ordered on a single link, this guarantees that frame ordering is
maintained for any conversation.

so we're OK from that perspective.
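
to make the distributor rule concrete, here is a rough sketch of picking
a port from a flow id when one is available and falling back to hashing
otherwise. this is not the actual diff: struct pkt_hdr, flowid_valid,
slow_hash() and pick_port() are made-up names for illustration, the
16-bit width of the flow id is an assumption, and slow_hash() is only a
placeholder for the real (siphash) header hash.

```c
#include <stdint.h>

/* Hypothetical stand-in for the packet header fields under discussion. */
struct pkt_hdr {
	uint16_t flowid;	/* assumed: the former pad field, set by e.g. pf */
	int	 flowid_valid;	/* assumed flag: nonzero if flowid was filled in */
	uint32_t src, dst;	/* stand-ins for the header bits hashed today */
};

/* Placeholder for the expensive header hash; NOT siphash. */
static uint32_t
slow_hash(const struct pkt_hdr *h)
{
	uint32_t x = h->src ^ (h->dst * 2654435761u);

	x ^= x >> 16;
	return (x);
}

/*
 * Pick an output port. The key depends only on the flow, so every
 * frame of a given conversation maps to the same port. nports must
 * be nonzero.
 */
static unsigned int
pick_port(const struct pkt_hdr *h, unsigned int nports)
{
	uint32_t key = h->flowid_valid ? h->flowid : slow_hash(h);

	return (key % nports);
}
```

since the port is a pure function of the flow id (or of the hashed
header fields), frames of one conversation never get spread across
ports, which is all the text above asks of the Distributor.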

> What about carp(4) and bridge(4)?

I don't think it applies to bridge; load balancing is done at a lower
level there (i.e. you'd have trunk as a member of a bridge if you wanted
to balance across links).

Probably the same for carp. There might be some opportunity, but
it's already a bit of a minefield to have things working nicely with
pfsync/defer in various different situations.
