On Feb 6, 2013, at 2:57, ext Jesse Gross wrote:
>> 
>> While I would not see a problem in passing a key hash computed with an
>> undisclosed algorithm up to userspace, I see your point.
>> 
>> In my tests it seems that about 1/3 of the benefit is attainable with
>> deferred layer pointer computation within the userspace.
> 
> Ben has been doing work optimizing flow setups, so it might be good
> for him to comment on this and how it fits in with his current ideas.


I noted some low-hanging fruit for optimization in ofproto-dpif:

1. When handling a miss with facets, the rules are compiled separately for
every packet, purely for the side effects: crediting stats to the rules and
vports hit. This is buried inside
subfacet_update_stats()->facet_push_stats()->flow_push_stats(). A separate pass
then compiles the actions, once for the whole miss. For single-packet misses
this effectively doubles the classifier/action compilation load, and for
multi-packet misses the overhead grows with every additional packet. It would
be better to gather the stats from all the packets in the miss first, and then
do a single pass over the rules/classifier that credits the stats and compiles
the actions at the same time. This could be a big win for multi-packet misses
(e.g. with coarser pooling).

2. Missed packets are pushed to the kernel first (execute), and the new flow
(if any) is created afterwards (flow_put). It seems possible to reverse this:
first create the new flow, and then have the packets use that flow within the
kernel. This way the kernel flow would take care of the stats, and they would
not need to be handled in userspace at flow creation time. The packet
execution could omit the actions and the flow key and just pass the packet
down. Essentially, this would require a new datapath operation type that drops
the packet if there is no match (i.e. when the preceding flow_put has failed).
Additionally, the datapath could remember the flow created by the last
flow_put on the given channel and use it for the immediately following packet
"resubmits". This way the lookup in the kernel flow table could be skipped
for the "resubmitted" packets.

3. When handling misses without facets, the pooling of packets is not used
at all. Again, the stats could be collected from the packets first and then
credited together with action compilation, once for the whole miss.
Currently all of this is done on a per-packet basis.

4. Subfacet creation calls odp_flow_key_hash(), which seems somewhat
expensive. This could be avoided if the flow hash computed by the kernel were
passed up with the key. Userspace should not care how the hash was computed.
With perfect fitness the same hash could be used for facets and miss pooling.
With coarser-grained pooling it would still be useful for subfacets. With the
current pooling it would eliminate two different key hash computations. For
this to work properly, the kernel should be consistent for a given flow:
either always provide the hash or never provide it.
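
To make the cost argument in items 1 and 3 concrete, here is a minimal C
sketch that only counts passes over the rules. It is not the actual
ofproto-dpif code: struct packet, struct rule, rule_pass(), miss_per_packet()
and miss_batched() are all hypothetical names made up for illustration, and
a "pass" stands in for one classifier lookup plus action compilation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins, not the real ofproto-dpif structures. */
struct packet {
    size_t len;                 /* Packet length in bytes. */
};

struct rule {
    uint64_t n_packets;         /* Packets credited to this rule. */
    uint64_t n_bytes;           /* Bytes credited to this rule. */
    unsigned int n_passes;      /* Passes over the rules (the cost). */
};

/* Models one pass over the rules: credits 'packets' and 'bytes' to the
 * rule hit.  A real pass could also compile the actions. */
static void
rule_pass(struct rule *rule, uint64_t packets, uint64_t bytes)
{
    assert(rule != NULL);
    rule->n_passes++;
    rule->n_packets += packets;
    rule->n_bytes += bytes;
}

/* Scheme criticized in item 1: one pass per packet just to credit stats,
 * plus one more pass to compile the actions for the whole miss. */
static void
miss_per_packet(struct rule *rule, const struct packet *pkts, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        rule_pass(rule, 1, pkts[i].len);
    }
    rule_pass(rule, 0, 0);      /* Separate action-compilation pass. */
}

/* Proposed scheme: gather stats over all packets in the miss first, then
 * credit them and compile the actions in a single pass. */
static void
miss_batched(struct rule *rule, const struct packet *pkts, size_t n)
{
    uint64_t bytes = 0;

    for (size_t i = 0; i < n; i++) {
        bytes += pkts[i].len;
    }
    rule_pass(rule, n, bytes);
}
```

In this model a miss with N packets costs N + 1 passes in the per-packet
scheme and one pass in the batched scheme, while the credited packet and
byte counts are identical. For N = 1 that is 2 passes versus 1, which is
the doubling mentioned in item 1.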

  Jarno
