Justin asked an interesting question today: how does this affect performance 
on the manager? That is where we are feeling a lot of pain with select().

On Apr 11, 2017, at 7:41 PM, Siwek, Jon <jsi...@illinois.edu> wrote:

I recently got a minimal CAF-based run loop for Bro working, did crude 
performance comparisons, and wanted to share.

The approach was to measure the average time between calls of 
net_packet_dispatch() and also the average time it takes to analyze a packet.  
The former attempts to measure the overhead imposed by the loop implementation 
and the latter just gives an idea of how significant a chunk of time that is in 
relation to Bro’s main workload.  I found that the overhead of the loop can be 
~5-10% of the packet processing time, so it does seem worthwhile to try to keep 
the run loop overhead low.
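As a rough illustration of that measurement (not the instrumentation actually used in Bro), the two averages can be collected as sketched below. The `LoopTimer` class and its methods are hypothetical names, and the sketch assumes "time between calls" means the gap from the return of one dispatch call to the entry of the next.

```python
import time


class LoopTimer:
    """Accumulates run-loop overhead and per-packet processing time (ns)."""

    def __init__(self, clock=time.perf_counter_ns):
        self.clock = clock
        self.overhead_total = 0
        self.process_total = 0
        self.overhead_samples = 0
        self.process_samples = 0
        self._last_end = None  # when the previous dispatch call returned

    def dispatch(self, handler, packet):
        start = self.clock()
        if self._last_end is not None:
            # Time spent outside the handler since the previous packet:
            # this is what the sketch counts as loop overhead.
            self.overhead_total += start - self._last_end
            self.overhead_samples += 1
        handler(packet)
        end = self.clock()
        self.process_total += end - start
        self.process_samples += 1
        self._last_end = end

    def averages(self):
        """Return (avg overhead, avg process) in clock units."""
        avg_overhead = self.overhead_total / max(self.overhead_samples, 1)
        avg_process = self.process_total / max(self.process_samples, 1)
        return avg_overhead, avg_process
```

Wrapping every call to the packet handler in `dispatch()` and reading `averages()` at the end of a run gives the same pair of figures as the raw data further down.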

Initial testing of the CAF-based loop showed the overhead increased by ~1.8x, 
but there was still a major difference in the implementations: the standard Bro 
loop only invokes its IOSource polling mechanism (select) once every 25 cycles 
of the loop, while the CAF implementation’s polling mechanism (actor/thread 
scheduling + messaging + epoll) is used for every cycle/packet.  As one would 
expect, by just trivially spinning the main process() function in a loop for 25 
iterations, the overhead of the CAF-based loop comes back into line with the 
standard run loop.
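The throttling idea can be sketched as follows. This is only an illustration of "poll once every 25 cycles", not Bro's or CAF's actual loop; `run_loop`, `process`, and `poll_sources` are hypothetical names.

```python
def run_loop(packets, process, poll_sources, poll_every=25):
    """Process packets, invoking the (expensive) polling step only once
    every `poll_every` iterations. Returns the number of polls performed."""
    polls = 0
    for i, packet in enumerate(packets):
        if i % poll_every == 0:
            poll_sources()  # e.g. select()/poll()/epoll over the IO sources
            polls += 1
        process(packet)
    return polls
```

With `poll_every=1` this degenerates to one poll per packet, which is the case the unthrottled CAF loop was effectively in.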

To better measure the actual differences related to the polling mechanism 
implementation, I quickly hacked Bro’s standard run loop to select() on every 
packet instead of once every 25 packets and found that the overhead measures 
within +/- 10% of the 1.8x overhead increase of the initial CAF-based loop.  So 
is the cost of the extra system call for epoll/select per packet the main thing 
to avoid?  Sort of.  I again hacked Bro’s standard loop to be able to use 
either epoll or poll instead of select and found that those do better, with the 
overhead increase being about 1.3x (still doing one “poll” per packet) in 
relation to the standard run loop.  This means there is a measurable trend in 
polling mechanism performance (for a sparse number of FDs/sources): poll comes 
in first, epoll second, with CAF and select about tied for third.
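For anyone who wants to poke at the select/poll/epoll difference outside of Bro, a rough micro-benchmark over a sparse FD set might look like the sketch below. It is not the test that produced the numbers in this mail; `make_pollers` and `time_pollers` are hypothetical helpers, `poll` and `epoll` are only set up where the platform provides them, and the FDs are assumed to be Unix pipes or sockets.

```python
import select
import time


def make_pollers(fds):
    """Return {name: callable}; each callable does one zero-timeout
    readiness check over `fds` and returns the list of readable FDs."""
    pollers = {"select": lambda: select.select(fds, [], [], 0)[0]}
    if hasattr(select, "poll"):
        p = select.poll()
        for fd in fds:
            p.register(fd, select.POLLIN)
        pollers["poll"] = lambda: [fd for fd, _ in p.poll(0)]
    if hasattr(select, "epoll"):  # Linux only
        ep = select.epoll()
        for fd in fds:
            ep.register(fd, select.EPOLLIN)
        pollers["epoll"] = lambda: [fd for fd, _ in ep.poll(0)]
    return pollers


def time_pollers(pollers, iterations=10000):
    """Return {name: average nanoseconds per readiness check}."""
    results = {}
    for name, check in pollers.items():
        start = time.perf_counter_ns()
        for _ in range(iterations):
            check()
        results[name] = (time.perf_counter_ns() - start) / iterations
    return results
```

Calling `time_pollers(make_pollers([read_fd]))` on the read end of an `os.pipe()` gives per-call costs for each mechanism. Python call overhead will dominate much more than it would in C++, so only the relative ordering is of any interest.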

Takeaways:

(1) Regardless of runloop implementation or polling mechanism choices, 
performing the polling operation once per packet should probably be avoided.  
In concept, it’s an easy way to get a 2-5% speedup in relation to total packet 
processing time.

(2) Related to (1), but not in the sense of performance, is that even w/ a 
CAF-based loop it still seems somewhat difficult to reason about the reality of 
how IOSources are prioritized.  In the standard loop, the priority of an 
IOSource is a combination of its “idle” state, the polling frequency, and a 
timestamp, which it often chooses arbitrarily as the “time of last packet”, 
just so that it gets processed with higher priority than subsequent packets.  
Maybe the topic of making IOSource prioritization more explicit/well-defined 
could be another thread of discussion, but my initial thought is that the whole 
IOSource abstraction may be over-generalized and maybe not even needed.

(3) The performance overhead of a CAF-based loop doesn’t seem like a 
showstopper for proceeding with it as a choice for replacing the current loop.  
It’s not significantly worse than the current loop (provided we still throttle 
the polling ratio when packet sources are saturated), and even using the most 
minimal loop implementation of just poll() would only be about a 1% speedup in 
relation to the total packet processing workload.

Just raw data below, for those interested:

I tested against the pcaps from http://tcpreplay.appneta.com/wiki/captures.html
(I was initially going to use tcpreplay to test performance against a live 
interface, but decided reading from a file is easier and just as good for what 
I wanted to measure).
Numbers are measured in “ticks”, which are equivalent to nanoseconds on the 
test system.
Bro and CAF are both compiled w/ optimizations.

bigFlows.pcap, 1 “poll” per packet
--------------------------
poll
('avg overhead', 1018.8868239999998)
('avg process', 11664.4968147)

epoll
('avg overhead', 1114.2168096999999)
('avg process', 11680.6078816)

CAF
('avg overhead', 1515.9933343999996)
('avg process', 11914.897109200003)

select
('avg overhead', 1792.8142910999995)
('avg process', 11863.308550400001)

bigFlows.pcap, Polling Throttled to 1 per 25 packets
---------------------------
poll
('avg overhead', 772.6118347999999)
('avg process', 11504.2397625)

epoll
('avg overhead', 814.4771509)
('avg process', 11547.058394900001)

CAF
('avg overhead', 847.6571822)
('avg process', 11681.377972700002)

select
('avg overhead', 855.2147494000001)
('avg process', 11585.1111236)

smallFlows.pcap, 1 “poll” per packet
----------------------------
poll
('avg overhead', 1403.8950280800004)
('avg process', 22202.960570839998)

epoll
('avg overhead', 1470.0554376)
('avg process', 22210.3240474)

select
('avg overhead', 2305.6278429200006)
('avg process', 22549.29251384)

CAF
('avg overhead', 2405.1401093399995)
('avg process', 23401.66596454)

smallFlows.pcap, Polling Throttled to 1 per 25 packets
-----------------------------
poll
('avg overhead', 1156.0900352)
('avg process', 22113.8645395)

epoll
('avg overhead', 1192.37176)
('avg process', 22000.2246757)

select
('avg overhead', 1269.0761219)
('avg process', 22017.891367999997)

CAF
('avg overhead', 1441.6064868)
('avg process', 22658.534969599998)
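To make the figures above easier to compare, here is a small sketch that turns them into overhead as a percentage of processing time and into a ratio against the standard loop. The values are the ones listed above, rounded to one decimal. Treating "throttled select" as the standard-loop baseline is my reading of the text, and `summarize` is a hypothetical helper.

```python
# (avg overhead, avg process) in ticks, rounded from the raw data above.
RESULTS = {
    "bigFlows": {
        "per_packet": {"poll": (1018.9, 11664.5), "epoll": (1114.2, 11680.6),
                       "CAF": (1516.0, 11914.9), "select": (1792.8, 11863.3)},
        "throttled": {"poll": (772.6, 11504.2), "epoll": (814.5, 11547.1),
                      "CAF": (847.7, 11681.4), "select": (855.2, 11585.1)},
    },
    "smallFlows": {
        "per_packet": {"poll": (1403.9, 22203.0), "epoll": (1470.1, 22210.3),
                       "select": (2305.6, 22549.3), "CAF": (2405.1, 23401.7)},
        "throttled": {"poll": (1156.1, 22113.9), "epoll": (1192.4, 22000.2),
                      "select": (1269.1, 22017.9), "CAF": (1441.6, 22658.5)},
    },
}


def summarize(results):
    """Return one row per (pcap, mechanism) with derived percentages."""
    rows = []
    for pcap, modes in results.items():
        # Assumed baseline: the standard loop, i.e. select throttled to 1/25.
        baseline_overhead = modes["throttled"]["select"][0]
        for name, (oh_pp, proc_pp) in modes["per_packet"].items():
            oh_thr, proc_thr = modes["throttled"][name]
            rows.append({
                "pcap": pcap,
                "mechanism": name,
                "per_packet_pct": 100 * oh_pp / proc_pp,
                "throttled_pct": 100 * oh_thr / proc_thr,
                "vs_baseline": oh_pp / baseline_overhead,
            })
    return rows


for row in summarize(RESULTS):
    print("{pcap:10} {mechanism:6} per-packet {per_packet_pct:5.1f}%  "
          "throttled {throttled_pct:5.1f}%  vs standard loop "
          "{vs_baseline:.2f}x".format(**row))
```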

_______________________________________________
bro-dev mailing list
bro-dev@bro.org
http://mailman.icsi.berkeley.edu/mailman/listinfo/bro-dev

------

Adam J. Slagell
Director, Cybersecurity & Networking Division
Chief Information Security Officer
National Center for Supercomputing Applications
University of Illinois at Urbana-Champaign
www.slagell.info

"Under the Illinois Freedom of Information Act (FOIA), any written 
communication to or from University employees regarding University business is 
a public record and may be subject to public disclosure."







