Thanks a lot for the extensive information!!

Regarding the profiling module: it needs some cleaning up first, but I'll
put it in a GitHub fork when it's done :)
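
Roughly, the wrapper boils down to a small POX component along these lines
(just a sketch of the idea, with my own names; the real version still needs
cleanup):

    # profiler.py -- rough sketch of loading yappi as a POX component
    import yappi
    from pox.core import core

    def _dump_stats (event):
      # stop profiling and print per-function stats when POX shuts down
      yappi.stop()
      yappi.get_func_stats().print_all()

    def launch ():
      yappi.start()
      core.addListenerByName("GoingDownEvent", _dump_stats)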

Thanks for the recoco example!
I think I'll go for the thorough solution and do some work on the
scheduling. Changing the timeout can make it work better, but the packet_in
events also get backlogged (which is something I'd like to avoid), so some
more advanced scheduling will be useful. I'm even thinking of throwing in
some multiprocessing so that heavy calculations can run outside the main
process, but I'll focus on the scheduling first.
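
To be concrete about the multiprocessing idea, this is the kind of thing I
have in mind (untested sketch; heavy_parse and apply_result are placeholders
for my own functions):

    from multiprocessing import Pool
    from pox.core import core

    pool = Pool(processes=2)

    def _handle_FlowStatsReceived (event):
      # the stats must be plain picklable data to cross the process boundary;
      # the callback hands the result back to the cooperative thread
      pool.apply_async(heavy_parse, (event.stats,),
                       callback=lambda result:
                         core.callLater(apply_result, result))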

I'm still a bit confused about the work producing and consuming as it's
implemented now. As I read it, the main task loop in OpenFlow_01_Task
loops over the connections and calls the read function on each of them.
That read function calls the appropriate handler, which in turn fires the
appropriate event on the POX core (where it is then handled further). So
everything would be processed connection by connection...

But if I understand you correctly, the handlers called by the read function
put the jobs in a queue, which is then emptied by a separate task loop
(which I can't find at the moment). Can you give a hint where (in the code)
the task loop that empties the queue runs, and where exactly the queue
gets filled?
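
Just to make sure I understand the pattern you're suggesting, here is roughly
how I picture it (my own sketch based on your gist, not actual POX code;
process_one is a placeholder for my per-flow work):

    from collections import deque
    from pox.lib.recoco import Task, Sleep

    work = deque()   # event handlers (the producers) append parsed stats here

    class StatsConsumer (Task):
      def run (self):
        while True:
          # drain a bounded batch per scheduling period, like the
          # min(10, ...) in your example
          for _ in range(min(10, len(work))):
            process_one(work.popleft())
          yield Sleep(0.01)

    StatsConsumer().start()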




2013/3/22 Murphy McCauley <[email protected]>

> On Mar 22, 2013, at 6:32 AM, Tmusic wrote:
>
> I've been trying some things over the last couple days...
>
> The PyPy problem was indeed due to some external modules. debug-pox.py
> does not provide much helpful information. Can I suggest adding a
> traceback.print_exc() when an import fails (around line 80 in boot.py,
> after: print("Module not found:", base_name))? In my case it really showed
> which import failed.
>
>
> This actually should happen.  I thought it was there with debug-pox.py,
> but can you try running pox.py --verbose and see if it gives a useful stack
> trace?
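
For reference, the change I'm suggesting is just this, inside whatever except
clause in boot.py reports the failed import:

    import traceback
    print("Module not found:", base_name)
    traceback.print_exc()   # shows which nested import actually failed
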
>
> For profiling I tried yappi (https://code.google.com/p/yappi/). Not as
> handy as cProfile with RunSnakeRun, but it works with the threading model
> and provides #calls, total time,... per function. It requires some changes
> in the code, but it's possible to create some wrappers and load it as a POX
> module. Let me know if you're interested in the code :)
>
>
> Sounds interesting.  Do you have it in a github fork or anything?
>
> The second issue is the parsing of flow stats. When I get a
> flow_stat_reply from a switch, I parse the statistics for each flow
> contained in the reply. This works for up to about 300 flows, but beyond
> that point links start disconnecting again. I tried to split the calculation
> into different parts (not processing them in one loop, but firing an event
> for each flow which processes only that flow). So far this has had no
> measurable impact. I'm guessing these events are processed right afterwards,
> which basically brings us back to the "one big for loop" scenario. Can this
> be the case? PyPy offers an improvement, going up to about 550 flows, but
> then the same issues arise again.
>
> Further, I was looking at the recoco and revent libraries. What I'd like
> to do is submit the "processing events" with a lower priority, so that the
> packet_in events are processed first. I guess this could resolve the
> problem? Are there any features in recoco or revent that could help in
> implementing this? When I print the length of the schedule queue (the cycle
> function in recoco), not all fired events seem to be scheduled as separate
> tasks. Where is the processing queue for the events situated?
>
>
> Right, this would be my suggestion.  The OpenFlow event handlers are
> producers that fill a work queue and then you have a consumer in the form
> of a recoco Task that tries to drain the queue.
>
> I think recoco could make this somewhat simpler than it is with just a
> little work, but I've so rarely hit performance problems that I've never
> fleshed it out.  In theory the recoco.events module might be a nice way to
> do this, but I think it's not as general purpose as it should be (and it
> has been a long time since I've used it at all).  I've thrown together a
> quick producer/consumer example using recoco:
>   https://gist.github.com/MurphyMc/939fccd335fb3920f993
>
> On my machine, run under CPython, the consumer sometimes gets backlogged
> but eventually catches up, and on PyPy it pretty much stays caught up all
> the time.  In general, you'll need some application-specific logic for
> if/when the consumer gets really backed up (e.g., just throw away really
> old events, or don't yield and just churn through the backlog while
> stalling the other Tasks, or temporarily raise the consumer Task's
> priority, etc.).  Or if the problem is that your production is just
> really bursty but not actually more than you can handle amortized over
> time, then you can just ignore it, as in the example.
>
> Some of the things you can do to play with tuning the example are:
> 1. Adjust the consumer's priority.
> 2. Adjust the minimum number of items to consume (batch size) in one
> scheduling period (the min(10, ...) in run()).
>
> In your case, I'd expect #1 might not do as much as you'd expect since it
> only matters when another Task -- e.g., the OpenFlow IO Task -- is actually
> waiting to run.  In your case, I'd expect the OpenFlow task to be mostly
> idle, but then you get a flow_stats and suddenly you have a lot of work to
> do.  #2 (or the equivalent) is probably more useful.  You want to set it
> high enough that you're not wasting time rescheduling constantly, but low
> enough that discovery doesn't get starved.
>
> I think a lot of the time I get away with a much simpler approximation,
> where events just update some state (e.g., counters, a list of expired
> flows), and then I have a pending callDelayed which tries to process them
> and then callDelayed()s itself again, after some amount of time if there
> is nothing left to do and after a shorter amount of time if there
> is stuff left to do.
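
If I follow, that pattern would look roughly like this (my sketch, not your
code; handle() stands in for my per-flow processing):

    from pox.core import core

    pending = []   # event handlers just append work items here

    def _process ():
      # process a bounded batch so we don't hog the cooperative thread
      for _ in range(min(50, len(pending))):
        handle(pending.pop(0))
      # re-arm: sooner if there's still a backlog, later if we're idle
      core.callDelayed(0.05 if pending else 0.5, _process)

    core.callDelayed(0.5, _process)
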
>
> Another possibility may just be to adjust discovery's timeouts.  There's
> nothing magic about the defaults.
>
> And finally, I noticed strongly varying performance (100 flows more or
> less in the reply before it starts crashing) with exactly the same traffic
> patterns. Does this have something to do with the random generator in the
> recoco scheduler's cycle() function?
>
>
> Doubtful -- you probably don't have any Tasks now that have a priority
> other than 1, so the randomization shouldn't kick in.  My first guess is
> that this is nondeterminism caused by how Python 2.x is switching between
> the IO thread and the cooperative thread.  If you used the version of
> recoco from the debugger branch (which combines these into a single
> thread), you might find that it evens out.
>
> -- Murphy
>
