Re: benchmarking various event loops with and without anyevent

2008-04-28

On Apr 28, 2008, at 03:21, Marc Lehmann wrote:

In all fairness, I want to point out that, after _multiple_ rounds of
longish e-mail exchanges, Rocco Caputo could not solve the problems that
forced AnyEvent to use this design, nor did he enlighten me on how to work
around the specific problems that I mentioned to him that forced this
design decision(*).

Addressed in <[EMAIL PROTECTED]>.  Please respond there.

He did not come up with any further evidence for a problem, either (just
repeatedly stating that the design is broken. The only argument he brought
up was: one of your design goals is to be reasonably efficient, POE does
not do it reasonably efficient, so your design is broken, which is an
outright absurd logic).

Marc and I disagree about what I wrote in the message included in
<[EMAIL PROTECTED]>.  No amount of he-said/no-he-said will resolve it at
this point, so I refer the reader back to the actual exchange.

Everyone: Your suggestions to improve my communication are greatly  
appreciated.  Please comment off-list, if you can.

In fact, it seems his problem is indeed the AnyEvent API and not the
interface module to POE, i.e. the "broken" means I should not provide
events in the form AnyEvent does, which is of course counterproductive to
the goal of AnyEvent of being compatible to multiple event loops (I can't
provide different APIs to different event loops...).

I explicitly stated otherwise in the message included in
<[EMAIL PROTECTED]>.  It's the sentence beginning with "We should not
need to change AnyEvent::Impl::POE's public interface".

So I conclude that even the POE author is unable to provide a (strongly)
more efficient approach, which, according to his own words, would make him
worse than the average first-time POE user.

Considering our previous discussions on the matter, I feel that your  
conclusion is premature.

Also, you seem to be saying that one solution can simultaneously  
perform equally as well as another and worse than it.  Which quantum  
computers have you ported AnyEvent to lately? :)

Rocco Caputo

Re: benchmarking various event loops with and without anyevent

2008-04-28

On Apr 28, 2008, at 06:24, Marc Lehmann wrote:

[most important points first]

In your case, I would create a single persistent POE::Session instance
that serviced all the watchers.

I would, too, but I cannot find a way to do that with POE: sessions
without active watchers will be destroyed, forcing the session to have
active resources will make the program never-ending.

I already told you that I tried this approach, and why I couldn't get it

I might use something like an explicit reference count to keep the  
session alive.  I would create a proxy object that, when DESTROYed,  
would post a shutdown message to the session.  Or if AnyEvent knows  
when a program is to exit, I would have it explicitly shutdown the  
session as part of its finalization.

The shutdown handler would remove the explicit reference count,  
allowing the session to stop.

It's similar to the technique you use in AnyEvent::Impl::POE:

   POE::Kernel->post (${${$_[0]}}, "stop");

Except it would be done once at the very end, rather than for each  
event watcher.

Perhaps this isn't necessary.  You're not using run(), so technically  
you're free to go at any time.  If your program must exit while  
watchers are active, then you could force the issue by sending a  
global UIDESTROY signal (designed to tell POE when it must  
unconditionally stop), and calling run():

$poe_kernel->signal($poe_kernel, "UIDESTROY");

On the other hand, your AnyEvent::Impl::POE proxy objects could also  
hold references to the singleton session, and if they release those  
references when they clean up their POE::Kernel resources, the session  
should be "empty" by the time they all destruct.  In that case, the  
UIDESTROY signal should not be needed.  run() will return after  
removing the "empty" session.

In short, your AnyEvent::Impl::POE objects would be of the form:

sub new {
  # (AnyEvent::Impl::POE setup goes here)
  # set up the POE::Kernel watcher
  # make a note of the watcher in this object
  $self->{something} = $record_of_the_poe_watcher;
  return $self;
}

sub DESTROY {
  my $self = shift;
  # ... release the POE::Kernel watcher
  $poe_kernel->something($self->{something}, something);
}

If you expect the user to be creating their own POE::Session  
instances, then you'd need to call() AnyEvent::Impl::POE to make sure  
the watchers are created in the right context.

sub new {
  # (AnyEvent::Impl::POE setup goes here)
  # set up the POE::Kernel watcher
  $poe_kernel->call("anyevent_impl_poe", "something", something);
  # make a note of the watcher in this object
  $self->{something} = $record_of_the_poe_watcher;
  return $self;
}

And DESTROY would tell the session when to clear the watcher.

You may need to add a new AnyEvent::Impl method to explicitly stop
POE, especially if your public API allows users to exit with active
watchers.

sub shutdown {
  $poe_kernel->signal($poe_kernel, "UIDESTROY");
}

As an added bonus, shutting down this way satisfies the run() warning.

I know this isn't a full solution, but I hope you still find it helpful.
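
The proxy-object idea described above can be sketched without POE at all. The following is only a toy model under stated assumptions: ToySession and ToyWatcher are hypothetical names, the "session" is a plain Perl object rather than a POE::Session, and no POE::Kernel calls are made. It shows just the lifetime pattern: one shared instance, small per-watcher proxies, and DESTROY releasing each watcher so the shared instance ends up "empty".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Toy stand-in for the single shared session.  This is NOT POE::Session;
# it only records which watchers currently depend on it.
package ToySession;

my $singleton;    # the one shared instance

sub instance {
    my ($class) = @_;
    $singleton ||= bless { watchers => {}, next_id => 1 }, $class;
    return $singleton;
}

sub add_watcher {
    my ($self, $callback) = @_;
    my $id = $self->{next_id}++;
    $self->{watchers}{$id} = $callback;
    return $id;
}

sub remove_watcher {
    my ($self, $id) = @_;
    delete $self->{watchers}{$id};
    return;
}

# The session counts as "empty" once no watcher is registered.
sub is_empty { return !%{ $_[0]{watchers} } }

# Small proxy object handed to the caller, one per watcher.
package ToyWatcher;

sub new {
    my ($class, $callback) = @_;
    my $session = ToySession->instance;
    return bless {
        session => $session,
        id      => $session->add_watcher($callback),
    }, $class;
}

sub DESTROY {
    my ($self) = @_;
    # Release the watcher that this proxy registered in the shared session.
    $self->{session}->remove_watcher($self->{id});
    return;
}

package main;

my $session = ToySession->instance;
{
    my $io    = ToyWatcher->new(sub { });
    my $timer = ToyWatcher->new(sub { });
    print "watchers while in scope: ",
        scalar(keys %{ $session->{watchers} }), "\n";
}
print "session empty after scope exit: ",
    ($session->is_empty ? "yes" : "no"), "\n";
```

Run as written, this prints a watcher count of 2 inside the block and "yes" afterwards. Whether a real POE::Session behaves acceptably once it is "empty" is exactly the point under discussion in this thread, and the sketch does not settle it.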

You keep repeating how it could be designed better, but you never actually
say how to solve the fundamental problems and bugs within POE that keep it
from being implementable.

I would welcome discussing the code rather than each other.  If we
can agree on this, perhaps we can get down to more important matters.
See above.

As I said, if possible, I can only imagine the design becoming vastly more
complex because I would have to create sessions on demand and be able to
react to my session being turned down at inopportune times.

I don't understand why this design is necessary.  Please help me  
understand your design constraints, so that I may focus on designs  
that will work.

What we seem to agree on by now is that such a design is not trivial to do
with POE.

Also, remember that the benchmarks show that session creation is not
the big problem, running the sessions is - of course, there could be
inefficiencies in POE handling large number of sessions, but that means
just that - POE doesn't scale well.

While my suggestions are not as trivial as your current design, I  
don't think the end design will be as complex as you expect.

Thank you for your feedback.  I'm sorry that POE doesn't meet your  
needs.  When I have the chance, I'll profile POE while running your  
benchmark and see what I can do.

As the documentation mentions, AnyEvent doesn't enforce itself on a
module, unlike POE - a module using POE is not going to work with other
event loops, because it monopolises the process.

This means that a module using POE forces all its users to also use POE.

This is factually incorrect.  For example, POE::Loop::Glib allows POE  
to be embedded into applications like vim and irssi.  The  
application's functionality is not impaired, n

Re: benchmarking various event loops with and without anyevent

2008-04-28
[most important points first]

> In your case, I would create a single persistent POE::Session instance  
> that serviced all the watchers.

I would, too, but I cannot find a way to do that with POE: sessions
without active watchers will be destroyed, forcing the session to have
active resources will make the program never-ending.

I already told you that I tried this approach, and why I couldn't get it

You keep repeating how it could be designed better, but you never actually
say how to solve the fundamental problems and bugs within POE that keep it
from being implementable.

As I said, if possible, I can only imagine the design becoming vastly more
complex because I would have to create sessions on demand and be able to
react to my session being turned down at inopportune times.

What we seem to agree on by now is that such a design is not trivial to do
with POE.

Also, remember that the benchmarks show that session creation is not
the big problem, running the sessions is - of course, there could be
inefficiencies in POE handling large number of sessions, but that means
just that - POE doesn't scale well.

On Mon, Apr 28, 2008 at 04:36:42AM -0400, Rocco Caputo wrote: 
> Most people on the planet don't know Perl, or even how to program a  
> computer.  No amount of documentation will help them.  :)

Good (or any) documentation is widely known to be almost a _requirement_
for learning how to program or use a software package.

> In the future, you may wish to include me on your list of people to  
> consult about designing applications for POE.  I may know a thing or  
> two about the topic. :)

I am not interested in designing applications for POE - I am interested in
making it possible to write event-based modules that are interoperable
between various event loops such as POE.

As the documentation mentions, AnyEvent doesn't enforce itself on a
module, unlike POE - a module using POE is not going to work with other
event loops, because it monopolises the process.

This means that a module using POE forces all its users to also use POE.

AnyEvent does not do that, as long as it supports the event model actually
in use (it is not an event loop itself!): A module that uses AnyEvent
works with both Qt, POE, IO::Async (once its backend is implemented), EV
and so on.

This is a fundamental difference between POE and AnyEvent. It has nothing
to do with event loop backend modules, of which POE also employs a few, but
comes from the fact that you have to call POE::Kernel->run and give up
your process to it (just like with EV::loop etc.)

> Are you aware that I'm gradually rewriting POE's documentation?  If  
> you could describe what you don't like in a useful way, I may be able  
> to do something about it.

Since I described it already (and you know that) it means you find the way
I did it "not useful". That's a strawman argument. If you don't like my
criticism or don't understand it, ask.

> I would love to have the opportunity to suggest a different design,  

You always had the opportunity, you are not using it.

> Obviously I cannot expect you to know everything about POE.  Likewise,  
> you cannot expect me to magically know when you started writing  
> AnyEvent::Impl::POE.

It doesn't matter when I started writing AnyEvent::Impl::POE at all. What
matters is that you made unfounded (and as you now admit, wrong)
statements about it (and its author).

It is fine with me if you don't understand AnyEvent, it is somewhat fine
with me if you make strong (but wrong) statements about it, but don't
expect anybody to put much faith in them, or your ability to make useful

> Even if you announced it somewhere, I may not have been looking.  The
> first I heard of it was here, when you announced your benchmarks.

Yes, so?

> In general, if you need someone's attention online, the most effective  
> and polite way is to contact them directly.

I did not need your attention?

> If you did contact me about this, then I'm sorry that I missed it.  Are you sure your  
> message wasn't lost in transit?

Which message? I was only conveying some benchmark results, to give
people an idea of the overheads of AnyEvent of various event loop
implementations in the hope of being useful.

I was also hoping that people might give AnyEvent some try, as its design
doesn't force a module author using it into a specific event loop.

> Assuming that N is the same between the equivalent POE and  
> AnyEvent::Impl::POE program:
> S(N*M) > S(N) for M > 1.
> QED  :P

I couldn't really follow you here, and I am not sure what you have proven. To
me it certainly looks as if it was "POE cannot support the AnyEvent API
efficiently" (at least not in a simple and straightforward way).

I knew that already.

> >I can only imagine making some very complex on-demand instantiating and
> >re-check whether the session still exists on each watcher creation.
> Your imagination comes up with such incredible things.  Don't lose  
> that. :)

Re: benchmarking various event loops with and without anyevent

2008-04-28

On Apr 28, 2008, at 03:21, Marc Lehmann wrote:
On Sat, Apr 26, 2008 at 08:13:27AM -0400, Rocco Caputo wrote: 

each event watcher.  Anyone who knows POE can tell you this is one of
the least efficient designs possible.  In fact, this design is worse
than the average for first-time POE users.

[He then called my model fundamentally broken in private mail and the
documentation rude and unprofessional, without bringing up any evidence]

Dear, and anyone reading this in the archives.

I apologize for my part in this unfolding thread.  As Marc mentions, I  
have been trying to take it to private e-mail.  However I feel that  
Marc has misrepresented to the list what I said to him in private.  At  
this point, it's easier for me to repost what I actually said rather  
than paraphrase it.

Again, I'm sorry you had to be involved.

Rocco Caputo

Begin forwarded message:

From: Rocco Caputo
Date: April 27, 2008 23:29:18 EDT
To: Marc Lehmann
Subject: Re: benchmarking various event loops with and without anyevent  

On Apr 27, 2008, at 01:53, Marc Lehmann wrote:

On Sun, Apr 27, 2008 at 01:15:49AM -0400, Rocco Caputo wrote: 
I have read your code and documentation for AnyEvent::Impl::POE.  Your
module's design is fundamentally broken, and your code is probably
more to blame than POE.

Oh, btw, be careful with such strong idioms such as "fundamentally
broken" - so far, there is no evidence that it is broken at all, only

(If you think it really is fundamentally _broken_ then you better back up
your statements).

I hope to show that I intended no offense.

Reasonable scalability (CPU and memory) seems to be one of  
AnyEvent's design goals.  I base this impression on the fact that  
you're benchmarking your code in terms of speed and size.   
"Reasonable" is subject to interpretation, but I think we agree that  
AnyEvent::Impl::POE is neither as fast nor as small as it should be.

Therefore, while AnyEvent::Impl::POE operates correctly, it does not  
fulfill some of AnyEvent's design goals.

AnyEvent::Impl::POE's greatest inefficiencies stem from one  
fundamental design choice: the 1:1 relationship between watcher  
instances and POE::Session instances.  In your own words: "AnyEvent  
has to create one POE::Session per event watcher, which is immensely  
slow and makes watchers very large."

One point of contention may be whether this is a design or  
implementation flaw.  The problem is inherent in the way one class  
(AnyEvent::Impl::POE) interacts with another (POE::Session).  Class  
interaction is a software design issue.  It can be modeled in  
software design languages such as UML.  Re-implementing the same  
entity relationship more efficiently, or in a faster language such  
as C, would not resolve the scalability problem.

Therefore, AnyEvent::Impl::POE is flawed in design rather than
implementation.

Therefore, AnyEvent::Impl::POE's design prevents it from meeting  
some of AnyEvent's design goals.

Therefore, AnyEvent::Impl::POE's design is broken.

Unfortunately most of AnyEvent::Impl::POE's design stems from its  
flawed interaction with POE::Session.  We should not need to change  
AnyEvent::Impl::POE's public interface, but we will need to rethink  
and revise nearly all aspects of its interaction with POE.

Therefore, AnyEvent::Impl::POE's design flaw is a fundamental one.

Therefore, AnyEvent::Impl::POE's design is fundamentally broken.

Again, let me repeat that empty insults that obviously are founded by
paranoia will not have any positive effect on your standing with me (I
mean, I won't hate you or anything, but you make yourself an idiot in my
eyes very quickly by repeatedly not bringing up any evidence...).

Of course, I understand that if you mistook my comments about POE as rude,
there is a natural tendency to "insult back".

I hope I have shown reasonable evidence that my assertion is neither  
empty nor intrinsically insulting.  Without the intent to insult,  
there can be no intent to "insult back".  In the end, any offense  
you have taken may be of your own manufacture.  Could there be  
cultural differences to overcome?

In this light, your assertion of my paranoia is unfounded, unjust  
and offensive.  Your view that I'm acting like an idiot is no  
better.  You are of course entitled to your opinions, but those two  
are not appropriate for polite conversation.


I'm sorry that I haven't responded promptly to your e-mail.  I get  
the impression that you expect the worst from me, so I feel the need  
to choose my words in a slow and painful (and often futile) effort  
to minimize being misunderstood.  As a result, I cannot write to you  
as oft

Re: benchmarking various event loops with and without anyevent

2008-04-28

On Apr 27, 2008, at 00:56, Marc Lehmann wrote:
On Sat, Apr 26, 2008 at 08:13:27AM -0400, Rocco Caputo wrote: 

each event watcher.  Anyone who knows POE can tell you this is one of
the least efficient designs possible.

It is the only design that I could get working, even after consulting a
few people and implementing some workarounds for the bugs in POE.

In any case, you have to consider that most people on this planet
don't know POE, and even if they do, they don't know it that well. Since the
documentation for POE is in such a bad state, that's the obvious way to fix


Most people on the planet don't know Perl, or even how to program a  
computer.  No amount of documentation will help them.  :)

In the future, you may wish to include me on your list of people to  
consult about designing applications for POE.  I may know a thing or  
two about the topic. :)

Are you aware that I'm gradually rewriting POE's documentation?  If  
you could describe what you don't like in a useful way, I may be able  
to do something about it.

In fact, this design is worse than the average for first-time POE users.

If a better design is possible, it is not known to me, and you haven't
suggested one either, so talk is cheap. I'd be happy to get a more
efficient design for POE but nobody could come up with one that also
worked reliably through multiple iterations of run and also does not keep
the POE kernel from returning.

I would love to have the opportunity to suggest a different design,  
but most of my discretionary time is spent addressing the constant  
misunderstandings between us.  If we can first resolve them, I'll have  
that much more time to work on the design.

Obviously I cannot expect you to know everything about POE.  Likewise,  
you cannot expect me to magically know when you started writing  
AnyEvent::Impl::POE.  Even if you announced it somewhere, I may not  
have been looking.  The first I heard of it was here, when you  
announced your benchmarks.

In general, if you need someone's attention online, the most effective  
and polite way is to contact them directly.  If you did contact me  
about this, then I'm sorry that I missed it.  Are you sure your  
message wasn't lost in transit?

As for the design:

First-time POE users tend to design programs where the number of  
sessions scales linearly with the number of objects that handle  
events.  If S(1) is the total overhead imposed by a single session,  
then S(N) is the overhead imposed by the average naïve POE user.  N is  
the number of objects handling events.

AnyEvent::Impl::POE creates a new POE::Session for every event  
watcher.  It's not uncommon for an object to use more than one event  
watcher (I/O and timeout, for example).  So we can model the session  
overhead in an AnyEvent::Impl::POE program as S(N*M), where N is the  
number of objects handling events, and M is the average number of  
event watchers per object.

Assuming that N is the same between the equivalent POE and  
AnyEvent::Impl::POE program:

S(N*M) > S(N) for M > 1.

QED  :P
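
To make the comparison concrete, here is a small sketch that counts sessions under the designs discussed in this thread. It assumes, as the argument above does implicitly, that total overhead grows with the number of sessions; the per-session cost S(1) is left symbolic, and the function names and sample values of N and M are made up for illustration.

```perl
use strict;
use warnings;

# Number of sessions for N event-handling objects that each use
# M watchers, under three designs.  Only counts are modelled here,
# not the actual cost of a POE::Session.
sub sessions_per_watcher { my ($n, $m) = @_; return $n * $m }  # one per watcher
sub sessions_per_object  { my ($n, $m) = @_; return $n }       # one per object
sub sessions_shared      { return 1 }                          # one shared session

for my $case ([100, 1], [100, 2], [1000, 3]) {
    my ($n, $m) = @$case;
    printf "N=%4d M=%d  per-watcher=%5d  per-object=%5d  shared=%d\n",
        $n, $m,
        sessions_per_watcher($n, $m),
        sessions_per_object($n, $m),
        sessions_shared();
}
```

For M = 1 the per-watcher and per-object counts coincide; the gap opens only when objects use more than one watcher each.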


If a better design *is* possible (which I don't really doubt), then it
needs to be vastly more complex, or it needs some non-obvious trick. I can
only imagine making some very complex on-demand instantiating and re-check
whether the session still exists on each watcher creation.

Your imagination comes up with such incredible things.  Don't lose  
that. :)

The "trick" is to minimize the number of sessions used.  Your  
benchmarks and comments in your documentation implied that you knew  
that sessions imposed overhead.  I wrongly assumed the solution would  
be obvious.

In your case, I would create a single persistent POE::Session instance  
that serviced all the watchers.  The watchers themselves would be  
small proxies that controlled POE::Kernel watchers within that  
session's context.

I use this design in POE::Stage.  If you're not averse to looking at  
experimental code, you can find it on the CPAN.  It also does a lot of  
other, unrelated things, so you may have difficulty separating the  
magic you need from the voodoo you don't.  Comments are welcome, if  
they're useful.

(It is possible that I was fooled by the docs as well, so if there is a
better way, it likely isn't documented, so who could blame me).

I won't blame you.  But I will point out that your documentation says  
you're already familiar with using undocumented POE features. :)

Instead of going around and accusing people of making bad designs, it would
be much better to improve the documentation for POE, so that said people
don't have to go around and start guessing...

I don't appreciate your vague, negative comments about the  
documentation.  Please describe problems in a useful way, or your  
expectation that they be fixed is unreasonable.

You seem to be saying that your design for AnyEvent::Impl::POE is  
based on guesswork.  If so, I overestimated your

Re: benchmarking various event loops with and without anyevent

2008-04-28
On Sat, Apr 26, 2008 at 08:13:27AM -0400, Rocco Caputo wrote: 
> each event watcher.  Anyone who knows POE can tell you this is one of  
> the least efficient designs possible.  In fact, this design is worse  
> than the average for first-time POE users.

[He then called my model fundamentally broken in private mail and the
documentation rude and unprofessional, without bringing up any evidence]

In all fairness, I want to point out that, after _multiple_ rounds of
longish e-mail exchanges, Rocco Caputo could not solve the problems that
forced AnyEvent to use this design, nor did he enlighten me on how to work
around the specific problems that I mentioned to him that forced this
design decision(*).

He did not come up with any further evidence for a problem, either (just
repeatedly stating that the design is broken. The only argument he brought
up was: one of your design goals is to be reasonably efficient, POE does
not do it reasonably efficient, so your design is broken, which is an
outright absurd logic).

In fact, it seems his problem is indeed the AnyEvent API and not the
interface module to POE, i.e. the "broken" means I should not provide
events in the form AnyEvent does, which is of course counterproductive to
the goal of AnyEvent of being compatible to multiple event loops (I can't
provide different APIs to different event loops...).

So I conclude that even the POE author is unable to provide a (strongly)
more efficient approach, which, according to his own words, would make him
worse than the average first-time POE user.

This is absurd, so I conclude that the original claim has been disproven.

(And yes, I did ask multiple times to come up with how to better design
the interface to POE, or how to solve the lifetime-issues with POE).

(*) the specific problems are (taken directly from my mail to Rocco):

- the session must not go away, or there must be an easy way to recreate
  it when the kernel kills it.
- the session itself must not keep the kernel "alive"/running (preferably
  without going away).

Re: benchmarking various event loops with and without anyevent

2008-04-26
On Sat, Apr 26, 2008 at 08:13:27AM -0400, Rocco Caputo wrote: 
> each event watcher.  Anyone who knows POE can tell you this is one of  
> the least efficient designs possible.

It is the only design that I could get working, even after consulting a
few people and implementing some workarounds for the bugs in POE.

In any case, you have to consider that most people on this planet
don't know POE, and even if they do, they don't know it that well. Since the
documentation for POE is in such a bad state, that's the obvious way to fix

> In fact, this design is worse than the average for first-time POE users.

If a better design is possible, it is not known to me, and you haven't
suggested one either, so talk is cheap. I'd be happy to get a more
efficient design for POE but nobody could come up with one that also
worked reliably through multiple iterations of run and also does not keep
the POE kernel from returning.

If a better design *is* possible (which I don't really doubt), then it
needs to be vastly more complex, or it needs some non-obvious trick. I can
only imagine making some very complex on-demand instantiating and re-check
whether the session still exists on each watcher creation.

(It is possible that I was fooled by the docs as well, so if there is a
better way, it likely isn't documented, so who could blame me).

Instead of going around and accusing people of making bad designs, it would
be much better to improve the documentation for POE, so that said people
don't have to go around and start guessing...

Oh, and while you fix the docs, could you fix the other bugs as well (the
race condition with sigchld is really annoying, and the nag messages are
as well).

> I have specific issues with your docs.  As a courtesy to you and the  
> list, I'll send them to you directly.

Not sure why it would be a courtesy to me, do you have anything to hide
or do you want to insult me even more in private so nobody sees your real
self or something? (just guessing... :)

I do fix my bugs and am open to suggestions and improvements. But
correctness comes first, performance second.


Re: benchmarking various event loops with and without anyevent

2008-04-26

On Apr 25, 2008, at 22:46, Marc Lehmann wrote:

The next release of AnyEvent contains support for a few more event loops,
notably POE, so AnyEvent is now by definition compatible to POE (before it
was only compatible when using an event loop used by POE, such as Event or
EV that could be shared).

The results were mostly as expected, with EV leading and POE being

It's no wonder that AnyEvent::Impl::POE runs so slowly.  According to  
your documentation, you designed it to use a separate POE::Session for  
each event watcher.  Anyone who knows POE can tell you this is one of  
the least efficient designs possible.  In fact, this design is worse  
than the average for first-time POE users.

On the bright side, you could still have done worse.  You could have  
instantiated and destroyed a POE::Session for each event.  Perhaps  
you're saving that for a future release. :)

I have specific issues with your docs.  As a courtesy to you and the  
list, I'll send them to you directly.

Rocco Caputo

Re: benchmarking various event loops with and without anyevent

2008-04-26
> "ML" == Marc Lehmann writes:

  ML> The surprising one was the pure perl implementation, which was quite on
  ML> par with C-based event loops such as Event or Glib. I did expect the pure
  ML> perl implementation to be at least a factor of three slower than Event or
  ML> Glib.

  ML> As the pure perl loop wasn't written with high performance in mind, this
  ML> prompted me to optimise it for some important cases (mostly to get rid of
  ML> the O(n²) degenerate cases and improving the select bitmask parsing for
  ML> the sparse case).

check out stem's pure perl event loop. there are examples in the
/sessions dir on how to use that directly without the rest of the
modules. it does things in a different direction and doesn't scan
select's bit masks but instead it scans the interesting handles and sees
whether their bits are set. it should exhibit good behavior under growth
as all the data are managed in hashes.

  ML> I then made a second benchmark, designed not to measure anyevent overhead,
  ML> but to measure real-world performance of a socket server.

and that /sessions code also shows use of the asyncio module. if you can
benchmark that i would be interested in the results. 

  ML> The result is that the pure perl event loop used as fallback in AnyEvent
  ML> single-handedly beats Glib by a large margin, and even Event by a factor
  ML> of two.

  ML> For small servers, the overhead introduced by running a lot of perl
  ML> opcodes per iteration dominates, however, reflected in the last benchmark.

in a heavily loaded server most of the work is in the i/o and should
overwhelm the event loop itself. that is the whole purpose of event
loops as we all know here. 

  ML> However, the net result is that the pure perl event loop performs
  ML> better than almost all other event loops (EV being the only exception)
  ML> in serious/medium-sized cases, while I originally expected it to fail
  ML> completely w.r.t. performance and being only usable as a workaround when
  ML> no "better" event module is installed.

i don't find that surprising. perl's i/o is decent and as i said above,
a loaded server is doing mostly i/o. 

  ML> All the benchmark data and explanations can be found here:


  ML> The code is not yet released and likely still buggy (the question is
  ML> whether any bugs affect the benchmark results). It is only available via
  ML> CVS:

i will take a gander and see if i can play with it and add stem's loop
to it. if you want to work on this with me, i wouldn't mind the help.



Re: benchmarking various event loops with and without anyevent

2008-04-26
> "ML" == Marc Lehmann writes:

  On Fri, Apr 25, 2008 at 11:40:03PM -0400, Uri Guttman wrote: 
  >> check out stem's pure perl event loop. there are examples in the

  ML> Maybe I'll provide a backend for stem.

actually it makes more sense to me to wrap anyevent in stem. it already
has several event wrappers (pure perl, and tk) and wrapping is
very easy to do. not much different than the code i see in anyevent.

  >> modules. it does things in a different direction and doesn't scan
  >> select's bit masks but instead it scans the interesting handles and sees
  >> whether their bits are set. it should exhibit good behavior under growth
  >> as all the data are managed in hashes.

  ML> That is exactly as the anyevent perl backend did in the previous
  ML> version. I don't see how that should exhibit good behaviour - file
  ML> descriptors are small integers, so hashes only add overhead over arrays,
  ML> and checking every file descriptor that way is much slower than scanning
  ML> the bitmask.

  ML> Especially in the important case of many handles/few active ones, an 
  ML> of "scan all handles" is very slow.

my code doesn't scan all handles, but scans all writeable events to see if
their handle bits are set. descriptors increment and can cause bloat of
the array if you have many in use (and not all of them in the event loop).
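
The two strategies being compared here can be sketched in a few lines of core Perl. This is only an illustration, not code from Stem or AnyEvent: the function names and descriptor numbers are invented, and the "ready" mask is built by hand instead of coming from a real select() call. One function walks every bit of the result mask; the other walks only the registered descriptors, kept in a hash, and tests each one's bit.

```perl
use strict;
use warnings;

# Build a select()-style bitmask with the given descriptor numbers set.
sub make_mask {
    my @fds  = @_;
    my $mask = '';
    vec($mask, $_, 1) = 1 for @fds;
    return $mask;
}

# Approach A: walk every bit of the result mask.
# Work grows with the highest descriptor number in the mask.
sub ready_by_mask_scan {
    my ($mask) = @_;
    return grep { vec($mask, $_, 1) } 0 .. 8 * length($mask) - 1;
}

# Approach B: walk only the registered descriptors and test each bit.
# Work grows with the number of registered descriptors.
sub ready_by_handle_scan {
    my ($mask, $watched) = @_;
    return sort { $a <=> $b } grep { vec($mask, $_, 1) } keys %$watched;
}

my %watched = map { $_ => 1 } (3, 7, 64, 900);
my $ready   = make_mask(7, 900);    # pretend select() reported these

print "mask scan:   @{[ ready_by_mask_scan($ready) ]}\n";
print "handle scan: @{[ ready_by_handle_scan($ready, \%watched) ]}\n";
```

Both calls report descriptors 7 and 900. Which approach is faster depends on how many descriptors are registered relative to the highest descriptor number, and on optimisations such as skipping zero bytes in the mask, none of which this sketch measures.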

  ML> (The benchmarks reflect this, you could try with an older anyevent
  ML> release where we just check the keys of the hash storing fd
  ML> information: performance is much lower).

  ML> I then made a second benchmark, designed not to measure anyevent overhead,
  ML> but to measure real-world performance of a socket server.
  >> and that /sessions code also shows use of the asyncio module. if you can
  >> benchmark that i would be interested in the results. 

  ML> Hmm, it seems to have become fashionable lately to call synchronous I/O
  ML> asynchronous (see IO::Async, another 100% synchronous framework). What
  ML> exactly is asynchronous about the I/O in that example, it seems to be
  ML> fully synchronous to me (just event-driven).

event loops are async in that you get callbacks as they are needed. sure
you can't get true async behavior from any cpu/os combo but calling it
async is no different than calling it parallel processing when it is
just context switching among processes on a single cpu. the illusion and
api are what matters. a better term may be non-blocking i/o (and socket
connections) but that needs more explanation in some ways. my view is
that threads are the real sync app style and event loops are the async
style. but this is not the time nor place to discuss that.

  >> i don't find that surprising. perl's i/o is decent and as i said above,
  >> a loaded server is doing mostly i/o. 

  ML> Well, since the server in this benchmark hardly does anything, and as you
  ML> can see by the results, the event loop completely dominates, and bad event
  ML> loops are a thousand times slower than good ones.

i haven't looked at the benchmark stuff yet. i was browsing the code

  ML> When put into requests/s, this means that with POE, you are limited
  ML> to <1000 requests/s, while with EV you can easily reach hundreds of
  ML> thousands.

  ML> Also, this does not explain at all why Event is so slow, and why Glib scales
  ML> so extremely badly. Most of the stuff that slows down the perl-based event
  ML> loop is clearly stuff that is much much faster in C.

poor memory management in the c code? i have done pure c event loops in
a framework where memory management was fairly fast due to cached queues
of known block sizes. alloc/free were usually a trivial call and the
event code had its own private fixed size buffer queues. it had no
problem doing 2k parallel web fetches including all the url parsing all
on a single sparc. we had to throttle it down to keep up with the slower

  ML> For a reasonable event loop implemented in C, the overhead of
  ML> calling select should completely dominate the rest of the
  ML> processing (it does so in EV).

true, but bad c (and perl) code is all around us.

  >> i will take a gander and see if i can play with it and add stem's loop
  >> to it. if you want to work on this with me, i wouldn't mind the help.

  ML> Well, can't say there is much demand for it, but if you can give me a pointer
  ML> on these things in the docs I can probably come up with one before the next
  ML> release:

  ML> - how can I do one iteration of the event loop? This is important

i don't have an api for oneevent. i still haven't seen a need for
it. hence having stem wrap anyevent may be the better way. the goal for
me is to have stem support more (and faster) event loops. if you are
doing a stem app, you should use the stem event api.

  ML>   for condvars: the function must not block after one or more events
  ML>   have been handled.


  ML> - how do I regi

Re: benchmarking various event loops with and without anyevent

2008-04-26
On Sat, Apr 26, 2008 at 01:14:07AM -0400, Uri Guttman wrote:
> actually it makes more sense to me to wrap anyevent in stem. it already

Yes, after it turned out that stopping the loop also seems to clear events
it became clear that Stem cannot provide even the stripped down interface
of anyevent (even when one works around the bugs that make it unusable).

Too bad, I'd have liked to benchmark it, but that's almost out of the question
(maybe with a quick hack...).

Re: benchmarking various event loops with and without anyevent

2008-04-25
Hmm, more bugs:

   sub stopbusywaiting {

   my $stopper = new Stem::Event::Timer
      object   => $dummy,
      method   => "stopbusywaiting",
      delay    => 0.05,
      interval => 0.05; # bug workaround

   warn "a\n";
   warn "b\n";

This always takes a second, here is an strace:

adding "use Time::HiRes 'time'" makes it run fast (yes,
directly below the code that should handle this; something is broken
with the Time::HiRes loading code apparently, maybe it's this weird
core-overwriting issue).
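
One way such a one-second delay could arise (a guess consistent with the Time::HiRes workaround, not a confirmed diagnosis of Stem): if deadlines are computed with Perl's built-in time(), which has whole-second resolution, a 0.05 s timer cannot expire before the next second boundary. The sketch below simulates this with integer millisecond timestamps. The names first_fire, $coarse_clock and $fine_clock are hypothetical and not part of Stem.

```perl
use strict;
use warnings;

# Return the first wake-up time (in ms) at which a timer with the given
# delay counts as due, as seen through the clock function $clock.
sub first_fire {
    my ($clock, $delay_ms, @wakeups) = @_;
    my $due = $clock->($wakeups[0]) + $delay_ms;
    for my $t (@wakeups) {
        return $t if $clock->($t) >= $due;
    }
    return;   # never fired within the simulated window
}

# Simulated loop wake-ups every 10 ms, starting on a second boundary.
my $start   = 100_000;
my @wakeups = map { $start + 10 * $_ } 0 .. 200;

my $coarse_clock = sub { 1000 * int($_[0] / 1000) };  # behaves like core time()
my $fine_clock   = sub { $_[0] };                     # behaves like Time::HiRes::time()

my $coarse = first_fire($coarse_clock, 50, @wakeups) - $start;
my $fine   = first_fire($fine_clock,   50, @wakeups) - $start;
print "whole-second clock: fired after $coarse ms\n";
print "sub-second clock:   fired after $fine ms\n";
```

In this simulation the timer starts exactly on a second boundary, so the whole-second clock delays it by a full second; in general the delay would be anything up to one second.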

Re: benchmarking various event loops with and without anyevent

2008-04-25
On Sat, Apr 26, 2008 at 01:14:07AM -0400, Uri Guttman wrote:
> > "ML" == Marc Lehmann writes:
> On Fri, Apr 25, 2008 at 11:40:03PM -0400, Uri Guttman wrote:
>   >> check out stem's pure perl event loop. there are examples in the
>   ML> Maybe I'll provide a backend for stem.
> actually it makes more sense to me to wrap anyevent in stem. it already
> has several event wrappers (pure perl, and tk) and wrapping is
> very easy to do. not much different than the code i see in anyevent.

FYI: I think I found a bug, one-shot timer events never get reported:
- calls timer_triggered,
- which cancels,
- which sets active to 0,
- then calls trigger
- which returns because active is 0

and callback never gets invoked.

Maybe my reading of the source code is wrong, but I can't for the life of me
get any timer event out of stem.
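
The call sequence described above can be modelled in a few lines. This is a minimal sketch built only from the steps in this report, not Stem's actual source: the package Sketch::Timer and the two timer_triggered_* variants are hypothetical. It shows why cancelling before triggering swallows a one-shot callback, and that swapping the two steps would deliver it.

```perl
use strict;
use warnings;

package Sketch::Timer;   # hypothetical stand-in, not Stem::Event::Timer

sub new {
    my ($class, %args) = @_;
    return bless { active => 1, repeat => 0, %args }, $class;
}

sub cancel { $_[0]{active} = 0 }

# trigger refuses to run the callback once the timer is inactive
sub trigger {
    my ($self) = @_;
    return unless $self->{active};
    $self->{callback}->();
}

# Order as described in the report: cancel the one-shot timer, then trigger.
sub timer_triggered_buggy {
    my ($self) = @_;
    $self->cancel unless $self->{repeat};
    $self->trigger;
}

# One possible reordering: trigger first, then cancel the one-shot timer.
sub timer_triggered_reordered {
    my ($self) = @_;
    $self->trigger;
    $self->cancel unless $self->{repeat};
}

package main;

my ($buggy_calls, $reordered_calls) = (0, 0);
Sketch::Timer->new(callback => sub { $buggy_calls++ })->timer_triggered_buggy;
Sketch::Timer->new(callback => sub { $reordered_calls++ })->timer_triggered_reordered;
print "cancel-then-trigger: $buggy_calls callback(s)\n";
print "trigger-then-cancel: $reordered_calls callback(s)\n";
```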

Re: benchmarking various event loops with and without anyevent

2008-04-25
On Sat, Apr 26, 2008 at 01:14:07AM -0400, Uri Guttman wrote:
>   ML> Maybe I'll provide a backend for stem.
> actually it makes more sense to me to wrap anyevent in stem. it already

of course, using anyevent always makes sense.

however, using anyevent doesn't solve the interoperability problem: you
still cannot use modules using anyevent when you use not the anyevent
interface but, for example, the wx interface.

> has several event wrappers (pure perl, and tk) and wrapping is
> very easy to do. not much different than the code i see in anyevent.

Yes, but that doesn't give you much advantage - as long as your module
isn't a good citizen that plays nicely with other modules, but instead
monopolises the process, it is not interoperable.

Making anyevent compatible with stem enables anyevent modules to be used in
stem. making stem use anyevent doesn't achieve that as long as it doesn't
use its anyevent backend, and it cannot be used in other programs
either, as it isn't event-loop agnostic because it forces the user to use
"stem" as the event loop.

Also, this would make it impossible to benchmark the pure perl event loop
- I would *predict* (but i am bad at that) that it will be the slowest
one, ignoring POE, which I expect to be much slower still.

>   ML> Especially in the important case of many handles/few active ones, an approach
>   ML> of "scan all handles" is very slow.
> my code doesn't scan all handle, but scan all writeable events to see if
> their handle bits are set.

That's even worse, as there can be many more events (watchers?) than file
descriptors (in the first benchmark you would scan, say, 1 event
watchers for the same fd). even if not, it's slow O(n), compared to fast
O(n) that you can achieve with scanning the bitmasks.

It is only fast if you have a lot of handles and very few active watchers,
which isn't too typical (who uses all those extra fd's if not your

In any case, the "scan mask" approach is about many times faster in the
actual benchmark with 1 handles and 100 active ones, simply because
scanning a mask is so much faster.
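
The two dispatch strategies being compared can be sketched as follows. This is an illustration under stated assumptions, not the code of AnyEvent's pure-perl backend or of Stem: the watcher tables @by_fd and @all and all sub names are hypothetical. The first variant walks the bit vector returned by select() and looks up watchers only for set bits; the second walks every watcher and tests its bit.

```perl
use strict;
use warnings;

my @by_fd;   # $by_fd[$fd] = [ callbacks ], an array because fds are small integers
my @all;     # flat list of [ $fd, $callback ], one entry per watcher

sub add_watcher {
    my ($fh, $cb) = @_;
    my $fd = fileno($fh);
    push @{ $by_fd[$fd] }, $cb;
    push @all, [ $fd, $cb ];
}

# Strategy 1: scan the select() result mask, touching only ready descriptors.
sub dispatch_scan_mask {
    my ($rout) = @_;
    my $fired = 0;
    my $bits  = unpack "b*", $rout;   # position N in the string is bit N of the mask
    while ($bits =~ /1/g) {
        my $fd = pos($bits) - 1;
        for my $cb (@{ $by_fd[$fd] || [] }) { $cb->($fd); $fired++ }
    }
    return $fired;
}

# Strategy 2: scan every watcher and test whether its descriptor's bit is set.
sub dispatch_scan_watchers {
    my ($rout) = @_;
    my $fired = 0;
    for my $w (@all) {
        next unless vec($rout, $w->[0], 1);
        $w->[1]->($w->[0]);
        $fired++;
    }
    return $fired;
}

# Demo: two pipes, only the first one has data to read.
pipe(my $r1, my $w1) or die "pipe: $!";
pipe(my $r2, my $w2) or die "pipe: $!";
my @seen;
add_watcher($r1, sub { push @seen, "r1" });
add_watcher($r2, sub { push @seen, "r2" });
syswrite($w1, "x");

my $rin = '';
vec($rin, fileno($_), 1) = 1 for $r1, $r2;
my $rout   = $rin;
my $nfound = select($rout, undef, undef, 1);

my $by_mask     = dispatch_scan_mask($rout);
my $by_watchers = dispatch_scan_watchers($rout);
print "select found $nfound; mask scan fired $by_mask, watcher scan fired $by_watchers\n";
```

Both variants fire the same callbacks; they differ only in what they iterate over, which is the cost the benchmark figures in this thread are arguing about.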

Besides, select returns the number of active handles, so one could use
both approaches and select between them, e.g. when more than 80% of the
handles are active, use the scan-all-handles method, otherwise the bitmask

> descriptors increment and can cause bloat of the array if you have many
> in use (and not all of them in the event loop).

That is not a common case, besides, arrays are very compact, unlike
hashes, so it's not a clear win (note how the pure perl backend in
anyevent comes out as one of the backends using the least amount of

In any case, a lot of technology in the kernel goes into providing "small
integers" as fd's, not taking advantage of that gives away optimisation
opportunities. In this case, unix guarantees that the memory use is
bounded (there is even a resource limit for it, and reserving 4 or 8
bytes/file descriptor is nothing really).

Trying to avoid bloat on that side is the wrong side to optimise for.

> event loops are async in that you get callbacks as they are needed. sure

Yes, but the I/O isn't async, which was my point. asynchronous I/O is
quite a different beast, but few people really use it. (which is a pity,
but only perl has an IMnsHO decent module for it).

> api are what matters. a better term may be non-blocking i/o (and socket

It's actually the only correct term, as no I/O is done in the event loop.

>   ML> Also, this does not explain at all why Event is so slow, and why Glib scales
>   ML> so extremely badly. Most of the stuff that slows down the perl-based event
>   ML> loop is clearly stuff that is much much faster in C.
> poor memory management in the c code?

Perl's memory management is quite good, yeah. I do suspect that it
has something to do with glib scanning its watcher list (ALL of them)
repeatedly, and when removing, who knows, it might run a singly-linked
list to its end to remove the watcher.

As for Event, I think it simply does way too much around simple callback
invocation, for example it uses its event queue and adds events at the end
(walking the list each time). All that the event queue has done for me,
however, was causing infinite memory growth when it added multiple events
for the same fd again and again because some high-priority watcher got

(I have nothing against event queues, you need one, but one *can* manage
it abysmally).

> a framework where memory management was fairly fast due to cached queues
> of known block sizes. alloc/free were usually a trivial call and the
> event code had its own private fixed size buffer queues. it had no
> problem doing 2k parallel web fetches including all the url parsing all
> on a single sparc. we had to throttle it down to keep up with the slower

hehe :)

>   ML> For a reasonable event loop implemented in C, the overhead of
>   ML> calling select should completely dominate the re

Re: benchmarking various event loops with and without anyevent

2008-04-25
On Sat, Apr 26, 2008 at 07:01:39AM +0200, Marc Lehmann wrote: 
> And another one:

does Stem deal with subsecond delay values? Under what circumstances?
AnyEvent guarantees subsecond accuracy currently.


   The ’hard’ attribute means that the next interval delay starts
   before the callback to the object is made. If a soft timer is selected
   (hard is 0), the delay starts after the callback returns. So the hard
   timer ignores the time taken by the callback and so it is a more
   accurate timer. The accuracy a soft timer is affected by how much time
   the callback takes.

It is nice to have a hard timer, but please implement clumping. Not
implementing clumping makes this type of timer completely unusable in
practice, as a time jump or stopping the program for a long time makes
it potentially unusable.

(In fact, this inability to use Event's hard timers was one of the reasons
I was looking for a different event loop. EV doesn't do clumping, however,
as it has two timer types that solve time-jump and stopping-related
problems differently).
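
To illustrate the request, here is a minimal sketch that assumes "clumping" means collapsing all missed intervals of a hard timer into a single callback and rescheduling on the original grid. Both sub names are hypothetical and neither is taken from Stem, Event or EV.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# Without clumping: a hard timer that fell behind fires once per missed
# interval, back to back, until its deadline has caught up with $now.
sub firings_without_clumping {
    my ($deadline, $interval, $now) = @_;
    my $n = 0;
    while ($deadline <= $now) {
        $n++;
        $deadline += $interval;
    }
    return $n;
}

# With clumping: fire once, then skip directly to the first deadline on the
# original grid that lies strictly after $now.
sub next_deadline_clumped {
    my ($deadline, $interval, $now) = @_;
    my $next = $deadline + $interval;
    if ($next <= $now) {
        my $missed = floor(($now - $deadline) / $interval);
        $next = $deadline + ($missed + 1) * $interval;
    }
    return $next;
}

# A 5 s timer due at t=100 while the clock already reads t=1000
# (a time jump, or the program was stopped for a long time).
my $flood = firings_without_clumping(100, 5, 1000);
my $next  = next_deadline_clumped(100, 5, 1000);
print "without clumping: $flood callbacks in a row\n";
print "with clumping: one callback, next deadline at t=$next\n";
```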

Re: benchmarking various event loops with and without anyevent

2008-04-25
On Sat, Apr 26, 2008 at 06:48:55AM +0200, Marc Lehmann wrote: 
> Well, can't say there is much demand for it, but if you can give me a pointer
> on these things in the docs I can probably come up with one before the next
> release:

Looking at Stem::Event, which hopefully is the right place to look, I can
come up with some answers and some more refined questions :)

Here is a new question:

- How do I select a specific event loop backend to be used?

> - does stem provide something to catch signals synchronously?

doesn't seem to be documented, I assume so.

> - how can I do one iteration of the event loop? This is important

Interestingly, the manpage for "Stem::Event # event loop" doesn't mention any
way to start said event loop.

The source code has init_loop, start_loop and, interestingly, stop_loop.

Does that mean I have to initialise the event loop myself? Is initialising
it multiple times a problem? (Obviously, code using anyevent can't be
bothered to implement workarounds for stem, so if that were a problem,
stem couldn't be used by default. That's not too bad, however, as it would
still be used if the main program uses stem).

> - how do I register a callback to be called when an fh or fd becomes
>   "readable" or "writable"?

This is also not documented. All it says is: 

   This class creates an event that will trigger a callback whenever its
   file descriptor has data to be read. [...]

It doesn't say how to pass "its file descriptor", or whether it also works
with file handles.

> - how can I register a relative timer?

Solved :) What's missing is how I cancel that timer; does it automatically
cancel itself when the returned object gets forgotten by the calling code?

> - are there any limitations, for example, what happens when I register
>   two read watchers for one fh (or fd)?

The answer seems to be: stem needs the same workaround as Tk, unless I
overlooked some internal layer.

> - how about child status changes or exits?

Doesn't seem to be supported, so AnyEvent will fall back to its own code
(there is nothing wrong with that, POE is much worse, as its code is
broken but insists on reaping children so one cannot even use one's own

> - how does stem handle time jumps, if at all?

Can't really find anything here either, I guess changing the clock
requires a restart.

> - are its timers based on absolute or wallclock time or relative/monotonic
>   time?

Seems to be absolute time.
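
The distinction matters for time jumps, and a short sketch makes it concrete. It assumes a platform where Time::HiRes provides clock_gettime and CLOCK_MONOTONIC; the four sub names are hypothetical and not part of Stem or AnyEvent. A timer stored as a wallclock deadline moves when the system clock is changed, while one stored against the monotonic clock does not.

```perl
use strict;
use warnings;
use Time::HiRes qw(time clock_gettime CLOCK_MONOTONIC);

# Absolute (wallclock) timer: the deadline is a point on the system clock,
# so setting the clock forward or back makes it fire early or late.
sub wallclock_timer {
    my ($delay) = @_;
    return { due => time() + $delay };
}

sub wallclock_left {
    my ($timer) = @_;
    return $timer->{due} - time();
}

# Relative (monotonic) timer: the deadline is a point on a clock that only
# moves forward and is not affected by changes to the system time.
sub monotonic_timer {
    my ($delay) = @_;
    return { due => clock_gettime(CLOCK_MONOTONIC) + $delay };
}

sub monotonic_left {
    my ($timer) = @_;
    return $timer->{due} - clock_gettime(CLOCK_MONOTONIC);
}

my $w = wallclock_timer(10);
my $m = monotonic_timer(10);
printf "wallclock timer: %.3f s left\n", wallclock_left($w);
printf "monotonic timer: %.3f s left\n", monotonic_left($m);
```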

Re: benchmarking various event loops with and without anyevent

2008-04-25
On Fri, Apr 25, 2008 at 11:40:03PM -0400, Uri Guttman wrote:
> check out stem's pure perl event loop. there are examples in the

Maybe I'll provide a backend for stem.

> modules. it does things in a different direction and doesn't scan
> select's bit masks but instead it scans the interesting handles and see
> whether their bits are set. it should exhibit good behavior under growth
> as all the data are managed in hashes.

That is exactly as the anyevent perl backend did in the previous
version. I don't see how that should exhibit good behaviour - file
descriptors are small integers, so hashes only add overhead over arrays,
and checking every file descriptor that way is much slower than scanning
the bitmask.

Especially in the important case of many handles/few active ones, an approach
of "scan all handles" is very slow.

(The benchmarks reflect this, you could try with an older anyevent
release where we just check the keys of the hash storing fd information:
performance is much lower).

>   ML> I then made a second benchmark, designed not to measure anyevent overhead,
>   ML> but to measure real-world performance of a socket server.
> and that /sessions code also shows use of the asyncio module. if you can
> benchmark that i would be interested in the results. 

Hmm, it seems to have become fashionable lately to call synchronous I/O
asynchronous (see IO::Async, another 100% synchronous framework). What
exactly is asynchronous about the I/O in that example, it seems to be
fully synchronous to me (just event-driven).

>   ML> For small servers, the overhead introduced by running a lot of perl
>   ML> opcodes per iteration dominates, however, reflected in the last benchmark.
> in a heavily loaded server most of the work is in the i/o and should
> overwhelm the event loop itself. that is the whole purpose of event
> loops as we all know here. 

I actually don't know that. When I write response data (such as a file in
my anime server), the performance of the event loop completely dominates
(ignoring IO::AIO overhead in this case, which also of course relies on
the event loop).

Of course, select/poll don't scale at all, unless the majority of handles
is active, also not very typical of heavily loaded servers.

>   ML> However, the net result is that the pure perl event loop performs
>   ML> better than almost all other event loops (EV being the only exception)
>   ML> in serious/medium-sized cases, while I originally expected it to fail
>   ML> completely w.r.t. performance and being only usable as a workaround when
>   ML> no "better" event module is installed.
> i don't find that surprising. perl's i/o is decent and as i said above,
> a loaded server is doing mostly i/o. 

Well, since the server in this benchmark hardly does anything, and as you
can see by the results, the event loop completely dominates, and bad event
loops are a thousand times slower than good ones.

When put into requests/s, this means that with POE, you are limited
to <1000 requests/s, while with EV you can easily reach hundreds of
thousands.

Also, this does not explain at all why Event is so slow, and why Glib scales
so extremely badly. Most of the stuff that slows down the perl-based event
loop is clearly stuff that is much much faster in C.

For a reasonable event loop implemented in C, the overhead of calling
select should completely dominate the rest of the processing (it does so in
EV).

This is not the case for any of the event loops I tested with the
exception of EV and Event (which "only" has high overhead, but it doesn't
grow beyond reason).

> i will take a gander and see if i can play with it and add stem's loop
> to it. if you want to work on this with me, i wouldn't mind the help.

Well, can't say there is much demand for it, but if you can give me a pointer
on these things in the docs I can probably come up with one before the next
release:

- how can I do one iteration of the event loop? This is important
  for condvars: the function must not block after one or more events
  have been handled.
- how do I register a callback to be called when an fh or fd becomes
  "readable" or "writable"?
- how can I register a relative timer?
- are there any limitations, for example, what happens when I register
  two read watchers for one fh (or fd)?
- does stem provide something to catch signals synchronously?
- how about child status changes or exits?

The EV or Event implementation modules should give a good idea of whats

And for the documentation:

- how does stem handle time jumps, if at all?
- are its timers based on absolute or wallclock time or relative/monotonic
  time?

