On 10/6/05, Uri Guttman <[EMAIL PROTECTED]> wrote:
> >>>>> "BT" == Ben Tilly <[EMAIL PROTECTED]> writes:
>
>   >> you don't even need children to do non-blocking rpc calls. if you do the
>   >> protocol yourself and it is over a socket (as it should be), you can do
>   >> async rpc calls. but if you are using a typical library that hardcodes a
>   >> sync rpc protocol, then you are stuck. this is a major issue i have with
>   >> most protocol implementations, especially on cpan. they are written with
>   >> sync i/o and never think about supporting async. what they don't realize
>   >> is that async can easily emulate sync but it is almost impossible for
>   >> sync to emulate async.
>
>   BT> I assumed that he wouldn't want to rewrite things like database
>   BT> drivers, so I was assuming that he'd be stuck.  After all, if he was
>   BT> using async protocols, then he'd have never complained about blocking
>   BT> calls.
>
> true about dbi and other sync modules. but i won't assume anything since
> the OP has not given a full problem spec (yet).

Perspective.  For what I do, interesting data is almost all stored in
the database, so useful programs are going to want to use DBI.  I
also, as I noted, saw key phrases in the question that suggested
synchronous libraries were anticipated.  Finally, even if everything
can be made asynchronous now, allowing for synchronous calls in your
architecture is a piece of future-proofing - some day a management
dictate may come down that you must integrate with some piece of
software that already has a convenient synchronous interface, but no
asynchronous one.

Some applications, because of what they do, their environment, and so
on, can guarantee that an all-asynchronous design will never be an
issue.  But a great many cannot.
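
To make that concrete: with multiple processes, a blocking DBI call
only ever ties up the worker that makes it.  Roughly the shape of the
thing - the DSN, table, and query here are made up for the example:

    use strict;
    use warnings;
    use DBI;

    # Invented DSN and query - substitute whatever your application uses.
    my $dsn   = 'dbi:SQLite:dbname=example.db';
    my $query = 'SELECT count(*) FROM some_table';

    # Hand the blocking DBI call to a child; the parent reads the result
    # from a pipe and is free to do other work while it waits.
    pipe(my $reader, my $writer) or die "pipe failed: $!";

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {                     # child: do the synchronous work
        close $reader;
        my $dbh = DBI->connect($dsn, '', '', { RaiseError => 1 });
        my ($n) = $dbh->selectrow_array($query);
        print {$writer} "$n\n";
        exit 0;
    }

    close $writer;
    # A real server would watch $reader in its event loop or select()
    # set; blocking on readline here just keeps the sketch short.
    my $result = <$reader>;
    waitpid($pid, 0);
    print "query returned: $result";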

>   >> preforking is just an optimization of a forking server. if you
>   >> distribute the processes over a farm of machines, then you can make the
>   >> main server just connect to the processes on demand or in advance.
>
>   BT> The key feature that I was pointing to was having multiple
>   BT> processes, not the pre-forking optimization.  I specified prefork
>   BT> to distinguish it from the threading model that Apache 2 also
>   BT> offers.
>
> i also like the process farm idea and have used it many times.

There are many ways to skin this cat.
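
For anyone who hasn't written one, the core of a forking (or
preforking) server really is this small.  The port, worker count, and
payload are all invented here:

    use strict;
    use warnings;
    use IO::Socket::INET;

    my $listener = IO::Socket::INET->new(
        LocalPort => 9000,
        Proto     => 'tcp',
        Listen    => 10,
        ReuseAddr => 1,
    ) or die "listen failed: $!";

    for (1 .. 4) {
        defined(my $pid = fork()) or die "fork failed: $!";
        next if $pid;                    # parent moves on to the next worker

        # child: take turns accepting on the shared listening socket
        while (my $client = $listener->accept) {
            print {$client} "hello from worker $$\n";
            close $client;
        }
        exit 0;
    }

    1 while waitpid(-1, 0) > 0;          # parent just reaps workers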

>   BT> Unless overhead is a huge concern, I'd personally use the alternate
>   BT> strategy.  I'd then avoid writing all of the multi-process logic by
>   BT> using Apache for that piece, and mod_perl to process requests/send
>   BT> responses.
>
> but what if you already had a tool in perl that let you do all the async
> communications with no new coding needed? and it can do application
> servers as well? :)

I already indicated reasons to pick a design where synchronous calls
are not an issue.  An additional reason to avoid cooperative
multitasking is scalability: multiple processes let you benefit from
multiple CPUs (both real and virtual), and they make the migration to
multiple machines easier, while a cooperatively multitasked program can
only ever use one CPU.  (Cooperative multitasking does let tasks
communicate with each other very directly, but if you do too much of
that it is easy to accidentally introduce race conditions, particularly
if you try to be asynchronous everywhere.)
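
And here is the cooperative model in miniature, for contrast with the
forking sketch above: one process, one select() loop, every client
interleaved, so never more than one CPU.  The port is arbitrary, and a
real loop would use non-blocking reads instead of readline:

    use strict;
    use warnings;
    use IO::Socket::INET;
    use IO::Select;

    my $listener = IO::Socket::INET->new(
        LocalPort => 9001,
        Proto     => 'tcp',
        Listen    => 10,
        ReuseAddr => 1,
    ) or die "listen failed: $!";

    my $select = IO::Select->new($listener);

    while (1) {                          # one process, one loop, one CPU
        for my $fh ($select->can_read) {
            if ($fh == $listener) {      # new connection
                my $client = $listener->accept;
                $select->add($client) if $client;
            }
            elsif (defined(my $line = <$fh>)) {
                print {$fh} "echo: $line";
            }
            else {                       # client went away
                $select->remove($fh);
                close $fh;
            }
        }
    }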

>   >> and what if you aren't using http for the clients? what if you want to
>   >> support a cli or a gui client? i really hate how apache (1 or 2) is
>   >> being touted as the next great application platform and savior. cramming
>   >> all those different modules (apache and perl) into the insane mess of
>   >> apache is asking for trouble. anyone ever heard of config file hell? or
>   >> colliding modules that are too tightly coupled to clean up?
>
>   BT> If you're building the system from scratch, you can use any protocol
>   BT> that you want for the clients.  It doesn't matter what kind of clients
>   BT> you have.
>
> sure. again we have no proper spec so i won't speculate.

My point remains.  Wanting to support a CLI or GUI client does not
prevent you from using http.  Your suggesting that it does is a red
herring.
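
To put it concretely, a CLI client is just another http client.  The
URL and parameter below are invented for illustration:

    use strict;
    use warnings;
    use LWP::UserAgent;

    my $ua  = LWP::UserAgent->new(timeout => 30);
    my $res = $ua->post(
        'http://appserver.example.com/api/lookup',
        { user => $ARGV[0] || 'guest' },
    );

    die 'request failed: ' . $res->status_line . "\n"
        unless $res->is_success;
    print $res->content;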

[...]
> that still doesn't mean using apache and http for app serving is a good
> idea. http is stateless so it makes for a bad protocol for when you need
> multiple remote operations on a single session. sure you can work around
> it (cookies) but that is still a workaround.

Statelessness is indeed a drawback of http.  And working around it is
one of the pieces of overhead in this approach that I alluded to when I
said, "if you don't mind the overhead".

>   BT> Before you disagree with the last, note that I'm counting as overhead
>   BT> having to implement asynchronous wheels because you don't like the
>   BT> blocking that happens with the synchronous one on CPAN...
>
> hmm, what if the wheels are there and rounder than the square ones you
> currently have?

What rounder wheel than DBI do you have to offer me?

[...]
>   BT> And this makes anything that I've said any less true?
>
> no, but it shows that threads and events can work together if needed.

Did I say they couldn't?

>   BT> Note that I'm not saying that Apache is a perfect strategy.  I'm
>   BT> not saying that it has no drawbacks.  I'm just saying that it is a
>   BT> workable strategy in many situations, and it is the strategy that
>   BT> I'd be inclined to use fairly often.
>
> i would vote against it. too many apps require sessions so it loses
> there with http. http also has to incur much overhead to transfer binary
> data.

Since we do not work together, and are unlikely to any time soon, our
design disagreements are unlikely to be an issue.

>   BT> If you want perfection, then it is clearly the wrong way to go.  But
>   BT> the perfect is the enemy of the good, and it is a pretty good
>   BT> solution.
>
> and a bad solution hurts forever. :)

True.  See my above comment about how allowing for the use of
synchronous pieces is a piece of future-proofing. :-P

> i have too much experience with event loops and code to back me to start
> going in the direction of apache as an app server. in fact my view is to
> use apache as a *front end* but it does little other than to handle http
> and static requests. all cgi stuff is packed up and sent to a middleware
> layer which does all the work. you get real normal perl (mod_perl also
> is slightly nutso compared to regular perl) and a simple faster
> apache. and the middleware layer can be on other boxes, it can do async
> backend stuff, parallel ops, etc. then apache is just a simple UI, no
> different to the middleware than a cli with a socket or anything else.

Ironically, I do exactly this, but Apache is used as both the front end
and the middleware.  High-volume sites commonly use a reverse proxy
configuration so that heavy mod_perl servers (which take up precious
resources like RAM, database connections, etc.) can return data as fast
as they generate it, and people on dialup connections will only tie up
a lightweight proxy server.
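
The usual shape of that setup, with the port, paths, and URL prefix
invented for the example: a thin proxy-only Apache in front, passing
dynamic URLs back to a heavy mod_perl Apache behind it.

    # Thin front-end httpd.conf fragment (mod_proxy loaded): serve static
    # files directly, hand dynamic URLs to a fat mod_perl Apache behind it.
    Alias /static/ /var/www/static/

    ProxyPass        /app http://127.0.0.1:8080/app
    ProxyPassReverse /app http://127.0.0.1:8080/app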

As for your exact solution, if it works for you, it works.  However,
the benefits that you're getting seem fairly marginal to me.
Furthermore, if you want high performance in a web environment, both
reason and experience (my own and that of many others I know) say that
middleware servers don't really help scalability.

Yes, I know that vendors swear the opposite about middleware up and
down until their faces turn blue.  But those same vendors have a
vested interest in seeing you buy *more* machines, not fewer.  And, as
I say, the examples that I'm personally aware of contradict what
vendors try to tell you.

Cheers,
Ben
 