Re: Question about deployment of math computing

Wesley Peng Tue, 04 Aug 2020 22:50:38 -0700

Thank you, David. That makes things clear. I was mistaken in thinking Starman was event-driven; it is actually a preforking server.


I think any preforking server could serve our deployment better.


Regards.


dc...@prosentient.com.au wrote:

Hi Wesley,

I don't know all the ins and outs of Starman. I do know that Starman is a 
preforking web server, which uses Net::Server::PreFork under the hood. You 
configure the number of preforked workers to correspond with your CPU and 
memory limits for that server.
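
As a rough illustration of that sizing rule, here is a minimal Python sketch.
The helper name, the memory figures and the 512 MB reserve are hypothetical
values chosen for the example; they do not come from Starman or its
documentation.

```python
def suggest_workers(cpu_cores, mem_available_mb, mem_per_worker_mb, reserve_mb=512):
    """Hypothetical helper: cap the worker count by both CPU cores and memory.

    Assumes CPU-bound requests (about one worker per core) and a fixed,
    measured memory footprint per worker.
    """
    if cpu_cores < 1 or mem_per_worker_mb <= 0:
        raise ValueError("cpu_cores must be >= 1 and mem_per_worker_mb > 0")
    by_cpu = cpu_cores
    # Leave reserve_mb for the OS and other services on the machine.
    by_mem = max(0, mem_available_mb - reserve_mb) // mem_per_worker_mb
    # Never suggest fewer than one worker.
    return max(1, min(by_cpu, by_mem))


print(suggest_workers(4, 8192, 300))   # CPU is the limit here: 4
print(suggest_workers(16, 2048, 300))  # memory is the limit here: 5
```

The resulting number is what you would pass to Starman's --workers option,
for example "starman --workers 4 app.psgi".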

As per the Starman documentation 
(https://metacpan.org/pod/release/MIYAGAWA/Starman-0.4015/lib/Starman.pm), you 
should put a frontend server/reverse proxy (like Nginx) in front of Starman. 
Nginx is often recommended because it's event-driven. The idea being that a few 
Nginx workers (rather than those thousands of Apache processes you mentioned) 
can handle a very large volume of HTTP requests, and then Nginx intelligently 
passes those requests to the backend server (e.g. Starman).

Of course, no matter what, you can still get timeouts if the backend server 
isn't responding fast enough, but typically the backend process is going as 
fast as it can. At that point, your only option is to scale up. You can do that 
by using Nginx as a load balancer and horizontally scaling your Starman 
instances, or you can put more CPUs on that machine, and configure Starman to 
prefork more workers.
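
To make the effect of the worker count concrete, here is a small Python
sketch of an idealized preforking server. It assumes that all requests arrive
at the same moment, that each worker handles one request at a time, and that
every worker has a CPU core to itself. The function name and the numbers are
illustrative only.

```python
import heapq


def completion_times(num_requests, workers, service_time):
    """Return the finish time of each request in an idealized worker pool.

    Assumptions: all requests arrive at t=0, every request needs
    service_time seconds, and each worker runs on its own CPU core.
    """
    if workers < 1:
        raise ValueError("workers must be >= 1")
    free_at = [0.0] * workers  # min-heap of times at which each worker is free
    heapq.heapify(free_at)
    finished = []
    for _ in range(num_requests):
        start = heapq.heappop(free_at)  # earliest available worker
        end = start + service_time
        finished.append(end)
        heapq.heappush(free_at, end)
    return finished


# 6 requests of 10 s each on 5 workers: the sixth has to wait for a free worker.
print(completion_times(6, 5, 10.0))  # [10.0, 10.0, 10.0, 10.0, 10.0, 20.0]
# With 6 workers (and 6 cores) nothing waits.
print(completion_times(6, 6, 10.0))  # [10.0, 10.0, 10.0, 10.0, 10.0, 10.0]
```

In this model the extra request is not rejected. It waits until a worker
becomes free, and only times out if that wait exceeds the timeout configured
on the frontend.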

Let's say you use mod_perl/Apache instead of Starman/Nginx. At the end of the
day, you still need to think about how many concurrent requests you need to
serve and how many CPUs you have available. If you've configured Apache to
have too many processes, you're going to overwhelm your server with tasks. You
need to use reasonable constraints.
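
A back-of-the-envelope Python model shows why more processes than cores does
not help CPU-bound work. It is a deliberately simplified sketch: it assumes
perfectly fair time-slicing and ignores context-switch and memory overhead,
both of which make oversubscription worse in practice. The function names are
hypothetical.

```python
def fair_share_latency(jobs, cores, cpu_seconds):
    """Latency of every job when all jobs run at once, one process each.

    Idealized: the cores are time-sliced equally among the processes,
    so all jobs finish together.
    """
    return cpu_seconds * max(1.0, jobs / cores)


def queued_mean_latency(jobs, cores, cpu_seconds):
    """Mean latency with one worker per core and the remaining jobs queued."""
    total = 0.0
    for i in range(jobs):
        batch = i // cores + 1  # jobs complete in batches of `cores`
        total += batch * cpu_seconds
    return total / jobs


# 100 jobs of 10 CPU-seconds each on a 4-core machine.
print(fair_share_latency(100, 4, 10.0))   # 250.0: every job takes 250 s
print(queued_mean_latency(100, 4, 10.0))  # 130.0: the last job still ends at 250 s
```

Total throughput is the same in both cases, because it is bounded by the
number of cores. The difference is that the constrained pool finishes the
early requests quickly instead of making every request slow.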

But remember that this isn't specific to Perl/Starman/Nginx/Apache/mod_perl. 
These are concepts that translate to any stack regardless of programming 
language and web server. (Of course, languages like Node.js and Golang have 
some very cool features for dealing with blocking I/O, so that you can make the 
most of the resources you have. That being said, Perl has Mojo/Mojolicious, 
which claims to do non-blocking I/O in Perl. I have yet to try it though. I'm 
skeptical, but could give it a try.)

At the end of the day, it depends on the workload that you're trying to cater 
to.

David Cook

-----Original Message-----
From: Wesley Peng <m...@yonghua.org>
Sent: Wednesday, 5 August 2020 1:31 PM
To: dc...@prosentient.com.au; modperl@perl.apache.org
Subject: Re: Question about deployment of math computing

Hi

dc...@prosentient.com.au wrote:

That's interesting. After re-reading your earlier email, I think that I 
misunderstood what you were saying.

Since this is a mod_perl listserv, I imagine that the advice will always be to
use mod_perl rather than Starman?

Personally, I'd say either option would be fine. In my experience, the key
advantage of mod_perl or Starman (say over CGI) is that you can pre-load
libraries into memory at web server startup time, and that processes are
persistent (although they do have limited lifetimes of course).

You could use a framework like Catalyst or Mojolicious (note Dancer is another 
framework, but I haven't worked with it) which can support different web 
servers, and then try the different options to see what suits you best.

One thing to note would be that people usually put a reverse proxy like Apache
or Nginx in front of Starman (partially for serving static assets, but for
other reasons as well). Your stack could be less complicated if you just went
the mod_perl/Apache route.

That said, what OS are you planning to use? It's worth checking if mod_perl is 
easily available in your target OS's package repositories. I think Red Hat 
dropped mod_perl starting with RHEL 8, although EPEL 8 now has mod_perl in it. 
Something to think about.


We use Ubuntu 16.04 and 18.04.

We do use Dancer/Starman in our production environment, but that service only
handles lightweight API requests, for example a RESTful API for data validation.

Our math computing, however, is a heavyweight service in which each request
takes a long time to finish, so I wonder whether it should be deployed in Dancer.

The web server behind Dancer is Starman by default. Starman is event-driven,
so it uses very few processes, and the number of processes can't scale up or
down automatically.

We deploy Starman with 5 processes by default. When 5 requests come in, all 5
Starman processes are busy computing them, so the next request will be
blocked. Is that right?

But Apache with mod_perl works in the prefork way, so it can generally have as
many as thousands of processes if resources permit. And its process management
can scale the number of children up and down automatically.

So my real question is this: for a CPU-intensive service, an event-driven
server like Starman has no advantage over a preforking server like Apache.

Am I right?

Thanks.
