Re: Question about deployment of math computing

2020-08-07 Thread Ruben Safir
On Thu, Aug 06, 2020 at 10:10:38PM -0500, Mithun Bhattacharya wrote:
> Ruben this conversation had nothing to do with any specific AI use - since
> we did not ask to be spammed with your ethics opinion could you please stop

***Boink*** that is wrong.  If you want to discuss the destructive use
of software to harm people, a forum where free software is developed and
used is exactly the wrong place.  This is not a safe haven for unethical
behaviors.

Of course, predictive AI can be used for other purposes, and you are
free to discuss it.  I am all ears.

It is not __my ethics__ .  It is just __ethics__ .

It's not spam.  It is central to the purpose and the development of the
platform and to its being given a free license, in order to share it broadly
with the scientific and general community as a benefit.  It is
gratifying to know that modperl can be deployed in such a way as to give
production servers the muscle and concurrency needed for a mass deployment
of an AI.  That probably has as much to do with the database back end as
the perl services.  However, it is an ethical problem if this is being
deployed to purposely suppress populations and in support of inhumane
dictatorships.

We are at a watershed moment in history with regard to AI, and all
technical forums, you would hope, are rife with such discussion, but
especially forums such as this one, which support free software
projects that might be used to build the tools of repression.



> > > > > yes, this is a prediction server, which would be deployed in PROD
> > > > > environment, the client application would request the prediction
> > > > > server for results as scores. You can think it as online
> > > > > recommendation systems.
> > > > >

How many clients are we talking about and what is the nature of the
scoring process?

Are we talking about 1.43 billion people or so?


> > > > > regards.
> > > >
> >
> > The ethical issues can no longer be dodged
> >
> > https://www.youtube.com/watch?v=5dZ_lvDgevk
> >
> >
> > > >
> > > > Is this AI being used by the Chinese governments social rating system?
> > > >
> > > > Reuvain
> > > >

-- 
So many immigrant groups have swept through our town
that Brooklyn, like Atlantis, reaches mythological
proportions in the mind of the world - RI Safir 1998
http://www.mrbrklyn.com 

DRM is THEFT - We are the STAKEHOLDERS - RI Safir 2002
http://www.nylxs.com - Leadership Development in Free Software
http://www2.mrbrklyn.com/resources - Unpublished Archive 
http://www.coinhangout.com - coins!
http://www.brooklyn-living.com 

Being so tracked is for FARM ANIMALS and extermination camps, 
but incompatible with living as a free human being. -RI Safir 2013



Re: Question about deployment of math computing

2020-08-07 Thread Ruben Safir
On Thu, Aug 06, 2020 at 10:10:38PM -0500, Mithun Bhattacharya wrote:
> Ruben this conversation had nothing to do with any specific AI use - since
> we did not ask to be spammed with your ethics opinion could you please stop
> misusing the mod_perl forum for such activities ?

So you are using modperl to create a predictive AI that is used to 
suppress and repress a population?

That is what you seemed to describe.

There is no inappropriate place to discuss such ethics.

The development of modperl was originally with Lincoln Stein in support
of the Human Genome project and it was/is released to the public as a
tool for software freedom.  The ethics were always first and foremost
part of the release.

It is not cool to use it to spy on whole populations and keep them under
surveillance.  It is not cool or acceptable to do that at all.

> 
> On Thu, Aug 6, 2020 at 9:00 PM Ruben Safir  wrote:
> 
> > On Wed, Aug 05, 2020 at 10:47:48AM -0500, Mithun Bhattacharya wrote:
> > > Assuming that is genuine curiosity can we please not deviate from the
> > topic
> > > ?
> > >
> > > On Wed, Aug 5, 2020 at 6:19 AM Ruben Safir  wrote:
> > >
> > > > On Wed, Aug 05, 2020 at 09:46:18AM +0800, Wesley Peng wrote:
> > > > > Hi
> > > > >
> > > > > Mithun Bhattacharya wrote:
> > > > > >Do you really need a webserver which is providing a blocking
> > service ?
> > > > >
> > > > > yes, this is a prediction server, which would be deployed in PROD
> > > > > environment, the client application would request the prediction
> > > > > server for results as scores. You can think it as online
> > > > > recommendation systems.
> > > > >
> > > > > regards.
> > > >
> >
> > The ethical issues can no longer be dodged
> >
> > https://www.youtube.com/watch?v=5dZ_lvDgevk
> >
> >
> > > >
> > > > Is this AI being used by the Chinese governments social rating system?
> > > >
> > > > Reuvain
> > > >




Re: Question about deployment of math computing

2020-08-06 Thread Mithun Bhattacharya
Ruben this conversation had nothing to do with any specific AI use - since
we did not ask to be spammed with your ethics opinion could you please stop
misusing the mod_perl forum for such activities ?

On Thu, Aug 6, 2020 at 9:00 PM Ruben Safir  wrote:

> On Wed, Aug 05, 2020 at 10:47:48AM -0500, Mithun Bhattacharya wrote:
> > Assuming that is genuine curiosity can we please not deviate from the
> topic
> > ?
> >
> > On Wed, Aug 5, 2020 at 6:19 AM Ruben Safir  wrote:
> >
> > > On Wed, Aug 05, 2020 at 09:46:18AM +0800, Wesley Peng wrote:
> > > > Hi
> > > >
> > > > Mithun Bhattacharya wrote:
> > > > >Do you really need a webserver which is providing a blocking
> service ?
> > > >
> > > > yes, this is a prediction server, which would be deployed in PROD
> > > > environment, the client application would request the prediction
> > > > server for results as scores. You can think it as online
> > > > recommendation systems.
> > > >
> > > > regards.
> > >
>
> The ethical issues can no longer be dodged
>
> https://www.youtube.com/watch?v=5dZ_lvDgevk
>
>
> > >
> > > Is this AI being used by the Chinese governments social rating system?
> > >
> > > Reuvain
> > >


Re: Question about deployment of math computing

2020-08-06 Thread Ruben Safir
On Wed, Aug 05, 2020 at 10:47:48AM -0500, Mithun Bhattacharya wrote:
> Assuming that is genuine curiosity can we please not deviate from the topic
> ?
> 
> On Wed, Aug 5, 2020 at 6:19 AM Ruben Safir  wrote:
> 
> > On Wed, Aug 05, 2020 at 09:46:18AM +0800, Wesley Peng wrote:
> > > Hi
> > >
> > > Mithun Bhattacharya wrote:
> > > >Do you really need a webserver which is providing a blocking service ?
> > >
> > > yes, this is a prediction server, which would be deployed in PROD
> > > environment, the client application would request the prediction
> > > server for results as scores. You can think it as online
> > > recommendation systems.
> > >
> > > regards.
> >

The ethical issues can no longer be dodged

https://www.youtube.com/watch?v=5dZ_lvDgevk


> >
> > Is this AI being used by the Chinese governments social rating system?
> >
> > Reuvain
> >




Re: Question about deployment of math computing

2020-08-05 Thread Ruben Safir
On Wed, Aug 05, 2020 at 10:47:48AM -0500, Mithun Bhattacharya wrote:
> Assuming that is genuine curiosity can we please not deviate from the topic
> ?

IBM felt that way once.

I think it is important to know what kind of apps we are developing, and
for what uses.  You are aware of the broad use of AI in China to create
social "credit" scores that are used to control populations and lock out
dissidents.

Is this what your AI is doing?

It's a nasty business and folks should be aware of that fact before
choosing to collaborate with such a project.

It sounds very much like what you described you're working on.

Ruben

> 
> On Wed, Aug 5, 2020 at 6:19 AM Ruben Safir  wrote:
> 
> > On Wed, Aug 05, 2020 at 09:46:18AM +0800, Wesley Peng wrote:
> > > Hi
> > >
> > > Mithun Bhattacharya wrote:
> > > >Do you really need a webserver which is providing a blocking service ?
> > >
> > > yes, this is a prediction server, which would be deployed in PROD
> > > environment, the client application would request the prediction
> > > server for results as scores. You can think it as online
> > > recommendation systems.
> > >
> > > regards.
> >
> >
> > Is this AI being used by the Chinese governments social rating system?
> >
> > Reuvain
> >




Re: Question about deployment of math computing

2020-08-05 Thread Mithun Bhattacharya
Assuming that is genuine curiosity can we please not deviate from the topic
?

On Wed, Aug 5, 2020 at 6:19 AM Ruben Safir  wrote:

> On Wed, Aug 05, 2020 at 09:46:18AM +0800, Wesley Peng wrote:
> > Hi
> >
> > Mithun Bhattacharya wrote:
> > >Do you really need a webserver which is providing a blocking service ?
> >
> > yes, this is a prediction server, which would be deployed in PROD
> > environment, the client application would request the prediction
> > server for results as scores. You can think it as online
> > recommendation systems.
> >
> > regards.
>
>
> Is this AI being used by the Chinese governments social rating system?
>
> Reuvain
>


Re: Question about deployment of math computing

2020-08-05 Thread Ruben Safir
On Wed, Aug 05, 2020 at 09:46:18AM +0800, Wesley Peng wrote:
> Hi
> 
> Mithun Bhattacharya wrote:
> >Do you really need a webserver which is providing a blocking service ?
> 
> yes, this is a prediction server, which would be deployed in PROD
> environment, the client application would request the prediction
> server for results as scores. You can think it as online
> recommendation systems.
> 
> regards.


Is this AI being used by the Chinese government's social rating system?

Reuvain




Re: Question about deployment of math computing [EXT]

2020-08-05 Thread Mark Blackman



> 
> 
> The good thing about Apache is its dynamic rescaling - which isn't as easy
> with starman - if you have a large code base the spin up time for starman can
> be quite large, as it appears (to make it efficient) to load in every bit of
> code that the application needs - even code that is only needed for one of
> those small edge cases.
> 
> So yes use starman for simple apps if you need to, but for complex stuff I 
> find mod_perl setup more reliable.

Even Apache has a maximum number of instances. If you’re prepared to let your 
Apache+mod_perl use up to say 300 concurrent Perl instances, you just set up 
your starman instance to pre-fork 300 concurrent instances. Your hardware will 
always impose concurrency limits. You should always be able to achieve the same 
performance with mod_perl and Starman as Perl is fundamentally single-threaded. 
Separating the front-end proxy (Apache or Nginx) from the back-end application 
(Perl app running under starman) is a simplification and a separation of 
concerns, not a performance gain or penalty.
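
As a rough illustration of the point that the worker count caps concurrency
whichever server you pick, here is a minimal sketch in Python (chosen only for
brevity; nothing here is Starman or mod_perl API). It assumes every request
arrives at the same moment, needs the same CPU time, that each preforked worker
handles one request at a time, and that a core is available for every worker.
The function name and the numbers are made up for the example.

```python
def finish_times(num_requests, workers, seconds_per_request):
    """Completion time (in seconds) of each request in arrival order.

    Toy model, not a benchmark: all requests arrive at t=0, each one
    occupies exactly one worker for seconds_per_request, and requests
    beyond the worker count wait for the next free worker.
    """
    if workers < 1:
        raise ValueError("need at least one worker")
    times = []
    for position in range(1, num_requests + 1):
        # Which "wave" of work this request lands in (1-based, integer math).
        wave = (position + workers - 1) // workers
        times.append(wave * seconds_per_request)
    return times


# Six simultaneous 10-second requests against a pool of five workers:
# the sixth has to wait for a worker to become free.
print(finish_times(num_requests=6, workers=5, seconds_per_request=10))
```

In this model the result depends only on the number of workers, not on which
server forked them, which is the sense in which 300 mod_perl children and 300
Starman workers would behave the same for CPU-bound requests.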

If you use unix domain sockets for the proxying you can even get zero-downtime 
application restarts.

mod_perl is great for weird, special cases, like supporting some legacy, 3rd 
party code, but I don’t believe it’s the best option for the common case.

- Mark




Re: Question about deployment of math computing [EXT]

2020-08-05 Thread Wesley Peng

James,

James Smith wrote:

The services which use apache/mod_perl work reliably and return data for these 
- the dancer/starman sometimes fail/hang as there are no backends to serve the 
requests or those backends timeout requests to the nginx/proxy (but still 
continue using resources). The team running the backends fail to notice this - 
because there is no easy to see reporting etc on these boxes.


Thanks for letting me know this.
We have been using starman for a RESTful API service; those are
lightweight HTTP requests/responses.
But for serving (machine learning)/(deep learning) workloads, we may
consider using modperl for more stability.


regards.


RE: Question about deployment of math computing [EXT]

2020-08-05 Thread James Smith
Wesley,

You will have seen my posts elsewhere - we work on large tera/petabyte-scale
datasets {and these aren't a large number of large records but more a very,
very large number of small records} so the memory use and response times are
both large - compute less so in some cases, but not in others.

The services which use apache/mod_perl work reliably and return data for these 
- the dancer/starman sometimes fail/hang as there are no backends to serve the 
requests or those backends timeout requests to the nginx/proxy (but still 
continue using resources). The team running the backends fail to notice this - 
because there is no easy to see reporting etc on these boxes.

We do have other services which we have set up which return large amounts of 
data computed on the fly and the response time for these could be multiple 
hours - but by carefully streaming the data in apache we can get the data to 
return. A similar option isn't available in dancer (or wasn't at the time) to 
handle these sorts of requests and so similar code was impossible.

In most cases starman hasn't really been the answer and apache works 
sufficiently well. Even where people are using nginx we are often now using 
some of the alternative apache workers (mpm_event) which seem to be better/more 
reliable than nginx, and means we don't have to have completely different 
configuration setups for some of our proxies, static servers and dynamic 
content servers.

The good thing about Apache is its dynamic rescaling - which isn't as easy
with starman - if you have a large code base the spin up time for starman can
be quite large, as it appears (to make it efficient) to load in every bit of
code that the application needs - even code that is only needed for one of
those small edge cases.

So yes use starman for simple apps if you need to, but for complex stuff I find 
mod_perl setup more reliable.

James

-Original Message-
From: Wesley Peng  
Sent: 05 August 2020 04:31
To: dc...@prosentient.com.au; modperl@perl.apache.org
Subject: Re: Question about deployment of math computing [EXT]

Hi

dc...@prosentient.com.au wrote:
> That's interesting. After re-reading your earlier email, I think that I 
> misunderstood what you were saying.
> 
> Since this is a mod_perl listserv, I imagine that the advice will always be 
> to use mod_perl rather than starman?
> 
> Personally, I'd say either option would be fine. In my experience, the key 
> advantage of mod_perl or starman (say over CGI) is that you can pre-load 
> libraries into memory at web server startup time, and that processes are 
> persistent (although they do have limited lifetimes of course).
> 
> You could use a framework like Catalyst or Mojolicious (note Dancer is 
> another framework, but I haven't worked with it) which can support different 
> web servers, and then try the different options to see what suits you best.
> 
> One thing to note would be that usually people put a reverse proxy in front 
> of starman like Apache or Nginx (partially for serving static assets but 
> other reasons as well). Your stack could be less complicated if you just went 
> the mod_perl/Apache route.
> 
> That said, what OS are you planning to use? It's worth checking if mod_perl 
> is easily available in your target OS's package repositories. I think Red Hat 
> dropped mod_perl starting with RHEL 8, although EPEL 8 now has mod_perl in 
> it. Something to think about.

We use Ubuntu 16.04 and 18.04.

We do use dancer/starman in our production environment, but the service only
handles lightweight API requests, for example a RESTful API for data
validation.

Our math computing, by contrast, is a heavyweight service where each request
takes a long time to finish, so I wonder whether it should be deployed in
dancer.

The webserver behind dancer is starman by default. As I understand it, starman
is event driven, uses very few processes, and can't scale the number of
processes up or down automatically.

We deploy starman with 5 processes by default. When 5 requests come in, all 5
starman processes are busy computing them, so the next request will be
blocked. Is that right?

But apache mp works in the prefork way; generally it can have as many as
thousands of processes if resources permit, and its process management can
scale the children up and down automatically.

So my real question is: for a CPU-consuming service, an event-driven server
like starman has no advantage over a preforked server like Apache.

Am I right?

Thanks.



-- 
 The Wellcome Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE.

Re: Question about deployment of math computing

2020-08-04 Thread Wesley Peng
Thank you David. That makes things clear. I was mistaken in thinking
starman was event driven; it is really preforked.


I think any preforked server could serve our deployment better.

Regards.


dc...@prosentient.com.au wrote:

Hi Wesley,

I don't know all the ins and outs of Starman. I do know that Starman is a 
preforking web server, which uses Net::Server::PreFork under the hood. You 
configure the number of preforked workers to correspond with your CPU and 
memory limits for that server.

As per the Starman documentation 
(https://metacpan.org/pod/release/MIYAGAWA/Starman-0.4015/lib/Starman.pm), you 
should put a frontend server/reverse proxy (like Nginx) in front of Starman. 
Nginx is often recommended because it's event-driven. The idea being that a few 
Nginx workers (rather than those thousands of Apache processes you mentioned) 
can handle a very large volume of HTTP requests, and then Nginx intelligently 
passes those requests to the backend server (e.g. Starman).

Of course, no matter what, you can still get timeouts if the backend server 
isn't responding fast enough, but typically the backend process is going as 
fast as it can. At that point, your only option is to scale up. You can do that 
by using Nginx as a load balancer and horizontally scaling your Starman 
instances, or you can put more CPUs on that machine, and configure Starman to 
prefork more workers.

Let's say you use Mod_Perl/Apache instead of Starman/Nginx. At the end of the 
day, you still need to think about how many concurrent requests you're needing 
to serve and how many CPUs you have available. If you've configured Apache to 
have too many processes, you're going to overwhelm your server with tasks. You 
need to use reasonable constraints.

But remember that this isn't specific to Perl/Starman/Nginx/Apache/mod_perl. 
These are concepts that translate to any stack regardless of programming 
language and web server. (Of course, languages like Node.js and Golang have 
some very cool features for dealing with blocking I/O, so that you can make the 
most of the resources you have. That being said, Perl has Mojo/Mojolicious, 
which claims to do non-blocking I/O in Perl. I have yet to try it though. I'm 
skeptical, but could give it a try.)

At the end of the day, it depends on the workload that you're trying to cater 
to.

David Cook

-Original Message-
From: Wesley Peng 
Sent: Wednesday, 5 August 2020 1:31 PM
To: dc...@prosentient.com.au; modperl@perl.apache.org
Subject: Re: Question about deployment of math computing

Hi

dc...@prosentient.com.au wrote:

That's interesting. After re-reading your earlier email, I think that I 
misunderstood what you were saying.

Since this is a mod_perl listserv, I imagine that the advice will always be to 
use mod_perl rather than starman?

Personally, I'd say either option would be fine. In my experience, the key 
advantage of mod_perl or starman (say over CGI) is that you can pre-load 
libraries into memory at web server startup time, and that processes are 
persistent (although they do have limited lifetimes of course).

You could use a framework like Catalyst or Mojolicious (note Dancer is another 
framework, but I haven't worked with it) which can support different web 
servers, and then try the different options to see what suits you best.

One thing to note would be that usually people put a reverse proxy in front of 
starman like Apache or Nginx (partially for serving static assets but other 
reasons as well). Your stack could be less complicated if you just went the 
mod_perl/Apache route.

That said, what OS are you planning to use? It's worth checking if mod_perl is 
easily available in your target OS's package repositories. I think Red Hat 
dropped mod_perl starting with RHEL 8, although EPEL 8 now has mod_perl in it. 
Something to think about.


We use Ubuntu 16.04 and 18.04.

We do use dancer/starman in our production environment, but the service only
handles lightweight API requests, for example a RESTful API for data
validation.

Our math computing, by contrast, is a heavyweight service where each request
takes a long time to finish, so I wonder whether it should be deployed in
dancer.

The webserver behind dancer is starman by default. As I understand it, starman
is event driven, uses very few processes, and can't scale the number of
processes up or down automatically.

We deploy starman with 5 processes by default. When 5 requests come in, all 5
starman processes are busy computing them, so the next request will be
blocked. Is that right?

But apache mp works in the prefork way; generally it can have as many as
thousands of processes if resources permit, and its process management can
scale the children up and down automatically.

So my real question is: for a CPU-consuming service, an event-driven server
like starman has no advantage over a preforked server like Apache.

Am I right?

Thanks.



RE: Question about deployment of math computing

2020-08-04 Thread dcook
Hi Wesley,

I don't know all the ins and outs of Starman. I do know that Starman is a 
preforking web server, which uses Net::Server::PreFork under the hood. You 
configure the number of preforked workers to correspond with your CPU and 
memory limits for that server. 
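
To make the sizing rule above a little more concrete, here is a small
back-of-the-envelope sketch in Python (used only for illustration; it is not
part of Starman or Net::Server::PreFork). It assumes CPU-bound requests, so
running more workers than cores gains nothing, and it assumes you have measured
roughly how much memory one worker uses. The function name, the default
reserve, and the example figures are all hypothetical.

```python
def suggested_workers(cpu_cores, total_mem_mb, per_worker_mem_mb,
                      reserved_mem_mb=512):
    """Rough upper bound on preforked workers for CPU-bound requests.

    Assumptions (hypothetical, adjust to your own measurements):
    one busy worker saturates one core, each worker needs about
    per_worker_mem_mb of memory, and reserved_mem_mb is kept free
    for the OS and the front-end proxy.
    """
    if cpu_cores < 1 or per_worker_mem_mb < 1:
        raise ValueError("cpu_cores and per_worker_mem_mb must be positive")
    usable_mem_mb = max(total_mem_mb - reserved_mem_mb, 0)
    by_memory = usable_mem_mb // per_worker_mem_mb
    # Never suggest fewer than one worker, never more than CPU or memory allow.
    return max(1, min(cpu_cores, by_memory))


# An 8-core box with 16 GB: the CPU is the limit.
print(suggested_workers(cpu_cores=8, total_mem_mb=16384, per_worker_mem_mb=400))
# The same box with only 2 GB: memory is the limit.
print(suggested_workers(cpu_cores=8, total_mem_mb=2048, per_worker_mem_mb=400))
```

Whatever number comes out would then be the figure you hand to the server's
worker setting; treat it as a starting point to load-test, not a guarantee.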

As per the Starman documentation 
(https://metacpan.org/pod/release/MIYAGAWA/Starman-0.4015/lib/Starman.pm), you 
should put a frontend server/reverse proxy (like Nginx) in front of Starman. 
Nginx is often recommended because it's event-driven. The idea being that a few 
Nginx workers (rather than those thousands of Apache processes you mentioned) 
can handle a very large volume of HTTP requests, and then Nginx intelligently 
passes those requests to the backend server (e.g. Starman). 

Of course, no matter what, you can still get timeouts if the backend server 
isn't responding fast enough, but typically the backend process is going as 
fast as it can. At that point, your only option is to scale up. You can do that 
by using Nginx as a load balancer and horizontally scaling your Starman 
instances, or you can put more CPUs on that machine, and configure Starman to 
prefork more workers.

Let's say you use Mod_Perl/Apache instead of Starman/Nginx. At the end of the 
day, you still need to think about how many concurrent requests you're needing 
to serve and how many CPUs you have available. If you've configured Apache to 
have too many processes, you're going to overwhelm your server with tasks. You 
need to use reasonable constraints. 

But remember that this isn't specific to Perl/Starman/Nginx/Apache/mod_perl. 
These are concepts that translate to any stack regardless of programming 
language and web server. (Of course, languages like Node.js and Golang have 
some very cool features for dealing with blocking I/O, so that you can make the 
most of the resources you have. That being said, Perl has Mojo/Mojolicious, 
which claims to do non-blocking I/O in Perl. I have yet to try it though. I'm 
skeptical, but could give it a try.)

At the end of the day, it depends on the workload that you're trying to cater 
to.

David Cook

-Original Message-
From: Wesley Peng  
Sent: Wednesday, 5 August 2020 1:31 PM
To: dc...@prosentient.com.au; modperl@perl.apache.org
Subject: Re: Question about deployment of math computing

Hi

dc...@prosentient.com.au wrote:
> That's interesting. After re-reading your earlier email, I think that I 
> misunderstood what you were saying.
> 
> Since this is a mod_perl listserv, I imagine that the advice will always be 
> to use mod_perl rather than starman?
> 
> Personally, I'd say either option would be fine. In my experience, the key 
> advantage of mod_perl or starman (say over CGI) is that you can pre-load 
> libraries into memory at web server startup time, and that processes are 
> persistent (although they do have limited lifetimes of course).
> 
> You could use a framework like Catalyst or Mojolicious (note Dancer is 
> another framework, but I haven't worked with it) which can support different 
> web servers, and then try the different options to see what suits you best.
> 
> One thing to note would be that usually people put a reverse proxy in front 
> of starman like Apache or Nginx (partially for serving static assets but 
> other reasons as well). Your stack could be less complicated if you just went 
> the mod_perl/Apache route.
> 
> That said, what OS are you planning to use? It's worth checking if mod_perl 
> is easily available in your target OS's package repositories. I think Red Hat 
> dropped mod_perl starting with RHEL 8, although EPEL 8 now has mod_perl in 
> it. Something to think about.

We use Ubuntu 16.04 and 18.04.

We do use Dancer/Starman in our production environment, but that service only 
handles lightweight API requests, for example a RESTful API for data validation.

Our math computing, by contrast, is a heavyweight service: each request takes 
a lot of time to finish, so I wonder whether it should be deployed in Dancer.

The web server behind Dancer is Starman by default. As I understand it, 
Starman is event-driven, uses very few processes, and cannot scale its 
processes up or down automatically.

We deploy Starman with 5 processes by default. When 5 requests come in, all 5 
Starman processes are busy computing them, so the next request will be 
blocked. Is that right?

Apache with mod_perl, on the other hand, works the prefork way: generally it 
can have as many as thousands of processes if resources permit, and its process 
management can scale the children up and down automatically.

So my real point is that, for a CPU-consuming service, an event-driven server 
like Starman has no advantage over a preforked server like Apache.

Am I right?

Thanks.





Re: Question about deployment of math computing

2020-08-04 Thread Wesley Peng

Hi

dc...@prosentient.com.au wrote:

That's interesting. After re-reading your earlier email, I think that I 
misunderstood what you were saying.

Since this is a mod_perl listserv, I imagine that the advice will always be to 
use mod_perl rather than starman?

Personally, I'd say either option would be fine. In my experience, the key 
advantage of mod_perl or starman (say over CGI) is that you can pre-load 
libraries into memory at web server startup time, and that processes are 
persistent (although they do have limited lifetimes of course).

You could use a framework like Catalyst or Mojolicious (note Dancer is another 
framework, but I haven't worked with it) which can support different web 
servers, and then try the different options to see what suits you best.

One thing to note would be that usually people put a reverse proxy in front of 
starman like Apache or Nginx (partially for serving static assets but other 
reasons as well). Your stack could be less complicated if you just went the 
mod_perl/Apache route.

That said, what OS are you planning to use? It's worth checking if mod_perl is 
easily available in your target OS's package repositories. I think Red Hat 
dropped mod_perl starting with RHEL 8, although EPEL 8 now has mod_perl in it. 
Something to think about.


We use Ubuntu 16.04 and 18.04.

We do use Dancer/Starman in our production environment, but that service 
only handles lightweight API requests, for example a RESTful API for data 
validation.


Our math computing, by contrast, is a heavyweight service: each request 
takes a lot of time to finish, so I wonder whether it should be deployed 
in Dancer.


The web server behind Dancer is Starman by default. As I understand it, 
Starman is event-driven, uses very few processes, and cannot scale its 
processes up or down automatically.


We deploy Starman with 5 processes by default. When 5 requests come in, 
all 5 Starman processes are busy computing them, so the next request 
will be blocked. Is that right?


Apache with mod_perl, on the other hand, works the prefork way: generally 
it can have as many as thousands of processes if resources permit, and its 
process management can scale the children up and down automatically.


So my real point is that, for a CPU-consuming service, an event-driven 
server like Starman has no advantage over a preforked server like Apache.


Am I right?

Thanks.


RE: Question about deployment of math computing

2020-08-04 Thread dcook
That's interesting. After re-reading your earlier email, I think that I 
misunderstood what you were saying.

Since this is a mod_perl listserv, I imagine that the advice will always be to 
use mod_perl rather than starman? 

Personally, I'd say either option would be fine. In my experience, the key 
advantage of mod_perl or starman (say over CGI) is that you can pre-load 
libraries into memory at web server startup time, and that processes are 
persistent (although they do have limited lifetimes of course).

You could use a framework like Catalyst or Mojolicious (note Dancer is another 
framework, but I haven't worked with it) which can support different web 
servers, and then try the different options to see what suits you best. 

One thing to note would be that usually people put a reverse proxy in front of 
starman like Apache or Nginx (partially for serving static assets but other 
reasons as well). Your stack could be less complicated if you just went the 
mod_perl/Apache route. 

That said, what OS are you planning to use? It's worth checking if mod_perl is 
easily available in your target OS's package repositories. I think Red Hat 
dropped mod_perl starting with RHEL 8, although EPEL 8 now has mod_perl in it. 
Something to think about.

David Cook

-Original Message-
From: Wesley Peng  
Sent: Wednesday, 5 August 2020 1:00 PM
To: dc...@prosentient.com.au; modperl@perl.apache.org
Subject: Re: Question about deployment of math computing

Hi

dc...@prosentient.com.au wrote:
> If your app isn't human-facing, then I don't see why a little delay would be 
> a problem?

Our app is not human-facing. Applications from other departments will request 
results from our app via HTTP.

The company has a huge big-data stack deployed, such as Hadoop/Flink/Storm/Spark; 
all of these solutions are already in place. The data traffic each day is 
as large as xx PB.

But those stacks have a complicated privilege-control layer, and most of the 
time they run as backend services, for example for offline analysis, feature 
engineering, and some real-time streaming.

We train the models in the backend, using the stacks mentioned above.

But once the models have finished training, they are pushed online as a 
prediction service and served as an HTTP API, because third-party apps only 
want to request the interface via HTTP.

Thanks.





Re: Question about deployment of math computing

2020-08-04 Thread Wesley Peng

Hi

dc...@prosentient.com.au wrote:

If your app isn't human-facing, then I don't see why a little delay would be a 
problem?


Our app is not human-facing. Applications from other departments will 
request results from our app via HTTP.


The company has a huge big-data stack deployed, such as 
Hadoop/Flink/Storm/Spark; all of these solutions are already in place. 
The data traffic each day is as large as xx PB.


But those stacks have a complicated privilege-control layer, and most of 
the time they run as backend services, for example for offline analysis, 
feature engineering, and some real-time streaming.


We train the models in the backend, using the stacks mentioned above.

But once the models have finished training, they are pushed online as a 
prediction service and served as an HTTP API, because third-party apps 
only want to request the interface via HTTP.


Thanks.


RE: Question about deployment of math computing

2020-08-04 Thread dcook
As Mithun suggested, it's going to be slow regardless of which web server 
option you choose. 

It depends on your application, but if it's a user-facing web application, one 
way would be to have your client application make an API call to your backend, 
the backend enqueues the job in the queue, the queue worker/consumer does the 
work and puts the result in a result store, and your application front-end 
either asynchronously polls (using JavaScript) or has a refresh button that 
lets the user check for their predictions/recommendations. 
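
A minimal sketch of that enqueue / work / poll flow, written in Python only to 
keep it short. Everything named here is hypothetical: `predict` stands in for 
the real scoring code, an in-process dict stands in for the result store, and 
threads stand in for what would normally be separate worker processes (threads 
would not give real parallelism for CPU-bound work in CPython).

```python
import queue
import threading
import uuid

jobs = queue.Queue()             # the job queue
results = {}                     # the result store (stand-in for Redis, a DB, ...)
results_lock = threading.Lock()  # guards access to the result store


def predict(features):
    """Placeholder for the slow scoring step (hypothetical)."""
    return sum(features) / len(features)


def enqueue(features):
    """What the API backend does: queue the job and return its id at once."""
    job_id = uuid.uuid4().hex
    jobs.put((job_id, features))
    return job_id


def worker():
    """Queue consumer: take a job, do the work, store the result."""
    while True:
        job_id, features = jobs.get()
        try:
            score = predict(features)
            with results_lock:
                results[job_id] = score
        finally:
            jobs.task_done()


def poll(job_id):
    """What the front-end calls repeatedly; None means 'not ready yet'."""
    with results_lock:
        return results.get(job_id)


def start_workers(count=2):
    """Start `count` background consumers."""
    for _ in range(count):
        threading.Thread(target=worker, daemon=True).start()


if __name__ == "__main__":
    start_workers()
    jid = enqueue([0.2, 0.4, 0.9])
    jobs.join()                  # a real client would poll instead of joining
    print(poll(jid))
```

The point of the split is that `enqueue` returns immediately, so the web 
process is never tied up for the 100-500 ms the computation takes.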

Or if you don't want to use the job queue, 100-500ms isn't too bad to just do 
an AJAX call from the front-end to your mod_perl API backend. Have your web 
page load quickly, then do the slow work asynchronously (one way or another). 
That's how most modern user-facing web apps work, and it works pretty well.  

If your app isn't human-facing, then I don't see why a little delay would be a 
problem?

David Cook

-Original Message-
From: Wesley Peng  
Sent: Wednesday, 5 August 2020 11:46 AM
To: modperl@perl.apache.org
Subject: Re: Question about deployment of math computing

Hi

Mithun Bhattacharya wrote:
> Do you really need a web server that provides a blocking service?

Yes, this is a prediction server, which will be deployed in the PROD environment. 
The client application requests results from the prediction server as scores. 
You can think of it as an online recommendation system.

regards.





Re: Question about deployment of math computing

2020-08-04 Thread Wesley Peng

Hi

Mithun Bhattacharya wrote:

Do you really need a web server that provides a blocking service?


Yes, this is a prediction server, which will be deployed in the PROD 
environment. The client application requests results from the prediction 
server as scores. You can think of it as an online recommendation system.


regards.


Re: Question about deployment of math computing

2020-08-04 Thread Mithun Bhattacharya
Do you really need a web server that provides a blocking service?

Assuming you are doing some sort of map-reduce, you would be better off
creating a job queue and placing requests into it. You would have a
separate consumer of the queue, which could scale up or down depending on
how long the job queue is.
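
One possible way to express "scale with the queue length", sketched in Python.
The sizing rule and all of its numbers (10 waiting jobs per consumer, between
1 and 16 consumers) are assumptions made up for illustration, not a
recommendation from this thread.

```python
import math


def desired_consumers(queue_length, jobs_per_consumer=10,
                      min_consumers=1, max_consumers=16):
    """Return how many queue consumers to run for the current backlog.

    Aim for about `jobs_per_consumer` waiting jobs per consumer, but
    never go below `min_consumers` or above `max_consumers`.
    """
    if jobs_per_consumer <= 0:
        raise ValueError("jobs_per_consumer must be positive")
    if queue_length < 0:
        raise ValueError("queue_length must not be negative")
    wanted = math.ceil(queue_length / jobs_per_consumer)
    return max(min_consumers, min(max_consumers, wanted))


if __name__ == "__main__":
    for backlog in (0, 35, 1000):
        print(backlog, desired_consumers(backlog))
```

A supervisor process would call something like this periodically and start or
stop consumers to match; the upper limit would normally be tied to the number
of CPUs available.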

On Tue, Aug 4, 2020 at 8:23 PM Wesley Peng  wrote:

> Hi
>
> We do math programming (so-called machine learning today) in a web server.
> The response is slow: generally it takes 100-500 ms to finish
> a request.
> For this use case, should we deploy the code within preforked mod_perl, or
> in an event-driven server like Dancer/Starman?
> (We don't use a DB like MySQL or any other slow I/O storage server; all
> arguments are passed to the web server by HTTP POST from the client.)
>
> Thank you.
>