Re: scaling problems

2008-05-20 Thread David Stanek
On Tue, May 20, 2008 at 12:03 AM, James A. Donald <[EMAIL PROTECTED]> wrote:
> On Mon, 19 May 2008 21:04:28 -0400, "David Stanek"
> <[EMAIL PROTECTED]> wrote:
>> What is the difference if you have a process with 10 threads or 10
>> separate processes running in parallel? Apache is a good example of a
>> server that may be configured to use multiple processes to handle
>> requests. And from what I hear it scales just fine.
>>
>> I think you are looking at the problem wrong. The fundamentals are the
>> same between threads and processes.
>
> I am not planning to write a web server framework, but to use one.
> Doubtless a python framework could be written to have satisfactory
> scaling properties, but what are the scaling properties of the ones
> that have been written?
>

Both Django and TurboGears work well for me. When you step back and
think about it, all of the popular web frameworks would do just fine.
The ones that don't do multiprocessing out of the box would be trivial
to load balance behind Apache or a real load balancer. Again the
problem here is the number of connections to the database, once you
get big enough to worry about it.

-- 
David
http://www.traceback.org
--
http://mail.python.org/mailman/listinfo/python-list


Re: scaling problems

2008-05-20 Thread Duncan Booth
Marc 'BlackJack' Rintsch <[EMAIL PROTECTED]> wrote:

> On Tue, 20 May 2008 10:47:50 +1000, James A. Donald wrote:
> 
>> 2.  It is not clear to me how a python web application scales.
> 
> Ask YouTube.  :-)

Or look at Google App Engine where, unlike normal Python, you really are 
prevented from making good use of threading.

Google App Engine takes an interesting approach by forcing the programmer 
to consider scalability right from the start: state is stored in a 
distributed database which cannot do all the hard to scale things that SQL 
databases do. This means that you have to work as though your application 
were spread on servers all round the world from day 1 instead of waiting 
until the structure that was 'good enough' is threatening to kill your 
business before you address the problem.

It also puts strict limits on how much a single web request can do, so 
again you have to work from day 1 to make sure that page requests are as 
efficient as possible.

In return you get an application which should scale well. There is nothing 
Python-specific about the techniques; it is just that Python is the first 
(and so far only) language supported on the platform.

-- 
Duncan Booth http://kupuguy.blogspot.com


Re: scaling problems

2008-05-20 Thread Nick Craig-Wood
Marc 'BlackJack' Rintsch <[EMAIL PROTECTED]> wrote:
>  On Tue, 20 May 2008 13:57:26 +1000, James A. Donald wrote:
> 
> > The larger the program, the greater the likelihood of inadvertent name
> > collisions creating rare and irreproducible interactions between
> > different and supposedly independent parts of the program that each
> > work fine on their own, and supposedly cannot possibly interact.
> 
>  How should such collisions happen?  You don't throw all your names into
>  the same namespace!?

If you ever did a lot of programming in C with large projects you have
exactly that problem a lot - there is only one namespace for all the
external functions and variables, and macro definitions from one
include are forever messing up those from another.  I suspect the OP
is coming from that background.

However, Python doesn't have that problem at all due to its use of
module namespaces - each name is confined to within a module (file)
unless you take specific action otherwise, and each class attribute is
confined to the class etc.
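
A minimal sketch of the module-namespace point above, using two
standard-library modules that both happen to define a function called
dumps. The variable names are illustrative only and not part of the
original post.

```python
import json
import pickle

data = [1, 2, 3]

# Both modules define a function named "dumps", but each name lives in
# its own module namespace, so neither definition disturbs the other.
as_text = json.dumps(data)      # a str containing JSON
as_bytes = pickle.dumps(data)   # a bytes object in pickle format

print(type(as_text).__name__, type(as_bytes).__name__)
```

In C the two functions would need distinct global names (or prefixes);
here the module name does that job.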

From the Zen of Python: "Namespaces are one honking great idea -- let's
do more of those!" - as a battle scarred C programmer I'd agree ;-)

-- 
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick


Re: scaling problems

2008-05-20 Thread Graham Dumpleton
On May 20, 2:00 pm, James A. Donald <[EMAIL PROTECTED]> wrote:
> > > 2.  It is not clear to me how a python web application scales.  Python
> > > is inherently single threaded, so one will need lots of python
> > > processes on lots of computers, with the database software handling
> > > parallel accesses to the same or related data.  One could organize it
> > > as one python program for each url, and one python process for each
> > > http request, but that involves a lot of overhead starting up and
> > > shutting down python processes.  Or one could organize it as one
> > > python program for each url, but if one gets a lot of http requests
> > > for one url, a small number of python processes will each sequentially
> > > handle a large number of those requests.  What I am really asking is:
> > > Are there python web frameworks that scale with hardware and how do
> > > they handle scaling?
>
> Reid Priedhorsky
>
> > This sounds like a good match for Apache with mod_python.
>
> I would hope that it is, but the question that I would like to know is
> how does mod_python handle the problem - how do python programs and
> processes relate to web pages and http requests when one is using mod_python, 
> and what happens when one has quite a lot of web pages and
> a very large number of http requests?

Read:

  http://blog.dscpl.com.au/2007/09/parallel-python-discussion-and-modwsgi.html
  http://code.google.com/p/modwsgi/wiki/ProcessesAndThreading

They discuss the multi-process nature of Apache and why the GIL is not as
big a deal when using it.

The latter document explains the various process/threading modes when
using Apache/mod_wsgi. The embedded modes described in that
documentation also apply to mod_python.
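
For orientation, here is a minimal sketch of the kind of WSGI callable
that a server such as Apache/mod_wsgi hosts. It assumes the entry point
is named application (the conventional name); the response text and the
use of the process id are illustrative only. When Apache runs several
processes, each one loads the module once and then calls application
for every request it is given, so the pid in the reply shows which
process did the work.

```python
import os


def application(environ, start_response):
    """Minimal WSGI callable; the server calls it once per request."""
    path = environ.get("PATH_INFO", "/")
    body = ("pid %d handled %s\n" % (os.getpid(), path)).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    # WSGI expects an iterable of byte strings.
    return [body]
```

To try it without Apache, the standard library's
wsgiref.simple_server.make_server("", 8000, application) can serve it
from a single process.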

The server is generally never the bottleneck, but if you are paranoid
about performance, then also look at the relative comparison of mod_wsgi
and mod_python in:

  http://code.google.com/p/modwsgi/wiki/PerformanceEstimates

Graham



Re: scaling problems

2008-05-20 Thread Marc 'BlackJack' Rintsch
On Tue, 20 May 2008 10:47:50 +1000, James A. Donald wrote:

> 2.  It is not clear to me how a python web application scales.

Ask YouTube.  :-)

Ciao,
Marc 'BlackJack' Rintsch


Re: scaling problems

2008-05-20 Thread Marc 'BlackJack' Rintsch
On Tue, 20 May 2008 13:57:26 +1000, James A. Donald wrote:

> The larger the program, the greater the likelihood of inadvertent name
> collisions creating rare and irreproducible interactions between
> different and supposedly independent parts of the program that each
> work fine on their own, and supposedly cannot possibly interact.

How should such collisions happen?  You don't throw all your names into
the same namespace!?

Ciao,
Marc 'BlackJack' Rintsch


Re: scaling problems

2008-05-19 Thread Arnaud Delobelle
James A. Donald <[EMAIL PROTECTED]> writes:

> Ben Finney 

> The larger the program, the greater the likelihood of inadvertent name
> collisions creating rare and irreproducible interactions between
> different and supposedly independent parts of the program that each
> work fine on their own, and supposedly cannot possibly interact.
>
>> These errors are a small subset of possible errors. If writing a large
>> program, an automated testing suite is essential, and can catch far
>> more errors than the compiler can hope to catch. If you run a static
>> code analyser, you'll be notified of unused names and other simple
>> errors that are often caught by static-declaration compilers.
>
> That is handy, but the larger the program, the bigger the problem with
> names that are over used, rather than unused.

Fortunately, for each file that you group functionality in (called a
'module'), Python creates a brand new namespace where it puts all the
names defined in that file.  That makes name collisions unlikely,
provided that you don't write gigantic modules with plenty of globals
in them (which would be very unnatural in Python), and don't use
'from mymodule import *' too liberally.
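
A small sketch of why liberal star imports bring the collision problem
back. It assumes a hypothetical module that star-imports both math and
cmath; a plain dict stands in for that module's global namespace so the
effect can be inspected directly.

```python
import cmath
import math

# A plain dict stands in for the global namespace of a hypothetical
# module that performs two star imports one after the other.
module_globals = {}

exec("from math import *", module_globals)
sqrt_after_first = module_globals["sqrt"]

exec("from cmath import *", module_globals)
sqrt_after_second = module_globals["sqrt"]

# The second star import silently rebound "sqrt" to the complex version.
print(sqrt_after_first is math.sqrt)
print(sqrt_after_second is cmath.sqrt)

# Qualified access keeps both functions available and unambiguous.
print(math.sqrt(4.0), cmath.sqrt(-4.0))
```

With "import math" and "import cmath" alone, the two sqrt functions
never compete for the same name.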

Why not download a largish project in Python (a web framework for
instance, since you have a particular interest in this), study the
code and see if your concerns seem founded?

Arnaud

> --
>   --
> We have the right to defend ourselves and our property, because 
> of the kind of animals that we are. True law derives from this 
> right, not from the arbitrary power of the omnipotent state.

-- 
La propriete, c'est le vol !
   - Pierre-Joseph Proudhon


Re: scaling problems

2008-05-19 Thread James A. Donald
On Mon, 19 May 2008 21:04:28 -0400, "David Stanek"
<[EMAIL PROTECTED]> wrote:
> What is the difference if you have a process with 10 threads or 10
> separate processes running in parallel? Apache is a good example of a
> server that may be configured to use multiple processes to handle
> requests. And from what I hear it scales just fine.
> 
> I think you are looking at the problem wrong. The fundamentals are the
> same between threads and processes.

I am not planning to write a web server framework, but to use one.
Doubtless a python framework could be written to have satisfactory
scaling properties, but what are the scaling properties of the ones
that have been written?

--
  --
We have the right to defend ourselves and our property, because 
of the kind of animals that we are. True law derives from this 
right, not from the arbitrary power of the omnipotent state.

http://www.jim.com/  James A. Donald


Re: scaling problems

2008-05-19 Thread James A. Donald
> > 2.  It is not clear to me how a python web application scales.  Python
> > is inherently single threaded, so one will need lots of python
> > processes on lots of computers, with the database software handling
> > parallel accesses to the same or related data.  One could organize it
> > as one python program for each url, and one python process for each
> > http request, but that involves a lot of overhead starting up and
> > shutting down python processes.  Or one could organize it as one
> > python program for each url, but if one gets a lot of http requests
> > for one url, a small number of python processes will each sequentially
> > handle a large number of those requests.  What I am really asking is:
> > Are there python web frameworks that scale with hardware and how do
> > they handle scaling?

Reid Priedhorsky 
> This sounds like a good match for Apache with mod_python.

I would hope that it is, but the question that I would like to know is
how does mod_python handle the problem - how do python programs and
processes relate to web pages and http requests when one is using
mod_python, and what happens when one has quite a lot of web pages and
a very large number of http requests?
--
  --
We have the right to defend ourselves and our property, because 
of the kind of animals that we are. True law derives from this 
right, not from the arbitrary power of the omnipotent state.

http://www.jim.com/  James A. Donald


Re: scaling problems

2008-05-19 Thread James A. Donald
> > 1.  Looks to me that python will not scale to very large programs,
> > partly because of the lack of static typing, but mostly because there
> > is no distinction between creating a new variable and utilizing an
> > existing variable,

Ben Finney 
> This seems quite a non sequitur. How do you see a connection between
> these properties and "will not scale to large programs"?

The larger the program, the greater the likelihood of inadvertent name
collisions creating rare and irreproducible interactions between
different and supposedly independent parts of the program that each
work fine on their own, and supposedly cannot possibly interact.

> These errors are a small subset of possible errors. If writing a large
> program, an automated testing suite is essential, and can catch far
> more errors than the compiler can hope to catch. If you run a static
> code analyser, you'll be notified of unused names and other simple
> errors that are often caught by static-declaration compilers.

That is handy, but the larger the program, the bigger the problem with
names that are over used, rather than unused.

--
  --
We have the right to defend ourselves and our property, because 
of the kind of animals that we are. True law derives from this 
right, not from the arbitrary power of the omnipotent state.

http://www.jim.com/  James A. Donald


Re: scaling problems

2008-05-19 Thread Carl Banks
On May 19, 8:47 pm, James A. Donald <[EMAIL PROTECTED]> wrote:
> 1.  Looks to me that python will not scale to very large programs,
> partly because of the lack of static typing, but mostly because there
> is no distinction between creating a new variable and utilizing an
> existing variable, so the interpreter fails to catch typos and name
> collisions.

This factor is scale-neutral.  You can expect the number of such bugs
to be proportional to the lines of code.

It might not scale up well if you engage in poor programming practices
(for example, importing lots of unqualified globals with tiny,
undescriptive names directly into every module's namespace), but if
you do that you have worse problems than accidental name collisions.


> I am inclined to suspect that when a successful small
> python program turns into a large python program, it rapidly reaches
> ninety percent complete, and remains ninety percent complete forever.

Unlike most C++/Java/VB/Whatever programs which finish and ship, and
are never patched or improved or worked on ever again?


> 2.  It is not clear to me how a python web application scales.  Python
> is inherently single threaded,

No it isn't.

It has some limitations in threading, but many programs make good use
of threads nonetheless.  In fact for something like a web app Python's
threading limitations are relatively unimportant, since they tend to
be I/O-bound under heavy load.
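
A rough sketch of the I/O-bound case, assuming that time.sleep is an
acceptable stand-in for waiting on a database or a network socket (it
releases the GIL while waiting, as blocking I/O calls do). The function
names and the delays are made up for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_request(delay):
    """Stand-in for a database or network call; sleeping releases the GIL."""
    time.sleep(delay)
    return delay


def timed_run(delays, max_workers):
    """Run the fake requests on a thread pool and return (results, seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(fake_io_request, delays))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    delays = [0.2] * 4
    _, one_thread = timed_run(delays, max_workers=1)
    _, four_threads = timed_run(delays, max_workers=4)
    print("1 thread: %.2fs, 4 threads: %.2fs" % (one_thread, four_threads))
```

The waits overlap when several threads are available, which is why the
GIL matters much less here than it would for CPU-bound work.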

[snip rest]


Carl Banks


Re: scaling problems

2008-05-19 Thread Ben Finney
James A. Donald <[EMAIL PROTECTED]> writes:

> I am just getting into python, and know little about it

Welcome to Python, and this forum.

> and am posting to ask on what beaches the salt water crocodiles hang
> out.

Heh. You want to avoid them, or hang out with them? :-)

> 1.  Looks to me that python will not scale to very large programs,
> partly because of the lack of static typing, but mostly because there
> is no distinction between creating a new variable and utilizing an
> existing variable,

This seems quite a non sequitur. How do you see a connection between
these properties and "will not scale to large programs"?

> so the interpreter fails to catch typos and name collisions.

These errors are a small subset of possible errors. If writing a large
program, an automated testing suite is essential, and can catch far
more errors than the compiler can hope to catch. If you run a static
code analyser, you'll be notified of unused names and other simple
errors that are often caught by static-declaration compilers.

> I am inclined to suspect that when a successful small python program
> turns into a large python program, it rapidly reaches ninety percent
> complete, and remains ninety percent complete forever.

You may want to look at the Python success stories before suspecting
that: <http://www.python.org/about/success/>.

> 2.  It is not clear to me how a python web application scales.

I'll leave this one for others to speak to; I don't have experience
with large web applications.

-- 
 \ "I was gratified to be able to answer promptly and I did. I |
  `\said I didn't know."  -- Mark Twain, _Life on the Mississippi_ |
_o__)  |
Ben Finney


Re: scaling problems

2008-05-19 Thread David Stanek
On Mon, May 19, 2008 at 8:47 PM, James A. Donald <[EMAIL PROTECTED]> wrote:
> I am just getting into python, and know little about it, and am
> posting to ask on what beaches the salt water crocodiles hang out.
>
> 1.  Looks to me that python will not scale to very large programs,
> partly because of the lack of static typing, but mostly because there
> is no distinction between creating a new variable and utilizing an
> existing variable, so the interpreter fails to catch typos and name
> collisions.  I am inclined to suspect that when a successful small
> python program turns into a large python program, it rapidly reaches
> ninety percent complete, and remains ninety percent complete forever.

I can assure you that in practice this is not a problem. If you do
proper unit testing then you will catch many, if not all, of the
errors that static typing catches. Tools like PyLint, PyFlakes and
pep8.py will also catch many of those mistakes.
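
To make the typo concern concrete, here is a hedged sketch: a function
with a misspelled variable name, a corrected version, and a tiny check
standing in for a unit test. All names are hypothetical. A checker such
as PyFlakes would be expected to flag "totl" as assigned but never
used, and the test catches the wrong result either way.

```python
def total_price_buggy(prices):
    """Sum a list of prices, but with a misspelled name in the loop."""
    total = 0
    for price in prices:
        totl = total + price  # typo: binds a new name, total never changes
    return total


def total_price(prices):
    """Corrected version: the running total is actually updated."""
    total = 0
    for price in prices:
        total = total + price
    return total


def passes_unit_test(func):
    """A tiny stand-in for a unit test: does func add up a known input?"""
    return func([1, 2, 3]) == 6


if __name__ == "__main__":
    print("buggy version passes:", passes_unit_test(total_price_buggy))
    print("fixed version passes:", passes_unit_test(total_price))
```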


> 2.  It is not clear to me how a python web application scales.  Python
> is inherently single threaded, so one will need lots of python
> processes on lots of computers, with the database software handling
> parallel accesses to the same or related data.  One could organize it
> as one python program for each url, and one python process for each
> http request, but that involves a lot of overhead starting up and
> shutting down python processes.  Or one could organize it as one
> python program for each url, but if one gets a lot of http requests
> for one url, a small number of python processes will each sequentially
> handle a large number of those requests.  What I am really asking is:
> Are there python web frameworks that scale with hardware and how do
> they handle scaling?

What is the difference if you have a process with 10 threads or 10
separate processes running in parallel? Apache is a good example of a
server that may be configured to use multiple processes to handle
requests. And from what I hear it scales just fine.

I think you are looking at the problem wrong. The fundamentals are the
same between threads and processes. You simply have a pool of workers
that handle requests. Any process is capable of handling any request.
The key to scalability is that the processes are persistent and not
forked for each request.
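
The worker-pool idea can be sketched with the standard library. This is
only an illustration under stated assumptions: handle_request and serve
are hypothetical names, a real server would read requests from sockets,
and the "fork" start method assumes a Unix-like system. The point it
shows is that a fixed set of long-lived processes handles all requests,
rather than one new process per request.

```python
import multiprocessing
import os
from concurrent.futures import ProcessPoolExecutor


def handle_request(path):
    """Hypothetical request handler; returns the worker's pid and a response."""
    return os.getpid(), "response for %s" % path


def serve(paths, workers=2):
    """Dispatch every path to a fixed pool of long-lived worker processes."""
    # Assumption: a Unix-like system where the "fork" start method exists.
    context = multiprocessing.get_context("fork")
    with ProcessPoolExecutor(max_workers=workers, mp_context=context) as pool:
        return list(pool.map(handle_request, paths))


if __name__ == "__main__":
    results = serve(["/a", "/b", "/c", "/d", "/e", "/f"])
    pids = {pid for pid, _ in results}
    print("%d requests handled by %d worker processes" % (len(results), len(pids)))
```

Any worker can take any request, and the number of distinct pids never
exceeds the pool size no matter how many requests arrive.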


> Please don't read this as "Python sucks, everyone should program in
> machine language expressed as binary numbers".  I am just asking where
> the problems are.

The only real problem I have had with process pools is that sharing
resources is harder. It is harder to create things like connection
pools.


-- 
David
http://www.traceback.org


Re: scaling problems

2008-05-19 Thread Reid Priedhorsky
On Tue, 20 May 2008 10:47:50 +1000, James A. Donald wrote:
>
> 1.  Looks to me that python will not scale to very large programs,
> partly because of the lack of static typing, but mostly because there
> is no distinction between creating a new variable and utilizing an
> existing variable, so the interpreter fails to catch typos and name
> collisions.  I am inclined to suspect that when a successful small
> python program turns into a large python program, it rapidly reaches
> ninety percent complete, and remains ninety percent complete forever.

I find this frustrating too, but not to the extent that I choose a
different language. pylint helps but it's not as good as a nice, strict
compiler.

> 2.  It is not clear to me how a python web application scales.  Python
> is inherently single threaded, so one will need lots of python
> processes on lots of computers, with the database software handling
> parallel accesses to the same or related data.  One could organize it
> as one python program for each url, and one python process for each
> http request, but that involves a lot of overhead starting up and
> shutting down python processes.  Or one could organize it as one
> python program for each url, but if one gets a lot of http requests
> for one url, a small number of python processes will each sequentially
> handle a large number of those requests.  What I am really asking is:
> Are there python web frameworks that scale with hardware and how do
> they handle scaling?

This sounds like a good match for Apache with mod_python.

Reid


scaling problems

2008-05-19 Thread James A. Donald
I am just getting into python, and know little about it, and am
posting to ask on what beaches the salt water crocodiles hang out.

1.  Looks to me that python will not scale to very large programs,
partly because of the lack of static typing, but mostly because there
is no distinction between creating a new variable and utilizing an
existing variable, so the interpreter fails to catch typos and name
collisions.  I am inclined to suspect that when a successful small
python program turns into a large python program, it rapidly reaches
ninety percent complete, and remains ninety percent complete forever.

2.  It is not clear to me how a python web application scales.  Python
is inherently single threaded, so one will need lots of python
processes on lots of computers, with the database software handling
parallel accesses to the same or related data.  One could organize it
as one python program for each url, and one python process for each
http request, but that involves a lot of overhead starting up and
shutting down python processes.  Or one could organize it as one
python program for each url, but if one gets a lot of http requests
for one url, a small number of python processes will each sequentially
handle a large number of those requests.  What I am really asking is:
Are there python web frameworks that scale with hardware and how do
they handle scaling?

Please don't read this as "Python sucks, everyone should program in
machine language expressed as binary numbers".  I am just asking where
the problems are.
--
  --
We have the right to defend ourselves and our property, because 
of the kind of animals that we are. True law derives from this 
right, not from the arbitrary power of the omnipotent state.

http://www.jim.com/  James A. Donald