[web2py] Re: Web2py and threads

2010-08-23 Thread mdipierro
In Java a servlet, as far as I understand, is a class which conforms
to some API that allows it to serve one http request. Each instance is
executed in its own thread. The Python equivalent of the servlet API
is a WSGI application and web2py is based on WSGI, therefore the
parallelization mechanism is equivalent to Java servlets.

In web2py (the same in Django, Pylons, and any WSGI app) each http
request is executed in its own thread. Threads are recycled to serve
non-concurrent requests and reuse database connections (pooling)
without need to close and reopen them. The web server can be
configured for a min number and a max number of threads.
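
A minimal sketch of what "a WSGI application executed in its own thread" means, assuming Python 3. The names application and call_in_thread are made up for illustration and are not part of web2py; a real threaded server would do the job of call_in_thread.

```python
import threading


def application(environ, start_response):
    """Minimal WSGI callable: reports which thread handled the request."""
    body = ("handled by %s\n" % threading.current_thread().name).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def call_in_thread(app, name, results):
    """Hypothetical stand-in for a threaded server: one request, one thread."""
    def run():
        captured = {}

        def start_response(status, headers):
            captured["status"] = status

        # A real server builds a much richer environ; two keys suffice here.
        chunks = app({"REQUEST_METHOD": "GET", "PATH_INFO": "/"}, start_response)
        results[name] = (captured["status"], b"".join(chunks))

    worker = threading.Thread(target=run, name=name)
    worker.start()
    worker.join()


results = {}
for thread_name in ("worker-1", "worker-2"):
    call_in_thread(application, thread_name, results)
```

The same callable is shared by every thread, so anything it keeps at module level is shared too; only what lives inside the call is private to the request.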

I think the GIL in this context is a false problem. In fact in
production you can use Apache and run as many processes as the number
of cores that you have. Each process will create as many threads as it
needs to serve multiple requests. The GIL is a problem only if one
process runs multiple threads on multiple cores. It is possible there
are some caveats with many cores but I have not really played with
Apache configurations and benchmarks.

I do not think using Jython helps anything. According to these tests:
  http://blog.dhananjaynene.com/2008/07/performance-comparison-c-java-python-ruby-jython-jruby-groovy/
  http://pyevolve.sourceforge.net/wordpress/?p=1189
Jython is 2x-3x slower than CPython. So you may get better scaling
with multiple cores but you pay a huge performance hit.

Web2py runs on Jython but there is a known bug in Java regular
expressions that Sun marked as "won't fix" that can cause runaway
problems when parsing complex templates. This is not a web2py-specific
problem but we have seen effects of the bug in some web2py apps.

Massimo





On Aug 23, 11:29 pm, pierreth  wrote:
> Hello,
>
> I would like to know how Web2py is managing threads. Is it like Java
> servlets where requests are mapped to servlets while one servlet
> object can be used by multiple threads at the same time to serve many
> requests?
>
> Are some users here using Jython with Web2py to get around the ugly
> Python GIL? I would like to know about your experience.
>
> --
> Pierre


[web2py] Re: Web2py and threads

2010-08-23 Thread mdipierro
P.S. In the end the bottleneck is ALWAYS database access.



[web2py] Re: Web2py and threads

2010-08-24 Thread mdipierro
Somebody here did. They found it works but there was a proliferation
of open files. We never got to the bottom of this.

On Aug 24, 4:55 am, Michele Comitini wrote:
> 2010/8/24 mdipierro:
> > P.S. In the end the bottleneck is ALWAYS database access.
>
> true! many driver implementations do not release the GIL properly on a
> blocking call.
> Anyway a well designed db would avoid the problem entirely.
>
> Do you know if anyone tried web2py on pypy [http://pypy.org] ?


[web2py] Re: Web2py and threads

2010-08-24 Thread John Heenan
There is absolutely no need to serve up static web pages of a pure
Python web app or a WSGI app with a separate thread.  It is
inefficient to use an inbuilt web server (of a Python web app) or
Apache (if WSGI is used) to serve up static web pages using separate
threads. Both Lighttpd and Nginx are well known web servers that
thrash Apache in objective tests for static pages when a web server is
under load. These web servers use event handlers to serve static web
pages, not necessarily separate threads.

Of course the question remains: how much can the performance of WSGI
type apps be improved by an analogous event handling model within the
app, and how much of a change in development style would be required to
take full advantage of such an approach? As far as I am aware these
questions have never even been posed.

Further background:

There is no need to use web2py to serve up its CSS files, JavaScript
and images.  A web server written in a compiled, statically typed
language (such as C) can be used instead.

The question then becomes which web server. The answer is obvious: web
servers that use event handlers to serve static web pages, not
necessarily threads. Unfortunately you will find religious bigots,
even on this forum, who will ridicule anyone who points out the
obvious. Expect abuse from this reply.

John Heenan

On Aug 24, 3:21 pm, mdipierro  wrote:
> P.S. In the end the bottle neck is ALWAYS database access.


[web2py] Re: Web2py and threads

2010-08-24 Thread pierreth
On 24 août, 01:20, mdipierro  wrote:
> In Java a servlet, as far as I understand, is a class which conforms
> to some API that allows it to serve one http request. Each instance is
> executed in its own thread.

Yes, but one instance can be executed by multiple threads at the same
time. It is one thread per request. EJB, Enterprise Java Beans, are
running on their own threads.

> The Python equivalent of the servlet API
> is a WSGI application and web2py is based on WSGI, therefore the
> parallelization mechanism is equivalent to Java servlets.

Is web2py running as a WSGI application when we do "python web2py.py"
or is it only when used in a specific deployment with WSGI?

>
> In web2py (the same in Django, Pylons, and any WSGI app) each http
> request is executed in its own thread. Threads are recycled to serve
> non-concurrent requests and reuse database connections (pooling)
> without need to close and reopen them. The web server can be
> configured for a min number and a max number of threads.

So, as a web2py developer, what do I have to do to avoid
synchronization problems in my application? Where is the danger of
having multiple threads for the web2py developers? What are the
instances shared by multiple threads? What are the instances living in
their own threads?

>
> I think the GIL in this context is a false problem. In fact in
> production you can use Apache and run as many processes as the number
> of cores that you have. Each process will create as many threads as it
> needs to serve multiple requests. The GIL is a problem only if one
> process runs multiple threads on multiple cores. It is possible there
> are some caveats with many cores but I have not really played with
> Apache configurations and benchmarks.

Yes but a web2py server is running with only one process and using
more web2py processes for serving the same web2py app will lead to
synchronization problems. With processors having more and more cores,
having a web server that cannot use them is not very fun. It is an
issue to be solved with Python 3.2 I think.

> Massimo
>

Thank you for this precious information.


[web2py] Re: Web2py and threads

2010-08-24 Thread mdipierro
On Aug 24, 10:36 am, pierreth  wrote:
> On 24 août, 01:20, mdipierro  wrote:
>
> > In Java a servlet, as far as I understand, is a class which conforms
> > to some API that allows it to serve one http request. Each instance is
> > executed in its own thread.
>
> Yes, but one instance can be executed by multiple threads at the same
> time. It is one thread per request. EJB, Enterprise Java Beans, are
> running on their own threads.

It is the same in web2py.

> > The Python equivalent of the servlet API
> > is a WSGI application and web2py is based on WSGI, therefore the
> > parallelization mechanism is equivalent to Java servlets.
>
> Is web2py running as a WSGI application when we do "python web2py.py"
> or is it only when used in a specific deployment with WSGI?

When you do "python web2py.py" you do not start the WSGI app. You
start the Rocket web server, which creates a number of threads. When a
new http request arrives it is assigned to a free thread (or a new
thread is created) and the WSGI app is run in that thread to serve that
request and only that request.
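
The recycling of threads described here can be sketched with Python 3's concurrent.futures as a stand-in for the server's pool. This is not Rocket's actual code; the function handle and the pool size of 2 are assumptions chosen for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def handle(request_id):
    """Pretend to serve one request; report which pool thread did the work."""
    return request_id, threading.current_thread().name


# At most two worker threads exist, however many requests arrive.
with ThreadPoolExecutor(max_workers=2, thread_name_prefix="pool") as pool:
    served = list(pool.map(handle, range(6)))

# Six requests were served, but only by the one or two recycled threads.
thread_names = {name for _, name in served}
```

Because a thread outlives the request it served, anything attached to the thread (such as an open database connection) can be reused by the next request it picks up.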

> > In web2py (the same in Django, Pylons, and any WSGI app) each http
> > request is executed in its own thread. Threads are recycled to serve
> > non-concurrent requests and reuse database connections (pooling)
> > without need to close and reopen them. The web server can be
> > configured for a min number and a max number of threads.
>
> So, as a web2py developer, what do I have to do to avoid
> synchronization problems in my application? Where is the danger of
> having multiple threads for the web2py developers? What are the
> instances shared by multiple threads? What are the instances living in
> their own threads?

You do not have to do anything because every concurrency issue is taken
care of automatically. There are some DO NOTs:
- do not ever call os.chdir
- do not import third party modules that are not thread safe
- do not use thread.start_new_thread and threading.Thread.start()
- if you open a file that is not uniquely associated with this http
  request/client/session, lock the file.
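
As an illustration of the last point, here is a minimal sketch of locking a shared file, assuming Python 3 on a Unix-like system where the standard fcntl module is available. The helper append_line and the file name shared.log are hypothetical; the threads are started by hand only to simulate concurrent requests.

```python
import fcntl
import os
import tempfile
import threading


def append_line(path, text):
    """Append one line while holding an exclusive advisory lock on the file."""
    with open(path, "a") as handle:
        fcntl.flock(handle, fcntl.LOCK_EX)   # blocks until no one else holds it
        try:
            handle.write(text + "\n")
            handle.flush()                   # push the data out before unlocking
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)


path = os.path.join(tempfile.mkdtemp(), "shared.log")

# Simulate eight concurrent requests writing to the same file.
workers = [threading.Thread(target=append_line, args=(path, "request %d" % i))
           for i in range(8)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

with open(path) as handle:
    lines = sorted(handle.read().splitlines())
```

An advisory lock of this kind also works across processes, which matters once the app runs under several server processes rather than several threads of one process.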

> > I think the GIL in this context is a false problem. In fact in
> > production you can use Apache and run as many processes as the number
> > of cores that you have. Each process will create as many threads as it
> > needs to serve multiple requests. The GIL is a problem only if one
> > process runs multiple threads on multiple cores. It is possible there
> > are some caveats with many cores but I have not really played with
> > Apache configurations and benchmarks.
>
> Yes but a web2py server is running with only one process and using
> more web2py processes for serving the same web2py app will lead to
> synchronization problems. With processors having more and more cores,
> having a web server that cannot use them is not very fun.

In production you should not use the Rocket web server. Use Apache and
prefork more than one process.

> It is an
> issue to be solved with Python 3.2 I think.

We are not moving to 3.2. At least not until Google App Engine moves
to 3.x and all database drivers are supported. Then we'll open this
discussion.


> > Massimo
>
> Thank you for this precious information.


[web2py] Re: Web2py and threads

2010-08-24 Thread John Heenan
Can't we at least have an acknowledgement that it is not necessary for
web2py to use a thread per request model and that web2py could instead
use an event model?

WSGI can be viewed as an evil conspiracy to force Python web apps to
follow the Apache thread per request model! Also with Apache mod_wsgi,
Apache controls the Python process that web2py runs under! How evil
and ugly!

There is no inherent reason why web2py needs to run a separate thread
for each NON static http request, if WSGI is not used!

If web2py uses WSGI then a thread per request is forced upon web2py.
This suits Apache but not web servers with better event-driven models
such as Lighttpd and Nginx.

For example Lighttpd does not even support WSGI. Instead web2py uses
FastCGI to communicate with Lighttpd via a UNIX socket and web2py then
needlessly converts each request into a thread for handling by a WSGI
handler!

Why should web2py be forced into using a thread model? Anyone who
writes PC applications avoids threads as if they were a plague. Even
academics openly call using threads evil. Here is an article by Edward
A. Lee, professor at the University of California, Berkeley, with the
title "The Problem with Threads" (PDF).

John Heenan



[web2py] Re: Web2py and threads

2010-08-24 Thread John Heenan
Lee's 'The Problem with Threads' link is at
http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-1.pdf

There is more on Lee at http://ptolemy.eecs.berkeley.edu/%7Eeal/

John Heenan


[web2py] Re: Web2py and threads

2010-08-24 Thread cjrh
On Aug 24, 9:00 pm, John Heenan  wrote:
> Can't we at least have an acknowledgement that it is not necessary for
> web2py to use a thread per request model and that web2py could instead
> use an event model?

Acknowledged.


[web2py] Re: Web2py and threads

2010-08-24 Thread mdipierro
I agree with you. web2py does not care. It is the web server that
decides. The question that started this thread was about the built-in
web server and it does follow the thread model.

On Aug 24, 2:00 pm, John Heenan  wrote:
> Can't we at least have an acknowledgement that it is not necessary for
> web2py to use a thread per request model and that web2py could instead
> use an event model?

[web2py] Re: Web2py and threads

2010-08-24 Thread pierreth
On 24 août, 13:04, mdipierro  wrote:
> When you do "python web2py.py" you do not start the WSGI app. You
> start the Rocket web server, which creates a number of threads. When a
> new http request arrives it is assigned to a free thread (or a new
> thread is created) and the WSGI app is run in that thread to serve that
> request and only that request.
>
>

I guess that each time a thread is put in service the model is
executed again. So, for the model, a version of the instance exists for
each thread. Right?

> > So, as a web2py developer, what do I have to do to avoid
> > synchronization problems in my application? Where is the danger of
> > having multiple threads for the web2py developers? What are the
> > instances shared by multiple threads? What are the instances living in
> > their own threads?
>
> You do not have to do anything because every concurrency issue is taken
> care of automatically. There are some DO NOTs:
> - do not ever call os.chdir
> - do not import third party modules that are not thread safe
> - do not use thread.start_new_thread and threading.Thread.start()
> - if you open a file that is not uniquely associated with this http
>   request/client/session, lock the file.

OK


[web2py] Re: Web2py and threads

2010-08-24 Thread pierreth
I don't understand. The link is broken at the moment.

Do you mean using only one thread and dispatching using the observer
pattern? Doing only one request at a time? It does not make sense to
me. But I guess there is something I don't understand... Can someone
guide me?

On 24 août, 20:04, mdipierro  wrote:
> I agree with you. web2py does not care. It is the web server that
> decides. The question that started this thread was about the built-in
> web server and it does follow the thread model.
>
> On Aug 24, 2:00 pm, John Heenan  wrote:
>
> > Can't we at least have an acknowledgement that it is not necessary for
> > web2py to use a thread per request model and that web2py could instead
> > use an event model?
>


[web2py] Re: Web2py and threads

2010-08-25 Thread mdipierro
The model is executed at every http request, new thread or not new
thread.
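
A rough sketch of what "executed at every http request" implies, assuming Python 3. This is an illustration only and not web2py's actual implementation; MODEL_SOURCE and run_request are made-up names.

```python
# Hypothetical model file contents: module-style code that sets up a value.
MODEL_SOURCE = "counter = 0\ncounter += 1\n"


def run_request(model_source):
    """Execute the model source in a brand-new namespace for this request."""
    environment = {}  # fresh for every request, never shared between requests
    exec(compile(model_source, "<model>", "exec"), environment)
    return environment["counter"]


first = run_request(MODEL_SOURCE)
second = run_request(MODEL_SOURCE)
```

Both calls see counter == 1: nothing defined by the model survives from one request to the next, whichever thread happens to serve it.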

On Aug 24, 9:36 pm, pierreth  wrote:
> On 24 août, 13:04, mdipierro  wrote:
>
> > when you do "python web2py.py" you do not start the wsgi app. You
> > start the rocket web server which makes a number of threads. When a
> > new http request arrives it is assigned to a free thread (or a new
> > thread is created) and the wsgi is run in that thread to server that
> > request and only that request.
>
> I guess that each time a thread is put in service the model is
> executed again. So, for the model, a version of the instance exist for
> each thread. Right?
>
> > > So, as a web2py developer, what do I have to do to avoid
> > > synchronization problems in my application. Where is the danger of
> > > having multiple threads for the web2py developers? What are the
> > > instances shared my multiple threads? What are the instances living in
> > > their own threads?
>
> > You do have to do anything because every concurrency issue is taken
> > care automatically. There are some DO NOTs:
> > - do not ever call os.chdir
> > - do not import third party modules that are not thread safe
> > - do not use thread.start_new_thread and threading.Thread.start()
> > - if you open a file that is not uniquely associate to this http/
> > request/client/session lock the file.
>
> OK


[web2py] Re: Web2py and threads

2010-08-25 Thread John Heenan
No, nothing that abstract. Using WSGI forces a new thread for each
request. This is a simple and inefficient brute force approach that
really only suits the simplest Python applications and where only a
small number of concurrent connections might be expected.

Any application that provides web services is going to OS block on
file reading (and writing) and on database access. Using threads is a
classic and easy way out that carries a lot of baggage. Windows has
had a way out of this for years with its asynch (or event)
notification set up through an OVERLAPPED structure.

Lighttpd makes use of efficient event notification schemes like
kqueue and epoll. Apache only uses such schemes for listening and
Keep-Alives.

No matter how careful one is with threads and processes there always
appear to be unexpected gotchas. Python has a notorious example, the
now fixed 'Beazley Effect' that affected the GIL. Also I don't think
there is a single experienced Python user that trusts the GIL.

John Heenan

On Aug 25, 12:40 pm, pierreth  wrote:
> I don't understand. The link is broken at the moment.
>
> Do you mean using only on thread and dispatching using the observer
> pattern? Doing only one request at a time? It does not makes sense to
> me. But I guess there is something I don't understand... Can someone
> guide me?
>
> On 24 août, 20:04, mdipierro  wrote:
>
> > I agree with you. web2py does not care. It is the web server that
> > decides. The question that started this thread was about the built-in
> > web server and it does follow the thread model.
>
> > On Aug 24, 2:00 pm, John Heenan  wrote:
>
> > > Can't we at least have an acknowledgement that it is not necessary for
> > > web2py to use a thread per request model and that web2py could instead
> > > use an event model?


[web2py] Re: Web2py and threads

2010-08-25 Thread pierreth
I would appreciate a good reference to understand the concepts you are
talking about. It is something new to me and I don't understand.

On 25 août, 11:22, John Heenan  wrote:
> No, nothing that abstract. Using WSGI forces a new thread for each
> request. This is is a simple and inefficient brute force approach that
> really only suits the simplest Python applications and where only a
> small number of concurrent connection might be expected.
>
> Any application that provides web services is going to OS block on
> file reading (and writing) and on database access. Using threads is a
> classic and easy way out that carries a lot of baggage. Windows has
> had a way out of this for years with its asynch (or event)
> notification set up through an OVERLAPPED structure.
>
> Lightttpd makes use of efficient event notification schemes like
> kqueue and epoll. Apache only uses such schemes for listening and Keep-
> Alives.
>
> No matter how careful one is with threads and processes there always
> appears to be unexpected gotchas. Python has a notorious example, the
> now fixed 'Beazly Effect' that affected the GIL. Also I don't think
> there is a single experienced Python user that trusts the GIL.
>
> John Heenan
>


[web2py] Re: Web2py and threads

2010-08-25 Thread mdipierro


On Aug 25, 11:00 am, Phyo Arkar  wrote:
> Did I Read that reading files inside controller will block web2py , Does it?

No, web2py does not block. web2py only locks sessions, which means one
user cannot request two concurrent pages because there would be a race
condition in saving sessions. Two users can request different pages
which open the same file unless the file is explicitly locked by your
code.

> Thats a bad news.. i am doing a file crawler and while crawling ,
> web2py is blocked even tho the process talke only 25% of 1 out of 4
> CPUs ..

Tell us more or I cannot help.


>
> On 8/25/10, pierreth  wrote:
>
> > I would appreciate a good reference to understand the concepts you are
> > talking about. It is something new to me and I don't understand.
>
> > On 25 août, 11:22, John Heenan  wrote:
> >> No, nothing that abstract. Using WSGI forces a new thread for each
> >> request. This is is a simple and inefficient brute force approach that
> >> really only suits the simplest Python applications and where only a
> >> small number of concurrent connection might be expected.
>
> >> Any application that provides web services is going to OS block on
> >> file reading (and writing) and on database access. Using threads is a
> >> classic and easy way out that carries a lot of baggage. Windows has
> >> had a way out of this for years with its asynch (or event)
> >> notification set up through an OVERLAPPED structure.
>
> >> Lightttpd makes use of efficient event notification schemes like
> >> kqueue and epoll. Apache only uses such schemes for listening and Keep-
> >> Alives.
>
> >> No matter how careful one is with threads and processes there always
> >> appears to be unexpected gotchas. Python has a notorious example, the
> >> now fixed 'Beazly Effect' that affected the GIL. Also I don't think
> >> there is a single experienced Python user that trusts the GIL.
>
> >> John Heenan


[web2py] Re: Web2py and threads

2010-08-25 Thread John Heenan
Even with file reading there is no way the disk drive, its controllers
and various buses can keep up with the CPU.

Hence reading any file from disk will cause the OS to intervene and
block (reading a template view file, controller file or otherwise),
albeit for a 'short' time.

Here are two choices.

1) Put the read function in a thread, wait for the thread to unblock
and continue servicing any other threads that are no longer blocked.
This is what web2py does using a Python file read. The OS
automatically provides thread scheduling.

2) Tell the OS to pass a message to an event callback when the OS is
ready. A separate thread is not required if the application chooses to
process its own message queue as the thread in effect simply relays
the message on.
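
A minimal sketch of choice 2, using Python's standard selectors module (which
picks epoll, kqueue or select depending on the platform). Socket pairs stand in
for the blocking I/O because readiness notification of this kind is generally
not useful for regular disk files; the function name serve_with_events is made
up for the illustration and none of this is web2py or Rocket code.

```python
import selectors
import socket


def serve_with_events(messages):
    """Collect one message per connection in a single thread, driven by OS events."""
    selector = selectors.DefaultSelector()
    writers = []
    for index, payload in enumerate(messages):
        reader, writer = socket.socketpair()
        reader.setblocking(False)
        # Ask the OS to tell us when this socket has data to read.
        selector.register(reader, selectors.EVENT_READ, data=index)
        writer.sendall(payload)
        writers.append(writer)

    received = {}
    while len(received) < len(messages):
        # One thread waits on all connections at once.
        for key, _events in selector.select(timeout=1.0):
            received[key.data] = key.fileobj.recv(1024)
            selector.unregister(key.fileobj)
            key.fileobj.close()

    for writer in writers:
        writer.close()
    selector.close()
    return [received[i] for i in range(len(messages))]


replies = serve_with_events([b"first", b"second", b"third"])
```

Choice 1 would instead start one thread per connection and let each call a
blocking recv, leaving the scheduling to the OS.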

There are of course other blocking operations: file writing, database
access and network read and write (including from/to the PC making the
http request).

It is tempting to say it is not rocket science.

Anyway, the main message is that being aware of the plumbing and
avoiding blind, religious-type fixations is important for long-term
planning and scalability.

We really need to face the reality that seeing Python as a black-box
total solution is not healthy.

John Heenan



On Aug 26, 2:27 am, mdipierro  wrote:
> On Aug 25, 11:00 am, Phyo Arkar  wrote:
>
> > Did I Read that reading files inside controller will block web2py , Does it?
>
> No web2py does not block. web2py only locks sessions that means one
> user cannot request two concurrent pages because there would be a race
> condition in saving sessions. Two user can request different pages
> which open the same file unless the file is explicitly locked by your
> code.
>
> > Thats a bad news.. i am doing a file crawler and while crawling ,
> > web2py is blocked even tho the process talke only 25% of 1 out of 4
> > CPUs ..
>
> Tell us more or I cannot help.
>
>
>
> > On 8/25/10, pierreth  wrote:
>
> > > I would appreciate a good reference to understand the concepts you are
> > > talking about. It is something new to me and I don't understand.
>
> > > On 25 août, 11:22, John Heenan  wrote:
> > >> No, nothing that abstract. Using WSGI forces a new thread for each
> > >> request. This is is a simple and inefficient brute force approach that
> > >> really only suits the simplest Python applications and where only a
> > >> small number of concurrent connection might be expected.
>
> > >> Any application that provides web services is going to OS block on
> > >> file reading (and writing) and on database access. Using threads is a
> > >> classic and easy way out that carries a lot of baggage. Windows has
> > >> had a way out of this for years with its asynch (or event)
> > >> notification set up through an OVERLAPPED structure.
>
> > >> Lightttpd makes use of efficient event notification schemes like
> > >> kqueue and epoll. Apache only uses such schemes for listening and Keep-
> > >> Alives.
>
> > >> No matter how careful one is with threads and processes there always
> > >> appears to be unexpected gotchas. Python has a notorious example, the
> > >> now fixed 'Beazly Effect' that affected the GIL. Also I don't think
> > >> there is a single experienced Python user that trusts the GIL.
>
> > >> John Heenan


[web2py] Re: Web2py and threads

2010-08-25 Thread mdipierro
Call

session._unlock(response)

if you do not need session locking.

On Aug 25, 11:38 am, Phyo Arkar  wrote:
> Yes may be session was locked , thats why
> session.current=processing_path not working
>
> But then again , while processing files i try opening separate page ,
> to other controller , it was waited till the first (file Crawler) page
> finished parsing.
>
> ok i will make a separate thread about this.
>
> On 8/25/10, mdipierro  wrote:
>
>
>
> > On Aug 25, 11:00 am, Phyo Arkar  wrote:
> >> Did I Read that reading files inside controller will block web2py , Does
> >> it?
>
> > No web2py does not block. web2py only locks sessions that means one
> > user cannot request two concurrent pages because there would be a race
> > condition in saving sessions. Two user can request different pages
> > which open the same file unless the file is explicitly locked by your
> > code.
>
> >> Thats a bad news.. i am doing a file crawler and while crawling ,
> >> web2py is blocked even tho the process talke only 25% of 1 out of 4
> >> CPUs ..
>
> > Tell us more or I cannot help.
>
> >> On 8/25/10, pierreth  wrote:
>
> >> > I would appreciate a good reference to understand the concepts you are
> >> > talking about. It is something new to me and I don't understand.
>
> >> > On 25 août, 11:22, John Heenan  wrote:
> >> >> No, nothing that abstract. Using WSGI forces a new thread for each
> >> >> request. This is is a simple and inefficient brute force approach that
> >> >> really only suits the simplest Python applications and where only a
> >> >> small number of concurrent connection might be expected.
>
> >> >> Any application that provides web services is going to OS block on
> >> >> file reading (and writing) and on database access. Using threads is a
> >> >> classic and easy way out that carries a lot of baggage. Windows has
> >> >> had a way out of this for years with its asynch (or event)
> >> >> notification set up through an OVERLAPPED structure.
>
> >> >> Lightttpd makes use of efficient event notification schemes like
> >> >> kqueue and epoll. Apache only uses such schemes for listening and Keep-
> >> >> Alives.
>
> >> >> No matter how careful one is with threads and processes there always
> >> >> appears to be unexpected gotchas. Python has a notorious example, the
> >> >> now fixed 'Beazly Effect' that affected the GIL. Also I don't think
> >> >> there is a single experienced Python user that trusts the GIL.
>
> >> >> John Heenan


[web2py] Re: Web2py and threads

2010-08-25 Thread mdipierro
The time to execute a typical web2py action on my server is 10-20 ms. The
time to open a file or write a small file is so small that it is not
measurable. I am not sure I believe there is any issue here, or perhaps
I do not understand the problem. Can you provide a test case?

On Aug 25, 2:11 pm, John Heenan  wrote:
> Even with file reading there is no way the disk drive, its controllers
> and various buses can keep up with the CPU.
>
> Hence reading any file from disk will cause the OS to intervene and
> block (reading a template view file, controller file or otherwise),
> albeit a 'short' time.
>
> Here are two choices.
>
> 1) Put the read function in a thread, wait for the thread to unblock
> and continue servicing any other threads that are no longer blocked.
> This is what web2py does using a Python file read. The OS
> automatically provides thread scheduling.
>
> 2) Tell the OS to pass a message to an event callback when the OS is
> ready. A separate thread is not required if the application chooses to
> process its own message queue as the thread in effect simply relays
> the message on.
>
> There is of course other blocks: file writing, database access and
> network read and write (including from/to the http request PC)
>
> It is tempting to say it is not rocket science.
>
> Anyway the main message is being aware of the plumbing and avoiding
> blind religious type fixations is important for long term planning and
> scalability issues.
>
> We really need to face up to realities that seeing Python as a black
> box type total solution is not healthy.
>
> John Heenan
>
> On Aug 26, 2:27 am, mdipierro  wrote:
>
> > On Aug 25, 11:00 am, Phyo Arkar  wrote:
>
> > > Did I Read that reading files inside controller will block web2py , Does 
> > > it?
>
> > No web2py does not block. web2py only locks sessions that means one
> > user cannot request two concurrent pages because there would be a race
> > condition in saving sessions. Two user can request different pages
> > which open the same file unless the file is explicitly locked by your
> > code.
>
> > > Thats a bad news.. i am doing a file crawler and while crawling ,
> > > web2py is blocked even tho the process talke only 25% of 1 out of 4
> > > CPUs ..
>
> > Tell us more or I cannot help.
>
> > > On 8/25/10, pierreth  wrote:
>
> > > > I would appreciate a good reference to understand the concepts you are
> > > > talking about. It is something new to me and I don't understand.
>
> > > > On 25 août, 11:22, John Heenan  wrote:
> > > >> No, nothing that abstract. Using WSGI forces a new thread for each
> > > >> request. This is is a simple and inefficient brute force approach that
> > > >> really only suits the simplest Python applications and where only a
> > > >> small number of concurrent connection might be expected.
>
> > > >> Any application that provides web services is going to OS block on
> > > >> file reading (and writing) and on database access. Using threads is a
> > > >> classic and easy way out that carries a lot of baggage. Windows has
> > > >> had a way out of this for years with its asynch (or event)
> > > >> notification set up through an OVERLAPPED structure.
>
> > > >> Lightttpd makes use of efficient event notification schemes like
> > > >> kqueue and epoll. Apache only uses such schemes for listening and Keep-
> > > >> Alives.
>
> > > >> No matter how careful one is with threads and processes there always
> > > >> appears to be unexpected gotchas. Python has a notorious example, the
> > > >> now fixed 'Beazly Effect' that affected the GIL. Also I don't think
> > > >> there is a single experienced Python user that trusts the GIL.
>
> > > >> John Heenan


[web2py] Re: Web2py and threads

2010-08-25 Thread John Heenan
Linux lost a PR war over the exact same issue that I am addressing
now.

The point is to be aware of the scalability issues that arise when a
service uses far more OS resources than required, and at greater
risk.

The tiny amount of time a single thread, thread A, may be blocked
during a read is not the point. The OS is still going to block and do
expensive context switches that allow other threads to run before
examining if thread A should be allowed to unblock. If the server is
heavily loaded then it could be quite some time before thread A
unblocks, due to the large amount of total time other unblocked threads
consume before they block again.

However, in the context of how the Python GIL works, it is not even
as simple as above because the GIL imposes another layer.

In objective tests, having the OS notify the application that an OS
mediated action is ready beats using multiple threads that just become
unblocked. For example Lighttpd beats Apache when serving static files
with lots of concurrent connections. Lighttpd can serve up to 10,000
concurrent connections. Lighttpd uses event notification to service http
requests, which lets Lighttpd know when the OS is finished with
some action that would block a thread if actioned synchronously.
Apache wastefully uses a separate thread for each separate request and
allows the OS to unblock threads that expect synchronous action.

Years ago there was a big PR war between Windows and Linux. There was
a hell of a lot riding on the outcome of web server scaling tests and
the war got an awful lot of attention in the computer media. Linux had
God like status. The Linux zealots were absolutely convinced tests
would show Linux would beat what they regarded as inferior Windows
bloat and poor design. They had the journalists convinced and everyone
else convinced the tests would be just a formality that would prove
Linux was superior and would formally justify the utter contempt
Windows was held in. Linux advocates got their best people. Linux
lost. After the tests Linux lost its God like status in the media,
really just became another side show and never recovered.

So what went wrong for Linux? Simple I reckon. Microsoft had an API
that allowed the OS to notify a process asynchronously that data was
ready. Linux at the time did not and relied on its antique UNIX select
function to allow threads to block. At small scale there is little
effective difference. However Apache was unable to scale as well as
the Microsoft IIS web server.

John Heenan


On Aug 26, 6:46 am, mdipierro  wrote:
> the time to execute a typical web2py action my server is 10-20ms. The
> time to open a file or write a small file is so small that is not
> measurable. I am not sure I believe there is any issue here or perhaps
> I do not understand the problem. Can you provide a test case?
>
> On Aug 25, 2:11 pm, John Heenan  wrote:
>
> > Even with file reading there is no way the disk drive, its controllers
> > and various buses can keep up with the CPU.
>
> > Hence reading any file from disk will cause the OS to intervene and
> > block (reading a template view file, controller file or otherwise),
> > albeit a 'short' time.
>
> > Here are two choices.
>
> > 1) Put the read function in a thread, wait for the thread to unblock
> > and continue servicing any other threads that are no longer blocked.
> > This is what web2py does using a Python file read. The OS
> > automatically provides thread scheduling.
>
> > 2) Tell the OS to pass a message to an event callback when the OS is
> > ready. A separate thread is not required if the application chooses to
> > process its own message queue as the thread in effect simply relays
> > the message on.
>
> > There is of course other blocks: file writing, database access and
> > network read and write (including from/to the http request PC)
>
> > It is tempting to say it is not rocket science.
>
> > Anyway the main message is being aware of the plumbing and avoiding
> > blind religious type fixations is important for long term planning and
> > scalability issues.
>
> > We really need to face up to realities that seeing Python as a black
> > box type total solution is not healthy.
>
> > John Heenan
>
> > On Aug 26, 2:27 am, mdipierro  wrote:
>
> > > On Aug 25, 11:00 am, Phyo Arkar  wrote:
>
> > > > Did I Read that reading files inside controller will block web2py , 
> > > > Does it?
>
> > > No web2py does not block. web2py only locks sessions that means one
> > > user cannot request two concurrent pages because there would be a race
> > > condition in saving sessions. Two user can request different pages
> > > which open the same file unless the file is explicitly locked by your
> > > code.
>
> > > > Thats a bad news.. i am doing a file crawler and while crawling ,
> > > > web2py is blocked even tho the process talke only 25% of 1 out of 4
> > > > CPUs ..
>
> > > Tell us more or I cannot help.
>
> > > > On 8/25/10, pierreth  wrote:
>
> > > > > I would appreciate a good 

[web2py] Re: Web2py and threads

2010-08-25 Thread mdipierro
The problem arises only if you have two http requests from the same
client in the same session.

A arrives, loads session and unlocks
B arrives, loads session and unlocks
A changes session and saves it
B changes session and saves it

Nothing breaks but B never sees changes made by A and they are
overwritten by B.
With locks:

A arrives, loads session
B arrives and waits
A changes session and saves it
B loads session (with changes made by A)
B changes session and saves it
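
The two interleavings above can be reproduced with a small simulation. This
is only a sketch: a dict stands in for the session file, a threading.Lock
stands in for the file lock, and the events exist purely to force the order of
steps listed above. The function name simulate is hypothetical.

```python
import copy
import threading


def simulate(use_lock):
    """Run requests A and B against one shared session; return the final session."""
    store = {"session": {}}        # stands in for the session file on disk
    lock = threading.Lock()        # stands in for the session file lock
    a_loaded = threading.Event()   # A has loaded its copy of the session
    b_loaded = threading.Event()   # B has loaded its copy of the session
    a_saved = threading.Event()    # A has written its copy back

    def request_a():
        if use_lock:
            lock.acquire()
        session = copy.deepcopy(store["session"])
        a_loaded.set()
        if not use_lock:
            b_loaded.wait()        # force B to load before A saves
        session["a"] = 1
        store["session"] = session
        a_saved.set()
        if use_lock:
            lock.release()

    def request_b():
        a_loaded.wait()            # B always arrives after A has loaded
        if use_lock:
            lock.acquire()         # waits until A has saved and released
        session = copy.deepcopy(store["session"])
        b_loaded.set()
        a_saved.wait()             # B saves last in both scenarios
        session["b"] = 2
        store["session"] = session
        if use_lock:
            lock.release()

    threads = [threading.Thread(target=request_a),
               threading.Thread(target=request_b)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return store["session"]


without_locks = simulate(use_lock=False)   # A's change is overwritten by B
with_locks = simulate(use_lock=True)       # B sees A's change before saving
```

Without the lock the final session holds only B's change; with the lock it
holds both.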


On Aug 25, 3:52 pm, Jonathan Lundell  wrote:
> On Aug 25, 2010, at 1:41 PM, mdipierro wrote:
>
>
>
> > call
>
> > session._unlock()
>
> > if you do not need session locking
>
> If you do that (without calling session.forget), what will happen in 
> _try_store_on_disk when cPickle.dump(dict(self), response.session_file) is 
> called with a None file argument? Or is cPickle.dump cool with that? Or am I 
> misreading the logic?
>
>
>
> > On Aug 25, 11:38 am, Phyo Arkar  wrote:
> >> Yes may be session was locked , thats why
> >> session.current=processing_path not working
>
> >> But then again , while processing files i try opening separate page ,
> >> to other controller , it was waited till the first (file Crawler) page
> >> finished parsing.
>
> >> ok i will make a separate thread about this.
>
> >> On 8/25/10, mdipierro  wrote:
>
> >>> On Aug 25, 11:00 am, Phyo Arkar  wrote:
>  Did I Read that reading files inside controller will block web2py , Does
>  it?
>
> >>> No web2py does not block. web2py only locks sessions that means one
> >>> user cannot request two concurrent pages because there would be a race
> >>> condition in saving sessions. Two user can request different pages
> >>> which open the same file unless the file is explicitly locked by your
> >>> code.
>
>  Thats a bad news.. i am doing a file crawler and while crawling ,
>  web2py is blocked even tho the process talke only 25% of 1 out of 4
>  CPUs ..
>
> >>> Tell us more or I cannot help.
>
>  On 8/25/10, pierreth  wrote:
>
> > I would appreciate a good reference to understand the concepts you are
> > talking about. It is something new to me and I don't understand.
>
> > On 25 août, 11:22, John Heenan  wrote:
> >> No, nothing that abstract. Using WSGI forces a new thread for each
> >> request. This is is a simple and inefficient brute force approach that
> >> really only suits the simplest Python applications and where only a
> >> small number of concurrent connection might be expected.
>
> >> Any application that provides web services is going to OS block on
> >> file reading (and writing) and on database access. Using threads is a
> >> classic and easy way out that carries a lot of baggage. Windows has
> >> had a way out of this for years with its asynch (or event)
> >> notification set up through an OVERLAPPED structure.
>
> >> Lightttpd makes use of efficient event notification schemes like
> >> kqueue and epoll. Apache only uses such schemes for listening and Keep-
> >> Alives.
>
> >> No matter how careful one is with threads and processes there always
> >> appears to be unexpected gotchas. Python has a notorious example, the
> >> now fixed 'Beazly Effect' that affected the GIL. Also I don't think
> >> there is a single experienced Python user that trusts the GIL.
>
> >> John Heenan


[web2py] Re: Web2py and threads

2010-08-25 Thread mdipierro
This is a bug. I fixed it in trunk. Thanks Jonathan.

On Aug 25, 9:30 pm, Jonathan Lundell  wrote:
> On Aug 25, 2010, at 6:37 PM, mdipierro wrote:
>
>
>
> > The problem is only if have two http request from the same client in
> > the same session
>
> Thanks for that; I was wondering under which conditions unlocking might be 
> permissible (and I'm still not entirely clear, but never mind for now).
>
> My concern is this. Here's unlock:
>
>     def _unlock(self, response):
>         if response and response.session_file:
>             try:
>                 portalocker.unlock(response.session_file)
>                 response.session_file.close()
>                 del response.session_file  <-
>             except: ### this should never happen but happens in Windows
>                 pass
>
> Now we save the session file:
>
>     def _try_store_on_disk(self, request, response):
>         if response._dbtable_and_field \
>                 or not response.session_id \
>                 or self._forget:
>             self._unlock(response)
>             return
>         if response.session_new:
>             # Tests if the session folder exists, if not, create it
>             session_folder = os.path.dirname(response.session_filename)
>             response.session_file = open(response.session_filename, 'wb')
>             portalocker.lock(response.session_file, portalocker.LOCK_EX)
>         cPickle.dump(dict(self), response.session_file)  <-
>         self._unlock(response)
>
> But response.session_file is None at this point.
>
>
>
> > A arrives loads session and unlocks
> > B arrives loads session and unlocks
> > A change session and saves it
> > B changes session and saves it
>
> > Nothing breaks but B never sees changes made by A and they are
> > overwritten by B.
> > With locks
>
> > A arrives loads session
> > B arrives and waits
> > A change session and saves it
> > B loads session (with changes made by A)
> > B changes session and saves it
>
> > On Aug 25, 3:52 pm, Jonathan Lundell  wrote:
> >> On Aug 25, 2010, at 1:41 PM, mdipierro wrote:
>
> >>> call
>
> >>> session._unlock()
>
> >>> if you do not need session locking
>
> >> If you do that (without calling session.forget), what will happen in 
> >> _try_store_on_disk when cPickle.dump(dict(self), response.session_file) is 
> >> called with a None file argument? Or is cPickle.dump cool with that? Or am 
> >> I misreading the logic?
>
> >>> On Aug 25, 11:38 am, Phyo Arkar  wrote:
>  Yes may be session was locked , thats why
>  session.current=processing_path not working
>
>  But then again , while processing files i try opening separate page ,
>  to other controller , it was waited till the first (file Crawler) page
>  finished parsing.
>
>  ok i will make a separate thread about this.
>
>  On 8/25/10, mdipierro  wrote:
>
> > On Aug 25, 11:00 am, Phyo Arkar  wrote:
> >> Did I Read that reading files inside controller will block web2py , 
> >> Does
> >> it?
>
> > No web2py does not block. web2py only locks sessions that means one
> > user cannot request two concurrent pages because there would be a race
> > condition in saving sessions. Two user can request different pages
> > which open the same file unless the file is explicitly locked by your
> > code.
>
> >> Thats a bad news.. i am doing a file crawler and while crawling ,
> >> web2py is blocked even tho the process talke only 25% of 1 out of 4
> >> CPUs ..
>
> > Tell us more or I cannot help.
>
> >> On 8/25/10, pierreth  wrote:
>
> >>> I would appreciate a good reference to understand the concepts you are
> >>> talking about. It is something new to me and I don't understand.
>
> >>> On 25 août, 11:22, John Heenan  wrote:
>  No, nothing that abstract. Using WSGI forces a new thread for each
>  request. This is is a simple and inefficient brute force approach 
>  that
>  really only suits the simplest Python applications and where only a
>  small number of concurrent connection might be expected.
>
>  Any application that provides web services is going to OS block on
>  file reading (and writing) and on database access. Using threads is a
>  classic and easy way out that carries a lot of baggage. Windows has
>  had a way out of this for years with its asynch (or event)
>  notification set up through an OVERLAPPED structure.
>
>  Lightttpd makes use of efficient event notification schemes like
>  kqueue and epoll. Apache only uses such schemes for listening and 
>  Keep-
>  Alives.
>
>  No matter how careful one is with threads and processes there always
>  appears to be unexpected gotchas. Python has a notorious example, the
>  now fixed 'Beazly Effect' that affected the GIL. Also I don't think
>  there is a si

[web2py] Re: Web2py and threads

2010-08-27 Thread mdipierro
You are right. Please check trunk again.

Massimo

On Aug 27, 10:25 am, Jonathan Lundell  wrote:
> On Aug 25, 2010, at 8:12 PM, Jonathan Lundell wrote:
>
>
>
> > On Aug 25, 2010, at 7:56 PM, mdipierro wrote:
>
> >> This is a bug. I fixed it in trunk. Thanks Jonathan.
>
> > It's fixed in the sense that it won't raise an exception. But now how is 
> > calling _unlock different from calling forget?
>
> Nag.
>
>
>
> >> On Aug 25, 9:30 pm, Jonathan Lundell  wrote:
> >>> On Aug 25, 2010, at 6:37 PM, mdipierro wrote:
>
>  The problem is only if have two http request from the same client in
>  the same session
>
> >>> Thanks for that; I was wondering under which conditions unlocking might 
> >>> be permissible (and I'm still not entirely clear, but never mind for now).
>
> >>> My concern is this. Here's unlock:
>
> >>>    def _unlock(self, response):
> >>>        if response and response.session_file:
> >>>            try:
> >>>                portalocker.unlock(response.session_file)
> >>>                response.session_file.close()
> >>>                del response.session_file  <-
> >>>            except: ### this should never happen but happens in Windows
> >>>                pass
>
> >>> Now we save the session file:
>
> >>>    def _try_store_on_disk(self, request, response):
> >>>        if response._dbtable_and_field \
> >>>                or not response.session_id \
> >>>                or self._forget:
> >>>            self._unlock(response)
> >>>            return
> >>>        if response.session_new:
> >>>            # Tests if the session folder exists, if not, create it
> >>>            session_folder = os.path.dirname(response.session_filename)
> >>>            response.session_file = open(response.session_filename, 'wb')
> >>>            portalocker.lock(response.session_file, portalocker.LOCK_EX)
> >>>        cPickle.dump(dict(self), response.session_file)  <-
> >>>        self._unlock(response)
>
> >>> But response.session_file is None at this point.
>
>  A arrives loads session and unlocks
>  B arrives loads session and unlocks
>  A change session and saves it
>  B changes session and saves it
>
>  Nothing breaks but B never sees changes made by A and they are
>  overwritten by B.
>  With locks
>
>  A arrives loads session
>  B arrives and waits
>  A change session and saves it
>  B loads session (with changes made by A)
>  B changes session and saves it
>
>  On Aug 25, 3:52 pm, Jonathan Lundell  wrote:
> > On Aug 25, 2010, at 1:41 PM, mdipierro wrote:
>
> >> call
>
> >> session._unlock()
>
> >> if you do not need session locking
>
> > If you do that (without calling session.forget), what will happen in 
> > _try_store_on_disk when cPickle.dump(dict(self), response.session_file) 
> > is called with a None file argument? Or is cPickle.dump cool with that? 
> > Or am I misreading the logic?
>
> >> On Aug 25, 11:38 am, Phyo Arkar  wrote:
> >>> Yes may be session was locked , thats why
> >>> session.current=processing_path not working
>
> >>> But then again , while processing files i try opening separate page ,
> >>> to other controller , it was waited till the first (file Crawler) page
> >>> finished parsing.
>
> >>> ok i will make a separate thread about this.
>
> >>> On 8/25/10, mdipierro  wrote:
>
>  On Aug 25, 11:00 am, Phyo Arkar  wrote:
> > Did I Read that reading files inside controller will block web2py , 
> > Does
> > it?
>
>  No web2py does not block. web2py only locks sessions that means one
>  user cannot request two concurrent pages because there would be a 
>  race
>  condition in saving sessions. Two user can request different pages
>  which open the same file unless the file is explicitly locked by your
>  code.
>
> > Thats a bad news.. i am doing a file crawler and while crawling ,
> > web2py is blocked even tho the process talke only 25% of 1 out of 4
> > CPUs ..
>
>  Tell us more or I cannot help.
>
> > On 8/25/10, pierreth  wrote:
>
> >> I would appreciate a good reference to understand the concepts you 
> >> are
> >> talking about. It is something new to me and I don't understand.
>
> >> On 25 août, 11:22, John Heenan  wrote:
> >>> No, nothing that abstract. Using WSGI forces a new thread for each
> >>> request. This is is a simple and inefficient brute force approach 
> >>> that
> >>> really only suits the simplest Python applications and where only 
> >>> a
> >>> small number of concurrent connection might be expected.
>
> >>> Any application that provides web services is going to OS block on
> >>> file reading (and writing) and on database access. Using threads 
> >>> is a
> 

Re: [web2py] Re: Web2py and threads

2010-08-24 Thread Michele Comitini
2010/8/24 mdipierro :
> P.S. In the end the bottle neck is ALWAYS database access.
True! Many driver implementations do not release the GIL properly on a
blocking call.
Anyway, a well-designed db would avoid the problem entirely.

Do you know if anyone has tried web2py on PyPy [http://pypy.org]?


Re: [web2py] Re: Web2py and threads

2010-08-24 Thread Michele Comitini
CPython threading is not useful for (real) parallel processing:
1) threads (with the GIL) are good for *I/O-bound work*: a thread
blocked in a system call does not stop the rest of the process (the
intent is similar to select/poll)
2) to really use multiple cores/CPUs, use something more
appropriate, or make Mr. Van Rossum change his mind! :-)

http://mail.python.org/pipermail/python-3000/2007-May/007414.html
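
To make point 1 concrete, here is a minimal sketch (Python 3 standard
library only; the function names are made up for illustration and are
not part of web2py). It assumes time.sleep is a fair stand-in for a
blocking system call, since both release the GIL while waiting:

```python
import threading
import time


def blocking_call(seconds):
    # time.sleep releases the GIL while waiting, much like a blocking
    # read or a wait on a database socket would.
    time.sleep(seconds)


def run_in_threads(n_threads, seconds):
    """Run n_threads blocking calls concurrently; return wall-clock time."""
    threads = [
        threading.Thread(target=blocking_call, args=(seconds,))
        for _ in range(n_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


elapsed = run_in_threads(4, 0.2)
print("4 threads x 0.2 s of blocking took %.2f s" % elapsed)
```

The four waits overlap, so the total should be close to one wait rather
than four. Replacing the sleep with a pure-Python computation would not
show the same gain, because only one thread at a time can hold the GIL.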

Objects instantiated inside a Python thread are isolated unless you
deliberately share them, so they live their own life; no need to worry about that.
Anyway, do you like singletons? Python threading has many ways to
support that... but read this:
http://code.activestate.com/recipes/66531-singleton-we-dont-need-no-stinkin-singleton-the-bo/

wow! :-)
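
A small sketch of the difference between per-thread and shared objects,
assuming plain Python 3 threads rather than web2py's own request
environment (all names here are illustrative):

```python
import threading

local_state = threading.local()  # each thread sees its own attributes
shared_hits = []                 # module-level object, visible to all threads
shared_lock = threading.Lock()   # guards the shared list


def handle_request(name, barrier, seen):
    local_state.user = name
    # Wait until every thread has written its own value before reading back.
    barrier.wait(timeout=5)
    seen[name] = local_state.user  # still this thread's own value
    with shared_lock:              # shared state needs explicit locking
        shared_hits.append(name)


names = ["alice", "bob", "carol"]
barrier = threading.Barrier(len(names))
seen = {}
threads = [
    threading.Thread(target=handle_request, args=(n, barrier, seen))
    for n in names
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(seen)
print(sorted(shared_hits))
```

Anything stored on local_state stays private to its thread; anything
stored at module level is shared and is where synchronization problems
can come from.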


2010/8/24 pierreth :
> On 24 août, 01:20, mdipierro  wrote:
>> In Java a serverlet, as far as I understand, is a class which conforms
>> to some API that allows it to serve one http request. Each instance is
>> executed in its own thread.
>
> Yes, but one instance can be executed by multiple threads at the same
> time. It is one thread per request. EJB, Enterprise Java Beans, are
> running on their own threads.
>
>>The Python equivalent of the serverlet API
>> is a WSGI application and web2py is based on WSGI, therefore the
>> parallelization mechanism is equivalent to Java serverlets.
>
> Is web2py running as a WSGI application when we do "python web2py.py"
> or is it only when used in a specific deployment with WSGI?
>
>>
>> In web2py (the same in Django, Pylons, any any WSGI app) each http
>> request is executed in its own thread. Threads are recycled to server
>> non-concurrent requests and reuse database connections (pooling)
>> without need to close and reopen them. The web server can be
>> configured for a min number and a max number of threads.
>
> So, as a web2py developer, what do I have to do to avoid
> synchronization problems in my application? Where is the danger of
> having multiple threads for web2py developers? Which instances are
> shared by multiple threads? Which instances live in
> their own threads?
>
>>
>> I think the GIL in this context is a false problem. In fact in
>> production you can use Apache and run as many processes as the number
>> of cores that you have. Each process will create as many threads as it
>> needs to server multiple requests. The GIL is a problems only if one
>> process runs multiple threads on multiple cores. It is possible there
>> are some caveats with many cores but I have not really played with
>> apache configurations and benchmarks.
>
> Yes but a web2py server is running with only one process and using
> more web2py processes for serving the same web2py app will lead to
> synchronization problems. With processors having more and more cores,
> having a web server that cannot use them is not very fun. It is an
> issue to be solved with Python 3.2 I think.
>
>> Massimo
>>
>
> Thank you for this precious information.


Re: [web2py] Re: Web2py and threads

2010-08-24 Thread Michele Comitini
2010/8/24 pierreth :

> Yes but a web2py server is running with only one process and using
> more web2py processes for serving the same web2py app will lead to
> synchronization problems. With processors having more and more cores,
> having a web server that cannot use them is not very fun. It is an
> issue to be solved with Python 3.2 I think.
>

the GIL is still in place, maybe faster but still serializing...
http://docs.python.org/dev/whatsnew/3.2.html#multi-threading


Re: [web2py] Re: Web2py and threads

2010-08-24 Thread Michele Comitini
John,
Tnx ... I'll keep this under my pillow ;-)

2010/8/24 John Heenan :
> Lee's 'The Problem with Threads' link is at
> http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-1.pdf
>
> There is more on Lee at http://ptolemy.eecs.berkeley.edu/%7Eeal/
>
> John Heenan
>
> On Aug 25, 5:00 am, John Heenan  wrote:
>> Can't we at least have an acknowledgement that it is not necessary for
>> web2py to use a thread per request model and that web2py could instead
>> use an event model?
>>
>> WSGI can be viewed as an evil conspiracy to force Python web apps to
>> follow the Apache thread per request model! Also with Apache mod_wsgi,
>> Apache controls the Python process that web2py runs under! How evil
>> and ugly!
>>
>> There is no inherent reason why web2py needs to run a separate thread
>> for each NON static http request, if WSGI is not used!
>>
>> If web2py uses WSGI then a thread per request is forced upon web2py.
>> This suits Apache but not web servers with better event driven models
>> such as Lighttpd and Nginx.
>>
>> For example Lighttpd does not even support WSGI. Instead web2py uses
>> FastCGI to communicate with Lighttpd via a UNIX socket and web2py then
>> needlessly converts each request into a thread for handling by a WSGI
>> handler!
>>
>> Why should web2py be forced into using a thread model? Anyone who
>> writes PC applications avoids threads like the plague. Even
>> academics openly call using threads evil. Here is an article by Edward
>> A. Lee, professor at UC Berkeley, with the title "The Problem
>> with Threads" (PDF).
>>
>> John Heenan
>>
>> On Aug 25, 1:00 am, John Heenan  wrote:
>>
>> > There is absolutely no need to serve up static web pages of a pure
>> > Python web app or a WSGI app with a separate thread.  It is
>> > inefficient to use an inbuilt web server (of a Python web app) or
>> > Apache (if WSGI is used) to serve up static web pages using separate
>> > threads. Both Lighttpd and Nginx are well known web servers that
>> > thrash Apache in objective tests for static pages when a web server is
>> > under load. These web servers use event handlers to serve static web
>> > pages, not necessarily separate threads.
>>
>> > Of course the question remains, how much can the performance of WSGI
>> > type apps be improved by an analogous event handling model within the
>> > app and how much of a change in development style would be required to
>> > take full advantage of such an approach. As far as I am aware these
>> > questions have never even been posed.
>>
>> > Further background:
>>
>> > There is no need to use web2py to serve up its css pages, javascript
>> > and images.  A compiled static language (such as C) web server can be
>> > used instead.
>>
>> > The question then becomes which web server. The answer is obvious: web
>> > servers that use event handlers to serve static web pages, not
>> > necessarily threads. Unfortunately you will find religious bigots,
>> > even on this forum, who will ridicule anyone who points out the
>> > obvious. Expect abuse from this reply.
>>
>> > John Heenan
>>
>> > On Aug 24, 3:21 pm, mdipierro  wrote:
>>
>> > > P.S. In the end the bottle neck is ALWAYS database access.
>>
>> > > On Aug 24, 12:20 am, mdipierro  wrote:
>>
>> > > > In Java a serverlet, as far as I understand, is a class which conforms
>> > > > to some API that allows it to serve one http request. Each instance is
>> > > > executed in its own thread. The Python equivalent of the serverlet API
>> > > > is a WSGI application and web2py is based on WSGI, therefore the
>> > > > parallelization mechanism is equivalent to Java serverlets.
>>
>> > > > In web2py (the same in Django, Pylons, any any WSGI app) each http
>> > > > request is executed in its own thread. Threads are recycled to server
>> > > > non-concurrent requests and reuse database connections (pooling)
>> > > > without need to close and reopen them. The web server can be
>> > > > configured for a min number and a max number of threads.
>>
>> > > > I think the GIL in this context is a false problem. In fact in
>> > > > production you can use Apache and run as many processes as the number
>> > > > of cores that you have. Each process will create as many threads as it
>> > > > needs to server multiple requests. The GIL is a problems only if one
>> > > > process runs multiple threads on multiple cores. It is possible there
>> > > > are some caveats with many cores but I have not really played with
>> > > > apache configurations and benchmarks.
>>
>> > > > I do not think using Jython helps anything. According to these tests:
>> > > >  http://blog.dhananjaynene.com/2008/07/performance-comparison-c-java-p...
>> > > >  http://pyevolve.sourceforge.net/wordpress/?p=1189
>> > > > Jython is 2x-3x slower than cpython. So you may get better scaling
>> > > > with multiple cores but you pay huge perfomance hit.
>>
>> > > > Web2py runs on Jython but there is a known bug in Java regular
>> > > > expressions that Sun marked as "won'tfix" that c

Re: [web2py] Re: Web2py and threads

2010-08-25 Thread Phyo Arkar
Did I read that reading files inside a controller will block web2py? Does it?

That's bad news. I am writing a file crawler, and while it is crawling,
web2py is blocked even though the process takes only 25% of 1 out of 4
CPUs.



On 8/25/10, pierreth  wrote:
> I would appreciate a good reference to understand the concepts you are
> talking about. It is something new to me and I don't understand.
>
> On 25 août, 11:22, John Heenan  wrote:
>> No, nothing that abstract. Using WSGI forces a new thread for each
>> request. This is a simple and inefficient brute force approach that
>> really only suits the simplest Python applications and where only a
>> small number of concurrent connections might be expected.
>>
>> Any application that provides web services is going to OS block on
>> file reading (and writing) and on database access. Using threads is a
>> classic and easy way out that carries a lot of baggage. Windows has
>> had a way out of this for years with its asynch (or event)
>> notification set up through an OVERLAPPED structure.
>>
>> Lighttpd makes use of efficient event notification schemes like
>> kqueue and epoll. Apache only uses such schemes for listening and
>> Keep-Alives.
>>
>> No matter how careful one is with threads and processes there always
>> appear to be unexpected gotchas. Python has a notorious example, the
>> now fixed 'Beazley Effect' that affected the GIL. Also I don't think
>> there is a single experienced Python user that trusts the GIL.
>>
>> John Heenan
>>
>
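
For readers new to the event model John describes, here is a rough
sketch using Python 3's asyncio (which did not exist when this thread
was written; it is used here only as an illustration, not as anything
web2py does). It assumes asyncio.sleep is an acceptable stand-in for a
non-blocking wait on a socket, file or database:

```python
import asyncio
import threading


async def handle_request(name, delay, thread_ids):
    # Record which OS thread runs this handler.
    thread_ids.add(threading.get_ident())
    # Stand-in for a non-blocking wait; control returns to the event loop.
    await asyncio.sleep(delay)
    return name + " done"


async def serve_all():
    thread_ids = set()
    results = await asyncio.gather(
        handle_request("a", 0.05, thread_ids),
        handle_request("b", 0.01, thread_ids),
        handle_request("c", 0.03, thread_ids),
    )
    return results, thread_ids


results, thread_ids = asyncio.run(serve_all())
print(results)
print("threads used:", len(thread_ids))
```

All three "requests" are in flight at once, yet only one thread is
used: the event loop switches between handlers whenever one of them
waits.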


Re: [web2py] Re: Web2py and threads

2010-08-25 Thread Jonathan Lundell
On Aug 25, 2010, at 9:00 AM, Phyo Arkar wrote:

> Did I Read that reading files inside controller will block web2py , Does it?
> 
> Thats a bad news.. i am doing a file crawler and while crawling ,
> web2py is blocked even tho the process talke only 25% of 1 out of 4
> CPUs ..

This stuff gets a little coverage in the book's deployment chapter, but it 
could use a systematic discussion.

What are the implications for web2py apps of http server policies, database 
locks (sqlite especially), session locking, the GIL, etc.? With a section on 
best practices.

> 
> 
> 
> On 8/25/10, pierreth  wrote:
>> I would appreciate a good reference to understand the concepts you are
>> talking about. It is something new to me and I don't understand.
>> 
>> On 25 août, 11:22, John Heenan  wrote:
>>> No, nothing that abstract. Using WSGI forces a new thread for each
>>> request. This is is a simple and inefficient brute force approach that
>>> really only suits the simplest Python applications and where only a
>>> small number of concurrent connection might be expected.
>>> 
>>> Any application that provides web services is going to OS block on
>>> file reading (and writing) and on database access. Using threads is a
>>> classic and easy way out that carries a lot of baggage. Windows has
>>> had a way out of this for years with its asynch (or event)
>>> notification set up through an OVERLAPPED structure.
>>> 
>>> Lightttpd makes use of efficient event notification schemes like
>>> kqueue and epoll. Apache only uses such schemes for listening and Keep-
>>> Alives.
>>> 
>>> No matter how careful one is with threads and processes there always
>>> appears to be unexpected gotchas. Python has a notorious example, the
>>> now fixed 'Beazly Effect' that affected the GIL. Also I don't think
>>> there is a single experienced Python user that trusts the GIL.
>>> 
>>> John Heenan
>>> 
>> 




Re: [web2py] Re: Web2py and threads

2010-08-25 Thread Phyo Arkar
Yes, maybe the session was locked; that's why
session.current=processing_path was not working.

But then again, while processing files I tried opening a separate page,
in another controller, and it waited until the first (file crawler) page
finished parsing.
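
The symptom above is what per-session locking would produce. A minimal
simulation, assuming one lock per session id (a hypothetical stand-in
for the lock web2py takes on the session file; none of these names are
web2py APIs), written for Python 3:

```python
import threading
from collections import defaultdict

# Hypothetical stand-in for the session file lock: one lock per session id.
session_locks = defaultdict(threading.Lock)
log = []
log_guard = threading.Lock()


def record(message):
    with log_guard:
        log.append(message)


def handle(name, session_id, started=None, may_finish=None):
    # The whole request runs while holding the lock of its own session.
    with session_locks[session_id]:
        record(name + " start")
        if started is not None:
            started.set()
        if may_finish is not None:
            may_finish.wait(timeout=5)  # simulates the long crawl
        record(name + " end")


started = threading.Event()
may_finish = threading.Event()
crawler = threading.Thread(
    target=handle, args=("crawler", "s1", started, may_finish))
same_user = threading.Thread(target=handle, args=("same_user_page", "s1"))
other_user = threading.Thread(target=handle, args=("other_user_page", "s2"))

crawler.start()
started.wait(timeout=5)   # the crawler now holds the lock of session s1
same_user.start()         # same session: must wait for the crawler
other_user.start()        # different session: runs right away
other_user.join()
may_finish.set()          # let the crawler finish
crawler.join()
same_user.join()
print(log)
```

The page requested with the same session only starts after the crawler
ends, while the request from another session is served immediately.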


OK, I will start a separate thread about this.


On 8/25/10, mdipierro  wrote:
>
>
> On Aug 25, 11:00 am, Phyo Arkar  wrote:
>> Did I Read that reading files inside controller will block web2py , Does
>> it?
>
> No web2py does not block. web2py only locks sessions that means one
> user cannot request two concurrent pages because there would be a race
> condition in saving sessions. Two user can request different pages
> which open the same file unless the file is explicitly locked by your
> code.
>
>> Thats a bad news.. i am doing a file crawler and while crawling ,
>> web2py is blocked even tho the process talke only 25% of 1 out of 4
>> CPUs ..
>
> Tell us more or I cannot help.
>
>
>>
>> On 8/25/10, pierreth  wrote:
>>
>> > I would appreciate a good reference to understand the concepts you are
>> > talking about. It is something new to me and I don't understand.
>>
>> > On 25 août, 11:22, John Heenan  wrote:
>> >> No, nothing that abstract. Using WSGI forces a new thread for each
>> >> request. This is is a simple and inefficient brute force approach that
>> >> really only suits the simplest Python applications and where only a
>> >> small number of concurrent connection might be expected.
>>
>> >> Any application that provides web services is going to OS block on
>> >> file reading (and writing) and on database access. Using threads is a
>> >> classic and easy way out that carries a lot of baggage. Windows has
>> >> had a way out of this for years with its asynch (or event)
>> >> notification set up through an OVERLAPPED structure.
>>
>> >> Lightttpd makes use of efficient event notification schemes like
>> >> kqueue and epoll. Apache only uses such schemes for listening and Keep-
>> >> Alives.
>>
>> >> No matter how careful one is with threads and processes there always
>> >> appears to be unexpected gotchas. Python has a notorious example, the
>> >> now fixed 'Beazly Effect' that affected the GIL. Also I don't think
>> >> there is a single experienced Python user that trusts the GIL.
>>
>> >> John Heenan


Re: [web2py] Re: Web2py and threads

2010-08-25 Thread Jonathan Lundell
On Aug 25, 2010, at 1:41 PM, mdipierro wrote:
> 
> call
> 
> session._unlock()
> 
> if you do not need session locking

If you do that (without calling session.forget), what will happen in 
_try_store_on_disk when cPickle.dump(dict(self), response.session_file) is 
called with a None file argument? Or is cPickle.dump cool with that? Or am I 
misreading the logic?
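
A quick way to check the question, sketched with Python 3's pickle
module as a stand-in for Python 2's cPickle (the helper name is made up
for this example):

```python
import io
import pickle


def dump_session(data, session_file):
    """Try to pickle data into session_file and report what happened."""
    try:
        pickle.dump(data, session_file)
        return "saved"
    except TypeError as exc:
        # pickle needs an object with a write() method; None has none.
        return "TypeError: %s" % exc


result_with_none = dump_session({"counter": 1}, None)
result_with_file = dump_session({"counter": 1}, io.BytesIO())
print(result_with_none)
print(result_with_file)
```

So dumping to a None file is not silently accepted; it raises a
TypeError before anything is written.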


> 
> On Aug 25, 11:38 am, Phyo Arkar  wrote:
>> Yes may be session was locked , thats why
>> session.current=processing_path not working
>> 
>> But then again , while processing files i try opening separate page ,
>> to other controller , it was waited till the first (file Crawler) page
>> finished parsing.
>> 
>> ok i will make a separate thread about this.
>> 



Re: [web2py] Re: Web2py and threads

2010-08-25 Thread Jonathan Lundell
On Aug 25, 2010, at 6:37 PM, mdipierro wrote:
> 
> The problem arises only if you have two http requests from the same client in
> the same session

Thanks for that; I was wondering under which conditions unlocking might be 
permissible (and I'm still not entirely clear, but never mind for now).

My concern is this. Here's unlock:

    def _unlock(self, response):
        if response and response.session_file:
            try:
                portalocker.unlock(response.session_file)
                response.session_file.close()
                del response.session_file  <-
            except: ### this should never happen but happens in Windows
                pass

Now we save the session file:

    def _try_store_on_disk(self, request, response):
        if response._dbtable_and_field \
                or not response.session_id \
                or self._forget:
            self._unlock(response)
            return
        if response.session_new:
            # Tests if the session folder exists, if not, create it
            session_folder = os.path.dirname(response.session_filename)
            response.session_file = open(response.session_filename, 'wb')
            portalocker.lock(response.session_file, portalocker.LOCK_EX)
        cPickle.dump(dict(self), response.session_file)  <-
        self._unlock(response)

But response.session_file is None at this point.
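
To illustrate, here is a sketch with a simplified stand-in for web2py's
Storage class, assuming (as in web2py) that a missing attribute reads
as None. The guard in try_store is only one possible way to avoid the
exception; it is hypothetical and not necessarily what goes into trunk:

```python
import io
import pickle


class Storage(dict):
    """Simplified stand-in for web2py's Storage: missing attributes are None."""

    def __getattr__(self, key):
        return self.get(key)

    def __setattr__(self, key, value):
        self[key] = value

    def __delattr__(self, key):
        self.pop(key, None)


def try_store(session_data, response):
    # Hypothetical guard: skip saving when the file was already released.
    if response.session_file is None:
        return False
    pickle.dump(dict(session_data), response.session_file)
    return True


response = Storage()
response.session_file = io.BytesIO()
del response.session_file            # what _unlock does after closing the file
file_after_unlock = response.session_file
stored = try_store({"counter": 1}, response)
print(file_after_unlock, stored)
```

With such a guard the request no longer fails, but the session changes
made during that request are not written either.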

> 
> A arrives, loads session and unlocks
> B arrives, loads session and unlocks
> A changes session and saves it
> B changes session and saves it
> 
> Nothing breaks but B never sees changes made by A and they are
> overwritten by B.
> 
> With locks:
> 
> A arrives, loads session
> B arrives and waits
> A changes session and saves it
> B loads session (with changes made by A)
> 




Re: [web2py] Re: Web2py and threads

2010-08-25 Thread Jonathan Lundell
On Aug 25, 2010, at 7:56 PM, mdipierro wrote:
> 
> This is a bug. I fixed it in trunk. Thanks Jonathan.

It's fixed in the sense that it won't raise an exception. But now how is 
calling _unlock different from calling forget?

> 
> On Aug 25, 9:30 pm, Jonathan Lundell  wrote:
>> On Aug 25, 2010, at 6:37 PM, mdipierro wrote:
>> 
>> 
>> 
>>> The problem is only if have two http request from the same client in
>>> the same session
>> 
>> Thanks for that; I was wondering under which conditions unlocking might be 
>> permissible (and I'm still not entirely clear, but never mind for now).
>> 
>> My concern is this. Here's unlock:
>> 
>>     def _unlock(self, response):
>>         if response and response.session_file:
>>             try:
>>                 portalocker.unlock(response.session_file)
>>                 response.session_file.close()
>>                 del response.session_file  <-
>>             except: ### this should never happen but happens in Windows
>>                 pass
>> 
>> Now we save the session file:
>> 
>>     def _try_store_on_disk(self, request, response):
>>         if response._dbtable_and_field \
>>                 or not response.session_id \
>>                 or self._forget:
>>             self._unlock(response)
>>             return
>>         if response.session_new:
>>             # Tests if the session folder exists, if not, create it
>>             session_folder = os.path.dirname(response.session_filename)
>>             response.session_file = open(response.session_filename, 'wb')
>>             portalocker.lock(response.session_file, portalocker.LOCK_EX)
>>         cPickle.dump(dict(self), response.session_file)  <-
>>         self._unlock(response)
>> 
>> But response.session_file is None at this point.
>> 
>> 
>> 
>>> A arrives loads session and unlocks
>>> B arrives loads session and unlocks
>>> A change session and saves it
>>> B changes session and saves it
>> 
>>> Nothing breaks but B never sees changes made by A and they are
>>> overwritten by B.
>>> With locks
>> 
>>> A arrives loads session
>>> B arrives and waits
>>> A change session and saves it
>>> B loads session (with changes made by A)
>>> B changes session and saves it
>> 

Re: [web2py] Re: Web2py and threads

2010-08-27 Thread Jonathan Lundell
On Aug 25, 2010, at 8:12 PM, Jonathan Lundell wrote:
> 
> On Aug 25, 2010, at 7:56 PM, mdipierro wrote:
>> 
>> This is a bug. I fixed it in trunk. Thanks Jonathan.
> 
> It's fixed in the sense that it won't raise an exception. But now how is 
> calling _unlock different from calling forget?

Nag.
