Re: [Web-SIG] Emulating req.write() in WSGI

2010-07-07 Thread Aaron Fransen
Well, just in case this helps anyone, I managed to get it working on most of
the current browsers.

My first problem was a missing import statement (definite DOH! moment
there), but once that was resolved Firefox steadfastly refused to
co-operate.

The fix was to use a boundary indicator then specify the content type for
every subsequent chunk of data, something along these lines:

# during initial conversation
status = '200 OK'
response_headers = [('Content-type', 'multipart/x-mixed-replace')]
if 'Firefox' in environ.get('HTTP_USER_AGENT', ''):
    response_headers = [('Content-type',
                         'multipart/x-mixed-replace;boundary=x0x0x0x')]
writer = start_response(status, response_headers)

# during subsequent conversations
if 'Firefox' in environ.get('HTTP_USER_AGENT', ''):
    writer('Content-type: text/html\r\n\r\n' + text + '\r\n\r\n--x0x0x0x')
else:
    writer(text)

Now, in Firefox each chunk completely replaces the previous one, while in IE8,
Chrome and Safari it is simply appended to the existing content. It doesn't
work in Opera yet (with either method); I haven't been able to determine why,
but I'll continue to work on it.
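
The boundary-plus-content-type framing described above can be sketched as a small generator-based WSGI application. This is only an illustration under assumptions: it targets Python 3 (so the body is bytes), the boundary token, the frame_part helper, and the extra updates/delay parameters are made up for the example, and it says nothing about how any particular browser will render the parts.

```python
import time

# Hypothetical boundary token; any string that never appears in the payload works.
BOUNDARY = b"x0x0x0x"


def frame_part(body, content_type=b"text/html"):
    """Wrap one payload as a MIME part followed by the next boundary delimiter."""
    return (
        b"Content-Type: " + content_type + b"\r\n\r\n"
        + body + b"\r\n"
        + b"--" + BOUNDARY + b"\r\n"
    )


def application(environ, start_response, updates=(b"step 1", b"step 2"), delay=0):
    # updates and delay are illustrative extras; a WSGI server only passes
    # environ and start_response, so the defaults apply there.
    start_response("200 OK", [
        ("Content-Type",
         "multipart/x-mixed-replace; boundary=" + BOUNDARY.decode("ascii")),
    ])
    # The stream opens with a delimiter, then every part ends with one.
    yield b"--" + BOUNDARY + b"\r\n"
    for body in updates:
        yield frame_part(body)
        time.sleep(delay)
```

Sending the same framing to every client, rather than branching on the user agent, keeps the declared content type and the body consistent with each other.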

Sorry for the hassle everyone!
___
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: 
http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com


Re: [Web-SIG] Emulating req.write() in WSGI

2010-07-06 Thread Graham Dumpleton
On 5 July 2010 22:43, Aaron Fransen aaron.fran...@gmail.com wrote:
 Apologies Graham, I'm not actually trying to appear dense but clearly I'm
 not one of the world's bright lights when it comes to web interfaces.

 My installation is literally a base installation of the latest Ubuntu server
 platform. The only configuration at play is this:

     WSGIDaemonProcess node9 user=www-data group=www-data processes=2 threads=25
     WSGIProcessGroup node9
     WSGIScriptAlias /run /var/www/run/run.py

 The error that occurs when using telnet and yield is:

 [Mon Jul 05 06:30:24 2010] [error] [client 127.0.0.1] mod_wsgi (pid=2716):
 Target WSGI script '/var/www/run/run.py' cannot be loaded as Python module.
 [Mon Jul 05 06:30:24 2010] [error] [client 127.0.0.1] mod_wsgi (pid=2716):
 Exception occurred processing WSGI script '/var/www/run/run.py'.
 [Mon Jul 05 06:30:24 2010] [error] [client 127.0.0.1] SyntaxError: 'return'
 with argument inside generator (run.py, line 14)

 using this code:

     status = '200 OK'
     response_headers = [('Content-type', 'text/plain')]
     start_response(status, response_headers)
     for x in range(0, 10):
         yield 'hey %s' % x
         time.sleep(1)

 The error occurs when I use return [] as opposed to simply return,
 however I now see that is a result of the yield command itself.

In the code example I posted I never had a 'return' statement in the same
function as 'yield'. You shouldn't be mixing the two.

Graham


Re: [Web-SIG] Emulating req.write() in WSGI

2010-07-06 Thread Graham Dumpleton
On 6 July 2010 21:02, Dirkjan Ochtman dirk...@ochtman.nl wrote:
 On Tue, Jul 6, 2010 at 12:50, Graham Dumpleton
 graham.dumple...@gmail.com wrote:
 In the code example I posted I never had a 'return' statement in the same
 function as 'yield'. You shouldn't be mixing the two.

 Well, you can still use bare return as a way of raising StopIteration.

True. It was just easier to say not to and avoid any chance of
misunderstanding. :-)

Graham
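
The distinction discussed above can be sketched in a few lines. This is a minimal illustration assuming Python 3.3 or later; the function names are made up. On the Python 2 interpreter used in the thread, a return with a value inside a generator is the SyntaxError shown in the Apache log, while a bare return simply ends the iteration.

```python
def countdown(n):
    """Yield n, n-1, ..., 1; stop immediately if n is not positive."""
    if n <= 0:
        return  # bare return: the generator just ends (StopIteration)
    while n > 0:
        yield n
        n -= 1


def with_value():
    yield "a"
    # Legal on Python 3.3+, a SyntaxError on Python 2. The list is attached
    # to StopIteration and is never delivered to whoever iterates.
    return ["ignored"]
```

Either way, a WSGI server iterating over the result only ever sees the yielded items, so a returned list would not be sent as response content.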

There's something very important I forgot to tell you. Don't cross
the streams… It would be bad… Try to imagine all life as you know it
stopping instantaneously and every molecule in your body exploding at
the speed of light.
—Egon Spengler on crossing proton streams


Re: [Web-SIG] Emulating req.write() in WSGI

2010-07-05 Thread Aaron Fransen
Apologies Graham, I'm not actually trying to appear dense but clearly I'm
not one of the world's bright lights when it comes to web interfaces.

My installation is literally a base installation of the latest Ubuntu server
platform. The only configuration at play is this:

WSGIDaemonProcess node9 user=www-data group=www-data processes=2 threads=25
WSGIProcessGroup node9
WSGIScriptAlias /run /var/www/run/run.py

The error that occurs when using telnet and yield is:

[Mon Jul 05 06:30:24 2010] [error] [client 127.0.0.1] mod_wsgi (pid=2716):
Target WSGI script '/var/www/run/run.py' cannot be loaded as Python module.
[Mon Jul 05 06:30:24 2010] [error] [client 127.0.0.1] mod_wsgi (pid=2716):
Exception occurred processing WSGI script '/var/www/run/run.py'.
[Mon Jul 05 06:30:24 2010] [error] [client 127.0.0.1] SyntaxError: 'return'
with argument inside generator (run.py, line 14)

using this code:

    status = '200 OK'
    response_headers = [('Content-type', 'text/plain')]
    start_response(status, response_headers)
    for x in range(0, 10):
        yield 'hey %s' % x
        time.sleep(1)

The error occurs when I use return [] as opposed to simply return,
however I now see that is a result of the yield command itself.

Using this method, the telnet interface returns immediately with:

HTTP/1.1 200 OK
Date: Mon, 05 Jul 2010 12:30:45 GMT
Server: Apache/2.2.14 (Ubuntu)
Vary: Accept-Encoding
Connection: close
Content-Type: text/plain

0
Connection closed by foreign host.

In fact, whether using yield or write produces the same result.

If I'm not getting the results I should be, then obviously I'm doing
something wrong.

I understand the danger of having a long-running web process (hence the
reason I have a lot of virtual machines in the live environment using
mod_python right now) but unfortunately it's something I don't seem to be
able to work around at the moment.

Thanks to all.

On Wed, Jun 30, 2010 at 5:19 PM, Graham Dumpleton 
graham.dumple...@gmail.com wrote:

 On 30 June 2010 22:55, Aaron Fransen aaron.fran...@gmail.com wrote:
 
  I can see that this could potentially get very ugly very quickly.
 
  Using stock Apache on the current Ubuntu server, using yield produced a
  response error

 What error? If you aren't going to debug it enough to even work out
 what the error is in the browser or Apache error logs and post it here
 for comment so we can say what may be wrong on your system, then we can't
 exactly help you much, can we?

  and using write() (over the telnet interface) returned the 0
  only and disconnected. Similar behavior in Firefox.

 All the scripts I provided you are conforming WSGI applications and
 work on mod_wsgi. If you are having issues, then it is likely going to
 be the way your Apache/Python is set up or how you configured mod_wsgi
 to host the scripts. Again, because you are providing no details about
 how you configured mod_wsgi we can't help you work out what is wrong
 with your system.

  How odd that nobody's come up with a simple streaming/update schema (at
  least to my mind).

 For response content they have and it can be made to work. Just
 because you can't get it working or don't understand what we are saying
 about the need to use a JavaScript/AJAX type client (e.g. comet style)
 to make use of it, as opposed to trying to rely on browser
 functionality that doesn't exist, doesn't change that. Request content
 streaming is a different matter as I will explain below but you
 haven't even mentioned that as yet that I can see.

  It would have been nice to be able to provide some kind of in-stream
  feedback for long running jobs, but it looks like I'm going to have to
  abandon that approach. The only issue with either of the other solutions
  is that each subsequent request depends on data provided by the prior, so
  the amount of traffic going back & forth could potentially become a
  problem.
 
  Alternatively I could simply create a session database that saves the
  required objects then each subsequent request simply fetches the required
  one from the table and...
 
  Well, you can see why streaming seemed like such a simple solution! Back
  to the drawing board, as it were.

 I'll try one last time to try and summarise a few issues for you,
 although based on your attitude so far, I don't think it will change
 your opinion or help your understanding.

 1. Streaming of responses from a WSGI application works fine using
 either yield or write(). If it doesn't work for a specific WSGI
 hosting mechanism then that implementation may not be conforming to
 WSGI requirements. Specifically, between a yield and/or write() it is
 required that an implicit flush is performed. This should ensure that
 the data is written to the HTTP client connection and/or ensure that
 the return of such data to the client occurs in parallel to further
 actions occurring in that request.

 2. A WSGI middleware that caches response data can stuff this up. One
 can't outright 
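
The two points above, that a server should pass each yielded chunk on as it is produced and that a caching middleware can defeat this, can be illustrated with a small sketch. It assumes Python 3 (bytes bodies), and both middleware functions are hypothetical examples written for this note, not code from the thread or from any library.

```python
def streaming_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    for i in range(3):
        yield ("%d\n" % i).encode("ascii")


def buffering_middleware(app):
    """Hypothetical middleware that collects the whole body before returning it."""
    def wrapper(environ, start_response):
        # Nothing reaches the server until the application has finished.
        body = b"".join(app(environ, start_response))
        return [body]
    return wrapper


def passthrough_middleware(app):
    """Hypothetical middleware that forwards every chunk as soon as it arrives."""
    def wrapper(environ, start_response):
        result = app(environ, start_response)
        try:
            for chunk in result:
                yield chunk
        finally:
            # WSGI asks wrappers to call close() on the wrapped iterable.
            if hasattr(result, "close"):
                result.close()
    return wrapper
```

With the buffering wrapper the client sees a single block at the end, however carefully the application yields; with the pass-through wrapper the chunk boundaries survive.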

Re: [Web-SIG] Emulating req.write() in WSGI

2010-06-30 Thread Graham Dumpleton
On 30 June 2010 21:35, Aaron Fransen aaron.fran...@gmail.com wrote:


 On Tue, Jun 29, 2010 at 6:17 PM, Graham Dumpleton
 graham.dumple...@gmail.com wrote:

 On 30 June 2010 02:14, Aaron Fransen aaron.fran...@gmail.com wrote:
  Couple more things I've been able to discern.
 
  The first happened after I fixed the html code. Originally under
  mod_python, I guess I was cheating more than a little bit by sending
  <html></html> code blocks twice, once for the incremental notices, once
  for the final content. Once I changed the code to send a single properly
  parsed block, the entire document showed up as expected, however it still
  did not send any part of the html incrementally.
 
  Watching the line with Wireshark, all of the data was transmitted at the
  same time, so nothing was sent to the browser incrementally.
 
  (This is using the write() functionality, I haven't tried watching the
  line with yield yet.)

 Use a variation of WSGI middleware wrapper in:


  http://code.google.com/p/modwsgi/wiki/DebuggingTechniques#Tracking_Request_and_Response

 using it to 'print' returned data to Apache log and then tail Apache
 error log to see when that data is output. Alternatively, change the
 code there to output a time stamp against each chunk of data written
 to the file recording the response content.

 This will show what data is returned by WSGI application, before
 mod_wsgi truncates anything greater than content length specified,
 plus also show whether it is your WSGI application which is delaying
 output somehow, or whether Apache output filters are doing it.

 Graham
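
A wrapper along the lines suggested above might look like the following. It is a sketch under assumptions, not the code from the linked wiki page: the class name and log format are invented here, it targets Python 3, and it only sees chunks that the application yields, not data sent through the write() callable.

```python
import sys
import time


class TimestampLogger:
    """Hypothetical WSGI wrapper that logs when each response chunk is produced."""

    def __init__(self, app, log=None):
        self.app = app
        # Under mod_wsgi, sys.stderr normally ends up in the Apache error log.
        self.log = log if log is not None else sys.stderr

    def __call__(self, environ, start_response):
        result = self.app(environ, start_response)
        try:
            for chunk in result:
                self.log.write("%.3f %d bytes: %r\n"
                               % (time.time(), len(chunk), chunk[:40]))
                yield chunk
        finally:
            if hasattr(result, "close"):
                result.close()
```

If the timestamps in the log are a second apart but the client receives everything at once, the delay is being introduced after the application, for example by an output filter or by the client itself.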

 I've actually tried a variation on this already using a built-in logging
 facility in the application that writes date/time values to an external log
 file with comments, and in the case of testing wsgi I actually included some
 time.sleep() statements to force a delay in the application.

 To give you an idea of the flow, here's essentially what's going on:

 def application(environ, start_response):
     mydict = {}
     mydict['environ'] = environ
     mydict['startresponse'] = start_response
     # run program in another .py file that has been imported
     RunTest(mydict)

 Then in the other module you would have something like:

 def RunTest(mydict):
     status = '200 OK'
     response_headers = [('Content-type', 'text/html')]
     writeobj = mydict['startresponse'](status, response_headers)
     writeobj('<html><body>Fetching sales for 2009...')
     time.sleep(2)
     writeobj('<br>Fetching sales for 2010...')

     ...then finally...

     writeobj('5000 results returned.</body></html>')
     return

 This is obviously a truncated (and fake) example, but it gives you an idea
 of the flow.

Now go try the following two examples as illustrated instead.

In both cases, do not use a web browser, instead telnet to the port of
the web server and enter HTTP GET directly. If you are not using
VirtualHost, use something like:

  telnet localhost 80
  GET /stream-yield.wsgi HTTP/1.0

If using a VirtualHost, use something like:

  telnet localhost 80
  GET /stream-yield.wsgi HTTP/1.1
  Host: tests.example.com

Ensure an additional blank line is entered to indicate the end of the headers.
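
For anyone who would rather script the telnet session, the same raw requests can be sent from Python. This is a sketch under assumptions: it targets Python 3, the build_request and watch helpers are invented for illustration, and the "Connection: close" header is an addition so that an HTTP/1.1 server closes the socket when the response ends.

```python
import socket
import time


def build_request(path, host=None):
    """Return the raw bytes of a minimal GET request, ending with the blank line."""
    if host is None:
        lines = ["GET %s HTTP/1.0" % path]
    else:
        lines = ["GET %s HTTP/1.1" % path,
                 "Host: %s" % host,
                 "Connection: close"]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def watch(path, server="localhost", port=80, host=None):
    """Print each piece of the response together with its arrival time."""
    start = time.time()
    with socket.create_connection((server, port)) as sock:
        sock.sendall(build_request(path, host))
        while True:
            data = sock.recv(4096)
            if not data:  # the server closed the connection
                break
            print("%6.2fs %r" % (time.time() - start, data))
```

Calling watch('/stream-yield.wsgi') against a server that streams correctly should print pieces spread over time rather than one block at the end.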

First example uses yield.

# stream-yield.wsgi

import time

def application(environ, start_response):
    status = '200 OK'

    response_headers = [('Content-type', 'text/plain')]
    start_response(status, response_headers)

    for i in range(10):
        yield '%d\n' % i
        time.sleep(1)

Second example uses write:

# stream-write.wsgi

import time

def application(environ, start_response):
    status = '200 OK'

    response_headers = [('Content-type', 'text/plain')]
    write = start_response(status, response_headers)

    for i in range(10):
        write('%d\n' % i)
        time.sleep(1)

    return []

For me, using stock standard operating system supplied Apache on Mac
OS X, I see a line returned every second.

If I use Safari as a web browser, in both cases the browser only shows
the response after all data has been written and the socket connection
closed. If I use Firefox however, they display as data comes in.

This delay in display is thus possibly just the behaviour of a
specific browser delaying the display until the socket is closed.

The example for multipart/x-mixed-replace which others mention is:

import time

def application(environ, start_response):
    status = '200 OK'

    response_headers = [('Content-Type',
                         'multipart/x-mixed-replace; boundary=xstringx')]
    start_response(status, response_headers)

    # the opening delimiter must match the boundary declared above
    yield '--xstringx\n'

    for i in range(10):

        yield 'Content-type: text/plain\n'
        yield '\n'
        yield '%d\n' % i
        yield '--xstringx\n'

        time.sleep(1)

With telnet you will see the various sections, but Safari again only
shows the result at the end, although you will find that it only shows the
data line, i.e., the number and not all the other stuff. So it understands
the multipart format but doesn't support x-mixed-replace. It was always

Re: [Web-SIG] Emulating req.write() in WSGI

2010-06-30 Thread Graham Dumpleton
On 30 June 2010 22:26, Graham Dumpleton graham.dumple...@gmail.com wrote:

Re: [Web-SIG] Emulating req.write() in WSGI

2010-06-30 Thread Aaron Fransen
On Wed, Jun 30, 2010 at 6:26 AM, Graham Dumpleton 
graham.dumple...@gmail.com wrote:


Re: [Web-SIG] Emulating req.write() in WSGI

2010-06-30 Thread Éric Araujo
Forgot the footnote:

¹ http://en.wikipedia.org/wiki/Comet_%28programming%29



Re: [Web-SIG] Emulating req.write() in WSGI

2010-06-30 Thread Graham Dumpleton
On 30 June 2010 22:55, Aaron Fransen aaron.fran...@gmail.com wrote:


 On Wed, Jun 30, 2010 at 6:26 AM, Graham Dumpleton
 graham.dumple...@gmail.com wrote:

 On 30 June 2010 21:35, Aaron Fransen aaron.fran...@gmail.com wrote:
 
 
  On Tue, Jun 29, 2010 at 6:17 PM, Graham Dumpleton
  graham.dumple...@gmail.com wrote:
 
  On 30 June 2010 02:14, Aaron Fransen aaron.fran...@gmail.com wrote:
   Couple more things I've been able to discern.
  
   The first happened after I fixed the html code. Originally under
   mod_python, I guess I was cheating more than a little bit by sending
   html/html code blocks twice, once for the incremental notices,
   once
   for
   the final content. Once I changed the code to send a single properly
   parsed
   block, the entire document showed up as expected, however it still
   did
   not
   send any part of the html incrementally.
  
   Watching the line with Wireshark, all of the data was transmitted at
   the
   same time, so nothing was sent to the browser incrementally.
  
   (This is using the write() functionality, I haven't tried watching
   the
   line
   with yield yet.)
 
  Use a variation of WSGI middleware wrapper in:
 
 
 
   http://code.google.com/p/modwsgi/wiki/DebuggingTechniques#Tracking_Request_and_Response
 
  using it to 'print' returned data to Apache log and then tail Apache
  error log to see when that data is output. Alternatively, change the
  code there to output a time stamp against each chunk of data written
  to the file recording the response content.
 
  This will show what data is returned by WSGI application, before
  mod_wsgi truncates anything greater than content length specified,
  plus also show whether it is your WSGI application which is delaying
  output somehow, or whether Apache output filters are doing it.
 
  Graham
 
  I've actually tried a variation on this already using a built-in logging
  facility in the application that writes date/time values to an external
  log
  file with comments, and in the case of testing wsgi I actually included
  some
  time.sleep() statements to force a delay in the application.
 
  To give you an idea of the flow, here's essentially what's going on:
 
  def application(environ,start_response):
      mydict = {}
      mydict['environ']=environ
      mydict['startresponse'] = start_response
      # run program in another .py file that has been imported
      RunTest(mydict)
 
  Then in the other module you would have something like:
 
  def RunTest(mydict):
      status = '200 OK'
      response_headers = [('Content-type','text/html')]
      writeobj = detail['startresponse'](status,response_headers)
      writeobj('htmlbodyFetching sales for 2009...')
      time.sleep(2)
      writeobj('brFetching sales for 2010...')
 
      ...then finally...
 
      writeobj('5000 results returned./body/html')
      return
 
  This is obviously a truncated (and fake) example, but it gives you an
  idea
  of the flow.

 Now go try the following two examples as illustrated instead.

 In both cases, do not use a web browser, instead telnet to the port of
 the web server and enter HTTP GET directly. If you are not using
 VirtualHost, use something like:

  telnet localhost 80
  GET /stream-yield.wsgi HTTP/1.0

 If using a VirtualHost, use something like:

  telnet localhost 80
  GET /stream-yield.wsgi HTTP/1.1
  Host: tests.example.com

 Ensure additional blank line entered to indicate end of headers.

 First example uses yield.

 # stream-yield.wsgi

 import time

 def application(environ, start_response):
    status = '200 OK'

    response_headers = [('Content-type', 'text/plain')]
    start_response(status, response_headers)

    for i in range(10):
      yield '%d\n' % i
      time.sleep(1)

 Second example uses write:

 # stream-write.wsgi

 import time

 def application(environ, start_response):
    status = '200 OK'

    response_headers = [('Content-type', 'text/plain')]
    write = start_response(status, response_headers)

    for i in range(10):
      write('%d\n' % i)
      time.sleep(1)

    return []

 For me, using stock standard operating system supplied Apache on Mac
 OS X, I see a line returned every second.

 If I use Safari as a web browser, in both cases the browser only shows
 the response after all data has been written and the socket connection
 closed. If I use Firefox however, they display as data comes in.

 This delay in display is thus possibly just the behaviour of a
 specific browser delaying the display until the socket is closed.

 The example for multipart/x-mixed-replace which others mention is:

 import time

 def application(environ, start_response):
    status = '200 OK'

    response_headers = [('Content-Type',
        'multipart/x-mixed-replace; boundary=xstringx')]
    start_response(status, response_headers)

    # The opening delimiter must match the boundary declared in the header.
    yield '--xstringx\n'

    for i in range(10):

      yield 'Content-type: text/plain\n'
      yield '\n'
      yield '%d\n' % i
      yield '--xstringx\n'
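
The framing used above can also be pulled out into a small helper. This
is a sketch only, written for Python 3 (where WSGI response bodies are
bytes): multipart_replace is a hypothetical name, it uses CRLF line
endings as MIME specifies rather than the bare newlines above, and it
leaves out any delay between parts.

```python
def multipart_replace(parts, boundary="xstringx", content_type="text/plain"):
    """Yield byte chunks framing each part for multipart/x-mixed-replace."""
    delimiter = ("--%s\r\n" % boundary).encode("ascii")
    header = ("Content-Type: %s\r\n\r\n" % content_type).encode("ascii")
    yield delimiter  # opening boundary
    for part in parts:
        # One chunk per part: part headers, blank line, body, next boundary.
        yield header + part + b"\r\n" + delimiter


def application(environ, start_response):
    boundary = "xstringx"
    start_response("200 OK", [
        ("Content-Type", "multipart/x-mixed-replace; boundary=%s" % boundary),
    ])
    numbers = (("%d\n" % i).encode("ascii") for i in range(10))
    return multipart_replace(numbers, boundary)
```

Keeping the boundary string in one place avoids the header and the
delimiters drifting apart.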

      

Re: [Web-SIG] Emulating req.write() in WSGI

2010-06-29 Thread Aaron Fransen
On Mon, Jun 28, 2010 at 5:42 PM, Graham Dumpleton 
graham.dumple...@gmail.com wrote:

 On 29 June 2010 05:01, Aaron Fransen aaron.fran...@gmail.com wrote:
  One of the nice things about mod_python is the req.write() function.

 One thing I should warn you about req.write() in Apache is that for
 streaming data as you seem to be using it, it will accumulate memory
 against a request for each write call and that will not be reused,
 albeit it will be released again at the end of the request.

 The problem here isn't actually in mod_python but in the underlying
 Apache ap_rwrite() call.

 What this function does is that for each call to it, it creates what
 is called a bucket to hold the data to be written. The memory for this
 bucket is allocated from the per request memory pool each time. This
 bucket is then passed down the Apache output filter chain and
 eventually the data gets written out.

 Now, because the code doesn't attempt to reuse the bucket, that memory
 then remains unused, but still allocated against the memory pool, with
 the memory pool only being destroyed at the end of the request.

 The outcome of this is that if you had a long running request which
 continually wrote out response data in small bits using req.write(),
 for each call there is a small increase in amount of memory taken from
 the per request memory pool with it not being reused. Thus if the
 request were running for a very long time, you will see a gradual
 increase in overall memory usage of the process. When the request
 finishes, the memory is reclaimed and reused, but you have by then
 already set the high ceiling on ongoing process memory in use.

 Anyway, thought I should just warn you about this. In part this issue
 may even be why mod_python got a reputation for memory bloat in some
 situations. That is, the fundamental way of returning response data
 could cause unnecessary increase in process size if called many times
 for a request.

 Graham



Fortunately we're not talking about a huge amount of data here, basically
just a couple of notices to keep the user happy (less than 1K usually).

When using yield, it's as if the module where the yield command is run is
completely ignored. The page returned is a default page generated by the
application. Errors are being trapped, but none are being generated, it's
just exiting without any kind of notice.

When using write() without a Content-Length header, nothing shows on the
browser.

When using write() with a Content-Length header, the first update shows (and
only after the entire page has been generated), but none of the subsequent
ones nor the final page.

When using write() with a Content-Length header set large enough to
encompass the entire final result, the final result page shows, but none of
the informational messages leading up to the generation of the page appear.

I haven't really done anything to the base wsgi installation; just set it up
in daemon mode.


Re: [Web-SIG] Emulating req.write() in WSGI

2010-06-29 Thread P.J. Eby

At 10:14 AM 6/29/2010 -0600, Aaron Fransen wrote:

Couple more things I've been able to discern.

The first happened after I fixed the html code. Originally under 
mod_python, I guess I was cheating more than a little bit by sending 
<html></html> code blocks twice, once for the incremental notices, 
once for the final content. Once I changed the code to send a single 
properly parsed block, the entire document showed up as expected, 
however it still did not send any part of the html incrementally.


Watching the line with Wireshark, all of the data was transmitted at 
the same time, so nothing was sent to the browser incrementally.


So, you're not sending a multipart/x-mixed-replace (server push) 
transmission? 




Re: [Web-SIG] Emulating req.write() in WSGI

2010-06-29 Thread P.J. Eby

At 12:33 PM 6/29/2010 -0600, Aaron Fransen wrote:
I was sending text/html (I probably should have used multipart 
before) ... should I try multipart now, even with having everything 
in a single stream?


Heck if I know.  I just assumed that what you're doing would be 
unlikely to work, whereas multipart has at least been previously 
documented as working with Apache (at least for nph scripts).  Dunno 
if mod_wsgi'll do that or not.


Actually, what I'd do in your place is try a nph- CGI in Python 
(using a wsgiref CGIHandler with its 'origin_server' attribute set to 
True), have it send multipart, and see if that works.  If it doesn't 
work, then it's probably a problem with your app.


If it *does* work, but the same app doesn't work under mod_wsgi, then 
it's a mod_wsgi issue; possibly related to configuration.  From what 
Graham's said, mod_wsgi shouldn't be buffering anything, which means 
it has to either be Apache or your app that's buffering.  If it's 
Apache, doing a proper nph+multipart ought to fix it, unless there's 
something else going on in the Apache configuration.
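
A rough sketch of what such an nph- test script might look like, written
for Python 3. The application, the boundary string and the main() wrapper
are illustrative assumptions; only CGIHandler and its origin_server
attribute come from wsgiref itself. With origin_server set to True the
handler writes a full "HTTP/1.0 200 OK" status line instead of a CGI
"Status:" header, which is what an nph script has to do.

```python
import time
from wsgiref.handlers import CGIHandler

BOUNDARY = "xstringx"  # arbitrary; must not occur inside any part body


def application(environ, start_response):
    start_response("200 OK", [
        ("Content-Type", "multipart/x-mixed-replace; boundary=%s" % BOUNDARY),
    ])
    yield ("--%s\r\n" % BOUNDARY).encode("ascii")
    for i in range(3):
        part = "Content-Type: text/plain\r\n\r\nstep %d\r\n--%s\r\n" % (i, BOUNDARY)
        yield part.encode("ascii")
        time.sleep(1)  # stand-in for the long-running work


def main():
    handler = CGIHandler()
    handler.origin_server = True  # emit the HTTP status line ourselves
    handler.run(application)
```

The script would be saved under a name starting with nph- in a CGI
directory and would end with a call to main().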





Re: [Web-SIG] Emulating req.write() in WSGI

2010-06-29 Thread Graham Dumpleton
On 29 June 2010 23:37, Aaron Fransen aaron.fran...@gmail.com wrote:
 Fortunately we're not talking about a huge amount of data here, basically
 just a couple of notices to keep the user happy (less than 1K usually).

 When using yield, it's as if the module where the yield command is run is
 completely ignored. The page returned is a default page generated by the
 application. Errors are being trapped, but none are being generated, it's
 just exiting without any kind of notice.

 When using write() without a Content-Length header, nothing shows on the
 browser.

 When using write() with a Content-Length header, the first update shows (and
 only after the entire page has been generated), but none of the subsequent
 ones nor the final page.

 When using write() with a Content-Length header set large enough to
 encompass the entire final result, the final result page shows, but none of
 the informational messages leading up to the generation of the page appear.

These statements concern me.

The Content-Length header if you are sending a response of unknown
length should not be set. Further, you definitely cannot return/write
more response data than is specified by Content-Length. Doing so
breaks HTTP and mod_wsgi will actually deliberately discard anything
returned over what Content-Length specifies.

Can you clarify this? Are you setting Content-Length to a value less
than the amount of data you could actually return?
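
To illustrate the two acceptable cases, here is a small sketch in
Python 3 (bytes bodies). The application names and the HTML are made up
for the example.

```python
def fixed_length_app(environ, start_response):
    """Body known up front: Content-Length is the exact byte count."""
    body = b"<html><body>5000 results returned.</body></html>"
    start_response("200 OK", [
        ("Content-Type", "text/html"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def streaming_app(environ, start_response):
    """Length not known in advance: no Content-Length header at all."""
    start_response("200 OK", [("Content-Type", "text/html")])
    yield b"<html><body>Fetching sales for 2009..."
    yield b"<br>Fetching sales for 2010..."
    yield b"<br>5000 results returned.</body></html>"
```

Setting Content-Length to the size of the first notice only would fall
into neither case, and anything beyond that length would be dropped.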

Graham


Re: [Web-SIG] Emulating req.write() in WSGI

2010-06-29 Thread Graham Dumpleton
On 30 June 2010 02:14, Aaron Fransen aaron.fran...@gmail.com wrote:
 Couple more things I've been able to discern.

 The first happened after I fixed the html code. Originally under
 mod_python, I guess I was cheating more than a little bit by sending
 <html></html> code blocks twice, once for the incremental notices, once for
 the final content. Once I changed the code to send a single properly parsed
 block, the entire document showed up as expected, however it still did not
 send any part of the html incrementally.

 Watching the line with Wireshark, all of the data was transmitted at the
 same time, so nothing was sent to the browser incrementally.

 (This is using the write() functionality, I haven't tried watching the line
 with yield yet.)

Use a variation of WSGI middleware wrapper in:

  
http://code.google.com/p/modwsgi/wiki/DebuggingTechniques#Tracking_Request_and_Response

using it to 'print' returned data to Apache log and then tail Apache
error log to see when that data is output. Alternatively, change the
code there to output a time stamp against each chunk of data written
to the file recording the response content.

This will show what data is returned by WSGI application, before
mod_wsgi truncates anything greater than content length specified,
plus also show whether it is your WSGI application which is delaying
output somehow, or whether Apache output filters are doing it.
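
As a rough idea of the time stamp variation, a simplified wrapper could
look like the following. This is not the code from the wiki page, just a
Python 3 sketch; the class name TimestampLogger is made up, and it
assumes that stderr ends up in the Apache error log.

```python
import sys
import time


class TimestampLogger:
    """WSGI middleware that logs when each chunk of response data is produced."""

    def __init__(self, application, log=None):
        self.application = application
        # Assumption: stderr is routed to the Apache error log.
        self.log = log if log is not None else sys.stderr

    def _note(self, how, data):
        self.log.write("%.3f %s %d bytes: %r\n"
                       % (time.time(), how, len(data), data[:40]))
        self.log.flush()

    def __call__(self, environ, start_response):
        def logging_start_response(status, headers, exc_info=None):
            write = start_response(status, headers, exc_info)

            def logging_write(data):
                self._note("write", data)
                return write(data)

            return logging_write

        result = self.application(environ, logging_start_response)
        try:
            for data in result:
                self._note("yield", data)
                yield data
        finally:
            # WSGI requires close() to be called on the iterable if it exists.
            if hasattr(result, "close"):
                result.close()
```

Wrapping the entry point with application = TimestampLogger(application)
in the script file is enough. If the time stamps in the log are seconds
apart but the data still reaches the client in one lump, the delay is
happening after the WSGI application.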

Graham


[Web-SIG] Emulating req.write() in WSGI

2010-06-28 Thread Aaron Fransen
One of the nice things about mod_python is the req.write() function.

Although I realize it's somewhat of an abuse of the HTTP protocol, it's
handy being able to periodically update the client browser with a status
message for a long-running job.

So handy in fact that I have a number of applications that rely fairly
heavily on it as a means of keeping the client (person) happy instead of
just showing them the default browser busy notification.

There are a couple of workarounds, neither of which is ideal:
1. Take them immediately to a secondary page, then submit the actual job
automatically on that second page.
2. Instead of using HTTP POST, use an HTTP Request Object (i.e. Ajax).

Both of them involve significantly more development effort than an
equivalent req.write().

Is there a way to emulate the periodic-write functionality in WSGI?


Re: [Web-SIG] Emulating req.write() in WSGI

2010-06-28 Thread Gustavo Narea
http://pythonpaste.org/waitforit/

HTH.

 - Gustavo.

Aaron said:
 One of the nice things about mod_python is the req.write() function.
 
 Although I realize it's somewhat of an abuse to the http protocol, it's
 handy being able to periodically update the client browser with a status
 message for a long-running job.
 
 So handy in fact that I have a number of applications that rely fairly
 heavily on it as a means of keeping the client (person) happy instead of
 just showing them the default browser busy notification.
 
 There are a couple of workarounds, neither of which are ideal:
 1. Take them immediately to a secondary page, then submit the actual job
 automatically on that second page.
 2. Instead of using HTTP POST, use an HTTP Request Object (ie. Ajax).
 
 Both of them involve significantly more development effort than an
 equivalent req.write().
 
 Is there a way to emulate the periodic-write functionality in WSGI?
-- 
Gustavo Narea xri://=Gustavo.
| Tech blog: =Gustavo/(+blog)/tech  ~  About me: =Gustavo/about |


Re: [Web-SIG] Emulating req.write() in WSGI

2010-06-28 Thread P.J. Eby

At 01:01 PM 6/28/2010 -0600, Aaron Fransen wrote:

One of the nice things about mod_python is the req.write() function.

Although I realize it's somewhat of an abuse to the http protocol, 
it's handy being able to periodically update the client browser with 
a status message for a long-running job.


So handy in fact that I have a number of applications that rely 
fairly heavily on it as a means of keeping the client (person) happy 
instead of just showing them the default browser busy notification.


There are a couple of workarounds, neither of which are ideal:
1. Take them immediately to a secondary page, then submit the actual 
job automatically on that second page.

2. Instead of using HTTP POST, use an HTTP Request Object (ie. Ajax).

Both of them involve significantly more development effort than an 
equivalent req.write().


Is there a way to emulate the periodic-write functionality in WSGI?


Each string yielded (or passed to the write() callable returned by 
start_response) is supposed to be sent straight through to the client.


As long as your WSGI stack is actually conformant to the protocol, 
that's all you need to do.




Re: [Web-SIG] Emulating req.write() in WSGI

2010-06-28 Thread Aaron Fransen
On Mon, Jun 28, 2010 at 3:11 PM, P.J. Eby p...@telecommunity.com wrote:

 At 01:01 PM 6/28/2010 -0600, Aaron Fransen wrote:

 One of the nice things about mod_python is the req.write() function.

 Although I realize it's somewhat of an abuse to the http protocol, it's
 handy being able to periodically update the client browser with a status
 message for a long-running job.

 So handy in fact that I have a number of applications that rely fairly
 heavily on it as a means of keeping the client (person) happy instead of
 just showing them the default browser busy notification.

 There are a couple of workarounds, neither of which are ideal:
 1. Take them immediately to a secondary page, then submit the actual job
 automatically on that second page.
 2. Instead of using HTTP POST, use an HTTP Request Object (ie. Ajax).

 Both of them involve significantly more development effort than an
 equivalent req.write().

 Is there a way to emulate the periodic-write functionality in WSGI?


 Each string yielded (or passed to the write() callable returned by
 start_response) is supposed to be sent straight through to the client.

 As long as your WSGI stack is actually conformant to the protocol, that's
 all you need to do.


Using mod_wsgi on Apache doesn't seem to exhibit that behavior.

Experimentation with the write() functionality variously produces *only* the
helper text or only the final result page; it doesn't incrementally update
the user. This behaviour appears to depend on whether the Content-Length
header field is included.

Using yield has not produced better results either: it seems to produce the
yielded output and then, as far as what's presented to the browser, exit the
program completely (yet there are no errors in the log to speak of).

I'll experiment with yield some more to see if I can more sharply define
what's going on.


Re: [Web-SIG] Emulating req.write() in WSGI

2010-06-28 Thread P.J. Eby

At 03:43 PM 6/28/2010 -0600, Aaron Fransen wrote:

Using mod_wsgi on Apache doesn't seem to exhibit that behavior.


You may need WSGIOutputBuffering Off in your config; see:

http://code.google.com/p/modwsgi/wiki/ConfigurationDirectives#WSGIOutputBuffering

Another possibility is that you've got some middleware or something 
else buffering between your app and mod_wsgi, I suppose.
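
To show what buffering middleware can look like, here is a contrived
Python 3 sketch. Both wrapper names are hypothetical. The first defeats
streaming even if the server itself passes every chunk straight through;
the second does not.

```python
def buffering_middleware(app):
    """Anti-pattern for streaming: collects the whole body before returning it."""
    def wrapper(environ, start_response):
        # join() does not return until the application has finished.
        body = b"".join(app(environ, start_response))
        return [body]
    return wrapper


def passthrough_middleware(app):
    """Streaming-friendly: hands each chunk on as soon as the app yields it."""
    def wrapper(environ, start_response):
        result = app(environ, start_response)
        try:
            for chunk in result:
                yield chunk
        finally:
            # WSGI requires close() to be called on the iterable if it exists.
            if hasattr(result, "close"):
                result.close()
    return wrapper
```

Anything in the stack that joins, measures or rewrites the complete body
behaves like the first wrapper.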




Re: [Web-SIG] Emulating req.write() in WSGI

2010-06-28 Thread Graham Dumpleton
On 29 June 2010 05:01, Aaron Fransen aaron.fran...@gmail.com wrote:
 One of the nice things about mod_python is the req.write() function.

One thing I should warn you about req.write() in Apache is that for
streaming data as you seem to be using it, it will accumulate memory
against a request for each write call and that will not be reused,
albeit it will be released again at the end of the request.

The problem here isn't actually in mod_python but in the underlying
Apache ap_rwrite() call.

What this function does is that for each call to it, it creates what
is called a bucket to hold the data to be written. The memory for this
bucket is allocated from the per request memory pool each time. This
bucket is then passed down the Apache output filter chain and
eventually the data gets written out.

Now, because the code doesn't attempt to reuse the bucket, that memory
then remains unused, but still allocated against the memory pool, with
the memory pool only being destroyed at the end of the request.

The outcome of this is that if you had a long running request which
continually wrote out response data in small bits using req.write(),
for each call there is a small increase in amount of memory taken from
the per request memory pool with it not being reused. Thus if the
request were running for a very long time, you will see a gradual
increase in overall memory usage of the process. When the request
finishes, the memory is reclaimed and reused, but you have by then
already set the high ceiling on ongoing process memory in use.

Anyway, thought I should just warn you about this. In part this issue
may even be why mod_python got a reputation for memory bloat in some
situations. That is, the fundamental way of returning response data
could cause unnecessary increase in process size if called many times
for a request.
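
A toy model of the effect, in Python rather than C and with invented
numbers, may make the growth pattern easier to see. Nothing here is
Apache code: ToyRequestPool, toy_write and the 64 byte overhead are
assumptions made purely for illustration.

```python
class ToyRequestPool:
    """Toy stand-in for a per-request memory pool.

    Allocations are never handed back individually; everything is
    released only when the pool as a whole is destroyed.
    """

    def __init__(self):
        self.allocated = 0

    def alloc(self, size):
        self.allocated += size

    def destroy(self):
        self.allocated = 0


BUCKET_OVERHEAD = 64  # made-up bookkeeping cost per bucket, in bytes


def toy_write(pool, data):
    """Model of one write call: a fresh bucket is taken from the pool each time."""
    pool.alloc(BUCKET_OVERHEAD + len(data))


pool = ToyRequestPool()
for _ in range(100000):      # a long-running request writing small updates
    toy_write(pool, b"still working...\n")
peak = pool.allocated        # grows linearly with the number of writes
pool.destroy()               # end of request: everything is released at once
```

The point is only that the peak depends on the number of write calls made
during the request, not on how much data is outstanding at any moment.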

Graham