On Fri, Jun 22, 2007 at 09:47:57AM +1000, Malcolm Tredinnick wrote:
> 
> On Thu, 2007-06-21 at 10:16 -0700, Brian Harring wrote:
> > On Thu, Jun 21, 2007 at 08:33:05PM +1000, Malcolm Tredinnick wrote:
> > > 
> > > Brian,
> > > 
> > > On Thu, 2007-06-14 at 12:23 -0700, Brian Harring wrote:
> > > > Just filed ticket 4565, which basically converts template rendering 
> > > > away from "build a string within this node of subnode results, return 
> > > > it, wash rinse repeat", and into "yield each subnode chunk, and my 
> > > > data as it's available".
> > > [...]
> > > > * Far less abusive on memory; usual 'spanish inquisition' heavy 
> > > > test test (term coined by mtreddinick, but it works), reduction from 
> > > > 84m to 71m for usage at the end of rendering. What I find rather 
> > > > interesting about that reduction is that the resultant page is 6.5 
> > > > mb; the extra 7mb I'm actually not sure where the reduction comes 
> > > > from (suspect there is some copy idiocy somewhere forcing a new 
> > > > string)- alternatively, may just be intermediate data hanging around, 
> > > > since I've been working with this code in one form or another for 3 
> > > > months now, and still haven't figured out the 'why' for that diff.
> > > 
> > > When you were doing this memory checking, what server configuration were
> > > you using?
> > 
> > lighttpd/fcgi; the gain there is from what directly comes out of the 
> > templating subsystem (more specifically, HttpResponse gets an iterable 
> > instead of a string).  Gain there *should* be constant regardless of 
> > setup- exemption to this would be if your configuration touches 
> > .content, or prefers to collect the full response and send that 
> > instead of buffering- if your setup kills the iteration further up 
> > (and it's not middleware related), would be interested.
> > 
> > Nudge on ticket #4625 btw ;)
> 
> You can safely assume I'm aware of it. #4625 is way down the list,
> though; there are bigger problems. I'm currently trying to work through
> all the unexpected side-effects of the first change (in between being
> really busy with Real Life). If we don't come up with a solution to the
> hanging database connections today -- see relevant thread on the
> django-users list -- I'm going to back out the whole iterator change for
> a while until we can fix things properly.

Up to you regarding punting it; very least, I don't much like breaking 
folks setups, so a temp punt won't get complaints out of me.

Re: the new thread, not on that ml, although looks of it ought to be.  
Also, in the future if it *looks* like I had a hand in whatever 
borkage is afoot, don't hesitate to cc me- as you said, real life 
intervenes, so I'm not watching everything.  Rather have a few false 
positives email wise, then be missing something I may have broke.


> The reason for the original question was part of that. We might have to
> give back all the memory savings, since we need to know when use of the
> database is definitely finished. One solution that remains
> WSGI-compliant is to have the __call__ method return a single item
> iterable of all the content  which is basically a return to the old
> behaviour (slightly fewer string copies, but not many). That is how the
> mod_python handler works already ,in effect, so that's easier to fix.
> Alternatively, we have to introduce much tighter coupling between
> templates, backend and HTTP layer to manage the database connections
> correctly, which would be a nasty thing to have to do.
>
> 
> Making the change at the WSGI handler level is more efficient in terms
> of network traffic, since WSGI servers are not permitted to buffer
> iterated writes. Lots of short writes can be occurring with a compliant
> server at the moment. But if we do that, it's not clear it's enough of a
> win over the original code to warrant it.

Keep in mind I'm having a helluva time finding *exactly* whats 
required of a handler beyond __iter__/close, but a potential approach

from itertools import chain

class IterConsumedSignal(object):

  def __iter__(self):
    return self

  def next(self):
    dispatcher.send(signal=signals.request_finished)
    raise StopIteration()

class ResponseWrapper(object):

  def __init__(self, response):
    self._response = response
    if hasattr(self._response, 'close'):
      self.close = self._response.close

  def __iter__(self):
    assert self._response is not None
    try:
      if isinstance(self._response, basestring):
        return chain([self._response], IterConsumedSignal())
      return chain(self._response, IterConsumedSignal())
    finally:
      self._response = None


then for __call__...

def __call__(self, environ, start_response):
  from django.conf import settings

  # Set up middleware if needed. We couldn't do this earlier, because
  # settings weren't available.
  if self._request_middleware is None:
    self.load_middleware()

  dispatcher.send(signal=signals.request_started)
  try:
    request = WSGIRequest(environ)
    response = self.get_response(request)

    # Apply response middleware
    for middleware_method in self._response_middleware:
      response = middleware_method(request, response)

  except:
    dispatcher.send(signal=signals.request_finished)
    raise

  try:
    status_text = STATUS_CODE_TEXT[response.status_code]
  except KeyError:
    status_text = 'UNKNOWN STATUS CODE'
  status = '%s %s' % (response.status_code, status_text)
  response_headers = response.headers.items()
  for c in response.cookies.values():
     response_headers.append(('Set-Cookie', c.output(header='')))
  start_response(status, response_headers)
  return ResponseWrapper(response)

Failing with this approach is that if anything else is accessed on the 
response, it's no longer accessible- the ResponseWrapper specifically 
lacks a __getattr__ proxying to avoid the potential of the underlying 
response being inadvertantly consumed.

Thoughts?  Upshot of this approach is that it's fairly loosely 
coupled, and is still able to preserve the iteration behaviour.  
Downside is that if there are other attrs/methods that should be 
exposed, have to add the proxying to ResponseWrapper.

~harring

Attachment: pgpEjb278h4x4.pgp
Description: PGP signature

Reply via email to