Re: Sending large, generated files

2009-04-15 Thread Graham Dumpleton



On Apr 16, 1:36 am, Rick Wagner  wrote:
> > From memory, file wrappers at django level, in order to work across
> > different hosting mechanisms supported, only allow a file name to be
> > supplied. At the WSGI level the file wrapper actually takes a file
> > like object. If you were doing this in raw WSGI, you could run your
> > tar ball creation as a separately exec'd pipeline and rather than
> > create a file in the file system, have tar output to the pipeline,
> > ie., use '-' instead of filename. The file object resulting from the
> > pipeline could then be used as input to the WSGI file wrapper object.
>
> > So, if this operation isn't somehow bound into needing Django itself,
> > and this is important to you, maybe you should create a separate
> > little WSGI application just for this purpose.
>
> > Actually, even if bound into needing Django you may still be able to
> > do it. Using mod_wsgi, you could even delegate the special WSGI
> > application to run in same process as Django and mount it at a URL
> > which appears within Django application. Because though you are side
> > stepping Django dispatch, you couldn't though have it be protected by
> > Django based form authentication.
>
> > Graham
>
> Hi,
>
> First, the FileWrapper class in django.core.servers.basehttp.py
> accepts file-like objects,

Okay, then I am confusing it with changes proposed as part of a ticket
which would allow a file response to be sent back using the optimised
methods provided by the hosting mechanism. For WSGI this is
wsgi.file_wrapper, but for mod_python the sendfile() function only
takes a file name. As such, the high level interface for that way of
returning a file could only take a file name and not a file-like
object.

The ticket is:

  http://code.djangoproject.com/ticket/2131

Graham

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-users@googlegroups.com
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en
-~--~~~~--~~--~--~---



Re: Sending large, generated files

2009-04-15 Thread Alex Loddengaard
Thanks for the responses, guys.

Regarding my deployment, I've been doing my development on my local machine
using manage.py.  I'll give this a go in WSGI and see what happens.

Graham, I totally agree that there are ways to get around my problem outside
of trying to send HTTP headers before the message body.  A simple way would
be to give the user a "please wait" page while the tarball is generated.
I'm mostly just lazy and don't want to do that, haha.  That said, thanks for
the suggestion.

Rick, the issue with your suggestion is that I need to actually create the
tarball, which takes time.  I'm trying to avoid having the user wait while
the tarball is created by giving them the download dialogue before
generation happens.  Using your BigTarFileWrapper assumes that the tarball
has been created already, at least from what I can tell.

So again, I'll give this a go in WSGI and see what happens.  In the
meantime, I suppose the empty gzipped file gets the job done.

Thanks again for all the responses.  Go Django!

Alex


Re: Sending large, generated files

2009-04-14 Thread Graham Dumpleton



On Apr 15, 7:49 am, Alex Loddengaard  wrote:
> I've found several messages on this list discussing ways to send large files
> in a HttpResponse.  One can use FileWrapper, or one can use a generator and
> yield chunks of the large file.  What about the case when the large file is
> generated at HTTP request time?  In this case, it would be annoying to have
> the user wait for the page to generate the large file and then stream the
> file.  Instead we would want a way to start the HTTP response (so that the
> user gets the download dialogue), generate the large file, and then stream
> the file.  Let's take the following example:
>
> from django.http import HttpResponse
>
> def create_tarball():
>   # Runs when the response is first iterated, before any bytes are yielded.
>   path = create_some_big_tarball()
>   fh = open(path, 'rb')  # binary mode: a tarball is not text
>   while True:
>     chunk = fh.read(1024 * 128)  # stream in 128 KB chunks
>     if not chunk:
>       break
>     yield chunk
>   fh.close()
>
> def sample_view(request):
>   response = HttpResponse(create_tarball(),
>                           mimetype='application/x-compressed')
>   response['Content-Disposition'] = "attachment;filename=mytarball.tar.gz"
>   return response
>
> The above example nearly accomplishes what we want, but it doesn't start the
> HTTP response before the tarball is created, hence making the user wait a
> long time before the download dialogue box shows up.  Let's try something
> like this (notice the addition of a noop yield):
>
> def create_tarball():
>   yield ''  # noop, intended to send the HTTP headers
>
>   path = create_some_big_tarball()
>   fh = open(path, 'rb')  # binary mode: a tarball is not text
>   while True:
>     chunk = fh.read(1024 * 128)  # stream in 128 KB chunks
>     if not chunk:
>       break
>     yield chunk
>   fh.close()
>
> def sample_view(request):
>   response = HttpResponse(create_tarball(),
>                           mimetype='application/x-compressed')
>   response['Content-Disposition'] = "attachment;filename=mytarball.tar.gz"
>   return response
>
> The issue with the above example is that the "yield ''" seems to be
> ignored.  HTTP headers are not sent before the tarball is created.
> Similarly, "yield ' '" and "yield None" don't work, because they corrupt the
> tarball (HttpResponse calls str() on the iterable items given to the
> HttpResponse constructor).  As a temporary solution, we're writing an empty
> gzip file in the first yield.  Our large tarball is gzipped, and since gzip
> files can be concatenated to one another, our hack seems to be working.
> In the above example, replace the first "yield ''" with:
>
>   # needs "import gzip" and "import StringIO" at module level
>   noop = StringIO.StringIO()
>   empty = gzip.GzipFile(mode='w', fileobj=noop)
>   empty.write("")
>   empty.close()
>   yield noop.getvalue()  # a complete, empty gzip member
>
> I'm wondering if there is a better way to accomplish this?  I don't quite
> understand why HTTP responses are written to stdout.  Possibly orthogonal to
> that, it seems like, in theory, yielding an empty value in the generator
> should work, because a flush is called after the HTTP headers are written.
> Any ideas, either on how to solve this problem with the Django API, or on
> why Django doesn't send HTTP headers on a "yield ''"?
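
The gzip concatenation workaround quoted above can be checked in isolation. What follows is a minimal sketch in Python 3 (the thread itself predates it), assuming only the standard gzip and io modules. The names empty_gzip_member and stream_with_noop are hypothetical, and the payload is a stand-in for the real tarball. It shows that an empty gzip member followed by a normal gzip stream still decompresses to the original data.

```python
import gzip
import io


def empty_gzip_member():
    """Return the bytes of a valid gzip stream that contains no data."""
    buf = io.BytesIO()
    with gzip.GzipFile(mode="wb", fileobj=buf) as gz:
        gz.write(b"")
    return buf.getvalue()


def stream_with_noop(payload):
    """Yield an empty gzip member first, then the compressed payload."""
    # The first chunk is non-empty bytes, so a server has something to send.
    yield empty_gzip_member()
    # Stand-in for the slow step that builds the real .tar.gz.
    yield gzip.compress(payload)


if __name__ == "__main__":
    payload = b"pretend this is a tar archive"
    data = b"".join(stream_with_noop(payload))
    # GzipFile reads through concatenated members, including empty ones.
    restored = gzip.GzipFile(fileobj=io.BytesIO(data)).read()
    print(restored == payload)
```

The trick only applies when the real download is itself a gzip stream; for other formats a leading extra chunk would corrupt the file, as described above.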

From memory, file wrappers at django level, in order to work across
different hosting mechanisms supported, only allow a file name to be
supplied. At the WSGI level the file wrapper actually takes a file-like
object. If you were doing this in raw WSGI, you could run your
tarball creation as a separately exec'd pipeline and, rather than
create a file in the file system, have tar output to the pipeline,
i.e., use '-' instead of a filename. The file object resulting from the
pipeline could then be used as input to the WSGI file wrapper object.
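
The pipeline idea above might look roughly like the following. This is only a sketch in Python 3 under stated assumptions, not tested under mod_wsgi: make_pipeline_app is a hypothetical helper, the tar command and the /srv/export path in the final comment are placeholders, and wsgiref.util.FileWrapper is used as a fallback when the server does not provide wsgi.file_wrapper.

```python
import subprocess
from wsgiref.util import FileWrapper

BLOCK_SIZE = 128 * 1024  # same chunk size as the generator examples


def make_pipeline_app(command):
    """Return a WSGI app that streams the stdout of `command` to the client."""

    def app(environ, start_response):
        # Start the command; its output is read from a pipe, not a temp file.
        proc = subprocess.Popen(command, stdout=subprocess.PIPE)
        start_response("200 OK", [
            ("Content-Type", "application/x-gzip"),
            ("Content-Disposition", "attachment; filename=mytarball.tar.gz"),
        ])
        # Use the server's optimised wrapper when present, else the stdlib one.
        wrapper = environ.get("wsgi.file_wrapper", FileWrapper)
        return wrapper(proc.stdout, BLOCK_SIZE)

    return app


# Hypothetical deployment; '-f -' makes tar write the archive to stdout:
# application = make_pipeline_app(["tar", "-czf", "-", "/srv/export"])
```

Because the response starts as soon as tar produces output, the client does not wait for the whole archive to be built. A real deployment would also want to reap the child process and check its exit status, which this sketch leaves out.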

So, if this operation isn't somehow bound into needing Django itself,
and this is important to you, maybe you should create a separate
little WSGI application just for this purpose.

Actually, even if bound into needing Django you may still be able to
do it. Using mod_wsgi, you could even delegate the special WSGI
application to run in the same process as Django and mount it at a URL
which appears within the Django application. Because you would be
sidestepping Django dispatch, though, you couldn't have it protected by
Django based form authentication.

Graham



Re: Sending large, generated files

2009-04-14 Thread Ryan Kelly

> Looking at the code for the wsgi handler, it does call start_response()
> before processing any of the response body - my understanding is that
> this should cause the headers to be sent immediately.

Humph, looks like my understanding isn't so hot - PEP 333 explicitly
forbids start_response from sending the headers.  Nevertheless, I'd be
interested to hear if different deployment options affect the behaviour
of your examples.
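
The PEP 333 rule referred to above can be illustrated with a toy gateway. This is a minimal sketch in Python 3, not the code of any real server: run_app and slow_app are hypothetical names, and the gateway only records the order of events instead of writing to a socket. It follows the rule that headers stored by start_response() are transmitted with the first non-empty chunk, which is why a "yield ''" would not push them out.

```python
def run_app(app, environ):
    """Toy gateway: return the ordered list of events a PEP 333 server would see."""
    events = []
    state = {"status": None, "sent": False}

    def send(data):
        if not data:
            # Empty chunks do not trigger transmission of the headers.
            events.append("empty chunk ignored")
            return
        if not state["sent"]:
            events.append("headers sent: " + state["status"])
            state["sent"] = True
        events.append("body: %d bytes" % len(data))

    def start_response(status, headers, exc_info=None):
        # Only store the status here; nothing goes to the client yet.
        state["status"] = status
        events.append("start_response called")
        return send

    result = app(environ, start_response)
    try:
        for chunk in result:
            send(chunk)
    finally:
        close = getattr(result, "close", None)
        if close is not None:
            close()
    if not state["sent"]:
        # An entirely empty body still gets its headers at the end.
        events.append("headers sent: " + state["status"])
    return events


def slow_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "application/x-gzip")])

    def body():
        yield b""               # the attempted no-op
        yield b"tarball-bytes"  # stand-in for the slowly generated archive

    return body()


if __name__ == "__main__":
    for event in run_app(slow_app, {}):
        print(event)
```

Real servers differ in how strictly they follow this rule, so the behaviour under a particular deployment still needs to be checked there.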

 Cheers,

   Ryan

-- 
Ryan Kelly
http://www.rfk.id.au  |  This message is digitally signed. Please visit
r...@rfk.id.au|  http://www.rfk.id.au/ramblings/gpg/ for details





Re: Sending large, generated files

2009-04-14 Thread Ryan Kelly
> I've found several messages on this list discussing ways to send large
> files in a HttpResponse.  One can use FileWrapper, or one can use a
> generator and yield chunks of the large file.  What about the case
> when the large file is generated at HTTP request time?  In this case,
> it would be annoying to have the user wait for the page to generate
> the large file and then stream the file.  Instead we would want a way
> to start the HTTP response (so that the user gets the download
> dialogue), generate the large file, and then stream the file.
>
> [..snip..]
>
> The issue with the above example is that the "yield ''" seems to be
> ignored.  HTTP headers are not sent before the tarball is created.


Out of curiosity, what deployment method are you using?

Looking at the code for the wsgi handler, it does call start_response()
before processing any of the response body - my understanding is that
this should cause the headers to be sent immediately.  Have you tried
this under mod_wsgi?

 Cheers,

   Ryan


-- 
Ryan Kelly
http://www.rfk.id.au  |  This message is digitally signed. Please visit
r...@rfk.id.au|  http://www.rfk.id.au/ramblings/gpg/ for details


