#24242: compress_sequence creates a larger content than no compression
-------------------------------+--------------------
     Reporter:  dracos         |      Owner:  nobody
         Type:  Uncategorized  |     Status:  new
    Component:  Uncategorized  |    Version:  1.7
     Severity:  Normal         |   Keywords:
 Triage Stage:  Unreviewed     |  Has patch:  0
Easy pickings:  0              |      UI/UX:  0
-------------------------------+--------------------
 I have a view that is 825157 bytes without gzipping, 35751 bytes gzipped
 as an HttpResponse, but 1010920 bytes gzipped as a StreamingHttpResponse.

 The output of the script given below with some noddy data is:
 {{{
 Normal string: 38890
 compress_string: 18539
 compress_sequence: 89567
 compress_sequence, no flush: 18539
 }}}

 Noddy content perhaps, but in actual use I'm very much wanting to use
 StreamingHttpResponse on very large JSON responses (then it uses 200Mb
 memory with iterables throughout, as opposed to 2Gb with more standard
 code/HttpResponse), and the Python json package flushes after each key,
 value, and punctuation in-between. Having the gzip middleware flush
 similarly creates a much larger output than no gzipping, with the figures
 given at the top. It would seem that many uses of StreamingHttpResponse
 will similarly be flushing regularly at the content level.

 #7581 does mention "some penalty in compression performance" but producing
 a worse-than-none performance seems a bit much :) Should compress_sequence
 bunch up flushes to provide at least some level of compression? Or if it's
 a StreamingHttpResponse, should it not bother gzipping?

 {{{#!python
 from django.utils.text import *
 from django.utils.six.moves import map

 # Identical to django.utils.text.compress_sequence
 # but with the flush line commented out
 def compress_sequence_without_flush(sequence):
     buf = StreamingBuffer()
     zfile = GzipFile(mode='wb', compresslevel=6, fileobj=buf)
     # Output headers...
     yield buf.read()
     for item in sequence:
         zfile.write(item)
         # zfile.flush()
         yield buf.read()
     zfile.close()
     yield buf.read()

 class Example(object):
     def __iter__(self):
         return map(str, xrange(10000))

 e = Example()
 print 'Normal string:', len(b''.join(e))
 print 'compress_string:', len(compress_string(b''.join(e)))
 print 'compress_sequence:', len(b''.join(compress_sequence(e)))
 print 'compress_sequence, no flush:', len(b''.join(compress_sequence_without_flush(e)))
 }}}

--
Ticket URL: <https://code.djangoproject.com/ticket/24242>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
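To make the "bunch up flushes" question in the ticket concrete, here is a rough sketch of one way it might look. It is not Django's implementation: `compress_sequence_batched` and its `min_flush_bytes` parameter are hypothetical names, the 8192-byte threshold is an arbitrary choice, and it uses `zlib.compressobj` with `wbits=31` (gzip container) in place of `GzipFile` plus `StreamingBuffer` so that it runs standalone on Python 3.

```python
import gzip
import zlib


def compress_sequence_batched(sequence, min_flush_bytes=8192, level=6):
    """Gzip an iterable of byte chunks, sync-flushing only occasionally.

    Hypothetical helper: a sync flush is issued only after at least
    min_flush_bytes of input have accumulated since the previous flush,
    rather than after every item.
    """
    # wbits=31 asks zlib to emit a gzip header and trailer.
    compressor = zlib.compressobj(level, zlib.DEFLATED, 31)
    pending = 0  # input bytes consumed since the last sync flush
    for item in sequence:
        data = compressor.compress(item)
        pending += len(item)
        if pending >= min_flush_bytes:
            # Force out everything buffered so far so the client can decode it.
            data += compressor.flush(zlib.Z_SYNC_FLUSH)
            pending = 0
        if data:
            yield data
    # Final flush finishes the stream and writes the gzip trailer.
    yield compressor.flush()


# Same noddy data as the script in the ticket, as bytes.
chunks = [str(i).encode("ascii") for i in range(10000)]
plain = b"".join(chunks)
every_item = b"".join(compress_sequence_batched(chunks, min_flush_bytes=1))
batched = b"".join(compress_sequence_batched(chunks))

assert gzip.decompress(batched) == plain
print("Normal string:", len(plain))
print("flush after every item:", len(every_item))
print("flush every 8192 input bytes:", len(batched))
```

Setting `min_flush_bytes=1` flushes after every item, which mimics the behaviour reported above. The trade-off of a larger threshold is latency: the client may not receive any output until that much input has been consumed.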