Re: A base compression middleware class

2021-06-15 Thread Tom Carrick
It's also worth noting that the security issue mentioned in the docs[1]
makes it unsuitable for many (most?) Django projects, unfortunately, and
brotli is also susceptible to this attack.

It's probably not something I'd be keen on encouraging the use of, though I
also think the idea in principle is a good one.

[1]
https://docs.djangoproject.com/en/3.2/ref/middleware/#django.middleware.gzip.GZipMiddleware

Tom

On Tue, 15 Jun 2021 at 02:16, Curtis Maloney  wrote:

> Hi Illia,
>
> I like the idea here, and your design looks sensible at first blush, but I
> feel conflicted.
>
> As much as I like the idea of allowing more flexibility and
> future-proofing this middleware,
> AIUI the gzip middleware is generally discouraged, as it's typically more
> efficient to have your web server do this work.
>
> I'd be more tempted to deprecate GzipMiddleware in core, and move this
> feature into a 3rd party app.
>
> An additional hook for your interface: a function to test if this
> compressor should be used on this response. For instance, the gzip
> middleware has a minimum size below which it's deemed not worth the effort
> to try to compress.
>
> --
> Curtis
>
>
> On Tue, 15 Jun 2021, at 04:49, Illia Volochii wrote:
>
> Hi all,
>
> There is `GZipMiddleware` that compresses response bodies using gzip.
> https://github.com/django/django/blob/main/django/middleware/gzip.py
>
> But there are other algorithms supported by browsers (e.g., Brotli).
> At the moment, if somebody wants to add support for any of them in a
> Django project, one has to create a new middleware class redefining
> all the logic processing responses.
>
> I propose simplifying the process by creating
> `BaseCompressionMiddleware`. Its subclasses would just need to define
> a tuple of supported compressors.
>
> The tuple elements could be instances of such a data class:
> ```
> @dataclass
> class Compressor:
> name: str
> compress_string_func: Callable[[bytes], bytes]
> compress_sequence_func: Optional[
> Callable[[Iterable[bytes]], Iterator[bytes]]
> ] = None
>
> def is_accepted_by(self, accept_encoding):
> return bool(re.search(rf'\b{self.name}\b', accept_encoding))
> ```
>
>
> And the class could select a compressor using such code:
> ```
> @classmethod
> def select_compressor(cls, request, response):
> ae = request.META.get('HTTP_ACCEPT_ENCODING', '')
> for compressor in cls.compressors:
> if (
> compressor.is_accepted_by(ae)
> and (not response.streaming or
> compressor.compress_sequence_func)
> ):
> return compressor
> ```
>
>
> Note, in the example `compress_string_func` is required and
> `compress_sequence_func` is optional. I think that
> `compress_string_func` would usually be a library function, and
> `compress_sequence_func` would require some custom logic. And we
> should not force users to define the latter.
>
> `GZipMiddleware` could be changed to
> ```
> gzip_compressor = Compressor(
> name='gzip',
> compress_string_func=compress_string,
> compress_sequence_func=compress_sequence,
> )
>
> class GZipMiddleware(BaseCompressionMiddleware):
> compressors = (gzip_compressor,)
> ```
>
>
> And someone wanting to support both Brotli and gzip could define:
> ```
> import brotli
> from django.middleware.gzip import gzip_compressor
>
> brotli_compressor = Compressor(
> name='br',
> compress_string_func=brotli.compress,
> )
>
> class CompressionMiddleware(BaseCompressionMiddleware)
> compressors = (brotli_compressor, gzip_compressor)
> ```
>
>
> Please let me know what you think about such an enhancement.
> I will be happy to create a pull request if this is accepted.
>
> Thanks,
> Illia
>
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "Django developers (Contributions to Django itself)" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to django-developers+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/django-developers/49535d7e-758c-4acf-a9da-831ccfa1d17bn%40googlegroups.com
> 
> .
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "Django developers (Contributions to Django itself)" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to django-developers+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/django-developers/c4cef215-7cae-4c6b-a9b2-7b4785b3a9bf%40www.fastmail.com
> 
> .
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django 

Re: A base compression middleware class

2021-06-14 Thread Curtis Maloney
Hi Illia,

I like the idea here, and your design looks sensible at first blush, but I feel 
conflicted.

As much as I like the idea of allowing more flexibility and future-proofing 
this middleware,
AIUI the gzip middleware is generally discouraged, as it's typically more 
efficient to have your web server do this work.

I'd be more tempted to deprecate GzipMiddleware in core, and move this feature 
into a 3rd party app.

An additional hook for your interface: a function to test if this compressor 
should be used on this response. For instance, the gzip middleware has a 
minimum size below which it's deemed not worth the effort to try to compress.

--
Curtis


On Tue, 15 Jun 2021, at 04:49, Illia Volochii wrote:
> Hi all,
> 
> There is `GZipMiddleware` that compresses response bodies using gzip.
> https://github.com/django/django/blob/main/django/middleware/gzip.py
> 
> But there are other algorithms supported by browsers (e.g., Brotli).
> At the moment, if somebody wants to add support for any of them in a
> Django project, one has to create a new middleware class redefining
> all the logic processing responses.
> 
> I propose simplifying the process by creating
> `BaseCompressionMiddleware`. Its subclasses would just need to define
> a tuple of supported compressors.
> 
> The tuple elements could be instances of such a data class:
> ```
> @dataclass
> class Compressor:
> name: str
> compress_string_func: Callable[[bytes], bytes]
> compress_sequence_func: Optional[
> Callable[[Iterable[bytes]], Iterator[bytes]]
> ] = None
> 
> def is_accepted_by(self, accept_encoding):
> return bool(re.search(rf'\b{self.name}\b', accept_encoding))
> ```
> 
> 
> And the class could select a compressor using such code:
> ```
> @classmethod
> def select_compressor(cls, request, response):
> ae = request.META.get('HTTP_ACCEPT_ENCODING', '')
> for compressor in cls.compressors:
> if (
> compressor.is_accepted_by(ae)
> and (not response.streaming or compressor.compress_sequence_func)
> ):
> return compressor
> ```
> 
> 
> Note, in the example `compress_string_func` is required and
> `compress_sequence_func` is optional. I think that
> `compress_string_func` would usually be a library function, and
> `compress_sequence_func` would require some custom logic. And we
> should not force users to define the latter.
> 
> `GZipMiddleware` could be changed to
> ```
> gzip_compressor = Compressor(
> name='gzip',
> compress_string_func=compress_string,
> compress_sequence_func=compress_sequence,
> )
> 
> class GZipMiddleware(BaseCompressionMiddleware):
> compressors = (gzip_compressor,)
> ```
> 
> 
> And someone wanting to support both Brotli and gzip could define:
> ```
> import brotli
> from django.middleware.gzip import gzip_compressor
> 
> brotli_compressor = Compressor(
> name='br',
> compress_string_func=brotli.compress,
> )
> 
> class CompressionMiddleware(BaseCompressionMiddleware)
> compressors = (brotli_compressor, gzip_compressor)
> ```
> 
> 
> Please let me know what you think about such an enhancement.
> I will be happy to create a pull request if this is accepted.
> 
> Thanks,
> Illia
> 
> 
> 

> -- 
> You received this message because you are subscribed to the Google Groups 
> "Django developers (Contributions to Django itself)" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to django-developers+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/django-developers/49535d7e-758c-4acf-a9da-831ccfa1d17bn%40googlegroups.com
>  
> .

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/c4cef215-7cae-4c6b-a9b2-7b4785b3a9bf%40www.fastmail.com.


A base compression middleware class

2021-06-14 Thread Illia Volochii
Hi all,

There is `GZipMiddleware` that compresses response bodies using gzip.
https://github.com/django/django/blob/main/django/middleware/gzip.py

But there are other algorithms supported by browsers (e.g., Brotli).
At the moment, if somebody wants to add support for any of them in a
Django project, one has to create a new middleware class redefining
all the logic processing responses.

I propose simplifying the process by creating
`BaseCompressionMiddleware`. Its subclasses would just need to define
a tuple of supported compressors.

The tuple elements could be instances of such a data class:
```
@dataclass
class Compressor:
name: str
compress_string_func: Callable[[bytes], bytes]
compress_sequence_func: Optional[
Callable[[Iterable[bytes]], Iterator[bytes]]
] = None

def is_accepted_by(self, accept_encoding):
return bool(re.search(rf'\b{self.name}\b', accept_encoding))
```


And the class could select a compressor using such code:
```
@classmethod
def select_compressor(cls, request, response):
ae = request.META.get('HTTP_ACCEPT_ENCODING', '')
for compressor in cls.compressors:
if (
compressor.is_accepted_by(ae)
and (not response.streaming or 
compressor.compress_sequence_func)
):
return compressor
```


Note, in the example `compress_string_func` is required and
`compress_sequence_func` is optional. I think that
`compress_string_func` would usually be a library function, and
`compress_sequence_func` would require some custom logic. And we
should not force users to define the latter.

`GZipMiddleware` could be changed to
```
gzip_compressor = Compressor(
name='gzip',
compress_string_func=compress_string,
compress_sequence_func=compress_sequence,
)

class GZipMiddleware(BaseCompressionMiddleware):
compressors = (gzip_compressor,)
```


And someone wanting to support both Brotli and gzip could define:
```
import brotli
from django.middleware.gzip import gzip_compressor

brotli_compressor = Compressor(
name='br',
compress_string_func=brotli.compress,
)

class CompressionMiddleware(BaseCompressionMiddleware)
compressors = (brotli_compressor, gzip_compressor)
```


Please let me know what you think about such an enhancement.
I will be happy to create a pull request if this is accepted.

Thanks,
Illia

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/49535d7e-758c-4acf-a9da-831ccfa1d17bn%40googlegroups.com.