Re: Adding a bulk_save method to models

2018-09-15 Thread charettes
I also dislike bulk_save() for the same reasons.

I feel like bulk_update makes the most sense given it has a signature 
similar to bulk_create, where an iterable of model instances must be passed, 
and we're really just performing an update.

To me, bulk_update is to update what bulk_create is to create; 
bulk_update_fields feels too verbose and breaks the symmetry 
of bulk_create/create for update.

Cheers,
Simon

On Saturday, September 15, 2018 at 6:15:39 PM UTC-4, Adam Johnson wrote:
>
> Bikeshed time.
>
> I'm also against bulk_save for the same reason that it implies save().
>
> bulk_update sounds okay to me, update() is indeed already a 'bulk' 
> operation but it could be claimed this is doing a 'bulk' amount of update 
> operations.
>
> bulk_update_fields also sounds good, the longer method name is probably 
> balanced by the lower frequency of use.
>
>
> On Sat, 15 Sep 2018 at 15:01, Tom Forbes > 
> wrote:
>
>> My original reasoning was that Queryset.update() already bulk updates 
>> rows, so the bulk prefix seems a bit redundant here (how do you bulk 
>> something that already does something in bulk?). .save() however 
>> operates on a single object, so the bulk prefix seems more appropriate and 
>> easier to understand.
>>
>> I agree bulk_save() may not be the best name as people might expect 
>> signals to be sent, but are there any suggestions other than 
>> bulk_update()? Maybe something more accurate, like bulk_update_fields()? 
>> Or bulk_save_fields()?
>>
>>
>>
>>
>> On 14 September 2018 at 16:32:05, Raphael Michel (ma...@raphaelmichel.de 
>> ) wrote:
>>
>> Hi, 
>>
>> I'd be very careful about calling it bulk_save(), since calling 
>> it something with save() very strongly suggests that it calls pre_save 
>> or post_save signals. 
>>
>> Best 
>> Raphael 
>>
>>
>> On Fri, 14 Sep 2018 07:56:38 -0700 (PDT), 
>> Tim Graham > wrote: 
>>
>> > 
>> > 
>> > I wanted to ask about naming of the new method. Currently the 
>> > proposed name is "QuerySet.bulk_save()" but I think it's a bit 
>> > confusing since it uses QuerySet.update(), not Model.save(). It works 
>> > similarly to QuerySet.bulk_update() from 
>> > https://github.com/aykut/django-bulk-update but the arguments are a 
>> > bit different. 
>> > 
>> > 
>> > Josh's comment on the PR: "Since this only works for instances with 
>> > a pk, do you think that bulk_update would be a better name? The 
>> > regular save() method can either create or update depending on pk 
>> > status which may confuse users here." 
>> > 
>> > And Tom's reply: "I considered this, but queryset.update() is the 
>> > best 'bulk update' method. I didn't want to confuse the two, this is 
>> > more about saving multiple model fields with multiple differing 
>> > values, hence bulk_save. Open to changing it though." 
>> > 
>> > 
>> > On Tuesday, January 23, 2018 at 7:38:18 AM UTC-5, Neal Todd wrote: 
>> > > 
>> > > Hi Tom, 
>> > > 
>> > > That's great, should be a helpful addition to core. Will follow the 
>> > > ticket and PR. 
>> > > 
>> > > Neal 
>> > > 
>> > > (Apologies - I hadn't spotted that you'd already referenced 
>> > > django-bulk-update in your ticket when I left my drive-by comment!) 
>> > > 
>> > > On Monday, January 22, 2018 at 7:41:11 PM UTC, Tom Forbes wrote: 
>> > >> 
>> > >> Hey Neal, 
>> > >> 
>> > >> Thank you very much for pointing that out, I actually found out 
>> > >> about this package as I was researching the ticket - I wish I had 
>> > >> known about this a couple of years ago as it would have saved me a 
>> > >> fair bit of CPU and brain time! 
>> > >> 
>> > >> I think that module is a good starting point and proves that it’s 
>> > >> possible, however I think the implementation can be improved upon 
>> > >> if we bring it inside core. I worked on a small PR to add this 
>> > >> <
>> https://github.com/django/django/pull/9606/files#diff-5b0dda5eb9a242c15879dc9cd2121379R473>
>>  
>>
>> > >> and the implementation was refreshingly simple. It still needs 
>> > >> docs, a couple more tests and to fix a strange error with sqlite 
>> > >> on Windows, but overall it seems like a lot of gain for a small 
>> > >> amount of code. 
>> > >> 
>> > >> Tom 
>> > >> 
>> > >> 
>> > >> On 22 January 2018 at 15:10:53, Neal Todd (ne...@torchbox.com) 
>> > >> wrote: 
>> > >> 
>> > >> Hi Tom, 
>> > >> 
>> > >> A built-in bulk save that's more flexible than update would 
>> > >> certainly be nice. Just in case you haven't come across it though, 
>> > >> there is a package called django-bulk-update: 
>> > >> 
>> > >> https://github.com/aykut/django-bulk-update 
>> > >> 
>> > >> I've found it very useful on a number of occasions where update 
>> > >> isn't quite enough but the loop-edit-save pattern is too slow to 
>> > >> be convenient. 
>> > >> 
>> > >> Probably some useful things in there when considering the API and 
>> > >> approach. 
>> > >> 
>> > >> Cheers, Neal 
>> > >> 
>> > >> On Friday, January 19, 2018 at 5:49:48 PM UTC, Tom Forbes 

Re: Adding a bulk_save method to models

2018-09-15 Thread Adam Johnson
Bikeshed time.

I'm also against bulk_save for the same reason that it implies save().

bulk_update sounds okay to me, update() is indeed already a 'bulk'
operation but it could be claimed this is doing a 'bulk' amount of update
operations.

bulk_update_fields also sounds good, the longer method name is probably
balanced by the lower frequency of use.


On Sat, 15 Sep 2018 at 15:01, Tom Forbes  wrote:

> My original reasoning was that Queryset.update() already bulk updates
> rows, so the bulk prefix seems a bit redundant here (how do you bulk
> something that already does something in bulk?). .save() however operates
> on a single object, so the bulk prefix seems more appropriate and easier to
> understand.
>
> I agree bulk_save() may not be the best name as people might expect
> signals to be sent, but are there any suggestions other than bulk_update()?
> Maybe something more accurate, like bulk_update_fields()? Or
> bulk_save_fields()?
>
>
>
>
> On 14 September 2018 at 16:32:05, Raphael Michel (m...@raphaelmichel.de)
> wrote:
>
> Hi,
>
> I'd be very careful about calling it bulk_save(), since calling
> it something with save() very strongly suggests that it calls pre_save
> or post_save signals.
>
> Best
> Raphael
>
>
> Am Fri, 14 Sep 2018 07:56:38 -0700 (PDT)
> schrieb Tim Graham :
>
> >
> >
> > I wanted to ask about naming of the new method. Currently the
> > proposed name is "QuerySet.bulk_save()" but I think it's a bit
> > confusing since it uses QuerySet.update(), not Model.save(). It works
> > similarly to QuerySet.bulk_update() from
> > https://github.com/aykut/django-bulk-update but the arguments are a
> > bit different.
> >
> >
> > Josh's comment on the PR: "Since this only works for instances with
> > a pk, do you think that bulk_update would be a better name? The
> > regular save() method can either create or update depending on pk
> > status which may confuse users here."
> >
> > And Tom's reply: "I considered this, but queryset.update() is the
> > best 'bulk update' method. I didn't want to confuse the two, this is
> > more about saving multiple model fields with multiple differing
> > values, hence bulk_save. Open to changing it though."
> >
> >
> > On Tuesday, January 23, 2018 at 7:38:18 AM UTC-5, Neal Todd wrote:
> > >
> > > Hi Tom,
> > >
> > > That's great, should be a helpful addition to core. Will follow the
> > > ticket and PR.
> > >
> > > Neal
> > >
> > > (Apologies - I hadn't spotted that you'd already referenced
> > > django-bulk-update in your ticket when I left my drive-by comment!)
> > >
> > > On Monday, January 22, 2018 at 7:41:11 PM UTC, Tom Forbes wrote:
> > >>
> > >> Hey Neal,
> > >>
> > >> Thank you very much for pointing that out, I actually found out
> > >> about this package as I was researching the ticket - I wish I had
> > >> known about this a couple of years ago as it would have saved me a
> > >> fair bit of CPU and brain time!
> > >>
> > >> I think that module is a good starting point and proves that it’s
> > >> possible, however I think the implementation can be improved upon
> > >> if we bring it inside core. I worked on a small PR to add this
> > >> <https://github.com/django/django/pull/9606/files#diff-5b0dda5eb9a242c15879dc9cd2121379R473>
>
> > >> and the implementation was refreshingly simple. It still needs
> > >> docs, a couple more tests and to fix a strange error with sqlite
> > >> on Windows, but overall it seems like a lot of gain for a small
> > >> amount of code.
> > >>
> > >> Tom
> > >>
> > >>
> > >> On 22 January 2018 at 15:10:53, Neal Todd (ne...@torchbox.com)
> > >> wrote:
> > >>
> > >> Hi Tom,
> > >>
> > >> A built-in bulk save that's more flexible than update would
> > >> certainly be nice. Just in case you haven't come across it though,
> > >> there is a package called django-bulk-update:
> > >>
> > >> https://github.com/aykut/django-bulk-update
> > >>
> > >> I've found it very useful on a number of occasions where update
> > >> isn't quite enough but the loop-edit-save pattern is too slow to
> > >> be convenient.
> > >>
> > >> Probably some useful things in there when considering the API and
> > >> approach.
> > >>
> > >> Cheers, Neal
> > >>
> > >> On Friday, January 19, 2018 at 5:49:48 PM UTC, Tom Forbes wrote:
> > >>>
> > >>> Hello all,
> > >>>
> > >>> I’d love for some feedback on an idea I’ve been mulling around
> > >>> lately, namely adding a bulk_save method to Django.
> > >>>
> > >>> A somewhat common pattern for some applications is to loop over a
> > >>> list of models, set an attribute and call save on them. This
> > >>> unfortunately can issue a lot of database queries which can be a
> > >>> significant slowdown. You can work around this by using
> > >>> ‘.update()’ in some cases, but not all.
> > >>>
> > >>> It seems it would be possible to use a CASE statement in SQL to
> > >>> handle bulk-updating many rows with differing values. For example:
> > >>>
> > >>> SomeModel.objects.filter(id__in=[1,2]).update(
> > >>> 

Re: Adding a bulk_save method to models

2018-09-15 Thread Tom Forbes
My original reasoning was that Queryset.update() already bulk updates rows,
so the bulk prefix seems a bit redundant here (how do you bulk something
that already does something in bulk?). .save() however operates on a single
object, so the bulk prefix seems more appropriate and easier to understand.

I agree bulk_save() may not be the best name as people might expect
signals to be sent, but are there any suggestions other than bulk_update()?
Maybe something more accurate, like bulk_update_fields()? Or
bulk_save_fields()?




On 14 September 2018 at 16:32:05, Raphael Michel (m...@raphaelmichel.de)
wrote:

Hi,

I'd be very careful about calling it bulk_save(), since calling
it something with save() very strongly suggests that it calls pre_save
or post_save signals.

Best
Raphael


Am Fri, 14 Sep 2018 07:56:38 -0700 (PDT)
schrieb Tim Graham :

>
>
> I wanted to ask about naming of the new method. Currently the
> proposed name is "QuerySet.bulk_save()" but I think it's a bit
> confusing since it uses QuerySet.update(), not Model.save(). It works
> similarly to QuerySet.bulk_update() from
> https://github.com/aykut/django-bulk-update but the arguments are a
> bit different.
>
>
> Josh's comment on the PR: "Since this only works for instances with
> a pk, do you think that bulk_update would be a better name? The
> regular save() method can either create or update depending on pk
> status which may confuse users here."
>
> And Tom's reply: "I considered this, but queryset.update() is the
> best 'bulk update' method. I didn't want to confuse the two, this is
> more about saving multiple model fields with multiple differing
> values, hence bulk_save. Open to changing it though."
>
>
> On Tuesday, January 23, 2018 at 7:38:18 AM UTC-5, Neal Todd wrote:
> >
> > Hi Tom,
> >
> > That's great, should be a helpful addition to core. Will follow the
> > ticket and PR.
> >
> > Neal
> >
> > (Apologies - I hadn't spotted that you'd already referenced
> > django-bulk-update in your ticket when I left my drive-by comment!)
> >
> > On Monday, January 22, 2018 at 7:41:11 PM UTC, Tom Forbes wrote:
> >>
> >> Hey Neal,
> >>
> >> Thank you very much for pointing that out, I actually found out
> >> about this package as I was researching the ticket - I wish I had
> >> known about this a couple of years ago as it would have saved me a
> >> fair bit of CPU and brain time!
> >>
> >> I think that module is a good starting point and proves that it’s
> >> possible, however I think the implementation can be improved upon
> >> if we bring it inside core. I worked on a small PR to add this
> >> <https://github.com/django/django/pull/9606/files#diff-5b0dda5eb9a242c15879dc9cd2121379R473>

> >> and the implementation was refreshingly simple. It still needs
> >> docs, a couple more tests and to fix a strange error with sqlite
> >> on Windows, but overall it seems like a lot of gain for a small
> >> amount of code.
> >>
> >> Tom
> >>
> >>
> >> On 22 January 2018 at 15:10:53, Neal Todd (ne...@torchbox.com)
> >> wrote:
> >>
> >> Hi Tom,
> >>
> >> A built-in bulk save that's more flexible than update would
> >> certainly be nice. Just in case you haven't come across it though,
> >> there is a package called django-bulk-update:
> >>
> >> https://github.com/aykut/django-bulk-update
> >>
> >> I've found it very useful on a number of occasions where update
> >> isn't quite enough but the loop-edit-save pattern is too slow to
> >> be convenient.
> >>
> >> Probably some useful things in there when considering the API and
> >> approach.
> >>
> >> Cheers, Neal
> >>
> >> On Friday, January 19, 2018 at 5:49:48 PM UTC, Tom Forbes wrote:
> >>>
> >>> Hello all,
> >>>
> >>> I’d love for some feedback on an idea I’ve been mulling around
> >>> lately, namely adding a bulk_save method to Django.
> >>>
> >>> A somewhat common pattern for some applications is to loop over a
> >>> list of models, set an attribute and call save on them. This
> >>> unfortunately can issue a lot of database queries which can be a
> >>> significant slowdown. You can work around this by using
> >>> ‘.update()’ in some cases, but not all.
> >>>
> >>> It seems it would be possible to use a CASE statement in SQL to
> >>> handle bulk-updating many rows with differing values. For example:
> >>>
> >>> SomeModel.objects.filter(id__in=[1,2]).update(
> >>> some_field=Case(
> >>> When(id=1, then=Value('Field value for ID=1')),
> >>> When(id=2, then=Value('Field value for ID=2'))
> >>> )
> >>> )
> >>>
> >>> I’ve made a ticket for this here:
> >>> https://code.djangoproject.com/ticket/29037
> >>>
> >>> I managed to get a 70x performance increase using this technique
> >>> on a fairly large table, and it seems it could be applicable to
> >>> many projects just like bulk_create.
> >>>
> >>> The downsides to this is that it can produce very large SQL
> >>> statements when updating many rows (I had MySQL complain about a
> >>> 10MB statement once), but this can be overcome with batching and
> >>> other 

Re: RFC 7932 (a.k.a. brotli) support

2018-09-15 Thread Johannes Hoppe
Hi Curtis,

very good remarks. It would make sense to provide the best possible preset 
for such a middleware. That could even be content-type sensitive.
I would, however, not add any settings to override the preset. I believe if 
someone wants to meddle with those things, they should inherit and override.

But even with a preset, I would follow content negotiation tactics, where 
the request can define its preference over the content encoding.
https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.3
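For illustration, that negotiation can be sketched in a few lines: parse the client's Accept-Encoding q-values and pick the highest-quality coding the server supports. This is a simplified, hypothetical helper (not existing Django API), assuming the RFC's q-value semantics:

```python
def pick_encoding(accept_encoding, supported):
    """Choose the client's most-preferred encoding among those we support.

    Simplified content negotiation: each listed coding may carry a
    quality value, e.g. "br;q=1.0, gzip;q=0.8"; higher q wins, and
    q=0 means "not acceptable".
    """
    prefs = {}
    for part in accept_encoding.split(","):
        coding, _, params = part.strip().partition(";")
        q = 1.0
        if params.strip().startswith("q="):
            try:
                q = float(params.strip()[2:])
            except ValueError:
                q = 0.0
        prefs[coding.strip().lower()] = q
    candidates = [(prefs[c], c) for c in supported if prefs.get(c, 0) > 0]
    return max(candidates)[1] if candidates else None

print(pick_encoding("br;q=1.0, gzip;q=0.8", ["gzip", "br"]))  # br
```

A None result would mean falling back to an identity (uncompressed) response.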

Best
-Joe

On Friday, September 14, 2018 at 12:53:34 PM UTC+2, Curtis Maloney wrote:
>
> On 09/14/2018 08:24 PM, Aymeric Augustin wrote: 
> > Hello, 
> > 
> > I would like to be sure that Brotli is actually faster. In my 
> > experience, while decompression is fast, compression can be extremely 
> slow. 
>
> https://blogs.akamai.com/2016/02/understanding-brotlis-potential.html 
>
> It's primarily tuned for fast decompression, making it especially good 
> for offline compression (i.e. compression of static assets on deploy) 
>
> At lower levels it can get as good compression as gzip, but much faster, 
> or similar speed for better results. IIRC the general "wisdom" is that 
> brotli at level 4 is typically faster than gzip, and gives better results. 
>
> I like the general idea of an "extensible" or plugin driven 
> Content-Encoding handler, though it would require at least some 
> preferencing engine so you can influence which algorithm is chosen by 
> (a) browser supported, (b) local preference, and (c) content type. 
>
> -- 
> Curtis 
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/3673361a-dc59-4d27-8050-1ab2cae5d11c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: RFC 7932 (a.k.a. brotli) support

2018-09-15 Thread Johannes Hoppe
Good point, Aymeric,

it will probably not always be a good idea. I would require some kind of 
configuration to define which response content-type should be encoded by 
which algorithm.
It might also be interesting if you could use pre-compressed full-page 
caching, which I think does not currently work. This could, however, be a 
good way to reduce CPU time.

Curtis also makes a good point on what kind of configuration should be 
supported by such a middleware.

Best
-Joe

On Friday, September 14, 2018 at 12:25:12 PM UTC+2, Aymeric Augustin wrote:
>
> Hello,
>
> I would like to be sure that Brotli is actually faster. In my experience, 
> while decompression is fast, compression can be extremely slow.
>
> This is fine for static assets compressed by a build pipeline or at least 
> when the compressed version is cached. It would be a problem for on-the-fly 
> compression of dynamic content, which is what we're talking about here.
>
> I expect there's a setting for trading off compression time vs. 
> compression ratio. It should be set appropriately. It isn't worth spending 
> a lot more CPU to save a tiny bit of data transfer. The result could be 
> slower, especially for users with fast connections.
>
> I would like to see a quick performance benchmark of gzip as currently 
> implemented in Django vs. brotli with a setting you recommend, to see the 
> effect on compression time (CPU use) and compression ratio.
>
> Best regards,
>
> -- 
> Aymeric.
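The kind of measurement Aymeric asks for can be sketched with the stdlib alone; `brotli.compress` would slot in the same way when that optional package is installed. The payload here is synthetic, so the numbers only illustrate the time-vs-ratio trade-off he describes:

```python
import gzip
import time

# Repetitive HTML-ish payload, ~128 KB; real responses will differ.
payload = (
    b"<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</p>\n"
    * 2000
)

def measure(name, fn):
    """Time one compression call and report the output/input size ratio."""
    start = time.perf_counter()
    out = fn(payload)
    elapsed = time.perf_counter() - start
    ratio = len(out) / len(payload)
    print(f"{name}: {elapsed * 1000:.1f} ms, ratio {ratio:.3f}")
    return elapsed, ratio

# Compare Django's current algorithm (gzip) at a fast and a slow setting.
fast = measure("gzip -1", lambda d: gzip.compress(d, compresslevel=1))
best = measure("gzip -9", lambda d: gzip.compress(d, compresslevel=9))
```

The same harness run against a recommended brotli level would give the compression-time and ratio comparison requested above.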
>
> On Fri, 14 Sep 2018 at 10:01, Johannes Hoppe  > wrote:
>
>> Hi there,
>>
>> we all know the wonderful GZip middleware that allows response 
>> compression, which is great especially if you are serving large responses.
>>
>> Now gzip is only one of many content encodings supported by modern 
>> browsers. The coolest kid on the block seems to be brotli right now.
>> It is both faster and smaller in most cases than gzip and has been 
>> standardized in https://www.ietf.org/rfc/rfc7932.txt
>>
>> Of course there is already a package that adds brotli support to Django:
>> https://pypi.org/project/django-brotli/
>>
>> But I would propose to refactor the GZip middleware into something like 
>> "CompressionMiddleware" and add support for both GZip and Brotli
>> in the same middleware.
>>
>> I have the feeling something as basic as response compression should be 
>> built into Django. Of course I would not add a hard dependency on brotli,
>> but support every method from `Accept-Encoding` that is actually installed.
>>
>> What do you guys think? Is this worth a little refactoring?
>>
>> Best
>> -Joe
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "Django developers (Contributions to Django itself)" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to django-develop...@googlegroups.com .
>> To post to this group, send email to django-d...@googlegroups.com 
>> .
>> Visit this group at https://groups.google.com/group/django-developers.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/django-developers/29cf5142-fa82-4d74-9d8f-5fb0c8be34d0%40googlegroups.com
>>  
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
> -- 
> Aymeric.
>



Re: RFC 7932 (a.k.a. brotli) support

2018-09-15 Thread Johannes Hoppe
Hi Adam,

no, I would not take any of his code. I would certainly contact him as a 
reviewer or co-author of a possible patch.

In general I would like to consolidate the two middlewares into a single 
response compression middleware that supports multiple algorithms. All 
except gzip would require users to install additional dependencies, but I 
believe this to be acceptable, same as you need to install Pillow if you 
want image support in Django.
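A consolidated middleware along those lines might keep a registry of encoders and only advertise the ones whose libraries import successfully. A rough pure-Python sketch (the registry, function names, and selection rule are hypothetical, not a proposed final API):

```python
import gzip

# Encoders in server-preference order; optional ones register only if
# their library imports, mirroring the "install it to get it" approach.
ENCODERS = {}

def register(name):
    def wrap(fn):
        ENCODERS[name] = fn
        return fn
    return wrap

@register("gzip")
def encode_gzip(data: bytes) -> bytes:
    return gzip.compress(data)

try:
    import brotli  # optional dependency, like Pillow for images

    @register("br")
    def encode_br(data: bytes) -> bytes:
        return brotli.compress(data)
except ImportError:
    pass

def compress_response(body: bytes, accept_encoding: str):
    """Pick the first registered encoder the client accepts (server order).

    Returns (encoding_name, encoded_body); (None, body) means identity,
    i.e. the client accepts none of our registered encodings.
    """
    accepted = {p.strip().split(";")[0] for p in accept_encoding.split(",")}
    for name, encoder in ENCODERS.items():
        if name in accepted:
            return name, encoder(body)
    return None, body
```

A production version would also honor q-values rather than plain server order, and skip already-encoded or too-small responses as the GZip middleware does today.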

Best
-Joe

On Friday, September 14, 2018 at 11:31:43 AM UTC+2, Adam Johnson wrote:
>
> Sounds like a good idea to me.
>
> Would you be looking at merging any code from django-brotli, and have you 
> been in contact with its creator and license holder Vasek Dohnal?
>
> And what kind of methods around Accept-Encoding are you imagining?
>
> On Fri, 14 Sep 2018 at 09:01, Johannes Hoppe  > wrote:
>
>> Hi there,
>>
>> we all know the wonderful GZip middleware that allows response 
>> compression, which is great especially if you are serving large responses.
>>
>> Now gzip is only one of many content encodings supported by modern 
>> browsers. The coolest kid on the block seems to be brotli right now.
>> It is both faster and smaller in most cases than gzip and has been 
>> standardized in https://www.ietf.org/rfc/rfc7932.txt
>>
>> Of course there is already a package that adds brotli support to Django:
>> https://pypi.org/project/django-brotli/
>>
>> But I would propose to refactor the GZip middleware into something like 
>> "CompressionMiddleware" and add support for both GZip and Brotli
>> in the same middleware.
>>
>> I have the feeling something as basic as response compression should be 
>> built into Django. Of course I would not add a hard dependency on brotli,
>> but support every method from `Accept-Encoding` that is actually installed.
>>
>> What do you guys think? Is this worth a little refactoring?
>>
>> Best
>> -Joe
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "Django developers (Contributions to Django itself)" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to django-develop...@googlegroups.com .
>> To post to this group, send email to django-d...@googlegroups.com 
>> .
>> Visit this group at https://groups.google.com/group/django-developers.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/django-developers/29cf5142-fa82-4d74-9d8f-5fb0c8be34d0%40googlegroups.com
>>  
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
> -- 
> Adam
>
