Re: Automatic prefetching in querysets

2017-09-12 Thread Gordon Wrigley
This received some positive responses, so to help move the conversation
along I have created a ticket and pull request.

https://code.djangoproject.com/ticket/28586
https://github.com/django/django/pull/9064

Regards G

On Tue, Aug 15, 2017 at 10:44 AM, Gordon Wrigley wrote:

> I'd like to discuss automatic prefetching in querysets. Specifically
> automatically doing prefetch_related where needed without the user having
> to request it.
>
> For context consider these three snippets using the Question & Choice
> models from the tutorial, when there are 100 questions each with 5 choices
> for a total of 500 choices.
>
> Default
> for choice in Choice.objects.all():
>     print(choice.question.question_text, ':', choice.choice_text)
> 501 db queries, fetches 500 choice rows and 500 question rows from the DB
>
> Prefetch_related
> for choice in Choice.objects.prefetch_related('question'):
>     print(choice.question.question_text, ':', choice.choice_text)
> 2 db queries, fetches 500 choice rows and 100 question rows from the DB
>
> Select_related
> for choice in Choice.objects.select_related('question'):
>     print(choice.question.question_text, ':', choice.choice_text)
> 1 db query, fetches 500 choice rows and 500 question rows from the DB
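
The query and row counts quoted above can be reproduced without Django. What follows is only a sketch using the standard library's sqlite3 module: the table layout, the CountingDB wrapper and the three strategy functions are made up for illustration, and the SQL only approximates what the ORM would issue in each case.

```python
import sqlite3


def build_db(questions=100, choices_per_question=5):
    """Create an in-memory database shaped like the tutorial's two models."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE question (id INTEGER PRIMARY KEY, question_text TEXT)")
    conn.execute(
        "CREATE TABLE choice (id INTEGER PRIMARY KEY, "
        "question_id INTEGER REFERENCES question(id), choice_text TEXT)"
    )
    for q in range(1, questions + 1):
        conn.execute("INSERT INTO question VALUES (?, ?)", (q, f"Question {q}"))
        conn.executemany(
            "INSERT INTO choice (question_id, choice_text) VALUES (?, ?)",
            [(q, f"Choice {n}") for n in range(choices_per_question)],
        )
    return conn


class CountingDB:
    """Hypothetical wrapper that counts every query sent to the database."""

    def __init__(self, conn):
        self.conn = conn
        self.queries = 0

    def query(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()


def default(db):
    """One query for the choices, then one query per choice for its question."""
    choices = db.query("SELECT id, question_id, choice_text FROM choice")
    question_rows = 0
    for _id, question_id, _text in choices:
        rows = db.query(
            "SELECT id, question_text FROM question WHERE id = ?", (question_id,)
        )
        question_rows += len(rows)
    return question_rows


def prefetch_related(db):
    """One query for the choices, then one IN query for the distinct questions."""
    choices = db.query("SELECT id, question_id, choice_text FROM choice")
    ids = sorted({question_id for _id, question_id, _text in choices})
    marks = ",".join("?" * len(ids))
    rows = db.query(
        f"SELECT id, question_text FROM question WHERE id IN ({marks})", ids
    )
    return len(rows)


def select_related(db):
    """A single JOIN; every joined row carries its own copy of the question."""
    rows = db.query(
        "SELECT c.id, c.choice_text, q.id, q.question_text "
        "FROM choice AS c JOIN question AS q ON q.id = c.question_id"
    )
    return len(rows)


def measure(strategy):
    """Return (number of queries, number of question rows fetched)."""
    db = CountingDB(build_db())
    question_rows = strategy(db)
    return db.queries, question_rows


if __name__ == "__main__":
    for strategy in (default, prefetch_related, select_related):
        print(strategy.__name__, measure(strategy))
```

Each strategy also fetches the same 500 choice rows; only the number of queries and the number of question rows differ.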
>
> I've included select_related for completeness; I'm not going to propose
> changing anything about its use. There are places where it is the best
> choice and in those places it will still be up to the user to request it. I
> will note that anywhere select_related is optimal prefetch_related is still
> better than the default and leave it at that.
>
> The 'Default' example above is a classic example of the N+1 query problem,
> a problem that is widespread in Django apps.
> This pattern of queries is what new users produce because they don't know
> enough about the database and / or ORM to do otherwise.
> Experienced users will also often produce this because it's not always
> obvious what fields will and won't be used and subsequently what should be
> prefetched.
> Additionally that list will change over time. A small change to a template
> to display an extra field can result in a denial of service on your DB due
> to a missing prefetch.
> Identifying missing prefetches is fiddly, time-consuming and error-prone.
> Tools like django-perf-rec (which I was involved in creating) and nplusone
> exist in part to flag missing prefetches introduced by changed code.
> Finally libraries like Django Rest Framework and the Admin will also
> produce queries like this because it's very difficult for them to know what
> needs prefetching without being explicitly told by an experienced user.
>
> As hinted at the top I'd like to propose changing Django so the default
> code behaves like the prefetch_related code.
> Longer term I think this should be the default behaviour but obviously it
> needs to be proved first so for now I'd suggest a new queryset function
> that enables this behaviour.
>
> I have a proof of concept of this mechanism that I've used successfully in
> production. I'm not posting it yet because I'd like to focus on desired
> behavior rather than implementation details. But in summary, what it does
> is when accessing a missing field on a model, rather than fetching it just
> for that instance, it runs a prefetch_related query to fetch it for all
> peer instances that were fetched in the same queryset. So in the example
> above it prefetches all Questions in one query.
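
The mechanism described above can be sketched in plain Python. This is not the proof of concept or the code in the pull request: the Choice class, the fetch_questions stand-in for a database query, the load_choices helper and its auto_prefetch flag are all hypothetical names chosen for illustration. The idea shown is that every instance from one result set keeps a weak reference to its peers, and the first access to a missing relation loads it for all peers that are still alive.

```python
import weakref

# Stand-in for the question table: 100 questions keyed by id.
QUESTIONS = {qid: f"Question {qid}" for qid in range(1, 101)}

# Every simulated database query appends the ids it asked for.
query_log = []


def fetch_questions(ids):
    """Hypothetical stand-in for a single query fetching questions by id."""
    query_log.append(sorted(ids))
    return {qid: QUESTIONS[qid] for qid in ids}


class Choice:
    def __init__(self, question_id, choice_text):
        self.question_id = question_id
        self.choice_text = choice_text
        self._peers = None      # WeakSet of instances from the same result set
        self._question = None   # filled in lazily on first access

    @property
    def question(self):
        if self._question is None:
            # Load for every live peer that is still missing the relation,
            # or only for this instance when there are no peers.
            targets = [p for p in (self._peers or [self]) if p._question is None]
            found = fetch_questions({p.question_id for p in targets})
            for peer in targets:
                peer._question = found[peer.question_id]
        return self._question


def load_choices(auto_prefetch=True):
    """Simulate evaluating a queryset of 500 choices (5 per question)."""
    choices = [Choice(qid, f"Choice {n}") for qid in QUESTIONS for n in range(5)]
    if auto_prefetch:
        # Weak references, so peers that were filtered out and dropped can be
        # garbage collected and are then skipped by later loads.
        peers = weakref.WeakSet(choices)
        for choice in choices:
            choice._peers = peers
    return choices
```

With auto_prefetch=True, iterating over the result and touching choice.question on each item triggers one simulated query for all 100 questions; with auto_prefetch=False each of the 500 accesses triggers its own query, as in the default case.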
>
> This might seem like a risky thing to do but I'd argue that it really
> isn't.
> The only time this isn't superior to the default case is when you are
> post-filtering the queryset results in Python.
> Even in that case it's only inferior if you started with a large number of
> results, filtered out basically all of them and the code is structured so
> that the filtered ones aren't garbage collected.
> To cover this rare case the automatic prefetching can easily be disabled
> on a per-queryset or per-object basis, leaving us with a rare downside that
> can easily be manually resolved in exchange for a significant general
> improvement.
>
> In practice this thing is almost magical to work with. Unless you already
> have extensive and tightly maintained prefetches everywhere you get an
> immediate boost to virtually everything that touches the database, often
> knocking orders of magnitude off page load times.
>
> If an agreement can be reached on pursuing this then I'm happy to put in
> the work to productize the proof of concept.
>
> --
> You received this message because you are subscribed to a topic in the
> Google Groups "Django developers (Contributions to Django itself)" group.
> To unsubscribe from this topic, visit https://groups.google.com/d/
> topic/django-developers/EplZGj-ejvg/unsubscribe.
> To unsubscribe 

Re: Add support for multiple file fields

2017-09-12 Thread Johannes
Ok,

My patch is ready. Would anyone care to review it so it can get in before the
deadline this weekend?
https://github.com/django/django/pull/9011

Thanks
-Joe

--
Johannes Hoppe

Fon: +49 331 2812 9869 1
Fax: +49 331 2812 9869 9

www.johanneshoppe.com

Lennéstr. 19
14469 Potsdam

USt-IdNr.: DE284754038

On 2. Sep 2017, 13:19 +0200, Johannes Hoppe wrote:
> OK, I drafted an implementation for django.forms.FileField to support 
> `multiple` as an argument.
> https://github.com/django/django/pull/9011
>
> I would appreciate some feedback. If you like the design, I’ll try to add 
> documentation and better tests tomorrow.
>
> Thanks
> -Joe :)
>
> On 2. Sep 2017, 11:10 +0200, Melvyn Sopacua wrote:
> > On Sat, Sep 2, 2017 at 10:37 AM, Adam Johnson wrote:
> >
> > > ArchiveField sounds a bit too specific for Django core. The most common
> > > case for uploading multiple files would probably be to access them
> > > individually, which it would prevent.
> >
> > Yeah, was on the fence on that one myself. The use cases are very domain
> > specific, the obvious one being a backup application, but also design specs
> > or legal documents that site staff download and never actually view
> > individually on site. The upside of the field is that you have control of
> > the archive format and/or can encrypt the archive with a user-specific key
> > before sending.
> >
> > I researched it for an upcoming project and all the hooks are in place to
> > create such a model and you're right, it goes beyond "batteries included".
> > --
> > Melvyn Sopacua

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/cbfa52ad-3fb4-443c-9465-5a365272e41e%40Spark.
For more options, visit https://groups.google.com/d/optout.


Re: Having a MongoDB connector for Django

2017-09-12 Thread Anssi Kääriäinen
My 2 cents on this...

I don't think it would be an unachievable goal to write a MongoDB backend for 
Django's ORM. However, it wouldn't support any relational features, and would 
likely also need to skip support for some other common features; for example, 
AutoField is actually hard to support on MongoDB. The Django ecosystem is very 
much written on the assumption that the underlying datastore is relational, so 
even if you had a MongoDB backend, you couldn't use it with most contrib 
models, for example.

Even if you can't use the backend with the full ecosystem, such a backend might 
be very useful for some use cases. My guess is that the most common case would 
be using MongoDB as an additional data store alongside your relational database.

I believe that most users requesting a MongoDB backend would actually want to 
see a backend which is a drop-in replacement for the SQL backends. 
Unfortunately, due to the differences in optimal database schema design between 
relational and document-oriented data stores, this is both a bad idea and 
almost impossible to implement.

 - Anssi
