Re: Adding support for importing models through the app registry while it is loading

2016-09-30 Thread Aymeric Augustin
Hello,

Quick follow-up: I just updated my PR according to this discussion:
https://github.com/django/django/pull/6547

While I was there, I made get_user_model() usable at import time.
We’ll need feedback from actual projects to know it that works.

-- 
Aymeric.

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/F2DB4ACF-1B5D-4B2B-A1D5-BDCAFE519183%40polytechnique.org.
For more options, visit https://groups.google.com/d/optout.


Re: Adding support for importing models through the app registry while it is loading

2016-09-06 Thread Carl Meyer
On 09/06/2016 12:55 PM, Aymeric Augustin wrote:
> Hi Carl,
> 
> Thanks for the feedback!

Of course! Thanks for working on things :-)

> ...
> The change I’m proposing doesn’t introduce random bugs. If models are
> missing reverse relations, that will be deterministic.

+1

>> Is it possible
>> to make it so that even the model meta introspection APIs (and of course
>> also any attempt to query the db) fail quickly, loudly, and with a clear
>> error message if the app registry is not yet fully populated? If so,
>> then I think there's little risk to adding this API (and in fact I think
>> we could even make it the default behavior of `get_model`).
> 
> If not checking for models_ready was the default for get_model(), then I’d
> expect it to be the default for get_models() as well, but that doesn’t make
> sense. get_models() really needs all models to be imported.
> 
> My initial patch added a new method, `import_model`, because the operation is
> quite different from the app registry's perspective. Instead of just looking
> up the model, we try to import it. Even though the end result feels the same,
> the context isn't the same.

Yes, this makes sense. It suggests another possible name for the kwarg,
`force_import=True`?

...
> Without looking at the code base, I can’t say. It might help if you’re doing
> swappable-model-style dynamic imports. It won’t help if you’re importing
> models directly or indirectly from your applications’s __init__.py, which is
> a more common problem.

Yes, I think it's likely this wouldn't help much.

> I think #21719 would have led there as well but it turned out to be
> surprisingly hard -- in the realm of "not sure I'd get anywhere even if I
> spent two weeks full-time on it”.

I still think that's the right approach (and should be possible) in
theory, but I trust your estimation of complexity and am currently short
of two-week-plus blocks of free time :-)

Carl

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/7d1c5071-b1c5-6fdb-e12c-7d7ebf3cb852%40oddbird.net.
For more options, visit https://groups.google.com/d/optout.


signature.asc
Description: OpenPGP digital signature


Re: Adding support for importing models through the app registry while it is loading

2016-09-06 Thread Aymeric Augustin
Hi Carl,

Thanks for the feedback!

> On 06 Sep 2016, at 19:39, Carl Meyer  wrote:
> 
> 1) A kwarg to `get_model` seems fine, but I don't like the vague FUD of
> `unsafe=True` (if it's really "not safe" we shouldn't add it / make it
> public at all). How about something more specific like
> `require_ready=False`?

Agreed. Actually I thought there must be a better name but didn’t have a
better idea and then I forgot to mention it, the email was quite long already.

> 2) I think a key question here is the nature and clarity of the problems
> that arise if you try to make use of an un-ready model class. If the
> failure mode here is the reintroduction of unpredictable heisenbugs
> where certain related fields are quietly not present because the model
> on the other end happens to not have been imported yet, I think that's a
> strong argument for refraining from introducing this API.

In Django 1.7+, I’m pretty sure the import sequence is fully deterministic for
a given version of a project.

get_wsgi_application() runs django.setup() first thing. Not user code has been
imported at that point.

django.setup() accesses settings, which can import user code but shouldn’t —
it’s super easy to create import loops if it does. In general settings don't
import much. They might be able to import models but that's pathological.
Just don't do that.

Then apps.populate() imports apps and models in the order of INSTALLED_APPS.
This is fully deterministic, even with multi-threading: the first thread that
reaches populate() takes a lock and other threads wait until that thread has
finished.

At this point all models have been imported according in a fully deterministic
sequence (assuming non-pathological code e.g. `if random() > 0.5: import foo`)

I believe the next step is to import URLs. I didn't check if that happens
immediately or when the first request is received. Some non-determinism might
happen here, especially  with per-request urlconfs, but let's focus on models
and leave that question for another day :-)

The change I’m proposing doesn’t introduce random bugs. If models are
missing reverse relations, that will be deterministic.

> Is it possible
> to make it so that even the model meta introspection APIs (and of course
> also any attempt to query the db) fail quickly, loudly, and with a clear
> error message if the app registry is not yet fully populated? If so,
> then I think there's little risk to adding this API (and in fact I think
> we could even make it the default behavior of `get_model`).

If not checking for models_ready was the default for get_model(), then I’d
expect it to be the default for get_models() as well, but that doesn’t make
sense. get_models() really needs all models to be imported.

My initial patch added a new method, `import_model`, because the operation is
quite different from the app registry's perspective. Instead of just looking
up the model, we try to import it. Even though the end result feels the same,
the context isn't the same.

Also, from the "separation of concerns" angle, I'm not eager to create
coupling between the model meta API and the app registry.

So, while I get the appeal of not adding a somewhat scary kwarg, I still think
it's good to have users declare that they're treading into pre-`models_ready`
territory.

We could relax this constraint later by ignoring the require_ready argument,
at the cost of some code churn in third-party apps that started using it.

> As for whether it's needed in real use, the only feedback I can offer at
> the moment is that the import-order enforcement that kicked in with the
> removal of app-loading deprecation shims in Django 1.9 is the primary
> reason that Instagram has so far upgraded only as far as Django 1.8;

BRB finding a patch of sand to bury my head in...

> fixing the import-order issues appeared to require too much upheaval and
> would have delayed the upgrade too much. I haven't had time to look
> closely at those issues so I can't yet offer more detailed feedback on
> how much the availability of this particular API would help.

Without looking at the code base, I can’t say. It might help if you’re doing
swappable-model-style dynamic imports. It won’t help if you’re importing
models directly or indirectly from your applications’s __init__.py, which is
a more common problem.

For the record this is tracked in #24312 but that ticket neither describes the
exact problem nor possible ways to tackle it.

I think #21719 would have led there as well but it turned out to be
surprisingly hard -- in the realm of "not sure I'd get anywhere even if I
spent two weeks full-time on it”.

-- 
Aymeric.


-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com

Re: Adding support for importing models through the app registry while it is loading

2016-09-06 Thread Carl Meyer
Hi Aymeric,

On 09/04/2016 01:03 PM, Aymeric Augustin wrote:
> Hello,
> 
> Since Django 1.7, `get_user_model()` cannot be called at import time.
> Django contains a lot of methods that start with `UserModel =
> get_user_model()` while it would be more natural to declare it as a
> global variable in the module. This trips up users and the reason why
> it’s forbidden isn’t obvious.
...
> However, in many cases, it doesn’t matter whether the model is fully
> constructed or not. In the use case I described in my introduction,
> the code only needs a reference to the model class at import time;
> that reference won’t be used until run time, after the app-loading
> process has completed. (Performing SQL queries through the ORM at
> import time doesn’t work anyway.)
> 
> For this reason, I think it would make sense to add an API similar to
> `apps.get_models()`, but that would work while app-loading is still
> in progress. It would make no guarantees about the functionality of
> the class it returns until app-loading has completed.
> 
> I implemented a proof of concept here:
> https://github.com/django/django/pull/6547.

A few thoughts:

1) A kwarg to `get_model` seems fine, but I don't like the vague FUD of
`unsafe=True` (if it's really "not safe" we shouldn't add it / make it
public at all). How about something more specific like
`require_ready=False`?

2) I think a key question here is the nature and clarity of the problems
that arise if you try to make use of an un-ready model class. If the
failure mode here is the reintroduction of unpredictable heisenbugs
where certain related fields are quietly not present because the model
on the other end happens to not have been imported yet, I think that's a
strong argument for refraining from introducing this API. Is it possible
to make it so that even the model meta introspection APIs (and of course
also any attempt to query the db) fail quickly, loudly, and with a clear
error message if the app registry is not yet fully populated? If so,
then I think there's little risk to adding this API (and in fact I think
we could even make it the default behavior of `get_model`).

As for whether it's needed in real use, the only feedback I can offer at
the moment is that the import-order enforcement that kicked in with the
removal of app-loading deprecation shims in Django 1.9 is the primary
reason that Instagram has so far upgraded only as far as Django 1.8;
fixing the import-order issues appeared to require too much upheaval and
would have delayed the upgrade too much. I haven't had time to look
closely at those issues so I can't yet offer more detailed feedback on
how much the availability of this particular API would help.

Carl

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/eabf51d0-712e-32f1-d8e4-320ddf7681ad%40oddbird.net.
For more options, visit https://groups.google.com/d/optout.


signature.asc
Description: OpenPGP digital signature


Adding support for importing models through the app registry while it is loading

2016-09-04 Thread Aymeric Augustin
Hello,

Since Django 1.7, `get_user_model()` cannot be called at import time. Django 
contains a lot of methods that start with `UserModel = get_user_model()` while 
it would be more natural to declare it as a global variable in the module. This 
trips up users and the reason why it’s forbidden isn’t obvious.


It's the consequence of two design decisions (which I’m responsible for) in the 
app-loading refactor.

1. Before Django 1.7, the sequence in which models were loaded could depend on 
which views were hit in what order after a WSGI process started. That allowed 
interesting failures mode. For example, a project could randomly fail to load, 
only in production, because of a circular import error. Throw multithreading 
into the mix and it could get really exciting. To solve this, the new 
app-loading system added an explicit `django.setup()` step that imports apps 
and models in a deterministic order and doesn’t let code interact with the app 
registry until it’s fully loaded.

2. Django must ensure that relationships between models are set up correctly. 
Specifically it must guarantee that reverse accessors are added to each model 
targeted by a foreign key. This guarantee can only be provided once all other 
models have been imported. There’s been a long string of bugs in this area 
around 2008 (if memory serves). The new app-loading system took this 
requirement into account at the design stage and, thanks to the explicit setup, 
is structurally more robust against such bugs.

Because of 2. apps.get_models() raises an exception before all models are 
imported rather than return an incomplete model. Because of 1. Django gives 
very little control on what happens up to this point.


However, in many cases, it doesn’t matter whether the model is fully 
constructed or not. In the use case I described in my introduction, the code 
only needs a reference to the model class at import time; that reference won’t 
be used until run time, after the app-loading process has completed. 
(Performing SQL queries through the ORM at import time doesn’t work anyway.)

For this reason, I think it would make sense to add an API similar to 
`apps.get_models()`, but that would work while app-loading is still in 
progress. It would make no guarantees about the functionality of the class it 
returns until app-loading has completed.

I implemented a proof of concept here: 
https://github.com/django/django/pull/6547.


I would appreciate feedback on the following questions:

1. Have you been missing this feature? If it’s needed outside of Django, I must 
make it a public API.

2. Is it reasonable to expose it as a public API? There’s a risk that 
StackOverflow will consider it as superior to get_models() because it’s more 
lax; they'll assume that I put checks in get_models() simply because I like 
messing with users.

3. Would you prefer adding a kwarg to get_models e.g. 
apps.get_models(settings.AUTH_USER_MODEL, unsafe=True) or adding a new method 
like the PR currently does? I’m thinking the former is more user friendly as 
the functionality is very similar.


Thanks,

-- 
Aymeric.

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/67450ED7-88CD-49CE-9B63-180EDC033D55%40polytechnique.org.
For more options, visit https://groups.google.com/d/optout.