Re: Adding support for importing models through the app registry while it is loading
Hello, Quick follow-up: I just updated my PR according to this discussion: https://github.com/django/django/pull/6547 While I was there, I made get_user_model() usable at import time. We’ll need feedback from actual projects to know it that works. -- Aymeric. -- You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group. To unsubscribe from this group and stop receiving emails from it, send an email to django-developers+unsubscr...@googlegroups.com. To post to this group, send email to django-developers@googlegroups.com. Visit this group at https://groups.google.com/group/django-developers. To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/F2DB4ACF-1B5D-4B2B-A1D5-BDCAFE519183%40polytechnique.org. For more options, visit https://groups.google.com/d/optout.
Re: Adding support for importing models through the app registry while it is loading
On 09/06/2016 12:55 PM, Aymeric Augustin wrote: > Hi Carl, > > Thanks for the feedback! Of course! Thanks for working on things :-) > ... > The change I’m proposing doesn’t introduce random bugs. If models are > missing reverse relations, that will be deterministic. +1 >> Is it possible >> to make it so that even the model meta introspection APIs (and of course >> also any attempt to query the db) fail quickly, loudly, and with a clear >> error message if the app registry is not yet fully populated? If so, >> then I think there's little risk to adding this API (and in fact I think >> we could even make it the default behavior of `get_model`). > > If not checking for models_ready was the default for get_model(), then I’d > expect it to be the default for get_models() as well, but that doesn’t make > sense. get_models() really needs all models to be imported. > > My initial patch added a new method, `import_model`, because the operation is > quite different from the app registry's perspective. Instead of just looking > up the model, we try to import it. Even though the end result feels the same, > the context isn't the same. Yes, this makes sense. It suggests another possible name for the kwarg, `force_import=True`? ... > Without looking at the code base, I can’t say. It might help if you’re doing > swappable-model-style dynamic imports. It won’t help if you’re importing > models directly or indirectly from your applications’s __init__.py, which is > a more common problem. Yes, I think it's likely this wouldn't help much. > I think #21719 would have led there as well but it turned out to be > surprisingly hard -- in the realm of "not sure I'd get anywhere even if I > spent two weeks full-time on it”. I still think that's the right approach (and should be possible) in theory, but I trust your estimation of complexity and am currently short of two-week-plus blocks of free time :-) Carl -- You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group. To unsubscribe from this group and stop receiving emails from it, send an email to django-developers+unsubscr...@googlegroups.com. To post to this group, send email to django-developers@googlegroups.com. Visit this group at https://groups.google.com/group/django-developers. To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/7d1c5071-b1c5-6fdb-e12c-7d7ebf3cb852%40oddbird.net. For more options, visit https://groups.google.com/d/optout. signature.asc Description: OpenPGP digital signature
Re: Adding support for importing models through the app registry while it is loading
Hi Carl, Thanks for the feedback! > On 06 Sep 2016, at 19:39, Carl Meyer wrote: > > 1) A kwarg to `get_model` seems fine, but I don't like the vague FUD of > `unsafe=True` (if it's really "not safe" we shouldn't add it / make it > public at all). How about something more specific like > `require_ready=False`? Agreed. Actually I thought there must be a better name but didn’t have a better idea and then I forgot to mention it, the email was quite long already. > 2) I think a key question here is the nature and clarity of the problems > that arise if you try to make use of an un-ready model class. If the > failure mode here is the reintroduction of unpredictable heisenbugs > where certain related fields are quietly not present because the model > on the other end happens to not have been imported yet, I think that's a > strong argument for refraining from introducing this API. In Django 1.7+, I’m pretty sure the import sequence is fully deterministic for a given version of a project. get_wsgi_application() runs django.setup() first thing. Not user code has been imported at that point. django.setup() accesses settings, which can import user code but shouldn’t — it’s super easy to create import loops if it does. In general settings don't import much. They might be able to import models but that's pathological. Just don't do that. Then apps.populate() imports apps and models in the order of INSTALLED_APPS. This is fully deterministic, even with multi-threading: the first thread that reaches populate() takes a lock and other threads wait until that thread has finished. At this point all models have been imported according in a fully deterministic sequence (assuming non-pathological code e.g. `if random() > 0.5: import foo`) I believe the next step is to import URLs. I didn't check if that happens immediately or when the first request is received. Some non-determinism might happen here, especially with per-request urlconfs, but let's focus on models and leave that question for another day :-) The change I’m proposing doesn’t introduce random bugs. If models are missing reverse relations, that will be deterministic. > Is it possible > to make it so that even the model meta introspection APIs (and of course > also any attempt to query the db) fail quickly, loudly, and with a clear > error message if the app registry is not yet fully populated? If so, > then I think there's little risk to adding this API (and in fact I think > we could even make it the default behavior of `get_model`). If not checking for models_ready was the default for get_model(), then I’d expect it to be the default for get_models() as well, but that doesn’t make sense. get_models() really needs all models to be imported. My initial patch added a new method, `import_model`, because the operation is quite different from the app registry's perspective. Instead of just looking up the model, we try to import it. Even though the end result feels the same, the context isn't the same. Also, from the "separation of concerns" angle, I'm not eager to create coupling between the model meta API and the app registry. So, while I get the appeal of not adding a somewhat scary kwarg, I still think it's good to have users declare that they're treading into pre-`models_ready` territory. We could relax this constraint later by ignoring the require_ready argument, at the cost of some code churn in third-party apps that started using it. > As for whether it's needed in real use, the only feedback I can offer at > the moment is that the import-order enforcement that kicked in with the > removal of app-loading deprecation shims in Django 1.9 is the primary > reason that Instagram has so far upgraded only as far as Django 1.8; BRB finding a patch of sand to bury my head in... > fixing the import-order issues appeared to require too much upheaval and > would have delayed the upgrade too much. I haven't had time to look > closely at those issues so I can't yet offer more detailed feedback on > how much the availability of this particular API would help. Without looking at the code base, I can’t say. It might help if you’re doing swappable-model-style dynamic imports. It won’t help if you’re importing models directly or indirectly from your applications’s __init__.py, which is a more common problem. For the record this is tracked in #24312 but that ticket neither describes the exact problem nor possible ways to tackle it. I think #21719 would have led there as well but it turned out to be surprisingly hard -- in the realm of "not sure I'd get anywhere even if I spent two weeks full-time on it”. -- Aymeric. -- You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group. To unsubscribe from this group and stop receiving emails from it, send an email to django-developers+unsubscr...@googlegroups.com. To post to this group, send email to django-developers@googlegroups.com
Re: Adding support for importing models through the app registry while it is loading
Hi Aymeric, On 09/04/2016 01:03 PM, Aymeric Augustin wrote: > Hello, > > Since Django 1.7, `get_user_model()` cannot be called at import time. > Django contains a lot of methods that start with `UserModel = > get_user_model()` while it would be more natural to declare it as a > global variable in the module. This trips up users and the reason why > it’s forbidden isn’t obvious. ... > However, in many cases, it doesn’t matter whether the model is fully > constructed or not. In the use case I described in my introduction, > the code only needs a reference to the model class at import time; > that reference won’t be used until run time, after the app-loading > process has completed. (Performing SQL queries through the ORM at > import time doesn’t work anyway.) > > For this reason, I think it would make sense to add an API similar to > `apps.get_models()`, but that would work while app-loading is still > in progress. It would make no guarantees about the functionality of > the class it returns until app-loading has completed. > > I implemented a proof of concept here: > https://github.com/django/django/pull/6547. A few thoughts: 1) A kwarg to `get_model` seems fine, but I don't like the vague FUD of `unsafe=True` (if it's really "not safe" we shouldn't add it / make it public at all). How about something more specific like `require_ready=False`? 2) I think a key question here is the nature and clarity of the problems that arise if you try to make use of an un-ready model class. If the failure mode here is the reintroduction of unpredictable heisenbugs where certain related fields are quietly not present because the model on the other end happens to not have been imported yet, I think that's a strong argument for refraining from introducing this API. Is it possible to make it so that even the model meta introspection APIs (and of course also any attempt to query the db) fail quickly, loudly, and with a clear error message if the app registry is not yet fully populated? If so, then I think there's little risk to adding this API (and in fact I think we could even make it the default behavior of `get_model`). As for whether it's needed in real use, the only feedback I can offer at the moment is that the import-order enforcement that kicked in with the removal of app-loading deprecation shims in Django 1.9 is the primary reason that Instagram has so far upgraded only as far as Django 1.8; fixing the import-order issues appeared to require too much upheaval and would have delayed the upgrade too much. I haven't had time to look closely at those issues so I can't yet offer more detailed feedback on how much the availability of this particular API would help. Carl -- You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group. To unsubscribe from this group and stop receiving emails from it, send an email to django-developers+unsubscr...@googlegroups.com. To post to this group, send email to django-developers@googlegroups.com. Visit this group at https://groups.google.com/group/django-developers. To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/eabf51d0-712e-32f1-d8e4-320ddf7681ad%40oddbird.net. For more options, visit https://groups.google.com/d/optout. signature.asc Description: OpenPGP digital signature
Adding support for importing models through the app registry while it is loading
Hello, Since Django 1.7, `get_user_model()` cannot be called at import time. Django contains a lot of methods that start with `UserModel = get_user_model()` while it would be more natural to declare it as a global variable in the module. This trips up users and the reason why it’s forbidden isn’t obvious. It's the consequence of two design decisions (which I’m responsible for) in the app-loading refactor. 1. Before Django 1.7, the sequence in which models were loaded could depend on which views were hit in what order after a WSGI process started. That allowed interesting failures mode. For example, a project could randomly fail to load, only in production, because of a circular import error. Throw multithreading into the mix and it could get really exciting. To solve this, the new app-loading system added an explicit `django.setup()` step that imports apps and models in a deterministic order and doesn’t let code interact with the app registry until it’s fully loaded. 2. Django must ensure that relationships between models are set up correctly. Specifically it must guarantee that reverse accessors are added to each model targeted by a foreign key. This guarantee can only be provided once all other models have been imported. There’s been a long string of bugs in this area around 2008 (if memory serves). The new app-loading system took this requirement into account at the design stage and, thanks to the explicit setup, is structurally more robust against such bugs. Because of 2. apps.get_models() raises an exception before all models are imported rather than return an incomplete model. Because of 1. Django gives very little control on what happens up to this point. However, in many cases, it doesn’t matter whether the model is fully constructed or not. In the use case I described in my introduction, the code only needs a reference to the model class at import time; that reference won’t be used until run time, after the app-loading process has completed. (Performing SQL queries through the ORM at import time doesn’t work anyway.) For this reason, I think it would make sense to add an API similar to `apps.get_models()`, but that would work while app-loading is still in progress. It would make no guarantees about the functionality of the class it returns until app-loading has completed. I implemented a proof of concept here: https://github.com/django/django/pull/6547. I would appreciate feedback on the following questions: 1. Have you been missing this feature? If it’s needed outside of Django, I must make it a public API. 2. Is it reasonable to expose it as a public API? There’s a risk that StackOverflow will consider it as superior to get_models() because it’s more lax; they'll assume that I put checks in get_models() simply because I like messing with users. 3. Would you prefer adding a kwarg to get_models e.g. apps.get_models(settings.AUTH_USER_MODEL, unsafe=True) or adding a new method like the PR currently does? I’m thinking the former is more user friendly as the functionality is very similar. Thanks, -- Aymeric. -- You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group. To unsubscribe from this group and stop receiving emails from it, send an email to django-developers+unsubscr...@googlegroups.com. To post to this group, send email to django-developers@googlegroups.com. Visit this group at https://groups.google.com/group/django-developers. To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/67450ED7-88CD-49CE-9B63-180EDC033D55%40polytechnique.org. For more options, visit https://groups.google.com/d/optout.