#32420: build_instance in (de)serializers uses model's primary key field name
instead of attname to detect if data contains a primary key
-------------------------------------+-------------------------------------
     Reporter:  trybik               |                    Owner:  trybik
         Type:  Bug                  |                   Status:  assigned
    Component:  Core                 |                  Version:  master
  (Serialization)                    |
     Severity:  Normal               |               Resolution:
     Keywords:                       |             Triage Stage:
                                     |  Unreviewed
    Has patch:  0                    |      Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |                    UI/UX:  0
-------------------------------------+-------------------------------------
Description changed by trybik:

Old description:

> The bug was introduced in #28385 fix, in
> [https://github.com/django/django/commit/dcd1025f4c03faa4a9915a1d47d07008280dd3cf
> commit dcd1025]. Currently in master, the
> {{{django.core.serializers.base.build_instance}}}  function uses {{{pk =
> data.get(Model._meta.pk.name)}}} instead of {{{pk =
> data.get(Model._meta.pk.attname)}}}:
> {{{
> def build_instance(Model, data, db):
>     """
>     Build a model instance.
>     If the model instance doesn't have a primary key and the model
> supports
>     natural keys, try to retrieve it from the database.
>     """
>     default_manager = Model._meta.default_manager
>     pk = data.get(Model._meta.pk.name)
>     if (pk is None and hasattr(default_manager, 'get_by_natural_key') and
>             hasattr(Model, 'natural_key')):
>         natural_key = Model(**data).natural_key()
>         try:
>             data[Model._meta.pk.attname] = Model._meta.pk.to_python(
> default_manager.db_manager(db).get_by_natural_key(*natural_key).pk
>             )
>         except Model.DoesNotExist:
>             pass
>     return Model(**data)
> }}}
>
> ([https://github.com/django/django/blob/738e9e615dc81b561c9fb01439119f4475c2e25b/django/core/serializers/base.py#L260
> Source on GitHub])
>
> Why it should use {{{attname}}}? Because it may be different than
> {{{name}}} and the following corresponding sets use
> {{{data[Model._meta.pk.attname] = ...}}}:
> 1. proceeding {{{django.core.serializers.python.Deserializer}}} which
> calls {{{build_instance}}}:
>     {{{
>     def Deserializer(...):
>     ...
>         for d in object_list:
>             ...
>             data = {}
>             if 'pk' in d:
>                 try:
>                     data[Model._meta.pk.attname] =
> Model._meta.pk.to_python(d.get('pk'))
>                 except Exception as e:
>                 ...
>         obj = base.build_instance(Model, data, using)
>         yield ...
>     }}}
> ([https://github.com/django/django/blob/738e9e615dc81b561c9fb01439119f4475c2e25b/django/core/serializers/python.py#L100
> Source on GitHub])
>
> 2. following 5 lines further assignment {{{ data[Model._meta.pk.attname]
> = ...}}} uses {{{attname}}}.
>

> The marginal error case is when primary key is also a foreign key (then
> {{{attname}}} has {{{"_id"}}} suffix). Moreover, to actually make it an
> error you have to create a bit pathological situation where you have to
> have natural key but it does not work, e.g.:
> {{{
> class Author(models.Model):
>     name = models.CharField(max_length=20)
>

> class GhostWriterManager(models.Manager):
>
>     def get_by_natural_key(self, *args, **kwargs):
>         raise NotImplementedError("Don't get by natural key")
>

> class GhostWriter(models.Model):
>     author_ptr = models.ForeignKey(Author, on_delete=models.CASCADE,
> primary_key=True)
>
>     objects = GhostWriterManager()
>
>     def natural_key(self):
>         raise NotImplementedError("There is no natural key")
> }}}
> This rare situation actually can come up in an arguably normal situation,
> when using django-polymorphic and loading the subclass part of JSON list
> and the subclass uses a natural key that refers to fields from base
> class. The natural key will work perfectly fine, just not when loading
> the subclass part of JSON.

New description:

 The bug was introduced in #28385 fix, in
 
[https://github.com/django/django/commit/dcd1025f4c03faa4a9915a1d47d07008280dd3cf
 commit dcd1025]. Currently in master, the
 {{{django.core.serializers.base.build_instance}}}  function uses {{{pk =
 data.get(Model._meta.pk.name)}}} instead of {{{pk =
 data.get(Model._meta.pk.attname)}}}:
 {{{
 def build_instance(Model, data, db):
     """
     Build a model instance.
     If the model instance doesn't have a primary key and the model
 supports
     natural keys, try to retrieve it from the database.
     """
     default_manager = Model._meta.default_manager
     pk = data.get(Model._meta.pk.name)
     if (pk is None and hasattr(default_manager, 'get_by_natural_key') and
             hasattr(Model, 'natural_key')):
         natural_key = Model(**data).natural_key()
         try:
             data[Model._meta.pk.attname] = Model._meta.pk.to_python(
 default_manager.db_manager(db).get_by_natural_key(*natural_key).pk
             )
         except Model.DoesNotExist:
             pass
     return Model(**data)
 }}}

 
([https://github.com/django/django/blob/738e9e615dc81b561c9fb01439119f4475c2e25b/django/core/serializers/base.py#L260
 Source on GitHub])

 Why it should use {{{attname}}}? Because it may be different than
 {{{name}}} and the following corresponding assignments use
 {{{data[Model._meta.pk.attname] = ...}}}:
 1. proceeding {{{django.core.serializers.python.Deserializer}}} which
 calls {{{build_instance}}}:
     {{{
     def Deserializer(...):
     ...
         for d in object_list:
             ...
             data = {}
             if 'pk' in d:
                 try:
                     data[Model._meta.pk.attname] =
 Model._meta.pk.to_python(d.get('pk'))
                 except Exception as e:
                 ...
         obj = base.build_instance(Model, data, using)
         yield ...
     }}}
 
([https://github.com/django/django/blob/738e9e615dc81b561c9fb01439119f4475c2e25b/django/core/serializers/python.py#L100
 Source on GitHub])

 2. following 5 lines further assignment {{{ data[Model._meta.pk.attname] =
 ...}}} uses {{{attname}}}.


 The marginal error case is when primary key is also a foreign key (then
 {{{attname}}} has {{{"_id"}}} suffix). Moreover, to actually make it an
 error you have to create a bit pathological situation where you have to
 have natural key but it does not work, e.g.:
 {{{
 class Author(models.Model):
     name = models.CharField(max_length=20)


 class GhostWriterManager(models.Manager):

     def get_by_natural_key(self, *args, **kwargs):
         raise NotImplementedError("Don't get by natural key")


 class GhostWriter(models.Model):
     author_ptr = models.ForeignKey(Author, on_delete=models.CASCADE,
 primary_key=True)

     objects = GhostWriterManager()

     def natural_key(self):
         raise NotImplementedError("There is no natural key")
 }}}
 This rare situation actually can come up in an arguably normal situation,
 when using django-polymorphic and loading the subclass part of JSON list
 and the subclass uses a natural key that refers to fields from base class.
 The natural key will work perfectly fine, just not when loading the
 subclass part of JSON.

--

-- 
Ticket URL: <https://code.djangoproject.com/ticket/32420#comment:2>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

-- 
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-updates/064.7152a69fe0f6a701eefd9fb98669c02b%40djangoproject.com.

Reply via email to