#33414: Diamond inheritance causes duplicated PK error when creating an object, 
if
the primary key field has a default.
-----------------------------------------+------------------------
               Reporter:  Yu Li          |          Owner:  nobody
                   Type:  Bug            |         Status:  new
              Component:  Uncategorized  |        Version:  4.0
               Severity:  Normal         |       Keywords:
           Triage Stage:  Unreviewed     |      Has patch:  0
    Needs documentation:  0              |    Needs tests:  0
Patch needs improvement:  0              |  Easy pickings:  0
                  UI/UX:  0              |
-----------------------------------------+------------------------
 Hi, I'm not sure if this is a bug or an unsupported feature. But I looked
 into the django/db/models/base.py source code and now have a pretty good
 understanding of what is happening.

 My business code uses a diamond shape inheritance to model different types
 of user posts: UserPost, ImagePost, VideoPost, and MixedContentPost. The
 inheritance goes like this: both ImagePost and VideoPost extend from
 UserPost, and the MixedContentPost inherits from ImagePost and VideoPost.
 All of them are concrete models

 I read the doc and expected it to work, similar to the example


 {{{
 class Piece(models.Model):
     pass

 class Article(Piece):
     article_piece = models.OneToOneField(Piece, on_delete=models.CASCADE,
 parent_link=True)
     ...

 class Book(Piece):
     book_piece = models.OneToOneField(Piece, on_delete=models.CASCADE,
 parent_link=True)
     ...

 class BookReview(Book, Article):
     pass
 }}}

 However, I found out that the doc's example only works when these models
 use a primary key field that does not have a default. In my case, we are
 using a UUIDField as the primary key with a default of uuid4. Trying to
 create a BookReview in our case, causes a django.db.utils.IntegrityError:
 UNIQUE constraint failed error, because django tries to create the Piece
 object twice, with the same uuid.

 The same behavior is found if I used a AutoField on Piece, with a custom
 default function, such as

 {{{
 id = 99

 def increment():
     global id
     id += 1
     return id
 }}}

 This default function makes no sense in practice, but I just use it here
 to illustrate the root cause of the problem:

 The _save_table method in db/models/base.py has a like this:

 {{{
         # Skip an UPDATE when adding an instance and primary key has a
 default.
         if (
             not raw
             and not force_insert
             and self._state.adding
             and meta.pk.default
             and meta.pk.default is not NOT_PROVIDED
         ):
             force_insert = True
 }}}

 When a default is not present, which is the case of the documentation
 example, Django will first insert the first instance of the Piece object,
 and then for the second one, since force_insert is False, _save_table
 tries an update and goes through. Therefore there is not duplicate.

 However, if a default is present, then the second Piece becomes an
 insertion as well (because meta.pk.default and meta.pk.default is not
 NOT_PROVIDED, force_insert is True). This causes a duplication on the
 primary key.

 On the other hand, _insert_parent does an in-order traversal calling
 _save_table on each node, so even in the no-default pk case, it is calling
 a redundant update on the root node after the insertion..

 So which function is at fault?

 The source code _save_table assumes that if you are calling it with a
 default pk then you can skip an update. This assumption looks weird to me:
 why only when there IS a default pk you can skip update? Why not just skip
 update as long as we know we are inserting? (via self._state.adding) Is it
 just to make it special so that AutoField works? If _save_table's
 responsibility is to try updating before inserting, except when the params
 force it to do an update or insert, then it shouldn't override that
 behavior by this self-assumeption  within it.

 I think the solution is to simply move the check to save_base. And don't
 do this check in _save_parents.

 Like this:


 {{{

     def save_base(
         self,
         raw=False,
         force_insert=False,
         force_update=False,
         using=None,
         update_fields=None,
     ):
         """
         Handle the parts of saving which should be done only once per
 save,
         yet need to be done in raw saves, too. This includes some sanity
         checks and signal sending.

         The 'raw' argument is telling save_base not to save any parent
         models and not to do any changes to the values before save. This
         is used by fixture loading.
         """
         using = using or router.db_for_write(self.__class__,
 instance=self)
         assert not (force_insert and (force_update or update_fields))
         assert update_fields is None or update_fields
         cls = origin = self.__class__
         # Skip proxies, but keep the origin as the proxy model.
         if cls._meta.proxy:
             cls = cls._meta.concrete_model
         meta = cls._meta
         if not meta.auto_created:
             pre_save.send(
                 sender=origin,
                 instance=self,
                 raw=raw,
                 using=using,
                 update_fields=update_fields,
             )
         # A transaction isn't needed if one query is issued.
         if meta.parents:
             context_manager = transaction.atomic(using=using,
 savepoint=False)
         else:
             context_manager =
 transaction.mark_for_rollback_on_error(using=using)
         with context_manager:
             parent_inserted = False
             if not raw:
                 parent_inserted = self._save_parents(cls, using,
 update_fields)

             # Skip an UPDATE when adding an instance and primary key has a
 default.
             if (
                 not raw
                 and not force_insert
                 and self._state.adding
                 and meta.pk.default
                 and meta.pk.default is not NOT_PROVIDED
             ):
                 force_insert = True
             updated = self._save_table(
                 raw,
                 cls,
                 force_insert or parent_inserted,
                 force_update,
                 using,
                 update_fields,
             )
         # Store the database on which the object was saved
         self._state.db = using
         # Once saved, this is no longer a to-be-added instance.
         self._state.adding = False

         # Signal that the save is complete
         if not meta.auto_created:
             post_save.send(
                 sender=origin,
                 instance=self,
                 created=(not updated),
                 update_fields=update_fields,
                 raw=raw,
                 using=using,
             )

     save_base.alters_data = True

 }}}

 I have never contributed to Django before. If you think I'm right on this
 one I'll look into creating a PR.

-- 
Ticket URL: <https://code.djangoproject.com/ticket/33414>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

-- 
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-updates+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-updates/051.fe2709f2db0d315064d3d17034b0ec51%40djangoproject.com.

Reply via email to