Re: #7052 - Fixing serialization for content types and auth

mattimust...@gmail.com Thu, 05 Nov 2009 19:42:33 -0800

Hi Russ,

On Nov 6, 2:29 am, Russell Keith-Magee <freakboy3...@gmail.com> wrote:
> Hi all,
>
> Next on my pony list for v1.2: #7052 - fixing the serializers to work
> around the problem of serializing dynamically created objects, such as
> those produced by contrib.auth and contrib.contenttypes. I need some
> feedback on how much of this solution we need, want, and are
> comfortable seeing in trunk. Apologies in advance for the long post.
>
> For those not familiar with the problem - these two apps dynamically
> create data as part of the syncdb process. As a result, the primary
> keys for these objects aren't necessarily consistent after a syncdb,
> so fixtures can't reliably refer to auth permissions or content types.
> The problem is more general than these two apps specifically, but
> these two are the ones that most people get bitten by early on in the
> testing process.
>
> The solution that I've been intending to implement for a while is an
> extension to Django's serialization syntax: wherever a primary key is
> legal, we will also allow a dictionary-like structure (whatever the
> serialization format allows) that equates to the kwargs that will be
> passed to a Model.objects.get() call.
>
> So - instead of a JSON fixture reading:
>
> {
>     "pk": 1,
>     "model": "myapp.mymodel",
>     "fields": {
>         "name": "foobar",
>         "content_type": 3
>     }
> }
>
> which hardcodes the primary key value of 3 for a content type, we would allow:
>
> {
>     "pk": 1,
>     "model": "myapp.mymodel",
>     "fields": {
>         "name": "foobar",
>         "content_type": {
>             "app_label": "otherapp",
>             "model": "othermodel"
>         }
>     }
> }
>
> The serializer will then do
> ContentType.objects.get(app_label='otherapp', model='othermodel') to
> resolve the actual primary key at runtime. Analogous syntax would
> exist for XML, PyYAML, etc.
>
> Now, there are two parts to the solution. The deserializer is easy -
> write a handler for the dictionary syntax for primary keys, and you're
> done. Easy to implement, easy to test.
>
> The serializer isn't so easy, however. Determining when to output a
> lookup dictionary for a primary key isn't trivial. Here are some
> options:
>
> Option 1: Ignore the problem
> -----------------------------------------
>
> Implement the deserializer, but don't try and solve the serialization
> problem. Treat the lookup syntax for primary keys as a nifty extra you
> can exploit by hand if need be. Serialization generates integer
> primary keys, and you can hand modify fixtures to use lookup syntax if
> you want to.


+1. I need to ship a default set of groups with permissions assigned
to them in my app. This would make it much easier to do.
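
For anyone following along, here is a minimal sketch of the deserializer
side of Option 1. It uses an in-memory list in place of the ContentType
table so it runs without Django, and it keys on app_label, which is the
field ContentType actually defines. resolve_pk and CONTENT_TYPES are
hypothetical names, not part of Django; a real deserializer would
presumably pass the dictionary as kwargs to Model.objects.get().

```python
# Stand-in for the ContentType table; the pk values are arbitrary
# and could differ between databases after a syncdb.
CONTENT_TYPES = [
    {"pk": 7, "app_label": "otherapp", "model": "othermodel"},
    {"pk": 8, "app_label": "auth", "model": "permission"},
]


def resolve_pk(value, rows):
    """Return a primary key for a fixture value.

    A plain value is treated as a primary key and returned unchanged.
    A dict is treated as lookup kwargs and must match exactly one row,
    mirroring the behaviour of Model.objects.get().
    """
    if not isinstance(value, dict):
        return value
    matches = [
        row for row in rows
        if all(row.get(key) == expected for key, expected in value.items())
    ]
    if len(matches) != 1:
        raise LookupError(
            "expected exactly one match for %r, found %d" % (value, len(matches))
        )
    return matches[0]["pk"]


fixture_fields = {
    "name": "foobar",
    "content_type": {"app_label": "otherapp", "model": "othermodel"},
}
resolved = resolve_pk(fixture_fields["content_type"], CONTENT_TYPES)
```

Integer values pass through untouched, so existing fixtures would keep
working alongside hand-edited ones.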

>
> Option 2: Add a Meta argument for serialization
> --------------------------------------------------------------------
>
> This is essentially what the patch on #7052 currently implements.
>
> Under this approach, a model that is known to engage in dynamic data
> creation can mark itself for dynamic dumping, indicating the fields
> that should be used for that dump. For example, ContentTypes would
> contain something like:
>
> class Meta:
>     ...
>     dump_related = ('app_label','model')
>
> which indicates the two fields that should be used to construct the
> lookup dictionary whenever a ContentType object is serialized.
>
> The problem with this approach is that it hard-codes a single aspect of
> serialization into the model. If someone has a different set of
> requirements for serializing content types under particular
> circumstances, they will be out of luck.
>

-1
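
A rough sketch of what Option 2 implies on the serializer side, written
without Django so it runs standalone. dump_related is the attribute name
from the quoted description; FakeContentType, FakeGroup and
serialize_reference are hypothetical stand-ins, not code from the patch.
It illustrates the limitation described in the quoted text: the lookup
fields are fixed on the model, so every caller gets the same output.

```python
class FakeContentType:
    """Stand-in for a model that opts in to dynamic dumping."""

    # Mirrors the proposed Meta option: fields used to build the lookup.
    dump_related = ("app_label", "model")

    def __init__(self, pk, app_label, model):
        self.pk = pk
        self.app_label = app_label
        self.model = model


class FakeGroup:
    """Stand-in for a model with no dump_related declaration."""

    def __init__(self, pk):
        self.pk = pk


def serialize_reference(obj):
    """Emit a lookup dict if the model declares dump_related, else the pk."""
    fields = getattr(obj, "dump_related", None)
    if not fields:
        return obj.pk
    return {name: getattr(obj, name) for name in fields}


as_lookup = serialize_reference(FakeContentType(3, "otherapp", "othermodel"))
as_pk = serialize_reference(FakeGroup(12))
```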

> Option 3: Add flags/arguments to the serializer to control dynamic dumping
> ------------------------------------------------------------------------------------------------------------
>
> i.e.,
> ./manage.py dumpdata myapp --format=json --indent=2
> --lookup=contenttypes.contenttype(app_label,model)
>
> It might be possible to simplify this a little by saying that when
> --lookup=contenttypes.contenttype is specified, the first
> unique_together tuple will be used to construct the lookup.
>
> This puts complete control in the hands of the user at serialization
> time. However, the syntax isn't especially elegant, especially given
> that every single serialization of contenttypes and permissions will,
> in practice, need to use the --lookup argument.
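
A sketch of how a --lookup value of the form quoted above might be
parsed. The flag does not exist in dumpdata; parse_lookup is a
hypothetical helper, and returning None for the field list stands in for
"fall back to the first unique_together tuple".

```python
import re

# app_label.model_name, optionally followed by (field,field,...)
LOOKUP_RE = re.compile(r"^(\w+)\.(\w+)(?:\(([^()]*)\))?$")


def parse_lookup(spec):
    """Split 'app.model(f1,f2)' into (app, model, fields).

    fields is a tuple of names, or None when no field list was given.
    """
    match = LOOKUP_RE.match(spec)
    if match is None:
        raise ValueError("invalid lookup spec: %r" % spec)
    app_label, model_name, field_list = match.groups()
    if field_list is None:
        return app_label, model_name, None
    fields = tuple(name.strip() for name in field_list.split(",") if name.strip())
    if not fields:
        raise ValueError("empty field list in lookup spec: %r" % spec)
    return app_label, model_name, fields


explicit = parse_lookup("contenttypes.contenttype(app_label,model)")
implicit = parse_lookup("contenttypes.contenttype")
```

Note that in most shells the parenthesised form would need quoting on
the command line.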
>
> Option 4: An all-singing, all-dancing serialization framework rework
> ------------------------------------------------------------------------------------------------
>
> Django's serialization format is fairly limited, and there have been
> many proposals to add features to the output format (serializing
> non-model properties, reverse relations, deep relations, etc). I've
> been holding off on these in favour of a larger rework of the
> serialization framework.
>
> In my mind's eye, I have a vision of a serialization framework that
> would allow for registration of different serialization formats - not
> just JSON/XML, but the fields and internal structure of a JSON
> fixture, etc. Describing which fields should be rendered as lookups,
> how the lookup would be determined, and under what conditions a lookup
> should be used would all just be configuration items on a
> serialization definition.
>
> This is obviously a much larger body of work, and certainly wouldn't
> get done for v1.2 - if only because I haven't done any planning,
> prototype implementation, or community review.

I have an obvious bias [1], but this would be my preferred option, with
Option 3 implemented on top of it. I believe my Django Full
Serializers implement 90% of what you are after (there is also a patch
for reverse relations in the issue tracker). What is missing is
deserialization of "full" serialized models and the ability to customize
the internal structure of the output. I have a test suite for it, but
it needs to be extracted and rewritten because it depends on models in an
internal project.

Would it be worth my while reworking my code as a patch against Django
trunk?

[1] http://code.google.com/p/wadofstuff/wiki/DjangoFullSerializers


regards

Matthew

>
> Option 5: Something else
> -------------------------------------
>
> I'm open to any other suggestion.
>
> The good news in all this is that Option 1 isn't mutually exclusive with
> the other options - we can land Option 1 right now and get the
> advantages of dynamic lookups, and then worry about how to close the
> loop as a second problem.
>
> So - feedback welcome. Which option should we pursue?
>
> Yours,
> Russ Magee %-)
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en
-~----------~----~----~----~------~----~------~--~---
