Questions for the time-pressed:

* Have you ever needed, or can you conceive of ever wanting, to provide multiple formats (JSON/XML/etc) for the same data? In other words, is there a use case for easily producing different serializations of the same data?
* If you could serialize data in whatever structure you wanted, would you still need to deserialize it at some point, or is this type of use more unidirectional?

On Mar 31, 7:33 am, Russell Keith-Magee <[email protected]> wrote:
> On Tue, Mar 31, 2009 at 11:43 AM, Russ Amos <[email protected]> wrote:
>
> > Would writing an appropriate template, while certainly not ideal, provide
> > most of the functionality for the common use case being discussed?
>
> [snipped]
>
> Depends on exactly what you mean by 'template'. I would expect that
> the end serialization would still occur using the underlying
> JSON/XML/YAML libraries, so you can't really use a template in the
> sense of a Django HTML template. However, if you're talking about a
> format in which you can express serialization instructions, then I
> could be convinced (but I need to see details).

I _am_ talking about a bona fide Django HTML template, but only for the purposes of illustration. My goal was not to use this as a part of the proposal, but to ask if some of the flexibility provided by Django's templating system would be useful for the serialization changes; namely, structural logic (for/if/etc) and (possibly custom) filters. Again, using the template system, or inventing a new one for serialization, is NOT what I'm suggesting.

Looking at the system as it is, if I needed to create a custom serialization format, top to bottom, I would write a view and template, and override the mimetype. The docs feature instructions on producing CSV in this way [1], as an example. Obviously, this is not ideal, but there's also something to be said for the flexibility, even if part of that is reinventing wheels. This was more a rambling brainstorm than a useful part of my proposal...

[1] => http://docs.djangoproject.com/en/dev/howto/outputting-csv/#using-the-template-system

> > I ask not because I think that's the best solution, but obviously I need a
> > more accurate mental image of the goal, as seen by the core developers. So
> > long as deserialization is no issue, as is typical in AJAX applications (or
> > anything where an external system is looking at an app's state), providing a
> > 'shortcut' interface to provide structure and some form of pre-processing
> > hooks seems like a good way to go.
>
> To clarify - it's not that deserialization *isn't* an issue, it's that
> deserialization isn't always possible. Django's default serializers
> have sufficient data in them to allow deserialization. That same data
> _could_ be presented in a different format, and it would certainly be
> nifty if, in those cases, deserialization could be preserved. However,
> I accept that this is a non-trivial goal, so if it turns out to not be
> possible (or only possible under specific circumstances - such as a
> serializer explicitly marked as deserializable), then I won't lose
> much sleep.

Your clarification lines up with my intent, if not my expression, quite well. My apologies for my arbitrary use of absolutes!

> Some immediate concerns/questions:
>
> * How do you deal with objects of different type? At present, you can
> pass a disparate list of objects to the serializer. The only
> requirement is that every element in the list is a Django object - it
> doesn't need to be a homogeneous list.

Initial thoughts are "throw an error if the attribute is missing", but I need time to consider a generic (read: useful) solution.

> * How does this translate to non-JSON serializers? The transition to
> YAML shouldn't be too hard, but what about XML? How does `structure`
> get interpreted by the XML serializer? How do you differentiate
> between the element name, element attributes, and child nodes that can
> be used in XML serialization?

This is what stands out to me most, now. I realized after climbing into bed last night that I didn't even _consider_ XML, having previously written it off (in my original proposal) as format (and therefore irrelevant) since the focus was different. Obviously XML is very different from JSON, and I am no longer sure that we can allow completely arbitrary serialization structure (which is the goal) AND maintain independence between structure and format, which I would like to do if at all possible.

I'm not sure if there's a realistic use case for being able to easily use one structure and multiple formats, however. Boiling it down to the least common denominator seems limiting, but allowing complete flexibility could couple structure tightly to format.

The larger question, I suppose, is do we really want to be subclassing for structure and subclassing for format, or subclassing for structure and format? The former provides a certain level of an "I wrote decoupled code" feeling, but, again, I can't find a use case for this. The latter feels restrictive if this use case ever does appear. There's also something to be said for API uniformity... Can a useful level of independence be achieved when the end formats are so different?

[2] => http://code.djangoproject.com/browser/django/trunk/django/core/serializers/xml_serializer.py#L37

> > Some "helpers" I think might be useful would be hooks for the various types
> > of fields, including but not limited to relations, to allow things like
> > special text processing or dependency traversal, and providing the current
> > default "structure" in case the user simply wants to do some pre-processing
> > of some form.
>
> I appreciate that this is one of those details that we will need to
> finesse with time, but it would be interesting to hear your
> preliminary thoughts on this - in particular, on how you plan to link
> the string in the 'template' to the helper.

Conversations about format complications notwithstanding, the actual serialization process I see as iterating through the structure attribute, converting keys to unicode, and processing the values as follows (loosely):

- If the value is a list, and the key happens to be a relation field, loop through everything in the list with each of the objects in the relation. There's a bit of a magic feel to this I don't like, so I've got an alteration to make below [3].
- If the value is a string, follow conventions -- check if it's a field of the model, check if it's a method of the model, check if it's in the form "relation__field" (and "relation__relation__field" etc), check if it's a method of the serializer, and just default to "it must be just a string" in the end (although, might this be confusing for debugging?). Evaluate whatever it ends up being until it, too, is a string.
- Tack on the value to the string produced thus far, formatting as appropriate.

[3] =V

    # Proposed API: serializers.Serializer, `structure`, `self._format` and
    # the `serializer=` argument are part of this proposal, not current Django.
    from django.core import serializers

    class ProductSerializer(serializers.Serializer):
        structure = {
            "name": "name",
            "price": "price",
            "description": "truncate_description",
        }

        def truncate_description(self, product):
            return product.description[:40]

    class OrderSerializer(serializers.Serializer):
        structure = {
            "order_id": "pk",
            "products": "products_list",
            "total": "total_price",
        }

        def products_list(self, order):
            products = order.products.all()
            return [serializers.serialize(self._format, product,
                                          serializer=ProductSerializer)
                    for product in products]

I think this is a bit more realistic a use, eliminating the magical treatment of list elements, but isn't as ridiculously simple to write. Now, you have to want it. Thoughts?

> However, here's my brain dump, such as it is:

I feel I should take a moment to thank you for taking many moments on critiquing my proposal and providing your insightful brain dumps, so I shall: thanks!
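
For what it's worth, here is a minimal sketch of the string-value lookup order described above (model field, model method, "relation__field" path, serializer method, literal string), in plain Python so it runs without Django. Everything in it is an assumption made for illustration: the `Serializer` base class, its `resolve` and `to_dict` methods, and the sample objects are hypothetical names, plain attribute access stands in for real model-field and relation lookups, and the "evaluate until it is a string" step is left out.

```python
from types import SimpleNamespace


class Serializer:
    """Hypothetical base class for illustration; not Django's API."""

    structure = {}

    def resolve(self, obj, spec):
        # Steps 1 and 2: a plain attribute ("field") or a method of the object.
        if hasattr(obj, spec):
            value = getattr(obj, spec)
            return value() if callable(value) else value
        # Step 3: a "relation__field" path, followed one hop at a time.
        if "__" in spec:
            try:
                value = obj
                for part in spec.split("__"):
                    value = getattr(value, part)
            except AttributeError:
                pass  # not a valid path; fall through to the later checks
            else:
                return value() if callable(value) else value
        # Step 4: a method of the serializer, called with the object.
        method = getattr(self, spec, None)
        if callable(method):
            return method(obj)
        # Step 5: nothing matched, so treat the spec as a literal string.
        return spec

    def to_dict(self, obj):
        # Format-neutral intermediate result; a JSON/YAML/XML writer
        # would take over from here.
        return {str(key): self.resolve(obj, spec)
                for key, spec in self.structure.items()}


class ProductSerializer(Serializer):
    structure = {
        "name": "name",                          # attribute of the object
        "maker": "manufacturer__name",           # relation path
        "description": "truncate_description",   # serializer method
        "kind": "product",                       # falls back to a literal
    }

    def truncate_description(self, product):
        return product.description[:40]


# Stand-in objects; real use would pass model instances.
product = SimpleNamespace(
    name="Widget",
    description="x" * 100,
    manufacturer=SimpleNamespace(name="Acme"),
)
result = ProductSerializer().to_dict(product)
```

The silent fallback to a literal string in the last step is exactly the debugging concern raised above: a misspelled field name would come out as its own text rather than raising an error.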

> My initial thoughts was that the serializers would end up being a lot
> like the Feeds framework - a base class with lots of
> methods/attributes that can be overridden to provide specific
> rendering behaviour. If you tear down the serialization problem, you
> end up with a set of relatively simple questions:

I've regrouped your observations so that my responses make sense.

> * What is the top level structure (e.g., the outer [] in JSON, the
> XML header and root tag)?
>
> * What is the wrapping structure for each element in the list of
> objects (e.g., the {} in JSON, the <object> tag in XML)
>
> * How is that list of fields presented to the user? (fields:{} in
> JSON, child elements in XML)

The answers to these hinge on how flexible the custom serializers should be. If we're okay with insisting on a little bit of basic format, we can allow the end-user more freedom with the structure. However, to provide real flexibility in, say, the additional aspects of XML serialization, I think we might have to force users to pick a serialization format, perhaps with the mention that changing between some formats is easier to do (JSON <-> YAML) than between others (JSON -> XML).

> * How is each field rendered? (key-value string pairs? <value>
> nodes?) If the field is itself a serializable object (e.g., another
> Django object) how is it serialized?
>
> * What descriptive attributes exist for each element in the list?
> (pk, model name)
>
> * How/where are these descriptive attributes rendered? (dict
> entries? root node attributes? child nodes?)
>
> * Which fields (including extra fields, model properties, computed
> fields, etc) should be included in the list of fields?
>
> * Is there any optional metadata for each data field, such as
> datatype? How is that optional metadata interpreted?

I think these are all answered by the structures I've suggested, and the existing serializers do a decent job of this, already.
If the end-user wants to include metadata, he/she is welcome to do so. The same can be said of extra fields, which fields to include, and how to format fields. If tweaking of a field is necessary, wrap the field in a method of the serializer:

    class MySerializer(serializers.Serializer):
        structure = {
            "field_name": "my_method",
        }

        def my_method(self, obj):
            return obj.field_name + u"!"

> I was also thinking that you aren't necessarily going to be
> subclassing the serializer itself. The answers to these questions are
> really just rendering instructions that can be followed by any
> serializer, once some common ground rules are established. The
> existing serialization engine has a hard-coded set of answers; what we
> need to do is refactor those answers out into a default definition
> that can be subclassed, overridden, or rewritten to suit specific
> needs.

Yes, and on that point I want to again emphasize that I think there's something to be said for the difference in format and structure. If the two can be kept separate, I would like to use a different name for the base instruction class than "Serializer", which I've been using to avoid bikeshed discussion on the name. That said, I do think a different name for this would be nice. serializers.Structure? serializers.Renderer?

> Some of the serialization instructions will be ignored by some
> renderers: for example, a 'child-name=value' attribute may be used to
> describe the fact that <value> tags are required for XML, but be
> ignored by the JSON and YAML serializer. Obviously, an important task
> here is to define what attributes are required, which are optional,
> and how they map onto each serializer.

That's an interesting point, and I can see that being very nice for keeping the two separate, which I've been harping on all through this.

> Anyway, there's 10c worth of brain dump - make of it what you will.
> I make no claim that these ideas are watertight - I'm willing to listen
> to any reasonable counterideas or objections to what I have proposed.

10c greatly appreciated. :) I'll stew on the observations made and pull together something more complete than the disparate, brainstorm-style, question-and-response mess exemplified here, but answers to standing questions would be appreciated, if anyone has the time.

Thanks,
Russ
