New submission from Nick Coghlan:

In the Python 3 transition, we had to make a choice regarding whether we treated the JSON module as a text transform (with load[s] reading Unicode code points and dump[s] producing them), or as a text encoding (with load[s] reading binary sequences and dump[s] producing them).
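
To make the distinction concrete, here is a minimal sketch of how the text-transform behaviour looks from user code: json.dumps() returns str, so producing or consuming wire bytes takes an explicit encode/decode step outside the module. The sample payload is an assumption chosen purely for illustration.

```python
import json

# Illustrative payload (an assumption, not taken from the report).
payload = {"greeting": "héllo"}

# Text transform: dumps() produces str rather than bytes.
text = json.dumps(payload)
assert isinstance(text, str)

# Encoding for the wire is a separate step handled by the caller.
wire = text.encode("utf-8")
assert isinstance(wire, bytes)

# Reading from the wire reverses both steps: decode first, then parse.
decoded = json.loads(wire.decode("utf-8"))
assert decoded == payload
```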
To minimise the changes to the module API, the decision was made to treat it as a text transform, with the text encoding handled externally.

This API design decision doesn't appear to have worked out that well in the web development context, since JSON is typically encountered as a UTF-8 encoded wire protocol, not as already decoded text.

It also makes the module inconsistent with most of the other modules that offer "dumps" APIs, as those *are* specifically about wire protocols (Python 3.4):

    >>> import json, marshal, pickle, plistlib, xmlrpc.client
    >>> json.dumps('hello')
    '"hello"'
    >>> marshal.dumps('hello')
    b'\xda\x05hello'
    >>> pickle.dumps('hello')
    b'\x80\x03X\x05\x00\x00\x00helloq\x00.'
    >>> plistlib.dumps('hello')
    b'<?xml version="1.0" encoding="UTF-8"?>\n<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">\n<plist version="1.0">\n<string>hello</string>\n</plist>\n'

The only module with a dumps function that (like the json module) returns a string is the XML-RPC client module:

    >>> xmlrpc.client.dumps(('hello',))
    '<params>\n<param>\n<value><string>hello</string></value>\n</param>\n</params>\n'

And that's nonsensical, since that XML-RPC API *accepts an encoding argument*, which it now silently ignores:

    >>> xmlrpc.client.dumps(('hello',), encoding='utf-8')
    '<params>\n<param>\n<value><string>hello</string></value>\n</param>\n</params>\n'
    >>> xmlrpc.client.dumps(('hello',), encoding='utf-16')
    '<params>\n<param>\n<value><string>hello</string></value>\n</param>\n</params>\n'

I now believe that an "encoding" parameter should have been added to the json.dump API in the Py3k transition (defaulting to UTF-8), allowing all of the dump/load APIs in the standard library to be consistently about converting to and from a binary wire protocol.

Unfortunately, I don't have a solution to offer at this point (since backwards compatibility concerns rule out the simple solution of just changing the return type).
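
For illustration only, the wire-protocol behaviour described above can be approximated in user code without changing the json API. This is a minimal sketch: the helper names dumps_bytes and loads_bytes are hypothetical and not part of the standard library, and the UTF-8 default simply mirrors the default suggested above.

```python
import json


def dumps_bytes(obj, *, encoding="utf-8", **kwargs):
    """Hypothetical helper: serialise obj to JSON text, then encode it for the wire."""
    return json.dumps(obj, **kwargs).encode(encoding)


def loads_bytes(data, *, encoding="utf-8", **kwargs):
    """Hypothetical helper: decode wire bytes, then parse the resulting JSON text."""
    return json.loads(data.decode(encoding), **kwargs)


# Unlike json.dumps(), the wrapper returns bytes.
wire = dumps_bytes("hello")
assert wire == b'"hello"'

# The encoding argument is honoured in both directions.
utf16_wire = dumps_bytes({"a": [1, 2]}, encoding="utf-16")
assert loads_bytes(utf16_wire, encoding="utf-16") == {"a": [1, 2]}
```

A wrapper like this leaves the return type of json.dumps() untouched, so it avoids the compatibility issue only by living outside the module.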
I just wanted to get it on record as a problem (and internal inconsistency within the standard library for dump/load protocols) with the current API.

----------
components: Library (Lib)
messages: 204764
nosy: chrism, ncoghlan
priority: normal
severity: normal
status: open
title: Wire protocol encoding for the JSON module
versions: Python 3.5

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue19837>
_______________________________________