[issue29992] Expose parse_string in JSONDecoder
Tobias Oberstein added the comment:

> It's unlikely that you would want to parse every string that looks enough
> like a decimal as a decimal, or that you would want to pay the cost of
> checking every string in the whole document to see if it's a decimal.

FWIW, yes, that is what I do, and yes, it needs to check every string:

https://github.com/crossbario/autobahn-python/blob/bc98e4ea5a2a81e41209ea22d9acc53258fb96be/autobahn/wamp/serializer.py#L410

> Returning a decimal as a string is becoming quite common in REST APIs to
> ensure there is no floating point errors.

Exactly. It is simply required if money values are involved. Since JSON doesn't have a native Decimal type, strings need to be used (they are the only scalar type in JSON that can encode the needed arbitrary-precision decimals).

CBOR has a tagged decimal fraction encoding, as described in RFC 7049, section 2.4.3.

FWIW, we've added roundtrip and crosstrip testing between CBOR <=> JSON in our hacked Python JSON, and it works:

https://github.com/crossbario/autobahn-python/blob/bc98e4ea5a2a81e41209ea22d9acc53258fb96be/autobahn/wamp/test/test_wamp_serializer.py#L235

-- ___ Python tracker <https://bugs.python.org/issue29992> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
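For illustration only, a minimal sketch of the "check every string" approach without a parse_string hook: decode normally, then walk the result. The regular expression and the helper name strings_to_decimals are assumptions made for this example; they are not taken from the linked serializer.

```python
import json
import re
from decimal import Decimal

# Assumed convention for this sketch: a string made up solely of an
# optionally signed decimal number is treated as a Decimal.
_DECIMAL_RE = re.compile(r"-?[0-9]+(\.[0-9]+)?")


def strings_to_decimals(obj):
    """Recursively replace decimal-looking strings in decoded JSON."""
    if isinstance(obj, str):
        return Decimal(obj) if _DECIMAL_RE.fullmatch(obj) else obj
    if isinstance(obj, list):
        return [strings_to_decimals(item) for item in obj]
    if isinstance(obj, dict):
        # Only values are converted; keys stay plain strings.
        return {key: strings_to_decimals(value) for key, value in obj.items()}
    return obj


decoded = strings_to_decimals(
    json.loads('{"price": "19.99", "name": "widget", "qty": 3}')
)
```

The extra pass over the decoded document is exactly the cost the quoted comment worries about; a parse_string hook would let the conversion happen during decoding instead.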
[issue29992] Expose parse_string in JSONDecoder
Tobias Oberstein added the comment:

I agree, my use case is probably exotic: transparent roundtripping of binaries over JSON, using a leading \0 byte marker to distinguish plain strings from base64-encoded binaries.

FWIW, I do think, however, that adding a "parse_string" keyword parameter to the constructor of JSONDecoder would at least fit the current approach: there are already parse_xxx parameters for all the other things. If string parsing were overridden via subclassing while all the others stay with the keyword parameter approach, that could be slightly confusing too, because it loses consistency.

Switching everything to subclassing/overriding for _all_ parse_XXX is, I guess, a no-go because it breaks existing code?

> For me in my situation, it'll be messy anyways, because I need to support Py2
> and 3, and CPy and PyPy .. I just filed the issue for "completeness".

--
Python tracker <http://bugs.python.org/issue29992>
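A rough sketch of the marker scheme described above, written as a pre/post-processing pass around the standard json functions. The helper names and the exact wire layout (marker character followed by standard base64) are assumptions for illustration; the comment only states that a leading \0 marks base64-encoded binaries.

```python
import base64
import json

MARKER = "\x00"  # assumed layout: marker character, then base64 text


def encode_binaries(obj):
    """Replace bytes values with marker-prefixed base64 strings."""
    if isinstance(obj, bytes):
        return MARKER + base64.b64encode(obj).decode("ascii")
    if isinstance(obj, list):
        return [encode_binaries(item) for item in obj]
    if isinstance(obj, dict):
        return {key: encode_binaries(value) for key, value in obj.items()}
    return obj


def decode_binaries(obj):
    """Inverse of encode_binaries: marker-prefixed strings become bytes."""
    if isinstance(obj, str) and obj.startswith(MARKER):
        return base64.b64decode(obj[len(MARKER):])
    if isinstance(obj, list):
        return [decode_binaries(item) for item in obj]
    if isinstance(obj, dict):
        return {key: decode_binaries(value) for key, value in obj.items()}
    return obj


payload = {"id": 7, "blob": b"\x01\x02\xff", "note": "plain"}
wire = json.dumps(encode_binaries(payload))
roundtripped = decode_binaries(json.loads(wire))
```

One limitation of this sketch: a genuine text string that happens to start with \0 would be misread as binary, so the application has to rule such strings out.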
[issue29992] Expose parse_string in JSONDecoder
New submission from Tobias Oberstein:

Though JSONDecoder already has all the hooks internally to allow for a custom parse_string (https://github.com/python/cpython/blob/master/Lib/json/decoder.py#L330), this is currently not exposed in the constructor JSONDecoder.__init__.

It would be nice to expose it.

Currently, I need to hack it:

https://gist.github.com/oberstet/fa8b8e04b8d532912bd616d9db65101a

--
messages: 291167
nosy: oberstet
priority: normal
severity: normal
status: open
title: Expose parse_string in JSONDecoder
type: enhancement

--
Python tracker <http://bugs.python.org/issue29992>
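As a hedged sketch of what such a hack can look like on CPython 3 (not necessarily what the linked gist does): subclass JSONDecoder, overwrite the parse_string attribute, and rebuild the scanner with the pure-Python json.scanner.py_make_scanner so the override is picked up. This leans on undocumented internals that may change between versions; the class name StringHookDecoder and the string_hook parameter are made up for this example.

```python
import json
from json import decoder, scanner


class StringHookDecoder(json.JSONDecoder):
    """JSONDecoder that passes every decoded string value through a hook.

    Relies on implementation details (the parse_string attribute and
    py_make_scanner); treat it as a sketch, not a supported API.
    """

    def __init__(self, *args, string_hook=None, **kwargs):
        super().__init__(*args, **kwargs)
        if string_hook is not None:
            def parse_string(s, end, strict=True):
                # Delegate the real parsing to the stock scanstring,
                # then post-process the resulting Python string.
                value, new_end = decoder.scanstring(s, end, strict)
                return string_hook(value), new_end

            self.parse_string = parse_string
            # Rebuild the scanner so it reads the overridden attribute.
            # This uses the slower pure-Python scanner.
            self.scan_once = scanner.py_make_scanner(self)


result = StringHookDecoder(string_hook=str.upper).decode('{"a": "b", "c": ["d", 1]}')
```

In this sketch the hook only sees string values: object keys are parsed by a separate code path inside the decoder and are left untouched.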
[issue21356] Support LibreSSL (instead of OpenSSL): make RAND_egd optional
Changes by Tobias Oberstein <tobias.oberst...@tavendo.de>:

--
nosy: +oberstet

--
Python tracker <http://bugs.python.org/issue21356>
[issue13244] WebSocket schemes in urllib.parse
Tobias Oberstein added the comment:

FWIW, WebSocket URL parsing is still wrong on Python 2.7.6 - in fact, it's broken in multiple ways:

    >>> from urlparse import urlparse
    >>> urlparse("ws://example.com/somewhere?foo=bar#dgdg")
    ParseResult(scheme='ws', netloc='example.com', path='/somewhere', params='', query='foo=bar', fragment='dgdg')
    >>> urlparse("ws://example.com/somewhere?foo=bar%23dgdg")
    ParseResult(scheme='ws', netloc='example.com', path='/somewhere', params='', query='foo=bar%23dgdg', fragment='')
    >>> urlparse("ws://example.com/somewhere?foo#=bar")
    ParseResult(scheme='ws', netloc='example.com', path='/somewhere', params='', query='foo', fragment='=bar')
    >>> urlparse("ws://example.com/somewhere?foo%23=bar")
    ParseResult(scheme='ws', netloc='example.com', path='/somewhere', params='', query='foo%23=bar', fragment='')

--
Python tracker <http://bugs.python.org/issue13244>
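For comparison, a short sketch of the same kind of check on Python 3, where the function lives in urllib.parse. It only shows what the parser returns; no exception is raised for the unescaped "#", so rejecting such URLs remains the caller's job.

```python
from urllib.parse import urlparse

# Unescaped "#": the parser splits off a fragment instead of rejecting the URL.
unescaped = urlparse("ws://example.com/somewhere?foo=bar#dgdg")

# Escaped "%23": the sequence stays inside the query component.
escaped = urlparse("ws://example.com/somewhere?foo=bar%23dgdg")

print(unescaped.query, unescaped.fragment)
print(escaped.query, escaped.fragment)
```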
[issue19277] zlib compressobj: missing parameter doc strings
New submission from Tobias Oberstein:

Currently the `zlib` module documents

    zlib.compressobj([level])

However, there are more parameters present already today:

    zlib.compressobj([level, method, wbits])

These other parameters are used in at least 2 deployed libraries (in the context of WebSocket compression):

https://github.com/tavendo/AutobahnPython/blob/master/autobahn/autobahn/compress_deflate.py#L527
http://code.google.com/p/pywebsocket/source/browse/trunk/src/mod_pywebsocket/util.py#231

--
assignee: docs@python
components: Documentation, Library (Lib)
messages: 200113
nosy: docs@python, oberstet
priority: normal
severity: normal
status: open
title: zlib compressobj: missing parameter doc strings
type: enhancement
versions: Python 2.7

--
Python tracker <http://bugs.python.org/issue19277>
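A small sketch of how the three positional parameters are typically used for WebSocket-style compression: a negative wbits value asks zlib for a raw deflate stream with no zlib header or trailer. The sample data and the choice of level 6 are arbitrary assumptions for this example.

```python
import zlib

data = b"hello websocket " * 64

# level, method, wbits passed positionally; negative wbits means raw deflate.
compressor = zlib.compressobj(6, zlib.DEFLATED, -zlib.MAX_WBITS)
compressed = compressor.compress(data) + compressor.flush()

# The decompressor must be told about the raw format as well.
decompressor = zlib.decompressobj(-zlib.MAX_WBITS)
restored = decompressor.decompress(compressed) + decompressor.flush()
```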
[issue19278] zlib compressobj: expose missing knobs
New submission from Tobias Oberstein:

The zlib library provides a couple of knobs to control the behavior and resource consumption of compression:

    ZEXTERN int ZEXPORT deflateInit2 OF((z_streamp strm,
                                         int level,
                                         int method,
                                         int windowBits,
                                         int memLevel,
                                         int strategy));

Of these, only `level`, `method` and `windowBits` are exposed on `zlib.compressobj` (and only the first is even documented: see issue #19277).

However, as was recently found from empirical evidence in the context of WebSocket compression

http://www.ietf.org/mail-archive/web/hybi/current/msg10222.html

the `memLevel` parameter in particular is very valuable in controlling memory consumption.

For WebSocket compression (with JSON payload), the following parameter set was found to provide a useful resource-consumption/compression-ratio tradeoff:

    window bits=11
    memory level=4

Hence, it would be useful to expose _all_ parameters in Python, that is, `memLevel` and `strategy` too.

--
components: Library (Lib)
messages: 200114
nosy: oberstet
priority: normal
severity: normal
status: open
title: zlib compressobj: expose missing knobs
type: enhancement
versions: Python 2.7

--
Python tracker <http://bugs.python.org/issue19278>
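A sketch of the parameter set mentioned above, assuming a current Python 3 where `zlib.compressobj` accepts `memLevel` and `strategy` as the fourth and fifth arguments. The helper names deflate_raw and inflate_raw, the raw-deflate framing (negative wbits), and the sample payload are choices made for this example only.

```python
import zlib


def deflate_raw(data, wbits=11, mem_level=4, strategy=zlib.Z_DEFAULT_STRATEGY):
    """Compress to a raw deflate stream with a small window and low memLevel."""
    compressor = zlib.compressobj(
        zlib.Z_DEFAULT_COMPRESSION,  # level
        zlib.DEFLATED,               # method
        -wbits,                      # negative: raw deflate, no zlib header
        mem_level,                   # memLevel: internal state size (1..9)
        strategy,
    )
    return compressor.compress(data) + compressor.flush()


def inflate_raw(data, wbits=11):
    """Decompress a raw deflate stream produced with the same window size."""
    decompressor = zlib.decompressobj(-wbits)
    return decompressor.decompress(data) + decompressor.flush()


message = b'{"topic": "com.example.event", "args": [1, 2, 3]}' * 20
packed = deflate_raw(message)
unpacked = inflate_raw(packed)
```

Smaller `wbits` and `memLevel` values reduce the per-connection memory held by each compressor at the cost of some compression ratio, which matters when a server keeps one compressor per WebSocket connection.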
[issue13244] WebSocket schemes in urllib.parse
Tobias Oberstein <tobias.oberst...@tavendo.de> added the comment:

Is that patch supposed to be in Python 2.7.2? If so, it doesn't work for ws:

    ws://example.com/somewhere?foo=bar#dgdg

    F:\scm\Autobahn\testsuite\websockets\servers>python
    Python 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from urlparse import urlparse
    >>> urlparse("ws://example.com/somewhere?foo=bar#dgdg")
    ParseResult(scheme='ws', netloc='example.com', path='/somewhere?foo=bar#dgdg', params='', query='', fragment='')
    >>> urlparse("ws://example.com/somewhere?foo=bar#dgdg", allow_fragments = True)
    ParseResult(scheme='ws', netloc='example.com', path='/somewhere?foo=bar#dgdg', params='', query='', fragment='')
    >>> urlparse("ws://example.com/somewhere?foo=bar#dgdg", allow_fragments = False)
    ParseResult(scheme='ws', netloc='example.com', path='/somewhere?foo=bar#dgdg', params='', query='', fragment='')

urlparse parses neither the query nor the (invalid) fragment component for the ws scheme.

I would have expected:

    ParseResult(scheme='ws', netloc='example.com', path='/somewhere', params='', query='foo=bar', fragment='dgdg')

--
Python tracker <http://bugs.python.org/issue13244>
[issue13244] WebSocket schemes in urllib.parse
Tobias Oberstein <tobias.oberst...@tavendo.de> added the comment:

The patch as it stands will result in wrong behavior:

    +self.assertEqual(urllib.parse.urlparse("ws://example.com/stuff#ff"),
    +                 ('ws', 'example.com', '/stuff#ff', '', '', ''))

The path component returned is invalid for ws/wss, and it is invalid for any scheme following the generic URI RFC, since # must always be escaped in path components.

Is urlparse meant to follow the generic URI RFC?

IMHO, the patch should at least do the equivalent of

    urlparse.uses_fragment.extend(wsschemes)

so users of urlparse can do the check for fragment != "", which is required for ws/wss, on their own.

--
Python tracker <http://bugs.python.org/issue13244>
[issue13244] WebSocket schemes in urllib.parse
Tobias Oberstein <tobias.oberst...@tavendo.de> added the comment:

> I’d say that urlparse should raise an exception when a ws/wss URI contains a fragment part.

Yep, better.

> I’m not sure this will be possible; from a glance at the source and a quick test, urlparse will happily break the Generic URI Syntax RFC and return a path including a # character!

That's unfortunate.

In that case I'd probably prefer the lesser evil, namely that urlparse be set up (falsely) such that the ws/wss schemes allow fragments, so I get back the non-empty fragment as a separate component and can check it myself.

If urlparse returns the fragment (falsely) within the path, then a user could check only by searching for # in the path. That is also hacky .. even worse than comparing fragment != "".

Essentially, this would be exactly the hack that I posted in my very first comment:

    urlparse.uses_fragment.extend(wsschemes)

===

Alternative: make this bug dependent on fixing urlparse for the fragment rules in the generic URI RFC, and don't do anything until then?

--
Python tracker <http://bugs.python.org/issue13244>
[issue13244] WebSocket schemes in urlparse
Tobias Oberstein <tobias.oberst...@tavendo.de> added the comment:

OK, there was feedback on the Hybi list:

http://www.ietf.org/mail-archive/web/hybi/current/msg09270.html

> 1. ws://example.com/something#somewhere
> 2. ws://example.com/something#somewhere/
> 3. ws://example.com/something#somewhere/foo
> 4. ws://example.com/something?query=foo#bar
>
> I think all of these are invalid.

Alexey Melnikov, co-author of the WS spec.

And Julian Reschke:

http://www.ietf.org/mail-archive/web/hybi/current/msg09277.html

==

Thus, I would uphold my comment:

# must always be escaped, both in path and query components. Fragment components are not allowed. Thus, an unescaped # can never appear in a WS URL. Further, it must not be ignored; instead, the WS handshake must be failed.

And further: urlparse should raise an exception upon an unescaped # within URLs from the ws/wss schemes.

--
Python tracker <http://bugs.python.org/issue13244>
[issue13244] WebSocket schemes in urlparse
New submission from Tobias Oberstein <tobias.oberst...@tavendo.de>:

The urlparse module currently does not support the new ws and wss schemes used for the WebSocket protocol.

As a workaround, we currently use the following code (which is a hack of course):

    import urlparse
    wsschemes = ["ws", "wss"]
    urlparse.uses_relative.extend(wsschemes)
    urlparse.uses_netloc.extend(wsschemes)
    urlparse.uses_params.extend(wsschemes)
    urlparse.uses_query.extend(wsschemes)
    urlparse.uses_fragment.extend(wsschemes)

===

A WebSocket URL has scheme ws or wss, MUST have a network location, and MAY have a resource part with path and query components, but MUST NOT have a fragment component.

--
components: Library (Lib)
messages: 146167
nosy: oberstet
priority: normal
severity: normal
status: open
title: WebSocket schemes in urlparse
type: feature request
versions: Python 2.7

--
Python tracker <http://bugs.python.org/issue13244>
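The rules in the last paragraph can be sketched as a small user-side validator on Python 3. The function name parse_ws_url and the use of ValueError are assumptions for this example; nothing like it is claimed to exist in the standard library.

```python
from urllib.parse import urlsplit


def parse_ws_url(url):
    """Split a WebSocket URL and enforce the rules stated above."""
    parts = urlsplit(url)
    if parts.scheme not in ("ws", "wss"):
        raise ValueError("not a WebSocket URL: %r" % (url,))
    if not parts.netloc:
        raise ValueError("WebSocket URL must have a network location: %r" % (url,))
    # Check the raw string rather than parts.fragment, so that a trailing
    # "#" with an empty fragment is rejected too.
    if "#" in url:
        raise ValueError("WebSocket URL must not contain an unescaped '#': %r" % (url,))
    return parts


ok = parse_ws_url("ws://example.com/something?query=foo%23bar")
```

Calling parse_ws_url("ws://example.com/something?query=foo#bar") raises ValueError under this sketch, which puts the check in one place instead of leaving every caller to compare the fragment against the empty string.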
[issue13244] WebSocket schemes in urlparse
Tobias Oberstein <tobias.oberst...@tavendo.de> added the comment:

Fragment identifiers: the spec says:

"Fragment identifiers are meaningless in the context of WebSocket URIs, and MUST NOT be used on these URIs. The character # in URIs MUST be escaped as %23 if used as part of the query component."

[see last line of my initial comment]

I nevertheless added the ws/wss schemes to urlparse.uses_fragment so that I can detect fragments being used and throw.

Does urllib throw when a URL contains a fragment identifier, but the scheme of the URL is not in urlparse.uses_fragment?

If so, that's fine and of course better than putting the burden of checking on the user.

==

Further, when # is to be used in a WS URL, it MUST be encoded, and if so, it's interpreted as part of the query component.

So in summary, I think the best would be: urllib throws upon a non-encoded #, and interprets it as part of the query component when encoded.

--
Python tracker <http://bugs.python.org/issue13244>
[issue13244] WebSocket schemes in urlparse
Tobias Oberstein <tobias.oberst...@tavendo.de> added the comment:

Well, thinking about it, %23 can also appear in a percent-encoded path component.

I don't get the conditional ".. if used as part of the query component" in the spec.

--
Python tracker <http://bugs.python.org/issue13244>
[issue13244] WebSocket schemes in urlparse
Tobias Oberstein <tobias.oberst...@tavendo.de> added the comment:

I see how you interpret that sentence in the spec, but I would have read it differently:

invalid:

1. ws://example.com/something#somewhere
2. ws://example.com/something#somewhere/
3. ws://example.com/something#somewhere/foo
4. ws://example.com/something?query=foo#bar

valid:

5. ws://example.com/something%23somewhere
6. ws://example.com/something%23somewhere/
7. ws://example.com/something%23somewhere/foo
8. ws://example.com/something?query=foo%23bar

You would take 2. and 3. as valid, but 1. and 4. as invalid, right?

But you are right, the spec does not talk about # in the path.

If the above is a valid summary of the question, I'd better take that to the Hybi list to get feedback before rushing into anything with urllib ..

--
Python tracker <http://bugs.python.org/issue13244>
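To make the reading above concrete, a short Python 3 sketch that classifies the eight URLs by the presence of an unescaped "#" and shows that an escaped %23 survives splitting and only becomes "#" after percent-decoding the individual component. The classification rule is this comment's reading of the spec, not an established standard-library behavior.

```python
from urllib.parse import unquote, urlsplit

INVALID = [
    "ws://example.com/something#somewhere",
    "ws://example.com/something#somewhere/",
    "ws://example.com/something#somewhere/foo",
    "ws://example.com/something?query=foo#bar",
]
VALID = [
    "ws://example.com/something%23somewhere",
    "ws://example.com/something%23somewhere/",
    "ws://example.com/something%23somewhere/foo",
    "ws://example.com/something?query=foo%23bar",
]


def has_unescaped_hash(url):
    """Under this reading, any literal '#' makes a WebSocket URL invalid."""
    return "#" in url


# Splitting leaves %23 untouched; decoding happens per component afterwards.
raw_path = urlsplit(VALID[2]).path
decoded_path = unquote(raw_path)
decoded_query = unquote(urlsplit(VALID[3]).query)
```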
[issue13244] WebSocket schemes in urlparse
Tobias Oberstein <tobias.oberst...@tavendo.de> added the comment:

I'll ask (to be sure) and link.

However, after rereading the Hybi 17 section, it says

    path = path-abempty, defined in [RFC3986], Section 3.3

And http://tools.ietf.org/html/rfc3986 says:

"The path is terminated by the first question mark ("?") or number sign ("#") character, or by the end of the URI."

So my reading would be: a non-escaped # can never be part of the path of a WebSocket URL, by reference to RFC 3986.

--
Python tracker <http://bugs.python.org/issue13244>
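The path-termination rule quoted above can be checked with the regular expression that RFC 3986 gives in its Appendix B for breaking a URI reference into components. The wrapper function split_uri and the dictionary layout are illustrative choices made for this sketch.

```python
import re

# Regular expression from RFC 3986, Appendix B.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def split_uri(uri):
    """Return the five generic URI components; absent ones are None."""
    match = URI_RE.match(uri)
    return {
        "scheme": match.group(2),
        "authority": match.group(4),
        "path": match.group(5),
        "query": match.group(7),
        "fragment": match.group(9),
    }


parts = split_uri("ws://example.com/something#somewhere/foo")
```

Because the path group is `[^?#]*`, the path ends at the first "?" or "#", so "#somewhere/foo" can only ever be parsed as a fragment under the generic syntax.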
[issue13244] WebSocket schemes in urlparse
Tobias Oberstein <tobias.oberst...@tavendo.de> added the comment:

Here are the links to the question on the Hybi list:

http://www.ietf.org/mail-archive/web/hybi/current/msg09257.html

and

http://www.ietf.org/mail-archive/web/hybi/current/msg09258.html
http://www.ietf.org/mail-archive/web/hybi/current/msg09243.html

==

I'll track those and come back when there is a conclusion ..

--
Python tracker <http://bugs.python.org/issue13244>
[issue13244] WebSocket schemes in urlparse
Tobias Oberstein <tobias.oberst...@tavendo.de> added the comment:

Sorry for "throw" .. a somewhat bad habit (stemming from wandering between languages).

uses_fragment extended:

    [autobahn@autobahnhub ~/Autobahn]$ python
    Python 2.7.1 (r271:86832, Dec 13 2010, 15:52:15)
    [GCC 4.2.1 20070719 [FreeBSD]] on freebsd8
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import urlparse
    >>> wsschemes = ["ws", "wss"]
    >>> urlparse.uses_relative.extend(wsschemes)
    >>> urlparse.uses_netloc.extend(wsschemes)
    >>> urlparse.uses_params.extend(wsschemes)
    >>> urlparse.uses_query.extend(wsschemes)
    >>> urlparse.uses_fragment.extend(wsschemes)
    >>> urlparse.urlparse("ws://example.com/something#somewhere/")
    ParseResult(scheme='ws', netloc='example.com', path='/something', params='', query='', fragment='somewhere/')
    >>> urlparse.urlparse("ws://example.com/something#somewhere")
    ParseResult(scheme='ws', netloc='example.com', path='/something', params='', query='', fragment='somewhere')

=> fragment extracted

uses_fragment not extended:

    [autobahn@autobahnhub ~/Autobahn]$ python
    Python 2.7.1 (r271:86832, Dec 13 2010, 15:52:15)
    [GCC 4.2.1 20070719 [FreeBSD]] on freebsd8
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import urlparse
    >>> wsschemes = ["ws", "wss"]
    >>> urlparse.uses_relative.extend(wsschemes)
    >>> urlparse.uses_netloc.extend(wsschemes)
    >>> urlparse.uses_params.extend(wsschemes)
    >>> urlparse.uses_query.extend(wsschemes)
    >>> urlparse.urlparse("ws://example.com/something#somewhere/")
    ParseResult(scheme='ws', netloc='example.com', path='/something#somewhere/', params='', query='', fragment='')
    >>> urlparse.urlparse("ws://example.com/something#somewhere")
    ParseResult(scheme='ws', netloc='example.com', path='/something#somewhere', params='', query='', fragment='')

=> no fragment extracted, but interpreted as part of the path component
=> no exception raised

The answer on Hybi is still outstanding, but I would interpret Hybi-17 as follows:

# must always be escaped, both in path and query components. Fragment components are not allowed. Thus, an unescaped # can never appear in a WS URL. Further, it must not be ignored; instead, the WS handshake must be failed.

If this should indeed be the correct reading of the WS spec, then I think urlparse should raise an exception upon an unescaped # within URLs from the ws/wss schemes.

--
Python tracker <http://bugs.python.org/issue13244>