[Python-ideas] Re: A shortcut to load a JSON file into a dict : json.loadf
On Thu, Sep 17, 2020 at 3:02 PM Wes Turner wrote:
>
> Something like this in the docstring?: "In order to support the historical
> JSON specification and closed ecosystem JSON, it is possible to specify an
> encoding other than UTF-8."
>

I don't think dumpf should support an encoding parameter.

1. Output is ASCII unless `ensure_ascii=False` is specified.
2. Writing a new JSON file with an obsolete spec is not recommended.
3. If users really need it, they can write obsolete JSON with `dump` or `dumps` anyway.

I am against adding an `encoding` parameter to dumpf and loadf. They are just shortcuts for common cases.

Regards,
--
Inada Naoki
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/7V3UAEBUKYEQDYYJIGJUVW3DYCDDQN46/
Code of Conduct: http://python.org/psf/codeofconduct/
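A minimal sketch of point 1 above, using only `json.dumps` from the standard library. The sample data is made up for illustration. With the default `ensure_ascii=True` the serialized text is pure ASCII, so the file encoding matters only once `ensure_ascii=False` is passed.

```python
import json

# Hypothetical sample data containing one non-ASCII character.
data = {"name": "café"}

# Default (ensure_ascii=True): non-ASCII characters are written as \uXXXX
# escapes, so the resulting text contains only ASCII characters.
escaped = json.dumps(data)

# ensure_ascii=False keeps the characters as-is; the encoding used to write
# the file now determines the bytes on disk.
raw = json.dumps(data, ensure_ascii=False)

print(escaped)
print(raw)
```

Both strings parse back to the same object; only their on-disk representation differs.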
[Python-ideas] Re: A shortcut to load a JSON file into a dict : json.loadf
Something like this in the docstring?: "In order to support the historical JSON specification and closed ecosystem JSON, it is possible to specify an encoding other than UTF-8."

8.1. Character Encoding

> JSON text exchanged between systems that are not part of a closed
> ecosystem MUST be encoded using UTF-8 [RFC3629].
>
> Previous specifications of JSON have not required the use of UTF-8
> when transmitting JSON text. However, the vast majority of JSON-
> based software implementations have chosen to use the UTF-8 encoding,
> to the extent that it is the only encoding that achieves
> interoperability.
>
> Implementations MUST NOT add a byte order mark (U+FEFF) to the
> beginning of a networked-transmitted JSON text. In the interests of
> interoperability, implementations that parse JSON texts MAY ignore
> the presence of a byte order mark rather than treating it as an
> error.

```python
import json
import os


def dumpf(obj, path, *, encoding="UTF-8", **kwargs):
    with open(os.fspath(path), "w", encoding=encoding) as f:
        return json.dump(obj, f, **kwargs)


def loadf(path, *, encoding="UTF-8", **kwargs):
    with open(os.fspath(path), "r", encoding=encoding) as f:
        return json.load(f, **kwargs)


import pathlib
import unittest


class TestJsonLoadfAndDumpf(unittest.TestCase):
    def setUp(self):
        self.encodings = [None, "UTF-8", "UTF-16", "UTF-32"]
        data = dict(
            obj=dict(a=dict(b=[1, 2, 3])),
            path=pathlib.Path(".") / "test_loadf_and_dumpf.json",
        )
        if os.path.isfile(data["path"]):
            os.unlink(data["path"])
        self.data = data

    def test_dumpf_and_loadf(self):
        data = self.data
        for encoding in self.encodings:
            path = f'{data["path"]}.{encoding}.json'
            dumpf_output = dumpf(data["obj"], path, encoding=encoding)
            loadf_output = loadf(path, encoding=encoding)
            assert loadf_output == data["obj"]


# $ pip install pytest-cov
# $ pytest -v example.py
# https://docs.pytest.org/en/stable/parametrize.html
# https://docs.pytest.org/en/stable/tmpdir.html

import pytest


@pytest.mark.parametrize("encoding", [None, "UTF-8", "UTF-16", "UTF-32"])
@pytest.mark.parametrize("obj", [dict(a=dict(b=[1, 2, 3]))])
def test_dumpf_and_loadf(obj, encoding, tmpdir):
    pth = pathlib.Path(tmpdir) / f"test_loadf_and_dumpf.{encoding}.json"
    dumpf_output = dumpf(obj, pth, encoding=encoding)
    loadf_output = loadf(pth, encoding=encoding)
    assert loadf_output == obj
```

For whoever creates a PR for this:

- [ ] add parameter and return type annotations
- [ ] copy docstrings from json.load/json.dump and open#encoding
- [ ] correctly support the c module implementation (this just does `import json`)?
- [ ] keep or drop the encoding tests?

On Thu, Sep 17, 2020 at 1:25 AM Christopher Barker wrote:

> Is that suggested code? I don't follow.
>
> But if it is, no. personally, I think ANY use of system settings is a bad
> idea [*]. But certainly no need to even think about it for JSON.
>
> -CHB
>
> * have we not learned that in the age of the internet the machine the code
> happens to be running on has nothing to do with the user of the
> applications' needs? Timezones, encodings, number formats, NOTHING.
>
> On Wed, Sep 16, 2020 at 8:45 PM Wes Turner wrote:
>
>> Is all of this locale/encoding testing necessary (or even sufficient)?
[Python-ideas] Re: A shortcut to load a JSON file into a dict : json.loadf
Is that suggested code? I don't follow.

But if it is, no. Personally, I think ANY use of system settings is a bad idea [*]. But certainly no need to even think about it for JSON.

-CHB

[*] Have we not learned that in the age of the internet the machine the code happens to be running on has nothing to do with the user of the applications' needs? Timezones, encodings, number formats, NOTHING.

On Wed, Sep 16, 2020 at 8:45 PM Wes Turner wrote:

> Is all of this locale/encoding testing necessary (or even sufficient)?

--
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/PKD2CKYJIWXNDMI6GFDFOUPNHDVMCDJP/
[Python-ideas] Re: A shortcut to load a JSON file into a dict : json.loadf
On Thu, Sep 17, 2020 at 6:54 AM Wes Turner wrote:
>
> Why would we impose UTF-8 when the spec says UTF-8, UTF-16, or UTF-32?

The obsolete JSON spec said UTF-8, UTF-16, and UTF-32. The current spec says UTF-8.
See https://tools.ietf.org/html/rfc8259#section-8.1

So `dumpf` must use UTF-8, although `loadf` can support UTF-16 and UTF-32 like `loads`.

> How could this be improved? (I'm on my phone, so)
>
> def dumpf(obj, path, *args, **kwargs):
>     with open(getattr(path, '__path__', path), 'w',
>               encoding=kwargs.get('encoding', 'utf8')) as _file:
>         return dump(_file, *args, **kwargs)
>
> def loadf(obj, path, *args, **kwargs):
>     with open(getattr(path, '__path__', path),
>               encoding=kwargs.get('encoding', 'utf8')) as _file:
>         return load(_file, *args, **kwargs)

    def dumpf(obj, path, **kwargs):
        # Always write UTF-8, as the current spec requires.
        with open(path, "w", encoding="utf-8") as f:
            return dump(obj, f, **kwargs)

    def loadf(path, **kwargs):
        # Binary mode lets load() detect UTF-8, UTF-16, or UTF-32.
        with open(path, "rb") as f:
            return load(f, **kwargs)

Regards,
--
Inada Naoki
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/5RPHOVBMAC3USBKY7S2G4WVEC4JR4IV6/
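A self-contained sketch of the asymmetry described above: write UTF-8 only, read whatever `json.load` can detect. `dumpf` and `loadf` are the names proposed in this thread and are not part of the `json` module. The temporary file names are made up for illustration.

```python
import json
import os
import tempfile


def dumpf(obj, path, **kwargs):
    """Write obj as JSON to path, always encoded as UTF-8."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(obj, f, **kwargs)


def loadf(path, **kwargs):
    """Read JSON from path; binary mode lets json.load detect the encoding."""
    with open(path, "rb") as f:
        return json.load(f, **kwargs)


with tempfile.TemporaryDirectory() as tmp:
    # Round trip through the UTF-8 writer.
    utf8_path = os.path.join(tmp, "utf8.json")
    dumpf({"a": [1, 2, 3]}, utf8_path)
    from_utf8 = loadf(utf8_path)

    # A file produced elsewhere in UTF-16 still loads without an
    # encoding argument.
    utf16_path = os.path.join(tmp, "utf16.json")
    with open(utf16_path, "w", encoding="utf-16") as f:
        f.write('{"a": [1, 2, 3]}')
    from_utf16 = loadf(utf16_path)
```

Anyone who needs to write a different encoding can still open the file themselves and call `json.dump` on it.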
[Python-ideas] Re: A shortcut to load a JSON file into a dict : json.loadf
Is all of this locale/encoding testing necessary (or even sufficient)?

```python
import json
import locale
import os


def get_default_encoding():
    """
    TODO XXX: ???
    """
    default_encoding = locale.getdefaultlocale()[1]
    # getdefaultlocale() may report no encoding at all (None).
    if default_encoding and default_encoding.startswith("UTF-"):
        return default_encoding
    else:
        return "UTF-8"


def dumpf(obj, path, *args, **kwargs):
    with open(
        os.fspath(path),
        "w",
        encoding=kwargs.pop("encoding", get_default_encoding()),
    ) as file_:
        return json.dump(obj, file_, *args, **kwargs)


def loadf(path, *args, **kwargs):
    with open(
        os.fspath(path),
        "r",
        encoding=kwargs.pop("encoding", get_default_encoding()),
    ) as file_:
        return json.load(file_, *args, **kwargs)


import pathlib
import unittest


class TestJsonLoadfAndDumpf(unittest.TestCase):
    def setUp(self):
        self.locales = ["", "C", "en_US.UTF-8", "japanese"]
        self.encodings = [None, "UTF-8", "UTF-16", "UTF-32"]

        data = dict(
            obj=dict(a=dict(b=[1, 2, 3])),
            encoding=None,
            path=pathlib.Path(".") / "test_loadf_and_dumpf.json",
        )
        if os.path.isfile(data["path"]):
            os.unlink(data["path"])
        self.data = data

        self.previous_locale = locale.getlocale()

    def tearDown(self):
        locale.setlocale(locale.LC_ALL, self.previous_locale)

    def test_get_default_encoding(self):
        for localestr in self.locales:
            locale.setlocale(locale.LC_ALL, localestr)
            output = get_default_encoding()
            assert output.startswith("UTF-")

    def test_dumpf_and_loadf(self):
        data = self.data
        for localestr in self.locales:
            locale.setlocale(locale.LC_ALL, localestr)
            for encoding in self.encodings:
                dumpf_output = dumpf(
                    data["obj"], data["path"], encoding=encoding
                )
                loadf_output = loadf(data["path"], encoding=encoding)
                assert loadf_output == data["obj"]
```

On Wed, Sep 16, 2020 at 8:30 PM Christopher Barker wrote:

> The idea was that the encoding was one of the motivators to doing this in
> the first place. But I suppose as long as utf-8 is the default, and only
> the three "official" ones are allowed, then yeah, we could add an encoding
> keyword argument.
>
> -CHB

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/G72K6AGXLVMBQAZYECK6N5VGBYDD3RYL/
[Python-ideas] Re: Fwd: Re: Experimenting with dict performance, and an immutable dict
On Thu, Sep 17, 2020 at 8:03 AM Marco Sulla wrote:
>
> Well, it seems ok now:
> https://github.com/python/cpython/compare/master...Marco-Sulla:master
>
> I've done a quick speed test and speedup is quite high for a creation
> using keywords or a dict with "holes": about 30%:

30% on a microbenchmark is not that high.

For example, I have optimized "copy dict with holes", but I rejected my own PR because I am not sure the performance / maintenance cost ratio is good enough.

https://bugs.python.org/issue41431#msg374556
https://github.com/python/cpython/pull/21669

> python -m timeit -n 2000 --setup "from uuid import uuid4 ; o =
> {str(uuid4()).replace('-', '') : str(uuid4()).replace('-', '') for i
> in range(1)}" "dict(**o)"

I don't think this use case is worth optimizing, because `dict(o)` or `o.copy()` is the Pythonic way to copy a dict.

> python -m timeit -n 1 --setup "from uuid import uuid4 ; o =
> {str(uuid4()).replace('-', '') : str(uuid4()).replace('-', '') for i
> in range(1)} ; it = iter(o) ; key0 = next(it) ; o.pop(key0)"
> "dict(o)"

This one is controversial. If the optimization is very simple, it might be worthwhile.

Regards,
--
Inada Naoki
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/LE6RLLKF4QRRA4P2EXUK5MXVH6X4CSUZ/
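For comparison, a small sketch that times the three ways of copying a dict mentioned above with the standard `timeit` module. The dict size (1000 keys) and the repetition count are assumptions chosen for illustration, not the values used in the thread. The numbers it prints are a rough microbenchmark, not evidence about any particular CPython patch.

```python
import timeit
from uuid import uuid4

# Keys must be strings for dict(**o) to work at all.
o = {uuid4().hex: i for i in range(1000)}

candidates = {
    "dict(**o)": lambda: dict(**o),
    "dict(o)": lambda: dict(o),
    "o.copy()": lambda: o.copy(),
}

for label, func in candidates.items():
    # Total wall-clock time for 2000 calls of each candidate.
    seconds = timeit.timeit(func, number=2000)
    print(f"{label:10} {seconds:.4f}s for 2000 calls")
```

All three produce equal dicts; only the construction path differs.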
[Python-ideas] Re: Fwd: Re: Experimenting with dict performance, and an immutable dict
Would an e.g. bm_dict.py in [1] be a good place for a few benchmarks of dict; or is there a more appropriate project for authoritatively measuring performance regressions and optimizations of core {cpython,} data structures?

[1] https://github.com/python/pyperformance/tree/master/pyperformance/benchmarks

(pytest-benchmark looks neat as well. An example of how to use pytest.mark.parametrize to capture multiple metrics might be helpful: https://github.com/ionelmc/pytest-benchmark )

It's easy to imagine a bot that runs some or all performance benchmarks on a PR when requested in a PR comment; there's probably already a good way to do this?

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/GNVALID7YU3IP6HUH7K7BF56CDMJJACK/
[Python-ideas] Re: Fwd: Re: Experimenting with dict performance, and an immutable dict
That sounds like a worthwhile optimization. FWIW, is this a bit simpler but sufficient?: python -m timeit -n 2000 --setup "from uuid import uuid4; \ o = {uuid4().hex: i for i in range(1)}" \ "dict(**o)" Is there a preferred tool to comprehensively measure the performance impact of a PR (with e.g. multiple contrived and average-case key/value sets)? On Wed, Sep 16, 2020, 7:07 PM Marco Sulla wrote: > Well, it seems ok now: > https://github.com/python/cpython/compare/master...Marco-Sulla:master > > I've done a quick speed test and speedup is quite high for a creation > using keywods or a dict with "holes": about 30%: > > python -m timeit -n 2000 --setup "from uuid import uuid4 ; o = > {str(uuid4()).replace('-', '') : str(uuid4()).replace('-', '') for i > in range(1)}" "dict(**o)" > > python -m timeit -n 1 --setup "from uuid import uuid4 ; o = > {str(uuid4()).replace('-', '') : str(uuid4()).replace('-', '') for i > in range(1)} ; it = iter(o) ; key0 = next(it) ; o.pop(key0)" > "dict(o)" > > Can I do a PR? > ___ > Python-ideas mailing list -- python-ideas@python.org > To unsubscribe send an email to python-ideas-le...@python.org > https://mail.python.org/mailman3/lists/python-ideas.python.org/ > Message archived at > https://mail.python.org/archives/list/python-ideas@python.org/message/QWXD2D4SC6XHZLV3QA4TMGMI7Z7SAJ2R/ > Code of Conduct: http://python.org/psf/codeofconduct/ > ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/6RALUK5FUS25W4G5DM7ILZHFJOJTSPIM/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: A shortcut to load a JSON file into a dict : json.loadf
On Wed, Sep 16, 2020 at 2:53 PM Wes Turner wrote:

> So I was not correct: dump does not default to UTF-8 (and does not accept
> an encoding= parameter)
>
>> I think dumpf() should use UTF-8, and that's it. If anyone really wants
>> something else, they can get it by providing an open text file object.
>
> Why would we impose UTF-8 when the spec says UTF-8, UTF-16, or UTF-32?

The idea was that the encoding was one of the motivators for doing this in the first place. But I suppose as long as UTF-8 is the default, and only the three "official" ones are allowed, then yeah, we could add an encoding keyword argument.

-CHB
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/O73WZF6JKME2VPVWOWYRVQ3APVEA2J5V/
[Python-ideas] Re: A shortcut to load a JSON file into a dict : json.loadf
I believe Sergie already suggested pickle and marshal, and I guess we can add plistlib to those.

Personally, I'm not so sure it should be added to all these. I see why the same API was used for all of them, but they really are fairly different beasts. So if they have a function with the same purpose, it should have the same name, but that doesn't mean that all these modules need to have all the functions.

On the other hand, the fact that we might be adding two new functions to four different modules is, in my mind, an argument for overloading the existing dump() / load() instead. A lot less API churn.

-CHB

On Wed, Sep 16, 2020 at 5:10 PM Chris Angelico wrote:

> marshal is the other one in that set, and a quick 'git grep' shows
> that plistlib also has that API. The xmlrpc.client module also has
> dumps/loads, but not dump/load.
>
> ChrisA

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/KMZZZKZGVEJFQFYTNO5IEFWR2N6FJ2SH/
[Python-ideas] Re: A shortcut to load a JSON file into a dict : json.loadf
On Thu, Sep 17, 2020 at 9:53 AM wrote: > > Maybe unrelated, but the same goes for `pickle.load` and `pickle.dump`. For > consistencies, any changes made to `json.load` and `json.dump` (e.g. adding > `json.loadf` and `json.dumpf` or accepting a path like as argument) should be > also applied equivalently to `pickle.load` and `pickle.dump`. > > Off the top of my head, I can't think of any more places in the standard > library with the same parallel structure. > marshal is the other one in that set, and a quick 'git grep' shows that plistlib also has that API. The xmlrpc.client module also has dumps/loads, but not dump/load. ChrisA ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/AWJNAL5ZHJ25KQFEV4UNAWA6O3KXW6RT/ Code of Conduct: http://python.org/psf/codeofconduct/
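To illustrate the parallel structure mentioned above, a sketch that drives `json`, `pickle`, `marshal`, and `plistlib` through the same `dump(obj, fp)` / `load(fp)` calls. The generic `dumpf` / `loadf` helpers taking a module argument are hypothetical and exist only for this illustration. The one visible difference is that `json` works on text files while the other three need binary files.

```python
import json
import marshal
import os
import pickle
import plistlib
import tempfile

# (module, write mode, read mode): json uses text files, the rest binary.
MODULES = [
    (json, "w", "r"),
    (pickle, "wb", "rb"),
    (marshal, "wb", "rb"),
    (plistlib, "wb", "rb"),
]


def dumpf(module, obj, path, write_mode):
    """Hypothetical path-based wrapper around module.dump()."""
    with open(path, write_mode) as f:
        module.dump(obj, f)


def loadf(module, path, read_mode):
    """Hypothetical path-based wrapper around module.load()."""
    with open(path, read_mode) as f:
        return module.load(f)


obj = {"a": {"b": [1, 2, 3]}}
results = {}
with tempfile.TemporaryDirectory() as tmp:
    for module, write_mode, read_mode in MODULES:
        path = os.path.join(tmp, f"data.{module.__name__}")
        dumpf(module, obj, path, write_mode)
        results[module.__name__] = loadf(module, path, read_mode)
```

The text-versus-binary split is one concrete way in which these modules are "fairly different beasts" despite the shared function names.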
[Python-ideas] Re: A shortcut to load a JSON file into a dict : json.loadf
Maybe unrelated, but the same goes for `pickle.load` and `pickle.dump`. For consistency, any changes made to `json.load` and `json.dump` (e.g. adding `json.loadf` and `json.dumpf`, or accepting a path-like object as an argument) should also be applied equivalently to `pickle.load` and `pickle.dump`.

Off the top of my head, I can't think of any more places in the standard library with the same parallel structure.

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/RUZLS2JIFURTBW447TQ3P6HAEDQDEYVZ/
[Python-ideas] Re: Fwd: Re: Experimenting with dict performance, and an immutable dict
Well, it seems ok now:
https://github.com/python/cpython/compare/master...Marco-Sulla:master

I've done a quick speed test and speedup is quite high for a creation using keywords or a dict with "holes": about 30%:

    python -m timeit -n 2000 --setup "from uuid import uuid4 ; o = {str(uuid4()).replace('-', '') : str(uuid4()).replace('-', '') for i in range(1)}" "dict(**o)"

    python -m timeit -n 1 --setup "from uuid import uuid4 ; o = {str(uuid4()).replace('-', '') : str(uuid4()).replace('-', '') for i in range(1)} ; it = iter(o) ; key0 = next(it) ; o.pop(key0)" "dict(o)"

Can I do a PR?

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/QWXD2D4SC6XHZLV3QA4TMGMI7Z7SAJ2R/
[Python-ideas] Re: A shortcut to load a JSON file into a dict : json.loadf
https://docs.python.org/3/library/os.html#os.fspath

*__fspath__

On Wed, Sep 16, 2020, 5:53 PM Wes Turner wrote:

> How could this be improved? (I'm on my phone, so)
>
> def dumpf(obj, path, *args, **kwargs):
>     with open(getattr(path, '__path__', path), 'w',
>               encoding=kwargs.get('encoding', 'utf8')) as _file:
>         return dump(_file, *args, **kwargs)
>
> def loadf(obj, path, *args, **kwargs):
>     with open(getattr(path, '__path__', path),
>               encoding=kwargs.get('encoding', 'utf8')) as _file:
>         return load(_file, *args, **kwargs)

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/AAF7PYP2ABCT26CXQCNGNX5FVTZA7FPO/
[Python-ideas] Re: A shortcut to load a JSON file into a dict : json.loadf
os.fspath exists for a reason. ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/RPF2GMTAQCMMHSKIRO7HDD73ZPPHRUVL/ Code of Conduct: http://python.org/psf/codeofconduct/
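A short sketch of what `os.fspath` does. The `Report` class is hypothetical and only shows an object implementing `__fspath__`. Unlike a `getattr(path, ..., path)` fallback, `os.fspath` rejects objects that are not paths instead of passing them through.

```python
import os
import pathlib


class Report:
    """Hypothetical object that knows where it lives on disk."""

    def __init__(self, filename):
        self.filename = filename

    def __fspath__(self):
        return self.filename


# str is returned unchanged.
as_str = os.fspath("data.json")
# pathlib.Path implements __fspath__.
as_path = os.fspath(pathlib.Path("data.json"))
# Any object implementing __fspath__ works the same way.
as_custom = os.fspath(Report("data.json"))

# Objects without __fspath__ raise TypeError rather than slipping through.
try:
    os.fspath(42)
except TypeError as exc:
    error = exc
```

Note that `open()` itself accepts path-like objects, so a wrapper can also pass `path` straight through.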
[Python-ideas] Re: A shortcut to load a JSON file into a dict : json.loadf
On Wed, Sep 16, 2020, 5:18 PM Christopher Barker wrote:

> On Tue, Sep 15, 2020 at 5:26 PM Wes Turner wrote:
>
>> On Tue, Sep 15, 2020 at 9:09 AM Wes Turner wrote:
>>
>>> json.load and json.dump already default to UTF8 and already have
>>> parameters for json loading and dumping.
>
> so it turns out that loads(), which optionally takes a bytes or bytearray
> object, tries to determine whether the encoding is UTF-8, UTF-16 or UTF-32
> (the ones allowed by the standard) (thanks Guido for the pointer). And
> load() calls loads(), so it should work with binary mode files as well.
>
> Currently, dump() simply uses the fp passed in, and it doesn't support
> binary files, so it'll use the encoding the user set (or the default, if
> not set, which is an issue here). dumps() returns a string, so no encoding
> there.

So I was not correct: dump does not default to UTF-8 (and does not accept an
encoding= parameter).

> I think dumpf() should use UTF-8, and that's it. If anyone really wants
> something else, they can get it by providing an open text file object.

Why would we impose UTF-8 when the spec says UTF-8, UTF-16, or UTF-32?

How could this be improved? (I'm on my phone, so)

    import json
    import os

    def dumpf(obj, path, *, encoding='utf8', **kwargs):
        # encoding goes to open(); json.dump() does not accept it
        with open(os.fspath(path), 'w', encoding=encoding) as _file:
            return json.dump(obj, _file, **kwargs)

    def loadf(path, *, encoding='utf8', **kwargs):
        # encoding goes to open(); json.load() does not accept it
        with open(os.fspath(path), encoding=encoding) as _file:
            return json.load(_file, **kwargs)

> loadf(), on the other hand, is a bit tricky -- it could allow only UTF-8,
> but it seems it would be more consistent (and easy to do) to open the file
> in binary mode and use the existing code to determine the encoding.
>
> -CHB
>
>> The Python JSON implementation should support the full JSON spec
>> (including UTF-8, UTF-16, and UTF-32) and should default to UTF-8.
>
> 'turns out it does already, and no one is suggesting changing that.
>
>>> Anyway -- if anyone wants to push for overloading .load()/dump(), rather
>>> than making two new loadf() and dumpf() functions, then speak now -- that
>>> will take more discussion, and maybe a PEP.
>>
>> I don't see why one or the other would need a PEP so long as the new
>> functionality is backward-compatible?
>
> I'm just putting my finger in the wind. No need for a PEP if it's simple
> and non-controversial, but if even the few folks on this thread don't agree
> on the API we want, then it's maybe too controversial -- so either more
> discussion, to come to consensus, or a PEP.
>
> Or not -- we can see what the core devs say if/when someone does a bpo /
> PR.
>
> -CHB
>
> --
> Christopher Barker, PhD
>
> Python Language Consulting
> - Teaching
> - Scientific Software Development
> - Desktop GUI and Web Development
> - wxPython, numpy, scipy, Cython

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/JAIJVMUQW37S63UFZJSWH5S6BSRBWK6F/
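To make the point about dump() concrete, here is a small sketch (not from the thread; the sample object and file names are made up for illustration). It shows that json.dump() only writes str, so it needs a text-mode file object, and that the bytes that end up on disk are decided by whatever encoding was passed to open(), not by the json module.

```python
import io
import json
import os
import tempfile

obj = {"name": "café"}  # illustrative data, not from the thread

# json.dump() writes str chunks, so a text-mode file object works ...
text_buffer = io.StringIO()
json.dump(obj, text_buffer)
ascii_text = text_buffer.getvalue()  # non-ASCII is escaped by default

# ... while a binary-mode file object is rejected with TypeError.
try:
    json.dump(obj, io.BytesIO())
    binary_ok = True
except TypeError:
    binary_ok = False

# With a real file, the bytes on disk depend on the encoding given to open().
# ensure_ascii=False keeps the "é" unescaped so the difference is visible.
with tempfile.TemporaryDirectory() as tmp:
    sizes = {}
    for encoding in ("utf-8", "utf-16"):
        path = os.path.join(tmp, f"demo.{encoding}.json")
        with open(path, "w", encoding=encoding) as f:
            json.dump(obj, f, ensure_ascii=False)
        sizes[encoding] = os.path.getsize(path)
```

If open() is called without encoding=, the platform default is used, which is the "issue" mentioned in the quoted text.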
[Python-ideas] Re: A shortcut to load a JSON file into a dict : json.loadf
On Tue, Sep 15, 2020 at 5:26 PM Wes Turner wrote:

> On Tue, Sep 15, 2020 at 9:09 AM Wes Turner wrote:
>
>> json.load and json.dump already default to UTF8 and already have
>> parameters for json loading and dumping.

so it turns out that loads(), which optionally takes a bytes or bytearray
object, tries to determine whether the encoding is UTF-8, UTF-16 or UTF-32
(the ones allowed by the standard) (thanks Guido for the pointer). And
load() calls loads(), so it should work with binary mode files as well.

Currently, dump() simply uses the fp passed in, and it doesn't support
binary files, so it'll use the encoding the user set (or the default, if
not set, which is an issue here). dumps() returns a string, so no encoding
there.

I think dumpf() should use UTF-8, and that's it. If anyone really wants
something else, they can get it by providing an open text file object.

loadf(), on the other hand, is a bit tricky -- it could allow only UTF-8,
but it seems it would be more consistent (and easy to do) to open the file
in binary mode and use the existing code to determine the encoding.

-CHB

> The Python JSON implementation should support the full JSON spec
> (including UTF-8, UTF-16, and UTF-32) and should default to UTF-8.

'turns out it does already, and no one is suggesting changing that.

>> Anyway -- if anyone wants to push for overloading .load()/dump(), rather
>> than making two new loadf() and dumpf() functions, then speak now -- that
>> will take more discussion, and maybe a PEP.
>
> I don't see why one or the other would need a PEP so long as the new
> functionality is backward-compatible?

I'm just putting my finger in the wind. No need for a PEP if it's simple
and non-controversial, but if even the few folks on this thread don't agree
on the API we want, then it's maybe too controversial -- so either more
discussion, to come to consensus, or a PEP.

Or not -- we can see what the core devs say if/when someone does a bpo / PR.

-CHB

--
Christopher Barker, PhD

Python Language Consulting
- Teaching
- Scientific Software Development
- Desktop GUI and Web Development
- wxPython, numpy, scipy, Cython

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/2BOVZAJAC7X3PBWNGAYUGBTVGZBVEZW5/
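A minimal sketch of the behavior proposed in the message above, assuming nothing beyond what it says: dumpf() always writes UTF-8, and loadf() opens the file in binary mode so the existing detection in json.load()/json.loads() picks the encoding. Both function names are the hypothetical ones from this thread; they are not part of the json module, and the file names are invented for the demo.

```python
import json
import os
import tempfile


def dumpf(obj, path, **kwargs):
    """Hypothetical helper from the thread: always write UTF-8."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(obj, f, **kwargs)


def loadf(path, **kwargs):
    """Hypothetical helper: read bytes and let json.load() detect the encoding."""
    with open(path, "rb") as f:
        return json.load(f, **kwargs)


with tempfile.TemporaryDirectory() as tmp:
    # Round trip through the two helpers.
    path = os.path.join(tmp, "example.json")
    dumpf({"a": {"b": [1, 2, 3]}}, path)
    round_tripped = loadf(path)

    # A file written as UTF-16 by some other tool is still readable,
    # because loadf() hands raw bytes to json.load().
    utf16_path = os.path.join(tmp, "utf16.json")
    with open(utf16_path, "w", encoding="utf-16") as f:
        json.dump({"this": 5}, f)
    from_utf16 = loadf(utf16_path)
```

The asymmetry is deliberate in this sketch: writing is opinionated (UTF-8 only), reading is tolerant of the three encodings the standard mentions.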
[Python-ideas] Re: 'Infinity' constant in Python
Could you all please start another thread if you want to discuss possible
changes to error handling for floats. Or anything that isn't strictly adding
some names to builtins.

There's been ongoing confusion from the expansion of the original topic here.

Thanks,
-CHB

On Wed, Sep 16, 2020 at 8:28 AM Stephen J. Turnbull <turnbull.stephen...@u.tsukuba.ac.jp> wrote:

> Paul Moore writes:
>
> > > > And as soon as we start considering integer division, we're talking
> > > > about breaking a *vast* amount of code.
> > >
> > > Yeah, I'm ok with *not* breaking that code.
> >
> > You may have misunderstood me - when I said "integer division", I
> > meant "division of two integers",
>
> Just to clear it up, I understood your point correctly. "I'm ok with
> *not* breaking that code" means "I'm talking about the mythical Python
> 4.0, obviously we can't change the error raised by 1 / 0".
>
> > My *only* concern with the points you and Ben were making was that you
> > seemed to be suggesting changes to the division operator and
> > ZeroDivisionError,
>
> Once again, I am quite ok with *not* breaking all that code. My point
> about the inconsistencies is not to suggest fixing them. I'm quite
> sure that pragmatically we can't fix *all* of them, and most likely
> we'd have to go slow on fixing *any* of them. Rather that the whole
> float situation is so visually messy that we should leave it alone --
> although it probably works fine in practice until you have need for
> NumPy for other reasons.

--
Christopher Barker, PhD

Python Language Consulting
- Teaching
- Scientific Software Development
- Desktop GUI and Web Development
- wxPython, numpy, scipy, Cython

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/U3Q7IFJZSEDZG7UKD4UCW5QSZHY5LHFA/
[Python-ideas] Re: A shortcut to load a JSON file into a dict : json.loadf
On Wed, Sep 16, 2020 at 12:59 AM Rob Cliffe via Python-ideas <python-ideas@python.org> wrote:

> On 14/09/2020 17:36, Christopher Barker wrote:
>
> If I've understood correctly (far from certain) the existing json.dumps
> and json.loads functions are permissive (allow some constructions that
> are not part of the JSON spec) but the proposed new functions will be
> strict.

As it looks like I may be the one to write the PR -- no, I'm not suggesting
any changes to compliance. The only thing even remotely on the table is only
supporting UTF-8 -- but IIUC, the current functions, if they do the
encoding/decoding for you, are already UTF-8 only, so no change.

load() and dump() work with text file-like objects -- they are not doing any
encoding/decoding.

loads() is working with strings or bytes. If strings, then no encoding. If
bytes, then:

"The ``encoding`` argument is ignored and deprecated since Python 3.1"

which I figured meant utf-8, but in fact it seems to work with utf-16 as well:

    In [17]: utf16 = '{"this": 5}'.encode('utf-16')
    In [18]: json.loads(utf16)
    Out[18]: {'this': 5}

which surprises me. I'll need to look at the code and see what it's doing.
Unless someone wants to tell us :-)

dumps(), meanwhile, dumps a str, so again, no encoding.

The idea here is that if you want to use loadf() or dumpf(), it will be
utf-8, and if you want to use another encoding, you can open the file
yourself and use load() or dump().

> To minimise possible confusion, I think that the documentation (both the
> docstrings and the online docs) should be **very clear** about this.

Yes, and they need some help in that regard now anyway.

-CHB

> E.g.
> loads:
> ...
> loads accepts blah-blah-blah. This is different from loadf which only
> accepts strict JSON.
>
> loadf:
> ...
> loadf only accepts strict JSON. This is different from loads which
> blah-blah-blah
>
> Etc.
> Rob Cliffe

--
Christopher Barker, PhD

Python Language Consulting
- Teaching
- Scientific Software Development
- Desktop GUI and Web Development
- wxPython, numpy, scipy, Cython

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/B234P3HHBXN4GT7SJNXDHYAJOQSD7YXY/
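The surprise about UTF-16 in the message above can be reproduced as a script. This is only an illustrative sketch extending the interactive session shown there: it passes the same document to json.loads() as bytes in three encodings, and also feeds a binary file object to json.load(). The variable names are made up for the example.

```python
import io
import json

text = '{"this": 5}'

# json.loads() accepts bytes/bytearray and works out the encoding itself.
results = {}
for encoding in ("utf-8", "utf-16", "utf-32"):
    payload = text.encode(encoding)
    results[encoding] = json.loads(payload)

# bytearray input is handled the same way as bytes.
from_bytearray = json.loads(bytearray(text.encode("utf-16")))

# json.load() just reads the file object and delegates to json.loads(),
# so a binary-mode file object (simulated here with BytesIO) also works.
binary_file = io.BytesIO(text.encode("utf-16"))
from_binary_file = json.load(binary_file)
```

So the detection happens on the decoding side only; dumps() and dump() still produce str and leave encoding to the caller.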
[Python-ideas] Re: 'Infinity' constant in Python
Paul Moore writes:

 > > > And as soon as we start considering integer division, we're talking
 > > > about breaking a *vast* amount of code.
 > >
 > > Yeah, I'm ok with *not* breaking that code.
 >
 > You may have misunderstood me - when I said "integer division", I
 > meant "division of two integers",

Just to clear it up, I understood your point correctly. "I'm ok with
*not* breaking that code" means "I'm talking about the mythical Python
4.0, obviously we can't change the error raised by 1 / 0".

 > My *only* concern with the points you and Ben were making was that you
 > seemed to be suggesting changes to the division operator and
 > ZeroDivisionError,

Once again, I am quite ok with *not* breaking all that code. My point
about the inconsistencies is not to suggest fixing them. I'm quite
sure that pragmatically we can't fix *all* of them, and most likely
we'd have to go slow on fixing *any* of them. Rather, my point is that
the whole float situation is so visually messy that we should leave it
alone -- although it probably works fine in practice until you have need
for NumPy for other reasons.

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/FXIBYNXE3B53RTGQWUZT2PGNZ5OXB3BM/
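The message does not list the inconsistencies it has in mind, so the following is only a guess at the kind of behavior meant, using a handful of expressions chosen for illustration. The helper name outcome is made up. It records, for each expression, either the value produced or the name of the exception raised.

```python
import math


def outcome(thunk):
    """Return the result of thunk(), or the name of the exception it raises."""
    try:
        return thunk()
    except ArithmeticError as exc:  # covers ZeroDivisionError and OverflowError
        return type(exc).__name__


results = {
    # Division by zero raises for both ints and floats ...
    "1 / 0": outcome(lambda: 1 / 0),
    "1.0 / 0.0": outcome(lambda: 1.0 / 0.0),
    # ... but overflow in multiplication quietly produces infinity,
    "1e308 * 10": outcome(lambda: 1e308 * 10),
    # while overflow in exponentiation raises,
    "10.0 ** 400": outcome(lambda: 10.0 ** 400),
    # and arithmetic on infinities quietly produces NaN.
    "math.inf - math.inf": outcome(lambda: math.inf - math.inf),
}
```

Whether these are the cases the author meant is an assumption; they are simply easy to check in a stock CPython interpreter.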
[Python-ideas] Re: A shortcut to load a JSON file into a dict : json.loadf
On 14/09/2020 17:36, Christopher Barker wrote:

> There seems to be a fair bit of support for this idea.
>
> Will it need a PEP ?
>
> -CHB

If I've understood correctly (far from certain) the existing json.dumps
and json.loads functions are permissive (allow some constructions that
are not part of the JSON spec) but the proposed new functions will be
strict.

To minimise possible confusion, I think that the documentation (both the
docstrings and the online docs) should be **very clear** about this. E.g.

loads:
    ...
    loads accepts blah-blah-blah. This is different from loadf which only
    accepts strict JSON.

loadf:
    ...
    loadf only accepts strict JSON. This is different from loads which
    blah-blah-blah

Etc.

Rob Cliffe

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/ABQKK6GSK33I2XBJK4VK4RUEUAQ3HDVV/
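The message does not say which non-standard constructions it means. One example that the json module does accept by default is the set of constants NaN, Infinity and -Infinity, which are not part of the JSON grammar. The sketch below shows the lenient default and how the existing parse_constant and allow_nan parameters can be used to be strict; the helper name reject_constant is made up for the example.

```python
import json
import math

# Default behaviour: these non-standard constants are accepted on input ...
lenient = json.loads("[NaN, Infinity, -Infinity]")

# ... and emitted on output.
emitted = json.dumps(float("nan"))


def reject_constant(name):
    """Hypothetical hook: refuse NaN/Infinity/-Infinity while parsing."""
    raise ValueError(f"{name} is not valid JSON")


# Strict parsing via the parse_constant hook.
try:
    json.loads("[NaN]", parse_constant=reject_constant)
    strict_load_accepts_nan = True
except ValueError:
    strict_load_accepts_nan = False

# Strict serialisation via allow_nan=False.
try:
    json.dumps(float("nan"), allow_nan=False)
    strict_dump_accepts_nan = True
except ValueError:
    strict_dump_accepts_nan = False
```

Nothing here implies that the proposed loadf()/dumpf() would behave differently from load()/dump() in this respect; it only illustrates what "permissive" can mean for the existing functions.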