[Python-ideas] Re: A shortcut to load a JSON file into a dict : json.loadf

2020-09-16 Thread Inada Naoki
On Thu, Sep 17, 2020 at 3:02 PM Wes Turner  wrote:
>
> Something like this in the docstring?: "In order to support the historical 
> JSON specification and closed ecosystem JSON, it is possible to specify an 
> encoding other than UTF-8."
>

I don't think dumpf should support encoding parameter.

1. Output is ASCII unless `ensure_ascii=False` is specified.
2. Writing a new JSON file with an obsolete spec is not recommended.
3. If users really need it, they can still write obsolete JSON with `dump` or
`dumps`.

I am against adding an `encoding` parameter to dumpf and loadf. They are
just shortcuts for the common case.
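
To illustrate the point about ASCII output, here is a minimal sketch (the sample data is made up for illustration). With the default `ensure_ascii=True`, `json.dumps` escapes every non-ASCII character, so the choice of file encoding hardly matters for the default output:

```python
import json

# Sample data invented for this illustration.
data = {"name": "稲田", "note": "café"}

escaped = json.dumps(data)                      # default: ensure_ascii=True
literal = json.dumps(data, ensure_ascii=False)  # keeps non-ASCII characters

print(escaped)
print(literal)
```

Both strings parse back to the same object; only the second one depends on the encoding used to write it.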

Regards,

-- 
Inada Naoki  
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/7V3UAEBUKYEQDYYJIGJUVW3DYCDDQN46/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: A shortcut to load a JSON file into a dict : json.loadf

2020-09-16 Thread Wes Turner
Something like this in the docstring?: "In order to support the historical
JSON specification and closed ecosystem JSON, it is possible to specify an
encoding other than UTF-8."

8.1.  Character Encoding
>JSON text exchanged between systems that are not part of a closed
>ecosystem MUST be encoded using UTF-8 [RFC3629].
>Previous specifications of JSON have not required the use of UTF-8
>when transmitting JSON text.  However, the vast majority of JSON-
>based software implementations have chosen to use the UTF-8 encoding,
>to the extent that it is the only encoding that achieves
>interoperability.
>Implementations MUST NOT add a byte order mark (U+FEFF) to the
>beginning of a networked-transmitted JSON text.  In the interests of
>interoperability, implementations that parse JSON texts MAY ignore
>the presence of a byte order mark rather than treating it as an
>error.
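
As a rough illustration of the byte order mark paragraph, the sketch below shows how CPython's `json.loads` behaves as far as I know: a UTF-8 BOM on bytes input is detected and skipped, while a leading U+FEFF in an already decoded `str` is rejected. The payload is made up for the example.

```python
import codecs
import json

payload = b'{"a": 1}'

# bytes input: the UTF-8 BOM is detected and skipped during decoding.
from_bytes = json.loads(codecs.BOM_UTF8 + payload)

# str input: a leading U+FEFF is treated as an error.
try:
    json.loads("\ufeff" + payload.decode("utf-8"))
except json.JSONDecodeError as exc:
    bom_error = exc
else:
    bom_error = None

print(from_bytes, bom_error)
```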




```python
import json
import os


def dumpf(obj, path, *, encoding="UTF-8", **kwargs):
    with open(os.fspath(path), "w", encoding=encoding) as f:
        return json.dump(obj, f, **kwargs)


def loadf(path, *, encoding="UTF-8", **kwargs):
    with open(os.fspath(path), "r", encoding=encoding) as f:
        return json.load(f, **kwargs)


import pathlib
import unittest


class TestJsonLoadfAndDumpf(unittest.TestCase):
    def setUp(self):
        self.encodings = [None, "UTF-8", "UTF-16", "UTF-32"]

        data = dict(
            obj=dict(a=dict(b=[1, 2, 3])),
            path=pathlib.Path(".") / "test_loadf_and_dumpf.json",
        )
        if os.path.isfile(data["path"]):
            os.unlink(data["path"])
        self.data = data

    def test_dumpf_and_loadf(self):
        data = self.data
        for encoding in self.encodings:
            path = f'{data["path"]}.{encoding}.json'
            dumpf_output = dumpf(data["obj"], path, encoding=encoding)
            loadf_output = loadf(path, encoding=encoding)
            assert loadf_output == data["obj"]


# $ pip install pytest-cov
# $ pytest -v example.py
# https://docs.pytest.org/en/stable/parametrize.html
# https://docs.pytest.org/en/stable/tmpdir.html

import pytest


@pytest.mark.parametrize("encoding", [None, "UTF-8", "UTF-16", "UTF-32"])
@pytest.mark.parametrize("obj", [dict(a=dict(b=[1, 2, 3]))])
def test_dumpf_and_loadf(obj, encoding, tmpdir):
    pth = pathlib.Path(tmpdir) / f"test_loadf_and_dumpf.{encoding}.json"
    dumpf_output = dumpf(obj, pth, encoding=encoding)
    loadf_output = loadf(pth, encoding=encoding)
    assert loadf_output == obj
```

For whoever creates a PR for this:

- [ ] add parameter and return type annotations
- [ ] copy docstrings from json.load/json.dump and open#encoding
- [ ] correctly support the c module implementation (this just does `import
json`)?
- [ ] keep or drop the encoding tests?

On Thu, Sep 17, 2020 at 1:25 AM Christopher Barker 
wrote:

> Is that suggested code? I don't follow.
>
> But if it is, no. personally, I think ANY use of system settings is a bad
> idea [*]. But certainly no need to even think about it for JSON.
>
> -CHB
>
> * have we not learned that in the age of the internet the machine the code
> happens to be running on has nothing to do with the user of the
> applications' needs? Timezones, encodings, number formats, NOTHING.
>
>

[Python-ideas] Re: A shortcut to load a JSON file into a dict : json.loadf

2020-09-16 Thread Christopher Barker
Is that suggested code? I don't follow.

But if it is, no. personally, I think ANY use of system settings is a bad
idea [*]. But certainly no need to even think about it for JSON.

-CHB

* have we not learned that in the age of the internet the machine the code
happens to be running on has nothing to do with the user of the
applications' needs? Timezones, encodings, number formats, NOTHING.
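
A small sketch of why relying on system settings is fragile for file encodings: when no `encoding=` is passed, `open()` falls back to a locale-dependent default, so the same code can write different bytes on different machines. The file name and contents here are made up for illustration.

```python
import codecs
import locale
import os
import tempfile

# What open() falls back to when no encoding is given; varies by machine.
system_default = locale.getpreferredencoding(False)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.json")  # hypothetical file name
    # Passing the encoding explicitly makes the bytes on disk predictable.
    with open(path, "w", encoding="utf-8") as f:
        f.write('{"city": "Zürich"}')
        explicit = codecs.lookup(f.encoding).name
    with open(path, "rb") as f:
        raw = f.read()

print(system_default, explicit)
```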


On Wed, Sep 16, 2020 at 8:45 PM Wes Turner  wrote:

> Is all of this locale/encoding testing necessary (or even sufficient)?
>
>
> ```python
> import json
> import locale
> import os
>
>
> def get_default_encoding():
>     """
>     TODO XXX: ???
>     """
>     default_encoding = locale.getdefaultlocale()[1]
>     if default_encoding.startswith("UTF-"):
>         return default_encoding
>     else:
>         return "UTF-8"
>
>
> def dumpf(obj, path, *args, **kwargs):
>     with open(
>         os.fspath(path),
>         "w",
>         encoding=kwargs.pop("encoding", get_default_encoding()),
>     ) as file_:
>         return json.dump(obj, file_, *args, **kwargs)
>
>
> def loadf(path, *args, **kwargs):
>     with open(
>         os.fspath(path),
>         "r",
>         encoding=kwargs.pop("encoding", get_default_encoding()),
>     ) as file_:
>         return json.load(file_, *args, **kwargs)
>
>
> import pathlib
> import unittest
>
>
> class TestJsonLoadfAndDumpf(unittest.TestCase):
>     def setUp(self):
>         self.locales = ["", "C", "en_US.UTF-8", "japanese"]
>         self.encodings = [None, "UTF-8", "UTF-16", "UTF-32"]
>
>         data = dict(
>             obj=dict(a=dict(b=[1, 2, 3])),
>             encoding=None,
>             path=pathlib.Path(".") / "test_loadf_and_dumpf.json",
>         )
>         if os.path.isfile(data["path"]):
>             os.unlink(data["path"])
>         self.data = data
>
>         self.previous_locale = locale.getlocale()
>
>     def tearDown(self):
>         locale.setlocale(locale.LC_ALL, self.previous_locale)
>
>     def test_get_default_encoding(self):
>         for localestr in self.locales:
>             locale.setlocale(locale.LC_ALL, localestr)
>             output = get_default_encoding()
>             assert output.startswith("UTF-")
>
>     def test_dumpf_and_loadf(self):
>         data = self.data
>         for localestr in self.locales:
>             locale.setlocale(locale.LC_ALL, localestr)
>             for encoding in self.encodings:
>                 dumpf_output = dumpf(
>                     data["obj"], data["path"], encoding=encoding
>                 )
>                 loadf_output = loadf(data["path"], encoding=encoding)
>                 assert loadf_output == data["obj"]
> ```
>
> On Wed, Sep 16, 2020 at 8:30 PM Christopher Barker 
> wrote:
>
>> On Wed, Sep 16, 2020 at 2:53 PM Wes Turner  wrote:
>>
>>> So I was not correct: dump does not default to UTF-8 (and does not
>>> accept an encoding= parameter)
>>>
>>>
>>>> I think dumpf() should use UTF-8, and that's it. If anyone really wants
>>>> something else, they can get it by providing an open text file object.
>>>>
>>>
>>> Why would we impose UTF-8 when the spec says UTF-8, UTF-16, or UTF-32?
>>>
>>
>> The idea was that the encoding was one of the motivators to doing this in
>> the first place. But I suppose as long as utf-8 is the default, and only
>> the three "official" ones are allowed, then yeah, we could add an encoding
>> keyword argument.
>>
>> -CHB
>>
>>
>> --
>> Christopher Barker, PhD
>>
>> Python Language Consulting
>>   - Teaching
>>   - Scientific Software Development
>>   - Desktop GUI and Web Development
>>   - wxPython, numpy, scipy, Cython
>>
>

-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/PKD2CKYJIWXNDMI6GFDFOUPNHDVMCDJP/


[Python-ideas] Re: A shortcut to load a JSON file into a dict : json.loadf

2020-09-16 Thread Inada Naoki
On Thu, Sep 17, 2020 at 6:54 AM Wes Turner  wrote:
>
>
> Why would we impose UTF-8 when the spec says UTF-8, UTF-16, or UTF-32?

The obsolete JSON spec allowed UTF-8, UTF-16, and UTF-32. The current spec requires UTF-8.
See https://tools.ietf.org/html/rfc8259#section-8.1

So `dumpf` must use UTF-8, although `loadf` can support UTF-16 and
UTF-32 like `loads`.
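
For reference, a quick sketch of what "like `loads`" means here: given bytes, `json.loads` works out whether the input is UTF-8, UTF-16, or UTF-32 (with or without a BOM) before decoding. The sample document is made up.

```python
import json

text = '{"a": {"b": [1, 2, 3]}}'
encodings = [
    "utf-8",
    "utf-16", "utf-16-le", "utf-16-be",   # "utf-16" writes a BOM
    "utf-32", "utf-32-le", "utf-32-be",   # "utf-32" writes a BOM
]

# Each entry decodes the same document from a different byte encoding.
decoded = {enc: json.loads(text.encode(enc)) for enc in encodings}

for enc, value in decoded.items():
    print(enc, value)
```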

>
> How could this be improved? (I'm on my phone, so)
>
> def dumpf(obj, path, *args, **kwargs):
> with open(getattr(path, '__path__', path), 'w', 
> encoding=kwargs.get('encoding', 'utf8')) as _file:
> return dump(_file, *args, **kwargs)
>
> def loadf(obj, path, *args, **kwargs):
> with open(getattr(path, '__path__', path), 
> encoding=kwargs.get('encoding', 'utf8')) as _file:
> return load(_file, *args, **kwargs)
>

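from json import dump, load

def dumpf(obj, path, **kwargs):
    # Always write UTF-8, as the current spec (RFC 8259) requires.
    with open(path, "w", encoding="utf-8") as f:
        return dump(obj, f, **kwargs)

def loadf(path, **kwargs):
    # Binary mode lets load() detect UTF-8, UTF-16, or UTF-32.
    with open(path, "rb") as f:
        return load(f, **kwargs)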
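
A self-contained sketch of how this pair would behave end to end. `dumpf` and `loadf` are only the names proposed in this thread, not existing `json` functions, and the file names are made up:

```python
import json
import os
import tempfile


def dumpf(obj, path, **kwargs):
    # Proposed helper (hypothetical): always writes UTF-8.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(obj, f, **kwargs)


def loadf(path, **kwargs):
    # Proposed helper (hypothetical): binary mode, so json detects the encoding.
    with open(path, "rb") as f:
        return json.load(f, **kwargs)


obj = {"a": {"b": [1, 2, 3]}}
with tempfile.TemporaryDirectory() as tmp:
    utf8_path = os.path.join(tmp, "utf8.json")
    dumpf(obj, utf8_path)
    from_utf8 = loadf(utf8_path)

    # A file written elsewhere as UTF-16 is still readable by loadf.
    utf16_path = os.path.join(tmp, "utf16.json")
    with open(utf16_path, "w", encoding="utf-16") as f:
        json.dump(obj, f)
    from_utf16 = loadf(utf16_path)

print(from_utf8 == from_utf16 == obj)
```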

Regards,
-- 
Inada Naoki  
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/5RPHOVBMAC3USBKY7S2G4WVEC4JR4IV6/


[Python-ideas] Re: A shortcut to load a JSON file into a dict : json.loadf

2020-09-16 Thread Wes Turner
Is all of this locale/encoding testing necessary (or even sufficient)?


```python
import json
import locale
import os


def get_default_encoding():
    """
    TODO XXX: ???
    """
    default_encoding = locale.getdefaultlocale()[1]
    if default_encoding.startswith("UTF-"):
        return default_encoding
    else:
        return "UTF-8"


def dumpf(obj, path, *args, **kwargs):
    with open(
        os.fspath(path),
        "w",
        encoding=kwargs.pop("encoding", get_default_encoding()),
    ) as file_:
        return json.dump(obj, file_, *args, **kwargs)


def loadf(path, *args, **kwargs):
    with open(
        os.fspath(path),
        "r",
        encoding=kwargs.pop("encoding", get_default_encoding()),
    ) as file_:
        return json.load(file_, *args, **kwargs)


import pathlib
import unittest


class TestJsonLoadfAndDumpf(unittest.TestCase):
    def setUp(self):
        self.locales = ["", "C", "en_US.UTF-8", "japanese"]
        self.encodings = [None, "UTF-8", "UTF-16", "UTF-32"]

        data = dict(
            obj=dict(a=dict(b=[1, 2, 3])),
            encoding=None,
            path=pathlib.Path(".") / "test_loadf_and_dumpf.json",
        )
        if os.path.isfile(data["path"]):
            os.unlink(data["path"])
        self.data = data

        self.previous_locale = locale.getlocale()

    def tearDown(self):
        locale.setlocale(locale.LC_ALL, self.previous_locale)

    def test_get_default_encoding(self):
        for localestr in self.locales:
            locale.setlocale(locale.LC_ALL, localestr)
            output = get_default_encoding()
            assert output.startswith("UTF-")

    def test_dumpf_and_loadf(self):
        data = self.data
        for localestr in self.locales:
            locale.setlocale(locale.LC_ALL, localestr)
            for encoding in self.encodings:
                dumpf_output = dumpf(
                    data["obj"], data["path"], encoding=encoding
                )
                loadf_output = loadf(data["path"], encoding=encoding)
                assert loadf_output == data["obj"]
```

On Wed, Sep 16, 2020 at 8:30 PM Christopher Barker 
wrote:

> On Wed, Sep 16, 2020 at 2:53 PM Wes Turner  wrote:
>
>> So I was not correct: dump does not default to UTF-8 (and does not accept
>> an encoding= parameter)
>>
>>
>>> I think dumpf() should use UTF-8, and that's it. If anyone really wants
>>> something else, they can get it by providing an open text file object.
>>>
>>
>> Why would we impose UTF-8 when the spec says UTF-8, UTF-16, or UTF-32?
>>
>
> The idea was that the encoding was one of the motivators to doing this in
> the first place. But I suppose as long as utf-8 is the default, and only
> the three "official" ones are allowed, then yeah, we could add an encoding
> keyword argument.
>
> -CHB
>
>
> --
> Christopher Barker, PhD
>
> Python Language Consulting
>   - Teaching
>   - Scientific Software Development
>   - Desktop GUI and Web Development
>   - wxPython, numpy, scipy, Cython
>
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/G72K6AGXLVMBQAZYECK6N5VGBYDD3RYL/


[Python-ideas] Re: Fwd: Re: Experimenting with dict performance, and an immutable dict

2020-09-16 Thread Inada Naoki
On Thu, Sep 17, 2020 at 8:03 AM Marco Sulla
 wrote:
>
> Well, it seems ok now:
> https://github.com/python/cpython/compare/master...Marco-Sulla:master
>
> I've done a quick speed test and speedup is quite high for a creation
> using keywords or a dict with "holes": about 30%:

A 30% speedup on a microbenchmark is not especially high.

For example, I optimized "copy dict with holes" but rejected my own
PR because I am not sure the performance / maintenance cost ratio is good
enough.

https://bugs.python.org/issue41431#msg374556
https://github.com/python/cpython/pull/21669

>
> python -m timeit -n 2000  --setup "from uuid import uuid4 ; o =
> {str(uuid4()).replace('-', '') : str(uuid4()).replace('-', '') for i
> in range(1)}" "dict(**o)"
>

I don't think this use case is worth optimizing, because `dict(o)` or
`o.copy()` is the Pythonic spelling.
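
A tiny sketch of the difference, with made-up dictionaries: all three spellings produce an equal shallow copy for string keys, but `dict(**o)` goes through keyword-argument unpacking and so fails for non-string keys, which is one more reason to prefer the other two.

```python
o = {"a": 1, "b": 2}
copies = [dict(**o), dict(o), o.copy()]

mixed = {1: "one", "two": 2}
try:
    dict(**mixed)  # keyword unpacking requires string keys
except TypeError as exc:
    unpack_error = exc
else:
    unpack_error = None

mixed_copy = dict(mixed)  # works for any hashable keys
print(copies, unpack_error, mixed_copy)
```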


> python -m timeit -n 1  --setup "from uuid import uuid4 ; o =
> {str(uuid4()).replace('-', '') : str(uuid4()).replace('-', '') for i
> in range(1)} ; it = iter(o) ; key0 = next(it) ; o.pop(key0)"
> "dict(o)"
>

This one is debatable. If the optimization is very simple, it might be
worthwhile.

Regards,

-- 
Inada Naoki  
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/LE6RLLKF4QRRA4P2EXUK5MXVH6X4CSUZ/


[Python-ideas] Re: Fwd: Re: Experimenting with dict performance, and an immutable dict

2020-09-16 Thread Wes Turner
Would something like a bm_dict.py in [1] be a good place for a few dict
benchmarks, or is there a more appropriate project for authoritatively measuring
performance regressions and optimizations of core {cpython,} data
structures?

[1]
https://github.com/python/pyperformance/tree/master/pyperformance/benchmarks

(pytest-benchmark looks neat as well. An example of how to use
pytest.mark.parametrize to capture multiple metrics might be helpful:
https://github.com/ionelmc/pytest-benchmark )

It's easy to imagine a bot that runs some or all performance benchmarks on a
PR when requested in a PR comment; there's probably already a good way to
do this?

On Wed, Sep 16, 2020, 10:44 PM Wes Turner  wrote:

> That sounds like a worthwhile optimization. FWIW, is this a bit simpler
> but sufficient?:
>
> python -m timeit -n 2000  --setup "from uuid import uuid4; \
> o = {uuid4().hex: i for i in range(1)}" \
> "dict(**o)"
>
> Is there a preferred tool to comprehensively measure the performance
> impact of a PR (with e.g. multiple contrived and average-case key/value
> sets)?
>
>
> On Wed, Sep 16, 2020, 7:07 PM Marco Sulla 
> wrote:
>
>> Well, it seems ok now:
>> https://github.com/python/cpython/compare/master...Marco-Sulla:master
>>
>> I've done a quick speed test and speedup is quite high for a creation
>> using keywords or a dict with "holes": about 30%:
>>
>> python -m timeit -n 2000  --setup "from uuid import uuid4 ; o =
>> {str(uuid4()).replace('-', '') : str(uuid4()).replace('-', '') for i
>> in range(1)}" "dict(**o)"
>>
>> python -m timeit -n 1  --setup "from uuid import uuid4 ; o =
>> {str(uuid4()).replace('-', '') : str(uuid4()).replace('-', '') for i
>> in range(1)} ; it = iter(o) ; key0 = next(it) ; o.pop(key0)"
>> "dict(o)"
>>
>> Can I do a PR?
>>
>
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/GNVALID7YU3IP6HUH7K7BF56CDMJJACK/


[Python-ideas] Re: Fwd: Re: Experimenting with dict performance, and an immutable dict

2020-09-16 Thread Wes Turner
That sounds like a worthwhile optimization. FWIW, is this a bit simpler but
sufficient?:

python -m timeit -n 2000  --setup "from uuid import uuid4; \
o = {uuid4().hex: i for i in range(1)}" \
"dict(**o)"

Is there a preferred tool to comprehensively measure the performance impact
of a PR (with e.g. multiple contrived and average-case key/value sets)?
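
For a quick local check without the shell quoting, the same comparison can be sketched with the `timeit` module directly. The dictionary size and the repeat/number values below are arbitrary assumptions, not tuned settings, and the first lines just confirm that `uuid4().hex` gives the same key as the longer `str(...).replace(...)` form.

```python
import timeit
from uuid import uuid4

# uuid4().hex is the same 32-character string as the replace('-', '') form.
u = uuid4()
same_key = u.hex == str(u).replace("-", "")

# Dictionary size chosen arbitrarily for this sketch.
o = {uuid4().hex: i for i in range(1000)}

timings = {
    stmt: min(timeit.repeat(stmt, globals={"o": o}, number=200, repeat=3))
    for stmt in ("dict(**o)", "dict(o)", "o.copy()")
}
for stmt, seconds in timings.items():
    print(f"{stmt:10} {seconds:.6f}s")
```

For anything beyond a rough comparison, a dedicated tool such as pyperf (which pyperformance builds on) takes care of warmups and statistics.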


On Wed, Sep 16, 2020, 7:07 PM Marco Sulla 
wrote:

> Well, it seems ok now:
> https://github.com/python/cpython/compare/master...Marco-Sulla:master
>
> I've done a quick speed test and speedup is quite high for a creation
> using keywords or a dict with "holes": about 30%:
>
> python -m timeit -n 2000  --setup "from uuid import uuid4 ; o =
> {str(uuid4()).replace('-', '') : str(uuid4()).replace('-', '') for i
> in range(1)}" "dict(**o)"
>
> python -m timeit -n 1  --setup "from uuid import uuid4 ; o =
> {str(uuid4()).replace('-', '') : str(uuid4()).replace('-', '') for i
> in range(1)} ; it = iter(o) ; key0 = next(it) ; o.pop(key0)"
> "dict(o)"
>
> Can I do a PR?
>
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/6RALUK5FUS25W4G5DM7ILZHFJOJTSPIM/


[Python-ideas] Re: A shortcut to load a JSON file into a dict : json.loadf

2020-09-16 Thread Christopher Barker
On Wed, Sep 16, 2020 at 2:53 PM Wes Turner  wrote:

> So I was not correct: dump does not default to UTF-8 (and does not accept
> an encoding= parameter)
>
>
>> I think dumpf() should use UTF-8, and that's it. If anyone really wants
>> something else, they can get it by providing an open text file object.
>>
>
> Why would we impose UTF-8 when the spec says UTF-8, UTF-16, or UTF-32?
>

The idea was that the encoding was one of the motivators to doing this in
the first place. But I suppose as long as utf-8 is the default, and only
the three "official" ones are allowed, then yeah, we could add an encoding
keyword argument.

-CHB


-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/O73WZF6JKME2VPVWOWYRVQ3APVEA2J5V/


[Python-ideas] Re: A shortcut to load a JSON file into a dict : json.loadf

2020-09-16 Thread Christopher Barker
I believe Sergie already suggested pickle and marshal, and I guess we can
add plistlib to those.

Personally, I'm not so sure it should be added to all these. I see why the
same API was used for all of them, but they really are fairly different
beasts. So if they have a function with the same purpose, it should have
the same name, but that doesn't mean that all these modules need to have
all the functions.

On the other hand, the fact that we might be adding two new functions to
four different modules is, in my mind, an argument for overloading the
existing dump() / load() instead. That would mean a lot less API churn.
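
To make the overloading option concrete, here is a rough sketch of what a path-or-file `load` might look like. `load_any` is a hypothetical name used only for this illustration; nothing like it exists in the `json` module, and the real design would need to be settled in the discussion.

```python
import io
import json
import os
import tempfile


def load_any(source, **kwargs):
    """Hypothetical overloaded load: accepts a path or an open file object."""
    if isinstance(source, (str, bytes, os.PathLike)):
        # Path given: open in binary mode and let json detect the encoding.
        with open(source, "rb") as f:
            return json.load(f, **kwargs)
    # Otherwise assume a file-like object, as json.load does today.
    return json.load(source, **kwargs)


from_file_object = load_any(io.StringIO('{"a": 1}'))

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.json")  # made-up file name
    with open(path, "w", encoding="utf-8") as f:
        f.write('{"a": 1}')
    from_path = load_any(path)

print(from_file_object, from_path)
```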

-CHB




On Wed, Sep 16, 2020 at 5:10 PM Chris Angelico  wrote:

> On Thu, Sep 17, 2020 at 9:53 AM  wrote:
> >
> > Maybe unrelated, but the same goes for `pickle.load` and `pickle.dump`.
> > For consistency, any changes made to `json.load` and `json.dump` (e.g.
> > adding `json.loadf` and `json.dumpf` or accepting a path-like object as an
> > argument) should also be applied equivalently to `pickle.load` and
> > `pickle.dump`.
> >
> > Off the top of my head, I can't think of any more places in the standard
> > library with the same parallel structure.
> >
>
> marshal is the other one in that set, and a quick 'git grep' shows
> that plistlib also has that API. The xmlrpc.client module also has
> dumps/loads, but not dump/load.
>
> ChrisA
>


-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/KMZZZKZGVEJFQFYTNO5IEFWR2N6FJ2SH/


[Python-ideas] Re: A shortcut to load a JSON file into a dict : json.loadf

2020-09-16 Thread Chris Angelico
On Thu, Sep 17, 2020 at 9:53 AM  wrote:
>
> Maybe unrelated, but the same goes for `pickle.load` and `pickle.dump`. For
> consistency, any changes made to `json.load` and `json.dump` (e.g. adding
> `json.loadf` and `json.dumpf` or accepting a path-like object as an argument)
> should also be applied equivalently to `pickle.load` and `pickle.dump`.
>
> Off the top of my head, I can't think of any more places in the standard 
> library with the same parallel structure.
>

marshal is the other one in that set, and a quick 'git grep' shows
that plistlib also has that API. The xmlrpc.client module also has
dumps/loads, but not dump/load.
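
The same survey can be sketched from Python instead of `git grep`. This only checks for module-level names, so it says nothing about whether the signatures match:

```python
import json
import marshal
import pickle
import plistlib
import xmlrpc.client

names = ("dump", "load", "dumps", "loads")
modules = (json, pickle, marshal, plistlib, xmlrpc.client)

# Map each module name to the subset of the four functions it exposes.
api = {m.__name__: [n for n in names if hasattr(m, n)] for m in modules}

for module_name, present in api.items():
    print(f"{module_name:15} {present}")
```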

ChrisA
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/AWJNAL5ZHJ25KQFEV4UNAWA6O3KXW6RT/


[Python-ideas] Re: A shortcut to load a JSON file into a dict : json.loadf

2020-09-16 Thread lammenspaolo
Maybe unrelated, but the same goes for `pickle.load` and `pickle.dump`. For
consistency, any changes made to `json.load` and `json.dump` (e.g. adding
`json.loadf` and `json.dumpf` or accepting a path-like object as an argument)
should also be applied equivalently to `pickle.load` and `pickle.dump`.

Off the top of my head, I can't think of any more places in the standard 
library with the same parallel structure.
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/RUZLS2JIFURTBW447TQ3P6HAEDQDEYVZ/


[Python-ideas] Re: Fwd: Re: Experimenting with dict performance, and an immutable dict

2020-09-16 Thread Marco Sulla
Well, it seems ok now:
https://github.com/python/cpython/compare/master...Marco-Sulla:master

I've done a quick speed test and speedup is quite high for a creation
using keywords or a dict with "holes": about 30%:

python -m timeit -n 2000  --setup "from uuid import uuid4 ; o =
{str(uuid4()).replace('-', '') : str(uuid4()).replace('-', '') for i
in range(1)}" "dict(**o)"

python -m timeit -n 1  --setup "from uuid import uuid4 ; o =
{str(uuid4()).replace('-', '') : str(uuid4()).replace('-', '') for i
in range(1)} ; it = iter(o) ; key0 = next(it) ; o.pop(key0)"
"dict(o)"

Can I do a PR?
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/QWXD2D4SC6XHZLV3QA4TMGMI7Z7SAJ2R/


[Python-ideas] Re: A shortcut to load a JSON file into a dict : json.loadf

2020-09-16 Thread Wes Turner
https://docs.python.org/3/library/os.html#os.fspath

*__fspath__
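
A short sketch of what `os.fspath` does with the `__fspath__` protocol. `ConfigLocation` is a made-up class for illustration only:

```python
import os
import pathlib


class ConfigLocation:
    """Made-up path-like class implementing the __fspath__ protocol."""

    def __init__(self, name):
        self.name = name

    def __fspath__(self):
        return os.path.join("configs", self.name)


results = [
    os.fspath("a.json"),                                  # str passes through
    os.fspath(pathlib.PurePosixPath("dir") / "a.json"),   # pathlib objects
    os.fspath(ConfigLocation("a.json")),                  # any __fspath__ object
]

try:
    os.fspath(42)  # not str, bytes, or path-like
except TypeError as exc:
    fspath_error = exc
else:
    fspath_error = None

print(results, fspath_error)
```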

On Wed, Sep 16, 2020, 5:53 PM Wes Turner  wrote:

>
>
> On Wed, Sep 16, 2020, 5:18 PM Christopher Barker 
> wrote:
>
>> On Tue, Sep 15, 2020 at 5:26 PM Wes Turner  wrote:
>>
>>> On Tue, Sep 15, 2020 at 9:09 AM Wes Turner  wrote:
>>>
>>>> json.load and json.dump already default to UTF8 and already have
>>>> parameters for json loading and dumping.
>>>>
>> so it turns out that loads(), which optionally takes a bytes or
>> bytearray object, tries to determine whether the encoding is UTF-8, UTF-16
>> or UTF-32 (the ones allowed by the standard) (thanks Guido for the
>> pointer). And load() calls loads(), so it should work with binary mode
>> files as well.
>>
>> Currently, dump() simply uses the fp passed in, and it doesn't support
>> binary files, so it'll use the encoding the user set (or the default, if
>> not set, which is an issue here) dumps() returns a string, so no encoding
>> there.
>>
>
> So I was not correct: dump does not default to UTF-8 (and does not accept
> an encoding= parameter)
>
>
>> I think dumpf() should use UTF-8, and that's it. If anyone really wants
>> something else, they can get it by providing an open text file object.
>>
>
>
> Why would we impose UTF-8 when the spec says UTF-8, UTF-16, or UTF-32?
>
> How could this be improved? (I'm on my phone, so)
>
> def dumpf(obj, path, *args, **kwargs):
> with open(getattr(path, '__path__', path), 'w',
> encoding=kwargs.get('encoding', 'utf8')) as _file:
> return dump(_file, *args, **kwargs)
>
> def loadf(obj, path, *args, **kwargs):
> with open(getattr(path, '__path__', path),
> encoding=kwargs.get('encoding', 'utf8')) as _file:
> return load(_file, *args, **kwargs)
>
>
>
>> loadf(), on the other hand, is a bit tricky -- it could allow only UTF-8,
>> but it seems it would be more consistent (and easy to do) to open the file
>> in binary mode and use the existing code to determine the encoding.
>>
>> -CHB
>>
>> >> The Python JSON implementation should support the full JSON spec
>> (including UTF-8, UTF-16, and UTF-32) and should default to UTF-8.
>>
>> 'turns out it does already, and no one is suggesting changing that.
>>
>>>> Anyway -- if anyone wants to push for overloading .load()/dump(), rather
>>>> than making two new loadf() and dumpf() functions, then speak now -- that
>>>> will take more discussion, and maybe a PEP.
>>>>
>>>
>>> I don't see why one or the other would need a PEP so long as the new
>>> functionality is backward-compatible?
>>>
>>
>> I'm just putting my finger in the wind. No need for a PEP if it's simple
>> and non-controversial, but if even the few folks on this thread don't agree
>> on the API we want, then it's maybe too controversial -- so either more
>> discussion, to come to consensus, or a PEP.
>>
>> Or not -- we can see what the core devs say if/when someone does a bpo /
>> PR.
>>
>> -CHB
>>
>>
>>
>>
>>
>>>
>>>
>>>> -CHB
>>>>
>>>> --
>>>> Christopher Barker, PhD
>>>>
>>>> Python Language Consulting
>>>>   - Teaching
>>>>   - Scientific Software Development
>>>>   - Desktop GUI and Web Development
>>>>   - wxPython, numpy, scipy, Cython
>>>>
>>>
>>
>> --
>> Christopher Barker, PhD
>>
>> Python Language Consulting
>>   - Teaching
>>   - Scientific Software Development
>>   - Desktop GUI and Web Development
>>   - wxPython, numpy, scipy, Cython
>>
>
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/AAF7PYP2ABCT26CXQCNGNX5FVTZA7FPO/


[Python-ideas] Re: A shortcut to load a JSON file into a dict : json.loadf

2020-09-16 Thread William Pickard
os.fspath exists for a reason.
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/RPF2GMTAQCMMHSKIRO7HDD73ZPPHRUVL/


[Python-ideas] Re: A shortcut to load a JSON file into a dict : json.loadf

2020-09-16 Thread Wes Turner
On Wed, Sep 16, 2020, 5:18 PM Christopher Barker 
wrote:

> On Tue, Sep 15, 2020 at 5:26 PM Wes Turner  wrote:
>
>> On Tue, Sep 15, 2020 at 9:09 AM Wes Turner  wrote:
>>
>>> json.load and json.dump already default to UTF8 and already have
>>> parameters for json loading and dumping.
>>>
> so it turns out that loads(), which optionally takes a bytes or bytearray
> object, tries to determine whether the encoding is UTF-8, UTF-16 or UTF-32
> (the ones allowed by the standard) (thanks Guido for the pointer). And
> load() calls loads(), so it should work with binary mode files as well.
>
> Currently, dump() simply uses the fp passed in, and it doesn't support
> binary files, so it'll use the encoding the user set (or the default, if
> not set, which is an issue here). dumps() returns a string, so no encoding
> there.
>

So I was not correct: dump does not default to UTF-8 (and does not accept
an encoding= parameter)


> I think dumpf() should use UTF-8, and that's it. If anyone really wants
> something else, they can get it by providing an open text file object.
>


Why would we impose UTF-8 when the spec says UTF-8, UTF-16, or UTF-32?

How could this be improved? (I'm on my phone, so)

import os
from json import dump, load

def dumpf(obj, path, *args, encoding='utf-8', **kwargs):
    # os.fspath() accepts str, bytes and os.PathLike objects;
    # encoding is consumed here because dump() does not accept it.
    with open(os.fspath(path), 'w', encoding=encoding) as _file:
        return dump(obj, _file, *args, **kwargs)

def loadf(path, *args, encoding='utf-8', **kwargs):
    # Same path and encoding handling as dumpf().
    with open(os.fspath(path), encoding=encoding) as _file:
        return load(_file, *args, **kwargs)



> loadf(), on the other hand, is a bit tricky -- it could allow only UTF-8,
> but it seems it would be more consistent (and easy to do) to open the file
> in binary mode and use the existing code to determine the encoding.
>
> -CHB
>
>> The Python JSON implementation should support the full JSON spec
>> (including UTF-8, UTF-16, and UTF-32) and should default to UTF-8.
>
> 'turns out it does already, and no one is suggesting changing that.
>
>>> Anyway -- if anyone wants to push for overloading .load()/dump(), rather
>>> than making two new loadf() and dumpf() functions, then speak now -- that
>>> will take more discussion, and maybe a PEP.
>>>
>>
>> I don't see why one or the other would need a PEP so long as the new
>> functionality is backward-compatible?
>>
>
> I'm just putting my finger in the wind. No need for a PEP if it's simple
> and non-controversial, but if even the few folks on this thread don't agree
> on the API we want, then it's maybe too controversial -- so either more
> discussion, to come to consensus, or a PEP.
>
> Or not -- we can see what the core devs say if/when someone does a bpo /
> PR.
>
> -CHB
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/JAIJVMUQW37S63UFZJSWH5S6BSRBWK6F/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: A shortcut to load a JSON file into a dict : json.loadf

2020-09-16 Thread Christopher Barker
On Tue, Sep 15, 2020 at 5:26 PM Wes Turner  wrote:

> On Tue, Sep 15, 2020 at 9:09 AM Wes Turner  wrote:
>
>> json.load and json.dump already default to UTF8 and already have
>> parameters for json loading and dumping.
>>

so it turns out that loads(), which optionally takes a bytes or bytearray
object, tries to determine whether the encoding is UTF-8, UTF-16 or UTF-32
(the ones allowed by the standard) (thanks Guido for the pointer). And
load() calls loads(), so it should work with binary mode files as well.
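
A small sketch of that behaviour, using only the standard json and io modules. The sample document is made up for the illustration.

```python
import io
import json

text = '{"this": 5}'

# loads() accepts bytes in any of the three encodings and decodes them itself.
decoded = {
    enc: json.loads(text.encode(enc))
    for enc in ("utf-8", "utf-16", "utf-32")
}

# load() reads the whole stream and hands the result to loads(), so a
# binary-mode file object (simulated here with BytesIO) works as well.
binary_file = io.BytesIO(text.encode("utf-16"))
from_binary = json.load(binary_file)
```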

Currently, dump() simply uses the fp passed in, and it doesn't support
binary files, so it'll use the encoding the user set (or the default, if
not set, which is an issue here). dumps() returns a string, so no encoding
there.
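
To make the text-versus-binary point concrete, here is a short sketch with in-memory streams standing in for files. The sample data is an assumption for the illustration.

```python
import io
import json

data = {"name": "café"}

# dumps() returns a str; no byte encoding is involved at this stage.
as_text = json.dumps(data)

# dump() writes str chunks to whatever text stream it is given.
text_buffer = io.StringIO()
json.dump(data, text_buffer)

# A binary stream cannot accept str chunks, so dump() fails with TypeError.
try:
    json.dump(data, io.BytesIO())
    binary_rejected = False
except TypeError:
    binary_rejected = True
```

With a real file, the bytes that end up on disk therefore depend entirely on the encoding= passed to open(), or on the platform default when it is omitted.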

I think dumpf() should use UTF-8, and that's it. If anyone really wants
something else, they can get it by providing an open text file object.
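
A rough sketch of what a UTF-8-only dumpf() could look like. The name and signature are assumptions taken from this discussion; no such function exists in the json module.

```python
import json
import os
import tempfile


def dumpf(obj, path, **kwargs):
    """Hypothetical helper: write obj as JSON to path, always as UTF-8."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(obj, f, **kwargs)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "example.json")
    # ensure_ascii=False makes the non-ASCII character reach the file as-is.
    dumpf({"name": "café"}, target, ensure_ascii=False)
    with open(target, "rb") as f:
        raw = f.read()
```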

loadf(), on the other hand, is a bit tricky -- it could allow only UTF-8,
but it seems it would be more consistent (and easy to do) to open the file
in binary mode and use the existing code to determine the encoding.
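
One reading of the binary-mode idea, as a sketch only. The helper name loadf() and its signature are assumptions from this thread, not an existing API.

```python
import json
import os
import tempfile


def loadf(path, **kwargs):
    """Hypothetical helper: read JSON from path, letting json detect
    whether the bytes are UTF-8, UTF-16 or UTF-32."""
    with open(path, "rb") as f:
        return json.load(f, **kwargs)


results = {}
with tempfile.TemporaryDirectory() as tmp:
    for enc in ("utf-8", "utf-16", "utf-32"):
        target = os.path.join(tmp, f"example-{enc}.json")
        with open(target, "w", encoding=enc) as f:
            f.write('{"this": 5}')
        results[enc] = loadf(target)
```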

-CHB

> The Python JSON implementation should support the full JSON spec
> (including UTF-8, UTF-16, and UTF-32) and should default to UTF-8.

'turns out it does already, and no one is suggesting changing that.

>> Anyway -- if anyone wants to push for overloading .load()/dump(), rather
>> than making two new loadf() and dumpf() functions, then speak now -- that
>> will take more discussion, and maybe a PEP.
>>
>
> I don't see why one or the other would need a PEP so long as the new
> functionality is backward-compatible?
>

I'm just putting my finger in the wind. No need for a PEP if it's simple
and non-controversial, but if even the few folks on this thread don't agree
on the API we want, then it's maybe too controversial -- so either more
discussion, to come to consensus, or a PEP.

Or not -- we can see what the core devs say if/when someone does a bpo / PR.

-CHB






-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/2BOVZAJAC7X3PBWNGAYUGBTVGZBVEZW5/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: 'Infinity' constant in Python

2020-09-16 Thread Christopher Barker
Could you all please start another thread if you want to discuss possible
changes to Error handling for floats. Or anything that isn't strictly
adding some names to builtins.

There's been ongoing confusion from the expansion of the original topic
here.

Thanks,
-CHB



On Wed, Sep 16, 2020 at 8:28 AM Stephen J. Turnbull <
turnbull.stephen...@u.tsukuba.ac.jp> wrote:

> Paul Moore writes:
>  > >  > And as soon as we start considering integer division, we're talking
>  > >  > about breaking a *vast* amount of code.
>  > >
>  > > Yeah, I'm ok with *not* breaking that code.
>  >
>  > You may have misunderstood me - when I said "integer division", I
>  > meant "division of two integers",
>
> Just to clear it up, I understood your point correctly.  "I'm ok with
> *not* breaking that code" means "I'm talking about the mythical Python
> 4.0, obviously we can't change the error raised by 1 / 0".
>
>  > My *only* concern with the points you and Ben were making was that you
>  > seemed to be suggesting changes to the division operator and
>  > ZeroDivisionError,
>
> Once again, I am quite ok with *not* breaking all that code.  My point
> about the inconsistencies is not to suggest fixing them.  I'm quite
> sure that pragmatically we can't fix *all* of them, and most likely
> we'd have to go slow on fixing *any* of them.  Rather that the whole
> float situation is so visually messy that we should leave it alone --
> although it probably works fine in practice until you have need for
> NumPy for other reasons.


-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/U3Q7IFJZSEDZG7UKD4UCW5QSZHY5LHFA/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: A shortcut to load a JSON file into a dict : json.loadf

2020-09-16 Thread Christopher Barker
On Wed, Sep 16, 2020 at 12:59 AM Rob Cliffe via Python-ideas <
python-ideas@python.org> wrote:

> On 14/09/2020 17:36, Christopher Barker wrote:
>


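> If I've understood correctly (far from certain) the existing json.dumps
> and json.loads functions are permissive (allow some constructions that are
> not part of the JSON spec) but the proposed new functions will be strict.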
>

As it looks like I may be the one to write the PR -- no, I'm not suggesting
any changes to compliance.

The only thing even remotely on the table is only supporting UTF-8 -- but
IIUC, the current functions, if they do the encoding/decoding for you, are
already UTF-8 only, so no change.

load() and dump() work with text file-like objects -- they are not doing
any encoding/decoding.

loads() works with strings or bytes. If strings, then no encoding. If
bytes, then:

"The ``encoding`` argument is ignored and deprecated since Python 3.1"

which I figured meant utf-8, but in fact it seems to work with utf-16 as
well.

In [17]: utf16 = '{"this": 5}'.encode('utf-16')

In [18]: json.loads(utf16)
Out[18]: {'this': 5}

which surprises me. I'll need to look at the code and see what it's doing.
Unless someone wants to tell us :-)
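
As far as I can tell from the CPython source, the detection is done by a small helper in the json package, json.detect_encoding(). It is not part of the documented API, so treat the following as a sketch of current CPython behaviour rather than a guarantee.

```python
import json

payload = '{"this": 5}'

# detect_encoding() checks for a byte order mark first, then falls back to
# the pattern of null bytes among the first four bytes of the input.
detected = {
    enc: json.detect_encoding(payload.encode(enc))
    for enc in ("utf-8", "utf-16", "utf-32", "utf-16-le", "utf-32-be")
}

# loads() decodes bytes input with the detected encoding before parsing.
parsed = json.loads(payload.encode("utf-16-le"))
```

The 'utf-16' and 'utf-32' codecs write a byte order mark when encoding, which is why those two are reported without an explicit endianness.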

dumps(), meanwhile, dumps a str, so again, no encoding.

The idea here is that if you want to use loadf() or dumpf(), it will be
utf-8, and if you want to use another encoding, you can open the file
yourself and use load() or dump().
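
For example, a UTF-16 round trip with the existing functions might look like this. The file name and data are made up for the illustration.

```python
import codecs
import json
import os
import tempfile

data = {"greeting": "hello", "count": 3}

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "legacy.json")

    # The caller picks the encoding when opening the file; dump() just writes.
    with open(target, "w", encoding="utf-16") as f:
        json.dump(data, f)

    # Reading back in text mode with the same encoding.
    with open(target, encoding="utf-16") as f:
        round_trip = json.load(f)

    # The bytes on disk are UTF-16, starting with a byte order mark.
    with open(target, "rb") as f:
        starts_with_bom = f.read().startswith(codecs.BOM_UTF16)
```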


> To minimise possible confusion, I think that the documentation (both the
> docstrings and the online docs) should be **very clear** about this.
>

Yes, and they need some help in that regard now anyway.

-CHB



> E.g.
> loads:
> ...
> loads accepts blah-blah-blah.  This is different from loadf which only
> accepts strict JSON.
>
> loadf:
> ...
> loadf only accepts strict JSON.  This is different from loads which
> blah-blah-blah
>
> Etc.
> Rob Cliffe


-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/B234P3HHBXN4GT7SJNXDHYAJOQSD7YXY/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: 'Infinity' constant in Python

2020-09-16 Thread Stephen J. Turnbull
Paul Moore writes:
 > >  > And as soon as we start considering integer division, we're talking
 > >  > about breaking a *vast* amount of code.
 > >
 > > Yeah, I'm ok with *not* breaking that code.
 > 
 > You may have misunderstood me - when I said "integer division", I
 > meant "division of two integers",

Just to clear it up, I understood your point correctly.  "I'm ok with
*not* breaking that code" means "I'm talking about the mythical Python
4.0, obviously we can't change the error raised by 1 / 0".

 > My *only* concern with the points you and Ben were making was that you
 > seemed to be suggesting changes to the division operator and
 > ZeroDivisionError,

Once again, I am quite ok with *not* breaking all that code.  My point
about the inconsistencies is not to suggest fixing them.  I'm quite
sure that pragmatically we can't fix *all* of them, and most likely
we'd have to go slow on fixing *any* of them.  Rather that the whole
float situation is so visually messy that we should leave it alone --
although it probably works fine in practice until you have need for
NumPy for other reasons.
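
A few of the inconsistencies in question, as a sketch. The choice of expressions and the outcome() helper are mine, for illustration only.

```python
import math


def outcome(fn):
    """Return repr of the result, or the name of the arithmetic error raised."""
    try:
        return repr(fn())
    except ArithmeticError as exc:
        return type(exc).__name__


results = {
    # Division by zero raises for ints and floats alike.
    "1 / 0": outcome(lambda: 1 / 0),
    "1.0 / 0.0": outcome(lambda: 1.0 / 0.0),
    # Float multiplication overflows silently to infinity...
    "1e308 * 10": outcome(lambda: 1e308 * 10),
    # ...while float exponentiation raises instead.
    "2.0 ** 10000": outcome(lambda: 2.0 ** 10000),
    # Arithmetic on infinities quietly produces NaN.
    "math.inf - math.inf": outcome(lambda: math.inf - math.inf),
}
```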
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/FXIBYNXE3B53RTGQWUZT2PGNZ5OXB3BM/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: A shortcut to load a JSON file into a dict : json.loadf

2020-09-16 Thread Rob Cliffe via Python-ideas



On 14/09/2020 17:36, Christopher Barker wrote:

There seems to be a fair bit of support for this idea.

Will it need a PEP ?

-CHB

If I've understood correctly (far from certain) the existing json.dumps 
and json.loads functions are permissive (allow some constructions that 
are not part of the JSON spec) but the proposed new functions will be 
strict.
To minimise possible confusion, I think that the documentation (both the 
docstrings and the online docs) should be **very clear** about this.

E.g.
loads:
    ...
    loads accepts blah-blah-blah.  This is different from loadf which 
only accepts strict JSON.


loadf:
    ...
    loadf only accepts strict JSON.  This is different from loads which 
blah-blah-blah


Etc.
Rob Cliffe
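
For reference, one concrete case of the permissiveness mentioned above is the handling of NaN and Infinity, which are not valid JSON but are accepted and produced by default. The sketch below uses only documented json parameters; the reject_constant() helper is hypothetical.

```python
import json
import math

# By default loads() accepts these non-standard constants...
lenient = json.loads("[NaN, Infinity, -Infinity]")

# ...and dumps() emits them.
emitted = json.dumps(float("inf"))

# allow_nan=False makes dumps() refuse values that strict JSON cannot express.
try:
    json.dumps(float("nan"), allow_nan=False)
    strict_dump_rejected = False
except ValueError:
    strict_dump_rejected = True


def reject_constant(name):
    """Hypothetical hook: refuse NaN, Infinity and -Infinity on input."""
    raise ValueError(f"non-standard JSON constant: {name}")


# parse_constant is called for each of the three constants while parsing.
try:
    json.loads("[NaN]", parse_constant=reject_constant)
    strict_load_rejected = False
except ValueError:
    strict_load_rejected = True
```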
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/ABQKK6GSK33I2XBJK4VK4RUEUAQ3HDVV/
Code of Conduct: http://python.org/psf/codeofconduct/