[Python-ideas] Re: adding support for a "raw output" in JSON serializer

Andrew Barnert via Python-ideas Fri, 23 Aug 2019 10:53:08 -0700

On Aug 23, 2019, at 09:45, Christopher Barker <python...@gmail.com> wrote:
> 
> Andrew, thanks for the background.
> 
>> On Fri, Aug 23, 2019 at 8:25 AM Andrew Barnert via Python-ideas 
>> <python-ideas@python.org> wrote:
> 
>> Also, IIRC, it doesn’t do any post-check; it assumes calling str on any 
>> Decimal value (after an isfinite check if not allow_nan) produces valid 
>> JSON. I think there are unit tests meant to establish that fact, but you’d 
>> need to copy those into the stdlib test suite and make the case that their 
>> coverage is sufficient,
> 
> That seems like the way to go -- if it's in the stdlib, then any change to 
> Decimal that breaks it would fail the test. So no harm in relying on 
> Decimal's __str__.
> 
> But, of course, comprehensive test coverage is hard/impossible.


From a quick glance, it looks like the test cases in simplejson are pretty 
minimal. (Although they do test something I wouldn’t have thought of—what 
happens if you reload decimal but not _decimal, which apparently is a serious 
issue for wsgi or other uses of subinterpreters?)

The feature has been in simplejson since 2010, and on by default since 2011, 
and from a quick glance, the last relevant reported bug (the one that made them 
add the reload test) was 2012. But what is good enough for an easily fixed 
external project, where the upgrade costs to users are minimal and most of them 
are there because they want “development” features beyond the stdlib, may not 
be good enough for the stdlib…

And I’m not even sure what all of the relevant test cases are. Surveying a 
range of other JSON implementations (in multiple languages) to scavenge their 
tests might be the best place to start?
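
As a rough sketch of what such a test might look like: the regex below is my 
transcription of the RFC 8259 number grammar, and the sample values are made up 
for illustration (they are not taken from simplejson's suite). It only checks 
that str() of each Decimal is a syntactically valid JSON number, and that the 
non-finite values are not, which is why the isfinite check has to come first.

```python
import re
from decimal import Decimal

# JSON number grammar per RFC 8259: [ minus ] int [ frac ] [ exp ]
JSON_NUMBER = re.compile(r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?")


def is_json_number(text):
    """Return True if text is, in its entirety, a valid JSON number token."""
    return JSON_NUMBER.fullmatch(text) is not None


# Illustrative finite samples: zeros, trailing zeros, exponents, long digits.
finite_samples = [
    "0", "-0", "1.50", "1E+3", "1e-7", "0E-10",
    "123456789012345678901234567890.1", "-1.23E-400",
]
finite_results = {s: is_json_number(str(Decimal(s))) for s in finite_samples}

# Non-finite values stringify to names that JSON does not allow.
non_finite_samples = ["NaN", "sNaN", "Infinity", "-Infinity"]
non_finite_results = {s: is_json_number(str(Decimal(s))) for s in non_finite_samples}
```

A handful of samples like this is obviously not a coverage argument; it just 
shows the shape of the check that a larger scavenged suite would feed.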

>> or make some other argument that it’s guaranteed to be safe, or ignore str 
>> and write a _decimal_str similar to the existing _float_str,
> 
> I'm not sure there is any advantage to that -- it would still require the 
> same comprehensive tests -- unless, of course Decimal's __str__ does work for 
> all cases.

Well, it wouldn’t be easier to _test_, but it might be easier to _argue_. 
There’s a Haskell implementation that comes with a formal proof that the number 
encode function can’t produce anything that the number decode function can’t 
consume. That doesn’t seem likely to be reasonable for Python, but it’s not 
quite impossible…
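
Short of a proof, the same property can at least be spot-checked. A minimal 
sketch, assuming the "encode function" is just str() on a finite Decimal and 
the "decode function" is the stdlib json.loads with parse_float and parse_int 
set to Decimal; the helper name round_trips is hypothetical:

```python
import json
from decimal import Decimal


def round_trips(value):
    """Check that str(value) decodes back to exactly the same finite Decimal."""
    text = str(value)
    decoded = json.loads(text, parse_float=Decimal, parse_int=Decimal)
    # Compare as_tuple() rather than ==, so that -0 vs 0 and 1.5 vs 1.50
    # count as differences instead of comparing equal numerically.
    return isinstance(decoded, Decimal) and decoded.as_tuple() == value.as_tuple()


# Illustrative samples only, not an exhaustive suite.
samples = [Decimal(s) for s in ("0", "-0", "1.50", "1E+3", "1E-7", "0E-10")]
all_ok = all(round_trips(d) for d in samples)
```

This tests the pairing of one encoder with one decoder, which is a weaker 
claim than "valid JSON for every consumer", but it is the claim the Haskell 
proof makes.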

>> or find a way to validate it that isn’t too slow. 
> 
> Almost by definition validation is going to be slower -- probably on the 
> order of twice as slow. Validation is a good idea if you are not controlling the 
> input, but theoretically a waste of time if you are.

Good point. But then twice as slow as a feature that doesn’t exist at all is 
still a step forward. If someone wants to implement use_decimal with validation 
for 3.9, and meanwhile keep working on a test suite that will convince everyone 
sufficiently to allow removing the validation and doubling the speed for 3.10, 
that might be better than waiting for 3.10 to add the feature. As long as it’s 
not too slow to actually use in practice, it might still be worth having.
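
To make the "with validation" variant concrete, here is a minimal sketch. The 
function name encode_decimal is hypothetical, the regex is my transcription of 
the RFC 8259 number grammar, and this is not how simplejson does it (as noted 
above, I believe simplejson does no post-check). It produces only the number 
token, not a full document.

```python
import re
from decimal import Decimal

# JSON number grammar per RFC 8259: [ minus ] int [ frac ] [ exp ]
_JSON_NUMBER = re.compile(r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?")


def encode_decimal(value):
    """Return the JSON number token for a finite Decimal, or raise ValueError."""
    if not value.is_finite():
        # NaN, sNaN and the infinities have no JSON representation.
        raise ValueError(f"out of range Decimal value: {value!r}")
    text = str(value)
    # The post-check: refuse to emit anything that is not a JSON number.
    if _JSON_NUMBER.fullmatch(text) is None:
        raise ValueError(f"Decimal produced a non-JSON string: {text!r}")
    return text
```

Dropping the fullmatch call later, once the test suite is convincing, would 
not change the output for any value that passes today.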

Also, keep in mind that, at least in simplejson, using Decimal instead of float 
already means an 80%-300% slowdown, so people who can’t sacrifice performance 
for precision already have to come up with other alternatives anyway.

_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/GRM4I6SK3ZGQLGNFEKZ5T2PULFQXHC2S/
Code of Conduct: http://python.org/psf/codeofconduct/
