[issue17909] Autodetecting JSON encoding

2018-06-07 Thread miss-islington


miss-islington  added the comment:


New changeset 21f2553482c3d6ec8beb8bfa0f1fb5d23c6a4c2f by Miss Islington (bot) 
in branch '3.6':
bpo-17909: Document that json.load can accept a binary IO (GH-7366)
https://github.com/python/cpython/commit/21f2553482c3d6ec8beb8bfa0f1fb5d23c6a4c2f


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17909] Autodetecting JSON encoding

2018-06-07 Thread miss-islington


miss-islington  added the comment:


New changeset f38ace61a39e64f5fde6f8f402e258177bdf7ff4 by Miss Islington (bot) 
in branch '3.7':
bpo-17909: Document that json.load can accept a binary IO (GH-7366)
https://github.com/python/cpython/commit/f38ace61a39e64f5fde6f8f402e258177bdf7ff4


--
nosy: +miss-islington




[issue17909] Autodetecting JSON encoding

2018-06-07 Thread miss-islington


Change by miss-islington :


--
pull_requests: +7098




[issue17909] Autodetecting JSON encoding

2018-06-07 Thread miss-islington


Change by miss-islington :


--
pull_requests: +7097




[issue17909] Autodetecting JSON encoding

2018-06-07 Thread INADA Naoki


INADA Naoki  added the comment:


New changeset bb6366bd7570ff3b74bc66095540bea78f31504e by INADA Naoki (Anthony 
Sottile) in branch 'master':
bpo-17909: Document that json.load can accept a binary IO (GH-7366)
https://github.com/python/cpython/commit/bb6366bd7570ff3b74bc66095540bea78f31504e
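
A minimal sketch of what the documented behaviour looks like in practice, assuming Python 3.6 or later. The in-memory stream and its contents are made up for illustration; any binary file object with a read() method should behave the same way.

```python
import io
import json

# A binary file-like object holding UTF-8 encoded JSON (illustrative data).
binary_stream = io.BytesIO('{"name": "caf\u00e9"}'.encode("utf-8"))

# json.load() reads the whole stream and hands the bytes to json.loads(),
# which works out the encoding before parsing.
data = json.load(binary_stream)
print(data["name"])
```

The same call also works with a file opened in "rb" mode, so the caller does not have to pick an encoding up front.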


--
nosy: +inada.naoki




[issue17909] Autodetecting JSON encoding

2018-06-03 Thread Anthony Sottile


Change by Anthony Sottile :


--
pull_requests: +6992




[issue17909] Autodetecting JSON encoding

2016-09-10 Thread Nick Coghlan

Changes by Nick Coghlan :


--
stage: commit review -> resolved
status: open -> closed




[issue17909] Autodetecting JSON encoding

2016-09-10 Thread Nick Coghlan

Nick Coghlan added the comment:

Thanks for tackling this, Serhiy!

I removed issue 13916 from the dependency list, as while that's a reasonable 
suggestion, I don't think this fix is conditional on that change.

--
dependencies:  -disallow the "surrogatepass" handler for non utf-* encodings
resolution:  -> fixed




[issue17909] Autodetecting JSON encoding

2016-09-10 Thread Roundup Robot

Roundup Robot added the comment:

New changeset e9e1bf9ec2ac by Nick Coghlan in branch 'default':
Issue #17909: Accept binary input in json.loads
https://hg.python.org/cpython/rev/e9e1bf9ec2ac
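
As a rough illustration of the change, assuming Python 3.6 or later: the same JSON text encoded in any of the five encodings discussed in this issue can be passed to json.loads() as bytes. The payload is made up for the example.

```python
import json

payload = {"greeting": "hello", "n": 1}
text = json.dumps(payload)

decoded = {}
for encoding in ("utf-8", "utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be"):
    raw = text.encode(encoding)          # bytes, no BOM
    decoded[encoding] = json.loads(raw)  # encoding is detected from the bytes

print(all(value == payload for value in decoded.values()))
```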

--
nosy: +python-dev




[issue17909] Autodetecting JSON encoding

2016-09-10 Thread Nick Coghlan

Nick Coghlan added the comment:

Having hit the json.loads() problem recently when porting a project to Python 
3, I'm keen to see this land for 3.6.

Accordingly, I'm assigning this to myself to review and merge Serhiy's patch. 
If it proves necessary, we can tweak the details of the encoding detection 
during beta.

--
assignee: serhiy.storchaka -> ncoghlan




[issue17909] Autodetecting JSON encoding

2016-08-30 Thread Stéphane Wirtel

Changes by Stéphane Wirtel :


--
stage: patch review -> commit review




[issue17909] Autodetecting JSON encoding

2016-08-30 Thread Stéphane Wirtel

Stéphane Wirtel added the comment:

Hi Serhiy,

I have reviewed your patch; it seems to be OK.

--
nosy: +matrixise




[issue17909] Autodetecting JSON encoding

2016-06-22 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :


--
versions: +Python 3.6 -Python 3.5
Added file: http://bugs.python.org/file43513/json_detect_encoding_3.patch




[issue17909] Autodetecting JSON encoding

2016-05-03 Thread Geoffrey Sneddon

Changes by Geoffrey Sneddon :


--
nosy: +gsnedders




[issue17909] Autodetecting JSON encoding

2015-03-27 Thread Berker Peksag

Changes by Berker Peksag berker.pek...@gmail.com:


--
nosy: +berker.peksag

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17909
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17909] Autodetecting JSON encoding

2014-10-26 Thread Martin Panter

Martin Panter added the comment:

If you adjusted the detect_encoding() logic according to Pete Cordell’s table 
at the bottom of 
http://www.ietf.org/mail-archive/web/json/current/msg01959.html, it might 
work for standalone strings.

However since the RFC encourages UTF-8 for best interoperability, I wonder if 
any of this autodetection is necessary. It might be simpler to just assume 
UTF-8, or use the “utf-8-sig” codec. Or are there real cases where detecting 
UTF-16 or -32 would be useful?
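
For comparison, a small sketch of the simpler alternative mentioned here: decode with the "utf-8-sig" codec and skip detection entirely. This is only an illustration of that suggestion, not what the patch does, and the sample bytes are made up.

```python
import codecs
import json

with_bom = codecs.BOM_UTF8 + b'{"ok": true}'
without_bom = b'{"ok": true}'

# "utf-8-sig" strips a leading UTF-8 BOM if one is present and otherwise
# behaves like plain "utf-8".
parsed_with_bom = json.loads(with_bom.decode("utf-8-sig"))
parsed_without_bom = json.loads(without_bom.decode("utf-8-sig"))

print(parsed_with_bom == parsed_without_bom)
```

The trade-off is that UTF-16 and UTF-32 input would fail to decode under this approach instead of being detected.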

--




[issue17909] Autodetecting JSON encoding

2014-10-24 Thread Martin Panter

Changes by Martin Panter vadmium...@gmail.com:


--
nosy: +vadmium




[issue17909] Autodetecting JSON encoding

2014-05-15 Thread STINNER Victor

Changes by STINNER Victor victor.stin...@gmail.com:


--
nosy: +haypo




[issue17909] Autodetecting JSON encoding

2014-05-15 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

All dependencies for this issue are resolved now.

Here is an updated patch, synchronized with tip.

--
Added file: http://bugs.python.org/file35258/json_detect_encoding_2.patch




[issue17909] Autodetecting JSON encoding

2014-05-15 Thread Serhiy Storchaka

Changes by Serhiy Storchaka storch...@gmail.com:


Removed file: http://bugs.python.org/file30133/json_detect_encoding.patch




[issue17909] Autodetecting JSON encoding

2014-05-15 Thread Chris Rebert

Chris Rebert added the comment:

You'll need to also update the Character Encodings subsection of the json 
docs.

--




[issue17909] Autodetecting JSON encoding

2014-05-15 Thread akira

akira added the comment:

Neither the JSON standard (ECMA-404) [1] nor the new JSON RFC 7159 [2] mentions
encoding detection.

[1] http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-404.pdf
[2] https://tools.ietf.org/html/rfc7159#section-8.1

From the rfc:

  JSON text SHALL be encoded in UTF-8, UTF-16, or UTF-32.  The default
  encoding is UTF-8, and JSON texts that are encoded in UTF-8 are
  interoperable in the sense that they will be read successfully by the
  maximum number of implementations; there are many implementations
  that cannot successfully read texts in other encodings (such as
  UTF-16 and UTF-32).

  Implementations MUST NOT add a byte order mark to the beginning of a
  JSON text.  In the interests of interoperability, implementations
  that parse JSON texts MAY ignore the presence of a byte order mark
  rather than treating it as an error.

--
nosy: +akira




[issue17909] Autodetecting JSON encoding

2014-05-15 Thread Chris Rebert

Chris Rebert added the comment:

I agree that the state of encoding detection in the new RFC seems unclear, 
given that the old RFC prefaced the part about the encoding detection with:
 Since the first two characters of a JSON text will always be ASCII
 characters

But in the new RFC:
 Appendix A.  Changes from RFC 4627
[...]
o  Changed the definition of JSON text so that it can be any JSON
   value, removing the constraint that it be an object or array.

Thus,
 "ಠ_ಠ"
whose 2nd character is decidedly non-ASCII, is now a valid JSON text (i.e. 
standalone JSON document).
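
To make the point concrete, here is a small sketch that inspects the bytes of that example. It assumes the example is meant as a JSON string, that is, the three characters wrapped in double quotes. The variable names are illustrative only.

```python
import json

# Assumed form of the example: a JSON string, including its double quotes.
json_text = '"\u0ca0_\u0ca0"'

utf8 = json_text.encode("utf-8")
utf16le = json_text.encode("utf-16-le")

# In UTF-8 the first byte is the ASCII quote; the second is a lead byte of a
# multi-byte sequence, so it is not ASCII.
first_is_ascii = utf8[0] < 0x80
second_is_ascii = utf8[1] < 0x80

# In UTF-16LE the fourth byte is the high byte of U+0CA0, so it is not zero
# and the "xx 00 xx 00" null pattern from RFC 4627 does not match.
fourth_byte_is_zero = utf16le[3] == 0

# As a str, the text parses to the bare three-character string.
value = json.loads(json_text)
print(first_is_ascii, second_is_ascii, fourth_byte_is_zero)
```

A detector that relies on the first two characters being ASCII therefore needs extra care once bare strings are valid JSON texts.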

There seems to have been a thread about encoding detection in the RFC 7159 
working group, but I don't have the time to read through it all:

 Re: [Json] JSON: remove gap between Ecma-404 and IETF draft
 http://www.ietf.org/mail-archive/web/json/current/msg01936.html

It eventually leads to a dedicated sub-thread:

 [Json] Encoding detection (Was: Re: JSON: remove gap between Ecma-404 and 
 IETF draft)
 http://www.ietf.org/mail-archive/web/json/current/msg01959.html

--




[issue17909] Autodetecting JSON encoding

2014-03-28 Thread Chris Rebert

Changes by Chris Rebert pyb...@rebertia.com:


--
nosy: +cvrebert




[issue17909] Autodetecting JSON encoding

2014-03-04 Thread Josh Lee

Changes by Josh Lee jlee...@gmail.com:


--
nosy: +jleedev




[issue17909] Autodetecting JSON encoding

2013-12-01 Thread Julian Berman

Changes by Julian Berman julian+python@grayvines.com:


--
nosy: +Julian




[issue17909] Autodetecting JSON encoding

2013-11-30 Thread Antoine Pitrou

Changes by Antoine Pitrou pit...@free.fr:


--
nosy: +ncoghlan




[issue17909] Autodetecting JSON encoding

2013-11-30 Thread Antoine Pitrou

Changes by Antoine Pitrou pit...@free.fr:


--
versions: +Python 3.5 -Python 3.4




[issue17909] Autodetecting JSON encoding

2013-08-10 Thread Serhiy Storchaka

Changes by Serhiy Storchaka storch...@gmail.com:


--
stage:  -> patch review




[issue17909] Autodetecting JSON encoding

2013-05-05 Thread Serhiy Storchaka

New submission from Serhiy Storchaka:

RFC 4627 specifies a method for determining the encoding (one of UTF-8, 
UTF-16(BE|LE) or UTF-32(BE|LE)) of encoded JSON text. The proposed preliminary 
patch (it doesn't include the documentation yet) allows the load() and loads() 
functions to accept bytes data when it is encoded with a standard Unicode 
encoding. Data with a BOM is also accepted (this isn't specified in RFC 4627, 
but is widely used).
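
The detection described above can be sketched roughly as follows. This is an illustration of the RFC 4627 null-pattern table plus a BOM check, not the code from the attached patch; guess_json_encoding() and loads_bytes() are hypothetical names.

```python
import codecs
import json


def guess_json_encoding(data: bytes) -> str:
    """Hypothetical helper: guess the encoding of encoded JSON text."""
    # BOMs first. UTF-32 must be checked before UTF-16 because the
    # UTF-32LE BOM starts with the UTF-16LE BOM.
    if data.startswith((codecs.BOM_UTF32_BE, codecs.BOM_UTF32_LE)):
        return "utf-32"
    if data.startswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)):
        return "utf-16"
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    # RFC 4627, section 3: look at the pattern of nulls in the first four
    # octets (True marks a zero byte).
    if len(data) >= 4:
        nulls = tuple(byte == 0 for byte in data[:4])
        if nulls == (True, True, True, False):
            return "utf-32-be"
        if nulls == (True, False, True, False):
            return "utf-16-be"
        if nulls == (False, True, True, True):
            return "utf-32-le"
        if nulls == (False, True, False, True):
            return "utf-16-le"
    return "utf-8"


def loads_bytes(data: bytes):
    """Hypothetical wrapper: decode with the guessed encoding, then parse."""
    return json.loads(data.decode(guess_json_encoding(data)))


sample = '{"a": [1, 2]}'
print(loads_bytes(sample.encode("utf-16-be")))
```

The table assumes the text starts with two ASCII characters, which holds for objects and arrays as required by RFC 4627.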

There is only one case where the method can misfire. The serialized string 
"\x00..." encoded in UTF-16LE may be erroneously detected as encoded in 
UTF-32LE. This case violates two rules of RFC 4627: a string was 
serialized instead of an object or an array, and the control character U+0000 
was not escaped. Standard-conforming encoded JSON is always detected correctly.
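
The ambiguity is easy to see at the byte level. The sketch below only compares leading bytes; the sample text is made up and it does not call the patch's detection code.

```python
# A bare JSON string whose first character is an unescaped U+0000.
# This is not valid under RFC 4627, which is why the misfire is tolerated.
json_text = '"\x00abc"'

as_utf16le = json_text.encode("utf-16-le")

# The quote encodes to 22 00 and U+0000 to 00 00, so the first four bytes
# are 22 00 00 00.
leading = as_utf16le[:4]

# A quote character encoded in UTF-32LE is also 22 00 00 00, so the
# "xx 00 00 00" null pattern reads this input as UTF-32LE.
same_as_utf32le_quote = leading == '"'.encode("utf-32-le")
print(leading, same_as_utf32le_quote)
```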

This patch requires the surrogatepass error handler for UTF-16/32 (see 
issue12892 and issue13916).

--
assignee: serhiy.storchaka
components: Library (Lib), Unicode
files: json_detect_encoding.patch
keywords: patch
messages: 188442
nosy: ezio.melotti, pitrou, rhettinger, serhiy.storchaka
priority: normal
severity: normal
status: open
title: Autodetecting JSON encoding
type: enhancement
versions: Python 3.4
Added file: http://bugs.python.org/file30133/json_detect_encoding.patch




[issue17909] Autodetecting JSON encoding

2013-05-05 Thread Serhiy Storchaka

Changes by Serhiy Storchaka storch...@gmail.com:


--
dependencies: +UTF-16 and UTF-32 codecs should reject (lone) surrogates, 
disallow the surrogatepass handler for non utf-* encodings
