Hello,
Antoine Pitrou solip...@pitrou.net writes:
Hello,
We're in the process of forward-porting the recent (massive) json
updates to 3.1, and we are also thinking of dropping remnants of
support of the bytes type in the json library (in 3.1, again). This
bytes support almost didn't work
I couldn't figure out a way to get rid of it short of multi-#including
templates and playing with the C preprocessor, however, and have the
nagging feeling the latter would be frowned upon by the maintainers.
Not sure if this is exactly what you mean, but look at Objects/stringlib.
On Mon, Apr 27, 2009 at 7:25 AM, Damien Diederen d...@crosstwine.com wrote:
Antoine Pitrou solip...@pitrou.net writes:
Hello,
We're in the process of forward-porting the recent (massive) json
updates to 3.1, and we are also thinking of dropping remnants of
support of the bytes type in the
Damien Diederen dd at crosstwine.com writes:
I couldn't figure out a way to get rid of it short of multi-#including
templates and playing with the C preprocessor, however, and have the
nagging feeling the latter would be frowned upon by the maintainers.
There is a precedent with
Hi Antoine,
Antoine Pitrou solip...@pitrou.net writes:
Damien Diederen dd at crosstwine.com writes:
I couldn't figure out a way to get rid of it short of multi-#including
templates and playing with the C preprocessor, however, and have the
nagging feeling the latter would be frowned upon by
Warning: Reply-To set to email-sig.
Greg Ewing writes:
Only for headers known to be unstructured, I think.
Completely unknown headers should be available only
as bytes.
Why do I get the feeling that you guys are feeling up an
elephant? <wink>
There are four things you might want to do with
2009/4/13 Daniel Stutzbach dan...@stutzbachenterprises.com:
print("Content-Type: application/json; charset=utf-8")
Please don't do that! According to RFC 4627 the charset parameter is
not allowed for the application/json media type.
Just use Content-Type: application/json, the charset is only
On Apr 10, 2009, at 11:08 AM, James Y Knight wrote:
Until you write a parser for every header, you simply cannot decode
to unicode. The only sane choices are:
1) raw bytes
2) parsed structured data
The email package does not need a parser for every header, but it
should provide a
On Fri, Apr 10, 2009 at 10:06 PM, Martin v. Löwis mar...@v.loewis.de wrote:
However, I really think that this question cannot be answered by
reading the RFC. It should be answered by verifying how people use
the json library in 2.x.
I use the json module in 2.6 to communicate with a C# JSON
I use the json module in 2.6 to communicate with a C# JSON library and a
JavaScript JSON library. The C# and JavaScript libraries produce and
consume the equivalent of str, not the equivalent of bytes.
I assume there is a TCP connection between the json module and the
C#/JavaScript libraries?
On Mon, Apr 13, 2009 at 12:19 PM, Martin v. Löwis mar...@v.loewis.de wrote:
I use the json module in 2.6 to communicate with a C# JSON library and a
JavaScript JSON library. The C# and JavaScript libraries produce and
consume the equivalent of str, not the equivalent of bytes.
I assume
On Apr 13, 2009, at 10:11 AM, Barry Warsaw wrote:
The email package does not need a parser for every header, but it
should provide a framework that applications (or third party
libraries) can use to extend the built-in header parsers. A bare
minimum for functionality requires a
Yes, there's a TCP connection. Sorry for not making that clear to begin
with.
If so, it doesn't matter what representation these implementations chose
to use.
True, I can always convert from bytes to str or vice versa.
I think you are missing the point. It will not be
On Mon, Apr 13, 2009 at 1:02 PM, Martin v. Löwis mar...@v.loewis.de wrote:
Yes, there's a TCP connection. Sorry for not making that clear to begin
with.
If so, it doesn't matter what representation these implementations chose
to use.
True, I can always convert from bytes to str or
On Mon, Apr 13, 2009 at 3:28 PM, Bob Ippolito b...@redivi.com wrote:
It's not a bug in dumps, it's a matter of not reading the
documentation. The encoding parameter of dumps decides how byte
strings should be interpreted, not what the output encoding is.
You're right; I apologize for not
On Mon, Apr 13, 2009 at 3:02 PM, Martin v. Löwis mar...@v.loewis.de wrote:
True, I can always convert from bytes to str or vice versa.
I think you are missing the point. It will not be necessary to convert.
Sometimes I want bytes and sometimes I want str. I am going to be
converting some of
Barry Warsaw wrote:
The default
would probably be some unstructured parser for headers like Subject.
Only for headers known to be unstructured, I think.
Completely unknown headers should be available only
as bytes.
--
Greg
On Mon, Apr 13, 2009 at 5:25 PM, Daniel Stutzbach
dan...@stutzbachenterprises.com wrote:
On Mon, Apr 13, 2009 at 3:02 PM, Martin v. Löwis mar...@v.loewis.de
wrote:
True, I can always convert from bytes to str or vice versa.
I think you are missing the point. It will not be necessary to
Bob Ippolito bob at redivi.com writes:
The output of json/simplejson dumps for Python 2.x is either an ASCII
bytestring (default) or a unicode string (when ensure_ascii=False).
This is very practical in 2.x because an ASCII bytestring can be
treated as either text or bytes in most
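The snippet above describes Python 2.x behaviour. As a minimal sketch of the same `ensure_ascii` switch, assuming a current Python 3 `json` module (where `dumps` returns `str` in both cases), the difference is only whether non-ASCII characters are escaped:

```python
import json

value = {"name": "Zoë"}

# Default (ensure_ascii=True): non-ASCII characters are written as \uXXXX
# escapes, so the resulting text contains only ASCII characters.
ascii_text = json.dumps(value)

# ensure_ascii=False: the character is kept as-is in the returned str.
unicode_text = json.dumps(value, ensure_ascii=False)

print(ascii_text)
print(unicode_text)
```

Because the default output is pure ASCII, encoding it with any ASCII-compatible charset gives the same bytes, which is the property the 2.x behaviour relied on.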
Alexandre Vassalotti wrote:
print("Content-Type: application/json; charset=utf-8")
input_object = json.loads(sys.stdin.read())
output_object = do_some_work(input_object)
print(json.dumps(output_object))
print()
That assumes the encoding being used by stdout has
ascii as a subset.
--
Greg
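One way to avoid depending on the encoding of stdout is to work with binary streams and encode explicitly. The following is only a sketch under assumptions: `handle_request` is a hypothetical helper, `do_some_work` is a placeholder for the application logic, and UTF-8 is assumed for both request and response. The header omits the charset parameter.

```python
import io
import json


def do_some_work(obj):
    # Hypothetical placeholder for the application logic.
    return {"echo": obj}


def handle_request(stdin_bytes, stdout_bytes):
    """Read a UTF-8 JSON request body and write a UTF-8 JSON response."""
    # Decode explicitly instead of relying on the text layer's encoding.
    input_object = json.loads(stdin_bytes.read().decode("utf-8"))
    body = json.dumps(do_some_work(input_object)).encode("utf-8")
    stdout_bytes.write(b"Content-Type: application/json\r\n\r\n")
    stdout_bytes.write(body)


# In a real CGI script the arguments would be sys.stdin.buffer and
# sys.stdout.buffer; in-memory streams are used here so the sketch runs anywhere.
request = io.BytesIO(b'{"n": 1}')
response = io.BytesIO()
handle_request(request, response)
print(response.getvalue())
```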
Below is a basic CGI application that assumes that json module works
with str, not bytes. How would you write it if the json module does not
support returning a str?
In a CGI application, you shouldn't be using sys.stdin or print().
Instead, you should be using sys.stdin.buffer (or
Martin v. Löwis martin at v.loewis.de writes:
Not sure whether it would be *significantly* faster, but yes, Bob wrote
an accelerator for parsing out of a byte string to make it really fast;
IIRC, he claims that it is faster than pickling.
Isn't premature optimization the root of all evil?
Greg Ewing writes:
The reason you use a text format in the first place is that
you have some way of transmitting text, and you want to
send something that isn't text. In that situation, the
encoding is already determined by whatever means you're
using to send the text.
Determined, yes,
gl...@divmod.com wrote:
My preference would be that
message.headers['Subject'] = b'Some Bytes'
would simply raise an exception. If you've got some bytes, you should
instead do
message.bytes_headers['Subject'] = b'Some Bytes'
Remind me again why you need to differentiate between
On 11/04/2009 6:12 PM, Antoine Pitrou wrote:
Martin v. Löwis martin at v.loewis.de writes:
Not sure whether it would be *significantly* faster, but yes, Bob wrote
an accelerator for parsing out of a byte string to make it really fast;
IIRC, he claims that it is faster than pickling.
Isn't
gl...@divmod.com wrote:
On 03:21 am, ncogh...@gmail.com wrote:
Given that json is a wire protocol, that sounds like the right approach
for json as well. Once bytes-everywhere works, then a text API can be
built on top of it, but it is difficult to build a bytes API on top of a
text one.
I
glyph at divmod.com writes:
In email's case this is true, but in JSON's case it's not. JSON is a
format defined as a sequence of code points; MIME is defined as a
sequence of octets.
Another way to look at it is that JSON is a subset of Javascript, and as such is
text rather than bytes.
2009/4/10 Nick Coghlan ncogh...@gmail.com:
gl...@divmod.com wrote:
On 03:21 am, ncogh...@gmail.com wrote:
Given that json is a wire protocol, that sounds like the right approach
for json as well. Once bytes-everywhere works, then a text API can be
built on top of it, but it is difficult to
In email's case this is true, but in JSON's case it's not. JSON is a
format defined as a sequence of code points; MIME is defined as a
sequence of octets.
Another way to look at it is that JSON is a subset of Javascript, and as such is
text rather than bytes.
I don't think this can be
On Apr 9, 2009, at 10:38 PM, Barry Warsaw wrote:
So, what I'm really asking is this. Let's say you agree that there
are use cases for accessing a header value as either the raw encoded
bytes or the decoded unicode.
As I said in the thread having nearly the same exact discussion on web-
Paul Moore writes:
On the other hand, further down in the document:
3. Encoding
JSON text SHALL be encoded in Unicode. The default encoding is
UTF-8.
Since the first two characters of a JSON text will always be ASCII
characters [RFC0020], it is possible to
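The detection rule quoted from RFC 4627 (section 3) looks at which of the first four octets are zero. A rough sketch of that rule follows; `sniff_json_encoding` is a hypothetical name, not a function from the json module discussed in this thread, and the fallback to UTF-8 for inputs shorter than four octets is an assumption, since the RFC does not cover that case.

```python
import json


def sniff_json_encoding(data: bytes) -> str:
    """Guess the encoding from the null-byte pattern of the first four octets."""
    head = data[:4]
    if len(head) < 4:
        # Assumption: too short to sniff, fall back to the default encoding.
        return "utf-8"
    nulls = tuple(octet == 0 for octet in head)
    patterns = {
        (True, True, True, False): "utf-32-be",   # 00 00 00 xx
        (True, False, True, False): "utf-16-be",  # 00 xx 00 xx
        (False, True, True, True): "utf-32-le",   # xx 00 00 00
        (False, True, False, True): "utf-16-le",  # xx 00 xx 00
    }
    return patterns.get(nulls, "utf-8")           # xx xx xx xx


# The explicit -le/-be codecs are used because they do not emit a byte order
# mark, which this rule does not account for.
payload = '["hi"]'.encode("utf-16-le")
encoding = sniff_json_encoding(payload)
decoded = json.loads(payload.decode(encoding))
print(encoding, decoded)
```

The rule works only because a JSON text in the RFC 4627 sense starts with two ASCII characters (an object or array opener followed by another ASCII character).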
On Fri, Apr 10, 2009 at 8:38 AM, Stephen J. Turnbull step...@xemacs.org wrote:
Paul Moore writes:
On the other hand, further down in the document:
3. Encoding
JSON text SHALL be encoded in Unicode. The default encoding is
UTF-8.
Since the first two
(3) The default transfer encoding syntax is UTF-8.
Notice that the RFC is partially irrelevant. It only applies
to the application/json mime type, and JSON is used in various
other protocols, using various other encodings.
I think it's a bad idea for any of the core
JSON API to accept or
On Apr 10, 2009, at 1:19 AM, gl...@divmod.com wrote:
On 02:38 am, ba...@python.org wrote:
So, what I'm really asking is this. Let's say you agree that there
are use cases for accessing a header value as either the raw
encoded bytes or the decoded unicode. What should this return:
On Thu, 2009-04-09 at 22:38 -0400, Barry Warsaw wrote:
On Apr 9, 2009, at 11:55 AM, Daniel Stutzbach wrote:
On Thu, Apr 9, 2009 at 6:01 AM, Barry Warsaw ba...@python.org wrote:
Anyway, aside from that decision, I haven't come up with an elegant
way to allow /output/ in both bytes and
Martin v. Löwis writes:
(3) The default transfer encoding syntax is UTF-8.
Notice that the RFC is partially irrelevant. It only applies
to the application/json mime type, and JSON is used in various
other protocols, using various other encodings.
Sure. That's their problem. In
On Fri, Apr 10, 2009, Barry Warsaw wrote:
On Apr 10, 2009, at 2:06 PM, Michael Foord wrote:
Shouldn't headers always be text?
/me weeps
/me hands Barry a hankie
--
Aahz (a...@pythoncraft.com) * http://www.pythoncraft.com/
Why is this newsgroup different from all other
Robert Brewer writes:
Syntactically, there's no sense in providing:
Message.set_header('Subject', 'Some text', encoding='utf-16')
...since you could more clearly write the same as:
Message.set_header('Subject', 'Some text'.encode('utf-16'))
Which you now must *parse* and
gl...@divmod.com wrote:
On 03:21 am, ncogh...@gmail.com wrote:
Barry Warsaw wrote:
I don't know whether the parameter thing will work or not, but you're
probably right that we need to get the bytes-everywhere API first.
Given that json is a wire protocol, that sounds like the right
Paul Moore wrote:
3. Encoding
JSON text SHALL be encoded in Unicode. The default encoding is
UTF-8.
This is at best confused (in my utterly non-expert opinion :-)) as
Unicode isn't an encoding...
I'm inclined to agree. I'd go further and say that if JSON
is really meant to be a text
In email's case this is true, but in JSON's case it's not. JSON is a
format defined as a sequence of code points; MIME is defined as a
sequence of octets.
What is the 'bytes support' issue for json? Is it about content within
a json text? Or about the transport format of a json text?
The
[Dropping email sig]
On 11/04/2009 1:06 PM, Martin v. Löwis wrote:
However, I really think that this question cannot be answered by
reading the RFC. It should be answered by verifying how people use
the json library in 2.x.
In the absence of anything more formal, here are 2 anecdotes:
* The
I'm personally leaning slightly towards strings, putting the burden on
bytes-users of json to explicitly use the appropriate encoding, even in
cases where it *must* be utf8. On the other hand, I'm too lazy to dig
back through this large thread, but I seem to recall a suggestion that
using
[Antoine Pitrou]
Besides, Bob doesn't really seem to care about porting to py3k (he hasn't
said anything about it until now, other than that he didn't feel competent
to do it).
His actual words were: I will need some help with 3.0 since I am not well versed in the changes to the C API or
On Thu, Apr 9, 2009 at 07:15, Antoine Pitrou solip...@pitrou.net wrote:
The RFC also specifies a discrimination algorithm for non-supersets of ASCII
(“Since the first two characters of a JSON text will always be ASCII
characters [RFC0020], it is possible to determine whether an octet
On Apr 9, 2009, at 1:15 AM, Antoine Pitrou wrote:
Guido van Rossum guido at python.org writes:
I'm kind of surprised that a serialization protocol like JSON wouldn't
support reading/writing bytes (as the serialized format -- I don't
care about
Dirkjan Ochtman dirkjan at ochtman.nl writes:
The RFC states
that JSON-text = object / array, meaning loads for 'hi' isn't
strictly valid.
Sure, but then:
>>> json.loads('[]')
[]
>>> json.loads(u'[]'.encode('utf16'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
File
On Thu, Apr 9, 2009 at 13:10, Antoine Pitrou solip...@pitrou.net wrote:
Sure, but then:
>>> json.loads('[]')
[]
>>> json.loads(u'[]'.encode('utf16'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/antoine/cpython/__svn__/Lib/json/__init__.py", line 310, in loads
Barry Warsaw wrote:
On Apr 9, 2009, at 1:15 AM, Antoine Pitrou wrote:
Guido van Rossum guido at python.org writes:
I'm kind of surprised that a serialization protocol like JSON wouldn't
support reading/writing bytes (as the serialized format -- I don't
care about having bytes as values,
Barry Warsaw ba...@python.org wrote:
Anyway, aside from that decision, I haven't come up with an
elegant way to allow /output/ in both bytes and strings (input is I
think theoretically easier by sniffing the arguments).
Probably a good thing. It just promotes more confusion to do things
On Thu, Apr 9, 2009 at 6:01 AM, Barry Warsaw ba...@python.org wrote:
Anyway, aside from that decision, I haven't come up with an elegant way to
allow /output/ in both bytes and strings (input is I think theoretically
easier by sniffing the arguments).
Won't this work? (assuming dumps()
This is an interesting question, and something I'm struggling with for
the email package for 3.x. It turns out to be pretty convenient to have
both a bytes and a string API, both for input and output, but I think
email really wants to be represented internally as bytes. Maybe. Or
maybe
On Thu, Apr 9, 2009 at 1:15 AM, Antoine Pitrou solip...@pitrou.net wrote:
As for reading/writing bytes over the wire, JSON is often used in the same
context as HTML: you are supposed to know the charset and decode/encode the
payload using that charset. However, the RFC specifies a default
I can understand that you don't want to spend much time on it. How
about removing it from 3.1? We could re-add it when long-term support
becomes more likely.
I'm speechless.
It seems that my statement has surprised you, so let me explain:
I think we should refrain from making design
Alexandre Vassalotti wrote:
On Thu, Apr 9, 2009 at 1:15 AM, Antoine Pitrou solip...@pitrou.net wrote:
As for reading/writing bytes over the wire, JSON is often used in the same
context as HTML: you are supposed to know the charset and decode/encode the
payload using that charset. However, the
On Thu, Apr 9, 2009 at 1:05 PM, Martin v. Löwis mar...@v.loewis.de wrote:
I can understand that you don't want to spend much time on it. How
about removing it from 3.1? We could re-add it when long-term support
becomes more likely.
I'm speechless.
It seems that my statement has surprised
As far as Python 3 goes, I honestly have not yet familiarized myself
with the changes to the IO infrastructure and what the new idioms are.
At this time, I can't make any educated decisions with regard to how
it should be done because I don't know exactly how bytes are supposed
to work and
On Apr 9, 2009, at 8:07 AM, Steve Holden wrote:
The real problem I came across in storing email in a relational database
was the inability to store messages as Unicode. Some messages have a
body in one encoding and an attachment in another, so the only ways to
store the messages are either as
On Apr 9, 2009, at 11:08 AM, Bill Janssen wrote:
Barry Warsaw ba...@python.org wrote:
Anyway, aside from that decision, I haven't come up with an
elegant way to allow /output/ in both bytes and strings (input is I
think theoretically easier by sniffing the arguments).
Probably a good thing.
On Apr 9, 2009, at 10:52 PM, Aahz wrote:
On Thu, Apr 09, 2009, Barry Warsaw wrote:
So, what I'm really asking is this. Let's say you agree that there are
use cases for accessing a header value as either the raw encoded bytes or
the decoded unicode. What should this return:
Barry Warsaw wrote:
I don't know whether the parameter thing will work or not, but you're
probably right that we need to get the bytes-everywhere API first.
Given that json is a wire protocol, that sounds like the right approach
for json as well. Once bytes-everywhere works, then a text API can
On Apr 9, 2009, at 2:25 PM, Martin v. Löwis wrote:
This is an interesting question, and something I'm struggling with for
the email package for 3.x. It turns out to be pretty convenient to have
both a bytes and a string API, both for input and output, but I think
email really wants to be
On Thu, Apr 09, 2009, Barry Warsaw wrote:
So, what I'm really asking is this. Let's say you agree that there are
use cases for accessing a header value as either the raw encoded bytes or
the decoded unicode. What should this return:
message['Subject']
The raw bytes or the decoded
On Apr 9, 2009, at 11:55 AM, Daniel Stutzbach wrote:
On Thu, Apr 9, 2009 at 6:01 AM, Barry Warsaw ba...@python.org wrote:
Anyway, aside from that decision, I haven't come up with an elegant
way to allow /output/ in both bytes and strings (input is I think
theoretically easier by sniffing
On Apr 9, 2009, at 11:21 PM, Nick Coghlan wrote:
Barry Warsaw wrote:
I don't know whether the parameter thing will work or not, but you're
probably right that we need to get the bytes-everywhere API first.
Given that json is a wire protocol, that sounds like the right approach
for json as
On 02:38 am, ba...@python.org wrote:
So, what I'm really asking is this. Let's say you agree that there
are use cases for accessing a header value as either the raw encoded
bytes or the decoded unicode. What should this return:
message['Subject']
The raw bytes or the decoded unicode?
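To make the two candidate answers concrete, here is a small sketch using the email package as it exists in current Python 3 releases, which postdates this discussion; it is an illustration of the "encoded on the wire" versus "decoded unicode" views, not the design proposed in the thread. The sample message is made up for the example.

```python
import email
from email import policy
from email.header import decode_header

# A made-up message whose Subject is the RFC 2047 encoded word for "Café".
raw = b"Subject: =?utf-8?b?Q2Fmw6k=?=\r\n\r\nbody\r\n"

# With the default compat32 policy, the header comes back as it appears
# on the wire, still in its encoded-word form.
legacy = email.message_from_bytes(raw)
encoded_subject = legacy["Subject"]

# decode_header exposes the underlying bytes and the declared charset.
chunks = decode_header(encoded_subject)

# With policy.default, encoded words are decoded into text.
modern = email.message_from_bytes(raw, policy=policy.default)
decoded_subject = str(modern["Subject"])

print(encoded_subject)
print(chunks)
print(decoded_subject)
```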
On 03:21 am, ncogh...@gmail.com wrote:
Barry Warsaw wrote:
I don't know whether the parameter thing will work or not, but you're
probably right that we need to get the bytes-everywhere API first.
Given that json is a wire protocol, that sounds like the right approach
for json as well.
Hello,
We're in the process of forward-porting the recent (massive) json updates to
3.1, and we are also thinking of dropping remnants of support of the bytes type
in the json library (in 3.1, again). This bytes support almost didn't work at
all, but there was a lot of C and Python code for it
Martin v. Löwis martin at v.loewis.de writes:
What does Bob Ippolito think about this change? IIUC, he considers
simplejson's speed one of its primary advantages, and also attributes it
to the fact that he can parse directly out of byte strings, and marshal
into them (which is important, as
On Wed, Apr 8, 2009 at 4:10 AM, Antoine Pitrou solip...@pitrou.net wrote:
We're in the process of forward-porting the recent (massive) json updates to
3.1, and we are also thinking of dropping remnants of support of the bytes type
in the json library (in 3.1, again). This bytes support almost
Guido van Rossum guido at python.org writes:
I'm kind of surprised that a serialization protocol like JSON wouldn't
support reading/writing bytes (as the serialized format -- I don't
care about having bytes as values, since JavaScript doesn't have
something equivalent AFAIK, and hence JSON
Besides, Bob doesn't really seem to care about porting to py3k (he hasn't
said anything about it until now, other than that he didn't feel competent
to do it).
That is quite unfortunate, and suggests that perhaps the module
shouldn't have been added to Python in the first place.
I can
74 matches