[Antoine Pitrou]
Besides, Bob doesn't really seem to care about
porting to py3k (he hasn't said anything about it until now, other than that he
didn't feel competent to do it).
His actual words were: I will need some help with 3.0 since I am not well versed in the changes to the C API or
On Thu, Apr 9, 2009 at 07:15, Antoine Pitrou solip...@pitrou.net wrote:
The RFC also specifies a discrimination algorithm for non-supersets of ASCII
(“Since the first two characters of a JSON text will always be ASCII
characters [RFC0020], it is possible to determine whether an octet
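The discrimination rule the RFC describes (RFC 4627, section 3) looks at which of the first four octets are zero. A minimal sketch in Python, assuming input without a byte order mark; `sniff_json_encoding` is a hypothetical helper name, not something the `json` module provided in this thread:

```python
def sniff_json_encoding(data: bytes) -> str:
    """Guess the Unicode encoding of a JSON text from its first four octets.

    Hypothetical helper. It relies on the first two characters being ASCII,
    so the position of the zero octets reveals the encoding. Byte order
    marks are not handled.
    """
    if len(data) < 4:
        # Assumption: anything too short to sniff is treated as UTF-8.
        return "utf-8"
    nulls = tuple(octet == 0 for octet in data[:4])
    patterns = {
        (True, True, True, False): "utf-32-be",   # 00 00 00 xx
        (True, False, True, False): "utf-16-be",  # 00 xx 00 xx
        (False, True, True, True): "utf-32-le",   # xx 00 00 00
        (False, True, False, True): "utf-16-le",  # xx 00 xx 00
    }
    # Any other pattern (no leading nulls) is taken to be UTF-8.
    return patterns.get(nulls, "utf-8")
```

A caller could then decode with `data.decode(sniff_json_encoding(data))` before handing the text to `json.loads()`. Note that Python's `'utf16'` codec prepends a BOM, which this sketch deliberately does not cover.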
Eric Smith wrote:
And as a reminder, the py3k-short-float-repr changes are on Rietveld at
http://codereview.appspot.com/33084/show. So far, no comments.
I skipped over the actual number crunching parts (the test suite will do
a better job than I will of telling you whether or not you have those
On Apr 9, 2009, at 1:15 AM, Antoine Pitrou wrote:
Guido van Rossum guido at python.org writes:
I'm kind of surprised that a serialization protocol like JSON
wouldn't
support reading/writing bytes (as the serialized format -- I don't
care about
Dirkjan Ochtman dirkjan at ochtman.nl writes:
The RFC states
that JSON-text = object / array, meaning loads for 'hi' isn't
strictly valid.
Sure, but then:
>>> json.loads('[]')
[]
>>> json.loads(u'[]'.encode('utf16'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File
Martin v. Löwis wrote:
Such a policy would then translate to a dead end for Python 2.x
based applications.
2.x based applications *are* in a dead end, with the only exit
being portage to 3.x.
The actual end of the dead end just happens to be in 2013 or so :)
Cheers,
Nick.
--
Nick Coghlan
On Thu, Apr 9, 2009 at 13:10, Antoine Pitrou solip...@pitrou.net wrote:
Sure, but then:
>>> json.loads('[]')
[]
>>> json.loads(u'[]'.encode('utf16'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/antoine/cpython/__svn__/Lib/json/__init__.py", line 310, in loads
Barry Warsaw wrote:
On Apr 9, 2009, at 1:15 AM, Antoine Pitrou wrote:
Guido van Rossum guido at python.org writes:
I'm kind of surprised that a serialization protocol like JSON wouldn't
support reading/writing bytes (as the serialized format -- I don't
care about having bytes as values,
Michele Simionato wrote:
On Wed, Apr 8, 2009 at 7:51 PM, Guido van Rossum gu...@python.org wrote:
There was a remark (though perhaps meant humorously) in Michele's page
about decorators that worried me too: For instance, typical
implementations of decorators involve nested functions, and we
Martin v. Löwis wrote:
Nick Coghlan wrote:
Dirkjan Ochtman wrote:
I have a stab at an author map at http://dirkjan.ochtman.nl/author-map.
Could use some review, but it seems like a good start.
Martin may be able to provide a better list of names based on the
checkin name-SSH public key
On Thu, Apr 9, 2009 at 2:11 PM, Nick Coghlan ncogh...@gmail.com wrote:
One of my hopes for PEP 362 was that I would be able to just add
__signature__ to the list of copied attributes, but that PEP is
currently short a champion to work through the process of resolving the
open issues and
Nick Coghlan wrote:
Eric Smith wrote:
And as a reminder, the py3k-short-float-repr changes are on Rietveld at
http://codereview.appspot.com/33084/show. So far, no comments.
Looks like you were able to delete some fairly respectable chunks of
redundant code!
Wait until you see how much
On Thu, Apr 09, 2009, Nick Coghlan wrote:
Martin v. Löwis wrote:
Such a policy would then translate to a dead end for Python 2.x
based applications.
2.x based applications *are* in a dead end, with the only exit
being portage to 3.x.
The actual end of the dead end just happens to be in
Michele Simionato wrote:
On Thu, Apr 9, 2009 at 2:11 PM, Nick Coghlan ncogh...@gmail.com wrote:
One of my hopes for PEP 362 was that I would be able to just add
__signature__ to the list of copied attributes, but that PEP is
currently short a champion to work through the process of resolving
Aahz wrote:
On Thu, Apr 09, 2009, Nick Coghlan wrote:
Martin v. Löwis wrote:
Such a policy would then translate to a dead end for Python 2.x
based applications.
2.x based applications *are* in a dead end, with the only exit
being portage to 3.x.
The actual end of the dead end just happens
Just to make sure I am not doing something silly, with a configure line as
such: ./configure --prefix=/home/asmodai/local --with-wide-unicode
--with-pymalloc --with-threads --with-computed-gotos, would there be any
reason why I am getting the following error with both BSD make and gmake:
make:
2009/4/9 Jeroen Ruigrok van der Werven asmo...@in-nomine.org:
Just to make sure I am not doing something silly, with a configure line as
such: ./configure --prefix=/home/asmodai/local --with-wide-unicode
--with-pymalloc --with-threads --with-computed-gotos, would there be any
reason why I am
-On [20090409 15:41], Benjamin Peterson (benja...@python.org) wrote:
It seems your Makefile is outdated. We moved the _fileio.c module
around a few days ago, so maybe you just need a make distclean.
Yes, that was the cause. Thanks Benjamin.
--
Jeroen Ruigrok van der Werven asmodai
Barry Warsaw ba...@python.org wrote:
Anyway, aside from that decision, I haven't come up with an
elegant way to allow /output/ in both bytes and strings (input is I
think theoretically easier by sniffing the arguments).
Probably a good thing. It just promotes more confusion to do things
On Thu, Apr 09, 2009, John Arbash Meinel wrote:
PS I'm not yet subscribed to python-dev, so if you could make sure to
CC me in replies, I would appreciate it.
Please do subscribe to python-dev ASAP; I also suggest that you subscribe
to python-ideas, because I suspect that this is sufficiently
On Thu, Apr 9, 2009 at 17:31, Aahz a...@pythoncraft.com wrote:
Please do subscribe to python-dev ASAP; I also suggest that you subscribe
to python-ideas, because I suspect that this is sufficiently blue-sky to
start there.
It might also be interesting to the unladen-swallow guys.
Cheers,
On Thu, Apr 9, 2009 at 6:01 AM, Barry Warsaw ba...@python.org wrote:
Anyway, aside from that decision, I haven't come up with an elegant way to
allow /output/ in both bytes and strings (input is I think theoretically
easier by sniffing the arguments).
Won't this work? (assuming dumps()
(email-sig added)
At 08:07 -0400 04/09/2009, Steve Holden wrote:
Barry Warsaw wrote:
...
This is an interesting question, and something I'm struggling with for
the email package for 3.x. It turns out to be pretty convenient to have
both a bytes and a string API, both for input and output,
Tony Nelson wrote:
(email-sig added)
At 08:07 -0400 04/09/2009, Steve Holden wrote:
Barry Warsaw wrote:
...
This is an interesting question, and something I'm struggling with for
the email package for 3.x. It turns out to be pretty convenient to have
both a bytes and a string API, both
Hi John,
On Thu, Apr 9, 2009 at 8:02 AM, John Arbash Meinel
j...@arbash-meinel.com wrote:
I've been doing some memory profiling of my application, and I've found
some interesting results with how intern() works. I was pretty surprised
to see
...
Anyway, I think the internals of intern() could be done a bit better. Here are
some concrete things:
[snip]
Memory usage is definitely something we're interested in improving.
Since you've already looked at this in some detail, could you try
implementing one or two of your ideas and
On Thu, Apr 9, 2009 at 9:34 AM, John Arbash Meinel
john.arbash.mei...@gmail.com wrote:
...
Anyway, I think the internals of intern() could be done a bit better. Here are
some concrete things:
[snip]
Memory usage is definitely something we're interested in improving.
Since you've already looked
John Arbash Meinel wrote:
When I looked at the actual references from interned, I saw mostly
variable names. Considering that every variable goes through the python
intern dict. And when you look at the intern function, it doesn't use
setdefault logic, it actually does a get() followed by a
(email-sig dropped, as I didn't see Steve Holden's message there)
At 12:20 -0400 04/09/2009, Steve Holden wrote:
Tony Nelson wrote:
...
If you need the data from the message, by all means extract it and store it
in whatever form is useful to the purpose of the database. If you need the
On Thu, Apr 09, 2009 at 01:14:21PM -0400, Tony Nelson wrote:
I use MySQL, but sort of intend to learn PostgreSQL. I didn't know that
PostgreSQL has no real support for BLOBs.
I think it has - BYTEA data type.
Oleg.
--
Oleg Broytmann http://phd.pp.ru/
Christian Heimes wrote:
John Arbash Meinel wrote:
When I looked at the actual references from interned, I saw mostly
variable names. Considering that every variable goes through the python
intern dict. And when you look at the intern function, it doesn't use
setdefault logic, it actually does
Oleg Broytmann wrote:
On Thu, Apr 09, 2009 at 01:14:21PM -0400, Tony Nelson wrote:
I use MySQL, but sort of intend to learn PostgreSQL. I didn't know that
PostgreSQL has no real support for BLOBs.
I think it has - BYTEA data type.
But the Python DB adapters appear to require some
Alexander Belopolsky wrote:
On Thu, Apr 9, 2009 at 11:02 AM, John Arbash Meinel
j...@arbash-meinel.com wrote:
...
a) Don't keep a double reference to both key and value to the same
object (1 pointer per entry), this could be as simple as using a
Set() instead of a dict()
There
This is an interesting question, and something I'm struggling with for
the email package for 3.x. It turns out to be pretty convenient to have
both a bytes and a string API, both for input and output, but I think
email really wants to be represented internally as bytes. Maybe. Or
maybe
At 21:24 +0400 04/09/2009, Oleg Broytmann wrote:
On Thu, Apr 09, 2009 at 01:14:21PM -0400, Tony Nelson wrote:
I use MySQL, but sort of intend to learn PostgreSQL. I didn't know that
PostgreSQL has no real support for BLOBs.
I think it has - BYTEA data type.
So it does; I see that now that
So I guess some of it comes down to whether loweis would also reject
this change on the basis that mathematically a set is not a dict.
I'd like to point out that this was not the reason to reject it.
Instead, this (or, the opposite of it) was given as a reason why this
patch should be accepted
Hi Dan,
Thanks for your interest.
2009/4/6 Dan Schult dsch...@colgate.edu:
Hi,
I'm trying to write a C extension which is a subclass of dict.
I want to do something like a setdefault() but with a single lookup.
Looking through the dictobject code, the three workhorse
routines lookdict,
On Thu, Apr 9, 2009 at 1:15 AM, Antoine Pitrou solip...@pitrou.net wrote:
As for reading/writing bytes over the wire, JSON is often used in the same
context as HTML: you are supposed to know the charset and decode/encode the
payload using that charset. However, the RFC specifies a default
...
I like your rationale (save memory) much more, and was asking in the
tracker for specific numbers, which weren't forthcoming.
...
Now that you brought up specific numbers, I tried to verify them,
and found them correct (although a bit unfortunate), please see my
test script below.
I can understand that you don't want to spend much time on it. How
about removing it from 3.1? We could re-add it when long-term support
becomes more likely.
I'm speechless.
It seems that my statement has surprised you, so let me explain:
I think we should refrain from making design
I don't have numbers on how much that would improve CPU times, I would
imagine improving 'intern()' would impact import times more than run
times, simply because import time is interning a *lot* of strings.
Though honestly, Bazaar would really like this, because startup overhead
for us is
Alexandre Vassalotti wrote:
On Thu, Apr 9, 2009 at 1:15 AM, Antoine Pitrou solip...@pitrou.net wrote:
As for reading/writing bytes over the wire, JSON is often used in the same
context as HTML: you are supposed to know the charset and decode/encode the
payload using that charset. However, the
Martin v. Löwis wrote:
I don't have numbers on how much that would improve CPU times, I would
imagine improving 'intern()' would impact import times more than run
times, simply because import time is interning a *lot* of strings.
Though honestly, Bazaar would really like this, because startup
Tony Nelson wrote:
At 21:24 +0400 04/09/2009, Oleg Broytmann wrote:
On Thu, Apr 09, 2009 at 01:14:21PM -0400, Tony Nelson wrote:
I use MySQL, but sort of intend to learn PostgreSQL. I didn't know that
PostgreSQL has no real support for BLOBs.
I think it has - BYTEA data type.
So it
On Thu, Apr 09, 2009, Steve Holden wrote:
import psycopg2 as db
conn = db.connect(database="maildb", user="@@@", password="@@@",
                  host="localhost", port=5432)
curs = conn.cursor()
curs.execute("DELETE FROM tst")
curs.execute("INSERT INTO tst (byt) VALUES (%s)",
             ("".join(chr(i) for i in
On Thu, Apr 09, 2009 at 04:42:21PM -0400, Steve Holden wrote:
If I can't pass a 256-byte string into a BLOB and get it back without
anything like this happening then there's *something* in the chain that
makes the database useless.
import psycopg2
con = psycopg2.connect(database="test")
cur =
On Thu, Apr 9, 2009 at 1:05 PM, Martin v. Löwis mar...@v.loewis.de wrote:
I can understand that you don't want to spend much time on it. How
about removing it from 3.1? We could re-add it when long-term support
becomes more likely.
I'm speechless.
It seems that my statement has surprised
As far as Python 3 goes, I honestly have not yet familiarized myself
with the changes to the IO infrastructure and what the new idioms are.
At this time, I can't make any educated decisions with regard to how
it should be done because I don't know exactly how bytes are supposed
to work and
Also, consider that resizing has to evaluate every object, thus paging
in all X bytes, and assigning to another 2X bytes. Cutting X by
(potentially 3), would probably have a small but measurable effect.
I'm *very* skeptical about claims on performance in the absence of
actual measurements. Too
Oleg Broytmann wrote:
On Thu, Apr 09, 2009 at 04:42:21PM -0400, Steve Holden wrote:
If I can't pass a 256-byte string into a BLOB and get it back without
anything like this happening then there's *something* in the chain that
makes the database useless.
import psycopg2
con =
On Apr 9, 2009, at 12:06 PM, Martin v. Löwis wrote:
Now that you brought up specific numbers, I tried to verify them,
and found them correct (although a bit unfortunate), please see my
test script below. Up to 21800 interned strings, the dict takes (only)
384kiB. It then grows, requiring
John Arbash Meinel wrote:
And when you look at the intern function, it doesn't use
setdefault logic, it actually does a get() followed by a set(), which
means the cost of interning is 1-2 lookups depending on likelihood, etc.
Keep in mind that intern() is called fairly rarely, mostly
only at
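The "1-2 lookups" point can be illustrated with a pure-Python model of an intern table. This is only a sketch under stated assumptions: it is not the C implementation of intern(), the `CountingKey` class and the helper names are hypothetical, and the number of `__hash__` calls is used as a rough proxy for dictionary lookups.

```python
class CountingKey:
    """String-like key that counts how often it is hashed."""
    hash_calls = 0

    def __init__(self, text):
        self.text = text

    def __hash__(self):
        CountingKey.hash_calls += 1
        return hash(self.text)

    def __eq__(self, other):
        return isinstance(other, CountingKey) and self.text == other.text


def intern_get_then_set(table, key):
    """Two-step interning: a get(), then a store when the key is missing."""
    found = table.get(key)
    if found is None:
        table[key] = key
        found = key
    return found


def intern_setdefault(table, key):
    """Single-call interning via setdefault()."""
    return table.setdefault(key, key)


def count_hashes(intern, table, text):
    """Return how many times a fresh key was hashed while interning it."""
    CountingKey.hash_calls = 0
    intern(table, CountingKey(text))
    return CountingKey.hash_calls
```

In this model the get-then-set variant hashes the key twice on a miss and once on a hit, while the setdefault variant hashes it once either way. Whether that difference matters in practice is exactly what the thread asks to be measured.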
2009/4/9 Greg Ewing greg.ew...@canterbury.ac.nz:
John Arbash Meinel wrote:
And when you look at the intern function, it doesn't use
setdefault logic, it actually does a get() followed by a set(), which
means the cost of interning is 1-2 lookups depending on likelihood, etc.
Keep in mind
Greg Ewing wrote:
John Arbash Meinel wrote:
And the way intern is currently
written, there is a third cost when the item doesn't exist yet, which is
another lookup to insert the object.
That's even rarer still, since it only happens the first
time you load a piece of code that uses a given
cmake does not produce relative paths in its generated make and
project files. There is an option CMAKE_USE_RELATIVE_PATHS which
appears to do this but the documentation says:
This option does not work for more complicated projects, and
relative paths are used when possible. In general, it is
On Apr 9, 2009, at 8:07 AM, Steve Holden wrote:
The real problem I came across in storing email in a relational
database
was the inability to store messages as Unicode. Some messages have a
body in one encoding and an attachment in another, so the only ways to
store the messages are either as
On Apr 9, 2009, at 11:08 AM, Bill Janssen wrote:
Barry Warsaw ba...@python.org wrote:
Anyway, aside from that decision, I haven't come up with an
elegant way to allow /output/ in both bytes and strings (input is I
think theoretically easier by sniffing the arguments).
Probably a good thing.
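For illustration, "sniffing the arguments" on input, with a separate bytes-returning function on output, might look like the sketch below. The names `loads_any` and `dumps_bytes` are hypothetical and the UTF-8 default is an assumption; nothing here is the API that was being proposed in the thread.

```python
import json


def loads_any(data, encoding="utf-8"):
    """Hypothetical wrapper: accept either bytes or str as JSON input."""
    if isinstance(data, (bytes, bytearray)):
        # Input sniffing: decode the serialized form before parsing.
        data = data.decode(encoding)
    return json.loads(data)


def dumps_bytes(obj, encoding="utf-8"):
    """Hypothetical bytes-returning counterpart of json.dumps()."""
    return json.dumps(obj).encode(encoding)
```

Keeping output as two explicitly named functions avoids a return type that depends on a flag, which is one way to sidestep the confusion raised above.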
On Apr 9, 2009, at 11:11 PM, gl...@divmod.com wrote:
I think this is a problematic way to model bytes vs. text; it gives
text a special relationship to bytes which should be avoided.
IMHO the right way to think about domains like this is a multi-level
representation. The low level
On Apr 9, 2009, at 10:52 PM, Aahz wrote:
On Thu, Apr 09, 2009, Barry Warsaw wrote:
So, what I'm really asking is this. Let's say you agree that there
are
use cases for accessing a header value as either the raw encoded
bytes or
the decoded unicode. What should this return:
Barry Warsaw wrote:
I don't know whether the parameter thing will work or not, but you're
probably right that we need to get the bytes-everywhere API first.
Given that json is a wire protocol, that sounds like the right approach
for json as well. Once bytes-everywhere works, then a text API can
On Apr 9, 2009, at 2:25 PM, Martin v. Löwis wrote:
This is an interesting question, and something I'm struggling with
for
the email package for 3.x. It turns out to be pretty convenient to
have
both a bytes and a string API, both for input and output, but I think
email really wants to be
On 02:26 am, ba...@python.org wrote:
There are really two ways to look at an email message. It's either an
unstructured blob of bytes, or it's a structured tree of objects.
Those objects have headers and payload. The payload can be of any
type, though I think it generally breaks down into
On Thu, Apr 09, 2009, Barry Warsaw wrote:
So, what I'm really asking is this. Let's say you agree that there are
use cases for accessing a header value as either the raw encoded bytes or
the decoded unicode. What should this return:
message['Subject']
The raw bytes or the decoded
On Apr 9, 2009, at 11:55 AM, Daniel Stutzbach wrote:
On Thu, Apr 9, 2009 at 6:01 AM, Barry Warsaw ba...@python.org wrote:
Anyway, aside from that decision, I haven't come up with an elegant
way to allow /output/ in both bytes and strings (input is I think
theoretically easier by sniffing
On Apr 9, 2009, at 12:20 PM, Steve Holden wrote:
PostgreSQL strongly encourages you to store text as encoded columns.
Because emails lack an encoding it turns out this is a most
inconvenient
storage type for it. Sadly BLOBs are such a pain in PostgreSQL that
it's
easier to store the
On Apr 9, 2009, at 11:21 PM, Nick Coghlan wrote:
Barry Warsaw wrote:
I don't know whether the parameter thing will work or not, but you're
probably right that we need to get the bytes-everywhere API first.
Given that json is a wire protocol, that sounds like the right
approach
for json as
At 22:38 -0400 04/09/2009, Barry Warsaw wrote:
...
So, what I'm really asking is this. Let's say you agree that there
are use cases for accessing a header value as either the raw encoded
bytes or the decoded unicode. What should this return:
message['Subject']
The raw bytes or the decoded
On 9-Apr-09, at 6:24 PM, John Arbash Meinel wrote:
Greg Ewing wrote:
John Arbash Meinel wrote:
And the way intern is currently
written, there is a third cost when the item doesn't exist yet,
which is
another lookup to insert the object.
That's even rarer still, since it only happens the
On Wed, Apr 8, 2009 at 9:31 PM, Michele Simionato
michele.simion...@gmail.com wrote:
Then perhaps you misunderstand the goal of the decorator module.
The raison d'etre of the module is to PRESERVE the signature:
update_wrapper unfortunately *changes* it.
When confronted with a library which I
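The signature problem can be reproduced with a small sketch. It assumes a modern Python 3 (3.5 or later), which postdates this thread: there, `functools.wraps` records the original function in `__wrapped__`, and `inspect.signature` follows it unless told otherwise. The `traced` decorator and `add` function are made-up examples.

```python
import functools
import inspect


def traced(func):
    """Plain wrapper-style decorator built on functools.wraps."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@traced
def add(x, y=1):
    return x + y


# The wrapper itself accepts anything; only metadata was copied over.
own_signature = inspect.signature(add, follow_wrapped=False)

# Following __wrapped__ recovers the signature of the original function.
original_signature = inspect.signature(add)
```

Tools that inspect only the wrapper see the generic `(*args, **kwargs)` form, which is the loss of information being objected to here.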
At 22:26 -0400 04/09/2009, Barry Warsaw wrote:
There are really two ways to look at an email message. It's either an
unstructured blob of bytes, or it's a structured tree of objects.
Those objects have headers and payload. The payload can be of any
type, though I think it generally breaks down
On Thu, Apr 9, 2009 at 6:24 PM, John Arbash Meinel
john.arbash.mei...@gmail.com wrote:
Greg Ewing wrote:
John Arbash Meinel wrote:
And the way intern is currently
written, there is a third cost when the item doesn't exist yet, which is
another lookup to insert the object.
That's even rarer
On Thu, Apr 9, 2009 at 6:24 PM, John Arbash Meinel
john.arbash.mei...@gmail.com wrote:
Greg Ewing wrote:
John Arbash Meinel wrote:
And the way intern is currently
written, there is a third cost when the item doesn't exist yet, which is
another lookup to insert the object.
That's even rarer
On Thu, Apr 9, 2009 at 9:07 PM, Collin Winter coll...@gmail.com wrote:
On Thu, Apr 9, 2009 at 6:24 PM, John Arbash Meinel
john.arbash.mei...@gmail.com wrote:
And I would be a *lot* happier if startup time was 100ms instead
of 400ms.
Quite so. We have a number of internal tools, and they
On Thu, Apr 9, 2009 at 5:53 AM, Aahz a...@pythoncraft.com wrote:
On Thu, Apr 09, 2009, Nick Coghlan wrote:
Martin v. Löwis wrote:
Such a policy would then translate to a dead end for Python 2.x
based applications.
2.x based applications *are* in a dead end, with the only exit
being portage
...
Somewhat true, though I know it happens 25k times during startup of
bzr... And I would be a *lot* happier if startup time was 100ms instead
of 400ms.
I don't want to quash your idealism too severely, but it is extremely
unlikely that you are going to get anywhere near that kind of
On 02:38 am, ba...@python.org wrote:
So, what I'm really asking is this. Let's say you agree that there
are use cases for accessing a header value as either the raw encoded
bytes or the decoded unicode. What should this return:
message['Subject']
The raw bytes or the decoded unicode?
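To make the two candidate answers concrete, here is a sketch using today's standard `email` package with its default policy, not the API that was being designed in this thread. The sample message is invented for illustration.

```python
from email import message_from_bytes
from email.header import decode_header, make_header

# Invented sample: a Subject header carrying an RFC 2047 encoded word.
RAW_MESSAGE = b"Subject: =?utf-8?q?Caf=C3=A9?=\r\n\r\nbody\r\n"

msg = message_from_bytes(RAW_MESSAGE)

# One possible answer: the header value exactly as it appeared on the wire.
raw_subject = msg["Subject"]

# The other: the human-readable text after decoding the encoded word.
decoded_subject = str(make_header(decode_header(raw_subject)))
```

Both views come from the same header, so the question is which one the plain subscript should return by default.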
On 03:21 am, ncogh...@gmail.com wrote:
Barry Warsaw wrote:
I don't know whether the parameter thing will work or not, but you're
probably right that we need to get the bytes-everywhere API first.
Given that json is a wire protocol, that sounds like the right approach
for json as well.
Barry Warsaw writes:
There are really two ways to look at an email message. It's either an
unstructured blob of bytes, or it's a structured tree of objects.
Indeed!
Those objects have headers and payload. The payload can be of any
type, though I think it generally breaks down into
Then perhaps you misunderstand the goal of the decorator module.
The raison d'etre of the module is to PRESERVE the signature:
update_wrapper unfortunately *changes* it.
When confronted with a library which I do not know, I often run
pydoc over it, or sphinx, or a custom made