date:20111001

Re: [Python-Dev] Inconsistent script/console behaviour

2011-10-01 Thread Chris Withers


On 24/09/2011 00:32, Guido van Rossum wrote:

The interactive console is optimized for people entering code by
typing, not by copying and pasting large gobs of text.

If you think you can have it both, show us the code.


Anatoly wants ipython's new qtconsole.

This "does the right thing" because it's a GUI app and so can manipulate 
the content on paste...


Not sure if you can do that in a console app...

cheers,

Chris

--
Simplistix - Content Management, Batch Processing & Python Consulting
- http://www.simplistix.co.uk
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] [Python-checkins] cpython: Enhance Py_ARRAY_LENGTH(): fail at build time if the argument is not an array

2011-10-01 Thread Martin v. Löwis

>> Do we really need a new file? Why not pyport.h where other compiler stuff
>> goes?
> 
> I'm not sure that pyport.h is the right place to add Py_MIN, Py_MAX, 
> Py_ARRAY_LENGTH. pyport.h looks to be related to all things specific to the 
> platform like INT_MAX, Py_VA_COPY, ... pymacro.h contains
> platform-independent macros.

I'm -1 on additional header files as well. If no other reasonable place
is found, Python.h is still available.

Regards,
Martin

[Python-Dev] PEP-393: request for keeping PyUnicode_EncodeDecimal()

2011-10-01 Thread Stefan Krah

Hello,

the subject says it all. PyUnicode_EncodeDecimal() is listed among
the deprecated functions. In cdecimal, I'm relying on this function
for a number of reasons:

  * It is not trivial to implement.

  * With the Unicode implementation constantly changing, it is nearly
impossible to know what input is currently regarded as a decimal
digit. See also:

   http://bugs.python.org/issue10557
   http://bugs.python.org/issue10557#msg123123

 "The API won't go away (it does have its use and is being
  used in 3rd party extensions) [...]"


Stefan Krah



Re: [Python-Dev] [Python-checkins] cpython: Implement PEP 393.

2011-10-01 Thread Martin v. Löwis

On 29.09.2011 01:21, Eric V. Smith wrote:
> Is there some reason str.format had such major surgery done to it?

Yes: I couldn't figure out how to do it any other way. The formatting
code had a few basic assumptions which now break (unless you keep using
the legacy API). Primarily, the assumption is that there is a notion of
a "STRINGLIB_CHAR" which is the element of a string representation. With
PEP 393, no such type exists anymore - it depends on the individual
object what the element type for the representation is.

In other cases, I worked around that by compiling the stringlib three
times, for Py_UCS1, Py_UCS2, and Py_UCS4. For one, this gives
considerable code bloat, which I didn't like for the formatting code
(as that is already a considerable amount of code). More importantly,
this approach wouldn't have worked well, anyway, since the formatting
combines multiple Unicode objects (especially with the OutputString
buffer), and different inputs may have different representations. On
top of that, OutputString needs widening support, starting out with
a narrow string, and widening step-by-step as input strings are more
wide than the current output (or not, if the input strings are all
ASCII).
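
The widening behaviour described above can be sketched in Python. This is only an illustration of the idea, not the C code in CPython: the names kind_for and OutputBuffer are made up for this example, and the thresholds are the 1-, 2- and 4-byte representations that PEP 393 defines.

```python
def kind_for(text):
    """Bytes per character that a PEP 393 string would need for `text`."""
    top = max(map(ord, text), default=0)
    if top < 0x100:
        return 1      # fits in Py_UCS1
    if top < 0x10000:
        return 2      # fits in Py_UCS2
    return 4          # needs Py_UCS4


class OutputBuffer:
    """Hypothetical stand-in for an output buffer that widens on demand."""

    def __init__(self):
        self.kind = 1         # start out narrow
        self.parts = []
        self.widenings = 0    # how often the buffer had to be widened

    def append(self, text):
        needed = kind_for(text)
        if needed > self.kind:
            # A real implementation would reallocate and copy here.
            self.kind = needed
            self.widenings += 1
        self.parts.append(text)

    def getvalue(self):
        return "".join(self.parts)


buf = OutputBuffer()
for piece in ("abc", "\u20ac", "\U0001F600", "xyz"):
    buf.append(piece)
```

Each input may have a different representation, which is why a single compile-time element type no longer fits: the buffer above ends up 4 bytes wide even though most of its inputs were ASCII.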

It would have been possible to keep the basic structure by doing
all formatting in Py_UCS4. This would cost a significant memory and
runtime overhead.

> In addition, there are outstanding patches that are now broken.

I'm sorry about that. Try applying them to the new files, though - patch
may still be able to figure out how to integrate them, as the
algorithms and function structure haven't changed.

> I'd prefer it return to how it used to be, and just the minimum changes
> required for PEP 393 be made to it.

Please try for yourself. On string_format.h, I think there is zero
chance, unless you want to compromise on efficiency (in addition to
the already-present compromise on code cleanliness, due to the fact
that the code is more general than it needs to be).

On formatter.h, it may actually be possible to restore what it was - in
particular if you can make a guarantee that all number formatting always
outputs ASCII-strings only (which I'm not so sure about, as the
thousands separator could be any character, in principle). Without that
guarantee, it may indeed be reasonable to compile formatter.h in
Py_UCS4, since the resulting strings will be small, so the overhead is
probably negligible.

Regards,
Martin

Re: [Python-Dev] PEP-393: request for keeping PyUnicode_EncodeDecimal()

2011-10-01 Thread Martin v. Löwis

> the subject says it all. PyUnicode_EncodeDecimal() is listed among
> the deprecated functions.

Please see the section on deprecation. None of the deprecated functions
will be removed for a period of five years, and afterwards, they will
be kept until usage outside of the core is low. Most likely, this means
they will be kept until Python 4.

>   * It is not trivial to implement.
> 
>   * With the Unicode implementation constantly changing, it is nearly
> impossible to know what input is currently regarded as a decimal
> digit. See also:

I still recommend that you come up with your own implementation of that
algorithm. You probably don't need any of the error handler support,
which makes the largest portion of the code. Then, use
Py_UNICODE_TODECIMAL to process individual characters. It's a simple
loop over every character.

In addition, you could also take the same approach as decimal.py,
i.e. do

   self._int = str(int(intpart+fracpart))

This would improve compatibility with the decimal.py implementation,
which doesn't use PyUnicode_EncodeDecimal either (but instead goes
through _PyUnicode_TransformDecimalAndSpaceToASCII).
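
For experimenting with the "simple loop over every character" idea without writing C, a rough Python counterpart might look like the following. It is a sketch under assumptions: decimal_to_ascii is a made-up name, unicodedata.decimal() stands in for Py_UNICODE_TODECIMAL, and the whitespace and error-handler behaviour of PyUnicode_EncodeDecimal is deliberately left out.

```python
import unicodedata


def decimal_to_ascii(text):
    """Replace every Unicode decimal digit in `text` by its ASCII digit."""
    out = []
    for ch in text:
        digit = unicodedata.decimal(ch, None)
        if digit is not None:
            out.append(chr(ord("0") + digit))   # decimal digit of any script
        elif ord(ch) < 128:
            out.append(ch)                      # plain ASCII passes through
        else:
            raise ValueError("cannot encode character %r" % ch)
    return "".join(out)


# ARABIC-INDIC DIGIT ONE, TWO, THREE, then an ASCII '.' and DIGIT FOUR
converted = decimal_to_ascii("\u0661\u0662\u0663.\u0664")

# The decimal.py approach relies on int() accepting such digits directly.
via_int = str(int("\u0661\u0662\u0663"))
```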

Regards,
Martin

Re: [Python-Dev] PEP-393: request for keeping PyUnicode_EncodeDecimal()

2011-10-01 Thread Stefan Krah

"Martin v. Löwis"  wrote:
> > the subject says it all. PyUnicode_EncodeDecimal() is listed among
> > the deprecated functions.
> 
> Please see the section on deprecation. None of the deprecated functions
> will be removed for a period of five years, and afterwards, they will
> be kept until usage outside of the core is low. Most likely, this means
> they will be kept until Python 4.

I've to confess that I missed that; sounds good.


> In addition, you could also take the same approach as decimal.py,
> i.e. do
> 
>    self._int = str(int(intpart+fracpart))
> 
> This would improve compatibility with the decimal.py implementation,
> which doesn't use PyUnicode_EncodeDecimal either (but instead goes
> through _PyUnicode_TransformDecimalAndSpaceToASCII).

longobject.c still used PyUnicode_EncodeDecimal() until 10 months
ago (8304bd765bcf). I missed the PyUnicode_TransformDecimalToASCII()
commit, probably because #10557 is still open.

That's why I wouldn't like to implement the function myself at least
until the API is settled.


I see this in the new code:

#if 0
static PyObject *
unicode__decimal2ascii(PyObject *self)
{
    return PyUnicode_TransformDecimalAndSpaceToASCII(self);
}
#endif


Will PyUnicode_TransformDecimalAndSpaceToASCII() be public?


Stefan Krah



Re: [Python-Dev] cpython: Add _PyUnicode_UTF8() and _PyUnicode_UTF8_LENGTH() macros

2011-10-01 Thread Antoine Pitrou

On Sat, 01 Oct 2011 16:53:44 +0200
victor.stinner  wrote:
> http://hg.python.org/cpython/rev/4afab01f5374
> changeset:   72565:4afab01f5374
> user:        Victor Stinner
> date:        Sat Oct 01 16:48:13 2011 +0200
> summary:
>   Add _PyUnicode_UTF8() and _PyUnicode_UTF8_LENGTH() macros
> 
>  * Rename existing _PyUnicode_UTF8() macro to PyUnicode_UTF8()

Wouldn't this be better called PyUnicode_AS_UTF8()?




Re: [Python-Dev] PEP-393: request for keeping PyUnicode_EncodeDecimal()

2011-10-01 Thread Martin v. Löwis

> longobject.c still used PyUnicode_EncodeDecimal() until 10 months
> ago (8304bd765bcf). I missed the PyUnicode_TransformDecimalToASCII()
> commit, probably because #10557 is still open.
> 
> That's why I wouldn't like to implement the function myself at least
> until the API is settled.

I don't understand. If you implement it yourself, you don't have to
worry at all what the API is. Py_UNICODE_TODECIMAL has been around
for a long time, and will stay, no matter how number parsing is
implemented. That's all you need.

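   /* Sketch: map decimal digits to ASCII, reject other non-ASCII input. */
   out = malloc(PyUnicode_GET_LENGTH(in)+1);
   for (i = 0; i < PyUnicode_GET_LENGTH(in); i++) {
       Py_UCS4 ch = PyUnicode_READ_CHAR(in, i);
       int d = Py_UNICODE_TODECIMAL(ch);
       if (d != -1) {
           out[i] = '0'+d;   /* decimal digit: store its ASCII form */
           continue;
       }
       if (ch < 128)
           out[i] = ch;      /* plain ASCII passes through unchanged */
       else {
           free(out);        /* don't leak the buffer on error */
           error();
           return;
       }
   }
   out[i] = '\0';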

OTOH, *if* number parsing is ever updated (e.g. to consider alternative
decimal points), PyUnicode_EncodeDecimal still won't be changed - it
will continue to do exactly what it does today.

> Will PyUnicode_TransformDecimalAndSpaceToASCII() be public?

It's already included in 3.2, so it can't be removed that easily.
I wish it had been private, though - we have way too many API functions
dealing with Unicode.

Regards,
Martin

Re: [Python-Dev] cpython: Add _PyUnicode_UTF8() and _PyUnicode_UTF8_LENGTH() macros

2011-10-01 Thread Martin v. Löwis

On 01.10.2011 17:18, Antoine Pitrou wrote:
> On Sat, 01 Oct 2011 16:53:44 +0200
> victor.stinner  wrote:
>> http://hg.python.org/cpython/rev/4afab01f5374
>> changeset:   72565:4afab01f5374
>> user:        Victor Stinner
>> date:        Sat Oct 01 16:48:13 2011 +0200
>> summary:
>>   Add _PyUnicode_UTF8() and _PyUnicode_UTF8_LENGTH() macros
>>
>>  * Rename existing _PyUnicode_UTF8() macro to PyUnicode_UTF8()
> 
> Wouldn't this be better called PyUnicode_AS_UTF8()?

No. _AS_UTF8 would imply that some conversion function is called.
In this case, it's a pure structure accessor macro, that may give
NULL if the pointer is not yet filled out.

It's not called Py_AS_TYPE, but Py_TYPE; likewise not
PyWeakref_AS_OBJECT, but PyWeakref_GET_OBJECT. In this case,
PyUnicode_GET_UTF8 might have been an alternative.

Regards,
Martin

Re: [Python-Dev] cpython: Add _PyUnicode_UTF8() and _PyUnicode_UTF8_LENGTH() macros

2011-10-01 Thread Antoine Pitrou

On Sat, 01 Oct 2011 17:47:26 +0200
"Martin v. Löwis"  wrote:
> On 01.10.2011 17:18, Antoine Pitrou wrote:
> > On Sat, 01 Oct 2011 16:53:44 +0200
> > victor.stinner  wrote:
> >> http://hg.python.org/cpython/rev/4afab01f5374
> >> changeset:   72565:4afab01f5374
> >> user:        Victor Stinner
> >> date:        Sat Oct 01 16:48:13 2011 +0200
> >> summary:
> >>   Add _PyUnicode_UTF8() and _PyUnicode_UTF8_LENGTH() macros
> >>
> >>  * Rename existing _PyUnicode_UTF8() macro to PyUnicode_UTF8()
> > 
> > Wouldn't this be better called PyUnicode_AS_UTF8()?
> 
> No. _AS_UTF8 would imply that some conversion function is called.

PyBytes_AS_STRING doesn't call any conversion function, and neither did
PyUnicode_AS_UNICODE.


Re: [Python-Dev] What it takes to change a single keyword.

2011-10-01 Thread Martin v. Löwis

> First of all, I am sincerely sorry if this is the wrong mailing list to ask
> this question. I checked out the definitions of a couple of other mailing
> lists, and this one seemed most suitable. Here is my question:

In principle, python-list would be more appropriate, but this really
is a border case. So welcome!

> Let's say I want to change a single keyword, let's say the import keyword,
> to be spelled as something else, like its translation to my language. I
> guess it would be more complicated than modifying Grammar/Grammar, but
> I can't be sure which files should get edited.

Hmm. I also think editing Grammar/Grammar should be sufficient. Try
restricting yourself to ASCII keywords first; this just worked fine for
me.

Of course, if you change a single keyword, none of the existing Python
code will work anymore. See for yourself by changing 'def' to 'fed' (say).

Regards,
Martin

[Python-Dev] RFC: Add a new builtin strarray type to Python?

2011-10-01 Thread Victor Stinner

Hi,

Since the integration of PEP 393, str += str is no longer super-fast (but 
just fast). For example, adding a single character to a string has to copy all 
characters to a new string. I suppose that the performance of a lot of 
applications manipulating text may be affected by this issue, especially text 
templating libraries.

io.StringIO has also been changed to store characters as Py_UCS4 (4 bytes) 
instead of Py_UNICODE (2 or 4 bytes). This class doesn't benefit from the new 
PEP 393.

I propose to add a new builtin type to Python to improve both issues (cpu and 
memory): *strarray*. This type would have the same API as str, except:

 * it has append() and extend() methods
 * its methods return strarray instead of str

I'm writing this email to ask you if this type solves a real issue, or if we 
can just prove the super-fast str.join(list of str).

--

strarray is similar to bytearray, but different: strarray('abc')[0] is 'a', not 
97, and strarray can store any Unicode character (not only integers in range 
0-255).
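
To make the difference from bytearray concrete, here is a deliberately tiny sketch of the proposed behaviour. It is not the implementation linked below: the class name StrArray and its list-of-characters storage are assumptions chosen only to keep the example short.

```python
class StrArray:
    """Hypothetical mutable text buffer; not the proposed C implementation."""

    def __init__(self, initial=""):
        self._chars = list(initial)     # one list item per character

    def append(self, char):
        if len(char) != 1:
            raise ValueError("append() expects a single character")
        self._chars.append(char)

    def extend(self, text):
        self._chars.extend(text)        # in place, no new string is built

    def __getitem__(self, index):
        item = self._chars[index]
        if isinstance(index, slice):
            return StrArray("".join(item))   # slices stay StrArray
        return item                          # items are 1-character str

    def __len__(self):
        return len(self._chars)

    def __str__(self):
        return "".join(self._chars)


s = StrArray("abc")
s.append("\u20ac")      # any Unicode character, not only 0-255
s.extend("xyz")
```

Indexing gives back a one-character str here, whereas bytearray(b"abc")[0] gives the integer 97.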

I wrote a quick and dirty implementation in Python just to be able to play 
with the API, and to have an idea of the quantity of work required to 
implement it:

https://bitbucket.org/haypo/misc/src/tip/python/strarray.py

(Some methods are untested: see the included TODO list.)

--

Implementing strarray in C is not trivial, and it would be easier to do it 
in 3 steps:

 (a) Use a Py_UCS4 array
 (b) Make the array type depend on the content: best memory footprint, as in 
     PEP 393
 (c) Use strarray to implement a new io.StringIO

Or we can just stop after step (a).

--

strarray API has to be discussed.

Most bytearray methods return a new object in most cases. I don't understand 
why; it's not efficient. I don't know if we can do in-place operations for 
strarray methods having the same name as bytearray methods (which are not 
in-place methods).

str has some more methods that bytes and bytearray don't have, like format. We 
may do in-place operations for these methods.

Victor

Re: [Python-Dev] cpython: Add _PyUnicode_UTF8() and _PyUnicode_UTF8_LENGTH() macros

2011-10-01 Thread Victor Stinner

On Saturday 1 October 2011 17:18:42, Antoine Pitrou wrote:
> On Sat, 01 Oct 2011 16:53:44 +0200
> 
> victor.stinner  wrote:
> > http://hg.python.org/cpython/rev/4afab01f5374
> > changeset:   72565:4afab01f5374
> > user:        Victor Stinner
> > date:        Sat Oct 01 16:48:13 2011 +0200
> > 
> > summary:
> >   Add _PyUnicode_UTF8() and _PyUnicode_UTF8_LENGTH() macros
> >  
> >  * Rename existing _PyUnicode_UTF8() macro to PyUnicode_UTF8()
> 
> Wouldn't this be better called PyUnicode_AS_UTF8()?

All these macros are private and are just used to make the C code more 
readable. For example, _PyUnicode_UTF8() just gives access to a field of a 
structure after casting the object to the right type.

We may drop "PyUnicode_" and  "_PyUnicode_" prefixes if these names are 
confusing.

Victor

Re: [Python-Dev] [Python-checkins] cpython: Enhance Py_ARRAY_LENGTH(): fail at build time if the argument is not an array

2011-10-01 Thread Victor Stinner

On Saturday 1 October 2011 14:52:03, you wrote:
> >> Do we really need a new file? Why not pyport.h where other compiler
> >> stuff goes?
> > 
> > I'm not sure that pyport.h is the right place to add Py_MIN, Py_MAX,
> > Py_ARRAY_LENGTH. pyport.h looks to be related to all things specific to
> > the platform like INT_MAX, Py_VA_COPY, ... pymacro.h contains
> > platform-independent macros.
> 
> I'm -1 on additional header files as well. If no other reasonable place
> is found, Python.h is still available.

I moved them to pymacro.h because I don't consider Python.h as a reasonable 
place for them.


Victor

Re: [Python-Dev] RFC: Add a new builtin strarray type to Python?

2011-10-01 Thread Victor Stinner

> Since the integration of PEP 393, str += str is no longer super-fast
> (but just fast).

Oh oh. str += str is now *1450x* slower than the ''.join() pattern. Here is a 
benchmark (see attached script, bench_build_str.py):

Python 3.3

str += str: 14548 ms
''.join() : 10 ms
StringIO.write: 12 ms
StringBuilder : 30 ms
array('u'): 67 ms

Python 3.2

str += str: 9 ms
''.join() : 9 ms
StringIO.write: 9 ms
StringBuilder : 30 ms
array('u'): 77 ms

(FYI results are very different in Python 2)

I expect performance similar to StringIO.write if strarray is implemented 
using a Py_UCS4 buffer, as io.StringIO is.

PyPy has a UnicodeBuilder class (in __pypy__.builders): it has append(), 
append_slice() and build() methods. In PyPy, it is the fastest method to build 
a string:

PyPy 1.6

''.join() : 16 ms
StringIO.write: 24 ms
StringBuilder : 9 ms
array('u'): 66 ms

It is even faster if you specify the size to the constructor: 3 ms.

> I'm writing this email to ask you if this type solves a real issue, or if
> we can just prove the super-fast str.join(list of str).

Hum, it looks like "What is the most efficient string concatenation method in 
Python?" is a frequently asked question. There is a recent thread on the 
python-ideas mailing list:

"Create a StringBuilder class and use it everywhere"
http://code.activestate.com/lists/python-ideas/11147/
(I just subscribed to this list.)

Another alternative is a "string-join" object. It is discussed (and 
implemented) in the following issue, and PyPy has also an optional 
implementation:

http://bugs.python.org/issue1569040
http://codespeak.net/pypy/dist/pypy/doc/interpreter-optimizations.html#string-join-objects

Note: Python 2 has UserString.MutableString (and Python 3 has 
collections.UserString).

Victor

[Attachment: bench_build_str.py]

import array
import io
import sys
import time

LOOPS = 10
INITIAL = "initial value"
MORE = "more data"

class StringBuilder(object):
    """Use it instead of doing += for building unicode strings from pieces"""
    def __init__(self, val=""):
        self.val = val
        self.appended = []

    def __iadd__(self, other):
        # Only remember the piece; the join happens lazily in __str__
        self.appended.append(other)
        return self

    def __str__(self):
        self.val += "".join(self.appended)
        self.appended = []
        return self.val

def main_pure(loops):
    "str += str"
    b = INITIAL
    for i in range(loops):
        b += MORE
    return b

def main_list_append(loops):
    "''.join()"
    b = [INITIAL]
    for i in range(loops):
        b.append(MORE)
    return "".join(b)

def main_string_builder(loops):
    "StringBuilder"
    b = StringBuilder(INITIAL)
    for i in range(loops):
        b += MORE
    return str(b)

def main_stringio(loops):
    "StringIO.write"
    b = io.StringIO(INITIAL)
    for i in range(loops):
        b.write(MORE)
    return b.getvalue()

def main_array(loops):
    "array('u')"
    b = array.array('u', INITIAL)
    for i in range(loops):
        b.extend(MORE)
    return b.tounicode()

# Each function's docstring doubles as its label in the report
ver = sys.version_info
print("Python %s.%s" % (ver.major, ver.minor))
funcs = (main_pure, main_list_append, main_stringio, main_string_builder, main_array)
width = 1 + max(len(func.__doc__) for func in funcs)
for func in funcs:
    a = time.time()
    func(LOOPS)
    b = time.time()
    dt = b - a
    print("%s: %.0f ms" % (func.__doc__.ljust(width), dt * 1000))


Re: [Python-Dev] [Python-checkins] cpython: Implement PEP 393.

2011-10-01 Thread Eric V. Smith

On 10/1/2011 9:26 AM, "Martin v. Löwis" wrote:
> On 29.09.2011 01:21, Eric V. Smith wrote:
>> Is there some reason str.format had such major surgery done to it?
> 
> Yes: I couldn't figure out how to do it any other way. The formatting
> code had a few basic assumptions which now break (unless you keep using
> the legacy API). Primarily, the assumption is that there is a notion of
> a "STRINGLIB_CHAR" which is the element of a string representation. With
> PEP 393, no such type exists anymore - it depends on the individual
> object what the element type for the representation is.

Martin: Thanks so much for your thoughtful answer. You've obviously
given this more thought than I have. From your answer, it does indeed
sound like string_format.h needs to be removed from stringlib. I'll have
to think more about formatter.h.

On the other hand, not having this code in stringlib would certainly be
liberating! Maybe I'll take this opportunity to clean it up and simplify
it now that it's free of the stringlib constraints.

Eric.

Re: [Python-Dev] RFC: Add a new builtin strarray type to Python?

2011-10-01 Thread Antoine Pitrou

On Sat, 1 Oct 2011 22:06:11 +0200
Victor Stinner  wrote:
> 
> > I'm writing this email to ask you if this type solves a real issue, or if
> > we can just prove the super-fast str.join(list of str).
> 
> Hum, it looks like "What is the most efficient string concatenation method in 
> Python?" is a frequently asked question. There is a recent thread on the 
> python-ideas mailing list:

So, since people are confused at the number of possible options, you
propose to add a new option and therefore increase the confusion?

I don't understand why StringIO couldn't simply be optimized a little
more, if it needs to.
Or, if straightforward string concatenation really needs to be fast,
then str + str should be optimized (like it used to be).

Regards

Antoine.


Re: [Python-Dev] RFC: Add a new builtin strarray type to Python?

2011-10-01 Thread Larry Hastings



On 10/01/2011 09:06 PM, Victor Stinner wrote:

Another alternative is a "string-join" object. It is discussed (and
implemented) in the following issue, and PyPy has also an optional
implementation:

http://bugs.python.org/issue1569040
http://codespeak.net/pypy/dist/pypy/doc/interpreter-optimizations.html#string-join-objects



Yes, actually I was planning on trying to revive my "lazy string 
concatenation" patch once PEP 393 landed.  As I recall it, the major 
roadblock to the patch's acceptance was that it changed the semantics of 
PyString_AS_STRING().  With the patch applied, PyString_AS_STRING() 
could now fail and return NULL under low-memory conditions.  This meant 
a major change to the C API and would have required an audit of 400+ 
call sites inside CPython alone.  I haven't studied PEP 393 yet, but 
Martin tells me PyUnicode_READY would be a good place to render the lazy 
string.
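
The general idea of lazy concatenation can be sketched in pure Python as follows. This is not the patch itself: the class name LazyConcat and its render() method are invented for illustration, with render() playing roughly the role that a hook such as PyUnicode_READY might play in C.

```python
class LazyConcat:
    """Hypothetical lazy string: '+' only records operands, render() joins."""

    def __init__(self, *parts):
        self._parts = list(parts)
        self._rendered = None

    def __add__(self, other):
        # O(1): no characters are copied at concatenation time.
        return LazyConcat(self, other)

    def render(self):
        if self._rendered is None:
            pieces = []
            stack = [self]
            # Walk the tree iteratively so long chains don't hit the
            # recursion limit.
            while stack:
                node = stack.pop()
                if isinstance(node, LazyConcat):
                    if node._rendered is not None:
                        pieces.append(node._rendered)
                    else:
                        stack.extend(reversed(node._parts))
                else:
                    pieces.append(node)
            self._rendered = "".join(pieces)
            self._parts = [self._rendered]   # drop the tree once flattened
        return self._rendered

    __str__ = render


s = LazyConcat("initial value")
for _ in range(5000):
    s = s + "more data"
result = str(s)
```

All copying happens once, in render(), which is also the one place where a memory error could surface. That is the property that made the original patch awkward for PyString_AS_STRING() callers.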


Give me a week or two and I should be able to get it together,


/larry/

Re: [Python-Dev] RFC: Add a new builtin strarray type to Python?

2011-10-01 Thread Maciej Fijalkowski

On Sat, Oct 1, 2011 at 5:21 PM, Antoine Pitrou  wrote:
> On Sat, 1 Oct 2011 22:06:11 +0200
> Victor Stinner  wrote:
>>
>> > I'm writing this email to ask you if this type solves a real issue, or if
>> > we can just prove the super-fast str.join(list of str).
>>
>> Hum, it looks like "What is the most efficient string concatenation method in
>> Python?" is a frequently asked question. There is a recent thread on the
>> python-ideas mailing list:

Victor, you can't say it's x times slower. It has different
complexity, so it can be arbitrarily slower.

>
> So, since people are confused at the number of possible options, you
> propose to add a new option and therefore increase the confusion?
>
> I don't understand why StringIO couldn't simply be optimized a little
> more, if it needs to.
> Or, if straightforward string concatenation really needs to be fast,
> then str + str should be optimized (like it used to be).

As far as I remember str + str is discouraged as a way of
concatenating strings. We in pypy should make it fast if it's *really*
the official way.

StringIO is bytes only I think, which might be a bit of an issue if
you want a unicode at the end.

PyPy's Unicode/String builders are a bit of a hack until we come up with
something that can make ''.join faster, I think.

Cheers,
fijal

>
> Regards
>
> Antoine.
>

Re: [Python-Dev] What it takes to change a single keyword.

2011-10-01 Thread Nick Coghlan

2011/10/1 "Martin v. Löwis" :
>> First of all, I am sincerely sorry if this is the wrong mailing list to ask
>> this question. I checked out the definitions of a couple of other mailing
>> lists, and this one seemed most suitable. Here is my question:
>
> In principle, python-list would be more appropriate, but this really
> is a border case. So welcome!
>
>> Let's say I want to change a single keyword, let's say the import keyword,
>> to be spelled as something else, like its translation to my language. I
>> guess it would be more complicated than modifying Grammar/Grammar, but
>> I can't be sure which files should get edited.
>
> Hmm. I also think editing Grammar/Grammar should be sufficient. Try
> restricting yourself to ASCII keywords first; this just worked fine for
> me.

For any changes where that isn't sufficient, then
http://docs.python.org/devguide/grammar.html provides a helpful list
of additional places to check (and
http://docs.python.org/devguide/compiler.html provides info on how it
all hangs together).

However, rather than *changing* the keywords, it would likely be
better to allow *alternate* keywords to avoid the problem Martin
mentioned with existing Python code failing to run (including the
entire standard library).
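
For experimenting with alternate spellings before touching the grammar at all, one possibility is a source-to-source translation step. The sketch below is not a grammar change and not what CPython does: it rewrites tokens with the standard tokenize module, and the alias table (with "importar" as a made-up spelling) is purely an assumption for illustration.

```python
import io
import tokenize

# Hypothetical alternate spellings mapped to the real keywords.
ALIASES = {"importar": "import", "definir": "def"}


def translate(source):
    """Return `source` with alias names replaced by standard keywords."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        text = tok.string
        if tok.type == tokenize.NAME and text in ALIASES:
            text = ALIASES[text]
        # Two-element tuples select untokenize()'s compatibility mode,
        # which ignores the original column positions.
        tokens.append((tok.type, text))
    return tokenize.untokenize(tokens)


namespace = {}
exec(translate("importar math\nresult = math.sqrt(16)\n"), namespace)
```

Because the standard keywords are left alone, existing code (including the standard library) keeps working, which is the point of allowing alternates rather than replacements. The spacing of the translated source may differ from the input, and a name that happens to match an alias would be rewritten too.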

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia

Re: [Python-Dev] [Python-checkins] cpython: Implement PEP 393.

2011-10-01 Thread Nick Coghlan

On Sat, Oct 1, 2011 at 4:07 PM, Eric V. Smith  wrote:
> On the other hand, not having this code in stringlib would certainly be
> liberating! Maybe I'll take this opportunity to clean it up and simplify
> it now that it's free of the stringlib constraints.

Yeah, don't sacrifice speed in str.format for a
still-hypothetical-and-potentially-never-going-to-happen bytes
formatting variant. If the latter does happen, the use cases would be
different enough that I'm not even sure the mini-language should
remain entirely the same (e.g. you'd likely want direct access to some
of the struct module formatting more so than str-style formats).

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia

Re: [Python-Dev] RFC: Add a new builtin strarray type to Python?

2011-10-01 Thread Nick Coghlan

On Sat, Oct 1, 2011 at 8:33 PM, Maciej Fijalkowski  wrote:
> StringIO is bytes only I think, which might be a bit of an issue if
> you want a unicode at the end.

I'm not sure why you would think that (aside from a 2.x holdover).
StringIO handles Unicode text, BytesIO handles bytes.
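
A quick check of that split in Python 3, using only the standard io module (the variable names are just for this example):

```python
import io

# StringIO accumulates text (str), including non-ASCII characters.
text_buf = io.StringIO()
text_buf.write("caf\u00e9 \u20ac")
text_value = text_buf.getvalue()

# BytesIO accumulates bytes; text must be encoded first.
bytes_buf = io.BytesIO()
bytes_buf.write("caf\u00e9".encode("utf-8"))
bytes_value = bytes_buf.getvalue()

# Writing str to a BytesIO is rejected rather than silently converted.
try:
    io.BytesIO().write("text")
    rejected = False
except TypeError:
    rejected = True
```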

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia

Re: [Python-Dev] RFC: Add a new builtin strarray type to Python?

2011-10-01 Thread Nick Coghlan

On Sat, Oct 1, 2011 at 1:17 PM, Victor Stinner
 wrote:
> Most bytearray methods return a new object in most cases. I don't understand
> why, it's not efficient. I don't know if we can do in-place operations for
> strarray methods having the same name as bytearray methods (which are not
> in-place methods).

No, we can't. The whole point of having separate in-place operators is
to distinguish between operations that can modify the original object,
and those that leave the original object alone (even when it's an
instance of a mutable type like list or bytearray). Efficiency takes a
distant second place to correctness when determining API behaviour.
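
The distinction is easy to observe with the existing bytearray type; the variable names below are only for this illustration:

```python
buf = bytearray(b"abc")
alias = buf                   # second reference to the same object

# A non-mutating method returns a new object and leaves buf alone.
upper = buf.upper()
unchanged_after_upper = bytes(buf)

# In-place operators and mutating methods change the object itself,
# so the change is visible through every reference.
buf += b"def"
buf.extend(b"!")
```

After this runs, alias still refers to the same object as buf and sees the appended bytes, while upper is a separate bytearray that never affected the original.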

> str has some more methods that bytes and bytearray don't have, like format. We
> may do in-place operations for these methods.

No we can't, since they're not mutating methods, so they shouldn't
affect the state of the current object.

I'm only -0 on the idea (since bytearray and io.BytesIO seem to
coexist happily enough), but any such strarray object would need to
behave itself with respect to which operations affected the internal
state of the object.

With strings defined as immutable objects, concatenating them in a
loop is formally an O(N*N) operation. Those are always going to scale
poorly. The 'resize if only one reference' trick was fragile, masked a
real algorithmic flaw in user code, but also sped up a lot of naive
software. It was definitely a case of practicality beating purity.
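
The quadratic cost can be made concrete by counting character copies instead of timing anything. The two helper functions are hypothetical and model only the naive behaviour (every += builds a brand-new string), not the "resize if only one reference" optimisation:

```python
def copies_concat(n, piece_len, initial_len=0):
    """Characters copied by n naive `s += piece` steps on immutable strings."""
    total = 0
    length = initial_len
    for _ in range(n):
        length += piece_len
        total += length       # the whole new string is written each time
    return total


def copies_join(n, piece_len, initial_len=0):
    """Characters copied by a single ''.join() over the same pieces."""
    return initial_len + n * piece_len


# 13 and 9 are the lengths of "initial value" and "more data".
concat_1k = copies_concat(1000, 9, 13)
join_1k = copies_join(1000, 9, 13)
concat_2k = copies_concat(2000, 9, 13)
join_2k = copies_join(2000, 9, 13)
```

Doubling the number of pieces doubles the work for join but roughly quadruples it for naive concatenation, so the ratio between the two grows with N rather than being a fixed factor.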

Any change that depends on the user changing their code would be
rather missing the point of the original optimisation - if the user is
sufficiently aware of the problem to know they need to change their
code, then explicitly joining a list of substrings or using a StringIO
object instead of an ordinary string is well within their grasp.

Adding a "disjoint" string representation to the existing PEP 393
suite of representations would solve the same problem in a more
systematic way and, as Martin pointed out, could likely use the same
machinery as is provided for backwards compatibility with code
expecting the legacy string representation.

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia
