[Python-Dev] Deprecated Cookie classes in Py3k

2008-05-27 Thread techtonik
I've noticed that some classes in the Cookie module (namely SerialCookie and
SmartCookie), deprecated since 2.3, are still present in the Python 3000
documentation.
http://docs.python.org/dev/3.0/library/http.cookies.html

Is it because:
1. Docs are not synchronized with the API
2. Classes are not removed yet
(http://wiki.python.org/moin/Py3kDeprecated is actually a TODO)
3. The manual should contain information about all historical API
changes

-- 
--anatoly t.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Why is type_modified() in typeobject.c not a public function?

2008-05-27 Thread Stefan Behnel
Hi,

Guido van Rossum wrote:
> On Tue, May 27, 2008 at 9:47 AM, Stefan Behnel <[EMAIL PROTECTED]> wrote:
>> Could this function get a public interface? I do not think Cython is the only
>> case where C code wants to modify a type after its creation, and copying the
>> code over seems like a hack to me.
>>
> I'm fine with giving it a public interface. Please submit a patch with
> docs included.

Straightforward patch is attached (against 3.0a5). Works for me in Cython. I
thought about a name like "Taint(t)" or "ClearTypeCache(t)", but then went
with the coward solution of calling the function PyType_Modified() as it was
(almost) named internally.

BTW, I noticed that the code in typeobject.c uses "DECREF before set" two
times, like this:

method_cache[h].version = type->tp_version_tag;
method_cache[h].value = res;  /* borrowed */
Py_INCREF(name);
Py_DECREF(method_cache[h].name);
method_cache[h].name = name;

During the call to Py_DECREF, the cache content is incorrect, so can't this
run into the same problem that Py_CLEAR() aims to solve? I attached a patch
for that, too, just in case.

Stefan


--- Include/object.h.ORIG	2008-05-27 21:34:30.0 +0200
+++ Include/object.h	2008-05-27 21:22:22.0 +0200
@@ -428,6 +428,7 @@
 	   PyObject *, PyObject *);
 PyAPI_FUNC(PyObject *) _PyType_Lookup(PyTypeObject *, PyObject *);
 PyAPI_FUNC(unsigned int) PyType_ClearCache(void);
+PyAPI_FUNC(void) PyType_Modified(PyTypeObject *);
 
 /* Generic operations on objects */
 PyAPI_FUNC(int) PyObject_Print(PyObject *, FILE *, int);
--- Objects/typeobject.c.ORIG	2008-05-27 21:35:05.0 +0200
+++ Objects/typeobject.c	2008-05-27 21:33:45.0 +0200
@@ -33,7 +33,6 @@
 
 static struct method_cache_entry method_cache[1 << MCACHE_SIZE_EXP];
 static unsigned int next_version_tag = 0;
-static void type_modified(PyTypeObject *);
 
 unsigned int
 PyType_ClearCache(void)
@@ -48,12 +47,12 @@
 	}
 	next_version_tag = 0;
 	/* mark all version tags as invalid */
-	type_modified(&PyBaseObject_Type);
+	PyType_Modified(&PyBaseObject_Type);
 	return cur_version_tag;
 }
 
-static void
-type_modified(PyTypeObject *type)
+void
+PyType_Modified(PyTypeObject *type)
 {
 	/* Invalidate any cached data for the specified type and all
 	   subclasses.  This function is called after the base
@@ -87,7 +86,7 @@
 			ref = PyList_GET_ITEM(raw, i);
 			ref = PyWeakref_GET_OBJECT(ref);
 			if (ref != Py_None) {
-				type_modified((PyTypeObject *)ref);
+				PyType_Modified((PyTypeObject *)ref);
 			}
 		}
 	}
@@ -173,7 +172,7 @@
 			Py_INCREF(Py_None);
 		}
 		/* mark all version tags as invalid */
-		type_modified(&PyBaseObject_Type);
+		PyType_Modified(&PyBaseObject_Type);
 		return 1;
 	}
 	bases = type->tp_bases;
@@ -313,7 +312,7 @@
 		return -1;
 	}
 
-	type_modified(type);
+	PyType_Modified(type);
 
 	return PyDict_SetItemString(type->tp_dict, "__module__", value);
 }
@@ -341,7 +340,7 @@
 	int res = PyDict_SetItemString(type->tp_dict,
    "__abstractmethods__", value);
 	if (res == 0) {
-		type_modified(type);
+		PyType_Modified(type);
 		if (value && PyObject_IsTrue(value)) {
 			type->tp_flags |= Py_TPFLAGS_IS_ABSTRACT;
 		}
@@ -1520,7 +1519,7 @@
 	   from the custom MRO */
 	type_mro_modified(type, type->tp_bases);
 
-	type_modified(type);
+	PyType_Modified(type);
 
 	return 0;
 }
@@ -5734,7 +5733,7 @@
 	   update_subclasses() recursion below, but carefully:
 	   they each have their own conditions on which to stop
 	   recursing into subclasses. */
-	type_modified(type);
+	PyType_Modified(type);
 
 	init_slotdefs();
 	pp = ptrs;
--- Doc/c-api/type.rst.ORIG	2008-05-27 21:35:14.0 +0200
+++ Doc/c-api/type.rst	2008-05-27 21:44:20.0 +0200
@@ -38,6 +38,13 @@
    Clears the internal lookup cache. Return the current version tag.
 
 
+.. cfunction:: void PyType_Modified(PyTypeObject *type)
+
+   Invalidates the internal lookup cache for the type and all of its
+   subtypes.  This function must be called after any manual
+   modification of the attributes or base classes of the type.
+
+
 .. cfunction:: int PyType_HasFeature(PyObject *o, int feature)
 
    Return true if the type object *o* sets the feature *feature*.  Type features

--- Objects/typeobject.c.NEW	2008-05-27 22:26:18.0 +0200
+++ Objects/typeobject.c	2008-05-27 22:28:55.0 +0200
@@ -148,7 +148,7 @@
 	   cannot be done, 1 if Py_TPFLAGS_VALID_VERSION_TAG.
 	*/
 	Py_ssize_t i, n;
-	PyObject *bases;
+	PyObject *bases, *tmp;
 
 	if (PyType_HasFeature(type, Py_TPFLAGS_VALID_VERSION_TAG))
 		return 1;
@@ -167,9 +167,10 @@
 		   are borrowed reference */
 		for (i = 0; i < (1 << MCACHE_SIZE_EXP); i++) {
 			method_cache[i].value = NULL;
-			Py_XDECREF(method_cache[i].name);
-			method_cache[i].name = Py_None;
+			tmp = method_cache[i].name;
 			Py_INCREF(Py_None);
+			method_cache[i].name = Py_None;
+			Py_XDECREF(tmp);
 		}
 		/* mark all version tags as inval

Re: [Python-Dev] Iterable String Redux (aka String ABC)

2008-05-27 Thread Terry Reedy

"Steven D'Aprano" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]

Just throwing a suggestion out there...

def atomic(obj, _atomic=(basestring,)):
    try:
        return bool(obj.__atomic__)
    except AttributeError:
        if isinstance(obj, _atomic):
            return True
        else:
            try:
                iter(obj)
            except TypeError:
                return True
    return False

assert atomic("abc")
assert not atomic(['a', 'b', 'c'])

If built-in objects grew an __atomic__ attribute, you could simplify the
atomic() function greatly:

def atomic(obj):
    return bool(obj.__atomic__)


However atomic() is defined, now flatten() is easy:

def flatten(obj):
    if atomic(obj):
        yield obj
    else:
        for item in obj:
            for i in flatten(item):
                yield i


If you needed more control, you could customise it using standard
techniques e.g. shadow the atomic() function with your own version,
sub-class the types you wish to treat differently, make __atomic__ a
computed property instead of a simple attribute, etc.
==

This is a lot of work to avoid being explicit about either atomic or 
non-atomic classes on a site, package, module, or call basis ;-) 





Re: [Python-Dev] Issue 643841: Including a new-style proxy base class in 2.6/3.0

2008-05-27 Thread Greg Ewing

Nick Coghlan wrote:


    else:
        # Returned a different object, make a new proxy
        result = type(self)(result)


You might want to check that the result has the
same type as the proxied object before doing that.
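A minimal sketch of the check suggested here, assuming a hypothetical Proxy class that keeps the proxied object in a _target attribute (neither name comes from the patch under discussion). The result is wrapped in a new proxy only when it has the same type as the proxied object:

```python
class Proxy:
    """Hypothetical proxy; the class name and _target are assumptions."""

    def __init__(self, target):
        self._target = target

    def _wrap(self, result):
        if result is self._target:
            # The same object came back, so keep the existing proxy
            return self
        if type(result) is type(self._target):
            # Returned a different object of the same type, make a new proxy
            return type(self)(result)
        # Different type: hand the result back unwrapped
        return result

    def __add__(self, other):
        return self._wrap(self._target + other)

    def __getitem__(self, index):
        return self._wrap(self._target[index])


p = Proxy([1, 2])
q = p + [3]        # list + list gives a list, so q is a new Proxy
first = p[0]       # an int is not a list, so it is returned as is
```

Whether an exact type match or an isinstance() test is the better condition depends on how subclasses of the proxied type should behave; the sketch uses the stricter exact match.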

--
Greg


Re: [Python-Dev] Issue 643841: Including a new-style proxy base class in 2.6/3.0

2008-05-27 Thread Greg Ewing

Armin Ronacher wrote:

I'm currently not
providing any __r*__ methods as I was too lazy to test on each call if the
method that is proxied is providing an __rsomething__ or not, and if not come up
with an ad-hoc implementation by calling __something__ and reversing the
arguments passed.


I don't see why you should have to do that, as the reversed
method of the proxy will only be called if a prior non-reversed
call failed.

All the proxy should have to do is delegate any lookups of its
own reversed methods to corresponding methods of the proxied
object, no differently from any other method.
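A rough sketch of that delegation, again assuming a hypothetical Proxy class with a _target attribute. Python only calls the proxy's __radd__ after the left operand's __add__ has returned NotImplemented, so the proxy can hand the call straight to the proxied object's own reversed method:

```python
class Proxy:
    """Hypothetical proxy; the class name and _target are assumptions."""

    def __init__(self, target):
        self._target = target

    def __add__(self, other):
        return self._target + other

    def __radd__(self, other):
        # Only reached after other.__add__(proxy) returned NotImplemented,
        # so delegate to the proxied object's own reversed method.
        reversed_method = getattr(self._target, "__radd__", None)
        if reversed_method is None:
            return NotImplemented
        return reversed_method(other)


total = 3 + Proxy(5)   # int.__add__ gives up, Proxy.__radd__ delegates
```

No ad-hoc fallback that swaps the arguments and calls __add__ is needed here: if the proxied object has no reversed method, returning NotImplemented lets Python raise the usual TypeError.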

--
Greg


Re: [Python-Dev] Iterable String Redux (aka String ABC)

2008-05-27 Thread Raymond Hettinger

"Steven D'Aprano" <[EMAIL PROTECTED]> wrote:
> If built-in objects grew an __atomic__ attribute, you could
> simplify the atomic() function greatly:


I may not have been clear enough in my previous post.
Atomicity is not an intrinsic property of an object or class.
How could you know in advance what various applications
would want to consider atomic?  The decision is application
specific and best left to the caller of the flattener:

def flatten(obj, predicate=None):
    if predicate is not None and predicate(obj):
        yield obj
    else:
        for item in obj:
            for i in flatten(item, predicate):
                yield i


> However atomic() is defined, now flatten() is easy:


Rule of thumb:  if you find a need to change the language
(adding a new builtin, adding a new protocol, and adding a
property to every builtin and pure python container) just to
implement a simple recipe, then it is the recipe that needs fixing,
not the language.


Raymond

P.S. You're on the right track by factoring the decision
away from the internals of flatten(); however, the atomic()
predicate needs to be user definable for a given application
not hardwired into the language itself.
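To illustrate the point about caller-defined predicates, here is a small sketch in present-day Python 3 (yield from did not exist in 2008). The two predicate functions and the sample data are made up for the example; they show two applications drawing the line in different places while sharing one flatten():

```python
def flatten(obj, predicate):
    """Yield the leaves of obj; predicate(obj) decides what counts as a leaf."""
    if predicate(obj):
        yield obj
    else:
        for item in obj:
            yield from flatten(item, predicate)


def strings_are_atomic(obj):
    # Strings and anything that cannot be iterated are leaves
    if isinstance(obj, (str, bytes)):
        return True
    try:
        iter(obj)
    except TypeError:
        return True
    return False


def tuples_are_atomic(obj):
    # A different application: tuples (say, coordinate pairs) stay intact too
    return isinstance(obj, tuple) or strings_are_atomic(obj)


data = ["ab", [(1, 2), ["cd", (3, 4)]]]
by_strings = list(flatten(data, strings_are_atomic))
by_tuples = list(flatten(data, tuples_are_atomic))
```

With the first predicate the tuples are split into their numbers; with the second they survive as single items. Neither choice requires touching flatten() or the language.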


Re: [Python-Dev] Iterable String Redux (aka String ABC)

2008-05-27 Thread Steven D'Aprano
(If you receive this twice, please excuse the duplicate email. 
User-error on my part, sorry.)

On Wed, 28 May 2008 08:23:38 am Raymond Hettinger wrote:

> A flatten() implementation doesn't really care about whether
> an input is a string which supports all the string-like methods
> such as capitalize().   Wouldn't it be better to write your
> version of flatten() with a registration function so that a user
> could specify which objects are atomic?  Otherwise, you
> will have to continually re-edit your flatten() code as you
> run across other non-stringlike objects that also need to
> be treated as atomic.

Just throwing a suggestion out there... 

def atomic(obj, _atomic=(basestring,)):
    try:
        return bool(obj.__atomic__)
    except AttributeError:
        if isinstance(obj, _atomic):
            return True
        else:
            try:
                iter(obj)
            except TypeError:
                return True
    return False

assert atomic("abc")
assert not atomic(['a', 'b', 'c'])

If built-in objects grew an __atomic__ attribute, you could simplify the 
atomic() function greatly:

def atomic(obj):
    return bool(obj.__atomic__)


However atomic() is defined, now flatten() is easy:

def flatten(obj):
    if atomic(obj):
        yield obj
    else:
        for item in obj:
            for i in flatten(item):
                yield i


If you needed more control, you could customise it using standard 
techniques e.g. shadow the atomic() function with your own version, 
sub-class the types you wish to treat differently, make __atomic__ a 
computed property instead of a simple attribute, etc.

Re-writing the above to match Python 3 is left as an exercise.
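One possible answer to that exercise, as a sketch only. It assumes that str, bytes and bytearray are the types to treat as atomic in Python 3 (basestring no longer exists), which is a choice the thread itself does not settle, and it uses yield from, which arrived later than this message:

```python
def atomic(obj, _atomic=(str, bytes, bytearray)):
    """Return True if obj should be treated as a leaf."""
    try:
        return bool(obj.__atomic__)
    except AttributeError:
        if isinstance(obj, _atomic):
            return True
        try:
            iter(obj)
        except TypeError:
            return True
    return False


def flatten(obj):
    if atomic(obj):
        yield obj
    else:
        for item in obj:
            yield from flatten(item)


class Point(tuple):
    # Hypothetical user type that opts in through the proposed attribute
    __atomic__ = True


flat = list(flatten([1, "abc", [2, [3, b"x"]], Point((4, 5))]))
```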


-- 
Steven


Re: [Python-Dev] Iterable String Redux (aka String ABC)

2008-05-27 Thread Terry Reedy

"Armin Ronacher" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
| Basically *the* problematic situation with iterable strings is something
| like a `flatten` function that flattens out every iterable object except
| of strings.

In most real cases I can imagine, this is way too broad.  For instance, 
trying to 'flatten' an infinite iterable makes the flatten output one also. 
Flattening a set imposes an arbitrary order (but that is ok if one feeds 
the output to set(), which de-orders it).  Flattening a dict decouples keys 
and values.  Flattening iterable set-theoretic numbers (0={}, n = {n-1, 
{n-1}}, or something like that) would literally yield nothing.

| Imagine it's implemented in a way similar to that::
|
|    def flatten(iterable):
|        for item in iterable:
|            try:
|                if isinstance(item, basestring):
|                    raise TypeError()
|                iterator = iter(item)
|            except TypeError:
|                yield item
|            else:
|                for i in flatten(iterator):
|                    yield i

I can more easily imagine wanting to flatten only certain classes, such as 
tuples and lists, or frozensets and sets.

def flatten(iterable, classes):
    for item in iterable:
        if type(item) in classes:
            for i in flatten(item, classes):
                yield i
        else:
            yield item
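A short usage sketch of this whitelist approach, restated in present-day Python 3 so that it runs on its own; the sample data is invented. Only exact instances of the listed classes are opened up, so strings and sets pass through untouched without any special casing:

```python
def flatten(iterable, classes):
    """Flatten only items whose exact type is listed in classes."""
    for item in iterable:
        if type(item) in classes:
            yield from flatten(item, classes)
        else:
            yield item


nested = [1, (2, [3, "ab"]), {4}]
lists_and_tuples = list(flatten(nested, (list, tuple)))
lists_only = list(flatten(nested, (list,)))
```

Because the test is on type(item) rather than isinstance(), subclasses of list or tuple are also left alone unless they are listed explicitly.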

| A problem comes up as soon as user defined strings (such as UserString)
| is passed to the function.  In my opinion a good solution would be a
| "String" ABC one could test against.

This might be a good idea regardless of my comments.

tjr







Re: [Python-Dev] [Python-3000] Iterable String Redux (aka StringABC)

2008-05-27 Thread Raymond Hettinger
"Jim Jewett" 

> It isn't really stringiness that matters, it is that you have to
> terminate even though you still have an iterable container.

Well said.

> Guido had at least a start in Searchable, back when ABC
> were still in the sandbox:


Have to disagree here.  An object cannot know in general
whether a flattener wants to split it or not.  That is an application
dependent decision.  A better answer is to be able to tell the
flattener what should be considered atomic in a given circumstance.


Raymond


Re: [Python-Dev] [Python-3000] Iterable String Redux (aka String ABC)

2008-05-27 Thread Jim Jewett
On 5/27/08, Benji York wrote:
> Guido van Rossum wrote:
>  > Armin Ronacher wrote:

>  >> Basically *the* problematic situation with iterable strings is something
>  >> like a `flatten` function that flattens out every iterable object except
>  >> of strings.

> > I'm not against this, but so far I've not been able to come up with a
>  > good set of methods to endow the String ABC with. Another problem is
>  > that not everybody draws the line in the same place -- how should
>  > instances of bytes, bytearray, array.array, memoryview (buffer in 2.6)
>  > be treated?

> Maybe the opposite approach would be more fruitful.  Flattening is about
>  removing nested "containers", so perhaps there should be an ABC that
>  things like lists and tuples provide, but strings don't.  No idea what
>  that might be.

It isn't really stringiness that matters, it is that you have to
terminate even though you still have an iterable container.

The test is roughly (1==len(v) and v[0]==v), except that you want to
stop a layer sooner.
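The rough test can be written out as follows; the function name is made up for this sketch, and the behaviour shown is that of current Python 3. A one-character str is the case that makes a naive flatten() recurse forever, because its only element is the string itself:

```python
def contains_itself(v):
    """Rough form of the test above: 1 == len(v) and v[0] == v."""
    try:
        return len(v) == 1 and v[0] == v
    except (TypeError, KeyError, IndexError):
        # No length or no integer indexing, so it cannot contain itself this way
        return False


checks = {
    "one-char str": contains_itself("a"),
    "longer str": contains_itself("ab"),
    "one-item list": contains_itself(["a"]),
    "one-byte bytes": contains_itself(b"a"),
    "int": contains_itself(5),
}
```

Only the one-character str passes. Indexing bytes in Python 3 yields an int, so bytes objects terminate on their own. "Stopping a layer sooner" means a flattener would apply its atomic test to the whole string before iterating into it, rather than waiting until it reaches a single character.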

Guido had at least a start in Searchable, back when ABC were still in
the sandbox:
http://svn.python.org/view/sandbox/trunk/abc/abc.py?rev=55321&view=auto

Searchable represented the fact that (x in c) =/=> (x in iter(c))
because of sequence searches like ("Error" in results)
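A small illustration of that difference; the helper name and the sample values are invented for the sketch. For a str, the in operator does a substring search, while searching the iterator only ever compares against single characters:

```python
def membership_agrees(container, x):
    """True if (x in container) and (x in iter(container)) give the same answer."""
    return (x in container) == (x in iter(container))


results = "Fatal Error in step 3"
items = ["Error", "Warning"]

str_multi = membership_agrees(results, "Error")   # substring vs. per-character
str_single = membership_agrees(results, "E")      # single characters agree
list_case = membership_agrees(items, "Error")     # lists search element-wise
```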

-jJ


Re: [Python-Dev] Iterable String Redux (aka String ABC)

2008-05-27 Thread Raymond Hettinger

[Armin Ronacher]

Basically *the* problematic situation with iterable strings is something like
a `flatten` function that flattens out every iterable object except of strings.


Stated more generally: The problematic situation is that flatten()
implementations typically need some way to decide what kinds
of objects are atomic.   Different apps draw the line in different places
(chars, words, paragraphs, blobs, files, directories, xml elements with
attributes, xml bodies, csv records, csv fields, etc.).



A problem comes up as soon as user defined strings (such as UserString) is
passed to the function.  In my opinion a good solution would be a "String"
ABC one could test against.


Conceptually, this is a fine idea, but three things bug me.

First, there is a mismatch between the significance of the problem
being addressed versus the weight of the solution.  The tiny
"problem" is a sense that the simplest version of a flatten recipe
isn't perfectly general. The "solution" is to introduce yet another
ABC, require adherence to the huge string API and require that
everything that purports to be a string register itself.  IMO, that is
trying to kill a mosquito with a cannon.

Second, this seems like the wrong solution to the problem
as it places the responsibility in the wrong place and thereby
hardwires its notion of what kind of objects should be split.
A flatten() implementation doesn't really care about whether
an input is a string which supports all the string-like methods
such as capitalize().   Wouldn't it be better to write your
version of flatten() with a registration function so that a user
could specify which objects are atomic?  Otherwise, you
will have to continually re-edit your flatten() code as you
run across other non-stringlike objects that also need to
be treated as atomic.
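One way such a registration function might look, as a sketch under assumptions: the registry, the register_atomic() name and the Record class are all hypothetical, and the code is present-day Python 3. The flattener consults a user-extensible set of types instead of hardwiring a notion of "string":

```python
_atomic_types = set()   # hypothetical module-level registry


def register_atomic(cls):
    """Tell flatten() to treat instances of cls as leaves."""
    _atomic_types.add(cls)
    return cls   # returning cls allows use as a class decorator


def flatten(obj):
    if isinstance(obj, tuple(_atomic_types)):
        yield obj
        return
    try:
        items = iter(obj)
    except TypeError:
        yield obj          # not iterable, so necessarily a leaf
        return
    for item in items:
        yield from flatten(item)


# The recipe's defaults; without str registered, one-character strings
# would recurse without end.
register_atomic(str)
register_atomic(bytes)


@register_atomic
class Record(list):
    """A user type that is iterable but should not be split."""


flat = list(flatten(["ab", [1, ("cd", 2)], Record([3, 4])]))
```

New atomic types are added by the application with one call, and flatten() itself never needs to be edited again.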

Third, I thought ABCs were introduced as an optional feature
to support large apps that needed both polymorphic object
flexibility and rigorous API matching.  Now, it seems that
even the tiniest recipe is going to expose its internals and insist
on objects being registered as one of several supported
abstract types.  I suppose this is better than insisting on one
of several concrete types, but it still smells like an anti-pattern.


Raymond



Re: [Python-Dev] Iterable String Redux (aka String ABC)

2008-05-27 Thread Antoine Pitrou
Georg Brandl  gmx.net> writes:
> You wrote:
> 
>  > If we stay minimalistic we could consider that the three basic
>  > operations that define a string are:
>  > - testing for substring containment
>  > - splitting on a substring into a list of substrings
>  > - slicing in order to extract a substring
> 
> I argued that instead of split, find belongs into that list.
> (BTW, length inquiry would be a fourth.)

Well, find() does test for substring containment, so in essence it is in that
list, although in my first post I chose '__contains__' as the canonical
representative of substring containment :-)
And, you are right, length inquiry belongs into it too.

> That the other methods, among them split, can be implemented in terms
> of those, follows from both sets of basic operations.

When I wrote "the three basic operations that define a string", perhaps I
should have written "the three essential operations" instead. I was not 
attempting to give implementation guidelines but to propose a semantic 
definition of what constitutes a string and distinguishes it from other kinds
of objects.

Anyway, I think we are picking on words here.

Do we agree on the following basic String interface:
['__len__', '__contains__', '__getitem__', 'find', 'index', 'split', 'rsplit']?
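For illustration, the proposed interface could be checked structurally along these lines. This is a sketch, not an API that exists in the standard library: the StringLike name and the __subclasshook__ approach are assumptions, written for present-day Python 3:

```python
from abc import ABC
from collections import UserString


class StringLike(ABC):
    """Hypothetical ABC; the name and the structural check are assumptions."""

    _required = ("__len__", "__contains__", "__getitem__",
                 "find", "index", "split", "rsplit")

    @classmethod
    def __subclasshook__(cls, C):
        if cls is StringLike:
            # Accept any class that defines every required method somewhere
            # in its MRO, without requiring explicit registration.
            if all(any(name in B.__dict__ for B in C.__mro__)
                   for name in cls._required):
                return True
        return NotImplemented


str_ok = isinstance("abc", StringLike)
user_ok = isinstance(UserString("abc"), StringLike)
list_ok = isinstance(["a", "b"], StringLike)
```

A check like this only sees method names. It cannot verify that __contains__ really performs substring search, which is the semantic property the discussion is about, so explicit registration may still be the more honest mechanism.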

cheers

Antoine.




Re: [Python-Dev] Iterable String Redux (aka String ABC)

2008-05-27 Thread Georg Brandl

Antoine Pitrou wrote:
> Georg Brandl  gmx.net> writes:
>> It does, but I don't see how it contradicts my proposition. find() takes a
>> substring as well.
>
> Well, I'm not sure what your proposal was :-)
> Did you mean to keep split() out of the String interface, or to provide a
> default implementation of it based on find() and slicing?



You wrote:

> If we stay minimalistic we could consider that the three basic operations that
> define a string are:
> - testing for substring containment
> - splitting on a substring into a list of substrings
> - slicing in order to extract a substring

I argued that instead of split, find belongs into that list.
(BTW, length inquiry would be a fourth.)

That the other methods, among them split, can be implemented in terms
of those, follows from both sets of basic operations.
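To make the claim concrete, here is a sketch of split() built only from find(), slicing and len(). The function name is invented, and the find() call uses the optional start argument of str.find and bytes.find:

```python
def split_with_find(s, sep):
    """Split s on every occurrence of sep using only find(), slicing and len()."""
    if not sep:
        raise ValueError("empty separator")
    parts = []
    start = 0
    while True:
        pos = s.find(sep, start)
        if pos == -1:
            # No further separator: the remainder is the last piece
            parts.append(s[start:])
            return parts
        parts.append(s[start:pos])
        start = pos + len(sep)


pieces = split_with_find("a,b,,c", ",")
byte_pieces = split_with_find(b"key=value", b"=")
```

The sketch covers only the explicit-separator form; the whitespace-splitting behaviour of split() with no argument depends on knowledge about the symbols themselves and is not reproduced here.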

Georg



Re: [Python-Dev] Iterable String Redux (aka String ABC)

2008-05-27 Thread Antoine Pitrou
Georg Brandl  gmx.net> writes:
> 
> It does, but I don't see how it contradicts my proposition. find() takes a
> substring as well.

Well, I'm not sure what your proposal was :-)
Did you mean to keep split() out of the String interface, or to provide a
default implementation of it based on find() and slicing?





Re: [Python-Dev] Iterable String Redux (aka String ABC)

2008-05-27 Thread Georg Brandl

Antoine Pitrou wrote:
> Georg Brandl  gmx.net> writes:
>> I'd argue that "find" is more primitive than "split" -- split is intuitively
>> implemented using find and slicing, but implementing find using split and
>> len is unintuitive.  (Of course, "index" can be used instead of "find".)
>
> I meant semantically primitive. I think the difference between a String and a
> plain Sequence is that, in a String, the existence and relative position of
> substrings has a meaning. This is true for character strings but it can also be
> true for other kinds of strings (think genome strings, they are usually
> represented using ASCII letters but it's out of convenience - they could be made
> of opaque objects instead).
>
> That's why, in string classes, you have methods like split() to deal with the
> processing of substrings - which you do not have on lists, not because it would
> be more difficult to implement or algorithmically less efficient, but because
> it would make no sense.
>
> Well I hope it makes at least a bit of sense :-)


It does, but I don't see how it contradicts my proposition. find() takes a
substring as well.

Georg



Re: [Python-Dev] Iterable String Redux (aka String ABC)

2008-05-27 Thread Armin Ronacher
Hi,

Georg Brandl  gmx.net> writes:

> I'd argue that "find" is more primitive than "split" -- split is intuitively
> implemented using find and slicing, but implementing find using split and
> len is unintuitive.  (Of course, "index" can be used instead of "find".)
It surely is, but it would probably make sense to require both.  Maybe have
something like this:

  class SymbolSequence(Sequence)
  class String(SymbolSequence)

String would be the base of str/unicode and CharacterSequence of str/bytes.
A SymbolSequence is basically a sequence based on one type of symbols that
implements slicing, getting symbols by index, count() and index().  A String
is basically everything a str/unicode provides as a method, except for those
which depend on information based on the symbol.  For example upper() /
isupper() etc would go.  Additionally I guess it makes sense to get rid of
encode() / decode() / format().

Regards,
Armin




Re: [Python-Dev] Iterable String Redux (aka String ABC)

2008-05-27 Thread Antoine Pitrou
Georg Brandl  gmx.net> writes:
> I'd argue that "find" is more primitive than "split" -- split is intuitively
> implemented using find and slicing, but implementing find using split and
> len is unintuitive.  (Of course, "index" can be used instead of "find".)

I meant semantically primitive. I think the difference between a String and a
plain Sequence is that, in a String, the existence and relative position of
substrings has a meaning. This is true for character strings but it can also be
true for other kinds of strings (think genome strings, they are usually
represented using ASCII letters but it's out of convenience - they could be made
of opaque objects instead).

That's why, in string classes, you have methods like split() to deal with the
processing of substrings - which you do not have on lists, not because it would
be more difficult to implement or algorithmically less efficient, but because
it would make no sense.

Well I hope it makes at least a bit of sense :-)

Regards

Antoine.




Re: [Python-Dev] Iterable String Redux (aka String ABC)

2008-05-27 Thread Georg Brandl

Antoine Pitrou wrote:
> (just my 2 eurocents)
>
> Guido van Rossum  python.org> writes:
>> I'm not against this, but so far I've not been able to come up with a
>> good set of methods to endow the String ABC with.
>
> If we stay minimalistic we could consider that the three basic operations that
> define a string are:
> - testing for substring containment
> - splitting on a substring into a list of substrings
> - slicing in order to extract a substring
>
> Which gives us ['__contains__', 'split', '__getitem__'], and expands intuitively
> to ['__contains__', 'find', 'index', 'split', 'rsplit', '__getitem__'].

I'd argue that "find" is more primitive than "split" -- split is intuitively
implemented using find and slicing, but implementing find using split and
len is unintuitive.  (Of course, "index" can be used instead of "find".)

>> Another problem is
>> that not everybody draws the line in the same place -- how should
>> instances of bytes, bytearray, array.array, memoryview (buffer in 2.6)
>> be treated?
>
> In the followup of the flatten() example, bytes and bytearray should be Strings,
> but array.array and memoryview shouldn't. array.array is really a different kind
> of container rather than a proper string, and as for memoryview... well, since
> it's not documented I don't know what it's supposed to do :-)


This is really a problem -- since the PEP 3118 authors don't seem
to bother, I'll have to write up something based on the PEP, but I
don't know if it is still up-to-date.

Georg



Re: [Python-Dev] Iterable String Redux (aka String ABC)

2008-05-27 Thread Antoine Pitrou

(just my 2 eurocents)

Guido van Rossum  python.org> writes:
> 
> I'm not against this, but so far I've not been able to come up with a
> good set of methods to endow the String ABC with.

If we stay minimalistic we could consider that the three basic operations that
define a string are:
- testing for substring containment
- splitting on a substring into a list of substrings
- slicing in order to extract a substring

Which gives us ['__contains__', 'split', '__getitem__'], and expands intuitively
to ['__contains__', 'find', 'index', 'split', 'rsplit', '__getitem__'].

> Another problem is
> that not everybody draws the line in the same place -- how should
> instances of bytes, bytearray, array.array, memoryview (buffer in 2.6)
> be treated?

In the followup of the flatten() example, bytes and bytearray should be Strings,
but array.array and memoryview shouldn't. array.array is really a different kind
of container rather than a proper string, and as for memoryview... well, since
it's not documented I don't know what it's supposed to do :-)

Regards

Antoine.




Re: [Python-Dev] Iterable String Redux (aka String ABC)

2008-05-27 Thread Benji York
On Tue, May 27, 2008 at 3:42 PM, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> [+python-3000]
>
> On Tue, May 27, 2008 at 12:32 PM, Armin Ronacher
> <[EMAIL PROTECTED]> wrote:
>> Basically *the* problematic situation with iterable strings is something like
>> a `flatten` function that flattens out every iterable object except of 
>> strings.
>> Imagine it's implemented in a way similar to that::
>
> I'm not against this, but so far I've not been able to come up with a
> good set of methods to endow the String ABC with. Another problem is
> that not everybody draws the line in the same place -- how should
> instances of bytes, bytearray, array.array, memoryview (buffer in 2.6)
> be treated?

Maybe the opposite approach would be more fruitful.  Flattening is about
removing nested "containers", so perhaps there should be an ABC that
things like lists and tuples provide, but strings don't.  No idea what
that might be.
-- 
Benji York


Re: [Python-Dev] Iterable String Redux (aka String ABC)

2008-05-27 Thread Guido van Rossum
[+python-3000]

On Tue, May 27, 2008 at 12:32 PM, Armin Ronacher
<[EMAIL PROTECTED]> wrote:
> Strings are currently iterable and it was stated multiple times that this is a
> good idea and shouldn't change.  While I still don't think that that's a good
> idea I would like to propose a solution for the problem many people are
> experiencing by introducing an abstract base class for strings.
>
> Basically *the* problematic situation with iterable strings is something like
> a `flatten` function that flattens out every iterable object except of 
> strings.
> Imagine it's implemented in a way similar to that::
>
>    def flatten(iterable):
>        for item in iterable:
>            try:
>                if isinstance(item, basestring):
>                    raise TypeError()
>                iterator = iter(item)
>            except TypeError:
>                yield item
>            else:
>                for i in flatten(iterator):
>                    yield i
>
> A problem comes up as soon as user defined strings (such as UserString) is
> passed to the function.  In my opinion a good solution would be a "String"
> ABC one could test against.

I'm not against this, but so far I've not been able to come up with a
good set of methods to endow the String ABC with. Another problem is
that not everybody draws the line in the same place -- how should
instances of bytes, bytearray, array.array, memoryview (buffer in 2.6)
be treated?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] ABC issues

2008-05-27 Thread Guido van Rossum
On Tue, May 27, 2008 at 12:16 PM, Armin Ronacher
<[EMAIL PROTECTED]> wrote:
> Hi,
>
> Guido van Rossum  python.org> writes:
>
>> There's no need to register as Sized -- the Sized ABC recognizes
>> classes that define __len__ automatically. The Container class does
>> the same looking for __contains__. Since the deque class doesn't
>> implement __contains__, it is not considered a Container -- correctly
>> IMO.
> True.  deque doesn't implement __contains__.  However "in" still works
> because of the __iter__ fallback.

Sure, but that fallback is slow, and intentionally does not trigger
the 'Container' ABC.

> So from the API's perspective it's
> still compatible, even though it doesn't implement it.  The same probably
> affects old style iterators (__getitem__ with index).  One could argue that
> they are still iterable or containers, but that's harder to check so
> probably not worth the effort.

The ABCs do not intend to capture partial behavioral compatibility
that way -- the intent is to capture interface (in the duck typing
sense). Whether __iter__ is a good enough substitute for __contains__
depends on a lot of things -- surely for a huge list it isn't, and for
an iterator it isn't either (since it modifies the iterator's state).
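
The distinction can be sketched in a few lines. This is only an illustration, written against current Python 3 where the ABCs live in collections.abc; the OnlyIter class is hypothetical and stands in for any type that defines __iter__ but not __contains__.

```python
from collections.abc import Container


class OnlyIter:
    """Hypothetical class that defines __iter__ but not __contains__."""

    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        return iter(self._items)


box = OnlyIter([1, 2, 3])
found_in_box = 2 in box                    # works, via a linear scan through __iter__
is_container = isinstance(box, Container)  # no __contains__, so not a Container

it = iter([1, 2, 3])
found_in_iterator = 2 in it                # consumes items up to and including 2
left_over = list(it)                       # only what the membership test left behind
```

The second half shows why the fallback is a poor substitute on an iterator: the membership test itself changes what the iterator will yield afterwards.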

>> >> Another issue is that builtin types don't accept ABCs currently.  For
>> >> example
>> >> set() | SomeSet() gives a TypeError, SomeSet() | set() however works.
>> >
>> > Pandora's Box -- sure you want to open it?
>>
>> In 3.0 I'd like to; this was my original intent. In 2.6 I think it's
>> not worth the complexity, though I won't complain.
> I would love to help on that as I'm very interested in that feature.

Please do submit patches!

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


[Python-Dev] Iterable String Redux (aka String ABC)

2008-05-27 Thread Armin Ronacher
Hi,

Strings are currently iterable and it was stated multiple times that this is a
good idea and shouldn't change.  While I still don't think that that's a good
idea I would like to propose a solution for the problem many people are
experiencing by introducing an abstract base class for strings.

Basically *the* problematic situation with iterable strings is something like
a `flatten` function that flattens out every iterable object except strings.
Imagine it's implemented in a way similar to this::

    def flatten(iterable):
        for item in iterable:
            try:
                if isinstance(item, basestring):
                    raise TypeError()
                iterator = iter(item)
            except TypeError:
                yield item
            else:
                for i in flatten(iterator):
                    yield i

A problem comes up as soon as a user-defined string (such as UserString) is
passed to the function.  In my opinion a good solution would be a "String"
ABC one could test against.
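
A minimal sketch of what testing against such an ABC might look like, assuming current Python 3 (so str instead of basestring). The String class here is hypothetical and is not part of the standard library; which types get registered with it is exactly the open design question.

```python
from abc import ABC
from collections import UserString


class String(ABC):
    """Hypothetical marker ABC for string-like types (not a stdlib class)."""


# Assumption: these are the types we choose to treat as atomic strings.
String.register(str)
String.register(UserString)


def flatten(iterable):
    """Yield leaf items, treating anything registered as String as a leaf."""
    for item in iterable:
        if isinstance(item, String):
            yield item
            continue
        try:
            iterator = iter(item)
        except TypeError:
            # Not iterable at all, so it is a leaf.
            yield item
        else:
            yield from flatten(iterator)
```

With this in place a user-defined string class only needs one String.register() call (or to inherit from String) to be left intact by flatten().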


Regards,
Armin



Re: [Python-Dev] ABC issues

2008-05-27 Thread Armin Ronacher
Hi,

Guido van Rossum writes:

> There's no need to register as Sized -- the Sized ABC recognizes
> classes that define __len__ automatically. The Container class does
> the same looking for __contains__. Since the deque class doesn't
> implement __contains__, it is not considered a Container -- correctly
> IMO.
True.  deque doesn't implement __contains__.  However "in" still works
because of the __iter__ fallback.  So from the API's perspective it's
still compatible, even though it doesn't implement it.  The same probably
affects old style iterators (__getitem__ with index).  One could argue that
they are still iterable or containers, but that's harder to check so
probably not worth the effort.

> >> Another issue is that builtin types don't accept ABCs currently.  For
> >> example
> >> set() | SomeSet() gives a TypeError, SomeSet() | set() however works.
> >
> > Pandora's Box -- sure you want to open it?
> 
> In 3.0 I'd like to; this was my original intent. In 2.6 I think it's
> not worth the complexity, though I won't complain.
I would love to help on that as I'm very interested in that feature.

Regards,
Armin



Re: [Python-Dev] ABC issues

2008-05-27 Thread Raymond Hettinger

> If you want to use the 3.0 mixins in 2.6, perhaps an alternate set of
> APIs could be imported from the future? E.g. from future_collections
> import Mapping. IIRC a similar mechanism was proposed for some
> built-in functions, even though I see no traces of an implementation
> yet.

Anyone know what happened to the effort to put the 3.0 dict in future_builtins?

That would nicely sync up the ABC with a concrete implementation
and bring 2.6 a little closer to 3.0.


Raymond


Re: [Python-Dev] ABC issues

2008-05-27 Thread Guido van Rossum
On Tue, May 27, 2008 at 10:44 AM, Raymond Hettinger <[EMAIL PROTECTED]> wrote:
>>>> * The 2.6-backported Mapping ABC has the 3.0 dict API,
>>>>  that is, it uses keys() that returns a view etc.
>>>
>>> Curious to hear what Guido thinks about this one.
>>> A nice use of the Mapping ABC is to be able to
>>> get 3.0 behaviors.  I thought that was the whole
>>> point of all these backports.  If the ABC gets altered,
>>> then it just makes the 2-to-3 conversion harder.
>>
>> It's wrong if the ABC doesn't describe the behavior of actual
>> implementations; that is its primary purpose, the mixin class is a
>> nice side benefit.
>
> ISTM, the one purpose of backporting is to make 2.6 closer to 3.0.  Altering
> the API will just make them further apart.

Well, the ABCs have two sides -- they describe the API and they
provide a mix-in. I feel strongly that the primary function of ABCs is
to describe the API (in fact many ABCs do only that, e.g. look at
Sized and Container). I want isinstance({}, collections.Mapping) to be
true.
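
Both sides can be seen in a short sketch, assuming current Python 3 where the ABC is collections.abc.Mapping rather than collections.Mapping. The Squares class is a made-up example: it writes only the three abstract methods and picks up the rest from the mix-in.

```python
from collections.abc import Mapping


class Squares(Mapping):
    """Hypothetical read-only mapping of 0..n-1 to their squares."""

    def __init__(self, n):
        self._n = n

    def __getitem__(self, key):
        if isinstance(key, int) and 0 <= key < self._n:
            return key * key
        raise KeyError(key)

    def __iter__(self):
        return iter(range(self._n))

    def __len__(self):
        return self._n


# API side: the built-in dict is recognized as a Mapping.
dict_is_mapping = isinstance({}, Mapping)

# Mix-in side: keys(), items(), get() and __contains__ come for free.
squares = Squares(4)
pairs = list(squares.items())
```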

If you want to use the 3.0 mixins in 2.6, perhaps an alternate set of
APIs could be imported from the future? E.g. from future_collections
import Mapping. IIRC a similar mechanism was proposed for some
built-in functions, even though I see no traces of an implementation
yet.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] optimization required: .format() is much slower than %

2008-05-27 Thread Eric Smith

Christian Heimes wrote:
> Antoine Pitrou wrote:
>> In order to avoid memory consumption issues there could be a centralized cache
>> as for regular expressions. It makes it easier to handle eviction based on
>> various parameters, and it saves a few bytes for string objects which are never
>> used as a formatting template.
>
> Good idea!
> I suggest you hook into the string interning code and use a similar
> approach.

I don't think parsing the strings is where it spends its time.  I think
the time is spent in object creation, __format__ lookup, and building
the result.  I'm looking into some of these other optimizations now.
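
For anyone who wants to measure this locally, a rough timing harness could look like the following. It is a sketch only: the template strings and the helper names time_percent and time_format are made up for illustration, and no particular result is implied, since the numbers depend heavily on the interpreter version.

```python
import timeit

# Values are bound in setup so the compiler cannot fold the expressions
# into constants before timing starts.
SETUP = "name = 'cart'; count = 3"


def time_percent(number=100000):
    """Seconds taken by `number` runs of %-style formatting."""
    return timeit.timeit("'%s has %d items' % (name, count)",
                         setup=SETUP, number=number)


def time_format(number=100000):
    """Seconds taken by `number` runs of str.format()."""
    return timeit.timeit("'{0} has {1} items'.format(name, count)",
                         setup=SETUP, number=number)


if __name__ == "__main__":
    print("percent:", time_percent())
    print("format: ", time_format())
```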




Re: [Python-Dev] optimization required: .format() is much slower than %

2008-05-27 Thread Christian Heimes
Antoine Pitrou wrote:
> In order to avoid memory consumption issues there could be a centralized cache
> as for regular expressions. It makes it easier to handle eviction based on
> various parameters, and it saves a few bytes for string objects which are never
> used as a formatting template.

Good idea!
I suggest you hook into the string interning code and use a similar
approach.
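
As a pure-Python illustration of the centralized-cache idea (not of how the interpreter would do it in C), one could keep parsed templates in a bounded cache keyed by the template string. Everything named here (parse_template, cached_format, the cache size of 100) is an assumption for the sketch; it relies only on string.Formatter and functools.lru_cache from the standard library, and it does not handle nested replacement fields inside a format spec.

```python
from functools import lru_cache
from string import Formatter

_formatter = Formatter()


@lru_cache(maxsize=100)  # bounded, so rarely used templates get evicted
def parse_template(template):
    """Parse a format string once and remember the pieces."""
    return tuple(_formatter.parse(template))


def cached_format(template, *args, **kwargs):
    """Format using the cached parse result instead of re-parsing."""
    out = []
    auto_index = 0
    for literal, field, spec, conversion in parse_template(template):
        out.append(literal)
        if field is None:      # trailing literal text, no replacement field
            continue
        if field == "":        # auto-numbered "{}" field
            field = str(auto_index)
            auto_index += 1
        value, _ = _formatter.get_field(field, args, kwargs)
        value = _formatter.convert_field(value, conversion)
        out.append(_formatter.format_field(value, spec))
    return "".join(out)
```

Keeping the cache outside the string objects is what makes eviction policy easy to change, which is the point made above about the regular expression cache.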

Christian


Re: [Python-Dev] ABC issues

2008-05-27 Thread Raymond Hettinger

>>> * The 2.6-backported Mapping ABC has the 3.0 dict API,
>>>  that is, it uses keys() that returns a view etc.
>>
>> Curious to hear what Guido thinks about this one.
>> A nice use of the Mapping ABC is to be able to
>> get 3.0 behaviors.  I thought that was the whole
>> point of all these backports.  If the ABC gets altered,
>> then it just makes the 2-to-3 conversion harder.
>
> It's wrong if the ABC doesn't describe the behavior of actual
> implementations; that is its primary purpose, the mixin class is a
> nice side benefit.

ISTM, the one purpose of backporting is to make 2.6 closer to 3.0.
Altering the API will just make them further apart.



Raymond


Re: [Python-Dev] Why is type_modified() in typeobject.c not a public function?

2008-05-27 Thread Guido van Rossum
I'm fine with giving it a public interface. Please submit a patch with
docs included.

On Tue, May 27, 2008 at 9:47 AM, Stefan Behnel <[EMAIL PROTECTED]> wrote:
> [reposting this to python-dev, as it affects both 2.6 and 3.0]
>
> Hi,
>
> when we build extension classes in Cython, we have to first build the type to
> make it available to user code, and then update the type's tp_dict while we
> run the class body code (PyObject_SetAttr() does not work here). In Py2.6+,
> this requires invalidating the method cache after each attribute change, which
> Python does internally using the type_modified() function.
>
> Could this function get a public interface? I do not think Cython is the only
> case where C code wants to modify a type after its creation, and copying the
> code over seems like a hack to me.
>
> Stefan



-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] ABC issues

2008-05-27 Thread Guido van Rossum
On Mon, May 26, 2008 at 4:11 PM, Raymond Hettinger <[EMAIL PROTECTED]> wrote:
>>> Deques do not support count(), insert() or __iadd__().
>>> They should not be registered.
>
>> If it doesn't implement the MutableSequence protocol it still is a Sized
>> container.  However currently it's not registered as a container.
>
> Seems useless to me.  I don't think the intent of the ABC pep was
> to mandate that every class that defines __len__ must be registered
> as Sized.

There's no need to register as Sized -- the Sized ABC recognizes
classes that define __len__ automatically. The Container class does
the same looking for __contains__. Since the deque class doesn't
implement __contains__, it is not considered a Container -- correctly
IMO.
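
A small sketch of that automatic recognition, assuming current Python 3 (collections.abc). The three classes are hypothetical and none of them is registered or derived from an ABC.

```python
from collections.abc import Container, Sized


class HasLen:
    """Defines only __len__."""

    def __len__(self):
        return 0


class HasContains:
    """Defines only __contains__."""

    def __contains__(self, item):
        return False


class Plain:
    """Defines neither method."""


# The ABCs inspect the class for the special method; no register() call needed.
len_is_sized = issubclass(HasLen, Sized)
len_is_container = issubclass(HasLen, Container)
contains_is_container = issubclass(HasContains, Container)
plain_is_sized = isinstance(Plain(), Sized)
```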

>> Another issue is that builtin types don't accept ABCs currently.  For
>> example
>> set() | SomeSet() gives a TypeError, SomeSet() | set() however works.
>
> Pandora's Box -- sure you want to open it?

In 3.0 I'd like to; this was my original intent. In 2.6 I think it's
not worth the complexity, though I won't complain.
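
The working direction can be reproduced with a minimal Set subclass. This is a sketch assuming current Python 3 (collections.abc.Set); ListSet is a hypothetical stand-in for SomeSet. Only the case with the ABC-based set on the left is exercised, because the outcome with the built-in on the left depends on the Python version.

```python
from collections.abc import Set


class ListSet(Set):
    """Hypothetical set backed by a list; only the abstract methods are written."""

    def __init__(self, iterable=()):
        self._items = []
        for item in iterable:
            if item not in self._items:
                self._items.append(item)

    def __contains__(self, item):
        return item in self._items

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)


# __or__ comes from the Set mix-in and accepts any iterable on the right,
# including a built-in set.
union = ListSet([1, 2]) | {2, 3}
```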

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] ABC issues

2008-05-27 Thread Guido van Rossum
On Mon, May 26, 2008 at 11:59 AM, Raymond Hettinger <[EMAIL PROTECTED]> wrote:
>> * The 2.6-backported Mapping ABC has the 3.0 dict API,
>>  that is, it uses keys() that returns a view etc.
>
> Curious to hear what Guido thinks about this one.
> A nice use of the Mapping ABC is to be able to
> get 3.0 behaviors.  I thought that was the whole
> point of all these backports.  If the ABC gets altered,
> then it just makes the 2-to-3 conversion harder.

It's wrong if the ABC doesn't describe the behavior of actual
implementations; that is its primary purpose, the mixin class is a
nice side benefit.

We could make the incompatible mixin classes available separately
though, if you think they're useful.

>> * The 2.6 UserDict is not registered as a mapping.
>
> Since the APIs are not currently the same, it makes
> sense that UserDict is not registered.
> If the Mapping ABC does get changed, only IterableUserDict
> should get registered.  A regular UserDict does not comply.

Fair enough. I recommend fixing the Mapping ABC and registering IterableUserDict.

>> * collections.deque isn't registered as a MutableSequence.
>
> Deques do not support count(), insert() or __iadd__().
> They should not be registered.  General purpose indexing
> into a deque is typically a mis-use of the data structure.
> It was provided only to make it easier to substitute for lists
> in apps that operate only on the ends (i.e. d[0], d[1], d[-1], d[-2]
> but not d[i] to somewhere in the middle).

Hopefully they aren't registered in 3.0 either. :-)

>> If there are no objections, I will correct these issues
>> in the 2.6 and 3.0 branches.
>
> I think none of these changes should be made.

I'm in the middle. :-)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)