Re: [Python-Dev] Why not using the hash when comparing strings?

2012-10-19 Thread Duncan Booth
Hrvoje Niksic  wrote:

> On 10/19/2012 03:22 AM, Benjamin Peterson wrote:
>> It would be interesting to see how common it is for strings which have
>> their hash computed to be compared.
> 
> Since all identifier-like strings mentioned in Python are interned, and 
> therefore have had their hash computed, I would imagine comparing them 
> to be fairly common. After all, strings are often used as makeshift 
> enums in Python.
> 
> On the flip side, those strings are typically small, so a measurable 
> overall speed improvement brought by such a change seems unlikely.

I'm pretty sure it would result in a small slowdown.

Many (most?) of the comparisons against interned identifiers are done as part of
dictionary lookups, and the dictionary lookup code only attempts a string
comparison after it has determined that the hashes match. The contents of
dictionary key strings are only actually compared when the hashes match but the
pointers differ; it is already the case that if the hashes don't match the
strings are never compared.

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 383 (again)

2009-04-28 Thread Duncan Booth
Hrvoje Niksic  wrote:

> Assume a UTF-8 locale.  A file named b'\xff', being an invalid UTF-8 
> sequence, will be converted to the half-surrogate '\udcff'.  However,
> a file named b'\xed\xb3\xbf', a valid[1] UTF-8 sequence, will also be 
> converted to '\udcff'.  Those are quite different POSIX pathnames; how
> will Python know which one it was when I later pass '\udcff' to
> open()? 
> 
> 
> [1]
> I'm assuming that it's valid UTF8 because it passes through Python
> 2.5's '\xed\xb3\xbf'.decode('utf-8').  I don't claim to be a UTF-8
> expert.

I'm not a UTF-8 expert either, but I got bitten by this yesterday. I was 
uploading a file to a Google Search Appliance and it was rejected as 
invalid UTF-8 despite having been encoded into UTF-8 by Python.

The cause was a byte sequence which decoded to a half surrogate, similar to 
your example above. Python will happily decode and encode such sequences but, 
as I found to my cost, other systems reject them.

Reading Wikipedia suggests that Python is wrong to accept these sequences, 
and I think (though I'm not a lawyer) that RFC 3629 also implies this:

"The definition of UTF-8 prohibits encoding character numbers between 
U+D800 and U+DFFF, which are reserved for use with the UTF-16 encoding form 
(as surrogate pairs) and do not directly represent characters."

 and

"Implementations of the decoding algorithm above MUST protect against 
decoding invalid sequences."
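Python 3 later tightened exactly this: its UTF-8 codec refuses lone surrogates
in both directions unless you opt in via an error handler (which is how PEP 383
round-trips undecodable filenames). A small Python 3 illustration:

```python
# Python 3's UTF-8 codec rejects surrogate code points, as RFC 3629 requires.
try:
    b'\xed\xb3\xbf'.decode('utf-8')     # the 3-byte encoding of U+DCFF
    decoded = True
except UnicodeDecodeError:
    decoded = False
print('decoded:', decoded)   # decoded: False

try:
    '\udcff'.encode('utf-8')
    encoded = True
except UnicodeEncodeError:
    encoded = False
print('encoded:', encoded)   # encoded: False

# The 'surrogatepass' error handler is the explicit opt-in:
print('\udcff'.encode('utf-8', 'surrogatepass'))   # b'\xed\xb3\xbf'
```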



Re: [Python-Dev] slightly inconsistent set/list pop behaviour

2009-04-08 Thread Duncan Booth
Andrea Griffini  wrote:

> On Wed, Apr 8, 2009 at 12:57 PM, Jack diederich 
> wrote: 
>> You wrote a program to find the two smallest ints that would have a
>> hash collision in the CPython set implementation?  I'm impressed.
>>  And by impressed I mean frightened.
> 
> ?
> 
> print set([0,8]).pop(), set([8,0]).pop()

If 'smallest ints' means the sum of the absolute values then these are 
slightly smaller:

>>> print set([-1,6]).pop(), set([6,-1]).pop()
6 -1
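For context, the collision here is a CPython implementation detail: hash(-1)
is -2 (because -1 is the error return value in the C API), so -1 and 6 can land
in the same slot of a small table. A quick check (the modulo-8 arithmetic
assumes CPython's minimum 8-slot table size):

```python
# CPython quirk: -1 would collide with the C API's error return value,
# so small integers hash to themselves except that hash(-1) == -2.
print(hash(-1))   # -2
print(hash(6))    # 6

# Both land in slot 6 of an 8-slot table, hence the collision:
print(hash(-1) % 8, hash(6) % 8)   # 6 6
```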



Re: [Python-Dev] Ruby-style Blocks in Python [Pseudo-PEP]

2009-03-08 Thread Duncan Booth
tav  wrote:

> I explain in detail in this blog article:
> 
>   http://tav.espians.com/ruby-style-blocks-in-python.html
> 

"This is also possible in Python but at the needless cost of naming and 
defining a function first"

The cost of defining the function first is probably much less than the cost 
of your __do__ function. Your proposal also seems much more limited than 
passing functions around: Python allows you to pass multiple functions where 
appropriate, or to store them for later calling. 




Re: [Python-Dev] repeated keyword arguments

2008-07-02 Thread Duncan Booth
"Steven D'Aprano" <[EMAIL PROTECTED]> wrote:

> It would be nice to be able to do this:
> 
> defaults = dict(a=5, b=7)
> f(**defaults, a=8)  # override the value of a in defaults
> 
> but unfortunately that gives a syntax error. Reversing the order would 
> override the wrong value. So as Python exists now, no, it's not 
> terribly useful. But it's not inherently a stupid idea.

There is already an easy way to do that using functools.partial, and it is 
documented, therefore presumably deliberate, behaviour: "If additional 
keyword arguments are supplied, they extend and override keywords."

>>> from functools import partial
>>> def f(a=1, b=2, c=3):
...     print a, b, c
...
>>> g = partial(f, b=99)
>>> g()
1 99 3
>>> g(a=100, b=101)
100 101 3
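For the record, later Pythons added another spelling of the desired override:
PEP 448 (Python 3.5+) dict unpacking lets you merge and override before the
call. A Python 3 sketch of both approaches (print is a function there, so the
demo returns tuples instead):

```python
from functools import partial

def f(a=1, b=2, c=3):
    return (a, b, c)

# functools.partial: keywords supplied at call time extend and
# override the stored ones, as documented.
g = partial(f, b=99)
print(g())                # (1, 99, 3)
print(g(a=100, b=101))    # (100, 101, 3)

# PEP 448 (Python 3.5+): build the overridden mapping, then unpack it.
# Note that f(**defaults, a=8) is *still* a TypeError at runtime for
# a duplicated key, so the merge has to happen in the mapping itself.
defaults = dict(a=5, b=7)
print(f(**{**defaults, 'a': 8}))   # (8, 7, 3)
```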





Re: [Python-Dev] Adding start to enumerate()

2008-05-13 Thread Duncan Booth
"Steven D'Aprano" <[EMAIL PROTECTED]> wrote:

> On Mon, 12 May 2008 08:20:51 am Georg Brandl wrote:
>> I believe the following is a common use-case for enumerate()
>> (at least, I've used it quite some times):
>>
>> for lineno, line in enumerate(fileobject):
>>  ...
>>
>> For this, it would be nice to have a start parameter for enumerate().
> 
> Why would it be nice? What would you use it for?
> 
> The only thing I can think of is printing lines with line numbers, and 
> starting those line numbers at one instead of zero. If that's the only 
> use-case, should it require built-in support?
> 
If you are generating paginated output then a function to generate an 
arbitrary page would likely want to enumerate starting at some value larger 
than one.

Of course in that case you'll also want to skip part way through the data, 
but I think it is more likely that you'll want to enumerate the partial 
data (e.g. if it is a database query) rather than slice the enumeration.
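(enumerate() did grow a start parameter, in Python 2.6/3.0.) The pagination
case might be sketched like this; the numbered_page name, the zero-based page
counting and the page_size default are all illustrative:

```python
def numbered_page(rows, page, page_size=10):
    """Number the items of one page, assuming `rows` already holds just
    that page's data (e.g. fetched with a LIMIT/OFFSET query)."""
    first = page * page_size + 1          # pages counted from 0 here
    return list(enumerate(rows, start=first))

# Page 2 with 10 rows per page: numbering starts at 21.
print(numbered_page(['x', 'y', 'z'], page=2))   # [(21, 'x'), (22, 'y'), (23, 'z')]
```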



Re: [Python-Dev] Complexity documentation request

2008-03-13 Thread Duncan Booth
Dimitrios Apostolou <[EMAIL PROTECTED]> wrote:

> On another note which sorting algorithm is python using? Perhaps we can 
> add this as a footnote. I always thought it was quicksort, with a worst 
> case of O(n^2).

See http://svn.python.org/projects/python/trunk/Objects/listsort.txt



Re: [Python-Dev] Declaring setters with getters

2007-11-02 Thread Duncan Booth
Greg Ewing <[EMAIL PROTECTED]> wrote:

> Fred Drake wrote:
>>    @property
>>    def attribute(self):
>>        return 42
>> 
>>    @property.set
>>    def attribute(self, value):
>>        self._ignored = value
> 
> Hmmm... if you were allowed general lvalues as the target of a
> def, you could write that as
> 
>def attribute.set(self, value):
>  ...
> 
Dotted names would be sufficient rather than general lvalues.

I like this, I think it looks cleaner than the other options, especially if 
you write both getter and setter in the same style:

attribute = property()

def attribute.fget(self):
    return 42

def attribute.fset(self, value):
    self._ignored = value




Re: [Python-Dev] Declaring setters with getters

2007-11-01 Thread Duncan Booth
"Steven Bethard" <[EMAIL PROTECTED]> wrote in news:[EMAIL PROTECTED]:

> On 10/31/07, Fred Drake <[EMAIL PROTECTED]> wrote:
>> If I had to choose built-in names, though, I'd prefer "property",
>> "propset", "propdel".  Another possibility that seems reasonable
>> (perhaps a bit better) would be:
>>
>>    class Thing(object):
>>
>>        @property
>>        def attribute(self):
>>            return 42
>>
>>        @property.set
>>        def attribute(self, value):
>>            self._ignored = value
>>
>>        @property.delete
>>        def attribute(self):
>>            pass
> 
> +1 on this spelling if possible.  Though based on Guido's original
> recipe it might have to be written as::
> 
>   @property.set(attribute)
>   def attribute(self, value):
>       self._ignored = value
> 
It *can* be written as Fred suggested: see 
http://groups.google.co.uk/group/comp.lang.python/browse_thread/thread/b442d08c9a019a8/8a381be5edc26340

However that depends on hacking the stack frames, so the 
implementation probably isn't appropriate here.


Re: [Python-Dev] generators and with

2007-05-13 Thread Duncan Booth
"tomer filiba" <[EMAIL PROTECTED]> wrote in
news:[EMAIL PROTECTED]: 

> why not add __enter__ and __exit__ to generator objects?
> it's really a trivial addition: __enter__ returns self, __exit__ calls
> close().
> it would be used to ensure close() is called when the generator is
> disposed, instead of doing that manually. typical usage would be:
> 
> with mygenerator() as g:
>     g.next()
>     bar = g.send("foo")
> 

You can already ensure that the close() method is called quite easily:

with contextlib.closing(mygenerator()) as g:
    g.next()
    bar = g.send("foo")

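A complete, runnable version of the same pattern in Python 3 (the generator
here is a made-up example, and g.next() is spelled next(g)):

```python
import contextlib

def mygenerator():
    # Illustrative generator; the finally clause runs when close()
    # is called by contextlib.closing on exit from the with block.
    try:
        received = yield 'ready'
        yield 'got ' + received
    finally:
        print('generator closed')

with contextlib.closing(mygenerator()) as g:
    next(g)
    bar = g.send("foo")

print(bar)   # got foo
```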


Re: [Python-Dev] whitespace normalization

2007-04-25 Thread Duncan Booth
"Neal Norwitz" <[EMAIL PROTECTED]> wrote in 
news:[EMAIL PROTECTED]:

> I just checked in a whitespace normalization change that was way too
> big.  Should this task be automated?

IMHO, changing whitespace retrospectively in a version control system is a 
bad idea.

How much overhead would it be to have a checkin hook which compares each 
modified file against the output of running reindent.py over the same file 
and rejects the checkin if anything changed? (With, of course, an appropriate 
message suggesting the use of reindent.py before reattempting the checkin.)

That way the whitespace ought to stay normalized, so you shouldn't need a 
separate cleanup step and you won't be breaking diff and blame for the 
sources (and if the reindent does ever break anything it should be more 
traceable).
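The core of such a hook might be sketched as below (hypothetical: the real
reindent.py normalizes more than this, and the hook wiring depends on the
version control system):

```python
def needs_reindent(text):
    """Stand-in for 'would reindent.py change this file?': here we only
    flag tabs and trailing whitespace, which reindent.py also removes."""
    for line in text.splitlines():
        if '\t' in line or line != line.rstrip():
            return True
    return False

# A clean file passes; a file with tabs or trailing blanks is rejected.
print(needs_reindent("def f():\n    return 1\n"))   # False
print(needs_reindent("def f():\n\treturn 1 \n"))    # True
```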


Re: [Python-Dev] __del__ unexpectedly being called twice

2006-08-18 Thread Duncan Booth
"Terry Reedy" <[EMAIL PROTECTED]> wrote in
news:[EMAIL PROTECTED]: 

> 
> "Duncan Booth" <[EMAIL PROTECTED]> wrote in message 
> news:[EMAIL PROTECTED]
>> There's a thread on comp.lang.python at the moment under the subject
>> "It is
>> __del__ calling twice for some instances?" which seems to show that
>> when releasing a long chain of old-style classes every 50th
>> approximately has its finaliser called twice. I've verified that this
>> happens on both Python
>> 1.4 and 1.5.
> 
> Should we assume you meant 2.4 and 2.5?
> 
Probably. 2.5b3 to be a bit more precise.



[Python-Dev] __del__ unexpectedly being called twice

2006-08-18 Thread Duncan Booth
There's a thread on comp.lang.python at the moment under the subject "It is 
__del__ calling twice for some instances?" which seems to show that when 
releasing a long chain of old-style classes every 50th approximately has 
its finaliser called twice. I've verified that this happens on both Python 
1.4 and 1.5.


My guess is that there's a bug in the trashcan mechanism: calling the 
__del__ method means creating a descriptor, and if that descriptor gets 
queued in the trashcan then releasing it calls the __del__ method a second 
time. I'm not sure if there is going to be a particularly easy fix for 
that.

Would someone who knows this code (instance_dealloc in classobject.c) like 
to have a look at it, should I just submit a bug report, or isn't it worth 
bothering about?


The code which exhibits the problem:

#!/usr/local/bin/python -d
# -*- coding: koi8-u -*-
import sys

class foo:
    def __init__(self, other):
        self.other = other
        self._deleted = False

        global ini_cnt
        ini_cnt += 1

    def __del__(self):
        if self._deleted:
            print "aargh!"
        self._deleted = True
        global del_cnt
        del_cnt += 1
        print "del", del_cnt, "at", id(self)

def stat():
    print "-"*20
    print "ini_cnt = %d" % ini_cnt
    print "del_cnt = %d" % del_cnt
    print "difference = %d" % (ini_cnt - del_cnt)

ini_cnt = 0
del_cnt = 0
loop_cnt = 55

a = foo(None)

for i in xrange(loop_cnt):
    a = foo(a)

stat()
a = None
stat()


The original thread is at:
http://groups.google.com/group/comp.lang.python/browse_thread/thread/293acf433a39583b/bfd4af9c6008a34e



Re: [Python-Dev] segmentation fault in Python 2.5b3 (trunk:51066)

2006-08-03 Thread Duncan Booth
Thomas Heller <[EMAIL PROTECTED]> wrote in
news:[EMAIL PROTECTED]: 

>>  /* if no docstring given and the getter has one, use that one */
>>  /* if no docstring given and the getter has one, use that one */
>>  if ((doc == NULL || doc == Py_None) && get != NULL &&
>>      PyObject_HasAttrString(get, "__doc__")) {
>>      if (!(get_doc = PyObject_GetAttrString(get, "__doc__")))
>>          return -1;
>>      Py_DECREF(get_doc); /* it is INCREF'd again below */
>>      ^^
>>      doc = get_doc;
>>  }
>>
>>  Py_XINCREF(get);
>>  Py_XINCREF(set);
>>  Py_XINCREF(del);
>>  Py_XINCREF(doc);
>> 
> 
> A strange programming style, if you ask me, and I wonder why Coverity
> doesn't complain about it.
> 
Does Coverity recognise objects on Python's internal pools as deallocated? 

If not it wouldn't complain because all that the Py_DECREF does is link the 
block into a pool's freeblock list so any subsequent reference from the 
Py_XINCREF is still a reference to allocated memory.


[Off topic:
Microsoft have (or had?) a similarly screwy bit in their ActiveX ATL 
libraries: a newly created ActiveX object has its reference count 
incremented before calling FinalConstruct, and then decremented to 0 
(using a method which decrements the reference count but doesn't free the 
object) before being incremented again. If in the meantime you increment 
and decrement the reference count from another thread then it goes bang.]

The moral is to regard the reference counting rules as law: no matter how 
sure you are that you can cheat, don't or you'll regret it.



Re: [Python-Dev] reference leaks, __del__, and annotations

2006-03-31 Thread Duncan Booth
"Jim Jewett" <[EMAIL PROTECTED]> wrote in 
news:[EMAIL PROTECTED]:

> As a strawman proposal:
> 
> deletes = [(obj.__del__.cycle, obj) for obj in cycle
>            if hasattr(obj, "__del__") and
>               hasattr(obj.__del__, "cycle")]
> deletes.sort()
> for (cycle, obj) in deletes:
>     obj.__del__()
> 
> Lightweight __del__ methods (such as most resource managers) could set
> the cycle attribute to True, and thereby ensure that they won't cause
> unbreakable cycles.  Fancier object frameworks could use different
> values for the cycle attribute.  Any object whose __del__ is not
> annotated will still be at least as likely to get finalized as it is

That doesn't look right to me.

Surely if you have a cycle what you want to do is to pick just *one* of the 
objects in the cycle and break the link which makes it participate in the 
cycle. That should be sufficient to cause the rest of the cycle to collapse 
with __del__ methods being called from the normal refcounting mechanism.

So something like this:

for obj in cycle:
    if hasattr(obj, "__breakcycle__"):
        obj.__breakcycle__()
        break

Every object which knows it can participate in a cycle then has the option 
of defining a method which it can use to tear down the cycle. e.g. by 
releasing the resource and then deleting all of its attributes, but no 
guarantees are made over which obj has this method called. An object with a 
__breakcycle__ method would have to be extra careful as its methods could 
still be called after it has broken the cycle, but it does mean that the 
responsibilities are in the right place (i.e. defining the method implies 
taking that into account).
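A toy illustration of the proposed protocol (the __breakcycle__ name comes
from the strawman above; the Node class and released flag are invented for
the example):

```python
class Node:
    def __init__(self):
        self.next = None
        self.released = False

    def __breakcycle__(self):
        # Release resources and drop the reference that closes the cycle,
        # letting normal refcounting collect the rest of the chain.
        self.released = True
        self.next = None

# Build a two-node cycle, then break it at one arbitrary member.
a, b = Node(), Node()
a.next, b.next = b, a

cycle = [a, b]
for obj in cycle:
    if hasattr(obj, "__breakcycle__"):
        obj.__breakcycle__()
        break

print(a.released, a.next)   # True None
```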


Re: [Python-Dev] The decorator(s) module

2006-02-11 Thread Duncan Booth
Georg Brandl <[EMAIL PROTECTED]> wrote in news:[EMAIL PROTECTED]:

> Unfortunately, a @property decorator is impossible...
> 

It all depends what you want (and whether you want the implementation to be 
portable to other Python implementations). Here's one possible but not 
exactly portable example:

from inspect import getouterframes, currentframe
import unittest

class property(property):
    @classmethod
    def get(cls, f):
        locals = getouterframes(currentframe())[1][0].f_locals
        prop = locals.get(f.__name__, property())
        return cls(f, prop.fset, prop.fdel, prop.__doc__)

    @classmethod
    def set(cls, f):
        locals = getouterframes(currentframe())[1][0].f_locals
        prop = locals.get(f.__name__, property())
        return cls(prop.fget, f, prop.fdel, prop.__doc__)

    @classmethod
    def delete(cls, f):
        locals = getouterframes(currentframe())[1][0].f_locals
        prop = locals.get(f.__name__, property())
        return cls(prop.fget, prop.fset, f, prop.__doc__)

class PropTests(unittest.TestCase):
    def test_setgetdel(self):
        class C(object):
            def __init__(self, colour):
                self._colour = colour

            @property.set
            def colour(self, value):
                self._colour = value

            @property.get
            def colour(self):
                return self._colour

            @property.delete
            def colour(self):
                self._colour = 'none'

        inst = C('red')
        self.assertEquals(inst.colour, 'red')
        inst.colour = 'green'
        self.assertEquals(inst._colour, 'green')
        del inst.colour
        self.assertEquals(inst._colour, 'none')

if __name__ == '__main__':
    unittest.main()


Re: [Python-Dev] Path PEP and the division operator

2006-02-05 Thread Duncan Booth
Nick Coghlan <[EMAIL PROTECTED]> wrote in
news:[EMAIL PROTECTED]: 

> Duncan Booth wrote:
>> I'm not convinced by the rationale given why atime,ctime,mtime and
>> size are methods rather than properties but I do find this PEP much
>> more agreeable than the last time I looked at it.
> 
> A better rationale for doing it is that all of them may raise
> IOException. It's rude for properties to do that, so it's better to
> make them methods instead. 

Yes, that rationale sounds good to me.

> 
> That was a general guideline that came up the first time adding Path
> was proposed - if the functionality involved querying or manipulating
> the actual filesystem (and therefore potentially raising IOError),
> then it should be a method. If the operation related solely to the
> string representation, then it could be a property.

Perhaps Bjorn could add that to the PEP?


Re: [Python-Dev] Path PEP and the division operator

2006-02-04 Thread Duncan Booth
BJörn Lindqvist <[EMAIL PROTECTED]> wrote in
news:[EMAIL PROTECTED]: 

> On 2/4/06, Guido van Rossum <[EMAIL PROTECTED]> wrote:
>> I won't even look at the PEP as long as it uses / or // (or any other
>> operator) for concatenation.
> 
> That's good, because it doesn't. :)
> http://www.python.org/peps/pep-0355.html 
> 
No, but it does say that / may be reintroduced 'if the BFDL so desires'. I 
hope that doesn't mean the BDFL may be overruled. :^)

I'm not convinced by the rationale given why atime,ctime,mtime and size are 
methods rather than properties but I do find this PEP much more agreeable 
than the last time I looked at it.


Re: [Python-Dev] properties and block statement

2005-10-19 Thread Duncan Booth
Stefan Rank <[EMAIL PROTECTED]> wrote in news:[EMAIL PROTECTED]:

> I think there is no need for a special @syntax for this to work.
> 
> I suppose it would be possible to allow a trailing block after any 
> function invocation, with the effect of creating a new namespace that 
> gets treated as containing keyword arguments.
> 

I suspect that without any syntax changes at all it will be possible (for 
some stack crawling implementation of 'propertycontext', and assuming 
nobody makes property objects immutable) to do:

class C(object):
    with propertycontext() as x:
        doc = """ Yay for property x! """
        def fget(self):
            return self._x
        def fset(self, value):
            self._x = value

For inheritance you would have to specify the base property:

class D(C):
    with propertycontext(C.x) as x:
        def fset(self, value):
            self._x = value + 1


propertycontext could look something like:

import sys
from contextlib import contextmanager

@contextmanager
def propertycontext(parent=None):
    classframe = sys._getframe(2)
    cvars = classframe.f_locals
    marker = object()
    keys = ('fget', 'fset', 'fdel', 'doc')
    old = [cvars.get(key, marker) for key in keys]

    if parent:
        pvars = [getattr(parent, key) for key in
                 ('fget', 'fset', 'fdel', '__doc__')]
    else:
        pvars = [None]*4

    args = dict(zip(keys, pvars))

    prop = property()
    try:
        yield prop
        for key, orig in zip(keys, old):
            v = cvars.get(key, marker)
            if v is not orig:
                args[key] = v
        prop.__init__(**args)
    finally:
        for k, v in zip(keys, old):
            if v is marker:
                if k in cvars:
                    del cvars[k]
            else:
                cvars[k] = v


Re: [Python-Dev] bug in urlparse

2005-09-06 Thread Duncan Booth
[EMAIL PROTECTED] wrote in news:[EMAIL PROTECTED]:

> According to RFC 2396[1] section 5.2:
> 
>   g) If the resulting buffer string still begins with one or more
>  complete path segments of "..", then the reference is
>  considered to be in error.  Implementations may handle this
>  error by retaining these components in the resolved path (i.e.,
>  treating them as part of the final URI), by removing them from
>  the resolved path (i.e., discarding relative levels above the
>  root), or by avoiding traversal of the reference.
> 
> If I read this right, it explicitly allows the urlparse.urljoin behavior
> ("handle this error by retaining these components in the resolved path").
> 

Yes, the urljoin behaviour is explicitly allowed; however, it is not the most 
commonly implemented of the permitted behaviours. Both IE and Mozilla/Firefox 
handle this error by stripping the spurious '..' elements from the front of 
the path. Apache, and I hope other web servers, use the third permitted 
method, i.e. rejecting requests for these invalid URLs.

The net effect is that on some sites a Python spider (e.g. webchecker.py) 
will produce a large number of error messages for links which browsers 
actually resolve successfully. (At least, that's how I first noticed this 
particular problem.) Depending on your reasons for spidering a site, this 
can be either a good thing or an annoyance.
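For comparison, Python's behaviour later changed sides here: urllib.parse.urljoin
in Python 3 follows RFC 3986, which replaced RFC 2396 and removes the excess
'..' segments during reference resolution, matching the browser behaviour
described above:

```python
from urllib.parse import urljoin

# One of RFC 3986's "abnormal" reference examples (section 5.4.2):
# the spurious '..' segments are discarded, not retained.
print(urljoin('http://a/b/c/d;p?q', '../../../g'))   # http://a/g
```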


Re: [Python-Dev] Re: anonymous blocks

2005-04-27 Thread Duncan Booth
Jim Fulton <[EMAIL PROTECTED]> wrote in news:[EMAIL PROTECTED]:

>> No, the return sets a flag and raises StopIteration which should make
>> the iterator also raise StopIteration at which point the real return
>> happens. 
> 
> Only if exc is not None
> 
> The only return in the pseudocode is inside "if exc is not None".
> Is there another return that's not shown? ;)
> 

Ah yes, I see now what you mean. 

I would think that the relevant psuedo-code should look more like:

except StopIteration:
    if ret:
        return exc
    if exc is not None:
        raise exc   # XXX See below
    break


Re: [Python-Dev] Re: anonymous blocks

2005-04-27 Thread Duncan Booth
Jim Fulton <[EMAIL PROTECTED]> wrote in news:[EMAIL PROTECTED]:

> Guido van Rossum wrote:
>> I've written a PEP about this topic. It's PEP 340: Anonymous Block
>> Statements (http://python.org/peps/pep-0340.html).
>> 
> Some observations:
> 
> 1. It looks to me like a bare return or a return with an EXPR3 that
> happens 
> to evaluate to None inside a block simply exits the block, rather
> than exiting a surrounding function. Did I miss something, or is
> this a bug?
> 

No, the return sets a flag and raises StopIteration which should make the 
iterator also raise StopIteration at which point the real return happens.

If the iterator fails to re-raise the StopIteration exception (the spec 
only says it should, not that it must) I think the return would be ignored, 
but a subsequent exception would then get converted into a return value. I 
think the flag needs to be reset to avoid this case.

Also, I wonder whether other exceptions from next() shouldn't be handled a 
bit differently. If BLOCK1 throws an exception, and this causes the 
iterator to also throw an exception, then one exception will be lost. I 
think it would be better to propagate the original exception rather than 
the second one.

So something like (added lines to handle both of the above):

itr = EXPR1
exc = arg = None
ret = False
while True:
    try:
        VAR1 = next(itr, arg)
    except StopIteration:
        if exc is not None:
            if ret:
                return exc
            else:
                raise exc   # XXX See below
        break
+   except:
+       if ret or exc is None:
+           raise
+       raise exc   # XXX See below
+   ret = False
    try:
        exc = arg = None
        BLOCK1
    except Exception, exc:
        arg = StopIteration()