Re: [Python-Dev] New Py_UNICODE doc

2005-05-08 Thread Martin v. Löwis
M.-A. Lemburg wrote:
> Unicode has many code points that are meant only for composition
> and don't have any standalone meaning, e.g. a combining acute
> accent (U+0301), yet they are perfectly valid code points -
> regardless of UCS-2 or UCS-4. It is easily possible to break
> such a combining sequence using slicing, so the most
> often presented argument for using UCS-4 instead of UCS-2
> (+ surrogates) is rather weak if seen by daylight.

I disagree. It is not just about slicing, it is also about
searching for a character (either through the "in" operator,
or through regular expressions). If you define an SRE character
class, such a character class cannot hold a non-BMP character
in UTF-16 mode, but it can in UCS-4 mode. Consequently,
implementing XML's lexical classes (such as Name, NCName, etc.)
is much easier in UCS-4 than it is in UCS-2. In this case,
combining characters do not matter much, because the XML
spec is defined in terms of Unicode coded characters, causing
combining characters to appear as separate entities for lexical
purposes (unlike half surrogates).
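
To make the character-class point concrete, here is a minimal sketch in present-day Python 3, where str is indexed by code point (comparable to the UCS-4 build discussed above). The helper name utf16_units and the choice of the Deseret block (U+10400 to U+1044F) are illustrative assumptions, not something taken from this thread.

```python
import re
import struct


def utf16_units(text):
    """Return the 16-bit code units of text when encoded as UTF-16."""
    data = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))


# A character class containing only non-BMP characters (assumed example range).
DESERET = re.compile("[\U00010400-\U0001044F]")

ch = "\U00010400"

# With code-point strings the class matches the character as one unit.
matches_code_point = DESERET.fullmatch(ch) is not None

# In UTF-16 the same character is two code units (a surrogate pair), so a
# class that matches one code unit at a time cannot contain it.
units = utf16_units(ch)
```

Here units holds two surrogate code units for a single character, which is the situation a UTF-16 character class cannot express directly.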

Regards,
Martin
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] New Py_UNICODE doc

2005-05-08 Thread Martin v. Löwis
M.-A. Lemburg wrote:
> I believe that it would be more appropriate to adjust the _tkinter
> module to adapt to the TCL Unicode size rather than
> forcing the complete Python system to adapt to TCL - I don't
> really see the point in an optional extension module
> defining the default for the interpreter core.

_tkinter currently supports, for a UCS-2 Tcl, both UCS-2 and UCS-4
Python. For a UCS-4 Tcl, it requires Python also to be UCS-4.
Contributions to support the missing case are welcome.

> At the very least, this should be a user controlled option.

It is: by passing --enable-unicode=ucs2, you can force Python
to use UCS-2 even if Tcl is UCS-4, with the result that
_tkinter cannot be built anymore (and compilation fails
with an #error).

> Otherwise, we might as well use sizeof(wchar_t) as basis
> for the default Unicode size. This at least, would be
> a much more reasonable choice than whatever TCL uses.

The goal of the build process is to provide as many extension
modules as possible (given the set of headers and libraries
installed), and _tkinter is an important extension module
because IDLE depends on it.

Regards,
Martin


Re: [Python-Dev] New Py_UNICODE doc

2005-05-08 Thread Martin v. Löwis
Nicholas Bastin wrote:
> I don't consider either alternative useless (well, I consider UCS-2 to
> be largely useless in the general case, but as we've already discussed
> here, Python isn't really UCS-2).  However, I would be a lot happier if
> we just chose *one*, and all Python's used that one.  This would make
> extension module distribution a lot easier.

Why is that? For a binary distribution, you have to know the target
system in advance, so you also know what size the Unicode type has.
For example, on Redhat 9.x, and on Debian Sarge, /usr/bin/python
uses a UCS-4 Unicode type. As you have to build binaries specifically
for these target systems (because of dependencies on the C library,
and perhaps other libraries), building the extension module *on*
the target system will just do the right thing.

> I'd prefer UTF-16, but I would be perfectly happy with UCS-4.

-1 on the idea of dropping one alternative. They are both used
(on different systems), and people rely on both being supported.

Regards,
Martin


Re: [Python-Dev] PEP 340: Deterministic Finalisation (new PEP draft, either a competitor or update to PEP 340)

2005-05-08 Thread Josiah Carlson

Ron Adam <[EMAIL PROTECTED]> wrote:
> 
> Josiah Carlson wrote:
> >>I think a completely separate looping or non-looping construct would be 
> >>better for the finalization issue, and maybe can work with classes with 
> >>__exit__ as well as generators.
> > 
> > From what I understand, the entire conversation has always stated that
> > class-based finalized objects and generator-based finalized objects will
> > both work, and that any proposal that works for one, but not the other,
> > is not sufficient.
> 
> That's good to hear.  There seems to be some confusion as to whether or 
> not 'for's will do finalizing.  So I was trying to stress I think 
> regular 'for' loops should not finalize. They should probably give an 
> error if given an object with a try-finally in it or an __exit__ method. 
> I'm not sure what the current opinion on that is.  But I didn't see it 
> in any of the PEPs.

It's not a matter of 'will they be finalized', but instead a matter of
'will they be finalized in a timely manner'.  From what I understand,
upon garbage collection, any generator-based resource will be finalized
via __exit__/next(exception)/... and any class-based resource will have
its __del__ method called (as long as it is well-behaved), which can be
used to call __exit__...


> >>Having it loop has the advantage of making it break out in a better 
> >>behaved way.
> > 
> > What you have just typed is nonsense.  Re-type it and be explicit.
> 
> It was a bit brief, sorry about that. :-)
> 
> To get a non-looping block to loop, you will need to put it in a loop or 
> put a loop in it.
> 
> In the first case, doing a 'break' in the block doesn't exit the loop. 
> so you need to add an extra test for that.
> 
> In the second case, doing a 'break' in the loop does exit the block, but 
> finishes any code after the loop.  So you may need an extra case in that 
> case.
> 
> Having a block that loops can simplify these conditions, in that a break 
> always exits the body of the block and stops the loop.  A 'continue' can 
> be used to skip the end of the block and start the next loop early.
> 
> And you still have the option to put the block in a loop or loops in the 
> block and they will work as they do now.
> 
> I hope that clarifies what I was thinking a bit better.


That is the long-standing nested loops 'issue', which is not going to be
solved here, nor should it be.

I am not sure that any solution to the issue will be sufficient for
everyone involved.  The closest thing to a generic solution I can come
up with would be to allow the labeling of for/while loops, and to allow
"break <label>" and "continue <label>": the latter continues that loop
(breaking all other loops currently nested within), and the former
breaks that loop (as well as all other loops currently nested within).

Perhaps something like...

while ... label 'foo':
    for ... in ... label 'goo':
        block ... label 'hoo':
            if ...:
                #equivalent to continue 'hoo'
                continue
            elif ...:
                continue 'goo'
            elif ...:
                continue 'foo'
            else:
                break 'foo'

Does this solve the nested loop problem?  Yes.  Do I like it?  Not
really; three keywords in a single for/block statement is pretty awful.
On the upside, 'label' doesn't need to be a full-on keyword (it can be a
partial keyword like 'as' still seems to be).

Enough out of me, good night,
 - Josiah



Re: [Python-Dev] New Py_UNICODE doc

2005-05-08 Thread Martin v. Löwis
Nicholas Bastin wrote:
>> -1. This breaks existing documentation and usage, and provides only
>> minimum value.
> 
> 
> Have you been missing this conversation?  UTF-16 is *WHAT PYTHON
> CURRENTLY IMPLEMENTS*.  The current documentation is flat out wrong. 
> Breaking that isn't a big problem in my book.

The documentation I refer to is the one that says the equivalent of

'configure takes an option --enable-unicode, with the possible
values "ucs2", "ucs4", "yes" (equivalent to no argument),
and  "no" (equivalent to --disable-unicode)'

*THIS* documentation would break. This documentation is factually
correct at the moment (configure does indeed take these options),
and people rely on them in automatic build processes. Changing
configure options should not be taken lightly, even if they
may result from a "wrong mental model". By that rule, --with-suffix
should be renamed to --enable-suffix, --with-doc-strings to
--enable-doc-strings, and so on. However, the nitpicking that
underlies the desire to rename the option should be ignored
in favour of backwards compatibility.

Changing the documentation that goes along with the option
would be fine.

> It provides more than minimum value - it provides the truth.

No. It is just a command line option. It could be named
--enable-quirk=(quork|quark), and would still select UTF-16.
Command line options provide no truth - they don't even
provide statements.

>> With --enable-unicode=ucs2, Python's Py_UNICODE does *not* start
>> supporting the full Unicode ccs the same way it supports UCS-2.
> 
> I can't understand what you mean by this.  My point is that if you
> configure python to support UCS-2, then it SHOULD NOT support surrogate
> pairs.  Supporting surrogate pairs is the purview of variable width
> encodings, and UCS-2 is not among them.

So you suggest renaming it to --enable-unicode=utf16, right?
My point is that a Unicode type with UTF-16 would correctly
support all assigned Unicode code points, which the current
2-byte implementation doesn't. So --enable-unicode=utf16 would
*not* be the truth.

Regards,
Martin


Re: [Python-Dev] New Py_UNICODE doc

2005-05-08 Thread Martin v. Löwis
Nicholas Bastin wrote:
> All of my proposals for what to change the documentation to have been 
> shot down by Martin.  If someone has better verbiage that they'd like 
> to see, I'd be perfectly happy to patch the doc.

I don't look into the specific wording - you speak English much better
than I do. What I care about is that this part of the documentation
should be complete and precise. I.e. statements like "should not make
assumptions" might be fine, as long as they are still followed by
a precise description of what the code currently does. So it should
mention that the representation can be either 2 or 4 bytes, that
the strings "ucs2" and "ucs4" can be used to select one of them,
that it is always 2 bytes on Windows, that 2 bytes means that non-BMP
characters can be represented as surrogate pairs, and so on.
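
As a sketch of how a program could find out which of the two representations it is running on, sys.maxunicode can be inspected. The function name unicode_build is hypothetical. On the 2-byte builds discussed here the value is 0xFFFF, on 4-byte builds it is 0x10FFFF, and since Python 3.3 it is always 0x10FFFF regardless of build options.

```python
import sys


def unicode_build():
    """Return "ucs4" if a str element can hold any code point, else "ucs2"."""
    # Narrow (2-byte) builds report 0xFFFF and store non-BMP characters
    # as surrogate pairs; wide builds report 0x10FFFF.
    return "ucs4" if sys.maxunicode > 0xFFFF else "ucs2"


build = unicode_build()
```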

Regards,
Martin


Re: [Python-Dev] PEP 340: Deterministic Finalisation (new PEP draft, either a competitor or update to PEP 340)

2005-05-08 Thread Eric Nieuwland
Josiah Carlson wrote:
> Eric Nieuwland <[EMAIL PROTECTED]> wrote:
>> I don't know. Using 'del' in that place seems awkward to me.
>> Why not use the following rule:
>>  for [VAR in] EXPR:
>>      SUITE
>> If EXPR is an iterator, no finalisation is done.
>> If EXPR is not an iterator, it is created at the start and destroyed
>> at the end of the loop.
>
> You should know why that can't work.  If I pass a list, is a list an
> iterator?  No, but it should neither be created nor destroyed before or
> after.

I suggested creating AN ITERATOR FOR THE LIST and destroying that at the 
end. The list itself remains untouched.
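
A minimal sketch of that distinction in present-day Python: iter() on a list returns a separate iterator object, and consuming or discarding that iterator leaves the list as it was. The variable names are illustrative only.

```python
items = [1, 2, 3]

# The for statement implicitly does this: it asks the list for an iterator.
it = iter(items)

# The iterator is a distinct object from the list it walks over.
is_same_object = it is items

consumed = list(it)   # exhaust the iterator
del it                # discard the iterator; the list is not affected
```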

--eric



Re: [Python-Dev] PEP 340: Deterministic Finalisation (new PEP draft, either a competitor or update to PEP 340)

2005-05-08 Thread Nick Coghlan
Ron Adam wrote:
> Question:  Is the 'for' in your case iterating over a sequence? or is it 
> testing for an assignment to determine if it should continue?

Iterating over a sequence. If it's single-pass (and always single pass), you 
should use a user defined statement instead.

> The difference is slight I admit, and both views can be said to be true 
> for 'for' loops iterating over lists also.  But maybe looking at it as a 
> truth test of getting something instead of an iteration over a sequence 
> would fit better?  When a variable to assign is not supplied then the 
> test would be of a private continue-stop variable in the iterator or a 
> StopIteration exception.

No, that's the use case for user defined statements - if __enter__ raises 
TerminateBlock, then the body of the statement is not executed. What the 
for-loop part of the redrafted PEP is about is whether or not there should be 
an easy way to say "iterate over this iterator, and finalise it afterwards, 
regardless of how the iteration is ended", rather than having to use a 
try/finally block or a user defined statement for that purpose.

I think I need to reorder those two sections - introduce user-defined 
statements first, then consider whether or not to add direct finalisation 
support to for loops.

> If the keyword chosen is completely different from 'for' or 'while', 
> then it doesn't need a 'del' or 'finally' as that can be part of the new 
> definition of whatever keyword is chosen.

That's the technique suggested for the single-pass user defined statements. 
However, a 'for loop with finalisation' is *still fundamentally an iterative 
loop*, and the syntax should reflect that.

> So you might consider 'do', Guido responded with the following the other 
> day:
[snip quote from Guido]
> So it's not been ruled out, or followed through with, as far as I know. 
> And I think it will work for both looping and non looping situations.

The same keyword cannot be used for the looping vs non-looping construct, 
because of the effect on the semantics of break and continue statements.

The non-looping construct is the more fundamental of the two, since it can 
replace any current try/except/else/finally boilerplate, without any concern 
over whether or not the contained code uses break or continue statements. A 
looping construct alters the meanings of those statements.

>>The last option is to leave finalisation out of the 'for' loop syntax, and 
>>introduce a user defined statement to handle the finalisation:
> 
> Yes, leaving it out of 'for' loop syntax is good.
> 
> I don't have an opinion on user defined statements yet.  But I think 
> they would be somewhat slower than a built in block that does the same 
> thing.

What do you mean by 'built in block'? The user defined statements of the PEP 
redraft are simply a non-looping version of PEP 340's anonymous block 
statements.

> Oops, meant that to say 'for-else' above ...
> 
> The 'else' is new isn't it?  I was thinking that putting a try-except 
> around the loop does the same thing as the else.  Unless I misunderstand 
> its use.

No, the else clause on loops is a little known part of present day Python - it 
executes whenever the loop terminates naturally (i.e. not via a break 
statement).
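
A short sketch of the existing for/else behaviour described above. The function name and the data are made up for illustration.

```python
def find_first_negative(values):
    """Return the first negative value, or None if there is none."""
    for value in values:
        if value < 0:
            result = value
            break          # leaving via break skips the else clause
    else:
        # Runs only when the loop finished without hitting break.
        result = None
    return result
```

Calling find_first_negative([3, -1, 2]) leaves the loop through break, while find_first_negative([1, 2]) runs the loop to completion and so executes the else clause.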

The only thing PEP 340 adds to for loops is the semantics to handle an argument 
to continue statements - it adds nothing to do with finalisation. My PEP 
redraft, on the other hand, suggests the introduction of a 'for loop with 
finalisation' that works fairly similarly to PEP 340's anonymous block 
statements.

>>The PEP redraft already proposes a non-looping version as a new statement. 
>>However, since generators are likely to start using the new non-looping 
>>statement, it's important to be able to ensure timely finalisation of normal 
>>iterators as well. 
> 
> 
> Huh?  I thought a normal iterator or generator doesn't need 
> finalization?  If it does, then it's not normal.  Has a word been coined 
> for iterators with try-finally's in them yet?

An example was posted that looked like this:

   def all_lines(filenames):
       for name in filenames:
           stmt opening(name) as f:
               for line in f:
                   yield line

This is clearly intended for use as an iterator - it returns a bunch of lines. 
However, if the iterator is not finalised promptly, then the file that provided 
the last line may be left open indefinitely.

By making such an iterator easy to write, it behooves the PEP to make it easy 
to use correctly. This need *can* be met by the 'consuming' user defined 
statement I posted earlier, but a more elegant solution is to be able to 
iterate over this generator normally, while also being able to ask Python to 
ensure the generator is finalised at the end of the iteration.
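
The problem can be sketched with tools that were added to Python in later releases, namely generator close() and contextlib.closing. This is not the syntax proposed in the draft; the names tracked_lines and events, and the in-memory data standing in for files, are assumptions made for illustration.

```python
from contextlib import closing

events = []


def tracked_lines(sources):
    """Yield lines from each (name, lines) pair, recording open and close."""
    for name, lines in sources:
        events.append("open " + name)
        try:
            for line in lines:
                yield line
        finally:
            # Runs when the inner loop ends or the generator is closed.
            events.append("close " + name)


sources = [("a", ["1", "2"]), ("b", ["3"])]

# closing() calls the generator's close() on exit, so the resource that
# provided the last line is released even though the loop stops early.
with closing(tracked_lines(sources)) as lines:
    for line in lines:
        if line == "2":
            break
```

Without the closing() wrapper, the "close a" step would only happen whenever the suspended generator is eventually collected.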

Regards,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://boredomandl

Re: [Python-Dev] PEP 340: Deterministic Finalisation (new PEP draft, either a competitor or update to PEP 340)

2005-05-08 Thread Paul Moore
On 5/8/05, Jp Calderone <[EMAIL PROTECTED]> wrote:
>   If such a construct is to be introduced, the ideal spelling would seem to 
> be:
> 
> for [VAR in] EXPR:
>     BLOCK1
> finally:
>     BLOCK2

While I have not been following this discussion at all (I don't have
the energy or time to follow the development of yet another proposal -
I'll wait for the PEP) this does read more naturally to me than any of
the other contortions I've seen passing by.

Paul.


Re: [Python-Dev] PEP 340: Deterministic Finalisation (new PEP draft, either a competitor or update to PEP 340)

2005-05-08 Thread Nick Coghlan
Paul Moore wrote:
> On 5/8/05, Jp Calderone <[EMAIL PROTECTED]> wrote:
> 
>>  If such a construct is to be introduced, the ideal spelling would seem to 
>> be:
>>
>>for [VAR in] EXPR:
>>    BLOCK1
>>finally:
>>    BLOCK2
> 
> 
> While I have not been following this discussion at all (I don't have
> the energy or time to follow the development of yet another proposal -
> I'll wait for the PEP) this does read more naturally to me than any of
> the other contortions I've seen passing by.

Given this for loop syntax:

   for VAR in EXPR:
       BLOCK1
   else:
       BLOCK2
   finally:
       BLOCK3

And these semantics when a finally block is present:

   itr = iter(EXPR)
   exhausted = False
   try:
       while True:
           try:
               VAR = itr.next()
           except StopIteration:
               exhausted = True
               break
           BLOCK1
       if exhausted:
           BLOCK2
   finally:
       try:
           BLOCK3
       finally:
           itr_exit = getattr(itr, "__exit__", None)
           if itr_exit is not None:
               try:
                   itr_exit(TerminateBlock)
               except TerminateBlock:
                   pass
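
The expansion above can be approximated as a runnable helper in present-day Python 3. This is only a sketch under stated assumptions: TerminateBlock is defined here as a hypothetical stand-in for the exception named in the draft, the callbacks stand in for BLOCK1, BLOCK2 and BLOCK3, a body that returns False emulates 'break', and the Tracked class is an invented iterator with a one-argument __exit__ method.

```python
class TerminateBlock(Exception):
    """Hypothetical stand-in for the exception named in the draft."""


def loop_and_finalise(iterable, body, on_exhausted=None, on_finally=None):
    """Emulate 'for ... else ... finally' with iterator finalisation."""
    itr = iter(iterable)
    exhausted = False
    try:
        while True:
            try:
                item = next(itr)
            except StopIteration:
                exhausted = True
                break
            if body(item) is False:      # BLOCK1; False emulates 'break'
                break
        if exhausted and on_exhausted is not None:
            on_exhausted()               # BLOCK2 (the else clause)
    finally:
        try:
            if on_finally is not None:
                on_finally()             # BLOCK3 (the finally clause)
        finally:
            # Finalise the iterator if it offers an __exit__ method.
            itr_exit = getattr(itr, "__exit__", None)
            if itr_exit is not None:
                try:
                    itr_exit(TerminateBlock)
                except TerminateBlock:
                    pass


class Tracked:
    """Invented iterator that records whether it was finalised."""

    def __init__(self, items):
        self._it = iter(items)
        self.exited = False

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)

    def __exit__(self, exc):
        self.exited = True
```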

"Loop on this iterator and finalise when done" would be written:

   for item in itr:
       process(item)
   finally:
       pass

If you just want the finally clause, without finalising the iterator, you write 
it as you would now:

   try:
       for item in itr:
           process(item)
   finally:
       finalisation()

I like it - I'll update the PEP redraft to use it instead of the 'del' idea.

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://boredomandlaziness.blogspot.com


Re: [Python-Dev] PEP 340: Deterministic Finalisation (new PEP draft, either a competitor or update to PEP 340)

2005-05-08 Thread Nick Coghlan
Nick Coghlan wrote:
> The whole PEP draft can be found here:
> http://members.iinet.net.au/~ncoghlan/public/pep-3XX.html

I've updated this based on the feedback so far. The biggest change is that I've 
dropped the 'del' idea in favour of an optional 'finally' clause on for loops 
that finalises the iterator in addition to executing the code contained in the 
clause.

I also added additional description of the purpose of user defined statements 
(factoring out exception handling boilerplate that is not easily factored into 
a separate function), and fixed the semantics so that __exit__() is called 
without an argument when the statement exits cleanly (previously, a template 
could not tell if the statement exited cleanly or not).

I expanded on the generator section, indicating that the __exit__ method simply 
invokes next() if no exception is passed in (this makes the transaction example 
work correctly).

I updated the auto_retry example to work with the new for loop finalisation 
approach, and added an example (reading the lines from multiple named files) 
where timely iterator finalisation is needed.

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://boredomandlaziness.blogspot.com


Re: [Python-Dev] PEP 340: Deterministic Finalisation (new PEP draft, either a competitor or update to PEP 340)

2005-05-08 Thread Ron Adam
Josiah Carlson wrote:
> Ron Adam <[EMAIL PROTECTED]> wrote:
> 
>>Josiah Carlson wrote:
>>
>>>>I think a completely separate looping or non-looping construct would be 
>>>>better for the finalization issue, and maybe can work with classes with 
>>>>__exit__ as well as generators.
>>>
>>>From what I understand, the entire conversation has always stated that
>>>class-based finalized objects and generator-based finalized objects will
>>>both work, and that any proposal that works for one, but not the other,
>>>is not sufficient.
>>
>>That's good to hear.  There seems to be some confusion as to whether or 
>>not 'for's will do finalizing.  So I was trying to stress I think 
>>regular 'for' loops should not finalize. They should probably give an 
>>error if given an object with a try-finally in it or an __exit__ method. 
>>I'm not sure what the current opinion on that is.  But I didn't see it 
>>in any of the PEPs.
> 
> 
> It's not a matter of 'will they be finalized', but instead a matter of
> 'will they be finalized in a timely manner'.  From what I understand,
> upon garbage collection, any generator-based resource will be finalized
> via __exit__/next(exception)/... and any class-based resource will have
> its __del__ method called (as long as it is well-behaved), which can be
> used to call __exit__...

I should have said  "...should not finalize at the end of the for loop". 
  With generators, you may not want them to finalize before you are done 
with them, and the same with classes.


>>>>Having it loop has the advantage of making it break out in a better 
>>>>behaved way.
>>>
>>>What you have just typed is nonsense.  Re-type it and be explicit.
>>
>>It was a bit brief, sorry about that. :-)
>>
>>To get a non-looping block to loop, you will need to put it in a loop or 
>>put a loop in it.
>>
>>In the first case, doing a 'break' in the block doesn't exit the loop. 
>>so you need to add an extra test for that.
>>
>>In the second case, doing a 'break' in the loop does exit the block, but 
>>finishes any code after the loop.  So you may need an extra case in that 
>>case.
>>
>>Having a block that loops can simplify these conditions, in that a break 
>>always exits the body of the block and stops the loop.  A 'continue' can 
>>be used to skip the end of the block and start the next loop early.
>>
>>And you still have the option to put the block in a loop or loops in the 
>>block and they will work as they do now.
>>
>>I hope that clarifies what I was thinking a bit better.
> 
> 
> 
> That is the long-standing nested loops 'issue', which is not going to be
> solved here, nor should it be.

We may not find a solution today, but where should it be addressed if 
not here?

I don't really see the general issue of breaking out of loops as a 
problem, but was just addressing where it overlaps blocks and whether or 
not blocks should loop.

> I am not sure that any solution to the issue will be sufficient for
> everyone involved. 

That's the nature of programming in general isn't it. ;-)


> The closest thing to a generic solution I can come
> up with would be to allow the labeling of for/while loops, and to allow
> "break <label>" and "continue <label>": the latter continues that loop
> (breaking all other loops currently nested within), and the former
> breaks that loop (as well as all other loops currently nested within).
>
> Perhaps something like...
> 
> while ... label 'foo':
>     for ... in ... label 'goo':
>         block ... label 'hoo':
>             if ...:
>                 #equivalent to continue 'hoo'
>                 continue
>             elif ...:
>                 continue 'goo'
>             elif ...:
>                 continue 'foo'
>             else:
>                 break 'foo'
> 
> Does this solve the nested loop problem?  Yes.  Do I like it?  Not
> really; three keywords in a single for/block statement is pretty awful.
> On the upside, 'label' doesn't need to be a full-on keyword (it can be a
> partial keyword like 'as' still seems to be).

How about this for breaking out of all loops at once.

class BreakLoop(Exception):
    """break out of nested loops"""

try:
    for x in range(100):
        for y in range(100):
            for z in range(100):
                if x == 25 and y==72 and z==3:
                    raise BreakLoop

except BreakLoop: pass
print 'x,y,z =', x,y,z


Sometimes I would like a "try until <exception>:" for cases like this 
where you would use "except <exception>: pass".

Cheers,
Ron_Adam







Re: [Python-Dev] New Py_UNICODE doc

2005-05-08 Thread Shane Hathaway
M.-A. Lemburg wrote:
> All this talk about UTF-16 vs. UCS-2 is not very useful
> and strikes me as purely academic.
> 
> The reference to possible breakage by slicing a Unicode and
> breaking a surrogate pair is valid; the idea of UCS-4 being
> less prone to breakage is a myth:

Fair enough.  The original point is that the documentation is unclear
about what a Py_UNICODE[] contains.  I deduced that it contains either
UCS2 or UCS4 and implemented accordingly.  Not only did I guess wrong,
but others will probably guess wrong too.  Something in the docs needs
to spell this out.

Shane


Re: [Python-Dev] PEP 340: Deterministic Finalisation (new PEP draft, either a competitor or update to PEP 340)

2005-05-08 Thread Josiah Carlson

Ron Adam <[EMAIL PROTECTED]> wrote:
> Josiah Carlson wrote:
> > It's not a matter of 'will they be finalized', but instead a matter of
> > 'will they be finalized in a timely manner'.  From what I understand,
> > upon garbage collection, any generator-based resource will be finalized
> > via __exit__/next(exception)/... and any class-based resource will have
> > its __del__ method called (as long as it is well-behaved), which can be
> > used to call __exit__...
> 
> I should have said  "...should not finalize at the end of the for loop". 
>   With generators, you may not want them to finalize before you are done 
> with them, and the same with classes.

So you don't use them with a structure that greedily finalizes, and you
keep a reference to the object exterior to the loop.  Seems to be a
non-issue.


> > That is the long-standing nested loops 'issue', which is not going to be
> > solved here, nor should it be.
> 
> We may not find a solution today, but where should it be addressed if 
> not here?
> 
> I don't really see the general issue of breaking out of loops as a 
> problem, but was just addressing where it overlaps blocks and whether or 
> not blocks should loop.

I believe the argument over whether blocks should loop has been had:
they should.  The various use cases involve multi-part transactions and
such.


> > The closest thing to a generic solution I can come
> > up with would be to allow the labeling of for/while loops, and to allow
> > "break <label>" and "continue <label>": the latter continues that loop
> > (breaking all other loops currently nested within), and the former
> > breaks that loop (as well as all other loops currently nested within).
> >
> > Perhaps something like...
> > 
> > while ... label 'foo':
> >     for ... in ... label 'goo':
> >         block ... label 'hoo':
> >             if ...:
> >                 #equivalent to continue 'hoo'
> >                 continue
> >             elif ...:
> >                 continue 'goo'
> >             elif ...:
> >                 continue 'foo'
> >             else:
> >                 break 'foo'
> > 
> > Does this solve the nested loop problem?  Yes.  Do I like it?  Not
> > really; three keywords in a single for/block statement is pretty awful.
> > On the upside, 'label' doesn't need to be a full-on keyword (it can be a
> > partial keyword like 'as' still seems to be).
> 
> How about this for breaking out of all loops at once.
> 
> class BreakLoop(Exception):
>     """break out of nested loops"""
> 
> try:
>     for x in range(100):
>         for y in range(100):
>             for z in range(100):
>                 if x == 25 and y==72 and z==3:
>                     raise BreakLoop
> 
> except BreakLoop: pass
> print 'x,y,z =', x,y,z
> 
> 
> Sometimes I would like a "try until <exception>:" for cases like this 
> where you would use "except <exception>: pass".


That is a mechanism, but I like it even less than the one I offered. 
Every time that one wants to offer themselves the ability to break out
of a different loop (no continue here), one must create another
try/except clause, further indenting, and causing nontrivial try/except
overhead inside nested loops.
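
For comparison, a pattern that needs neither labels nor an exception is to move the nested loops into a function and leave all of them with a single return. This is a sketch rather than something proposed in this thread; find_point and its parameters are hypothetical names, and like the exception approach it offers no equivalent of continuing an outer loop.

```python
def find_point(limit, target):
    """Search a limit**3 grid and return target as soon as it is reached."""
    for x in range(limit):
        for y in range(limit):
            for z in range(limit):
                if (x, y, z) == target:
                    # One return leaves all three loops at once.
                    return x, y, z
    return None


point = find_point(100, (25, 72, 3))
```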

A real solution to the problem should (in my opinion) allow the breaking
of or continuing to an arbitrary for/while/block.  Humorously enough,
Richie Hindle's goto/comefrom statements for Python ("not to be used in
production code") would allow 90% of the necessary behavior (though the
lack of timely finalization would probably annoy some people, but then
again, there is only so much one can expect from a module written as a
working April Fools joke over a year ago).

 - Josiah



Re: [Python-Dev] PEP 340: Deterministic Finalisation (new PEP draft, either a competitor or update to PEP 340)

2005-05-08 Thread Josiah Carlson

Eric Nieuwland <[EMAIL PROTECTED]> wrote:
> I suggested to create AN ITERATOR FOR THE LIST and destroy that at the 
> end. The list itself remains untouched.

My mistake, I did not understand your use of pronouns.


 - Josiah



Re: [Python-Dev] New Py_UNICODE doc

2005-05-08 Thread Martin v. Löwis
Shane Hathaway wrote:
> Fair enough.  The original point is that the documentation is unclear
> about what a Py_UNICODE[] contains.  I deduced that it contains either
> UCS2 or UCS4 and implemented accordingly.  Not only did I guess wrong,
> but others will probably guess wrong too.  Something in the docs needs
> to spell this out.

Again, patches are welcome. I was opposed to Nick's proposed changes,
since they explicitly said that you are not supposed to know what
is in a Py_UNICODE. Integrating the essence of PEP 261 into the
main documentation would be a worthwhile task.

Regards,
Martin



Re: [Python-Dev] New Py_UNICODE doc

2005-05-08 Thread Nicholas Bastin

On May 8, 2005, at 5:15 AM, Martin v. Löwis wrote:

> 'configure takes an option --enable-unicode, with the possible
> values "ucs2", "ucs4", "yes" (equivalent to no argument),
> and  "no" (equivalent to --disable-unicode)'
>
> *THIS* documentation would break. This documentation is factually
> correct at the moment (configure does indeed take these options),
> and people rely on them in automatic build processes. Changing
> configure options should not be taken lightly, even if they
> may result from a "wrong mental model". By that rule, --with-suffix
> should be renamed to --enable-suffix, --with-doc-strings to
> --enable-doc-strings, and so on. However, the nitpicking that
> underlies the desire to rename the option should be ignored
> in favour of backwards compatibility.
>
> Changing the documentation that goes along with the option
> would be fine.

That is exactly what I proposed originally, which you shot down.  
Please actually read the contents of my messages.  What I said was 
"change the configure option and related documentation".


>> It provides more than minimum value - it provides the truth.
>
> No. It is just a command line option. It could be named
> --enable-quirk=(quork|quark), and would still select UTF-16.
> Command line options provide no truth - they don't even
> provide statements.

Wow, what an inane way of looking at it.  I don't know what world you 
live in, but in my world, users read the configure options and suppose 
that they mean something.  In fact, they *have* to go off on their own 
to assume something, because even the documentation you refer to above 
doesn't say what happens if they choose UCS-2 or UCS-4.  A logical 
assumption would be that python would use those CEFs internally, and 
that would be incorrect.

>>> With --enable-unicode=ucs2, Python's Py_UNICODE does *not* start
>>> supporting the full Unicode ccs the same way it supports UCS-2.
>>
>> I can't understand what you mean by this.  My point is that if you
>> configure python to support UCS-2, then it SHOULD NOT support
>> surrogate pairs.  Supporting surrogate pairs is the purview of
>> variable width encodings, and UCS-2 is not among them.
>
> So you suggest to renaming it to --enable-unicode=utf16, right?
> My point is that a Unicode type with UTF-16 would correctly
> support all assigned Unicode code points, which the current
> 2-byte implementation doesn't. So --enable-unicode=utf16 would
> *not* be the truth.

The current implementation supports the UTF-16 CEF.  i.e., it supports 
a variable width encoding form capable of representing all of the 
unicode space using surrogate pairs.  Please point out a code point 
that the current 2 byte implementation does not support, either 
directly, or through the use of surrogate pairs.

--
Nick



Re: [Python-Dev] New Py_UNICODE doc

2005-05-08 Thread Nicholas Bastin

On May 8, 2005, at 5:28 AM, Martin v. Löwis wrote:

> Nicholas Bastin wrote:
>> All of my proposals for what to change the documentation to have been
>> shot down by Martin.  If someone has better verbiage that they'd like
>> to see, I'd be perfectly happy to patch the doc.
>
> I don't look into the specific wording - you speak English much better
> than I do. What I care about is that this part of the documentation
> should be complete and precise. I.e. statements like "should not make
> assumptions" might be fine, as long as they are still followed by
> a precise description of what the code currently does. So it should
> mention that the representation can be either 2 or 4 bytes, that
> the strings "ucs2" and "ucs4" can be used to select one of them,
> that it is always 2 bytes on Windows, that 2 bytes means that non-BMP
> characters can be represented as surrogate pairs, and so on.

It's not always 2 bytes on Windows.  Users can alter the config options 
(and not unreasonably so, btw, on 64-bit Windows platforms).

This goes to the issue that I think people don't understand that we 
have to assume that some users will build their own Python.  This will 
result in 2-byte Pythons on RHL9, and 4-byte Pythons on Windows, both 
of which have already been claimed in this discussion to not happen, 
which is untrue.  You can't build a binary extension module on Windows 
and assume that Py_UNICODE is 2 bytes, because that's not enforced in 
any way.  The same is true for 4-byte Py_UNICODE on RHL9.

--
Nick



Re: [Python-Dev] New Py_UNICODE doc

2005-05-08 Thread Nicholas Bastin

On May 8, 2005, at 1:44 PM, Martin v. Löwis wrote:

> Shane Hathaway wrote:
>> Fair enough.  The original point is that the documentation is unclear
>> about what a Py_UNICODE[] contains.  I deduced that it contains either
>> UCS2 or UCS4 and implemented accordingly.  Not only did I guess wrong,
>> but others will probably guess wrong too.  Something in the docs needs
>> to spell this out.
>
> Again, patches are welcome. I was opposed to Nick's proposed changes,
> since they explicitly said that you are not supposed to know what
> is in a Py_UNICODE. Integrating the essence of PEP 261 into the
> main documentation would be a worthwhile task.

You can't possibly assume you know specifically what's in a Py_UNICODE 
in any given python installation.  If someone thinks this statement is 
untrue, please explain why.

I realize you might not *want* that to be true, but it is.  Users are 
free to configure their python however they desire, and if that means 
--enable-unicode=ucs2 on RH9, then that is perfectly valid.

--
Nick



Re: [Python-Dev] PEP 340: Deterministic Finalisation (new PEP draft, either a competitor or update to PEP 340)

2005-05-08 Thread Ron Adam
Nick Coghlan wrote:


> Iterating over a sequence. If it's single-pass (and always single pass), you 
> should use a user defined statement instead.

> That's the technique suggested for the single-pass user defined statements. 
> However, a 'for loop with finalisation' is *still fundamentally an iterative 
> loop*, and the syntax should reflect that.


> The same keyword cannot be used for the looping vs non-looping construct, 
> because of the effect on the semantics of break and continue statements.

I disagree with this, I think 'do' would work very well for both single 
pass, and multiple pass, blocks.

In this example 'do' evaluates as True until the generator ends without 
returning a value:

def open_file(name,mode):
    f = open(name,mode)
    try:
        yield f
    finally:
        f.close()

do f from open_file(name,mode):
    for line in f:
        print line.rstrip()

On the first try, it gets f, so the do expression evaluates as True and 
the BLOCK is run.

On the second try, instead of getting a value, the finally suite is 
executed and the generator ends, causing the do expression to evaluate 
as False.

If a continue is used, it just skips the end of the 'do' body, and then 
whether or not to loop is determined by whether the 'do' expression 
evaluates as True.

A break skips the rest of the 'do' body and executes the generator's 
finally suite.

This works the same in both single pass and multi pass situations.
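
As a rough illustration of the finalisation step described above, the sketch below drives a generator by hand in present-day Python. It assumes a Python in which `yield` is allowed inside try/finally and generators have a close() method (neither was true of Python 2.4). The names `resource` and `events` are made up for the example, and the proposed 'do' statement itself is not implemented here.

```python
events = []

def resource():
    # Acquire something, hand it to the caller, and always clean up.
    events.append("open")
    try:
        yield "handle"
    finally:
        events.append("close")

gen = resource()
handle = next(gen)   # runs up to the yield: the resource is now "open"
# ... the body of the block would use `handle` here ...
gen.close()          # what a 'break' would trigger: the finally suite runs
```

After close() the generator is finished, so a second close() does nothing and the cleanup runs only once.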

The difference is by using a truth test instead of iterating, it better 
represents what is happening and opens up a few options.

There's also the possibility to use conditional looping based on the 
value returned from the generator.

do VAR from EXPR if VAR==CONST:
    BLOCK

This is a bit verbose, but it reads well. :-)

But that is really just a short cut for:

do VAR from EXPR:
    if VAR != CONST:
        break
    BLOCK


The Syntax might be:

do ([VAR from] EXPR1) | (VAR from EXPR1 if EXPR2): BODY


>>I don't have an opinion on user defined statements yet.  But I think 
>>they would be somewhat slower than a built in block that does the same 
>>thing.
>  
> What do you mean by 'built in block'? The user defined statements of the PEP 
> redraft are simply a non-looping version of PEP 340's anonymous block 
> statements.

Ok, my mistake, I thought you were suggesting the more general user 
defined statements suggested elsewhere.


> No, the else clause on loops is a little known part of present day Python - 
> it executes whenever the loop terminates naturally (i.e. not via a break 
> statement).

Hmm... ok, and the opposite of what I expected.  No wonder it's a little 
known part.
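
For readers who have not met it, here is a small sketch of the loop else clause as it exists in Python today. The function name `find_first_negative` is invented for the example.

```python
def find_first_negative(numbers):
    for n in numbers:
        if n < 0:
            result = n
            break          # leaving via break skips the else clause
    else:
        # Runs only when the loop ran to completion without a break,
        # including the case of an empty sequence.
        result = None
    return result
```

So "else" here reads more like "if no break happened" than like the else of an if statement.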


> My PEP redraft, on the other hand, suggests the introduction of a 'for loop 
> with finalisation' that works fairly similarly to PEP 340's anonymous block 
> statements.

Here is my current thinking.  It will be better to have 3 separate loops 
with three identifiable names, and have each work in distinctly 
different ways.  That simplifies teaching, using, and reading the 
resulting code, IMHO.

1.  For-loops: Fast efficient list iteration. No changes.

2.  While-loops: Fast efficient truth test based loop. No changes.

3.  Do-loops: A generator based loop with finalization.  This could 
be both single and multiple pass.  The difference is determined by 
whether or not the generator loops over the yield statement.


I think a good test is the retry example in the PEP.  A solution that 
can represent that clearly and concisely would be a good choice.

Maybe this could be made to work:

def auto_retry(n, exc):
    while n>0:
        try:
            yield True
            n = 0
        except exc:
            n -= 1

do auto_retry(3, IOError):
    f = urllib.urlopen("http://python.org/")
    print f.read()

The ability to propagate the exception back to the generator is what's 
important here.

The while version of this nearly works, but is missing the exception 
propagation back to the generator, the ability to pass back through the 
yield, and finalization if the outside while loop is broken before the 
generator finishes.

def auto_retry(n, exc):
    while n>1:
        try:
            yield True
            break
        except exc:
            n -= 1
    # finalize here
    yield None

import urllib
ar = auto_retry(3, IOError)
while ar.next():
    f = urllib.urlopen("http://python.org/")
    print f.read()

Although changing 'while' shouldn't be done. I think using 'do' for 
generator based loops would be good.

This isn't that different from PEP 340 I think.  Maybe it's just coming 
to the same conclusion from a different perspective.  :-)

Cheers, Ron






Re: [Python-Dev] PEP 340: Deterministic Finalisation (new PEP draft, either a competitor or update to PEP 340)

2005-05-08 Thread Eric Nieuwland
Josiah Carlson wrote:
> Eric Nieuwland <[EMAIL PROTECTED]> wrote:
>> I suggested to create AN ITERATOR FOR THE LIST and destroy that at the
>> end. The list itself remains untouched.
>
> My mistake, I did not understand your use of pronouns.

And, rereading my post, I used an ambiguous reference.
My bad as well.

--eric



Re: [Python-Dev] PEP 340: Deterministic Finalisation (new PEP draft, either a competitor or update to PEP 340)

2005-05-08 Thread Eric Nieuwland
Josiah Carlson wrote:
> The argument over whether blocks should loop, I believe has been had;
> they should.  The various use cases involve multi-part transactions and
> such.

Then it is not so much looping but more pushing forward the state of 
the block's life-cycle?
This might be a good moment to consider life-cycle support a la PROCOL.

--eric



Re: [Python-Dev] PEP 340: Deterministic Finalisation (new PEP draft, either a competitor or update to PEP 340)

2005-05-08 Thread Michael Hudson
Jp Calderone <[EMAIL PROTECTED]> writes:

>   If such a construct is to be introduced, the ideal spelling would seem to be:
>
>     for [VAR in] EXPR:
>         BLOCK1
>     finally:
>         BLOCK2

Does this mean that adding 

    finally:
        pass

to a for block would make the for loop behave differently?

Cheers,
mwh

-- 
  I really hope there's a catastrophic bug in some future e-mail
  program where if you try and send an attachment it cancels your
  ISP account, deletes your harddrive, and pisses in your coffee
 -- Adam Rixey


Re: [Python-Dev] PEP 340: Deterministic Finalisation (new PEP draft, either a competitor or update to PEP 340)

2005-05-08 Thread Nick Coghlan
Josiah Carlson wrote:
> The argument over whether blocks should loop, I believe has been had;
> they should.  The various use cases involve multi-part transactions and
> such.

The number of good use cases for a looping block statement currently stands at 
exactly 1 (auto_retry). Every other use case suggested (locking, opening, 
suppressing, etc) involves factoring out try statement boiler plate that is far 
easier to comprehend with a single pass user defined statement. A single pass 
user defined statement allows all such code to be factored safely, even if the 
main clause of the try statement uses break or continue statements.

The insanity of an inherently looping block statement is shown by the massive 
semantic differences between the following two pieces of code under PEP 340:

   block locking(the_lock):
       for item in items:
           if handle(item):
               break

   for item in items:
       block locking(the_lock):
           if handle(item):
               break

With a non-looping user defined statement, you get the semantics you would 
expect for the latter case (i.e. the for loop is still terminated after an item 
is handled, whereas that won't happen under PEP 340)
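
To make the expected semantics of the second form concrete, here is a small sketch that uses Python's later `with` statement as a stand-in for a non-looping user defined statement. That substitution is an assumption made for illustration, not part of either draft, and `handle` and `items` are placeholder names.

```python
import threading

the_lock = threading.Lock()
items = [1, 2, 3]
seen = []

def handle(item):
    # Placeholder predicate: pretend item 2 is the one we were looking for.
    return item == 2

for item in items:
    with the_lock:              # single pass: acquire, run body, release
        seen.append(item)
        if handle(item):
            break               # terminates the for loop, lock is released
```

Because the statement guarding the body is not itself a loop, the break binds to the enclosing for loop, and item 3 is never visited.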

For the one good use case for a user defined loop (auto_retry), I initially 
suggested in my redraft that there be a way of denoting that a given for loop 
gives the iterator the opportunity to intercept exceptions raised in the body 
of the loop (like the PEP 340 block statement). You convinced me that was a bad 
idea, and I switched to a simple iterator finalisation clause in version 1.2.

Even with that simplified approach though, *using* auto_retry is still very 
easy:

   for attempt in auto_retry(3, IOError):
       stmt attempt:
           do_something()

It's a little trickier to write auto_retry itself, since you can't easily use a 
generator anymore, but it still isn't that hard, and the separation of concerns 
(between iteration, and the customised control flow in response to exceptions) 
makes it very easy to grasp how it works.
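
A minimal sketch of how such an auto_retry might be written, again assuming the later `with` statement in place of the proposed `stmt`. The helper class `_Attempt`, the `done` flag and the `flaky` test function are all hypothetical names introduced only for this example. The iteration lives in auto_retry, while the exception handling lives in the per-attempt object.

```python
class _Attempt:
    """One try: swallows the expected exception unless it is the last try."""
    def __init__(self, owner, is_last):
        self.owner = owner
        self.is_last = is_last

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            self.owner.done = True      # success: tell the iterator to stop
            return False
        # Suppress only the expected exception, and only if tries remain.
        return issubclass(exc_type, self.owner.exc) and not self.is_last


class auto_retry:
    def __init__(self, n, exc):
        self.n = n
        self.exc = exc
        self.done = False

    def __iter__(self):
        for i in range(self.n):
            if self.done:
                return
            yield _Attempt(self, is_last=(i == self.n - 1))


calls = []

def flaky():
    # Fails twice with IOError, then succeeds.
    calls.append(1)
    if len(calls) < 3:
        raise IOError("transient failure")
    return "ok"

result = None
for attempt in auto_retry(3, IOError):
    with attempt:
        result = flaky()
```

If the final attempt also fails, its exception is not suppressed and propagates to the caller as usual.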

>>>The closest thing to a generic solution I can come
>>>up with would be to allow for the labeling of for/while loops, and the
>>>allowing of "break/continue ", which continues to that loop
>>>(breaking all other loops currently nested within), or breaks that loop
>>>(as well as all other loops currently nested within).

Or, we simply have user defined statements which are not themselves loops, and 
use them to create named blocks:

   def block(name):
       try:
           yield
       except TerminateBlock, ex:
           if not ex.args or ex.args[0] != name:
               raise

   stmt block('foo'):
       while condition():
           stmt block('goo'):
               for ... in ...:
                   while other_case():
                       stmt block('hoo'):
                           if ...:
                               # Continue the inner while loop
                               continue
                           if ...:
                               # Exit the inner while loop
                               raise TerminateBlock, 'hoo'
                           if ...:
                               # Exit the for loop
                               raise TerminateBlock, 'goo'
                           # Exit the outer while loop
                           raise TerminateBlock, 'foo'

This has the benefit that an arbitrary block of code can be named, and a named 
TerminateBlock used to exit it.
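
A runnable approximation of the named-block idea, assuming the later `with` statement and contextlib.contextmanager stand in for the proposed `stmt` (neither existed when this was written). TerminateBlock is defined here as an ordinary exception class, and the search loop is just an invented example.

```python
from contextlib import contextmanager

class TerminateBlock(Exception):
    """Raised with a block name to exit the enclosing block of that name."""

@contextmanager
def block(name):
    try:
        yield
    except TerminateBlock as ex:
        # Swallow the exception only if it was aimed at this block.
        if not ex.args or ex.args[0] != name:
            raise

found = None
with block('search'):
    for x in range(10):
        for y in range(10):
            if x * y == 42:
                found = (x, y)
                raise TerminateBlock('search')   # leaves both loops at once
```

A TerminateBlock carrying a different name passes straight through, so nested blocks can be exited selectively.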

> That is a mechanism, but I like it even less than the one I offered. 
> Every time that one wants to offer themselves the ability to break out
> of a different loop (no continue here), one must create another
> try/except clause, further indenting, and causing nontrivial try/except
> overhead inside nested loops.

Ah well, that criticism applies to my suggestion, too. However, I suspect any 
such implementation is going to need to use exceptions for the guts of the flow 
control, even if that use isn't visible to the programmer.

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://boredomandlaziness.blogspot.com


Re: [Python-Dev] PEP 340: Deterministic Finalisation (new PEP draft, either a competitor or update to PEP 340)

2005-05-08 Thread Ron Adam
Josiah Carlson wrote:

> Ron Adam <[EMAIL PROTECTED]> wrote:

>>I should have said  "...should not finalize at the end of the for loop". 
>>  With generators, you may not want them to finalize before you are done 
>>with them, and the same with classes.
> 
> 
> So you don't use them with a structure that greedily finalizes, and you
> keep a reference to the object exterior to the loop.  Seems to be a
> non-issue.

Yes, it should be a non issue.


> The argument over whether blocks should loop, I believe has been had;
> they should.  The various use cases involve multi-part transactions and
> such.

I think so now too, I had thought as Nick does earlier this week that 
the non-looping version was cleaner, but changed my mind when I realized 
that looping blocks could be made to work for those in a simple and 
understandable way.

>>try:
>>    for x in range(100):
>>        for y in range(100):
>>            for z in range(100):
>>                if x == 25 and y==72 and z==3:
>>                    raise BreakLoop
>>
>>except BreakLoop: pass
>>print 'x,y,z =', x,y,z

> That is a mechanism, but I like it even less than the one I offered. 
> Every time that one wants to offer themselves the ability to break out
> of a different loop (no continue here), one must create another
> try/except clause, further indenting, and causing nontrivial try/except
> overhead inside nested loops.
> 
> A real solution to the problem should (in my opinion) allow the breaking
> of or continuing to an arbitrary for/while/block.  Humorously enough,
> Richie Hindle's goto/comefrom statements for Python ("not to be used in
> production code") would allow 90% of the necessary behavior (though the
> lack of timely finalization would probably annoy some people, but then
> again, there is only so much one can expect from a module written as a
> working April Fools joke over a year ago).
> 
>  - Josiah

I think maybe another alternative is a break buffer or queue, where you 
push a 'break' onto the buffer and then execute a 'break' to break the 
current loop.  The 'break' in the buffer then breaks the next loop out as 
soon as the current loop exits, etc.

for x in range(100):
    for y in range(100):
        for z in range(100):
            if x == 25 and y==72 and z==3:
                push_loop(Break,Break)  # will break both parent loops
                break                   # break current loop

If push_loop(...) could take a NoBreak, then you can selectively break 
outer loops by how you sequence them.

push_loop(None, Break) would not break the y loop, but will break the x 
loop above.  Can you think of a use case for something like that?


Pushing 'Continues' might also work:

for x in range(100):
    for y in range(100):
        if x == 25 and y==72:
            push_loop(Continue)  # will skip rest of parents body
            break                # break current loop
        #code2
    #code1

This will break the 'y' loop and skip code2, then continue the 'x' loop 
skipping code block 1.

Using a stack for breaks and continues isn't too different than using a 
stack for exceptions I think.

Also by making it a function call instead of a command you can have a 
function return a Break or Continue object, or None,

for x in range(100):
    for y in range(100):
        if y == testval:
            push_loop(looptest(x,y))  # break x loop depending on x,y
            break

Any None returned would need to be discarded I think for this to work, so 
something else would be needed to skip a level.

It needs some polish I think.  ;-)

Cheers,
Ron_Adam




Re: [Python-Dev] New Py_UNICODE doc

2005-05-08 Thread Martin v. Löwis
Nicholas Bastin wrote:
>> Changing the documentation that goes along with the option
>> would be fine.
> 
> 
> That is exactly what I proposed originally, which you shot down.  Please
> actually read the contents of my messages.  What I said was "change the
> configure option and related documentation".

What I mean is "change just the documentation, do not change the
configure option". This seems to be different from your proposal,
which I understand as "change both the configure option and the
documentation".

> Wow, what an inane way of looking at it.  I don't know what world you
> live in, but in my world, users read the configure options and suppose
> that they mean something.  In fact, they *have* to go off on their own
> to assume something, because even the documentation you refer to above
> doesn't say what happens if they choose UCS-2 or UCS-4.  A logical
> assumption would be that python would use those CEFs internally, and
> that would be incorrect.

Certainly. That's why the documentation should be improved. Changing
the option breaks existing packaging systems, and should not be done
lightly.

> The current implementation supports the UTF-16 CEF.  i.e., it supports a
> variable width encoding form capable of representing all of the unicode
> space using surrogate pairs.  Please point out a code point that the
> current 2 byte implementation does not support, either directly, or
> through the use of surrogate pairs.

Try to match regular expression classes for non-BMP characters:

>>> re.match(u"[\u1234]",u"\u1234").group()
u'\u1234'

works fine, but

>>> re.match(u"[\U00011234]",u"\U00011234").group()
u'\ud804'

gives strange results.
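
A likely reading of that result, sketched below: on a 2-byte build the non-BMP character is stored as two UTF-16 code units, so the character class holds two separate "characters" and the match returns only the first. The helper `utf16_code_units` is a made-up name that simply applies the standard surrogate-pair arithmetic.

```python
def utf16_code_units(code_point):
    """Return the UTF-16 code units used to store one Unicode code point."""
    if code_point < 0x10000:
        return [code_point]              # BMP: a single 16-bit unit
    offset = code_point - 0x10000        # 20 bits remain to be split
    high = 0xD800 + (offset >> 10)       # top 10 bits -> high surrogate
    low = 0xDC00 + (offset & 0x3FF)      # low 10 bits -> low surrogate
    return [high, low]

bmp_units = utf16_code_units(0x1234)
non_bmp_units = utf16_code_units(0x11234)
```

For U+11234 the first unit is 0xD804, which is the u'\ud804' seen in the output above.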

Regards,
Martin


Re: [Python-Dev] New Py_UNICODE doc

2005-05-08 Thread Martin v. Löwis
Nicholas Bastin wrote:
> It's not always 2 bytes on Windows.  Users can alter the config options
> (and not unreasonably so, btw, on 64-bit windows platforms).

Did you try that? I'm not sure it even builds when you do so, but if it
does, you will lose the "mbcs" codec, and the ability to use Unicode
strings as file names. Without the "mbcs" codec, I would expect that
quite a lot of the Unicode stuff breaks.

> You can't build a binary extension module on windows and
> assume that Py_UNICODE is 2 bytes, because that's not enforced in any
> way.  The same is true for 4-byte Py_UNICODE on RHL9.

Depends on how much force you want to see. That the official pydotorg
Windows installer python24.dll uses a 2-byte Unicode, and that a lot
of things break if you change Py_UNICODE to four bytes on Windows
(including PythonWin) is a pretty strong guarantee that you won't
see a Windows Python build with UCS-4 for quite some time.

Regards,
Martin


Re: [Python-Dev] The decorator module

2005-05-08 Thread Michele Simionato
On 5/6/05, Phillip J. Eby <[EMAIL PROTECTED]> wrote:
> In this case, the informally-discussed proposal is to add a mutable
> __signature__ to functions, and have it be used by inspect.getargspec(), so
> that decorators can copy __signature__ from the decoratee to the decorated
> function.

Are there any plans for a facility to copy functions? Currently I am doing

import new  # Python 2 only: constructors for code and function objects

def copyfunc(func):
    "Creates an independent copy of a function."
    c = func.func_code
    nc = new.code(c.co_argcount, c.co_nlocals, c.co_stacksize, c.co_flags,
                  c.co_code, c.co_consts, c.co_names, c.co_varnames,
                  c.co_filename, c.co_name, c.co_firstlineno,
                  c.co_lnotab, c.co_freevars, c.co_cellvars)
    return new.function(nc, func.func_globals, func.func_name,
                        func.func_defaults, func.func_closure)
 
and I *hate* it!
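
For readers on Python 3, where the `new` module no longer exists, a roughly equivalent sketch can be built on types.FunctionType. This is an illustration rather than the code from the decorator module. It reuses the (immutable) code object instead of rebuilding it, and the `greet` function is only there to exercise the copy.

```python
import types

def copyfunc(func):
    """Create an independent copy of a function (Python 3 sketch)."""
    clone = types.FunctionType(func.__code__, func.__globals__,
                               func.__name__, func.__defaults__,
                               func.__closure__)
    # Carry over the pieces the constructor does not take.
    clone.__kwdefaults__ = func.__kwdefaults__
    clone.__dict__.update(func.__dict__)
    clone.__doc__ = func.__doc__
    return clone

def greet(name, punct="!"):
    return "hi " + name + punct

greet_copy = copyfunc(greet)
greet_copy.__defaults__ = ("?",)   # changing the copy leaves greet untouched
```

The copy shares the original's globals and closure cells, so it is independent only at the level of the function object itself.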

I have updated my module to version 0.2, with an improved discussion
of decorators in multithreaded programming ("locked", "threaded",
"deferred"): http://www.phyast.pitt.edu/~micheles/python/decorator.zip


Michele Simionato


Re: [Python-Dev] New Py_UNICODE doc

2005-05-08 Thread Martin v. Löwis
Nicholas Bastin wrote:
>> Again, patches are welcome. I was opposed to Nick's proposed changes,
>> since they explicitly said that you are not supposed to know what
>> is in a Py_UNICODE. Integrating the essence of PEP 261 into the
>> main documentation would be a worthwhile task.
> 
> 
> You can't possibly assume you know specifically what's in a Py_UNICODE
> in any given python installation.  If someone thinks this statement is
> untrue, please explain why.

This is a different issue. Between saying "we don't know what
installation xyz uses" and saying "we cannot say anything" is a wide
range of things that you can truthfully say. Like "it can be either
two bytes or four bytes" (but not one or three bytes), and so on.

Also, for a given installation, you can find out by looking at
sys.maxunicode from Python, or at Py_UNICODE_SIZE from C.
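
From the Python side that check might look like the sketch below. The function name `unicode_build_width` is invented for the example. Note that on Python 3.3 and later sys.maxunicode is always 0x10FFFF, so the narrow branch only applies to older builds.

```python
import sys

def unicode_build_width():
    """Return 2 for a narrow (2-byte) build, 4 for a wide (4-byte) build."""
    # Narrow builds report 0xFFFF; wide builds report 0x10FFFF.
    return 2 if sys.maxunicode == 0xFFFF else 4

width = unicode_build_width()
```

An extension author could use the same test at import time to refuse to load a binary built for the other width.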

> I realize you might not *want* that to be true, but it is.  Users are
> free to configure their python however they desire, and if that means
> --enable-unicode=ucs2 on RH9, then that is perfectly valid.

Sure they can. Of course, that will mean they don't get a working
_tkinter, unless they rebuild Tcl as well. Nevertheless, it is indeed
likely that people do that. So if you want to support them, you
need to distribute two versions of your binary module, or give
them source code.

Regards,
Martin


Re: [Python-Dev] PEP 340: Deterministic Finalisation (new PEP draft, either a competitor or update to PEP 340)

2005-05-08 Thread Josiah Carlson

Ron Adam <[EMAIL PROTECTED]> wrote:
> There's also the possibility to use conditional looping based on the 
> value returned from the generator.
> 
> do VAR from EXPR if VAR==CONST:
>     BLOCK
> 
> This is a bit verbose, but it reads well. :-)

Reading well or not, this is not really an option for the same reasons
why...

  for VAR in EXPR1 if EXPR2:
or
  for VAR in EXPR1 while EXPR2:

are not options.  Keep it simple.


> 3.  Do-loops: A generator based loop with finalization.  This could 
> be both single and multiple pass.  The difference is determined by 
> whether or not the generator loops over the yield statement.

Offering only generator-based finalization loops is, as I understand it,
not an option.


 - Josiah



Re: [Python-Dev] PEP 340: Deterministic Finalisation (new PEP draft, either a competitor or update to PEP 340)

2005-05-08 Thread Josiah Carlson

Nick Coghlan <[EMAIL PROTECTED]> wrote:
> Josiah Carlson wrote:
> > The argument over whether blocks should loop, I believe has been had;
> > they should.  The various use cases involve multi-part transactions and
> > such.

[snip looping block discussion]

> For the one good use case for a user defined loop (auto_retry), I initially 
> suggested in my redraft that there be a way of denoting that a given for loop 
> gives the iterator the opportunity to intercept exceptions raised in the body 
> of the loop (like the PEP 340 block statement). You convinced me that was a bad 
> idea, and I switched to a simple iterator finalisation clause in version 1.2.

Well then, I guess you have re-convinced me that the block statement
probably shouldn't loop.

> Even with that simplified approach though, *using* auto_retry is still very 
> easy:
> 
>    for attempt in auto_retry(3, IOError):
>        stmt attempt:
>            do_something()
> 
> It's a little trickier to write auto_retry itself, since you can't easily use 
> a generator anymore, but it still isn't that hard, and the separation of 
> concerns (between iteration, and the customised control flow in response to 
> exceptions) makes it very easy to grasp how it works.

Great.  Now all we need is a module with a handful of finalization
generators, with all of the obvious ones already implemented.


> >>>The closest thing to a generic solution I can come
> >>>up with would be to allow for the labeling of for/while loops, and the
> >>>allowing of "break/continue ", which continues to that loop
> >>>(breaking all other loops currently nested within), or breaks that loop
> >>>(as well as all other loops currently nested within).
> 
> Or, we simply have user defined statements which are not themselves loops, 
> and use them to create named blocks:

[snipped code to protect the innocent]

> This has the benefit that an arbitrary block of code can be named, and a 
> named TerminateBlock used to exit it.

Scary.


> > That is a mechanism, but I like it even less than the one I offered. 
> > Every time that one wants to offer themselves the ability to break out
> > of a different loop (no continue here), one must create another
> > try/except clause, further indenting, and causing nontrivial try/except
> > overhead inside nested loops.
> 
> Ah well, that criticism applies to my suggestion, too. However, I suspect any 
> such implementation is going to need to use exceptions for the guts of the 
> flow control, even if that use isn't visible to the programmer.

Not necessarily.  If I were implementing such a thing, any time
arbitrary break/continues (to a loop that isn't the deepest) were used
in nested loops, I would increment a counter any time a loop was entered,
and decrement the counter any time a loop was exited.  When performing a
break/continue, I would merely set another variable for which loop is
the final break/continue, then the interpreter could break loops while
the desired level/current level differed, then perform a final
break/continue depending on what was executed.

No exceptions necessary, and the increment/decrement should necessarily
be cheap (an increment/decrement of a char, being that Python limits
itself to 20 nested fors, and probably should limit itself to X nested
loops, where X < 256).
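
The unwinding step can be imitated by hand in ordinary Python, as in the sketch below. It is only an emulation of the idea: the variable `pending` plays the role of the interpreter's "levels still to exit" value, and the names `find` and `pending` are invented for the example.

```python
def find(target, n=100):
    pending = 0      # number of enclosing loops that still have to be exited
    hit = None
    for x in range(n):
        for y in range(n):
            for z in range(n):
                if (x, y, z) == target:
                    hit = (x, y, z)
                    pending = 2      # after this break, leave two more loops
                    break
            if pending:              # check performed as each loop exits
                pending -= 1
                break
        if pending:
            pending -= 1
            break
    return hit

result = find((25, 72, 3))
```

No exception is raised at any point; each loop exit costs one integer test, which is the cheapness argued for above.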


 - Josiah



Re: [Python-Dev] The decorator module

2005-05-08 Thread Raymond Hettinger
[Michele Simionato]
> Are there any plans for a facility to copy functions? Currently I am doing
> 
> def copyfunc(func):
>     "Creates an independent copy of a function."
>     c = func.func_code
>     nc = new.code(c.co_argcount, c.co_nlocals, c.co_stacksize, c.co_flags,
>                   c.co_code, c.co_consts, c.co_names, c.co_varnames,
>                   c.co_filename, c.co_name, c.co_firstlineno,
>                   c.co_lnotab, c.co_freevars, c.co_cellvars)
>     return new.function(nc, func.func_globals, func.func_name,
>                         func.func_defaults, func.func_closure)
> 
> and I *hate* it!

Sounds reasonable.

Choices:
- submit a patch adding a __copy__ method to functions,
- submit a patch for the copy module, or
- submit a feature request, assign to me, and wait.
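
For comparison, a hedged sketch of what a function copy can look like in
Python 3 spelling, where the `new` module is gone and types.FunctionType
plays its role.  This is not code from the thread, and copy_function is a
hypothetical name.  It assumes that sharing the code object is acceptable,
instead of rebuilding it as the quoted version does.

```python
import types


def copy_function(func):
    """Return a new function object with the same code as func.

    The code object and the globals dict are shared; defaults, attributes
    and keyword-only defaults can then be changed on the copy without
    affecting the original.
    """
    clone = types.FunctionType(
        func.__code__,
        func.__globals__,
        func.__name__,
        func.__defaults__,
        func.__closure__,
    )
    # Copy attributes set on the function, and keyword-only defaults if any.
    clone.__dict__.update(func.__dict__)
    if func.__kwdefaults__:
        clone.__kwdefaults__ = dict(func.__kwdefaults__)
    clone.__doc__ = func.__doc__
    return clone
```

Rebinding `__defaults__` on the copy leaves the original untouched, which
is the kind of independence the quoted copyfunc is after.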


Raymond Hettinger


Re: [Python-Dev] PEP 340: Deterministic Finalisation (new PEP draft, either a competitor or update to PEP 340)

2005-05-08 Thread Greg Ewing
Ron Adam wrote:
> There seems to be some confusion as to whether or
> not 'for's will do finalizing.  So I was trying to stress I think
> regular 'for' loops should not finalize. They should probably give an
> error if given an object with a try-finally in it or an __exit__ method.

But if the for-loop can tell whether the iterator
needs finalizing or not, why not have it finalize
the ones that need it and not finalize the ones
that don't? That would be backwards compatible,
since old for-loops working on old iterators would
work as before.
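
A rough sketch of that behaviour as a plain helper function.  It assumes
"needs finalizing" can be detected by the iterator having a close()
method, as generators do; the __exit__ protocol under discussion is a
draft, so close() is used here only as a stand-in.  The names run_for,
resource and log are hypothetical.

```python
def run_for(iterable, body):
    """Loop over iterable, finalizing the iterator only if it supports it.

    body(item) may return False to emulate a 'break'.
    """
    it = iter(iterable)
    try:
        for item in it:
            if body(item) is False:
                break
    finally:
        # Old-style iterators (e.g. list iterators) have no close() and
        # are left alone, so existing loops behave as before.
        finalize = getattr(it, "close", None)
        if finalize is not None:
            finalize()


log = []


def resource():
    """Generator with cleanup that must run even if the loop exits early."""
    try:
        yield 1
        yield 2
        yield 3
    finally:
        log.append("cleaned up")
```

Leaving the loop early over resource() still runs its finally clause,
while looping over a plain list goes through the same helper with no
finalization step at all.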

Greg




Re: [Python-Dev] PEP 340: Deterministic Finalisation (new PEP draft, either a competitor or update to PEP 340)

2005-05-08 Thread Josiah Carlson

Ron Adam <[EMAIL PROTECTED]> wrote:
> > The argument over whether blocks should loop, I believe has been had;
> > they should.  The various use cases involve multi-part transactions and
> > such.
> 
> I think so now too, I had thought as Nick does earlier this week that 
> the non-looping version was cleaner, but changed my mind when I realized 
> that looping blocks could be made to work for those in a simple and 
> understandable way.

I wasn't expressing my opinion; I was attempting to describe where the
discussion went and how it concluded.  I honestly can't remember having an
opinion on the subject, but I seem to have convinced Nick earlier that
they shouldn't loop, and he (re-)convinced me that indeed, they
shouldn't loop.


> I think maybe another alternative is a break buffer or queue, where you
> push a 'break' onto the buffer and then execute a 'break' to break the
> current loop. The 'break' in the buffer then breaks the next loop out as
> soon as the current loop exits, etc.

[snip]

> It needs some polish I think.  ;-)

Goodness, the horror!  When implementation details start bleeding their
way into actual language constructs (using a continue/break stack in
order to control the flow of nested loops), that's a good clue that an
idea has gone a bit too far.

I would honestly prefer gotos, and I would prefer having no change to
existing syntax to gaining gotos.


It's kind of funny.  Every month I spend in python-dev, I feel less
inclined to want to change the Python language (except for the relative
import I need to finish implementing).  Not because it is a pain in the
tookus (though it is), but because many times it is my immediate sense
of aesthetics that causes me to desire change, and the code maintenance
in my future makes me think ahead to understanding Python 2.3 in the
context of Python 2.9.


 - Josiah
