Re: [Python-Dev] cpython: Minor clean-ups for heapq.

2014-05-27 Thread Michael Urman
On Tue, May 27, 2014 at 4:05 AM, Chris Angelico  wrote:
> On Tue, May 27, 2014 at 6:58 PM, Serhiy Storchaka  wrote:
>> 26.05.14 10:59, raymond.hettinger wrote:
>>>
>>> +result = [(elem, i) for i, elem in zip(range(n), it)]
>>
>>
>> Perhaps it is worth adding a simple comment explaining why this is not
>> equivalent to just list(zip(it, range(n))). Otherwise it could be
>> unintentionally "optimized" in the future.
>>
>
> Where is the difference? I'm very much puzzled now. My first thought
> was based on differing-length iterables in zip, but the docs say it
> stops at the shortest of its args.

Due to how zip stops, it leaves the longer iterable in different places:

>>> import string
>>> it = iter(string.ascii_letters); list(zip(range(3), it)); next(it)
[(0, 'a'), (1, 'b'), (2, 'c')]
'd'
>>> it = iter(string.ascii_letters); list(zip(it, range(3))); next(it)
[('a', 0), ('b', 1), ('c', 2)]
'e'

This seems like a potentially nasty gotcha, but I'm unclear what real
use cases would be impacted.
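
One place the difference could matter is code that, like the quoted heapq
change, takes the first n items and then keeps consuming the same iterator.
A minimal sketch, where split_head is a hypothetical helper and not part of
heapq:

```python
def split_head(iterable, n, range_first=True):
    """Return (first n items paired with an index, everything left over)."""
    it = iter(iterable)
    if range_first:
        # zip() asks range(n) first, so it stops before touching `it` again.
        head = [(elem, i) for i, elem in zip(range(n), it)]
    else:
        # zip() asks `it` first, so one extra element is fetched and dropped
        # before range(n) reports that it is exhausted.
        head = list(zip(it, range(n)))
    return head, list(it)


safe_head, safe_rest = split_head("abcdef", 3, range_first=True)
lossy_head, lossy_rest = split_head("abcdef", 3, range_first=False)
print(safe_rest)   # ['d', 'e', 'f']
print(lossy_rest)  # ['e', 'f']
```

Both calls produce the same three-item head; only the leftover differs.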

Michael
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-16 Thread Michael Urman
On Thu, Jan 16, 2014 at 11:13 AM, Neil Schemenauer  wrote:
> A TypeError exception is what we want if the object does not support
> bytes formatting.  Some possible problems:
>
> - It could be hard to provide a helpful exception message since it
>   is generated inside the __format__ method rather than inside the
>   bytes.__mod__ method (in the case of a missing __ascii__ method).
>   The most common error will be using a str object and so we could
>   modify the __format__ method of str to provide a nice hint (use
>   encode()).

The various format functions could certainly intercept and wrap
exceptions raised by __format__ methods. Once the core types were
modified to expect bytes in format_spec, however, this may not be
critical; __format__ methods which delegate would work as expected,
str could certainly be clear about why it raised, and custom
implementations would be handled per comments I'll make on your second
point. Overall I suspect this is no worse than unhandled values in the
format_spec are today.

> - Is there some risk that an object will unwittingly implement a
>   __format__ method that unintentionally accepts a bytes argument?
>   That requires some investigation.

Agreed. Some quick armchair calculations suggest to me that there are
three likely outcomes:
 - Properly handle the type (perhaps written with the 2.x clause in mind)
 - Raise an exception internally (perhaps ValueError, such as from
format(3, 'q'))
 - Mishandle and return a str (perhaps due to if/else defaulting)
The first and second outcome may well reflect what we want, and the
third could easily be detected and turned into an exception by the
format functions.

I'm uncertain whether this reflects all the scenarios we would care about.
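
A rough sketch of how a format function might detect the third outcome.
checked_format is a hypothetical name, and the two classes exist only to
exercise it; nothing here is taken from an actual implementation of the PEP.

```python
def checked_format(value, format_spec):
    """Hypothetical wrapper: insist the result has the same type as the spec."""
    # Call __format__ on the type directly; the built-in format() in
    # Python 3 only accepts str specs.
    result = type(value).__format__(value, format_spec)
    if type(result) is not type(format_spec):
        raise TypeError(
            "%s.__format__ returned %s for a %s format_spec"
            % (type(value).__name__, type(result).__name__,
               type(format_spec).__name__))
    return result


class TypeAware:
    def __format__(self, format_spec):
        # Outcome 1: handle the type of the spec properly.
        return b"ok" if isinstance(format_spec, bytes) else "ok"


class Defaulting:
    def __format__(self, format_spec):
        # Outcome 3: an if/else that never considered bytes returns str.
        if format_spec:
            return "custom"
        return "plain"
```

With these definitions, checked_format(TypeAware(), b"") returns bytes, while
checked_format(Defaulting(), b"") raises TypeError instead of leaking a str.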


Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-16 Thread Michael Urman
On Thu, Jan 16, 2014 at 8:45 AM, Brett Cannon  wrote:
> Fine, if you're worried about bytes.format() overstepping by implicitly
> calling str.encode() on the return value of __format__() then you will need
> __bytes__format__() to get equivalent support.

Could we just re-use PEP-3101's note (easily updated for Python 3):

Note for Python 2.x: The 'format_spec' argument will be either
a string object or a unicode object, depending on the type of the
original format string.  The __format__ method should test the type
of the specifiers parameter to determine whether to return a string or
unicode object.  It is the responsibility of the __format__ method
to return an object of the proper type.

If __format__ receives a format_spec of type bytes, it should return
bytes. For such cases on objects that cannot support bytes (i.e. for
str), it can raise. This appears to avoid the need for additional
methods. (As does Nick's proposal of leaving it out for now.)
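
A minimal sketch of what that contract could look like from the object's
side. Reading and TextOnly are hypothetical classes, and the ASCII
round-trip is just one assumed way to reuse an existing str-based
implementation.

```python
class Reading:
    """Hypothetical numeric wrapper that can format into str or bytes."""

    def __init__(self, value):
        self.value = value

    def __format__(self, format_spec):
        if isinstance(format_spec, bytes):
            # Reuse the str machinery, then return the same type we were given.
            text = format(self.value, format_spec.decode("ascii"))
            return text.encode("ascii")
        return format(self.value, format_spec)


class TextOnly:
    """Hypothetical str-like object that cannot support bytes."""

    def __format__(self, format_spec):
        if isinstance(format_spec, bytes):
            raise TypeError("TextOnly cannot be formatted into bytes; "
                            "encode it explicitly")
        return "text"


as_bytes = Reading(21.5).__format__(b".1f")
as_text = format(Reading(21.5), ".2f")
```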


Re: [Python-Dev] [RELEASED] Python 3.4.0b2

2014-01-06 Thread Michael Urman
On Mon, Jan 6, 2014 at 9:43 AM, Guido van Rossum  wrote:
> Since MSIEXEC.EXE is a legit binary (not coming from our packager) and
> Akamai is a legitimate company (MS most likely has an agreement with
> them), at this point I would assume that there's something that
> MSIEXEC.EXE wants to get from Akamai, which is unintentionally but
> harmlessly triggered by the Python install. Could it be checking for
> upgrades?

Here's some more guesswork. Does it seem possible that msiexec is
trying to verify the revocation status of the certificate used to sign
the python .msi file? Per
http://blogs.technet.com/b/pki/archive/2006/11/30/basic-crl-checking-with-certutil.aspx
it looks like crl.microsoft.com is the host; this is hosted on akamai:
   crl.microsoft.com is an alias for crl.www.ms.akadns.net.
   crl.www.ms.akadns.net is an alias for a1363.g.akamai.net.

There are various things you could try to verify this. You could test
with simpler .msi files where one is signed and another is not signed
(I'll leave it up to you to find such things, but ORCA is a common
"test" .msi file). Or you could take a verbose log of the installation
process (msiexec /l*v python.log python.msi OR
http://support.microsoft.com/kb/223300), sit on the prompt for network
access so you can uniquely identify the log's timestamps, and try to
identify at what point of the installation the network access occurs.
Once that is known, more steps can be taken to identify and resolve
any actual issues.
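
To narrow the log down to the window in which the prompt was open,
something like the following could help. It is only a sketch: it assumes
each relevant log line carries a bracketed [HH:MM:SS:mmm] timestamp, and
the sample lines are made up for illustration rather than copied from a
real log.

```python
import re
from datetime import time

# Assumption: interesting lines contain a [HH:MM:SS:mmm] timestamp;
# lines without one are skipped.
STAMP = re.compile(r"\[(\d{1,2}):(\d{2}):(\d{2}):(\d{3})\]")


def lines_between(lines, start, end):
    """Return the log lines whose timestamp falls within [start, end]."""
    selected = []
    for line in lines:
        match = STAMP.search(line)
        if match is None:
            continue
        hour, minute, second, millis = (int(part) for part in match.groups())
        stamp = time(hour, minute, second, millis * 1000)
        if start <= stamp <= end:
            selected.append(line)
    return selected


sample = [
    "MSI (c) [10:15:01:120]: illustrative line before the prompt",
    "MSI (c) [10:15:07:450]: illustrative line while the prompt was open",
    "MSI (c) [10:15:30:002]: illustrative line after the prompt",
]
during_prompt = lines_between(sample, time(10, 15, 5), time(10, 15, 20))
```

A real log would need to be read with whatever encoding msiexec wrote it in
before being passed to lines_between.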

Michael


Re: [Python-Dev] Python 3.4 and Windows XP: just 45 days until EOL

2013-07-12 Thread Michael Urman
On Thu, Jul 11, 2013 at 6:58 PM, Christian Heimes  wrote:
> For Python 3.4 it is going to be a very close call. According to PEP 429
> 3.4.0 final is scheduled for February 22, 2014. The extended support
> phase of Windows XP ends merely 45 days later on April 8, 2014. Do we
> really have to restrict ourselves to an API that is going to become
> deprecated 45 days after the estimated release of 3.4.0?

If your motivation is to ease the use of APIs only available on
Windows Vista and later, you've got another year to wait: Windows
Server 2003 R2 extended support lasts through until July 2015.
http://support.microsoft.com/lifecycle/search/default.aspx?alpha=Windows+Server+2003+R2

Michael


Re: [Python-Dev] PEP 435: pickling enums created with the functional API

2013-05-07 Thread Michael Urman
On Tue, May 7, 2013 at 8:34 AM, Eli Bendersky  wrote:

> According to an earlier discussion, this works on CPython, PyPy and
> Jython, but not on IronPython. The alternative that works everywhere is to
> define the Enum like this:
>
>   Color = Enum('the_module.Color', 'red blue green')
>
> The reference implementation supports this as well.
>

As an alternate bikeshed color, why not pass the receiving module to the
class factory when pickle support is desirable? That should be less brittle
than its name. The class based syntax can still be recommended to libraries
that won't know ahead of time if their values need to be pickled.

>>> Color = Enum('Color', 'red blue green', module=__main__)

Functions that wrap class factories could similarly accept and pass a
module along.

The fundamental problem is that the class factory cannot know what the
intended destination module is without either syntax that provides this
('class' today, proposed 'def' or 'class from' in the thread) or the caller
passing additional information around (module name, or module instance).
Syntax changes are clearly beyond the scope of PEP 435, otherwise a true
enum syntax might have been born. So that leaves us with requiring the
caller to provide it.

Michael


Re: [Python-Dev] PEP 435 -- Adding an Enum type to the Python standard library

2013-04-12 Thread Michael Urman
On Fri, Apr 12, 2013 at 9:30 AM, Barry Warsaw  wrote:

> On Apr 12, 2013, at 09:03 AM, Michael Urman wrote:
> >(For the latter behavior, would adding DupEnum.name2 = DupEnum.name1 after
> >the class declaration work today?)
>
> Yes, but the repr/str of the alias will show the original value.
>

That satisfies my concern. This gives an author the means to provide two
names for a single value, and a way to choose which one is canonical. It's
easy to imagine some corner cases related to persisting those values and
then retrieving them with a later enum definition that changes the
canonical name, but if you store raw values or names it should be easy
enough to work around such things.
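
As a sketch of that workaround, using the enum module as it was eventually
released (where a repeated value simply becomes an alias): Status is a
hypothetical enum, and the point is that both a stored name and a stored
raw value resolve to the canonical member.

```python
from enum import Enum


class Status(Enum):
    active = 1
    enabled = 1  # same value, so this becomes an alias of `active`


# Looking up a persisted name or a persisted raw value both work,
# and both yield the canonical member.
by_name = Status['enabled']
by_value = Status(1)
canonical = Status.enabled.name
print(by_name is Status.active, by_value is Status.active, canonical)
```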

Michael


Re: [Python-Dev] PEP 435 -- Adding an Enum type to the Python standard library

2013-04-12 Thread Michael Urman
> You may not define two enumeration values with the same integer value::
>
> >>> class Bad(Enum):
> ...     cartman = 1
> ...     stan = 2
> ...     kyle = 3
> ...     kenny = 3 # Oops!
> ...     butters = 4
> Traceback (most recent call last):
> ...
> ValueError: Conflicting enums with value '3': 'kenny' and 'kyle'
>
> You also may not duplicate values in derived enumerations::
>
> >>> class BadColors(Colors):
> ...     yellow = 4
> ...     chartreuse = 2 # Oops!
> Traceback (most recent call last):
> ...
> ValueError: Conflicting enums with value '2': 'green' and 'chartreuse'
>

Is there a convenient way to change this behavior, namely to indicate that
conflicts are acceptable in a given Enum? While I like the idea of catching
mistaken collisions, I've seen far too many C/C++ scenarios where multiple
names map to the same value. This does raise questions, as it's unclear
whether each name should get its own representation, or whether it's better
to let DupEnum.name1 is DupEnum.name2 be True.

(For the latter behavior, would adding DupEnum.name2 = DupEnum.name1 after
the class declaration work today?)
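
For reference, the enum module as it was eventually released differs from
the draft quoted above: duplicate values are accepted as aliases by default,
and the unique decorator opts back in to the strict check. A short sketch
with hypothetical class names:

```python
from enum import Enum, unique


class Loose(Enum):
    kyle = 3
    kenny = 3  # accepted: kenny is an alias for kyle


try:
    @unique
    class Strict(Enum):
        kyle = 3
        kenny = 3  # rejected by @unique
except ValueError as exc:
    strict_error = str(exc)
else:
    strict_error = None

print(Loose.kenny is Loose.kyle)  # True
```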

Michael
-- 
Michael Urman


Re: [Python-Dev] an alternative to embedding policy in PEP 418 (was: PEP 418: Add monotonic clock)

2012-04-06 Thread Michael Urman
On Thu, Apr 5, 2012 at 21:57, Stephen J. Turnbull  wrote:
> I might have chosen to implement a 'None' return if I had designed
> open(), but I can't get too upset about raising an Exception as it
> actually does.

One fundamental difference is that there are many reasons one might
fail to open a file. It may not exist. It may not have permissions
allowing the request. It may be locked. If open() returned None, this
information would have to be retrievable through another function.
However, since it raises an exception, that information is already
wrapped up in the exception object, should you choose to catch it, and
is likely to be logged otherwise.

In the case of the clocks, I'm assuming the only reason you would fail
to get a clock is because it isn't provided by hardware and/or OS. You
don't have to worry about transient scenarios on multi-user systems
where another user has locked the clock. Thus the exception cannot
tell you anything more than None tells you. (Of course, if my
assumption is wrong, I'm not sure whether my reasoning still applies.)
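
A small sketch of the contrast. why_open_failed shows the detail an
exception from open() carries; find_clock is a hypothetical lookup (not the
PEP 418 API) that returns None when the platform lacks the named clock,
which loses nothing if absence is the only failure mode.

```python
import errno
import os
import tempfile
import time


def why_open_failed(path):
    """Return the symbolic errno for a failed open(), or None on success."""
    try:
        with open(path):
            return None
    except OSError as exc:
        # The exception distinguishes missing files, permissions, and so on.
        return errno.errorcode.get(exc.errno, "unknown")


def find_clock(name):
    """Hypothetical lookup: the clock function if available, else None."""
    return getattr(time, name, None)


with tempfile.TemporaryDirectory() as folder:
    reason = why_open_failed(os.path.join(folder, "missing.txt"))

absent = find_clock("no_such_clock")
```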

-- 
Michael Urman


Re: [Python-Dev] Playing with a new theme for the docs, iteration 2

2012-03-25 Thread Michael Urman
On Sun, Mar 25, 2012 at 07:07, Antoine Pitrou  wrote:
>>
>> I've also added a little questionable gimmick to the sidebar (when you
>> collapse it and expand it again, the content is shown at your current
>> scroll location).
>
> The gimmick is buggy (when you collapse then expand it in the middle,
> and then scroll up, the sidebar content disappears after scrolling),
> and in the end quite confusing.

It also seems not to handle window resizes very well right now. It
appears to choose the height for the vertical bar when shown, and then
when the text next to it reflows to a new length, the bar can become
longer or shorter than necessary.

On the one hand this makes it hard to get the sidebar content to show
at the bottom of the page; on the other, I believe it mitigates
potential problems if sidebar content is too long for the window size.

-- 
Michael Urman


Re: [Python-Dev] PEP397 no command line options to python?

2011-10-23 Thread Michael Urman
On Sun, Oct 23, 2011 at 20:58, Mark Hammond  wrote:
> On 24/10/2011 12:56 PM, Michael Urman wrote:
>>
>> On Sun, Oct 23, 2011 at 17:15, Mark Hammond
>>  wrote:
>>>
>>> How about abusing the existing flags for this purpose - eg:
>>>
>>> % py -3?
>>> % py -2.7?
>>
>> I would have expected that to launch an interactive python shell of
>> the appropriate version. Does it do something else today?
>
> That is what it does today without the trailing '?' character.  My idea was
> to allow the trailing '?' to behave like the proposed --which.

Oh, I read right over the question mark without seeing it. I wonder if
that's a notch against it from a documentation standpoint or just my
own personal quirk. (I'm not used to thinking of it as a command line
flag, partly due to my unix years.) Thanks for explaining!


Re: [Python-Dev] PEP397 no command line options to python?

2011-10-23 Thread Michael Urman
On Sun, Oct 23, 2011 at 17:15, Mark Hammond  wrote:
> How about abusing the existing flags for this purpose - eg:
>
> % py -3?
> % py -2.7?

I would have expected that to launch an interactive python shell of
the appropriate version. Does it do something else today?

Michael


Re: [Python-Dev] Coding guidelines for os.walk filter

2011-08-30 Thread Michael Urman
> for t in os.walk(somedir):
>    t[1][:]=set(t[1])-{'.svn','tmp'}
>    ... do something
>
> This is a very clever hack but... it relies on internal implementation
> of os.walk

This doesn't appear to be an internal implementation detail; this is
documented behavior.
http://docs.python.org/dev/library/os.html#os.walk shows a similar example:

for root, dirs, files in os.walk('python/Lib/email'):
    # ...
    dirs.remove('CVS')  # don't visit CVS directories
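
A runnable sketch of the same documented technique, written with the names
from the quoted snippet. walk_pruned is a hypothetical helper; it uses a
list comprehension rather than a set difference so the directory order is
preserved.

```python
import os
import tempfile


def walk_pruned(top, skip=('.svn', 'tmp')):
    """Yield os.walk() triples without descending into names in skip."""
    for root, dirs, files in os.walk(top):  # topdown=True is the default
        # Slice assignment mutates the very list os.walk() will recurse into.
        dirs[:] = [name for name in dirs if name not in skip]
        yield root, dirs, files


with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, 'pkg', '.svn', 'props'))
    os.makedirs(os.path.join(top, 'pkg', 'src'))
    os.makedirs(os.path.join(top, 'tmp', 'cache'))
    visited = sorted(os.path.relpath(root, top)
                     for root, _, _ in walk_pruned(top))

print(visited)
```

Pruning only takes effect in top-down mode, since in bottom-up mode the
subdirectories have already been visited by the time dirs is yielded.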

-- 
Michael Urman


Re: [Python-Dev] PEP 397 (Python launcher for Windows) reference implementation

2011-07-05 Thread Michael Urman
On Tue, Jul 5, 2011 at 03:01, Vinay Sajip  wrote:
> Were those other Windows apps packaged as .msi, or .exe? AFAICT, although you
> can embed an MSI inside another one, the practice of concurrent/nested
> installations is strongly discouraged by Microsoft - see http://goo.gl/FJx1S
> (Rule 20).

Right, you cannot sanely embed one .msi inside another .msi; the
support for "nested" or "concurrent" installations referred to in that
link is indeed something to avoid.

> So you could package Python and the launcher as separate MSIs (this would make
> sense so that you could restore associations to the launcher just by repairing
> its installation), but since nested MSIs are a no-no, that means installing
> via a bootstrapping .exe. This is a bigger change to our Windows packaging
> than some people might be comfortable with ...

You can certainly jump through all these hoops, but the pieces here
are much better suited to a component definition that can be shared
among multiple products. If the component always installs to the same
place, has the same GUID, and otherwise changes only in the version of
the exe, this is a completely safe and correct use of one. Last I knew,
msi.py allocated random GUIDs, so it may or may not be suitable for
generating this. If there is only a rare need to update this exe, and
msi.py has support for merge modules, that could be one approach.

My recommendation for distributing this: each .msi which wants to
include it should have a component that includes the following. Note
that each .exe (py, pyw) and each architecture (x86, x64) need their
own component with their own static GUID.
 * Defined unchanging GUID
 * Defined target location (perhaps SystemFolder)
 * msidbComponentAttributesSharedDllRefCount (cooperate with non-MSI
installers), msidbComponentAttributesShared (keep highest version),
and possibly msidbComponentAttributesPermanent flags set on the
component
 * Versioned .exe (using a Windows version block)
 * File association information

Then these components can be included in the python 3.3 installer,
future releases, and even a standalone installer, and reference count
correctly. Again, these can optionally be made available as merge
modules for other consumers, but there's likely no need.
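
As a rough illustration of the list above, the flags combine into a single
bitmask for the Component table's Attributes column, and there is one
component per executable and architecture. The numeric values below are
repeated from memory of the Windows Installer documentation and should be
treated as an assumption to verify; launcher_component_attributes is a
hypothetical helper.

```python
# Assumed values for the Component table's Attributes column; check them
# against the Windows Installer reference before relying on them.
msidbComponentAttributesSharedDllRefCount = 0x0008
msidbComponentAttributesPermanent = 0x0010
msidbComponentAttributesShared = 0x0800


def launcher_component_attributes(permanent=False):
    """Combine the flags suggested for a shared launcher component."""
    attributes = (msidbComponentAttributesSharedDllRefCount
                  | msidbComponentAttributesShared)
    if permanent:
        attributes |= msidbComponentAttributesPermanent
    return attributes


# One component, with its own static GUID, per executable and architecture.
components = [(exe, arch)
              for exe in ('py.exe', 'pyw.exe')
              for arch in ('x86', 'x64')]
```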

-- 
Michael Urman


Re: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII

2011-05-10 Thread Michael Urman
On Tue, May 10, 2011 at 03:03, Victor Stinner
 wrote:
> If GetProcAddress() expects a byte string encoded to the ANSI code page,
> my patch is correct because the function used the UTF-8 encoding, not
> the ANSI code page. We can maybe use GetProcAddressW() to pass a Unicode
> string. I don't know which encoding is used by GetProcAddressW()...

While I can find references to a GetProcAddressW, most of them seem to
agree it doesn't exist. "My kernel32.dll only exports GetProcAddress."
This suggests to me it accepts a null-terminated bytestring instead of
specifically an ANSI string. What data ends up in the export table is
likely similar to the linux filesystem case, only with less likelihood
of the environment telling you its encoding.

> I already patched _PyImport_GetDynLoadFunc() for Windows: the path is
> now a Unicode object instead of a byte string encoded to the filesystem
> encoding. _PyImport_GetDynLoadWindows() uses GetFullPathNameW() and
> LoadLibraryExW(). The work to be fully Unicode compliant (for the path
> field, not for the name) is not completely done... but I have a pending
> patch, see:
> http://bugs.python.org/issue11619
>
> But this patch is huge and creates many functions. I am not sure that we
> need it, I will work on this later.

I'm comfortable with the idea of requiring UTF-8 encoding for the
initmodule entry points of modules named with non-ASCII identifiers,
especially if there is nothing which works consistently today. I've
only seen pure-ASCII library names in all my C++ work, so I feel it
borders on YAGNI (but I like it in theory).

As an alternate approach, one article I read suggested to use ordinals
instead of names if you wanted to use non-ASCII names. Python could
certainly try to load by ordinal on Windows, and fall back to loading
by name. I don't have a clue what the rate of false positives would
be.

-- 
Michael Urman


Re: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII

2011-05-09 Thread Michael Urman
On Mon, May 9, 2011 at 23:09, Neil Hodgson  wrote:
> Michael Urman:
>
>> I'm not convinced this is correct for this case. GetProcAddress takes
>> an "ANSI" string, meaning while it could theoretically use UTF-8, in
>> practice I doubt it uses anything outside of ASCII safely. So while
>> the name of the library would be encoded in UTF-16, the name of the
>> function loaded from the library would not be.
>
>   Yes you are right:
> http://scintilla.org/NarrowName.png
>
>   Neil
>

That screenshot seems to show UTF-8 is being used. This may just be
the literal bytes in the .c file, but could it be something more
dependable?

http://unicode.org/cgi-bin/GetUnihanData.pl?codepoint=6728
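
For anyone checking a dump by hand, this sketch shows the bytes that code
point produces under UTF-8 and UTF-16, and confirms that plain ASCII cannot
represent it. It says nothing about what Windows itself does with the name.

```python
name = "\u6728"  # U+6728, the code point linked above

as_utf8 = name.encode("utf-8")       # three bytes: E6 9C A8
as_utf16 = name.encode("utf-16-le")  # two bytes: 28 67

try:
    name.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

print(as_utf8, as_utf16, ascii_ok)
```

If the narrow name in an export table contains the bytes E6 9C A8, that is
consistent with the source file's UTF-8 bytes having been stored unchanged.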


Re: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII

2011-05-09 Thread Michael Urman
On Mon, May 9, 2011 at 20:08, Neil Hodgson  wrote:
>   Yes, Windows will use UTF-16 as it does for almost everything. From
> a user's point of view, these should both just be seen as Unicode.

I'm not convinced this is correct for this case. GetProcAddress takes
an "ANSI" string, meaning while it could theoretically use UTF-8, in
practice I doubt it uses anything outside of ASCII safely. So while
the name of the library would be encoded in UTF-16, the name of the
function loaded from the library would not be.

http://msdn.microsoft.com/en-us/library/ms683212(v=vs.85).aspx

-- 
Michael Urman


Re: [Python-Dev] [PEPs] Support the /usr/bin/python2 symlink upstream

2011-03-08 Thread Michael Urman
On Tue, Mar 8, 2011 at 03:40, Gertjan Klein  wrote:
> A launcher might be difficult to integrate into the Python installer, as
> there can, by definition, only be one. What if I install a new version
> of Python and then uninstall it? Will the launcher be uninstalled as
> well? Will it be reverted to a previous version, and if so, which?

As long as component rules are maintained (the same components with
same GUIDs install the same files in the same locations) and they are
marked shared, Windows Installer will handle everything for us. If the
files have versions the way Windows Installer can process them, it
will even keep the highest version of the file in place.

-- 
Michael Urman


Re: [Python-Dev] [PEPs] Support the /usr/bin/python2 symlink upstream

2011-03-08 Thread Michael Urman
On Tue, Mar 8, 2011 at 03:33, "Martin v. Löwis"  wrote:
> If it's called "python.exe", I wonder what it should do when given a
> file that doesn't carry version information.

I would expect it to follow the guidance of the Unix PEP as much as
possible. IIRC this means it would launch the highest 2.x version of
Python on the system. If there is no 2.x version, it would launch the
highest known version (presumably 3.x) for now. I would expect this
behavior for any unversioned name of the launcher (e.g. python.exe or
pystart.exe).

> Actually, the one Python script I run regularly is msi.py, and I
> currently launch it in a terminal window, because I need to run it with
> c:\python25\python.exe, which double-clicking won't do for me. If I
> could double-click it, I would like that more (there is also the issue
> that the script needs the VS envvars set, so I'd need to find a solution
> to that, also).

It's implicit scope creep, but perhaps the launcher could be
configured to provide certain environment tweaks, either for all
versions of Python, or specific for each version. A more extreme scope
creep could allow this information to be stored in the .py file, but
that seems backwards to me.

To think a bit further on how the launcher should resolve the version,
we should probably first see if the #! line works as an executable
with optional single argument. This would allow someone to specify
overrides like:

   #! D:\Checkout\...\python

If the file doesn't exist, fall back to scanning for python[.exe] with
or without a version (in order to support scripts with
/usr/bin/python* or /usr/bin/env python*). If it has a version follow
the version as closely as possible (map 2 to latest 2.x, 3 to latest
3.x, etc.). If it doesn't have a version, find the latest 2.x or
latest any version as above, per the most recent relevant PEP. Open
question is what to do if the script clearly requests version 2.6 but
only 2.5, 2.7 and 3.2 are installed, or requests 2.x but only 3.x is
installed; I could see erroring out as "refus[ing] the temptation to
guess".
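
The version-matching part of those rules can be sketched as follows.
pick_version is a hypothetical function, installed versions are assumed to
be known as (major, minor) pairs, and the open question is resolved here by
erroring out rather than guessing.

```python
def pick_version(requested, installed):
    """Choose an installed (major, minor) pair for a requested version.

    requested is None, a major version such as "2", or a full "2.6".
    """
    available = sorted(installed)
    if not available:
        raise LookupError("no Python installations known")
    if requested is None:
        # Unversioned: prefer the newest 2.x, else the newest of anything.
        two_x = [version for version in available if version[0] == 2]
        return (two_x or available)[-1]
    wanted = tuple(int(part) for part in requested.split("."))
    if len(wanted) == 1:
        matches = [version for version in available
                   if version[0] == wanted[0]]
    else:
        matches = [version for version in available
                   if version == wanted[:2]]
    if not matches:
        # Refuse the temptation to guess.
        raise LookupError("no installed Python satisfies %r" % (requested,))
    return matches[-1]


installed = [(2, 5), (2, 7), (3, 2)]
print(pick_version(None, installed))  # (2, 7)
print(pick_version("3", installed))   # (3, 2)
```

With the same list, a request for "2.6" raises LookupError instead of
silently choosing 2.5 or 2.7.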

-- 
Michael Urman


Re: [Python-Dev] [PEPs] Support the /usr/bin/python2 symlink upstream

2011-03-06 Thread Michael Urman
Using batch files is a poor idea, IMO, because you have to explicitly
call a batch file from another, or the remainder of the caller will
not execute. Installing to System32 is also questionable, but if it's
just the launchers, it might be okay. From an installer's perspective,
it would really help if those files kept consistent component GUIDs
and had a Windows version block so it could upgrade consistently if it
changes in the future.

I think Glenn Linderman hit the use cases on the head; I'm unclear why
he was against the overhead of a helper executable. The things I would
really want solutions for are these:
 * double click on a script, and have it launch the right python (2 or
3, w or not)
   * Probably scan for the final python[.\d]+ string and assume it's relevant.
 * be able to easily invoke python to interpret a script from the command prompt

I'd be comfortable with setting associations to a set of thin
executable wrappers which examined the #! line to extract a python
version. It could fall back to the "default" version of python, which
could be defined as the highest installed on the machine, or could be
customizable by the machine's administrator. If this wrapper script
passes on all command line parameters, it could also be a reasonable
way to invoke python from the command line.
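
A sketch of the #! scan such a wrapper might perform. requested_version is a
hypothetical name, and the pattern is a tightened variant of the
python[.\d]+ idea above: the version part is optional and must be digits
separated by dots. A result of None means "use the default version".

```python
import re

# Assumed pattern: "python" optionally followed by a dotted version number.
_PYTHON_NAME = re.compile(r"python(\d+(?:\.\d+)*)?", re.IGNORECASE)


def requested_version(first_line):
    """Return the version named on a #! line, or None if there is none."""
    if not first_line.startswith("#!"):
        return None
    found = _PYTHON_NAME.findall(first_line)
    if not found:
        return None
    # Use the final occurrence; an empty match means no version was given.
    return found[-1] or None


print(requested_version("#!/usr/bin/env python3"))    # 3
print(requested_version("#! /usr/bin/python2.7 -u"))  # 2.7
print(requested_version("#!/usr/bin/python"))         # None
```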

Is there a good way for the wrapper to know what versions of python
are available on Windows? Moving forward we could have a pythonx.y
installer set a value in a known registry key, and document how to
register an older python with this key. The default value of the key
could be the mechanism for setting a default python version.

I'm willing to clarify this and/or look into providing patches if it
would help; the only potentially sticky point is the contribution
agreement, but I wouldn't expect trouble with my employer.

Michael


Re: [Python-Dev] Python merge module

2011-02-03 Thread Michael Urman
On Thu, Feb 3, 2011 at 00:30, "Martin v. Löwis"  wrote:
> Another challenge with shared location merge modules is upgrades:
> the Python installer currently doesn't use stable component IDs;
> I think this would cause problems for users of the merge module.
> Providing stable component IDs is a challenge since it's difficult
> to version the files on disk.
>
> Not sure why Michael thinks that a private location merge module
> would provide no benefits to the user of the merge module.

I hadn't thought it through fully, but the preceding paragraph really
gets to the core of the problem. The maintenance nightmare is security
updates for private location installations by third parties. The only
MSI-friendly way to update that code is through releasing an updated
merge module and having the consuming application also release an
update that uses it. Since Python's components use fresh GUIDs each
time, this would require a "major" upgrade; "minor" upgrades would
cause Windows Installer to throw fits.

Technically this is a problem with the component generation in Python,
and for that in particular, a move to WiX could be very helpful. They
have stable component code generation which keys off of location,
name, platform, etc., but only works for single-file components.
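
To illustrate the general idea of deriving a stable component code from such
attributes, here is a sketch using name-based UUIDs. This is not WiX's actual
algorithm; the namespace UUID and stable_component_guid are hypothetical and
chosen only for the example.

```python
import uuid

# Hypothetical namespace; any fixed UUID works as long as it never changes.
COMPONENT_NAMESPACE = uuid.UUID("12345678-1234-5678-1234-567812345678")


def stable_component_guid(location, file_name, platform):
    """Derive the same GUID every time for the same single-file component."""
    key = "|".join((location.lower(), file_name.lower(), platform.lower()))
    return "{%s}" % str(uuid.uuid5(COMPONENT_NAMESPACE, key)).upper()


first = stable_component_guid("[SystemFolder]", "py.exe", "x86")
again = stable_component_guid("[SystemFolder]", "py.exe", "x86")
other = stable_component_guid("[SystemFolder]", "py.exe", "x64")
print(first == again, first != other)  # True True
```

Because the GUID depends only on the inputs, rebuilding the installer does
not change it, which is the property that random allocation lacks.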

For contrast, I don't see a shared-location merge module as offering
benefits beyond a silently redistributable msi package. The shared
location is better about following component code rules (re-use in
private areas is an allowed gray area), and there are people out there
who consider the reference counting through merge modules to be
superior. I find the resulting complexity in the consuming package's
installation to be more of a down-side.

>> I work on open source projects myself, and we always provide
>> both a merge module and a normal msi installer. It's very little
>> extra work (in WiX at least) to create both.
>
> But what's the quality of these? Ideally, I'd like to create a single
> merge module which, at the option of the user of the merge module,
> produces either a shared or a private installation. Is that still
> only little extra work in Wix?

I've never tried to make a configurable merge module in WiX, but I
think that's the only option if you want a single merge module to
allow both. It should be a one-time authoring overhead with [1]. Using
them is pretty straightforward within the Merge elements [2].

[1] http://wix.sourceforge.net/manual-wix3/wix_xsd_configuration.htm
[2] http://wix.sourceforge.net/manual-wix2/wix_xsd_configurationdata.htm

Michael


Re: [Python-Dev] Python merge module

2011-02-02 Thread Michael Urman
On Wed, Feb 2, 2011 at 15:27, Hoyt, David  wrote:
>> The Installer COM object is the platform standard mechanism, and that's
>> what msilib uses.
>
> Why maintain a lib when there are (better), free alternatives out there
> that are maintained by Microsoft itself? (okay, a group at Microsoft that
> works on their free time, but has significant contributions by many teams
> at Microsoft -- thus they have a vested interest in its success).
>
>> I really see no need to move away from that - it can create arbitrary MSI
>> files.
>
> Can it create merge modules? Transform files (e.g. localization)?
> Bootstrappers? There's a lot more to ms installers than the msi itself.
> Wouldn't it be easier to maintain an XML file than an entire lib dedicated
> to something that someone else has already solved?

If Python was starting at ground zero, and the choices were to create
a library or to use WiX, the answer might have been different. However
with a mature enough library to suit all the needs that anyone has
been willing to author, it's certainly more work to create the WiX
install and maintain it than it is to merely maintain the existing
scripts.

As far as the possibility of distributing Python as a merge module?
I'd recommend against it. Shared location merge modules are a
maintenance nightmare, and private location merge modules may not
offer the benefit you need. Better to just install the main Python msi
as part of a suite with your installer, whether you build that
installer in WiX and burn, or not.

Michael


Re: [Python-Dev] email package status in 3.X

2010-06-22 Thread Michael Urman
On Tue, Jun 22, 2010 at 15:32, Terry Reedy  wrote:
> On 6/22/2010 9:24 AM, Michael Urman wrote:
>> These are trivial functions;
>> I just don't fully understand why the capability isn't baked in.
>
> Possible reasons: They are special purpose functions easily built on the
> basic functions provided. Fine for a 3rd party library. Most people do not
> need them. Some might be misled by them. As others have said, "Not every
> one-liner should be builtin".

Perhaps the two-argument constructions on bytes and str should have
been removed in favor of the .decode and .encode methods on their
respective classes. Or vice versa; I don't have the history to know in
which order they originated, and which is theoretically preferred
these days.

-- 
Michael Urman


Re: [Python-Dev] email package status in 3.X

2010-06-22 Thread Michael Urman
On Tue, Jun 22, 2010 at 00:28, Stephen J. Turnbull  wrote:
> Michael Urman writes:
>
>  > It is somewhat troublesome that there doesn't appear to be an obvious
>  > built-in idempotent-when-possible function that gives back the
>  > provided bytes/str,
>
> If you want something idempotent, it's already the case that
> bytes(b'abc') => b'abc'.  What might be desirable is to make
> bytes('abc') work and return b'abc', but only if 'abc' is pure ASCII
> (or maybe ISO 8859/1).

By idempotent-when-possible, I mean to_bytes(str_or_bytes, encoding,
errors) that would pass an instance of bytes through, or encode an
instance of str. And of course a to_str that performs similarly,
passing str through and decoding bytes. While bytes(b'abc') will give
me b'abc', neither bytes('abc') nor bytes(b'abc', 'latin-1') get me
the b'abc' I want to see.

These are trivial functions; I just don't fully understand why the
capability isn't baked in. A one argument call is idempotent capable;
a two argument call isn't as it only converts.
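
A minimal sketch of the two helpers described above. The names to_bytes
and to_str and the argument order come from this message; the default
values for encoding and errors, and the TypeError for other input types,
are assumptions made for illustration.

```python
def to_bytes(value, encoding="utf-8", errors="strict"):
    """Pass bytes through unchanged; encode str with the given encoding."""
    if isinstance(value, bytes):
        return value
    if isinstance(value, str):
        return value.encode(encoding, errors)
    raise TypeError("expected str or bytes, got %s" % type(value).__name__)


def to_str(value, encoding="utf-8", errors="strict"):
    """Pass str through unchanged; decode bytes with the given encoding."""
    if isinstance(value, str):
        return value
    if isinstance(value, bytes):
        return value.decode(encoding, errors)
    raise TypeError("expected str or bytes, got %s" % type(value).__name__)
```

With these, to_bytes(b'abc', 'latin-1') and to_bytes('abc', 'latin-1')
both give b'abc', so the caller does not have to know which type it holds.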

It's not a completely made-up requirement either. A cross-platform
piece of software may need to present to a user items that are
sometimes str and sometimes bytes - particularly filenames.

> Unfortunately, str(b'abc') already does work, but
>
> st...@uwakimon ~ $ python3.1
> Python 3.1.2 (release31-maint, May 12 2010, 20:15:06)
> [GCC 4.3.4] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> str(b'abc')
> "b'abc'"
> >>>
>
> Oops.  You can see why that probably "should" be the case

Sure, and I love having this there for debugging. But this is hardly
good enough for presenting to a user once you leave ascii.
>>> u = '日本語'
>>> sjis = bytes(u, 'shift-jis')
>>> utf8 = bytes(u, 'utf-8')
>>> str(sjis), str(utf8)
("b'\\x93\\xfa\\x96{\\x8c\\xea'",
"b'\\xe6\\x97\\xa5\\xe6\\x9c\\xac\\xe8\\xaa\\x9e'")

When I happen to know the encoding, I can reverse it much more cleanly.
>>> str(sjis, 'shift-jis'), str(utf8, 'utf-8')
('日本語', '日本語')

But I can't mix this approach with str instances without writing a
different invocation.
>>> str(u, 'argh')
TypeError: decoding str is not supported

-- 
Michael Urman


Re: [Python-Dev] email package status in 3.X

2010-06-21 Thread Michael Urman
On Mon, Jun 21, 2010 at 09:51, P.J. Eby  wrote:
> The issue is, I'd like to have an idempotent incantation that I can use to
> make the inputs and outputs to stdlib functions behave in a type-safe manner
> with respect to bytes, in cases where bytes are really what I want operated
> on.
>
> Note too that this is an argument for symmetry in wrapping the inputs and
> outputs, so that the code doesn't have to "know" what it's dealing with!

It is somewhat troublesome that there doesn't appear to be an obvious
built-in idempotent-when-possible function that gives back the
provided bytes/str, or converts to the requested type per the listed
encoding (as of 3.1.2). Would it be useful to make the second versions
of these work, or would that return us to the confusion of the 2.x
era? On the other hand, since these are all TypeErrors instead of
UnicodeErrors, it's an easy wrapper to write.

>>> bytes('abc', 'latin-1')
b'abc'
>>> bytes(b'abc', 'latin-1')
TypeError: encoding or errors without a string argument

>>> str(b'abc', 'latin-1')
'abc'
>>> str('abc', 'latin-1')
TypeError: decoding str is not supported

Interestingly the online docs for str say it can decode either a byte
string or a character buffer, a term which doesn't yield a definition
in a search; apparently either a string is not a character buffer, or
the docs are incorrect.
http://docs.python.org/py3k/library/functions.html?highlight=str#str

However it looks like this is consistent with int.
>>> int(4, 0)
TypeError: int() can't convert non-string with explicit base

-- 
Michael Urman


Re: [Python-Dev] Reasons behind misleading TypeError message when passing the wrong number of arguments to a method

2010-05-21 Thread Michael Urman
On Thu, May 20, 2010 at 22:54, Steven D'Aprano  wrote:
> This is misleading, since C().method is a bound method which takes one
> argument, x, not two. I find myself wishing that Python distinguished
> between ArgumentError and other TypeErrors, so that the method wrapper
> of bound methods could simply catch ArgumentError and subtract 1 from
> each argument count.

But how exactly could it distinguish between various levels of call nesting?

class C:
    def a(self, i): return self.b(i)
    def b(self, j): return self.c(j)
    def c(self, k): return self.a()

What should C.a(), C().a(), and C().a(1) each yield? Does it change if
c(self, k) calls C.a(self)?
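
To make the question concrete, here is a small sketch that runs the class
above and records where the TypeError surfaces. The helper failing_frames
is hypothetical, written only for this illustration; the exact wording of
the message can vary between Python versions.

```python
import traceback


class C:
    def a(self, i): return self.b(i)
    def b(self, j): return self.c(j)
    def c(self, k): return self.a()  # wrong: a() needs an argument


def failing_frames(call):
    """Run call() and return (function names on the traceback, message)."""
    try:
        call()
    except TypeError as exc:
        frames = traceback.extract_tb(exc.__traceback__)
        return [frame.name for frame in frames], str(exc)
    return [], ""


# The bad call sits three frames down, inside c().
deep_names, deep_message = failing_frames(lambda: C().a(1))
# The bad call is made directly by the caller.
direct_names, direct_message = failing_frames(lambda: C().a())
```

Both calls complain about the same missing argument to a(), yet only the
second is the caller's mistake. A wrapper that adjusts the count has to
know which frame made the bad call, not just that an argument error
passed through it.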

-- 
Michael Urman


Re: [Python-Dev] Fwd: Download Page - AMD64

2010-01-13 Thread Michael Urman
On Wed, Jan 13, 2010 at 13:45, "Martin v. Löwis"  wrote:
>> So to echo what Michael said, the Microsoft nomenclature is "x64"
>> regardless of yours and Martin's objections to that name. Nobody who
>> uses Windows would be confused by "x64" since that is *the* Microsoft
>> naming scheme.
>
> That's actually not entirely true. There are several places in the
> APIs where Microsoft either allows or requires to call the architecture
> AMD64 (e.g. architecture indication in MSI files).

I should have clarified I was talking about the names shown on MSDN
subscriptions for downloading installation media of Windows 7 and
Windows Vista. It's pretty clear in that context Microsoft uses x64 to
describe this platform of Windows. But again, it's far from clear that
this is a term they use for non-developers.

-- 
Michael Urman


Re: [Python-Dev] Fwd: Download Page - AMD64

2010-01-13 Thread Michael Urman
On Wed, Jan 13, 2010 at 00:11, "Martin v. Löwis"  wrote:
> -1. AMD doesn't want us to use the term x86-64 anymore, but wants us
> to use AMD64 instead. I think we should comply - they invented the
> architecture, so they have the right to give it a name. Neither
> Microsoft nor Intel have such a right.

I see and agree with the motivation behind your point, but it's just
as reasonable to mimic the name Microsoft uses: the release is for
Windows rather than the processor. On MSDN Microsoft uses
parentheticals x86, ia64 and x64; this would suggest a name like
Python 2.6.4 installer for Windows (x64). Unfortunately this usage
doesn't seem to be reflected in consumer-oriented product pages, so
I'm uncertain how clear it is for those downloading Python.

-- 
Michael Urman


Re: [Python-Dev] PEP 383 update: utf8b is now the error handler

2009-05-07 Thread Michael Urman
On Thu, May 7, 2009 at 01:16, "Martin v. Löwis"  wrote:
> I'm still at a loss what name to give it, though. I understand that
> I have to rename both error handlers, but I'm uncertain what I should
> rename them to. So proposals that rename only one of them aren't
> that helpful. It would be helpful if people would indicate support
> for Antoine's proposal.

Part of the problem is they both allow byte sequences to decode to
invalid Unicode strings, and in particular they both affect the same
byte subsequences, and that brought us to the crossroads where we
wanted to name both of them "surrogates". So I'll offer a few more
colors, and try to get out of the way of choosing between them or the
other proposed ones. :)

I haven't come up with anything I like better than errors="lenient"
for the old utf8 behavior handler; would errors="nonvalidating" be
correct? It still seems to me that a new codec, perhaps
"utf8-lenient", reads better.

For the utf8b error handler, I could see any of errors="roundtrip",
errors="roundtripreplace", errors="tosurrogate",
errors="surrogatereplace", errors="surrogateescape",
errors="binaryreplace", errors="binaryescape". This includes Antoine's
proposal (sans hyphen).

-- 
Michael Urman


Re: [Python-Dev] PEP 383 update: utf8b is now the error handler

2009-05-07 Thread Michael Urman
On Thu, May 7, 2009 at 00:43, "Martin v. Löwis"  wrote:
> Michael Urman wrote:
>> On Wed, May 6, 2009 at 15:42, "Martin v. Löwis"  wrote:
>>> Despite there being also an error handler called "surrogates".
>>
>> Not that I have to be, but I'm not sold on the previous UTF-8 codec
>> behavior becoming an error handler of the name "surrogates" for two
>> reasons (I do respect the obvious PBP argument for the implementation,
>> and have no better name - "lenient"?).
>
> PBP?

Practicality beats purity. From a purity standpoint, the legacy
invalid utf-8 seems more like an encoding than an error handler to me.
From a practicality standpoint, it's presumably much more convenient
to implement it on top of the new valid UTF-8 codec's behavior. And
then any error handler needs a name.

> Well, there is a way to stack error handlers, although it's not pretty:
> [...]
> codecs.register_error("surrogates_then_replace",
>                      surrogates_then_replace)

That mitigates my arguments significantly, although I'd rather see
something like errors=('surrogates', 'replace') chain the handlers
without additional registrations. But that's a different PEP or
arbitrary change. :)
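
A rough sketch of what such chaining could look like using only
codecs.lookup_error and codecs.register_error. The function
register_chained_handler and the handler name
"surrogateescape_then_replace" are made up for this example, and it
assumes the handler names that later Python 3 releases ship
("surrogateescape", "replace"), not the names under discussion in this
thread.

```python
import codecs


def register_chained_handler(name, *handler_names):
    """Register `name` as a handler that tries each named handler in turn."""
    handlers = [codecs.lookup_error(h) for h in handler_names]

    def chained(exc):
        for handler in handlers[:-1]:
            try:
                return handler(exc)
            except UnicodeError:
                # This handler declined the error; try the next one.
                continue
        # The last handler decides: its result or its exception is final.
        return handlers[-1](exc)

    codecs.register_error(name, chained)


register_chained_handler("surrogateescape_then_replace",
                         "surrogateescape", "replace")

# The escaped byte is restored; the euro sign, which ASCII lacks, becomes '?'.
encoded = "caf\udce9 \u20ac".encode("ascii", "surrogateescape_then_replace")
```

This still needs one registration per combination, which is the
inconvenience an errors=('surrogates', 'replace') spelling would avoid.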

>> The stacking argument also applies to the new utf8b behavior on encode
>> (only, as it handles all errors on decode). This may be a YAGNI
>
> Indeed - in particular, as, in the primary application of this error
> handler (i.e. file IO operations), there is no way of specifying
> an additional error handler anyway.

Would it be useful to allow setting this somewhere? It'd be analogous
to setfsencoding, perhaps a setfsencodingerrors. It's not hard to
imagine an application working on Windows where all Unicode characters
are valid, and constructing backup filenames by adding some arbitrary
character, or receiving them from a user who doesn't understand
encodings. When this application is taken to a non-Unicode filesystem,
without the ability to say "I really want a valid filename: so
replace", that could get messy. But it may still be a YAGNI, or a
"don't do that."

-- 
Michael Urman


Re: [Python-Dev] PEP 383 update: utf8b is now the error handler

2009-05-06 Thread Michael Urman
On Wed, May 6, 2009 at 15:42, "Martin v. Löwis"  wrote:
> Despite there being also an error handler called "surrogates".

Not that I have to be, but I'm not sold on the previous UTF-8 codec
behavior becoming an error handler of the name "surrogates" for two
reasons (I do respect the obvious PBP argument for the implementation,
and have no better name - "lenient"?).

First, unless there's a way to stack error handlers, there's no way to
access the old behavior combined with the "replace" handler. Second,
errors="surrogates" reads like surrogates should be an error, not an
additionally allowed pattern. Neither of these are deal breakers or
hard to learn, but they are non-obvious. I think the utf8b behavior
makes a lot more sense with the name "surrogates", through the
mnemonic that errors become surrogates.

The stacking argument also applies to the new utf8b behavior on encode
(only, as it handles all errors on decode). This may be a YAGNI, but
for a non-UTF-8 encode, it may be useful to allow "xmlcharrefreplace"
handling for unavailable non-surrogate-escaped characters. But without
stacking that's unmaintainable, as we clearly don't want ${codec}b for
all current codecs.

I'd be perfectly happy with utf8b or UTF-8b, as either a codec or an
error handler (do we want both? YAGNI?). So what if it smells a little
inaccurate as a handler when used with codecs other than UTF-8, no big
deal. I could also see something like errors="roundtrip" which
explains the intention of the handler rather than the algorithm, but
is awkward on encode when it encounters unavailable Unicode
characters.

-- 
Michael Urman


Re: [Python-Dev] PEP 383 update: utf8b is now the error handler

2009-05-03 Thread Michael Urman
On Sun, May 3, 2009 at 08:43, Antoine Pitrou  wrote:
> Also, if utf8-b is not provided as a codec, will there be an easy way for user
> code to use the same encoding as the IO layer does? (e.g.
> os.fsdecode/os.fsencode)?

I like the idea of fsencode/fsdecode functions, but we need to be
careful deciding what they accept and produce on Windows. I'd expect
them to be identity functions, but then the difference in platform
behavior suggests perhaps they should be in os.path.

Unicode to Unicode on Windows would further mean fsencode wouldn't be
useful for sending filenames over sockets, and "utf8" will be prone to
exceptions on the very names we're trying to support right now. Is
there an advantage to not providing the "utf8b" behavior as a
registered codec?

-- 
Michael Urman


Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-30 Thread Michael Urman
On Thu, Apr 30, 2009 at 09:42, Thomas Breuel  wrote:
> So, I don't see any reason to prefer your half surrogate quoting to the Mono
> U+0000-based quoting.  Both seem to achieve the same goal with respect to
> round tripping file names, displaying them, etc., but Mono quoting actually
> results in valid unicode strings.  It works because null is the one
> character that's not legal in a UNIX path name.

This seems to summarize only half of the problem. Mono's U+0000
quoting creates a string which is an invalid filename; PEP 383's
creates one which is an unsanctioned collection of code units. Neither
can be passed directly to the posix filesystem in question. I favor
PEP 383 because its Unicode strings can be usefully passed to most
APIs that would display it usefully. Mono's U+0000 probably truncates
most strings. And since such non-valid Unicode strings can occur on
the Windows filesystem, I don't find their use in PEP 383 to be a
flaw.

-- 
Michael Urman


Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Michael Urman
On Mon, Apr 27, 2009 at 23:43, Stephen J. Turnbull  wrote:
> Nobody said we were at the stage of *saving* the [attachment]!

But speaking of saving files, I think that's the biggest hole in this
that has been nagging at the back of my mind. This PEP intends to
allow easy access to filenames and other environment strings which are
not restricted to known encodings. What happens if the detected
encoding changes? There may be difficulties de/serializing these
names, such as for an MRU list.

Since the serialization of the Unicode string is likely to use UTF-8,
and the string for such a file will include half surrogates, the
application may raise an exception when encoding the names for a
configuration file. These encoding exceptions will be as rare as the
unusual names (which the careful I18N aware developer has probably
eradicated from his system), and thus will appear late.

Or say de/serialization succeeds. Since the resulting Unicode string
differs depending on the encoding (which is a good thing; it is
supposed to make most cases mostly readable), when the filesystem
encoding changes (say from legacy to UTF-8), the "name" changes, and
deserialized references to it become stale.

This can probably be handled through careful use of the same
encoding/decoding scheme, if relevant, but that sounds like we've just
moved the problem from fs/environment access to serialization. Is that
good enough? For other uses the API knew whether it was
environmentally aware, but serialization probably will not. Should
this PEP make recommendations about how to save filenames in
configuration files?
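
Both concerns above can be sketched in a few lines. This assumes the
error handler name "surrogateescape" that PEP 383 ended up with in later
Python releases; decode_filename is a hypothetical stand-in for what the
filesystem APIs would do internally.

```python
def decode_filename(raw, fs_encoding):
    """Decode filename bytes; undecodable bytes become lone surrogates."""
    return raw.decode(fs_encoding, "surrogateescape")


raw = b"caf\xe9.txt"                          # Latin-1 bytes, invalid as UTF-8
as_utf8 = decode_filename(raw, "utf-8")       # contains the surrogate U+DCE9
as_latin1 = decode_filename(raw, "latin-1")   # decodes cleanly

# Going back through the same encoding and handler restores the bytes.
round_trip_ok = as_utf8.encode("utf-8", "surrogateescape") == raw

# 1) Writing the name to a strict UTF-8 config file fails on the surrogate.
try:
    as_utf8.encode("utf-8")
    strict_utf8_ok = True
except UnicodeEncodeError:
    strict_utf8_ok = False

# 2) The same file on disk gets a different str under a different encoding,
#    so a name saved under one setting no longer matches under the other.
names_differ = as_utf8 != as_latin1
```

Storing the original bytes, or always serializing with the same encoding
and handler that produced the string, avoids the first problem but not
the second.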

-- 
Michael Urman


Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-25 Thread Michael Urman
On Sat, Apr 25, 2009 at 11:33, "Martin v. Löwis"  wrote:
> If the user has the locale set up in a way that matches his keyboard,
> it should work all fine - and will already, even without the PEP.
> If the user enters a character that doesn't directly map to a
> good file name, you get an exception, and have to tell the user
> to pick a different filename.

This sounds good so far - the 90% (or higher) case is still clean.

> Notice that it may fail at several layers:
> - it may be that characters entered are not supported in what
>  Python chooses as the file system encoding.
> - it may be that the characters are not supported by the file
>  system, e.g. leading spaces in Win32.
> - it may be that the file cannot be renamed because the target
>  name already exists.
> In all these cases, the application has to ask the user to
> reconsider; for at least the last case, it should be prepared
> to do that, anyway (there is also the case where renaming fails
> because of lack of permissions; in that case, picking a different
> file name won't help).

This argument sounds good to me too. How will we communicate to
developers what new exception might occur where? It would be a shame
to have a solid application developed under Windows start raising
encoding exceptions on linux. Would the encoding error get mapped to
an IOError for all file APIs that do this encoding?

-- 
Michael Urman


Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-25 Thread Michael Urman
On Sat, Apr 25, 2009 at 10:00, "Martin v. Löwis"  wrote:
> On decoding, there is a guarantee that it decodes successfully. There is
> also a guarantee that the result will re-encode successfully, and yield
> the same byte string.
>
> If you pass a different string into encoding, you still may get
> exceptions. For example, if the filesystem encoding is latin-1,
> passing u"\u20ac" will continue to raise exceptions, even under the
> python-escape error handler - that error handler will only handle
> surrogates.

One angle I've not seen discussed yet is a set of use cases. While the
PEP addresses the need for the python developer to not have to write
insane conditional code that maps between bytes and str depending on
the platform, it doesn't talk about what this allows an application to
provide to a user, and at what risks.

I see two main user-oriented use cases for the resulting Unicode
strings this PEP will produce on all systems: displaying a list of
filenames for the user to select from (an open file dialog), and
allowing a user to edit or supply a filename (a save dialog or a
rename control).

It's clear what this PEP provides for the former. On well-behaved
systems where a simpler filesystemencoding approach would work, the
results are identical; the user can select filenames that are what he
expects to see on both Unix and Windows. On less well-behaved systems,
some characters may appear as junk in the middle of the name (or would
they be invisible?), but should be recognizable enough to choose, or
at least to open sequentially and remember what the last one was. On
particularly poorly behaved systems, the results will be extremely
difficult to read, but no approach is likely to fix this.

What I don't find clear is what the risks are for the latter. On the
less well behaved system, a user may well attempt to use this python
application to fix filenames. Can we estimate a likelihood that edits
to the names would result in a Unicode string that can no longer be
encoded with the python-escape? Will a new name fully provided by a
user on his keyboard (ignoring copy and paste) almost always safely
encode?

-- 
Michael Urman


Re: [Python-Dev] And the winner is...

2009-03-30 Thread Michael Urman
> We're switching to Mercurial (Hg).

And two hours later, GNOME announces their migration to git is
underway. I'd suspect a series of April Fools jokes, if it weren't two
days early. :)

-- 
Michael Urman


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-07 Thread Michael Urman
On Sun, Dec 7, 2008 at 11:35, Adam Olsen <[EMAIL PROTECTED]> wrote:
>>> http://bugs.python.org/issue3672
>>> http://bugs.python.org/issue3297
>
> No.  Unicode *requires* them to be treated as errors.  If you want to
> pass them through then you're creating a custom encoding... which you
> might argue for in this case, but it needs to be clearly separate from
> the real UTF-8.

I suspect it is a common and convenient but (according to what you
say) misconceived expectation that using UTF-8 to encode any Unicode
string will not raise an exception. This behavior is not something
which should be discarded lightly.

I see little reason that this couldn't be a new codec or error handler
that allowed people to choose between correct pure UTF-8 behavior or
the technically incorrect but very practical behavior it currently
has.

[My apologies, Adam, for sending this only to you the first time]
-- 
Michael Urman


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Michael Urman
On Fri, Dec 5, 2008 at 19:22, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
>> Please, don't do that! Bytes are not characters!
>
> And environment variables, command line arguments, and file names
> are not bytes, but characters.

On Windows NT, sure. On Unix they're still bytes no matter how much we
want them to be characters.

This difference, and secondarily the way python 3 tries to sweep it
under the rug, seem to be the roots of the problem.

-- 
Michael Urman


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Michael Urman
On Fri, Dec 5, 2008 at 18:48, Nick Coghlan <[EMAIL PROTECTED]> wrote:
> Toshio Kuratomi wrote:
>> Nick Coghlan wrote:
>>> Toshio Kuratomi wrote:
>>>> Guido van Rossum wrote:
>>>>> Glob was just an example. Many use cases for directory traversal
>>>>> couldn't care less if they see *all* files.
>>>>>
>>>> Okay.  Makes it harder to prove correct or not if I don't know what the
>>>> use case is :-)  I can't think of a single use case off-hand.
>>>>
>>>> Even your example of a ??.txt file making retrieval of *.py files fail
>>>> is a little broken.  If there was a ??.py file that was undecodable the
>>>> program would most likely want to know that file existed.
>>> Why? Most programs won't be able to do anything with it. And if the
>>> program *can* do something with it... that's what the bytes version of
>>> the APIs are for.
>>>
>> Nonsense.  A program can do tons of things with a non-decodable
>> filename.  Where it's limited is non-decodable filedata.
>
> You can't display a non-decodable filename to the user, hence the user
> will have no idea what they're working on. Non-filesystem related apps
> have no business trying to deal with insane filenames.

And what of python's batteries---does a library that takes filenames
or directories from a controlling program and processes the contents
of the file need to care whether the file can be encoded properly? Is
said library filesystem related or not?

Won't it be awful when it's the directory name, and processing the
file works if you change into its directory, but not if you're outside
of it? And if there's an error during processing and the library
reports a full filename using os.path.abspath("file.ext"), but cannot get
the results?

> Linux is moving towards a standard of UTF-8 for filenames, and once we
> get to the point where the idea of encoding filenames and environment
> variables any other way is seen as crazy, then the Python 3 approach
> will work seamlessly.
>
> In the meantime, raw bytes APIs will provide an alternative for those
> that disagree with that philosophy.

And until that time, it's agony for the library writers who didn't
think they needed to care, but find that their users (other
developers) do.
-- 
Michael Urman


Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Michael Urman
On Tue, Sep 30, 2008 at 7:04 PM, Steven D'Aprano <[EMAIL PROTECTED]> wrote:
>> I believe on disk it uses UTF-16.
>
> Which is made up of bytes. There may be byte sequences that are illegal
> UTF-16, but that's not what Martin said. I don't understand how there
> can be UTF-16 sequences which don't correspond to some sequence of
> bytes. How would they be represented in memory? Is this to do with the
> endianness of the UTF-16 sequence?

It has to do with the internal mapping between the ANSI and Unicode
functions. On NT systems, CreateFileA will map the ANSI bytestring to
a Unicode filename via the active code page, and call CreateFileW
accordingly. The active code page cannot be set to something as useful
as UTF-8, so given any actual code page (1252, 932, etc.) there are
Unicode strings that cannot be represented with a bytestring provided
to the ANSI function.
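
As an illustration only, Python's own cp1252 and cp932 codecs can stand
in for the active code page. This is an assumption: the real Windows
conversion has its own rules and may not match these codecs exactly. The
helper fits_code_page is hypothetical.

```python
def fits_code_page(name, code_page):
    """Return True if `name` survives a round trip through `code_page`."""
    try:
        return name.encode(code_page).decode(code_page) == name
    except UnicodeError:
        return False


japanese_in_932 = fits_code_page("日本語", "cp932")
japanese_in_1252 = fits_code_page("日本語", "cp1252")
cafe_in_1252 = fits_code_page("café", "cp1252")
```

A filename that fits one code page can be unreachable from the ANSI
functions under another, while the Unicode functions see it either way.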
-- 
Michael Urman


Re: [Python-Dev] [Python-3000] [Python-checkins] r62848 - python/trunk/Objects/setobject.c

2008-05-08 Thread Michael Urman
On Thu, May 8, 2008 at 8:20 AM, Barry Warsaw <[EMAIL PROTECTED]> wrote:
> Or aggressively back out any changes from freeze time to tag time.  If we
> don't add the commit hook lock, I will be very strict about this come the
> betas.

I know this way is fairly entrenched in the python release process,
but it sounds like it's using the tools incorrectly. In particular
with subversion it is very easy (compared to cvs) to branch and to switch
branches locally. Why not create a new prerelease branch at the
beginning of freeze and only merge in the critical changes? This way
only the release manager need know or care about the branch, and
nobody else has to really modify his behavior. Then tag, move, and/or
delete the branch as desired.

The obvious stumbling blocks include buildbots not following the new
branch (this could be a blocker), and release scripts possibly needing
modifications if they contain direct svn url references.

-- 
Michael Urman


Re: [Python-Dev] unittest's redundant assertions: asserts vs. failIf/Unlesses

2008-03-19 Thread Michael Urman
On Wed, Mar 19, 2008 at 10:44 AM, Stephen J. Turnbull
<[EMAIL PROTECTED]> wrote:
>  So we should add this to 2to3, no?  They're going to run that anyway.

If 2to3 can handle this, that removes the larger half of my objection.
I was under the impression that this kind of semantic inferencing was
beyond its capabilities. But even if so, maybe it's safe to assume
that those names aren't used in other contexts.

My remaining smaller half of the objection is that these aliases
appear to have been added to reduce the friction when moving from
another unit test system. Since the exact names are as much a matter
of muscle memory as anything else being changed by py3k, that's not
very important in this context.

I still don't see the benefit justifying the cost. Are people
genuinely confused by the plethora of names for the operations
(instead of by their occasional "misuse")? But I'm not the one
offering a patch here, so I'll pipe down now. :)
-- 
Michael Urman


Re: [Python-Dev] unittest's redundant assertions: asserts vs. failIf/Unlesses

2008-03-19 Thread Michael Urman
> OTOH, I'd rather there be OOWTDI so whatever the consensus is is fine
> with me.

This strikes me as a gratuitous API change of the kind Guido was
warning about in his recent post: "Don't change your APIs incompatibly
when porting to Py3k"

Yes it removes redundancy, but it really doesn't change the cognitive
load (at least for native speakers). If the blessed set were
restricted to assert*, what would users of fail* do when trying to
test their packages on py3k? Search and replace, or monkey patch
unittest? I'm guessing monkey patch unittest, which means the change
saves nothing, and costs plenty.
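
To make the "monkey patch unittest" option concrete, here is a rough sketch of what such a patch might look like. It assumes a unittest that ships only the assert* names (as recent Python 3 releases do); the alias table and the LegacyTest class are made up for illustration.

```python
import unittest

# Map old fail* spellings onto the surviving assert* methods.
# Only add an alias when this unittest does not already provide it.
ALIASES = {
    "failUnless": "assertTrue",
    "failIf": "assertFalse",
    "failUnlessEqual": "assertEqual",
}
for old_name, new_name in ALIASES.items():
    if not hasattr(unittest.TestCase, old_name):
        setattr(unittest.TestCase, old_name, getattr(unittest.TestCase, new_name))


class LegacyTest(unittest.TestCase):
    """Hypothetical test written against the old names."""

    def test_old_names(self):
        self.failUnless(1 + 1 == 2)
        self.failIf(1 + 1 == 3)
        self.failUnlessEqual(1 + 1, 2)


# Run the test case programmatically and collect the outcome.
result = unittest.TestResult()
unittest.defaultTestLoader.loadTestsFromTestCase(LegacyTest).run(result)
print(result.testsRun, result.wasSuccessful())
```

A handful of lines like these in a package's test setup would undo the removal, which is the sense in which the change saves little.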

Note the acronym is OOWTDI, not OONTDI - using a different name does
not necessarily make it a different way.
-- 
Michael Urman


Re: [Python-Dev] trunc()

2008-01-27 Thread Michael Urman
Is this a valid summary of the arguments so far?

I see two arguments for the change:

  1) The semantics of trunc() are clear: it maps R -> Z in a specific fashion
  2) The semantics of int() are fuzzy; even non-numeric types
(strings) are handled

Yet there will be a __trunc__ that will allow any chosen mapping to be
implemented, so long as it results in an integer, so (1) is only
guaranteed true for the builtin types. This leaves us with (2) which
seems strongly tied to string parsing (as __index__ resolved the other
common X -> integer case).

I see one main argument against:

  *) trunc() results in duplication at best, zealous deprecation at worst

Given that the deprecation or removal of int(2.3) has been dropped,
the argument is about pointless duplication.


What problem is trunc() supposed to solve if it does to floats what
int() does now? I've done some initial code searches for: lang:python
"int(", and I've seen three primary uses for calling int(x):

  a) parsing strings, such as from a user, a config file, or other
serialized format
  b) limiting the input to a function to an integer, such as in a
calendar trying to ensure it has integer months
  c) truncation, such as throwing away sub-seconds from time.time(),
or ensuring integer results from division

It's unclear to me whether (b) could be better served by more
type-specific operations that would prevent passing in strings, or
whether uses like (c) often have latent bugs due to truncation instead
of rounding.

If trunc() is to clarify round vs. integer-portion, it's something
people learn -- the general lack of comments in (c) usages indicates
nobody considers it special behavior. If it's to clarify the
argument's type (the parsing of strings vs. getting integers from
other numeric types), would separating parsing from the int (and
float) constructors also solve this?

Is the aim to "clean up" the following fake example? (Real world uses
of map(int, ...) seem almost uniformly related to string parsing.)

>>> map(int, ("42", 6.022, 2**32))
[42, 6, 4294967296L]
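
For comparison, a small sketch of the distinction under discussion, written for Python 3 (where math.trunc exists and the L suffix is gone). The strict_trunc name is only illustrative.

```python
import math

values = ("42", 6.022, 2**32)

# int() both parses strings and truncates floats.
as_ints = [int(v) for v in values]
print(as_ints)


def strict_trunc(x):
    """Truncate a number toward zero; strings are rejected, not parsed."""
    return math.trunc(x)


print(strict_trunc(6.022), strict_trunc(-6.9))
try:
    strict_trunc("42")
except TypeError as exc:
    print("rejected:", exc)
```

The only visible difference from int() is the rejected string, which is argument (2) above rather than argument (1).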

-- 
Michael Urman


Re: [Python-Dev] Syntax suggestion for imports

2008-01-02 Thread Michael Urman
On Jan 2, 2008 7:19 PM, Raymond Hettinger <[EMAIL PROTECTED]> wrote:
> How about a new, simpler syntax:
>
> * import threading or dummy_threading as threading
>
> * import xml.etree.CElementTree or cElementTree or elementree.ElementTree as ET
>
> * from cStringIO or StringIO import StringIO
>
> * import readline or emptymodule

I'm sympathetic to this syntax proposal, as I find the repetition and
extra lines for a simple idea to be a little unpleasant. This would
also allow us to decide whether the import 'or' handling would be
triggered equivalently to the except ImportError case you described,
or only for missing imports. The latter would stop errors in existing
modules from being silenced (and may give reason to allow emptymodule
or None), but I'm unsure if that's a good thing.
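
The "only for missing imports" variant can be approximated today with a helper function. This is a sketch for Python 3.6 or later (it relies on ModuleNotFoundError); import_first is a hypothetical name, not an existing API.

```python
import importlib


def import_first(*names):
    """Return the first module in `names` that can be found (hypothetical helper)."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ModuleNotFoundError as exc:
            missing = exc.name or ""
            # Fall through only when the requested module (or one of its
            # parent packages) is absent. A missing dependency *inside* an
            # existing module is a real error and is re-raised.
            if not (name == missing or name.startswith(missing + ".")):
                raise
    raise ImportError("none of %s could be imported" % ", ".join(names))


json_mod = import_first("no_such_module_for_demo", "json")
print(json_mod.__name__)
```

Errors raised while executing an existing module (anything other than ModuleNotFoundError) propagate unchanged, so they are not silenced the way a bare `except ImportError` would silence them.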

Of the above I find the ElementTree migration to be the most
compelling, yet it seems ill-timed to bring syntax into Python 2.x
you'd be unable to use until you no longer needed it. However your
later example of the PageTemplateFile, which appears to be due to a
third-party module reorganization and could certainly happen during
the lifetime of late 2.x, helps swing me back the other way.

-- 
Michael Urman


Re: [Python-Dev] Registry keys written by x64 installer

2007-07-13 Thread Michael Urman
On 7/13/07, Michael Urman <[EMAIL PROTECTED]> wrote:
> That's even easier then, if anything's actually wrong. I'll find some
> time this weekend to look at it and report back. Would the one at the
> following URL be the correct one to verify?
>
> http://www.python.org/ftp/python/2.5.1/python-2.5.1.amd64.msi
> (linked from http://www.python.org/download/)

Assuming this is the right file, the cause of the behavior Mark
reported is pretty clear. While the template summary is indeed x64,
the attributes on the registry components are all 4 instead of 256 |
4, so they are placed in the 32-bit reflected registry. I don't know
if this is desirable for some other reason.
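
As a small illustration of the attribute arithmetic, here is a sketch that decodes the two bits involved. The constant names follow the Windows Installer Component table documentation; the describe function is made up for this example.

```python
# Component table attribute bits (values per the Windows Installer docs).
msidbComponentAttributesRegistryKeyPath = 4
msidbComponentAttributes64bit = 256


def describe(attributes):
    """Return (registry view, key path is a registry value) for a component."""
    view = "64-bit" if attributes & msidbComponentAttributes64bit else "32-bit"
    registry_keypath = bool(attributes & msidbComponentAttributesRegistryKeyPath)
    return view, registry_keypath


print(describe(4))        # what the components currently carry
print(describe(256 | 4))  # what a 64-bit registry component would carry
```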

-- 
Michael Urman


Re: [Python-Dev] Registry keys written by x64 installer

2007-07-13 Thread Michael Urman
On 7/13/07, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> Michael Urman schrieb:
> > Right - it sets the template summary to include Intel64, not x64.
>
> You might be looking at the wrong version. In Python 2.5, it also
> sets it to x64, if the PE machine type is 0x8664.

I've looked most closely at
http://svn.python.org/view/python/trunk/Lib/msilib/__init__.py?rev=47280&view=auto,
and from there not even full readings yet, just searching for Win64 to
see what the flag did. No doubt I have missed several intricacies.

> As we don't want any redirection, setting the flag on all components
> should be correct.

It may well be fine for the Python installers. Without the flag,
locations such as the 64-bit folder destinations resolve to the 32-bit
redirected locations, and the registry resolves to the 32-bit
reflected registry. The former isn't particularly helpful in any case
I can think of (as the non 64-bit locations are available anyway), but
the latter could be important if a 64-bit install needs to set 32-bit
registry keys.

> > The former, with hints of caution: it appears the unused 64-bit code
> > paths of msilib were created to best serve under incorrect
> > assumptions. With what the code would create (with or without Win64
> > set), it will not generate the 64-bit registry keys that the 64-bit
> > program will access.
>
> Why do you say that? What registry keys do you think it creates,
> and what registry keys do you think it should create instead?

Perhaps it's my reading the wrong version (the one I linked above
doesn't even have the string x64), or my assumption that msilib would
target a more general use case than pure 64-bit / pure 32-bit
installers. Or I missed the easy way to interleave 64 and 32-bit
components.

> There are no "64-bit registry keys" on Windows; Win64 only
> has "normal registry keys" and "32-bit (i.e. redirected)
> registry keys".

I find that nomenclature distinction to be more confusing than
referring to them "incorrectly" :)

> I always test whether the AMD 64 binary installs correctly before
> releasing it.

And I ran no tests at all, so I defer to you here.

> You don't need to provide patches - just tell me what's wrong with
> the MSI file. I.e. look at it in orca, correct it so that it works
> correctly, then report what changes you made.

That's even easier then, if anything's actually wrong. I'll find some
time this weekend to look at it and report back. Would the one at the
following URL be the correct one to verify?

http://www.python.org/ftp/python/2.5.1/python-2.5.1.amd64.msi
(linked from http://www.python.org/download/)

-- 
Michael Urman


Re: [Python-Dev] Registry keys written by x64 installer

2007-07-13 Thread Michael Urman
On 7/13/07, Mark Hammond <[EMAIL PROTECTED]> wrote:
> On Friday, 13 July 2007, Michael Urman wrote:
> I suspect I'm still missing something here.  The title of the page you
> referenced before is "Using 64-Bit Windows Installer Packages" - I suspect
> that is different than a 32-Bit installer package installing a 64bit
> program!

I haven't worked with it enough to know all the intricacies, but the
short of what I know is if your template is i386, you can only install
to 32-bit locations. If it is x64 or Intel64, then you can only
install on machines of those architectures, and the 64-bit locations
become available to components marked as 64-bit.

> using _winreg is (almost) like using the API directly.
> RegDisable[/Enable]ReflectionKey appears to let the 32bit process see the
> real keys - I'm not aware of how 64bit apps would enable that reflection,
> but it probably doesn't really matter for our purposes.  In case anyone is
> interested, I just made a patch to _winreg.c adding these 2 functions
> (http://python.org/sf/1753245) in case anyone would like to review it.

http://msdn2.microsoft.com/en-us/library/aa384129.aspx describes the
KEY_WOW64_64KEY and KEY_WOW64_32KEY flags, allowing explicit access to
either set from either type of application.

> I think Tools\msi is what you are looking for, but hopefully Martin will
> chime in.  I'm more than happy to help test.

Thanks, and I see he has, and perhaps I've been looking at an incorrect file...

-- 
Michael Urman


Re: [Python-Dev] Registry keys written by x64 installer

2007-07-12 Thread Michael Urman
On 7/12/07, Mark Hammond <[EMAIL PROTECTED]> wrote:
> Why wouldn't it work for x64 machines?  Is it simply because msilib only
> handles Intel64 when that flag is set?

Right - it sets the template summary to include Intel64, not x64.
Furthermore only one architecture may be set in the template summary,
so an installer may be only one of i386, x64, and Intel64 (although
the latter are assumed to also be able to run i386 binaries).

> What 32bit components should a 64bit build of Python include?  Perhaps you
> mean *could* - but IIUC, there is no intention to release 32bit and 64bit
> versions of Python in a single package (and further, IIUC, no intent on
> supporting a 32bit and 64bit installation on the same machine, regardless of
> packaging)

Agreed. I was just making clear that I'm not familiar with what the
MSI includes, and whether any of the components in a 64-bit install
should be 32-bit or not. With the msilib code as is, it appears to be
all or nothing, or rely on tweaking a global between calls to
start_component.

> I'm afraid its not clear to me if you are agreeing with me (ie, that the
> registry keys are incorrect), or disagreeing with me (the keys are what you
> would expect a correct x64 install to create)?  I think you are agreeing,
> but sounding a caution that it might not be trivial to fix, but I would like
> to be sure...

The former, with hints of caution: it appears the unused 64-bit code
paths of msilib were created to best serve under incorrect
assumptions. With what the code would create (with or without Win64
set), it will not generate the 64-bit registry keys that the 64-bit
program will access. With Win64 set it will not even install except on
an Itanium system. If you just want to get to the keys it currently
sets, there should be an override parameter that causes the registry
API to read the 32-bit keys even in a 64-bit process, but I'm not
familiar with using _winreg.

If there's interest and I can get pointers to where the MSI files are
built, I can look into patching it. I don't have a convenient 64-bit
Windows machine around to test any changes, though.

-- 
Michael Urman


Re: [Python-Dev] Registry keys written by x64 installer

2007-07-12 Thread Michael Urman
On 7/12/07, Mark Hammond <[EMAIL PROTECTED]> wrote:
> I'm afraid my knowledge of MSI is very limited, so I'm not sure where to
> start.  One thing I did notice is that msilib\__init__.py has a variable
> 'Win64' set, hard-coded to 0 - but I've no idea if it is relevant.
> Presumably it is relevant to *something*, otherwise it would not have been
> created - but its unclear when and how this should be set to 1, and if this
> should concern people trying to use bdist_msi to create x64 extension
> packages - but for now, let's just stick with the topic at hand - the
> registry keys set by the installer.

Per the requirements documented at
http://msdn2.microsoft.com/En-US/library/aa372396.aspx, the behavior
you describe is expected for a 32-bit installer. (To install files and
registry to 64-bit locations, the Template Summary must include
Intel64 or x64 depending on which architecture, and the component must
be marked as 64-bit).

I'm not familiar with how msilib is invoked to create the MSI files in
question, but it does look like setting Win64 to 1 at an early enough
time would cause an Intel64 installer to be built, along with entirely
64-bit components. This wouldn't work for x64 machines, and all
components being 64-bit may be incorrect: potentially the 64-bit
installer should have some 32-bit components.

-- 
Michael Urman


Re: [Python-Dev] Proposal to revert r54204 (splitext change)

2007-03-15 Thread Michael Urman
On 3/15/07, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> If we both agree that the old behavior was erroneous, then I
> cannot understand why you want to see the patch reverted.

I think at least part of the disagreement is over the classification
of the earlier behavior as "erroneous". Both unexpected and
undesirable have certainly been common classifications, but as not
everyone agrees, and a very visible example in Windows Explorer
disagrees, it's hard to settle on this behavior being simply incorrect.
Thus it's a value judgement. Unlike other value judgements reflected
in Misc/NEWS, there are no similar APIs with which we can compare
behavior and match to increase consistency.

Michael
-- 
Michael Urman  http://www.tortall.net/mu/blog


Re: [Python-Dev] Proposal to revert r54204 (splitext change)

2007-03-14 Thread Michael Urman
On 3/14/07, Anthony Baxter <[EMAIL PROTECTED]> wrote:
> Steering clear of the rest of the discussion, I'd just like to give
> a hearty "+1" for this not going into 2.5.x in any way shape or
> form.

Agreed. I'd further vote for keeping this change out until 3.x because
it is a behavior change in a corner case predicated on a value
judgement. Yes I find the idea of an extension without a filename to
be silly. However this change punishes he who checked the corner cases
to help he who did not.

If this change is primarily geared to help in the case where people
want to retrieve the file name without the extension, we should add a
function to return this basic name. Who would rather see
os.path.dropext(path)?
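
A sketch of what such a function might look like, built on os.path.splitext as it behaves in current Python releases (where a leading dot is not treated as an extension). dropext does not exist in os.path; it is defined here purely for illustration.

```python
import os.path


def dropext(path):
    """Hypothetical helper: the path with its last extension removed."""
    return os.path.splitext(path)[0]


for p in ("report.txt", "archive.tar.gz", ".cshrc", "dir.d/README"):
    print(repr(p), "->", repr(dropext(p)))
```

Callers who only want the base name get it without having to reason about what the second element of the splitext tuple means for dotfiles.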

Michael
-- 
Michael Urman  http://www.tortall.net/mu/blog


Re: [Python-Dev] [NPERS] Re: a feature i'd like to see in python #2: indexing of match objects

2006-12-07 Thread Michael Urman
On 12/6/06, Josiah Carlson <[EMAIL PROTECTED]> wrote:
> Special cases aren't special enough to break the rules.

Sure, but where is this rule that would be broken? I've seen it
invoked, but I've never felt it myself. I seriously thought of slicing
as returning a list of elements per range(start,stop,skip), with the
special case being str (and then unicode followed)'s type
preservation.

This is done because a list of characters is a pain to work with in
most contexts, so there's an implicit ''.join on the list. And
assuming that the joined string is the desired result, it's much
faster to have just built it in the first place. A pure practicality
beats purity argument.

We both arrive at the same place in that we have a model describing
the behavior for list/str/unicode, but they're different models when
extended outside.

Now that I see the connection you're drawing between your argument and
the paper, I don't believe it's directly inspired by the paper. I read
the paper to say those who could create and work with a set of rules,
could learn to work with the correct rules. Consistency in Python
makes things easier on everyone because there's less to remember, not
because it makes us better learners of the skills necessary for
programming well. The arguments I saw in the paper only addressed the
second point.
-- 
Michael Urman  http://www.tortall.net/mu/blog


Re: [Python-Dev] [NPERS] Re: a feature i'd like to see in python #2: indexing of match objects

2006-12-07 Thread Michael Urman
On 12/7/06, Fredrik Lundh <[EMAIL PROTECTED]> wrote:
> (and while you guys are waiting, I suggest you start a new thread where
> you discuss some other inconsistency that would be easy to solve with
> more code in the interpreter, like why "-", "/", and "**" doesn't work
> for strings, lists don't have a "copy" method, sets and lists have
> different API:s for adding things, we have hex() and oct() but no bin(),
> str.translate and unicode.translate take different arguments, etc.  get
> to work!)

Personally I'd love a way to get an unbound method that handles either
str or unicode instances. Perhaps py3k's unicode realignment will
effectively give me that.

(And agreed on there being no reason that supporting indexing requires
supporting slicing.  But also agreed that match slicing could be as
useful as indexing. Really I don't use regexps enough in Python to
have a position; I was more interested in figuring out where the
type(m) == type(m[:]) idea had come from, as I had never formed it.)
-- 
Michael Urman  http://www.tortall.net/mu/blog


Re: [Python-Dev] [NPERS] Re: a feature i'd like to see in python #2: indexing of match objects

2006-12-06 Thread Michael Urman
On 12/6/06, Josiah Carlson <[EMAIL PROTECTED]> wrote:
> *We* may not be confused, but it's not about us (I'm personally happy to
> use the .group() interface); it's about relative newbies who, generally
> speaking, desire/need consistency (see [1] for a paper showing that
> certain kinds of inconsistencies are bad  - at least in terms of grading
> - for new computer science students). Being inconsistent because it's
> *easy*, is what I consider silly. We've got the brains, we've got the
> time, if we want slicing, let's produce a match object. If we don't want
> slicing, or if producing a slice would produce a semantically
> questionable state, then let's not do it.

The idea that slicing a match object should produce a match object
sounds like a foolish consistency to me. It's a useful invariant of
lists that slicing them returns lists. It's not a useful invariant of
sequences in general. This is similar to how it's a useful invariant
that indexing a string returns a string; indexing a list generally
does not return a list.
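
A quick sketch of the point, using Python 3 builtins plus a made-up Countdown class whose slices are plain lists. Nothing here is taken from the stdlib classes mentioned below; it only shows that a sequence can be perfectly usable without type(s) == type(s[:]).

```python
class Countdown:
    """Made-up sequence n, n-1, ..., 1 whose slices are ordinary lists."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, i):
        if isinstance(i, slice):
            # Slicing deliberately returns a list, not another Countdown.
            return [self[j] for j in range(*i.indices(self.n))]
        if i < 0:
            i += self.n
        if not 0 <= i < self.n:
            raise IndexError(i)
        return self.n - i


samples = [[1, 2, 3], "abc", Countdown(3)]
for seq in samples:
    # Columns: container type, type of a full slice, type of one item.
    print(type(seq).__name__, type(seq[:]).__name__, type(seq[0]).__name__)
```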

I only found a couple __getslice__ definitions in a quick perusal of
stdlib. ElementTree.py's _ElementInterface class returns a slice from
a contained list; whereas sre_parse.py's SubPattern returns another
SubPattern. UserList and UserString also define __getslice__ but I
don't consider them representative of the standards of non-string/list
classes.

As an aside, if you're trying to show that inconsistencies in a
language are bad by referencing a paper showing that people who used
consistent (if incorrect) mental models scored better than those who
did not, you may have to explain further; I don't see the connection.

-- 
Michael Urman  http://www.tortall.net/mu/blog


Re: [Python-Dev] __str__ and unicode

2006-12-06 Thread Michael Urman
> I don't have anything older than 2.4 laying around either, but IIRC
> in 2.3 unicode() did not call __unicode__().

It turns out __unicode__() is called on Python 2.3.5.

% python2.3
Python 2.3.5 (#2, Oct 18 2006, 23:04:45)
[GCC 4.1.2 20061015 (prerelease) (Debian 4.1.1-16.1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> class Foo(object):
...     def __unicode__(self):
...         print "unicode"
...         return u"hi"
...     def __str__(self):
...         print "str"
...         return "hello"
...
>>> unicode(Foo())
unicode
u'hi'

-- 
Michael Urman  http://www.tortall.net/mu/blog


Re: [Python-Dev] Passing floats to file.seek

2006-11-13 Thread Michael Urman
On 11/13/06, Steve Holden <[EMAIL PROTECTED]> wrote:
> In which case an immediate transition to error status would seem to
> offer a way of providing an effective education. Deprecation may well be
> the best way to go for customer-friendliness, but anyone who believes
> 1e6 is an int should be hit with a stick.

Right, but what about those people who just didn't examine it? I
consider myself a pretty good programmer, and was surprised by Guido's
remark. A little quick self-education later, I understood.

Still I find the implication that anyone using 1e6 for an integer
should be (have all their users) beaten absurd in the context of
backwards compatibility. Especially when they were using one of the
less apparent floats in a place that accepted floats. Perhaps it would
be a fine change for py3k.

> Next thing you know some damned fool is going to suggest that 1e6 gets
> parsed into a long integer.

I can guess why it isn't, but it seems more a matter of ease than a
matter of doing what's right. I had expected it to be an int because I
thought of 1e6 as a shorthand for (1 * 10 ** 6), which is an int. 1e-6
would be (1 * 10 ** -6) which is a float. 1.0e6 would be (1.0 * 10 **
6) which would also be a float. Clearly instead the e wins out as the
format specifier.
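
The difference between the literal and the spelled-out arithmetic is easy to check. A sketch in Python 3; as_offset is a made-up name standing in for any API that wants a true integer.

```python
import operator

# The literal is a float; the arithmetic spelling stays an int.
print(type(1e6).__name__, type(1 * 10 ** 6).__name__, 1e6 == 10 ** 6)
print(type(1 * 10 ** -6).__name__, type(1.0 * 10 ** 6).__name__)


def as_offset(value):
    """Accept only real integers, as an index or file offset would."""
    return operator.index(value)


print(as_offset(10 ** 6))
try:
    as_offset(1e6)
except TypeError as exc:
    print("rejected:", exc)
```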

I'm not going to argue for it to be turned into an int, or even
suggest it, after all compatibility with obscure realities of C is
important. I'm just going to say that it makes more sense to me than
your reaction indicates.
-- 
Michael Urman  http://www.tortall.net/mu/blog


Re: [Python-Dev] Path object design

2006-11-04 Thread Michael Urman
On 11/3/06, Steve Holden <[EMAIL PROTECTED]> wrote:
> Having said this, Andrew *did* demonstrate quite convincingly that the
> current urljoin has some fairly egregious directory traversal glitches.
> Is it really right to punt obvious gotchas like
>
>  >>> urlparse.urljoin("http://blah.com/a/b/c", "../../../../")
>
> 'http://blah.com/../../'

Ah, but how do you know when that's wrong? At least under ftp:// your
root is often a mid-level directory until you change up out of it.
http:// will tend to treat the targets as roots, but I don't know that
there's any requirement for a /.. to be meaningless (even if it often
is).
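
For reference, a sketch of how the same call behaves with urllib.parse in Python 3.5 and later, which as far as I know follows the RFC 3986 resolution rules and simply drops ".." segments that would climb above the root. Older versions keep them, as in the quoted output.

```python
from urllib.parse import urljoin

base = "http://blah.com/a/b/c"

# Four levels up from a path that is only two directories deep.
result = urljoin(base, "../../../../")
print(result)

# An ordinary relative reference, for comparison.
sibling = urljoin(base, "../d")
print(sibling)
```

Whether discarding the excess segments is right for ftp:// style roots is exactly the open question above; the sketch only shows what the library does.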

-- 
Michael Urman  http://www.tortall.net/../mu/blog ;)


Re: [Python-Dev] PEP 351 - do while

2006-10-01 Thread Michael Urman
On 10/1/06, Ron Adam <[EMAIL PROTECTED]> wrote:
> (I don't think this has been suggested yet.)
>
>  while , :
> 

[snip]

> Putting both the entry and exit conditions at the top is easier to read.

I agree in principle, but I thought the proposed syntax already has
meaning today (as it turns out, parentheses are required to make a
tuple in a while condition, at least in 2.4 and 2.5). To help stave
off similar confusion I'd rather see a pseudo-keyword added. However
my first candidate "until" seems to apply a negation to the exit
condition.

while True until False:  # run once? run forever?
while True until True:  # run forever? run once?

It's still very different from any existing syntax I can think of
in python. I'm not sure I like the idea.

Michael
-- 
Michael Urman  http://www.tortall.net/mu/blog


[Python-Dev] Change in file() behavior in 2.5

2006-09-07 Thread Michael Urman
Hi folks,

Between 2.4 and 2.5 the behavior of file or open with the mode 'wU'
has changed. In 2.4 it silently works. In 2.5 it raises a ValueError.
I can't find any more discussion on it in python-dev than tangential
mentions in this thread:
  http://mail.python.org/pipermail/python-dev/2006-June/065939.html

It is (buried) in NEWS. First I found:
  Bug #1462152: file() now checks more thoroughly for invalid mode
  strings and removes a possible "U" before passing the mode to the
  C library function.
Which seems to imply different behavior than the actual entry:
  bug #967182: disallow opening files with 'wU' or 'aU' as specified by PEP
  278.

I don't see anything in PEP 278 about a timeline, and wanted to make
sure that transitioning directly from working to raising an error was
a desired change. This actually caught a bug in an application I work
with, which used an explicit 'wU', that will currently stop working
when people upgrade Python but not our application.
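
A minimal way to reproduce the behavior, sketched for Python 3 (where 'wU' is also rejected with a ValueError, though the message wording differs between versions). try_mode is just a throwaway helper for this example.

```python
import os
import tempfile


def try_mode(mode):
    """Open a scratch file with `mode` and report what happened."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, mode):
            return "ok"
    except ValueError:
        return "ValueError"
    finally:
        os.remove(path)


print("w  ->", try_mode("w"))
print("wU ->", try_mode("wU"))
```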

Thanks,
Michael
-- 
Michael Urman  http://www.tortall.net/mu/blog


Re: [Python-Dev] SyntaxError: can't assign to function call

2006-08-10 Thread Michael Urman
On 8/9/06, Michael Hudson <[EMAIL PROTECTED]> wrote:
> The question doesn't make sense: in Python, you assign to a name,
> an attribute or a subscript, and that's it.

Just to play devil's advocate here, why not to a function call via a
new __setcall__? I'm not saying there's the use case to justify it,
but I don't see anything that makes it a clear abomination or
impossible with python's syntax.
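
For context, a sketch of where the line currently sits: the compiler rejects a call as an assignment target, while subscript assignment is routed to a method on the object. A __setcall__ would presumably work like the latter. The Registry class is invented for the example.

```python
def compiles(source):
    """Return True if `source` is accepted by the compiler."""
    try:
        compile(source, "<demo>", "exec")
        return True
    except SyntaxError:
        return False


class Registry:
    """Made-up container showing how subscript assignment is dispatched."""

    def __init__(self):
        self._values = {}

    def __setitem__(self, key, value):
        self._values[key] = value

    def __getitem__(self, key):
        return self._values[key]


print(compiles("f(1) = 2"), compiles("f[1] = 2"), compiles("f.x = 2"))

r = Registry()
r["answer"] = 42   # calls Registry.__setitem__
print(r["answer"])
```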

Michael
-- 
Michael Urman  http://www.tortall.net/mu/blog


Re: [Python-Dev] Dict suppressing exceptions

2006-08-10 Thread Michael Urman
On 8/10/06, M.-A. Lemburg <[EMAIL PROTECTED]> wrote:
> Guido van Rossum wrote:
> > But since people are adamant that they want this in sooner, I suggest
> > that to minimize breakage we could make an exception for str-unicode
> > comparisons.
> > What do people think?
>
> I'd suggest that we still inform the programmers of the problem
> by issuing a warning (which they can then silence at will),
> maybe a new PyExc_UnicodeWarning.
>
> BTW, in Py3k, this case would not trigger at all, since all text
> would be Unicode and bytes wouldn't be comparable to Unicode
> anyway. However, that's a different discussion which we can have
> after Python 2.5 is out the door.

I strongly believe that unicode vs str here is the symptom and not the
actual problem. The comparison between two non-us-ascii str/unicode
instances is but one of many ways to raise an exception during
comparison. It's not even obvious ahead of time when it will occur.
Try my example below with (sys.maxint << 1) and -2 instead of 1 and 1.
Do you expect a problem?

Because str/unicode is the common case, we saw it first. If we address
the symptom instead of the problem, someone will be complaining within
a year's time because they have a class that they mix in with other
items for a function handler lookup, or who knows what, that works
like the following:

>>> class hasher(object):
...     def __init__(self, hashval):
...         self.hashval = hashval
...     def __hash__(self):
...         return hash(self.hashval)
...     def __eq__(self, o):
...         if not isinstance(o, hasher):
...             raise TypeError("Cannot compare hashval to non hashval")
...         return self.hashval == o.hashval

in python2.4:
>>> dict.fromkeys([1, hasher(1)])
{1: None, <__main__.hasher object at 0xa7a5326c>: None}

in python2.5b2:
>>> dict.fromkeys([1, hasher(1)])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 8, in __eq__
TypeError: Cannot compare hashval to non hashval

Yes this is made up code. But I'm not arguing for a feature; I'm
arguing for backwards compatibility. Because we do not know where
these legitimate uses are, we cannot evaluate their likelihood to
exist nor the level of breakage they will cause. If we make this a
warning for any exception, we can satisfy both imagined camps. Those
in Armin's position can make that warning raise an exception while
debugging, and those using it on purpose can squash it.
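
To show how a warning could serve both camps, here is a sketch using the standard warnings filters. ComparisonWarning, tolerant_eq and Picky are all hypothetical names; this is not how dict lookup is implemented, only an illustration of the filter mechanics.

```python
import warnings


class ComparisonWarning(Warning):
    """Hypothetical category for exceptions swallowed during comparison."""


def tolerant_eq(a, b):
    """Compare a and b, turning an exception into a warning plus False."""
    try:
        return a == b
    except Exception as exc:
        warnings.warn("comparison raised %r" % (exc,), ComparisonWarning, stacklevel=2)
        return False


class Picky:
    """Made-up class whose __eq__ always raises."""

    def __eq__(self, other):
        raise TypeError("cannot compare")

    __hash__ = object.__hash__


# Those relying on the old behavior squash the warning.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", ComparisonWarning)
    print(tolerant_eq(Picky(), 1))

# Those debugging turn the warning into an exception.
with warnings.catch_warnings():
    warnings.simplefilter("error", ComparisonWarning)
    try:
        tolerant_eq(Picky(), 1)
    except ComparisonWarning as exc:
        print("raised:", exc)
```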

I understand the utility of being able to see this case happening. I'm
not sure it's worth the warning. I'm positive it's not worth the
backwards incompatibility of the raised exception.

Michael
-- 
Michael Urman  http://www.tortall.net/mu/blog


Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-03 Thread Michael Urman
On 8/3/06, Josiah Carlson <[EMAIL PROTECTED]> wrote:
> As an alternate idea, rather than attempting to .decode('ascii') when
> strings and unicode compare, why not .decode('latin-1')?  We lose the
> unicode decoding error, but "the right thing" happens (in my opinion)
> when u'\xa1' and '\xa1' compare.

Since I use utf-8 way more than I use latin-1, -1. Since others do
not, -1 on any not obviously correct encoding other than ascii, which
gets grandfathered.

This raises an exception for a good reason. Yes it's annoying at
times. We should fix those times, not the (unbroken) exception.

Michael
-- 
Michael Urman  http://www.tortall.net/mu/blog


Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-03 Thread Michael Urman
On 8/3/06, M.-A. Lemburg <[EMAIL PROTECTED]> wrote:
> > ...but in the case of dictionaries this behaviour has changed and in
> > prior versions of python dictionaries did work as I expected them to.
> > Now they don't.
>
> Let's put it this way: Python 2.5 uncovered a bug in your
> application that has always been there. It's better to
> fix your application than arguing to cover up the bug again.

I would understand this assertion if Ralf were expecting dictionaries
to consider
{ u'm\xe1s': 1, 'm\xe1s': 1 } == { u'm\xe1s': 1 } == { 'm\xe1s': 1 }
This is clearly a mess waiting to explode.

But that's not what he said. He expects, as is the case in python2.4,
len({ u'm\xe1s': 1, 'm\xe1s': 1 }) == 2
because u'm\xe1s' clearly does not equal 'm\xe1s'. Because it raises
an exception, the dictionary shouldn't consider it equal, so there
should be the two keys which happen to be somewhat equivalent.
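
The closest analogue on Python 3 is a dictionary that mixes str and
bytes keys. A small sketch, assuming only the literal from the example
above: the two keys never compare equal, so both are kept.

```python
text_key = 'm\xe1s'      # text (str)
bytes_key = b'm\xe1s'    # the same characters as latin-1 encoded bytes

mixed = {text_key: 1, bytes_key: 1}
key_count = len(mixed)   # both keys survive as separate entries
```

Decoding the bytes key explicitly is what makes the two refer to the
same entry; nothing is converted implicitly.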

While this is in fact in the NEWS (Patch #1497053 & bug #1275608), I
think this should be raised for further discussion. Raising the
exception is good for debugging mistakes, but bad for dictionaries
holding unequal objects that happen to hash to the same value and
correctly raise exceptions on comparison.

When we thought it was just a debugging tool, it made sense to put it
straight into 2.5. Since it actually can adversely affect behavior in
only slightly edgy cases, perhaps it should go through a warning phase
(which ideally could show the exception that was thrown, thus yielding
most or all of the intended debugging advantage).

Michael
-- 
Michael Urman  http://www.tortall.net/mu/blog


Re: [Python-Dev] struct module and coercing floats to integers

2006-07-28 Thread Michael Urman
On 7/28/06, Bob Ippolito <[EMAIL PROTECTED]> wrote:
> http://python.org/sf/1530559
>
> [1] The pre-2.5 behavior should really be considered a bug, the
> documentation says "Return a string containing the values v1, v2, ...
> packed according to the given format. The arguments must match the
> values required by the format exactly." I wouldn't consider arbitrary
> floating point numbers to match the value required by an integer
> format exactly. Floats are not in general interchangeable with
> integers in Python anyway (e.g. list indexes, etc.).

While it may be a bug, it's not as hard to run into, nor as illogical
as the presentation here makes it sound. The original code[1] took a
float value between 0 and 2, and wanted to use
pack('>H', round(value * 32768))
The workaround is a trivial change
pack('>H', int(round(value * 32768)))
but the timeframe is less than ideal, as working code will suddenly
stop and produce only a mildly helpful error message. The fact that
round returns a float rather than an int, while intentional, does not
feature prominently in one's mind when the first version yielded the
expected results.
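
A minimal sketch of the workaround as a helper, written for a current
Python 3. The function name pack_fixed and the sample value are made up
for illustration; only the pack('>H', int(round(...))) call comes from
the code above.

```python
import struct


def pack_fixed(value):
    """Pack a float in [0, 2) as a big-endian unsigned 16-bit fixed-point number."""
    # The explicit int() makes the intent clear: the 'H' format wants an integer.
    scaled = int(round(value * 32768))
    return struct.pack('>H', scaled)


packed = pack_fixed(0.5)                    # 0.5 * 32768 == 16384 == 0x4000
(unpacked,) = struct.unpack('>H', packed)   # round-trips to the scaled integer
```

Keeping the conversion inside one helper means callers can keep passing
floats without each call site needing to remember it.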

I would appreciate option 2 which retains compatibility but warns that
the construct is bad. I will accept any of the options, as it's clear
that floats don't make sense. It's just unfortunate that the previous
implementation let them through in a way the new implementation does
not.

[1] http://www.sacredchao.net/quodlibet/changeset/3706

Michael
-- 
Michael Urman  http://www.tortall.net/mu/blog


Re: [Python-Dev] Simple Switch statement

2006-06-26 Thread Michael Urman
On 6/26/06, Raymond Hettinger <[EMAIL PROTECTED]> wrote:
> With the simplified proposal, this would be coded with an inverse mapping:
>
> for event in pygame.event.get():
>     switch eventmap[event.type]:
>         case 'KEYDOWN': ...
>         case 'KEYUP': ...
>         case 'QUIT': ...

Ah, here's where it gets interesting. While that definitely works on
the surface, it can run into some difficulties of scope. SDL (on which
pygame is based) allows user-defined events, also integers, but
without a predefined meaning. If pygame provides the inverse mapping,
it won't contain the user-defined events. If you construct it, you
have to choose between a local lookup and a global one, and then we
start seeing a global store for an essentially local construct, or we
risk differences when there's more than one locality for using it.
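
To make the scope problem concrete, here is a rough sketch in plain
Python. It does not import pygame; the constants, their numeric values
and the lookup() helper are all hypothetical stand-ins.

```python
# Hypothetical stand-ins for the integer constants an SDL binding exports;
# the numeric values are made up for this sketch.
KEYDOWN, KEYUP, QUIT = 2, 3, 12
USEREVENT = 24

# Inverse map as the library itself could ship it: predefined events only.
library_eventmap = {KEYDOWN: 'KEYDOWN', KEYUP: 'KEYUP', QUIT: 'QUIT'}

# An application-defined event is just another integer with no predefined name.
MY_TIMER = USEREVENT + 1


def lookup(eventmap, event_type):
    """Return the name to switch on, or None if the map does not know the event."""
    return eventmap.get(event_type)


# A locally built map knows the user event; the library-provided one does not.
local_eventmap = dict(library_eventmap)
local_eventmap[MY_TIMER] = 'MY_TIMER'
```

Two modules that each build their own local map can disagree about what
a given integer means, which is the risk described above.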

While you're right that it should be simple to ensure that the inverse
map handles at least the set the switch handles, and that it keeps
evaluation simpler, I still find the limitation ugly. As mentioned, the
early error checking is poor, and it just doesn't feel like the rest of
python.

> >I also would like to see a way to use 'is' [...]for the comparison
>
> [If] the goal is having several distinct cases that are equal but
> not identical, then that's another story.  I suggest leave the initial
> switch syntax as simple as possible and just switch on id(object).

Switching on id(object) only sounds palatable if we're not bound by
the simple switch's limitations. Having a second (if inverted) mapping
merely of identifier to object smells particularly rancid.

What if I want some cases done as identity, but some as equality?
Since I'm having no luck finding a real use case for this, perhaps I
should assume a nested switch would be adequate. Assuming static or
capture it doesn't look too bad, so I think I'll go with Guido's
hypothesis that it's a red herring.

switch id(value):
    case id(object): ...
    case id(None): ...
    else:
        switch value:
            case 1: ...
            case 'orange': ...
            else: raise ValueError

Michael
-- 
Michael Urman  http://www.tortall.net/mu/blog


Re: [Python-Dev] Simple Switch statement

2006-06-26 Thread Michael Urman
On 6/25/06, Raymond Hettinger <[EMAIL PROTECTED]> wrote:
> Those were not empty words.  I provided two non-trivial worked-out examples
> taken from sre_constants.py and opcode.py.  Nick provided a third example from
> decimal.py.  In all three cases, the proposal was applied effortlessly
> resulting in improved readability and speed.  I hope you hold other proposals
> to the same standard.

I appreciate your attempts to help us avoid overengineering this so
I'm trying to find some real world examples of a pygame event loop
that really show the benefit of supporting named constants and
expressions. I may mess up irrelevant details, but the primary case
looks something like the following (perhaps Pete Shinners could point
us to a good example loop online somewhere):

for event in pygame.event.get():
    if event.type == pygame.KEYDOWN: ...
    elif event.type == pygame.KEYUP: ...
    elif event.type == pygame.QUIT: ...

Here all the event types are integers, but are clearly meaningless as
integers instead of an enumeration. I'd be sorely disappointed with
the addition of a switch statement that couldn't support this as
something like the following:

for event in pygame.event.get():
    switch event.type:
        case pygame.KEYDOWN: ...
        case pygame.KEYUP: ...
        case pygame.QUIT: ...

I'd also generally like these to be captured the way default values of
function arguments are. The only argument against this that stuck with
me is over the fact that locals cannot be used. If literals-only has a
chance, then I would hope that every hashable non-local capturable
expression should be at least as welcome. In summary I'm +0 on switch,
but -1 on literal-only cases.
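
The capture-once behaviour can already be approximated with a dispatch
dictionary. A sketch under stated assumptions: the constants below are
hypothetical integers standing in for pygame.KEYDOWN and friends, and
the handler functions are placeholders.

```python
# Hypothetical integer constants standing in for pygame.KEYDOWN and friends.
KEYDOWN, KEYUP, QUIT = 2, 3, 12


def on_keydown():
    return 'key pressed'


def on_keyup():
    return 'key released'


def on_quit():
    return 'quitting'


# The keys are ordinary expressions, evaluated once when the dict is built,
# much as default argument values are evaluated once at function definition.
HANDLERS = {KEYDOWN: on_keydown, KEYUP: on_keyup, QUIT: on_quit}


def dispatch(event_type):
    """Call the handler registered for event_type, or report it as ignored."""
    handler = HANDLERS.get(event_type)
    if handler is None:
        return 'ignored'
    return handler()
```

Any hashable expression works as a key here, named constants included,
which is the property a literal-only switch would give up.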

I also would like to see a way to use 'is' instead of (or in addition
to) '==' for the comparison, but I don't have any use cases behind
this.

Michael
-- 
Michael Urman  http://www.tortall.net/mu/blog


Re: [Python-Dev] PEP 3102: Keyword-only arguments

2006-05-05 Thread Michael Urman
On 5/5/06, Jean-Paul Calderone <[EMAIL PROTECTED]> wrote:
> @keyword
> def foo(a, b, c=10, d=20, e=30):
>     return a, b, c, d, e

Cute, indeed. That decorator implementation is not as flexible as the
* which can go after positional parameters, but of course that is easy
to tweak. However the part I didn't bring up as a reason to prefer
explicit syntax for required-as-keyword is performance. I suspect
syntactic support will be faster than **kw.pop; I'm almost certain
it's faster than decorator gimmicks. I'm not certain at all that it's
a concern either way.

Michael
--
Michael Urman  http://www.tortall.net/mu/blog


Re: [Python-Dev] PEP 3102: Keyword-only arguments

2006-05-05 Thread Michael Urman
On 5/5/06, Terry Reedy <[EMAIL PROTECTED]> wrote:
> At present, Python allows this as a choice.

Not always - take a look from another perspective:

def make_person(**kwds):
    name = kwds.pop('name', None)
    age = kwds.pop('age', None)
    phone = kwds.pop('phone', None)
    location = kwds.pop('location', None)
    ...

This already requires the caller to use keywords, but results in
horrid introspection-based documentation. You know it takes some
keywords, but you have no clue what keywords they are. It's as bad as
calling help() on many of the C functions in the python stdlib.

So what allowing named keyword-only arguments does for us is allow us
to document this case. That is an absolute win.
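
With the bare * syntax that Python 3 adopted for keyword-only arguments,
the same function can be sketched as follows. The body is a placeholder
that returns a dict; only the parameter names come from the example
above.

```python
import inspect


def make_person(*, name=None, age=None, phone=None, location=None):
    """Build a person record; every argument must be passed by keyword."""
    return {'name': name, 'age': age, 'phone': phone, 'location': location}


# Introspection now reports the accepted keywords instead of an opaque **kwds.
signature = str(inspect.signature(make_person))
person = make_person(name='Ann', age=30)
```

Positional calls such as make_person('Ann') are rejected with a
TypeError, so the keyword requirement is enforced as well as documented.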

Michael
--
Michael Urman  http://www.tortall.net/mu/blog


Re: [Python-Dev] more pyref: continue in finally statements

2006-05-02 Thread Michael Urman
On 5/1/06, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> then what should be the meaning of "continue" here? The finally
> block *eventually* needs to re-raise the exception. When should
> that happen?

It should behave similarly to return and swallow the exception. In
your example this would result in an infinite loop. Alternately the
behavior of return should be changed, and the below code would no
longer work as it does today.

>>> def foo():
...   try: raise Exception
...   finally: return 'Done'
...
>>> foo()
'Done'

Michael
--
Michael Urman  http://www.tortall.net/mu/blog


Re: [Python-Dev] 3rd party extensions hot-fixing the stdlib (setuptools in the stdlib)

2006-04-19 Thread Michael Urman
On 4/19/06, Phillip J. Eby <[EMAIL PROTECTED]> wrote:
> People blame setuptools when pydoc doesn't work on packages in zip
> files.  Rather than refer to some theoretical argument why it's not my
> fault and I shouldn't be the one to fix it, I prefer to fix the problem.

So rather than extract the zip at install time (something purely
within setuptools' domain), you found modifying pydoc's behavior to be
a more compelling story. Are you aware that zipimport fails on 64-bit
systems in Python 2.3, or do you plan to patch over that as well?

Michael
--
Michael Urman  http://www.tortall.net/mu/blog


Re: [Python-Dev] a flattening operator?

2006-04-19 Thread Michael Urman
On 4/18/06, Greg Ewing <[EMAIL PROTECTED]> wrote:
> No, it wouldn't. There's no problem in giving an operator
> different unary and binary meanings; '-' already does
> that.

However, unlike -, there is a two-character ** operator, so while x--y
is the same as x - - y, x**y would not be the same as x * * y.
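
A quick way to check the tokenization point on a current interpreter.
The helper is_valid() is made up for this sketch; it only asks the
compiler whether an expression parses.

```python
x, y = 5, 3

double_minus = x--y        # parsed as x - (-y)
spaced_minus = x - - y     # same expression with explicit spacing
power = x**y               # a single '**' token: exponentiation


def is_valid(source):
    """Return True if source compiles as an expression."""
    try:
        compile(source, "<sketch>", "eval")
    except SyntaxError:
        return False
    return True


spaced_stars_ok = is_valid("x * * y")   # two separate '*' tokens
```

The spaced form is rejected today because there is no unary *, so giving
it a meaning would make x**y and x * * y differ.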

Michael
--
Michael Urman  http://www.tortall.net/mu/blog


Re: [Python-Dev] Py3k: Except clause syntax

2006-03-16 Thread Michael Urman
On 3/16/06, Georg Brandl <[EMAIL PROTECTED]> wrote:
> +1. Fits well with the current use of "as".

I like it, but wonder about the false parallels below. My initial
reaction is it's not worth worrying about as it's easy to learn as
part of the import or except statements, and should do the right
thing. Nobody would expect the second import to rename both items to
q, and the first except clause would be a SyntaxError.

from foo import bar as b, quux as q
except TypeError as te, IndexError as ie

from foo import bar, quux as q
except TypeError, IndexError as e
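
For reference, in the form Python 3 ended up with, catching several
exception types under one name requires a parenthesized tuple, which
sidesteps the parallel with import. A small sketch; first_item() is a
made-up helper.

```python
def first_item(container, index):
    """Return container[index], or the name of the exception it raised."""
    try:
        return container[index]
    except (TypeError, IndexError) as exc:
        # exc is bound to whichever of the two exceptions was raised.
        return type(exc).__name__


out_of_range = first_item([1, 2], 5)
not_subscriptable = first_item(None, 0)
```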

Michael
--
Michael Urman  http://www.tortall.net/mu/blog


Re: [Python-Dev] bytes.from_hex()

2006-03-01 Thread Michael Urman
[My apologies Greg; I meant to send this to the whole list. I really
need a list-reply button in GMail. ]

On 3/1/06, Greg Ewing <[EMAIL PROTECTED]> wrote:
> I don't like that, because it creates a dependency
> (conceptually, at least) between the bytes type and
> the unicode type.

I only find half of this bothersome. The unicode type has a pretty
clear dependency on the bytestring type: all I/O needs to be done in
bytes. Various APIs may mask this by accepting unicode values and
transparently doing the right thing, but from the theoretical
standpoint we pretend there is no simple serialization of unicode
values. But the reverse is not true: the bytestring type has no
dependency on unicode.

As a matter of practicality vs purity, however, I think it's a good
choice to let the bytestring type have a tie to unicode, much like the
str type implicitly does now. But you're absolutely right that adding a
.tounicode begs the question: why not a .tointeger?

To try to step back and summarize the viewpoints I've seen so far,
there are three main requirements.

  1) We want things that are conceptually text to be stored in memory
as unicode values.
  2) We want there to be some unambiguous conversion via codecs
between bytestrings and unicode values. This should help teaching,
learning, and remembering unicode.
  3) We want a way to apply and reverse compressions, encodings,
encryptions, etc., which are not only between bytestrings and unicode
values; they may be between any two arbitrary types. This allows
writing practical programs.

There seems to be little disagreement over 1, provided sufficiently
efficient implementation, or sufficient string powers in the
bytestring type. To satisfy both 2 and 3, there seem to be a couple
options. What other requirements do we have?

For (2):
  a) Restrict the existing helpers to be only bytestring.decode and
unicode.encode, possibly enforcing output types of the opposite kind,
and removing bytestring.encode
  b) Add new methods with these semantics, e.g. bytestring.udecode and
unicode.uencode

For (3):
  c) Create new helpers codecs.encode(obj, encoding, errors) and
codecs.decode(obj, encoding, errors)
  d) [Keep existing bytestring and unicode helper methods as is, and]
require use of codecs.getencoder() and codecs.getdecoder() for
arbitrary starting object types

Obviously 2a and 3d do not work together, but 2b and 3c work with
either complementary option. What other options do we have?
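
As a point of comparison, current Python 3 behaves roughly like options
2a plus 3c: the str.encode and bytes.decode methods only convert between
text and bytes, while codecs.encode() and codecs.decode() accept other
type pairs. A minimal sketch using codecs that ship with the standard
library; the sample values are arbitrary.

```python
import codecs

# Requirement 2: unambiguous text <-> bytes conversion via the methods.
text = 'm\xe1s'
raw = text.encode('utf-8')
round_tripped = raw.decode('utf-8')

# Requirement 3: transforms between other type pairs via the codecs functions.
hexed = codecs.encode(b'hello', 'hex_codec')      # bytes -> bytes
restored = codecs.decode(hexed, 'hex_codec')      # and back again
rotated = codecs.encode('hello', 'rot_13')        # str -> str
```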

Michael
--
Michael Urman  http://www.tortall.net/mu/blog


Re: [Python-Dev] C++ for CPython 3? (Re: str.count is slow)

2006-03-01 Thread Michael Urman
On 3/1/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> There are a few advantages, though, mainly:
> - increased type-safety, in particular for API that isn't type-checked
>   at all at the moment (e.g. PyArg_ParseTuple)

How would this be accomplished - by a function with a ton of optional
templated arguments? By some sort of TupleParser(tuple) >> var1 >>
var2 >> TupleParser::done?

> - more reliable reference counting, due to destructors of local
>   variables

Only true when the rules are consistent with what smart pointers or
the like do. When there's more than a single rule, this goes out the
window because you have to use the correct smart class...

> - "native" exception handling, making exceptions both less error-prone
>   and possibly more efficient.

...and exceptions make it impossible to not use smart classes. Since
there isn't a nested level of C for each function call in Python, I
don't see how exceptions in the implementation language would help
exceptions in Python. Do I misunderstand your point, or is there some
really cool trick I'm missing?

(To explain my bias, I'm against the idea of the C++ rewrite as I also
fail to see the advantages outweighing the disadvantages, especially
in light of the amount of rewriting necessary to see the "advantages"
cited so far.)

Michael
--
Michael Urman  http://www.tortall.net/mu/blog


Re: [Python-Dev] Proposal: defaultdict

2006-02-19 Thread Michael Urman
On 2/19/06, Josiah Carlson <[EMAIL PROTECTED]> wrote:
> My post probably hasn't convinced you, but much of the confusion, I
> believe, is based on Martin's original belief that 'k in dd' should
> always return true if there is a default.  One can argue that way, but
> then you end up on the circular train of thought that gets you to "you
> can't do anything useful if that is the case, .popitem() doesn't work,
> len() is undefined, ...".  Keep it simple, keep it sane.

A default factory implementation fundamentally modifies the behavior
of the mapping. There is no single answer to the question "what is the
right behavior for contains, len, popitem" as that depends on what the
code that consumes the mapping is written like, what it is attempting
to do, and what you are attempting to override it to do. Or, simply,
on why you are providing a default value. Resisting the temptation to
guess the why and just leaving the methods as is seems the best
choice; overriding __contains__ to return true is much easier than
reversing that behavior would be.

Here is an example where it could theoretically be used, if not
particularly usefully. The gettext.install() function was just updated
to take a names parameter which controls which gettext accessor
functions it adds to the builtin namespace. Its implementation looks
for "method in names" to decide. Passing a default-true dict would
allow future versions to bind every name they check, but only if
__contains__ returns True.

Even though it would make a poor base implementation, and these
effects aren't a good candidate for it, the code style that could
best leverage such a __contains__ exists.
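
The two behaviours can be compared directly with the
collections.defaultdict that eventually shipped. A sketch: the
AlwaysContains subclass is hypothetical and exists only to show how
small the override is.

```python
from collections import defaultdict


class AlwaysContains(defaultdict):
    """Hypothetical variant whose membership test always succeeds."""

    def __contains__(self, key):
        return True


plain = defaultdict(list)
missing_before = 'x' in plain     # membership test does not create the key
plain['x'].append(1)              # __getitem__ does create it
present_after = 'x' in plain

eager = AlwaysContains(list)
claims_anything = 'anything' in eager   # True, yet nothing is stored
```

Going the other way, from an always-true __contains__ back to the plain
dict behaviour, would need the subclass to reach past its parent, which
is the asymmetry argued for above.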

Michael
--
Michael Urman  http://www.tortall.net/mu/blog


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Michael Urman
On 2/17/06, Adam Olsen <[EMAIL PROTECTED]> wrote:
> if key in d:
>   dosomething(d[key])
> else:
>   dosomethingelse()
>
> try:
>   dosomething(d[key])
> except KeyError:
>   dosomethingelse()

I agree with the gut feeling that these should still do the same
thing. Could we modify d.get() instead?

>>> class ddict(dict):
...     default_value_factory = None
...     def get(self, k, d=None):
...         v = super(ddict, self).get(k, d)
...         if (v is not None or d is not None or
...                 self.default_value_factory is None):
...             return v
...         return self.setdefault(k, self.default_value_factory())
...
>>> d = ddict()
>>> d.default_value_factory = list
>>> d.get('list', [])
[]
>>> d['list']
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
KeyError: 'list'
>>> d.get('list').append(5)
>>> d['list']
[5]

There was never an exception raised by d.get so this wouldn't change
(assuming the C is implemented more carefully than the python above).
What are the problems with this other than that, like setdefault, it
only works on values with mutator methods (i.e., no counting dicts)? Is
the lack of the counting dicts that a d.__getitem__ approach supports a
deal breaker?

>>> d.default_value_factory = int
>>> d.get('count') += 1
SyntaxError: can't assign to function call

How does the above, either in dict or a subclass, compare to custom
subclasses of five lines or fewer using something like the following?
    def append(self, k, val):
        self.setdefault(k, []).append(val)
or
    def accumulate(self, k, val):
        try: self[k] += val
        except KeyError: self[k] = val
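
For readers comparing against what was eventually added, here is a short
sketch of both use cases with collections.defaultdict, whose factory is
triggered by __getitem__ rather than get. The word list is arbitrary
sample data.

```python
from collections import defaultdict

groups = defaultdict(list)   # covers the append() use case
counts = defaultdict(int)    # covers the accumulate() use case

for word in ['apple', 'avocado', 'banana']:
    groups[word[0]].append(word)
    counts[word[0]] += 1     # works because d[k] += 1 goes through __getitem__

# get() is left alone: it neither calls the factory nor inserts the key.
untouched = counts.get('z')
```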

Michael
--
Michael Urman  http://www.tortall.net/mu/blog


Re: [Python-Dev] Let's just *keep* lambda

2006-02-07 Thread Michael Urman
On 2/6/06, Brett Cannon <[EMAIL PROTECTED]> wrote:
> And I think that a deferred object would help with one of
> lambda's biggest uses and made its loss totally reasonable.

The ambiguity inherent from the perspective of a deferred object makes
a general one impractical. Both map(Deferred().attribute, seq) and
map(Deferred().method(arg), seq) look the same - how does the object
know that in the first case it should return the attribute of the first
element of seq when called, but in the second it should wait for the
next call, when it will call method(arg) on the first element of seq?

Since there's also no way to spell "lambda y: foo(x, y, z)" on a
simple deferred object, it's strictly less powerful. If the current
Python lambda's functionality is desired, there is no better pythonic
way to spell it. There are plenty of new syntactic options that help
highlight its expression nature, but are they worth the change?
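
The standard library sidesteps the ambiguity by making the caller say
which case is meant: operator.attrgetter for attributes and
operator.methodcaller for method calls. A sketch with made-up sample
data; foo() is a placeholder three-argument function.

```python
from functools import partial
from operator import attrgetter, methodcaller

# Attribute access and method calls are spelled differently, so no guessing.
reals = list(map(attrgetter('real'), [1 + 2j, 3j]))
uppers = list(map(methodcaller('upper'), ['spam', 'eggs']))


def foo(x, y, z):
    return (x, y, z)


# Filling the middle positional argument still reads most directly as a lambda.
fill_middle = lambda y: foo(1, y, 3)

# partial can do it too, but only by passing z as a keyword.
keyword_version = partial(foo, 1, z=3)
```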

Michael
--
Michael Urman  http://www.tortall.net/mu/blog/


Re: [Python-Dev] New PEP: Using ssize_t as the index type

2006-01-06 Thread Michael Urman
[I just noticed that I sent this mail to just Martin when I meant it
for the list. Sorry Martin!]

On 1/5/06, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> More precisely, the printf style of function calling, and varargs
> functions. ISO C is pretty type safe, but with varargs functions,
> you lose that completely.
>
> I still hope I can write a C parser some day that does
> ParseTuple/BuildValue checking.

I put together a non-parsing checker last month to help me feel more
secure after http://python.org/sf/1365916. It's awful code, but the
simple things are easy to change or extend. Fixing the false positives
and other misinterpretations is harder.

http://www.tortall.net/mu/static/fmtcheck.py?raw - it takes a list of
directories to os.walk for C source files.

Michael
--
Michael Urman  http://www.tortall.net/mu/blog


Re: [Python-Dev] Keep default comparisons - or add a second set?

2005-12-20 Thread Michael Urman
On 12/19/05, Josiah Carlson <[EMAIL PROTECTED]> wrote:
>
> Michael Urman <[EMAIL PROTECTED]> wrote:
> > Such as sorted(stuff, key=id)?
>
> I believe that ideally, canonical orderings would be persistent across
> sessions.

Erm, yes, I totally missed that in Jim's original preferred
requirements. And I nearly wrote another response ignoring Jim's use
case of persistence, as I'm having trouble thinking of any (others)
where order matters yet comparability doesn't.

Michael
--
Michael Urman  http://www.tortall.net/mu/blog


Re: [Python-Dev] Keep default comparisons - or add a second set?

2005-12-19 Thread Michael Urman
On 12/19/05, Greg Ewing <[EMAIL PROTECTED]> wrote:
> That would be my preference. Comparison for canonical
> ordering should be a distinct operation with its
> own spelling.

Such as sorted(stuff, key=id)?
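
A minimal sketch of what that spelling gives you, using an arbitrary
list of objects that have no natural ordering among themselves:

```python
# Objects of unrelated types; the contents are arbitrary sample data.
stuff = [object(), 3 + 4j, 'text', None, {'a': 1}]

# Ordering by identity always succeeds, whatever the element types are.
by_identity = sorted(stuff, key=id)
identities = [id(item) for item in by_identity]
```

The resulting order is consistent within one run of the interpreter but
is not expected to be the same from one run to the next.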

Michael
--
Michael Urman  http://www.tortall.net/mu/blog


Re: [Python-Dev] Definining properties - a use case for class decorators?

2005-10-17 Thread Michael Urman
On 10/16/05, Nick Coghlan <[EMAIL PROTECTED]> wrote:
> On and off, I've been looking for an elegant way to handle properties using
> decorators.

Why use decorators when a metaclass will already do the trick, and
save you a line? This doesn't necessarily get around Antoine's
complaint that it looks like self refers to the wrong type, but I'm
not convinced anyone would be confused.

class MetaProperty(type):
    def __new__(cls, name, bases, dct):
        if bases[0] is object: # allow us to create class Property
            return type.__new__(cls, name, bases, dct)
        return property(dct.get('get'), dct.get('set'),
                        dct.get('delete'), dct.get('__doc__'))

    def __init__(cls, name, bases, dct):
        if bases[0] is object:
            return type.__init__(cls, name, bases, dct)

class Property(object):
    __metaclass__ = MetaProperty


class Test(object):
    class foo(Property):
        """The foo property"""
        def get(self): return self._foo
        def set(self, val): self._foo = val
        def delete(self): del self._foo

test = Test()
test.foo = 'Yay!'
assert test._foo == 'Yay!'
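
The same trick ported to Python 3 syntax, as a sketch rather than a
recommendation. Two things are assumed to differ from the version above:
the metaclass is given with the metaclass= keyword, and the base class
is detected by its empty bases tuple, since class Property no longer
lists object explicitly.

```python
class MetaProperty(type):
    def __new__(mcls, name, bases, dct):
        if not bases:
            # Creating the Property base class itself: build a normal class.
            return super().__new__(mcls, name, bases, dct)
        # Any subclass body is turned into a property object instead of a class.
        return property(dct.get('get'), dct.get('set'),
                        dct.get('delete'), dct.get('__doc__'))


class Property(metaclass=MetaProperty):
    pass


class Test:
    class foo(Property):
        """The foo property"""
        def get(self): return self._foo
        def set(self, val): self._foo = val
        def delete(self): del self._foo


test = Test()
test.foo = 'Yay!'
stored = test._foo
```

Because __new__ returns something that is not an instance of the
metaclass, no __init__ override is needed in this version.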


Re: [Python-Dev] Adding a conditional expression in Py3.0

2005-09-23 Thread Michael Urman
On 9/23/05, Nick Coghlan <[EMAIL PROTECTED]> wrote:
> But I think there's a difference in kind here - to *fix* Raymond's example
> required a fundamental change to the structure of the line, none of which
> looked as clean as the original. There is no way to get the and/or construct
> to gracefully handle the case where the desired result in the 'true' case
> might itself be false: either you change to using an if statement, or you use
> a workaround like the ugly singleton-list approach.
>
> That is, the following is fundamentally broken for pure imaginary numbers:
>return isinstance(z, ComplexType) and z.real or z

It's hard to say whether this fixes Raymond's example when the goal
wasn't clearly stated, but I find a non-ternary option

lambda z: complex(z).real

more readable than any of the variants proposed so far. The obvious
downsides are that it can raise a ValueError, and turns integers into
floats. But if the input list is all numeric, it has clean results.
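
A short sketch putting the two side by side. The function names are made
up for illustration, and the sample values include a pure imaginary
number, the case that trips up the and/or idiom.

```python
def real_part_and_or(z):
    # The and/or idiom: falls through to "or z" whenever z.real is 0.0.
    return isinstance(z, complex) and z.real or z


def real_part(z):
    # Non-ternary option: convert first, then take the real part.
    return complex(z).real


values = [1, 2.5, 3 + 4j, 1j]
converted = [real_part(v) for v in values]
broken_case = real_part_and_or(1j)   # returns the whole number, not its real part
```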

--
Michael Urman