[Python-Dev] DRAFT: python-dev summary for 2006-09-01 to 2006-09-15

2006-10-27 Thread Steven Bethard
Here's the summary for the first half of September.  As always,
comments and corrections are greatly appreciated!


=============
Announcements
=============


----------------------------
QOTF: Quote of the Fortnight
----------------------------


Through a cross-posting slip-up, Jean-Paul Calderone managed to
provide us with some inspiring thoughts on mailing-list archives:

One could just as easily ask why no one bothers to read mailing list
archives to see if their question has been answered before.

No one will ever know, it is just one of the mysteries of the universe.

Contributing thread:

- `[Twisted-Python] Newbie question
  <http://mail.python.org/pipermail/python-dev/2006-September/068682.html>`__

-------------------------
Monthly Arlington sprints
-------------------------

Jeffrey Elkner has arranged for monthly Arlington Python sprints. See
the `Arlington sprint wiki`_ for more details.

.. _Arlington sprint wiki: http://wiki.python.org/moin/ArlingtonSprint

Contributing thread:

- `Arlington sprints to occur monthly
  <http://mail.python.org/pipermail/python-dev/2006-September/068688.html>`__

=========
Summaries
=========

-----------------------------------------
Signals, threads and blocking C functions
-----------------------------------------

Gustavo Carneiro explained a problem that pygtk was running into.
Their main loop function, ``gtk_main()``, blocks forever. If there are
threads in the program, they cannot receive signals because Python
catches the signal and calls ``Py_AddPendingCall()``, relying on the
main thread to call ``Py_MakePendingCalls()``.  Since with pygtk, the
main thread is blocked calling a C function, it has no way other than
polling to decide when ``Py_MakePendingCalls()`` needs to be called.
Gustavo was hoping for some sort of API so that his blocking thread
could get notified when ``Py_AddPendingCall()`` had been called.

There was a long discussion about the feasibility of this and other
solutions to his problem. One of the main problems is that almost
nothing can safely be done from a signal handler context, so some
people felt that having Python invoke arbitrary third-party code was a
bad idea. Gustavo was reasonably confident that he could write to a
pipe within that context, which was all he needed to do to solve his
problem, but Nick Maclaren explained in detail some of the problems,
e.g. writing proper synchronization primitives that are signal-handler
safe.

Jan Kanis suggested that threads in a pygtk program should
occasionally check the signal handler flags and call PyGTK's callback
to wake up the main thread. But Gustavo explained that things like the
GnomeVFS library have their own thread pools and know nothing about
Python so can't make such a callback.

Adam Olsen suggested that Python could create a single non-blocking pipe for all
signals. When a signal was handled, the signal number would be written
to that pipe as a single byte. Third-party libraries, like pygtk,
could poll the appropriate file descriptor, waking up and handing
control back to Python when a signal was received. There were some
disadvantages to this approach, e.g. if there is a large burst of
signals, some of them would be lost, but folks seemed to think that
these kinds of things would not cause many real-world problems.
Gustavo and Adam then worked out the code in a little more detail.
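
The single-pipe idea can be sketched in a few lines. This is only an
illustration under stated assumptions: it is not the code from the
patch, the function names are invented for the example, the signal
handler is simulated by an ordinary function call, and a POSIX
platform is assumed.

```python
import os
import select


def make_signal_pipe():
    """Create one pipe shared by all signals, non-blocking at both ends."""
    read_fd, write_fd = os.pipe()
    os.set_blocking(read_fd, False)
    os.set_blocking(write_fd, False)
    return read_fd, write_fd


def note_signal(write_fd, signum):
    """Stand-in for the handler: record signum as a single byte.

    Returns False when the pipe is full and the notification is dropped,
    which is the "large burst of signals" case described above.
    """
    try:
        os.write(write_fd, bytes([signum]))
        return True
    except BlockingIOError:
        return False


def pending_signals(read_fd, timeout=0.0):
    """Poll the read end the way an external main loop might."""
    ready, _, _ = select.select([read_fd], [], [], timeout)
    if not ready:
        return []
    return list(os.read(read_fd, 4096))
```

A library such as pygtk would add the read end to its own poll set and
hand control back to Python whenever it becomes readable.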

The `Py_signal_pipe patch`_ was posted to SourceForge.

.. _Py_signal_pipe patch: http://bugs.python.org/1564547

Contributing thread:

- `Signals, threads, blocking C functions
  <http://mail.python.org/pipermail/python-dev/2006-September/068569.html>`__

------------------------
API for str.rpartition()
------------------------


Raymond Hettinger pointed out that in cases where the separator was
not found, ``str.rpartition()`` was putting the remainder of the
string in the wrong spot, e.g. ``str.rpartition()`` worked like::

    'axbxc'.rpartition('x') == ('axb', 'x', 'c')
    'axb'.rpartition('x') == ('a', 'x', 'b')
    'a'.rpartition('x') == ('a', '', '')  # should be ('', '', 'a')

Thus code that used ``str.rpartition()`` in a loop or recursively
would likely never terminate. Raymond checked in a fix for this,
spawning an enormous discussion about how the three parts that
``str.rpartition()`` returns should be named.  There was widespread
disagreement on which side was the "head" and which side was the
"tail", and the only unambiguous naming seemed to be "left, sep, right".
Raymond and others were not as happy with this version because it was
no longer suggestive of the use cases, but it looked like this might
be the best compromise.
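
To see why the placement of the remainder matters, here is a small
sketch of the kind of loop described above. The helper name is invented
for illustration; the loop only terminates because, with the fixed
behavior, the left part becomes empty once the separator is no longer
found.

```python
def components_from_right(text, sep):
    """Collect the pieces of text, rightmost first."""
    pieces = []
    left = text
    while left:
        # With the fix, a missing separator yields ('', '', left),
        # so `left` becomes empty and the loop ends.
        left, _, right = left.rpartition(sep)
        pieces.append(right)
    return pieces
```

With the earlier behavior, ``'a'.rpartition('x')`` would have returned
``'a'`` as the left part again, and the loop would never make progress.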

Contributing threads:

- `Problem withthe API for str.rpartition()
  <http://mail.python.org/pipermail/python-dev/2006-September/068565.html>`__
- `Fwd: Problem withthe API for str.rpartition()
  <http://mail.python.org/pipermail/python-dev/2006-September/068615.html>`__

---------------
Unicode Imports
---------------

Kristján V. Jónsson submitted a `unicode import patch`_ that would

Re: [Python-Dev] Modulefinder

2006-10-27 Thread Thomas Heller
> On 10/13/06, Thomas Heller [EMAIL PROTECTED] wrote:
>> I have patched Lib/modulefinder.py to work with absolute and relative
>> imports.
>> It also is faster now, and has basic unittests in
>> Lib/test/test_modulefinder.py.
>>
>> The work was done in a theller_modulefinder SVN branch.
>> If nobody objects, I will merge this into trunk, and possibly also into
>> release25-maint, when I have time.
>

Guido van Rossum schrieb:
> Could you also prepare a patch for the p3yk branch? It's broken there too...
>

I'm currently looking into this now.  IIUC, 'import foo' is an absolute
import now - is this the only change to the import machinery?

Thomas

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] DRAFT: python-dev summary for 2006-09-01 to 2006-09-15

2006-10-27 Thread Terry Reedy
> Adam Olsen that Python could create a single non-blocking pipe for a

/that/suggested that/






Re: [Python-Dev] Modulefinder

2006-10-27 Thread Thomas Heller
> On 10/13/06, Thomas Heller [EMAIL PROTECTED] wrote:
>> I have patched Lib/modulefinder.py to work with absolute and relative
>> imports.
>> It also is faster now, and has basic unittests in
>> Lib/test/test_modulefinder.py.
>>
>> The work was done in a theller_modulefinder SVN branch.
>> If nobody objects, I will merge this into trunk, and possibly also into
>> release25-maint, when I have time.
>
Guido van Rossum schrieb:
> Could you also prepare a patch for the p3yk branch? It's broken there too...
>

Patch uploaded, and assigned to you.
http://www.python.org/sf/1585966

Oh, and BTW: py3k SVN doesn't compile under windows.

Thomas



[Python-Dev] PEP: Adding data-type objects to Python

2006-10-27 Thread Travis E. Oliphant


PEP: unassigned
Title: Adding data-type objects to the standard library
Version: $Revision: $
Last-Modified: $Date:  $
Author: Travis Oliphant [EMAIL PROTECTED]
Status: Draft
Type: Standards Track
Created: 05-Sep-2006
Python-Version: 2.6

Abstract

This PEP proposes adapting the data-type objects from NumPy for
inclusion in standard Python, to provide a consistent and standard
way to discuss the format of binary data. 

Rationale

There are many situations crossing multiple areas where an
interpretation is needed of binary data in terms of fundamental
data-types such as integers, floating-point, and complex
floating-point values.  Having a common object that carries
information about binary data would be beneficial to many
people. The creation of data-type objects in NumPy to carry the
load of describing what each element of the array contains
represents an evolution of a solution that began with the
PyArray_Descr structure in Python's own array object.  These
data-type objects can represent arbitrary byte data.  Currently
such information is usually constructed using strings and
character codes which is unwieldy when a data-type consists of
nested structures.

Proposal

Add a PyDatatypeObject in Python (adapted from NumPy's dtype
object which evolved from the PyArray_Descr structure in Python's
array module) that holds information about a data-type.  This object
will allow packages to exchange information about binary data in
a uniform way (see the extended buffer protocol PEP for an application
to exchanging information about array data). 

Specification

The datatype is an object that specifies how a certain block of
memory should be interpreted as a basic data-type. In addition to
being able to describe basic data-types, the data-type object can
describe a data-type that is itself an array of other data-types
as well as a data-type that contains arbitrary fields (structure
members) which are located at specific offsets. In its most basic
form, however, a data-type is of a particular kind (bit, bool,
int, uint, float, complex, object, string, unicode, void) and size.

Datatype objects can be created using either a type-object, a
string, a tuple, a list, or a dictionary according to the following
constructors:

Type-object: 

  For a select set of type-objects, a data-type object describing that
  basic type can be created:

  Examples: 

  >>> datatype(float)
  datatype('float64')
  
  >>> datatype(int)
  datatype('int32')  # on 32-bit platform (64 if c-long is 64-bits)

Tuple-object
   
  A tuple of length 2 can be used to specify a data-type that is
  an array of another kind of basic data-type (this array always
  describes a C-contiguous array).

  Examples: 

  >>> datatype((int, 5))
  datatype(('int32', (5,)))
  # describes a 5*4=20-byte block of memory laid out as 
  #  a[0], a[1], a[2], a[3], a[4]

  >>> datatype((float, (3,2)))
  datatype(('float64', (3,2)))
  # describes a 3*2*8=48 byte block of memory that should be
  # interpreted as 6 doubles laid out as arr[0,0], arr[0,1],
  # ... arr[2,0], arr[2,1]
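
  The byte counts and element order in these examples can be checked
  with the standard library alone. This is only a sketch: it uses the
  existing struct module as a stand-in, not the proposed datatype
  object, and the variable names are invented for illustration.

```python
import struct

# '=' selects standard sizes with no padding: 'i' is 4 bytes, 'd' is 8.
int32_block = struct.calcsize("=5i")      # five 32-bit integers
float64_block = struct.calcsize("=6d")    # a 3x2 block of doubles

# C-contiguous (row-major) order for shape (3, 2): the last index
# varies fastest, so the elements run from (0, 0) to (2, 1).
rows, cols = 3, 2
layout = [(i, j) for i in range(rows) for j in range(cols)]
```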


String-object:
 
  The basic format is '%s%s%s%d' % (endian, shape, kind, itemsize) 

 kind : one of the basic array kinds given below. 
 
 itemsize : the number of bytes (or bits for 't' kind) for 
 this data-type.  

 endian   : either '', '=' (native), '|' (doesn't matter),
 '>' (big-endian) or '<' (little-endian).

 shape: either '', or a shape-tuple describing a data-type that
 is an array of the given shape.

  A string can also be a comma-separated sequence of basic
  formats. The result will be a data-type with default field
  names: 'f0', 'f1', ..., 'fn'.

  Examples: 

  >>> datatype('u4')
  datatype('uint32')

  >>> datatype('f4')
  datatype('float32')

  >>> datatype('(3,2)f4')
  datatype(('float32', (3,2)))

  >>> datatype('(5,)i4, (3,2)f4, S5')
  datatype([('f0', 'i4', (5,)), ('f1', 'f4', (3, 2)), ('f2', '|S5')])
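
  As a rough illustration of the string form, the following sketch
  splits a comma-separated format string into fields with the default
  names 'f0', 'f1', and so on. It is an assumption-laden toy, not the
  NumPy parser or a proposed implementation: the regular expression,
  the function name, and the returned tuple layout are all invented
  here, and kinds are assumed to be single letters.

```python
import re

# endian, optional shape, kind letter, item size: the order given above.
_BASIC = re.compile(r"([<>=|]?)(\([\d,\s]*\))?([A-Za-z])(\d+)")


def parse_format(spec):
    """Split a format string into (name, basic_format, shape) fields."""
    fields = []
    for index, match in enumerate(_BASIC.finditer(spec)):
        endian, shape_text, kind, itemsize = match.groups()
        shape = ()
        if shape_text:
            # "(3,2)" -> (3, 2); "(5,)" -> (5,)
            shape = tuple(
                int(n) for n in shape_text.strip("()").split(",") if n.strip()
            )
        fields.append(("f%d" % index, endian + kind + itemsize, shape))
    return fields
```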


List-object:

  A list should be a list of tuples where each tuple describes a
  field. Each tuple should contain (name, datatype{, shape}) or
  ((meta-info, name), datatype{, shape}) in order to specify the
  data-type. 

  This list must fully specify the data-type (no memory holes). If
  you would like to return a data-type with memory holes where the
  compiler would place them, then pass the keyword align=1 to this
  compiler would place them, then pass the keyword align=1 to this
  construction.  This will result in un-named fields of Void kind of
  the correct size interspersed where needed.
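
  The "memory holes" in question are the padding bytes a C compiler
  inserts to align fields. A minimal sketch of the effect, using the
  existing struct module rather than the proposed align keyword (the
  variable names are illustrative, and the native result depends on
  the platform):

```python
import struct

fields = "cd"   # a 1-byte char followed by an 8-byte double

packed_size = struct.calcsize("=" + fields)   # standard sizes, no padding
native_size = struct.calcsize("@" + fields)   # native sizes and alignment
hole_bytes = native_size - packed_size        # bytes an unnamed void field would cover
```

  On a typical platform the double is aligned to 8 bytes, so the native
  layout is larger than the packed one and the difference is the hole.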

  Examples: 

  datatype([( ([1,2],'coords'), 'f4', (3,6)), ('address', 'S30')])

  A data-type that could represent the 

[Python-Dev] DRAFT: python-dev summary for 2006-09-16 to 2006-09-30

2006-10-27 Thread Steven Bethard
Thanks to all of those who have already given me feedback on the last
summary.  Here's the next one (for the second half of September).  I
found the "OS X universal binaries" and "Finer-grained locking than
the GIL" discussions particularly hard to follow, so I'd especially
appreciate corrections on those.

Thanks!

=========
Summaries
=========

---------------
Import features
---------------

Fabio Zadrozny ran into the `previously reported relative import
issues`_ where a ``from . import xxx`` always fails from a top-level
module. This is because relative imports rely on the ``__name__`` of a
module, so when it is just ``__main__``, they can't handle it
properly.
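
A simplified model of the name arithmetic shows why ``__main__`` is a
problem. This is a sketch, not the interpreter's import code: the
function name is invented, and the module is assumed to be an ordinary
module rather than a package ``__init__``.

```python
def resolve_relative(module_name, level, target):
    """Resolve `from <level dots> import target` against module_name."""
    if level < 1:
        raise ValueError("level must be at least 1 for a relative import")
    # Drop `level` trailing components of the dotted name to find the package.
    base = module_name.split(".")[:-level]
    if not base:
        # '__main__' has no dots, so nothing is left to import relative to.
        raise ValueError("relative import in non-package: %r" % module_name)
    return ".".join(base + [target])
```

For a module named ``pkg.mod`` this gives ``pkg.xxx``, while a script
run as ``__main__`` has no package part left and the lookup fails.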

On the subject of imports, Guido said that one of the missing import
features was to be able to say *this* package lives *here*. Paul
Moore whipped up a Python API to an import hook that could do this,
but indicated that a full mechanism would need to pay more attention
to the environment (e.g. PYTHONPATH and .pth files).

There was also some discussion about trying to have a sort of
per-module ``sys.path`` so that you could have multiple versions of
the same module present, with different modules importing different
versions. Phillip J. Eby suggested that this was probably not a very
common need, and that implementing it would be quite difficult with
things like C extensions only being able to be loaded once.

In general, people seemed interested in a pure-Python implementation
of the import mechanism so that they could play with some of these
approaches. It looked like Brett Cannon would probably be working on
that.

.. _previously reported relative import issues:
http://www.python.org/dev/summary/2006-06-16_2006-06-30/#relative-imports-and-pep-338-executing-modules-as-scripts

Contributing thread:

- `New relative import issue
  <http://mail.python.org/pipermail/python-dev/2006-September/068806.html>`__

----------------------------
Python library documentation
----------------------------


A less-trolly-than-usual post from Xah Lee started a discussion about
the Python documentation.  Greg Ewing and others suggested following
the documentation style of the Inside Macintosh series: first an
"About this module" narrative explaining the concepts and how they fit
together, followed by the extensive API reference. Most people agreed
that simply extracting the documentation from the docstrings was a bad
idea -- it lacks the high-level overview and gives equal importance to
all functions, regardless of their use.

Contributing thread:

- `Python Doc problems
  <http://mail.python.org/pipermail/python-dev/2006-September/069023.html>`__

-----------------------
OS X universal binaries
-----------------------

Jack Howarth asked about creating universal binaries for OS X that
would support 32-bit or 64-bit on both PPC and x86. Ronald Oussoren
pointed out that the 32-bit part of this was already supported, but
indicated that adding 64-bit support simultaneously might be more
difficult. Ronald seemed to think that modifications to pyconfig.h.in
might solve the problem, though he was worried that this might cause
distutils to detect some architecture features incorrectly.

Contributing thread:

- `python, lipo and the future?
  <http://mail.python.org/pipermail/python-dev/2006-September/068800.html>`__

----------------------------------
Finer-grained locking than the GIL
----------------------------------

Martin Devera was looking into replacing the global interpreter lock
(GIL) with finer-grained locking, tuned to minimize locking by
assuming that most objects were used only by a single thread. For
objects that were shared across multiple threads, this approach would
allow non-blocking reads, but require all threads to come home
before modifications could be made. Phillip J. Eby pointed out that
most object accesses in Python are actually modifications too, due to
reference counting, so it looked like Martin's proposal wouldn't work
well with the current refcounting implementation of Python. After
Martin v. Löwis found a bug in the locking algorithm, Martin Devera
decided to take his idea back to the drawing board.
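
Phillip's point about reference counting can be observed directly. The
sketch below assumes a standard CPython build, where
``sys.getrefcount()`` reports the count; the variable names are
illustrative only.

```python
import sys

shared = object()
before = sys.getrefcount(shared)
reader_view = shared          # a "read-only" use: just binding another name
after = sys.getrefcount(shared)

# Merely referring to the object wrote to its reference count field.
count_changed = after - before
```

So even a thread that only reads a shared object ends up modifying it,
which is what undermines a scheme built on non-blocking reads.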

Contributing thread:

- `deja-vu .. python locking
  <http://mail.python.org/pipermail/python-dev/2006-September/068828.html>`__

---------------------------
OS X and ssize_t formatting
---------------------------

The buildbots spotted an OS X error in the itertools module. After
Jack Diederich fixed a bug where ``size_t`` had been used instead of
``ssize_t``, Neal Norwitz noticed some problems with ``%zd`` on OS X.
Despite documentation to the contrary in both the man page and the C99
Standard, using that specifier on OS X treats a negative number as an
unsigned number. Ronald Oussoren and others reported the bug to Apple.

Contributing thread:

- `test_itertools fails for trunk on x86 OS X machine
  <http://mail.python.org/pipermail/python-dev/2006-September/068898.html>`__

-------------------
itertools.flatten()
-------------------

Michael Foord asked 

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-27 Thread Martin v. Löwis
Travis E. Oliphant schrieb:
> The datatype is an object that specifies how a certain block of
> memory should be interpreted as a basic data-type.
>
> >>> datatype(float)
> datatype('float64')

I can't speak on the specific merits of this proposal, or whether this
kind of functionality is desirable. However, I'm -1 on the addition of
a builtin for this functionality (the PEP doesn't actually say that
there is another builtin, but the examples suggest so). Instead, putting
it into the sys, array, struct, or ctypes modules might be more
appropriate, as might be the introduction of another module.

Regards,
Martin


Re: [Python-Dev] [Python-checkins] r52482 - in python/branches/release25-maint: Lib/urllib.py Lib/urllib2.py Misc/NEWS

2006-10-27 Thread Anthony Baxter
On Saturday 28 October 2006 03:13, andrew.kuchling wrote:
> 2.4 backport candidate, probably.

FWIW, I'm not planning on doing any more "collect all the bugfixes" releases 
of 2.4. It's now in the same category as 2.3 - that is, only really serious 
bugs (in particular, security related bugs) will get a new release, and then 
only with the serious bugfixes applied. 

One active maintenance branch is quite enough to deal with, IMHO.


-- 
Anthony Baxter [EMAIL PROTECTED]
It's never too late to have a happy childhood.


Re: [Python-Dev] DRAFT: python-dev summary for 2006-09-16 to 2006-09-30

2006-10-27 Thread Martin v. Löwis
Steven Bethard schrieb:
> Jack Howarth asked about creating universal binaries for OS X that
> would support 32-bit or 64-bit on both PPC and x86. Ronald Oussoren
> pointed out that the 32-bit part of this was already supported, but
> indicated that adding 64-bit support simultaneously might be more
> difficult. Ronald seemed to think that modifications to pyconfig.h.in
> might solve the problem, though he was worried that this might cause
> distutils to detect some architecture features incorrectly.

Ronald can surely speak for himself, but I think the problem is slightly
different. There were different strategies discussed for changing
pyconfig.h (with an include, or with #ifdefs), and in all cases,
distutils would fail to detect the architecture properly. That's not
really a problem of pyconfig.h, but of the way that distutils uses
to detect bitsizes - which inherently cannot work for universal
binaries (i.e. you should look at the running interpreter, not
at pyconfig.h).
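
One way to "look at the running interpreter" is sketched below. It is
only an illustration of the idea using standard-library calls; it is
not what distutils does, and the variable names are invented.

```python
import struct
import sys

# Size of a C pointer in the interpreter that is actually running.
pointer_bits = struct.calcsize("P") * 8

# An equivalent check based on the largest native container index.
is_64bit = sys.maxsize > 2**32
```

Both values describe the architecture slice of a universal binary that
was selected at run time, which a static header cannot know.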

Regards,
Martin


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-27 Thread Greg Ewing
Travis E. Oliphant wrote:
> PEP: unassigned
> Title: Adding data-type objects to the standard library

Not sure about having 3 different ways to specify
the structure -- it smacks of Too Many Ways To Do
It to me.

Also, what if I want to refer to fields by name
but don't want to have to work out all the offsets
(which is tedious, error-prone and hostile to
modification)?

--
Greg


Re: [Python-Dev] Typo.pl scan of Python 2.5 source code

2006-10-27 Thread Johnny Lee


I grabbed the latest Python2.5 code via subversion and ran my typo script on it.

Weeding out the obvious false positives and Neal's comments leaves about 129 typos.

See http://www.geocities.com/typopl/typoscan.htm

Should I enter the typos as bugs in the Python bug db?
J



> Date: Fri, 22 Sep 2006 21:51:38 -0700
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Subject: Re: [Python-Dev] Typo.pl scan of Python 2.5 source code
> CC: python-dev@python.org
>
> On 9/22/06, Johnny Lee [EMAIL PROTECTED] wrote:
> >
> > Hello,
> > My name is Johnny Lee. I have developed a *ahem* perl script which scans
> > C/C++ source files for typos.
>
> Hi Johnny.
>
> Thanks for running your script, even if it is written in Perl and ran
> on Windows. :-)
>
> > The Python 2.5 typos can be classified into 7 types.
> >
> > 2) realloc overwrite src if NULL, i.e. p = realloc(p, new_size);
> > If realloc() fails, it will return NULL. If you assign the return value to
> > the same variable you passed into realloc,
> > then you've overwritten the variable and possibly leaked the memory that the
> > variable pointed to.
>
> A bunch of these warnings were accurate and a bunch were not. There
> were 2 reasons for the false positives. 1) The pointer was aliased,
> thus not lost, 2) On failure, we exited (Parser/*.c)
>
> > 4) if ((X!=0) || (X!=1))
>
> These 2 cases occurred in binascii. I have no idea if the warning is
> right or the code is.
>
> > 6) XX;;
> > Just being anal here. Two semicolons in a row. Second one is extraneous.
>
> I already checked in a fix for these on HEAD. Hard for even me to
> screw up those fixes. :-)
>
> > 7) extraneous test for non-NULL ptr
> > Several memory calls that free memory accept NULL ptrs.
> > So testing for NULL before calling them is redundant and wastes code space.
> > Now some codepaths may be time-critical, but probably not all, and smaller
> > code usually helps.
>
> I ignored these as I'm not certain all the platforms we run on accept
> free(NULL).
>
> Below is my categorization of the warnings except #7. Hopefully
> someone will fix all the real problems in the first batch.
>
> Thanks again!
>
> n
> --
>
> # Problems
> Objects\fileobject.c (338): realloc overwrite src if NULL; 17: file->f_setbuf=(char*)PyMem_Realloc(file->f_setbuf,bufsize)
> Objects\fileobject.c (342): using PyMem_Realloc result w/no check 30: setvbuf(file->f_fp, file->f_setbuf, type, bufsize); [file->f_setbuf]
> Objects\listobject.c (2619): using PyMem_MALLOC result w/no check 30: garbage[i] = selfitems[cur]; [garbage]
> Parser\myreadline.c (144): realloc overwrite src if NULL; 17: p=(char*)PyMem_REALLOC(p,n+incr)
> Modules\_csv.c (564): realloc overwrite src if NULL; 17: self->field=PyMem_Realloc(self->field,self->field_size)
> Modules\_localemodule.c (366): realloc overwrite src if NULL; 17: buf=PyMem_Realloc(buf,n2)
> Modules\_randommodule.c (290): realloc overwrite src if NULL; 17: key=(unsigned#long*)PyMem_Realloc(key,bigger*sizeof(*key))
> Modules\arraymodule.c (1675): realloc overwrite src if NULL; 17: self->ob_item=(char*)PyMem_REALLOC(self->ob_item,itemsize*self->ob_size)
> Modules\cPickle.c (536): realloc overwrite src if NULL; 17: self->buf=(char*)realloc(self->buf,n)
> Modules\cPickle.c (592): realloc overwrite src if NULL; 17: self->buf=(char*)realloc(self->buf,bigger)
> Modules\cPickle.c (4369): realloc overwrite src if NULL; 17: self->marks=(int*)realloc(self->marks,s*sizeof(int))
> Modules\cStringIO.c (344): realloc overwrite src if NULL; 17: self->buf=(char*)realloc(self->buf,self->buf_size)
> Modules\cStringIO.c (380): realloc overwrite src if NULL; 17: oself->buf=(char*)realloc(oself->buf,oself->buf_size)
> Modules\_ctypes\_ctypes.c (2209): using PyMem_Malloc result w/no check 30: memset(obj->b_ptr, 0, dict->size); [obj->b_ptr]
> Modules\_ctypes\callproc.c (1472): using PyMem_Malloc result w/no check 30: strcpy(conversion_mode_encoding, coding); [conversion_mode_encoding]
> Modules\_ctypes\callproc.c (1478): using PyMem_Malloc result w/no check 30: strcpy(conversion_mode_errors, mode); [conversion_mode_errors]
> Modules\_ctypes\stgdict.c (362): using PyMem_Malloc result w/no check 30: memset(stgdict->ffi_type_pointer.elements, 0, [stgdict->ffi_type_pointer.elements]
> Modules\_ctypes\stgdict.c (376): using PyMem_Malloc result w/no check 30: memset(stgdict->ffi_type_pointer.elements, 0, [stgdict->ffi_type_pointer.elements]
>
> # No idea if the code or tool is right.
> Modules\binascii.c (1161)
> Modules\binascii.c (1231)
>
> # Platform specific files. I didn't review and won't fix without testing.
> Python\thread_lwp.h (107): using malloc result w/no check 30: lock->lock_locked = 0; [lock]
> Python\thread_os2.h (141): using malloc result w/no check 30: (long)sem)); [sem]
> Python\thread_os2.h (155): using malloc result w/no check 30: lock->is_set = 0; [lock]
> Python\thread_pth.h (133): using malloc result w/no check 30: memset((void *)lock, '\0', sizeof(pth_lock)); [lock]
> Python\thread_solaris.h (48): using malloc result w/no check 30: funcarg->func = func; [funcarg]
> Python\thread_solaris.h (133): using malloc result w/no check 30: if(mutex_init(lock,USYNC_THREAD,0)) [lock]
>
> # Who cares about these 

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-27 Thread Nick Coghlan
Greg Ewing wrote:
> Travis E. Oliphant wrote:
>> PEP: unassigned
>> Title: Adding data-type objects to the standard library

I've used 'datatype' below for consistency, but can we please call them 
something other than data types? Data layouts? Data formats? Binary layouts? 
Binary formats? 'type' is already a meaningful term in Python, and having to 
check whether 'data type' meant a type definition or a data format definition 
could get annoying.

> Not sure about having 3 different ways to specify
> the structure -- it smacks of Too Many Ways To Do
> It to me.

There are actually 5 ways, but the different mechanisms all have different use 
cases (and I'm going to suggest getting rid of the dictionary form).

Type-object:
   Simple conversion of the builtin types (would be good for instances to be 
able to hook this as with other type conversion functions).

2-tuple:
   Makes it easy to specify a contiguous C-style array of a given data type. 
However, rather than doing type-based dispatch here, I would prefer to see 
this version handled via an optional 'shape' argument, so that all sequences 
can be handled consistently (more on that below).
   >>> datatype(int, 5) # short for datatype([(int, 5)])
   datatype('int32', (5,))
   # describes a 5*4=20-byte block of memory laid out as
   #  a[0], a[1], a[2], a[3], a[4]

String-object:
   The basic formatting definition (I'd be interested in the differences 
between this definition scheme and the struct definition scheme - one definite 
goal for an implementation would be an update to the struct module to accept 
datatype objects, or at least a conversion mechanism for creating a struct 
layout description from a datatype definition)

List object:
   As for string object, but permits naming of each of the fields. I don't 
like treating tuples differently from lists, so I'd prefer for this handling 
to be applied to all iterables that don't meet one of the other 
special cases (direct conversion, string, dictionary).

   I'd also prefer the meta-information to come *after* the name, and for the 
name to be completely optional (leaving the corresponding field unnamed). So 
the possible sequence entries would be:
 datatype
 (name, datatype)
 (name, datatype, shape)
   where name must be a string or 2-tuple, datatype must be acceptable as a 
constructor argument, and the shape must be an integer or tuple.
For example:
   datatype([(('coords', [1,2]), 'f4'),
             ('address', 'S30'),
            ])

   datatype([('simple', 'i4'),
             ('nested', [('name', 'S30'),
                         ('addr', 'S45'),
                         ('amount', 'i4')
                        ]
             ),
            ])

   >>> datatype(['V8', ('var2', 'i1'), 'V3', ('var3', 'f8')])
   datatype([('', '|V8'), ('var2', '|i1'), ('', '|V3'), ('var3', 'f8')])

Dictionary object:

   This allows a tailored object where the information you have (e.g. from a 
file format specification) provides offsets and data types. Instead of having 
to define them manually the constructor will insert the necessary padding 
fields for you.

   Given an existing datatype object, you can create a new datatype which only 
names a few of the original fields by doing:
 from operator import itemgetter
 wanted = 'field1', 'field10', 'field15'
 new_names = 'attr1', 'attr2', 'attr3'
 field_defs = itemgetter(*wanted)(orig_fmt.fields)
 new_fmt = datatype(dict(zip(new_names, field_defs)))


> Also, what if I want to refer to fields by name
> but don't want to have to work out all the offsets
> (which is tedious, error-prone and hostile to
> modification)?

Use the list definition form. In the current PEP, you would need to define 
names for all of the uninteresting fields. With the changes I've suggested 
above, you wouldn't even have to name the fields you don't care about - just 
describe them.

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-27 Thread Nick Coghlan
Nick Coghlan wrote:
> There are actually 5 ways, but the different mechanisms all have different
> use case (and I'm going to suggest getting rid of the dictionary form).

D'oh, I thought I deleted that parenthetical comment... obviously, I changed my 
mind on this point :)

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-27 Thread Nick Coghlan
Martin v. Löwis wrote:
> Travis E. Oliphant schrieb:
>> The datatype is an object that specifies how a certain block of
>> memory should be interpreted as a basic data-type.
>>
>> >>> datatype(float)
>> datatype('float64')
>
> I can't speak on the specific merits of this proposal, or whether this
> kind of functionality is desirable. However, I'm -1 on the addition of
> a builtin for this functionality (the PEP doesn't actually say that
> there is another builtin, but the examples suggest so). Instead, putting
> it into the sys, array, struct, or ctypes modules might be more
> appropriate, as might be the introduction of another module.

I'd say the answer to where we put it will be dependent on what happens to the 
idea of adding a NumArray style fixed dimension array type to the standard 
library. If that gets exposed through the array module as array.dimarray, then 
it would make sense to expose the associated data layout descriptors as 
array.datatype.

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org