Timothy Hochberg wrote:
>
>
> On 1/6/07, *Travis Oliphant* <[EMAIL PROTECTED]
> <mailto:[EMAIL PROTECTED]>> wrote:
>
> Tim Hochberg wrote:
> > Christopher Barker wrote:
> >
> > [SNIP]
> >
> >> I think the PEP has far more chances of success if it's seen as a
> >> request from a variety of package developers, not just the
> >> numpy crowd
> >> (which, after all, already has numpy).
> >>
> > This seems eminently sensible. Getting a few developers from other
> > projects on board would help a lot; it might also reveal some
> > deficiencies in the proposal that we don't see yet.
> >
> It would help quite a bit. Are there any suggestions of whom to
> recruit to review the proposal?
>
>
> Before I can answer that, I need to ask you a question. How do you see
> this extension to the buffer protocol? Do you see it as a supplement
> to the earlier array protocol, or do you see it as a replacement?
This is a replacement for the previously described array protocol PEP.
It is how I'm trying to get the array protocol into Python.
In that vein, it has two purposes:
One is to make a better buffer protocol that includes a conception of an
N-dimensional array in Python itself. If we can include this in Python,
then we get a lot of mileage out of all the people who write extension
modules for Python that should really be making their memory available
as an N-dimensional array (every time I turn around there is a new
wrapping of some library that is *not* using NumPy as the underlying
extension). With the existence of ctypes it only gets worse, as
nobody thinks about exposing things as arrays anymore, and so NumPy users
don't get the ease of use we would have if the N-dimensional array
concept were a part of Python itself.
For example, I just found the FreeImage project, which wraps a nice
library using ctypes. But it doesn't have a way to expose these images
as numpy arrays. Now, it would probably take me only a few hours to
make the connection between FreeImage and NumPy, but I'd like to see the
day when it happens without me (or some other NumPy expert) having to do
all the work. If ctypes objects exposed the extended buffer protocol
for appropriate types, then I wouldn't have to do anything, because the
wrapped structures would be exposable as arrays. All of a sudden I could say

    a = array(freeimobj)

and do math on the array in Python.
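As a sketch of what that would look like today: an object only has to advertise its memory through the existing array protocol's __array_interface__ attribute for NumPy to consume it without copying. The FakeImage class below is purely illustrative (FreeImage exposes no such attribute), standing in for a ctypes-wrapped image:

```python
import ctypes
import numpy as np

# Hypothetical stand-in for a ctypes-wrapped image object.  FreeImage
# provides no such attribute today; the class and its 8-bit grayscale
# layout are invented for illustration.
class FakeImage:
    def __init__(self, width, height):
        # C-owned pixel buffer, as a ctypes wrapper would hold it
        self._buf = (ctypes.c_ubyte * (width * height))()
        # The array protocol: describe the memory so consumers like
        # NumPy can view it in place, with no library-specific glue.
        self.__array_interface__ = {
            'version': 3,
            'shape': (height, width),
            'typestr': '|u1',   # unsigned single bytes
            'data': (ctypes.addressof(self._buf), False),  # (ptr, read-only?)
        }

freeimobj = FakeImage(4, 3)
a = np.asarray(freeimobj)   # zero-copy view onto the ctypes buffer
a += 7                      # math on the "image" writes straight through
```

The point of the PEP is that this hand-off would happen at the C level through the buffer protocol itself, rather than through a Python-level attribute that each wrapper author has to know about.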
Or, if I'm an extension module writer, I don't need to have NumPy (or
rely on it) in order to do some computation on freeimobj in C itself.
Sure, you can do it now (if the array protocol is followed --- but not
many people have adopted it yet, and some have argued that it's "not in
Python itself"). So, I guess, the big reason I'm pushing this is
largely marketing. The buffer protocol is the "right" place to put the
array protocol.
The second reason is to ensure that the buffer protocol itself doesn't
"disappear" in Python 3000. Not all of the Python devs seem to see its
value, though it can sometimes be unclear what the attitudes are.
> > 2. Is there any type besides Py_STRUCTURE that can have names
> > and fields? If so, what, and what do they mean? If not, you
> > should just say that.
> >
> Yes, you can add fields to a multi-byte primitive if you want. This
> would be similar to thinking about the data-format as a C-like union.
> Perhaps the data-field has meaning as a 4-byte integer but the
> most-significant and least-significant bytes should also be
> addressable
> individually.
>
>
> Hmm. I think I understand this somewhat better now, but I can't decide
> if it's cool or overkill. Is this supporting a feature that ctypes has?
I don't know. It's basically a situation where it's easier to support
it than not to, so it's there.
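For what it's worth, NumPy's structured dtypes can already express this union-like layout by letting fields share offsets. A minimal sketch (the field names here are invented for illustration):

```python
import numpy as np

# A union-like data-format: 'value' is a 4-byte little-endian unsigned
# integer, while 'lsb' and 'msb' overlay its least- and most-significant
# bytes by reusing offsets 0 and 3 within the same 4-byte item.
dt = np.dtype({
    'names':   ['value', 'lsb', 'msb'],
    'formats': ['<u4',   'u1',  'u1'],
    'offsets': [0,       0,     3],
})

a = np.zeros(1, dtype=dt)
a['value'] = 0x11223344
# With the explicit little-endian layout, the low byte sits at offset 0:
print(hex(int(a['lsb'][0])), hex(int(a['msb'][0])))  # 0x44 0x11
```

So the most- and least-significant bytes of the 4-byte integer are individually addressable as fields, exactly the C-union reading described above.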
>
>
> > 3. And on this topic, why a tuple of ([names,..], {field})? Why
> > not simply a list of (name, dfobject, offset, meta), for
> > example? And what's the meta information if it's not PyNone?
> > Just a string? Anything at all?
> >
>
> The list of names is useful for having an ordered list so you can
> traverse the structure in field order. It is technically not necessary,
> but it makes it a lot easier to parse a data-format object in offset
> order (it is used a bit in NumPy, for example).
>
>
> Right, I got that. Between names and fields you are simulating an
> ordered dict. What I still don't understand is why you chose to
> simulate this ordered dict using a list plus a dictionary rather than
> a list of tuples. This may well just be a matter of taste. However,
> for the small sizes I'd expect of these lists, I would expect a list
> of tuples to perform better than the dictionary solution.
Ah. I misunderstood. You are right that if I had considered needing an
ordered list of names up front, this kind of thing would make more sense.
I think the reason for the choice of a dictionary is that I was thinking
of field access as attribute look-up, which is just dictionary look-up.
So, conceptually that was easier for me. But, tuples probably have less
overhead (especially for small numbers of fields), at the expense of
having to search for the field name on field access.
But, I'm trusting that dictionaries (especially small ones) are pretty
well optimized in Python (I haven't tested that assertion in this
particular case, though).
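To make the two representations concrete, here is a small sketch (the field contents are invented for illustration) of the (names list, fields dict) pair next to the flat list-of-tuples alternative:

```python
# The pair discussed above: an ordered list of names plus a dict mapping
# each name to its (format, offset, meta) description.  Lookup by name
# is a dict access; ordering lives in the separate names list.
names = ['x', 'n']
fields = {'x': ('<f8', 0, None),   # 8-byte float at offset 0
          'n': ('<i4', 8, None)}   # 4-byte int at offset 8

# The list-of-tuples alternative: ordering is implicit in the list
# itself, but finding a field by name requires a linear scan.
as_tuples = [(name,) + fields[name] for name in names]

# Offset-order traversal works from either form:
for name in names:
    fmt, offset, meta = fields[name]

# Name lookup in the tuple form needs a search:
fmt_n = next(t[1] for t in as_tuples if t[0] == 'n')
```

For the handful of fields a typical struct has, either form is cheap; the trade-off is dict-speed lookup versus one fewer container to keep in sync.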
>
> FWIW, the array protocol PEP seems more relevant to what I do since
> I'm not concerned so much with the overhead since I'm sending big
> chunks of data back and forth.
This proposal is trying to get the array protocol *into* Python. So,
this is the array protocol PEP. Anyone supportive of the array protocol
should be interested in and thinking about this PEP.
-Travis
_______________________________________________
Numpy-discussion mailing list
[email protected]
http://projects.scipy.org/mailman/listinfo/numpy-discussion