Re: [Numpy-discussion] numpy.array does not take generators

2007-08-17 Thread Timothy Hochberg
On 8/17/07, Barry Wark <[EMAIL PROTECTED]> wrote:
>
> Is there a reason not to add an argument to fromiter that specifies
> the final size of the n-d array? Reading this discussion, I realized
> that there are several places in my code where I create 2-D arrays
> like this:
>
> arr = N.array([d.data() for d in list_of_data_containers]),
>
> where d.data() returns a buffer object.
>
> I would guess that this paradigm causes lots of memory copying. The
> more efficient solution, I think, would be to preallocate the array
> and then assign each row in a loop. It's so much clearer this way,
> however, that I've kept it as is in the code.
>
> So, what if I could do something like
>
> arr = N.fromiter((d.data() for d in list_of_data_containers), shape=(x,y)),


I don't know that there's any theoretical problem in terms of doing
something like this. There are a couple of practical issues though. One is
that it would significantly increase the implementation complexity of
fromiter, which right now is about as simple as it can reasonably be.
Someone would need to step forward and write and test the code. The second
issue is with the interface. The interface that you propose isn't really
right. The current interface is:

   fromiter(iterable, dtype, count=-1)

where count indicates how many items to extract from the iterable (-1 means
iterate until it is exhausted). 'shape' as you propose would couple to this in
an unnatural way. Adding another keyword argument that indicates just the
shape of the elements would make more sense, but it starts to seem a bit
clunky.

   fromiter(iterable, dtype, count=-1, itemshape=())

For this particular application, there doesn't seem to be any problem simply
defining yourself a little utility function to do this for you.

def from_shaped_iter(iterable, dtype, shape):
    a = numpy.empty(shape, dtype)
    for i, x in enumerate(iterable):
        a[i] = x
    return a

I expect this would have decent performance if the y dimension is reasonably
large.
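
A variant of the helper above that also enforces the contract Barry
describes (raise if an item has the wrong size or if there are too many
items) might look like the sketch below. The name from_shaped_iter_checked
is hypothetical and not part of numpy, and the items are assumed to be
plain sequences of numbers rather than buffer objects.

```python
import numpy


def from_shaped_iter_checked(iterable, dtype, shape):
    """Fill a preallocated array row by row, validating each item.

    Hypothetical helper for illustration; not a numpy function.
    """
    a = numpy.empty(shape, dtype)
    n = 0
    for x in iterable:
        if n >= shape[0]:
            raise ValueError(f"iterable yielded more than {shape[0]} items")
        row = numpy.asarray(x, dtype=dtype)
        if row.shape != a.shape[1:]:
            raise ValueError(
                f"item {n} has shape {row.shape}, expected {a.shape[1:]}"
            )
        a[n] = row
        n += 1
    if n != shape[0]:
        raise ValueError(f"iterable yielded {n} items, expected {shape[0]}")
    return a


rows = [[1, 2, 3], [4, 5, 6]]
arr = from_shaped_iter_checked((r for r in rows), float, (2, 3))
```

The extra checks cost one small array conversion per row, so the same
caveat about the y dimension applies.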

regards,


-tim

> with the contract that fromiter will throw an exception if any of the
> d.data() are not of size y or if there are more than x elements in
> list_of_data_containers?
>
> Just a thought for discussion.
>
> barry
>
> On 8/16/07, Robert Kern <[EMAIL PROTECTED]> wrote:
> > Geoffrey Zhu wrote:
> > > Hi All,
> > >
> > > I want to construct a numpy array based on Python objects. In the
> > > below code, opts is a list of tuples.
> > >
> > > For example,
> > >
> > > opts=[ ('C', 100, 3, 'A'), ('K', 200, 5.4, 'B')]
> > >
> > > If I use a generator like the following:
> > >
> > > K=numpy.array(o[2]/1000.0 for o in opts)
> > >
> > > It does not work.
> > >
> > > I have to use:
> > >
> > > numpy.array([o[2]/1000.0 for o in opts])
> > >
> > > Is this behavior intended?
> >
> > Yes. With arbitrary generators, there is no good way to do the kind of
> > mind-reading that numpy.array() usually does with sequences. It would have to
> > unroll the whole generator anyways. fromiter() works for this, but you are
> > restricted to 1-D arrays, for which the mind-reading is a lot easier to
> > implement.
> >
> > --
> > Robert Kern
> >
> > "I have come to believe that the whole world is an enigma, a harmless enigma
> >  that is made terrible by our own mad attempt to interpret it as though it had
> >  an underlying truth."
> >   -- Umberto Eco



___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy.array does not take generators

2007-08-17 Thread Barry Wark
Is there a reason not to add an argument to fromiter that specifies
the final size of the n-d array? Reading this discussion, I realized
that there are several places in my code where I create 2-D arrays
like this:

arr = N.array([d.data() for d in list_of_data_containers]),

where d.data() returns a buffer object.

I would guess that this paradigm causes lots of memory copying. The
more efficient solution, I think, would be to preallocate the array
and then assign each row in a loop. It's so much clearer this way,
however, that I've kept it as is in the code.

So, what if I could do something like

arr = N.fromiter((d.data() for d in list_of_data_containers), shape=(x,y)),

with the contract that fromiter will throw an exception if any of the
d.data() are not of size y or if there are more than x elements in
list_of_data_containers?

Just a thought for discussion.

barry

On 8/16/07, Robert Kern <[EMAIL PROTECTED]> wrote:
> Geoffrey Zhu wrote:
> > Hi All,
> >
> > I want to construct a numpy array based on Python objects. In the
> > below code, opts is a list of tuples.
> >
> > For example,
> >
> > opts=[ ('C', 100, 3, 'A'), ('K', 200, 5.4, 'B')]
> >
> > If I use a generator like the following:
> >
> > K=numpy.array(o[2]/1000.0 for o in opts)
> >
> > It does not work.
> >
> > I have to use:
> >
> > numpy.array([o[2]/1000.0 for o in opts])
> >
> > Is this behavior intended?
>
> Yes. With arbitrary generators, there is no good way to do the kind of
> mind-reading that numpy.array() usually does with sequences. It would have to
> unroll the whole generator anyways. fromiter() works for this, but you are
> restricted to 1-D arrays, for which the mind-reading is a lot easier to
> implement.
>
> --
> Robert Kern
>
> "I have come to believe that the whole world is an enigma, a harmless enigma
>  that is made terrible by our own mad attempt to interpret it as though it had
>  an underlying truth."
>   -- Umberto Eco


Re: [Numpy-discussion] numpy.array does not take generators

2007-08-17 Thread Geoffrey Zhu
On 8/17/07, Robert Kern <[EMAIL PROTECTED]> wrote:
> Geoffrey Zhu wrote:
> > Hi All,
> >
> > I want to construct a numpy array based on Python objects. In the
> > below code, opts is a list of tuples.
> >
> > For example,
> >
> > opts=[ ('C', 100, 3, 'A'), ('K', 200, 5.4, 'B')]
> >
> > If I use a generator like the following:
> >
> > K=numpy.array(o[2]/1000.0 for o in opts)
> >
> > It does not work.
> >
> > I have to use:
> >
> > numpy.array([o[2]/1000.0 for o in opts])
> >
> > Is this behavior intended?
>
> Yes. With arbitrary generators, there is no good way to do the kind of
> mind-reading that numpy.array() usually does with sequences. It would have to
> unroll the whole generator anyways. fromiter() works for this, but you are
> restricted to 1-D arrays, for which the mind-reading is a lot easier to
> implement.
>
> --
> Robert Kern
>
> "I have come to believe that the whole world is an enigma, a harmless enigma
>  that is made terrible by our own mad attempt to interpret it as though it had
>  an underlying truth."
>  -- Umberto Eco

I see. Thanks for explaining.


Re: [Numpy-discussion] numpy.array does not take generators

2007-08-16 Thread Robert Kern
Geoffrey Zhu wrote:
> Hi All,
> 
> I want to construct a numpy array based on Python objects. In the
> below code, opts is a list of tuples.
> 
> For example,
> 
> opts=[ ('C', 100, 3, 'A'), ('K', 200, 5.4, 'B')]
> 
> If I use a generator like the following:
> 
> K=numpy.array(o[2]/1000.0 for o in opts)
> 
> It does not work.
> 
> I have to use:
> 
> numpy.array([o[2]/1000.0 for o in opts])
> 
> Is this behavior intended?

Yes. With arbitrary generators, there is no good way to do the kind of
mind-reading that numpy.array() usually does with sequences. It would have to
unroll the whole generator anyways. fromiter() works for this, but you are
restricted to 1-D arrays, for which the mind-reading is a lot easier to
implement.
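
To make the "mind-reading" concrete, here is a small sketch comparing what
numpy.array() does with a nested list, a generator, and a list
comprehension. The 0-d object array result for the generator is what
current NumPy releases do as far as I know; it is an observation, not
something stated in this thread, and older releases may differ.

```python
import numpy

# A nested list is a sequence, so numpy.array() can inspect it up front
# and infer both the shape and a common dtype.
nested = numpy.array([[1, 2], [3, 4.5]])
print(nested.shape, nested.dtype)

opts = [('C', 100, 3, 'A'), ('K', 200, 5.4, 'B')]

# A generator has no length and cannot be inspected without consuming it.
# In current releases it is not unrolled; it is stored as a single object.
wrapped = numpy.array(o[2] / 1000.0 for o in opts)
print(wrapped.shape, wrapped.dtype)

# Materialising the values in a list first gives the expected 1-D array.
K = numpy.array([o[2] / 1000.0 for o in opts])
print(K.shape, K.dtype)
```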

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco


Re: [Numpy-discussion] numpy.array does not take generators

2007-08-16 Thread Alan G Isaac
On Thu, 16 Aug 2007, Geoffrey Zhu apparently wrote:
> K=numpy.array(o[2]/1000.0 for o in opts)
> It does not work. 

K=numpy.fromiter((o[2]/1000.0 for o in opts),'float')
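
When the number of items is known in advance, the count argument of
fromiter() can be passed as well, which lets the result be allocated once
instead of grown as items arrive. A minimal sketch, reusing the opts list
from the original question:

```python
import numpy

opts = [('C', 100, 3, 'A'), ('K', 200, 5.4, 'B')]

# Without count, fromiter() reads until the generator is exhausted.
K = numpy.fromiter((o[2] / 1000.0 for o in opts), dtype=float)

# With count, exactly len(opts) items are read into a preallocated array.
K_counted = numpy.fromiter(
    (o[2] / 1000.0 for o in opts), dtype=float, count=len(opts)
)
```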

hth,
Alan Isaac






[Numpy-discussion] numpy.array does not take generators

2007-08-16 Thread Geoffrey Zhu
Hi All,

I want to construct a numpy array based on Python objects. In the
below code, opts is a list of tuples.

For example,

opts=[ ('C', 100, 3, 'A'), ('K', 200, 5.4, 'B')]

If I use a generator like the following:

K=numpy.array(o[2]/1000.0 for o in opts)

It does not work.

I have to use:

numpy.array([o[2]/1000.0 for o in opts])

Is this behavior intended?

By the way, it is quite inefficient to create a numpy array this way,
because I have to create a regular Python list first, and then construct a
numpy array from it. But I do not want to store everything in vector form
initially, as it is more natural to store them in objects, and easier
to use when organizing the data. Does anyone know any better way?

Thanks,
Geoffrey