Re: [Numpy-discussion] Convert recarray to list (is this a bug?)

2012-07-10 Thread Yan Tang
Thank you very much.

On Tue, Jul 10, 2012 at 3:02 AM, Travis Oliphant tra...@continuum.io wrote:


 On Jul 9, 2012, at 9:24 PM, Yan Tang wrote:

 Hi,

 I noticed an odd issue when trying to convert a recarray to a list.  See
 below for the example/test case.

 $ cat a.csv
 date,count
 2011-07-25,91
 2011-07-26,118
 $ cat b.csv
 name,count
 foo,1233
 bar,100

 $ python

 >>> from matplotlib import mlab
 >>> import numpy as np

 >>> a = mlab.csv2rec('a.csv')
 >>> b = mlab.csv2rec('b.csv')
 >>> a
 rec.array([(datetime.date(2011, 7, 25), 91), (datetime.date(2011, 7, 26), 118)],
       dtype=[('date', '|O8'), ('count', 'i8')])
 >>> b
 rec.array([('foo', 1233), ('bar', 100)],
       dtype=[('name', '|S3'), ('count', 'i8')])


 >>> np.array(a.tolist()).tolist()
 [[datetime.date(2011, 7, 25), 91], [datetime.date(2011, 7, 26), 118]]
 >>> np.array(b.tolist()).tolist()
 [['foo', '1233'], ['bar', '100']]


 The odd case is, 1233 becomes a string '1233' in the second command.  But
 91 is still a number 91.

 Why would this happen?  What's the correct way to do this conversion?


 You are trying to convert the record array into a list of lists, I
 presume?   The tolist() method on the rec.array produces a list of tuples.
   Be sure that a list of tuples does not actually satisfy your requirements
 --- it might.

 Passing this back to np.array is going to try to come up with a data-type
 that satisfies all the elements in the list of tuples.  You are relying
 here on np.array's intelligence for trying to figure out what kind of
 array you have.   It tries to do its best, but it is limited to
 determining a primitive data-type (float, int, string, object).   It
 can't always predict what you expect, especially when the original data
 source was a record like this.  In the first case, because of the
 date-time object, it decides the data is an object array, which works.  In
 the second it decides that the data can all be represented as a string
 and so chooses that.   The second .tolist() just produces a list of lists
 out of the 2-d array.
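
The dtype inference described in this reply can be reproduced without the CSV files. What follows is a minimal sketch, assuming Python 3, where strings are inferred as a unicode dtype rather than the '|S3' shown in the session. The names rows_a and rows_b are made up for illustration and stand in for a.tolist() and b.tolist().

```python
import datetime

import numpy as np

# Hypothetical stand-ins for a.tolist() and b.tolist() from the example.
rows_a = [(datetime.date(2011, 7, 25), 91), (datetime.date(2011, 7, 26), 118)]
rows_b = [("foo", 1233), ("bar", 100)]

# No dtype is given, so np.array must pick one type for every element.
arr_a = np.array(rows_a)
arr_b = np.array(rows_b)

print(arr_a.dtype.kind)  # object array: the ints stay Python ints
print(arr_b.dtype.kind)  # string array: the ints are stored as text
print(arr_b.tolist())    # so the counts come back as strings
```

Because the second array holds only strings, its .tolist() cannot recover the original integers.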

 Likely what you want to do is just create a list of lists from the
 original output of .tolist.   Like this:

 [list(x) for x in a.tolist()]
 [list(x) for x in b.tolist()]

 This will be faster as well...
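
A self-contained version of the suggested conversion might look like the sketch below. It assumes a plain structured array built by hand in place of the csv2rec result, and uses a "U3" field instead of '|S3' so it reads the same on Python 3. Both choices are assumptions made for illustration.

```python
import numpy as np

# A plain structured array standing in for the csv2rec result (assumption).
b = np.array([("foo", 1233), ("bar", 100)],
             dtype=[("name", "U3"), ("count", "i8")])

records = b.tolist()                   # list of tuples, one per record
rows = [list(rec) for rec in records]  # list of lists, field types preserved

print(rows)
```

Each field keeps its own type because the data never passes through a second, single-dtype array.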

 Best,

 -Travis









 Thanks.

 -uris-
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion





[Numpy-discussion] Convert recarray to list (is this a bug?)

2012-07-09 Thread Yan Tang
Hi,

I noticed an odd issue when trying to convert a recarray to a list.  See
below for the example/test case.

$ cat a.csv
date,count
2011-07-25,91
2011-07-26,118
$ cat b.csv
name,count
foo,1233
bar,100

$ python

>>> from matplotlib import mlab
>>> import numpy as np

>>> a = mlab.csv2rec('a.csv')
>>> b = mlab.csv2rec('b.csv')
>>> a
rec.array([(datetime.date(2011, 7, 25), 91), (datetime.date(2011, 7, 26), 118)],
      dtype=[('date', '|O8'), ('count', 'i8')])
>>> b
rec.array([('foo', 1233), ('bar', 100)],
      dtype=[('name', '|S3'), ('count', 'i8')])


>>> np.array(a.tolist()).tolist()
[[datetime.date(2011, 7, 25), 91], [datetime.date(2011, 7, 26), 118]]
>>> np.array(b.tolist()).tolist()
[['foo', '1233'], ['bar', '100']]


The odd case is, 1233 becomes a string '1233' in the second command.  But
91 is still a number 91.

Why would this happen?  What's the correct way to do this conversion?

Thanks.

-uris-


Re: [Numpy-discussion] About np array and recarray

2012-03-22 Thread Yan Tang
Thank you very much.  Very detailed explanation.

On Thu, Mar 22, 2012 at 1:16 AM, Travis Oliphant tra...@continuum.io wrote:


 On Mar 21, 2012, at 11:48 PM, Yan Tang wrote:

 Hi,

 I am really confused about np arrays and record arrays, and cannot
 understand how they work.

 I have a normal Python two-dimensional array/list:

 a = [['2000-01-01', 2],['2000-01-02', 3]]

 I want to convert it to a recarray with this dtype [('date', 'object'),
 ('count', 'int')].  I tried multiple ways and none of them works.  And some
 of the tests show pretty odd behavior.

 This is good, and it is almost what I want:

 >>> import numpy as np
 >>> a = [('2000-01-01', 2), ('2000-01-02', 3)]
 >>> np.array(a, dtype=[('date', 'object'), ('count', 'int')])
 array([('2000-01-01', 2), ('2000-01-02', 3)],
       dtype=[('date', '|O8'), ('count', 'i8')])


 This is the correct way to initialize a record array, or structured array,
 from a Python object.


 Why doesn't this work?!

 >>> a = [['2000-01-01', 2],['2000-01-02', 3]]
 >>> np.array(a, dtype=[('date', 'object'), ('count', 'int')])
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 ValueError: tried to set void-array with object members using buffer.


 The error here could be more instructive, but the problem is that, to
 simplify the np.array factory function (which is already somewhat complex),
 it was decided to force records to be input as tuples and not as lists.
 You *must* use tuples to specify records for a structured array.
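
Given that rule, a list of lists can be adapted by turning each row into a tuple before calling np.array. This is a minimal sketch using the data from the question; the name record_dtype is introduced here only for illustration.

```python
import numpy as np

a = [['2000-01-01', 2], ['2000-01-02', 3]]
record_dtype = [('date', object), ('count', np.int64)]

# Each inner list becomes a tuple, which np.array reads as one record.
arr = np.array([tuple(row) for row in a], dtype=record_dtype)

print(arr.shape)           # one-dimensional: one entry per record
print(arr['count'].sum())  # fields are addressed by name
```

The result is a one-dimensional structured array rather than a 2-d array, so each record keeps a string-like 'date' field next to an integer 'count' field.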

 Why does this cause a segmentation fault?!

 >>> a = [['2000-01-01', 2],['2000-01-02', 3]]
 >>> np.ndarray((len(a),), buffer=np.array(a), dtype=[('date', 'object'),
 ... ('count', 'int')])
 Segmentation fault (And python quit!)


 The np.ndarray constructor should not be used directly unless you know
 what you are doing.

 The np.array factory function is the standard way to create arrays.   The
 problem here is that you are explicitly asking NumPy to point to a
 particular region of memory to use as its data-buffer.   This memory is
 the data buffer of an array of strings.   The np.array factory function
 will try to auto-detect the data-type of the array if you do not specify
 it, which in this case results in an array of strings.  Then, with the
 dtype specification you are asking it to interpret a portion of that array
 of strings as a pointer to a Python object.   This will cause a
 segmentation fault when the printing code tries to dereference a pointer
 which is actually 4 characters of a string.
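
The auto-detection step can be inspected on its own, without handing the result to the ndarray constructor. A small sketch, assuming Python 3 (where the inferred kind is unicode rather than the byte strings of Python 2); the variable name strings is made up for illustration, and the buffer is deliberately never reinterpreted.

```python
import numpy as np

a = [['2000-01-01', 2], ['2000-01-02', 3]]

# With no dtype given, the mixed rows are unified into a 2-d string array.
strings = np.array(a)

print(strings.dtype.kind)   # a string kind, not object
print(strings.shape)
print(repr(strings[0, 1]))  # the count 2 is now stored as text
```

The memory behind this array holds character data, which is why reading it as Python object pointers is unsafe.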

 This should probably be checked for in the ndarray constructor.   I don't
 think it ever really makes sense to use an object dtype when you also
 supply the buffer unless that buffer actually held Python object pointers
 in the first place.   Even in this case you could do what you wanted
 without calling the constructor.  So, likely a check should be made so that
 you can't have an object array and also supply a buffer.


 Python version 2.6.5

 On this reference page,
 http://docs.scipy.org/doc/numpy/reference/generated/numpy.array.html

 >>> x = np.array([(1,2),(3,4)])
 >>> x
 array([[1, 2],
        [3, 4]])
 >>> np.array([[1, 2], [3, 4]])
 array([[1, 2],
        [3, 4]])

 Can anyone help me about this?


 I'm not sure what you are asking for here.   Yes, for arrays with
 non-structured dtypes, numpy will treat tuples as lists.
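
The equivalence for non-structured dtypes is easy to check directly. A minimal sketch, with variable names chosen only for illustration:

```python
import numpy as np

from_tuples = np.array([(1, 2), (3, 4)])
from_lists = np.array([[1, 2], [3, 4]])

# Without a structured dtype, both nestings give the same 2x2 integer array.
print(from_tuples.shape, from_lists.shape)
print(np.array_equal(from_tuples, from_lists))
```

The tuple/list distinction only matters once a structured dtype is supplied, where a tuple marks one record.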


What I am asking is this: from my example, [[1,2],[3,4]] and [(1,2),(3,4)]
produce results that look the same after constructing the np.array.  So,
going back to my first question, why does only the tuple version work there
and not the list version?

As you explained, it looks like we have to use tuples instead of lists.
That's OK.  But I didn't find it anywhere in the documentation, ;).



 Best regards,

 -Travis



 Thanks.


Re: [Numpy-discussion] About np array and recarray

2012-03-22 Thread Yan Tang
Yes, this is how I finally worked around it.  I just wanted to avoid the
conversion from list to tuple.

On Thu, Mar 22, 2012 at 1:32 AM, Val Kalatsky kalat...@gmail.com wrote:


 Will this do what you need to accomplish?

 import datetime
 np.array([(datetime.datetime.strptime(i[0], "%Y-%m-%d").date(), i[1]) for
 i in a], dtype=[('date', 'object'), ('count', 'int')])

 Val

 On Wed, Mar 21, 2012 at 11:48 PM, Yan Tang tang@gmail.com wrote:

 Hi,

 I am really confused about np arrays and record arrays, and cannot
 understand how they work.

 I have a normal Python two-dimensional array/list:

 a = [['2000-01-01', 2],['2000-01-02', 3]]

 I want to convert it to a recarray with this dtype [('date', 'object'),
 ('count', 'int')].  I tried multiple ways and none of them works.  And some
 of the tests show pretty odd behavior.

 This is good, and it is almost what I want:

 >>> import numpy as np
 >>> a = [('2000-01-01', 2), ('2000-01-02', 3)]
 >>> np.array(a, dtype=[('date', 'object'), ('count', 'int')])
 array([('2000-01-01', 2), ('2000-01-02', 3)],
       dtype=[('date', '|O8'), ('count', 'i8')])

 Why doesn't this work?!

 >>> a = [['2000-01-01', 2],['2000-01-02', 3]]
 >>> np.array(a, dtype=[('date', 'object'), ('count', 'int')])
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 ValueError: tried to set void-array with object members using buffer.

 Why does this cause a segmentation fault?!

 >>> a = [['2000-01-01', 2],['2000-01-02', 3]]
 >>> np.ndarray((len(a),), buffer=np.array(a), dtype=[('date', 'object'),
 ... ('count', 'int')])
 Segmentation fault (And python quit!)

 Python version 2.6.5

 On this reference page,
 http://docs.scipy.org/doc/numpy/reference/generated/numpy.array.html

 >>> x = np.array([(1,2),(3,4)])
 >>> x
 array([[1, 2],
        [3, 4]])
 >>> np.array([[1, 2], [3, 4]])
 array([[1, 2],
        [3, 4]])

 Can anyone help me about this?

 Thanks.



[Numpy-discussion] About np array and recarray

2012-03-21 Thread Yan Tang
Hi,

I am really confused about np arrays and record arrays, and cannot understand
how they work.

I have a normal Python two-dimensional array/list:

a = [['2000-01-01', 2],['2000-01-02', 3]]

I want to convert it to a recarray with this dtype [('date', 'object'),
('count', 'int')].  I tried multiple ways and none of them works.  And some
of the tests show pretty odd behavior.

This is good, and it is almost what I want:

>>> import numpy as np
>>> a = [('2000-01-01', 2), ('2000-01-02', 3)]
>>> np.array(a, dtype=[('date', 'object'), ('count', 'int')])
array([('2000-01-01', 2), ('2000-01-02', 3)],
      dtype=[('date', '|O8'), ('count', 'i8')])

Why doesn't this work?!

>>> a = [['2000-01-01', 2],['2000-01-02', 3]]
>>> np.array(a, dtype=[('date', 'object'), ('count', 'int')])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: tried to set void-array with object members using buffer.

Why does this cause a segmentation fault?!

>>> a = [['2000-01-01', 2],['2000-01-02', 3]]
>>> np.ndarray((len(a),), buffer=np.array(a), dtype=[('date', 'object'),
... ('count', 'int')])
Segmentation fault (And python quit!)

Python version 2.6.5

On this reference page,
http://docs.scipy.org/doc/numpy/reference/generated/numpy.array.html

>>> x = np.array([(1,2),(3,4)])
>>> x
array([[1, 2],
       [3, 4]])
>>> np.array([[1, 2], [3, 4]])
array([[1, 2],
       [3, 4]])

Can anyone help me about this?

Thanks.