Re: [Matplotlib-users] loading csv data into arrays

per freem Wed, 12 Aug 2009 08:49:46 -0700

hi all,

thanks for these comments. i tried loadtxt and genfromtxt and am
having similar problems. my data looks like this:


my;header1  myheader-2_a  myheader-2_b
a:5-X:b     3;0;5;0;0;0   3.015
c:6-Y:d     0;0;0;0;0;0   2.5

i simply want to read these in, and have all numbers be read in as
floats and all things that don't look like numbers (in this case the
first and second columns) to be parsed in as strings.

i tried:

data = genfromtxt(myfile, delimiter='\t', dtype=None)

(if i don't specify dtype=None, it reads everything as NaN)

the first problem is that with dtype=None all the entries are parsed
as strings. i'd like to be able to read in the unambiguously numeric
values as numbers (column 3 in this case.)

the second problem is that if i try to use headers as column names using:

data = genfromtxt(myfile, delimiter='\t', dtype=None, names=True)

then it converts my headers into different strings:

>>> data
array([('a:5-X:b', '3;0;5;0;0;0', 3.0150000000000001),
       ('c:6-Y:d', '0;0;0;0;0;0', 2.5)],
      dtype=[('myheader1', '|S7'), ('myheader2_a', '|S11'),
             ('myheader2_b', '<f8')])

i would only like to refer to my headers using this notation:

data['my;header1']

i don't need to be able to write data.headername at all. is there a
way to make genfromtxt not mess with any of the header names, and read
in the numeric values?

thanks very much.
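
one possible workaround for the renamed headers, as a minimal sketch rather than a definitive answer: read the header line yourself, let genfromtxt skip it, and key the columns by the untouched header strings. this assumes Python 3 and NumPy 1.14 or newer (for the encoding argument), and that the columns have mixed types so the result is a structured array. the helper name load_with_raw_headers and the sample text are made up for illustration.

```python
import io

import numpy as np

# Hypothetical sample mirroring the tab-separated layout described above.
raw = (
    "my;header1\tmyheader-2_a\tmyheader-2_b\n"
    "a:5-X:b\t3;0;5;0;0;0\t3.015\n"
    "c:6-Y:d\t0;0;0;0;0;0\t2.5\n"
)


def load_with_raw_headers(text, delimiter="\t"):
    """Return {original header string: column array} for delimited text.

    Illustrative helper, not part of NumPy. Assumes the columns have mixed
    types, so genfromtxt returns a structured array with fields f0, f1, ...
    """
    header = text.splitlines()[0].split(delimiter)
    # skip_header=1 keeps genfromtxt away from the header line, so the names
    # are never rewritten; dtype=None lets it guess a type per column.
    data = np.genfromtxt(
        io.StringIO(text),
        delimiter=delimiter,
        dtype=None,
        skip_header=1,
        encoding="utf-8",
    )
    return {name: data[field] for name, field in zip(header, data.dtype.names)}


columns = load_with_raw_headers(raw)
print(columns["my;header1"])    # string column
print(columns["myheader-2_b"])  # float column
```

the result is a plain dict, so columns['my;header1'] works with the header exactly as written in the file, and only the columns that parse as numbers come back as floats.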


On Wed, Aug 12, 2009 at 11:16 AM, Ryan May<rma...@gmail.com> wrote:
> On Wed, Aug 12, 2009 at 10:01 AM, Sandro Tosi <mo...@debian.org> wrote:
>>
>> On Wed, Aug 12, 2009 at 16:56, per freem<perfr...@gmail.com> wrote:
>> > hi all,
>> >
>> > i have tab-separated text files that i would like to parse into arrays
>> > in numpy/scipy. i simply want to be able to read in the data into an
>>
>> numpy's loadtxt()
>
> With numpy 1.3 and newer, there's also numpy.genfromtxt (which actually
> should behave very similar to mlab.csv2rec):
>
> import numpy as np
> from io import StringIO
> data = StringIO("""
> #gender age weight
> M   21  72.100000
> F   35  58.330000
> M   33  21.99
> """)
>
> arr = np.genfromtxt(data, names=True, dtype=None)
> print(arr['gender'])
> print(arr['age'])
>
> Writing this back out to a file in the same format will require a bit more
> manual (though straightforward) work.  There's no simple method that will
> do it for you.  The best one-liner here is:
>
> arr.tofile('test.txt', sep='\n')
>
> >> cat test.txt
> ('M', 21, 72.099999999999994)
> ('F', 35, 58.329999999999998)
> ('M', 33, 21.989999999999998)
>
> That should get you going.  If it's not enough, feel free to post a sample
> of your data file (or a representative example) and I can try to point you
> further in the right direction.
>
> Ryan
>
> --
> Ryan May
> Graduate Research Assistant
> School of Meteorology
> University of Oklahoma
>
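
For writing the array back out in roughly the original layout, np.savetxt with an explicit format string is one option. This is a minimal sketch, assuming Python 3 and NumPy 1.14 or newer; the format string, the tab delimiter, and the in-memory buffer are illustrative choices, not something from the messages above.

```python
import io

import numpy as np

# Same sample data as in the quoted example, read from an in-memory buffer.
text = "#gender age weight\nM 21 72.1\nF 35 58.33\nM 33 21.99\n"
arr = np.genfromtxt(io.StringIO(text), names=True, dtype=None, encoding="utf-8")

# savetxt formats each record of a structured array with one multi-field
# format string: one specifier per field, in field order.
out = io.StringIO()
np.savetxt(
    out,
    arr,
    fmt="%s\t%d\t%.2f",
    header="\t".join(arr.dtype.names),
    comments="#",  # prefix for the header line, so genfromtxt can read it back
)
written = out.getvalue()
print(written)

# Reading the written text back should give the same field names.
again = np.genfromtxt(
    io.StringIO(written), names=True, dtype=None, delimiter="\t", encoding="utf-8"
)
print(again.dtype.names)
```

Passing a filename instead of the StringIO buffer writes the same text to disk.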

