Re: [Numpy-discussion] numpy.mean problems

2011-12-14 Thread ferreirafm

Hi Eraldo,
Indeed, pandas is a really nice module! If it is going to become part
of numpy, that's even better.
Thanks for the suggestion.
All the Best,
Fred


Eraldo Pomponi wrote:
 
> Hi Fred,
>
> Pandas has a nice interface to PyTables if you still need it:
>
> http://pandas.sourceforge.net/io.html#hdf5-pytables
>
> However, my intention was just to point you to pandas because it
> is a really powerful tool if you need to deal with heterogeneous
> tabular data. It is also important to note that there are plans in the
> numpy community to include/port part of this package directly into the
> codebase. This says a lot about how good it is...
>
> Best,
> Eraldo
 
 

-- 
View this message in context: 
http://old.nabble.com/numpy.mean-problems-tp32945124p32975342.html
Sent from the Numpy-discussion mailing list archive at Nabble.com.

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy.mean problems

2011-12-13 Thread ferreirafm

Hi Eraldo, 
Thanks for your suggestion. I was using PyTables but gave up after learning
that some very useful capabilities are sold only as a professional package.
However, it is still useful for many printing and data manipulation tasks,
and it can handle extremely large datasets (which is not my case).
Regards,
Fred  
 

Eraldo Pomponi wrote:
 
> I would suggest you have a look at pandas
> (http://pandas.sourceforge.net/). It was really helpful for me. It seems
> well suited to the type of data that you are working with. It has nice
> broadcasting capabilities for applying numpy functions to a set of
> columns:
> http://pandas.sourceforge.net/basics.html#descriptive-statistics
> http://pandas.sourceforge.net/basics.html#function-application
>
> Cheers,
> Eraldo
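
As a rough illustration of the suggestion above, here is a minimal sketch of how a structured array could be handed to pandas for filtering and column statistics. The sample rows are invented, only three of the column names from this thread are used, and the calls follow the current pandas API rather than the 2011 release discussed here.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data using three of the column names from the thread.
tab = np.array(
    [(101, 20, 95.0), (102, 10, 80.0), (103, 15, 90.0)],
    dtype=[("sgi", "i8"), ("length", "i8"), ("pident", "f8")],
)

# A DataFrame can be built directly from a structured array;
# the field names become the column names.
df = pd.DataFrame(tab)

# Keep rows with length >= 15, then take the mean of every column.
col_means = df[df["length"] >= 15].mean()

# Apply a NumPy function to a chosen set of columns.
logs = df[["length", "pident"]].apply(np.log10)

print(col_means)
```

Because each column keeps its own dtype, the mean is computed per column and the mixed integer/float layout is not a problem.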
 

-- 
View this message in context: 
http://old.nabble.com/numpy.mean-problems-tp32945124p32970295.html
Sent from the Numpy-discussion mailing list archive at Nabble.com.



Re: [Numpy-discussion] numpy.mean problems

2011-12-11 Thread ferreirafm


Aronne Merrelli wrote:
 
> I can recreate this error if tab is a structured ndarray - what is the
> dtype of tab?
>
> If that is correct, I think you could fix this by simplifying things.
> Since tab is already an ndarray, you should not need to convert it back
> into a python list. By converting the ndarray back to a list you are
> adding an extra level of wrapping as a python object, which is ultimately
> why you get that error about adding numpy.void.
>
> Unfortunately you cannot directly take the mean of a struct dtype;
> structs are generic so they could have fields with strings, or objects,
> etc, that would be invalid for a mean calculation. However the following
> code fragment should work pretty efficiently. It will make a 1-element
> array of the same dtype as tab, and then populate it with the mean value
> of all elements where the length is >= 15. Note that dtype.fields.keys()
> gives you a nice way to iterate over the fields in the struct dtype:
>
>     length_mask = tab['length'] >= 15
>     tab_means = np.zeros(1, dtype=tab.dtype)
>     for k in tab.dtype.fields.keys():
>         tab_means[k] = np.mean(tab[k][length_mask])
>
> In general this would not work if tab has a field that is not a simple
> numeric type, such as a str, object, ... But it looks like your arrays are
> all numeric from your example above.
>
> Hope that helps,
> Aronne
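
The per-field approach quoted above can be sketched as a self-contained script. The sample array is made up and carries only three of the sixteen fields. One deliberate deviation, which is an assumption of this sketch and not part of the quoted fragment: the result array uses float64 for every field, because writing a mean into an 'i8' field would drop the fractional part.

```python
import numpy as np

# Hypothetical sample data; the real tab in the thread has 16 fields.
tab = np.array(
    [(101, 20, 95.0), (102, 10, 80.0), (103, 15, 90.0)],
    dtype=[("sgi", "i8"), ("length", "i8"), ("pident", "f8")],
)

# Boolean mask selecting the rows whose length is at least 15.
length_mask = tab["length"] >= 15

# One-element result array with the same field names but float fields,
# so that integer columns keep the fractional part of their means.
mean_dtype = [(name, "f8") for name in tab.dtype.names]
tab_means = np.zeros(1, dtype=mean_dtype)

# Compute the mean one field at a time over the selected rows.
for name in tab.dtype.names:
    tab_means[name] = np.mean(tab[name][length_mask])

print(tab_means)
```

Iterating over tab.dtype.names gives the fields in declaration order, which keeps the output aligned with the original column layout.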
 
Hi Aronne,
Thanks for your reply. Indeed, tab is a mix of different column types:
tab.dtype:
[('sgi', 'i8'), ('length', 'i8'), ('nident', 'i8'), ('pident', 'f8'),
('positive', 'i8'), ('ppos', 'f8'), ('mismatch', 'i8'), ('qstart',
'i8'), ('qend', 'i8'), ('sstart', 'i8'), ('send', 'i8'), ('gapopen',
'i8'), ('gaps', 'i8'), ('evalue', 'f8'), ('bitscore', 'f8'), ('score',
'f8')]
Interestingly, I was not able to import some columns of digits as
strings, as one can with R data frame objects.
I'll try to adapt your example to my needs and let you know the results.
Regards.
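
On reading a column of digits as strings: the thread does not say how get_textab loads the file, so the following is only a sketch that assumes a tab-separated text file read with np.genfromtxt. The sample text and the 'U10' field width are invented for illustration.

```python
import io
import numpy as np

# Hypothetical tab-separated sample; the first column has leading zeros
# that would be lost if it were parsed as an integer.
text = "00123\t20\t95.0\n00456\t10\t80.0\n"

# Request a Unicode string field ("U10") for the identifier column and
# numeric types for the remaining columns.
dtype = [("sgi", "U10"), ("length", "i8"), ("pident", "f8")]
tab = np.genfromtxt(
    io.StringIO(text), delimiter="\t", dtype=dtype, encoding="utf-8"
)

print(tab["sgi"])
```

A string field like this would have to be skipped when computing per-field means, since it is not a numeric type.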
   
-- 
View this message in context: 
http://old.nabble.com/numpy.mean-problems-tp32945124p32955052.html
Sent from the Numpy-discussion mailing list archive at Nabble.com.



[Numpy-discussion] numpy.mean problems

2011-12-09 Thread ferreirafm

Hi everyone,
I'm quite new to both numpy and Python. Could someone please tell me
what I'm doing wrong?
Here is my piece of code:

def stats(filename):
    """Utility to perform some basic statistics on columns."""
    tab = get_textab(filename)
    stat_list = []
    for row in sort_tab(tab):
        if row['length'] >= 15:
            stat_list.append(row)
    stat_array = np.array(stat_list)
    print type(sort_tab(tab))
    print type(stat_array)
    #print stat_array.mean(axis=0)

    print np.mean(stat_array, axis=0)

Which results in:
<type 'numpy.ndarray'>
<type 'numpy.ndarray'>
Traceback (most recent call last):
  File "/home/ferreirafm/bin/cross.py", line 213, in <module>
    main()
  File "/home/ferreirafm/bin/cross.py", line 204, in main
    stats(filename)
  File "/home/ferreirafm/bin/cross.py", line 146, in stats
    print np.mean(stat_array, axis=0)
  File "/usr/lib64/python2.7/site-packages/numpy/core/fromnumeric.py", line 2374, in mean
    return mean(axis, dtype, out)
TypeError: unsupported operand type(s) for +: 'numpy.void' and 'numpy.void'
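
A minimal sketch of the same situation on a small, invented structured array (a stand-in for whatever get_textab returns, which is not shown in this message). It filters rows with a boolean mask instead of a Python loop, and shows that np.mean still rejects whole records while a single numeric field works. The wording of the error may differ between NumPy versions.

```python
import numpy as np

# Hypothetical stand-in for the table returned by get_textab().
tab = np.array(
    [(20, 95.0), (10, 80.0), (15, 90.0)],
    dtype=[("length", "i8"), ("pident", "f8")],
)

# Boolean indexing keeps the rows with length >= 15 and stays an ndarray,
# so no Python list or np.array() round trip is needed.
stat_array = tab[tab["length"] >= 15]

# np.mean cannot add whole records together, so this raises TypeError.
try:
    np.mean(stat_array, axis=0)
    mean_failed = False
except TypeError:
    mean_failed = True

# Taking the mean of a single numeric field works.
pident_mean = stat_array["pident"].mean()

print(mean_failed, pident_mean)
```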
-- 
View this message in context: 
http://old.nabble.com/numpy.mean-problems-tp32945124p32945124.html
Sent from the Numpy-discussion mailing list archive at Nabble.com.
