Hi Fred,

I would suggest having a look at pandas (http://pandas.sourceforge.net/). It was really helpful for me, and it seems well suited to the type of data you are working with. It has nice "broadcasting" capabilities for applying numpy functions to a set of columns:

http://pandas.sourceforge.net/basics.html#descriptive-statistics
http://pandas.sourceforge.net/basics.html#function-application
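As a rough sketch of what that could look like (assuming a recent pandas release; the sample values and the three-field subset of your dtype are made up purely for illustration), a structured array can be handed straight to a DataFrame, filtered on length, and averaged column by column:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: a small structured array using three of the
# fields from the dtype in this thread (the values are invented).
tab = np.array(
    [(101, 12, 91.5), (102, 20, 80.0), (103, 30, 90.0)],
    dtype=[("sgi", "<i8"), ("length", "<i8"), ("pident", "<f8")],
)

# A DataFrame can be built directly from a structured ndarray;
# each field becomes a column.
df = pd.DataFrame(tab)

# Keep only the rows with length >= 15, then take the mean of each column.
long_hits = df[df["length"] >= 15]
col_means = long_hits.mean()

# Apply a NumPy function to several columns at once.
col_sqrt = long_hits[["length", "pident"]].apply(np.sqrt)

print(col_means)
```

Unlike writing the means back into an array of the original dtype, the result here is a float Series, so the means of the integer columns are not truncated.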
Cheers,
Eraldo

On Sun, Dec 11, 2011 at 1:49 PM, ferreirafm <ferreir...@lim12.fm.usp.br> wrote:

> Aronne Merrelli wrote:
> >
> > I can recreate this error if tab is a structured ndarray - what is the
> > dtype of tab?
> >
> > If that is correct, I think you could fix this by simplifying things.
> > Since tab is already an ndarray, you should not need to convert it back
> > into a python list. By converting the ndarray back to a list you are
> > making an extra level of "wrapping" as a python object, which is
> > ultimately why you get that error about adding numpy.void.
> >
> > Unfortunately you cannot directly take a mean of a struct dtype;
> > structs are generic so they could have fields with strings, or objects,
> > etc, that would be invalid for a mean calculation. However the following
> > code fragment should work pretty efficiently. It will make a 1-element
> > array of the same dtype as tab, and then populate it with the mean value
> > of all elements where the length is >= 15. Note that dtype.fields.keys()
> > gives you a nice way to iterate over the fields in the struct dtype:
> >
> >     length_mask = tab['length'] >= 15
> >     tab_means = np.zeros(1, dtype=tab.dtype)
> >     for k in tab.dtype.fields.keys():
> >         tab_means[k] = np.mean(tab[k][length_mask])
> >
> > In general this would not work if tab has a field that is not a simple
> > numeric type, such as a str, object, ... But it looks like your arrays
> > are all numeric from your example above.
> >
> > Hope that helps,
> > Aronne
>
> Hi Aronne,
> Thanks for your reply. Indeed, tab is a mix of different column types:
>
> tab.dtype:
> [('sgi', '<i8'), ('length', '<i8'), ('nident', '<i8'), ('pident', '<f8'),
> ('positive', '<i8'), ('ppos', '<f8'), ('mismatch', '<i8'), ('qstart',
> '<i8'), ('qend', '<i8'), ('sstart', '<i8'), ('send', '<i8'), ('gapopen',
> '<i8'), ('gaps', '<i8'), ('evalue', '<f8'), ('bitscore', '<f8'), ('score',
> '<f8')]
>
> Interestingly, I wasn't able to import some columns of digits as
> strings, as one can with R dataframe objects.
> I'll try to adapt your example to my needs and let you know the results.
> Regards.
>
> --
> View this message in context:
> http://old.nabble.com/numpy.mean-problems-tp32945124p32955052.html
> Sent from the Numpy-discussion mailing list archive at Nabble.com.
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion