Hello,

On Tue, Nov 10, 2009 at 1:32 PM, Gökhan Sever <gokhanse...@gmail.com> wrote:
> On Tue, Nov 10, 2009 at 12:09 PM, Darryl Wallace
> <darryl.wall...@prosensus.ca> wrote:
> > Hello again,
> >
> > The best way so far that's come to my attention is to use:
> > numpy.ma.masked_object
> >
> > The problem with this is that it's looking for a specific instance of an
> > object. So if the user had some elements of their array that were, for
> > example, "randomString", then it would not be picked up
> > e.g.
> > ---
> > from numpy import *
> > mixedArray=array([1,2, '', 3, 4, 'randomString'], dtype=object)
> > mixedArrayMask = ma.masked_object(mixedArray, 'randomString').mask
> > ---
> > then mixedArrayMask will yield:
> >
> > array([ False, False, False, False, False, True])
> >
> > Can anyone help me so that all strings are found in the array without
> > having to explicitly loop through them in Python?
> >
> > Thanks,
> > Darryl
>
> Why not stick to the same Missing-Value-Code for all the non-valid
> data? I don't know how the MA module would handle mixed MVCs in the same
> array without modifying the existing code. Otherwise looping over the
> array and masking the str instances as NaN would be my alternative
> solution.
>

The reason I don't stick to a standard missing value code is that a user
may import other things in the datasheet that we need, like row or column
labels, or may be getting data from a specific source which reports missing
data as a specific string.

I currently do as you suggested. But when the dataset size becomes large,
it gets to be quite slow due to the overhead of Python looping.

Thanks

> > On Fri, Nov 6, 2009 at 3:56 PM, Darryl Wallace
> > <darryl.wall...@prosensus.ca> wrote:
> >>
> >> What I'm doing is importing some data from Excel and sometimes there are
> >> strings in the worksheet. Often a user will use an empty cell or a
> >> string to represent data that is missing.
> >> e.g.
> >> from numpy import *
> >> mixedArray=array([1, 2, '', 3, 4, 'String'], dtype=object)
> >> Two questions:
> >> 1) Is there a quick way to find the elements in the array that are the
> >> strings without iterating over each element in the array?
> >> or
> >> 2) Could I quickly turn it into a masked array of type float where all
> >> string elements are set as missing points?
> >> I've been struggling with this for a while and can't come across a
> >> method that will allow me to do it without iterating over each element.
> >> Any help or pointers in the right direction would be greatly
> >> appreciated.
> >> Thanks,
> >> Darryl
>
> --
> Gökhan

--
______________________________________
Darryl Wallace: Project Leader
ProSensus Inc.
McMaster Innovation Park
175 Longwood Road South, Suite 301
Hamilton, Ontario, L8P 0A1
Canada (GMT -05:00)

Tel: 1-905-528-9136
Fax: 1-905-546-1372

Web site: http://www.prosensus.ca/
______________________________________
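
For reference, the two questions in the thread (find every string element, then get a float masked array with those elements missing) can be sketched as below. This is only one possible approach, not something proposed in the thread: the helper name mask_strings is hypothetical, Python 3's str is assumed as the string type, and np.frompyfunc still calls a Python function once per element, so it removes the explicit loop from user code but is not a true vectorized test. No timings were measured.

```python
import numpy as np


def mask_strings(values):
    """Return a float masked array in which every str element is masked.

    Hypothetical helper for illustration; assumes all non-string
    elements can be converted to float.
    """
    arr = np.asarray(values, dtype=object)
    # Wrap the isinstance test as a ufunc; the result is an object array
    # of Python bools, converted here to a real boolean mask.
    is_str = np.frompyfunc(lambda x: isinstance(x, str), 1, 1)(arr).astype(bool)
    # Replace strings with NaN so the remaining values convert to float.
    data = np.where(is_str, np.nan, arr).astype(float)
    return np.ma.masked_array(data, mask=is_str)


mixed = np.array([1, 2, '', 3, 4, 'randomString'], dtype=object)
masked = mask_strings(mixed)
```

Here masked.mask marks both the empty string and 'randomString', unlike numpy.ma.masked_object, which matches only the single value it is given.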
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion