Maybe I am asking the wrong question, or could go about this another way. I have thousands of numpy arrays to go through. Could I just identify which arrays contain NaNs and, for now, ignore those arrays entirely? Is there a simple way to do this? Any feedback will be greatly appreciated.
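One way to do what is asked above is to test each array with numpy.isnan(...).any() and skip it when the test is true. The following is only a minimal sketch under some assumptions: the arrays are plain float arrays of identical shape, and the helper names has_nan and mean_of_clean_arrays are made up for illustration and are not part of numpy.

```python
import numpy as np


def has_nan(arr):
    """Return True if the array contains at least one NaN."""
    return bool(np.isnan(arr).any())


def mean_of_clean_arrays(arrays):
    """Average only the arrays that contain no NaN at all.

    Returns (mean, used), where `used` is the number of arrays that
    contributed. `mean` is None if every array was skipped.
    """
    running_sum = None
    used = 0
    for arr in arrays:
        if has_nan(arr):
            # Ignore the entire array, as asked in the question.
            continue
        if running_sum is None:
            # Copy as float so the caller's array is not modified in place.
            running_sum = np.array(arr, dtype=float)
        else:
            running_sum += arr
        used += 1
    if used == 0:
        return None, 0
    return running_sum / used, used


arrays = [
    np.array([1.0, 2.0]),
    np.array([np.nan, 5.0]),  # skipped: contains a NaN
    np.array([3.0, 4.0]),
]
mean, used = mean_of_clean_arrays(arrays)
print("arrays used:", used)
print("mean:", mean)
```

Because only a running sum and a counter are kept, memory use does not grow with the number of arrays.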
On Thu, Dec 1, 2011 at 12:16 PM, questions anon <questions.a...@gmail.com> wrote:

> I am trying to calculate the mean across many netcdf files. I cannot use
> numpy.mean because there are too many files to concatenate and I end up
> with a memory error. I have enabled the below code to do what I need but I
> have a few nan values in some of my arrays. Is there a way to ignore these
> somewhere in my code. I seem to face this problem often so I would love a
> command that ignores blanks in my array before I continue on to the next
> processing step.
> Any feedback is greatly appreciated.
>
>
> netCDF_list=[]
> for dir in glob.glob(MainFolder + '*/01/')+ glob.glob(MainFolder + '*/02/')+ glob.glob(MainFolder + '*/12/'):
>     for ncfile in glob.glob(dir + '*.nc'):
>         netCDF_list.append(ncfile)
>
> slice_counter=0
> print netCDF_list
>
> for filename in netCDF_list:
>     ncfile=netCDF4.Dataset(filename)
>     TSFC=ncfile.variables['T_SFC'][:]
>     fillvalue=ncfile.variables['T_SFC']._FillValue
>     TSFC=MA.masked_values(TSFC, fillvalue)
>     for i in xrange(0,len(TSFC)-1,1):
>         slice_counter +=1
>         #print slice_counter
>         try:
>             running_sum=N.add(running_sum, TSFC[i])
>         except NameError:
>             print "Initiating the running total of my variable..."
>             running_sum=N.array(TSFC[i])
>
> TSFC_avg=N.true_divide(running_sum, slice_counter)
> N.set_printoptions(threshold='nan')
> print "the TSFC_avg is:", TSFC_avg
>
>
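As an alternative to skipping whole arrays, the running-sum approach in the quoted code could be made NaN-aware by keeping a per-cell count of valid values next to the running sum. This is only a sketch under assumptions: each slice is a float array of the same shape, and nan_aware_running_mean is a hypothetical helper name, not a numpy function. The netCDF reading is left out so the example runs with numpy alone.

```python
import numpy as np


def nan_aware_running_mean(slices):
    """Per-cell mean over an iterable of equally shaped arrays, ignoring NaNs.

    Only two arrays are held in memory: a running sum and a count of
    non-NaN contributions for each cell.
    """
    running_sum = None
    valid_count = None
    for s in slices:
        s = np.asarray(s, dtype=float)
        valid = ~np.isnan(s)
        if running_sum is None:
            running_sum = np.zeros(s.shape)
            valid_count = np.zeros(s.shape, dtype=int)
        # NaN cells add 0 to the sum and 0 to the count.
        running_sum += np.where(valid, s, 0.0)
        valid_count += valid
    if running_sum is None:
        raise ValueError("no slices were supplied")
    # Cells that were NaN in every slice divide 0 by 0 and come out as NaN.
    with np.errstate(invalid="ignore", divide="ignore"):
        return running_sum / valid_count


slices = [
    np.array([1.0, np.nan, np.nan]),
    np.array([3.0, 4.0, np.nan]),
]
print(nan_aware_running_mean(slices))
```

Iterating directly over the slices, rather than over xrange(0, len(TSFC)-1, 1), also avoids skipping the last slice of each file, since that range stops at len(TSFC)-2.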
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion