Thanks again for your response. I must still be doing something wrong! Both options resulted in:

the TSFC_avg is: [-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- ...

1st option:

slice_counter=0
for filename in netCDF_list:
    ncfile=netCDF4.Dataset(filename)
    TSFC=ncfile.variables['T_SFC'][:]
    fillvalue=ncfile.variables['T_SFC']._FillValue
    TSFC=MA.masked_values(TSFC, fillvalue)
    TSFCWithOutNan=[]
    for a in TSFC:
        indexnonNaN=N.isfinite(a)
        SliceofTotoWithoutNan=a[indexnonNaN]
        print SliceofTotoWithoutNan
        TSFCWithOutNan.append(SliceofTotoWithoutNan)

    for i in xrange(0,len(TSFCWithOutNan)-1,1):
        slice_counter +=1
        try:
            running_sum=N.add(running_sum, TSFCWithOutNan[i])
        except NameError:
            print "Initiating the running total of my variable..."
            running_sum=N.array(TSFCWithOutNan[i])

TSFC_avg=N.true_divide(running_sum, slice_counter)
N.set_printoptions(threshold='nan')
print "the TSFC_avg is:", TSFC_avg

The 2nd option:

for filename in netCDF_list:
    ncfile=netCDF4.Dataset(filename)
    TSFC=ncfile.variables['T_SFC'][:]
    fillvalue=ncfile.variables['T_SFC']._FillValue
    TSFC=MA.masked_values(TSFC, fillvalue)
    slice_counter=0
    for a in TSFC:
        indexnonNaN=N.isfinite(a)
        SliceofTotoWithoutNan=a[indexnonNaN]
        slice_counter +=1
        try:
            running_sum=N.add(running_sum, SliceofTotoWithoutNan)
        except NameError:
            print "Initiating the running total of my variable..."
            running_sum=N.array(SliceofTotoWithoutNan)

TSFC_avg=N.true_divide(running_sum, slice_counter)
N.set_printoptions(threshold='nan')
print "the TSFC_avg is:", TSFC_avg

On Tue, Dec 6, 2011 at 2:31 PM, Xavier Barthelemy <xab...@gmail.com> wrote:

> Well, I would see solutions:
> 1- to keep how your code is, with a python list (you can stack numpy
> arrays if they have the same dimensions):
>
> for filename in netCDF_list:
>     ncfile=netCDF4.Dataset(filename)
>     TSFC=ncfile.variables['T_SFC'][:]
>     fillvalue=ncfile.variables['T_SFC']._FillValue
>     TSFC=MA.masked_values(TSFC, fillvalue)
>     TSFCWithOutNan=[]
>     for a in TSFC:
>         indexnonNaN=N.isfinite(a)
>         SliceofTotoWithoutNan=a[indexnonNaN]
>         print SliceofTotoWithoutNan
>         TSFCWithOutNan.append(SliceofTotoWithoutNan)
>
>     for i in xrange(0,len(TSFCWithOutNan)-1,1):
>         slice_counter +=1
>         #print slice_counter
>         try:
>             running_sum=N.add(running_sum, TSFCWithOutNan[i])
>         except NameError:
>             print "Initiating the running total of my variable..."
>             running_sum=N.array(TSFCWithOutNan[i])
> ...
>
> or 2- everything in the same loop:
>
> slice_counter =0
> for a in TSFC:
>     indexnonNaN=N.isfinite(a)
>     SliceofTotoWithoutNan=a[indexnonNaN]
>     slice_counter +=1
>     #print slice_counter
>     try:
>         running_sum=N.add(running_sum, SliceofTotoWithoutNan)
>     except NameError:
>         print "Initiating the running total of my variable..."
>         running_sum=N.array(SliceofTotoWithoutNan)
> TSFC_avg=N.true_divide(running_sum, slice_counter)
> N.set_printoptions(threshold='nan')
> print "the TSFC_avg is:", TSFC_avg
>
> See if it works.
> It is just a rapid guess.
> Xavier
>
>> for dir in glob.glob(MainFolder + '*/01/')+ glob.glob(MainFolder +
>> '*/02/')+ glob.glob(MainFolder + '*/12/'):
>>     #print dir
>>     for ncfile in glob.glob(dir + '*.nc'):
>>         netCDF_list.append(ncfile)
>>
>> slice_counter=0
>> print netCDF_list
>> for filename in netCDF_list:
>>     ncfile=netCDF4.Dataset(filename)
>>     TSFC=ncfile.variables['T_SFC'][:]
>>     fillvalue=ncfile.variables['T_SFC']._FillValue
>>     TSFC=MA.masked_values(TSFC, fillvalue)
>>     for a in TSFC:
>>         indexnonNaN=N.isfinite(a)
>>         SliceofTotoWithoutNan=a[indexnonNaN]
>>         print SliceofTotoWithoutNan
>>         TSFC=SliceofTotoWithoutNan
>>
>>     for i in xrange(0,len(TSFC)-1,1):
>>         slice_counter +=1
>>         #print slice_counter
>>         try:
>>             running_sum=N.add(running_sum, TSFC[i])
>>         except NameError:
>>             print "Initiating the running total of my variable..."
>>             running_sum=N.array(TSFC[i])
>>
>> TSFC_avg=N.true_divide(running_sum, slice_counter)
>> N.set_printoptions(threshold='nan')
>> print "the TSFC_avg is:", TSFC_avg
>>
>> On Tue, Dec 6, 2011 at 9:50 AM, Xavier Barthelemy <xab...@gmail.com> wrote:
>>
>>> Hi,
>>> I don't know if it is the best choice, but this is what I do in my code:
>>>
>>> for each slice:
>>>     indexnonNaN=np.isfinite(SliceOfToto)
>>>     SliceOfTotoWithoutNan=SliceOfToto[indexnonNaN]
>>>
>>> and then perform all the operations I want on the last array.
>>>
>>> I hope it answers your question.
>>>
>>> Xavier
>>>
>>> 2011/12/6 questions anon <questions.a...@gmail.com>
>>>
>>>> Maybe I am asking the wrong question or could go about this another
>>>> way.
>>>> I have thousands of numpy arrays to flick through. Could I just
>>>> identify which arrays have NaNs and, for now, ignore the entire array?
>>>> Is there a simple way to do this?
>>>> Any feedback will be greatly appreciated.
>>>>
>>>> On Thu, Dec 1, 2011 at 12:16 PM, questions anon
>>>> <questions.a...@gmail.com> wrote:
>>>>
>>>>> I am trying to calculate the mean across many netcdf files. I cannot
>>>>> use numpy.mean because there are too many files to concatenate and I
>>>>> end up with a memory error. I have written the code below to do what
>>>>> I need, but I have a few nan values in some of my arrays. Is there a
>>>>> way to ignore these somewhere in my code? I seem to face this problem
>>>>> often, so I would love a command that ignores blanks in my array
>>>>> before I continue on to the next processing step.
>>>>> Any feedback is greatly appreciated.
>>>>>
>>>>> netCDF_list=[]
>>>>> for dir in glob.glob(MainFolder + '*/01/')+ glob.glob(MainFolder +
>>>>> '*/02/')+ glob.glob(MainFolder + '*/12/'):
>>>>>     for ncfile in glob.glob(dir + '*.nc'):
>>>>>         netCDF_list.append(ncfile)
>>>>>
>>>>> slice_counter=0
>>>>> print netCDF_list
>>>>>
>>>>> for filename in netCDF_list:
>>>>>     ncfile=netCDF4.Dataset(filename)
>>>>>     TSFC=ncfile.variables['T_SFC'][:]
>>>>>     fillvalue=ncfile.variables['T_SFC']._FillValue
>>>>>     TSFC=MA.masked_values(TSFC, fillvalue)
>>>>>     for i in xrange(0,len(TSFC)-1,1):
>>>>>         slice_counter +=1
>>>>>         #print slice_counter
>>>>>         try:
>>>>>             running_sum=N.add(running_sum, TSFC[i])
>>>>>         except NameError:
>>>>>             print "Initiating the running total of my variable..."
>>>>>             running_sum=N.array(TSFC[i])
>>>>>
>>>>> TSFC_avg=N.true_divide(running_sum, slice_counter)
>>>>> N.set_printoptions(threshold='nan')
>>>>> print "the TSFC_avg is:", TSFC_avg
>>>
>>> --
>>> "When the government violates the rights of the people, insurrection
>>> is, for the people and for each portion of the people, the most sacred
>>> of rights and the most indispensable of duties."
>>>
>>> Déclaration des droits de l'homme et du citoyen, article 35, 1793
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
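
A possible sketch of the per-cell average the thread is after, offered as an illustration rather than as anyone's posted solution. One plausible reason for the all-masked output is that adding masked arrays leaves a cell masked whenever it is masked in either operand, so a single running sum gradually inherits every mask it meets. Boolean indexing such as a[indexnonNaN] also returns a flattened array whose length can differ from slice to slice. The sketch instead keeps two plain arrays, a running total and a per-cell count of valid values, and divides at the end. The function name masked_running_mean is hypothetical, and the code assumes Python 3 with a recent NumPy and slices that all share one shape.

```python
import numpy as np
import numpy.ma as ma


def masked_running_mean(slices):
    """Average same-shaped slices cell by cell, skipping masked and
    non-finite (NaN/inf) cells. Hypothetical helper, not from the thread."""
    total = None   # running sum of the valid values in each cell
    count = None   # number of valid values seen in each cell
    for s in slices:
        s = ma.asarray(s, dtype=float)
        # A cell is valid if it is not masked and holds a finite number.
        valid = ~ma.getmaskarray(s) & np.isfinite(s.data)
        if total is None:
            total = np.zeros(s.shape, dtype=float)
            count = np.zeros(s.shape, dtype=int)
        # Invalid cells contribute 0 to the sum and nothing to the count.
        total += np.where(valid, s.data, 0.0)
        count += valid
    if total is None:
        raise ValueError("no slices were given")
    # Cells that never held a valid value stay masked in the result.
    return ma.masked_where(count == 0, total / np.maximum(count, 1))


if __name__ == "__main__":
    first = ma.masked_values([[1.0, -999.0], [3.0, 4.0]], -999.0)
    second = np.array([[3.0, 2.0], [np.nan, 4.0]])
    print(masked_running_mean([first, second]))
```

In the setting of the thread, the slices would presumably be the time steps of ncfile.variables['T_SFC'][:] for each file in netCDF_list, fed to the function through a generator so that only one file's data is in memory at a time.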