On 10/12/11 8:20 PM, questions anon wrote:
Hi All,
I keep receiving a memory error when processing many netCDF files. I
assumed it had something to do with how I loop through them, and that I
might need to close things off properly, but a recent error made me
think matplotlib might be the cause.
In the code below I loop through a set of netCDF files (each file holds
hourly data for one month), and within each file I output a PNG file
every three hours. This works for one netCDF file (that is, one month),
but when the script begins to process the next file I receive a memory
error (see below). Since I tidied up some of my code it gets partway
into the second file, but then I still receive the memory error.
I have tried a few suggestions, such as:
- Combining the datasets using MFDataset (using NETCDF4): not an option
  because the files do not have an unlimited dimension.
- Calling gc.collect(): that just results in a "GEOS_ERROR: bad
  allocation" error.
- Opening LAT and LON only once (which worked).
System Details:
Python 2.7.2 |EPD 7.1-2 (32-bit)| (default, Jul 3 2011, 15:13:59)
[MSC v.1500 32 bit (Intel)] on win32
Any feedback will be greatly appreciated, as I keep ending up with
memory errors when working with netCDF files. This happens even when I
use a much more powerful computer.
*Most recent error:*
Traceback (most recent call last):
  File "C:\plot_netcdf_merc_multiplot_across_multifolders_TSFC.py", line 78, in <module>
    plt.savefig((os.path.join(outputfolder, 'TSFC'+date_string+'UTC.png')))
  File "C:\Python27\lib\site-packages\matplotlib\pyplot.py", line 363, in savefig
    return fig.savefig(*args, **kwargs)
  File "C:\Python27\lib\site-packages\matplotlib\figure.py", line 1084, in savefig
    self.canvas.print_figure(*args, **kwargs)
  File "C:\Python27\lib\site-packages\matplotlib\backends\backend_wxagg.py", line 100, in print_figure
    FigureCanvasAgg.print_figure(self, filename, *args, **kwargs)
  File "C:\Python27\lib\site-packages\matplotlib\backend_bases.py", line 1923, in print_figure
    **kwargs)
  File "C:\Python27\lib\site-packages\matplotlib\backends\backend_agg.py", line 438, in print_png
    FigureCanvasAgg.draw(self)
  File "C:\Python27\lib\site-packages\matplotlib\backends\backend_agg.py", line 393, in draw
    self.renderer = self.get_renderer()
  File "C:\Python27\lib\site-packages\matplotlib\backends\backend_agg.py", line 404, in get_renderer
    self.renderer = RendererAgg(w, h, self.figure.dpi)
  File "C:\Python27\lib\site-packages\matplotlib\backends\backend_agg.py", line 59, in __init__
    self._renderer = _RendererAgg(int(width), int(height), dpi, debug=False)
RuntimeError: Could not allocate memory for image
*Error when I added gc.collect()*
GEOS_ERROR: bad allocation
*Old error (before adding gc.collect())*
Traceback (most recent call last):
  File "d:/plot_netcdf_merc_multiplot_across_multifolders__memoryerror.py", line 44, in <module>
    TSFC=ncfile.variables['T_SFC'][1::3]
  File "netCDF4.pyx", line 2473, in netCDF4.Variable.__getitem__ (netCDF4.c:23094)
MemoryError

from netCDF4 import Dataset
import numpy as N
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
from netcdftime import utime
from datetime import datetime
import os
import gc

shapefile1="E:/griddeddatasamples/GIS/DSE_REGIONS"
MainFolder=r"E:/griddeddatasamples/GriddedData/InputsforValidation/T_SFC/"
OutputFolder=r"E:/griddeddatasamples/GriddedData/OutputsforValidation"
fileforlatlon=Dataset("E:/griddeddatasamples/GriddedData/InputsforValidation/T_SFC/TSFC_1974_01/IDZ00026_VIC_ADFD_T_SFC.nc", 'r+', 'NETCDF4')
LAT=fileforlatlon.variables['latitude'][:]
LON=fileforlatlon.variables['longitude'][:]

for (path, dirs, files) in os.walk(MainFolder):
    for dir in dirs:
        print dir
    path=path+'/'
    for ncfile in files:
        if ncfile[-3:]=='.nc':
            print "dealing with ncfiles:", ncfile
            ncfile=os.path.join(path,ncfile)
            ncfile=Dataset(ncfile, 'r+', 'NETCDF4')
            TSFC=ncfile.variables['T_SFC'][1::3]
            TIME=ncfile.variables['time'][1::3]
            ncfile.close()
            gc.collect()
            for TSFC, TIME in zip((TSFC[:]),(TIME[:])):
                cdftime=utime('seconds since 1970-01-01 00:00:00')
                ncfiletime=cdftime.num2date(TIME)
                print ncfiletime
                timestr=str(ncfiletime)
                d = datetime.strptime(timestr, '%Y-%m-%d %H:%M:%S')
                date_string = d.strftime('%Y%m%d_%H%M')
                map = Basemap(projection='merc',llcrnrlat=-40,urcrnrlat=-33,
                              llcrnrlon=139.0,urcrnrlon=151.0,lat_ts=0,resolution='i')
                x,y=map(*N.meshgrid(LON,LAT))
                map.drawcoastlines(linewidth=0.5)
                map.readshapefile(shapefile1, 'DSE_REGIONS')
                map.drawstates()
                plt.title('Surface temperature at %s UTC'%ncfiletime)
                ticks=[-5,0,5,10,15,20,25,30,35,40,45,50]
                CS = map.contourf(x,y,TSFC, ticks, cmap=plt.cm.jet)
                l,b,w,h =0.1,0.1,0.8,0.8
                cax = plt.axes([l+w+0.025, b, 0.025, h], )
                cbar=plt.colorbar(CS, cax=cax, drawedges=True)
                plt.savefig((os.path.join(OutputFolder, 'TSFC'+date_string+'UTC.png')))
                plt.close()
                gc.collect()

Try moving these lines

map = Basemap(projection='merc',llcrnrlat=-40,urcrnrlat=-33,
              llcrnrlon=139.0,urcrnrlon=151.0,lat_ts=0,resolution='i')
x,y=map(*N.meshgrid(LON,LAT))
map.drawcoastlines(linewidth=0.5)
map.readshapefile(shapefile1, 'DSE_REGIONS')
map.drawstates()

out of the loop.
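
As a rough illustration of that restructuring, here is a minimal sketch
of the pattern using only the standard library. StubMap, plot_frame and
plot_all are hypothetical stand-ins invented for this example, not part
of Basemap or matplotlib; StubMap only counts how often it is built so
the effect of hoisting the setup is visible.

```python
class StubMap(object):
    """Hypothetical stand-in for Basemap: counts how often it is built."""
    builds = 0

    def __init__(self):
        # In the real script this is the expensive step (projection setup,
        # coastlines, shapefile, state boundaries).
        StubMap.builds += 1


def plot_frame(basemap, label, field):
    """Stand-in for the per-frame work (contourf, colorbar, savefig)."""
    return "TSFC%sUTC.png" % label


def plot_all(frames, build_map=StubMap):
    """Build the map once, then reuse it for every time step."""
    basemap = build_map()  # hoisted: runs once, not once per frame
    outputs = []
    for label, field in frames:
        outputs.append(plot_frame(basemap, label, field))
    return outputs


# Two made-up time steps, each with a tiny placeholder field.
frames = [("19740101_0100", [[10.0]]), ("19740101_0400", [[12.5]])]
outputs = plot_all(frames)
```

In the actual script, build_map would wrap the Basemap(...) call together
with the drawcoastlines, readshapefile and drawstates calls. One
assumption not shown here: because the coastlines are then drawn only
once, the per-frame step would probably need to reuse a single figure and
remove that frame's contours and colorbar, rather than call plt.close()
after every savefig.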
-Jeff
_______________________________________________
Matplotlib-users mailing list
Matplotlib-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-users