Re: [Tutor] loop through hours to calc max and plot

2014-03-19 Thread Alan Gauld

On 19/03/14 05:12, questions anon wrote:


I have monthly netcdf files containing hourly temperature data.
I would like to loop through all of the hours and calculate the max for
each hour (00 - 23) and then make a plot.


Its hard to give specific help since I don't know the file format
and most of your code is using non standard-library modules.

A couple of suggestions would be to
1) convert the code into a few functions each doing one
   specific job. For example the plotting code and the
   reading code could be separated out. That makes it
   easier to read the overall program flow and easier
   to test those individual functions if you change them.
2) You have quite a few redundant or misplaced assignments
   going on, or things happening inside loops that
   shouldn't be. Again removing the surplus just makes
   the rest of the code easier to grasp.


I have found a way for this to work for a couple of the hours (see
below) but it requires a lot of set up to do each hour and I am sure
there must be a simpler way using loops? I also need to find a simpler
way as I would like to eventually do daily for 1st jan to 31st Dec


If you put the code into smaller functions that will simplify the design 
of your bigger program. Each function should have one clear purpose and 
be as standalone as possible. It should take in the

date it needs as parameters and pass back a value.



from netCDF4 import Dataset
import numpy as N
import matplotlib.pyplot as plt
from numpy import ma as ma
from mpl_toolkits.basemap import Basemap
from netcdftime import utime
from datetime import datetime
import os
from matplotlib.collections import LineCollection

shapefile1=---
OutputFolder=r/---/
fileforlatlon=Dataset(---.nc, 'r+', 'NETCDF4')
LAT=fileforlatlon.variables['latitude'][:]
LON=fileforlatlon.variables['longitude'][:]

#Set up basemap using mercator projection



map =
Basemap(projection='merc',llcrnrlat=-40,urcrnrlat=-33,llcrnrlon=139.0,urcrnrlon=151.0,lat_ts=0,resolution='i')
x,y=map(*N.meshgrid(LON,LAT))
map.readshapefile(shapefile1, '-REGIONS')



ncvariablename='T_SFC'
MainFolder=r/---/
ticks=[-5,0,5,10,15,20,25,30,35,40,45,50]
Title='Surface Temperature (degrees celsius)'


Why aren't these with the other initialization code above?


cmap=plt.cm.jet




all_variabledata=[]
time00=[]
time12=[]


Why aren't these with the other initialization code above?


for (path, dirs, files) in os.walk(MainFolder):
for dir in dirs:
print the dirs are:, dir
path=path+'/'


this appears to be in the loop but its the same
value each time so you repeat the same assignment
over and over.



for ncfile in files:
fileext=ncvariablename+'.nc'
if ncfile.endswith(fileext):
print dealing with ncfiles:, path+ncfile
ncfile=os.path.join(path,ncfile)
ncfile=Dataset(ncfile, 'r+', 'NETCDF4')
TIME=ncfile.variables['time'][1::]
#variable=ncfile.variables[ncvariablename][:,:,:]
TSFC00=ncfile.variables[ncvariablename][00:01:,:,:]
TSFC12=ncfile.variables[ncvariablename][12:13:,:,:]
fillvalue=ncfile.variables[ncvariablename]._FillValue
ncfile.close()


Indentation seems to have gotten lost in email transit.
Maybe you didn't post in plain text?



#combine all data from the chosen variable to make one array for analyses
cdftime=utime('seconds since 1970-01-01 00:00:00')
ncfiletime=cdftime.num2date(TIME)
for i in ncfiletime[:]:
ncfiletime=i


This is a really bad idea. You are using the same variabvle name to hold 
the collection you iterate over then reassigning it to the items within 
the loop. Why not just use i (but use a better name!) within

the loop body rather than reassign it?

And if you must reassign it use a name other than the one you already 
used for the collection. Using the same name in two different ways

is a recipe for confusion.


timestr=str(ncfiletime)
d = datetime.strptime(timestr, '%Y-%m-%d %H:%M:%S')
date_string = d.strftime('%H')#combine by same hour
print the ncfiletime is:, ncfiletime

print the date_string is:, date_string
if date_string=='00':
time00.append(TSFC00)
elif date_string=='12':
time12.append(TSFC12)
else:
pass

big_arraytime00=N.ma.concatenate(time00)
big_arraytime12=N.ma.concatenate(time12)
MAX00=big_arraytime00.max(axis=0)
MAX12=big_arraytime12.max(axis=0)

#plot output summary stats
map = Basemap(projection='merc',llcrnrlat=-40,urcrnrlat=-33,
llcrnrlon=139.0,urcrnrlon=151.0,lat_ts=0,resolution='i')
map.drawcoastlines()
map.drawstates()
map.readshapefile(shapefile1, 'REGIONS')
x,y=map(*N.meshgrid(LON,LAT))
plottitle='TSFCmax00'


another constant assignment that could be in the
initialization code section


plt.title(plottitle)
CS = map.contourf(x,y,MAX00, ticks, cmap=cmap)



l,b,w,h =0.1,0.1,0.8,0.8
cax = plt.axes([l+w+0.025, b, 0.025, h])
plt.colorbar(CS,cax=cax, drawedges=True)
plt.savefig((os.path.join(OutputFolder, plottitle+'.png')))
plt.show()
plt.close()


map = Basemap(projection='merc',llcrnrlat=-40,urcrnrlat=-33,
llcrnrlon=139.0,urcrnrlon=151.0,lat_ts=0,resolution='i')
map.drawcoastlines()

Re: [Tutor] loop through hours to calc max and plot

2014-03-19 Thread ALAN GAULD
Forwarding to the list.
Use ReplyAll when responding to ensure maximum responses.
 
Alan Gauld
Author of the Learn To Program website
http://www.alan-g.me.uk/

http://www.flickr.com/photos/alangauldphotos




 From: questions anon questions.a...@gmail.com
To: Alan Gauld alan.ga...@btinternet.com 
Sent: Wednesday, 19 March 2014, 22:30
Subject: Re: [Tutor] loop through hours to calc max and plot
 


Could you please give an example of how to put these into a loop in a function 
and create the matching naming conventions?




On Thu, Mar 20, 2014 at 8:07 AM, questions anon questions.a...@gmail.com 
wrote:

Thank you for your response it is very helpful.





On Wed, Mar 19, 2014 at 8:16 PM, Alan Gauld alan.ga...@btinternet.com wrote:

On 19/03/14 05:12, questions anon wrote:


I have monthly netcdf files containing hourly temperature data.
I would like to loop through all of the hours and calculate the max for
each hour (00 - 23) and then make a plot.


Its hard to give specific help since I don't know the file format
and most of your code is using non standard-library modules.

A couple of suggestions would be to
1) convert the code into a few functions each doing one
   specific job. For example the plotting code and the
   reading code could be separated out. That makes it
   easier to read the overall program flow and easier
   to test those individual functions if you change them.
2) You have quite a few redundant or misplaced assignments
   going on, or things happening inside loops that
   shouldn't be. Again removing the surplus just makes
   the rest of the code easier to grasp.



I have found a way for this to work for a couple of the hours (see
below) but it requires a lot of set up to do each hour and I am sure
there must be a simpler way using loops? I also need to find a simpler
way as I would like to eventually do daily for 1st jan to 31st Dec


If you put the code into smaller functions that will simplify the design of 
your bigger program. Each function should have one clear purpose and be as 
standalone as possible. It should take in the
date it needs as parameters and pass back a value.




from netCDF4 import Dataset
import numpy as N
import matplotlib.pyplot as plt
from numpy import ma as ma
from mpl_toolkits.basemap import Basemap
from netcdftime import utime
from datetime import datetime
import os
from matplotlib.collections import LineCollection

shapefile1=---
OutputFolder=r/---/
fileforlatlon=Dataset(---.nc, 'r+', 'NETCDF4')
LAT=fileforlatlon.variables['latitude'][:]
LON=fileforlatlon.variables['longitude'][:]

#Set up basemap using mercator projection


map =
Basemap(projection='merc',llcrnrlat=-40,urcrnrlat=-33,llcrnrlon=139.0,urcrnrlon=151.0,lat_ts=0,resolution='i')
x,y=map(*N.meshgrid(LON,LAT))
map.readshapefile(shapefile1, '-REGIONS')


ncvariablename='T_SFC'
MainFolder=r/---/
ticks=[-5,0,5,10,15,20,25,30,35,40,45,50]
Title='Surface Temperature (degrees celsius)'


Why aren't these with the other initialization code above?


cmap=plt.cm.jet



all_variabledata=[]
time00=[]
time12=[]

Why aren't these with the other initialization code above?



for (path, dirs, files) in os.walk(MainFolder):
for dir in dirs:
print the dirs are:, dir
path=path+'/'


this appears to be in the loop but its the same
value each time so you repeat the same assignment
over and over.




for ncfile in files:
fileext=ncvariablename+'.nc'
if ncfile.endswith(fileext):
print dealing with ncfiles:, path+ncfile
ncfile=os.path.join(path,ncfile)
ncfile=Dataset(ncfile, 'r+', 'NETCDF4')
TIME=ncfile.variables['time'][1::]
#variable=ncfile.variables[ncvariablename][:,:,:]
TSFC00=ncfile.variables[ncvariablename][00:01:,:,:]
TSFC12=ncfile.variables[ncvariablename][12:13:,:,:]
fillvalue=ncfile.variables[ncvariablename]._FillValue
ncfile.close()


Indentation seems to have gotten lost in email transit.
Maybe you didn't post in plain text?




#combine all data from the chosen variable to make one array for analyses
cdftime=utime('seconds since 1970-01-01 00:00:00')
ncfiletime=cdftime.num2date(TIME)
for i in ncfiletime[:]:
ncfiletime=i


This is a really bad idea. You are using the same variabvle name to hold the 
collection you iterate over then reassigning it to the items within the loop. 
Why not just use i (but use a better name!) within
the loop body rather than reassign it?

And if you must reassign it use a name other than the one you already used 
for the collection. Using the same name in two different ways
is a recipe for confusion.



timestr=str(ncfiletime)
d = datetime.strptime(timestr, '%Y-%m-%d %H:%M:%S')
date_string = d.strftime('%H')#combine by same hour
print the ncfiletime is:, ncfiletime

print the date_string is:, date_string
if date_string=='00':
time00.append(TSFC00)
elif date_string=='12':
time12.append(TSFC12)
else:
pass

big_arraytime00=N.ma.concatenate(time00)
big_arraytime12=N.ma.concatenate(time12)
MAX00=big_arraytime00.max(axis=0)
MAX12=big_arraytime12.max