Re: [R-sig-Geo] stack many files without loading into memory

2015-02-25 Thread dschneiderch
Just wanted to update this thread in case anyone else comes looking. Some
of these things were not immediately clear to me.
I ended up doing:
library(raster)
library(ncdf4)
fn=list.files('serverpath',full.names=TRUE)
fnstack=stack(fn)
#names() returns e.g. 'X20120101' (raster prepends 'X' to names that start
#with a digit) and ncdim_def expects numeric values, so strip the 'X' and
#convert.
layerdates=as.numeric(sub('^X','',names(fnstack)))
#instead of writeRaster, use ncdf4 directly to get around the issue in this
#thread:
#http://r-sig-geo.2731867.n2.nabble.com/writeRaster-does-not-preserve-names-when-writing-to-NetCDF-td7586909.html
dim1=ncdim_def('Long','degree',seq(-112.25,-104.125,0.0041667))
dim2=ncdim_def('Lat','degree',seq(43.75,33,-0.0041667))
#layerdates is a vector something like 20120101, 20120109, etc., since that's
#what my files were called.
dim3=ncdim_def('time','yrdoy',unlim=TRUE,vals=layerdates)
var=ncvar_def('swe','meters',dim=list(dim1,dim2,dim3),missval=-99,
longname='snow water equivalent',compression=9)
#important to note: dim1 is the x direction and should be ascending. dim2
#is the y direction and should be descending. This is because the cell
#numbers from a Raster* object start top-left and count by row.
outputfn='localpath'
newnc=nc_create(outputfn,var)
ncvar_put(newnc, var, vals=getValues(fnstack))
#add a global attribute defining the geographic information.
ncatt_put(newnc,0,'proj4string','+proj=longlat +datum=WGS84')
nc_close(newnc)

Then when I open the file:
ncnew=nc_open(outputfn)
#this will give the list of dates stored above in dim3. You can get the
#spatial coordinates likewise in dim[[1]] and dim[[2]] (or
#ncnew$dim$Lat$vals etc.)
ncnew$dim[[3]]$vals
lyr=grep('20120109',ncnew$dim[[3]]$vals) #use grep to find the date again
#get the raster I stored for that date.
ncvar_get(ncnew,'swe',start=c(1,1,lyr),count=c(-1,-1,1))
nc_close(ncnew)
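
The write step above calls getValues(fnstack), which pulls every layer into memory at once. A possible variation, not something the thread confirms, is to write one layer at a time through the start and count arguments of ncvar_put. The sketch below assumes a tiny synthetic stack so it can run on its own; r, s, dates and the temporary output file are hypothetical stand-ins for the real data, and the coordinates are taken from the stack with xFromCol and yFromRow rather than typed in by hand.

```r
library(raster)
library(ncdf4)

# Hypothetical stand-in for the real stack: three small layers.
r <- raster(nrows = 4, ncols = 5, xmn = 0, xmx = 5, ymn = 0, ymx = 4)
s <- stack(lapply(1:3, function(i) setValues(r, i * seq_len(ncell(r)))))
dates <- c(20120101, 20120109, 20120117)  # assumed numeric date labels

# x ascending, y descending, matching raster's top-left, row-wise cell order.
dimx <- ncdim_def('Long', 'degree', xFromCol(s, 1:ncol(s)))
dimy <- ncdim_def('Lat', 'degree', yFromRow(s, 1:nrow(s)))
dimt <- ncdim_def('time', 'yrdoy', dates, unlim = TRUE)
v <- ncvar_def('swe', 'meters', list(dimx, dimy, dimt),
               missval = -99, compression = 9)

out <- tempfile(fileext = '.nc')
nc <- nc_create(out, v)
# Only one layer is held in memory per iteration.
for (i in seq_len(nlayers(s))) {
  ncvar_put(nc, v, vals = getValues(s[[i]]),
            start = c(1, 1, i), count = c(-1, -1, 1))
}
nc_close(nc)
```

A count of -1 means "everything along this dimension", so each call writes one full x-y slice at time index i.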

Hope that helps someone!

Dominik Schneider
o 303.735.6296 | c 518.956.3978



___
R-sig-Geo mailing list
R-sig-Geo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo


Re: [R-sig-Geo] stack many files without loading into memory

2015-02-06 Thread dschneiderch
Ok - Looks like it worked this time for 112 files from 2012. The NetCDF file
is 2.25 GB while the compressed multiband GeoTIFF is 510 MB. Does NetCDF
really have that much overhead? The 112 files at 10 MB each add up to only
1.12 GB.
I like the tidiness of 1 file per year, so I'll have to play with how easily
these can be accessed and the best way of annotating the layers. I was just
reading that netCDF-4 is based on HDF5 with a subset of features, so I might
look to see if HDF5 can do what I want.
Thanks
ds
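
A back-of-the-envelope check may explain the 2.25 GB figure. It assumes the grid defined by the two seq() calls in the 2015-02-25 message of this thread and 4-byte values (the default 'float' precision of ncvar_def). Under those assumptions the uncompressed size of 112 layers lands very close to the reported file size, which would suggest the file was stored without effective compression rather than with format overhead. This is an inference, not something confirmed in the thread.

```r
# Grid size implied by the coordinate vectors used later in the thread.
nx <- length(seq(-112.25, -104.125, 0.0041667))  # columns (longitude)
ny <- length(seq(43.75, 33, -0.0041667))         # rows (latitude)

n_layers <- 112          # files from 2012, per the message above
bytes_per_value <- 4     # assumption: 4-byte floats

total_bytes <- nx * ny * n_layers * bytes_per_value
total_bytes / 1e9        # roughly 2.25 (decimal GB)
```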




Re: [R-sig-Geo] stack many files without loading into memory

2015-02-06 Thread dschneiderch
Hi Michael -
Yes, saving as GTiff with the compression options reduced the file size from
~10 MB to ~2.5 MB for a single file, but I am having a lot of trouble getting
it to save the whole stack. I'm definitely running out of memory on my
computer, so maybe R is being slow and timing out? I've left it overnight with
no success (for a single year). That said, is there some overhead involved in
this? 112 files * 10 MB is only 1.12 GB.
I might try this on the command line with the GDAL tools.

Another issue that I'm encountering is that I don't seem to be able to save
layer names in either GTiff or NetCDF. Since these are ~100 remote sensing
images from the year, I need to be able to annotate the layer names so I know
the dates. (I actually posted to an older thread about this, specific to the
NetCDF format, because it came up in something else I was doing.)
Thanks for your help.
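
One possible workaround for the layer-name problem, offered only as a sketch, is to keep the names in a plain-text sidecar file next to the multiband GeoTIFF and reassign them after reopening. The stack, file names and the '_names.txt' suffix below are hypothetical; whether a given raster/GDAL version already preserves names in the GeoTIFF itself is not something this thread settles, and the versions discussed here needed rgdal for GeoTIFF output.

```r
library(raster)

# Hypothetical stand-in for the real data: three small layers named by date.
r <- raster(nrows = 4, ncols = 5, xmn = 0, xmx = 5, ymn = 0, ymx = 4)
s <- stack(lapply(1:3, function(i) setValues(r, i * seq_len(ncell(r)))))
names(s) <- c('X20120101', 'X20120109', 'X20120117')

# Write the whole stack as one LZW-compressed multiband GeoTIFF.
tif <- file.path(tempdir(), 'stack2012.tif')
writeRaster(s, tif, format = 'GTiff', options = c('COMPRESS=LZW'),
            overwrite = TRUE)

# Store the layer names in a sidecar text file, one name per line.
namesfile <- sub('\\.tif$', '_names.txt', tif)
writeLines(names(s), namesfile)

# Later: reopen the GeoTIFF and restore the names from the sidecar.
s2 <- stack(tif)
names(s2) <- readLines(namesfile)
```

The sidecar stays readable outside R, so the date of each band can be looked up without opening the raster.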





[R-sig-Geo] stack many files without loading into memory

2015-02-04 Thread Dominik Schneider
Hi -
I have some data on a server but would like to bring them local in a
somewhat compressed format that is still easy to access.

/Volumes/hD/2012 - 100 geotiffs
~/project/data/ - store those geotiffs here without needing server access.

Untested, but I think I could do something like:
s=stack()
writeRaster(s,'2012stack')
fn=list.files('/Volumes/hD/2012',pattern='\\.tif$',full.names=TRUE)
lapply(fn,function(f){
s=stack('2012stack')
r=raster(f)
names(r)=gsub(pattern='\\.tif$',replacement='',basename(f))
s=addLayer(s,r)
writeRaster(s,'2012stack',overwrite=TRUE)
})
Or is it better to save to a .RData file?
Is there a better way that doesn't require me to loop through each GeoTIFF?
I can't load them all into memory.
Thanks
Dominik



Re: [R-sig-Geo] stack many files without loading into memory

2015-02-04 Thread Dominik Schneider
Wouldn't that keep the link to the server on which they are stored now?

Dominik Schneider
o 303.735.6296 | c 518.956.3978




Re: [R-sig-Geo] stack many files without loading into memory

2015-02-04 Thread Michael Sumner
Why not stack(fn)?



Re: [R-sig-Geo] stack many files without loading into memory

2015-02-04 Thread Michael Sumner
On Thu Feb 05 2015 at 7:41:33 AM Dominik Schneider 
dominik.schnei...@colorado.edu wrote:

 I think you are correct.
 s=stack(fn,quick=T)
 writeRaster(s,'localpath/2012data')



Ugh, sorry, yes, that's me reading too fast. I should have suggested
writeRaster as the next step. I'm not sure why you don't include the file
extension here, though. Why not

writeRaster(s,'localpath/2012data.grd')




 would get the data local. I guess the trade-off is that the file size is an
 order of magnitude bigger than if I saved them in an .RData file, but it is
 much quicker to access.


You might achieve similar compression if you choose GeoTIFF with the right
options (you need rgdal for this). Try a test with a single layer, e.g.

s=stack(fn,quick = TRUE)
require(rgdal)
writeRaster(s[[1]],'localpath/2012data_temp01.tif', options =
c('COMPRESS=LZW', 'TILED=YES'))

Does the file size of 2012data_temp01.tif look promising?

The native raster file format does not support compression as far as I
know. Tiling may be a help or a hindrance, depending on the dimensions and
the extra margin added by the tiles if they need to extend beyond the
margins - you can control tile size with BLOCKXSIZE/BLOCKYSIZE if needed:
http://www.gdal.org/frmt_gtiff.html

(NetCDF4 - with ncdf4 package - can also compress and tile natively, but I
haven't tried that via raster myself).

Cheers, Mike.

