summary: Programmatically copying netCDF mostly works: thanks for your assistance! However, I have 4 followup questions below (with motivation) about problems I encountered.
details:

Tom Roche Thu, 05 Jan 2012 18:29:35 -0500
>> I need to "do surgery" on a large netCDF file (technically an
>> I/O API file which uses netCDF).

David William Pierce Thu, 5 Jan 2012 19:49:13 -0800
> simply copying the file generally isn't the point of an R script. :-)

I guess I should have explained: I need to copy most of a source file,
modifying only part, and to write a target file. So my motivation for
this thread is, first, to be sure I can do the copying correctly (*not*
merely to copy a netCDF file). Does this seem reasonable?

> If I wanted to copy a var from an existing file to a new file,
> manipulating it along the way, I'd do something like this (untested
> code off the top of my head):

see

https://stat.ethz.ch/pipermail/r-help/attachments/20120105/f7644171/attachment.pl

> Hope that gets you started,

I had already started, but much less programmatically: I was using the
ncdf API, but with names and indices copied from `ncdump -h` or
`summary.ncdf`. Your code is better! at least, much less error-prone.
(Although too terse for this R newbie: I rewrote more verbosely.)
However I am noticing a few problems, for which I'd appreciate help if
available (or correction if invalid, else a pointer to bug reporting):

1 Precisions "int" and "float" not supported by var.def.ncdf(...).
  When I tried to do (formatted for email)

    target.datavars[[target.datavars.i]] <-
      var.def.ncdf(source.datavar$name, source.datavar$units,
        target.datavar.dims, source.datavar$missval,
->      prec=source.datavar$prec)

  I got

- var.def.ncdf: error: unknown precision specified: int .
- Known values: short single double integer char byte

  and similarly for precision="float".
  So I wrote a kludge function

+ precConvert <- function(prec.in) {
+   ret = switch(prec.in,
+     'byte'='byte',
+     'char'='char',
+     'double'='double',
+     'float'='single',
+     'int'='integer',
+     'integer'='integer',
+     'short'='short',
+     'single'='single',
+   )
+ }

  and successfully did

    target.datavars[[target.datavars.i]] <-
      var.def.ncdf(source.datavar$name, source.datavar$units,
        target.datavar.dims, source.datavar$missval,
+>      prec=precConvert(source.datavar$prec))

  Should this "just work"?

2 Copying I/O API global attributes fails. I/O API uses lots of these
  (33 in my source.nc!), so my diff has

-// global attributes:
-               :IOAPI_VERSION = "1.0 1997349 (Dec. 15, 1997)" ;
-               :EXEC_ID = "???????????????? " ;
-               :FTYPE = 1 ;
-               :CDATE = 2011353 ;
-               :CTIME = 1224 ;
...

  However when I do

> global.attr.name.list <- list(
+   ":IOAPI_VERSION",
+   ":EXEC_ID",
+   ":FTYPE",
+   ":CDATE",
+   ":CTIME",
...
+ )
> for (attr.name in global.attr.name.list) {
+   source.datavar.attr <- att.get.ncdf(source.file, 0, attr.name)
+   att.put.ncdf(target.file, 0, attr.name, source.datavar.attr$value)
+ }

  I get (lines broken for email)

- Error in R_nc_put_att_double:
- NetCDF: Name contains illegal characters
- [1] "Error in att.put.ncdf, while writing attribute :IOAPI_VERSION
-  with value 0"
- Error in att.put.ncdf(target.file, 0, attr.name,
-  source.datavar.attr$value) :
- Error return from C call R_nc_put_att_double for attribute
-  :IOAPI_VERSION

  Is my code, ncdf, I/O API, or Something Completely Different causing
  this error?
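  One thing that may be worth checking for item 2 (a guess, not
  confirmed anywhere in this thread): in the CDL text that `ncdump -h`
  prints, the leading ':' marks an attribute as global and is not part
  of the attribute's name. If that is the cause, att.get.ncdf would not
  find ":IOAPI_VERSION" (consistent with the "value 0" in the error),
  and att.put.ncdf would then reject the colon as an illegal character.
  A minimal sketch in base R follows; cdl.to.attr.name is a hypothetical
  helper, not part of ncdf, and the ncdf calls appear only as comments
  because they need the open files from above.

```r
# Hypothetical helper (not part of ncdf): turn an attribute name as printed
# by `ncdump -h` (CDL) into the bare name the netCDF library expects.
cdl.to.attr.name <- function(cdl.name) {
  # Trim surrounding whitespace, then drop a single leading ':'.
  sub("^:", "", trimws(cdl.name))
}

cdl.names <- c(":IOAPI_VERSION", ":EXEC_ID", ":FTYPE", ":CDATE", ":CTIME")
attr.names <- cdl.to.attr.name(cdl.names)
print(attr.names)

# Assumed usage with source.file and target.file from above (untested).
# The hasatt check assumes att.get.ncdf reports a missing attribute that way.
# for (attr.name in attr.names) {
#   source.attr <- att.get.ncdf(source.file, 0, attr.name)
#   if (source.attr$hasatt) {
#     att.put.ncdf(target.file, 0, attr.name, source.attr$value)
#   }
# }
```

  Names without a leading colon pass through the helper unchanged, so it
  can be applied to a mixed list.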
3 When I diff my `ncdump`s, i.e.,

  $ diff -uwB <( ncdump -h source.nc ) <( ncdump -h target.nc )

  I get

> --- /dev/fd/63  2012-01-09 17:20:30.258837803 -0500
> +++ /dev/fd/62  2012-01-09 17:20:30.258837803 -0500
> @@ -1,194 +1,29 @@
> -netcdf \5yravg.test {
> +netcdf \5yravg.onlyOrigDN2 {
>  dimensions:
> -        TSTEP = UNLIMITED ; // (1 currently)
>          DATE-TIME = 2 ;
> -        LAY = 42 ;
>          VAR = 29 ;
> -        ROW = 299 ;
> +        TSTEP = UNLIMITED ; // (1 currently)
>          COL = 459 ;
> +        ROW = 299 ;
> +        LAY = 42 ;
>  variables:
> +        int DATE-TIME(DATE-TIME) ;
> +                DATE-TIME:units = "" ;
> +        int VAR(VAR) ;
> +                VAR:units = "" ;
> +        int TSTEP(TSTEP) ;
> +                TSTEP:units = "" ;

  Reordering the dimensions I can live with: what annoys/confuses me is

  * the target file has *new* coordinate variables for the dimensions.

  * I don't understand why those coordinate variables weren't in the
    source file. But they're not!

  (Note I also get new data variables for dims={COL, LAY, ROW}, farther
  down the diff.) To clarify, e.g.: there is no variable

    int DATE-TIME(DATE-TIME) ;

  in the source file.

4 Attribute="long_name" is missing for every original/copied data
  variable. Hence when I diff my `ncdump`s I also get, e.g.,

         int TFLAG(TSTEP, VAR, DATE-TIME) ;
                 TFLAG:units = "<YYYYDDD,HHMMSS>" ;
-                TFLAG:long_name = "TFLAG " ;
                 TFLAG:var_desc = "..."

  How can I fix or work around this? Note that others have previously
  written

http://www.image.ucar.edu/Software/Netcdf/
> I believe there is a bug in the ncdf library which is causing the
> longname attribute to be ignored.

Your assistance is appreciated! and if I should submit patches or bug
reports somewhere, please let me know.

HTH, Tom Roche <tom_ro...@pobox.com>

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.