Re: [R-sig-Geo] get data from nc file

2018-12-26 Thread Ben Tupper
Hi,

Yikes.  I don't think there is any other way as the attributes are sort of 
buried in the string; that's unfortunate.  I guess you could at least make a 
reusable function assuming you'll be doing this again or looking to pull other 
attributes.  Something like this...


#' Extract one of the Global Attributes of a TRMM NetCDF as a named vector
#'
#' @param nc the ncdf4 object
#' @param name the name of the global attribute
#' @param sep the separator used to delimit fields in the attribute
#' @return named character vector of attributes, or NULL if the attribute is absent
nc_att_split <- function(nc, name = "FileHeader", sep = ";\n"){

  # varid = 0 requests the global attributes as a named list
  a1 <- ncdf4::ncatt_get(nc, 0)[[name[1]]]
  if (is.null(a1)) return(a1)

  # split into "key=value" fields, then split each field into key and value
  a2 <- strsplit(a1, sep, fixed = TRUE)[[1]]
  aa <- strsplit(a2, "=", fixed = TRUE)

  # a field with an empty value (e.g. "GranuleNumber=") yields only its key
  x <- vapply(aa,
    function(s) if (length(s) <= 1) "" else s[2],
    character(1))
  names(x) <- vapply(aa,
    function(s) if (length(s) < 1) "unknown" else s[1],
    character(1))

  x
}


nc <- ncdf4::nc_open("3B43.20080101.7A.HDF.nc")
x <- nc_att_split(nc)
as.Date(x[['StartGranuleDateTime']], format = "%Y-%m-%dT%H:%M:%OSZ")
[1] "2008-01-01"
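
If only a single field is needed, a regular expression can pull it out without building the whole vector. Here is a minimal sketch, assuming the header keeps the key=value;\n layout shown in the original post. header_field is a hypothetical helper name, and hdr is a shortened copy of the FileHeader string so that the example runs without the file.

```r
# Shortened copy of the FileHeader attribute from the original post
# (in practice this would come from ncdf4::ncatt_get(nc, 0, "FileHeader")$value).
hdr <- paste0(
  "AlgorithmID=3B43;\n",
  "GenerationDateTime=2012-11-29T19:12:01.000Z;\n",
  "StartGranuleDateTime=2008-01-01T00:00:00.000Z;\n",
  "StopGranuleDateTime=2008-01-31T23:59:59.999Z;\n",
  "GranuleNumber=;\n"
)

# Hypothetical helper: return the value of one "key=value;" field,
# or NA if the key is not present.
# Assumes `key` contains no regular-expression metacharacters.
header_field <- function(hdr, key) {
  # The key must start the string or follow a newline, so that
  # "Version" would not match inside "AlgorithmVersion".
  pattern <- paste0("(^|\n)", key, "=([^;]*);")
  m <- regmatches(hdr, regexec(pattern, hdr))[[1]]
  # m holds the full match, then the two capture groups
  if (length(m) == 0) NA_character_ else m[3]
}

start <- header_field(hdr, "StartGranuleDateTime")
start_date <- as.Date(start, format = "%Y-%m-%dT%H:%M:%OSZ")
```

Here start holds the raw "2008-01-01T00:00:00.000Z" string and start_date the corresponding Date. The named-vector function above is probably still the better choice when several fields are wanted from the same attribute.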


Cheers,
Ben

> On Dec 26, 2018, at 3:42 PM, Antonio Silva wrote:
> 
> Dear list members
> 
> I downloaded some nc files with precipitation data from
> https://pmm.nasa.gov/data-access/downloads/trmm (Level 3 3B43:
> Multisatellite Precipitation). For the image link see the global attribute
> "history" (below).
> 
> With ncdf4::nc_open I could open the file
> (nc.data <- nc_open("3B43.20080101.7A.HDF.nc")).
> 
> I want to extract the "StartGranuleDateTime" but it is inside the global
> attribute FileHeader (see below).
> 
> With ncatt_get(nc.data,0,"FileHeader")$value I got
> [1]
> "AlgorithmID=3B43;\nAlgorithmVersion=3B43_7.0;\nFileName=3B43.20080101.7A.HDF;\nGenerationDateTime=2012-11-29T19:12:01.000Z;\nStartGranuleDateTime=2008-01-01T00:00:00.000Z;\nStopGranuleDateTime=2008-01-31T23:59:59.999Z;\nGranuleNumber=;\nNumberOfSwaths=0;\nNumberOfGrids=1;\nGranuleStart=;\nTimeInterval=MONTH;\nProcessingSystem=PPS;\nProductVersion=7A;\nMissingData=;\n"
> 
> Is there any way to extract only the string "2008-01-01T00:00:00.000Z"?
> 
> The best I could do was
> as.Date(substr(strsplit(ncatt_get(nc.data,0,"FileHeader")$value,";\n")[[1]][5],22,45),"%Y-%m-%dT%H:%M:%OSZ")
> 
> but I suppose there must be a more direct way of getting the
> data. I would appreciate any suggestions.
> 
> Best regards,
> 
> Antonio Olinto
> Fisheries Institute
> Brazil
> 
> nc.data
> File 3B43.20080101.7A.HDF.nc (NC_FORMAT_CLASSIC):
> 
>     1 variables (excluding dimension variables):
>         float precipitation[nlat,nlon]
>             units: mm/hr
>             coordinates: nlon nlat
>             _FillValue: -.900390625
> 
>     2 dimensions:
>         nlon  Size:33
>             long_name: longitude
>             standard_name: longitude
>             units: degrees_east
>         nlat  Size:41
>             long_name: latitude
>             standard_name: latitude
>             units: degrees_north
> 
>     5 global attributes:
>         Grid.GridHeader: BinMethod=ARITHMETIC_MEAN;
> Registration=CENTER;
> LatitudeResolution=0.25;
> LongitudeResolution=0.25;
> NorthBoundingCoordinate=50;
> SouthBoundingCoordinate=-50;
> EastBoundingCoordinate=180;
> WestBoundingCoordinate=-180;
> Origin=SOUTHWEST;
> 
>         FileHeader: AlgorithmID=3B43;
> AlgorithmVersion=3B43_7.0;
> FileName=3B43.20080101.7A.HDF;
> GenerationDateTime=2012-11-29T19:12:01.000Z;
> StartGranuleDateTime=2008-01-01T00:00:00.000Z;
> StopGranuleDateTime=2008-01-31T23:59:59.999Z;
> GranuleNumber=;
> NumberOfSwaths=0;
> NumberOfGrids=1;
> GranuleStart=;
> TimeInterval=MONTH;
> ProcessingSystem=PPS;
> ProductVersion=7A;
> MissingData=;
> 
>         FileInfo: DataFormatVersion=m;
> TKCodeBuildVersion=1;
> MetadataVersion=m;
> FormatPackage=HDF Version 4.2 Release 4, January 25, 2009;
> BlueprintFilename=TRMM.V7.3B43.blueprint.xml;
> BlueprintVersion=BV_13;
> TKIOVersion=1.6;
> MetadataStyle=PVL;
> EndianType=LITTLE_ENDIAN;
> 
>         GridHeader: BinMethod=ARITHMETIC_MEAN;
> Registration=CENTER;
> LatitudeResolution=0.25;
> LongitudeResolution=0.25;
> NorthBoundingCoordinate=50;
> SouthBoundingCoordinate=-50;
> EastBoundingCoordinate=180;
> WestBoundingCoordinate=-180;
> Origin=SOUTHWEST;
> 
>         history: 2018-12-26 17:57:56 GMT Hyrax-1.13.4
> https://disc2.gesdisc.eosdis.nasa.gov:443/opendap/TRMM_L3/TRMM_3B43.7/2008/3B43.20080101.7A.HDF.nc?precipitation[604:636][3:43],nlat[3:43],nlon[604:636]
> 
>   [[alternative HTML version deleted]]
> 
> ___
> R-sig-Geo mailing list
> R-sig-Geo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo

Ben Tupper
Bigelow Laboratory for Ocean Sciences
60 Bigelow Drive, P.O. Box 380
East Boothbay, Maine 04544
http://www.bigelow.org


[R-sig-Geo] get data from nc file

2018-12-26 Thread Antonio Silva
Dear list members

I downloaded some nc files with precipitation data from
https://pmm.nasa.gov/data-access/downloads/trmm (Level 3 3B43:
Multisatellite Precipitation). For the image link see the global attribute
"history" (below).

With ncdf4::nc_open I could open the file
(nc.data <- nc_open("3B43.20080101.7A.HDF.nc")).

I want to extract the "StartGranuleDateTime" but it is inside the global
attribute FileHeader (see below).

With ncatt_get(nc.data,0,"FileHeader")$value I got
[1]
"AlgorithmID=3B43;\nAlgorithmVersion=3B43_7.0;\nFileName=3B43.20080101.7A.HDF;\nGenerationDateTime=2012-11-29T19:12:01.000Z;\nStartGranuleDateTime=2008-01-01T00:00:00.000Z;\nStopGranuleDateTime=2008-01-31T23:59:59.999Z;\nGranuleNumber=;\nNumberOfSwaths=0;\nNumberOfGrids=1;\nGranuleStart=;\nTimeInterval=MONTH;\nProcessingSystem=PPS;\nProductVersion=7A;\nMissingData=;\n"

Is there any way to extract only the string "2008-01-01T00:00:00.000Z"?

The best I could do was
as.Date(substr(strsplit(ncatt_get(nc.data,0,"FileHeader")$value,";\n")[[1]][5],22,45),"%Y-%m-%dT%H:%M:%OSZ")

but I suppose there must be a more direct way of getting the
data. I would appreciate any suggestions.

Best regards,

Antonio Olinto
Fisheries Institute
Brazil

nc.data
File 3B43.20080101.7A.HDF.nc (NC_FORMAT_CLASSIC):

    1 variables (excluding dimension variables):
        float precipitation[nlat,nlon]
            units: mm/hr
            coordinates: nlon nlat
            _FillValue: -.900390625

    2 dimensions:
        nlon  Size:33
            long_name: longitude
            standard_name: longitude
            units: degrees_east
        nlat  Size:41
            long_name: latitude
            standard_name: latitude
            units: degrees_north

    5 global attributes:
        Grid.GridHeader: BinMethod=ARITHMETIC_MEAN;
Registration=CENTER;
LatitudeResolution=0.25;
LongitudeResolution=0.25;
NorthBoundingCoordinate=50;
SouthBoundingCoordinate=-50;
EastBoundingCoordinate=180;
WestBoundingCoordinate=-180;
Origin=SOUTHWEST;

        FileHeader: AlgorithmID=3B43;
AlgorithmVersion=3B43_7.0;
FileName=3B43.20080101.7A.HDF;
GenerationDateTime=2012-11-29T19:12:01.000Z;
StartGranuleDateTime=2008-01-01T00:00:00.000Z;
StopGranuleDateTime=2008-01-31T23:59:59.999Z;
GranuleNumber=;
NumberOfSwaths=0;
NumberOfGrids=1;
GranuleStart=;
TimeInterval=MONTH;
ProcessingSystem=PPS;
ProductVersion=7A;
MissingData=;

        FileInfo: DataFormatVersion=m;
TKCodeBuildVersion=1;
MetadataVersion=m;
FormatPackage=HDF Version 4.2 Release 4, January 25, 2009;
BlueprintFilename=TRMM.V7.3B43.blueprint.xml;
BlueprintVersion=BV_13;
TKIOVersion=1.6;
MetadataStyle=PVL;
EndianType=LITTLE_ENDIAN;

        GridHeader: BinMethod=ARITHMETIC_MEAN;
Registration=CENTER;
LatitudeResolution=0.25;
LongitudeResolution=0.25;
NorthBoundingCoordinate=50;
SouthBoundingCoordinate=-50;
EastBoundingCoordinate=180;
WestBoundingCoordinate=-180;
Origin=SOUTHWEST;

        history: 2018-12-26 17:57:56 GMT Hyrax-1.13.4
https://disc2.gesdisc.eosdis.nasa.gov:443/opendap/TRMM_L3/TRMM_3B43.7/2008/3B43.20080101.7A.HDF.nc?precipitation[604:636][3:43],nlat[3:43],nlon[604:636]
