Re: [CF-metadata] Swath observational data
Raskin, Rob (388M) wrote: While the Point observational conventions document is undergoing final review, I want to initiate a discussion on a complementary topic - Swath observational conventions. This model addresses satellite observational measurements and potentially airborne measurements. The Swath conceptual model is essentially a grid in spacecraft coordinates. One dimension of this grid (along_track) follows the path of the satellite. Normally there are one or two additional dimensions: cross_track and/or vertical. The cross_track dimension is perpendicular to the satellite path, as the instrument typically makes side views of the surface rather than just at the nadir. The vertical dimension is present when a vertical profiler instrument is used. CF:FeatureType will need to account for each possible combination of these 2-D and 3-D swaths. Typically, time is explicitly stored and associated only with the along-track dimension. Spatial resolution generally will differ in the along_track and cross_track directions. Orbits are not mapped to files in any consistent way: a file might correspond to a complete orbit, a half-orbit, or some other value. However, it is common to explicitly consider yet another dimension: satellite_node, with values ascending (crosses equator going northward) and descending (crosses equator going southward). Common satellites are in sun-synchronous polar orbits such that the ascending node remains at a near constant Local Solar Time (LST), while the descending node remains at a near constant LST shifted by 12 hours. For example, the ascending node may be at 6am LST and the descending node at 6pm LST. Often gridded data products are produced from these swaths, with separate grids corresponding to the AM and PM cases. A new CF time representation for LST is required to indicate that the global data are all at a time such as 6am LST. 
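To make the local solar time point concrete: mean local solar time can be approximated as UTC plus 15 degrees of longitude per hour. The Python sketch below is only an illustration under that assumption (the function name is made up here, and the equation of time is ignored); it is not a proposed CF representation.

```python
def mean_local_solar_time(utc_hours: float, lon_deg_east: float) -> float:
    """Approximate mean local solar time in hours, in the range [0, 24).

    Assumes 15 degrees of longitude per hour and ignores the equation
    of time, so this is a first-order estimate only.
    """
    return (utc_hours + lon_deg_east / 15.0) % 24.0


# A sun-synchronous satellite with a 06:00 LST ascending node crosses the
# equator at different UTC times depending on the longitude of the crossing.
for lon in (-90.0, 0.0, 90.0):
    utc = (6.0 - lon / 15.0) % 24.0
    print(lon, utc, mean_local_solar_time(utc, lon))
```

Each crossing in the loop has a different UTC time but the same local solar time, which is the property a gridded "6am LST" product relies on.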
Unrelated to the swath geometry, some measurements use spectral band as an independent variable, as they sample at multiple channels. This capability requires a new standard name for spectral_band or spectral_channel with values that may be numeric, a numeric range, or string. Swath data include many new dependent variables that correspond to engineering parameters of the retrieval rather than geophysical parameters (point spread function is a common example). If these names are standardized at all, they should be indicated as being of the engineering type. In the case of an airborne (rather than satellite) measurement, there is more commonality with the trajectory representation from the Point observation model. Hence, the focus here is on spacecraft measurements. Finally, on an unrelated note, I have semantically mapped the entire CF Standard Name list to an ontological representation. But that is the subject of a separate communication. -Rob

Hi Rob, thanks for starting this up. We have done some preliminary thinking about the swath feature type in the CDM data model, though we don't have any implementations. A prototype coordinate system would look something like:

  dimensions:
    scan = 1234;
    xscan = 987;
    wavelength = 123;
  variables:
    double lon(scan, xscan);
    double lat(scan, xscan);
    double alt(scan, xscan);
    double time(scan);
    double wavelength(wavelength);
    byte data(scan, xscan);
      data:coordinates = "lon lat time alt";
    byte spectral(scan, xscan, wavelength);
      spectral:coordinates = "lon lat time alt";

I think this should handle zigzags or grids, although perhaps adding a scan strategy attribute would be good. The geometry of each point is an interesting wrinkle, and may need some new conventions. Would a rotated ellipse work (3 params), or do we need a more general polygon? Does it have to be specified per point, or can it be common to all points? I would imagine that quick visualizers might ignore the details of this (essentially assuming a tessellating grid), but more sophisticated and specialized tools would need this.
___ CF-metadata mailing list CF-metadata@cgd.ucar.edu http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
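The prototype layout above can be sanity-checked without any netCDF library. The Python sketch below copies the dimension sizes and variable shapes from the prototype and checks that each name in a coordinates attribute exists and spans only dimensions used by the data variable. The helper names are hypothetical, and the subset check is an assumption about what a reader would want to verify, not part of the proposal.

```python
# Dimension sizes and variable shapes copied from the prototype above.
dims = {"scan": 1234, "xscan": 987, "wavelength": 123}

variables = {
    "lon": ("scan", "xscan"),
    "lat": ("scan", "xscan"),
    "alt": ("scan", "xscan"),
    "time": ("scan",),
    "wavelength": ("wavelength",),
    "data": ("scan", "xscan"),
    "spectral": ("scan", "xscan", "wavelength"),
}

# The coordinates attributes, as space-separated variable names.
coordinates = {
    "data": "lon lat time alt",
    "spectral": "lon lat time alt",
}


def shape(name):
    """Return the array shape of a variable from its dimension names."""
    return tuple(dims[d] for d in variables[name])


def check_coordinates(name):
    """Return True if every listed coordinate exists and uses only
    dimensions that the data variable itself uses."""
    data_dims = set(variables[name])
    for coord in coordinates[name].split():
        if coord not in variables:
            return False
        if not set(variables[coord]) <= data_dims:
            return False
    return True


print(shape("spectral"), check_coordinates("spectral"))
```

Note that time is one-dimensional along scan only, which matches the earlier remark that time is associated only with the along-track dimension.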
[CF-metadata] Multiple file datasets (was: Swath observational data)
This topic deserves its own heading, so here it is. Perhaps we should gather current practices and ideas. I think Balaji's gridspec has a proposal about this. Can anyone summarize what SAFE does? I'm imagining how this is actually used, e.g.: float data(y,x); data:coordinates = l...@file1 l...@file2;

John Graybeal wrote: I like Bryan's recommendation for a UUID or similar. Now I'm going to be annoying and suggest the UUID *could* be a URI, or these days, an IRI (International ..). And I think the way of 'locating' the file should be neither in packaging nor in local resolution; it should be in global namespace resolution. This is the way of the future, and is already more 'permanent' than either packaging or local resolution, IMHO. There is one form of URI in particular that is already resolvable: a URL. OK, that's an old song, but I'm gonna stick to it for a while longer. That form meets all the other requirements: it can be registered in a resolver, it can be guaranteed unique (to the same authority level as a UUID, anyway), and it is a unique string that can be used to validate the link. And it has the obvious benefit of being resolvable right now, for as long as the domain is held and properly maintained (Good URLs don't die). Since the last paragraph risks starting another unique identifier war, I promise not to re-engage unless someone asks me to. Meanwhile, I like
John

On Nov 19, 2009, at 22:23, Bryan Lawrence wrote:

On Thursday 19 November 2009 19:40:08 Jonathan Gregory wrote: ... In some cases, referencing attributes such as coordinates and ancillary_variables would, ideally, point to a variable in a different dataset. This is a general problem to which CF doesn't have a solution because it was conceived as a convention for single netCDF files. However we need a solution as often several files should be treated as a single dataset. If the files don't overlap i.e. 
their contents are complementary, I think it should be satisfactory to allow variables in one file to be pointed to by name from another file, with no other mechanism being required within the file. I don't like the idea of naming one file within another file, as that would be very fragile. Instead, I think the file aggregation should be implied by simply defining the group of files which are to be treated as one file e.g. by putting them in one directory.

It's the old ones that are the best ones :-) :-) this issue keeps on coming back ... :-) :-) and we keep trying to ignore it ... I think we agree that an actual physical filename including path is useless. We need both a relative link which relies on the preservation of a group of files in a particular arrangement ... AND an internal identifier so more robust linking mechanisms can be used when (if) the data ends up in a managed environment. I think it's crucial in this situation to ensure that each file has a unique identifier within it (created, for example, with uuid), because all solutions which rely on packaging are fragile (SAFE is probably better than most), but the bottom line is that users move files around ... and we need some way of ensuring that we/they can validate that the links in place are the ones that were originally intended. So relative links would also include the identifier of the intended target as well as the relative path in operating system agnostic terms. That identifier can be used in two ways: a) to validate the link (my software can always check that the variable that I just opened following a link from another one is the one that was expected by checking the container identifier), and b) to produce an identifier resolver service for the situation where the packaging has had to be broken (which might occur for performance reasons or ...). CF could recommend something like this ... 
Bryan
-- Bryan Lawrence Director of Environmental Archival and Associated Research (NCAS/British Atmospheric Data Centre and NCEO/NERC NEODC) STFC, Rutherford Appleton Laboratory Phone +44 1235 445012; Fax ... 5848; Web: home.badc.rl.ac.uk/lawrence
-- I have my new work email address: jgrayb...@ucsd.edu -- John Graybeal mailto:jgrayb...@ucsd.edu phone: 858-534-2162 Development Manager Ocean Observatories Initiative Cyberinfrastructure Project: http://ci.oceanobservatories.org Marine Metadata Interoperability Project: http://marinemetadata.org
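Bryan's two uses of the identifier (validating a relative link, and falling back to a resolver when the packaging has been broken) can be sketched in Python as follows. Everything here is an assumption made for illustration: the Link class, the "file_id" attribute name, and the dictionaries standing in for a file system and a resolver service are all hypothetical, and no netCDF files are actually opened.

```python
import uuid
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Link:
    relative_path: str  # POSIX-style path, so it is operating-system agnostic
    target_id: str      # identifier expected inside the target file
    variable: str       # name of the variable being referenced


def validate_link(link, file_attrs):
    """Use a): check that the opened file carries the expected identifier."""
    return file_attrs.get("file_id") == link.target_id


def resolve(link, base_dir, files, registry):
    """Try the relative path first, then fall back to the resolver.

    `files` maps path -> global attributes (a stand-in for opening files);
    `registry` maps identifier -> path (a stand-in for use b), the
    identifier resolver service).
    """
    candidate = str(PurePosixPath(base_dir) / link.relative_path)
    attrs = files.get(candidate)
    if attrs is not None and validate_link(link, attrs):
        return candidate
    return registry.get(link.target_id)


geo_id = str(uuid.uuid4())
link = Link("geo.nc", geo_id, "lat")

# Packaging intact: the relative link resolves and validates.
files = {"/archive/run1/geo.nc": {"file_id": geo_id}}
print(resolve(link, "/archive/run1", files, {}))

# Packaging broken: the file was moved, so the resolver is consulted.
moved = {"/elsewhere/geo.nc": {"file_id": geo_id}}
print(resolve(link, "/archive/run1", moved, {geo_id: "/elsewhere/geo.nc"}))
```

If a different file is later dropped at the old relative path, validate_link fails and the stale link is not silently followed, which is the fragility the identifier is meant to guard against.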
Re: [CF-metadata] Multiple file datasets (was: Swath observational data)
Can anyone summarize what SAFE does?

I will give it a shot as I brought it up in the first place! The Standard Archive Format for Europe (SAFE) was developed as a common format for archiving to ensure long-term preservation of EO data holdings, both historical and operational. The SAFE website [www.esa.int/safe] is the official ESA maintained site for the maintenance and distribution of the standard format, specification, XML-schemas and tools.

SAFE is a specialisation of the XML Formatted Data Unit (XFDU), a CCSDS (Consultative Committee for Space Data Systems) recommended standard for the packaging of data and metadata to facilitate information transfer and archiving. Every SAFE product is an XFDU package. SAFE defines a restriction of the generic XFDU package: it inherits its main structure from the XFDU packaging format and defines high level constraints and new rules for Earth Observation ground segment data products.

A SAFE product wraps, or references, data and associates that data with metadata, both global and local. SAFE product metadata contains basic information, such as the acquisition period, platform and sensor identification and a processing history to ensure traceability. For each included, or externally referenced, dataset another layer of associated metadata may be attached providing orbit and geo-location information, quality information and representational information.

Basically a SAFE product is a directory. At the top level is a manifest file, written in XML, that provides a map of the contained data sets, defines the relationships between these datasets, and contains global metadata (such as platform name, acquisition period etc.). There is a set of required metadata defined by the SAFE specialisation (e.g. there is an ENVISAT specialisation, further restricted to apply to, say, MERIS, and still further specialised to, say, Level 1 processed products). The contained datasets are collections of records. 
They are of three types:

Measurement Data Sets: These are typically binary format files and, in our case, will be netCDF-CF files. As an example we will have 46 measurement data products and each will be stored as a netCDF file (data record) along with a data record containing associated quality information and another containing status flags.

Annotation Data Sets: These contain metadata and common data. Although this is still to be decided, in the case of Sentinel 3 Level 2 we are considering storing a common set of coordinate data that is applicable to subsets of the measurement data. The manifest file will provide the association between specific measurement datasets and the associated coordinate data.

Representation Data Sets: These are XML Schema descriptions of the measurement and annotation datasets. Firstly, they are a key concept for OAIS digital preservation, and secondly third party applications may use them for displaying / accessing the corresponding measurement data sets. I appreciate that it might seem a little 'belt-and-braces' to have an XML schema for a netCDF file (which is by nature self-describing) but that is how the SAFE people have decided to include netCDF into the convention.

There is a further type of data which can be considered as resources. These may be, for instance, data required for the generation of the end-user data products. For Level 2 data products they would include the Level 1 input products and possibly, for instance, ECMWF data required for processing (although the latter might equally be an annotation dataset). These resources are not packaged inside a SAFE container but are referenced (in the manifest file) using a URI.

All of these taken together are a SAFE package. I hope that this provides a reasonably informative overview. The SAFE website is the place to go for more detailed info. 
Steve
--- Dr Stephen Emsley Tel: +44 (0)1752 764 289 ARGANS Limited Mobile: +44 (0)7912 515 418
-Original Message- From: cf-metadata-boun...@cgd.ucar.edu [mailto:cf-metadata-boun...@cgd.ucar.edu] On Behalf Of John Caron Sent: 20 November 2009 12:30 To: cf-metadata@cgd.ucar.edu Subject: [CF-metadata] Multiple file datasets (was: Swath observational data)
Re: [CF-metadata] [CF Metadata] #37: Conventions for Point Observation Data
Martina Stockhause wrote: Hi John, right thanks, we could describe several z coordinates. In our case with z dimensions:

  dimensions:
    station = 8 ;
    time = UNLIMITED ;
    lon = 1;
    lat = 1;
    z0 = 1;    // e.g. VTP in 110 m
    z1 = 7;    // MINERVA
    z2 = 1050; // MRR (rain radar)

The constructors of meteorological instruments weren't able to design an instrument beautiful enough to be called APHRODITE, yet. Nevertheless we probably will stay with separated files for each instrument type to keep the files simple and their contents close to our provided ASCII versions. Thanks a lot, Martina

yes, very good. just checking that if you wanted to store multiple instruments in one file, the proposal would work.
Re: [CF-metadata] Multiple file datasets (was: Swath observational data)
Sorry, my mistake: http://earth.esa.int/SAFE is the correct address for the SAFE web site
--- Dr Stephen Emsley Tel: +44 (0)1752 764 289 ARGANS Limited Mobile: +44 (0)7912 515 418
Re: [CF-metadata] Multiple file datasets (was: Swath observational data)
The gridspec indeed had a proposal about this. Clearly it was a bit off-topic, but some mechanism of referring to other files was needed. It consists of an attribute called a link_spec, which has attributes of a baseURL, a relative pathname, and a checksum for verifying whether the external file being referenced is indeed the one you're looking for. There wasn't a special v...@link syntax, but I don't see why it couldn't have had one. CMIP5 is proposing a simplified variant on the link_spec. A file can have a global attribute associated_files which are also formed out of a baseURL and relative pathnames. The only permitted associated_files are gridspec, and cell areas and volumes that may be used in cell_methods. Other approaches have been proposed in this forum, most notably on Trac #24 and #27, the common_concept thread and Benno's namespace thread. SAFE has been explained already in this thread. I agree with John, it would be good to consider this problem in isolation, without the baggage of gridspecs or common concepts or namespaces.
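The link_spec idea described in this message (a baseURL, a relative pathname, and a checksum used to confirm that the referenced file is the intended one) might look roughly like the Python sketch below. The dictionary keys other than baseURL, the function names, and the choice of SHA-256 are assumptions for illustration only; the message does not say which checksum algorithm gridspec uses.

```python
import hashlib


def make_link_spec(base_url, relative_path, payload):
    """Build a link_spec-like record for a file whose bytes are `payload`."""
    return {
        "baseURL": base_url,
        "path": relative_path,
        # SHA-256 is an assumption; any agreed checksum would do.
        "checksum": hashlib.sha256(payload).hexdigest(),
    }


def full_location(link_spec):
    """Join the base URL and the relative pathname."""
    return link_spec["baseURL"].rstrip("/") + "/" + link_spec["path"].lstrip("/")


def matches(link_spec, payload):
    """Return True if `payload` is the file the link_spec was made for."""
    return hashlib.sha256(payload).hexdigest() == link_spec["checksum"]


grid_bytes = b"pretend these are the bytes of the grid file"
spec = make_link_spec("http://example.org/data/", "grid/atmos.nc", grid_bytes)
print(full_location(spec))
print(matches(spec, grid_bytes), matches(spec, b"some other file"))
```

A content checksum plays the same validating role as the embedded identifier discussed earlier in the thread, with the difference that it changes whenever the referenced file is rewritten.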
Re: [CF-metadata] Multiple file datasets
Thanks, Steve for the summary. 
A quick perusal of the SAFE spec for our purposes indicates that the referenced file is a full path HTTP URL: "The fileLocation element specifies an HTTP GET URL to request the latest version of data from an online registry/repository." I suppose we are interested only in local netCDF files?
Re: [CF-metadata] Swath observational data
Dear All, I think there may be two distinct cases here:
1) Local cross-referencing, where it is only necessary to establish a relationship within a well-defined grouping of files,
2) Referencing to a universal resource, such as a specific file held on a server.
For the former, it should only be necessary that every NetCDF file within the grouping holds the same unique identifier (this could be the product or group name, or an ID string from a managed source). Satellite swath products, where they have this sort of structure, almost always fall in the first category. In general, a user would want to use his or her local copy of a file, rather than re-download a remote file. This may be redundant by now, but my thoughts were that:
1) We only consider whether we can extend cross-referencing within a local scope,
2) All related files within the scope should contain the same unique identifier, perhaps a global attribute named something like "cross_reference_ID".
3) Referenced variable names within the scope should be unique and so do not need modifiers. An alternative is that modifiers are not needed in references by default, but could be included to disambiguate variables, perhaps in a form like "geo:latitude" where geo.nc is the file containing the required latitude variable.
If the attribute contains an empty string or is absent, CF compliant systems only look for referenced variables within the same file, as at the moment. If present, the system is allowed to search other files within a limited scope, containing the same ID. One possibility is that scope could be modified with, perhaps, unix-like relative directory prefixes to the ID, so that :cross_reference_ID = "my_unique_id"; refers just to files in the same directory, whereas :cross_reference_ID = "../*/my_unique_id"; refers to all files held under the parent directory and its subdirectories, and so on. 
If the purpose of the ID is only to disambiguate local files, then form and integrity of the ID string itself could probably be left to the discretion of the data provider, since it would only need to be checked within a defined scope. More rigorous implementations are a bit beyond my experience. Anybody who¹s interested can find the SAFE format definition at earth.esa.int/SAFE. You should probably enjoy UML diagrams to appreciate it fully. Note that the format doesn¹t discuss NetCDF in particular this is just the format that Sentinel-3 is adopting for its data containers. Tim. On 20/11/2009 06:23, Bryan Lawrence bryan.lawre...@stfc.ac.uk wrote: On Thursday 19 November 2009 19:40:08 Jonathan Gregory wrote: ... In some cases, referencing attributes such as coordinates and ancillary_variables would, ideally, point to a variable in a different dataset. This is a general problem to which CF doesn't have a solution because it was conceived as a convention for single netCDF files. However we need a solution as often several files should be treated as a single dataset. If the files don't overlap i.e. their contents are complementary, I think it should be satisfactory to allow variables in one file to be pointed to by name from another file, with no other mechanism being required within the file. I don't like the idea of naming one file within another file, as that would be very fragile. Instead, I think the file aggregation should be implied by simply defining the group of files which are to be treated as one file e.g. by putting them in one directory. It's the old ones that are the best ones :-) :-) this issue keeps on coming back ... :-) :-) and we keep trying to ignore it ... I think we agree that an actual physical filename including path is useless. We need both a relative link which relies on the preservation of a group of files in a particular arrangement ... 
AND an internal identifier so more robust linking mechanisms can be used when (if) the data ends up in a managed environment. I think it's crucial in this situation to ensure that each file has a unique identifier within it (created, for example, with uuid), because all solutions which rely on packaging are fragile (SAFE is probably better than most), but the bottom line is that users move files around ... and we need some way of ensuring that we/they can validate that the links in place are the ones that were originally intended. So relative links would also include the identifier of the intended target as well as the relative path in operating system agnostic terms. That identifier can be used in two ways: a) to validate the link (my software can always check that the variable that I just opened following a link from another one is the one that was expected by checking the container identifier), and b) to produce an identifier resolver service for the situation where the packaging has had to be broken (which might occur for performance reasons
Re: [CF-metadata] Swath observational data
Dear John, - John Caron ca...@unidata.ucar.edu wrote: The geometry of each point is an interesting wrinkle, and may need some new conventions. would a rotated ellipse work (3 params) or do we need a more general polygon? Does it have to be specified per point, or can it be common to all points? I would imagine that quick visualizers might ignore the details of this (essentially assuming a tessellating grid), but more sophisticated and specialized tools would need this. I do not think the FOV (field of view) of a single point should be described as projected on the Earth surface (rotated ellipse and/or polygon) if this is what you meant. It should come as a response function of angular incoming radiation. This response function might be a formula (2D Gaussian, weighted sum of 2D Gaussians, etc...) or given as a Look Up Table. The Earth-projected geometry will then be a function of the view angle, Earth topography, integration (photon counting) period, etc... We should definitely be able to have the response function varying within the scan array. I think we are entering a terribly complex (and interesting) subject when defining a Feature for those space- and air-borne observational data. The question is then, where should we put the limit in complexity and what is the scope: Do we aim at encoding the spacecraft instrument engineer's point of view or the geophysical data user's point of view? Cheers, Thomas ___ CF-metadata mailing list CF-metadata@cgd.ucar.edu http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
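To make the "weighted sum of 2D Gaussians" form of the response function concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the function names, the choice of along-track and cross-track angular offsets as the two axes, and the weights and widths, which do not describe any real instrument.

```python
import math


def gaussian_2d(d_along, d_cross, sigma_along, sigma_cross):
    """Unnormalised 2D Gaussian of angular offsets (radians) from the view axis."""
    return math.exp(-0.5 * ((d_along / sigma_along) ** 2
                            + (d_cross / sigma_cross) ** 2))


def response(d_along, d_cross, components):
    """Weighted sum of 2D Gaussians.

    `components` is a list of (weight, sigma_along, sigma_cross) tuples.
    """
    return sum(w * gaussian_2d(d_along, d_cross, sa, sc)
               for w, sa, sc in components)


# Made-up example: a narrow main lobe plus a weak, wider one.
components = [(0.9, 0.001, 0.002), (0.1, 0.004, 0.004)]

peak = response(0.0, 0.0, components)        # on the view axis
off_axis = response(0.002, 0.0, components)  # 2 mrad away along track

print(peak, off_axis)
```

A response that varies within the scan array could be represented by holding one such component list per cross-track position; a look-up table would replace the formula with tabulated values over the same two angles.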
Re: [CF-metadata] CF point observation Conventions
Thanks, Roy. There's something not quite symmetrical in this, either - maybe it's "just" terminology, maybe not. A time series is conceptually identical to a profile, just "turned on its side" so time is the single incrementing dimension, instead of depth. The difference turns out to be important in the proposal mainly because of the way we'd aggregate profiles vs time series. A point, in my lexicon, is an atomic unit, a single measurement at a single x,y,z,t. Is there a "single point" in your feature types? Why assign the term point to a set of measurements with single x, y, and z and progressing t, as opposed to a set of measurements with single x, y, t values but varying z? Cheers - Nan Lowry, Roy K wrote: .. The feature terms we use for observational data in BODC are: Profile - single set of measurements with single (by assumption) x, y, t values but varying spatial z. An example is a single, fully processed (i.e. binned) CTD cast. Profile collection - an aggregation of profiles into a single data object. An example is all the CTDs from a section or a cruise. Profile series - a set of measurements with single x,y, a fixed set of spatial z values and progressing t. An example is a single moored ADCP deployment record. Point - a set of measurements with single x, y, and z and progressing t. Example is a single moored recording current meter record. Point collection - an aggregation of point features in a single container. Example is all the records from all the current meters on a mooring or deployed on a cruise. Spectrum series - a set of measurements with single x,y, a fixed set of non-spatial z values and progressing t. An example is a power spectrum time series from a wave recorder. 2D-trajectory - a set of measurements with variable x, y, t and a single spatial z. Example is the thermosalinograph record from a cruise. 3D-trajectory - set of measurements with variable x, y, t and a single spatial z. Example is the thermosalinograph record from an AUV mission.
It is also applicable to a yo-yo CTD station, mirroring Chris's comments on atmospheric "profiles" with variant x,y. I think that Nan and most of the observational oceanographic community recognise these concepts and consequently, if a mapping from them to your feature definitions is maintained then it will help keep us on board. Note that the difference between 'point' and 'point collection' is important to me as an observational data manager, which is a different perspective to an observational data ingestor. Cheers, Roy. -- Nan Galbraith, (508) 289-2444, Upper Ocean Processes Group, Mail Stop 29, Woods Hole Oceanographic Institution, Woods Hole, MA 02543
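One way to see the symmetry Nan raises is to key the BODC terms quoted above by which coordinates vary within a single feature. The Python sketch below is only an illustration under that assumption; the BODC_FEATURES table and the bodc_term function are hypothetical names, and the collection types, Spectrum series and 3D-trajectory are left out.

```python
from typing import Iterable, Optional

# Coordinates that vary within one feature, taken from the quoted definitions.
BODC_FEATURES = {
    frozenset({"z"}): "Profile",                 # single x, y, t; varying z
    frozenset({"t"}): "Point",                   # single x, y, z; progressing t
    frozenset({"z", "t"}): "Profile series",     # single x, y; fixed z set; progressing t
    frozenset({"x", "y", "t"}): "2D-trajectory", # variable x, y, t; single z
}


def bodc_term(varying: Iterable[str]) -> Optional[str]:
    """Return the BODC term for a set of varying coordinates, or None."""
    return BODC_FEATURES.get(frozenset(varying))


print(bodc_term({"z"}))   # Profile
print(bodc_term({"t"}))   # Point
print(bodc_term(set()))   # None
```

In this view Profile and Point differ only in which single coordinate varies, and a single measurement at one x, y, z, t (an empty set of varying coordinates) has no term in the list, which is the gap Nan asks about.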
Re: [CF-metadata] CF point observation Conventions
Hi Nan, It's a terminology issue. The feature type terms were coined for local use under pressure (so much pressure that I even failed to consult the CSML feature type names!) to describe data in one of our data schemas, which doesn't include single instantaneous measurements. It's the concepts that are the important thing, which we identify by neutral keys. I'm quite happy to use different terms to describe the concepts providing the concept definitions match exactly. The only reason I exposed them was to ensure CF didn't head off into concepts that didn't map. Getting a set of terms for these concepts that are universally agreed would be worthwhile. Bringing our local terms into line with CSML would be an obvious first step, which I'll try and do next week (currently on travel) in conjunction with checking through John Caron's mappings to the proposed CF feature types. Meanwhile if you've any further suggestions for change (or additional observational feature types you'd like to see) let me know and I'll do my best to fall into line. Cheers, Roy. From: cf-metadata-boun...@cgd.ucar.edu [cf-metadata-boun...@cgd.ucar.edu] On Behalf Of Nan Galbraith [ngalbra...@whoi.edu] Sent: 20 November 2009 16:26 To: cf-metadata@cgd.ucar.edu Subject: Re: [CF-metadata] CF point observation Conventions Thanks, Roy. There's something not quite symmetrical in this, either - maybe it's just terminology, maybe not. A time series is conceptually identical to a profile, just turned on its side so time is the single incrementing dimension, instead of depth. The difference turns out to be important in the proposal mainly because of the way we'd aggregate profiles vs time series. A point, in my lexicon, is an atomic unit, a single measurement at a single x,y,z,t. Is there a single point in your feature types? Why assign the term point to a set of measurements with single x, y, and z and progressing t, as opposed to a set of measurements with single x, y, t values but varying z?