Hi everyone,

I currently have PRs open for Nexus (#265) and Ingester (#86) that overhaul the 
data-access component of SDAP so it can read analysis-ready datasets in other 
formats directly, without first ingesting NetCDF files into our own format (which 
duplicates the data). This would be useful for users who wish to use SDAP with 
their existing data (DAACs come to mind here) without needing to have (or pay 
for) added storage for SDAP tiles.

My initial implementation focuses on Zarr, since that is the format I have the 
most experience with. Cloud-optimized GeoTIFF support is in progress and nearing 
readiness (though I'd rather these changes be reviewed and merged first than 
make the PRs even larger). I also plan to investigate Parquet. The PRs support 
Zarr data stored locally or on AWS S3.

The changes boil down to separating the various backends (nexusproto & Zarr) 
into their own modules that implement many of the existing nexustiles.py 
methods. nexustiles.py acts as an interface, routing the existing method calls 
to the appropriate backend's method. It also maintains a mapping between 
current datasets and their associated backend, which it builds from Solr's 
nexusdatasets collection. This collection stores all the needed info for Zarr 
datasets. Datasets can be added by listing them in the collections config, or 
dynamically through a set of dataset management endpoints (add, update and 
delete), which are useful for on-the-fly onboarding, updating (such as rotating 
AWS keys) and removal of Zarr datasets.
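
To make the routing idea concrete, here is a rough sketch of the pattern. It 
is not the code in the PRs: the class and method names (Backend, TileService, 
find_tiles, add_dataset and so on) are hypothetical, and an in-memory dict 
stands in for the mapping that is really built from Solr.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class Backend(ABC):
    """Hypothetical common interface that each storage backend implements."""

    @abstractmethod
    def find_tiles(self, dataset: str) -> List[str]:
        ...


class NexusprotoBackend(Backend):
    """Stand-in for the existing nexusproto tile store."""

    def find_tiles(self, dataset: str) -> List[str]:
        return ["nexusproto:" + dataset]


class ZarrBackend(Backend):
    """Stand-in for a Zarr store at a local path or an S3 URL."""

    def __init__(self, path: str):
        self.path = path

    def find_tiles(self, dataset: str) -> List[str]:
        return ["zarr:" + dataset + "@" + self.path]


class TileService:
    """Interface layer: keeps a dataset -> backend map and routes calls."""

    def __init__(self) -> None:
        # Assumption: the real mapping comes from Solr; a dict is used here.
        self._backends: Dict[str, Backend] = {}

    def add_dataset(self, name: str, backend: Backend) -> None:
        if name in self._backends:
            raise ValueError("dataset already exists: " + name)
        self._backends[name] = backend

    def update_dataset(self, name: str, backend: Backend) -> None:
        # E.g. swap in a backend configured with rotated credentials.
        if name not in self._backends:
            raise KeyError(name)
        self._backends[name] = backend

    def delete_dataset(self, name: str) -> None:
        del self._backends[name]  # raises KeyError if the dataset is unknown

    def find_tiles(self, dataset: str) -> List[str]:
        # Route the call to whichever backend owns this dataset.
        return self._backends[dataset].find_tiles(dataset)


service = TileService()
service.add_dataset("sst_tiles", NexusprotoBackend())
service.add_dataset("oco3_zarr", ZarrBackend("s3://example-bucket/oco3.zarr"))
print(service.find_tiles("sst_tiles"))
print(service.find_tiles("oco3_zarr"))
```

Callers only ever talk to the interface layer, so existing endpoints don't 
need to know which backend a dataset lives in.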

I've tested these changes fairly extensively, both locally and by 
deploying them to JPL SDAP instances (such as 
https://ideas-digitaltwin.jpl.nasa.gov/nexus/; see also the OCO-3 section of 
https://github.com/EarthDigitalTwin/FireAlarm-notebooks/blob/dc5c64d9e0311e45f9a3f93908ea6394f5304130/AirQuality_Demo.ipynb).
So far there's been no indication that existing endpoints using the old 
nexusproto implementation have broken, and the Zarr backend seems reliable.

There's more documentation for these changes on the Nexus PR, and I'm available 
to address any questions or concerns about them. If need be, I can also write 
a more thorough gist documenting these changes.

Let me know what you think!

Thanks,
Riley
