Hi everyone,

I currently have PRs open for Nexus (#265) and Ingester (#86) that overhaul the data-access component of SDAP so it can support analysis-ready datasets in other formats, without first ingesting NetCDF files into our own format (which duplicates the data). This support would be useful for users who wish to use SDAP with their existing data (DAACs come to mind here) without needing to have (or pay for) added storage for SDAP tiles.

My initial implementation focuses on Zarr, as that is the format I have more experience with. Cloud-optimized GeoTIFF support is in progress and nearing readiness (though I'd rather these changes be accepted and merged first than make the PRs even larger). I also plan to investigate Parquet. The PRs support Zarr data stored locally or on AWS S3.

The changes boil down to separating the backends (nexusproto and Zarr) into their own modules, each of which implements many of the existing nexustiles.py methods. nexustiles.py now acts as an interface, routing the existing method calls to the appropriate backend's method. It also maintains a mapping between current datasets and their associated backends, which it builds from Solr's nexusdatasets collection. This collection stores all the information needed for Zarr datasets. Datasets can be added by listing them in the collections config, or dynamically through a set of dataset management endpoints (add, update and delete), which are useful for on-the-fly onboarding, updating (such as rotating AWS keys) and removal of Zarr datasets.

I've done a decent amount of testing with these changes, both locally and by deploying them to JPL SDAP instances (such as https://ideas-digitaltwin.jpl.nasa.gov/nexus/ or see the OCO-3 section of https://github.com/EarthDigitalTwin/FireAlarm-notebooks/blob/dc5c64d9e0311e45f9a3f93908ea6394f5304130/AirQuality_Demo.ipynb). There has been no indication that existing endpoints using the old nexusproto implementation are broken, and the Zarr backend seems reliable.

There's more documentation for these changes on the Nexus PR, and I'm available to address any questions and concerns regarding them. If need be, I can also write a more thorough gist documenting these changes.

Let me know what you think!

Thanks,
Riley
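
P.S. For anyone who wants a rough picture of the routing idea before reading the PR, here is a minimal sketch. It is not the code in the PR: the class names (TileBackend, NexusprotoBackend, ZarrBackend, TileRouter), the find_tiles signature, and the 'store_type' / 'path' field names are all made up for illustration, and the list of dicts simply stands in for documents read from Solr's nexusdatasets collection.

```python
from abc import ABC, abstractmethod


class TileBackend(ABC):
    """Hypothetical common interface that each storage backend implements."""

    @abstractmethod
    def find_tiles(self, dataset, start, end):
        ...


class NexusprotoBackend(TileBackend):
    def find_tiles(self, dataset, start, end):
        # Placeholder: a real backend would query Solr and the tile store.
        return [f"nexusproto:{dataset}:{start}-{end}"]


class ZarrBackend(TileBackend):
    def __init__(self, path):
        self.path = path  # local path or s3:// URL of the Zarr store

    def find_tiles(self, dataset, start, end):
        # Placeholder: a real backend would open the Zarr store and subset it.
        return [f"zarr:{self.path}:{start}-{end}"]


class TileRouter:
    """Routes each call to the backend its dataset is registered under."""

    def __init__(self, dataset_docs):
        self._nexusproto = NexusprotoBackend()
        self._backends = {}
        for doc in dataset_docs:
            self.add_dataset(doc)

    def add_dataset(self, doc):
        # 'store_type' and 'path' are assumed field names for this sketch.
        if doc.get("store_type") == "zarr":
            self._backends[doc["name"]] = ZarrBackend(doc["path"])
        else:
            self._backends[doc["name"]] = self._nexusproto

    def remove_dataset(self, name):
        self._backends.pop(name, None)

    def find_tiles(self, dataset, start, end):
        try:
            backend = self._backends[dataset]
        except KeyError:
            raise ValueError(f"Unknown dataset: {dataset}") from None
        return backend.find_tiles(dataset, start, end)
```

Callers would only ever talk to the router, e.g. TileRouter(docs).find_tiles("some_dataset", t0, t1), so existing code does not need to know which backend holds a given dataset. Adding or removing a Zarr dataset at runtime only touches the mapping.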