Hi Riley,

Thank you for working on the extensive changes to add support for
Zarr. Having the capability to register existing Zarr data for
analysis within SDAP is a much-desired feature. Since the changes do
not impact the existing nexusproto implementation, I think it's safe
to merge and release this change.

Thanks,
Nga



On Tue, Oct 17, 2023 at 4:12 PM Riley Kuttruff <[email protected]> wrote:
>
> Hi everyone,
>
> I currently have PRs open for Nexus (#265) and Ingester (#86) to overhaul the 
> data-access component of SDAP to support analysis-ready datasets in other 
> formats without the need to ingest NetCDF files into our own format (data 
> duplication). This support would be useful for users who wish to use SDAP 
> with their existing data (DAACs come to mind here) without needing to have 
> (or pay for) added storage for SDAP tiles.
>
> My initial implementation focuses on Zarr, since that is the format I have 
> the most experience with. Cloud-optimized GeoTIFF support is in progress and 
> nearing readiness (though I'd rather have these changes accepted and merged 
> first than make the PRs even larger). I also plan to investigate Parquet. 
> The PRs support Zarr data stored locally or on AWS S3.
>
> The changes boil down to separating the various backends (nexusproto & Zarr) 
> into their own modules that implement many of the existing nexustiles.py 
> methods. nexustiles.py acts as an interface, routing the existing method 
> calls to the appropriate backend's method. It also maintains a mapping 
> between current datasets and their associated backend, which it builds from 
> Solr's nexusdatasets collection. This collection stores all the needed info 
> for Zarr datasets. Datasets can be added by listing them in the collections 
> config, or dynamically through a set of dataset management endpoints (add, 
> update and delete), which are useful for on-the-fly onboarding, updating 
> (such as rotating AWS keys), and removal of Zarr datasets.
>
> I've done a decent degree of testing with these changes, both locally and by 
> deploying them to JPL SDAP instances (such as 
> https://ideas-digitaltwin.jpl.nasa.gov/nexus/ or see the OCO-3 section of 
> https://github.com/EarthDigitalTwin/FireAlarm-notebooks/blob/dc5c64d9e0311e45f9a3f93908ea6394f5304130/AirQuality_Demo.ipynb).
> So far there's been no indication that they break existing endpoints using 
> the old nexusproto implementation, and the Zarr backend seems reliable.
>
> There's more documentation for these changes on the Nexus PR, and I'm 
> available to address any questions or concerns about them. If need be, I can 
> also write a more thorough gist documenting these changes.
>
> Let me know what you think!
>
> Thanks,
> Riley

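For anyone skimming the thread, the routing design described in the quoted message (nexustiles.py acting as an interface that forwards calls to per-backend modules, plus add/update/delete dataset management) might look roughly like the sketch below. This is only an illustration under my own assumptions, not the code in the PRs: the class names, the find_tiles method, the in-memory dict standing in for Solr's nexusdatasets collection, and the example bucket path are all hypothetical.

```python
from abc import ABC, abstractmethod


class Backend(ABC):
    """Hypothetical common interface that each storage backend implements."""

    @abstractmethod
    def find_tiles(self, dataset):
        """Return placeholder tile identifiers for a dataset."""


class NexusprotoBackend(Backend):
    """Stand-in for the existing ingested-tile backend."""

    def find_tiles(self, dataset):
        return ["nexusproto:" + dataset]


class ZarrBackend(Backend):
    """Stand-in for a backend reading an existing Zarr store in place."""

    def __init__(self, path):
        self.path = path  # assumed to be a local path or an s3:// URL

    def find_tiles(self, dataset):
        return ["zarr:" + self.path]


class TileService:
    """Routes each call to the backend registered for the dataset.

    The dict below stands in for the dataset-to-backend mapping that the
    real service is described as building from Solr.
    """

    def __init__(self):
        self._backends = {}

    def add_dataset(self, name, backend):
        if name in self._backends:
            raise ValueError("dataset already registered: " + name)
        self._backends[name] = backend

    def update_dataset(self, name, backend):
        # e.g. swap in a backend built with rotated credentials
        if name not in self._backends:
            raise KeyError(name)
        self._backends[name] = backend

    def delete_dataset(self, name):
        del self._backends[name]

    def find_tiles(self, dataset):
        # Callers never need to know which backend serves the dataset.
        return self._backends[dataset].find_tiles(dataset)


service = TileService()
service.add_dataset("sst_ingested", NexusprotoBackend())
service.add_dataset("sst_zarr", ZarrBackend("s3://example-bucket/sst.zarr"))
```

The point of the sketch is that existing callers keep using one entry point, so adding a new format should only mean adding another Backend subclass and registering datasets against it.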