+1

PODAAC requested asynchronous support when we proposed SDAP to them a couple of years ago. At that time I made a prototype implementation of asynchronous requests using a RESTful API (https://github.com/apache/incubator-sdap-nexus/pull/107; never merged, and it should be closed now). It was certainly not OGC and certainly not too robust (but I was happy with the simple implementation ;).
I am happy that you made that happen for SDAP.

Thanks
Thomas

From: Huang, Thomas (US 398F) <thomas.hu...@jpl.nasa.gov.INVALID>
Date: Wednesday, August 16, 2023 at 12:26 PM
To: dev@sdap.apache.org <dev@sdap.apache.org>
Subject: Re: [EXTERNAL] VOTE thread for major SDAP change

+1 Yes

Thanks for adding this major capability, Stepheny and team. It is good to see SDAP supporting the OGC Coverages standard. While SDAP is optimized for interactive analytics, thanks to Apache Spark, we know the response time still depends on the volume of data involved in the computation (e.g., data transfer from storage to RAM). I am glad to see we have a solution for large match-up jobs. I support this change.

Has the large job tracking capability been generalized for other large operations, such as climatology generation? Can we use this capability for large data subsetting? If not, it would be great to generalize the current implementation in future SDAP releases.

Thanks
Thomas.

----
Thomas Huang
Group Supervisor, Data Product Generation Software
Instrument Software and Science Data Systems
Jet Propulsion Laboratory, California Institute of Technology
(818) 354-2747

On 8/15/23, 5:33 PM, "Stepheny Perez" <skpe...@apache.org> wrote:

Hi everyone,

I opened a PR for a major change to SDAP here: https://github.com/apache/incubator-sdap-nexus/pull/249

Because this is a major change, I'll open a 72-hour VOTE thread and see if there are any concerns about this change. However you vote, please provide justification.

This change introduces "async jobs" to SDAP. Currently, SDAP only supports synchronous jobs, meaning the API call will hang until the analysis is completed and results are returned to the user. This new async feature will immediately return a job detail response to the user (via a 300 redirect) which the user can then poll until the results are ready. This is important because it adds support for larger jobs; the jobs can take days or weeks if needed.

Please be aware this change is only enabled for the /match_spark endpoint -- no other algorithms are impacted. In order to enable this feature for other algorithms, the results would need to be persisted to Cassandra and the "NexusCalcSparkTornadoHandler" handler would need to be inherited.

The new endpoints utilize the OGC Coverages specification (https://ogcapi.ogc.org/coverages/).

Thanks,
Stepheny
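
From the client side, the submit-then-poll flow described in the proposal could look roughly like the sketch below. It is only an illustration under stated assumptions: the status strings ("running", "successful", "failed", "dismissed") and the shape of the job detail document are hypothetical placeholders, not taken from the PR, and `fetch_status` stands in for whatever HTTP GET a real client would issue against the job URL it was redirected to.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical status strings; the real job detail document may use other names.
TERMINAL_STATES = {"successful", "failed", "dismissed"}


def poll_job(
    fetch_status: Callable[[], Dict[str, Any]],
    interval_s: float = 5.0,
    timeout_s: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Call fetch_status() repeatedly until the job reaches a terminal state.

    fetch_status is supplied by the caller (for example, a function that
    GETs the job URL and returns the parsed JSON body). sleep and clock are
    injectable so the loop can be exercised without real waiting.
    """
    deadline = clock() + timeout_s
    while True:
        job = fetch_status()
        if job.get("status") in TERMINAL_STATES:
            return job
        if clock() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(interval_s)


if __name__ == "__main__":
    # Canned responses standing in for repeated GETs on the job URL.
    canned = iter(
        [{"status": "running"}, {"status": "running"}, {"status": "successful"}]
    )
    print(poll_job(lambda: next(canned), interval_s=0.01, timeout_s=5.0))
```

For jobs that may run for days, a real client would presumably use a much longer interval (or a backoff) and persist the job URL so polling can resume after a restart.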