[
https://issues.apache.org/jira/browse/SDAP-326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Joseph C. Jacob updated SDAP-326:
---------------------------------
Issue Type: Improvement (was: Task)
> Make ingest processors optional in incubator-sdap-ingester
> ----------------------------------------------------------
>
> Key: SDAP-326
> URL: https://issues.apache.org/jira/browse/SDAP-326
> Project: Apache Science Data Analytics Platform
> Issue Type: Improvement
> Components: collection-ingester, granule-ingester
> Reporter: Joseph C. Jacob
> Priority: Major
>
> h3. The Problem:
> The old *incubator-sdap-ningesterpy* / *incubator-sdap-ningester* required
> that we list the processors to be applied to each dataset at ingest time in
> the configuration file for the dataset. The new *incubator-sdap-ingester*
> applies these processors automatically and has no mechanism to change the
> behavior via a data collection config setting. This is a problem for the
> processor that converts any variable with units "kelvin" to units "celsius",
> because some variables have units "kelvin" but represent a difference from
> a norm and should not be transformed.
> Currently, "*kelvintocelsius*" is the only processor that has been identified
> as one that we need to be able to turn off. However, this may apply to any
> units conversion or to other processors added in the future.
> h3. The Details:
> In particular, for the *{{MUR25-JPL-L4-GLOB-v4.2}}* dataset, we commonly
> ingest both *{{analysed_sst}}* and *{{sst_anomaly}}*, both of which
> natively have units of kelvin. However, *{{sst_anomaly}}* represents a
> difference from some norm and should not be subject to the “subtract 273.15”
> operation: an *{{sst_anomaly}}* of 0 kelvin is still a 0 degree
> “anomaly” or “difference” in degrees Celsius. So, we need to restrict
> which variables get this operation applied to them.
> h3. Proposed Solution:
> I propose to solve this in a way that is not specific to the
> *kelvintocelsius* processor, since other processors may need the same
> treatment in the future. The proposed solution is to add a keyword in the
> *collections-config* where we can list any processors to be turned OFF for a
> dataset. Then we would just need to check that a processor is not in this
> list before applying it. This approach would work for the *kelvintocelsius*
> processor and any other processor that is already supported or is added in
> the future.
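
A minimal sketch of the check described in the proposed solution, not the actual ingester code. The keyword name `disabled_processors`, the registry `DEFAULT_PROCESSORS`, the function `apply_processors`, and the collection ids are all hypothetical and exist only for illustration. Only the processor name "kelvintocelsius" and the 273.15 offset come from the issue text.

```python
from typing import Any, Callable, Dict


def kelvin_to_celsius(value: float) -> float:
    """Illustrative stand-in for the kelvintocelsius processor."""
    return value - 273.15


# Hypothetical registry of processors that would otherwise run automatically.
DEFAULT_PROCESSORS: Dict[str, Callable[[float], float]] = {
    "kelvintocelsius": kelvin_to_celsius,
}


def apply_processors(value: float, collection_config: Dict[str, Any]) -> float:
    """Apply every default processor not listed as disabled for the collection."""
    # "disabled_processors" is an assumed keyword name, not an existing setting.
    disabled = set(collection_config.get("disabled_processors", []))
    for name, processor in DEFAULT_PROCESSORS.items():
        if name in disabled:
            continue  # Skip processors turned OFF for this dataset.
        value = processor(value)
    return value


# Hypothetical collection entries for the two MUR25 variables.
sst_config = {"id": "MUR25-analysed_sst", "variable": "analysed_sst"}
anomaly_config = {
    "id": "MUR25-sst_anomaly",
    "variable": "sst_anomaly",
    "disabled_processors": ["kelvintocelsius"],
}

converted_sst = apply_processors(300.0, sst_config)    # converted to Celsius
kept_anomaly = apply_processors(0.0, anomaly_config)   # left untouched
```

With no `disabled_processors` entry the conversion still runs, so existing collection configs would keep their current behavior under this sketch. Only the anomaly collection opts out.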
--
This message was sent by Atlassian Jira
(v8.3.4#803005)