GitHub user agoodm commented on the issue:
https://github.com/apache/climate/pull/374
@huikyole I'm not sure I agree with your first suggestion. As I understand it,
the `data_source` directory was originally kept separate because it holds
multiple modules, one for loading from each data source. The modules in the
base `ocw` directory, by contrast, each represent an individual step in the
workflow, e.g. dataset processing, running evaluations, and plotting. The main
step missing until now was loading the datasets, which is exactly what this
module aims to do. So for now I think leaving it here is appropriate, though I
think @lewismc should share his thoughts on this too.

I absolutely agree with your second suggestion, though. I originally set it
up this way because, to my recollection, the rest of the OCW codebase
(particularly metrics and evaluations) was designed around a single reference
dataset. Given that Loikith et al. 2013 uses two reanalysis datasets, we should
drop this rigid assumption not only in `dataset_loader.py` but potentially in
`evaluation.py` as well. Changing the former obviously takes precedence for
now, but we should consider exploring the latter as well.