[ 
https://issues.apache.org/jira/browse/CLIMATE-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cameron Goodale updated CLIMATE-236:
------------------------------------

    Description: 
Currently within the rcmes code base there is an ability to take in multiple 
datasets and return an 'ensemble' dataset.  This ensemble is just the mean of 
all the input datasets.

The plan has been revised to simply use the numpy.mean() function with 
axis=0, assuming the input datasets are stacked along axis 0.

Expected input would be a list of dataset objects, so the overall shape of 
the list would be:

(len(datasets), len(times), len(lats), len(lons)), so calling 
numpy.mean(datasets, axis=0) would average all the datasets in one line.
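
As a rough illustration of the revised plan, here is a minimal sketch. It 
assumes each dataset can be represented by a plain NumPy array of shape 
(len(times), len(lats), len(lons)); the variable names and sample values are 
hypothetical and are not taken from the ocw code base.

```python
import numpy as np

# Hypothetical stand-ins for the value arrays of three datasets,
# each shaped (len(times), len(lats), len(lons)).
n_times, n_lats, n_lons = 2, 3, 4
datasets = [np.full((n_times, n_lats, n_lons), float(v)) for v in (1, 2, 6)]

# Stacking the list gives (len(datasets), len(times), len(lats), len(lons)).
stacked = np.array(datasets)

# Averaging over axis 0 collapses the dataset axis into one ensemble array.
ensemble = np.mean(datasets, axis=0)
```

Note that numpy.mean() on a list of masked arrays may not honour the masks, 
so masked inputs would probably need numpy.ma instead.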

  was:
Currently within the rcmes code base there is an ability to take in multiple 
datasets and return an 'ensemble' dataset.  This ensemble is just the mean of 
all the input datasets.

The plan is to grab this code block (around line 250) from do_data_prep:

{code}
    # TODO:  Refactor this into a function within the toolkit module
    # compute the simple multi-obs ensemble if multiple obs are used
    if numOBSs > 1:
        print 'numOBSs = ', numOBSs
        oData = obsData
        print 'oData shape = ', oData.shape
        obsData = ma.zeros((numOBSs + 1, nT, ngrdY, ngrdX))
        print 'obsData shape = ', obsData.shape
        avg = ma.zeros((nT, ngrdY, ngrdX))
        
        for i in np.arange(numOBSs):
            obsData[i, :, :, :] = oData[i, :, :, :]
            avg[:, :, :] = avg[:, :, :] + oData[i, :, :, :]

        avg = avg / float(numOBSs)
        # store the model-ensemble data
        obsData[numOBSs, :, :, :] = avg[:, :, :]
        # update the number of obs data to include the model ensemble
        numOBSs = numOBSs + 1
        obsList.append('ENS-OBS')
{code}
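
For comparison, the core of that block (average over the first axis, then 
append the average as an extra slice) could be sketched with numpy.ma as 
follows. The function name append_obs_ensemble is hypothetical. One 
difference to be aware of: the loop above sums with masked-array addition, 
which propagates masks, whereas the masked mean used here skips masked cells 
at each grid point.

```python
import numpy as np
import numpy.ma as ma


def append_obs_ensemble(obs_data):
    """Return obs_data with its mean over axis 0 appended as a final slice.

    obs_data is assumed to be a masked array shaped
    (numOBSs, nT, ngrdY, ngrdX).
    """
    # Masked mean over the dataset axis; masked cells are left out.
    avg = obs_data.mean(axis=0)
    # Give avg a leading axis of length 1 so it can be concatenated.
    return ma.concatenate([obs_data, avg[np.newaxis, ...]], axis=0)


# Two tiny "datasets" with one missing value in the first.
obs = ma.masked_invalid(np.array([[[[1.0, np.nan]]], [[[3.0, 4.0]]]]))
result = append_obs_ensemble(obs)
```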

Port all of that into a private function in dataset_processor called 
_rcmes_make_dataset_ensemble() and set up the dataset_processor.ensemble() 
function to use it initially.

Once the code move, documentation, and unit tests are all complete, I will 
resolve this issue.

    
> Add Dataset Ensemble Support to the ocw.dataset_processor module
> ----------------------------------------------------------------
>
>                 Key: CLIMATE-236
>                 URL: https://issues.apache.org/jira/browse/CLIMATE-236
>             Project: Apache Open Climate Workbench
>          Issue Type: Sub-task
>          Components: rcmet
>    Affects Versions: 0.2-incubating
>         Environment: *nix
>            Reporter: Cameron Goodale
>            Assignee: Cameron Goodale
>             Fix For: 0.3-incubating
>
>   Original Estimate: 4h
>          Time Spent: 3h
>  Remaining Estimate: 1h
>
> Currently within the rcmes code base there is an ability to take in multiple 
> datasets and return an 'ensemble' dataset.  This ensemble is just the mean of 
> all the input datasets.
> The plan has been revised to simply use the numpy.mean() function with 
> axis=0, assuming the input datasets are stacked along axis 0.
> Expected input would be a list of dataset objects, so the overall shape of 
> the list would be:
> (len(datasets), len(times), len(lats), len(lons)), so calling 
> numpy.mean(datasets, axis=0) would average all the datasets in one line.

