Cameron Goodale created CLIMATE-248:
---------------------------------------

             Summary: PERFORMANCE - Rebinning Daily to Monthly datasets takes a really long time
                 Key: CLIMATE-248
                 URL: https://issues.apache.org/jira/browse/CLIMATE-248
             Project: Apache Open Climate Workbench
          Issue Type: Improvement
          Components: regridding
    Affects Versions: 0.1-incubating, 0.2-incubating
         Environment: *nix
            Reporter: Cameron Goodale
            Assignee: Cameron Goodale
             Fix For: 0.3-incubating


When I was testing the dataset_processor module I noticed that most tests would 
complete in less than 1 second.  Then I came across the 
"test_daily_to_monthly_rebin" test and it would take over 2 minutes to complete.

The test initially used a 1x1 degree grid covering the globe and a daily time 
step for 2 years (730 days).
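
For a sense of scale, here is a rough size estimate for that test array. It 
assumes the 1x1 degree global grid is 180 x 360 cells and that values are stored 
as 64-bit floats; neither is stated above, so treat the numbers as approximate.

```python
import numpy as np

# Assumed dimensions: 730 daily steps on a 180 x 360 (1x1 degree) global grid.
n_days, n_lat, n_lon = 730, 180, 360

n_elements = n_days * n_lat * n_lon
n_bytes = n_elements * np.dtype(np.float64).itemsize  # assumes float64 storage

print(f"{n_elements:,} elements, about {n_bytes / 1e6:.0f} MB per full-size array")
```

Under those assumptions, every array with the same shape as the full dataset 
costs a few hundred megabytes to allocate and fill.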

I ran some initial checks, and the lag appears to be in 
'_rcmes_calc_average_on_new_time_unit_K', in the code where the data is rebinned:

{code}
                mask = np.zeros_like(data)
                mask[timeunits!=myunit,:,:] = 1.0
                # Calculate missing data mask within each time unit...
                datamask_at_this_timeunit = np.zeros_like(data)
                datamask_at_this_timeunit[:] = process.create_mask_using_threshold(data[timeunits==myunit,:,:], threshold=0.75)
                # Store results for masking later
                datamask_store.append(datamask_at_this_timeunit[0])
                # Calculate means for each pixel in this time unit, ignoring missing data (using masked array).
                datam = ma.masked_array(data, np.logical_or(mask, datamask_at_this_timeunit))
                meanstore[i,:,:] = ma.average(datam, axis=0)
{code}

That block is suspect since the rest of the code is doing simple string parsing 
and appending to lists.  I don't have the time to do a deep dive into this now, 
and the code technically isn't broken, just really slow.
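
One possible explanation, offered as a sketch rather than a confirmed diagnosis: 
for every time unit the block allocates two arrays the size of the full dataset 
and then averages a masked array over the entire time axis, so the work grows 
with (number of time units) x (total time steps).  Slicing out only the time 
steps that belong to the current unit would avoid that.  The function name 
`monthly_means_sliced` below is hypothetical, and the rule "mask a pixel when 
more than 75% of its values in the unit are missing" is an assumption about what 
`process.create_mask_using_threshold` does, not a copy of it.

```python
import numpy as np
import numpy.ma as ma


def monthly_means_sliced(data, timeunits, threshold=0.75):
    """Average `data` (time, lat, lon) within each distinct value of `timeunits`.

    Hypothetical helper: only the time steps belonging to each unit are
    sliced out, so no full-size mask arrays are allocated per unit.
    """
    data = ma.asarray(data)
    timeunits = np.asarray(timeunits)
    units = np.unique(timeunits)
    meanstore = ma.zeros((len(units),) + data.shape[1:])

    for i, myunit in enumerate(units):
        # Keep only the time steps in this unit: shape (n_steps, lat, lon).
        chunk = data[timeunits == myunit]
        # Count missing values per pixel within this unit.
        n_missing = ma.getmaskarray(chunk).sum(axis=0)
        # Assumed rule: mask a pixel when more than `threshold` of its
        # values in this unit are missing.
        too_sparse = n_missing > threshold * chunk.shape[0]
        # Masked-array mean ignores missing values at the remaining pixels.
        meanstore[i] = ma.masked_where(too_sparse, chunk.mean(axis=0))

    return units, meanstore
```

A call would look like `units, means = monthly_means_sliced(data, timeunits)`, 
where `timeunits` holds one label (for example 201301) per time step.  This has 
not been checked against the existing function's output, so it would need a 
regression comparison before replacing anything.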



