Cameron Goodale created CLIMATE-321:
---------------------------------------

             Summary: dataset_processor._get_subregion_slice_indices cannot 
handle imprecise subregion input
                 Key: CLIMATE-321
                 URL: https://issues.apache.org/jira/browse/CLIMATE-321
             Project: Apache Open Climate Workbench
          Issue Type: Bug
          Components: regridding/data processing
    Affects Versions: 0.3-incubating
         Environment: *nix
            Reporter: Cameron Goodale
            Assignee: Cameron Goodale
             Fix For: 0.4-incubating


Mazi and I worked on this issue today.

Error Message:
File 
"/Users/cgoodale/Documents/workspace/apache-climate/ocw/dataset_processor.py", 
line 700, in _get_subregion_slice_indices
    latStart = np.nonzero(target_dataset.lats == subregion.lat_min)[0][0]
IndexError: index out of bounds

To explain this issue I will use a small example dataset: a 4 x 4 grid, so 16 
elements in total, with lats = [0, 1, 2, 3] and lons = [0, 1, 2, 3]. This uses 
a 1 degree grid step to keep things simple.

Now if you ask for a subset of this dataset using EXISTING lat, lon values, 
such as lat_min: 1, lat_max: 2 and lon_min: 1, lon_max: 2, then the code 
works fine.

Things fall off the rails if you try using lat_min: 0.5 (or any other value 
that doesn't fall EXACTLY onto the dataset grid you are trying to subset).

Mazi ran into this issue and we determined the problem is here in 
dataset_processor.py

{code}
latStart = np.nonzero(target_dataset.lats == subregion.lat_min)[0][0]
latEnd = np.nonzero(target_dataset.lats == subregion.lat_max)[0][0]

lonStart = np.nonzero(target_dataset.lons == subregion.lon_min)[0][0]
lonEnd = np.nonzero(target_dataset.lons == subregion.lon_max)[0][0]
{code}

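The code is looking for == (exact) matches within the dataset. When a subregion 
bound does not equal any grid value, np.nonzero returns an empty index array, 
and indexing it with [0][0] raises the IndexError shown above.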
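
To make the failure concrete, here is a minimal sketch that reproduces it on the 4-element lats axis from the example. It only assumes NumPy is installed. The variable names are illustrative and are not taken from dataset_processor.py.

```python
import numpy as np

lats = np.array([0, 1, 2, 3])

# On-grid bound: exactly one element matches, so taking [0] is safe.
exact = np.nonzero(lats == 1)[0]

# Off-grid bound: nothing equals 0.5, so the index array is empty.
missing = np.nonzero(lats == 0.5)[0]
print(missing.size)  # 0

try:
    lat_start = missing[0]  # same indexing step as the failing line
except IndexError as err:
    lat_start = None
    print("IndexError:", err)
```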

The plan is to make the function return values that are WITHIN the subregion.  
So using the example above, if I want to subset the 4 x 4 grid ( lats = [0, 1, 
2, 3] and lons = [0, 1, 2, 3] ) using:
lat_min: 0.5
lat_max: 2.5
lon_min: 0.5
lon_max: 2.5

I would get the following subset  lats = [1, 2] and lons = [1, 2]
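
One way this "within the subregion" behavior could be sketched is shown here. This is not the patch itself. The helper name subregion_slice_indices is hypothetical, and the sketch assumes each axis is a 1-D array sorted in ascending order.

```python
import numpy as np

def subregion_slice_indices(values, lower, upper):
    """Hypothetical helper: first and last index with lower <= value <= upper.

    Assumes `values` is 1-D and sorted in ascending order.
    """
    values = np.asarray(values)
    # Inclusive range test instead of an exact == match.
    inside = np.nonzero((values >= lower) & (values <= upper))[0]
    if inside.size == 0:
        raise ValueError("no grid points fall inside [%s, %s]" % (lower, upper))
    return inside[0], inside[-1]

lats = np.array([0, 1, 2, 3])
lons = np.array([0, 1, 2, 3])

lat_start, lat_end = subregion_slice_indices(lats, 0.5, 2.5)
lon_start, lon_end = subregion_slice_indices(lons, 0.5, 2.5)

# The end index is inclusive, so add 1 when slicing.
subset_lats = lats[lat_start:lat_end + 1]
subset_lons = lons[lon_start:lon_end + 1]
```

With the bounds 0.5 and 2.5 this gives lats = [1, 2] and lons = [1, 2]. Bounds that fall exactly on the grid (1 and 2) select the same points, so the existing on-grid behavior is preserved.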

A Review Board request with the patch will be posted shortly.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
