-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11873/#review21929
-----------------------------------------------------------


http://svn.apache.org/repos/asf/incubator/climate/trunk/rcmet/src/main/python/rcmes/toolkit/metrics.py
<https://reviews.apache.org/r/11873/#comment45226>

    Why does this function's name end in 2? If we do have two versions of a function, we need descriptive names to differentiate them instead of numbers.


http://svn.apache.org/repos/asf/incubator/climate/trunk/rcmet/src/main/python/rcmes/toolkit/metrics.py
<https://reviews.apache.org/r/11873/#comment45227>

    When comparing one dataset to another, we need to standardize on the following convention: evaluationData, referenceData instead of dataset1, dataset2. This way users will know which dataset is which just by reading the variable names. calcPatternCorrelation already follows this convention.


http://svn.apache.org/repos/asf/incubator/climate/trunk/rcmet/src/main/python/rcmes/toolkit/metrics.py
<https://reviews.apache.org/r/11873/#comment45228>

    I propose the same dataset naming change here, i.e. evaluationData, referenceData.
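    To make the convention concrete, a metric written this way would read like the sketch below. The function name and body are illustrative only, not the actual metrics.py code:

        import numpy as np

        # Illustrative sketch only -- not the real metrics.py implementation.
        # The parameter names make the role of each input obvious at the call site.
        def calcMeanBias(evaluationData, referenceData):
            '''Return the mean difference between an evaluation dataset and
            the reference dataset it is being judged against.'''
            return np.mean(evaluationData - referenceData)

    A call like calcMeanBias(modelData, obsData) then documents itself, whereas calcMeanBias(dataset1, dataset2) leaves the reader guessing which argument is the baseline.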
http://svn.apache.org/repos/asf/incubator/climate/trunk/rcmet/src/main/python/rcmes/toolkit/metrics.py
<https://reviews.apache.org/r/11873/#comment45229>

    In a future JIRA issue we need to work on pulling these user-input blocks out of the function. Leaving the raw_input calls in here means the function cannot be called from an external script or from the GUI. A different approach would be to push these out into a high-level script and pass the values in.


http://svn.apache.org/repos/asf/incubator/climate/trunk/rcmet/src/main/python/rcmes/toolkit/metrics.py
<https://reviews.apache.org/r/11873/#comment45230>

    Good catch here.


- Cameron Goodale
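One way to do the decoupling described in the raw_input comment, sketched with hypothetical names (calcSeasonalMetric and the prompt text are made up for illustration; written in Python 2 style to match the code under review):

    import numpy as np

    # After the refactor: the function takes plain arguments and never
    # touches stdin, so the GUI or an external script can call it directly.
    def calcSeasonalMetric(evaluationData, referenceData, startMonth, endMonth):
        months = slice(startMonth - 1, endMonth)
        return np.mean(evaluationData[months] - referenceData[months])

    # The raw_input calls move up into a thin driver script that owns
    # all user interaction and simply passes the answers in.
    if __name__ == '__main__':
        evaluationData = np.random.rand(12, 100, 100)
        referenceData = np.random.rand(12, 100, 100)
        startMonth = int(raw_input('Enter the start month (1-12): '))
        endMonth = int(raw_input('Enter the end month (1-12): '))
        print calcSeasonalMetric(evaluationData, referenceData, startMonth, endMonth)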
On June 14, 2013, 1:52 a.m., Alex Goodman wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11873/
> -----------------------------------------------------------
> 
> (Updated June 14, 2013, 1:52 a.m.)
> 
> 
> Review request for Apache Open Climate, Cameron Goodale and Kyo Lee.
> 
> 
> Description
> -------
> 
> This is a major update to metrics as per CLIMATE-88. There are many changes, most of them related to vectorizing many of the functions using various indexing tricks. If you are not familiar with this terminology, it basically means that operations are performed on an entire array at once, eliminating the need for explicit for loops and improving performance and code readability in most cases. I'll summarize some of the resulting changes here:
> 
> -For the most part, the absolute performance increases were not that large, but the code became significantly more concise. A few of the functions (particularly the correlation calculation) showed very large gains. You can run the attached benchmark_metrics.py script to see for yourself (see the end of this post for more details).
> 
> -Thanks to the addition of the reshapeMonthlyData() helper function in my previous patch to misc.py, the explicit use of datetimes as a parameter is no longer needed for many of the functions.
> 
> -The names of some variables were changed to adhere to our current coding conventions.
> 
> -Functions that were commented out have been removed, in compliance with our deprecation policy.
> 
> To run the benchmarking script, assuming you have the rcmes directory in your PYTHONPATH, do:
> 
> python benchmark_metrics.py
> 
> This will benchmark the functions that were changed between revisions, using 10 years of randomly generated data on a 100 x 100 grid. To change the number of years of data generated for the test, do:
> 
> python benchmark_metrics.py nYR
> 
> where 'nYR' is the number of years of data you wish to use for the benchmark.
> 
> Finally, you can check whether the revised functions are consistent with their previous versions by running the script in test mode:
> 
> python benchmark_metrics.py -t
> 
> This does not cover every possible test case, but from current testing everything seems to work fine. Keep in mind, though, that the functions are tested against the revisions in the repository and not against Jinwon's upcoming revisions, so if a previously used function was wrong, then so is the revised one.
> 
> 
> Diffs
> -----
> 
>   http://svn.apache.org/repos/asf/incubator/climate/trunk/rcmet/src/main/python/rcmes/toolkit/metrics.py 1492816 
> 
> Diff: https://reviews.apache.org/r/11873/diff/
> 
> 
> Testing
> -------
> 
> -Randomly generated masked arrays via the attached benchmark_metrics.py
> -Some of the TRMM data
> 
> 
> Thanks,
> 
> Alex Goodman
> 
> 
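As an aside for readers unfamiliar with the vectorization the quoted request describes, here is a generic before/after example (not code taken from metrics.py):

    import numpy as np

    evaluationData = np.random.rand(120, 100, 100)  # 10 years of monthly fields
    referenceData = np.random.rand(120, 100, 100)

    # Explicit loops: compute the temporal mean bias one grid cell at a time.
    nT, nY, nX = evaluationData.shape
    bias = np.zeros((nY, nX))
    for j in range(nY):
        for i in range(nX):
            bias[j, i] = np.mean(evaluationData[:, j, i] - referenceData[:, j, i])

    # Vectorized: one whole-array operation, no explicit loops.
    biasVectorized = np.mean(evaluationData - referenceData, axis=0)

    assert np.allclose(bias, biasVectorized)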
