connorgoggins opened a new pull request #17571: [OpPerf] Fixed native output 
ordering, added warmup & runs command line args
URL: https://github.com/apache/incubator-mxnet/pull/17571
 
 
   ## Description ##
   Here, I fix the ordering issue that arises when operator perf results are 
generated with the native profiler. Previously, the ordering of keys in the 
generated output dict was arbitrary and illogical. There were cases where 
`avg_time_forward_` perf results were displayed after `avg_time_backward_` perf 
results (when running with `opperf.py`), and where memory consumption was 
displayed between forward and backward perf results (when running categories of 
operators individually). Furthermore, `inputs` was never displayed first in 
the output.
   
   To solve these problems, I traced the construction of the results dictionary 
back to the saving of forward/backward perf results and memory consumption data 
in `profiler_utils.py`. I found that the entry for `avg_time_backward_` was 
consistently being added to the dictionary before the entry for 
`avg_time_forward_` (since it appeared first in the raw profile results, and 
was consequently encountered first in the iteration). I implemented a fix that 
saves each result to a variable during iteration and adds the variable values 
to the dict in the correct order once iteration finishes, as sketched below.
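
   A minimal sketch of that fix, assuming the raw profile results arrive as a 
dict keyed by metric name (the function name and key prefixes here are 
illustrative, not the actual `profiler_utils.py` internals):

```python
def build_op_profile(raw_results):
    """Collect metrics during iteration, then emit them in a fixed order."""
    avg_forward = avg_backward = memory = None
    for key, value in raw_results.items():
        if key.startswith("avg_time_forward_"):
            avg_forward = (key, value)
        elif key.startswith("avg_time_backward_"):
            avg_backward = (key, value)
        elif key.startswith("max_storage_mem_alloc_"):
            memory = (key, value)

    # Python dicts preserve insertion order (3.7+), so inserting
    # forward -> backward -> memory fixes the display order regardless
    # of the order in which the raw results were encountered.
    ordered = {}
    for entry in (avg_forward, avg_backward, memory):
        if entry is not None:
            ordered[entry[0]] = entry[1]
    return ordered
```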
   
   After making this change, I found another issue: `merge_map_list()` 
constructed the output map in a way that was insensitive to the order of the 
list of dictionaries (and the keys within those dictionaries) that it was 
passed. I rewrote `merge_map_list()` to be sensitive to the order of its input 
list, as well as the order of the keys within the dictionaries of each list 
element. Then, in `benchmark_utils.py`, I passed the `inputs` map to 
`merge_map_list()` before the other entries to ensure its position as the first 
key in the resulting dictionary.
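
   An order-sensitive `merge_map_list()` can be as simple as the following 
sketch (the actual implementation may differ; it relies on Python 3.7+ 
insertion-order-preserving dicts):

```python
def merge_map_list(map_list):
    """Merge a list of dicts, preserving list order and per-dict key order."""
    merged = {}
    for d in map_list:
        for key, value in d.items():
            merged[key] = value
    return merged

# Passing the inputs map first guarantees it becomes the first key.
merged = merge_map_list([{"inputs": [{"data": (1024, 1024)}]},
                         {"avg_time_forward_broadcast_add": 0.1234}])
```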
   
   Even after making these changes, I found that the output of `opperf.py` was 
still ordered incorrectly. The root cause was in the `json.dump` call in 
`common_utils.py`: the `sort_keys` parameter was set to `True`, which sorted 
the keys of every dictionary in the output JSON alphabetically (overriding the 
ordering I had established). The purpose of that sorting was to display the 
operator keys alphabetically in the resulting JSON. To retain the alphabetical 
operator sorting while preserving the per-operator ordering, I disabled the 
`sort_keys` parameter and sorted only the highest-level keys in the final 
output map in `opperf.py`.
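
   A sketch of the serialization change (the function and variable names are 
illustrative, not necessarily the ones in `common_utils.py`):

```python
import json

def save_to_file(benchmark_map, output_file):
    # Sort only the top-level operator keys; the nested per-operator dicts
    # keep their insertion order because sort_keys defaults to False.
    sorted_map = {op: benchmark_map[op] for op in sorted(benchmark_map)}
    with open(output_file, "w") as f:
        json.dump(sorted_map, f, indent=4)
```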
   
   When I finished making these changes, I wanted to quickly test them with 
`opperf.py`, but found that there was no way to specify `warmup` and `runs` 
from the command line. Since this would be a very useful feature for anyone 
testing operator performance, I added both arguments.
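
   A sketch of how the new arguments could be wired up with `argparse` (the 
flag names match the description above; the defaults shown are assumptions, 
not necessarily the values used in `opperf.py`):

```python
import argparse

parser = argparse.ArgumentParser(description="Run MXNet operator benchmarks")
parser.add_argument("--warmup", type=int, default=25,
                    help="number of warmup iterations before timing begins")
parser.add_argument("--runs", type=int, default=100,
                    help="number of timed iterations per operator")
args = parser.parse_args()

# args.warmup and args.runs are then threaded through to the benchmark
# runner calls so every operator uses the requested iteration counts.
```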
   
   ## Checklist ##
   ### Essentials ###
   Please feel free to remove inapplicable items for your PR.
   - [x] Changes are complete (i.e. I finished coding on this PR)
   - [x] All changes have test coverage
   - [x] Code is well-documented
   - [x] To the best of my knowledge, examples are either not affected by this 
change, or have been fixed to be compatible with this change
   
   ### Changes ###
   - M benchmark/opperf/opperf.py
   - M benchmark/opperf/utils/benchmark_utils.py
   - M benchmark/opperf/utils/common_utils.py
   - M benchmark/opperf/utils/profiler_utils.py
   
   ## Comments ##
   Tested on macOS by:
   1. Checking out the branch and calling individual operator category testing 
functions (e.g. `run_mx_misc_operators_benchmarks`)
   2. Checking out the branch and running `opperf.py` (a full run of all ops) 
with JSON output
   
   ## Performance Results ##
   [Full OpPerf test 
(CPU)](https://gist.github.com/connorgoggins/55ddec675d5bc5afb9dac34eb2d5cd84)
   
   [Full OpPerf test 
(GPU)](https://gist.github.com/connorgoggins/cc0a35c35fbe9d8f5fb84cf597d32d20)
   
   
   @apeforest @access2rohit @ChaiBapchya 
