ArmageddonKnight commented on issue #14973: [MXNET-1404] Added the GPU memory 
profiler
URL: https://github.com/apache/incubator-mxnet/pull/14973#issuecomment-495481289
 
 
   Hi @anirudh2290 ,
   
   Thanks for your valuable feedback. Let me record a list of Discussion and 
TODO items:
   
   ## Discussion
   
   - [ ] Preprocessor Directives (1. 2.). 
   - [ ]
   - [ ] Imperative Support (4.). We should **NOT** drop support for the 
imperative approach, as doing so would significantly increase the amount of GPU 
memory allocated under the `unknown` tag. For instance, most optimizer states 
are currently allocated in a purely imperative way: almost all the current 
optimizer implementations (e.g., SGD, Adam) initialize their states with 
`mx.nd.zeros`. If we drop imperative support, all of those allocations will 
fall into the `unknown` category, which can reach `1 GB` for large models.
   - [ ] Profiler API Integration (5. 6. 7. 8.). The GPU memory profiler 
differs from the existing profilers in many ways: (1) It does not use the 
`chrome://tracing` visualization backend, because it needs to accept the 
user's input defining the **keyword dictionaries for grouping storage tags** 
(also, I do not see a good way of visualizing percentages using 
`chrome://tracing`). (2) Because it 
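
To make the two points above more concrete, here is a minimal sketch of how storage tags could be grouped with a user-supplied keyword dictionary and turned into percentages. It is plain Python and does not use the profiler from this PR: `KEYWORD_GROUPS`, `group_of`, `memory_share`, and the sample tags and byte counts are all hypothetical and only illustrate the idea. Anything that matches no keyword ends up in `unknown`, which is where untagged imperative allocations would land.

```python
from collections import defaultdict

# Hypothetical keyword dictionary: group name -> substrings searched for in a tag.
KEYWORD_GROUPS = {
    "weights": ["weight", "bias"],
    "optimizer_states": ["optimizer", "momentum"],
    "activations": ["fwd", "activation"],
}


def group_of(tag, keyword_groups):
    """Return the first group with a keyword occurring in the tag, else 'unknown'."""
    for group, keywords in keyword_groups.items():
        if any(keyword in tag for keyword in keywords):
            return group
    return "unknown"


def memory_share(allocations, keyword_groups):
    """Sum the bytes per group and convert each sum to a percentage of the total."""
    totals = defaultdict(int)
    for tag, nbytes in allocations:
        totals[group_of(tag, keyword_groups)] += nbytes
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {group: 100.0 * nbytes / grand_total for group, nbytes in totals.items()}


# Made-up (tag, bytes) records; the untagged allocation carries the tag "unknown".
allocations = [
    ("fc1_weight", 400),
    ("fc1_fwd_output", 300),
    ("unknown", 200),
    ("sgd_momentum_fc1", 100),
]
shares = memory_share(allocations, KEYWORD_GROUPS)
```

In this sketch the dictionary is ordinary user input, which is the part that is hard to express through a fixed trace viewer; the resulting per-group percentages are what a plotter would consume.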
   
   ## TODO 
   
   - [ ] Add a minimal working example in the `example` directory to show how 
`SETME`, the analyzer, and the plotter work.
   - [ ]
