nswamy commented on a change in pull request #13105: [MXNET-1158] JVM Memory 
Management Documentation
URL: https://github.com/apache/incubator-mxnet/pull/13105#discussion_r236888172
 
 

 ##########
 File path: scala-package/memory-management.md
 ##########
 @@ -0,0 +1,117 @@
+# JVM Memory Management
+The Scala and Java bindings of Apache MXNet use native memory (the C++ heap, in either RAM or GPU memory) for most MXNet Scala objects, such as NDArray, Symbol, Executor, KVStore, and Data Iterators. The Scala classes associated with them act as wrappers.
+Operations on these objects are directed to the MXNet C++ backend via JNI for performance, so the bytes are also stored in the native heap for fast access.   
+
+The JVM's Garbage Collector manages only objects allocated on the JVM heap and is not aware of the memory footprint of these objects in native memory, so allocation and de-allocation of native memory has to be managed by MXNet Scala.  
+Allocating native memory is straightforward: it is done during construction of the object by calling the associated C++ API through JNI. However, since JVM languages do not have destructors, de-allocating these objects is problematic and has to be done explicitly.
+To make this easier, MXNet Scala provides a few modes of operation.
+
+## Memory Management in Scala 
+### [ResourceScope.using](https://github.com/apache/incubator-mxnet/blob/master/scala-package/core/src/main/scala/org/apache/mxnet/ResourceScope.scala#L106) (Recommended)
+`ResourceScope.using` provides the familiar Java try-with-resources primitive in Scala and extends it to automatically manage the memory of all MXNet objects created in the associated code block (`body`) by tracking the allocations in a stack.
+If an MXNet object or an Iterable containing MXNet objects is returned from the code block, it is automatically excluded from de-allocation in the current scope and moved to the outer scope if ResourceScopes are stacked.  
+
+**Usage** 
+```
+ResourceScope.using() {
+    // Bind the values returned by the inner scope so they are visible here
+    val (r3, r4) = ResourceScope.using() {
+        val r1 = NDArray.ones(Shape(2, 2))
+        val r2 = NDArray.ones(Shape(3, 4))
+        val r3 = NDArray.ones(Shape(5, 6))
+        val r4 = NDArray.ones(Shape(7, 8))
+        (r3, r4)
+    }
+    r4
+}
+```
+In the example above, two ResourceScopes are stacked. Four NDArrays (`r1`, `r2`, `r3`, `r4`) are created in the inner scope, and the inner scope returns `(r3, r4)`. ResourceScope recognizes that it should not de-allocate these objects and automatically moves `r3` and `r4` to the outer scope. The outer scope returns `r4` from its code block, so `ResourceScope.using` removes it from its list of objects to be de-allocated. All other objects are released automatically by calling the C++ backend to free their native memory.
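
The scope-stacking behaviour described above can be sketched without MXNet. The following is a minimal illustration, not the actual `ResourceScope` implementation: `Native` and `Scope` are hypothetical names, closing a `Native` only flips a flag instead of freeing native memory, and the scope stack is a plain variable, so the sketch is not thread-safe.

```scala
import scala.collection.mutable

// Hypothetical stand-in for an MXNet object that owns native memory.
final class Native(val name: String) extends AutoCloseable {
  var closed = false
  // Register with the innermost open scope, if there is one.
  Scope.current.foreach(_.track(this))
  override def close(): Unit = closed = true
}

// Hypothetical scope: remembers every Native created while it is open.
final class Scope {
  private val tracked = mutable.LinkedHashSet.empty[Native]
  def track(n: Native): Unit = tracked += n
  def untrack(n: Native): Unit = tracked -= n
  def closeAll(): Unit = { tracked.foreach(_.close()); tracked.clear() }
}

object Scope {
  // Innermost scope is at the head of the list.
  private var stack: List[Scope] = Nil
  def current: Option[Scope] = stack.headOption

  def using[A](body: => A): A = {
    val scope = new Scope
    stack = scope :: stack
    try {
      val result = body
      // Returned objects escape this scope and are handed to the parent, if any.
      escaping(result).foreach { n =>
        scope.untrack(n)
        stack.tail.headOption.foreach(_.track(n))
      }
      result
    } finally {
      stack = stack.tail
      scope.closeAll() // everything still tracked here is released
    }
  }

  // Finds Native objects in a returned value, collection, or tuple.
  private def escaping(result: Any): Seq[Native] = result match {
    case n: Native       => Seq(n)
    case it: Iterable[_] => it.collect { case n: Native => n }.toSeq
    case p: Product      => p.productIterator.collect { case n: Native => n }.toSeq
    case _               => Nil
  }
}
```

With these definitions, returning a tuple from an inner `Scope.using` block hands its elements to the enclosing scope, and whatever the outermost block returns is left open for the caller to release.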
+
+**Note:**
+Consider stacking ResourceScopes when your application code has layers of functionality that create many MXNet objects, such as NDArrays.
+Otherwise you hold on to all the memory allocated over the entire training loop and will most likely run out of memory, especially on GPUs, which typically have limited memory on the order of 8 to 16 GB.
+For example, when writing training code in MXNet Scala, it is recommended not to use a single, all-encompassing ResourceScope block that runs the entire training code.
+Instead, stack multiple scopes: one for the forward and backward passes on each batch, a second for each epoch, and an outer scope that runs the entire training script, as in the example below.
+```
+// Outer scope: lives for the entire training script
+ResourceScope.using() {
+    val m = Module(...)
+    m.bind()
+    val k = KVStore(...)
+    val itr = MXIterator(..)
+    val num_epochs: Int = 100
+    ...
+    for (i <- 0 until num_epochs) {
+        // Second scope: released at the end of each epoch
+        ResourceScope.using() {
+            while (itr.hasNext) {
+                // Innermost scope: released after each batch
+                ResourceScope.using() {
+                    val dataBatch = itr.next()
+                    m.forward(dataBatch)
+                    m.backward(dataBatch)
+                    m.update()
+                }
+            }
+        }
+    }
+}
+
+```  
+       
+### Using Phantom References (Recommended for some use cases)
+
+Apache MXNet uses [Phantom References](https://docs.oracle.com/javase/8/docs/api/java/lang/ref/PhantomReference.html) to track all MXNet objects that have native memory associated with them.
+When the Garbage Collector runs, it identifies unreachable Scala/Java objects in the JVM heap and finalizes them.
+The Garbage Collector then enqueues the phantom references of objects that are ready to be reclaimed into a reference queue, at which point MXNet performs pre-mortem cleanup by calling the MXNet backend C++ API to free the native memory.
+ 
+With this approach, you do not have to write any special code to have native memory cleaned up. However, it depends solely on the Garbage Collector running and finding unreachable objects.
+You can request a collection by calling `System.gc()` at strategic points, such as at the end of an epoch or at the end of a mini-batch during training.
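
The mechanism can be illustrated with the JDK alone. This sketch is an assumption about the general technique, not MXNet's code: `Wrapper`, `NativeRef` and `NativeTracker` are hypothetical names, and the counter stands in for the C++ call that would free the native memory.

```scala
import java.lang.ref.{PhantomReference, ReferenceQueue}
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical JVM wrapper; `handle` stands in for a native pointer.
final class Wrapper(val handle: Long)

// The phantom reference stores the handle itself, never the wrapper,
// so it does not keep the wrapper reachable.
final class NativeRef(referent: Wrapper, val handle: Long, queue: ReferenceQueue[Wrapper])
    extends PhantomReference[Wrapper](referent, queue)

object NativeTracker {
  private val queue = new ReferenceQueue[Wrapper]
  // Hold the references strongly so they survive until they are enqueued.
  private val live = ConcurrentHashMap.newKeySet[NativeRef]()
  // Counts simulated frees; a real binding would call the C++ backend instead.
  val freed = new AtomicInteger(0)

  def allocate(handle: Long): Wrapper = {
    val w = new Wrapper(handle)
    live.add(new NativeRef(w, handle, queue))
    w
  }

  // Releases the native side of every wrapper the GC has found unreachable
  // and returns how many were released in this call.
  def drain(): Int = {
    var released = 0
    var ref = queue.poll()
    while (ref != null) {
      val r = ref.asInstanceOf[NativeRef]
      live.remove(r)
      freed.incrementAndGet() // placeholder for freeing r.handle natively
      released += 1
      ref = queue.poll()
    }
    released
  }
}
```

A caller would invoke `NativeTracker.drain()` periodically, for example after `System.gc()` at the end of an epoch. Because `System.gc()` is only a request, the number of references drained at any given point is not guaranteed.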
+
+This approach can be suitable for use cases such as inference on CPUs when your system has a large amount of memory (RAM).  
 
 Review comment:
   Yes

