[GitHub] ThomasDelteil commented on a change in pull request #10900: [MXNET-414] Tutorial on visualizing CNN decisions using Grad-CAM

2018-05-16 Thread GitBox

ThomasDelteil commented on a change in pull request #10900: [MXNET-414] 
Tutorial on visualizing CNN decisions using Grad-CAM
URL: https://github.com/apache/incubator-mxnet/pull/10900#discussion_r188721635
 
 

 ##
 File path: docs/tutorials/vision/cnn_visualization.md
 ##
 @@ -0,0 +1,250 @@
+# Visualizing Decisions of Convolutional Neural Networks
+
+Convolutional Neural Networks have made a lot of progress in Computer Vision.
On some tasks, their accuracy matches that of humans. However, it remains hard to
explain the predictions of convolutional neural networks.
+
+It is often helpful to be able to explain why a model made the prediction it
made. For example, when a model misclassifies an image, it is hard to say why
without visualizing the network's decision.
+
+<img src="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/example/cnn_visualization/volcano_barn_spider.png" alt="Explaining the misclassification of volcano as spider" width=500px/>
+
+Visualizations also help build confidence in the predictions of a model.
For example, even if a model correctly predicts birds as birds, we would want
to confirm that the model bases its decision on the features of the bird and not
on the features of some other object that tends to occur together with birds in
the dataset (like leaves).
+
+In this tutorial, we show how to visualize the predictions made by
convolutional neural networks using Gradient-weighted Class Activation Mapping
(Grad-CAM). Unlike many other visualization methods, Grad-CAM can be used on a
wide variety of CNN model families without architectural changes or re-training:
CNNs with fully connected layers, CNNs used for structured outputs (e.g.
captioning), and CNNs used in tasks with multi-modal input (e.g. VQA) or
reinforcement learning.
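
As a rough illustration of what Grad-CAM computes, here is a minimal pure-Python sketch based on the description in the Grad-CAM paper, not on the `gradcam` module used in this tutorial. Each channel of a chosen convolutional layer is weighted by the spatial average of the class-score gradient for that channel, the weighted feature maps are summed, and a ReLU keeps only the positive evidence. The function name `grad_cam_map` and the toy numbers are hypothetical.

```python
def grad_cam_map(activations, gradients):
    """Combine feature maps into a Grad-CAM heatmap.

    activations, gradients: lists of K channels, each an H x W nested list.
    gradients[k][i][j] is d(class score) / d(activations[k][i][j]).
    """
    height = len(activations[0])
    width = len(activations[0][0])

    # Channel weight = gradient averaged over all spatial positions.
    weights = [
        sum(sum(row) for row in grad) / float(height * width)
        for grad in gradients
    ]

    # Weighted sum of the feature maps, followed by ReLU.
    heatmap = [[0.0] * width for _ in range(height)]
    for weight, fmap in zip(weights, activations):
        for i in range(height):
            for j in range(width):
                heatmap[i][j] += weight * fmap[i][j]
    return [[max(value, 0.0) for value in row] for row in heatmap]


# Toy example: two 2x2 feature maps with made-up values.
activations = [[[1.0, 2.0], [3.0, 4.0]],
               [[4.0, 3.0], [2.0, 1.0]]]
gradients = [[[1.0, 1.0], [1.0, 1.0]],      # average gradient  1.0
             [[-2.0, -2.0], [-2.0, -2.0]]]  # average gradient -2.0
cam = grad_cam_map(activations, gradients)
print(cam)
```

In practice the heatmap would be resized to the input image and overlaid on it; that step is left out here.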
+
+In the rest of this notebook, we will explain how to visualize predictions
made by [VGG-16](https://arxiv.org/abs/1409.1556). We begin by importing the
required dependencies. The `gradcam` module contains the implementation of the
visualization techniques used in this notebook.
+
+```python
+from __future__ import print_function
+
+import mxnet as mx
+from mxnet import gluon
+
+from matplotlib import pyplot as plt
+import numpy as np
+import cv2
+
+gradcam_file = "gradcam.py" 
+base_url = "https://raw.githubusercontent.com/indhub/mxnet/cnnviz/example/cnn_visualization/{}?raw=true"
+mx.test_utils.download(base_url.format(gradcam_file), fname=gradcam_file)
+import gradcam
+```
+
+## Building the network to visualize
+
+Next, we build the network we want to visualize. For this example, we will use
the [VGG-16](https://arxiv.org/abs/1409.1556) network. This code was taken from
the Gluon [model
zoo](https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/gluon/model_zoo/vision/vgg.py)
and refactored to make it easy to switch between `gradcam`'s and Gluon's
implementations of ReLU and Conv2D. The same code can be used for both training
and visualization with a minor (one line) change.
+
+Notice that we import `Activation` (used here for ReLU) and `Conv2D` from the
`gradcam` module instead of `mxnet.gluon.nn`.
+- We use a modified ReLU because we use guided backpropagation for
visualization, and guided backprop requires the ReLU layer to block the backward
flow of negative gradients, which correspond to neurons that decrease the
activation of the higher layer unit we aim to visualize. See
[this paper](https://arxiv.org/abs/1412.6806) to learn more about guided
backprop.
+- We use a modified Conv2D (a wrapper on top of Gluon's Conv2D) because we
want to capture the output of a given convolutional layer and its gradients.
This is needed to implement Grad-CAM. See
[this paper](https://arxiv.org/abs/1610.02391) to learn more about Grad-CAM.
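
The difference between an ordinary ReLU backward pass and the guided variant described in the first bullet can be sketched in a few lines of plain Python. This is only an illustration of the rule from the guided backprop paper, not the code in the `gradcam` module; the function names are hypothetical.

```python
def relu_backward(inputs, grad_outputs):
    """Standard ReLU backward: pass the gradient where the input was positive."""
    return [g if x > 0 else 0.0 for x, g in zip(inputs, grad_outputs)]


def guided_relu_backward(inputs, grad_outputs):
    """Guided backprop: additionally block negative incoming gradients."""
    return [g if (x > 0 and g > 0) else 0.0
            for x, g in zip(inputs, grad_outputs)]


inputs = [1.0, -1.0, 2.0, 3.0]          # values seen in the forward pass
grad_outputs = [0.5, 0.7, -0.3, 0.2]    # gradients arriving from the layer above

standard = relu_backward(inputs, grad_outputs)
guided = guided_relu_backward(inputs, grad_outputs)
print(standard)  # [0.5, 0.0, -0.3, 0.2]
print(guided)    # [0.5, 0.0, 0.0, 0.2]
```

The third element is the one that differs: the input was positive, so a standard ReLU lets the negative gradient through, while the guided version sets it to zero.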
+
+When you train the network, you could just import `Activation` and `Conv2D` 
from `gluon.nn` instead. No other part of the code needs any change to switch 
between training and visualization.
+
+```python
+import os
+from mxnet.gluon.model_zoo import model_store
+
+from mxnet.initializer import Xavier
+from mxnet.gluon.nn import MaxPool2D, Flatten, Dense, Dropout, BatchNorm
+from gradcam import Activation, Conv2D
+
+class VGG(mx.gluon.HybridBlock):
+    def __init__(self, layers, filters, classes=1000, batch_norm=False, **kwargs):
+        super(VGG, self).__init__(**kwargs)
+        assert len(layers) == len(filters)
+        with self.name_scope():
+            self.features = self._make_features(layers, filters, batch_norm)
+            self.features.add(Dense(4096, activation='relu',
+                                    weight_initializer='normal',
+                                    bias_initializer='zeros'))
+            self.features.add(Dropout(rate=0.5))
+            self.features.add(Dense(4096, activation='relu',
+                                    weight_initializer='normal',
+                                    bias_initializer='zeros'))
+            self.features.add(Dropout(rate=0.5))
+

[GitHub] ThomasDelteil commented on a change in pull request #10900: [MXNET-414] Tutorial on visualizing CNN decisions using Grad-CAM

2018-05-16 Thread GitBox

ThomasDelteil commented on a change in pull request #10900: [MXNET-414] 
Tutorial on visualizing CNN decisions using Grad-CAM
URL: https://github.com/apache/incubator-mxnet/pull/10900#discussion_r188720295
 
 

 ##
 File path: docs/tutorials/vision/cnn_visualization.md
 ##
 @@ -0,0 +1,250 @@

[GitHub] ThomasDelteil commented on a change in pull request #10900: [MXNET-414] Tutorial on visualizing CNN decisions using Grad-CAM

2018-05-16 Thread GitBox

ThomasDelteil commented on a change in pull request #10900: [MXNET-414] 
Tutorial on visualizing CNN decisions using Grad-CAM
URL: https://github.com/apache/incubator-mxnet/pull/10900#discussion_r188724209
 
 

 ##
 File path: docs/tutorials/vision/cnn_visualization.md
 ##
 @@ -0,0 +1,250 @@

[GitHub] ThomasDelteil commented on a change in pull request #10900: [MXNET-414] Tutorial on visualizing CNN decisions using Grad-CAM

2018-05-16 Thread GitBox

ThomasDelteil commented on a change in pull request #10900: [MXNET-414]
Tutorial on visualizing CNN decisions using Grad-CAM
URL: https://github.com/apache/incubator-mxnet/pull/10900#discussion_r188717579

##
File path: docs/tutorials/vision/cnn_visualization.md
##
@@ -0,0 +1,250 @@
+# Visualizing Decisions of Convolutional Neural Networks
+
+Convolutional Neural Networks have made a lot of progress in Computer Vision.
On some tasks, their accuracy matches that of humans. However, it remains hard to
explain the predictions of convolutional neural networks.
+
+It is often helpful to be able to explain why a model made the prediction it
made. For example, when a model misclassifies an image, it is hard to say why
without visualizing the network's decision.
+
+<img src="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/example/cnn_visualization/volcano_barn_spider.png" alt="Explaining the misclassification of volcano as spider" width=500px/>
+
+Visualizations also help build confidence in the predictions of a model.
For example, even if a model correctly predicts birds as birds, we would want
to confirm that the model bases its decision on the features of the bird and not
on the features of some other object that tends to occur together with birds in
the dataset (like leaves).
+
+In this tutorial, we show how to visualize the predictions made by
convolutional neural networks using Gradient-weighted Class Activation Mapping
(Grad-CAM). Unlike many other visualization methods, Grad-CAM can be used on a
wide variety of CNN model families without architectural changes or re-training:
CNNs with fully connected layers, CNNs used for structured outputs (e.g.
captioning), and CNNs used in tasks with multi-modal input (e.g. VQA) or
reinforcement learning.
+
+In the rest of this notebook, we will explain how to visualize predictions
made by [VGG-16](https://arxiv.org/abs/1409.1556). We begin by importing the
required dependencies. The `gradcam` module contains the implementation of the
visualization techniques used in this notebook.

Review comment:
VGG-16 is about 500MB, which can be a big drag on the CI. It would be helpful
if you moved to ResNet-18, MobileNet or DenseNet, which are <50MB.
However, I am not sure how much data you are downloading since you are picking
specific layers. If it is <100MB then I think it is fine to keep VGG.
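
For reference, the "about 500MB" figure can be sanity-checked by counting parameters. The sketch below assumes the standard VGG-16 configuration (3x3 convolutions with biases, `layers=[2, 2, 3, 3, 3]`, `filters=[64, 128, 256, 512, 512]`, a 224x224 RGB input, two 4096-unit dense layers and 1000 classes) and 4 bytes per float32 parameter. The helper name `vgg_parameter_count` is hypothetical, and the real download size depends on the file format and on which layers are actually fetched.

```python
def vgg_parameter_count(layers, filters, classes=1000, in_channels=3,
                        input_size=224, hidden_units=4096):
    """Count weights and biases of a VGG-style network without batch norm."""
    total = 0
    channels = in_channels
    for num_convs, num_filters in zip(layers, filters):
        for _ in range(num_convs):
            # 3x3 kernel weights plus one bias per output channel.
            total += 3 * 3 * channels * num_filters + num_filters
            channels = num_filters

    # Every stage ends with a 2x2 max-pool that halves height and width.
    spatial = input_size // (2 ** len(layers))
    features = channels * spatial * spatial

    # Two hidden dense layers followed by the classifier.
    for units in (hidden_units, hidden_units, classes):
        total += features * units + units
        features = units
    return total


params = vgg_parameter_count([2, 2, 3, 3, 3], [64, 128, 256, 512, 512])
size_mb = params * 4 / 1e6  # float32 parameters, decimal megabytes
print(params, round(size_mb, 1))
```

Under these assumptions the count comes to roughly 138 million parameters, or about 553MB, most of it in the first dense layer.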

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services

[GitHub] ThomasDelteil commented on a change in pull request #10900: [MXNET-414] Tutorial on visualizing CNN decisions using Grad-CAM

2018-05-16 Thread GitBox

ThomasDelteil commented on a change in pull request #10900: [MXNET-414] 
Tutorial on visualizing CNN decisions using Grad-CAM
URL: https://github.com/apache/incubator-mxnet/pull/10900#discussion_r188720460
 
 

 ##
 File path: docs/tutorials/vision/cnn_visualization.md
 ##
 @@ -0,0 +1,250 @@

[GitHub] ThomasDelteil commented on a change in pull request #10900: [MXNET-414] Tutorial on visualizing CNN decisions using Grad-CAM

2018-05-16 Thread GitBox

ThomasDelteil commented on a change in pull request #10900: [MXNET-414] 
Tutorial on visualizing CNN decisions using Grad-CAM
URL: https://github.com/apache/incubator-mxnet/pull/10900#discussion_r188721774
 
 

 ##
 File path: docs/tutorials/vision/cnn_visualization.md
 ##
 @@ -0,0 +1,250 @@

[GitHub] ThomasDelteil commented on a change in pull request #10900: [MXNET-414] Tutorial on visualizing CNN decisions using Grad-CAM

2018-05-16 Thread GitBox

ThomasDelteil commented on a change in pull request #10900: [MXNET-414] 
Tutorial on visualizing CNN decisions using Grad-CAM
URL: https://github.com/apache/incubator-mxnet/pull/10900#discussion_r188716879
 
 

 ##
 File path: docs/tutorials/vision/cnn_visualization.md
 ##
 @@ -0,0 +1,250 @@
+# Visualizing Decisions of Convolutional Neural Networks
+
+Convolutional Neural Networks have made a lot of progress in Computer Vision.
On some tasks, their accuracy matches that of humans. However, it remains hard to
explain the predictions of convolutional neural networks.
 
 Review comment:
   Suggesting: *However it remains hard to explain the predictions of
convolutional neural networks*, as they lack the interpretability offered by
other models, for example decision trees.

   This would help ground the reader in the subject we are talking about.



[GitHub] ThomasDelteil commented on a change in pull request #10900: [MXNET-414] Tutorial on visualizing CNN decisions using Grad-CAM

2018-05-16 Thread GitBox

ThomasDelteil commented on a change in pull request #10900: [MXNET-414] 
Tutorial on visualizing CNN decisions using Grad-CAM
URL: https://github.com/apache/incubator-mxnet/pull/10900#discussion_r188722441
 
 

 ##
 File path: docs/tutorials/vision/cnn_visualization.md
 ##
 @@ -0,0 +1,250 @@
+# Visualizing Decisions of Convolutional Neural Networks
+
+Convolutional Neural Networks have made a lot of progress in Computer Vision. 
Their accuracy is as good as humans in some tasks. However it remains hard to 
explain the predictions of convolutional neural networks.
+
+It is often helpful to be able to explain why a model made the prediction it 
made. For example when a model misclassifies an image, it is hard to say why 
without visualizing the network's decision.
+
+https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/example/cnn_visualization/volcano_barn_spider.png;
 alt="Explaining the misclassification of volcano as spider" width=500px/>
+
+Visualizations also help build confidence about the predictions of a model. 
For example, even if a model correctly predicts birds as birds, we would want 
to confirm that the model bases its decision on the features of bird and not on 
the features of some other object that might occur together with birds in the 
dataset (like leaves).
+
+In this tutorial, we show how to visualize the predictions made by 
convolutional neural networks using Gradient-weighted Class Activation Mapping. 
Unlike many other visualization methods, Grad-CAM can be used on a wide variety 
of CNN model families - CNNs with fully connected layers, CNNs used for 
structural outputs (e.g. captioning), CNNs used in tasks with multi-model input 
(e.g. VQA) or reinforcement learning without architectural changes or 
re-training.
+
+In the rest of this notebook, we will explain how to visualize predictions 
made by [VGG-16](https://arxiv.org/abs/1409.1556). We begin by importing the 
required dependencies. `gradcam` module contains the implementation of 
visualization techniques used in this notebook.
+
+```python
+from __future__ import print_function
+
+import mxnet as mx
+from mxnet import gluon
+
+from matplotlib import pyplot as plt
+import numpy as np
+import cv2
+
+gradcam_file = "gradcam.py" 
+base_url = 
"https://raw.githubusercontent.com/indhub/mxnet/cnnviz/example/cnn_visualization/{}?raw=true;
+mx.test_utils.download(base_url.format(gradcam_file), fname=gradcam_file)
+import gradcam
+```
+
+## Building the network to visualize
+
+Next, we build the network we want to visualize. For this example, we will use 
the [VGG-16](https://arxiv.org/abs/1409.1556) network. This code was taken from 
the Gluon [model 
zoo](https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/gluon/model_zoo/vision/alexnet.py)
 and refactored to make it easy to switch between `gradcam`'s and Gluon's 
implementation of ReLU and Conv2D. Same code can be used for both training and 
visualization with a minor (one line) change.
+
+Notice that we import `Activation` and `Conv2D` from the `gradcam` module 
instead of `mxnet.gluon.nn`.
+- We use a modified ReLU because we use guided backpropagation for 
visualization, and guided backprop requires the ReLU layer to block the backward 
flow of negative gradients, which correspond to the neurons that decrease the 
activation of the higher layer unit we aim to visualize. See the 
[guided backpropagation paper](https://arxiv.org/abs/1412.6806) to learn more.
+- We use a modified Conv2D (a wrapper on top of Gluon's Conv2D) because we 
want to capture the output of a given convolutional layer and its gradients. 
This is needed to implement Grad-CAM. See the 
[Grad-CAM paper](https://arxiv.org/abs/1610.02391) to learn more.
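As a rough illustration of the two points above, the sketch below shows the guided-backprop rule for ReLU and the Grad-CAM heatmap computation in plain NumPy. It is not the code in the `gradcam` module: the function names `guided_relu_backward` and `grad_cam` are hypothetical, and the arrays are assumed to be the captured output of one convolutional layer and the gradient of the class score with respect to it, each of shape `(channels, height, width)`.

```python
import numpy as np


def guided_relu_backward(x, grad_out):
    """Guided-backprop rule for ReLU (hypothetical helper).

    A gradient is passed back only where the forward input was positive
    AND the incoming gradient is positive; everything else is zeroed.
    """
    return np.where((x > 0) & (grad_out > 0), grad_out, 0.0)


def grad_cam(conv_out, conv_grad):
    """Grad-CAM heatmap for one convolutional layer (hypothetical helper).

    conv_out:  layer activations, shape (channels, height, width)
    conv_grad: gradient of the class score w.r.t. conv_out, same shape
    """
    # One weight per channel: the spatially averaged gradient.
    weights = conv_grad.mean(axis=(1, 2))
    # Weighted sum of the activation maps over channels -> (height, width).
    cam = np.tensordot(weights, conv_out, axes=1)
    # Keep only the regions with a positive influence on the class score.
    return np.maximum(cam, 0)


# Toy inputs standing in for values captured during a real backward pass.
activations = np.random.rand(4, 7, 7)
gradients = np.random.randn(4, 7, 7)
heatmap = grad_cam(activations, gradients)
print(heatmap.shape)
```

In practice the heatmap would be resized to the input image size and overlaid on the image. Combining it with the guided-backprop saliency map gives the higher-resolution visualization described in the Grad-CAM paper.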
+
+When you train the network, you can simply import `Activation` and `Conv2D` 
from `mxnet.gluon.nn` instead. No other part of the code needs to change to 
switch between training and visualization.
+
+```python
+import os
+from mxnet.gluon.model_zoo import model_store
+
+from mxnet.initializer import Xavier
+from mxnet.gluon.nn import MaxPool2D, Flatten, Dense, Dropout, BatchNorm
+from gradcam import Activation, Conv2D
+
+class VGG(mx.gluon.HybridBlock):
+    def __init__(self, layers, filters, classes=1000, batch_norm=False, **kwargs):
+        super(VGG, self).__init__(**kwargs)
+        assert len(layers) == len(filters)
+        with self.name_scope():
+            self.features = self._make_features(layers, filters, batch_norm)
+            self.features.add(Dense(4096, activation='relu',
+                                    weight_initializer='normal',
+                                    bias_initializer='zeros'))
+            self.features.add(Dropout(rate=0.5))
+            self.features.add(Dense(4096, activation='relu',
+                                    weight_initializer='normal',
+                                    bias_initializer='zeros'))
+            self.features.add(Dropout(rate=0.5))
+

[GitHub] ThomasDelteil commented on a change in pull request #10900: [MXNET-414] Tutorial on visualizing CNN decisions using Grad-CAM

2018-05-16 Thread GitBox

ThomasDelteil commented on a change in pull request #10900: [MXNET-414] 
Tutorial on visualizing CNN decisions using Grad-CAM
URL: https://github.com/apache/incubator-mxnet/pull/10900#discussion_r188723058
 
 

 ##
 File path: docs/tutorials/vision/cnn_visualization.md
 ##
 @@ -0,0 +1,250 @@

[GitHub] ThomasDelteil commented on a change in pull request #10900: [MXNET-414] Tutorial on visualizing CNN decisions using Grad-CAM

2018-05-16 Thread GitBox

ThomasDelteil commented on a change in pull request #10900: [MXNET-414] 
Tutorial on visualizing CNN decisions using Grad-CAM
URL: https://github.com/apache/incubator-mxnet/pull/10900#discussion_r188718174
 
 

 ##
 File path: docs/tutorials/vision/cnn_visualization.md
 ##
 @@ -0,0 +1,250 @@
+```python
+from __future__ import print_function
+
+import mxnet as mx
+from mxnet import gluon
+
+from matplotlib import pyplot as plt
+import numpy as np
+import cv2
+
+gradcam_file = "gradcam.py" 
+base_url = "https://raw.githubusercontent.com/indhub/mxnet/cnnviz/example/cnn_visualization/{}?raw=true"
 
 Review comment:
   Remember to switch that to `apache/incubator-mxnet` after it is merged, or 
consider creating a separate PR to merge this file first.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] ThomasDelteil commented on a change in pull request #10900: [MXNET-414] Tutorial on visualizing CNN decisions using Grad-CAM

2018-05-16 Thread GitBox

ThomasDelteil commented on a change in pull request #10900: [MXNET-414] 
Tutorial on visualizing CNN decisions using Grad-CAM
URL: https://github.com/apache/incubator-mxnet/pull/10900#discussion_r188721229
 
 

 ##
 File path: docs/tutorials/vision/cnn_visualization.md
 ##
 @@ -0,0 +1,250 @@

[GitHub] ThomasDelteil commented on a change in pull request #10900: [MXNET-414] Tutorial on visualizing CNN decisions using Grad-CAM

2018-05-16 Thread GitBox

ThomasDelteil commented on a change in pull request #10900: [MXNET-414] 
Tutorial on visualizing CNN decisions using Grad-CAM
URL: https://github.com/apache/incubator-mxnet/pull/10900#discussion_r188719470
 
 

 ##
 File path: docs/tutorials/vision/cnn_visualization.md
 ##
 @@ -0,0 +1,250 @@

[GitHub] ThomasDelteil commented on a change in pull request #10900: [MXNET-414] Tutorial on visualizing CNN decisions using Grad-CAM

2018-05-16 Thread GitBox

ThomasDelteil commented on a change in pull request #10900: [MXNET-414] 
Tutorial on visualizing CNN decisions using Grad-CAM
URL: https://github.com/apache/incubator-mxnet/pull/10900#discussion_r188723657
 
 

 ##
 File path: docs/tutorials/vision/cnn_visualization.md
 ##
 @@ -0,0 +1,250 @@

[GitHub] ThomasDelteil commented on a change in pull request #10900: [MXNET-414] Tutorial on visualizing CNN decisions using Grad-CAM

2018-05-16 Thread GitBox

Review comment:
is there a Grad-CAM paper to link to?

