sandeep-krishnamurthy closed pull request #9066: FCN example updates
URL: https://github.com/apache/incubator-mxnet/pull/9066

This is a PR merged from a forked repository. As GitHub hides the original
diff on merge, it is displayed below for the sake of provenance:

diff --git a/example/fcn-xs/README.md b/example/fcn-xs/README.md
index 66ae08fe71..145aa31cb7 100644
--- a/example/fcn-xs/README.md
+++ b/example/fcn-xs/README.md
@@ -1,6 +1,7 @@
-FCN-xs EXAMPLES
----------------
-This folder contains the examples of image segmentation in MXNet.
+FCN-xs EXAMPLE
+--------------
+This folder contains an example implementation of Fully Convolutional Networks (FCN) in MXNet.  
+The example is based on the [FCN paper](https://arxiv.org/abs/1411.4038) by Long et al. of UC Berkeley.
 
 ## Sample results
 ![fcn-xs pasval_voc result](https://github.com/dmlc/web-data/blob/master/mxnet/image/fcnxs-example-result.jpg)
@@ -17,32 +18,36 @@ We have trained a simple fcn-xs model, the hyper-parameters are below:
 
 The training dataset size is only 2027, and the validation dataset size is 462.  
 
-## How to train fcn-xs in mxnet
-#### Getting Started
+## Training the model
+
+### Step 1: Set up prerequisites
 
 - Install the python package `Pillow` (required by `image_segmentaion.py`).
 ```shell
-[sudo] pip install Pillow
+pip install --upgrade Pillow
 ```
-- Assume that we are in a working directory, such as `~/train_fcn_xs`, and MXNet is built as `~/mxnet`. Now, copy example scripts into working directory.
+- Set up your working directory. Assume your working directory is `~/train_fcn_xs` and MXNet is built at `~/mxnet`. Copy the example scripts into the working directory.
 ```shell
 cp ~/mxnet/example/fcn-xs/* .
 ```
-#### Step1: Download the vgg16fc model and experiment data
-* vgg16fc model : you can download the ```VGG_FC_ILSVRC_16_layers-symbol.json``` and ```VGG_FC_ILSVRC_16_layers-0074.params``` [baidu yun](http://pan.baidu.com/s/1bgz4PC), [dropbox](https://www.dropbox.com/sh/578n5cxej7ofd6m/AACuSeSYGcKQDi1GoB72R5lya?dl=0).
+### Step 2: Download the vgg16fc model and training data
+* vgg16fc model: you can download ```VGG_FC_ILSVRC_16_layers-symbol.json``` and ```VGG_FC_ILSVRC_16_layers-0074.params``` from [baidu yun](http://pan.baidu.com/s/1bgz4PC) or [dropbox](https://www.dropbox.com/sh/578n5cxej7ofd6m/AACuSeSYGcKQDi1GoB72R5lya?dl=0).
 this is the fully convolution style of the origin [VGG_ILSVRC_16_layers.caffemodel](http://www.robots.ox.ac.uk/~vgg/software/very_deep/caffe/VGG_ILSVRC_16_layers.caffemodel), and the corresponding [VGG_ILSVRC_16_layers_deploy.prototxt](https://gist.github.com/ksimonyan/211839e770f7b538e2d8#file-vgg_ilsvrc_16_layers_deploy-prototxt), the vgg16 model has [license](http://creativecommons.org/licenses/by-nc/4.0/) for non-commercial use only.
-* experiment data : you can download the ```VOC2012.rar``` [robots.ox.ac.uk](http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar), and extract it. the file/folder will be like:  
-```JPEGImages folder```, ```SegmentationClass folder```, ```train.lst```, ```val.lst```, ```test.lst```
+* Training data: download the VOC2012 dataset from [robots.ox.ac.uk](http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar), and extract it into ```./VOC2012```
+* Mapping files: download ```train.lst``` and ```val.lst``` from [baidu yun](http://pan.baidu.com/s/1bgz4PC) into the ```./VOC2012``` directory
+
+Once you have completed these steps, your working directory should contain a ```./VOC2012``` directory with the following: a ```JPEGImages``` folder, a ```SegmentationClass``` folder, ```train.lst```, and ```val.lst```
 
-#### Step2: Train fcn-xs model
-* Configure GPU/CPU for training in `fcn_xs.py`.
+### Step 3: Train the fcn-xs model
+* Based on your hardware, configure GPU or CPU for training in `fcn_xs.py`. It is recommended to use a GPU due to the computational complexity and data volume.
 ```python
 # ctx = mx.cpu(0)
 ctx = mx.gpu(0)
 ```
-* If you want to train the fcn-8s model, it's better for you trained the fcn-32s and fcn-16s model firstly.
-when training the fcn-32s model, run in shell ```./run_fcnxs.sh```, the script in it is:
+* It is recommended to train the fcn-32s and fcn-16s models before training the fcn-8s model.
+
+To train the fcn-32s model, run the following:
 ```shell
 python -u fcn_xs.py --model=fcn32s --prefix=VGG_FC_ILSVRC_16_layers --epoch=74 --init-type=vgg16
 ```
@@ -64,14 +69,15 @@ INFO:root:Epoch[0] Batch [350]  Speed: 1.12 samples/sec Train-accuracy=0.912080
 ```
 
 ## Using the pre-trained model for image segmentation
-* Similarly, you should first download the pre-trained model from [yun.baidu](http://pan.baidu.com/s/1bgz4PC), the symbol and model file is ```FCN8s_VGG16-symbol.json```, ```FCN8s_VGG16-0019.params```
-* Then put the image in your directory for segmentation, and change the ```img = YOUR_IMAGE_NAME``` in ```image_segmentaion.py```
-* At last, use ```image_segmentaion.py``` to segmentation one image by running in shell ```python image_segmentaion.py```, then you will get the segmentation image like the sample results above.
+To try out the pre-trained model, follow these steps:
+* Download the pre-trained symbol and weights from [yun.baidu](http://pan.baidu.com/s/1bgz4PC). You will need these files: ```FCN8s_VGG16-symbol.json``` and ```FCN8s_VGG16-0019.params```
+* Run the segmentation script, providing your input image path: ```python image_segmentaion.py --input <your JPG image path>```
+* The segmented output ```.png``` file will be generated in the working directory
 
 ## Tips
-* This is the whole image size training, that is to say, we do not need resize/crop the image to the same size, so the batch_size during training is set to 1.
-* The fcn-xs model is based on vgg16 model, with some crop, deconv, element-sum layer added, so the model is quite big, moreover, the example is using whole image size training, if the input image is large(such as 700*500), then it may consume lots of memories, so I suggest you using the GPU with 12G memory.
-* If you don't have GPU with 12G memory, maybe you should change the ```cut_off_size``` to a small value when you construct your FileIter, like this:  
+* This example runs full image size training, so there is no need to resize or crop input images to the same size. Accordingly, batch_size during training is set to 1.
+* The fcn-xs model is based on the vgg16 model, with crop, deconv, and element-sum layers added, so the model is quite big. Moreover, since the example uses full image size training, a large input image (such as 700*500) can make memory consumption high. For that reason, I suggest using a GPU with at least 12GB of memory for training.
+* If you don't have access to a GPU with 12GB of memory for training, I suggest setting ```cut_off_size``` to a small value when constructing the FileIter, as shown in the example below:  
 ```python
 train_dataiter = FileIter(
       root_dir             = "./VOC2012",
@@ -80,4 +86,4 @@ train_dataiter = FileIter(
       rgb_mean             = (123.68, 116.779, 103.939),
       )
 ```
-* We are looking forward you to making this example more powerful, thanks.
+
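
A side note on the GPU/CPU configuration step above: before launching a long training run, it can be worth verifying that the chosen context is actually usable. Below is a minimal sketch, not part of this PR, that assumes only the standard `mx.nd` array API and falls back to the CPU when no GPU is present:

```python
import mxnet as mx

# Prefer the GPU; fall back to the CPU if the device is unavailable.
ctx = mx.gpu(0)
try:
    # NDArray allocation is lazy, so force evaluation to surface device errors.
    mx.nd.zeros((1,), ctx=ctx).wait_to_read()
except mx.MXNetError:
    ctx = mx.cpu(0)
print("training context:", ctx)
```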
diff --git a/example/fcn-xs/image_segmentaion.py b/example/fcn-xs/image_segmentaion.py
index ddd850fe4e..75df2d128a 100644
--- a/example/fcn-xs/image_segmentaion.py
+++ b/example/fcn-xs/image_segmentaion.py
@@ -15,38 +15,68 @@
 # specific language governing permissions and limitations
 # under the License.
 
-# pylint: skip-file
+"""
+This module encapsulates running an image segmentation model for inference.
+
+Example usage:
+    $ python image_segmentaion.py --input <your JPG image path>
+"""
+
+import argparse
+import os
 import numpy as np
 import mxnet as mx
 from PIL import Image
 
-def getpallete(num_cls):
-    # this function is to get the colormap for visualizing the segmentation mask
-    n = num_cls
-    pallete = [0]*(n*3)
-    for j in xrange(0,n):
-            lab = j
-            pallete[j*3+0] = 0
-            pallete[j*3+1] = 0
-            pallete[j*3+2] = 0
-            i = 0
-            while (lab > 0):
-                    pallete[j*3+0] |= (((lab >> 0) & 1) << (7-i))
-                    pallete[j*3+1] |= (((lab >> 1) & 1) << (7-i))
-                    pallete[j*3+2] |= (((lab >> 2) & 1) << (7-i))
-                    i = i + 1
-                    lab >>= 3
-    return pallete
+def make_file_extension_assertion(extension):
+    """Function factory for file extension argparse assertion
+        Args:
+            extension (string): the file extension to assert
+
+        Returns:
+            function: an argparse type checker that returns the file path if it has the supplied extension.
+
+    """
+    def file_extension_assertion(file_path):
+        base, ext = os.path.splitext(file_path)
+        if ext.lower() != extension:
+            raise argparse.ArgumentTypeError('File must have ' + extension + ' extension')
+        return file_path
+    return file_extension_assertion
+
+def get_palette(num_colors=256):
+    """generates the colormap for visualizing the segmentation mask
+            Args:
+                num_colors (int): the number of colors to generate in the output palette
 
-pallete = getpallete(256)
-img = "./person_bicycle.jpg"
-seg = img.replace("jpg", "png")
-model_previx = "FCN8s_VGG16"
-epoch = 19
-ctx = mx.gpu(0)
+            Returns:
+                list: the generated palette as a flat list of num_colors*3 ints (one RGB triplet per class index)
+
+    """
+    palette = [0]*(num_colors*3)
+    for j in range(0, num_colors):
+        lab = j
+        palette[j*3+0] = 0
+        palette[j*3+1] = 0
+        palette[j*3+2] = 0
+        i = 0
+        while lab > 0:
+            palette[j*3+0] |= (((lab >> 0) & 1) << (7-i))
+            palette[j*3+1] |= (((lab >> 1) & 1) << (7-i))
+            palette[j*3+2] |= (((lab >> 2) & 1) << (7-i))
+            i = i + 1
+            lab >>= 3
+    return palette
 
 def get_data(img_path):
-    """get the (1, 3, h, w) np.array data for the img_path"""
+    """get the (1, 3, h, w) np.array data for the supplied image
+                Args:
+                    img_path (string): the input image path
+
+                Returns:
+                    np.array: image data in a (1, 3, h, w) shape
+
+    """
     mean = np.array([123.68, 116.779, 103.939])  # (R,G,B)
     img = Image.open(img_path)
     img = np.array(img, dtype=np.float32)
@@ -58,18 +88,37 @@ def get_data(img_path):
     return img
 
 def main():
-    fcnxs, fcnxs_args, fcnxs_auxs = mx.model.load_checkpoint(model_previx, epoch)
-    fcnxs_args["data"] = mx.nd.array(get_data(img), ctx)
+    """Module main execution"""
+    # Initialization variables - update to change your model and execution context
+    model_prefix = "FCN8s_VGG16"
+    epoch = 19
+
+    # By default this example runs on the CPU; change to mx.gpu() to execute on the GPU
+    ctx = mx.cpu()
+
+    fcnxs, fcnxs_args, fcnxs_auxs = mx.model.load_checkpoint(model_prefix, epoch)
+    fcnxs_args["data"] = mx.nd.array(get_data(args.input), ctx)
     data_shape = fcnxs_args["data"].shape
     label_shape = (1, data_shape[2]*data_shape[3])
     fcnxs_args["softmax_label"] = mx.nd.empty(label_shape, ctx)
-    exector = fcnxs.bind(ctx, fcnxs_args ,args_grad=None, grad_req="null", aux_states=fcnxs_args)
+    exector = fcnxs.bind(ctx, fcnxs_args, args_grad=None, grad_req="null", aux_states=fcnxs_args)
     exector.forward(is_train=False)
     output = exector.outputs[0]
     out_img = np.uint8(np.squeeze(output.asnumpy().argmax(axis=1)))
     out_img = Image.fromarray(out_img)
-    out_img.putpalette(pallete)
-    out_img.save(seg)
+    out_img.putpalette(get_palette())
+    out_img.save(args.output)
 
 if __name__ == "__main__":
+    # Handle command line arguments
+    parser = argparse.ArgumentParser(description='Run VGG16-FCN-8s to segment an input image')
+    parser.add_argument('--input',
+                        required=True,
+                        type=make_file_extension_assertion('.jpg'),
+                        help='The segmentation input JPG image')
+    parser.add_argument('--output',
+                        default='segmented.png',
+                        type=make_file_extension_assertion('.png'),
+                        help='The segmentation output PNG image')
+    args = parser.parse_args()
     main()
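
For reference, an end-to-end invocation of the updated script could look like the sketch below. It assumes ```FCN8s_VGG16-symbol.json``` and ```FCN8s_VGG16-0019.params``` have already been downloaded into the working directory; `person_bicycle.jpg` is only an illustrative file name, carried over from the old hard-coded default:

```shell
pip install --upgrade Pillow
python image_segmentaion.py --input person_bicycle.jpg --output segmented.png
```

On the rewritten `get_palette`: it builds the familiar PASCAL VOC colormap by spreading the bits of each class index across the R, G, and B channels, three bits per round, starting from the most significant bit of each byte. A quick sanity check of the first few entries, assuming the function as defined in the diff:

```python
palette = get_palette(4)
triplets = [tuple(palette[j*3:j*3+3]) for j in range(4)]
# class 0 -> (0, 0, 0): the background is black
# class 1 -> (128, 0, 0), class 2 -> (0, 128, 0), class 3 -> (128, 128, 0)
print(triplets)
```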


 
