The use case is effectively a circular buffer implemented with 
`concat` + `slice`. Here is the code:

```

from mxnet import gluon, nd, profiler
profiler.set_config(profile_all=True, aggregate_stats=True,
                    filename='/home/ec2-user/src/mkl_slice_op_profile.json')


class TestBlock(gluon.HybridBlock):
    def __init__(self):
        super(TestBlock, self).__init__()
        with self.name_scope():
            self.conv = gluon.nn.Conv2D(512, kernel_size=(1, 3), dilation=512)

    def hybrid_forward(self, F, x):
        out = self.conv(x)
        x = F.concat(x, out, dim=3)
        x = F.slice_axis(x, axis=3, begin=-1025, end=None)
        # x = F.slice(x, begin=(None, None, None, -1025), end=(None, None, None, None))
        return x


x = nd.random.uniform(shape=(32, 512, 1, 1025))
net = TestBlock()
net.initialize()
net.hybridize(static_alloc=True, static_shape=True)
x = net(x)

profiler.set_state('run')
for _ in range(100):
    x = net(x)

nd.waitall()
profiler.set_state('stop')
profiler.dump()
print(profiler.dumps(reset=True))
exit(0)
```
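To make the circular-buffer behaviour explicit, here is a minimal shape-arithmetic sketch in plain Python (no MXNet needed). It assumes the standard output-size formula for a dilated convolution with stride 1 and no padding; the helper names `conv_out_width` and `push` are made up for illustration and are not part of MXNet.

```python
def conv_out_width(in_w, kernel, dilation, stride=1, pad=0):
    """Output width of a convolution along one axis (standard formula)."""
    return (in_w + 2 * pad - dilation * (kernel - 1) - 1) // stride + 1


WINDOW = 1025  # width of the buffer along axis 3

# kernel width 3 with dilation 512 spans 1025 input columns,
# so each forward pass produces exactly one new column.
new_cols = conv_out_width(WINDOW, kernel=3, dilation=512)

# concat appends the new column; the slice keeps only the last WINDOW columns.
concat_w = WINDOW + new_cols
kept_w = len(range(concat_w)[-WINDOW:])

print(new_cols, concat_w, kept_w)  # 1 1026 1025


def push(buf, new_items, size):
    """List analogue of concat + slice: append, then drop the oldest items."""
    return (buf + new_items)[-size:]


print(push([0, 1, 2, 3], [4], size=4))  # [1, 2, 3, 4]
```

So every call to `net(x)` shifts the buffer left by one column and appends the newest convolution output, which is why the input and output shapes match and the loop can feed `x` back in.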

And here are the interesting profiling results.
1. Profile with mxnet package and `slice_axis` operator, i.e. the code as 
written above (**no MKL**). To profile `slice` instead, comment out the 
`slice_axis` line in `hybrid_forward()` and uncomment the `slice` line.
```
operator
=================
Name                          Total Count        Time (ms)    Min Time (ms)    Max Time (ms)    Avg Time (ms)
----                          -----------        ---------    -------------    -------------    -------------
slice_axis                            200        4048.8311          20.1010          20.3790          20.2442
Concat                                200       17641.7461          88.0750          89.5890          88.2087
Convolution                           200        2944.2839          14.5890          14.8890          14.7214
DeleteVariable                        206         517.0800           0.0030           2.6670           2.5101
```
2. Profile with mxnet package and `slice` operator (**no MKL**) (**Consistently 
performs ~2% better than `slice_axis`!!**)
```
operator
=================
Name                          Total Count        Time (ms)    Min Time (ms)    Max Time (ms)    Avg Time (ms)
----                          -----------        ---------    -------------    -------------    -------------
slice                                 200        3938.1279          19.5190          19.9520          19.6906
Concat                                200       17636.0566          88.0600          88.7120          88.1803
Convolution                           200        2945.0759          14.5760          14.8420          14.7254
DeleteVariable                        206         521.2870           0.0030           2.6960           2.5305
```
3. Profile with mxnet-mkl package and `slice_axis` operator (**with MKLDNN**)
```
operator
=================
Name                          Total Count        Time (ms)    Min Time (ms)    Max Time (ms)    Avg Time (ms)
----                          -----------        ---------    -------------    -------------    -------------
Reorder                               202           2.9610           0.0000           1.3190           0.0147
slice_axis                            200        4979.5488          24.6100          26.1240          24.8977
Concat                                200         881.7350           4.3000           4.5370           4.4087
Convolution                           200        1231.0720           5.9080          11.6130           6.1554
DeleteVariable                        408         982.9400           0.0030           2.8100           2.4092
```
4. Profile with mxnet-mkl package and `slice` operator (**with MKLDNN**)
```
operator
=================
Name                          Total Count        Time (ms)    Min Time (ms)    Max Time (ms)    Avg Time (ms)
----                          -----------        ---------    -------------    -------------    -------------
Reorder                               202           2.8510           0.0000           1.2710           0.0141
slice                                 200        5012.6240          24.8500          27.0280          25.0631
Concat                                200         880.1710           4.2900           4.5270           4.4009
Convolution                           200        1252.7841           5.9060          11.7800           6.2639
DeleteVariable                        408         970.0030           0.0040           2.8370           2.3775
```
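To read the four tables side by side, a quick ratio of the average times helps. The snippet below is only arithmetic on the `Avg Time (ms)` values reported above; the dictionary and variable names are made up for illustration, and the `Concat` and `Convolution` figures are taken from the two `slice_axis` runs (profiles 1 and 3).

```python
# Avg Time (ms) per operator, copied from the profiles above.
avg_ms = {
    "no_mkl": {
        "slice_axis": 20.2442,   # profile 1
        "slice": 19.6906,        # profile 2
        "Concat": 88.2087,       # profile 1
        "Convolution": 14.7214,  # profile 1
    },
    "mkldnn": {
        "slice_axis": 24.8977,   # profile 3
        "slice": 25.0631,        # profile 4
        "Concat": 4.4087,        # profile 3
        "Convolution": 6.1554,   # profile 3
    },
}

# Ratio > 1 means the MKLDNN build is faster for that operator,
# ratio < 1 means it is slower.
speedup = {
    op: avg_ms["no_mkl"][op] / avg_ms["mkldnn"][op]
    for op in avg_ms["no_mkl"]
}

for op, ratio in speedup.items():
    print(f"{op:<12} {ratio:6.2f}x")

# Relative gain of slice over slice_axis without MKL, in percent.
slice_gain_pct = 100 * (1 - avg_ms["no_mkl"]["slice"] / avg_ms["no_mkl"]["slice_axis"])
print(f"slice vs slice_axis (no MKL): {slice_gain_pct:.1f}% faster")
```

On these numbers, `Concat` is roughly 20x faster and `Convolution` about 2.4x faster with MKLDNN, while `slice_axis` and `slice` are about 23% and 27% slower respectively.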

[ Full content available at: 
https://github.com/apache/incubator-mxnet/issues/12303 ]