Re: Can upgrade windows CI cmake?

2019-12-06 Thread shiwen hu
The other problems are now solved by modifying CMakeLists.txt, but the
"command line is too long" problem requires a CMake update. However, I don't
know which minimum version fixes it, so I will run some tests to find out the
minimum version.
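
As a side note while testing minimum versions: if the Windows builds use the
Ninja generator (an assumption on my part), one knob that is sometimes
suggested for the long-command-line symptom is forcing response files. This is
only an untested sketch, not a confirmed fix for our CI:

  # Untested sketch: force Ninja to pass compiler/linker arguments via
  # response files instead of on the command line. Whether this avoids the
  # "command line is too long" error on our Windows CI is an assumption.
  set(CMAKE_NINJA_FORCE_RESPONSE_FILE ON CACHE BOOL "Use response files" FORCE)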

Pedro Larroy wrote on Sat, Dec 7, 2019 at 3:52 AM:

> The CMake shipped with Ubuntu has issues when compiling with CUDA on GPU
> instances. I wouldn't recommend anything older than 3.12 for Linux GPU builds.
>
> https://github.com/apache/incubator-mxnet/blob/master/ci/docker/install/ubuntu_core.sh#L63
>
> I don't know about the Windows CMake version, but it would make sense to
> require a newer version.
>
> On Thu, Dec 5, 2019 at 7:26 PM Lausen, Leonard 
> wrote:
>
> > Currently we declare cmake_minimum_required(VERSION 3.0.2)
> >
> > I'm in favor of updating our CMake requirement. The main question may be
> > what new version to pick as the minimum requirement.
> >
> > In general, there is the guideline
> >
> > > You really should at least use a version of CMake that came out after
> > > your compiler, since it needs to know compiler flags, etc, for that
> > > version. And, since CMake will dumb itself down to the minimum required
> > > version in your CMake file, installing a new CMake, even system wide, is
> > > pretty safe. You should at least install it locally. It's easy (1-2 lines
> > > in many cases), and you'll find that 5 minutes of work will save you
> > > hundreds of lines and hours of CMakeLists.txt writing, and will be much
> > > easier to maintain in the long run.
> > https://cliutils.gitlab.io/modern-cmake/
> >
> > https://cliutils.gitlab.io/modern-cmake/chapters/intro/newcmake.html
> > gives a short overview of all the improvements made to CMake over the past
> > 6 years.
> >
> > It's easy for users to upgrade their cmake version with pip:
> >   pip install --upgrade --user cmake
> > Thus it wouldn't be overly problematic to rely on a very recent version of
> > cmake, if indeed it's required.
> >
> > Nevertheless, if an earlier version fixes the problems, let's rather pick
> > that one. Did you confirm which version is required to fix the problem?
> >
> > For now, could you try whether the CMake version shipped in the oldest
> > supported Ubuntu LTS release (Ubuntu 16.04) fixes your problem (CMake 3.5)?
> > If not, please test whether the CMake version shipped in Ubuntu 18.04
> > (CMake 3.10) fixes your issue.
> >
> > Thanks
> > Leonard
> >
> > On Fri, 2019-12-06 at 08:45 +0800, shiwen hu wrote:
> > > I sent a PR https://github.com/apache/incubator-mxnet/pull/16980 to
> > > change the Windows build system, but the CMake version on CI now seems
> > > to hit a bug and fails to compile. Can it be upgraded to 3.16.0?
> >
>


Pypi limit increase

2019-12-06 Thread Lausen, Leonard
Pradyun Gedam of the PyPI team helped to increase our limit just now.

> I just increased the limits as detailed below. Please be mindful of how
> frequently you make releases with such a high limit -- each release will have
> a not-insignificant impact on how much traffic PyPI has to serve.
> 
> Raised to 800 MB, on both PyPI and TestPyPI:
> 
> mxnet
> 
> Raised to 800 MB on PyPI, does not exist on TestPyPI:
> 
> mxnet-mkl
> mxnet-cu92
> mxnet-cu92mkl
> mxnet-cu101
> mxnet-cu101mkl
> mxnet-cu102
> mxnet-cu102mkl
> 


Re: Bump CMake minimum version

2019-12-06 Thread Lausen, Leonard
Thanks Pedro for pointing out the problems with old CMake versions. I find that
the popular Deep Learning AMIs provided on AWS, while based on Ubuntu 16.04 and
18.04, come with an updated version of CMake (3.13.3) pre-installed.

CMake 3.13 was released more than 1 year ago. Anyone with an older version of
cmake can easily update via `pip install --user --upgrade cmake`. Please see 
https://cliutils.gitlab.io/modern-cmake/chapters/intro/newcmake.html for all the
great improvements done between CMake 3.0 and 3.13.
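
As one concrete illustration of what the bump enables (a sketch only, not code
from our current CMakeLists.txt; the target name "foo" is hypothetical):
target_link_options() and its LINKER: prefix were only added in CMake 3.13, so
with 3.0.2 as the declared minimum we cannot rely on them.

  # Illustrative sketch: requires CMake >= 3.13.
  add_library(foo STATIC foo.cc)                      # hypothetical target
  target_link_options(foo PRIVATE "LINKER:--as-needed")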

As Pedro and Shiwen pointed out, old CMake versions that we currently claim to
support do not actually work in at least two important MXNet use cases (the GPU
build and the Windows build). I therefore suggest we adopt CMake 3.13 as the
new minimum version.

This does not seem contentious to me, thus I suggest a 72 hour lazy consensus
approach. If there are no objections, let's bump the CMake requirement to CMake
3.13 after the 72 hour window.
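
For concreteness, a minimal sketch of what the declaration at the top of
CMakeLists.txt could look like after the bump (the policy range syntax is a
suggestion on my part, not something we use today):

  # Sketch only: require CMake 3.13, and apply policies of newer releases
  # (up to 3.16) when a newer CMake is installed, so recent CMake versions
  # don't fall back to 3.13-era behavior.
  cmake_minimum_required(VERSION 3.13...3.16)
  project(mxnet C CXX)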

Best regards
Leonard

On Fri, 2019-12-06 at 11:52 -0800, Pedro Larroy wrote:
> The CMake shipped with Ubuntu has issues when compiling with CUDA on GPU
> instances. I wouldn't recommend anything older than 3.12 for Linux GPU builds.
> 
> https://github.com/apache/incubator-mxnet/blob/master/ci/docker/install/ubuntu_core.sh#L63
> 
> I don't know about the Windows CMake version, but it would make sense to
> require a newer version.
> 
> On Thu, Dec 5, 2019 at 7:26 PM Lausen, Leonard 
> wrote:
> 
> > Currently we declare cmake_minimum_required(VERSION 3.0.2)
> > 
> > I'm in favor of updating our CMake requirement. The main question may be
> > what new version to pick as the minimum requirement.
> > 
> > In general, there is the guideline
> > 
> > > You really should at least use a version of CMake that came out after
> > > your compiler, since it needs to know compiler flags, etc, for that
> > > version. And, since CMake will dumb itself down to the minimum required
> > > version in your CMake file, installing a new CMake, even system wide, is
> > > pretty safe. You should at least install it locally. It's easy (1-2 lines
> > > in many cases), and you'll find that 5 minutes of work will save you
> > > hundreds of lines and hours of CMakeLists.txt writing, and will be much
> > > easier to maintain in the long run.
> > https://cliutils.gitlab.io/modern-cmake/
> > 
> > https://cliutils.gitlab.io/modern-cmake/chapters/intro/newcmake.html
> > gives a short overview of all the improvements made to CMake over the past
> > 6 years.
> > 
> > It's easy for users to upgrade their cmake version with pip:
> >   pip install --upgrade --user cmake
> > Thus it wouldn't be overly problematic to rely on a very recent version of
> > cmake, if indeed it's required.
> > 
> > Nevertheless, if an earlier version fixes the problems, let's rather pick
> > that one. Did you confirm which version is required to fix the problem?
> > 
> > For now, could you try whether the CMake version shipped in the oldest
> > supported Ubuntu LTS release (Ubuntu 16.04) fixes your problem (CMake 3.5)?
> > If not, please test whether the CMake version shipped in Ubuntu 18.04
> > (CMake 3.10) fixes your issue.
> > 
> > Thanks
> > Leonard
> > 
> > On Fri, 2019-12-06 at 08:45 +0800, shiwen hu wrote:
> > > I sent a PR https://github.com/apache/incubator-mxnet/pull/16980 to
> > > change the Windows build system, but the CMake version on CI now seems
> > > to hit a bug and fails to compile. Can it be upgraded to 3.16.0?


Re: Please remove conflicting Open MP version from CMake builds

2019-12-06 Thread Lausen, Leonard
I think it's reasonable to assume that the Intel MKLDNN team is an authoritative
source on compiling with OpenMP and on OpenMP runtime library related issues.
Thus I suggest we follow the recommendation of the Intel MKLDNN team within the
MXNet project.

Looking through the Intel MKLDNN documentation, I find [1]:

> DNNL uses OpenMP runtime library provided by the compiler.

as well as

> it's important to ensure that only one OpenMP runtime is used throughout the
> application. Having more than one OpenMP runtime linked to an executable may
> lead to undefined behavior including incorrect results or crashes.

To keep our project maintainable and error free, I thus suggest we follow DNNL
and use the OpenMP runtime library provided by the compiler.
We have limited resources, and finding the root cause of any bugs resulting
from linking multiple OpenMP libraries, as we currently do, is in my opinion not
a good use of time. We know it's due to undefined behavior and we know it's best
practice to use the OpenMP runtime library provided by the compiler. So let's
just do that.
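
In CMake terms, a minimal sketch of that approach (the target name "mxnet" is an
assumption here; this is not a drop-in patch for our current CMakeLists.txt):

  # Sketch: rely on the OpenMP runtime that ships with the compiler
  # (libgomp for GCC, libomp for Clang, vcomp for MSVC, iomp for ICC)
  # instead of building and linking the bundled 3rdparty/openmp.
  find_package(OpenMP REQUIRED)
  target_link_libraries(mxnet PRIVATE OpenMP::OpenMP_CXX)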

I think given that MKL-DNN has also adopted the "OpenMP runtime library provided
by the compiler" approach, this issue is not contentious anymore and qualifies
for lazy consensus.

Thus if there is no objection within 72 hours (lazy consensus), let's drop the
bundled LLVM OpenMP from master [2]. If we find any issues due to dropping the
bundled LLVM OpenMP, we can always add it back prior to the next release.

Best regards
Leonard

[1]: 
https://github.com/intel/mkl-dnn/blob/433e086bf5d9e5ccfc9ec0b70322f931b6b1921d/doc/build/build_options.md#openmp
(This is the updated reference from Anton's earlier comment, reflecting the
changes made in MKLDNN since then:
https://github.com/apache/incubator-mxnet/pull/12160#issuecomment-415078066)
[2]: Similar to https://github.com/apache/incubator-mxnet/pull/12160


On Fri, 2019-12-06 at 12:16 -0800, Pedro Larroy wrote:
> I will try to stay on the sidelines for now, since previous conversations
> about OMP have not been productive here and I have spent way too much time
> on this already; I'm not the first one to give up on trying to help with
> this topic.
> 
> I would be glad if you guys can work together and find a solution. I will
> just put down my understanding of the big picture, hoping that it helps move
> it forward.
> 
> 
> Recently the Intel OMP library, which seemed to have the best performance of
> the three, was removed from MKL.
> 
> - There are three libraries in play: GNU OpenMP shipped with GCC (gomp),
> LLVM OpenMP in 3rdparty (llvm-omp), and Intel OMP when using MKL, which was
> recently removed (iomp).
> 
> - IOMP seems to have the best performance; there are stability issues that
> sometimes produce crashes, but the impact seems relatively small for users
> and developers. In general, linking with a different OMP version than the
> one shipped with the compiler is known to cause stability issues, but it's
> done anyway.
> 
> - LLVM-OMP is used when building with CMake, and not used in the PIP
> releases or when building with Make. It has stability issues, hangs when
> running tests in debug mode, and produces tons of assertions in debug mode.
> It might have some small performance gains, but there is no clear-cut data
> showing significant gains.
> 
> - GOMP is the version shipped with GCC and used in the PIP wheels without
> MKL; it has no stability problems.
> 
> As a ballpark, IOMP might give a 10% performance improvement in some cases.
> 
> We need to document well how users should tune and configure MXNet when
> using OMP.
> 
> As a developer, the safest bet is to use GOMP, to be able to debug and
> develop without issues. As a user doing CPU inference / training you want to
> run MKL, so it depends on how the Intel guys want to do things. My
> preference as an engineer is always stability > speed.
> 
> Related tickets:
> 
> https://github.com/apache/incubator-mxnet/issues/16891
> 
> https://github.com/apache/incubator-mxnet/issues/10856#issuecomment-562637931
> 
> 
> https://github.com/apache/incubator-mxnet/issues/11417
> 
> https://github.com/apache/incubator-mxnet/issues/15690
> 
> 
> 
> On Fri, Dec 6, 2019 at 12:39 AM Lausen, Leonard 
> wrote:
> 
> > Is this related to https://github.com/apache/incubator-mxnet/issues/10856?
> > 
> > I unlocked that Github issue based on the Apache Code of Conduct
> > https://www.apache.org/foundation/policies/conduct#specific-guidelines
> > 
> > 
> > On Sat, 2019-11-30 at 02:47 -0800, Pedro Larroy wrote:
> > > (py3_venv) piotr@34-215-197-42:1:~/mxnet_1.6 (upstream_master)+$ ldd
> > > build/libmxnet.so| grep -i openmp
> > > libomp.so =>
> > > /home/piotr/mxnet_1.6/build/3rdparty/openmp/runtime/src/libomp.so
> > > (0x7fde0991d000)
> > > (py3_venv) piotr@34-215-197-42:0:~/mxnet_1.6 (upstream_master)+$ python
> > > ~/deeplearning-benchmark/image_classification/infer_imagenet.py --use-rec
> > > --batch-size 256 --dtype float32 --num-data-workers 

Re: Please remove conflicting Open MP version from CMake builds

2019-12-06 Thread Pedro Larroy
I will try to stay on the sidelines for now, since previous conversations
about OMP have not been productive here and I have spent way too much time
on this already; I'm not the first one to give up on trying to help with
this topic.

I would be glad if you guys can work together and find a solution. I will
just put down my understanding of the big picture, hoping that it helps move
it forward.


Recently the Intel OMP library, which seemed to have the best performance of
the three, was removed from MKL.

- There are three libraries in play: GNU OpenMP shipped with GCC (gomp),
LLVM OpenMP in 3rdparty (llvm-omp), and Intel OMP when using MKL, which was
recently removed (iomp).

- IOMP seems to have the best performance; there are stability issues that
sometimes produce crashes, but the impact seems relatively small for users
and developers. In general, linking with a different OMP version than the
one shipped with the compiler is known to cause stability issues, but it's
done anyway.

- LLVM-OMP is used when building with CMake, and not used in the PIP
releases or when building with Make. It has stability issues, hangs when
running tests in debug mode, and produces tons of assertions in debug mode.
It might have some small performance gains, but there is no clear-cut data
showing significant gains.

- GOMP is the version shipped with GCC and used in the PIP wheels without
MKL; it has no stability problems.

As a ballpark, IOMP might give a 10% performance improvement in some cases.

We need to document well how users should tune and configure MXNet when
using OMP.

As a developer, the safest bet is to use GOMP, to be able to debug and
develop without issues. As a user doing CPU inference / training you want to
run MKL, so it depends on how the Intel guys want to do things. My
preference as an engineer is always stability > speed.

Related tickets:

https://github.com/apache/incubator-mxnet/issues/16891

https://github.com/apache/incubator-mxnet/issues/10856#issuecomment-562637931


https://github.com/apache/incubator-mxnet/issues/11417

https://github.com/apache/incubator-mxnet/issues/15690



On Fri, Dec 6, 2019 at 12:39 AM Lausen, Leonard 
wrote:

> Is this related to https://github.com/apache/incubator-mxnet/issues/10856?
>
> I unlocked that Github issue based on the Apache Code of Conduct
> https://www.apache.org/foundation/policies/conduct#specific-guidelines
>
>
> On Sat, 2019-11-30 at 02:47 -0800, Pedro Larroy wrote:
> > (py3_venv) piotr@34-215-197-42:1:~/mxnet_1.6 (upstream_master)+$ ldd
> > build/libmxnet.so| grep -i openmp
> > libomp.so =>
> > /home/piotr/mxnet_1.6/build/3rdparty/openmp/runtime/src/libomp.so
> > (0x7fde0991d000)
> > (py3_venv) piotr@34-215-197-42:0:~/mxnet_1.6 (upstream_master)+$ python
> > ~/deeplearning-benchmark/image_classification/infer_imagenet.py --use-rec
> > --batch-size 256 --dtype float32 --num-data-workers 40 --mode hybrid
> > --model resnet50_v2 --use-pretrained --kvstore local --log-interval 1
> > --rec-val ~/data/val-passthrough.rec --rec-val-idx
> > ~/data/val-passthrough.idx
> > INFO:root:Namespace(batch_norm=False, batch_size=256,
> > data_dir='~/.mxnet/datasets/imagenet', dataset_size=32, dtype='float32',
> > kvstore='local', last_gamma=False, log_interval=1, logging_dir='logs',
> > lr=0.1, lr_decay=0.1, lr_decay_epoch='40,60', lr_mode='step',
> > lr_poly_power=2, mode='hybrid', model='resnet50_v2', momentum=0.9,
> > num_epochs=3, num_gpus=0, num_workers=40,
> > rec_val='/home/piotr/data/val-passthrough.rec',
> > rec_val_idx='/home/piotr/data/val-passthrough.idx', save_dir='params',
> > save_frequency=0, top_k=0, use_pretrained=True, use_rec=True,
> use_se=False,
> > warmup_epochs=0, warmup_lr=0.0, wd=0.0001)
> > [10:42:02] ../src/io/iter_image_recordio_2.cc:178: ImageRecordIOParser2:
> > /home/piotr/data/val-passthrough.rec, use 36 threads for decoding..
> > INFO:root:Batch [0]
> > INFO:root:Top 1 accuracy: 0
> > INFO:root:warmup_throughput: 5 samples/sec warmup_time 43.150922
> > INFO:root:Batch [1]
> > INFO:root:Top 1 accuracy: 0
> > INFO:root:warmup_throughput: 6 samples/sec warmup_time 37.971927
> > INFO:root:Batch [2]
> > INFO:root:Top 1 accuracy: 0
> > INFO:root:warmup_throughput: 7 samples/sec warmup_time 35.755363
> >
> >
> >
> >
> >
> >
> >
> > (py3_venv) piotr@34-215-197-42:0:~/mxnet_1.6_plat_omp
> (upstream_master)+$
> > git st
> > On branch upstream_master
> > Your branch is up to date with 'origin/upstream_master'.
> >
> > Changes not staged for commit:
> >   (use "git add/rm ..." to update what will be committed)
> >   (use "git checkout -- ..." to discard changes in working
> directory)
> >
> > deleted:3rdparty/openmp
> >
> > no changes added to commit (use "git add" and/or "git commit -a")
> > (py3_venv) piotr@34-215-197-42:1:~/mxnet_1.6_plat_omp
> (upstream_master)+$
> > ldd build/libmxnet.so | grep -i omp
> > libgomp.so.1 => /usr/lib/x86_64-linux-gnu/libgomp.so.1
> > (0x7f941241c000)
> >
> > (py3_venv) 

Re: Can upgrade windows CI cmake?

2019-12-06 Thread Pedro Larroy
The CMake shipped with Ubuntu has issues when compiling with CUDA on GPU
instances. I wouldn't recommend anything older than 3.12 for Linux GPU builds.

https://github.com/apache/incubator-mxnet/blob/master/ci/docker/install/ubuntu_core.sh#L63

I don't know about the Windows CMake version, but it would make sense to
require a newer version.

On Thu, Dec 5, 2019 at 7:26 PM Lausen, Leonard 
wrote:

> Currently we declare cmake_minimum_required(VERSION 3.0.2)
>
> I'm in favor of updating our CMake requirement. The main question may be
> what new version to pick as the minimum requirement.
>
> In general, there is the guideline
>
> > You really should at least use a version of CMake that came out after
> > your compiler, since it needs to know compiler flags, etc, for that
> > version. And, since CMake will dumb itself down to the minimum required
> > version in your CMake file, installing a new CMake, even system wide, is
> > pretty safe. You should at least install it locally. It's easy (1-2 lines
> > in many cases), and you'll find that 5 minutes of work will save you
> > hundreds of lines and hours of CMakeLists.txt writing, and will be much
> > easier to maintain in the long run.
> https://cliutils.gitlab.io/modern-cmake/
>
> https://cliutils.gitlab.io/modern-cmake/chapters/intro/newcmake.html
> gives a short overview of all the improvements made to CMake over the past
> 6 years.
>
> It's easy for users to upgrade their cmake version with pip:
>   pip install --upgrade --user cmake
> Thus it wouldn't be overly problematic to rely on a very recent version of
> cmake, if indeed it's required.
>
> Nevertheless, if an earlier version fixes the problems, let's rather pick
> that one. Did you confirm which version is required to fix the problem?
>
> For now, could you try whether the CMake version shipped in the oldest
> supported Ubuntu LTS release (Ubuntu 16.04) fixes your problem (CMake 3.5)?
> If not, please test whether the CMake version shipped in Ubuntu 18.04
> (CMake 3.10) fixes your issue.
>
> Thanks
> Leonard
>
> On Fri, 2019-12-06 at 08:45 +0800, shiwen hu wrote:
> > I sent a PR https://github.com/apache/incubator-mxnet/pull/16980 to
> > change the Windows build system, but the CMake version on CI now seems
> > to hit a bug and fails to compile. Can it be upgraded to 3.16.0?
>


Re: CI Update

2019-12-06 Thread Pedro Larroy
Hi all. CI is back to normal after Jake's commit
https://github.com/apache/incubator-mxnet/pull/16968, so please merge from
master. It would be great if someone could look into the TVM build issues
described above.

On Tue, Dec 3, 2019 at 11:11 AM Pedro Larroy 
wrote:

> Some PRs were experiencing build timeouts in the past. I have diagnosed
> this to be a saturation of the EFS volume holding the compilation cache.
> Once CI is back online this problem is very likely to be solved and you
> should not see any more build timeout issues.
>
> On Tue, Dec 3, 2019 at 10:18 AM Pedro Larroy 
> wrote:
>
>> Also please note that there is a stage building TVM that compiles serially
>> and takes a long time, which impacts CI turnaround time:
>>
>> https://github.com/apache/incubator-mxnet/issues/16962
>>
>> Pedro
>>
>> On Tue, Dec 3, 2019 at 9:49 AM Pedro Larroy 
>> wrote:
>>
>>> Hi MXNet community. We are in the process of updating the base AMIs for
>>> CI with an updated CUDA driver to fix the CI blockage.
>>>
>>> We would need help from the community to diagnose some of the build
>>> errors which don't seem related to the infrastructure.
>>>
>>> I have observed this build failure with tvm when not installing the cuda
>>> driver in the container:
>>>
>>>
>>> https://pastebin.com/bQA0W2U4
>>>
>>> centos gpu builds and tests seem to run with the updated AMI and changes
>>> to the container.
>>>
>>>
>>> Thanks.
>>>
>>>
>>> On Mon, Dec 2, 2019 at 12:11 PM Pedro Larroy <
>>> pedro.larroy.li...@gmail.com> wrote:
>>>
>>>> Small update about CI, which is blocked.
>>>>
>>>> There seems to be an NVIDIA driver compatibility problem between the base
>>>> AMI running on GPU instances and the NVIDIA Docker images that we use for
>>>> building and testing.
>>>>
>>>> We are working on providing a fix by updating the base images, as it
>>>> doesn't seem easy to fix by just changing the container.
>>>>
>>>> Thanks.
>>>>
>>>> Pedro.
>>>>
>>>


Re: Please remove conflicting Open MP version from CMake builds

2019-12-06 Thread Lausen, Leonard
Is this related to https://github.com/apache/incubator-mxnet/issues/10856?

I unlocked that Github issue based on the Apache Code of Conduct 
https://www.apache.org/foundation/policies/conduct#specific-guidelines


On Sat, 2019-11-30 at 02:47 -0800, Pedro Larroy wrote:
> (py3_venv) piotr@34-215-197-42:1:~/mxnet_1.6 (upstream_master)+$ ldd
> build/libmxnet.so| grep -i openmp
> libomp.so =>
> /home/piotr/mxnet_1.6/build/3rdparty/openmp/runtime/src/libomp.so
> (0x7fde0991d000)
> (py3_venv) piotr@34-215-197-42:0:~/mxnet_1.6 (upstream_master)+$ python
> ~/deeplearning-benchmark/image_classification/infer_imagenet.py --use-rec
> --batch-size 256 --dtype float32 --num-data-workers 40 --mode hybrid
> --model resnet50_v2 --use-pretrained --kvstore local --log-interval 1
> --rec-val ~/data/val-passthrough.rec --rec-val-idx
> ~/data/val-passthrough.idx
> INFO:root:Namespace(batch_norm=False, batch_size=256,
> data_dir='~/.mxnet/datasets/imagenet', dataset_size=32, dtype='float32',
> kvstore='local', last_gamma=False, log_interval=1, logging_dir='logs',
> lr=0.1, lr_decay=0.1, lr_decay_epoch='40,60', lr_mode='step',
> lr_poly_power=2, mode='hybrid', model='resnet50_v2', momentum=0.9,
> num_epochs=3, num_gpus=0, num_workers=40,
> rec_val='/home/piotr/data/val-passthrough.rec',
> rec_val_idx='/home/piotr/data/val-passthrough.idx', save_dir='params',
> save_frequency=0, top_k=0, use_pretrained=True, use_rec=True, use_se=False,
> warmup_epochs=0, warmup_lr=0.0, wd=0.0001)
> [10:42:02] ../src/io/iter_image_recordio_2.cc:178: ImageRecordIOParser2:
> /home/piotr/data/val-passthrough.rec, use 36 threads for decoding..
> INFO:root:Batch [0]
> INFO:root:Top 1 accuracy: 0
> INFO:root:warmup_throughput: 5 samples/sec warmup_time 43.150922
> INFO:root:Batch [1]
> INFO:root:Top 1 accuracy: 0
> INFO:root:warmup_throughput: 6 samples/sec warmup_time 37.971927
> INFO:root:Batch [2]
> INFO:root:Top 1 accuracy: 0
> INFO:root:warmup_throughput: 7 samples/sec warmup_time 35.755363
> 
> 
> 
> 
> 
> 
> 
> (py3_venv) piotr@34-215-197-42:0:~/mxnet_1.6_plat_omp (upstream_master)+$
> git st
> On branch upstream_master
> Your branch is up to date with 'origin/upstream_master'.
> 
> Changes not staged for commit:
>   (use "git add/rm ..." to update what will be committed)
>   (use "git checkout -- ..." to discard changes in working directory)
> 
> deleted:3rdparty/openmp
> 
> no changes added to commit (use "git add" and/or "git commit -a")
> (py3_venv) piotr@34-215-197-42:1:~/mxnet_1.6_plat_omp (upstream_master)+$
> ldd build/libmxnet.so | grep -i omp
> libgomp.so.1 => /usr/lib/x86_64-linux-gnu/libgomp.so.1
> (0x7f941241c000)
> 
> (py3_venv) piotr@34-215-197-42:130:~/mxnet_1.6_plat_omp (upstream_master)+$
> python ~/deeplearning-benchmark/image_classification/infer_imagenet.py
> --use-rec --batch-size 256 --dtype float32 --num-data-workers 40 --mode
> hybrid --model resnet50_v2 --use-pretrained --kvstore local --log-interval
> 1 --rec-val ~/data/val-passthrough.rec --rec-val-idx
> ~/data/val-passthrough.idx
> INFO:root:warmup_throughput: 147 samples/sec warmup_time 1.735117
> INFO:root:Batch [16]
> INFO:root:Top 1 accuracy: 0
> INFO:root:warmup_throughput: 143 samples/sec warmup_time 1.785760
> INFO:root:Batch [17]
> INFO:root:Top 1 accuracy: 0
> INFO:root:warmup_throughput: 148 samples/sec warmup_time 1.729033