Hi Pedro,

I'm looking at this case and using the script
"incubator-mxnet/example/image-classification/train_cifar10.py" to get
the timing data, but there seems to be little difference between MXNet 1.4.1.rc0
and 1.5.0.rc1 on C5.18xlarge.

I'm not sure whether the Python scripts differ; can you point me to the
link for your script (cifar10.py)?
Or you could also try MXNet's script (train_cifar10.py) and see how it
performs.

Here's the command I used to collect the time: 
        python train_cifar10.py --num-epoch=5

1) 1.5.0.rc1 (4d9667121ae6fb643f2a02ab15e25231ed756cde)
        real    9m4.880s
        user    333m13.340s
        sys     14m36.100s

2) 1.4.1.rc0 (1a7199691f5cbc6012bb53eecbf884bed5ae6590)
        real    9m2.155s
        user    329m37.092s
        sys     16m8.668s
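
For reference, the relative difference in wall-clock time can be worked out directly from the "real" values. Below is a minimal sketch, assuming the bash `time` output format shown in this thread; the helper names parse_time and relative_change are hypothetical and only for illustration. The second pair of values is taken from Pedro's quoted message further down.

```python
import re


def parse_time(value):
    """Convert a bash `time` value such as '9m4.880s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value.strip())
    if match is None:
        raise ValueError(f"unrecognized time value: {value!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def relative_change(new, old):
    """Percent change of `new` relative to `old` (positive means slower)."""
    old_seconds = parse_time(old)
    return (parse_time(new) - old_seconds) / old_seconds * 100.0


# "real" times from train_cifar10.py: 1.5.0.rc1 vs 1.4.1.rc0
train_cifar10_change = relative_change("9m4.880s", "9m2.155s")

# "real" times from the dawnbench cifar10.py run quoted below: 1.5.0 vs 1.4.1
dawnbench_change = relative_change("11m30.388s", "10m41.994s")

print(f"train_cifar10.py: {train_cifar10_change:+.2f}%")
print(f"cifar10.py:       {dawnbench_change:+.2f}%")
```

With these inputs the first comparison comes out at roughly half a percent and the second at roughly 7.5%, so the two scripts do show different behavior. A single run of each is not enough to separate a regression from run-to-run noise.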

-Ciyong


-----Original Message-----
From: Pedro Larroy [mailto:pedro.larroy.li...@gmail.com] 
Sent: Wednesday, June 26, 2019 6:28 AM
To: dev@mxnet.incubator.apache.org
Cc: d...@mxnet.apache.org
Subject: Re: [VOTE] Release Apache MXNet (incubating) version 1.5.0.rc1

Hi, these were my build flags and system info:


--- # CMake configuration
USE_CUDA: "OFF" # Build with CUDA support
USE_OLDCMAKECUDA: "OFF" # Build with old cmake cuda
USE_NCCL: "OFF" # Use NVidia NCCL with CUDA
USE_OPENCV: "ON" # Build with OpenCV support
USE_OPENMP: "ON" # Build with Openmp support
USE_CUDNN: "ON" # Build with cudnn support) # one could set CUDNN_ROOT for search path
USE_SSE: "ON" # Build with x86 SSE instruction support IF NOT ARM
USE_F16C: "ON" # Build with x86 F16C instruction support) # autodetects support if "ON"
USE_LAPACK: "ON" # Build with lapack support
USE_MKL_IF_AVAILABLE: "ON" # Use MKL if found
USE_MKLML_MKL: "ON" # Use MKLDNN variant of MKL (if MKL found) IF USE_MKL_IF_AVAILABLE AND (NOT APPLE)
USE_MKLDNN: "ON" # Use MKLDNN variant of MKL (if MKL found) IF USE_MKL_IF_AVAILABLE AND (NOT APPLE)
USE_OPERATOR_TUNING: "ON" # Enable auto-tuning of operators IF NOT MSVC
USE_GPERFTOOLS: "ON" # Build with GPerfTools support (if found)
USE_JEMALLOC: "ON" # Build with Jemalloc support
USE_PROFILER: "ON" # Build with Profiler support
USE_DIST_KVSTORE: "OFF" # Build with DIST_KVSTORE support
USE_PLUGINS_WARPCTC: "OFF" # Use WARPCTC Plugins
USE_PLUGIN_CAFFE: "OFF" # Use Caffe Plugin
USE_CPP_PACKAGE: "OFF" # Build C++ Package
USE_MXNET_LIB_NAMING: "ON" # Use MXNet library naming conventions.
USE_GPROF: "OFF" # Compile with gprof (profiling) flag
USE_CXX14_IF_AVAILABLE: "OFF" # Build with C++14 if the compiler supports it
USE_VTUNE: "OFF" # Enable use of Intel Amplifier XE (VTune)) # one could set VTUNE_ROOT for search path
ENABLE_CUDA_RTC: "ON" # Build with CUDA runtime compilation support
BUILD_CPP_EXAMPLES: "ON" # Build cpp examples
INSTALL_EXAMPLES: "OFF" # Install the example source files.
USE_SIGNAL_HANDLER: "ON" # Print stack traces on segfaults.
USE_TENSORRT: "OFF" # Enable infeference optimization with TensorRT.
USE_ASAN: "OFF" # Enable Clang/GCC ASAN sanitizers.
ENABLE_TESTCOVERAGE: "OFF" # Enable compilation with test coverage metric output
CMAKE_BUILD_TYPE: "Release"
CMAKE_CUDA_COMPILER_LAUNCHER: "ccache"
CMAKE_C_COMPILER_LAUNCHER: "ccache"
CMAKE_CXX_COMPILER_LAUNCHER: "ccache"

commit 4d9667121ae6fb643f2a02ab15e25231ed756cde (HEAD, tag: 1.5.0.rc1, upstream/v1.5.x)
commit 1a7199691f5cbc6012bb53eecbf884bed5ae6590 (HEAD, tag: 1.4.1.rc0, upstream/v1.4.x)

curl http://169.254.169.254/latest/meta-data/instance-type
c5d.18xlarge


----------Python Info----------
Version      : 3.6.7
Compiler     : GCC 8.2.0
Build        : ('default', 'Oct 22 2018 11:32:17')
Arch         : ('64bit', 'ELF')
------------Pip Info-----------
Version      : 19.1.1
Directory    : /home/piotr/mxnet_1.5/py3_venv/lib/python3.6/site-packages/pip
----------MXNet Info-----------
Version      : 1.5.0
Directory    : /home/piotr/mxnet_1.5/python/mxnet
Hashtag not found. Not installed from pre-built package.
----------System Info----------
Platform     : Linux-4.15.0-1035-aws-x86_64-with-Ubuntu-18.04-bionic
system       : Linux
node         : ip-172-31-63-171
release      : 4.15.0-1035-aws
version      : #37-Ubuntu SMP Mon Mar 18 16:15:14 UTC 2019
----------Hardware Info----------
machine      : x86_64
processor    : x86_64
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              72
On-line CPU(s) list: 0-71
Thread(s) per core:  2
Core(s) per socket:  18
Socket(s):           2
NUMA node(s):        2
Vendor ID:           GenuineIntel
CPU family:          6
Model:               85
Model name:          Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz
Stepping:            4
CPU MHz:             1326.446
BogoMIPS:            6000.00
Hypervisor vendor:   KVM
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            1024K
L3 cache:            25344K
NUMA node0 CPU(s):   0-17,36-53
NUMA node1 CPU(s):   18-35,54-71
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr
pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb 
rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology nonstop_tsc cpuid 
aperfmperf pni pclmulqdq monitor ssse3 fma cx16 pcid
sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand 
hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase tsc_adjust 
bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap 
clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves ida 
arat pku ospke
----------Network Test----------

----------Python Info----------
Version      : 3.6.7
Compiler     : GCC 8.2.0
Build        : ('default', 'Oct 22 2018 11:32:17')
Arch         : ('64bit', 'ELF')
------------Pip Info-----------
Version      : 19.1.1
Directory    : /home/piotr/mxnet_1.4/py3_venv/lib/python3.6/site-packages/pip
----------MXNet Info-----------
Version      : 1.4.1
Directory    : /home/piotr/mxnet_1.4/python/mxnet
Hashtag not found. Not installed from pre-built package.
----------System Info----------
Platform     : Linux-4.15.0-1035-aws-x86_64-with-Ubuntu-18.04-bionic
system       : Linux
node         : ip-172-31-63-171
release      : 4.15.0-1035-aws
version      : #37-Ubuntu SMP Mon Mar 18 16:15:14 UTC 2019
----------Hardware Info----------
machine      : x86_64
processor    : x86_64
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              72
On-line CPU(s) list: 0-71
Thread(s) per core:  2
Core(s) per socket:  18
Socket(s):           2
NUMA node(s):        2
Vendor ID:           GenuineIntel
CPU family:          6
Model:               85
Model name:          Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz
Stepping:            4
CPU MHz:             1223.344
BogoMIPS:            6000.00
Hypervisor vendor:   KVM
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            1024K
L3 cache:            25344K
NUMA node0 CPU(s):   0-17,36-53
NUMA node1 CPU(s):   18-35,54-71
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr
pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb 
rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology nonstop_tsc cpuid 
aperfmperf pni pclmulqdq monitor ssse3 fma cx16 pcid
sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand 
hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase tsc_adjust 
bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap 
clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves ida 
arat pku ospke
----------Network Test----------

On Tue, Jun 25, 2019 at 2:35 PM Pedro Larroy <pedro.larroy.li...@gmail.com> 
wrote:
>
> I trained cifar10 on CPU, and there seems to be a regression of about
> 7% in training time compared with 1.4.1:
>
> (py3_venv) piotr@ip-172-31-63-171:0:~/deeplearning-benchmark/dawnbench
> (master)+$ time python cifar10.py --epochs 5
> real    11m30.388s
> user    417m7.766s
> sys     16m57.315s
>
> VS 1.4.1:
> real    10m41.994s
> user    392m40.646s
> sys     12m30.601s
>
>
> On Thu, Jun 20, 2019 at 10:15 PM Lai Wei <roywei...@gmail.com> wrote:
> >
> > Hi Anirudh,
> >
> > Thanks for jumping into this quickly, I followed up on the issue.
> >
> > It was meant for sockeye developers/maintainers to help set up nightly
> > tests and raise issues early.
> >
> > Thanks!
> >
> > On Fri, Jun 21, 2019 at 10:10 AM Haibin Lin 
> > <haibin.lin....@gmail.com>
> > wrote:
> >
> > > In GluonNLP we test against the MXNet nightly build for each PR,
> > > and the CI did catch some MXNet-related issues.
> > > I recommend other toolkits also add integration tests with MXNet nightly.
> > > It helps identify issues early.
> > >
> > > Best,
> > > Haibin
> > >
> > > On Thu, Jun 20, 2019 at 18:52 Zhao, Patric <patric.z...@intel.com> wrote:
> > >
> > > > Thanks for raising the issue; we will take a look ASAP.
> > > >
> > > > The downstream cases are not in the MXNet CI, so it's hard for
> > > > MXNet developers to catch potential bugs or performance
> > > > degradation.
> > > >
> > > > In the future, I suggest adding the major downstream test cases,
> > > > like those from sockeye, GluonNLP, GluonCV, DGL, and Gluon-TS, to
> > > > the nightly tests.
> > > > If that's still too heavy, maybe test them weekly or monthly :)
> > > >
> > > > Thanks,
> > > >
> > > > --Patric
> > > >
> > > > > -----Original Message-----
> > > > > From: Anirudh Subramanian [mailto:anirudh2...@gmail.com]
> > > > > Sent: Friday, June 21, 2019 9:31 AM
> > > > > To: dev@mxnet.incubator.apache.org
> > > > > Cc: d...@mxnet.apache.org
> > > > > Subject: Re: [VOTE] Release Apache MXNet (incubating) version 
> > > > > 1.5.0.rc1
> > > > >
> > > > > Hi Lai,
> > > > >
> > > > > I have opened an issue:
> > > > > https://github.com/apache/incubator-mxnet/issues/15297
> > > > > > I came to know about this issue only today and I have not been
> > > > > > monitoring sockeye.
> > > > > > I jumped onto this issue to make sure it wasn't caused by the
> > > > > > dlpack changes.
> > > > > > Also, I don't think sockeye CI checks against master; it is
> > > > > > using 1.4.1.
> > > > >
> > > > > Anirudh
> > > > >
> > > > >
> > > > > On Thu, Jun 20, 2019 at 6:17 PM Lai Wei <roywei...@gmail.com> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > Could you share which test failed and what’s the crash? How 
> > > > > > to reproduce it?
> > > > > >
> > > > > > > I was able to install sockeye, and all tests passed
> > > > > > > using python setup.py test.
> > > > > > >
> > > > > > > I have tested both the nightly pip package and 1.5.0.rc1.
> > > > > >
> > > > > > It would be great to create an issue with reproducible steps 
> > > > > > and move the discussion there.
> > > > > >
> > > > > > > Also, I see the sockeye nightly build[1] has been failing for
> > > > > > > some time. If it’s due to an MXNet change, please raise this
> > > > > > > early so we can track and solve it in time rather than block
> > > > > > > the release during vote time.
> > > > > >
> > > > > > [1] https://travis-ci.org/awslabs/sockeye
> > > > > >
> > > > > >
> > > > > > > On Fri, Jun 21, 2019 at 7:01 AM Anirudh Subramanian
> > > > > > > <anirudh2...@gmail.com> wrote:
> > > > > >
> > > > > > > I was able to reproduce a crash with the commit
> > > > > > > 09202f7f261954383aa387144524d38f83f18d06 but not with the 
> > > > > > > commit a862270beb2d796c1ba311183f7f4a766a18ad6c.
> > > > > > >
> > > > > > > Anirudh
> > > > > > >
> > > > > > > > On Thu, Jun 20, 2019 at 3:53 PM Lai Wei
> > > > > > > > <roywei...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Hi Przemyslaw,
> > > > > > > >
> > > > > > > > Is there an issue with more details to track the problem?
> > > > > > > >
> > > > > > > >
> > > > > > > > On Fri, Jun 21, 2019 at 6:04 AM Przemysław Trędak 
> > > > > > > > <ptre...@apache.org>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > -1
> > > > > > > > >
> > > > > > > > > > There is a crash in the sockeye unit tests (python
> > > > > > > > > > setup.py test) observed starting with the nightly 1.5
> > > > > > > > > > build from 6/13 and still occurring in 1.5rc1. I don't
> > > > > > > > > > yet have the exact commit that is responsible for it,
> > > > > > > > > > but it is either
> > > > > > > > > > a862270beb2d796c1ba311183f7f4a766a18ad6c (dlpack
> > > > > > > > > > related) or
> > > > > > > > > > 09202f7f261954383aa387144524d38f83f18d06 (cached op
> > > > > > > > > > optimization).
> > > > > > > > >
> > > > > > > > > On 2019/06/20 06:36:22, Lai Wei <roywei...@gmail.com> wrote:
> > > > > > > > > > Dear MXNet community,
> > > > > > > > > >
> > > > > > > > > > > This is the 3-day vote to release Apache MXNet
> > > > > > > > > > > (incubating) version 1.5.0.
> > > > > > > > > > > Voting on dev@ will start June 19, 23:59:59(PST)
> > > > > > > > > > > and close on June 22, 23:59:59.
> > > > > > > > > >
> > > > > > > > > > 1) Link to release notes:
> > > > > > > > > >
> > > > > > >
> > > https://cwiki.apache.org/confluence/display/MXNET/1.5.0+Release+No
> > > te
> > > > > > > s
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > 2) Link to release candidate:
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > https://github.com/apache/incubator-mxnet/releases/tag/1.5.0.rc1
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > 3) Link to source and signatures on apache dist server:
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.5.0.rc1/
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Please remember to TEST first before voting accordingly:
> > > > > > > > > >
> > > > > > > > > > +1 = approve
> > > > > > > > > > +0 = no opinion
> > > > > > > > > > -1 = disapprove (provide reason)
> > > > > > > > > > --
> > > > > > > > > > Best Regards
> > > > > > > > > >
> > > > > > > > > > Lai
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > --
> > > > > > > > Best Regards
> > > > > > > >
> > > > > > > > Lai
> > > > > > > >
> > > > > > >
> > > > > > --
> > > > > > Best Regards
> > > > > >
> > > > > > Lai
> > > > > >
> > > >
> > >
> > --
> > Best Regards
> >
> > Lai
