One major advantage of Intel/LLVM OMP is that it spawns a new thread pool
after a fork if a thread pool had already been created, so that OMP can
still be used in the forked processes. libgomp doesn't do this, so it will
simply lock up if you try to use OMP in the forked process.
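
A minimal sketch of the behavior described above (my own illustration for
this thread, not MXNet code; assumes a POSIX system and an OpenMP-enabled
C++ compiler):

    // Build: g++ -fopenmp -o fork_omp fork_omp.cc
    #include <cstdio>
    #include <sys/wait.h>
    #include <unistd.h>
    #include <omp.h>

    int main() {
      // Warm up the parent's OMP thread pool before forking.
      #pragma omp parallel
      { }

      pid_t pid = fork();
      if (pid == 0) {
        // The child inherits OMP bookkeeping for worker threads that no
        // longer exist after fork(). With libgomp this region can lock up;
        // a runtime that re-spawns its pool after fork proceeds normally.
        #pragma omp parallel
        { std::printf("child thread %d\n", omp_get_thread_num()); }
        _exit(0);
      }
      waitpid(pid, nullptr, 0);
      return 0;
    }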

Is your build linking libgomp as well?

The standard MKL build (from the Makefile) uses the same OMP library. Are
there problems with that build?

What changes need to be made so that the assertion doesn't fire?

On Mon, Jun 24, 2019 at 5:32 PM Pedro Larroy <pedro.larroy.li...@gmail.com>
wrote:

> There's an assertion which is easily reproducible, and there's also a
> crash with a core dump; the latter is not easy for me to reproduce
> across environments. I have also seen mxnet getting stuck, making no
> progress and using no CPU at all, with this build configuration when
> running unit tests.
>
> In my view, the root cause of the assertion is that we re-enter OMP
> initialization when spawning threads via pthread_atfork in the
> following code:
>
> https://github.com/apache/incubator-mxnet/blob/master/src/initialize.cc#L58
>
> This causes double initialization of the OMP engine, which triggers the
> assertion you are asking about, and I suspect some additional overhead.
> That's the shady forking part you were referring to.
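>
> To make that concrete, here is a hedged sketch of the pattern (my own
> illustration, not the actual initialize.cc code; the function names are
> hypothetical): a pthread_atfork child handler that spawns a thread
> which touches an OMP region and thereby re-enters OMP initialization in
> the child.
>
>     #include <pthread.h>
>     #include <thread>
>
>     static void at_fork_child() {
>       // Runs in the child right after fork(); the spawned thread may
>       // hit an OMP region while the runtime is still re-initializing,
>       // so OMP initialization effectively runs twice.
>       std::thread([] {
>         #pragma omp parallel
>         { /* engine work */ }
>       }).detach();
>     }
>
>     void install_handlers() {
>       pthread_atfork(nullptr, nullptr, at_fork_child);
>     }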
>
> A question for you: what causes the performance differences between
> OMP runtimes? Shouldn't the implementation overhead diminish as
> threads run longer?
>
> Pedro.
>
> On Mon, Jun 24, 2019 at 5:10 PM Chris Olivier <cjolivie...@gmail.com>
> wrote:
> >
> > What’s the reason for the assertion failure? By the way, classifying an
> > assertion failure as a “crash” is debatable. As I stated in the original
> > issue a long time ago, it’s possible something shady is being done when
> > forking that should be fixed. The assertion should be root-caused.
> >
> >
> >
> > On Mon, Jun 24, 2019 at 1:22 PM Pedro Larroy <pedro.larroy.li...@gmail.com>
> > wrote:
> >
> > > Added a Dockerfile, and reports of a crash on my local machine when
> > > running MKL+OMP+DEBUG; with Anton's branch the crash happened as well.
> > > I couldn't reproduce the crash on my EC2 machine. I added the
> > > backtrace of the crash as well.
> > >
> > > https://github.com/apache/incubator-mxnet/issues/10856
> > >
> > > Dockerfile here:
> > >
> > > https://github.com/larroy/mxnet_omp
> > >
> > > Kind regards.
> > >
> > > Pedro.
> > >
> > > On Thu, Jun 20, 2019 at 5:29 PM Marco de Abreu <marco.g.ab...@gmail.com>
> > > wrote:
> > > >
> > > > As already proposed, I think the easiest way to get a common
> > > > understanding is if we start with a few docker containers. Pedro,
> > > > would it be possible for you to wrap your benchmarks into a few
> > > > containers that reproduce the results you have shown? That way, we
> > > > can avoid possible misunderstandings and also pinpoint the exact
> > > > parts where people disagree or misunderstood each other.
> > > >
> > > > -Marco
> > > >
> > > > Pedro Larroy <pedro.larroy.li...@gmail.com> wrote on Thu, 20 Jun
> > > > 2019, 21:47:
> > > >
> > > > > I can confirm that we are linking with two versions of OMP. I'm
> > > > > gaining more clarity on this topic, but I still have questions.
> > > > > The facts I have so far are the following:
> > > > >
> > > > > * #1: We link with two versions of OMP, Intel's OMP and LLVM
> > > > > OpenMP, when building with MKL enabled.
> > > > > * #2: We have 3 different possible OMP versions: Intel OMP (comes
> > > > > with MKL), LLVM OpenMP (3rdparty/openmp), and libgomp (comes with
> > > > > gcc; this one is used in the PR proposed by Anton).
> > > > >
> > > > > Questions:
> > > > >
> > > > >  * #1 Is it OK to have two versions of OpenMP linked at the same
> > > > > time?
> > > > >  * #2 Which implementation of OMP gives the best performance? (See
> > > > > the total training time in my measurements for a partial answer.)
> > > > >  * #3 Should we have a build flag so we can choose the OMP version
> > > > > at build time?
> > > > >  * #4 Which compiler and build flags did Chris use to get the 10x
> > > > > slowdown?
> > > > >  * #5 @Stas: is there a script to replicate your benchmarks
> > > > > easily? If so, could you provide a link? I think we would need to
> > > > > reproduce your benchmarks and verify which versions are being
> > > > > linked. It's possible that while compiling with MKL, Intel's OMP
> > > > > was pulled in instead of GNU OpenMP.
> > > > >  * #6 @Chris: how should we maintain the copy of LLVM's OpenMP?
> > > > > Should we update the subrepo regularly?
> > > > >
> > > > > My conclusions so far:
> > > > >
> > > > >  * #1 We should avoid linking two versions of OMP if possible and
> > > > > allow users to choose one at build time, as we do for BLAS.
> > > > >  * #2 For performance reasons, and for more control across
> > > > > different compiler versions, it indeed makes sense to keep the
> > > > > LLVM OpenMP version in 3rdparty for now. So unless more data is
> > > > > gathered, it makes sense not to remove it as of now.
> > > > >  * #3 We should provide build options to choose which OpenMP
> > > > > library is to be used from the three options available, including
> > > > > libgomp.
> > > > >  * #4 By refining the build we could also enable OpenMP on Mac
> > > > > without additional contortions (it doesn't work as of today):
> > > > > https://iscinumpy.gitlab.io/post/omp-on-high-sierra/
> > > > >  * #5 We should add the different OMP versions to our benchmarks
> > > > > and track their performance, so this data is available for
> > > > > prescribing the best build options and for binary releases.
> > > > >
> > > > > There is also an interesting related GitHub issue in the mkl-dnn
> > > > > repository: https://github.com/intel/mkl-dnn/issues/230
> > > > >
> > > > >
> > > > > I don't observe the order-of-magnitude divergence in samples/sec
> > > > > reported by Chris on vanilla Ubuntu 18.04, but the full training
> > > > > indeed finishes faster with the OMP from 3rdparty (LLVM OpenMP)
> > > > > than with libgomp.
> > > > >
> > > > > There are also differences in training time when using MKL; it's
> > > > > actually a bit slower. I don't know if that's related to OMP.
> > > > >
> > > > > gcc version 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04.1)
> > > > >
> > > > > Anton's branch: g...@github.com:lebeg/incubator-mxnet.git, branch 'omp'
> > > > > (py3_venv) piotr@ec2 cpu:0: ~/mxnet_openmp [omp]> ldd build/libmxnet.so | grep -i omp
> > > > >         libgomp.so.1 => /usr/lib/x86_64-linux-gnu/libgomp.so.1 (0x00007fd99a51d000)
> > > > >
> > > > > time python train_mnist.py
> > > > >
> > > > > INFO:root:Epoch[18] Validation-accuracy=0.984176
> > > > > INFO:root:Epoch[19] Batch [0-100]       Speed: 41617.00 samples/sec  accuracy=1.000000
> > > > > INFO:root:Epoch[19] Batch [100-200]     Speed: 47990.69 samples/sec  accuracy=0.999531
> > > > > INFO:root:Epoch[19] Batch [200-300]     Speed: 47517.01 samples/sec  accuracy=0.999687
> > > > > INFO:root:Epoch[19] Batch [300-400]     Speed: 47430.53 samples/sec  accuracy=1.000000
> > > > > INFO:root:Epoch[19] Batch [400-500]     Speed: 47649.77 samples/sec  accuracy=0.999687
> > > > > INFO:root:Epoch[19] Batch [500-600]     Speed: 51708.12 samples/sec  accuracy=0.999687
> > > > > INFO:root:Epoch[19] Batch [600-700]     Speed: 57228.63 samples/sec  accuracy=0.999375
> > > > > INFO:root:Epoch[19] Batch [700-800]     Speed: 50887.85 samples/sec  accuracy=0.999844
> > > > > INFO:root:Epoch[19] Batch [800-900]     Speed: 53947.98 samples/sec  accuracy=0.999531
> > > > > INFO:root:Epoch[19] Train-accuracy=0.999717
> > > > > INFO:root:Epoch[19] Time cost=1.219
> > > > > INFO:root:Epoch[19] Validation-accuracy=0.983977
> > > > > 1011.98user 26.78system 0:31.54elapsed 3292%CPU (0avgtext+0avgdata
> > > > > 1146052maxresident)k
> > > > > 0inputs+0outputs (0major+3496364minor)pagefaults 0swaps
> > > > >
> > > > > Master, MKL ON:
> > > > >
> > > > > (py3_venv) piotr@ec2 cpu:1: ~/m/e/image-classification [master]> ldd ../../build/libmxnet.so | grep -i omp
> > > > >         libomp.so => /home/piotr/mxnet_master/build/3rdparty/openmp/runtime/src/libomp.so (0x00007f05ba38f000)
> > > > >         libiomp5.so => /home/piotr/mxnet_master/build/mklml/mklml_lnx_2019.0.5.20190502/lib/libiomp5.so (0x00007f05b09f4000)
> > > > >
> > > > > INFO:root:Epoch[18] Validation-accuracy=0.982484
> > > > > INFO:root:Epoch[19] Batch [0-100]       Speed: 36651.63 samples/sec  accuracy=0.999691
> > > > > INFO:root:Epoch[19] Batch [100-200]     Speed: 45093.98 samples/sec  accuracy=0.999844
> > > > > INFO:root:Epoch[19] Batch [200-300]     Speed: 45146.84 samples/sec  accuracy=0.999687
> > > > > INFO:root:Epoch[19] Batch [300-400]     Speed: 45119.90 samples/sec  accuracy=0.999687
> > > > > INFO:root:Epoch[19] Batch [400-500]     Speed: 44998.96 samples/sec  accuracy=0.999531
> > > > > INFO:root:Epoch[19] Batch [500-600]     Speed: 45072.25 samples/sec  accuracy=0.999844
> > > > > INFO:root:Epoch[19] Batch [600-700]     Speed: 44969.79 samples/sec  accuracy=0.999844
> > > > > INFO:root:Epoch[19] Batch [700-800]     Speed: 44962.78 samples/sec  accuracy=0.999844
> > > > > INFO:root:Epoch[19] Batch [800-900]     Speed: 44945.47 samples/sec  accuracy=0.999375
> > > > > INFO:root:Epoch[19] Train-accuracy=0.999717
> > > > > INFO:root:Epoch[19] Time cost=1.367
> > > > > INFO:root:Epoch[19] Validation-accuracy=0.982783
> > > > > 854.97user 847.21system 0:41.44elapsed 4106%CPU (0avgtext+0avgdata
> > > > > 1154348maxresident)k
> > > > > 0inputs+0outputs (0major+3624361minor)pagefaults 0swaps
> > > > >
> > > > >
> > > > > MKL OFF:
> > > > > (py3_venv) piotr@ec2 cpu:0: ~/mxnet_master [master]> grep -i MKL cmake_options.yml
> > > > > USE_MKL_IF_AVAILABLE: "OFF" # Use MKL if found
> > > > > USE_MKLML_MKL: "OFF" # Use MKLDNN variant of MKL (if MKL found) IF USE_MKL_IF_AVAILABLE AND (NOT APPLE)
> > > > > USE_MKLDNN: "OFF" # Use MKLDNN variant of MKL (if MKL found) IF USE_MKL_IF_AVAILABLE AND (NOT APPLE)
> > > > > (py3_venv) piotr@ec2 cpu:0: ~/mxnet_master [master]> ldd build/libmxnet.so | grep -i omp
> > > > >         libomp.so => /home/piotr/mxnet_master/build/3rdparty/openmp/runtime/src/libomp.so (0x00007fb720c54000)
> > > > >
> > > > > INFO:root:Epoch[18] Validation-accuracy=0.983479
> > > > > INFO:root:Epoch[19] Batch [0-100]       Speed: 46784.02 samples/sec  accuracy=1.000000
> > > > > INFO:root:Epoch[19] Batch [100-200]     Speed: 48824.29 samples/sec  accuracy=0.999687
> > > > > INFO:root:Epoch[19] Batch [200-300]     Speed: 49190.31 samples/sec  accuracy=0.999687
> > > > > INFO:root:Epoch[19] Batch [300-400]     Speed: 51518.77 samples/sec  accuracy=0.999844
> > > > > INFO:root:Epoch[19] Batch [400-500]     Speed: 51551.62 samples/sec  accuracy=0.999844
> > > > > INFO:root:Epoch[19] Batch [500-600]     Speed: 49026.35 samples/sec  accuracy=0.999844
> > > > > INFO:root:Epoch[19] Batch [600-700]     Speed: 49002.46 samples/sec  accuracy=0.999375
> > > > > INFO:root:Epoch[19] Batch [700-800]     Speed: 48980.55 samples/sec  accuracy=0.999687
> > > > > INFO:root:Epoch[19] Batch [800-900]     Speed: 47402.56 samples/sec  accuracy=0.999844
> > > > > INFO:root:Epoch[19] Train-accuracy=0.999767
> > > > > INFO:root:Epoch[19] Time cost=1.259
> > > > > INFO:root:Epoch[19] Validation-accuracy=0.983181
> > > > > 755.36user 754.94system 0:35.89elapsed 4207%CPU (0avgtext+0avgdata
> > > > > 1147008maxresident)k
> > > > > 0inputs+3112outputs (0major+3568826minor)pagefaults 0swaps
> > > > >
> > > > > Let me know what you think.
> > > > >
> > > > > Link to the original PR:
> > > > > https://github.com/apache/incubator-mxnet/pull/12160
> > > > >
> > > > > Thanks.
> > > > >
> > > > > On Wed, Jun 19, 2019 at 5:35 PM kellen sunderland
> > > > > <kellen.sunderl...@gmail.com> wrote:
> > > > > >
> > > > > > "if you’re linking in two then you’re doing something wrong."
> > > Correct,
> > > > > > that's one thing I believe we've got consensus on.  So let's call
> > > that
> > > > > out
> > > > > > as a bug to be fixed.
> > > > > >
> > > > > > Let's move forward with some reproducible numbers and then
> > > > > > discuss the pros/cons of which particular OMP implementation we
> > > > > > should use.
> > > > > >
> > > > > > On Wed, Jun 19, 2019 at 3:06 PM Pedro Larroy
> > > > > > <pedro.larroy.li...@gmail.com> wrote:
> > > > > >
> > > > > > > Hi Chris
> > > > > > >
> > > > > > > I would ask you to have a bit of patience and help us with
> > > > > > > your experience in this matter. Nobody is ignoring anything. I
> > > > > > > think we are individually gathering feedback and trying to
> > > > > > > understand the multiple contributions to this topic, including
> > > > > > > yours, and then going step by step: understanding what is
> > > > > > > going on, running experiments, and reporting back to the list
> > > > > > > or the corresponding GitHub item. It was suggested by Kellen
> > > > > > > to prepare some containers, and this takes effort.
> > > > > > >
> > > > > > > Regarding your final comment, most of us also have many other
> > > > > > > things to do and responsibilities, even if our daytime jobs
> > > > > > > might involve MXNet in some form or another. I think that's
> > > > > > > part of the privilege and responsibility of working closely
> > > > > > > with an open source project, and the magic of collaboration
> > > > > > > across organizations. Let's all be patient and take some time
> > > > > > > to understand and reason about this topic, which is not
> > > > > > > simple. Since we decided to step back and gather more data,
> > > > > > > let's take the time to do it properly.
> > > > > > >
> > > > > > > Personally, I hope to find time to look into this issue again
> > > > > > > before the end of the week.
> > > > > > >
> > > > > > > Thanks.
> > > > > > >
> > > > > > > Pedro.
> > > > > > >
> > > > > > > On Wed, Jun 19, 2019 at 2:43 PM Chris Olivier
> > > > > > > <cjolivie...@apache.org> wrote:
> > > > > > > >
> > > > > > > > If you’re linking in two then you’re doing something wrong.
> > > > > > > > You can see from my email yesterday that only one is linked
> > > > > > > > in. This is also the case with the mkl version built by the
> > > > > > > > Makefile — only the Intel OMP library is used (no libgomp).
> > > > > > > >
> > > > > > > > That being said, do you have clear evidence that using
> > > > > > > > Intel OMP is both problematic and that the situation isn’t
> > > > > > > > fixable? The burden of proof is on the ones requesting the
> > > > > > > > change — it is not my responsibility to justify the current
> > > > > > > > state. There must be something “terrible” and unfixable to
> > > > > > > > justify a change. I have seen no proof of this in all this
> > > > > > > > time.
> > > > > > > >
> > > > > > > > On a side note, I mentioned a couple of things in my email
> > > > > > > > yesterday that still have not been responded to (they were
> > > > > > > > also ignored in the last incarnation of this “discussion” —
> > > > > > > > I have enough experience in this matter to assume
> > > > > > > > “discussion” is a waste of my time, seeing as I am not paid
> > > > > > > > to “work on” mxnet like y’all are).
> > > > > > > >
> > > > > > > > -C
> > > > > > > >
> > > > > > > > On Wed, Jun 19, 2019 at 10:28 AM kellen sunderland
> > > > > > > > <kellen.sunderl...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > I've also quite often seen two versions of OpenMP linked.
> > > > > > > > > I think we can all agree we probably want to avoid linking
> > > > > > > > > in two libraries that do effectively the same thing.
> > > > > > > > >
> > > > > > > > > The performance questions should be fairly straightforward
> > > > > > > > > to demonstrate, right? Could we just collaborate on a few
> > > > > > > > > minimal Dockerfiles that show (or don't show) Intel OpenMP
> > > > > > > > > performance speedups with the workloads Chris is
> > > > > > > > > referencing?
> > > > > > > > >
> > > > > > > > > On Wed, Jun 19, 2019 at 4:44 AM Tsukrov, Stanislav
> > > > > > > > > <stanislav.tsuk...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > Hi, Chris!
> > > > > > > > > >
> > > > > > > > > > Stas here - I've gathered that performance data.
> > > > > > > > > > Sure, I can be wrong, but please elaborate a bit on what
> > > > > > > > > > we are missing. Be assured, intentional misdirection was
> > > > > > > > > > never the case.
> > > > > > > > > >
> > > > > > > > > > Thanks a lot for being constructive.
> > > > > > > > > >
> > > > > > > > > > > Turning Intel OMP on and off (and MKL as well, since
> > > > > > > > > > > it tends to pull in omp, depending which one is linked
> > > > > > > > > > > in).
> > > > > > > > > >
> > > > > > > > > > We never ever considered turning MKL off. We are on the
> > > > > > > > > > same page here - MKL is crucial for the performance.
> > > > > > > > > > Why should we? There's a GOMP-linked version of MKL that
> > > > > > > > > > we can use.
> > > > > > > > > >
> > > > > > > > > > What we did: we measured whether using the compiler's
> > > > > > > > > > default OpenMP implementation, instead of the referenced
> > > > > > > > > > source-code distribution of OpenMP, makes anything
> > > > > > > > > > slower. We found the impact to be hardly measurable.
> > > > > > > > > > The difference between GOMP and iOMP is <5% on our
> > > > > > > > > > benchmarks, most of the time less than that.
> > > > > > > > > >
> > > > > > > > > > We simply suggest simplifying the build of mxnet by
> > > > > > > > > > removing the unnecessary dependency.
> > > > > > > > > >
> > > > > > > > > > During that work we discovered, for example, the
> > > > > > > > > > following amazing issue:
> > > > > > > > > > https://github.com/apache/incubator-mxnet/issues/14087
> > > > > > > > > >
> > > > > > > > > > Best Regards
> > > > > > > > > >
> > > > > > > > > > Stas
> > > > > > > > > >
> > > > > > > > > > On 18.06.19, 18:24, "Chris Olivier"
> > > > > > > > > > <cjolivie...@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > >     I am very reluctant to feed the trolls again, and
> > > > > > > > > >     this will be the last time I address Pedro or Anton
> > > > > > > > > >     on the subject, but since I think the numbers being
> > > > > > > > > >     presented are incorrect (either because the builders
> > > > > > > > > >     don't really understand what they are building, or
> > > > > > > > > >     possibly through intentional misdirection):
> > > > > > > > > >
> > > > > > > > > >     Turning Intel OMP on and off (and MKL as well, since
> > > > > > > > > >     it tends to pull in omp, depending which one is
> > > > > > > > > >     linked in): there is a HUGE difference. This is
> > > > > > > > > >     consistent with my experience before, when it was
> > > > > > > > > >     added.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >     default mnist:
> > > > > > > > > >
> > > > > > > > > >     python ../example/image-classification/train_mnist.py
> > > > > > > > > >     INFO:root:start with arguments Namespace(add_stn=False,
> > > > > > > > > >     batch_size=64, disp_batches=100, dtype='float32',
> > > > > > > > > >     gc_threshold=0.5, gc_type='none', gpus=None,
> > > > > > > > > >     image_shape='1, 28, 28', initializer='default',
> > > > > > > > > >     kv_store='device', load_epoch=None, loss='', lr=0.05,
> > > > > > > > > >     lr_factor=0.1, lr_step_epochs='10', macrobatch_size=0,
> > > > > > > > > >     model_prefix=None, mom=0.9, monitor=0, network='mlp',
> > > > > > > > > >     num_classes=10, num_epochs=20, num_examples=60000,
> > > > > > > > > >     num_layers=None, optimizer='sgd',
> > > > > > > > > >     profile_server_suffix='', profile_worker_suffix='',
> > > > > > > > > >     save_period=1, test_io=0, top_k=0, warmup_epochs=5,
> > > > > > > > > >     warmup_strategy='linear', wd=0.0001)
> > > > > > > > > >
> > > > > > > > > >     INTEL OMP:
> > > > > > > > > >
> > > > > > > > > >     ldd libmxnet.so | grep omp
> > > > > > > > > >             libomp.so => /home/chris/src/mxnet/cmake_omp/3rdparty/openmp/runtime/src/libomp.so (0x00007f978fde7000)
> > > > > > > > > >
> > > > > > > > > >     INFO:root:Epoch[0] Batch [0-100]        Speed: 31548.09 samples/sec  accuracy=0.780012
> > > > > > > > > >     INFO:root:Epoch[0] Batch [100-200]      Speed: 16073.21 samples/sec  accuracy=0.920469
> > > > > > > > > >     INFO:root:Epoch[0] Batch [200-300]      Speed: 19075.91 samples/sec  accuracy=0.928281
> > > > > > > > > >     INFO:root:Epoch[0] Batch [300-400]      Speed: 23211.36 samples/sec  accuracy=0.942813
> > > > > > > > > >     INFO:root:Epoch[0] Batch [400-500]      Speed: 22139.79 samples/sec  accuracy=0.938750
> > > > > > > > > >     INFO:root:Epoch[0] Batch [500-600]      Speed: 23225.52 samples/sec  accuracy=0.946562
> > > > > > > > > >     INFO:root:Epoch[0] Batch [600-700]      Speed: 19547.41 samples/sec  accuracy=0.953281
> > > > > > > > > >     INFO:root:Epoch[0] Batch [700-800]      Speed: 24111.73 samples/sec  accuracy=0.951562
> > > > > > > > > >     INFO:root:Epoch[0] Batch [800-900]      Speed: 13959.88 samples/sec  accuracy=0.957500
> > > > > > > > > >     INFO:root:Epoch[0] Train-accuracy=0.925423
> > > > > > > > > >     INFO:root:Epoch[0] Time cost=3.806
> > > > > > > > > >     INFO:root:Epoch[0] Validation-accuracy=0.962580
> > > > > > > > > >     INFO:root:Epoch[1] Batch [0-100]        Speed: 24560.21 samples/sec  accuracy=0.968131
> > > > > > > > > >     INFO:root:Epoch[1] Batch [100-200]      Speed: 23457.03 samples/sec  accuracy=0.966250
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >     LIBGOMP:
> > > > > > > > > >
> > > > > > > > > >     ldd libmxnet.so | grep omp
> > > > > > > > > >             libgomp.so.1 => /usr/lib/x86_64-linux-gnu/libgomp.so.1 (0x00007f25c25dd000)
> > > > > > > > > >
> > > > > > > > > >     INFO:root:Epoch[0] Batch [0-100]        Speed: 1731.01 samples/sec   accuracy=0.782488
> > > > > > > > > >     INFO:root:Epoch[0] Batch [100-200]      Speed: 3551.32 samples/sec   accuracy=0.907813
> > > > > > > > > >     INFO:root:Epoch[0] Batch [200-300]      Speed: 1991.00 samples/sec   accuracy=0.927188
> > > > > > > > > >     INFO:root:Epoch[0] Batch [300-400]      Speed: 2175.45 samples/sec   accuracy=0.937969
> > > > > > > > > >     INFO:root:Epoch[0] Batch [400-500]      Speed: 1644.95 samples/sec   accuracy=0.942187
> > > > > > > > > >     INFO:root:Epoch[0] Batch [500-600]      Speed: 6444.58 samples/sec   accuracy=0.950156
> > > > > > > > > >     INFO:root:Epoch[0] Batch [600-700]      Speed: 7842.16 samples/sec   accuracy=0.947969
> > > > > > > > > >     INFO:root:Epoch[0] Batch [700-800]      Speed: 9412.07 samples/sec   accuracy=0.953750
> > > > > > > > > >     INFO:root:Epoch[0] Batch [800-900]      Speed: 12707.58 samples/sec  accuracy=0.953125
> > > > > > > > > >
> > > > > > > > > >     That being said, there are other issues beyond
> > > > > > > > > >     speed. The DEFAULT build from the Makefile (not
> > > > > > > > > >     CMake) uses Intel OMP with MKL (I showed this
> > > > > > > > > >     before), and mysteriously it has no issues? This
> > > > > > > > > >     seems highly suspicious. All I see is a lot of
> > > > > > > > > >     hand-waving and conjecture, and pointing to
> > > > > > > > > >     StackOverflow posts made by people who may be of
> > > > > > > > > >     questionable pedigree to begin with. This smells of
> > > > > > > > > >     a Pedro-ego-fight rather than one of purely
> > > > > > > > > >     technical merit. Also, anyone who knows how OMP
> > > > > > > > > >     works would be very suspicious of the "intermittent
> > > > > > > > > >     hangs" claim -- that's probably just broken race
> > > > > > > > > >     conditions elsewhere until proven otherwise. It'd
> > > > > > > > > >     tend to freeze on the first use if something were
> > > > > > > > > >     wrong (try using libgomp after a fork and see),
> > > > > > > > > >     since worker threads wouldn't be assigned/joined
> > > > > > > > > >     properly. Intel OMP is faster, and it also has other
> > > > > > > > > >     advantages, such as allowing OMP after a fork.
> > > > > > > > > >
> > > > > > > > > >     I actually addressed a lot of issues and asked for
> > > > > > > > > >     clarification in the original PRs way back when, but
> > > > > > > > > >     they were all just ignored.
> > > > > > > > > >
> > > > > > > > > >     -Chris
