solin319 commented on issue #10366: fix bug in sgd
URL: https://github.com/apache/incubator-mxnet/pull/10366#issuecomment-383771962
Batch-size=128
With the device kvstore, the performance is almost the same: both are about 110 samples/sec.
solin319 commented on issue #10366: fix bug in sgd
URL: https://github.com/apache/incubator-mxnet/pull/10366#issuecomment-383478020
@eric-haibin-lin
gpus=2*k80
network=vgg16
data=imagenet
kv-store=local
The performance is 131 samples/sec when we remove the temp resource.
If
solin319 commented on issue #10366: fix bug in sgd
URL: https://github.com/apache/incubator-mxnet/pull/10366#issuecomment-379620820
The results above were obtained in multi-GPU training with kv_store='local'.
The same problem occurs with kv_store='device' too.
When we train on multiple machines,
solin319 commented on issue #10366: fix bug in sgd
URL: https://github.com/apache/incubator-mxnet/pull/10366#issuecomment-378185953
I set MXNET_CPU_TEMP_COPY = 100.
When training resnet-50, the sgd_mom_update still can't start directly after
the first backward computation.
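For anyone reproducing this, here is a minimal sketch of how the variable could be set from Python. The helper name `apply_temp_copy_settings` is hypothetical and not part of MXNet. Setting the variable before `import mxnet` is an assumption about the safest ordering, not something verified in this thread.

```python
import os


def apply_temp_copy_settings(cpu_copies, env=os.environ):
    """Hypothetical helper: store the CPU temp-copy limit as a string.

    Assumption: the variable should be set before `import mxnet`,
    so the library sees it when it initializes its resources.
    """
    env["MXNET_CPU_TEMP_COPY"] = str(cpu_copies)
    return env


# Value used in the experiment described above.
apply_temp_copy_settings(100)

# import mxnet as mx  # import only after the variable is set
```

The same thing can be done from the shell by exporting the variable before launching the training script.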
solin319 commented on issue #10366: fix bug in sgd
URL: https://github.com/apache/incubator-mxnet/pull/10366#issuecomment-378098403
@eric-haibin-lin
Setting MXNET_EXEC_NUM_TEMP doesn't work.
But making MXNET_CPU_TEMP_COPY and MXNET_GPU_TEMP_COPY larger can solve the
overlap problem.
It's