TristonC commented on issue #18751:
URL:
https://github.com/apache/incubator-mxnet/issues/18751#issuecomment-665436383
@gilbertfrancois What is the BN supposed to do for your model in the tail? Is
it supposed to batch-normalize every single value?
---
TristonC commented on issue #18751:
URL:
https://github.com/apache/incubator-mxnet/issues/18751#issuecomment-663168367
I am not sure that putting the running mean and running var into the
backward pass is the solution. It can be achieved by setting
autograd.record(train_mode=False). The p
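To make the train/inference distinction concrete, here is a minimal NumPy sketch of the two BatchNorm modes (the function and variable names are illustrative, and the momentum/epsilon values are assumed defaults, not MXNet's actual implementation): in train mode the batch statistics are used and the running statistics are updated, while with `autograd.record(train_mode=False)` the layer behaves like inference mode, reading the stored running statistics and leaving them untouched.

```python
import numpy as np

def batchnorm(x, running_mean, running_var, momentum=0.9, eps=1e-5,
              train_mode=True):
    """Toy 1-D batch norm: normalize each feature column of x.

    train_mode=True  -> use batch statistics, update running stats.
    train_mode=False -> use stored running stats, leave them untouched
                        (analogous to autograd.record(train_mode=False)).
    """
    if train_mode:
        mean = x.mean(axis=0)
        var = x.var(axis=0)
        # Exponential moving average of the batch statistics.
        running_mean[:] = momentum * running_mean + (1 - momentum) * mean
        running_var[:] = momentum * running_var + (1 - momentum) * var
    else:
        mean, var = running_mean, running_var
    return (x - mean) / np.sqrt(var + eps)

x = np.array([[1.0, 2.0], [3.0, 4.0]])
rm, rv = np.zeros(2), np.ones(2)

y_train = batchnorm(x, rm, rv, train_mode=True)   # running stats updated
y_infer = batchnorm(x, rm, rv, train_mode=False)  # running stats untouched
```

After the train-mode call, `rm` has moved toward the batch means [2, 3] by a factor of 0.1; the inference-mode call reuses those values without modifying them.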
TristonC commented on issue #18751:
URL:
https://github.com/apache/incubator-mxnet/issues/18751#issuecomment-662730136
After nn.Flatten(), the batch norm is actually performed on a 1xCx1x1
Tensor, where C is 9408 for the first batch norm layer in the tail and 32
for the second batch mo
TristonC commented on issue #18751:
URL:
https://github.com/apache/incubator-mxnet/issues/18751#issuecomment-662721253
It looks like there is a bug in doing batch norm on a 1D array when the
batch size is 1. For example, in this case, after flattening, the vector size
is 9408, which m
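The degenerate batch-size-1 case is easy to reproduce numerically. A NumPy sketch (epsilon value assumed; this mimics only the normalization step, not any real BatchNorm implementation): with a batch of one, each feature's batch mean equals the value itself and its variance is exactly zero, so normalizing with the batch's own statistics zeroes every output.

```python
import numpy as np

# A single flattened sample: batch size 1, C features (9408 in the issue).
rng = np.random.default_rng(0)
x = rng.random((1, 9408))

# Per-feature batch statistics computed over a batch of one:
mean = x.mean(axis=0)  # equals the sample itself
var = x.var(axis=0)    # exactly zero for every feature

# Normalizing with the batch's own statistics zeroes every value,
# so the layer output collapses regardless of the input.
y = (x - mean) / np.sqrt(var + 1e-5)

print(np.abs(y).max())  # 0.0
```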
TristonC commented on issue #18751:
URL:
https://github.com/apache/incubator-mxnet/issues/18751#issuecomment-662687910
@gilbertfrancois I did a quick test, to answer your question:
> I don't understand why y_out from MyNet with BatchNorm on GPU still
contains real numbers, given that th
TristonC commented on issue #18751:
URL:
https://github.com/apache/incubator-mxnet/issues/18751#issuecomment-662547038
@gilbertfrancois Is your project for training or inference? Your script
uses autograd, but it never calls backward(). The reason I asked this is
that BatchNorm behave
TristonC commented on issue #18751:
URL:
https://github.com/apache/incubator-mxnet/issues/18751#issuecomment-661448168
@szha It may not affect real training for either the CPU or GPU version, as
the CPU version does update the running mean and running var in the backward
path. Should we unify the
TristonC commented on issue #18751:
URL:
https://github.com/apache/incubator-mxnet/issues/18751#issuecomment-661442792
In case anyone wants to do a PyTorch comparison for the CPU version:
This is an automated message from the Apache Git Service.
TristonC commented on issue #18751:
URL:
https://github.com/apache/incubator-mxnet/issues/18751#issuecomment-661225937
The CPU version updates the running mean and running var in the backward
path, and there are some nuanced differences there between the CPU and GPU.
@gilbertfrancois, could you try t
TristonC commented on issue #18751:
URL:
https://github.com/apache/incubator-mxnet/issues/18751#issuecomment-660586280
On the GPU, the first running mean is 0, while the following three running
means are 0.1, 0.19, and 0.271, which can be explained as
running_mean = 0.1 * running_mean + 0.9 * p
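The observed sequence can be reproduced with an exponential-moving-average update. A pure-Python sketch, assuming a momentum of 0.9 applied to the previous running mean and a constant batch mean of 1.0 (the exact factor placement in the truncated formula above may differ, but these assumed values match the reported numbers):

```python
# Running-mean EMA update: keep 0.9 of the old value, blend in 0.1 of
# the current batch mean (assumed fixed at 1.0 for this illustration).
momentum, batch_mean = 0.9, 1.0
running_mean = 0.0
history = [running_mean]
for _ in range(3):
    running_mean = momentum * running_mean + (1 - momentum) * batch_mean
    history.append(round(running_mean, 4))

print(history)  # [0.0, 0.1, 0.19, 0.271]
```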
TristonC commented on issue #18751:
URL:
https://github.com/apache/incubator-mxnet/issues/18751#issuecomment-660583673
The CPU result seems wrong, while the GPU result seems more reasonable.