Not exactly. In your last example, y is a scalar, so it stays in the CPU context. It is not copied to the GPU and broadcast, either. The operation runs on the GPU as a scalar-vector multiply.
