masahi commented on pull request #9737:
URL: https://github.com/apache/tvm/pull/9737#issuecomment-993909962


   > @masahi Maybe the reason why the TVM script takes so long is that you are 
doing 100 iterations per benchmark where as the cutlass script is only doing 20?
   
   I believe `cutlass_profiler` is also doing 100 iterations by default: 
https://github.com/NVIDIA/cutlass/blob/808c25337a3ed4c97ac21895257b1addc72d6ca8/tools/profiler/src/options.cu#L386
   
   >  Also the TVM script is running through the whole tvm compilation pipeline 
for each workload.
   
   As I commented in 
https://github.com/apache/tvm/pull/9737#discussion_r768968554, we don't invoke 
the tvm pipeline when we select cutlass kernels. 
   
   One major difference between the two scripts is that cutlass compiles all kernels 
into one giant profiler executable, while we generate a separate executable for 
each kernel. So cutlass can allocate / deallocate memory once and loop through 
each kernel for a given workload to select the best one. Also, I remember that 
there is a non-trivial initialization cost (close to 1 sec) for any CUDA app 
when we invoke the first CUDA API call - for `cutlass_profiler` this happens 
only once, while we pay that cost for each profiler binary (there are about 60 of 
them). 
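   As a back-of-the-envelope check, the fixed startup cost alone adds up quickly when it is paid per binary. The sketch below just does the arithmetic using the rough figures mentioned above (about 60 binaries, roughly 1 second of initialization each); these are estimates, not measurements:

```python
# Rough estimate of CUDA context-init overhead: per-binary vs. amortized.
# The numbers are the approximate figures from the discussion, not measurements.
NUM_PROFILER_BINARIES = 60   # one executable per candidate kernel
INIT_COST_SEC = 1.0          # approximate cost of the first CUDA API call

# Separate binaries: every process pays the init cost once.
per_binary_overhead = NUM_PROFILER_BINARIES * INIT_COST_SEC

# Single combined executable (cutlass_profiler style): paid exactly once.
combined_overhead = INIT_COST_SEC

print(per_binary_overhead)  # 60.0 -> about a minute of pure startup overhead
print(combined_overhead)    # 1.0
```

   So startup cost alone could account for on the order of a minute, though as noted above that still doesn't explain the full gap.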
   
   But this still doesn't explain the 10x difference, so I believe there is 
something else going on. We could adopt the same approach as `cutlass_profiler` 
and compile all candidate kernels into one executable. I didn't do that for 
`conv2d_profiler` because I just followed how `gemm_profiler` is implemented, 
but that could be a possible improvement. 
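   For illustration, the single-executable approach amounts to something like the loop below: allocate buffers once, time every candidate against them, and keep the fastest. This is a hypothetical Python sketch with plain callables standing in for compiled CUTLASS kernels; `select_best_kernel` and the toy "kernels" are made up for this example, not actual TVM or CUTLASS APIs.

```python
import time

def select_best_kernel(kernels, buffers, iterations=100):
    """Time each candidate kernel against shared, pre-allocated buffers
    and return (name, avg_seconds) of the fastest.

    `kernels` maps a kernel name to a callable; in a real combined
    profiler these would be compiled kernel instantiations, and the
    buffers would be device allocations made once up front."""
    best_name, best_time = None, float("inf")
    for name, run_kernel in kernels.items():
        start = time.perf_counter()
        for _ in range(iterations):
            run_kernel(buffers)
        elapsed = (time.perf_counter() - start) / iterations
        if elapsed < best_time:
            best_name, best_time = name, elapsed
    return best_name, best_time

# Toy usage: two fake "kernels" standing in for compiled candidates.
if __name__ == "__main__":
    buffers = [0.0] * 1024            # allocated once, shared by all candidates
    slow = lambda buf: sum(buf) * 2   # pretend these launch GPU kernels
    fast = lambda buf: None
    name, _ = select_best_kernel({"slow": slow, "fast": fast}, buffers,
                                 iterations=10)
    print(name)
```

   The key point is structural: allocation, CUDA initialization, and process startup all happen once, outside the per-kernel loop, instead of once per candidate.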

