[GitHub] [tvm] mbrookhart commented on a change in pull request #6978: [Relay][Topi][Dynamic] Add a Sort op and use the legalization pass to perform dynamic topk on GPU
mbrookhart commented on a change in pull request #6978: URL: https://github.com/apache/tvm/pull/6978#discussion_r533605262 ## File path: python/tvm/topi/cuda/sort.py ## @@ -593,3 +694,82 @@ def schedule_topk(outs): The computation schedule for the op. """ return _schedule_sort(outs) + + +def _dyn_topk_legalize(attrs, inputs, arg_types): Review comment: Any tips for debugging generated code? I have a branch where I've done this a a topi composition, it passes if I compile with AddressSanitizer, but segfaults with a normal build in the generated code. @zhiics attempted to do argwhere, but gets compilation errors with topi dropping variables. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [tvm] mbrookhart commented on a change in pull request #6978: [Relay][Topi][Dynamic] Add a Sort op and use the legalization pass to perform dynamic topk on GPU
mbrookhart commented on a change in pull request #6978: URL: https://github.com/apache/tvm/pull/6978#discussion_r533605262 ## File path: python/tvm/topi/cuda/sort.py ## @@ -593,3 +694,82 @@ def schedule_topk(outs): The computation schedule for the op. """ return _schedule_sort(outs) + + +def _dyn_topk_legalize(attrs, inputs, arg_types): Review comment: Any tips for debugging generated code? I have a branch where I've done this a a topi composition, it passes if I compile with AddressSanitizer, but segfaults with a normal build. @zhiics attempted to do argwhere, but gets compilation errors with topi dropping variables. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [tvm] mbrookhart commented on a change in pull request #6978: [Relay][Topi][Dynamic] Add a Sort op and use the legalization pass to perform dynamic topk on GPU
mbrookhart commented on a change in pull request #6978: URL: https://github.com/apache/tvm/pull/6978#discussion_r532864974 ## File path: python/tvm/topi/cuda/sort.py ## @@ -593,3 +694,82 @@ def schedule_topk(outs): The computation schedule for the op. """ return _schedule_sort(outs) + + +def _dyn_topk_legalize(attrs, inputs, arg_types): +"""Legalizes dyn.topk op. + +On GPU, we don't directly implement a topi kernel. Instead, we combine +topi sort and topi strided slice to implement topk. This topi-level composition +doesn't work with the dynamic op. To support the dynamic op on gpu, we instead +legalize it into the same logic (sort + strided slice) in relay to take advantage +of the VM's dynamic shape capabilities + +Parameters +-- +attrs : tvm.ir.Attrs +Attributes of current convolution +inputs : list of tvm.relay.Expr +The args of the Relay expr to be legalized +types : list of types +List of input and output types + +Returns +--- +result : tvm.relay.Expr +The legalized expr +""" +data = tvm.relay.var( +"dyn_topk_data", shape=inputs[0].checked_type.shape, dtype=inputs[0].checked_type.dtype +) +k = tvm.relay.var( +"dyn_topk_k", shape=inputs[1].checked_type.shape, dtype=inputs[1].checked_type.dtype +) +sort_out = tvm.relay.sort(data, axis=attrs.axis, is_ascend=attrs.is_ascend) +argsort_out = tvm.relay.argsort( +data, axis=attrs.axis, is_ascend=attrs.is_ascend, dtype=attrs.dtype +) +dshape = tvm.relay.shape_of(data) + +axis = tvm.relay.const([attrs.axis]) + +zero = tvm.relay.const([0]) +one = tvm.relay.const([1]) +rank = tvm.relay.shape_of(dshape) + +normalized_axis = tvm.relay.where(axis < zero, axis + rank, axis) Review comment: Most of the small elementwise ops to get shapes will be fused, and this matches what I did in TOPI pretty closely. The only real performance concern I have with this is that it does sort and argsort separately. The way the cuda kernels are written, I think that ends up sorting the input data twice. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [tvm] mbrookhart commented on a change in pull request #6978: [Relay][Topi][Dynamic] Add a Sort op and use the legalization pass to perform dynamic topk on GPU
mbrookhart commented on a change in pull request #6978: URL: https://github.com/apache/tvm/pull/6978#discussion_r530658429 ## File path: python/tvm/topi/cuda/sort.py ## @@ -593,3 +694,76 @@ def schedule_topk(outs): The computation schedule for the op. """ return _schedule_sort(outs) + + +def _dyn_topk_legalize(attrs, inputs, arg_types): +"""Legalizes dyn.topk op. + +Parameters +-- +attrs : tvm.ir.Attrs +Attributes of current convolution +inputs : list of tvm.relay.Expr +The args of the Relay expr to be legalized +types : list of types +List of input and output types + +Returns +--- +result : tvm.relay.Expr Review comment: This is how the other legalization functions are structured, which admittedly seems odd to me, but they have functions in topi implementing relay passes and calling relay, see cuda conv2d here: https://github.com/apache/tvm/blob/c58b20bbd8e6ac7128e82994624d922ee6b8a069/python/tvm/topi/cuda/conv2d_alter_op.py#L271-L348 I did attempt to do this more directly in the relay files, and got a lot of circular imports, so I decided to follow the example of the other ops. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org