masahi edited a comment on pull request #7441: URL: https://github.com/apache/tvm/pull/7441#issuecomment-780284547
It can be a lot simpler than that. Unique is basically sort + adjacent difference + exclusive scan. If that statement isn't clear, the following example should help.

We have exclusive scan for CPU (`cumsum` op with `exclusive=True`) and GPU (see https://github.com/apache/tvm/pull/7303). If we implement unique this way, the same code runs on both CPU and GPU.

```
import numpy as np


def exclusive_scan(arr):
    # out[i] = sum(arr[:i])
    return np.cumsum(arr) - arr


inp = np.random.randint(0, 10, size=(15,))
argsort_indices = np.argsort(inp)
sorted_inp = inp[argsort_indices]
print("sorted input:", sorted_inp)

# Prepend 1 so the first sorted element always counts as a new value.
adj_diff = np.concatenate([[1], np.diff(sorted_inp)])
print("adjacent difference:", adj_diff)

non_zero = adj_diff != 0
ex_scan = exclusive_scan(non_zero)
print("exclusive scan:", ex_scan)

unique = np.zeros(inp.shape[0], dtype=inp.dtype)
for i in range(inp.shape[0]):
    if non_zero[i]:
        unique[ex_scan[i]] = inp[argsort_indices[i]]

# ex_scan[-1] does not include the last flag, so add it explicitly.
print("num unique element:", ex_scan[-1] + non_zero[-1])
print("unique:", unique)
```

Output:

```
sorted input: [0 0 0 4 5 5 6 6 6 6 6 7 8 8 9]
adjacent difference: [1 0 0 4 1 0 1 0 0 0 0 1 1 0 1]
exclusive scan: [0 1 1 1 2 3 3 4 4 4 4 4 5 6 6]
num unique element: 7
unique: [0 4 5 6 7 8 9 0 0 0 0 0 0 0 0]
```
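The per-element loop in the example is a scatter in which every flagged element writes to its own slot, which is why the approach parallelizes well. As a rough sketch of that idea in plain NumPy (not TVM code; `unique_via_scan` is a hypothetical helper name, and a 1-D integer input is assumed), the loop can be written as a single masked assignment:

```python
import numpy as np


def exclusive_scan(arr):
    # Exclusive prefix sum: out[i] = sum(arr[:i]).
    return np.cumsum(arr) - arr


def unique_via_scan(inp):
    """Hypothetical helper: sorted unique values of a 1-D integer array."""
    if inp.shape[0] == 0:
        return inp.copy()
    sorted_inp = np.sort(inp)
    # Flag the first element of every run of equal values.
    is_first = np.concatenate([[True], sorted_inp[1:] != sorted_inp[:-1]])
    # The output slot of a flagged element is the number of flags before it.
    slots = exclusive_scan(is_first.astype(np.int64))
    # The exclusive scan leaves out the last flag, so add it back.
    num_unique = slots[-1] + int(is_first[-1])
    out = np.zeros(inp.shape[0], dtype=inp.dtype)
    # Scatter: flagged elements map to distinct slots, so writes never collide.
    out[slots[is_first]] = sorted_inp[is_first]
    return out[:num_unique]


inp = np.array([6, 0, 5, 6, 8, 0, 4, 6, 9, 6, 5, 7, 0, 8, 6])
print(unique_via_scan(inp))  # [0 4 5 6 7 8 9]
```

The result can be checked against `np.unique(inp)`, which returns the same sorted values for this input.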
