Package: wnpp
Severity: wishlist
Owner: Cordell Bloor <c...@slerp.xyz>
X-Debbugs-Cc: debian-de...@lists.debian.org, c...@slerp.xyz, debian...@lists.debian.org

* Package name    : rocm-tensile
  Version         : 6.0.2
* URL             : https://github.com/ROCm/Tensile
* License         : Expat
  Programming Lang: Python, HIP
  Description     : ROCm tool for generating and benchmarking assembly kernels

 Tensile is a set of tools and libraries primarily for selecting parameters
 of GPU kernels implementing the general matrix multiply (GEMM) operation.
 There are three components that comprise Tensile:
 .
  1. A command-line tool for generating kernels, benchmarking them, and
     saving the parameters used for generating the best kernels (a.k.a.
     "solutions") in YAML files.
  2. A build system component that reads YAML solution files, generates
     kernel source files, and invokes the compiler to turn them into code
     object files. The kernels are indexed by their parameters in either
     YAML or MessagePack format within a TensileLibrary file.
  3. A runtime library for loading and executing the best available
     solution for a given set of GEMM input parameters (a.k.a. "a
     problem").

The rocm-tensile library sources are currently packaged as part of rocblas
in a multi-upstream tarball package, but they should be split out so that
the command-line tool can be packaged.

Tensile kernels are a vital part of the performance of the rocblas library.
It is often necessary to add tuned kernels for particular problem sizes to
achieve optimal performance in a new application or on a new hardware
architecture. This is therefore an important development tool for BLAS
performance on AMD GPUs.

A fork of the Tensile library is also used by hipblaslt. Splitting Tensile
out from the rocblas package may be helpful in preventing the duplication
of embedded copies. The Tensile library can also be used by MIOpen.

This package is part of AMD's ROCm stack and will be maintained under the
Debian AI team umbrella.
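
As a rough illustration of the third component described above (choosing the
best available solution for a given problem), the following Python sketch
shows the general idea only. It is not Tensile's API or its library format:
the Problem and Solution classes, their fields, the matching rule, and the
benchmark figures are all hypothetical placeholders chosen to keep the
example small.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Problem:
    """GEMM input parameters for C (m x n) = A (m x k) * B (k x n)."""
    m: int
    n: int
    k: int
    dtype: str


@dataclass(frozen=True)
class Solution:
    """Hypothetical record for one tuned kernel (not Tensile's schema)."""
    name: str
    dtype: str
    max_m: int      # largest m this kernel is assumed to handle
    max_n: int      # largest n this kernel is assumed to handle
    gflops: float   # placeholder benchmark score saved at tuning time


def select_solution(library: Iterable[Solution],
                    problem: Problem) -> Optional[Solution]:
    """Return the fastest solution that can handle the problem, if any."""
    candidates = [
        s for s in library
        if s.dtype == problem.dtype
        and problem.m <= s.max_m
        and problem.n <= s.max_n
    ]
    # Among the applicable kernels, prefer the best benchmark score.
    return max(candidates, key=lambda s: s.gflops, default=None)


# Illustrative library; the names and numbers are made up.
library = [
    Solution("small_f32", "f32", 256, 256, 900.0),
    Solution("large_f32", "f32", 8192, 8192, 700.0),
    Solution("large_f16", "f16", 8192, 8192, 1500.0),
]

small = select_solution(library, Problem(128, 128, 64, "f32"))
large = select_solution(library, Problem(1024, 1024, 1024, "f32"))
missing = select_solution(library, Problem(64, 64, 64, "f64"))
```

In this sketch a small f32 problem matches both f32 kernels and gets the one
with the higher recorded score, while a problem with no matching kernel
yields None. That second case is the sort of gap that adding tuned kernels
for particular problem sizes is meant to close.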