malixian opened a new issue, #15987:
URL: https://github.com/apache/tvm/issues/15987

   ### Expected behavior
   I tried to use MetaSchedule to tune a matmul whose dimensions are m=8192, n=14336, k=8192.
   When n=8192 everything works, but as soon as m or n equals 14336, the error `RuntimeError: parallel_for_dynamic error with [02:23:57] /home/malixian/repos/tensorir/tvm/src/ir/expr.cc:88: InternalError: Check failed: value < 1LL << (dtype.bits() - 1) (8589934591 vs. 2147483648) : ValueError: Literal value 8589934591 exceeds maximum of int32` occurs. By the way, everything is fine when k equals 14336.
   Following the error message, I commented out the `ICHECK` in the `IntImm` constructor in expr.cc, and tuning then worked normally again.
   I think the TIR `DataType` should be widened (e.g., to int64) to handle this case.
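   For reference, the failing check can be triggered in isolation: constructing an int32 `IntImm` with the literal from the log hits the same `ICHECK`, while an int64 literal holds the value without complaint. A minimal sketch (not part of the original report):
   
   ```
   import tvm
   
   # An int64 literal accommodates the value, in line with the suggestion above.
   print(tvm.tir.IntImm("int64", 8589934591))
   
   # This hits the failing check in src/ir/expr.cc and raises:
   # ValueError: Literal value 8589934591 exceeds maximum of int32
   tvm.tir.IntImm("int32", 8589934591)
   ```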
   
   ### Actual behavior
   
   `RuntimeError: parallel_for_dynamic error with [02:23:57] /home/malixian/repos/tensorir/tvm/src/ir/expr.cc:88: InternalError: Check failed: value < 1LL << (dtype.bits() - 1) (8589934591 vs. 2147483648) : ValueError: Literal value 8589934591 exceeds maximum of int32`
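   The two constants in the message decode cleanly: the rejected literal is 2^33 - 1 and the reported bound is 2^31, so some flattened loop extent or index has just crossed the int32 range. A quick arithmetic check (added for clarity):
   
   ```
   assert 8589934591 == 2**33 - 1  # the rejected literal
   assert 2147483648 == 2**31      # the int32 range bound printed by the check
   ```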
   
   ### Environment
   
   TVM version is '0.15.dev0'
   
   ### Steps to reproduce
   
   
   ```
   import tempfile

   import tvm
   from tvm import meta_schedule as ms
   from tvm import te
   from tvm.meta_schedule.builder import LocalBuilder
   from tvm.target import Target


   def matmul_fp16(M: int, N: int, K: int, in_dtype: str, out_dtype: str):
       x = te.placeholder((M, K), name="X", dtype=in_dtype)
       y = te.placeholder((K, N), name="Y", dtype=in_dtype)
       k = te.reduce_axis((0, K), name="k")
       c = te.compute(  # pylint: disable=invalid-name
           (M, N),
           lambda i, j: te.sum(
               x[i][k].astype(out_dtype) * y[k][j].astype(out_dtype), axis=[k]
           ),
           name="C",
       )
       return (x, y, c)


   def tune(in_dtype, out_dtype):
       target = Target("nvidia/nvidia-a100")
       M, N, K = 8192, 14336, 8192  # m or n = 14336 triggers the overflow
       func = te.create_prim_func(
           matmul_fp16(M=M, N=N, K=K, in_dtype=in_dtype, out_dtype=out_dtype)
       ).with_attr({"global_symbol": "main"})

       # Search space restricted to the CUDA TensorCore rules.
       space = ms.space_generator.PostOrderApply(
           sch_rules="cuda-tensorcore",
           postprocs="cuda-tensorcore",
           mutator_probs="cuda-tensorcore",
       )

       mod = tvm.IRModule({"main": func})
       with tempfile.TemporaryDirectory() as work_dir:
           db = ms.tir_integration.tune_tir(
               mod=mod,
               target=target,
               work_dir=work_dir,
               max_trials_global=32,
               # `initializer` (defined elsewhere in the original setup) registers
               # "meta_schedule.builder.async_build" in the builder worker processes.
               builder=LocalBuilder(
                   f_build="meta_schedule.builder.async_build",
                   initializer=initializer,
               ),
               space=space,
           )
           sch = db.query_schedule(mod, target=target, workload_name="main")
           with tvm.transform.PassContext(config={"tir.use_async_copy": 1}):
               rt_mod = tvm.build(sch.mod, target=target)
   ```
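   The snippet does not show the call site; presumably tuning is kicked off with something like the following, where the dtypes are an assumption based on the `matmul_fp16` name:
   
   ```
   # Assumed entry point; fp16 inputs with fp32 accumulation is a guess.
   tune(in_dtype="float16", out_dtype="float32")
   ```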
   

