MasterJH5574 opened a new pull request, #16396:
URL: https://github.com/apache/tvm/pull/16396

   This PR adds inline RoPE computation to PagedKVCache, which unblocks 
the work towards sliding window attention and attention sinks.
   
   Both the FlashInfer and TIR kernels are updated in this PR to perform the 
RoPE calculation. Note that FlashInfer is bumped to pick up the RoPE update.
   
   The previous standalone kernel used for RoPE application is thereby removed.
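   
   For readers unfamiliar with the idea, here is a minimal pure-Python sketch of 
what "inline" RoPE means compared with a standalone pass. It is not the FlashInfer 
or TIR kernel code from this PR. The function names, the interleaved pair layout, 
the base of 10000, and the assumption that the cache holds un-rotated keys are all 
illustrative choices made for this sketch.

```python
import math


def rope(x, pos, base=10000.0):
    """Rotate consecutive pairs of x by position-dependent angles.

    Assumes an interleaved pair layout (x[0], x[1]), (x[2], x[3]), ...
    Real kernels may use a different layout.
    """
    d = len(x)
    out = [0.0] * d
    for i in range(0, d, 2):
        # Pair index j = i / 2 uses frequency base ** (-2j / d) = base ** (-i / d).
        angle = pos * base ** (-i / d)
        c, s = math.cos(angle), math.sin(angle)
        out[i] = x[i] * c - x[i + 1] * s
        out[i + 1] = x[i] * s + x[i + 1] * c
    return out


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def scores_standalone(q, q_pos, keys, key_positions):
    """Hypothetical standalone flow: rotate keys in a separate pass, then attend."""
    rotated_cache = [rope(k, p) for k, p in zip(keys, key_positions)]
    q_rot = rope(q, q_pos)
    return [dot(q_rot, k) for k in rotated_cache]


def scores_inline(q, q_pos, keys, key_positions):
    """Hypothetical inline flow: keys stay un-rotated, rotation happens while attending."""
    q_rot = rope(q, q_pos)
    return [dot(q_rot, rope(k, p)) for k, p in zip(keys, key_positions)]


if __name__ == "__main__":
    q = [0.1, -0.2, 0.3, 0.4]
    keys = [[1.0, 0.0, -1.0, 0.5], [0.2, 0.3, 0.4, 0.5]]
    print(scores_standalone(q, 2, keys, [0, 1]))
    print(scores_inline(q, 2, keys, [0, 1]))
```

   Both flows yield the same attention scores. In the inline flow the position 
of each key is supplied at attention time, which is presumably what makes it a 
convenient base for sliding window and attention sink, where the position used 
for a cached entry may differ from its original position in the text.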
   
   ---
   
   Co-authored-by: Bohan Hou <spectromet...@gmail.com>
   Co-authored-by: Hongyi Jin <jinhongy...@gmail.com>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@tvm.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
