HaohaiWen wrote:

> Please can you confirm this as llvm-mca predicts worse case (znver4) to be 4 
> https://llvm.godbolt.org/z/fxWTaf3Gv

Currently, uiCA doesn't support Zen4 and I don't have a Zen4 machine.
I can measure it on a local SKX machine with nanoBench 
(https://github.com/andreas-abel/nanoBench). If you have a Zen4 machine, 
maybe you can use it to confirm the Zen4 cost. 
e.g.
```
./nanoBench.sh -init "xor zmm0, zmm0" -asm "vcvtps2pd zmm2, ymm0; vextractf64x4 ymm0, zmm0, 1; vcvtps2pd zmm1, ymm0" -config configs/cfg_SkylakeX_common.txt -unroll 1000 -loop 1000 -warm_up_count 10 -cpu 0
Note: Hyper-threading is enabled; it can be disabled with "sudo ./disable-HT.sh"
CORE_CYCLES: 4.77
INST_RETIRED: 3.00
IDQ.MITE_UOPS: 5.71
IDQ.DSB_UOPS: -0.70
IDQ.MS_UOPS: 0.01
LSD.UOPS: 0.00
UOPS_ISSUED: 5.01
UOPS_EXECUTED: 5.01
UOPS_RETIRED.RETIRE_SLOTS: 5.01
UOPS_DISPATCHED_PORT.PORT_0: 2.00
UOPS_DISPATCHED_PORT.PORT_1: 0.00
UOPS_DISPATCHED_PORT.PORT_2: 0.00
UOPS_DISPATCHED_PORT.PORT_3: 0.00
UOPS_DISPATCHED_PORT.PORT_4: 0.00
UOPS_DISPATCHED_PORT.PORT_5: 3.00
UOPS_DISPATCHED_PORT.PORT_6: 0.00
UOPS_DISPATCHED_PORT.PORT_7: 0.00

```
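In case it helps when comparing against a Zen4 run, here is a minimal Python sketch that summarizes the counters above. It is not part of nanoBench: the helper names `parse_counters` and `port_pressure` are hypothetical, and the input is a subset of the output pasted above. The busiest port only gives a lower bound on cycles per iteration, since it ignores front-end and dependency effects.

```python
# Subset of the nanoBench output above, copied verbatim.
NANOBENCH_OUTPUT = """\
Note: Hyper-threading is enabled; it can be disabled with "sudo ./disable-HT.sh"
CORE_CYCLES: 4.77
INST_RETIRED: 3.00
UOPS_ISSUED: 5.01
UOPS_EXECUTED: 5.01
UOPS_DISPATCHED_PORT.PORT_0: 2.00
UOPS_DISPATCHED_PORT.PORT_1: 0.00
UOPS_DISPATCHED_PORT.PORT_5: 3.00
UOPS_DISPATCHED_PORT.PORT_6: 0.00
"""

PORT_PREFIX = "UOPS_DISPATCHED_PORT.PORT_"


def parse_counters(text):
    """Parse 'NAME: value' lines into a dict, skipping non-numeric lines."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue
        try:
            counters[name.strip()] = float(value)
        except ValueError:
            continue  # e.g. the Hyper-threading note
    return counters


def port_pressure(counters):
    """Return {port number: uops per iteration} for the port counters."""
    return {
        int(name[len(PORT_PREFIX):]): value
        for name, value in counters.items()
        if name.startswith(PORT_PREFIX)
    }


counters = parse_counters(NANOBENCH_OUTPUT)
ports = port_pressure(counters)
busiest_port = max(ports, key=ports.get)
port_bound = ports[busiest_port]      # lower bound on cycles per iteration
total_port_uops = sum(ports.values())

print(f"busiest port: {busiest_port} ({port_bound:.2f} uops/iter)")
print(f"uops on listed ports: {total_port_uops:.2f}")
print(f"measured cycles/iter: {counters['CORE_CYCLES']:.2f}")
```

For this SKX run the measured cycle count is above the port-5 bound, so port pressure alone does not explain the cost.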

https://github.com/llvm/llvm-project/pull/76278
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
