Hello,

I'm redirecting the question to the right mailing list.

> 1.- ARM's documentation (Cortex™-A9 NEON™ Media Processing Engine) provides
> instruction timing tables for VFP and NEON instructions. According to those
> tables VFP and NEON instructions have different timing values, for example
> the VFP vadd instruction takes 4 cycles and the NEON vadd instruction takes
> 6 cycles:

In my understanding, the VFP VADD takes 4 cycles, but the NEON integer VADD
takes 2 (3-2+1) cycles with forwarding or 5 (6-2+1) without it. The equation
for latency with forwarding is FL = Result - Source + 1, because the operands
are needed at the 'Source' stage and the result is ready for forwarding at the
'Result' stage. In gem5 all FUs assume full forwarding, and critical
forwarding paths are supposed to be implemented in HW, so you can assume the
forwarding latencies.

> 2.- In the same document, besides Writeback, other instruction timing
> values are defined (in section 3.4.1 Instruction timing tables ): Result
> (result ready), Source (operands available), Cycle (issue cycles). Does the
> value "opLat" in the O3_ARMv7a.py file is the defined as the writeback
> value or as the result value? Note that the result and the writeback values
> are not the same. Are the other timing values (Source, cycle) taken into
> account by gem5?

As far as I know, only the forwarding latency is currently taken into account
in gem5. Besides that, if 'Cycles' is roughly equal to the forwarding latency,
you can assume that the FU is not pipelined for that operation (gem5 has a
flag for this). 'Cycles' > 1 can also mean that the operation is microcoded.
One may use FL = Result - Source + Cycles to take that into account. This is
a very rough approximation, though.

Regards,

--
Fernando A. Endo, Post-doc

INRIA Rennes-Bretagne Atlantique
France

2017-02-08 18:12 GMT+01:00 Raul Garcia <raul.gar...@manchester.ac.uk>:

> Hello All,
>
> I am using gem5 to perform some research on ARM/NEON performance. In
> particular I'm looking into instruction timing and studying the ARM
> Cortex-A9 ( armv7 with NEON).
>
> Can you help me to clarify the following questions?
>
> 1.- ARM's documentation (Cortex™-A9 NEON™ Media Processing Engine) provides
> instruction timing tables for VFP and NEON instructions.
> According to those tables VFP and NEON instructions have different timing
> values, for example the VFP vadd instruction takes 4 cycles and the NEON
> vadd instruction takes 6 cycles:
>
> Table 3-2 VFP instruction timing
> Name  Format    Cycles  Source  Result  Writeback
> VADD  Dd,Dn,Dm  1       -1,1    4       4
>
> Table 3-4 Advanced SIMD integer arithmetic instruction timing
> Name  Format    Cycles  Source  Result  Writeback
> VADD  Dd,Dn,Dm  1       -2,2    3       6
>
> However, the O3_ARMv7a.py file that defines the characteristics of the
> armv7 architecture shows the same operation latency for VFP and NEON
> instructions:
>
> # Floating point and SIMD instructions
> class O3_ARM_v7a_FP(FUDesc):
>     opList = [ OpDesc(opClass='SimdAdd', opLat=4),
>
> Is this correct? shouldn't NEON instructions have a opLat value of 6? Do I
> need to change the latency (from 4 to 6 in the case of the ADD instruction)
> to correctly simulate the latency of a NEON instruction as specified by the
> ARM documentation? Is the simulator aware that a VFP instruction may have a
> different latency that a NEON instruction?
>
> 2.- In the same document, besides Writeback, other instruction timing
> values are defined (in section 3.4.1 Instruction timing tables ): Result
> (result ready), Source (operands available), Cycle (issue cycles). Does the
> value "opLat" in the O3_ARMv7a.py file is the defined as the writeback
> value or as the result value? Note that the result and the writeback values
> are not the same. Are the other timing values (Source, cycle) taken into
> account by gem5?
>
> Best Regards,
> Raul
>
> _______________________________________________
> gem5-dev mailing list
> gem5-...@gem5.org
> http://m5sim.org/mailman/listinfo/gem5-dev
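
As a quick check of the arithmetic in the reply above, here is a minimal
Python sketch that applies the two formulas to the two quoted table rows. It
is only an illustration: the Timing class and the function names are made up
for this example and are not part of gem5. The Source stage is taken as 1 for
the VFP row and 2 for the NEON row, which is what the reply's (3-2+1)
arithmetic implies, and the "without forwarding" figure is read as
Writeback - Source + 1, matching the (6-2+1) in the reply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Timing:
    """One row of a timing table (hypothetical helper, not a gem5 type)."""
    cycles: int     # issue cycles
    source: int     # stage at which the operands are needed
    result: int     # stage at which the result can be forwarded
    writeback: int  # stage at which the result is written back


def forwarding_latency(t: Timing) -> int:
    # FL = Result - Source + 1, as given in the reply.
    return t.result - t.source + 1


def no_forwarding_latency(t: Timing) -> int:
    # Assumption: same shape as FL, with Writeback in place of Result.
    return t.writeback - t.source + 1


def rough_latency_with_cycles(t: Timing) -> int:
    # FL = Result - Source + Cycles, the rough variant from the reply.
    return t.result - t.source + t.cycles


# Values from the quoted Table 3-2 and Table 3-4 rows; Source is assumed
# to be 1 (VFP) and 2 (NEON), following the reply's arithmetic.
VFP_VADD = Timing(cycles=1, source=1, result=4, writeback=4)
NEON_VADD = Timing(cycles=1, source=2, result=3, writeback=6)

if __name__ == "__main__":
    for name, t in (("VFP VADD", VFP_VADD), ("NEON VADD", NEON_VADD)):
        print(name,
              "forwarding:", forwarding_latency(t),
              "no forwarding:", no_forwarding_latency(t),
              "with Cycles:", rough_latency_with_cycles(t))
```

Under these assumptions the sketch gives 4 cycles for the VFP VADD and 2
(with forwarding) or 5 (without) for the NEON VADD. Following the reply's
reasoning, the forwarding figure is the one to compare against opLat.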
_______________________________________________
gem5-users mailing list
gem5-users@gem5.org
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users