Re: [gem5-dev] RISC-V: Unknown opcode 0x00

2017-02-08 Thread Boris Shingarov
Hmmm... why does __libc_setup_tls evoke a kind of deja-vu feeling?  I don't 
remember the details now, but there definitely was a time when I was trying to 
rebuild a MIPS hello example with a specific version of the gcc toolchain (because 
nobody seemed to know which toolchain version was used for the 
gem5-provided binary), and got stuck on the gem5 SE simulation segfaulting 
somewhere related to __libc_setup_tls.  For some reason I thought this was due 
to underimplemented support for the TLS syscalls, but I gave up temporarily, 
planning to come back to the problem when I had more time/ideas, and then got 
drowned in other priorities.  Now that it's not just MIPS but RISC-V, suddenly 
the problem seems that much more exciting.  I'll try going back to remember 
what exactly it was that I discovered on MIPS... 

-"gem5-dev"  wrote: -
To: gem5 Developer List 
From: Alec Roelke 
Sent by: "gem5-dev" 
Date: 02/07/2017 11:20AM
Subject: Re: [gem5-dev] RISC-V: Unknown opcode 0x00

Hi Everyone,

Does anybody know anything about how gem5 reads binaries and why this
problem might be happening?  If full-system mode for RISC-V is to be
supported in the future (and probably for multithreading in SE mode as
well), this will probably need to be fixed.

Thanks,
Alec Roelke

On Mon, Jan 23, 2017 at 3:12 PM, Alec Roelke  wrote:

> Hello,
>
> I'm trying to get the riscv64-linux-gnu-* tools working on gem5 for RISC-V
> since right now only the riscv64-unknown-elf-* tools are compatible and
> those don't include a lot of Linux headers.
>
> The problem I am encountering is that after returning from a function I
> assume is part of libc (the assembly label is __libc_setup_tls), the next
> instruction it reads is always 0x, which is undefined in RISC-V and
> causes a panic.  For example, when I compile the example "Hello, world"
> program with riscv64-linux-gnu-gcc, using the -static and -static-libgcc
> flags, here is a snippet of the end of the Exec trace:
>
>  @__libc_setup_tls+456    : jalr zero, ra, 0           : IntAlu : D=0x0002e2ec
>  @__libc_start_main+140    : unknown opcode 0x00        : No_OpClass :
>
> The value of ra (return address register, just like in MIPS) in the first
> line is @__libc_start_main+140 (0x2ddc4), which it appears to correctly
> jump to in the second line.  According to the assembly I dumped from the
> binary, that should be an actual instruction (lui a5, 0x12, which loads
> 0x12 into bits 31:12 of register a5), but gem5 seems to be reading
> 0x.
>
> Does anyone know what's going on?
>
> Thanks,
> Alec Roelke
>
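
For readers following the thread, here is a minimal sketch of what the fetch
at @__libc_start_main+140 would be expected to return. It assumes the
standard RV64 U-type layout (immediate in bits 31:12, rd in bits 11:7, major
opcode 0x37 for LUI). The helper names encode_lui and decode are made up for
this illustration and are not part of gem5.

```python
LUI_OPCODE = 0x37  # major opcode for LUI in the RISC-V base ISA


def encode_lui(rd, imm20):
    """Build the 32-bit word for `lui rd, imm20` (hypothetical helper)."""
    return ((imm20 & 0xFFFFF) << 12) | ((rd & 0x1F) << 7) | LUI_OPCODE


def decode(word):
    """Decode only LUI; report anything else by its 7-bit major opcode."""
    opcode = word & 0x7F
    if opcode != LUI_OPCODE:
        return ("unknown", opcode)
    rd = (word >> 7) & 0x1F
    value = word & 0xFFFFF000          # immediate lands in bits 31:12
    if value & 0x80000000:             # sign-extend to 64 bits on RV64
        value |= 0xFFFFFFFF00000000
    return ("lui", rd, value)


word = encode_lui(15, 0x12)            # a5 is register x15
print(hex(word))
print(decode(word))
print(decode(0))                       # what an all-zero fetch looks like
```

An all-zero word has major opcode 0x00, which is consistent with the
"unknown opcode 0x00" line in the trace. That would suggest the bytes at that
address were never loaded (or were zeroed) rather than that a valid
instruction was misdecoded, though the sketch alone cannot confirm the cause.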
___
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev


[gem5-dev] Question about arm/neon instruction timing in gem5

2017-02-08 Thread Raul Garcia
Hello All,


I am using gem5 to perform some research on ARM/NEON performance. In particular, 
I'm looking into instruction timing and studying the ARM Cortex-A9 (ARMv7 with 
NEON).

Can you help me to clarify the following questions?


1.- ARM's documentation (Cortex™-A9 NEON™ Media Processing Engine) provides 
instruction timing tables for VFP and NEON instructions. According to those 
tables, VFP and NEON instructions have different timing values. For example, the 
VFP vadd instruction takes 4 cycles and the NEON vadd instruction takes 6 
cycles:


Table 3-2 VFP instruction timing
Name   Format     Cycles   Source   Result   Writeback
VADD   Dd,Dn,Dm   1        -1,1     4        4

Table 3-4 Advanced SIMD integer arithmetic instruction timing
Name   Format     Cycles   Source   Result   Writeback
VADD   Dd,Dn,Dm   1        -2,2     3        6


However, the O3_ARMv7a.py file that defines the characteristics of the armv7 
architecture shows the same operation latency for VFP and NEON instructions:


# Floating point and SIMD instructions
class O3_ARM_v7a_FP(FUDesc):
    opList = [ OpDesc(opClass='SimdAdd', opLat=4),


Is this correct? Shouldn't NEON instructions have an opLat value of 6? Do I need 
to change the latency (from 4 to 6 in the case of the ADD instruction) to 
correctly simulate the latency of a NEON instruction as specified by the ARM 
documentation? Is the simulator aware that a VFP instruction may have a 
different latency than a NEON instruction?

2.- In the same document, besides Writeback, other instruction timing values 
are defined (in section 3.4.1 Instruction timing tables): Result (result 
ready), Source (operands available), Cycles (issue cycles). Is the value 
"opLat" in the O3_ARMv7a.py file defined as the Writeback value or as 
the Result value? Note that the Result and the Writeback values are not the 
same.  Are the other timing values (Source, Cycles) taken into account by gem5?


Best Regards,
Raul


