> -----Original Message-----
> From: Gabe Black <gabe.bl...@gmail.com>
> Sent: 04 March 2021 04:04
> To: Giacomo Travaglini <giacomo.travagl...@arm.com>
> Cc: gem5 Developer List <gem5-dev@gem5.org>
> Subject: Re: [gem5-dev] vector register indexing modes and renaming?
>
>
>
> On Mon, Mar 1, 2021 at 6:48 AM Giacomo Travaglini
> <giacomo.travagl...@arm.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Gabe Black <gabe.bl...@gmail.com>
> > Sent: 27 February 2021 05:47
> > To: Giacomo Travaglini <giacomo.travagl...@arm.com>
> > Cc: gem5 Developer List <gem5-dev@gem5.org>
> > Subject: Re: [gem5-dev] vector register indexing modes and renaming?
> >
> > Another question/clarification:
> >
> > Does any data actually get shared between the two rename modes? I
> think you
> > said there is not, but now I can't find that.
>
> Data *do* get shared, even if in gem5 we have separate physical
> registers.
> In fact, when rename mode changes [1], (meta)data is copied from one
> register file to the other.
> For example, say we have an AArch64 kernel running at EL1 and my
> AArch32 (basically armv7) floating point application running at EL0.
>
> My application will be using vector elements; however, every time
> there is an exception to AArch64, the CPU will switch
> rename mode and data will be copied / the mapping will be adjusted. Any
> FP & SIMD operation at this point will use vector registers.
> When the kernel finishes its work and goes back to AArch32, the vector
> elements will be repopulated.
>
>
>
>
> Ok, I thought that was what you said, and I couldn't think of another reason
> to go through all the trouble of copying things around.
>
>
>
>
> > Would it work just as well to have
> > two register files which operate entirely independently?
>
> As I mentioned before, they operate independently, but they sync up
> when we pass from one mode to the other. Another way to look at it is
> that they are mutually exclusive.
>
>
>
>
> Would it make sense to trigger the syncing between them explicitly from ARM
> code, rather than forcing the O3 to notice and do the copying? Then the
> copying, etc, wouldn't have to be generic, since it would be triggered by an
> ARM architectural mechanism.
>
>

I understand what you are saying, but just to be clear: I believe the copying
per se is already ISA agnostic.
The O3 copying is really copying elements into vectors and vice versa; the O3
model doesn't really have any notion of the Arm architecture (but I do agree
the *need* for copying is Arm oriented).
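
To make the "elements into vectors and vice versa" point concrete, here is a
minimal sketch of such a copy. It is not the gem5 O3 code: the function names,
the 128-bit vector width and the 32-bit element size are all assumptions chosen
for illustration, and the register files are modelled as plain Python lists.

```python
# Hypothetical model, NOT gem5 code.
# Assumption: 128-bit vector registers made of four 32-bit elements.
ELEMS_PER_VEC = 4
ELEM_BITS = 32
ELEM_MASK = (1 << ELEM_BITS) - 1


def vectors_to_elements(vec_file):
    """Split every vector register into its elements.

    Element 0 of each vector is its least significant 32 bits.
    """
    elem_file = []
    for vec in vec_file:
        for i in range(ELEMS_PER_VEC):
            elem_file.append((vec >> (i * ELEM_BITS)) & ELEM_MASK)
    return elem_file


def elements_to_vectors(elem_file):
    """Pack consecutive groups of elements back into vector registers."""
    vec_file = []
    for base in range(0, len(elem_file), ELEMS_PER_VEC):
        vec = 0
        for i in range(ELEMS_PER_VEC):
            vec |= (elem_file[base + i] & ELEM_MASK) << (i * ELEM_BITS)
        vec_file.append(vec)
    return vec_file
```

Nothing in the sketch refers to Arm: it only needs to know how many elements
make up a vector, which is the sense in which the copy itself is ISA agnostic.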

If your concern was more about the triggering mechanism, you can probably start
everything from the Arm side, but I don't really know what the cleanest
solution is there. A quick and dirty implementation would extend the
ExecContext interface, as execution mode changes are often (but not always)
triggered by instructions (SVC = syscall, and ERET).
That is far from being a better solution, as you would add O3-specific logic to
other CPUs as well (probably empty stubs). You would also end up with O3 code
in arch.

>
>
> > From what I can tell
> > the "V" registers of Neon in aarch64 overlap with the SVE registers,
> and the "Q"
> > registers of armv7 Neon overlap with the "S", "D", "Q" registers of
> the same,
> > but I think "V" and "Q" are independent? Maybe reused but not
> guaranteed to
> > alias?
> >
>
> I would say the rule of thumb for understanding the AArch64-AArch32
> mapping (and it's the underlying cause of using different renaming modes)
> is to bear in mind that AArch64, unlike AArch32, uses an unpacked
> approach for FP & SIMD registers.
> Prior to Armv8, smaller FP registers were packed into bigger registers
> [2]. Having for example 32 double precision registers (D0-D31) meant having a
> maximum of 16 quad word registers (Q0-Q15).
> This setup has been abandoned in Armv8 [3]. As an example, S1, or D1
> are not packed anymore in Q0. Those are in fact the 32/64 LSBits of Q1.
> This means the newly added (V16-V31) are not accessible in AArch32.
>
> So to answer your question regarding V and Q: up to Q/V15, they alias
> perfectly; V16-V31 are simply not defined/accessible in AArch32, so they
> are not aliased.
>
> All AArch32 SIMD data is accessible from AArch64. It just won't stick to
> the same naming. AArch32 D1 and AArch64 D1 hold different data.
> If I really wanted to access AArch32 D1 from AArch64 I would have to
> read the 64 MSBs of V0. This is a software and not a hardware problem (I
> just posted this example to stress the difference between aliasing and
> reachability).
>
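
The packed versus unpacked layouts described above can be written down as a
small index calculation. This is only an illustrative sketch: the function
names are made up here, and it covers just the S/D/Q views (AArch32 S0-S31,
D0-D31, Q0-Q15, with Q0-Q15 aliasing V0-V15).

```python
# Illustrative helper, not taken from gem5 or from any Arm tooling.
WIDTH = {"S": 32, "D": 64, "Q": 128}
# Number of registers of each kind visible from AArch32.
AARCH32_COUNT = {"S": 32, "D": 32, "Q": 16}


def aarch32_location(kind, n):
    """Return (V index, lowest bit, width) holding AArch32 register kind+n.

    AArch32 packs smaller registers into bigger ones: four S or two D
    registers share one Q register, and Qn aliases Vn for n <= 15.
    """
    if not 0 <= n < AARCH32_COUNT[kind]:
        raise ValueError(f"{kind}{n} does not exist in AArch32")
    width = WIDTH[kind]
    per_q = 128 // width
    return (n // per_q, (n % per_q) * width, width)


def aarch64_location(kind, n):
    """Return (V index, lowest bit, width) holding AArch64 register kind+n.

    AArch64 is unpacked: Sn/Dn/Qn are the low bits of Vn.
    """
    if not 0 <= n < 32:
        raise ValueError(f"{kind}{n} does not exist in AArch64")
    return (n, 0, WIDTH[kind])
```

With these definitions, `aarch32_location("D", 1)` gives `(0, 64, 64)`, i.e. the
64 MSBs of V0, while `aarch64_location("D", 1)` gives `(1, 0, 64)`, i.e. the 64
LSBs of V1: same name, different data.
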
>
>
>
> Gotcha, makes sense.
>
>
>
> Richard kindly pointed me to the following SVE tutorial:
>
> https://gitlab.com/arm-hpc/training/arm-sve-tools
>
> But I believe it is worth noting we are actually interested in testing
> armv7 (AArch32) SIMD as well, so that probably won't be enough.
> I will dig more, and I will keep you posted.
>
>
>
> Ok great, I'll take a look. Having *something* to test with will be a big
> leg up, even if it isn't complete. It would also be nice, although more
> complex, to be able to test the rename mode switching mechanism somehow.

I would say checking for any variation in the stats for suitable NEON/SVE
workloads would already be a big step forward.

>
> Gabe

Kind Regards

Giacomo