[gem5-users] REX prefix implementation in x86
Hello Everyone,

I want to introduce a new implementation of the MOV instruction that uses the R11 register. My new opcodes are placed in two_byte.isa, and I have duplicated the 'mov' functionality present in move.py and ldstop.isa.

I understand how the opcode is decoded: for example, if the new opcode is 0x11, the top 5 bits and then the bottom 3 bits select the case to write in two_byte.isa.

What I do not understand is how to make sure the new instruction uses the REX format the same way MOV does. For example:

In the case of 8 bits:
*41* 8a 03       mov (%r11),%al
*41* 0f xx 03    new_mov (%r11),%al

In the case of 16 bits:
*66 41* 8b 03       mov (%r11),%ax
*66 41* 0f xx 03    new_mov (%r11),%ax

In the case of 32 bits:
*41* 8b 03       mov (%r11),%eax
*41* 0f xx 03    new_mov (%r11),%eax

In the case of 64 bits:
*49* 8b 03       mov (%r11),%rax
*49* 0f xx 03    new_mov (%r11),%rax

Bytes in bold are prefixes (REX, plus the 0x66 operand-size override in the 16-bit case); xx is the new opcode.

Gabe, or anyone else who has information on this?

Best regards,
Abhishek
___
gem5-users mailing list
gem5-users@gem5.org
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
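For reference, the REX bytes in the encodings above can be unpacked with a few lines of Python. This is a standalone sketch, not gem5 code, and the helper name decode_rex is made up for illustration. It relies only on the architectural layout of a REX prefix: a byte in the range 0x40-0x4f whose low four bits are W, R, X and B.

```python
def decode_rex(byte):
    """Split a REX prefix byte (0x40-0x4f) into its W, R, X and B bits.

    Hypothetical helper for illustration; gem5's predecoder does this
    work in C++ and stores the result in the ExtMachInst.
    """
    if byte & 0xF0 != 0x40:
        raise ValueError("not a REX prefix: %#x" % byte)
    return {
        "W": (byte >> 3) & 1,  # 1 selects a 64-bit operand size
        "R": (byte >> 2) & 1,  # extends the ModRM reg field
        "X": (byte >> 1) & 1,  # extends the SIB index field
        "B": byte & 1,         # extends ModRM r/m, SIB base, or opcode reg
    }


if __name__ == "__main__":
    for prefix in (0x41, 0x49):
        print(hex(prefix), decode_rex(prefix))
```

In both 0x41 and 0x49 the B bit is set, which is what turns the r/m value 3 into register 11 (r11); 0x49 additionally sets W for the 64-bit form.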
Re: [gem5-users] REX prefix implementation in x86
Hello Gabe,

Thanks for your help. Just to verify what I have understood from your explanation: to add a new instruction that behaves like MOV, I just need to use the proper operands (Gb,Eb), and *the REX prefix will be taken care of automatically*.

From the available opcodes in two_byte.isa, I have chosen 6c, 6d, 7c, and 7d. For example, to implement 6c: New_mov(Eb,Gb), I just add the following lines to two_byte.isa:

    0x0D: decode LEGACY_DECODEVAL {
        // no prefix
        0x0: decode OPCODE_OP_BOTTOM3 {
            0x4: NEWMOV(Eb,Gb);
        }
    }

and then duplicate the function under a new name in "insts/general_purpose/data_transfer/move.py" and "microops/ldstop.isa".

So if I create a binary containing "41 0f 6c 03" (for NEWMOV (%r11),%al), I do have to worry about the "41" in "41 0f 6c 03" (41 is used for extension of the r/m field, base field, or opcode reg field; reference: http://ref.x86asm.net/coder64.html).

Is this correct?

Best regards,
Abhishek

On Thu, Nov 1, 2018 at 6:25 PM Gabe Black wrote:
> Hi Abhishek. In x86, and in gem5 in general but particularly in x86, decoding happens in two steps. The predecoder reads in the bytes which are in memory and applies context to them (operating mode, various global settings like address sizes) and translates them into a canonical structure called an ExtMachInst. In x86, that step gathers up all the prefixes, opcode bytes, etc., and stores them in the ExtMachInst.
>
> When an instruction is specified in the decoder, it has some parameters which specify what format its operands come in. That's useful if the basic functionality of the instruction is the same, but in different scenarios it uses register indices from different parts of the encoding, for instance. If that flavor of operand is defined to include bits from the REX prefix, then that will be factored in when that instruction is set up.
> The format of those specifiers is modeled after an encoding you'll find in the AMD architecture manuals where it serves a similar purpose, and you can look at that to get an idea of what a particular specifier means.
>
> If you use the same operand suffixes as regular mov does (for instance Ev,Gv), then your mov should get its arguments in the same way. For reference, E means that operand may be a register or a memory location based on the ModRM byte, and G means the "reg" field of ModRM. The small v means to use the effective operand size.
>
> Gabe
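The operand lookup described above can be traced by hand for the byte sequence "41 0f 6c 03". The following Python sketch is only an illustration under stated assumptions, not gem5's decoder: the names split_opcode, resolve_operands and REG64 are invented here, and the sketch ignores the SIB byte, displacements, and RIP-relative addressing, none of which this particular encoding uses.

```python
# Register names indexed by their 4-bit number (64-bit names used for brevity).
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"] + \
        ["r%d" % n for n in range(8, 16)]


def split_opcode(opcode):
    """Return (top 5 bits, bottom 3 bits) of an opcode byte."""
    return opcode >> 3, opcode & 0x7


def resolve_operands(rex, modrm):
    """Combine REX.R / REX.B with the ModRM fields.

    Returns (mod, g_index, e_index): G comes from the reg field extended by
    REX.R, E comes from the r/m field extended by REX.B. Assumes no SIB byte.
    """
    mod = modrm >> 6
    reg = (modrm >> 3) & 0x7
    rm = modrm & 0x7
    rex_r = (rex >> 2) & 1
    rex_b = rex & 1
    return mod, (rex_r << 3) | reg, (rex_b << 3) | rm


if __name__ == "__main__":
    rex, escape, opcode, modrm = 0x41, 0x0F, 0x6C, 0x03
    top5, bottom3 = split_opcode(opcode)
    mod, g_index, e_index = resolve_operands(rex, modrm)
    print("opcode %#x -> top5=%#x bottom3=%#x" % (opcode, top5, bottom3))
    kind = "register" if mod == 3 else "memory"
    print("G operand: %s family" % REG64[g_index])
    print("E operand: %s operand via %s" % (kind, REG64[e_index]))
```

For 0x6c the split gives 0x0D and 0x4, matching the two case labels in the decode block, and the E operand resolves to a memory access through r11 without the instruction definition mentioning the prefix at all.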
Re: [gem5-users] REX prefix implementation in x86
There was a typo in my last line. It should read:

I do *NOT* have to worry about the "41" in "41 0f 6c 03" (41 is used for extension of the r/m field, base field, or opcode reg field; reference: http://ref.x86asm.net/coder64.html).
Re: [gem5-users] REX prefix implementation in x86
Thanks for the clarification. You helped me a lot, thanks :-)

On Fri, Nov 2, 2018 at 8:41 PM Gabe Black wrote:
> You don't need to worry about changing ldstop.isa unless you're adding a new microop also, but yes I think that's correct. If you use Eb and Gb, I think you're restricting your operand size to always be a byte, but that might be what you want.
>
> Gabe
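Gabe's remark about Eb/Gb versus Ev/Gv can be made concrete with a small Python sketch. It assumes 64-bit mode with the default 32-bit operand size, and the function name operand_size is hypothetical; it models the architectural rule (REX.W first, then the 0x66 prefix), not gem5's internal code.

```python
def operand_size(suffix, rex=0, has_66_prefix=False):
    """Operand size in bytes for a 'b' or 'v' operand suffix.

    Assumes 64-bit mode, where the default operand size is 32 bits.
    """
    if suffix == "b":
        return 1          # byte operands ignore REX.W and the 0x66 prefix
    if suffix != "v":
        raise ValueError("only 'b' and 'v' are modeled in this sketch")
    if rex & 0x08:        # REX.W takes precedence over 0x66
        return 8
    return 2 if has_66_prefix else 4


if __name__ == "__main__":
    cases = [
        ("41 8b 03", "v", 0x41, False),
        ("66 41 8b 03", "v", 0x41, True),
        ("49 8b 03", "v", 0x49, False),
        ("41 8a 03", "b", 0x41, False),
    ]
    for encoding, suffix, rex, has_66 in cases:
        print("%-12s %s -> %d bytes" % (encoding, suffix,
                                        operand_size(suffix, rex, has_66)))
```

With Eb,Gb the size stays at one byte whatever prefixes are present, so covering the 16-, 32- and 64-bit forms from the original question would call for a second opcode declared with Ev,Gv.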