On 7/3/19 12:11 PM, Richard Henderson wrote:
> On 7/1/19 6:35 AM, Jan Bobek wrote:
>> +sub write_mov_rr($$)
>> +{
>> +    my ($r1, $r2) = @_;
>> +
>> +    my %insn = (opcode => X86OP_MOV,
>> +                modrm => {mod => MOD_DIRECT,
>> +                          reg => ($r1 & 0x7),
>> +                          rm => ($r2 & 0x7)});
>> +
>> +    $insn{rex}{w} = 1 if $is_x86_64;
>> +    $insn{rex}{r} = 1 if $r1 >= 8;
>> +    $insn{rex}{b} = 1 if $r2 >= 8;
>
> This is where maybe it's better to leave rex.[rb] to risugen_x86_asm, and just
> leave $modrm{reg} and $modrm{rm} as 4-bit quantities.

That's what I have in v3, stay tuned!

>> +sub write_mov_reg_imm($$)
>> +{
>> +    my ($reg, $imm) = @_;
>> +    my %insn;
>> +
>> +    if (0 <= $imm && $imm <= 0xffffffff) {
>
> Should include !$is_x86_64 here,
>
>> +        %insn = (opcode => {value => 0xB8 | ($reg & 0x7), len => 1},
>> +                 imm => {value => $imm, len => 4});
>> +    } elsif (-0x80000000 <= $imm && $imm <= 0x7fffffff) {
>> +        %insn = (opcode => {value => 0xC7, len => 1},
>> +                 modrm => {mod => MOD_DIRECT,
>> +                           reg => 0, rm => ($reg & 0x7)},
>> +                 imm => {value => $imm, len => 4});
>> +
>> +        $insn{rex}{w} = 1 if $is_x86_64;
>
> making this unconditional.

Doesn't B8 (without REX.W) work for x86_64, too? It zeroes the upper
part of the destination, so it's effectively zero-extending, and it's
one byte shorter than C7 (no ModR/M byte needed).

That being said, I moved most of this function to risugen_x86_asm and
included a bunch of comments regarding different cases, so it should
be easier to understand.

>> +sub write_random_ymmdata()
>> +{
>> +    my $ymm_cnt = $is_x86_64 ? 16 : 8;
>> +    my $ymm_len = 32;
>> +    my $datalen = $ymm_cnt * $ymm_len;
>> +
>> +    # Generate random data blob
>> +    write_random_datablock($datalen);
>> +
>> +    # Load the random data into YMM regs.
>> +    for (my $ymm_reg = 0; $ymm_reg < $ymm_cnt; $ymm_reg++) {
>> +        write_insn(vex => {l => VEX_L_256, p => VEX_P_DATA16,
>> +                           r => !($ymm_reg >= 8)},
>
> Again, vex.r should be handled in vex_encode.

As I said, there will be more high-level instruction-assembling
functions exported by risugen_x86_asm in v3, which take care of this.

>> +                   opcode => X86OP_VMOVAPS,
>> +                   modrm => {mod => MOD_INDIRECT_DISP32,
>> +                             reg => ($ymm_reg & 0x7),
>> +                             rm => REG_EAX},
>> +                   disp => {value => $ymm_reg * $ymm_len,
>> +                            len => 4});
>> +    }
>
> So... this now generates code that cannot run without AVX2.
>
> Which is probably fine for testing right now, since we do
> want to be able to notice effects of SSE/AVX insns on the
> high bits of the registers.
>
> But we'll probably need to have the same --xsave=foo
> command-line option that we have for risu itself.
>
> That would let you initialize only 16-bytes here, or
> for avx512 initialize 64-bytes, plus the k-registers.

Ah yes, indeed.

-Jan
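
For reference, here is a rough standalone sketch of the two points
discussed above: deriving REX.B from a 4-bit register number inside the
encoder rather than in the caller, and picking between B8 and C7 for a
mov-immediate. This is not the v3 code. The names rex_bytes,
mov_reg_imm and hex_str are made up for illustration and do not exist in
risugen_x86_asm, and the sketch assumes a Perl built with 64-bit
integers (needed for the 'q<'/'Q<' pack templates).

```perl
use strict;
use warnings;

# Hypothetical helper (not a risugen_x86_asm function): build the REX
# prefix byte 0100WRXB from 4-bit register numbers. Returns an empty
# list when no REX bit is needed.
sub rex_bytes {
    my (%args) = @_;
    my $w = $args{w} ? 1 : 0;
    my $r = (($args{reg} // 0) >> 3) & 1;   # bit 3 of ModRM.reg
    my $b = (($args{rm}  // 0) >> 3) & 1;   # bit 3 of ModRM.rm / opcode reg
    return () unless $w || $r || $b;
    return (0x40 | ($w << 3) | ($r << 2) | $b);
}

# Hypothetical helper: encode "mov reg, imm" as a list of bytes.
# $reg is a 4-bit register number; the caller never touches REX.
sub mov_reg_imm {
    my ($reg, $imm, $is_x86_64) = @_;
    my $lo3 = $reg & 0x7;

    if (!$is_x86_64) {
        die "register $reg needs 64-bit mode\n" if $reg >= 8;
        die "immediate out of range\n"
            unless -0x80000000 <= $imm && $imm <= 0xffffffff;
        # B8+r imm32: the only form needed for 32-bit registers.
        return (0xB8 | $lo3, unpack('C*', pack('L<', $imm & 0xffffffff)));
    }

    if (0 <= $imm && $imm <= 0xffffffff) {
        # B8+r imm32 without REX.W: a 32-bit write zeroes the upper
        # half of the 64-bit register, i.e. the value is zero-extended.
        return (rex_bytes(rm => $reg), 0xB8 | $lo3,
                unpack('C*', pack('L<', $imm)));
    }

    if (-0x80000000 <= $imm && $imm < 0) {
        # REX.W C7 /0 imm32: the immediate is sign-extended to 64 bits.
        my $modrm = (3 << 6) | (0 << 3) | $lo3;   # mod=11 (direct), reg=/0
        return (rex_bytes(w => 1, rm => $reg), 0xC7, $modrm,
                unpack('C*', pack('l<', $imm)));
    }

    # REX.W B8+r imm64: full 64-bit immediate.
    return (rex_bytes(w => 1, rm => $reg), 0xB8 | $lo3,
            unpack('C*', pack($imm < 0 ? 'q<' : 'Q<', $imm)));
}

# Hypothetical helper: format a byte list as space-separated hex.
sub hex_str {
    return join ' ', map { sprintf '%02X', $_ } @_;
}
```

With this, something like `print hex_str(mov_reg_imm(9, 1, 1)), "\n";`
would show the REX.B prefix appearing without the caller having asked
for it, which is the division of labour Richard suggested.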