Francisco Jerez <curroje...@riseup.net> writes: > Neil Roberts <n...@linux.intel.com> writes: > >> Are you sure this patch is necessary? The documentation for the multiply >> instruction on BDW+ says: >> >> SourceType : *D >> DestinationType : *D >> Project : EXCLUDE(CHV) >> >> This to me implies that it should work on BXT because it doesn't say >> EXCLUDE(BXT). >> > > In fact both CHV and BXT *can* support 32x32 multiplication, that's not > really the reason that motivated this patch -- The problem is that 32x32 > multiplication has QWORD execution type which has several restrictions > on CHV and BXT, the annoying one is: > > "CHV, BXT > > When source or destination datatype is 64b or operation is integer DWord > multiply, regioning in Align1 must follow these rules: > > Source and Destination horizontal stride must be aligned to the same > qword. > > Example: > [...] > // mul (4) r10.0<2>:d r11.0<8;4,2>:d r12.0<8;4,2>:d // Source and > Destination stride must be 2 since the execution type is Qword." > > So it sounds like we could use the native 32x32 multiply with some > additional effort to pass the arguments with the correct stride, but > it's not clear to me whether the solution would be any more better than > splitting up the multiply into 32x16 multiplies, so doing the same as in > CHV and pre-Gen8 seemed like the most KISS solution for now. > >> I made a little test case and ran it on my BXT and it seems to work even >> without this patch. I looked at the assembler source and it is >> definitely doing a MUL instruction with D types for the dst and two >> sources. >> >> https://github.com/bpeel/piglit/blob/test-integer-multiply/tests/general/mult32.c >> > Hmm, I guess the docs could be wrong, but I'm not sure what the > consequences would be of violating the alignment rule, I guess the > failure may be non-deterministic. What stepping of BXT did you test > this on? > I just tried this on the BXT simulator, and it doesn't seem to be too happy about multiplies breaking the alignment rule:
| .aub:0x0> CEuExecUnit::CDecodeInst::RegionCheck>Instruction is: | mul (16) r3.0<1>:d r2.3<0;1,0>:d 0xe1ac99a1:d { Align1, N1, NoCompact }. | For 64-bit Align1 operation or multiplication of dwords in this | device, destination horizontal stride must be aligned to qword. | | Fail >> Regards, >> - Neil >> >> Francisco Jerez <curroje...@riseup.net> writes: >> >>> AFAIK BXT has the same annoying alignment limitation as CHV on the >>> source register regions of 32x32 bit MULs, give it the same treatment. >>> --- >>> src/mesa/drivers/dri/i965/brw_fs.cpp | 4 ++-- >>> 1 file changed, 2 insertions(+), 2 deletions(-) >>> >>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp >>> b/src/mesa/drivers/dri/i965/brw_fs.cpp >>> index 244f299..fc9f007 100644 >>> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp >>> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp >>> @@ -3126,9 +3126,9 @@ fs_visitor::lower_integer_multiplication() >>> bool progress = false; >>> >>> /* Gen8's MUL instruction can do a 32-bit x 32-bit -> 32-bit operation >>> - * directly, but Cherryview cannot. >>> + * directly, but CHV/BXT cannot. >>> */ >>> - if (devinfo->gen >= 8 && !devinfo->is_cherryview) >>> + if (devinfo->gen >= 8 && !devinfo->is_cherryview && >>> !devinfo->is_broxton) >>> return false; >>> >>> foreach_block_and_inst_safe(block, fs_inst, inst, cfg) { >>> -- >>> 2.4.6 >>> >>> _______________________________________________ >>> mesa-dev mailing list >>> mesa-dev@lists.freedesktop.org >>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
signature.asc
Description: PGP signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev