https://bugs.kde.org/show_bug.cgi?id=385055

--- Comment #2 from Carl Love <c...@us.ibm.com> ---
The second workload that hits the same issue is a GCC instruction test.  The
test runs a large number of 128-bit floating-point instructions to check code
generation.

In this case, several instructions (xsmulqp, xsmaddqp, xssubqp, xsdivqp and
xsaddqp) all call generate_store_FPRF( Ity_F128, vT ) to set the condition
code for the instruction.

The function expands into:

              t125 =
Or32(8Uto32(GET:I8(1342)),Shl32(8Uto32(GET:I8(1344)),0xC:I8))
              t124 = And32(t125,0x3:I32)
              t122 = GET:F128(896)
              t121 = GET:F128(912)
              t123 =
AddF128(Xor32(t124,And32(Shl32(t124,0x1:I8),0x2:I32)),t121,t122)
              t134 =
64HLtoV128(ReinterpF64asI64(F128HItoF64(t123)),ReinterpF64asI64(F128LOtoF64(t123)))
              t133 = 64to1(And64(Shr64(V128HIto64(t134),0x3F:I8),0x1:I64))
              t136 = 0x7FFF000000000000:I64
              t135 = 0xFFFFFFFFFFFF:I64
              t137 = 0x0:I64
              t126 =
32to1(And32(1Uto32(CmpEQ64(And64(V128HIto64(t134),t136),t136)),1Uto32(Not1(CmpEQ64(Or64(And64(V128HIto64(t134),t135),V128to64(t134)),t137)))))
              t139 = 0x7FFF000000000000:I64
              t138 = 0xFFFFFFFFFFFF:I64
              t140 = 0x0:I64
              t127 =
32to1(And32(1Uto32(CmpEQ64(And64(V128HIto64(t134),t139),t139)),1Uto32(CmpEQ64(Or64(And64(V128HIto64(t134),t138),V128to64(t134)),t140))))
              t142 = 0x7FFF000000000000:I64
              t141 = 0xFFFFFFFFFFFF:I64
              t143 = 0x0:I64
              t132 =
32to1(And32(1Uto32(CmpEQ64(And64(V128HIto64(t134),t142),t143)),1Uto32(CmpEQ64(Or64(And64(V128HIto64(t134),t141),V128to64(t134)),t143))))
              t144 = 0x7FFF000000000000:I64
              t145 = 0x0:I64
              t129 =
32to1(And32(1Uto32(Not1(CmpEQ64(And64(V128HIto64(t134),t144),t145))),1Uto32(Not1(CmpEQ64(And64(V128HIto64(t134),t144),t144)))))
              t147 = 0x7FFF000000000000:I64
              t146 = 0xFFFFFFFFFFFF:I64
              t148 = 0x0:I64
              t128 =
32to1(And32(1Uto32(CmpEQ64(And64(V128HIto64(t134),t147),t148)),1Uto32(Not1(CmpEQ64(Or64(And64(V128HIto64(t134),t146),V128to64(t134)),t148)))))
              t130 =
32to1(And32(1Uto32(32to1(Not32(1Uto32(t133)))),1Uto32(1:I1)))
              t131 = 32to1(And32(1Uto32(t133),1Uto32(1:I1)))
              PUT(1344) =
32to8(Or32(And32(0xF:I32,8Uto32(GET:I8(1344))),Shl32(And32(0x1:I32,1Uto32(32to1(Or32(1Uto32(32to1(Or32(1Uto32(t126),1Uto32(32to1(And32(1Uto32(t131)\
,1Uto32(t128))))))),1Uto32(32to1(Or32(1Uto32(32to1(And32(1Uto32(t131),1Uto32(t132)))),1Uto32(32to1(And32(1Uto32(t130),1Uto32(t128))))))))))),0x4:I8)))
              PUT(1344) =
32to8(Or32(And32(0x10:I32,8Uto32(GET:I8(1344))),And32(0xF:I32,Or32(Or32(1Uto32(32to1(Or32(1Uto32(t126),1Uto32(t127)))),Shl32(1Uto32(32to1(And32(1Ut\
o32(32to1(Not32(1Uto32(t126)))),1Uto32(t132)))),0x1:I8)),Or32(Shl32(1Uto32(32to1(And32(1Uto32(32to1(Not32(1Uto32(t126)))),1Uto32(32to1(And32(1Uto32(32to1(Or32(1Uto32(32to1(O\
r32(1Uto32(32to1(And32(1Uto32(t130),1Uto32(t128)))),1Uto32(32to1(And32(1Uto32(t130),1Uto32(t129))))))),1Uto32(32to1(And32(1Uto32(t130),1Uto32(t127))))))),1Uto32(32to1(And32(\
1Uto32(32to1(Not32(1Uto32(t132)))),1Uto32(32to1(Not32(1Uto32(t126))))))))))))),0x2:I8),Shl32(1Uto32(32to1(And32(1Uto32(32to1(Not32(1Uto32(t126)))),1Uto32(32to1(And32(1Uto32(\
32to1(Or32(1Uto32(32to1(Or32(1Uto32(32to1(And32(1Uto32(t131),1Uto32(t128)))),1Uto32(32to1(And32(1Uto32(t131),1Uto32(t129))))))),1Uto32(32to1(And32(1Uto32(t131),1Uto32(t127))\
))))),1Uto32(32to1(And32(1Uto32(32to1(Not32(1Uto32(t132)))),1Uto32(32to1(Not32(1Uto32(t126))))))))))))),0x3:I8))))))
              PUT(784) = t123
              PUT(1296) = 0x4157DB0:I64

The basic block again seems to have about 30 instructions, 6 of which have
the above expansion for generate_store_FPRF().  This is with dres->hint =
Dis_HintVerbose added to each of these instructions.

generate_store_FPRF() stores a condition code that this application does not
use, so I commented out the body of the function to avoid calculating and
storing the code.  With the body removed, the workload runs normally.  So we
either need to get dres->hint to limit the BB further, perhaps by ending the
BB as soon as it sees an instruction with the hint, or use a C-code handler
in place of the generate_store_FPRF() function.  Other thoughts?
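
For the C-code handler option, here is a rough sketch of what such a helper
might compute.  It is only an illustration based on my reading of the IR dump
above, not existing Valgrind code: the function name, its signature and the
choice to pass the F128 value as two 64-bit halves are all assumptions.  It
classifies the result with the same masks the IR uses and returns the 5-bit
FPRF value (C, FL, FG, FE, FU).  Like the IR, it does not distinguish
signaling from quiet NaNs.

```c
#include <stdint.h>

/* Same masks as in the IR dump above. */
#define F128_EXP_MASK     0x7FFF000000000000ULL  /* 15-bit exponent, high dword */
#define F128_FRAC_HI_MASK 0x0000FFFFFFFFFFFFULL  /* upper 48 fraction bits */

/* Hypothetical helper; name and signature are illustrative only.
   hi/lo are the high and low 64 bits of the F128 result.
   Returns the 5-bit FPRF value: C<<4 | FL<<3 | FG<<2 | FE<<1 | FU. */
static uint32_t f128_fprf_sketch(uint64_t hi, uint64_t lo)
{
    uint64_t exp       = hi & F128_EXP_MASK;
    int      frac_zero = ((hi & F128_FRAC_HI_MASK) | lo) == 0;
    int      neg       = (int)(hi >> 63);              /* t133 in the dump */

    int is_nan    = (exp == F128_EXP_MASK) && !frac_zero;  /* t126 */
    int is_inf    = (exp == F128_EXP_MASK) &&  frac_zero;  /* t127 */
    int is_denorm = (exp == 0) && !frac_zero;              /* t128 */
    int is_norm   = (exp != 0) && (exp != F128_EXP_MASK);  /* t129 */
    int is_zero   = (exp == 0) &&  frac_zero;              /* t132 */

    /* Any non-zero, non-NaN value: denormal, normal or infinity. */
    int nonzero_num = is_denorm || is_norm || is_inf;

    uint32_t c  = is_nan || is_denorm || (neg && is_zero);
    uint32_t fl = neg && nonzero_num;     /* result is less than zero */
    uint32_t fg = !neg && nonzero_num;    /* result is greater than zero */
    uint32_t fe = is_zero;                /* result is zero */
    uint32_t fu = is_nan || is_inf;       /* unordered or infinite */

    return (c << 4) | (fl << 3) | (fg << 2) | (fe << 1) | fu;
}
```

For example, +1.0 (hi = 0x3FFF000000000000, lo = 0) gives 0x04 and -infinity
(hi = 0xFFFF000000000000, lo = 0) gives 0x09.  Calling something like this as
a helper would replace the long inline expansion with a single call per
instruction, at the cost of the optimizer no longer seeing through the
computation.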
