Title: [192540] trunk
Revision
192540
Author
fpi...@apple.com
Date
2015-11-17 14:31:40 -0800 (Tue, 17 Nov 2015)

Log Message

CheckAdd/Mul should have commutativity optimizations in B3->Air lowering
https://bugs.webkit.org/show_bug.cgi?id=151214

Reviewed by Geoffrey Garen.

Source/_javascript_Core:

This is an overhaul of CheckAdd/CheckSub/CheckMul that fixes bugs, improves codegen, and
simplifies the contract between B3 and its client.

Previously, the idea was that the operands to the Air BranchAdd/Sub/Mul matched the children of
the B3 CheckAdd/Sub/Mul, or at least, that's what the B3::CheckSpecial would make you believe.
This meant that B3/Air had to avoid optimizations that would break this protocol. This prevented
commutativity optimizations on CheckAdd/Mul and it also prevented strength reduction from
CheckMul(x, 2) to CheckAdd(x, x), for example. Those optimizations would break things because the
client's Stackmap generation callback was allowed to assume that the first entry in params.reps
was the first child. Also, there was a contract between B3 and its client that for CheckAdd/Sub,
the client would undo the operation by doing the opposite operation with the params.reps[0] as the
source and params.reps[1] as the destination.

This not only prevented commutativity optimizations, it also led to bugs. Here are two bugs that
we had:

- Add(x, x) would not work. The client would be told to undo the add using %x as both the source
  and destination. The client would use a sub() instruction. The result would not be x - it would
  be zero.

- Mul where the result got coalesced with one of the sources. You can't undo a multiplication, so
  you need to keep the inputs alive until after the result is computed - i.e. the inputs are late
  uses. I initially thought I worked around this by using a three-operand form of Mul, but of
  course that doesn't actually fix the issue.

This patch fixes these issues comprehensively by introducing the following changes:

The params.reps corresponding to the first two children of CheckAdd/Sub/Mul and the first child of
Check are always garbage: if you want to know the values of those children in the slow path, pass
them as additional stackmap children. This makes it clear to the compiler whose values you
actually care about, and frees the compiler to reorder and commute the non-stackmap children.

The "undo" of an Add or Sub is handled internally by B3: the client doesn't have to know anything
about undoing. B3 does it. The contract is simply that if a B3::Value* is a stackmap child of a
CheckXYZ, then the corresponding entry in params.reps inside the stackmap generator callback will
contain the value of that Value*. For Add and Sub, B3 undoes the operation. For Add(x, x), the
undo uses the carry flag and some shift magic. For Mul, B3 uses LateUse:

A new kind of Arg::Role called Arg::LateUse is introduced: This kind of use means that the use
happens at the start of the execution of the next instruction. None of the built-in Air opcodes
use LateUse, but it is used by B3::CheckSpecial. We use it to implement CheckMul.

Finally, this change fixes testComplex to actually create many live variables. This revealed a
really dumb pathology in allocateStack(), and this patch fixes it. Basically, it's a bad idea to
build interference graphs by repeatedly recreating the same cliques.

* assembler/MacroAssemblerX86Common.h:
(JSC::MacroAssemblerX86Common::test32):
(JSC::MacroAssemblerX86Common::setCarry):
(JSC::MacroAssemblerX86Common::invert):
* b3/B3CheckSpecial.cpp:
(JSC::B3::Air::numB3Args):
(JSC::B3::CheckSpecial::Key::Key):
(JSC::B3::CheckSpecial::Key::dump):
(JSC::B3::CheckSpecial::CheckSpecial):
(JSC::B3::CheckSpecial::forEachArg):
(JSC::B3::CheckSpecial::isValid):
(JSC::B3::CheckSpecial::generate):
(JSC::B3::CheckSpecial::dumpImpl):
(JSC::B3::CheckSpecial::deepDumpImpl):
* b3/B3CheckSpecial.h:
(JSC::B3::CheckSpecial::Key::Key):
(JSC::B3::CheckSpecial::Key::operator==):
(JSC::B3::CheckSpecial::Key::operator!=):
(JSC::B3::CheckSpecial::Key::opcode):
(JSC::B3::CheckSpecial::Key::numArgs):
(JSC::B3::CheckSpecial::Key::stackmapRole):
(JSC::B3::CheckSpecial::Key::hash):
* b3/B3CheckValue.cpp:
(JSC::B3::CheckValue::~CheckValue):
(JSC::B3::CheckValue::convertToAdd):
(JSC::B3::CheckValue::CheckValue):
* b3/B3CheckValue.h:
* b3/B3Const32Value.cpp:
(JSC::B3::Const32Value::checkMulConstant):
(JSC::B3::Const32Value::checkNegConstant):
(JSC::B3::Const32Value::divConstant):
* b3/B3Const32Value.h:
* b3/B3Const64Value.cpp:
(JSC::B3::Const64Value::checkMulConstant):
(JSC::B3::Const64Value::checkNegConstant):
(JSC::B3::Const64Value::divConstant):
* b3/B3Const64Value.h:
* b3/B3LowerToAir.cpp:
(JSC::B3::Air::LowerToAir::appendBinOp):
(JSC::B3::Air::LowerToAir::lower):
* b3/B3Opcode.h:
* b3/B3PatchpointSpecial.cpp:
(JSC::B3::PatchpointSpecial::~PatchpointSpecial):
(JSC::B3::PatchpointSpecial::forEachArg):
(JSC::B3::PatchpointSpecial::isValid):
* b3/B3ReduceStrength.cpp:
* b3/B3StackmapSpecial.cpp:
(JSC::B3::StackmapSpecial::forEachArgImpl):
* b3/B3StackmapSpecial.h:
* b3/B3StackmapValue.cpp:
(JSC::B3::StackmapValue::append):
(JSC::B3::StackmapValue::appendSomeRegister):
(JSC::B3::StackmapValue::setConstrainedChild):
* b3/B3StackmapValue.h:
* b3/B3Validate.cpp:
* b3/B3Value.cpp:
(JSC::B3::Value::checkMulConstant):
(JSC::B3::Value::checkNegConstant):
(JSC::B3::Value::divConstant):
* b3/B3Value.h:
* b3/air/AirAllocateStack.cpp:
(JSC::B3::Air::allocateStack):
* b3/air/AirArg.cpp:
(WTF::printInternal):
* b3/air/AirArg.h:
(JSC::B3::Air::Arg::isAnyUse):
(JSC::B3::Air::Arg::isEarlyUse):
(JSC::B3::Air::Arg::isLateUse):
(JSC::B3::Air::Arg::isDef):
(JSC::B3::Air::Arg::forEachTmp):
(JSC::B3::Air::Arg::isUse): Deleted.
* b3/air/AirGenerate.cpp:
(JSC::B3::Air::generate):
* b3/air/AirIteratedRegisterCoalescing.cpp:
(JSC::B3::Air::IteratedRegisterCoalescingAllocator::build):
(JSC::B3::Air::IteratedRegisterCoalescingAllocator::allocate):
(JSC::B3::Air::IteratedRegisterCoalescingAllocator::InterferenceEdge::hash):
(JSC::B3::Air::IteratedRegisterCoalescingAllocator::InterferenceEdge::dump):
(JSC::B3::Air::addSpillAndFillToProgram):
(JSC::B3::Air::iteratedRegisterCoalescingOnType):
(JSC::B3::Air::iteratedRegisterCoalescing):
* b3/air/AirLiveness.h:
(JSC::B3::Air::Liveness::Liveness):
(JSC::B3::Air::Liveness::LocalCalc::LocalCalc):
(JSC::B3::Air::Liveness::LocalCalc::live):
(JSC::B3::Air::Liveness::LocalCalc::takeLive):
(JSC::B3::Air::Liveness::LocalCalc::execute):
* b3/air/AirOpcode.opcodes:
* b3/air/AirReportUsedRegisters.cpp:
(JSC::B3::Air::reportUsedRegisters):
* b3/air/AirSpillEverything.cpp:
(JSC::B3::Air::spillEverything):
* b3/testb3.cpp:
(JSC::B3::testMulArg):
(JSC::B3::testMulArgStore):
(JSC::B3::testMulAddArg):
(JSC::B3::testMulArgs):
(JSC::B3::testComplex):
(JSC::B3::testSimpleCheck):
(JSC::B3::testCheckLessThan):
(JSC::B3::testCheckMegaCombo):
(JSC::B3::testCheckAddImm):
(JSC::B3::testCheckAddImmCommute):
(JSC::B3::testCheckAddImmSomeRegister):
(JSC::B3::testCheckAdd):
(JSC::B3::testCheckAdd64):
(JSC::B3::testCheckSubImm):
(JSC::B3::testCheckSubBadImm):
(JSC::B3::testCheckSub):
(JSC::B3::testCheckSub64):
(JSC::B3::testCheckNeg):
(JSC::B3::testCheckNeg64):
(JSC::B3::testCheckMul):
(JSC::B3::testCheckMulMemory):
(JSC::B3::testCheckMul2):
(JSC::B3::testCheckMul64):
(JSC::B3::run):

Source/WTF:

Disable my failed attempts at perfect forwarding, since they were incorrect, and caused compile
errors if you tried to pass an argument that bound to lvalue. This shouldn't affect performance of
anything we care about, and performance tests seem to confirm that it's all good.

* wtf/ScopedLambda.h:

Modified Paths

Diff

Modified: trunk/Source/_javascript_Core/ChangeLog (192539 => 192540)


--- trunk/Source/_javascript_Core/ChangeLog	2015-11-17 22:29:54 UTC (rev 192539)
+++ trunk/Source/_javascript_Core/ChangeLog	2015-11-17 22:31:40 UTC (rev 192540)
@@ -1,5 +1,176 @@
 2015-11-17  Filip Pizlo  <fpi...@apple.com>
 
+        CheckAdd/Mul should have commutativity optimizations in B3->Air lowering
+        https://bugs.webkit.org/show_bug.cgi?id=151214
+
+        Reviewed by Geoffrey Garen.
+
+        This is an overhaul of CheckAdd/CheckSub/CheckMul that fixes bugs, improves codegen, and
+        simplifies the contract between B3 and its client.
+
+        Previously, the idea was that the operands to the Air BranchAdd/Sub/Mul matched the children of
+        the B3 CheckAdd/Sub/Mul, or at least, that's what the B3::CheckSpecial would make you believe.
+        This meant that B3/Air had to avoid optimizations that would break this protocol. This prevented
+        commutativity optimizations on CheckAdd/Mul and it also prevented strength reduction from
+        CheckMul(x, 2) to CheckAdd(x, x), for example. Those optimizations would break things because the
+        client's Stackmap generation callback was allowed to assume that the first entry in params.reps
+        was the first child. Also, there was a contract between B3 and its client that for CheckAdd/Sub,
+        the client would undo the operation by doing the opposite operation with the params.reps[0] as the
+        source and params.reps[1] as the destination.
+
+        This not only prevented commutativity optimizations, it also led to bugs. Here are two bugs that
+        we had:
+
+        - Add(x, x) would not work. The client would be told to undo the add using %x as both the source
+          and destination. The client would use a sub() instruction. The result would not be x - it would
+          be zero.
+
+        - Mul where the result got coalesced with one of the sources. You can't undo a multiplication, so
+          you need to keep the inputs alive until after the result is computed - i.e. the inputs are late
+          uses. I initially thought I worked around this by using a three-operand form of Mul, but of
+          course that doesn't actually fix the issue.
+
+        This patch fixes these issues comprehensively by introducing the following changes:
+
+        The params.reps corresponding to the first two children of CheckAdd/Sub/Mul and the first child of
+        Check are always garbage: if you want to know the values of those children in the slow path, pass
+        them as additional stackmap children. This makes it clear to the compiler whose values you
+        actually care about, and frees the compiler to reorder and commute the non-stackmap children.
+
+        The "undo" of an Add or Sub is handled internally by B3: the client doesn't have to know anything
+        about undoing. B3 does it. The contract is simply that if a B3::Value* is a stackmap child of a
+        CheckXYZ, then the corresponding entry in params.reps inside the stackmap generator callback will
+        contain the value of that Value*. For Add and Sub, B3 undoes the operation. For Add(x, x), the
+        undo uses the carry flag and some shift magic. For Mul, B3 uses LateUse:
+
+        A new kind of Arg::Role called Arg::LateUse is introduced: This kind of use means that the use
+        happens at the start of the execution of the next instruction. None of the built-in Air opcodes
+        use LateUse, but it is used by B3::CheckSpecial. We use it to implement CheckMul.
+
+        Finally, this change fixes testComplex to actually create many live variables. This revealed a
+        really dumb pathology in allocateStack(), and this patch fixes it. Basically, it's a bad idea to
+        build interference graphs by repeatedly recreating the same cliques.
+
+        * assembler/MacroAssemblerX86Common.h:
+        (JSC::MacroAssemblerX86Common::test32):
+        (JSC::MacroAssemblerX86Common::setCarry):
+        (JSC::MacroAssemblerX86Common::invert):
+        * b3/B3CheckSpecial.cpp:
+        (JSC::B3::Air::numB3Args):
+        (JSC::B3::CheckSpecial::Key::Key):
+        (JSC::B3::CheckSpecial::Key::dump):
+        (JSC::B3::CheckSpecial::CheckSpecial):
+        (JSC::B3::CheckSpecial::forEachArg):
+        (JSC::B3::CheckSpecial::isValid):
+        (JSC::B3::CheckSpecial::generate):
+        (JSC::B3::CheckSpecial::dumpImpl):
+        (JSC::B3::CheckSpecial::deepDumpImpl):
+        * b3/B3CheckSpecial.h:
+        (JSC::B3::CheckSpecial::Key::Key):
+        (JSC::B3::CheckSpecial::Key::operator==):
+        (JSC::B3::CheckSpecial::Key::operator!=):
+        (JSC::B3::CheckSpecial::Key::opcode):
+        (JSC::B3::CheckSpecial::Key::numArgs):
+        (JSC::B3::CheckSpecial::Key::stackmapRole):
+        (JSC::B3::CheckSpecial::Key::hash):
+        * b3/B3CheckValue.cpp:
+        (JSC::B3::CheckValue::~CheckValue):
+        (JSC::B3::CheckValue::convertToAdd):
+        (JSC::B3::CheckValue::CheckValue):
+        * b3/B3CheckValue.h:
+        * b3/B3Const32Value.cpp:
+        (JSC::B3::Const32Value::checkMulConstant):
+        (JSC::B3::Const32Value::checkNegConstant):
+        (JSC::B3::Const32Value::divConstant):
+        * b3/B3Const32Value.h:
+        * b3/B3Const64Value.cpp:
+        (JSC::B3::Const64Value::checkMulConstant):
+        (JSC::B3::Const64Value::checkNegConstant):
+        (JSC::B3::Const64Value::divConstant):
+        * b3/B3Const64Value.h:
+        * b3/B3LowerToAir.cpp:
+        (JSC::B3::Air::LowerToAir::appendBinOp):
+        (JSC::B3::Air::LowerToAir::lower):
+        * b3/B3Opcode.h:
+        * b3/B3PatchpointSpecial.cpp:
+        (JSC::B3::PatchpointSpecial::~PatchpointSpecial):
+        (JSC::B3::PatchpointSpecial::forEachArg):
+        (JSC::B3::PatchpointSpecial::isValid):
+        * b3/B3ReduceStrength.cpp:
+        * b3/B3StackmapSpecial.cpp:
+        (JSC::B3::StackmapSpecial::forEachArgImpl):
+        * b3/B3StackmapSpecial.h:
+        * b3/B3StackmapValue.cpp:
+        (JSC::B3::StackmapValue::append):
+        (JSC::B3::StackmapValue::appendSomeRegister):
+        (JSC::B3::StackmapValue::setConstrainedChild):
+        * b3/B3StackmapValue.h:
+        * b3/B3Validate.cpp:
+        * b3/B3Value.cpp:
+        (JSC::B3::Value::checkMulConstant):
+        (JSC::B3::Value::checkNegConstant):
+        (JSC::B3::Value::divConstant):
+        * b3/B3Value.h:
+        * b3/air/AirAllocateStack.cpp:
+        (JSC::B3::Air::allocateStack):
+        * b3/air/AirArg.cpp:
+        (WTF::printInternal):
+        * b3/air/AirArg.h:
+        (JSC::B3::Air::Arg::isAnyUse):
+        (JSC::B3::Air::Arg::isEarlyUse):
+        (JSC::B3::Air::Arg::isLateUse):
+        (JSC::B3::Air::Arg::isDef):
+        (JSC::B3::Air::Arg::forEachTmp):
+        (JSC::B3::Air::Arg::isUse): Deleted.
+        * b3/air/AirGenerate.cpp:
+        (JSC::B3::Air::generate):
+        * b3/air/AirIteratedRegisterCoalescing.cpp:
+        (JSC::B3::Air::IteratedRegisterCoalescingAllocator::build):
+        (JSC::B3::Air::IteratedRegisterCoalescingAllocator::allocate):
+        (JSC::B3::Air::IteratedRegisterCoalescingAllocator::InterferenceEdge::hash):
+        (JSC::B3::Air::IteratedRegisterCoalescingAllocator::InterferenceEdge::dump):
+        (JSC::B3::Air::addSpillAndFillToProgram):
+        (JSC::B3::Air::iteratedRegisterCoalescingOnType):
+        (JSC::B3::Air::iteratedRegisterCoalescing):
+        * b3/air/AirLiveness.h:
+        (JSC::B3::Air::Liveness::Liveness):
+        (JSC::B3::Air::Liveness::LocalCalc::LocalCalc):
+        (JSC::B3::Air::Liveness::LocalCalc::live):
+        (JSC::B3::Air::Liveness::LocalCalc::takeLive):
+        (JSC::B3::Air::Liveness::LocalCalc::execute):
+        * b3/air/AirOpcode.opcodes:
+        * b3/air/AirReportUsedRegisters.cpp:
+        (JSC::B3::Air::reportUsedRegisters):
+        * b3/air/AirSpillEverything.cpp:
+        (JSC::B3::Air::spillEverything):
+        * b3/testb3.cpp:
+        (JSC::B3::testMulArg):
+        (JSC::B3::testMulArgStore):
+        (JSC::B3::testMulAddArg):
+        (JSC::B3::testMulArgs):
+        (JSC::B3::testComplex):
+        (JSC::B3::testSimpleCheck):
+        (JSC::B3::testCheckLessThan):
+        (JSC::B3::testCheckMegaCombo):
+        (JSC::B3::testCheckAddImm):
+        (JSC::B3::testCheckAddImmCommute):
+        (JSC::B3::testCheckAddImmSomeRegister):
+        (JSC::B3::testCheckAdd):
+        (JSC::B3::testCheckAdd64):
+        (JSC::B3::testCheckSubImm):
+        (JSC::B3::testCheckSubBadImm):
+        (JSC::B3::testCheckSub):
+        (JSC::B3::testCheckSub64):
+        (JSC::B3::testCheckNeg):
+        (JSC::B3::testCheckNeg64):
+        (JSC::B3::testCheckMul):
+        (JSC::B3::testCheckMulMemory):
+        (JSC::B3::testCheckMul2):
+        (JSC::B3::testCheckMul64):
+        (JSC::B3::run):
+
+2015-11-17  Filip Pizlo  <fpi...@apple.com>
+
         Air should lay out code optimally
         https://bugs.webkit.org/show_bug.cgi?id=150478
 

Modified: trunk/Source/_javascript_Core/assembler/MacroAssemblerX86Common.h (192539 => 192540)


--- trunk/Source/_javascript_Core/assembler/MacroAssemblerX86Common.h	2015-11-17 22:29:54 UTC (rev 192539)
+++ trunk/Source/_javascript_Core/assembler/MacroAssemblerX86Common.h	2015-11-17 22:31:40 UTC (rev 192540)
@@ -1611,6 +1611,11 @@
         set32(x86Condition(cond), dest);
     }
 
+    void setCarry(RegisterID dest)
+    {
+        set32(X86Assembler::ConditionC, dest);
+    }
+
     // Invert a relational condition, e.g. == becomes !=, < becomes >=, etc.
     static RelationalCondition invert(RelationalCondition cond)
     {

Modified: trunk/Source/_javascript_Core/b3/B3CheckSpecial.cpp (192539 => 192540)


--- trunk/Source/_javascript_Core/b3/B3CheckSpecial.cpp	2015-11-17 22:29:54 UTC (rev 192539)
+++ trunk/Source/_javascript_Core/b3/B3CheckSpecial.cpp	2015-11-17 22:31:40 UTC (rev 192540)
@@ -39,9 +39,9 @@
 
 namespace {
 
-unsigned numB3Args(Inst& inst)
+unsigned numB3Args(B3::Opcode opcode)
 {
-    switch (inst.origin->opcode()) {
+    switch (opcode) {
     case CheckAdd:
     case CheckSub:
     case CheckMul:
@@ -54,28 +54,40 @@
     }
 }
 
+unsigned numB3Args(Value* value)
+{
+    return numB3Args(value->opcode());
+}
+
+unsigned numB3Args(Inst& inst)
+{
+    return numB3Args(inst.origin);
+}
+
 } // anonymous namespace
 
 CheckSpecial::Key::Key(const Inst& inst)
 {
     m_opcode = inst.opcode;
     m_numArgs = inst.args.size();
+    m_stackmapRole = Arg::Use;
 }
 
 void CheckSpecial::Key::dump(PrintStream& out) const
 {
-    out.print(m_opcode, "(", m_numArgs, ")");
+    out.print(m_opcode, "(", m_numArgs, ",", m_stackmapRole, ")");
 }
 
-CheckSpecial::CheckSpecial(Air::Opcode opcode, unsigned numArgs)
+CheckSpecial::CheckSpecial(Air::Opcode opcode, unsigned numArgs, Arg::Role stackmapRole)
     : m_checkOpcode(opcode)
+    , m_stackmapRole(stackmapRole)
     , m_numCheckArgs(numArgs)
 {
     ASSERT(isTerminal(opcode));
 }
 
 CheckSpecial::CheckSpecial(const CheckSpecial::Key& key)
-    : CheckSpecial(key.opcode(), key.numArgs())
+    : CheckSpecial(key.opcode(), key.numArgs(), key.stackmapRole())
 {
 }
 
@@ -105,7 +117,7 @@
     Inst hidden = hiddenBranch(inst);
     hidden.forEachArg(callback);
     commitHiddenBranch(inst, hidden);
-    forEachArgImpl(numB3Args(inst), m_numCheckArgs + 1, inst, callback);
+    forEachArgImpl(numB3Args(inst), m_numCheckArgs + 1, inst, m_stackmapRole, callback);
 }
 
 bool CheckSpecial::isValid(Inst& inst)
@@ -130,28 +142,71 @@
     ASSERT(value);
 
     Vector<ValueRep> reps;
-    if (isCheckMath(value->opcode())) {
-        if (value->opcode() == CheckMul) {
-            reps.append(repForArg(*context.code, inst.args[2]));
-            reps.append(repForArg(*context.code, inst.args[3]));
-        } else {
-            if (value->opcode() == CheckSub && value->child(0)->isInt(0))
-                reps.append(ValueRep::constant(0));
-            else
-                reps.append(repForArg(*context.code, inst.args[3]));
-            reps.append(repForArg(*context.code, inst.args[2]));
-        }
-    } else {
-        ASSERT(value->opcode() == Check);
-        reps.append(ValueRep::constant(1));
-    }
+    for (unsigned i = numB3Args(value); i--;)
+        reps.append(ValueRep());
 
     appendRepsImpl(context, m_numCheckArgs + 1, inst, reps);
-    
+
+    // Set aside the args that are relevant to undoing the operation. This is because we don't want to
+    // capture all of inst in the closure below.
+    Vector<Arg, 3> args;
+    for (unsigned i = 0; i < m_numCheckArgs; ++i)
+        args.append(inst.args[1 + i]);
+
     context.latePaths.append(
         createSharedTask<GenerationContext::LatePathFunction>(
-            [=] (CCallHelpers& jit, GenerationContext&) {
+            [=] (CCallHelpers& jit, GenerationContext& context) {
                 fail.link(&jit);
+
+                // If necessary, undo the operation.
+                switch (m_checkOpcode) {
+                case BranchAdd32:
+                    if (args[1] == args[2]) {
+                        // This is ugly, but that's fine - we won't have to do this very often.
+                        ASSERT(args[1].isGPR());
+                        GPRReg valueGPR = args[1].gpr();
+                        GPRReg scratchGPR = CCallHelpers::selectScratchGPR(valueGPR);
+                        jit.pushToSave(scratchGPR);
+                        jit.setCarry(scratchGPR);
+                        jit.lshift32(CCallHelpers::TrustedImm32(31), scratchGPR);
+                        jit.urshift32(CCallHelpers::TrustedImm32(1), valueGPR);
+                        jit.or32(scratchGPR, valueGPR);
+                        jit.popToRestore(scratchGPR);
+                        break;
+                    }
+                    Inst(Sub32, nullptr, args[1], args[2]).generate(jit, context);
+                    break;
+                case BranchAdd64:
+                    if (args[1] == args[2]) {
+                        // This is ugly, but that's fine - we won't have to do this very often.
+                        ASSERT(args[1].isGPR());
+                        GPRReg valueGPR = args[1].gpr();
+                        GPRReg scratchGPR = CCallHelpers::selectScratchGPR(valueGPR);
+                        jit.pushToSave(scratchGPR);
+                        jit.setCarry(scratchGPR);
+                        jit.lshift64(CCallHelpers::TrustedImm32(63), scratchGPR);
+                        jit.urshift64(CCallHelpers::TrustedImm32(1), valueGPR);
+                        jit.or64(scratchGPR, valueGPR);
+                        jit.popToRestore(scratchGPR);
+                        break;
+                    }
+                    Inst(Sub64, nullptr, args[1], args[2]).generate(jit, context);
+                    break;
+                case BranchSub32:
+                    Inst(Add32, nullptr, args[1], args[2]).generate(jit, context);
+                    break;
+                case BranchSub64:
+                    Inst(Add64, nullptr, args[1], args[2]).generate(jit, context);
+                    break;
+                case BranchNeg32:
+                    Inst(Neg32, nullptr, args[1]).generate(jit, context);
+                    break;
+                case BranchNeg64:
+                    Inst(Neg64, nullptr, args[1]).generate(jit, context);
+                    break;
+                default:
+                    break;
+                }
                 
                 StackmapGenerationParams params;
                 params.value = value;
@@ -166,7 +221,7 @@
 
 void CheckSpecial::dumpImpl(PrintStream& out) const
 {
-    out.print(m_checkOpcode, "(", m_numCheckArgs, ")");
+    out.print(m_checkOpcode, "(", m_numCheckArgs, ",", m_stackmapRole, ")");
 }
 
 void CheckSpecial::deepDumpImpl(PrintStream& out) const

Modified: trunk/Source/_javascript_Core/b3/B3CheckSpecial.h (192539 => 192540)


--- trunk/Source/_javascript_Core/b3/B3CheckSpecial.h	2015-11-17 22:29:54 UTC (rev 192539)
+++ trunk/Source/_javascript_Core/b3/B3CheckSpecial.h	2015-11-17 22:31:40 UTC (rev 192540)
@@ -28,6 +28,7 @@
 
 #if ENABLE(B3_JIT)
 
+#include "AirArg.h"
 #include "AirOpcode.h"
 #include "B3StackmapSpecial.h"
 #include <wtf/HashMap.h>
@@ -56,12 +57,14 @@
     public:
         Key()
             : m_opcode(Air::Nop)
+            , m_stackmapRole(Air::Arg::Use)
             , m_numArgs(0)
         {
         }
         
-        Key(Air::Opcode opcode, unsigned numArgs)
+        Key(Air::Opcode opcode, unsigned numArgs, Air::Arg::Role stackmapRole = Air::Arg::Use)
             : m_opcode(opcode)
+            , m_stackmapRole(stackmapRole)
             , m_numArgs(numArgs)
         {
         }
@@ -71,7 +74,8 @@
         bool operator==(const Key& other) const
         {
             return m_opcode == other.m_opcode
-                && m_numArgs == other.m_numArgs;
+                && m_numArgs == other.m_numArgs
+                && m_stackmapRole == other.m_stackmapRole;
         }
 
         bool operator!=(const Key& other) const
@@ -83,11 +87,13 @@
 
         Air::Opcode opcode() const { return m_opcode; }
         unsigned numArgs() const { return m_numArgs; }
+        Air::Arg::Role stackmapRole() const { return m_stackmapRole; }
 
         void dump(PrintStream& out) const;
 
         Key(WTF::HashTableDeletedValueType)
             : m_opcode(Air::Nop)
+            , m_stackmapRole(Air::Arg::Use)
             , m_numArgs(1)
         {
         }
@@ -100,15 +106,16 @@
         unsigned hash() const
         {
             // Seriously, we don't need to be smart here. It just doesn't matter.
-            return m_opcode + m_numArgs;
+            return m_opcode + m_numArgs + m_stackmapRole;
         }
         
     private:
         Air::Opcode m_opcode;
+        Air::Arg::Role m_stackmapRole;
         unsigned m_numArgs;
     };
     
-    CheckSpecial(Air::Opcode, unsigned numArgs);
+    CheckSpecial(Air::Opcode, unsigned numArgs, Air::Arg::Role stackmapRole = Air::Arg::Use);
     CheckSpecial(const Key&);
     ~CheckSpecial();
 
@@ -133,6 +140,7 @@
 
 private:
     Air::Opcode m_checkOpcode;
+    Air::Arg::Role m_stackmapRole;
     unsigned m_numCheckArgs;
 };
 

Modified: trunk/Source/_javascript_Core/b3/B3CheckValue.cpp (192539 => 192540)


--- trunk/Source/_javascript_Core/b3/B3CheckValue.cpp	2015-11-17 22:29:54 UTC (rev 192539)
+++ trunk/Source/_javascript_Core/b3/B3CheckValue.cpp	2015-11-17 22:31:40 UTC (rev 192540)
@@ -34,6 +34,12 @@
 {
 }
 
+void CheckValue::convertToAdd()
+{
+    RELEASE_ASSERT(opcode() == CheckAdd || opcode() == CheckSub || opcode() == CheckMul);
+    m_opcode = CheckAdd;
+}
+
 // Use this form for CheckAdd, CheckSub, and CheckMul.
 CheckValue::CheckValue(unsigned index, Opcode opcode, Origin origin, Value* left, Value* right)
     : StackmapValue(index, CheckedOpcode, opcode, left->type(), origin)
@@ -41,8 +47,8 @@
     ASSERT(B3::isInt(type()));
     ASSERT(left->type() == right->type());
     ASSERT(opcode == CheckAdd || opcode == CheckSub || opcode == CheckMul);
-    append(ConstrainedValue(left, ValueRep::SomeRegister));
-    append(ConstrainedValue(right, ValueRep::SomeRegister));
+    append(ConstrainedValue(left, ValueRep::Any));
+    append(ConstrainedValue(right, ValueRep::Any));
 }
 
 // Use this form for Check.
@@ -50,7 +56,7 @@
     : StackmapValue(index, CheckedOpcode, opcode, Void, origin)
 {
     ASSERT(opcode == Check);
-    append(predicate);
+    append(ConstrainedValue(predicate, ValueRep::Any));
 }
 
 } } // namespace JSC::B3

Modified: trunk/Source/_javascript_Core/b3/B3CheckValue.h (192539 => 192540)


--- trunk/Source/_javascript_Core/b3/B3CheckValue.h	2015-11-17 22:29:54 UTC (rev 192539)
+++ trunk/Source/_javascript_Core/b3/B3CheckValue.h	2015-11-17 22:31:40 UTC (rev 192540)
@@ -49,6 +49,8 @@
 
     ~CheckValue();
 
+    void convertToAdd();
+
 private:
     friend class Procedure;
 

Modified: trunk/Source/_javascript_Core/b3/B3Const32Value.cpp (192539 => 192540)


--- trunk/Source/_javascript_Core/b3/B3Const32Value.cpp	2015-11-17 22:29:54 UTC (rev 192539)
+++ trunk/Source/_javascript_Core/b3/B3Const32Value.cpp	2015-11-17 22:31:40 UTC (rev 192540)
@@ -98,6 +98,13 @@
     return proc.add<Const32Value>(origin(), result.unsafeGet());
 }
 
+Value* Const32Value::checkNegConstant(Procedure& proc) const
+{
+    if (m_value == -m_value)
+        return nullptr;
+    return negConstant(proc);
+}
+
 Value* Const32Value::divConstant(Procedure& proc, const Value* other) const
 {
     if (!other->hasInt32())

Modified: trunk/Source/_javascript_Core/b3/B3Const32Value.h (192539 => 192540)


--- trunk/Source/_javascript_Core/b3/B3Const32Value.h	2015-11-17 22:29:54 UTC (rev 192539)
+++ trunk/Source/_javascript_Core/b3/B3Const32Value.h	2015-11-17 22:31:40 UTC (rev 192540)
@@ -48,6 +48,7 @@
     Value* checkAddConstant(Procedure&, const Value* other) const override;
     Value* checkSubConstant(Procedure&, const Value* other) const override;
     Value* checkMulConstant(Procedure&, const Value* other) const override;
+    Value* checkNegConstant(Procedure&) const override;
     Value* divConstant(Procedure&, const Value* other) const override;
     Value* bitAndConstant(Procedure&, const Value* other) const override;
     Value* bitOrConstant(Procedure&, const Value* other) const override;

Modified: trunk/Source/_javascript_Core/b3/B3Const64Value.cpp (192539 => 192540)


--- trunk/Source/_javascript_Core/b3/B3Const64Value.cpp	2015-11-17 22:29:54 UTC (rev 192539)
+++ trunk/Source/_javascript_Core/b3/B3Const64Value.cpp	2015-11-17 22:31:40 UTC (rev 192540)
@@ -98,6 +98,13 @@
     return proc.add<Const64Value>(origin(), result.unsafeGet());
 }
 
+Value* Const64Value::checkNegConstant(Procedure& proc) const
+{
+    if (m_value == -m_value)
+        return nullptr;
+    return negConstant(proc);
+}
+
 Value* Const64Value::divConstant(Procedure& proc, const Value* other) const
 {
     if (!other->hasInt64())

Modified: trunk/Source/_javascript_Core/b3/B3Const64Value.h (192539 => 192540)


--- trunk/Source/_javascript_Core/b3/B3Const64Value.h	2015-11-17 22:29:54 UTC (rev 192539)
+++ trunk/Source/_javascript_Core/b3/B3Const64Value.h	2015-11-17 22:31:40 UTC (rev 192540)
@@ -48,6 +48,7 @@
     Value* checkAddConstant(Procedure&, const Value* other) const override;
     Value* checkSubConstant(Procedure&, const Value* other) const override;
     Value* checkMulConstant(Procedure&, const Value* other) const override;
+    Value* checkNegConstant(Procedure&) const override;
     Value* divConstant(Procedure&, const Value* other) const override;
     Value* bitAndConstant(Procedure&, const Value* other) const override;
     Value* bitOrConstant(Procedure&, const Value* other) const override;

Modified: trunk/Source/_javascript_Core/b3/B3LowerToAir.cpp (192539 => 192540)


--- trunk/Source/_javascript_Core/b3/B3LowerToAir.cpp	2015-11-17 22:29:54 UTC (rev 192539)
+++ trunk/Source/_javascript_Core/b3/B3LowerToAir.cpp	2015-11-17 22:31:40 UTC (rev 192540)
@@ -598,6 +598,10 @@
             return;
         }
 
+        // FIXME: If we're going to use a two-operand instruction, and the operand is commutative, we
+        // should coalesce the result with the operand that is killed.
+        // https://bugs.webkit.org/show_bug.cgi?id=151321
+        
         append(relaxedMoveForType(m_value->type()), tmp(left), result);
         append(opcode, tmp(right), result);
     }
@@ -1530,10 +1534,6 @@
 
         case CheckAdd:
         case CheckSub: {
-            // FIXME: Make this support commutativity. That will let us leverage more instruction forms
-            // and it let us commute to maximize coalescing.
-            // https://bugs.webkit.org/show_bug.cgi?id=151214
-
             CheckValue* checkValue = m_value->as<CheckValue>();
 
             Value* left = checkValue->child(0);
@@ -1543,7 +1543,7 @@
 
             // Handle checked negation.
             if (checkValue->opcode() == CheckSub && left->isInt(0)) {
-                append(relaxedMoveForType(checkValue->type()), tmp(right), result);
+                append(Move, tmp(right), result);
 
                 Air::Opcode opcode =
                     opcodeForType(BranchNeg32, BranchNeg64, Air::Oops, checkValue->type());
@@ -1559,39 +1559,67 @@
                 return;
             }
 
-            append(relaxedMoveForType(m_value->type()), tmp(left), result);
+            // FIXME: Use commutativity of CheckAdd to increase the likelihood of coalescing.
+            // https://bugs.webkit.org/show_bug.cgi?id=151321
+
+            append(Move, tmp(left), result);
             
             Air::Opcode opcode = Air::Oops;
-            CheckSpecial* special = nullptr;
             switch (m_value->opcode()) {
             case CheckAdd:
                 opcode = opcodeForType(BranchAdd32, BranchAdd64, Air::Oops, m_value->type());
-                special = ensureCheckSpecial(opcode, 3);
                 break;
             case CheckSub:
                 opcode = opcodeForType(BranchSub32, BranchSub64, Air::Oops, m_value->type());
-                special = ensureCheckSpecial(opcode, 3);
                 break;
             default:
                 RELEASE_ASSERT_NOT_REACHED();
                 break;
             }
 
-            Inst inst(Patch, checkValue, Arg::special(special));
-
-            inst.args.append(Arg::resCond(MacroAssembler::Overflow));
-
             // FIXME: It would be great to fuse Loads into these. We currently don't do it because the
             // rule for stackmaps is that all addresses are just stack addresses. Maybe we could relax
             // this rule here.
             // https://bugs.webkit.org/show_bug.cgi?id=151228
-            
+
+            Arg source;
             if (imm(right) && isValidForm(opcode, Arg::ResCond, Arg::Imm, Arg::Tmp))
-                inst.args.append(imm(right));
+                source = imm(right);
             else
-                inst.args.append(tmp(right));
+                source = tmp(right);
+
+            // There is a really hilarious case that arises when we do BranchAdd32(%x, %x). We won't emit
+            // such code, but the coalescing in our register allocator also does copy propagation, so
+            // although we emit:
+            //
+            //     Move %tmp1, %tmp2
+            //     BranchAdd32 %tmp1, %tmp2
+            //
+            // The register allocator may turn this into:
+            //
+            //     BranchAdd32 %rax, %rax
+            //
+            // Currently we handle this by ensuring that even this kind of addition can be undone. We can
+            // undo it by using the carry flag. It's tempting to get rid of that code and just "fix" this
+            // here by forcing LateUse on the stackmap. If we did that unconditionally, we'd lose a lot of
+            // performance. So it's tempting to do it only if left == right. But that creates an awkward
+            // constraint on Air: it means that Air would not be allowed to do any copy propagation.
+            // Notice that the %rax,%rax situation happened after Air copy-propagated the Move we are
+            // emitting. We know that copy-propagating over that Move causes add-to-self. But what if we
+            // emit something like a Move - or even do other kinds of copy-propagation on tmp's -
+            // somewhere else in this code. The add-to-self situation may only emerge after some other Air
+            // optimizations remove other Move's or identity-like operations. That's why we don't use
+            // LateUse here to take care of add-to-self.
+            
+            CheckSpecial* special = ensureCheckSpecial(opcode, 3);
+            
+            Inst inst(Patch, checkValue, Arg::special(special));
+
+            inst.args.append(Arg::resCond(MacroAssembler::Overflow));
+
+            inst.args.append(source);
             inst.args.append(result);
-            
+
             fillStackmap(inst, checkValue, 2);
 
             m_insts.last().append(WTF::move(inst));
@@ -1599,10 +1627,6 @@
         }
 
         case CheckMul: {
-            // Handle multiplication separately. Multiplication is hard because we have to preserve
-            // both inputs. This requires using three-operand multiplication, even on platforms where
-            // this requires an additional Move.
-
             CheckValue* checkValue = m_value->as<CheckValue>();
 
             Value* left = checkValue->child(0);
@@ -1612,14 +1636,15 @@
 
             Air::Opcode opcode =
                 opcodeForType(BranchMul32, BranchMul64, Air::Oops, checkValue->type());
-            CheckSpecial* special = ensureCheckSpecial(opcode, 4);
+            CheckSpecial* special = ensureCheckSpecial(opcode, 3, Arg::LateUse);
 
             // FIXME: Handle immediates.
             // https://bugs.webkit.org/show_bug.cgi?id=151230
 
+            append(Move, tmp(left), result);
+
             Inst inst(Patch, checkValue, Arg::special(special));
             inst.args.append(Arg::resCond(MacroAssembler::Overflow));
-            inst.args.append(tmp(left));
             inst.args.append(tmp(right));
             inst.args.append(result);
 

Modified: trunk/Source/_javascript_Core/b3/B3Opcode.h (192539 => 192540)


--- trunk/Source/_javascript_Core/b3/B3Opcode.h	2015-11-17 22:29:54 UTC (rev 192539)
+++ trunk/Source/_javascript_Core/b3/B3Opcode.h	2015-11-17 22:31:40 UTC (rev 192540)
@@ -163,6 +163,10 @@
     // callbacks. Yet, this transformation is valid even if they are different because we know that
     // after the first CheckAdd executes, the second CheckAdd could not have possibly taken slow
     // path. Therefore, the second CheckAdd's callback is irrelevant.
+    //
+    // Note that the first two children of these operations have ValueRep's, both as input constraints and
+    // in the reps provided to the generator. The output constraints could be anything, and should not be
+    // inspected for meaning. If you want to capture the values of the inputs, use stackmap arguments.
     CheckAdd,
     CheckSub,
     CheckMul,
@@ -170,7 +174,8 @@
     // Check that side-exits. Use the CheckValue class. Like CheckAdd and friends, this has a
     // stackmap with a generation callback. This takes an int argument that this branches on, with
     // full branch fusion in the instruction selector. A true value jumps to the generator's slow
-    // path.
+    // path. Note that the predicate child is has both an input and output ValueRep. The input constraint
+    // must be Any, and the output could be anything.
     Check,
 
     // SSA support, in the style of DFG SSA.

Modified: trunk/Source/_javascript_Core/b3/B3PatchpointSpecial.cpp (192539 => 192540)


--- trunk/Source/_javascript_Core/b3/B3PatchpointSpecial.cpp	2015-11-17 22:29:54 UTC (rev 192539)
+++ trunk/Source/_javascript_Core/b3/B3PatchpointSpecial.cpp	2015-11-17 22:31:40 UTC (rev 192540)
@@ -43,16 +43,18 @@
 {
 }
 
-void PatchpointSpecial::forEachArg(
-    Inst& inst, const ScopedLambda<Inst::EachArgCallback>& callback)
+void PatchpointSpecial::forEachArg(Inst& inst, const ScopedLambda<Inst::EachArgCallback>& callback)
 {
+    // FIXME: Allow B3 Patchpoints to specify LateUse.
+    // https://bugs.webkit.org/show_bug.cgi?id=151335
+    
     if (inst.origin->type() == Void) {
-        forEachArgImpl(0, 1, inst, callback);
+        forEachArgImpl(0, 1, inst, Arg::Use, callback);
         return;
     }
 
     callback(inst.args[1], Arg::Def, inst.origin->airType());
-    forEachArgImpl(0, 2, inst, callback);
+    forEachArgImpl(0, 2, inst, Arg::Use, callback);
 }
 
 bool PatchpointSpecial::isValid(Inst& inst)

Modified: trunk/Source/_javascript_Core/b3/B3ReduceStrength.cpp (192539 => 192540)


--- trunk/Source/_javascript_Core/b3/B3ReduceStrength.cpp	2015-11-17 22:29:54 UTC (rev 192539)
+++ trunk/Source/_javascript_Core/b3/B3ReduceStrength.cpp	2015-11-17 22:31:40 UTC (rev 192540)
@@ -633,18 +633,11 @@
             break;
 
         case CheckAdd:
-            // FIXME: Handle commutativity.
-            // https://bugs.webkit.org/show_bug.cgi?id=151214
-            
             if (replaceWithNewValue(m_value->child(0)->checkAddConstant(m_proc, m_value->child(1))))
                 break;
+
+            handleCommutativity();
             
-            if (m_value->child(0)->isInt(0)) {
-                m_value->replaceWithIdentity(m_value->child(1));
-                m_changed = true;
-                break;
-            }
-            
             if (m_value->child(1)->isInt(0)) {
                 m_value->replaceWithIdentity(m_value->child(0));
                 m_changed = true;
@@ -661,30 +654,43 @@
                 m_changed = true;
                 break;
             }
+
+            if (Value* negatedConstant = m_value->child(1)->checkNegConstant(m_proc)) {
+                m_insertionSet.insertValue(m_index, negatedConstant);
+                m_value->as<CheckValue>()->convertToAdd();
+                m_value->child(1) = negatedConstant;
+                m_changed = true;
+                break;
+            }
             break;
 
         case CheckMul:
-            // FIXME: Handle commutativity.
-            // https://bugs.webkit.org/show_bug.cgi?id=151214
-            
             if (replaceWithNewValue(m_value->child(0)->checkMulConstant(m_proc, m_value->child(1))))
                 break;
 
-            if (m_value->child(0)->isInt(1)) {
-                m_value->replaceWithIdentity(m_value->child(1));
-                m_changed = true;
-                break;
-            }
-            
-            if (m_value->child(1)->isInt(1)) {
-                m_value->replaceWithIdentity(m_value->child(0));
-                m_changed = true;
-                break;
-            }
+            handleCommutativity();
 
-            if (m_value->child(0)->isInt(0) || m_value->child(1)->isInt(0)) {
-                replaceWithNewValue(m_proc.addIntConstant(m_value, 0));
-                break;
+            if (m_value->child(1)->hasInt()) {
+                bool modified = true;
+                switch (m_value->child(1)->asInt()) {
+                case 0:
+                    replaceWithNewValue(m_proc.addIntConstant(m_value, 0));
+                    break;
+                case 1:
+                    m_value->replaceWithIdentity(m_value->child(0));
+                    m_changed = true;
+                    break;
+                case 2:
+                    m_value->as<CheckValue>()->convertToAdd();
+                    m_value->child(1) = m_value->child(0);
+                    m_changed = true;
+                    break;
+                default:
+                    modified = false;
+                    break;
+                }
+                if (modified)
+                    break;
             }
             break;
 
@@ -761,6 +767,10 @@
     // If we decide that value2 coming first is the canonical ordering.
     void handleCommutativity()
     {
+        // Note that we have commutative operations that take more than two children. Those operations may
+        // commute their first two children while leaving the rest unaffected.
+        ASSERT(m_value->numChildren() >= 2);
+        
         // Leave it alone if the right child is a constant.
         if (m_value->child(1)->isConstant())
             return;

Modified: trunk/Source/_javascript_Core/b3/B3StackmapSpecial.cpp (192539 => 192540)


--- trunk/Source/_javascript_Core/b3/B3StackmapSpecial.cpp	2015-11-17 22:29:54 UTC (rev 192539)
+++ trunk/Source/_javascript_Core/b3/B3StackmapSpecial.cpp	2015-11-17 22:31:40 UTC (rev 192540)
@@ -65,7 +65,7 @@
 
 void StackmapSpecial::forEachArgImpl(
     unsigned numIgnoredB3Args, unsigned numIgnoredAirArgs,
-    Inst& inst, const ScopedLambda<Inst::EachArgCallback>& callback)
+    Inst& inst, Arg::Role role, const ScopedLambda<Inst::EachArgCallback>& callback)
 {
     Value* value = inst.origin;
 
@@ -78,7 +78,7 @@
         Arg& arg = inst.args[i + numIgnoredAirArgs];
         Value* child = value->child(i + numIgnoredB3Args);
 
-        callback(arg, Arg::Use, Arg::typeForB3Type(child->type()));
+        callback(arg, role, Arg::typeForB3Type(child->type()));
     }
 }
 

Modified: trunk/Source/_javascript_Core/b3/B3StackmapSpecial.h (192539 => 192540)


--- trunk/Source/_javascript_Core/b3/B3StackmapSpecial.h	2015-11-17 22:29:54 UTC (rev 192539)
+++ trunk/Source/_javascript_Core/b3/B3StackmapSpecial.h	2015-11-17 22:31:40 UTC (rev 192540)
@@ -28,6 +28,7 @@
 
 #if ENABLE(B3_JIT)
 
+#include "AirArg.h"
 #include "AirSpecial.h"
 #include "B3ValueRep.h"
 
@@ -53,7 +54,7 @@
 
     void forEachArgImpl(
         unsigned numIgnoredB3Args, unsigned numIgnoredAirArgs,
-        Air::Inst&, const ScopedLambda<Air::Inst::EachArgCallback>&);
+        Air::Inst&, Air::Arg::Role role, const ScopedLambda<Air::Inst::EachArgCallback>&);
     bool isValidImpl(
         unsigned numIgnoredB3Args, unsigned numIgnoredAirArgs,
         Air::Inst&);

Modified: trunk/Source/_javascript_Core/b3/B3StackmapValue.cpp (192539 => 192540)


--- trunk/Source/_javascript_Core/b3/B3StackmapValue.cpp	2015-11-17 22:29:54 UTC (rev 192539)
+++ trunk/Source/_javascript_Core/b3/B3StackmapValue.cpp	2015-11-17 22:31:40 UTC (rev 192540)
@@ -48,6 +48,11 @@
     m_reps.append(constrainedValue.rep());
 }
 
+void StackmapValue::appendSomeRegister(Value* value)
+{
+    append(ConstrainedValue(value, ValueRep::SomeRegister));
+}
+
 void StackmapValue::setConstrainedChild(unsigned index, const ConstrainedValue& constrainedValue)
 {
     child(index) = constrainedValue.value();

Modified: trunk/Source/_javascript_Core/b3/B3StackmapValue.h (192539 => 192540)


--- trunk/Source/_javascript_Core/b3/B3StackmapValue.h	2015-11-17 22:29:54 UTC (rev 192539)
+++ trunk/Source/_javascript_Core/b3/B3StackmapValue.h	2015-11-17 22:31:40 UTC (rev 192540)
@@ -77,6 +77,10 @@
     // children().append(). That will work fine, but it's not recommended.
     void append(const ConstrainedValue&);
 
+    // This is a helper for something you might do a lot of: append a value that should be constrained
+    // to SomeRegister.
+    void appendSomeRegister(Value*);
+
     const Vector<ValueRep>& reps() const { return m_reps; }
 
     void clobber(const RegisterSet& set)

Modified: trunk/Source/_javascript_Core/b3/B3Validate.cpp (192539 => 192540)


--- trunk/Source/_javascript_Core/b3/B3Validate.cpp	2015-11-17 22:29:54 UTC (rev 192539)
+++ trunk/Source/_javascript_Core/b3/B3Validate.cpp	2015-11-17 22:31:40 UTC (rev 192540)
@@ -278,6 +278,8 @@
                 VALIDATE(value->numChildren() >= 2, ("At ", *value));
                 VALIDATE(isInt(value->child(0)->type()), ("At ", *value));
                 VALIDATE(isInt(value->child(1)->type()), ("At ", *value));
+                VALIDATE(value->as<StackmapValue>()->constrainedChild(0).rep() == ValueRep::Any, ("At ", *value));
+                VALIDATE(value->as<StackmapValue>()->constrainedChild(1).rep() == ValueRep::Any, ("At ", *value));
                 validateStackmap(value);
                 break;
             case Check:

Modified: trunk/Source/_javascript_Core/b3/B3Value.cpp (192539 => 192540)


--- trunk/Source/_javascript_Core/b3/B3Value.cpp	2015-11-17 22:29:54 UTC (rev 192539)
+++ trunk/Source/_javascript_Core/b3/B3Value.cpp	2015-11-17 22:31:40 UTC (rev 192540)
@@ -154,6 +154,11 @@
     return nullptr;
 }
 
+Value* Value::checkNegConstant(Procedure&) const
+{
+    return nullptr;
+}
+
 Value* Value::divConstant(Procedure&, const Value*) const
 {
     return nullptr;

Modified: trunk/Source/_javascript_Core/b3/B3Value.h (192539 => 192540)


--- trunk/Source/_javascript_Core/b3/B3Value.h	2015-11-17 22:29:54 UTC (rev 192539)
+++ trunk/Source/_javascript_Core/b3/B3Value.h	2015-11-17 22:31:40 UTC (rev 192540)
@@ -41,6 +41,7 @@
 namespace JSC { namespace B3 {
 
 class BasicBlock;
+class CheckValue;
 class Procedure;
 
 class JS_EXPORT_PRIVATE Value {
@@ -118,6 +119,7 @@
     virtual Value* checkAddConstant(Procedure&, const Value* other) const;
     virtual Value* checkSubConstant(Procedure&, const Value* other) const;
     virtual Value* checkMulConstant(Procedure&, const Value* other) const;
+    virtual Value* checkNegConstant(Procedure&) const;
     virtual Value* divConstant(Procedure&, const Value* other) const; // This chooses ChillDiv semantics for integers.
     virtual Value* bitAndConstant(Procedure&, const Value* other) const;
     virtual Value* bitOrConstant(Procedure&, const Value* other) const;
@@ -272,6 +274,8 @@
     }
 
 private:
+    friend class CheckValue; // CheckValue::convertToAdd() modifies m_opcode.
+    
     static Type typeFor(Opcode, Value* firstChild);
 
     // This group of fields is arranged to fit in 64 bits.

Modified: trunk/Source/_javascript_Core/b3/air/AirAllocateStack.cpp (192539 => 192540)


--- trunk/Source/_javascript_Core/b3/air/AirAllocateStack.cpp	2015-11-17 22:29:54 UTC (rev 192539)
+++ trunk/Source/_javascript_Core/b3/air/AirAllocateStack.cpp	2015-11-17 22:31:40 UTC (rev 192540)
@@ -146,38 +146,22 @@
         auto interfere = [&] (Inst& inst) {
             if (verbose)
                 dataLog("Interfering: ", pointerListDump(localCalc.live()), "\n");
-            
-            // Form a clique of stack slots that interfere. First find the list of stack slots
-            // that are live right now.
-            slots.resize(0);
-            for (StackSlot* slot : localCalc.live()) {
-                if (slot->kind() == StackSlotKind::Anonymous)
-                    slots.append(slot);
-            }
 
-            // We mustn't mandate that the input code is optimal. Therefore, it may have dead stores
-            // to the stack. We need to treat these as interfering.
             inst.forEachArg(
                 [&] (Arg& arg, Arg::Role role, Arg::Type) {
-                    if (Arg::isDef(role) && arg.isStack()) {
-                        StackSlot* slot = arg.stackSlot();
-                        if (slot->kind() == StackSlotKind::Anonymous
-                            && !localCalc.live().contains(slot))
-                            slots.append(slot);
+                    if (!Arg::isDef(role))
+                        return;
+                    if (!arg.isStack())
+                        return;
+                    StackSlot* slot = arg.stackSlot();
+                    if (slot->kind() != StackSlotKind::Anonymous)
+                        return;
+
+                    for (StackSlot* otherSlot : localCalc.live()) {
+                        interference[slot].add(otherSlot);
+                        interference[otherSlot].add(slot);
                     }
                 });
-            
-            if (verbose)
-                dataLog("    Slots: ", pointerListDump(slots), "\n");
-            for (unsigned i = 0; i < slots.size(); ++i) {
-                StackSlot* outer = slots[i];
-                for (unsigned j = i + 1; j < slots.size(); ++j) {
-                    StackSlot* inner = slots[j];
-
-                    interference[inner].add(outer);
-                    interference[outer].add(inner);
-                }
-            }
         };
 
         for (unsigned instIndex = block->size(); instIndex--;) {
@@ -185,7 +169,7 @@
                 dataLog("Analyzing: ", block->at(instIndex), "\n");
             Inst& inst = block->at(instIndex);
             interfere(inst);
-            localCalc.execute(inst);
+            localCalc.execute(instIndex);
         }
         Inst nop;
         interfere(nop);

Modified: trunk/Source/_javascript_Core/b3/air/AirArg.cpp (192539 => 192540)


--- trunk/Source/_javascript_Core/b3/air/AirArg.cpp	2015-11-17 22:29:54 UTC (rev 192539)
+++ trunk/Source/_javascript_Core/b3/air/AirArg.cpp	2015-11-17 22:31:40 UTC (rev 192540)
@@ -181,6 +181,9 @@
     case Arg::UseAddr:
         out.print("UseAddr");
         return;
+    case Arg::LateUse:
+        out.print("LateUse");
+        return;
     }
 
     RELEASE_ASSERT_NOT_REACHED();

Modified: trunk/Source/_javascript_Core/b3/air/AirArg.h (192539 => 192540)


--- trunk/Source/_javascript_Core/b3/air/AirArg.h	2015-11-17 22:29:54 UTC (rev 192539)
+++ trunk/Source/_javascript_Core/b3/air/AirArg.h	2015-11-17 22:31:40 UTC (rev 192540)
@@ -83,6 +83,10 @@
         // always escaping, and Stack is presumed to be always escaping if it's Locked.
         Use,
 
+        // LateUse means that the Inst will read from this value after doing its Def's. Note that LateUse
+        // on an Addr or Index still means Use on the internal temporaries.
+        LateUse,
+
         // Def means that the Inst will write to this value after doing everything else.
         //
         // For Tmp: The Inst will write to this Tmp.
@@ -123,12 +127,13 @@
     };
 
     // Returns true if the Role implies that the Inst will Use the Arg. It's deliberately false for
-    // UseAddr, since isUse() for an Arg::addr means that we are loading from the address.
-    static bool isUse(Role role)
+    // UseAddr, since isAnyUse() for an Arg::addr means that we are loading from the address.
+    static bool isAnyUse(Role role)
     {
         switch (role) {
         case Use:
         case UseDef:
+        case LateUse:
             return true;
         case Def:
         case UseAddr:
@@ -136,12 +141,33 @@
         }
     }
 
+    // Returns true if the Role implies that the Inst will Use the Arg before doing anything else.
+    static bool isEarlyUse(Role role)
+    {
+        switch (role) {
+        case Use:
+        case UseDef:
+            return true;
+        case Def:
+        case UseAddr:
+        case LateUse:
+            return false;
+        }
+    }
+
+    // Returns true if the Role implies that the Inst will Use the Arg after doing everything else.
+    static bool isLateUse(Role role)
+    {
+        return role == LateUse;
+    }
+
     // Returns true if the Role implies that the Inst will Def the Arg.
     static bool isDef(Role role)
     {
         switch (role) {
         case Use:
         case UseAddr:
+        case LateUse:
             return false;
         case Def:
         case UseDef:
@@ -666,7 +692,7 @@
     {
         switch (m_kind) {
         case Tmp:
-            ASSERT(isUse(argRole) || isDef(argRole));
+            ASSERT(isAnyUse(argRole) || isDef(argRole));
             functor(m_base, argRole, argType);
             break;
         case Addr:

Modified: trunk/Source/_javascript_Core/b3/air/AirGenerate.cpp (192539 => 192540)


--- trunk/Source/_javascript_Core/b3/air/AirGenerate.cpp	2015-11-17 22:29:54 UTC (rev 192539)
+++ trunk/Source/_javascript_Core/b3/air/AirGenerate.cpp	2015-11-17 22:31:40 UTC (rev 192540)
@@ -72,7 +72,10 @@
     // After this phase, every Tmp has a reg.
     //
     // For debugging, you can use spillEverything() to put everything to the stack between each Inst.
-    iteratedRegisterCoalescing(code);
+    if (false)
+        spillEverything(code);
+    else
+        iteratedRegisterCoalescing(code);
 
     // Prior to this point the prologue and epilogue is implicit. This makes it explicit. It also
     // does things like identify which callee-saves we're using and saves them.

Modified: trunk/Source/_javascript_Core/b3/air/AirIteratedRegisterCoalescing.cpp (192539 => 192540)


--- trunk/Source/_javascript_Core/b3/air/AirIteratedRegisterCoalescing.cpp	2015-11-17 22:29:54 UTC (rev 192539)
+++ trunk/Source/_javascript_Core/b3/air/AirIteratedRegisterCoalescing.cpp	2015-11-17 22:31:40 UTC (rev 192540)
@@ -34,12 +34,14 @@
 #include "AirLiveness.h"
 #include "AirPhaseScope.h"
 #include "AirRegisterPriority.h"
+#include <wtf/ListDump.h>
 #include <wtf/ListHashSet.h>
 
 namespace JSC { namespace B3 { namespace Air {
 
 static bool debug = false;
 static bool traceDebug = false;
+static bool reportStats = false;
 
 template<Arg::Type type>
 struct MoveInstHelper;
@@ -168,7 +170,7 @@
                 if (Arg::isDef(role))
                     defTmp = argTmp;
                 else {
-                    ASSERT(Arg::isUse(role));
+                    ASSERT(Arg::isEarlyUse(role));
                     useTmp = argTmp;
                 }
             });
@@ -204,6 +206,7 @@
         makeWorkList();
 
         if (debug) {
+            dataLog("Interference: ", listDump(m_interferenceEdges), "\n");
             dumpInterferenceGraphInDot(WTF::dataFile());
             dataLog("Initial work list\n");
             dumpWorkLists(WTF::dataFile());
@@ -742,6 +745,11 @@
             return WTF::IntHash<uint64_t>::hash(m_value);
         }
 
+        void dump(PrintStream& out) const
+        {
+            out.print(first(), "<=>", second());
+        }
+
     private:
         uint64_t m_value { 0 };
     };
@@ -951,7 +959,7 @@
                 Arg arg = Arg::stack(stackSlotEntry->value);
                 Opcode move = type == Arg::GP ? Move : MoveDouble;
 
-                if (Arg::isUse(role)) {
+                if (Arg::isAnyUse(role)) {
                     Tmp newTmp = code.newTmp(type);
                     insertionSet.insert(instIndex, move, inst.origin, arg, newTmp);
                     tmp = newTmp;
@@ -968,9 +976,10 @@
 }
 
 template<Arg::Type type>
-static void iteratedRegisterCoalescingOnType(Code& code, HashSet<Tmp>& unspillableTmps)
+static void iteratedRegisterCoalescingOnType(Code& code, HashSet<Tmp>& unspillableTmps, unsigned& numIterations)
 {
     while (true) {
+        numIterations++;
         IteratedRegisterCoalescingAllocator<type> allocator(code, unspillableTmps);
         Liveness<Tmp> liveness(code);
         for (BasicBlock* block : code) {
@@ -978,7 +987,7 @@
             for (unsigned instIndex = block->size(); instIndex--;) {
                 Inst& inst = block->at(instIndex);
                 allocator.build(inst, localCalc);
-                localCalc.execute(inst);
+                localCalc.execute(instIndex);
             }
         }
 
@@ -997,12 +1006,14 @@
 
     bool gpIsColored = false;
     bool fpIsColored = false;
+    unsigned numIterations = 0;
 
     HashSet<Tmp> unspillableGPs;
     HashSet<Tmp> unspillableFPs;
 
     // First we run both allocator together as long as they both spill.
     while (!gpIsColored && !fpIsColored) {
+        numIterations++;
         IteratedRegisterCoalescingAllocator<Arg::GP> gpAllocator(code, unspillableGPs);
         IteratedRegisterCoalescingAllocator<Arg::FP> fpAllocator(code, unspillableFPs);
 
@@ -1017,7 +1028,7 @@
                 gpAllocator.build(inst, localCalc);
                 fpAllocator.build(inst, localCalc);
 
-                localCalc.execute(inst);
+                localCalc.execute(instIndex);
             }
         }
 
@@ -1038,9 +1049,12 @@
     };
 
     if (!gpIsColored)
-        iteratedRegisterCoalescingOnType<Arg::GP>(code, unspillableGPs);
+        iteratedRegisterCoalescingOnType<Arg::GP>(code, unspillableGPs, numIterations);
     if (!fpIsColored)
-        iteratedRegisterCoalescingOnType<Arg::FP>(code, unspillableFPs);
+        iteratedRegisterCoalescingOnType<Arg::FP>(code, unspillableFPs, numIterations);
+
+    if (reportStats)
+        dataLog("Num iterations = ", numIterations, "\n");
 }
 
 } } } // namespace JSC::B3::Air

Modified: trunk/Source/_javascript_Core/b3/air/AirLiveness.h (192539 => 192540)


--- trunk/Source/_javascript_Core/b3/air/AirLiveness.h	2015-11-17 22:29:54 UTC (rev 192539)
+++ trunk/Source/_javascript_Core/b3/air/AirLiveness.h	2015-11-17 22:31:40 UTC (rev 192540)
@@ -49,6 +49,16 @@
         m_liveAtHead.resize(code.size());
         m_liveAtTail.resize(code.size());
 
+        // The liveAtTail of each block automatically contains the LateUse's of the terminal.
+        for (BasicBlock* block : code) {
+            HashSet<Thing>& live = m_liveAtTail[block];
+            block->last().forEach<Thing>(
+                [&] (Thing& thing, Arg::Role role, Arg::Type) {
+                    if (Arg::isLateUse(role))
+                        live.add(thing);
+                });
+        }
+
         IndexSet<BasicBlock> seen;
 
         bool changed = true;
@@ -61,7 +71,7 @@
                     continue;
                 LocalCalc localCalc(*this, block);
                 for (size_t instIndex = block->size(); instIndex--;)
-                    localCalc.execute(block->at(instIndex));
+                    localCalc.execute(instIndex);
                 bool firstTime = seen.add(block);
                 if (!firstTime && localCalc.live() == m_liveAtHead[block])
                     continue;
@@ -90,14 +100,17 @@
     public:
         LocalCalc(Liveness& liveness, BasicBlock* block)
             : m_live(liveness.liveAtTail(block))
+            , m_block(block)
         {
         }
 
         const HashSet<Thing>& live() const { return m_live; }
         HashSet<Thing>&& takeLive() { return WTF::move(m_live); }
 
-        void execute(Inst& inst)
+        void execute(unsigned instIndex)
         {
+            Inst& inst = m_block->at(instIndex);
+            
             // First handle def's.
             inst.forEach<Thing>(
                 [this] (Thing& arg, Arg::Role role, Arg::Type) {
@@ -107,20 +120,34 @@
                         m_live.remove(arg);
                 });
 
-            // Finally handle use's.
+            // Then handle use's.
             inst.forEach<Thing>(
                 [this] (Thing& arg, Arg::Role role, Arg::Type) {
                     if (!isAlive(arg))
                         return;
-                    if (Arg::isUse(role))
+                    if (Arg::isEarlyUse(role))
                         m_live.add(arg);
                 });
+
+            // And finally, handle the late use's of the previous instruction.
+            if (instIndex) {
+                Inst& prevInst = m_block->at(instIndex - 1);
+
+                prevInst.forEach<Thing>(
+                    [this] (Thing& arg, Arg::Role role, Arg::Type) {
+                        if (!Arg::isLateUse(role))
+                            return;
+                        if (isAlive(arg))
+                            m_live.add(arg);
+                    });
+            }
         }
 
     private:
         HashSet<Thing> m_live;
+        BasicBlock* m_block;
     };
-    
+
 private:
     IndexMap<BasicBlock, HashSet<Thing>> m_liveAtHead;
     IndexMap<BasicBlock, HashSet<Thing>> m_liveAtTail;

Modified: trunk/Source/_javascript_Core/b3/air/AirOpcode.opcodes (192539 => 192540)


--- trunk/Source/_javascript_Core/b3/air/AirOpcode.opcodes	2015-11-17 22:29:54 UTC (rev 192539)
+++ trunk/Source/_javascript_Core/b3/air/AirOpcode.opcodes	2015-11-17 22:31:40 UTC (rev 192540)
@@ -331,15 +331,9 @@
     ResCond, Tmp, Tmp
     ResCond, Addr, Tmp
 
-BranchMul32 U:G, U:G, U:G, D:G /branch
-    ResCond, Tmp, Tmp, Tmp
-
 BranchMul64 U:G, U:G, UD:G /branch
     ResCond, Tmp, Tmp
 
-BranchMul64 U:G, U:G, U:G, D:G /branch
-    ResCond, Tmp, Tmp, Tmp
-
 BranchSub32 U:G, U:G, UD:G /branch
     ResCond, Tmp, Tmp
     ResCond, Imm, Tmp

Modified: trunk/Source/_javascript_Core/b3/air/AirReportUsedRegisters.cpp (192539 => 192540)


--- trunk/Source/_javascript_Core/b3/air/AirReportUsedRegisters.cpp	2015-11-17 22:29:54 UTC (rev 192539)
+++ trunk/Source/_javascript_Core/b3/air/AirReportUsedRegisters.cpp	2015-11-17 22:31:40 UTC (rev 192540)
@@ -56,7 +56,7 @@
                 }
                 inst.reportUsedRegisters(registerSet);
             }
-            localCalc.execute(inst);
+            localCalc.execute(instIndex);
         }
     }
 }

Modified: trunk/Source/_javascript_Core/b3/air/AirSpillEverything.cpp (192539 => 192540)


--- trunk/Source/_javascript_Core/b3/air/AirSpillEverything.cpp	2015-11-17 22:29:54 UTC (rev 192539)
+++ trunk/Source/_javascript_Core/b3/air/AirSpillEverything.cpp	2015-11-17 22:31:40 UTC (rev 192540)
@@ -69,7 +69,7 @@
         for (unsigned instIndex = block->size(); instIndex--;) {
             Inst& inst = block->at(instIndex);
             setUsedRegisters(instIndex + 1, inst);
-            localCalc.execute(inst);
+            localCalc.execute(instIndex);
         }
 
         Inst nop;
@@ -140,6 +140,7 @@
                         }
                         break;
                     case Arg::UseDef:
+                    case Arg::LateUse:
                         for (Reg reg : regsInPriorityOrder(type)) {
                             if (!setBefore.get(reg) && !setAfter.get(reg)) {
                                 setAfter.set(reg);
@@ -160,7 +161,7 @@
 
                     Opcode move = type == Arg::GP ? Move : MoveDouble;
 
-                    if (Arg::isUse(role)) {
+                    if (Arg::isAnyUse(role)) {
                         insertionSet.insert(
                             instIndex, move, inst.origin, arg, tmp);
                     }

Modified: trunk/Source/_javascript_Core/b3/testb3.cpp (192539 => 192540)


--- trunk/Source/_javascript_Core/b3/testb3.cpp	2015-11-17 22:29:54 UTC (rev 192539)
+++ trunk/Source/_javascript_Core/b3/testb3.cpp	2015-11-17 22:31:40 UTC (rev 192540)
@@ -297,7 +297,8 @@
 {
     Procedure proc;
     BasicBlock* root = proc.addBlock();
-    Value* value = root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR0);
+    Value* value = root->appendNew<Value>(
+        proc, Trunc, Origin(), root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR0));
     root->appendNew<ControlValue>(
         proc, Return, Origin(),
         root->appendNew<Value>(proc, Mul, Origin(), value, value));
@@ -305,6 +306,51 @@
     CHECK(compileAndRun<int>(proc, a) == a * a);
 }
 
+void testMulArgStore(int a)
+{
+    Procedure proc;
+    BasicBlock* root = proc.addBlock();
+
+    int mulSlot;
+    int valueSlot;
+    
+    Value* value = root->appendNew<Value>(
+        proc, Trunc, Origin(),
+        root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR0));
+    Value* mul = root->appendNew<Value>(proc, Mul, Origin(), value, value);
+
+    root->appendNew<MemoryValue>(
+        proc, Store, Origin(), value,
+        root->appendNew<ConstPtrValue>(proc, Origin(), &valueSlot));
+    root->appendNew<MemoryValue>(
+        proc, Store, Origin(), mul,
+        root->appendNew<ConstPtrValue>(proc, Origin(), &mulSlot));
+
+    root->appendNew<ControlValue>(
+        proc, Return, Origin(), root->appendNew<Const32Value>(proc, Origin(), 0));
+
+    CHECK(!compileAndRun<int>(proc, a));
+    CHECK(mulSlot == a * a);
+    CHECK(valueSlot == a);
+}
+
+void testMulAddArg(int a)
+{
+    Procedure proc;
+    BasicBlock* root = proc.addBlock();
+    Value* value = root->appendNew<Value>(
+        proc, Trunc, Origin(),
+        root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR0));
+    root->appendNew<ControlValue>(
+        proc, Return, Origin(),
+        root->appendNew<Value>(
+            proc, Add, Origin(),
+            root->appendNew<Value>(proc, Mul, Origin(), value, value),
+            value));
+
+    CHECK(compileAndRun<int>(proc, a) == a * a + a);
+}
+
 void testMulArgs(int a, int b)
 {
     Procedure proc;
@@ -2754,9 +2800,9 @@
     for (unsigned i = 0; i < numConstructs; ++i) {
         if (i & 1) {
             // Control flow diamond.
-            unsigned predicateVarIndex = (i >> 1) % numVars;
-            unsigned thenIncVarIndex = ((i >> 1) + 1) % numVars;
-            unsigned elseIncVarIndex = ((i >> 1) + 2) % numVars;
+            unsigned predicateVarIndex = ((i >> 1) + 2) % numVars;
+            unsigned thenIncVarIndex = ((i >> 1) + 0) % numVars;
+            unsigned elseIncVarIndex = ((i >> 1) + 1) % numVars;
 
             BasicBlock* thenBlock = proc.addBlock();
             BasicBlock* elseBlock = proc.addBlock();
@@ -2801,7 +2847,7 @@
             BasicBlock* loopSkip = proc.addBlock();
             BasicBlock* continuation = proc.addBlock();
             
-            Value* startIndex = vars[(i >> 1) % numVars];
+            Value* startIndex = vars[((i >> 1) + 1) % numVars];
             Value* startSum = current->appendNew<Const32Value>(proc, Origin(), 0);
             current->appendNew<ControlValue>(
                 proc, Branch, Origin(), startIndex,
@@ -2855,7 +2901,7 @@
             skipSum->setPhi(finalSum);
 
             current = continuation;
-            vars[((i >> 1) + 1) % numVars] = finalSum;
+            vars[((i >> 1) + 0) % numVars] = finalSum;
         }
     }
 
@@ -3183,8 +3229,6 @@
     check->setGenerator(
         [&] (CCallHelpers& jit, const StackmapGenerationParams& params) {
             CHECK(params.reps.size() == 1);
-            CHECK(params.reps[0].isConstant());
-            CHECK(params.reps[0].value() == 1);
 
             // This should always work because a function this simple should never have callee
             // saves.
@@ -3216,8 +3260,6 @@
     check->setGenerator(
         [&] (CCallHelpers& jit, const StackmapGenerationParams& params) {
             CHECK(params.reps.size() == 1);
-            CHECK(params.reps[0].isConstant());
-            CHECK(params.reps[0].value() == 1);
 
             // This should always work because a function this simple should never have callee
             // saves.
@@ -3263,8 +3305,6 @@
     check->setGenerator(
         [&] (CCallHelpers& jit, const StackmapGenerationParams& params) {
             CHECK(params.reps.size() == 1);
-            CHECK(params.reps[0].isConstant());
-            CHECK(params.reps[0].value() == 1);
 
             // This should always work because a function this simple should never have callee
             // saves.
@@ -3299,14 +3339,15 @@
         root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR0));
     Value* arg2 = root->appendNew<Const32Value>(proc, Origin(), 42);
     CheckValue* checkAdd = root->appendNew<CheckValue>(proc, CheckAdd, Origin(), arg1, arg2);
+    checkAdd->append(arg1);
+    checkAdd->append(arg2);
     checkAdd->setGenerator(
         [&] (CCallHelpers& jit, const StackmapGenerationParams& params) {
-            CHECK(params.reps.size() == 2);
-            CHECK(params.reps[0].isGPR());
-            CHECK(params.reps[1].isConstant());
-            CHECK(params.reps[1].value() == 42);
-            jit.sub32(CCallHelpers::TrustedImm32(42), params.reps[0].gpr());
-            jit.convertInt32ToDouble(params.reps[0].gpr(), FPRInfo::fpRegT0);
+            CHECK(params.reps.size() == 4);
+            CHECK(params.reps[2].isGPR());
+            CHECK(params.reps[3].isConstant());
+            CHECK(params.reps[3].value() == 42);
+            jit.convertInt32ToDouble(params.reps[2].gpr(), FPRInfo::fpRegT0);
             jit.convertInt32ToDouble(CCallHelpers::TrustedImm32(42), FPRInfo::fpRegT1);
             jit.addDouble(FPRInfo::fpRegT1, FPRInfo::fpRegT0);
             jit.emitFunctionEpilogue();
@@ -3324,6 +3365,75 @@
     CHECK(invoke<double>(*code, 2147483647) == 2147483689.0);
 }
 
+void testCheckAddImmCommute()
+{
+    Procedure proc;
+    BasicBlock* root = proc.addBlock();
+    Value* arg1 = root->appendNew<Value>(
+        proc, Trunc, Origin(),
+        root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR0));
+    Value* arg2 = root->appendNew<Const32Value>(proc, Origin(), 42);
+    CheckValue* checkAdd = root->appendNew<CheckValue>(proc, CheckAdd, Origin(), arg2, arg1);
+    checkAdd->append(arg1);
+    checkAdd->append(arg2);
+    checkAdd->setGenerator(
+        [&] (CCallHelpers& jit, const StackmapGenerationParams& params) {
+            CHECK(params.reps.size() == 4);
+            CHECK(params.reps[2].isGPR());
+            CHECK(params.reps[3].isConstant());
+            CHECK(params.reps[3].value() == 42);
+            jit.convertInt32ToDouble(params.reps[2].gpr(), FPRInfo::fpRegT0);
+            jit.convertInt32ToDouble(CCallHelpers::TrustedImm32(42), FPRInfo::fpRegT1);
+            jit.addDouble(FPRInfo::fpRegT1, FPRInfo::fpRegT0);
+            jit.emitFunctionEpilogue();
+            jit.ret();
+        });
+    root->appendNew<ControlValue>(
+        proc, Return, Origin(),
+        root->appendNew<Value>(proc, IToD, Origin(), checkAdd));
+
+    auto code = compile(proc);
+
+    CHECK(invoke<double>(*code, 0) == 42.0);
+    CHECK(invoke<double>(*code, 1) == 43.0);
+    CHECK(invoke<double>(*code, 42) == 84.0);
+    CHECK(invoke<double>(*code, 2147483647) == 2147483689.0);
+}
+
+void testCheckAddImmSomeRegister()
+{
+    Procedure proc;
+    BasicBlock* root = proc.addBlock();
+    Value* arg1 = root->appendNew<Value>(
+        proc, Trunc, Origin(),
+        root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR0));
+    Value* arg2 = root->appendNew<Const32Value>(proc, Origin(), 42);
+    CheckValue* checkAdd = root->appendNew<CheckValue>(proc, CheckAdd, Origin(), arg1, arg2);
+    checkAdd->appendSomeRegister(arg1);
+    checkAdd->appendSomeRegister(arg2);
+    checkAdd->setGenerator(
+        [&] (CCallHelpers& jit, const StackmapGenerationParams& params) {
+            CHECK(params.reps.size() == 4);
+            CHECK(params.reps[2].isGPR());
+            CHECK(params.reps[3].isGPR());
+            jit.convertInt32ToDouble(params.reps[2].gpr(), FPRInfo::fpRegT0);
+            jit.convertInt32ToDouble(params.reps[3].gpr(), FPRInfo::fpRegT1);
+            jit.addDouble(FPRInfo::fpRegT1, FPRInfo::fpRegT0);
+            jit.emitFunctionEpilogue();
+            jit.ret();
+        });
+    root->appendNew<ControlValue>(
+        proc, Return, Origin(),
+        root->appendNew<Value>(proc, IToD, Origin(), checkAdd));
+
+    auto code = compile(proc);
+
+    CHECK(invoke<double>(*code, 0) == 42.0);
+    CHECK(invoke<double>(*code, 1) == 43.0);
+    CHECK(invoke<double>(*code, 42) == 84.0);
+    CHECK(invoke<double>(*code, 2147483647) == 2147483689.0);
+}
+
 void testCheckAdd()
 {
     Procedure proc;
@@ -3335,14 +3445,15 @@
         proc, Trunc, Origin(),
         root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR1));
     CheckValue* checkAdd = root->appendNew<CheckValue>(proc, CheckAdd, Origin(), arg1, arg2);
+    checkAdd->appendSomeRegister(arg1);
+    checkAdd->appendSomeRegister(arg2);
     checkAdd->setGenerator(
         [&] (CCallHelpers& jit, const StackmapGenerationParams& params) {
-            CHECK(params.reps.size() == 2);
-            CHECK(params.reps[0].isGPR());
-            CHECK(params.reps[1].isGPR());
-            jit.sub32(params.reps[1].gpr(), params.reps[0].gpr());
-            jit.convertInt32ToDouble(params.reps[0].gpr(), FPRInfo::fpRegT0);
-            jit.convertInt32ToDouble(params.reps[1].gpr(), FPRInfo::fpRegT1);
+            CHECK(params.reps.size() == 4);
+            CHECK(params.reps[2].isGPR());
+            CHECK(params.reps[3].isGPR());
+            jit.convertInt32ToDouble(params.reps[2].gpr(), FPRInfo::fpRegT0);
+            jit.convertInt32ToDouble(params.reps[3].gpr(), FPRInfo::fpRegT1);
             jit.addDouble(FPRInfo::fpRegT1, FPRInfo::fpRegT0);
             jit.emitFunctionEpilogue();
             jit.ret();
@@ -3366,14 +3477,15 @@
     Value* arg1 = root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR0);
     Value* arg2 = root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR1);
     CheckValue* checkAdd = root->appendNew<CheckValue>(proc, CheckAdd, Origin(), arg1, arg2);
+    checkAdd->appendSomeRegister(arg1);
+    checkAdd->appendSomeRegister(arg2);
     checkAdd->setGenerator(
         [&] (CCallHelpers& jit, const StackmapGenerationParams& params) {
-            CHECK(params.reps.size() == 2);
-            CHECK(params.reps[0].isGPR());
-            CHECK(params.reps[1].isGPR());
-            jit.sub64(params.reps[1].gpr(), params.reps[0].gpr());
-            jit.convertInt64ToDouble(params.reps[0].gpr(), FPRInfo::fpRegT0);
-            jit.convertInt64ToDouble(params.reps[1].gpr(), FPRInfo::fpRegT1);
+            CHECK(params.reps.size() == 4);
+            CHECK(params.reps[2].isGPR());
+            CHECK(params.reps[3].isGPR());
+            jit.convertInt64ToDouble(params.reps[2].gpr(), FPRInfo::fpRegT0);
+            jit.convertInt64ToDouble(params.reps[3].gpr(), FPRInfo::fpRegT1);
             jit.addDouble(FPRInfo::fpRegT1, FPRInfo::fpRegT0);
             jit.emitFunctionEpilogue();
             jit.ret();
@@ -3437,14 +3549,15 @@
         root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR0));
     Value* arg2 = root->appendNew<Const32Value>(proc, Origin(), 42);
     CheckValue* checkSub = root->appendNew<CheckValue>(proc, CheckSub, Origin(), arg1, arg2);
+    checkSub->append(arg1);
+    checkSub->append(arg2);
     checkSub->setGenerator(
         [&] (CCallHelpers& jit, const StackmapGenerationParams& params) {
-            CHECK(params.reps.size() == 2);
-            CHECK(params.reps[0].isGPR());
-            CHECK(params.reps[1].isConstant());
-            CHECK(params.reps[1].value() == 42);
-            jit.add32(CCallHelpers::TrustedImm32(42), params.reps[0].gpr());
-            jit.convertInt32ToDouble(params.reps[0].gpr(), FPRInfo::fpRegT0);
+            CHECK(params.reps.size() == 4);
+            CHECK(params.reps[2].isGPR());
+            CHECK(params.reps[3].isConstant());
+            CHECK(params.reps[3].value() == 42);
+            jit.convertInt32ToDouble(params.reps[2].gpr(), FPRInfo::fpRegT0);
             jit.convertInt32ToDouble(CCallHelpers::TrustedImm32(42), FPRInfo::fpRegT1);
             jit.subDouble(FPRInfo::fpRegT1, FPRInfo::fpRegT0);
             jit.emitFunctionEpilogue();
@@ -3462,6 +3575,42 @@
     CHECK(invoke<double>(*code, -2147483647) == -2147483689.0);
 }
 
+void testCheckSubBadImm()
+{
+    Procedure proc;
+    BasicBlock* root = proc.addBlock();
+    Value* arg1 = root->appendNew<Value>(
+        proc, Trunc, Origin(),
+        root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR0));
+    int32_t badImm = std::numeric_limits<int>::min();
+    Value* arg2 = root->appendNew<Const32Value>(proc, Origin(), badImm);
+    CheckValue* checkSub = root->appendNew<CheckValue>(proc, CheckSub, Origin(), arg1, arg2);
+    checkSub->append(arg1);
+    checkSub->append(arg2);
+    checkSub->setGenerator(
+        [&] (CCallHelpers& jit, const StackmapGenerationParams& params) {
+            CHECK(params.reps.size() == 4);
+            CHECK(params.reps[2].isGPR());
+            CHECK(params.reps[3].isConstant());
+            CHECK(params.reps[3].value() == badImm);
+            jit.convertInt32ToDouble(params.reps[2].gpr(), FPRInfo::fpRegT0);
+            jit.convertInt32ToDouble(CCallHelpers::TrustedImm32(badImm), FPRInfo::fpRegT1);
+            jit.subDouble(FPRInfo::fpRegT1, FPRInfo::fpRegT0);
+            jit.emitFunctionEpilogue();
+            jit.ret();
+        });
+    root->appendNew<ControlValue>(
+        proc, Return, Origin(),
+        root->appendNew<Value>(proc, IToD, Origin(), checkSub));
+
+    auto code = compile(proc);
+
+    CHECK(invoke<double>(*code, 0) == -static_cast<double>(badImm));
+    CHECK(invoke<double>(*code, -1) == -static_cast<double>(badImm) - 1);
+    CHECK(invoke<double>(*code, 1) == -static_cast<double>(badImm) + 1);
+    CHECK(invoke<double>(*code, 42) == -static_cast<double>(badImm) + 42);
+}
+
 void testCheckSub()
 {
     Procedure proc;
@@ -3473,14 +3622,15 @@
         proc, Trunc, Origin(),
         root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR1));
     CheckValue* checkSub = root->appendNew<CheckValue>(proc, CheckSub, Origin(), arg1, arg2);
+    checkSub->append(arg1);
+    checkSub->append(arg2);
     checkSub->setGenerator(
         [&] (CCallHelpers& jit, const StackmapGenerationParams& params) {
-            CHECK(params.reps.size() == 2);
-            CHECK(params.reps[0].isGPR());
-            CHECK(params.reps[1].isGPR());
-            jit.add32(params.reps[1].gpr(), params.reps[0].gpr());
-            jit.convertInt32ToDouble(params.reps[0].gpr(), FPRInfo::fpRegT0);
-            jit.convertInt32ToDouble(params.reps[1].gpr(), FPRInfo::fpRegT1);
+            CHECK(params.reps.size() == 4);
+            CHECK(params.reps[2].isGPR());
+            CHECK(params.reps[3].isGPR());
+            jit.convertInt32ToDouble(params.reps[2].gpr(), FPRInfo::fpRegT0);
+            jit.convertInt32ToDouble(params.reps[3].gpr(), FPRInfo::fpRegT1);
             jit.subDouble(FPRInfo::fpRegT1, FPRInfo::fpRegT0);
             jit.emitFunctionEpilogue();
             jit.ret();
@@ -3509,14 +3659,15 @@
     Value* arg1 = root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR0);
     Value* arg2 = root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR1);
     CheckValue* checkSub = root->appendNew<CheckValue>(proc, CheckSub, Origin(), arg1, arg2);
+    checkSub->append(arg1);
+    checkSub->append(arg2);
     checkSub->setGenerator(
         [&] (CCallHelpers& jit, const StackmapGenerationParams& params) {
-            CHECK(params.reps.size() == 2);
-            CHECK(params.reps[0].isGPR());
-            CHECK(params.reps[1].isGPR());
-            jit.add64(params.reps[1].gpr(), params.reps[0].gpr());
-            jit.convertInt64ToDouble(params.reps[0].gpr(), FPRInfo::fpRegT0);
-            jit.convertInt64ToDouble(params.reps[1].gpr(), FPRInfo::fpRegT1);
+            CHECK(params.reps.size() == 4);
+            CHECK(params.reps[2].isGPR());
+            CHECK(params.reps[3].isGPR());
+            jit.convertInt64ToDouble(params.reps[2].gpr(), FPRInfo::fpRegT0);
+            jit.convertInt64ToDouble(params.reps[3].gpr(), FPRInfo::fpRegT1);
             jit.subDouble(FPRInfo::fpRegT1, FPRInfo::fpRegT0);
             jit.emitFunctionEpilogue();
             jit.ret();
@@ -3580,13 +3731,12 @@
         proc, Trunc, Origin(),
         root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR0));
     CheckValue* checkNeg = root->appendNew<CheckValue>(proc, CheckSub, Origin(), arg1, arg2);
+    checkNeg->append(arg2);
     checkNeg->setGenerator(
         [&] (CCallHelpers& jit, const StackmapGenerationParams& params) {
-            CHECK(params.reps.size() == 2);
-            CHECK(params.reps[0] == ValueRep::constant(0));
-            CHECK(params.reps[1].isGPR());
-            jit.neg32(params.reps[1].gpr());
-            jit.convertInt32ToDouble(params.reps[1].gpr(), FPRInfo::fpRegT1);
+            CHECK(params.reps.size() == 3);
+            CHECK(params.reps[2].isGPR());
+            jit.convertInt32ToDouble(params.reps[2].gpr(), FPRInfo::fpRegT1);
             jit.negateDouble(FPRInfo::fpRegT1, FPRInfo::fpRegT0);
             jit.emitFunctionEpilogue();
             jit.ret();
@@ -3610,13 +3760,12 @@
     Value* arg1 = root->appendNew<Const64Value>(proc, Origin(), 0);
     Value* arg2 = root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR0);
     CheckValue* checkNeg = root->appendNew<CheckValue>(proc, CheckSub, Origin(), arg1, arg2);
+    checkNeg->append(arg2);
     checkNeg->setGenerator(
         [&] (CCallHelpers& jit, const StackmapGenerationParams& params) {
-            CHECK(params.reps.size() == 2);
-            CHECK(params.reps[0] == ValueRep::constant(0));
-            CHECK(params.reps[1].isGPR());
-            jit.neg64(params.reps[1].gpr());
-            jit.convertInt64ToDouble(params.reps[1].gpr(), FPRInfo::fpRegT1);
+            CHECK(params.reps.size() == 3);
+            CHECK(params.reps[2].isGPR());
+            jit.convertInt64ToDouble(params.reps[2].gpr(), FPRInfo::fpRegT1);
             jit.negateDouble(FPRInfo::fpRegT1, FPRInfo::fpRegT0);
             jit.emitFunctionEpilogue();
             jit.ret();
@@ -3644,13 +3793,15 @@
         proc, Trunc, Origin(),
         root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR1));
     CheckValue* checkMul = root->appendNew<CheckValue>(proc, CheckMul, Origin(), arg1, arg2);
+    checkMul->append(arg1);
+    checkMul->append(arg2);
     checkMul->setGenerator(
         [&] (CCallHelpers& jit, const StackmapGenerationParams& params) {
-            CHECK(params.reps.size() == 2);
-            CHECK(params.reps[0].isGPR());
-            CHECK(params.reps[1].isGPR());
-            jit.convertInt32ToDouble(params.reps[0].gpr(), FPRInfo::fpRegT0);
-            jit.convertInt32ToDouble(params.reps[1].gpr(), FPRInfo::fpRegT1);
+            CHECK(params.reps.size() == 4);
+            CHECK(params.reps[2].isGPR());
+            CHECK(params.reps[3].isGPR());
+            jit.convertInt32ToDouble(params.reps[2].gpr(), FPRInfo::fpRegT0);
+            jit.convertInt32ToDouble(params.reps[3].gpr(), FPRInfo::fpRegT1);
             jit.mulDouble(FPRInfo::fpRegT1, FPRInfo::fpRegT0);
             jit.emitFunctionEpilogue();
             jit.ret();
@@ -3667,6 +3818,92 @@
     CHECK(invoke<double>(*code, 2147483647, 42) == 2147483647.0 * 42.0);
 }
 
+void testCheckMulMemory()
+{
+    Procedure proc;
+    BasicBlock* root = proc.addBlock();
+
+    int left;
+    int right;
+    
+    Value* arg1 = root->appendNew<MemoryValue>(
+        proc, Load, Int32, Origin(),
+        root->appendNew<ConstPtrValue>(proc, Origin(), &left));
+    Value* arg2 = root->appendNew<MemoryValue>(
+        proc, Load, Int32, Origin(),
+        root->appendNew<ConstPtrValue>(proc, Origin(), &right));
+    CheckValue* checkMul = root->appendNew<CheckValue>(proc, CheckMul, Origin(), arg1, arg2);
+    checkMul->append(arg1);
+    checkMul->append(arg2);
+    checkMul->setGenerator(
+        [&] (CCallHelpers& jit, const StackmapGenerationParams& params) {
+            CHECK(params.reps.size() == 4);
+            CHECK(params.reps[2].isGPR());
+            CHECK(params.reps[3].isGPR());
+            jit.convertInt32ToDouble(params.reps[2].gpr(), FPRInfo::fpRegT0);
+            jit.convertInt32ToDouble(params.reps[3].gpr(), FPRInfo::fpRegT1);
+            jit.mulDouble(FPRInfo::fpRegT1, FPRInfo::fpRegT0);
+            jit.emitFunctionEpilogue();
+            jit.ret();
+        });
+    root->appendNew<ControlValue>(
+        proc, Return, Origin(),
+        root->appendNew<Value>(proc, IToD, Origin(), checkMul));
+
+    auto code = compile(proc);
+
+    left = 0;
+    right = 42;
+    CHECK(invoke<double>(*code) == 0.0);
+    
+    left = 1;
+    right = 42;
+    CHECK(invoke<double>(*code) == 42.0);
+
+    left = 42;
+    right = 42;
+    CHECK(invoke<double>(*code) == 42.0 * 42.0);
+
+    left = 2147483647;
+    right = 42;
+    CHECK(invoke<double>(*code) == 2147483647.0 * 42.0);
+}
+
+void testCheckMul2()
+{
+    Procedure proc;
+    BasicBlock* root = proc.addBlock();
+    Value* arg1 = root->appendNew<Value>(
+        proc, Trunc, Origin(),
+        root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR0));
+    Value* arg2 = root->appendNew<Const32Value>(proc, Origin(), 2);
+    CheckValue* checkMul = root->appendNew<CheckValue>(proc, CheckMul, Origin(), arg1, arg2);
+    checkMul->append(arg1);
+    checkMul->append(arg2);
+    checkMul->setGenerator(
+        [&] (CCallHelpers& jit, const StackmapGenerationParams& params) {
+            CHECK(params.reps.size() == 4);
+            CHECK(params.reps[2].isGPR());
+            CHECK(params.reps[3].isConstant());
+            CHECK(params.reps[3].value() == 2);
+            jit.convertInt32ToDouble(params.reps[2].gpr(), FPRInfo::fpRegT0);
+            jit.convertInt32ToDouble(CCallHelpers::TrustedImm32(2), FPRInfo::fpRegT1);
+            jit.mulDouble(FPRInfo::fpRegT1, FPRInfo::fpRegT0);
+            jit.emitFunctionEpilogue();
+            jit.ret();
+        });
+    root->appendNew<ControlValue>(
+        proc, Return, Origin(),
+        root->appendNew<Value>(proc, IToD, Origin(), checkMul));
+
+    auto code = compile(proc);
+
+    CHECK(invoke<double>(*code, 0) == 0.0);
+    CHECK(invoke<double>(*code, 1) == 2.0);
+    CHECK(invoke<double>(*code, 42) == 42.0 * 2.0);
+    CHECK(invoke<double>(*code, 2147483647) == 2147483647.0 * 2.0);
+}
+
 void testCheckMul64()
 {
     Procedure proc;
@@ -3674,13 +3911,15 @@
     Value* arg1 = root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR0);
     Value* arg2 = root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR1);
     CheckValue* checkMul = root->appendNew<CheckValue>(proc, CheckMul, Origin(), arg1, arg2);
+    checkMul->append(arg1);
+    checkMul->append(arg2);
     checkMul->setGenerator(
         [&] (CCallHelpers& jit, const StackmapGenerationParams& params) {
-            CHECK(params.reps.size() == 2);
-            CHECK(params.reps[0].isGPR());
-            CHECK(params.reps[1].isGPR());
-            jit.convertInt64ToDouble(params.reps[0].gpr(), FPRInfo::fpRegT0);
-            jit.convertInt64ToDouble(params.reps[1].gpr(), FPRInfo::fpRegT1);
+            CHECK(params.reps.size() == 4);
+            CHECK(params.reps[2].isGPR());
+            CHECK(params.reps[3].isGPR());
+            jit.convertInt64ToDouble(params.reps[2].gpr(), FPRInfo::fpRegT0);
+            jit.convertInt64ToDouble(params.reps[3].gpr(), FPRInfo::fpRegT1);
             jit.mulDouble(FPRInfo::fpRegT1, FPRInfo::fpRegT0);
             jit.emitFunctionEpilogue();
             jit.ret();
@@ -4425,6 +4664,10 @@
     RUN(testAddImmsDouble(negativeZero(), negativeZero()));
 
     RUN(testMulArg(5));
+    RUN(testMulAddArg(5));
+    RUN(testMulAddArg(85));
+    RUN(testMulArgStore(5));
+    RUN(testMulArgStore(85));
     RUN(testMulArgs(1, 1));
     RUN(testMulArgs(1, 2));
     RUN(testMulArgs(3, 3));
@@ -4880,11 +5123,14 @@
     RUN(testCheckLessThan());
     RUN(testCheckMegaCombo());
     RUN(testCheckAddImm());
+    RUN(testCheckAddImmCommute());
+    RUN(testCheckAddImmSomeRegister());
     RUN(testCheckAdd());
     RUN(testCheckAdd64());
     RUN(testCheckAddFold(100, 200));
     RUN(testCheckAddFoldFail(2147483647, 100));
     RUN(testCheckSubImm());
+    RUN(testCheckSubBadImm());
     RUN(testCheckSub());
     RUN(testCheckSub64());
     RUN(testCheckSubFold(100, 200));
@@ -4892,6 +5138,8 @@
     RUN(testCheckNeg());
     RUN(testCheckNeg64());
     RUN(testCheckMul());
+    RUN(testCheckMulMemory());
+    RUN(testCheckMul2());
     RUN(testCheckMul64());
     RUN(testCheckMulFold(100, 200));
     RUN(testCheckMulFoldFail(2147483647, 100));

Modified: trunk/Source/WTF/ChangeLog (192539 => 192540)


--- trunk/Source/WTF/ChangeLog	2015-11-17 22:29:54 UTC (rev 192539)
+++ trunk/Source/WTF/ChangeLog	2015-11-17 22:31:40 UTC (rev 192540)
@@ -1,5 +1,18 @@
 2015-11-17  Filip Pizlo  <fpi...@apple.com>
 
+        CheckAdd/Mul should have commutativity optimizations in B3->Air lowering
+        https://bugs.webkit.org/show_bug.cgi?id=151214
+
+        Reviewed by Geoffrey Garen.
+
+        Disable my failed attempts at perfect forwarding, since they were incorrect, and caused compile
+        errors if you tried to pass an argument that bound to lvalue. This shouldn't affect performance of
+        anything we care about, and performance tests seem to confirm that it's all good.
+
+        * wtf/ScopedLambda.h:
+
+2015-11-17  Filip Pizlo  <fpi...@apple.com>
+
         Air should lay out code optimally
         https://bugs.webkit.org/show_bug.cgi?id=150478
 

Modified: trunk/Source/WTF/wtf/ScopedLambda.h (192539 => 192540)


--- trunk/Source/WTF/wtf/ScopedLambda.h	2015-11-17 22:29:54 UTC (rev 192539)
+++ trunk/Source/WTF/wtf/ScopedLambda.h	2015-11-17 22:31:40 UTC (rev 192540)
@@ -44,19 +44,20 @@
 template<typename ResultType, typename... ArgumentTypes>
 class ScopedLambda<ResultType (ArgumentTypes...)> {
 public:
-    ScopedLambda(ResultType (*impl)(void* arg, ArgumentTypes&&...) = nullptr, void* arg = nullptr)
+    ScopedLambda(ResultType (*impl)(void* arg, ArgumentTypes...) = nullptr, void* arg = nullptr)
         : m_impl(impl)
         , m_arg(arg)
     {
     }
-    
-    ResultType operator()(ArgumentTypes&&... arguments) const
+
+    template<typename... PassedArgumentTypes>
+    ResultType operator()(PassedArgumentTypes&&... arguments) const
     {
-        return m_impl(m_arg, std::forward<ArgumentTypes>(arguments)...);
+        return m_impl(m_arg, std::forward<PassedArgumentTypes>(arguments)...);
     }
 
 private:
-    ResultType (*m_impl)(void* arg, ArgumentTypes&&...);
+    ResultType (*m_impl)(void* arg, ArgumentTypes...);
     void *m_arg;
 };
 
@@ -72,10 +73,9 @@
     }
 
 private:
-    static ResultType implFunction(void* argument, ArgumentTypes&&... arguments)
+    static ResultType implFunction(void* argument, ArgumentTypes... arguments)
     {
-        return static_cast<ScopedLambdaFunctor*>(argument)->m_functor(
-            std::forward<ArgumentTypes>(arguments)...);
+        return static_cast<ScopedLambdaFunctor*>(argument)->m_functor(arguments...);
     }
 
     Functor m_functor;

Modified: trunk/Tools/ReducedFTL/ComplexTest.cpp (192539 => 192540)


--- trunk/Tools/ReducedFTL/ComplexTest.cpp	2015-11-17 22:29:54 UTC (rev 192539)
+++ trunk/Tools/ReducedFTL/ComplexTest.cpp	2015-11-17 22:31:40 UTC (rev 192540)
@@ -102,9 +102,9 @@
     for (unsigned i = 0; i < numConstructs; ++i) {
         if (i & 1) {
             // Control flow diamond.
-            unsigned predicateVarIndex = (i >> 1) % numVars;
-            unsigned thenIncVarIndex = ((i >> 1) + 1) % numVars;
-            unsigned elseIncVarIndex = ((i >> 1) + 2) % numVars;
+            unsigned predicateVarIndex = ((i >> 1) + 2) % numVars;
+            unsigned thenIncVarIndex = ((i >> 1) + 0) % numVars;
+            unsigned elseIncVarIndex = ((i >> 1) + 1) % numVars;
 
             BasicBlock* thenBlock = BasicBlock::Create(context, "", function);
             BasicBlock* elseBlock = BasicBlock::Create(context, "", function);
@@ -138,7 +138,7 @@
             BasicBlock* loopBody = BasicBlock::Create(context, "", function);
             BasicBlock* continuation = BasicBlock::Create(context, "", function);
 
-            Value* startIndex = vars[(i >> 1) % numVars];
+            Value* startIndex = vars[((i >> 1) + 1) % numVars];
             Value* startSum = builder->getInt32(0);
             builder->CreateCondBr(
                 builder->CreateICmpNE(startIndex, builder->getInt32(0)),
@@ -176,6 +176,7 @@
             finalSum->addIncoming(startSum, current);
             finalSum->addIncoming(newBodySum, loopBody);
             current = continuation;
+            vars[((i >> 1) + 0) % numVars] = finalSum;
         }
     }
 
_______________________________________________
webkit-changes mailing list
webkit-changes@lists.webkit.org
https://lists.webkit.org/mailman/listinfo/webkit-changes

Reply via email to