- Revision
- 195877
- Author
- fpi...@apple.com
- Date
- 2016-01-29 18:36:14 -0800 (Fri, 29 Jan 2016)
Log Message
CallLinkStatus should trust BadCell exit sites whenever there is no stub
https://bugs.webkit.org/show_bug.cgi?id=153691
Reviewed by Benjamin Poulain.
This fixes a regression in our treatment of inlining failure exit sites when deciding if we
should inline a call.
A long time ago, a BadCell exit site would ensure that a CallLinkStatus returned
takesSlowPath.
But then we added closure calls. A BadCell exit site might just mean that we should do
closure call inlining. We added a BadExecutable exit site to indicate that even closure call
inlining had failed. BadCell would no longer force CallLinkStatus to return takesSlowPath,
but BadExecutable would stuff do so.
But then we unified the IR for checking functions and executables, and the DFG stopped using
the BadExecutable exit kind. Probably this change disabled our ability to use exit site
counting for deciding when to takesSlowPath. But this isn't necessarily a disaster, since
any time you exited in this way, you'd be taken to a baseline call inline cache, and that
inline cache would record the slow path.
But then we introduced polymorphic call inlining. Polymorphic call inlining means that call
unlinking, like when one of the callees is optimized, will reset the stub. We also made it
so that the stub is like a gate for the slow path count. A clear inline cache must first
cause the creation of a stub and then cause it to overflow before the slow path is counted.
So, if the DFG or FTL exits on a cell check associated with a particular callsite being
speculatively inlined, then it's possible that nobody will know about the exit because:
- The exit kind is BadCell but CallLinkStatus needs BadExecutable to disable inlining.
- Between when we tiered up to the DFG (or FTL) and when the exit happened, one of the
callees was tiered up, causing the baseline CallLinkInfo to be unlinked. Therefore, after
the exit, the inline cache is in a reset state and will not count the call as taking slow
path.
The challenge when dealing with this is that often, we will have an super early compilation
of a minimorphic call site before we have seen all of its small set of callees. For example
we may have seen only one of two possible callees. That early compilation will OSR exit, and
in trunk, will be recompiled with bimorphic speculative inlining. That's a pretty good
outcome. Basically, we are trusting that if during the time that the function has been
running prior to a given compilation, a callsite has only seen a small number of callees,
then it's safe to assume that it won't see another one anytime soon.
So, simply forcing the CallLinkStatus to set takesSlowPath any time there was a BadCell exit
would hurt our performance in some cases, because trunk prior to this change would have their
final compilation use speculative inlining, and this change would make guarded inlining
instead.
The compromise that I came up with relies on the fact that a CallLinkInfo must be reset quite
frequently for it to routinely happen in between tier-up to DFG (or FTL) and an exit. So,
most likely, such a CallLinkInfo will also show up as being clear when the CallLinkStatus
is built during DFG profiling. The CallLinkStatus will then fall back on the CallLinkInfo's
lastSeenCallee field, which is persistent across resets. This change just makes it so that
CallLinkStatus sets takesSlowPath if there is a BadCell exit and the status had to be
inferred from the lastSeenCallee.
This change reduces pointless recompilations in gbemu. It's an 11% speed-up on gbemu. It
doesn't seem to hurt any benchmarks.
* bytecode/CallLinkStatus.cpp:
(JSC::CallLinkStatus::computeFor):
(JSC::CallLinkStatus::computeExitSiteData):
(JSC::CallLinkStatus::computeFromCallLinkInfo):
* bytecode/CallLinkStatus.h:
(JSC::CallLinkStatus::CallLinkStatus):
(JSC::CallLinkStatus::at):
(JSC::CallLinkStatus::operator[]):
(JSC::CallLinkStatus::isProved):
(JSC::CallLinkStatus::isBasedOnStub):
(JSC::CallLinkStatus::canOptimize):
(JSC::CallLinkStatus::maxNumArguments):
(JSC::CallLinkStatus::ExitSiteData::ExitSiteData): Deleted.
Modified Paths
Diff
Modified: trunk/Source/_javascript_Core/ChangeLog (195876 => 195877)
--- trunk/Source/_javascript_Core/ChangeLog 2016-01-30 02:05:17 UTC (rev 195876)
+++ trunk/Source/_javascript_Core/ChangeLog 2016-01-30 02:36:14 UTC (rev 195877)
@@ -1,3 +1,80 @@
+2016-01-29 Filip Pizlo <fpi...@apple.com>
+
+ CallLinkStatus should trust BadCell exit sites whenever there is no stub
+ https://bugs.webkit.org/show_bug.cgi?id=153691
+
+ Reviewed by Benjamin Poulain.
+
+ This fixes a regression in our treatment of inlining failure exit sites when deciding if we
+ should inline a call.
+
+ A long time ago, a BadCell exit site would ensure that a CallLinkStatus returned
+ takesSlowPath.
+
+ But then we added closure calls. A BadCell exit site might just mean that we should do
+ closure call inlining. We added a BadExecutable exit site to indicate that even closure call
+ inlining had failed. BadCell would no longer force CallLinkStatus to return takesSlowPath,
+ but BadExecutable would stuff do so.
+
+ But then we unified the IR for checking functions and executables, and the DFG stopped using
+ the BadExecutable exit kind. Probably this change disabled our ability to use exit site
+ counting for deciding when to takesSlowPath. But this isn't necessarily a disaster, since
+ any time you exited in this way, you'd be taken to a baseline call inline cache, and that
+ inline cache would record the slow path.
+
+ But then we introduced polymorphic call inlining. Polymorphic call inlining means that call
+ unlinking, like when one of the callees is optimized, will reset the stub. We also made it
+ so that the stub is like a gate for the slow path count. A clear inline cache must first
+ cause the creation of a stub and then cause it to overflow before the slow path is counted.
+
+ So, if the DFG or FTL exits on a cell check associated with a particular callsite being
+ speculatively inlined, then it's possible that nobody will know about the exit because:
+
+ - The exit kind is BadCell but CallLinkStatus needs BadExecutable to disable inlining.
+
+ - Between when we tiered up to the DFG (or FTL) and when the exit happened, one of the
+ callees was tiered up, causing the baseline CallLinkInfo to be unlinked. Therefore, after
+ the exit, the inline cache is in a reset state and will not count the call as taking slow
+ path.
+
+ The challenge when dealing with this is that often, we will have an super early compilation
+ of a minimorphic call site before we have seen all of its small set of callees. For example
+ we may have seen only one of two possible callees. That early compilation will OSR exit, and
+ in trunk, will be recompiled with bimorphic speculative inlining. That's a pretty good
+ outcome. Basically, we are trusting that if during the time that the function has been
+ running prior to a given compilation, a callsite has only seen a small number of callees,
+ then it's safe to assume that it won't see another one anytime soon.
+
+ So, simply forcing the CallLinkStatus to set takesSlowPath any time there was a BadCell exit
+ would hurt our performance in some cases, because trunk prior to this change would have their
+ final compilation use speculative inlining, and this change would make guarded inlining
+ instead.
+
+ The compromise that I came up with relies on the fact that a CallLinkInfo must be reset quite
+ frequently for it to routinely happen in between tier-up to DFG (or FTL) and an exit. So,
+ most likely, such a CallLinkInfo will also show up as being clear when the CallLinkStatus
+ is built during DFG profiling. The CallLinkStatus will then fall back on the CallLinkInfo's
+ lastSeenCallee field, which is persistent across resets. This change just makes it so that
+ CallLinkStatus sets takesSlowPath if there is a BadCell exit and the status had to be
+ inferred from the lastSeenCallee.
+
+ This change reduces pointless recompilations in gbemu. It's an 11% speed-up on gbemu. It
+ doesn't seem to hurt any benchmarks.
+
+ * bytecode/CallLinkStatus.cpp:
+ (JSC::CallLinkStatus::computeFor):
+ (JSC::CallLinkStatus::computeExitSiteData):
+ (JSC::CallLinkStatus::computeFromCallLinkInfo):
+ * bytecode/CallLinkStatus.h:
+ (JSC::CallLinkStatus::CallLinkStatus):
+ (JSC::CallLinkStatus::at):
+ (JSC::CallLinkStatus::operator[]):
+ (JSC::CallLinkStatus::isProved):
+ (JSC::CallLinkStatus::isBasedOnStub):
+ (JSC::CallLinkStatus::canOptimize):
+ (JSC::CallLinkStatus::maxNumArguments):
+ (JSC::CallLinkStatus::ExitSiteData::ExitSiteData): Deleted.
+
2016-01-29 Saam barati <sbar...@apple.com>
Pack FunctionExecutable and UnlinkedFunctionExecutable harder
Modified: trunk/Source/_javascript_Core/bytecode/CallLinkStatus.cpp (195876 => 195877)
--- trunk/Source/_javascript_Core/bytecode/CallLinkStatus.cpp 2016-01-30 02:05:17 UTC (rev 195876)
+++ trunk/Source/_javascript_Core/bytecode/CallLinkStatus.cpp 2016-01-30 02:36:14 UTC (rev 195877)
@@ -1,5 +1,5 @@
/*
- * Copyright (C) 2012-2015 Apple Inc. All rights reserved.
+ * Copyright (C) 2012-2016 Apple Inc. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
@@ -90,7 +90,7 @@
CallLinkInfo* callLinkInfo = map.get(CodeOrigin(bytecodeIndex));
if (!callLinkInfo) {
- if (exitSiteData.m_takesSlowPath)
+ if (exitSiteData.takesSlowPath)
return takesSlowPath();
return computeFromLLInt(locker, profiledBlock, bytecodeIndex);
}
@@ -107,10 +107,10 @@
ExitSiteData exitSiteData;
#if ENABLE(DFG_JIT)
- exitSiteData.m_takesSlowPath =
+ exitSiteData.takesSlowPath =
profiledBlock->hasExitSite(locker, DFG::FrequentExitSite(bytecodeIndex, BadType))
|| profiledBlock->hasExitSite(locker, DFG::FrequentExitSite(bytecodeIndex, BadExecutable));
- exitSiteData.m_badFunction =
+ exitSiteData.badFunction =
profiledBlock->hasExitSite(locker, DFG::FrequentExitSite(bytecodeIndex, BadCell));
#else
UNUSED_PARAM(locker);
@@ -208,6 +208,7 @@
CallLinkStatus result;
result.m_variants = variants;
result.m_couldTakeSlowPath = !!totalCallsToUnknown;
+ result.m_isBasedOnStub = true;
return result;
}
@@ -230,9 +231,18 @@
ExitSiteData exitSiteData)
{
CallLinkStatus result = computeFor(locker, profiledBlock, callLinkInfo);
- if (exitSiteData.m_badFunction)
- result.makeClosureCall();
- if (exitSiteData.m_takesSlowPath)
+ if (exitSiteData.badFunction) {
+ if (result.isBasedOnStub()) {
+ // If we have a polymorphic stub, then having an exit site is not quite so useful. In
+ // most cases, the information in the stub has higher fidelity.
+ result.makeClosureCall();
+ } else {
+ // We might not have a polymorphic stub for any number of reasons. When this happens, we
+ // are in less certain territory, so exit sites mean a lot.
+ result.m_couldTakeSlowPath = true;
+ }
+ }
+ if (exitSiteData.takesSlowPath)
result.m_couldTakeSlowPath = true;
return result;
Modified: trunk/Source/_javascript_Core/bytecode/CallLinkStatus.h (195876 => 195877)
--- trunk/Source/_javascript_Core/bytecode/CallLinkStatus.h 2016-01-30 02:05:17 UTC (rev 195876)
+++ trunk/Source/_javascript_Core/bytecode/CallLinkStatus.h 2016-01-30 02:36:14 UTC (rev 195877)
@@ -1,5 +1,5 @@
/*
- * Copyright (C) 2012-2015 Apple Inc. All rights reserved.
+ * Copyright (C) 2012-2016 Apple Inc. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
@@ -48,8 +48,6 @@
WTF_MAKE_FAST_ALLOCATED;
public:
CallLinkStatus()
- : m_couldTakeSlowPath(false)
- , m_isProved(false)
{
}
@@ -64,8 +62,6 @@
CallLinkStatus(CallVariant variant)
: m_variants(1, variant)
- , m_couldTakeSlowPath(false)
- , m_isProved(false)
{
}
@@ -73,14 +69,8 @@
CodeBlock*, unsigned bytecodeIndex, const CallLinkInfoMap&);
struct ExitSiteData {
- ExitSiteData()
- : m_takesSlowPath(false)
- , m_badFunction(false)
- {
- }
-
- bool m_takesSlowPath;
- bool m_badFunction;
+ bool takesSlowPath { false };
+ bool badFunction { false };
};
static ExitSiteData computeExitSiteData(const ConcurrentJITLocker&, CodeBlock*, unsigned bytecodeIndex);
@@ -116,8 +106,9 @@
CallVariant at(unsigned i) const { return m_variants[i]; }
CallVariant operator[](unsigned i) const { return at(i); }
bool isProved() const { return m_isProved; }
+ bool isBasedOnStub() const { return m_isBasedOnStub; }
bool canOptimize() const { return !m_variants.isEmpty(); }
-
+
bool isClosureCall() const; // Returns true if any callee is a closure call.
unsigned maxNumArguments() const { return m_maxNumArguments; }
@@ -134,9 +125,10 @@
#endif
CallVariantList m_variants;
- bool m_couldTakeSlowPath;
- bool m_isProved;
- unsigned m_maxNumArguments;
+ bool m_couldTakeSlowPath { false };
+ bool m_isProved { false };
+ bool m_isBasedOnStub { false };
+ unsigned m_maxNumArguments { 0 };
};
} // namespace JSC