Title: [263195] trunk/Source
Revision: 263195
Author: mark....@apple.com
Date: 2020-06-17 18:55:49 -0700 (Wed, 17 Jun 2020)

Log Message

Replace JSC::FreeList linked list with a Bitmap.
https://bugs.webkit.org/show_bug.cgi?id=213071

Reviewed by Filip Pizlo.

Source/_javascript_Core:

Implement an alternative to the linked list FreeList.  This alternative uses
a Bitmap to record which atom in the block is available for allocation.

The intuition here is that allocation using the Bitmap implementation will do:
    2 loads - m_currentRowBitmap, m_currentMarkedBlockRowAddress
    1 store - m_currentRowBitmap

whereas the linked list implementation will do:
    3 loads - m_scrambledHead, m_secret, result->scrambledNext
    1 store - m_scrambledHead

and result->scrambledNext is from a different region of memory and therefore not
in the same cache line.

The downside of the Bitmap implementation is that it uses more instructions.

This change is currently only enabled for x86_64, where it shows about a 0.8%
progression on Speedometer 2.

It appears to be about a 1% regression on ARM64E.  Hence, for now, we keep the
linked list implementation for ARM64 builds.
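
A minimal sketch of the gate (this is the form added to FreeList.h; see the diff
below):

    #if CPU(X86_64)
    #define ENABLE_BITMAP_FREELIST 1
    #else
    #define ENABLE_BITMAP_FREELIST 0
    #endif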

This is how the Bitmap FreeList works:

1. The Bitmap implementation only replaces the linked list implementation.  It
   does not replace the bump allocator.

2. The Bitmap allocator keeps a m_bitmap that is initialized in
   MarkedBlock::Handle::specializedSweep() to have a bit set for each atom
   location that is available for allocation (i.e. is free).  Note that a cell
   is usually allocated using more than 1 atom.  Only the bit corresponding to
   the first atom (in that cell-length range of free atoms) will be set.

   This is consistent with how bits in MarkedBlock::Footer::m_marks and
   MarkedBlock::Footer::m_newlyAllocated are set i.e. only the bit for the first
   atom in the cell can be set.
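
   As a rough sketch (condensed from the MarkedBlock::Handle::specializedSweep()
   changes in the diff below, for the NotEmpty / MarksNotStale / HasNewlyAllocated
   case), the bitmap is built by marking every cell start and then flipping the
   bits of the live cells:

       AtomsBitmap cellLocations;
       cellLocations.setEachNthBit(m_atomsPerCell, 0, m_endAtom); // one bit per cell start

       freeAtoms = footer.m_marks;             // live cells...
       freeAtoms |= footer.m_newlyAllocated;   // ...including newly allocated ones
       freeAtoms ^= cellLocations;             // now a set bit means "this cell is free"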

3. The allocation algorithm thinks of the MarkedBlock as consisting of rows
   of atoms, where the number of atoms in a row equals the number of bits in
   an AtomsBitmap::Word.  On 64-bit CPUs, this would be 64.

   We will start allocating from the last (highest numbered) row down to the
   first (row 0).  As we allocate, we will only update m_currentRowIndex and
   m_currentRowBitmap.  m_bitmap will not be updated.  This is done in order to
   reduce the number of instructions executed during an allocation.

   When m_currentRowIndex points to N, the AtomsBitmap::Word for row N in
   m_bitmap will have been copied into m_currentRowBitmap.  This is the row
   that we will be allocating from until the row is exhausted.

   This is how we know whether an atom is available for allocation or not:
     i. Atoms in any rows above m_currentRowIndex are guaranteed to be
        allocated already (because we allocate downwards), and hence, are not
        available.
    ii. For row m_currentRowIndex, m_currentRowBitmap is the source of truth
        on which atoms in the row are available for allocation.
   iii. For rows below m_currentRowIndex, m_bitmap is the source of truth on
        which atoms are available for allocation.

   When m_currentRowIndex reaches 0, the info in m_bitmap is completely
   obsoleted, and m_currentRowBitmap holds the availability info for row 0.
   When both m_currentRowIndex and m_currentRowBitmap are 0, then we have
   completely exhausted the block and no more atoms are available for
   allocation.
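
   In code form, the three rules above amount to the following check; this is
   essentially the bitmap branch of FreeList::contains() in the diff below:

       unsigned rowIndex = atomNumber / atomsPerRow;
       if (rowIndex > m_currentRowIndex)
           return false;                    // rule i: rows above are fully allocated
       if (rowIndex == m_currentRowIndex) {
           constexpr AtomsBitmap::Word one = 1;
           unsigned atomIndexInRow = atomNumber - rowIndex * atomsPerRow;
           return m_currentRowBitmap & (one << atomIndexInRow); // rule ii
       }
       return m_bitmap.get(atomNumber);     // rule iii: defer to m_bitmap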

4. Allocation happens in 3 paths: fast, middle, slow.

   The fast path checks m_currentRowBitmap.  If it's not 0, then we compute the
   bit number of the lowest set bit in it.  That bit number will be used together
   with m_currentMarkedBlockRowAddress to compute the address of the atom
   location available for allocation.  m_currentRowBitmap will be updated to clear
   the bit for the atom that has just been allocated.

   If m_currentRowBitmap is 0, then we'll go to the middle path.

   The middle path checks m_currentRowIndex to see if we have more rows to allocate
   from.  For each m_currentRowIndex, we check its corresponding AtomsBitmap::Word
   in m_bitmap.  If the word is non-zero, we copy it to m_currentRowBitmap and
   jump to the fast path to do the allocation.  The middle path will update
   m_currentRowIndex to point to the current row we're allocating from.

   If we have decremented m_currentRowIndex down to 0 but still can't find a
   non-zero AtomsBitmap::Word in m_bitmap, then the block has been exhausted, and
   we'll go to the slow path.

   The slow path is analogous to the old slow path i.e. we try to refill the
   LocalAllocator with a new MarkedBlock.
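
   Putting the three paths together, the C++ allocator (FreeList::allocate() in
   FreeListInlines.h, shown in the diff below) has roughly this shape:

       AtomsBitmap::Word rowBitmap = m_currentRowBitmap;
       do {
           if (rowBitmap) {                          // fast path
               constexpr AtomsBitmap::Word one = 1;
               unsigned atomIndexInRow = ctz(rowBitmap);
               auto* cell = bitwise_cast<HeapCell*>(&m_currentMarkedBlockRowAddress[atomIndexInRow]);
               rowBitmap &= ~(one << atomIndexInRow);
               m_currentRowBitmap = rowBitmap;
               return cell;
           }
           unsigned rowIndex = m_currentRowIndex;    // middle path
           auto* rowAddress = m_currentMarkedBlockRowAddress;
           while (rowIndex) {
               rowBitmap = bitmapRows()[rowIndex--]; // load before the decrement
               rowAddress -= atomsPerRow;
               if (rowBitmap)
                   break;
           }
           m_currentMarkedBlockRowAddress = rowAddress;
           m_currentRowIndex = rowIndex;
       } while (rowBitmap);
       m_currentRowBitmap = rowBitmap;
       return slowPath();                            // slow path: refill from a new MarkedBlock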

5. On the layout of fields in FreeList (see changes in FreeList.h), we try to
   preserve the positions of the bump allocator fields.  The only change we made
   there is in the location of m_cellSize.  It is now moved up next to m_remaining,
   and m_originalSize is moved down.  This is because m_originalSize is only
   accessed in the slow path, and m_cellSize is accessed in the bump allocation
   path.

   Next, we try to put Bitmap allocation fields where the linked list fields
   would have been.  The one bit of trickiness is that we'll put
   m_currentMarkedBlockRowAddress in a union with m_payloadEnd.  This is because
   m_payloadEnd is only used in the bump allocation path.  If m_remaining is 0,
   then we can reuse this location for m_currentMarkedBlockRowAddress.

   With this, we would have 4 bytes of padding after m_currentRowIndex.  For
   compactness, we put m_originalSize in that space.  For builds that use
   the linked list implementation, m_originalSize will be located below after
   m_cellSize.
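
   For ENABLE(BITMAP_FREELIST) builds, the resulting field order (per the
   FreeList.h diff below) is roughly:

       AtomsBitmap::Word m_currentRowBitmap { 0 };  // where the linked list fields used to be
       unsigned m_currentRowIndex { 0 };
       unsigned m_originalSize { 0 };               // fills the 4-byte padding slot
       union {
           char* m_payloadEnd { nullptr };          // bump allocation only
           Atom* m_currentMarkedBlockRowAddress;    // reused when m_remaining is 0
       };
       unsigned m_remaining { 0 };
       unsigned m_cellSize { 0 };
       AtomsBitmap m_bitmap;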

* ftl/FTLLowerDFGToB3.cpp:
(JSC::FTL::DFG::LowerDFGToB3::allocateHeapCell):
* heap/FreeList.cpp:
(JSC::FreeList::clear):
(JSC::FreeList::initializeAtomsBitmap):
(JSC::FreeList::initializeBump):
(JSC::FreeList::contains const):
(JSC::FreeList::dump const):
* heap/FreeList.h:
(JSC::FreeList::bitmapIsEmpty const):
(JSC::FreeList::allocationWillFail const):
(JSC::FreeList::offsetOfCurrentRowBitmap):
(JSC::FreeList::offsetOfBitmapRows):
(JSC::FreeList::offsetOfCurrentRowIndex):
(JSC::FreeList::offsetOfCurrentMarkedBlockRowAddress):
(JSC::FreeList::offsetOfRemaining):
(JSC::FreeList::atomsBitmap):
(JSC::FreeList::bitmapRows const):
(JSC::FreeList::offsetOfOriginalSize): Deleted.
* heap/FreeListInlines.h:
(JSC::FreeList::allocate):
(JSC::FreeList::forEach const):
* heap/LocalAllocator.cpp:
(JSC::LocalAllocator::isFreeListedCell const):
* heap/MarkedBlock.h:
(JSC::MarkedBlock::Handle::atomAt const):
* heap/MarkedBlockInlines.h:
(JSC::MarkedBlock::Handle::specializedSweep):
* jit/AssemblyHelpers.cpp:
(JSC::AssemblyHelpers::emitAllocateWithNonNullAllocator):
* jit/AssemblyHelpers.h:
(JSC::AssemblyHelpers::emitAllocateWithNonNullAllocator):

Source/WTF:

1. Use countOfBits<> template to compute the number of bits.
2. Introduce log2() and log2Constexpr() utility functions.
3. Make Bitmap<>::forEachSetBit() a little bit more efficient: we don't need to
   keep iterating if the bitmap word is already empty of bits.
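
The forEachSetBit() change (see the wtf/Bitmap.h hunk below) makes the inner
loop terminate on the word itself instead of always scanning all of its bits:

    for (size_t i = 0; i < words; ++i) {
        WordType word = bits[i];
        size_t base = i * wordSize;
        size_t j = 0;
        for (; word; ++j) {   // stops as soon as no set bits remain in this word
            if (word & 1)
                func(base + j);
            word >>= 1;
        }
    }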

* wtf/Bitmap.h:
(WTF::WordType>::forEachSetBit const):
* wtf/MathExtras.h:
(WTF::clzConstexpr):
(WTF::clz):
(WTF::ctzConstexpr):
(WTF::ctz):
(WTF::getMSBSet):
(WTF::getMSBSetConstexpr):
(WTF::log2):
(WTF::log2Constexpr):

Diff

Modified: trunk/Source/_javascript_Core/ChangeLog (263194 => 263195)


--- trunk/Source/_javascript_Core/ChangeLog	2020-06-18 01:45:01 UTC (rev 263194)
+++ trunk/Source/_javascript_Core/ChangeLog	2020-06-18 01:55:49 UTC (rev 263195)
@@ -1,5 +1,151 @@
 2020-06-17  Mark Lam  <mark....@apple.com>
 
+        Replace JSC::FreeList linked list with a Bitmap.
+        https://bugs.webkit.org/show_bug.cgi?id=213071
+
+        Reviewed by Filip Pizlo.
+
+        Implement an alternative to the linked list FreeList.  This alternative uses
+        a Bitmap to record which atom in the block is available for allocation.
+
+        The intuition here is that allocation using the Bitmap implementation will do:
+            2 loads - m_currentRowBitmap, m_currentMarkedBlockRowAddress
+            1 store - m_currentRowBitmap
+
+        whereas the linked list implementation will do:
+            3 loads - m_scrambledHead, m_secret, result->scrambledNext
+            1 store - m_scrambledHead
+
+        and result->scrambledNext is from a different region of memory and therefore not
+        in the same cache line.
+
+        The downside of the Bitmap implementation is that it uses more instructions.
+
+        This change is currently only enabled for x86_64, where it shows about a 0.8%
+        progression on Speedometer 2.
+
+        It appears to be about a 1% regression on ARM64E.  Hence, for now, we keep the
+        linked list implementation for ARM64 builds.
+
+        This is how the Bitmap FreeList works:
+
+        1. The Bitmap implementation only replaces the linked list implementation.  It
+           does not replace the bump allocator.
+
+        2. The Bitmap allocator keeps a m_bitmap that is initialized in
+           MarkedBlock::Handle::specializedSweep() to have a bit set for each atom
+           location that is available for allocation (i.e. is free).  Note that a cell
+           is usually allocated using more than 1 atom.  Only the bit corresponding to
+           the first atom (in that cell-length range of free atoms) will be set.
+
+           This is consistent with how bits in MarkedBlock::Footer::m_marks and
+           MarkedBlock::Footer::m_newlyAllocated are set i.e. only the bit for the first
+           atom in the cell can be set.
+
+        3. The allocation algorithm thinks of the MarkedBlock as consisting of rows
+           of atoms, where the number of atoms in a row equals the number of bits in
+           an AtomsBitmap::Word.  On 64-bit CPUs, this would be 64.
+
+           We will start allocating from the last (highest numbered) row down to the
+           first (row 0).  As we allocate, we will only update m_currentRowIndex and
+           m_currentRowBitmap.  m_bitmap will not be updated.  This is done in order to
+           reduce the number of instructions executed during an allocation.
+
+           When m_currentRowIndex points to N, the AtomsBitmap::Word for row N in
+           m_bitmap will have been copied into m_currentRowBitmap.  This is the row
+           that we will be allocating from until the row is exhausted.
+
+           This is how we know whether an atom is available for allocation or not:
+             i. Atoms in any rows above m_currentRowIndex are guaranteed to be
+                allocated already (because we allocate downwards), and hence, are not
+                available.
+            ii. For row m_currentRowIndex, m_currentRowBitmap is the source of truth
+                on which atoms in the row are available for allocation.
+           iii. For rows below m_currentRowIndex, m_bitmap is the source of truth on
+                which atoms are available for allocation.
+
+           When m_currentRowIndex reaches 0, the info in m_bitmap is completely
+           obsoleted, and m_currentRowBitmap holds the availability info for row 0.
+           When both m_currentRowIndex and m_currentRowBitmap are 0, then we have
+           completely exhausted the block and no more atoms are available for
+           allocation.
+
+        4. Allocation happens in 3 paths: fast, middle, slow.
+
+           The fast path checks m_currentRowBitmap.  If it's not 0, then we compute the
+           bit number of the lowest set bit in it.  That bit number will be used together
+           with m_currentMarkedBlockRowAddress to compute the address of the atom
+           location available for allocation.  m_currentRowBitmap will be updated to clear
+           the bit for the atom that has just been allocated.
+
+           If m_currentRowBitmap is 0, then we'll go to the middle path.
+
+           The middle path checks m_currentRowIndex to see if we have more rows to allocate
+           from.  For each m_currentRowIndex, we check its corresponding AtomsBitmap::Word
+           in m_bitmap.  If the word is non-zero, we copy it to m_currentRowBitmap and
+           jump to the fast path to do the allocation.  The middle path will update
+           m_currentRowIndex to point to the current row we're allocating from.
+
+           If we have decremented m_currentRowIndex down to 0 but still can't find a
+           non-zero AtomsBitmap::Word in m_bitmap, then the block has been exhausted, and
+           we'll go to the slow path.
+
+           The slow path is analogous to the old slow path i.e. we try to refill the
+           LocalAllocator with a new MarkedBlock.
+
+        5. On the layout of fields in FreeList (see changes in FreeList.h), we try to
+           preserve the positions of the bump allocator fields.  The only change we made
+           there is in the location of m_cellSize.  It is now moved up next to m_remaining,
+           and m_originalSize is moved down.  This is because m_originalSize is only
+           accessed in the slow path, and m_cellSize is accessed in the bump allocation
+           path.
+
+           Next, we try to put Bitmap allocation fields where the linked list fields
+           would have been.  The one bit of trickiness is that we'll put
+           m_currentMarkedBlockRowAddress in a union with m_payloadEnd.  This is because
+           m_payloadEnd is only used in the bump allocation path.  If m_remaining is 0,
+           then we can reuse this location for m_currentMarkedBlockRowAddress.
+
+           With this, we would have 4 bytes of padding after m_currentRowIndex.  For
+           compactness, we put m_originalSize in that space.  For builds that use
+           the linked list implementation, m_originalSize will be located below after
+           m_cellSize.
+
+        * ftl/FTLLowerDFGToB3.cpp:
+        (JSC::FTL::DFG::LowerDFGToB3::allocateHeapCell):
+        * heap/FreeList.cpp:
+        (JSC::FreeList::clear):
+        (JSC::FreeList::initializeAtomsBitmap):
+        (JSC::FreeList::initializeBump):
+        (JSC::FreeList::contains const):
+        (JSC::FreeList::dump const):
+        * heap/FreeList.h:
+        (JSC::FreeList::bitmapIsEmpty const):
+        (JSC::FreeList::allocationWillFail const):
+        (JSC::FreeList::offsetOfCurrentRowBitmap):
+        (JSC::FreeList::offsetOfBitmapRows):
+        (JSC::FreeList::offsetOfCurrentRowIndex):
+        (JSC::FreeList::offsetOfCurrentMarkedBlockRowAddress):
+        (JSC::FreeList::offsetOfRemaining):
+        (JSC::FreeList::atomsBitmap):
+        (JSC::FreeList::bitmapRows const):
+        (JSC::FreeList::offsetOfOriginalSize): Deleted.
+        * heap/FreeListInlines.h:
+        (JSC::FreeList::allocate):
+        (JSC::FreeList::forEach const):
+        * heap/LocalAllocator.cpp:
+        (JSC::LocalAllocator::isFreeListedCell const):
+        * heap/MarkedBlock.h:
+        (JSC::MarkedBlock::Handle::atomAt const):
+        * heap/MarkedBlockInlines.h:
+        (JSC::MarkedBlock::Handle::specializedSweep):
+        * jit/AssemblyHelpers.cpp:
+        (JSC::AssemblyHelpers::emitAllocateWithNonNullAllocator):
+        * jit/AssemblyHelpers.h:
+        (JSC::AssemblyHelpers::emitAllocateWithNonNullAllocator):
+
+2020-06-17  Mark Lam  <mark....@apple.com>
+
         StructureIDTable::validate() doesn't work when compiled with GCC.
         https://bugs.webkit.org/show_bug.cgi?id=213302
         <rdar://problem/64452172>

Modified: trunk/Source/_javascript_Core/ftl/FTLLowerDFGToB3.cpp (263194 => 263195)


--- trunk/Source/_javascript_Core/ftl/FTLLowerDFGToB3.cpp	2020-06-18 01:45:01 UTC (rev 263194)
+++ trunk/Source/_javascript_Core/ftl/FTLLowerDFGToB3.cpp	2020-06-18 01:55:49 UTC (rev 263195)
@@ -15079,7 +15079,14 @@
             patchpoint->numGPScratchRegisters++;
         else
             patchpoint->appendSomeRegisterWithClobber(allocator);
-        patchpoint->numGPScratchRegisters++;
+#if ENABLE(BITMAP_FREELIST)
+        constexpr unsigned scratchRegistersNeeded = 3;
+        constexpr unsigned allocatorScratch = 3;
+#else
+        constexpr unsigned scratchRegistersNeeded = 1;
+        constexpr unsigned allocatorScratch = 1;
+#endif
+        patchpoint->numGPScratchRegisters += scratchRegistersNeeded;
         patchpoint->resultConstraints = { ValueRep::SomeEarlyRegister };
         
         m_out.appendSuccessor(usually(continuation));
@@ -15092,7 +15099,7 @@
                 
                 GPRReg allocatorGPR;
                 if (actualAllocator.isConstant())
-                    allocatorGPR = params.gpScratch(1);
+                    allocatorGPR = params.gpScratch(allocatorScratch);
                 else
                     allocatorGPR = params[1].gpr();
                 
@@ -15104,7 +15111,11 @@
                 // all of the compiler tiers.
                 jit.emitAllocateWithNonNullAllocator(
                     params[0].gpr(), actualAllocator, allocatorGPR, params.gpScratch(0),
-                    jumpToSlowPath);
+                    jumpToSlowPath
+#if ENABLE(BITMAP_FREELIST)
+                    , params.gpScratch(1), params.gpScratch(2)
+#endif
+                    );
                 
                 CCallHelpers::Jump jumpToSuccess;
                 if (!params.fallsThroughToSuccessor(0))

Modified: trunk/Source/_javascript_Core/heap/FreeList.cpp (263194 => 263195)


--- trunk/Source/_javascript_Core/heap/FreeList.cpp	2020-06-18 01:45:01 UTC (rev 263194)
+++ trunk/Source/_javascript_Core/heap/FreeList.cpp	2020-06-18 01:55:49 UTC (rev 263195)
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2016-2017 Apple Inc. All rights reserved.
+ * Copyright (C) 2016-2020 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
@@ -39,13 +39,48 @@
 
 void FreeList::clear()
 {
+#if ENABLE(BITMAP_FREELIST)
+    m_currentRowBitmap = 0;
+    m_currentRowIndex = 0;
+#else
     m_scrambledHead = 0;
     m_secret = 0;
+#endif
     m_payloadEnd = nullptr;
     m_remaining = 0;
     m_originalSize = 0;
 }
 
+#if ENABLE(BITMAP_FREELIST)
+
+void FreeList::initializeAtomsBitmap(MarkedBlock::Handle* block, AtomsBitmap& freeAtoms, unsigned bytes)
+{
+#if ASSERT_ENABLED
+    m_markedBlock = block;
+#endif
+    ASSERT_UNUSED(freeAtoms, &freeAtoms == &m_bitmap);
+    // m_bitmap has already been filled in by MarkedBlock::Handle::specializedSweep().
+
+    m_currentRowBitmap = 0;
+    size_t rowIndex = AtomsBitmap::numberOfWords;
+    while (rowIndex--) {
+        auto rowBitmap = m_bitmap.wordAt(rowIndex);
+        if (rowBitmap) {
+            m_currentRowBitmap = rowBitmap;
+            break;
+        }
+    }
+    ASSERT(m_currentRowBitmap || m_bitmap.isEmpty());
+    m_currentRowIndex = m_currentRowBitmap ? rowIndex : 0;
+
+    size_t firstAtomInRow = m_currentRowIndex * atomsPerRow;
+    m_currentMarkedBlockRowAddress = bitwise_cast<Atom*>(block->atomAt(firstAtomInRow));
+    m_originalSize = bytes;
+}
+
+#else
+// Linked List implementation.
+
 void FreeList::initializeList(FreeCell* head, uintptr_t secret, unsigned bytes)
 {
     // It's *slightly* more optimal to use a scrambled head. It saves a register on the fast path.
@@ -56,16 +91,23 @@
     m_originalSize = bytes;
 }
 
+#endif // ENABLE(BITMAP_FREELIST)
+
 void FreeList::initializeBump(char* payloadEnd, unsigned remaining)
 {
+#if ENABLE(BITMAP_FREELIST)
+    m_currentRowBitmap = 0;
+    m_currentRowIndex = 0;
+#else
     m_scrambledHead = 0;
     m_secret = 0;
+#endif
     m_payloadEnd = payloadEnd;
     m_remaining = remaining;
     m_originalSize = remaining;
 }
 
-bool FreeList::contains(HeapCell* target) const
+bool FreeList::contains(HeapCell* target, MarkedBlock::Handle* currentBlock) const
 {
     if (m_remaining) {
         const void* start = (m_payloadEnd - m_remaining);
@@ -73,6 +115,31 @@
         return (start <= target) && (target < end);
     }
 
+#if ENABLE(BITMAP_FREELIST)
+    if (bitmapIsEmpty())
+        return false;
+
+    // currentBlock may be null if the allocator has been reset (and therefore,
+    // the FreeList cleared). Hence, we should only check this assertion after
+    // we check if the FreeList bitmap is empty above.
+    ASSERT(m_markedBlock == currentBlock);
+    if (!currentBlock->contains(target))
+        return false;
+
+    unsigned atomNumber = currentBlock->block().atomNumber(target);
+    unsigned rowIndex = atomNumber / atomsPerRow;
+    if (rowIndex > m_currentRowIndex)
+        return false;
+    if (rowIndex == m_currentRowIndex) {
+        constexpr AtomsBitmap::Word one = 1;
+        unsigned firstAtomInRow = rowIndex * atomsPerRow;
+        unsigned atomIndexInRow = atomNumber - firstAtomInRow;
+        return m_currentRowBitmap & (one << atomIndexInRow);
+    }
+    return m_bitmap.get(atomNumber);
+
+#else
+    UNUSED_PARAM(currentBlock);
     FreeCell* candidate = head();
     while (candidate) {
         if (bitwise_cast<HeapCell*>(candidate) == target)
@@ -79,13 +146,20 @@
             return true;
         candidate = candidate->next(m_secret);
     }
-
     return false;
+#endif
 }
 
 void FreeList::dump(PrintStream& out) const
 {
+#if ENABLE(BITMAP_FREELIST)
+    if (m_remaining)
+        out.print("{payloadEnd = ", RawPointer(m_payloadEnd), ", remaining = ", m_remaining, ", originalSize = ", m_originalSize, "}");
+    else
+        out.print("{currentRowBitmap = ", m_currentRowBitmap, ", currentRowIndex = ", m_currentRowIndex, ", originalSize = ", m_originalSize, "}");
+#else
     out.print("{head = ", RawPointer(head()), ", secret = ", m_secret, ", payloadEnd = ", RawPointer(m_payloadEnd), ", remaining = ", m_remaining, ", originalSize = ", m_originalSize, "}");
+#endif
 }
 
 } // namespace JSC

Modified: trunk/Source/_javascript_Core/heap/FreeList.h (263194 => 263195)


--- trunk/Source/_javascript_Core/heap/FreeList.h	2020-06-18 01:45:01 UTC (rev 263194)
+++ trunk/Source/_javascript_Core/heap/FreeList.h	2020-06-18 01:55:49 UTC (rev 263195)
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2016-2019 Apple Inc. All rights reserved.
+ * Copyright (C) 2016-2020 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
@@ -25,13 +25,22 @@
 
 #pragma once
 
+#include "MarkedBlock.h"
 #include <wtf/Noncopyable.h>
 #include <wtf/PrintStream.h>
+#include <wtf/StdIntExtras.h>
 
 namespace JSC {
 
 class HeapCell;
 
+#if CPU(X86_64)
+#define ENABLE_BITMAP_FREELIST 1
+#else
+#define ENABLE_BITMAP_FREELIST 0
+#endif
+
+#if !ENABLE(BITMAP_FREELIST)
 struct FreeCell {
     static uintptr_t scramble(FreeCell* cell, uintptr_t secret)
     {
@@ -58,6 +67,7 @@
     uint64_t preservedBitsForCrashAnalysis;
     uintptr_t scrambledNext;
 };
+#endif
 
 class FreeList {
 public:
@@ -66,16 +76,14 @@
     
     void clear();
     
-    JS_EXPORT_PRIVATE void initializeList(FreeCell* head, uintptr_t secret, unsigned bytes);
     JS_EXPORT_PRIVATE void initializeBump(char* payloadEnd, unsigned remaining);
     
-    bool allocationWillFail() const { return !head() && !m_remaining; }
     bool allocationWillSucceed() const { return !allocationWillFail(); }
     
     template<typename Func>
     HeapCell* allocate(const Func& slowPath);
     
-    bool contains(HeapCell*) const;
+    bool contains(HeapCell*, MarkedBlock::Handle* currentBlock) const;
     
     template<typename Func>
     void forEach(const Func&) const;
@@ -82,11 +90,47 @@
     
     unsigned originalSize() const { return m_originalSize; }
 
+#if ENABLE(BITMAP_FREELIST)
+    using Atom = MarkedBlock::Atom;
+    using AtomsBitmap = MarkedBlock::AtomsBitmap;
+
+    static constexpr size_t atomsPerRow = AtomsBitmap::bitsInWord;
+    static constexpr size_t atomsRowBytes = atomsPerRow * sizeof(Atom);
+    static constexpr unsigned atomSizeShift = WTF::log2Constexpr(sizeof(Atom));
+    static_assert((static_cast<size_t>(1) << atomSizeShift) == sizeof(Atom));
+
+    JS_EXPORT_PRIVATE void initializeAtomsBitmap(MarkedBlock::Handle*, AtomsBitmap& freeAtoms, unsigned bytes);
+
+    bool bitmapIsEmpty() const
+    {
+        // Remember, we don't actually clear the bits in m_bitmap as we allocate
+    // the atoms. Instead, m_currentRowBitmap and m_currentRowIndex tell us
+    // if there are atoms still available for allocation. See comment blob below
+        // at the declaration of m_currentRowIndex for more details.
+        return !m_currentRowBitmap && !m_currentRowIndex;
+    }
+    bool allocationWillFail() const { return bitmapIsEmpty() && !m_remaining; }
+
+    static ptrdiff_t offsetOfCurrentRowBitmap() { return OBJECT_OFFSETOF(FreeList, m_currentRowBitmap); }
+
+    // We're deliberately returning the address of 1 word before m_bitmap so that
+    // we can schedule instructions better i.e. to do a load before decrementing the
+    // row index.
+    static ptrdiff_t offsetOfBitmapRows() { return OBJECT_OFFSETOF(FreeList, m_bitmap) - sizeof(AtomsBitmap::Word); }
+
+    static ptrdiff_t offsetOfCurrentRowIndex() { return OBJECT_OFFSETOF(FreeList, m_currentRowIndex); }
+    static ptrdiff_t offsetOfCurrentMarkedBlockRowAddress() { return OBJECT_OFFSETOF(FreeList, m_currentMarkedBlockRowAddress); }
+#else
+    JS_EXPORT_PRIVATE void initializeList(FreeCell* head, uintptr_t secret, unsigned bytes);
+
+    bool allocationWillFail() const { return !head() && !m_remaining; }
+
     static ptrdiff_t offsetOfScrambledHead() { return OBJECT_OFFSETOF(FreeList, m_scrambledHead); }
     static ptrdiff_t offsetOfSecret() { return OBJECT_OFFSETOF(FreeList, m_secret); }
+#endif
+
     static ptrdiff_t offsetOfPayloadEnd() { return OBJECT_OFFSETOF(FreeList, m_payloadEnd); }
     static ptrdiff_t offsetOfRemaining() { return OBJECT_OFFSETOF(FreeList, m_remaining); }
-    static ptrdiff_t offsetOfOriginalSize() { return OBJECT_OFFSETOF(FreeList, m_originalSize); }
     static ptrdiff_t offsetOfCellSize() { return OBJECT_OFFSETOF(FreeList, m_cellSize); }
     
     JS_EXPORT_PRIVATE void dump(PrintStream&) const;
@@ -94,14 +138,74 @@
     unsigned cellSize() const { return m_cellSize; }
     
 private:
+
+#if ENABLE(BITMAP_FREELIST)
+    AtomsBitmap& atomsBitmap() { return m_bitmap; }
+    AtomsBitmap::Word* bitmapRows() const
+    {
+        // See comment about offsetOfBitmapRows().
+        return bitwise_cast<AtomsBitmap::Word*>(&m_bitmap) - 1;
+    }
+
+    // This allocation algorithm thinks of the MarkedBlock as consisting of rows
+    // of atoms, where the number of atoms in a row equals the number of bits in
+    // an AtomsBitmap::Word. On 64-bit CPUs, this would be 64.
+    //
+    // We will start allocating from the last (highest numbered) row down to the
+    // first (row 0). As we allocate, we will only update m_currentRowIndex and
+    // m_currentRowBitmap. m_bitmap will not be updated. This is done in order to
+    // reduce the number of instructions executed during an allocation.
+    //
+    // When m_currentRowIndex points to N, the AtomsBitmap::Word for row N in
+    // m_bitmap will have been copied into m_currentRowBitmap. This is the row
+    // that we will be allocating from until the row is exhausted.
+    //
+    // This is how we know whether an atom is available for allocation or not:
+    // 1. Atoms in any rows above m_currentRowIndex are guaranteed to be
+    //    allocated already (because we allocate downwards), and hence, are not
+    //    available.
+    // 2. For row m_currentRowIndex, m_currentRowBitmap is the source of truth
+    //    on which atoms in the row are available for allocation.
+    // 3. For rows below m_currentRowIndex, m_bitmap is the source of truth on
+    //    which atoms are available for allocation.
+    //
+    // When m_currentRowIndex reaches 0, the info in m_bitmap is completely
+    // obsoleted, and m_currentRowBitmap holds the availability info for row 0.
+    // When both m_currentRowIndex and m_currentRowBitmap are 0, then we have
+    // completely exhausted the block and no more atoms are available for
+    // allocation.
+
+    AtomsBitmap::Word m_currentRowBitmap { 0 };
+    unsigned m_currentRowIndex { 0 };
+    unsigned m_originalSize { 0 };
+
+#else
     FreeCell* head() const { return FreeCell::descramble(m_scrambledHead, m_secret); }
-    
+
     uintptr_t m_scrambledHead { 0 };
     uintptr_t m_secret { 0 };
-    char* m_payloadEnd { nullptr };
+#endif
+
+    union {
+        char* m_payloadEnd { nullptr };
+#if ENABLE(BITMAP_FREELIST)
+        Atom* m_currentMarkedBlockRowAddress;
+#endif
+    };
     unsigned m_remaining { 0 };
+    unsigned m_cellSize { 0 };
+
+#if ENABLE(BITMAP_FREELIST)
+    AtomsBitmap m_bitmap;
+#else
     unsigned m_originalSize { 0 };
-    unsigned m_cellSize { 0 };
+#endif
+
+#if ASSERT_ENABLED
+    MarkedBlock::Handle* m_markedBlock { nullptr };
+#endif
+
+    friend class MarkedBlock;
 };
 
 } // namespace JSC

Modified: trunk/Source/_javascript_Core/heap/FreeListInlines.h (263194 => 263195)


--- trunk/Source/_javascript_Core/heap/FreeListInlines.h	2020-06-18 01:45:01 UTC (rev 263194)
+++ trunk/Source/_javascript_Core/heap/FreeListInlines.h	2020-06-18 01:55:49 UTC (rev 263195)
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2017 Apple Inc. All rights reserved.
+ * Copyright (C) 2017-2020 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
@@ -26,7 +26,7 @@
 #pragma once
 
 #include "FreeList.h"
-#include "MarkedBlock.h"
+#include <wtf/MathExtras.h>
 
 namespace JSC {
 
@@ -40,7 +40,39 @@
         m_remaining = remaining;
         return bitwise_cast<HeapCell*>(m_payloadEnd - remaining - cellSize);
     }
-    
+
+#if ENABLE(BITMAP_FREELIST)
+    AtomsBitmap::Word rowBitmap = m_currentRowBitmap;
+    do {
+        if (rowBitmap) {
+            constexpr AtomsBitmap::Word one = 1;
+            unsigned atomIndexInRow = ctz(rowBitmap);
+            auto* cell = bitwise_cast<HeapCell*>(&m_currentMarkedBlockRowAddress[atomIndexInRow]);
+            rowBitmap &= ~(one << atomIndexInRow);
+            m_currentRowBitmap = rowBitmap;
+            return cell;
+        }
+
+        unsigned rowIndex = m_currentRowIndex;
+        auto* rowAddress = m_currentMarkedBlockRowAddress;
+        while (rowIndex) {
+            // We load before decrementing rowIndex because bitmapRows() points
+            // to 1 word before m_bitmap. See comments about offsetOfBitmapRows()
+            // for why we do this.
+            rowBitmap = bitmapRows()[rowIndex--];
+            rowAddress -= atomsPerRow;
+            if (rowBitmap)
+                break;
+        }
+        m_currentMarkedBlockRowAddress = rowAddress;
+        m_currentRowIndex = rowIndex;
+    } while (rowBitmap);
+
+    m_currentRowBitmap = rowBitmap;
+    ASSERT(bitmapIsEmpty());
+    return slowPath();
+
+#else // !ENABLE(BITMAP_FREELIST)
     FreeCell* result = head();
     if (UNLIKELY(!result))
         return slowPath();
@@ -47,6 +79,7 @@
     
     m_scrambledHead = result->scrambledNext;
     return bitwise_cast<HeapCell*>(result);
+#endif // !ENABLE(BITMAP_FREELIST)
 }
 
 template<typename Func>
@@ -56,6 +89,33 @@
         for (unsigned remaining = m_remaining; remaining; remaining -= m_cellSize)
             func(bitwise_cast<HeapCell*>(m_payloadEnd - remaining));
     } else {
+#if ENABLE(BITMAP_FREELIST)
+        if (bitmapIsEmpty())
+            return;
+
+        AtomsBitmap::Word rowBitmap = m_currentRowBitmap;
+        unsigned rowIndex = m_currentRowIndex;
+        Atom* currentMarkedBlockRowAddress = m_currentMarkedBlockRowAddress;
+        do {
+            while (rowBitmap) {
+                constexpr AtomsBitmap::Word one = 1;
+                unsigned atomIndexInRow = ctz(rowBitmap);
+                auto* cell = bitwise_cast<HeapCell*>(&currentMarkedBlockRowAddress[atomIndexInRow]);
+                rowBitmap &= ~(one << atomIndexInRow);
+                func(cell);
+            }
+
+            while (rowIndex) {
+                // We load before decrementing rowIndex because bitmapRows() points
+                // to 1 word before m_bitmap. See comments about offsetOfBitmapRows()
+                // for why we do this.
+                rowBitmap = bitmapRows()[rowIndex--];
+                currentMarkedBlockRowAddress -= atomsPerRow;
+                if (rowBitmap)
+                    break;
+            }
+        } while (rowBitmap);
+#else
         for (FreeCell* cell = head(); cell;) {
             // We can use this to overwrite free objects before destroying the free list. So, we need
             // to get next before proceeding further.
@@ -63,8 +123,8 @@
             func(bitwise_cast<HeapCell*>(cell));
             cell = next;
         }
+#endif
     }
 }
 
 } // namespace JSC
-

Modified: trunk/Source/_javascript_Core/heap/LocalAllocator.cpp (263194 => 263195)


--- trunk/Source/_javascript_Core/heap/LocalAllocator.cpp	2020-06-18 01:45:01 UTC (rev 263194)
+++ trunk/Source/_javascript_Core/heap/LocalAllocator.cpp	2020-06-18 01:55:49 UTC (rev 263195)
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2018-2019 Apple Inc. All rights reserved.
+ * Copyright (C) 2018-2020 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
@@ -272,7 +272,7 @@
     // if we know that the block owning the object is free-listed, then it's impossible for any
     // objects to be in the dead-but-not-destructed state.
     // FIXME: Get rid of this abomination. https://bugs.webkit.org/show_bug.cgi?id=181655
-    return m_freeList.contains(bitwise_cast<HeapCell*>(target));
+    return m_freeList.contains(bitwise_cast<HeapCell*>(target), m_currentBlock);
 }
 
 } // namespace JSC

Modified: trunk/Source/_javascript_Core/heap/MarkedBlock.h (263194 => 263195)


--- trunk/Source/_javascript_Core/heap/MarkedBlock.h	2020-06-18 01:45:01 UTC (rev 263194)
+++ trunk/Source/_javascript_Core/heap/MarkedBlock.h	2020-06-18 01:55:49 UTC (rev 263195)
@@ -1,7 +1,7 @@
 /*
  *  Copyright (C) 1999-2000 Harri Porten (por...@kde.org)
  *  Copyright (C) 2001 Peter Kelly (p...@post.com)
- *  Copyright (C) 2003-2019 Apple Inc. All rights reserved.
+ *  Copyright (C) 2003-2020 Apple Inc. All rights reserved.
  *
  *  This library is free software; you can redistribute it and/or
  *  modify it under the terms of the GNU Lesser General Public
@@ -84,6 +84,8 @@
     static_assert(!(MarkedBlock::atomSize & (MarkedBlock::atomSize - 1)), "MarkedBlock::atomSize must be a power of two.");
     static_assert(!(MarkedBlock::blockSize & (MarkedBlock::blockSize - 1)), "MarkedBlock::blockSize must be a power of two.");
     
+    using AtomsBitmap = Bitmap<atomsPerBlock>;
+
     struct VoidFunctor {
         typedef void ReturnType;
         void returnValue() { }
@@ -203,6 +205,7 @@
         
         void* start() const { return &m_block->atoms()[0]; }
         void* end() const { return &m_block->atoms()[m_endAtom]; }
+        void* atomAt(size_t i) const { return &m_block->atoms()[i]; }
         bool contains(void* p) const { return start() <= p && p < end(); }
 
         void dumpState(PrintStream&);
@@ -294,8 +297,8 @@
         HeapVersion m_markingVersion;
         HeapVersion m_newlyAllocatedVersion;
 
-        Bitmap<atomsPerBlock> m_marks;
-        Bitmap<atomsPerBlock> m_newlyAllocated;
+        AtomsBitmap m_marks;
+        AtomsBitmap m_newlyAllocated;
     };
     
 private:    
@@ -336,7 +339,7 @@
     bool isNewlyAllocated(const void*);
     void setNewlyAllocated(const void*);
     void clearNewlyAllocated(const void*);
-    const Bitmap<atomsPerBlock>& newlyAllocated() const;
+    const AtomsBitmap& newlyAllocated() const;
     
     HeapVersion newlyAllocatedVersion() const { return footer().m_newlyAllocatedVersion; }
     
@@ -374,7 +377,7 @@
     bool isMarkedRaw(const void* p);
     HeapVersion markingVersion() const { return footer().m_markingVersion; }
     
-    const Bitmap<atomsPerBlock>& marks() const;
+    const AtomsBitmap& marks() const;
     
     CountingLock& lock() { return footer().m_lock; }
     
@@ -399,6 +402,8 @@
     
     inline bool marksConveyLivenessDuringMarking(HeapVersion markingVersion);
     inline bool marksConveyLivenessDuringMarking(HeapVersion myMarkingVersion, HeapVersion markingVersion);
+
+    friend class FreeList;
 };
 
 inline MarkedBlock::Footer& MarkedBlock::footer()

Modified: trunk/Source/_javascript_Core/heap/MarkedBlockInlines.h (263194 => 263195)


--- trunk/Source/_javascript_Core/heap/MarkedBlockInlines.h	2020-06-18 01:45:01 UTC (rev 263194)
+++ trunk/Source/_javascript_Core/heap/MarkedBlockInlines.h	2020-06-18 01:55:49 UTC (rev 263195)
@@ -303,6 +303,66 @@
         return;
     }
 
+#if ENABLE(BITMAP_FREELIST)
+    AtomsBitmap localFreeAtoms;
+    AtomsBitmap& freeAtoms = freeList ? freeList->atomsBitmap() : localFreeAtoms;
+    size_t count = 0;
+
+    auto handleDeadCell = [&] (size_t i) {
+        HeapCell* cell = reinterpret_cast_ptr<HeapCell*>(atomAt(i));
+
+        if (destructionMode != BlockHasNoDestructors)
+            destroy(cell);
+
+        if (sweepMode == SweepToFreeList) {
+            if (scribbleMode == Scribble)
+                scribble(cell, cellSize);
+            ++count;
+        }
+    };
+
+    bool isEmpty = true;
+
+    AtomsBitmap cellLocations;
+    cellLocations.setEachNthBit(m_atomsPerCell, 0, m_endAtom);
+
+    if (emptyMode == NotEmpty) {
+        if (marksMode == MarksNotStale) {
+            freeAtoms = footer.m_marks;
+            if (newlyAllocatedMode == HasNewlyAllocated)
+                freeAtoms |= footer.m_newlyAllocated;
+        } else if (newlyAllocatedMode == HasNewlyAllocated)
+            freeAtoms = footer.m_newlyAllocated;
+        // At this point, a set bit in freeAtoms represents live cells.
+        isEmpty = freeAtoms.isEmpty();
+
+        // Invert the bits at each cell location so that the ones for live cells
+        // are cleared, and the ones for dead cells are set.
+        freeAtoms ^= cellLocations;
+    } else
+        freeAtoms = cellLocations; // all cells are free.
+    
+    // We only want to discard the newlyAllocated bits if we're creating a FreeList,
+    // otherwise we would lose information on what's currently alive.
+    if (sweepMode == SweepToFreeList && newlyAllocatedMode == HasNewlyAllocated)
+        footer.m_newlyAllocatedVersion = MarkedSpace::nullVersion;
+
+    if (space()->isMarking())
+        footer.m_lock.unlock();
+
+    if (destructionMode != BlockHasNoDestructors)
+        freeAtoms.forEachSetBit(handleDeadCell);
+
+    if (sweepMode == SweepToFreeList) {
+        freeList->initializeAtomsBitmap(this, freeAtoms, count * cellSize);
+        setIsFreeListed();
+    } else if (isEmpty)
+        m_directory->setIsEmpty(NoLockingNecessary, this, true);
+    if (false)
+        dataLog("Slowly swept block ", RawPointer(&block), " with cell size ", cellSize, " and attributes ", m_attributes, ": ", pointerDump(freeList), "\n");
+
+#else // not ENABLE(BITMAP_FREELIST)
+
     // This produces a free list that is ordered in reverse through the block.
     // This is fine, since the allocation code makes no assumptions about the
     // order of the free list.
@@ -361,6 +421,7 @@
         m_directory->setIsEmpty(NoLockingNecessary, this, true);
     if (false)
         dataLog("Slowly swept block ", RawPointer(&block), " with cell size ", cellSize, " and attributes ", m_attributes, ": ", pointerDump(freeList), "\n");
+#endif // ENABLE(BITMAP_FREELIST)
 }
 
 template<typename DestroyFunc>

Modified: trunk/Source/_javascript_Core/jit/AssemblyHelpers.cpp (263194 => 263195)


--- trunk/Source/_javascript_Core/jit/AssemblyHelpers.cpp	2020-06-18 01:45:01 UTC (rev 263194)
+++ trunk/Source/_javascript_Core/jit/AssemblyHelpers.cpp	2020-06-18 01:55:49 UTC (rev 263195)
@@ -503,7 +503,7 @@
 }
 #endif
 
-void AssemblyHelpers::emitAllocateWithNonNullAllocator(GPRReg resultGPR, const JITAllocator& allocator, GPRReg allocatorGPR, GPRReg scratchGPR, JumpList& slowPath)
+void AssemblyHelpers::emitAllocateWithNonNullAllocator(GPRReg resultGPR, const JITAllocator& allocator, GPRReg allocatorGPR, GPRReg scratchGPR, JumpList& slowPath, Optional<GPRReg> optionalScratchGPR2, Optional<GPRReg> optionalScratchGPR3)
 {
     if (Options::forceGCSlowPaths()) {
         slowPath.append(jump());
@@ -516,7 +516,7 @@
     // - We *can* use RegisterSet::macroScratchRegisters on ARM.
 
     Jump popPath;
-    Jump done;
+    JumpList done;
     
     if (allocator.isConstant())
         move(TrustedImmPtr(allocator.allocator().localAllocator()), allocatorGPR);
@@ -534,10 +534,124 @@
     Address payloadEndAddr = Address(allocatorGPR, LocalAllocator::offsetOfFreeList() + FreeList::offsetOfPayloadEnd());
     addPtr(payloadEndAddr, resultGPR);
 
-    done = jump();
+    done.append(jump());
         
+#if ENABLE(BITMAP_FREELIST)
+    ASSERT(resultGPR != scratchGPR);
+
+    auto rowIndexGPR = resultGPR;
+    auto rowBitmapGPR = scratchGPR;
+
+    GPRReg scratchGPR2 = optionalScratchGPR2 ? optionalScratchGPR2.value() : scratchRegister();
+    ASSERT(scratchGPR2 != resultGPR);
+    ASSERT(scratchGPR2 != scratchGPR);
+
+    auto rowAddressGPR = scratchGPR2;
+    auto clearBit64ScratchGPR = scratchGPR2;
+
+    bool canPreloadRowAddressGPR = false;
+    if (optionalScratchGPR3) {
+        clearBit64ScratchGPR = optionalScratchGPR3.value();
+        canPreloadRowAddressGPR = true;
+    } else if (isX86_64()) {
+        // x86_64's clearBit64() doesn't actually need to use clearBit64ScratchGPR.
+        // So, we can preload the row address into it.
+        clearBit64ScratchGPR = InvalidGPRReg;
+        canPreloadRowAddressGPR = true;
+#if CPU(ARM64)
+    } else if (isARM64()) {
+        // ARM64's fast path does actually need to use the memoryTempRegister.
+        // So, we can use that for the clearBit64ScratchGPR and allow the
+        // row address to be preloaded in scratchGPR2.
+        clearBit64ScratchGPR = getCachedMemoryTempRegisterIDAndInvalidate();
+        canPreloadRowAddressGPR = true;
+#endif
+    }
+    ASSERT(clearBit64ScratchGPR != resultGPR);
+    ASSERT(clearBit64ScratchGPR != scratchGPR);
+    if (canPreloadRowAddressGPR)
+        ASSERT(clearBit64ScratchGPR != scratchGPR2);
+
+    // The code below for rowBitmapGPR relies on this.
+    static_assert(sizeof(FreeList::AtomsBitmap::Word) == sizeof(uint64_t));
+
+    // Check for middle path: have another row to visit?
+    Label checkForMoreRows = label();
+
+    load32(Address(allocatorGPR, LocalAllocator::offsetOfFreeList() + FreeList::offsetOfCurrentRowIndex()), rowIndexGPR);
+
+    if (!canPreloadRowAddressGPR)
+        loadPtr(Address(allocatorGPR, LocalAllocator::offsetOfFreeList() + FreeList::offsetOfCurrentMarkedBlockRowAddress()), rowAddressGPR);
+
+    slowPath.append(branchTestPtr(Zero, rowIndexGPR));
+
+    // Middle path: there is another row left to visit.
+    Jump foundNonEmptyRow;
+    Label checkNextRow = label();
+    {
+        // Load the next row bitmap and point m_currentMarkedBlockRowAddress to the next row.
+
+        // Note: offsetOfBitmapRows() points to 1 word before m_bitmap. We do this
+        // deliberately because it allows us to schedule instructions better and
+        // do this load before the decrement below.
+        load64(BaseIndex(allocatorGPR, rowIndexGPR, TimesEight, LocalAllocator::offsetOfFreeList() + FreeList::offsetOfBitmapRows()), rowBitmapGPR);
+
+        sub64(TrustedImm32(1), rowIndexGPR);
+        subPtr(TrustedImm32(FreeList::atomsRowBytes), rowAddressGPR);
+
+        foundNonEmptyRow = branchTest64(NonZero, rowBitmapGPR);
+        branchTestPtr(NonZero, rowIndexGPR).linkTo(checkNextRow, this);
+    }
+
+    // Slow path: no more rows.
+    // Both rowIndexGPR and rowBitmapGPR should be null here.
+    store32(rowIndexGPR, Address(allocatorGPR, LocalAllocator::offsetOfFreeList() + FreeList::offsetOfCurrentRowIndex()));
+    store64(rowBitmapGPR, Address(allocatorGPR, LocalAllocator::offsetOfFreeList() + FreeList::offsetOfCurrentRowBitmap()));
+    slowPath.append(jump());
+
+    // Transition from middle path back to fast path to allocate.
+    foundNonEmptyRow.link(this);
+    storePtr(rowAddressGPR, Address(allocatorGPR, LocalAllocator::offsetOfFreeList() + FreeList::offsetOfCurrentMarkedBlockRowAddress()));
+    store32(rowIndexGPR, Address(allocatorGPR, LocalAllocator::offsetOfFreeList() + FreeList::offsetOfCurrentRowIndex()));
+
+    Jump allocateFromCurrentRow = jump();
+
     popPath.link(this);
-        
+
+    // Check for fast path: have available bit in m_currentRowBitmap?
+    load64(Address(allocatorGPR, LocalAllocator::offsetOfFreeList() + FreeList::offsetOfCurrentRowBitmap()), rowBitmapGPR);
+
+    if (canPreloadRowAddressGPR) {
+        // Preload the row address needed on the fast and middle path.
+        loadPtr(Address(allocatorGPR, LocalAllocator::offsetOfFreeList() + FreeList::offsetOfCurrentMarkedBlockRowAddress()), rowAddressGPR);
+    }
+
+    branchTest64(Zero, rowBitmapGPR).linkTo(checkForMoreRows, this);
+
+    // Fast path: we have a bit to use.
+    allocateFromCurrentRow.link(this);
+    {
+        // Remove this bit from m_currentRowBitmap.
+        auto atomIndexInRowGPR = resultGPR;
+        countTrailingZeros64WithoutNullCheck(rowBitmapGPR, atomIndexInRowGPR);
+        clearBit64(atomIndexInRowGPR, rowBitmapGPR, clearBit64ScratchGPR);
+
+        if (!canPreloadRowAddressGPR)
+            loadPtr(Address(allocatorGPR, LocalAllocator::offsetOfFreeList() + FreeList::offsetOfCurrentMarkedBlockRowAddress()), rowAddressGPR);
+
+        store64(rowBitmapGPR, Address(allocatorGPR, LocalAllocator::offsetOfFreeList() + FreeList::offsetOfCurrentRowBitmap()));
+
+        // Compute atom address of this bit.
+        ASSERT(resultGPR == atomIndexInRowGPR);
+        shiftAndAdd(rowAddressGPR, resultGPR, FreeList::atomSizeShift, resultGPR);
+    }
+
+#else
+    UNUSED_PARAM(optionalScratchGPR2);
+    UNUSED_PARAM(optionalScratchGPR3);
+
+    popPath.link(this);
+
     loadPtr(Address(allocatorGPR, LocalAllocator::offsetOfFreeList() + FreeList::offsetOfScrambledHead()), resultGPR);
     xorPtr(Address(allocatorGPR, LocalAllocator::offsetOfFreeList() + FreeList::offsetOfSecret()), resultGPR);
     slowPath.append(branchTestPtr(Zero, resultGPR));
@@ -546,7 +660,8 @@
     // it's still on the GC's free list.
     loadPtr(Address(resultGPR, FreeCell::offsetOfScrambledNext()), scratchGPR);
     storePtr(scratchGPR, Address(allocatorGPR, LocalAllocator::offsetOfFreeList() + FreeList::offsetOfScrambledHead()));
-        
+#endif
+
     done.link(this);
 }
 

Modified: trunk/Source/_javascript_Core/jit/AssemblyHelpers.h (263194 => 263195)


--- trunk/Source/_javascript_Core/jit/AssemblyHelpers.h	2020-06-18 01:45:01 UTC (rev 263194)
+++ trunk/Source/_javascript_Core/jit/AssemblyHelpers.h	2020-06-18 01:55:49 UTC (rev 263195)
@@ -44,6 +44,7 @@
 #include "TagRegistersMode.h"
 #include "TypeofType.h"
 #include "VM.h"
+#include <wtf/Optional.h>
 
 namespace JSC {
 
@@ -1951,8 +1952,8 @@
     // that allocator is non-null; allocator can be null as a signal that we don't know what the
     // value of allocatorGPR is. Additionally, if the allocator is not null, then there is no need
     // to populate allocatorGPR - this code will ignore the contents of allocatorGPR.
-    void emitAllocateWithNonNullAllocator(GPRReg resultGPR, const JITAllocator& allocator, GPRReg allocatorGPR, GPRReg scratchGPR, JumpList& slowPath);
-    
+    void emitAllocateWithNonNullAllocator(GPRReg resultGPR, const JITAllocator&, GPRReg allocatorGPR, GPRReg scratchGPR, JumpList& slowPath, Optional<GPRReg> scratchGPR2 = { }, Optional<GPRReg> scratchGPR3 = { });
+
     void emitAllocate(GPRReg resultGPR, const JITAllocator& allocator, GPRReg allocatorGPR, GPRReg scratchGPR, JumpList& slowPath);
     
     template<typename StructureType>

Modified: trunk/Source/WTF/ChangeLog (263194 => 263195)


--- trunk/Source/WTF/ChangeLog	2020-06-18 01:45:01 UTC (rev 263194)
+++ trunk/Source/WTF/ChangeLog	2020-06-18 01:55:49 UTC (rev 263195)
@@ -1,3 +1,27 @@
+2020-06-17  Mark Lam  <mark....@apple.com>
+
+        Replace JSC::FreeList linked list with a Bitmap.
+        https://bugs.webkit.org/show_bug.cgi?id=213071
+
+        Reviewed by Filip Pizlo.
+
+        1. Use countOfBits<> template to compute the number of bits.
+        2. Introduce log2() and log2Constexpr() utility functions.
+        3. Make Bitmap<>::forEachSetBit() a little bit more efficient: we don't need to
+           keep iterating if the bitmap word is already empty of bits.
+
+        * wtf/Bitmap.h:
+        (WTF::WordType>::forEachSetBit const):
+        * wtf/MathExtras.h:
+        (WTF::clzConstexpr):
+        (WTF::clz):
+        (WTF::ctzConstexpr):
+        (WTF::ctz):
+        (WTF::getMSBSet):
+        (WTF::getMSBSetConstexpr):
+        (WTF::log2):
+        (WTF::log2Constexpr):
+
 2020-06-17  David Kilzer  <ddkil...@apple.com>
 
         Replace OptionSetTraits/OptionSetValues with EnumTraits/EnumValues

Modified: trunk/Source/WTF/wtf/Bitmap.h (263194 => 263195)


--- trunk/Source/WTF/wtf/Bitmap.h	2020-06-18 01:45:01 UTC (rev 263194)
+++ trunk/Source/WTF/wtf/Bitmap.h	2020-06-18 01:55:49 UTC (rev 263195)
@@ -130,9 +130,15 @@
 
     unsigned hash() const;
 
+    // Low level interface.
+    using Word = WordType;
+    static constexpr unsigned bitsInWord = countOfBits<WordType>;
+    static constexpr unsigned numberOfWords = (bitmapSize + bitsInWord - 1) / bitsInWord;
+    WordType wordAt(size_t wordIndex) const { return bits[wordIndex]; }
+
 private:
-    static constexpr unsigned wordSize = sizeof(WordType) * 8;
-    static constexpr unsigned words = (bitmapSize + wordSize - 1) / wordSize;
+    static constexpr unsigned wordSize = bitsInWord;
+    static constexpr unsigned words = numberOfWords;
 
     // the literal '1' is of type signed int.  We want to use an unsigned
     // version of the correct size when doing the calculations because if
@@ -377,14 +383,14 @@
 {
     for (size_t i = 0; i < words; ++i) {
         WordType word = bits[i];
-        if (!word)
-            continue;
         size_t base = i * wordSize;
-        for (size_t j = 0; j < wordSize; ++j) {
+        size_t j = 0;
+        for (; word; ++j) {
             if (word & 1)
                 func(base + j);
             word >>= 1;
         }
+        ASSERT(j <= wordSize);
     }
 }
 

Modified: trunk/Source/WTF/wtf/MathExtras.h (263194 => 263195)


--- trunk/Source/WTF/wtf/MathExtras.h	2020-06-18 01:45:01 UTC (rev 263194)
+++ trunk/Source/WTF/wtf/MathExtras.h	2020-06-18 01:55:49 UTC (rev 263195)
@@ -593,7 +593,7 @@
 template <typename T>
 constexpr unsigned clzConstexpr(T value)
 {
-    constexpr unsigned bitSize = sizeof(T) * CHAR_BIT;
+    constexpr unsigned bitSize = countOfBits<T>;
 
     using UT = typename std::make_unsigned<T>::type;
     UT uValue = value;
@@ -610,13 +610,13 @@
 template<typename T>
 inline unsigned clz(T value)
 {
-    constexpr unsigned bitSize = sizeof(T) * CHAR_BIT;
+    constexpr unsigned bitSize = countOfBits<T>;
 
     using UT = typename std::make_unsigned<T>::type;
     UT uValue = value;
 
 #if COMPILER(GCC_COMPATIBLE)
-    constexpr unsigned bitSize64 = sizeof(uint64_t) * CHAR_BIT;
+    constexpr unsigned bitSize64 = countOfBits<uint64_t>;
     if (uValue)
         return __builtin_clzll(uValue) - (bitSize64 - bitSize);
     return bitSize;
@@ -638,7 +638,7 @@
 template <typename T>
 constexpr unsigned ctzConstexpr(T value)
 {
-    constexpr unsigned bitSize = sizeof(T) * CHAR_BIT;
+    constexpr unsigned bitSize = countOfBits<T>;
 
     using UT = typename std::make_unsigned<T>::type;
     UT uValue = value;
@@ -657,7 +657,7 @@
 template<typename T>
 inline unsigned ctz(T value)
 {
-    constexpr unsigned bitSize = sizeof(T) * CHAR_BIT;
+    constexpr unsigned bitSize = countOfBits<T>;
 
     using UT = typename std::make_unsigned<T>::type;
     UT uValue = value;
@@ -695,7 +695,7 @@
 template<typename T>
 inline unsigned getMSBSet(T t)
 {
-    constexpr unsigned bitSize = sizeof(T) * CHAR_BIT;
+    constexpr unsigned bitSize = countOfBits<T>;
     ASSERT(t);
     return bitSize - 1 - clz(t);
 }
@@ -703,11 +703,14 @@
 template<typename T>
 constexpr unsigned getMSBSetConstexpr(T t)
 {
-    constexpr unsigned bitSize = sizeof(T) * CHAR_BIT;
+    constexpr unsigned bitSize = countOfBits<T>;
     ASSERT_UNDER_CONSTEXPR_CONTEXT(t);
     return bitSize - 1 - clzConstexpr(t);
 }
 
+template<typename T> unsigned log2(T value) { return getMSBSet(value); }
+template<typename T> constexpr unsigned log2Constexpr(T value) { return getMSBSetConstexpr(value); }
+
 } // namespace WTF
 
 using WTF::shuffleVector;