Title: [209179] trunk
Revision
209179
Author
utatane....@gmail.com
Date
2016-12-01 01:24:21 -0800 (Thu, 01 Dec 2016)

Log Message

Introduce StringImpl::StaticStringImpl with constexpr constructor
https://bugs.webkit.org/show_bug.cgi?id=165093

Reviewed by Darin Adler.

Source/WTF:

This patch adds new class, StringImpl::StaticStringImpl.
By using this class, we can easily create static StringImpls.
This class has constexpr constructor. You can initialize instances
of this class as global static variables without invoking global
constructors.

We already have similar system, StaticASCIILiteral. But using it
requires some special perl script since we need to calculate
hash value. On the other hand, we can use StaticStringImpl without
any script supports since we implement constexpr hash function.
In the future, we will replace all the use of StaticASCIILiteral
with this StaticStringImpl.

We define empty / null strings as StaticStringImpl. And we make
StringImpl::empty() & StringImpl::null() inline functions.

* wtf/Hasher.h:
(WTF::StringHasher::hashWithTop8BitsMasked):
(WTF::StringHasher::hash):
(WTF::StringHasher::finalize):
(WTF::StringHasher::finalizeAndMaskTop8Bits):
(WTF::StringHasher::computeLiteralHash):
(WTF::StringHasher::computeLiteralHashAndMaskTop8Bits):
(WTF::StringHasher::avalancheBits3):
(WTF::StringHasher::avalancheBits2):
(WTF::StringHasher::avalancheBits1):
(WTF::StringHasher::avalancheBits0):
(WTF::StringHasher::avalancheBits):
(WTF::StringHasher::avoidZero):
(WTF::StringHasher::processPendingCharacter):
(WTF::StringHasher::calculateWithRemainingLastCharacter1):
(WTF::StringHasher::calculateWithRemainingLastCharacter0):
(WTF::StringHasher::calculateWithRemainingLastCharacter):
(WTF::StringHasher::calculate1):
(WTF::StringHasher::calculate0):
(WTF::StringHasher::calculate):
(WTF::StringHasher::computeLiteralHashImpl):
* wtf/text/StringImpl.cpp:
* wtf/text/StringImpl.h:
(WTF::StringImpl::StaticStringImpl::StaticStringImpl):
(WTF::StringImpl::null):
(WTF::StringImpl::empty):
* wtf/text/StringStatics.cpp:
(WTF::StringImpl::null): Deleted.
(WTF::StringImpl::empty): Deleted.

Tools:

* TestWebKitAPI/Tests/WTF/StringImpl.cpp:
(TestWebKitAPI::TEST):

Modified Paths

Diff

Modified: trunk/Source/WTF/ChangeLog (209178 => 209179)


--- trunk/Source/WTF/ChangeLog	2016-12-01 09:22:51 UTC (rev 209178)
+++ trunk/Source/WTF/ChangeLog	2016-12-01 09:24:21 UTC (rev 209179)
@@ -1,3 +1,56 @@
+2016-12-01  Yusuke Suzuki  <utatane....@gmail.com>
+
+        Introduce StringImpl::StaticStringImpl with constexpr constructor
+        https://bugs.webkit.org/show_bug.cgi?id=165093
+
+        Reviewed by Darin Adler.
+
+        This patch adds new class, StringImpl::StaticStringImpl.
+        By using this class, we can easily create static StringImpls.
+        This class has constexpr constructor. You can initialize instances
+        of this class as global static variables without invoking global
+        constructors.
+
+        We already have similar system, StaticASCIILiteral. But using it
+        requires some special perl script since we need to calculate
+        hash value. On the other hand, we can use StaticStringImpl without
+        any script supports since we implement constexpr hash function.
+        In the future, we will replace all the use of StaticASCIILiteral
+        with this StaticStringImpl.
+
+        We define empty / null strings as StaticStringImpl. And we make
+        StringImpl::empty() & StringImpl::null() inline functions.
+
+        * wtf/Hasher.h:
+        (WTF::StringHasher::hashWithTop8BitsMasked):
+        (WTF::StringHasher::hash):
+        (WTF::StringHasher::finalize):
+        (WTF::StringHasher::finalizeAndMaskTop8Bits):
+        (WTF::StringHasher::computeLiteralHash):
+        (WTF::StringHasher::computeLiteralHashAndMaskTop8Bits):
+        (WTF::StringHasher::avalancheBits3):
+        (WTF::StringHasher::avalancheBits2):
+        (WTF::StringHasher::avalancheBits1):
+        (WTF::StringHasher::avalancheBits0):
+        (WTF::StringHasher::avalancheBits):
+        (WTF::StringHasher::avoidZero):
+        (WTF::StringHasher::processPendingCharacter):
+        (WTF::StringHasher::calculateWithRemainingLastCharacter1):
+        (WTF::StringHasher::calculateWithRemainingLastCharacter0):
+        (WTF::StringHasher::calculateWithRemainingLastCharacter):
+        (WTF::StringHasher::calculate1):
+        (WTF::StringHasher::calculate0):
+        (WTF::StringHasher::calculate):
+        (WTF::StringHasher::computeLiteralHashImpl):
+        * wtf/text/StringImpl.cpp:
+        * wtf/text/StringImpl.h:
+        (WTF::StringImpl::StaticStringImpl::StaticStringImpl):
+        (WTF::StringImpl::null):
+        (WTF::StringImpl::empty):
+        * wtf/text/StringStatics.cpp:
+        (WTF::StringImpl::null): Deleted.
+        (WTF::StringImpl::empty): Deleted.
+
 2016-11-30  Darin Adler  <da...@apple.com>
 
         Roll out StringBuilder changes from the previous patch.

Modified: trunk/Source/WTF/wtf/Hasher.h (209178 => 209179)


--- trunk/Source/WTF/wtf/Hasher.h	2016-12-01 09:22:51 UTC (rev 209178)
+++ trunk/Source/WTF/wtf/Hasher.h	2016-12-01 09:24:21 UTC (rev 209179)
@@ -36,11 +36,12 @@
 // _javascript_Core and the CodeGeneratorJS.pm script in WebCore.
 
 // Golden ratio. Arbitrary start value to avoid mapping all zeros to a hash value of zero.
-static const unsigned stringHashingStartValue = 0x9E3779B9U;
+static constexpr const unsigned stringHashingStartValue = 0x9E3779B9U;
 
 class StringHasher {
 public:
-    static const unsigned flagCount = 8; // Save 8 bits for StringImpl to use as flags.
+    static constexpr const unsigned flagCount = 8; // Save 8 bits for StringImpl to use as flags.
+    static constexpr const unsigned maskHash = (1U << (sizeof(unsigned) * 8 - flagCount)) - 1;
 
     StringHasher()
         : m_hash(stringHashingStartValue)
@@ -162,34 +163,12 @@
 
     unsigned hashWithTop8BitsMasked() const
     {
-        unsigned result = avalancheBits();
-
-        // Reserving space from the high bits for flags preserves most of the hash's
-        // value, since hash lookup typically masks out the high bits anyway.
-        result &= (1U << (sizeof(result) * 8 - flagCount)) - 1;
-
-        // This avoids ever returning a hash code of 0, since that is used to
-        // signal "hash not computed yet". Setting the high bit maintains
-        // reasonable fidelity to a hash code of 0 because it is likely to yield
-        // exactly 0 when hash lookup masks out the high bits.
-        if (!result)
-            result = 0x80000000 >> flagCount;
-
-        return result;
+        return finalizeAndMaskTop8Bits(processPendingCharacter());
     }
 
     unsigned hash() const
     {
-        unsigned result = avalancheBits();
-
-        // This avoids ever returning a hash code of 0, since that is used to
-        // signal "hash not computed yet". Setting the high bit maintains
-        // reasonable fidelity to a hash code of 0 because it is likely to yield
-        // exactly 0 when hash lookup masks out the high bits.
-        if (!result)
-            result = 0x80000000;
-
-        return result;
+        return finalize(processPendingCharacter());
     }
 
     template<typename T, UChar Converter(T)> static unsigned computeHashAndMaskTop8Bits(const T* data, unsigned length)
@@ -257,6 +236,30 @@
         return hashMemory(data, length);
     }
 
+    static constexpr unsigned finalize(unsigned hash)
+    {
+        return avoidZero(avalancheBits(hash));
+    }
+
+    static constexpr unsigned finalizeAndMaskTop8Bits(unsigned hash)
+    {
+        // Reserving space from the high bits for flags preserves most of the hash's
+        // value, since hash lookup typically masks out the high bits anyway.
+        return avoidZero(avalancheBits(hash) & StringHasher::maskHash);
+    }
+
+    template<typename T, unsigned charactersCount>
+    static constexpr unsigned computeLiteralHash(const T (&characters)[charactersCount])
+    {
+        return StringHasher::finalize(computeLiteralHashImpl(stringHashingStartValue, 0, characters, charactersCount - 1));
+    }
+
+    template<typename T, unsigned charactersCount>
+    static constexpr unsigned computeLiteralHashAndMaskTop8Bits(const T (&characters)[charactersCount])
+    {
+        return StringHasher::finalizeAndMaskTop8Bits(computeLiteralHashImpl(stringHashingStartValue, 0, characters, charactersCount - 1));
+    }
+
 private:
     static UChar defaultConverter(UChar character)
     {
@@ -268,8 +271,42 @@
         return character;
     }
 
-    unsigned avalancheBits() const
+    ALWAYS_INLINE static constexpr unsigned avalancheBits3(unsigned hash)
     {
+        return hash ^ (hash << 10);
+    }
+
+    ALWAYS_INLINE static constexpr unsigned avalancheBits2(unsigned hash)
+    {
+        return avalancheBits3(hash + (hash >> 15));
+    }
+
+    ALWAYS_INLINE static constexpr unsigned avalancheBits1(unsigned hash)
+    {
+        return avalancheBits2(hash ^ (hash << 2));
+    }
+
+    ALWAYS_INLINE static constexpr unsigned avalancheBits0(unsigned hash)
+    {
+        return avalancheBits1(hash + (hash >> 5));
+    }
+
+    ALWAYS_INLINE static constexpr unsigned avalancheBits(unsigned hash)
+    {
+        return avalancheBits0(hash ^ (hash << 3));
+    }
+
+    // This avoids ever returning a hash code of 0, since that is used to
+    // signal "hash not computed yet". Setting the high bit maintains
+    // reasonable fidelity to a hash code of 0 because it is likely to yield
+    // exactly 0 when hash lookup masks out the high bits.
+    ALWAYS_INLINE static constexpr unsigned avoidZero(unsigned hash)
+    {
+        return hash ? hash : (0x80000000 >> StringHasher::flagCount);
+    }
+
+    unsigned processPendingCharacter() const
+    {
         unsigned result = m_hash;
 
         // Handle end case.
@@ -278,17 +315,51 @@
             result ^= result << 11;
             result += result >> 17;
         }
+        return result;
+    }
 
-        // Force "avalanching" of final 31 bits.
-        result ^= result << 3;
-        result += result >> 5;
-        result ^= result << 2;
-        result += result >> 15;
-        result ^= result << 10;
 
-        return result;
+    // FIXME: This code limits itself to the older, more limited C++11 constexpr capabilities, using
+    // recursion instead of looping, for example. Would be nice to rewrite this in a simpler way
+    // once we no longer need to support compilers like GCC 4.9 that do not yet support it.
+    static constexpr unsigned calculateWithRemainingLastCharacter1(unsigned hash)
+    {
+        return hash + (hash >> 17);
     }
 
+    static constexpr unsigned calculateWithRemainingLastCharacter0(unsigned hash)
+    {
+        return calculateWithRemainingLastCharacter1((hash << 11) ^ hash);
+    }
+
+    static constexpr unsigned calculateWithRemainingLastCharacter(unsigned hash, unsigned character)
+    {
+        return calculateWithRemainingLastCharacter0(hash + character);
+    }
+
+    static constexpr unsigned calculate1(unsigned hash)
+    {
+        return hash + (hash >> 11);
+    }
+
+    static constexpr unsigned calculate0(unsigned hash, unsigned secondCharacter)
+    {
+        return calculate1((hash << 16) ^ ((secondCharacter << 11) ^ hash));
+    }
+
+    static constexpr unsigned calculate(unsigned hash, unsigned firstCharacter, unsigned secondCharacter)
+    {
+        return calculate0(hash + firstCharacter, secondCharacter);
+    }
+
+    static constexpr unsigned computeLiteralHashImpl(unsigned hash, unsigned index, const char* characters, unsigned length)
+    {
+        return (index == length)
+            ? hash : ((index + 1) == length)
+            ? calculateWithRemainingLastCharacter(hash, characters[index])
+            : computeLiteralHashImpl(calculate(hash, characters[index], characters[index + 1]), index + 2, characters, length);
+    }
+
     unsigned m_hash;
     bool m_hasPendingCharacter;
     UChar m_pendingCharacter;

Modified: trunk/Source/WTF/wtf/text/StringImpl.cpp (209178 => 209179)


--- trunk/Source/WTF/wtf/text/StringImpl.cpp	2016-12-01 09:22:51 UTC (rev 209178)
+++ trunk/Source/WTF/wtf/text/StringImpl.cpp	2016-12-01 09:24:21 UTC (rev 209179)
@@ -101,6 +101,8 @@
 }
 #endif
 
+StringImpl::StaticStringImpl StringImpl::s_atomicNullString("", StringImpl::StringAtomic);
+StringImpl::StaticStringImpl StringImpl::s_atomicEmptyString("", StringImpl::StringAtomic);
 
 StringImpl::~StringImpl()
 {

Modified: trunk/Source/WTF/wtf/text/StringImpl.h (209178 => 209179)


--- trunk/Source/WTF/wtf/text/StringImpl.h	2016-12-01 09:22:51 UTC (rev 209178)
+++ trunk/Source/WTF/wtf/text/StringImpl.h	2016-12-01 09:24:21 UTC (rev 209179)
@@ -149,18 +149,18 @@
 
     // The bottom 6 bits in the hash are flags.
 public:
-    static const unsigned s_flagCount = 6;
+    static constexpr const unsigned s_flagCount = 6;
 private:
-    static const unsigned s_flagMask = (1u << s_flagCount) - 1;
-    COMPILE_ASSERT(s_flagCount <= StringHasher::flagCount, StringHasher_reserves_enough_bits_for_StringImpl_flags);
-    static const unsigned s_flagStringKindCount = 4;
+    static constexpr const unsigned s_flagMask = (1u << s_flagCount) - 1;
+    static_assert(s_flagCount <= StringHasher::flagCount, "StringHasher reserves enough bits for StringImpl flags");
+    static constexpr const unsigned s_flagStringKindCount = 4;
 
-    static const unsigned s_hashFlagStringKindIsAtomic = 1u << (s_flagStringKindCount);
-    static const unsigned s_hashFlagStringKindIsSymbol = 1u << (s_flagStringKindCount + 1);
-    static const unsigned s_hashMaskStringKind = s_hashFlagStringKindIsAtomic | s_hashFlagStringKindIsSymbol;
-    static const unsigned s_hashFlag8BitBuffer = 1u << 3;
-    static const unsigned s_hashFlagDidReportCost = 1u << 2;
-    static const unsigned s_hashMaskBufferOwnership = (1u << 0) | (1u << 1);
+    static constexpr const unsigned s_hashFlagStringKindIsAtomic = 1u << (s_flagStringKindCount);
+    static constexpr const unsigned s_hashFlagStringKindIsSymbol = 1u << (s_flagStringKindCount + 1);
+    static constexpr const unsigned s_hashMaskStringKind = s_hashFlagStringKindIsAtomic | s_hashFlagStringKindIsSymbol;
+    static constexpr const unsigned s_hashFlag8BitBuffer = 1u << 3;
+    static constexpr const unsigned s_hashFlagDidReportCost = 1u << 2;
+    static constexpr const unsigned s_hashMaskBufferOwnership = (1u << 0) | (1u << 1);
 
     enum StringKind {
         StringNormal = 0u, // non-symbol, non-atomic
@@ -168,25 +168,6 @@
         StringSymbol = s_hashFlagStringKindIsSymbol, // symbol, non-atomic
     };
 
-    // Used to construct static strings, which have an special refCount that can never hit zero.
-    // This means that the static string will never be destroyed, which is important because
-    // static strings will be shared across threads & ref-counted in a non-threadsafe manner.
-    friend class NeverDestroyed<StringImpl>;
-    enum ConstructEmptyStringTag { ConstructEmptyString };
-    StringImpl(ConstructEmptyStringTag)
-        : m_refCount(s_refCountFlagIsStaticString)
-        , m_length(0)
-        , m_data8(reinterpret_cast<const LChar*>(&m_length))
-        , m_hashAndFlags(s_hashFlag8BitBuffer | StringAtomic | BufferOwned)
-    {
-        // Ensure that the hash is computed so that AtomicStringHash can call existingHash()
-        // with impunity. The empty string is special because it is never entered into
-        // AtomicString's HashKey, but still needs to compare correctly.
-        STRING_STATS_ADD_8BIT_STRING(m_length);
-
-        hash();
-    }
-
     // FIXME: there has to be a less hacky way to do this.
     enum Force8Bit { Force8BitConstructor };
     // Create a normal 8-bit string with internal storage (BufferInternal)
@@ -606,8 +587,44 @@
         m_refCount = tempRefCount;
     }
 
-    WTF_EXPORT_PRIVATE static StringImpl* empty();
+    class StaticStringImpl {
+    public:
+        // Used to construct static strings, which have an special refCount that can never hit zero.
+        // This means that the static string will never be destroyed, which is important because
+        // static strings will be shared across threads & ref-counted in a non-threadsafe manner.
+        template<unsigned charactersCount>
+        constexpr StaticStringImpl(const char (&characters)[charactersCount], StringKind stringKind = StringNormal)
+            : m_refCount(s_refCountFlagIsStaticString)
+            , m_length(charactersCount - 1)
+            , m_data8(characters)
+            , m_hashAndFlags(s_hashFlag8BitBuffer | stringKind | BufferInternal | (StringHasher::computeLiteralHashAndMaskTop8Bits(characters) << s_flagCount))
+        {
+        }
 
+        template<unsigned charactersCount>
+        constexpr StaticStringImpl(const char16_t (&characters)[charactersCount], StringKind stringKind = StringNormal)
+            : m_refCount(s_refCountFlagIsStaticString)
+            , m_length(charactersCount - 1)
+            , m_data16(characters)
+            , m_hashAndFlags(stringKind | BufferInternal | (StringHasher::computeLiteralHashAndMaskTop8Bits(characters) << s_flagCount))
+        {
+        }
+
+        // These member variables must match the layout of StringImpl.
+        unsigned m_refCount;
+        unsigned m_length;
+        union {
+            const char* m_data8;
+            const char16_t* m_data16;
+        };
+        unsigned m_hashAndFlags;
+    };
+
+    WTF_EXPORTDATA static StaticStringImpl s_atomicNullString;
+    WTF_EXPORTDATA static StaticStringImpl s_atomicEmptyString;
+    ALWAYS_INLINE static StringImpl* null() { return reinterpret_cast<StringImpl*>(&s_atomicNullString); }
+    ALWAYS_INLINE static StringImpl* empty() { return reinterpret_cast<StringImpl*>(&s_atomicEmptyString); }
+
     // FIXME: Does this really belong in StringImpl?
     template <typename T> static void copyChars(T* destination, const T* source, unsigned numCharacters)
     {
@@ -866,7 +883,6 @@
     template <typename CharType> static Ref<StringImpl> createInternal(const CharType*, unsigned);
     WTF_EXPORT_PRIVATE NEVER_INLINE unsigned hashSlowCase() const;
     WTF_EXPORT_PRIVATE static unsigned nextHashForSymbol();
-    WTF_EXPORT_PRIVATE static StringImpl* null();
 
     // The bottom bit in the ref count indicates a static (immortal) string.
     static const unsigned s_refCountFlagIsStaticString = 0x1;
@@ -877,6 +893,8 @@
 #endif
 
 public:
+    // FIXME: It should be replaced with StaticStringImpl.
+    // https://bugs.webkit.org/show_bug.cgi?id=165134
     struct StaticASCIILiteral {
         // These member variables must match the layout of StringImpl.
         unsigned m_refCount;
@@ -899,7 +917,7 @@
 #endif
 
 private:
-    // These member variables must match the layout of StaticASCIILiteral.
+    // These member variables must match the layout of StaticASCIILiteral and StaticStringImpl.
     unsigned m_refCount;
     unsigned m_length;
     union {
@@ -910,6 +928,7 @@
 };
 
 static_assert(sizeof(StringImpl) == sizeof(StringImpl::StaticASCIILiteral), "");
+static_assert(sizeof(StringImpl) == sizeof(StringImpl::StaticStringImpl), "");
 
 #if !ASSERT_DISABLED
 // StringImpls created from StaticASCIILiteral will ASSERT

Modified: trunk/Source/WTF/wtf/text/StringStatics.cpp (209178 => 209179)


--- trunk/Source/WTF/wtf/text/StringStatics.cpp	2016-12-01 09:22:51 UTC (rev 209178)
+++ trunk/Source/WTF/wtf/text/StringStatics.cpp	2016-12-01 09:24:21 UTC (rev 209179)
@@ -41,18 +41,6 @@
 
 namespace WTF {
 
-StringImpl* StringImpl::null()
-{
-    static NeverDestroyed<StringImpl> nullString(ConstructEmptyString);
-    return &nullString.get();
-}
-
-StringImpl* StringImpl::empty()
-{
-    static NeverDestroyed<StringImpl> emptyString(ConstructEmptyString);
-    return &emptyString.get();
-}
-
 // In addition to the normal hash value, store specialized hash value for
 // symbolized StringImpl*. And don't use the normal hash value for symbolized
 // StringImpl* when they are treated as Identifiers. Unique nature of these

Modified: trunk/Tools/ChangeLog (209178 => 209179)


--- trunk/Tools/ChangeLog	2016-12-01 09:22:51 UTC (rev 209178)
+++ trunk/Tools/ChangeLog	2016-12-01 09:24:21 UTC (rev 209179)
@@ -1,3 +1,13 @@
+2016-12-01  Yusuke Suzuki  <utatane....@gmail.com>
+
+        Introduce StringImpl::StaticStringImpl with constexpr constructor
+        https://bugs.webkit.org/show_bug.cgi?id=165093
+
+        Reviewed by Darin Adler.
+
+        * TestWebKitAPI/Tests/WTF/StringImpl.cpp:
+        (TestWebKitAPI::TEST):
+
 2016-11-30  Antoine Quint  <grao...@apple.com>
 
         [Modern Media Controls] Add an HTML comment flag to turn the feature on

Modified: trunk/Tools/TestWebKitAPI/Tests/WTF/StringImpl.cpp (209178 => 209179)


--- trunk/Tools/TestWebKitAPI/Tests/WTF/StringImpl.cpp	2016-12-01 09:22:51 UTC (rev 209178)
+++ trunk/Tools/TestWebKitAPI/Tests/WTF/StringImpl.cpp	2016-12-01 09:24:21 UTC (rev 209179)
@@ -25,6 +25,7 @@
 
 #include "config.h"
 
+#include <wtf/Hasher.h>
 #include <wtf/text/SymbolImpl.h>
 #include <wtf/text/WTFString.h>
 
@@ -574,4 +575,20 @@
     ASSERT_FALSE(reference->isAtomic());
 }
 
+TEST(WTF, StringImplConstexprHasher)
+{
+    ASSERT_EQ(stringFromUTF8("")->hash(), StringHasher::computeLiteralHashAndMaskTop8Bits(""));
+    ASSERT_EQ(stringFromUTF8("A")->hash(), StringHasher::computeLiteralHashAndMaskTop8Bits("A"));
+    ASSERT_EQ(stringFromUTF8("AA")->hash(), StringHasher::computeLiteralHashAndMaskTop8Bits("AA"));
+    ASSERT_EQ(stringFromUTF8("Cocoa")->hash(), StringHasher::computeLiteralHashAndMaskTop8Bits("Cocoa"));
+    ASSERT_EQ(stringFromUTF8("Cappuccino")->hash(), StringHasher::computeLiteralHashAndMaskTop8Bits("Cappuccino"));
+}
+
+TEST(WTF, StringImplEmpty)
+{
+    ASSERT_FALSE(StringImpl::empty()->length());
+    ASSERT_FALSE(StringImpl::null()->length());
+    ASSERT_NE(StringImpl::null(), StringImpl::empty());
+}
+
 } // namespace TestWebKitAPI
_______________________________________________
webkit-changes mailing list
webkit-changes@lists.webkit.org
https://lists.webkit.org/mailman/listinfo/webkit-changes

Reply via email to