Title: [206869] trunk
Revision: 206869
Author: wilan...@apple.com
Date: 2016-10-06 10:40:12 -0700 (Thu, 06 Oct 2016)

Log Message

Update Resource Load Statistics
https://bugs.webkit.org/show_bug.cgi?id=162811

Reviewed by Alex Christensen.

Source/WebCore:

No new tests. The counting is based on top privately owned domains,
which are currently supported by neither layout tests nor API tests.

* Modules/websockets/WebSocket.cpp:
(WebCore::WebSocket::connect):
    Now captures statistics for web sockets too.
* loader/FrameLoader.cpp:
(WebCore::FrameLoader::loadResourceSynchronously):
    Now captures statistics for synchronous XHR too.
* loader/ResourceLoadObserver.cpp:
(WebCore::is3xxRedirect):
    Convenience function.
(WebCore::ResourceLoadObserver::shouldLog):
    Convenience function.
(WebCore::ResourceLoadObserver::logFrameNavigation):
    Updated to make use of new convenience functions.
(WebCore::ResourceLoadObserver::logSubresourceLoading):
    Updated to make use of new convenience functions.
(WebCore::ResourceLoadObserver::logWebSocketLoading):
    Added.
(WebCore::ResourceLoadObserver::logUserInteraction):
    Updated to make use of new convenience functions.
(WebCore::ResourceLoadObserver::primaryDomain):
    Now makes use of the Public Suffix list.
    Removed old custom parsing of primary domain.
* loader/ResourceLoadObserver.h:
* loader/ResourceLoadStatisticsStore.cpp:
(WebCore::ResourceLoadStatisticsStore::prevalentResourceDomainsWithoutUserInteraction):
    Convenience function.
(WebCore::ResourceLoadStatisticsStore::processStatistics): Deleted.
* loader/ResourceLoadStatisticsStore.h:
* loader/SubresourceLoader.cpp:
(WebCore::SubresourceLoader::willSendRequestInternal):
    Moved logging call higher up and added a check for whether we
    are loading the main resource. The reason for moving it up is
    to capture the request before some data may be cleared out in
    redirect handling. We also want to capture failed CORS requests
    since they are sent and then cancelled on the way back.
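
The redirect handling described above can be sketched as follows. This is a minimal standalone illustration, not the WebKit code: SampleResponse, sampleIs3xxRedirect and sampleShouldSkip are hypothetical names, and a status code of 0 is assumed to stand in for a null response. It shows why a status-code check matters once the synchronous XHR path passes the final (for example 200) response: a "response is not null" test would treat that load as a redirect.

```cpp
#include <string>

// Hypothetical stand-in for a response; statusCode 0 is assumed to mean "no response".
struct SampleResponse {
    int statusCode;
    std::string host;
};

// Mirrors the range check used by is3xxRedirect in the diff.
bool sampleIs3xxRedirect(const SampleResponse& response)
{
    return response.statusCode >= 300 && response.statusCode <= 399;
}

// Returns true when the load should not be counted as a third-party subresource.
// The "same host as the source" exclusion only applies to real redirects.
bool sampleShouldSkip(const std::string& targetHost, const std::string& mainFrameHost, const SampleResponse& response)
{
    bool isRedirect = sampleIs3xxRedirect(response);
    return targetHost.empty()
        || mainFrameHost.empty()
        || targetHost == mainFrameHost
        || (isRedirect && targetHost == response.host);
}
```

With these definitions, a 200 response from tracker.example loaded under site.example is counted, while a 302 from tracker.example back to tracker.example is skipped.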

Source/WebKit2:

* UIProcess/WebResourceLoadStatisticsStore.cpp:
(WebKit::WebResourceLoadStatisticsStore::hasPrevalentResourceCharacteristics):
    Switched to vector-based classification.
(WebKit::WebResourceLoadStatisticsStore::classifyResource):
    Simplified logic and moved the split between resources with and
    without user interaction into ResourceLoadStatisticsStore.
(WebKit::WebResourceLoadStatisticsStore::clearDataRecords):
    Added.
(WebKit::WebResourceLoadStatisticsStore::resourceLoadStatisticsUpdated):
    Updated to make use of the new functions.
(WebKit::WebResourceLoadStatisticsStore::persistentStoragePath):
    Removed stray whitespace.
(WebKit::WebResourceLoadStatisticsStore::writeEncoderToDisk):
    Removed stray whitespace.
(WebKit::WebResourceLoadStatisticsStore::createDecoderFromDisk):
    Removed stray whitespace.
(WebKit::hasPrevalentResourceCharacteristics): Deleted.
(WebKit::classifyPrevalentResources): Deleted.
* UIProcess/WebResourceLoadStatisticsStore.h:
    Added member variables for clearing of data records.
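
The vector-based classification can be sketched as follows. This is a standalone illustration under assumptions, not the WebKit implementation: SampleCounts and sampleHasPrevalentCharacteristics are hypothetical names, and the threshold of 3 is taken from featureVectorLengthThreshold in the diff. The three counts are treated as a vector (a, b, c) whose Euclidean length sqrt(a^2 + b^2 + c^2) is compared against the threshold.

```cpp
#include <cmath>

// Hypothetical container for the three per-domain counts used by the classifier.
struct SampleCounts {
    unsigned subresourceUnderTopFrameOrigins;
    unsigned subresourceUniqueRedirectsTo;
    unsigned subframeUnderTopFrameOrigins;
};

// Returns true when the feature vector is longer than the threshold.
bool sampleHasPrevalentCharacteristics(const SampleCounts& counts, double threshold = 3)
{
    double a = counts.subresourceUnderTopFrameOrigins;
    double b = counts.subresourceUniqueRedirectsTo;
    double c = counts.subframeUnderTopFrameOrigins;

    // No observations at all: never prevalent.
    if (!a && !b && !c)
        return false;

    // Any single dimension above the threshold already makes the vector long enough.
    if (a > threshold || b > threshold || c > threshold)
        return true;

    return std::sqrt(a * a + b * b + c * c) > threshold;
}
```

For example, counts of (2, 2, 2) give a length of about 3.46 and are classified as prevalent even though no single count exceeds 3, whereas (1, 2, 2) gives exactly 3 and is not.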

Tools:

* TestWebKitAPI/Tests/mac/PublicSuffix.mm:
    Change from USE(PUBLIC_SUFFIX_LIST) to ENABLE(PUBLIC_SUFFIX_LIST)

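Modified Paths

trunk/Source/WebCore/ChangeLog
trunk/Source/WebCore/Modules/websockets/WebSocket.cpp
trunk/Source/WebCore/loader/FrameLoader.cpp
trunk/Source/WebCore/loader/ResourceLoadObserver.cpp
trunk/Source/WebCore/loader/ResourceLoadObserver.h
trunk/Source/WebCore/loader/ResourceLoadStatisticsStore.cpp
trunk/Source/WebCore/loader/ResourceLoadStatisticsStore.h
trunk/Source/WebCore/loader/SubresourceLoader.cpp
trunk/Source/WebKit2/ChangeLog
trunk/Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.cpp
trunk/Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.h
trunk/Tools/ChangeLog
trunk/Tools/TestWebKitAPI/Tests/mac/PublicSuffix.mm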

Diff

Modified: trunk/Source/WebCore/ChangeLog (206868 => 206869)


--- trunk/Source/WebCore/ChangeLog	2016-10-06 17:27:23 UTC (rev 206868)
+++ trunk/Source/WebCore/ChangeLog	2016-10-06 17:40:12 UTC (rev 206869)
@@ -1,3 +1,49 @@
+2016-10-06  John Wilander  <wilan...@apple.com>
+
+        Update Resource Load Statistics
+        https://bugs.webkit.org/show_bug.cgi?id=162811
+
+        Reviewed by Alex Christensen.
+
+        No new tests. The counting is based on top privately owned domains
+        which currently is not supported by layout tests nor API tests.
+
+        * Modules/websockets/WebSocket.cpp:
+        (WebCore::WebSocket::connect):
+            Now captures statistics for web sockets too.
+        * loader/FrameLoader.cpp:
+        (WebCore::FrameLoader::loadResourceSynchronously):
+        * loader/ResourceLoadObserver.cpp:
+            Now captures statistics for synchronous XHR too.
+        (WebCore::is3xxRedirect):
+            Convenience function.
+        (WebCore::ResourceLoadObserver::shouldLog):
+            Convenience function.
+        (WebCore::ResourceLoadObserver::logFrameNavigation):
+            Updated to make use of new convenience functions.
+        (WebCore::ResourceLoadObserver::logSubresourceLoading):
+            Updated to make use of new convenience functions.
+        (WebCore::ResourceLoadObserver::logWebSocketLoading):
+            Added.
+        (WebCore::ResourceLoadObserver::logUserInteraction):
+            Updated to make use of new convenience functions.
+        (WebCore::ResourceLoadObserver::primaryDomain):
+            Now makes use of the Public Suffix list.
+            Removed old custom parsing of primary domain.
+        * loader/ResourceLoadObserver.h:
+        * loader/ResourceLoadStatisticsStore.cpp:
+        (WebCore::ResourceLoadStatisticsStore::prevalentResourceDomainsWithoutUserInteraction):
+            Convenience function.
+        (WebCore::ResourceLoadStatisticsStore::processStatistics): Deleted.
+        * loader/ResourceLoadStatisticsStore.h:
+        * loader/SubresourceLoader.cpp:
+        (WebCore::SubresourceLoader::willSendRequestInternal):
+            Moved logging call higher up and added a check for whether we
+            are loading the main resource. The reason for moving it up is
+            to capture the request before some data may be cleared out in
+            redirect handling. We also want to capture failed CORS requests
+            since they are sent and then cancelled on the way back.
+
 2016-10-06  Adam Bergkvist  <adam.bergkv...@ericsson.com>
 
         WebRTC: Add support for the iceconnectionstatechange event in MediaEndpointPeerConnection

Modified: trunk/Source/WebCore/Modules/websockets/WebSocket.cpp (206868 => 206869)


--- trunk/Source/WebCore/Modules/websockets/WebSocket.cpp	2016-10-06 17:27:23 UTC (rev 206868)
+++ trunk/Source/WebCore/Modules/websockets/WebSocket.cpp	2016-10-06 17:40:12 UTC (rev 206869)
@@ -47,6 +47,7 @@
 #include "Frame.h"
 #include "Logging.h"
 #include "MessageEvent.h"
+#include "ResourceLoadObserver.h"
 #include "ScriptController.h"
 #include "ScriptExecutionContext.h"
 #include "SecurityOrigin.h"
@@ -318,7 +319,8 @@
             });
 #endif
             return;
-        }
+        } else
+            ResourceLoadObserver::sharedObserver().logWebSocketLoading(document.frame(), m_url);
     }
 
     String protocolString;

Modified: trunk/Source/WebCore/loader/FrameLoader.cpp (206868 => 206869)


--- trunk/Source/WebCore/loader/FrameLoader.cpp	2016-10-06 17:27:23 UTC (rev 206868)
+++ trunk/Source/WebCore/loader/FrameLoader.cpp	2016-10-06 17:40:12 UTC (rev 206869)
@@ -97,6 +97,7 @@
 #include "ProgressTracker.h"
 #include "ResourceHandle.h"
 #include "ResourceLoadInfo.h"
+#include "ResourceLoadObserver.h"
 #include "ResourceRequest.h"
 #include "SVGDocument.h"
 #include "SVGLocatable.h"
@@ -2774,6 +2775,7 @@
             platformStrategies()->loaderStrategy()->loadResourceSynchronously(networkingContext(), identifier, newRequest, storedCredentials, clientCredentialPolicy, error, response, buffer);
             data = SharedBuffer::adoptVector(buffer);
             documentLoader()->applicationCacheHost()->maybeLoadFallbackSynchronously(newRequest, error, response, data);
+            ResourceLoadObserver::sharedObserver().logSubresourceLoading(&m_frame, newRequest, response);
         }
     }
     notifier().sendRemainingDelegateMessages(m_documentLoader.get(), identifier, request, response, data ? data->data() : nullptr, data ? data->size() : 0, -1, error);

Modified: trunk/Source/WebCore/loader/ResourceLoadObserver.cpp (206868 => 206869)


--- trunk/Source/WebCore/loader/ResourceLoadObserver.cpp	2016-10-06 17:27:23 UTC (rev 206868)
+++ trunk/Source/WebCore/loader/ResourceLoadObserver.cpp	2016-10-06 17:40:12 UTC (rev 206869)
@@ -33,6 +33,7 @@
 #include "NetworkStorageSession.h"
 #include "Page.h"
 #include "PlatformStrategies.h"
+#include "PublicSuffix.h"
 #include "ResourceLoadStatistics.h"
 #include "ResourceLoadStatisticsStore.h"
 #include "ResourceRequest.h"
@@ -57,23 +58,31 @@
     m_store = WTFMove(store);
 }
 
-void ResourceLoadObserver::logFrameNavigation(const Frame& frame, const Frame& topFrame, const ResourceRequest& newRequest, const ResourceResponse& redirectResponse)
+static inline bool is3xxRedirect(const ResourceResponse& response)
 {
-    if (!Settings::resourceLoadStatisticsEnabled())
-        return;
+    return response.httpStatusCode() >= 300 && response.httpStatusCode() <= 399;
+}
 
-    if (!m_store)
-        return;
+bool ResourceLoadObserver::shouldLog(Page* page)
+{
+    // FIXME: Err on the safe side until we have sorted out what to do in worker contexts
+    if (!page)
+        return false;
+    return Settings::resourceLoadStatisticsEnabled()
+        && !page->usesEphemeralSession()
+        && m_store;
+}
 
+void ResourceLoadObserver::logFrameNavigation(const Frame& frame, const Frame& topFrame, const ResourceRequest& newRequest, const ResourceResponse& redirectResponse)
+{
     ASSERT(frame.document());
     ASSERT(topFrame.document());
     ASSERT(topFrame.page());
-
-    bool needPrivacy = topFrame.page() ? topFrame.page()->usesEphemeralSession() : false;
-    if (needPrivacy)
+    
+    if (!shouldLog(topFrame.page()))
         return;
 
-    bool isRedirect = !redirectResponse.isNull();
+    bool isRedirect = is3xxRedirect(redirectResponse);
     bool isMainFrame = frame.isMainFrame();
     const URL& sourceURL = frame.document()->url();
     const URL& targetURL = newRequest.url();
@@ -148,17 +157,12 @@
     
 void ResourceLoadObserver::logSubresourceLoading(const Frame* frame, const ResourceRequest& newRequest, const ResourceResponse& redirectResponse)
 {
-    if (!Settings::resourceLoadStatisticsEnabled())
-        return;
+    ASSERT(frame->page());
 
-    if (!m_store)
+    if (!shouldLog(frame->page()))
         return;
 
-    bool needPrivacy = (frame && frame->page()) ? frame->page()->usesEphemeralSession() : false;
-    if (needPrivacy)
-        return;
-    
-    bool isRedirect = !redirectResponse.isNull();
+    bool isRedirect = is3xxRedirect(redirectResponse);
     const URL& sourceURL = redirectResponse.url();
     const URL& targetURL = newRequest.url();
     const URL& mainFrameURL = frame ? frame->mainFrame().document()->url() : URL();
@@ -166,7 +170,10 @@
     auto targetHost = targetURL.host();
     auto mainFrameHost = mainFrameURL.host();
 
-    if (targetHost.isEmpty() || mainFrameHost.isEmpty() || targetHost == mainFrameHost || targetHost == sourceURL.host())
+    if (targetHost.isEmpty()
+        || mainFrameHost.isEmpty()
+        || targetHost == mainFrameHost
+        || (isRedirect && targetHost == sourceURL.host()))
         return;
 
     auto targetPrimaryDomain = primaryDomain(targetURL);
@@ -173,7 +180,7 @@
     auto mainFramePrimaryDomain = primaryDomain(mainFrameURL);
     auto sourcePrimaryDomain = primaryDomain(sourceURL);
     
-    if (targetPrimaryDomain == mainFramePrimaryDomain || targetPrimaryDomain == sourcePrimaryDomain)
+    if (targetPrimaryDomain == mainFramePrimaryDomain || (isRedirect && targetPrimaryDomain == sourcePrimaryDomain))
         return;
 
     auto& targetStatistics = m_store->ensureResourceStatisticsForPrimaryDomain(targetPrimaryDomain);
@@ -210,20 +217,60 @@
     
     m_store->fireDataModificationHandler();
 }
+
+void ResourceLoadObserver::logWebSocketLoading(const Frame* frame, const URL& targetURL)
+{
+    // FIXME: Web sockets can run in detached frames. Decide how to count such connections.
+    // See LayoutTests/http/tests/websocket/construct-in-detached-frame.html
+    if (!frame)
+        return;
+
+    if (!shouldLog(frame->page()))
+        return;
+
+    const URL& mainFrameURL = frame->mainFrame().document()->url();
+
+    auto targetHost = targetURL.host();
+    auto mainFrameHost = mainFrameURL.host();
     
+    if (targetHost.isEmpty()
+        || mainFrameHost.isEmpty()
+        || targetHost == mainFrameHost)
+        return;
+    
+    auto targetPrimaryDomain = primaryDomain(targetURL);
+    auto mainFramePrimaryDomain = primaryDomain(mainFrameURL);
+    
+    if (targetPrimaryDomain == mainFramePrimaryDomain)
+        return;
+
+    auto& targetStatistics = m_store->ensureResourceStatisticsForPrimaryDomain(targetPrimaryDomain);
+    
+    auto mainFrameOrigin = SecurityOrigin::create(mainFrameURL);
+    targetStatistics.subresourceUnderTopFrameOrigins.add(mainFramePrimaryDomain);
+    
+    ++targetStatistics.subresourceHasBeenSubresourceCount;
+    
+    auto totalVisited = std::max(m_originsVisitedMap.size(), 1U);
+    
+    targetStatistics.subresourceHasBeenSubresourceCountDividedByTotalNumberOfOriginsVisited = static_cast<double>(targetStatistics.subresourceHasBeenSubresourceCount) / totalVisited;
+
+    m_store->fireDataModificationHandler();
+}
+
 void ResourceLoadObserver::logUserInteraction(const Document& document)
 {
-    if (!Settings::resourceLoadStatisticsEnabled())
-        return;
+    ASSERT(document.page());
 
-    if (!m_store)
+    if (!shouldLog(document.page()))
         return;
 
-    bool needPrivacy = document.page() ? document.page()->usesEphemeralSession() : false;
-    if (needPrivacy)
+    auto& url = document.url();
+
+    if (url.isBlankURL() || url.isEmpty())
         return;
 
-    auto& statistics = m_store->ensureResourceStatisticsForPrimaryDomain(primaryDomain(document.url()));
+    auto& statistics = m_store->ensureResourceStatisticsForPrimaryDomain(primaryDomain(url));
     statistics.hadUserInteraction = true;
     m_store->fireDataModificationHandler();
 }
@@ -230,42 +277,22 @@
     
 String ResourceLoadObserver::primaryDomain(const URL& url)
 {
+    String primaryDomain;
     String host = url.host();
-    Vector<String> hostSplitOnDot;
-    
-    host.split('.', false, hostSplitOnDot);
-
-    String primaryDomain;
-    if (host.isNull())
+    if (host.isNull() || host.isEmpty())
         primaryDomain = "nullOrigin";
-    else if (hostSplitOnDot.size() < 3)
-        primaryDomain = host;
+#if ENABLE(PUBLIC_SUFFIX_LIST)
     else {
-        // Skip TLD and then up to two domains smaller than 4 characters
-        int primaryDomainCutOffIndex = hostSplitOnDot.size() - 2;
-
-        // Start with TLD as a given part
-        size_t numberOfParts = 1;
-        for (; primaryDomainCutOffIndex >= 0; --primaryDomainCutOffIndex) {
-            ++numberOfParts;
-
-            // We have either a domain part that's 4 chars or longer, or 3 domain parts including TLD
-            if (hostSplitOnDot.at(primaryDomainCutOffIndex).length() >= 4 || numberOfParts >= 3)
-                break;
-        }
-
-        if (primaryDomainCutOffIndex < 0)
+        primaryDomain = topPrivatelyControlledDomain(host);
+        // We will have an empty string here if there is no TLD.
+        // Use the host in such case.
+        if (primaryDomain.isEmpty())
             primaryDomain = host;
-        else {
-            StringBuilder builder;
-            builder.append(hostSplitOnDot.at(primaryDomainCutOffIndex));
-            for (size_t j = primaryDomainCutOffIndex + 1; j < hostSplitOnDot.size(); ++j) {
-                builder.append('.');
-                builder.append(hostSplitOnDot[j]);
-            }
-            primaryDomain = builder.toString();
-        }
     }
+#else
+    else
+        primaryDomain = host;
+#endif
 
     return primaryDomain;
 }
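
The new primaryDomain logic can be sketched as follows. This is a standalone illustration, not the WebKit code: sampleSuffixes is a hypothetical four-entry stand-in for the Public Suffix List, sampleTopPrivatelyControlledDomain is a simplified lookup that ignores the wildcard and exception rules of the real list, and samplePrimaryDomain mirrors the fallbacks in the diff (empty host gives "nullOrigin", no match gives the host itself).

```cpp
#include <string>
#include <unordered_set>

// Hypothetical, tiny stand-in for the Public Suffix List.
static const std::unordered_set<std::string>& sampleSuffixes()
{
    static const std::unordered_set<std::string> suffixes { "com", "org", "uk", "co.uk" };
    return suffixes;
}

// Returns the longest listed suffix plus one more label, or an empty string
// when the host is itself a suffix or ends in no listed suffix.
std::string sampleTopPrivatelyControlledDomain(const std::string& host)
{
    if (sampleSuffixes().count(host))
        return {};

    // Scan label boundaries left to right; the first remainder found in the
    // set is the longest matching suffix.
    std::size_t labelStart = 0;
    while (true) {
        std::size_t dot = host.find('.', labelStart);
        if (dot == std::string::npos)
            return {};
        if (sampleSuffixes().count(host.substr(dot + 1)))
            return host.substr(labelStart);
        labelStart = dot + 1;
    }
}

// Mirrors the fallbacks of ResourceLoadObserver::primaryDomain in the diff.
std::string samplePrimaryDomain(const std::string& host)
{
    if (host.empty())
        return "nullOrigin";
    std::string domain = sampleTopPrivatelyControlledDomain(host);
    return domain.empty() ? host : domain;
}
```

Under these assumptions, www.example.co.uk maps to example.co.uk, and a host without a listed suffix, such as localhost, maps to itself.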

Modified: trunk/Source/WebCore/loader/ResourceLoadObserver.h (206868 => 206869)


--- trunk/Source/WebCore/loader/ResourceLoadObserver.h	2016-10-06 17:27:23 UTC (rev 206868)
+++ trunk/Source/WebCore/loader/ResourceLoadObserver.h	2016-10-06 17:40:12 UTC (rev 206869)
@@ -34,6 +34,7 @@
 
 class Document;
 class Frame;
+class Page;
 class ResourceRequest;
 class ResourceResponse;
 class URL;
@@ -47,6 +48,7 @@
     
     void logFrameNavigation(const Frame& frame, const Frame& topFrame, const ResourceRequest& newRequest, const ResourceResponse& redirectResponse);
     void logSubresourceLoading(const Frame*, const ResourceRequest& newRequest, const ResourceResponse& redirectResponse);
+    void logWebSocketLoading(const Frame*, const URL&);
 
     void logUserInteraction(const Document&);
     
@@ -55,6 +57,7 @@
     WEBCORE_EXPORT String statisticsForOrigin(const String&);
 
 private:
+    bool shouldLog(Page*);
     static String primaryDomain(const URL&);
 
     RefPtr<ResourceLoadStatisticsStore> m_store;

Modified: trunk/Source/WebCore/loader/ResourceLoadStatisticsStore.cpp (206868 => 206869)


--- trunk/Source/WebCore/loader/ResourceLoadStatisticsStore.cpp	2016-10-06 17:27:23 UTC (rev 206868)
+++ trunk/Source/WebCore/loader/ResourceLoadStatisticsStore.cpp	2016-10-06 17:40:12 UTC (rev 206869)
@@ -154,4 +154,14 @@
     for (auto& resourceStatistic : m_resourceStatisticsMap.values())
         processFunction(resourceStatistic);
 }
+
+Vector<String> ResourceLoadStatisticsStore::prevalentResourceDomainsWithoutUserInteraction()
+{
+    Vector<String> prevalentResources;
+    for (auto& resourceStatistic : m_resourceStatisticsMap.values()) {
+        if (resourceStatistic.isPrevalentResource && !resourceStatistic.hadUserInteraction)
+            prevalentResources.append(resourceStatistic.highLevelDomain);
+    }
+    return prevalentResources;
 }
+}

Modified: trunk/Source/WebCore/loader/ResourceLoadStatisticsStore.h (206868 => 206869)


--- trunk/Source/WebCore/loader/ResourceLoadStatisticsStore.h	2016-10-06 17:27:23 UTC (rev 206868)
+++ trunk/Source/WebCore/loader/ResourceLoadStatisticsStore.h	2016-10-06 17:40:12 UTC (rev 206869)
@@ -63,6 +63,8 @@
 
     WEBCORE_EXPORT bool hasEnoughDataForStatisticsProcessing();
     WEBCORE_EXPORT void processStatistics(std::function<void(ResourceLoadStatistics&)>&&);
+
+    WEBCORE_EXPORT Vector<String> prevalentResourceDomainsWithoutUserInteraction();
 private:
     ResourceLoadStatisticsStore() = default;
 

Modified: trunk/Source/WebCore/loader/SubresourceLoader.cpp (206868 => 206869)


--- trunk/Source/WebCore/loader/SubresourceLoader.cpp	2016-10-06 17:27:23 UTC (rev 206868)
+++ trunk/Source/WebCore/loader/SubresourceLoader.cpp	2016-10-06 17:40:12 UTC (rev 206869)
@@ -173,6 +173,9 @@
         return;
     }
 
+    if (newRequest.requester() != ResourceRequestBase::Requester::Main)
+        ResourceLoadObserver::sharedObserver().logSubresourceLoading(m_frame.get(), newRequest, redirectResponse);
+
     ASSERT(!newRequest.isNull());
     if (!redirectResponse.isNull()) {
         if (options().redirect != FetchOptions::Redirect::Follow) {
@@ -228,8 +231,6 @@
     ResourceLoader::willSendRequestInternal(newRequest, redirectResponse);
     if (newRequest.isNull())
         cancel();
-
-    ResourceLoadObserver::sharedObserver().logSubresourceLoading(m_frame.get(), newRequest, redirectResponse);
 }
 
 void SubresourceLoader::didSendData(unsigned long long bytesSent, unsigned long long totalBytesToBeSent)

Modified: trunk/Source/WebKit2/ChangeLog (206868 => 206869)


--- trunk/Source/WebKit2/ChangeLog	2016-10-06 17:27:23 UTC (rev 206868)
+++ trunk/Source/WebKit2/ChangeLog	2016-10-06 17:40:12 UTC (rev 206869)
@@ -1,3 +1,31 @@
+2016-10-06  John Wilander  <wilan...@apple.com>
+
+        Update Resource Load Statistics
+        https://bugs.webkit.org/show_bug.cgi?id=162811
+
+        Reviewed by Alex Christensen.
+
+        * UIProcess/WebResourceLoadStatisticsStore.cpp:
+        (WebKit::WebResourceLoadStatisticsStore::hasPrevalentResourceCharacteristics):
+            Switched to vector-based classification.
+        (WebKit::WebResourceLoadStatisticsStore::classifyResource):
+            Simplified logic and moved the split between has and has
+            no user interaction into ResourceLoadStatisticsStore.
+        (WebKit::WebResourceLoadStatisticsStore::clearDataRecords):
+            Added.
+        (WebKit::WebResourceLoadStatisticsStore::resourceLoadStatisticsUpdated):
+            Updated to make use of the new functions.
+        (WebKit::WebResourceLoadStatisticsStore::persistentStoragePath):
+            Removed stray whitespace.
+        (WebKit::WebResourceLoadStatisticsStore::writeEncoderToDisk):
+            Removed stray whitespace.
+        (WebKit::WebResourceLoadStatisticsStore::createDecoderFromDisk):
+            Removed stray whitespace.
+        (WebKit::hasPrevalentResourceCharacteristics): Deleted.
+        (WebKit::classifyPrevalentResources): Deleted.
+        * UIProcess/WebResourceLoadStatisticsStore.h:
+            Added member variables for clearing of data records.
+
 2016-10-06  Youenn Fablet  <you...@apple.com>
 
         [WK2] 304 revalidation on the network process does not update the validated response

Modified: trunk/Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.cpp (206868 => 206869)


--- trunk/Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.cpp	2016-10-06 17:27:23 UTC (rev 206868)
+++ trunk/Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.cpp	2016-10-06 17:40:12 UTC (rev 206869)
@@ -26,11 +26,18 @@
 #include "config.h"
 #include "WebResourceLoadStatisticsStore.h"
 
+#include "APIWebsiteDataStore.h"
 #include "WebProcessMessages.h"
 #include "WebProcessPool.h"
 #include "WebResourceLoadStatisticsStoreMessages.h"
+#include "WebsiteDataFetchOption.h"
+#include "WebsiteDataType.h"
 #include <WebCore/KeyedCoding.h>
 #include <WebCore/ResourceLoadStatistics.h>
+#include <wtf/CurrentTime.h>
+#include <wtf/MainThread.h>
+#include <wtf/MathExtras.h>
+#include <wtf/RunLoop.h>
 #include <wtf/threads/BinarySemaphore.h>
 
 using namespace WebCore;
@@ -37,13 +44,8 @@
 
 namespace WebKit {
 
-// Sub frame classification thresholds
-static const unsigned subframeUnderTopFrameOriginsThreshold = 3;
-    
-// Subresource classification thresholds
-static const unsigned subresourceUnderTopFrameOriginsThreshold = 5;
-static const unsigned subresourceHasBeenRedirectedFromToUniqueDomainsThreshold = 3;
-static const unsigned redirectedToOtherPrevalentResourceOriginsThreshold = 2;
+static const auto numberOfSecondsBetweenClearingDataRecords = 600;
+static const auto featureVectorLengthThreshold = 3;
 
 Ref<WebResourceLoadStatisticsStore> WebResourceLoadStatisticsStore::create(const String& resourceLoadStatisticsDirectory)
 {
@@ -51,7 +53,7 @@
 }
 
 WebResourceLoadStatisticsStore::WebResourceLoadStatisticsStore(const String& resourceLoadStatisticsDirectory)
-    : m_resourceStatisticsStore(WebCore::ResourceLoadStatisticsStore::create())
+    : m_resourceStatisticsStore(ResourceLoadStatisticsStore::create())
     , m_statisticsQueue(WorkQueue::create("WebResourceLoadStatisticsStore Process Data Queue"))
     , m_storagePath(resourceLoadStatisticsDirectory)
 {
@@ -61,37 +63,103 @@
 {
 }
 
-static inline bool hasPrevalentResourceCharacteristics(const ResourceLoadStatistics& resourceStatistic)
+bool WebResourceLoadStatisticsStore::hasPrevalentResourceCharacteristics(const ResourceLoadStatistics& resourceStatistic)
 {
-    return resourceStatistic.subframeUnderTopFrameOrigins.size() > subframeUnderTopFrameOriginsThreshold
-        || resourceStatistic.subresourceUnderTopFrameOrigins.size() > subresourceUnderTopFrameOriginsThreshold
-        || resourceStatistic.subresourceUniqueRedirectsTo.size() > subresourceHasBeenRedirectedFromToUniqueDomainsThreshold
-        || resourceStatistic.redirectedToOtherPrevalentResourceOrigins.size() > redirectedToOtherPrevalentResourceOriginsThreshold;
+    auto subresourceUnderTopFrameOriginsCount = resourceStatistic.subresourceUnderTopFrameOrigins.size();
+    auto subresourceUniqueRedirectsToCount = resourceStatistic.subresourceUniqueRedirectsTo.size();
+    auto subframeUnderTopFrameOriginsCount = resourceStatistic.subframeUnderTopFrameOrigins.size();
+    
+    if (!subresourceUnderTopFrameOriginsCount
+        && !subresourceUniqueRedirectsToCount
+        && !subframeUnderTopFrameOriginsCount)
+        return false;
+
+    if (subresourceUnderTopFrameOriginsCount > featureVectorLengthThreshold
+        || subresourceUniqueRedirectsToCount > featureVectorLengthThreshold
+        || subframeUnderTopFrameOriginsCount > featureVectorLengthThreshold)
+        return true;
+
+    // The resource is considered prevalent if the feature vector
+    // is longer than the threshold.
+    // Vector length for n dimensions is sqrt(a^2 + (...) + n^2).
+    double vectorLength = 0;
+    vectorLength += subresourceUnderTopFrameOriginsCount * subresourceUnderTopFrameOriginsCount;
+    vectorLength += subresourceUniqueRedirectsToCount * subresourceUniqueRedirectsToCount;
+    vectorLength += subframeUnderTopFrameOriginsCount * subframeUnderTopFrameOriginsCount;
+
+    ASSERT(vectorLength > 0);
+
+    return sqrt(vectorLength) > featureVectorLengthThreshold;
 }
     
-static inline void classifyPrevalentResources(ResourceLoadStatistics& resourceStatistic, Vector<String>& prevalentResources, Vector<String>& prevalentResourcesWithUserInteraction)
+void WebResourceLoadStatisticsStore::classifyResource(ResourceLoadStatistics& resourceStatistic)
 {
-    if (resourceStatistic.isPrevalentResource || hasPrevalentResourceCharacteristics(resourceStatistic)) {
+    if (!resourceStatistic.isPrevalentResource && hasPrevalentResourceCharacteristics(resourceStatistic)) {
         resourceStatistic.isPrevalentResource = true;
-        if (resourceStatistic.hadUserInteraction)
-            prevalentResourcesWithUserInteraction.append(resourceStatistic.highLevelDomain);
-        else
-            prevalentResources.append(resourceStatistic.highLevelDomain);
     }
 }
 
+void WebResourceLoadStatisticsStore::clearDataRecords()
+{
+    if (m_dataStoreClearPending)
+        return;
+
+    Vector<String> prevalentResourceDomains = coreStore().prevalentResourceDomainsWithoutUserInteraction();
+    if (!prevalentResourceDomains.size())
+        return;
+
+    double now = currentTime();
+    if (!m_lastTimeDataRecordsWereCleared) {
+        m_lastTimeDataRecordsWereCleared = now;
+        return;
+    }
+
+    if (now < (m_lastTimeDataRecordsWereCleared + numberOfSecondsBetweenClearingDataRecords))
+        return;
+
+    m_dataStoreClearPending = true;
+    m_lastTimeDataRecordsWereCleared = now;
+
+    // Switch to the main thread to get the default website data store
+    RunLoop::main().dispatch([prevalentResourceDomains = WTFMove(prevalentResourceDomains), this] () mutable {
+        auto& websiteDataStore = API::WebsiteDataStore::defaultDataStore()->websiteDataStore();
+
+        websiteDataStore.fetchData(WebsiteDataType::Cookies, { }, [prevalentResourceDomains = WTFMove(prevalentResourceDomains), this](auto websiteDataRecords) {
+            Vector<WebsiteDataRecord> dataRecords;
+            for (auto& websiteDataRecord : websiteDataRecords) {
+                for (auto& prevalentResourceDomain : prevalentResourceDomains) {
+                    if (websiteDataRecord.displayName.endsWithIgnoringASCIICase(prevalentResourceDomain)) {
+                        auto suffixStart = websiteDataRecord.displayName.length() - prevalentResourceDomain.length();
+                        if (!suffixStart || websiteDataRecord.displayName[suffixStart - 1] == '.')
+                            dataRecords.append(websiteDataRecord);
+                    }
+                }
+            }
+
+            if (!dataRecords.size()) {
+                m_dataStoreClearPending = false;
+                return;
+            }
+
+            auto& websiteDataStore = API::WebsiteDataStore::defaultDataStore()->websiteDataStore();
+            websiteDataStore.removeData(WebsiteDataType::Cookies, { WTFMove(dataRecords) }, [this] {
+                m_dataStoreClearPending = false;
+            });
+        });
+    });
+}
+
 void WebResourceLoadStatisticsStore::resourceLoadStatisticsUpdated(const Vector<WebCore::ResourceLoadStatistics>& origins)
 {
     coreStore().mergeStatistics(origins);
 
-    Vector<String> prevalentResources, prevalentResourcesWithUserInteraction;
     if (coreStore().hasEnoughDataForStatisticsProcessing()) {
-        coreStore().processStatistics([this, &prevalentResources, &prevalentResourcesWithUserInteraction] (ResourceLoadStatistics& resourceStatistic) {
-            classifyPrevalentResources(resourceStatistic, prevalentResources, prevalentResourcesWithUserInteraction);
+        coreStore().processStatistics([this] (ResourceLoadStatistics& resourceStatistic) {
+            classifyResource(resourceStatistic);
+            clearDataRecords();
         });
     }
 
-    // FIXME: Notify individual WebProcesses of prevalent domains using the two vectors populated by the classifier. <rdar://problem/24703099>
     auto encoder = coreStore().createEncoderFromData();
     
     writeEncoderToDisk(*encoder.get(), "full_browsing_session");
@@ -152,7 +220,7 @@
 {
     if (m_storagePath.isEmpty())
         return emptyString();
-    
+
     // TODO Decide what to call this file
     return pathByAppendingComponent(m_storagePath, label + "_resourceLog.plist");
 }
@@ -162,14 +230,14 @@
     RefPtr<SharedBuffer> rawData = encoder.finishEncoding();
     if (!rawData)
         return;
-    
+
     String resourceLog = persistentStoragePath(label);
     if (resourceLog.isEmpty())
         return;
-    
+
     if (!m_storagePath.isEmpty())
         makeAllDirectories(m_storagePath);
-    
+
     auto handle = openFile(resourceLog, OpenForWrite);
     if (!handle)
         return;
@@ -176,7 +244,7 @@
     
     int64_t writtenBytes = writeToFile(handle, rawData->data(), rawData->size());
     closeFile(handle);
-    
+
     if (writtenBytes != static_cast<int64_t>(rawData->size()))
         WTFLogAlways("WebResourceLoadStatisticsStore: We only wrote %d out of %d bytes to disk", static_cast<unsigned>(writtenBytes), rawData->size());
 }
@@ -186,11 +254,11 @@
     String resourceLog = persistentStoragePath(label);
     if (resourceLog.isEmpty())
         return nullptr;
-    
+
     RefPtr<SharedBuffer> rawData = SharedBuffer::createWithContentsOfFile(resourceLog);
     if (!rawData)
         return nullptr;
-    
+
     return KeyedDecoder::decoder(reinterpret_cast<const uint8_t*>(rawData->data()), rawData->size());
 }
 

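The record matching inside clearDataRecords can be sketched as follows. This is a standalone illustration using std::string rather than WTF::String; sampleEndsWithIgnoringASCIICase and sampleRecordMatchesDomain are hypothetical names, and the empty-domain guard is an addition of this sketch. A record matches when its display name ends with the domain and the match starts either at the beginning of the name or right after a dot, so that a name such as nottracker.example does not match tracker.example.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Case-insensitive (ASCII) suffix test.
bool sampleEndsWithIgnoringASCIICase(const std::string& text, const std::string& suffix)
{
    if (suffix.size() > text.size())
        return false;
    return std::equal(suffix.rbegin(), suffix.rend(), text.rbegin(), [](char a, char b) {
        return std::tolower(static_cast<unsigned char>(a)) == std::tolower(static_cast<unsigned char>(b));
    });
}

// True when displayName is the domain itself or a subdomain of it.
bool sampleRecordMatchesDomain(const std::string& displayName, const std::string& domain)
{
    // Guard added in this sketch: an empty domain matches nothing.
    if (domain.empty() || !sampleEndsWithIgnoringASCIICase(displayName, domain))
        return false;

    // Require a label boundary immediately before the matched suffix.
    std::size_t suffixStart = displayName.size() - domain.size();
    return !suffixStart || displayName[suffixStart - 1] == '.';
}
```
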
Modified: trunk/Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.h (206868 => 206869)


--- trunk/Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.h	2016-10-06 17:27:23 UTC (rev 206868)
+++ trunk/Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.h	2016-10-06 17:40:12 UTC (rev 206869)
@@ -28,6 +28,7 @@
 
 #include "APIObject.h"
 #include "Connection.h"
+#include "WebsiteDataRecord.h"
 #include <WebCore/ResourceLoadStatisticsStore.h>
 #include <wtf/Vector.h>
 #include <wtf/text/WTFString.h>
@@ -71,6 +72,10 @@
 private:
     explicit WebResourceLoadStatisticsStore(const String&);
 
+    bool hasPrevalentResourceCharacteristics(const WebCore::ResourceLoadStatistics&);
+    void classifyResource(WebCore::ResourceLoadStatistics&);
+    void clearDataRecords();
+
     String persistentStoragePath(const String& label) const;
 
     // IPC::MessageReceiver
@@ -83,6 +88,9 @@
     Ref<WTF::WorkQueue> m_statisticsQueue;
     String m_storagePath;
     bool m_resourceLoadStatisticsEnabled { false };
+
+    double m_lastTimeDataRecordsWereCleared { 0 };
+    bool m_dataStoreClearPending { false };
 };
 
 } // namespace WebKit
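
The two new member variables drive a simple throttle, which can be sketched as follows. This is a standalone illustration, not the WebKit code: SampleClearThrottle is a hypothetical type, time is passed in as seconds instead of being read from currentTime(), and the default interval of 600 seconds is taken from numberOfSecondsBetweenClearingDataRecords in the diff. The first eligible call only records the time; later calls start a clearing pass at most once per interval and never while a previous pass is still pending.

```cpp
// Hypothetical model of m_lastTimeDataRecordsWereCleared and m_dataStoreClearPending.
struct SampleClearThrottle {
    double lastTimeCleared = 0;
    bool clearPending = false;

    // Returns true when a clearing pass should start at time `now` (in seconds).
    bool shouldStart(double now, bool haveDomainsToClear, double interval = 600)
    {
        if (clearPending || !haveDomainsToClear)
            return false;

        // The first eligible call only arms the timer.
        if (!lastTimeCleared) {
            lastTimeCleared = now;
            return false;
        }

        if (now < lastTimeCleared + interval)
            return false;

        clearPending = true;
        lastTimeCleared = now;
        return true;
    }

    // Called when the asynchronous removal has completed.
    void didFinish() { clearPending = false; }
};
```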

Modified: trunk/Tools/ChangeLog (206868 => 206869)


--- trunk/Tools/ChangeLog	2016-10-06 17:27:23 UTC (rev 206868)
+++ trunk/Tools/ChangeLog	2016-10-06 17:40:12 UTC (rev 206869)
@@ -1,3 +1,13 @@
+2016-10-06  John Wilander  <wilan...@apple.com>
+
+        Update Resource Load Statistics
+        https://bugs.webkit.org/show_bug.cgi?id=162811
+
+        Reviewed by Alex Christensen.
+
+        * TestWebKitAPI/Tests/mac/PublicSuffix.mm:
+            Change from USE(PUBLIC_SUFFIX_LIST) to ENABLE(PUBLIC_SUFFIX_LIST)
+
 2016-10-05  Philippe Normand  <pnorm...@igalia.com>
 
         [GStreamer][OWR] GL rendering support

Modified: trunk/Tools/TestWebKitAPI/Tests/mac/PublicSuffix.mm (206868 => 206869)


--- trunk/Tools/TestWebKitAPI/Tests/mac/PublicSuffix.mm	2016-10-06 17:27:23 UTC (rev 206868)
+++ trunk/Tools/TestWebKitAPI/Tests/mac/PublicSuffix.mm	2016-10-06 17:40:12 UTC (rev 206869)
@@ -25,7 +25,7 @@
 
 #include "config.h"
 
-#if USE(PUBLIC_SUFFIX_LIST)
+#if ENABLE(PUBLIC_SUFFIX_LIST)
 
 #include "WTFStringUtilities.h"
 #include <WebCore/PublicSuffix.h>