On Fri, Jan 24, 2025 at 6:21 AM Kampanakis, Panos <kpanos=
[email protected]> wrote:

> Thx Luke, Bas.
>
>
>
> Resurrecting this old thread regarding web connection data sizes to share
> some more data I presented at a conference last week. You two know about
> this, but I thought it could benefit future group discussions.
>
>
>
> Slides 14-19 in
> https://pkic.org/events/2025/pqc-conference-austin-us/THU_BREAKOUT_1130_Panos-Kampanakis_How-much-will-ML-DSA-affect-Webpage-Metrics.pdf#page=14
> investigate the connection data sizes of some popular web pages. The investigation
> showed that the pages I focused on pull down large amounts of data, but
> they also include a bunch of slim connections delivering other content like
> tracking, ads, HTTP 304s (browser cache revalidations), or small elements. I believe
> this generally matches what you shared in your blog.
>

We certainly see small elements, but we deliver far less of the typical
small content, like ads, than the web on average. There is room for further
investigation. At this point I think it makes the most sense to test drop-in
ML-DSA with a browser instead of guessing.


> There is a caveat that this investigation covered a small set of popular
> pages, so we can’t extrapolate that they represent the whole web. But if
> they do, then the performance of the conns transferring the “web content”
> won’t suffer as much; the small conns doing the other things will suffer.
> Will these small conns affect web metrics? Intuitively, probably not so
> much, but without testing no one should be sure.
>
>
>
> The earlier slides of the preso include some results from popular pages
> and estimate the impact of ML-DSA on web user metrics like TTFB, FCP, LCP
> and Document Complete times. They show that the web metrics suffer much
> less than the handshake, mainly because web pages usually spend more time
> doing other things, like downloading and rendering large amounts of HTML,
> CSS, JavaScript, images, JSON, etc., than on TLS handshakes.
>
>
>
>
>
>
>
> *From:* Luke Valenta <[email protected]>
> *Sent:* Tuesday, November 19, 2024 3:19 PM
> *To:* Kampanakis, Panos <[email protected]>
> *Cc:* Bas Westerbaan <[email protected]>; <[email protected]>
> <[email protected]>; [email protected]
> *Subject:* [EXTERNAL] [Pqc] Re: [TLS] Re: Bytes server -> client
>
>
>
>
>
>
> Hi Panos,
>
>
>
> Here are some more details on what we see in connections to Cloudflare.
>
>
>
> To validate this theory, what would your data show if you queried for the
> % of conns that transfer <0.5KB or <1KB? If that is a lot, then there are
> many small conns that skew the median downwards. Or what if you ran the
> query to exclude the very heavy conns and the very light ones (HTTP 301,
> 302, etc.)? For example, if you ran a report on the conns transferring
> 1KB < data < 80th percentile, what would the median be for that? That would
> tell us whether the too-small and too-big conns skew the median.
>
>
>
> For non-resumed QUIC connections with at least one request where we
> transfer (including TLS data) between 4kB and 80kB (the 10th and 80th
> percentiles of the distribution, respectively), the median bytes
> transferred is 6.5kB and average is 13.8kB. In other words, less than 10%
> of non-resumed QUIC connections with at least one request transfer less
> than 4kB, so it does not appear to be the case that a large number of small
> requests are skewing the median downwards. Ignoring the top 20% of
> connections in terms of bytes transferred shifts the average down
> significantly, which supports the idea that a relatively small number of
> large requests are skewing the average upwards.
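>
> For reference, here is a rough sketch (in Python, on made-up numbers, not
> our actual query) of the kind of trimmed statistics described above:
>
>     # Hypothetical per-connection server->client byte counts for
>     # non-resumed QUIC connections with at least one request.
>     import statistics
>
>     def trimmed_stats(bytes_per_conn, low=4_000, high=80_000):
>         """Share of conns below 'low', and median/mean within [low, high]."""
>         in_band = [b for b in bytes_per_conn if low <= b <= high]
>         frac_below = sum(b < low for b in bytes_per_conn) / len(bytes_per_conn)
>         return frac_below, statistics.median(in_band), statistics.mean(in_band)
>
>     # Example with made-up numbers:
>     sample = [800, 3_000, 5_000, 6_500, 9_000, 20_000, 70_000, 2_000_000]
>     frac_below, med, avg = trimmed_stats(sample)
>     print(f"<4kB: {frac_below:.0%}, median in band: {med}, mean in band: {avg:.0f}")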
>
>
>
> Let me know if I can clarify further! This is just what we see today, but
> it'll be great to get more measurements to understand what the real impact
> is on end-users.
>
>
>
> Best,
>
> Luke
>
>
>
> On Thu, Nov 7, 2024 at 10:54 AM Kampanakis, Panos <kpanos=
> [email protected]> wrote:
>
> Hi Bas,
>
>
>
> That is interesting and surprising, thank you.
>
>
>
> I am mostly interested in the ~63% of non-resumed sessions that would be
> affected by 10-15KB of auth data. It looks like your data showed that each
> QUIC conn transfers about 4.7KB, which is very surprising to me. It seems
> very low.
>
>
>
> In the experiments I am running here against top web servers, I see lots of
> conns which transfer hundreds of KB, even over QUIC in cached browser
> sessions. This aligns with the average from your blog, 551KB * 0.6 = ~330KB,
> but not with the 4.7KB median. Hundreds of KB also aligns with the p50 page
> weight / conns per page in
> https://httparchive.org/reports/page-weight?lens=top1k&start=2024_05_01&end=latest&view=list
> . Of course browsers cache a lot of things like JavaScript, images, etc., so
> they don’t transfer all resources, which could explain the median. But
> still, based on anecdotal experience looking at top visited servers, I am
> noticing many small transfers and just a few that transfer the larger HTML,
> CSS, etc. on every page, even in cached browser sessions.
>
>
>
> I am curious about the 4.7KB and the 15.8% of conns transferring >100KB in
> your blog. Like you say in your blog, if the 95th percentile includes very
> large transfers, that would skew the difference between the median and the
> average. But I am wondering if there is another explanation. In my
> experiments I see a lot of 301 and 302 redirects which transfer minimal
> data. Some pages have a lot of those. If you have many of them, then your
> median will get skewed as it fills up with very small data transfers that
> basically don’t do anything. In essence, we could have 10 pages where each
> page has one conn transferring 100KB for one of its resources and another
> nine conns that are HTTP redirects or transfer 0.1KB. That would make us
> think that 90% of the conns will be blazing fast, but the 100KB resource on
> each page will still take a good amount of time on a slow network.
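>
> To make that concrete, here is a toy calculation (Python, made-up numbers)
> showing how the tiny conns drag the median down while the one big conn
> still dominates the page:
>
>     # Toy page: one conn fetching a 100KB resource, nine redirect-ish conns
>     # transferring ~0.1KB each. The median conn looks tiny, but the page
>     # still waits on the 100KB transfer.
>     import statistics
>
>     conns_kb = [100] + [0.1] * 9          # per-conn server->client KB
>     print(statistics.median(conns_kb))    # 0.1 -> "most conns are tiny"
>     print(statistics.mean(conns_kb))      # ~10.1 -> pulled up by the 100KB conn
>
>     # On a slow ~1 Mbps link (~125 KB/s), the big conn alone takes ~0.8s,
>     # ignoring handshakes and RTTs:
>     print(100 / 125)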
>
>
>
> To validate this theory, what would your data show if you queried for the
> % of conns that transfer <0.5KB or <1KB? If that is a lot, then there are
> many small conns that skew the median downwards. Or what if you ran the
> query to exclude the very heavy conns and the very light ones (HTTP 301,
> 302, etc.)? For example, if you ran a report on the conns transferring
> 1KB < data < 80th percentile, what would the median be for that? That would
> tell us whether the too-small and too-big conns skew the median.
>
>
>
> Btw, I am also curious about
>
> > Chrome is more cautious and set 10% as their target for maximum TLS
> handshake time regression.
>
> Is this public somewhere? There is no immediate link between the TLS
> handshake and any of the Core Web Vitals metrics or the CrUX metrics other
> than TTFB. Even for TTFB, a 10% slowdown in the handshake does not mean a
> 10% slowdown in TTFB; TTFB is affected much less. I am wondering if we
> should start expecting the TLS handshake to slowly become a tracked web
> performance metric.
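>
> As a rough model of that last point (hypothetical numbers, just to show the
> proportions): if the handshake is only one component of TTFB, a 10%
> handshake slowdown moves TTFB by much less than 10%:
>
>     # TTFB = connection setup (incl. TLS handshake) + request/response time.
>     handshake_ms = 80      # assumed TLS handshake share of TTFB
>     other_ms = 220         # assumed DNS + transport setup + server time + first byte
>     ttfb_ms = handshake_ms + other_ms
>
>     slowed_ttfb_ms = handshake_ms * 1.10 + other_ms
>     print((slowed_ttfb_ms / ttfb_ms - 1) * 100)   # ~2.7% TTFB regression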
>
>
>
>
>
> *From:* Bas Westerbaan <[email protected]>
> *Sent:* Thursday, November 7, 2024 9:07 AM
> *To:* <[email protected]> <[email protected]>; [email protected]
> *Subject:* [EXTERNAL] [TLS] Bytes server -> client
>
>
>
>
>
>
> Hi all,
>
>
>
> Just wanted to highlight a blog post we just published.
> https://blog.cloudflare.com/another-look-at-pq-signatures/  At the end we
> share some statistics that may be of interest:
>
>
>
> On average, around 15 million TLS connections are established with
> Cloudflare per second. Upgrading each to ML-DSA would take 1.8Tbps, which
> is 0.6% of our current total network capacity. No problem so far. The
> question is how these extra bytes affect performance.
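>
> (Back-of-the-envelope for that figure, assuming roughly 15kB of extra
> certificate data per connection for a drop-in ML-DSA chain; the exact
> per-connection overhead depends on the chain:)
>
>     conns_per_sec = 15e6      # average TLS connections per second
>     extra_bytes = 15_000      # assumed extra bytes per connection for ML-DSA certs
>     extra_bps = conns_per_sec * extra_bytes * 8
>     print(extra_bps / 1e12)   # ~1.8 Tbps
>     print(1.8 / 0.006)        # implies ~300 Tbps of total network capacity
>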
> Back in 2021, we ran a large-scale experiment to measure the impact of big
> post-quantum certificate chains on connections to Cloudflare’s network over
> the open Internet. There were two important results. First, we saw a steep
> increase in the rate of client and middlebox failures when we added more
> than 10kB to existing certificate chains. Secondly, when adding less than
> 9kB, the slowdown in TLS handshake time would be approximately 15%. We felt
> the latter was workable, but far from ideal: such a slowdown is noticeable
> and people might hold off deploying post-quantum certificates until it’s
> too late.
>
>
>
> Chrome is more cautious and set 10% as their target for maximum TLS
> handshake time regression. They report that deploying post-quantum key
> agreement has already incurred a 4% slowdown in TLS handshake time, for the
> extra 1.1kB from server-to-client and 1.2kB from client-to-server. That
> slowdown is proportionally larger than the 15% we found for 9kB, but that
> could be explained by slower upload speeds than download speeds.
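>
> (Rough per-kB comparison, scaling the handshake slowdown linearly with the
> extra bytes:)
>
>     print(4 / (1.1 + 1.2))   # Chrome: ~1.74% slowdown per extra kB
>     print(15 / 9)            # our 2021 result: ~1.67% slowdown per extra kB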
>
>
> There has been pushback against the focus on TLS handshake times. One
> argument is that session resumption alleviates the need for sending the
> certificates again. A second argument is that the data required to visit a
> typical website dwarfs the additional bytes for post-quantum certificates.
> One example is this 2024 publication, where Amazon researchers have
> simulated the impact of large post-quantum certificates on data-heavy TLS
> connections. They argue that typical connections transfer multiple requests
> and hundreds of kilobytes, and for those the TLS handshake slowdown
> disappears in the margin.
>
>
>
> Are session resumption and hundreds of kilobytes over a connection typical
> though? We’d like to share what we see. We focus on QUIC connections, which
> are likely initiated by browsers or browser-like clients. Of all QUIC
> connections with Cloudflare that carry at least one HTTP request, 37% are
> resumptions, meaning that key material from a previous TLS connection is
> reused, avoiding the need to transmit certificates. The median number of
> bytes transferred from server-to-client over a resumed QUIC connection is
> 4.4kB, while the average is 395kB. For non-resumptions the median is 7.8kB
> and average is 551kB. This vast difference between median and average
> indicates that a small fraction of data-heavy connections skew the average.
> In fact, only 15.8% of all QUIC connections transfer more than 100kB.
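>
> (For intuition, a toy heavy-tailed distribution reproduces this kind of gap
> between median and average; the numbers here are illustrative, not fit to
> our data:)
>
>     import random, statistics
>     random.seed(0)
>     # Most draws are small, a few are huge, so the mean sits far above the median.
>     conns_kb = [random.lognormvariate(2.0, 3.0) for _ in range(100_000)]
>     print(statistics.median(conns_kb))   # close to e^2 ~ 7.4
>     print(statistics.mean(conns_kb))     # far larger, dominated by the tail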
>
>
> The median certificate chain today (with compression) is 3.2kB. That means
> that, on more than half of the non-resumed QUIC connections, almost 40% of
> all data transferred from server to client is just for the certificates,
> and this only gets worse with post-quantum algorithms. For the majority of
> QUIC connections, using ML-DSA as a drop-in replacement for classical
> signatures would more than double the number of transmitted bytes over the
> lifetime of the connection.
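>
> (Concretely, for a median non-resumed connection, assuming a drop-in ML-DSA
> chain adds on the order of 15kB:)
>
>     median_conn_kb = 7.8     # server->client bytes, non-resumed QUIC median
>     cert_chain_kb = 3.2      # median compressed certificate chain today
>     print(cert_chain_kb / median_conn_kb)   # ~0.41: ~40% of the bytes are certs
>
>     mldsa_extra_kb = 15      # assumed extra bytes for a drop-in ML-DSA chain
>     print((median_conn_kb + mldsa_extra_kb) / median_conn_kb)   # ~2.9x the bytes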
>
>
>
> It sounds quite bad if the vast majority of data transferred for a typical
> connection is just for the post-quantum certificates. It’s still only a
> proxy for what is actually important: the effect on metrics relevant to the
> end-user, such as the browsing experience (e.g. largest contentful paint)
> and the amount of data those certificates take from a user’s monthly data
> cap. We will continue to investigate and get a better understanding of the
> impact.
>
>
>
> Best,
>
>
>
>  Bas
>
>
>
>
>
> --
>
> Luke Valenta
>
> Systems Engineer - Research
>
_______________________________________________
TLS mailing list -- [email protected]
To unsubscribe send an email to [email protected]
