[
https://issues.apache.org/jira/browse/CASSANDRA-19979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jon Haddad updated CASSANDRA-19979:
-----------------------------------
Description:
CASSANDRA-15452 is introducing an internal buffer to compaction in order to
increase throughput while reducing IOPS. We can do the same thing with our
streaming slow path. There's a common misconception that the overhead comes
from serde, but I've found that on a lot of devices it is due to
our read patterns. This is most commonly found on non-NVMe drives, especially
disaggregated storage such as EBS where the latency is higher and more variable.
Attached is a perf profile showing the cost of streaming is dominated by pread.
The team I was working with found they could stream only 12MB per
streaming session. Reducing the number of read operations by using internal
buffered reads should improve this by at least 3-5x, and should also reduce
CPU overhead by making fewer system calls.
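To illustrate the buffered-read idea, here is a minimal sketch, not Cassandra code. `BufferedFileReader` and its methods are hypothetical names invented for this example; it only assumes a plain `java.nio.channels.FileChannel`. Small sequential reads are served from one large in-memory chunk, so each chunk costs a single positional read on the channel.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical helper, not part of Cassandra: serves small sequential reads
// from one large in-memory chunk so that each chunk costs one channel read.
final class BufferedFileReader {
    private final FileChannel channel;
    private final ByteBuffer buffer;
    private long nextFilePosition = 0; // file offset of the next chunk to load
    private int channelReads = 0;      // number of underlying read calls issued

    BufferedFileReader(FileChannel channel, int bufferSize) {
        this.channel = channel;
        this.buffer = ByteBuffer.allocate(bufferSize);
        this.buffer.limit(0); // start empty so the first read triggers a refill
    }

    /** Copies up to dst.length bytes into dst; returns -1 at end of file. */
    int read(byte[] dst) throws IOException {
        if (dst.length == 0) return 0;
        if (!buffer.hasRemaining() && !refill()) return -1;
        int n = Math.min(dst.length, buffer.remaining());
        buffer.get(dst, 0, n);
        return n;
    }

    /** Loads the next chunk with one positional read; false at end of file. */
    private boolean refill() throws IOException {
        buffer.clear();
        int n = channel.read(buffer, nextFilePosition);
        channelReads++;
        if (n <= 0) {
            buffer.limit(0);
            return false;
        }
        nextFilePosition += n;
        buffer.flip();
        return true;
    }

    int channelReads() {
        return channelReads;
    }
}
```

With a 16 KiB buffer, reading a 64 KiB file in 64-byte pieces should need only a handful of channel reads rather than roughly a thousand. The right buffer size for streaming is an open question and would need measuring on the target storage.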
I think we need to do a few things:
* Internal buffer on reads. Maybe something like adding `withBuffer()` on
ChannelProxy, which would wrap it with a BufferedReader
* Buffer writes to the network. Writing constant small values to the network
has a very high latency cost; we'd be better off flushing larger values less
often
* Move the blocking network part to a separate thread. We don't need to wait
on the network transfer in order to read more data off disk. Once we improve
the internal buffer on reads I think we'll see this as the next problem so
let's tackle it now.
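The second and third items can be sketched together. This is only an illustration under stated assumptions, not a proposed patch: `StreamingPipeline`, `copy`, and all parameters are hypothetical names, and plain `InputStream`/`OutputStream` stand in for the disk and network sides. A dedicated writer thread drains a bounded queue through a `BufferedOutputStream`, so small chunks are coalesced before they reach the sink and the reader keeps pulling data while the writer is blocked.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch, not Cassandra code: one thread reads chunks while a
// second thread performs the (blocking) writes to the sink.
final class StreamingPipeline {
    private static final byte[] END = new byte[0]; // sentinel marking end of input

    /** Copies source to sink; returns the number of bytes handed to the sink. */
    static long copy(InputStream source, OutputStream sink,
                     int chunkSize, int sendBufferSize, int queueDepth)
            throws IOException, InterruptedException {
        BlockingQueue<byte[]> queue = new ArrayBlockingQueue<>(queueDepth);
        long[] written = {0};
        IOException[] writerError = {null};

        Thread writer = new Thread(() -> {
            // Writes smaller than sendBufferSize are coalesced before reaching the sink.
            OutputStream out = new BufferedOutputStream(sink, sendBufferSize);
            try {
                for (byte[] chunk = queue.take(); chunk != END; chunk = queue.take()) {
                    if (writerError[0] != null) continue; // after a failure, just drain
                    try {
                        out.write(chunk);
                        written[0] += chunk.length;
                    } catch (IOException e) {
                        writerError[0] = e;
                    }
                }
                if (writerError[0] == null) out.flush();
            } catch (IOException e) {
                writerError[0] = e;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "stream-writer");
        writer.start();

        try {
            byte[] buf = new byte[chunkSize];
            int n;
            while ((n = source.read(buf)) != -1) {
                // put() blocks only when the writer is queueDepth chunks behind.
                if (n > 0) queue.put(Arrays.copyOf(buf, n));
            }
        } finally {
            queue.put(END);
            writer.join(); // join() also makes the writer's results visible here
        }
        if (writerError[0] != null) throw writerError[0];
        return written[0];
    }
}
```

The bounded queue caps memory use and applies back-pressure to the reader. The queue depth and buffer sizes here are placeholders that would need tuning against real disks and links.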
!image-2024-10-04-12-40-26-727.png!
was:
CASSANDRA-15452 is introducing an internal buffer to compaction in order to
increase throughput while reducing IOPS. We can do the same thing with our
streaming slow path. There's a common misconception that the overhead comes
from serde overhead, but I've found on a lot of devices the overhead is due to
our read patterns. This is most commonly found on non-NVMe drives, especially
disaggregated storage such as EBS where the latency is higher and more variable.
Attached is a perf profile showing the cost of streaming is dominated by pread.
The team I was working with was seeing they could stream only 12MB per
streaming session. Reducing the number of read operations by using internal
buffered reads should improve this by at least 3-5x, as well as reduce CPU
overhead from reduced system calls.
!image-2024-10-04-12-40-26-727.png!
> Use internal buffer on streaming slow path
> ------------------------------------------
>
> Key: CASSANDRA-19979
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19979
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Jon Haddad
> Priority: Normal
> Attachments: image-2024-10-04-12-40-26-727.png
>