Glad you brought up compaction here - I think there would be a significant benefit to moving compaction to direct I/O.

On 2023/10/16 16:14:28 Benedict wrote:
> I have some plans to (eventually) use the commit log as memtable payload
> storage (ie memtables would reference the commit log entries directly,
> storing only indexing info), and to back first level of sstables by reference
> to commit log entries. This will permit us to deliver not only much bigger
> memtables (cutting compaction throughput requirements by the ratio of size
> increase - so pretty dramatically), and faster flushing (so better behaviour
> ling write bursts), but also a fairly cheap and simple way to support MVCC -
> which will be helpful for transaction throughput.
>
> There is also a new commit log (“journal”) coming with Accord, that the rest
> of C* may or may not transition to.
>
> I only say this because this makes the utility of direct IO for commit log
> suspect, as we will be reading from the files as a matter of course should
> this go ahead; and we may end up relying on a different commit log
> implementation before long anyway.
>
> This is obviously a big suggestion and is not guaranteed to transpire, and
> probably won’t within the next year or so, but it should perhaps form some
> minimal part of any calculus. If the patch is otherwise simple and beneficial
> I don’t have anything against it, and the use of direct IO could well be of
> benefit eg in compaction - and also in future if we manage to bring a page
> management in process. So laying foundations there could be of benefit, even
> if the commit log eventually does not use it.
>
> > On 16 Oct 2023, at 17:00, Jon Haddad <rustyrazorbl...@apache.org> wrote:
> >
> > I haven't looked at the patch, but at a high level, defaulting to direct
> > I/O for commit logs makes a lot of sense to me.
> >
> >> On 2023/10/16 06:34:05 "Pawar, Amit" wrote:
> >> [Public]
> >>
> >> Hi,
> >>
> >> CommitLog uses mmap (memory mapped ) segments by default. Direct-IO
> >> feature is proposed through new PR[1] to improve the CommitLog IO speed.
> >> Enabling this by default could be useful feature to address IO bottleneck
> >> seen during peak load.
> >>
> >> Need your input regarding changing this default. Please suggest.
> >>
> >> https://issues.apache.org/jira/browse/CASSANDRA-18464
> >>
> >> thanks,
> >> Amit Pawar
> >>
> >> [1] - https://github.com/apache/cassandra/pull/2777
> >>
>