But that's exactly what filesystems and their own journals do already :-)

Jan
> On 14 Oct 2015, at 17:02, Somnath Roy <somnath....@sandisk.com> wrote:
>
> Jan,
> Journal helps FileStore to maintain the transactional integrity in the event
> of a crash. That's the main reason.
>
> Thanks & Regards
> Somnath
>
> -----Original Message-----
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Jan Schermer
> Sent: Wednesday, October 14, 2015 2:28 AM
> To: ceph-users@lists.ceph.com
> Subject: [ceph-users] Ceph journal - isn't it a bit redundant sometimes?
>
> Hi,
> I've been thinking about this for a while now - does Ceph really need a
> journal? Filesystems are already pretty good at committing data to disk when
> asked (and much faster too), we have external journals in XFS and Ext4...
> In a scenario where a client does an ordinary write, there's no need to flush
> it anywhere (the app didn't ask for it) so it ends up in pagecache and gets
> committed eventually.
> If a client asks for the data to be flushed then fdatasync/fsync on the
> filestore object takes care of that, including ordering and stuff.
> For reads, you just read from filestore (no need to differentiate between
> filestore/journal) - pagecache gives you the right version already.
>
> Or is the journal there to achieve some tiering for writes when running
> spindles with SSDs? This is IMO the only thing ordinary filesystems don't do
> out of the box even when the filesystem journal is put on SSD - the data get
> flushed to spindle whenever fsync-ed (even with data=journal). But in reality,
> most of the data will hit the spindle either way and when you run with SSDs it
> will always be much slower. And even for tiering - there are already many
> options (bcache, flashcache or even ZFS L2ARC) that are much more performant
> and proven stable. I think the fact that people have a need to combine Ceph
> with stuff like that already proves the point.
>
> So a very interesting scenario would be to disable the Ceph journal and at
> most use data=journal on ext4. The complexity of the data path would drop
> significantly, latencies decrease, CPU time is saved...
> I just feel that Ceph has lots of unnecessary complexity inside that
> duplicates what filesystems (and pagecache...) have been doing for a while
> now without eating most of our CPU cores - why don't we use that? Is it
> possible to disable the journal completely?
>
> Did I miss something that makes the journal essential?
>
> Jan
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
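
For illustration only, the journal-less write path described in the quoted message (ordinary writes land in pagecache, a flush request becomes fdatasync/fsync on the object file, reads come straight from the file) could be sketched roughly like this in Python. The names write_object and read_object are hypothetical and are not Ceph or FileStore functions. The sketch handles a single object file, so it does not address the multi-object transactional integrity Somnath mentions.

```python
import os
import tempfile

# Assumption: prefer fdatasync where the platform has it, else fall back to fsync.
_sync = getattr(os, "fdatasync", os.fsync)


def write_object(path, data, durable=False):
    """Write data to one object file; flush to stable storage only if asked."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        if durable:
            # The client asked for a flush: push the file data to disk now.
            _sync(fd)
        # Otherwise the data stays in pagecache and is committed eventually.
    finally:
        os.close(fd)


def read_object(path):
    """Read the object back; pagecache already serves the latest version."""
    with open(path, "rb") as f:
        return f.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        obj = os.path.join(d, "object_0")
        write_object(obj, b"ordinary write")            # no flush requested
        write_object(obj, b"flushed write", durable=True)
        print(read_object(obj))
```

Without durable=True a crash may lose the write, which is the same guarantee an application gets from any buffered write it never asked to flush.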