Jan, I believe the block device (vs. filesystem) OSD layout is addressed by
the NewStore/BlueStore work:

http://tracker.ceph.com/projects/ceph/wiki/NewStore_(new_osd_backend)

--
Alex Gorbachev
Storcium

On Thu, Jan 28, 2016 at 4:32 PM, Jan Schermer <j...@schermer.cz> wrote:

> You can't run a Ceph OSD without a journal; the journal is always there.
> If you don't have a journal partition, there's a "journal" file on the
> OSD filesystem that does the same thing. If it's a partition, that file
> turns into a symlink.
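>
> A minimal sketch of how you can check which layout a given OSD is using
> (this assumes the stock data path and a hypothetical OSD id of 0):
>
>     import os
>
>     # Default OSD data dir; "ceph-0" is a hypothetical OSD id.
>     journal = "/var/lib/ceph/osd/ceph-0/journal"
>
>     if os.path.islink(journal):
>         print("journal -> separate device:", os.readlink(journal))
>     else:
>         print("journal is a plain file, %d bytes" % os.path.getsize(journal))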
>
> You will always be better off with the journal on a separate partition,
> because of the way the writeback cache in Linux works (someone correct me
> if I'm wrong).
> The journal needs to flush to disk quite often, and Linux is not always
> able to flush only the journal data. You can't defer metadata flushing
> forever, and calling fsync() flushes all the other dirty data as well.
> ext2/3/4 also flushes data to the filesystem periodically (every 5 s, I
> think?), which momentarily sends the journal's latency through the roof.
> (I'll leave researching how exactly XFS does it to those who care about
> that "filesystem'o'thing".)
>
> P.S. I feel very strongly that this whole concept is fundamentally broken.
> We already have a journal for the filesystem, which is time-proven, well
> behaved and, above all, fast. Instead there's this reinvented wheel which
> supposedly does it better in userspace while not really avoiding the
> filesystem journal either. It would perhaps make sense if the OSD stored
> data on a block device directly, avoiding the filesystem altogether. But
> it would still do the same bloody thing, and (no disrespect) ext4 does
> this better than Ceph ever will.
>
>
> On 28 Jan 2016, at 20:01, Tyler Bishop <tyler.bis...@beyondhosting.net>
> wrote:
>
> This is an interesting topic that I've been waiting for.
>
> Right now we run the journal as a partition on the data disk. I've built
> drives without journals, and write performance seems okay, but random I/O
> performance is poor compared to what it should be.
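>
> A quick way to put a number on that (a minimal sketch; the test-file
> path, the 1 GiB size and the 1000-sample count are arbitrary
> assumptions):
>
>     import os, random, time
>
>     PATH, SIZE, BS = "/var/lib/ceph/testfile", 1 << 30, 4096  # hypothetical
>
>     # O_DSYNC makes every write durable, like a journal write.
>     fd = os.open(PATH, os.O_WRONLY | os.O_CREAT | os.O_DSYNC, 0o644)
>     os.ftruncate(fd, SIZE)
>     lat = []
>     for _ in range(1000):
>         off = random.randrange(0, SIZE // BS) * BS  # 4 KiB-aligned offset
>         t0 = time.monotonic()
>         os.pwrite(fd, os.urandom(BS), off)
>         lat.append(time.monotonic() - t0)
>     os.close(fd)
>     lat.sort()
>     print("p50 %.2f ms  p99 %.2f ms"
>           % (lat[len(lat) // 2] * 1e3, lat[int(len(lat) * 0.99)] * 1e3))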
>
>
> Tyler Bishop, Chief Technical Officer
> 513-299-7108 x10
> tyler.bis...@beyondhosting.net
>
>
>
> ------------------------------
> From: "Bill WONG" <wongahsh...@gmail.com>
> To: "ceph-users" <ceph-users@lists.ceph.com>
> Sent: Thursday, January 28, 2016 1:36:01 PM
> Subject: [ceph-users] SSD Journal
>
> Hi,
> I have tested an SSD journal with SATA disks and it works perfectly. Now
> I am testing a full-SSD Ceph cluster: with all SSDs, do I still need a
> separate SSD as the journal disk?
>
> [Assume I do not have PCIe SSD flash, which performs better than a normal
> SSD disk.]
>
> Please share some ideas on a full-SSD Ceph cluster... thank you!
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
