Take a look at these, which should answer at least some of your questions.

http://ceph.com/community/new-luminous-bluestore/

http://ceph.com/planet/understanding-bluestore-cephs-new-storage-backend/

On Mon, Sep 11, 2017 at 8:45 PM, Richard Hesketh
<richard.hesk...@rd.bbc.co.uk> wrote:
> On 08/09/17 11:44, Richard Hesketh wrote:
>> Hi,
>>
>> Reading the ceph-users list I'm obviously seeing a lot of people talking 
>> about using bluestore now that Luminous has been released. I note that many 
>> users seem to be under the impression that they need separate block devices 
>> for the bluestore data block, the DB, and the WAL... even when they are 
>> going to put the DB and the WAL on the same device!
>>
>> As per the docs at 
>> http://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/ 
>> this is nonsense:
>>
>>> If there is only a small amount of fast storage available (e.g., less than 
>>> a gigabyte), we recommend using it as a WAL device. If there is more, 
>>> provisioning a DB
>>> device makes more sense. The BlueStore journal will always be placed on the 
>>> fastest device available, so using a DB device will provide the same 
>>> benefit that the WAL
>>> device would while also allowing additional metadata to be stored there (if 
>>> it will fix). [sic, I assume that should be "fit"]
>>
>> I understand that if you've got three speeds of storage available, there may 
>> be some sense to dividing these. For instance, if you've got lots of HDD, a 
>> bit of SSD, and a tiny NVMe available in the same host, data on HDD, DB on 
>> SSD and WAL on NVMe may be a sensible division of data. That's not the case 
>> for most of the examples I'm reading; they're talking about putting DB and 
>> WAL on the same block device, but in different partitions. There's even one 
>> example of someone suggesting to try partitioning a single SSD to put 
>> data/DB/WAL all in separate partitions!
>>
>> Are the docs wrong and/or am I missing something about optimal bluestore 
>> setup, or do people simply have the wrong end of the stick? I ask because 
>> I'm just going through switching all my OSDs over to Bluestore now and I've 
>> just been reusing the partitions I set up for journals on my SSDs as DB 
>> devices for Bluestore HDDs without specifying anything to do with the WAL, 
>> and I'd like to know sooner rather than later if I'm making some sort of 
>> horrible mistake.
>>
>> Rich
>
> Having had no explanatory reply so far, I'll ask further...
>
> I have been continuing to update my OSDs and so far the performance offered 
> by bluestore has been somewhat underwhelming. Recovery operations after 
> replacing the Filestore OSDs with Bluestore equivalents have been much slower 
> than expected, not even half the speed of recovery ops when I was upgrading 
> Filestore OSDs with larger disks a few months ago. This contributes to my 
> sense that I am doing something wrong.
>
> I've found that if I allow ceph-disk to partition my DB SSDs rather than 
> reusing the rather large journal partitions I originally created for 
> Filestore, it only creates very small 1GB partitions. Searching for 
> bluestore configuration parameters pointed me towards the 
> bluestore_block_db_size and bluestore_block_wal_size settings. 
> Unfortunately these settings are completely undocumented, so I'm not sure 
> what they actually do. In any event, in my running config I seem to have 
> the following default values:
>
> # ceph-conf --show-config | grep bluestore
> ...
> bluestore_block_create = true
> bluestore_block_db_create = false
> bluestore_block_db_path =
> bluestore_block_db_size = 0
> bluestore_block_path =
> bluestore_block_preallocate_file = false
> bluestore_block_size = 10737418240
> bluestore_block_wal_create = false
> bluestore_block_wal_path =
> bluestore_block_wal_size = 100663296
> ...
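>
> If I've understood other threads correctly (and I may well not have), 
> ceph-disk sizes the partitions it creates from those two settings, so 
> presumably setting them in ceph.conf before running prepare would get me 
> bigger partitions. Something like this is what I have in mind (the sizes 
> are just illustrative, not recommendations):
>
> [global]
> bluestore_block_db_size = 32212254720    # 30 GiB DB partition
> bluestore_block_wal_size = 2147483648    # 2 GiB WAL partition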
>
> I have been creating bluestore osds by:
>
> # re-using existing partitions for DB
> ceph-disk prepare --bluestore /dev/sdX --block.db /dev/sdY1 --osd-id Z
> or
> # letting ceph-disk partition DB, after zapping original partitions
> ceph-disk prepare --bluestore /dev/sdX --block.db /dev/sdY --osd-id Z
>
> Are these sane values? What does it mean that block_db_size is 0 - is it just 
> using the entire block device specified or not actually using it at all? Is 
> the WAL actually being placed on the DB block device? And is that 1GB default 
> really a sensible size for the DB partition?
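>
> For what it's worth, I assume I can at least see where things ended up by 
> looking at the symlinks in the OSD data directory, though I'd appreciate 
> confirmation that I'm reading them correctly:
>
> ls -l /var/lib/ceph/osd/ceph-Z/block*
> # I'd expect a block.db symlink pointing at the SSD partition, and no 
> # block.wal symlink if the WAL is sharing the DB device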
>
> Rich
>
>



-- 
Cheers,
Brad
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
