[ceph-users] Default data to rbd that never written

2020-01-02 Thread 涂振南
Hello,
   I created an RBD image and wrote some data to it. After that, I cloned a
new image from it.
Then I compared the two images byte by byte, but they are not completely equal:
the data at offsets I never wrote to in the first image differs between them.
   Is this normal behaviour or a bug? If it is normal, is there any
configuration that makes the default (never-written) RBD data read back as all zeroes?
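
For reference, a minimal way to reproduce the comparison with the rbd CLI could
look like this (pool and image names are made up, and the step that actually
writes data to the mapped device is left out):

  rbd create rbd/parent --size 1024
  rbd map rbd/parent                      # write some data to the mapped device here
  rbd snap create rbd/parent@snap1
  rbd snap protect rbd/parent@snap1
  rbd clone rbd/parent@snap1 rbd/child
  rbd export rbd/parent parent.img
  rbd export rbd/child child.img
  cmp parent.img child.img                # byte-by-byte comparison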

Ceph version: 14.2.5


Re: [ceph-users] ceph luminous bluestore poor random write performances

2020-01-02 Thread Paul Emmerich
Why do you think that is slow? That's 4.5k write IOPS and 13.5k read IOPS at
the same time, which is amazing for a total of 30 HDDs.

It's actually way faster than you'd expect from 30 HDDs, so these DB devices
are really helping there :)
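
Those numbers follow directly from the fio summary quoted further down in the
thread, plus the 3x replication from the posted ceph.conf (osd pool default
size = 3):

  17.6 MiB/s write / 4 KiB per IO  ~  4,500 client write IOPS
  52.7 MiB/s read  / 4 KiB per IO  ~ 13,500 client read IOPS
  4,500 writes x 3 replicas        ~ 13,500 backend writes, i.e. roughly 450
                                     write IOPS per HDD on top of the reads if
                                     the spindles had to absorb them directly;
                                     well beyond what a bare HDD sustains.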


Paul

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90


On Thu, Jan 2, 2020 at 12:14 PM Ignazio Cassano wrote:

> Hi Stefan, using fio with bs=64k I got very good performances.
> I am not skilled on storage, but linux file system block size is 4k.
> So, How can I modify the configuration on ceph to obtain best performances
> with bs=4k ?
> Regards
> Ignazio
>
>
>
> Il giorno gio 2 gen 2020 alle ore 10:59 Stefan Kooman  ha
> scritto:
>
>> Quoting Ignazio Cassano (ignaziocass...@gmail.com):
>> > Hello All,
>> > I installed ceph luminous with openstack, an using fio in a virtual
>> machine
>> > I got slow random writes:
>> >
>> > fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1
>> --name=test
>> > --filename=random_read_write.fio --bs=4k --iodepth=64 --size=4G
>> > --readwrite=randrw --rwmixread=75
>>
>> Do you use virtio-scsi with a SCSI queue per virtual CPU core? How many
>> cores do you have? I suspect that the queue depth is hampering
>> throughput here ... but is throughput performance really interesting
>> anyway for your use case? Low latency generally matters most.
>>
>> Gr. Stefan
>>
>>
>> --
>> | BIT BV  https://www.bit.nl/Kamer van Koophandel 09090351
>> | GPG: 0xD14839C6   +31 318 648 688 / i...@bit.nl
>>


[ceph-users] Infiniband backend OSD communication

2020-01-02 Thread Nathan Stratton
I am working on upgrading my current Ethernet-only Ceph cluster to a
combined Ethernet frontend and InfiniBand backend. From my research I
understand that I need to set:

ms_cluster_type = async+rdma
ms_async_rdma_device_name = mlx4_0

What I don't understand is how Ceph knows how to reach each OSD over RDMA.
Do I have to run IPoIB on top of InfiniBand and use those addresses for the
OSDs?

Is there a way to use InfiniBand on the backend without IPoIB and just use
RDMA verbs?
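
For context, the settings in question live in ceph.conf; a minimal sketch,
assuming the cluster network is an IPoIB subnet used only for OSD addressing
while the data path itself goes over verbs (whether that addressing step can be
avoided entirely is exactly the open question here):

  [global]
  # options named above; values are illustrative
  ms_cluster_type = async+rdma
  ms_async_rdma_device_name = mlx4_0
  # hypothetical IPoIB subnet: OSD addresses in the OSDMap are still IP addresses
  cluster_network = 192.168.100.0/24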

><>
nathan stratton


[ceph-users] [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting

2020-01-02 Thread EDH - Manuel Rios
Hi,

Checking our monitor logs today, we see that RocksDB manual compaction triggers
every minute.

Is that normal?

2020-01-02 14:08:33.091 7f2b8acbe700  4 rocksdb: [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting
2020-01-02 14:08:33.091 7f2b8acbe700  4 rocksdb: [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting
2020-01-02 14:08:33.091 7f2b8acbe700  4 rocksdb: [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting
2020-01-02 14:08:33.091 7f2b8acbe700  4 rocksdb: [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting
2020-01-02 14:08:33.091 7f2b8acbe700  4 rocksdb: [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting

2020-01-02 14:09:15.193 7f2b8acbe700  4 rocksdb: [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting
2020-01-02 14:09:15.193 7f2b8acbe700  4 rocksdb: [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting
2020-01-02 14:09:15.193 7f2b8acbe700  4 rocksdb: [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting
2020-01-02 14:09:15.193 7f2b8acbe700  4 rocksdb: [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting
2020-01-02 14:09:15.193 7f2b8acbe700  4 rocksdb: [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting
2020-01-02 14:09:15.193 7f2b8acbe700  4 rocksdb: [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting
2020-01-02 14:09:15.193 7f2b8acbe700  4 rocksdb: [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting
2020-01-02 14:09:15.193 7f2b8acbe700  4 rocksdb: [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting
2020-01-02 14:09:15.193 7f2b8acbe700  4 rocksdb: [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting
2020-01-02 14:09:15.193 7f2b8acbe700  4 rocksdb: [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting
2020-01-02 14:09:15.193 7f2b8acbe700  4 rocksdb: [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting
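
A quick way to see how many of these lines fire per minute, assuming the
default mon log location:

  grep 'Manual compaction starting' /var/log/ceph/ceph-mon.*.log \
    | awk '{ print $1, substr($2, 1, 5) }' | sort | uniq -c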

Best Regards
Manuel



Re: [ceph-users] ceph luminous bluestore poor random write performances

2020-01-02 Thread Ignazio Cassano
Hi Stefan, using fio with bs=64k I get very good performance.
I am not an expert on storage, but the Linux file system block size is 4k.
So, how can I tune the Ceph configuration to get the best performance with
bs=4k as well?
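
For reference, the 64k run here is presumably the same command as quoted below
with only the block size changed, i.e. something like:

  fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test \
    --filename=random_read_write.fio --bs=64k --iodepth=64 --size=4G \
    --readwrite=randrw --rwmixread=75
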
Regards
Ignazio



On Thursday, 2 January 2020 at 10:59, Stefan Kooman wrote:

> Quoting Ignazio Cassano (ignaziocass...@gmail.com):
> > Hello All,
> > I installed ceph luminous with openstack, an using fio in a virtual
> machine
> > I got slow random writes:
> >
> > fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1
> --name=test
> > --filename=random_read_write.fio --bs=4k --iodepth=64 --size=4G
> > --readwrite=randrw --rwmixread=75
>
> Do you use virtio-scsi with a SCSI queue per virtual CPU core? How many
> cores do you have? I suspect that the queue depth is hampering
> throughput here ... but is throughput performance really interesting
> anyway for your use case? Low latency generally matters most.
>
> Gr. Stefan
>
>
> --
> | BIT BV  https://www.bit.nl/Kamer van Koophandel 09090351
> | GPG: 0xD14839C6   +31 318 648 688 / i...@bit.nl
>


Re: [ceph-users] ceph luminous bluestore poor random write performances

2020-01-02 Thread Sinan Polat
Hi,

Your performance is not that bad, is it? What performance do you expect?

I just ran the same test.
12 Node, SATA SSD Only:
   READ: bw=63.8MiB/s (66.9MB/s), 63.8MiB/s-63.8MiB/s (66.9MB/s-66.9MB/s), io=3070MiB (3219MB), run=48097-48097msec
  WRITE: bw=21.3MiB/s (22.4MB/s), 21.3MiB/s-21.3MiB/s (22.4MB/s-22.4MB/s), io=1026MiB (1076MB), run=48097-48097msec

6 Node, SAS Only:
   READ: bw=22.1MiB/s (23.2MB/s), 22.1MiB/s-22.1MiB/s (23.2MB/s-23.2MB/s), io=3070MiB (3219MB), run=138650-138650msec
  WRITE: bw=7578KiB/s (7759kB/s), 7578KiB/s-7578KiB/s (7759kB/s-7759kB/s), io=1026MiB (1076MB), run=138650-138650msec

This is OpenStack Queens with Ceph FileStore (Luminous).
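
At bs=4k those bandwidth figures work out to roughly:

  SATA SSD cluster: ~16,300 read IOPS and ~5,500 write IOPS
  SAS cluster:       ~5,700 read IOPS and ~1,900 write IOPS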

Kind regards,
Sinan Polat

> Op 2 januari 2020 om 10:59 schreef Stefan Kooman :
> 
> 
> Quoting Ignazio Cassano (ignaziocass...@gmail.com):
> > Hello All,
> > I installed ceph luminous with openstack, an using fio in a virtual machine
> > I got slow random writes:
> > 
> > fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test
> > --filename=random_read_write.fio --bs=4k --iodepth=64 --size=4G
> > --readwrite=randrw --rwmixread=75
> 
> Do you use virtio-scsi with a SCSI queue per virtual CPU core? How many
> cores do you have? I suspect that the queue depth is hampering
> throughput here ... but is throughput performance really interesting
> anyway for your use case? Low latency generally matters most.
> 
> Gr. Stefan
> 
> 
> -- 
> | BIT BV  https://www.bit.nl/Kamer van Koophandel 09090351
> | GPG: 0xD14839C6   +31 318 648 688 / i...@bit.nl


Re: [ceph-users] ceph luminous bluestore poor random write performances

2020-01-02 Thread Ignazio Cassano
Hi Stefan,
I did not understand your question, but that is probably my fault.
I am using virtio-scsi on my virtual machine.
The virtual machine has two cores.

Or do you mean cores on the OSD servers?


Regards
Ignazio

On Thursday, 2 January 2020 at 10:59, Stefan Kooman wrote:

> Quoting Ignazio Cassano (ignaziocass...@gmail.com):
> > Hello All,
> > I installed ceph luminous with openstack, an using fio in a virtual
> machine
> > I got slow random writes:
> >
> > fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1
> --name=test
> > --filename=random_read_write.fio --bs=4k --iodepth=64 --size=4G
> > --readwrite=randrw --rwmixread=75
>
> Do you use virtio-scsi with a SCSI queue per virtual CPU core? How many
> cores do you have? I suspect that the queue depth is hampering
> throughput here ... but is throughput performance really interesting
> anyway for your use case? Low latency generally matters most.
>
> Gr. Stefan
>
>
> --
> | BIT BV  https://www.bit.nl/Kamer van Koophandel 09090351
> | GPG: 0xD14839C6   +31 318 648 688 / i...@bit.nl
>


Re: [ceph-users] ceph luminous bluestore poor random write performances

2020-01-02 Thread Stefan Kooman
Quoting Ignazio Cassano (ignaziocass...@gmail.com):
> Hello All,
> I installed ceph luminous with openstack, an using fio in a virtual machine
> I got slow random writes:
> 
> fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test
> --filename=random_read_write.fio --bs=4k --iodepth=64 --size=4G
> --readwrite=randrw --rwmixread=75

Do you use virtio-scsi with a SCSI queue per virtual CPU core? How many
cores do you have? I suspect that the queue depth is hampering
throughput here ... but is throughput performance really interesting
anyway for your use case? Low latency generally matters most.
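
As a concrete example of what this refers to: with OpenStack, virtio-scsi is
usually selected via image properties, e.g. (the image name is made up, and how
many queues the guest ends up with depends on the nova/libvirt versions in use):

  openstack image set \
    --property hw_scsi_model=virtio-scsi \
    --property hw_disk_bus=scsi \
    my-guest-image
  # inside the guest, on kernels using blk-mq for SCSI,
  # ls /sys/block/sda/mq/ shows how many queues the virtual disk received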

Gr. Stefan


-- 
| BIT BV  https://www.bit.nl/Kamer van Koophandel 09090351
| GPG: 0xD14839C6   +31 318 648 688 / i...@bit.nl


[ceph-users] ceph luminous bluestore poor random write performances

2020-01-02 Thread Ignazio Cassano
Hello all,
I installed Ceph Luminous with OpenStack, and using fio in a virtual machine
I get slow random writes:

fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test
--filename=random_read_write.fio --bs=4k --iodepth=64 --size=4G
--readwrite=randrw --rwmixread=75

Run status group 0 (all jobs):
   READ: bw=52.7MiB/s (55.3MB/s), 52.7MiB/s-52.7MiB/s (55.3MB/s-55.3MB/s), io=3070MiB (3219MB), run=58211-58211msec
  WRITE: bw=17.6MiB/s (18.5MB/s), 17.6MiB/s-17.6MiB/s (18.5MB/s-18.5MB/s), io=1026MiB (1076MB), run=58211-58211msec

[root@tst2-osctrl01 ansible]# ceph osd tree
ID CLASS WEIGHT    TYPE NAME           STATUS REWEIGHT PRI-AFF
-1       108.26358 root default
-3        36.08786     host p2-ceph-01
 0   hdd   3.62279         osd.0           up      1.0     1.0
 1   hdd   3.60529         osd.1           up      1.0     1.0
 2   hdd   3.60529         osd.2           up      1.0     1.0
 3   hdd   3.60529         osd.3           up      1.0     1.0
 4   hdd   3.60529         osd.4           up      1.0     1.0
 5   hdd   3.62279         osd.5           up      1.0     1.0
 6   hdd   3.60529         osd.6           up      1.0     1.0
 7   hdd   3.60529         osd.7           up      1.0     1.0
 8   hdd   3.60529         osd.8           up      1.0     1.0
 9   hdd   3.60529         osd.9           up      1.0     1.0
-5        36.08786     host p2-ceph-02
10   hdd   3.62279         osd.10          up      1.0     1.0
11   hdd   3.60529         osd.11          up      1.0     1.0
12   hdd   3.60529         osd.12          up      1.0     1.0
13   hdd   3.60529         osd.13          up      1.0     1.0
14   hdd   3.60529         osd.14          up      1.0     1.0
15   hdd   3.62279         osd.15          up      1.0     1.0
16   hdd   3.60529         osd.16          up      1.0     1.0
17   hdd   3.60529         osd.17          up      1.0     1.0
18   hdd   3.60529         osd.18          up      1.0     1.0
19   hdd   3.60529         osd.19          up      1.0     1.0
-7        36.08786     host p2-ceph-03
20   hdd   3.62279         osd.20          up      1.0     1.0
21   hdd   3.60529         osd.21          up      1.0     1.0
22   hdd   3.60529         osd.22          up      1.0     1.0
23   hdd   3.60529         osd.23          up      1.0     1.0
24   hdd   3.60529         osd.24          up      1.0     1.0
25   hdd   3.62279         osd.25          up      1.0     1.0
26   hdd   3.60529         osd.26          up      1.0     1.0
27   hdd   3.60529         osd.27          up      1.0     1.0
28   hdd   3.60529         osd.28          up      1.0     1.0
29   hdd   3.60529         osd.29          up      1.0     1.0

Each OSD server has 10 x 4TB OSDs and two SSDs (2 x 2TB).

Each SSD is split into 5 partitions (each partition is 384 GB) for the
BlueStore DB and WAL.
Each OSD and mon host has two 10Gb NICs, for the public and the cluster Ceph
network.

The OSD servers are PowerEdge R7425 with 256 GB RAM and a MegaRAID SAS-3 3108
controller. No NVMe disks are present.
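
For what it's worth, where each OSD's DB/WAL actually ended up can be checked
from the OSD metadata, along these lines (the osd id is just an example, and
the exact field names differ a bit between releases):

  ceph osd metadata 0 | egrep 'bluefs|rotational|devices'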


Ceph.conf is the following:
[global]
fsid = 9a33214b-86df-4ef0-9199-5f7637cff1cd
public_network = 10.102.189.128/25
cluster_network = 10.102.143.16/28
mon_initial_members = tst2-osctrl01, tst2-osctrl02, tst2-osctrl03
mon_host = 10.102.189.200,10.102.189.201,10.102.189.202
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
osd pool default size = 3
osd pool default min size = 2
mon_max_pg_per_osd = 1024
osd max pg per osd hard ratio = 20

[mon]
mon compact on start = true

[osd]
bluestore cache autotune = 0
#bluestore cache kv ratio = 0.2
#bluestore cache meta ratio = 0.8
bluestore cache size ssd = 8G
bluestore csum type = none
bluestore extent map shard max size = 200
bluestore extent map shard min size = 50
bluestore extent map shard target size = 100
bluestore rocksdb options = compression=kNoCompression,max_write_buffer_number=32,min_write_buffer_number_to_merge=2,recycle_log_file_num=32,compaction_style=kCompactionStyleLevel,write_buffer_size=67108864,target_file_size_base=67108864,max_background_compactions=31,level0_file_num_compaction_trigger=8,level0_slowdown_writes_trigger=32,level0_stop_writes_trigger=64,max_bytes_for_level_base=536870912,compaction_threads=32,max_bytes_for_level_multiplier=8,flusher_threads=8,compaction_readahead_size=2MB
osd map share max epochs = 100
osd max backfills = 5
osd memory target = 4294967296
osd op num shards = 8
osd op num threads per shard = 2
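
Whether the OSDs picked these values up at runtime can be checked via the admin
socket on an OSD host, e.g. (the osd id is illustrative):

  ceph daemon osd.0 config show | egrep 'bluestore_cache|osd_op_num|osd_memory_target'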


Any help, please?

Ignazio