[ceph-users] Default data in rbd regions that were never written
Hello,

I created an rbd image and wrote some data to it. After that, I cloned a new
image from the previous one. Then I compared these two images byte by byte,
but they are not totally equal. The data at positions where I never wrote to
the first image are not equal. Is this a normal case or a bug? If this is
normal, is there any configuration that can help me set the rbd default data
to all zeros?

Ceph version: 14.2.5
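For reference, a minimal way to reproduce this comparison with the rbd CLI
(the pool and image names here are placeholders, and clone requires a
protected format-2 snapshot, which is the default in 14.2.5):

  # create a small test image and write some data into it
  rbd create --size 1G test-pool/base
  rbd bench --io-type write --io-size 4K --io-total 16M test-pool/base

  # snapshot, protect, and clone it
  rbd snap create test-pool/base@snap1
  rbd snap protect test-pool/base@snap1
  rbd clone test-pool/base@snap1 test-pool/child

  # export both images and compare them byte by byte
  rbd export test-pool/base /tmp/base.img
  rbd export test-pool/child /tmp/child.img
  cmp /tmp/base.img /tmp/child.img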
Re: [ceph-users] ceph luminous bluestore poor random write performance
Why do you think that is slow? That's 4.5k write IOPS and 13.5k read IOPS at
the same time, which is amazing for a total of 30 HDDs. It's actually way
faster than you'd expect from 30 HDDs, so these DB devices are really helping
there :)

Paul

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

On Thu, Jan 2, 2020 at 12:14 PM Ignazio Cassano wrote:
> Hi Stefan,
> using fio with bs=64k I got very good performance. I am not skilled on
> storage, but the linux file system block size is 4k. So, how can I modify
> the configuration on ceph to obtain the best performance with bs=4k?
> Regards
> Ignazio
>
> On Thu, 2 Jan 2020 at 10:59, Stefan Kooman wrote:
>> Quoting Ignazio Cassano (ignaziocass...@gmail.com):
>> > Hello All,
>> > I installed ceph luminous with openstack, and using fio in a virtual
>> > machine I got slow random writes:
>> >
>> > fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test
>> > --filename=random_read_write.fio --bs=4k --iodepth=64 --size=4G
>> > --readwrite=randrw --rwmixread=75
>>
>> Do you use virtio-scsi with a SCSI queue per virtual CPU core? How many
>> cores do you have? I suspect that the queue depth is hampering
>> throughput here ... but is throughput performance really interesting
>> anyway for your use case? Low latency generally matters most.
>>
>> Gr. Stefan
>>
>> --
>> | BIT BV https://www.bit.nl/ Kamer van Koophandel 09090351
>> | GPG: 0xD14839C6 +31 318 648 688 / i...@bit.nl
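As a rough cross-check of Paul's numbers (assuming 4k blocks as in the fio
command, and the 3x replication configured later in this thread):

  17.6 MiB/s / 4 KiB per op  ≈  4,500 client write IOPS
  52.7 MiB/s / 4 KiB per op  ≈ 13,500 client read IOPS

With size=3, the ~4,500 client writes become roughly 13,500 backend writes,
while reads are served from primaries without amplification. That is on the
order of (13,500 + 13,500) / 30 OSDs ≈ 900 ops/s per HDD, far beyond the
100-200 random IOPS a bare spinning disk can sustain, hence the credit to
the SSD DB/WAL devices.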
[ceph-users] Infiniband backend OSD communication
I am working on upgrading my current ethernet-only ceph cluster to a combined
ethernet frontend and infiniband backend. From my research I understand that
I set:

ms_cluster_type = async+rdma
ms_async_rdma_device_name = mlx4_0

What I don't understand is how ceph knows how to reach each OSD over RDMA. Do
I have to run IPoIB on top of infiniband and use that for OSD addresses? Is
there a way to use infiniband on the backend without IPoIB and just use rdma
verbs?

><>
nathan stratton
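A minimal ceph.conf sketch of the layout being described might look as
follows (the subnets and device name are illustrative; note that Ceph's
messenger still identifies OSDs by IP address even with the RDMA transport,
which is why an IPoIB subnet is commonly used for the cluster network):

  [global]
  public_network  = 10.0.0.0/24      ; ethernet frontend
  cluster_network = 192.168.0.0/24   ; IPoIB subnet on the infiniband fabric

  ms_cluster_type = async+rdma       ; RDMA messenger on the cluster network only
  ms_async_rdma_device_name = mlx4_0 ; RDMA NIC, as listed by ibv_devices
  ms_async_rdma_port_num = 1         ; HCA port to use (default 1)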
[ceph-users] [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting
Hi,
checking our monitor logs today, I see that a RocksDB manual compaction
triggers every minute. Is that normal?

2020-01-02 14:08:33.091 7f2b8acbe700 4 rocksdb: [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting
2020-01-02 14:08:33.091 7f2b8acbe700 4 rocksdb: [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting
2020-01-02 14:08:33.091 7f2b8acbe700 4 rocksdb: [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting
2020-01-02 14:08:33.091 7f2b8acbe700 4 rocksdb: [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting
2020-01-02 14:08:33.091 7f2b8acbe700 4 rocksdb: [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting
2020-01-02 14:09:15.193 7f2b8acbe700 4 rocksdb: [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting
2020-01-02 14:09:15.193 7f2b8acbe700 4 rocksdb: [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting
2020-01-02 14:09:15.193 7f2b8acbe700 4 rocksdb: [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting
2020-01-02 14:09:15.193 7f2b8acbe700 4 rocksdb: [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting
2020-01-02 14:09:15.193 7f2b8acbe700 4 rocksdb: [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting
2020-01-02 14:09:15.193 7f2b8acbe700 4 rocksdb: [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting
2020-01-02 14:09:15.193 7f2b8acbe700 4 rocksdb: [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting
2020-01-02 14:09:15.193 7f2b8acbe700 4 rocksdb: [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting
2020-01-02 14:09:15.193 7f2b8acbe700 4 rocksdb: [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting
2020-01-02 14:09:15.193 7f2b8acbe700 4 rocksdb: [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting
2020-01-02 14:09:15.193 7f2b8acbe700 4 rocksdb: [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting

Best Regards
Manuel
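One way to see which compaction-related settings the running mon was started
with is to query its admin socket (this sketch assumes the mon id matches the
short hostname):

  # show compaction-related options of the running mon
  ceph daemon mon.$(hostname -s) config show | grep -i compact

  # a manual compaction of the mon store can also be triggered on demand,
  # for comparison with what appears in the log
  ceph tell mon.$(hostname -s) compact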
Re: [ceph-users] ceph luminous bluestore poor random write performance
Hi Stefan,
using fio with bs=64k I got very good performance. I am not skilled on
storage, but the linux file system block size is 4k. So, how can I modify the
configuration on ceph to obtain the best performance with bs=4k?
Regards
Ignazio

On Thu, 2 Jan 2020 at 10:59, Stefan Kooman wrote:
> Quoting Ignazio Cassano (ignaziocass...@gmail.com):
> > Hello All,
> > I installed ceph luminous with openstack, and using fio in a virtual
> > machine I got slow random writes:
> >
> > fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test
> > --filename=random_read_write.fio --bs=4k --iodepth=64 --size=4G
> > --readwrite=randrw --rwmixread=75
>
> Do you use virtio-scsi with a SCSI queue per virtual CPU core? How many
> cores do you have? I suspect that the queue depth is hampering
> throughput here ... but is throughput performance really interesting
> anyway for your use case? Low latency generally matters most.
>
> Gr. Stefan
>
> --
> | BIT BV https://www.bit.nl/ Kamer van Koophandel 09090351
> | GPG: 0xD14839C6 +31 318 648 688 / i...@bit.nl
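One way to see whether block size or queue depth dominates here is to vary
them independently against the same test file (the job names are arbitrary):

  # same 4k block size, but queue depth 1: exposes raw per-op latency
  fio --name=qd1 --ioengine=libaio --direct=1 --filename=random_read_write.fio \
      --bs=4k --iodepth=1 --size=4G --readwrite=randrw --rwmixread=75

  # 64k blocks at the original queue depth: fewer, larger ops
  fio --name=bs64k --ioengine=libaio --direct=1 --filename=random_read_write.fio \
      --bs=64k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75

If IOPS stay roughly constant while bandwidth scales with block size, the
cluster is IOPS-bound rather than bandwidth-bound, and no ceph.conf tweak
will make 4k writes as fast in MB/s as 64k writes.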
Re: [ceph-users] ceph luminous bluestore poor random write performance
Hi,

Your performance is not that bad, is it? What performance do you expect?

I just ran the same test.

12 Node, SATA SSD Only:
READ:  bw=63.8MiB/s (66.9MB/s), 63.8MiB/s-63.8MiB/s (66.9MB/s-66.9MB/s), io=3070MiB (3219MB), run=48097-48097msec
WRITE: bw=21.3MiB/s (22.4MB/s), 21.3MiB/s-21.3MiB/s (22.4MB/s-22.4MB/s), io=1026MiB (1076MB), run=48097-48097msec

6 Node, SAS Only:
READ:  bw=22.1MiB/s (23.2MB/s), 22.1MiB/s-22.1MiB/s (23.2MB/s-23.2MB/s), io=3070MiB (3219MB), run=138650-138650msec
WRITE: bw=7578KiB/s (7759kB/s), 7578KiB/s-7578KiB/s (7759kB/s-7759kB/s), io=1026MiB (1076MB), run=138650-138650msec

This is OpenStack Queens with Ceph FileStore (Luminous).

Kind regards,
Sinan Polat

On 2 January 2020 at 10:59, Stefan Kooman wrote:
> Quoting Ignazio Cassano (ignaziocass...@gmail.com):
> > Hello All,
> > I installed ceph luminous with openstack, and using fio in a virtual
> > machine I got slow random writes:
> >
> > fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test
> > --filename=random_read_write.fio --bs=4k --iodepth=64 --size=4G
> > --readwrite=randrw --rwmixread=75
>
> Do you use virtio-scsi with a SCSI queue per virtual CPU core? How many
> cores do you have? I suspect that the queue depth is hampering
> throughput here ... but is throughput performance really interesting
> anyway for your use case? Low latency generally matters most.
>
> Gr. Stefan
>
> --
> | BIT BV https://www.bit.nl/ Kamer van Koophandel 09090351
> | GPG: 0xD14839C6 +31 318 648 688 / i...@bit.nl
Re: [ceph-users] ceph luminous bluestore poor random write performance
Hi Stefan,
I did not understand your question, but that's my fault. I am using
virtio-scsi on my virtual machine. The virtual machine has two cores. Or do
you mean cores on the osd servers?
Regards
Ignazio

On Thu, 2 Jan 2020 at 10:59, Stefan Kooman wrote:
> Quoting Ignazio Cassano (ignaziocass...@gmail.com):
> > Hello All,
> > I installed ceph luminous with openstack, and using fio in a virtual
> > machine I got slow random writes:
> >
> > fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test
> > --filename=random_read_write.fio --bs=4k --iodepth=64 --size=4G
> > --readwrite=randrw --rwmixread=75
>
> Do you use virtio-scsi with a SCSI queue per virtual CPU core? How many
> cores do you have? I suspect that the queue depth is hampering
> throughput here ... but is throughput performance really interesting
> anyway for your use case? Low latency generally matters most.
>
> Gr. Stefan
>
> --
> | BIT BV https://www.bit.nl/ Kamer van Koophandel 09090351
> | GPG: 0xD14839C6 +31 318 648 688 / i...@bit.nl
Re: [ceph-users] ceph luminous bluestore poor random write performance
Quoting Ignazio Cassano (ignaziocass...@gmail.com):
> Hello All,
> I installed ceph luminous with openstack, and using fio in a virtual
> machine I got slow random writes:
>
> fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test
> --filename=random_read_write.fio --bs=4k --iodepth=64 --size=4G
> --readwrite=randrw --rwmixread=75

Do you use virtio-scsi with a SCSI queue per virtual CPU core? How many
cores do you have? I suspect that the queue depth is hampering
throughput here ... but is throughput performance really interesting
anyway for your use case? Low latency generally matters most.

Gr. Stefan

--
| BIT BV https://www.bit.nl/ Kamer van Koophandel 09090351
| GPG: 0xD14839C6 +31 318 648 688 / i...@bit.nl
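Stefan's question refers to multi-queue virtio-scsi. In libvirt terms, the
controller definition would look something like this (the queues value is
normally set to the guest's vCPU count; this two-queue example matches
Ignazio's two-core VM and is only a sketch):

  <controller type='scsi' index='0' model='virtio-scsi'>
    <driver queues='2'/>  <!-- one request queue per vCPU -->
  </controller>

Without multiple queues, all I/O from the guest funnels through a single
queue regardless of the iodepth fio requests.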
[ceph-users] ceph luminous bluestore poor random write performance
Hello All,
I installed ceph luminous with openstack, and using fio in a virtual machine
I got slow random writes:

fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test
--filename=random_read_write.fio --bs=4k --iodepth=64 --size=4G
--readwrite=randrw --rwmixread=75

Run status group 0 (all jobs):
READ:  bw=52.7MiB/s (55.3MB/s), 52.7MiB/s-52.7MiB/s (55.3MB/s-55.3MB/s), io=3070MiB (3219MB), run=58211-58211msec
WRITE: bw=17.6MiB/s (18.5MB/s), 17.6MiB/s-17.6MiB/s (18.5MB/s-18.5MB/s), io=1026MiB (1076MB), run=58211-58211msec

[root@tst2-osctrl01 ansible]# ceph osd tree
ID CLASS WEIGHT    TYPE NAME           STATUS REWEIGHT PRI-AFF
-1       108.26358 root default
-3        36.08786     host p2-ceph-01
 0   hdd   3.62279         osd.0           up      1.0     1.0
 1   hdd   3.60529         osd.1           up      1.0     1.0
 2   hdd   3.60529         osd.2           up      1.0     1.0
 3   hdd   3.60529         osd.3           up      1.0     1.0
 4   hdd   3.60529         osd.4           up      1.0     1.0
 5   hdd   3.62279         osd.5           up      1.0     1.0
 6   hdd   3.60529         osd.6           up      1.0     1.0
 7   hdd   3.60529         osd.7           up      1.0     1.0
 8   hdd   3.60529         osd.8           up      1.0     1.0
 9   hdd   3.60529         osd.9           up      1.0     1.0
-5        36.08786     host p2-ceph-02
10   hdd   3.62279         osd.10          up      1.0     1.0
11   hdd   3.60529         osd.11          up      1.0     1.0
12   hdd   3.60529         osd.12          up      1.0     1.0
13   hdd   3.60529         osd.13          up      1.0     1.0
14   hdd   3.60529         osd.14          up      1.0     1.0
15   hdd   3.62279         osd.15          up      1.0     1.0
16   hdd   3.60529         osd.16          up      1.0     1.0
17   hdd   3.60529         osd.17          up      1.0     1.0
18   hdd   3.60529         osd.18          up      1.0     1.0
19   hdd   3.60529         osd.19          up      1.0     1.0
-7        36.08786     host p2-ceph-03
20   hdd   3.62279         osd.20          up      1.0     1.0
21   hdd   3.60529         osd.21          up      1.0     1.0
22   hdd   3.60529         osd.22          up      1.0     1.0
23   hdd   3.60529         osd.23          up      1.0     1.0
24   hdd   3.60529         osd.24          up      1.0     1.0
25   hdd   3.62279         osd.25          up      1.0     1.0
26   hdd   3.60529         osd.26          up      1.0     1.0
27   hdd   3.60529         osd.27          up      1.0     1.0
28   hdd   3.60529         osd.28          up      1.0     1.0
29   hdd   3.60529         osd.29          up      1.0     1.0

Each osd server has 10 x 4TB osds and two SSDs (2 x 2TB). Each SSD is split
into 5 partitions (each partition is 384 GB) for the bluestore db and wal.
Each osd and mon host has two 10Gb NICs for the ceph public and cluster
networks. The osd servers are PowerEdge R7425 machines with 256 GB RAM and a
MegaRAID SAS-3 3108 controller. No nvme disks are present.

Ceph.conf is the following:

[global]
fsid = 9a33214b-86df-4ef0-9199-5f7637cff1cd
public_network = 10.102.189.128/25
cluster_network = 10.102.143.16/28
mon_initial_members = tst2-osctrl01, tst2-osctrl02, tst2-osctrl03
mon_host = 10.102.189.200,10.102.189.201,10.102.189.202
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
osd pool default size = 3
osd pool default min size = 2
mon_max_pg_per_osd = 1024
osd max pg per osd hard ratio = 20

[mon]
mon compact on start = true

[osd]
bluestore cache autotune = 0
#bluestore cache kv ratio = 0.2
#bluestore cache meta ratio = 0.8
bluestore cache size ssd = 8G
bluestore csum type = none
bluestore extent map shard max size = 200
bluestore extent map shard min size = 50
bluestore extent map shard target size = 100
bluestore rocksdb options = compression=kNoCompression,max_write_buffer_number=32,min_write_buffer_number_to_merge=2,recycle_log_file_num=32,compaction_style=kCompactionStyleLevel,write_buffer_size=67108864,target_file_size_base=67108864,max_background_compactions=31,level0_file_num_compaction_trigger=8,level0_slowdown_writes_trigger=32,level0_stop_writes_trigger=64,max_bytes_for_level_base=536870912,compaction_threads=32,max_bytes_for_level_multiplier=8,flusher_threads=8,compaction_readahead_size=2MB
osd map share max epochs = 100
osd max backfills = 5
osd memory target = 4294967296
osd op num shards = 8
osd op num threads per shard = 2

Any help, please?
Ignazio
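To separate the cluster's raw small-block performance from the VM I/O stack,
a rados bench run against a test pool gives a useful baseline (the pool name
is a placeholder):

  # 4 KiB writes for 60 s with 64 concurrent ops, keeping the objects
  rados bench -p testpool 60 write -b 4096 -t 64 --no-cleanup

  # then random reads of the objects written above
  rados bench -p testpool 60 rand -t 64

If rados bench shows similar IOPS to the in-guest fio run, the bottleneck is
the HDD-backed cluster itself rather than the virtio/OpenStack layer.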