Hi David,

I noticed that the public interface of the server I am running the test from
is heavily used, so I will bond that one too.
I doubt, though, that this explains the poor performance.

Thanks for your advice,
Steven
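(A quick way to confirm what the bonded and public links actually deliver
between two of the nodes is an iperf3 run; the hostname below is just a
placeholder for one of the OSD hosts:

  # on the receiving node
  iperf3 -s
  # on the sending node, 4 parallel streams for 30 seconds
  iperf3 -c osd01 -P 4 -t 30

This proves nothing about Ceph itself, but it quickly rules the network in
or out as the bottleneck.)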
On 22 January 2018 at 12:02, David Turner <drakonst...@gmail.com> wrote:

> I'm not speaking to anything other than your configuration.
>
> "I am using 2 x 10 Gb bonded (BONDING_OPTS="mode=4 miimon=100
> xmit_hash_policy=1 lacp_rate=1") for cluster and 1 x 1 Gb for public"
>
> It might not be a bad idea for you to forgo the public network on the 1 Gb
> interfaces and either put everything on one network or use VLANs on the
> 10 Gb connections. I lean more towards that in particular because your
> public network doesn't have a bond on it. Just as a note, communication
> between the OSDs and the MONs is all done on the public network. If that
> interface goes down, then the OSDs are likely to be marked down/out in
> your cluster. I'm a fan of VLANs, but if you don't have the equipment or
> expertise to go that route, then just using the same subnet for public and
> private is a decent way to go.
>
> On Mon, Jan 22, 2018 at 11:37 AM Steven Vacaroaia <ste...@gmail.com> wrote:
>
>> I did test with rados bench; here are the results:
>>
>> rados bench -p ssdpool 300 -t 12 write --no-cleanup && rados bench -p ssdpool 300 -t 12 seq
>>
>> Total time run:         300.322608
>> Total writes made:      10632
>> Write size:             4194304
>> Object size:            4194304
>> Bandwidth (MB/sec):     141.608
>> Stddev Bandwidth:       74.1065
>> Max bandwidth (MB/sec): 264
>> Min bandwidth (MB/sec): 0
>> Average IOPS:           35
>> Stddev IOPS:            18
>> Max IOPS:               66
>> Min IOPS:               0
>> Average Latency(s):     0.33887
>> Stddev Latency(s):      0.701947
>> Max latency(s):         9.80161
>> Min latency(s):         0.015171
>>
>> Total time run:       300.829945
>> Total reads made:     10070
>> Read size:            4194304
>> Object size:          4194304
>> Bandwidth (MB/sec):   133.896
>> Average IOPS:         33
>> Stddev IOPS:          14
>> Max IOPS:             68
>> Min IOPS:             3
>> Average Latency(s):   0.35791
>> Max latency(s):       4.68213
>> Min latency(s):       0.0107572
>>
>> rados bench -p scbench256 300 -t 12 write --no-cleanup && rados bench -p scbench256 300 -t 12 seq
>>
>> Total time run:         300.747004
>> Total writes made:      10239
>> Write size:             4194304
>> Object size:            4194304
>> Bandwidth (MB/sec):     136.181
>> Stddev Bandwidth:       75.5
>> Max bandwidth (MB/sec): 272
>> Min bandwidth (MB/sec): 0
>> Average IOPS:           34
>> Stddev IOPS:            18
>> Max IOPS:               68
>> Min IOPS:               0
>> Average Latency(s):     0.352339
>> Stddev Latency(s):      0.72211
>> Max latency(s):         9.62304
>> Min latency(s):         0.00936316
>> hints = 1
>>
>> Total time run:       300.610761
>> Total reads made:     7628
>> Read size:            4194304
>> Object size:          4194304
>> Bandwidth (MB/sec):   101.5
>> Average IOPS:         25
>> Stddev IOPS:          11
>> Max IOPS:             61
>> Min IOPS:             0
>> Average Latency(s):   0.472321
>> Max latency(s):       15.636
>> Min latency(s):       0.0188098
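(As a follow-up to the 4 MB sequential numbers above, a random-read pass and
a small-block write pass are easy to add with the same tool; the duration
and the 4 KB block size below are just an illustration:

  rados bench -p ssdpool 60 -t 12 rand
  rados bench -p ssdpool 60 -t 12 -b 4096 write --no-cleanup
  rados -p ssdpool cleanup

The default 4 MB object size mostly measures streaming bandwidth, so 4 KB
writes are usually a closer match to the fio results further down.)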
>> On 22 January 2018 at 11:34, Steven Vacaroaia <ste...@gmail.com> wrote:
>>
>>> Sorry, sent the message too soon. Here is more info.
>>>
>>> Vendor Id   : SEAGATE
>>> Product Id  : ST600MM0006
>>> State       : Online
>>> Disk Type   : SAS,Hard Disk Device
>>> Capacity    : 558.375 GB
>>> Power State : Active
>>>
>>> (SSD is in slot 0)
>>>
>>> megacli -LDGetProp -Cache -LALL -a0
>>>
>>> Adapter 0-VD 0(target id: 0): Cache Policy:WriteThrough, ReadAheadNone, Direct, No Write Cache if bad BBU
>>> Adapter 0-VD 1(target id: 1): Cache Policy:WriteBack, ReadAdaptive, Direct, No Write Cache if bad BBU
>>> Adapter 0-VD 2(target id: 2): Cache Policy:WriteBack, ReadAdaptive, Direct, No Write Cache if bad BBU
>>> Adapter 0-VD 3(target id: 3): Cache Policy:WriteBack, ReadAdaptive, Direct, No Write Cache if bad BBU
>>> Adapter 0-VD 4(target id: 4): Cache Policy:WriteBack, ReadAdaptive, Direct, No Write Cache if bad BBU
>>> Adapter 0-VD 5(target id: 5): Cache Policy:WriteBack, ReadAdaptive, Direct, No Write Cache if bad BBU
>>>
>>> [root@osd01 ~]# megacli -LDGetProp -DskCache -LALL -a0
>>>
>>> Adapter 0-VD 0(target id: 0): Disk Write Cache : Disabled
>>> Adapter 0-VD 1(target id: 1): Disk Write Cache : Disk's Default
>>> Adapter 0-VD 2(target id: 2): Disk Write Cache : Disk's Default
>>> Adapter 0-VD 3(target id: 3): Disk Write Cache : Disk's Default
>>> Adapter 0-VD 4(target id: 4): Disk Write Cache : Disk's Default
>>> Adapter 0-VD 5(target id: 5): Disk Write Cache : Disk's Default
>>>
>>> CPU:
>>> Intel(R) Xeon(R) CPU E5-2640 0 @ 2.50GHz
>>>
>>> OS: CentOS 7, kernel 3.10.0-693.11.6.el7.x86_64
>>>
>>> sysctl -p
>>> net.ipv4.tcp_sack = 0
>>> net.core.netdev_budget = 600
>>> net.ipv4.tcp_window_scaling = 1
>>> net.core.rmem_max = 16777216
>>> net.core.wmem_max = 16777216
>>> net.core.rmem_default = 16777216
>>> net.core.wmem_default = 16777216
>>> net.core.optmem_max = 40960
>>> net.ipv4.tcp_rmem = 4096 87380 16777216
>>> net.ipv4.tcp_wmem = 4096 65536 16777216
>>> net.ipv4.tcp_syncookies = 0
>>> net.core.somaxconn = 1024
>>> net.core.netdev_max_backlog = 20000
>>> net.ipv4.tcp_max_syn_backlog = 30000
>>> net.ipv4.tcp_max_tw_buckets = 2000000
>>> net.ipv4.tcp_tw_reuse = 1
>>> net.ipv4.tcp_slow_start_after_idle = 0
>>> net.ipv4.conf.all.send_redirects = 0
>>> net.ipv4.conf.all.accept_redirects = 0
>>> net.ipv4.conf.all.accept_source_route = 0
>>> vm.min_free_kbytes = 262144
>>> vm.swappiness = 0
>>> vm.vfs_cache_pressure = 100
>>> fs.suid_dumpable = 0
>>> kernel.core_uses_pid = 1
>>> kernel.msgmax = 65536
>>> kernel.msgmnb = 65536
>>> kernel.randomize_va_space = 1
>>> kernel.sysrq = 0
>>> kernel.pid_max = 4194304
>>> fs.file-max = 100000
>>>
>>> ceph.conf:
>>>
>>> public_network = 10.10.30.0/24
>>> cluster_network = 192.168.0.0/24
>>>
>>> osd_op_num_threads_per_shard = 2
>>> osd_op_num_shards = 25
>>> osd_pool_default_size = 2
>>> osd_pool_default_min_size = 1  # Allow writing 1 copy in a degraded state
>>> osd_pool_default_pg_num = 256
>>> osd_pool_default_pgp_num = 256
>>> osd_crush_chooseleaf_type = 1
>>> osd_scrub_load_threshold = 0.01
>>> osd_scrub_min_interval = 137438953472
>>> osd_scrub_max_interval = 137438953472
>>> osd_deep_scrub_interval = 137438953472
>>> osd_max_scrubs = 16
>>> osd_op_threads = 8
>>> osd_max_backfills = 1
>>> osd_recovery_max_active = 1
>>> osd_recovery_op_priority = 1
>>>
>>> debug_lockdep = 0/0
>>> debug_context = 0/0
>>> debug_crush = 0/0
>>> debug_buffer = 0/0
>>> debug_timer = 0/0
>>> debug_filer = 0/0
>>> debug_objecter = 0/0
>>> debug_rados = 0/0
>>> debug_rbd = 0/0
>>> debug_journaler = 0/0
>>> debug_objectcatcher = 0/0
>>> debug_client = 0/0
>>> debug_osd = 0/0
>>> debug_optracker = 0/0
>>> debug_objclass = 0/0
>>> debug_filestore = 0/0
>>> debug_journal = 0/0
>>> debug_ms = 0/0
>>> debug_monc = 0/0
>>> debug_tp = 0/0
>>> debug_auth = 0/0
>>> debug_finisher = 0/0
>>> debug_heartbeatmap = 0/0
>>> debug_perfcounter = 0/0
>>> debug_asok = 0/0
>>> debug_throttle = 0/0
>>> debug_mon = 0/0
>>> debug_paxos = 0/0
>>> debug_rgw = 0/0
>>>
>>> [mon]
>>> mon_allow_pool_delete = true
>>>
>>> [osd]
>>> osd_heartbeat_grace = 20
>>> osd_heartbeat_interval = 5
>>> bluestore_block_db_size = 16106127360
>>> bluestore_block_wal_size = 1073741824
>>>
>>> [osd.6]
>>> host = osd01
>>> osd_journal = /dev/disk/by-parttypeuuid/4fbd7e29-9d25-41b8-afd0-062c0ceff05d.1d58775a-5019-42ea-8149-a126f51a2501
>>> crush_location = root=ssds host=osd01-ssd
>>>
>>> [osd.7]
>>> host = osd02
>>> osd_journal = /dev/disk/by-parttypeuuid/4fbd7e29-9d25-41b8-afd0-062c0ceff05d.683dc52d-5d69-4ff0-b5d9-b17056a55681
>>> crush_location = root=ssds host=osd02-ssd
>>>
>>> [osd.8]
>>> host = osd04
>>> osd_journal = /dev/disk/by-parttypeuuid/4fbd7e29-9d25-41b8-afd0-062c0ceff05d.bd7c0088-b724-441e-9b88-9457305c541d
>>> crush_location = root=ssds host=osd04-ssd
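(If it helps to confirm which of these values the OSDs actually picked up at
runtime, the admin socket can be queried on the OSD host; osd.6 below is
just the ID taken from the section above:

  ceph daemon osd.6 config show | egrep 'osd_op_num_shards|osd_op_num_threads_per_shard|bluestore_block'

"config show" reports the running configuration, which can differ from
ceph.conf if the daemon has not been restarted since the file was edited.)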
>>> On 22 January 2018 at 11:29, Steven Vacaroaia <ste...@gmail.com> wrote:
>>>
>>>> Hi David,
>>>>
>>>> Yes, I meant no separate partitions for WAL and DB.
>>>>
>>>> I am using 2 x 10 Gb bonded (BONDING_OPTS="mode=4 miimon=100 xmit_hash_policy=1 lacp_rate=1")
>>>> for cluster and 1 x 1 Gb for public.
>>>>
>>>> The disks are:
>>>> Vendor Id   : TOSHIBA
>>>> Product Id  : PX05SMB040Y
>>>> State       : Online
>>>> Disk Type   : SAS,Solid State Device
>>>> Capacity    : 372.0 GB
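(A common sanity check for an SSD that carries journal-style or BlueStore
WAL/DB traffic is a single-threaded O_DSYNC write test with fio directly
against the device. /dev/sdX below is a placeholder, and writing to the raw
device destroys data on it, so this is only for an unused disk or partition:

  fio --name=sync-write-test --filename=/dev/sdX --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based

Drives that handle sync writes poorly show up immediately in this test,
regardless of how good their datasheet numbers look.)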
>>>> On 22 January 2018 at 11:24, David Turner <drakonst...@gmail.com> wrote:
>>>>
>>>>> Disk models, other hardware information including CPU, network
>>>>> config? You say you're using Luminous, but then say journal on the same
>>>>> device. I'm assuming you mean that you just have the BlueStore OSD
>>>>> configured without a separate WAL or DB partition? Any more specifics
>>>>> you can give will be helpful.
>>>>>
>>>>> On Mon, Jan 22, 2018 at 11:20 AM Steven Vacaroaia <ste...@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I'd appreciate it if you could provide some guidance / suggestions
>>>>>> regarding performance issues on a test cluster (3 x Dell R620, 1
>>>>>> enterprise SSD, 3 x 600 GB enterprise HDDs, 8 cores, 64 GB RAM).
>>>>>>
>>>>>> I created 2 pools (replication factor 2), one with only SSDs and the
>>>>>> other with only HDDs (journal on the same disk for both).
>>>>>>
>>>>>> The performance is quite similar, although I was expecting the SSD
>>>>>> pool to be at least 5 times faster.
>>>>>> No issues noticed using atop.
>>>>>>
>>>>>> What should I check / tune?
>>>>>>
>>>>>> Many thanks
>>>>>> Steven
>>>>>>
>>>>>> HDD based pool (journal on the same disk)
>>>>>>
>>>>>> ceph osd pool get scbench256 all
>>>>>>
>>>>>> size: 2
>>>>>> min_size: 1
>>>>>> crash_replay_interval: 0
>>>>>> pg_num: 256
>>>>>> pgp_num: 256
>>>>>> crush_rule: replicated_rule
>>>>>> hashpspool: true
>>>>>> nodelete: false
>>>>>> nopgchange: false
>>>>>> nosizechange: false
>>>>>> write_fadvise_dontneed: false
>>>>>> noscrub: false
>>>>>> nodeep-scrub: false
>>>>>> use_gmt_hitset: 1
>>>>>> auid: 0
>>>>>> fast_read: 0
>>>>>>
>>>>>> rbd bench --io-type write image1 --pool=scbench256
>>>>>> bench  type write io_size 4096 io_threads 16 bytes 1073741824 pattern sequential
>>>>>>   SEC       OPS   OPS/SEC     BYTES/SEC
>>>>>>     1     46816  46836.46  191842139.78
>>>>>>     2     90658  45339.11  185709011.80
>>>>>>     3    133671  44540.80  182439126.08
>>>>>>     4    177341  44340.36  181618100.14
>>>>>>     5    217300  43464.04  178028704.54
>>>>>>     6    259595  42555.85  174308767.05
>>>>>> elapsed: 6  ops: 262144  ops/sec: 42694.50  bytes/sec: 174876688.23
>>>>>>
>>>>>> fio /home/cephuser/write_256.fio
>>>>>> write-4M: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=rbd, iodepth=32
>>>>>> fio-2.2.8
>>>>>> Starting 1 process
>>>>>> rbd engine: RBD version: 1.12.0
>>>>>> Jobs: 1 (f=1): [r(1)] [100.0% done] [66284KB/0KB/0KB /s] [16.6K/0/0 iops] [eta 00m:00s]
>>>>>>
>>>>>> fio /home/cephuser/write_256.fio
>>>>>> write-4M: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=rbd, iodepth=32
>>>>>> fio-2.2.8
>>>>>> Starting 1 process
>>>>>> rbd engine: RBD version: 1.12.0
>>>>>> Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/14464KB/0KB /s] [0/3616/0 iops] [eta 00m:00s]
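(The contents of write_256.fio are not shown anywhere in the thread; a job
file along these lines would produce runs like the ones above, but the
client name, pool, and image below are assumptions based on the rest of the
thread:

  [write-4M]
  ioengine=rbd
  clientname=admin
  pool=scbench256
  rbdname=image1
  rw=write
  bs=4k
  iodepth=32
  numjobs=1

Changing rw=write to rw=randread would give the read variant shown above,
and pool/rbdname would be switched to ssdpool/image2 for the SSD runs.)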
>>>>>> SSD based pool
>>>>>>
>>>>>> ceph osd pool get ssdpool all
>>>>>>
>>>>>> size: 2
>>>>>> min_size: 1
>>>>>> crash_replay_interval: 0
>>>>>> pg_num: 128
>>>>>> pgp_num: 128
>>>>>> crush_rule: ssdpool
>>>>>> hashpspool: true
>>>>>> nodelete: false
>>>>>> nopgchange: false
>>>>>> nosizechange: false
>>>>>> write_fadvise_dontneed: false
>>>>>> noscrub: false
>>>>>> nodeep-scrub: false
>>>>>> use_gmt_hitset: 1
>>>>>> auid: 0
>>>>>> fast_read: 0
>>>>>>
>>>>>> rbd -p ssdpool create --size 52100 image2
>>>>>>
>>>>>> rbd bench --io-type write image2 --pool=ssdpool
>>>>>> bench  type write io_size 4096 io_threads 16 bytes 1073741824 pattern sequential
>>>>>>   SEC       OPS   OPS/SEC     BYTES/SEC
>>>>>>     1     42412  41867.57  171489557.93
>>>>>>     2     78343  39180.86  160484805.88
>>>>>>     3    118082  39076.48  160057256.16
>>>>>>     4    155164  38683.98  158449572.38
>>>>>>     5    192825  38307.59  156907885.84
>>>>>>     6    230701  37716.95  154488608.16
>>>>>> elapsed: 7  ops: 262144  ops/sec: 36862.89  bytes/sec: 150990387.29
>>>>>>
>>>>>> [root@osd01 ~]# fio /home/cephuser/write_256.fio
>>>>>> write-4M: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=rbd, iodepth=32
>>>>>> fio-2.2.8
>>>>>> Starting 1 process
>>>>>> rbd engine: RBD version: 1.12.0
>>>>>> Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/20224KB/0KB /s] [0/5056/0 iops] [eta 00m:00s]
>>>>>>
>>>>>> fio /home/cephuser/write_256.fio
>>>>>> write-4M: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=rbd, iodepth=32
>>>>>> fio-2.2.8
>>>>>> Starting 1 process
>>>>>> rbd engine: RBD version: 1.12.0
>>>>>> Jobs: 1 (f=1): [r(1)] [100.0% done] [76096KB/0KB/0KB /s] [19.3K/0/0 iops] [eta 00m:00s]
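(One more thing worth verifying from the pool settings above: that the
ssdpool CRUSH rule really selects only the OSDs under root=ssds, and that
replicated_rule only touches the HDD OSDs. The rule name comes from the
output above; the commands are standard:

  ceph osd crush rule dump ssdpool
  ceph osd df tree

If both pools can end up on the same spinning disks, their benchmark
results would be expected to look almost identical.)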
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com