Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]

2020-01-18 Thread Eric K. Miller
Hi Vitaliy,

 

Similar to Stefan, we have a bunch of Micron 5200s (3.84TB ECO SATA version)
in a Ceph cluster (Nautilus) and performance seems less than optimal.  I have
followed all the instructions on your site (thank you for your wonderful
article, btw!), but I haven't seen much change.

 

The only thing I could think of is that disabling the write cache might only
take effect after a reboot or power cycle.  Is that necessary, or is it a
"live" change?

 

I have tested with the cache disabled as well as enabled on all drives.  We're
using fio running in a QEMU/KVM VM in an OpenStack cluster, so not "raw" access
to the Micron 5200s.  OSD (BlueStore) nodes run CentOS 7 with a 4.18.x
kernel.  Testing shows little to no difference; the variations are small
enough to be considered "noise" in the results.  Certainly no change that
anyone would notice.
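
For reference, the test inside the guest looks roughly like this (a sketch;
the job name, device path and runtime are illustrative):

    fio --name=writelat --filename=/dev/vdb --ioengine=libaio \
        --direct=1 --sync=1 --rw=randwrite --bs=4k --iodepth=1 \
        --numjobs=1 --runtime=60 --time_based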

 

Thought I'd check to see if you, or anyone else, might have any suggestions 
specific to the Micron 5200.

 

We have some Micron 5300s inbound, but probably won't have them here for
another few weeks due to Micron's manufacturing delays; once they arrive we
will be able to test those drives raw.  I will report back afterwards, but if
you know anything about these, I'm all ears. :)

 

Thank you!

 

Eric

 

 

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Stefan 
Bauer
Sent: Tuesday, January 14, 2020 10:28 AM
To: undisclosed-recipients
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we 
expect more? [klartext]

 

Thank you all,

 

performance is indeed better now. Can now go back to sleep ;)

 

KR

 

Stefan

 

-Original Message-
From: Виталий Филиппов 
Sent: Tuesday, January 14, 2020 10:28
To: Wido den Hollander; Stefan Bauer 

CC: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] low io with enterprise SSDs ceph luminous - 
can we expect more? [klartext]

...disable signatures and rbd cache. I didn't mention it in the email 
to avoid repeating myself, but I have it in the article :-)
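
For reference, a minimal sketch of the relevant ceph.conf settings (option
names may vary slightly between releases):

    [global]
    cephx_require_signatures = false
    cephx_cluster_require_signatures = false
    cephx_sign_messages = false

    [client]
    rbd_cache = false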
-- 
With best regards,
Vitaliy Filippov 



Re: [ceph-users] Monitor handle_auth_bad_method

2020-01-18 Thread Justin Engwer
Gatherkeys and config push seem to have done the job. Thanks for your help,
Paul!
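
In case someone finds this later, a sketch of what that amounted to (assuming
ceph-deploy, run from the deploy/admin node; hostnames are illustrative, kvm1
being a healthy monitor):

    # pull the current keys from a monitor that is in quorum
    ceph-deploy gatherkeys kvm1

    # push the current ceph.conf to the problem monitor host
    ceph-deploy --overwrite-conf config push kvm2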

Justin

On Sat., Jan. 18, 2020, 02:33 Paul Emmerich,  wrote:

> check if the mons have the same keyring file and the same config file.
> --
> Paul Emmerich
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH
> Freseniusstr. 31h
> 81247 München
> www.croit.io
> Tel: +49 89 1896585 90
>
> On Sat, Jan 18, 2020 at 12:39 AM Justin Engwer  wrote:
> >
> > Hi,
> > I'm a home user of ceph. Most of the time I can look at the email lists
> > and articles and figure things out on my own. I've unfortunately run into
> > an issue I can't troubleshoot myself.
> >
> > Starting one of my monitors yields this error:
> >
> > 2020-01-17 15:34:13.497 7fca3d006040  0 mon.kvm2@-1(probing) e11  my rank is now 2 (was -1)
> > 2020-01-17 15:34:13.696 7fca2909b700 -1 mon.kvm2@2(probing) e11 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
> > 2020-01-17 15:34:14.098 7fca2909b700 -1 mon.kvm2@2(probing) e11 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
> > 2020-01-17 15:34:14.899 7fca2909b700 -1 mon.kvm2@2(probing) e11 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
> >
> >
> > I've grabbed a good monmap from other monitors and double checked
> > permissions on /var/lib/ceph/mon/ceph-kvm2/ to make sure that it's not a
> > filesystem error and everything looks good to me.
> >
> > /var/lib/ceph/mon/ceph-kvm2/:
> > total 32
> > drwxr-xr-x   4 ceph ceph  4096 May  7  2018 .
> > drwxr-x---.  4 ceph ceph    44 Jan  8 11:44 ..
> > -rw-r--r--.  1 ceph ceph     0 Apr 25  2018 done
> > -rw-------.  1 ceph ceph    77 Apr 25  2018 keyring
> > -rw-r--r--.  1 ceph ceph     8 Apr 25  2018 kv_backend
> > drwx------   2 ceph ceph 16384 May  7  2018 lost+found
> > drwxr-xr-x.  2 ceph ceph  4096 Jan 17 15:27 store.db
> > -rw-r--r--.  1 ceph ceph     0 Apr 25  2018 systemd
> >
> > /var/lib/ceph/mon/ceph-kvm2/lost+found:
> > total 20
> > drwx------ 2 ceph ceph 16384 May  7  2018 .
> > drwxr-xr-x 4 ceph ceph  4096 May  7  2018 ..
> >
> > /var/lib/ceph/mon/ceph-kvm2/store.db:
> > total 68424
> > drwxr-xr-x. 2 ceph ceph     4096 Jan 17 15:27 .
> > drwxr-xr-x  4 ceph ceph     4096 May  7  2018 ..
> > -rw-------  1 ceph ceph 65834705 Jan 17 14:57 1557088.sst
> > -rw-------  1 ceph ceph     1833 Jan 17 15:27 1557090.sst
> > -rw-------  1 ceph ceph        0 Jan 17 15:27 1557092.log
> > -rw-------  1 ceph ceph       17 Jan 17 15:27 CURRENT
> > -rw-r--r--. 1 ceph ceph       37 Apr 25  2018 IDENTITY
> > -rw-r--r--. 1 ceph ceph        0 Apr 25  2018 LOCK
> > -rw-------  1 ceph ceph      185 Jan 17 15:27 MANIFEST-1557091
> > -rw-------  1 ceph ceph     4941 Jan 17 14:57 OPTIONS-1557087
> > -rw-------  1 ceph ceph     4941 Jan 17 15:27 OPTIONS-1557094
> >
> >
> > Any help would be appreciated.
> >
> >
> >
> > --
> >
> > Justin Engwer


Re: [ceph-users] Monitor handle_auth_bad_method

2020-01-18 Thread Paul Emmerich
check if the mons have the same keyring file and the same config file.
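
A quick way to compare them across the mon hosts (a sketch; hostnames are
illustrative):

    for h in mon-a mon-b mon-c; do
        ssh "$h" 'hostname; md5sum /etc/ceph/ceph.conf /var/lib/ceph/mon/ceph-$(hostname -s)/keyring'
    done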
-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

On Sat, Jan 18, 2020 at 12:39 AM Justin Engwer  wrote:
>
> Hi,
> I'm a home user of ceph. Most of the time I can look at the email lists and 
> articles and figure things out on my own. I've unfortunately run into an 
> issue I can't troubleshoot myself.
>
> Starting one of my monitors yields this error:
>
> 2020-01-17 15:34:13.497 7fca3d006040  0 mon.kvm2@-1(probing) e11  my rank is now 2 (was -1)
> 2020-01-17 15:34:13.696 7fca2909b700 -1 mon.kvm2@2(probing) e11 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
> 2020-01-17 15:34:14.098 7fca2909b700 -1 mon.kvm2@2(probing) e11 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
> 2020-01-17 15:34:14.899 7fca2909b700 -1 mon.kvm2@2(probing) e11 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
>
>
> I've grabbed a good monmap from other monitors and double checked permissions 
> on /var/lib/ceph/mon/ceph-kvm2/ to make sure that it's not a filesystem error 
> and everything looks good to me.
>
> /var/lib/ceph/mon/ceph-kvm2/:
> total 32
> drwxr-xr-x   4 ceph ceph  4096 May  7  2018 .
> drwxr-x---.  4 ceph ceph    44 Jan  8 11:44 ..
> -rw-r--r--.  1 ceph ceph     0 Apr 25  2018 done
> -rw-------.  1 ceph ceph    77 Apr 25  2018 keyring
> -rw-r--r--.  1 ceph ceph     8 Apr 25  2018 kv_backend
> drwx------   2 ceph ceph 16384 May  7  2018 lost+found
> drwxr-xr-x.  2 ceph ceph  4096 Jan 17 15:27 store.db
> -rw-r--r--.  1 ceph ceph     0 Apr 25  2018 systemd
>
> /var/lib/ceph/mon/ceph-kvm2/lost+found:
> total 20
> drwx------ 2 ceph ceph 16384 May  7  2018 .
> drwxr-xr-x 4 ceph ceph  4096 May  7  2018 ..
>
> /var/lib/ceph/mon/ceph-kvm2/store.db:
> total 68424
> drwxr-xr-x. 2 ceph ceph     4096 Jan 17 15:27 .
> drwxr-xr-x  4 ceph ceph     4096 May  7  2018 ..
> -rw-------  1 ceph ceph 65834705 Jan 17 14:57 1557088.sst
> -rw-------  1 ceph ceph     1833 Jan 17 15:27 1557090.sst
> -rw-------  1 ceph ceph        0 Jan 17 15:27 1557092.log
> -rw-------  1 ceph ceph       17 Jan 17 15:27 CURRENT
> -rw-r--r--. 1 ceph ceph       37 Apr 25  2018 IDENTITY
> -rw-r--r--. 1 ceph ceph        0 Apr 25  2018 LOCK
> -rw-------  1 ceph ceph      185 Jan 17 15:27 MANIFEST-1557091
> -rw-------  1 ceph ceph     4941 Jan 17 14:57 OPTIONS-1557087
> -rw-------  1 ceph ceph     4941 Jan 17 15:27 OPTIONS-1557094
>
>
> Any help would be appreciated.
>
>
>
> --
>
> Justin Engwer


Re: [ceph-users] Default Pools

2020-01-18 Thread Paul Emmerich
RGW tools will automatically recreate these pools; running radosgw-admin,
for example, will create them if they don't exist.
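
A quick way to see this on a test cluster (a sketch; "user list" is just an
example, any radosgw-admin call that has to load the zone will do):

    ceph osd pool ls          # pools before
    radosgw-admin user list   # loads the zone/period, creating .rgw.root and
                              # the default.rgw.* pools if they are missing
    ceph osd pool ls          # the default zone pools are back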


Paul

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

On Sat, Jan 18, 2020 at 2:48 AM Daniele Riccucci  wrote:
>
> Hello,
> I'm still a bit confused by the .rgw.root and the
> default.rgw.{control,meta,log} pools.
> I recently removed the RGW daemon I had running and the aforementioned
> pools, however after a rebalance I suddenly find them again in the
> output of:
>
> $ ceph osd pool ls
> cephfs_data
> cephfs_metadata
> .rgw.root
> default.rgw.control
> default.rgw.meta
> default.rgw.log
>
> Each has 8 pgs but zero usage.
> I was unable to find logs or indications as to which daemon or action
> recreated them, or whether it is safe to remove them again. Where should
> I look?
> I'm on Nautilus 14.2.5, container deployment.
> Thank you.
>
> Regards,
> Daniele
>
> On 23/04/19 22:14, David Turner wrote:
> > You should be able to see all pools in use in a RGW zone from the
> > radosgw-admin command. This [1] is probably overkill for most, but I
> > deal with multi-realm clusters so I generally think like this when
> > dealing with RGW.  Running this as is will create a file in your current
> > directory for each zone in your deployment (likely to be just one
> > file).  My rough guess for what you would find in that file based on
> > your pool names would be this [2].
> >
> > If you identify any pools not listed from the zone get command, then you
> > can rename [3] the pool to see if it is being created and/or used by rgw
> > currently.  The process here would be to stop all RGW daemons, rename
> > the pools, start a RGW daemon, stop it again, and see which pools were
> > recreated.  Clean up the pools that were freshly made and rename the
> > original pools back into place before starting your RGW daemons again.
> > Please note that .rgw.root is a required pool in every RGW deployment
> > and will not be listed in the zones themselves.
> >
> >
> > [1]
> > for realm in $(radosgw-admin realm list --format=json | jq '.realms[]' -r); do
> >   for zonegroup in $(radosgw-admin --rgw-realm=$realm zonegroup list --format=json | jq '.zonegroups[]' -r); do
> >     for zone in $(radosgw-admin --rgw-realm=$realm --rgw-zonegroup=$zonegroup zone list --format=json | jq '.zones[]' -r); do
> >       echo $realm.$zonegroup.$zone.json
> >       radosgw-admin --rgw-realm=$realm --rgw-zonegroup=$zonegroup --rgw-zone=$zone zone get > $realm.$zonegroup.$zone.json
> >     done
> >   done
> > done
> >
> > [2] default.default.default.json
> > {
> >  "id": "{{ UUID }}",
> >  "name": "default",
> >  "domain_root": "default.rgw.meta",
> >  "control_pool": "default.rgw.control",
> >  "gc_pool": ".rgw.gc",
> >  "log_pool": "default.rgw.log",
> >  "user_email_pool": ".users.email",
> >  "user_uid_pool": ".users.uid",
> >  "system_key": {
> >  },
> >  "placement_pools": [
> >  {
> >  "key": "default-placement",
> >  "val": {
> >  "index_pool": "default.rgw.buckets.index",
> >  "data_pool": "default.rgw.buckets.data",
> >  "data_extra_pool": "default.rgw.buckets.non-ec",
> >  "index_type": 0,
> >  "compression": ""
> >  }
> >  }
> >  ],
> >  "metadata_heap": "",
> >  "tier_config": [],
> >  "realm_id": "{{ UUID }}"
> > }
> >
> > [3] ceph osd pool rename <old-name> <new-name>
> >
> > On Thu, Apr 18, 2019 at 10:46 AM Brent Kennedy wrote:
> >
> > Yea, that was a cluster created during firefly...
> >
> > Wish there was a good article on the naming and use of these, or
> > perhaps a way I could make sure they are not used before deleting
> > them.  I know RGW will recreate anything it uses, but I don’t want
> > to lose data because I wanted a clean system.
> >
> > -Brent
> >
> > -Original Message-
> > From: Gregory Farnum <gfar...@redhat.com>
> > Sent: Monday, April 15, 2019 5:37 PM
> > To: Brent Kennedy <bkenn...@cfl.rr.com>
> > Cc: Ceph Users <ceph-users@lists.ceph.com>
> > Subject: Re: [ceph-users] Default Pools
> >
> > On Mon, Apr 15, 2019 at 1:52 PM Brent Kennedy wrote:
> >  >
> >  > I was looking around the web for the reason for some of the
> > default pools in Ceph and I can't find anything concrete.  Here is
> > our list; some show no use at all.  Can any of these be deleted (or
> > is there an article my google-fu failed to find that covers the
> > default pools)?
> >  >
> >  > We only use buckets, so I took out .rgw.buckets, .users and
> >  > .rgw.buckets.index…
> >  >
> >  > Name
> >  > .log
>

Re: [ceph-users] Slow Performance - Sequential IO

2020-01-18 Thread Paul Emmerich
Benchmarking is hard.

It's expected that random IO will be faster than sequential IO with this
pattern. The reason is that random IO hits different blocks/PGs/OSDs for
every 4k write, whereas sequential IO is stuck on the same 4 MB object
for more IOs than your queue is deep.

The question is: is writing 4k blocks sequentially in any way
representative of your workload? Probably not, so don't test that.

You mentioned VMware, so writing 64k blocks with qd 64 might be
relevant for you (e.g., vMotion). In that case, try configuring
striping with a 64 kB stripe size for the image and test
sequentially writing 64 kB blocks.
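
Something along these lines, as a sketch (pool name taken from your fio job
file; the image name and stripe count are illustrative):

    # image striped in 64 KiB units across 16 objects
    rbd create rbd_af1/image-striped --size 100G \
        --stripe-unit 65536 --stripe-count 16

    # 64k sequential writes at queue depth 64 against that image
    fio --name=seq64k --ioengine=rbd --pool=rbd_af1 --rbdname=image-striped \
        --rw=write --bs=64k --iodepth=64 --numjobs=1 --runtime=120 --time_based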


Paul

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

On Sat, Jan 18, 2020 at 5:14 AM Christian Balzer  wrote:
>
>
> Hello,
>
> I had very odd results in the past with the fio rbd engine and would
> suggest testing things in the environment you're going to deploy in, end
> to end.
>
> That said, without any caching and coalescing of writes, sequential 4k
> writes will hit the same set of OSDs for 4MB worth of data, thus limiting
> things to whatever the overall latency (network, 3x write) is here.
> With random writes you will engage more or less all OSDs that hold your
> fio file, thus spreading things out.
> This becomes more and more visible with increasing number of OSDs and
> nodes.
>
> Regards,
>
> Christian
> On Fri, 17 Jan 2020 23:01:09 + Anthony Brandelli (abrandel) wrote:
>
> > Not been able to make any headway on this after some significant effort.
> >
> > -Tested all 48 SSDs with FIO directly; all tested within 10% of each other
> > for 4k iops in rand|seq read|write.
> > -Disabled all CPU power save.
> > -Tested with both rbd cache enabled and disabled on the client.
> > -Tested with drive caches enabled and disabled (hdparm)
> > -Minimal TCP retransmissions under load (<10 for a 2 minute duration).
> > -No drops/pause frames noted on upstream switches.
> > -CPU load on OSD nodes peaks at 6~.
> > -iostat shows a peak of 15ms under read/write workloads, %util peaks at 
> > about 10%.
> > -Swapped out the RBD client for a bigger box, since the load was peaking at 
> > 16. Now a 24 core box, load still peaks at 16.
> > -Disabled cephx signatures
> > -Verified hardware health (nothing in dmesg, nothing in CIMC fault logs, 
> > storage controller logs)
> > -Tested multiple SSDs at once to find the controller's iops limit, which is
> > apparently 650k @ 4k.
> >
> > Nothing has made a noticeable difference here. I'm pretty baffled as to 
> > what would be causing the awful sequential read and write performance, but 
> > allowing good random r/w speeds.
> >
> > I switched up fio testing methodologies to use more threads, but this 
> > didn't seem to help either:
> >
> > [global]
> > bs=4k
> > ioengine=rbd
> > iodepth=32
> > size=5g
> > runtime=120
> > numjobs=4
> > group_reporting=1
> > pool=rbd_af1
> > rbdname=image1
> >
> > [seq-read]
> > rw=read
> > stonewall
> >
> > [rand-read]
> > rw=randread
> > stonewall
> >
> > [seq-write]
> > rw=write
> > stonewall
> >
> > [rand-write]
> > rw=randwrite
> > stonewall
> >
> > Any pointers are appreciated at this point. I've been following other 
> > threads on the mailing list, and looked at the archives, related to RBD 
> > performance but none of the solutions that worked for others seem to have 
> > helped this setup.
> >
> > Thanks,
> > Anthony
> >
> > 
> > From: Anthony Brandelli (abrandel) 
> > Sent: Tuesday, January 14, 2020 12:43 AM
> > To: ceph-users@lists.ceph.com 
> > Subject: Slow Performance - Sequential IO
> >
> >
> > I have a newly setup test cluster that is giving some surprising numbers 
> > when running fio against an RBD. The end goal here is to see how viable a 
> > Ceph based iSCSI SAN of sorts is for VMware clusters, which require a bunch 
> > of random IO.
> >
> >
> >
> > Hardware:
> >
> > 2x E5-2630L v2 (2.4GHz, 6 core)
> >
> > 256GB RAM
> >
> > 2x 10gbps bonded network, Intel X520
> >
> > LSI 9271-8i, SSDs used for OSDs in JBOD mode
> >
> > Mons: 2x 1.2TB 10K SAS in RAID1
> >
> > OSDs: 12x Samsung MZ6ER800HAGL-3 800GB SAS SSDs, super cap/power loss 
> > protection
> >
> >
> >
> > Cluster setup:
> >
> > Three mon nodes, four OSD nodes
> >
> > Two OSDs per SSD
> >
> > Replica 3 pool
> >
> > Ceph 14.2.5
> >
> >
> >
> > Ceph status:
> >
> >   cluster:
> >
> > id: e3d93b4a-520c-4d82-a135-97d0bda3e69d
> >
> > health: HEALTH_WARN
> >
> > application not enabled on 1 pool(s)
> >
> >   services:
> >
> > mon: 3 daemons, quorum mon1,mon2,mon3 (age 6d)
> >
> > mgr: mon2(active, since 6d), standbys: mon3, mon1
> >
> > osd: 96 osds: 96 up (since 3d), 96 in (since 3d)
> >
> >   data:
> >
> > pools:   1 pools, 3072 pgs
> >
> > objects: 857.00k objects, 1.8 TiB
> >
> > usage:   432 GiB used, 34 TiB / 35 TiB avail
> >
> > pgs