Re: [ceph-users] Can't activate OSD with journal and data on the same disk
Hi,

Some news: I tried activating the disk without --dmcrypt and there is no problem. After activation there are two partitions on sdb (sdb2 for the journal and sdb1 for data). In my opinion there is a bug with the --dmcrypt switch when the journal is colocated on the disk (the partitions are created, but the mounting done by ceph-disk fails). Here are the logs without --dmcrypt:

root@ceph-deploy:~/ceph# ceph-deploy osd prepare ceph-node0:/dev/sdb
[ceph_deploy.cli][INFO ] Invoked (1.3.1): /usr/bin/ceph-deploy osd prepare ceph-node0:/dev/sdb
[ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks ceph-node0:/dev/sdb:
[ceph-node0][DEBUG ] connected to host: ceph-node0
[ceph-node0][DEBUG ] detect platform information from remote host
[ceph-node0][DEBUG ] detect machine type
[ceph_deploy.osd][INFO ] Distro info: Ubuntu 13.04 raring
[ceph_deploy.osd][DEBUG ] Deploying osd to ceph-node0
[ceph-node0][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-node0][INFO ] Running command: udevadm trigger --subsystem-match=block --action=add
[ceph_deploy.osd][DEBUG ] Preparing host ceph-node0 disk /dev/sdb journal None activate False
[ceph-node0][INFO ] Running command: ceph-disk-prepare --fs-type xfs --cluster ceph -- /dev/sdb
[ceph-node0][ERROR ] INFO:ceph-disk:Will colocate journal with data on /dev/sdb
[ceph-node0][DEBUG ] Information: Moved requested sector from 34 to 2048 in
[ceph-node0][DEBUG ] order to align on 2048-sector boundaries.
[ceph-node0][DEBUG ] The operation has completed successfully.
[ceph-node0][DEBUG ] Information: Moved requested sector from 2097153 to 2099200 in
[ceph-node0][DEBUG ] order to align on 2048-sector boundaries.
[ceph-node0][DEBUG ] The operation has completed successfully.
[ceph-node0][DEBUG ] meta-data=/dev/sdb1 isize=2048 agcount=4, agsize=917439 blks
[ceph-node0][DEBUG ] = sectsz=512 attr=2, projid32bit=0
[ceph-node0][DEBUG ] data = bsize=4096 blocks=3669755, imaxpct=25
[ceph-node0][DEBUG ] = sunit=0 swidth=0 blks
[ceph-node0][DEBUG ] naming =version 2 bsize=4096 ascii-ci=0
[ceph-node0][DEBUG ] log =internal log bsize=4096 blocks=2560, version=2
[ceph-node0][DEBUG ] = sectsz=512 sunit=0 blks, lazy-count=1
[ceph-node0][DEBUG ] realtime =none extsz=4096 blocks=0, rtextents=0
[ceph-node0][DEBUG ] The operation has completed successfully.
[ceph_deploy.osd][DEBUG ] Host ceph-node0 is now ready for osd use.

The disk is properly activated. With --dmcrypt the journal partition is not properly mounted and ceph-disk cannot use it.

Best Regards,
Michael

Hi! I have a question about activating an OSD on a whole disk. I can't get past this issue. Conf spec: 8 VMs - ceph-deploy; ceph-admin; ceph-mon0-2 and ceph-node0-2. I started by creating the MONs - all good. After that I want to prepare and activate 3x OSD with dm-crypt.
So I put on ceph.conf this [osd.0] host = ceph-node0 cluster addr = 10.0.0.75:6800 public addr = 10.0.0.75:6801 devs = /dev/sdb Next I use ceph-deploy to activate a OSD and this shows root@ceph-deploy:~/ceph# ceph-deploy osd prepare ceph-node0:/dev/sdb --dmcrypt [ceph_deploy.cli][INFO ] Invoked (1.3.1): /usr/bin/ceph-deploy osd prepare ceph-node0:/dev/sdb --dmcrypt [ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks ceph-node0:/dev/sdb: [ceph-node0][DEBUG ] connected to host: ceph-node0 [ceph-node0][DEBUG ] detect platform information from remote host [ceph-node0][DEBUG ] detect machine type [ceph_deploy.osd][INFO ] Distro info: Ubuntu 13.04 raring [ceph_deploy.osd][DEBUG ] Deploying osd to ceph-node0 [ceph-node0][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf [ceph-node0][INFO ] Running command: udevadm trigger --subsystem-match=block --action=add [ceph_deploy.osd][DEBUG ] Preparing host ceph-node0 disk /dev/sdb journal None activate False [ceph-node0][INFO ] Running command: ceph-disk-prepare --fs-type xfs --dmcrypt --dmcrypt-key-dir /etc/ceph/dmcrypt-keys --cluster ceph -- /dev/sdb [ceph-node0][ERROR ] INFO:ceph-disk:Will colocate journal with data on /dev/sdb [ceph-node0][ERROR ] ceph-disk: Error: partition 1 for /dev/sdb does not appear to exist [ceph-node0][DEBUG ] Information: Moved requested sector from 34 to 2048 in [ceph-node0][DEBUG ] order to align on 2048-sector boundaries. [ceph-node0][DEBUG ] The operation has completed successfully. [ceph-node0][DEBUG ] Information: Moved requested sector from 2097153 to 2099200 in [ceph-node0][DEBUG ] order to align on 2048-sector boundaries. [ceph-node0][DEBUG ] Warning: The kernel is still using the old partition table. [ceph-node0][DEBUG ] The new table will be used at the next reboot. [ceph-node0][DEBUG ] The operation has completed successfully. [ceph-node0][ERROR ] Traceback (most recent
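One possible workaround, until the --dmcrypt colocation issue is sorted out, might be to keep the journal off the data disk entirely, so that ceph-disk never has to mount an encrypted journal partition from the same device. This is only an untested sketch; it assumes a spare device /dev/sdc is available for the journal:

ceph-deploy disk zap ceph-node0:/dev/sdb
ceph-deploy osd prepare ceph-node0:/dev/sdb:/dev/sdc --dmcrypt

ceph-deploy accepts the host:data-disk:journal-disk form, so the journal partition would be created (and encrypted) on /dev/sdc instead of being colocated on /dev/sdb.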
Re: [ceph-users] please help me.problem with my ceph
Hello Joseph This sounds like a solution , BTW how to set replication level to 1 , is there any direct command or need to edit configuration file. Many Thanks Karan Singh - Original Message - From: Joseph R Gruher joseph.r.gru...@intel.com To: ceph-users@lists.ceph.com Sent: Thursday, 7 November, 2013 9:14:45 PM Subject: Re: [ceph-users] please help me.problem with my ceph From: ceph-users-boun...@lists.ceph.com [mailto:ceph-users- boun...@lists.ceph.com] On Behalf Of ?? Sent: Wednesday, November 06, 2013 10:04 PM To: ceph-users Subject: [ceph-users] please help me.problem with my ceph 1. I have installed ceph with one mon/mds and one osd.When i use 'ceph - s',there si a warning:health HEALTH_WARN 384 pgs degraded; 384 pgs stuck unclean; recovery 21/42 degraded (50.000%) I would think this is because Ceph defaults to a replication level of 2 and you only have one OSD (nowhere to write a second copy) so you are degraded? You could add a second OSD or perhaps you could set the replication level to 1? ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] please help me.problem with my ceph
Hi Karan,

There's info on http://ceph.com/docs/master/rados/operations/pools/

But primarily you need to check your replication levels:
ceph osd dump -o - | grep 'rep size'

Then alter the pools that are stuck unclean:
ceph osd pool set <pool> size/min_size <#>

If you're new to ceph it's probably a good idea to double check your pg numbers while you're doing this.

-Michael

On 08/11/2013 11:08, Karan Singh wrote: Hello Joseph This sounds like a solution , BTW how to set replication level to 1 , is there any direct command or need to edit configuration file. Many Thanks Karan Singh - Original Message - From: Joseph R Gruher joseph.r.gru...@intel.com To: ceph-users@lists.ceph.com Sent: Thursday, 7 November, 2013 9:14:45 PM Subject: Re: [ceph-users] please help me.problem with my ceph From: ceph-users-boun...@lists.ceph.com [mailto:ceph-users- boun...@lists.ceph.com] On Behalf Of ?? Sent: Wednesday, November 06, 2013 10:04 PM To: ceph-users Subject: [ceph-users] please help me.problem with my ceph 1. I have installed ceph with one mon/mds and one osd.When i use 'ceph - s',there si a warning:health HEALTH_WARN 384 pgs degraded; 384 pgs stuck unclean; recovery 21/42 degraded (50.000%) I would think this is because Ceph defaults to a replication level of 2 and you only have one OSD (nowhere to write a second copy) so you are degraded? You could add a second OSD or perhaps you could set the replication level to 1? ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] please help me.problem with my ceph
Apologies, that should have been: ceph osd dump | grep 'rep size' What I get from blindly copying from a wiki! -Michael On 08/11/2013 11:38, Michael wrote: Hi Karan, There's info on http://ceph.com/docs/master/rados/operations/pools/ But primarily you need to check your replication levels: ceph osd dump -o -|grep 'rep size' Then alter the pools that are stuck unclean: ceph osd pool set size/min_size # If you're new to ceph it's probably a good idea to double check your pg numbers while you're doing this. -Michael On 08/11/2013 11:08, Karan Singh wrote: Hello Joseph This sounds like a solution , BTW how to set replication level to 1 , is there any direct command or need to edit configuration file. Many Thanks Karan Singh - Original Message - From: Joseph R Gruher joseph.r.gru...@intel.com To: ceph-users@lists.ceph.com Sent: Thursday, 7 November, 2013 9:14:45 PM Subject: Re: [ceph-users] please help me.problem with my ceph From: ceph-users-boun...@lists.ceph.com [mailto:ceph-users- boun...@lists.ceph.com] On Behalf Of ?? Sent: Wednesday, November 06, 2013 10:04 PM To: ceph-users Subject: [ceph-users] please help me.problem with my ceph 1. I have installed ceph with one mon/mds and one osd.When i use 'ceph - s',there si a warning:health HEALTH_WARN 384 pgs degraded; 384 pgs stuck unclean; recovery 21/42 degraded (50.000%) I would think this is because Ceph defaults to a replication level of 2 and you only have one OSD (nowhere to write a second copy) so you are degraded? You could add a second OSD or perhaps you could set the replication level to 1? ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
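To answer the original question directly: there is a per-pool command rather than a configuration-file setting. A minimal sketch, assuming the default pools data, metadata and rbd:

ceph osd pool set data size 1
ceph osd pool set metadata size 1
ceph osd pool set rbd size 1

With a single OSD it may also be necessary to drop min_size to 1 (e.g. ceph osd pool set data min_size 1) before the PGs go active+clean.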
[ceph-users] Unable to find bootstrap-osd and bootstrap-mds ceph.keyring
Hi All, I am able to Add a Ceph Monitor (step 3) as per the link http://ceph.com/docs/master/start/quick-ceph-deploy/ (Setting Up Ceph Storage Cluster) But when I am executing the gatherkey command, I am getting the warnings(highlighted in yellow). Please find the details – Command – “ceph-deploy gatherkeys vikrant”(vikrant is the hostname of the ceph-node1) Output – [ceph_deploy.cli][INFO ] Invoked (1.3.1): /usr/bin/ceph-deploy gatherkeys vikrant [ceph_deploy.gatherkeys][DEBUG ] Have ceph.client.admin.keyring [ceph_deploy.gatherkeys][DEBUG ] Have ceph.mon.keyring [ceph_deploy.gatherkeys][DEBUG ] Checking vikrant for /var/lib/ceph/bootstrap-osd/ceph.keyring [vikrant][DEBUG ] connected to host: vikrant [vikrant][DEBUG ] detect platform information from remote host [vikrant][DEBUG ] detect machine type [vikrant][DEBUG ] fetch remote file [ceph_deploy.gatherkeys][WARNIN] Unable to find /var/lib/ceph/bootstrap-osd/ceph.keyring on ['vikrant'] [ceph_deploy.gatherkeys][DEBUG ] Checking vikrant for /var/lib/ceph/bootstrap-mds/ceph.keyring [vikrant][DEBUG ] connected to host: vikrant [vikrant][DEBUG ] detect platform information from remote host [vikrant][DEBUG ] detect machine type [vikrant][DEBUG ] fetch remote file [ceph_deploy.gatherkeys][WARNIN] Unable to find /var/lib/ceph/bootstrap-mds/ceph.keyring on ['vikrant'] I checked ceph.keyring is not generated for bootstrap-osd and bootstrap-mds in ceph-node1, due to which the next command “ceph-deploy osd prepare ceph-node2” is giving error. *Please find the setup details* – One Admin Node – from where I am executing ceph-deploy commands Ceph-node1 – this is the ceph monitor, (hostname is vikrant) Ceph-node2 – Ceph OSD, this is on a separate machine ( as of now I am trying to configure one OSD, in the link they have mentioned the example for two OSD) *Content of ceph.conf* (this is same for admin node and ceph-node1) [global] fsid = eb4099a6-d2ab-437c-94f2-f3b43b3170d1 mon_initial_members = vikrant mon_host = 10.XX.XX.XX auth_supported = cephx osd_journal_size = 1024 filestore_xattr_use_omap = true *Output of “ceph-deploy mon create vikrant”* command (vikrant is the hostname of the ceph-node1) -- -ceph-deploy mon create vikrant -o/p-- [ceph_deploy.cli][INFO ] Invoked (1.3.1): /usr/bin/ceph-deploy mon create vikrant [ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts vikrant [ceph_deploy.mon][DEBUG ] detecting platform for host vikrant ... 
[vikrant][DEBUG ] connected to host: vikrant [vikrant][DEBUG ] detect platform information from remote host [vikrant][DEBUG ] detect machine type [ceph_deploy.mon][INFO ] distro info: Ubuntu 12.04 precise [vikrant][DEBUG ] determining if provided host has same hostname in remote [vikrant][DEBUG ] get remote short hostname [vikrant][DEBUG ] deploying mon to vikrant [vikrant][DEBUG ] get remote short hostname [vikrant][DEBUG ] remote hostname: vikrant [vikrant][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf [vikrant][DEBUG ] create the mon path if it does not exist [vikrant][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-vikrant/done [vikrant][DEBUG ] done path does not exist: /var/lib/ceph/mon/ceph-vikrant/done [vikrant][INFO ] creating tmp path: /var/lib/ceph/tmp [vikrant][INFO ] creating keyring file: /var/lib/ceph/tmp/ceph-vikrant.mon.keyring [vikrant][DEBUG ] create the monitor keyring file [vikrant][INFO ] Running command: sudo ceph-mon --cluster ceph --mkfs -i vikrant --keyring /var/lib/ceph/tmp/ceph-vikrant.mon.keyring [vikrant][DEBUG ] ceph-mon: mon.noname-a 10.XX.XX.XX:6789/0 is local, renaming to mon.vikrant [vikrant][DEBUG ] ceph-mon: set fsid to eb4099a6-d2ab-437c-94f2-f3b43b3170d1 [vikrant][DEBUG ] ceph-mon: created monfs at /var/lib/ceph/mon/ceph-vikrant for mon.vikrant [vikrant][INFO ] unlinking keyring file /var/lib/ceph/tmp/ceph-vikrant.mon.keyring [vikrant][DEBUG ] create a done file to avoid re-doing the mon deployment [vikrant][DEBUG ] create the init path if it does not exist [vikrant][DEBUG ] locating the `service` executable... [vikrant][INFO ] Running command: sudo initctl emit ceph-mon cluster=ceph id=vikrant [vikrant][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.vikrant.asok mon_status [vikrant][DEBUG ] [vikrant][DEBUG ] status for monitor: mon.vikrant [vikrant][DEBUG ] { [vikrant][DEBUG ] election_epoch: 2, [vikrant][DEBUG ] extra_probe_peers: [], [vikrant][DEBUG ] monmap: { [vikrant][DEBUG ] created: 0.00, [vikrant][DEBUG ] epoch: 1, [vikrant][DEBUG ] fsid: eb4099a6-d2ab-437c-94f2-f3b43b3170d1, [vikrant][DEBUG ] modified: 0.00, [vikrant][DEBUG ] mons: [ [vikrant][DEBUG ] { [vikrant][DEBUG ] addr: 10.XX.XX.XX:6789/0, [vikrant][DEBUG ] name:
Re: [ceph-users] Unable to find bootstrap-osd and bootstrap-mds ceph.keyring
Hello Vikrant You can try creating directories manually on the monitor node mkdir -p /var/lib/ceph/{tmp,mon,mds,bootstrap-osd} * Important Do not call ceph-deploy with sudo or run it as root if you are logged in as a different user, because it will not issue sudo commands needed on the remote host. Try this hope this helps you. Many Thanks Karan Singh - Original Message - From: Vikrant Verma vikrantverm...@gmail.com To: ceph-users@lists.ceph.com Sent: Friday, 8 November, 2013 2:41:36 PM Subject: [ceph-users] Unable to find bootstrap-osd and bootstrap-mds ceph.keyring Hi All, I am able to Add a Ceph Monitor (step 3) as per the link http://ceph.com/docs/master/start/quick-ceph-deploy/ (Setting Up Ceph Storage Cluster) But when I am executing the gatherkey command, I am getting the warnings(highlighted in yellow). Please find the details – Command – “ ceph-deploy gatherkeys vikrant” (vikrant is the hostname of the ceph-node1) Output – [ceph_deploy.cli][INFO ] Invoked (1.3.1): /usr/bin/ceph-deploy gatherkeys vikrant [ceph_deploy.gatherkeys][DEBUG ] Have ceph.client.admin.keyring [ceph_deploy.gatherkeys][DEBUG ] Have ceph.mon.keyring [ceph_deploy.gatherkeys][DEBUG ] Checking vikrant for /var/lib/ceph/bootstrap-osd/ceph.keyring [vikrant][DEBUG ] connected to host: vikrant [vikrant][DEBUG ] detect platform information from remote host [vikrant][DEBUG ] detect machine type [vikrant][DEBUG ] fetch remote file [ceph_deploy.gatherkeys][WARNIN] Unable to find /var/lib/ceph/bootstrap-osd/ceph.keyring on ['vikrant'] [ceph_deploy.gatherkeys][DEBUG ] Checking vikrant for /var/lib/ceph/bootstrap-mds/ceph.keyring [vikrant][DEBUG ] connected to host: vikrant [vikrant][DEBUG ] detect platform information from remote host [vikrant][DEBUG ] detect machine type [vikrant][DEBUG ] fetch remote file [ceph_deploy.gatherkeys][WARNIN] Unable to find /var/lib/ceph/bootstrap-mds/ceph.keyring on ['vikrant'] I checked ceph.keyring is not generated for bootstrap-osd and bootstrap-mds in ceph-node1, due to which the next command “ ceph-deploy osd prepare ceph-node2” is giving error. Please find the setup details – One Admin Node – from where I am executing ceph-deploy commands Ceph-node1 – this is the ceph monitor, (hostname is vikrant) Ceph-node2 – Ceph OSD, this is on a separate machine ( as of now I am trying to configure one OSD, in the link they have mentioned the example for two OSD) Content of ceph.conf (this is same for admin node and ceph-node1) [global] fsid = eb4099a6-d2ab-437c-94f2-f3b43b3170d1 mon_initial_members = vikrant mon_host = 10.XX.XX.XX auth_supported = cephx osd_journal_size = 1024 filestore_xattr_use_omap = true Output of “ceph-deploy mon create vikrant” command (vikrant is the hostname of the ceph-node1) -- -ceph-deploy mon create vikrant -o/p-- [ceph_deploy.cli][INFO ] Invoked (1.3.1): /usr/bin/ceph-deploy mon create vikrant [ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts vikrant [ceph_deploy.mon][DEBUG ] detecting platform for host vikrant ... 
[vikrant][DEBUG ] connected to host: vikrant [vikrant][DEBUG ] detect platform information from remote host [vikrant][DEBUG ] detect machine type [ceph_deploy.mon][INFO ] distro info: Ubuntu 12.04 precise [vikrant][DEBUG ] determining if provided host has same hostname in remote [vikrant][DEBUG ] get remote short hostname [vikrant][DEBUG ] deploying mon to vikrant [vikrant][DEBUG ] get remote short hostname [vikrant][DEBUG ] remote hostname: vikrant [vikrant][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf [vikrant][DEBUG ] create the mon path if it does not exist [vikrant][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-vikrant/done [vikrant][DEBUG ] done path does not exist: /var/lib/ceph/mon/ceph-vikrant/done [vikrant][INFO ] creating tmp path: /var/lib/ceph/tmp [vikrant][INFO ] creating keyring file: /var/lib/ceph/tmp/ceph-vikrant.mon.keyring [vikrant][DEBUG ] create the monitor keyring file [vikrant][INFO ] Running command: sudo ceph-mon --cluster ceph --mkfs -i vikrant --keyring /var/lib/ceph/tmp/ceph-vikrant.mon.keyring [vikrant][DEBUG ] ceph-mon: mon.noname-a 10.XX.XX.XX:6789/0 is local, renaming to mon.vikrant [vikrant][DEBUG ] ceph-mon: set fsid to eb4099a6-d2ab-437c-94f2-f3b43b3170d1 [vikrant][DEBUG ] ceph-mon: created monfs at /var/lib/ceph/mon/ceph-vikrant for mon.vikrant [vikrant][INFO ] unlinking keyring file /var/lib/ceph/tmp/ceph-vikrant.mon.keyring [vikrant][DEBUG ] create a done file to avoid re-doing the mon deployment [vikrant][DEBUG ] create the init path if it does not exist [vikrant][DEBUG ] locating the `service` executable... [vikrant][INFO ] Running command: sudo initctl emit ceph-mon cluster=ceph id=vikrant [vikrant][INFO ] Running command: sudo ceph --cluster=ceph
Re: [ceph-users] About memory usage of ceph-mon on arm
I try to dump perf counter via admin socket, but I don't know what does these numbers actual mean or does these numbers have any thing to do with the different memory usage between arm and amd processors, so I attach the dump log as attachment(mon.a runs on AMD processor, mon.c runs on ARM processor). PS: after days of running(mon.b is 6 day, mon.c is 3 day), the memory consumption of both monitor running on arm board become stable, some what about 600MB, here is the heap stats: mon.btcmalloc heap stats: MALLOC: 594258992 ( 566.7 MiB) Bytes in use by application MALLOC: + 19529728 ( 18.6 MiB) Bytes in page heap freelist MALLOC: + 3885120 (3.7 MiB) Bytes in central cache freelist MALLOC: + 6486528 (6.2 MiB) Bytes in transfer cache freelist MALLOC: + 12202384 ( 11.6 MiB) Bytes in thread cache freelists MALLOC: + 2889952 (2.8 MiB) Bytes in malloc metadata MALLOC: MALLOC: =639252704 ( 609.6 MiB) Actual memory used (physical + swap) MALLOC: + 122880 (0.1 MiB) Bytes released to OS (aka unmapped) MALLOC: MALLOC: =639375584 ( 609.8 MiB) Virtual address space used MALLOC: MALLOC: 10231 Spans in use MALLOC: 24 Thread heaps in use MALLOC: 8192 Tcmalloc page size Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()). Bytes released to the mon.ctcmalloc heap stats: MALLOC: 593987584 ( 566.5 MiB) Bytes in use by application MALLOC: + 23969792 ( 22.9 MiB) Bytes in page heap freelist MALLOC: + 2172640 (2.1 MiB) Bytes in central cache freelist MALLOC: + 5874688 (5.6 MiB) Bytes in transfer cache freelist MALLOC: + 9268512 (8.8 MiB) Bytes in thread cache freelists MALLOC: + 2889952 (2.8 MiB) Bytes in malloc metadata MALLOC: MALLOC: =638163168 ( 608.6 MiB) Actual memory used (physical + swap) MALLOC: + 163840 (0.2 MiB) Bytes released to OS (aka unmapped) MALLOC: MALLOC: =638327008 ( 608.8 MiB) Virtual address space used MALLOC: MALLOC: 9796 Spans in use MALLOC: 14 Thread heaps in use MALLOC: 8192 Tcmalloc page size Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()). Bytes released to the On Fri, Nov 8, 2013 at 12:03 PM, Gregory Farnum g...@inktank.com wrote: I don't think this is anything we've observed before. Normally when a Ceph node is using more memory than its peers it's a consequence of something in that node getting backed up. You might try looking at the perf counters via the admin socket and seeing if something about them is different between your ARM and AMD processors. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Tue, Nov 5, 2013 at 7:21 AM, Yu Changyuan rei...@gmail.com wrote: Finally, my tiny ceph cluster get 3 monitors, newly added mon.b and mon.c both running on cubieboard2, which is cheap but still with enough cpu power(dual-core arm A7 cpu, 1.2G) and memory(1G). But compare to mon.a which running on an amd64 cpu, both mon.b and mon.c easily consume too much memory, so I want to know whether this is caused by memory leak. 
Below is the output of 'ceph tell mon.a heap stats' and 'ceph tell mon.c heap stats'(mon.c only start 12hr ago, while mon.a already running for more than 10 days) mon.atcmalloc heap stats: MALLOC:5480160 (5.2 MiB) Bytes in use by application MALLOC: + 28065792 ( 26.8 MiB) Bytes in page heap freelist MALLOC: + 15242312 ( 14.5 MiB) Bytes in central cache freelist MALLOC: + 10116608 (9.6 MiB) Bytes in transfer cache freelist MALLOC: + 10432216 (9.9 MiB) Bytes in thread cache freelists MALLOC: + 1667224 (1.6 MiB) Bytes in malloc metadata MALLOC: MALLOC: = 71004312 ( 67.7 MiB) Actual memory used (physical + swap) MALLOC: + 57540608 ( 54.9 MiB) Bytes released to OS (aka unmapped) MALLOC: MALLOC: =128544920 ( 122.6 MiB) Virtual address space used MALLOC: MALLOC: 4655 Spans in use MALLOC: 34 Thread heaps in use MALLOC: 8192 Tcmalloc page size Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()). Bytes released to the mon.ctcmalloc heap stats: MALLOC: 175861640 ( 167.7 MiB) Bytes in use by application MALLOC: + 2220032 (2.1 MiB) Bytes in page heap freelist MALLOC: +
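One low-risk thing to try, since the tcmalloc output itself suggests ReleaseFreeMemory(): the monitors expose that call through the tell interface, e.g.

ceph tell mon.c heap release

Note that this only returns freelist pages to the OS; it will not shrink the ~566 MiB reported as "Bytes in use by application", so by itself it probably does not explain the ARM vs AMD difference.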
[ceph-users] radosgw configuration problems - Swift access
All,

I have configured a rados gateway as per the Dumpling quick instructions on a Red Hat 6 server. The idea is to use the Swift API to access my cluster via this interface. I configured FastCGI and httpd as per the guides, did all the user creation/authtool commands for the Swift user/subuser etc., and it all seemed to go through smoothly. When I created the user I got back the following results:

{ "user_id": "greg",
  "display_name": "removed name",
  "email": "removed email",
  "suspended": 0,
  "max_buckets": 1000,
  "auid": 0,
  "subusers": [
        { "id": "greg:swift",
          "permissions": "full-control"}],
  "keys": [
        { "user": "greg",
          "access_key": "B776L21S7ACY97B3VDLE",
          "secret_key": "1V4BMEixut1SgzbPsKmt3SfqHhOgxMjkbINF4bMy"}],
  "swift_keys": [
        { "user": "greg:swift",
          "secret_key": "a3FOvDaCHNotU93jIJaUE0Qyzc6CPLH5By5Km7p6"}],
  "caps": [],
  "op_mask": "read, write, delete",
  "default_placement": "",
  "placement_tags": []}

At least a week or so passed between doing the above configuration and then trying to connect via Swift. There were no changes done to the Ceph configuration during this time. Now, when trying to access the cluster using Swift, we get an "account not found" error. This is using the following example command listed in the ceph docs:

swift -V 1.0 -A http://fqdn_of_gateway/auth -U greg:swift -K a3FOvDaCHNotU93jIJaUE0Qyzc6CPLH5By5Km7p6 post test

When doing a debug version of this command it adds the fact that it can't find the /auth directory prior to saying "Account not found". As part of debugging this, I tried to run the radosgw-admin user check command to confirm the configuration. It gives the following error:

2013-11-08 12:55:06.228905 7f8dd9d32820 0 WARNING: cannot read region map

I then tried to add a new user using the same process as before, and it gave the same error. My questions are:
- Has anyone seen this before?
- Why could I create the initial user above, and suddenly I can't create any more, or indeed view this one's details?
- Is there a step I have missed to create a region? The version of the doc I read did not (as far as I recall) include anything about it. I have looked at the current docs, but they have been updated to include the Emperor stuff which is multi-region, and I can't find the Dumpling-specific stuff anymore to be sure.

Any help would be appreciated.

Thanks

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
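A couple of hedged checks that may help narrow this down. First, verify that the user and the swift subuser/key are still intact, and regenerate the swift secret if in doubt (the uid and subuser names below are simply the ones from the output above):

radosgw-admin user info --uid=greg
radosgw-admin key create --subuser=greg:swift --key-type=swift --gen-secret

Second, confirm that requests to /auth actually reach the gateway rather than being handled by httpd itself, for example:

curl -i http://fqdn_of_gateway/auth -H "X-Auth-User: greg:swift" -H "X-Auth-Key: a3FOvDaCHNotU93jIJaUE0Qyzc6CPLH5By5Km7p6"

Some setups expose the Swift auth endpoint as /auth/1.0, so trying -A http://fqdn_of_gateway/auth/1.0 with the swift client is also worth a shot.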
[ceph-users] radosgw setting puplic ACLs fails.
Hi, I'm trying to set public ACLs to an object, so that I can access the object via Web-browser. unfortunately without success: s3cmd setacl --acl-public s3://test/hosts ERROR: S3 error: 403 (AccessDenied): The radosgw log says: x-amz-date:Fri, 08 Nov 2013 12:56:55 + /test/hosts?acl 2013-11-08 13:56:55.090604 7fe3314c6700 15 calculated digest=K6fFJdBvy1YXZw0kqZ7qt6sRkzk= 2013-11-08 13:56:55.090606 7fe3314c6700 15 auth_sign=K6fFJdBvy1YXZw0kqZ7qt6sRkzk= 2013-11-08 13:56:55.090607 7fe3314c6700 15 compare=0 2013-11-08 13:56:55.090610 7fe3314c6700 2 req 60:0.000290:s3:PUT /hosts:put_acls:reading permissions 2013-11-08 13:56:55.090621 7fe3314c6700 20 get_obj_state: rctx=0xf32a50 obj=.rgw:test state=0xf21888 s-prefetch_data=0 2013-11-08 13:56:55.090630 7fe3314c6700 10 moving .rgw+test to cache LRU end 2013-11-08 13:56:55.090632 7fe3314c6700 10 cache get: name=.rgw+test : hit 2013-11-08 13:56:55.090635 7fe3314c6700 20 get_obj_state: s-obj_tag was set empty 2013-11-08 13:56:55.090637 7fe3314c6700 20 Read xattr: user.rgw.idtag 2013-11-08 13:56:55.090639 7fe3314c6700 20 Read xattr: user.rgw.manifest 2013-11-08 13:56:55.090641 7fe3314c6700 10 moving .rgw+test to cache LRU end 2013-11-08 13:56:55.090642 7fe3314c6700 10 cache get: name=.rgw+test : hit 2013-11-08 13:56:55.090650 7fe3314c6700 20 rgw_get_bucket_info: bucket instance: test(@{i=.rgw.buckets.index}.rgw.buckets[default.4212.2]) 2013-11-08 13:56:55.090654 7fe3314c6700 20 reading from .rgw:.bucket.meta.test:default.4212.2 2013-11-08 13:56:55.090659 7fe3314c6700 20 get_obj_state: rctx=0xf32a50 obj=.rgw:.bucket.meta.test:default.4212.2 state=0xf39678 s-prefetch_data=0 2013-11-08 13:56:55.090663 7fe3314c6700 10 moving .rgw+.bucket.meta.test:default.4212.2 to cache LRU end 2013-11-08 13:56:55.090665 7fe3314c6700 10 cache get: name=.rgw+.bucket.meta.test:default.4212.2 : hit 2013-11-08 13:56:55.090668 7fe3314c6700 20 get_obj_state: s-obj_tag was set empty 2013-11-08 13:56:55.090670 7fe3314c6700 20 Read xattr: user.rgw.acl 2013-11-08 13:56:55.090671 7fe3314c6700 20 Read xattr: user.rgw.idtag 2013-11-08 13:56:55.090672 7fe3314c6700 20 Read xattr: user.rgw.manifest 2013-11-08 13:56:55.090674 7fe3314c6700 10 moving .rgw+.bucket.meta.test:default.4212.2 to cache LRU end 2013-11-08 13:56:55.090676 7fe3314c6700 10 cache get: name=.rgw+.bucket.meta.test:default.4212.2 : hit 2013-11-08 13:56:55.090690 7fe3314c6700 15 Read AccessControlPolicyAccessControlPolicy xmlns=http://s3.amazonaws.com/doc/2006-03-01/;OwnerIDtest/IDDisplayNameTest/DisplayName/OwnerAccessControlListGrantGrantee xmlns:xsi=http://www.w3.org/2001/XMLSchema-instance; xsi:type=CanonicalUserIDtest/IDDisplayNameTest/DisplayName/GranteePermissionFULL_CONTROL/Permission/Grant/AccessControlList/AccessControlPolicy 2013-11-08 13:56:55.090702 7fe3314c6700 20 get_obj_state: rctx=0xf32a50 obj=test:hosts state=0xf633e8 s-prefetch_data=0 2013-11-08 13:56:55.093871 7fe3314c6700 10 manifest: total_size = 156 2013-11-08 13:56:55.093875 7fe3314c6700 10 manifest: ofs=0 loc=test:hosts 2013-11-08 13:56:55.093876 7fe3314c6700 20 get_obj_state: setting s-obj_tag to default.4212.50 2013-11-08 13:56:55.093882 7fe3314c6700 15 Read AccessControlPolicyAccessControlPolicy xmlns=http://s3.amazonaws.com/doc/2006-03-01/;OwnerIDtest/IDDisplayNameTest/DisplayName/OwnerAccessControlListGrantGrantee xmlns:xsi=http://www.w3.org/2001/XMLSchema-instance; xsi:type=CanonicalUserIDtest/IDDisplayNameTest/DisplayName/GranteePermissionFULL_CONTROL/Permission/Grant/AccessControlList/AccessControlPolicy 2013-11-08 13:56:55.093889 
7fe3314c6700 2 req 60:0.003568:s3:PUT /hosts:put_acls:verifying op mask 2013-11-08 13:56:55.093894 7fe3314c6700 20 required_mask= 2 user.op_mask=7 2013-11-08 13:56:55.093896 7fe3314c6700 2 req 60:0.003576:s3:PUT /hosts:put_acls:verifying op permissions 2013-11-08 13:56:55.093900 7fe3314c6700 5 Searching permissions for uid=test mask=56 2013-11-08 13:56:55.093903 7fe3314c6700 5 Found permission: 15 2013-11-08 13:56:55.093905 7fe3314c6700 5 Searching permissions for group=1 mask=56 2013-11-08 13:56:55.093907 7fe3314c6700 5 Permissions for group not found 2013-11-08 13:56:55.093909 7fe3314c6700 5 Getting permissions id=test owner=test perm=8 2013-11-08 13:56:55.093912 7fe3314c6700 10 uid=test requested perm (type)=8, policy perm=8, user_perm_mask=15, acl perm=8 2013-11-08 13:56:55.093914 7fe3314c6700 2 req 60:0.003593:s3:PUT /hosts:put_acls:verifying op params 2013-11-08 13:56:55.093916 7fe3314c6700 2 req 60:0.003596:s3:PUT /hosts:put_acls:executing 2013-11-08 13:56:55.093938 7fe3314c6700 15 read len=343 data=AccessControlPolicy xmlns=http://s3.amazonaws.com/doc/2006-03-01/;OwnerID //OwnerAccessControlListGrantGrantee xmlns:xsi=http://www.w3.org/2001/XMLSchema-instance; xsi:type=GroupURIhttp://acs.amazonaws.com/groups/global/AllUsers/URI/GranteePermissionREAD/Permission/Grant/AccessControlList/AccessControlPolicy 2013-11-08 13:56:55.094007 7fe3314c6700 15 Old
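Two hedged checks that might help localize the 403: dump the object's current ACL, and re-run the setacl with s3cmd's debug output to see exactly what is sent and returned:

s3cmd info s3://test/hosts
s3cmd --debug setacl --acl-public s3://test/hosts

It may also be worth trying the same setacl on the bucket itself (s3cmd setacl --acl-public s3://test) to see whether only object ACLs are rejected.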
Re: [ceph-users] Ceph Block Storage QoS
On 11/08/2013 08:58 AM, Josh Durgin wrote:
On 11/08/2013 03:13 PM, ja...@peacon.co.uk wrote:
On 2013-11-08 03:20, Haomai Wang wrote:
On Fri, Nov 8, 2013 at 9:31 AM, Josh Durgin josh.dur...@inktank.com wrote:

I just list commands below to help users to understand: cinder qos-create high_read_low_write consumer=front-end read_iops_sec=1000 write_iops_sec=10

Does this have any normalisation of the IO units, for example to 8K or something? In VMware we have had similar controls for ages but they're not useful, as a Windows server will throw out 4MB IOs and skew all the metrics.

I don't think it does any normalization, but you could have different limits for different volume types, and use one volume type for windows and one volume type for non-windows. This might not make sense for all deployments, but it may be a usable workaround for that issue.

It is supported by Qemu. You can set not only IOPS but also bandwidth limits, for read, write or total. I don't know if OpenStack supports it though.

Josh ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

--
Wido den Hollander
42on B.V.
Phone: +31 (0)20 700 9902
Skype: contact42on

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
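For reference, when such limits are applied through libvirt/Qemu they show up as an <iotune> element on the disk in the instance's domain XML; a rough sketch of what that might look like (disk name and values are purely illustrative, matching the qos-create example above):

<disk type='network' device='disk'>
  <source protocol='rbd' name='volumes/volume-00000001'/>
  <target dev='vdb' bus='virtio'/>
  <iotune>
    <read_iops_sec>1000</read_iops_sec>
    <write_iops_sec>10</write_iops_sec>
  </iotune>
</disk>

Bandwidth caps use the corresponding read_bytes_sec/write_bytes_sec/total_bytes_sec elements.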
[ceph-users] Not recovering completely on OSD failure
Hi guys,

This is probably a configuration error, but I just can't find it. The following happens reproducibly on my cluster [1]:

15:52:15 On Host1 one disk is removed on the RAID controller (to Ceph it looks as if the disk died)
15:52:52 OSD reported missing (osd.47)
15:52:53 osdmap eXXX: 60 osds: 59 up, 60 in; 1.781% degraded, 436 PGs stuck unclean, 436 PGs degraded; not recovering yet
15:57:54 osdmap eXXX: 60 osds: 59 up, 59 in; start recovering
15:58:00 2.502% degraded
15:58:01 3.413% degraded; recovering at about 1GB/s -- recovering speed decreasing to about 40MB/s
17:02:10 10 PGs active+remapped, 218 PGs active+degraded, 0.898% degraded, stopped recovering
18:12 Still not recovering
A few days later: OSD removed [2], now recovering completely

I would like my cluster to recover completely without me interfering. Can anyone give an educated guess what went wrong here? I can't find the reason why the cluster would just stop recovering. Thank you for any hints!

Niklas

[1] 4 OSD hosts with 15 disks each. On each of the 60 identical disks there is one OSD. I have one large pool with 6000 PGs and a replica size of 4, and 3 (default) pools with 64 PGs each
[2] http://ceph.com/docs/master/rados/operations/add-or-rm-osds/#removing-osds-manual

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
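A few hedged commands that usually help narrow down where recovery is stuck (not a definitive diagnosis of this case):

ceph health detail
ceph pg dump_stuck unclean
ceph pg <pgid> query
ceph osd tree

The recovery_state section of the pg query output normally says what a stuck PG is waiting for. One thing that can bite a 4-host cluster with a 4-replica pool is CRUSH failing to find a fourth valid placement with legacy tunables after an OSD drops out; if the query output points that way, reading up on ceph osd crush tunables (e.g. ceph osd crush tunables optimal, which triggers data movement and has kernel-client version requirements) might be worthwhile.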
Re: [ceph-users] Unable to find bootstrap-osd and bootstrap-mds ceph.keyring
On Fri, Nov 8, 2013 at 7:41 AM, Vikrant Verma vikrantverm...@gmail.com wrote: Hi All, I am able to Add a Ceph Monitor (step 3) as per the link http://ceph.com/docs/master/start/quick-ceph-deploy/ (Setting Up Ceph Storage Cluster) But when I am executing the gatherkey command, I am getting the warnings(highlighted in yellow). Please find the details – Command – “ceph-deploy gatherkeys vikrant”(vikrant is the hostname of the ceph-node1) Output – [ceph_deploy.cli][INFO ] Invoked (1.3.1): /usr/bin/ceph-deploy gatherkeys vikrant [ceph_deploy.gatherkeys][DEBUG ] Have ceph.client.admin.keyring [ceph_deploy.gatherkeys][DEBUG ] Have ceph.mon.keyring [ceph_deploy.gatherkeys][DEBUG ] Checking vikrant for /var/lib/ceph/bootstrap-osd/ceph.keyring [vikrant][DEBUG ] connected to host: vikrant [vikrant][DEBUG ] detect platform information from remote host [vikrant][DEBUG ] detect machine type [vikrant][DEBUG ] fetch remote file [ceph_deploy.gatherkeys][WARNIN] Unable to find /var/lib/ceph/bootstrap-osd/ceph.keyring on ['vikrant'] [ceph_deploy.gatherkeys][DEBUG ] Checking vikrant for /var/lib/ceph/bootstrap-mds/ceph.keyring [vikrant][DEBUG ] connected to host: vikrant [vikrant][DEBUG ] detect platform information from remote host [vikrant][DEBUG ] detect machine type [vikrant][DEBUG ] fetch remote file [ceph_deploy.gatherkeys][WARNIN] Unable to find /var/lib/ceph/bootstrap-mds/ceph.keyring on ['vikrant'] I checked ceph.keyring is not generated for bootstrap-osd and bootstrap-mds in ceph-node1, due to which the next command “ceph-deploy osd prepare ceph-node2” is giving error. Please find the setup details – One Admin Node – from where I am executing ceph-deploy commands Ceph-node1 – this is the ceph monitor, (hostname is vikrant) Ceph-node2 – Ceph OSD, this is on a separate machine ( as of now I am trying to configure one OSD, in the link they have mentioned the example for two OSD) I think this Ceph-node2 is the problem. If I am understanding this correctly, there is no monitor in that node correct? When you start a monitor in a host, that monitor will also create the keys for you with `ceph-create-keys`. Running only one monitor in one host and not in the other one is something I am not familiar with, but I guess you could run that manually? Content of ceph.conf (this is same for admin node and ceph-node1) [global] fsid = eb4099a6-d2ab-437c-94f2-f3b43b3170d1 mon_initial_members = vikrant mon_host = 10.XX.XX.XX auth_supported = cephx osd_journal_size = 1024 filestore_xattr_use_omap = true Output of “ceph-deploy mon create vikrant” command (vikrant is the hostname of the ceph-node1) -- -ceph-deploy mon create vikrant -o/p-- [ceph_deploy.cli][INFO ] Invoked (1.3.1): /usr/bin/ceph-deploy mon create vikrant [ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts vikrant [ceph_deploy.mon][DEBUG ] detecting platform for host vikrant ... 
[vikrant][DEBUG ] connected to host: vikrant [vikrant][DEBUG ] detect platform information from remote host [vikrant][DEBUG ] detect machine type [ceph_deploy.mon][INFO ] distro info: Ubuntu 12.04 precise [vikrant][DEBUG ] determining if provided host has same hostname in remote [vikrant][DEBUG ] get remote short hostname [vikrant][DEBUG ] deploying mon to vikrant [vikrant][DEBUG ] get remote short hostname [vikrant][DEBUG ] remote hostname: vikrant [vikrant][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf [vikrant][DEBUG ] create the mon path if it does not exist [vikrant][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-vikrant/done [vikrant][DEBUG ] done path does not exist: /var/lib/ceph/mon/ceph-vikrant/done [vikrant][INFO ] creating tmp path: /var/lib/ceph/tmp [vikrant][INFO ] creating keyring file: /var/lib/ceph/tmp/ceph-vikrant.mon.keyring [vikrant][DEBUG ] create the monitor keyring file [vikrant][INFO ] Running command: sudo ceph-mon --cluster ceph --mkfs -i vikrant --keyring /var/lib/ceph/tmp/ceph-vikrant.mon.keyring [vikrant][DEBUG ] ceph-mon: mon.noname-a 10.XX.XX.XX:6789/0 is local, renaming to mon.vikrant [vikrant][DEBUG ] ceph-mon: set fsid to eb4099a6-d2ab-437c-94f2-f3b43b3170d1 [vikrant][DEBUG ] ceph-mon: created monfs at /var/lib/ceph/mon/ceph-vikrant for mon.vikrant [vikrant][INFO ] unlinking keyring file /var/lib/ceph/tmp/ceph-vikrant.mon.keyring [vikrant][DEBUG ] create a done file to avoid re-doing the mon deployment [vikrant][DEBUG ] create the init path if it does not exist [vikrant][DEBUG ] locating the `service` executable... [vikrant][INFO ] Running command: sudo initctl emit ceph-mon cluster=ceph id=vikrant [vikrant][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.vikrant.asok mon_status [vikrant][DEBUG ]
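If the monitor is up and in quorum but the bootstrap keyrings were simply never written, one thing that sometimes helps (a sketch, assuming mon.vikrant is healthy) is to run the key-creation step by hand on the monitor host and then gather again from the admin node:

sudo ceph-create-keys --id vikrant     # on vikrant, the monitor host
ceph-deploy gatherkeys vikrant         # from the admin node

ceph-create-keys waits for the monitor to reach quorum and then writes the client.admin keyring plus the bootstrap-osd and bootstrap-mds keyrings under /var/lib/ceph.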
Re: [ceph-users] Pool without a name, how to remove it?
On 11/08/2013 04:56 AM, Gregory Farnum wrote: I don't remember how this has come up or been dealt with in the past, but I believe it has been. Have you tried just doing it via the ceph or rados CLI tools with an empty pool name?

Yes, that worked!

root@rgw1:~# rados rmpool "" "" --yes-i-really-really-mean-it
successfully deleted pool
root@rgw1:~#

I feel stupid afterwards for not thinking about this.

Wido

-Greg Software Engineer #42 @ http://inktank.com | http://ceph.com

On Tue, Nov 5, 2013 at 6:58 AM, Wido den Hollander w...@42on.com wrote: Hi, On a Ceph cluster I have a pool without a name. I have no idea how it got there, but how do I remove it? pool 14 '' rep size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 158 owner 18446744073709551615 Is there a way to remove a pool by its ID? I couldn't find anything in librados to do so. -- Wido den Hollander 42on B.V. Phone: +31 (0)20 700 9902 Skype: contact42on ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

--
Wido den Hollander
42on B.V.
Phone: +31 (0)20 700 9902
Skype: contact42on

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Havana RBD - a few problems
Hi Josh Using libvirt_image_type=rbd to replace ephemeral disks is new with Havana, and unfortunately some bug fixes did not make it into the release. I've backported the current fixes on top of the stable/havana branch here: https://github.com/jdurgin/nova/tree/havana-ephemeral-rbd that looks really useful. I have tried to patch our installation, but so far haven't been successful: First I tried to replace the whole /usr/share/pyshared/nova directory with the one from your repository, then only the changed files. (Ubuntu Saucy). In both cases nova-compute dies immediately after starting it. There is probably a really simple way to install your version on an Ubuntu server - but I don't know how... 2) Creating a new instance from an ISO image fails completely - no bootable disk found, says the KVM console. Related? This sounds like a bug in the ephemeral rbd code - could you file it in launchpad if you can reproduce with file injection disabled? I suspect it's not being attached as a carom. Will try to reproduce as soon as I have the patched version You're seeing some issues in the ephemeral rbd code, which is new in Havana. None of these affect non-ephemeral rbd, or Grizzly. Thanks for reporting them! thanks for your help cheers jc ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Can someone please help me here?
Hi Alfredo,

See the steps I executed below and the weird error I am getting when trying to activate the OSDs. The last series of error messages is in an infinite loop, still printing after 2 days. FYI, /etc/ceph existed on all nodes after ceph-deploy install; I checked after doing ceph-deploy install. Do you need the ceph.log file?

[ceph@ceph-admin-node-centos-6-4 ~]# mkdir my-cluster
[ceph@ceph-admin-node-centos-6-4 ~]# cd my-cluster/
[ceph@ceph-admin-node-centos-6-4 ~]# ceph-deploy new ceph-node1-mon-centos-6-4

The above command creates a ceph.conf file with the cluster information in it. A log file by the name of ceph.log will also be created.

[ceph@ceph-admin-node-centos-6-4 my-cluster]# ceph-deploy install ceph-node1-mon-centos-6-4 ceph-node2-osd0-centos-6-4 ceph-node3-osd1-centos-6-4

This will install ceph on all the nodes.

I added a ceph monitor node:
[ceph@ceph-admin-node-centos-6-4 mycluster]$ ceph-deploy mon create ceph-node1-mon-centos-6-4

Gather keys:
[ceph@ceph-admin-node-centos-6-4 mycluster]$ ceph-deploy gatherkeys ceph-node1-mon-centos-6-4

After gathering keys, I made sure the directory has the monitor, admin, osd and mds keyrings:
[ceph@ceph-admin-node-centos-6-4 mycluster]$ ls
ceph.bootstrap-mds.keyring ceph.bootstrap-osd.keyring ceph.client.admin.keyring ceph.conf ceph.log ceph.mon.keyring

Added two OSDs: SSH'ed to both OSD nodes, i.e. ceph-node2-osd0-centos-6-4 and ceph-node3-osd1-centos-6-4, and created two directories to be used by the Ceph OSD daemons:

[ceph@ceph-admin-node-centos-6-4 mycluster]$ ssh ceph-node2-osd0-centos-6-4
Last login: Wed Oct 30 10:51:11 2013 from ceph-admin-node-centos-6-4
[ceph@ceph-node2-osd0-centos-6-4 ~]$ sudo mkdir -p /ceph/osd0
[ceph@ceph-node2-osd0-centos-6-4 ~]$ exit
logout
Connection to ceph-node2-osd0-centos-6-4 closed.
[ceph@ceph-admin-node-centos-6-4 mycluster]$ ssh ceph-node3-osd1-centos-6-4
Last login: Wed Oct 30 10:51:20 2013 from ceph-admin-node-centos-6-4
[ceph@ceph-node3-osd1-centos-6-4 ~]$ sudo mkdir -p /ceph/osd1
[ceph@ceph-node3-osd1-centos-6-4 ~]$ exit
logout
Connection to ceph-node3-osd1-centos-6-4 closed.
[ceph@ceph-admin-node-centos-6-4 mycluster]$

Used ceph-deploy to prepare the OSDs:

[ceph@ceph-admin-node-centos-6-4 mycluster]$ ceph-deploy osd prepare ceph-node2-osd0-centos-6-4:/ceph/osd0 ceph-node3-osd1-centos-6-4:/ceph/osd1
[ceph_deploy.cli][INFO ] Invoked (1.2.7): /usr/bin/ceph-deploy osd prepare ceph-node2-osd0-centos-6-4:/ceph/osd0 ceph-node3-osd1-centos-6-4:/ceph/osd1
.
.
.
[ceph-node3-osd1-centos-6-4][INFO ] create mon keyring file
[ceph-node3-osd1-centos-6-4][INFO ] Running command: udevadm trigger --subsystem-match=block --action=add
[ceph_deploy.osd][DEBUG ] Preparing host ceph-node3-osd1-centos-6-4 disk /ceph/osd1 journal None activate False
[ceph-node3-osd1-centos-6-4][INFO ] Running command: ceph-disk-prepare --fs-type xfs --cluster ceph -- /ceph/osd1
[ceph_deploy.osd][DEBUG ] Host ceph-node3-osd1-centos-6-4 is now ready for osd use.
[ceph@ceph-admin-node-centos-6-4 mycluster]$ Finally I activated the osds: [ceph@ceph-admin-node-centos-6-4 mycluster]$ ceph-deploy osd activate ceph-node2-osd0-centos-6-4:/ceph/osd0 ceph-node3-osd1-centos-6-4:/ceph/osd1 2013-11-06 14:26:53,373 [ceph_deploy.cli][INFO ] Invoked (1.3): /usr/bin/ceph-deploy osd activate ceph-node2-osd0-centos-6-4:/ceph/osd0 ceph-node3-osd1-centos-6-4:/ceph/osd1 2013-11-06 14:26:53,373 [ceph_deploy.osd][DEBUG ] Activating cluster ceph disks ceph-node2-osd0-centos-6-4:/ceph/osd0: ceph-node3-osd1-centos-6-4:/ceph/osd1: 2013-11-06 14:26:53,646 [ceph-node2-osd0-centos-6-4][DEBUG ] connected to host: ceph-node2-osd0-centos-6-4 2013-11-06 14:26:53,646 [ceph-node2-osd0-centos-6-4][DEBUG ] detect platform information from remote host 2013-11-06 14:26:53,662 [ceph-node2-osd0-centos-6-4][DEBUG ] detect machine type 2013-11-06 14:26:53,670 [ceph_deploy.osd][INFO ] Distro info: CentOS 6.4 Final 2013-11-06 14:26:53,670 [ceph_deploy.osd][DEBUG ] activating host ceph-node2-osd0-centos-6-4 disk /ceph/osd0 2013-11-06 14:26:53,670 [ceph_deploy.osd][DEBUG ] will use init type: sysvinit 2013-11-06 14:26:53,670 [ceph-node2-osd0-centos-6-4][INFO ] Running command: sudo ceph-disk-activate --mark-init sysvinit --mount /ceph/osd0 2013-11-06 14:26:53,891 [ceph-node2-osd0-centos-6-4][ERROR ] 2013-11-06 14:26:54.835529 7f589c9b7700 0 -- :/1019489 10.12.0.70:6789/0 pipe(0x7f5898024480 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f58980246e0).fault 2013-11-06 14:26:56,914 [ceph-node2-osd0-centos-6-4][ERROR ] 2013-11-06 14:26:57.830775 7f589c8b6700 0 -- :/1019489 10.12.0.70:6789/0 pipe(0x7f588c000c00 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f588c000e60).fault 2013-11-06 14:26:59,886 [ceph-node2-osd0-centos-6-4][ERROR ] 2013-11-06 14:27:00.831031 7f589c9b7700 0 -- :/1019489 10.12.0.70:6789/0 pipe(0x7f588c003010 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f588c003270).fault 2013-11-06 14:27:03,914
Re: [ceph-users] near full osd
Thanks Gregory,

One point that was a bit unclear in the documentation is whether or not this equation for PGs applies to a single pool, or the entirety of pools. Meaning, if I calculate 3000 PGs, should each pool have 3000 PGs or should all the pools ADD UP to 3000 PGs?

Thanks!

--
Kevin Weiler
IT
IMC Financial Markets | 233 S. Wacker Drive, Suite 4300 | Chicago, IL 60606 | http://imc-chicago.com/
Phone: +1 312-204-7439 | Fax: +1 312-244-3301 | E-Mail: kevin.wei...@imc-chicago.com

On 11/7/13 9:59 PM, Gregory Farnum g...@inktank.com wrote:

It sounds like maybe your PG counts on your pools are too low and so you're just getting a bad balance. If that's the case, you can increase the PG count with ceph osd pool set <name> pg_num <higher value>. OSDs should get data approximately equal to node weight/sum of node weights, so higher weights get more data and all its associated traffic.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

On Tue, Nov 5, 2013 at 8:30 AM, Kevin Weiler kevin.wei...@imc-chicago.com wrote:

All of the disks in my cluster are identical and therefore all have the same weight (each drive is 2TB and the automatically generated weight is 1.82 for each one). Would the procedure here be to reduce the weight, let it rebalance, and then put the weight back to where it was?

--
Kevin Weiler
IT
IMC Financial Markets | 233 S. Wacker Drive, Suite 4300 | Chicago, IL 60606 | http://imc-chicago.com/
Phone: +1 312-204-7439 | Fax: +1 312-244-3301 | E-Mail: kevin.wei...@imc-chicago.com

From: Aronesty, Erik earone...@expressionanalysis.com
Date: Tuesday, November 5, 2013 10:27 AM
To: Greg Chavez greg.cha...@gmail.com, Kevin Weiler kevin.wei...@imc-chicago.com
Cc: ceph-users@lists.ceph.com ceph-users@lists.ceph.com
Subject: RE: [ceph-users] near full osd

If there's an underperforming disk, why on earth would more data be put on it? You'd think it would be less... I would think an overperforming disk should (desirably) cause that case, right?

From: ceph-users-boun...@lists.ceph.com [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Greg Chavez
Sent: Tuesday, November 05, 2013 11:20 AM
To: Kevin Weiler
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] near full osd

Kevin, in my experience that usually indicates a bad or underperforming disk, or a too-high priority. Try running ceph osd crush reweight osd.## 1.0. If that doesn't do the trick, you may want to just out that guy. I don't think the crush algorithm guarantees balancing things out in the way you're expecting.
--Greg

On Tue, Nov 5, 2013 at 11:11 AM, Kevin Weiler kevin.wei...@imc-chicago.com wrote:

Hi guys,

I have an OSD in my cluster that is near full at 90%, but we're using a little less than half the available storage in the cluster. Shouldn't this be balanced out?

--
Kevin Weiler
IT
IMC Financial Markets | 233 S. Wacker Drive, Suite 4300 | Chicago, IL 60606 | http://imc-chicago.com/
Phone: +1 312-204-7439 | Fax: +1 312-244-3301 | E-Mail: kevin.wei...@imc-chicago.com
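For what it's worth, my reading of the placement-group docs is that the usual rule of thumb, total PGs of roughly (number of OSDs x 100) / replica count rounded up to the next power of two, is a cluster-wide target that the pools should together add up to (what ultimately matters is PGs per OSD), rather than a per-pool number, but I'd be glad to be corrected. A worked example with made-up numbers: 60 OSDs and size 3 gives 60 x 100 / 3 = 2000, rounded up to 2048, which would then be split across the pools that actually hold data and applied with something like:

ceph osd pool set <pool> pg_num 2048
ceph osd pool set <pool> pgp_num 2048

(pg_num can only be increased, not decreased, so it pays to start on the conservative side.)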
Re: [ceph-users] Havana RBD - a few problems
Using libvirt_image_type=rbd to replace ephemeral disks is new with Havana, and unfortunately some bug fixes did not make it into the release. I've backported the current fixes on top of the stable/havana branch here: https://github.com/jdurgin/nova/tree/havana-ephemeral-rbd that looks really useful. I have tried to patch our installation, but so far haven't been successful: First I tried to replace the whole /usr/share/pyshared/nova directory with the one from your repository, then only the changed files. (Ubuntu Saucy). In both cases nova-compute dies immediately after starting it. There is probably a really simple way to install your version on an Ubuntu server - but I don't know how… ok - got it working by cherry picking the last few commits, and then replacing only the 5 affected files. Resize of disk on instance creation works! Yay 2) Creating a new instance from an ISO image fails completely - no bootable disk found, says the KVM console. Related? This sounds like a bug in the ephemeral rbd code - could you file it in launchpad if you can reproduce with file injection disabled? I suspect it's not being attached as a carom. Will try to reproduce as soon as I have the patched version still doesn't work - will file a bug cheers jc ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
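For anyone else trying the same thing, the sequence behind "cherry picking the last few commits and replacing only the affected files" might look roughly like this (branch names as in the repository linked above; it assumes the fork also carries a stable/havana branch, and that the packaged nova lives under /usr/share/pyshared/nova on Ubuntu):

git clone -b havana-ephemeral-rbd https://github.com/jdurgin/nova.git
cd nova
git diff --name-only origin/stable/havana -- nova/virt/libvirt/
# copy only the files listed above into /usr/share/pyshared/nova/virt/libvirt/
# then restart nova-compute

Keeping the copy limited to the files the branch actually touches avoids pulling in unrelated changes that the packaged Havana code is not expecting.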
Re: [ceph-users] Can someone please help me here?
On Fri, Nov 8, 2013 at 11:04 AM, Trivedi, Narendra narendra.triv...@savvis.com wrote: Hi Alfredo, See the steps I executed below and the weird error I am getting when trying to activate OSDs- the last series of error messages are in an infinite loop- still printing 2 days . FYI, /etc/ceph existed on all nodes after ceph-deploy install. I checked after doing ceph-deploy install. Do you need the ceph.log file? This looks good enough. I see at least two different ceph-deploy versions used (1.2.7 and 1.3), have you tried running this with 1.3.1? I believe that you *might* be hitting a small issue in 1.3 that just got fixed and released. [ceph@ceph-admin-node-centos-6-4 ~]# mkdir my-cluster [ceph@ceph-admin-node-centos-6-4 ~]# cd my-cluster/ [ceph@ceph-admin-node-centos-6-4 ~]# ceph-deploy new ceph-node1-mon-centos-6-4 The above command creates a ceph.conf file with the cluster information in it. A log file by the name of ceph.log will also be created [ceph@ceph-admin-node-centos-6-4 my-cluster]# ceph-deploy install ceph-node1-mon-centos-6-4 ceph-node2-osd0-centos-6-4 ceph-node3-osd1-centos-6-4 This will install ceph on all the nodes I added a ceph monitor node: [ceph@ceph-admin-node-centos-6-4 mycluster]$ ceph-deploy mon create ceph-node1-mon-centos-6-4 Gather keys: [ceph@ceph-admin-node-centos-6-4 mycluster]$ ceph-deploy gatherkeys ceph-node1-mon-centos-6-4 After gathering keys, made sure the directory should have - monitoring, admin, osd, mds keyrings: [ceph@ceph-admin-node-centos-6-4 mycluster]$ ls ceph.bootstrap-mds.keyring ceph.bootstrap-osd.keyring ceph.client.admin.keyring ceph.conf ceph.log ceph.mon.keyring Addede two OSDs: Ssh;ed to both OSD nodes i.e. ceph-node2-osd0-centos-6-4 and ceph-node3-osd1-centos-6-4 and created two directories to be used as Ceph OSD daemons: [ceph@ceph-admin-node-centos-6-4 mycluster]$ ssh ceph-node2-osd0-centos-6-4 Last login: Wed Oct 30 10:51:11 2013 from ceph-admin-node-centos-6-4 [ceph@ceph-node2-osd0-centos-6-4 ~]$ sudo mkdir -p /ceph/osd0 [ceph@ceph-node2-osd0-centos-6-4 ~]$ exit logout Connection to ceph-node2-osd0-centos-6-4 closed. [ceph@ceph-admin-node-centos-6-4 mycluster]$ ssh ceph-node3-osd1-centos-6-4 Last login: Wed Oct 30 10:51:20 2013 from ceph-admin-node-centos-6-4 [ceph@ceph-node3-osd1-centos-6-4 ~]$ sudo mkdir -p /ceph/osd1 [ceph@ceph-node3-osd1-centos-6-4 ~]$ exit logout Connection to ceph-node3-osd1-centos-6-4 closed. [ceph@ceph-admin-node-centos-6-4 mycluster]$ Used ceph-deploy to prepare the OSDs: [ceph@ceph-admin-node-centos-6-4 mycluster]$ ceph-deploy osd prepare ceph-node2-osd0-centos-6-4:/ceph/osd0 ceph-node3-osd1-centos-6-4:/ceph/osd1 [ceph_deploy.cli][INFO ] Invoked (1.2.7): /usr/bin/ceph-deploy osd prepare ceph-node2-osd0-centos-6-4:/ceph/osd0 ceph-node3-osd1-centos-6-4:/ceph/osd1 . . . [ceph-node3-osd1-centos-6-4][INFO ] create mon keyring file [ceph-node3-osd1-centos-6-4][INFO ] Running command: udevadm trigger --subsystem-match=block --action=add [ceph_deploy.osd][DEBUG ] Preparing host ceph-node3-osd1-centos-6-4 disk /ceph/osd1 journal None activate False [ceph-node3-osd1-centos-6-4][INFO ] Running command: ceph-disk-prepare --fs-type xfs --cluster ceph -- /ceph/osd1 [ceph_deploy.osd][DEBUG ] Host ceph-node3-osd1-centos-6-4 is now ready for osd use. 
[ceph@ceph-admin-node-centos-6-4 mycluster]$ Finally I activated the osds: [ceph@ceph-admin-node-centos-6-4 mycluster]$ ceph-deploy osd activate ceph-node2-osd0-centos-6-4:/ceph/osd0 ceph-node3-osd1-centos-6-4:/ceph/osd1 2013-11-06 14:26:53,373 [ceph_deploy.cli][INFO ] Invoked (1.3): /usr/bin/ceph-deploy osd activate ceph-node2-osd0-centos-6-4:/ceph/osd0 ceph-node3-osd1-centos-6-4:/ceph/osd1 2013-11-06 14:26:53,373 [ceph_deploy.osd][DEBUG ] Activating cluster ceph disks ceph-node2-osd0-centos-6-4:/ceph/osd0: ceph-node3-osd1-centos-6-4:/ceph/osd1: 2013-11-06 14:26:53,646 [ceph-node2-osd0-centos-6-4][DEBUG ] connected to host: ceph-node2-osd0-centos-6-4 2013-11-06 14:26:53,646 [ceph-node2-osd0-centos-6-4][DEBUG ] detect platform information from remote host 2013-11-06 14:26:53,662 [ceph-node2-osd0-centos-6-4][DEBUG ] detect machine type 2013-11-06 14:26:53,670 [ceph_deploy.osd][INFO ] Distro info: CentOS 6.4 Final 2013-11-06 14:26:53,670 [ceph_deploy.osd][DEBUG ] activating host ceph-node2-osd0-centos-6-4 disk /ceph/osd0 2013-11-06 14:26:53,670 [ceph_deploy.osd][DEBUG ] will use init type: sysvinit 2013-11-06 14:26:53,670 [ceph-node2-osd0-centos-6-4][INFO ] Running command: sudo ceph-disk-activate --mark-init sysvinit --mount /ceph/osd0 2013-11-06 14:26:53,891 [ceph-node2-osd0-centos-6-4][ERROR ] 2013-11-06 14:26:54.835529 7f589c9b7700 0 -- :/1019489 10.12.0.70:6789/0 pipe(0x7f5898024480 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f58980246e0).fault 2013-11-06 14:26:56,914 [ceph-node2-osd0-centos-6-4][ERROR ]
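Independent of the ceph-deploy version, those repeating "pipe ... 10.12.0.70:6789/0 ... .fault" lines generally mean the OSD node simply cannot reach the monitor on port 6789, so some basic connectivity checks may be worthwhile; on CentOS 6.4 the default iptables rules are a common culprit. A hedged checklist:

ceph -s                                    # on the monitor node itself
sudo netstat -tlnp | grep 6789             # is ceph-mon actually listening?
telnet 10.12.0.70 6789                     # from the OSD node
sudo iptables -I INPUT -p tcp --dport 6789 -j ACCEPT    # temporarily open the mon port if it is blocked

If the monitor is reachable, the OSD nodes also need the 6800-7100/tcp range open between each other.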
Re: [ceph-users] Havana RBD - a few problems
and one more: boot from image (create a new volume) doesn't work either: it leads to a VM that complains about a non-bootable disk (just like the ISO case). This is actually an improvement: earlier, nova was waiting for ages for an image to be created (I guess that this is the result of the glance - cinder RBD improvements).

cheers
jc

--
SWITCH
Jens-Christian Fischer, Peta Solutions
Werdstrasse 2, P.O. Box, 8021 Zurich, Switzerland
phone +41 44 268 15 15, direct +41 44 268 15 71
jens-christian.fisc...@switch.ch
http://www.switch.ch http://www.switch.ch/socialmedia

On 08.11.2013, at 02:20, Josh Durgin josh.dur...@inktank.com wrote:

On 11/08/2013 12:15 AM, Jens-Christian Fischer wrote: Hi all, we have installed a Havana OpenStack cluster with RBD as the backing storage for volumes, images and the ephemeral images. The code as delivered in https://github.com/openstack/nova/blob/master/nova/virt/libvirt/imagebackend.py#L498 fails, because the Rbd path is not set. I have patched this to read:

Using libvirt_image_type=rbd to replace ephemeral disks is new with Havana, and unfortunately some bug fixes did not make it into the release. I've backported the current fixes on top of the stable/havana branch here: https://github.com/jdurgin/nova/tree/havana-ephemeral-rbd

@@ -419,10 +419,12 @@ class Rbd(Image):
         if path:
             try:
                 self.rbd_name = path.split('/')[1]
+                self.path = path
             except IndexError:
                 raise exception.InvalidDevicePath(path=path)
         else:
             self.rbd_name = '%s_%s' % (instance['name'], disk_name)
+            self.path = 'volumes/%s' % self.rbd_name
         self.snapshot_name = snapshot_name
         if not CONF.libvirt_images_rbd_pool:
             raise RuntimeError(_('You should specify'

but am not sure this is correct. I have the following problems:

1) can't inject data into image
2013-11-07 16:59:25.251 24891 INFO nova.virt.libvirt.driver [req-f813ef24-de7d-4a05-ad6f-558e27292495 c66a737acf0545fdb9a0a920df0794d9 2096e25f5e814882b5907bc5db342308] [instance: 2fa02e4f-f804-4679-9507-736eeebd9b8d] Injecting key into image fc8179d4-14f3-4f21-a76d-72b03b5c1862
2013-11-07 16:59:25.269 24891 WARNING nova.virt.disk.api [req-f813ef24-de7d-4a05-ad6f-558e27292495 c66a737acf0545fdb9a0a920df0794d9 2096e25f5e814882b5907bc5db342308] Ignoring error injecting data into image (Error mounting volumes/instance-0089_disk with libguestfs (volumes/instance-0089_disk: No such file or directory))
possibly the self.path = … is wrong - but what are the correct values?

Like Dinu mentioned, I'd suggest disabling file injection and using the metadata service + cloud-init instead. We should probably change nova to log an error about this configuration when ephemeral volumes are rbd.

2) Creating a new instance from an ISO image fails completely - no bootable disk found, says the KVM console. Related?

This sounds like a bug in the ephemeral rbd code - could you file it in launchpad if you can reproduce with file injection disabled? I suspect it's not being attached as a cdrom.

3) When creating a new instance from an image (non ISO images work), the disk is not resized to the size specified in the flavor (but left at the size of the original image)

This one is fixed in the backports already.

I would be really grateful, if those people that have Grizzly/Havana running with an RBD backend could pipe in here…

You're seeing some issues in the ephemeral rbd code, which is new in Havana. None of these affect non-ephemeral rbd, or Grizzly. Thanks for reporting them!
Josh ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
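As a reference for the suggestion above to disable file injection in favour of the metadata service + cloud-init, a minimal nova.conf sketch for a Havana-era compute node might look like the following; the exact option names are an assumption and should be checked against your release's documentation:

    [DEFAULT]
    # Assumption: Havana still reads these legacy DEFAULT-group options;
    # -2 disables file injection entirely
    libvirt_inject_partition = -2
    libvirt_inject_key = false
    libvirt_inject_password = false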
[ceph-users] Is Ceph a provider of block device too ?
Hi! I have clusters (IMAP service) with 2 members configured with Ubuntu + DRBD + Ext4. I intend to migrate to Ceph and begin to allow distributed access to the data. Does Ceph provide a distributed filesystem and a block device? Does Ceph work fine in clusters of two members? Thanks!
--
www.adminlinux.com.br
Thiago Henrique
___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] near full osd
It's not a hard value; you should adjust based on the size of your pools (many of them are quite small when used with RGW, for instance). But in general it is better to have more than fewer, and if you want to check you can look at the sizes of each PG (ceph pg dump) and increase the counts for pools with wide variability. -Greg

On Friday, November 8, 2013, Kevin Weiler wrote: Thanks Gregory, One point that was a bit unclear in the documentation is whether or not this equation for PGs applies to a single pool, or the entirety of pools. Meaning, if I calculate 3000 PGs, should each pool have 3000 PGs or should all the pools ADD UP to 3000 PGs? Thanks!
--
Kevin Weiler
IT
IMC Financial Markets | 233 S. Wacker Drive, Suite 4300 | Chicago, IL 60606 | http://imc-chicago.com/ Phone: +1 312-204-7439 | Fax: +1 312-244-3301 | E-Mail: kevin.wei...@imc-chicago.com

On 11/7/13 9:59 PM, Gregory Farnum g...@inktank.com wrote: It sounds like maybe your PG counts on your pools are too low and so you're just getting a bad balance. If that's the case, you can increase the PG count with ceph osd pool <name> set pg_num <higher value>. OSDs should get data approximately equal to node weight/sum of node weights, so higher weights get more data and all its associated traffic. -Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

On Tue, Nov 5, 2013 at 8:30 AM, Kevin Weiler kevin.wei...@imc-chicago.com wrote: All of the disks in my cluster are identical and therefore all have the same weight (each drive is 2TB and the automatically generated weight is 1.82 for each one). Would the procedure here be to reduce the weight, let it rebalance, and then put the weight back to where it was?
--
Kevin Weiler
IT
IMC Financial Markets | 233 S. Wacker Drive, Suite 4300 | Chicago, IL 60606 | http://imc-chicago.com/ Phone: +1 312-204-7439 | Fax: +1 312-244-3301 | E-Mail: kevin.wei...@imc-chicago.com

From: Aronesty, Erik earone...@expressionanalysis.com
Date: Tuesday, November 5, 2013 10:27 AM
To: Greg Chavez greg.cha...@gmail.com, Kevin Weiler kevin.wei...@imc-chicago.com
Cc: ceph-users@lists.ceph.com ceph-users@lists.ceph.com
Subject: RE: [ceph-users] near full osd

If there's an underperforming disk, why on earth would more data be put on it? You'd think it would be less… I would think an overperforming disk should (desirably) cause that case, right?

From: ceph-users-boun...@lists.ceph.com [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Greg Chavez
Sent: Tuesday, November 05, 2013 11:20 AM
To: Kevin Weiler
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] near full osd

Kevin, in my experience that usually indicates a bad or underperforming disk, or a too-high priority. Try running ceph osd crush reweight osd.## 1.0. If that doesn't do the trick, you may want to just out that guy. I don't think the crush algorithm guarantees balancing things out in the way you're expecting. --Greg

On Tue, Nov 5, 2013 at 11:11 AM, Kevin Weiler kevin.wei...@imc-chicago.com wrote: Hi guys, I have an OSD in my cluster that is near full at 90%, but we're using a little less than half the available storage in the cluster. Shouldn't this be balanced out?
--
Kevin Weiler
IT
IMC Financial Markets | 233 S.
Wacker Drive, Suite 4300 | Chicago, IL 60606 | http://imc-chicago.com/ Phone: +1 312-204-7439 | Fax: +1 312-244-3301 | E-Mail: kevin.wei...@imc-chicago.com
___ ceph-users mailing list ceph-users@lists.ceph.com
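For reference, a minimal sketch of the checks and commands discussed in this thread; the pool name "volumes", the OSD id "osd.12", and the target PG count are placeholders, not details from Kevin's cluster:

    ceph pg dump                              # inspect per-PG sizes and how they are spread
    ceph osd crush reweight osd.12 1.0        # Greg Chavez's suggestion for the overloaded OSD
    ceph osd pool set volumes pg_num 1024     # raise the PG count for a pool with too few PGs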
Re: [ceph-users] near full osd
After you increase the number of PGs, *and* increase the pgp_num to do the rebalancing (this is all described in the docs; do a search), data will move around and the overloaded OSD will have less stuff on it. If it's actually marked as full, though, this becomes a bit trickier. Search the list archives for some instructions; I don't remember the best course to follow. -Greg

On Friday, November 8, 2013, Kevin Weiler wrote: Thanks again Gregory! One more quick question. If I raise the amount of PGs for a pool, will this REMOVE any data from the full OSD? Or will I have to take the OSD out and put it back in to realize this benefit? Thanks!
--
*Kevin Weiler*
IT
IMC Financial Markets | 233 S. Wacker Drive, Suite 4300 | Chicago, IL 60606 | http://imc-chicago.com/ Phone: +1 312-204-7439 | Fax: +1 312-244-3301 | E-Mail: *kevin.wei...@imc-chicago.com*
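A minimal sketch of the sequence Greg describes (raise pg_num first, then pgp_num so the data actually rebalances); the pool name and target count are placeholders:

    ceph osd pool set volumes pg_num 1024
    ceph osd pool set volumes pgp_num 1024    # data only starts moving once pgp_num follows pg_num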
Re: [ceph-users] radosgw setting puplic ACLs fails.
On Fri, Nov 8, 2013 at 5:09 AM, Micha Krause mi...@krausam.de wrote: Hi, I'm trying to set public ACLs to an object, so that I can access the object via Web-browser. unfortunately without success: s3cmd setacl --acl-public s3://test/hosts ERROR: S3 error: 403 (AccessDenied): The radosgw log says: x-amz-date:Fri, 08 Nov 2013 12:56:55 + /test/hosts?acl 2013-11-08 13:56:55.090604 7fe3314c6700 15 calculated digest=K6fFJdBvy1YXZw0kqZ7qt6sRkzk= 2013-11-08 13:56:55.090606 7fe3314c6700 15 auth_sign=K6fFJdBvy1YXZw0kqZ7qt6sRkzk= 2013-11-08 13:56:55.090607 7fe3314c6700 15 compare=0 2013-11-08 13:56:55.090610 7fe3314c6700 2 req 60:0.000290:s3:PUT /hosts:put_acls:reading permissions 2013-11-08 13:56:55.090621 7fe3314c6700 20 get_obj_state: rctx=0xf32a50 obj=.rgw:test state=0xf21888 s-prefetch_data=0 2013-11-08 13:56:55.090630 7fe3314c6700 10 moving .rgw+test to cache LRU end 2013-11-08 13:56:55.090632 7fe3314c6700 10 cache get: name=.rgw+test : hit 2013-11-08 13:56:55.090635 7fe3314c6700 20 get_obj_state: s-obj_tag was set empty 2013-11-08 13:56:55.090637 7fe3314c6700 20 Read xattr: user.rgw.idtag 2013-11-08 13:56:55.090639 7fe3314c6700 20 Read xattr: user.rgw.manifest 2013-11-08 13:56:55.090641 7fe3314c6700 10 moving .rgw+test to cache LRU end 2013-11-08 13:56:55.090642 7fe3314c6700 10 cache get: name=.rgw+test : hit 2013-11-08 13:56:55.090650 7fe3314c6700 20 rgw_get_bucket_info: bucket instance: test(@{i=.rgw.buckets.index}.rgw.buckets[default.4212.2]) 2013-11-08 13:56:55.090654 7fe3314c6700 20 reading from .rgw:.bucket.meta.test:default.4212.2 2013-11-08 13:56:55.090659 7fe3314c6700 20 get_obj_state: rctx=0xf32a50 obj=.rgw:.bucket.meta.test:default.4212.2 state=0xf39678 s-prefetch_data=0 2013-11-08 13:56:55.090663 7fe3314c6700 10 moving .rgw+.bucket.meta.test:default.4212.2 to cache LRU end 2013-11-08 13:56:55.090665 7fe3314c6700 10 cache get: name=.rgw+.bucket.meta.test:default.4212.2 : hit 2013-11-08 13:56:55.090668 7fe3314c6700 20 get_obj_state: s-obj_tag was set empty 2013-11-08 13:56:55.090670 7fe3314c6700 20 Read xattr: user.rgw.acl 2013-11-08 13:56:55.090671 7fe3314c6700 20 Read xattr: user.rgw.idtag 2013-11-08 13:56:55.090672 7fe3314c6700 20 Read xattr: user.rgw.manifest 2013-11-08 13:56:55.090674 7fe3314c6700 10 moving .rgw+.bucket.meta.test:default.4212.2 to cache LRU end 2013-11-08 13:56:55.090676 7fe3314c6700 10 cache get: name=.rgw+.bucket.meta.test:default.4212.2 : hit 2013-11-08 13:56:55.090690 7fe3314c6700 15 Read AccessControlPolicyAccessControlPolicy xmlns=http://s3.amazonaws.com/doc/2006-03-01/;OwnerIDtest/IDDisplayNameTest/DisplayName/OwnerAccessControlListGrantGrantee xmlns:xsi=http://www.w3.org/2001/XMLSchema-instance; xsi:type=CanonicalUserIDtest/IDDisplayNameTest/DisplayName/GranteePermissionFULL_CONTROL/Permission/Grant/AccessControlList/AccessControlPolicy 2013-11-08 13:56:55.090702 7fe3314c6700 20 get_obj_state: rctx=0xf32a50 obj=test:hosts state=0xf633e8 s-prefetch_data=0 2013-11-08 13:56:55.093871 7fe3314c6700 10 manifest: total_size = 156 2013-11-08 13:56:55.093875 7fe3314c6700 10 manifest: ofs=0 loc=test:hosts 2013-11-08 13:56:55.093876 7fe3314c6700 20 get_obj_state: setting s-obj_tag to default.4212.50 2013-11-08 13:56:55.093882 7fe3314c6700 15 Read AccessControlPolicyAccessControlPolicy xmlns=http://s3.amazonaws.com/doc/2006-03-01/;OwnerIDtest/IDDisplayNameTest/DisplayName/OwnerAccessControlListGrantGrantee xmlns:xsi=http://www.w3.org/2001/XMLSchema-instance; 
xsi:type=CanonicalUserIDtest/IDDisplayNameTest/DisplayName/GranteePermissionFULL_CONTROL/Permission/Grant/AccessControlList/AccessControlPolicy 2013-11-08 13:56:55.093889 7fe3314c6700 2 req 60:0.003568:s3:PUT /hosts:put_acls:verifying op mask 2013-11-08 13:56:55.093894 7fe3314c6700 20 required_mask= 2 user.op_mask=7 2013-11-08 13:56:55.093896 7fe3314c6700 2 req 60:0.003576:s3:PUT /hosts:put_acls:verifying op permissions 2013-11-08 13:56:55.093900 7fe3314c6700 5 Searching permissions for uid=test mask=56 2013-11-08 13:56:55.093903 7fe3314c6700 5 Found permission: 15 2013-11-08 13:56:55.093905 7fe3314c6700 5 Searching permissions for group=1 mask=56 2013-11-08 13:56:55.093907 7fe3314c6700 5 Permissions for group not found 2013-11-08 13:56:55.093909 7fe3314c6700 5 Getting permissions id=test owner=test perm=8 2013-11-08 13:56:55.093912 7fe3314c6700 10 uid=test requested perm (type)=8, policy perm=8, user_perm_mask=15, acl perm=8 2013-11-08 13:56:55.093914 7fe3314c6700 2 req 60:0.003593:s3:PUT /hosts:put_acls:verifying op params 2013-11-08 13:56:55.093916 7fe3314c6700 2 req 60:0.003596:s3:PUT /hosts:put_acls:executing 2013-11-08 13:56:55.093938 7fe3314c6700 15 read len=343 data=AccessControlPolicy xmlns=http://s3.amazonaws.com/doc/2006-03-01/;OwnerID //OwnerAccessControlListGrantGrantee xmlns:xsi=http://www.w3.org/2001/XMLSchema-instance;
Re: [ceph-users] ceph cluster performance
On 11/08/2013 12:59 PM, Gruher, Joseph R wrote:

-Original Message- From: Dinu Vlad [mailto:dinuvla...@gmail.com] Sent: Thursday, November 07, 2013 10:37 AM To: ja...@peacon.co.uk; Gruher, Joseph R; ceph-users@lists.ceph.com Subject: Re: [ceph-users] ceph cluster performance

I was under the same impression - using a small portion of the SSD via partitioning (in my case, 30 gigs out of 240) would have the same effect as activating the HPA explicitly. Am I wrong?

I pinged a guy on the SSD team here at Intel and he confirmed - if you have a new drive (or a freshly secure-erased drive) and you only use a subset of the capacity (such as by creating a small partition), you effectively get the same benefits as overprovisioning the hidden area of the drive (or underprovisioning the available capacity, if you prefer to look at it that way). It's really all about maintaining a larger area of cells where the SSD knows it does not have to preserve the data, one way or the other.

That was my understanding as well, but it's great to have confirmation from Intel! Thanks Joseph!

Mark
___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
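For reference, a hedged sketch of the approach described above: secure-erase the SSD and then partition only part of its capacity, leaving the rest unallocated. The device name, partition size, and the drive not being in the "frozen" security state are all assumptions:

    # ATA secure erase returns all cells to the known-free state
    sudo hdparm --user-master u --security-set-pass p /dev/sdX
    sudo hdparm --user-master u --security-erase p /dev/sdX
    # then allocate only ~30 GiB of a 240 GiB drive, e.g. for a journal partition
    sudo parted /dev/sdX mklabel gpt
    sudo parted /dev/sdX mkpart journal 1MiB 30GiB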
Re: [ceph-users] Is Ceph a provider of block device too ?
On Fri, Nov 8, 2013 at 8:49 AM, Listas lis...@adminlinux.com.br wrote: Hi ! I have clusters (IMAP service) with 2 members configured with Ubuntu + Drbd + Ext4. Intend to migrate to the use of Ceph and begin to allow distributed access to the data. Does Ceph provides the distributed filesystem and block device? Ceph's RBD is a distributed block device and works very well; you could use it to replace DRBD. The CephFS distributed filesystem is in more of a preview mode and is not supported for general use at this time. Does Ceph work fine in clusters of two members? It should work fine in a cluster of that size, but you're not getting as many advantages over other solutions at such small scales as you do from larger ones. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
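For anyone evaluating RBD as a DRBD replacement, a minimal sketch of creating and mapping an image with the kernel RBD client follows; the image name, size, and use of the default "rbd" pool are placeholders/assumptions:

    rbd create mydisk --size 10240        # 10 GiB image in the default pool
    sudo rbd map mydisk                   # exposes a /dev/rbdN block device
    rbd showmapped                        # shows which device node it got
    sudo mkfs.ext4 /dev/rbd0              # then use it like any other block device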
Re: [ceph-users] About memory usage of ceph-mon on arm
Hrm, there's nothing too odd in those dumps. I asked around and it sounds like the last time we saw this sort of strange memory use it was a result of leveldb not being able to compact quickly enough. Joao can probably help diagnose that faster than I can. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Fri, Nov 8, 2013 at 5:00 AM, Yu Changyuan rei...@gmail.com wrote: I try to dump perf counter via admin socket, but I don't know what does these numbers actual mean or does these numbers have any thing to do with the different memory usage between arm and amd processors, so I attach the dump log as attachment(mon.a runs on AMD processor, mon.c runs on ARM processor). PS: after days of running(mon.b is 6 day, mon.c is 3 day), the memory consumption of both monitor running on arm board become stable, some what about 600MB, here is the heap stats: mon.btcmalloc heap stats: MALLOC: 594258992 ( 566.7 MiB) Bytes in use by application MALLOC: + 19529728 ( 18.6 MiB) Bytes in page heap freelist MALLOC: + 3885120 (3.7 MiB) Bytes in central cache freelist MALLOC: + 6486528 (6.2 MiB) Bytes in transfer cache freelist MALLOC: + 12202384 ( 11.6 MiB) Bytes in thread cache freelists MALLOC: + 2889952 (2.8 MiB) Bytes in malloc metadata MALLOC: MALLOC: =639252704 ( 609.6 MiB) Actual memory used (physical + swap) MALLOC: + 122880 (0.1 MiB) Bytes released to OS (aka unmapped) MALLOC: MALLOC: =639375584 ( 609.8 MiB) Virtual address space used MALLOC: MALLOC: 10231 Spans in use MALLOC: 24 Thread heaps in use MALLOC: 8192 Tcmalloc page size Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()). Bytes released to the mon.ctcmalloc heap stats: MALLOC: 593987584 ( 566.5 MiB) Bytes in use by application MALLOC: + 23969792 ( 22.9 MiB) Bytes in page heap freelist MALLOC: + 2172640 (2.1 MiB) Bytes in central cache freelist MALLOC: + 5874688 (5.6 MiB) Bytes in transfer cache freelist MALLOC: + 9268512 (8.8 MiB) Bytes in thread cache freelists MALLOC: + 2889952 (2.8 MiB) Bytes in malloc metadata MALLOC: MALLOC: =638163168 ( 608.6 MiB) Actual memory used (physical + swap) MALLOC: + 163840 (0.2 MiB) Bytes released to OS (aka unmapped) MALLOC: MALLOC: =638327008 ( 608.8 MiB) Virtual address space used MALLOC: MALLOC: 9796 Spans in use MALLOC: 14 Thread heaps in use MALLOC: 8192 Tcmalloc page size Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()). Bytes released to the On Fri, Nov 8, 2013 at 12:03 PM, Gregory Farnum g...@inktank.com wrote: I don't think this is anything we've observed before. Normally when a Ceph node is using more memory than its peers it's a consequence of something in that node getting backed up. You might try looking at the perf counters via the admin socket and seeing if something about them is different between your ARM and AMD processors. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Tue, Nov 5, 2013 at 7:21 AM, Yu Changyuan rei...@gmail.com wrote: Finally, my tiny ceph cluster get 3 monitors, newly added mon.b and mon.c both running on cubieboard2, which is cheap but still with enough cpu power(dual-core arm A7 cpu, 1.2G) and memory(1G). But compare to mon.a which running on an amd64 cpu, both mon.b and mon.c easily consume too much memory, so I want to know whether this is caused by memory leak. 
Below is the output of 'ceph tell mon.a heap stats' and 'ceph tell mon.c heap stats'(mon.c only start 12hr ago, while mon.a already running for more than 10 days) mon.atcmalloc heap stats: MALLOC:5480160 (5.2 MiB) Bytes in use by application MALLOC: + 28065792 ( 26.8 MiB) Bytes in page heap freelist MALLOC: + 15242312 ( 14.5 MiB) Bytes in central cache freelist MALLOC: + 10116608 (9.6 MiB) Bytes in transfer cache freelist MALLOC: + 10432216 (9.9 MiB) Bytes in thread cache freelists MALLOC: + 1667224 (1.6 MiB) Bytes in malloc metadata MALLOC: MALLOC: = 71004312 ( 67.7 MiB) Actual memory used (physical + swap) MALLOC: + 57540608 ( 54.9 MiB) Bytes released to OS (aka unmapped) MALLOC: MALLOC: =128544920 ( 122.6 MiB) Virtual address space used MALLOC: MALLOC: 4655 Spans in use MALLOC: 34 Thread heaps
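For reference, the per-daemon counters mentioned in this thread can be dumped through the monitor's admin socket; a minimal sketch (the socket path is the usual default under /var/run/ceph and is an assumption for this cluster):

    sudo ceph --admin-daemon /var/run/ceph/ceph-mon.c.asok perf dump
    ceph tell mon.c heap stats            # produces the tcmalloc output quoted above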
Re: [ceph-users] About memory usage of ceph-mon on arm
One thing to try is run the mon and then attach to it with perf and see what it's doing. If CPU usage is high and leveldb is doing tons of compaction work that could indicate that this is the same or a similar problem to what we were seeing back around cuttlefish. Mark On 11/08/2013 04:53 PM, Gregory Farnum wrote: Hrm, there's nothing too odd in those dumps. I asked around and it sounds like the last time we saw this sort of strange memory use it was a result of leveldb not being able to compact quickly enough. Joao can probably help diagnose that faster than I can. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Fri, Nov 8, 2013 at 5:00 AM, Yu Changyuan rei...@gmail.com wrote: I try to dump perf counter via admin socket, but I don't know what does these numbers actual mean or does these numbers have any thing to do with the different memory usage between arm and amd processors, so I attach the dump log as attachment(mon.a runs on AMD processor, mon.c runs on ARM processor). PS: after days of running(mon.b is 6 day, mon.c is 3 day), the memory consumption of both monitor running on arm board become stable, some what about 600MB, here is the heap stats: mon.btcmalloc heap stats: MALLOC: 594258992 ( 566.7 MiB) Bytes in use by application MALLOC: + 19529728 ( 18.6 MiB) Bytes in page heap freelist MALLOC: + 3885120 (3.7 MiB) Bytes in central cache freelist MALLOC: + 6486528 (6.2 MiB) Bytes in transfer cache freelist MALLOC: + 12202384 ( 11.6 MiB) Bytes in thread cache freelists MALLOC: + 2889952 (2.8 MiB) Bytes in malloc metadata MALLOC: MALLOC: =639252704 ( 609.6 MiB) Actual memory used (physical + swap) MALLOC: + 122880 (0.1 MiB) Bytes released to OS (aka unmapped) MALLOC: MALLOC: =639375584 ( 609.8 MiB) Virtual address space used MALLOC: MALLOC: 10231 Spans in use MALLOC: 24 Thread heaps in use MALLOC: 8192 Tcmalloc page size Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()). Bytes released to the mon.ctcmalloc heap stats: MALLOC: 593987584 ( 566.5 MiB) Bytes in use by application MALLOC: + 23969792 ( 22.9 MiB) Bytes in page heap freelist MALLOC: + 2172640 (2.1 MiB) Bytes in central cache freelist MALLOC: + 5874688 (5.6 MiB) Bytes in transfer cache freelist MALLOC: + 9268512 (8.8 MiB) Bytes in thread cache freelists MALLOC: + 2889952 (2.8 MiB) Bytes in malloc metadata MALLOC: MALLOC: =638163168 ( 608.6 MiB) Actual memory used (physical + swap) MALLOC: + 163840 (0.2 MiB) Bytes released to OS (aka unmapped) MALLOC: MALLOC: =638327008 ( 608.8 MiB) Virtual address space used MALLOC: MALLOC: 9796 Spans in use MALLOC: 14 Thread heaps in use MALLOC: 8192 Tcmalloc page size Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()). Bytes released to the On Fri, Nov 8, 2013 at 12:03 PM, Gregory Farnum g...@inktank.com wrote: I don't think this is anything we've observed before. Normally when a Ceph node is using more memory than its peers it's a consequence of something in that node getting backed up. You might try looking at the perf counters via the admin socket and seeing if something about them is different between your ARM and AMD processors. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Tue, Nov 5, 2013 at 7:21 AM, Yu Changyuan rei...@gmail.com wrote: Finally, my tiny ceph cluster get 3 monitors, newly added mon.b and mon.c both running on cubieboard2, which is cheap but still with enough cpu power(dual-core arm A7 cpu, 1.2G) and memory(1G). 
But compare to mon.a which running on an amd64 cpu, both mon.b and mon.c easily consume too much memory, so I want to know whether this is caused by memory leak. Below is the output of 'ceph tell mon.a heap stats' and 'ceph tell mon.c heap stats'(mon.c only start 12hr ago, while mon.a already running for more than 10 days) mon.atcmalloc heap stats: MALLOC:5480160 (5.2 MiB) Bytes in use by application MALLOC: + 28065792 ( 26.8 MiB) Bytes in page heap freelist MALLOC: + 15242312 ( 14.5 MiB) Bytes in central cache freelist MALLOC: + 10116608 (9.6 MiB) Bytes in transfer cache freelist MALLOC: + 10432216 (9.9 MiB) Bytes in thread cache freelists MALLOC: + 1667224 (1.6 MiB) Bytes in malloc metadata MALLOC: MALLOC: = 71004312 ( 67.7 MiB) Actual memory used (physical + swap) MALLOC: + 57540608 ( 54.9 MiB) Bytes released
Re: [ceph-users] About memory usage of ceph-mon on arm
On Sat, Nov 9, 2013 at 7:53 AM, Mark Nelson mark.nel...@inktank.com wrote: One thing to try is run the mon and then attach to it with perf and see what it's doing. If CPU usage is high and leveldb is doing tons of compaction work that could indicate that this is the same or a similar problem to what we were seeing back around cuttlefish. I am sorry, I don't quite understand what does attach to mon with perf mean, so could you please elaborate how to do it? Mark On 11/08/2013 04:53 PM, Gregory Farnum wrote: Hrm, there's nothing too odd in those dumps. I asked around and it sounds like the last time we saw this sort of strange memory use it was a result of leveldb not being able to compact quickly enough. Joao can probably help diagnose that faster than I can. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Fri, Nov 8, 2013 at 5:00 AM, Yu Changyuan rei...@gmail.com wrote: I try to dump perf counter via admin socket, but I don't know what does these numbers actual mean or does these numbers have any thing to do with the different memory usage between arm and amd processors, so I attach the dump log as attachment(mon.a runs on AMD processor, mon.c runs on ARM processor). PS: after days of running(mon.b is 6 day, mon.c is 3 day), the memory consumption of both monitor running on arm board become stable, some what about 600MB, here is the heap stats: mon.btcmalloc heap stats: MALLOC: 594258992 ( 566.7 MiB) Bytes in use by application MALLOC: + 19529728 ( 18.6 MiB) Bytes in page heap freelist MALLOC: + 3885120 (3.7 MiB) Bytes in central cache freelist MALLOC: + 6486528 (6.2 MiB) Bytes in transfer cache freelist MALLOC: + 12202384 ( 11.6 MiB) Bytes in thread cache freelists MALLOC: + 2889952 (2.8 MiB) Bytes in malloc metadata MALLOC: MALLOC: =639252704 ( 609.6 MiB) Actual memory used (physical + swap) MALLOC: + 122880 (0.1 MiB) Bytes released to OS (aka unmapped) MALLOC: MALLOC: =639375584 ( 609.8 MiB) Virtual address space used MALLOC: MALLOC: 10231 Spans in use MALLOC: 24 Thread heaps in use MALLOC: 8192 Tcmalloc page size Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()). Bytes released to the mon.ctcmalloc heap stats: MALLOC: 593987584 ( 566.5 MiB) Bytes in use by application MALLOC: + 23969792 ( 22.9 MiB) Bytes in page heap freelist MALLOC: + 2172640 (2.1 MiB) Bytes in central cache freelist MALLOC: + 5874688 (5.6 MiB) Bytes in transfer cache freelist MALLOC: + 9268512 (8.8 MiB) Bytes in thread cache freelists MALLOC: + 2889952 (2.8 MiB) Bytes in malloc metadata MALLOC: MALLOC: =638163168 ( 608.6 MiB) Actual memory used (physical + swap) MALLOC: + 163840 (0.2 MiB) Bytes released to OS (aka unmapped) MALLOC: MALLOC: =638327008 ( 608.8 MiB) Virtual address space used MALLOC: MALLOC: 9796 Spans in use MALLOC: 14 Thread heaps in use MALLOC: 8192 Tcmalloc page size Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()). Bytes released to the On Fri, Nov 8, 2013 at 12:03 PM, Gregory Farnum g...@inktank.com wrote: I don't think this is anything we've observed before. Normally when a Ceph node is using more memory than its peers it's a consequence of something in that node getting backed up. You might try looking at the perf counters via the admin socket and seeing if something about them is different between your ARM and AMD processors. 
-Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Tue, Nov 5, 2013 at 7:21 AM, Yu Changyuan rei...@gmail.com wrote: Finally, my tiny ceph cluster get 3 monitors, newly added mon.b and mon.c both running on cubieboard2, which is cheap but still with enough cpu power(dual-core arm A7 cpu, 1.2G) and memory(1G). But compare to mon.a which running on an amd64 cpu, both mon.b and mon.c easily consume too much memory, so I want to know whether this is caused by memory leak. Below is the output of 'ceph tell mon.a heap stats' and 'ceph tell mon.c heap stats'(mon.c only start 12hr ago, while mon.a already running for more than 10 days) mon.atcmalloc heap stats: MALLOC:5480160 (5.2 MiB) Bytes in use by application MALLOC: + 28065792 ( 26.8 MiB) Bytes in page heap freelist MALLOC: + 15242312 ( 14.5 MiB) Bytes in central cache freelist MALLOC: + 10116608 (9.6 MiB) Bytes in transfer
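To expand on Mark's suggestion, "attach to it with perf" means profiling the running ceph-mon process with the Linux perf tool; a minimal sketch (package names vary by distro, and the pidof lookup assumes a single ceph-mon on the host):

    sudo perf top -p $(pidof ceph-mon)                     # live view of where CPU time goes
    sudo perf record -g -p $(pidof ceph-mon) -- sleep 30   # or record ~30 s for later analysis
    sudo perf report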
[ceph-users] v0.72 Emperor released
This is the fifth major release of Ceph, the fourth since adopting a 3-month development cycle. This release brings several new features, including multi-datacenter replication for the radosgw and improved usability, and it lands a lot of incremental performance and internal refactoring work to support upcoming features in Firefly. Thank you to everyone who contributed to this release! There were 46 authors in all.

Highlights include:

* common: improved crc32c performance
* librados: new example client and class code
* mds: many bug fixes and stability improvements
* mon: health warnings when pool pg_num values are not reasonable
* mon: per-pool performance stats
* osd, librados: new object copy primitives
* osd: improved interaction with backend file system to reduce latency
* osd: much internal refactoring to support ongoing erasure coding and tiering support
* rgw: bucket quotas
* rgw: improved CORS support
* rgw: performance improvements
* rgw: validate S3 tokens against Keystone

Coincident with core Ceph, the Emperor release also brings:

* radosgw-agent: support for multi-datacenter replication for disaster recovery (building on the multi-site features that appeared in Dumpling)
* tgt: improved support for iSCSI via upstream tgt

Upgrading

There are no specific upgrade restrictions on the order or sequence of upgrading from 0.67.x Dumpling. We normally suggest a rolling upgrade of monitors first, then OSDs, followed by the radosgw and ceph-mds daemons (if any).

It is also possible to do a rolling upgrade from 0.61.x Cuttlefish, but there are ordering restrictions. (This is the same set of restrictions as for Cuttlefish to Dumpling.)

1. Upgrade ceph-common on all nodes that will use the command line ceph utility.
2. Upgrade all monitors (upgrade the ceph package, restart the ceph-mon daemons). This can happen one daemon or host at a time. Note that because Cuttlefish and Dumpling monitors can't talk to each other, all monitors should be upgraded in relatively short succession to minimize the risk that an untimely failure will reduce availability.
3. Upgrade all OSDs (upgrade the ceph package, restart the ceph-osd daemons). This can happen one daemon or host at a time.
4. Upgrade radosgw (upgrade the radosgw package, restart the radosgw daemons).

There are several minor compatibility changes in the librados API that direct users of librados should be aware of. For a full summary of those changes, please see the complete release notes:

* http://ceph.com/docs/master/release-notes/#v0-72-emperor

The next major release of Ceph, Firefly, is scheduled for release in February of 2014.

You can download v0.72 Emperor from the usual locations:

* Git at git://github.com/ceph/ceph.git
* Tarball at http://ceph.com/download/ceph-0.72.tar.gz
* For Debian/Ubuntu packages, see http://ceph.com/docs/master/install/debian
* For RPMs, see http://ceph.com/docs/master/install/rpm

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
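As a hedged illustration of one rolling-upgrade step above, upgrading and restarting a single monitor host on Debian/Ubuntu might look like this; the restart job name is the Upstart convention of that era and is an assumption, and sysvinit systems would use /etc/init.d/ceph (or the service wrapper) instead:

    sudo apt-get update && sudo apt-get install ceph ceph-common
    sudo restart ceph-mon-all             # assumption: Upstart-managed Ubuntu host
    ceph health                           # confirm quorum recovers before moving to the next host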