Re: [ceph-users] Pause i/o from time to time

2013-10-24 Thread Uwe Grohnwaldt
Hello ceph-users,

we hit a similar problem last Thursday and again today. We have a cluster 
consisting of 6 storage nodes containing 70 OSDs (JBOD configuration). We 
created several rbd devices, mapped them on a dedicated server, and exported 
them via targetcli. These iSCSI targets are connected to Citrix XenServer 6.1 
(with HF30) and XenServer 6.2 (HF4).

Recently some disks died. After this, errors occurred on the dedicated 
iSCSI target:
Oct 23 15:19:42 targetcli01 kernel: [673836.709887] end_request: I/O error, dev 
rbd4, sector 2034037064
Oct 23 15:19:42 targetcli01 kernel: [673836.713596] test_bit(BIO_UPTODATE) 
failed for bio: 880127546c00, err: -6
Oct 23 15:19:43 targetcli01 kernel: [673837.497382] end_request: I/O error, dev 
rbd4, sector 2034037064
Oct 23 15:19:43 targetcli01 kernel: [673837.501323] test_bit(BIO_UPTODATE) 
failed for bio: 880124d933c0, err: -6

These errors propagate up to the virtual machines and lead to read-only 
filesystems. We could trigger this behavior by setting one disk to out.

We are using Ubuntu 13.04 with the latest stable Ceph (ceph version 0.67.4 
(ad85b8bfafea6232d64cb7ba76a8b6e8252fa0c7)).

Our ceph.conf is like this:

[global]
filestore_xattr_use_omap = true
mon_host = 10.200.20.1,10.200.20.2,10.200.20.3
osd_journal_size = 1024
public_network = 10.200.40.0/16
mon_initial_members = ceph-mon01, ceph-mon02, ceph-mon03
cluster_network = 10.210.40.0/16
auth_supported = none
fsid = 9283e647-2b57-4077-b427-0d3d656233b3

[osd]
osd_max_backfills = 4
osd_recovery_max_active = 1

[osd.0]
public_addr = 10.200.40.1
cluster_addr = 10.210.40.1



After the first outage we set osd_max_backfills to 8, and after the second one to 4, 
but it didn't help. It seems to be the bug mentioned at 
http://tracker.ceph.com/issues/6278 . The problem is that this is a production 
environment and the problems began after we moved several VMs to it. In our 
test environment we can't reproduce it, but we are working on a larger 
test installation.
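
If it helps anyone reproducing this: these settings can also be changed at 
runtime via the admin interface instead of restarting the OSDs; the exact 
syntax may vary slightly by release, and the values below are only examples, 
not a recommendation:

  ceph tell osd.0 injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'
  # repeat per OSD id, or use osd.* if your release supports the wildcard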

Does anybody have an idea how to investigate further without destroying virtual 
machines? ;)

Sometimes these I/O errors lead to kernel panics on the iSCSI target machine. 
The targetcli/LIO config is a simple default config without any tuning or 
special configuration.


Mit freundlichen Grüßen / Best Regards,
Uwe Grohnwaldt

- Original Message -
> From: "Timofey" 
> To: "Mike Dawson" 
> Cc: ceph-users@lists.ceph.com
> Sent: Dienstag, 17. September 2013 22:37:44
> Subject: Re: [ceph-users] Pause i/o from time to time
> 
> I have examined the logs.
> Yes, the first time it may have been scrubbing. It repaired some of it by itself.
> 
> I had 2 servers before the first problem: one dedicated to an osd (osd.0),
> and a second with an osd and websites (osd.1).
> After the problem I added a third server - dedicated to an osd (osd.2) - and
> ran ceph osd set out osd.1 to move the data off it.
> 
> In ceph -s I saw a normal rebalancing process and everything worked well for
> about 5-7 hours.
> Then I got many misdirected messages (a few hundred per second):
> osd.0 [WRN] client.359671  misdirected client.359671.1:220843 pg
> 2.3ae744c0 to osd.0 not [2,0] in e1040/1040
> and errors in i/o operations.
> 
> Now I have about 20GB of ceph logs with these errors. (I don't work with the
> cluster now - I copied all the data out to an hdd and work from the hdd.)
> 
> Is there any way to have a local software raid1 across a ceph rbd and a local
> image (to keep working when ceph fails or works slowly for any reason)?
> I tried mdadm but it worked badly - the server hung up every few hours.
> 
> > You could be suffering from a known, but unfixed issue [1] where
> > spindle contention from scrub and deep-scrub cause periodic stalls
> > in RBD. You can try to disable scrub and deep-scrub with:
> > 
> > # ceph osd set noscrub
> > # ceph osd set nodeep-scrub
> > 
> > If your problem stops, Issue #6278 is likely the cause. To
> > re-enable scrub and deep-scrub:
> > 
> > # ceph osd unset noscrub
> > # ceph osd unset nodeep-scrub
> > 
> > Because you seem to only have two OSDs, you may also be saturating
> > your disks even without scrub or deep-scrub.
> > 
> > http://tracker.ceph.com/issues/6278
> > 
> > Cheers,
> > Mike Dawson
> > 
> > 
> > On 9/16/2013 12:30 PM, Timofey wrote:
> >> I use ceph for an HA cluster.
> >> Sometimes ceph rbd pauses its work (i/o operations stop).
> >> Sometimes it happens when one of the OSDs is slow to respond to requests.
> >> Sometimes it is my own mistake (xfs_freeze -f on one of the
> >> OSD drives).
> >> I have 2 storage servers with one osd on each. These pauses can last
> >> a few minutes.
> >> 
> >> 1. Is there any setting to quickly change the primary osd if the current
> >> osd is working badly (slow, not responding)?
> >> 2. Can I use ceph-rbd in a software raid array with a local drive, to
> >> use the local drive instead of ceph if the ceph cluster fails?
> >> ___
> >> ceph-users mailing list
> >> ceph-users@lists.ceph.com
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >> 
> 
> 

Re: [ceph-users] ls/file access hangs on a single ceph directory

2013-10-24 Thread Michael

On 24/10/2013 03:09, Yan, Zheng wrote:

On Thu, Oct 24, 2013 at 6:44 AM, Michael  wrote:

Trying to gather some more info.

CentOS - hanging ls
[root@srv ~]# cat /proc/14614/stack
[] wait_answer_interruptible+0x81/0xc0 [fuse]
[] fuse_request_send+0x1cb/0x290 [fuse]
[] fuse_do_getattr+0x10c/0x2c0 [fuse]
[] fuse_update_attributes+0x75/0x80 [fuse]
[] fuse_getattr+0x53/0x60 [fuse]
[] vfs_getattr+0x51/0x80
[] vfs_fstatat+0x60/0x80
[] vfs_stat+0x1b/0x20
[] sys_newstat+0x24/0x50
[] system_call_fastpath+0x16/0x1b
[] 0x

Ubuntu - hanging ls
root@srv:~# cat /proc/30012/stack
[] ceph_mdsc_do_request+0xcb/0x1a0 [ceph]
[] ceph_do_getattr+0xe7/0x120 [ceph]
[] ceph_getattr+0x24/0x100 [ceph]
[] vfs_getattr+0x4e/0x80
[] vfs_fstatat+0x4e/0x70
[] vfs_lstat+0x1e/0x20
[] sys_newlstat+0x1a/0x40
[] system_call_fastpath+0x16/0x1b
[] 0x

Started occurring shortly (within an hour or so) after adding a pool, not
sure if that's relevant yet.

-Michael

On 23/10/2013 21:10, Michael wrote:

I have a filesystem shared by several systems mounted on 2 ceph nodes with
a 3rd as a reference monitor.
It's been used for a couple of months now but suddenly the root directory
for the mount has become inaccessible and requests to files in it just hang,
there's no ceph errors reported before/after and subdirectories of the
directory can be used (and still are currently being used by VM's still
running from it). It's being mounted in a mixed kernel driver (ubuntu) and
centos (ceph-fuse) environment.

kernel, ceph-fuse and ceph-mds versions? The hang was likely caused by a known
bug in kernel 3.10.

Regards
Yan, Zheng


Centos 6.4
2.6.32-358.23.2.el6.x86_64
ceph.x86_64  0.67.4-0.el6
ceph-fuse.x86_64 0.67.4-0.el6

Ubuntu 12.04
3.5.0-41-generic
Ceph Version: 0.67.2-1precise

... So it looks like I've let my ceph versions get out of sync. The MDS 
is on an Ubuntu box and all of the OSDs are on Ubuntu boxes too; the 
CentOS box just has another MON on it, so I think I really should drag myself 
away from CentOS outright for ceph. I was previously using fuse on the Ubuntu 
boxes as well, though that changed a few days ago (I'm currently working my 
way through the ceph features; next up was to hook ceph's rbd up to 
OpenNebula, hence the additional pool).


-Michael
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] PG repair failing when object missing

2013-10-24 Thread Matt Thompson
Hi Harry,

I was able to replicate this.

What does appear to work (for me) is to do an osd scrub followed by a pg
repair.  I've tried this 2x now and in each case the deleted file gets
copied over to the OSD from where it was removed.  However, I've tried a
few pg scrub / pg repairs after manually deleting a file and have yet to
see the file get copied back to the OSD on which it was deleted.  Like you
said, the pg repair sets the health of the PG back to active+clean, but
then re-running the pg scrub detects the file as missing again and sets it
back to active+clean+inconsistent.
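
For reference, the sequence that worked for me was roughly the following (OSD
and PG ids as in Harry's example below):

  ceph osd scrub 2
  ceph pg repair 0.b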

Regards,
Matt


On Wed, Oct 23, 2013 at 3:45 PM, Harry Harrington wrote:

> Hi,
>
> I've been taking a look at the repair functionality in ceph. As I
> understand it the osds should try to copy an object from another member of
> the pg if it is missing. I have been attempting to test this by manually
> removing a file from one of the osds; however, each time the repair
> completes, the file has not been restored. If I run another scrub on the
> pg it gets flagged as inconsistent. See below for the output from my
> testing. I assume I'm missing something obvious, any insight into this
> process would be greatly appreciated.
>
> Thanks,
> Harry
>
> # ceph --version
> ceph version 0.67.4 (ad85b8bfafea6232d64cb7ba76a8b6e8252fa0c7)
> # ceph status
>   cluster a4e417fe-0386-46a5-4475-ca7e10294273
>health HEALTH_OK
>monmap e1: 1 mons at {ceph1=1.2.3.4:6789/0}, election epoch 2, quorum
> 0 ceph1
>osdmap e13: 3 osds: 3 up, 3 in
> pgmap v232: 192 pgs: 192 active+clean; 44 bytes data, 15465 MB used,
> 164 GB / 179 GB avail
>mdsmap e1: 0/0/1 up
>
> file removed from osd.2
>
> # ceph pg scrub 0.b
> instructing pg 0.b on osd.1 to scrub
>
> # ceph status
>   cluster a4e417fe-0386-46a5-4475-ca7e10294273
>health HEALTH_ERR 1 pgs inconsistent; 1 scrub errors
>monmap e1: 1 mons at {ceph1=1.2.3.4:6789/0}, election epoch 2, quorum
> 0 ceph1
>osdmap e13: 3 osds: 3 up, 3 in
> pgmap v233: 192 pgs: 191 active+clean, 1 active+clean+inconsistent; 44
> bytes data, 15465 MB used, 164 GB / 179 GB avail
>mdsmap e1: 0/0/1 up
>
> # ceph pg repair 0.b
> instructing pg 0.b on osd.1 to repair
>
> # ceph status
>   cluster a4e417fe-0386-46a5-4475-ca7e10294273
>health HEALTH_OK
>monmap e1: 1 mons at {ceph1=1.2.3.4:6789/0}, election epoch 2, quorum
> 0 ceph1
>osdmap e13: 3 osds: 3 up, 3 in
> pgmap v234: 192 pgs: 192 active+clean; 44 bytes data, 15465 MB used,
> 164 GB / 179 GB avail
>mdsmap e1: 0/0/1 up
>
> # ceph pg scrub 0.b
> instructing pg 0.b on osd.1 to scrub
>
> # ceph status
>   cluster a4e417fe-0386-46a5-4475-ca7e10294273
>health HEALTH_ERR 1 pgs inconsistent; 1 scrub errors
>monmap e1: 1 mons at {ceph1=1.2.3.4:6789/0}, election epoch 2, quorum
> 0 ceph1
>osdmap e13: 3 osds: 3 up, 3 in
> pgmap v236: 192 pgs: 191 active+clean, 1 active+clean+inconsistent; 44
> bytes data, 15465 MB used, 164 GB / 179 GB avail
>mdsmap e1: 0/0/1 up
>
>
>
> The logs from osd.1:
> 2013-10-23 14:12:31.188281 7f02a5161700  0 log [ERR] : 0.b osd.2 missing
> 3a643fcb/testfile1/head//0
> 2013-10-23 14:12:31.188312 7f02a5161700  0 log [ERR] : 0.b scrub 1
> missing, 0 inconsistent objects
> 2013-10-23 14:12:31.188319 7f02a5161700  0 log [ERR] : 0.b scrub 1 errors
> 2013-10-23 14:13:03.197802 7f02a5161700  0 log [ERR] : 0.b osd.2 missing
> 3a643fcb/testfile1/head//0
> 2013-10-23 14:13:03.197837 7f02a5161700  0 log [ERR] : 0.b repair 1
> missing, 0 inconsistent objects
> 2013-10-23 14:13:03.197850 7f02a5161700  0 log [ERR] : 0.b repair 1
> errors, 1 fixed
> 2013-10-23 14:14:47.232953 7f02a5161700  0 log [ERR] : 0.b osd.2 missing
> 3a643fcb/testfile1/head//0
> 2013-10-23 14:14:47.232985 7f02a5161700  0 log [ERR] : 0.b scrub 1
> missing, 0 inconsistent objects
> 2013-10-23 14:14:47.232991 7f02a5161700  0 log [ERR] : 0.b scrub 1 errors
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] PG repair failing when object missing

2013-10-24 Thread Matt Thompson
To add -- I thought I was running 0.67.4 on my test cluster (fc 19), but I
appear to be running 0.69.  Not sure how that happened as my yum config is
still pointing to dumpling.  :)


On Thu, Oct 24, 2013 at 10:52 AM, Matt Thompson wrote:

> Hi Harry,
>
> I was able to replicate this.
>
> What does appear to work (for me) is to do an osd scrub followed by a pg
> repair.  I've tried this 2x now and in each case the deleted file gets
> copied over to the OSD from where it was removed.  However, I've tried a
> few pg scrub / pg repairs after manually deleting a file and have yet to
> see the file get copied back to the OSD on which it was deleted.  Like you
> said, the pg repair sets the health of the PG back to active+clean, but
> then re-running the pg scrub detects the file as missing again and sets it
> back to active+clean+inconsistent.
>
> Regards,
> Matt
>
>
> On Wed, Oct 23, 2013 at 3:45 PM, Harry Harrington wrote:
>
>> Hi,
>>
>> I've been taking a look at the repair functionality in ceph. As I
>> understand it the osds should try to copy an object from another member of
>> the pg if it is missing. I have been attempting to test this by manually
>> removing a file from one of the osds; however, each time the repair
>> completes, the file has not been restored. If I run another scrub on the
>> pg it gets flagged as inconsistent. See below for the output from my
>> testing. I assume I'm missing something obvious, any insight into this
>> process would be greatly appreciated.
>>
>> Thanks,
>> Harry
>>
>> # ceph --version
>> ceph version 0.67.4 (ad85b8bfafea6232d64cb7ba76a8b6e8252fa0c7)
>> # ceph status
>>   cluster a4e417fe-0386-46a5-4475-ca7e10294273
>>health HEALTH_OK
>>monmap e1: 1 mons at {ceph1=1.2.3.4:6789/0}, election epoch 2, quorum
>> 0 ceph1
>>osdmap e13: 3 osds: 3 up, 3 in
>> pgmap v232: 192 pgs: 192 active+clean; 44 bytes data, 15465 MB used,
>> 164 GB / 179 GB avail
>>mdsmap e1: 0/0/1 up
>>
>> file removed from osd.2
>>
>> # ceph pg scrub 0.b
>> instructing pg 0.b on osd.1 to scrub
>>
>> # ceph status
>>   cluster a4e417fe-0386-46a5-4475-ca7e10294273
>>health HEALTH_ERR 1 pgs inconsistent; 1 scrub errors
>>monmap e1: 1 mons at {ceph1=1.2.3.4:6789/0}, election epoch 2, quorum
>> 0 ceph1
>>osdmap e13: 3 osds: 3 up, 3 in
>> pgmap v233: 192 pgs: 191 active+clean, 1 active+clean+inconsistent;
>> 44 bytes data, 15465 MB used, 164 GB / 179 GB avail
>>mdsmap e1: 0/0/1 up
>>
>> # ceph pg repair 0.b
>> instructing pg 0.b on osd.1 to repair
>>
>> # ceph status
>>   cluster a4e417fe-0386-46a5-4475-ca7e10294273
>>health HEALTH_OK
>>monmap e1: 1 mons at {ceph1=1.2.3.4:6789/0}, election epoch 2, quorum
>> 0 ceph1
>>osdmap e13: 3 osds: 3 up, 3 in
>> pgmap v234: 192 pgs: 192 active+clean; 44 bytes data, 15465 MB used,
>> 164 GB / 179 GB avail
>>mdsmap e1: 0/0/1 up
>>
>> # ceph pg scrub 0.b
>> instructing pg 0.b on osd.1 to scrub
>>
>> # ceph status
>>   cluster a4e417fe-0386-46a5-4475-ca7e10294273
>>health HEALTH_ERR 1 pgs inconsistent; 1 scrub errors
>>monmap e1: 1 mons at {ceph1=1.2.3.4:6789/0}, election epoch 2, quorum
>> 0 ceph1
>>osdmap e13: 3 osds: 3 up, 3 in
>> pgmap v236: 192 pgs: 191 active+clean, 1 active+clean+inconsistent;
>> 44 bytes data, 15465 MB used, 164 GB / 179 GB avail
>>mdsmap e1: 0/0/1 up
>>
>>
>>
>> The logs from osd.1:
>> 2013-10-23 14:12:31.188281 7f02a5161700  0 log [ERR] : 0.b osd.2 missing
>> 3a643fcb/testfile1/head//0
>> 2013-10-23 14:12:31.188312 7f02a5161700  0 log [ERR] : 0.b scrub 1
>> missing, 0 inconsistent objects
>> 2013-10-23 14:12:31.188319 7f02a5161700  0 log [ERR] : 0.b scrub 1 errors
>> 2013-10-23 14:13:03.197802 7f02a5161700  0 log [ERR] : 0.b osd.2 missing
>> 3a643fcb/testfile1/head//0
>> 2013-10-23 14:13:03.197837 7f02a5161700  0 log [ERR] : 0.b repair 1
>> missing, 0 inconsistent objects
>> 2013-10-23 14:13:03.197850 7f02a5161700  0 log [ERR] : 0.b repair 1
>> errors, 1 fixed
>> 2013-10-23 14:14:47.232953 7f02a5161700  0 log [ERR] : 0.b osd.2 missing
>> 3a643fcb/testfile1/head//0
>> 2013-10-23 14:14:47.232985 7f02a5161700  0 log [ERR] : 0.b scrub 1
>> missing, 0 inconsistent objects
>> 2013-10-23 14:14:47.232991 7f02a5161700  0 log [ERR] : 0.b scrub 1 errors
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Non-Ceph cluster name

2013-10-24 Thread Gaylord Holder

I'm trying to bring up a ceph cluster not named ceph.

I'm running version 0.61.

From my reading of the documentation, the $cluster metavariable is set 
by the basename of the configuration file: specifying the configuration 
file "/etc/ceph/mycluster.conf" sets the $cluster metavariable to 
"mycluster".


However, given a configuration file /etc/ceph/csceph.conf:

  [global]
   fsid = 70d421fe-28ca-4804-bce8-d51a16b531ec
   mon host = 192.168.124.202
   mon_initial_members = a

  [mon.a]
  host = monnode
  mon addr = 192.168.124.202:6789

and running:

  ceph-authtool csceph.mon.keyring --create-keyring --name=mon. 
--gen-key --cap mon 'allow *'


  ceph-mon -c /etc/ceph/csceph.conf --mkfs -i a --keyring 
csceph.mon.keyring


ceph-mon tries to create monfs in

  /var/lib/ceph/mon/ceph-a

not

  /var/lib/ceph/mon/csceph-a

as expected.


Thank you for any help you can give.

Cheers,
-Gaylord
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Default PGs

2013-10-24 Thread Tyler Brekke
You have to do this before creating your first monitor as the default
pools are created by the monitor.

Now any pools you create should have the correct number of placement
groups though.

You can also increase your pg and pgp num with,

ceph osd pool set <pool-name> pg_num <pg-num>
ceph osd pool set <pool-name> pgp_num <pgp-num>
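
For example, with a hypothetical pool name and placement group count:

  ceph osd pool set data pg_num 256
  ceph osd pool set data pgp_num 256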
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] num of placement groups created for default pools

2013-10-24 Thread Tyler Brekke
Hey Tim,

If you deployed with ceph-deploy then your monitors started without
knowledge of how many OSDs you will be adding to your cluster. You can
add 'osd_pool_default_pg_num' and 'osd_pool_default_pgp_num' to your
ceph.conf before creating your monitors to have the default pools
created with the proper number of placement groups.
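
For example, something along these lines (the 3200 figure is just taken from
the calculation quoted below, not a recommendation):

  [global]
  osd_pool_default_pg_num = 3200
  osd_pool_default_pgp_num = 3200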

I believe with the old mkcephfs script the number of osds was used to
give a better default pg count. I don't think this is really necessary
anymore as you can increase your placement group size now.

ceph osd pool set <pool-name> pg_num <pg-num>
ceph osd pool set <pool-name> pgp_num <pgp-num>

On Wed, Oct 23, 2013 at 6:13 AM, Snider, Tim  wrote:
> I have a newly created cluster with 68 osds and the default of 2 replicas. 
> The default pools  are created with 64 placement groups . The documentation 
> in http://ceph.com/docs/master/rados/operations/pools/ states  for osd pool 
> creation :
> "We recommend approximately 50-100 placement groups per OSD to balance out 
> memory and CPU requirements and per-OSD load. For a single pool of objects, 
> you can use the following formula: Total PGS = (osds *100)/Replicas"
>
> For this cluster pools should have  3200 pgs [ (64*100)/2] according to the 
> recommendation.
> Why isn't  the guideline followed for default pools?
> Maybe they're created prior to having all the osds activated?
> Maybe I'm reading the documentation incorrectly.
>
> /home/ceph/bin# ceph osd getmaxosd
> max_osd = 68 in epoch 219
> /home/ceph/bin# ceph osd getmaxosd
> max_osd = 68 in epoch 219
> /home/ceph/bin# ceph osd lspools
> 0 data,1 metadata,2 rbd,
> /home/ceph/bin# ceph osd pool get data pg_num
> pg_num: 64
> /home/ceph/bin# ceph osd pool get data size
> size: 2
>
> Thanks,
> Tim
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-deploy hang on CentOS 6.4

2013-10-24 Thread Alfredo Deza
On Wed, Oct 23, 2013 at 12:43 PM, Gruher, Joseph R
 wrote:
>
>
>>-Original Message-
>>From: Alfredo Deza [mailto:alfredo.d...@inktank.com]
>>
>>Did you try working with the `--no-adjust-repos` flag in ceph-deploy? It 
>>will
>>allow you to tell ceph-deploy to just go and install ceph without attempting 
>>to
>>import keys or doing anything with your repos.
>
> I have tried this in the past but it caused problems further down the install 
> process, I believe due to old or mismatched versions being installed.  That 
> was on Ubuntu 12.04.2.  It was discussed a bit on this list at the time.  I 
> would not recommend --no-adjust-repos based on my experience.

It would be very useful to have logs or a reproducible scenario so we
can improve this. Mismatched versions don't sound like something
ceph-deploy would do specifically, but rather a problem with
installing and removing packages and having issues there.

>
>>The documentation for this can be found here:
>>https://github.com/ceph/ceph-deploy#proxy-or-firewall-installs
>
> This doc only mentions setting the wget proxy; I would suggest it be updated 
> to note that the curl and rpm proxies may need to be set as well.

Yes, I will be updating those too, thanks for the examples!
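
For anyone hitting this in the meantime, a rough sketch of the additional
settings (file locations vary by distro, and the proxy host/port here are
only placeholders):

  # /etc/wgetrc or ~/.wgetrc
  http_proxy = http://proxy.example.com:3128/
  https_proxy = http://proxy.example.com:3128/

  # ~/.curlrc
  proxy = proxy.example.com:3128

  # ~/.rpmmacros
  %_httpproxy proxy.example.com
  %_httpport 3128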
>
> Thanks,
> Joe
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ls/file access hangs on a single ceph directory

2013-10-24 Thread Yan, Zheng
On Thu, Oct 24, 2013 at 5:43 PM, Michael  wrote:
> On 24/10/2013 03:09, Yan, Zheng wrote:
>>
>> On Thu, Oct 24, 2013 at 6:44 AM, Michael 
>> wrote:
>>>
>>> Tying to gather some more info.
>>>
>>> CentOS - hanging ls
>>> [root@srv ~]# cat /proc/14614/stack
>>> [] wait_answer_interruptible+0x81/0xc0 [fuse]
>>> [] fuse_request_send+0x1cb/0x290 [fuse]
>>> [] fuse_do_getattr+0x10c/0x2c0 [fuse]
>>> [] fuse_update_attributes+0x75/0x80 [fuse]
>>> [] fuse_getattr+0x53/0x60 [fuse]
>>> [] vfs_getattr+0x51/0x80
>>> [] vfs_fstatat+0x60/0x80
>>> [] vfs_stat+0x1b/0x20
>>> [] sys_newstat+0x24/0x50
>>> [] system_call_fastpath+0x16/0x1b
>>> [] 0x
>>>
>>> Ubuntu - hanging ls
>>> root@srv:~# cat /proc/30012/stack
>>> [] ceph_mdsc_do_request+0xcb/0x1a0 [ceph]
>>> [] ceph_do_getattr+0xe7/0x120 [ceph]
>>> [] ceph_getattr+0x24/0x100 [ceph]
>>> [] vfs_getattr+0x4e/0x80
>>> [] vfs_fstatat+0x4e/0x70
>>> [] vfs_lstat+0x1e/0x20
>>> [] sys_newlstat+0x1a/0x40
>>> [] system_call_fastpath+0x16/0x1b
>>> [] 0x
>>>
>>> Started occurring shortly (within an hour or so) after adding a pool, not
>>> sure if that's relevant yet.
>>>
>>> -Michael
>>>
>>> On 23/10/2013 21:10, Michael wrote:

 I have a filesystem shared by several systems mounted on 2 ceph nodes
 with
 a 3rd as a reference monitor.
 It's been used for a couple of months now but suddenly the root
 directory
 for the mount has become inaccessible and requests to files in it just
 hang,
 there's no ceph errors reported before/after and subdirectories of the
 directory can be used (and still are currently being used by VM's still
 running from it). It's being mounted in a mixed kernel driver (ubuntu)
 and
 centos (ceph-fuse) environment.
>>
>> kernel, ceph-fuse and ceph-mds version? the hang was likely caused by an
>> known
>> bug in kernel 3.10.
>>
>> Regards
>> Yan, Zheng
>
>
> Centos 6.4
> 2.6.32-358.23.2.el6.x86_64
> ceph.x86_64  0.67.4-0.el6
> ceph-fuse.x86_64 0.67.4-0.el6
>
> Ubuntu 12.04
> 3.5.0-41-generic
> Ceph Version: 0.67.2-1precise

3.5 kernel is too old for cephfs, please use ceph-fuse instead

Yan, Zheng

>
> ... So it looks like I've let my ceph versions get out of sync, the MDS is
> on a ubuntu box and all of the OSD are on Ubuntu boxes too, the CentOS just
> has another MON on it, think I really drag myself away from CentOS outright
> for ceph. I was previously using fuse on the Ubuntu boxes as well, though
> that changed a few days ago (Currently walking around the ceph features,
> next up was to hook ceph's rdb up to OpenNebula, hence the additional pool).
>
> -Michael
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Non-Ceph cluster name

2013-10-24 Thread Gaylord Holder

Works perfectly.

My only gripe is that --cluster isn't listed as a valid argument in

  ceph-mon --help

and the only reference found when searching for --cluster in the ceph documentation 
is in regard to ceph-rest-api.


Shall I file a bug to correct the documentation?

Thanks again for the quick and accurate response.

-Gaylord

On 10/24/2013 08:11 AM, Sage Weil wrote:

Try passing --cluster csceph instead of the config file path and I
suspect it will work.

sage



Gaylord Holder  wrote:

I'm trying to bring a ceph cluster not named ceph.

I'm running version 0.61.

  From my reading of the documentation, the $cluster metavariable is set
by the basename of the configuration file: specifying the configuration
file "/etc/ceph/mycluster.conf" sets the $cluster metavariable to
"mycluster"

However, given a configuration file /etc/ceph/csceph.conf:

[global]
 fsid = 70d421fe-28ca-4804-bce8-d51a16b531ec
 mon host =192.168.124.202  
 mon_initial_members = a

[mon.a]
host = monnode
mon addr =192.168.124.202:6789

and running:

ceph-authtool csceph.mon.keyring --create-keyring --name=mon.
--gen-key --cap mon 'allow *'

ceph-mon -c /etc/ceph/csceph.conf --mkfs -i a --keyring
csceph.mon.keyring

ceph-mon tries to create monfs in

/var/lib/ceph/mon/ceph-a

not

/var/lib/ceph/mon/csceph-a

as expected.


Thank you for any help you can give.

Cheers,
-Gaylord


ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ls/file access hangs on a single ceph directory

2013-10-24 Thread Michael

On 24/10/2013 13:53, Yan, Zheng wrote:

On Thu, Oct 24, 2013 at 5:43 PM, Michael  wrote:

On 24/10/2013 03:09, Yan, Zheng wrote:

On Thu, Oct 24, 2013 at 6:44 AM, Michael 
wrote:

Tying to gather some more info.

CentOS - hanging ls
[root@srv ~]# cat /proc/14614/stack
[] wait_answer_interruptible+0x81/0xc0 [fuse]
[] fuse_request_send+0x1cb/0x290 [fuse]
[] fuse_do_getattr+0x10c/0x2c0 [fuse]
[] fuse_update_attributes+0x75/0x80 [fuse]
[] fuse_getattr+0x53/0x60 [fuse]
[] vfs_getattr+0x51/0x80
[] vfs_fstatat+0x60/0x80
[] vfs_stat+0x1b/0x20
[] sys_newstat+0x24/0x50
[] system_call_fastpath+0x16/0x1b
[] 0x

Ubuntu - hanging ls
root@srv:~# cat /proc/30012/stack
[] ceph_mdsc_do_request+0xcb/0x1a0 [ceph]
[] ceph_do_getattr+0xe7/0x120 [ceph]
[] ceph_getattr+0x24/0x100 [ceph]
[] vfs_getattr+0x4e/0x80
[] vfs_fstatat+0x4e/0x70
[] vfs_lstat+0x1e/0x20
[] sys_newlstat+0x1a/0x40
[] system_call_fastpath+0x16/0x1b
[] 0x

Started occurring shortly (within an hour or so) after adding a pool, not
sure if that's relevant yet.

-Michael

On 23/10/2013 21:10, Michael wrote:

I have a filesystem shared by several systems mounted on 2 ceph nodes
with
a 3rd as a reference monitor.
It's been used for a couple of months now but suddenly the root
directory
for the mount has become inaccessible and requests to files in it just
hang,
there's no ceph errors reported before/after and subdirectories of the
directory can be used (and still are currently being used by VM's still
running from it). It's being mounted in a mixed kernel driver (ubuntu)
and
centos (ceph-fuse) environment.

kernel, ceph-fuse and ceph-mds version? the hang was likely caused by an
known
bug in kernel 3.10.

Regards
Yan, Zheng


Centos 6.4
2.6.32-358.23.2.el6.x86_64
ceph.x86_64  0.67.4-0.el6
ceph-fuse.x86_64 0.67.4-0.el6

Ubuntu 12.04
3.5.0-41-generic
Ceph Version: 0.67.2-1precise

3.5 kernel is too old for cephfs, please use ceph-fuse instead

Yan, Zheng


Ah, thanks, I'll switch back to fuse on the Ubuntu boxes tonight. Do you know 
around which kernel version CephFS becomes 'stable'?
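
(For reference, the ceph-fuse mount is roughly along these lines; the monitor
address and mount point below are just placeholders, and it assumes the
ceph.conf/keyring are in their default locations:)

  ceph-fuse -m 10.0.0.1:6789 /mnt/cephfs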


-Michael
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Rados bench result when increasing OSDs

2013-10-24 Thread Guang Yang
Hi Mark, Greg and Kyle,
Sorry for the late response, and thanks for providing directions for me to 
look at.

We have exactly the same setup for OSDs and pool replicas (I even tried to create 
the same number of PGs within the small cluster); however, I can still 
reproduce this constantly.
This is the command I run:
$ rados bench -p perf_40k_PG -b 5000 -t 3 --show-time 10 write

With 24 OSDs:
Average Latency: 0.00494123
Max latency: 0.511864
Min latency:  0.002198

With 330 OSDs:
Average Latency: 0.00913806
Max latency: 0.021967
Min latency:  0.005456

In terms of the CRUSH rule, we are using the default one; for the small 
cluster it has 3 OSD hosts (11 + 11 + 2), and for the large cluster we have 30 
OSD hosts (11 * 30).

I have a couple of questions:
 1. Is it possible that the latency is due to our having only a three-layer 
hierarchy (root -> host -> OSD)? As we are using the straw bucket type (the 
default), which has O(N) mapping speed, the computation grows as the number of 
hosts increases. I suspect not, as the computation is on the order of 
microseconds per my understanding.

 2. Is it possible that, because we have more OSDs, the cluster needs to maintain 
far more connections between OSDs, which potentially slows things down?

 3. Anything else I might have missed?

Thanks all for the constant help.

Guang  


On 2013-10-22, at 10:22 PM, Guang Yang wrote:

> Hi Kyle and Greg,
> I will get back to you with more details tomorrow, thanks for the response.
> 
> Thanks,
> Guang
> On 2013-10-22, at 9:37 AM, Kyle Bader wrote:
> 
>> Besides what Mark and Greg said it could be due to additional hops through 
>> network devices. What network devices are you using, what is the network  
>> topology and does your CRUSH map reflect the network topology?
>> 
>> On Oct 21, 2013 9:43 AM, "Gregory Farnum"  wrote:
>> On Mon, Oct 21, 2013 at 7:13 AM, Guang Yang  wrote:
>> > Dear ceph-users,
>> > Recently I deployed a ceph cluster with RadosGW, from a small one (24 
>> > OSDs) to a much bigger one (330 OSDs).
>> >
>> > When using rados bench to test the small cluster (24 OSDs), it showed the 
>> > average latency was around 3ms (object size is 5K), while for the larger 
>> > one (330 OSDs), the average latency was around 7ms (object size 5K), twice 
>> > comparing the small cluster.
>> >
>> > The OSD within the two cluster have the same configuration, SAS disk,  and 
>> > two partitions for one disk, one for journal and the other for metadata.
>> >
>> > For PG numbers, the small cluster tested with the pool having 100 PGs, and 
>> > for the large cluster, the pool has 4 PGs (as I will to further scale 
>> > the cluster, so I choose a much large PG).
>> >
>> > Does my test result make sense? Like when the PG number and OSD increase, 
>> > the latency might drop?
>> 
>> Besides what Mark said, can you describe your test in a little more
>> detail? Writing/reading, length of time, number of objects, etc.
>> -Greg
>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] About use same SSD for OS and Journal

2013-10-24 Thread Martin Catudal
Hi,
 Here my scenario :
I will have a small cluster (4 nodes) with 4 (4 TB) OSD's per node.

I will have OS installed on two SSD in raid 1 configuration.

Have any of you successfully and efficiently run a Ceph cluster that is 
built with the journal on a separate partition on the OS SSDs?

I know that a lot of IO may occur on the journal SSD and I'm scared 
of having my OS suffer from too much IO.

Any background experience?

Martin

-- 
Martin Catudal
Responsable TIC
Ressources Metanor Inc
Ligne directe: (819) 218-2708
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Rados bench result when increasing OSDs

2013-10-24 Thread Mark Nelson
On 10/24/2013 08:31 AM, Guang Yang wrote:
> Hi Mark, Greg and Kyle,
> Sorry to response this late, and thanks for providing the directions for 
> me to look at.
> 
> We have exact the same setup for OSD, pool replica (and even I tried to 
> create the same number of PGs within the small cluster), however, I can 
> still reproduce this constantly.
> 
> This is the command I run:
> $ rados bench -p perf_40k_PG -b 5000 -t 3 --show-time 10 write
> 
> With 24 OSDs:
> Average Latency: 0.00494123
> Max latency: 0.511864
> Min latency:  0.002198
> 
> With 330 OSDs:
> Average Latency:0.00913806
> Max latency: 0.021967
> Min latency:  0.005456
> 
> In terms of the crush rule, we are using the default one, for the small 
> cluster, it has 3 OSD hosts (11 + 11 + 2), for the large cluster, we 
> have 30 OSD hosts (11 * 30).
> 
> I have a couple of questions:
>   1. Is it possible that latency is due to that we have only three layer 
> hierarchy? like root -> host -> OSD, and as we are using the Straw (by 
> default) bucket type, which has O(N) speed, and if host number increase, 
> so that the computation actually increase. I suspect not as the 
> computation is in the order of microseconds per my understanding.

I suspect this is very unlikely as well.

> 
>   2. Is it possible because we have more OSDs, the cluster will need to 
> maintain far more connections between OSDs which potentially slow things 
> down?

One thing here that might be very interesting is this:

After you run your tests, if you do something like:

find /var/run/ceph/*.asok -maxdepth 1 -exec sudo ceph --admin-daemon {}
dump_historic_ops \; > foo

on each OSD server, you will get a dump of the 10 slowest operations
over the last 10 minutes for each OSD on each server, and it will tell
you where in each OSD operations were backing up.  You can sort of search
through these files by grepping for "duration" first, looking for the
long ones, and then going back and searching through the file for those
long durations and looking at the associated latencies.
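
Something along these lines is usually enough to surface the worst offenders
(this assumes the duration values are the only numbers on their lines, which
holds for the dump output I've seen):

  # list the 20 largest "duration" values across all dumped ops
  grep '"duration"' foo | tr -dc '0-9.\n' | sort -rn | head -20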

Something I have been investigating recently is time spent waiting for
osdmap propagation.  It's something I haven't had time to dig into
meaningfully, but if we were to see that this was more significant on
your larger cluster vs your smaller one, that would be very interesting
news.

> 
>   3. Anything else i might miss?
> 
> Thanks all for the constant help.
> 
> Guang
> 
> 
> On 2013-10-22, at 10:22 PM, Guang Yang wrote:
> 
>> Hi Kyle and Greg,
>> I will get back to you with more details tomorrow, thanks for the 
>> response.
>>
>> Thanks,
>> Guang
>> On 2013-10-22, at 9:37 AM, Kyle Bader wrote:
>>
>>> Besides what Mark and Greg said it could be due to additional hops 
>>> through network devices. What network devices are you using, what is 
>>> the network  topology and does your CRUSH map reflect the network 
>>> topology?
>>>
>>> On Oct 21, 2013 9:43 AM, "Gregory Farnum" >> > wrote:
>>>
>>> On Mon, Oct 21, 2013 at 7:13 AM, Guang Yang >> > wrote:
>>> > Dear ceph-users,
>>> > Recently I deployed a ceph cluster with RadosGW, from a small
>>> one (24 OSDs) to a much bigger one (330 OSDs).
>>> >
>>> > When using rados bench to test the small cluster (24 OSDs), it
>>> showed the average latency was around 3ms (object size is 5K),
>>> while for the larger one (330 OSDs), the average latency was
>>> around 7ms (object size 5K), twice comparing the small cluster.
>>> >
>>> > The OSD within the two cluster have the same configuration, SAS
>>> disk,  and two partitions for one disk, one for journal and the
>>> other for metadata.
>>> >
>>> > For PG numbers, the small cluster tested with the pool having
>>> 100 PGs, and for the large cluster, the pool has 4 PGs (as I
>>> will to further scale the cluster, so I choose a much large PG).
>>> >
>>> > Does my test result make sense? Like when the PG number and OSD
>>> increase, the latency might drop?
>>>
>>> Besides what Mark said, can you describe your test in a little more
>>> detail? Writing/reading, length of time, number of objects, etc.
>>> -Greg
>>> Software Engineer #42 @ http://inktank.com 
>>> | http://ceph.com 
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com 
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>
> 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Non-Ceph cluster name

2013-10-24 Thread Sage Weil
Try passing --cluster csceph instead of the config file path and I suspect it 
will work.
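
For example, building on the ceph-mon invocation quoted below, that would be
roughly:

  ceph-mon --cluster csceph --mkfs -i a --keyring csceph.mon.keyring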

sage

Gaylord Holder  wrote:
>I'm trying to bring a ceph cluster not named ceph.
>
>I'm running version 0.61.
>
>From my reading of the documentation, the $cluster metavariable is set 
>by the basename of the configuration file: specifying the configuration
>
>file "/etc/ceph/mycluster.conf" sets the $cluster metavariable to 
>"mycluster"
>
>However, given a configuration file /etc/ceph/csceph.conf:
>
>   [global]
>fsid = 70d421fe-28ca-4804-bce8-d51a16b531ec
>mon host = 192.168.124.202
>mon_initial_members = a
>
>   [mon.a]
>   host = monnode
>   mon addr = 192.168.124.202:6789
>
>and running:
>
>   ceph-authtool csceph.mon.keyring --create-keyring --name=mon. 
>--gen-key --cap mon 'allow *'
>
>   ceph-mon -c /etc/ceph/csceph.conf --mkfs -i a --keyring 
>csceph.mon.keyring
>
>ceph-mon tries to create monfs in
>
>   /var/lib/ceph/mon/ceph-a
>
>not
>
>   /var/lib/ceph/mon/csceph-a
>
>as expected.
>
>
>Thank you for any help you can give.
>
>Cheers,
>-Gaylord
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ls/file access hangs on a single ceph directory

2013-10-24 Thread Michael
I have a filesystem shared by several systems mounted on 2 ceph nodes 
with a 3rd as a reference monitor.
It's been used for a couple of months now, but suddenly the root 
directory for the mount has become inaccessible and requests to files in 
it just hang; there are no ceph errors reported before/after, and 
subdirectories of the directory can still be used (and still are currently 
being used by VMs running from it). It's being mounted in a mixed 
kernel driver (Ubuntu) and CentOS (ceph-fuse) environment.


 cluster ab3f7bc0-4cf7-4489-9cde-1af11d68a834
   health HEALTH_OK
   monmap e1: 3 mons at 
{srv10=##:6789/0,srv11=##:6789/0,srv8=##:6789/0}, election epoch 96, 
quorum 0,1,2 srv10,srv11,srv8

   osdmap e2873: 6 osds: 6 up, 6 in
   pgmap v2451618: 728 pgs: 728 active+clean; 128 GB data, 260 GB used, 
3929 GB / 4189 GB avail; 30365B/s wr, 5op/s

   mdsmap e51: 1/1/1 up {0=srv10=up:active}

Have done a full deep scrub/repair cycle on all of the osds, which has 
come back fine, so I'm not really sure where to start looking to find out 
what's wrong with it.


Any ideas?

-Michael
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ls/file access hangs on a single ceph directory

2013-10-24 Thread Yan, Zheng
On Thu, Oct 24, 2013 at 9:13 PM, Michael  wrote:
> On 24/10/2013 13:53, Yan, Zheng wrote:
>>
>> On Thu, Oct 24, 2013 at 5:43 PM, Michael 
>> wrote:
>>>
>>> On 24/10/2013 03:09, Yan, Zheng wrote:

 On Thu, Oct 24, 2013 at 6:44 AM, Michael 
 wrote:
>
> Tying to gather some more info.
>
> CentOS - hanging ls
> [root@srv ~]# cat /proc/14614/stack
> [] wait_answer_interruptible+0x81/0xc0 [fuse]
> [] fuse_request_send+0x1cb/0x290 [fuse]
> [] fuse_do_getattr+0x10c/0x2c0 [fuse]
> [] fuse_update_attributes+0x75/0x80 [fuse]
> [] fuse_getattr+0x53/0x60 [fuse]
> [] vfs_getattr+0x51/0x80
> [] vfs_fstatat+0x60/0x80
> [] vfs_stat+0x1b/0x20
> [] sys_newstat+0x24/0x50
> [] system_call_fastpath+0x16/0x1b
> [] 0x
>
> Ubuntu - hanging ls
> root@srv:~# cat /proc/30012/stack
> [] ceph_mdsc_do_request+0xcb/0x1a0 [ceph]
> [] ceph_do_getattr+0xe7/0x120 [ceph]
> [] ceph_getattr+0x24/0x100 [ceph]
> [] vfs_getattr+0x4e/0x80
> [] vfs_fstatat+0x4e/0x70
> [] vfs_lstat+0x1e/0x20
> [] sys_newlstat+0x1a/0x40
> [] system_call_fastpath+0x16/0x1b
> [] 0x
>
> Started occurring shortly (within an hour or so) after adding a pool,
> not
> sure if that's relevant yet.
>
> -Michael
>
> On 23/10/2013 21:10, Michael wrote:
>>
>> I have a filesystem shared by several systems mounted on 2 ceph nodes
>> with
>> a 3rd as a reference monitor.
>> It's been used for a couple of months now but suddenly the root
>> directory
>> for the mount has become inaccessible and requests to files in it just
>> hang,
>> there's no ceph errors reported before/after and subdirectories of the
>> directory can be used (and still are currently being used by VM's
>> still
>> running from it). It's being mounted in a mixed kernel driver (ubuntu)
>> and
>> centos (ceph-fuse) environment.

 kernel, ceph-fuse and ceph-mds version? the hang was likely caused by an
 known
 bug in kernel 3.10.

 Regards
 Yan, Zheng
>>>
>>>
>>> Centos 6.4
>>> 2.6.32-358.23.2.el6.x86_64
>>> ceph.x86_64  0.67.4-0.el6
>>> ceph-fuse.x86_64 0.67.4-0.el6
>>>
>>> Ubuntu 12.04
>>> 3.5.0-41-generic
>>> Ceph Version: 0.67.2-1precise
>>
>> 3.5 kernel is too old for cephfs, please use ceph-use instead
>>
>> Yan, Zheng
>
>
> Ah Thanks, I'll switch back to fuse on the Ubuntu's tonight. Know around
> which kernel version Cephfs becomes 'Stable'?
>

3.12+
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Hardware: SFP+ or 10GBase-T

2013-10-24 Thread Nathan Stratton
I have tried to make GlusterFS work for the last 2 years on different
projects and have given up. With Gluster I have always used 10 gig
InfiniBand. It's dirt cheap (about $80 a port used, including switch)
and very low latency; however, ceph does not support it, so we are
looking at Ethernet.

I know that 10GBase-T has more delay than SFP+ with direct attached
cables (about 0.3 usec per link for SFP+ DAC vs 2.6 usec for 10GBase-T),
but does that matter? Some sites say it is a huge hit, but we are talking
usec, not ms, so I find it hard to believe that it causes that much of an
issue. I like the lower cost and use of standard cabling vs SFP+, but I
don't want to sacrifice on performance.

Our plan is to use our KVM hosts for ceph; the hardware we are looking
at now is:

SPF+ Option - Supermicro X9DRW-7TPF+ (Intel 82599)
10GBase-T Option - Supermicro X9DRW-3TF+ (Intel x540)

2 - 2.9 GHz Xeon 2690 v2
16 - 8 Gig 1877 MHz dual rank DDR3
9 - Samsung 840 EVO 120 GB SSD (1 root 8 ceph)

The switch is going to be an Arista 7050S for SFP+ or an Arista 7050T for 10GBase-T.

-- 
><>
nathan stratton | vp technology | broadsoft, inc | +1-240-404-6580 |
www.broadsoft.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Rados bench result when increasing OSDs

2013-10-24 Thread Guang Yang
Thanks Mark.

I cannot connect to my hosts, I will do the check and get back to you tomorrow.

Thanks,
Guang

On 2013-10-24, at 9:47 PM, Mark Nelson wrote:

> On 10/24/2013 08:31 AM, Guang Yang wrote:
>> Hi Mark, Greg and Kyle,
>> Sorry to response this late, and thanks for providing the directions for 
>> me to look at.
>> 
>> We have exact the same setup for OSD, pool replica (and even I tried to 
>> create the same number of PGs within the small cluster), however, I can 
>> still reproduce this constantly.
>> 
>> This is the command I run:
>> $ rados bench -p perf_40k_PG -b 5000 -t 3 --show-time 10 write
>> 
>> With 24 OSDs:
>> Average Latency: 0.00494123
>> Max latency: 0.511864
>> Min latency:  0.002198
>> 
>> With 330 OSDs:
>> Average Latency:0.00913806
>> Max latency: 0.021967
>> Min latency:  0.005456
>> 
>> In terms of the crush rule, we are using the default one, for the small 
>> cluster, it has 3 OSD hosts (11 + 11 + 2), for the large cluster, we 
>> have 30 OSD hosts (11 * 30).
>> 
>> I have a couple of questions:
>>  1. Is it possible that latency is due to that we have only three layer 
>> hierarchy? like root -> host -> OSD, and as we are using the Straw (by 
>> default) bucket type, which has O(N) speed, and if host number increase, 
>> so that the computation actually increase. I suspect not as the 
>> computation is in the order of microseconds per my understanding.
> 
> I suspect this is very unlikely as well.
> 
>> 
>>  2. Is it possible because we have more OSDs, the cluster will need to 
>> maintain far more connections between OSDs which potentially slow things 
>> down?
> 
> One thing here that might be very interesting is this:
> 
> After you run your tests, if you do something like:
> 
> find /var/run/ceph/*.asok -maxdepth 1 -exec sudo ceph --admin-daemon {}
> dump_historic_ops \; > foo
> 
> on each OSD server, you will get a dump of the 10 slowest operations
> over the last 10 minutes for each OSD on each server, and it will tell
> you were in each OSD operations were backing up.  You can sort of search
> through these files by greping for "duration" first, looking for the
> long ones, and then going back and searching through the file for those
> long durations and looking at the associated latencies.
> 
> Something I have been investigating recently is time spent waiting for
> osdmap propagation.  It's something I haven't had time to dig into
> meaningfully, but if we were to see that this was more significant on
> your larger cluster vs your smaller one, that would be very interesting
> news.
> 
>> 
>>  3. Anything else i might miss?
>> 
>> Thanks all for the constant help.
>> 
>> Guang
>> 
>> 
>> On 2013-10-22, at 10:22 PM, Guang Yang wrote:
>> 
>>> Hi Kyle and Greg,
>>> I will get back to you with more details tomorrow, thanks for the 
>>> response.
>>> 
>>> Thanks,
>>> Guang
>>> On 2013-10-22, at 9:37 AM, Kyle Bader wrote:
>>> 
 Besides what Mark and Greg said it could be due to additional hops 
 through network devices. What network devices are you using, what is 
 the network  topology and does your CRUSH map reflect the network 
 topology?
 
 On Oct 21, 2013 9:43 AM, "Gregory Farnum" >>> > wrote:
 
On Mon, Oct 21, 2013 at 7:13 AM, Guang Yang >>>> wrote:
> Dear ceph-users,
> Recently I deployed a ceph cluster with RadosGW, from a small
one (24 OSDs) to a much bigger one (330 OSDs).
> 
> When using rados bench to test the small cluster (24 OSDs), it
showed the average latency was around 3ms (object size is 5K),
while for the larger one (330 OSDs), the average latency was
around 7ms (object size 5K), twice comparing the small cluster.
> 
> The OSD within the two cluster have the same configuration, SAS
disk,  and two partitions for one disk, one for journal and the
other for metadata.
> 
> For PG numbers, the small cluster tested with the pool having
100 PGs, and for the large cluster, the pool has 4 PGs (as I
will to further scale the cluster, so I choose a much large PG).
> 
> Does my test result make sense? Like when the PG number and OSD
increase, the latency might drop?
 
Besides what Mark said, can you describe your test in a little more
detail? Writing/reading, length of time, number of objects, etc.
-Greg
Software Engineer #42 @ http://inktank.com 
| http://ceph.com 
___
ceph-users mailing list
ceph-users@lists.ceph.com 
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
>>> 
>> 
> 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

[ceph-users] Error on adding Monitors

2013-10-24 Thread David J F Carradice
Hi. 

I am getting an error on adding monitors to my cluster.
ceph@ceph-deploy:~/my-cluster$ ceph-deploy mon create ceph-osd01
[ceph_deploy.cli][INFO  ] Invoked (1.2.7): /usr/bin/ceph-deploy mon create
ceph-osd01
[ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts ceph-osd01
[ceph_deploy.mon][DEBUG ] detecting platform for host ceph-osd01 ...
[ceph_deploy.sudo_pushy][DEBUG ] will use a remote connection with sudo
[ceph_deploy.mon][INFO  ] distro info: Ubuntu 13.04 raring
[ceph-osd01][DEBUG ] determining if provided host has same hostname in
remote
[ceph-osd01][DEBUG ] deploying mon to ceph-osd01
[ceph-osd01][DEBUG ] remote hostname: ceph-osd01
[ceph-osd01][INFO  ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-osd01][DEBUG ] checking for done path:
/var/lib/ceph/mon/ceph-ceph-osd01/done
[ceph-osd01][INFO  ] create a done file to avoid re-doing the mon deployment
[ceph-osd01][INFO  ] create the init path if it does not exist
[ceph-osd01][INFO  ] locating `service` executable...
[ceph-osd01][INFO  ] found `service` executable: /usr/sbin/service
[ceph-osd01][INFO  ] Running command: sudo initctl emit ceph-mon
cluster=ceph id=ceph-osd01
[ceph-osd01][INFO  ] Running command: sudo ceph --admin-daemon
/var/run/ceph/ceph-mon.ceph-osd01.asok mon_status
[ceph-osd01][ERROR ] admin_socket: exception getting command descriptions:
[Errno 2] No such file or directory
[ceph-osd01][WARNIN] monitor: mon.ceph-osd01, might not be running yet
[ceph-osd01][INFO  ] Running command: sudo ceph --admin-daemon
/var/run/ceph/ceph-mon.ceph-osd01.asok mon_status
[ceph-osd01][ERROR ] admin_socket: exception getting command descriptions:
[Errno 2] No such file or directory
[ceph-osd01][WARNIN] ceph-osd01 is not defined in `mon initial members`
[ceph-osd01][WARNIN] monitor ceph-osd01 does not exist in monmap
[ceph-osd01][WARNIN] neither `public_addr` nor `public_network` keys are
defined for monitors
[ceph-osd01][WARNIN] monitors may not be able to form quorum

This is happening after a successful first add of a monitor, ceph-mon01.
As per the ceph-deploy documentation, I added a single monitor, then some
disk daemons located on ceph-osd01~03, then went to add more monitors,
ceph-osd01 & 02, for a quorum. This is where I get the issue.

Is the issue related to the WARNING present regarding keys?

It appears that when running ceph-deploy mon create <host> from my
ceph-deploy server, it complains about there not being any
ceph-mon.<host>.asok (which I assume are admin sockets). I looked in the
respective directories on the potential monitor nodes (which are currently
also the OSD nodes) and see that there is only an OSD .asok, no MON .asok

I can send my ceph.conf and a brief overview if it helps.

Now to add to the fun, the ceph health check passes, and I am able to place an
object. It is the fact that I cannot add more than the current single mon
node that I would like to resolve.

Regards

David

PS Extract of ceph.log on my ceph-deploy server (separate from other nodes... 
not a mon nor an osd)

SUCCESSFUL INITIAL MON
 764 2013-10-22 12:31:11,157 [ceph-mon01][INFO  ] Running command: sudo ceph
--admin-daemon /var/run/ceph/ceph-mon.ceph-mon01.asok mon_status

765 2013-10-22 12:31:11,223 [ceph-mon01][DEBUG ]



766 2013-10-22 12:31:11,223 [ceph-mon01][DEBUG ] status for monitor:
mon.ceph-mon01

767 2013-10-22 12:31:11,224 [ceph-mon01][DEBUG ] {

768 2013-10-22 12:31:11,224 [ceph-mon01][DEBUG ]   "election_epoch": 2,



FAILED ADD MON

See above in blue.




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Non-Ceph cluster name

2013-10-24 Thread Sage Weil
A bug would be great. Thanks!

sage

Gaylord Holder  wrote:
>Works perfectly.
>
>My only grip is --cluster isn't listed as a valid argument from
>
>   ceph-mon --help
>
>and the only reference searching for --cluster in the ceph
>documentation 
>is in regards to ceph-rest-api.
>
>Shall I file a bug to correct the documentation?
>
>Thanks again for the quick and accurate response.
>
>-Gaylord
>
>On 10/24/2013 08:11 AM, Sage Weil wrote:
>> Try passing --cluster csceph instead of the config file path and I
>> suspect it will work.
>>
>> sage
>>
>>
>>
>> Gaylord Holder  wrote:
>>
>> I'm trying to bring a ceph cluster not named ceph.
>>
>> I'm running version 0.61.
>>
>>   From my reading of the documentation, the $cluster metavariable
>is set
>> by the basename of the configuration file: specifying the
>configuration
>> file "/etc/ceph/mycluster.conf" sets the $cluster metavariable to
>> "mycluster"
>>
>> However, given a configuration file /etc/ceph/csceph.conf:
>>
>> [global]
>>  fsid = 70d421fe-28ca-4804-bce8-d51a16b531ec
>>  mon host =192.168.124.202  
>>  mon_initial_members = a
>>
>> [mon.a]
>> host = monnode
>> mon addr =192.168.124.202:6789
>>
>> and running:
>>
>> ceph-authtool csceph.mon.keyring --create-keyring --name=mon.
>> --gen-key --cap mon 'allow *'
>>
>> ceph-mon -c /etc/ceph/csceph.conf --mkfs -i a --keyring
>> csceph.mon.keyring
>>
>> ceph-mon tries to create monfs in
>>
>> /var/lib/ceph/mon/ceph-a
>>
>> not
>>
>> /var/lib/ceph/mon/csceph-a
>>
>> as expected.
>>
>>
>> Thank you for any help you can give.
>>
>> Cheers,
>> -Gaylord
>>
>
>>
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Error on adding Monitors

2013-10-24 Thread Joao Eduardo Luis

On 10/24/2013 03:12 PM, David J F Carradice wrote:

Hi.

I am getting an error on adding monitors to my cluster.
ceph@ceph-deploy:~/my-cluster$ ceph-deploy mon create ceph-osd01
[ceph_deploy.cli][INFO  ] Invoked (1.2.7): /usr/bin/ceph-deploy mon
create ceph-osd01
[ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts ceph-osd01
[ceph_deploy.mon][DEBUG ] detecting platform for host ceph-osd01 ...
[ceph_deploy.sudo_pushy][DEBUG ] will use a remote connection with sudo
[ceph_deploy.mon][INFO  ] distro info: Ubuntu 13.04 raring
[ceph-osd01][DEBUG ] determining if provided host has same hostname in
remote
[ceph-osd01][DEBUG ] deploying mon to ceph-osd01
[ceph-osd01][DEBUG ] remote hostname: ceph-osd01
[ceph-osd01][INFO  ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-osd01][DEBUG ] checking for done path:
/var/lib/ceph/mon/ceph-ceph-osd01/done
[ceph-osd01][INFO  ] create a done file to avoid re-doing the mon deployment
[ceph-osd01][INFO  ] create the init path if it does not exist
[ceph-osd01][INFO  ] locating `service` executable...
[ceph-osd01][INFO  ] found `service` executable: /usr/sbin/service
[ceph-osd01][INFO  ] Running command: sudo initctl emit ceph-mon
cluster=ceph id=ceph-osd01
[ceph-osd01][INFO  ] Running command: sudo ceph --admin-daemon
/var/run/ceph/ceph-mon.ceph-osd01.asok mon_status
[ceph-osd01][ERROR ] admin_socket: exception getting command
descriptions: [Errno 2] No such file or directory
[ceph-osd01][WARNIN] monitor: mon.ceph-osd01, might not be running yet
[ceph-osd01][INFO  ] Running command: sudo ceph --admin-daemon
/var/run/ceph/ceph-mon.ceph-osd01.asok mon_status
[ceph-osd01][ERROR ] admin_socket: exception getting command
descriptions: [Errno 2] No such file or directory
[ceph-osd01][WARNIN] ceph-osd01 is not defined in `mon initial members`
[ceph-osd01][WARNIN] monitor ceph-osd01 does not exist in monmap
[ceph-osd01][WARNIN] neither `public_addr` nor `public_network` keys are
defined for monitors
[ceph-osd01][WARNIN] monitors may not be able to form quorum

This is happening after a successful and first add of a monitor,
ceph-mon01. As per the ceph-deploy documentation, I added a single
monitor, then some disk daemons located on ceph-osd01~03, then went to
add more monitors, ceph-osd01 & 02 for a quorum. This is where I get the
issue.

Is the issue related to the WARNING present regarding keys?


That's a warning regarding config options (public_addr/public_network) 
and the lack of enough info to generate a monmap.
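
A minimal [global] fragment covering both warnings would look something like
the following; the subnet is only a placeholder for your actual public
network, and the member list just mirrors the hosts mentioned above:

  [global]
  public_network = 10.0.0.0/24
  mon_initial_members = ceph-mon01, ceph-osd01, ceph-osd02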




It appears that when running the ceph-deploy mon create  from my
ceph-deploy server, it complains about there not being any
ceph-mon..asok (which I assume are address sockets). I looked in
the respective directories on the potential monitor nodes (which are
currently also the OSD nodes) and see that there is only an OSD.asok, no
MON.asok


The monitor's .asok (admin socket) will only be created at start.  If 
the monitor hasn't been run yet, then there's no asok.




I can send my ceph.conf and a brief overview if it helps.


ceph.conf, specially the [global] and [mon]/[mon.foo] sections would be 
helpful.




Now to add to the fun, the  ceph he alt passes, and I am able to place
an object. It is the fact that I cannot add more than the current single
mon node that I would like to resolve.

Regards

David

PS Extract of ceph.log on my ceph-deploy server (separate from other
nodes… not a mon nor an osd)

SUCCESSFUL INITIAL MON

  764 2013-10-22 12:31:11,157 [ceph-mon01][INFO  ] Running command: sudo
ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-mon01.asok m764
on_status

 765 2013-10-22 12:31:11,223 [ceph-mon01][DEBUG ]


 766 2013-10-22 12:31:11,223 [ceph-mon01][DEBUG ] status for
monitor: mon.ceph-mon01

 767 2013-10-22 12:31:11,224 [ceph-mon01][DEBUG ] {

 768 2013-10-22 12:31:11,224 [ceph-mon01][DEBUG ]
"election_epoch": 2,


FAILED ADD MON

See above in blue.




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




--
Joao Eduardo Luis
Software Engineer | http://inktank.com | http://ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] About use same SSD for OS and Journal

2013-10-24 Thread Kurt Bauer
Hi,

we had a setup like this and ran into trouble, so I would strongly
discourage you from setting it up like this. Under normal circumstances
there's no problem, but when the cluster is under heavy load, for
example when it has a lot of pgs backfilling, for whatever reason
(increasing num of pgs, adding OSDs,..), there's obviously a lot of
entries written to the journals.
What we saw then was extremely laggy behavior of the cluster and when
looking at the iostats of the SSD, they were at 100% most of the time. I
don't exactly know what causes this and why the SSDs can't cope with the
amount of IOs, but separating OS and journals did the trick. We now have
quick 15k HDDs in RAID1 for OS and monitor journal, and one SSD per 5 OSD
journals, with one partition per journal (used as a raw partition).
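(If you want to check whether you are hitting the same wall, watching the
journal SSD with iostat while PGs are backfilling makes it fairly obvious;
%util pinned near 100% on that device is the symptom described above:

  $ iostat -x 1

The device to watch is whichever SSD holds the journal partitions.)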

Hope that helps,
best regards,
Kurt

Martin Catudal schrieb:
> Hi,
>  Here my scenario :
> I will have a small cluster (4 nodes) with 4 (4 TB) OSD's per node.
>
> I will have OS installed on two SSD in raid 1 configuration.
>
> Has any of you successfully and efficiently run a Ceph cluster that is 
> built with the Journal on a separate partition on the OS SSDs?
>
> I know that there may be a lot of IO on the Journal SSD and I'm worried 
> that my OS will suffer from too much IO.
>
> Any background experience?
>
> Martin
>


smime.p7s
Description: S/MIME Cryptographic Signature
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Hardware: SFP+ or 10GBase-T

2013-10-24 Thread Mark Nelson

On 10/24/2013 09:08 AM, Nathan Stratton wrote:

I have tried to make GlusterFS work for the last 2 years on different
projects and have given up. With Gluster I have always used 10 gig
InfiniBand. It's dirt cheap (about $80 a port used, including switch)
and very low latency, however Ceph does not support it so we are
looking at ethernet.


Ceph does work with IPoIB. We've got some people working on rsocket 
support, and Mellanox just open-sourced VMA, so there are some options on 
the infiniband side if you want to go that route.  With QDR and IPoIB we 
have been able to push about 2.4GB/s per node.  No idea how SDR would do 
though.




I know that 10GBase-T has more delay than SFP+ with direct attached
cables (about 2.6 usec per link for 10GBase-T vs .3 usec for SFP+), but
does that matter? Some sites say it is a huge hit, but we are talking
usec, not ms, so I find it hard to believe that it causes that much of an
issue. I like the lower cost and use of standard cabling vs SFP+, but I
don't want to sacrifice on performance.


Honestly I wouldn't worry about it too much.  We have bigger latency 
dragons to slay. :)




Our plan is to use our KVM hosts for ceph; the hardware we are looking
at now is:

SFP+ Option - Supermicro X9DRW-7TPF+ (Intel 82599)
10GBase-T Option - Supermicro X9DRW-3TF+ (Intel x540)

2 - 2.9 GHz Xeon 2690 v2
16 - 8 Gig 1877 MHz dual rank DDR3
9 - Samsung 840 EVO 120 GB SSD (1 root 8 ceph)


Just FYI, we haven't done a whole lot of optimization work on SSDs yet, 
so if you are shooting for really high IOPS be prepared as it's still 
kind of the wild west. :)  We've got a couple of people working on different 
projects that we hope will help here, but there's a lot of tuning work 
to be done still. :)




Switch is going to be an Arista 7050-S for SFP+ or an Arista 7050-T for 10GBase-T.



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] About use same SSD for OS and Journal

2013-10-24 Thread Martin Catudal
Thanks, Kurt,
 That comforts me in my decision to separate OS and Journal.
Martin

Martin Catudal
Responsable TIC
Ressources Metanor Inc
Ligne directe: (819) 218-2708

Le 2013-10-24 10:40, Kurt Bauer a écrit :
> Hi,
>
> we had a setup like this and ran into trouble, so I would strongly
> discourage you from setting it up like this. Under normal circumstances
> there's no problem, but when the cluster is under heavy load, for
> example when it has a lot of pgs backfilling, for whatever reason
> (increasing num of pgs, adding OSDs,..), there's obviously a lot of
> entries written to the journals.
> What we saw then was extremely laggy behavior of the cluster and when
> looking at the iostats of the SSD, they were at 100% most of the time. I
> don't exactly know what causes this and why the SSDs can't cope with the
> amount of IOs, but separating OS and journals did the trick. We now have
> quick 15k HDDs in RAID1 for OS and monitor journal, and one SSD per 5 OSD
> journals, with one partition per journal (used as a raw partition).
>
> Hope that helps,
> best regards,
> Kurt
>
> Martin Catudal schrieb:
>> Hi,
>>   Here my scenario :
>> I will have a small cluster (4 nodes) with 4 (4 TB) OSD's per node.
>>
>> I will have OS installed on two SSD in raid 1 configuration.
>>
>> Has any of you successfully and efficiently run a Ceph cluster that is
>> built with the Journal on a separate partition on the OS SSDs?
>>
>> I know that there may be a lot of IO on the Journal SSD and I'm worried
>> that my OS will suffer from too much IO.
>>
>> Any background experience?
>>
>> Martin
>>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Rados bench result when increasing OSDs

2013-10-24 Thread Gregory Farnum
On Thu, Oct 24, 2013 at 6:31 AM, Guang Yang  wrote:
> Hi Mark, Greg and Kyle,
> Sorry to respond this late, and thanks for providing the directions for me
> to look at.
>
> We have exactly the same setup for OSD and pool replica (I even tried to
> create the same number of PGs within the small cluster), however, I can
> still reproduce this constantly.
>
> This is the command I run:
> $ rados bench -p perf_40k_PG -b 5000 -t 3 --show-time 10 write

3 seconds is not going to give you an accurate picture of much of
anything. :) The difference is probably due to caching effects on the
data structure lookups.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
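(For reference, a longer run with a few more concurrent ops tends to smooth
out such cache effects; a sketch, keeping the pool name from the quoted
command above:

  $ rados bench -p perf_40k_PG 300 write -b 4096 -t 16 --no-cleanup
  $ rados bench -p perf_40k_PG 300 seq -t 16

The --no-cleanup on the write pass leaves the objects in place so the seq
read pass has something to read.)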
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] num of placement groups created for default pools

2013-10-24 Thread Snider, Tim
Thanks for the explanation, that makes sense.
Tim

-Original Message-
From: Tyler Brekke [mailto:tyler.bre...@inktank.com] 
Sent: Thursday, October 24, 2013 6:42 AM
To: Snider, Tim
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] num of placement groups created for default pools

Hey Tim,

If you deployed with ceph-deploy then your monitors started without knowledge 
of how many OSDs you will be adding to your cluster. You can add  
'osd_pool_default_pg_num' and 'osd_pool_default_pgp_num' to your ceph.conf 
before creating your monitors to have the default pools created with the proper 
number of placement groups.

I believe with the old mkcephfs script the number of osds was used to give a 
better default pg count. I don't think this is really necessary anymore as you 
can increase your placement group size now.

ceph osd pool set {pool-name} pg_num {pg-num}
ceph osd pool set {pool-name} pgp_num {pgp-num}
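As a sketch of the ceph.conf side, with the value only an example taken from
the osds*100/replicas rule of thumb (e.g. 68 OSDs * 100 / 2 = 3400, rounded up
to a power of two):

  # in ceph.conf [global], before the monitors are created
  osd_pool_default_pg_num = 4096
  osd_pool_default_pgp_num = 4096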


On Wed, Oct 23, 2013 at 6:13 AM, Snider, Tim  wrote:
> I have a newly created cluster with 68 osds and the default of 2 replicas. 
> The default pools  are created with 64 placement groups . The documentation 
> in http://ceph.com/docs/master/rados/operations/pools/ states  for osd pool 
> creation :
> "We recommend approximately 50-100 placement groups per OSD to balance out 
> memory and CPU requirements and per-OSD load. For a single pool of objects, 
> you can use the following formula: Total PGS = (osds *100)/Replicas"
>
> For this cluster pools should have  3200 pgs [ (64*100)/2] according to the 
> recommendation.
> Why isn't  the guideline followed for default pools?
> Maybe they're created prior to having all the osds activated?
> Maybe I'm reading the documentation incorrectly.
>
> /home/ceph/bin# ceph osd getmaxosd
> max_osd = 68 in epoch 219
> /home/ceph/bin# ceph osd getmaxosd
> max_osd = 68 in epoch 219
> /home/ceph/bin# ceph osd lspools
> 0 data,1 metadata,2 rbd,
> /home/ceph/bin# ceph osd pool get data pg_num
> pg_num: 64
> /home/ceph/bin# ceph osd pool get data size
> size: 2
>
> Thanks,
> Tim
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] About use same SSD for OS and Journal

2013-10-24 Thread Wido den Hollander

On 10/24/2013 03:36 PM, Martin Catudal wrote:

Hi,
  Here my scenario :
I will have a small cluster (4 nodes) with 4 (4 TB) OSD's per node.

I will have OS installed on two SSD in raid 1 configuration.



I would never run your journal in RAID-1 on SSDs. It means you'll 'burn' 
through them at the same rate, so there is no benefit.



Has any of you successfully and efficiently run a Ceph cluster that is
built with the Journal on a separate partition on the OS SSDs?



Not that I know of, I would always separate it.


I know that there may be a lot of IO on the Journal SSD and I'm worried
that my OS will suffer from too much IO.

Any background experience?


What I'd suggest:
* One small (~60GB) SSD for your OS
* One SSD per 6 OSDs for journaling
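A minimal sketch of what that layout looks like in ceph.conf, assuming the
journal SSD shows up as /dev/sdg with one partition per OSD (device names are
purely illustrative):

  [osd.0]
  osd journal = /dev/sdg1
  [osd.1]
  osd journal = /dev/sdg2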

I have never seen one Intel SSD fail. I've been using them since the 
X25-M 80GB SSDs and those are still in production without even one 
wearing out or failing.


Nowadays I'm so comfortable with SSDs that I never use RAID on them when 
running an OS on them.


With Ceph I also see machines as 'disposable', so IF a SSD for the OS 
fails I don't care, since the rest of the cluster will take over.


This gives you another free slot which you can use for an OSD.

Some chassis from SuperMicro have internal 2.5" bays where you can place 
your SSD for the OS so you can use the hot-swap bays for your Journal 
and Data disks.




Martin




--
Wido den Hollander
42on B.V.

Phone: +31 (0)20 700 9902
Skype: contact42on
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Hardware: SFP+ or 10GBase-T

2013-10-24 Thread Kyle Bader
> I know that 10GBase-T has more delay than SFP+ with direct attached
> cables (about 2.6 usec per link for 10GBase-T vs .3 usec for SFP+), but
> does that matter? Some sites say it is a huge hit, but we are talking
> usec, not ms, so I find it hard to believe that it causes that much of
> an issue. I like the lower cost and use of standard cabling vs SFP+,
> but I don't want to sacrifice on performance.

If you are talking about the links from the nodes with OSDs to their
ToR switches then I would suggest going with Twinax cables. Twinax
doesn't go very far but it's really durable and uses less power than
10GBase-T. Here's a blog post that goes into more detail:

http://etherealmind.com/difference-twinax-category-6-10-gigabit-ethernet/

I would probably go with the Arista 7050-S over the 7050-T and use
twinax for ToR to OSD node links and SFP+SR uplinks to spine switches
if you need longer runs.

-- 

Kyle
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Hardware: SFP+ or 10GBase-T

2013-10-24 Thread Nathan Stratton
On Thu, Oct 24, 2013 at 9:48 AM, Mark Nelson  wrote:
> Ceph does work with IPoIB, We've got some people working on rsocket support,
> and Mellanox just opensourced VMA, so there are some options on the
> infiniband side if you want to go that route.  With QDR and IPoIB we have
> been able to push about 2.4GB/s per node.  No idea how SDR would do though.

That is great news!

> Honestly I wouldn't worry about it too much.  We have bigger latency dragons
> to slay. :)

Ok, this is what I thought, but wanted to make sure.

> Just FYI, we haven't done a whole lot of optimization work on SSDs yet, so
> if you are shooting for really high IOPS be prepared as its still kind of
> wild west. :)  We've got a couple of people working on different projects
> that we hope will help here, but there's a lot of tuning work to be done
> still. :)

Understood, we don't need huge amounts of space, so the 240 Gig SSDs
were just a bit more than the SAS drives. Though depending on the code, I
guess I could run into wear issues with SSDs over SAS/SATA because of
frequent writes.

-- 
><>
nathan stratton | vp technology | broadsoft, inc | +1-240-404-6580 |
www.broadsoft.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Hardware: SFP+ or 10GBase-T

2013-10-24 Thread Nathan Stratton
On Thu, Oct 24, 2013 at 11:19 AM, Kyle Bader  wrote:
> If you are talking about the links from the nodes with OSDs to their
> ToR switches then I would suggest going with Twinax cables. Twinax
> doesn't go very far but it's really durable and uses less power than
> 10GBase-T. Here's a blog post that goes into more detail:
>
> http://etherealmind.com/difference-twinax-category-6-10-gigabit-ethernet/
>
> I would probably go with the Arista 7050-S over the 7050-T and use
> twinax for ToR to OSD node links and SFP+SR uplinks to spine switches
> if you need longer runs.

So I totally understand that it's less power, but I find that hard to
justify when the cost jumps to $300 more per port. With dual
ports it's going to take a long time to make that up in power savings.



-- 
><>
nathan stratton | vp technology | broadsoft, inc | +1-240-404-6580 |
www.broadsoft.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Hardware: SFP+ or 10GBase-T

2013-10-24 Thread Mark Nelson

On 10/24/2013 11:38 AM, Nathan Stratton wrote:

On Thu, Oct 24, 2013 at 11:19 AM, Kyle Bader  wrote:

If you are talking about the links from the nodes with OSDs to their
ToR switches then I would suggest going with Twinax cables. Twinax
doesn't go very far but it's really durable and uses less power than
10GBase-T. Here's a blog post that goes into more detail:

http://etherealmind.com/difference-twinax-category-6-10-gigabit-ethernet/

I would probably go with the Arista 7050-S over the 7050-T and use
twinax for ToR to OSD node links and SFP+SR uplinks to spine switches
if you need longer runs.


So I totally understand that it's less power, but I find that hard to
justify when the cost jumps to $300 more per port. With dual
ports it's going to take a long time to make that up in power savings.


It used to be that SFP+ 10GbE cards were somewhat cheaper than the Cat6 
ones which helped offset the price of the cables.  Not sure what kind of 
pricing you can get but if twinax works out to be not much more I'd do 
it.  I suspect it won't really matter to Ceph, but the power savings are 
nice (this can add up when doing a full data center) and I like working 
with them better.








___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ls/file access hangs on a single ceph directory

2013-10-24 Thread Michael

On 24/10/2013 14:55, Yan, Zheng wrote:

On Thu, Oct 24, 2013 at 9:13 PM, Michael  wrote:

On 24/10/2013 13:53, Yan, Zheng wrote:

On Thu, Oct 24, 2013 at 5:43 PM, Michael 
wrote:

On 24/10/2013 03:09, Yan, Zheng wrote:

On Thu, Oct 24, 2013 at 6:44 AM, Michael 
wrote:

Tying to gather some more info.

CentOS - hanging ls
[root@srv ~]# cat /proc/14614/stack
[] wait_answer_interruptible+0x81/0xc0 [fuse]
[] fuse_request_send+0x1cb/0x290 [fuse]
[] fuse_do_getattr+0x10c/0x2c0 [fuse]
[] fuse_update_attributes+0x75/0x80 [fuse]
[] fuse_getattr+0x53/0x60 [fuse]
[] vfs_getattr+0x51/0x80
[] vfs_fstatat+0x60/0x80
[] vfs_stat+0x1b/0x20
[] sys_newstat+0x24/0x50
[] system_call_fastpath+0x16/0x1b
[] 0x

Ubuntu - hanging ls
root@srv:~# cat /proc/30012/stack
[] ceph_mdsc_do_request+0xcb/0x1a0 [ceph]
[] ceph_do_getattr+0xe7/0x120 [ceph]
[] ceph_getattr+0x24/0x100 [ceph]
[] vfs_getattr+0x4e/0x80
[] vfs_fstatat+0x4e/0x70
[] vfs_lstat+0x1e/0x20
[] sys_newlstat+0x1a/0x40
[] system_call_fastpath+0x16/0x1b
[] 0x

Started occurring shortly (within an hour or so) after adding a pool,
not
sure if that's relevant yet.

-Michael

On 23/10/2013 21:10, Michael wrote:

I have a filesystem shared by several systems mounted on 2 ceph nodes
with
a 3rd as a reference monitor.
It's been used for a couple of months now but suddenly the root
directory
for the mount has become inaccessible and requests to files in it just
hang,
there's no ceph errors reported before/after and subdirectories of the
directory can be used (and still are currently being used by VM's
still
running from it). It's being mounted in a mixed kernel driver (ubuntu)
and
centos (ceph-fuse) environment.

kernel, ceph-fuse and ceph-mds version? The hang was likely caused by a known
bug in kernel 3.10.

Regards
Yan, Zheng


Centos 6.4
2.6.32-358.23.2.el6.x86_64
ceph.x86_64  0.67.4-0.el6
ceph-fuse.x86_64 0.67.4-0.el6

Ubuntu 12.04
3.5.0-41-generic
Ceph Version: 0.67.2-1precise

3.5 kernel is too old for cephfs, please use ceph-fuse instead

Yan, Zheng


Ah, thanks, I'll switch back to fuse on the Ubuntus tonight. Do you know around
which kernel version CephFS becomes 'stable'?


3.12+
Have switched all of the nodes back to mounting as ceph-fuse and it's 
now working fine. Thanks much.


-Michael
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-deploy hang on CentOS 6.4

2013-10-24 Thread Gruher, Joseph R


>-Original Message-
>From: Alfredo Deza [mailto:alfredo.d...@inktank.com]
>Sent: Thursday, October 24, 2013 5:24 AM
>To: Gruher, Joseph R
>Cc: ceph-users@lists.ceph.com
>Subject: Re: [ceph-users] ceph-deploy hang on CentOS 6.4
>
>On Wed, Oct 23, 2013 at 12:43 PM, Gruher, Joseph R
> wrote:
>>
>>
>>>-Original Message-
>>>From: Alfredo Deza [mailto:alfredo.d...@inktank.com]
>>>
>>>Did you tried working with the `--no-adjust-repos` flag in ceph-deploy
>>>? It will allow you to tell ceph-deploy to just go and install ceph
>>>without attempting to import keys or doing anything with your repos.
>>
>> I have tried this in the past but it caused problems further down the install
>process, I believe due to old or mismatched versions being installed.  That was
>on Ubuntu 12.04.2.  It was discussed a bit on this list at the time.  I would 
>not
>recommend --no-adjust-repos based on my experience.
>
>It would be very useful to have logs or a reproducible scenario so we can
>improve this. Mismatched versions doesn't sound like something ceph-deploy
>would do specifically, but rather a problem with installing and removing
>packages and having issues there.

I've attached the previous thread where I ran into the issue with 
--no-adjust-repos.  Not sure if attachments are allowed by the list, if it 
doesn't get through let me know and I can forward it to you.

To summarize, it would let me work around proxy requirements during 
"ceph-deploy install", but then I would run into an issue during when 
attempting "ceph-deploy mon create".  After resolving the proxy issue and 
removing --no-adjust-repos the issue during "ceph-deploy mon create" 
disappeared.  The assumption is that they are related but that would require 
more detailed investigation to prove.  See attachment for details.

To reproduce (at the time) I was just running Ubuntu 12.04.2 with 3.6.10 
kernel, then installing ceph-deploy and doing a ceph-deploy new, ceph-deploy 
install with --no-adjust-repos, and then ceph-deploy mon create where I would 
get hung up on what appears to be an invalid argument --cluster to ceph-mon.  
After removing --no-adjust-repos the same steps would complete successfully.  The 
problem may be that the --cluster flag wasn't supported by the version of ceph 
installed.

>>
>>>The documentation for this can be found here:
>>>https://github.com/ceph/ceph-deploy#proxy-or-firewall-installs
>>
>> This doc only mentions setting the wget proxy, I would suggest it be
>updated to include the curl and rpm proxies may need to be set as well.
>
>Yes, I will be updating those too, thanks for the examples!
>>
>> Thanks,
>> Joe
>>
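For anyone else working behind a proxy, the per-tool settings referred to
above look roughly like the following (the proxy host and port are of course
placeholders):

  # ~/.wgetrc
  use_proxy = on
  http_proxy = http://proxy.example.com:8080

  # ~/.curlrc
  proxy = "http://proxy.example.com:8080"

  # ~/.rpmmacros
  %_httpproxy proxy.example.com
  %_httpport 8080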
--- Begin Message ---


>-Original Message-
>From: Alfredo Deza [mailto:alfredo.d...@inktank.com]
>Sent: Monday, September 23, 2013 5:45 AM
>To: Gruher, Joseph R
>Cc: ceph-users@lists.ceph.com
>Subject: Re: [ceph-users] monitor deployment during quick start
>
>On Fri, Sep 20, 2013 at 3:58 PM, Gruher, Joseph R
> wrote:
>> Sorry, not trying to repost or bump my thread, but I think I can restate my
>question here and for better clarity.  I am confused about the "--cluster"
>argument used when "ceph-deploy mon create" invokes "ceph-mon" on the
>target system.  I always get a failure at this point when running "ceph-deploy
>mon create" and this then halts the whole ceph quick start process.
>>
>> Here is the line where "ceph-deploy mon create" fails:
>> [cephtest02][INFO  ] Running command: ceph-mon --cluster ceph --mkfs
>> -i cephtest02 --keyring /var/lib/ceph/tmp/ceph-cephtest02.mon.keyring
>>
>> Running the same command manually on the target system gives an error.
>As far as I can tell from the man page and the built-in help and the website
>(http://ceph.com/docs/next/man/8/ceph-mon/) it seems "--cluster" is not a
>valid argument for ceph-mon?  Is this a problem in ceph-deploy?  Does this
>work for anyone else?
>>
>> ceph@cephtest02:~$ sudo ceph-mon --cluster ceph --mkfs -i cephtest02
>> --keyring /var/lib/ceph/tmp/ceph-cephtest02.mon.keyring
>> too many arguments: [--cluster,ceph]
>> usage: ceph-mon -i monid [--mon-data=pathtodata] [flags]
>>   --debug_mon n
>> debug monitor level (e.g. 10)
>>   --mkfs
>> build fresh monitor fs
>> --conf/-cRead configuration from the given configuration file
>> -d   Run in foreground, log to stderr.
>> -f   Run in foreground, log to usual location.
>> --id/-i  set ID portion of my name
>> --name/-nset name (TYPE.ID)
>> --versionshow version and quit
>>
>>--debug_ms N
>> set message debug level (e.g. 1) ceph@cephtest02:~$
>>
>> Can anyone clarify if "--cluster" is a supported argument for ceph-mon?
>
>This is a *weird* corner you've stumbled upon. The flag is indeed used by
>ceph-deploy and that hasn't changed in a while. However, as you point out,
>there is no trace of that flag anywhere! I can't find where that is defined at 
>all.
>
>Running the latest version of ceph-deploy + ceph, 

Re: [ceph-users] Hardware: SFP+ or 10GBase-T

2013-10-24 Thread james

On 2013-10-24 15:08, Nathan Stratton wrote:


9 - Samsung 840 EVO 120 GB SSD (1 root 8 ceph)



The EVO is a TLC drive with durability of about 1,100 write cycles.  
Whether that is or isn't a problem in your environment of course is a 
separate question - I'm just pointing it out :)  If they are anything 
like their predecessor, go with about 20% unallocated (or run fstrim 
occasionally).  The part-SLC implementation is clever on these (IMHO).
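If you go the fstrim route rather than leaving space unpartitioned, an
occasional manual or cron'd trim of the OSD filesystems is enough, assuming
the filesystem and controller pass discard through (the mount point below is
just an example):

  $ sudo fstrim -v /var/lib/ceph/osd/ceph-0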

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rbd client module in centos 6.4

2013-10-24 Thread Josh Logan(News)




On 10/23/2013 1:14 PM, Gruher, Joseph R wrote:

Hi all,

I have CentOS 6.4 with 3.11.6 kernel running (built from latest stable
on kernel.org) and I cannot load the rbd client module.  Should I have
to do anything to enable/install it?  Shouldn’t it be present in this
kernel?

[ceph@joceph05 /]$ cat /etc/centos-release

CentOS release 6.4 (Final)

[ceph@joceph05 /]$ uname -a

Linux joceph05.jf.intel.com 3.11.6 #1 SMP Mon Oct 21 17:23:07 PDT 2013
x86_64 x86_64 x86_64 GNU/Linux

[ceph@joceph05 /]$ modprobe rbd

FATAL: Module rbd not found.

[ceph@joceph05 /]$

Thanks,

Joe



Please take a look at Elrepo.org
http://elrepo.org/tiki/tiki-index.php

These kernels work well for us on Centos 6.4

Josh
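A rough sketch of how that goes once the elrepo repository itself has been
set up per their instructions:

  $ sudo yum --enablerepo=elrepo-kernel install kernel-ml
  # reboot into the new kernel, then:
  $ sudo modprobe rbd
  $ lsmod | grep rbd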



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Default PGs

2013-10-24 Thread Gruher, Joseph R

>-Original Message-
>From: Tyler Brekke [mailto:tyler.bre...@inktank.com]
>Sent: Thursday, October 24, 2013 4:36 AM
>To: Gruher, Joseph R
>Cc: ceph-users@lists.ceph.com
>Subject: Re: [ceph-users] Default PGs
>
>You have to do this before creating your first monitor as the default pools are
>created by the monitor.
>
>Now any pools you create should have the correct number of placement
>groups though.
>
>You can also increase your pg and pgp num with,
>
>ceph osd pool set  pg_num  ceph osd pool set  
>pgp_num 

What I did was "ceph-deploy new " to create the default ceph.conf, then 
I added these lines in [global]:
osd_pool_default_pgp_num = 800
osd_pool_default_pg_num = 800

Then I created the monitors and OSDs via ceph-deploy.  I still had 64 PGs in 
all the default pools.  Is that expected?  Do I need to set up ceph.conf before 
running "new"?  If I do it seems to overwrite my ceph.conf with the default 
ceph.conf.

For pools created later, the pool create command requires you specify the 
number of PGs, so I can't really judge if the default PG value is working.  
Trying to run "ceph osd pool create {pool-name} {pg-num}" without supplying a 
value for pg-num fails.
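For reference, creating a pool with an explicit count looks like this (the
pool name and counts are just examples):

  $ ceph osd pool create mypool 800 800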

I was able to adjust the PGs in the default pools after they were created - 
that works as described.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] disk activation problem

2013-10-24 Thread Nabil Naim
Hi Sir,

I am trying to implement Ceph following 
http://ceph.com/docs/master/start/quick-ceph-deploy/
All my servers are VMware instances; all steps work fine until the 
prepare/create OSD step. I try
ceph-deploy osd prepare ceph-node2:/tmp/osd0 ceph-node3:/tmp/osd1
and I also try to use an extra HD

ceph-deploy osd create ceph-node2:/dev/sdb1 ceph-node3:/dev/sdb1
Each time, at

ceph-deploy osd activate
I got the same error
[root@ceph-deploy my-cluster]# ceph-deploy -v osd activate 
ceph-server02:/dev/sdb1

it gives

[ceph_deploy.cli][INFO  ] Invoked (1.2.7): /usr/bin/ceph-deploy -v osd activate 
ceph-server02:/dev/sdb
[ceph_deploy.osd][DEBUG ] Activating cluster ceph disks ceph-server02:/dev/sdb:
[ceph_deploy.sudo_pushy][DEBUG ] will use a remote connection without sudo
[ceph_deploy.osd][INFO  ] Distro info: CentOS 6.2 Final
[ceph_deploy.osd][DEBUG ] activating host ceph-server02 disk /dev/sdb
[ceph_deploy.osd][DEBUG ] will use init type: sysvinit
[ceph-server02][INFO  ] Running command: ceph-disk-activate --mark-init 
sysvinit --mount /dev/sdb
[root@ceph-deploy my-cluster]# ceph-deploy -v osd activate 
ceph-server02:/dev/sdb1
[ceph_deploy.cli][INFO  ] Invoked (1.2.7): /usr/bin/ceph-deploy -v osd activate 
ceph-server02:/dev/sdb1
[ceph_deploy.osd][DEBUG ] Activating cluster ceph disks ceph-server02:/dev/sdb1:
[ceph_deploy.sudo_pushy][DEBUG ] will use a remote connection without sudo
[ceph_deploy.osd][INFO  ] Distro info: CentOS 6.2 Final
[ceph_deploy.osd][DEBUG ] activating host ceph-server02 disk /dev/sdb1
[ceph_deploy.osd][DEBUG ] will use init type: sysvinit
[ceph-server02][INFO  ] Running command: ceph-disk-activate --mark-init 
sysvinit --mount /dev/sdb1


And it suspends for a while, then

[ceph-server02][ERROR ] 2013-10-24 18:36:56.049060 7f35c4986700  0 -- :/1006405 
>> x.x.x.x:6789/0 pipe(0x7f35c0020430 sd=9 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f35c0020690).fault
[ceph-server02][ERROR ] 2013-10-24 18:36:59.047638 7f35c4885700  0 -- :/1006405 
>> x.x.x.x:6789/0 pipe(0x7f35b4000c00 sd=9 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f35b4000e60).fault
[ceph-server02][ERROR ] 2013-10-24 18:37:02.049738 7f35c4986700  0 -- :/1006405 
>> x.x.x.x:6789/0 pipe(0x7f35b4003010 sd=9 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f35b4003270).fault
[ceph-server02][ERROR ] 2013-10-24 18:37:05.049212 7f35c4885700  0 -- :/1006405 
>> x.x.x.x:6789/0 pipe(0x7f35b4003850 sd=11 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f35b4003ab0).fault
[ceph-server02][ERROR ] 2013-10-24 18:37:08.049732 7f35c4986700  0 -- :/1006405 
>> x.x.x.x:6789/0 pipe(0x7f35b40025d0 sd=11 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f35b4002830).fault
[ceph-server02][ERROR ] 2013-10-24 18:37:11.050150 7f35c4885700  0 -- :/1006405 
>> x.x.x.x:6789/0 pipe(0x7f35b4002cf0 sd=11 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f35b4002f50).fault
[ceph-server02][ERROR ] 2013-10-24 18:37:14.050596 7f35c4986700  0 -- :/1006405 
>> x.x.x.x:6789/0 pipe(0x7f35b4004110 sd=11 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f35b4004370).fault
[ceph-server02][ERROR ] 2013-10-24 18:37:17.050835 7f35c4885700  0 -- :/1006405 
>> x.x.x.x:6789/0 pipe(0x7f35b4004900 sd=11 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f35b4004b60).fault
[ceph-server02][ERROR ] 2013-10-24 18:37:20.051166 7f35c4986700  0 -- :/1006405 
>> x.x.x.x:6789/0 pipe(0x7f35b4005240 sd=11 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f35b40054a0).fault
[ceph-server02][ERROR ] 2013-10-24 18:37:23.051520 7f35c4885700  0 -- :/1006405 
>> x.x.x.x:6789/0 pipe(0x7f35b4005960 sd=11 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f35b4005bc0).fault
[ceph-server02][ERROR ] 2013-10-24 18:37:26.051803 7f35c4986700  0 -- :/1006405 
>> x.x.x.x:6789/0 pipe(0x7f35b40093b0 sd=11 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f35b4009610).fault
[ceph-server02][ERROR ] 2013-10-24 18:37:29.052464 7f35c4885700  0 -- :/1006405 
>> x.x.x.x:6789/0 pipe(0x7f35b4009a60 sd=11 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f35b4009cc0).fault
[ceph-server02][ERROR ] 2013-10-24 18:37:32.052918 7f35c4986700  0 -- :/1006405 
>> x.x.x.x:6789/0 pipe(0x7f35b400a320 sd=11 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f35b400a580).fault
[ceph-server02][ERROR ] 2013-10-24 18:37:35.053331 7f35c4885700  0 -- :/1006405 
>> x.x.x.x:6789/0 pipe(0x7f35b400ab60 sd=11 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f35b400adc0).fault
[ceph-server02][ERROR ] 2013-10-24 18:37:38.053733 7f35c4986700  0 -- :/1006405 
>> x.x.x.x:6789/0 pipe(0x7f35b4007350 sd=11 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f35b40075b0).fault
[ceph-server02][ERROR ] 2013-10-24 18:37:41.054145 7f35c4885700  0 -- :/1006405 
>> x.x.x.x:6789/0 pipe(0x7f35b400d230 sd=11 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f35b400d490).fault
[ceph-server02][ERROR ] 2013-10-24 18:37:44.054592 7f35c4986700  0 -- :/1006405 
>> x.x.x.x:6789/0 pipe(0x7f35b400dbc0 sd=11 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f35b400de20).fault
[ceph-server02][ERROR ] 2013-10-24 18:37:47.055107 7f35c4885700  0 -- :/1006405 
>> x.x.x.x:6789/0 pipe(0x7f35b4006440 sd=11 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f35b40066a0).fault
[ceph-server02][ERROR ] 2013-10-24 18:37:50.055587 7f35c4986700  0 -- :/1006405 
>> x.x.x.x:6789/0 pipe(0x7f35b4006c30 sd=11 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f35b4006e90).fault
[ceph-server02][ERROR ]

Re: [ceph-users] PG repair failing when object missing

2013-10-24 Thread Greg Farnum
I was also able to reproduce this, guys, but I believe it’s specific to the 
mode of testing rather than to anything being wrong with the OSD. In 
particular, after restarting the OSD whose file I removed and running repair, 
it did so successfully.
The OSD has an “fd cacher” which caches open file handles, and we believe this 
is what causes the observed behavior: if the removed object is among the most 
recent  objects touched, the FileStore (an OSD subsystem) has an open fd 
cached, so when manually deleting the file the FileStore now has a deleted file 
open. When the repair happens, it finds that open file descriptor and applies 
the repair to it — which of course doesn’t help put it back into place!
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
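In other words, the workaround for this style of testing is to bounce the OSD
first so the stale file descriptor is dropped, and then repair. A sketch using
the pg and osd ids from the report below (the restart command depends on your
init system):

  $ sudo service ceph restart osd.2
  $ ceph pg repair 0.b
  $ ceph pg scrub 0.b      # re-scrub to confirm it comes back clean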

On October 24, 2013 at 2:52:54 AM, Matt Thompson (watering...@gmail.com) wrote:
>
>Hi Harry,
>
>I was able to replicate this.
>
>What does appear to work (for me) is to do an osd scrub followed by a pg
>repair. I've tried this 2x now and in each case the deleted file gets
>copied over to the OSD from where it was removed. However, I've tried a
>few pg scrub / pg repairs after manually deleting a file and have yet to
>see the file get copied back to the OSD on which it was deleted. Like you
>said, the pg repair sets the health of the PG back to active+clean, but
>then re-running the pg scrub detects the file as missing again and sets it
>back to active+clean+inconsistent.
>
>Regards,
>Matt
>
>
>On Wed, Oct 23, 2013 at 3:45 PM, Harry Harrington wrote:
>
>> Hi,
>>
>> I've been taking a look at the repair functionality in ceph. As I
>> understand it the osds should try to copy an object from another member of
>> the pg if it is missing. I have been attempting to test this by manually
>> removing a file from one of the osds however each time the repair
>> completes the file has not been restored. If I run another scrub on the
>> pg it gets flagged as inconsistent. See below for the output from my
>> testing. I assume I'm missing something obvious, any insight into this
>> process would be greatly appreciated.
>>
>> Thanks,
>> Harry
>>
>> # ceph --version
>> ceph version 0.67.4 (ad85b8bfafea6232d64cb7ba76a8b6e8252fa0c7)
>> # ceph status
>> cluster a4e417fe-0386-46a5-4475-ca7e10294273
>> health HEALTH_OK
>> monmap e1: 1 mons at {ceph1=1.2.3.4:6789/0}, election epoch 2, quorum
>> 0 ceph1
>> osdmap e13: 3 osds: 3 up, 3 in
>> pgmap v232: 192 pgs: 192 active+clean; 44 bytes data, 15465 MB used,
>> 164 GB / 179 GB avail
>> mdsmap e1: 0/0/1 up
>>
>> file removed from osd.2
>>
>> # ceph pg scrub 0.b
>> instructing pg 0.b on osd.1 to scrub
>>
>> # ceph status
>> cluster a4e417fe-0386-46a5-4475-ca7e10294273
>> health HEALTH_ERR 1 pgs inconsistent; 1 scrub errors
>> monmap e1: 1 mons at {ceph1=1.2.3.4:6789/0}, election epoch 2, quorum
>> 0 ceph1
>> osdmap e13: 3 osds: 3 up, 3 in
>> pgmap v233: 192 pgs: 191 active+clean, 1 active+clean+inconsistent; 44
>> bytes data, 15465 MB used, 164 GB / 179 GB avail
>> mdsmap e1: 0/0/1 up
>>
>> # ceph pg repair 0.b
>> instructing pg 0.b on osd.1 to repair
>>
>> # ceph status
>> cluster a4e417fe-0386-46a5-4475-ca7e10294273
>> health HEALTH_OK
>> monmap e1: 1 mons at {ceph1=1.2.3.4:6789/0}, election epoch 2, quorum
>> 0 ceph1
>> osdmap e13: 3 osds: 3 up, 3 in
>> pgmap v234: 192 pgs: 192 active+clean; 44 bytes data, 15465 MB used,
>> 164 GB / 179 GB avail
>> mdsmap e1: 0/0/1 up
>>
>> # ceph pg scrub 0.b
>> instructing pg 0.b on osd.1 to scrub
>>
>> # ceph status
>> cluster a4e417fe-0386-46a5-4475-ca7e10294273
>> health HEALTH_ERR 1 pgs inconsistent; 1 scrub errors
>> monmap e1: 1 mons at {ceph1=1.2.3.4:6789/0}, election epoch 2, quorum
>> 0 ceph1
>> osdmap e13: 3 osds: 3 up, 3 in
>> pgmap v236: 192 pgs: 191 active+clean, 1 active+clean+inconsistent; 44
>> bytes data, 15465 MB used, 164 GB / 179 GB avail
>> mdsmap e1: 0/0/1 up
>>
>>
>>
>> The logs from osd.1:
>> 2013-10-23 14:12:31.188281 7f02a5161700 0 log [ERR] : 0.b osd.2 missing
>> 3a643fcb/testfile1/head//0
>> 2013-10-23 14:12:31.188312 7f02a5161700 0 log [ERR] : 0.b scrub 1
>> missing, 0 inconsistent objects
>> 2013-10-23 14:12:31.188319 7f02a5161700 0 log [ERR] : 0.b scrub 1 errors
>> 2013-10-23 14:13:03.197802 7f02a5161700 0 log [ERR] : 0.b osd.2 missing
>> 3a643fcb/testfile1/head//0
>> 2013-10-23 14:13:03.197837 7f02a5161700 0 log [ERR] : 0.b repair 1
>> missing, 0 inconsistent objects
>> 2013-10-23 14:13:03.197850 7f02a5161700 0 log [ERR] : 0.b repair 1
>> errors, 1 fixed
>> 2013-10-23 14:14:47.232953 7f02a5161700 0 log [ERR] : 0.b osd.2 missing
>> 3a643fcb/testfile1/head//0
>> 2013-10-23 14:14:47.232985 7f02a5161700 0 log [ERR] : 0.b scrub 1
>> missing, 0 inconsistent objects
>> 2013-10-23 14:14:47.232991 7f02a5161700 0 log [ERR] : 0.b scrub 1 errors
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>__

Re: [ceph-users] PG repair failing when object missing

2013-10-24 Thread Inktank
I also created a ticket to try and handle this particular instance of bad 
behavior:
http://tracker.ceph.com/issues/6629
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

On October 24, 2013 at 1:22:54 PM, Greg Farnum (gregory.far...@inktank.com) 
wrote:
>
>I was also able to reproduce this, guys, but I believe it’s specific to the 
>mode of testing rather than to anything being wrong with the OSD. In 
>particular, after restarting the OSD whose file I removed and running repair, 
>it did so successfully.
>The OSD has an “fd cacher” which caches open file handles, and we believe this 
>is what causes the observed behavior: if the removed object is among the most 
>recent objects touched, the FileStore (an OSD subsystem) has an open fd 
>cached, so when manually deleting the file the FileStore now has a deleted 
>file open. When the repair happens, it finds that open file descriptor and 
>applies the repair to it — which of course doesn’t help put it back into place!
>-Greg
>Software Engineer #42 @ http://inktank.com | http://ceph.com
>
>On October 24, 2013 at 2:52:54 AM, Matt Thompson (watering...@gmail.com) wrote:
>>
>>Hi Harry,
>>
>>I was able to replicate this.
>>
>>What does appear to work (for me) is to do an osd scrub followed by a pg
>>repair. I've tried this 2x now and in each case the deleted file gets
>>copied over to the OSD from where it was removed. However, I've tried a
>>few pg scrub / pg repairs after manually deleting a file and have yet to
>>see the file get copied back to the OSD on which it was deleted. Like you
>>said, the pg repair sets the health of the PG back to active+clean, but
>>then re-running the pg scrub detects the file as missing again and sets it
>>back to active+clean+inconsistent.
>>
>>Regards,
>>Matt
>>
>>
>>On Wed, Oct 23, 2013 at 3:45 PM, Harry Harrington wrote:
>>
>>> Hi,
>>>
>>> I've been taking a look at the repair functionality in ceph. As I
>>> understand it the osds should try to copy an object from another member of
>>> the pg if it is missing. I have been attempting to test this by manually
>>> removing a file from one of the osds however each time the repair
>>> completes the file has not been restored. If I run another scrub on the
>>> pg it gets flagged as inconsistent. See below for the output from my
>>> testing. I assume I'm missing something obvious, any insight into this
>>> process would be greatly appreciated.
>>>
>>> Thanks,
>>> Harry
>>>
>>> # ceph --version
>>> ceph version 0.67.4 (ad85b8bfafea6232d64cb7ba76a8b6e8252fa0c7)
>>> # ceph status
>>> cluster a4e417fe-0386-46a5-4475-ca7e10294273
>>> health HEALTH_OK
>>> monmap e1: 1 mons at {ceph1=1.2.3.4:6789/0}, election epoch 2, quorum
>>> 0 ceph1
>>> osdmap e13: 3 osds: 3 up, 3 in
>>> pgmap v232: 192 pgs: 192 active+clean; 44 bytes data, 15465 MB used,
>>> 164 GB / 179 GB avail
>>> mdsmap e1: 0/0/1 up
>>>
>>> file removed from osd.2
>>>
>>> # ceph pg scrub 0.b
>>> instructing pg 0.b on osd.1 to scrub
>>>
>>> # ceph status
>>> cluster a4e417fe-0386-46a5-4475-ca7e10294273
>>> health HEALTH_ERR 1 pgs inconsistent; 1 scrub errors
>>> monmap e1: 1 mons at {ceph1=1.2.3.4:6789/0}, election epoch 2, quorum
>>> 0 ceph1
>>> osdmap e13: 3 osds: 3 up, 3 in
>>> pgmap v233: 192 pgs: 191 active+clean, 1 active+clean+inconsistent; 44
>>> bytes data, 15465 MB used, 164 GB / 179 GB avail
>>> mdsmap e1: 0/0/1 up
>>>
>>> # ceph pg repair 0.b
>>> instructing pg 0.b on osd.1 to repair
>>>
>>> # ceph status
>>> cluster a4e417fe-0386-46a5-4475-ca7e10294273
>>> health HEALTH_OK
>>> monmap e1: 1 mons at {ceph1=1.2.3.4:6789/0}, election epoch 2, quorum
>>> 0 ceph1
>>> osdmap e13: 3 osds: 3 up, 3 in
>>> pgmap v234: 192 pgs: 192 active+clean; 44 bytes data, 15465 MB used,
>>> 164 GB / 179 GB avail
>>> mdsmap e1: 0/0/1 up
>>>
>>> # ceph pg scrub 0.b
>>> instructing pg 0.b on osd.1 to scrub
>>>
>>> # ceph status
>>> cluster a4e417fe-0386-46a5-4475-ca7e10294273
>>> health HEALTH_ERR 1 pgs inconsistent; 1 scrub errors
>>> monmap e1: 1 mons at {ceph1=1.2.3.4:6789/0}, election epoch 2, quorum
>>> 0 ceph1
>>> osdmap e13: 3 osds: 3 up, 3 in
>>> pgmap v236: 192 pgs: 191 active+clean, 1 active+clean+inconsistent; 44
>>> bytes data, 15465 MB used, 164 GB / 179 GB avail
>>> mdsmap e1: 0/0/1 up
>>>
>>>
>>>
>>> The logs from osd.1:
>>> 2013-10-23 14:12:31.188281 7f02a5161700 0 log [ERR] : 0.b osd.2 missing
>>> 3a643fcb/testfile1/head//0
>>> 2013-10-23 14:12:31.188312 7f02a5161700 0 log [ERR] : 0.b scrub 1
>>> missing, 0 inconsistent objects
>>> 2013-10-23 14:12:31.188319 7f02a5161700 0 log [ERR] : 0.b scrub 1 errors
>>> 2013-10-23 14:13:03.197802 7f02a5161700 0 log [ERR] : 0.b osd.2 missing
>>> 3a643fcb/testfile1/head//0
>>> 2013-10-23 14:13:03.197837 7f02a5161700 0 log [ERR] : 0.b repair 1
>>> missing, 0 inconsistent objects
>>> 2013-10-23 14:13:03.197850 7f02a5161700 0 log [ERR] : 0.b repair 1
>>> errors, 1 fixed
>>> 2013-10-23 14:14:47.232953 7f02a5161700 0 log [ERR] : 0.b osd.2 missing
>>> 3a643fcb/testfile1/head//0

Re: [ceph-users] Non-Ceph cluster name

2013-10-24 Thread Sage Weil
On Thu, 24 Oct 2013, Gaylord Holder wrote:
> Works perfectly.
> 
> My only gripe is that --cluster isn't listed as a valid argument in
> 
>   ceph-mon --help

Ah, it is there for current versions, but not cuttlefish or dumpling.  
It'll be in the next point release (for each).

sage
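Concretely, for the mkfs step quoted further down, that would look something
like:

  $ ceph-mon --cluster csceph --mkfs -i a --keyring csceph.mon.keyring

so that ceph-mon derives the $cluster name, and hence the
/var/lib/ceph/mon/csceph-a data path, from the flag rather than from the
config file path.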

> 
> and the only reference searching for --cluster in the ceph documentation is in
> regards to ceph-rest-api.
> 
> Shall I file a bug to correct the documentation?
> 
> Thanks again for the quick and accurate response.
> 
> -Gaylord
> 
> On 10/24/2013 08:11 AM, Sage Weil wrote:
> > Try passing --cluster csceph instead of the config file path and I
> > suspect it will work.
> > 
> > sage
> > 
> > 
> > 
> > Gaylord Holder  wrote:
> > 
> > I'm trying to bring up a ceph cluster not named ceph.
> > 
> > I'm running version 0.61.
> > 
> >   From my reading of the documentation, the $cluster metavariable is set
> > by the basename of the configuration file: specifying the configuration
> > file "/etc/ceph/mycluster.conf" sets the $cluster metavariable to
> > "mycluster"
> > 
> > However, given a configuration file /etc/ceph/csceph.conf:
> > 
> > [global]
> >  fsid = 70d421fe-28ca-4804-bce8-d51a16b531ec
> >  mon host =192.168.124.202  
> >  mon_initial_members = a
> > 
> > [mon.a]
> > host = monnode
> > mon addr =192.168.124.202:6789
> > 
> > and running:
> > 
> > ceph-authtool csceph.mon.keyring --create-keyring --name=mon.
> > --gen-key --cap mon 'allow *'
> > 
> > ceph-mon -c /etc/ceph/csceph.conf --mkfs -i a --keyring
> > csceph.mon.keyring
> > 
> > ceph-mon tries to create monfs in
> > 
> > /var/lib/ceph/mon/ceph-a
> > 
> > not
> > 
> > /var/lib/ceph/mon/csceph-a
> > 
> > as expected.
> > 
> > 
> > Thank you for any help you can give.
> > 
> > Cheers,
> > -Gaylord
> > 
> > 
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Error on adding Monitors

2013-10-24 Thread LaSalle, Jurvis
I've rebuilt my nodes on raring and now I'm hitting the same issue trying
to add the 2nd and 3rd monitors as specified in the quickstart.  The
quickstart makes no mention of setting public_addr or public_network to
complete this step.  What's the deal?

JL


On 13/10/24 10:23 AM, "Joao Eduardo Luis"  wrote:

>On 10/24/2013 03:12 PM, David J F Carradice wrote:
>> Hi.
>>
>> I am getting an error on adding monitors to my cluster.
>> ceph@ceph-deploy:~/my-cluster$ ceph-deploy mon create ceph-osd01
>> [ceph_deploy.cli][INFO  ] Invoked (1.2.7): /usr/bin/ceph-deploy mon
>> create ceph-osd01
>> [ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts ceph-osd01
>> [ceph_deploy.mon][DEBUG ] detecting platform for host ceph-osd01 ...
>> [ceph_deploy.sudo_pushy][DEBUG ] will use a remote connection with sudo
>> [ceph_deploy.mon][INFO  ] distro info: Ubuntu 13.04 raring
>> [ceph-osd01][DEBUG ] determining if provided host has same hostname in
>> remote
>> [ceph-osd01][DEBUG ] deploying mon to ceph-osd01
>> [ceph-osd01][DEBUG ] remote hostname: ceph-osd01
>> [ceph-osd01][INFO  ] write cluster configuration to
>>/etc/ceph/{cluster}.conf
>> [ceph-osd01][DEBUG ] checking for done path:
>> /var/lib/ceph/mon/ceph-ceph-osd01/done
>> [ceph-osd01][INFO  ] create a done file to avoid re-doing the mon
>>deployment
>> [ceph-osd01][INFO  ] create the init path if it does not exist
>> [ceph-osd01][INFO  ] locating `service` executable...
>> [ceph-osd01][INFO  ] found `service` executable: /usr/sbin/service
>> [ceph-osd01][INFO  ] Running command: sudo initctl emit ceph-mon
>> cluster=ceph id=ceph-osd01
>> [ceph-osd01][INFO  ] Running command: sudo ceph --admin-daemon
>> /var/run/ceph/ceph-mon.ceph-osd01.asok mon_status
>> [ceph-osd01][ERROR ] admin_socket: exception getting command
>> descriptions: [Errno 2] No such file or directory
>> [ceph-osd01][WARNIN] monitor: mon.ceph-osd01, might not be running yet
>> [ceph-osd01][INFO  ] Running command: sudo ceph --admin-daemon
>> /var/run/ceph/ceph-mon.ceph-osd01.asok mon_status
>> [ceph-osd01][ERROR ] admin_socket: exception getting command
>> descriptions: [Errno 2] No such file or directory
>> [ceph-osd01][WARNIN] ceph-osd01 is not defined in `mon initial members`
>> [ceph-osd01][WARNIN] monitor ceph-osd01 does not exist in monmap
>> [ceph-osd01][WARNIN] neither `public_addr` nor `public_network` keys are
>> defined for monitors
>> [ceph-osd01][WARNIN] monitors may not be able to form quorum
>>
>> This is happening after a successful and first add of a monitor,
>> ceph-mon01. As per the ceph-deploy documentation, I added a single
>> monitor, then some disk daemons located on ceph-osd01~03, then went to
>> add more monitors, ceph-osd01 & 02 for a quorum. This is where I get the
>> issue.
>>
>> Is the issue related to the WARNING present regarding keys?
>
>That's a warning regarding config options (public_addr/public_network)
>and the lack of enough info to generate a monmap.
>
>>
>> It appears that when running the ceph-deploy mon create  from my
>> ceph-deploy server, it complains about there not being any
>> ceph-mon..asok (which I assume are address sockets). I looked in
>> the respective directories on the potential monitor nodes (which are
>> currently also the OSD nodes) and see that there is only an OSD.asok, no
>> MON.asok
>
>The monitor's .asok (admin socket) will only be created at start.  If
>the monitor hasn't been run yet, then there's no asok.
>
>>
>> I can send my ceph.conf and a brief overview if it helps.
>
>ceph.conf, especially the [global] and [mon]/[mon.foo] sections would be
>helpful.
>
>>
>> Now to add to the fun, the ceph health check passes, and I am able to place
>> an object. It is the fact that I cannot add more than the current single
>> mon node that I would like to resolve.
>>
>> Regards
>>
>> David
>>
>> PS Extract of ceph.log on my ceph-deploy server (separate from other
>> nodes… not a mon nor an osd)
>>
>> SUCCESSFUL INITIAL MON
>>
>>   764 2013-10-22 12:31:11,157 [ceph-mon01][INFO  ] Running command: sudo
>> ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-mon01.asok mon_status
>>
>>  765 2013-10-22 12:31:11,223 [ceph-mon01][DEBUG ]
>> ************************************************************
>>
>>  766 2013-10-22 12:31:11,223 [ceph-mon01][DEBUG ] status for
>> monitor: mon.ceph-mon01
>>
>>  767 2013-10-22 12:31:11,224 [ceph-mon01][DEBUG ] {
>>
>>  768 2013-10-22 12:31:11,224 [ceph-mon01][DEBUG ]
>> "election_epoch": 2,
>>
>>
>> FAILED ADD MON
>>
>> See above in blue.
>>
>>
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
>
>-- 
>Joao Eduardo Luis
>Software Engineer | http://inktank.com | http://ceph.com
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___

Re: [ceph-users] Error on adding Monitors

2013-10-24 Thread akatsuki
Hi,

I also found a similar problem when adding a new monitor to my cluster. Hope
this can help.

In my ceph monitor log, I found these lines:

2013-10-23 17:00:15.907105 7fc9edc9b780  0 ceph version 0.61.4
(1669132fcfc27d0c0b5e5bb93ade59d147e23404), process ceph-mon, pid 4312
2013-10-23 17:00:16.158621 7fc9edc9b780  0 mon.2 does not exist in monmap,
will attempt to join an existing cluster

My server also has two NICs, one used for public and one for private (cluster
only) traffic. Based on
http://tracker.ceph.com/issues/5195
http://cephnotes.ksperis.com/blog/2013/08/29/mon-failed-to-start

after doing all the steps in "adding monitor" from the ceph documentation,
I did the following:

Root# ceph mon dump
dumped monmap epoch 12
epoch 12
fsid b3ecd9c5-182b-4978-9272-d4b278454500
last_changed 2013-10-23 17:57:44.185915
created 2013-05-16 16:46:00.572157
0: 10.xxx.xxx.xx1:6789/0 mon.0
1: 10.xxx.xxx.xx2:6789/0 mon.1
2: 10.xxx.xxx.xx3:6789/0 mon.2

Root# ceph mon getmap -o /tmp/monmap
got latest monmap

Root#  ceph-mon -i 2 --inject-monmap /tmp/monmap

Root# /etc/init.d/ceph start mon.2
=== mon.2 ===
Starting Ceph mon.2 on ubuntuGPT3...


Regards,
Rzk


On Fri, Oct 25, 2013 at 4:46 AM, LaSalle, Jurvis <
jurvis.lasa...@childrens.harvard.edu> wrote:

> I've rebuilt my nodes on raring and now I'm hitting the same issue trying
> to add the 2nd and 3rd monitors as specified in the quickstart.  The
> quickstart makes no mention of setting public_addr or public_network to
> complete this step.  What's the deal?
>
> JL
>
>
> On 13/10/24 10:23 AM, "Joao Eduardo Luis"  wrote:
>
> >On 10/24/2013 03:12 PM, David J F Carradice wrote:
> >> Hi.
> >>
> >> I am getting an error on adding monitors to my cluster.
> >> ceph@ceph-deploy:~/my-cluster$ ceph-deploy mon create ceph-osd01
> >> [ceph_deploy.cli][INFO  ] Invoked (1.2.7): /usr/bin/ceph-deploy mon
> >> create ceph-osd01
> >> [ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts ceph-osd01
> >> [ceph_deploy.mon][DEBUG ] detecting platform for host ceph-osd01 ...
> >> [ceph_deploy.sudo_pushy][DEBUG ] will use a remote connection with sudo
> >> [ceph_deploy.mon][INFO  ] distro info: Ubuntu 13.04 raring
> >> [ceph-osd01][DEBUG ] determining if provided host has same hostname in
> >> remote
> >> [ceph-osd01][DEBUG ] deploying mon to ceph-osd01
> >> [ceph-osd01][DEBUG ] remote hostname: ceph-osd01
> >> [ceph-osd01][INFO  ] write cluster configuration to
> >>/etc/ceph/{cluster}.conf
> >> [ceph-osd01][DEBUG ] checking for done path:
> >> /var/lib/ceph/mon/ceph-ceph-osd01/done
> >> [ceph-osd01][INFO  ] create a done file to avoid re-doing the mon
> >>deployment
> >> [ceph-osd01][INFO  ] create the init path if it does not exist
> >> [ceph-osd01][INFO  ] locating `service` executable...
> >> [ceph-osd01][INFO  ] found `service` executable: /usr/sbin/service
> >> [ceph-osd01][INFO  ] Running command: sudo initctl emit ceph-mon
> >> cluster=ceph id=ceph-osd01
> >> [ceph-osd01][INFO  ] Running command: sudo ceph --admin-daemon
> >> /var/run/ceph/ceph-mon.ceph-osd01.asok mon_status
> >> [ceph-osd01][ERROR ] admin_socket: exception getting command
> >> descriptions: [Errno 2] No such file or directory
> >> [ceph-osd01][WARNIN] monitor: mon.ceph-osd01, might not be running yet
> >> [ceph-osd01][INFO  ] Running command: sudo ceph --admin-daemon
> >> /var/run/ceph/ceph-mon.ceph-osd01.asok mon_status
> >> [ceph-osd01][ERROR ] admin_socket: exception getting command
> >> descriptions: [Errno 2] No such file or directory
> >> [ceph-osd01][WARNIN] ceph-osd01 is not defined in `mon initial members`
> >> [ceph-osd01][WARNIN] monitor ceph-osd01 does not exist in monmap
> >> [ceph-osd01][WARNIN] neither `public_addr` nor `public_network` keys are
> >> defined for monitors
> >> [ceph-osd01][WARNIN] monitors may not be able to form quorum
> >>
> >> This is happening after a successful and first add of a monitor,
> >> ceph-mon01. As per the ceph-deploy documentation, I added a single
> >> monitor, then some disk daemons located on ceph-osd01~03, then went to
> >> add more monitors, ceph-osd01 & 02 for a quorum. This is where I get the
> >> issue.
> >>
> >> Is the issue related to the WARNING present regarding keys?
> >
> >That's a warning regarding config options (public_addr/public_network)
> >and the lack of enough info to generate a monmap.
> >
> >>
> >> It appears that when running the ceph-deploy mon create  from my
> >> ceph-deploy server, it complains about there not being any
> >> ceph-mon..asok (which I assume are address sockets). I looked in
> >> the respective directories on the potential monitor nodes (which are
> >> currently also the OSD nodes) and see that there is only an OSD.asok, no
> >> MON.asok
> >
> >The monitor's .asok (admin socket) will only be created at start.  If
> >the monitor hasn't been run yet, then there's no asok.
> >
> >>
> >> I can send my ceph.conf and a brief overview if it helps.
> >
> >ceph.conf, especially the [global] and [mon]/[mon.foo] sections would be
> >helpful.
> >
> >>
> >