Re: [ceph-users] No monitor sockets after upgrading to Emperor

2013-11-12 Thread Joao Luis
On Nov 12, 2013 2:38 AM, Berant Lemmenes ber...@lemmenes.com wrote:

 I noticed the same behavior on my dumpling cluster. They wouldn't show up
after boot, but after a service restart they were there.

 I haven't tested a node reboot since I upgraded to emperor today. I'll
give it a shot tomorrow.

 Thanks,
 Berant

 On Nov 11, 2013 9:29 PM, Peter Matulis peter.matu...@canonical.com
wrote:

 After upgrading from Dumpling to Emperor on Ubuntu 12.04 I noticed the
 admin sockets for each of my monitors were missing although the cluster
 seemed to continue running fine.  There wasn't anything under
 /var/run/ceph.  After restarting the service on each monitor node they
 reappeared.  Anyone?

 ~pmatulis


Odd behavior. The monitors do remove the admin socket on shutdown and
proceed to create it when they start, but as long as they are running it
should exist. Have you checked the logs for some error message that could
provide more insight on the cause?

  -Joao

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] how to use rados_exec

2013-11-12 Thread
 Hi all!
   Long time no see!
   I want to use the function rados_exec, and I found the class
cls_crypto.cc in the source code of ceph, so I ran the function like this:

   rados_exec(ioctx, "foo_object", "crypto", "md5", buf, sizeof(buf), buf2,
sizeof(buf2))

and the function returned "operation not supported"!

 I checked the source of ceph and found that cls_crypto.cc is not built. How
can I build the class and run it?
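
(For reference, the call shape in the librados C API is roughly the following; this is
only a sketch, and the "crypto"/"md5" names from the question will only resolve once
that class has actually been built and loaded on the OSDs:)

#include <stdio.h>
#include <string.h>
#include <rados/librados.h>

/* a sketch: assumes the cluster connection and ioctx setup were done elsewhere */
void call_md5(rados_ioctx_t ioctx)
{
    char in[64] = "", out[64];
    int ret = rados_exec(ioctx, "foo_object", "crypto", "md5",
                         in, sizeof(in), out, sizeof(out));
    if (ret < 0)
        fprintf(stderr, "rados_exec: %s\n", strerror(-ret)); /* -EOPNOTSUPP if the class is not loaded */
}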
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph-deploy: osd creating hung with one ssd disk as shared journal

2013-11-12 Thread Tim Zhang
Hi guys,
I use ceph-deploy to manage my cluster, but OSD creation fails; the process
seems to hang while creating the first OSD. By the way, SELinux is disabled,
and my ceph-disk is patched according to this page:
http://www.spinics.net/lists/ceph-users/msg03258.html
Can you guys give me some advice?
(1) the output of ceph-deploy is:
Invoked (1.3.1): /usr/bin/ceph-deploy osd create ceph0:sdb:sda
ceph0:sdd:sda ceph0:sde:sda ceph0:sdf:sda ceph0:sdg:sda ceph0:sdh:sda
ceph1:sdb:sda ceph1:sdd:sda ceph1:sde:sda ceph1:sdf:sda ceph1:sdg:sda
ceph1:sdh:sda ceph2:sdb:sda ceph2:sdd:sda ceph2:sde:sda ceph2:sdf:sda
ceph2:sdg:sda ceph2:sdh:sda
[ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks
ceph0:/dev/sdb:/dev/sda ceph0:/dev/sdd:/dev/sda ceph0:/dev/sde:/dev/sda
ceph0:/dev/sdf:/dev/sda ceph0:/dev/sdg:/dev/sda ceph0:/dev/sdh:/dev/sda
ceph1:/dev/sdb:/dev/sda ceph1:/dev/sdd:/dev/sda ceph1:/dev/sde:/dev/sda
ceph1:/dev/sdf:/dev/sda ceph1:/dev/sdg:/dev/sda ceph1:/dev/sdh:/dev/sda
ceph2:/dev/sdb:/dev/sda ceph2:/dev/sdd:/dev/sda ceph2:/dev/sde:/dev/sda
ceph2:/dev/sdf:/dev/sda ceph2:/dev/sdg:/dev/sda ceph2:/dev/sdh:/dev/sda
[ceph0][DEBUG ] connected to host: ceph0
[ceph0][DEBUG ] detect platform information from remote host
[ceph0][DEBUG ] detect machine type
[ceph_deploy.osd][INFO  ] Distro info: CentOS 6.4 Final
[ceph_deploy.osd][DEBUG ] Deploying osd to ceph0
[ceph0][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph0][INFO  ] Running command: udevadm trigger --subsystem-match=block
--action=add
[ceph_deploy.osd][DEBUG ] Preparing host ceph0 disk /dev/sdb journal
/dev/sda activate True
[ceph0][INFO  ] Running command: ceph-disk-prepare --fs-type xfs --cluster
ceph -- /dev/sdb /dev/sda
[ceph0][ERROR ] WARNING:ceph-disk:OSD will not be hot-swappable if journal
is not the same device as the osd data
[ceph0][ERROR ] Warning: WARNING: the kernel failed to re-read the
partition table on /dev/sda (Device or resource busy).  As a result, it may
not reflect all of your changes until after reboot.
[ceph0][ERROR ] BLKPG: Device or resource busy
[ceph0][ERROR ] error adding partition 1
[ceph0][DEBUG ] The operation has completed successfully.
[ceph0][DEBUG ] The operation has completed successfully.
[ceph0][DEBUG ] meta-data=/dev/sdb1  isize=2048   agcount=4,
agsize=61047597 blks
[ceph0][DEBUG ]  =   sectsz=512   attr=2,
projid32bit=0
[ceph0][DEBUG ] data =   bsize=4096
blocks=244190385, imaxpct=25
[ceph0][DEBUG ]  =   sunit=0  swidth=0 blks
[ceph0][DEBUG ] naming   =version 2  bsize=4096   ascii-ci=0
[ceph0][DEBUG ] log  =internal log   bsize=4096
blocks=119233, version=2
[ceph0][DEBUG ]  =   sectsz=512   sunit=0 blks,
lazy-count=1
[ceph0][DEBUG ] realtime =none   extsz=4096   blocks=0,
rtextents=0
[ceph0][DEBUG ] The operation has completed successfully.
[ceph0][INFO  ] Running command: udevadm trigger --subsystem-match=block
--action=add
[ceph_deploy.osd][DEBUG ] Host ceph0 is now ready for osd use.
[ceph0][DEBUG ] connected to host: ceph0
[ceph0][DEBUG ] detect platform information from remote host
[ceph0][DEBUG ] detect machine type
[ceph_deploy.osd][INFO  ] Distro info: CentOS 6.4 Final
[ceph_deploy.osd][DEBUG ] Preparing host ceph0 disk /dev/sdd journal
/dev/sda activate True
[ceph0][INFO  ] Running command: ceph-disk-prepare --fs-type xfs --cluster
ceph -- /dev/sdd /dev/sda
[ceph0][ERROR ] WARNING:ceph-disk:OSD will not be hot-swappable if journal
is not the same device as the osd data

2 the mounted filesystems on that osd host show:
[root@host ~]# mount -l
/dev/sdc1 on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
/dev/sdb1 on /var/lib/ceph/tmp/mnt.6D02EM type xfs (rw,noatime)

3 my testbed information is:
os: CentOS 6.4 Final
ceph: dumpling 0.67.4
three hosts: ceph0 ceph1 ceph2
each host has 3 disks sharing one SSD as journal

4 my ceph config is as follows:
osd journal size = 9500
;osd mkfs type = xfs
;auth supported = none
auth_cluster_required = none
auth_service_required = none
auth_client_required = none
public_network = 172.18.11.0/24
cluster_network = 10.10.11.0/24
osd pool default size = 3
ms nocrc = true
osd op threads = 4
filestore op threads = 0
mon sync fs threshold = 0
osd pool default pg num = 100
osd pool default pgp num = 100

5 the output of running the command ps -ef|grep ceph on ceph0:
[root@ceph0 ~]# ps -ef|grep ceph
root 13922 1  0 05:59 ?00:00:00 /bin/sh
/usr/sbin/ceph-disk-udev 1 sdb1 sdb
root 14059 13922  0 05:59 ?00:00:00 python /usr/sbin/ceph-disk
-v activate /dev/sdb1
root 14090 1  0 05:59 ?00:00:00 /bin/sh
/usr/sbin/ceph-disk-udev 1 sda1 sda
root 14107 14090  0 05:59 ?00:00:00 python 

Re: [ceph-users] ceph-deploy: osd creating hung with one ssd disk as shared journal

2013-11-12 Thread Michael
Sorry, just spotted you're mounting on sdc. Can you chuck out a partx -v 
/dev/sda to see if there's anything odd about the data currently on there?


-Michael

On 12/11/2013 18:22, Michael wrote:
As long as there's room on the SSD for the partitioner, it'll just use 
the conf value for osd journal size to section it up as it adds OSDs 
(I generally use the ceph-deploy osd create srv:data:journal format, e.g. 
srv-12:/dev/sdb:/dev/sde, when adding disks).
Does it being /dev/sda mean you're putting your journals onto an SSD that 
is already partitioned and in use by the OS?


-Michael

On 12/11/2013 18:09, Gruher, Joseph R wrote:


I didn't think you could specify the journal in this manner (just 
pointing multiple OSDs on the same host all to journal /dev/sda).  
Don't you either need to partition the SSD and point each OSD to a 
separate partition, or format and mount the SSD so that each OSD uses 
a unique file on the mount?  I've always created a separate partition 
on the SSD for each journal.
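
(For illustration, a minimal sketch of that approach, assuming /dev/sda is dedicated 
to journals and the partition size is just a placeholder:)

sgdisk --new=0:0:+10G /dev/sda    # repeat once per OSD journal on the SSD
ceph-deploy osd create ceph0:sdb:/dev/sda1 ceph0:sdd:/dev/sda2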


Preparing cluster ceph disks ceph0:/dev/sdb:/dev/sda 
ceph0:/dev/sdd:/dev/sda ceph0:/dev/sde:/dev/sda 
ceph0:/dev/sdf:/dev/sda ceph0:/dev/sdg:/dev/sda ceph0:/dev/sdh:/dev/sda


*From:*ceph-users-boun...@lists.ceph.com 
[mailto:ceph-users-boun...@lists.ceph.com] *On Behalf Of *Tim Zhang

*Sent:* Tuesday, November 12, 2013 2:20 AM
*To:* ceph-users@lists.ceph.com
*Subject:* [ceph-users] ceph-deploy: osd creating hung with one ssd 
disk as shared journal


Hi guys,

I use ceph-deploy to manage my cluster, but I get failed while 
creating the OSD, the process seems to hang up at creating first osd. 
By the way, SELinux is disabled, and my ceph-disk is patched 
according to the 
page:http://www.spinics.net/lists/ceph-users/msg03258.html


can you guys give me some advise?

(1) the output of ceph-deploy is:

Invoked (1.3.1): /usr/bin/ceph-deploy osd create ceph0:sdb:sda 
ceph0:sdd:sda ceph0:sde:sda ceph0:sdf:sda ceph0:sdg:sda ceph0:sdh:sda 
ceph1:sdb:sda ceph1:sdd:sda ceph1:sde:sda ceph1:sdf:sda ceph1:sdg:sda 
ceph1:sdh:sda ceph2:sdb:sda ceph2:sdd:sda ceph2:sde:sda ceph2:sdf:sda 
ceph2:sdg:sda ceph2:sdh:sda


[ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks 
ceph0:/dev/sdb:/dev/sda ceph0:/dev/sdd:/dev/sda 
ceph0:/dev/sde:/dev/sda ceph0:/dev/sdf:/dev/sda 
ceph0:/dev/sdg:/dev/sda ceph0:/dev/sdh:/dev/sda 
ceph1:/dev/sdb:/dev/sda ceph1:/dev/sdd:/dev/sda 
ceph1:/dev/sde:/dev/sda ceph1:/dev/sdf:/dev/sda 
ceph1:/dev/sdg:/dev/sda ceph1:/dev/sdh:/dev/sda 
ceph2:/dev/sdb:/dev/sda ceph2:/dev/sdd:/dev/sda 
ceph2:/dev/sde:/dev/sda ceph2:/dev/sdf:/dev/sda 
ceph2:/dev/sdg:/dev/sda ceph2:/dev/sdh:/dev/sda


[ceph0][DEBUG ] connected to host: ceph0

[ceph0][DEBUG ] detect platform information from remote host

[ceph0][DEBUG ] detect machine type

[ceph_deploy.osd][INFO  ] Distro info: CentOS 6.4 Final

[ceph_deploy.osd][DEBUG ] Deploying osd to ceph0

[ceph0][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf

[ceph0][INFO  ] Running command: udevadm trigger 
--subsystem-match=block --action=add


[ceph_deploy.osd][DEBUG ] Preparing host ceph0 disk /dev/sdb journal 
/dev/sda activate True


[ceph0][INFO  ] Running command: ceph-disk-prepare --fs-type xfs 
--cluster ceph -- /dev/sdb /dev/sda


[ceph0][ERROR ] WARNING:ceph-disk:OSD will not be hot-swappable if 
journal is not the same device as the osd data


[ceph0][ERROR ] Warning: WARNING: the kernel failed to re-read the 
partition table on /dev/sda (Device or resource busy).  As a result, 
it may not reflect all of your changes until after reboot.


[ceph0][ERROR ] BLKPG: Device or resource busy

[ceph0][ERROR ] error adding partition 1

[ceph0][DEBUG ] The operation has completed successfully.

[ceph0][DEBUG ] The operation has completed successfully.

[ceph0][DEBUG ] meta-data=/dev/sdb1  isize=2048   
agcount=4, agsize=61047597 blks


[ceph0][DEBUG ]  =   sectsz=512   attr=2, 
projid32bit=0


[ceph0][DEBUG ] data =   bsize=4096   
blocks=244190385, imaxpct=25


[ceph0][DEBUG ]  =   sunit=0  swidth=0 blks

[ceph0][DEBUG ] naming   =version 2  bsize=4096   ascii-ci=0

[ceph0][DEBUG ] log  =internal log   bsize=4096   
blocks=119233, version=2


[ceph0][DEBUG ]  =   sectsz=512   sunit=0 blks, 
lazy-count=1


[ceph0][DEBUG ] realtime =none   extsz=4096   blocks=0, 
rtextents=0


[ceph0][DEBUG ] The operation has completed successfully.

[ceph0][INFO  ] Running command: udevadm trigger 
--subsystem-match=block --action=add


[ceph_deploy.osd][DEBUG ] Host ceph0 is now ready for osd use.

[ceph0][DEBUG ] connected to host: ceph0

[ceph0][DEBUG ] detect platform information from remote host

[ceph0][DEBUG ] detect machine type

[ceph_deploy.osd][INFO  ] Distro info: CentOS 6.4 Final

[ceph_deploy.osd][DEBUG ] Preparing host ceph0 disk /dev/sdd journal 
/dev/sda activate True


[ceph0][INFO  ] Running command: ceph-disk-prepare --fs-type xfs 
--cluster 

Re: [ceph-users] near full osd

2013-11-12 Thread John Wilkins
We probably do need to go over it again and account for PG splitting.

On Fri, Nov 8, 2013 at 9:26 AM, Gregory Farnum g...@inktank.com wrote:
 After you increase the number of PGs, *and* increase the pgp_num to do the
 rebalancing (this is all described in the docs; do a search), data will move
 around and the overloaded OSD will have less stuff on it. If it's actually
 marked as full, though, this becomes a bit trickier. Search the list
 archives for some instructions; I don't remember the best course to follow.
 -Greg
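
 (For reference, a minimal sketch of the two-step increase Greg describes, assuming a
 pool named rbd and a target of 512 PGs:)

 ceph osd pool set rbd pg_num 512
 ceph osd pool set rbd pgp_num 512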

 On Friday, November 8, 2013, Kevin Weiler wrote:

 Thanks again Gregory!

 One more quick question. If I raise the amount of PGs for a pool, will
 this REMOVE any data from the full OSD? Or will I have to take the OSD out
 and put it back in to realize this benefit? Thanks!


 --

 Kevin Weiler

 IT



 IMC Financial Markets | 233 S. Wacker Drive, Suite 4300 | Chicago, IL
 60606 | http://imc-chicago.com/

 Phone: +1 312-204-7439 | Fax: +1 312-244-3301 | E-Mail:
 kevin.wei...@imc-chicago.com


 From: Gregory Farnum g...@inktank.com
 Date: Friday, November 8, 2013 11:00 AM
 To: Kevin Weiler kevin.wei...@imc-chicago.com
 Cc: Aronesty, Erik earone...@expressionanalysis.com, Greg Chavez
 greg.cha...@gmail.com, ceph-users@lists.ceph.com
 ceph-users@lists.ceph.com
 Subject: Re: [ceph-users] near full osd

 It's not a hard value; you should adjust based on the size of your pools
 (many of then are quite small when used with RGW, for instance). But in
 general it is better to have more than fewer, and if you want to check you
 can look at the sizes of each PG (ceph pg dump) and increase the counts for
  pools with wide variability.
  -Greg

 On Friday, November 8, 2013, Kevin Weiler wrote:

 Thanks Gregory,

 One point that was a bit unclear in documentation is whether or not this
 equation for PGs applies to a single pool, or the entirety of pools.
 Meaning, if I calculate 3000 PGs, should each pool have 3000 PGs or should
 all the pools ADD UP to 3000 PGs? Thanks!

 --

 Kevin Weiler

 IT


 IMC Financial Markets | 233 S. Wacker Drive, Suite 4300 | Chicago, IL
 60606 | http://imc-chicago.com/

 Phone: +1 312-204-7439 | Fax: +1 312-244-3301 | E-Mail:
 kevin.wei...@imc-chicago.com







 On 11/7/13 9:59 PM, Gregory Farnum g...@inktank.com wrote:

 It sounds like maybe your PG counts on your pools are too low and so
 you're just getting a bad balance. If that's the case, you can
  increase the PG count with ceph osd pool <name> set pgnum <higher value>.
 
 OSDs should get data approximately equal to node weight/sum of node
 weights, so higher weights get more data and all its associated
 traffic.
 -Greg
 Software Engineer #42 @ http://inktank.com | http://ceph.com
 
 
 On Tue, Nov 5, 2013 at 8:30 AM, Kevin Weiler
 kevin.wei...@imc-chicago.com wrote:
  All of the disks in my cluster are identical and therefore all have the
 same
  weight (each drive is 2TB and the automatically generated weight is
 1.82 for
  each one).
 
  Would the procedure here be to reduce the weight, let it rebal, and
 then put
  the weight back to where it was?
 
 
  --
 
  Kevin Weiler
 
  IT
 
 
 
  IMC Financial Markets | 233 S. Wacker Drive, Suite 4300 | Chicago, IL
 60606
  | http://imc-chicago.com/
 
  Phone: +1 312-204-7439 | Fax: +1 312-244-3301 | E-Mail:
  kevin.wei...@imc-chicago.com
 
 
  From: Aronesty, Erik earone...@expressionanalysis.com
  Date: Tuesday, November 5, 2013 10:27 AM
  To: Greg Chavez greg.cha...@gmail.com, Kevin Weiler
  kevin.wei...@imc-chicago.com
  Cc: ceph-users@lists.ceph.com ceph-users@lists.ceph.com
  Subject: RE: [ceph-users] near full osd
 
   If there's an underperforming disk, why on earth would more data be put on
   it?  You'd think it would be less...  I would think an overperforming disk
   should (desirably) cause that case, right?
 
 
 
  From: ceph-users-boun...@lists.ceph.com
  [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Greg Chavez
  Sent: Tuesday, November 05, 2013 11:20 AM
  To: Kevin Weiler
  Cc: ceph-users@lists.ceph.com
  Subject: Re: [ceph-users] near full osd
 
 
 
  Kevin, in my experience that usually indicates a bad or underperforming
  disk, or a too-high priority.  Try running ceph osd crush reweight
 osd.##
  1.0.  If that doesn't do the trick, you may want to just out that guy.
 
 
 
  I don't think the crush algorithm guarantees balancing things out in
 the way
  you're expecting.
 
 
 
  --Greg
 
  On Tue, Nov 5, 2013 at 11:11 AM, Kevin Weiler
 kevin.wei...@imc-chicago.com
  wrote:
 
  Hi guys,
 
 
 
  I have an OSD in my cluster that is near full at 90%, but we're using a
  little less than half the available storage in the cluster. Shouldn't
 this
  be balanced out?
 
 
 
  --
 


 

 The information in this e-mail is intended only for the person or entity
 to which it is addressed.

 It may contain confidential and /or privileged material. If someone other
 than the intended recipient should receive this e-mail, he / she 

Re: [ceph-users] near full osd

2013-11-12 Thread Samuel Just
I think we removed the experimental warning in cuttlefish.  It
probably wouldn't hurt to do it in bobtail particularly if you test it
extensively on a test cluster first.  However, we didn't do extensive
testing on it until cuttlefish.  I would upgrade to cuttlefish
(actually, dumpling or emperor, now) first.  Also, please note that in
any version, pg split causes massive data movement.
-Sam

On Mon, Nov 11, 2013 at 7:04 AM, Oliver Francke oliver.fran...@filoo.de wrote:
 Hi Greg,

 we are in a similar situation with a huge imbalance: some of our 28
 OSDs are at about 40%, whereas some are near full at 84%.
 The default is 8; we use a default of 32, but for some pools customers
 raised their VM disks quickly to 1 TB and more in sum, and I think this is
 where the problems come from?!

 For some other reason we are still running good ol' bobtail, and in the lab
 I tried to force the increase via --allow-experimental-feature with
 0.56.7-3...
 It's working, but how experimental is it for production?

 Thnx in advance,

 Oliver.


 On 11/08/2013 06:26 PM, Gregory Farnum wrote:

 After you increase the number of PGs, *and* increase the pgp_num to do the
 rebalancing (this is all described in the docs; do a search), data will move
 around and the overloaded OSD will have less stuff on it. If it's actually
 marked as full, though, this becomes a bit trickier. Search the list
 archives for some instructions; I don't remember the best course to follow.
 -Greg

 On Friday, November 8, 2013, Kevin Weiler wrote:

 Thanks again Gregory!

 One more quick question. If I raise the amount of PGs for a pool, will
 this REMOVE any data from the full OSD? Or will I have to take the OSD out
 and put it back in to realize this benefit? Thanks!


 --

 Kevin Weiler

 IT



 IMC Financial Markets | 233 S. Wacker Drive, Suite 4300 | Chicago, IL
 60606 | http://imc-chicago.com/

 Phone: +1 312-204-7439 | Fax: +1 312-244-3301 | E-Mail:
 kevin.wei...@imc-chicago.com


 From: Gregory Farnum g...@inktank.com
 Date: Friday, November 8, 2013 11:00 AM
 To: Kevin Weiler kevin.wei...@imc-chicago.com
 Cc: Aronesty, Erik earone...@expressionanalysis.com, Greg Chavez
 greg.cha...@gmail.com, ceph-users@lists.ceph.com
 ceph-users@lists.ceph.com
 Subject: Re: [ceph-users] near full osd

 It's not a hard value; you should adjust based on the size of your pools
 (many of then are quite small when used with RGW, for instance). But in
 general it is better to have more than fewer, and if you want to check you
 can look at the sizes of each PG (ceph pg dump) and increase the counts for
  pools with wide variability.
  -Greg

 On Friday, November 8, 2013, Kevin Weiler wrote:

 Thanks Gregory,

 One point that was a bit unclear in documentation is whether or not this
 equation for PGs applies to a single pool, or the entirety of pools.
 Meaning, if I calculate 3000 PGs, should each pool have 3000 PGs or should
 all the pools ADD UP to 3000 PGs? Thanks!

 --

 Kevin Weiler

 IT


 IMC Financial Markets | 233 S. Wacker Drive, Suite 4300 | Chicago, IL
 60606 | http://imc-chicago.com/

 Phone: +1 312-204-7439 | Fax: +1 312-244-3301 | E-Mail:
 kevin.wei...@imc-chicago.com







 On 11/7/13 9:59 PM, Gregory Farnum g...@inktank.com wrote:

 It sounds like maybe your PG counts on your pools are too low and so
 you're just getting a bad balance. If that's the case, you can
  increase the PG count with ceph osd pool <name> set pgnum <higher value>.
 
 OSDs should get data approximately equal to node weight/sum of node
 weights, so higher weights get more data and all its associated
 traffic.
 -Greg
 Software Engineer #42 @ http://inktank.com | http://ceph.com
 
 
 On Tue, Nov 5, 2013 at 8:30 AM, Kevin Weiler
 kevin.wei...@imc-chicago.com wrote:
  All of the disks in my cluster are identical and therefore all have the
 same
  weight (each drive is 2TB and the automatically generated weight is
 1.82 for
  each one).
 
  Would the procedure here be to reduce the weight, let it rebal, and
 then put
  the weight back to where it was?
 
 
  --
 
  Kevin Weiler
 
  IT
 
 
 
  IMC Financial Markets | 233 S. Wacker Drive, Suite 4300 | Chicago, IL
 60606
  | http://imc-chicago.com/
 
  Phone: +1 312-204-7439 | Fax: +1 312-244-3301 | E-Mail:
  kevin.wei...@imc-chicago.com
 
 
  From: Aronesty, Erik earone...@expressionanalysis.com
  Date: Tuesday, November 5, 2013 10:27 AM
  To: Greg Chavez greg.cha...@gmail.com, Kevin Weiler
  kevin.wei...@imc-chicago.com
  Cc: ceph-users@lists.ceph.com ceph-users@lists.ceph.com
  Subject: RE: [ceph-users] near full osd
 
   If there's an underperforming disk, why on earth would more data be put on
   it?  You'd think it would be less...  I would think an overperforming disk
   should (desirably) cause that case, right?
 
 
 
  From: ceph-users-boun...@lists.ceph.com
  [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Greg Chavez
  Sent: Tuesday, November 05, 2013 11:20 AM
  To: Kevin Weiler
  Cc: ceph-users@lists.ceph.com
  Subject: Re: 

[ceph-users] Questions/comments on using ZFS for OSDs

2013-11-12 Thread Eric Eastman
I built Ceph version 0.72 with --with-libzfs on Ubuntu 13.04 after 
installing ZFS

from the ppa:zfs-native/stable repository. The ZFS version is v0.6.2-1

I do have a few questions and comments on Ceph using ZFS backed OSDs

As ceph-deploy does not show support for ZFS, I used the instructions 
at:

http://ceph.com/docs/master/rados/operations/add-or-rm-osds/
and hand created a new OSD on an existing Ceph system. I guessed that I 
needed to build a zpool out of a disk, and then create a ZFS file 
system that mounted to  /var/lib/ceph/osd/ceph-X, where X was the 
number given when I ran the ceph osd create command.  As I am testing 
on a VM, I created 2 new disks, one 2GB (/dev/sde) for journal and one 
32GB (/dev/sdd) for data. To setup the system for ZFS based OSDs, I 
first added to all my ceph.conf files:


   filestore zfs_snap = 1
   journal_aio = 0
   journal_dio = 0

I then created the OSD with the commands:

# ceph osd create
4
# parted -s /dev/sdd mklabel gpt mkpart -- -- 1 \-1
# parted -s /dev/sde mklabel gpt mkpart -- -- 1 \-1
# zpool create sdd /dev/sdd
# mkdir /var/lib/ceph/osd/ceph-4
# zfs create -o mountpoint=/var/lib/ceph/osd/ceph-4 sdd/ceph-4
# ceph-osd  -i 4 --mkfs --mkkey --osd-journal=/dev/sde1 --mkjournal
# ceph auth add osd.4 osd 'allow *' mon 'allow rwx' -i 
/var/lib/ceph/osd/ceph-4/keyring


I then decompiled the crush map, added osd.4, and recompiled the map, 
and set Ceph to use the new crush map.


When I started the osd.4 with:

# start ceph-osd id=4

It failed to start, as the ceph osd log file indicated the journal was 
missing:
 mount failed to open journal /var/lib/ceph/osd/ceph-4/journal: (2) 
No such file or directory


So I manually created a link named journal to /dev/sde1 and created the 
journal_uuid file.  Should ceph-osd have done this step?  Is there 
anything else I may have missed?
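
(A sketch of that manual step, assuming /dev/sde1 is the journal partition and that 
journal_uuid only needs to hold the partition's UUID:)

# ln -s /dev/sde1 /var/lib/ceph/osd/ceph-4/journal
# blkid -o value -s PARTUUID /dev/sde1 > /var/lib/ceph/osd/ceph-4/journal_uuid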


With limited testing, the ZFS backed OSD seems to function correctly.

I was wondering if there are any ZFS file system options that should be 
set for better performance or data safety.


It would be nice if ceph-deploy would handle ZFS.

Lastly, I want to thank Yan, Zheng and all the rest who worked on this 
project.


Eric

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Questions/comments on using ZFS for OSDs

2013-11-12 Thread asomers
On Tue, Nov 12, 2013 at 3:43 PM, Eric Eastman eri...@aol.com wrote:
 I built Ceph version 0.72 with --with-libzfs on Ubuntu 1304 after installing
 ZFS
 from th ppa:zfs-native/stable repository. The ZFS version is v0.6.2-1

 I do have a few questions and comments on Ceph using ZFS backed OSDs

 As ceph-deploy does not show support for ZFS, I used the instructions at:
 http://ceph.com/docs/master/rados/operations/add-or-rm-osds/
 and hand created a new OSD on an existing Ceph system. I guest that I needed
 to build a zpool out of a disk, and then create a ZFS file system that
 mounted to  /var/lib/ceph/osd/ceph-X, where X was the number given when I
 ran the ceph osd create command.  As I am testing on a VM, I created 2 new
 disks, one 2GB (/dev/sde) for journal and one 32GB (/dev/sdd) for data. To
 setup the system for ZFS based OSDs, I first added to all my ceph.conf
 files:

filestore zfs_snap = 1
journal_aio = 0
journal_dio = 0

 I then created the OSD with the commands:

 # ceph osd create
 4
 # parted -s /dev/sdd mklabel gpt mkpart -- -- 1 \-1
 # parted -s /dev/sde mklabel gpt mkpart -- -- 1 \-1
 # zpool create sdd /dev/sdd

Since you are using the entire disk for your pool, you don't need a
GPT label.  You can eliminate the parted commands.

 # mkdir /var/lib/ceph/osd/ceph-4
 # zfs create -o mountpoint=/var/lib/ceph/osd/ceph-4 sdd/ceph-4
 # ceph-osd  -i 4 --mkfs --mkkey --osd-journal=/dev/sde1 --mkjournal
 # ceph auth add osd.4 osd 'allow *' mon 'allow rwx' -i
 /var/lib/ceph/osd/ceph-4/keyring

 I then decompiled the crush map, added osd.4, and recompiled the map, and
 set Ceph to use the new crush map.

 When I started the osd.4 with:

 # start ceph-osd id=4

 It failed to start, as the ceph osd log file indicated the journal was
 missing:
  mount failed to open journal /var/lib/ceph/osd/ceph-4/journal: (2) No
 such file or directory

 So I manually created a link named journal to /dev/sde1 and created the
 journal_uuid file.  Should ceph-osd have done this step?  Is there anything
 else I may of missed?

 With limited testing, the ZFS backed OSD seems to function correctly.

 I was wondering if there are any ZFS file system options that should be set
 for better performance or data safety.

 It would be nice if ceph-deploy would handle ZFS.

 Lastly, I want to thank Yan, Zheng and all the rest who worked on this
 project.

 Eric

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Questions/comments on using ZFS for OSDs

2013-11-12 Thread Mark Nelson

On 11/12/2013 04:43 PM, Eric Eastman wrote:

I built Ceph version 0.72 with --with-libzfs on Ubuntu 1304 after
installing ZFS
from th ppa:zfs-native/stable repository. The ZFS version is v0.6.2-1

I do have a few questions and comments on Ceph using ZFS backed OSDs

As ceph-deploy does not show support for ZFS, I used the instructions at:
http://ceph.com/docs/master/rados/operations/add-or-rm-osds/
and hand created a new OSD on an existing Ceph system. I guest that I
needed to build a zpool out of a disk, and then create a ZFS file system
that mounted to  /var/lib/ceph/osd/ceph-X, where X was the number given
when I ran the ceph osd create command.  As I am testing on a VM, I
created 2 new disks, one 2GB (/dev/sde) for journal and one 32GB
(/dev/sdd) for data. To setup the system for ZFS based OSDs, I first
added to all my ceph.conf files:

filestore zfs_snap = 1
journal_aio = 0
journal_dio = 0

I then created the OSD with the commands:

# ceph osd create
4
# parted -s /dev/sdd mklabel gpt mkpart -- -- 1 \-1
# parted -s /dev/sde mklabel gpt mkpart -- -- 1 \-1
# zpool create sdd /dev/sdd
# mkdir /var/lib/ceph/osd/ceph-4
# zfs create -o mountpoint=/var/lib/ceph/osd/ceph-4 sdd/ceph-4
# ceph-osd  -i 4 --mkfs --mkkey --osd-journal=/dev/sde1 --mkjournal
# ceph auth add osd.4 osd 'allow *' mon 'allow rwx' -i
/var/lib/ceph/osd/ceph-4/keyring

I then decompiled the crush map, added osd.4, and recompiled the map,
and set Ceph to use the new crush map.

When I started the osd.4 with:

# start ceph-osd id=4

It failed to start, as the ceph osd log file indicated the journal was
missing:
  mount failed to open journal /var/lib/ceph/osd/ceph-4/journal: (2)
No such file or directory

So I manually created a link named journal to /dev/sde1 and created the
journal_uuid file.  Should ceph-osd have done this step?  Is there
anything else I may of missed?

With limited testing, the ZFS backed OSD seems to function correctly.

I was wondering if there are any ZFS file system options that should be
set for better performance or data safety.


You may want to try using SA xattrs.  This resulted in a measurable 
performance improvement when I was testing Ceph on ZFS last spring.
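
(On ZFS on Linux that is a per-dataset property; for the dataset in Eric's example it 
would be set along these lines:)

# zfs set xattr=sa sdd/ceph-4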




It would be nice if ceph-deploy would handle ZFS.

Lastly, I want to thank Yan, Zheng and all the rest who worked on this
project.

Eric



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Kernel Panic / RBD Instability

2013-11-12 Thread Uwe Grohnwaldt
Hi,

we're experiencing the same problem. We have a cluster with 6 machines and 60 
OSDs (Supermicro 2U, 24 disks max, LSI controller). We have three R300 as 
monitor nodes and two more R300 as iSCSI targets. We are using targetcli, too. 
Needless to say, we have a cluster network, a public network and an iSCSI 
network, each on separate switches. Our operating system is Ubuntu 13.10 (with 
ceph from the Ubuntu repos).

We mapped the block devices (4 TB each) and exported them from /dev/rbd/rbd/. We 
have the same problem that our iSCSI nodes get a kernel panic when a 
backfill or (deep) scrub starts or when we have a flaky disk. We are mitigating 
the last point with one RAID1 per OSD. The other problem is that the 
iSCSI machine gets a kernel panic when the IO is really slow or doesn't complete 
within a few seconds. We looked around a bit and found an attribute, 
task_timeout, which can be set for block devices but not for rbd devices. Our 
explanation for this behavior:

1. iSCSI works normally
2. an OSD goes offline or a whole node goes offline
3. the cluster needs some seconds to become responsive again (it takes over 5 
seconds - maybe tuning can help here?)
4. the recovery process starts (e.g. dead disk, dead node), the cluster gets more 
load and the response times get slower (tuning can help here again, too?)
5. we get lio/targetcli IO errors which go up to our hypervisors and break 
filesystems in virtual machines. Moreover, the lio machines get a kernel panic 
while waiting.
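
(For an ordinary SCSI-backed block device the timeout in question is normally the 
per-device command timeout in sysfs, which rbd devices do not expose. A sketch, with 
sdx standing in for the exported disk:)

echo 180 > /sys/block/sdx/device/timeout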

As I already wrote, we can't set task_timeout to work around this behavior. So 
we tried stgt (http://ceph.com/dev-notes/adding-support-for-rbd-to-stgt/ and 
http://ceph.com/dev-notes/updates-to-ceph-tgt-iscsi-support/). It works much 
better with the rbd backend, but now we have problems with VMware ESX machines.

It's incredibly slow to write to the iSCSI target. Even with tuning it isn't 
possible to get a fast installation. We needed two to three hours to install a 
CentOS 6 VM.

Maybe this can help you to track down your problems. At the moment we are 
searching for a solution to get a ceph cluster with iscsi exports for our 
VMware environment.

Mit freundlichen Grüßen / Best Regards,
--
Uwe Grohnwaldt

- Original Message -
 From: Gregory Farnum g...@inktank.com
 To: James Wilkins james.wilk...@fasthosts.com
 Cc: ceph-users@lists.ceph.com
 Sent: Freitag, 8. November 2013 06:23:00
 Subject: Re: [ceph-users] Kernel Panic / RBD Instability
 
 Well, as you've noted you're getting some slow requests on the OSDs
 when they turn back on; and then the iSCSI gateway is panicking
 (probably because the block device write request is just hanging).
 We've gotten prior reports that iSCSI is a lot more sensitive to a
 few
 slow requests than most use cases, and OSDs coming back in can cause
 some slow requests, but if it's a common case for you then there's
 probably something that can be done to optimize that recovery. Have
 you checked into what's blocking the slow operations or why the PGs
 are taking so long to get ready?
 -Greg
 Software Engineer #42 @ http://inktank.com | http://ceph.com
 
 On Tue, Nov 5, 2013 at 1:33 AM, James Wilkins
 james.wilk...@fasthosts.com wrote:
  Hello,
 
  Wondering if anyone else has come over an issue we're having with
  our POC CEPH Cluster at the moment.
 
  Some details about its setup;
 
  6 x Dell R720 (20 x 1TB Drives, 4 xSSD CacheCade), 4 x 10GB Nics
  4 x Generic white label server (24 x 2 4TB Disk Raid-0 ), 4 x 10GB
  Nics
  3 x Dell R620 - Acting as ISCSI Heads (targetcli / Linux kernel
  ISCSI) - 4 x 10GB Nics.  An RBD device is mounted and exported via
  targetcli, this is then mounted on a client device to push backup
  data.
 
  All machines are running Ubuntu 12.04.3 LTS and ceph 0.67.4
 
  Machines are split over two racks (distinct layer 2 domains) using
  a leaf/spine model and we use ECMP/quagga on the ISCSI heads to
  reach the CEPH Cluster.
 
  Crush map has racks defined to spread data over 2 racks -  I've
  attached the ceph.conf
 
  The cluster performs great normally, and we only have issues when
  simulating rack failure.
 
  The issue comes when the following steps are taken
 
  o) Initiate load against the cluster (backups going via ISCSI)
  o) ceph osd set noout
  o) Reboot 2 x Generic Servers / 3 x Dell Servers (basically all the
  nodes in 1 Rack)
  o) Cluster goes degraded, as expected
 
cluster 55dcf929-fca5-49fe-99d0-324a19afd5b4
 health HEALTH_WARN 7056 pgs degraded; 282 pgs stale; 2842 pgs
 stuck unclean; recovery 1286582/2700870 degraded (47.636%);
 108/216 in osds are down; noout flag(s) set
 monmap e3: 5 mons at
 
  {fh-ceph01-mon-01=172.17.12.224:6789/0,fh-ceph01-mon-02=172.17.12.225:6789/0,fh-ceph01-mon-03=172.17.11.224:6789/0,fh-ceph01-mon-04=172.17.11.225:6789/0,fh-ceph01-mon-05=172.17.12.226:6789/0},
 election epoch 74, quorum 0,1,2,3,4
 
  fh-ceph01-mon-01,fh-ceph01-mon-02,fh-ceph01-mon-03,fh-ceph01-mon-04,fh-ceph01-mon-05
 osdmap 

Re: [ceph-users] Ephemeral RBD with Havana and Dumpling

2013-11-12 Thread Dinu Vlad
Out of curiosity - can you live-migrate instances with this setup? 



On Nov 12, 2013, at 10:38 PM, Dmitry Borodaenko dborodae...@mirantis.com 
wrote:

 And to answer my own question, I was missing a meaningful error
 message: what the ObjectNotFound exception I got from librados didn't
 tell me was that I didn't have the images keyring file in /etc/ceph/
  on my compute node. After 'ceph auth get-or-create client.images >
  /etc/ceph/ceph.client.images.keyring' and reverting images caps back
 to original state, it all works!
 
 On Tue, Nov 12, 2013 at 12:19 PM, Dmitry Borodaenko
 dborodae...@mirantis.com wrote:
 I can get ephemeral storage for Nova to work with RBD backend, but I
 don't understand why it only works with the admin cephx user? With a
 different user starting a VM fails, even if I set its caps to 'allow
 *'.
 
 Here's what I have in nova.conf:
 libvirt_images_type=rbd
 libvirt_images_rbd_pool=images
 rbd_secret_uuid=fd9a11cc-6995-10d7-feb4-d338d73a4399
 rbd_user=images
 
 The secret UUID is defined following the same steps as for Cinder and Glance:
 http://ceph.com/docs/master/rbd/libvirt/
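
  (Roughly, per that page, the libvirt side is defined like this; a sketch that assumes
  a secret.xml containing the UUID from nova.conf and the client.images cephx user:)

  # virsh secret-define --file secret.xml
  # virsh secret-set-value --secret fd9a11cc-6995-10d7-feb4-d338d73a4399 \
      --base64 $(ceph auth get-key client.images)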
 
 BTW rbd_user option doesn't seem to be documented anywhere, is that a
 documentation bug?
 
 And here's what 'ceph auth list' tells me about my cephx users:
 
 client.admin
key: AQCoSX1SmIo0AxAAnz3NffHCMZxyvpz65vgRDg==
caps: [mds] allow
caps: [mon] allow *
caps: [osd] allow *
 client.images
key: AQC1hYJS0LQhDhAAn51jxI2XhMaLDSmssKjK+g==
caps: [mds] allow
caps: [mon] allow *
caps: [osd] allow *
 client.volumes
key: AQALSn1ScKruMhAAeSETeatPLxTOVdMIt10uRg==
caps: [mon] allow r
caps: [osd] allow class-read object_prefix rbd_children, allow
 rwx pool=volumes, allow rx pool=images
 
 Setting rbd_user to images or volumes doesn't work.
 
 What am I missing?
 
 Thanks,
 
 --
 Dmitry Borodaenko
 
 
 
 -- 
 Dmitry Borodaenko

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ephemeral RBD with Havana and Dumpling

2013-11-12 Thread Dmitry Borodaenko
Still working on it, watch this space :)

On Tue, Nov 12, 2013 at 3:44 PM, Dinu Vlad dinuvla...@gmail.com wrote:
 Out of curiosity - can you live-migrate instances with this setup?



 On Nov 12, 2013, at 10:38 PM, Dmitry Borodaenko dborodae...@mirantis.com 
 wrote:

 And to answer my own question, I was missing a meaningful error
 message: what the ObjectNotFound exception I got from librados didn't
 tell me was that I didn't have the images keyring file in /etc/ceph/
  on my compute node. After 'ceph auth get-or-create client.images >
  /etc/ceph/ceph.client.images.keyring' and reverting images caps back
 to original state, it all works!

 On Tue, Nov 12, 2013 at 12:19 PM, Dmitry Borodaenko
 dborodae...@mirantis.com wrote:
 I can get ephemeral storage for Nova to work with RBD backend, but I
 don't understand why it only works with the admin cephx user? With a
 different user starting a VM fails, even if I set its caps to 'allow
 *'.

 Here's what I have in nova.conf:
 libvirt_images_type=rbd
 libvirt_images_rbd_pool=images
 rbd_secret_uuid=fd9a11cc-6995-10d7-feb4-d338d73a4399
 rbd_user=images

 The secret UUID is defined following the same steps as for Cinder and 
 Glance:
 http://ceph.com/docs/master/rbd/libvirt/

 BTW rbd_user option doesn't seem to be documented anywhere, is that a
 documentation bug?

 And here's what 'ceph auth list' tells me about my cephx users:

 client.admin
key: AQCoSX1SmIo0AxAAnz3NffHCMZxyvpz65vgRDg==
caps: [mds] allow
caps: [mon] allow *
caps: [osd] allow *
 client.images
key: AQC1hYJS0LQhDhAAn51jxI2XhMaLDSmssKjK+g==
caps: [mds] allow
caps: [mon] allow *
caps: [osd] allow *
 client.volumes
key: AQALSn1ScKruMhAAeSETeatPLxTOVdMIt10uRg==
caps: [mon] allow r
caps: [osd] allow class-read object_prefix rbd_children, allow
 rwx pool=volumes, allow rx pool=images

 Setting rbd_user to images or volumes doesn't work.

 What am I missing?

 Thanks,

 --
 Dmitry Borodaenko



 --
 Dmitry Borodaenko
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
Dmitry Borodaenko
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] HDD bad sector, pg inconsistent, no object remapping

2013-11-12 Thread David Zafman

Since the disk is failing and you have 2 other copies I would take osd.0 down.  
This means that ceph will not attempt to read the bad disk either for clients 
or to make another copy of the data:

* Not sure about the syntax of this for the version of ceph you are running
ceph osd down 0

Mark it “out” which will immediately trigger recovery to create more copies of 
the data with the remaining OSDs.
ceph osd out 0

You can now finish the process of removing the osd by looking at these 
instructions:

http://ceph.com/docs/master/rados/operations/add-or-rm-osds/#removing-osds-manual
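
(In outline, once recovery has finished after marking it out, that page amounts to the 
following for osd.0; stop the daemon on its host first:)

ceph osd crush remove osd.0
ceph auth del osd.0
ceph osd rm 0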

David Zafman
Senior Developer
http://www.inktank.com

On Nov 12, 2013, at 3:16 AM, Mihály Árva-Tóth 
mihaly.arva-t...@virtual-call-center.eu wrote:

 Hello,
 
 I have 3 nodes, with 3 OSDs in each node. I'm using the .rgw.buckets pool with 3 
 replicas. One of my HDDs (osd.0) has bad sectors; when I try to read an 
 object from the OSD directly, I get an Input/output error. dmesg:
 
 [1214525.670065] mpt2sas0: log_info(0x3108): originator(PL), code(0x08), 
 sub_code(0x)
 [1214525.670072] mpt2sas0: log_info(0x3108): originator(PL), code(0x08), 
 sub_code(0x)
 [1214525.670100] sd 0:0:2:0: [sdc] Unhandled sense code
 [1214525.670104] sd 0:0:2:0: [sdc]  
 [1214525.670107] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
 [1214525.670110] sd 0:0:2:0: [sdc]  
 [1214525.670112] Sense Key : Medium Error [current] 
 [1214525.670117] Info fld=0x60c8f21
 [1214525.670120] sd 0:0:2:0: [sdc]  
 [1214525.670123] Add. Sense: Unrecovered read error
 [1214525.670126] sd 0:0:2:0: [sdc] CDB: 
 [1214525.670128] Read(16): 88 00 00 00 00 00 06 0c 8f 20 00 00 00 08 00 00
 
 Okay, I know I need to replace the HDD.
 
 Fragment of ceph -s  output:
   pgmap v922039: 856 pgs: 855 active+clean, 1 active+clean+inconsistent;
 
 ceph pg dump | grep inconsistent
 
 11.15d  25443   0   0   0   6185091790  30013001
 active+clean+inconsistent   2013-11-06 02:30:45.23416.
 
 ceph pg map 11.15d
 
 osdmap e1600 pg 11.15d (11.15d) - up [0,8,3] acting [0,8,3]
 
 pg repair or deep-scrub can not fix this issue. But if I understand 
 correctly, the osd has to know that it can not retrieve the object from osd.0, and the 
 object needs to be replicated to another osd because there are not 3 working replicas now.
 
 Thank you,
 Mihaly

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] No monitor sockets after upgrading to Emperor

2013-11-12 Thread Berant Lemmenes
On Tue, Nov 12, 2013 at 7:28 PM, Joao Eduardo Luis joao.l...@inktank.comwrote:


 This looks an awful lot like you started another instance of an OSD with
 the same ID while another was running.  I'll walk you through the log lines
 that point me towards this conclusion.  Would still be weird if the admin
 sockets vanished because of that, so maybe that's a different issue.  Are
 you able to reproduce the admin socket issue often?

 Walking through:


Thanks for taking the time to walk through these logs, I appreciate the
explanation.

2013-11-12 09:47:09.670813 7f8151b5f780  0 ceph version 0.72
 (5832e2603c7db5d40b433d0953408993a9b7c217), process ceph-osd, pid 2769
 2013-11-12 09:47:09.673789 7f8151b5f780  0
 filestore(/var/lib/ceph/osd/ceph-19) lock_fsid failed to lock
 /var/lib/ceph/osd/ceph-19/fsid, is another ceph-osd still running? (11)
 Resource temporarily unavailable


 This last line tells us that ceph-osd believes another instance is
 running, so you should first find out whether there's actually another
 instance being run somewhere, somehow.  How did you start these daemons?


That proved to be the crux of it, both upstart and the Sys V init scripts
were trying to start the ceph daemons. Looking in /etc/rc2.d there are
symlinks from S20ceph to ../init.d/ceph

Upstart thought it was controlling things - doing an 'initctl list | grep
ceph' would show the correct PIDs, and 'service ceph status' thought they
were not running.

So that would seem to indicate that Sys V was trying to start it first, and
upstart was the one that had started the instance that generated those logs.

The part that doesn't make sense is: if the Sys V init script was starting
before upstart, why wouldn't it be the one writing to /var/log/ceph/?

After running 'update-rc.d ceph disable', the admin sockets were present
after a system reboot.
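
(For anyone else hitting this, the check and fix amount to the commands already
mentioned above:)

initctl list | grep ceph    # upstart's view of the running daemons
service ceph status         # the Sys V script's view
update-rc.d ceph disable    # stop the Sys V script from also starting ceph at boot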

I wonder, was the Sys V init script being enabled a ceph-deploy artifact
or an issue with the packages?

Thanks for pointing me in the right direction!

Thanks,
Berant
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Questions/comments on using ZFS for OSDs

2013-11-12 Thread Yan, Zheng
On Wed, Nov 13, 2013 at 6:43 AM, Eric Eastman eri...@aol.com wrote:
 I built Ceph version 0.72 with --with-libzfs on Ubuntu 1304 after installing
 ZFS
 from th ppa:zfs-native/stable repository. The ZFS version is v0.6.2-1

 I do have a few questions and comments on Ceph using ZFS backed OSDs

 As ceph-deploy does not show support for ZFS, I used the instructions at:
 http://ceph.com/docs/master/rados/operations/add-or-rm-osds/
 and hand created a new OSD on an existing Ceph system. I guest that I needed
 to build a zpool out of a disk, and then create a ZFS file system that
 mounted to  /var/lib/ceph/osd/ceph-X, where X was the number given when I
 ran the ceph osd create command.  As I am testing on a VM, I created 2 new
 disks, one 2GB (/dev/sde) for journal and one 32GB (/dev/sdd) for data. To
 setup the system for ZFS based OSDs, I first added to all my ceph.conf
 files:

filestore zfs_snap = 1
journal_aio = 0
journal_dio = 0

No need to disable journal dio/aio if the journal is not on ZFS.

Regards
Yan, Zheng


 I then created the OSD with the commands:

 # ceph osd create
 4
 # parted -s /dev/sdd mklabel gpt mkpart -- -- 1 \-1
 # parted -s /dev/sde mklabel gpt mkpart -- -- 1 \-1
 # zpool create sdd /dev/sdd
 # mkdir /var/lib/ceph/osd/ceph-4
 # zfs create -o mountpoint=/var/lib/ceph/osd/ceph-4 sdd/ceph-4
 # ceph-osd  -i 4 --mkfs --mkkey --osd-journal=/dev/sde1 --mkjournal
 # ceph auth add osd.4 osd 'allow *' mon 'allow rwx' -i
 /var/lib/ceph/osd/ceph-4/keyring

 I then decompiled the crush map, added osd.4, and recompiled the map, and
 set Ceph to use the new crush map.

 When I started the osd.4 with:

 # start ceph-osd id=4

 It failed to start, as the ceph osd log file indicated the journal was
 missing:
  mount failed to open journal /var/lib/ceph/osd/ceph-4/journal: (2) No
 such file or directory

 So I manually created a link named journal to /dev/sde1 and created the
 journal_uuid file.  Should ceph-osd have done this step?  Is there anything
 else I may of missed?

 With limited testing, the ZFS backed OSD seems to function correctly.

 I was wondering if there are any ZFS file system options that should be set
 for better performance or data safety.

 It would be nice if ceph-deploy would handle ZFS.

 Lastly, I want to thank Yan, Zheng and all the rest who worked on this
 project.

 Eric

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] how to load custom methods (rados_exec)

2013-11-12 Thread
Hi all!
  I am trying to use rados_exec, which allows librados users to call
custom methods.
My ceph version is 0.62. It works for the class cls_rbd, because that class is
already built and loaded into the ceph class directory (/usr/local/lib/rados-class),
but I do not know how to build and load a custom method.

For example, cls_crypto.cc, which lies in ceph/src/, has not been built and
loaded into ceph. How could I use rados_exec to call this method?

   The loadclass.sh script, which I downloaded from github, is supposed to load
methods into ceph, but how do I use it?
#./loadclass.sh ceph-0.62/src/cls_crypto.cc
  nm: ceph-0.62/src/cls_crypto.cc: File format not recognized!

 Any pointers would be much appreciated!
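
(One approach, purely a sketch on my part since 0.62 does not build cls_crypto by
default: the OSDs load classes as shared objects named libcls_<name>.so from their
class directory, so the source first has to be compiled into one; the include path
below is a placeholder:)

# g++ -fPIC -shared -I/path/to/ceph/src -o libcls_crypto.so cls_crypto.cc
# cp libcls_crypto.so /usr/local/lib/rados-class/    # the class directory you mentioned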


thanks,
peng







___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com